Use mutate and case_when on a group of variables

I am trying to make a new variable that depends on a few conditions. Here is an example of data similar to mine:

df <- read.table(text="
color     num_1   shape      num_2   season    num_3     num_4  
red        1      triangle    4       Fall      2          8
blue       5      square      4       Summer    8          1
green      3      square      11      Summer    4          1
red        3      circle      2       Summer    1          5
red        7      triangle    6       Winter    7          9
blue       9      square      2       Fall      7          4", header = TRUE)

I want to use mutate and case_when to create a new variable. For example, if color is "red" and any of the "num" columns is less than 3, the new variable should be "Yes"; if color is "blue" and any of the "num" columns is less than 5, it should also be "Yes". Otherwise it should be "No". The expected result for the data above:

color     num_1   shape      num_2   season    num_3     num_4     new_var
red        1      triangle    4       Fall      2          8         Yes
blue       5      square      4       Summer    8          1         Yes
green      3      square      11      Summer    4          1         No
red        3      circle      2       Summer    1          5         Yes
red        7      triangle    6       Winter    7          9         No
blue       9      square      2       Fall      7          4         Yes

I think I can do something like:

df <- df %>%
  mutate(new_var = case_when(
    color == "red" & c(2, 4, 6, 7) < 3 ~ "Yes",
    color == "blue" & c(2, 4, 6, 7) < 5 ~ "Yes",
    TRUE ~ "No"))

But I don’t know if it is possible to choose the columns by position like this. Any advice would be great!

Solution:

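You can’t use raw column indexes like that: inside mutate, c(2,4,6,7) < 3 compares the numbers 2, 4, 6 and 7 themselves with 3, not the columns at those positions. You can use if_any instead: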

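library(dplyr)  # if_any() requires dplyr 1.0.4 or later

df %>%
  mutate(
    new_var = case_when(
      # if_any() is TRUE for a row when at least one "num" column passes the test
      color == "red"  & if_any(starts_with("num"), ~ . < 3) ~ "Yes",
      color == "blue" & if_any(starts_with("num"), ~ . < 5) ~ "Yes",
      TRUE ~ "No")
  )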
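
If you really do want to pick the columns by position, tidyselect also accepts integer positions inside if_any. The following is only a sketch: it assumes the num columns sit at positions 2, 4, 6 and 7 as in the example data, that the data frame is not grouped, and the name by_pos is made up for illustration.

```r
library(dplyr)

# Example data from the question
df <- read.table(text = "
color num_1 shape num_2 season num_3 num_4
red 1 triangle 4 Fall 2 8
blue 5 square 4 Summer 8 1
green 3 square 11 Summer 4 1
red 3 circle 2 Summer 1 5
red 7 triangle 6 Winter 7 9
blue 9 square 2 Fall 7 4", header = TRUE)

# Positions 2, 4, 6, 7 are num_1, num_2, num_3, num_4 in this data frame
by_pos <- df %>%
  mutate(
    new_var = case_when(
      color == "red"  & if_any(c(2, 4, 6, 7), ~ .x < 3) ~ "Yes",
      color == "blue" & if_any(c(2, 4, 6, 7), ~ .x < 5) ~ "Yes",
      TRUE ~ "No"
    )
  )
```

Selecting by name with starts_with("num") is usually the safer choice, because positions silently point at different columns if the data frame is reordered or gains a column.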

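The functions across, if_any, and if_all are related: each takes a tidyselect selection (such as starts_with("num")) and works on several columns at once. across applies a function to every selected column, while if_any and if_all apply a test to each selected column and report, row by row, whether any or all of the results are TRUE.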
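
To make the difference concrete, here is a small sketch on the example data. The column name all_big, the threshold of 3 and the _x10 suffix are invented for illustration and are not part of the original question.

```r
library(dplyr)

# Example data from the question
df <- read.table(text = "
color num_1 shape num_2 season num_3 num_4
red 1 triangle 4 Fall 2 8
blue 5 square 4 Summer 8 1
green 3 square 11 Summer 4 1
red 3 circle 2 Summer 1 5
red 7 triangle 6 Winter 7 9
blue 9 square 2 Fall 7 4", header = TRUE)

res <- df %>%
  mutate(
    # if_all(): TRUE only when every num column in the row is greater than 3
    all_big = if_all(starts_with("num"), ~ .x > 3),
    # across(): transform each num column and store the results under new names
    across(starts_with("num"), ~ .x * 10, .names = "{.col}_x10")
  )
```

Here only the fifth row (7, 6, 7, 9) has all_big equal to TRUE, and res gains four extra columns, num_1_x10 through num_4_x10.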
