mutate with case_when: combining "and" with "or"

I want to create a new variable based on two conditions, where the second condition can be met in either of two ways. Is there a way to combine "and" with "or" in a single case_when condition?

Say this is the first dataframe:

df1 <- data.frame(company = c("A", "B", "C", "D"),
                 since = c(2014, 2017, 2014, 2013),
                 end = c(NA,NA,NA, 2014))

And this is the second:

df2 <- data.frame(company = c("A", "B", "C", "D",
                              "A", "B", "C", "D",
                              "A", "B", "C", "D"),
                  year = c(2013, 2013, 2013, 2013,
                           2014, 2014, 2014, 2014,
                           2015, 2015, 2015, 2015))

Now I want to join the two and create a new variable based on conditions from the first dataframe. The new variable should get a score of 1 when year is the same as or greater than since AND end is either greater than year OR NA.
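As a small illustration of why the NA case needs its own check, here is a base R sketch for a single row. The values are those of company A in 2014 from the data above. Comparing against NA gives NA rather than FALSE, and case_when treats an NA condition as not matched, so such a row would fall through to 0 without the is.na() test.

```r
# Values for company A in 2014 (since = 2014, end = NA), taken from df1/df2
year  <- 2014
since <- 2014
end   <- NA_real_

# Comparing with NA is "unknown", not FALSE
na_cmp <- end > year

# TRUE & NA stays NA, so this condition alone would not match in case_when
without_na_check <- year >= since & end > year

# NA | TRUE is TRUE, so adding is.na(end) makes the row match
with_na_check <- year >= since & (end > year | is.na(end))

print(c(na_cmp, without_na_check, with_na_check))
```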

The following code works:

library(dplyr)  # provides %>%, left_join, mutate, case_when and select

df3 <- df2 %>% 
  left_join(df1, by = "company") %>% 
  mutate(new_var = case_when(year >= since & end > year ~ 1,
                             year >= since & is.na(end) ~ 1,
                             TRUE ~ 0)) %>% 
  select(!c(since, end))

And gives me:

   company year new_var
1        A 2013       0
2        B 2013       0
3        C 2013       0
4        D 2013       1
5        A 2014       1
6        B 2014       0
7        C 2014       1
8        D 2014       0
9        A 2015       1
10       B 2015       0
11       C 2015       1
12       D 2015       0

But, ideally I would want to combine this into one line of code. Something like:

df3 <- df2 %>% 
  left_join(df1, by = "company") %>% 
  mutate(new_var = case_when(year >= since & end > year | is.na(end) ~ 1,
                             TRUE ~ 0)) %>% 
  select(!c(since, end))

But this gives me:

   company year new_var
1        A 2013       1
2        B 2013       1
3        C 2013       1
4        D 2013       1
5        A 2014       1
6        B 2014       1
7        C 2014       1
8        D 2014       0
9        A 2015       1
10       B 2015       1
11       C 2015       1
12       D 2015       0

So is there a way to combine this into the same line of code? Any help appreciated.

>Solution :

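In R, & has higher precedence than |, so your condition is read as (year >= since & end > year) | is.na(end), and every row where end is NA gets a 1. Wrap the OR part in brackets: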

df2 %>% 
  left_join(df1, by = "company") %>% 
  mutate(new_var = case_when(year >= since & (end > year | is.na(end)) ~ 1,
                             TRUE ~ 0)) %>% 
  select(!c(since, end))

#    company year new_var
# 1        A 2013       0
# 2        B 2013       0
# 3        C 2013       0
# 4        D 2013       1
# 5        A 2014       1
# 6        B 2014       0
# 7        C 2014       1
# 8        D 2014       0
# 9        A 2015       1
# 10       B 2015       0
# 11       C 2015       1
# 12       D 2015       0
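
To see the effect of the brackets in isolation, here is a minimal base R sketch (no dplyr needed). The three rows are picked from the joined data above (A/2013, A/2014, D/2014); the variable names unwrapped, explicit and wrapped are only for illustration.

```r
# Three rows of the joined data: A/2013, A/2014, D/2014
year  <- c(2013, 2014, 2014)
since <- c(2014, 2014, 2013)
end   <- c(NA,   NA,   2014)

# Without brackets, & is evaluated before |
unwrapped <- year >= since & end > year | is.na(end)
explicit  <- (year >= since & end > year) | is.na(end)
print(identical(unwrapped, explicit))  # both spell out the same grouping

# With brackets around the OR, is.na(end) only counts when year >= since
wrapped <- year >= since & (end > year | is.na(end))

print(data.frame(year, since, end, unwrapped, wrapped))
```

The first row (A/2013) is the one that differs: is.na(end) alone makes the unwrapped condition TRUE, while the wrapped version still requires year >= since.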