I am trying to create a new column based on specific values in several (four) other columns.
Reprex:
col1 <- c("a", "", "a")
col2 <- c("", "b", "b")
df <- data.frame(col1, col2)
I have tried with case_when, but cannot get it to apply to all values.
df$col3 <- with(df, case_when(col1 %in% c("a") & col2 %in% c("b") ~ "c" ))
What I want to achieve:
| col1| col2 | col3 |
|:--: | :-------:|:-------:|
| a | | a |
| | b | b |
| a | b | c |
So if col1 is a, col2 is na then col 3 is a.
If col1 is na, col2 b then col3 should be b and finally if col1 is a, col2 b then col3 should be c
>Solution :
Here is how we could do it with case_when, first just replace blank cells with NA
library(dplyr)
df %>%
mutate(across(starts_with("col"), na_if,""),
col3 = case_when(
col1=="a" & is.na(col2) ~ "a",
is.na(col1) & col2 == "b" ~ "b",
col1 == "a" & col2 == "b" ~ "c",
TRUE ~ NA_character_))
col1 col2 col3
1 a <NA> a
2 <NA> b b
3 a b c