Given a small data set as follows:
df <- structure(list(date = c("2021-09", "2021-10", "2021-11", "2021-12",
"2021-06", "2021-10"), act_direction = c("decrease", "increase",
NA, NA, "unchanged", "unchanged"), pred_direction = c(NA, "decrease",
NA, NA, "decrease", "increase"), direction_acc = c("true", "-",
"-", "true", "false", "false")), class = "data.frame", row.names = c(NA,
-6L))
df:
date act_direction pred_direction direction_acc
1 2021-09 decrease <NA> true
2 2021-10 increase decrease -
3 2021-11 <NA> <NA> -
4 2021-12 <NA> <NA> true
5 2021-06 unchanged decrease false
6 2021-10 unchanged increase false
I try to update direction_acc based on act_direction column, more specifically, if act_direction==unchanged, then I want to update to the content of direction_acc to true, ignoring whatever its original values are.
The expected result:
date act_direction pred_direction direction_acc
1 2021-09 decrease <NA> true
2 2021-10 increase decrease -
3 2021-11 <NA> <NA> -
4 2021-12 <NA> <NA> true
5 2021-06 unchanged decrease true
6 2021-10 unchanged increase true
I used code below which I find no error, but it returns an outcome which is out of expectation since other values in direction_acc were changed to NAs:
df %>%
# mutate_all(na_if, '') %>% # I will replace empty cell '' to NA as example data
mutate(direction_acc = ifelse(act_direction == 'unchanged',
'true',
as.character(direction_acc)))
Result:
date act_direction pred_direction direction_acc
1 2021-09 decrease <NA> true
2 2021-10 increase decrease -
3 2021-11 <NA> <NA> <NA>
4 2021-12 <NA> <NA> <NA>
5 2021-06 unchanged decrease true
6 2021-10 unchanged increase true
So my question is why when act_direction with NAs, direction_acc also become NAs, and how could we achieve that correctly?
>Solution :
base R
df$direction_acc[df$act_direction == "unchanged"] <- "true"
df
# date act_direction pred_direction direction_acc
# 1 2021-09 true
# 2 2021-10 decrease -
# 3 2021-11 -
# 4 2021-12 true
# 5 2021-06 unchanged decrease true
# 6 2021-10 unchanged increase true
dplyr
library(dplyr)
df %>%
mutate(
direction_acc = if_else(act_direction == "unchanged", "true", direction_acc)
)
If I add the NAs as you have (I’ll use across as mutate_if is deprecated), then we can change == to %in% for the desired effect.
df %>%
mutate(across(where(is.character), ~ na_if(., ""))) %>%
mutate(
direction_acc = if_else(act_direction %in% "unchanged", "true", direction_acc)
)
# date act_direction pred_direction direction_acc
# 1 2021-09 <NA> <NA> true
# 2 2021-10 <NA> decrease -
# 3 2021-11 <NA> <NA> -
# 4 2021-12 <NA> <NA> true
# 5 2021-06 unchanged decrease true
# 6 2021-10 unchanged increase true