I have a dataset (dataraw) with column labels such as
condition1_men, condition1_women, condition2_men, condition3_women (etc)
I want to replace the strings ‘condition1’, ‘condition2’ with their names.
condition1_women = related_women;
condition2_men = unrelated_men;
condition3_men = filler_men;
Current code:
data <- dataraw %>%
rename_all(~ str_replace_all(str_replace(., 'condition1', "related"), 'condition2', "unrelated"))
This works for up to 2 strings, but every way I attempt to add a third string gives me unexpected symbol errors.
data <- dataraw %>%
rename_all(~ str_replace_all(str_replace((., 'condition1', "related"), 'condition2', "unrelated"), 'condition3', "filler")))
I’m sure this must be simple, but no matter the combinations I try I’m getting errors.
Would anyone be able to point me towards the simple mistake I’m making?
Thanks.
>Solution :
rename_all was superseded by rename_with in dplyr 1.0.0, so I’ll use that:
library(dplyr)
dataraw <- data.frame(condition1_men=1, condition1_women=2, condition2_men=3, condition2_women=4, condition3_men=5)
dataraw
#   condition1_men condition1_women condition2_men condition2_women condition3_men
# 1              1                2              3                4              5
dataraw |>
  rename_with(.fn = ~ sub("^condition1_", "related_", sub("^condition2_", "unrelated_", .)))
#   related_men related_women unrelated_men unrelated_women condition3_men
# 1           1             2             3               4              5
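The three-condition attempt in the question has a stray opening parenthesis in `str_replace((.` and is missing a third replace call, so it cannot parse. Below is a minimal sketch of two ways to cover all three conditions, assuming the condition3 = filler mapping from the question. The names `nested`, `vectorised` and `conds3` are only illustrative, and the second variant relies on stringr's `str_replace_all()` accepting a named vector of pattern = replacement pairs.

```r
library(dplyr)
library(stringr)

dataraw <- data.frame(condition1_men = 1, condition1_women = 2,
                      condition2_men = 3, condition2_women = 4,
                      condition3_men = 5)

# One nested sub() per condition; each call wraps the result of the previous one.
nested <- dataraw |>
  rename_with(~ sub("^condition3_", "filler_",
                    sub("^condition2_", "unrelated_",
                        sub("^condition1_", "related_", .x))))

# Alternative: pass all pattern = replacement pairs to str_replace_all() at once.
conds3 <- c(condition1 = "related", condition2 = "unrelated", condition3 = "filler")
vectorised <- dataraw |>
  rename_with(~ str_replace_all(.x, conds3))

names(nested)
# [1] "related_men"     "related_women"   "unrelated_men"   "unrelated_women" "filler_men"
```

The named-vector form avoids counting parentheses altogether, which is probably the easier one to extend when more conditions are added.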
If you have a (named) vector of "from=to" assignments, we can also do it like this to be a little more general:
conds <- c(condition1="related", condition2="unrelated")
dataraw |>
  rename_with(.fn = ~ Reduce(function(st, i) sub(names(conds)[i], conds[i], st), seq_along(conds), init = .x))
#   related_men related_women unrelated_men unrelated_women condition3_men
# 1           1             2             3               4              5
We need Reduce because each substitution has to be applied to the result of the previous one, so that the changes from all earlier condition mappings are preserved.
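To see what Reduce is doing, here is a small sketch on a plain character vector rather than a data frame. The vector `nms` and the name `steps` are made up for illustration; `accumulate = TRUE` is only there to expose the intermediate results and is not needed in the rename_with call.

```r
conds <- c(condition1 = "related", condition2 = "unrelated")
nms <- c("condition1_men", "condition2_women", "condition3_men")

# accumulate = TRUE keeps every intermediate result, starting with init.
steps <- Reduce(function(st, i) sub(names(conds)[i], conds[i], st),
                seq_along(conds), init = nms, accumulate = TRUE)

steps[[1]]  # the untouched names (init)
steps[[2]]  # after the condition1 -> related substitution
steps[[3]]  # after condition2 -> unrelated is applied on top of that
# [1] "related_men"     "unrelated_women" "condition3_men"
```

Each step receives the output of the one before it, which is why the final element carries both substitutions.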
I often find data like this does better (in later data-munging/analysis) in a long format (as Limey suggested). For that, we can also do:
dataraw |>
  tidyr::pivot_longer(cols = everything(), names_pattern = "(.*)_(.*)",
                      names_to = c("cond", ".value")) |>
  mutate(cond2 = conds[match(sub("_.*", "", cond), names(conds))])
# # A tibble: 3 × 4
#   cond         men women cond2
#   <chr>      <dbl> <dbl> <chr>
# 1 condition1     1     2 related
# 2 condition2     3     4 unrelated
# 3 condition3     5    NA NA
though it might be simpler (data management, visualizing, updating, etc) if your mapping were in a different frame, which we can merge/join onto the original data:
cond_df <- tribble(
  ~ cond, ~ cond2
  , "condition1", "related"
  , "condition2", "unrelated"
)
dataraw |>
  tidyr::pivot_longer(cols = everything(), names_pattern = "(.*)_(.*)",
                      names_to = c("cond", ".value")) |>
  left_join(cond_df, by = "cond")
# # A tibble: 3 × 4
#   cond         men women cond2
#   <chr>      <dbl> <dbl> <chr>
# 1 condition1     1     2 related
# 2 condition2     3     4 unrelated
# 3 condition3     5    NA NA
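The NA in cond2 for condition3 is there only because the lookup frame has no row for it. Here is a sketch of the same join with a third row added, assuming the condition3 = filler mapping from the question (the name `long` is just for illustration):

```r
library(dplyr)

dataraw <- data.frame(condition1_men = 1, condition1_women = 2,
                      condition2_men = 3, condition2_women = 4,
                      condition3_men = 5)

# Lookup frame with one row per condition, including the third one.
cond_df <- tribble(
  ~cond,        ~cond2,
  "condition1", "related",
  "condition2", "unrelated",
  "condition3", "filler"
)

long <- dataraw |>
  tidyr::pivot_longer(cols = everything(), names_pattern = "(.*)_(.*)",
                      names_to = c("cond", ".value")) |>
  left_join(cond_df, by = "cond")

long$cond2
# [1] "related"   "unrelated" "filler"
```

The women value for condition3 stays NA, since the sample data has no condition3_women column.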