Is it possible to drop the parts of the elements of the character vector n1 that partially overlap with terms in the formula f1?
For example, in n1, "study_typecompare" and "study_typecontrol" partially overlap with study_type in f1.
So in desired_output, I want to drop the "study_type" part of them. Because the other elements (e.g. time_wk_whn) fully match a term in f1, they should stay unchanged.
Can desired_output be obtained in base R or the tidyverse?
f1 <- gi ~ 0 + study_type + time_wk_whn + time_wk_btw + items_whn +
items_btw + training_hr_whn + training_hr_btw
n1 <- c("study_typecompare","study_typecontrol","time_wk_whn",
"time_wk_btw","items_whn","items_btw","training_hr_whn",
"training_hr_btw")
desired_output <- c("compare","control", "time_wk_whn",
"time_wk_btw","items_whn","items_btw",
"training_hr_whn","training_hr_btw")
>Solution :
We create a function that takes the formula and the vector (‘fmla’ and ‘vec’, respectively). It extracts the variable names from ‘fmla’ (all.vars), finds the values in ‘vec’ that are not among those names (setdiff), builds a regex pattern by pasting the names together with "|", removes the first match of that pattern from each of those values with sub, and returns the updated vector.
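fun1 <- function(fmla, vec) {
  # variable names used in the formula (response included)
  v1 <- all.vars(fmla)
  # elements of vec that are not exact matches for a variable name
  v2 <- setdiff(vec, v1)
  # remove the first occurrence of any variable name from those elements
  v3 <- sub(paste(v1, collapse = "|"), "", v2)
  # write the shortened names back into their original positions
  vec[vec %in% v2] <- v3
  vec
}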
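To see what each step of the function does, the same calls can be run one at a time on the question's data. This is only an illustrative walk-through; the name pat is introduced here for readability and is not part of the answer's code.

```r
f1 <- gi ~ 0 + study_type + time_wk_whn + time_wk_btw + items_whn +
  items_btw + training_hr_whn + training_hr_btw
n1 <- c("study_typecompare", "study_typecontrol", "time_wk_whn",
        "time_wk_btw", "items_whn", "items_btw", "training_hr_whn",
        "training_hr_btw")

# Every variable name in the formula, including the response gi
v1 <- all.vars(f1)

# Elements of n1 that are not exact matches for a formula variable
v2 <- setdiff(n1, v1)
v2
# [1] "study_typecompare" "study_typecontrol"

# One alternation pattern built from all variable names
# (pat is a name chosen for this walk-through)
pat <- paste(v1, collapse = "|")

# sub() removes only the first match in each element
sub(pat, "", v2)
# [1] "compare" "control"
```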
Checking:
> identical(fun1(f1, n1), desired_output)
[1] TRUE
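One thing to keep in mind: the pattern above is not anchored, so a variable name is also removed when it occurs in the middle of an element, and setdiff drops duplicates, which may matter if ‘vec’ contains repeated values. If the overlap is always expected to be a prefix, a possible variant is sketched below. The name fun2 and the prefix assumption are mine, not part of the original answer, and variable names containing regex metacharacters (such as ".") would still need escaping.

```r
f1 <- gi ~ 0 + study_type + time_wk_whn + time_wk_btw + items_whn +
  items_btw + training_hr_whn + training_hr_btw
n1 <- c("study_typecompare", "study_typecontrol", "time_wk_whn",
        "time_wk_btw", "items_whn", "items_btw", "training_hr_whn",
        "training_hr_btw")
desired_output <- c("compare", "control", "time_wk_whn",
                    "time_wk_btw", "items_whn", "items_btw",
                    "training_hr_whn", "training_hr_btw")

# Hypothetical variant: strip a formula variable only when it is a prefix
fun2 <- function(fmla, vec) {
  v1 <- all.vars(fmla)
  # list longer names first so a short name cannot pre-empt a longer one
  v1 <- v1[order(nchar(v1), decreasing = TRUE)]
  # "^" anchors the match at the start of each string
  pat <- paste0("^(", paste(v1, collapse = "|"), ")")
  # leave exact matches untouched; edit only the remaining elements
  i <- !vec %in% v1
  vec[i] <- sub(pat, "", vec[i])
  vec
}

identical(fun2(f1, n1), desired_output)
# [1] TRUE

# "gi" occurs mid-string here, so the anchored pattern leaves it alone
fun2(f1, "biginput")
# [1] "biginput"
```

On the question's data this gives the same result as fun1; the two differ only for inputs where a variable name appears somewhere other than the start of an element.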