I have a data frame with a string column. It looks like this:
structure(list(variables = c("data$Ageee[data$Beneficiary == 1] and data$Age[data$Beneficiary == 0]",
"data$var[data$Beneficiary == 1] and data$Age[data$Beneficiary == 0]",
"data$variable_test[data$Beneficiary == 1] and data$Age[data$Beneficiary == 0]"
), values = c(0, 0, 0)), class = "data.frame", row.names = c(NA,
-3L))
However, I would like a new column containing the text after the first $ and before the first [, so that I get:
structure(list(variables = c("Ageee", "var", "variable_test"
), values = c(0, 0, 0)), class = "data.frame", row.names = c(NA,
-3L))
I appreciate any help.
>Solution :
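We may use sub to capture the word after the first $ with the group (\\w+). Since $ is a regex metacharacter that denotes the end of the string, it is escaped (\\$) to match a literal $. The trailing .* matches the rest of the string, so the whole string is replaced by the captured group (\\1).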
df1$variables <- sub("\\w+\\$(\\w+).*", "\\1", df1$variables)
-output
> df1
      variables values
1         Ageee      0
2           var      0
3 variable_test      0
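For a self-contained version, here is a minimal sketch that rebuilds the data from the question and stores the result in a separate column instead of overwriting variables. The column name new_variable is hypothetical, chosen only for illustration. It also assumes the name between the first $ and the first [ consists only of word characters (letters, digits, underscore), which is what \\w+ matches.

```r
# Rebuild the example data frame from the question (named df1, as in the answer)
df1 <- data.frame(
  variables = c(
    "data$Ageee[data$Beneficiary == 1] and data$Age[data$Beneficiary == 0]",
    "data$var[data$Beneficiary == 1] and data$Age[data$Beneficiary == 0]",
    "data$variable_test[data$Beneficiary == 1] and data$Age[data$Beneficiary == 0]"
  ),
  values = c(0, 0, 0)
)

# sub() replaces only the first match, so the word after the first "$" is kept.
# new_variable is a hypothetical column name; the original column is left as is.
df1$new_variable <- sub("\\w+\\$(\\w+).*", "\\1", df1$variables)

print(df1$new_variable)
```

If the names could contain characters other than word characters (a dot, for instance), the capture group would need to be widened accordingly.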