I have a data table with mostly numeric columns, plus one character column containing ~30 distinct values. I would like to use str_detect in R to check that character column against a vector of strings, create a new column, and assign a value in the new column based on whether the character column contains any of the strings in the vector.
Explaining in words isn’t very clear, so I’ll use iris as an example, although this dataset is a bit too simple; I wouldn’t really want to do this with iris.
This is what I tried:
mydesiredgrouping <- c("virginica","versicolor")
test <- iris %>% mutate(Group1 = if_else(str_detect(iris$Species, mydesiredgrouping), "mygroup", "notmygroup"))
But when I do this I get the following error.
Error in `mutate()`:
ℹ In argument: `Group1 = if_else(...)`.
Caused by error in `str_detect()`:
! Can't recycle `string` (size 150) to match `pattern` (size 2).
Run `rlang::last_trace()` to see where the error occurred.
Does anyone know what I need to change?
This is the output I am hoping to get, but am not getting.
EDIT
I forgot to mention that in my own dataset I had some partial matches, which was why I was trying to use str_detect in the first place. Fortunately, I got the answer I was looking for despite this, but I am adding this information for clarity.
>Solution :
Ronak’s answer is the best option if you are sure you have complete matches. If you have partial matches, you can collapse the search terms into a single regex, separated by vertical bars, and pass that as the pattern to str_detect:
test <- iris %>% mutate(Group1 = if_else(str_detect(Species, paste(mydesiredgrouping, collapse = "|")), "mygroup", "notmygroup"))
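To see why this works: the error in the question says that `string` (size 150) cannot be recycled against `pattern` (size 2), because str_detect pairs the two element by element. Collapsing the terms gives a single pattern, and a length-1 pattern is recycled across every row. Below is a minimal sketch of that idea; the `labels` vector is made up purely to illustrate partial matching and is not from the original data.

```r
library(dplyr)
library(stringr)

mydesiredgrouping <- c("virginica", "versicolor")

# Collapse the search terms into one alternation pattern (length 1).
pattern <- paste(mydesiredgrouping, collapse = "|")
print(pattern)
# "virginica|versicolor"

# A single pattern is checked against all 150 rows, so no size mismatch.
test <- iris %>%
  mutate(Group1 = if_else(str_detect(Species, pattern), "mygroup", "notmygroup"))

# Cross-tabulate to check the grouping: 100 rows in "mygroup", 50 in "notmygroup".
print(table(test$Species, test$Group1))

# Hypothetical labels (an assumption, for illustration only) showing partial matches.
labels <- c("Iris virginica (wild)", "setosa", "versicolor-2")
print(str_detect(labels, pattern))
# TRUE FALSE TRUE
```

Note that the collapsed string is treated as a regular expression, so search terms containing characters such as `.` or `(` would probably need escaping before being pasted together.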
