How to str_detect a string that contains parentheses?

I am trying to do a string search for multiple strings on one column, but the script does not return strings that have parentheses.

library(dplyr)
library(stringr)

a <- c("Apple", "Facebook", "Google (1992)")
b <- c(1, 2, 3)

c <- data.frame(a, b)

d <- c %>% 
  distinct(a) %>% 
  pull()

c %>% 
  filter(str_detect(a, paste(d, collapse = "|"))) %>% 
  group_by(a) %>% 
  tally()

I want the last script to return "Apple", "Facebook", "Google (1992)", but it only returns the first two. Is there something I can add to the "collapse" argument to include strings with parentheses?

Solution:

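(Per the comments, you don't even need regex in this case. But for future reference:) Parentheses are regex metacharacters, so they have to be escaped to match literally. This is easy enough when you specify the pattern directly, e.g., str_detect(a, "Google \\(1992\\)"). It is slightly trickier when the pattern is stored in a variable, as in your case, because the unescaped "(1992)" is read as a capture group and the pattern then matches "Google 1992" rather than "Google (1992)". One way to handle this is to escape the parentheses in the stored strings before building the pattern: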

library(stringr)

# Escape "(" and ")" in each stored string, then join the alternatives with "|"
str_detect(a, paste(
  str_replace_all(d, c("\\(" = "\\\\(", "\\)" = "\\\\)")),
  collapse = "|"
))

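In the named vector of replacements, the names (left-hand side) are regex patterns, so the parenthesis itself has to be escaped ("\\("). The values (right-hand side) are replacement strings, where the backslash is what has to be escaped: "\\\\(" inserts a literal backslash followed by "(".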
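The same idea extends beyond parentheses. The sketch below is an illustration rather than part of the original answer: escape_regex is a hypothetical helper (not a stringr function) that puts a backslash in front of every regex metacharacter, so stored strings containing ".", "+", "[" and so on would also match literally. It also shows the non-regex route mentioned above, assuming exact whole-string matches are what you want.

```r
library(stringr)

a <- c("Apple", "Facebook", "Google (1992)")
d <- unique(a)

# Hypothetical helper (an assumption for illustration, not part of stringr):
# capture any single regex metacharacter and re-insert it with a leading
# backslash. In the replacement, "\\\\" is a literal backslash and "\\1" is
# the captured character.
escape_regex <- function(x) {
  str_replace_all(x, "([.^$\\\\|*+?{}\\[\\]()])", "\\\\\\1")
}

escaped <- escape_regex(d)
pattern <- paste(escaped, collapse = "|")

# Regex route: every element of `a` should now be detected
matches <- str_detect(a, pattern)

# Non-regex route: exact, whole-string comparison
exact <- a %in% d
```

In the question's pipeline, filter(str_detect(a, pattern)) or filter(a %in% d) would then keep all three rows. Recent stringr releases also appear to ship a built-in str_escape() for this purpose; check your installed version before relying on it.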
