Recode if string (with punctuation) contains certain text

How can I search through a character vector and, if the string at a given index contains a pattern, replace that index’s value?

I tried this:

List <- c(1:8)
Types <- as.character(c(
  "ABC, the (stuff).\n\n\n fun", "meaningful", "relevant", "rewarding",
  "unpleasant", "enjoyable", "engaging", "disinteresting"))
for (i in List) {
  if (grepl(Types[i], "fun", fixed = TRUE)) {
    Types[i] = "1"
  } else if (grepl(Types[i], "meaningful", fixed = TRUE)) {
    Types[i] = "2"
  }
}

The code works for "meaningful", but fails when the string contains punctuation or other text, as with "fun".

Solution:

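The first argument to grepl is the pattern, not the string: the signature is grepl(pattern, x, ...). With the arguments swapped, your code asks whether Types[i] occurs inside "fun", which is only true when Types[i] is a substring of that literal. That is why the exact match "meaningful" worked and the longer string ending in "fun" did not. The punctuation is not the problem.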
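To see the effect of the argument order in isolation, here is a small sketch using the first element from the question. The variable names x, swapped, and correct are only illustrative:

```r
# The string from the question that failed to match.
x <- "ABC, the (stuff).\n\n\n fun"

# Swapped: asks whether the long string occurs inside "fun".
swapped <- grepl(x, "fun", fixed = TRUE)

# Correct: asks whether "fun" occurs inside the long string.
correct <- grepl("fun", x, fixed = TRUE)

swapped
# [1] FALSE
correct
# [1] TRUE
```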

This would be a literal fix of your code:

for (i in seq_along(Types)) {
  if (grepl("fun", Types[i], fixed = TRUE)) {
    Types[i] = "1"
  } else if (grepl("meaningful", Types[i], fixed = TRUE)) {
    Types[i] = "2"
  }
}
Types
# [1] "1"              "2"              "relevant"       "rewarding"      "unpleasant"    
# [6] "enjoyable"      "engaging"       "disinteresting"

BTW, using List works, but it’s an extra variable that can go out of sync with Types. For instance, if you change the length of Types and forget to update List, the loop will either skip elements or index past the end of the vector. That is why I used seq_along(Types) instead: it always generates exactly the indices that Types has.
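A short sketch of why seq_along is the safer way to build indices. The vector types here is a shortened, made-up stand-in for Types:

```r
# A hypothetical shorter vector, standing in for an updated Types.
types <- c("fun", "meaningful", "relevant")

# A hard-coded index vector no longer matches the data.
fixed_idx <- 1:8
length(fixed_idx) == length(types)
# [1] FALSE

# seq_along always follows the current length.
seq_along(types)
# [1] 1 2 3

# It is also safe for empty input, where 1:length(x) counts down instead.
empty <- character(0)
seq_along(empty)
# integer(0)
1:length(empty)
# [1] 1 0
```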

BTW: here’s a vectorized version that does the same thing without a loop. grepl accepts the whole vector and returns one logical value per element, which you can use directly to index and assign. Like the loop, it modifies Types in place:

Types[grepl("fun", Types, fixed = TRUE)] <- "1"
Types[grepl("meaningful", Types, fixed = TRUE)] <- "2"
Types
# [1] "1"              "2"              "relevant"       "rewarding"      "unpleasant"    
# [6] "enjoyable"      "engaging"       "disinteresting"
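Indexed assignment of this kind writes into Types itself. If the original values are still needed, one option is to apply the same assignments to a copy. A minimal sketch with a shortened vector, where recoded is just an illustrative name:

```r
Types <- c("ABC, the (stuff).\n\n\n fun", "meaningful", "relevant")

# Work on a copy so that Types keeps its original values.
recoded <- Types
recoded[grepl("fun", recoded, fixed = TRUE)] <- "1"
recoded[grepl("meaningful", recoded, fixed = TRUE)] <- "2"

recoded
# [1] "1"        "2"        "relevant"
Types[2]
# [1] "meaningful"
```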

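The next level (perhaps over-complicating?) would be to store the patterns and their replacements in a data frame and Reduce over its rows. Each pattern sits in the same row as its replacement, so you’ll never accidentally update one without the other, and the table can be kept in a CSV if needed. Unlike the versions above, this returns a new vector and leaves Types untouched: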

ptns <- data.frame(ptn = c("fun", "meaningful"), repl = c("1", "2"))
Reduce(function(txt, i) {
  txt[grepl(ptns$ptn[i], txt, fixed = TRUE)] <- ptns$repl[i]
  txt
}, seq_len(nrow(ptns)), init = Types)
# [1] "1"              "2"              "relevant"       "rewarding"      "unpleasant"    
# [6] "enjoyable"      "engaging"       "disinteresting"
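If Reduce is unfamiliar, it may help to see it unrolled. The sketch below shows a plain loop that should give the same result: each row of ptns is applied in turn to a running copy of the vector. The names out and via_reduce are illustrative, the vector is shortened, and stringsAsFactors = FALSE is added only as a precaution for R versions before 4.0:

```r
Types <- c("ABC, the (stuff).\n\n\n fun", "meaningful", "relevant")
ptns <- data.frame(ptn = c("fun", "meaningful"), repl = c("1", "2"),
                   stringsAsFactors = FALSE)

# Loop form: start from a copy and apply each pattern row in order.
out <- Types
for (i in seq_len(nrow(ptns))) {
  out[grepl(ptns$ptn[i], out, fixed = TRUE)] <- ptns$repl[i]
}

# Reduce form, for comparison.
via_reduce <- Reduce(function(txt, i) {
  txt[grepl(ptns$ptn[i], txt, fixed = TRUE)] <- ptns$repl[i]
  txt
}, seq_len(nrow(ptns)), init = Types)

identical(out, via_reduce)
# [1] TRUE
```

Because the rows are applied in order, an earlier pattern takes precedence whenever a string would match more than one.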