Problem with regex (check string for certain repetitions)

I would like to check whether in a text there are a) three consonants in a row or b) four identical letters in a row. Can someone please help me with the regular expressions?

library(tidyverse)

df <- data.frame(text = c("Completely valid", "abcdefg", "blablabla", "flahaaaa", "asdf", "another text", "a last one", "sj", "ngbas"))

consonants <- c("q", "w", "r", "t", "z", "p", "s", "d", "f", "g", "h", "k", "l", "m", "n", "b", "x")

df %>% mutate(
         invalid = FALSE, 
         # Length too short
         invalid = ifelse(nchar(text)<3, TRUE, invalid),
         # Contains three consonants in a row: e.g. "ngbas"
         invalid = ifelse(str_detect(text,"???"),  TRUE, invalid),   # <--- Regex missing
         # More than 3 identical characters in a row: e.g. "flahaaaa" 
         invalid = ifelse(str_detect(text,"???"),  TRUE, invalid)    # <--- Regex missing
       )

>Solution :

Three consonants in a row:

[qwrtzpsdfghklmnbx]{3}
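
As a quick check, the pattern can be run against the sample strings from the question. This is only a sketch: it uses base R's grepl() with perl = TRUE so that it needs no packages, and str_detect(text, pattern) from stringr should give the same result for this pattern. The character class is built from the question's consonants vector, which leaves out c, j, v and y, so "abcdefg" is not flagged, while "Completely valid" is (because of "mpl").

```r
# Sample strings and consonant list taken from the question.
text <- c("Completely valid", "abcdefg", "blablabla", "flahaaaa", "asdf",
          "another text", "a last one", "sj", "ngbas")
consonants <- c("q", "w", "r", "t", "z", "p", "s", "d", "f", "g", "h",
                "k", "l", "m", "n", "b", "x")

# Build the character class from the vector instead of typing it by hand.
# Result: "[qwrtzpsdfghklmnbx]{3}"
three_consonants <- paste0("[", paste(consonants, collapse = ""), "]{3}")

# TRUE wherever three listed consonants appear directly after one another.
has_three_consonants <- grepl(three_consonants, text, perl = TRUE)

print(text[has_three_consonants])
# flagged: "Completely valid", "asdf", "ngbas"
```

If c, j, v and y should count as consonants too, they would have to be added to the vector first.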

Four or more identical letters in a row:

([a-z])(\\1){3}
# The backslash is doubled because it is also the escape character in R string literals; the regex engine receives \1.

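The latter uses a backreference. The number is the position of the capture group (an expression in parentheses) being referenced, and \1 matches exactly the text that group matched. Here the first group matches a single lowercase Latin letter, so the pattern as a whole matches that letter followed by three more copies of it.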
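
To make the effect of the backreference concrete, the following sketch compares the pattern with a plain quantifier on a few made-up strings (the vector x is purely illustrative). Base R's grepl() with perl = TRUE is used here to keep the example free of dependencies.

```r
# Hypothetical test strings, not from the question.
x <- c("abcd", "aaaa", "aaab", "xxxxx")

# Without a backreference, any four lowercase letters in a row match.
any_four <- grepl("[a-z]{4}", x, perl = TRUE)
print(any_four)
# TRUE TRUE TRUE TRUE

# With \1, the three following letters must equal the captured one.
same_four <- grepl("([a-z])(\\1){3}", x, perl = TRUE)
print(same_four)
# FALSE TRUE FALSE TRUE

# cat() shows the pattern as the regex engine receives it (single backslash).
cat("([a-z])(\\1){3}\n")
# ([a-z])(\1){3}
```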

For simplicity, both patterns only cover lowercase letters; uppercase letters are not matched.
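
If uppercase input should be checked as well, one simple option (an assumption about what is wanted, not part of the original answer) is to lowercase the text before matching. A minimal sketch with made-up strings:

```r
# Hypothetical mixed-case inputs.
x <- c("FLAHAAAA", "NGBas", "Hello")

four_identical   <- "([a-z])(\\1){3}"
three_consonants <- "[qwrtzpsdfghklmnbx]{3}"

# Matching the raw text misses uppercase runs...
raw_four <- grepl(four_identical, x, perl = TRUE)
print(raw_four)
# FALSE FALSE FALSE

# ...while matching tolower(x) treats upper and lower case alike.
lower_four  <- grepl(four_identical, tolower(x), perl = TRUE)
lower_three <- grepl(three_consonants, tolower(x), perl = TRUE)
print(lower_four)
# TRUE FALSE FALSE
print(lower_three)
# FALSE TRUE FALSE
```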

Without backreferences, the solution gets a bit lengthy:

(aaaa|bbbb|cccc|dddd|eeee|ffff|gggg|hhhh|iiii|jjjj|kkkk|llll|mmmm|nnnn|oooo|pppp|qqqq|rrrr|ssss|tttt|uuuu|vvvv|wwww|xxxx|yyyy|zzzz)
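
The long alternation does not have to be typed by hand. As a sketch, it can be generated from R's built-in letters constant with strrep() and paste(), and then compared against the backreference version on a few strings from the question:

```r
# strrep() repeats each letter four times: "aaaa", "bbbb", ..., "zzzz".
alternation <- paste0("(", paste(strrep(letters, 4), collapse = "|"), ")")

text <- c("flahaaaa", "blablabla", "asdf")

with_alternation   <- grepl(alternation, text, perl = TRUE)
with_backreference <- grepl("([a-z])(\\1){3}", text, perl = TRUE)

# Both approaches agree on these inputs.
print(with_alternation)
# TRUE FALSE FALSE
print(identical(with_alternation, with_backreference))
# TRUE
```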

The relevant docs can be found here.
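
Putting both patterns together with the length check from the question could look like the sketch below. It uses base R rather than the mutate() pipeline so that it runs without tidyverse; inside mutate(), the same three conditions could be combined with | in the same way. The helper names too_short, three_consonants and four_identical are illustrative only.

```r
# Data frame from the question.
df <- data.frame(text = c("Completely valid", "abcdefg", "blablabla", "flahaaaa",
                          "asdf", "another text", "a last one", "sj", "ngbas"))

too_short        <- nchar(df$text) < 3
three_consonants <- grepl("[qwrtzpsdfghklmnbx]{3}", df$text, perl = TRUE)
four_identical   <- grepl("([a-z])(\\1){3}", df$text, perl = TRUE)

# A text is invalid as soon as any single check fires.
df$invalid <- too_short | three_consonants | four_identical

print(df$text[df$invalid])
# flagged: "Completely valid", "flahaaaa", "asdf", "sj", "ngbas"
```

Combining the checks with | avoids a later check overwriting the result of an earlier one.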
