Regex to find the position of the first four consecutive unique values

I’ve solved Advent of Code 2022, day 6, but was wondering whether there is a regex way to find the first occurrence of 4 non-repeating characters:

From the question:

bvwbjplbgvbhsrlpgdmjqwftvncz

bvwbjplbgvbhsrlpgdmjqwftvncz

# discard the first four characters (bvwb), because the letter b repeats

bvwbjplbgvbhsrlpgdmjqwftvncz

# match the 5th character (j), which marks the end of the first four-character block with no repeating characters (vwbj)

In R I’ve tried:

txt <- "bvwbjplbgvbhsrlpgdmjqwftvncz"
str_match("(.*)\1", txt)

But I’m having no luck.

Solution:

You can use

stringr::str_extract(txt, "(.)(?!\\1)(.)(?!\\1|\\2)(.)(?!\\1|\\2|\\3)(.)")

See the regex demo. Here, each (.) captures any single char into consecutively numbered groups, and each (?!...) negative lookahead makes sure the next . does not match any of the chars already captured.
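To make the structure of the pattern easier to see, here is a minimal sketch that assembles it piece by piece. The helper name build_pattern is hypothetical and not part of stringr or of the original answer; it only illustrates how each lookahead lists all earlier group numbers.

```r
# Hypothetical helper (an illustration, not part of the original answer):
# builds a pattern that matches n consecutive, pairwise distinct characters.
build_pattern <- function(n) {
  parts <- "(.)"                                      # group 1: any char
  for (i in seq_len(n - 1)) {
    refs <- paste0("\\", seq_len(i), collapse = "|")  # e.g. \1|\2 for i = 2
    # the next char must differ from every char captured so far
    parts <- c(parts, paste0("(?!", refs, ")(.)"))
  }
  paste(parts, collapse = "")
}

# For n = 4 this reproduces the pattern used in the answer
identical(build_pattern(4), "(.)(?!\\1)(.)(?!\\1|\\2)(.)(?!\\1|\\2|\\3)(.)")
## => [1] TRUE
```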

See the R demo:

library(stringr)
txt <- "bvwbjplbgvbhsrlpgdmjqwftvncz"
str_extract(txt, "(.)(?!\\1)(.)(?!\\1|\\2)(.)(?!\\1|\\2|\\3)(.)")
## => [1] "vwbj"

Note that stringr::str_match, like stringr::str_extract, takes the input string as its first argument and the regex as its second, so the call in the question has the two arguments swapped.
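Since the question asks for the position rather than the matched text, the following is a minimal sketch of one way to get it. It assumes base R's regexpr with perl = TRUE in place of stringr, and the helper name first_marker_end is hypothetical.

```r
# Hypothetical helper (not from the original answer): returns the 1-based
# position of the last character of the first run of four distinct
# characters, or NA if the string contains no such run.
first_marker_end <- function(txt) {
  pattern <- "(.)(?!\\1)(.)(?!\\1|\\2)(.)(?!\\1|\\2|\\3)(.)"
  m <- regexpr(pattern, txt, perl = TRUE)  # perl = TRUE enables lookaheads
  if (m == -1L) return(NA_integer_)        # regexpr returns -1 on no match
  # start of the match plus its length, minus one, gives the end position
  as.integer(m) + attr(m, "match.length") - 1L
}

txt <- "bvwbjplbgvbhsrlpgdmjqwftvncz"
first_marker_end(txt)
## => [1] 5
```

The match "vwbj" starts at position 2 and is 4 characters long, so it ends at position 5, which is the 5th character described in the question.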
