Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Regex to match substrings containing n non repeted characters

I am facing a (naive) problem with regular expression.
I need to find any substrings composed of a fixed number (n) of different characters.

So, for "aaabcddd", if n=3 the substrings that I expect to find are: "abc" and "bcd".

My idea is to use n-1 capture groups and ‘[^’ to exclude characters already matched. Thus, I wrote the following Perl regex (in Julia):

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

r"(([[:alpha:]])[^\2])[^\1]"

But is not working.

Do you have any tips?

>Solution :

You can not use a backreference to a capture group using a negated character class [^\1]

What you can do is use a negative lookahead to assert what is directly to the right of the current position is not what you have already captured in a previous group.

If that is the case, capture a single alpha in a new group.

The matches abc and bcd are in capture group 1

(?=(([[:alpha:]])(?!\2)([[:alpha:]])(?!\3|\2)[[:alpha:]]))
  • (?= Positive lookahead
    • ( Capture group 1
      • ([[:alpha:]]) Capture the first char in group 2
      • (?!\1)([[:alpha:]]) If not looking at what is captured by group 2 to the right, capture the second char in group 3
      • (?!\2|\1) If not looking to the right at what is captured by group 2 or 3
      • [[:alpha:]] Mach the 3rd char
    • ) Close group 1
  • ) Close the lookahead

Regex demo

Or a bit shorter using a case insensitive match:

(?=(([a-z])(?!\2)([a-z])(?!\3|\2)[a-z]))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading