Regex pattern skips last matches and misses content with parentheses

Say I have a string:

r'pat1=a, pat2=b, (e, e*89=f), bb, pat3=c, pat4=hi, pat10=ex'

I need to extract patterns as:

pat1=a, 
pat2=b, (e, e*89=f), bb, 
pat3=c, 
pat4=hi, 
pat10=ex

This is the pattern I tried:

re.findall(r'(pat\d*.*?)[(pat\d*)|$]', s)

which gives me:

['pat1=', 'pat2=b, ', 'pat3=c, ', 'pat1']

I am mostly interested in understanding how my pattern works here and why it does not match the required strings. What would be a solution?

>Solution :

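The pattern that you tried, (pat\d*.*?)[(pat\d*)|$], matches pat and optional digits, then as few characters as possible until it can match one of the characters listed in the character class [(pat\d*)|$]. Inside square brackets, (, ), *, | and $ are literal characters rather than a group, an alternation, or an end-of-string anchor, so the class matches exactly one character: (, ), p, a, t, a digit, *, | or $. That character is consumed, which is why the first match stops at pat1= (the a that follows is in the class) and why later parts are skipped or cut short.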
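To make this visible, here is a small sketch that runs the attempted pattern and prints, for each match, the captured group and the single character that the trailing character class consumed. The variable names `s`, `attempt` and `consumed` are only illustrative; `s` is assumed to hold the string from the question.

```python
import re

# The string from the question (assumed to be what `s` refers to in the attempt).
s = 'pat1=a, pat2=b, (e, e*89=f), bb, pat3=c, pat4=hi, pat10=ex'

# The attempted pattern: a capture group followed by a character class.
attempt = re.compile(r'(pat\d*.*?)[(pat\d*)|$]')

# Pair each captured group with the last character of the full match,
# i.e. the one character the class [(pat\d*)|$] consumed.
consumed = [(m.group(1), m.group(0)[-1]) for m in attempt.finditer(s)]

for group, char in consumed:
    print(repr(group), 'ended by', repr(char))
# 'pat1=' ended by 'a'
# 'pat2=b, ' ended by '('
# 'pat3=c, ' ended by 'p'
# 'pat1' ended by '0'
```

The third match consumes the p of pat4, so pat4 can no longer be found, and for pat10 the engine backtracks \d* to a single digit so that the class can match the 0.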

To get your desired matches, you don’t want to consume anything after .*?; instead, you want to assert with a lookahead that the next part, which starts with the same pat pattern, follows.

For the last part, you can assert the end of the string instead.


You could write the pattern as:

\bpat\d+=.*?(?=\s*\bpat\d+=|$)

The pattern matches:

  • \bpat\d+= Match the word pat followed by 1+ digits and =
  • .*? Match as few chars as possible
  • (?= Positive lookahead, assert to the right
    • \s*\bpat\d+= Match optional whitespace chars, then pat, 1+ digits and =
    • | Or
    • $ Assert the end of the string for the last part
  • ) Close the lookahead
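As a quick check, the pattern can be tried with re.findall on the string from the question. This is a minimal sketch; the names `s`, `pattern` and `parts` are chosen here for illustration.

```python
import re

# The string from the question.
s = 'pat1=a, pat2=b, (e, e*89=f), bb, pat3=c, pat4=hi, pat10=ex'

# Lazily match from each "pat<digits>=" up to (but not including)
# the next "pat<digits>=" or the end of the string.
pattern = r'\bpat\d+=.*?(?=\s*\bpat\d+=|$)'

parts = re.findall(pattern, s)
print(parts)
# ['pat1=a,', 'pat2=b, (e, e*89=f), bb,', 'pat3=c,', 'pat4=hi,', 'pat10=ex']
```

Because the lookahead allows optional whitespace with \s*, the lazy .*? stops before the space that precedes the next part, so the matches end with the comma and carry no trailing space. If the trailing space should be kept, dropping \s* from the lookahead should retain it.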
