Problem about regular expression not matching with "\1"

I want to match a string that begins with ‘c’ or ends with ‘c’,
so I wrote a regular expression like this:

import re
reg = re.compile(r'^(c)|\1$')
reg.search('ca') # match
reg.search('ac') # not match

In my real code, the ‘c’ part of the regexp is a complex sub-pattern like ‘x|y|z|[0-9_]|…’,
and I don’t want to write it twice.
I thought a backreference to the group, '\1', would let me reuse it,
but I don’t know why it doesn’t work.

I also tried a named group:

reg = re.compile(r'^(?P<name>c)|(?P=name)$')

but that doesn’t work either.

Solution:

This is not how regex backreferences work. A backreference does not re-run the group's pattern; it matches the exact text that the group captured earlier in the same match.

For example, regex ([ab])\1 will match 'aa' and 'bb' but not 'ab' nor 'ba'.
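Until capturing group 1 has matched, \1 has nothing to refer to. In your pattern, the \1$ alternative is tried only where ^(c) has failed, so group 1 is always unset and that branch can never match.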
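
To make this concrete, here is a small sketch that checks both the ([ab])\1 example and the pattern from the question. The variable names (doubled, results, question_reg) are only illustrative.

```python
import re

# ([ab])\1 needs the same character twice in a row:
# \1 repeats whatever text group 1 actually captured.
doubled = re.compile(r'([ab])\1')
results = {s: bool(doubled.fullmatch(s)) for s in ('aa', 'bb', 'ab', 'ba')}
print(results)

# The pattern from the question: the \1$ branch is only reached
# when ^(c) did not match, so group 1 is unset and the branch fails.
question_reg = re.compile(r'^(c)|\1$')
print(question_reg.search('ca'))  # a match object for the leading 'c'
print(question_reg.search('ac'))  # None
```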

If you want to avoid repeating yourself while writing the regex, I suggest composing it from a few sub-regexes:

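sub_reg = r'c'  # stands in for the complex sub-pattern
# (?:...) keeps any '|' inside sub_reg from merging with the outer alternation
reg = re.compile(rf'^(?:{sub_reg})|(?:{sub_reg})$')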

This has the additional advantage of improving readability if you give your variables descriptive names.
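
As a rough sketch of the same idea with a sub-pattern that contains alternation: the value of token below is a hypothetical stand-in for the elided pattern in the question, and starts_or_ends_with_token is a helper name invented for this example. Each interpolated copy is wrapped in a non-capturing group, because otherwise the '|' inside the sub-pattern would combine with the outer '|' and a bare 'y' anywhere in the string would match.

```python
import re

# Hypothetical stand-in for the asker's complex sub-pattern.
token = r'x|y|z|[0-9_]'

# (?:...) scopes the inner alternation so that ^ and $ apply to
# the whole sub-pattern, not only to its first or last branch.
starts_or_ends = re.compile(rf'^(?:{token})|(?:{token})$')


def starts_or_ends_with_token(text):
    """Return True if text begins or ends with something token matches."""
    return starts_or_ends.search(text) is not None


for sample in ('xa', 'a9', 'aya'):
    print(sample, starts_or_ends_with_token(sample))
```

Here 'xa' and 'a9' are accepted, while 'aya' is rejected because its 'y' is neither at the start nor at the end.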
