Regex pattern breaks when I surround it with parentheses

I want to match any piece of non-empty text surrounded by either (1) single quotes or (2) double quotes or (3) parentheses. So for example "wekj fowekjf e" or 'ohogkwoefo je e' but not "dl ekj). I figured out that the pattern (["']).+\1|\(.+\) works, but if I surround this pattern with an outer pair of parentheses, making it ((["']).+\1|\(.+\)), then it breaks, because it matches strings like "owoek fij kefw ), which shouldn’t match. Why is that?

I need to surround the pattern with parentheses because I need to capture the whole match in a group.

Test case:

"wdok jeow i "
'oi foeifjeifj ei'
(ekj foiejowek)
£doihoiefj£
ekfj ei 
"owoek fij kefw )

Expected result: The first 3 lines should match but the other lines should not match. In particular, the last line should not match.

EDIT

Forgot to mention what regex engine I was using. I’m just using regex search in VSCode for the moment.

EDIT EDIT

It turns out that the nested parentheses were the cause of the problem (see @tripleee’s answer). That didn’t occur to me at the time, but did I really deserve a downvote for not noticing it?

EDIT EDIT EDIT

OK so my question has been closed because apparently it’s "a duplicate". Let me be clear, this is not a duplicate. The other question already asks about nested parentheses. The reason I asked my question was that it didn’t occur to me that nested parentheses was the root of the problem. This question would be very useful for people who search things like "regex pattern stops working when I surround it with parentheses". This is indeed what I searched and didn’t find an answer. I hope this will still be useful for people.

>Solution :

Adding parentheses renumbers the backreferences, so what used to be in \1 is now in \2. The wrapped pattern therefore needs to be ((["']).+\2|\(.+\)).
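As a quick check of the renumbering, here is a minimal Python sketch that runs the wrapped pattern, with the backreference changed to \2, over the test lines from the question. The names `WRAPPED`, `lines` and `captured` are only illustrative, and Python's `re` module is not the engine VS Code uses, so treat this as an illustration of the group numbering rather than of every engine detail.

```python
import re

# The outer pair of parentheses is now group 1; the quote character is
# captured by group 2, so the backreference has to be \2 rather than \1.
WRAPPED = re.compile(r"""((["']).+\2|\(.+\))""")

# Test lines copied from the question.
lines = [
    '"wdok jeow i "',
    "'oi foeifjeifj ei'",
    "(ekj foiejowek)",
    "£doihoiefj£",
    "ekfj ei ",
    '"owoek fij kefw )',
]

# Collect the text captured by the outer group for every line that matches.
captured = [m.group(1) for line in lines if (m := WRAPPED.search(line))]
print(captured)
```

Only the first three lines should end up in `captured`; the last line has an opening double quote but no closing one, so neither alternative can match it.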

Very briefly: groups are numbered by the position of their opening parenthesis. \1 refers to the string captured by the group whose opening parenthesis comes first, \2 to the group whose opening parenthesis comes second, and so on.
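The numbering rule can be seen directly by printing the captured groups. The sketch below uses Python's `re` module and a hypothetical three-group variant of the pattern (the extra inner group around `.+` is added only for illustration). It also shows that Python refuses a \1 placed inside group 1 itself; other engines may accept it silently, which is presumably why the question saw wrong matches rather than an error.

```python
import re

# Opening parentheses, left to right:
#   group 1 = the whole quoted string
#   group 2 = the quote character
#   group 3 = the text between the quotes (added here for illustration)
m = re.fullmatch(r"""((["'])(.+)\2)""", "'abc'")
print(m.groups())

# After wrapping, \1 would point at the enclosing group, which is still
# open at that point. Python's re module rejects this at compile time.
try:
    re.compile(r"""((["']).+\1|\(.+\))""")
    rejected = False
except re.error as exc:
    rejected = True
    print("rejected:", exc)
```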
