Non Capturing group lazy when pattern repeats

I am trying to capture 1-2 groups in a line. If a line has a dash I want a group for before and a group for after the dash. If it does not then I would like 1 group of everything.

However, occasionally a line will start with 'Remove - ', which is a phrase I would like to ignore.

Example data:

| Strings |
| -------- |
| Remove - Precision Speed - Recap |
| Precision Speed - Recap |
| Remove - Precision Speed |
| Precision Speed |

The first two should each capture group 1: 'Precision Speed' and group 2: 'Recap', while the last two should capture only one group: 'Precision Speed'.

Right now I have ^(?:Remove - )?(.+)(?:\s*-\s*)(.*), which works correctly for the first two lines (I believe because they contain a second dash). For the third line it captures 'Remove' and 'Precision Speed', and for the fourth line it doesn't capture anything.
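To make the behaviour described above reproducible, here is a minimal Python sketch that runs the pattern from the question against the four sample strings. Python's `re` module and the helper name `try_original` are assumptions for illustration; the question does not say which regex engine is in use.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"^(?:Remove - )?(.+)(?:\s*-\s*)(.*)")

def try_original(line):
    """Return the captured groups as a tuple, or None when nothing matches."""
    m = ORIGINAL.match(line)
    return m.groups() if m else None

samples = [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]

for line in samples:
    # The dash separator is mandatory in this pattern, so a line with no
    # dash cannot match, and a line with only the 'Remove - ' dash forces
    # the engine to skip the optional prefix and use that dash instead.
    print(repr(line), "->", try_original(line))
```

The greedy (.+) also keeps the space before the dash, so group 1 carries trailing whitespace in this engine.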

Solution:

You may use the following pattern:

^(?:Remove - )?([^-]+)(?: - ([^-]+))?$
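As a quick check, the pattern can be run against the four sample strings from the question. This is a sketch assuming Python's `re` module; the helper name `split_line` is hypothetical.

```python
import re

# The suggested pattern: an optional 'Remove - ' prefix, one dash-free
# group, and an optional second dash-free group after ' - '.
SOLUTION = re.compile(r"^(?:Remove - )?([^-]+)(?: - ([^-]+))?$")

def split_line(line):
    """Return (group1, group2); group2 is None when the line has one part."""
    m = SOLUTION.match(line)
    return m.groups() if m else None

samples = [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]

for line in samples:
    print(repr(line), "->", split_line(line))
```

Because the second group sits inside an optional non-capturing group, lines without a second part still match, and the unused group is reported as `None` in Python.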

If you're dealing with multiline text, add \r\n to the negated character classes so that a match cannot span multiple lines, and enable your engine's multiline flag so that ^ and $ anchor at line boundaries:

^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$
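A minimal sketch of the multiline variant, assuming Python's `re` module with `re.MULTILINE` and input that uses "\n" line endings. The helper name `split_lines` and the sample text are illustrative only.

```python
import re

# Multiline variant: \r and \n are excluded from both groups, and
# re.MULTILINE lets ^ and $ anchor at every line boundary.
PATTERN = re.compile(
    r"^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$",
    re.MULTILINE,
)

def split_lines(text):
    """Return one (group1, group2) tuple per matching line."""
    return [m.groups() for m in PATTERN.finditer(text)]

# Sample input assumed to use '\n' line endings.
text = (
    "Remove - Precision Speed - Recap\n"
    "Precision Speed - Recap\n"
    "Remove - Precision Speed\n"
    "Precision Speed"
)

for groups in split_lines(text):
    print(groups)
```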

