Regex: how to select lines based on two conditions

I have the following 3 lines:

CO2_Max
Max_CO2
CO2_Max_SP

I use Python and want a regex pattern that selects the first two lines and ignores the third one
(that is, the line contains the word ‘Max’ and does not end with ‘SP’).

I tried the following pattern:

".*Max.*[^SP]"

This selected all three lines, and the third line was matched only partially, as CO2_Max_ with the trailing SP left out.

And the following pattern:

".*Max.*[^SP]$"

returned only the second line, for some reason.
Any ideas how to make this work as intended?

Solution:

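Use a negative lookbehind. The character class [^SP] does not mean "not ending with SP"; it matches exactly one character that is neither 'S' nor 'P', which is why both attempts misbehave. Try this: ^.*Max.*(?<!SP)$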

Test here: https://regex101.com/r/sVvpj1/1

import re

s = """CO2_Max
Max_CO2
CO2_Max_SP"""

# re.M (MULTILINE) lets ^ and $ match at the start and end of every line
regex = re.compile(r'^.*Max.*(?<!SP)$', re.M)
matches = regex.findall(s)

print(matches)
# ['CO2_Max', 'Max_CO2']

Explanation:

^           matches the start of a line (because of re.M)
.*Max.*     matches any characters before and after 'Max' on the same line
(?<!SP)$    asserts that the two characters just before the end of the line are not 'SP'
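
To see why the two attempts from the question behave as described, the sketch below runs all three patterns over the same text. It assumes the patterns were tried against the whole multi-line string with the multiline flag on (as regex testers commonly do); the dictionary keys and variable names are illustrative and not from the original post. The final list comprehension is an alternative without regex, added only as a cross-check of the two conditions.

```python
import re

text = "CO2_Max\nMax_CO2\nCO2_Max_SP"

# Labels are hypothetical names for this sketch.
patterns = {
    # [^SP] consumes one character that is not 'S' or 'P'. A newline
    # qualifies, so lines 1 and 2 match together with their trailing
    # newline, and line 3 matches up to the '_' just before 'SP'.
    "char_class": r".*Max.*[^SP]",
    # With $ added, the consumed character must be the last one on the
    # line. 'CO2_Max' has nothing left after 'Max', and 'CO2_Max_SP'
    # ends in 'P', so only 'Max_CO2' survives.
    "char_class_anchored": r".*Max.*[^SP]$",
    # The lookbehind consumes nothing; it only checks that the two
    # characters before the end of the line are not 'SP'.
    "lookbehind": r"^.*Max.*(?<!SP)$",
}

results = {name: re.findall(p, text, re.M) for name, p in patterns.items()}
for name, matches in results.items():
    print(name, matches)

# Alternative without regex: state the two conditions directly.
plain = [line for line in text.splitlines()
         if "Max" in line and not line.endswith("SP")]
print(plain)
```

If the names are already available as separate strings, the plain string checks may be easier to read than the regex; the lookbehind version is more convenient when scanning one large block of text.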