Python regex that finds a variable string between whitespace, punctuation and/or string end

I need a regular expression that finds a substring (it’s a variable) that is preceded and followed by punctuation, whitespace, or the start/end of the string. I don’t know the length or content of the substring in advance. I’ve come up with [\?\.!\- ]1abc[\?\.!\- ] (a specific example where the substring is 1abc), but I don’t know how to allow the start or end of the string as a boundary. Given the following list:

  • "1abc"
  • "2131abc2411abc"
  • "Hausstrasse 1abc"
  • "Parkallee 1abc "
  • "1abc-2"
  • "abc-def-1abc!"

I want every line to match except "2131abc2411abc". I also tried the pattern [\?\.!\- ]*1abc[\?\.!\- ]*, but with it "2131abc2411abc" matches as well.

Could someone help me out?

Solution:

Use (^|[\?\.!\- ]) to match either the start of the string or one of the boundary characters. For the end, do the same with $: ([\?\.!\- ]|$).

In addition, write 1abc outside any character class so that it is matched literally; a character class matches only a single character from its set.

import re
match = re.search(r'(^|[\?\.!\- ])1abc([\?\.!\- ]|$)', s)  # s is the string to test; None means no match
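
Since the substring is a variable in the question, the pattern can also be built at runtime. The following is only a sketch of that idea, not part of the original answer: the helper name build_pattern and the BOUNDARY constant are made up for illustration, and the boundary set is assumed to be the same five characters used above. re.escape protects against substrings that contain regex metacharacters such as "." or "+".

```python
import re

# Boundary characters taken from the answer's character class (assumption:
# these five characters are the only punctuation/whitespace that matter).
BOUNDARY = r"[\?\.!\- ]"


def build_pattern(substring):
    """Hypothetical helper: compile a pattern that matches `substring`
    only when it sits between boundary characters or the string edges."""
    # re.escape makes the variable part match literally.
    return re.compile(rf"(?:^|{BOUNDARY}){re.escape(substring)}(?:{BOUNDARY}|$)")


# The sample strings from the question.
samples = [
    "1abc",
    "2131abc2411abc",
    "Hausstrasse 1abc",
    "Parkallee 1abc ",
    "1abc-2",
    "abc-def-1abc!",
]

pattern = build_pattern("1abc")
matched = [s for s in samples if pattern.search(s)]

for s in samples:
    print(repr(s), "match" if pattern.search(s) else "no match")
```

With the question's list, every string except "2131abc2411abc" should be reported as a match. The non-capturing groups (?:...) are used only because the boundary characters themselves are not needed afterwards; the capturing groups from the answer behave the same for re.search.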