Regex to match string and special characters

I am new to regex and want to know how to write a pattern that matches text made of letters and special characters (such as spaces and commas), followed by a slash and a tag of three or more capital letters.

Suppose I have a string like this:

my_string = 'Syrians/NORP, Turkish/NORP, Turkish/NORP, Turkish/NORP, the last 2 , 3 years/DATE, Turkey/LOC'

What I have tried:

my_new_string = re.findall('[\w+\,]+/[A-Z]{4}', my_string)
# result
['Syrians/NORP', 'Turkish/NORP', 'Turkish/NORP', 'Turkish/NORP', 'years/DATE']

Expected result:

['Syrians/NORP', 'Turkish/NORP', 'Turkish/NORP', 'Turkish/NORP', 'the last 2 , 3 years/DATE', 'Turkey/LOC']

I also struggled to write the part of the pattern that matches three or more capital letters.

Can you propose a good solution? Thanks in advance!

Solution:

>>> re.findall(r'\w[\w, ]+/[A-Z]{3,4}', my_string)
['Syrians/NORP', 'Turkish/NORP', 'Turkish/NORP', 'Turkish/NORP', 'the last 2 , 3 years/DATE', 'Turkey/LOC']

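Just add a space to your character class (the '+' after \w is not needed there, since inside a class it is a literal plus sign rather than a quantifier), and use a range of 3 to 4 to match "LOC" (or whatever range you need; {3,} means three or more). Start the pattern with \w so that a match cannot begin with a space or comma. Note that \w also matches "_", which is not a problem here.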
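The points above can be checked with a short script. This is only a sketch: it swaps {3,4} for the open-ended {3,} to cover "3 letters up", and the names TAGGED_PHRASE, plus_is_literal and without_leading_w are made up for illustration.

```python
import re

my_string = ('Syrians/NORP, Turkish/NORP, Turkish/NORP, Turkish/NORP, '
             'the last 2 , 3 years/DATE, Turkey/LOC')

# \w          first character must be a word character (no leading space or comma)
# [\w, ]+     then one or more word characters, commas, or spaces
# /[A-Z]{3,}  a slash followed by three or more capital letters
TAGGED_PHRASE = re.compile(r'\w[\w, ]+/[A-Z]{3,}')

matches = TAGGED_PHRASE.findall(my_string)
print(matches)

# Inside a character class, '+' is a literal plus sign, not a quantifier.
plus_is_literal = re.findall(r'[\w+]+', 'a+b c')
print(plus_is_literal)

# Without the leading \w, later matches pick up the preceding ", ".
without_leading_w = re.findall(r'[\w, ]+/[A-Z]{3,}', my_string)
print(without_leading_w[1])
```

If the tags can never be longer than four letters, keeping {3,4} as in the answer is the stricter choice.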
