Regex to match string and special characters


I am new to regex and I want to know how to write a pattern that matches text made of letters and special characters, followed by a tag of 3 or more capital letters.

Suppose I have a string like this:

my_string = 'Syrians/NORP, Turkish/NORP, Turkish/NORP, Turkish/NORP, the last 2 , 3 years/DATE, Turkey/LOC'

What I have tried:

my_new_string = re.findall('[\w+\,]+/[A-Z]{4}', my_string)
['Syrians/NORP', 'Turkish/NORP', 'Turkish/NORP', 'Turkish/NORP', 'years/DATE']

Expected result:

['Syrians/NORP', 'Turkish/NORP', 'Turkish/NORP', 'Turkish/NORP', 'the last 2 , 3 years/DATE', 'Turkey/LOC']

I also struggled with the part of the pattern that should match 3 or more capital letters.

Can you propose a good solution? Thanks in advance!

>Solution :

>>> re.findall(r'\w[\w, ]+/[A-Z]{3,4}', my_string)
['Syrians/NORP', 'Turkish/NORP', 'Turkish/NORP', 'Turkish/NORP', 'the last 2 , 3 years/DATE', 'Turkey/LOC']

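Just add a space to your character class (the '+' after \w is not needed there: inside a character class it matches a literal plus sign, not a repetition), and use the range {3,4} so that "LOC" matches too (or whatever range you need, e.g. {3,} for 3 or more). Start the pattern with \w so that a match cannot begin with a leading space or comma. Note that \w also matches _, but that is not a problem here.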
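Since the question asks for capital letters "from 3 or up", here is a minimal sketch of the same idea with an open-ended quantifier. It assumes the tags consist only of the letters A-Z, and the names pattern and matches are just for illustration:

```python
import re

my_string = ('Syrians/NORP, Turkish/NORP, Turkish/NORP, Turkish/NORP, '
             'the last 2 , 3 years/DATE, Turkey/LOC')

# \w         the first character must be a word character
#            (so a match cannot start with a space or comma)
# [\w, ]+    then one or more word characters, commas, or spaces
# /          the literal slash between the text and its tag
# [A-Z]{3,}  a tag of three or more capital letters (no upper limit)
pattern = re.compile(r'\w[\w, ]+/[A-Z]{3,}')

matches = pattern.findall(my_string)
print(matches)
# ['Syrians/NORP', 'Turkish/NORP', 'Turkish/NORP', 'Turkish/NORP',
#  'the last 2 , 3 years/DATE', 'Turkey/LOC']
```

Because of the leading \w followed by a '+' class, the text before the slash must be at least two characters long. If one-character entries such as 'a/DET' can occur in your data, changing [\w, ]+ to [\w, ]* should cover them.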
