Finding dates in text using regex

I want to find all dates in a text, except those immediately preceded by the word Effective.
For example, I have the following line:

FEE SCHEDULE Effective January 1, 2022 STATE OF January 7, 2022 ALASKA DISCLAIMER The January 5, 2022

My regex should return ['January 7, 2022', 'January 5, 2022']

How can I do this in Python?

My attempt:

>>> import re
>>> rule = '((?<!Effective\ )([A-Za-z]{3,9}\ *\d{1,2}\ *,\ *\d{4}))'
>>> text = 'FEE SCHEDULE Effective January 1, 2022 STATE OF January 7, 2022 ALASKA DISCLAIMER The January 5, 2022'
>>> re.findall(rule, text)
[('anuary 1, 2022', 'anuary 1, 2022'), ('January 7, 2022', 'January 7, 2022'), ('January 5, 2022', 'January 5, 2022')]

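But it doesn’t work: the date after Effective is still returned, only without its first letter, and every result comes back as a tuple.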

Solution:

You can use

\b(?<!Effective\s)[A-Za-z]{3,9}\s*\d{1,2}\s*,\s*\d{4}(?!\d)

Details:

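  • \b – a word boundary, so the match cannot start in the middle of a word (without it, the engine skips the J and matches anuary 1, 2022)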
  • (?<!Effective\s) – a negative lookbehind that fails the match if there is Effective + a whitespace char immediately to the left of the current location
  • [A-Za-z]{3,9} – three to nine ASCII letters
  • \s* – zero or more whitespaces
  • \d{1,2} – one or two digits
  • \s*,\s* – a comma enclosed with zero or more whitespaces
  • \d{4} – four digits
  • (?!\d) – a negative lookahead that fails the match if there is a digit immediately on the right.
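
To answer the "how in Python" part, here is a minimal sketch that applies the pattern above to the sample line from the question. The constant name DATE_NOT_AFTER_EFFECTIVE is only illustrative; the pattern is written as a raw string and has no capturing groups, so re.findall returns plain strings instead of tuples.

```python
import re

# Pattern from the answer, as a raw string so the backslashes
# reach the regex engine unchanged. The name is illustrative.
DATE_NOT_AFTER_EFFECTIVE = re.compile(
    r"\b(?<!Effective\s)[A-Za-z]{3,9}\s*\d{1,2}\s*,\s*\d{4}(?!\d)"
)

text = (
    "FEE SCHEDULE Effective January 1, 2022 STATE OF January 7, 2022 "
    "ALASKA DISCLAIMER The January 5, 2022"
)

# No capturing groups, so findall returns a list of strings.
dates = DATE_NOT_AFTER_EFFECTIVE.findall(text)
print(dates)  # ['January 7, 2022', 'January 5, 2022']
```

Note that Python's re module only accepts fixed-width lookbehinds, so \s here stands for exactly one whitespace character. If Effective and the date can be separated by more than one space, this pattern would still match that date, and a different approach would be needed.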