Extracting specific dates from strings

I’m attempting to extract some specific dates from text. The text looks like the following:

'Shares of Luxury Goods Makers Slip on Russia Export Ban',
'By Investing.com\xa0-\xa0Mar 15, 2022 By Dhirendra Tripathi',
'Investing.com – Stocks of European retailers such as LVMH (PA:LVMH), Kering (PA:PRTP), H&M (ST:HMb), Moncler (MI:MONC) and Hermès (PA:HRMS) were all down around 4% Tuesday... ',
'',
'',
'',
' ',
'Europe Stocks Open Lower as Wider Sanctions, Covid Rebound Hit Mood',
'By Investing.com\xa0-\xa0Mar 15, 2022 By Geoffrey Smith\xa0',
'Investing.com -- European stock markets opened lower on Tuesday as a fresh round of EU sanctions, a rebound in Covid-19 cases and more signs of red-hot inflation all weighed on... ',
'',
'\xa0',

In this small snippet, I'd like to extract only the two dates: March 15 2022 and March 15 2022.

I’ve attempted:

datefinder.find_dates(text)

dateutil.parser

The first returns all the dates I want plus a load of others that don’t exist.

The second returns "String does not contain a date:"

Can anyone think of the best way I can do this?

Solution:

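You could use a regular expression that matches a capitalized three-letter month abbreviation, a one- or two-digit day, a comma, and a four-digit year: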

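import re

# Not a raw string, so \xa0 is a real non-breaking space, as in the scraped text.
line = 'By Investing.com\xa0-\xa0Mar 15, 2022 By Geoffrey Smith\xa0'

# Capitalized three-letter month, one- or two-digit day, comma, four-digit year.
re_results = re.findall(r'[A-Z][a-z]{2} \d{1,2}, \d{4}', line)

for result in re_results:
    print(result)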

Output:

Mar 15, 2022

You can test regular expressions here https://regexr.com/
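To get from the matched strings to the "March 15 2022" form asked for in the question, one option is to parse each match with datetime.strptime and reformat it. The following is only a sketch: the names DATE_PATTERN and extract_dates are made up for illustration, and it assumes every date looks like "Mar 15, 2022" with English month abbreviations (the %b and %B codes depend on the active locale).

```python
import re
from datetime import datetime

# Abbreviated month, day, comma, four-digit year, e.g. "Mar 15, 2022".
DATE_PATTERN = re.compile(r'\b[A-Z][a-z]{2} \d{1,2}, \d{4}\b')

def extract_dates(lines):
    """Return every 'Mon D, YYYY' date found in the lines as date objects."""
    dates = []
    for line in lines:
        for match in DATE_PATTERN.findall(line):
            try:
                dates.append(datetime.strptime(match, '%b %d, %Y').date())
            except ValueError:
                # Matched the pattern but is not a real date
                # (e.g. "Foo 99, 2022"), so skip it.
                continue
    return dates

lines = [
    'Shares of Luxury Goods Makers Slip on Russia Export Ban',
    'By Investing.com\xa0-\xa0Mar 15, 2022 By Dhirendra Tripathi',
    'Europe Stocks Open Lower as Wider Sanctions, Covid Rebound Hit Mood',
    'By Investing.com\xa0-\xa0Mar 15, 2022 By Geoffrey Smith\xa0',
]

for d in extract_dates(lines):
    # Full month name, day without zero padding, four-digit year.
    print(f'{d:%B} {d.day} {d.year}')
```

Parsing the match also acts as a cheap validity check: strings that fit the pattern but are not real dates raise ValueError and are dropped, which may help with the false positives seen with datefinder.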
