Extracting specific dates from strings

I’m attempting to extract some specific dates from text. The text looks like the following:

'Shares of Luxury Goods Makers Slip on Russia Export Ban',
'By Investing.com\xa0-\xa0Mar 15, 2022 By Dhirendra Tripathi',
'Investing.com – Stocks of European retailers such as LVMH (PA:LVMH), Kering (PA:PRTP), H&M (ST:HMb), Moncler (MI:MONC) and Hermès (PA:HRMS) were all down around 4% Tuesday... ',
' ',
'Europe Stocks Open Lower as Wider Sanctions, Covid Rebound Hit Mood',
'By Investing.com\xa0-\xa0Mar 15, 2022 By Geoffrey Smith\xa0',
'Investing.com -- European stock markets opened lower on Tuesday as a fresh round of EU sanctions, a rebound in Covid-19 cases and more signs of red-hot inflation all weighed on... ',

From this small snippet I’d like to extract only the two dates: March 15 2022 and March 15 2022.

I’ve attempted:



The first returns all the dates I want plus a load of others that don’t exist.

The second returns "String does not contain a date:"

Can anyone think of the best way I can do this?

>Solution :

You could use a regular expression:

import re

line = 'By Investing.com\xa0-\xa0Mar 15, 2022 By Geoffrey Smith\xa0'

re_results = re.findall(r'[A-Z][a-z]{2} \d{1,2}, \d{4}', line)

for result in re_results:
    print(result)

Mar 15, 2022

You can test regular expressions at https://regexr.com/
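The pattern above accepts any capitalised three-letter word followed by numbers, so it could also match something like "Abc 99, 2022". As a minimal sketch, assuming the dates always use English three-letter month abbreviations as in the sample, the pattern can be restricted to real month names and each match converted to a datetime with datetime.strptime. The names DATE_RE and extract_dates are made up for this illustration, and %b / %B depend on the locale (Python's default C locale gives English names).

```python
import re
from datetime import datetime

# Sample lines copied from the question (\xa0 is a non-breaking space).
lines = [
    'Shares of Luxury Goods Makers Slip on Russia Export Ban',
    'By Investing.com\xa0-\xa0Mar 15, 2022 By Dhirendra Tripathi',
    'Europe Stocks Open Lower as Wider Sanctions, Covid Rebound Hit Mood',
    'By Investing.com\xa0-\xa0Mar 15, 2022 By Geoffrey Smith\xa0',
]

# Hypothetical helper pattern: only real month abbreviations are accepted,
# so other capitalised three-letter words are not matched.
DATE_RE = re.compile(
    r'\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}, \d{4}\b'
)

def extract_dates(lines):
    """Return a datetime for every 'Mon D, YYYY' date found in the lines."""
    dates = []
    for line in lines:
        for match in DATE_RE.findall(line):
            # strptime raises ValueError for impossible dates such as 'Feb 31, 2022'.
            dates.append(datetime.strptime(match, '%b %d, %Y'))
    return dates

dates = extract_dates(lines)

# Format as in the question: full month name, day without padding, year.
formatted = [f'{d:%B} {d.day} {d.year}' for d in dates]
print(formatted)  # ['March 15 2022', 'March 15 2022']
```

Parsing into datetime objects also makes it easy to sort or filter the results afterwards.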
