Problems in converting ".to_datetime" in Python

I have the following list:

l = [<div class="date">8 December 2004</div>,
 <div class="date">6 December 2004</div>,
 <div class="date">18 October 2004</div>,
 <div class="date">9 October 2004</div>,
 <div class="date">8 August 2004</div>,
 <div class="date">18 June 2004</div>,
 <div class="date">23 December 2005</div>,
 <div class="date">19 December 2005</div>,
 <div class="date">19 December 2005</div>,
 <div class="date">15 December 2005</div>]

I would like to convert it into a DataFrame with a Date column of datetime type (using pd.to_datetime).

I tried several approaches (one is shown below), but I couldn't get any of them to work.

pd.to_datetime(pd.DataFrame({'Date':l}), format = '%d %B %Y')        

Can anyone help me?

Thanks!

Solution:

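pd.to_datetime needs a column of date strings, not a whole DataFrame, and the list items still carry the <div> markup. Extract the text inside the tags with BeautifulSoup first, then convert the column to datetimes: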

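import pandas as pd
from bs4 import BeautifulSoup

# Assumes each item in l is an HTML string; if the items are already
# bs4 Tag objects, x.text returns the text directly.
df = pd.DataFrame({'Date': [BeautifulSoup(x, features="lxml").text for x in l]})
# Parse day, full month name and four-digit year
df['Date'] = pd.to_datetime(df['Date'], format='%d %B %Y')
print(df)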
        Date
0 2004-12-08
1 2004-12-06
2 2004-10-18
3 2004-10-09
4 2004-08-08
5 2004-06-18
6 2005-12-23
7 2005-12-19
8 2005-12-19
9 2005-12-15
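
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It makes a few assumptions that are not in the question: the list (named html_items here, a hypothetical name) holds plain HTML strings, Python's built-in "html.parser" is swapped in for lxml so nothing extra has to be installed, and one deliberately bad entry is added to show the optional errors="coerce" argument.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical sample input: the <div> elements stored as plain HTML strings.
# The last entry is intentionally not a date.
html_items = [
    '<div class="date">8 December 2004</div>',
    '<div class="date">18 October 2004</div>',
    '<div class="date">not a date</div>',
]

# Strip the markup. "html.parser" ships with Python and is used here
# instead of lxml (an assumption made for portability).
texts = [
    BeautifulSoup(item, "html.parser").get_text(strip=True)
    for item in html_items
]

df = pd.DataFrame({"Date": texts})

# %d = day of month, %B = full month name, %Y = four-digit year.
# errors="coerce" turns strings that do not match the format into NaT
# instead of raising an exception.
df["Date"] = pd.to_datetime(df["Date"], format="%d %B %Y", errors="coerce")

print(df.dtypes)
print(df)
```

Printing df.dtypes is a quick way to confirm that the column really is a datetime type and not plain strings. Whether to coerce bad values or let them raise depends on how much you trust the scraped data.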