Format Pandas date array with mixed formats

I’m trying to unify dates in a column as they come in different formats; current date entries:

[… '18-Aug-21' '16-Aug-21' '17-Aug-21'
'22-Aug-21' '21-Aug-21' '20-Aug-21' '19-Aug-21' '23-Aug-21' '24-Aug-21'
'25-Aug-21' '28-Aug-21' '26-Aug-21' '27-Aug-21' '31-Aug-21' '30-Aug-21'
'29-Aug-21' '06 Sep 2021' '07 Sep 2021' '23 Sep 2021' '17 Sep 2021'
'18 Sep 2021' '30 Sep 2021' '11 Sep 2021' '12 Sep 2021' '20 Sep 2021'
'15 Sep 2021' '16 Sep 2021' '08 Sep 2021' '09 Sep 2021' '24 Sep 2021'
'25 Sep 2021' '03 Sep 2021' '10 Sep 2021' '19 Sep 2021' '01 Sep 2021'
'29 Sep 2021' '26 Sep 2021' '27 Sep 2021' '13 Sep 2021' '14 Sep 2021'
'02 Sep 2021' '04 Sep 2021' '05 Sep 2021' …]

#1: trying to replace the dash here doesn’t work on all dates

#2: when the year is YY as in ‘6-Aug-21’, how can I format it?

for date in DF_all["SALES_DATE"]:
    date = date.replace("-"," ")

DF_all["SALES_DATE"] = pd.to_datetime(DF_all["SALES_DATE"], format='%d %b%Y', errors='ignore')

print(DF_all["SALES_DATE"].unique())

Output:

[… '18-Aug-21' '16-Aug-21' '17-Aug-21'
'22-Aug-21' '21-Aug-21' '20-Aug-21' '19-Aug-21' '23-Aug-21' '24-Aug-21'
'25-Aug-21' '28-Aug-21' '26-Aug-21' '27-Aug-21' '31-Aug-21' '30-Aug-21'
'29-Aug-21' '06 Sep 2021' '07 Sep 2021' '23 Sep 2021' '17 Sep 2021'
'18 Sep 2021' '30 Sep 2021' '11 Sep 2021' '12 Sep 2021' '20 Sep 2021'
'15 Sep 2021' '16 Sep 2021' '08 Sep 2021' '09 Sep 2021' '24 Sep 2021'
'25 Sep 2021' '03 Sep 2021' '10 Sep 2021' '19 Sep 2021' '01 Sep 2021'
'29 Sep 2021' '26 Sep 2021' '27 Sep 2021' '13 Sep 2021' '14 Sep 2021'
'02 Sep 2021' '04 Sep 2021' …]

Is there a preferred method in python that solves this issue?

Solution:

I recommend dateutil for this:

import dateutil.parser  # import the submodule explicitly; a bare "import dateutil" is not guaranteed to load it
DF_all["SALES_DATE"] = DF_all["SALES_DATE"].apply(dateutil.parser.parse)

Output:

>>> DF_all
0    2021-08-18
1    2021-08-16
2    2021-08-17
3    2021-08-22
4    2021-08-21
5    2021-08-20
6    2021-08-19
7    2021-08-23
8    2021-08-24
9    2021-08-25
...
Name: 0, dtype: datetime64[ns]
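
On question #2: dateutil should read a two-digit year such as 21 as 2021 when a month name sits between the day and the year, and it accepts a day without a leading zero. The following is a small check on strings taken from the question, offered as a sketch rather than a full test; the names `samples` and `results` are made up for illustration.

```python
from dateutil import parser

# One string per format seen in the question, plus the two-digit-year case from #2.
samples = ["18-Aug-21", "06 Sep 2021", "6-Aug-21"]

# parser.parse infers the layout of each string independently.
results = [parser.parse(text) for text in samples]

for text, value in zip(samples, results):
    print(text, "->", value.date())
```

Because the parser guesses the layout per value, it is worth spot-checking a few rows like this before applying it to the whole column, especially if purely numeric dates (where day and month can be confused) might appear.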

You might need to install dateutil first. Run the following in a terminal:

pip install python-dateutil

Or, in IPython or a Jupyter Notebook, run:

!pip install python-dateutil
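
As for why the attempt in the question changed nothing: `date.replace("-", " ")` returns a new string that is only bound to the loop variable, so the column is never updated, and `errors='ignore'` hands back the input unchanged when parsing fails. If installing dateutil is not an option, one possible alternative is to parse the column once per known format with pandas alone and combine the results. This is a sketch assuming only the two formats shown in the question occur; the names `dates`, `short_year`, `long_year`, and `parsed` are hypothetical.

```python
import pandas as pd

# Sample values copied from the question; in practice this would be DF_all["SALES_DATE"].
dates = pd.Series(["18-Aug-21", "06 Sep 2021", "6-Aug-21"])

# Parse with each known format; values that do not match become NaT instead of raising.
short_year = pd.to_datetime(dates, format="%d-%b-%y", errors="coerce")
long_year = pd.to_datetime(dates, format="%d %b %Y", errors="coerce")

# Fill the gaps left by the first format with the results of the second.
parsed = short_year.fillna(long_year)
print(parsed)
```

Any value that matches neither format stays NaT, which makes unexpected entries easy to find with `parsed.isna()`. In pandas 2.0 and later, `pd.to_datetime(dates, format="mixed")` is another option that infers the format for each value.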