Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How can I change timestamp to readable date in pandas when I have mixed timestamp formats

I have a column with different timestamp formats as shown below. I want to convert the date column to readable date. Since the timestamp units are mixed, I find other converted properly while others default to 1970. Is there a way I can convert them together or convert them to a unix timestamp unit before converting to readable date so that it happens uniformly.

data = ["2022-04-14 17:31:03.023","2022-04-20 12:49:50.295",1647597943249,1647519101441,"2022-03-19 18:10:59.024"]
df = pd.DataFrame(data, columns=['date'])
df['newdate'] = pd.to_datetime(df['date'], unit='ns')
df

    date                     newdate
0   2022-04-14 17:31:03.023  2022-04-14 17:31:03.023000
1   2022-04-20 12:49:50.295  2022-04-20 12:49:50.295000
2   1647597943249            1970-01-01 00:27:27.597943249
3   1647519101441            1970-01-01 00:27:27.519101441
4   2022-03-19 18:10:59.024  2022-03-19 18:10:59.024000

If I change unit to ‘ms’ I get


ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()

ValueError: could not convert string to float: '2022-04-14 17:31:03.023'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
4 frames
/usr/local/lib/python3.7/dist-packages/pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()

ValueError: non convertible value 2022-04-14 17:31:03.023 with the unit 'ms'```

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Idea is replace values with missing values if not possible convert to datetimes and then use Series.fillna:

df['newdate'] = (pd.to_datetime(df['date'], unit='ms', errors='coerce')
                   .fillna(pd.to_datetime(df['date'], errors='coerce')))

print (df)
                      date                 newdate
0  2022-04-14 17:31:03.023 2022-04-14 17:31:03.023
1  2022-04-20 12:49:50.295 2022-04-20 12:49:50.295
2            1647597943249 2022-03-18 10:05:43.249
3            1647519101441 2022-03-17 12:11:41.441
4  2022-03-19 18:10:59.024 2022-03-19 18:10:59.024
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading