I am trying to convert a pandas df’s time series column which contains Unix time stamp to a datetime object. When I try to use online tools to see the real datetime it works perfectly fine. However, when I pass it to the pandas to_datetime it converts everything wrongly.
My df:
| date |
|---|
| 1680524178283156 |
| 1680524178283542 |
| 1680524178284691 |
| 1680524178289810 |
the code is just a one-liner:
df["date"] = pd.to_datetime(df["date"], dayfirst=True)
df.head()
The pandas output:
| date |
|---|
| 1970-01-20 10:48:44.178283156 |
| 1970-01-20 10:48:44.178283542 |
| 1970-01-20 10:48:44.178284691 |
| 1970-01-20 10:48:44.178289810 |
The real dates while checking online (https://www.unixtimestamp.com/) are quite recent. Such as Mon Apr 03 2023 12:16:18 GMT+0000 for the first timestamp in the df above. Also the digits looks quite huge for Pandas datetime to convert if I try to play around other methods. Sometime, I also get error like OverflowError: signed integer is greater than maximum
Thanks in advance.
>Solution :
You can divide the date by 1_000_000 and then convert to date:
df['date'] = pd.to_datetime(df['date'] / 1_000_000, unit='s', origin='unix')
print(df)
Prints:
date
0 2023-04-03 12:16:18.283155968
1 2023-04-03 12:16:18.283542016
2 2023-04-03 12:16:18.284691200
3 2023-04-03 12:16:18.289809920
OR: Convert from micro-seconds:
df['date'] = pd.to_datetime(df['date'], unit='us')