Python Pandas – Time series – Timestamp missing the time at exactly 00:00

I am trying to read a CSV file containing a time series of measured values. The problem is that the measurement system omits the time part of the timestamp when exporting entries that fall exactly at 00:00.

Here is an excerpt from a csv file:

Time    NS          Status  u12 u23 u31 p1  q1
27.12.2023 23:30:00 0   0   20854,6 20482,8 20706,1 7599130 -2050710
27.12.2023 23:40:00 0   0   20882,8 20510,9 20728,6 7494070 -2078320
27.12.2023 23:50:00 0   0   20819,2 20448,5 20674,7 7672400 -1929610
28.12.2023          0   0   20792,9 20413,2 20645,9 7565910 -1942710
28.12.2023 00:10:00 0   0   20768,6 20368,6 20613,4 7174330 -2002890

And I used:

import pandas as pd

df = pd.read_csv('C:/Python/Input/measured_values.csv',
                 sep='\t', decimal=',',
                 skiprows=1, encoding='unicode_escape',
                 parse_dates=['Time'], dayfirst=True)

  • I have tried different options for the parse_dates parameter to solve the problem.
  • I also tried to create a new timestamp column by incrementing the time between the start and end of the time series, which isn't simple because I have to account for the time change.

Is there a simple, efficient, and robust way to read this kind of CSV and have the Time column parsed as datetime64[ns]?

>Solution :

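You could try passing date_format='mixed' (available in pandas 2.0 and later), so that the format is inferred for each value individually and the date-only entry is parsed as midnight. You should probably also remove skiprows=1: if the header row is the first line of the file, skipping it means parse_dates=['Time'] can no longer find the column: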

df = pd.read_csv(filename, sep='\t', decimal=',', encoding='unicode_escape',
                 parse_dates=['Time'], dayfirst=True, date_format='mixed')

print(df)
                 Time  NS          Status      u12      u23      u31       p1       q1
0 2023-12-27 23:30:00   0               0  20854.6  20482.8  20706.1  7599130 -2050710
1 2023-12-27 23:40:00   0               0  20882.8  20510.9  20728.6  7494070 -2078320
2 2023-12-27 23:50:00   0               0  20819.2  20448.5  20674.7  7672400 -1929610
3 2023-12-28 00:00:00   0               0  20792.9  20413.2  20645.9  7565910 -1942710
4 2023-12-28 00:10:00   0               0  20768.6  20368.6  20613.4  7174330 -2002890

print(df.dtypes)
Time              datetime64[ns]
NS                         int64
        Status             int64
u12                      float64
u23                      float64
u31                      float64
p1                         int64
q1                         int64
dtype: object
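
If date_format='mixed' is not available, or a stricter parse is preferred, one alternative is to read the Time column as text, append the missing midnight time, and then parse everything with a single explicit format. The following is only a minimal sketch under assumptions: the in-memory sample is hypothetical and shortened to a few columns from the excerpt, and it assumes date-only rows never contain a space.

```python
import io

import pandas as pd

# Hypothetical in-memory sample mimicking the excerpt
# (tab-separated, decimal comma, one date-only row at midnight).
sample = (
    "Time\tNS\tStatus\tu12\n"
    "27.12.2023 23:50:00\t0\t0\t20819,2\n"
    "28.12.2023\t0\t0\t20792,9\n"
    "28.12.2023 00:10:00\t0\t0\t20768,6\n"
)

# Read the Time column as plain strings first, without date parsing.
df = pd.read_csv(io.StringIO(sample), sep="\t", decimal=",",
                 dtype={"Time": str})

# Rows without a space contain only a date; append midnight to them.
time_text = df["Time"].str.strip()
date_only = ~time_text.str.contains(" ", regex=False)
time_text = time_text.where(~date_only, time_text + " 00:00:00")

# Parse with one explicit format; a row that does not match raises an error.
df["Time"] = pd.to_datetime(time_text, format="%d.%m.%Y %H:%M:%S")

print(df)
print(df.dtypes)
```

Because the format is explicit, an unexpected timestamp layout fails loudly instead of being guessed, and the approach should also work on pandas versions older than 2.0.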