Finding consecutive minutes in pandas

I am following this article – Pandas output date, start and end time and event status given datetime continuity

The post shows how to test for consecutive hours, but I need to test for consecutive minutes, so I changed the divisor in that line of code from 3600 to 60 (hours to minutes):

# test consecutive minutes
df['g'] = df['Date'].diff().dt.total_seconds().div(60).ne(1)

The result is True for every row, including the rows that are one minute apart:

Date                  meter    g
2009-02-13 13:23:00   53.49    True
2009-02-13 13:24:00   64.91    True
2009-02-13 13:25:00   32.04    True
2009-02-13 13:26:00   45.94    True
2009-02-13 15:45:00   45.94    True

The result should be:

Date                  meter    g
2009-02-13 13:23:00   53.49    True
2009-02-13 13:24:00   64.91    False
2009-02-13 13:25:00   32.04    False
2009-02-13 13:26:00   45.94    False
2009-02-13 15:45:00   45.94    True

What is wrong here?

>Solution :

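The issue with your code is likely floating-point approximation: after dividing by 60, the value may not be exactly 1, so the comparison with ne(1) returns True. Rounding the values before comparing would solve this:

df['g'] = pd.to_datetime(df['Date']).diff().dt.total_seconds().div(60).round().ne(1)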
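
As a minimal sketch of how the exact float comparison can miss, assume (hypothetically, since the data in the question shows whole seconds) that the timestamps carry small sub-second offsets. The quotient is then close to 1 but not equal to it, and only the rounded version treats the rows as consecutive. The sample values below are made up for illustration.

```python
import pandas as pd

# Hypothetical timestamps roughly one minute apart, with tiny sub-second offsets.
dates = pd.Series(pd.to_datetime([
    "2009-02-13 13:23:00.000000",
    "2009-02-13 13:24:00.000400",
    "2009-02-13 13:25:00.000100",
]))

# Gap to the previous row, expressed in minutes as a float.
minutes = dates.diff().dt.total_seconds().div(60)

strict = minutes.ne(1)           # exact float comparison
rounded = minutes.round().ne(1)  # compare after rounding to the nearest whole minute

print(pd.DataFrame({"Date": dates, "strict": strict, "rounded": rounded}))
```

The first row is True in both columns because it has no predecessor, so its difference is NaN. Keep in mind that rounding also accepts any gap between roughly 30 and 90 seconds, and that an exact Timedelta comparison would flag these offset rows as non-consecutive too.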

However, there is a much better way: compare the Timedelta values directly.

df['g'] = pd.to_datetime(df['Date']).diff().ne('1min')

output:

                  Date  meter      g
0  2009-02-13 13:23:00  53.49   True
1  2009-02-13 13:24:00  64.91  False
2  2009-02-13 13:25:00  32.04  False
3  2009-02-13 13:26:00  45.94  False
4  2009-02-13 15:45:00  45.94   True
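
For reference, here is a self-contained sketch that rebuilds the sample frame from the question and reproduces the table above. It uses an explicit pd.Timedelta object instead of the string '1min'; that is only a stylistic choice in this sketch, and the name one_minute is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": [
        "2009-02-13 13:23:00",
        "2009-02-13 13:24:00",
        "2009-02-13 13:25:00",
        "2009-02-13 13:26:00",
        "2009-02-13 15:45:00",
    ],
    "meter": [53.49, 64.91, 32.04, 45.94, 45.94],
})
df["Date"] = pd.to_datetime(df["Date"])

one_minute = pd.Timedelta(minutes=1)

# True where the gap to the previous row is not exactly one minute.
# The first row has no previous row (its diff is NaT), so it is also True.
df["g"] = df["Date"].diff().ne(one_minute)

print(df)
```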

For your initial question (flagging the first and last row of each group), combine a forward and a backward difference:

forward = pd.to_datetime(df['Date']).diff().ne('1min')
reverse = (-pd.to_datetime(df['Date']).diff(-1)).ne('1min')
df['g'] = forward | reverse

output:

                  Date  meter      g
0  2009-02-13 13:23:00  53.49   True
1  2009-02-13 13:24:00  64.91  False
2  2009-02-13 13:25:00  32.04  False
3  2009-02-13 13:26:00  45.94   True
4  2009-02-13 15:45:00  45.94   True
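
If what you ultimately need is the start and end time of each run rather than a flag, one common pattern (a sketch, not taken from the linked post) is to turn the forward flag into a run number with cumsum and then aggregate per run. The names new_run, run_id and summary are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2009-02-13 13:23:00",
        "2009-02-13 13:24:00",
        "2009-02-13 13:25:00",
        "2009-02-13 13:26:00",
        "2009-02-13 15:45:00",
    ]),
    "meter": [53.49, 64.91, 32.04, 45.94, 45.94],
})

# True at the first row of every run of consecutive minutes.
new_run = df["Date"].diff().ne(pd.Timedelta(minutes=1))

# Cumulative sum of the flags gives each run its own integer label.
run_id = new_run.cumsum().rename("run")

# First and last timestamp, plus row count, for each run.
summary = df.groupby(run_id)["Date"].agg(start="first", end="last", rows="count")

print(summary)
```

With the sample data this yields two runs: one of four rows from 13:23 to 13:26, and a single-row run at 15:45.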
