Pandas groupby all rows between boolean column values

November 21, 2021

I am looking for the Pandas syntax to group the rows that fall between boolean values in a column.

DataTable

distance	time	new
0	2020-10-11 1:30:15	true
5	2020-10-11 1:31:15	false
10	2020-10-11 1:32:15	false
1	2020-10-11 2:30:15	true
2	2020-10-11 2:31:15	false
30	2020-10-11 2:32:15	false
31	2020-10-11 2:33:15	false
0	2020-10-12 1:31:15	true
5	2020-10-12 1:32:15	false

Question

I want to group all the rows that belong to each activity. A new activity starts at every row where the new column is true. Based on the data above, there should be 3 groups. Each group needs to start at a row with True and end at the last False before the next True.
How can this be achieved using groupby?

Code

Here is how I currently use groupby to group by date. As described above, I want to change this so that it groups rows into activities based on the "new" column. I still plan to iterate over the groups in a for loop like this one.

for idx, day in df.groupby(df['time'].dt.date):
    print(idx)
    print(day)

Solution

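Group by df['new'].cumsum(). Each True increments the running sum, so all rows from one True up to the next share the same key (this assumes new is a boolean column):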

for idx, day in df.groupby(df['new'].cumsum()):
    print('-----')
    print(idx)
    print(day)

Output:

-----
1
   distance                time    new
0         0  2020-10-11+1:30:15   True
1         5  2020-10-11+1:31:15  False
2        10  2020-10-11+1:32:15  False
-----
2
   distance                time    new
3         1  2020-10-11+2:30:15   True
4         2  2020-10-11+2:31:15  False
5        30  2020-10-11+2:32:15  False
6        31  2020-10-11+2:33:15  False
-----
3
   distance                time    new
7         0  2020-10-12+1:31:15   True
8         5  2020-10-12+1:32:15  False
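
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample data by hand, assumes that time is parsed as datetimes and that new holds real booleans, and shows the intermediate key that cumsum produces. The names activity_id and summary, and the aggregated columns, are only illustrative choices and not part of the original question.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand (times zero-padded for parsing).
df = pd.DataFrame({
    "distance": [0, 5, 10, 1, 2, 30, 31, 0, 5],
    "time": pd.to_datetime([
        "2020-10-11 01:30:15", "2020-10-11 01:31:15", "2020-10-11 01:32:15",
        "2020-10-11 02:30:15", "2020-10-11 02:31:15", "2020-10-11 02:32:15",
        "2020-10-11 02:33:15", "2020-10-12 01:31:15", "2020-10-12 01:32:15",
    ]),
    "new": [True, False, False, True, False, False, False, True, False],
})

# True counts as 1 and False as 0, so the running sum increases by one
# at each row that starts a new activity and stays flat in between.
activity_id = df["new"].cumsum().rename("activity")
print(activity_id.tolist())  # [1, 1, 1, 2, 2, 2, 2, 3, 3]

# One row per activity instead of looping over the groups.
summary = df.groupby(activity_id).agg(
    start=("time", "min"),
    end=("time", "max"),
    rows=("distance", "size"),
    max_distance=("distance", "max"),
)
print(summary)
```

If the frame starts with False rows before the first True, those rows end up in a group with key 0, which may or may not be what you want.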