Count consecutive boolean values in Python/pandas array for whole subset

April 6, 2023

I am looking for a way to group a pandas DataFrame by runs of consecutive equal values and apply an aggregation such as count or max to each run.

For example, if I have one column in df:

the result needs to be:

Why: there are two 0s at the beginning, then three 1s,…

What I need is similar to this answer, but every element in a group should get the same value.

The preferred answer would show how to group the consecutive equal elements and apply an aggregation function to each group, so that I could also compute, for example, the max of another column:

    my_column    other_value
0        0           7
1        0           4
2        1           1
3        1           0
4        1           5
5        0           1
6        0           1
7        0           2
8        0           8
9        1           1
10       1           0
11       0           2

and the result would be

>Solution :

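You can build a group id that increments whenever the value changes, then group by it and use transform so that every row in a run gets the same value: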

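# g increases by 1 each time my_column differs from the previous row
g = df["my_column"].ne(df["my_column"].shift()).cumsum()

# transform broadcasts each run's count back to all rows of that run
out = df.groupby(g)["my_column"].transform("count")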

Output :

print(out)

    my_column
0           2
1           2
2           3
3           3
4           3
5           4
6           4
7           4
8           4
9           2
10          2
11          1

NB: to get the max of other_value within each run, use df.groupby(g)["other_value"].transform("max").
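
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the example frame from the question and applies both the count and the max. The column names run_length and run_max, and the variable name run_id, are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "my_column":   [0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0],
    "other_value": [7, 4, 1, 0, 5, 1, 1, 2, 8, 1, 0, 2],
})

# A new run starts wherever the value differs from the previous row;
# the cumulative sum of those change flags gives each run its own id.
run_id = df["my_column"].ne(df["my_column"].shift()).cumsum()

# transform returns a result aligned with the original rows, so every
# row in a run receives that run's aggregated value.
# (run_length and run_max are hypothetical column names.)
df["run_length"] = df.groupby(run_id)["my_column"].transform("count")
df["run_max"] = df.groupby(run_id)["other_value"].transform("max")

print(df)
```

Note that "count" ignores missing values, so if my_column could contain NaN, transform("size") may be the safer choice for run lengths.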