Pandas: pandas count groups if column value less than previous

August 30, 2022

I have a large df. I need to count groups of the ‘years’ column if the year restarts to 1.

I have tried

df["count"] = df.groupby((df["year"]>=df["year"].shift(1)).cumsum()).cumcount()+1

but this doesn’t give the expected result. Below is an example df and the expected output.

year month   alpha
1      0     1.24036
1      1     1.17023
1      0     1.24036
1      1     1.17023
1      0     1.24036
1      1     1.17023
2      0     1.24036
2      1     1.17023
2      0     1.24036
2      1     1.17023
2      0     1.24036
2      1     1.17023
1      0     1.24036
1      1     1.17023
1      0     1.24036
1      1     1.17023
1      0     1.24036
1      1     1.17023
2      0     1.24036
2      1     1.17023
2      0     1.24036
2      1     1.17023
2      0     1.24036
2      1     1.17023

Expected output:

year month   alpha     count
1      0     1.24036     1
1      1     1.17023     1
1      0     1.24036     1    
1      1     1.17023     1
1      0     1.24036     1
1      1     1.17023     1
2      0     1.24036     1
2      1     1.17023     1
2      0     1.24036     1
2      1     1.17023     1
2      0     1.24036     1
2      1     1.17023     1
1      0     1.24036     2
1      1     1.17023     2
1      0     1.24036     2 
1      1     1.17023     2
1      0     1.24036     2
1      1     1.17023     2
2      0     1.24036     2
2      1     1.17023     2
2      0     1.24036     2
2      1     1.17023     2
2      0     1.24036     2
2      1     1.17023     2

>Solution :

You don’t really want a groupby.cumcount, you want to enumerate the groups.

You can use:

m = df['year'].eq(1) & df['year'].shift().ne(1)
df['count'] = m.cumsum()

Or, if you want to ensure having 1 as the first count even if the year does not necessarily start with 1:

df['count'] = m.cumsum() + ~m.iloc[0]

output:

    year  month    alpha  count
0      1      0  1.24036      1
1      1      1  1.17023      1
2      1      0  1.24036      1
3      1      1  1.17023      1
4      1      0  1.24036      1
5      1      1  1.17023      1
6      2      0  1.24036      1
7      2      1  1.17023      1
8      2      0  1.24036      1
9      2      1  1.17023      1
10     2      0  1.24036      1
11     2      1  1.17023      1
12     1      0  1.24036      2
13     1      1  1.17023      2
14     1      0  1.24036      2
15     1      1  1.17023      2
16     1      0  1.24036      2
17     1      1  1.17023      2
18     2      0  1.24036      2
19     2      1  1.17023      2
20     2      0  1.24036      2
21     2      1  1.17023      2
22     2      0  1.24036      2
23     2      1  1.17023      2