Is there an efficient way to compute column values in Pandas using values from previous rows based on conditional values from other columns?

by MR

December 5, 2021

Consider looping through my DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Price': [1000, 1000, 1000, 2000, 2000, 2000, 2000, 1400, 1400],
    'Count': [0, 0, 0, 0, 0, 0, 0, 0, 0]
})

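for idx in df.index:
    # Extend the running count while Price stays above 1500; other rows keep 0.
    if idx > 0 and df.loc[idx, 'Price'] > 1500:
        # Assign through .loc: chained df['Count'].iloc[idx] = ... may write to a copy.
        df.loc[idx, 'Count'] = df.loc[idx - 1, 'Count'] + 1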

Resulting in:

	Price	Count
0	1000	0
1	1000	0
2	1000	0
3	2000	1
4	2000	2
5	2000	3
6	2000	4
7	1400	0
8	1400	0

Is there a more efficient way to do this?

Solution:

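Create pseudo-groups with Series.cumsum, then use groupby.cumcount to number the rows within each group. Every row with Price <= 1500 starts a new group, so a run of rows above 1500 shares a group with the low row just before it, and the 0-based position within that group is the running count: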

groups = df.Price.le(1500).cumsum()
df['Count'] = df.Price.gt(1500).groupby(groups).cumcount()

#    Price  Count
# 0   1000      0
# 1   1000      0
# 2   1000      0
# 3   2000      1
# 4   2000      2
# 5   2000      3
# 6   2000      4
# 7   1400      0
# 8   1400      0
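
To see why the grouping trick reproduces the loop, it may help to wrap both versions in functions and compare them. The sketch below is illustrative only: the names count_runs and count_runs_loop and the threshold parameter are not from the original post, and it assumes Price has no missing values.

```python
import pandas as pd


def count_runs(price: pd.Series, threshold: float = 1500) -> pd.Series:
    """Vectorized running count of consecutive rows above `threshold`."""
    # Each row at or below the threshold bumps the cumulative sum, so it
    # starts a new group; rows above it stay in the current group.
    groups = price.le(threshold).cumsum()
    # cumcount numbers the rows of each group 0, 1, 2, ... in order.
    return price.groupby(groups).cumcount()


def count_runs_loop(price: pd.Series, threshold: float = 1500) -> pd.Series:
    """Plain-Python reference that mirrors the loop in the question."""
    values = price.tolist()
    counts = [0] * len(values)
    for i in range(1, len(values)):
        if values[i] > threshold:
            counts[i] = counts[i - 1] + 1
    return pd.Series(counts, index=price.index)


price = pd.Series([1000, 1000, 1000, 2000, 2000, 2000, 2000, 1400, 1400])

# Intermediate group labels: rows 2 through 6 share one label.
print(price.le(1500).cumsum().tolist())  # [1, 2, 3, 3, 3, 3, 3, 4, 5]

print(count_runs(price).tolist())        # [0, 0, 0, 1, 2, 3, 4, 0, 0]
print(count_runs_loop(price).tolist())   # [0, 0, 0, 1, 2, 3, 4, 0, 0]
```

The two versions also agree when the data starts with a run above the threshold: those rows all land in group 0 and are numbered 0, 1, 2, ..., which is what the loop produces because it skips index 0.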