Consider the following pandas series:
import pandas as pd
s = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1])
I want to identify the first block of 1 values. The block starts when 0 switches to 1 for the first time and ends when it switches back to 0 (it does not have to switch back). All other values should be set to zero. One restriction: no iteration allowed, only pure pandas.
Expected output:
s_new = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>Solution :
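You can identify the first block of 1s by flagging each 1 that follows a 0 with diff().eq(1), taking the cumsum of those flags to number the blocks, and keeping only the values where that number equals 1. The zeros right after the first block also carry the number 1, but they are already 0, so where leaves them unchanged: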
out = s.where(s.diff().eq(1).cumsum().eq(1), 0)
Output:
0 0
1 0
2 0
3 1
4 1
5 1
6 1
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
dtype: int64
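For reuse, the one-liner can be wrapped in a small helper and checked against the expected output from the question. This is only a sketch: the function names first_block and first_block_leading are made up here, and the series is assumed to contain only 0 and 1. Because diff() yields NaN at the first position, a series that starts with 1 does not count that leading run as a block. The second helper is an assumed variant, not part of the original answer, that compares against shift(fill_value=0) so that a leading run of 1s is treated as the first block.

```python
import pandas as pd

s = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1])
s_new = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0])


def first_block(s: pd.Series) -> pd.Series:
    """Keep only the first run of 1s that follows a 0 (hypothetical helper)."""
    starts = s.diff().eq(1)        # True where a 0 switches to 1
    block_id = starts.cumsum()     # running count of block starts seen so far
    return s.where(block_id.eq(1), 0)


def first_block_leading(s: pd.Series) -> pd.Series:
    """Assumed variant: a run of 1s at the very start also counts as a block."""
    starts = s.gt(s.shift(fill_value=0))  # pretend a 0 precedes the series
    return s.where(starts.cumsum().eq(1), 0)


out = first_block(s)
print(out.equals(s_new))

# A series that starts with 1 shows the difference between the two helpers.
t = pd.Series([1, 1, 0, 1])
print(first_block(t).tolist())
print(first_block_leading(t).tolist())
```

Which behaviour is right for a leading run of 1s depends on how strictly "switches from 0 to 1" is read; the question's example does not cover that case.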
Intermediates:
    s  diff  eq(1)  cumsum
0   0   NaN  False       0
1   0   0.0  False       0
2   0   0.0  False       0
3   1   1.0   True       1
4   1   0.0  False       1
5   1   0.0  False       1
6   1   0.0  False       1
7   0  -1.0  False       1
8   1   1.0   True       2
9   0  -1.0  False       2
10  0   0.0  False       2
11  1   1.0   True       3
12  1   0.0  False       3
13  1   0.0  False       3
14  0  -1.0  False       3
15  1   1.0   True       4
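The table of intermediates can be rebuilt by putting each step of the chain into its own column. This is a minimal sketch; the variable name steps and the column labels are chosen here only to mirror the headers above.

```python
import pandas as pd

s = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1])

diff = s.diff()          # change from the previous value (NaN for the first row)
is_start = diff.eq(1)    # True only where a 0 switches to 1
block_id = is_start.cumsum()  # number of block starts seen up to each row

steps = pd.DataFrame(
    {"s": s, "diff": diff, "eq(1)": is_start, "cumsum": block_id}
)
print(steps)
```

Rows where the cumsum column equals 1 are the ones whose values are kept; every other row is replaced by 0.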