I have the following df; the last column is the desired output. Thanks!
group date value desired_first_nonzero
1 jan2019 0 2
1 jan2019 2 2
1 feb2019 3 2
1 mar2019 4 2
1 mar2019 5 2
2 feb2019 0 4
2 feb2019 0 4
2 mar2019 0 4
2 mar2019 4 4
2 apr2019 5 4
I want to group by "group" and, for every row, get the first non-zero value of its group.
>Solution :
You can use GroupBy.transform with a custom function that looks up the first non-zero value in each group. s.ne(0) builds a boolean mask, and idxmax returns the index label of the first True value in that mask:
df['desired_first_nonzero'] = (df.groupby('group')['value']
                                 .transform(lambda s: s[s.ne(0).idxmax()])
                              )
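To see why idxmax picks out the first non-zero entry, here is a minimal sketch on a standalone Series. The values are those of group 2 from the question; the default 0..n-1 index is an assumption made for illustration.

```python
import pandas as pd

# Values of group 2 from the question, with a default 0..n-1 index.
s = pd.Series([0, 0, 0, 4, 5])

mask = s.ne(0)         # False, False, False, True, True
first = mask.idxmax()  # label of the first True, because True is the maximum
print(first)           # 3
print(s[first])        # 4
```

Since idxmax returns the label of the first occurrence of the maximum, a group with no non-zero values would simply give back its first label, and the lookup would return 0.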
Alternatively, using an intermediate Series:
s = df.set_index('group')['value']
df['desired_first_nonzero'] = df['group'].map(s[s.ne(0)].groupby(level=0).first())
Output:
group date value desired_first_nonzero
0 1 jan2019 0 2
1 1 jan2019 2 2
2 1 feb2019 3 2
3 1 mar2019 4 2
4 1 mar2019 5 2
5 2 feb2019 0 4
6 2 feb2019 0 4
7 2 mar2019 0 4
8 2 mar2019 4 4
9 2 apr2019 5 4
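For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the example frame and runs both approaches. The names via_transform, via_map, first_nonzero, edge, t and m are only illustrative, and the all-zero group 3 at the end is an added assumption that is not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    "group": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "date": ["jan2019", "jan2019", "feb2019", "mar2019", "mar2019",
             "feb2019", "feb2019", "mar2019", "mar2019", "apr2019"],
    "value": [0, 2, 3, 4, 5, 0, 0, 0, 4, 5],
})

# Approach 1: transform with idxmax on the non-zero mask.
via_transform = df.groupby("group")["value"].transform(
    lambda s: s[s.ne(0).idxmax()]
)

# Approach 2: first non-zero value per group, mapped back onto each row.
s = df.set_index("group")["value"]
first_nonzero = s[s.ne(0)].groupby(level=0).first()
via_map = df["group"].map(first_nonzero)

expected = [2] * 5 + [4] * 5
print(via_transform.tolist() == expected)  # True
print(via_map.tolist() == expected)        # True

# Hypothetical edge case: group 3 contains only zeros.
edge = pd.DataFrame({"group": [1, 1, 3, 3], "value": [0, 2, 0, 0]})
t = edge.groupby("group")["value"].transform(lambda s: s[s.ne(0).idxmax()])
e = edge.set_index("group")["value"]
m = edge["group"].map(e[e.ne(0)].groupby(level=0).first())
print(t.tolist())         # [2, 2, 0, 0]
print(m.isna().tolist())  # [False, False, True, True]
```

Both approaches agree on the question's data. They differ only when a group has no non-zero value: the transform version falls back to 0, while the map version leaves NaN (and the column becomes float), so pick whichever behaviour suits your data.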