I have a DataFrame and need a running total of the sum of two columns (b + c), computed separately for each name in column a.
DF
a b c
Joe 1 0
Joe 1 0
Joe 0 1
Adam 1 0
Adam 0 1
Adam 0 0
Desired Output:
a b c d
Joe 1 0 1
Joe 1 0 2
Joe 0 1 3
Adam 1 0 1
Adam 0 1 2
Adam 0 0 2
I have tried df['d'] = df.groupby('a')['b','c'].sum()
When I do this I get NaN as a result.
Solution:
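You need a cumulative sum within each group, not an aggregate sum. groupby(...).sum() returns one row per name, whereas cumsum() returns one value per original row, which is what the desired output shows.
First add b and c row by row, then take the cumulative sum of that column grouped by a. Something like this should work: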
df['sum'] = df['b'] + df['c']
df['d'] = df.groupby('a')['sum'].cumsum()
which gives:
      a  b  c  sum  d
0   Joe  1  0    1  1
1   Joe  1  0    1  2
2   Joe  0  1    1  3
3  Adam  1  0    1  1
4  Adam  0  1    1  2
5  Adam  0  0    0  2
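
If you would rather not keep the helper column, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt from the sample data in the question, and the name totals is only an illustrative variable. The last part shows the likely reason the original attempt produced NaN: an aggregate sum is indexed by group name, so it does not line up with the original row index when assigned back.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "a": ["Joe", "Joe", "Joe", "Adam", "Adam", "Adam"],
    "b": [1, 1, 0, 1, 0, 0],
    "c": [0, 0, 1, 0, 1, 0],
})

# Row-wise sum of b and c, then a running total within each name.
# Grouping the summed Series by df["a"] avoids creating a 'sum' column.
df["d"] = (df["b"] + df["c"]).groupby(df["a"]).cumsum()
print(df)

# Why an aggregate sum does not fit here: it has one row per group,
# labelled by name, so its index does not match the 0..5 row index
# and an assignment back to df aligns to nothing.
totals = df.groupby("a")["b"].sum()
print(totals.index.tolist())
```

If you do want the per-group total repeated on every row rather than a running total, groupby(...).transform("sum") keeps the original row index and can be assigned directly.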