I have a dataframe that I would like to group by a given column, but rows should only be merged when another column also matches, while summing a third column. Given this example:
test=pd.DataFrame({'A':['0','0','0','1'],'B':['AAA','AAA','BBB','CCC'],'C':[0.5,0.2,0.3,0.1]})
   A    B    C
0  0  AAA  0.5
1  0  AAA  0.2
2  0  BBB  0.3
3  1  CCC  0.1
I would like to group by A, but keep rows separate when B is different. I am targeting the following dataframe:
   A    B    C
0  0  AAA  0.7
1  0  BBB  0.3
2  1  CCC  0.1
So far I have not found a way to do this.
>Solution :
import pandas as pd
test=pd.DataFrame({'A':['0','0','0','1'],'B':['AAA','AAA','BBB','CCC'],'C':[0.5,0.2,0.3,0.1]})
# Group on both columns so rows are merged only when A and B both match
test.groupby(['A','B'])['C'].sum()
A  B
0  AAA    0.7
   BBB    0.3
1  CCC    0.1
Name: C, dtype: float64
# as_index=False keeps A and B as regular columns instead of a MultiIndex
test.groupby(['A','B'], as_index=False)['C'].sum()
   A    B    C
0  0  AAA  0.7
1  0  BBB  0.3
2  1  CCC  0.1
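For reference, here is a self-contained sketch of the same idea, using the dataframe from the question. It also shows two variations that are not part of the answer above and are only suggestions: calling reset_index() on the grouped Series, and a named aggregation. The output column names C_sum and n_rows are made up for illustration.

```python
import pandas as pd

test = pd.DataFrame({
    'A': ['0', '0', '0', '1'],
    'B': ['AAA', 'AAA', 'BBB', 'CCC'],
    'C': [0.5, 0.2, 0.3, 0.1],
})

# Grouping on both keys merges rows only when A and B both match.
flat = test.groupby(['A', 'B'], as_index=False)['C'].sum()

# Alternative route: aggregate to a Series indexed by (A, B),
# then move the index levels back into ordinary columns.
via_reset = test.groupby(['A', 'B'])['C'].sum().reset_index()

# Named aggregation, in case more than one summary per group is wanted.
# C_sum and n_rows are hypothetical column names chosen for this sketch.
summary = test.groupby(['A', 'B'], as_index=False).agg(
    C_sum=('C', 'sum'),
    n_rows=('C', 'size'),
)

print(flat)
print(summary)
```

Both flat and via_reset should have the plain A, B, C columns the question asks for. Which one to use is mostly a matter of taste.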