I have a pandas DataFrame with rows like this:
   Same1  Same2  Diff3  Encoded1  Encoded2  Encoded3
0     33     22    150         0         0         0
1     33     22    300         1         0         1
I want to combine all rows that have the same ‘Same1’ and ‘Same2’ values by summing the other columns. The expected result:
   Same1  Same2  Diff3  Encoded1  Encoded2  Encoded3
0     33     22    450         1         0         1
What would be the cleanest way to achieve this using pandas?
Executable Python code:
https://trinket.io/python3/1da371fd04
>Solution :
You can group by the two key columns and sum the rest:
out = df.groupby(['Same1', 'Same2']).sum().reset_index()
print(out)
   Same1  Same2  Diff3  Encoded1  Encoded2  Encoded3
0     33     22    450         1         0         1
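
For anyone who wants to run this without the linked trinket, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the two sample rows in the question, and `as_index=False` is used as one possible alternative to calling `reset_index()` afterwards; both should give the same single combined row.

```python
import pandas as pd

# Sample data copied from the two rows shown in the question.
df = pd.DataFrame({
    "Same1": [33, 33],
    "Same2": [22, 22],
    "Diff3": [150, 300],
    "Encoded1": [0, 1],
    "Encoded2": [0, 0],
    "Encoded3": [0, 1],
})

# Group on the two key columns and sum every other column.
# as_index=False keeps Same1/Same2 as regular columns instead of the index.
out = df.groupby(["Same1", "Same2"], as_index=False).sum()
print(out)
```

If only some columns should be summed, selecting them after the `groupby` (for example `df.groupby(["Same1", "Same2"], as_index=False)[["Diff3"]].sum()`) restricts the aggregation to those columns.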