Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Add a column in pandas based on sum of the subgroup values in another column

Here is a simplified version of my dataframe (the number of persons in my dataframe is way more than 3):

df = pd.DataFrame({'Person':['John','David','Mary','John','David','Mary'],
               'Sales':[10,15,20,11,12,18],
               })

The original df

I would like to add a column "Total" to this data frame, which is the sum of total sales per person
The desired df

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

What is the easiest way to achieve this?

I have tried

df.groupby('Person').sum()

but the shape of the output is not congruent with the shape of df.

What I have tried

>Solution :

What you want is the transform method which can apply a function on each group:

df['Total'] = df.groupby('Person')['Sales'].transform(sum)

It gives as expected:

  Person  Sales  Total
0   John     10     21
1  David     15     27
2   Mary     20     38
3   John     11     21
4  David     12     27
5   Mary     18     38
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading