How to attach a groupby aggregate to the original dataframe where the aggregate is placed in a new column at the bottom of each group

I’ve got a dataframe df = pd.DataFrame({'A':[1,1,2,2],'values':np.arange(10,30,5)})

How can I group by A and put the sum of values in a new column sum_values_A, with the sum appearing only once, on the last row of each group? For example:

    A   values  sum_values_A
0   1   10      NaN
1   1   15      25
2   2   20      NaN
3   2   25      45

I tried:

df['sum_values_A'] = df.groupby('A')['values'].transform('sum')

df['sum_values_A'] = df.groupby('A')['sum_values_A'].unique()

But I couldn’t find a way to keep each sum only once, on the bottom row of its group.

Solution:

You can select the last row of each group with a boolean mask and assign the group sum only to those rows:

df.loc[~df['A'].duplicated(keep='last'),
       'sum_values_A'
      ] = df.groupby('A')['values'].transform('sum')

print(df)
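To see why this works, it can help to look at the two pieces separately. The sketch below uses the dataframe from the question; the variable names `last_in_group` and `group_sums` are illustrative and not part of the original answer. Note that "bottom of each group" only reads naturally when the rows of each group are adjacent, as they are here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2], 'values': np.arange(10, 30, 5)})

# duplicated(keep='last') flags every row except the last occurrence of each
# value of A, so negating it leaves True only on the final row of each group.
last_in_group = ~df['A'].duplicated(keep='last')
print(last_in_group.tolist())  # [False, True, False, True]

# transform('sum') broadcasts each group's sum back to every row of the group.
group_sums = df.groupby('A')['values'].transform('sum')
print(group_sums.tolist())  # [25, 25, 45, 45]
```

Assigning `group_sums` through `df.loc[last_in_group, 'sum_values_A']` aligns on the index, so only the masked rows receive a value and the remaining rows of the new column stay NaN.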

Alternatively, compute one sum per group and map it onto the masked rows:

m = ~df['A'].duplicated(keep='last')

df.loc[m, 'sum_values_A'] = df.loc[m, 'A'].map(df.groupby('A')['values'].sum())

Output:

   A  values  sum_values_A
0  1      10           NaN
1  1      15          25.0
2  2      20           NaN
3  2      25          45.0
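
The same result can probably also be written without `.loc` assignment, by using `Series.where` to blank out every row except the last one in each group. This is a sketch rather than part of the original answer: the helper name `add_group_sum_at_bottom` and its parameters are hypothetical, and it assumes the rows of each group are already adjacent.

```python
import numpy as np
import pandas as pd


def add_group_sum_at_bottom(df, key, value, out):
    """Return a copy of df with the per-group sum of `value` on each group's last row."""
    result = df.copy()
    # Per-group sum, repeated on every row of the group.
    sums = result.groupby(key)[value].transform('sum')
    # True only on the last row of each group.
    is_last = ~result[key].duplicated(keep='last')
    # where() keeps values where the mask is True and puts NaN elsewhere.
    result[out] = sums.where(is_last)
    return result


df = pd.DataFrame({'A': [1, 1, 2, 2], 'values': np.arange(10, 30, 5)})
out = add_group_sum_at_bottom(df, 'A', 'values', 'sum_values_A')
print(out)
```

Because the new column has to hold NaN, pandas stores it as float, which is why the sums print as 25.0 and 45.0 rather than 25 and 45.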