Calculated mean column for a group transposed in a dataframe

I’m having an issue with my final analysis column. I’m looking to get the mean of each row in the below output.

    ValueSource StackedValues   Count   Sum_Weight  Group_Count Mean
0      AgeBand      4.0          402    6152.237828    2418      NaN
2      AgeBand      2.0          402    5250.436317    2053      NaN
7      AgeBand      3.0          402    4344.387011    1667      NaN
11     AgeBand      5.0          402    7296.371395    2911      NaN
19     AgeBand      1.0          402    3260.035257    1254      NaN
20     AgeBand      6.0          402    8501.978737    3341      NaN
59     AgeBand      8.0          402    15487.932515   6210      NaN
92     AgeBand      7.0          402    12054.620941   4846      NaN

So for index row 0, the mean would be Sum_Weight / SUM(Sum_Weight), with the sum taken within each ValueSource group.

I tried the following: Data['Mean'] = Data.groupby("ValueSource")['Sum_Weight'].mean(), but as you can see, the Mean column came out as all NaN.

The end result would be a Mean column that has a value for each row, per ValueSource and StackedValues.

Any help would be much appreciated.

>Solution :

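You could do that with groupby and apply, dividing each Sum_Weight by the total for its ValueSource group. Passing group_keys=False keeps the original row index on the result, so the assignment lines up in recent pandas versions as well:

Data['Mean'] = Data.groupby("ValueSource", group_keys=False)['Sum_Weight'].apply(lambda x: x / x.sum())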
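As an alternative sketch of the same idea, transform("sum") returns the group total repeated on every row, aligned to the frame's own index, so a plain division fills the column. This also shows why the attempt in the question produced NaN: groupby(...).mean() returns one value per ValueSource, indexed by the group label ("AgeBand"), which matches none of the integer row labels on assignment. The frame below is rebuilt by hand from the rows shown in the question, and the group_total variable name is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question, keeping the original index labels.
Data = pd.DataFrame(
    {
        "ValueSource": ["AgeBand"] * 8,
        "StackedValues": [4.0, 2.0, 3.0, 5.0, 1.0, 6.0, 8.0, 7.0],
        "Count": [402] * 8,
        "Sum_Weight": [
            6152.237828, 5250.436317, 4344.387011, 7296.371395,
            3260.035257, 8501.978737, 15487.932515, 12054.620941,
        ],
        "Group_Count": [2418, 2053, 1667, 2911, 1254, 3341, 6210, 4846],
    },
    index=[0, 2, 7, 11, 19, 20, 59, 92],
)

# transform("sum") yields one total per row, indexed like Data itself,
# so the division below aligns row by row.
group_total = Data.groupby("ValueSource")["Sum_Weight"].transform("sum")
Data["Mean"] = Data["Sum_Weight"] / group_total

print(Data[["ValueSource", "StackedValues", "Sum_Weight", "Mean"]])
```

Within each ValueSource the resulting values add up to 1, since each one is that row's share of the group's total Sum_Weight.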