I’m having an issue with my final analysis column. I’m looking to get the mean for each row in the output below.
    ValueSource  StackedValues  Count    Sum_Weight  Group_Count  Mean
 0      AgeBand            4.0    402   6152.237828         2418   NaN
 2      AgeBand            2.0    402   5250.436317         2053   NaN
 7      AgeBand            3.0    402   4344.387011         1667   NaN
11      AgeBand            5.0    402   7296.371395         2911   NaN
19      AgeBand            1.0    402   3260.035257         1254   NaN
20      AgeBand            6.0    402   8501.978737         3341   NaN
59      AgeBand            8.0    402  15487.932515         6210   NaN
92      AgeBand            7.0    402  12054.620941         4846   NaN
So for index row 0, the mean would be Sum_Weight / SUM(Sum_Weight), where the sum is taken over all rows in the same ValueSource group.
I tried the following: Data['Mean'] = Data.groupby("ValueSource")['Sum_Weight'].mean(), but as you can see, it didn’t quite work.
The end result would be a Mean column that has a value for each row, per ValueSource and StackedValues.
Any help would be much appreciated.
>Solution :
You could do that with groupby and apply, like this:
Data['Mean'] = Data.groupby("ValueSource")['Sum_Weight'].apply(lambda x: x / x.sum())
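As for why the original attempt gave NaN: groupby(...).mean() returns one value per group, indexed by ValueSource ("AgeBand"), so it cannot line up with the integer row index when assigned back. An alternative that avoids apply is transform("sum"), which returns the group total repeated on every row with the original index kept. Below is a minimal sketch, assuming the data looks like the rows in the question (the frame is rebuilt by hand here and group_total is just an illustrative name). As far as I know, on pandas 2.0 and later the apply version can come back indexed by the group key as well as the row index unless group_keys=False is passed to groupby, so the transform form may be the safer choice there.

```python
import pandas as pd

# Rows copied by hand from the question's output (assumption: this matches the real frame).
Data = pd.DataFrame(
    {
        "ValueSource": ["AgeBand"] * 8,
        "StackedValues": [4.0, 2.0, 3.0, 5.0, 1.0, 6.0, 8.0, 7.0],
        "Count": [402] * 8,
        "Sum_Weight": [
            6152.237828, 5250.436317, 4344.387011, 7296.371395,
            3260.035257, 8501.978737, 15487.932515, 12054.620941,
        ],
        "Group_Count": [2418, 2053, 1667, 2911, 1254, 3341, 6210, 4846],
    },
    index=[0, 2, 7, 11, 19, 20, 59, 92],
)

# transform("sum") broadcasts each group's total back onto its rows,
# keeping the original index so the division aligns row by row.
group_total = Data.groupby("ValueSource")["Sum_Weight"].transform("sum")
Data["Mean"] = Data["Sum_Weight"] / group_total

print(Data[["ValueSource", "StackedValues", "Sum_Weight", "Mean"]])
```

Within each ValueSource the resulting Mean values add up to 1, which is a quick way to check the result.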