Advertisements
What I’m looking to do is to group by ColA, and compute the ratio between ColB value and the median value in ColB for that group. This ratio will be a new column.
Original DF
ColA ColB
grp1 2
grp1 10
grp1 8
grp1 22
grp2 6
grp2 14
grp2 12
grp2 32
grp3 10
grp3 20
grp3 30
grp3 6
The median for grp1 is 9.0, grp2 is 13.0, grp3 is 15. Thus the new column (ColC) is just the ratio between ColB and the groups median median value. So:
Desired DF
ColA ColB ColC
grp1 2 0.22
grp1 10 1.11
grp1 8 0.89
grp1 22 2.44
grp2 6 0.46
grp2 14 1.07
grp2 12 0.92
grp2 32 2.46
grp3 10 0.66
grp3 20 1.33
grp3 30 2.00
grp3 6 0.40
any suggestions is appreciated!
>Solution :
Try with groupby
:
data["ColC"] = data["ColB"].div(data.groupby("ColA")["ColB"].transform("median"))
>>> data
ColA ColB ColC
0 grp1 2 0.222222
1 grp1 10 1.111111
2 grp1 8 0.888889
3 grp1 22 2.444444
4 grp2 6 0.461538
5 grp2 14 1.076923
6 grp2 12 0.923077
7 grp2 32 2.461538
8 grp3 10 0.666667
9 grp3 20 1.333333
10 grp3 30 2.000000
11 grp3 6 0.400000