Group Dataframe and compute ratio between cell value and median calcluation

Advertisements

What I’m looking to do is to group by ColA, and compute the ratio between ColB value and the median value in ColB for that group. This ratio will be a new column.

Original DF

ColA     ColB
grp1      2
grp1      10
grp1      8
grp1      22
grp2      6
grp2      14
grp2      12
grp2      32
grp3      10
grp3      20
grp3      30
grp3      6

The median for grp1 is 9.0, grp2 is 13.0, grp3 is 15. Thus the new column (ColC) is just the ratio between ColB and the groups median median value. So:

Desired DF

ColA     ColB     ColC
grp1      2       0.22
grp1      10      1.11
grp1      8       0.89
grp1      22      2.44
grp2      6       0.46
grp2      14      1.07
grp2      12      0.92
grp2      32      2.46
grp3      10      0.66
grp3      20      1.33
grp3      30      2.00
grp3      6       0.40

any suggestions is appreciated!

>Solution :

Try with groupby:

data["ColC"] = data["ColB"].div(data.groupby("ColA")["ColB"].transform("median"))

>>> data
    ColA  ColB      ColC
0   grp1     2  0.222222
1   grp1    10  1.111111
2   grp1     8  0.888889
3   grp1    22  2.444444
4   grp2     6  0.461538
5   grp2    14  1.076923
6   grp2    12  0.923077
7   grp2    32  2.461538
8   grp3    10  0.666667
9   grp3    20  1.333333
10  grp3    30  2.000000
11  grp3     6  0.400000

Leave a ReplyCancel reply