Computed column with group by in a dataframe

I have a dataframe:

Page_ID   Volume   Conversion   KPI     OSBR
A          100       10         0.7    (10,12)
A          150       11         0.2    (10,12)
B          100       11         0.4    (11,16)

I would like to group the rows by Page_ID and OSBR, summing Volume and Conversion within each group. The KPI should equal the sum of Conversion divided by the sum of Volume.

The expected result should be :

Page_ID   Volume   Conversion   KPI               OSBR
A          250       21         0.084(21/250)     (10,12)
B          100       11         0.110(11/100)     (11,16)

I tried this code:

subdata1=df.groupby(["PageId", "OSBrowser"]).sum().reset_index()

But the result for KPI is incorrect, because it sums the per-row KPI values instead of recomputing the ratio from the summed columns.

Any idea how to solve this? Thanks.

>Solution :

If I understand correctly:

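# Sum Volume and Conversion within each (Page_ID, OSBR) group
x = df.groupby(['Page_ID', 'OSBR']).agg({'Volume': 'sum', 'Conversion': 'sum'})
# Recompute KPI from the summed columns rather than summing the per-row KPI values
x['KPI'] = x['Conversion'] / x['Volume']
x = x.reset_index()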

Output:

>>> x
  Page_ID     OSBR  Volume  Conversion    KPI
0       A  (10,12)     250          21  0.084
1       B  (11,16)     100          11  0.110
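
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and assumes OSBR is stored as a plain string label; the real column may hold tuples or something else. It uses named aggregation with `assign` as a one-expression alternative to the three steps above, which needs pandas 0.25 or later.

```python
import pandas as pd

# Sample data copied from the question.
# Assumption: OSBR is stored as a string label such as "(10,12)".
df = pd.DataFrame({
    "Page_ID": ["A", "A", "B"],
    "Volume": [100, 150, 100],
    "Conversion": [10, 11, 11],
    "KPI": [0.7, 0.2, 0.4],
    "OSBR": ["(10,12)", "(10,12)", "(11,16)"],
})

result = (
    # as_index=False keeps Page_ID and OSBR as ordinary columns
    df.groupby(["Page_ID", "OSBR"], as_index=False)
      # Only the additive columns are summed; the original KPI column is dropped
      .agg(Volume=("Volume", "sum"), Conversion=("Conversion", "sum"))
      # KPI is a ratio, so it is recomputed from the group totals
      .assign(KPI=lambda g: g["Conversion"] / g["Volume"])
)

print(result)
```

The point in both versions is that a ratio such as KPI cannot be aggregated directly: summing 0.7 and 0.2 for page A gives 0.9, whereas the ratio of the totals is 21/250 = 0.084.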