Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Calculate % in Pandas pivot_table?

I have a table with data for who viewed a page and who clicked on it. The following code gets me the pivot table below:

# users who clicked, by hour and weekday
pct = df.pivot_table(columns=['weekday'],index=['hour'], values=['users_who_clicked','users_who_viewed'], aggfunc= sum, fill_value=0, margins=True)
pct

pivot table

I’ve been googling and searching here for a solid hour but I can’t seem to find how to just display the rate: i.e. ‘users_who_clicked’ divided by ‘users_who_viewed’. I don’t need to see the other two series.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I feel like if I just create a new column across the whole data set dividing the two it won’t aggregate correctly, so the aggregation needs to happen within the table itself. It’s easy in Excel, just use a calculated field, so I imagine it must be fairly straightforward in pandas as well. Any thoughts?

Thanks!

p.s. Tried using a lambda function but got a "TypeError: unsupported operand type(s) for +: ‘int’ and ‘str’" error. Tried putting the two tables directly in aggfunc and no luck.

>Solution :

Use DataFrame.xs for select by MultiIndex, rename first level and divide, last append to original DataFrame:

d = {('users_who_clicked','a'): [5, 6, 7, 8], 
     ('users_who_clicked','b'): [9, 10, 11, 12], 
     ('users_who_viewed','a'): [4, 1, 3, 7], 
     ('users_who_viewed','b'): [1, 3, 7, 3]}

df = pd.DataFrame(data=d)
  
df1 = (df.xs('users_who_clicked', axis=1, level=0, drop_level=False)
         .rename(columns={'users_who_clicked':'%'}, level=0))
df2 = (df.xs('users_who_viewed', axis=1, level=0, drop_level=False)
         .rename(columns={'users_who_viewed':'%'}, level=0))

out = pd.concat([df, df1.div(df2)], axis=1)  
print (out)
  users_who_clicked     users_who_viewed            %          
                  a   b                a  b         a         b
0                 5   9                4  1  1.250000  9.000000
1                 6  10                1  3  6.000000  3.333333
2                 7  11                3  7  2.333333  1.571429
3                 8  12                7  3  1.142857  4.000000
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading