Succinct / pythonic method of groupby percentage calculation

November 8, 2022

I want to group a DataFrame and calculate, within each group, the percentage of rows whose value is over a certain threshold. I currently do this with two groupbys into temporary dataframes (the count per group over the threshold, and the total count per group), then merge the two before calculating the percentage. I assume there is a more succinct method that doesn’t require creating temporary dataframes.

e.g.

import pandas as pd

df = pd.DataFrame({'n': ['a', 'b', 'c', 'a', 'c', 'a', 'a', 'c', 'a'],
                   'm': ['j', 'j', 'j', 'j', 'j', 'j', 'f', 'f', 'f'],
                   'v': [1, 1, 1, 2, 1, 2, 3, 6, 5]})

Group by ‘n’ and ‘m’ and count the rows in each group:

dfcount = df.groupby(['n', 'm'])['v'].agg('count').reset_index().rename(columns={'v': 'count'})

Filter values over 1, then group by ‘n’ and ‘m’ and count:

dfover1 = df[df['v'] > 1].groupby(['n', 'm'])['v'].count().reset_index().rename(columns={'v': 'latecount'})

Merge the two new dataframes on the groupby columns ‘n’ and ‘m’, then calculate the percentage of values over 1 for each group:

dfnew = dfcount.merge(dfover1, on=['n', 'm'])
dfnew['late%'] = (100 * dfnew['latecount'] / dfnew['count']).round(1)

   n  m  count  latecount  late%
0  a  f      2          2  100.0
1  a  j      3          2   66.7
2  c  f      1          1  100.0
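One detail worth noting about the merge approach: the default inner merge keeps only groups that have at least one value over 1, so the groups (b, j) and (c, j) do not appear in the result. If those groups should be reported as 0%, a possible sketch (assuming that is the desired behaviour; the name `dfall` is introduced here only for illustration) is a left merge with the missing counts filled in:

```python
import pandas as pd

df = pd.DataFrame({'n': ['a', 'b', 'c', 'a', 'c', 'a', 'a', 'c', 'a'],
                   'm': ['j', 'j', 'j', 'j', 'j', 'j', 'f', 'f', 'f'],
                   'v': [1, 1, 1, 2, 1, 2, 3, 6, 5]})

# total rows per group, and rows over 1 per group
dfcount = df.groupby(['n', 'm'])['v'].count().reset_index(name='count')
dfover1 = df[df['v'] > 1].groupby(['n', 'm'])['v'].count().reset_index(name='latecount')

# a left merge keeps every group; groups with no value over 1 get NaN, filled with 0
dfall = dfcount.merge(dfover1, on=['n', 'm'], how='left')
dfall['latecount'] = dfall['latecount'].fillna(0).astype(int)
dfall['late%'] = (100 * dfall['latecount'] / dfall['count']).round(1)
```

This still uses two temporary dataframes, so it only addresses the dropped groups, not the verbosity.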

Solution:

You can use method chaining (pipeline) to avoid using named intermediates:

out = (df
   .groupby(['n', 'm'], as_index=False)
   # compute count and latecount
   .agg(**{'count': ('v', 'count'),
           'latecount': ('v', lambda v: len(v[v>1]))
          })
   # filter rows with latecount > 0
   .query('latecount > 0')
   # compute late%
   .assign(**{'late%': lambda d: d['latecount'].div(d['count']).mul(100).round(2)})
)

output:

   n  m  count  latecount   late%
0  a  f      2          2  100.00
1  a  j      3          2   66.67
3  c  f      1          1  100.00
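A possible variant of the same pipeline, offered as a sketch rather than as part of the answer above: the mean of a boolean mask is the fraction of True values, so the percentage can come straight out of the aggregation instead of being derived from two counts. The helper column `late` and the name `out2` are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'n': ['a', 'b', 'c', 'a', 'c', 'a', 'a', 'c', 'a'],
                   'm': ['j', 'j', 'j', 'j', 'j', 'j', 'f', 'f', 'f'],
                   'v': [1, 1, 1, 2, 1, 2, 3, 6, 5]})

out2 = (df
   # boolean helper column: True where v is over 1
   .assign(late=df['v'].gt(1))
   .groupby(['n', 'm'], as_index=False)
   # count rows, sum the True values, and take their mean (a fraction)
   .agg(**{'count': ('late', 'count'),
           'latecount': ('late', 'sum'),
           'late%': ('late', 'mean')})
   # turn the fraction into a rounded percentage
   .assign(**{'late%': lambda d: d['late%'].mul(100).round(2)})
   # keep only groups with at least one value over 1
   .query('latecount > 0')
)
```

Dropping the final `.query(...)` step would keep the groups with no value over 1 and report them as 0.0.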

group-by

byMR

Published November 08, 2022
