value_counts not working in groupby apply

I am using .apply(pd.Series.value_counts, axis=0) to count the values in two pandas columns, ['a', 'b'].

However, when I try to use it after grouping on the column 'Group', I get the error:

TypeError: value_counts() got an unexpected keyword argument 'axis'

It works when grouping in a for loop, but not with a groupby apply.

Here is the code, with a working groupby in a for loop and a non-working groupby.apply:

import pandas as pd

# Example dataframe
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 3, 3, 4, 5, 1, 1, 1, 4, 4, 4, 5, 6, 6, 6, 6, 3],
        "b": [3, 4, 5, 5, 5, 2, 1, 3, 4, 4, 4, 5, 6, 6, 4, 3, 6, 6, 3],
        "Group": ['g1', 'g1', 'g1', 'g2', 'g2', 'g1', 'g2', 'g1', 'g1', 'g2', 'g2', 'g2', 'g2', 'g2', 'g1', 'g1', 'g2', 'g2', 'g2'],
    }
)


# Grouping and applying with a for loop: this works.
lst = []
for key, grp in df.groupby('Group'):
    # DataFrame.apply takes axis itself and passes each column to value_counts
    df_ = grp[['a', 'b']].apply(pd.Series.value_counts, axis=0)
    df_['Group'] = key
    lst.append(df_)
print('this works', pd.concat(lst), sep='\n')

# With groupby.apply it doesn't work: this line raises the TypeError.
df.groupby('Group')[['a', 'b']].apply(pd.Series.value_counts, axis=0)

Output: the expected result from the for loop, followed by the traceback from groupby.apply

     a    b Group
1  4.0  NaN    g1
2  1.0  1.0    g1
3  NaN  3.0    g1
4  1.0  3.0    g1
5  NaN  1.0    g1
6  2.0  NaN    g1
1  1.0  1.0    g2
3  3.0  1.0    g2
4  3.0  2.0    g2
5  2.0  3.0    g2
6  2.0  4.0    g2

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs)
   1274             try:
-> 1275                 result = self._python_apply_general(f, self._selected_obj)
   1276             except TypeError:

11 frames
TypeError: value_counts() got an unexpected keyword argument 'axis'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/groupby/groupby.py in f(g)
   1257                 def f(g):
   1258                     with np.errstate(all="ignore"):
-> 1259                         return func(g, *args, **kwargs)
   1260 
   1261             elif hasattr(nanops, "nan" + func):

TypeError: value_counts() got an unexpected keyword argument 'axis'


Solution:

Here's what happened.

In the for loop you call apply on a DataFrame (grp[['a','b']]). DataFrame.apply has its own axis parameter, so axis=0 is consumed by apply itself, and pd.Series.value_counts receives each column as a Series with no extra arguments.

In the second case you call a different apply, the one on a DataFrameGroupBy instance, and it has a different signature. It takes the function as its first parameter and forwards every other positional and keyword argument to that function. So axis=0 goes to pd.Series.value_counts, and because Series.value_counts has no axis parameter in its signature, you get the TypeError.
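The difference can be observed directly. The following is a minimal sketch, not part of the original question: probe is a hypothetical helper that records what each apply hands to it, and the small frame is made-up data in the same shape as the question's.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 1, 2, 3], "b": [3, 4, 4, 3], "Group": ["g1", "g1", "g2", "g2"]}
)

calls = []  # (type name of the object received, keyword arguments received)


def probe(obj, **kwargs):
    # Hypothetical helper: log what apply passes to the function.
    calls.append((type(obj).__name__, dict(kwargs)))
    return len(obj)


# DataFrame.apply consumes axis itself and passes each column as a Series.
df[["a", "b"]].apply(probe, axis=0)
frame_calls = list(calls)

calls.clear()

# GroupBy.apply has no axis parameter, so axis=0 is forwarded to probe.
df.groupby("Group")[["a", "b"]].apply(probe, axis=0)
groupby_calls = list(calls)

print(frame_calls)
print(groupby_calls)
```

In the first list every entry should be a Series with empty keyword arguments; in the second, every entry should be a DataFrame together with {'axis': 0}, which is exactly the argument that Series.value_counts rejects.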

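To fix it, wrap the call in a lambda so that each group's DataFrame calls DataFrame.apply, which does accept axis:

(
    df
    .groupby('Group')[['a','b']]
    .apply(lambda x: x.apply(pd.Series.value_counts, axis=0))
    .fillna(0)
    .astype(int)
)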
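If the nested apply feels awkward, the same counts can probably be obtained without it. The sketch below swaps in a different technique from the one above: it reshapes the frame to long form with melt and counts with the groupby's own value_counts. The names "column", "value" and alt are illustrative choices, not anything from the question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 1, 2, 3, 3, 4, 5, 1, 1, 1, 4, 4, 4, 5, 6, 6, 6, 6, 3],
        "b": [3, 4, 5, 5, 5, 2, 1, 3, 4, 4, 4, 5, 6, 6, 4, 3, 6, 6, 3],
        "Group": ['g1', 'g1', 'g1', 'g2', 'g2', 'g1', 'g2', 'g1', 'g1', 'g2',
                  'g2', 'g2', 'g2', 'g2', 'g1', 'g1', 'g2', 'g2', 'g2'],
    }
)

alt = (
    # Long form: one row per (Group, original column name, value)
    df.melt(id_vars="Group", var_name="column", value_name="value")
    # Count each value within every (Group, column) pair
    .groupby(["Group", "column"])["value"]
    .value_counts()
    # Move the original column names back to the columns; absent counts become 0
    .unstack("column", fill_value=0)
)
print(alt)
```

The result is indexed by (Group, value) with columns a and b, so for example alt.loc[("g1", 1), "a"] gives the count of 1s in column a within group g1.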

P.S.
See also GroupBy.apply vs DataFrame.apply
