I have this function that I’ve created which does the following:
def groupby2(df, col1, col2):
    return df.groupby([col1])[col2].count().reset_index().sort_values(by=[col2], ascending=False)
It works great, but I want it to work with multiple columns. For example, if I run this:
groupby2(df, 'Age', 'Weight')
^ This runs fine. But this,
groupby2(df, ['Age','Gender'], 'Weight')
returns an error saying:
ValueError: not enough values to unpack (expected 2, got 0)
I'm stuck and not sure how to change the code so that it accepts multiple columns.
>Solution :
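The problem is that the function always wraps col1 in a list, so passing ['Age','Gender'] ends up as the nested list [['Age','Gender']] inside groupby. You can do a quick isinstance check and only coerce the input when it is a single column, like so: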
import pandas as pd

def groupby2(df, col1, col2):
    # Wrap a single grouping key in a list; a list of keys is left as-is.
    if isinstance(col1, (str, pd.Series, tuple)):
        col1 = [col1]
    return (
        df.groupby(col1)[col2].count()
        .reset_index()
        .sort_values(by=[col2], ascending=False)
    )
I am essentially checking whether a single column is being passed as col1 and, if so, coercing it to become a list of columns. Otherwise, I assume a list was passed in and leave the input untouched.
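To see both call styles side by side, here is a minimal sketch that runs the function on a small DataFrame. The sample values are made up for illustration and are not from the question; only the column names Age, Gender and Weight come from it.

```python
import pandas as pd


def groupby2(df, col1, col2):
    # Wrap a single grouping key in a list; a list of keys is left as-is.
    if isinstance(col1, (str, pd.Series, tuple)):
        col1 = [col1]
    return (
        df.groupby(col1)[col2].count()
        .reset_index()
        .sort_values(by=[col2], ascending=False)
    )


# Hypothetical sample data, invented for this example only.
df = pd.DataFrame({
    "Age": [25, 25, 30, 30, 30],
    "Gender": ["F", "M", "F", "F", "M"],
    "Weight": [60, 80, 55, 65, 90],
})

# Single column name: grouped by Age only.
single = groupby2(df, "Age", "Weight")
print(single)

# List of column names: grouped by every (Age, Gender) combination.
multi = groupby2(df, ["Age", "Gender"], "Weight")
print(multi)
```

With this data, the single-column call should list Age 30 first with a count of 3, and the two-column call should list the (30, F) group first with a count of 2. Note that the Weight column in the result holds the count of rows per group, not the weights themselves.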