Modify function to take list of columns instead of just one

March 21, 2022

I have this function that I’ve created which does the following:

def groupby2(df, col1, col2):
     
    return(df.groupby([col1])[col2].count().reset_index().sort_values(by=[col2], ascending=False))

It works great, but I want it to work with multiple columns. For example, if I run this:

groupby2(df, 'Age', 'Weight')

^ This runs fine. But this,

groupby2(df, ['Age','Gender'], 'Weight')

returns an error saying:

ValueError: not enough values to unpack (expected 2, got 0)

I'm stuck and not sure how to change the code to accept multiple columns.

>Solution :

You can do a quick isinstance check and coerce the input like so:

import pandas as pd

def groupby2(df, col1, col2):
    if isinstance(col1, (str, pd.Series, tuple)):
        col1 = [col1]

    return (
        df.groupby(col1)[col2].count()
        .reset_index()
        .sort_values(by=[col2], ascending=False)
    )

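I am essentially checking whether a single column was passed as col1 and, if so, coercing it into a one-element list of columns. Otherwise, I assume a list was passed in and leave the input untouched. The original version always wrapped col1 in another list via groupby([col1]), so passing a list produced a nested list that groupby could not handle.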
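As a quick check, here is a minimal sketch that runs the function on both kinds of input. The sample DataFrame is made up for illustration (the question does not show the real data); only the column names Age, Gender, and Weight come from the question.

```python
import pandas as pd

def groupby2(df, col1, col2):
    # Wrap a single grouping key in a list; a list of keys is left as-is.
    if isinstance(col1, (str, pd.Series, tuple)):
        col1 = [col1]

    return (
        df.groupby(col1)[col2].count()
        .reset_index()
        .sort_values(by=[col2], ascending=False)
    )

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "Age": [25, 25, 30, 30, 30],
    "Gender": ["F", "M", "F", "F", "M"],
    "Weight": [60, 80, 65, 70, 85],
})

# Single column: one row per Age, largest count first.
single = groupby2(df, "Age", "Weight")
print(single)

# List of columns: one row per (Age, Gender) pair.
multi = groupby2(df, ["Age", "Gender"], "Weight")
print(multi)
```

Note that after the count, the Weight column holds the number of non-null Weight values in each group rather than the weights themselves.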