Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

groupby and aggregate on the resulting groups

I have a dataframe with a column Status which values are either ‘OPEN’ or ‘CLOSED’. I would like to do a groupby() on multiple columns and use the following rule for that Status column: if there is one or more ‘OPEN’ values in the group then the aggregate should return ‘OPEN’ else ‘CLOSED’

I tried the following:

df_agg = df.groupby(['col1', 'col2', 'col3'], as_index=False)
           .agg({'col4': 'sum', 'Status': lambda x: np.where(x == 'OPEN', 'OPEN', 'CLOSED')}).reset_index(drop=True)

but this returns lists like [‘OPEN’, ‘CLOSED’], [‘OPEN’, ‘CLOSED’, ‘CLOSED’] and so on. Is there a better way using the agg() function to return a single value of ‘OPEN’ or ‘CLOSED’ rather than doing the groupby() and then doing again something like below?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

np.where(df_agg['Status'].str.contains('OPEN'), 'OPEN', 'CLOSED')

>Solution :

Yes, but not so nice – select only slice per group and test if at least one match in lambda function:

lambda x: 'OPEN' if df.loc[x.index, 'Status'].str.contains('OPEN').any() else 'CLOSED'

Better is test before groupby and because OPEN > CLOSED alpha numerically aggregate max:

df_agg = (df.assign(Status = np.where(df['Status'].str.contains('OPEN'), 
                                      'OPEN', 'CLOSED'))
           .groupby(['col1', 'col2', 'col3'], as_index=False)
           .agg({'col4': 'sum', 'Status': 'max'}))

Or:

df_agg = (df.assign(Status = df['Status'].str.contains('OPEN'))
           .groupby(['col1', 'col2', 'col3'], as_index=False)
           .agg({'col4': 'sum', 'Status': 'any'})
           .assign(Status = lambda x: x['Status'].map({True:'OPEN',False:'CLOSED'}))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading