How do you sum a dataframe based off a grouping in Python pandas?

I have a for loop with the intent of checking for values greater than zero.

The problem is that I only want each iteration to check the sum for one group of IDs.

The grouping would be a match of the first 8 characters of the ID string.

I have that grouping taking place before the loop but the loop still appears to search the entire df instead of each group.

LeftGroup = newDF.groupby('ID_Left_8')
for g in LeftGroup.groups:
    if sum(newDF['Hours_Calc'] > 0):
        print(g)

Is there a way to filter that sum to each grouping of leftmost 8 characters?

I was expecting the .groups function to accomplish this, but it still seems to search every single ID.

Thank you.

Solution:

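def filter_and_sum(group):
    # Keep only this group's rows with positive hours, then total them.
    return group.loc[group['Hours_Calc'] > 0, 'Hours_Calc'].sum()

# Split the frame by the leftmost 8 characters of the ID.
LeftGroup = newDF.groupby('ID_Left_8')
# Run the helper once per group; the result is indexed by ID_Left_8.
results = LeftGroup.apply(filter_and_sum)
print(results)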

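This computes, for each group, the sum of the Hours_Calc values that are greater than zero. The result is a Series indexed by the ID_Left_8 values (the leftmost 8 characters of each ID), with the per-group sum as the value. The original loop iterates over the group keys but still indexes newDF inside the loop body, so every iteration tests the entire DataFrame rather than one group.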
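For reference, here is a minimal self-contained sketch of the same idea. The sample IDs and hours are made up for illustration, and deriving ID_Left_8 with `.str[:8]` is an assumption about how that column was built. The loop treats the question as "does this group have any positive hours?", which is only one reading of it. Iterating over the GroupBy object itself (rather than `.groups`) yields each key together with its sub-frame, and filtering before grouping avoids `apply` altogether.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question,
# the values are invented for illustration.
newDF = pd.DataFrame({
    "ID": ["ABCDEFGH01", "ABCDEFGH02", "ZYXWVUTS01", "ZYXWVUTS02", "QRSTUVWX01"],
    "Hours_Calc": [5.0, -2.0, 0.0, -1.0, 3.5],
})

# Assumed derivation of the group key: leftmost 8 characters of each ID.
newDF["ID_Left_8"] = newDF["ID"].str[:8]

# Iterating over the GroupBy object yields (key, sub-frame) pairs,
# so each check only sees the rows belonging to one group.
groups_with_positive_hours = []
for key, group in newDF.groupby("ID_Left_8"):
    if (group["Hours_Calc"] > 0).any():
        groups_with_positive_hours.append(key)

# Vectorized alternative: drop non-positive rows first, then sum per group.
positive_sums = (
    newDF.loc[newDF["Hours_Calc"] > 0]
    .groupby("ID_Left_8")["Hours_Calc"]
    .sum()
)

print(groups_with_positive_hours)
print(positive_sums)
```

One difference to keep in mind: because the vectorized version filters before grouping, a group with no positive hours is absent from `positive_sums`, whereas the `apply` version above reports it with a sum of 0.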
