Aggregating Pandas DF – Losing Data

I’m trying to aggregate a pandas DataFrame the way an Excel pivot table would. I have one quantitative variable called "Count". I would like rows with identical values in the qualitative variables to be combined and their "Count" values summed.

However, when I do this with the code below, I somehow lose data. Any idea why this might be happening and how I can fix it?

I expect the number of rows to decrease but the total sum of the "Count" column shouldn’t change.



>Solution :

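Since you have NaNs in the columns you group by, those rows are excluded from the groupby operation by default (dropna=True), so their "Count" values are never summed. To keep them, pass dropna=False to groupby (available in pandas 1.1 and later) or fill the NaNs with a placeholder value before grouping.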
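
The original code and data are not shown, so here is a minimal sketch of the effect on made-up data. The column names "Region" and "Product" and all values are hypothetical stand-ins for your qualitative variables; only "Count" comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "Region" and "Product" stand in for the qualitative columns.
df = pd.DataFrame({
    "Region": ["East", "East", np.nan, "West", np.nan],
    "Product": ["A", "A", "B", "B", "B"],
    "Count": [1, 2, 3, 4, 5],
})
keys = ["Region", "Product"]

# Default behaviour: rows with NaN in any grouping key are silently dropped.
default = df.groupby(keys, as_index=False)["Count"].sum()

# Option 1 (pandas >= 1.1): keep NaN keys as their own group.
kept = df.groupby(keys, dropna=False, as_index=False)["Count"].sum()

# Option 2: replace NaN keys with a placeholder label before grouping.
filled = (
    df.fillna({"Region": "Unknown"})
      .groupby(keys, as_index=False)["Count"]
      .sum()
)

# Compare totals: original, default groupby, dropna=False, fillna.
print(df["Count"].sum(), default["Count"].sum(),
      kept["Count"].sum(), filled["Count"].sum())
# prints: 15 7 15 15
```

Comparing the sum of "Count" before and after grouping, as in the last line, is a quick way to confirm that no rows were dropped.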
