Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pandas how to count boolean column value and the distinct count of other columns at the same time

I have a Dataframe df, you can have it by running the following code:

import pandas as pd
from io import StringIO
    
df = """  
    month       status_review             supply_review  case_id
2023-01-01           False                    False      12             
2023-01-01           True                     True       33           
2022-12-01           False                    True       45           
2022-12-01           True                     True       45         
2022-12-01           False                    False      44
    """
df= pd.read_csv(StringIO(df.strip()), sep='\s\s+', engine='python')

How can I count how many status_reviews and supply_review are True in each month and also the number of case in each month?

The output should looks like the following:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

    month       # of true status_review      # of true supply_review  # of case
2023-01-01           1                         1                        2
2022-12-01           1                         2                        2

I have tried both:

df.groupby("month").sum()
df.groupby('month').agg('sum')

But the output is:

           status_review    supply_review   case_id
month           
2022-12-01             1    2               134
2023-01-01             1    1               45

The case_id is not what I want. I want the distinct count of case_id. How can I achieve the desired output?

>Solution :

You can use .groupby() and .agg():

df.groupby("month").agg({
    "status_review": "sum",
    "supply_review": "sum",
    "case_id": pd.Series.nunique
})

This outputs:

            status_review  supply_review  case_id
month
2022-12-01              1              2        2
2023-01-01              1              1        2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading