Group by Date and Column

I have a DataFrame:

Date_time           | Available
21/10/2020 05:00:01 |   Yes
21/10/2020 12:20:01 |   No
22/10/2020 04:30:01 |   Yes
22/10/2020 03:40:01 |   Yes
22/10/2020 01:50:01 |   No
23/10/2020 02:10:01 |   Yes
23/10/2020 11:30:01 |   Yes

I want to replicate SQL's CASE statement and GROUP BY in Python.

SELECT date,
       sum(case when Available = 'Yes' then 1 else 0 end)*100/count(*) as Available_rate 
FROM table
group by date

What I am looking for is:

Date        | Available_rate 
21/10/2020  |   50
22/10/2020  |   66.667
23/10/2020  |   100

So far I can count the rows per day in Python:

daily_count = df.groupby([df['Date_time'].dt.date]).size().reset_index(name='counts')

I am not sure how to go further.

Solution:

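You can convert "Available" to boolean and use GroupBy.mean. Because True counts as 1 and False as 0, the mean of the booleans in each group is the fraction of "Yes" rows, and multiplying by 100 gives the rate. The code assumes df has a "Date" column holding only the date part of "Date_time":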

out = (df['Available']
       .eq('Yes')
       .groupby(df['Date'])
       .mean()
       .mul(100)
       )

Output:

Date
21/10/2020     50.000000
22/10/2020     66.666667
23/10/2020    100.000000
Name: Available, dtype: float64
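
The sample frame in the question only has a "Date_time" column, so the grouping key has to be derived first. The following is a minimal sketch of one way to do that, assuming the timestamps are day-first strings as shown in the question. The variable names `parsed` and `rate` are illustrative, not part of the original answer, and because the key here is a real date object the index prints as 2020-10-21 rather than 21/10/2020.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Date_time": [
        "21/10/2020 05:00:01", "21/10/2020 12:20:01",
        "22/10/2020 04:30:01", "22/10/2020 03:40:01", "22/10/2020 01:50:01",
        "23/10/2020 02:10:01", "23/10/2020 11:30:01",
    ],
    "Available": ["Yes", "No", "Yes", "Yes", "No", "Yes", "Yes"],
})

# Assumption: timestamps are day-first strings in this exact format.
parsed = pd.to_datetime(df["Date_time"], format="%d/%m/%Y %H:%M:%S")

# Keep only the calendar date; this is the grouping key.
df["Date"] = parsed.dt.date

# Mean of a boolean Series per group = share of True values; scale to percent.
rate = df["Available"].eq("Yes").groupby(df["Date"]).mean().mul(100)
print(rate)
```

Grouping on parsed dates rather than on the raw strings also keeps the result in chronological order when the data spans several months.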

Alternative format:

out = (df['Available']
       .eq('Yes')
       .groupby(df['Date'])
       .mean()
       .mul(100)
       .rename('available rate')
       .reset_index()
       )
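
For comparison, the SQL can also be mirrored term by term with a 0/1 flag column and named aggregation. This is a sketch rather than the answer's method: it assumes a "Date" column already exists, and the names `is_yes`, `yes_count` and `total` are hypothetical helpers introduced only for illustration.

```python
import pandas as pd

# Sample data from the question, with the date part already split out
# into a "Date" column (an assumption made for this sketch).
df = pd.DataFrame({
    "Date": ["21/10/2020", "21/10/2020", "22/10/2020", "22/10/2020",
             "22/10/2020", "23/10/2020", "23/10/2020"],
    "Available": ["Yes", "No", "Yes", "Yes", "No", "Yes", "Yes"],
})

out = (
    # CASE WHEN Available = 'Yes' THEN 1 ELSE 0 END
    df.assign(is_yes=df["Available"].eq("Yes").astype(int))
      .groupby("Date", as_index=False)
      # SUM(...) and COUNT(*); "count" equals the row count here
      # because is_yes never contains missing values.
      .agg(yes_count=("is_yes", "sum"), total=("is_yes", "count"))
)

# SUM(...) * 100 / COUNT(*)
out["Available_rate"] = out["yes_count"] * 100 / out["total"]
print(out[["Date", "Available_rate"]])
```

The boolean-mean version is shorter. This form is mainly useful when the intermediate counts are needed as well.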