Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Counting number of events up until a specific date in a pandas dataframe

I would like to count the number of events that occured up until a date that is specified in the row.
I want to know how many events took place per ID up until the specified date.

Original dataframe:

Date ID Event
23.01.23 1 AA
19.01.23 1 AB
23.12.22 1 AA
23.01.23 2 AA
19.01.23 2 AA
23.12.22 2 AB

Expected result:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Date ID Event Count of AA Count of AB
23.01.23 1 AA 2 1
19.01.23 1 AB 1 1
23.12.22 1 AA 1 0
23.01.23 2 AA 2 1
19.01.23 2 AA 1 1
23.12.22 2 AB 0 1

I was thinking to do a groupby, then pivot the table but the results were incorrect. Then I tried to do it SQL style and I almost have it but it’s not yet what I need.

d = {'Date': ["23.01.23", "19.01.23", "23.12.22", "23.01.23", "19.01.23", "23.12.22"],'ID': [1,1,1,2,2,2], "Event": ["AA","AB","AA","AA","AA","AB"]}

test_df = pd.DataFrame(data = d)

import duckdb

duckdb.query("SELECT Date, Id, Event, COUNT() OVER(PARTITION BY ID, event ) as 'count' FROM test_df").df()

outputs:

enter image description here

The SQL query is missing a distinct within the partition by but I didn’t manage to make it work.

Any type of solution (pandas, python, SQL) is greatly appreciated.

>Solution :

You can use crosstab+groupby.cumsum and merge:

# ensure datetime
test_df['Date'] = pd.to_datetime(test_df['Date'])

# compute crosstab, cumsum per group and merge
test_df.merge((pd.crosstab([test_df['Date'], test_df['ID']], test_df['Event'])
                 .groupby(level='ID').cumsum()
                ), left_on=['Date', 'ID'], right_index=True
             )

Output:

        Date  ID Event  AA  AB
0 2023-01-23   1    AA   2   1
1 2023-01-19   1    AB   1   1
2 2022-12-23   1    AA   1   0
3 2023-01-23   2    AA   2   1
4 2023-01-19   2    AA   1   1
5 2022-12-23   2    AB   0   1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading