Home Count occurrences in last 30 days with Pandas Dataframe

Questions

Count occurrences in last 30 days with Pandas Dataframe

September 9, 2022

I have a pandas Dataframe with an ID column and a date column (YYYY-MM-DD),

ID	Date
001	2022-01-01
001	2022-01-04
001	2022-02-07
002	2022-01-02
002	2022-01-03
002	2022-01-28

There may be gaps in the date field, as shown. I would like to have a new column, "occurrences_last_month" where it counts the number of occurrences for each ID in the last month (30 days).

The idea would be to add a new column with the following output,

ID	Date	Ocurrences_last_month
001	2022-01-01	0
001	2022-01-04	1
001	2022-02-07	0
002	2022-01-02	0
002	2022-01-03	1
002	2022-01-28	2

For example, in the case of ID 001:

January 1st: substracts 1 month to December 2nd, so 0 occurrences
January 2nd: goes from December 3rd to January 1st, so 1 occurrence
February 7th: goes from January 8th, so 0 occurrences

I tried to use the datetime.timedelta to compute a new column "date_previous_month", but I am unable to compute what I need from there, I tried using count() but I failed to obtain what I need.

>Solution :

First idea is per grouops use Rolling.count with remove first level created by ID:

df = df.set_index('Date')
df['Ocurrences_last_month'] = (df.groupby('ID')
                                 .rolling('30D')
                                 .count().sub(1).droplevel(0).astype(int))
print (df)
            ID  Ocurrences_last_month
Date                                 
2022-01-01   1                      0
2022-01-04   1                      1
2022-02-07   1                      0
2022-01-02   2                      0
2022-01-03   2                      1
2022-01-28   2                      2

EDIT: If possible duplciated values create Series and assign to original DataFrame by DataFrame.join:

s = df.groupby('ID').rolling('30D', on='Date')['Date'].count().sub(1).astype(int)

df = df.join(s.rename('Ocurrences_last_month'), on=['ID','Date'])
print (df)
   ID       Date  Ocurrences_last_month
0   1 2022-01-01                      0
1   1 2022-01-04                      1
2   1 2022-02-07                      0
3   2 2022-01-02                      0
4   2 2022-01-03                      1
5   2 2022-01-28                      2

Alternative solution from comments:

df = df.merge(s.rename('Ocurrences_last_month'), on=['ID','Date'])

date

byMR

Published September 09, 2022

Add a comment

Conditional Random Assignment

byMR

September 9, 2022

Questions

Helm upgrade and immediately show logs

byMR

September 9, 2022

Questions

How to get the index of function argument in C/C++ with or without a macro?

byMR

September 9, 2022

Questions

How to iterate dict in string.format

byMR

September 9, 2022

Questions

to remove text from a sentence in excel after numerics

byMR

September 9, 2022

Questions

Type 'string' is not assignable to type 'keyof CustomerInput'

byMR

September 9, 2022

Count occurrences in last 30 days with Pandas Dataframe

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Conditional Random Assignment

Helm upgrade and immediately show logs

How to get the index of function argument in C/C++ with or without a macro?

How to iterate dict in string.format

to remove text from a sentence in excel after numerics

Type 'string' is not assignable to type 'keyof CustomerInput'

Keep Up to Date with the Most Important News

Count occurrences in last 30 days with Pandas Dataframe

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Conditional Random Assignment

Helm upgrade and immediately show logs

How to get the index of function argument in C/C++ with or without a macro?

How to iterate dict in string.format

to remove text from a sentence in excel after numerics

Type 'string' is not assignable to type 'keyof CustomerInput'

Discover more from Dev solutions