Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

"window must be an integer 0 or greater" issue with '30D' style rolling calculations

I’ve had a look and can’t seem to find a solution to this issue. I’m wanting to calculate the rolling sum of the previous 30 days’ worth of data at each date in the dataframe – by subgroup – for a set of data that isn’t daily – it’s spaced fairly irregularly. I’ve been attempting to use ChatGPT which is getting in a twist over it.

Initially the suggestion was that I’d not converted the Date column to datetime format to allow for the rolling calculation, but now from the code below:

import pandas as pd
from datetime import datetime, timedelta
import numpy as np

# Create a dataset with irregularly spaced dates spanning two years
np.random.seed(42)
date_rng = pd.date_range(start='2022-01-01', end='2023-12-31', freq='10D')  # Every 10 days
data = {'Date': np.random.choice(date_rng, size=30),
        'Group': np.random.choice(['A', 'B'], size=30),
        'Value': np.random.randint(1, 30, size=30)}

df = pd.DataFrame(data)

# Sort DataFrame by date
df.sort_values(by='Date', inplace=True)

df['Date'] = pd.to_datetime(df['Date'])

# Calculate cumulative sum by group within the previous 30 days from each day
df['RollingSum_Last30Days'] = df.groupby('Group')['Value'].transform(lambda x: x.rolling(window='30D', min_periods=1).sum())

I’m getting an error of:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

ValueError: window must be an integer 0 or greater

I’ve found conflicting comments online as to whether the format ’30D’ works in rolling windows but I’m none the wiser as to a solution to this. Any help appreciated.

Running in VSCode in Python 3.11.8.

>Solution :

The issue if that you need to specify which column to use as Date but don’t have access to the Date with groupby.transform.

You could use groupby.apply:

# Calculate cumulative sum by group within the previous 30 days from each day
df['RollingSum_Last30Days'] = (df.groupby('Group', group_keys=False)
                                 .apply(lambda x: x.rolling(window='30D', on='Date', min_periods=1)['Value'].sum())
                              )

Output:

         Date Group  Value  RollingSum_Last30Days
9  2022-01-11     A     22                   22.0
12 2022-01-11     A     22                   44.0
6  2022-01-21     A      4                   48.0
1  2022-05-21     B     14                   14.0
23 2022-05-21     A      8                    8.0
15 2022-07-20     B     26                   26.0
4  2022-07-20     A     18                   18.0
18 2022-07-30     B     10                   36.0
7  2022-07-30     A      2                   20.0
5  2022-08-19     A      8                   10.0
10 2022-10-18     B     10                   10.0
16 2022-11-17     B     12                   12.0
11 2023-01-06     B      4                    4.0
21 2023-02-15     B     16                   16.0
26 2023-04-06     B     28                   28.0
19 2023-04-26     A      4                    4.0
28 2023-05-16     B      8                    8.0
0  2023-05-26     B      3                   11.0
8  2023-06-05     A      6                    6.0
29 2023-06-25     A     21                   27.0
17 2023-07-25     A      2                    2.0
20 2023-08-04     B     14                   14.0
22 2023-08-14     B     15                   29.0
14 2023-08-14     B     18                   47.0
3  2023-08-24     A      4                    4.0
24 2023-09-03     B     14                   47.0
25 2023-09-03     A     23                   27.0
27 2023-09-03     A     25                   52.0
13 2023-09-23     B     29                   43.0
2  2023-12-12     A     17                   17.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading