Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Cumulative product for each element over x days in pandas DataFrame

I try to calculate for each element in a Dataframe df1 the cumulative product over x days (for example 3). Is there an efficient way to do that? So far, I only know to generate a normal cumulative product over the rows, without fixing the lookback period to x days (=df2). For example:

  • 2022-01-02 ID1: 1.0528 = (ignoring NaN) 0.94 * 1.12
  • 2022-01-05 ID1: 1.2002 = 0.94 * 1.12 * 1.14
  • 2022-01-09 ID1: 1.4045 = 1.12 * 1.14 * 1.10

Optimally, it would also ignore NaN values and calculate the cumulative return over the remaining numbers. So far I only know how to compute that by creating a copy and shifting it, but I would like to do calculate the cumulative product for many columns over 50 rows, which would not be efficient.

df1:
            ID1     ID2
Date        
2022-01-02  NaN     0.95
2022-01-05  0.94    0.98
2022-01-09  1.12    NaN
2022-01-10  1.14    1.02
2022-01-11  1.10    1.00
2022-01-12  0.92    0.82

df2:
            ID1     ID2
Date        
2022-01-02  1.0528  0.9309
2022-01-05  1.2002  0.9996
2022-01-09  1.4045  1.0200
2022-01-10  1.1537  0.8364
2022-01-11  1.0120  0.8200
2022-01-12  0.9200  0.8200

For reproducability:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

import pandas as pd
import numpy as np
df1 = pd.DataFrame({
    'Date':['2022-01-02', '2022-01-05', '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12'],
    'ID1':[np.nan, 0.94, 1.12, 1.14, 1.1, 0.92], 
    'ID2':[0.95, 0.98, np.nan, 1.02, 1, 0.82]})
df1 = df1.set_index('Date')

Thanks a lot for you suggestion!

>Solution :

You can use:

df1.fillna(1)[::-1].rolling(window=3, min_periods=1).agg(lambda x: x.prod())[::-1]

output:

                 ID1     ID2
Date                        
2022-01-02  1.052800  0.9310
2022-01-05  1.200192  0.9996
2022-01-09  1.404480  1.0200
2022-01-10  1.153680  0.8364
2022-01-11  1.012000  0.8200
2022-01-12  0.920000  0.8200

How does it work?

  • rolling uses the previous rows (or is centered), here we reverse the array to compute an inverse rolling
  • We fill the NaNs with 1 (we could also use np.nanprod)
  • We use min_periods=1 to enable computation on less than 3 elements
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading