Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Shift dataframe values for all columns to make monotonically increasing

I have a dataframe of measures in multiple columns, that are aggregated. This means that the function they represent is a monotonically increasing one. Now, due to reset of an apparatus, all measurements are reset to zero, after which the aggregation resumes. But to work with the data, I need to disgard the reset and shift all values in all columns to mimic that th reset never occured.

Hence, I what this sistuation:

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

to become

enter image description here

What I want is a function that will shift all values in all columns to to the last measured maximum.

for some sample data, I have created this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


date_range = pd.date_range(start='2021-01-01', end='2021-01-05', freq='1D')
df1 = pd.DataFrame({'Date': date_range, 'Column 1': range(5), 'Column 2': range(5)})

date_range = pd.date_range(start='2021-01-06', end='2021-01-10', freq='1D')
df2 = pd.DataFrame({'Date': date_range, 'Column 1': range(5), 'Column 2': range(5)})


df = pd.concat([df1,df2])

which I want to become

date_range = pd.date_range(start='2021-01-01', end='2021-01-10', freq='1D')
df3 = pd.DataFrame({'Date': date_range, 'Column 1': range(10), 'Column 2': range(10)})

I know how to do this in the case where I know that df is constructed from df1 and df2

def shift_df(df, df1, df2):

    columns = list(df.columns)
    columns.remove('Date')

    max_values = {}
    for col in columns:
        max_values[col] = df1[col].max()

    min_values = {}
    for col in columns:
        min_values[col] = df2[col].min()

    differences = {}
    for col in columns:
        differences[col] = max_values[col] - min_values[col]+1

    for col in columns:
        df[col] = np.where(df['Date'].isin(df2['Date']),
                            df[col] + differences[col],
                            df[col])

    return df

But I do not know how to generalize it if I only have the knowledge of df. Basically, how do I transform my function

shift_df(df, df1, df2)

to

shift_df(df)

Any help will be greatly appreciated.

>Solution :

Here is one option assuming you want to handle all columns independently:

  • compute and identify the negative jumps with a diff
  • keep only those, inverse the values, add 1
  • compute a cumsum
  • add to original Series
def shifter(s):
    m = s.diff()
    return s.add((-m.sub(1)).where(m.lt(0), 0).cumsum())

out = df.set_index('Date').apply(shifter)

Output:

            Column 1  Column 2
Date                          
2021-01-01       0.0       0.0
2021-01-02       1.0       1.0
2021-01-03       2.0       2.0
2021-01-04       3.0       3.0
2021-01-05       4.0       4.0
2021-01-06       5.0       5.0
2021-01-07       6.0       6.0
2021-01-08       7.0       7.0
2021-01-09       8.0       8.0
2021-01-10       9.0       9.0

Another example:

df = pd.DataFrame({'Date': [Timestamp('2021-01-01 00:00:00'), Timestamp('2021-01-02 00:00:00'), Timestamp('2021-01-03 00:00:00'), Timestamp('2021-01-04 00:00:00'), Timestamp('2021-01-05 00:00:00'), Timestamp('2021-01-06 00:00:00'), Timestamp('2021-01-07 00:00:00'), Timestamp('2021-01-08 00:00:00'), Timestamp('2021-01-09 00:00:00'), Timestamp('2021-01-10 00:00:00')],
                   'Column 1': [0, 1, 2, 3, 4, 0, 1, 2, 3, 4],
                   'Column 2': [3, 4, 5, 6, 7, 3, 4, 5, 6, 7]}
)

out = df.set_index('Date').apply(shifter)

Output:

            Column 1  Column 2
Date                          
2021-01-01       0.0       3.0
2021-01-02       1.0       4.0
2021-01-03       2.0       5.0
2021-01-04       3.0       6.0
2021-01-05       4.0       7.0
2021-01-06       5.0       8.0
2021-01-07       6.0       9.0
2021-01-08       7.0      10.0
2021-01-09       8.0      11.0
2021-01-10       9.0      12.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading