python piecewise linear interpolation across dataframes in a list

I am trying to apply piecewise linear interpolation. I first tried pandas' built-in interpolate function, but it did not work.

Example data is shown below:

import pandas as pd
import numpy as np

d = {'ID':[5,5,5,5,5,5,5], 'month':[0,3,6,9,12,15,18], 'num':[7,np.nan,5,np.nan,np.nan,5,8]}
tempo = pd.DataFrame(data = d)
d2 = {'ID':[6,6,6,6,6,6,6], 'month':[0,3,6,9,12,15,18], 'num':[5,np.nan,2,np.nan,np.nan,np.nan,7]}
tempo2 = pd.DataFrame(data = d2)
this = []
this.append(tempo)
this.append(tempo2)

The actual data has over 1000 unique IDs, so I filtered each ID into a dataframe and put them into the list.

The first dataframe in the list (tempo above) looks like this:

ID | month | num
5 | 0 | 7
5 | 3 | NaN
5 | 6 | 5
5 | 9 | NaN
5 | 12 | NaN
5 | 15 | 5
5 | 18 | 8

I am trying to go through all the dataframes in the list and apply piecewise linear interpolation to each one. I tried setting month as the index and using .interpolate(method='index', inplace=True), but it did not work.

The expected output is:

ID | month | num
5 | 0 | 7
5 | 3 | 6
5 | 6 | 5
5 | 9 | 5
5 | 12 | 5
5 | 15 | 5
5 | 18 | 8

This needs to be applied across all the dataframes in the list.

I would really appreciate any help! Thank you.

Solution:

Assuming this is a follow-up to your previous question, change the code to:

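for i, df in enumerate(this):
    this[i] = (df
        .set_index('month')
        # optional (carried over from the previous question):
        # make sure every 3-month step between min and max exists
        .reindex(range(df['month'].min(), df['month'].max()+3, 3))
        # linear interpolation between the known values
        .interpolate()
        # restore 'month' as a column and the original column order
        .reset_index()[df.columns]
        )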

NB. I simplified the code by removing the groupby. This only works if you have a single group per DataFrame, as you mentioned in the other question.
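The question mentions trying method='index'. As a minimal sketch of that variant (not part of the original answer), the month values can be used directly as x-coordinates, which should also cope with unevenly spaced months without the reindex step. The helper name interpolate_by_month is made up for illustration, and only the num column is interpolated.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
d = {'ID': [5, 5, 5, 5, 5, 5, 5], 'month': [0, 3, 6, 9, 12, 15, 18],
     'num': [7, np.nan, 5, np.nan, np.nan, 5, 8]}
d2 = {'ID': [6, 6, 6, 6, 6, 6, 6], 'month': [0, 3, 6, 9, 12, 15, 18],
      'num': [5, np.nan, 2, np.nan, np.nan, np.nan, 7]}
this = [pd.DataFrame(data=d), pd.DataFrame(data=d2)]


def interpolate_by_month(df):
    """Hypothetical helper: fill NaNs in 'num' linearly, weighted by 'month'."""
    out = df.set_index('month')
    # method='index' uses the numeric index (month) as the x-axis
    out['num'] = out['num'].interpolate(method='index')
    # restore 'month' as a column and the original column order
    return out.reset_index()[df.columns]


# Build a new list instead of relying on inplace=True
this = [interpolate_by_month(df) for df in this]
```

Assigning the result back, rather than passing inplace=True to a chained call, keeps the interpolated values: a chained call with inplace=True returns None and modifies only a temporary object.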

Output:


[   ID  month  num
0   5      0  7.0
1   5      3  6.0
2   5      6  5.0
3   5      9  5.0
4   5     12  5.0
5   5     15  5.0
6   5     18  8.0,
   ID  month   num
0   6      0  5.00
1   6      3  3.50
2   6      6  2.00
3   6      9  3.25
4   6     12  4.50
5   6     15  5.75
6   6     18  7.00]
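
If the data is still available as one DataFrame, splitting it into a list may not be needed. The sketch below is an alternative under stated assumptions, not the answer's code: it assumes the rows are sorted by month within each ID and that the months are evenly spaced with no missing steps, so the default linear interpolation applies. The name combined is hypothetical.

```python
import numpy as np
import pandas as pd

# Same sample values as in the question, kept in a single DataFrame
combined = pd.DataFrame({
    'ID': [5] * 7 + [6] * 7,
    'month': [0, 3, 6, 9, 12, 15, 18] * 2,
    'num': [7, np.nan, 5, np.nan, np.nan, 5, 8,
            5, np.nan, 2, np.nan, np.nan, np.nan, 7],
})

# Interpolate within each ID so values never leak across groups
combined['num'] = (combined
    .groupby('ID')['num']
    .transform(lambda s: s.interpolate())
    )
```

With more than 1000 IDs this avoids building and looping over a list of DataFrames, at the cost of relying on the evenly spaced months assumed above.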