Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas resample based on string like PeriodIndex

I have a dataframe like as below

df = pd.DataFrame({'subject_id':[1,1,1,1,1,2,2,2,2,2],
                   'qtr_info' :['2017Q1','2017Q3','2017Q4','2018Q1','2018Q4','2017Q1','2017Q4','2018Q2','2018Q4','2019Q1'],
                   'val' :[5,5,5,5,1,6,5,5,8,3],
                   'Prod_id':['A','B','C','A','E','Q','G','F','G','H']})

I would like to do the below

a) Fill all the missing quarters of a subject

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

b) fillna for other columns using the mean value for respective columns (for the same subject). Don’t refer other subject ids for computing mean value

So, I tried the below

df_time.resample('Q').mean()   #didn't work
df_time.groupby('subject_id').resample('Q).mean()  #didn't work

But I got the below error

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or
PeriodIndex, but got an instance of ‘Int64Index’

I expect my output to be like as below for subject_id = 1. Here yellow color shows newly inserted rows (to fill the missing gaps)

enter image description here

>Solution :

Create PeriodIndex first and for resample use Resampler.agg with dictionary, for replace missing values by means use Series.fillna with GroupBy.transform:

df_time['qtr_info'] = pd.PeriodIndex(df_time['qtr_info'], freq='Q')
df = (df_time.set_index('qtr_info')
             .groupby('subject_id')
             .resample('Q')
             .agg({'val':'mean', 'Prod_id':'ffill'}))

df['val'] = df['val'].fillna(df.groupby(level=0)['val'].transform('mean'))
print (df)
                     val Prod_id
subject_id qtr_info             
1          2017Q1    5.0       A
           2017Q2    4.2       A
           2017Q3    5.0       B
           2017Q4    5.0       C
           2018Q1    5.0       A
           2018Q2    4.2       A
           2018Q3    4.2       A
           2018Q4    1.0       E
2          2017Q1    6.0       Q
           2017Q2    5.4       Q
           2017Q3    5.4       Q
           2017Q4    5.0       G
           2018Q1    5.4       G
           2018Q2    5.0       F
           2018Q3    5.4       F
           2018Q4    8.0       G
           2019Q1    3.0       H
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading