Pandas: forward-fill only some NaN values after concatenating two DataFrames with different timestamps

I have two DataFrames, one with more frequent entries than the other. I would like to concatenate them and fill the NaN values in the less frequent column with its last entry. If that last entry was itself NaN, the rows should stay NaN.

Example:

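import numpy as np
import pandas as pd

# Time-only strings are parsed with the current date, so the date in the output depends on the day you run this
df = pd.DataFrame(data=[4.5, 4.6, 5.7, 5.7, 6.7, 4, 9.0],
              index=list(map(pd.to_datetime, ['00:00', '00:30', '01:00', '01:30', '02:00', '02:30', '03:00'])),
              columns=['frequent data'])

# np.nan instead of np.NaN: the uppercase alias was removed in NumPy 2.0
df2 = pd.DataFrame(data=[4.5, np.nan, 5.7, np.nan],
              index=list(map(pd.to_datetime, ['00:00', '01:00', '02:00', '03:00'])),
              columns=['data'])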

df2
                     data
2022-01-15 00:00:00   4.5
2022-01-15 01:00:00   NaN
2022-01-15 02:00:00   5.7
2022-01-15 03:00:00   NaN

new_df = pd.concat((df, df2), axis=1)
new_df
                     frequent data  data
2022-01-15 00:00:00            4.5   4.5
2022-01-15 00:30:00            4.6   NaN
2022-01-15 01:00:00            5.7   NaN
2022-01-15 01:30:00            5.7   NaN
2022-01-15 02:00:00            6.7   5.7
2022-01-15 02:30:00            4.0   NaN
2022-01-15 03:00:00            9.0   NaN

I would like to get this DataFrame:

                     frequent data  data
2022-01-15 00:00:00            4.5   4.5
2022-01-15 00:30:00            4.6   4.5
2022-01-15 01:00:00            5.7   NaN
2022-01-15 01:30:00            5.7   NaN
2022-01-15 02:00:00            6.7   5.7
2022-01-15 02:30:00            4.0   5.7
2022-01-15 03:00:00            9.0   NaN

Is there an easy way to do this, or do I need to write my own function?

>Solution :

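IIUC, reindex df2 to df's index, then forward-fill within each hour so that a fill never crosses into the next hourly entry: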

# 'h' is the hourly alias; the uppercase 'H' is deprecated since pandas 2.2
df2 = df2.reindex(df.index).groupby(lambda x: x.floor('h')).ffill()

new_df =  pd.concat([df, df2], axis=1)
print(new_df)

# Output
                     frequent data  data
2022-01-15 00:00:00            4.5   4.5
2022-01-15 00:30:00            4.6   4.5
2022-01-15 01:00:00            5.7   NaN
2022-01-15 01:30:00            5.7   NaN
2022-01-15 02:00:00            6.7   5.7
2022-01-15 02:30:00            4.0   5.7
2022-01-15 03:00:00            9.0   NaN
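
To see why this works, it can help to split the one-liner into its steps. The following is a minimal sketch, not the answerer's exact code: it builds the index with `pd.date_range` and a fixed date (an assumption, so the result does not depend on the day it is run) and passes the floored index to `groupby` directly instead of a lambda.

```python
import numpy as np
import pandas as pd

# Fixed dates so the example does not depend on the day it is run.
idx = pd.date_range("2022-01-15 00:00", periods=7, freq="30min")
df = pd.DataFrame({"frequent data": [4.5, 4.6, 5.7, 5.7, 6.7, 4.0, 9.0]}, index=idx)
df2 = pd.DataFrame({"data": [4.5, np.nan, 5.7, np.nan]}, index=idx[::2])

# Step 1: align the sparse frame to the dense index; the new rows are NaN.
aligned = df2.reindex(df.index)

# Step 2: the group key is each timestamp rounded down to the hour,
# so 00:00 and 00:30 share a key, 01:00 and 01:30 share a key, and so on.
keys = df.index.floor("h")

# Step 3: forward-fill inside each group only, so a NaN that opens a
# group (01:00, 03:00) is never overwritten by the previous hour's value.
filled = aligned.groupby(keys).ffill()

new_df = pd.concat([df, filled], axis=1)
print(new_df)
```

A plain `ffill()` without the grouping would carry 4.5 into 01:00 and 01:30, which is exactly what the question wants to avoid.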

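You can also forward-fill after the concat. Note that this fills every column within each hour, which is harmless here because 'frequent data' has no gaps:

new_df = pd.concat([df, df2], axis=1).groupby(lambda x: x.floor('h')).ffill()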
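
Grouping by the floored hour relies on the sparse frame having one entry per clock hour. If that does not hold, one possible variant is to open a new group at every row that exists in the sparse frame, whether or not its value is NaN. This is a sketch under the assumption that every sparse timestamp also appears in the dense index; `fill_until_next_entry` is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd


def fill_until_next_entry(dense: pd.DataFrame, sparse: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: carry each sparse entry forward until the next one."""
    aligned = sparse.reindex(dense.index)
    # True on rows that exist in the sparse frame, even when the value is NaN.
    is_entry = dense.index.isin(sparse.index)
    # Running count of entries: every sparse row opens a new group.
    group_id = is_entry.cumsum()
    # Forward-fill only within a group, so a NaN entry blocks the fill.
    filled = aligned.groupby(group_id).ffill()
    return pd.concat([dense, filled], axis=1)


# Irregular sparse timestamps: 00:00, 00:40 and 01:00 on a 20-minute grid.
idx = pd.date_range("2022-01-15 00:00", periods=6, freq="20min")
dense = pd.DataFrame({"frequent data": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)
sparse = pd.DataFrame({"data": [1.0, np.nan, 3.0]}, index=idx[[0, 2, 3]])

result = fill_until_next_entry(dense, sparse)
print(result)
```

Here the NaN entry at 00:40 stays NaN, whereas flooring to the hour would put it in the same group as 00:00 and fill it with 1.0.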