Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python: Join two dataframes based on hour and nearest minute of date index

I have two dataframes with different dates as follows:

df1 = pd.DataFrame(index=['2022-01-01 00:37:57', '2022-01-01 03:49:12', '2022-01-01 09:30:11'], columns = ['price'])
df1['price'] = [10,13,12]
df1.index = df1.index.rename('date')
df1:
                        price
date                      
2022-01-01 00:37:57     10
2022-01-01 03:49:12     13
2022-01-01 09:30:11     12 

df2 = pd.DataFrame(index=['2022-01-01 00:35:00', '2022-01-01 00:47:00', '2022-01-01 00:56:12', '2022-01-01 03:45:00', '2022-01-01 03:50:32',
                        '2022-01-01 09:29:20', '2022-01-01 09:31:21'], columns=['price'])
df2['price'] = [3000,3210, 2999, 3001, 3027, 3021, 3002]
df2.index = df2.index.rename('date')
df2:
                      price
date                      
2022-01-01 00:35:00   3000
2022-01-01 00:47:00   3210
2022-01-01 00:56:12   2999
2022-01-01 03:45:00   3001
2022-01-01 03:50:32   3027
2022-01-01 09:29:20   3021
2022-01-01 09:31:21   3002

I want to left join df1 with df2, df1.join(df2,how='left'), on the hour and nearest minute to get the following:

df:
                        price_x price_y
date
2022-01-01 00:37:57     10      3000
2022-01-01 03:49:12     13      3210
2022-01-01 09:30:11     12      3021

So for example, the last row left joined on the date "2022-01-01 09:29:20" since it is closest to "2022-01-01 09:30:11".

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

How can this be done?

>Solution :

Try pd.merge_asof() (assuming the index of DateTime type and sorted):

print(
    pd.merge_asof(
        df1,
        df2,
        left_index=True,
        right_index=True,
        direction="nearest",
    )
)

Prints:

                     price_x  price_y
date                                 
2022-01-01 00:37:57       10     3000
2022-01-01 03:49:12       13     3027
2022-01-01 09:30:11       12     3021
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading