Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Merge two pandas dataframes by index and replace column values in Python

I have two pandas dataframes:

DF1

index = np.arange('2020-01-01 00:00', '2020-01-01 00:04', dtype='datetime64[m]')
df = np.random.randint(100,500, size=(4,4))
columns =['Open','High','Low','Close']
df = pd.DataFrame(df, index=index, columns = columns)
df.index.name = 'Time'

                     Open  High  Low  Close
Time                                       
2020-01-01 00:00:00   266   397  177    475
2020-01-01 00:01:00   362   135  456    235
2020-01-01 00:02:00   315   298  296    493
2020-01-01 00:03:00   324   411  198    101

DF2

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

index = np.arange('2020-01-01 00:02', '2020-01-01 00:05', dtype='datetime64[m]')
df2 = np.random.randint(100,500, size=(3,4))
columns =['Open','High','Low','Close']
df2 = pd.DataFrame(df2, index=index, columns = columns)
df2.index.name = 'Time'

                     Open  High  Low  Close
Time                                       
2020-01-01 00:02:00   430   394  131    490
2020-01-01 00:03:00   190   211  394    359
2020-01-01 00:04:00   192   291  143    350

I need to merge both dataframes by the index (Time) and replace the column values of DF1 by the column values of DF2.

This is my expected output:

                     Open  High  Low  Close
Time                                       
2020-01-01 00:00:00   266   397  177    475 ->>>> Correspond to DF1
2020-01-01 00:01:00   362   135  456    235 ->>>> Correspond to DF1
2020-01-01 00:02:00   430   394  131    490 ->>>> Correspond to DF2
2020-01-01 00:03:00   190   211  394    359 ->>>> Correspond to DF2
2020-01-01 00:04:00   192   291  143    350 ->>>> Correspond to DF2

I have try several functions including merge or concat (concat([df1, df2], join="inner")) but with no success. Any help would be very appreciated. Thanks!

>Solution :

Try this:

df2.combine_first(df)
                     Open  High  Low  Close
Time                                       
2020-01-01 00:00:00   266   397  177    475
2020-01-01 00:01:00   362   135  456    235
2020-01-01 00:02:00   430   394  131    490
2020-01-01 00:03:00   190   211  394    359
2020-01-01 00:04:00   192   291  143    350

Because you mentioned pd.concat, here is how you could do it with that.

out = pd.concat([df, df2])
out = out[~out.index.duplicated(keep='last')]
print(out)
                     Open  High  Low  Close
Time                                       
2020-01-01 00:00:00   266   397  177    475
2020-01-01 00:01:00   362   135  456    235
2020-01-01 00:02:00   430   394  131    490
2020-01-01 00:03:00   190   211  394    359
2020-01-01 00:04:00   192   291  143    350

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading