Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas merge two dataframes horizontally

I am dealing with a problem like this:

mydata_old
|  x  |  y   |
|-----|------|
| 1   | 10   |
| 2   | 15   |
| 3   | 12   |
| 4   | 21   |

mydata_new
|  x  |  y   |
|-----|------|
| 4   | 20   |
| 5   | 25   |
| 6   | 30   |

I would like to combine both tables to get:

mydata
|  x  |  y   |
|-----|------|
| 1   | 10   |
| 2   | 15   |
| 3   | 12   |
| 4   | 20   | # x = 4 does not show up twice
| 5   | 25   |
| 6   | 30   |

I tried with pandas.merge, but I cannot get the desired result.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

mydata = pd.merge(mydata_new, mydata_old, how="left")

>Solution :

One way to do this is with an outer join (i.e., keep the index from both dataframes). You can achieve this using pd.merge(mydata_new, mydata_old, how='outer') and then keep the newest data available.

However, this could be more straight forward with pandas.concat.

Here, I am concatenating both dataframes. Note that I exclude rows from mydata_old if x appears in mydata_new. This way you can keep the newest data when it’s available in both dataframes (old and new).

# Concat frames and if data is available in both, keep data from mydata_new
mydata = pd.concat(
    [
        mydata_old.loc[~mydata_old['x'].isin(mydata_new['x'])],
        mydata_new
    ],
    axis=0)

This will print the following:

x y
1 10
2 15
3 12
4 20
5 25
6 30
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading