Dropping the key column of right data frame after merging

November 30, 2023

I’m looking for a simple way to get rid of the key column of the right table after merging it with another data frame:

import pandas as pd

users = pd.DataFrame({"id": [1,2,3], "name": ["Mark", "Elon", "Jeff"]})
orders = pd.DataFrame({"id": [1,2,3,4,5,6], "user_id": [2,3,2,1,1,3]})

orders_full = pd.merge(left=orders, right=users, how="left", left_on="user_id", right_on="id")
orders_full

This is the result:

I guess there is a pythonic elegant code to have better column names as result (avoid renaming e and dropping each column manually)

good solution: id, user_id, name

even better: order_id, user_id, user_name

>Solution :

To get around the id_x/id_y issue, you can set_index on the appropriate columns and use join instead:

orders_full = (
    orders
   .set_index('user_id')
   .join(users.set_index('id'), how='left')
   .reset_index()
)

Output:

   user_id  id  name
0        1   4  Mark
1        1   5  Mark
2        2   1  Elon
3        2   3  Elon
4        3   2  Jeff
5        3   6  Jeff

To finish with your desired column names, you will have to do a rename e.g.

orders_full = (
    orders
   .set_index('user_id')
   .join(users.set_index('id'), how='left')
   .reset_index()
   .rename(columns={'id':'order_id', 'name':'user_name'})
)

Output:

   user_id  order_id user_name
0        1         4      Mark
1        1         5      Mark
2        2         1      Elon
3        2         3      Elon
4        3         2      Jeff
5        3         6      Jeff