VLOOKUP using pandas without repetition

June 4, 2022

I have two dataframes df1 and df2. How do I obtain df3 using pd.merge or any other function?

What I tried?

df3=df1.merge(df2, on='A', how='left')

This gives me df3 with number of rows same as df2. But, what I want is number of rows in df1 and df3 are same. df3 should look exactly as in the example provided.

df1

df2

A   C
1   XY
1   XY
2   XY
2   XY
2   XY
3   XY
3   XY
4   XY
4   XY
5   XY
5   XY
6   AB
6   AB
7   AB
8   AB
9   AB
9   AB
9   AB
10  MN
10  MN

df3

>Solution :

Use drop_duplicates to keep one instance of (A, B):

>>> df1.merge(df2, on='A', how='left').drop_duplicates(['A', 'B'], ignore_index=True)
     A  B   C
0    1  1  XY
2    2  1  XY
5    3  2  XY
7    4  2  XY
9    5  3  XY
11   6  7  AB
13   7  1  AB
14   8  1  AB
15   9  9  AB
18  10  3  MN