Compare two pandas dataframe with different index

I have two Dataframes

data1 = pd.DataFrame({'item': ['A', 'B', 'C'], 'cost': [10, 20, 30], 
                     'quantity': [1, 2, 3]})
data2 = pd.DataFrame({'item': ['B', 'C', 'C','D'], 
                    'cost': [20, 30, 30, 40], 'price': [10, 20, 30, 35], 
                    'serialnumber': ['x', 'y', 'z', 'a'],
                    'color': ['red', 'blue', 'green']})

I wanted to compare data1['item'] and data1['cost'] with data2['item'] and data2['cost'].
If both rows item and cost matches, I wanted to get the serialnumber and price from data2 and append in data1

I used this python script

data1['price','serial']=np.where((data1['item']==data2['item']) & 
                                (data1['cost']==data2['cost']),
                                 data2['price','serialnumber'], np.nan)

I am getting an error can only compare identically-labeled dataframe objects due to an index mismatch between the 2 dataframes.

Expected Output is

  item  cost  quantity  price  serial
0    A    10         1    NaN    NaN
1    B    20         2    NaN    NaN
2    C    30         3   30.0      z

Please help me sort this out

>Solution :

You can use merge function using left key word

data1[['price','serial']]=(data1.merge(data2,how='left',on= 

['item','cost']).loc[:,['price','serialnumber']])

Leave a Reply