Update subset of values from one column based on another dataframe

I have a dataframe df1 :-

Store_id fruit region
1 orange x
2 apple y
3 NotKnown z
5 Notknown q
6 banana w

I have a dataframe df2 :-

Store_id fruit region
1 orange x
2 apple y
3 pears z
5 strawberry q
6 banana w
8 mango i

Expected df1 :-

Store_id fruit region
1 orange x
2 apple y
3 pears z
5 strawberry q
6 banana w

Store_id is the primary key.
How do I update the fruit column of df1 from the fruit column of df2, for the rows where fruit in df1 is NotKnown?

>Solution :

reg_to_fru = df2.set_index("region")["fruit"]
df1.fruit = df1.region.map(reg_to_fru)

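You can build a mapper (a Series) from df2 that goes region -> fruit, then map the region column of df1 through it. Note that this rewrites every value in df1.fruit, not only the NotKnown ones, and it looks values up by region rather than by Store_id, so it relies on the region values in df2 being unique: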

In [39]: reg_to_fru = df2.set_index("region")["fruit"]

In [40]: reg_to_fru 
Out[40]:
region
x        orange
y         apple
z         pears
q    strawberry
w        banana
i         mango
Name: fruit, dtype: object

In [41]: df1.fruit = df1.region.map(reg_to_fru)

In [42]: df1
Out[42]:
   Store_id       fruit region
0         1      orange      x
1         2       apple      y
2         3       pears      z
3         5  strawberry      q
4         6      banana      w
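
Since the question names Store_id as the primary key and asks to change only the NotKnown rows, a variant keyed on Store_id may be closer to what was asked. The following is a minimal sketch, not part of the original answer: the names id_to_fruit and unknown are made up for illustration, and the case-insensitive comparison is an assumption based on the sample data containing both "NotKnown" and "Notknown".

```python
import pandas as pd

# Sample frames copied from the question
df1 = pd.DataFrame({
    "Store_id": [1, 2, 3, 5, 6],
    "fruit": ["orange", "apple", "NotKnown", "Notknown", "banana"],
    "region": ["x", "y", "z", "q", "w"],
})
df2 = pd.DataFrame({
    "Store_id": [1, 2, 3, 5, 6, 8],
    "fruit": ["orange", "apple", "pears", "strawberry", "banana", "mango"],
    "region": ["x", "y", "z", "q", "w", "i"],
})

# Lookup Series: Store_id -> fruit (hypothetical name, assumes Store_id is unique in df2)
id_to_fruit = df2.set_index("Store_id")["fruit"]

# Boolean mask of the rows to fix; lower-casing is an assumption so that
# both "NotKnown" and "Notknown" from the sample data are matched
unknown = df1["fruit"].str.lower().eq("notknown")

# Overwrite only the masked rows, looking each one up by its Store_id
df1.loc[unknown, "fruit"] = df1.loc[unknown, "Store_id"].map(id_to_fruit)

print(df1)
```

Rows whose fruit is already known are left untouched. If a masked Store_id has no match in df2, map yields NaN for that row, so you may want to handle that case separately.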
