I have a dataframe df1
:-
Store_id | fruit | region |
---|---|---|
1 | orange | x |
2 | apple | y |
3 | NotKnown | z |
5 | Notknown | q |
6 | banana | w |
I have a dataframe df2
:-
Store_id | fruit | region |
---|---|---|
1 | orange | x |
2 | apple | y |
3 | pears | z |
5 | strawberry | q |
6 | banana | w |
8 | mango | i |
Expected df1
:-
Store_id | fruit | region |
---|---|---|
1 | orange | x |
2 | apple | y |
3 | pears | z |
5 | strawberry | q |
6 | banana | w |
Store_id
is the primary key.
How do I update column fruit
of df1
based on df2
column fruit
for value NotKnown
under fruit
column of df1
>Solution :
reg_to_fru = df2.set_index("region")["fruit"]
df1.fruit = df1.region.map(reg_to_fru)
you can form a mapper (a Series) from df2
as region -> fruit, then map the region in df1
with it:
In [39]: reg_to_fru = df2.set_index("region")["fruit"]
In [40]: reg_to_fru
Out[40]:
region
x orange
y apple
z pears
q strawberry
w banana
i mango
Name: fruit, dtype: object
In [41]: df1.fruit = df1.region.map(reg_to_fru)
In [42]: df1
Out[42]:
Store_id fruit region
0 1 orange x
1 2 apple y
2 3 pears z
3 5 strawberry q
4 6 banana w