I’m trying to replace values in a dataframe based on values in a different dataframe. Below you will find a simple data sample:
df1
old_index Date
0 3 2019-10-23
1 19 2019-12-05
2 29 NaT
3 32 NaT
4 34 2019-12-05
5 116 NaT
6 118 2020-01-29
df2
old_index Date
4 3 2019-10-24
12 19 2019-12-10
15 29 2019-11-12
22 32 2021-11-19
29 34 2019-12-10
40 116 2020-03-24
45 118 2020-01-30
The resulting df should be
old_index Date
0 3 2019-10-23
1 19 2019-12-05
2 29 2019-11-12
3 32 2021-11-19
4 34 2019-12-05
5 116 2020-03-24
6 118 2020-01-29
I tried to use merge as shown below:
mrg = pd.merge(df1, df2, how="right", on="old_index")
But that resulted in the wrong df
old_index Date_x Date_y
0 3 2019-10-23 2019-10-24
1 19 2019-12-05 2019-12-10
2 29 NaT 2019-11-12
3 32 NaT 2021-11-19
4 34 2019-12-05 2019-12-10
5 116 NaT 2020-03-24
6 118 2020-01-29 2020-01-30
Any help is appreciated.
>Solution :
You can create a mapping series then fill missing values:
df1['Date'] = df1['Date'].fillna(df1['old_index'].map(df2.set_index('old_index')['Date']))
print(df1)
# Output
old_index Date
0 3 2019-10-23
1 19 2019-12-05
2 29 2019-11-12
3 32 2021-11-19
4 34 2019-12-05
5 116 2020-03-24
6 118 2020-01-29
A mapping Series acts as a dictionary:
>>> df2.set_index('old_index')['Date']
old_index
3 2019-10-24
19 2019-12-10
29 2019-11-12
32 2021-11-19
34 2019-12-10
116 2020-03-24
118 2020-01-30
Name: Date, dtype: datetime64[ns]