I’m trying to make a temporary dataframe in which NaN values are replaced by zeros, without affecting the original dataframe. However, when I replace the NaNs in df_2 with 0s, the corresponding column in df_1 changes as well. Have I done something wrong when creating df_2?
Code
import numpy as np
import pandas as pd

d = {'A': [np.nan, np.nan], 'B': [1, 2]}
df_1 = pd.DataFrame(data=d)
df_2 = df_1
print(df_1)
print(df_2)
df_2['A'] = df_2['A'].replace(np.nan,0)
print(df_1)
print(df_2)
Outputs
    A  B
0 NaN  1
1 NaN  2
    A  B
0 NaN  1
1 NaN  2
     A  B
0  0.0  1
1  0.0  2
     A  B
0  0.0  1
1  0.0  2
Solution
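The assignment df_2 = df_1 does not create a new dataframe. It only binds a second name to the same object, so any change made through df_2 is also visible through df_1. Make a copy instead (copy() is deep by default):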
df_2 = df_1.copy()
See the pandas.DataFrame.copy documentation for details.
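To make the difference concrete, here is a minimal sketch that contrasts plain assignment with copy(). The names alias and df_tmp are only illustrative and do not appear in the question. It also shows that fillna, which returns a new dataframe by default, may be enough on its own when all you need is a temporary zero-filled version.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({'A': [np.nan, np.nan], 'B': [1, 2]})

# Plain assignment: a second name for the same object, no data is copied.
alias = df_1

# Independent copy (deep=True is the default).
df_2 = df_1.copy()
df_2['A'] = df_2['A'].fillna(0)

# Alternative: fillna returns a new dataframe and leaves df_1 as it is.
df_tmp = df_1.fillna({'A': 0})

print(alias is df_1)             # the alias is the very same object
print(df_1['A'].isna().all())    # the original still holds NaN
print((df_2['A'] == 0).all())    # only the copy was filled
print((df_tmp['A'] == 0).all())  # so was the fillna result
```

All four checks should print True: the alias is the original object, while both the copy and the fillna result were changed without touching df_1.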