Why do we replace NaN values in a DataFrame with the mean, and doesn't changing them affect our data?
0 1.048242
1 1.688173
2 NaN
3 0.194162
4 0.194162
5 0.493194
6 NaN
7 0.675041
8 NaN
9 0.101743
10 3.112086
mean = df['view_duration'].mean()
df['view_duration'] = df['view_duration'].fillna(mean)
0 1.048242
1 1.688173
2 0.938350
3 0.194162
4 0.194162
5 0.493194
6 0.938350
7 0.675041
8 0.938350
9 0.101743
10 3.112086
>Solution :
Replacing nulls with a plausible substitute value (such as the mean) is called imputation. It is usually done because most machine learning models cannot accept nulls.
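Filling with the mean does not change the mean of the column. It does affect the data in other ways, though: every imputed value sits exactly at the center, so the variance of the column shrinks and its correlation with other columns can be weakened.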
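As a rough check, here is a minimal sketch that rebuilds the column from the question and compares the mean and standard deviation before and after filling. The Series construction is an assumption, since the original DataFrame is not shown; only the values and the column name come from the question.

```python
import numpy as np
import pandas as pd

# Values copied from the question; building them as a Series is an assumption.
s = pd.Series(
    [1.048242, 1.688173, np.nan, 0.194162, 0.194162, 0.493194,
     np.nan, 0.675041, np.nan, 0.101743, 3.112086],
    name="view_duration",
)

mean_before = s.mean()          # mean() skips NaN by default
std_before = s.std()            # std() skips NaN by default as well

filled = s.fillna(mean_before)  # mean imputation, returns a new Series

mean_after = filled.mean()
std_after = filled.std()

print(f"mean before: {mean_before:.6f}, after: {mean_after:.6f}")
print(f"std  before: {std_before:.6f}, after: {std_after:.6f}")
```

The two means should agree up to floating-point error, while the standard deviation after filling should come out smaller than before.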
Note that if a column has too many nulls (a common rule of thumb is above 30%, but this should be judged case by case), it is usually better not to impute and to drop the rows with nulls instead.
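One way to apply that rule of thumb is sketched below, assuming a single-column DataFrame built from the question's values. The `THRESHOLD` name and the 0.30 cutoff are illustrative, taken from the 30% guideline above rather than from any library default.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame rebuilt from the values in the question.
df = pd.DataFrame({
    "view_duration": [1.048242, 1.688173, np.nan, 0.194162, 0.194162,
                      0.493194, np.nan, 0.675041, np.nan, 0.101743, 3.112086]
})

THRESHOLD = 0.30  # rule-of-thumb cutoff; adjust per dataset

# isna() gives booleans, so their mean is the fraction of missing values.
missing_fraction = df["view_duration"].isna().mean()

if missing_fraction > THRESHOLD:
    # Too many nulls: drop the affected rows instead of imputing.
    cleaned = df.dropna(subset=["view_duration"])
else:
    # Few enough nulls: fill them with the column mean.
    col_mean = df["view_duration"].mean()
    cleaned = df.assign(view_duration=df["view_duration"].fillna(col_mean))

print(f"missing: {missing_fraction:.1%}, rows kept: {len(cleaned)}")
```

Here 3 of the 11 values are missing, which is below the cutoff, so the sketch takes the imputation branch and keeps all rows.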