Filling missing ages with the mean age of the student's class

I have a df with students from three different classes. I am trying to fill in the missing ages with the mean age of the other students in the same class. I tried two different ways. One works and the other does not. I cannot figure out why, as both seem to do exactly the same thing. Could you explain why solution B does not work while A does?

Solution A: (Working)

df.loc[(df['Age'].isna()) & (df['Class'] == 1),'Age'] = mean_age

Solution B: (not working)

df.loc[df['Class'] == 1,'Age'].fillna(mean_age, inplace=True)

>Solution :

IIUC:

df['Age'] = df['Age'].fillna(df.groupby('Class')['Age'].transform('mean'))
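As a minimal sketch of how that one-liner behaves, the example below runs it on a small made-up dataframe. The sample values and the intermediate name class_mean are assumptions for illustration; only the Class and Age column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three classes, some ages missing.
df = pd.DataFrame({
    "Class": [1, 1, 1, 2, 2, 3, 3],
    "Age": [20.0, np.nan, 30.0, 40.0, np.nan, np.nan, 50.0],
})

# For every row, the mean age of that row's class (NaN is skipped when averaging).
# transform returns a Series with the same index as df.
class_mean = df.groupby("Class")["Age"].transform("mean")

# fillna aligns on the index, so each missing age receives its own class mean.
df["Age"] = df["Age"].fillna(class_mean)

print(df)
```

Because transform keeps the original index, all classes are handled in one step and no per-class mean_age variable is needed.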

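Solution B can't work because selecting rows with a boolean mask, as in df.loc[df['Class'] == 1, 'Age'], returns a new Series that is a copy of the data, not a view into the original dataframe. Calling fillna(..., inplace=True) on it fills that temporary copy, which is then discarded, so df is left unchanged. Solution A works because it assigns through df.loc[...] = value, which writes directly into the original dataframe.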
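The difference can be checked on a small example. This is only a sketch: the dataframe df_demo, the value of mean_age, and the intermediate name subset are made up for illustration, and solution B is split into two steps so the temporary copy can be inspected.

```python
import numpy as np
import pandas as pd

# Hypothetical data and class mean, for illustration only.
df_demo = pd.DataFrame({"Class": [1, 1, 2], "Age": [20.0, np.nan, 40.0]})
mean_age = 20.0

# Solution B, split in two: boolean-mask selection returns a copy.
subset = df_demo.loc[df_demo["Class"] == 1, "Age"]
subset.fillna(mean_age, inplace=True)  # fills only the copy

print(subset.isna().sum())          # the copy has no missing values left
print(df_demo["Age"].isna().sum())  # the original still has its NaN

# Solution A: assignment through .loc writes into the original dataframe.
mask = df_demo["Age"].isna() & (df_demo["Class"] == 1)
df_demo.loc[mask, "Age"] = mean_age

print(df_demo["Age"].isna().sum())  # now filled
```

Depending on the pandas version, the inplace call on the copy may also emit a warning, which points at the same problem.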
