Why does df.where() not replace all null values?

I have a dataframe with very mixed columns. I am trying to set all null values (None or NaN) to None.

I tried the accepted answer to this question: Use None instead of np.nan for null values in pandas DataFrame. But it does not catch all null instances. Example:

import pandas as pd

my_array = ['1', '2', None, 4]
df = pd.DataFrame([my_array], columns=['Morning', 'Midday', 'Evening', 'Night'])
# DataFrame.append was removed in pandas 2.0; pd.concat is its replacement
df = pd.concat([df, pd.DataFrame([{'Midday': '10'}])], ignore_index=True)
df.where(df.notnull(), None)

Thus, the input df as assembled above is:


  Morning Midday Evening  Night
0       1      2    None    4.0
1     NaN     10     NaN    NaN

The output I see is:

  Morning Midday Evening  Night
0       1      2    None    4.0
1    None     10    None    NaN

But the output I want is:

  Morning Midday Evening  Night
0       1      2    None    4.0
1    None     10    None    None

What am I getting wrong, or is df.where() doing something wrong?

>Solution :

You need to cast the DataFrame to the object dtype first. df.where() does fill the masked positions with None, but the result is cast back to each column's original dtype: in a float column such as Night, None is coerced back to NaN. An object column can hold None as-is.

out = df.astype(object).where(df.notna(), None)
Out[392]: 
  Morning Midday Evening Night
0       1      2    None   4.0
1    None     10    None  None
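A minimal, self-contained sketch of the fix, rebuilding the question's DataFrame directly (notna and notnull are aliases, so either works):

```python
import pandas as pd
import numpy as np

# Reconstruct the two-row frame from the question: row 1 has nulls
# everywhere except Midday, and Night is a float64 column.
df = pd.DataFrame(
    [['1', '2', None, 4], [np.nan, '10', np.nan, np.nan]],
    columns=['Morning', 'Midday', 'Evening', 'Night'],
)

# Casting to object first prevents the float 'Night' column from
# coercing the filled None back to NaN.
out = df.astype(object).where(df.notna(), None)

print(out)
```

Without the astype(object) step, `out.iloc[1, 3]` would still be NaN, because where() casts the filled column back to float64.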