Why does dividing columns by another column yield NaN?

I have a pandas DataFrame df:

df.info()
---
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 1 to 3
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       3 non-null      int64
 1   1       3 non-null      int64
 2   2       3 non-null      int64
dtypes: int64(3)
memory usage: 96.0 bytes
Survived    0    1    2
Pclass                 
1          80  136  216
2          97   87  184
3         372  119  491

Dividing the 1st and 2nd columns by the 3rd column produces NaN everywhere. Why?

df[[0, 1]].div(df[[2]], axis=0)
Survived   0   1   2
Pclass              
1        NaN NaN NaN
2        NaN NaN NaN
3        NaN NaN NaN

>Solution :

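Because df[[2]] is a one-column DataFrame, and DataFrame.div aligns two DataFrames on their column labels. The divisor's only column is named 2, which matches neither 0 nor 1, so no column pairs up and every value in the result is NaN.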

print (type(df[[2]]))
<class 'pandas.core.frame.DataFrame'>

print (df[[2]])
            2
Survived     
1         216
2         184
3         491

#rename the divisor's column 2 to 0, so only column 0 is aligned and divided
print (df[[0, 1]].div(df[[2]].rename(columns={2:0}), axis=0))

                 0   1
Survived              
1         0.370370 NaN
2         0.527174 NaN
3         0.757637 NaN
         
#rename the divisor's column 2 to 1, so only column 1 is aligned and divided
print (df[[0, 1]].div(df[[2]].rename(columns={2:1}), axis=0))

           0         1
Survived              
1        NaN  0.629630
2        NaN  0.472826
3        NaN  0.242363
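
The alignment behind these NaN values can be made visible with DataFrame.align, which pairs up the labels of two frames in roughly the way div does before it divides. This is a minimal sketch, assuming the frame is rebuilt by hand from the numbers in the question (the constructor call is illustrative, not the asker's original code):

```python
import pandas as pd

# Hand-built copy of the frame from the question (an assumption for illustration).
df = pd.DataFrame(
    {0: [80, 97, 372], 1: [136, 87, 119], 2: [216, 184, 491]},
    index=pd.Index([1, 2, 3], name="Pclass"),
)
df.columns.name = "Survived"

# align() defaults to an outer join on both axes, so each side gets
# every column label and NaN wherever it had no such column.
left, right = df[[0, 1]].align(df[[2]])
print(left)   # column 2 is all NaN
print(right)  # columns 0 and 1 are all NaN

# Dividing the aligned frames cell by cell always involves a NaN operand.
result = df[[0, 1]].div(df[[2]], axis=0)
print(result.columns.tolist())
print(result.isna().all().all())
```

Each cell of the result has a NaN on one side of the division, which is why the output has three columns and no numbers.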

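For the correct output, divide by a Series (df[2], single brackets) with DataFrame.div and axis=0, so the divisor is matched on the row index instead of on column labels: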

print (type(df[2]))
<class 'pandas.core.series.Series'>

print (df[2])
Survived
1    216
2    184
3    491
Name: 2, dtype: int64

out = df[[0, 1]].div(df[2], axis=0)
print (out)
                 0         1
Survived                    
1         0.370370  0.629630
2         0.527174  0.472826
3         0.757637  0.242363
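
For anyone who wants to run this end to end, here is a self-contained sketch. It assumes the frame is rebuilt by hand from the numbers in the question, and the variable out_squeezed is a name introduced only for this example. It also shows one way to get a Series when only the one-column DataFrame is available:

```python
import pandas as pd

# Hand-built copy of the frame from the question (an assumption for illustration).
df = pd.DataFrame(
    {0: [80, 97, 372], 1: [136, 87, 119], 2: [216, 184, 491]},
    index=pd.Index([1, 2, 3], name="Pclass"),
)

# With axis=0 a Series divisor is matched on the row index.
out = df[[0, 1]].div(df[2], axis=0)
print(out.round(6))

# A one-column DataFrame can be squeezed into a Series first.
out_squeezed = df[[0, 1]].div(df[[2]].squeeze("columns"), axis=0)
print(out.equals(out_squeezed))

# In this data column 2 is the row total, so each row of shares sums to 1.
print(out.sum(axis=1))
```

The row-sum check at the end is a quick sanity test that the division was applied row by row.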
