pandas: .isna() shows that whole column is NaNs, but it is strings

I have a pandas DataFrame with a column populated by "yes" or "no" strings.
When I call .value_counts() on this column, I get the correct distribution.
But when I run .isna(), it shows that the whole column is NaN.

I suspect this will cause problems for me later.

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[0,1,2,3,4],[40,30,20,10,0], ['yes','yes','no','no','yes']]).T, columns=['A','B','C'])

len(df['C'].isna())  # 5 --> why?!
df['C'].value_counts()  # yes: 3, no: 2 --> as expected

Solution:

.isna() returns a boolean Series with one entry per row, so len gives you the length of that Series (irrespective of its content), not the number of True values.

Use sum if you want the count of True:

df['C'].isna().sum()
# 0
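
To make the difference concrete, here is a minimal sketch that compares len, sum and any on the mask returned by .isna(). The data is a simplified stand-in for the question's column C, built directly from a list instead of the transposed NumPy array, and the variable names (mask, n_rows, n_missing, any_missing) are only illustrative.

```python
import pandas as pd

# Assumption: a simplified version of the question's column 'C'
df = pd.DataFrame({"C": ["yes", "yes", "no", "no", "yes"]})

# .isna() returns a boolean Series with one entry per row
mask = df["C"].isna()

n_rows = len(mask)              # length of the mask, whatever it contains
n_missing = int(mask.sum())     # True counts as 1, so this counts missing values
any_missing = bool(mask.any())  # quick check: is anything missing at all?

print(n_rows, n_missing, any_missing)  # 5 0 False
```

So a length of 5 only says the column has five rows. The sum of 0 is what confirms that none of them are NaN.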