How to erase 'nan' values that are in a certain column using pandas?

I have a column in which many rows contain ‘nan’, sometimes mixed with other values.
I only want to delete the ‘nan’ text, not the entire row that contains it.
Some cells in that column hold multiple nans, like: nan,nan,nan,nan
and some cells have the name that I need with nans attached, like: Jefferson,nan,nan,nan

How can I erase just the nan parts?

Solution:

This should work. I’m using regular expressions to match a cell that is exactly 'nan' or that starts with 'nan,' followed by anything, and I’m replacing the whole match with an empty string ''.

I decided to use regex because, judging by your question, a literal string match won’t do: you don’t know exactly what is in the cell (it can hold any number of 'nan's).

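import pandas as pd

# Sample data: the 'names' column mixes real names with the string 'nan'.
data = {'names': ['Jefferson', 'nan', 'Olivia', 'nan', 'nan', 'nan,nan,nan', 'Rebekah'],
        'numbers': [1, 2, 3, 4, 5, 6, 7]}

df = pd.DataFrame(data=data)
# Blank out cells that are exactly 'nan' or that start with 'nan,'.
# A cell such as 'Jefferson,nan,nan,nan' matches neither pattern and is left unchanged.
df['names'] = df['names'].replace({r'^nan$': '', r'^nan,.*': ''}, regex=True)
df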
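The question also mentions cells like 'Jefferson,nan,nan,nan', where the name comes first and the patterns above do not match. One possible way to cover that case is to split each cell on commas and drop the 'nan' pieces. This is only a sketch: the helper name drop_nan_tokens and the sample rows are made up for illustration, and it assumes every cell is a string with comma-separated values.

```python
import pandas as pd

# Hypothetical sample rows modelled on the cells described in the question.
df = pd.DataFrame({
    'names': ['Jefferson,nan,nan,nan', 'nan,nan,nan,nan', 'nan', 'Olivia', 'nan,Rebekah'],
    'numbers': [1, 2, 3, 4, 5],
})

def drop_nan_tokens(cell: str) -> str:
    """Drop comma-separated pieces that are 'nan' or empty, then rejoin the rest."""
    tokens = [t.strip() for t in cell.split(',')]
    return ','.join(t for t in tokens if t and t != 'nan')

# Apply the helper to every cell of the column; the rows themselves are kept.
df['names'] = df['names'].map(drop_nan_tokens)
print(df['names'].tolist())
# ['Jefferson', '', '', 'Olivia', 'Rebekah']
```

Cells that held nothing but 'nan' end up as empty strings, so no rows are removed.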

If the cells hold real missing values (NaN) rather than the string 'nan', then df.fillna('') should do.
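To check which of the two cases applies, it can help to look at what isna() reports. The small example below is a sketch with made-up values: it shows that fillna('') only touches real missing values and leaves the text 'nan' alone.

```python
import numpy as np
import pandas as pd

# Hypothetical column: one real missing value (np.nan) and one literal string 'nan'.
s = pd.Series(['Jefferson', np.nan, 'nan'])

# isna() flags only the real missing value, not the text 'nan'.
mask = s.isna().tolist()
print(mask)    # [False, True, False]

# fillna('') replaces the real missing value and keeps the string 'nan' as it is.
filled = s.fillna('').tolist()
print(filled)  # ['Jefferson', '', 'nan']
```

Note that df.fillna('') fills every column of the frame; df['names'].fillna('') restricts it to a single column.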
