I have a column with a bunch of rows that contain ‘nan’ mixed in with other values.
I only want to delete the ‘nan’, not the entire row that contains it.
Some cells in that column have multiple nans, like: nan,nan,nan,nan
and some cells have the name that I need with nans attached, like: Jefferson,nan,nan,nan
How can I erase just the nans?
>Solution :
This should work. I’m using regular expressions to match cells that are exactly 'nan' or that start with 'nan,' followed by anything, and I’m replacing them with an empty string ''.
I decided to use regex because, from your question, I don’t think you can use a literal string: you don’t know exactly what is within the cell (it can be any number of 'nan's).
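import pandas as pd

# Sample data: some cells are 'nan' or a comma-separated run of 'nan's
data = {'names': ['Jefferson', 'nan', 'Olivia', 'nan', 'nan', 'nan,nan,nan', 'Rebekah'],
        'numbers': [1, 2, 3, 4, 5, 6, 7]}
df = pd.DataFrame(data=data)

# Blank out cells that are exactly 'nan' or that start with 'nan,'
df['names'] = df['names'].replace({r'^nan$': '', r'^nan,.*': ''}, regex=True)
df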
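The patterns above only touch cells that begin with 'nan', so a cell such as 'Jefferson,nan,nan,nan' from the question would be left unchanged. Here is a minimal sketch of one way to cover that case too, assuming the cells are comma-separated strings: remove every 'nan' item, then strip any leftover leading comma. The helper name strip_nan_tokens and the sample values are made up for illustration.

```python
import pandas as pd

def strip_nan_tokens(col: pd.Series) -> pd.Series:
    """Hypothetical helper: drop every comma-separated 'nan' item from each cell."""
    # Match 'nan' only when it is a whole item: preceded by start-of-string or a
    # comma, and followed by a comma or end-of-string (so 'nancy' is not touched).
    cleaned = col.str.replace(r'(?:^|,)nan(?=,|$)', '', regex=True)
    # A leading 'nan' leaves its trailing comma behind, e.g. ',Rebekah'
    return cleaned.str.lstrip(',')

# Illustrative data, not from the original post
df = pd.DataFrame({'names': ['Jefferson,nan,nan,nan', 'nan,nan,nan,nan',
                             'nan', 'Olivia', 'nan,Rebekah']})
df['names'] = strip_nan_tokens(df['names'])
print(df)
```

The lookahead keeps the comma after a removed 'nan' available to the next match, which is what lets a run like nan,nan,nan collapse completely.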
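If we are not talking about the string 'nan' but about actual missing values (NaN), then df.fillna('') should do. Note that it returns a new DataFrame, so assign the result back.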
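For the missing-value case, a small sketch of what that could look like, assuming the column holds real NaN values (the sample data here is made up):

```python
import numpy as np
import pandas as pd

# Illustrative data with a real missing value rather than the string 'nan'
df = pd.DataFrame({'names': ['Jefferson', np.nan, 'Olivia'],
                   'numbers': [1, 2, 3]})

# fillna returns a new object, so assign the result back to the column
df['names'] = df['names'].fillna('')
print(df)
```

Filling only the one column leaves missing values in other columns alone; df.fillna('') would fill the whole DataFrame.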