I have an Excel file with a GEO column:
GEO
-------
EMEA
NA
LA
ASAP
EMEA
NA
NA
But when I read it in Python:
df = pd.read_excel(path + '\\' + file)
It reads "NA" as missing:
GEO
-------
EMEA
LA
ASAP
EMEA
I know how to tell Python to treat additional values as missing, but I haven’t found how to tell it not to treat "NA" as missing.
>Solution :
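Use the na_values and keep_default_na parameters described in the documentation of read_excel. Passing keep_default_na=False discards the built-in list of missing-value markers (which includes "NA"), and na_values supplies the markers you still want to be treated as missing: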
import pandas as pd

# This list was built from the default na_values, minus NA
NA_VALUES = ['', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan',
             '1.#IND', '1.#QNAN', '<NA>', 'N/A', 'NULL', 'NaN', 'n/a', 'nan', 'null']
df = pd.read_excel(path + '\\' + file, na_values=NA_VALUES, keep_default_na=False)
Output:
>>> df
GEO
0 EMEA
1 NA
2 LA
3 ASAP
4 EMEA
5 NA
6 NA
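
To check the effect without an Excel file at hand, here is a minimal sketch that uses read_csv on an in-memory string instead. It assumes read_csv is an acceptable stand-in, since it accepts the same na_values and keep_default_na parameters as read_excel. The csv_text data, the shortened NA_VALUES list, and the variable names default_df and kept_df are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory data standing in for the Excel sheet.
csv_text = "GEO\nEMEA\nNA\nLA\nASAP\nEMEA\nNA\nNA\n"

# Default behaviour: "NA" is in the built-in marker list, so it becomes NaN.
default_df = pd.read_csv(io.StringIO(csv_text))

# Shortened illustrative list of markers that should still count as missing.
# "NA" is deliberately left out.
NA_VALUES = ['', '#N/A', 'NULL', 'NaN', 'nan', 'null']

# keep_default_na=False drops the built-in list; only NA_VALUES applies.
kept_df = pd.read_csv(
    io.StringIO(csv_text),
    na_values=NA_VALUES,
    keep_default_na=False,
)

print(default_df["GEO"].isna().sum())  # number of rows parsed as missing
print(kept_df["GEO"].tolist())         # "NA" survives as a plain string
```

With the defaults, the three "NA" rows are counted as missing. With keep_default_na=False, they come back as ordinary strings.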