Pandas numeric column considered as string if NaN is inside

I am starting to learn Python and I have an issue with a pandas DataFrame. In R, even if a numeric column contains NaN values, R still infers the correct data type for each column. In pandas this does not seem to be the case:

import pandas as pd

data = {
    "calories": ["NA", 380, 390],
    "duration": [50, 40, 45]
}

df = pd.DataFrame(data)
# calories is reported as object, not as a numeric type
print(df.dtypes)

How can I get pandas to automatically detect the correct data type of each column?

Thanks in advance

Solution:

"NA" is a string, so pandas stores the whole column as object. Use np.nan or float('nan') instead:

import pandas as pd

data = {
    "calories": [float('nan'), 380, 390],
    "duration": [50, 40, 45]
}

df = pd.DataFrame(data)
print(df.dtypes)

calories    float64
duration      int64
dtype: object

Or:

import numpy as np
import pandas as pd

data = {
    "calories": [np.nan, 380, 390],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data)
print(df.dtypes)
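
If the data already arrives with "NA" strings and cannot be rewritten by hand, one possible approach (not part of the original answer) is to convert the column after the fact with pd.to_numeric. This is a minimal sketch that assumes every non-numeric entry in the column should become NaN:

```python
import pandas as pd

data = {
    "calories": ["NA", 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data)

# errors="coerce" replaces every value that cannot be parsed as a number with NaN
df["calories"] = pd.to_numeric(df["calories"], errors="coerce")

# calories is now float64, duration stays int64
print(df.dtypes)
```

Note that errors="coerce" silently turns any unparseable value into NaN, so it may hide genuine data-entry mistakes.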

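Note that if you load the data with read_csv, pandas recognizes common missing-value markers and converts them to NaN automatically. By default these include '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan', '1.#IND', '1.#QNAN', '<NA>', 'N/A', 'NA', 'NULL', 'NaN', 'n/a', 'nan', 'null' (the exact list can vary with the pandas version).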
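
The read_csv behaviour can be checked with a small in-memory file. This is only an illustrative sketch: the CSV text below is made up to mirror the question's data, and io.StringIO stands in for a real file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content mirroring the question's data; "NA" marks a missing value
csv_text = "calories,duration\nNA,50\n380,40\n390,45\n"

# read_csv treats "NA" as missing by default, so the column is parsed as numeric
df = pd.read_csv(io.StringIO(csv_text))

# calories is float64 (NaN requires a float column), duration is int64
print(df.dtypes)
```

If a file uses a different marker for missing values, the na_values parameter of read_csv accepts additional strings to treat as NaN.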
