pandas astype doesn't work as expected (fails silently and badly)

I’ve encountered some strange behavior with pandas .astype() (I’m using version 1.5.2). When I cast a column to integer and then check dtypes, everything looks fine. But when I extract the values by row, the types are inconsistent: the integer column comes back as float.

Code:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(3, 3))
df.loc[:, 0] = df.loc[:, 0].astype(int)

print(df)
print(df.dtypes)
print(df.iloc[0, :])
print(type(df.values[0, 0]))

Out:

   0         1         2
0  0 -0.232432  1.025643
1 -1  0.556968 -0.729378
2 -1  1.285546 -0.541676
0      int64
1    float64
2    float64
dtype: object
0    0.000000
1   -0.232432
2    1.025643
Name: 0, dtype: float64
<class 'numpy.float64'>

Any idea what I’m doing wrong here?

I also tried assigning without loc:

df[0] = df[0].astype(int)

That didn’t work either.

Solution:

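I think the cast itself worked: df.dtypes shows that column 0 really is int64. The floats appear because of how the values are extracted. df.values returns a single NumPy array for the whole DataFrame, and an array can hold only one dtype, so the int64 column is upcast to float64 to match the others. The same happens with df.iloc[0, :], since a row spanning columns of different dtypes has to be returned as one Series with a single dtype. As per the docs for df.values: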

By default, the dtype of the returned array will be the common NumPy
dtype of all types in the DataFrame.

>>> from pandas.core.dtypes.cast import find_common_type
>>> find_common_type(df.dtypes.to_list()) # df is your dataframe
dtype('float64')
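
If the goal is to read values back with their per-column types, one option is to avoid building a mixed-dtype row or array at all. The following is only a minimal sketch under a few assumptions: it uses astype("int64") instead of astype(int) so the result is the same on every platform, it assigns with df[0] = ... rather than loc, and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 3))
# Replace column 0 with an integer version of itself.
df[0] = df[0].astype("int64")

# Row-wise and whole-frame access must pick one dtype, so int64 is upcast.
row_dtype = df.iloc[0, :].dtype        # float64
frame_dtype = df.to_numpy().dtype      # float64

# Public NumPy equivalent of the common-dtype rule quoted above.
common = np.result_type(*df.dtypes)    # float64

# Scalar access goes through the column, so the integer type survives.
scalar = df.at[0, 0]                   # numpy.int64
same_scalar = df.iloc[0, 0]            # numpy.int64

# Column-wise access keeps the column's own dtype.
col_dtype = df[0].to_numpy().dtype     # int64

# itertuples reads each column separately, so every field keeps its type
# (as plain Python int / float objects).
first_row = next(df.itertuples(index=False))

print(row_dtype, frame_dtype, common)
print(type(scalar), col_dtype)
print(type(first_row[0]), type(first_row[1]))
```

So for row-by-row processing, itertuples() or scalar lookups with .at / .iloc[i, j] keep the integer type, while .values, .to_numpy() and .iloc[i, :] on a mixed-dtype frame give float64.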