Python provides different dtypes for same column

I will try to provide a minimal example soon, but in the meantime: how is it possible that the column "Home Points" is of type object and int64 simultaneously? Any hints? Is this a pandas bug?

>>> print(df[["Home Team", "Away Team", "Home Points", "Away Points"]].dtypes)
>>> print()
>>> print(df["Home Points"].describe())
>>> print()
>>> df["Home Points"].unique()

Home Team      object
Away Team      object
Home Points    object
Away Points    object
dtype: object

count     8754
unique       3
top          3
freq      3801
Name: Home Points, dtype: int64

array([3, 1, 0], dtype=object)

>Solution :

It is not. The first output comes from .dtypes, which reports the dtype of each column in the dataframe, whereas in the output of

df['Home Points'].describe()

you are looking at the dtype of the object that method returns, which per its documentation:

Returns
Series or DataFrame
Summary statistics of the Series or Dataframe provided.

The dtype: int64 line at the bottom of that output therefore describes the summary Series returned by describe(), not the source column. That summary holds count, unique, top and freq, which are all integers here, hence int64. For Python it is a completely different object; it just happens to carry the same name as the column in the original dataframe.
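A minimal sketch of the distinction, using a made-up six-row column rather than the asker's data (the values and the variable names column, summary and numeric are assumptions for illustration). It shows that describe() hands back a separate Series that only shares the column's name. The dtype of that summary can differ between pandas versions, so the sketch does not rely on it. The last step is an optional extra, not part of the answer above: if integer points are what you want, pd.to_numeric is one way to convert the column.

```python
import pandas as pd

# Toy stand-in for the real data: integer values stored in an object column.
df = pd.DataFrame({"Home Points": pd.Series([3, 1, 0, 3, 3, 1], dtype=object)})

column = df["Home Points"]
summary = column.describe()  # a new Series holding count/unique/top/freq

print(column.dtype)            # dtype of the source column
print(summary.name)            # same name as the column it summarises
print(summary.index.tolist())  # the summary's own row labels
print(summary is column)       # the summary is a different object

# Optional: convert the object column to a numeric dtype.
numeric = pd.to_numeric(column)
print(numeric.dtype)
```

The name shown in the last line of the describe() output is inherited from the column, which is why the two dtypes look as if they belong to the same thing.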
