
pandas read_json dtype=pd.CategoricalDtype does not work but dtype='category' does

Is it a known issue that passing pd.CategoricalDtype as the dtype in read_json does not convert the column, or is there a mistake in my code?

import pandas as pd

df = pd.read_json(
    "./data/data.json",
    dtype={
        #"facility": pd.CategoricalDtype, # does not work
        "facility": 'category',           # does work
        "supplier": pd.CategoricalDtype,  # does not work
    }
)
df.info()
-----
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   facility      232 non-null    category      
 3   supplier      111 non-null    object     

Environment

MacOS 13.0.1 (22A400)
$ python --version
Python 3.9.13
$ pip list | grep pandas
pandas                      1.5.2

Solution:

According to the documentation:

Since dtype='category' is essentially CategoricalDtype(None, False), and since all instances of CategoricalDtype compare equal to 'category', all instances of CategoricalDtype compare equal to a CategoricalDtype(None, False), regardless of categories or ordered.
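As a small illustration of the string comparison described in that passage, the sketch below checks two CategoricalDtype instances against 'category'. The variable names and the example categories are made up for illustration. It also shows that the bare class pd.CategoricalDtype (no parentheses), which is what the question passes, is a type and not an instance of the dtype.

```python
import pandas as pd

# Two instances: the default one and one with explicit (hypothetical) categories.
default = pd.CategoricalDtype()  # categories=None, ordered=False
custom = pd.CategoricalDtype(categories=["a", "b"], ordered=True)

# Instances compare equal to the string alias 'category'.
default_matches = default == "category"
custom_matches = custom == "category"

# The class itself is a type object, not an instance of the dtype.
class_is_instance = isinstance(pd.CategoricalDtype, pd.CategoricalDtype)

print(default_matches, custom_matches, class_is_instance)
```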

Try passing an instance of CategoricalDtype rather than the class itself:

"supplier": pd.CategoricalDtype()
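A minimal, self-contained sketch of that suggestion follows. The inline JSON string is hypothetical data standing in for ./data/data.json, and the loop at the end is an extra safeguard that is not part of the original answer: it converts a column after loading if it did not come back as categorical.

```python
import io
import pandas as pd

# Hypothetical inline data standing in for ./data/data.json.
raw = (
    '[{"facility": "A", "supplier": "x"},'
    ' {"facility": "B", "supplier": "y"},'
    ' {"facility": "A", "supplier": "x"}]'
)

df = pd.read_json(
    io.StringIO(raw),
    dtype={
        "facility": "category",             # string alias
        "supplier": pd.CategoricalDtype(),  # instance: note the parentheses
    },
)

# Safeguard (an assumption, not from the answer): convert after loading
# if a column is still not categorical.
for col in ("facility", "supplier"):
    if not isinstance(df[col].dtype, pd.CategoricalDtype):
        df[col] = df[col].astype("category")

print(df.dtypes)
```

Checking df.dtypes (or df.info()) after loading is a quick way to confirm whether the conversion took effect in your pandas version.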