Pandas 2.0.3? Problems keeping format when file is saved in .json or .csv format

Here is a minimal example that reproduces the problem.

# create df
import pandas as pd
df2 = pd.DataFrame({'var1':['1_0','1_0','1_0','1_0','1_0'],
                   'var2':['X','y','a','a','a']})
df2.to_json('df2.json')

# import df
df2 = pd.read_json('df2.json')
df2

The code above would be expected to generate:

    var1    var2
0   1_0      X
1   1_0      y
2   1_0      a
3   1_0      a
4   1_0      a

However, it generates:

    var1    var2
0   10      X
1   10      y
2   10      a
3   10      a
4   10      a

If I change one entry of 'var1' in the initial code to a non-numeric string, the DataFrame is imported correctly.

Here is an example to illustrate it.

df2 = pd.DataFrame({'var1':['1_0','hello','1_0','1_0','1_0'],
                   'var2':['X','y','a','a','a']})
df2.to_json('df2.json')

# import df
df2 = pd.read_json('df2.json')
df2


    var1    var2
0   1_0     X
1   hello   y
2   1_0     a
3   1_0     a
4   1_0     a

The same problem is observed if the file is saved in CSV format and then imported.

Has anyone encountered the same issue?

>Solution :

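This happens because underscores are valid digit separators in Python numbers (often used as a thousands separator: 1_000 is 1000), so when every value in the column looks like a number, dtype inference on import turns '1_0' into 10. You could force the dtype upon import (or disable inference with dtype=False):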

df2 = pd.read_json('df2.json', dtype='str')

If you want to keep dtype detection for the other columns:

df2 = pd.read_json('df2.json', dtype={'var1': 'str'})

Output:

  var1 var2
0  1_0    X
1  1_0    y
2  1_0    a
3  1_0    a
4  1_0    a
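As a quick check, both options mentioned above (a per-column dtype and dtype=False) can be compared without touching the disk. This is only a sketch: it serialises to a string and reads it back through io.StringIO instead of the 'df2.json' file from the question, and the variable names are illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({'var1': ['1_0'] * 5,
                   'var2': ['X', 'y', 'a', 'a', 'a']})

# Serialise to a JSON string rather than a file (in-memory stand-in for df2.json).
json_text = df.to_json()

# Option 1: pin only var1 to str; other columns still go through dtype inference.
pinned = pd.read_json(io.StringIO(json_text), dtype={'var1': 'str'})

# Option 2: switch dtype inference off for every column.
raw = pd.read_json(io.StringIO(json_text), dtype=False)

print(pinned['var1'].tolist())
print(raw['var1'].tolist())
```

Pinning a single column is usually the safer choice when other columns in the file should still be converted to numbers automatically.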

When the column contains a string that cannot be parsed as a number (such as 'hello'), there is no ambiguity: the values are clearly not all numbers, so the conversion is not done.
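The underscore rule and the all-or-nothing behaviour can be reproduced in plain Python. The helper try_numeric below is hypothetical and only mimics the idea of "convert the whole column if every value parses, otherwise leave it alone"; it is not pandas' actual implementation.

```python
# Python's own number parsing accepts underscores between digits.
print(int('1_0'))     # 10
print(float('1_0'))   # 10.0


def try_numeric(values):
    """Hypothetical helper: return floats if every value parses, else the input unchanged."""
    try:
        return [float(v) for v in values]
    except ValueError:
        # At least one value is not numeric, so keep the original strings.
        return list(values)


all_numeric_like = try_numeric(['1_0', '1_0', '1_0'])
mixed = try_numeric(['1_0', 'hello', '1_0'])

print(all_numeric_like)  # [10.0, 10.0, 10.0]
print(mixed)             # ['1_0', 'hello', '1_0']
```

This matches what the question observed: a column made up only of '1_0' is silently converted, while a single 'hello' keeps the whole column as strings.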
