Here is a minimal example.
# create df
import pandas as pd
df2 = pd.DataFrame({'var1': ['1_0','1_0','1_0','1_0','1_0'],
                    'var2': ['X','y','a','a','a']})
df2.to_json('df2.json')
# import df
df2 = pd.read_json('df2.json')
df2
The code above would be expected to generate:
var1 var2
0 1_0 X
1 1_0 y
2 1_0 a
3 1_0 a
4 1_0 a
However, it generates:
var1 var2
0 10 X
1 10 y
2 10 a
3 10 a
4 10 a
If, in the initial code, I modify an entry of ['var1'], then the DataFrame produced on import is correct. Here is an example to illustrate this.
df2 = pd.DataFrame({'var1': ['1_0','hello','1_0','1_0','1_0'],
                    'var2': ['X','y','a','a','a']})
df2.to_json('df2.json')
# import df
df2 = pd.read_json('df2.json')
df2
var1 var2
0 1_0 X
1 hello y
2 1_0 a
3 1_0 a
4 1_0 a
The same problem is observed if the file is saved in CSV format and then imported.
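For the CSV case, forcing the column dtype on import works the same way. A minimal sketch (the file name df2.csv is just for illustration):

```python
import pandas as pd

df2 = pd.DataFrame({'var1': ['1_0', '1_0'], 'var2': ['X', 'y']})
df2.to_csv('df2.csv', index=False)

# Forcing the column to str on import preserves the underscores
df2 = pd.read_csv('df2.csv', dtype={'var1': str})
print(df2['var1'].tolist())  # ['1_0', '1_0']
```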
Has anyone encountered the same issue?
>Solution :
This is because underscores are valid digit separators in Python (often used as a thousands separator: 1_000 is 1000), so '1_0' is parsed as the number 10 on import. You can force the dtype upon import (or disable dtype inference with dtype=False):
df2 = pd.read_json('df2.json', dtype='str')
If you want to keep dtype detection for the other columns:
df2 = pd.read_json('df2.json', dtype={'var1': 'str'})
Output:
var1 var2
0 1_0 X
1 1_0 y
2 1_0 a
3 1_0 a
4 1_0 a
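The dtype=False variant mentioned above can be sketched similarly (demo.json is a hypothetical file name): with inference disabled, the JSON strings pass through untouched.

```python
import pandas as pd

pd.DataFrame({'var1': ['1_0', '1_0']}).to_json('demo.json')

# dtype=False disables dtype inference for the data entirely,
# so the strings are returned as-is instead of being parsed as numbers
df = pd.read_json('demo.json', dtype=False)
print(df['var1'].tolist())
```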
When the JSON contains a string that clearly is not a number (like 'hello'), there is no ambiguity, so the conversion is not attempted on that column.
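The underscore rule is plain Python behavior (PEP 515, since Python 3.6), which you can check directly:

```python
# Underscores are legal digit separators, both in numeric
# literals and in str-to-number conversions
print(1_000)         # 1000
print(int('1_0'))    # 10
print(float('1_0'))  # 10.0
```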