I have tried many options, but I cannot retain the quotes present in the input file when writing the output file.
Reproducible code:
from io import StringIO

import pandas as pd

# Input file
csv_data = '''A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999
'''
# Load the CSV data into a DataFrame, then write it back out
df = pd.read_csv(StringIO(csv_data), header=0, dtype=str, keep_default_na=False, engine='python', sep=',')
df.to_csv('output.txt', sep=',', index=False, header=True)
The resulting output.txt looks like this (the quotes around C99 are gone):
A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,C99,D,9999
Expected output:
A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999
I just don’t want to lose anything present in my input data while saving (including the quotes).
>Solution :
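Pass quoting=3 (csv.QUOTE_NONE) to both pd.read_csv and df.to_csv. With QUOTE_NONE the parser treats quote characters as ordinary data, so they stay in the values, and the writer emits the values without adding or doubling quotes. This relies on no field containing the separator inside quotes, which holds for the sample data: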
# Load CSV data into dataframes
df = pd.read_csv(StringIO(csv_data),
                 header=0,
                 dtype=str,
                 keep_default_na=False,
                 engine='python',
                 sep=',',
                 quoting=3)
print(df)
     A    B       C  D     E
0  234  mno     C22  U
1  567  pqr  "C3"""  U  5555
2  999  abc   "C99"  D  9999
print(df.to_csv(sep=',', index=False, header=True, quoting=3))
A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999
df.to_csv('output.txt', sep=',', index=False, header=True, quoting=3)
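To see the difference between the default quoting mode and QUOTE_NONE side by side, here is a minimal sketch that round-trips the sample data through both. The helper name round_trip is made up for this illustration and is not part of pandas; it also uses the named constants from the csv module instead of the literal 3, and it assumes no field contains the separator.

```python
import csv
from io import StringIO

import pandas as pd

csv_data = '''A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999
'''


def round_trip(text, quoting):
    """Hypothetical helper: parse CSV text and serialize it again
    using the same quoting mode for reading and writing."""
    df = pd.read_csv(StringIO(text), dtype=str, keep_default_na=False,
                     quoting=quoting)
    return df.to_csv(index=False, quoting=quoting)


# Default mode: quotes are interpreted on read and re-added only where needed,
# so the quotes around C99 do not survive.
default_out = round_trip(csv_data, csv.QUOTE_MINIMAL)

# QUOTE_NONE: quote characters are kept as literal data in both directions.
literal_out = round_trip(csv_data, csv.QUOTE_NONE)

# Compare line by line so the platform's line terminator does not matter.
print(default_out.splitlines() == csv_data.splitlines())
print(literal_out.splitlines() == csv_data.splitlines())
```

If a value could contain the separator, QUOTE_NONE is probably not a safe choice: the reader would split that value into two fields, and the writer would likely need an escapechar to serialize it.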