\ufeff is appearing while reading csv using unicodecsv module

I have the following code:

import unicodecsv
CSV_PARAMS = dict(delimiter=",", quotechar='"', lineterminator='\n')
unireader = unicodecsv.reader(open('sample.csv', 'rb'), **CSV_PARAMS)
for line in unireader:
    print(line)

and it prints

['\ufeff"003', 'word one"']
['003,word two']
['003,word three']

The CSV looks like this

"003,word one"
"003,word two"
"003,word three"

I am unable to figure out why the first row has \ufeff (which I believe is a file marker). Moreover, there is a " at the beginning of the first row.

The CSV file comes from a client, so I can't dictate how they save it. I am looking to fix my code so that it can handle the encoding.

Note: I have already tried passing encoding='utf8' to CSV_PARAMS and it didn't solve the problem.

Solution:

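encoding='utf-8-sig' will remove the UTF-8-encoded BOM (byte order mark) used as a UTF-8 signature in some files. With plain 'utf8' the BOM is decoded as the character \ufeff in front of the opening quote, so the reader no longer treats that quote as the start of a quoted field; that is why the first row is split at the comma and keeps its quotes. Reading with 'utf-8-sig' avoids both problems: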

import unicodecsv

with open('sample.csv','rb') as f:
    r = unicodecsv.reader(f, encoding='utf-8-sig')
    for line in r:
        print(line)

Output:

['003,word one']
['003,word two']
['003,word three']
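
To see the effect without the client's file, here is a minimal sketch that uses the built-in csv module on in-memory bytes. The sample bytes and the parse helper are made up for illustration and only mirror the file from the question; they are not part of unicodecsv.

```python
import csv
import io

# Hypothetical stand-in for the client's file:
# a UTF-8 BOM (EF BB BF) followed by two quoted rows.
raw = b'\xef\xbb\xbf"003,word one"\n"003,word two"\n'

def parse(data, encoding):
    # Decode the bytes with the given codec, then let csv split the rows.
    text = data.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline='')))

with_bom = parse(raw, 'utf-8')         # BOM survives as '\ufeff'
without_bom = parse(raw, 'utf-8-sig')  # BOM is stripped while decoding

print(with_bom)
print(without_bom)
```

With plain 'utf-8' the first row comes back split at the comma with the \ufeff and the quotes still attached, as in the question; with 'utf-8-sig' every row is a single clean field.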

But why are you using the third-party unicodecsv with Python 3? The built-in csv module handles Unicode correctly:

import csv

# Note, newline='' is a documented requirement for the csv module
# for reading and writing CSV files.
with open('sample.csv', encoding='utf-8-sig', newline='') as f:
    r = csv.reader(f)
    for line in r:
        print(line)
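
Since the files come from a client and may or may not start with a BOM, it may help to confirm that 'utf-8-sig' is safe in both cases. The sketch below assumes the files are otherwise valid UTF-8; read_rows and the sample bytes are hypothetical names and data used only for this illustration.

```python
import csv
import io

def read_rows(data):
    # 'utf-8-sig' skips a leading BOM if one is present and otherwise
    # decodes exactly like plain 'utf-8'.
    text = io.TextIOWrapper(io.BytesIO(data), encoding='utf-8-sig', newline='')
    return list(csv.reader(text))

rows_bom = read_rows(b'\xef\xbb\xbf"003,word one"\n')  # saved with a BOM
rows_plain = read_rows(b'"003,word one"\n')            # saved without a BOM

print(rows_bom)
print(rows_plain)
```

Both calls yield the same rows, so the same reading code can be used regardless of how the client saved the file.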