Issues reading a CSV line by line in Python

Edit: using utf-16 seems to get me closer, but I have CSV values that include commas, such as "one example value is a description, which is long and can include commas, and quotes".

So with my current code:

filepath="csv_input/frups.csv"

rows = []
with open(filepath, encoding='utf-16') as f:
    for line in f:
        print('line=',line)
        formatted_line=line.strip().split(",")
        print('formatted_line=',formatted_line)
        rows.append(formatted_line)
        print('')

Lines get formatted incorrectly:


line= "FRUPS"   "11111112"        "Paahou 11111112, 11111112,11111112"    "Bar, Achal"      "Iagress"   "Unassigned"    "Normal"        "GaWu , Suaair center will not be able to repair 3 couch part 11111112, 11111112,11111112 . Pleasa to repair .

formatted_line= ['"FRUPS"\t"11111112"\t"Parts not able to repair in Suzhou 11111112', ' 11111112', '11111112"\t"Baaaaaar', ' Acaaaal"\t"In Progress"\t"Unassigned"\t"Normal"\t"Got coaow Wu ', ' Suar cat 11111112', ' 11111112', '11111112. Pleasa to repair .']

line= 11111112

formatted_line= ['11111112']

In this example the fields appear to be separated by long spaces, so splitting on commas does not reliably break each line into its fields.


I am trying to read a csv line by line in python but each solution leads to a different error.

  1. Using pandas:
filepath="csv_input/frups.csv"
data = pd.read_csv(filepath, encoding='utf-16')
for thing in data:
    print(thing)
    print('')

read_csv fails with the error: Error tokenizing data. C error: Expected 7 fields in line 16, saw 8

  2. Using csv_reader:
# open file in read mode
with open(filepath, 'r') as read_obj:
    # pass the file object to reader() to get the reader object
    csv_reader = reader(read_obj)
    # Iterate over each row in the csv using reader object
    for row in csv_reader:
        # row variable is a list that represents a row in csv
        print(row)

Fails at the line "for row in csv_reader:" with the error: line contains NUL

I’ve tried to figure out what these NUL characters are, but investigating with code leads to different errors:

data = open(filepath, 'rb').read()
print(data.find('\x00'))

error: argument should be integer or bytes-like object, not 'str'

  3. Another read attempt, trying to strip certain characters:

with open(filepath,'rb') as f:
    contents = f.read()
contents = contents.rstrip("\n").decode("utf-16")
contents = contents.split("\r\n")

error: TypeError: a bytes-like object is required, not 'str'

It seems my CSV contains some unusual characters that cause Python to error out. I can open and view the file just fine in Excel. How can I read it line by line, so that I get something like:

row[0]=['col1','col2','col3']
row[1]=['val1','val2','val3']
etc...

Solution:

The output you show for line and formatted_line hints that:

  • your file is utf-16 encoded
  • it uses tabs (\t) as delimiters
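
Both hints can be checked against the raw bytes of the file. The sketch below is only an illustration, not part of the original answer: describe_raw_bytes is a hypothetical helper, and the sample bytes are made up for the demonstration. It looks for a UTF-16 byte order mark, counts NUL bytes (the likely source of the "line contains NUL" error when the file is opened without an encoding), and counts tab bytes.

```python
import codecs

def describe_raw_bytes(raw: bytes) -> dict:
    """Report a UTF-16 BOM, NUL bytes and tabs in raw file content (hypothetical helper)."""
    has_bom = raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
    return {
        "utf16_bom": has_bom,
        # bytes methods need a bytes argument: b"\x00", not "\x00"
        "nul_bytes": raw.count(b"\x00"),
        "tabs": raw.count(b"\t"),
    }

# Made-up sample: two quoted, tab-separated fields encoded as UTF-16 (with BOM)
sample = '"FRUPS"\t"Bar, Achal"\n'.encode("utf-16")
result = describe_raw_bytes(sample)
print(result)
```

For a real file, the same helper could be given the first few kilobytes read in binary mode, for example open(filepath, 'rb').read(2048).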

So you should use:

  1. with the csv module:

     from csv import reader

     # open the file in text mode; newline='' lets csv handle line breaks inside quoted fields
     with open(filepath, 'r', encoding='utf-16', newline='') as read_obj:
         # the file is tab-delimited, so pass delimiter='\t' to reader()
         csv_reader = reader(read_obj, delimiter='\t')
         # iterate over each row using the reader object
         for row in csv_reader:
             # row is a list of strings, one per column
             print(row)
    
  2. with Pandas:

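     import pandas as pd

     data = pd.read_csv(filepath, encoding='utf-16', sep='\t')
     # iterating a DataFrame directly yields column names, so loop over the rows instead
     for row in data.itertuples(index=False):
         print(list(row))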
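
As a self-contained check of the csv module approach, the following sketch writes a small UTF-16, tab-delimited file and reads it back. The sample rows, the temporary file, and the read_rows helper are all made up for illustration; they are assumptions, not the asker's data. One field contains commas and another contains a line break, which are the two cases that broke the split(",") approach.

```python
import csv
import os
import tempfile

def read_rows(path, encoding="utf-16", delimiter="\t"):
    """Return every record of a delimited text file as a list of strings."""
    # newline="" lets the csv module handle line breaks inside quoted fields
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f, delimiter=delimiter))

# Made-up sample data: one field contains commas, another a line break
sample_rows = [
    ["col1", "col2", "col3"],
    ["FRUPS", "Bar, Achal", "first line\nsecond line"],
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    # Write the sample as UTF-16, tab-delimited text
    with open(path, "w", newline="", encoding="utf-16") as f:
        csv.writer(f, delimiter="\t").writerows(sample_rows)
    rows = read_rows(path)

print(rows[0])
print(rows[1])
```

Because the parsing is left to csv.reader, commas inside a field are kept as ordinary text and a quoted field that spans two physical lines still comes back as one value.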
    