
Convert a string list of unquoted string entries to a list in Python

I have a bizarre problem where I am reading a CSV file with entries like so:

4,[the mentalist, dodgeball, meet the fockers]
5,[godfather, the heat, friends]
...

I read this in Python using the csv module, and normally I'd do:

import ast
x=ast.literal_eval(row[1])

However, this obviously fails because the list entries are not quoted.

How do I get around this problem? 🙁

>Solution :

This "format" is really awkward to parse. For example, if the name of a movie contains a comma, there is no reliable way to tell where one entry ends and the next begins.
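To make the ambiguity concrete, here is a small sketch with a made-up row (the title below is a hypothetical example, not taken from the file) in which one title itself contains a comma:

```python
# Hypothetical cell: two intended titles, the first one containing a comma.
cell = "[me, myself & irene, dodgeball]"

# Naive split on ", " cannot tell a separator from a comma inside a title.
titles = cell.strip("[]").split(", ")
print(titles)  # three entries come back instead of the intended two
```

Nothing in the text marks which comma is part of a title, so no parser can recover the original two entries from this line alone.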

Your best bet is to fix it in the source (how the file is generated).
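If you do control the code that writes the file, one possible fix (an assumption on my part, since the generating code is not shown) is to JSON-encode the list before writing it, so the csv module quotes it as a single cell. The variable names and the in-memory buffer below are placeholders for illustration:

```python
import csv
import io
import json

rows = [
    (4, ["the mentalist", "dodgeball", "meet the fockers"]),
    (5, ["godfather", "the heat", "friends"]),
]

# Writer side: JSON-encode each list so it becomes one unambiguous cell.
buffer = io.StringIO(newline="")  # stands in for open("data.csv", "w", newline="")
writer = csv.writer(buffer)
for movie_id, titles in rows:
    writer.writerow([movie_id, json.dumps(titles)])

# Reader side: csv undoes the quoting, json.loads restores the list.
buffer.seek(0)
parsed = [(int(r[0]), json.loads(r[1])) for r in csv.reader(buffer)]
print(parsed)
```

With this layout, titles containing commas or quotes survive the round trip, because both the csv and json modules escape them.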

If you cannot fix how the file is generated, you can try:

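with open("data.csv", "r") as f_in:
    for line in map(str.strip, f_in):
        if not line:
            continue  # skip blank lines
        # split only on the first comma: key first, then the rest of the line
        row = line.split(",", maxsplit=1)
        # turn "[a, b, c]" into ['a', 'b', 'c']; other rows (e.g. the header) stay as-is
        if len(row) > 1 and "[" in row[1]:
            row[1] = row[1].strip("[]").split(", ")
        print(row)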

Prints:

['col1', 'col2']
['4', ['the mentalist', 'dodgeball', 'meet the fockers']]
['5', ['godfather', 'the heat', 'friends']]

The data.csv contains:

col1,col2
4,[the mentalist, dodgeball, meet the fockers]
5,[godfather, the heat, friends]
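Since the question mentions the csv module, the same idea can also be sketched on top of csv.reader. The reader splits the bracketed part on every comma, so the fragments are rejoined first. The helper name parse_row and the in-memory sample are assumptions for illustration, and the limitation about commas inside titles still applies:

```python
import csv
import io

def parse_row(row):
    """Rejoin the fragments csv.reader produced, then split the bracketed list."""
    key, rest = row[0], ",".join(row[1:]).strip()
    if rest.startswith("[") and rest.endswith("]"):
        inner = rest[1:-1].strip()
        items = [item.strip() for item in inner.split(",")] if inner else []
        return [key, items]
    return [key, rest]  # e.g. the header row, left unchanged

# Stands in for open("data.csv", newline="").
sample = (
    "col1,col2\n"
    "4,[the mentalist, dodgeball, meet the fockers]\n"
    "5,[godfather, the heat, friends]\n"
)
parsed = [parse_row(r) for r in csv.reader(io.StringIO(sample)) if r]
for row in parsed:
    print(row)
```

Stripping each item, rather than splitting on ", ", also tolerates entries separated by a comma without a following space.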