Convert a string list of unquoted string entries to a list in Python

I have a bizarre problem where I am reading a CSV file with entries like so:

4,[the mentalist, dodgeball, meet the fockers]
5,[godfather, the heat, friends]
...

I read this in Python using the csv module, and normally I'd do:

import ast
x=ast.literal_eval(row[1])

However, this fails because the list entries are not quoted, so they are not valid Python literals.
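A minimal sketch of the failure, assuming the second column arrives as the plain string shown in the sample rows (the variable names here are only illustrative):

```python
import ast

# The second column exactly as it appears in the sample rows.
cell = "[the mentalist, dodgeball, meet the fockers]"

try:
    ast.literal_eval(cell)
    error_name = None
except (ValueError, SyntaxError) as exc:
    # Unquoted words are not Python literals, so the parse is rejected.
    error_name = type(exc).__name__

# With quoted entries the same call succeeds.
quoted = ast.literal_eval("['the mentalist', 'dodgeball', 'meet the fockers']")
print(error_name, quoted)
```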

How do I get around this problem? 🙁

Solution:

This "format" is awkward to parse. For example, if the name of a movie contains a comma, there is no reliable way to tell where one entry ends and the next begins.

Your best bet is to fix it at the source (i.e. change how the file is generated).
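As a rough sketch of what fixing it at the source could look like, assuming you control the code that writes the file: encode each list as JSON and let the csv module handle the quoting. The in-memory buffer stands in for a real file, and one title is changed to contain a comma purely for illustration.

```python
import csv
import io
import json

# Sample data; "monsters, inc." is a made-up title containing a comma.
rows = [
    (4, ["the mentalist", "dodgeball", "meet the fockers"]),
    (5, ["godfather", "the heat", "monsters, inc."]),
]

# Write: encode each list as JSON; csv.writer quotes the field as needed.
buffer = io.StringIO()  # stands in for a real file opened with newline=""
writer = csv.writer(buffer)
writer.writerow(["col1", "col2"])
for key, titles in rows:
    writer.writerow([key, json.dumps(titles)])

# Read: csv.reader undoes the quoting, json.loads restores the list.
buffer.seek(0)
reader = csv.reader(buffer)
header = next(reader)
parsed = [(int(key), json.loads(titles)) for key, titles in reader]
print(parsed)
```

Written this way, a comma inside a title survives the round trip because it sits inside a quoted JSON string rather than acting as a separator.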

If you cannot fix how the file is generated, you can try:

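with open("data.csv", "r") as f_in:
    for line in map(str.strip, f_in):
        # Skip blank lines.
        if not line:
            continue
        # Split only on the first comma: key on the left, the rest on the right.
        row = line.split(",", maxsplit=1)
        # Turn "[a, b, c]" into ["a", "b", "c"]; the header row is left as-is.
        if len(row) == 2 and "[" in row[1]:
            row[1] = row[1].strip("[]").split(", ")
        print(row)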

Prints:

['col1', 'col2']
['4', ['the mentalist', 'dodgeball', 'meet the fockers']]
['5', ['godfather', 'the heat', 'friends']]

The data.csv contains:

col1,col2
4,[the mentalist, dodgeball, meet the fockers]
5,[godfather, the heat, friends]
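The same idea can be wrapped in a small function, which also makes the comma limitation easy to see. This is only a sketch: `parse_line` is a hypothetical helper name, and the third input line is invented to show a title that contains a comma.

```python
def parse_line(line):
    """Split 'key,[a, b, c]' into [key, [a, b, c]] (hypothetical helper)."""
    # Everything before the first comma is the key.
    key, _, rest = line.strip().partition(",")
    rest = rest.strip()
    if rest.startswith("[") and rest.endswith("]"):
        inner = rest[1:-1].strip()
        # Split on commas and trim whitespace around each entry.
        items = [item.strip() for item in inner.split(",")] if inner else []
        return [key, items]
    # Not a bracketed list (e.g. the header row): return it unchanged.
    return [key, rest]


print(parse_line("4,[the mentalist, dodgeball, meet the fockers]"))
print(parse_line("5,[godfather,the heat , friends]"))
# A title with a comma is split into two entries; the format cannot express it.
print(parse_line("6,[monsters, inc., up]"))
```

Trimming each entry makes the split tolerant of uneven spacing after the commas, but nothing at this level can recover a title that itself contains a comma.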
