I'm trying to split the current line into 3 chunks.
The title column contains a comma, which is also the delimiter:
1,"Rink, The (1916)",Comedy
My current code does not work, because it also splits on the comma inside the title:
id, title, genres = line.split(',')
Expected result:
id = 1
title = 'Rink, The (1916)'
genres = 'Comedy'
Any thoughts on how to split it properly?
>Solution :
Ideally, you should use a proper CSV parser and specify the double quote as the quote character, so that commas inside quoted fields are not treated as delimiters. If you must work with the current string as the starting point, here is a regex trick which should work:
import re

inp = '1,"Rink, The (1916)",Comedy'
# Try a double-quoted term first; otherwise take a run of non-comma characters.
parts = re.findall(r'".*?"|[^,]+', inp)
print(parts)  # ['1', '"Rink, The (1916)"', 'Comedy']
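The regex pattern works by first trying to match a term in double quotes ("..."). Failing that, it falls back to matching a plain CSV term, defined as a sequence of non-comma characters (leading up to the next comma or the end of the line). Note that a quoted term keeps its surrounding double quotes in the result, so they still need to be stripped to get the expected title.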
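For comparison, here is a minimal sketch of both routes producing the three values from the question. It assumes Python's standard csv module is acceptable and that each line holds exactly three fields. The names line and movie_id are illustrative choices (movie_id is used instead of id only to avoid shadowing the built-in function).

```python
import csv
import re

line = '1,"Rink, The (1916)",Comedy'

# Route 1: a real CSV parser. csv.reader accepts any iterable of strings,
# so a single line can be wrapped in a list. quotechar='"' is already the
# default and is spelled out here only for clarity.
row = next(csv.reader([line], delimiter=',', quotechar='"'))
movie_id, title, genres = row
print(movie_id, title, genres, sep=' | ')  # 1 | Rink, The (1916) | Comedy

# Route 2: the regex from the answer, with the surrounding quotes stripped
# from each matched term.
parts = [p.strip('"') for p in re.findall(r'".*?"|[^,]+', line)]
print(parts)  # ['1', 'Rink, The (1916)', 'Comedy']
```

The two routes agree on this input, but they differ on edge cases: the regex skips empty fields entirely (for '1,,Comedy' it returns only two terms), while csv.reader keeps them as empty strings, so the parser is the safer choice for real data.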