I’m trying to split a randomly generated string of letters, commas, periods, and spaces at the commas and periods, but so far I’ve only figured out how to split it at the commas, using this code:
import ast
import re

with open('book.txt', 'r') as file_object:
    for line in file_object:
        word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))
Example string: s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar.mhvsflinmk,dvoxuce,vb,f.cfb
The end goal is to split it into a list such as ['s', 'wgzggarhz hbmk', 'q', 'af mnttxvixkcxwheysijneupvkcmmnar', 'mhvsflinmk', 'dvoxuce', 'vb', 'f', 'cfb']
I’m new to regular expressions, so I don’t know whether there’s a better way to do this. This is the error it returns:
Traceback (most recent call last):
  File "main.py", line 32, in <module>
    word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 59, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 47, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    'bazmhffkibauiaexggdoqrvxzkjhqzwammyizcybqba'.'qkmhwbvm' 'cdioyazkwbg' .'bdrsujlrkfxaen'
    ^
SyntaxError: invalid syntax
I’m using Replit as my IDE.
>Solution :
Wrapping words in quotes and then evaluating them again is overkill.
You could use re.split() instead:
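import re

with open('book.txt', 'r') as file_object:
    for line in file_object:
        # Strip the trailing newline, then split at each comma or period,
        # discarding any whitespace directly around the delimiter.
        word_list = re.split(r'\s*[,.]\s*', line.strip())
        print(word_list)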
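To check the pattern without needing book.txt, here is a minimal sketch that runs the same split on the example string from the question. The helper name split_on_punctuation is made up for illustration, and dropping empty pieces (which appear when a line starts or ends with a comma or period) is an assumption about what you want, not something the question specifies.

```python
import re


def split_on_punctuation(text):
    """Split text at commas and periods, trimming whitespace around each piece."""
    parts = re.split(r'\s*[,.]\s*', text.strip())
    # A leading or trailing delimiter leaves an empty string behind; drop those.
    return [part for part in parts if part]


sample = "s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar.mhvsflinmk,dvoxuce,vb,f.cfb"
result = split_on_punctuation(sample)
print(result)
```

Spaces inside a piece (as in 'wgzggarhz hbmk') are kept, because the pattern only consumes whitespace that sits right next to a comma or period.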