So I'm trying to parse a log file, and I've only found one way to split each line into three parts: date, type and message. I can easily do this with regex, but in order to learn I'm trying to find other ways. The end goal is to parse log files, split the data into three parts and then, depending on the flags you choose, print statistics. The log is formatted like this:
[Tue Nov 06 09:41:10 2020] [type] message
for line in f.readlines():
    details = line.split(']')[0], line.split(']')[1], line.split(']')[2]
    details = [x.strip() for x in details]
    structure = {key:value for key, value in zip(order, details)}
    data.append(structure)
This of course gives me the output:
"date": "[date",
"type": "[log",
I have several examples of other ways I've tried to split and then strip these characters. One way would be:
details = line.strip('[').split(']')[0], line.split(']')[1], line.split(']')[2]
and this would strip the [ bracket from the date string. That leaves the type, and if I do the same strip again but on the first position above, it doesn't strip. If I strip before the split in the same for loop, it doesn't strip anything at all. Like I said, I've tried to manipulate this in a hundred different ways, and I think I need some input on the correct way to do this, as I'm stuck.
>Solution :
Split by one (e.g., closing) bracket, then strip the other (opening) one:
details = line.split(']', 2)  # at most 3 parts, so a ']' inside the message is kept
details = [x.strip('[ \n') for x in details]  # also drops the trailing newline
structure = dict(zip(order, details))
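For reference, here is a minimal self-contained sketch of the split-then-strip idea applied to the sample line from the question. The function name `parse_line` and the choice to return `None` for lines that don't fit the format are my own assumptions, not something from the question. The last two lines also show why the `strip('[')` attempt didn't work: `str.strip` only removes characters from the two ends of the whole string, so it never reaches the bracket in front of the type.

```python
def parse_line(line):
    """Split one '[date] [type] message' line into a dict (hypothetical helper)."""
    # maxsplit=2: split on the first two ']' only, so a ']' in the message survives
    parts = line.rstrip("\n").split("]", 2)
    if len(parts) != 3:
        return None  # assumption: silently skip lines that don't fit the format
    date, kind, message = parts
    return {
        "date": date.strip("[ "),   # drop the leading '['
        "type": kind.strip("[ "),   # drop the leading ' ['
        "message": message.strip(), # drop surrounding whitespace only
    }

sample = "[Tue Nov 06 09:41:10 2020] [type] message\n"
print(parse_line(sample))

# strip('[') trims only the ends of the whole line, so the second field is unchanged
print(repr(sample.strip("[").split("]")[1]))  # ' [type'
```

In the loop from the question you would then do something like `data.append(parse_line(line))` for every line that doesn't come back as `None`.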
But the folks in the comments are quite right: a regex solution will be much simpler and more maintainable. Something along the lines of:
\[(?P<date>.+?)\] \[(?P<type>[a-z]+?)\] (?P<message>.+)
See it in action at regex101.com
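If it helps, this is roughly how that pattern could be used from Python with the standard `re` module. The names `LOG_LINE` and `parse_line` are just placeholders I picked, and note that `[a-z]+?` assumes the type is written in lowercase letters only, so adjust it if your log uses something like `ERROR`.

```python
import re

# Same pattern as above; named groups become the dict keys
LOG_LINE = re.compile(
    r"\[(?P<date>.+?)\] \[(?P<type>[a-z]+?)\] (?P<message>.+)"
)

def parse_line(line):
    """Return a dict with date/type/message, or None if the line doesn't match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

print(parse_line("[Tue Nov 06 09:41:10 2020] [type] message\n"))
# {'date': 'Tue Nov 06 09:41:10 2020', 'type': 'type', 'message': 'message'}
```

Since `.` does not match a newline by default, the trailing `\n` from the file never ends up in the message, so no extra stripping is needed.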