I want to divide a sentence into words using regex. I'm using this code:
import re
sentence='<30>Jan 11 11:45:50 test-tt systemd[1]: tester-test.service: activation successfully.'
sentence = re.split(r'\s|,|>|<|\[|\]:', sentence)
But I'm not getting what I expect.
The expected output is:
['30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', 'tester-test.service: activation successfully.']
But what I'm getting is:
['', '30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', '', 'tester-test.service:', 'activation', 'successfully.']
I tried to ignore the whitespace, but it should only be ignored in the last long "word" (the message after the first ': '), and I have no idea how to do that.
Any suggestions or help?
Thank you in advance.
>Solution :
You can use
import re
sentence='<30>Jan 11 11:45:50 test-tt systemd[1]: tester-test.service: activation successfully.'
chunks = sentence.split(': ', 1)
result = re.findall(r'[^][\s,<>]+', chunks[0])
result.append(chunks[1])
print(result)
# => ['30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', 'tester-test.service: activation successfully.']
Here,
chunks = sentence.split(': ', 1): splits the sentence into two chunks at the first ': ' substring.
result = re.findall(r'[^][\s,<>]+', chunks[0]): extracts from the first chunk all substrings consisting of one or more chars other than ], [, whitespace, comma, < and >.
result.append(chunks[1]): appends the second chunk to the result list.
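If some lines might not contain ': ' at all, chunks[1] would raise an IndexError. A minimal sketch of the same idea wrapped in a helper, assuming you want such lines tokenized as a header only (the name split_log_line is made up for illustration, and str.partition is swapped in for split so the missing-separator case is easy to detect):

```python
import re

def split_log_line(line):
    # Hypothetical helper: cut once at the first ': ' so the message stays intact.
    head, sep, message = line.partition(': ')
    # Tokenize the header on whitespace, commas, angle brackets and square brackets.
    tokens = re.findall(r'[^][\s,<>]+', head)
    # Only append the message when the separator was actually found.
    if sep:
        tokens.append(message)
    return tokens

sentence = '<30>Jan 11 11:45:50 test-tt systemd[1]: tester-test.service: activation successfully.'
print(split_log_line(sentence))
# => ['30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', 'tester-test.service: activation successfully.']
```

The colons inside 11:45:50 are not followed by a space, so the first ': ' is the one right after systemd[1].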