Question: Please debug the logic so that it produces the expected output.
import re
text = "Hello there."
word_list = []
for word in text.split():
    tmp = re.split(r'(\W+)', word)
    word_list.extend(tmp)
print(word_list)
Output:
['Hello', 'there', '.', '']
Problem: the result should not contain empty or whitespace-only entries.
Expected: ['Hello', 'there', '.']
>Solution :
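First of all, the actual output is not quite what you shared: splitting the whole string with re.split(r"(\W+)", ...) gives ['Hello', ' ', 'there', '.', ''].
This is because \W matches anything other than a letter, digit or underscore (for ASCII text, it is equivalent to [^a-zA-Z0-9_]), so it splits your string on the space (\s) and on the literal dot (.). Because the pattern is inside a capturing group, those separators are kept in the result. The trailing '' appears because the string ends with a separator, which leaves an empty string after it.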
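To see where the extra entries come from, here is a small sketch that runs both variants side by side. The names whole and per_word are only illustrative and do not appear in the question.

```python
import re

text = "Hello there."

# Splitting the whole string: the captured separators (space and dot) are kept,
# and the dot at the very end leaves an empty string after it.
whole = re.split(r"(\W+)", text)
print(whole)

# Splitting word by word: str.split() has already removed the whitespace,
# but "there." still ends with a separator, so '' remains at the end.
per_word = []
for word in text.split():
    per_word.extend(re.split(r"(\W+)", word))
print(per_word)
```

In both cases the unwanted entries are either empty or whitespace-only strings, which is why a filtering step is enough to clean them up.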
So to get the expected output, you need some further processing, as shown below.
With Earlier Code:
import re
s = "Hello there."
# str.strip returns '' for empty and whitespace-only strings, so filter drops both
l = list(filter(str.strip, re.split(r"(\W+)", s)))
print(l)
With Edited Code:
import re
text = "Hello there."
word_list = []
for word in text.split():
    tmp = re.split(r'(\W+)', word)
    word_list.extend(tmp)
# filter(None, ...) drops falsy items, i.e. the empty strings
print(list(filter(None, word_list)))
Output:
['Hello', 'there', '.']
Working Code: https://rextester.com/KWJN38243
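As an alternative sketch (not part of the answer above), re.findall can collect the tokens directly instead of splitting and then filtering. The pattern is an assumption about what should count as a token: a run of word characters, or a run of characters that are neither word characters nor whitespace.

```python
import re

text = "Hello there."

# \w+      -> a run of letters, digits or underscores
# [^\w\s]+ -> a run of punctuation (anything that is not a word character or whitespace)
tokens = re.findall(r"\w+|[^\w\s]+", text)
print(tokens)  # ['Hello', 'there', '.']
```

Since findall returns only what matches, no empty or whitespace entries are produced in the first place.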