Not getting expected output for some reason?


Question: please help me debug the logic so that the output matches the expected output

import re
text = "Hello there."
word_list = []
for word in text.split():
    tmp = re.split(r'(\W+)', word)
    word_list.extend(tmp)
print(word_list)

Output is:

['Hello', 'there', '.', '']

Problem: the result should not contain the empty string at the end

Expected: ['Hello', 'there', '.']

Solution:

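First of all, the actual output depends on which version of the code you run. Calling re.split(r'(\W+)', text) on the whole string gives ['Hello', ' ', 'there', '.', ''], while the loop over text.split() shown in the question gives ['Hello', 'there', '.', '']. In both cases the unwanted items appear for the same reason: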

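\W matches any character other than a letter, digit, or underscore (equivalent to [^a-zA-Z0-9_] for ASCII text), so the pattern splits your string on the space (\s) and on the literal dot (.). Because the pattern is wrapped in a capturing group, the separators are kept in the result, and because the string ends with a separator, re.split also returns an empty string after it.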
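To make the effect of the capturing group concrete, here is a minimal sketch that splits the same string with and without the parentheses. The variable names no_group and with_group are only illustrative and do not come from the question.

```python
import re

text = "Hello there."

# Without a capturing group, the separators are discarded.
no_group = re.split(r"\W+", text)

# With a capturing group, the separators are kept in the result.
with_group = re.split(r"(\W+)", text)

print(no_group)    # ['Hello', 'there', '']
print(with_group)  # ['Hello', ' ', 'there', '.', '']
```

Both results end with an empty string because the text ends with a separator (the dot), so there is nothing left after the last split point.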

So to get the expected output you need to filter out the empty and whitespace-only items, as shown below.

With Earlier Code:

import re
s = "Hello there."
# str.strip returns '' (falsy) for empty or whitespace-only items, so filter drops them
l = list(filter(str.strip, re.split(r"(\W+)", s)))
print(l)

With Edited code:

import re
text = "Hello there."
word_list = []
for word in text.split():
    tmp = re.split(r'(\W+)', word)
    word_list.extend(tmp)
# filter(None, ...) drops falsy items, here the trailing empty string
print(list(filter(None, word_list)))

Output:

['Hello', 'there', '.']

Working Code: https://rextester.com/KWJN38243
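As an alternative to splitting and then filtering, a possible approach is to match the tokens directly with re.findall, which never produces empty strings. This is only a sketch and not part of the original solution. The helper name tokenize is hypothetical, and the pattern assumes that a token is either a run of word characters or a run of punctuation.

```python
import re

def tokenize(text):
    # Hypothetical helper: \w+ matches a run of word characters,
    # [^\w\s]+ matches a run of characters that are neither word
    # characters nor whitespace (i.e. punctuation).
    return re.findall(r"\w+|[^\w\s]+", text)

tokens = tokenize("Hello there.")
print(tokens)  # ['Hello', 'there', '.']
```

Note that consecutive punctuation marks such as "?!" come back as a single token with this pattern, which may or may not be what you want.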
