I want to extract the full phrase (one or multiple words) that contains a specific substring. The substring can consist of one or multiple words, and it can start or end in the middle of a word in test_string, but the desired output is the full phrase/word from test_string. For example:
test_string = 'this is an example of the text that I have, and I want to by amplifier and lamp'
substring1 = 'he text th'
substring2 = 'amp'
if substring1 in test_string:
    print("substring1 found")
if substring2 in test_string:
    print("substring2 found")
My desired output is:
[the text that]
[example, amplifier, lamp]
FYI: the substring can be at the beginning, in the middle, or at the end of a word; it does not matter.
>Solution :
If you want something robust, I would do something like this:
import re
re.findall(r"((?:\w+)?" + re.escape(substring2) + r"(?:\w+)?)", test_string)
This way you can have whatever you want in the substring: re.escape makes any regex special characters in it match literally.
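Explanation of the regex:
'(?:\w+)' Non-capturing group matching one or more word characters
'?' zero or one occurrence of that group, which makes it optional
I have put this at both the beginning and the end of your substring, since the missing part of the word can come before it, after it, or both.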
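Putting it together, here is a minimal sketch of the same idea wrapped in a function. The name find_full_matches is made up for illustration, and \w* is used as a shorthand that should match the same text as the optional group (?:\w+)? above.

```python
import re

def find_full_matches(substring, text):
    """Return every run of text that contains `substring`, extended
    left and right to the nearest word boundaries."""
    # \w* picks up the rest of a partially matched word (possibly nothing);
    # re.escape makes the substring match literally, even if it contains
    # regex special characters such as '.' or '+'.
    pattern = r"\w*" + re.escape(substring) + r"\w*"
    return re.findall(pattern, text)

test_string = 'this is an example of the text that I have, and I want to by amplifier and lamp'

print(find_full_matches('he text th', test_string))  # ['the text that']
print(find_full_matches('amp', test_string))         # ['example', 'amplifier', 'lamp']
```

Since the pattern has no capturing group, re.findall returns the whole match for each hit, which is the full word or phrase you are after.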