Extract words from a sentence that contain a substring

I want to extract the full phrase (one or more words) that contains a specific substring. The substring can consist of one or more words, and it can start or end in the middle of a word in test_string, but the desired output is the full phrase/word from test_string. For example:

test_string = 'this is an example of the text that I have, and I want to by amplifier and lamp'
substring1 = 'he text th'
substring2 = 'amp'

if substring1 in test_string:
    print("substring1 found")
    
if substring2 in test_string:
    print("substring2 found")

My desired output is:

[the text that]
[example, amplifier, lamp]

FYI

The substring can be at the beginning, middle, or end of a word; it does not matter.

>Solution :

If you want something robust, I would do something like this:

import re
re.findall(r"((?:\w+)?" + re.escape(substring2) + r"(?:\w+)?)", test_string)

Because the substring is passed through re.escape, it can contain any characters, including regex metacharacters.
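To illustrate why the escaping matters, here is a small sketch comparing an escaped and an unescaped pattern. The sample text and the names `text`, `sub`, `unescaped` and `escaped` are made up for this illustration and are not part of the question.

```python
import re

# Hypothetical sample data, chosen so that the substring contains a
# regex metacharacter ('.').
text = "prices: 3.50 and 3x50"
sub = "3.5"

# Without re.escape, '.' matches any character, so '3x5' also matches.
unescaped = re.findall(r"\w*" + sub + r"\w*", text)

# With re.escape, '.' only matches a literal dot.
escaped = re.findall(r"\w*" + re.escape(sub) + r"\w*", text)

print(unescaped)  # ['3.50', '3x50']
print(escaped)    # ['3.50']
```

Here `\w*` is used as a shorter spelling of the optional group `(?:\w+)?`; both match zero or more word characters.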

Explanation of the regex:

'(?:\w+)'   Non-capturing group matching one or more word characters
'?'         Makes the group optional (zero or one occurrence)

I put this optional group both before and after your substring, because the part of the word that the substring is missing can be at the start or at the end.
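Putting it together, a minimal sketch of the approach applied to both substrings from the question could look like the following. The helper name `find_phrases` is an assumption introduced here for illustration, and `\w*` stands in for the equivalent optional group `(?:\w+)?`.

```python
import re

def find_phrases(substring, text):
    """Return each match of `substring` in `text`, widened to whole words."""
    # \w* on both sides extends the match to the surrounding word
    # boundaries; re.escape makes the substring match literally.
    pattern = r"\w*" + re.escape(substring) + r"\w*"
    return re.findall(pattern, text)

test_string = 'this is an example of the text that I have, and I want to by amplifier and lamp'

print(find_phrases('he text th', test_string))  # ['the text that']
print(find_phrases('amp', test_string))         # ['example', 'amplifier', 'lamp']
```

Note that `\w` only covers letters, digits and underscores, so a word containing a hyphen or an apostrophe would be cut at that character.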
