Match whole word in string including special characters

I am aware of multiple existing answers that suggest:

import re

def contains(string, word):
    return bool(re.search(rf"\b{word}\b", string))

But this pattern gives special treatment to alphanumeric characters. For example, contains("hello world!", "world!") returns False while contains("hello world!", "world") returns True.

I need a more ‘naive’ search pattern, one that matches a substring as long as it starts and ends with either the superstring’s boundary or a space. (Desired behavior: opposite of the examples above.)

Solution:

Avoid \b (word boundary), which is defined in terms of alphanumeric characters. Instead, assert that the match is neither preceded nor followed by a non-whitespace character, using the lookarounds (?<!\S) and (?!\S). It is also safer to pass the search word through re.escape, since it may contain regex metacharacters.

You may use this Python code:

import re

def contains(string, word):
    # Escape the word so that regex metacharacters in it match literally.
    return bool(re.search(rf"(?<!\S){re.escape(word)}(?!\S)", string))

print(contains("hello world!", "world"))
print(contains("hello world!", "world!"))

Output:

False
True
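
A few extra checks may help show why the word, rather than the searched string, should be escaped, and what the lookarounds treat as a delimiter. This is only a sketch: the sample sentences are made up for illustration, and contains is redefined here so the snippet runs on its own. Note that (?<!\S) and (?!\S) accept any whitespace (tabs, newlines) as a delimiter, which is slightly broader than the "space or string boundary" rule in the question.

```python
import re

def contains(string, word):
    # re.escape makes characters such as "+" or "." match literally
    # instead of being read as quantifiers or wildcards.
    pattern = rf"(?<!\S){re.escape(word)}(?!\S)"
    return bool(re.search(pattern, string))

# Metacharacters in the word are treated as plain text.
print(contains("I write c++ daily", "c++"))      # True
print(contains("I write c++, daily", "c++"))     # False: followed by ","
print(contains("version 1.5 released", "1.5"))   # True
print(contains("version 105 released", "1.5"))   # False: "." is literal

# Any whitespace counts as a delimiter, not only a space.
print(contains("hello\tworld!\n", "world!"))     # True
```

If only a literal space should count as a delimiter, the lookarounds could be swapped for (?<![^ ]) and (?![^ ]); that variant is an assumption about the intended rule, not part of the answer above.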
