How to check which words from a list are contained in a string?

I want to collect each word from a list that also appears as a word in a string in Python. I found some solutions, but this is what I have so far:

data = "Today I gave my dog some carrots to eat in the car"
tweet = data.lower()                             #convert to lower case
split = tweet.split()

matchers = ['dog','car','sushi']
matching = [s for s in split if any(xs in s for xs in matchers)]
print(matching)

The result is

['dog', 'carrots', 'car']

How do I change this so that the result is only dog and car, without adding spaces to my matchers?

Also, how would I remove any $ signs (as an example) from the data string, but leave other special characters like @ untouched?

Solution:

How do I change this so that the result is only dog and car, without adding spaces to my matchers?

To do this with your current code, replace this line:

matching = [s for s in split if any(xs in s for xs in matchers)]

With this:

matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
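
The difference is that `word in split` compares each matcher against whole tokens, whereas the original `xs in s` was a substring test on each token, which is why 'car' matched 'carrots'. As a rough sketch of the same idea in more compact form (the `words` set is an addition for illustration, not part of the original code), the loop can be written as a list comprehension:

```python
data = "Today I gave my dog some carrots to eat in the car"
matchers = ['dog', 'car', 'sushi']

# Build a set of whole tokens so each membership test is an exact comparison.
words = set(data.lower().split())

# Keep only the matchers that appear as complete words, in matcher order.
matching = [word for word in matchers if word in words]
print(matching)
```

Note that the result follows the order of `matchers`, not the order in which the words appear in the sentence.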

You also mention this:

Also, how would I remove any $ signs (as an example) from the data string, but leave other special characters like @ untouched?

To do this, I would create a list that contains characters you want to remove, like so:

things_to_remove = ['$', '*', '#']  # this can be anything you want to take out

Then remove each of those characters from the tweet string before you split it:

for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
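
An alternative to the loop, shown here only as a sketch, is to build a translation table once with `str.maketrans` and apply it with `str.translate`. The sample `tweet` value and the `table` and `cleaned` names are assumptions made for the example:

```python
things_to_remove = ['$', '*', '#']

# The third argument to str.maketrans lists characters to delete.
table = str.maketrans('', '', ''.join(things_to_remove))

tweet = "today i@@ gave my dog## some carrots to eat in the$ car"
cleaned = tweet.translate(table)
print(cleaned)
```

Characters that are not in `things_to_remove`, such as @, are left in place.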

Here is a final code block that puts all of this together:

data = "Today I@@ gave my dog## some carrots to eat in the$ car"
tweet = data.lower()                             #convert to lower case

things_to_remove = ['$', '*', '#']

for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
print("After removing characters I don't want:")
print(tweet)

split = tweet.split()

matchers = ['dog','car','sushi']

matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
print(matching)
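
One limitation of splitting on whitespace is that punctuation stays attached to a token, so 'dog,' would not equal 'dog' unless the comma is removed first. If that matters for your data, a possible variation is to search with regular-expression word boundaries instead. This is a sketch under that assumption; `find_whole_words` is a hypothetical helper name, not something from the code above:

```python
import re

def find_whole_words(text, candidates):
    """Return the candidates that occur as whole words in text (case-insensitive)."""
    lowered = text.lower()
    found = []
    for word in candidates:
        # \b marks a word boundary; re.escape guards against regex metacharacters.
        if re.search(r'\b' + re.escape(word) + r'\b', lowered):
            found.append(word)
    return found

result = find_whole_words("Today I gave my dog, some carrots", ['dog', 'car', 'sushi'])
print(result)
```

Here 'dog' is found despite the trailing comma, while 'car' is not matched inside 'carrots'.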