How to insert a character at the beginning of all items in a list, while excluding specific items, based on another list?

I have a list, list1 (which can be much longer than this one):

data_into_list = ["a", "good", "with", "amor", "and", "friand"]

and I want to add a hashtag ('#') to every item except those that also appear in a second list, list2: if a word from list1 is in list2, no hashtag is added to it.

For example:

excluded_words = ["a", "with", "and"]

(I tried several solutions, but since I’m a beginner in Python, I can’t find a correct way to do it. I tried adding a hashtag to every word in the list, converting the list to a string, and then substituting via a loop and .replace. I also tried using a dictionary with re.sub. In both cases, it doesn’t match the exact word: it removes the # not only from the "a" item, but also from every item that begins with "a", like "amor" in my list. And as far as I understand, a dictionary can’t use a regex to match the exact word…)

In any case, it seems more logical to do a list comparison, and exclude some items based on a second list, but I can’t manage to find how…

Thx in advance

Edit: here is one of my failed solutions:

import re

# open the file in read mode and read its contents
with open("LastPrompt.txt", "r") as my_file:
    data = my_file.read()

# split the text into a list of words
data_into_list = data.replace('\n', ' ').split()

string = '#'

# add a hashtag to every word
addhashtag = [string + x for x in data_into_list]

# convert the list to a string -> needed to be saved in .txt
hashtags = ' '.join(map(str, addhashtag))

# replace every undesired "#word" with the same word without "#"
# (this is where it goes wrong: "#a" also matches the start of "#amor")
for r in (("#A", "A"), ("#is", "is"), ("#a", "a"), ("#with", "with")):
    hashtags = hashtags.replace(*r)

>Solution :

You can convert the second list into a set and look up each string from the first list in that set.

This will do the trick:

data_into_list = ["a", "good", "with", "amor", "and", "friand"]
excluded_words = ["a", "with", "and"]

excluded_set = set(excluded_words)
new_list = [
    item if item in excluded_set else "#" + item
    for item in data_into_list
]
print(new_list)

Output: ['a', '#good', 'with', '#amor', 'and', '#friand']
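
If the words come from a text file, as in the question, the same idea can be wrapped in a small function that takes the raw text and returns the tagged string. This is only a minimal sketch: `add_hashtags` is a hypothetical helper name, and the case-insensitive comparison is an assumption based on the question's attempt listing both "#A" and "#a".

```python
def add_hashtags(text, excluded_words):
    """Prefix every whitespace-separated word with '#', except excluded ones."""
    # Lowercase the exclusions once so "A" and "a" are treated alike (assumption).
    excluded_set = {word.lower() for word in excluded_words}
    tagged = [
        word if word.lower() in excluded_set else "#" + word
        for word in text.split()  # split() with no argument also splits on newlines
    ]
    return " ".join(tagged)


sample = "A good\nwith amor and friand"
result = add_hashtags(sample, ["a", "with", "and"])
print(result)  # A #good with #amor and #friand
```

To apply it to the file from the question, read the text with `with open("LastPrompt.txt") as f: text = f.read()` and pass `text` to the function. Because each word is compared as a whole against the set, "amor" is never confused with "a".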

EDIT: updated to use the exclude list from your updated question.
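
As for why the attempt in the question fails: `str.replace` matches substrings, so replacing "#a" with "a" also rewrites the start of "#amor". If you prefer to keep the string-based approach, one possible fix is a regex that only matches whole tokens. The sketch below assumes the words are separated by whitespace; the variable names `alternation`, `pattern` and `cleaned` are illustrative only.

```python
import re

excluded_words = ["a", "with", "and"]
hashtags = "#a #good #with #amor #and #friand"

# Build one alternation of the excluded words; re.escape guards against
# regex metacharacters appearing in a word.
alternation = "|".join(re.escape(word) for word in excluded_words)

# (?<!\S) and (?!\S) require whitespace or the string edge on both sides,
# so "#a" matches only as a whole token and never inside "#amor".
pattern = re.compile(rf"(?<!\S)#({alternation})(?!\S)")

cleaned = pattern.sub(r"\1", hashtags)
print(cleaned)  # a #good with #amor and #friand
```

The set lookup above is still simpler and avoids adding hashtags only to strip them again, so the regex version is mainly useful when the text has already been tagged.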
