function that finds the most mentions of a given word in a list of strings?

April 15, 2022

I want to write a function that takes a list of strings and a word, and returns a tuple containing the position (1-based line number) of the string with the most mentions of that word and the number of mentions in that string. If several strings share the same maximum number of mentions, the first of them is returned. The word match is not case-sensitive.

For example, consider the list:

Tomatoes = ['tonight tomatoes grow towards the torchlit tower',
 'the birds fly to the sky',
 'to take the fish to the sea and to tell the tale',
 'to fly to the skies and to taste the clouds']

Note that lines 3 and 4 have the most mentions of the word 'to'. When we pass Tomatoes to the function with the search word 'to', the call should look like this:

most_word_mentions(Tomatoes, 'to')

It should return a tuple holding the line number of the third string and the number of mentions of 'to' in it, which looks like (3, 3). Although line 3 has the same number of mentions as line 4, line 3 is returned because it occurs first in the list.

I have written a function that partially achieves this, but it fails under specific conditions.

def most_word_mentions(message, word):
    wordcount = []
    for i in range(len(message)):
        message[i] = message[i].lower() #word is not case sensitive
        wordcount.append(((message[i]).count(word)))
    return (wordcount.index(max(wordcount))+1), max(wordcount)

If we call most_word_mentions(Tomatoes, 'to'), the function returns the wrong line and count: (1, 6). This is because line 1, although it does not contain the standalone word 'to', contains many other words that have 'to' inside them, and str.count counts every one of those substrings. I would like to write a function that accounts for this issue and that can be applied to similar scenarios. Could this be done with only for loops and if statements, without list comprehensions or imports?
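As a minimal sketch of that over-count, the snippet below compares str.count with a word-by-word comparison on line 1 of the list. The variable names substring_hits and whole_word_hits are only illustrative, and splitting on whitespace is an assumption that works here because the strings contain no punctuation.

```python
line = 'tonight tomatoes grow towards the torchlit tower'

# str.count counts substrings, so 'to' inside longer words is included
substring_hits = line.count('to')

# Splitting on whitespace and comparing whole words avoids that
whole_word_hits = 0
for token in line.split():
    if token == 'to':
        whole_word_hits += 1

print(substring_hits, whole_word_hits)  # 6 0
```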

>Solution :

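This solution uses only for loops and if statements, as you requested. It splits each string into words and compares whole words, so the 'to' inside 'tomatoes' is not counted, and the strict > comparison keeps the first string when counts tie.

def most_word_mentions(list_of_strings, word):
    word = word.lower()  # matching is not case-sensitive
    highest_count = 0
    earliest_index = 0
    for i in range(len(list_of_strings)):
        curr_count = 0
        # Compare whole words so 'to' does not match inside 'tomatoes'
        for token in list_of_strings[i].lower().split():
            if token == word:
                curr_count += 1
        # Strict '>' keeps the earliest string when counts are equal
        if curr_count > highest_count:
            earliest_index = i
            highest_count = curr_count
    return (earliest_index + 1, highest_count)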

print(most_word_mentions(Tomatoes, 'to'))

Output: