Unable to implement nltk.stopwords

July 18, 2022

I am trying to remove stopwords from my data with nltk, but after several attempts the stopwords are still there. The tokenization part of my code works, but I cannot see why the stopword removal does not.

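import re
import nltk
from nltk.corpus import stopwords

def pre_process(text):
    # replace runs of digits, non-word characters and underscores with a space
    text = re.sub(r"(\d|\W|_)+", " ", text)
    # split on non-word characters to get a list of tokens
    text = re.split(r"\W+", text)
    return text

text = dat['text'].apply(lambda x: pre_process(x))
nltk.download('stopwords')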

def remove_stopwords(text):
    for word in text:
        if word in stopwords.words('english'):
            text.remove(word)
        return text

text_stopword = text.apply(lambda x:remove_stopwords(x))

The code should remove words such as ‘the’, but after running my CSV through it, words such as ‘the’ are still present.

Current results:

text returns:

[tv, future, in, the, hands, of, viewers, with...

text_stopword returns:

[tv, future, in, the, hands, of, viewers, with...

>Solution :

The return statement in your remove_stopwords function is indented one level too deep. It sits inside the for loop, so the function returns text during the first iteration, after checking only the first word.

Use this instead:

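def remove_stopwords(text):
    # look up the stopword list once rather than on every iteration
    stop_words = set(stopwords.words('english'))
    # iterate over a copy so that removing an item does not skip the next word
    for word in text[:]:
        if word in stop_words:
            text.remove(word)
    # return only after the loop has visited every word
    return text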
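
One caveat worth checking alongside the indentation: a loop that removes items from the very list it is iterating over skips the element right after each removal, so two adjacent stopwords such as "in, the" can leave "the" behind. The sketch below compares three variants on the sample tokens from the question. The function names are hypothetical, and `STOP_WORDS` is a small hard-coded stand-in (an assumption, so the example runs without downloading NLTK data) for `set(stopwords.words('english'))`.

```python
# Stand-in stopword set so the sketch runs without NLTK data (assumption);
# with NLTK installed, set(stopwords.words('english')) would take its place.
STOP_WORDS = {"in", "the", "of", "with"}


def remove_early_return(tokens):
    """Variant with `return` inside the loop, as in the question."""
    tokens = list(tokens)  # work on a copy so the caller's list stays intact
    for word in tokens:
        if word in STOP_WORDS:
            tokens.remove(word)
        return tokens  # exits during the first iteration
    return tokens  # only reached when the list is empty


def remove_in_place(tokens):
    """`return` after the loop, but the list is mutated while being iterated."""
    tokens = list(tokens)
    for word in tokens:
        if word in STOP_WORDS:
            tokens.remove(word)  # later items shift left; the next one is skipped
    return tokens


def remove_with_filter(tokens):
    """Build a new list instead of mutating the one being iterated."""
    return [word for word in tokens if word not in STOP_WORDS]


sample = ["tv", "future", "in", "the", "hands", "of", "viewers", "with"]
print(remove_early_return(sample))  # unchanged: only "tv" was checked
print(remove_in_place(sample))      # "the" survives because it follows "in"
print(remove_with_filter(sample))   # all four stopwords are gone
```

With a pandas Series of token lists, the filtering variant could be applied as `text.apply(remove_with_filter)`. Converting the stopword list to a set once also avoids rebuilding it for every word.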

nltk

byMR

Published July 18, 2022
