Remove extra words from text

I've been trying to remove extra words such as {'by', 'the', 'and', 'of', 'a'}
from a text. The best way I have found so far is this:

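Code:

import re

def clean_text(text):
    """
    Split the text on non-word characters and drop the stop words.
    """
    stopwords = {'by', 'the', 'and', 'of', 'a'}
    # Raw string so that \W reaches the regex engine unchanged;
    # "if word" skips empty tokens from leading/trailing separators.
    result = [word for word in re.split(r"\W+", text) if word and word.lower() not in stopwords]
    result = ' '.join(result)
    print(result)
    return result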

#dummy text
long_string = "one Groups are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"
clean_text(long_string)

My question: is there a better way to do this without a for loop? Does regex have a method for removing certain words from a text, so that I can avoid the loop?

Solution:

You could use a regex replacement approach by forming an alternation of stop words and then removing them.

import re

long_string = "one Groups are marked by the ()meta-characters. two They group together the expressions contained one inside them, and you can one repeat the contents of a group with a repeating qualifier, such as there"
words = ["by", "the", "and", "of", "a"]
# Match a run of one or more whole-word stop words plus the surrounding
# whitespace, so consecutive stop words do not leave double spaces.
regex = r'(?:\s*\b(?:' + r'|'.join(map(re.escape, words)) + r')\b)+\s*'
output = re.sub(regex, ' ', long_string).strip()
print(output)

This prints:

one Groups are marked ()meta-characters. two They group together expressions contained one inside them, you can one repeat contents group with repeating qualifier, such as there
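
If the stop words vary, or the text may contain capitalized forms such as "The", the same alternation idea could be wrapped in a small helper. The following is only a sketch under assumptions: the function name remove_stopwords and its parameters are hypothetical (not from the original answer), and case-insensitive matching is an added choice.

```python
import re

def remove_stopwords(text, stopwords):
    """Remove whole-word occurrences of the given stop words, ignoring case.

    Hypothetical helper for illustration; not part of the original answer.
    """
    if not stopwords:
        # An empty alternation would match at every word boundary, so bail out.
        return text
    # Escape each word so any regex metacharacters are matched literally.
    alternation = "|".join(re.escape(word) for word in stopwords)
    # A run of one or more stop words, with the whitespace around the run.
    pattern = re.compile(r"(?:\s*\b(?:" + alternation + r")\b)+\s*", re.IGNORECASE)
    # Replace each run with a single space, then trim the ends.
    return pattern.sub(" ", text).strip()

print(remove_stopwords("The cat and the hat of a king", {"by", "the", "and", "of", "a"}))
# prints: cat hat king
```

Compiling the pattern once inside the helper keeps the call site short; for many calls with the same stop words, the compiled pattern could instead be built once and reused.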
