Regex to remove special character based on a condition

I am using regex to remove special characters from a string in Python.

import re
txt = "This is a sample text(s). ANother sample line (testing)"

print(re.sub('[^A-Za-z0-9]+', ' ', txt))

Output:

'This is a sample text s ANother sample line testing '

Expected output:

'This is a sample texts ANother sample line testing '

If there is no space between the word and the special character (, the output should also not have the space. In the given example, the correct output is texts, not text s.

Any suggestions will be helpful.

Solution:

The question asks that the output "should also not have the space".

However, re.sub('[^A-Za-z0-9]+', ' ', txt) replaces each run of special characters with a single space character ' ', which is why text(s) becomes text s.

Instead, replace with the empty string '' and add the space itself to the set of "non-special" characters, so that the existing spaces are not deleted:

>>> import re
>>> txt = "This is a sample text(s). ANother sample line (testing)"
#       add space here V     V replace with nothing
>>> re.sub('[^A-Za-z0-9 ]+', '', txt)
'This is a sample texts ANother sample line testing'
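
The two patterns can be compared side by side in a short script. This is only a sketch: the helper name strip_special is hypothetical and not part of the original answer, and it assumes that "special" means anything other than ASCII letters, digits, and the space character.

```python
import re


def strip_special(text: str) -> str:
    """Delete every character that is not an ASCII letter, digit, or space.

    Hypothetical helper for illustration; the name is an assumption.
    """
    return re.sub(r"[^A-Za-z0-9 ]+", "", text)


txt = "This is a sample text(s). ANother sample line (testing)"

# Original approach: each run of special characters becomes one space,
# so "text(s)" is split into "text s".
spaced = re.sub(r"[^A-Za-z0-9]+", " ", txt)

# Suggested approach: special characters are deleted and spaces are kept,
# so "text(s)" collapses to "texts".
cleaned = strip_special(txt)

print(repr(spaced))
print(repr(cleaned))
```

One thing to watch for: a special character that has a space on both sides leaves two adjacent spaces behind, so "a - b" becomes "a  b". If that matters for your data, a second pass that collapses repeated spaces may be needed.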