Regex: drop numbers with some symbols

I am trying to clean my text, so I need to remove some numbers as well as some combinations of numbers and symbols.

I have a string:

s = '4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282'

And I want to get

'pm from our side a more detailed analysis4'

I tried to use

re.compile(r'\b(?:/|-|\+|\:)(\d+)\b').sub(r' ', s)

but it returns

'4   2   pm from our side a more detailed analysis4 +7 (495) 797  77 '

What am I doing wrong, and how can I drop only the numbers and the combinations of numbers with specific symbols?

>Solution :

You can match one or more runs of non-word, non-whitespace characters, each optionally surrounded by digits, together with any trailing whitespace, and then trim the result:

\d*(?:[^\w\s]+\d*)+\s*
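To make the pattern easier to read, here is the same expression written with `re.VERBOSE` and one comment per part. This is only an illustrative sketch: the names `PATTERN` and `tokens` are not from the original answer.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE.
PATTERN = re.compile(
    r"""
    \d*              # optional digits before the first symbol (the 4 in 4/13)
    (?:
        [^\w\s]+     # one or more characters that are neither word chars nor whitespace
        \d*          # optional digits after them (the 13 after the slash)
    )+               # repeated, so 4/13/2022 or (495) is consumed as one match
    \s*              # trailing whitespace, so a removed token leaves no gap
    """,
    re.VERBOSE,
)

s = "4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282"

# Each match is one "digits plus symbols" token, including the whitespace after it.
tokens = PATTERN.findall(s)
print(tokens)
# ['4/13/2022 ', '2:20:03 ', '+7 ', '(495) ', '797-8700 ', '77-8282']
```

The `4` in `analysis4` survives because no symbol follows it directly. Note that `[^\w\s]` matches any punctuation, so on ordinary prose the pattern also removes commas, periods and the whitespace after them: `"Hello, world!"` becomes `"Helloworld"`. If that matters for your data, a narrower character class such as `[/:+()\-]` may be a better fit.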

import re

regex = r"\d*(?:[^\w\s]+\d*)+\s*"

s = "4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282"
result = re.sub(regex, "", s).strip()  # strip() trims the space left after "analysis4"

print(result)

Output

pm from our side a more detailed analysis4
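As for what went wrong in the original attempt, it may help to look at what that pattern actually matches. The sketch below lists the matches; the names `attempt` and `matched` are only illustrative.

```python
import re

s = "4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282"

# The pattern from the question: a word boundary, ONE symbol, then digits.
attempt = re.compile(r"\b(?:/|-|\+|\:)(\d+)\b")

# group(0) is the full text that sub() would replace.
matched = [m.group(0) for m in attempt.finditer(s)]
print(matched)
# ['/13', '/2022', ':20', ':03', '-8700', '-8282']
```

Three things follow from this list. The digits before the first symbol (`4`, `2`, `797`, `77`) are never part of a match, so they stay in the text. `+7` is skipped because `\b` cannot match between a space and `+` (both are non-word characters), and `(495)` is skipped because parentheses are not in the alternation. Finally, each match is replaced with a space rather than an empty string, which produces the runs of spaces in the result.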