I am trying to clean my text, so I need to remove some numbers and some combinations of numbers and symbols.
I have a string:
s = '4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282'
And I want to get
'pm from our side a more detailed analysis4'
I tried to use
re.compile(r'\b(?:/|-|\+|\:)(\d+)\b').sub(r' ', s)
but it returns
'4 2 pm from our side a more detailed analysis4 +7 (495) 797 77 '
What am I doing wrong, and how can I drop just the numbers and the combinations of numbers with specific symbols?
>Solution :
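As for what goes wrong in the original attempt: that pattern only matches one of the listed symbols followed by digits, and the leading \b requires a word character immediately before the symbol. A small sketch listing what it actually consumes (the names attempt and matched are just illustrative):

```python
import re

s = "4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282"
attempt = re.compile(r'\b(?:/|-|\+|\:)(\d+)\b')

# List every substring the original pattern matches (and then replaces with a space).
matched = [m.group(0) for m in attempt.finditer(s)]
print(matched)
# ['/13', '/2022', ':20', ':03', '-8700', '-8282']
```

The digits in front of each symbol ("4", "2", "797", "77") are never part of a match, so they stay. "+7" is skipped because "+" follows a space, so there is no word boundary before it, and "(495)" survives because parentheses are not in the alternation.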
You might match one or more runs of non-word, non-whitespace characters, each optionally surrounded by digits, together with any trailing whitespace, and then strip the result:
\d*(?:[^\w\s]+\d*)+\s*
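To make the pieces of that pattern easier to read, here is the same expression written with re.VERBOSE so each part carries a comment. This is only an annotated restatement. The sample string is a shortened version of the one in the question.

```python
import re

# Same pattern as above, spread out with re.VERBOSE so each piece can be commented.
pattern = re.compile(r"""
    \d*              # optional leading digits, e.g. the '4' in '4/13/2022'
    (?:              # one symbol-plus-digits chunk:
        [^\w\s]+     #   one or more chars that are neither word chars nor whitespace
        \d*          #   optional digits after the symbols
    )+               # repeated at least once, so a symbol is always required
    \s*              # trailing whitespace is consumed along with the match
""", re.VERBOSE)

sample = "4/13/2022 2:20:03 pm +7 (495) 797-8700"
matches = pattern.findall(sample)
print(matches)
# ['4/13/2022 ', '2:20:03 ', '+7 ', '(495) ', '797-8700']
```

Because at least one symbol is required, a token such as analysis4 is left alone: its trailing digit is not followed by a symbol.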
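import re
# Optional digits, then one or more (symbols + optional digits) groups, then trailing whitespace
regex = r"\d*(?:[^\w\s]+\d*)+\s*"
s = "4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282"
# Remove every match, then strip the whitespace left at the ends
result = re.sub(regex, "", s).strip()
print(result)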
Output
pm from our side a more detailed analysis4
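One caveat: because [^\w\s] matches any punctuation, the pattern also removes commas, question marks and similar characters from ordinary text. If that is not wanted, a different technique (not the regex above) is to split on whitespace and keep only tokens that contain at least one letter. A minimal sketch, assuming that rule fits your data; drop_non_letter_tokens is a hypothetical helper name:

```python
import re

def drop_non_letter_tokens(text):
    """Keep only whitespace-separated tokens that contain at least one letter."""
    kept = [tok for tok in text.split() if any(ch.isalpha() for ch in tok)]
    return " ".join(kept)

s = "4/13/2022 2:20:03 pm from our side a more detailed analysis4 +7 (495) 797-8700 77-8282"
print(drop_non_letter_tokens(s))
# pm from our side a more detailed analysis4

# The regex approach strips ordinary punctuation as well:
regex_result = re.sub(r"\d*(?:[^\w\s]+\d*)+\s*", "", "call me, ok? 12:30").strip()
print(regex_result)                                  # call meok
print(drop_non_letter_tokens("call me, ok? 12:30"))  # call me, ok?
```

Note that this variant also drops standalone numbers such as 2022 and collapses repeated whitespace into single spaces, which may or may not be what you want.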