I want to remove the words and spaces that appear before a digit in a string. For example, for the string ‘Juice of 1/2’, I want to return ‘1/2’. I tried the following, but it did not work:
string = "Juice of 1/2"
new = string.replace(r"^.+?(?=\d)", "")
Also, I am trying to apply this to every cell in a list of columns using the following code. How would I incorporate the new regex pattern into the existing pattern r"\(|\)|,"?
df[pd.Index(cols2) + "_clean"] = (
df[cols2]
.apply(lambda col: col.str.replace(r"\(|\)|,", "", regex=True))
)
>Solution :
.+? matches any character, not just letters and spaces, so it can also consume digits and the / in 1/2. Since you only want to replace letters and spaces, use [a-z\s]+ instead.
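To see the difference, here is a small sketch that runs three patterns through re.sub() on the example string. The variable names are only for illustration. Note that the ^ anchor in the original pattern limits it to one match at the start of the string; the over-matching shows up once the anchor is dropped.

```python
import re

s = "Juice of 1/2"

# Original pattern, anchored at the start: only one match is possible.
anchored = re.sub(r"^.+?(?=\d)", "", s)

# Same pattern without ^: "." also matches "1" and "/", so "1/" is removed too.
unanchored = re.sub(r".+?(?=\d)", "", s)

# Character class restricted to letters and whitespace (case-insensitive).
letters_only = re.sub(r"[a-z\s]+(?=\d)", "", s, flags=re.I)

# The built-in str.replace() looks for the pattern text literally and finds nothing.
literal = s.replace(r"^.+?(?=\d)", "")

print(anchored)      # 1/2
print(unanchored)    # 2
print(letters_only)  # 1/2
print(literal)       # Juice of 1/2
```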
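You also have to use re.sub() rather than the built-in str.replace(), which only replaces literal substrings. (In pandas, .str.replace() treats the pattern as a regular expression when regex=True is passed, as in your code; regex=True was the default before pandas 2.0.)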
import re
new = re.sub(r'[a-z\s]+(?=\d)', '', string, flags=re.I)
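For the second part of the question, one option is to join the new pattern to the existing one with | so that a single substitution handles both. The sketch below shows this with plain re; PATTERN and clean_cell are hypothetical names, and the second sample string is made up for illustration.

```python
import re

# Letters/whitespace directly before a digit, or a literal "(", ")" or ",".
PATTERN = r"[a-z\s]+(?=\d)|\(|\)|,"

def clean_cell(text: str) -> str:
    """Hypothetical helper: apply the combined pattern to one cell value."""
    return re.sub(PATTERN, "", text, flags=re.I)

print(clean_cell("Juice of 1/2"))         # 1/2
print(clean_cell("(Zest of 2, lemons)"))  # 2 lemons
```

In the pandas code from the question, the same pattern could presumably be passed as col.str.replace(PATTERN, "", regex=True, flags=re.I); that call is not exercised here. Because the pattern is not anchored, it removes letters and spaces before any digit in the cell, not only before the first one.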