python replace regex match without spaces

I basically want to ‘join’ numbers that should clearly go together. I want to replace the regex match with itself but without any spaces.

I have:

df
               a
'Fraxiparine 9 500 IU (anti-Xa)/1 ml'
'Colobreathe 1 662 500 IU inhalačný prášok v tvrdej kapsule'

I want to have:

df
               a
'Fraxiparine 9500 IU (anti-Xa)/1 ml'
'Colobreathe 1662500 IU inhalačný prášok v tvrdej kapsule'

I’m using r'\d+\s+\d+\s*\d+' to match the numbers, and I’ve created the following function to remove the spaces within the string:

def spaces(x):
    match = re.findall(r'\d+\s+\d+\s*\d+', x)
    return match.replace(" ","")

Now I’m having trouble applying that function to the full dataframe, but I also don’t know exactly how to replace the original match with the string without any spaces.

Solution:

Try using the following code:

import re

def spaces(s):
    # Remove every space that sits directly between two digits
    return re.sub(r'(?<=\d) (?=\d)', '', s)

df['a'] = df['a'].apply(spaces)

The regex will match:

  • a single space character
  • preceded by a digit, (?<=\d)
  • and followed by a digit, (?=\d).
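
To see the lookbehind and lookahead at work without a dataframe, here is a minimal sketch using only the standard library. The names DIGIT_GAP, samples and cleaned are made up for this illustration; the two strings are the ones from the question.

```python
import re

# Hypothetical name: a space with a digit right before it and right after it.
# The lookarounds only check for the digits, so the match is the space alone.
DIGIT_GAP = re.compile(r'(?<=\d) (?=\d)')

# The two example values from the question
samples = [
    'Fraxiparine 9 500 IU (anti-Xa)/1 ml',
    'Colobreathe 1 662 500 IU inhalačný prášok v tvrdej kapsule',
]

# Replace each matched space with the empty string
cleaned = [DIGIT_GAP.sub('', s) for s in samples]

for s in cleaned:
    print(s)
```

Spaces such as the one in "500 IU" or "1 ml" are left alone because they do not have a digit on both sides. Note that any two numbers separated by a single space would be joined as well, so this assumes the column never holds something like "10 20" meant as two separate values.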

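Because the lookbehind and lookahead only check that the digits are there without consuming them, re.sub removes just the space. Then pandas.Series.apply calls your function on every value in column a.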

Output:

0   Fraxiparine 9500 IU (anti-Xa)/1 ml
1   Colobreathe 1662500 IU inhalačný prášok v tvrd...
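
If you would rather stay close to the original idea of replacing each match with itself minus the spaces, re.sub also accepts a function as the replacement. The following is only a sketch: the pattern is a generalisation of the one in the question, and the names DIGIT_GROUPS, strip_inner_spaces and join_digit_groups are hypothetical.

```python
import re

# Hypothetical pattern: a run of digits followed by one or more
# whitespace-separated runs of digits, e.g. "1 662 500".
DIGIT_GROUPS = re.compile(r'\d+(?:\s+\d+)+')

def strip_inner_spaces(match):
    # Called once per match; returns the matched text with all whitespace removed
    return re.sub(r'\s+', '', match.group(0))

def join_digit_groups(text):
    # Replace every match with the result of strip_inner_spaces(match)
    return DIGIT_GROUPS.sub(strip_inner_spaces, text)

result = join_digit_groups('Colobreathe 1 662 500 IU inhalačný prášok v tvrdej kapsule')
print(result)
```

This version can be passed to apply in the same way, df['a'].apply(join_digit_groups), and unlike the single-space pattern it also collapses tabs or repeated spaces between digit groups.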