string.replace deletes whole string in certain cases

I have a data frame in which I convert a float64 column to a string and then strip the trailing .0 from each value. This works for most values, but for 50.0 the entire string is deleted, leaving me with a null value. Any ideas what could cause this? Below are the two transformations I apply to the data frame:

Dataframe['Column'] = Dataframe['Column'].astype('string')
Dataframe['Column'] = Dataframe['Column'].str.replace('.0','')

From the values I’ve checked, it only happens to some rows, not all. For example, rows with the values 50.0, 50.0, 49.0, 39.0 come out of the transformation above as: , , 49, 39


Solution:

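In pandas versions before 2.0, str.replace treats the pattern as a regular expression by default (pandas 2.0 changed the default to regex=False). As a regex, .0 means "any character followed by 0". In 50.0 that matches both 50 and .0, so the whole string is removed and only an empty string is left. In 49.0 and 39.0 only the trailing .0 matches, which is why those values come out as expected.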
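To see the matching behaviour in isolation, here is a small sketch using Python's standard re module rather than pandas. It assumes only that the pattern is interpreted as a regex, which is what str.replace does when regex=True; the variable names are illustrative.

```python
import re

values = ["50.0", "49.0", "39.0"]

# What the unescaped pattern ".0" matches in each value
matches = {v: re.findall(".0", v) for v in values}

# What is left after removing every match
results = {v: re.sub(".0", "", v) for v in values}

for v in values:
    print(v, matches[v], repr(results[v]))
# 50.0 ['50', '.0'] ''
# 49.0 ['.0'] '49'
# 39.0 ['.0'] '39'
```

Only 50.0 contains a second "any character followed by 0" pair in front of the decimal point, so it is the only value that disappears completely.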

You should either use a non-regex (literal) replacement:

import pandas as pd
Dataframe = pd.DataFrame({'Column': ['123.0', '50.0']})
Dataframe['Column'].str.replace('.0', '', regex=False)

or a regex that escapes the dot and anchors the match to the end of the string:

Dataframe['Column'].str.replace(r'\.0$', '', regex=True)

output:

0    123
1     50
Name: Column, dtype: object
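The two fixes are not fully equivalent. The sketch below starts from a float64 column, as in the question, and adds one made-up value (10.05) to show the difference: the literal replacement removes .0 wherever it occurs, while the anchored regex only removes it at the end of the string. Passing regex explicitly keeps the behaviour the same regardless of the pandas default.

```python
import pandas as pd

# 10.05 is a hypothetical extra value, added only to show the difference
df = pd.DataFrame({"Column": [50.0, 49.0, 10.05]})
as_text = df["Column"].astype("string")

# Literal replacement: removes ".0" anywhere in the string
literal = as_text.str.replace(".0", "", regex=False).tolist()

# Anchored regex: removes ".0" only when it ends the string
anchored = as_text.str.replace(r"\.0$", "", regex=True).tolist()

print(literal)   # ['50', '49', '105']
print(anchored)  # ['50', '49', '10.05']
```

If the column can contain values with a genuine fractional part, the anchored regex is likely the safer choice.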