How to transform the type of a column from object to float64?

I have downloaded a dataset with my financial results from my stockbroker.
One column contains the results for each trade:

import pandas as pd

df = pd.DataFrame({'Profit': ['-$315.00', '$605.00', '$680.00', '-$140.00 ']})

I want to change the type of this column from object to float64.
I tried to remove the '$' sign and change ',' to '.':

df['Profit'] = df['Profit'].str.replace('$', '')
df['Profit'] = df['Profit'].str.replace(',', '.')

Then, when I try to convert the column from object to float64, I get the error: "ValueError: could not convert string to float: '-1\xa0000.00 '"

Here is the code I use for the conversion:

df['Profit'] = df['Profit'].astype('float64')

I also tried adding a rounding step first, but the same error occurs.
Here is the additional code I use:

df = df.round(decimals=0)
df['Profit'] = df['Profit'].astype('float64')

These lines work on synthetic data like the example above, but not on my CSV data, even though the values look identical (copied and pasted).

How can I convert this object column to float64 without getting the error "ValueError: could not convert string to float: '-1\xa0000.00 '"?

>Solution :

\xa0 is a non-breaking space (used as a thousands separator here), which is not removed by your replace calls.
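One way to confirm which hidden characters are present is to list every character in the column that is not part of a plain number. This is only a minimal sketch: the sample values are assumed from the error message, and the names profit and unexpected are illustrative.

```python
import pandas as pd

# Sample values assumed from the question and its error message.
profit = pd.Series(['-$1\xa0000.00', '$605.00', '$680.00', '-$140.00 '], name='Profit')

# Collect every distinct character that is not a digit, dot, or minus sign.
unexpected = sorted(set(''.join(profit)) - set('0123456789.-'))

# ascii() makes invisible characters such as \xa0 visible as escape sequences.
print([ascii(ch) for ch in unexpected])
```

Running the same check on the real CSV column should show whether anything besides $ and \xa0 needs to be stripped.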

A safer approach might be to remove every character that is not a digit, a dot, or a minus sign:

df = pd.DataFrame({'Profit': ['-$1\xa0000.00', '$605.00', '$680.00', '-$140.00 ']})

df['Profit'].str.replace(r'[^\d.-]', '', regex=True).astype(float)

To remove only $ and \xa0:

df['Profit'].str.replace('[$\xa0]', '', regex=True).astype(float)

Output:

0   -1000.0
1     605.0
2     680.0
3    -140.0
Name: Profit, dtype: float64
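
If the real CSV might contain other unexpected values, astype(float) will still raise on the first one. A possible alternative, sketched here under the assumption that a stray entry such as 'n/a' could appear (it is not in the original data), is pd.to_numeric with errors='coerce', which turns unparseable values into NaN so the offending rows can be inspected:

```python
import pandas as pd

# 'n/a' is a hypothetical bad value added for illustration only.
df = pd.DataFrame({'Profit': ['-$1\xa0000.00', '$605.00', 'n/a', '-$140.00 ']})

# Strip the dollar sign and the non-breaking space, then trim surrounding whitespace.
cleaned = df['Profit'].str.replace(r'[$\xa0]', '', regex=True).str.strip()

# Values that cannot be parsed become NaN instead of raising ValueError.
numeric = pd.to_numeric(cleaned, errors='coerce')

# Show the original text of the rows that failed to convert.
bad_rows = df.loc[numeric.isna(), 'Profit']
print(numeric)
print(bad_rows)
```

This keeps the conversion from failing outright, at the cost of silently producing NaN, so checking bad_rows before continuing is advisable.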