Removing letters from my numeric columns doesn't work

I have an x_train like this (all the columns are object dtype):

a     b    c
1      2   523f
2     45   52A
3     32    95
4    245    84A
5     86    42
6      7    52
7     45    31
7a    45    712
8b    53    62
194v  34    3

Y_train contains only 0 and 1. I tried to use RF.fit(x_train, Y_train) but I got an error:

could not convert string to float: 7a

I want to keep only the numeric values and remove the letters, so I tried something like:

x_train = re.findall(r'\d+\d+', x['a'])

but it doesn’t work. How can I fix this?

Solution:

Assuming every value is an integer once the letters are removed, you can use this for any column that contains non-numeric characters:

df[col] = df[col].str.replace(r'\D', '', regex=True).astype(int)
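
For reference, here is a minimal sketch of that one-liner applied to every column of the sample data from the question. The DataFrame is rebuilt by hand from the table above, and the loop over `x_train.columns` assumes every column needs cleaning. The original attempt likely failed because `re.findall` expects a single string rather than a pandas Series, which is why this uses the vectorized `.str.replace` instead.

```python
import pandas as pd

# Sample data rebuilt from the question; every value is a string (object dtype).
x_train = pd.DataFrame({
    "a": ["1", "2", "3", "4", "5", "6", "7", "7a", "8b", "194v"],
    "b": ["2", "45", "32", "245", "86", "7", "45", "45", "53", "34"],
    "c": ["523f", "52A", "95", "84A", "42", "52", "31", "712", "62", "3"],
})

# Remove every non-digit character (\D), then cast the column to int.
for col in x_train.columns:
    x_train[col] = x_train[col].str.replace(r"\D", "", regex=True).astype(int)

print(x_train.dtypes)
print(x_train["a"].tolist())
```

Two caveats: `\D` also strips minus signs and decimal points, so this only suits non-negative integers. A value with no digits at all becomes an empty string, and `astype(int)` then raises a `ValueError`.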