Problem with changing column category – could not convert string to float

I wanted to change the column type to category with the following code:

df["Geography"] = df["Geography"].astype("category")

Then I set up a random forest classifier as follows:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X = df.drop('target', axis=1)
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=123, stratify=y)

forest = RandomForestClassifier(n_estimators=500, random_state=1)

And when fitting the algorithm:


forest.fit(X_train, y_train)

The following error occurs:

could not convert string to float: 'Spain'

'Spain' is one of the values in the Geography column, which I converted to a categorical type. Why do I still get this error?

>Solution :

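Casting the column to "category" only changes its dtype. The values are still the country names, so scikit-learn still sees the string 'Spain' when it tries to convert the features to floats. If you need the categories as numbers, use the integer category codes instead:

df["Geography"] = df["Geography"].astype("category").cat.codes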
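
To make the difference concrete, here is a minimal sketch on a small made-up DataFrame (the rows and the "Age" column are assumptions for illustration, not the asker's data). It shows that a categorical column still holds string labels, and two common ways to turn it into numbers: integer codes via `.cat.codes`, and one-hot columns via `pd.get_dummies`. One-hot encoding is offered here as an alternative, since integer codes suggest an ordering between countries that may not be meaningful.

```python
import pandas as pd

# Hypothetical toy data standing in for the DataFrame in the question.
df = pd.DataFrame({
    "Geography": ["Spain", "France", "Germany", "Spain"],
    "Age": [42, 35, 51, 29],
    "target": [1, 0, 0, 1],
})

# Casting to "category" changes the dtype, but the labels are still strings.
as_category = df["Geography"].astype("category")
labels = list(as_category.cat.categories)

# Option 1: integer codes. Categories are sorted alphabetically by default,
# so each country maps to its position in `labels`.
codes = as_category.cat.codes

# Option 2: one-hot encoding, which avoids implying an order between countries.
X_onehot = pd.get_dummies(df.drop("target", axis=1), columns=["Geography"])

print(labels)
print(codes.tolist())
print(list(X_onehot.columns))
```

Either the coded column or the one-hot frame contains only numeric values, so it should be usable as the feature matrix passed to `train_test_split` and `forest.fit`.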