
Got lower accuracy while training Random Forest with important features

I am using Random Forest for binary classification.

It gives me 85% accuracy when trained with all 10 features.

After training, I visualized the feature importances. They show that 2 features are really important.


So I chose only the two important features and trained the RF (with the same setup), but the accuracy decreased (to 0.70).

Does this happen? I was expecting higher accuracy.

What can I do to get better accuracy in this case?

Thanks

>Solution :

The general rule of thumb when using random forests is to include all observable data. The reason is that, a priori, we don't know which features influence the response. Just because you found that only a handful of features are strong influencers does not mean that the remaining features play no role in the model.

So, you should stick with including all features when training your random forest model. Features that carry little signal are rarely selected for splits, so they contribute little to the final predictions anyway. You typically do not need to remediate manually by removing features before training.
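To see this effect concretely, here is a minimal sketch (assuming scikit-learn, with synthetic data standing in for your dataset) that trains a random forest on all 10 features, reads the feature importances, then retrains on only the top 2 with the same setup:

```python
# Sketch: compare accuracy with all features vs. only the top-2 by importance.
# Synthetic data is an assumption standing in for the asker's real dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Binary classification with 10 features, several of which are informative
X, y = make_classification(n_samples=1000, n_features=10, n_informative=4,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Train on all 10 features
rf_all = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
acc_all = accuracy_score(y_te, rf_all.predict(X_te))

# Keep only the two most important features and retrain with the same setup
top2 = np.argsort(rf_all.feature_importances_)[-2:]
rf_top2 = RandomForestClassifier(random_state=0).fit(X_tr[:, top2], y_tr)
acc_top2 = accuracy_score(y_te, rf_top2.predict(X_te[:, top2]))

print(f"all features: {acc_all:.2f}, top-2 only: {acc_top2:.2f}")
```

On data where more than two features carry signal, the top-2 model will usually score lower, which matches the accuracy drop you observed.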
