NotFittedError: This DecisionTreeClassifier instance is not fitted yet

I am new to ML and am trying to run a decision-tree-based model.

I tried the following:

X = df[['Quantity']]
y = df[['label']]
params = {'max_depth':[2,3,4], 'min_samples_split':[2,3,5,10]}
clf_dt = DecisionTreeClassifier()
clf = GridSearchCV(clf_dt, param_grid=params, scoring='f1')
clf.fit(X, y)
clf_dt = DecisionTreeClassifier(clf.best_params_)

And got the following warning:

FutureWarning: Pass criterion={'max_depth': 2, 'min_samples_split': 2} as keyword args. From version 1.0 (renaming of 0.25) passing these as positional arguments will result in an error
  warnings.warn(f"Pass {args_msg} as keyword args. From version "

Later, I ran the following and got an error, even though I had already fitted the model using .fit():

from sklearn import tree
tree.plot_tree(clf_dt, filled=True, feature_names = list(X.columns), class_names=['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])

NotFittedError: This DecisionTreeClassifier instance is not fitted yet. Call 
'fit' with appropriate arguments before using this estimator.

Can you help me fix this error?

>Solution :

You are facing two separate problems.

Firstly

Referring to

FutureWarning: Pass criterion={'max_depth': 2, 'min_samples_split': 2} as keyword args. From version 1.0 (renaming of 0.25) passing these as positional arguments will result in an error

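This warning does not come from GridSearchCV or from params; the params dict is fine as written. It comes from the last line of your first snippet, where the best_params_ dict is passed as a single positional argument. The first positional parameter of DecisionTreeClassifier is criterion, so the whole dict is taken as the criterion value, which is exactly what the warning text shows. Use dictionary unpacking with the ** operator so that each entry is passed as a keyword argument:

clf_dt = DecisionTreeClassifier(**clf.best_params_)

Whether you write params as a dict literal or with the dict constructor makes no difference here; both forms are equivalent:

params = dict(max_depth=[2,3,4], min_samples_split=[2,3,5,10])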
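The unpacking fix can be sketched end to end as follows. This is a minimal sketch under stated assumptions: the iris dataset stands in for the asker's df, and scoring='f1_macro' replaces 'f1' because iris has three classes, while plain 'f1' is meant for binary labels.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): iris replaces the asker's df.
X, y = load_iris(return_X_y=True)

params = {'max_depth': [2, 3, 4], 'min_samples_split': [2, 3, 5, 10]}

# 'f1_macro' is an assumption for this multiclass stand-in dataset.
clf = GridSearchCV(DecisionTreeClassifier(random_state=0),
                   param_grid=params, scoring='f1_macro')
clf.fit(X, y)

# ** unpacks the dict into keyword arguments (max_depth=..., min_samples_split=...),
# so nothing is passed positionally and criterion keeps its default.
clf_dt = DecisionTreeClassifier(**clf.best_params_)

# The hyperparameters now sit on the intended attributes.
print(clf_dt.max_depth, clf_dt.min_samples_split, clf_dt.criterion)

# clf_dt is still a new, unfitted estimator until fit is called on it.
clf_dt.fit(X, y)
```

Note that unpacking only removes the warning. The new estimator still has to be fitted before it can be plotted.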

Secondly

Referring to

NotFittedError: This DecisionTreeClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

In sklearn, an estimator must be fitted before it can be used for prediction or plotting. As you said, you did call fit in your first code example, but you called it on the GridSearchCV object clf. Your problem is that with

clf_dt = DecisionTreeClassifier(clf.best_params_)

you instantiate a new DecisionTreeClassifier object, which is never fitted, so it is still unfitted when you call

tree.plot_tree(clf_dt ...)
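One way to confirm this diagnosis is to ask sklearn directly whether an estimator is fitted. The sketch below uses check_is_fitted from sklearn.utils.validation; the helper name is_fitted and the iris stand-in data are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.exceptions import NotFittedError
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.validation import check_is_fitted

# Stand-in data (assumption): iris replaces the asker's df.
X, y = load_iris(return_X_y=True)


def is_fitted(estimator):
    """Hypothetical helper: True if sklearn considers the estimator fitted."""
    try:
        check_is_fitted(estimator)
        return True
    except NotFittedError:
        return False


fresh = DecisionTreeClassifier(max_depth=2)  # constructed, never fitted
before = is_fitted(fresh)

fresh.fit(X, y)
after = is_fitted(fresh)

print(before, after)
```

Constructing an estimator only stores its hyperparameters; the learned tree exists only after fit has run on that same object.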

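When you call

clf = GridSearchCV(clf_dt, param_grid=params, scoring='f1')
clf.fit(X, y)

sklearn refits a tree with the best parameter combination on the whole dataset (refit=True is the default) and stores that fitted tree in clf.best_estimator_. Pass that attribute to plot_tree instead of clf_dt:

tree.plot_tree(clf.best_estimator_, filled=True, feature_names=list(X.columns))

The following step clf_dt = DecisionTreeClassifier(clf.best_params_) isn't necessary.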
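A minimal sketch of reusing the tree that the grid search has already fitted, which GridSearchCV exposes as clf.best_estimator_ when refit=True (the default). The iris dataset and scoring='f1_macro' are assumptions standing in for the asker's data and binary 'f1' scoring. export_text is used here so that the sketch runs without matplotlib; tree.plot_tree accepts the same fitted estimator.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in data (assumption): iris replaces the asker's df.
iris = load_iris()
X, y = iris.data, iris.target

params = {'max_depth': [2, 3, 4], 'min_samples_split': [2, 3, 5, 10]}
clf = GridSearchCV(DecisionTreeClassifier(random_state=0),
                   param_grid=params, scoring='f1_macro')
clf.fit(X, y)

# With refit=True (the default), the best parameter combination is refitted
# on all of X, y and stored here as an already-fitted tree.
best_tree = clf.best_estimator_

# Text rendering of the fitted tree. For a figure, the same estimator can be
# passed to tree.plot_tree(best_tree, filled=True, feature_names=...).
print(export_text(best_tree, feature_names=list(iris.feature_names)))
```

Because best_tree is the fitted object itself, no second DecisionTreeClassifier has to be constructed or fitted.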
