Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Why is cross_val_score not producing consistent results?

When this code executes the results are not consistent.
Where is the randomness coming from?

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

seed = 42
iris = datasets.load_iris()
X = iris.data
y = iris.target

pipeline = Pipeline([('std', StandardScaler()), 
                     ('pca', PCA(n_components = 4)), 
                     ('Decision_tree', DecisionTreeClassifier())], 
                    verbose = False)

kfold = KFold(n_splits = 10, random_state = seed, shuffle = True)
results = cross_val_score(pipeline, X, y, cv = kfold)
print(results.mean())


0.9466666666666667
0.9266666666666665
0.9466666666666667
0.9400000000000001
0.9266666666666665

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

DecisionTreeClassifier does not use all columns, but by default the sqrt of the number of columns for each split. You assigned the seed to KFold, but not to DecisionTreeClassifier. So different columns will be selected each run. PCA also accepts a random state.

See DecisionTreeClassifier and PCA

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading