Sklearn model, The truth value of an array with more than one element is ambiguous error

I have been learning about decision trees and how to build them in sklearn. Every attempt so far has ended in a ValueError that reads

"The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
Here is the full error:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_15136/2104431115.py in <module>
      2 dt = DecisionTreeRegressor(max_depth= 5, random_state= 1, min_samples_leaf=.1)
      3 dt.fit(x_train.reshape(-1,1), y_train.reshape(-1,1))
----> 4 y_pred = dt.predict(x_test, y_test)

~\anaconda3\lib\site-packages\sklearn\tree\_classes.py in predict(self, X, check_input)
    465         """
    466         check_is_fitted(self)
--> 467         X = self._validate_X_predict(X, check_input)
    468         proba = self.tree_.predict(X)
    469         n_samples = X.shape[0]

~\anaconda3\lib\site-packages\sklearn\tree\_classes.py in _validate_X_predict(self, X, check_input)
    430     def _validate_X_predict(self, X, check_input):
    431         """Validate the training data on predict (probabilities)."""
--> 432         if check_input:
    433             X = self._validate_data(X, dtype=DTYPE, accept_sparse="csr", reset=False)
    434             if issparse(X) and (

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

And here is all of my code so far for this model:

x = np.array(bat[["TB_x"]])
y = np.array(bat[["TB_y"]])

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size= .2, random_state= 1)
dt = DecisionTreeRegressor(max_depth= 5, random_state= 1, min_samples_leaf=.1)
dt.fit(x_train.reshape(-1,1), y_train.reshape(-1,1))
y_pred = dt.predict(x_test, y_test)

Originally I was getting an error saying that a 2D array was expected but a 1D array was received. I solved that by using reshape, but now I get this ValueError, which I do not understand.

>Solution :

This is a slight misunderstanding about how the predict function works. Conceptually, if you are trying to predict something, why would you need to pass in the expected labels?

For a DecisionTreeRegressor, the signature of predict is predict(X, check_input=True). As with other sklearn estimators, you only pass in the features, not the expected labels.

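You are calling y_pred = dt.predict(x_test, y_test), but the second argument that predict expects is just a boolean that lets you disable some sanity checks on x_test. Your y_test array ends up bound to check_input, and the line "if check_input:" in your traceback then tries to evaluate a whole array as a single True/False value, which is exactly what NumPy refuses to do.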
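To see where the message comes from, the sketch below reproduces it with NumPy alone. The labels array is a made-up stand-in for y_test, and truthiness_error is a hypothetical helper written for this illustration, not part of sklearn. The only point is that "if some_array:" raises this ValueError when the array holds more than one element.

```python
import numpy as np

# Made-up stand-in for y_test: any array with more than one element will do.
labels = np.array([[3.0], [7.0], [1.0]])


def truthiness_error(value):
    """Return the message raised by `if value:`, or None if no error occurs.

    Hypothetical helper for this illustration only.
    """
    try:
        if value:  # same kind of check as `if check_input:` in the traceback
            pass
    except ValueError as exc:
        return str(exc)
    return None


message = truthiness_error(labels)
print(message)

# A plain boolean, which is what check_input is meant to be, works fine.
print(truthiness_error(True))
```

The first print shows the same "truth value of an array" message as the traceback; the second prints None because a real boolean has an unambiguous truth value.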

You just need to do the following instead:

y_pred = dt.predict(x_test)
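For context, here is a minimal end-to-end sketch of the corrected flow. The data is synthetic (the bat DataFrame from the question is not available), so the numbers are assumptions for illustration only. The scoring step with mean_squared_error is also an addition, included just to show where y_test is actually used: after predicting, not during.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for bat[["TB_x"]] and bat[["TB_y"]] (assumed data).
rng = np.random.default_rng(1)
x = rng.uniform(0, 300, size=(200, 1))            # 2D: (n_samples, n_features)
y = 0.8 * x[:, 0] + rng.normal(0, 20, size=200)   # 1D target

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=1
)

dt = DecisionTreeRegressor(max_depth=5, random_state=1, min_samples_leaf=0.1)
dt.fit(x_train, y_train)

# Pass only the features; check_input keeps its default of True.
y_pred = dt.predict(x_test)

# The held-out labels come in afterwards, when scoring the predictions.
mse = mean_squared_error(y_test, y_pred)
print(y_pred.shape, mse)
```

Because x is already two-dimensional here, no reshape is needed before fit or predict.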

You can refer to the sklearn documentation for DecisionTreeRegressor for more information.
