I am working with the Titanic dataset and did some basic preprocessing (normalization, one-hot encoding, etc.).
Then I tried to use an H2O algorithm and got the following error:
from h2o.estimators.gbm import H2OGradientBoostingEstimator

classifier = H2OGradientBoostingEstimator(nfolds=5,
                                          ntrees=15,
                                          seed=42,
                                          max_depth=4)
classifier.train(predictors, target, training_frame=train_data)
H2OTypeError: Argument `x` should be a None | integer | string | ModelBase | list(string | integer) | set(integer | string), got H2OFrame
My target is train_data["Survived"].asfactor()
I also tried reading the data into an H2OFrame directly from a file, instead of converting the preprocessed DataFrame, but to no avail.
Any ideas would be appreciated.
>Solution :
It seems to me that you are passing a frame instead of a list of column names.
Both x and y are supposed to be "pointers" to columns in the training_frame.
If you want to use all columns as predictors (all except the target), you can specify just the y parameter.
Something like the following should do the trick:
train_data["Survived"] = train_data["Survived"].asfactor()

classifier = H2OGradientBoostingEstimator(nfolds=5,
                                          ntrees=15,
                                          seed=42,
                                          max_depth=4)
classifier.train(x=["pclass", "sex", "age", ...], y="Survived", training_frame=train_data)
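
If you would rather not type the predictor list by hand, you can build it from the frame's column names. Below is a minimal sketch of that idea; the helper name predictor_names and the sample column list are my own assumptions for illustration (your preprocessed frame will have different columns), and it relies on train_data.columns giving you the column names as a list.

```python
def predictor_names(columns, target):
    """Return every column name except the target, preserving order."""
    return [name for name in columns if name != target]


# Stand-in for train_data.columns (hypothetical names, not your real frame).
columns = ["Survived", "pclass", "sex", "age"]
predictors = predictor_names(columns, "Survived")
print(predictors)

# With your H2OFrame the call would then take names, not frames:
# classifier.train(x=predictor_names(train_data.columns, "Survived"),
#                  y="Survived", training_frame=train_data)
```

Either way, the point is that x is a list of strings and y is a single string, both naming columns that exist in training_frame.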