Why does xgboost prediction have lower AUC than evaluation of same data in eval_set?

I am training a binary classifier and I want to know its AUC on a test set. I thought there were two equivalent ways to do this: 1) pass the test set to the eval_set parameter and read the corresponding AUC value for each boosting round from model.evals_result(); 2) after training, make a prediction for the test set and calculate the AUC of that prediction. I expected these methods to produce similar values, but the latter (calculating the AUC of a prediction) consistently produces much lower ones. Can you help me understand what is going on? I must have misunderstood the function of eval_set.

Here is a fully reproducible example using a Kaggle dataset (linked in the code comment below):

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import RocCurveDisplay, roc_curve, auc
from xgboost import XGBClassifier  # xgboost version 1.7.6
import matplotlib.pyplot as plt

# Data available on kaggle here https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009/
data = pd.read_csv('winequality-red.csv')
data.head()

# Separate targets
X = data.drop('quality', axis=1)
y = data['quality'].map(lambda x: 1 if x >= 7 else 0)  # wine quality >= 7 is good, the rest is not

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create model and fit
params = {
    'eval_metric':'auc',
    'objective':'binary:logistic'
}
model = XGBClassifier(**params)
model.fit(
    X_train, 
    y_train,
    eval_set=[(X_test, y_test)]
)

First, I visualize the AUC values produced by evaluating the test set provided in eval_set:


results = model.evals_result()
auc_history = results['validation_0']['auc']  # one AUC value per boosting round
plt.plot(np.arange(len(auc_history)), auc_history)
plt.title("AUC from eval_set")
plt.xlabel("Estimator (boosting round)")
plt.ylabel("AUC")

[Plot: AUC from eval_set at each boosting round]
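For reference, the AUC at the final boosting round can be printed directly from the same results dictionary:

# AUC recorded by eval_set at the last boosting round
print(results['validation_0']['auc'][-1])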

Next, I make a prediction on the same test set, get the AUC, and visualize the ROC curve:

test_predictions = model.predict(X_test)
fpr, tpr, thresholds = roc_curve(y_true=y_test, y_score=test_predictions, pos_label=1)
roc_auc = auc(fpr, tpr)
display = RocCurveDisplay(roc_auc=roc_auc, fpr=fpr, tpr=tpr)
display.plot()

[Plot: ROC curve computed from predict() class labels, with AUC = 0.81]

As you can see, the AUC of the prediction is 0.81, which is lower than every AUC calculated from evaluating the same test set in eval_set. How have I misunderstood the two methods? Thanks; xgboost is new to me and I appreciate your advice.

Solution:

XGBoost’s evals_result() computes the AUC values in your first graph from the predicted probabilities, i.e. the same values predict_proba returns. By using predict, you are getting hard 0/1 class labels instead of probabilities; a ROC curve built from binary labels has only one real operating point (the 0.5 decision threshold), so its AUC is systematically lower. Hence the difference you are observing.
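As a quick check (a minimal sketch, assuming xgboost's default 0.5 decision threshold for binary classification), you can verify that predict is just predict_proba with a hard cutoff:

import numpy as np

labels = model.predict(X_test)             # hard 0/1 class labels
probs = model.predict_proba(X_test)[:, 1]  # probability of the positive class
# predict() should agree with thresholding the probabilities at 0.5
print(np.array_equal(labels, (probs > 0.5).astype(int)))  # expected: True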

You should use predict_proba instead of predict:

test_probabs = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_true=y_test, y_score=test_probabs, pos_label=1)
roc_auc = auc(fpr, tpr)
display = RocCurveDisplay(roc_auc=roc_auc, fpr=fpr, tpr=tpr)
display.plot()
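As a sanity check (using the evals_result() structure shown earlier), the AUC computed from predict_proba should now match the value recorded for the final boosting round in eval_set; sklearn's roc_auc_score gives the same number in one line:

from sklearn.metrics import roc_auc_score

# One-liner AUC from the predicted probabilities
print(roc_auc_score(y_test, test_probabs))
# AUC recorded by eval_set at the last boosting round
print(results['validation_0']['auc'][-1])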

Output:

[Plot: ROC curve computed from predict_proba probabilities, with a higher AUC than before]
