Machine Learning: Logistic Regression Question

July 13, 2023

I’m working on my homework for my machine learning & data mining course. I’m getting the following error when trying to train and evaluate my models, starting with logistic regression. I’m not even sure I’m asking the right question (sorry!). Can someone help, or explain what exactly I’m doing wrong?

Here is my code:

# Create empty lists to store the metrics of the different classifiers
model_name = []
Accuracy = []
Precision = []
Recall = []
F1_score = []
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=5)
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

logreg = LogisticRegression()  # using logistic regression
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))

for model in Our_models:
    model.fit(X_train,y_train)
    y_pred = model.predict(X_test)
    # Get the name of the model
    model_name = model.__class__.__name__
    Print_metrics(y_test,y_pred,model_name)
    print(model)

Getting this error:

    Model name:  DecisionTreeClassifier
    accuracy 0.95
    recall 0.95
    precision 0.9545454545454546
    F1_score 0.949874686716792
    Area Under the Curve 0.95
    /usr/local/lib/python3.10/dist-packages/sklearn/metrics/_classification.py:1396: UserWarning: Note that pos_label (set to 'positive') is ignored when average != 'binary' (got 'macro'). You may use labels=[pos_label] to specify a single positive class.
      warnings.warn(
    /usr/local/lib/python3.10/dist-packages/sklearn/metrics/_classification.py:1396: UserWarning: Note that pos_label (set to 'positive') is ignored when average != 'binary' (got 'macro'). You may use labels=[pos_label] to specify a single positive class.
      warnings.warn(
    /usr/local/lib/python3.10/dist-packages/sklearn/metrics/_classification.py:1396: UserWarning: Note that pos_label (set to 'positive') is ignored when average != 'binary' (got 'macro'). You may use labels=[pos_label] to specify a single positive class.
      warnings.warn(
    ---------------------------------------------------------------------------
    NameError                                 Traceback (most recent call last)
    <ipython-input-146-5fd04e4709fc> in <cell line: 1>()
          4     # Get the name of the model
          5     model_name = model.__class__.__name__
    ----> 6     Print_metrics(y_test,y_pred,model_name)
          7     print(model)

    <ipython-input-64-66701b741e6b> in Print_metrics(y_test, y_pred, model_name)
         18 
         19     # Calculate confusion matrix
    ---> 20     print("Confusion Matrix", confusion_matrix(y_test, y_pred,label, average= 'macro'))

    NameError: name 'label' is not defined

Sorry, but I’m not sure what to put here because I’m a ball lost in high weeds right now. I’ve been searching Google which hasn’t been much help. I’m only posting here because I am at a complete loss.

>Solution :

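The NameError means that label is used inside Print_metrics without ever being defined. You’re passing it to confusion_matrix as an argument, but no variable with that name exists in the function or anywhere else in your notebook.

To resolve this, either define the variable and pass it through the labels keyword of confusion_matrix, or remove it if you don’t need a specific class order. Note also that confusion_matrix does not accept an average argument, so average='macro' has to be removed from that call too; otherwise you would hit a TypeError as soon as the NameError is fixed.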
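As a minimal sketch of both options, assuming a toy set of string labels (the y_true, y_pred, and labels values below are made up for illustration and are not your data), confusion_matrix can be called with or without an explicit class list:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels for illustration only
y_true = ["positive", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]

# Option 1: leave the class list out; classes are inferred and sorted,
# so the row/column order here is ["negative", "positive"]
cm_default = confusion_matrix(y_true, y_pred)

# Option 2: define the class list and pass it by keyword as `labels`
# to control the row/column order explicitly
labels = ["positive", "negative"]
cm_ordered = confusion_matrix(y_true, y_pred, labels=labels)

print(cm_default)
print(cm_ordered)
```

Rows are the true classes and columns are the predicted classes, so the two matrices hold the same counts in a different order.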

You can modify your code as follows:

from sklearn import metrics
from sklearn.metrics import confusion_matrix, classification_report

def Print_metrics(y_test, y_pred, model_name):
    # Calculate confusion matrix
    print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))

    # Calculate classification report
    print("Classification Report:\n", classification_report(y_test, y_pred))

    # Print other metrics
    accuracy = metrics.accuracy_score(y_test, y_pred)
    recall = metrics.recall_score(y_test, y_pred, average='macro')
    precision = metrics.precision_score(y_test, y_pred, average='macro')
    f1_score = metrics.f1_score(y_test, y_pred, average='macro')

    print("Model name:", model_name)
    print("Accuracy:", accuracy)
    print("Recall:", recall)
    print("Precision:", precision)
    print("F1 Score:", f1_score)
    print("")

# Rest of your code...

for model in Our_models:
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Get the name of the model
    model_name = model.__class__.__name__

    Print_metrics(y_test, y_pred, model_name)
    print(model)

In the code above, I’ve added classification_report, which prints per-class precision, recall, and F1 score in a single table. Both the undefined label argument and average='macro' are removed from the confusion_matrix call: the function has no average parameter, and its optional class list (the labels keyword) is only needed if you want a specific class order. The remaining metrics are computed with the metrics module from scikit-learn using the macro average. Let me know if this works for you.
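A side note on the UserWarning lines in your output: they say that pos_label='positive' was passed together with average='macro', where it is ignored. The sketch below, again on made-up toy labels rather than your data, shows the difference between the two averaging modes for recall_score; the same applies to precision_score and f1_score.

```python
from sklearn.metrics import recall_score

# Hypothetical toy labels for illustration only
y_true = ["positive", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]

# Macro average: unweighted mean of the per-class recalls,
# so no pos_label is needed (and passing one triggers the warning)
macro_recall = recall_score(y_true, y_pred, average="macro")

# Binary average: recall of the single class named by pos_label
binary_recall = recall_score(
    y_true, y_pred, average="binary", pos_label="positive"
)

print("macro recall:", macro_recall)
print("binary recall:", binary_recall)
```

If you want macro-averaged scores, dropping pos_label (as the rewritten Print_metrics does) should make the warnings go away; if you want the score for the positive class only, use average='binary' with pos_label instead.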