Plot Multinomial and One-vs-Rest Logistic Regression in Scikit Learn

Last Updated : 30 Jun, 2023

Logistic regression is a popular classification algorithm that predicts the probability of each class of a binary or multi-class target variable. For targets with more than two classes, scikit-learn's LogisticRegression supports two strategies: Multinomial logistic regression, which fits a single model over all classes jointly, and One-vs-Rest (OvR) logistic regression, which fits one binary model per class. Both strategies handle multi-class targets; they differ in how the class probabilities are modeled, not in how many classes they can handle.

  1. Multinomial Logistic Regression: A single model is trained on all classes at once. It uses the softmax function to produce a probability for each class and selects the class with the highest probability as the predicted class.
  2. One-vs-Rest Logistic Regression: One binary logistic regression model is trained for each class, with that class as the positive class and all other classes as the negative class. Every binary model scores the sample, and the class whose model gives the highest probability is selected as the predicted class.
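
The difference between the two strategies can be sketched with plain NumPy. The snippet below is only an illustration: the `scores` array holds made-up linear scores for one sample, and the helper names `softmax_probabilities` and `ovr_probabilities` are hypothetical, not part of scikit-learn. In practice the two models also learn different scores; the same values are reused here only to compare the two probability rules.

```python
import numpy as np

def softmax_probabilities(scores):
    """Multinomial: one joint softmax over all class scores."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def ovr_probabilities(scores):
    """One-vs-Rest: an independent sigmoid per class, renormalised to sum to 1."""
    sigmoids = 1.0 / (1.0 + np.exp(-scores))
    return sigmoids / sigmoids.sum(axis=1, keepdims=True)

# Hypothetical linear scores for one sample and three classes
scores = np.array([[2.0, 0.5, -1.0]])

p_multi = softmax_probabilities(scores)
p_ovr = ovr_probabilities(scores)

# Both rules pick the class with the highest probability
print("softmax:", p_multi.round(3), "->", p_multi.argmax(axis=1))
print("one-vs-rest:", p_ovr.round(3), "->", p_ovr.argmax(axis=1))
```

Both rules return probabilities that sum to 1, but the softmax couples the classes in a single model, while One-vs-Rest scores each class separately before normalising.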

Multinomial and One-vs-Rest Logistic Regression

The following example trains Multinomial and One-vs-Rest logistic regression models in scikit-learn on the Iris dataset, evaluates them on a held-out test set, and plots their confusion matrices side by side:

Python3




# import libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
import numpy as np
 
# load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target
 
# split the data into training and testing sets
X_train, X_test,\
y_train, y_test = train_test_split(X, y,
                                   test_size=0.2,
                                   random_state=42)
 
# create a Multinomial logistic regression model
multi_logreg = LogisticRegression(multi_class='multinomial',
                                  solver='lbfgs')
multi_logreg.fit(X_train, y_train)
 
# create a One-vs-Rest logistic regression model
ovr_logreg = LogisticRegression(multi_class='ovr',
                                solver='liblinear')
ovr_logreg.fit(X_train, y_train)
 
# make predictions using the trained models
y_pred_multi = multi_logreg.predict(X_test)
y_pred_ovr = ovr_logreg.predict(X_test)
 
# evaluate the performance of the models
# using accuracy score and confusion matrix
print('Multinomial logistic regression accuracy:',
      accuracy_score(y_test, y_pred_multi))
print('One-vs-Rest logistic regression accuracy:',
      accuracy_score(y_test, y_pred_ovr))
 
conf_mat_multi = confusion_matrix(y_test, y_pred_multi)
conf_mat_ovr = confusion_matrix(y_test, y_pred_ovr)
 
# plot the confusion matrices
fig, axs = plt.subplots(ncols=2, figsize=(10, 5))
axs[0].imshow(conf_mat_multi, cmap=plt.cm.Blues)
axs[0].set_title('Multinomial logistic regression')
axs[0].set_xlabel('Predicted labels')
axs[0].set_ylabel('True labels')
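axs[0].set_xticks(np.arange(len(iris.target_names)))
axs[0].set_xticklabels(iris.target_names)
# set the y tick positions before labelling them, as done for the second plot
axs[0].set_yticks(np.arange(len(iris.target_names)))
axs[0].set_yticklabels(iris.target_names)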
axs[1].imshow(conf_mat_ovr, cmap=plt.cm.Blues)
axs[1].set_title('One-vs-Rest logistic regression')
axs[1].set_xlabel('Predicted labels')
axs[1].set_ylabel('True labels')
axs[1].set_xticks(np.arange(len(iris.target_names)))
axs[1].set_xticklabels(iris.target_names)
axs[1].set_yticks(np.arange(len(iris.target_names)))
axs[1].set_yticklabels(iris.target_names)
plt.show()


Output:

Multinomial logistic regression accuracy: 1.0
One-vs-Rest logistic regression accuracy: 1.0
Confusion Matrix for Multinomial and One-vs-Rest Logistic Regression
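
The accuracy printed above can also be read directly off a confusion matrix, along with per-class recall and precision. The sketch below uses a made-up 3x3 matrix (not the result of the models above) and assumes scikit-learn's convention that rows are true labels and columns are predicted labels.

```python
import numpy as np

# Hypothetical confusion matrix: rows = true classes, columns = predicted classes
conf_mat = np.array([[10, 0, 0],
                     [0, 8, 1],
                     [0, 2, 9]])

# Correct predictions sit on the diagonal
accuracy = np.trace(conf_mat) / conf_mat.sum()

# Recall: correct predictions divided by the number of true samples per class (row sums)
recall = np.diag(conf_mat) / conf_mat.sum(axis=1)

# Precision: correct predictions divided by the number of predictions per class (column sums)
precision = np.diag(conf_mat) / conf_mat.sum(axis=0)

print("accuracy:", accuracy)
print("recall per class:", recall.round(3))
print("precision per class:", precision.round(3))
```

Off-diagonal entries show which pairs of classes are confused with each other, which a single accuracy number hides.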


Multinomial Logistic Regression Plot

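For the Iris dataset, we use scikit-learn to load the data and fit a Multinomial logistic regression model on the first two features only (sepal length and sepal width), so that the result can be drawn in two dimensions. We then use Matplotlib to plot the decision boundaries of the fitted model by predicting the class at every point of a fine grid and coloring each point by its predicted class.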

Python3




import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
 
# Load the iris dataset
iris = load_iris()
 
# Extract the features and target
X = iris.data[:, :2]
y = iris.target
 
# Create an instance of Logistic Regression classifier
clf = LogisticRegression(random_state=0,
                         multi_class='multinomial',
                         solver='newton-cg')
 
# Fit the model
clf.fit(X, y)
 
# Plot the decision boundaries
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
xx, yy = np.meshgrid(np.arange(x_min, x_max, .02),
                     np.arange(y_min, y_max, .02))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(4, 3))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k',
            cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.title('Multinomial logistic regression')
plt.show()


Output:

Decision boundaries obtained by using the Multinomial Logistic Regression
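
The grid technique used in the plotting code can be shown on a much smaller scale. The following is a minimal sketch with a coarse grid and a hypothetical 3-class linear classifier (the weights `W`, intercepts `b` and the `predict` helper are made up for illustration and stand in for `clf.predict`).

```python
import numpy as np

# Hypothetical weights and intercepts of a 3-class linear classifier on 2 features
W = np.array([[-1.0, 1.0],
              [0.0, 0.0],
              [1.0, -1.0]])
b = np.array([0.0, 0.5, 0.0])

def predict(points):
    """Return the index of the class with the highest linear score for each point."""
    return (points @ W.T + b).argmax(axis=1)

# Coarse grid over the feature plane: 4 x-values and 3 y-values
xx, yy = np.meshgrid(np.arange(0.0, 2.0, 0.5),
                     np.arange(0.0, 1.5, 0.5))

# Flatten the grid into an (n_points, 2) array, predict, and restore the grid shape
grid_points = np.c_[xx.ravel(), yy.ravel()]
Z = predict(grid_points).reshape(xx.shape)

print(grid_points.shape)
print(Z)
```

Each entry of `Z` is the predicted class at one grid point, which is what `plt.pcolormesh` turns into colored regions; a smaller step such as the 0.02 used above only makes the boundaries look smoother.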

One-vs-Rest Logistic Regression Plot

For the Iris dataset, we again use scikit-learn to load the data and fit a logistic regression model on the first two features (sepal length and sepal width), this time with the One-vs-Rest strategy. We then use Matplotlib to plot the decision boundaries of the fitted model in the same way, so that the two plots can be compared directly.

Python3




import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
 
iris = load_iris()
 
# we only take the first two features for visualization
X = iris.data[:, :2]
y = iris.target
 
clf = LogisticRegression(random_state=0,
                         multi_class='ovr',
                         solver='liblinear')
 
clf.fit(X, y)
 
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
xx, yy = np.meshgrid(np.arange(x_min, x_max, .02),
                     np.arange(y_min, y_max, .02))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(4, 3))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k',
            cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.title('One-vs-Rest logistic regression')
plt.show()


Output:

Decision boundaries obtained by using the One-vs-Rest Method
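
Recent scikit-learn releases (1.5 onward, to the best of our knowledge) deprecate the `multi_class` parameter of `LogisticRegression`, so the code above may emit a warning there. A possible alternative, sketched below under that assumption, is to wrap a plain `LogisticRegression` in `sklearn.multiclass.OneVsRestClassifier` for One-vs-Rest and to rely on the default `lbfgs` solver for the multinomial fit. The `max_iter=1000` value is our own choice to help the solver converge, not something taken from the examples above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)

# With the default lbfgs solver, a multi-class target is fitted as one multinomial model
multi_clf = LogisticRegression(max_iter=1000).fit(X, y)

# One-vs-Rest: the wrapper trains one binary logistic regression per class
ovr_clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

print("binary models in the One-vs-Rest wrapper:", len(ovr_clf.estimators_))
print("multinomial training accuracy:", multi_clf.score(X, y))
print("One-vs-Rest training accuracy:", ovr_clf.score(X, y))
```

The wrapper exposes the fitted binary models through `estimators_`, which makes the one-model-per-class structure easy to inspect.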


In conclusion, we have demonstrated how to implement Multinomial and One-vs-Rest logistic regression models in scikit-learn for multi-class classification problems. We trained and evaluated both models on the Iris dataset, compared their performance using accuracy and confusion matrices, and visualized their decision boundaries on the first two features. Both strategies are useful wherever the target variable has more than two classes; the multinomial model treats the classes jointly, while One-vs-Rest decomposes the problem into independent binary models.


