Prerequisite: Getting Started with Classification/
Classification is perhaps the most common Machine Learning task. Before we jump into what One-vs-Rest (OVR) classifiers are and how they work, you may follow the link below and get a brief overview of what classification is and how it is useful.
In general, there are two types of classification algorithms:
- Binary classification algorithms.
- Multi-class classification algorithms.
Binary classification is when we have to classify objects into two groups. Generally, these two groups consist of ‘True’ and ‘False’. For example, given a certain set of health attributes, a binary classification task may be to determine whether a person has diabetes or not.
On the other hand, in multi-class classification, there are more than two classes. For example, given a set of attributes of fruit, like it’s shape and colour, a multi-class classification task would be to determine the type of fruit.
So, now that you have an idea of how binary and multi-class classification work, let us get on to how the one-vs-rest heuristic method is used.
One-vs-Rest (OVR) Method:
Many popular classification algorithms were designed natively for binary classification problems. These algorithms include :
- Logistic Regression
- Support Vector Machines (SVM)
- Perceptron Models
and many more.
So, these popular classification algorithms cannot directly be used for multi-class classification problems. Some heuristic methods are available that can split up multi-class classification problems into many different binary classification problems. To understand how this works, let us consider an example : Say, a classification problem is to classify various fruits into three types of fruits: banana, orange or apple. Now, this is clearly a multi-class classification problem. If you want to use a binary classification algorithm like, say SVM. The way One-vs-Rest method will deal with this is illustrated below :
Since there are three classes in the classification problem, the One-vs-Rest method will break down this problem into three binary classification problems:
- Problem 1 : Banana vs [Orange, Apple]
- Problem 2 : Orange vs [Banana, Apple]
- Problem 3 : Apple vs [Banana, Orange]
So instead of solving it as (Banana vs Orange vs Apple), it is solved using three binary classification problems as shown above.
A major downside or disadvantage of this method is that many models have to be created. For a multi-class problem with ‘n’ number of classes, ‘n’ number of models have to be created, which may slow down the entire process. However, it is very useful with datasets having a small number of classes, where we want to use a model like SVM or Logistic Regression.
Implementation of One-vs-Rest method using Python3
Python’s scikit-learn library offers a method OneVsRestClassifier(estimator, *, n_jobs=None) to implement this method. For this implementation, we will be using the popular ‘Wine dataset’, to determine the origin of wines using chemical attributes. We can direct this dataset using scikit-learn. To know more about this dataset, you can use the link below : Wine Dataset
We will use a Support Vector Machine, which is a binary classification algorithm and use it with the One-vs-Rest heuristic to perform multi-class classification.
To evaluate our model, we will see the accuracy score of the test set and the classification report of the model.
Test Set Accuracy : 66.66666666666666 % Classification Report : precision recall f1-score support 0 0.62 1.00 0.77 5 1 0.70 0.88 0.78 8 micro avg 0.67 0.92 0.77 13 macro avg 0.66 0.94 0.77 13 weighted avg 0.67 0.92 0.77 13
We get a test set accuracy of approximately 66.667%. This is not bad for this dataset. This dataset is notorious for being difficult to classify and the benchmark accuracy is 62.4 +- 0.4 %. So, our result is actually quite good.
Now that you know how to use the One-vs-Rest heuristic method for performing multi-class classification with binary classifiers, you can try using it next time you have to perform some multi-class classification task.
Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.
To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course.
- Multiclass classification using scikit-learn
- Why Python Is Used For Developing Automated Trading Strategy?
- Strategy Method - Python Design Patterns
- PyQt5 QSpinBox - Setting Style Strategy
- PyQt5 QSpinBox - Getting Style Strategy
- Getting started with Classification
- Regression and Classification | Supervised Machine Learning
- Basic Concept of Classification (Data Mining)
- Python | Image Classification using keras
- ML | Cancer cell classification using Scikit-learn
- ML | Classification vs Regression
- ML | Using SVM to perform classification on a non-linear dataset
- ML | Why Logistic Regression in Classification ?
- ML | Logistic Regression v/s Decision Tree Classification
- ML | Classification vs Clustering
- OpenCV and Keras | Traffic Sign Classification for Self-Driving Car
- Classification of Data Mining Systems
- An introduction to MultiLabel classification
- Multi-Label Image Classification - Prediction of image labels
- Handling Imbalanced Data for Classification
If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to email@example.com. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.