Complement Naive Bayes (CNB) Algorithm
Naive Bayes algorithms are a popular family of machine learning algorithms used for classification. There are several variants, such as Gaussian Naive Bayes and Multinomial Naive Bayes. To learn more about the basics of Naive Bayes, you can follow this link.
Complement Naive Bayes is an adaptation of the standard Multinomial Naive Bayes algorithm. Multinomial Naive Bayes does not perform very well on imbalanced datasets, that is, datasets in which some classes have many more examples than others, so the class distribution is far from uniform. Such datasets can be difficult to work with because a model can easily become biased in favor of the class with more examples.
How CNB works:
Complement Naive Bayes is particularly suited to imbalanced datasets. In Complement Naive Bayes, instead of calculating the probability of an item belonging to a certain class, we calculate the probability of the item belonging to all the other classes, that is, to the complement of that class. This is the literal meaning of the word complement, and it is where the name Complement Naive Bayes comes from.
A step-by-step high-level overview of the algorithm (without any involved mathematics):
- For each class calculate the probability of the given instance not belonging to it.
- After calculation for all the classes, we check all the calculated values and select the smallest value.
- The smallest value is selected because it is the lowest probability that the instance does NOT belong to that particular class. This implies that the instance has the highest probability of actually belonging to that class, so this class is selected.
Note: We don’t select the one with the highest value because we are calculating the complement of the probability. The one with the highest value is least likely to be the class that item belongs to.
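The selection step above can be sketched in a few lines of Python. This is only an illustration: the function name and the score values are made-up placeholders, not numbers taken from the example in this article.

```python
def predict_from_complement_scores(complement_scores):
    """Return the class whose 'does NOT belong to this class' score is smallest."""
    # min() over the dict keys, ordered by the score stored for each key.
    return min(complement_scores, key=complement_scores.get)


# Hypothetical complement scores: a lower value means the item is
# less likely to fall outside that class.
scores = {"Apples": 0.2, "Bananas": 0.8}
predicted = predict_from_complement_scores(scores)
print(predicted)
```

With these placeholder values the function returns "Apples", because 0.2 is the smaller of the two complement scores.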
Now, let us consider an example. Say we have two classes, Apples and Bananas, and we have to classify whether a given sentence is related to apples or bananas, given the frequency of certain words. Here is a tabular representation of the simple dataset:
Total word count in class ‘Apples’ = (2+1+1) + (2+1+1) = 8
Total word count in class ‘Bananas’ = (1 + 1 + 9 + 5) = 16
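So, the probability of a sentence belonging to the class ‘Apples’ is the fraction of training sentences labelled ‘Apples’. The sums above group the counts into two ‘Apples’ sentences and one ‘Bananas’ sentence, so P(y = Apples) = 2/3.
Similarly, the probability of a sentence belonging to the class ‘Bananas’ is P(y = Bananas) = 1/3.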
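The totals and class proportions can be checked with a short script. It assumes that each parenthesised group in the sums above corresponds to one training sentence (two for ‘Apples’, one for ‘Bananas’); only the non-zero counts are listed, and which word each count belongs to does not matter for the totals.

```python
from fractions import Fraction

# Non-zero word counts per training sentence, as grouped in the sums above.
# Assumption: one parenthesised group = one sentence.
sentences = [
    ("Apples", [2, 1, 1]),
    ("Apples", [2, 1, 1]),
    ("Bananas", [1, 1, 9, 5]),
]

word_totals = {}      # total number of words seen in each class
sentence_counts = {}  # number of training sentences in each class
for label, counts in sentences:
    word_totals[label] = word_totals.get(label, 0) + sum(counts)
    sentence_counts[label] = sentence_counts.get(label, 0) + 1

# Class prior = share of training sentences that carry the label.
n_sentences = len(sentences)
priors = {label: Fraction(c, n_sentences) for label, c in sentence_counts.items()}

print(word_totals)  # {'Apples': 8, 'Bananas': 16}
print(priors)
```

Under that assumption the priors come out as 2/3 for ‘Apples’ and 1/3 for ‘Bananas’.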
In the above table, each column gives the frequency of a word in a given sentence, and the last column shows which class the sentence belongs to. Before we begin, you must first know about Bayes’ Theorem. Bayes’ Theorem is used to find the probability of an event, given that another event has occurred. The formula is:
\[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \]
where A and B are events, P(A) is the probability of occurrence of A, P(A|B) is the probability of A occurring given that event B has already occurred, and P(B|A) is the probability of B occurring given A. P(B), the probability of event B occurring, cannot be 0 since B has already occurred. If you want to learn more about regular Naive Bayes and Bayes’ Theorem, you can follow this link.
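As a quick numerical illustration of Bayes’ Theorem, here is a minimal sketch. The function name and the three probabilities are hypothetical values chosen only to show the arithmetic.

```python
def bayes_posterior(p_b_given_a, p_a, p_b):
    """Compute P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        # P(B) must be positive, otherwise conditioning on B is undefined.
        raise ValueError("P(B) must be greater than 0")
    return p_b_given_a * p_a / p_b


# Hypothetical inputs: P(B|A) = 0.5, P(A) = 0.3, P(B) = 0.25
posterior = bayes_posterior(0.5, 0.3, 0.25)
print(posterior)
```

With these inputs the result is 0.5 × 0.3 / 0.25 = 0.6.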
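Now let us see how Naive Bayes and Complement Naive Bayes work. The regular (multinomial) Naive Bayes decision rule is,
\[ \hat{y} = \arg\max_{y} \; p(y) \prod_{i} p(w_i \mid y)^{f_i} \]
where f_i is the frequency of some attribute w_i, for example the number of times a certain word occurs in a sentence, and p(w_i | y) is the probability of that attribute in class y.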
However, in complement naive Bayes, the formula is :
If you take a closer look at the formulae, you will see that Complement Naive Bayes is essentially the inverse of regular Naive Bayes: it scores each class using the statistics of all the other classes. In Naive Bayes, the class with the largest value obtained from the formula is the predicted class. Since Complement Naive Bayes measures how well an item fits the complement of a class, the class with the smallest value obtained from the CNB formula is the predicted class.
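To make the contrast concrete, here is a minimal pure-Python sketch of both decision rules. It follows one common formulation of CNB (Rennie et al., which scikit-learn's ComplementNB documentation also follows): word probabilities for a class are estimated from the pooled counts of every other class, and the class with the smallest resulting score is chosen. The toy counts, the labels "A" and "B", the function names, and the smoothing value alpha = 1 are all assumptions for illustration; this is not the dataset used in this article, and the sketch leaves out the weight normalisation that scikit-learn can apply.

```python
import math


def class_word_counts(X, y):
    """Sum the word-count vectors of all documents in each class."""
    totals = {}
    for row, label in zip(X, y):
        acc = totals.setdefault(label, [0] * len(row))
        for i, f in enumerate(row):
            acc[i] += f
    return totals


def log_probs(counts, alpha=1.0):
    """Smoothed log word probabilities from one vector of counts."""
    denom = sum(counts) + alpha * len(counts)
    return [math.log((c + alpha) / denom) for c in counts]


def mnb_predict(X, y, doc, alpha=1.0):
    """Multinomial NB: pick the LARGEST log p(y) + sum_i f_i * log p(w_i | y)."""
    scores = {}
    for label, counts in class_word_counts(X, y).items():
        log_prior = math.log(y.count(label) / len(y))
        lp = log_probs(counts, alpha)
        scores[label] = log_prior + sum(f * l for f, l in zip(doc, lp))
    return max(scores, key=scores.get), scores


def cnb_predict(X, y, doc, alpha=1.0):
    """Complement NB: pick the SMALLEST sum_i f_i * log p(w_i | not y)."""
    per_class = class_word_counts(X, y)
    scores = {}
    for label in per_class:
        # Pool the counts of every class except `label` (its complement).
        others = [c for other, c in per_class.items() if other != label]
        complement = [sum(col) for col in zip(*others)]
        lp = log_probs(complement, alpha)
        scores[label] = sum(f * l for f, l in zip(doc, lp))
    return min(scores, key=scores.get), scores


# Hypothetical toy data: three word-count columns, two classes.
X = [[3, 1, 0], [2, 2, 0], [0, 1, 4]]
y = ["A", "A", "B"]
doc = [2, 1, 0]

mnb_label, mnb_scores = mnb_predict(X, y, doc)
cnb_label, cnb_scores = cnb_predict(X, y, doc)
print(mnb_label, cnb_label)
```

On this toy data both rules agree on class "A": it has the largest Naive Bayes score and the smallest complement score. Note that the CNB score here does not include the class prior, which is one reason the method is less sensitive to class imbalance.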
Now, let us take an example and try to predict it using our dataset and CNB,
So, we need to find,
We compute this value for both classes and pick the class with the smaller value as the prediction, i.e., if the value for (y = Apples) is smaller, the class is predicted as Apples, and if the value for (y = Bananas) is smaller, the class is predicted as Bananas.
Using the Complement Naive Bayes Formula for both the classes,
Now, since 6.302 < 85.333, the predicted class is Apples.
We DON’T use the class with a higher value because a higher value means that it is more likely that a sentence with those words does NOT belong to the class. This is exactly why this algorithm is called Complement Naive Bayes.
When to use CNB?
- When the dataset is imbalanced: If the dataset on which classification is to be done is imbalanced, Multinomial and Gaussian Naive Bayes may give a low accuracy. Complement Naive Bayes often performs better in this setting and can give relatively higher accuracy.
- For text classification tasks: Complement Naive Bayes often outperforms Multinomial Naive Bayes in text classification tasks, where the features are word counts or similar frequencies.
Implementation of CNB in Python:
For this example, we will use the wine dataset, which is slightly imbalanced. The task is to determine the origin of a wine from various chemical parameters. To know more about this dataset, you can check this link.
To evaluate our model, we will check its accuracy on the test set and the classification report of the classifier. We will use the scikit-learn library to implement the Complement Naive Bayes algorithm.
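To see how the classes are distributed, the labels can be counted directly. This sketch assumes the dataset meant here is the copy bundled with scikit-learn and loaded with load_wine.

```python
from collections import Counter

from sklearn.datasets import load_wine

wine = load_wine()

# Count how many samples carry each class label (0, 1, 2).
class_counts = Counter(wine.target.tolist())
for label, count in sorted(class_counts.items()):
    print(wine.target_names[label], count)
```

The three classes do not have equal sample counts, which is the mild imbalance mentioned above.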
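A minimal sketch of such an implementation is shown here. The 15% test split is an assumption chosen to match a test set of 27 samples, and random_state=42 is an arbitrary seed, so the exact figures this sketch prints may differ from the ones reported in this article.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import ComplementNB

# Load the wine dataset bundled with scikit-learn.
wine = load_wine()

# Hold out part of the data for testing (split parameters are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.15, random_state=42
)

# Fit Complement Naive Bayes with its default settings.
classifier = ComplementNB()
classifier.fit(X_train, y_train)

# Evaluate on both the training and the test split.
train_acc = accuracy_score(y_train, classifier.predict(X_train))
test_pred = classifier.predict(X_test)
test_acc = accuracy_score(y_test, test_pred)

print("Training Set Accuracy :", train_acc * 100, "%")
print("Test Set Accuracy :", test_acc * 100, "%")
print("Classifier Report :")
print(classification_report(y_test, test_pred))
```

ComplementNB requires non-negative feature values, which the chemical measurements in this dataset satisfy.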
Training Set Accuracy : 65.56291390728477 %
Test Set Accuracy : 66.66666666666666 %
Classifier Report :
              precision    recall  f1-score   support
           0       0.64      1.00      0.78         9
           1       0.67      0.73      0.70        11
           2       1.00      0.14      0.25         7
    accuracy                           0.67        27
   macro avg       0.77      0.62      0.58        27
weighted avg       0.75      0.67      0.61        27
We get an accuracy of 65.56% on the training set and 66.67% on the test set. The two are close, so the model is not overfitting, but the per-class report shows that recall for class 2 is only 0.14, meaning most wines of that class are misclassified. Complement Naive Bayes is designed for count-like features such as word frequencies, whereas this dataset contains continuous chemical measurements on very different scales, so the modest accuracy here is not surprising.
Now that you know what Complement Naive Bayes classifiers are and how they work, next time you come across an imbalanced dataset, you can try using Complement Naive Bayes.