Prerequisite: Ensemble Classifier
Bagging and Boosting are two types of Ensemble Learning. Both reduce the variance of a single estimate by combining several estimates from different models, so the result is usually a model with higher stability.
- If the problem with the single model is over-fitting, Bagging is the best option.
- If the problem is that the single model performs poorly, Boosting can produce a combined model with lower error, as it amplifies the strengths and reduces the pitfalls of the single model.
Similarities Between Bagging and Boosting –
- Both are ensemble methods to get N learners from 1 learner.
- Both generate several training data sets by random sampling.
- Both make the final decision by averaging the N learners (or by taking the majority of them, i.e., majority voting).
- Both are good at reducing variance and provide higher stability.
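The shared recipe in the list above (sample the training data, fit N learners, combine by majority vote) can be sketched in a few lines. This is a minimal illustration, assuming scikit-learn's decision tree as the base learner and a synthetic dataset; any classifier and dataset would do.

```python
# Minimal sketch of the shared ensemble recipe:
# draw N bootstrap samples, fit one learner per sample, majority-vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, random_state=0)

N = 25  # number of learners in the ensemble
learners = []
for _ in range(N):
    idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
    learners.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Majority voting across the N learners
votes = np.stack([m.predict(X) for m in learners])  # shape (N, n_samples)
majority = (votes.mean(axis=0) >= 0.5).astype(int)
print("ensemble training accuracy:", (majority == y).mean())
```

Boosting follows the same N-learners-from-1-learner pattern, but would weight both the training samples and the learners' votes instead of treating them equally.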
Differences Between Bagging and Boosting –
| S.No | Bagging | Boosting |
|------|---------|----------|
| 1. | The simplest way of combining predictions that belong to the same type. | A way of combining predictions that belong to different types. |
| 2. | Aims to decrease variance, not bias. | Aims to decrease bias, not variance. |
| 3. | Each model receives equal weight. | Models are weighted according to their performance. |
| 4. | Each model is built independently. | New models are influenced by the performance of previously built models. |
| 5. | Different training data subsets are drawn randomly with replacement from the entire training dataset. | Every new subset contains the elements that were misclassified by previous models. |
| 6. | Bagging tries to solve the over-fitting problem. | Boosting tries to reduce bias. |
| 7. | If the classifier is unstable (high variance), apply bagging. | If the classifier is stable and simple (high bias), apply boosting. |
| 8. | Example: Random Forest. | Example: Gradient Boosting. |
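The two examples in the last row of the table are both available in scikit-learn, so the families can be compared directly. A hedged sketch, assuming a synthetic dataset and default hyperparameters; actual accuracies will vary with the data.

```python
# Random Forest (bagging family) vs Gradient Boosting (boosting family)
# trained on the same data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagging family: independent trees on bootstrap samples, averaged
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Boosting family: trees built sequentially, each fit to the
# errors of the ensemble so far
gb = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print("random forest accuracy:   ", rf.score(X_te, y_te))
print("gradient boosting accuracy:", gb.score(X_te, y_te))
```

Which family wins depends on the diagnosis in row 7: the forest helps most when single trees over-fit (high variance), while boosting helps most when the base learner is too simple (high bias).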