Differences between Random Forest and AdaBoost

Last Updated : 18 May, 2022

Random Forest is a widely used ensemble machine learning algorithm that combines the outputs of many Decision Trees into a single prediction. It handles both classification and regression problems, pairing the simplicity of decision trees with the flexibility of an ensemble, which often improves accuracy considerably over a single tree.

AdaBoost (Adaptive Boosting) is a Boosting method used for Ensemble Machine Learning. After each round, the weights of the training instances are redistributed, with higher weights given to the instances that were classified incorrectly, hence the name Adaptive Boosting. AdaBoost typically uses many single-level decision trees, and this article refers to the resulting ensemble as a Forest of Trees.


Difference between Random Forest and AdaBoost

Tree Size

In a random forest, each time you make a tree, you make a full-sized tree. Some trees might be bigger than others, but there is no predetermined maximum depth.

In contrast, in a forest of trees made with AdaBoost, the trees are usually just one node and two leaves. A tree with just one node and two leaves is called a stump, so this forest of trees is really a Forest of Stumps.
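To make the idea of a stump concrete, here is a minimal sketch in plain Python. The `Stump` class, the `fit_stump` helper, and the toy data are all hypothetical names and values chosen for illustration; a real library would search for splits more efficiently. The sketch tries every single-variable split and keeps the one with the lowest weighted error.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """A decision tree with exactly one split node and two leaves."""
    feature: int       # index of the single variable the stump looks at
    threshold: float   # split point on that variable
    left_label: int    # prediction when value <= threshold
    right_label: int   # prediction when value > threshold

    def predict(self, row):
        if row[self.feature] <= self.threshold:
            return self.left_label
        return self.right_label


def fit_stump(X, y, weights):
    """Return the stump with the lowest weighted error, plus that error."""
    best, best_error = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for left, right in ((-1, 1), (1, -1)):
                stump = Stump(feature, threshold, left, right)
                # Weighted error: total weight of the misclassified rows.
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump.predict(row) != label)
                if error < best_error:
                    best, best_error = stump, error
    return best, best_error


# Hypothetical toy data (not the article's fig. 3).
# Columns: [chest_pain (0/1), weight_kg]; labels: +1 = disease, -1 = no disease.
X = [[1, 95], [1, 80], [0, 70], [0, 85], [1, 65], [0, 90]]
y = [1, 1, -1, -1, -1, 1]
weights = [1 / len(X)] * len(X)   # every sample starts with equal weight

stump, error = fit_stump(X, y, weights)
print(stump, error)
```

On this toy data no single split separates the classes perfectly, so even the best stump still misclassifies one sample, which is what makes it a weak learner.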

Classification Accuracy

Fig. 3 : Data table

The full-sized trees in a random forest are much better at making accurate classifications than stumps. For example, if we were using the data in fig. 3 to determine whether someone had heart disease, a full-sized Decision tree could take advantage of all four measured variables (Chest Pain, Blood Circulation, Blocked Arteries and Weight) to make a decision.

Stumps, on their own, are not great at making accurate classifications. A full-sized tree can use all the variables in the data table, but a stump uses only one variable to make a decision, so stumps are said to be weak learners. For example, as per fig. 3, a stump might use only Chest Pain to decide whether a person has heart disease. However, that is the way AdaBoost likes it, and it is one of the reasons why stumps and AdaBoost are so commonly combined.

Distribution of say for each tree

In a Random Forest, each tree has an equal vote on the final classification. A tree of any size carries the same weight in the vote.

In contrast, in a Forest of Stumps made with AdaBoost, some stumps get more say in the final classification than others. How much say a stump gets depends on how well it classified the weighted training data. In the diagram, larger stumps get more say than smaller stumps.
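The difference between the two voting schemes can be sketched in a few lines of Python. This assumes the classic binary AdaBoost formulation, where labels are +1 and -1 and a stump's amount of say is 0.5 * ln((1 - error) / error); the function names and the vote and error values below are hypothetical, chosen only for illustration.

```python
import math


def amount_of_say(weighted_error, eps=1e-10):
    """Vote weight of a stump: 0.5 * ln((1 - error) / error)."""
    # Clamp the error so a perfect or useless stump does not divide by zero.
    e = min(max(weighted_error, eps), 1 - eps)
    return 0.5 * math.log((1 - e) / e)


def majority_vote(predictions):
    """Random Forest style: every +1 / -1 vote counts equally (ties go to -1)."""
    return 1 if sum(predictions) > 0 else -1


def weighted_vote(predictions, says):
    """AdaBoost style: each +1 / -1 vote is scaled by the stump's amount of say."""
    total = sum(p * s for p, s in zip(predictions, says))
    return 1 if total > 0 else -1


# Hypothetical votes from three learners on one sample.
predictions = [1, 1, -1]
# Hypothetical weighted errors: the third learner is far more reliable.
errors = [0.40, 0.45, 0.05]
says = [amount_of_say(e) for e in errors]

print(majority_vote(predictions))        # two of three vote +1
print(weighted_vote(predictions, says))  # the reliable learner can outvote the others
```

With equal votes the two weaker learners win, while with weighted votes the single accurate learner decides the outcome. A stump with an error of 0.5 (no better than guessing) gets a say of zero.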

Order for constructing trees

In Random Forest, each decision tree is made independently of the others. In other words, it doesn’t matter which tree was made at what turn.
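A small sketch of why the order does not matter: each tree trains on its own bootstrap sample (rows drawn with replacement), and that sample depends only on the tree's own random seed, not on any other tree. The `bootstrap_sample` helper and the seeding scheme are assumptions made for illustration; real implementations also randomize which variables each split may consider.

```python
import random


def bootstrap_sample(n_rows, rng):
    """Draw n_rows row indices with replacement: the training sample for one tree."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


n_rows = 8  # hypothetical dataset size

# Each tree gets its own independently seeded generator,
# so its sample does not depend on when the tree is built.
samples = {tree_id: bootstrap_sample(n_rows, random.Random(tree_id))
           for tree_id in range(3)}

# Building the same trees in a different order yields identical samples.
reordered = {tree_id: bootstrap_sample(n_rows, random.Random(tree_id))
             for tree_id in (2, 0, 1)}

print(samples == reordered)
```

Because no tree needs information from another, the trees could just as well be trained at the same time on separate processor cores.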

On the other hand, in a Forest of Stumps made with AdaBoost, the order is important. The errors that the first stump makes influence how the second stump is made, the errors of the second stump influence the third stump, and so on.
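The way one stump's errors shape the next stump can be sketched as a single round of sample re-weighting. This assumes the classic binary AdaBoost update with +1 / -1 labels; the `update_sample_weights` helper and the predictions below are hypothetical, chosen only for illustration.

```python
import math


def update_sample_weights(weights, y_true, y_pred):
    """One AdaBoost round: return (amount_of_say, new_normalized_weights)."""
    # Weighted error: total weight of the misclassified samples.
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    error = min(max(error, 1e-10), 1 - 1e-10)   # guard against log(0)
    say = 0.5 * math.log((1 - error) / error)
    # t * p is +1 for a correct prediction and -1 for a mistake,
    # so mistakes are scaled up and correct samples are scaled down.
    scaled = [w * math.exp(-say * t * p)
              for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(scaled)
    return say, [w / total for w in scaled]     # weights sum to 1 again


y_true = [1, 1, -1, -1, -1, 1]
y_pred = [1, 1, -1, 1, -1, 1]    # hypothetical first stump gets sample 3 wrong
weights = [1 / 6] * 6            # all samples start with equal weight

say, new_weights = update_sample_weights(weights, y_true, y_pred)
print(say, new_weights)          # sample 3 now weighs 0.5, the others 0.1 each
```

The next stump is fitted against `new_weights`, so it is pushed to classify the previously misclassified sample correctly. This dependency is why the stumps have to be built one after another.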

S. No.	Categories	Random Forest	AdaBoost
1.	Tree Size	Random forest uses a full-sized Decision tree with no predetermined depth size.	AdaBoost combines a lot of “weak learners” to make classifications. The weak learners are almost always stumps.
2.	Distribution of say for each tree.	Each tree has an equal vote on the final classification	Some stumps get more say in the classification than others.
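3.	Classification Accuracy	Each full-sized tree is a reasonably accurate classifier on its own. Whether the forest beats AdaBoost depends on the dataset, and it tends to hold up better on noisy data.	Each stump is a weak classifier on its own, but the weighted combination of many stumps can be very accurate, and on clean data it often matches or outperforms Random Forest.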
4.	Tree Construction Order	Each decision tree is made independently of the others	Each stump is made by taking the previous stump’s mistakes into account.
5.	Overfitting Tolerance	Random Forest is less sensitive to overfitting than AdaBoost.	AdaBoost is more sensitive to overfitting than Random Forest, particularly on noisy data and outliers.
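6.	Data Sampling Technique	In Random Forest, the training data for each tree is sampled using the bagging technique (bootstrap sampling with replacement).	AdaBoost is based on the boosting technique: the training samples are re-weighted after each round so that misclassified samples count for more.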
7.	Bias-Variance Focus	Random Forest mainly aims to decrease variance, not bias.	AdaBoost mainly aims to decrease bias, not variance.
8.	Ensembling Operation	Random Forest employs parallel ensembling. Because the trees are independent, they can be built in parallel on a multiprocessor machine.	AdaBoost makes use of sequential ensembling. The stumps are built step by step, each one depending on the previous one.