
AdaBoost (Boosting)

Fundamentals

The original boosting algorithm that trains a sequence of weak classifiers, reweighting misclassified examples after each round so subsequent models focus on the hardest cases.

Like a teacher who gives extra homework on the questions students got wrong — each round of practice targets the weak spots.

AdaBoost — short for Adaptive Boosting — was introduced by Freund and Schapire in 1997 and became the first boosting algorithm to see widespread practical adoption. It works by training a sequence of weak learners (typically decision stumps — trees with a single split) on weighted versions of the training data.

After each round, AdaBoost increases the weight of misclassified examples and decreases the weight of correctly classified ones. The next weak learner is trained on this reweighted dataset, forcing it to focus on the examples the ensemble currently gets wrong. Each weak learner also receives a coefficient that grows with its weighted accuracy: better classifiers get more say in the final vote.
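The reweighting and the coefficient described above have closed forms in the standard binary formulation. The notation here is assumed, not taken from this entry: labels y_i in {-1, +1}, round-t weak learner h_t, example weights w_i, weighted error epsilon_t:

```latex
\varepsilon_t = \sum_{i:\, h_t(x_i) \ne y_i} w_i,
\qquad
\alpha_t = \tfrac{1}{2} \ln \frac{1 - \varepsilon_t}{\varepsilon_t},
\qquad
w_i \leftarrow \frac{w_i \, e^{-\alpha_t \, y_i \, h_t(x_i)}}{Z_t}
```

where Z_t normalizes the weights to sum to one. A mistake (y_i h_t(x_i) = -1) multiplies w_i by e^{alpha_t}, while a correct prediction multiplies it by e^{-alpha_t}, which is exactly the up-weighting of hard cases and down-weighting of solved ones. The coefficient alpha_t is positive whenever the weak learner beats random guessing (epsilon_t < 1/2) and grows as its error shrinks.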

The final prediction is a weighted majority vote of all weak learners. Despite its simplicity, AdaBoost achieves strong performance on many classification tasks and was one of the first algorithms to demonstrate that combining many weak models could match or exceed a single complex one. Its main limitations are sensitivity to noisy data and outliers (since misclassified outliers receive exponentially increasing weight) and its restriction to exponential loss, which later algorithms like gradient boosting generalized.
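The full loop can be condensed into a short from-scratch sketch. This is a minimal illustration in NumPy, not a production implementation: the stump search is brute force, and the function names (`train_adaboost`, `predict_adaboost`) are invented here for the example.

```python
import numpy as np

def train_adaboost(X, y, n_rounds=10):
    """Train AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)            # start with uniform example weights
    stumps = []                        # (feature, threshold, polarity, alpha)
    for _ in range(n_rounds):
        best, best_err = None, np.inf
        # Brute-force search for the stump with lowest weighted error
        for j in range(d):
            for t in np.unique(X[:, j]):
                for polarity in (1, -1):
                    pred = np.where(polarity * (X[:, j] - t) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if err < best_err:
                        best_err, best = err, (j, t, polarity)
        eps = max(best_err, 1e-10)     # clamp to avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1 - eps) / eps)   # this stump's vote weight
        j, t, polarity = best
        pred = np.where(polarity * (X[:, j] - t) >= 0, 1, -1)
        # Up-weight mistakes, down-weight correct examples, then renormalize
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        stumps.append((j, t, polarity, alpha))
    return stumps

def predict_adaboost(stumps, X):
    """Final prediction: sign of the alpha-weighted vote of all stumps."""
    score = np.zeros(len(X))
    for j, t, polarity, alpha in stumps:
        score += alpha * np.where(polarity * (X[:, j] - t) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

The noise sensitivity mentioned above is visible in the weight update line: an outlier that every stump misclassifies has its weight multiplied by e^{alpha} each round, so it quickly dominates the training distribution.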

Last updated: March 9, 2026