The Boosting Algorithm
In this post, we cover the basic idea behind the various boosting algorithms. The easiest to understand is AdaBoost, which proceeds as follows. (The commonly used boosting algorithms are AdaBoost, gradient boosting, and stochastic gradient boosting, the last being the most common.)
- Initialize $M$, the maximum number of models to be fit, and set the iteration counter $m=1$. Initialize the observation weights $w_i = 1/N$ for $i = 1,2, \cdots, N$. Initialize the ensemble model $\hat{F}_0 = 0$.
- Train a model $\hat{f}_m$ using the observation weights $w_1, w_2, \cdots, w_N$, choosing it to minimize the weighted error $e_m$, defined as the sum of the weights of the misclassified observations.
- Add the model to the ensemble: $\hat{F}_m = \hat{F}_{m-1} + \alpha_m \hat{f}_m$ where $\alpha_m = \log{\frac{1-e_m}{e_m}}$.
- Update the weights $w_1, w_2, \cdots, w_N$ so that the weights are increased for the observations that were misclassified. The size of the increase depends on $\alpha_m$, with larger values of $\alpha_m$ leading to bigger weights.
- Increment the model counter $m = m+1$. If $m \le M$, return to the model-training step.
The boosted estimate is given by:
$$ \hat{F} = \alpha_1 \hat{f}_1 + \alpha_2 \hat{f}_2 + \cdots + \alpha_M \hat{f}_M $$
By increasing the weights for the observations that were misclassified, the algorithm forces each successive model to focus on the data where the previous models performed poorly. The factor $\alpha_m$ ensures that models with lower error receive a bigger weight in the ensemble.
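To make the steps above concrete, here is a minimal from-scratch sketch of the procedure in Python. It assumes binary labels coded as $-1/+1$ and uses depth-1 decision trees (stumps) from scikit-learn as the weak learners; the function and variable names are my own, chosen only for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, M=50):
    """Fit a simple AdaBoost ensemble of decision stumps.
    y is expected to contain labels in {-1, +1}."""
    N = len(y)
    w = np.full(N, 1.0 / N)            # initialize observation weights w_i = 1/N
    models, alphas = [], []

    for m in range(M):
        # Train a weak learner (a depth-1 tree, i.e. a stump) on the weighted data
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)

        # Weighted error e_m: sum of the weights of the misclassified observations
        miss = (pred != y)
        e_m = np.sum(w[miss]) / np.sum(w)
        if e_m == 0 or e_m >= 0.5:     # stop if the learner is perfect or no better than chance
            break

        # Model weight alpha_m = log((1 - e_m) / e_m)
        alpha_m = np.log((1.0 - e_m) / e_m)

        # Increase the weights of the misclassified observations, then renormalize
        w = w * np.exp(alpha_m * miss)
        w = w / np.sum(w)

        models.append(stump)
        alphas.append(alpha_m)

    return models, alphas

def adaboost_predict(models, alphas, X):
    """Boosted estimate: sign of alpha_1 f_1(X) + ... + alpha_M f_M(X)."""
    agg = sum(a * m.predict(X) for m, a in zip(models, alphas))
    return np.sign(agg)
```

The weight update `w * np.exp(alpha_m * miss)` leaves correctly classified observations untouched and multiplies the misclassified ones by $e^{\alpha_m}$, which is exactly the "larger $\alpha_m$, bigger weight increase" behavior described above.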
Detailed information can be found in StatQuest.
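In practice you would rarely code the loop by hand; scikit-learn ships an implementation in `sklearn.ensemble.AdaBoostClassifier`. A minimal usage example (the dataset here is a synthetic toy set, used only to exercise the API):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Toy data just to demonstrate fitting and scoring
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_estimators corresponds to M, the maximum number of models to fit
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```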