Impurity Metric - Gini, Entropy
Mathematics 2021. 1. 5. 15:40
The two most common impurity metrics are 1) Gini impurity and 2) entropy.
Gini Impurity
Binary case
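$ I(D) = Gini(D) = 2pq = 2p(1-p) $
$D$ denotes the dataset, $p$ is the proportion of samples in one class, and $q = 1-p$ is the proportion in the other. The factor of 2 keeps this consistent with the general case, where $c$ denotes the number of classes and $p_i$ the proportion of samples in class $i$.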
General case
$ I(D) = 1 - \sum_{i=1}^{c}{p_i}^2 $
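As a minimal sketch of the general formula $1 - \sum_i p_i^2$, assuming the class labels arrive as a plain Python list. The function name `gini_impurity` is hypothetical and not taken from any library.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum(p_i^2) for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty dataset is treated as pure
    counts = Counter(labels)  # samples per class
    return 1.0 - sum((k / n) ** 2 for k in counts.values())

print(gini_impurity(["a", "a", "b", "b"]))  # 0.5
print(gini_impurity(["a", "a", "a", "a"]))  # 0.0
```

A perfectly mixed two-class dataset gives 0.5, and a dataset with a single class gives 0.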
Entropy
Binary case
$ I(D) = -p\log_2{p} -q\log_2{q} $
General case
$ I(D) = -\sum_{i=1}^{c}{p_i \log_2{p_i}} $
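A minimal sketch of entropy as an impurity measure, computed as $-\sum_i p_i \log_2 p_i$ over the classes that actually occur, again assuming labels in a plain Python list. The function name `entropy` is hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty dataset is treated as pure
    counts = Counter(labels)  # only classes with at least one sample appear
    # p * log2(1/p) is the same term as -p * log2(p)
    return sum((k / n) * math.log2(n / k) for k in counts.values())

print(entropy(["a", "a", "b", "b"]))  # 1.0
print(entropy(["a", "a", "a", "a"]))  # 0.0
```

Classes with zero samples never enter the sum, which matches the usual convention that $0 \log_2 0 = 0$.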
Gini Impurity vs. Entropy
The figure shows that Gini impurity (rescaled) and the entropy measures are similar, with entropy giving higher impurity scores for moderate and high misclassification error.
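The comparison can be roughly reproduced numerically for the binary case. This sketch assumes that "rescaled" means scaling $p(1-p)$ by 4 so that Gini peaks at 1, the same maximum as entropy. That is a guess, since the exact scaling used in the figure is not stated. Function names are hypothetical.

```python
import math

def binary_gini_rescaled(p):
    # p(1-p) scaled by 4 so the peak at p = 0.5 equals 1 (assumed rescaling)
    return 4.0 * p * (1.0 - p)

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0  # convention: 0 * log2(0) = 0
    q = 1.0 - p
    return -p * math.log2(p) - q * math.log2(q)

# Evaluate both measures for p = 0.0, 0.1, ..., 1.0
rows = [(i / 10, binary_gini_rescaled(i / 10), binary_entropy(i / 10))
        for i in range(11)]
for p, g, h in rows:
    print(f"p={p:.1f}  gini(rescaled)={g:.3f}  entropy={h:.3f}")
```

Under this rescaling the two curves coincide at $p = 0$, $0.5$, and $1$, and entropy is at least as large as the rescaled Gini everywhere in between.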