Leverage, Influential values, Cook's distance in Regression
Mathematics 2021. 1. 2. 21:31
Leverage and Influential Values
The above image is from this YouTube video, which also covers the details, including the formula.
Cook's Distance
Cook's distance measures the influence of a single data point on the fitted regression. For observation $i$ ($i=1,\cdots,n$), Cook's distance $D_i$ is defined as the sum of the squared changes in all $n$ fitted values when observation $i$ is removed from the fit, scaled by $ps^2$:
$$D_i = \frac{\sum_{j=1}^{n} \left(\hat{y}_{j}-\hat{y}_{j(i)}\right)^2}{ps^2}$$
where $\hat{y}_{j}$ is the fitted value for observation $j$ from the model fit to all $n$ observations, $\hat{y}_{j(i)}$ is the fitted value for observation $j$ when the model is refit without observation $i$, $s^2 = (\vec{e}^{T}\vec{e}) / (n-p)$ is the mean squared error of the full model with residual vector $\vec{e}$, and $p$ is the number of fitted regression coefficients (the learnable parameters, counting the intercept if the model has one).
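To make the definition concrete, here is a minimal NumPy sketch that computes $D_i$ directly from the formula above by refitting the model once per removed observation. The function names, the synthetic data, and the planted outlier are all assumptions made for illustration. The second function uses the standard closed form based on the leverage $h_{ii}$ (the diagonal of the hat matrix), $D_i = \frac{e_i^2}{ps^2}\cdot\frac{h_{ii}}{(1-h_{ii})^2}$. This identity is not derived in this post and is included only as a cross-check.

```python
import numpy as np


def cooks_distance_loo(X, y):
    """Cook's distance by refitting without each observation (the definition)."""
    n, p = X.shape  # p counts every column of X, including the intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta
    resid = y - y_hat
    s2 = resid @ resid / (n - p)  # mean squared error of the full fit

    D = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        y_hat_i = X @ beta_i  # fitted values for ALL n points, model fit without i
        D[i] = np.sum((y_hat - y_hat_i) ** 2) / (p * s2)
    return D


def cooks_distance_leverage(X, y):
    """Same quantity via the leverage-based closed form (no refitting)."""
    n, p = X.shape
    H = X @ np.linalg.pinv(X)  # hat matrix, so that y_hat = H @ y
    h = np.diag(H)             # leverage of each observation
    resid = y - H @ y
    s2 = resid @ resid / (n - p)
    return resid ** 2 / (p * s2) * h / (1 - h) ** 2


# Synthetic example (assumed data): a linear trend plus one planted point
# that has both high leverage and a large residual.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=20)
x[-1], y[-1] = 6.0, 0.0
X = np.column_stack([np.ones_like(x), x])  # intercept column + one predictor, p = 2

D_loo = cooks_distance_loo(X, y)
D_lev = cooks_distance_leverage(X, y)
print(np.argmax(D_loo), np.allclose(D_loo, D_lev))
```

The two functions should agree up to floating-point error, and the planted point should have the largest $D_i$. The leave-one-out version costs $n$ refits, so the leverage form is the practical choice for larger data sets.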