Open source software - Regression

Lesson - #1514 Regression Influential Points


We noted in the previous section that influential data points affect the predictive power of linear regression models. They do so by greatly changing the regression coefficient(s).

It's easy to mistake these points for "outliers", but the two have different definitions. Not all outliers are influential points. In fact, in some cases the presence of an outlier, although unusual, may not change the regression line.

For illustration, suppose your data points all fall on one straight line, with the earlier points at small values such as 1 and the last point far away at 500. You can consider the last point an outlier, but the regression line remains unchanged.
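The claim that an extreme point need not move the regression line can be checked numerically. The following is a minimal sketch assuming NumPy is available; the data values are illustrative assumptions, not the points from the lesson.

```python
import numpy as np

# Illustrative data (an assumption): four points lying exactly on y = x
x = np.array([1.0, 3.0, 5.0, 6.0])
y = x.copy()

# Fit a straight line without the extreme point
slope_a, intercept_a = np.polyfit(x, y, deg=1)

# Add an extreme point that still lies on the same line
x_out = np.append(x, 500.0)
y_out = np.append(y, 500.0)
slope_b, intercept_b = np.polyfit(x_out, y_out, deg=1)

# Both fits should give (almost exactly) the same slope and intercept
print("without extreme point:", slope_a, intercept_a)
print("with extreme point:   ", slope_b, intercept_b)
```

Because the extra point sits on the line the other points already define, the fitted slope and intercept stay the same up to floating-point error.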

Rather, we should break these extreme values down into extreme y-values (high residuals, or outliers) and extreme x-values (high leverage). In some cases, an observation may have both a high residual and high leverage.


An outlier is an observation with an extreme y-value. Because the extreme value occurs in the dependent (target) variable, these observations have high residuals.
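To make the link between extreme y-values and high residuals concrete, here is a small sketch assuming NumPy. The data are made up for illustration: five points follow y = 2x + 1 and one has its y-value pushed far off that line.

```python
import numpy as np

# Illustrative data (an assumption): y = 2x + 1, except index 3,
# whose y-value is 25 instead of 9
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([3.0, 5.0, 7.0, 25.0, 11.0, 13.0])

# Fit a straight line to all six points
slope, intercept = np.polyfit(x, y, deg=1)

# Residual = observed y minus the y predicted by the fitted line
residuals = y - (slope * x + intercept)

# The observation with the largest absolute residual is the outlier candidate
largest = int(np.argmax(np.abs(residuals)))
print("residuals:", np.round(residuals, 2))
print("largest residual at index:", largest)
```

The point with the unusual y-value stands out with the largest absolute residual, even though its x-value is ordinary.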


Leverage is a measure of how far the value of a predictor variable (i.e. the independent variable, generally the x-variable) is from the mean of that variable.

An observation is said to have high leverage if the value of the predictor variable is unusual and far from the rest.
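For a simple linear regression with one predictor, the leverage of observation i can be written as h_i = 1/n + (x_i - mean(x))^2 / sum_j (x_j - mean(x))^2. The sketch below, assuming NumPy and made-up x-values, computes it directly from that formula.

```python
import numpy as np

# Illustrative predictor values (an assumption): the last one is far from the rest
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
n = len(x)

# Squared distance of each x-value from the mean of x
dev_sq = (x - x.mean()) ** 2

# Leverage for simple linear regression (one predictor plus an intercept)
leverage = 1.0 / n + dev_sq / dev_sq.sum()

highest = int(np.argmax(leverage))
print("leverage:", np.round(leverage, 3))
print("highest leverage at index:", highest)
```

Note that leverage depends only on the x-values; the y-values play no part in it. In this one-predictor setting the leverages add up to 2, the number of fitted parameters.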

An observation can have a high residual and high leverage and may or may not be an influential point.
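One simple way to see whether a point is influential is to refit the line without it and compare the slopes. The sketch below assumes NumPy; the helper name `slope_change_when_dropped` and the data are hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

def slope_change_when_dropped(x, y, i):
    """Absolute change in fitted slope when observation i is removed.

    Hypothetical helper for illustration only.
    """
    full_slope, _ = np.polyfit(x, y, deg=1)
    reduced_slope, _ = np.polyfit(np.delete(x, i), np.delete(y, i), deg=1)
    return abs(full_slope - reduced_slope)

# Illustrative data (an assumption): the last x-value has high leverage
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])

# Case 1: the high-leverage point follows the trend y = 2x + 1
y_on = 2.0 * x + 1.0

# Case 2: same x-values, but the last y-value is far below the trend
y_off = y_on.copy()
y_off[-1] = 10.0

change_on = slope_change_when_dropped(x, y_on, 5)
change_off = slope_change_when_dropped(x, y_off, 5)
print("slope change, point on trend: ", change_on)
print("slope change, point off trend:", change_off)
```

In the first case the point has high leverage but removing it barely changes the slope, so it is not influential. In the second case the same x-value combined with a large residual shifts the slope substantially.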