Simple linear regression

In statistics, simple linear regression (SLR) is a linear regression model with a single explanatory variable.[1][2][3][4][5] That is, it concerns two-dimensional sample points with one independent variable and one dependent variable (conventionally, the x and y coordinates in a Cartesian coordinate system) and finds a linear function (a non-vertical straight line) that, as accurately as possible, predicts the dependent variable values as a function of the independent variable. The adjective simple refers to the fact that the outcome variable is related to a single predictor.

It is common to make the additional stipulation that the ordinary least squares (OLS) method should be used: the accuracy of each predicted value is measured by its squared residual (the vertical distance between the data point and the fitted line), and the goal is to make the sum of these squared deviations as small as possible. In this case, the slope of the fitted line equals the sample correlation between y and x multiplied by the ratio of their standard deviations. The intercept of the fitted line is such that the line passes through the center of mass (x̄, ȳ) of the data points.
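The OLS fit described above can be sketched in a few lines of NumPy; the data here are made up purely for illustration:

```python
import numpy as np

# Hypothetical sample: one independent variable x, one dependent variable y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# OLS slope: sample covariance of x and y divided by the sample variance of x.
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Equivalent form: sample correlation multiplied by the ratio of standard deviations.
r = np.corrcoef(x, y)[0, 1]
assert np.isclose(slope, r * y.std(ddof=1) / x.std(ddof=1))

# Intercept chosen so the line passes through the center of mass (x̄, ȳ).
intercept = y.mean() - slope * x.mean()

print(slope, intercept)  # slope ≈ 1.94, intercept ≈ 0.30
```

Note that `ddof=1` selects the sample (Bessel-corrected) covariance and variance; the correction cancels in the ratio, so the population versions give the same slope.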

Interpretation

Relationship with the sample covariance matrix

The solution can be reformulated using elements of the covariance matrix:
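The formula itself did not survive extraction; the standard reformulation, in terms of the sample covariance s_{x,y}, the sample variances s_x², s_y², and the sample correlation r_{xy}, is:

```latex
\hat\beta = \frac{s_{x,y}}{s_x^2} = r_{xy}\,\frac{s_y}{s_x},
\qquad
\hat\alpha = \bar{y} - \hat\beta\,\bar{x}
```

Here β̂ is the fitted slope and α̂ the fitted intercept, matching the description in the introduction.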

See also

Design matrix#Simple linear regression

Linear trend estimation

Linear segmented regression

Proofs involving ordinary least squares — derivation of all formulas used in this article in the general multidimensional case

Newey–West estimator

External links

Wolfram MathWorld's explanation of Least Squares Fitting, and how to calculate it

Mathematics of simple regression (Robert Nau, Duke University)