What is it?
A form of model-based prediction, used for continuous target variables, as opposed to Classification, which is for discrete categories.
How do you do Linear Regression (and Machine Learning in general)?
- Specify the model
- Specify the loss function
- Minimise the loss
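A minimal sketch of this three-step recipe in Python (the toy data, variable names, and the use of scipy.optimize.minimize are illustrative choices, not part of the notes):

```python
import numpy as np
from scipy.optimize import minimize

# Toy data, invented for illustration: y is roughly 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2 * x + 1 + rng.normal(0, 1, size=50)

# 1. Specify the model: a line with parameters w = (intercept, slope).
def model(w, x):
    return w[0] + w[1] * x

# 2. Specify the loss: sum of squared errors between targets and predictions.
def loss(w):
    return np.sum((y - model(w, x)) ** 2)

# 3. Minimise the loss, starting from an arbitrary initial guess.
result = minimize(loss, x0=np.zeros(2))
print(result.x)  # close to [1, 2]
```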
How does it work?
- We assume $x$ and $y$ are related by: $y = \beta_0 + \beta_1 x + \epsilon$
- Note: This is very similar to the equation of a line ($y = mx + c$), plus a noise term $\epsilon$.
- We optimise the model to be the line with the smallest total squared distance to all of the points (hence 'least squares').
- In English: Square each prediction’s distance from the line and add them up.
- Mathematically, that looks like: $L(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left(y_i - (\beta_0 + \beta_1 x_i)\right)^2$
- The Loss is called Quadratic Loss or the Sum of Squared Errors (SSE).
- Note: Squaring the errors penalises overshooting and undershooting equally.
- Minimise the loss.
Example of the above:
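As a small, hand-checkable sketch (the three data points and the candidate line here are invented for illustration):

```python
# Three data points, invented for illustration.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 2.5, 4.0]

# Candidate line: y_hat = 0.5 + 1.0 * x
beta0, beta1 = 0.5, 1.0

# Quadratic loss: square each prediction's distance from the target, then sum.
sse = sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))
print(sse)  # (2 - 1.5)^2 + (2.5 - 2.5)^2 + (4 - 3.5)^2 = 0.5
```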
Optimisation Techniques for Linear Regression
A common way to solve for the optimal $\beta$'s is the following pair of formulae, known as the 'Least Squares Estimate'.
Optimal Solutions:
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
Once you have an optimised line, there are still points that don't lie on it (technically, these are still errors). They're known as Residuals: $e_i = y_i - \hat{y}_i$.
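A sketch of the Least Squares Estimate formulae and the resulting residuals in plain NumPy (the data and variable names are illustrative):

```python
import numpy as np

# Toy data, invented for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# Least Squares Estimates of the slope and intercept.
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()

# Residuals: the errors that remain even with the optimal line.
residuals = y - (beta0 + beta1 * x)
print(beta0, beta1)
print(residuals)        # individually non-zero...
print(residuals.sum())  # ...but they sum to (essentially) zero
```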
Smallest Loss Achievable:
The optimisation here lies in minimising the residuals. Why? Because if your line is as close as possible to every point, and no alteration could bring it closer, then you have the best-fitting model. The formula for that (I.E. the smallest possible loss) is: $L^* = n\,\sigma_y^2\,(1 - \rho^2)$ (a numerical check appears after the breakdown below).
Dissecting The Formula:
- $n$: The number of data points.
- $\sigma_y^2$: The variance of the target $y$.
- Predicting something that's more variable is much harder. Thus as the variability of $y$ increases, the smallest possible loss also increases.
- $\rho$: The correlation coefficient between $x$ and $y$.
- If $\rho^2 = 1$, $x$ is perfectly correlated with $y$. If $\rho^2 = 0$, $x$ is completely uncorrelated with $y$.
- The loss decreases as $\rho^2 \to 1$.
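A quick numerical check of the smallest-loss formula above (a sketch assuming NumPy; note that np.var uses the 1/n population convention, matching $\sigma_y^2$ here, and np.corrcoef gives $\rho$):

```python
import numpy as np

# Toy data, invented for illustration: correlated, but not perfectly.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 3 * x + rng.normal(size=200)

# Fit by least squares and compute the achieved (minimal) SSE.
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()
sse = np.sum((y - (beta0 + beta1 * x)) ** 2)

# Closed form: n * Var(y) * (1 - rho^2).
n = len(y)
rho = np.corrcoef(x, y)[0, 1]
closed_form = n * np.var(y) * (1 - rho ** 2)

print(sse, closed_form)  # the two values agree
```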