
Linear Regression Formula

Last Updated : 16 Apr, 2024

Linear regression is a statistical method used to predict the value of an unknown quantity from other, related data values. It is used to study the relationship between a dependent variable and an independent variable.

In this article we will learn about linear regression, the linear regression equation, the linear regression formulas, and related topics in detail.

What is Linear Regression?

Linear regression is a very common technique used in predictive analysis. In simple linear regression we have two variables: one is treated as the independent variable and the other as the dependent variable.

[Figure: Linear regression equation]

Commonly used regression methods are,

  1. Simple Linear Regression: This is the simplest form, where we have one thing we’re trying to predict and one thing we think might influence it. For example, we might predict someone’s weight based on their height.
  2. Multiple Linear Regression: Here, things get a bit more complex. We’re still predicting one thing, but now we’re considering multiple factors that might influence it. For instance, we might predict a person’s weight based on their height, age, and maybe even their diet habits.
  3. Logistic Regression: This one comes into play when we’re dealing with binary outcomes, like whether someone will click on an ad or not. We’re still looking at multiple factors that might play a role.
  4. Ordinal Regression: Sometimes, what we’re trying to predict isn’t exactly numerical, but it has an order. Think of rating something from 1 to 5 stars. This kind of regression helps us predict such ordinal outcomes.
  5. Multinomial Regression: When our outcome has several categories but no inherent order, like predicting someone’s favorite color among several options, we turn to multinomial regression.
  6. Discriminant Analysis: Similar to multinomial regression, this helps us when we have multiple categories for our outcome variable, but here, we’re specifically focused on classifying cases into those categories based on the predictor variables.

Each of these methods has its own strengths and best-use scenarios.

Linear Regression Equation

Linear regression line equation is written in the form:

y = a + bx

where,

  • x is Independent Variable, Plotted along X-axis
  • y is Dependent Variable, Plotted along Y-axis

The slope of the regression line is “b”, and the intercept of the regression line is “a” (the value of y when x = 0).

Linear Regression Formula

Formula used for linear regressions is, y = a + bx

Intercept value, a, and slope of the line, b, are evaluated using the formulas given below:

[Tex]\begin{array}{l}\large a~=~\frac{\sum y \sum x^{2} ~-~ \sum x \sum xy} {n(\sum x^{2}) ~-~ (\sum x)^{2}}\end{array}[/Tex]

[Tex]\begin{array}{l}\large b~=~\frac{n\sum xy~-~\left(\sum x\right)\left(\sum y\right)}{n\sum x^{2}~-~\left(\sum x\right)^{2}}\end{array}[/Tex]


where,

  • y is Dependent Variable that Lies along Y-axis
  • a is Y-Intercept
  • b is Slope of Regression Line
  • x is Independent Variable that Lies along X-axis
  • n is Number of Data Points (x, y)
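The two formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `fit_line` and the sample data (points lying exactly on y = 1 + 2x) are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line y = a + b*x using the summation formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Common denominator of both formulas: n*sum(x^2) - (sum(x))^2
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # intercept
    b = (n * sum_xy - sum_x * sum_y) / denom       # slope
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # prints 1.0 2.0
```

The check on the denominator is there because the formulas break down when every x value is the same: a vertical set of points has no finite slope.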

Properties of Linear Regression

For a linear regression line with regression parameters b0 and b1, the properties are given below:

  • Linear regression line minimizes the sum of squared differences between observed values and predicted values.
  • Linear regression line always passes through the point given by the means of the X and Y values.
  • Linear regression constant (b0) is equal to the y-intercept of the regression line.
  • Linear regression coefficient (b1) is the slope of the regression line.
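The first two properties can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `fit_line` and `sse` and the five data points are made up for this example, and nudging the fitted line a little is a spot check, not a proof that the sum of squares is minimal.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 from the summation formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_x2 - sum_x ** 2
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    return b0, b1


def sse(xs, ys, b0, b1):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_line(xs, ys)

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Property: the fitted line passes through (mean of x, mean of y)
passes_through_mean = abs((b0 + b1 * mean_x) - mean_y) < 1e-9

# Property: nudging the intercept or slope does not lower the squared error
best = sse(xs, ys, b0, b1)
nudged = [sse(xs, ys, b0 + d0, b1 + d1)
          for d0 in (-0.1, 0.0, 0.1)
          for d1 in (-0.1, 0.0, 0.1)
          if (d0, d1) != (0.0, 0.0)]
is_minimum_locally = all(best < other for other in nudged)

print(passes_through_mean, is_minimum_locally)
```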

Linear Regression Line

Least squares method is the most common method used to fit a regression line to the data in the X-Y graph. In this process we determine the line of best fit by minimizing the sum of the squares of the vertical deviations from each data point to the line.

For any point that lies exactly on the fitted line, its vertical deviation is zero. Linear regression line is shown in the image added below,

[Figure: X and Y linear regression]

Regression Coefficient

Linear regression line, equation:

Y = B0 + B1X

where,

  • B0 is a Constant
  • B1 is Regression Coefficient

Here, B1 is the regression coefficient and its formula is,

[Tex]B_1 = b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^{2}}[/Tex]

where,

  • xi and yi are Observed Data Values
  • x̄ and ȳ are Mean Values of x and y
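The regression coefficient formula can be sketched in Python as shown below. The function name `regression_coefficients` is an assumption for this example. The constant is computed as B0 = ȳ – B1·x̄, which is not stated above but follows from the property that the regression line passes through the means. The data used is the data of Question 2 in the solved questions further below.

```python
def regression_coefficients(xs, ys):
    """Return (B0, B1) using deviations from the mean values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Numerator: sum of (xi - mean_x) * (yi - mean_y)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: sum of (xi - mean_x)^2
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; B1 is undefined")

    b1 = s_xy / s_xx
    # Assumption: B0 follows from the line passing through (mean_x, mean_y)
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = regression_coefficients([4, 7, 3, 1], [6, 5, 8, 3])
print(round(b0, 4), round(b1, 4))  # prints 4.8 0.1867
```

This form gives the same slope as the summation formula for b, because dividing the numerator and denominator of that formula by n yields the deviation sums used here.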

What is Linear Regression Used for?

Various uses of Linear Regression are,

  • It is used in market research and study of customer survey results.
  • It is used for studying engine performance in automobiles.
  • It is used in deciding the effective price of any goods.
  • It is used in astronomy.

Error in Linear Regression Formula

Standard error about the regression line is a measure of the average amount by which the regression equation over-predicts or under-predicts the observed values. Standard error in this case is denoted by ‘SE‘. The higher the coefficient of determination, the lower the standard error, and hence the more accurate the predictions.
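The article does not give a formula for SE, so the sketch below assumes the common textbook definitions: SE = √(Σ(y – ŷ)² / (n – 2)) for the standard error about the regression line and R² = 1 – Σ(y – ŷ)² / Σ(y – ȳ)² for the coefficient of determination. The function name and the sample data are also assumptions made for illustration.

```python
import math


def standard_error_and_r2(xs, ys):
    """Return (SE, R^2) for the least-squares line fitted to the data."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical")

    b = s_xy / s_xx          # slope
    a = mean_y - b * mean_x  # intercept

    # Squared error left over after fitting the line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total squared variation of y about its mean
    sst = sum((y - mean_y) ** 2 for y in ys)
    if sst == 0:
        raise ValueError("all y values are identical; R^2 is undefined")

    se = math.sqrt(sse / (n - 2))  # assumed definition, n - 2 degrees of freedom
    r2 = 1 - sse / sst
    return se, r2


# Hypothetical data chosen only for illustration
se, r2 = standard_error_and_r2([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(se, 4), round(r2, 2))
```

Under these definitions both quantities depend on the same leftover squared error, which is why a higher coefficient of determination goes together with a lower standard error for a given data set.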

Solved Questions on Linear Regression

Question 1: Find the linear regression equation for the given data:

| x | y |
|---|---|
| 3 | 8 |
| 9 | 6 |
| 5 | 4 |
| 3 | 2 |

Solution:

Calculating intercept and slope value.

| x | y | x² | xy |
|---|---|----|----|
| 3 | 8 | 9 | 24 |
| 9 | 6 | 81 | 54 |
| 5 | 4 | 25 | 20 |
| 3 | 2 | 9 | 6 |
| ∑x = 20 | ∑y = 20 | ∑x² = 124 | ∑xy = 104 |

Using formula,

[Tex]\begin{array}{l}\large a~=~\frac{\sum y \sum x^{2} ~-~ \sum x \sum xy} {n(\sum x^{2}) ~-~ (\sum x)^{2}}\end{array}[/Tex]

a = {20 (124) – 20 (104)} / {4 (124) – 400}

a = 400/96 = 4.17

[Tex]\begin{array}{l}\large b~=~\frac{n\sum xy~-~\left(\sum x\right)\left(\sum y\right)}{n\sum x^{2}~-~\left(\sum x\right)^{2}}\end{array}[/Tex]

b = {4 (104) – 20 (20)} / {4 (124) – 400}

b = 16/96 = 0.167

So, the linear regression equation is y = a + bx, i.e., y = 4.17 + 0.167x


Question 2: Find the linear regression equation for the given data:

| x | y |
|---|---|
| 4 | 6 |
| 7 | 5 |
| 3 | 8 |
| 1 | 3 |

Solution:

Calculating intercept and slope value.

| x | y | x² | xy |
|---|---|----|----|
| 4 | 6 | 16 | 24 |
| 7 | 5 | 49 | 35 |
| 3 | 8 | 9 | 24 |
| 1 | 3 | 1 | 3 |
| ∑x = 15 | ∑y = 22 | ∑x² = 75 | ∑xy = 86 |

Using formula,

[Tex]\begin{array}{l}\large a~=~\frac{\sum y \sum x^{2} ~-~ \sum x \sum xy} {n(\sum x^{2}) ~-~ (\sum x)^{2}}\end{array}[/Tex]

a = (22 (75) – 15 (86)) / (4 (75) – 225)

a = 360/75

a = 4.8

[Tex]\begin{array}{l}\large b~=~\frac{n\sum xy~-~\left(\sum x\right)\left(\sum y\right)}{n\sum x^{2}~-~\left(\sum x\right)^{2}}\end{array}[/Tex]

b = (4 (86) – 15 (22)) / (4 (75) – 225)

b = 14/75

b = 0.1867

So, the linear regression equation is y = 4.8 + 0.1867x.

Question 3: Find the intercept of linear regression line if ∑x = 25, ∑y = 20, ∑x² = 90, ∑xy = 150 and n = 5.

Solution:

Using formula,

[Tex]\begin{array}{l}\large a=\frac{\sum y \sum x^{2} - \sum x \sum xy} {n(\sum x^{2}) - (\sum x)^{2}}\end{array}[/Tex]

a = (20 (90) – 25 (150)) / (5 (90) – 625)

a = -1950/-175

a = 11.14

Question 4: Find the intercept of linear regression line if ∑x = 30, ∑y = 27, ∑x² = 110, ∑xy = 190 and n = 4.

Solution:

Using formula,

[Tex]\begin{array}{l}\large a=\frac{\sum y \sum x^{2} - \sum x \sum xy} {n(\sum x^{2}) - (\sum x)^{2}}\end{array}[/Tex]

a = (27 (110) – 30 (190)) / (4 (110) – 900)

a = -2730/-460

a = 5.93

Question 5: Find the slope of linear regression line if ∑x = 10, ∑y = 16, ∑x² = 60, ∑xy = 120 and n = 4.

Solution:

Using formula,

[Tex]\begin{array}{l}\large b=\frac{n\sum xy-\left(\sum x\right)\left(\sum y\right)}{n\sum x^{2}-\left(\sum x\right)^{2}}\end{array}[/Tex]

b = (4 (120) – 10 (16)) / (4 (60) – 100)

b = 320/140

b = 2.29

Question 6: Find the slope of linear regression line if ∑x = 40, ∑y = 32, ∑x² = 130, ∑xy = 210 and n = 4.

Solution:

Using formula,

[Tex]\begin{array}{l}\large b=\frac{n\sum xy-\left(\sum x\right)\left(\sum y\right)}{n\sum x^{2}-\left(\sum x\right)^{2}}\end{array}[/Tex]

b = (4 (210) – 40 (32)) / (4 (130) – 1600)

b = -440/-1080

b = 0.407

Question 7: Find the intercept and slope of linear regression line if ∑x = 50, ∑y = 44, ∑x² = 150, ∑xy = 230 and n = 4.

Solution:

Using formula,

[Tex]\begin{array}{l}\large a=\frac{\sum y \sum x^{2} - \sum x \sum xy} {n(\sum x^{2}) - (\sum x)^{2}}\end{array}[/Tex]

a = (44 (150) – 50 (230)) / (4 (150) – 2500)

a = -4900/-1900

a = 2.58

[Tex]\begin{array}{l}\large b=\frac{n\sum xy-\left(\sum x\right)\left(\sum y\right)}{n\sum x^{2}-\left(\sum x\right)^{2}}\end{array}[/Tex]

b = (4 (230) – 50 (44)) / (4 (150) – 2500)

b = -1280/-1900

b = 0.674
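Questions 3 to 7 start from the summary sums and not from the raw data. A small Python sketch for that case is given below. The function names `intercept_from_sums` and `slope_from_sums` are assumptions for this example, and the inputs are the sums given in Question 3 and Question 6.

```python
def _denominator(n, sum_x, sum_x2):
    """Shared denominator n*sum(x^2) - (sum(x))^2 of both formulas."""
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("denominator is zero; the line is undefined")
    return denom


def intercept_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Intercept a computed directly from the summary sums."""
    return (sum_y * sum_x2 - sum_x * sum_xy) / _denominator(n, sum_x, sum_x2)


def slope_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Slope b computed directly from the summary sums."""
    return (n * sum_xy - sum_x * sum_y) / _denominator(n, sum_x, sum_x2)


# Sums as given in Question 3
a3 = intercept_from_sums(n=5, sum_x=25, sum_y=20, sum_x2=90, sum_xy=150)
print(round(a3, 2))  # prints 11.14

# Sums as given in Question 6
b6 = slope_from_sums(n=4, sum_x=40, sum_y=32, sum_x2=130, sum_xy=210)
print(round(b6, 3))  # prints 0.407
```

Keyword arguments are used in the calls so that the five sums cannot be silently swapped, which is an easy mistake when working from a list of totals.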

Linear Regression: FAQs

What is the application of linear regression?

Some applications of linear regression are,

  • Analyzing markets under various marketing strategies.
  • Financial studies using various linear models.
  • Sports analysis, by predicting game attendance, team size, market value, etc.

What are examples of linear regression?

Some examples of linear regression are,

  • Business Analysis
  • Height and Weight Analysis of Body, etc.

What are the parameters of linear regression?

Parameters of linear regression are, ‘α’ and ‘β’ or ‘a’ and ‘b’.

Why is it called linear regression?

Linear regression models a linear (straight-line) relationship between the independent variable, plotted along the X-axis, and the dependent variable, plotted along the Y-axis, and hence it is called linear regression.


