Least Squares Regression Line
Given a set of coordinates in the form of (X, Y), the task is to find the least squares regression line that fits them.
In statistics, Linear Regression is a linear approach to model the relationship between a scalar response (or dependent variable), say Y, and one or more explanatory variables (or independent variables), say X.
Regression Line: If the data shows a linear relationship between X and Y, the straight line that best describes that relationship is the regression line. "Best" here means the line that minimizes the sum of the squared vertical distances between the data points and the line; it does not have to pass through any of the points.
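The least squares criterion can be checked numerically. The following is a minimal sketch, assuming a hypothetical helper named sse (not part of the original code) and the data of the first example below. It computes the sum of squared vertical distances for the fitted line and for two arbitrary nearby lines chosen only for comparison:

```python
def sse(xs, ys, a, b):
    """Sum of squared vertical distances from the points to Y = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

X = [95, 85, 80, 70, 60]
Y = [90, 80, 70, 65, 60]

# Coefficients of the least squares line for this data, rounded to 3 decimals.
best = sse(X, Y, 5.685, 0.863)

# Two arbitrary alternative lines (assumed values, for comparison only).
shifted = sse(X, Y, 6.685, 0.863)   # same slope, intercept raised by 1
steeper = sse(X, Y, 5.685, 0.900)   # same intercept, larger slope

print(best < shifted, best < steeper)  # prints: True True
```

Both alternatives give a larger sum of squared distances, which is what it means for the fitted line to be the least squares line.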
Examples:
Input: X = [95, 85, 80, 70, 60]
Y = [90, 80, 70, 65, 60]
Output: Y = 5.685 + 0.863*X
Explanation:
For the data
X = [95, 85, 80, 70, 60]
Y = [90, 80, 70, 65, 60]
the regression line obtained is Y = 5.685 + 0.863*X.
Plotted against the data, this line does not pass through every point; it is the line with the smallest sum of squared vertical distances to the points.
Input: X = [100, 95, 85, 80, 70, 60]
Y = [90, 95, 80, 70, 65, 60]
Output: Y = 4.007 + 0.890*X
Approach:
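A regression line is written as Y = a + b*X, where the slope b and the intercept a are given by:
$$b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
$$a = \bar{y} - b\,\bar{x}$$
where $\bar{x}$ and $\bar{y}$ are the means of x and y respectively, and n is the number of points.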
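As a check on the formulas for b and a, the arithmetic for the first example (n = 5) can be worked by hand. The intermediate sums below are computed from that example's data:

```latex
\begin{aligned}
\sum x_i &= 390, \quad \sum y_i = 365, \quad \sum x_i y_i = 29100, \quad \sum x_i^2 = 31150 \\
b &= \frac{5 \cdot 29100 - 390 \cdot 365}{5 \cdot 31150 - 390^2}
   = \frac{145500 - 142350}{155750 - 152100}
   = \frac{3150}{3650} \approx 0.863 \\
\bar{x} &= \frac{390}{5} = 78, \quad \bar{y} = \frac{365}{5} = 73 \\
a &= 73 - \frac{3150}{3650} \cdot 78 \approx 5.685
\end{aligned}
```

This matches the stated output Y = 5.685 + 0.863*X.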
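- To find the regression line, we need to find a and b.
- Calculate b, which is given by $b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$
- Calculate a, which is given by $a = \bar{y} - b\,\bar{x}$
- Substitute the values of a and b into the equation of the regression line, Y = a + b*X.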
Below is the implementation of the above approach.
C++
#include <cstdio>
#include <iostream>
#include <numeric>
using namespace std;

// Slope b of the least squares line through the points (x[i], y[i])
double calculateB(int x[], int y[], int n)
{
    // Sum of all x values and of all y values
    int sx = accumulate(x, x + n, 0);
    int sy = accumulate(y, y + n, 0);

    // Sum of products x*y and sum of squares x*x
    int sxsy = 0;
    int sx2 = 0;
    for (int i = 0; i < n; i++)
    {
        sxsy += x[i] * y[i];
        sx2 += x[i] * x[i];
    }
    double b = (double)(n * sxsy - sx * sy) /
               (n * sx2 - sx * sx);
    return b;
}

// Prints the least squares regression line Y = a + b*X
void leastRegLine(int X[], int Y[], int n)
{
    double b = calculateB(X, Y, n);

    // Use floating-point means; integer division would truncate them
    double meanX = accumulate(X, X + n, 0) / (double)n;
    double meanY = accumulate(Y, Y + n, 0) / (double)n;

    // Intercept
    double a = meanY - b * meanX;

    cout << "Regression line:" << endl;
    printf("Y = %.3f + %.3f*X\n", a, b);
}

// Driver code
int main()
{
    int X[] = { 95, 85, 80, 70, 60 };
    int Y[] = { 90, 80, 70, 65, 60 };
    int n = sizeof(X) / sizeof(X[0]);
    leastRegLine(X, Y, n);
    return 0;
}
Java
import java.util.Arrays;

public class GFG {

    // Slope b of the least squares line through the points (x[i], y[i])
    private static double calculateB(int[] x, int[] y)
    {
        int n = x.length;

        // Sum of all x values and of all y values
        int sx = Arrays.stream(x).sum();
        int sy = Arrays.stream(y).sum();

        // Sum of products x*y and sum of squares x*x
        int sxsy = 0;
        int sx2 = 0;
        for (int i = 0; i < n; i++) {
            sxsy += x[i] * y[i];
            sx2 += x[i] * x[i];
        }
        double b = (double)(n * sxsy - sx * sy)
                   / (n * sx2 - sx * sx);
        return b;
    }

    // Prints the least squares regression line Y = a + b*X
    public static void leastRegLine(int[] X, int[] Y)
    {
        double b = calculateB(X, Y);
        int n = X.length;

        // Use floating-point means; integer division would truncate them
        double meanX = (double)Arrays.stream(X).sum() / n;
        double meanY = (double)Arrays.stream(Y).sum() / n;

        // Intercept
        double a = meanY - b * meanX;

        System.out.println("Regression line:");
        System.out.printf("Y = %.3f + %.3f*X%n", a, b);
    }

    // Driver code
    public static void main(String[] args)
    {
        int[] X = { 95, 85, 80, 70, 60 };
        int[] Y = { 90, 80, 70, 65, 60 };
        leastRegLine(X, Y);
    }
}
Python3
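# Slope b of the least squares line through the points (x[i], y[i])
def calculateB(x, y, n):

    # Sum of all x values and of all y values
    sx = sum(x)
    sy = sum(y)

    # Sum of products x*y and sum of squares x*x
    sxsy = 0
    sx2 = 0
    for i in range(n):
        sxsy += x[i] * y[i]
        sx2 += x[i] * x[i]
    b = (n * sxsy - sx * sy) / (n * sx2 - sx * sx)
    return b

# Prints the least squares regression line Y = a + b*X
def leastRegLine(X, Y, n):
    b = calculateB(X, Y, n)

    # Use the exact means; truncating them with int() would skew a
    meanX = sum(X) / n
    meanY = sum(Y) / n

    # Intercept
    a = meanY - b * meanX

    print("Regression line:")
    print("Y = ", '%.3f' % a, " + ", '%.3f' % b, "*X", sep="")

# Driver code
X = [95, 85, 80, 70, 60]
Y = [90, 80, 70, 65, 60]
n = len(X)
leastRegLine(X, Y, n)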
C#
using System;
using System.Linq;

class GFG {

    // Slope b of the least squares line through the points (x[i], y[i])
    private static double calculateB(int[] x, int[] y)
    {
        int n = x.Length;

        // Sum of all x values and of all y values
        int sx = x.Sum();
        int sy = y.Sum();

        // Sum of products x*y and sum of squares x*x
        int sxsy = 0;
        int sx2 = 0;
        for (int i = 0; i < n; i++)
        {
            sxsy += x[i] * y[i];
            sx2 += x[i] * x[i];
        }
        double b = (double)(n * sxsy - sx * sy) /
                   (n * sx2 - sx * sx);
        return b;
    }

    // Prints the least squares regression line Y = a + b*X
    public static void leastRegLine(int[] X, int[] Y)
    {
        double b = calculateB(X, Y);
        int n = X.Length;

        // Use floating-point means; integer division would truncate them
        double meanX = (double)X.Sum() / n;
        double meanY = (double)Y.Sum() / n;

        // Intercept
        double a = meanY - b * meanX;

        Console.WriteLine("Regression line:");
        Console.WriteLine("Y = {0:F3} + {1:F3}*X", a, b);
    }

    // Driver code
    public static void Main(String[] args)
    {
        int[] X = { 95, 85, 80, 70, 60 };
        int[] Y = { 90, 80, 70, 65, 60 };
        leastRegLine(X, Y);
    }
}
Javascript
<script>
// Slope b of the least squares line through the points (x[i], y[i])
function calculateB(x, y)
{
    let n = x.length;

    // Sum of all x values and of all y values
    let sx = x.reduce((a, b) => a + b, 0);
    let sy = y.reduce((a, b) => a + b, 0);

    // Sum of products x*y and sum of squares x*x
    let sxsy = 0;
    let sx2 = 0;
    for (let i = 0; i < n; i++) {
        sxsy += x[i] * y[i];
        sx2 += x[i] * x[i];
    }
    let b = (n * sxsy - sx * sy)
            / (n * sx2 - sx * sx);
    return b;
}

// Writes the least squares regression line Y = a + b*X to the page
function leastRegLine(X, Y)
{
    let b = calculateB(X, Y);
    let n = X.length;

    // Means of X and Y
    let meanX = X.reduce((a, b) => a + b, 0) / n;
    let meanY = Y.reduce((a, b) => a + b, 0) / n;

    // Intercept
    let a = meanY - b * meanX;

    document.write("Regression line:<br>");
    document.write("Y = ");
    document.write(a.toFixed(3));
    document.write(" + ");
    document.write(b.toFixed(3));
    document.write("*X");
}

// Driver code
let X = [95, 85, 80, 70, 60];
let Y = [90, 80, 70, 65, 60];
leastRegLine(X, Y);
</script>
Output:
Regression line:
Y = 5.685 + 0.863*X
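The first example happens to have whole-number means (78 and 73), so it cannot reveal a mistake in how the means are computed. The second example has fractional means, which makes it a better check. The following is a minimal Python sketch, assuming a hypothetical helper named fit_line (not part of the code above). It returns the coefficients instead of printing them, and it rejects input where all x values are equal, since the denominator of b is then zero:

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y = a + b*X."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)

    denom = n * sx2 - sx * sx
    if denom == 0:
        # All x values are identical, so the slope is undefined.
        raise ValueError("slope is undefined when all x values are equal")

    b = (n * sxy - sx * sy) / denom
    a = sy / n - b * (sx / n)   # exact means, no integer truncation
    return a, b

# Data of the second example
a, b = fit_line([100, 95, 85, 80, 70, 60], [90, 95, 80, 70, 65, 60])
print("Y = %.3f + %.3f*X" % (a, b))  # prints: Y = 4.007 + 0.890*X
```

Returning a tuple rather than printing keeps the arithmetic separate from the formatting, so the result can be tested or reused directly.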
Last Updated :
07 Jul, 2021