Represent a given set of points by the best possible straight line

Last Updated : 22 Nov, 2022

Find the values of m and c such that the straight line y = mx + c best represents a given set of points (x $_1$ , y $_1$ ), (x $_2$ , y $_2$ ), (x $_3$ , y $_3$ ), ..., (x $_n$ , y $_n$ ), where n >= 2.

Examples:

Input : n = 5
        x = 1, 2, 3, 4, 5
        y = 14, 27, 40, 55, 68
Output : m = 13.6
         c = 0

If we take any pair (x, y) from the given data, these values of m and c should make it fit the straight-line equation y = mx + c as closely as possible.

Take x = 1 and y = 14. Putting the values of m and c from the output into y = mx + c:
L.H.S.: y = 14, R.H.S.: mx + c = 13.6 * 1 + 0 = 13.6
So, they are approximately equal.

Now, take x = 3 and y = 40:
L.H.S.: y = 40, R.H.S.: mx + c = 13.6 * 3 + 0 = 40.8
So, they are also approximately equal, and so on for all other values.

Input : n = 6
        x = 1, 2, 3, 4, 5, 6
        y = 1200, 900, 600, 200, 110, 50
Output : m = -243.43
         c = 1362

Approach

To fit a straight line to a set of points, we need to find the values of two variables, m and c. Since there are two unknowns, two cases are possible depending on the value of n:

Case 1 (n = 2): There are two equations and two unknowns, so there is a unique solution, provided the two x values differ.
Case 2 (n > 2): There may or may not exist values of m and c that satisfy all n equations, but we can still find the values of m and c that fit a straight line to the given points as closely as possible.
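The two-point case can be sketched directly: two equations in two unknowns are solved by elimination. The helper name `line_through_two_points` is an assumption made for this illustration, not something from the article.

```python
def line_through_two_points(p, q):
    """Return (m, c) of the line y = m*x + c through points p and q.

    Hypothetical helper for illustration; p and q are (x, y) tuples.
    """
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        # Same x twice: the line is vertical (or the points coincide),
        # so no unique (m, c) exists.
        raise ValueError("x values must differ")
    m = (y2 - y1) / (x2 - x1)   # subtract the two equations to eliminate c
    c = y1 - m * x1             # back-substitute into y1 = m*x1 + c
    return m, c


print(line_through_two_points((1, 14), (2, 27)))  # (13.0, 1.0)
```

With only the first two sample points, the line is exact for both of them. With more points, an exact fit is generally not possible.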

So, if we have n different pairs of x and y, we can form n equations for a straight line from them, as follows:

f $_1$ = mx $_1$ + c,
f $_2$ = mx $_2$ + c,
f $_3$ = mx $_3$ + c,
......,
f $_n$ = mx $_n$ + c,

where f $_i$ is the value obtained by putting x $_i$ in the expression mx + c.

Ideally, f $_i$ should be the same as y $_i$ . When that is not possible for every i, we can still bring the f $_i$ as close as possible to the y $_i$ overall by defining a new quantity, U = Σ(y $_i$ - f $_i$ ) $^2$ , summed over i from 1 to n, and choosing m and c so that U is minimum.

Note: (y $_i$ - f $_i$ ) $^2$ is used in place of (y $_i$ - f $_i$ ) because the difference should count whether f $_i$ or y $_i$ is the greater of the two. Without squaring, cases in which f $_i$ is greater and cases in which y $_i$ is greater would partly cancel each other, which is not what we want. So, we need to square the term.
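A small numeric check makes the cancellation concrete. The sketch below uses the first sample data set and compares the fitted line y = 13.6x with a deliberately poor candidate, a horizontal line through the mean of y (chosen here only for illustration). Both have signed residuals that sum to roughly zero, but the squared sums are about 1.2 and about 1850.8 respectively.

```python
xs = [1, 2, 3, 4, 5]
ys = [14, 27, 40, 55, 68]


def signed_and_squared(m, c):
    """Return (sum of residuals, sum of squared residuals) for y = m*x + c."""
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
    return sum(residuals), sum(r * r for r in residuals)


# Fitted line from the first example.
good_signed, good_squared = signed_and_squared(13.6, 0.0)

# Hypothetical poor candidate: a horizontal line through the mean of y.
flat_signed, flat_squared = signed_and_squared(0.0, sum(ys) / len(ys))

print(round(good_signed, 6), round(good_squared, 6))
print(round(flat_signed, 6), round(flat_squared, 6))
```

The signed sum cannot tell the two lines apart, while the squared sum clearly prefers the fitted one.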

Now, for U to be minimum, it must satisfy the following two equations:

∂U/∂m = 0 and ∂U/∂c = 0.

Expanding the above two conditions gives two equations, as follows:

Σy = nc + mΣx, and
Σxy = cΣx + mΣx $^2$ ,

which can be rearranged as

m = (n * Σxy - ΣxΣy) / (n * Σx $^2$ - (Σx) $^2$ ), and
c = (Σy - mΣx) / n.
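The step from the two minimum conditions to these equations can be written out as follows, with f_i = m x_i + c substituted into U. This is a sketch of the standard least-squares derivation, followed by the arithmetic for the first sample input.

```latex
\begin{aligned}
U &= \sum_{i=1}^{n} (y_i - m x_i - c)^2 \\
\frac{\partial U}{\partial m} &= -2 \sum_{i=1}^{n} x_i (y_i - m x_i - c) = 0
  \;\Rightarrow\; \sum x_i y_i = c \sum x_i + m \sum x_i^2 \\
\frac{\partial U}{\partial c} &= -2 \sum_{i=1}^{n} (y_i - m x_i - c) = 0
  \;\Rightarrow\; \sum y_i = n c + m \sum x_i \\[6pt]
\text{Sample 1: } & n = 5,\ \sum x = 15,\ \sum y = 204,\ \sum xy = 748,\ \sum x^2 = 55 \\
m &= \frac{5 \cdot 748 - 15 \cdot 204}{5 \cdot 55 - 15^2} = \frac{680}{50} = 13.6 \\
c &= \frac{204 - 13.6 \cdot 15}{5} = 0
\end{aligned}
```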

This is how the values of m and c are obtained in both cases, which lets us represent a given set of points by the best possible straight line.

The following code implements the above algorithm:

C++

// C++ Program to find m and c for a straight line given,
// x and y
#include <cmath>
#include <iostream>
using namespace std;
 
// function to calculate m and c that best fit points
// represented by x[] and y[]
void bestApproximate(int x[], int y[], int n)
{
    float m, c, sum_x = 0, sum_y = 0, sum_xy = 0, sum_x2 = 0;
    for (int i = 0; i < n; i++) {
        sum_x += x[i];
        sum_y += y[i];
        sum_xy += x[i] * y[i];
        sum_x2 += pow(x[i], 2);
    }
 
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - pow(sum_x, 2));
    c = (sum_y - m * sum_x) / n;
 
    cout << "m = " << m << "\n";
    cout << "c = " << c << "\n";
}
 
// Driver main function
int main()
{
    int x[] = { 1, 2, 3, 4, 5 };
    int y[] = { 14, 27, 40, 55, 68 };
    int n = sizeof(x) / sizeof(x[0]);
    bestApproximate(x, y, n);
    return 0;
}

C

// C Program to find m and c for a straight line given,
// x and y
#include <stdio.h>
 
// function to calculate m and c that best fit points
// represented by x[] and y[]
void bestApproximate(int x[], int y[], int n)
{
    int i;
    float m, c, sum_x = 0, sum_y = 0, sum_xy = 0, sum_x2 = 0;
    for (i = 0; i < n; i++) {
        sum_x += x[i];
        sum_y += y[i];
        sum_xy += x[i] * y[i];
        sum_x2 += (x[i] * x[i]);
    }
 
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - (sum_x * sum_x));
    c = (sum_y - m * sum_x) / n;
 
    printf("m = %f\n", m);
    printf("c = %f\n", c);
}
 
// Driver main function
int main()
{
    int x[] = { 1, 2, 3, 4, 5 };
    int y[] = { 14, 27, 40, 55, 68 };
    int n = sizeof(x) / sizeof(x[0]);
    bestApproximate(x, y, n);
    return 0;
}

Java

// Java Program to find m and c for a straight line given,
// x and y
import static java.lang.Math.pow;
 
public class A {
    // function to calculate m and c that best fit points
    // represented by x[] and y[]
    static void bestApproximate(int x[], int y[])
    {
        int n = x.length;
        double m, c, sum_x = 0, sum_y = 0,
                     sum_xy = 0, sum_x2 = 0;
        for (int i = 0; i < n; i++) {
            sum_x += x[i];
            sum_y += y[i];
            sum_xy += x[i] * y[i];
            sum_x2 += pow(x[i], 2);
        }
 
        m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - pow(sum_x, 2));
        c = (sum_y - m * sum_x) / n;
 
        System.out.println("m = " + m);
        System.out.println("c = " + c);
    }
 
    // Driver main function
    public static void main(String args[])
    {
        int x[] = { 1, 2, 3, 4, 5 };
        int y[] = { 14, 27, 40, 55, 68 };
        bestApproximate(x, y);
    }
}

Python3

# python Program to find m and c for
# a straight line given, x and y
 
# function to calculate m and c that
# best fit points represented by x[]
# and y[]
def bestApproximate(x, y, n):
     
    sum_x = 0
    sum_y = 0
    sum_xy = 0
    sum_x2 = 0
     
    # accumulate the sums used by the formulas for m and c
    for i in range(n):
        sum_x += x[i]
        sum_y += y[i]
        sum_xy += x[i] * y[i]
        sum_x2 += x[i] ** 2

    # slope and intercept of the best-fit line
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    c = (sum_y - m * sum_x) / n

    print("m =", m)
    print("c =", c)
     
     
# Driver main function
x = [1, 2, 3, 4, 5 ]
y = [ 14, 27, 40, 55, 68] 
n = len(x)
 
bestApproximate(x, y, n)
     
# This code is contributed by Sam007.

C#

// C# Program to find m and c for a
// straight line given, x and y
using System;
 
class GFG {
 
    // function to calculate m and c that
    // best fit points represented by x[] and y[]
    static void bestApproximate(int[] x, int[] y)
    {
        int n = x.Length;
        double m, c, sum_x = 0, sum_y = 0,
                     sum_xy = 0, sum_x2 = 0;
 
        for (int i = 0; i < n; i++) {
            sum_x += x[i];
            sum_y += y[i];
            sum_xy += x[i] * y[i];
            sum_x2 += Math.Pow(x[i], 2);
        }
 
        m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - Math.Pow(sum_x, 2));
 
        c = (sum_y - m * sum_x) / n;
 
        Console.WriteLine("m = " + m);
        Console.WriteLine("c = " + c);
    }
 
    // Driver main function
    public static void Main()
    {
        int[] x = { 1, 2, 3, 4, 5 };
        int[] y = { 14, 27, 40, 55, 68 };
 
        // Function calling
        bestApproximate(x, y);
    }
}
 
// This code is contributed by Sam007

PHP

<?php
// PHP Program to find m and c 
// for a straight line given,
// x and y
 
// function to calculate m and 
// c that best fit points
// represented by x[] and y[]
function bestApproximate($x, $y, $n)
{
    // Running sums used by the formulas for m and c
    $sum_x = 0; 
    $sum_y = 0; 
    $sum_xy = 0; 
    $sum_x2 = 0;
    for ($i = 0; $i < $n; $i++)
    {
        $sum_x += $x[$i];
        $sum_y += $y[$i];
        $sum_xy += $x[$i] * $y[$i];
        $sum_x2 += ($x[$i] * $x[$i]);
    }
 
    $m = ($n * $sum_xy - $sum_x * $sum_y) / 
         ($n * $sum_x2 - ($sum_x * $sum_x));
    $c = ($sum_y - $m * $sum_x) / $n;
 
    echo "m =", $m;
    echo "\nc =", $c;
}
 
    // Driver Code
    $x =array(1, 2, 3, 4, 5);
    $y =array (14, 27, 40, 55, 68);
    $n = sizeof($x);
    bestApproximate($x, $y, $n);
 
// This code is contributed by ajit
?>

Javascript

<script>
 
// Javascript Program to find m and c 
// for a straight line given, x and y
 
// function to calculate m and c that
// best fit points represented by x[] and y[]
function bestApproximate(x, y, n)
{
    let m, c, sum_x = 0, sum_y = 0, 
             sum_xy = 0, sum_x2 = 0;
    for(let i = 0; i < n; i++) 
    {
        sum_x += x[i];
        sum_y += y[i];
        sum_xy += x[i] * y[i];
        sum_x2 += Math.pow(x[i], 2);
    }
 
    m = (n * sum_xy - sum_x * sum_y) / 
        (n * sum_x2 - Math.pow(sum_x, 2));
    c = (sum_y - m * sum_x) / n;
 
    document.write("m =" + m);
    document.write("<br>c =" + c);
}
 
// Driver code
let x = [ 1, 2, 3, 4, 5 ];
let y = [ 14, 27, 40, 55, 68 ];
let n = x.length;
 
bestApproximate(x, y, n);
 
// This code is contributed by subham348
 
</script>

Output:

m = 13.6
c = 0.0
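The programs above divide by `n * sum_x2 - sum_x * sum_x`, which is zero when every x value is the same, since a vertical line has no slope m. A minimal sketch of a guarded variant follows. The function name `fit_line` and its choice to return the values instead of printing them are assumptions made for this illustration.

```python
def fit_line(x, y):
    """Return (m, c) for the best-fit line y = m*x + c.

    Hypothetical helper, not part of the programs above.
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two points and equal-length lists")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    denom = n * sum_x2 - sum_x * sum_x
    if denom == 0:
        # All x values are equal: the points lie on a vertical line.
        # (The comparison is exact for integer inputs; float inputs
        # may need a tolerance instead.)
        raise ValueError("all x values are equal; slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom
    c = (sum_y - m * sum_x) / n
    return m, c


m, c = fit_line([1, 2, 3, 4, 5, 6], [1200, 900, 600, 200, 110, 50])
print(round(m, 2), round(c, 2))  # -243.43 1362.0
```

The call at the end runs the second sample input from the Examples section.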

Analysis of the above code:
Auxiliary Space: O(1)
Time Complexity: O(n). There is one loop that iterates n times, and each iteration performs a constant number of computations.
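The constant auxiliary space follows from the fact that only a handful of running totals are needed, never the points themselves. The sketch below illustrates this with a hypothetical class, `RunningLineFit` (a name assumed for this example), that accepts points one at a time.

```python
class RunningLineFit:
    """Keeps only the running totals the formulas need (O(1) space).

    Hypothetical class for illustration.
    """

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xy = 0.0
        self.sum_x2 = 0.0

    def add(self, x, y):
        # Constant work per point, so n points cost O(n) in total.
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xy += x * y
        self.sum_x2 += x * x

    def solve(self):
        denom = self.n * self.sum_x2 - self.sum_x ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two points with differing x")
        m = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        c = (self.sum_y - m * self.sum_x) / self.n
        return m, c


fit = RunningLineFit()
for px, py in zip([1, 2, 3, 4, 5], [14, 27, 40, 55, 68]):
    fit.add(px, py)
print(fit.solve())
```

Each call to `add` does a fixed amount of work, matching the O(n) total time stated above.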

Reference:
1. Higher Engineering Mathematics by B.S. Grewal.
