Linear Regression is a machine learning algorithm based on supervised learning. It performs a regression task. Regression models a target prediction value based on independent variables. It is mostly used for finding out the relationship between variables and forecasting. Different regression models differ based on – the kind of relationship between dependent and independent variables, they are considering and the number of independent variables being used.
Linear regression performs the task to predict a dependent variable value (y) based on a given independent variable (x). So, this regression technique finds out a linear relationship between x (input) and y(output). Hence, the name is Linear Regression.
In the figure above, X (input) is the work experience and Y (output) is the salary of a person. The regression line is the best fit line for our model.
Hypothesis function for Linear Regression :
While training the model we are given :
x: input training data (univariate – one input variable(parameter))
y: labels to data (supervised learning)
When training the model – it fits the best line to predict the value of y for a given value of x. The model gets the best regression fit line by finding the best θ1 and θ2 values.
θ2: coefficient of x
Once we find the best θ1 and θ2 values, we get the best fit line. So when we are finally using our model for prediction, it will predict the value of y for the input value of x.
How to update θ1 and θ2 values to get the best fit line ?
Cost Function (J):
By achieving the best-fit regression line, the model aims to predict y value such that the error difference between predicted value and true value is minimum. So, it is very important to update the θ1 and θ2 values, to reach the best value that minimize the error between predicted y value (pred) and true y value (y).
Cost function(J) of Linear Regression is the Root Mean Squared Error (RMSE) between predicted y value (pred) and true y value (y).
To update θ1 and θ2 values in order to reduce Cost function (minimizing RMSE value) and achieving the best fit line the model uses Gradient Descent. The idea is to start with random θ1 and θ2 values and then iteratively updating the values, reaching minimum cost.
- ML | Linear Regression vs Logistic Regression
- Linear Regression Using Tensorflow
- Linear Regression using PyTorch
- Simple Linear-Regression using R
- Multiple Linear Regression using R
- ML | Multiple Linear Regression using Python
- Univariate Linear Regression in Python
- ML | Normal Equation in Linear Regression
- ML | Locally weighted Linear Regression
- Polynomial Regression for Non-Linear Data - ML
- ML - Advantages and Disadvantages of Linear Regression
- ML | Rainfall prediction using Linear regression
- Python | Linear Regression using sklearn
- Gradient Descent in Linear Regression
- Linear Regression (Python Implementation)
- Mathematical explanation for Linear Regression working
- A Practical approach to Simple Linear Regression using R
- Pyspark | Linear regression using Apache MLlib
- ML | Multiple Linear Regression (Backward Elimination Technique)
- ML | Boston Housing Kaggle Challenge with Linear Regression
If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to firstname.lastname@example.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.