Prerequisites:

- Linear Regression
- Gradient Descent
Lasso Regression is another linear model derived from Linear Regression, and it shares the same hypothesis function for prediction. The cost function of Linear Regression is represented by J.
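Using the mean-squared-error form assumed throughout this article (some presentations include an extra factor of 1/2), the cost can be written as:

J(w) = \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2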
Here, m is the total number of training examples in the dataset, h(x^{(i)}) represents the hypothesis function for prediction, and y^{(i)} represents the value of the target variable for the i-th training example.
The Linear Regression model treats all features as equally relevant for prediction. When the dataset has many features, some of which are irrelevant to the predictive model, the model becomes unnecessarily complex and predicts poorly on the test set (overfitting). Such a high-variance model does not generalize to new data. Lasso Regression comes to the rescue: it adds an L1 penalty (proportional to the sum of the absolute values of the weights) to the cost function of Linear Regression. The modified cost function for Lasso Regression is given below.
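With the L1 penalty added (here lambda is kept inside the 1/m factor, matching the gradient updates and the implementation sketch further below; other presentations place it outside):

J(w) = \frac{1}{m} \left[ \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \left| w_j \right| \right]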
Here, w_j represents the weight for the j-th feature, n is the number of features in the dataset, and lambda is the regularization strength.
Lasso Regression performs both variable selection and regularization.
During gradient descent optimization, the added L1 penalty shrinks weights toward zero, driving some of them exactly to zero. A weight that is shrunk to zero eliminates its feature from the hypothesis function, so irrelevant features do not participate in the predictive model. This penalization of the weights makes the hypothesis simpler and encourages sparsity (a model with few non-zero parameters).
If an intercept term is included, it is not penalized, and its update remains the same as in Linear Regression. The update rules are written out below.
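Since |w_j| is not differentiable at zero, one common choice (and the one assumed in the implementation sketch further below) is to use the subgradient sign(w_j). With learning rate \alpha, the gradient descent updates then take the form:

\frac{\partial J}{\partial w_j} = \frac{1}{m} \left[ -2 \sum_{i=1}^{m} \left( y^{(i)} - h(x^{(i)}) \right) x_j^{(i)} + \lambda \, \mathrm{sign}(w_j) \right]

w_j := w_j - \alpha \, \frac{\partial J}{\partial w_j}, \qquad b := b - \alpha \cdot \frac{1}{m} \left[ -2 \sum_{i=1}^{m} \left( y^{(i)} - h(x^{(i)}) \right) \right]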
We can control the strength of the regularization through the hyperparameter lambda. Note that the L1 penalty pulls every weight toward zero by the same amount (proportional to lambda), regardless of the weight's size, which is why weights can reach exactly zero.
Different cases for tuning the value of lambda:
- If lambda is set to 0, Lasso Regression reduces to Linear Regression.
- If lambda is set to infinity, all weights are shrunk to zero.

Increasing lambda increases bias, while decreasing lambda increases variance. As lambda increases, more and more weights are shrunk to zero, eliminating the corresponding features from the model, as the sketch below demonstrates.
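As a quick illustration (a minimal sketch using scikit-learn's Lasso on synthetic data where only the first two of five features carry signal; scikit-learn calls the regularization strength alpha rather than lambda):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features actually influence the target
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

# As alpha grows, more coefficients are driven exactly to zero
for alpha in [0.001, 0.1, 1.0, 10.0]:
    model = Lasso(alpha=alpha).fit(X, y)
    print(f"alpha={alpha:>6}: coefficients = {np.round(model.coef_, 3)}")
```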
The dataset used in this implementation can be downloaded from the link.
It has 2 columns, "YearsExperience" and "Salary", for 30 employees in a company. We will train a Lasso Regression model to learn the correlation between the number of years of experience of each employee and their respective salary. Once the model is trained, we will be able to predict the salary of an employee on the basis of his or her years of experience. A sketch of the implementation follows.
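Below is a minimal sketch of such a from-scratch implementation. The file name salary_data.csv and the hyperparameter values (learning_rate, iterations, l1_penalty) are illustrative assumptions, so the numbers it produces may differ slightly from the output shown below.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


class LassoRegression:
    """Lasso Regression trained with batch gradient descent."""

    def __init__(self, learning_rate=0.01, iterations=1000, l1_penalty=500):
        self.learning_rate = learning_rate
        self.iterations = iterations
        self.l1_penalty = l1_penalty

    def fit(self, X, Y):
        self.m, self.n = X.shape       # number of examples, features
        self.W = np.zeros(self.n)      # feature weights
        self.b = 0.0                   # intercept (not penalized)
        for _ in range(self.iterations):
            self._update_weights(X, Y)
        return self

    def _update_weights(self, X, Y):
        Y_pred = self.predict(X)
        # Subgradient of the L1-penalized squared-error cost
        dW = (-2 * X.T.dot(Y - Y_pred)
              + self.l1_penalty * np.sign(self.W)) / self.m
        db = -2 * np.sum(Y - Y_pred) / self.m
        self.W -= self.learning_rate * dW
        self.b -= self.learning_rate * db

    def predict(self, X):
        return X.dot(self.W) + self.b


def main():
    # salary_data.csv is a placeholder name for the dataset described above
    df = pd.read_csv("salary_data.csv")
    X = df.iloc[:, :-1].values
    Y = df.iloc[:, 1].values
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=1 / 3, random_state=0)

    model = LassoRegression(learning_rate=0.01, iterations=1000,
                            l1_penalty=500)
    model.fit(X_train, Y_train)

    Y_pred = model.predict(X_test)
    print("Predicted values", np.round(Y_pred[:3], 2))
    print("Real values     ", Y_test[:3])
    print("Trained W       ", round(model.W[0], 2))
    print("Trained b       ", round(model.b, 2))


if __name__ == "__main__":
    main()
```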
```
Predicted values  [ 40600.91 123294.39  65033.07]
Real values       [ 37731 122391  57081]
Trained W         9396.99
Trained b         26505.43
```
Note: Lasso Regression automates certain parts of model selection and is sometimes called a variable eliminator.