Loss function for Linear regression in Machine Learning

The loss function quantifies the disparity between the predicted value and the actual value. In linear regression, the aim is to fit a linear equation to the observed data, and the loss function evaluates the difference between the predicted values and the true values. By minimizing this difference, the model finds the best-fitting line that captures the relationship between the input features and the target variable.

In this article, we will discuss Mean Squared Error (MSE), Mean Absolute Error (MAE), and Huber Loss.

Mean Squared Error (MSE)

One of the most commonly used loss functions in linear regression is the Mean Squared Error (MSE). It is computed as the average of the squared differences between the actual values and the predicted values:

[Tex]MSE = \frac{1}{n} \sum (y_{pred} - y_{true})^2[/Tex]

where n is the number of samples, y_pred is the predicted value, and y_true is the actual value.

Because the errors are squared, the MSE penalizes large errors more severely than small ones. This makes it sensitive to outliers, which can inflate the MSE considerably. However, the MSE is differentiable everywhere, which is a desirable property for gradient-based optimization.
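To make the differentiability point concrete, here is a minimal sketch of the gradient of the MSE with respect to the predictions. The helper name `mse_grad` is an illustrative assumption and is not part of any library.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return np.mean((y_pred - y_true) ** 2)

def mse_grad(y_true, y_pred):
    """Gradient of the MSE with respect to each prediction (hypothetical helper)."""
    # d(MSE)/d(y_pred_i) = (2/n) * (y_pred_i - y_true_i)
    return 2.0 * (y_pred - y_true) / y_true.size

y_true = np.array([3.0, 6.0, 8.0, 12.0])
y_pred = np.array([4.0, 5.0, 7.0, 10.0])

grad = mse_grad(y_true, y_pred)
print("Gradient:", grad)  # values: 0.5, -0.5, -0.5, -1.0
```

The last prediction has the largest error and therefore the largest gradient magnitude, which is how squaring lets large errors dominate a gradient-based update.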

Computing Mean Squared Error in Python

import numpy as np

def mse(y_true, y_pred):
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    return np.mean((y_true - y_pred) ** 2)

# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Squared Error:",mse(y_true, y_pred))

Output:

Mean Squared Error: 1.75

Computing Mean Squared Error using Sklearn Library

from sklearn.metrics import mean_squared_error

# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Squared Error:", mean_squared_error(y_true, y_pred))

Output:

Mean Squared Error: 1.75

Mean Absolute Error (MAE)

Another commonly used loss function for linear regression is the Mean Absolute Error (MAE). It is computed as the average of the absolute differences between the actual values and the predicted values:

[Tex]MAE = \frac{1}{n} \sum |y_{pred} - y_{true}|[/Tex]

Since the MAE does not square the errors, it is less sensitive to outliers than the MSE. Its penalty grows linearly with the size of the error, so each unit of error contributes equally to the loss. However, the MAE is not differentiable where the error is zero, which can cause difficulties for some gradient-based optimization techniques.
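The difference in outlier sensitivity can be illustrated with a small sketch. The data reuse the example values from this article, and the single badly wrong prediction is made up for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def mae(y_true, y_pred):
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

y_true = [3, 6, 8, 12]
y_pred_clean = [4, 5, 7, 10]    # absolute errors: 1, 1, 1, 2
y_pred_outlier = [4, 5, 7, 2]   # last absolute error becomes 10 (made-up outlier)

mse_clean, mse_outlier = mse(y_true, y_pred_clean), mse(y_true, y_pred_outlier)
mae_clean, mae_outlier = mae(y_true, y_pred_clean), mae(y_true, y_pred_outlier)

print("MSE:", mse_clean, "->", mse_outlier)  # 1.75 -> 25.75
print("MAE:", mae_clean, "->", mae_outlier)  # 1.25 -> 3.25
```

With this one outlier the MSE grows by a factor of roughly 15, while the MAE grows by a factor of 2.6.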

Computing Mean Absolute Error in Python

import numpy as np

def mae(y_true, y_pred):
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    return np.mean(np.abs(y_true - y_pred))

# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Absolute Error:", mae(y_true, y_pred))

Output:

Mean Absolute Error: 1.25

Computing Mean Absolute Error using Sklearn

from sklearn.metrics import mean_absolute_error

# Example usage
y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]
print("Mean Absolute Error:",mean_absolute_error(y_true, y_pred))

Output:

Mean Absolute Error: 1.25

Huber Loss

The Huber Loss combines the MSE and the MAE. It is designed to remain differentiable while being less sensitive to outliers than the MSE:

[Tex]\text{Huber Loss} = \frac{1}{n} \sum L_\delta(y_{pred} - y_{true})[/Tex]

where L_δ is the Huber loss function defined as:

[Tex]L_\delta(x)=\begin{cases} \frac{1}{2}x^2 & \text{if } |x|\leq \delta\\ \delta\left(|x|-\frac{1}{2}\delta\right) & \text{otherwise} \end{cases}[/Tex]

The Huber Loss behaves like the MSE for small errors (|x| <= δ) and like the MAE for large errors (|x| > δ). The parameter δ determines the transition point between the two regimes.
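As a rough illustration of the two regimes, the sketch below evaluates the per-sample Huber loss for small and large residuals with δ = 1 and compares each with the purely quadratic penalty. The function name `huber` is chosen here for illustration.

```python
import numpy as np

def huber(x, delta=1.0):
    """Per-sample Huber loss L_delta(x), following the piecewise definition."""
    x = np.asarray(x, dtype=float)
    quadratic = 0.5 * x ** 2
    linear = delta * (np.abs(x) - 0.5 * delta)
    return np.where(np.abs(x) <= delta, quadratic, linear)

residuals = np.array([0.5, 1.0, 3.0])
huber_values = huber(residuals, delta=1.0)
squared_values = 0.5 * residuals ** 2

print(huber_values)    # values: 0.125, 0.5, 2.5
print(squared_values)  # values: 0.125, 0.5, 4.5
```

At |x| = δ both branches equal 0.5 δ², so the loss is continuous at the transition. Only the residual of 3.0 is treated differently: it costs 2.5 under the Huber loss and 4.5 under the quadratic penalty.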

Computing Huber Loss in Python

import numpy as np

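def huber_loss(y_true, y_pred, delta):
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    abs_residual = np.abs(residual)
    # Quadratic penalty for small residuals, linear penalty beyond delta
    loss = np.where(abs_residual <= delta, 0.5 * residual ** 2, delta * (abs_residual - 0.5 * delta))
    return np.mean(loss)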

# Example usage:
y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
delta = 1.0
print("Huber Loss:", huber_loss(y_true, y_pred, delta))

Output:

Huber Loss: 0.1875

Comparison of Loss Functions for Linear Regression

In this section, we compare different loss functions commonly used in regression tasks: Mean Squared Error (MSE), Mean Absolute Error (MAE), and Huber Loss.

The plot provides a visual comparison of the loss values for the different functions, allowing you to observe their behavior and relative magnitudes.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Sample target and predicted values
y_true = np.array([3, 7, 4, 1, 8, 5])
y_pred = np.array([4, 6, 5, 3, 7, 6])

# Calculate MSE and MAE
mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)

# Huber Loss implementation
def huber_loss(y_true, y_pred, delta=1.0):
    error = np.abs(y_true - y_pred)
    loss = np.where(error <= delta, 0.5 * error**2, delta * error - 0.5 * delta**2)
    return np.mean(loss)

huber_delta1 = huber_loss(y_true, y_pred, delta=1.0)

# Plot the loss functions
losses = [mse, mae, huber_delta1]
labels = ['MSE', 'MAE', 'Huber Loss (delta=1)']

# Providing x-values explicitly for plotting
x = np.arange(len(losses))

plt.figure(figsize=(10, 6))
plt.bar(x, losses, tick_label=labels)
plt.xlabel('Loss Function')
plt.ylabel('Loss Value')
plt.title('Comparison of Loss Functions')
plt.show()

Output:

[Figure: bar chart comparing the MSE, MAE, and Huber Loss (delta=1) values]

In linear regression, the choice of loss function depends on the problem and the properties of the data. When the errors are roughly normally distributed and outliers are not a significant problem, the MSE is often used. When robustness to outliers is crucial, the MAE is the recommended choice, and the Huber Loss offers robustness without sacrificing differentiability.
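This trade-off can be sketched by fitting a line under two of these losses on synthetic data with one outlier. Everything below is an illustrative assumption: the data (y = 2x + 1 with a single corrupted point), the helper `fit_huber_gd`, the learning rate, and the number of steps are not taken from any library.

```python
import numpy as np

# Synthetic data: y = 2x + 1, with one corrupted target (made-up outlier)
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[-1] += 50.0

# Least-squares (MSE) fit; np.polyfit returns [slope, intercept] for deg=1
w_mse, b_mse = np.polyfit(x, y, deg=1)

def fit_huber_gd(x, y, delta=1.0, lr=0.01, steps=5000):
    """Fit y ~ w*x + b by gradient descent on the mean Huber loss (hypothetical helper)."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        residual = y - (w * x + b)
        # The derivative of the Huber loss w.r.t. the residual is the
        # residual clipped to [-delta, delta]
        clipped = np.clip(residual, -delta, delta)
        # Descent step: the loss gradient w.r.t. w is -mean(clipped * x)
        w += lr * np.mean(clipped * x)
        b += lr * np.mean(clipped)
    return w, b

w_huber, b_huber = fit_huber_gd(x, y)

print("MSE slope:  ", round(float(w_mse), 3))
print("Huber slope:", round(float(w_huber), 3))
```

In this sketch the outlier pulls the least-squares slope far from the true value of 2, while the Huber fit stays close to it, because the outlier's contribution to the gradient is capped at δ.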

FAQs on Loss Functions for Linear Regression

Why are loss functions required in linear regression calculations?

Loss functions give a quantifiable measure of a model's performance during training. By adjusting its parameters to reduce the loss, the model improves the accuracy of its predictions.

How can I choose the best loss function for the data I have?

The best loss function depends on the kind of data you have and the problem you are trying to solve. MSE is a common default when outliers are not a concern, MAE is robust to outliers, and Huber Loss combines the advantages of both. Experimentation and an awareness of the trade-offs of each loss function will guide your choice.

How is the Huber Loss different from the MSE in handling outliers?

Beyond a threshold (δ), the Huber Loss grows linearly instead of quadratically. As a result, it is less sensitive to large errors, or outliers, than the MSE, which squares errors and so amplifies their effect on the loss value.

Is it possible to design my own unique loss function?

Yes, you can define custom loss functions to meet specific needs. Custom loss functions can incorporate domain knowledge, handle complex data structures, and reflect particular evaluation criteria. However, designing a meaningful custom loss function often requires a solid grasp of both mathematical optimization and the problem domain.
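As one hypothetical example, a custom loss might penalize under-predictions more heavily than over-predictions. The function `asymmetric_squared_loss` and its `under_weight` parameter are invented for illustration and are not part of NumPy or scikit-learn.

```python
import numpy as np

def asymmetric_squared_loss(y_true, y_pred, under_weight=3.0):
    """Squared error with extra weight on under-predictions (hypothetical loss)."""
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # residual > 0 means the prediction was too low
    weights = np.where(residual > 0, under_weight, 1.0)
    return np.mean(weights * residual ** 2)

y_true = [3, 6, 8, 12]
y_pred = [4, 5, 7, 10]

custom_loss = asymmetric_squared_loss(y_true, y_pred)
plain_loss = asymmetric_squared_loss(y_true, y_pred, under_weight=1.0)

print("Asymmetric loss:", custom_loss)  # 4.75
print("Equal weights:  ", plain_loss)   # 1.75, the same as the MSE
```

Setting `under_weight` to 1.0 recovers the ordinary MSE, which gives a quick sanity check for a custom loss of this kind.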
