In Machine Learning, Regression problems can be solved in the following ways:

1. Using Optimization Algorithms – Gradient Descent

• Other Advanced Optimization Algorithms like ( Conjugate Descent … )

2. Using the Normal Equation :

• Using the concept of Linear Algebra.

Let’s consider the case for Batch Gradient Descent for Univariate Linear Regression Problem.

The cost function for this Regression Problem is :

Goal:

In order to solve this problem, we can either go for a Vectorized approach ( Using the concept of Linear Algebra ) or unvectorized approach (Using for-loop).

1. Unvectorized Approach:

Here in order to solve the below mentioned mathematical expressions, We use for loop.

The above mathematical expression is a part of Cost Function.


The above Mathematical Expression is the hypothesis.

Code: Python Implementation of Unvectorzed Grad
 # Import required modules. from sklearn.datasets import make_regression import matplotlib.pyplot as plt import numpy as np import time    # Create and plot the data set. x, y = make_regression(n_samples = 100, n_features = 1,                        n_informative = 1, noise = 10, random_state = 42)   plt.scatter(x, y, c = 'red') plt.xlabel('Feature') plt.ylabel('Target_Variable') plt.title('Training Data') plt.show()   # Convert y from 1d to 2d array. y = y.reshape(100, 1)    # Number of Iterations for Gradient Descent num_iter = 1000   # Learning Rate alpha = 0.01   # Number of Training samples. m = len(x)    # Initializing Theta. theta = np.zeros((2, 1),dtype = float)    # Variables t0 = t1 = 0Grad0 = Grad1 = 0  # Batch Gradient Descent. start_time = time.time()    for i in range(num_iter):     # To find Gradient 0.     for j in range(m):         Grad0 = Grad0 + (theta[0] + theta[1] * x[j]) - (y[j])           # To find Gradient 1.     for k in range(m):         Grad1 = Grad1 + ((theta[0] + theta[1] * x[k]) - (y[k])) * x[k]     t0 = theta[0] - (alpha * (1/m) * Grad0)     t1 = theta[1] - (alpha * (1/m) * Grad1)     theta[0] = t0     theta[1] = t1     Grad0 = Grad1 = 0       # Print the model parameters.     print('model parameters:',theta,sep = '\n')    # Print Time Take for Gradient Descent to Run. print('Time Taken For Gradient Descent in Sec:',time.time()- start_time)   # Prediction on the same training set. h = [] for i in range(m):     h.append(theta[0] + theta[1] * x[i])        # Plot the output. plt.plot(x,h) plt.scatter(x,y,c = 'red') plt.xlabel('Feature') plt.ylabel('Target_Variable') plt.title('Output')

Output:

model parameters:
[[ 1.15857049]
[44.42210912]]

Time Taken For Gradient Descent in Sec: 2.482538938522339


2. Vectorized Approach:

Here in order to solve the below mentioned mathematical expressions, We use Matrix and Vectors (Linear Algebra).

The above mathematical expression is a part of Cost Function.

The above Mathematical Expression is the hypothesis.


Concept To Find Gradients  Using Matrix Operations:

Code: Python implementation of vectorized Gradient Descent approach
 # Import required modules. from sklearn.datasets import make_regression import matplotlib.pyplot as plt import numpy as np import time    # Create and plot the data set. x, y = make_regression(n_samples = 100, n_features = 1,                        n_informative = 1, noise = 10, random_state = 42)   plt.scatter(x, y, c = 'red') plt.xlabel('Feature') plt.ylabel('Target_Variable') plt.title('Training Data') plt.show()     # Adding x0=1 column to x array. X_New = np.array([np.ones(len(x)), x.flatten()]).T   # Convert y from 1d to 2d array. y = y.reshape(100, 1)    # Number of Iterations for Gradient Descent num_iter = 1000   # Learning Rate alpha = 0.01   # Number of Training samples. m = len(x)    # Initializing Theta. theta = np.zeros((2, 1),dtype = float)    # Batch-Gradient Descent. start_time = time.time()    for i in range(num_iter):     gradients = X_New.T.dot(X_New.dot(theta)- y)     theta = theta - (1/m) * alpha * gradients    # Print the model parameters.     print('model parameters:',theta,sep = '\n')    # Print Time Take for Gradient Descent to Run. print('Time Taken For Gradient Descent in Sec:',time.time() - start_time)   # Hypothesis. h = X_New.dot(theta) # Prediction on training data itself.    # Plot the Output. plt.scatter(x, y, c = 'red') plt.plot(x ,h) plt.xlabel('Feature') plt.ylabel('Target_Variable') plt.title('Output')

Output:

model parameters:
[[ 1.15857049]
[44.42210912]]

Time Taken For Gradient Descent in Sec: 0.019551515579223633

Observations:

1. Implementing a vectorized approach decreases the time taken for execution of Gradient Descent( Efficient Code ).
2. Easy to debug.

Previous
Next