
How to Differentiate a Gradient in PyTorch?

PyTorch is an open-source machine-learning framework based on the Torch library. It was originally developed by Facebook's AI research team.

It is widely used for computer vision and natural language processing applications. PyTorch represents data as tensors, which can be moved to a GPU to accelerate computation.

Differentiation is a core operation in calculus. In this article, we will learn how to calculate the derivative of a function with PyTorch, using worked examples.

In machine learning, the gradient of a function with more than one input variable (e.g., a cost function) is the vector of its partial derivatives with respect to each input.

Example 1:

Calculate the derivative of f(x) = 2x^2 + 5 at x = 5.0.

Method 1: By using the backward function

  1.  First, import the torch library.
  2.  Then create the input tensor with requires_grad=True, so that autograd records the operations performed on it.
  3.  Define the function f = 2x^2 + 5.
  4.  Call f.backward() to run the backward pass, which computes the gradients automatically.
  5.  Read the derivative of the function at the given input from x.grad.

Note: The input must be a floating-point value, because autograd can only track gradients for floating-point (or complex) tensors.

# Import the torch library
import torch
# Assign the value of x as a float tensor.
# Set requires_grad=True so that
# autograd will record the operations
x = torch.tensor(5.0, requires_grad=True)
# Define the equation
f = 2*(x**2) + 5
# Differentiate using torch:
# the backward function computes the gradient value
f.backward()
# Print the derivative value,
# i.e. df/dx = 4x = 4 * 5.0 = 20
print(x.grad)




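Explanation: The derivative of f(x) = 2x^2 + 5 is df/dx = 4x, so at x = 5.0 the value stored in x.grad is 4 * 5.0 = 20.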
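The note above about float inputs can be checked directly. The following is a minimal sketch, assuming a recent PyTorch release; the names x_float and int_error are illustrative and not part of the original example. It shows that a float tensor accepts requires_grad=True, while an integer tensor is rejected with a RuntimeError.

```python
import torch

# A floating-point tensor can track gradients.
x_float = torch.tensor(5.0, requires_grad=True)

# An integer tensor cannot: PyTorch raises a RuntimeError at creation time.
try:
    x_int = torch.tensor(5, requires_grad=True)
    int_error = None
except RuntimeError as err:
    int_error = str(err)

print(x_float.requires_grad)
print(int_error)
```

If the data starts out as integers, converting it with .float() before enabling gradient tracking is one way around this.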


Method 2: By using torch.autograd

We can also use torch.autograd directly. It is PyTorch's automatic differentiation engine, which powers neural network training.

It performs backpropagation starting from a given variable, which often holds the value of the cost function. To compute the gradient of a tensor with respect to some parameter, you can use the torch.autograd.grad function. It takes the output tensor to differentiate and the input tensor (or tensors) with respect to which the gradient is computed, and it returns a tuple containing one gradient per input.

# Import the necessary modules
import torch
from torch.autograd import grad
# Define the function f
def f(x):
    return 2 *(x ** 2) + 5
# Create a tensor x and set
# requires_grad=True to track the gradient
x = torch.tensor([5.0], requires_grad=True)
# Compute the gradient of f with respect to x
grad_f = grad(f(x), x)[0]
# Print the gradient
print(grad_f)

Output:

tensor([20.])
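torch.autograd.grad can also be applied to a gradient itself, which gives a second derivative. The sketch below is an illustration rather than part of the original example: it assumes the same function 2x^2 + 5 and passes create_graph=True so that the first derivative stays differentiable. The names dy_dx and d2y_dx2 are made up for this sketch.

```python
import torch
from torch.autograd import grad

x = torch.tensor([5.0], requires_grad=True)
y = 2 * (x ** 2) + 5

# First derivative dy/dx = 4x. create_graph=True records the operations
# used to compute it, so the result can be differentiated again.
(dy_dx,) = grad(y, x, create_graph=True)

# Second derivative d2y/dx2 = 4 (a constant for this function).
(d2y_dx2,) = grad(dy_dx, x)

print(dy_dx.item())    # 4 * 5.0 = 20.0
print(d2y_dx2.item())  # 4.0
```

Without create_graph=True, the first gradient is returned detached from the graph and the second call to grad would fail.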


Example 2: 

Calculate the derivative of f(x) = 2x^3 + 5x^2 + 7x + 10 for x = [[1, 2, 3], [4, 5, 6]].

Step 1: Import the torch library

# import the library
import torch


Step 2: Assign the input value x as a tensor, and make sure that requires_grad=True.

x = torch.tensor([[1.0,2.0,3.0],[4.0,5.0,6.0]], requires_grad=True)
print(x)



tensor([[1., 2., 3.],
        [4., 5., 6.]], requires_grad=True)

Step 3: Define the function for the given mathematical equation, f(x) = 2x^3 + 5x^2 + 7x + 10.

def f(x):
    return 2*(x**3) + 5*(x**2) + 7*x + 10
print(f(x))



tensor([[ 24.,  60., 130.],
        [246., 420., 664.]], grad_fn=<AddBackward0>)

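Step 4: Use torch.autograd.functional.jacobian() to find the derivative. Here, however, the output has size torch.Size([2, 3, 2, 3]), because the Jacobian holds the derivative of every output element with respect to every input element.

jac = torch.autograd.functional.jacobian(f, x)
print(jac.shape, jac, sep="\n\n")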



torch.Size([2, 3, 2, 3])

tensor([[[[ 23.,   0.,   0.],
          [  0.,   0.,   0.]],

         [[  0.,  51.,   0.],
          [  0.,   0.,   0.]],

         [[  0.,   0.,  91.],
          [  0.,   0.,   0.]]],

        [[[  0.,   0.,   0.],
          [143.,   0.,   0.]],

         [[  0.,   0.,   0.],
          [  0., 207.,   0.]],

         [[  0.,   0.,   0.],
          [  0.,   0., 283.]]]])

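Here the output tensor has shape [2, 3, 2, 3]. Because f is applied element-wise, the only non-zero entries are the derivatives of each output element with respect to its own input element. There are alternative methods that return these derivatives directly, in the same shape as the input.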
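To make the structure of that [2, 3, 2, 3] tensor concrete, here is a small sketch that pulls the non-zero entries out of the Jacobian. It assumes the same f and x as in the steps above; flattening to a square matrix and taking the diagonal is one possible approach, and the name elementwise is made up for this illustration.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    return 2 * (x ** 3) + 5 * (x ** 2) + 7 * x + 10

x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Full Jacobian: one entry per (output element, input element) pair.
jac = jacobian(f, x)

# Flatten to an (outputs x inputs) matrix. For an element-wise function,
# the diagonal holds the derivative of each output w.r.t. its own input.
n = x.numel()
elementwise = jac.reshape(n, n).diagonal().reshape(x.shape)

print(jac.shape)
print(elementwise)
```

This works, but it computes many zeros that are then thrown away, which is why the scalar-based methods are usually preferable for element-wise functions.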

Alternate Method 1: By using torch.autograd.grad()

Steps 1 to 3 are the same as above.

Step 4: Let's create a variable z holding the sum of f(x), because torch.autograd.grad() by default requires a scalar output to differentiate.

z = f(x).sum()
print(z)

tensor(1544., grad_fn=<SumBackward0>)

Step 5: Use torch.autograd.grad(z, x) to get the derivative value of f(x) with respect to x.

grad_f = torch.autograd.grad(z, x)
print(grad_f)



(tensor([[ 23.,  51.,  91.],
        [143., 207., 283.]]),)


We can verify this against the analytical derivative, f'(x) = 6x^2 + 10x + 7:

df = 6*x**2 + 10*x + 7
print(df)



tensor([[ 23.,  51.,  91.],
        [143., 207., 283.]], grad_fn=<AddBackward0>)

As we can see above, the values of grad_f and df are the same.
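Summing to a scalar is not the only option. As a hedged alternative sketch, torch.autograd.grad also accepts a non-scalar output when a grad_outputs tensor of the same shape is supplied; passing a tensor of ones gives the same result as differentiating the sum. The same f and x as above are assumed here.

```python
import torch

def f(x):
    return 2 * (x ** 3) + 5 * (x ** 2) + 7 * x + 10

x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], requires_grad=True)
y = f(x)  # non-scalar output, shape [2, 3]

# grad_outputs weights each output element; ones reproduce the sum() trick.
(grad_f,) = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y))

print(grad_f)
```

Both forms rely on f being element-wise, so that each input element only influences its own output element.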

Alternate Method 2: By using the backward() function

All steps are the same as above, except that we call z.backward() in place of torch.autograd.grad(z, x) and then read the result from the x.grad attribute.

# import the library
import torch
# Assign the input variable
x = torch.tensor([[1.0,2.0,3.0],[4.0,5.0,6.0]], requires_grad=True)
# define the function
def f(x):
    return 2*(x**3) + 5*(x**2) + 7*x + 10
# Assign the sum to another variable z
z = f(x).sum()
# Compute the gradient
z.backward()
# Find the gradient value
print(x.grad)



tensor([[ 23.,  51.,  91.],
        [143., 207., 283.]])
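One thing to keep in mind when reading x.grad is that backward() adds to the stored gradient instead of overwriting it. The sketch below uses a smaller, made-up function (the sum of x^2) to keep the numbers easy to follow; the names first, second and third are only for illustration.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: x.grad holds 2x.
(x ** 2).sum().backward()
first = x.grad.clone()

# Second backward pass without resetting: gradients accumulate to 4x.
(x ** 2).sum().backward()
second = x.grad.clone()

# Reset the stored gradient, then run again: back to 2x.
x.grad.zero_()
(x ** 2).sum().backward()
third = x.grad.clone()

print(first, second, third)
```

torch.autograd.grad does not have this side effect, since it returns the gradients instead of storing them on the tensor.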

Example 3:  Plot the gradient of sin(x)

Let's plot the graph of sin(x) together with its derivative, cos(x).

# import the necessary libraries
import torch
import matplotlib.pyplot as plt
# Assign the input variable
x = torch.linspace(-5, 5, steps=200, requires_grad=True)
# define the function
def f(x):
    return torch.sin(x)
# create the scalar elements
z = f(x).sum()
# find gradient
z.backward()
# Plot the figure
# original f(x) = sin(x) function
plt.plot(x.detach(), f(x).detach(), label="f(x) = sin(x)")
# derivative of the f(x) = sin(x) function w.r.t x
plt.plot(x.detach(), x.grad, label="f'(x) = cos(x)")
plt.ylabel("f(x)  or  f'(x)")
# show the labels and display the figure
plt.legend()
plt.show()



Sin(x) and derivative of Sin(x)
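Besides inspecting the plot, the result can be checked numerically. This is a small sketch, assuming the same grid of 200 points, that compares the autograd gradient of sin(x) with torch.cos evaluated at the same points; max_error is a name introduced only for this check.

```python
import torch

x = torch.linspace(-5, 5, steps=200, requires_grad=True)

# Gradient of sum(sin(x)) with respect to each x_i is cos(x_i).
torch.sin(x).sum().backward()

# Largest absolute difference between autograd and the analytical derivative.
max_error = (x.grad - torch.cos(x.detach())).abs().max().item()
print(max_error)
```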

Partial Derivative 

The partial derivative of a function of several variables is its derivative with respect to one of those variables, with the others held constant.
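As a quick scalar illustration of this definition, consider f(x, y) = 2x^2 + 5y^2, whose partial derivatives are 4x and 10y. The sketch below evaluates them at the arbitrarily chosen point x = 3.0, y = 4.0 (values picked only for this illustration) using torch.autograd.grad with a tuple of inputs.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=True)

z = 2 * x ** 2 + 5 * y ** 2

# One gradient is returned per input, each computed with the other held fixed.
dz_dx, dz_dy = torch.autograd.grad(z, (x, y))

print(dz_dx.item())  # 4 * 3.0 = 12.0
print(dz_dy.item())  # 10 * 4.0 = 40.0
```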

Example 1:

Calculate the ∂f/∂x and ∂f/∂y values of f(x, y) = 2x^2 + 5y^2 for x = [[1, 2, 3], [4, 5, 6]] and y = [[5, 6, 7], [8, 9, 10]].

Let’s use jacobian() from torch.autograd.functional 

# import the necessary libraries
import torch
from torch.autograd.functional import jacobian
# Assign the input variables
x = torch.tensor([[1.0,2.0,3.0],[4.0,5.0,6.0]], requires_grad=True)
y = torch.tensor([[5.0,6.0,7.0],[8.0,9.0,10.0]], requires_grad=True)
# define the function
def f(x,y):
    return 2*(x**2) + 5*(y**2)
# Find the Jacobian with respect to each input
jac = jacobian(func=f, inputs=(x,y))
print('df(x,y)/dx = ',jac[0])
print('\ndf(x,y)/dy =',jac[1])



df(x,y)/dx =  tensor([[[[ 4.,  0.,  0.],
          [ 0.,  0.,  0.]],

         [[ 0.,  8.,  0.],
          [ 0.,  0.,  0.]],

         [[ 0.,  0., 12.],
          [ 0.,  0.,  0.]]],

        [[[ 0.,  0.,  0.],
          [16.,  0.,  0.]],

         [[ 0.,  0.,  0.],
          [ 0., 20.,  0.]],

         [[ 0.,  0.,  0.],
          [ 0.,  0., 24.]]]])

df(x,y)/dy = tensor([[[[ 50.,   0.,   0.],
          [  0.,   0.,   0.]],

         [[  0.,  60.,   0.],
          [  0.,   0.,   0.]],

         [[  0.,   0.,  70.],
          [  0.,   0.,   0.]]],

        [[[  0.,   0.,   0.],
          [ 80.,   0.,   0.]],

         [[  0.,   0.,   0.],
          [  0.,  90.,   0.]],

         [[  0.,   0.,   0.],
          [  0.,   0., 100.]]]])

Alternate method: By using grad() from torch.autograd

# import the library
import torch
from torch.autograd import grad
# Assign the input variables
x = torch.tensor([[1.0,2.0,3.0],[4.0,5.0,6.0]], requires_grad=True)
y = torch.tensor([[5.0,6.0,7.0],[8.0,9.0,10.0]], requires_grad=True)
# define the function
def f(x,y):
    return 2*(x**2) + 5*(y**2)
# Assign the sum to another variable z
z = f(x,y).sum()
# Compute the gradient
grad_f = grad(z, inputs=(x,y))
# Find the gradient value
print('df(x,y)/dx = ',grad_f[0])
print('\ndf(x,y)/dy = ',grad_f[1])



df(x,y)/dx =  tensor([[ 4.,  8., 12.],
        [16., 20., 24.]])

df(x,y)/dy =  tensor([[ 50.,  60.,  70.],
        [ 80.,  90., 100.]])

Example 2: Plot the graph

Find the partial derivatives of f(x, y) = x^3 + 5x + y^2 and plot the graph.



# import the necessary libraries
import torch
import matplotlib.pyplot as plt
# Assign the input variables
x = torch.linspace(-5, 5, steps=200, requires_grad=True)
y = torch.linspace(-5, 5, steps=200, requires_grad=True)
# define the function
def f(x,y):
    return (x**3) + 5*x + (y**2)
# create the scalar elements
z = f(x,y).sum()
# find gradient
z.backward()
# Plot the figure
# (a two-panel layout is assumed here, one panel per variable)
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
# original f(x,y) = x^3 +5x + y^2 function
plt.plot(x.detach(), f(x,y).detach(), label="f(x,y) = x^3 +5x + y^2")
# derivative of the f(x,y) = x^3 +5x + y^2 function w.r.t x
plt.plot(x.detach(), x.grad, label="d(f(x,y))/dx = 3x^2 + 5")
plt.ylabel('f(x,y) & d(f(x,y))/dx')
plt.legend()
plt.subplot(1, 2, 2)
# original f(x,y) = x^3 +5x + y^2 function
plt.plot(y.detach(), f(x,y).detach(), label="f(x,y) = x^3 +5x + y^2")
# derivative of the f(x,y) = x^3 +5x + y^2 function w.r.t y
plt.plot(y.detach(), y.grad,  label="d(f(x,y))/dy = 2y")
plt.ylabel('f(x,y) & d(f(x,y))/dy')
plt.legend()
plt.show()



Derivative of  f(x,y) = x^3 +5x + y^2
