
How to Compute the Hessian in PyTorch

Last Updated : 28 Feb, 2022

A Hessian matrix, or simply Hessian, is a square matrix of the second-order partial derivatives of a function. The function must be scalar-valued, that is, it takes one or more values and returns a single value. For example, f(x, y) = xy^3 - 7x is a scalar-valued function: it takes two values, x and y, but returns a single value (the computed value of xy^3 - 7x).
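As a worked illustration, the Hessian of the example function can be written out by hand. The linear term in x vanishes after differentiating twice, so only the xy^3 term contributes. This is a hand derivation added for clarity, not output from PyTorch:

```latex
\[
H_f(x, y) =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[6pt]
\dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2}
\end{bmatrix}
=
\begin{bmatrix}
0 & 3y^2 \\
3y^2 & 6xy
\end{bmatrix}
\]
```

The matrix is symmetric because the mixed partial derivatives of this polynomial are equal.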

Computing Hessian in PyTorch

To compute the Hessian of a scalar-valued function in PyTorch, use the hessian() function from torch.autograd.functional.

hessian() function:

Syntax: torch.autograd.functional.hessian(func, inputs, create_graph=False, strict=False, vectorize=False)

Parameters:

  • func: a Python function. It takes tensor inputs and returns a tensor with a single element.
  • inputs: the inputs to the function func. It is a tensor or a tuple of tensors.
  • create_graph: an optional boolean. Default is False. If True, the Hessian is computed in a differentiable manner, so a graph of the derivatives is created.
  • strict: if True, an error is raised when an input is detected such that all the outputs are independent of it. If False, a tensor of zeros is returned as the Hessian for such inputs.
  • vectorize: this feature is still experimental. If True, the function uses the vmap prototype feature to compute the gradients, invoking autograd.grad only once instead of once per row of the Hessian.

Return: the Hessian for the inputs. If the input is a single tensor, the result is a single tensor; if the input is a tuple of tensors, the result is a tuple of tuples of tensors.
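To make the "once per row" remark concrete, the following is a minimal sketch of how a Hessian can be assembled by hand with torch.autograd.grad, one row at a time, and then compared with hessian(). The helper name hessian_by_rows is hypothetical and exists only for this illustration; it is not how the library is necessarily implemented internally.

```python
import torch
from torch.autograd.functional import hessian


def func(x):
    # scalar-valued function: sum of 2x^3 - x^2 over all elements
    return (2 * x.pow(3) - x.pow(2)).sum()


def hessian_by_rows(f, x):
    # hypothetical helper for a 1-D input: differentiate the gradient row by row
    x = x.detach().clone().requires_grad_(True)
    # first derivative, keeping the graph so it can be differentiated again
    (grad,) = torch.autograd.grad(f(x), x, create_graph=True)
    rows = []
    for g in grad.reshape(-1):
        # one autograd.grad call per row of the Hessian
        (row,) = torch.autograd.grad(g, x, retain_graph=True)
        rows.append(row.reshape(-1))
    return torch.stack(rows)


x = torch.tensor([2., 3., 4.])
manual = hessian_by_rows(func, x)
builtin = hessian(func, x)
print(torch.allclose(manual, builtin))
```

The manual loop calls autograd.grad once for every element of the gradient, which is the per-row cost that vectorize=True is meant to avoid.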

Example 1: 

In this example, we use a scalar-valued function of a single variable (a univariate function). We compute the Hessian of this function for an input tensor with a single element and for input tensors with multiple elements, to see how the Hessian looks in each case. Because the input is a single tensor, the Hessian is also a single tensor. When the input tensor has a single element, the Hessian is a tensor of size [1, 1]. When the input tensor has three elements, the Hessian is a tensor of size [3, 3]. In the same way, the Hessian for an input tensor of size [2, 2] is a tensor of size [2, 2, 2, 2].

Python3

# Python program to compute Hessian in PyTorch
# importing libraries
import torch
from torch.autograd.functional import hessian
  
# defining a function
def func(x):
    return (2*x.pow(3) - x.pow(2)).sum()
  
# defining the input tensor
input = torch.tensor([3.])
print("Input:\n", input)
  
# computing the hessian
output = hessian(func, input)
  
# printing the above computed tensor
print("Hessian:\n", output)
  
# .....New input
input = torch.tensor([2., 3., 4.])
print("Input:\n", input)
  
# computing the hessian
output = hessian(func, input)
  
# printing the above computed tensor
print("Hessian:\n", output)
  
# .....New input
input = torch.tensor([[2., 3.], [4., 7]])
print("Input:\n", input)
  
# computing the hessian
output = hessian(func, input)
  
# printing the above computed tensor
print("Hessian:\n", output)

                    

Output:

Input:
 tensor([3.])
Hessian:
 tensor([[34.]])
Input:
 tensor([2., 3., 4.])
Hessian:
 tensor([[22.,  0.,  0.],
        [ 0., 34.,  0.],
        [ 0.,  0., 46.]])
Input:
 tensor([[2., 3.],
        [4., 7.]])
Hessian:
 tensor([[[[22.,  0.],
          [ 0.,  0.]],

         [[ 0., 34.],
          [ 0.,  0.]]],


        [[[ 0.,  0.],
          [46.,  0.]],

         [[ 0.,  0.],
          [ 0., 82.]]]])
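The four-dimensional result can be hard to read. As a small sketch, the Hessian for the [2, 2] input can be reshaped into an ordinary square matrix with one row and one column per input element. The variable names H_flat and n are chosen here only for illustration.

```python
import torch
from torch.autograd.functional import hessian


def func(x):
    # same function as in Example 1
    return (2 * x.pow(3) - x.pow(2)).sum()


x = torch.tensor([[2., 3.], [4., 7.]])
H = hessian(func, x)        # size [2, 2, 2, 2]: input shape followed by input shape

n = x.numel()               # number of input elements (4)
H_flat = H.reshape(n, n)    # one row and one column per flattened input element

print(H.shape)
print(H_flat)
```

Since the function acts on each element independently, the flattened matrix is diagonal, with the same second derivatives (12x - 2) that appear in the output above.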

Example 2:

In the example below, we define a scalar-valued function of two variables (a bivariate function) and pass a tuple of two tensors as the input. Notice that the output (the Hessian) is a tuple of tuples of tensors, and each inner tuple has two elements (tensors). Here Hessian[i][j] contains the second-order derivatives of the function with respect to the ith input and the jth input.

Python3

# Python program to compute Hessian in PyTorch
# importing libraries
import torch
from torch.autograd.functional import hessian
  
# defining a function
def func(x, y):
    return (2*x*y.pow(2) + x.pow(3) - 10).sum()
  
# defining the inputs
input_x = torch.tensor([2.])
input_y = torch.tensor([-3.])
inputs = (input_x, input_y)
print("inputs:\n", inputs)
  
# compute the hessian
output = hessian(func, inputs)
  
# printing the above computed hessian
print("Hessian:\n", output)

                    

Output:

inputs:
 (tensor([2.]), tensor([-3.]))
Hessian:
 ((tensor([[12.]]), tensor([[-12.]])), 
 (tensor([[-12.]]), tensor([[8.]])))
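When a single matrix is more convenient than a tuple of tuples, the blocks can be concatenated. The sketch below assumes every input is a 1-D tensor, so that each block Hessian[i][j] is a 2-D matrix; the names blocks and full are illustrative only.

```python
import torch
from torch.autograd.functional import hessian


def func(x, y):
    # same function as in Example 2
    return (2 * x * y.pow(2) + x.pow(3) - 10).sum()


inputs = (torch.tensor([2.]), torch.tensor([-3.]))
blocks = hessian(func, inputs)   # tuple of tuples of tensors

# join each row of blocks side by side, then stack the rows vertically
full = torch.cat([torch.cat(list(row), dim=1) for row in blocks], dim=0)

print(full)
# the mixed partial derivatives match, so the matrix is symmetric
print(torch.allclose(full, full.T))
```

The off-diagonal entries come from Hessian[0][1] and Hessian[1][0], which agree because the order of differentiation does not matter for this function.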

Example 3:

In the example below, we define a scalar-valued function of three variables and pass a tuple of three tensors as the input. Notice that the output (the Hessian) is again a tuple of tuples of tensors, this time with three inner tuples of three tensors each. Here Hessian[i][j] contains the second-order derivatives of the function with respect to the ith input and the jth input.

Python3

# Python program to compute Hessian in PyTorch
# importing libraries
import torch
from torch.autograd.functional import hessian
  
# defining a function
def func(x, y, z):
    return (2*x.pow(2)*y + x*z.pow(3) - 10).sum()
  
# defining the inputs
input_x = torch.tensor([1.])
input_y = torch.tensor([2.])
input_z = torch.tensor([3.])
  
# grouping the inputs into a tuple
inputs = (input_x, input_y, input_z)
  
# compute the hessian
output = hessian(func, inputs)
  
# printing the above computed hessian
print("Hessian Tensor:\n", output)

                    

Output:

Hessian Tensor:
 ((tensor([[8.]]), tensor([[4.]]), tensor([[27.]])), 
 (tensor([[4.]]), tensor([[0.]]), tensor([[0.]])), 
 (tensor([[27.]]), tensor([[0.]]), tensor([[18.]])))
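The output can be checked against a hand derivation. For f(x, y, z) = 2x^2 y + x z^3 - 10, differentiating twice and evaluating at (x, y, z) = (1, 2, 3) gives the following; this is a worked check added for illustration:

```latex
\[
H_f(x, y, z) =
\begin{bmatrix}
4y & 4x & 3z^2 \\
4x & 0 & 0 \\
3z^2 & 0 & 6xz
\end{bmatrix},
\qquad
H_f(1, 2, 3) =
\begin{bmatrix}
8 & 4 & 27 \\
4 & 0 & 0 \\
27 & 0 & 18
\end{bmatrix}
\]
```

Each entry matches the corresponding tensor in the tuple of tuples printed above.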

