How to Compute the Hessian in PyTorch

Last Updated : 28 Feb, 2022

A Hessian matrix, or Hessian, is a square matrix of the second-order partial derivatives of a function. The function must be scalar-valued, that is, it takes one or more values and returns a single value. For example, $f(x,y) = xy^3-7x$ is a scalar-valued function: it takes two values, x and y, but returns a single value (the computed value of $xy^3-7x$).
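As a worked illustration of the definition (derived by hand here, not taken from the PyTorch documentation), differentiating the example function $f(x,y) = xy^3-7x$ twice gives its Hessian. The term that is linear in x vanishes after the second derivative, so its coefficient does not affect the result.

$$\frac{\partial f}{\partial x} = y^3 - 7, \qquad \frac{\partial f}{\partial y} = 3xy^2$$

$$H_f(x,y) = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\ \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} 0 & 3y^2 \\ 3y^2 & 6xy \end{pmatrix}$$

The matrix is symmetric because the two mixed partial derivatives are equal for this smooth function.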

Computing Hessian in PyTorch

To compute the Hessian of a scalar-valued function in PyTorch, use the torch.autograd.functional.hessian() function:

Syntax: torch.autograd.functional.hessian(func, inputs, create_graph=False, strict=False, vectorize=False)

Parameters:

func: a Python function. It takes tensor inputs and returns a tensor with a single element.
inputs: input to the function func. It is a tensor or tuple of tensors.
create_graph: an optional boolean. Default is False. If True, the Hessian is computed in a differentiable manner, so a graph of the derivatives is created.
strict: an optional boolean. Default is False. If True, an error is raised when it is detected that there exists an input such that all the outputs are independent of it. If False, a tensor of zeros is returned as the Hessian for such inputs.
vectorize: an optional boolean. Default is False. This feature is still experimental. If True, the function uses the vmap prototype feature to compute the gradients, so it invokes autograd.grad only once instead of once per row of the Hessian.

Return: the Hessian for the input. It is a tensor if the input is a single tensor, and a tuple of tuples of tensors if the input is a tuple of tensors.
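The effect of the strict and create_graph parameters described above can be sketched as follows. This is a minimal illustration, not part of the original examples: the functions cube_sum and only_x and the input values are hypothetical names and numbers chosen for the demonstration.

```python
import torch
from torch.autograd.functional import hessian

# Hypothetical helper functions used only for this illustration.
def cube_sum(x):
    return x.pow(3).sum()

def only_x(x, y):
    # y is deliberately unused, so the output is independent of it
    return cube_sum(x)

x = torch.tensor([1., 2.], requires_grad=True)
y = torch.tensor([5.])

# strict=False (the default): the blocks involving y are filled with zeros
blocks = hessian(only_x, (x, y))
y_block_is_zero = bool((blocks[1][1] == 0).all())

# strict=True: the same call raises an error instead
try:
    hessian(only_x, (x, y), strict=True)
    strict_raised = False
except RuntimeError:
    strict_raised = True

# create_graph=True keeps the result attached to the autograd graph
H_plain = hessian(cube_sum, x)
H_graph = hessian(cube_sum, x, create_graph=True)

print("y block is zero:", y_block_is_zero)
print("strict mode raised:", strict_raised)
print("requires_grad:", H_plain.requires_grad, H_graph.requires_grad)
```

A Hessian computed with create_graph=True can itself be differentiated, which is useful when a loss depends on second derivatives, at the cost of keeping the graph in memory.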

Example 1:

In this example, we use a scalar-valued function of a single variable (a univariate function), applied elementwise and summed so that the result is a single value. We compute the Hessian of this function for three inputs: a tensor with a single element, a tensor with three elements, and a tensor of size [2, 2]. Because the input is a tensor, the Hessian is also a tensor, and its shape is the input shape repeated twice. When the input tensor has a single element, the Hessian is a tensor with a single element (size [1, 1]). When the input tensor has three elements, the Hessian is a tensor of size [3, 3]. In the same way, the Hessian for the input tensor of size [2, 2] is a tensor of size [2, 2, 2, 2].

Python3

# Python program to compute Hessian in PyTorch 
# importing libraries 
import torch 
from torch.autograd.functional import hessian 
  
# defining a function 
def func(x): 
    return (2*x.pow(3) - x.pow(2)).sum() 
  
# defining the input tensor 
input = torch.tensor([3.]) 
print("Input:\n", input) 
  
# computing the hessian 
output = hessian(func, input) 
  
# printing the above computed tensor 
print("Hessian:\n", output) 
  
# .....New input 
input = torch.tensor([2., 3., 4.]) 
print("Input:\n", input) 
  
# computing the hessian 
output = hessian(func, input) 
  
# printing the above computed tensor 
print("Hessian:\n", output) 
  
# .....New input 
input = torch.tensor([[2., 3.], [4., 7.]]) 
print("Input:\n", input) 
  
# computing the hessian 
output = hessian(func, input) 
  
# printing the above computed tensor 
print("Hessian:\n", output) 

Output:

Input:
 tensor([3.])
Hessian:
 tensor([[34.]])
Input:
 tensor([2., 3., 4.])
Hessian:
 tensor([[22.,  0.,  0.],
        [ 0., 34.,  0.],
        [ 0.,  0., 46.]])
Input:
 tensor([[2., 3.],
        [4., 7.]])
Hessian:
 tensor([[[[22.,  0.],
          [ 0.,  0.]],

         [[ 0., 34.],
          [ 0.,  0.]]],


        [[[ 0.,  0.],
          [46.,  0.]],

         [[ 0.,  0.],
          [ 0., 82.]]]])
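The shape rule from Example 1 can be checked in code. The sketch below assumes the same function as above and compares the 4-D Hessian, flattened to a 4 x 4 matrix, against the hand-derived second derivative 12x - 2 of 2x^3 - x^2. The variable names are illustrative.

```python
import torch
from torch.autograd.functional import hessian

def func(x):
    return (2*x.pow(3) - x.pow(2)).sum()

x = torch.tensor([[2., 3.], [4., 7.]])
H = hessian(func, x)

# The Hessian shape is the input shape repeated twice
shape_ok = tuple(H.shape) == tuple(x.shape) * 2

# Flatten to (numel, numel); each element only interacts with itself,
# so the matrix is diagonal with entries 12*x - 2
H_flat = H.reshape(x.numel(), x.numel())
expected = torch.diag((12*x - 2).flatten())
values_ok = torch.allclose(H_flat, expected)

print("shape:", tuple(H.shape), "matches rule:", shape_ok)
print("matches 12x - 2 on the diagonal:", values_ok)
```

Flattening this way is a convenient route to a conventional 2-D matrix when the Hessian has to be passed to a linear-algebra routine.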

Example 2:

In this example, we define a scalar-valued function of two variables (a bivariate function) and pass a tuple of two tensors as the input. Because the input is a tuple, the output (the Hessian) is a tuple of tuples of tensors, and each inner tuple has two elements. Here Hessian[i][j] contains the second-order derivatives with respect to the ith input and the jth input.

Python3

# Python program to compute Hessian in PyTorch 
# importing libraries 
import torch 
from torch.autograd.functional import hessian 
  
# defining a function 
def func(x, y): 
    return (2*x*y.pow(2) + x.pow(3) - 10).sum() 
  
# defining the inputs 
input_x = torch.tensor([2.]) 
input_y = torch.tensor([-3.]) 
inputs = (input_x, input_y) 
print("inputs:\n", inputs) 
  
# compute the hessian 
output = hessian(func, inputs) 
  
# printing the above computed hessian 
print("Hessian:\n", output) 

Output:

inputs:
 (tensor([2.]), tensor([-3.]))
Hessian:
 ((tensor([[12.]]), tensor([[-12.]])), 
 (tensor([[-12.]]), tensor([[8.]])))
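The tuple-of-tuples result from Example 2 can be assembled into a single matrix when that form is more convenient. The sketch below is one way to do it with torch.cat. The comparison matrix is derived by hand from the function (second derivatives 6x, 4y and 4x), and the variable names are illustrative.

```python
import torch
from torch.autograd.functional import hessian

def func(x, y):
    return (2*x*y.pow(2) + x.pow(3) - 10).sum()

x = torch.tensor([2.])
y = torch.tensor([-3.])
blocks = hessian(func, (x, y))

# Join each row of blocks side by side, then stack the rows
full = torch.cat([torch.cat(row, dim=1) for row in blocks], dim=0)

# Hand-derived Hessian: [[6x, 4y], [4y, 4x]] evaluated at x=2, y=-3
expected = torch.tensor([[6*2., 4*-3.],
                         [4*-3., 4*2.]])

is_symmetric = torch.equal(full, full.T)
matches = torch.allclose(full, expected)

print(full)
print("symmetric:", is_symmetric, "matches hand derivation:", matches)
```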

Example 3:

In this example, we define a scalar-valued function of three variables and pass a tuple of three tensors as the input. Because the input is a tuple of three tensors, the output (the Hessian) is a tuple of three tuples, each holding three tensors. Here Hessian[i][j] contains the second-order derivatives with respect to the ith input and the jth input.

Python3

# Python program to compute Hessian in PyTorch 
# importing libraries 
import torch 
from torch.autograd.functional import hessian 
  
# defining a function 
def func(x, y, z): 
    return (2*x.pow(2)*y + x*z.pow(3) - 10).sum() 
  
# defining the inputs 
input_x = torch.tensor([1.]) 
input_y = torch.tensor([2.]) 
input_z = torch.tensor([3.]) 
  
inputs = (input_x, input_y, input_z) 
  
# compute the hessian 
output = hessian(func, inputs) 
  
# printing the above computed hessian 
print("Hessian Tensor:\n", output) 

Output:

Hessian Tensor:
 ((tensor([[8.]]), tensor([[4.]]), tensor([[27.]])), 
 (tensor([[4.]]), tensor([[0.]]), tensor([[0.]])), 
 (tensor([[27.]]), tensor([[0.]]), tensor([[18.]])))
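As a cross-check of Example 3, the same values can be obtained by applying torch.autograd.grad twice. This is a different technique from hessian(), shown only as a hedged sketch for verification. The loop structure and variable names are illustrative assumptions, not part of the original example.

```python
import torch
from torch.autograd.functional import hessian

def func(x, y, z):
    return (2*x.pow(2)*y + x*z.pow(3) - 10).sum()

values = (1., 2., 3.)
inputs = tuple(torch.tensor([v], requires_grad=True) for v in values)

# First derivatives, keeping the graph so they can be differentiated again
grads = torch.autograd.grad(func(*inputs), inputs, create_graph=True)

rows = []
for g in grads:
    # Differentiate one first derivative with respect to every input;
    # inputs it does not depend on come back as None
    second = torch.autograd.grad(g.sum(), inputs,
                                 retain_graph=True, allow_unused=True)
    rows.append(torch.cat([
        torch.zeros_like(inp) if s is None else s
        for s, inp in zip(second, inputs)
    ]))
manual = torch.stack(rows)  # shape [3, 3]

# Reference result from hessian(), compared block by block
reference = hessian(func, tuple(torch.tensor([v]) for v in values))
agrees = all(
    torch.allclose(reference[i][j].squeeze(), manual[i, j])
    for i in range(3) for j in range(3)
)

print(manual)
print("agrees with hessian():", agrees)
```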
