Jacobians in TensorFlow

Last Updated : 26 Feb, 2024

In machine learning and numerical computation, a solid grasp of the underlying mathematics is essential. One such fundamental concept is the Jacobian matrix. It is an important part of multivariable calculus and has extensive applications in diverse fields. In this article, we will discuss Jacobians and how to compute them using TensorFlow.

Jacobians

At its core, the Jacobian matrix encapsulates the rate of change of a vector-valued function with respect to its input variables. Represented as a matrix of partial derivatives, the Jacobian captures the relationship between the inputs and outputs of a multivariate function. Mathematically, for a function [Tex]f: \mathbb{R}^n \rightarrow \mathbb{R}^m [/Tex], the Jacobian [Tex]J [/Tex] is defined as

[Tex]J = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \cdots & \frac{\partial f_m}{\partial x_n} \end{bmatrix} [/Tex]

As you can see, each element of the matrix is the partial derivative of the corresponding output function [Tex]f_i [/Tex] with respect to an input variable [Tex]x_j [/Tex]. The Jacobian therefore encodes the local sensitivity of each output with respect to changes in the inputs, as the following example shows.

Consider the vector-valued function [Tex]f(x, y) = [2x, 3y][/Tex]. Here, [Tex]f [/Tex] takes a two-dimensional input [Tex](x, y)[/Tex] and produces a two-dimensional output. To compute the Jacobian matrix [Tex]J [/Tex] for this function, we calculate the partial derivatives as follows:

[Tex]J = \begin{bmatrix} \frac{\partial (2x)}{\partial x} & \frac{\partial (2x)}{\partial y} \\ \frac{\partial (3y)}{\partial x} & \frac{\partial (3y)}{\partial y} \end{bmatrix} [/Tex]


After evaluating each element we get,

[Tex]J = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} [/Tex]


From the result, we see that a unit change in [Tex]x [/Tex] leads to a 2-unit change in the first output, while a unit change in [Tex]y [/Tex] corresponds to a 3-unit change in the second output. This may seem like a very basic interpretation, but it forms the basis for more complex applications in machine learning, optimization, and sensitivity analysis.
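
Before moving on to TensorFlow, here is a minimal, self-contained sketch that checks this 2x2 Jacobian numerically with central finite differences (the helper name numerical_jacobian is only illustrative):

Python3

# Vector-valued function from the example above: f(x, y) = [2x, 3y]
def f(x, y):
    return [2 * x, 3 * y]

# Approximate each partial derivative with a central finite difference
def numerical_jacobian(x, y, eps=1e-6):
    m = len(f(x, y))
    J = [[0.0, 0.0] for _ in range(m)]
    for i in range(m):
        J[i][0] = (f(x + eps, y)[i] - f(x - eps, y)[i]) / (2 * eps)
        J[i][1] = (f(x, y + eps)[i] - f(x, y - eps)[i]) / (2 * eps)
    return J

print(numerical_jacobian(1.0, 1.0))  # approximately [[2.0, 0.0], [0.0, 3.0]]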

How to compute Jacobian using TensorFlow?

TensorFlow provides powerful tools for computing Jacobians. You can compute Jacobians for scalar sources, tensor sources, and even compute batch Jacobians efficiently. We will now discuss each of these in detail along with code examples.

Scalar Source

If you have a scalar source, the Jacobian collects the derivative of each element of a vector- or tensor-valued output with respect to that single scalar input. You can use tf.GradientTape to calculate it. In the example given below, f is a simple vector-valued function that returns [x**2, sin(x)] for a scalar input x.

Let’s compute the Jacobian matrix of a vector-valued function with respect to a scalar input using TensorFlow’s automatic differentiation capabilities. In the following code we have:

  • imported the tensorflow library
  • defined the vector-valued function f that takes x as input
  • defined a scalar constant x_val that is used as input to the vector-valued function
  • used tf.GradientTape() to compute the Jacobian; the gradient tape records operations for automatic differentiation, and tape.watch(x_val) makes it track the constant input
  • then, tape.jacobian() is used to compute the Jacobian of the vector y with respect to the scalar input

Python3

import tensorflow as tf
 
# Define the vector-valued function
def f(x):
    return tf.stack([x**2, tf.sin(x)])
 
# Define the scalar input (a constant, so it must be watched explicitly below)
x_val = tf.constant(1.0)
 
# Use tf.GradientTape to compute the Jacobian
with tf.GradientTape() as tape:
    tape.watch(x_val)
    y = f(x_val)
 
Jacobian = tape.jacobian(y, x_val)
 
print("Jacobian at x =", x_val.numpy(), ":\n", Jacobian.numpy())

Output:

Jacobian at x = 1.0 :
[2. 0.5403023]

In the output, you can see the input value and the resulting Jacobian. Since the source is a single scalar, the Jacobian is just the vector of derivatives [Tex][2x, \cos x] [/Tex] evaluated at [Tex]x = 1 [/Tex], which tells you how the scalar input affects each element of the output.
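
For contrast, when the arrangement is reversed, i.e. a scalar-valued function of a vector input (for example, the sum of the squared elements), tape.jacobian returns the vector of partial derivatives, which coincides with the ordinary gradient. A minimal sketch, in which the names scalar_function and x_vec are only illustrative:

Python3

import tensorflow as tf

# Scalar-valued function of a vector input: sum of squared elements
def scalar_function(x):
    return tf.reduce_sum(x**2)

# Vector input
x_vec = tf.constant([1.0, 2.0, 3.0])

with tf.GradientTape() as tape:
    tape.watch(x_vec)
    y = scalar_function(x_vec)

# For a scalar target, the Jacobian equals the gradient: d/dx_i sum(x^2) = 2*x_i
print(tape.jacobian(y, x_vec).numpy())  # [2. 4. 6.]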

Tensor Source

For a tensor source, both the source and the target are tensors. However, you don’t have to worry much about the computation, because TensorFlow provides a seamless mechanism for computing Jacobians from a tensor source. As discussed previously, you can simply use tf.GradientTape to calculate the derivatives quickly and form the Jacobian matrix. To understand it better, take a look at the example given below.

Consider the element-wise function [Tex]f(x) = x^2 + 2x [/Tex], i.e. [Tex]f(x) = \begin{bmatrix} x_0^2 + 2x_0, \ x_1^2 + 2x_1 \end{bmatrix} [/Tex]. Here, the input [Tex]x [/Tex] is a vector. Our goal is to compute the Jacobian matrix of this function in TensorFlow.

Python3

import tensorflow as tf
 
# Define a tensor-valued function
def tensor_function(x):
    return x**2 + 2*x
 
# Create an input tensor
x = tf.constant([1.0, 2.0], dtype=tf.float32)
 
# Use tf.GradientTape to compute the Jacobian
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    y = tensor_function(x)
 
# Compute the Jacobian matrix
Jacobian_matrix = tape.jacobian(y, x)
 
print("Input Tensor (x):", x.numpy())
print("Jacobian Matrix:\n", Jacobian_matrix.numpy())

Output:

Input Tensor (x): [1. 2.]
Jacobian Matrix:
[[4. 0.]
[0. 6.]]

Reading off the matrix, a small change in [Tex]x_0 [/Tex] changes the first output at a rate of 4, and a small change in [Tex]x_1 [/Tex] changes the second output at a rate of 6; the off-diagonal entries are zero because each output depends on only one input. This shows how easy it is to compute Jacobians for tensor-valued functions in TensorFlow.
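
As a quick follow-up (assuming the snippet is run in the same session, so Jacobian_matrix and x from the code above are still in scope), the diagonal of this matrix holds the element-wise derivatives 2x + 2:

Python3

# Diagonal of the Jacobian = element-wise derivatives d/dx (x**2 + 2*x) = 2*x + 2
diag = tf.linalg.diag_part(Jacobian_matrix)
print("Diagonal of Jacobian:", diag.numpy())     # [4. 6.]
print("Expected 2*x + 2:", (2 * x + 2).numpy())  # [4. 6.]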

Batch Jacobian

The term “batch Jacobian” refers to the Jacobian matrix computed across a batch of input-output pairs. Specifically, it represents the partial derivatives of a stack of target outputs with respect to a stack of source inputs, where each target-source pair in the batch has an independent Jacobian matrix.

In the following code, we have illustrated the concept of Batch Jacobian in the context of a neural network model where the input tensor is passed through a series of layers to produce an output tensor y.

  1. The first step is to generate a tensor x containing random integer values.
  2. Then we convert the integer tensor to float32, as TensorFlow layers expect float32 inputs.
  3. Then, we define the layers of the neural network.
  4. The last step is to compute the batch Jacobian.
    • tf.GradientTape() records the operations for automatic differentiation.
    • tape.watch(x_float) tells the tape to track the input tensor so derivatives with respect to it can be computed.
    • The output tensor y is computed by passing the input tensor x_float through the layers.
    • tape.batch_jacobian() computes the batch Jacobian.

Python3

import tensorflow as tf
 
# Generate random integer values for x
x = tf.random.uniform([5, 5], minval=0, maxval=10, dtype=tf.int32)
 
# Convert integer tensor to float32
x_float = tf.cast(x, dtype=tf.float32)
 
# Define layers
layer1 = tf.keras.layers.Dense(8, activation=tf.nn.elu)
bn = tf.keras.layers.BatchNormalization()
layer2 = tf.keras.layers.Dense(6, activation=tf.nn.elu)
 
with tf.GradientTape(persistent=True, watch_accessed_variables=False) as tape:
    tape.watch(x_float)
    y = layer1(x_float)
    y = bn(y, training=True)
    y = layer2(y)
 
# Compute batch Jacobian
jb = tape.batch_jacobian(y, x_float)
 
print(f'jb.shape: {jb.shape}')

Output:

jb.shape: (5, 6, 5)
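
For comparison, the same persistent tape can also produce the full (non-batched) Jacobian of y with respect to x_float. It treats the whole batch as a single input and output, so its shape is (5, 6, 5, 5), while tape.batch_jacobian() keeps only the per-example (5, 6, 5) slices. Note that because the BatchNormalization layer runs in training mode, the examples are not fully independent, so the cross-example blocks of the full Jacobian are not necessarily zero. A short sketch, assuming tape, y, and x_float from the code above are still in scope:

Python3

# Full Jacobian: every output element w.r.t. every input element of the batch
j_full = tape.jacobian(y, x_float)
print(f'j_full.shape: {j_full.shape}')  # (5, 6, 5, 5)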



