
Dynamic vs Static Computational Graphs – PyTorch and TensorFlow

TensorFlow and PyTorch are two of the most popular deep learning libraries today. Both have developed their respective niches in mainstream deep learning, with excellent documentation, tutorials, and, most importantly, an active and supportive community behind them.

Difference between Static Computational Graphs in TensorFlow and Dynamic Computational Graphs in PyTorch

Though both libraries employ a directed acyclic graph (DAG) to represent their machine learning and deep learning models, they differ substantially in how data and calculations flow through the graph. The key difference is that TensorFlow (v < 2.0) builds a static graph that is defined in full before it is run, whereas PyTorch builds a dynamic graph as the code executes. This article covers these differences in a visual manner with code examples. It assumes a working knowledge of computation graphs and a basic understanding of the TensorFlow and PyTorch modules. For a quick refresher on these concepts, the reader is encouraged to go through the following articles:



Static Computation Graph in TensorFlow

Properties of nodes & edges: The nodes represent the operations, which are applied directly to the data flowing in and out through the edges. For the above set of equations, we can keep the following things in mind while implementing them in TensorFlow:

Now let’s implement the above calculations in TensorFlow and observe how the operations occur:






# Importing tensorflow version 1
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
  
# Initializing placeholder variables of
# the graph
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
  
# Defining the operation
c = tf.multiply(a, b)
  
# Instantiating a tensorflow session
with tf.Session() as sess:
  
    # Computing the output of the graph by giving
    # respective input values
    out = sess.run([c], feed_dict={a: [15.0], b: [20.0]})[0][0]
  
    # Computing the output gradient of the output with
    # respect to the input 'a'
    derivative_out_a = sess.run(tf.gradients(c, a), feed_dict={
                                a: [15.0], b: [20.0]})[0][0]
  
    # Computing the output gradient of the output with
    # respect to the input 'b'
    derivative_out_b = sess.run(tf.gradients(c, b), feed_dict={
                                a: [15.0], b: [20.0]})[0][0]
  
    # Displaying the outputs
    print(f'c = {out}')
    print(f'Derivative of c with respect to a = {derivative_out_a}')
    print(f'Derivative of c with respect to b = {derivative_out_b}')

Output:

c = 300.0
Derivative of c with respect to a = 20.0
Derivative of c with respect to b = 15.0
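
These values can be checked by hand. Taking the single operation defined in the code, $c = a \cdot b$, with the fed inputs $a = 15$ and $b = 20$:

```latex
\begin{aligned}
c &= a \cdot b = 15 \cdot 20 = 300 \\
\frac{\partial c}{\partial a} &= b = 20 \\
\frac{\partial c}{\partial b} &= a = 15
\end{aligned}
```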

As we can see, the output matches our calculations in the Introduction section, indicating that the graph was executed correctly. The static structure is evident from the code: a, b, and c are symbolic nodes that hold no values when they are defined, and values are computed only when sess.run() executes the graph. The structure of the graph stays the same from one run to the next; what changes is the input data supplied through the feed_dict argument of sess.run().
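
To make the define-then-run idea concrete without depending on TensorFlow, here is a minimal sketch in plain Python. The `Placeholder` and `Multiply` classes and the `run` function are hypothetical names invented for this illustration; they only mimic the pattern and are not TensorFlow's API or implementation.

```python
# Toy "define-then-run" graph in plain Python.
# Placeholder, Multiply, and run are hypothetical names for illustration,
# not part of TensorFlow.

class Placeholder:
    """A named input slot; it holds no value until the graph is run."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        # Look up the concrete value supplied for this slot at run time.
        return feed[self]


class Multiply:
    """An operation node; its inputs arrive along two incoming edges."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self, feed):
        return self.left.evaluate(feed) * self.right.evaluate(feed)


def run(node, feed):
    """Execute an already-built graph with concrete input values."""
    return node.evaluate(feed)


# Phase 1: define the graph once. Nothing is computed here.
a = Placeholder("a")
b = Placeholder("b")
c = Multiply(a, b)

# Phase 2: run the same, unchanged graph with different inputs.
first = run(c, {a: 15.0, b: 20.0})
second = run(c, {a: 2.0, b: 3.0})

print(first)   # 300.0
print(second)  # 6.0
```

The graph object `c` is built exactly once, and only the fed values differ between the two calls, which is the role that feed_dict plays in the TensorFlow listing.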

Advantages:

Disadvantages:

Dynamic Computation Graph in PyTorch

Properties of nodes & edges: The nodes represent the data (in the form of tensors) and the edges represent the operations applied to the input data.

For the equations given in the Introduction, we can keep the following things in mind while implementing them in PyTorch:

Now let’s check out a code example to verify our findings:




# Importing torch
import torch
  
# Initializing input tensors
a = torch.tensor(15.0, requires_grad=True)
b = torch.tensor(20.0, requires_grad=True)
  
# Computing the output
c = a * b
  
# Computing the gradients
c.backward()
  
# Collecting the output gradient of the
# output with respect to the input 'a'
derivative_out_a = a.grad
  
# Collecting the output gradient of the
# output with respect to the input 'b'
derivative_out_b = b.grad
  
# Displaying the outputs (.item() converts a
# one-element tensor into a plain Python number)
print(f'c = {c.item()}')
print(f'Derivative of c with respect to a = {derivative_out_a.item()}')
print(f'Derivative of c with respect to b = {derivative_out_b.item()}')

Output:

c = 300.0
Derivative of c with respect to a = 20.0
Derivative of c with respect to b = 15.0

As we can see, the output matches our calculations in the Introduction section, indicating that the computation ran correctly. The dynamic structure is evident from the code: the graph is built at runtime as each operation executes, so the inputs, intermediate values, and outputs are ordinary tensors that can be inspected and changed while the program runs. This is entirely different from the define-then-run approach used by TensorFlow (v < 2.0).
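
A practical consequence of building the graph at runtime is that ordinary Python control flow can change the shape of the graph from one input to the next. The sketch below illustrates this with a toy define-by-run tape in plain Python. The `Value` class and the `power_until` function are hypothetical names invented for this illustration; this is not PyTorch's API or its autograd implementation.

```python
# Toy define-by-run autodiff in plain Python.
# Value and power_until are hypothetical names for illustration,
# not part of PyTorch.

class Value:
    """A scalar that records how it was produced while the code runs."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        # Each entry pairs a parent Value with d(self)/d(parent).
        self.parents = parents

    def __mul__(self, other):
        # Running the multiplication also records the edges of the graph.
        return Value(self.data * other.data,
                     parents=((self, other.data), (other, self.data)))

    def backward(self, upstream=1.0):
        """Push the gradient back along every recorded path (chain rule)."""
        self.grad += upstream
        for parent, local_grad in self.parents:
            parent.backward(upstream * local_grad)


def power_until(x, limit):
    """Multiply by x until the result reaches limit.

    The number of multiplications, and so the recorded graph,
    depends on the runtime value of x.
    """
    y = x
    steps = 0
    while y.data < limit:
        y = y * x
        steps += 1
    return y, steps


# Same computation as the article's example: c = a * b.
a = Value(15.0)
b = Value(20.0)
c = a * b
c.backward()
print(c.data, a.grad, b.grad)  # 300.0 20.0 15.0

# Data-dependent control flow: two inputs, two different graphs.
x3 = Value(3.0)
y3, steps3 = power_until(x3, limit=20.0)  # two multiplications: y = x**3
y3.backward()

x5 = Value(5.0)
y5, steps5 = power_until(x5, limit=20.0)  # one multiplication: y = x**2
y5.backward()

print(steps3, y3.data, x3.grad)  # 2 27.0 27.0
print(steps5, y5.data, x5.grad)  # 1 25.0 10.0
```

With x = 3 the loop records the graph of x³ (gradient 3x² = 27), while with x = 5 it records x² (gradient 2x = 10). No graph has to be declared in advance for either case, which is the behaviour the dynamic approach is designed to allow.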

Advantages:

Disadvantages:

Conclusion

This article sheds light on the difference between the modeling structures of TensorFlow and PyTorch. It also lists some advantages and disadvantages of both approaches and walks through a code example for each. The organizations behind these libraries keep improving them with each release, but the reader can now make a better-informed decision when choosing a framework for their next project.

