What is a Tensor Processing Unit?
With machine learning gaining its relevance and importance everyday, the conventional microprocessors have proven to be unable to effectively handle it, be it training or neural network processing. GPUs, with their highly parallel architecture designed for fast graphic processing proved to be way more useful than CPUs for the purpose, but were somewhat lacking. Therefore, in order to combat this situation, Google developed an AI accelerator integrated circuit which would be used by its TensorFlow AI framework. This device has been named TPU (Tensor Processing Unit). The chip has been designed for Tensorflow Framework.
What is TensorFlow Framework?
TensorFlow is an open source library developed by Google for its internal use. Its main usage is in machine learning and dataflow programming. TensorFlow computations are expressed as stateful dataflow graphs. The name TensorFlow derives from the operations that such neural networks perform on multidimensional data arrays. These arrays are referred to as “tensors”. TensorFlow is available for Linux distributions, Windows, and MacOS.
The following diagram explains the physical architecture of the units in a TPU:
The TPU includes the following computational resources:
- Matrix Multiplier Unit (MXU): 65, 536 8-bit multiply-and-add units for matrix operations.
- Unified Buffer (UB): 24MB of SRAM that work as registers
- Activation Unit (AU): Hardwired activation functions.
There are 5 major high level instruction sets devised to control how the above resources work. They are as follows:
|Read_Host_Memory||Read data from memory|
|Read_Weights||Read weights from memory|
|MatrixMultiply/Convolve||Multiply or convolve with the data and weights, accumulate the results|
|Activate||Apply activation functions|
|Write_Host_Memory||Write result to memory|
The following is the diagram the application stack maintained by the google applications that use TensorFlow and TPU:
Advantages of TPU
The following are some notable advantages of TPUs:
- Accelerates the performance of linear algebra computation, which is used heavily in machine learning applications.
- Minimizes the time-to-accuracy when you train large, complex neural network models.
- Models that previously took weeks to train on other hardware platforms can converge in hours on TPUs.
When to use a TPU
The following are the cases where TPUs are best suited in machine learning:
- Models dominated by matrix computations.
- Models with no custom TensorFlow operations inside the main training loop.
- Models that train for weeks or months
- Larger and very large models with very large effective batch sizes.
- ML | Understanding Data Processing
- Introduction to Tensor with Tensorflow
- Understanding Types of Mean | Set 2
- Using Vectors in Processing Language
- Satellite Image Processing
- Processing text using NLP | Basics
- Understanding Search Engines
- Understanding Augmented Reality
- Understanding Types of Means | Set 1
- Understanding Logistic Regression
- Understanding ReDoS Attack
- Understanding Character Encoding
- Understanding different Box Plot with visualization
- Introduction to Natural Language Processing
- NLP | Parallel list processing with execnet
If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to email@example.com. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.