
Hadoop YARN Architecture

YARN stands for “Yet Another Resource Negotiator”. It was introduced in Hadoop 2.0 to remove the Job Tracker bottleneck that was present in Hadoop 1.0. YARN was described as a “Redesigned Resource Manager” at the time of its launch, but it has since evolved into what is often described as a large-scale distributed operating system for Big Data processing.

YARN architecture separates the resource management layer from the processing layer. The responsibilities that the Job Tracker held in Hadoop 1.0 are split in YARN between a cluster-wide Resource Manager and a per-application Application Master (referred to as the Application Manager in some descriptions).

YARN also allows different data processing engines, such as graph processing, interactive processing, stream processing, and batch processing, to run on data stored in HDFS (Hadoop Distributed File System), which makes the cluster more efficient to use. Through its components, it dynamically allocates cluster resources and schedules application processing. For large-volume data processing, the available resources must be managed properly so that every application can make use of them.
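To make the idea of dynamic resource allocation more concrete, here is a minimal Python sketch. It is not YARN's scheduler or API: the `Node` class, the `allocate` and `release` functions, the job names, and the first-fit placement policy are all assumptions chosen for illustration. It only shows how a shared pool of memory and virtual cores can be handed out to different kinds of applications and reclaimed when one finishes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical worker node with a pool of free resources."""
    name: str
    memory_mb: int
    vcores: int
    containers: list = field(default_factory=list)

    def can_fit(self, memory_mb, vcores):
        return self.memory_mb >= memory_mb and self.vcores >= vcores


def allocate(nodes, app, memory_mb, vcores):
    """Place one container on the first node with enough free resources.

    Returns the node name, or None if the request has to wait.
    First-fit is an illustrative assumption, not YARN's actual policy.
    """
    for node in nodes:
        if node.can_fit(memory_mb, vcores):
            node.memory_mb -= memory_mb
            node.vcores -= vcores
            node.containers.append((app, memory_mb, vcores))
            return node.name
    return None


def release(nodes, node_name, app):
    """Return the resources of one finished container of `app` to its node."""
    for node in nodes:
        if node.name != node_name:
            continue
        for container in node.containers:
            if container[0] == app:
                node.containers.remove(container)
                node.memory_mb += container[1]
                node.vcores += container[2]
                return True
    return False


nodes = [Node("node-1", 4096, 4), Node("node-2", 2048, 2)]
first = allocate(nodes, "batch-job", 3072, 2)    # fits on node-1
second = allocate(nodes, "stream-job", 2048, 2)  # node-1 is too full, goes to node-2
third = allocate(nodes, "graph-job", 2048, 1)    # no node has room yet
release(nodes, "node-1", "batch-job")            # the batch job finishes
retry = allocate(nodes, "graph-job", 2048, 1)    # now fits on node-1
print(first, second, third, retry)
```

The third request is refused until the batch job releases its container, after which the same request succeeds. Real YARN schedulers work with queues and more elaborate policies, so this should be read only as a picture of the allocate-and-reclaim cycle.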

YARN Features: YARN gained popularity because of the following features- 
 

 

Hadoop YARN Architecture

 

The main components of YARN architecture include: 

 

Application workflow in Hadoop YARN: 

 

 

  1. The client submits an application to the Resource Manager
  2. The Resource Manager allocates a container in which to start the application's Application Master (often loosely called the Application Manager)
  3. The Application Master registers itself with the Resource Manager
  4. The Application Master negotiates containers from the Resource Manager
  5. The Application Master notifies the Node Managers to launch the containers
  6. The application code is executed in the containers
  7. The client contacts the Resource Manager or the Application Master to monitor the application's status
  8. Once processing is complete, the Application Master un-registers with the Resource Manager
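The eight steps above can be traced with a small Python simulation. This is a toy model under stated assumptions, not the Hadoop API: the classes `ResourceManager`, `NodeManager`, and `ApplicationMaster` (YARN's own name for the per-application coordinator in steps 2 to 8) are hypothetical stand-ins, everything runs synchronously in one process, and containers are handed out round-robin purely for simplicity.

```python
class NodeManager:
    """Hypothetical per-node agent that launches containers."""

    def __init__(self, name):
        self.name = name

    def launch(self, container_id, task):
        # Step 6: the application code runs inside the container.
        return task(container_id)


class ResourceManager:
    """Hypothetical cluster-wide resource arbiter."""

    def __init__(self, node_managers):
        self.node_managers = node_managers
        self.registered = {}      # app_id -> status
        self.next_container = 0
        self.log = []

    def submit(self, app_id, task, num_containers):
        self.log.append("submitted")                 # Step 1
        am_container = self.allocate(1)[0]           # Step 2
        self.log.append("am_started")
        master = ApplicationMaster(app_id, self, am_container)
        master.run(task, num_containers)
        return master

    def register(self, app_id):
        self.registered[app_id] = "RUNNING"
        self.log.append("registered")

    def allocate(self, count):
        # Hand out (container_id, node_manager) pairs round-robin.
        grants = []
        for _ in range(count):
            node = self.node_managers[self.next_container % len(self.node_managers)]
            grants.append((self.next_container, node))
            self.next_container += 1
        return grants

    def unregister(self, app_id):
        self.registered[app_id] = "FINISHED"
        self.log.append("unregistered")

    def status(self, app_id):
        # Step 7: the client asks for the application's status.
        return self.registered.get(app_id, "UNKNOWN")


class ApplicationMaster:
    """Hypothetical per-application coordinator."""

    def __init__(self, app_id, rm, container):
        self.app_id = app_id
        self.rm = rm
        self.container = container
        self.results = []

    def run(self, task, num_containers):
        self.rm.register(self.app_id)                # Step 3
        grants = self.rm.allocate(num_containers)    # Step 4
        self.rm.log.append("negotiated")
        for container_id, node in grants:
            # Steps 5 and 6: ask the node to launch the container and run the task.
            self.results.append(node.launch(container_id, task))
        self.rm.log.append("executed")
        self.rm.unregister(self.app_id)              # Step 8


rm = ResourceManager([NodeManager("node-1"), NodeManager("node-2")])
master = rm.submit("app-1", lambda cid: f"container {cid} done", num_containers=3)
print(rm.log)
print(master.results)
print(rm.status("app-1"))
```

The log records the events in the same order as the numbered list, and the final status call corresponds to the client monitoring in step 7. In a real cluster these components are separate processes on different machines that communicate over the network, and the client can poll for status while the application is still running.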

Advantages:

Disadvantages:
