A daemon is a background process. Hadoop daemons are the set of processes that run on a Hadoop cluster. Hadoop is a framework written in Java, so all of these processes are Java processes.
Apache Hadoop 2 consists of the following Daemons:
- NameNode
- DataNode
- Secondary NameNode
- Resource Manager
- Node Manager
The NameNode, Secondary NameNode, and Resource Manager run on the Master system, while the Node Manager and DataNode run on the Slave machines.
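One quick way to see which daemons are running on a machine is the JDK's `jps` tool, which lists Java processes as a process id followed by a class name. The sketch below is only an illustration: the function name `missing_daemons` and the sample output are made up for the example, and it assumes the daemons show up in `jps` output as NameNode, SecondaryNameNode, ResourceManager, DataNode, and NodeManager.

```python
# Expected daemons per machine role, following the master/slave split above.
EXPECTED = {
    "master": {"NameNode", "SecondaryNameNode", "ResourceManager"},
    "slave": {"DataNode", "NodeManager"},
}

def missing_daemons(jps_output, role):
    """Return the expected daemons for `role` that are absent from jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # jps prints "<pid> <class name>"
            running.add(parts[1])
    return sorted(EXPECTED[role] - running)

# Hypothetical jps output captured on a master machine.
sample = """4821 NameNode
5012 SecondaryNameNode
5377 Jps"""
print(missing_daemons(sample, "master"))  # ['ResourceManager']
```

In practice the text passed in would come from running `jps` on the machine being checked.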
1. NameNode
The NameNode runs on the Master system. Its primary purpose is to manage all the metadata. Metadata is the list of files stored in HDFS (Hadoop Distributed File System). Data in a Hadoop cluster is stored in the form of blocks, so the metadata records which DataNodes hold each block of a file. The NameNode also keeps a transaction log (the edit log) of changes made to the file system, such as files being created or deleted. The metadata is held in memory so that it can be served quickly, and it is persisted to disk in the fsimage and edit log files.
- It never stores the data that is present in the file.
- As the NameNode runs on the Master system, that system should have good processing power and more RAM than the Slaves.
- It stores information about the DataNodes, such as their block IDs and number of blocks.
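The idea that the NameNode holds only metadata, never file contents, can be sketched with a toy in-memory model. This is not how Hadoop implements it; the class `ToyNameNode`, its methods, and the block and node names are all hypothetical and chosen only to illustrate the file-to-block and block-to-DataNode mappings.

```python
class ToyNameNode:
    """Keeps only metadata: which blocks make up a file and where they live."""

    def __init__(self):
        self.file_blocks = {}      # file path -> ordered list of block ids
        self.block_locations = {}  # block id -> set of DataNode names

    def add_file(self, path, block_ids):
        self.file_blocks[path] = list(block_ids)

    def report_block(self, datanode, block_id):
        # DataNodes tell the NameNode which blocks they hold.
        self.block_locations.setdefault(block_id, set()).add(datanode)

    def locate(self, path):
        # A client asks where each block is; no file content is returned.
        return [(b, sorted(self.block_locations.get(b, ())))
                for b in self.file_blocks[path]]

nn = ToyNameNode()
nn.add_file("/logs/app.log", ["blk_1", "blk_2"])
nn.report_block("datanode-a", "blk_1")
nn.report_block("datanode-b", "blk_1")
nn.report_block("datanode-b", "blk_2")
print(nn.locate("/logs/app.log"))
# [('blk_1', ['datanode-a', 'datanode-b']), ('blk_2', ['datanode-b'])]
```

A client would then contact the listed DataNodes directly to read the block contents.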
How to start Name Node?
hadoop-daemon.sh start namenode
How to stop Name Node?
hadoop-daemon.sh stop namenode
2. DataNode
The DataNode runs on the Slave systems. The NameNode instructs the DataNodes where to store data. A DataNode is a program running on a slave system that serves read/write requests from clients. Because the actual data is stored on the DataNodes, they should have a large storage capacity.
How to start Data Node?
hadoop-daemon.sh start datanode
How to stop Data Node?
hadoop-daemon.sh stop datanode
3. Secondary NameNode
The Secondary NameNode takes periodic checkpoints (hourly by default) of the NameNode's metadata, not of the file data itself. At each checkpoint it stores the merged metadata in a file named fsimage. If the NameNode fails or crashes, this fsimage file can be transferred to a new system, a new Master is created from this metadata, and the cluster can be brought back up.
This is the benefit of the Secondary NameNode. Hadoop 2 also offers High Availability and Federation features, which reduce the importance of the Secondary NameNode.
Major functions of the Secondary NameNode:
- It merges the edit logs and the fsimage from the NameNode.
- It periodically reads the metadata held in the NameNode's RAM and writes it to the hard disk.
Because the Secondary NameNode keeps track of checkpoints in HDFS, it is also known as the checkpoint node.
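The checkpoint step, merging the edit log into the fsimage, can be illustrated with a small sketch. The data layout here is an assumption made for the example (a dict for the image and a list of tuples for the log); the real fsimage and edit log are binary on-disk formats, and `apply_edits` is a hypothetical helper.

```python
def apply_edits(fsimage, edit_log):
    """Replay edit-log entries on a copy of the fsimage and return the new image."""
    image = dict(fsimage)  # path -> list of block ids
    for op, path, blocks in edit_log:
        if op == "create":
            image[path] = list(blocks)
        elif op == "delete":
            image.pop(path, None)
    return image

fsimage = {"/a.txt": ["blk_1"]}
edit_log = [
    ("create", "/b.txt", ["blk_2", "blk_3"]),
    ("delete", "/a.txt", None),
]
new_fsimage = apply_edits(fsimage, edit_log)
edit_log = []  # after a checkpoint the merged edits no longer need replaying
print(new_fsimage)  # {'/b.txt': ['blk_2', 'blk_3']}
```

Keeping the edit log short this way means a restarting NameNode has fewer entries to replay on top of the latest fsimage.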
|Hadoop Daemon|Port|
|Secondary NameNode|50090|
These ports can be configured manually in hdfs-site.xml and mapred-site.xml files.
4. Resource Manager
The Resource Manager, also known as the global master daemon, runs on the Master system. It manages the resources for the applications running in a Hadoop cluster. The Resource Manager mainly consists of two components: the Applications Manager and the Scheduler.
The Applications Manager is responsible for accepting requests from clients and for negotiating a container on one of the Slaves to host the Application Master. The Scheduler allocates cluster resources to the running applications; it does not monitor or track application status itself.
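To make the Scheduler's role concrete, here is a deliberately simplified sketch of handing out memory to applications. It uses a plain first-fit rule, which is an assumption for illustration only; the schedulers that ship with YARN use more elaborate policies. The function `allocate` and the node and application names are hypothetical.

```python
def allocate(requests, node_free_mb):
    """Place each (app, memory_mb) request on the first node with enough free memory."""
    free = dict(node_free_mb)  # work on a copy so the caller's dict is untouched
    placements = {}
    for app, memory_mb in requests:
        for node, available in free.items():
            if available >= memory_mb:
                free[node] = available - memory_mb
                placements[app] = node
                break
        else:
            placements[app] = None  # no node can host this request right now
    return placements

nodes = {"slave-1": 4096, "slave-2": 2048}
requests = [("app-1", 3072), ("app-2", 2048), ("app-3", 2048)]
print(allocate(requests, nodes))
# {'app-1': 'slave-1', 'app-2': 'slave-2', 'app-3': None}
```

The third request stays unplaced until memory is freed, which mirrors how an application waits when the cluster has no spare capacity.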
How to start ResourceManager?
yarn-daemon.sh start resourcemanager
How to stop ResourceManager?
yarn-daemon.sh stop resourcemanager
5. Node Manager
The Node Manager runs on the Slave systems and manages the resources, such as memory and disk, within its node. Each slave node in a Hadoop cluster has a single NodeManager daemon running on it. It monitors resource usage on the node and sends this information to the Resource Manager.
How to start Node Manager?
yarn-daemon.sh start nodemanager
How to stop Node Manager?
yarn-daemon.sh stop nodemanager
In a Hadoop cluster, the Resource Manager and Node Manager can be tracked through specific URLs of the form http://:port_number
|Hadoop Daemon|Port|
The below diagram shows how Hadoop works.