How Does Namenode Handle Datanode Failure in Hadoop Distributed File System?

The Hadoop Distributed File System (HDFS) follows a master/slave architecture in which the Namenode acts as the master and the Datanodes act as slaves. The Namenode is critical because it is the central component of HDFS: if the Namenode goes down, the whole Hadoop cluster becomes inaccessible and is considered dead. Datanodes store the actual data and work as instructed by the Namenode. An HDFS cluster can have many Datanodes but only one active Namenode.

Datanode and Namenode

Basic operations of Namenode:



  • Namenode maintains and manages the Datanodes and assigns tasks to them.
  • Namenode does not store the actual data of files.
  • Namenode stores metadata about the actual data, such as file name, path, number of data blocks, block IDs, block locations, number of replicas, and other slave-related information.
  • Namenode manages all client requests (read, write) for the actual data files.
  • Namenode executes file system namespace operations such as opening and closing files and renaming files and directories.
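
To make the idea of "metadata only" concrete, here is a minimal sketch of the kind of bookkeeping listed above. It is not Hadoop's implementation: the names ToyNamespace, FileEntry, and BlockInfo are hypothetical, and the structure is deliberately simplified to a dictionary of paths.

```python
from dataclasses import dataclass, field


@dataclass
class BlockInfo:
    block_id: int
    # Names of the Datanodes currently holding a replica of this block.
    locations: set = field(default_factory=set)


@dataclass
class FileEntry:
    path: str
    replication: int  # target number of replicas per block
    blocks: list = field(default_factory=list)


class ToyNamespace:
    """Hypothetical in-memory namespace: metadata only, no file contents."""

    def __init__(self):
        self.files = {}

    def create(self, path, replication=3):
        # Register a new, empty file entry under the given path.
        self.files[path] = FileEntry(path, replication)
        return self.files[path]

    def add_block(self, path, block_id, datanodes):
        # Record which Datanodes hold the replicas of a new block.
        block = BlockInfo(block_id, set(datanodes))
        self.files[path].blocks.append(block)
        return block

    def rename(self, old_path, new_path):
        # A rename touches only metadata; no block data moves.
        entry = self.files.pop(old_path)
        entry.path = new_path
        self.files[new_path] = entry

    def block_locations(self, path):
        # What a client would ask for before contacting Datanodes.
        return [(b.block_id, sorted(b.locations)) for b in self.files[path].blocks]
```

Note that nothing in this structure holds file bytes; a client would use the returned block locations to fetch the data from the Datanodes.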

Basic operations of Datanode:

  • Datanodes are responsible for storing the actual data.
  • Upon instruction from the Namenode, a Datanode performs operations such as creation, replication, and deletion of data blocks.
  • If one Datanode goes down, the Hadoop cluster is not affected, because its blocks are replicated on other Datanodes.
  • All Datanodes in the Hadoop cluster are synchronized so that they can communicate with each other for various operations.

What happens if one of the Datanodes fails in HDFS?

Namenode periodically receives a heartbeat and a block report from each Datanode in the cluster. By default, every Datanode sends a heartbeat message to the Namenode every 3 seconds. A heartbeat simply indicates that a particular Datanode is working properly; in other words, it tells the Namenode that the Datanode is alive.
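
The heartbeat mechanism can be sketched as a table of "last seen" timestamps. This is an illustrative toy, not Hadoop code: HeartbeatMonitor and its parameters are hypothetical, and the 600-second default mirrors the 10-minute window after which HDFS treats a silent Datanode as dead.

```python
import time


class HeartbeatMonitor:
    """Toy tracker of the last heartbeat time per Datanode (not Hadoop code)."""

    def __init__(self, timeout_seconds=600.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.last_seen = {}

    def heartbeat(self, datanode):
        # Called whenever a heartbeat message arrives from a Datanode.
        self.last_seen[datanode] = self.clock()

    def is_alive(self, datanode):
        # Alive means: seen at least once, and recently enough.
        last = self.last_seen.get(datanode)
        return last is not None and self.clock() - last <= self.timeout_seconds

    def dead_nodes(self):
        # Datanodes that were seen before but have been silent for too long.
        return sorted(dn for dn in self.last_seen if not self.is_alive(dn))
```

With a 3-second heartbeat interval, a healthy Datanode refreshes its timestamp many times within the timeout, so only a node that has been silent for the whole window shows up in dead_nodes().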
A block report contains information about all the blocks that reside on the corresponding Datanode. When the Namenode does not receive any heartbeat message from a particular Datanode for about 10 minutes (by default), it considers that Datanode dead or failed. Because the blocks stored on that Datanode are now under-replicated, the system starts replicating them from one Datanode to another, using the block information from the failed Datanode's block report to determine which blocks are affected. The data being replicated is transferred directly from one Datanode to another without passing through the Namenode.
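
The re-replication step can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual HDFS algorithm: plan_re_replication is a hypothetical function, the block map stands in for what the Namenode learns from block reports, and target selection here ignores real concerns such as rack awareness and node load.

```python
def plan_re_replication(block_map, dead_node, live_nodes, replication=3):
    """Return a list of (block_id, source, target) copy instructions.

    block_map: dict mapping block_id -> set of Datanode names holding a replica.
    Each instruction tells a surviving source Datanode to copy the block
    directly to a target Datanode; the master only does the planning.
    """
    plan = []
    for block_id in sorted(block_map):
        if dead_node not in block_map[block_id]:
            continue  # this block lost no replica
        holders = block_map[block_id] - {dead_node}
        missing = replication - len(holders)
        if missing <= 0 or not holders:
            continue  # still fully replicated, or no surviving copy to read from
        source = sorted(holders)[0]
        # Candidate targets: live nodes that do not already hold this block.
        candidates = [n for n in sorted(live_nodes)
                      if n not in holders and n != dead_node]
        for target in candidates[:missing]:
            plan.append((block_id, source, target))
    return plan
```

For example, if block 1 lives on dn1, dn2, and dn3 and dn1 is declared dead, the plan contains one instruction asking a surviving holder to copy block 1 to a live Datanode that does not yet have it.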
