The three components are described below:
- HMaster –
HMaster is the implementation of the Master Server in HBase. It is the process that assigns regions to Region Servers and handles DDL operations (creating and deleting tables). It monitors all Region Server instances present in the cluster. In a distributed environment, the Master runs several background threads. HMaster is also responsible for tasks such as load balancing and failover.
- Region Server –
HBase tables are divided horizontally by row key range into Regions. Regions are the basic building blocks of an HBase cluster: they are the units in which tables are distributed, and each region is made up of column families. A Region Server runs on an HDFS DataNode in the Hadoop cluster. It is responsible for handling, managing and executing read and write operations on the set of regions it hosts. The default maximum size of a region is 256 MB in older HBase releases; later releases use a larger default, and the value is configurable.
- Zookeeper –
ZooKeeper acts as a coordinator in HBase. It provides services such as maintaining configuration information, naming, distributed synchronization and server failure notification. Clients consult ZooKeeper to find out where regions are hosted and then communicate with the Region Servers.
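The way row key ranges map to regions, and regions to Region Servers, can be sketched with a small toy model. This is not the HBase API: the class `ToyCluster`, the split keys and the server names `rs1` and `rs2` are all assumptions made for illustration, and the round-robin assignment only stands in for the real balancing logic of HMaster.

```python
from bisect import bisect_right


class ToyCluster:
    """Toy model: regions are contiguous row key ranges hosted by region servers."""

    def __init__(self, split_keys, servers):
        # Region i covers [start_keys[i], start_keys[i + 1]);
        # the first region starts at the empty key, the last one is open-ended.
        self.start_keys = [""] + sorted(split_keys)
        # Stand-in for the master: assign regions to servers round-robin.
        self.assignment = {
            i: servers[i % len(servers)] for i in range(len(self.start_keys))
        }

    def region_for(self, row_key):
        # Pick the last region whose start key is <= row_key.
        return bisect_right(self.start_keys, row_key) - 1

    def server_for(self, row_key):
        return self.assignment[self.region_for(row_key)]


# Hypothetical table split at "g", "n" and "t", hosted by two servers.
cluster = ToyCluster(split_keys=["g", "n", "t"], servers=["rs1", "rs2"])
for key in ("apple", "g", "zebra"):
    print(key, cluster.region_for(key), cluster.server_for(key))
```

Because the regions are sorted by start key, finding the region for a row key is a binary search, which is roughly what a client does once it knows the region boundaries.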
Advantages of HBase –
- Can store large data sets
- Database can be shared
- Cost-effective from gigabytes to petabytes
- High availability through failover and replication
Disadvantages of HBase –
- No support for SQL structure
- No transaction support
- Sorted only on key
- Memory issues on the cluster
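The "sorted only on key" limitation can be illustrated with a toy in-memory table. This is a sketch under assumptions, not HBase code: the row keys, the `city` column and the helper names `scan` and `find_by_city` are hypothetical. Rows are kept sorted by row key, so a key range can be found with binary search, while a lookup by value has to inspect every row.

```python
from bisect import bisect_left

# Hypothetical rows, kept sorted by row key (the only sort order available).
rows = sorted({
    "user#001": {"city": "Pune"},
    "user#002": {"city": "Delhi"},
    "user#010": {"city": "Pune"},
    "order#7": {"city": "Delhi"},
}.items())
keys = [k for k, _ in rows]


def scan(start, stop):
    """Return rows with start <= key < stop, located by binary search."""
    lo = bisect_left(keys, start)
    hi = bisect_left(keys, stop)
    return rows[lo:hi]


def find_by_city(city):
    """No ordering or index on values, so every row is inspected."""
    return [k for k, v in rows if v["city"] == city]


print([k for k, _ in scan("user#", "user#~")])
print(find_by_city("Pune"))
```

In this model the range scan touches only the matching slice of rows, whereas the lookup by city walks the whole table, which is why row key design matters so much in a key-sorted store.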
Comparison between HBase and HDFS:
- HBase provides low-latency access, while HDFS provides high-latency operations.
- HBase supports random reads and writes, while HDFS follows a write-once, read-many-times model.
- HBase is accessed through shell commands, the Java API, REST, Avro or the Thrift API, while HDFS is typically accessed through MapReduce jobs.
Note – HBase is extensively used for online analytical operations. For example, in banking applications it can be used for real-time data updates at ATMs.