Difference Between Hadoop and HBase

Last Updated : 24 Sep, 2021

Hadoop: Hadoop is an open-source framework from Apache used to store and process large datasets distributed across a cluster of servers. Its four main components are the Hadoop Distributed File System (HDFS), YARN, MapReduce, and the Hadoop Common libraries. It handles not only large volumes of data but also a mixture of structured, semi-structured, and unstructured information. Amazon, IBM, Microsoft, Cloudera, ScienceSoft, Pivotal, and Hortonworks are some of the companies that use Hadoop technology.
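
To give a feel for the MapReduce component mentioned above, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to a word count. It is only an illustration in plain Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample chunks are assumptions made for this example, and a real cluster would run these steps in parallel over chunks stored in HDFS.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined pairs."""
    pairs = []
    for chunk in chunks:
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


# Hypothetical input: each string stands in for one chunk of a larger file.
chunks = ["big data big cluster", "data node data"]
counts = word_count(chunks)
print(counts)
```

Each chunk is mapped independently, which is the property that lets a framework like Hadoop spread the work across many servers.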

HBase: HBase is an open-source database from Apache that runs on a Hadoop cluster. It is a non-relational (NoSQL) database management system. Its three important components are the HMaster, the RegionServer, and ZooKeeper. Capital One, JPMorgan Chase, Apple, MTB, AT&T, and Lockheed Martin are some of the companies that use HBase.

Below is a table of differences between Hadoop and HBase: 

| S.No. | Hadoop | HBase |
|-------|--------|-------|
| 1 | Hadoop is a collection of software tools | HBase is a part of the Hadoop ecosystem |
| 2 | Stores data sets in a distributed environment | Stores data in a column-oriented manner |
| 3 | Hadoop is a framework | HBase is a NoSQL database |
| 4 | Data is stored in the form of chunks (blocks) | Data is stored in the form of key/value pairs |
| 5 | Does not allow run-time changes to stored data | Allows run-time changes to stored data |
| 6 | A file can be written only once and read many times | Data can be read and written multiple times |
| 7 | Geared toward high-latency batch operations | Provides low-latency random reads and writes |
| 8 | HDFS can be accessed through MapReduce jobs | HBase can be accessed through shell commands, the Java API, and REST |
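
The storage differences in the table (chunks written once versus key/value cells that can be rewritten) can be sketched with a toy model. This is a simplified illustration in plain Python, not the HDFS or HBase client API: the classes `WriteOnceFile` and `ToyColumnStore`, the 4-byte chunk size, and the `info:city` column name are all assumptions made for the example. Real HBase also keeps timestamped versions of each cell, which this sketch leaves out.

```python
class WriteOnceFile:
    """Toy stand-in for an HDFS file: split into chunks at creation, then read-only."""

    def __init__(self, data: bytes, chunk_size: int = 4):
        # The content is fixed when the file is created; no update method is offered.
        self._chunks = tuple(
            data[i:i + chunk_size] for i in range(0, len(data), chunk_size)
        )

    def chunk_count(self) -> int:
        return len(self._chunks)

    def read(self) -> bytes:
        # Reading can be repeated any number of times.
        return b"".join(self._chunks)


class ToyColumnStore:
    """Toy stand-in for an HBase table: cells addressed by (row key, column)."""

    def __init__(self):
        self._cells = {}

    def put(self, row: str, column: str, value: str) -> None:
        # Writing to an existing cell replaces its value at run time.
        self._cells[(row, column)] = value

    def get(self, row: str, column: str):
        # Random read of a single cell; returns None if the cell does not exist.
        return self._cells.get((row, column))


hdfs_like = WriteOnceFile(b"hello hadoop")
print(hdfs_like.chunk_count(), hdfs_like.read())

store = ToyColumnStore()
store.put("user1", "info:city", "Pune")
store.put("user1", "info:city", "Delhi")  # second write to the same cell
print(store.get("user1", "info:city"))
```

The file object exposes only reads after creation, while the store accepts repeated writes to the same key, which mirrors rows 4 to 6 of the table.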
