Difference Between Hadoop and HBase

Hadoop: Hadoop is an open source framework from Apache that is used to store and process large datasets distributed across a cluster of servers. The four main components of Hadoop are the Hadoop Distributed File System (HDFS), YARN, MapReduce, and Hadoop Common (the shared libraries). It handles not only large volumes of data but also a mixture of structured, semi-structured, and unstructured information. Amazon, IBM, Microsoft, Cloudera, ScienceSoft, Pivotal, and Hortonworks are some of the companies using Hadoop technology.
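To make the MapReduce component more concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps for a word count. It is only an illustration in plain Python and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on a small list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input; on a real cluster the lines would come from HDFS.
counts = word_count(["Hadoop stores data", "HBase stores rows"])
```

On a real cluster the map and reduce steps would run in parallel on different nodes; here everything runs in one process purely to show the data flow.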

HBase: HBase is an open source database from Apache that runs on top of a Hadoop cluster. It is a non-relational (NoSQL) database management system. Its three important components are the HMaster, the Region Servers, and ZooKeeper. Capital One, JPMorgan Chase, Apple, MTB, AT&T, and Lockheed Martin are some of the companies using HBase.

Below is a table of differences between Hadoop and HBase: 

| S.No. | Hadoop | HBase |
|---|---|---|
| 1 | Hadoop is a collection of software tools | HBase is a part of the Hadoop ecosystem |
| 2 | Stores data sets in a distributed environment | Stores data in a column-oriented manner |
| 3 | Hadoop is a framework | HBase is a NoSQL database |
| 4 | Data is stored in the form of chunks (blocks) | Data is stored in the form of key/value pairs |
| 5 | Hadoop does not allow run-time changes | HBase allows run-time changes |
| 6 | A file can be written only once and read many times | Data can be read and written multiple times |
| 7 | Hadoop is built for batch processing and has high-latency operations | HBase supports low-latency random reads and writes |
| 8 | HDFS data can be processed through MapReduce | HBase can be accessed through shell commands, the Java API, and REST |
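Rows 4 and 6 of the table can be illustrated with a small sketch that contrasts a file split into chunks and written once with a key/value table whose cells can be overwritten. Both classes are toy stand-ins written for this example, not the HDFS or HBase APIs; the names WriteOnceFile and KeyValueTable, the 4-byte block size, and the sample row key and column are all assumptions.

```python
class WriteOnceFile:
    """Toy HDFS-style file: split into fixed-size chunks, written once, read many times."""

    def __init__(self, data: bytes, block_size: int = 4):
        # A tuple is immutable, so the chunks cannot be changed after creation.
        self._blocks = tuple(
            data[i:i + block_size] for i in range(0, len(data), block_size)
        )

    def block_count(self) -> int:
        return len(self._blocks)

    def read(self) -> bytes:
        return b"".join(self._blocks)


class KeyValueTable:
    """Toy HBase-style table: row key -> column -> value, with overwritable cells."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        return self._rows.get(row_key, {}).get(column)


# The file is chunked at creation and can only be read afterwards.
f = WriteOnceFile(b"hello world")

# The same cell can be written again; the later value replaces the earlier one.
table = KeyValueTable()
table.put("user1", "info:city", "Pune")
table.put("user1", "info:city", "Delhi")
```

The sketch leaves out replication, regions, and cell versioning; it only shows the difference in how data is addressed and updated.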

 

Last Updated : 24 Sep, 2021