Difference Between Hadoop and HBase

  • Last Updated : 22 May, 2020

Hadoop: Hadoop is an open-source framework from Apache used to store and process large datasets distributed across a cluster of servers. Its four main components are the Hadoop Distributed File System (HDFS), YARN, MapReduce, and the Hadoop Common libraries. It handles not only large volumes of data but also a mixture of structured, semi-structured, and unstructured data. Amazon, IBM, Microsoft, Cloudera, ScienceSoft, Pivotal, and Hortonworks are some of the companies using Hadoop technology.
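To make the MapReduce component more concrete, the following is a minimal in-memory sketch of the map, shuffle, and reduce phases applied to a word count. It is plain Python and does not use the Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster, each phase would run in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))


# Illustrative input only; a real job would read its input from HDFS.
counts = word_count(["Hadoop stores data", "HBase stores data on Hadoop"])
print(counts)
```

The sketch runs everything in one process, so it shows only the shape of the computation, not the distribution or fault tolerance that Hadoop provides.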

HBase: HBase is an open-source database from Apache that runs on top of a Hadoop cluster. It is a non-relational (NoSQL) database management system. Its three important components are the HMaster, Region Servers, and ZooKeeper. Capital One, JPMorgan Chase, Apple, MTB, AT&T, and Lockheed Martin are some of the companies using HBase.

Below is a table of differences between Hadoop and HBase:

| S.No. | Hadoop | HBase |
|---|---|---|
| 1 | Hadoop is a collection of software tools | HBase is a part of the Hadoop ecosystem |
| 2 | Stores data sets in a distributed environment | Stores data in a column-oriented manner |
| 3 | Hadoop is a framework | HBase is a NoSQL database |
| 4 | Data is stored in the form of chunks (blocks) | Data is stored in the form of key/value pairs |
| 5 | Hadoop does not allow run-time changes | HBase allows run-time changes |
| 6 | Files are written once and read many times | Data can be read and written multiple times |
| 7 | Hadoop has high-latency operations (batch processing) | HBase has low-latency operations (random reads and writes) |
| 8 | HDFS can be accessed through MapReduce | HBase can be accessed through shell commands, the Java API, and REST |
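The storage differences in the comparison above (chunks versus key/value pairs, write-once versus repeated writes) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `WriteOnceFile` and `ToyTable` are hypothetical classes written for this example, not part of the HDFS or HBase client APIs, and the 4-byte chunk size is chosen purely for readability.

```python
from collections import defaultdict


class WriteOnceFile:
    """Toy stand-in for an HDFS-style file: written once, stored as fixed-size chunks."""

    def __init__(self, data: bytes, chunk_size: int = 4):
        # The content is split into chunks at creation time and never modified afterwards.
        self._chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    @property
    def chunks(self):
        # Return a copy so callers cannot alter the stored chunks.
        return list(self._chunks)

    def read(self) -> bytes:
        # Reading is allowed any number of times.
        return b"".join(self._chunks)


class ToyTable:
    """Toy stand-in for an HBase-style table: values addressed by row key and column."""

    def __init__(self):
        self._cells = defaultdict(dict)

    def put(self, row_key, column, value):
        # Writing the same cell again simply replaces the visible value.
        self._cells[row_key][column] = value

    def get(self, row_key, column):
        # Random read of a single cell; returns None if the cell is absent.
        return self._cells.get(row_key, {}).get(column)


# Illustrative data only.
f = WriteOnceFile(b"hello hadoop")
table = ToyTable()
table.put("user1", "info:city", "Pune")
table.put("user1", "info:city", "Delhi")  # update the same cell in place
print(f.chunks, table.get("user1", "info:city"))
```

The model leaves out replication, regions, and cell versioning; it is meant only to show why a file store favors large sequential reads while a key/value table favors small random reads and updates.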
