Difference between Hive and HBase

  • Last Updated : 19 Dec, 2018

Hive is a data warehousing package built on top of Hadoop and is mainly used for data analysis. It is targeted at users who are already comfortable with Structured Query Language (SQL): its query language, Hive Query Language (HQL), is very similar to SQL. Hive manages and queries structured data and abstracts away the complexity of Hadoop. It was developed by Facebook in 2007 to handle massive amounts of data. Hive has the following limitations:

  • It is not a full database.
  • It is not a real-time processing system.
  • It is not SQL-92 compliant.
  • It does not provide row-level inserts, updates, or deletes.
  • It does not support transactions, and its sub-query support is limited.
  • Its query optimization is still evolving.
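
To make the SQL-like, analysis-oriented style concrete, here is a minimal sketch of the kind of aggregate query an analyst might write. It does not talk to Hive: Python's built-in sqlite3 module is used purely as a stand-in so the example runs anywhere, and the table name page_views, its columns, and the sample rows are all hypothetical. A query of this shape (SELECT with GROUP BY and ORDER BY) should be expressible in HQL in much the same form, but treat that as an assumption to check against the Hive version in use.

```python
import sqlite3

# Hypothetical sample data: (user_id, page) page-view events.
EVENTS = [
    (1, "home"), (2, "home"), (1, "cart"),
    (3, "home"), (2, "cart"), (3, "checkout"),
]

# An analytical, SQL-style aggregate over the whole table.
# Table and column names are invented for this illustration.
QUERY = """
    SELECT page, COUNT(*) AS views
    FROM page_views
    GROUP BY page
    ORDER BY views DESC, page ASC
"""


def run_report(events):
    """Load events into an in-memory table and run the aggregate query."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not Hive
    try:
        conn.execute("CREATE TABLE page_views (user_id INTEGER, page TEXT)")
        conn.executemany("INSERT INTO page_views VALUES (?, ?)", events)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for page, views in run_report(EVENTS):
        print(page, views)
```

The query reads every row to produce a summary, which is the batch, analytical access pattern described above rather than a lookup of one record.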

HBase is a column-oriented database management system that runs on top of the Hadoop Distributed File System (HDFS). It is well suited to sparse data sets, which are common in many big data use cases. It is an open-source, distributed database developed under the Apache Software Foundation, modeled after Google's Bigtable, and written primarily in Java. It stores data in tables and can hold massive amounts of it, from terabytes to petabytes. It is built for low-latency operations and is used extensively for read and write operations.
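
The idea of a sparse, column-oriented table can be sketched with a small in-memory model. This is only a toy under stated assumptions, not HBase's client API or storage engine: the class SparseTable, its methods, and the sample rows are hypothetical names chosen for illustration. It shows rows addressed by a row key, cells named "family:qualifier", and the fact that absent cells simply take no space.

```python
from collections import defaultdict


class SparseTable:
    """Toy column-oriented table: row key -> {"family:qualifier": value}."""

    def __init__(self, column_families):
        # Column families are fixed up front; qualifiers are free-form.
        self.column_families = set(column_families)
        self.rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Write one cell; only cells that are written are stored."""
        family, _, qualifier = column.partition(":")
        if family not in self.column_families or not qualifier:
            raise KeyError(f"unknown family or missing qualifier: {column!r}")
        self.rows[row_key][column] = value

    def get(self, row_key, column=None):
        """Read a whole row (as a dict) or a single cell (None if absent)."""
        row = self.rows.get(row_key, {})
        return dict(row) if column is None else row.get(column)

    def stored_cells(self):
        """Count cells actually stored across all rows."""
        return sum(len(cells) for cells in self.rows.values())


def build_example():
    """Build a tiny table where rows have different sets of columns."""
    table = SparseTable(["info", "contact"])
    table.put("user1", "info:name", "Asha")
    table.put("user1", "contact:email", "asha@example.com")
    table.put("user2", "info:name", "Ravi")  # no contact data for user2
    return table


if __name__ == "__main__":
    t = build_example()
    print(t.get("user1"))
    print(t.get("user2", "contact:email"))
    print(t.stored_cells())
```

Here user2 has no contact:email cell, and nothing is stored for it, which is why this layout suits sparse data.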


Difference between Hive and HBase:

| Hive | HBase |
|---|---|
| A query engine | A data store, particularly for unstructured data |
| Mainly used for batch processing | Used extensively for transactional processing |
| Not a real-time processing system | Supports real-time processing |
| Only for analytical queries | Supports real-time querying |
| Runs on top of Hadoop | Runs on top of HDFS (Hadoop Distributed File System) |
| Not a database | A NoSQL database |
| Has a schema model | Free from a schema model |
| Made for high-latency operations | Made for low-latency operations |
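
The latency contrast in the comparison above comes down to access pattern, which the following sketch tries to illustrate. It is a simplified model, not how either system is implemented: batch_count stands for an analytical job that examines every row, while point_lookup stands for a keyed read over rows kept sorted by row key (HBase does keep rows ordered by row key, but its real storage involves far more than a binary search). The function names and the generated sample data are hypothetical.

```python
def batch_count(rows, predicate):
    """Batch-style analysis: examine every row.

    Returns (number_of_matches, rows_examined).
    """
    examined = 0
    matches = 0
    for _key, value in rows:
        examined += 1
        if predicate(value):
            matches += 1
    return matches, examined


def point_lookup(sorted_rows, row_key):
    """Keyed read over rows sorted by key, using binary search.

    Returns (value_or_None, probes), where probes counts comparisons made.
    """
    lo, hi, probes = 0, len(sorted_rows), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_rows[mid][0] < row_key:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_rows) and sorted_rows[lo][0] == row_key:
        return sorted_rows[lo][1], probes
    return None, probes


def make_rows(n):
    """Generate n (row_key, value) pairs, already sorted by zero-padded key."""
    return [(f"row{i:05d}", i) for i in range(n)]


if __name__ == "__main__":
    rows = make_rows(1000)
    print(batch_count(rows, lambda v: v % 2 == 0))
    print(point_lookup(rows, "row00042"))
```

With 1000 rows, the scan touches all 1000 to answer its question, while the keyed read needs only about ten probes, which is the intuition behind "high latency" analytics versus "low latency" lookups.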
