Prerequisite – Introduction to Hadoop
HBase is a distributed database whose data model is similar to Google’s Bigtable. It is open source, written in Java, and developed by the Apache Software Foundation. HBase is an essential part of the Hadoop ecosystem and runs on top of HDFS (Hadoop Distributed File System). It can store massive amounts of data, from terabytes to petabytes. It is column oriented and horizontally scalable.
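To make the Bigtable-style data model more concrete, it can be pictured as a sorted map from row key to column to timestamped values. The following is a minimal in-memory sketch in plain Python, not HBase client code. The class `ToyTable`, its methods, and the sample row keys and column names are hypothetical and exist only for illustration.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """Toy model of a Bigtable-style table: row key -> column -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))
        self._sorted_keys = []  # row keys are kept in sorted order

    def put(self, row, column, value, ts):
        """Store one version of a cell under the given timestamp."""
        if row not in self._rows:
            bisect.insort(self._sorted_keys, row)
        self._rows[row][column][ts] = value

    def get(self, row, column):
        """Return the newest version of a cell, or None if it is absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start, stop):
        """Return the row keys in [start, stop), in sorted order."""
        lo = bisect.bisect_left(self._sorted_keys, start)
        hi = bisect.bisect_left(self._sorted_keys, stop)
        return self._sorted_keys[lo:hi]


table = ToyTable()
table.put("user2", "info:name", "Bob", ts=1)
table.put("user1", "info:name", "Alice", ts=1)
table.put("user1", "info:name", "Alicia", ts=2)  # a newer version of the same cell
print(table.get("user1", "info:name"))
print(table.scan("user1", "user3"))
```

Keeping row keys sorted is what makes both random access by key and range scans cheap in this sketch. A real cluster additionally splits the key range across nodes, which the sketch leaves out.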
Features of HBase –
- It scales linearly and modularly, because tables are divided across multiple nodes.
- HBase provides consistent reads and writes.
- It provides atomic reads and writes at the row level: a write to a single row is applied as a whole, so a concurrent reader never sees a partially updated row.
- It provides an easy-to-use Java API for client access.
- It supports Thrift and REST APIs for non-Java front ends, with XML, Protobuf and binary data encoding options.
- It supports a Block Cache and Bloom Filters for real-time queries and for high volume query optimization.
- HBase provides automatic failover between Region Servers.
- It supports exporting metrics to files via the Hadoop metrics subsystem.
- It doesn’t enforce relationships within your data.
- It is a platform for storing and retrieving data with random access.
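The Bloom filter feature listed above can be illustrated with a toy example. A Bloom filter answers "might this key be present?" with no false negatives, so a read can skip any storage file whose filter says the row key is definitely absent. The sketch below is a simplified stand-in written in plain Python, not HBase's own implementation. The names `BloomFilter`, `files_to_read`, the bit-array size, and the hash scheme are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def files_to_read(row_key, files):
    """Return the names of files whose filter cannot rule out row_key."""
    return [name for name, bloom in files if bloom.might_contain(row_key)]


file_a = BloomFilter()
file_a.add("row-1")
file_b = BloomFilter()  # holds no keys, so it rules out every lookup
print(files_to_read("row-1", [("file_a", file_a), ("file_b", file_b)]))
```

The trade-off is a small amount of memory per file in exchange for avoiding disk reads that would find nothing, which matters most for random single-row lookups.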
Facebook Messenger Platform was using Apache Cassandra, but it moved to HBase in November 2010. Facebook was building a scalable and robust infrastructure to combine a set of services such as messages, email, chat and SMS into a single real-time conversation, and HBase was well suited to that workload.
RDBMS Vs HBase –
- RDBMS is mostly Row Oriented whereas HBase is Column Oriented.
- RDBMS has a fixed schema, whereas in HBase columns can be added at run time.
- RDBMS is good for structured data whereas HBase is good for semi-structured data.
- RDBMS is optimized for joins, whereas HBase is not.
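The schema difference in the list above can be sketched as follows. In HBase, column families are declared when the table is created, while the column qualifiers inside a family can be introduced by any write. This is a plain-Python illustration under that assumption, not RDBMS or HBase client code. The classes `FixedSchemaTable` and `FamilyTable` and the sample column names are hypothetical.

```python
class FixedSchemaTable:
    """RDBMS-like table: every row must supply exactly the declared columns."""

    def __init__(self, columns):
        self.columns = tuple(columns)
        self.rows = []

    def insert(self, **values):
        if set(values) != set(self.columns):
            raise ValueError(f"row must have exactly the columns {self.columns}")
        self.rows.append(values)


class FamilyTable:
    """HBase-like table: families are fixed up front, qualifiers are free-form."""

    def __init__(self, families):
        self.families = set(families)
        self.rows = {}

    def put(self, row_key, family, qualifier, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Each row stores only the columns it actually has (sparse storage).
        self.rows.setdefault(row_key, {})[f"{family}:{qualifier}"] = value


users = FamilyTable(families=["info"])
users.put("user1", "info", "name", "Alice")
users.put("user2", "info", "name", "Bob")
users.put("user2", "info", "nickname", "Bobby")  # new column, no schema change
print(users.rows)
```

In the fixed-schema table, adding a `nickname` column would mean changing the table definition for every row. In the family-based table only `user2` carries it, and `user1` stores nothing for that column.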