HDFS: The Hadoop Distributed File System (HDFS) is a distributed file system designed to store data across multiple machines that are connected as nodes in a cluster, and to keep that data reliable by replicating it. Each cluster is accessed through a single NameNode, a software component that typically runs on a dedicated machine and manages that cluster's file system and user access mechanism.
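To make the NameNode's role more concrete, here is a minimal Python sketch of the kind of metadata it keeps: which blocks make up a file and which DataNodes hold each replica. It is a toy model, not the HDFS API. The class `ToyNameNode`, the round-robin placement, and the tiny block size are assumptions chosen for illustration; real HDFS uses rack-aware placement and defaults to 128 MB blocks with a replication factor of 3.

```python
from itertools import cycle


class ToyNameNode:
    """Toy stand-in for a NameNode: tracks metadata only, never file contents."""

    def __init__(self, datanodes, block_size=4, replication=3):
        if replication > len(datanodes):
            raise ValueError("replication factor exceeds number of DataNodes")
        self.datanodes = list(datanodes)
        self.block_size = block_size      # bytes per block (tiny, for the demo)
        self.replication = replication    # copies kept of each block
        self.block_map = {}               # path -> list of (block_id, [datanodes])
        self._next_node = cycle(range(len(self.datanodes)))

    def add_file(self, path, size):
        """Split a file of `size` bytes into blocks and assign replicas."""
        n_blocks = -(-size // self.block_size)  # ceiling division
        blocks = []
        for i in range(n_blocks):
            start = next(self._next_node)
            # Hypothetical round-robin placement; HDFS itself is rack-aware.
            holders = [
                self.datanodes[(start + k) % len(self.datanodes)]
                for k in range(self.replication)
            ]
            blocks.append((f"{path}#blk{i}", holders))
        self.block_map[path] = blocks
        return blocks

    def readable_after_failure(self, path, failed):
        """A file stays readable if every block has a replica on a live node."""
        return all(
            any(node not in failed for node in holders)
            for _, holders in self.block_map[path]
        )


namenode = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
layout = namenode.add_file("/logs/app.log", size=10)
```

In this model each block has three distinct holders, so the file stays readable after any two DataNodes fail, which is the reliability property described above.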
HBase: HBase is a top-level Apache project written in Java that fulfills the need to read and write data in real time. It provides a simple interface to distributed data. It can be accessed through Apache Hive, Apache Pig, and MapReduce, and it stores its data in HDFS.
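The interface mentioned above is essentially a put/get keyed by row, column, and timestamp, with rows kept in sorted key order. The following Python sketch models that idea for a single in-memory table. `ToyTable` and its method names are hypothetical and are not the HBase client API; it only illustrates the shape of the data model.

```python
import bisect


class ToyTable:
    """Toy model of an HBase-style table: row key -> column -> versions."""

    def __init__(self):
        self._row_keys = []   # kept sorted, like HBase row keys
        self._rows = {}       # row key -> {"family:qualifier": [(ts, value), ...]}

    def put(self, row_key, column, value, ts):
        """Store one versioned cell value under (row_key, column)."""
        if row_key not in self._rows:
            bisect.insort(self._row_keys, row_key)
            self._rows[row_key] = {}
        versions = self._rows[row_key].setdefault(column, [])
        versions.append((ts, value))
        versions.sort(key=lambda v: v[0])  # oldest first, newest last

    def get(self, row_key, column):
        """Return the newest value for a cell, or None if it is absent."""
        versions = self._rows.get(row_key, {}).get(column)
        return versions[-1][1] if versions else None

    def scan(self, start_row, stop_row):
        """Return row keys in [start_row, stop_row) in sorted order."""
        lo = bisect.bisect_left(self._row_keys, start_row)
        hi = bisect.bisect_left(self._row_keys, stop_row)
        return self._row_keys[lo:hi]


table = ToyTable()
table.put("user#002", "info:email", "b@example.com", ts=1)
table.put("user#001", "info:email", "a@example.com", ts=1)
table.put("user#001", "info:email", "a.new@example.com", ts=2)
latest = table.get("user#001", "info:email")
in_range = table.scan("user#001", "user#003")
```

Here `latest` is the value written with the higher timestamp, and `in_range` lists the row keys in sorted order regardless of the order in which they were written.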
Below is a table of differences between HDFS and HBase:
|HDFS|HBase|
|---|---|
|HDFS is a Java-based distributed file system|HBase is the Hadoop database that runs on top of HDFS|
|HDFS is highly fault-tolerant and cost-effective|HBase is partially fault-tolerant and highly consistent|
|HDFS provides only sequential read/write operations|HBase allows random access because data is indexed by row key|
|HDFS follows a write-once, read-many model|HBase supports random read and write operations on data stored in the file system|
|HDFS has a rigid architecture|HBase supports dynamic changes|
|HDFS is preferable for offline batch processing|HBase is preferable for real-time processing|
|HDFS has high latency for access operations|HBase provides low-latency access to small amounts of data|
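The access-pattern rows in the table can be illustrated with a small experiment. The Python sketch below compares a sequential scan over records in write order (the HDFS-style pattern) with a binary search over records sorted by key (a simplified stand-in for keyed, indexed storage such as HBase). It counts records examined rather than measuring time, and the function names and data are hypothetical.

```python
def sequential_lookup(records, key):
    """Scan records in write order; return (value, records_examined)."""
    for examined, (k, v) in enumerate(records, start=1):
        if k == key:
            return v, examined
    return None, len(records)


def indexed_lookup(sorted_records, key):
    """Binary search over records sorted by key; return (value, probes)."""
    lo, hi, probes = 0, len(sorted_records), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        k, v = sorted_records[mid]
        if k == key:
            return v, probes
        if k < key:
            lo = mid + 1
        else:
            hi = mid
    return None, probes


# Zero-padded keys, so write order and sorted order coincide in this demo.
records = [(f"row{i:05d}", i) for i in range(10_000)]
seq_value, seq_cost = sequential_lookup(records, "row09999")
idx_value, idx_cost = indexed_lookup(records, "row09999")
```

The scan examines all 10,000 records to reach the last key, while the indexed lookup needs at most 14 probes. This is the intuition behind preferring HDFS for batch jobs that read whole files and HBase for low-latency access to individual rows.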