Difference Between HDFS and HBase

  • Difficulty Level : Hard
  • Last Updated : 17 May, 2020

HDFS: The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on multiple machines that are connected to each other as nodes, storing data across them and providing data reliability. An HDFS deployment consists of clusters, each of which is accessed through a single NameNode, a software component installed on a separate machine that monitors and manages that cluster's file system and user access.
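To make the NameNode's role concrete, here is a minimal sketch in plain Python. It is a toy model, not the Hadoop API: the class `ToyNameNode`, the node names, the tiny block size, and the round-robin placement are all assumptions made for illustration (real HDFS uses much larger blocks and rack-aware placement). It shows the NameNode holding only metadata, with each file split into blocks and each block replicated on several DataNodes, which is where the reliability comes from.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 128  # toy block size in bytes (real HDFS blocks are far larger)
REPLICATION = 3   # copies kept of each block (assumed, a common HDFS default)


@dataclass
class ToyNameNode:
    """Holds metadata only: which blocks make up a file and where they live."""
    datanodes: list
    block_map: dict = field(default_factory=dict)  # path -> [(block_id, [nodes])]
    _next_block: int = 0

    def add_file(self, path: str, size: int) -> None:
        n_blocks = max(1, -(-size // BLOCK_SIZE))  # ceiling division
        blocks = []
        for _ in range(n_blocks):
            # Round-robin placement, purely illustrative.
            nodes = [self.datanodes[(self._next_block + r) % len(self.datanodes)]
                     for r in range(REPLICATION)]
            blocks.append((self._next_block, nodes))
            self._next_block += 1
        self.block_map[path] = blocks

    def locate(self, path: str, lost: frozenset = frozenset()) -> list:
        """Return, per block, the replicas that are still reachable."""
        return [(bid, [n for n in nodes if n not in lost])
                for bid, nodes in self.block_map[path]]


namenode = ToyNameNode(datanodes=["dn1", "dn2", "dn3", "dn4"])
namenode.add_file("/logs/app.log", size=300)  # 300 bytes -> 3 blocks
survivors = namenode.locate("/logs/app.log", lost=frozenset({"dn2"}))
```

Even with `dn2` marked as lost, every block in `survivors` still has at least two reachable replicas, so the file remains readable.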

HBase: HBase is a top-level Apache project, written in Java, that fulfills the need to read and write data in real time. It provides a simple interface to distributed data, stores its information in HDFS, and can be accessed from Apache Hive, Apache Pig, and MapReduce.
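The kind of interface HBase exposes can be sketched with a small in-memory model. This is not the HBase client API: `ToyHBaseTable`, its methods, and the sample row keys are hypothetical, and the sketch leaves out cell versioning, regions, and persistence to HDFS. It only illustrates the general shape of the data model: rows kept sorted by row key, cells addressed as `family:qualifier`, and individual cells that can be read or overwritten at any time.

```python
import bisect


class ToyHBaseTable:
    """Rows sorted by key; each row maps 'family:qualifier' to a value."""

    def __init__(self, families):
        self.families = set(families)  # column families are declared up front
        self._keys = []                # row keys kept in sorted order
        self._rows = {}                # row key -> {column: value}

    def put(self, row_key, column, value):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value  # overwrite the cell in place

    def get(self, row_key):
        """Random read: fetch one row directly by its key."""
        return dict(self._rows.get(row_key, {}))

    def scan(self, start, stop):
        """Range read over sorted row keys, start inclusive, stop exclusive."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, dict(self._rows[k])) for k in self._keys[lo:hi]]


users = ToyHBaseTable(families=["info"])
users.put("user#002", "info:name", "Bea")
users.put("user#001", "info:name", "Ada")
users.put("user#001", "info:email", "ada@example.com")  # new qualifier, no schema change
users.put("user#001", "info:name", "Ada L.")            # update a single cell
```

Here `users.get("user#001")` returns the updated name together with the email, and `users.scan("user#001", "user#002")` returns only the first row because the stop key is exclusive.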

Below is a table of differences between HDFS and HBase:

| HDFS | HBase |
|---|---|
| HDFS is a Java-based distributed file system. | HBase is the Hadoop database that runs on top of HDFS. |
| HDFS is highly fault-tolerant and cost-effective. | HBase is partition-tolerant and highly consistent. |
| HDFS is optimized for sequential read and write operations. | HBase allows random access to individual rows by row key, using indexed storage. |
| HDFS is based on a write-once, read-many-times model. | HBase supports random read and write operations on data stored in the file system. |
| HDFS has a rigid architecture. | HBase supports dynamic changes. |
| HDFS is preferable for offline batch processing. | HBase is preferable for real-time processing. |
| HDFS has high latency for access operations. | HBase provides low-latency access to small amounts of data. |
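The latency difference between sequential and random access can be illustrated with a short plain-Python sketch. It does not use HDFS or HBase; the record layout and the function names `sequential_lookup` and `indexed_lookup` are assumptions for illustration. It counts how many records each approach has to examine to find a single key.

```python
import bisect

# 10,000 toy records, already sorted by key.
records = [(f"key{i:05d}", f"value{i}") for i in range(10_000)]


def sequential_lookup(target):
    """Read records in order until the key is found (batch-style access)."""
    examined = 0
    for key, value in records:
        examined += 1
        if key == target:
            return value, examined
    return None, examined


keys = [k for k, _ in records]  # a sorted key index kept alongside the data


def indexed_lookup(target):
    """Binary-search the key index, then read exactly one record."""
    pos = bisect.bisect_left(keys, target)
    if pos < len(keys) and keys[pos] == target:
        return records[pos][1], 1
    return None, 0


seq_value, seq_examined = sequential_lookup("key09999")
idx_value, idx_examined = indexed_lookup("key09999")
```

Both lookups return the same value, but the sequential version examines every record to reach the last key while the indexed version reads one. That is acceptable for batch jobs that process the whole dataset anyway, and costly for real-time lookups of a few rows.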