Skip to content
Related Articles
Open in App
Not now

Related Articles

Difference Between Hadoop and Cassandra

Improve Article
Save Article
  • Last Updated : 07 Jul, 2022
Improve Article
Save Article

Hadoop is an open-source software programming framework. The framework of Hadoop is based on Java Programming Language with some native code in shell script and C

This framework is used to manage, store and process the data & computation for the different applications of big data running under clustered systems. The main components of Hadoop are HDFS, MapReduce and YARN. 

Cassandra is an open-source distributed data management system with wide column store and NoSQL database. In this NoSQL database provides the capability to handle a very large amount of data across many commodity hardware with no single point of failure and high availability. The code is written in Java and developed by the Apache Software Foundation.

Difference Between Hadoop and Cassandra

S.NO.HADOOPCASSANDRA
1Hadoop is a scalable framework that is designed to be deployed on low-cost hardware.It is deployed in a very distributed fashion as a cluster of instances that are all aware of each other.
2Hadoop is a big data processing framework based on the famous MapReduce programming model.Cassandra is mainly used for real-time data processing.
3Hadoop supports a variety of formats.Cassandra does not support images.
4Hadoop follows a master slave architecture.Cassandra follows a peer-to-peer architecture
5Hadoop is deployed in a single data center.Cassandra is deployed in a very distributed fashion.
6It used map reduce to read/write.This uses Cassandra query language.
7In hadoop, data is directly written to data node.While in Cassandra, data is first written to mem-table and then it is written to disk.
8Hadoop has a fixed replication factor of 3.Replication factor in Cassandra depend on the number of nodes.
9It has high latency rate.It has less latency rate.
10Hadoop uses TCP and UDP for communication.In Cassandra, gossip protocol is used for communication.
11It is for data batch processing.It is for real-time processing.
12It is difficult to create multiple indexes in hadoop.Cassandra stores data as Key-value pair thus makes it easy to create multiple indexes.
My Personal Notes arrow_drop_up
Related Articles

Start Your Coding Journey Now!