Features of Cassandra

  • Difficulty Level : Basic
  • Last Updated : 22 Jun, 2021

Apache Cassandra is a free, open-source, distributed NoSQL DBMS designed to handle large amounts of data across many servers. It has no single point of failure and offers robust support for clusters spanning multiple data centers.

Some of Cassandra's key features are described below:

  1. Distributed: 
    Every node in the cluster has the same role, and the data set is distributed across the cluster. Because there is no master node, there is no single point of failure, and any node can service any request. 
  2. Supports replication & multi data center replication: 
    Replication strategies are configurable, including the replication factor. Cassandra is designed as a distributed system for deploying large numbers of nodes across multiple data centers. 
  3. Scalability: 
    Read and write throughput both increase as new machines are added, with no downtime or interruption to applications. 
  4. Fault-tolerance: 
    Data is automatically replicated to multiple nodes for fault tolerance. If a node fails, it can be replaced with no downtime. 
  5. MapReduce support: 
    Cassandra integrates with Hadoop and supports MapReduce. Apache Hive and Apache Pig are also supported. 
  6. Query language: 
    Cassandra introduced CQL (Cassandra Query Language), a simple interface for accessing Cassandra. 
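
As a rough illustration of multi data center replication, the sketch below creates a keyspace with NetworkTopologyStrategy, which takes a replication factor per data center. The keyspace name and the data center names dc1 and dc2 are placeholders chosen for this example; in a real cluster they have to match the data center names the cluster reports.

```cql
-- Hypothetical keyspace: 3 replicas in dc1 and 2 replicas in dc2.
-- 'dc1' and 'dc2' are assumed names, not values from this article.
CREATE KEYSPACE MultiDcKeySpace
WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
                     'dc1' : 3,
                     'dc2' : 2 };
```

SimpleStrategy, by contrast, takes a single replication_factor and does not take data center layout into account.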

Cassandra Query Language (CQL): 
CQL is a simple interface for accessing Cassandra and an alternative to traditional SQL. It adds an abstraction layer that hides the implementation details of the underlying storage structure, and it provides native syntax for collections. 
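
The native collection syntax might look like the following sketch. The table user_contacts and its columns are hypothetical names used only for illustration.

```cql
-- Hypothetical table with a set column and a map column.
CREATE TABLE user_contacts (
    id text PRIMARY KEY,
    emails set<text>,
    phones map<text, text>
);

-- Collection literals: { ... } for sets, { key : value } for maps.
INSERT INTO user_contacts (id, emails, phones)
VALUES ('1', {'john@example.com'}, {'home' : '555-0100'});

-- Add one element to the set and one entry to the map in place.
UPDATE user_contacts SET emails = emails + {'jdoe@example.com'} WHERE id = '1';
UPDATE user_contacts SET phones['work'] = '555-0101' WHERE id = '1';
```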

For example, the following sample shows how to create a keyspace containing a column family in CQL 3.0: 

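-- Create a keyspace whose data is kept on 3 replicas
CREATE KEYSPACE MyKeySpace
WITH REPLICATION = { 'class' : 'SimpleStrategy', 
                     'replication_factor' : 3 };

USE MyKeySpace;

-- COLUMNFAMILY is accepted as an alias for TABLE in CQL 3
CREATE COLUMNFAMILY MyColumns (id text, Last text, 
                               First text, PRIMARY KEY(id));

-- Insert a single row
INSERT INTO MyColumns (id, Last, First) 
VALUES ('1', 'Doe', 'John'); 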

Query:  

SELECT * FROM MyColumns; 

Which gives:  

id | first | last
----+-------+------
1 | John | Doe 

(1 rows) 

Some facts regarding Cassandra are as follows:  

  • Up to Cassandra 1.0, Cassandra was not row-level consistent: inserts and updates that affected the same row at approximately the same time could affect the non-key columns in inconsistent ways. 
    Cassandra 1.1 solved this with row-level isolation. 
  • Deletion markers, called tombstones, are known to cause performance degradation, which can become severe. 
  • Cassandra is essentially a hybrid between a key-value store and a tabular DBMS. Tables can be created, dropped, and altered at run time without blocking updates and queries. 
  • A column family (called a table in CQL 3) resembles a table in an RDBMS. Each row is uniquely identified by a row key, and each column has a name, a value, and a timestamp. A table in Cassandra is a distributed multi-dimensional map indexed by a key. Furthermore, applications can specify the sort order of columns within a super column family or simple column family. 
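
A few of the facts above can be sketched in CQL against the MyColumns table from the earlier example. The email column is a hypothetical addition, and the gc_grace_seconds value shown is simply the usual default of ten days, not a recommendation.

```cql
-- Alter the schema at run time; the table stays available.
ALTER TABLE MyColumns ADD email text;

-- Each stored value carries a write timestamp (microseconds since the epoch).
SELECT id, first, WRITETIME(first) FROM MyColumns;

-- A delete writes a tombstone rather than removing the data immediately.
DELETE FROM MyColumns WHERE id = '1';

-- Tombstones become eligible for removal during compaction
-- once gc_grace_seconds has elapsed.
ALTER TABLE MyColumns WITH gc_grace_seconds = 864000;
```

Workloads that delete heavily accumulate many tombstones, and reads that have to scan past them are where the performance degradation tends to show up.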
     

 
