Features of Cassandra

Apache Cassandra is a free and open-source, distributed NoSQL database management system designed to handle large amounts of data across many servers. It provides high availability with no single point of failure. Cassandra also offers robust support for clusters spanning multiple data centers.

Some of Cassandra's main features are described below:

  1. Distributed:
    Every node in the cluster has the same role. There is no single point of failure, and the data set is distributed across the cluster. There is no master node, so any node can service any request.

  2. Supports replication & multi-data-center replication:
    Replication strategies, including the replication factor, are configurable. Cassandra is designed as a distributed system for deploying large numbers of nodes across multiple data centers, which supports redundancy, failover and disaster recovery.

  3. Scalability:
    It is designed so that read and write throughput both increase as new machines are added, without downtime or interruption to applications.

  4. Fault-tolerance:
    Data is automatically replicated to multiple nodes for fault tolerance. If a node fails, it can be replaced with no downtime.

  5. MapReduce Support:
    Cassandra offers Hadoop integration with MapReduce support. Apache Hive and Apache Pig are also supported.

  6. Query Language:
    Cassandra introduced the Cassandra Query Language (CQL), a simple interface for accessing Cassandra.
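
As a minimal sketch of the multi-data-center replication described in item 2, a keyspace can be given a separate replication factor per data center with NetworkTopologyStrategy. The keyspace name MultiDcKeySpace and the data center names dc_east and dc_west are hypothetical; in a real cluster the data center names have to match those the cluster itself reports.

```cql
-- Hypothetical keyspace and data center names, for illustration only.
-- Each data center gets its own replication factor:
-- 3 replicas of every row in dc_east, 2 replicas in dc_west.
CREATE KEYSPACE IF NOT EXISTS MultiDcKeySpace
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
                       'dc_east' : 3,
                       'dc_west' : 2 };
```

SimpleStrategy, by contrast, takes a single 'replication_factor' and does not distinguish between data centers.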

Cassandra Query Language (CQL):
CQL is a simple interface for accessing Cassandra and an alternative to traditional SQL. It adds an abstraction layer that hides the implementation details of the underlying structure, and it provides native syntax for collections.

For example, the following sample shows how to create a keyspace and a column family in CQL 3.0:

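-- Create a keyspace whose rows are each stored on 3 replicas.
CREATE KEYSPACE MyKeySpace
  WITH REPLICATION = { 'class' : 'SimpleStrategy',
                       'replication_factor' : 3 };

-- Make it the default keyspace for the statements that follow.
USE MyKeySpace;

-- COLUMNFAMILY is the older name for what CQL 3 calls a table.
CREATE COLUMNFAMILY MyColumns (id text, Last text,
                               First text, PRIMARY KEY(id));

-- Insert one row, then read every row back.
INSERT INTO MyColumns (id, Last, First)
VALUES ('1', 'Doe', 'John');

SELECT * FROM MyColumns;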

Which gives:

id | first | last
1 | John | Doe

(1 rows)
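
The native collection syntax mentioned earlier can be sketched in the same way. The table name Contacts, its columns and the sample values are assumptions made for this illustration; the example also assumes that the MyKeySpace keyspace already exists.

```cql
-- Hypothetical table with one column of each CQL collection type.
CREATE TABLE IF NOT EXISTS MyKeySpace.Contacts (
    id     text PRIMARY KEY,
    emails set<text>,        -- unordered, unique values
    phones map<text, text>,  -- key/value pairs
    notes  list<text>        -- ordered, duplicates allowed
);

-- Collection literals: { } for sets and maps, [ ] for lists.
INSERT INTO MyKeySpace.Contacts (id, emails, phones, notes)
VALUES ('1',
        { 'john@example.com' },
        { 'home' : '555-0100' },
        [ 'first note' ]);

-- Add a set element and a map entry without rewriting the whole row.
UPDATE MyKeySpace.Contacts
SET emails = emails + { 'jdoe@example.com' },
    phones['work'] = '555-0101'
WHERE id = '1';
```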

Some facts regarding Cassandra are as follows:

  • Up to Cassandra 1.0, Cassandra was not row-level consistent: inserts and updates that affected the same row and were processed at approximately the same time could affect the non-key columns in inconsistent ways.
    Cassandra 1.1 solved this by introducing row-level isolation.
  • Deletion markers, called tombstones, are known to cause performance degradation, in some cases severe.
  • Cassandra is essentially a hybrid between a key-value store and a tabular DBMS. Tables can be created, dropped and altered at run time without blocking updates and queries.
  • A column family, called a table since CQL 3, resembles a table in an RDBMS. Each row is uniquely identified by a row key, and each column in a row has a name, a value and a timestamp. A table in Cassandra is a distributed multi-dimensional map indexed by a key. Furthermore, applications can specify the sort order of columns within a super column family.
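
Two of the facts above, run-time schema changes and tombstones, can be illustrated with a short sketch against the MyColumns table from the earlier example. The added column name Email is hypothetical, and the gc_grace_seconds value shown is simply the usual default written out explicitly.

```cql
-- Add a column at run time; the table stays available for reads and writes.
ALTER TABLE MyKeySpace.MyColumns ADD Email text;

-- A delete does not remove the data immediately.
-- It writes a deletion marker (tombstone) for the row.
DELETE FROM MyKeySpace.MyColumns WHERE id = '1';

-- Tombstones can be purged during compaction only after gc_grace_seconds
-- has elapsed (864000 seconds = 10 days).
ALTER TABLE MyKeySpace.MyColumns WITH gc_grace_seconds = 864000;
```

Workloads that delete heavily and then scan the same partitions tend to be the ones where tombstones hurt read performance most, because reads have to skip over the markers until compaction removes them.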
