In this article, we will discuss data distribution in Cassandra and how data is distributed across a cluster. So, let’s have a look.
In Cassandra, data distribution and replication go together. Both depend on three things: the partition key, the key value, and the token range.
In this table there are two rows. The first row contains four columns and their values. The second row contains only two columns (column 1 and column 3) and their values. Column 1 is the primary key of this table.
Now, let’s take an example of how user data is distributed across a cluster.
The ring architecture given below has four nodes, each with its own token range, and each row has its own token ID. The partitioner generates a token value for each row, and the row is then assigned to a node and distributed across the cluster accordingly.
Token: Tokens are hash values that partitioners use to determine where to store rows on each node in the ring. Cassandra uses the Murmur3 hash algorithm for hashing.
For example, let’s take a random hash value for each row of the data in the table above.
| Partition key | Murmur3 hash value |
Let’s have a look for better understanding.
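The mapping from a row's token to a node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four node names and their token values are made up, and `toy_token` is a stand-in hash built on MD5 that merely produces a signed 64-bit number. It is not Cassandra's Murmur3 implementation and will not reproduce real token values.

```python
import bisect
import hashlib

# Hypothetical four-node ring. Each node owns the token range that ends at
# its own token: (previous node's token, own token]. Values are made up.
RING = [
    (-6000000000000000000, "node1"),
    (-2000000000000000000, "node2"),
    (2000000000000000000, "node3"),
    (6000000000000000000, "node4"),
]
TOKENS = [token for token, _ in RING]


def toy_token(partition_key: str) -> int:
    """Stand-in hash (NOT Murmur3): map a key to a signed 64-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


def owner_of(token: int) -> str:
    """Return the node whose range contains the token.

    bisect_left finds the first node token >= the row token; tokens larger
    than the last node token wrap around to the first node in the ring.
    """
    index = bisect.bisect_left(TOKENS, token) % len(RING)
    return RING[index][1]


if __name__ == "__main__":
    for key in ("row1", "row2"):  # hypothetical partition key values
        token = toy_token(key)
        print(key, token, owner_of(token))
```

The sorted token list plus a binary search keeps the lookup simple, and the modulo handles the wrap-around that makes the layout a ring rather than a line.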
Replication Factor: In Cassandra, the replication factor is very important. It indicates the total number of replicas of each row across the cluster.
Let’s take RF = 2, which simply means that there are two copies of each row. There is no primary or master replica in Cassandra.
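To make RF = 2 concrete, here is a minimal sketch of replica placement. It assumes a SimpleStrategy-style rule, where the first copy goes to the node that owns the token and each further copy goes to the next node clockwise on the ring. The article does not name a replication strategy, so treat that rule, the node names, and the token values as assumptions for illustration.

```python
import bisect

# Hypothetical four-node ring; token values are made up for illustration.
RING = [
    (-6000000000000000000, "node1"),
    (-2000000000000000000, "node2"),
    (2000000000000000000, "node3"),
    (6000000000000000000, "node4"),
]


def replicas_for(token: int, rf: int = 2) -> list:
    """Return the nodes holding a copy of the row with the given token.

    Assumed placement rule: start at the node that owns the token, then
    walk clockwise until rf distinct nodes have been collected.
    """
    tokens = [t for t, _ in RING]
    start = bisect.bisect_left(tokens, token) % len(RING)
    count = min(rf, len(RING))  # cannot place more copies than nodes
    return [RING[(start + i) % len(RING)][1] for i in range(count)]


if __name__ == "__main__":
    print(replicas_for(0))                    # owner plus the next node
    print(replicas_for(5000000000000000000))  # wraps past the end of the ring
```

All returned nodes are equal peers in this sketch: the list order only reflects the walk around the ring, which matches the point that there is no primary or master replica.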