Data Distribution in Cassandra

Last Updated : 18 Nov, 2019

In this article, we will discuss data distribution in Cassandra and how data is distributed across a cluster. So, let’s have a look.

In Cassandra, data distribution and replication go together. Both depend on three things: the partition key, the key value, and the token range.

Cassandra Table:
This table has two rows. The first row contains four columns and their values. The second row contains only two columns (column 1 and column 3) and their values. Column 1 is the primary key of the table.


Figure – Cassandra Table

Now, let’s take an example of how user data is distributed across a cluster.

E_id E_name E_sal
101 Ashish 90000
102 Aayush 95000
103 Rahul 70000
104 Abi 60000

The ring architecture below has four nodes, and each node is responsible for a token range. Each row gets its own token: the partitioner generates a token value for the row, and the row is assigned to the node whose range contains that token. This is how rows are distributed across the cluster.


Figure – Data Center with Random token range

Token: Tokens are hash values that partitioners use to determine where to store rows on each node in the ring. Cassandra uses the Murmur3 hash algorithm for hashing.
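To make the idea of a token concrete, here is a minimal Python sketch of turning a partition key into a signed 64-bit number. It is only an illustration: `toy_token` is a hypothetical helper, and it uses MD5 from the standard library as a stand-in hash, not Murmur3, so the values it produces will not match real Cassandra tokens.

```python
import hashlib


def toy_token(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit integer.

    Assumption: MD5 is used here only as a stand-in hash so the sketch
    needs no third-party packages. Cassandra itself uses Murmur3.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Keep the first 8 bytes and read them as a signed 64-bit integer,
    # so the result falls in the range -2**63 .. 2**63 - 1.
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


if __name__ == "__main__":
    for key in ["Ashish", "Aayush", "Rahul", "Abi"]:
        print(key, toy_token(key))
```

The important property is that the same key always hashes to the same token, so any node can work out where a row belongs without a central lookup.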

For example, let’s assume an arbitrary hash value for each row of the table above. These values are made up for illustration and are not real Murmur3 output.

Partition key Murmur3 Hash value
Ashish 1500
Aayush -800
Rahul -1500
Abi 700

The figure below shows where each row lands on the ring.


Figure – Example of Data Distribution in Cassandra
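The placement shown in the figure can be sketched in a few lines of Python. The node names and the token that closes each node's range are assumptions made for this sketch (the actual ranges are in the figure); the row tokens are the example values from the table above. Each node is assumed to own the range from the previous node's token (exclusive) up to its own token (inclusive), with the ring wrapping around at the ends.

```python
from bisect import bisect_left

# Hypothetical ring: (token that closes the node's range, node name).
# These boundaries are assumed for illustration only.
RING = [(-1000, "node1"), (0, "node2"), (1000, "node3"), (2000, "node4")]
TOKENS = [token for token, _ in RING]

# Example hash values from the article's table.
ROW_TOKENS = {"Ashish": 1500, "Aayush": -800, "Rahul": -1500, "Abi": 700}


def owner(token: int) -> str:
    """Return the node whose token range contains the given token."""
    # Find the first node whose token is >= the row token. If the row
    # token is larger than every node token, wrap around to the first node.
    index = bisect_left(TOKENS, token) % len(RING)
    return RING[index][1]


placement = {key: owner(token) for key, token in ROW_TOKENS.items()}

if __name__ == "__main__":
    for key, node in placement.items():
        print(f"{key} (token {ROW_TOKENS[key]}) -> {node}")
```

With these assumed boundaries, each of the four rows ends up on a different node. A different set of boundaries would give a different placement, but the lookup rule stays the same.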

Replication Factor: In Cassandra, the replication factor (RF) indicates the total number of replicas of each row across the cluster.
Let’s take RF = 2, which simply means that there are two copies of each row. There is no primary or master replica in Cassandra.


Figure – Example of data distribution when RF = 2
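As a rough sketch of how RF = 2 might play out on a four-node ring, the Python below places the first copy on the node whose token range contains the row's token and the second copy on the next node clockwise. This clockwise rule is an assumption that mirrors a simple single-data-center placement strategy; the node names, range boundaries, and the `replicas_for` helper are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical ring: (token that closes the node's range, node name).
RING = [(-1000, "node1"), (0, "node2"), (1000, "node3"), (2000, "node4")]
TOKENS = [token for token, _ in RING]

# Example hash values from the article's table.
ROW_TOKENS = {"Ashish": 1500, "Aayush": -800, "Rahul": -1500, "Abi": 700}


def replicas_for(token: int, rf: int = 2) -> list[str]:
    """Return the nodes holding a copy of the row with this token.

    Assumption: copies go to the owning node and then to the following
    nodes clockwise around the ring until rf distinct nodes are chosen.
    """
    rf = min(rf, len(RING))  # cannot have more copies than nodes
    start = bisect_left(TOKENS, token) % len(RING)
    return [RING[(start + step) % len(RING)][1] for step in range(rf)]


if __name__ == "__main__":
    for key, token in ROW_TOKENS.items():
        print(f"{key} (token {token}) -> {replicas_for(token)}")
```

Both copies are equal peers here: the list order only reflects the walk around the ring, not a master and a backup.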

