Table Partitioning in Cassandra

Last Updated : 31 Aug, 2020

In this article, we are going to cover how we can our data access on the basis of partitioning and how we can store our data uniquely in a cluster. Let’s discuss one by one.

Pre-requisite — Data Distribution

Table Partitioning :
In table partitioning, data can be distributed on the basis of the partition key. If you did not specify any partitioning key then it might be the chance of losing data. And It will be difficult to access data as per requirement.

Example :
Let’s consider if your requirement where you want to query user data by the first name. Now, first, you have to create a table where the role of the partitioning key is very important.

CREATE TABLE User_data_by_first_name
(
Usr_id UUID,
first_name text,
last_name text,
primary key (first_name)
);

Let’s insert some data for the above-created table.

Insert into User_data_by_id(Usr_id, first_name, last_name) 
values(uuid(), 'Ashish', 'A');
Insert into User_data_by_id(Usr_id, first_name, last_name) 
values(uuid(), 'Ashish', 'A');
Insert into User_data_by_id(Usr_id, first_name, last_name) 
values(uuid(), 'Ashish', 'B');

Now, if you want to read data then used the following cqlsh query.

select * from User_data_by_id;

Output :

first_name	last_name	Usr_id
Ashish	B	8a71a441-ebee-48e6-861e-cb097570b09b

In the above example, If you have specified it’s partitioning key by the first name then it is not the recommended way to specify the only first name as partitioning key. Otherwise, it might be a chance not uniquely identified your data and your data will be lost if you have multiple entries with same name.

Now, to resolve this issue specify Usr_id and first_name as the partitioning key.

CREATE TABLE User_data_by_first_name_modify
(
Usr_id UUID,
first_name text,
last_name text,
primary key (first_name, Usr_id)
);

Now, Insert the same data as you have to insert for User_data_by_first_name.

Insert into User_data_by_first_name_modify(Usr_id, first_name, last_name) 
values(uuid(), 'Ashish', 'A');
Insert into User_data_by_first_name_modify(Usr_id, first_name, last_name) 
values(uuid(), 'Ashish', 'A');
Insert into User_data_by_first_name_modify(Usr_id, first_name, last_name) 
values(uuid(), 'Ashish', 'B');

Now, If you will read your data then it will uniquely be identified and your data will be not lost.

select * from User_data_by_first_name_modify;

Output :

first_name	Usr_id	last_name
Ashish	9d0bd000-d822-41f4-9a0b-ee5a3f100dbf	B
Ashish	ea0fb4a4-af2e-4eaf-8bf6-f8e894416206	A
Ashish	f1fcbdd0-5e6e-4dc7-83d7-2708db806668	A

Suggest improvement

Role of Keys in Cassandra

DROP SCHEMA in SQL Server

Share your thoughts in the comments

Table Partitioning in Cassandra

Please Login to comment...

Similar Reads

What kind of Experience do you want to share?