
Data Backup and Restoration in Cassandra

Last Updated : 23 Dec, 2019

In this article, we will discuss how to back up and restore data in Cassandra. A backup is taken as a snapshot with the nodetool utility, and the snapshot can then be restored in two ways: with the sstableloader utility or with nodetool refresh.

Let’s go through them one by one.

First, we create a keyspace to hold the data that we are going to back up. Let’s work through this with a sample exercise.

Creating a keyspace:

create keyspace backup_copy 
with replication = { 'class' : 'SimpleStrategy', 
                     'replication_factor': 2 }; 

Now, switch to the backup_copy keyspace.

use backup_copy ;  

Next, we create the table that we are going to back up. In the CQL query given below, facebook_user is the table name, and login_time, user_name, and post are its columns.
Let’s have a look.

create table facebook_user (
  login_time timeuuid primary key, 
  user_name text, 
  post set<text>
);

Let’s see the table schema.

describe table facebook_user; 


Now, we insert some data to back up and restore. Let’s have a look.

Insert into facebook_user(login_time, user_name, post) 
values(now(), 'Ashish', {'join webinar at 10:00 am'});

Insert into facebook_user(login_time, user_name, post) 
values(now(), 'Rana', {'join Cassandra meetup at 10:00 am'}); 

Now, let’s verify with the select command that the records have been persisted successfully.

select * 
from facebook_user; 


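To take a snapshot, we use the nodetool utility. The snapshot command takes the keyspace name as its argument, and the -cf option limits the snapshot to a single table:

nodetool -h localhost -p 7199 snapshot -cf facebook_user backup_copy 

Here, 7199 is the JMX port that nodetool connects to.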

Requested creating a snapshot for facebook_user.

Snapshot directory: 1205514051242

The output shows that running nodetool snapshot on the local node has created a snapshot named 1205514051242 in the snapshots subdirectory of the $CASSANDRA_DATA_DIR/backup_copy/facebook_user folder (recent Cassandra versions append an ID to the table folder name). Here, $CASSANDRA_DATA_DIR is the value of the data_file_directories property in the cassandra.yaml file.
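
As a rough illustration of where the snapshot lands on disk, the sketch below lists the directories of a named snapshot for one table. The function name find_snapshot_dirs is hypothetical, and the layout it checks (data directory, keyspace, table folder with or without an ID suffix, then snapshots) is an assumption that should be compared with your own installation.

```shell
# Print the on-disk directories of one snapshot for one table.
# Assumed layout (verify on your node):
#   <data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>
#   <data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>
find_snapshot_dirs() {
  local data_dir="$1" keyspace="$2" table="$3" snapshot="$4"
  local dir
  for dir in "$data_dir/$keyspace/$table"/snapshots/"$snapshot" \
             "$data_dir/$keyspace/$table"-*/snapshots/"$snapshot"; do
    # An unmatched glob stays literal, so only print real directories.
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
    fi
  done
  return 0
}
```

For the example in this article, it could be called as find_snapshot_dirs "$CASSANDRA_DATA_DIR" backup_copy facebook_user 1205514051242.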

To demonstrate a restore, we first need to delete the data that we are going to restore. Let’s have a look.

truncate facebook_user; 

Restore data by using the sstableloader utility:
To begin, copy all the .db files from the snapshot directory into a folder whose path ends with the keyspace name followed by the table name, because sstableloader reads the target keyspace and table from the last two components of the path. In our case, that is the facebook_user folder under backup_copy (/home/Ashish/backup_copy/facebook_user).
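
The copy step can be sketched as a small shell function. This is only an illustration: the name stage_snapshot and its four arguments are hypothetical, and it copies only the .db files, as described above.

```shell
# Copy the .db files of one snapshot into <staging_root>/<keyspace>/<table>,
# the folder layout that sstableloader expects.
# stage_snapshot is a hypothetical helper, not part of Cassandra.
stage_snapshot() {
  local snapshot_dir="$1" staging_root="$2" keyspace="$3" table="$4"
  local target="$staging_root/$keyspace/$table"
  # Expand the glob once; if nothing matches, the pattern stays literal.
  set -- "$snapshot_dir"/*.db
  if [ ! -e "$1" ]; then
    echo "no .db files found in $snapshot_dir" >&2
    return 1
  fi
  # Create the target folder, copy the files, and print the folder path.
  mkdir -p "$target" && cp "$@" "$target"/ && printf '%s\n' "$target"
}
```

The path it prints is the one to pass to sstableloader, for example /home/Ashish/backup_copy/facebook_user when the staging root is /home/Ashish.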

Now, let’s execute the sstableloader.

$CASSANDRA_HOME/bin/sstableloader -d localhost /home/Ashish/backup_copy/facebook_user 

The output looks like this:

Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /home/Ashish/backup_copy/facebook_user/facebook_user-jb-1-Data.db to
[/, /, /]
progress: [/ 1/1 (100%)] [/ 1/1 (100%)] [total: 100% - 0MB/s (avg: 0MB/s)] 

Once sstableloader completes, we can verify whether the data has been restored by using the CQL query given below.

select * 
from backup_copy.facebook_user; 
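
The same check can be run from the shell, which is handy in a restore script. The sketch below assumes that cqlsh is on the PATH and uses its -e option to execute a single statement. The function name count_rows and the CQLSH override variable are hypothetical.

```shell
# Run a row count for one table through cqlsh.
# count_rows is a hypothetical helper; set CQLSH to point at a
# different cqlsh binary if it is not on the PATH.
count_rows() {
  local keyspace="$1" table="$2"
  local cqlsh="${CQLSH:-cqlsh}"
  "$cqlsh" -e "SELECT count(*) FROM ${keyspace}.${table};"
}
```

For the table in this article, the call would be count_rows backup_copy facebook_user. A full count scans the whole table, so this is best kept for small tables such as this example.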


Using nodetool refresh:
Another way to restore the data is the nodetool refresh command. Unlike the sstableloader method, which streams the data to the cluster, here we manually copy the .db files from the snapshot into the table's folder in the Cassandra data directory on the node, and then ask Cassandra to load them without a restart.

To run nodetool refresh, use the command given below.

$CASSANDRA_HOME/bin/nodetool refresh backup_copy facebook_user 
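
Both steps of this method, the manual copy and the refresh, can be combined in one function. This is a minimal sketch under stated assumptions: restore_with_refresh and the NODETOOL override variable are hypothetical names, and the table folder must be passed in explicitly because its exact name depends on the Cassandra version.

```shell
# Copy snapshot .db files into the live table folder, then ask Cassandra
# to load them. restore_with_refresh is a hypothetical helper.
restore_with_refresh() {
  local snapshot_dir="$1" table_dir="$2" keyspace="$3" table="$4"
  local nodetool="${NODETOOL:-nodetool}"
  if [ ! -d "$table_dir" ]; then
    echo "table directory not found: $table_dir" >&2
    return 1
  fi
  # Expand the glob once; if nothing matches, the pattern stays literal.
  set -- "$snapshot_dir"/*.db
  if [ ! -e "$1" ]; then
    echo "no .db files found in $snapshot_dir" >&2
    return 1
  fi
  cp "$@" "$table_dir"/ || return 1
  # Same command as shown above: nodetool refresh <keyspace> <table>
  "$nodetool" refresh "$keyspace" "$table"
}
```

Run it as the user that owns the Cassandra data directory, so that the copied files are readable by the Cassandra process.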

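Using clearsnapshot:
Snapshots are not deleted automatically and keep using disk space, so remove them once they are no longer needed. The command below clears the snapshots on the node. Recent Cassandra versions require either --all or -t with a snapshot name to be added to it.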

$CASSANDRA_HOME/bin/nodetool -h localhost -p 7199 clearsnapshot 
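
To remove only one snapshot instead of all of them, clearsnapshot accepts the snapshot name through its -t option, followed by the keyspace. The wrapper below is a sketch: clear_one_snapshot and the NODETOOL override variable are hypothetical names, and the host and port are the ones used earlier in this article.

```shell
# Remove a single named snapshot from one keyspace.
# clear_one_snapshot is a hypothetical helper around nodetool.
clear_one_snapshot() {
  local snapshot="$1" keyspace="$2"
  local nodetool="${NODETOOL:-nodetool}"
  if [ -z "$snapshot" ] || [ -z "$keyspace" ]; then
    echo "usage: clear_one_snapshot <snapshot_name> <keyspace>" >&2
    return 2
  fi
  "$nodetool" -h localhost -p 7199 clearsnapshot -t "$snapshot" -- "$keyspace"
}
```

For the snapshot taken above, the call would be clear_one_snapshot 1205514051242 backup_copy.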
