Snitches in Cassandra

In this article, we will discuss the snitch and its types, and see how the cassandra-topology.properties and cassandra-rackdc.properties files are used to describe the datacenters and racks of a cluster.

Snitches:
In Cassandra, a snitch tells the database about the network topology, that is, which datacenter and rack each node belongs to. This matters because Cassandra tries to avoid storing multiple replicas of the same data on the same rack. In the replication strategy we specify the number of replicas and the datacenters that hold them; the snitch supplies the node-to-rack and node-to-datacenter mapping that is needed to place those replicas.

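The snitch's job is to determine which datacenters and racks nodes belong to, so that Cassandra can decide where to read data from and write data to. All snitches are dynamic by default: the configured snitch is wrapped in a dynamic snitch layer that monitors read performance and, where possible, routes requests away from slow nodes.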
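
As a rough illustration of why rack information matters, the sketch below picks replica nodes while preferring racks that have not been used yet. It is a simplified stand-in, not Cassandra's actual placement logic; the function name pick_replicas, the sample IP addresses, and the ring ordering are all assumptions made for the example.

```python
def pick_replicas(nodes, replication_factor):
    """Pick replicas from an ordered list of (ip, rack) pairs,
    preferring nodes on racks that hold no replica yet."""
    chosen, seen_racks, skipped = [], set(), []
    for ip, rack in nodes:
        if len(chosen) == replication_factor:
            break
        if rack in seen_racks:
            skipped.append(ip)      # rack already used; hold this node back
        else:
            chosen.append(ip)
            seen_racks.add(rack)
    # With fewer racks than replicas, fall back to the held-back nodes.
    for ip in skipped:
        if len(chosen) == replication_factor:
            break
        chosen.append(ip)
    return chosen


# Hypothetical ring order and rack assignment, for illustration only.
ring = [
    ("10.0.0.1", "RAC1"),
    ("10.0.0.2", "RAC1"),
    ("10.0.0.3", "RAC2"),
    ("10.0.0.4", "RAC2"),
]
print(pick_replicas(ring, 2))
```

With two replicas, the second node is skipped because it shares a rack with the first, so the replicas land on different racks. Without rack information from a snitch, this kind of choice would not be possible.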

Types of Snitches:

  1. SimpleSnitch:
    It is the default snitch and is suitable for development environments. It is unaware of datacenters and racks and does not read the cassandra-topology.properties file, so it is unsuitable for multi-datacenter environments.
  2. GossipingPropertyFileSnitch:
    This snitch is recommended by DataStax for production use. Each node declares its own datacenter and rack in the cassandra-rackdc.properties file, and that information is propagated to the rest of the nodes using gossip. To ease migration from the PropertyFileSnitch, it also uses the cassandra-topology.properties file when one is present.

    We can configure the GossipingPropertyFileSnitch by editing the cassandra-rackdc.properties file on each node.
    Let’s have a look.

    dc=DC1
    rack=RACK1
    prefer_local=true 

    Here, dc and rack name the datacenter and rack of the local node. Setting prefer_local=true tells Cassandra to use the local (internal) IP address when the communication does not cross datacenters, which limits network bandwidth usage.

  3. Ec2Snitch:
    This is a simple snitch for Amazon EC2 deployments where all nodes are in a single region. With Ec2Snitch, the region name is treated as the datacenter and the availability zone as the rack.

  4. Ec2MultiRegionSnitch:
    This snitch is for Amazon EC2-based clusters that span multiple regions.

  5. GoogleCloudSnitch:
    This snitch is for Cassandra deployments on the Google Cloud Platform (GCP) across one or more regions.

  6. RackInferringSnitch:
    This snitch infers a node's location from its IP address: the second octet identifies the datacenter and the third octet identifies the rack. For example, the node 10.40.08.230 is placed in datacenter 40 and rack 8. It is mainly useful as an example for writing custom snitch classes.

  7. PropertyFileSnitch:
    This snitch determines the proximity of nodes by the datacenter and rack they belong to, using the network definitions in the cassandra-topology.properties file.
    We must list the nodes of the cluster in that file, together with the datacenter and rack of each one.

  8. CloudstackSnitch:
    This snitch is for Apache CloudStack-based clusters.
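
Two of the snitches above derive a node's location from a naming convention instead of a file. The sketch below mimics both rules: the RackInferringSnitch rule (second octet of the IP address as datacenter, third octet as rack) and the EC2 rule described above (region as datacenter, availability zone as rack). The function names are hypothetical, and the EC2 function assumes that a zone name is simply the region name followed by one letter; the exact names Cassandra produces can depend on the version and its configuration.

```python
def infer_from_ip(ip):
    """RackInferringSnitch-style rule: 2nd octet -> datacenter, 3rd octet -> rack."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip}")
    return str(octets[1]), str(octets[2])


def infer_from_zone(availability_zone):
    """EC2-style rule: region -> datacenter, availability zone -> rack.

    Assumption: the zone name is the region name plus a single letter.
    """
    region = availability_zone[:-1]
    return region, availability_zone


print(infer_from_ip("10.40.08.230"))   # (datacenter, rack) from the octets
print(infer_from_zone("us-east-1a"))   # (datacenter, rack) from the zone name
```

Neither rule needs a topology file, which is convenient, but it also means the addressing or zone layout has to match the convention exactly.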

Now, let’s understand the cassandra-topology.properties and cassandra-rackdc.properties files.



Understanding the cassandra-topology.properties and cassandra-rackdc.properties Files:

The cassandra-topology.properties file contains the topology of the entire cluster, while the cassandra-rackdc.properties file describes only the local node.
Let’s take an example. Conceptually, a datacenter such as DC1 can contain several racks:

dc = DC1
rack = RAC1 
rack= RAC2

In the example below, DC1 and DC2 are two physical datacenters with two racks each. The PropertyFileSnitch uses the cassandra-topology.properties file to look up the datacenter and rack of each node in the cluster. If a node is not listed in the file, the snitch places it in the datacenter and rack given by the file's default entry.

# datacenter One
10.40.08.230  = DC1:RAC1
10.30.11.231  = DC1:RAC1
10.54.06.232  = DC1:RAC1

130.40.20.106  = DC1:RAC2
130.41.21.229  = DC1:RAC2
130.42.29.111  = DC1:RAC2

# datacenter Two

100.46.12.120  = DC2:RAC1
100.60.13.201  = DC2:RAC1
100.24.35.184  = DC2:RAC1

30.22.20.110  = DC2:RAC2
30.35.21.210  = DC2:RAC2
30.27.20.231  = DC2:RAC2 

It is important to update the cassandra-topology.properties file whenever nodes are added to or removed from the cluster, so that Cassandra knows which datacenter and rack every node belongs to. Keeping this information accurate also matters for performance.
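
To make the lookup concrete, here is a minimal sketch that parses a few entries in the same node = DC:RACK format and compares two nodes. It only imitates the idea behind the PropertyFileSnitch; the function names, the default entry's value, and the 0/1/2 proximity scale are assumptions chosen for the example, not Cassandra's implementation.

```python
TOPOLOGY = """
# datacenter One
10.40.08.230 = DC1:RAC1
130.40.20.106 = DC1:RAC2
# datacenter Two
100.46.12.120 = DC2:RAC1
# fallback for nodes that are not listed (value assumed for this example)
default = DC1:RAC1
"""


def load_topology(text):
    """Parse 'node = DC:RACK' lines into {node: (dc, rack)}."""
    topology = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                     # skip blank lines and comments
        node, _, location = line.partition("=")
        dc, _, rack = location.strip().partition(":")
        topology[node.strip()] = (dc, rack)
    return topology


def locate(topology, ip):
    """Return (dc, rack) for a node, using the default entry if it is unlisted."""
    return topology.get(ip, topology["default"])


def proximity(topology, a, b):
    """0 = same rack, 1 = same datacenter, 2 = different datacenters."""
    dc_a, rack_a = locate(topology, a)
    dc_b, rack_b = locate(topology, b)
    if dc_a != dc_b:
        return 2
    return 0 if rack_a == rack_b else 1


topology = load_topology(TOPOLOGY)
print(locate(topology, "130.40.20.106"))
print(proximity(topology, "10.40.08.230", "100.46.12.120"))
```

A node that was added to the cluster but not to the file would silently fall back to the default location here, which shows why the file has to be kept up to date.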

The cassandra-rackdc.properties file, by contrast, defines the datacenter and rack of the local node only. For a node in rack RAC1 of datacenter DC1 from the example above, it looks like this.
Let’s have a look.

dc=DC1
rack=RAC1 

Note:
The following snitch types read the cassandra-rackdc.properties file to determine which datacenter and rack a node belongs to:

GossipingPropertyFileSnitch
Ec2Snitch
Ec2MultiRegionSnitch 
