Sharding Vs. Consistent Hashing

Last Updated : 29 Feb, 2024

Sharding and consistent hashing are two fundamental concepts in distributed systems and databases that play crucial roles in achieving scalability and performance. Understanding the differences between these two approaches is essential for designing efficient and resilient distributed systems.

What is Sharding?


Sharding is a database architecture pattern used to horizontally partition data across multiple servers or nodes.

In sharding, each server or node in the database cluster is responsible for storing only a subset of the data, called a shard.
By distributing the data across multiple shards, sharding enables databases to scale horizontally, allowing them to handle larger volumes of data and higher numbers of transactions.
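
As a rough illustration of the idea, the sketch below assumes range-based sharding on an integer key. The shard names, the range boundaries, and the `shard_for` helper are hypothetical and chosen only for this example; real systems may instead shard by hash, by directory lookup, or by geography.

```python
from bisect import bisect_right

# Hypothetical configuration: each bound is the exclusive upper limit
# of a shard's key range; the last shard takes everything above it.
SHARD_BOUNDS = [1000, 2000, 3000]
SHARD_NAMES = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(user_id: int) -> str:
    """Return the name of the shard whose key range contains user_id."""
    # bisect_right counts how many bounds are <= user_id, which is
    # exactly the index of the shard that owns the key.
    return SHARD_NAMES[bisect_right(SHARD_BOUNDS, user_id)]


print(shard_for(999))    # shard-0
print(shard_for(1000))   # shard-1
print(shard_for(2500))   # shard-2
```

Each server only needs to store and serve the keys in its own range, which is what lets capacity grow by adding shards. The trade-off in this sketch is that changing the boundaries means moving every key whose range assignment changes.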

What is Consistent Hashing?


Consistent hashing is a technique used in distributed systems to spread keys (e.g., cache keys) evenly across a cluster of nodes (e.g., cache servers). The goal is to minimize the number of keys that must be moved when nodes are added to or removed from the cluster, which reduces the impact of such changes on the overall system.

It places both the client requests and the server nodes on a virtual ring structure known as a hash ring.
The number of locations on this ring does not depend on the number of servers. Conceptually the ring is treated as having infinitely many points; in practice it is the output range of the hash function.
The server nodes are placed at effectively random locations on this ring by hashing an identifier for each node.
The requests, whether they come from users, other computers, or serverless programs, are placed on the same ring using the same hash function.
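
The ring described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a production implementation: the `HashRing` class, the `ring_position` helper, and the node names are hypothetical, the ring is taken to be a 32-bit space derived from MD5, each node occupies a single point (no virtual nodes), and a key is assigned to the first node found clockwise from its position, which is the usual convention.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Hash a string to a point on the ring (assumed 32-bit space)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of all nodes
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        if pos in self._owner:
            raise ValueError(f"ring position collision for {node!r}")
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owner[pos]

    def node_for(self, key: str) -> str:
        """Return the first node at or clockwise from the key's position."""
        if not self._positions:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        if idx == len(self._positions):
            idx = 0  # wrap around past the end of the ring
        return self._owner[self._positions[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("cache-d")
moved = [k for k in keys if ring.node_for(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

In this sketch, every key that moves after `add_node` moves to the new node, and all other keys keep their original owner. With only one point per node the split between nodes can be quite uneven, which is why practical implementations typically place each node at many positions on the ring.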