
MongoDB – Replication and Sharding

Replication and sharding are two important features for scalability and data availability in MongoDB. Replication improves data availability by keeping duplicate copies of the dataset on multiple servers, whereas sharding enables horizontal scaling by partitioning a large collection (dataset) into smaller, discrete parts called shards.

In this article, we will learn about Sharding and Replication in MongoDB. We will cover all important concepts related to them and look at their functioning with diagrams.



Replication in MongoDB

Replication is the process of duplicating data across multiple servers in MongoDB.

For example, suppose an application reads and writes data to a database. If server A stores a name and a balance, that data is copied (replicated) to two other servers in two different locations.



Replication increases redundancy and data availability by keeping multiple copies of the data on different database servers. Because more than one server holds the data, it can also help scale read operations.

The set of servers that maintain the same copy of the data is known as a replica set. Each server in the set is a MongoDB (mongod) instance.
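The idea of several servers holding the same data can be illustrated with a small toy model. This is only a sketch in plain Python, not MongoDB code: the `Member` and `ToyReplicaSet` classes, the ordered write log, and the sample value "Alice" are all assumptions made for illustration.

```python
class Member:
    """One server in the toy replica set; it keeps its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.applied = 0  # how many log entries this member has applied

    def apply(self, entry):
        key, value = entry
        self.data[key] = value
        self.applied += 1


class ToyReplicaSet:
    """Hypothetical model: one member accepts writes, the others copy them."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries
        self.log = []  # ordered record of every write the primary accepted

    def write(self, key, value):
        entry = (key, value)
        self.log.append(entry)
        self.primary.apply(entry)

    def replicate(self):
        # Each secondary replays only the entries it has not applied yet.
        for member in self.secondaries:
            for entry in self.log[member.applied:]:
                member.apply(entry)


server_a = Member("A")
server_b = Member("B")
server_c = Member("C")
rs = ToyReplicaSet(server_a, [server_b, server_c])

rs.write("name", "Alice")
rs.write("balance", 100)
rs.replicate()

print(server_b.data == server_a.data, server_c.data == server_a.data)
```

After `replicate()` runs, servers B and C hold the same name and balance as server A, which mirrors the three-server example described above.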

Key Features of Replication:

Advantages of Replication

How to Perform Replication in MongoDB

To perform replication in MongoDB, we first need to create a replica set (and, if a script file is used for this, give that file execute permission). The basic syntax of the --replSet option is:

mongod --port "PORT" --dbpath "YOUR_DB_DATA_PATH" --replSet "REPLICA_SET_INSTANCE_NAME"
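A replica set needs more than one mongod process, each with its own port and data directory. As a rough sketch, the snippet below builds the command line shown above for three members. The ports, the data paths, and the set name `rs0` are placeholder assumptions, and the helper `mongod_command` is a hypothetical name, not part of any MongoDB tool.

```python
import shlex


def mongod_command(port, dbpath, replset):
    """Build the argument list for one replica set member (hypothetical helper)."""
    return ["mongod", "--port", str(port), "--dbpath", dbpath, "--replSet", replset]


# Assumed ports and data directories; adjust them for a real machine.
members = [
    (27017, "/data/rs0-0"),
    (27018, "/data/rs0-1"),
    (27019, "/data/rs0-2"),
]

commands = [mongod_command(port, path, "rs0") for port, path in members]

for cmd in commands:
    # shlex.join produces a shell-safe string that can be pasted into a terminal.
    print(shlex.join(cmd))
```

The script only prints the commands; it does not start any server. Once the members are running, the set still has to be initiated from the MongoDB shell, typically with `rs.initiate()`.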

Alternatively, create a shell script, create_replicaset.sh, together with an init_mongoreplica.js file.

Then run the script:

./create_replicaset.sh

Sharding in MongoDB

Sharding is a method for distributing a large collection (dataset) across multiple servers. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.

Sharding adds more machines to handle data growth and the demands of read and write operations.

Need for Sharding

A single server may not be able to handle a database system with large data sets or high-throughput workloads.

For example, high query rates can exhaust the CPU capacity of the server, and large data sets can stress the I/O capacity of the disk drives.

How does Sharding work?

Sharding addresses this problem through horizontal scaling. It splits the dataset and stores it across multiple servers, and new servers can be added to increase capacity as needed.

Now, instead of a single primary server, we have multiple servers called shards. Separate routing servers route data to the shard servers.

For example, say we have Data 1, Data 2, and Data 3. Each of these goes to the routing server, which routes it to a particular shard. Each shard holds some pieces of the data.
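The routing step can be sketched with a toy router that hashes a key field and uses the result to pick a shard. This is a simplified illustration under stated assumptions, not how MongoDB is implemented: the shard names, the `user_id` field, and the modulo rule are all hypothetical, and MongoDB itself assigns ranges of key values to shards rather than taking a modulo.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # assumed shard names


def pick_shard(shard_key_value, shards=SHARDS):
    """Map a key value to one shard deterministically (toy rule: hash modulo)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


class ToyRouter:
    """Hypothetical stand-in for a routing server placed in front of the shards."""

    def __init__(self, shards=SHARDS):
        self.names = list(shards)
        self.shards = {name: [] for name in self.names}

    def insert(self, doc, key_field):
        target = pick_shard(doc[key_field], self.names)
        self.shards[target].append(doc)
        return target

    def find(self, key_field, value):
        # Only the shard that owns this key value has to be searched.
        target = pick_shard(value, self.names)
        return [d for d in self.shards[target] if d[key_field] == value]


router = ToyRouter()
for i in (1, 2, 3):
    shard = router.insert({"user_id": i, "name": f"Data {i}"}, "user_id")
    print(f"Data {i} -> {shard}")

print(router.find("user_id", 2))
```

Because the same key value always hashes to the same shard, a lookup by that key only needs to contact one shard instead of all of them.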

Here the configuration server holds the metadata and tells the routing server which shard a particular piece of data belongs to. However, the config server is itself a MongoDB instance, and if it goes down the entire cluster goes down with it, so the config database is also deployed as a replica set.
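To make the role of that metadata more concrete, here is a minimal sketch of a lookup table that maps ranges of a key to shards. The bounds, the shard names, and the `shard_for` function are invented for illustration and do not reflect the actual format in which MongoDB stores its metadata.

```python
import bisect

# Hypothetical metadata: each range starts at its lower bound and runs up to
# (but does not include) the next lower bound. The last range is open-ended.
CHUNK_LOWER_BOUNDS = [0, 1000, 2000]
CHUNK_OWNERS = ["shardA", "shardB", "shardC"]


def shard_for(user_id):
    """Return the shard that owns the range containing user_id."""
    if user_id < CHUNK_LOWER_BOUNDS[0]:
        raise ValueError("key is below the smallest known bound")
    # bisect_right counts the lower bounds that are <= user_id.
    index = bisect.bisect_right(CHUNK_LOWER_BOUNDS, user_id) - 1
    return CHUNK_OWNERS[index]


for uid in (0, 999, 1000, 5000):
    print(uid, "->", shard_for(uid))
```

In this model the router only needs the small table of bounds and owners to decide where a request goes. Losing that table would leave the router unable to locate any data, which is why the text above notes that the config database is replicated as well.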

Advantages of Sharding

To create a sharded cluster in MongoDB, we need to configure the shards, a config server, and a query router.
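The three components map to three kinds of processes: a mongod started with `--configsvr`, a mongod started with `--shardsvr`, and a mongos router pointed at the config servers with `--configdb`. The sketch below only assembles example command lines for them. All ports, paths, and replica set names (`cfgRS`, `shardRS0`) are placeholder assumptions, and the helper functions are hypothetical.

```python
import shlex


def config_server_cmd(port, dbpath, replset):
    """Command line for one config server member (hypothetical helper)."""
    return ["mongod", "--configsvr", "--replSet", replset,
            "--port", str(port), "--dbpath", dbpath]


def shard_cmd(port, dbpath, replset):
    """Command line for one shard member (hypothetical helper)."""
    return ["mongod", "--shardsvr", "--replSet", replset,
            "--port", str(port), "--dbpath", dbpath]


def router_cmd(port, config_replset, config_hosts):
    """Command line for the query router, which needs the config server addresses."""
    configdb = f"{config_replset}/{','.join(config_hosts)}"
    return ["mongos", "--configdb", configdb, "--port", str(port)]


# Assumed layout: one config server, one shard, one router, all on localhost.
cluster = [
    config_server_cmd(27019, "/data/config", "cfgRS"),
    shard_cmd(27018, "/data/shard0", "shardRS0"),
    router_cmd(27017, "cfgRS", ["localhost:27019"]),
]

for cmd in cluster:
    print(shlex.join(cmd))
```

This prints the commands without running them. In a real deployment each replica set would also need to be initiated, and each shard would then be registered through the router, for example with `sh.addShard()` in the MongoDB shell.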

Conclusion

Both replication and sharding in MongoDB help in scaling a database. While replication improves data availability, sharding is used to horizontally scale large datasets.

In this article, we have learnt about their uses, advantages and implementation. Used together, these techniques help keep a database available while letting it grow with its data and workload.
