What Is DocumentDB? Setting Up Your First DocumentDB Cluster

Last Updated : 11 Mar, 2024

Amazon DocumentDB is a cloud-based document database service offered by Amazon Web Services (AWS). It is designed to store and manage data as JSON-like (JavaScript Object Notation) documents, which lets users keep data in a more natural, document-like format. It is a flexible database solution built for modern, cloud-based applications that need to store and manage semi-structured data at large scale.

Amazon DocumentDB automatically indexes the _id field of every document. Additional indexes are not created automatically: users define them with MongoDB-compatible index commands on the fields their queries filter and sort on, so that information can be queried and retrieved quickly.

What is DocumentDB?

Amazon DocumentDB (with MongoDB compatibility) is a fast, reliable, and fully managed database service. It makes it easy for users to set up, operate, and scale MongoDB-compatible databases in the cloud. With Amazon DocumentDB, users can run the same application code and use the same drivers and tools that they use with MongoDB. As an application grows and needs to store more data, cluster storage scales automatically to accommodate it. This means users don't have to worry about running out of storage space or about manually managing storage scaling.

A DocumentDB cluster consists of 0 to 16 instances and a cluster storage volume that manages the data for those instances. All writes go through the primary instance, while all instances (the primary and the replicas) support reads. The cluster's data is stored in the cluster volume, with copies in three different Availability Zones.
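
Because replicas also serve reads, a client can ask its MongoDB driver to prefer them. The following is a minimal sketch, not an official recipe: it only builds a connection string of the kind MongoDB drivers accept. The function name, the example endpoint, and the CA bundle file name are assumptions for illustration, and the options shown (TLS enabled, replica set name rs0, retryable writes disabled) assume a cluster with default settings.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(username, password, cluster_endpoint, port=27017,
                    ca_file="global-bundle.pem", prefer_replicas=True):
    """Build a MongoDB-style connection string for a DocumentDB cluster.

    All argument values are supplied by the caller; the defaults are
    assumptions for illustration (ca_file is a hypothetical local path
    to the downloaded CA bundle).
    """
    options = {
        "tls": "true",              # assumes TLS is enabled on the cluster
        "tlsCAFile": ca_file,
        "replicaSet": "rs0",        # connect in replica set mode
        # secondaryPreferred sends reads to replicas when any are available;
        # writes always go to the primary regardless of this setting.
        "readPreference": "secondaryPreferred" if prefer_replicas else "primary",
        "retryWrites": "false",     # assumes retryable writes are unsupported
    }
    # Percent-encode credentials so characters like '@' or ':' stay valid.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{cluster_endpoint}:{port}/?{urlencode(options)}"


if __name__ == "__main__":
    # Hypothetical endpoint; replace with your cluster endpoint.
    print(build_docdb_uri("appuser", "p@ssword",
                          "example.cluster-abc123.us-east-1.docdb.amazonaws.com"))
```

The resulting string can be passed to a MongoDB driver such as pymongo's MongoClient. Keeping the string construction in a pure function makes it easy to check without a live cluster.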

Features of DocumentDB

  1. Amazon DocumentDB offers two types of clusters: instance-based and elastic. Elastic clusters can handle millions of reads/writes per second and petabytes of storage.
  2. DocumentDB storage grows automatically up to 128 TiB as needed, without requiring you to provision extra storage upfront.
  3. You can increase read throughput by adding up to 15 replica instances. Replicas share the same storage as the primary, which reduces costs and improves read performance.
  4. DocumentDB allows you to scale compute and memory resources for each instance up or down quickly, typically within minutes.
  5. DocumentDB runs within your Amazon VPC, isolating your database in a private virtual network with configurable firewall settings.
  6. DocumentDB continuously monitors cluster health and automatically restarts instances if needed, without requiring crash recovery replay.
  7. If an instance fails, DocumentDB can automatically fail over to one of your replicas in other Availability Zones, or create a new instance if no replicas exist.
  8. DocumentDB provides point-in-time recovery for up to 35 days, allowing you to restore your cluster to any second within that period. Backups are incremental, continuous, and stored durably in Amazon S3.
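
The point-in-time recovery window in item 8 can be sketched as a small check that runs before a restore is requested. This is an illustrative sketch only: the function name and the cluster identifiers are hypothetical, and the dictionary keys are assumed to match the parameters of the boto3 docdb client's restore_db_cluster_to_point_in_time call. The real latest restorable time usually trails the current time slightly, which this sketch does not model.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35  # upper bound stated in the feature list above


def build_restore_request(source_cluster_id, new_cluster_id, restore_to,
                          retention_days=MAX_RETENTION_DAYS, now=None):
    """Return restore parameters if restore_to lies inside the backup window.

    restore_to and now must be timezone-aware datetimes.
    """
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    now = now or datetime.now(timezone.utc)
    earliest = now - timedelta(days=retention_days)
    if not earliest <= restore_to <= now:
        raise ValueError("restore_to is outside the backup retention window")
    return {
        "DBClusterIdentifier": new_cluster_id,            # restored copy (new cluster)
        "SourceDBClusterIdentifier": source_cluster_id,   # cluster to restore from
        "RestoreToTime": restore_to,
    }


if __name__ == "__main__":
    two_days_ago = datetime.now(timezone.utc) - timedelta(days=2)
    # Hypothetical cluster names.
    print(build_restore_request("orders-cluster", "orders-cluster-restored", two_days_ago))
```

A restore creates a new cluster rather than overwriting the source, so the request names both. The returned dictionary could then be passed as keyword arguments to the SDK call, assuming credentials and a Region are configured.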

Creating and Configuring DocumentDB Cluster

A DocumentDB cluster can be created in two ways:

  • Using the AWS Management Console
  • Using the AWS CLI

Using the AWS Management Console

Step 1: Open the Amazon DocumentDB management console and go to the “Clusters” section. If creating a new cluster, click “Create.” If modifying an existing cluster, select the cluster and click “Actions,” then choose “Modify.”

Step 2: If creating a new cluster, make sure to select “Instance Based Clusters” as the cluster type (this is the default option).

Step 3: In the “Configuration” section, under “Cluster storage configuration,” choose the option “Amazon DocumentDB I/O Optimized.”

Step 4: Complete the remaining steps for creating or modifying your cluster, and click “Create cluster” or “Modify cluster” when finished.
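
The console steps above can also be scripted. The sketch below uses the AWS SDK for Python (boto3) and is an assumption-laden illustration rather than a definitive procedure: the helper names, the cluster identifier, the instance class, and the credentials are all hypothetical, and the "iopt1" storage type value is assumed to correspond to the "Amazon DocumentDB I/O Optimized" console option. The request builders are kept separate from the API calls so they can be inspected without an AWS account.

```python
def build_cluster_request(cluster_id, username, password, io_optimized=True):
    """Parameters for creating an instance-based cluster."""
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "docdb",
        "MasterUsername": username,
        "MasterUserPassword": password,
        # Assumed mapping: "iopt1" = I/O Optimized, "standard" = Standard.
        "StorageType": "iopt1" if io_optimized else "standard",
    }


def build_instance_request(cluster_id, index, instance_class="db.r6g.large"):
    """Parameters for one instance; instance_class is an example value."""
    return {
        "DBInstanceIdentifier": f"{cluster_id}-{index}",
        "DBInstanceClass": instance_class,
        "Engine": "docdb",
        "DBClusterIdentifier": cluster_id,
    }


def create_cluster(cluster_id, username, password, instance_count=1):
    """Create the cluster, then its instances (the first becomes the primary)."""
    if not 0 <= instance_count <= 16:
        raise ValueError("a cluster holds 0 to 16 instances")
    import boto3  # imported here so the builders work without boto3 installed

    client = boto3.client("docdb")  # uses the configured credentials and Region
    client.create_db_cluster(**build_cluster_request(cluster_id, username, password))
    for index in range(1, instance_count + 1):
        client.create_db_instance(**build_instance_request(cluster_id, index))


if __name__ == "__main__":
    # Prints the request only; call create_cluster(...) to actually provision.
    print(build_cluster_request("my-first-cluster", "adminuser", "change-me-please"))
```

Calling create_cluster provisions billable resources, so it is left out of the demo at the bottom. In practice the password would come from a secrets store rather than source code.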

Advantages of DocumentDB Cluster

  1. A DocumentDB cluster is fast because compute is separated from a distributed, purpose-built storage volume, and read traffic can be spread across replica instances.
  2. Reads from the primary instance are strongly consistent, so they return the most up-to-date data. Reads from replicas are eventually consistent and typically lag the primary only slightly.
  3. The cluster is compatible with MongoDB, so you can use familiar MongoDB drivers, APIs, and tools, making it easier to develop your applications.
  4. As workloads grow, a DocumentDB cluster can scale by adding replicas or moving to larger instances, giving you more database resources without redesigning the application.
  5. Because data is stored across multiple Availability Zones within a Region, the cluster stays durable and available even if one zone has problems. For globally distributed applications, global clusters can replicate data to other Regions for lower-latency local reads.
  6. With built-in redundancy and high availability through replication, a DocumentDB cluster keeps your data accessible even during instance failures or maintenance activities.
  7. With its fault-tolerant design, a DocumentDB cluster can handle instance failures, database recovery, and instance restarts without losing committed data and with minimal downtime, keeping your applications running smoothly.

Disadvantages of DocumentDB Cluster

  1. While a DocumentDB cluster delivers strong performance and scalability, it can put a dent in your wallet, especially if you've got a massive deployment or workloads that need a lot of throughput.
  2. Jumping into DocumentDB might require learning some new tricks, particularly if you're not familiar with NoSQL databases or how distributed systems work.
  3. Once you choose DocumentDB, you're somewhat tied to the AWS ecosystem, making it tough to pack up and move to a different cloud provider or platform down the road.
  4. Getting a DocumentDB cluster running well can be a bit of a headache, requiring some expertise in database administration, performance tuning, and strategies for keeping things fault-tolerant.
  5. DocumentDB uses a MongoDB-compatible query API rather than SQL, and it does not support every MongoDB feature. Compared to traditional SQL databases or to MongoDB itself, this puts some constraints on the types of queries and operations you can perform.

Conclusion

DocumentDB clusters provide a fast, reliable, and fully managed database service that works well with other AWS services. But there are some things to think about. They can get costly as the application grows, especially for big setups or high-availability configurations. Managing clusters with multiple instances can be difficult and requires knowledge of setup, scaling, monitoring, and troubleshooting. They may also be available only in certain AWS Regions, which could affect latency for some users. However, the speed and convenience of DocumentDB are often worth these potential drawbacks. It integrates seamlessly with other AWS services, and routine maintenance tasks are handled for us. A DocumentDB cluster also provides point-in-time recovery for up to 35 days, with incremental, continuous backups stored durably in Amazon S3. So we can conclude that it saves both time and effort for the user.

DocumentDB clusters – FAQs

What do you mean by DocumentDB?

Amazon DocumentDB is a cloud-based, MongoDB-compatible document database service offered by Amazon Web Services (AWS).

Why is DocumentDB needed?

DocumentDB allows its users to store data in a more natural, document-like format. It is a flexible database solution built for modern, cloud-based applications that need to store and manage semi-structured data at large scale.

How do DocumentDB clusters work?

A DocumentDB cluster consists of 0 to 16 instances and a cluster storage volume that manages the data for those instances. All writes go through the primary instance, while all instances (the primary and the replicas) support reads. The cluster's data is stored in the cluster volume, with copies in three different Availability Zones.

Explain any two features of DocumentDB Clusters.

  • You can increase read throughput by adding up to 15 replica instances. Replicas share the same storage as the primary, which reduces costs and improves read performance.
  • DocumentDB allows you to scale compute and memory resources for each instance up or down quickly, typically within minutes.

What are the limitations of DocumentDB Clusters?

DocumentDB uses a MongoDB-compatible query API rather than SQL, and it does not support every MongoDB feature. Compared to traditional SQL databases or to MongoDB itself, this puts some constraints on the types of queries and operations you can perform.


