
Distributed Data Storage with Amazon S3 and Amazon DynamoDB: Scalable Databases

Before moving to the hands-on part of distributed data storage with Amazon S3, let us briefly look at what Amazon S3 is.



Versioning

Cross Region Replication



Cross-Region Replication (CRR) enables automatic, asynchronous copying of objects between buckets in different AWS Regions. The source and destination buckets can be owned by the same AWS account or by different accounts.
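As a rough illustration of the paragraph above, the sketch below builds a replication configuration and applies it through an S3 client that is passed in (for example, one created with boto3.client("s3")). The bucket names, the role ARN, and the rule ID are hypothetical placeholders, and the helper names are not part of any library. Replication needs versioning on both buckets, so the sketch turns it on for the source bucket first; the destination bucket would need the same treatment.

```python
def build_replication_config(role_arn, destination_bucket, rule_id="replicate-all"):
    """Return a ReplicationConfiguration dict that replicates every object."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects (hypothetical ARN)
        "Rules": [
            {
                "ID": rule_id,
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # an empty filter matches all objects in the bucket
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


def enable_replication(s3_client, source_bucket, config):
    """Enable versioning on the source bucket, then attach the replication rules."""
    s3_client.put_bucket_versioning(
        Bucket=source_bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3_client.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=config,
    )


# Placeholder values for illustration only.
config = build_replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",
    "example-destination-bucket",
)
```

With real credentials, calling enable_replication(boto3.client("s3"), "example-source-bucket", config) would apply the rule, assuming the role already has permission to replicate.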

Distributed Data Storage with Amazon S3

Step 1: Create an IAM user group and an IAM user.

Why? Working as an IAM user instead of the root user is safer, including for cost control, because you can apply fine-grained permissions that prevent accidental and costly actions. An IAM user's limited access reduces the risk of unexpected resource provisioning or deletions that could lead to unplanned charges.

Creating an IAM user group

Creating an IAM user belonging to the above IAM user group
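The same two console actions can also be scripted. The sketch below is a minimal example that assumes an IAM client is passed in (for example, boto3.client("iam")). The group name, user name, and the choice of the AWS managed policy AmazonS3FullAccess are assumptions made for illustration; the article does not say which permissions were attached.

```python
def create_group_and_user(iam_client, group_name, user_name, policy_arn):
    """Create a group, attach one managed policy to it, and add a new user."""
    iam_client.create_group(GroupName=group_name)
    # Permissions are attached to the group so every member inherits them.
    iam_client.attach_group_policy(GroupName=group_name, PolicyArn=policy_arn)
    iam_client.create_user(UserName=user_name)
    iam_client.add_user_to_group(GroupName=group_name, UserName=user_name)


# Assumed values, not taken from the article:
EXAMPLE_GROUP = "s3-users"
EXAMPLE_USER = "demo-user"
EXAMPLE_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonS3FullAccess"
```

A call such as create_group_and_user(boto3.client("iam"), EXAMPLE_GROUP, EXAMPLE_USER, EXAMPLE_POLICY_ARN) would have to be run by an identity that is allowed to manage IAM. Console sign-in for the new user still needs a password, which this sketch does not set.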

Console sign-in Page

IAM User Sign-in Page

You are now ready to go. Remember to create your Amazon S3 bucket while signed in as the IAM user, not as the root user. That is why we created the IAM user account first.

Step 2: Create an Amazon S3 bucket (one public and one private) and upload a file to it.

Creating a public Amazon S3 bucket


In essence, this policy opens up the S3 bucket to public read access, allowing anyone with the URL of an object within the bucket to retrieve that object. This might be useful for making certain content publicly accessible on the internet, like images, documents, or other resources. However, you should be cautious and consider the security implications of granting public access to your S3 bucket, especially for sensitive or confidential data.
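For reference, a public-read bucket policy of the kind described above usually looks like the sketch below. The bucket name and the statement ID are placeholders, and the exact policy used in the original screenshots may differ.

```python
import json


def public_read_policy(bucket_name):
    """Return a bucket policy that lets anyone read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",  # arbitrary label for this statement
                "Effect": "Allow",
                "Principal": "*",  # anyone, including unauthenticated requests
                "Action": "s3:GetObject",  # read-only: no listing, writing, or deleting
                "Resource": f"arn:aws:s3:::{bucket_name}/*",  # every object in the bucket
            }
        ],
    }


# "my-example-bucket" is a placeholder name.
policy_json = json.dumps(public_read_policy("my-example-bucket"), indent=2)
print(policy_json)
```

The resulting JSON can be pasted into the bucket's policy editor, or applied with an S3 client's put_bucket_policy(Bucket=..., Policy=policy_json). S3 rejects a public policy while Block Public Access is still enabled for the bucket.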

When you open the uploaded file in the console, you will find two URLs: the “Object URL” shown in the Properties section, and the URL that opens when you click the “Open” button in the top navigation bar. Both are marked in red in the image below (the Object URL as 1, the “Open” button as 2).

Let’s look at the difference between the two URLs.

URL marked as 1 (i.e., Object URL)

In Amazon S3, the “Object URL” in the properties section of an uploaded object refers to the unique URL that provides direct access to the specific object in your S3 bucket. This URL is also known as the object’s Amazon S3 URL or endpoint.

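The object URL shown in the console typically follows the virtual-hosted style:

https://<bucket-name>.s3.<region>.amazonaws.com/<object-key>

You may also come across the older path-style form:

https://s3.amazonaws.com/<bucket-name>/<object-key>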

The object URL allows you to access and download the object directly from a web browser or through HTTP requests. It’s a convenient way to share or reference the specific object stored in your S3 bucket.
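As a small illustration, the helper below assembles an object URL in the virtual-hosted form that the console usually displays. The function name, the bucket, the key, and the Region are made up for the example, and special characters in the key are percent-encoded.

```python
from urllib.parse import quote


def object_url(bucket, key, region="us-east-1"):
    """Build a virtual-hosted-style object URL for the given bucket and key."""
    # Keep "/" unescaped so that key prefixes still look like folders.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Placeholder bucket, key, and Region.
url = object_url("my-example-bucket", "photos/my cat.png", "ap-south-1")
print(url)
```

Building the URL does not grant access to anything: the request only succeeds if the object is publicly readable.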

URL marked as 2 (i.e., Presigned URL)

A presigned URL for an object in an Amazon S3 bucket is a time-limited, signed URL that provides temporary access to that object.

This URL is generated using AWS credentials and includes a signature as part of the URL to verify permission and authenticity. It grants temporary access to perform specific actions (e.g., GET) on the object for a defined duration.

Presigned URLs are often used to grant temporary, controlled access to S3 objects without requiring the recipient to have AWS credentials, making them useful for scenarios like sharing private files or enabling temporary access for downloads.
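Outside the console, a presigned URL can be generated in code. The sketch below wraps the generate_presigned_url call that boto3 S3 clients provide; the wrapper name, the one-hour default lifetime, and the bucket and key in the usage note are assumptions for illustration.

```python
def presigned_get_url(s3_client, bucket, key, expires_in=3600):
    """Return a URL that allows a GET on one object for `expires_in` seconds."""
    return s3_client.generate_presigned_url(
        "get_object",  # the S3 operation the URL is allowed to perform
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime in seconds
    )
```

With credentials configured, presigned_get_url(boto3.client("s3"), "my-private-bucket", "report.pdf", 600) would return a link that stops working after ten minutes. The signing happens locally, so the link only works if the signing identity is itself allowed to read the object.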

Note: For a public bucket, both the “Object URL” and the presigned URL open the object in a web browser. For a private bucket, only the presigned URL works; a request to the plain Object URL without credentials is denied.

Let’s verify this with a private bucket:

Creating a private Amazon S3 bucket

Go to the AWS Management Console and search for S3. On the Amazon S3 page, go to the “Buckets” section in the left navigation bar, and then click on “Create bucket”.
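The same private bucket could also be created from code. The sketch below assumes an S3 client is passed in (for example, boto3.client("s3", region_name=region)); the function name is hypothetical. It creates the bucket and then turns on all four Block Public Access settings so the bucket stays private.

```python
def create_private_bucket(s3_client, bucket, region):
    """Create a bucket and block every form of public access to it."""
    kwargs = {"Bucket": bucket}
    # us-east-1 is the default Region and must not be passed as a LocationConstraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3_client.create_bucket(**kwargs)
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
```

A file can then be uploaded with the client's upload_file(Filename, Bucket, Key) method. Bucket names are globally unique, so a real run needs a name that nobody else is using.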

Distributed Data Storage with Amazon DynamoDB

Amazon DynamoDB

Why DynamoDB ?

DynamoDB vs Other DB Services

| Features    | DynamoDB          | RDS                           | Redshift                         | Aurora                   |
|-------------|-------------------|-------------------------------|----------------------------------|--------------------------|
| Scaling     | Scales seamlessly | Relatively easy to scale      | More complexity in scaling       | Relatively easy to scale |
| Storage     | Unlimited storage | 64 TB                         | 2 PB                             | 64 TB with RDS           |
| Pricing     | Pay-per-use model | Generally cheaper than others | On-demand model with added costs | Pricing depends upon RDS |
| Maintenance | Maintained by AWS | Maintained by AWS             | Requires more maintenance        | No maintenance required  |

DynamoDB features

Creating a Table on Amazon DynamoDB

Step 1: Go to the AWS Management Console and search for DynamoDB. On the DynamoDB page, click on “Tables” in the left navigation bar. Then click on “Create table”.

Step 2: Fill in the necessary table details. Choose a table name of your own (in my case, it’s “Product”).

Step 3: Fill in the “Partition key” field. When no sort key is defined, the partition key alone is the primary key of the table, so its value must be present and unique for every item. In my case, it’s “ProductID”. Since a “ProductID” can be alphanumeric, set its type to “String”.

Step 4: Then scroll down and click on “Create Table”.
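Steps 1 to 4 can be sketched in code as follows, assuming a DynamoDB client is passed in (for example, boto3.client("dynamodb")). The table and key names come from the steps above; the on-demand billing mode and the helper names are assumptions, since the article keeps the console defaults.

```python
def product_table_definition(table_name="Product", partition_key="ProductID"):
    """Return create_table arguments for a table keyed by one String attribute."""
    return {
        "TableName": table_name,
        # Only key attributes are declared; all other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"}
        ],
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


def create_product_table(dynamodb_client, definition):
    """Create the table and block until DynamoDB reports that it exists."""
    dynamodb_client.create_table(**definition)
    waiter = dynamodb_client.get_waiter("table_exists")
    waiter.wait(TableName=definition["TableName"])


definition = product_table_definition()
```

"HASH" is the API's name for the partition key; a sort key, if one were wanted, would be added as a second KeySchema entry with the key type "RANGE".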

Step 5: Click on the newly created table and you will be redirected to the table description. Then click on the button “Explore table items” to create the items for the table.

Step 6: Scroll down and click on “Create item”. There you can add as many attributes as you like to each item. Because DynamoDB is a NoSQL database, different items can carry different attributes. For simplicity, I am adding the same 4 attributes to each item: ProductID, Product name, Product price, and Product category. Below you can see the table I created.
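When items are written through the low-level API instead of the console, every value has to be wrapped in a type descriptor such as {"S": ...} for strings or {"N": ...} for numbers. The sketch below shows one way to do that conversion by hand. The helper names are hypothetical, and the sample values are invented because the article's actual rows are only visible in its screenshots.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Wrap a Python value in DynamoDB's typed attribute-value format."""
    if isinstance(value, bool):
        return {"BOOL": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers travel as strings on the wire
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(record):
    """Convert a plain dict into an Item suitable for put_item."""
    return {name: to_attribute_value(value) for name, value in record.items()}


# Invented sample row using the four attributes from Step 6.
item = to_item({
    "ProductID": "3",
    "Product name": "nike",
    "Product price": 4500,
    "Product category": "shoes",
})
```

The item could then be stored with dynamodb_client.put_item(TableName="Product", Item=item). boto3 also ships a TypeSerializer class that performs this conversion for a wider range of types.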

Step 7: After you have created the table, you can scan or query its items. On the same page, click on “Filters”; a dropdown menu appears with all the fields required to build a query. Some of my examples are shown in the images below.

Query – 1

  • Attribute name : ProductID
  • Type : String
  • Condition : Equal to
  • Value : 3

Query – 2

  • Attribute name : Product name
  • Type : String
  • Condition : Equal to
  • Value : nike

Query – 3

  • Attribute name : Product price
  • Type : Number
  • Condition : Less than
  • Value : 5000

Query – 4

  • Attribute name : Product category
  • Type : String
  • Condition : Equal to
  • Value : food
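The four console filters above can be expressed as Scan filter expressions. The sketch below only builds the request arguments; the helper name is hypothetical. Placeholders such as #attr are used for attribute names because names like “Product price” contain spaces.

```python
from decimal import Decimal

ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


def scan_filter(attribute, operator, value):
    """Build Scan keyword arguments for a single comparison filter."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    if isinstance(value, bool):
        raise TypeError("boolean values are not handled in this sketch")
    if isinstance(value, (int, Decimal)):
        typed_value = {"N": str(value)}
    elif isinstance(value, str):
        typed_value = {"S": value}
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    return {
        "FilterExpression": f"#attr {operator} :val",
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": {":val": typed_value},
    }


# Query 1 to Query 4 from the list above.
queries = [
    scan_filter("ProductID", "=", "3"),
    scan_filter("Product name", "=", "nike"),
    scan_filter("Product price", "<", 5000),
    scan_filter("Product category", "=", "food"),
]
```

Each entry can be passed to a DynamoDB client as dynamodb_client.scan(TableName="Product", **queries[i]). A Scan reads the whole table and filters afterwards, so for Query 1, which matches on the partition key, a Query operation with a KeyConditionExpression would normally be the cheaper choice.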

Use Cases of Amazon DynamoDB

Frequently Asked Questions On Data Storage with Amazon S3 and Amazon DynamoDB

1. What are some strategies for optimizing query performance in DynamoDB?

Strategies for optimizing query performance include designing efficient schema structures, using appropriate primary and secondary keys, and making use of partitioning and indexing.

2. How does DynamoDB handle security and access control?

DynamoDB provides fine-grained access control through AWS Identity and Access Management (IAM) and resource-based policies. You can control who can read, write, and manage your DynamoDB tables.

3. How does DynamoDB handle schema changes and data migration?

Schema changes can be handled by adding or removing attributes. You can use item versioning or create a new table with the updated schema for smooth data migration. Data migration can be achieved using Data Pipeline or custom scripts.

4. How does DynamoDB handle backup and restore operations?

DynamoDB offers automated and on-demand backups. The process involves creating backups, setting up retention policies, and performing restores. Point-in-Time Recovery allows you to restore a table to a specific moment in time.
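As a brief sketch of the two backup styles mentioned in this answer, the functions below enable Point-in-Time Recovery and take an on-demand backup through a DynamoDB client that is passed in (for example, boto3.client("dynamodb")). The function names and the backup name in the usage note are placeholders.

```python
def enable_point_in_time_recovery(dynamodb_client, table_name):
    """Turn on continuous backups so the table can be restored to a past moment."""
    dynamodb_client.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )


def create_on_demand_backup(dynamodb_client, table_name, backup_name):
    """Take a full on-demand backup and return its ARN."""
    response = dynamodb_client.create_backup(
        TableName=table_name,
        BackupName=backup_name,
    )
    return response["BackupDetails"]["BackupArn"]
```

For example, create_on_demand_backup(client, "Product", "product-before-migration") returns an ARN that can later be used to restore the backup into a new table.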

