Block storage, as the name suggests, stores data in the form of blocks. The data is split into fixed-size chunks called blocks, each with its own address but no metadata (additional information) to provide context for what that block of data is about. It is the most commonly used storage type for most applications. Block storage works best when the application and the storage are close together, because this keeps latency low; latency becomes a disadvantage when they are farther apart. Block storage is generally not accessed directly through APIs. It is controlled and accessed by an external operating system.
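The idea of fixed-size, addressed blocks with no metadata can be sketched as a toy in-memory volume. This is only an illustration: the `BlockDevice` class, its method names, and the 4096-byte block size are assumptions made for the example, not the interface of any real storage product.

```python
BLOCK_SIZE = 4096  # bytes per block; an assumed, illustrative size


class BlockDevice:
    """Toy in-memory volume: fixed-size blocks addressed by index, no metadata."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _check(self, address: int) -> None:
        if not 0 <= address < self.num_blocks:
            raise IndexError("block address out of range")

    def write_block(self, address: int, payload: bytes) -> None:
        """Overwrite exactly one block in place."""
        self._check(address)
        if len(payload) > BLOCK_SIZE:
            raise ValueError("payload larger than one block")
        start = address * BLOCK_SIZE
        # Pad short payloads with zero bytes so every block keeps the same size.
        self._data[start:start + BLOCK_SIZE] = payload.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, address: int) -> bytes:
        """Return the raw bytes of one block; the device knows nothing else about it."""
        self._check(address)
        start = address * BLOCK_SIZE
        return bytes(self._data[start:start + BLOCK_SIZE])


device = BlockDevice(num_blocks=8)
device.write_block(3, b"row 42: balance=100")
device.write_block(3, b"row 42: balance=250")  # in-place update of a single block
print(device.read_block(3).rstrip(b"\x00"))
```

Only block 3 is touched by the update; the other blocks are untouched, which is why block storage suits data that changes frequently.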
Block Storage in the cloud:
Azure Premium Storage: Allows volumes of up to 32 TB. It delivers high performance and low latency for I/O-intensive workloads running on Azure Virtual Machines.
AWS Elastic Block Store (EBS): Allows volumes of up to 16 TB in size. A volume behaves like a hard disk that can be attached to an EC2 instance, which then accesses the storage.
Rackspace Cloud Block Storage: Uses an internal network connection of up to 10GbE to reach the storage.
Object storage is used to store unstructured data such as photos, videos, and audio of any size. It suits data that is written once and read once or many times. There should not be many incremental updates, because even a small change means rewriting the whole object. An object consists of three things: data, metadata (data about the data), and a globally unique identifier. The data can be any type and amount of information that has to be stored. The metadata is contextual information about what the data is, its confidentiality, or anything else relevant to its use. The globally unique identifier is a 128-bit unique value assigned to the object so that it can be located across a distributed system.
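The three parts of an object can be sketched as follows. This is a minimal in-memory illustration, assuming a hypothetical `ObjectStore` class with `put`, `get`, and `replace` methods; it uses Python's `uuid.uuid4()` to stand in for the 128-bit identifier and does not reflect any vendor's API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes                 # the payload itself
    metadata: dict              # data about the data
    object_id: uuid.UUID = field(default_factory=uuid.uuid4)  # 128-bit identifier


class ObjectStore:
    """Hypothetical flat store: objects are looked up by ID, not by path."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> uuid.UUID:
        obj = StoredObject(data=data, metadata=dict(metadata))
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id: uuid.UUID) -> StoredObject:
        return self._objects[object_id]

    def replace(self, object_id: uuid.UUID, data: bytes, metadata: dict) -> None:
        """No partial update: the whole object is written again under the same ID."""
        self._objects[object_id] = StoredObject(data, dict(metadata), object_id)


store = ObjectStore()
photo_id = store.put(b"...jpeg bytes...", {"type": "image/jpeg", "confidential": False})
print(len(photo_id.bytes) * 8)           # size of the identifier in bits
print(store.get(photo_id).metadata["type"])
```

Note that `replace` takes the full payload even when only one byte differs, which mirrors the write-once, read-many pattern described above.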
Object Storage in the cloud:
Amazon S3: Amazon S3 stores objects in buckets and is designed for 99.999999999% (eleven nines) durability, along with high performance, cross-region replication, versioning, encryption, and flexible storage.
Google Cloud Storage: Stores data in Google Cloud and allows individual objects to be terabytes in size. It provides strong read-after-write consistency for all upload and delete operations.
What main problems does object storage solve?
The first critical problem that object storage solves is data growth. You can store any amount of data for any length of time at minimal cost. It is highly scalable: you can store petabytes of data and beyond, and when the system needs to scale out, additional nodes are added. Another advantage is data integrity. You can access your data from anywhere with a few clicks, without the access problems, such as data loss, that tend to appear when the data set is huge. Growing data also makes protecting and retrieving the full data set harder, so object storage uses erasure coding to address this. RAID protects data by replicating a disk drive's information, whereas erasure coding protects data by rebuilding chunks of data rather than a physical device.
Second, object storage simplifies data management. Gigabytes of data can be managed with rack-based techniques such as identifying failed HDDs, but this does not work for petabytes. Object storage therefore manages a namespace instead of rack space. A namespace can span one rack of storage or many, and it can be local or globally dispersed. Because provisioning is handled by expanding the storage itself, you do not have to spend much effort on management and can simply store your data.
Third, your data is well protected. Object storage keeps multiple copies at different data centers, so if one or more nodes fail you can still access your data.
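The idea of rebuilding a lost chunk from the surviving ones can be illustrated with the simplest possible scheme, a single XOR parity chunk. This is a deliberately reduced sketch: the function names are hypothetical, and production object stores typically use stronger codes (such as Reed-Solomon) that survive the loss of several chunks at once.

```python
from typing import List, Optional


def split_into_chunks(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size chunks, zero-padding the tail."""
    chunk_size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_size * k, b"\x00")
    return [padded[i * chunk_size:(i + 1) * chunk_size] for i in range(k)]


def xor_chunks(chunks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized chunks."""
    result = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            result[i] ^= byte
    return bytes(result)


def encode(data: bytes, k: int) -> List[bytes]:
    """Return k data chunks followed by one parity chunk."""
    chunks = split_into_chunks(data, k)
    return chunks + [xor_chunks(chunks)]


def rebuild(stored: List[Optional[bytes]]) -> List[bytes]:
    """Recompute at most one missing chunk (marked None) from the survivors."""
    missing = [i for i, chunk in enumerate(stored) if chunk is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost chunk")
    repaired = list(stored)
    if missing:
        survivors = [chunk for chunk in stored if chunk is not None]
        repaired[missing[0]] = xor_chunks(survivors)
    return repaired


original = b"object storage payload"
pieces = encode(original, k=4)        # imagine each piece on a different node
pieces[2] = None                      # one node fails
recovered = rebuild(pieces)
print(b"".join(recovered[:4])[:len(original)])
```

The lost chunk is recomputed from data, not from a mirrored disk, which is the distinction the paragraph above draws between erasure coding and RAID-style replication.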
What about trade-offs?
Object storage has many qualities that can improve the effectiveness of an IT department: it is scalable, performs well, and provides resilience and usability. But there are situations where object storage does not meet the needs of an application. The hierarchical structure of a file system, with its files, folders, and naming conventions, is well understood and easy for users to work with. An object, by contrast, is tied to an identifier that is hard to remember, which can be a problem for users who interact with it directly. This is why a file system is often used as a bridge to object storage: users name files and save them to directories, and the gateway later converts them into objects. The gateway itself can become a performance bottleneck, which raises the question of why, if a file-system gateway is needed anyway, a file system is not used directly.
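The gateway idea can be sketched as an index that maps human-readable paths to opaque object identifiers. The `FileGateway` class below is purely hypothetical and keeps everything in memory; it only shows where the extra lookup layer sits.

```python
import uuid
from typing import Dict, List


class FileGateway:
    """Hypothetical gateway: maps file paths to opaque object IDs."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}  # object ID -> data (stands in for the object store)
        self._index: Dict[str, str] = {}      # file path -> object ID (the gateway's extra layer)

    def save(self, path: str, data: bytes) -> str:
        """Store data as a new object and point the path at it."""
        old_id = self._index.get(path)
        if old_id is not None:
            del self._objects[old_id]         # the previous version is replaced whole
        object_id = str(uuid.uuid4())
        self._objects[object_id] = data
        self._index[path] = object_id
        return object_id

    def open(self, path: str) -> bytes:
        """Every read pays for a path lookup before the object lookup."""
        object_id = self._index[path]
        return self._objects[object_id]

    def list_dir(self, directory: str) -> List[str]:
        """Emulate a folder listing by filtering paths on a prefix."""
        prefix = directory.rstrip("/") + "/"
        return sorted(p for p in self._index if p.startswith(prefix))


gateway = FileGateway()
gateway.save("/reports/2023/q1.pdf", b"q1 data")
gateway.save("/reports/2023/q2.pdf", b"q2 data")
print(gateway.list_dir("/reports/2023"))
print(gateway.open("/reports/2023/q1.pdf"))
```

In this sketch the folder hierarchy exists only in the index; the objects themselves remain flat, and each operation goes through the additional mapping step that can become a bottleneck at scale.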
Workloads for object versus block storage
As already explained, object storage is meant for unstructured data such as static web content, backup data, and more. Because it deals with huge amounts of data, there is a limitation: you cannot update the data frequently, since an update means rewriting the whole object rather than a specific chunk. You can read your data as many times as you like, but incremental updates are problematic.
Block storage, by contrast, is for more demanding environments where you can update data whenever you want. It is typically used for real-time transactional databases, where data has to be accessed and updated regularly.
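A rough back-of-the-envelope comparison shows why small, frequent updates favor block storage. The two helper functions and the 4096-byte block size are assumptions chosen for illustration; real systems add overheads that this sketch ignores.

```python
def bytes_rewritten_object(object_size: int, change_size: int) -> int:
    """Whole-object replacement: any change rewrites the full object."""
    return object_size if change_size > 0 else 0


def bytes_rewritten_block(offset: int, change_size: int, block_size: int = 4096) -> int:
    """In-place update: only blocks overlapping the changed byte range are rewritten."""
    if change_size <= 0:
        return 0
    first_block = offset // block_size
    last_block = (offset + change_size - 1) // block_size
    return (last_block - first_block + 1) * block_size


one_gib = 1024 ** 3
# Change 100 bytes at byte offset 10,000 of a 1 GiB data set.
print(bytes_rewritten_object(one_gib, change_size=100))      # 1073741824
print(bytes_rewritten_block(offset=10_000, change_size=100))  # 4096
```

Under these assumptions a 100-byte change costs one 4 KiB block write on block storage but a full 1 GiB rewrite on object storage.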
Object storage in practice
Today object storage has emerged as a leading storage type, and many companies use it. Data stored as objects can extend to exabytes. You need to consider the architecture of your application before you can decide which storage type will be beneficial. Amazon is a leading provider, offering Amazon S3 as an object storage service. As your data grows, nodes are added to scale out. The best use cases for object storage are backup files, unstructured data, database dumps, and log files.
Eventual Consistency and Strong Consistency
To improve availability, object storage makes multiple copies of the data and stores them across a distributed system. This is where eventual consistency and strong consistency come in. With eventual consistency, the latest version is first stored on one node and replicated to the others later. With strong consistency, replication starts as soon as the data is stored, and the write acknowledgment is delayed until all copies have been written. This is why eventual consistency offers high availability and durability, but suits data that is relatively static rather than data that changes often. It also has a drawback: a read is not guaranteed to return the latest version of the data. For these reasons, object storage is a good fit for videos, photos, and other unstructured data that does not need to be altered regularly.
Strong consistency is for real-time systems such as transaction processing and databases, where the most recent version of the data must always be returned. Therefore, when eventual consistency is acceptable, object storage is typically used, and when strong consistency is required, block storage is used.
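The difference between the two write paths can be sketched with a toy simulation. The `ReplicatedStore` class and its method names are hypothetical, and "replication" here is just copying between in-memory dictionaries; the point is only when the write is acknowledged relative to when replicas are updated.

```python
from typing import Any, Dict, List, Optional, Tuple


class ReplicatedStore:
    """Toy store with one primary copy and several replicas, all in memory."""

    def __init__(self, replica_count: int = 2) -> None:
        self.primary: Dict[str, Any] = {}
        self.replicas: List[Dict[str, Any]] = [{} for _ in range(replica_count)]
        self._pending: List[Tuple[str, Any]] = []  # acknowledged but not yet replicated

    def write_eventual(self, key: str, value: Any) -> None:
        """Acknowledge after the primary write; replicas catch up later."""
        self.primary[key] = value
        self._pending.append((key, value))

    def write_strong(self, key: str, value: Any) -> None:
        """Acknowledge only after every replica holds the new value."""
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value
        # Drop older queued writes for this key so they cannot overwrite it later.
        self._pending = [(k, v) for k, v in self._pending if k != key]

    def sync(self) -> None:
        """Background replication: apply queued writes to all replicas in order."""
        for key, value in self._pending:
            for replica in self.replicas:
                replica[key] = value
        self._pending.clear()

    def read_from_replica(self, key: str, index: int = 0) -> Optional[Any]:
        return self.replicas[index].get(key)


store = ReplicatedStore()
store.write_strong("photo", "v1")
store.write_eventual("photo", "v2")
print(store.read_from_replica("photo"))  # stale read: replica still has v1
store.sync()
print(store.read_from_replica("photo"))  # replicas have converged on v2
```

The eventual write returns sooner but leaves a window in which a replica serves the old version; the strong write closes that window at the cost of a slower acknowledgment.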
|Factors||Object storage||Block storage|
|Scalability||Can scale almost without limit, to petabytes and beyond.||Scales up to a limit because of the addressing.|
|Access||Can be accessed directly through APIs or HTTP.||Can only be accessed through an external operating system.|
|Performance||Higher performance for big content.||High performance with databases and transactional data.|
|Analytics||Contains metadata and a unique identifier.||No metadata, but contains the address of the block.|
|Consistency||Eventual consistency||Strong consistency|
|Updates||Written once and read once or multiple times. Doesn't provide incremental updates.||Flexible to update at any time and can be written or read.|
|Data protection||Stores multiple copies of data over a distributed system.||Block storage systems offer RAID, erasure coding, and multi-site replication.|
|Use cases||Storage for backup files, unstructured data, database dumps, and log files.||Ideal for databases, server-side processing such as Java, and running mission-critical applications such as Oracle.|
|Examples||Amazon S3, Google Cloud Storage, Azure Blob Storage, Rackspace Cloud Files||Azure Premium Storage, AWS Elastic Block Store, Rackspace Cloud Block Storage, Google Persistent Disk|