RAID (Redundant Arrays of Independent Disks)

Last Updated : 08 Nov, 2023

RAID is a technique that combines multiple disks, instead of relying on a single disk, to increase performance, provide data redundancy, or both. The term was coined by David Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987.

Why Data Redundancy?

Data redundancy takes up extra space but makes the storage more reliable. If a disk fails and the same data is also stored on another disk, we can retrieve the data and continue operating. On the other hand, if the data is spread across multiple disks without any redundancy, the loss of a single disk can affect the entire data set.

Key Evaluation Points for a RAID System

Reliability: How many disk faults can the system tolerate?
Availability: What fraction of the total time is the system up, i.e. how available is the system for actual use?
Performance: How good is the response time? How high is the throughput (rate of processing work)? Note that performance involves many parameters, not just these two.
Capacity: Given a set of N disks each with B blocks, how much useful capacity is available to the user?
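The capacity question can be made concrete with a small helper. The sketch below simply encodes the capacity formulas given in the per-level evaluations of this article (N*B, N*B/2 for mirroring level 2, and (N-1)*B). The RAID-6 figure of (N-2)*B is not stated in the article and is added here as the standard result for two parity blocks per stripe. The function name `usable_blocks` is made up for illustration.

```python
def usable_blocks(level: int, n_disks: int, blocks_per_disk: int) -> int:
    """Usable capacity, in blocks, of N disks with B blocks each."""
    n, b = n_disks, blocks_per_disk
    if level == 0:
        return n * b              # striping only, nothing spent on redundancy
    if level == 1:
        return n * b // 2         # assumes mirroring level 2
    if level in (4, 5):
        return (n - 1) * b        # one disk's worth of space holds parity
    if level == 6:
        return (n - 2) * b        # assumption: two disks' worth of parity
    raise ValueError(f"no capacity formula for RAID level {level}")


if __name__ == "__main__":
    for level in (0, 1, 4, 5, 6):
        print(f"RAID-{level}:", usable_blocks(level, n_disks=4, blocks_per_disk=100))
```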

RAID is transparent to the host system: the array appears as a single large disk that presents itself as a linear array of blocks. This allows older storage to be replaced by RAID without making many changes to the existing code.

Different RAID Levels

RAID-0 (Striping)
RAID-1 (Mirroring)
RAID-2 (Bit-Level Striping with Dedicated Parity)
RAID-3 (Byte-Level Striping with Dedicated Parity)
RAID-4 (Block-Level Striping with Dedicated Parity)
RAID-5 (Block-Level Striping with Distributed Parity)
RAID-6 (Block-Level Striping with two Parity Bits)

Raid Controller

1. RAID-0 (Striping)

Blocks are “striped” across disks.

RAID-0

In the figure, blocks “0,1,2,3” form a stripe.
Instead of placing just one block on a disk at a time, we can place two (or more) consecutive blocks on a disk before moving on to the next one.
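The placement rule described above can be sketched as a mapping from a logical block number to a disk and an offset on that disk. This is a minimal illustration, not taken from any particular RAID implementation; the names `locate_block` and `chunk_size` (the number of consecutive blocks placed on a disk before moving on) are assumptions made for this example.

```python
def locate_block(logical_block: int, n_disks: int, chunk_size: int = 1):
    """Map a logical block number to (disk index, block offset on that disk)."""
    chunk = logical_block // chunk_size      # which chunk the block belongs to
    within = logical_block % chunk_size      # position of the block inside its chunk
    disk = chunk % n_disks                   # chunks are dealt round-robin to disks
    stripe = chunk // n_disks                # row of chunks spanning all disks
    return disk, stripe * chunk_size + within


if __name__ == "__main__":
    # One block per disk at a time: blocks 0..3 form the first stripe.
    print([locate_block(b, n_disks=4) for b in range(8)])
    # Two blocks per disk before moving on to the next disk.
    print([locate_block(b, n_disks=4, chunk_size=2) for b in range(8)])
```

With `chunk_size=1` and four disks, blocks 0 to 3 land on disks 0 to 3 at offset 0, and block 4 wraps around to disk 0 at offset 1.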

Raid-0

Evaluation

Reliability: 0
There is no duplication of data. Hence, a block once lost cannot be recovered.
Capacity: N*B
The entire space is being used to store data. Since there is no duplication, N disks each having B blocks are fully utilized.

Advantages

It is easy to implement.
It uses the full storage capacity, since no space is spent on redundancy.

Disadvantages

A single drive loss can result in the complete failure of the system.
Not a good choice for a critical system.

2. RAID-1 (Mirroring)

Each block is stored on more than one disk. Thus, every block has two (or more) copies, each lying on a different disk.

Raid-1

The above figure shows a RAID-1 system with mirroring level 2.
RAID-0 cannot tolerate any disk failure, but RAID-1 can, because every block has a copy on another disk.

Evaluation

Assume a RAID system with mirroring level 2.

Reliability: 1 to N/2
One disk failure can always be handled, because every block on the failed disk has a duplicate on some other disk. If we are lucky and disks 0 and 2 fail, this can again be handled, as the blocks of these disks have duplicates on disks 1 and 3. So, in the best case, N/2 disk failures can be handled.
Capacity: N*B/2
Only half the space is being used to store data. The other half is just a mirror of the already stored data.
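The "1 to N/2" reliability range can be checked with a short sketch. It assumes mirroring level 2 and that disks are paired as (0,1), (2,3), and so on, which matches the example above where disks 0 and 2 are backed by disks 1 and 3. The function name `raid1_survives` is hypothetical.

```python
def raid1_survives(failed_disks, n_disks: int) -> bool:
    """Return True if every mirror pair still has at least one working disk.

    Assumes mirroring level 2 with pairs (0,1), (2,3), ...
    """
    failed = set(failed_disks)
    for first in range(0, n_disks, 2):
        if first in failed and first + 1 in failed:
            return False                     # both copies of these blocks are gone
    return True


if __name__ == "__main__":
    print(raid1_survives({0, 2}, n_disks=4))   # one disk lost from each pair
    print(raid1_survives({0, 1}, n_disks=4))   # both disks of one pair lost
```

Losing disks 0 and 2 leaves one copy of every block, while losing disks 0 and 1 destroys both copies of the same blocks, which is why only one failure is guaranteed to be survivable.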

Advantages

It provides complete redundancy.
It can improve data safety and read speed.

Disadvantages

It is expensive, since every block is stored two (or more) times.
Usable storage capacity is reduced (halved for mirroring level 2).

3. RAID-2 (Bit-Level Striping with Dedicated Parity)

In RAID-2, data is striped across the disks at the bit level, and errors are checked at the bit level. A Hamming code is used to detect and correct errors in the data.
It uses dedicated drives to store the Hamming-code parity bits.
The structure of RAID-2 is complex because it uses two groups of disks: one group stores the bits of each data word, and the other stores the error-correcting code for that word.
It is not commonly used.
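To illustrate the Hamming-code idea behind RAID-2, here is a minimal sketch of the classic Hamming(7,4) code, which protects 4 data bits with 3 parity bits and can locate and fix a single flipped bit. Treating each of the 7 bit positions as living on its own disk is an assumption made for illustration; the article does not state which code width a RAID-2 array uses, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4        # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4        # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4        # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based position of the flipped bit or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1          # flip the faulty bit back
    return c, position


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    damaged = list(word)
    damaged[4] ^= 1                   # simulate one bad bit (position 5)
    fixed, where = hamming74_correct(damaged)
    print(word, damaged, fixed, where)
```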

Advantages

It can detect and correct errors using a Hamming code.
It keeps the error-correcting code on dedicated drives, separate from the data.

Disadvantages

It has a complex structure and a high cost due to the extra drives.
It requires extra drives for error detection and correction.

4. RAID-3 (Byte-Level Striping with Dedicated Parity)

It consists of byte-level striping with a dedicated parity disk.
At this level, parity information is computed for each group of striped bytes and written to a dedicated parity drive.
Whenever a drive fails, the parity drive is read together with the surviving drives to reconstruct the lost data.

Raid-3

Here Disk 3 contains the parity bits for Disk 0, Disk 1, and Disk 2. If one of the data disks is lost, we can reconstruct its contents from Disk 3 and the remaining data disks.
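A small sketch of this layout, assuming three data disks and one parity disk as in the description above: bytes are dealt round-robin to the data disks, and the parity disk holds the XOR of the bytes in each row. The function names and the zero padding of a short final row are assumptions made for illustration.

```python
def raid3_write(data: bytes, n_data: int = 3):
    """Stripe bytes round-robin over n_data disks and compute the parity disk."""
    padded = data + bytes(-len(data) % n_data)      # zero-pad to a full row
    disks = [bytearray(padded[i::n_data]) for i in range(n_data)]
    parity = bytearray(len(disks[0]))
    for disk in disks:
        for i, byte in enumerate(disk):
            parity[i] ^= byte                       # XOR of the bytes in each row
    return disks, parity


def raid3_rebuild(disks, parity, lost: int) -> bytearray:
    """Recompute the contents of data disk `lost` from the parity and the rest."""
    rebuilt = bytearray(parity)
    for index, disk in enumerate(disks):
        if index != lost:
            for i, byte in enumerate(disk):
                rebuilt[i] ^= byte
    return rebuilt


if __name__ == "__main__":
    disks, parity = raid3_write(b"HELLO!")
    print([bytes(d) for d in disks], bytes(parity))
    print(bytes(raid3_rebuild(disks, parity, lost=1)))   # same bytes as disks[1]
```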

Advantages

Data can be transferred in bulk.
Data can be accessed in parallel.

Disadvantages

It requires an additional drive for parity.
It performs slowly for small files, since every access involves all the disks.

5. RAID-4 (Block-Level Striping with Dedicated Parity)

Instead of duplicating data, this adopts a parity-based approach.

Raid-4

In the figure, we can observe one column (disk) dedicated to parity.
Parity is calculated using a simple XOR function. If the data bits are 0,0,0,1 the parity bit is XOR(0,0,0,1) = 1. If the data bits are 0,1,1,0 the parity bit is XOR(0,1,1,0) = 0. In other words, an even number of ones results in parity 0, and an odd number of ones results in parity 1.

Raid-4

Assume that in the above figure, C3 is lost due to some disk failure. Then, we can recompute the data bit stored in C3 by looking at the values of all the other columns and the parity bit. This allows us to recover lost data.
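The parity rule and the recovery step can be sketched in a few lines. The two parity values below are the ones worked out in the text; the recovery example assumes that the third data bit of the row 0,1,1,0 is the one that was lost. The function names are hypothetical.

```python
from functools import reduce
from operator import xor


def parity(bits):
    """XOR of all data bits: 0 for an even number of ones, 1 for an odd number."""
    return reduce(xor, bits, 0)


def recover_missing(surviving_bits, parity_bit):
    """XOR the parity bit with the surviving bits to get the lost bit back."""
    return reduce(xor, surviving_bits, parity_bit)


if __name__ == "__main__":
    print(parity([0, 0, 0, 1]))           # 1, as in the text
    print(parity([0, 1, 1, 0]))           # 0, as in the text
    # Row 0,1,1,0 with parity 0; suppose the third bit is on the failed disk.
    print(recover_missing([0, 1, 0], 0))  # 1, the lost bit
```

The same XOR works on whole blocks by applying it to every bit position, which is why a single parity disk can stand in for any one failed data disk.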

Evaluation

Reliability: 1
RAID-4 allows recovery of at most 1 disk failure (because of the way parity works). If more than one disk fails, there is no way to recover the data.
Capacity: (N-1)*B
One disk in the system is reserved for storing the parity. Hence, (N-1) disks are made available for data storage, each disk having B blocks.

Advantages

It can reconstruct the data if at most one disk is lost.

Disadvantages

It cannot reconstruct the data when more than one disk is lost.

6. RAID-5 (Block-Level Striping with Distributed Parity)

This is a slight modification of the RAID-4 system where the only difference is that the parity rotates among the drives.

Raid-5

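In the figure, we can notice how the parity block “rotates” from one disk to the next with each stripe.
This was introduced to improve random write performance: every write must also update parity, and with rotation those parity updates are spread over all the disks instead of queuing up on a single parity disk.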
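The rotation can be sketched as follows. Since the figure is not reproduced here, the exact rotation rule is an assumption: this sketch places the parity of stripe 0 on the last disk and moves it one disk to the left for each following stripe. The second function shows the standard XOR identity for updating parity after a single block changes, which is the operation that rotation spreads across the disks. All function names are hypothetical.

```python
def raid5_parity_disk(stripe: int, n_disks: int) -> int:
    """Disk holding the parity block of a stripe (assumed rotation rule)."""
    return (n_disks - 1) - (stripe % n_disks)


def raid5_layout(n_stripes: int, n_disks: int):
    """Rows of 'P' (parity) and 'D' (data) markers, one row per stripe."""
    return [
        ["P" if disk == raid5_parity_disk(stripe, n_disks) else "D"
         for disk in range(n_disks)]
        for stripe in range(n_stripes)
    ]


def updated_parity(old_parity: int, old_data: int, new_data: int) -> int:
    """New parity after one data block changes, without reading the other blocks."""
    return old_parity ^ old_data ^ new_data


if __name__ == "__main__":
    for row in raid5_layout(n_stripes=4, n_disks=4):
        print(" ".join(row))
```

Over any N consecutive stripes each disk holds the parity block exactly once, so small writes to different stripes update parity on different disks.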

Evaluation

Reliability: 1
RAID-5 allows recovery of at most 1 disk failure (because of the way parity works). If more than one disk fails, there is no way to recover the data. This is identical to RAID-4.
Capacity: (N-1)*B
Overall, space equivalent to one disk is utilized in storing the parity. Hence, (N-1) disks are made available for data storage, each disk having B blocks.

Advantages

Data can be reconstructed using parity bits.
It improves performance, particularly for random writes, compared with RAID-4.

Disadvantages

Its technology is complex, and extra space is required for parity.
If two disks fail at the same time, data will be lost forever.

7. RAID-6 (Block-Level Striping with two Parity Bits)

RAID-6 helps when there is more than one disk failure: it can tolerate up to two simultaneous failures. A pair of independent parities is generated for each stripe and stored across the disks at this level. You need at least four disk drives for this level.
There are also hybrid RAIDs, which nest more than one RAID level (for example, RAID 10, which stripes data across mirrored pairs) to fulfill specific requirements.
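To show how a pair of independent parities can recover two lost disks, here is a sketch of one common construction. It is an assumption that this is the scheme in use, since the article does not say how the two parities are computed. The first parity P is the plain XOR of the data, and the second parity Q is a weighted XOR computed in the finite field GF(2^8), here with the reducing polynomial 0x11D and generator 2. For brevity each disk holds a single byte, and all function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using the reducing polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    return gf_pow(a, 254)             # a**255 == 1 for every nonzero a


def compute_pq(data):
    """P is the XOR of all data bytes; Q is the XOR of 2**i * data[i]."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data, p, q, x, y):
    """Recover data[x] and data[y]; the values stored at x and y are ignored."""
    a, b = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            a ^= d                            # leaves Dx ^ Dy
            b ^= gf_mul(gf_pow(2, i), d)      # leaves 2**x * Dx ^ 2**y * Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(b ^ gf_mul(gy, a), gf_inv(gx ^ gy))
    return dx, a ^ dx


if __name__ == "__main__":
    data = [0x12, 0x34, 0x56, 0x78]
    p, q = compute_pq(data)
    damaged = [0x12, 0x00, 0x56, 0x00]        # disks 1 and 3 have failed
    print([hex(v) for v in recover_two(damaged, p, q, 1, 3)])
```

Because P and Q give two independent equations over the same unknowns, any two missing data bytes can be solved for, which a second plain XOR parity could not provide.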

Raid-6

Advantages

Very high data availability, since two disk failures can be tolerated.
Fast read data transactions.

Disadvantages

Due to double parity, it has slow write data transactions.
Extra space, equivalent to two disks, is required for parity.

Advantages of RAID

Data redundancy: By keeping copies of the data (or parity information) on multiple disks, RAID can protect data from disk failures.
Performance enhancement: RAID can enhance performance by distributing data over several drives, enabling the simultaneous execution of several read/write operations.
Scalability: RAID is scalable; storage capacity can be expanded by adding more disks to the array.
Versatility: RAID is applicable to a wide range of devices, such as workstations, servers, and personal computers.

Disadvantages of RAID

Cost: RAID implementation can be costly, particularly for arrays with large capacities.
Complexity: The setup and management of RAID might be challenging.
Decreased performance: The parity calculations necessary for some RAID configurations, including RAID 5 and RAID 6, may result in a decrease in speed.
Single point of failure: Although RAID offers data redundancy, it is not a comprehensive backup solution. The entire contents of the array could be lost if the RAID controller malfunctions.

Conclusion

In conclusion, RAID technology distributes and replicates data across several drives to improve performance and reliability. It is a useful tool in contemporary database setups, since it helps preserve system availability and protect sensitive data against disk failures.
