Stable-Storage Implementation in Operating Systems
By definition, information residing in stable storage is never lost. Even if the disk or the CPU suffers an error, no data is lost.
To achieve such storage, we need to replicate the required information on multiple storage devices with independent failure modes. The writing of an update must be coordinated so that a failure during the update cannot leave all copies in a damaged state, and so that, when recovering from a failure, we can force all copies to a consistent and correct value even if another failure occurs during the recovery. This article discusses how to meet these needs.
A disk write operation results in one of the following outcomes:
Figure – Outcomes of Disk
- Successful completion –
The data will be written correctly on the disk.
- Partial Failure –
A failure occurred in the middle of the data transfer, so only some of the sectors were written with the new data, and the sector being written at the time of the failure may have been corrupted.
- Total Failure –
The failure occurred before the disk write started, so the previous data values on the disk remain intact.
Whenever a failure occurs while a block is being written, the system must first detect the failure and then invoke a recovery procedure to restore the block to a consistent state. To do that, the system must maintain two physical blocks for each logical block.
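The article does not say how a failure is detected. One common approach, assumed here purely for illustration, is to store a checksum with each block so that a partially written block can be recognized when it is read back. The helper names `seal` and `is_intact` are hypothetical, and CRC-32 is only one possible choice of checksum.

```python
import zlib

def seal(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 so corruption can be detected later."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def is_intact(block: bytes) -> bool:
    """Return True if the stored CRC-32 matches the payload that follows it."""
    if len(block) < 4:
        return False
    stored = int.from_bytes(block[:4], "big")
    return stored == zlib.crc32(block[4:])

# A fully written block passes the check.
good = seal(b"new data")

# Simulate a partial failure: the tail of the block never reached the disk.
torn = good[:8] + b"\x00" * (len(good) - 8)
```

With a check like this, `is_intact(good)` holds while `is_intact(torn)` does not, which is the kind of "detectable error" the recovery procedure relies on.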
An output operation is executed as follows:
Figure – Process of execution of output operation
- Write the information onto the first physical block.
- When the first write completes successfully, write the same information onto the second physical block.
- Declare the operation complete only after the second write completes successfully.
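The three steps above can be sketched as follows. This is a minimal in-memory illustration, not a real driver: `StableBlock`, `DiskError`, and the `fail_on` parameter (used only to inject a failure) are hypothetical names, and a Python list stands in for the two physical blocks.

```python
class DiskError(Exception):
    """Raised by the stand-in device when a physical write fails."""

class StableBlock:
    """One logical block backed by two physical copies."""

    def __init__(self, initial: bytes = b"", fail_on=None):
        self.copies = [initial, initial]
        self.fail_on = fail_on  # index of the copy whose write should fail, if any

    def _write_physical(self, index: int, data: bytes) -> None:
        # Stand-in for a real device write.
        if index == self.fail_on:
            raise DiskError(f"write to physical block {index} failed")
        self.copies[index] = data

    def write(self, data: bytes) -> bool:
        self._write_physical(0, data)  # step 1: write the first physical block
        self._write_physical(1, data)  # step 2: runs only if step 1 succeeded
        return True                    # step 3: both writes done, declare complete
```

Because the second write starts only after the first has finished, a failure at any point leaves at most one copy in doubt. For example, `StableBlock(b"old", fail_on=1).write(b"new")` raises `DiskError` and leaves the first copy new and the second copy old.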
During recovery from a failure, each pair of physical blocks is examined. If both are the same and no detectable error exists, no further action is necessary. If one block contains a detectable error, we replace its contents with the value of the other block. If neither block contains a detectable error but the blocks differ in content, we replace the content of the first block with that of the second. This recovery procedure ensures that a write to stable storage either succeeds completely or results in no change.
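The recovery rules can be written as a short function. In this sketch a copy that was read back with a detectable error is represented as `None`; that convention, the function name `recover`, and the exception raised when both copies are damaged are assumptions made for illustration.

```python
from typing import List, Optional

def recover(copies: List[Optional[bytes]]) -> None:
    """Bring the two physical copies of one logical block back in agreement.

    copies[0] and copies[1] are the first and second physical blocks.
    None means the block was read back with a detectable error.
    """
    first, second = copies
    if first is None and second is None:
        # Not covered by the recovery rules: every copy has been destroyed.
        raise RuntimeError("both copies are damaged; the data is lost")
    if first is None:
        copies[0] = second   # repair the first block from the second
    elif second is None:
        copies[1] = first    # repair the second block from the first
    elif first != second:
        copies[0] = second   # both readable but different: second overwrites first
    # Otherwise both copies already match and nothing needs to be done.
```

If the failure hit between the two writes, the copies are `[new, old]` and recovery yields `[old, old]`, so the write appears never to have happened. If the failure damaged the second copy while it was being written, the copies are `[new, None]` and recovery yields `[new, new]`, so the write appears to have completed.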
This procedure can be extended to use an arbitrarily large number of copies of each block of stable storage. A larger number of copies further reduces the probability of a failure, but it is usually reasonable to simulate stable storage with only two copies. The data in stable storage is guaranteed to be safe unless a failure destroys all the copies.
Because waiting for disk writes to complete is time consuming, many storage arrays add NVRAM as a cache. Since this memory is non-volatile, it can be trusted to hold the data en route to the disks, and it is therefore considered part of the stable storage. Writes to NVRAM are much faster than writes to disk, so performance is greatly improved.
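The idea of acknowledging a write once it is in NVRAM, and moving it to disk later, can be sketched as a small write-back cache. Everything here is a hypothetical stand-in: `NvramCachedDisk` is an invented name, and two Python dictionaries play the roles of the NVRAM and the disk, so nothing in this sketch actually survives a power loss.

```python
class NvramCachedDisk:
    """Write-back cache: writes are acknowledged once they are held in NVRAM."""

    def __init__(self):
        self.nvram = {}  # stand-in for non-volatile RAM
        self.disk = {}   # stand-in for the slower disk

    def write(self, block_no: int, data: bytes) -> bool:
        # The caller does not wait for the disk; the data is safe in NVRAM.
        self.nvram[block_no] = data
        return True

    def read(self, block_no: int):
        # The newest version of a block may still be sitting in NVRAM.
        if block_no in self.nvram:
            return self.nvram[block_no]
        return self.disk.get(block_no)

    def flush(self) -> None:
        # Move pending blocks to disk; drop each one from NVRAM only
        # after its disk write has been performed.
        for block_no in list(self.nvram):
            self.disk[block_no] = self.nvram[block_no]
            del self.nvram[block_no]
```

After `write(7, b"data")`, a `read(7)` already returns the new data even though the disk has not been touched; `flush()` later copies it to the disk and empties the NVRAM.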