Bad Block in Operating System
A bad block is an area of a storage medium that is no longer reliable for storing data because it is damaged or corrupted.
Because disks have moving parts and small tolerances, they are prone to failure. When the failure is complete, the disk must be replaced and its contents restored from backup media to the new disk. More frequently, one or more sectors become defective. Most disks even come from the factory with bad blocks.
A bad block is also referred to as a bad sector.
Causes of Bad Blocks :
Storage drives can ship from the factory with defective blocks that originated in the manufacturing process. These bad blocks are marked as defective before the device leaves the factory and are remapped to the drive’s spare memory cells.
Physical damage to a device can also produce bad blocks, leaving the operating system unable to access the data stored in them. Dropping a laptop, for example, can damage the platters of its HDD. Dust can also damage an HDD.
In a solid-state drive, a bad block can result when a memory transistor fails. Storage cells can also become unreliable over time, because a NAND flash cell becomes unusable after a certain number of program-erase cycles.
The erase process on a solid-state drive requires applying a large electrical charge to the flash cells. This degrades the oxide layer that separates the floating-gate transistors from the flash memory’s silicon substrate, and bit error rates increase. The drive’s controller can use error detection and correction mechanisms to fix these errors. At some point, however, the errors can outstrip the controller’s ability to correct them, and the cell becomes unreliable.
Soft bad sectors are caused by software problems. For instance, if a computer shuts down unexpectedly, the hard drive may lose power in the middle of writing to a block. The data in the block then no longer matches its CRC error-detection code, and the block is marked as a bad sector.
Types of Bad Blocks :
There are two types of bad blocks –
- Physical or Hard bad block : It results from physical damage to the storage medium.
- Soft or Logical bad block : A soft, or logical, bad block occurs when the operating system (OS) is unable to read data from a sector.
A soft bad block is detected when the cyclic redundancy check (CRC) or error correction code (ECC) stored for a particular block does not match the data read from the disk.
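The CRC mismatch behind a soft bad block can be illustrated with a minimal sketch. It assumes a hypothetical sector layout in which a CRC-32 is stored next to the data; `write_sector`, `read_sector`, and the simulated power loss are illustrative names only, and real drives perform this check in firmware with stronger codes.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size in bytes


def write_sector(data: bytes) -> dict:
    """Store the data together with its CRC-32 (hypothetical layout)."""
    return {"data": bytearray(data), "crc": zlib.crc32(data)}


def read_sector(sector: dict) -> bytes:
    """Return the data, or raise if the stored CRC does not match it."""
    if zlib.crc32(bytes(sector["data"])) != sector["crc"]:
        raise IOError("CRC mismatch: sector reported as bad")
    return bytes(sector["data"])


# A complete write reads back correctly.
sector = write_sector(b"A" * SECTOR_SIZE)
assert read_sector(sector) == b"A" * SECTOR_SIZE

# Simulate power loss in the middle of a rewrite: only the first half of
# the new data reaches the medium and the stored CRC is never updated.
sector["data"][: SECTOR_SIZE // 2] = b"B" * (SECTOR_SIZE // 2)

try:
    read_sector(sector)
    torn_write_detected = False
except IOError:
    torn_write_detected = True
```

In this sketch the medium itself is undamaged; rewriting the whole sector with fresh data and a fresh CRC would make it readable again, which is what separates a soft bad block from a hard one.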
How Bad Blocks are Handled :
Bad blocks are handled in a variety of ways, depending on the disk and controller in use.
On simple disks, such as some disks with IDE controllers, bad blocks are handled manually. One strategy is to scan the disk for bad blocks while it is being formatted. Any bad blocks that are discovered are flagged as unusable so that the file system does not allocate them. If blocks go bad during normal operation, a special program (such as the Linux badblocks command) must be run manually to search for the bad blocks and lock them away.
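The scan-and-flag strategy can be sketched roughly as follows. This is a toy model, not how the badblocks command works internally: `SimulatedDisk`, `scan_for_bad_blocks`, and `allocate` are hypothetical names, and the disk is assumed to raise an error whenever a defective block is read.

```python
class SimulatedDisk:
    """Toy disk: reading a block listed in `defective` raises an error."""

    def __init__(self, num_blocks: int, defective: set):
        self.num_blocks = num_blocks
        self.defective = set(defective)

    def read(self, block: int) -> bytes:
        if block in self.defective:
            raise IOError(f"unreadable block {block}")
        return b"\x00" * 512


def scan_for_bad_blocks(disk: SimulatedDisk) -> set:
    """Try to read every block and collect the ones that fail."""
    bad = set()
    for block in range(disk.num_blocks):
        try:
            disk.read(block)
        except IOError:
            bad.add(block)
    return bad


def allocate(num_blocks: int, bad: set, in_use: set) -> int:
    """Return the first free block that is not flagged as bad."""
    for block in range(num_blocks):
        if block not in bad and block not in in_use:
            in_use.add(block)
            return block
    raise RuntimeError("no usable free blocks")


disk = SimulatedDisk(num_blocks=8, defective={0, 1, 5})
bad_blocks = scan_for_bad_blocks(disk)   # found during the format-time scan
in_use = set()
first_block = allocate(disk.num_blocks, bad_blocks, in_use)   # skips 0 and 1
second_block = allocate(disk.num_blocks, bad_blocks, in_use)
```

The bad-block set here plays the role of the file system's list of unusable blocks: the allocator consults it on every request, so a flagged block is never handed out.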
More sophisticated disks are smarter about bad-block recovery. The controller maintains a list of the bad blocks on the disk. The list is initialized during low-level formatting at the factory and is updated over the life of the disk. Low-level formatting also sets aside spare sectors that are not visible to the operating system. The controller can then be told to replace each bad sector logically with one of the spare sectors. This scheme is known as sector sparing or forwarding.
A typical bad-sector transaction is as follows –
- Suppose the operating system wants to read logical block 80.
- The controller calculates the ECC and finds that the sector is bad. It reports this finding to the operating system.
- The next time the system is rebooted, a special command is run to tell the controller to replace the bad sector with a spare sector.
- From then on, whenever the system requests logical block 80, the controller translates the request into the replacement sector’s address.
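The transaction above can be modelled with a small sketch. It assumes a toy controller in which spare sectors are numbered just past the last visible block and the ECC failure is simulated with a set of defective sectors; `SparingController` and its methods are hypothetical names, not a real controller interface.

```python
class SparingController:
    """Toy controller that forwards bad logical blocks to spare sectors."""

    def __init__(self, num_blocks: int, num_spares: int):
        # Assumption: spares sit past the last visible block, hidden from the OS.
        self.free_spares = list(range(num_blocks, num_blocks + num_spares))
        self.pending = set()     # blocks reported bad, awaiting remap
        self.remap = {}          # logical block -> spare sector
        self.defective = set()   # physical sectors that fail ECC (simulated)

    def read(self, block: int) -> int:
        """Return the physical sector that served the request."""
        physical = self.remap.get(block, block)
        if physical in self.defective:
            self.pending.add(block)
            raise IOError(f"block {block} is bad")
        return physical

    def remap_pending(self) -> None:
        """Stands in for the special command run at the next reboot."""
        for block in sorted(self.pending):
            if not self.free_spares:
                raise RuntimeError("out of spare sectors")
            self.remap[block] = self.free_spares.pop(0)
        self.pending.clear()


ctrl = SparingController(num_blocks=100, num_spares=4)
ctrl.defective.add(80)

try:
    ctrl.read(80)            # ECC check fails; the error goes back to the OS
    error_reported = False
except IOError:
    error_reported = True

ctrl.remap_pending()         # remap performed at the next reboot
served_by = ctrl.read(80)    # later requests are forwarded to the spare
```

The operating system keeps asking for block 80 throughout; only the controller's private table changes, which is why the remapping is invisible to the file system.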
Such redirection by the controller could invalidate any optimization by the operating system’s disk-scheduling algorithm. For this reason, most disks are formatted to provide a few spare sectors in each cylinder and a spare cylinder as well. When a bad block is remapped, the controller uses a spare sector from the same cylinder, if possible; otherwise it falls back to the spare cylinder.
As an alternative to sector sparing, some controllers can replace a bad block by a technique called sector slipping.
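A rough sketch of the spare-selection policy follows. The geometry (10 sectors per cylinder), the sector numbers, and the names `cylinder_of` and `pick_spare` are assumptions made purely for illustration; real controllers implement this in firmware with their own layouts.

```python
SECTORS_PER_CYLINDER = 10   # assumed geometry, for illustration only


def cylinder_of(sector: int) -> int:
    """Cylinder number under the simplified flat geometry assumed here."""
    return sector // SECTORS_PER_CYLINDER


def pick_spare(bad_sector: int, spares_by_cylinder: dict, spare_cylinder: list) -> int:
    """Prefer a spare in the bad sector's own cylinder, else use the spare cylinder."""
    local = spares_by_cylinder.get(cylinder_of(bad_sector), [])
    if local:
        return local.pop(0)
    if spare_cylinder:
        return spare_cylinder.pop(0)
    raise RuntimeError("no spare sectors left")


# Cylinder 2 holds sectors 20-29 and reserves sector 29 as its one spare.
spares_by_cylinder = {2: [29]}
spare_cylinder = [990, 991]   # hypothetical sectors in the spare cylinder

near_spare = pick_spare(23, spares_by_cylinder, spare_cylinder)  # same cylinder
far_spare = pick_spare(27, spares_by_cylinder, spare_cylinder)   # local spare used up
```

Choosing a spare in the same cylinder means the head does not have to seek to reach the replacement, so the scheduler's assumptions about block locality stay roughly valid.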
Example of sector slipping –
Suppose that logical block 16 becomes defective and the first available spare sector follows sector 200. Sector slipping then remaps all the sectors from 16 to 200, moving them all down one spot. That is, sector 200 is copied into the spare, then sector 199 into 200, then 198 into 199, and so on, until sector 17 is copied into sector 18.
Slipping the sectors in this way frees up the space of sector 17 so that sector 16 can be mapped to it.
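The slipping example can be traced with a short sketch. It models the physical sectors as a Python list in which index 201 is the spare that follows sector 200; the list model and the names `slip_sectors` and `physical_sector` are assumptions for illustration only.

```python
def slip_sectors(sectors: list, bad: int, spare: int) -> None:
    """Shift sectors bad+1 .. spare-1 down one spot, in place.

    Copies run from the highest sector downward so that nothing is
    overwritten before it has been moved. Afterwards slot bad+1 is free.
    """
    for src in range(spare - 1, bad, -1):
        sectors[src + 1] = sectors[src]
    sectors[bad + 1] = None  # freed slot; block `bad` will be mapped here


def physical_sector(logical: int, bad: int, spare: int) -> int:
    """Logical-to-physical mapping once slipping has been done."""
    if bad <= logical < spare:
        return logical + 1
    return logical


# Physical sectors 0..200 hold data; sector 201 is the first available spare.
sectors = [f"data-{i}" for i in range(201)] + [None]
slip_sectors(sectors, bad=16, spare=201)
```

After the call, the old contents of sector 200 sit in the spare, the old contents of sector 17 sit in sector 18, and sector 17 is empty, so logical block 16 can be mapped to it. Unlike sparing, the blocks stay in their original order on the disk.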
The replacement of a bad block is generally not totally automatic, because the data in the bad block are usually lost. A soft error may trigger a process in which a copy of the block data is made and the block is spared or slipped. An unrecoverable hard error, however, results in lost data. Whatever file was using that block must be repaired (for instance, by restoring it from a backup), and that requires manual intervention.