Log based Recovery in DBMS

Last Updated : 13 Nov, 2023

The atomicity property of DBMS states that either all the operations of transactions must be performed or none. The modifications done by an aborted transaction should not be visible to the database and the modifications done by the committed transaction should be visible. To achieve our goal of atomicity, the user must first output stable storage information describing the modifications, without modifying the database itself. This information can help us ensure that all modifications performed by committed transactions are reflected in the database. This information can also help us ensure that no modifications made by an aborted transaction persist in the database.

Log and log records

The log is a sequence of log records, recording all the updated activities in the database. In stable storage, logs for each transaction are maintained. Any operation which is performed on the database is recorded on the log. Prior to performing any modification to the database, an updated log record is created to reflect that modification. An update log record represented as: <Ti, Xj, V1, V2> has these fields:

Transaction identifier: Unique Identifier of the transaction that performed the write operation.
Data item: Unique identifier of the data item written.
Old value: Value of data item prior to write.
New value: Value of data item after write operation.

Other types of log records are:

<Ti start>: It contains information about when a transaction Ti starts.
<Ti commit>: It contains information about when a transaction Ti commits.
<Ti abort>: It contains information about when a transaction Ti aborts.

Undo and Redo Operations

Because all database modifications must be preceded by the creation of a log record, the system has available both the old value prior to the modification of the data item and new value that is to be written for data item. This allows system to perform redo and undo operations as appropriate:

Undo: using a log record sets the data item specified in log record to old value.
Redo: using a log record sets the data item specified in log record to new value.

The database can be modified using two approaches –

Deferred Modification Technique: If the transaction does not modify the database until it has partially committed, it is said to use deferred modification technique.
Immediate Modification Technique: If database modification occur while the transaction is still active, it is said to use immediate modification technique.

Recovery using Log records

After a system crash has occurred, the system consults the log to determine which transactions need to be redone and which need to be undone.

Transaction Ti needs to be undone if the log contains the record <Ti start> but does not contain either the record <Ti commit> or the record <Ti abort>.
Transaction Ti needs to be redone if log contains record <Ti start> and either the record <Ti commit> or the record <Ti abort>.

Use of Checkpoints – When a system crash occurs, user must consult the log. In principle, that need to search the entire log to determine this information. There are two major difficulties with this approach:

The search process is time-consuming.
Most of the transactions that, according to our algorithm, need to be redone have already written their updates into the database. Although redoing them will cause no harm, it will cause recovery to take longer.

To reduce these types of overhead, user introduce checkpoints. A log record of the form <checkpoint L> is used to represent a checkpoint in log where L is a list of transactions active at the time of the checkpoint. When a checkpoint log record is added to log all the transactions that have committed before this checkpoint have <Ti commit> log record before the checkpoint record. Any database modifications made by Ti is written to the database either prior to the checkpoint or as part of the checkpoint itself. Thus, at recovery time, there is no need to perform a redo operation on Ti. After a system crash has occurred, the system examines the log to find the last <checkpoint L> record. The redo or undo operations need to be applied only to transactions in L, and to all transactions that started execution after the record was written to the log. Let us denote this set of transactions as T. Same rules of undo and redo are applicable on T as mentioned in Recovery using Log records part. Note that user need to only examine the part of the log starting with the last checkpoint log record to find the set of transactions T, and to find out whether a commit or abort record occurs in the log for each transaction in T. For example, consider the set of transactions {T0, T1, . . ., T100}. Suppose that the most recent checkpoint took place during the execution of transaction T67 and T69, while T68 and all transactions with subscripts lower than 67 completed before the checkpoint. Thus, only transactions T67, T69, . . ., T100 need to be considered during the recovery scheme. Each of them needs to be redone if it has completed (that is, either committed or aborted); otherwise, it was incomplete, and needs to be undone.

Log-based recovery is a technique used in database management systems (DBMS) to recover a database to a consistent state in the event of a failure or crash. It involves the use of transaction logs, which are records of all the transactions performed on the database.

In log-based recovery, the DBMS uses the transaction log to reconstruct the database to a consistent state. The transaction log contains records of all the changes made to the database, including updates, inserts, and deletes. It also records information about each transaction, such as its start and end times.

When a failure occurs, the DBMS uses the transaction log to determine which transactions were incomplete at the time of the failure. It then performs a series of operations to undo the incomplete transactions and redo the completed ones. This process is called the redo/undo recovery algorithm.

The redo operation involves reapplying the changes made by completed transactions that were not yet saved to the database at the time of the failure. This ensures that all changes are applied to the database.

The undo operation involves undoing the changes made by incomplete transactions that were saved to the database at the time of the failure. This restores the database to a consistent state by reversing the effects of the incomplete transactions.

Once the redo and undo operations are completed, the DBMS can bring the database back online and resume normal operations.

Log-based recovery is an essential feature of modern DBMSs and provides a reliable mechanism for recovering from failures and ensuring the consistency of the database.

Advantages of Log based Recovery

Durability: In the event of a breakdown, the log file offers a dependable and long-lasting method of recovering data. It guarantees that in the event of a system crash, no committed transaction is lost.
Faster Recovery: Since log-based recovery recovers databases by replaying committed transactions from the log file, it is typically faster than alternative recovery methods.
Incremental Backup: Backups can be made in increments using log-based recovery. Just the changes made since the last backup are kept in the log file, rather than creating a complete backup of the database each time.
Lowers the Risk of Data Corruption: By making sure that all transactions are correctly committed or canceled before they are written to the database, log-based recovery lowers the risk of data corruption.

Disadvantages of Log based Recovery

Additional overhead: Maintaining the log file incurs an additional overhead on the database system, which can reduce the performance of the system.
Complexity: Log-based recovery is a complex process that requires careful management and administration. If not managed properly, it can lead to data inconsistencies or loss.
Storage space: The log file can consume a significant amount of storage space, especially in a database with a large number of transactions.
Time-Consuming: The process of replaying the transactions from the log file can be time-consuming, especially if there are a large number of transactions to recover.

Conclusion

In conclusion, data integrity and system reliability are maintained by log-based recovery in database management systems. It minimizes data loss and ensures reliability by assisting in the continuous restoration of databases following failures.

Suggest improvement

Polygraph to check View Serializability in DBMS

Timestamp based Concurrency Control

Share your thoughts in the comments

Basic of DBMS

Entity Relationship Model

Relational Model

Relational Algebra

Functional Dependencies

Normalisation

Transactions and Concurrency Control

Indexing, B and B+ trees

File organization

DBMS Interview questions and Last minute notes

DBMS GATE Previous Year Questions

Log based Recovery in DBMS

Log and log records

Undo and Redo Operations

Recovery using Log records

Advantages of Log based Recovery

Disadvantages of Log based Recovery

Conclusion

Please Login to comment...

Similar Reads

What kind of Experience do you want to share?