Why do we need Checkpoints ?
Whenever transaction logs are created in a real-time environment, it eats up lots of storage space. Also keeping track of every update and its maintenance may increase the physical space of the system. Eventually, the transaction log file may not be handled as the size keeps growing. This can be addressed with checkpoints. The methodology utilized for removing all previous transaction logs and storing them in permanent storage is called a Checkpoint.
What is a Checkpoint ?
The checkpoint is used to declare a point before which the DBMS was in the consistent state, and all transactions were committed. During transaction execution, such checkpoints are traced. After execution, transaction log files will be created.
Upon reaching the savepoint/checkpoint, the log file is destroyed by saving its update to the database. Then a new log is created with upcoming execution operations of the transaction and it will be updated until the next checkpoint and the process continues.
How to use Checkpoints in database ?
- Write begin_checkpoint record into log.
- Collect checkpoint data in the stable storage.
- Write end_checkpoint record into log.
The behavior when the system crashes and recovers when concurrent transactions are executed is shown below –
- The recovery system reads the logs backward from the end to the last checkpoint i.e. from T4 to T1.
- It will keep track of two lists – Undo and Redo.
- Whenever there is a log with instruction <Tn, start>and <Tn, commit> or only <Tn, commit> then it will put that transaction in Redo List. T2 and T3 contain <Tn, Start> and <Tn, Commit> whereas T1 will have only <Tn, Commit>. Here, T1, T2, and T3 are in the redo list.
- Whenever a log record with no instruction of commit or abort is found, that transaction is put to Undo List <Here, T4 has <Tn, Start> but no <Tn, commit> as it is an ongoing transaction. T4 will be put in the undo list.
All the transactions in the redo-list are deleted with their previous logs and then redone before saving their logs. All the transactions in the undo-list are undone and their logs are deleted.
Relevance of Checkpoints :
A checkpoint is a feature that adds a value of C in ACID-compliant to RDBMS. A checkpoint is used for recovery if there is an unexpected shutdown in the database. Checkpoints work on some intervals and write all dirty pages (modified pages) from logs relay to data file from i.e from a buffer to physical disk. It is also known as the hardening of dirty pages. It is a dedicated process and runs automatically by SQL Server at specific intervals. The synchronization point between the database and transaction log is served with a checkpoint.
Advantages of using Checkpoints :
- It speeds up data recovery process.
- Most of the dbms products automatically checkpoints themselves.
- Checkpoint records in log file is used to prevent unnecessary redo operations.
- Since dirty pages are flushed out continuously in the background, it has very low overhead and can be done frequently.
Real-Time Applications of Checkpoints :
- Whenever an application is tested in real-time environment that may have modified the database, it is verified and validated using checkpoints.
- Checkpoints are used to create backups and recovery prior to applying any updates in the database.
- The recovery system is used to return the database to the checkpoint state.
Attention reader! Don’t stop learning now. Get hold of all the important CS Theory concepts for SDE interviews with the CS Theory Course at a student-friendly price and become industry ready.