Transaction Recovery in Distributed Systems
Distributed transaction processing lets transactions run efficiently across several sites. However, a transaction may fail for many reasons: system failure, hardware failure, network errors, incorrect or invalid data, and application bugs are all possible causes. Such failures cannot be avoided entirely, so the distributed transaction system must handle them. When errors occur, the system must be able to detect and correct them; this process is called transaction recovery. Recovery is among the most difficult tasks in distributed databases, and recovering from a failed communication network is particularly hard.
Consider the following scenario to see how a transaction failure may occur. Suppose we have two parties, X and Y. X sends a message to Y and expects a response, but the response never arrives.
Several explanations are possible:
- The message was never delivered to Y because of a network problem.
- The reply sent by Y was not delivered to X.
- Y itself has failed.
From X’s side these cases look the same, which is why locating the source of a problem in a large communication network is extremely challenging.
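The ambiguity in this scenario can be illustrated with a small simulation. This is only a sketch: the `Failure` enum and the `send_and_wait` function are hypothetical names made up for this example, and a timeout is modelled simply as a return value of `None`.

```python
from enum import Enum


class Failure(Enum):
    """Hypothetical failure modes for one request/reply exchange."""
    NONE = "none"
    REQUEST_LOST = "request lost"
    REPLY_LOST = "reply lost"
    RECEIVER_DOWN = "receiver down"


def send_and_wait(failure):
    """Return what sender X observes: the reply text, or None on timeout."""
    if failure is Failure.REQUEST_LOST:
        return None  # the network dropped the message before it reached Y
    if failure is Failure.RECEIVER_DOWN:
        return None  # Y has failed and cannot answer
    reply = "ack"  # Y received and processed the message
    if failure is Failure.REPLY_LOST:
        return None  # Y answered, but the reply never reached X
    return reply


if __name__ == "__main__":
    for failure in Failure:
        print(failure.value, "->", send_and_wait(failure))
```

All three failure modes produce the same observation for X, so X cannot tell from the timeout alone whether Y processed the message.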
Distributed commit is another major issue that complicates recovery in a distributed database: all sites involved in a transaction must agree on whether to commit or abort it.
One of the best-known methods of transaction recovery is the Two-Phase Commit Protocol (2PC). It uses two types of nodes: the coordinator and the subordinates. The coordinator’s process is attached to the user application, and communication channels are established between the coordinator and the subordinates.
As the name implies, the two-phase commit protocol has two phases. The first is the PREPARE phase, in which the transaction’s coordinator sends a PREPARE message to the subordinates. The second is the decision phase, in which the coordinator sends a COMMIT message if all of the nodes can complete the transaction, or an ABORT message if at least one subordinate cannot. 2PC can be carried out in three ways: centralized, linear, and distributed.
- Centralized 2PC: In centralized 2PC, subordinates communicate only with the coordinator’s process; no communication between subordinates is permitted. The coordinator sends the PREPARE message to the subordinates, and once it has received and evaluated all of their votes, it decides whether to commit or abort. This method has two phases:
  - First phase: When the user wants to COMMIT a transaction, the coordinator sends a PREPARE message to all subordinates. A subordinate that is willing to COMMIT writes a PREPARE log record, sends a YES VOTE, and enters the PREPARED state. A subordinate that is not willing to COMMIT writes an abort record and sends a NO VOTE. A subordinate that sends a NO VOTE does not need to enter the PREPARED state, because it knows the coordinator will decide to abort. The NO VOTE therefore acts as a veto: a single NO VOTE is enough to cancel the transaction.
  - Second phase: After the coordinator has reached a decision, it must communicate that decision to the subordinates. If the decision is COMMIT, the coordinator enters the committing state and sends a COMMIT message to all subordinates. When the subordinates receive the COMMIT message, they enter the committing state and send the coordinator an acknowledgement (ACK) message. The transaction is complete when the coordinator has received all the ACK messages. If the coordinator instead decides to ABORT, it sends an ABORT message to every subordinate that voted YES; subordinates that voted NO have already aborted and do not need to be notified.
- Linear 2PC: In linear 2PC, subordinates can communicate with one another. The sites are numbered 1 to N, with site 1 being the coordinator. The PREPARE message is propagated sequentially from one site to the next, so the transaction takes longer to complete than with the centralized or distributed approaches. Finally, it is node N that sends out the global COMMIT.
- Distributed 2PC: In distributed 2PC, all of the nodes communicate with one another. Unlike the other 2PC variants, this procedure does not require the second phase, because every node receives all the votes and can reach the decision itself. Distributed 2PC starts when the coordinator sends a PREPARE message to all participating nodes. When a participant receives the PREPARE message, it sends its vote to all other participants. To know that every node has cast its vote, each node must hold a list of all participating nodes, so each node keeps track of the participants of every transaction.
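The centralized variant described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real implementation: there is no networking, durable logging, or failure handling, and the names `Vote`, `Subordinate`, `can_commit`, `decide`, and `centralized_2pc` are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vote(Enum):
    YES = "YES"
    NO = "NO"


@dataclass
class Subordinate:
    name: str
    can_commit: bool  # hypothetical flag: is this node willing to commit?
    state: str = "INITIAL"
    log: list = field(default_factory=list)

    def on_prepare(self):
        """Phase 1: write a log record and vote."""
        if self.can_commit:
            self.log.append("PREPARE")
            self.state = "PREPARED"
            return Vote.YES
        # A NO voter aborts at once and never enters the PREPARED state.
        self.log.append("ABORT")
        self.state = "ABORTED"
        return Vote.NO

    def on_decision(self, decision):
        """Phase 2: apply the coordinator's decision and acknowledge it."""
        self.log.append(decision)
        self.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"
        return "ACK"


def decide(votes):
    """Veto rule: COMMIT only if every vote is YES."""
    return "COMMIT" if all(v is Vote.YES for v in votes) else "ABORT"


def centralized_2pc(subordinates):
    """Run both phases from the coordinator's point of view."""
    # Phase 1: send PREPARE to every subordinate and collect the votes.
    votes = {s.name: s.on_prepare() for s in subordinates}
    decision = decide(votes.values())
    # Phase 2: notify the YES voters; NO voters have already aborted.
    acks = [
        s.on_decision(decision)
        for s in subordinates
        if votes[s.name] is Vote.YES
    ]
    return decision, acks


if __name__ == "__main__":
    nodes = [Subordinate("A", True), Subordinate("B", False)]
    print(centralized_2pc(nodes))
    print([(n.name, n.state, n.log) for n in nodes])
```

In the distributed variant, each node would apply the same `decide` rule itself to the votes it received from the other participants, which is why no separate decision message is needed there.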