Various Failures in Distributed System

Last Updated : 23 Nov, 2022

Distributed shared memory (DSM) implements the shared-memory model in a distributed system that has no physically shared memory. The shared-memory model provides a virtual address space shared among any number of nodes. The DSM system hides the remote communication mechanism from the application writer, preserving the programming ease and portability typical of shared-memory systems.


The main types of failure in a distributed system are explained below.

1. Process failure : 
In this type of failure, the affected process generally halts and cannot continue executing. Sometimes the process instead runs to the end but produces an incorrect outcome. A process failure causes the system state to deviate from its specification, and the process may fail to make progress.

  • Behavior – 
    An incorrect computation, such as a protection violation, deadlock, timeout, or bad user input, causes the process to stop executing.
  • Recovery – 
    A process failure can be handled by aborting the process or restarting it from a prior state.
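
The restart-from-a-prior-state recovery described above can be sketched as follows. This is a minimal, single-machine illustration, not a real recovery protocol: the names `CheckpointedProcess`, `run_step`, `add_item`, and `faulty_step` are hypothetical, and the "state" is just an in-memory dictionary.

```python
import copy


class CheckpointedProcess:
    """Toy process that can roll back to its last saved state."""

    def __init__(self, state):
        self.state = state
        self._checkpoint = copy.deepcopy(state)

    def checkpoint(self):
        # Save a private copy of the current state.
        self._checkpoint = copy.deepcopy(self.state)

    def restart(self):
        # Discard the current state and resume from the saved copy.
        self.state = copy.deepcopy(self._checkpoint)


def run_step(proc, step):
    """Run one step; on any failure, roll back and return False."""
    try:
        step(proc.state)
    except Exception:
        proc.restart()
        return False
    proc.checkpoint()
    return True


def add_item(state):
    state["items"].append(len(state["items"]))


def faulty_step(state):
    state["items"].append("partial")  # partial update before the fault
    raise RuntimeError("simulated protection violation")


proc = CheckpointedProcess({"items": []})
ok_first = run_step(proc, add_item)      # succeeds and is checkpointed
ok_second = run_step(proc, faulty_step)  # fails and is rolled back
print(ok_first, ok_second, proc.state)   # True False {'items': [0]}
```

The partial update made by the failing step is discarded because the process resumes from the last checkpoint rather than from its corrupted state.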

2. System failure : 
In a system failure, the processor associated with the distributed system fails to execute. This is caused by software errors and hardware problems, such as CPU, memory, or bus failure. It is assumed that whenever the system stops executing because of a fault, its internal state is lost.

  • Behavior – 
    It concerns the physical and logical units of the processor. The system may freeze or reboot, or it may stop functioning altogether and sit idle.
  • Recovery – 
    Recover by rebooting the system as soon as possible, then identifying and correcting the point of failure and the erroneous state.

3. Secondary storage device failure : 
A secondary storage device failure is said to have occurred when the stored information cannot be accessed. This failure is usually caused by a parity error, a head crash, or dust particles settled on the medium.

  • Behavior – 
    Stored information can’t be accessed.
  • Errors causing failure – 
    Parity error, head crash, etc.
  • Recovery/Design strategies – 
    Reconstruct the content from the archive and the log of activities, and design a mirrored disk system. A system failure can additionally be classified as follows.
    • An amnesia failure
    • A partial amnesia failure
    • A pause failure
    • A halting failure
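
Reconstructing content from an archive and a log of activities, as mentioned in the recovery strategy for storage failures, can be sketched as below. This is only an illustration under assumed formats: the archive is a plain dictionary snapshot, and each log record is a hypothetical `(operation, key, value)` tuple. Real systems use their own on-disk formats.

```python
def recover(archive, log):
    """Rebuild current contents from an archived snapshot plus a log.

    archive: dict holding the last archived copy of the data.
    log: list of (op, key, value) records written after the archive,
         in the order they were applied.
    """
    data = dict(archive)  # start from the archived copy
    for op, key, value in log:
        if op == "put":
            data[key] = value
        elif op == "delete":
            data.pop(key, None)
        else:
            raise ValueError(f"unknown log operation: {op!r}")
    return data


archive = {"a": 1, "b": 2}
log = [("put", "b", 20), ("put", "c", 3), ("delete", "a", None)]
restored = recover(archive, log)
print(restored)  # {'b': 20, 'c': 3}
```

Replaying the log in order on top of the archive yields the state the device held just before it failed, provided the log itself was stored on a device that survived.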

4. Communication medium failure : 
A communication medium failure occurs when one site cannot communicate with another operational site in the network. It is typically caused by the failure of switching nodes and/or links of the communication system.

  • Behavior – 
    A site cannot communicate with another operational site.
  • Errors/Faults – 
    Failure of switching nodes or communication links.
  • Recovery/Design strategies – 
    Rerouting and error-resistant communication protocols.
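
Rerouting around a failed link or switching node can be illustrated with a simple path search. The sketch below assumes a small, fully known topology given as undirected links; the function `find_route` and the site names are hypothetical, and real networks rely on routing protocols rather than a central search like this.

```python
from collections import deque


def find_route(links, src, dst, failed_links=(), failed_nodes=()):
    """Breadth-first search for a shortest path that avoids failures.

    Returns the path as a list of nodes, or None if dst is unreachable.
    """
    down_links = {frozenset(link) for link in failed_links}
    down_nodes = set(failed_nodes)
    if src in down_nodes or dst in down_nodes:
        return None

    # Build adjacency lists from the links that are still usable.
    adjacency = {}
    for a, b in links:
        if frozenset((a, b)) in down_links or a in down_nodes or b in down_nodes:
            continue
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    queue = deque([[src]])
    visited = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in adjacency.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
normal = find_route(links, "A", "D")
rerouted = find_route(links, "A", "D", failed_links=[("B", "D")])
cut_off = find_route(links, "A", "D", failed_links=[("B", "D")], failed_nodes=["E"])
print(normal)    # ['A', 'B', 'D']
print(rerouted)  # ['A', 'C', 'E', 'D']
print(cut_off)   # None
```

When no alternative path exists, as in the last call, the two sites are partitioned and rerouting alone cannot restore communication.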

Failure Models:

1. Timing failure: 
A timing failure occurs when a node in a system correctly sends a response, but the response arrives earlier or later than anticipated. Timing failures are also known as performance failures.

2. Response failure: 
A response failure occurs when a server's response is incorrect. Either the value of the response is wrong, or the server deviates from the correct flow of control.

3. Omission failure: 
An omission failure can be viewed as an "infinitely late" timing failure: the node's response never arrives, as if it had never been sent.
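
The relationship between timing and omission failures can be made concrete with a small classifier. This is a sketch under assumptions: the acceptable delay window (`earliest`, `latest`) is a hypothetical parameter that a real system would take from its specification, and a delay of `None` stands for a response that never arrived.

```python
def classify_response(delay, earliest, latest):
    """Classify one response by its delay in seconds.

    delay is None when no response was ever received.
    """
    if delay is None:
        return "omission failure"
    if delay < earliest:
        return "timing failure (early)"
    if delay > latest:
        return "timing failure (late)"
    return "on time"


print(classify_response(0.05, earliest=0.01, latest=0.10))   # on time
print(classify_response(0.005, earliest=0.01, latest=0.10))  # timing failure (early)
print(classify_response(0.50, earliest=0.01, latest=0.10))   # timing failure (late)
print(classify_response(None, earliest=0.01, latest=0.10))   # omission failure
```

In practice an observer can only wait a finite time, so a very late response and an omitted one look the same until the response finally shows up.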

4. Crash failure: 
If a node suffers an omission failure and from then on stops responding entirely, this is known as a crash failure.
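
The difference between an isolated omission and a crash can be sketched by looking at a node's reply history. The function `classify_history` below is hypothetical, and it assumes the observer already knows, for each request, whether a reply arrived. The result is only a suspicion, since a node that has been silent so far might still answer later.

```python
def classify_history(replies):
    """Classify a node from its reply history.

    replies: list of booleans, True if the node answered that request.
    """
    if all(replies):
        return "no failure observed"
    first_miss = replies.index(False)
    if any(replies[first_miss:]):
        # The node answered again after missing a reply.
        return "omission failure"
    # Every request from the first miss onward went unanswered.
    return "suspected crash failure"


print(classify_history([True, True, True]))    # no failure observed
print(classify_history([True, False, True]))   # omission failure
print(classify_history([True, False, False]))  # suspected crash failure
```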

5. Arbitrary failure : 
A server may produce arbitrary responses at arbitrary times. This is also known as a Byzantine failure.
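
One common way to mask arbitrary responses is to ask several replicas and accept only a value that a strict majority agrees on. The sketch below shows just that voting step. It assumes that a majority of the replicas are correct and that their replies are hashable values; it is not a full agreement protocol, and the function name `vote` is hypothetical.

```python
from collections import Counter


def vote(responses):
    """Return the value reported by a strict majority, or None if there is none."""
    if not responses:
        return None
    value, count = Counter(responses).most_common(1)[0]
    return value if count > len(responses) / 2 else None


print(vote([42, 42, 7]))  # 42
print(vote([1, 2, 3]))    # None
```

With three replicas, one faulty reply is outvoted by the two correct ones. If no value reaches a majority, the caller learns only that the replies cannot be trusted.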
