Distributed System – Thrashing in Distributed Shared Memory
In this article, we will understand thrashing in a distributed system. Before that, let us understand what a distributed system is and why thrashing occurs. In simple terms, a distributed system is a network of computers or devices located at different places and linked together. Each of these distributed computers shares the same state. For example, if we open a bank account in the Delhi branch of ABC bank, the account will be reflected in every branch of ABC bank across the country and the world.
In earlier times, a single machine held the database. Now, thanks to distributed systems, we can save information on any device, and the changes will be reflected everywhere. This shows how important a distributed system is, as all our information is stored in it. For inter-process communication, shared memory is frequently utilized. Distributed shared memory (DSM) allows applications on separate computers to work together and share memory. Distributed shared memory can experience 'thrashing': even when multiple tasks modify distinct pieces of data, the majority of the time may be consumed by data synchronization rather than useful computation (for example, when those distinct pieces happen to lie in the same shared block). The result is that the progress made by each process is very small.
Scheduling of concurrent programs is one of the most significant and still unresolved issues. The pertinent questions are: how to properly utilize idle resources, and how to share the machines fairly across the activities? Local scheduling of processes is an easy and scalable solution, in which every workstation schedules its processes independently. In a distributed system, however, coordinated scheduling of concurrent work across the nodes of a multiprocessor is also required. If the processes that make up a parallel job are not co-scheduled, they may experience significant communication latencies, and the resulting thrashing can cause excessive transmission delays and poor efficiency. With devices connected by high-speed networks whose delays are measured in microseconds, the ability to co-schedule becomes an essential factor in determining throughput.
Thrashing occurs when the system spends a major portion of its time transferring shared data blocks from one node to another, compared with the time spent doing the useful work of executing application processes. If not handled carefully, thrashing degrades system performance considerably.
Situations that can cause thrashing:
- Ping-pong effect: when processes make interleaved data accesses on two or more nodes, a data block may move back and forth from one node to another in quick succession. This is known as the ping-pong effect.
- When blocks with read-only permission are repeatedly invalidated soon after they are replicated. This is caused by poor locality of reference.
- When data is modified by multiple nodes at the same instant.
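The ping-pong effect above can be illustrated with a small sketch. This is a hypothetical model, not any real DSM implementation: the `Block` class and its `access` method are illustrative names, and the only rule modeled is that a block must reside on the node that accesses it.

```python
class Block:
    """A shared data block that must reside on the node that accesses it."""

    def __init__(self, owner):
        self.owner = owner
        self.transfers = 0  # how many times the block moved between nodes

    def access(self, node):
        if node != self.owner:   # block must migrate before the access
            self.owner = node
            self.transfers += 1


block = Block(owner="A")

# Two nodes interleave their accesses to the same block ...
for _ in range(5):
    block.access("A")
    block.access("B")

# ... so the block bounces on almost every access: the ping-pong effect.
print(block.transfers)  # 9 transfers for 10 accesses
```

Nearly every access triggers a network transfer, which is exactly the situation where transfer time dominates useful work.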
How to control Thrashing?
1. Providing application-controlled locks
- Data is locked for a short period of time, preventing other nodes from accessing it mid-update and thus preventing thrashing.
- For this method, an application-controlled lock can be associated with each data block.
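A minimal sketch of this idea, using a single-machine `threading.Lock` as a stand-in for a per-block DSM lock (the `LockedBlock` and `update` names are assumptions for illustration):

```python
import threading


class LockedBlock:
    """A data block paired with an application-controlled lock (sketch)."""

    def __init__(self, data):
        self.data = data
        self._lock = threading.Lock()

    def update(self, fn):
        # Hold the lock only for the short duration of the update, so the
        # block cannot be pulled away mid-modification.
        with self._lock:
            self.data = fn(self.data)


block = LockedBlock(0)
workers = [
    threading.Thread(
        target=lambda: [block.update(lambda d: d + 1) for _ in range(1000)]
    )
    for _ in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(block.data)  # 4000: every increment applied exactly once
```

The key point is the short critical section: the lock protects each update but is released immediately afterwards, so no node holds the block longer than necessary.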
2. Nailing a block to a node for a minimum amount of time (t):
- A block is not allowed to be taken away from a node until a minimum amount of time t has elapsed after it was allocated to that node.
- The time t can be fixed statically or tuned dynamically.
- The problem with this method is choosing an appropriate value for t.
- There are two ways to tune the value of t:
- Based on the past access pattern of the block, and
- Based on the length of the queue of processes waiting to access that block.
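The nailing idea above can be sketched as follows. This is an illustrative model with assumed names (`PinnedBlock`, `try_migrate`); the tuning rule, which shrinks t when the waiting queue grows, is one possible heuristic, not a standard algorithm.

```python
import time


class PinnedBlock:
    """A block that cannot migrate until t seconds after it arrives (sketch)."""

    def __init__(self, owner, t=0.05):
        self.owner = owner
        self.t = t                          # minimum residence time
        self.acquired_at = time.monotonic()
        self.waiters = 0                    # queue length, used to tune t

    def try_migrate(self, node):
        elapsed = time.monotonic() - self.acquired_at
        if elapsed < self.t:                # block is still nailed in place
            self.waiters += 1
            return False
        self.owner = node
        self.acquired_at = time.monotonic()
        # Dynamic tuning: shrink t when many processes are queued, so the
        # block is not held too long; grow it back when contention is low.
        self.t = self.t / 2 if self.waiters > 2 else min(self.t * 1.1, 0.1)
        self.waiters = 0
        return True


block = PinnedBlock(owner="A", t=0.05)
print(block.try_migrate("B"))  # False: the minimum time t has not yet passed
time.sleep(0.06)
print(block.try_migrate("B"))  # True: t has elapsed, migration is allowed
```

A refused migration counts the requester as a waiter, which feeds the queue-length signal mentioned in the second tuning method.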
3. Tailoring the coherence algorithm to the shared data usage patterns
- Different coherence protocols can be used for shared data with different characteristics, to minimize thrashing.
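A sketch of how such tailoring might look: a protocol is picked per block from its observed access pattern. The protocol names (write-invalidate, write-update, migration, read-only replication) are common in DSM literature, but the selection thresholds here are assumptions chosen for illustration.

```python
def choose_protocol(reads, writes, sharers):
    """Pick a coherence strategy from a block's access counts (heuristic)."""
    if writes == 0:
        # Never written: copies can never become stale.
        return "replicate read-only copies"
    if sharers == 1:
        # Only one node uses the block: just move it there.
        return "migrate block to its single user"
    if reads / max(writes, 1) > 10:
        # Many readers per write: pushing updates beats invalidating copies.
        return "write-update"
    # Frequent writes: updating every replica would be wasted traffic.
    return "write-invalidate"


print(choose_protocol(reads=500, writes=0, sharers=8))   # replicate read-only copies
print(choose_protocol(reads=300, writes=5, sharers=4))   # write-update
print(choose_protocol(reads=20, writes=40, sharers=4))   # write-invalidate
```

The point is not the specific thresholds but that a single fixed protocol cannot suit read-mostly, write-heavy, and single-user blocks equally well.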