The goal of distributed scheduling is to distribute a system’s load across available resources in a way that optimizes overall system performance while maximizing resource utilization.
The basic idea is to shift workload from heavily loaded machines to idle or lightly loaded ones.
To fully utilize the computing capacity of a distributed system, good resource allocation schemes are required. A distributed scheduler is the resource management component of a distributed operating system that disperses the system’s load among its computers in a fair and transparent manner, with the goal of maximizing overall system performance. A locally distributed system consists of a group of independent computers connected by a local area network. Users submit tasks for processing at their host computers. Because tasks arrive unpredictably and their CPU service times are random, load distribution is essential in such an environment.
The length of resource queues, particularly the CPU queue, is a useful indicator of load, since it correlates closely with task response time and is fairly simple to measure. However, there is a risk in basing scheduling decisions on such a simple metric alone.
For example, several remote sites could simultaneously observe that a particular site has a short CPU queue and initiate many process transfers to it at once. That site may then become overloaded with processes, and its first reaction may be to try to move them away again. Because migration is an expensive operation, poor decisions that trigger extra migration activity waste resources (CPU time and network bandwidth). Proper load distributing algorithms are therefore needed.
Load distribution algorithms may be static, dynamic, or adaptive.
Static means that decisions about assigning processes to processors are hardwired into the algorithm, based on a priori knowledge such as an analysis of the application’s graph model.
Dynamic algorithms use system state information to make scheduling decisions, allowing them to take advantage of underutilized system resources at runtime, at the cost of gathering that state information.
An adaptive algorithm goes further and changes its own parameters to match system loading conditions. For example, when system load or communication traffic is high, it may reduce the amount of information it collects for scheduling decisions.
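As a small illustration of the adaptive idea, the sketch below shrinks the number of peers a node will poll as its estimate of overall system load rises. The function name, the load cutoff, and the poll limits are illustrative assumptions, not values from any particular system:

```python
def adaptive_poll_limit(system_load: float, high_load: float = 0.8) -> int:
    """Adapt a scheduling parameter to load: poll several peers when the
    system is lightly loaded, but only one when load is high, so the
    overhead of gathering state shrinks exactly when the system can
    least afford it. The cutoff (0.8) and limits (4 vs. 1) are
    illustrative assumptions."""
    return 1 if system_load >= high_load else 4

print(adaptive_poll_limit(0.3))  # lightly loaded: poll up to 4 peers
print(adaptive_poll_limit(0.9))  # heavily loaded: poll only 1 peer
```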
Components of a Load Distributing Algorithm :
A load distributing algorithm has four components –
- Transfer Policy –
Determines whether or not a node is in a suitable state to take part in a task transfer.
- Process Selection Policy –
Determines which task is to be transferred.
- Site Location Policy –
Determines the node to which a task selected for transfer should be sent.
- Information Policy –
Responsible for initiating the collection of system state information.
A transfer policy requires information about the local node’s state to make its decisions; a location policy requires information about the states of remote nodes.
1. Transfer Policy –
Threshold policies make up a substantial portion of transfer policies, with the threshold measured in units of load. When a new task originates at a node and the load there exceeds a threshold T, the transfer policy classifies the node as a sender. If the node’s load falls below T, the transfer policy classifies it as a potential receiver of remote tasks.
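A minimal sketch of such a threshold policy, assuming CPU queue length as the load metric and an arbitrary threshold value of 3 (both are illustrative choices, not prescribed by the policy itself):

```python
from enum import Enum

T = 3  # threshold in units of load (here: CPU queue length); an assumed value


class Role(Enum):
    SENDER = "sender"
    RECEIVER = "receiver"


def transfer_policy(queue_length: int) -> Role:
    """Classify a node when a new task arrives: a load above the
    threshold T marks it a sender; a load below T marks it a
    potential receiver of remote tasks."""
    return Role.SENDER if queue_length > T else Role.RECEIVER


print(transfer_policy(5).value)  # heavily loaded node -> sender
print(transfer_policy(1).value)  # lightly loaded node -> receiver
```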
2. Selection Policy –
Once the transfer policy decides that a node is a sender, the selection policy decides which of that node’s tasks should be transferred. If the selection policy cannot find a suitable task, the transfer procedure is halted until the transfer policy signals again that the node is a sender.
- The most straightforward method is to choose one of the newly generated tasks that caused the node to exceed the load threshold and become a sender.
- Alternatively, a task is transferred only if its response time will improve as a result of the transfer.
Other criteria to consider when selecting a task: first, the overhead imposed by the transfer should be as low as possible, and second, the number of location-dependent calls made by the selected task should be as small as possible.
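The response-time criterion above can be sketched as follows. The `Task` record and its estimated local/remote response times and transfer cost are hypothetical names introduced for illustration:

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    local_response: float   # estimated response time if run locally
    remote_response: float  # estimated response time if run remotely
    transfer_cost: float    # overhead of moving the task


def select_task(tasks):
    """Pick the task whose response time improves the most after paying
    the transfer overhead; return None if no task would benefit."""
    best, best_gain = None, 0.0
    for t in tasks:
        gain = t.local_response - (t.remote_response + t.transfer_cost)
        if gain > best_gain:
            best, best_gain = t, gain
    return best


tasks = [Task("a", 10.0, 6.0, 1.0), Task("b", 5.0, 5.0, 2.0)]
print(select_task(tasks).name)  # "a": only task "a" gains from the move
```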
3. Location Policy –
The location policy’s job is to find suitable nodes for load sharing. After the transfer policy has determined that a task should be transferred, the location policy must decide where to send it, based on data collected by the information policy. Polling is a widely used technique: a node polls another node to check whether it is a suitable load-sharing partner and/or whether it is prepared to accept a transfer. Nodes can be polled either serially or in parallel, and they may be chosen at random or more selectively, based on information gathered during previous polls. The number of nodes polled may also vary.
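A simple random-polling location policy might look like the sketch below. The `is_receiver` callback stands in for an actual poll message to a peer, and the poll limit is an assumed parameter:

```python
import random


def location_policy(peers, is_receiver, poll_limit=3, rng=random):
    """Serially poll up to poll_limit randomly chosen peers and return
    the first one willing to accept a task, or None if none is found.
    `is_receiver` is a stand-in for sending a real poll message."""
    candidates = rng.sample(peers, min(poll_limit, len(peers)))
    for node in candidates:
        if is_receiver(node):
            return node
    return None


peers = ["A", "B", "C"]
# With poll_limit >= len(peers) every peer is polled, so "B" is found.
print(location_policy(peers, lambda n: n == "B", poll_limit=3))  # B
```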
4. Information Policy –
The information policy is in charge of deciding when information about the states of the other nodes in the system should be collected. Most information policies fall into one of three categories:
- Demand – driven –
Using sender-initiated or receiver-initiated polling, a node obtains the state of other nodes only when it wants to take part in sending or receiving tasks. Because their actions depend on the state of the system, demand-driven policies are inherently adaptive and dynamic. The policy can be sender-initiated (senders look for receivers to transfer load to), receiver-initiated (receivers solicit load from senders), or symmetrically initiated (a combination of both).
- Periodic –
Nodes exchange state information at regular intervals, so each site accumulates a significant history of global resource utilization to inform its location decisions. At high system loads, however, the benefits of load distribution are negligible, and the periodic exchange of information then becomes unnecessary overhead.
- State – change – driven –
A node disseminates state information whenever its state changes by a specified amount. The data may be sent to a centralized load-scheduling point or shared with peers. Unlike a demand-driven policy, the node does not collect information about other nodes. A drawback is that the policy does not adapt its operation to changes in overall system state: if the system is already overloaded, for example, continuing to exchange state information will only exacerbate the problem.
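The state-change trigger can be sketched as below. The class name, the `delta` threshold, and recording broadcasts in a list (rather than sending messages to peers) are all illustrative assumptions:

```python
class StateChangeReporter:
    """Reports a node's load only when it has drifted by more than
    `delta` units of load since the last report."""

    def __init__(self, delta=2):
        self.delta = delta
        self.last_reported = None
        self.broadcasts = []  # stand-in for messages sent to peers

    def update(self, load):
        # Broadcast on the first observation, or on a large enough change.
        if self.last_reported is None or abs(load - self.last_reported) > self.delta:
            self.broadcasts.append(load)
            self.last_reported = load


r = StateChangeReporter(delta=2)
for load in [0, 1, 3, 6, 7]:
    r.update(load)
print(r.broadcasts)  # [0, 3, 6]: small fluctuations are not reported
```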