
Process Management in Distributed System

Last Updated : 21 Feb, 2024

A distributed system is a collection of autonomous nodes (computers, servers, and other networked devices) that coordinate with one another to serve a single purpose. Together, the nodes maintain consistency among themselves, operate on shared data, communicate, share resources, and keep the overall system functioning properly.

In such a system, process management is crucial for maintaining and monitoring the system as a whole. The nodes may be geographically distributed across the world and are interconnected through a network.

Figure: Distributed System

Process Management in Distributed Systems

Process management in a distributed system relies on several collaborative mechanisms among the processes residing in the system, carried out as a set of steps. We’ll start with a basic overview and then go into the details.

Process management is the core mechanism a distributed system uses to keep control of all its processes: the tasks they are associated with, the resources they occupy, and how they communicate through various inter-process communication (IPC) mechanisms. In short, it manages the lifecycle of every executing process. A few basic functions form the core of this procedure; let’s look at each of them separately.

Foundation of Process Management

The following tasks are the foundation of process management:

1. Process creation and termination

Creation: When a program is loaded from secondary storage into main memory for execution, it becomes a process, and that is when the real work starts. In a distributed system, a process can be created by one of the nodes in the system, in response to a user’s request, because another system component depends on it, or through a fork() call made by another process as part of some larger functionality.

Termination: A process can terminate either voluntarily or involuntarily. Voluntary termination happens when the process has completed its task. Involuntary termination happens when the OS or one of the nodes in the run-time environment stops the process, for example because it is consuming resources beyond a limit set by the distributed system.
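As a rough single-node illustration of both kinds of termination, the sketch below uses Python's standard subprocess module. The function names and the time deadline are assumptions made for this example; the deadline merely stands in for whatever resource limit a real distributed system would enforce.

```python
import subprocess
import sys


def run_to_completion() -> int:
    """Create a child process and let it finish its task (voluntary termination)."""
    child = subprocess.Popen(
        [sys.executable, "-c", "print('task done')"],
        stdout=subprocess.DEVNULL,
    )
    # wait() blocks until the child exits and returns its exit code.
    return child.wait(timeout=30)


def run_with_deadline(deadline_seconds: float) -> int:
    """Create a long-running child and stop it from outside (involuntary termination).

    The deadline is a hypothetical stand-in for a resource limit.
    """
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        return child.wait(timeout=deadline_seconds)
    except subprocess.TimeoutExpired:
        child.kill()          # forcibly end the process
        return child.wait()   # reap it and collect the exit code


if __name__ == "__main__":
    print("voluntary exit code:", run_to_completion())
    print("involuntary exit code:", run_with_deadline(0.5))
```

A child that finishes on its own reports exit code 0, while a killed child reports a non-zero code whose exact value depends on the operating system.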

2. Process Coordination

Because the system consists of multiple nodes, process coordination (also known as process synchronization) is a crucial part of managing it. A distributed system frequently runs into situations where multiple processes have to agree on a single decision based on some criteria. Such agreement is reached with protocols like two-phase commit (2PC) and three-phase commit (3PC).
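To make the idea of agreement more concrete, here is a minimal in-process sketch of the two-phase commit flow. The Participant class and two_phase_commit function are hypothetical names for this example, and the sketch deliberately leaves out networking, timeouts, and durable logging, all of which a real implementation would need.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Participant:
    """A process taking part in the decision (hypothetical model)."""
    name: str
    can_commit: bool = True
    log: List[str] = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local work can be committed.
        self.log.append("prepared" if self.can_commit else "refused")
        return self.can_commit

    def commit(self) -> None:
        self.log.append("committed")

    def abort(self) -> None:
        self.log.append("aborted")


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1 (voting): the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (decision): commit only if the vote was unanimous.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"

    for p in participants:
        p.abort()
    return "abort"


if __name__ == "__main__":
    group = [Participant("a"), Participant("b", can_commit=False)]
    print(two_phase_commit(group))
```

A single refusing participant is enough to make every participant abort, which is the all-or-nothing property the protocol is meant to provide.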

3. Fault Tolerance

Fault tolerance is the ability of the system to keep responding to clients even when part of the system fails. A distributed system achieves this by replicating data across several nodes, so if one node becomes unavailable, whether for maintenance or because of a hardware failure, the system fetches the data from another node.
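The failover behaviour described here can be sketched as follows. Replica, NodeDown, and read_with_failover are illustrative names assumed for this example, and the replicas are plain in-memory objects rather than real networked nodes.

```python
from typing import Dict, List


class NodeDown(Exception):
    """Raised when a replica cannot serve the request."""


class Replica:
    """An in-memory stand-in for a node holding a copy of the data."""

    def __init__(self, name: str, data: Dict[str, str]) -> None:
        self.name = name
        self.data = dict(data)
        self.up = True

    def read(self, key: str) -> str:
        if not self.up:
            raise NodeDown(self.name)
        return self.data[key]


def read_with_failover(replicas: List[Replica], key: str) -> str:
    # Try each replica in turn and return the first successful read.
    for replica in replicas:
        try:
            return replica.read(key)
        except NodeDown:
            continue
    raise NodeDown("all replicas unavailable")


if __name__ == "__main__":
    data = {"user:1": "alice"}
    nodes = [Replica("n1", data), Replica("n2", data)]
    nodes[0].up = False  # simulate maintenance or a hardware failure
    print(read_with_failover(nodes, "user:1"))
```

The sketch only covers reads; keeping the replicas consistent when data is written is a separate problem that it does not attempt to solve.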

4. Load Balancing

This feature has its roots mainly in database management systems, where requests are delegated to other servers. Load balancing is one of the major features at the core of a distributed system: the system is designed so that once a node reaches a pre-defined limit of requests, further incoming requests are handled by other nodes.

How Is Process Management Done in Distributed Systems?

Process management offers different ways to distribute processes among the processors and other nodes that make up a distributed system. The following three policies and mechanisms are used to do so; let’s look at each of them.
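A minimal sketch of this threshold-based spill-over, assuming each node tracks a count of active requests against a fixed limit (the Node class and dispatch function are hypothetical names):

```python
from typing import List


class Node:
    """A node with a pre-defined limit on concurrent requests (assumed model)."""

    def __init__(self, name: str, limit: int) -> None:
        self.name = name
        self.limit = limit
        self.active = 0


def dispatch(nodes: List[Node]) -> Node:
    # Send the request to the first node that is still below its limit;
    # once a node is full, later requests spill over to the next one.
    for node in nodes:
        if node.active < node.limit:
            node.active += 1
            return node
    raise RuntimeError("all nodes are at capacity")


if __name__ == "__main__":
    cluster = [Node("a", limit=2), Node("b", limit=2)]
    print([dispatch(cluster).name for _ in range(3)])
```

With a limit of two requests per node, the third request is the first one to land on node b.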

1. Process Allocation

Process allocation deals with assigning a processor or node, along with some amount of memory, to a process (the amount may change as the requirements of the process grow). This is the initial step, taken when the process is created and is about to perform its assigned tasks.
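One simple way to picture allocation is a first-fit search over the nodes' free memory, with a separate call for growing a reservation later. This is only one possible policy; the Node fields and the allocate and grow functions are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    free_mb: int
    reserved: Dict[int, int] = field(default_factory=dict)  # pid -> reserved MB


def allocate(nodes: List[Node], pid: int, required_mb: int) -> Optional[Node]:
    """First fit: place the new process on the first node with enough free memory."""
    for node in nodes:
        if node.free_mb >= required_mb:
            node.free_mb -= required_mb
            node.reserved[pid] = required_mb
            return node
    return None  # no node can host the process right now


def grow(node: Node, pid: int, extra_mb: int) -> bool:
    """Enlarge an existing reservation if the node still has room."""
    if node.free_mb < extra_mb:
        return False
    node.free_mb -= extra_mb
    node.reserved[pid] += extra_mb
    return True


if __name__ == "__main__":
    nodes = [Node("n1", 256), Node("n2", 1024)]
    home = allocate(nodes, pid=1, required_mb=512)
    print(home.name if home else "no node available")
```

When grow returns False, the node can no longer satisfy the process, which is one of the situations in which a system might consider moving the process elsewhere.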

2. Process Migration

Process migration, as its name indicates, is the moving (or migrating) of a process to a desired node or processor. Migration can happen for many reasons, such as load balancing when the node currently running the process has reached its limit on the number of processes it can handle at a time, or better resource utilization. Process migration is of two types:

  1. Non-pre-emptive migration: The process is migrated before it starts executing on the source node (the node on which it was created), so it begins execution on its target node.
  2. Pre-emptive migration: The process has already started executing, but unexpected factors or demands require it to be moved to another node. This is a costly procedure, because the OS must save the state of the process and transfer it to the target node. That state includes all the information kept in the Process Control Block (PCB), such as the process ID, open files, program counter, process state, and priority.
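The save-and-restore step of pre-emptive migration can be illustrated with a toy task whose entire state is explicit. ProcessState below is a heavily simplified, hypothetical stand-in for a PCB, and JSON is used only as a convenient wire format; a real OS would also have to move memory contents, open files, and more.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ProcessState:
    """A tiny PCB-like record (illustrative only)."""
    pid: int
    program_counter: int  # index of the next work item to process
    accumulator: int      # partial result computed so far
    priority: int


def run(state: ProcessState, work: List[int], max_steps: int) -> ProcessState:
    """Process up to max_steps items, advancing the program counter."""
    end = min(len(work), state.program_counter + max_steps)
    while state.program_counter < end:
        state.accumulator += work[state.program_counter]
        state.program_counter += 1
    return state


def checkpoint(state: ProcessState) -> bytes:
    """Serialize the saved state so it can be shipped to the target node."""
    return json.dumps(asdict(state)).encode("utf-8")


def restore(blob: bytes) -> ProcessState:
    """Rebuild the state on the target node."""
    return ProcessState(**json.loads(blob.decode("utf-8")))


if __name__ == "__main__":
    work = list(range(1, 11))
    on_source = run(ProcessState(pid=7, program_counter=0, accumulator=0, priority=1), work, 4)
    on_target = restore(checkpoint(on_source))   # the "migration"
    finished = run(on_target, work, len(work))
    print(finished.accumulator)
```

Because the program counter travels with the checkpoint, the target node resumes at the fifth item instead of starting over.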

3. Threads

Threads play a major role in distributed systems. They divide a process into multiple parts, each with its own flow of control, so that the parts can execute in parallel. This parallelism helps improve the overall efficiency of the distributed system.
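As a small illustration of splitting one job across several threads, the sketch below uses ThreadPoolExecutor from Python's standard library. The chunked and parallel_sum helpers are names invented for this example. Note that in standard CPython the global interpreter lock limits the speed-up threads give to CPU-bound work, so the example shows the structure rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def chunked(items: List[int], size: int) -> List[List[int]]:
    """Split the input into consecutive pieces of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(items: List[int], workers: int = 4, chunk_size: int = 1000) -> int:
    """Sum each chunk in its own worker thread, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(sum, chunked(items, chunk_size))
        return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum(list(range(10_000))))
```

Each worker thread has its own flow of control over one chunk, and the main thread only combines the results.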

FAQs on Process Management in Distributed System

Q.1: What is a distributed process scheduler?

Answer:

A distributed process scheduler is a critical component responsible for allocating resources to processes. It aims to achieve load balancing by managing execution, scheduling, and other process-related tasks across multiple nodes.

In simple words, it basically decides which process shall be executed on which node.

Q.2: What are the challenges in process management in Distributed Systems?

Answer:

The major challenges in designing the process management part of a distributed system are:

Scheduling a process onto a specific node, migrating a process along with its state to another node in case of overload (load balancing), synchronizing the state of every process, and maintaining data consistency among processes that operate on the same data.

Q.3: What is middleware in a distributed operating system?

Answer:

Middleware acts as the intermediary between the distributed OS and the applications that run on it. Its responsibilities include load balancing, remote procedure calls (RPCs), and more.

Load balancing is commonly handled in the middleware. Every request from a client application is first received by the middleware, which knows how many requests each node is currently serving and delegates the request to the node serving the fewest.

Remote procedure calls in the middleware let an application invoke methods on processes running on remote machines as if they were local.

Q.4: How does a distributed system handle process synchronization in process management?
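The least-loaded delegation mentioned in this answer might look roughly like the sketch below. The Middleware class with its route and finish methods is a hypothetical model; a real middleware layer would track load through health checks or connection counts gathered over the network.

```python
from typing import Dict, List


class Middleware:
    """Tracks how many requests each node is serving (assumed in-memory model)."""

    def __init__(self, node_names: List[str]) -> None:
        self.active: Dict[str, int] = {name: 0 for name in node_names}

    def route(self) -> str:
        # Pick the node currently serving the fewest requests.
        # Ties go to the node that was registered first.
        target = min(self.active, key=self.active.get)
        self.active[target] += 1
        return target

    def finish(self, name: str) -> None:
        # Called when a node has completed a request.
        self.active[name] -= 1


if __name__ == "__main__":
    mw = Middleware(["a", "b", "c"])
    print([mw.route() for _ in range(4)])
```

Unlike a fixed per-node threshold, this policy compares the nodes against each other on every request, so the load stays roughly even as requests finish at different times.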

Answer:

The process management part of a distributed system uses mechanisms such as distributed locks, semaphores, and other locking techniques to ensure that running processes do not interfere with each other.

Q.5: What are some examples of distributed systems?
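A rough picture of a distributed lock is a small lock service that grants a named resource to one owner at a time. The sketch below adds a lease (an expiry time) so that a crashed owner cannot hold the lock forever; the lease, the LockService class, and the explicit now parameter are all assumptions introduced for this example and are not taken from any particular system.

```python
from typing import Dict, Tuple


class LockService:
    """In-memory stand-in for a lock server shared by many processes."""

    def __init__(self) -> None:
        # resource name -> (owner, time at which the lease expires)
        self._held: Dict[str, Tuple[str, float]] = {}

    def acquire(self, resource: str, owner: str, now: float, lease_seconds: float) -> bool:
        holder = self._held.get(resource)
        # Refuse if another owner holds a lease that has not expired yet.
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False
        self._held[resource] = (owner, now + lease_seconds)
        return True

    def release(self, resource: str, owner: str) -> bool:
        holder = self._held.get(resource)
        # Only the current owner may release the lock.
        if holder is not None and holder[0] == owner:
            del self._held[resource]
            return True
        return False


if __name__ == "__main__":
    locks = LockService()
    print(locks.acquire("file.txt", "p1", now=0, lease_seconds=10))
    print(locks.acquire("file.txt", "p2", now=5, lease_seconds=10))
    print(locks.acquire("file.txt", "p2", now=11, lease_seconds=10))
```

Passing the time in explicitly keeps the sketch deterministic; a real service would use its own clock and would need to cope with clock differences between machines.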

Answer:

The most common examples that you’d be familiar with are the World Wide Web (WWW), cloud computing platforms such as Google Cloud Platform (GCP), Amazon Web Services (AWS), and Microsoft Azure, and computer networks.

In all of these, any number of users can register and perform operations based on their specific requirements from any part of the world. The distributed system concept is therefore crucial to operating such software.


