Queuing Models in Operating System

In general, there is no fixed set of processes that run on systems; thus, measuring the exact processing requirements of processes is impossible. We can, however, measure the distributions of CPU bursts and I/O bursts over the course of a process and derive a mathematical formula that gives the probability of a specific CPU burst. The arrival rate of processes in the system can be approximated in the same way. Queuing theory is the branch of mathematics that grew out of using such mathematical models to evaluate the performance of various systems.

The basic model of queuing theory maps naturally onto a computer system: the system is represented as a collection of servers, such as CPUs and I/O devices, each with its own queue of waiting processes. This article discusses queuing systems in the context of operating systems.

The following topics will be discussed here:

Components of Queuing System
Number of Servers
Measures of Performance for Queuing Systems
Notation for Queues
Queue Discipline
Queuing Models

Let’s start discussing each of these topics in detail.

Components of Queuing System

A queuing system typically includes the following elements:

Arrival process: The arrival process describes how customers enter the system.
Server: The server is the resource that provides the service to the customers, for example a CPU or an I/O device serving processes.
Queue: Customers who are waiting for service are held in a queue.
Service discipline: The order in which customers are served is determined by service discipline.
Service time distribution: The amount of time required to serve a customer is described as service time distribution.
Departure process: The departure process describes how customers exit the system once they have been served.
System performance measures: System performance measures are used to analyze and evaluate the system’s performance. Examples include the average wait time, the number of customers in the system, and the server’s utilization.

Optional extras include multiple servers or channels, priority service, feedback (served customers rejoining the queue), and reneging (customers leaving the queue before being served).

Number of Servers

The number of servers in a queuing system can vary depending on the application and the level of service desired. In some cases, a single server may suffice, whereas, in others, multiple servers may be required to meet demand.

Single-server queuing systems: These are the most fundamental type of queuing systems, and they are frequently used in simple applications such as retail stores or fast-food restaurants. Customers arrive and queue to be served by a single server in these systems.
Multi-server queuing systems: Multi-server queuing systems, on the other hand, are used in more complex applications where demand is high and more than one server is required to handle the workload. A call center with multiple agents to handle incoming calls is an example of this type of system. Customers are usually directed to an available server in a multi-server system, and the service time distribution is assumed to be the same across all servers.

Various methods, such as queuing analysis, simulation, and optimization techniques, can be used to determine the number of servers in a queuing system. The goal is typically to find the optimal number of servers that minimizes system costs (e.g., staff wages) while providing an acceptable level of service.

Measures of Performance for Queuing Systems

Performance measures for queuing systems are used to assess how well the system is performing and to identify areas for improvement. Some common performance indicators for queuing systems are:

Utilization: The percentage of time spent by the server serving customers. A high utilization rate indicates that the server is rarely idle, although utilization close to 100% also causes queues and waiting times to grow sharply; a low utilization rate indicates that the server is being underutilized.
Average waiting time: The amount of time customers spend waiting in line to be served. A long waiting time may indicate a system bottleneck, whereas a short waiting time indicates that the system is running efficiently.
Average number of customers in the system: The average number of customers in the system, including those being served as well as those waiting in line. A high number of customers in the system may indicate that there is a high demand for service, whereas a low number indicates that the system is running efficiently.
Average number of customers in line: The average number of customers waiting in line to be served. A large number of customers in the queue may indicate that the system is unable to meet the demand for service, whereas a small number indicates that the system is operating efficiently.
Throughput: The rate at which the system serves customers. A high throughput indicates that the system is running efficiently, whereas a low throughput may indicate that the system has a bottleneck.
Steady-state probability distribution of the system: The likelihood of finding a certain number of customers in the system at an arbitrary time, once the system has settled into its long-run behavior.
Waiting time probability distribution: The likelihood that a customer will have to wait for a certain amount of time before being served.
Response time or cycle time: Response time, also known as cycle time, is the total amount of time a customer spends in the system from arrival to departure.

These performance measures are not all applicable to every system; the most common and important are determined by the queueing model used to represent the system and the performance objectives.
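To make these measures concrete, the following minimal sketch computes several of them for a single FIFO server from a short trace of arrival and service times. The function name `fifo_single_server_metrics` and the sample trace are hypothetical, chosen only for illustration, and the observation window is assumed to run from the first arrival to the last departure.

```python
def fifo_single_server_metrics(arrivals, service_times):
    """Basic performance measures for one FIFO server, given a fixed trace.

    arrivals: arrival times in non-decreasing order.
    service_times: service duration of each customer, in the same order.
    """
    if not arrivals or len(arrivals) != len(service_times):
        raise ValueError("need two non-empty sequences of equal length")

    server_free_at = 0.0
    total_wait = 0.0
    total_response = 0.0
    busy_time = 0.0
    for arrive, service in zip(arrivals, service_times):
        start = max(arrive, server_free_at)   # wait if the server is busy
        finish = start + service
        total_wait += start - arrive          # time spent in the queue
        total_response += finish - arrive     # time spent in the system
        busy_time += service
        server_free_at = finish

    n = len(arrivals)
    # Assumed observation window: first arrival to last departure.
    horizon = server_free_at - arrivals[0]
    return {
        "utilization": busy_time / horizon,
        "avg_wait": total_wait / n,
        "avg_response": total_response / n,
        "throughput": n / horizon,
    }


# Hypothetical trace: four customers, times in arbitrary units.
metrics = fifo_single_server_metrics([0.0, 1.0, 2.0, 6.0], [2.0, 2.0, 1.0, 1.0])
print(metrics)
```

For this trace the server is busy for 6 of 7 time units, the average wait is 0.75, and the average response time is 2.25.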

Notation for Queues

Kendall’s notation, whose basic form is written A/S/n, is the standard way of describing queues.

Kendall’s notation: This describes a queue by using a set of symbols to represent the queue’s various characteristics. In its basic form it has three positions. The first denotes the arrival process, the second the service time distribution, and the third the number of servers. For example, an M/M/1 queue has a Poisson arrival process (represented by the letter M, for Markovian or memoryless), an exponential service time distribution (also represented by the letter M), and one server (indicated by the number 1).
A/S/n form: Written generically, A represents the probability distribution of the interarrival time, S represents the service time distribution, and n represents the number of servers. Common codes for A and S are M (Markovian), D (deterministic), and G (general). Extended forms append further fields, such as the system capacity and the queue discipline.

M/M/1, for example, denotes a queue with a Poisson arrival process, an exponential service time distribution, and one server. M/M/c denotes a queue with Poisson Arrival and Exponential service with c servers, implying that the service is provided by more than one server while the service time remains exponential.
These notations are widely used in queueing theory and analysis because they allow for a quick understanding of the queue’s characteristics and the selection of appropriate mathematical models to represent the queueing system and thus evaluate its performance.
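As an illustration of how compact the notation is, the sketch below splits a three-part string such as M/M/1 into its fields and prints a plain-language description. The names `QueueSpec`, `parse_kendall`, and `describe` are hypothetical helpers, and only the codes M, D, and G are recognized here.

```python
from dataclasses import dataclass

# Meaning of each distribution code in the arrival and service positions.
ARRIVAL_CODES = {"M": "Poisson", "D": "deterministic", "G": "general"}
SERVICE_CODES = {"M": "exponential", "D": "deterministic", "G": "general"}


@dataclass(frozen=True)
class QueueSpec:
    arrival: str
    service: str
    servers: str  # kept as text so a symbolic count such as "c" survives


def parse_kendall(notation):
    """Split a three-part A/S/n string into its fields."""
    parts = [p.strip() for p in notation.split("/")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected the form A/S/n, got {notation!r}")
    arrival, service, servers = parts
    if arrival not in ARRIVAL_CODES:
        raise ValueError(f"unknown arrival code {arrival!r}")
    if service not in SERVICE_CODES:
        raise ValueError(f"unknown service code {service!r}")
    return QueueSpec(arrival, service, servers)


def describe(notation):
    """Return a short plain-language description of the queue."""
    spec = parse_kendall(notation)
    return (f"{ARRIVAL_CODES[spec.arrival]} arrivals, "
            f"{SERVICE_CODES[spec.service]} service times, "
            f"{spec.servers} server(s)")


print(describe("M/M/1"))  # Poisson arrivals, exponential service times, 1 server(s)
print(describe("M/D/1"))  # Poisson arrivals, deterministic service times, 1 server(s)
```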

Queue Discipline

The order in which customers are served in a queuing system is referred to as queue discipline. In practice, there are several queue disciplines that are used, including:

First-In-First-Out (FIFO): Customers are served in the order in which they arrive. This is the most commonly used queue discipline in retail stores, fast-food restaurants, and other similar establishments. In scheduling it is also called first-come-first-served (FCFS).
Last-In-First-Out (LIFO): Customers are served in reverse order of arrival, so the most recent arrival is served first. This discipline is less commonly used than FIFO, but it can be found in some applications, such as a stack of plates in a cafeteria.
Priority: Customers are served in accordance with their priority level. Customers with the highest priority are served first, followed by customers with lower priority. This discipline is used in situations where certain customers, such as in an emergency room or a customer service call center, must be served before others.
Random: Customers are served at random.
Shortest-Job-First (SJF): Customers are served based on the time required to complete their service, with the shortest jobs served first.
Processor sharing: Processor sharing means that all customer requests are treated equally and receive an equal share of the server’s time.

Each queue discipline has advantages and disadvantages, so the discipline chosen will be determined by the system requirements and performance objectives.
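The difference between the disciplines is easiest to see on a small example. The sketch below orders the same set of waiting jobs under four of the disciplines listed above. The `Job` fields, the `service_order` helper, and the sample jobs are hypothetical; the sketch also assumes that all jobs are already waiting when the server becomes free and that a smaller priority number means higher priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    arrival: int   # arrival order (smaller means earlier)
    burst: int     # required service time
    priority: int  # assumption: smaller number means higher priority


# Sort key for each discipline; ties fall back to arrival order.
DISCIPLINE_KEYS = {
    "FIFO": lambda j: j.arrival,
    "LIFO": lambda j: -j.arrival,
    "SJF": lambda j: (j.burst, j.arrival),
    "PRIORITY": lambda j: (j.priority, j.arrival),
}


def service_order(jobs, discipline):
    """Return job names in the order a single server would pick them."""
    if discipline not in DISCIPLINE_KEYS:
        raise ValueError(f"unknown discipline {discipline!r}")
    return [j.name for j in sorted(jobs, key=DISCIPLINE_KEYS[discipline])]


# Hypothetical jobs, all waiting at the same moment.
jobs = [Job("A", 1, 8, 2), Job("B", 2, 3, 1), Job("C", 3, 5, 3)]
for discipline in DISCIPLINE_KEYS:
    print(discipline, service_order(jobs, discipline))
# FIFO ['A', 'B', 'C']
# LIFO ['C', 'B', 'A']
# SJF ['B', 'C', 'A']
# PRIORITY ['B', 'A', 'C']
```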

Queuing Models

Below are the four queuing models that will be discussed here:

1. [M/M/1]: {∞/∞/FCFS} Queue System

M/M/1 denotes a queueing system with one server, Poisson arrivals (equivalently, exponentially distributed interarrival times), and exponentially distributed service times. The FCFS suffix indicates that a first-come-first-served (FCFS) service discipline is being used, which means that customers are served in the order in which they arrive. This type of queue is also known as an M/M/1/FCFS queue or an M/M/1/FIFO (first-in-first-out) queue. It is one of the most basic and widely studied queueing models in queuing theory, and it is frequently used as a starting point for understanding the performance of more complex queueing systems. To analyze and evaluate an M/M/1 queue, several performance measures are commonly used. Among the most important measures are:

The average number of customers in the system.
The average waiting time in the queue.
The system utilization.
The probability of a customer finding the server busy.

These metrics can be computed using a variety of analytical techniques, including Queueing formulae, Markov Chain analysis, and even numerical methods. In the case of the M/M/1 queueing model, closed-form solutions for these measures are available, making the analysis relatively simple.

Example:

A solved example of an M/M/1 queue system with FCFS scheduling could look like the following:

The arrival rate of customers is λ = 2 per minute, and the service rate of the server is μ = 3 per minute.

The utilization of the server is: ρ = λ/μ = 2/3 ≈ 0.667

The probability of the system being empty is: P0 = 1 - ρ = 1 - 2/3 = 1/3 ≈ 0.333

The average number of customers in the system is calculated as: L = ρ / (1 - ρ) = (2/3) / (1/3) = 2

The average time a customer spends in the system is calculated as: W = 1 / (μ - λ) = 1 / (3 - 2) = 1 minute

The probability of there being n customers in the system follows a geometric distribution: Pn = (1 - ρ) ρ^n, so for example P1 = (1/3)(2/3) = 2/9 ≈ 0.222

These results agree with Little’s law, L = λW = 2 × 1 = 2, and they hold only when λ < μ; otherwise the queue grows without bound. This is just a high-level example; in real-world scenarios the arrival rate and service rate are first estimated from measurements, and the formulas above are then applied.
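The standard closed-form M/M/1 results (ρ = λ/μ, P0 = 1 - ρ, L = ρ/(1 - ρ), W = 1/(μ - λ), Pn = (1 - ρ)ρ^n) can be wrapped in a small helper. This is a minimal sketch; the function names `mm1_metrics` and `mm1_prob_n` are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state measures of an M/M/1 queue from the closed-form results."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate
    return {
        "utilization": rho,
        "p_empty": 1 - rho,
        "L": rho / (1 - rho),                      # mean number in system
        "Lq": rho ** 2 / (1 - rho),                # mean number in queue
        "W": 1 / (service_rate - arrival_rate),    # mean time in system
        "Wq": rho / (service_rate - arrival_rate), # mean time in queue
    }


def mm1_prob_n(arrival_rate, service_rate, n):
    """Probability of exactly n customers in the system (geometric)."""
    rho = arrival_rate / service_rate
    if not 0 < rho < 1:
        raise ValueError("requires 0 < arrival_rate < service_rate")
    return (1 - rho) * rho ** n


# Same rates as the example: 2 arrivals and 3 services per minute.
print(mm1_metrics(2, 3))
print(mm1_prob_n(2, 3, 1))
```

With λ = 2 and μ = 3 the helper gives L = 2 and W = 1 minute, and the values satisfy L = λW.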

2. [M/M/1]: {N/∞/FCFS} System (Limited queue length system)

It is a single-server queueing system with a Poisson arrival process and an exponential service time distribution. N denotes the limited capacity of the system: at most N customers can be present at once, counting the one being served. The notation N/FCFS indicates that the service discipline is first-come-first-served (FCFS), which means that customers are served in the order in which they arrive, and also that when the system is full, newly arriving customers are blocked or rejected and are lost. This is different from balking (a customer choosing not to join the queue) and reneging (a customer leaving the queue after waiting for a while). An M/M/1/FCFS/N or M/M/1/FIFO/N queue is another name for this type of queuing system. It is a variant of the basic M/M/1 queue with a limited buffer capacity. Because some arrivals are turned away, the performance measures must be computed with the effective arrival rate, that is, the rate of customers who actually enter the system. Some of the most important performance measures that can be calculated, similar to an M/M/1 system, are:

The average number of customers in the system.
The average waiting time in the queue.
The system utilization.
The probability of a customer finding the server busy.
The probability of customers being blocked.

Because the number of states is finite, these performance measures still have closed-form solutions, although the formulas are somewhat more involved than in the case of the M/M/1 model.

Example:

A solved example of an M/M/1/N queue system with FCFS scheduling could look like the following:

The arrival rate of customers is λ = 2 per minute, the service rate of the server is μ = 3 per minute, and the system can hold at most N = 5 customers (including the one in service), so ρ = λ/μ = 2/3.

The probability of the system being empty is: P0 = (1 - ρ) / (1 - ρ^(N+1)) = (1/3) / (1 - (2/3)^6) = 243/665 ≈ 0.365

The probability of there being n customers in the system is a truncated geometric distribution: Pn = P0 ρ^n for n = 0, 1, …, N, and Pn = 0 for n > N.

The probability of rejection is the probability that an arriving customer finds the system full: P_R = P_N = P0 ρ^N = (243/665)(2/3)^5 = 32/665 ≈ 0.048

The average number of customers in the system is calculated as: L = ρ/(1 - ρ) - (N+1) ρ^(N+1) / (1 - ρ^(N+1)) = 2 - 384/665 = 946/665 ≈ 1.42

The average time an admitted customer spends in the system follows from Little’s law with the effective arrival rate λ_eff = λ(1 - P_R) = 1266/665 ≈ 1.90 per minute: W = L / λ_eff = 946/1266 ≈ 0.75 minutes

Note that because the queue length is limited, the system remains stable even if the arrival rate exceeds the service rate. In that case, however, the system is full most of the time, which leads to a high rejection rate, and admitted customers experience waiting times close to the maximum.
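For a finite-capacity M/M/1/N queue, the steady-state probabilities are proportional to ρ^n for n = 0 to N, so the measures can be computed directly by normalizing those weights. The sketch below does this and applies Little's law with the effective arrival rate. The function name `mm1n_metrics` is hypothetical, and `capacity` is assumed to count the customer in service as well as those waiting.

```python
def mm1n_metrics(arrival_rate, service_rate, capacity):
    """Steady-state measures of an M/M/1/N queue.

    capacity: maximum number of customers in the system,
    including the one in service (an assumption of this sketch).
    """
    if arrival_rate <= 0 or service_rate <= 0 or capacity < 1:
        raise ValueError("rates must be positive and capacity at least 1")
    rho = arrival_rate / service_rate

    # P(n) is proportional to rho**n for n = 0..capacity; normalize to sum to 1.
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]

    p_block = probs[-1]                              # arrival finds system full
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    effective_rate = arrival_rate * (1 - p_block)    # customers actually admitted
    return {
        "p_empty": probs[0],
        "p_block": p_block,
        "utilization": 1 - probs[0],
        "L": mean_in_system,
        "effective_arrival_rate": effective_rate,
        "W": mean_in_system / effective_rate,        # Little's law
    }


# Same rates as the example, with room for 5 customers in total.
print(mm1n_metrics(2, 3, 5))
```

Because the weights are normalized numerically, the same code also works when λ ≥ μ, where the closed-form expression with 1 - ρ in the denominator needs a special case for ρ = 1.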

3. M/D/1 Queue

The M/D/1 queue is a queuing system in which customers arrive according to a Poisson process (M), service times are deterministic (D) and have a constant value, and the system has one server. This is also known as an M/D/1/FCFS or M/D/1/FIFO queue, where FCFS or FIFO denotes the first-come-first-served service discipline. This type of queuing system is useful for modeling situations where customer service times are known in advance and are consistent, such as a carwash service. As with M/M/1, the queue is stable only if the arrival rate is less than the service rate; otherwise it grows without bound. The performance measures for this system are similar to those for the M/M/1 queue, and their mean values have closed-form solutions given by the Pollaczek-Khinchine formula. Among the most important measures are:

The average number of customers in the system.
The average waiting time in the queue.
The system utilization.
The probability of a customer finding the server busy.

Furthermore, because the service time has no variability, queues are shorter than in M/M/1: for the same arrival rate and service rate, the average number of customers waiting in the queue is exactly half that of an M/M/1 queue. The number of customers in the system is still random, however, because the arrivals are random.

Example:

A solved example of an M/D/1 queue system could look like the following:

The arrival rate of customers is λ = 2 per minute and the service rate of the server is μ = 3 per minute, so every service takes exactly 1/μ = 1/3 minute.

The utilization of the server is calculated as: U = ρ = λ/μ = 2/3 ≈ 0.667

The probability of the system being empty is: P0 = 1 - ρ = 1/3 ≈ 0.333

The average number of customers in the system is calculated using the Pollaczek-Khinchine formula: L = ρ + ρ^2 / (2(1 - ρ)) = 2/3 + (4/9) / (4/3) × 2 = 2/3 + 2/3 = 4/3 ≈ 1.33

The average time a customer spends in the system is calculated as: W = L / λ = (4/3) / 2 = 2/3 minute

This is just a high-level example; in practice, other parameters such as the average service time and the variance of the service time are also used. Also, one important aspect to consider is that M/D/1 is a special case of a more general queueing system called M/G/1, where the service time follows a general distribution and is not necessarily exponential.
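For M/D/1, the Pollaczek-Khinchine formula with zero service time variance gives the mean queue length Lq = ρ^2 / (2(1 - ρ)), and the remaining means follow from Little's law. A minimal sketch, with the hypothetical function name `md1_metrics`:

```python
def md1_metrics(arrival_rate, service_rate):
    """Mean steady-state measures of an M/D/1 queue.

    Uses the Pollaczek-Khinchine mean formula with zero service variance.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate
    mean_in_queue = rho ** 2 / (2 * (1 - rho))
    mean_wait_in_queue = mean_in_queue / arrival_rate   # Little's law
    return {
        "utilization": rho,
        "p_empty": 1 - rho,
        "Lq": mean_in_queue,
        "L": mean_in_queue + rho,                       # add the one in service
        "Wq": mean_wait_in_queue,
        "W": mean_wait_in_queue + 1 / service_rate,     # add the fixed service time
    }


# Same rates as the example: 2 arrivals and 3 services per minute.
print(md1_metrics(2, 3))
```

With λ = 2 and μ = 3 this gives Lq = 2/3, which is half of the M/M/1 value of 4/3 for the same rates.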

4. M/M/c Queue

The M/M/c queue is a queuing system in which customers arrive according to a Poisson process (M), service times are exponentially distributed (M), and the system has c servers. This is also known as an M/M/c/FCFS or M/M/c/FIFO queue, where FCFS or FIFO denotes the first-come-first-served service discipline. It is also known as the Erlang-C model. This type of queuing system is useful for modeling situations in which multiple servers provide service. This queuing system allows customers to be served concurrently by the c servers, increasing system capacity and decreasing average customer wait time. This system’s performance metrics are similar to those of the M/M/1 queue. However, some measures are more difficult to calculate because the number of servers c influences the queue’s behavior. The most important measures are as follows:

The average number of customers in the system.
The average waiting time in the queue.
The system utilization.
The probability of a customer finding all c servers busy.
The probability of customers waiting in the queue (the Erlang-C probability). Because the queue is unbounded, no customers are blocked.

The M/M/c queue is a more complex model than the M/M/1 queue. Closed-form expressions for its performance measures exist, built on the Erlang-C formula, but they involve a finite sum over the number of servers and are usually evaluated numerically. Many industrial and telecommunications systems with a large number of servers are modeled as M/M/c queues.

Example:

A solved example of an M/M/c queue system could look like the following:

The arrival rate of customers is λ per minute, the service rate of each server is μ per minute, and there are c=3 servers. Let a = λ/μ denote the offered load.

The utilization of the servers is calculated as: U = ρ = λ / (c*μ) = a / c

The probability of the system being empty is: P0 = 1 / ( Σ(n=0 to c-1) a^n / n! + a^c / (c! * (1 - ρ)) ), where c! = c*(c-1)*(c-2)*…*1

The probability that an arriving customer has to wait is given by the Erlang-C formula: C(c, a) = P0 * a^c / (c! * (1 - ρ))

The average number of customers waiting in the queue is: Lq = C(c, a) * ρ / (1 - ρ)

The average number of customers in the system is calculated as: L = Lq + a

The average time a customer spends in the system is calculated as: W = L / λ = Lq / λ + 1/μ

This is just a high-level example; in practice, other quantities are also of interest, such as the number of customers in the queue, the number in service, and the probability of n customers being in the system.

It’s worth mentioning that the M/M/c formulas are only valid when the arrival rate is less than the total service capacity (λ < cμ), so that the system is stable and the queue length remains finite. For c = 1 they reduce to the M/M/1 results.
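The standard M/M/c results can be evaluated numerically with a few lines of code. The sketch below computes P0, the Erlang-C waiting probability, and the mean values, using the offered load a = λ/μ and ρ = a/c. The function name `mmc_metrics` and the numeric rates (λ = 6 and μ = 3 per minute) are hypothetical values chosen for illustration; only c = 3 comes from the example above.

```python
from math import factorial


def mmc_metrics(arrival_rate, service_rate, servers):
    """Steady-state measures of an M/M/c queue using the Erlang-C formula."""
    if arrival_rate <= 0 or service_rate <= 0 or servers < 1:
        raise ValueError("rates must be positive and servers at least 1")
    offered_load = arrival_rate / service_rate       # a = lambda / mu
    rho = offered_load / servers                     # per-server utilization
    if rho >= 1:
        raise ValueError("unstable: need arrival_rate < servers * service_rate")

    # Terms of the normalizing sum: states with a free server, then all busy.
    head = sum(offered_load ** n / factorial(n) for n in range(servers))
    tail = offered_load ** servers / (factorial(servers) * (1 - rho))
    p_empty = 1 / (head + tail)
    p_wait = tail * p_empty                          # Erlang-C probability

    mean_in_queue = p_wait * rho / (1 - rho)
    mean_wait_in_queue = mean_in_queue / arrival_rate  # Little's law
    return {
        "utilization": rho,
        "p_empty": p_empty,
        "p_wait": p_wait,
        "Lq": mean_in_queue,
        "L": mean_in_queue + offered_load,
        "Wq": mean_wait_in_queue,
        "W": mean_wait_in_queue + 1 / service_rate,
    }


# Hypothetical rates: 6 arrivals per minute, 3 services per minute per server.
print(mmc_metrics(6, 3, 3))
```

For these rates the offered load is a = 2, so P0 = 1/9 and the probability of waiting is 4/9. Calling the function with `servers=1` reproduces the M/M/1 values.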

These are just a few examples of common queuing models; there are many other variations and extensions that can be used to analyze more complex systems.
