Queueing theory

Truba College of Science & Technology, Bhopal Queuing Model

Compiled by: Ms. Nandini Sharma

Queuing theory is the mathematical study of waiting lines, or queues. In queuing theory a

model is constructed so that queue lengths and waiting times can be predicted. Queuing theory is

generally considered a branch of operations research because the results are often used when

making business decisions about the resources needed to provide a service.

Queuing theory has its origins in research by Agner Krarup Erlang when he created models to

describe the Copenhagen telephone exchange. The ideas have since seen applications

including telecommunication, traffic engineering, computing and the design of factories, shops,

offices and hospitals.

Single queueing nodes

Single queueing nodes are usually described using Kendall's notation in the form A/S/C where A describes the time between arrivals to the queue, S the size of jobs and C the

number of servers at the node. Many theorems in queueing theory can be proved by reducing queues to mathematical systems known as Markov chains, first described by Andrey Markov in his 1906

paper.

Agner Krarup Erlang, a Danish engineer who worked for the Copenhagen Telephone Exchange, published the first paper on what would now be called queuing theory in 1909. He modeled the

number of telephone calls arriving at an exchange by a Poisson process and solved the M/D/1 queue in 1917 and the M/D/k queueing model in 1920. In Kendall's notation:

M stands for Markov or memoryless and means arrivals occur according to a Poisson process.

D stands for deterministic and means jobs arriving at the queue require a fixed amount of service.

k describes the number of servers at the queueing node (k = 1, 2,...). If there are more jobs at the node than there are servers then jobs will queue and wait for service.

The M/M/1 queue is a simple model where a single server serves jobs that arrive according to a Poisson process and

have exponentially distributed service requirements. In an M/G/1 queue the G stands for general and indicates an arbitrary probability distribution. The M/G/1 model was solved by Felix Pollaczek in 1930, a solution later recast in probabilistic terms by Aleksandr Khinchin and now

known as the Pollaczek–Khinchine formula. After World War II queueing theory became an area of research interest to mathematicians. Work on queueing theory used in modern packet

switching networks was performed in the early 1960s by Leonard Kleinrock. It was in this period that John Little gave a proof of the formula which now bears his name: Little's law. In 1961 John Kingman gave a formula for the mean waiting time in a G/G/1 queue: Kingman's formula.

The matrix geometric method and matrix analytic methods have allowed queues with phase-type distributed interarrival and service time distributions to be considered. Some problems, such as

performance metrics for the M/G/k queue, remain open.

Service disciplines

Various scheduling policies can be used at queuing nodes:


First in first out

This principle states that customers are served one at a time and that the customer that has been

waiting the longest is served first.

Last in first out

This principle also serves customers one at a time; however, the customer with the shortest waiting time is served first. Also known as a stack.

Processor sharing

Service capacity is shared equally between customers.

Priority

Customers with high priority are served first. Priority queues can be of two types, non-preemptive (where a job in service cannot be interrupted) and preemptive (where a job in service can be interrupted by a higher priority job). No work is lost in either model.

Shortest job first

The next job to be served is the one with the smallest size.

Preemptive shortest job first

The next job to be served is the one with the smallest original size.[18]

Shortest remaining processing time

The next job to be served is the one with the smallest remaining processing requirement.[19]
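As a rough illustration of how some of the disciplines above differ, the sketch below orders a set of waiting jobs under first in first out, last in first out and (non-preemptive) shortest job first. The `Job` record, its field names and the helper functions are hypothetical names chosen for this example, and the sketch assumes all jobs are already waiting when service starts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str        # hypothetical label for the job
    arrival: float   # time at which the job joined the queue
    size: float      # total service requirement of the job


def next_job(waiting, discipline):
    """Return the job that the given discipline would serve next."""
    if not waiting:
        raise ValueError("no jobs are waiting")
    if discipline == "FIFO":
        # The longest-waiting job (earliest arrival) goes first.
        return min(waiting, key=lambda job: job.arrival)
    if discipline == "LIFO":
        # The most recent arrival goes first, as with a stack.
        return max(waiting, key=lambda job: job.arrival)
    if discipline == "SJF":
        # The job with the smallest service requirement goes first.
        return min(waiting, key=lambda job: job.size)
    raise ValueError(f"unknown discipline: {discipline}")


def service_order(jobs, discipline):
    """Order of service when every job is already waiting (an assumption)."""
    waiting = list(jobs)
    order = []
    while waiting:
        job = next_job(waiting, discipline)
        waiting.remove(job)
        order.append(job.name)
    return order


# Made-up example jobs: (name, arrival time, size).
jobs = [Job("a", 0.0, 5.0), Job("b", 1.0, 2.0), Job("c", 2.0, 3.0)]
for discipline in ("FIFO", "LIFO", "SJF"):
    print(discipline, service_order(jobs, discipline))
```

The preemptive variants and processor sharing are left out because they need a time-stepped simulation rather than a simple ordering.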

Service facility

Single server: customers line up and there is only one server

Parallel servers: customers line up and there are several servers

Tandem queue: there are several counters in series, and customers who finish at one counter move on to queue at the next

Customers' waiting behavior

Balking: customers decide not to join the queue if it is too long

Jockeying: customers switch between queues if they think they will get served faster by doing so

Reneging: customers leave the queue if they have waited too long for service

Queueing networks

Networks of queues are systems in which a number of queues are connected by customer routing. When a customer is serviced at one node it can join another node and queue for service,

or leave the network. For a network of m nodes, the state of the system can be described by an m-dimensional vector (x1, x2, ..., xm) where xi represents the number of customers at node i. The

first significant results in this area were Jackson networks, for which an efficient product-form


stationary distribution exists, and mean value analysis, which allows average metrics such as throughput and sojourn times to be computed. If the total number of customers in the network

remains constant the network is called a closed network and has also been shown to have a product-form stationary distribution in the Gordon–Newell theorem. This result was extended to the BCMP network[25] where a network with very general service time, regimes and customer

routing is shown to also exhibit a product-form stationary distribution. Networks of customers have also been investigated, such as Kelly networks, where customers of different classes experience

different priority levels at different service nodes. Another type of network is the G-network, first proposed by Erol Gelenbe in 1993: these networks do not assume exponential time distributions like the classic Jackson network.

Example of M/M/1

Birth and Death process

A/B/C

A: distribution of inter-arrival times

B: distribution of service times

C: the number of parallel servers

When both the inter-arrival times and the service times are exponentially distributed, we denote each of them by M.

λ: the average arrival rate

µ: the average service rate of a single server

P_n: the probability of n customers in the system

n: the number of customers in the system

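Let $E_n(t)$ represent the number of times the system enters state n during $[0, t]$, and $L_n(t)$ the number of times it leaves state n. Entries and exits alternate, so we have $|E_n(t) - L_n(t)| \le 1$.

When the system arrives at steady state, which means $t \to \infty$, we have $\lim_{t \to \infty} E_n(t)/t = \lim_{t \to \infty} L_n(t)/t$; therefore, for every state, arrival rate = removal rate.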

Balance equation

situation 0: $\lambda P_0 = \mu P_1$


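situation 1: $(\lambda + \mu) P_1 = \lambda P_0 + \mu P_2$

situation n: $(\lambda + \mu) P_n = \lambda P_{n-1} + \mu P_{n+1}$

By the balance equation for situation 0, $\lambda P_0 = \mu P_1$, so $P_1 = \frac{\lambda}{\mu} P_0$.

By mathematical induction, $P_n = \left(\frac{\lambda}{\mu}\right)^n P_0$.

Because $\sum_{n=0}^{\infty} P_n = 1$ and $\lambda < \mu$,

we get $P_0 = 1 - \frac{\lambda}{\mu}$ and hence $P_n = \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^n$.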
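The standard steady-state result for the M/M/1 queue, P_n = (1 − ρ)ρ^n with ρ = λ/µ < 1, can be checked numerically against the balance equations. The sketch below does this for a truncated state space; the function names, the example rates λ = 2 and µ = 5, and the truncation at 200 states are assumptions made for illustration only.

```python
def mm1_probabilities(arrival_rate, service_rate, n_max):
    """Return [P_0, ..., P_n_max] for an M/M/1 queue in steady state."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("steady state needs 0 < arrival_rate < service_rate")
    rho = arrival_rate / service_rate  # utilisation, lambda / mu
    return [(1 - rho) * rho ** n for n in range(n_max + 1)]


def balance_residual(p, arrival_rate, service_rate):
    """Largest absolute mismatch in the balance equations for states 0..len(p)-2."""
    lam, mu = arrival_rate, service_rate
    # State 0: lam * P_0 = mu * P_1
    residuals = [abs(lam * p[0] - mu * p[1])]
    # State n >= 1: (lam + mu) * P_n = lam * P_(n-1) + mu * P_(n+1)
    for n in range(1, len(p) - 1):
        flow_out = (lam + mu) * p[n]
        flow_in = lam * p[n - 1] + mu * p[n + 1]
        residuals.append(abs(flow_out - flow_in))
    return max(residuals)


lam, mu = 2.0, 5.0  # assumed example rates, not taken from the text
p = mm1_probabilities(lam, mu, n_max=200)
print("P_0 =", p[0])
print("sum of P_n over the truncated states =", sum(p))
print("largest balance mismatch =", balance_residual(p, lam, mu))
```

With λ < µ the probabilities decay geometrically, so the truncated sum is very close to 1 and the balance mismatch is at the level of floating-point rounding.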

In queueing theory, a discipline within the mathematical theory of probability, Little's

result, theorem, lemma, law or formula is a theorem by John Little which states:

The long-term average number of customers in a stable system L is equal to the long-term

average effective arrival rate, λ, multiplied by the (Palm-)average time a customer spends in the system, W; or expressed algebraically: L = λW.

Although it looks intuitively reasonable, it is quite a remarkable result, as the relationship is

"not influenced by the arrival process distribution, the service distribution, the service order,

or practically anything else."

The result applies to any system, and particularly, it applies to systems within systems. So in

a bank, the customer line might be one subsystem, and each of the tellers another subsystem,

and Little's result could be applied to each one, as well as the whole thing. The only

requirements are that the system is stable and non-preemptive; this rules out transition states

such as initial startup or shutdown.

In some cases it is possible to mathematically relate not only the average number in the

system to the average wait but relate the entire probability distribution (and moments) of the

number in the system to the wait.

In a 1954 paper Little's law was assumed true and used without proof. The form L = λW was first

published by Philip M. Morse where he challenged readers to find a situation where the

relationship did not hold. Little published in 1961 his proof of the law, showing that no such


situation existed. Little's proof was followed by a simpler version by Jewell and another by

Eilon. Shaler Stidham published a different and more intuitive proof in 1972.

Finding Response Time

Imagine an application that had no easy way to measure response time. If you can find the mean

number in the system and the throughput, you can use Little's Law to find the average response

time like so:

MeanResponseTime = MeanNumberInSystem / MeanThroughput

For example: A queue depth meter shows an average of nine jobs waiting to be serviced.

Add one for the job being serviced, so there is an average of ten jobs in the system. Another

meter shows a mean throughput of 50 per second. You can calculate the mean response time

as: 10 / 50 per second = 0.2 seconds. When exploring Little's law and learning to trust it, be

aware of two common mistakes: using arrivals (work arriving) when throughput (work

completed) is called for, and not keeping the units of your measurements the same.
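The response-time calculation above can be written as a small helper. This is only a sketch: the function and variable names are chosen for illustration, and it assumes that the mean number in the system and the throughput were measured over the same period and in consistent units.

```python
def mean_response_time(mean_number_in_system, mean_throughput):
    """Little's law rearranged: W = L / throughput."""
    if mean_throughput <= 0:
        raise ValueError("mean throughput must be positive")
    return mean_number_in_system / mean_throughput


jobs_waiting = 9            # from the queue depth meter in the example
jobs_in_service = 1         # the job currently being serviced
throughput_per_second = 50  # completed jobs per second

response = mean_response_time(jobs_waiting + jobs_in_service, throughput_per_second)
print(response, "seconds")  # 0.2 seconds
```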

Customers In The Store

Imagine a small store with a single counter and an area for browsing, where only one person

can be at the counter at a time, and no one leaves without buying something. So the system is

roughly:

Entrance → Browsing → Counter → Exit

In a stable system, the rate at which people enter the store is the rate at which they arrive

at the store (called the arrival rate), and the rate at which they exit as well (called the exit

rate). By contrast, an arrival rate exceeding an exit rate would represent an unstable

system, where the number of waiting customers in the store will gradually increase

towards infinity.

Little's Law tells us that the average number of customers in the store, L, is the effective

arrival rate λ, times the average time that a customer spends in the store W, or simply: L = λW.

Assume customers arrive at the rate of 10 per hour and stay an average of 0.5 hour.

This means we should find the average number of customers in the store at any time

to be 5.

Now suppose the store is considering doing more advertising to raise the arrival

rate to 20 per hour. The store must either be prepared to host an average of 10


occupants or must reduce the time each customer spends in the store to 0.25

hour. The store might achieve the latter by ringing up the bill faster or by adding

more counters.

We can apply Little's Law to systems within the store, for example the counter

and its queue. Assume we notice that there are on average 2 customers in the

queue and at the counter. We know the arrival rate is 10 per hour, so customers

must be spending 0.2 hours on average checking out.

We can even apply Little's Law to the counter itself. The average number of

people at the counter would be in the range (0, 1) since no more than one

person can be at the counter at a time. In that case, the average number of

people at the counter is also known as the utilisation of the counter.

However, because a store in reality generally has a limited amount of space,

it cannot become unstable. Even if the arrival rate is much greater than the

exit rate, the store will eventually start to overflow, and thus any new

arriving customers will simply be rejected (and forced to go somewhere else

or try again later) until there is once again free space available in the store.

This is also the difference between the arrival rate and the effective arrival

rate, where the arrival rate roughly corresponds to the rate at which

customers arrive at the store, whereas the effective arrival rate corresponds to

the rate at which customers enter the store. However, in a system with an

infinite size and no loss, the two are equal.
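The figures in the store example can be reproduced with a helper that solves L = λW for whichever quantity is missing. The function name `little` and its keyword arguments are hypothetical, chosen for this sketch; rates are per hour and times are in hours, as in the example.

```python
def little(L=None, lam=None, W=None):
    """Solve L = lam * W for the one argument left as None."""
    missing = [name for name, value in (("L", L), ("lam", lam), ("W", W))
               if value is None]
    if len(missing) != 1:
        raise ValueError("exactly one of L, lam, W must be left out")
    if L is None:
        return lam * W
    if lam is None:
        return L / W
    return L / lam


# Whole store: 10 arrivals per hour, 0.5 hour average stay.
print(little(lam=10, W=0.5))   # 5.0 customers on average
# After advertising: 20 arrivals per hour, same average stay.
print(little(lam=20, W=0.5))   # 10.0 customers on average
# Stay needed to keep 5 customers on average at 20 arrivals per hour.
print(little(L=5, lam=20))     # 0.25 hour
# Counter and its queue: 2 customers on average, 10 arrivals per hour.
print(little(L=2, lam=10))     # 0.2 hour
```

For a store that turns customers away when full, `lam` should be the effective arrival rate (customers who actually enter), not the rate at which they show up at the door.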

Estimating parameters

To use Little's law on data, formulas must be used to estimate the parameters, as the result does not necessarily apply directly over finite time intervals, due to problems such as how to log

customers already present at the start of the logging interval and those who have not yet departed when logging stops.
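One possible way to handle these edge effects is sketched below: each customer's time in the system is clipped to the logging interval before averaging. The function names, the log format (pairs of arrival and departure times) and the sample data are assumptions made for illustration, and the sketch further assumes that both timestamps are known even when they fall outside the interval.

```python
def estimate_little(visits, start, end):
    """Estimate L, lambda and W over [start, end] from (arrival, departure) pairs."""
    if end <= start:
        raise ValueError("logging interval must have positive length")
    length = end - start
    # Time each customer spends inside the interval, clipped at both ends.
    time_in_interval = sum(
        max(0.0, min(departure, end) - max(arrival, start))
        for arrival, departure in visits
    )
    arrivals = sum(1 for arrival, _ in visits if start <= arrival < end)
    L_hat = time_in_interval / length   # time-average number in the system
    lam_hat = arrivals / length         # observed arrival rate
    # Clipped time per arrival, so that L_hat == lam_hat * W_hat by construction.
    W_hat = time_in_interval / arrivals if arrivals else float("nan")
    return L_hat, lam_hat, W_hat


def naive_mean_sojourn(visits, start, end):
    """Mean full time in system of customers who arrived inside the interval."""
    stays = [d - a for a, d in visits if start <= a < end]
    return sum(stays) / len(stays) if stays else float("nan")


# Made-up log: one customer is present at the start, one leaves after the end.
visits = [(-2, 1), (1, 4), (3, 5), (6, 12), (8, 9)]
L_hat, lam_hat, W_hat = estimate_little(visits, start=0, end=10)
W_naive = naive_mean_sojourn(visits, start=0, end=10)
print("L_hat =", L_hat, "lam_hat =", lam_hat, "W_hat =", W_hat)
print("lam_hat * W_naive =", lam_hat * W_naive)
```

In this made-up log the clipped estimates satisfy L = λW exactly, whereas multiplying the arrival rate by the unclipped mean time in system gives a slightly different value, which is the finite-interval mismatch described above.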


