
STOCHASTIC SIMULATION, MONTE CARLO METHODS AND APPLICATIONS ¹

Ion Văduva, University of Bucharest, Romania

e-mail: [email protected]; [email protected]

Key words: Random numbers, Random variates, Random number generators, Linear congruential generators, Inverse method, Composition method, Acceptance-rejection method, Accepting probability, Ratio-of-uniform method, Bootstrap method, Monte Carlo method, Monte Carlo procedure, Primary estimator, Secondary estimator, Crude Monte Carlo, Variance reduction techniques, Importance sampling, Antithetic variates, Control variates, Operational equations, Simulated annealing, Markov Chain Monte Carlo (MCMC) method, Metropolis-Hastings algorithm, Gibbs sampler, Queueing models, Birth and death processes, Constant/variable increment clock time in simulation models, Models with parallel stations, Machine interference problem, Inventory models, Economic lot model, Shortage model, Distribution of total demand, Stochastic inventory model, Feedback rule, Reserve stock in production series, Stochastic processes simulation, Uniform Poisson processes, Uniform binomial processes, Scan statistics.

Contents

1 Introduction

2 Random Number Generation
2.1 Linear Congruential Generators
2.2 Other Sources of Uniform Random Numbers

3 Non Uniform Random Variate Generation
3.1 General Methods
3.1.1 The Inverse Method
3.1.2 The Composition (Mixture) Method
3.1.3 The Acceptance-Rejection Method
3.2 Other Methods
3.2.1 Ratio-of-uniforms Method

¹ This paper contains lectures given at the Polytechnic University of the Toluca Valley, Mexico, in November 2009.


3.2.2 Simulation of Some Particular Distributions
3.2.3 Simulation of Some Multivariate Distributions

4 The Use of Simulation in Statistics
4.1 Estimation of Parameters via Simulation
4.2 Use of Simulation in Hypothesis Testing
4.3 The Bootstrap Technique

5 Use of Simulation in Numerical Calculations
5.1 Generalities on the Monte Carlo Method
5.2 Evaluating an Integral
5.2.1 Crude Monte Carlo
5.3 Variance Reduction Techniques
5.3.1 Importance Sampling
5.3.2 Antithetic Variates
5.3.3 Control Variates
5.4 Solving Operatorial Equations
5.4.1 Solving Systems of Linear Equations
5.4.2 Solving Integral Equations
5.5 Monte Carlo Optimization
5.6 Markov Chain Monte Carlo

6 Introduction to Queueing Models
6.0 Preliminaries
6.1 Birth and Death Processes
6.2 Simulation of a Queueing System with N Parallel Stations
6.3 Simulation of the Machine Interference Problem

7 Introduction to Inventory Models
7.0 Preliminaries
7.1 Simple One Product Models
7.2 Multiproduct Models
7.3 Stochastic Models
7.4 A Simulation Model

8 Simulation of Some Discrete Uniform Processes
8.1 Poisson Uniform Bivariate Process
8.2 Binomial Uniform Bivariate Process
8.3 An Application to a Healthcare Problem

Glossary

Random number - a sampling value (or a realization) of a random variable U uniformly distributed on (0, 1) (denoted U ∼ unif(0, 1));
Random variate - a sampling value of a non-uniform distribution;
Random number generator - an algorithm which produces a uniform (0, 1) random number;
Linear congruential generator - a random number generator based on using a linear function modulo m;
Inverse method - a method for simulating a random variate by inverting the cumulative distribution function (cdf) of the variate;
Composition method - a method for simulating a random variate based on representing the cdf of the variate as a mixture;
Acceptance-rejection method - a method for simulating a random variate as a function of a set of "simpler" random variates satisfying a specific condition (a set of variates which does not satisfy the condition is rejected and another set is tried);
Ratio-of-uniform method - a method for simulating a random variate as a ratio of uniformly distributed variates on some bivariate set;
Bootstrap method - a resampling method which produces a new sample from a given sample by extracting "with replacement" sampling values from the original sample;
Monte Carlo method - a method for solving a numerical problem by using random variates to estimate the solution to the problem;
Monte Carlo procedure - an algorithm producing a solution based on the Monte Carlo method;
Primary estimator - a random variable which estimates the solution to a numerical problem that is solved with a Monte Carlo procedure;
Secondary estimator - the arithmetic mean of the primary estimator, giving the numerical solution based on the Monte Carlo method;
Simulated annealing - a method for solving optimization problems based on an algorithm that simulates the physical annealing process;
Multiple integral - an integral of a function of several variables defined on a domain in a multivariate space;
Markov Chain Monte Carlo (MCMC) - a simulation method based on simulating a statistical distribution as the ergodic distribution of a Markov chain;
Metropolis-Hastings algorithm - an algorithm used in the MCMC method;
Gibbs sampler - a particular form of the Metropolis-Hastings algorithm;
Queueing model - a model to analyze a queueing system;
Queueing model with parallel stations - a model for analyzing a queueing system with several parallel stations;
Machine interference problem - a model to analyze the maintenance of a system with N machines and M repair units;
Inventory system - a system which maintains a storage of goods;
Economic lot model - a model defining the optimum order for a stock;
Shortage model - a model assuming shortage of stock;
Multistation inventory model - a model analyzing several types of products in a storage;
Stochastic inventory model - a model with random demand, random lead time and/or random reorder cycle;
Uniform bivariate Poisson process - a discrete stochastic process based on the Poisson distribution of random points in a domain D ⊂ R²;
Uniform bivariate binomial process - a discrete process based on the binomial distribution of random points in a domain D ⊂ R²;
Bivariate scan statistic - a statistic counting the number of random points in a scanning window while scanning a given domain D ⊂ R².

Summary

The paper presents in short the main questions related to the use of simulation in studying statistical problems, solving some classes of numerical problems, and analysing the behaviour of some particular systems (such as queueing and inventory systems). Methods for simulating various types of statistical distributions are presented first. Then, some applications of simulation in statistics, including bootstrap techniques, are discussed. Special attention is paid to Monte Carlo techniques and the Markov Chain Monte Carlo method. Some simple mathematical models related to queueing and inventory systems are presented. Finally, algorithms for simulating discrete uniform bivariate Poisson or binomial processes are given, together with a practical application involving scan statistics in healthcare. The references contain only some of the older representative or recent publications.

1 Introduction

The term simulation covers, in today's science, a wide class of problems solved or analysed via computers. There are many ways to define and understand simulation, but all of them assume the use of random numbers to perform a computer experiment for solving some mathematical or practical problem. In this paper, the word simulation is also associated with terms like Monte Carlo techniques and resampling techniques, the last one involving statistical problems. The word simulation can also be understood in this paper as a mathematical experiment.

Random numbers are sampling values of the uniform distribution over (0, 1), which has the probability density function (pdf)

f(u) = 1 if u ∈ (0, 1), f(u) = 0 otherwise.


(Note that it makes no difference if we consider the interval (0, 1) as being open or closed at either of its limits.) The following proposition (due to Khintchine, see Ermakov-1971, Vaduva-1977) plays a great role in simulating non-uniform random variates.

Theorem 1.1. If X is a random variable having the cumulative distribution function (cdf) F(x), x ∈ R, and U denotes the random variable uniformly distributed over (0, 1), then the random variable F⁻¹(U) (with F⁻¹ the inverse of F) has the cdf F.

In other words, this theorem gives a general method for simulating a sampling value x of the random variable X when we have a sampling value u of U, namely x = F⁻¹(u). That is why the next chapter is dedicated to simulating random numbers. The same theorem suggests that there could be various methods which transform sequences of random numbers into non-uniform variates; these methods are discussed in a further chapter.

Since one purpose of this paper is to discuss the use of simulation in solving statistical problems, one section is devoted to the bootstrap method and some applications. The Monte Carlo method for solving various numerical problems is introduced in another chapter. Some applications of the so-called Markov Chain Monte Carlo method are also presented. Then, some applications of simulation and stochastic modeling in queueing problems, inventory models and scan statistics are presented.

2 Random Number Generation

Random numbers, i.e. sampling values of the random variable U uniformly distributed over (0, 1) (denoted U ∼ unif(0, 1)), are very important for the problems treated in this paper.

The aim of this chapter is to present in short some methods for generating with the computer sampling values of the random variable U which are independent and uniformly distributed over [0, 1). As Knuth and other authors have shown, to produce good random numbers the computer first generates a uniform integer over some interval [0, m) and then divides it by m in order to obtain the required random number. The calculations needed to produce a uniform integer in [0, m) must be simple. In other words, the generation algorithm must have a low complexity, with respect to both computing time and memory. Details on random number generation are found in many books (see for instance Devroye-1986, Ermakov-1971, Gentle-1998, Ripley-1986 and Ross-1997).

2.1 Linear Congruential Generators


A linear congruential generator is of the form

xn = (a1 xn−1 + a2 xn−2 + ... + ak xn−k + c) (mod m), xn ∈ N, (2.1)

where m is a large positive integer, k ≤ n, and ai, xi, i = 1, ..., k, and c are given constants, all chosen such that the produced numbers xn, n > k, are integers uniformly distributed over [0, m − 1]. Then the uniform [0, 1) random numbers are obtained as

un = xn/m. (2.1')

The usual linear (mixed) congruential generator is the one with k = 1, i.e. xn+1 = (a xn + c) (mod m). If a, c and m are properly chosen, then, in this case, the un's "look like" they are randomly and uniformly distributed between 0 and 1. Even if this linear congruential generator has a low complexity, the most used is the multiplicative congruential generator

xn+1 = (a xn) (mod m). (2.2)

It is shown (Knuth-1981) that if x0 ≠ 0 is prime to m, and a is a primitive root (mod m) close to √m, then the numbers un produced by this generator have a large period λ (defined as the minimum λ such that xn = xn+λ), are approximately uniform (0, 1) distributed, and have a small serial correlation coefficient ρ = corr(un, un+1) ∀n (i.e. they are almost independent). Of course, the modulus m must be very large (usually close to the computer word, i.e. close to 2³¹ for usual computers).
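As an illustration, the following Python sketch implements the multiplicative congruential generator (2.2) together with the division (2.1'); the constants a = 16807 and m = 2³¹ − 1 (the classical Park-Miller choice) are an assumption for the example, not prescribed by the text.

# Multiplicative congruential generator (2.2), minimal sketch.
# The constants a = 16807, m = 2**31 - 1 are an assumed (Park-Miller) choice.
class MCG:
    def __init__(self, seed=12345, a=16807, m=2**31 - 1):
        assert seed % m != 0          # the seed x0 must not be 0 (mod m)
        self.x, self.a, self.m = seed, a, m

    def next(self):
        # x_{n+1} = (a * x_n) mod m, then u_{n+1} = x_{n+1}/m as in (2.1')
        self.x = (self.a * self.x) % self.m
        return self.x / self.m

g = MCG()
print([round(g.next(), 6) for _ in range(5)])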

In simulation we use sequences of random numbers u1, u2, ..., un produced by a random number generator. These numbers must pass any test which assumes that they are uniformly distributed and stochastically independent. It is obvious that a random number generator cannot produce "pure" random numbers that pass all the mentioned tests (see Knuth-1981). Therefore we call them pseudo-random numbers. A "good" random number generator must produce sequences close to pure random numbers. A linear congruential generator cannot produce good random numbers; it can be used when there is no need to perform very accurate calculations or to obtain exact solutions to the problems.

One trouble with using pseudo-random numbers produced by a linear congruential generator is that pairs (ui, ui+1) or triplets (ui, ui+1, ui+2) lie on lines or planes (i.e. have a lattice structure). This means that these generators must be used with care in numerical calculations. In order to obtain "better" random numbers from a uniform pseudo-random number generator, the numbers produced by the generator must be transformed. If we consider the binary representation of the numbers ui, then one way to obtain better numbers is to use bit stripping, i.e. to obtain the new numbers by selecting some bits from the sequences of bits representing previously given numbers (e.g. odd bits or even bits, etc.).

2.2 Other Sources of Uniform Random Numbers

Note that if in (2.2) we take a^k instead of a and start with xs, then the sequence of pseudo-random numbers obtained is xs, xs+k, xs+2k, ...; therefore, for various values of s, the corresponding stream can be used by one processor in a parallel architecture.

Shuffling random numbers. A way of improving the quality of a uniform pseudo-random number generator is to define the new number y by mixing (or shuffling) two generators G1, G2. One mixing algorithm (due to MacLaren and Marsaglia) is:

Take an array (i.e. a table) T[1..k], k fixed, and initialize (fill) it using G1;
generate with G2 a random index j ∈ {1, 2, ..., k};
take y := T[j]; generate x with G1, and put T[j] := x.

(The notation a := b means that b is assigned to a.) The better generated number is y. This mixed generator can have a larger period and can break up the lattice structure of the generated sequence yi. If instead of two generators we use only one, G = G1 = G2, then the above algorithm (called the Bays-Durham shuffling of random numbers) can be easily changed by generating only one x in the initial step and determining j by the "bit stripping" procedure mentioned before.
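A minimal Python sketch of the MacLaren-Marsaglia shuffle follows; using random.random for both G1 and G2 is only a placeholder assumption, in practice two different generators would be plugged in.

import random

# MacLaren-Marsaglia shuffling, sketch: G1 fills a table T of size k,
# G2 picks which table entry to emit; the emitted slot is refilled by G1.
def shuffled(g1, g2, k=64):
    T = [g1() for _ in range(k)]      # initialize the table with G1
    while True:
        j = int(g2() * k)             # random index j in 0..k-1 from G2
        y = T[j]                      # emit the stored value
        T[j] = g1()                   # refill the slot with G1
        yield y

gen = shuffled(random.random, random.random)
print([round(next(gen), 4) for _ in range(5)])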

Lagged Fibonacci sequences. Apart from linear congruential generators, another way of generating random numbers is to use the lagged Fibonacci generator, defined as

xi = (xi−j + xi−k) (mod m), (2.3)

which, when m is prime and k > j, gives a period close to m^k − 1.

Inversive congruential generators (due to Eichenauer-Herrmann). This method produces uniform integers over [0, m − 1] by the relation

xi = (a (xi−1)⁻¹ + c) (mod m), (2.4)

where x⁻¹ denotes the multiplicative inverse modulo m if it exists, or else is 0. Even if these inversive generators imply computational difficulties, they promise to give high-quality random sequences.

Matrix congruential generators. Such a generator is of the form

xi = (A xi−1 + C) (mod m),

where the xi are vectors of dimension d, A is a d × d matrix and C is a d-dimensional vector. This kind of generator is important when parallel computers are used to produce correlated random vectors.

Feedback shift register generators. Such a generator takes into consideration the binary representation of integers in registers of the computer. If ai, i = 1, ..., p, denote the binary digits of the random number, and the ci are given (not all zero) binary digits, then the digits ai of the newly generated numbers are produced by

ai = (cp ai−p + cp−1 ai−p+1 + ... + c1 ai−1) (mod 2). (2.5)

This generator was introduced by Tausworthe. In practice it has the form

ai = (ai−p + ai−p+q) (mod 2), (2.5')

or, if we denote by ⊕ the binary exclusive-or operation (addition of 0's and 1's modulo 2), equation (2.5') becomes

ai = ai−p ⊕ ai−p+q. (2.5'')

Note that this recurrence of bits ai is the same as the recurrence of random numbers (interpreted as l-tuples of bits), namely

xi = xi−p ⊕ xi−p+q. (2.6)

If the random number has l binary digits (l ≤ p), and l is relatively prime to 2^p − 1, then the period of the l-tuples (i.e. of the sequence of generated numbers) is 2^p − 1. A variation of the Tausworthe generator, called the generalized feedback shift register (GFSR), is obtained if we use a bit generator of the form (2.5') to obtain an l-bit binary number, and the next bit positions are obtained from the same bit positions but with a delay (by shifting, usually to the left). A particular GFSR is xi = xi−3p ⊕ xi−3q, p = 521, q = 32, which gives a period 2⁵²¹ − 1. Another generator of this kind is the so-called twisted GFSR generator, which recurrently defines the random integers xi as

xi = xi−p ⊕ A xi−p+q, (2.6')


where A is a properly chosen p × p matrix.

A practical remark. Apart from shuffling random numbers as mentionedabove, some other simple combinations could be used to produce ”good”random numbers. Thus, if we use the following three generators

xi = 171xi−1(mod 30269), yi = 172yi−1(mod 30307), zi = 170zi−1(mod 30323)

with positive initializations (seeds) (x0, y0, z0), and take uniform (0, 1) numbers such as

ui = (xi/30269 + yi/30307 + zi/30323) (mod 1),

it can be shown that the sequence of the ui's has a period of order 10¹².
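This three-generator combination is the well-known Wichmann-Hill generator; a direct Python sketch:

# The three-generator combination above (Wichmann-Hill), as a sketch.
def wichmann_hill(x, y, z):
    # the seeds x, y, z must be positive integers
    while True:
        x = (171 * x) % 30269
        y = (172 * y) % 30307
        z = (170 * z) % 30323
        yield (x / 30269 + y / 30307 + z / 30323) % 1.0

g = wichmann_hill(1, 2, 3)
print([round(next(g), 6) for _ in range(3)])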

3 Non Uniform Random Variate Generation

In this chapter we assume that a uniform (0, 1) random number generator called rnd is given. The aim of this chapter is to present methods and algorithms which transform sequences of random numbers u1, u2, ..., un, n ≥ 1, into a sampling value of a given random variable X which has a cdf F(x). (For further information see Devroye-1986, Gentle-1998 and Ross-1997.)

3.1 General Methods

3.1.1 The Inverse Method

Theorem 1.1 leads to the following algorithm (the inverse method):

generate u with rnd; take x := F⁻¹(u).

The following list gives some examples of the inverse method:

Distribution | cdf | Inverse
Exp(λ) | F(x) = 1 − e^(−λx), x > 0, λ > 0 | x := −ln(u)/λ
Weib(0, 1, ν) | F(x) = 1 − e^(−x^ν), ν > 0 | x := (−ln(u))^(1/ν)
Cauch | F(x) = (1/π)(arctan x + π/2), x ∈ R | x := tan(π(u − 1/2))
Pears XI | F(x) = 1 − 1/(1 + αx)^ν, x > 0, ν > 0 | x := (u^(−1/ν) − 1)/α

(The abbreviations are: Exp for exponential; Weib for Weibull; Cauch for Cauchy; Pears XI for Pearson type XI.)
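A Python sketch of the inverse method for three rows of this table; the function names are, of course, only illustrative.

import math, random

# Inverse method: x := F^{-1}(u) with u ~ unif(0,1).
def exp_inv(lam):
    return -math.log(random.random()) / lam             # Exp(lambda)

def weibull_inv(nu):
    return (-math.log(random.random())) ** (1.0 / nu)   # Weib(0,1,nu)

def cauchy_inv():
    return math.tan(math.pi * (random.random() - 0.5))  # Cauchy

print(exp_inv(2.0), weibull_inv(1.5), cauchy_inv())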

In the multivariate case, there is a generalization of Theorem 1.1 (see Ermakov-1981) which gives a similar algorithm for simulating a sampling value x = (x1, x2, ..., xk)′ of the k-dimensional random vector X which has the cdf F(x). Let us denote


F1(x1) = P(X1 < x1), Fj(xj | xj−1, ..., x1) = P(Xj < xj | Xj−1 = xj−1, ..., X1 = x1), 1 < j ≤ k.

The algorithm is (the multivariate inverse method):

generate u with rnd; take x1 := F1⁻¹(u);
for i := 2 to k do
begin
generate u with rnd; take xi := Fi⁻¹(u | xi−1, ..., x1);
end.

An inverse algorithm for simulating a finite discrete random variate having the probability distribution

X : ( a1, a2, ..., an )
    ( p1, p2, ..., pn )

is:

calculate Fi = p1 + p2 + ... + pi, 1 ≤ i ≤ n; take i := 0;
generate u with rnd;
repeat
i := i + 1;
until u < Fi;
take x := ai.

The loop in the algorithm searches for the value of the index i; this can be done better by using the binary search technique.
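A Python sketch of this discrete inverse method with the suggested binary search (via the standard bisect module):

import random
from itertools import accumulate
from bisect import bisect_right

# Discrete inverse method with binary search over the cumulative sums F_i.
def discrete_inv(values, probs):
    F = list(accumulate(probs))           # F_i = p_1 + ... + p_i
    u = random.random()
    return values[bisect_right(F, u)]     # smallest i with u < F_i

print(discrete_inv(['a1', 'a2', 'a3'], [0.2, 0.5, 0.3]))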

3.1.2 The Composition (Mixture) Method

If the cdf of the random variable X is of the form

F(x) = Σ_{i=1}^{k} pi Fi(x), pi > 0, Σ_{i=1}^{k} pi = 1, (3.1)

then one says that F is a mixture (or composition) of the Fi(x), i = 1, ..., k. Note that pi = P(X = Xi), where Xi ∼ cdf Fi(x). If xi denotes the random variate corresponding to Xi, 1 ≤ i ≤ k, which we assume can be simulated, then the algorithm for simulating the random variate x is:

generate a random index i such that P(i) = pi;
generate a random variate xi having the cdf Fi(x);
take x := xi.


Example 3.1. Assume that X has a mixed exponential distribution, i.e. its pdf is

f(x) = Σ_{i=1}^{k} pi λi e^(−λi x).

As the xi (which are exponential) can be generated by the inverse method, x can be generated by the previous algorithm.
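A Python sketch of this composition algorithm: choose the component index i with probability pi, then generate Exp(λi) by inversion.

import math, random

# Composition method for the mixed exponential pdf of Example 3.1.
def mixed_exponential(p, lam):
    u, F = random.random(), 0.0
    for pi, li in zip(p, lam):                 # choose index i with P(i) = p_i
        F += pi
        if u < F:
            break
    return -math.log(random.random()) / li    # Exp(lambda_i) by inversion

print(mixed_exponential([0.3, 0.7], [1.0, 5.0]))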

A composition algorithm can also be built in the case of a continuous mixture, i.e.

F(x) = ∫_{−∞}^{∞} G(x, y) dH(y), (3.1')

where for each y, G(x, y) (as a function of x) is the cdf of a random variable Zy, and H(y) is the cdf of a random variable Y. (It is assumed that Y and Zy are random variables that can be simulated.)

Example 3.2. Assume that Zλ is a rv exponentially distributed with parameter λ, and that the parameter λ is random, being Gamma(0, a, b) distributed, i.e. it has the pdf

h(λ) = (a^b/Γ(b)) λ^(b−1) e^(−aλ), a, b > 0. (3.2)

In this case, the pdf of the continuous mixture (i.e. the pdf of the rv X = Zλ) is

f(x) = b a^b/(a + x)^(b+1), x > 0, (3.3)

and the composition algorithm for simulating x is

generate λ ∼ Gamma(0, a, b);
generate zλ ∼ Exp(λ);
take x := zλ.

One says that X has a Lomax distribution (Pearson XI); it is used in reliability as a lifetime distribution.
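A Python sketch of this continuous composition; random.gammavariate with shape b and scale 1/a matches the pdf (3.2).

import math, random

# Composition for Example 3.2: lambda ~ Gamma(0,a,b), then z ~ Exp(lambda);
# the result x has the Lomax (Pearson XI) pdf (3.3).
def lomax(a, b):
    lam = random.gammavariate(b, 1.0 / a)      # shape b, rate a
    return -math.log(random.random()) / lam    # Exp(lam) by inversion

print(lomax(2.0, 3.0))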

3.1.3 The Acceptance-Rejection Method

This method (sometimes simply called the rejection method) consists in using some random variates which can be simulated, say s1, s2, ..., sn, and, when some predicate P(s1, s2, ..., sn) is true, the random variate x is x := Ψ(s1, s2, ..., sn). Of course, for a given probability distribution (i.e. for a given X), the predicate P and the function Ψ must be defined in a suitable manner. The index n itself could be a rv which can be generated. The attribute "rejection" means that, for some set of generated variates s1, s2, ..., sn, the predicate could be false; in this case, the set is rejected and another set is tried. That is why, for the rejection method, the probability pa = P(P(s1, ..., sn) = true) (called the acceptance probability) must be large in order to give a good rejection algorithm.

In the following we present some theorems leading to rejection algorithms.

Theorem 3.1 (The enveloping rejection method). Let X be a rv with the pdf f(x) and assume that Y is another rv with pdf h(x), where f(x) and h(x) are both positive on the same set in R. If there is a constant α > 1 such that f(x) ≤ α h(x), and U is a uniform (0, 1) random variable independent of Y, then the conditional pdf of Y, given that U ≤ f(Y)/(α h(Y)), is f.

Example 3.3. Let X be the normal N(0, 1) rv and assume that X1 > 0 is the positive normal deviate, which has the pdf

f(x) = √(2/π) e^(−x²/2), x > 0. (3.4)

Take as envelope Y ∼ Exp(λ = 1). Then one can easily find that α = √(2e/π) and pa = 1/α = √(π/(2e)). The rejection algorithm for simulating x1 is

repeat
generate u ∼ unif(0, 1) and generate y ∼ Exp(1) independent of u;
until u < e^(−y²/2 + y − 0.5);
take x1 := y.

For simulating the rv X we must add the following steps:

generate u with rnd;
if u < 0.5 then s := 1 else s := −1; (s is a random sign)
take x := s·x1.

In order to generate a normal N(µ, σ) random variate w, the following step should be added to the preceding algorithm:

take w := µ + σ·x.
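The complete procedure of Example 3.3 as a Python sketch:

import math, random

# Rejection method of Example 3.3: exponential envelope for the positive
# normal deviate, then a random sign; finally w = mu + sigma*x for N(mu, sigma).
def normal_rejection(mu=0.0, sigma=1.0):
    while True:
        u = random.random()
        y = -math.log(random.random())           # y ~ Exp(1), independent of u
        if u < math.exp(-y * y / 2 + y - 0.5):   # acceptance condition
            break
    s = 1.0 if random.random() < 0.5 else -1.0   # random sign
    return mu + sigma * s * y

print(normal_rejection())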

Theorem 3.2 (See Vaduva-1977). Assume that the pdf of X is of the form

f(x) = c Q(ϕ(x)) r(x), (3.5)

where Q(z) is the cdf of a rv Z, 0 < Z < M, r(y) is the pdf of a rv Y (Y stochastically independent of Z), and ϕ is a function such that 0 ≤ ϕ(y) ≤ M. Then the conditional pdf of Y, given that Z ≤ ϕ(Y), is f.


The acceptance probability is pa = P(Z ≤ ϕ(Y)). Note that Theorem 3.1, as well as Theorem 3.2, remains valid if X and Y are random vectors. Note also that there is an alternative form of Theorem 3.2, for

f(x) = c(1 − Q(ϕ(x)))r(x). (3.5')

In this case, the statement of the theorem remains valid if the predicate is changed to Z ≥ ϕ(Y).

Example 3.4 (See Vaduva-1977). Let X* be a Gamma(0, 1, ν) random variable, 0 ≤ ν ≤ 1, and take X = X* restricted to X* ≥ 1. Then the pdf of X is of the form (3.5') with

ϕ(x) = x, Q(x) = 1 − 1/x^(1−ν), x ≥ 1, r(x) = e^(−x+1), x ≥ 1, c = 1/(e(Γ(ν) − Γ(1; ν))).

In the previous formula, Γ(ν) and Γ(1; ν) (the incomplete gamma function) are

Γ(ν) = ∫_0^∞ x^(ν−1) e^(−x) dx, Γ(1; ν) = ∫_0^1 x^(ν−1) e^(−x) dx. (3.6)

From the mentioned alternative theorem, the following algorithm for simulating X ∼ Gamma(0, 1, ν), X ≥ 1, is derived:

repeat
generate a random variate z ∼ Q, i.e. generate u ∼ unif(0, 1) and take z := u^b, b = −1/(1 − ν);
generate y0 ∼ Exp(λ = 1), y0 ∈ [1, ∞);
until z > y0;
take x := y0.

Theorem 3.3. Assume that Z1, Z2, ... are iid (short for "independent and identically distributed") random variables with cdf G(x), and that Z0 is independent of the Zi, having the cdf G0(x). Then the following assertions are valid:
1°. P(x > Z1 ≥ ... ≥ Zk−1 < Zk) = G(x)^(k−1)/(k−1)! − G(x)^k/k! for fixed k and x;
2°. If K is the rv such that x ≥ Z1 ≥ Z2 ≥ ... ≥ ZK−1 < ZK, then pa = P(K = odd integer) = e^(−G(x));
3°. If Z0 is the above-mentioned rv and the descending sequence from 1°, starting with Z0, breaks at K, which is an odd integer, then

F(x) = P(Z0 ≤ x | K = odd) = (1/pa) ∫_{−∞}^{x} e^(−G(t)) dG0(t), pa = ∫_{−∞}^{+∞} e^(−G(t)) dG0(t). (3.7)


For the rejection algorithm derived from this theorem, pa is the acceptance probability.

Example 3.5. Theorem 3.3 (due to Forsythe) leads to the following John von Neumann algorithm for simulating a random variate x ∼ Exp(λ = 1):

N := 0;
repeat
generate u0, u1 iid ∼ unif(0, 1); take u* := u0; k := 1;
while u0 > u1 do
begin u0 := u1; k := k + 1; generate u1; end;
if k mod 2 = 0 then N := N + 1;
until k mod 2 = 1;
take x := u* + N.

According to Theorem 3.3, if G0 and G are both the uniform (0, 1) cdf, then u* in the algorithm is Exp(λ = 1) distributed, truncated on [0, 1]. The theorem of John von Neumann says that x = N + u* ∼ Exp(λ = 1), which is the output of the previous algorithm.
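A Python sketch of von Neumann's algorithm:

import random

# John von Neumann's algorithm (Example 3.5) for x ~ Exp(1): count
# descending runs of uniforms; accept when the run length k is odd.
def vn_exponential():
    N = 0
    while True:
        u0 = u_star = random.random()
        u1 = random.random()
        k = 1
        while u0 > u1:                   # extend the descending run
            u0, u1 = u1, random.random()
            k += 1
        if k % 2 == 1:                   # odd run length: accept
            return u_star + N
        N += 1                           # even run length: shift and retry

print(vn_exponential())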

Example 3.6. If G0(x) := x^ν, x ∈ [0, 1], and G(x) is the uniform (0, 1) cdf (i.e. if in Theorem 3.3, Z0 = U^(1/ν) and the Zi = Ui are uniform (0, 1)), then Z0 in the accepted descending sequence has a Gamma(0, 1, ν) distribution truncated on [0, 1]. Combining this result with Example 3.4, one can derive a composition-rejection algorithm for simulating the Gamma(0, 1, ν) distribution when 0 < ν < 1.

3.2 Other Methods

3.2.1 Ratio-of-uniform MethodThis method, due to Kinderman and Monahan (see Devroye-1986), wasused to simulate various particular distributions. The following theorem(see Vaduva-1993) gives a general form of the method.

Theorem 3.4. Assume that the m-dimensional random vector X has the pdf of the form

f(x) = (1/H) h(x), x ∈ R^m, H = ∫_{R^m} h(x) dx, (3.8)

and consider the mapping ϕ : R^(m+1) → R^m defined as

ϕ(v0, v1, ..., vm) = (v1/v0^c, ..., vm/v0^c), c > 0. (3.9)

Consider the set C ⊂ R^(m+1),

C = {(v0, v1, ..., vm) | γ(v0, v1, ..., vm) ≤ 0}, (3.10)

where v0 > 0 and

γ(v0, ..., vm) = log v0 − d log h(ϕ(v0, v1, ..., vm)), d = 1/(mc + 1). (3.10')

If the set C is bounded in R^(m+1) and V is a uniform random vector over C, then the random vector X = ϕ(V) has the pdf f.

This theorem leads to the following general algorithm:

generate v ∼ unif(C); take x := ϕ(v).

In order to simulate the sampling value v ∼ unif(C), the following rejection procedure is used:

find a minimum interval I = [a0, b0] × [a1, b1] × ... × [am, bm], C ⊂ I;
repeat
generate a random vector w ∼ unif(I); (this will be done in Section 3.2.3)
until w ∈ C;
take v := w.

The minimum interval I is obtained from the conditions

ai = min_{(v0,v1,...,vm)∈C} vi, bi = max_{(v0,v1,...,vm)∈C} vi, i = 0, 1, ..., m. (3.11)

The acceptance probability of the previous algorithm is

pa = mes C / ∏_{i=0}^{m} (bi − ai), mes C = ∫_C dv. (3.11')

Example 3.7. Applying Theorem 3.4 to the Gamma(0, 1, ν) distribution, we obtain the limits of I as

a0 = 0, b0 = (ν − 1)^((ν−1)/(c+1)) e^(−(ν−1)/(c+1)); a1 = 0, b1 = ((cν + 1)/c)^((cν+1)/(c+1)) e^(−(cν+1)/(c+1)). (3.12)

The interval I is bounded and the algorithm is obvious.

Example 3.8. For simulating the normal N(0, 1) distribution, the limits of the interval I, when c = 1/2 (the value which maximizes the probability pa), and pa are

a0 = 0, b0 = 1, b1 = √(3/e), a1 = −b1, pa = √(2πe)/(3√3) ≈ 0.795. (3.13)


3.2.2 Simulation of Some Particular Distributions

Apart from the mentioned examples, various properties, sometimes combined with the general methods, give algorithms for simulating particular random variates. We list some of these methods below.

• The normal distribution can be simulated (approximately) by using the central limit theorem as follows:

z := 0;
for i := 1 to 12 do
begin
generate u ∼ unif(0, 1); z := z + u;
end;
take z := z − 6;

z is (approximately) a normal N(0, 1) random variate, since the sum of 12 uniform (0, 1) numbers has mean 6 and variance 1.
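A one-line Python sketch of this central limit approximation:

import random

# Approximate N(0,1) variate: the sum of 12 unif(0,1) numbers has
# mean 6 and variance 1, so subtracting 6 centers it.
def clt_normal():
    return sum(random.random() for _ in range(12)) - 6.0

print(clt_normal())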

• The Gamma(0, 1, ν) distribution, when ν > 1, can be easily simulated by the following simple procedure:

generate an Erlang variate E = Σ_{i=1}^{k} zi, k = [ν] (integer part), where the zi are iid Exp(1) distributed;
generate G ∼ Gamma(0, 1, p), p = ν − k, 0 < p < 1;
take x := E + G.

(Here the fact that the sum of independent gamma random variables is stillgamma distributed was used).

• The Beta distribution has the pdf of the form

f(x) = (1/B(a, b)) x^(a−1) (1 − x)^(b−1), x ∈ [0, 1], a > 0, b > 0; f(x) = 0, x ∉ [0, 1]. (3.14)

One simple method for simulating a Beta distributed random variate x is based on the property that if w1, w2 are independent and Gamma(0, 1, a), Gamma(0, 1, b) distributed respectively, then x = w1/(w1 + w2) is Beta distributed.
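A Python sketch of this gamma-ratio property (random.gammavariate supplies the two gamma variates):

import random

# Beta(a, b) via the ratio of two independent gamma variates:
# x = w1/(w1 + w2), w1 ~ Gamma(a), w2 ~ Gamma(b).
def beta(a, b):
    w1 = random.gammavariate(a, 1.0)
    w2 = random.gammavariate(b, 1.0)
    return w1 / (w1 + w2)

print(beta(2.0, 5.0))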

• Distributions based on Bernoulli trials are the binomial(n, p), geometric(p) and negative binomial(k, p) (or Pascal) distributions. A Bernoulli trial is an experiment which involves a fixed event A with a constant probability p = P(A); in an experiment the event may occur (i.e. a success takes place) or may not occur (i.e. a failure takes place). If we associate to the Bernoulli trial the random variable Z such that P(Z = 1) = p = P(success) and P(Z = 0) = 1 − p = q = P(failure), then Z can be simulated as:

generate u ∼ unif(0, 1); if u < p then z := 1 else z := 0.


The Binomial(n, p), n ∈ N+, rv X is the number of successes in n independent Bernoulli trials, and this is one method for generating it.
It is known that the binomial distribution is connected with an urn containing balls of two colors, white and black, such that p = P(a white ball is extracted), the ball being returned to the urn. The rv X is the number of white balls in n independent extractions. If the extracted balls are not returned and the extractions start with a given composition of the urn (say, A = p·N white balls, N the total number of balls in the urn), then the number Y of white balls extracted out of n has a hypergeometric distribution and it is simulated as:

given N, p, n, initialize i := 0; y := 0;
repeat
generate u with rnd; take i := i + 1;
if u < p then s := 1 else s := 0;
update p := (Np − s)/(N − 1); N := N − 1; y := y + s;
until i = n.

The geometric(p) rv Y is the number of failures until one success occurs in several Bernoulli trials.

The Pascal(k, p), k ∈ N+, rv T is the number of failures until k successes occur. When k = 1, T is a geometric rv. An algorithm (based on counting Bernoulli trials) for simulating T is:

T := 0; s := 0;
repeat
generate u ∼ unif(0, 1); if u > p then T := T + 1 else s := s + 1;
until s = k.

• Poisson distribution. The rv X ∈ N+ has a Poisson(λ) distribution if its frequency function is

f(k) = P(X = k) = (λ^k/k!) e^(−λ). (3.15)

Taking into account the fact that, if some events occur at random time intervals which are Exp(λ) distributed, then the number of events occurring in a unit time interval is Poisson(λ), it follows that the Poisson(λ) random variate x is simulated as

x := 0; P := 1; L := e^(−λ);
repeat
generate u ∼ unif(0, 1); take P := P·u;
if P ≥ L then x := x + 1;
until P < L.
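A Python sketch of this multiplication algorithm:

import math, random

# Poisson(lambda): multiply uniforms until the product drops below
# exp(-lambda); the number of factors that stayed above it is x.
def poisson(lam):
    x, P, L = 0, 1.0, math.exp(-lam)
    while True:
        P *= random.random()
        if P < L:
            return x
        x += 1

print(poisson(3.5))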

3.2.3 Simulation of Some Multivariate Distributions

The multivariate inverse method and Theorem 3.4 are general methods for simulating random vectors. All such methods consist in simulating the components of the random vectors. Here we present some algorithms for simulating particular multivariate distributions.

• Uniform random vectors. If W = (W1, W2, ..., Wm)′ is uniformly distributed on the interval I = [a1, b1] × ... × [am, bm], then the components Wi, i = 1, 2, ..., m, are independent and uniformly distributed over [ai, bi]. Therefore the algorithm for simulating W is

for i := 1 to m dobegin generate u ∼ unif(0, 1); take wi := ai + (bi − ai) ∗ u end.

An algorithm for simulating an m-dimensional random vector V uniformly distributed over some domain C ⊂ R^m was presented in connection with Theorem 3.4.

• The multivariate normal distribution N(µ, Σ) has the pdf

f(x) = 1/((2π)^(m/2) det(Σ)^(1/2)) e^(−Q(x)/2), Q(x) = (x − µ)′Σ⁻¹(x − µ), (3.16)

where µ is the mean vector and Σ = (σij) is the covariance matrix. The random vector X, distributed N(µ, Σ), can be simulated by using Theorem 3.4 (as in Examples 3.7 and 3.8). The optimum interval I ⊃ C obtained for c = 1/2 is given by

bi = √((m + 2)σii/e), ai = −bi, 1 ≤ i ≤ m. (3.17)

Another method uses the following property of the normal distribution: if Y is normally distributed and D is an m × m matrix, then the random vector X = DY is also normal, with covariance matrix DΣY D′. Therefore, to generate X ∼ N(µ, Σ), one first determines the lower triangular matrix D such that DD′ = Σ (the Cholesky decomposition). Then X is generated as

generate z ∼ N(0, I) (I is the unit matrix);
take x := µ + Dz.
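A sketch of this Cholesky-based method using numpy (np.linalg.cholesky returns the lower triangular D with DD′ = Σ):

import numpy as np

# N(mu, Sigma) via the Cholesky factor D of Sigma: x = mu + D z, z ~ N(0, I).
def multivariate_normal(mu, Sigma):
    D = np.linalg.cholesky(Sigma)             # lower triangular, D @ D.T == Sigma
    z = np.random.standard_normal(len(mu))    # z ~ N(0, I)
    return mu + D @ z

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
print(multivariate_normal(mu, Sigma))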

• The Multivariate Lomax distribution MLm(a, θ1, ..., θm) has the pdf of the form

f(x1, ..., xm) = (∏_{i=1}^{m} θi) a(a + 1)...(a + m − 1) / (1 + Σ_{i=1}^{m} θi xi)^(a+m), θi, a > 0. (3.18)


It is known that if X = (X1, ..., Xm)′ and the Xi, 1 ≤ i ≤ m, are independent and Exp(ηλi) distributed, with η a random variable, η ∼ Gamma(0, b, a), then X is Lomax MLm(a, θ1, ..., θm) with θi = λi/b, 1 ≤ i ≤ m. This gives an obvious composition procedure for simulating X. The multivariate inverse method induces the following algorithm for simulating X:

generate u ∼ unif(0, 1) and take x1 := (1/θ1)(u^(−1/a) − 1);
for k := 2 to m do
begin
generate u ∼ unif(0, 1);
take xk := ((1 + Σ_{j=1}^{k−1} θj xj)/θk)(u^(−1/(a+k−1)) − 1);
end.

• The Multinomial distribution MD(n, p1, p2, ..., pm) is an m-dimensional extension of the binomial distribution. For X ∼ MD(n, p1, ..., pm), the frequency function is

P(X1 = n1, ..., Xm = nm) = (n!/(n1! ... nm!)) p1^(n1) ... pm^(nm), n = n1 + ... + nm. (3.19)

If A1, A2, ..., Am are exclusive events with pi = P(Ai), 1 ≤ i ≤ m (i.e. in one experiment only one of the Ai occurs), then the component Xi of X is the number of realizations of Ai in n independent experiments. This gives the hint for simulating X.

4 The Use of Simulation in Statistics

4.1 Estimation of Parameters via Simulation

A simulation study is frequently performed to determine the value of some parameter θ connected to a particular stochastic model, such as θ = E(X), X being a random quantity which can be simulated. Of course, if we generate a sample x1, ..., xn on X, the estimate of θ is the arithmetic mean x̄ = (Σ_{i=1}^{n} xi)/n, which is an unbiased and consistent estimator of θ. We can operate in a similar manner if θ is the expectation of a function of X, θ = E[f(X)]. One question when estimating a parameter is first to determine the sample size n. This can be done by using Tchebyshev's inequality. More precisely, if the variance σ² of X (or of f(X)) is known, and we wish to estimate θ with a given error ε, such that

P(|x̄ − θ| < ε) ≥ δ, δ ∈ (0, 1), (4.1)

(with δ large enough), then one can see that

n ≥ σ² tδ²/ε², tδ = √(1/(1 − δ)). (4.2)


If σ² is unknown, then it is estimated by a preliminary computer run, namely: k variates x1, ..., xk are generated (k ≈ 30), then x̄ is calculated and σ² ≈ s² = (Σ_{i=1}^{k} (xi − x̄)²)/(k − 1) (s² is an unbiased and consistent estimate of σ²).

A good estimate of θ can be obtained if one first determines k such that Var(x̄) is less than some constant d, as follows:

Take k := 30 and generate the sample x1, ..., xk; calculate s²;
repeat
generate a new x and include it in the sample, then again calculate s²;
take k := k + 1;
until s/√k < d.

Now, with the resulting k, we can determine a good estimate x̄ of θ.
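A Python sketch of this adaptive procedure; the argument draw_x stands for any simulated random quantity (illustrated here, as an assumption, with unif(0, 1) draws):

import random, statistics

# Grow the sample until the standard error s/sqrt(k) drops below d.
def estimate_mean(draw_x, d=0.01):
    sample = [draw_x() for _ in range(30)]    # preliminary run, k = 30
    while statistics.stdev(sample) / len(sample) ** 0.5 >= d:
        sample.append(draw_x())               # add one variate, recompute s
    return statistics.mean(sample), len(sample)

print(estimate_mean(random.random))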

If θ = p (i.e. X is a Bernoulli random variable taking values 0 or 1), then the previous algorithm gives the estimate of p with a required error ε = d. In statistics, parameters are also estimated by a tolerance interval, which is an interval of the form [t1, t2] with ti = ti(x1, ..., xn), i = 1, 2, satisfying the property that P(θ ∈ [t1, t2]) = δ, where δ ∈ (0, 1), δ large, is the tolerance level. If we are interested in estimating θ by a short tolerance interval, then we must also use a suitably large sample size. For a large sample size n, a tolerance interval is obtained by taking into account the fact that the arithmetic mean x̄ is asymptotically normal N(θ, σ/√n). This interval is

[t1, t2], t1 = x̄ − zδ/2 s/√n, t2 = x̄ + zδ/2 s/√n, (4.3)

where zδ/2 is defined by

(1/√(2π)) ∫_0^{zδ/2} e^(−z²/2) dz = δ/2, (4.4)

and x̄, s are estimates of θ and σ respectively.

4.2 Use of Simulation in Hypotheses Testing

In many situations, in order to test some statistical hypothesis, we need the critical values of the test statistic or, for some tests, we need to estimate the test power.

Assume that we are interested in testing some hypothesis H on some random variable X and that we use some sample x1, x2, ..., xn with a given n. If the test uses the statistic t = t(x1, ..., xn), then for a given significance level α we need the critical value tα such that P(t > tα) = α (for a one-sided test!). In order to estimate tα we simulate N replicates of t and build up a histogram as follows:


input N = the sample size used to estimate tα;
input n and k = the number of intervals of the histogram of t;
for i := 0 to k do νi := 0; (νi are the frequencies of the histogram)
input a sample size n1, n1 << N;
for i := 1 to n1 do
begin
generate under the hypothesis H the sample x1, ..., xn;
calculate the test statistic ti := t(x1, ..., xn);
end;
order the vector (t1, ..., tn1); take a1 := min_{1≤i≤n1} ti; ak−1 := max_{1≤i≤n1} ti;
take h := (ak−1 − a1)/(k − 2); (h is the common length of the inner k − 2 intervals of the histogram)
for i := 1 to n1 do
begin calculate j := [(ti − a1)/h] + 2; take νj := νj + 1; end; ([z] is the integer part of z)
take a0 := a1; ak := ak−1;
initialize i := n1 + 1;
repeat
generate a sample of size n on X: x1, ..., xn;
calculate t := t(x1, ..., xn); take i := i + 1;
if t < a0 then begin a0 := t; ν0 := ν0 + 1; end
else if t > ak−1 then begin ak := t; νk := νk + 1; end
else begin take j := [(t − a1)/h] + 2; νj := νj + 1; end;
until i = N;

By a linear interpolation between the values ai0, ai0+1, for which

Σ_{i=1}^{i0} νi/N ≤ 1 − α < Σ_{i=1}^{i0+1} νi/N,

one can estimate the upper quantile tα. If the test power is needed, the procedure is similar: the "generate" statements in the previous algorithm assume that the xi are simulated under the alternative hypothesis nonH.

A special case arises for the Kolmogorov-Smirnov goodness-of-fit test. In this case the statistic is t = sup_x |F(x) − Fn(x)|, where F(x) is the theoretical cdf of X and Fn(x) is the empirical cdf of X, i.e. Fn(x) = #{i : xi < x}/n. It is known that the cdf of t does not depend on F. Therefore, to estimate the test power π = P(t > tα | nonH), we need to simulate the xi, 1 ≤ i ≤ n, under an alternative hypothesis, estimate Fn(x) and then calculate t with F = the real (null) cdf. The previous algorithm will produce a histogram of this t. Then, with tα known, the test power π is easily estimated from this histogram.


For the two-sample problem (i.e. testing H : F0 = G0), the test power is estimated by also using a histogram of tn,m = sup_x |Fn(x) − Gm(x)|, where Fn(x), Gm(x) are empirical cdf's obtained by simulating two different (known) cdf's F and G. The same technique is used for the Cramer-von Mises test or the Anderson-Darling test.

4.3 The Bootstrap Technique

The bootstrap method, introduced by Efron, consists in resampling (or reusing) a sample, as will be described in the following. Let x = (x1, x2, ..., xn) be a sample on the rv X which has the cdf F(x), and assume that we have to estimate a parameter θ(F) of X. The bootstrap technique consists in resampling the initial sample. In other words, if Fn(x) is the empirical cdf of X obtained with the given sample, a bootstrap sample is x* = (x*1, x*2, ..., x*n), obtained by simulating Fn(x). Usually the bootstrap sample is obtained by the following algorithm:

take i := 0;
repeat
generate u ∼ unif(0, 1); take j := [nu] + 1; i := i + 1; x*i := xj;
until i = n.

Therefore the bootstrap sample is obtained by extracting with replacementn values from the initial sample. Usually, B bootstrap samples x∗1,x∗2, ...,x∗Bare considered. The bootstrap samples are used in solving various statisticalproblems. Here we will give only some examples.
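A Python sketch of the resampling step and of generating B bootstrap replicates of a statistic g:

import random

# One bootstrap sample: n draws with replacement from the initial sample
# (j := [n*u] + 1 above; indices are 0-based here), then B replicates of g.
def bootstrap_sample(x):
    n = len(x)
    return [x[int(n * random.random())] for _ in range(n)]

def bootstrap_stats(x, g, B=1000):
    return [g(bootstrap_sample(x)) for _ in range(B)]

data = [2.1, 3.4, 1.7, 4.0, 2.8]
means = bootstrap_stats(data, lambda s: sum(s) / len(s))
print(min(means), max(means))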

Estimating the mean square error. Assume that the parameter θ(F) can be estimated by g(x1, ..., xn). If the parameter θ(F) is estimated by θ(Fn), we say that the plug-in principle of estimation has been used. The mean square error of the estimate is

MSE_F(g) = E_F[(g(x1, ..., xn) − θ(F))²]. (4.5)

The algorithm for estimating MSE_F(g) is:

for b := 1 to B do
begin generate a bootstrap sample x*b; calculate θ*(b) = g(x*b); end;
calculate the arithmetic mean and the variance of the variates θ*(b), 1 ≤ b ≤ B:

θ̄* = (1/B) Σ_{b=1}^{B} θ*(b); MSE_F(g) = √(Σ_{b=1}^{B} (θ*(b) − θ̄*)²/(B − 1)). (4.6)


Confidence intervals based on bootstrap percentiles. A confidenceinterval for θ(F ) can be obtained according to the following procedure:

Generate a large number B of bootstrap samples x*b, 1 ≤ b ≤ B; calculate the bootstrap estimates θ*(b);
build up a histogram of the θ*(b);
for a given confidence level δ = 1 − 2α, determine the lower and upper α-quantiles θα, θ1−α as

θα = G⁻¹(α), θ1−α = G⁻¹(1 − α), (4.7)

where G is the cdf of θ* and G⁻¹ is the inverse of G. The calculation of quantiles was illustrated in Section 4.2.

Many bootstrap applications are based on the fact that, for large B, the arithmetic mean of the bootstrap estimates θ*(b), 1 ≤ b ≤ B, is asymptotically normal. This induces a bootstrap confidence interval for θ(F) (with quantiles of the N(0, 1) distribution), namely

θ̄* − MSE·zα/2 ≤ θ(F) ≤ θ̄* + MSE·zα/2, (1/√(2π)) ∫_{−zα/2}^{zα/2} e^(−u²/2) du = 1 − α. (4.7')

Bootstrap in linear models. Assume we have the linear model

Ω : y = Cβ + e, (4.8)

where C is an n × p design matrix, y is the response vector (of sampling values) and e is the error vector, assumed to have the covariance matrix Σ = σ²I, with I the n × n unit matrix. The problem is to estimate the vector β and the variance σ² from the observed data set x = (C, y). Usually, the estimate β̂ of β is obtained by the least squares method as the solution of the system of normal equations

C^T C β̂ = C^T y. (4.9)

Let us denote by ci the rows of C (called covariates) and by ei the components of e. Here the bootstrap samples can be obtained in two ways: by bootstrapping the data set x, obtaining the bootstrapped data set x* = (x*1, x*2, ..., x*n), x*i = (ci, yi); or by first estimating β from the original data set x, then calculating the residuals êi = yi − ci β̂, bootstrapping the êi as e*i and taking as bootstrap sample

x* = ((c1, c1β̂ + e*1), ..., (cn, cnβ̂ + e*n)). (4.10)


The first method consists in bootstrapping pairs, while the second consists in bootstrapping residuals.

It is reported that, when the probability distribution of the errors depends on the covariates ci (which might be the general case), bootstrapping pairs is the better choice.

5 Use of Simulation in Numerical Calculations

5.1 Generalities on the Monte Carlo Method

The Monte Carlo method consists in estimating the solution θ of a numerical problem by using a suitable sample of a random variable or of a stochastic process ξ (see Ermakov-1971, Fishman-1996, Gentle-1998, Ripley-1986, Ross-1997, Robert and Casella-1999). A function τ(ξ) such that E[τ(ξ)] = θ is called a primary estimator of θ. The solution θ is then estimated by

τ̄ = (1/n) Σ_{i=1}^{n} τ(ξi), (5.1)

where ξi, 1 ≤ i ≤ n, is a sample of ξ simulated by the computer. The estimator τ̄ is called a secondary estimator. The minimum sample size n0 can be calculated by using Tchebysheff's inequality as in Section 4.1. If the approximation error ε is given and δ is a given large probability of obtaining an error less than ε, then the minimum sample size is

n0 = [tδ² σ²/ε²] + 1, tδ = √(1/(1 − δ)), Var[τ(ξ)] = σ², (5.2)

where [z] denotes the integer part of z.

5.2 Evaluating an Integral

5.2.1 Crude Monte Carlo

Assume, without loss of generality, that we have to calculate the integral

θ = ∫_0^1 f(x) dx. (5.3)

Note that this can be written as θ = E[f(U)], where U is uniform (0, 1). Therefore, a primary estimator is τ = f(U), and θ can be estimated by the corresponding secondary estimator. This method (based on ξ = U) is called the crude Monte Carlo method. Of course, for calculating an integral over some real interval (a, b), a change of variable can reduce the problem to the interval (0, 1). Note also that if σ² = Var(f(U)) exists, then the minimum sample size n0 can be calculated. Note also that the Monte Carlo method is highly recommended for calculating a multiple integral of the form

θ = ∫_D f(x) dx, D ⊂ R^k, θ = mes(D) ∫_D f(x) dx/mes(D) = mes(D) E[f(V)], (5.3')

where V is a random vector uniformly distributed on D and mes(D) is the measure of D.
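A Python sketch of crude Monte Carlo for a one-dimensional integral over (0, 1):

import math, random

# Crude Monte Carlo: theta = integral of f over (0,1) is estimated by the
# arithmetic mean of f(U_i), U_i ~ unif(0,1) (the secondary estimator).
def crude_mc(f, n=100_000):
    return sum(f(random.random()) for _ in range(n)) / n

# check: the integral of exp(x) over (0,1) equals e - 1
print(crude_mc(math.exp), math.e - 1)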

Example 5.1. Let us calculate the multiple integral

I = ∫_D f(x, y, z; x′, y′, z′) dx dy dz dx′ dy′ dz′, f(x, y, z; x′, y′, z′) = α log d + β,

d = √((x − x′)² + (y − y′)² + (z − z′)²),

and D = ∆ × ∆, ∆ = [a, a1] × [b, b1] × [c, c1], ∆ ⊂ R³. (Such a calculation can be required in geostatistics.) Such an integral is difficult to calculate by quadrature formulae; therefore it will be calculated by Monte Carlo methods.

The integral I can be written in the form

I = mes(∆)² ∫_D (1/mes(∆)²) f(x, y, z; x′, y′, z′) dx dy dz dx′ dy′ dz′ = mes(∆)² I1, (5.3'')

mes(∆) = (a1 − a)(b1 − b)(c1 − c) > 0.

The integral in the last formula can be written in the form

I1 = E[f(V; V′)], V, V′ uniform on ∆ and independent. (5.3''')

Now the crude Monte Carlo procedure for calculating I1 (and I) is obvious. The sample size n can be determined beforehand according to formula (4.2) or (5.2) and the discussions regarding them.

Exercise. Calculate the integral

I = ∫_D f(x, y, z) dx dy dz, f(x, y, z) = √(x² + y² + z²)(cos x + sin y + cos z),

where D is the sphere D = {(x, y, z) : x² + y² + z² ≤ 1}.
Hint. Note that |f(x, y, z)| ≤ m = 3√3. Now determine the sample size n according to (4.2) with σ = m. Then generate points Pi = (xi, yi, zi)′, 1 ≤ i ≤ n, uniformly distributed in D, and apply the crude Monte Carlo method.

Finally, note that the error of the estimator τ̄ depends on the variance of the primary estimator, σ² = Var(τ). If, for a given problem, one can find a primary estimator having a smaller variance than the variance of the crude Monte Carlo one, then one says that this primary estimator produces a variance reduction.

5.3 Variance Reduction Techniques

5.3.1 Importance Sampling

Note that the variance of τ in the case of crude Monte Carlo depends on the variation of the function f(x) on the interval (a, b). A primary estimator which can reduce this variation can be obtained if we choose a pdf p(x) over (0, 1) which has a shape similar to f(x). The pdf p(x) is called the importance function and a sample from this pdf is called an importance sample. Note that

θ = ∫_0^1 p(x) (f(x)/p(x)) dx = E[f(Y)/p(Y)], (5.3)

where Y is the random variable having the pdf p. If p(x) is properly chosen, then f(x)/p(x) ≈ const and the primary estimator τi(Y) = f(Y)/p(Y) has a variance Var(τi) = σi² < σ². Note that the method is highly recommended in the multivariate case.

Example 5.2. Let us calculate the integral

I = ∫_0^∞ ∫_0^∞ (cos x + sin x)³ e^(−x²−y³) dx dy = ∫_0^∞ ∫_0^∞ f(x, y) dx dy. (5.3')

This is a double improper integral and it is convergent because

|(cos x + sin x)³ e^(−x²−y³)| < 8 e^(−x²−y³),

and the last function is integrable.

In order to calculate I by importance sampling, we must select a good importance function, which has to be a pdf that is easy to simulate. Note that such a function could be

p(x, y) = e^(−x−y).

The random vector (X, Y) with the pdf p(x, y) has components which are exponential Exp(1) distributed and independent, therefore (X, Y) can be generated straightforwardly. Because I is of the form

I = E[f(X, Y)/p(X, Y)], (5.3'')

it follows that a primary estimator is

(cos X + sin X)³ e^(−X²−Y³+X+Y).

Note that this primary estimator is integrable (the previous function is bounded on (0, ∞) × (0, ∞)), therefore the importance sampling procedure can be applied.
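A Python sketch of this importance sampling procedure for Example 5.2:

import math, random

# Importance sampling: p(x,y) = e^{-x-y}, so X, Y ~ Exp(1) independent,
# and the primary estimator is f(X,Y)/p(X,Y).
def importance_mc(n=200_000):
    s = 0.0
    for _ in range(n):
        x = -math.log(random.random())        # X ~ Exp(1)
        y = -math.log(random.random())        # Y ~ Exp(1)
        s += (math.cos(x) + math.sin(x)) ** 3 * math.exp(-x * x - y ** 3 + x + y)
    return s / n

print(importance_mc())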


Exercise. Calculate the integral

I = ∫_0^∞ ∫_0^∞ f(x, y) dx dy, f(x, y) = 1/(1 + x^(2a) + y^(2a)), a > 1.

Hint. Use importance sampling. Take as importance density

p(x, y) = a(a + 1)/(1 + x + y)^(a+2).

The primary estimator is

f(x, y)/p(x, y) = (1 + x + y)^(a+2)/(a(a + 1)(1 + x^(2a) + y^(2a))),

which is an integrable function on (0, ∞) × (0, ∞).

5.3.2 Antithetic Variates

Another variance reduction technique (called antithetic variates) is obtained as follows: consider the primary unbiased estimator τ and another primary unbiased estimator τ1, and take as a new primary unbiased estimator

ψ = (τ + τ1)/2. (5.4)

If Cov(τ, τ1) = στ,τ1 < 0 (Cov denotes the covariance), then the variance Var(ψ) = σψ² can be made less than σ², i.e. a variance reduction is obtained. If τ = f(U) is the crude Monte Carlo unbiased estimator, then one can take τ1 = f(1 − U), for which Var(τ1) = σ². This gives a negative covariance στ,τ1 < 0, implying the variance reduction σψ² < σ²/2.

Exercise. Calculate the integral

I = ∫_0^1 f(x) dx, f(x) = e^(x^a)(cos²x + sin x), a > 0.

Hint. Apply the method of antithetic variates. The procedure is the following:
- Calculate n using (4.2) with σ = m = 2e, where |f(x)| ≤ m;
- Generate random numbers Ui, 1 ≤ i ≤ n;
- Calculate Vi = 1 − Ui and

τ1i = e^(Ui^a)(cos²Ui + sin Ui); τ2i = e^(Vi^a)(cos²Vi + sin Vi);

- Calculate the primary estimators τi = (τ1i + τ2i)/2, 1 ≤ i ≤ n.

The secondary estimator of I is

In = (1/n) Σ_{i=1}^{n} τi.
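A Python sketch of this antithetic procedure (the value a = 2 is an assumed example choice):

import math, random

# Antithetic variates: pair U with 1-U and average the two primary
# estimators, psi = (tau + tau1)/2, as in (5.4).
def antithetic_mc(a=2.0, n=100_000):
    f = lambda x: math.exp(x ** a) * (math.cos(x) ** 2 + math.sin(x))
    s = 0.0
    for _ in range(n):
        u = random.random()
        s += 0.5 * (f(u) + f(1.0 - u))
    return s / n

print(antithetic_mc())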


5.3.3 Control Variates

Another variance reduction technique (called control variates) consists in using an approximating integrable function φ such that Var(φ(U)) < σ² and µ = ∫_0^1 φ(x) dx is known. Then θ = ∫_0^1 f(x) dx − ∫_0^1 φ(x) dx + µ, and therefore it is necessary to estimate only the parameter ν = θ − µ. The primary estimator in this case can be taken as τa = f(U) − φ(U), which also gives a variance reduction.

5.4 Solving Operatorial Equations

In order to solve operatorial equations, some notions related to Markov chains are necessary. Consider a sequence X1, X2, ... of random variables. Interpret the value of Xn as the state of the system at time n, and assume that the set of possible states is {1, 2, ..., N}. If the conditional probability satisfies P(Xn+1 = j | Xn = i, Xn−1 = i1, ...) = P(Xn+1 = j | Xn = i) = Pij (i.e. the probability of passing from state i at time n to state j at time n + 1 depends only on the previous state i), we say that (Xn), n ≥ 0, is a Markov chain with transition probabilities Pij, i, j = 1, ..., N. The elements of the transition matrix P = (Pij) satisfy the condition

Σ_{j=1}^{N} Pij = 1, i = 1, 2, ..., N. (5.5)

As the transition probabilities Pij do not depend on the time n, the Markov chain is homogeneous. The previous formula leads to the following algorithm for simulating a random state of a Markov chain.

Assume the previous state is i. Calculate Fk = Σ_{s=1}^{k} Pis, k = 1, ..., N;
generate u ∼ unif(0, 1) and take j := 1;
while u > Fj do j := j + 1;
j is the generated state.

If the algorithm is repeated (with the update i := j) one obtains a trajectory of the process, i, j1, j2, .... If, for a Markov chain, there is a state i0 such that Pi0i0 = 1, then i0 is an absorbing state (if the chain enters this state it will never leave it). A Markov chain is ergodic if, as the time n → ∞, the probability distribution of the states becomes stationary (i.e. unchanged in time).

5.4.1 Solving Systems of Linear Equations

Every system of linear equations can be written in the form

x = Hx + b, H = (hij), (5.6)


where x and b are k × 1 matrices, and H is a k × k nonsingular matrix with ‖H‖ < 1 (‖H‖ is the norm of the matrix H). The Monte Carlo procedure for solving the system consists first of associating a Markov chain with k + 1 states, having transition probabilities satisfying the conditions:

Pij ≠ 0 if hij ≠ 0, 1 ≤ i, j ≤ k; Σ_{j=1}^{k} Pij = 1 − pi < 1, (5.7)

Pi,k+1 = pi > 0, 1 ≤ i ≤ k; Pk+1,k+1 = 1; Pk+1,i = 0, i ≤ k. (5.7')

Consider now

vij = hij/Pij if Pij ≠ 0, vij = 0 if Pij = 0. (5.7'')

The Monte Carlo algorithm for estimating the solution component xi, i fixed, is the following:

input N; (N is the sample size) j := 0; xi := 0;
repeat
generate a trajectory γ = (i, i1, i2, ..., im, k + 1) of the Markov chain (i.e. until absorption);
calculate Vm(γ) = v(i, i1) v(i1, i2) ... v(im−1, im); X(γ) = Vm(γ) bim/pim;
take xi := xi + X(γ); j := j + 1;
until j = N;
take xi := xi/N.

A theorem (see Ermakov-1971) says that X(γ) is a primary estimator of the component xi, and the algorithm gives the estimate of xi as the arithmetic mean of the primary estimator. This algorithm is mainly used when it is necessary to obtain only the component xi of the solution.

In order to obtain other components xj, j ≠ i, the algorithm can be applied for each j, 1 ≤ j ≤ k. An alternative is to use a uniform distribution for the initial states, calculate the primary estimator for each trajectory starting in the initial random state, and then calculate the secondary estimators for all the states. (In this case, the sample size N must be large enough to make sure that several trajectories starting from each state are obtained, and these must be counted in N1, N2, ..., Nk.)
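A Python sketch of this scheme for estimating one component xi; the chain used here (transitions proportional to |hij| with a fixed absorption probability p) is an assumed simple choice satisfying (5.7)-(5.7'):

import random

# Monte Carlo for x = Hx + b: from state s, absorb with probability p,
# otherwise move to j with probability (1-p)|h_sj|/w_s, w_s = sum_j |h_sj|
# (rows of H assumed nonzero); estimator X(gamma) = V_m * b_s / p.
def mc_component(H, b, i, N=20000, p=0.3):
    total = 0.0
    for _ in range(N):
        s, V = i, 1.0
        while random.random() >= p:                  # not absorbed yet
            w = sum(abs(h) for h in H[s])
            u, c = random.random() * w, 0.0
            for j, h in enumerate(H[s]):
                c += abs(h)
                if u < c:
                    break
            V *= H[s][j] / ((1 - p) * abs(H[s][j]) / w)   # v_sj = h_sj / P_sj
            s = j
        total += V * b[s] / p
    return total / N

H = [[0.1, 0.2], [0.3, 0.1]]   # ||H|| < 1
b = [1.0, 2.0]                 # exact solution: x = (26/15, 14/5)
print(mc_component(H, b, 0))   # should be close to 1.733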

5.4.2 Solving Integral Equations

In order to find a numerical solution of the integral equation

f(x) = g(x) + ∫_a^b K(x, y) f(y) dy, (5.8)


where g is a known function and K is a known kernel (f is the unknown function), one method is to start with the discretization of the equation (on a grid of points a = a1 < a2 < ... < ak = b), reducing the problem to solving a system of linear equations of the form (5.6), where x = (f(a1), ..., f(ak)), H = ‖K(ai, aj)‖, and b = (g(a1), ..., g(ak)).

When the functions involved belong to the space L2[a, b] (i.e. they are square integrable), the problem can be studied in a similar manner. Assume that φ0, φ1, ... is a basis of L2 and consider the expansions

g(x) = a0φ0(x) + a1φ1(x) + ...,
∫_a^b K(x, y)φj(y) dy = h0jφ0(x) + h1jφ1(x) + ..., (5.9)
f(x) = x0φ0(x) + x1φ1(x) + ....

If the sums in formulae (5.9) have a finite number of terms (i.e. they approximate the functions with a finite number of terms, say k), then the problem is again reduced to a system of linear equations of the form (5.6), for which the Monte Carlo solution is known from the previous section.

Another method consists in using an absorbing Markov process having a transition pdf P(x, y) such that

P(x, y) > 0 if K(x, y) ≠ 0, ∫_a^b P(x, y) dy < 1. (5.10)

Consider now the notations

p(x) = 1 − ∫_a^b P(x, y) dy, v(x, y) = K(x, y)/P(x, y). (5.10')

Generate a trajectory γ = (x0, x1, ..., xk) of states for the absorbing Markov chain as follows. Generate first an initial state x0 with the initial distribution π(x) (this could be a density). Then generate the next state x1 with the pdf P(x0, x)/(1 − p(x0)). Given the non-absorbing state xi−1, generate the state xi with the pdf P(xi−1, x)/(1 − p(xi−1)); the state xi is absorbing with probability p(xi) (and non-absorbing with probability 1 − p(xi)). When the last state (say xk) is absorbing, the trajectory is completed. The primary estimator is

X(γ) = Vk(γ) g(xk)/p(xk), where Vk(γ) = v(x0, x1)...v(xk−1, xk).

One can show that if

‖K‖ = \sup_x \int_a^b |K(x, y)|\, dy < 1 \qquad (5.11)

then X(γ) is an unbiased estimator for f(x_0). The algorithm is similar to the previous one. Details on these methods are found in the mentioned literature.
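As an illustration, the following sketch (our own, not from the text) estimates f(x_0) for a simple kernel, assuming the convenient constant transition density P(x, y) = c/(b − a) with 0 < c < 1, so that p(x) = 1 − c:

import numpy as np

def mc_integral_eq(g, K, a, b, x0, c=0.5, N=100_000, seed=1):
    # Hypothetical sketch: estimates f(x0) for f = g + int_a^b K(., y) f(y) dy
    # using the absorbing random walk above with the assumed transition
    # density P(x, y) = c/(b - a), hence p(x) = 1 - c for all x.
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(N):
        x, V = x0, 1.0
        while rng.random() < c:               # continue with probability 1 - p(x) = c
            y = rng.uniform(a, b)             # next state ~ P(x, .)/(1 - p(x))
            V *= K(x, y) / (c / (b - a))      # v(x, y) = K(x, y)/P(x, y)
            x = y
        total += V * g(x) / (1.0 - c)         # X(gamma) = V_k g(x_k)/p(x_k)
    return total / N

# Toy kernel K = 0.5, g = 1 on [0, 1]: the exact solution is f = 1/(1 - 0.5) = 2.
print(mc_integral_eq(lambda x: 1.0, lambda x, y: 0.5, 0.0, 1.0, x0=0.3))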

5.5 Monte Carlo Optimization

Another type of numerical problem which can be solved via Monte Carlo is optimization in the following form: find the optimum of a function h(x) ∈ R when x ∈ D ⊂ R^k. In general, the set D is bounded. (See Robert and Casella-1999 for details.) Without loss of generality assume that the problem is: find x* ∈ D such that

h^* = \max_{x \in D} h(x) = h(x^*), \qquad h(x) = E[h(x, Z)], \qquad (5.12)

where Z is a random variable. Any deterministic maximization problem can be formulated as (5.12) (except for the last formula involving the expectation). This suggests that the maximum point x* can be obtained by a random search procedure such as:

i := 1; generate v_1 ∼ unif(D); take M = h(v_1); input ε > 0;
repeat
  take v_1 = v_i; i := i + 1; generate v_i ∼ unif(D);
  calculate M := max[M, h(v_i)]; if M = h(v_i) then x* = v_i;
until |v_i − v_1| < ε.

One theorem (of Gnedenko-1943) says that if h is continuous on D, then M converges to h* and the corresponding point converges to x*. (A sketch of this uniform random search is given after the next algorithm.) When h(x) is defined as an expectation and Z has the pdf f(z, x), then, instead of using uniform points v_i on D, we can use a search technique based on random variates z_1, z_2, ..., z_m from the pdf f(z, x) involved in the definition of h. Now the maximization algorithm becomes:

take i := 1; select x_1 ∈ D, uniformly distributed on D; input m and ε > 0;
repeat
  generate z_1, ..., z_m with the pdf f(z, x_i);
  calculate \bar{h}_i = \frac{1}{m} \sum_{j=1}^{m} h(z_j, x_i);
  calculate h^* = \max_i \bar{h}_i; if h^* = \bar{h}_{i_0} then x^* = x_{i_0};
  x_1 := x_i; update i := i + 1; x_i := x^*;
until |x_1 − x_i| < ε;
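The following minimal Python sketch (the names and the interval-shaped domain are our illustrative assumptions) implements the first, uniform random-search procedure; the stochastic version is obtained by replacing the uniform draws with samples from f(z, x):

import numpy as np

def random_search(h, low, high, n_iter=50_000, seed=0):
    # Sketch of the uniform random search: sample points uniformly on
    # D = [low, high] and keep the best value M seen so far.
    rng = np.random.default_rng(seed)
    x_best, m_best = None, -np.inf
    for _ in range(n_iter):
        v = rng.uniform(low, high)
        hv = h(v)
        if hv > m_best:                # M := max[M, h(v_i)]
            m_best, x_best = hv, v
    return x_best, m_best

# Example: maximize h(x) = -(x - 1)^2 + 2 on D = [-5, 5]; the optimum is x* = 1.
x_star, h_star = random_search(lambda x: -(x - 1)**2 + 2, -5.0, 5.0)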

This kind of optimization problem is frequently met in statistics when the maximum likelihood method is applied for estimating parameters. In this case Z is the random variable and x is the parameter to be estimated. A similar problem arises when we estimate the parameters of a posterior distribution in Bayesian statistical analysis.

In the deterministic case, when h(x) > 0 is integrable, one can build up an importance sampling procedure using the importance pdf defined as

g(x) = \begin{cases} h(x)/H, & \text{if } x \in D \\ 0, & \text{otherwise,} \end{cases} \qquad H = \int_D h(x)\, dx. \qquad (5.13)

The previous algorithm is then modified simply by using vectors z_i simulated with the pdf g(x). One can prove that, in this case (for h continuous), the estimate h* converges to h* faster than M does.

A connected optimization problem (which also uses the notion of a Markov chain) is the one called simulated annealing. (See Pham and Karaboga-2000, Robert and Casella-1999 and Ross-1997.) Annealing is a physical process of heating up a solid and then cooling it down slowly until it crystallizes. An optimization algorithm based on simulating the annealing process first needs the probability distribution of the system energy E at a given temperature T. (Usually this distribution is P(E) = e^{−E/(kT)}, where k is Boltzmann's constant.) Then, the representation of the solutions to the problem must be established. Of course, the definition of the cost function E (which is to be minimized) is required, as is the definition of the generation algorithm for the neighbours. (In the physical annealing case this distribution is Boltzmann's probability from above; otherwise it is a state of a Markov chain.) Finally, a cooling schedule of the "temperature" T is required. The general form of the algorithm is:

select an initial solution x_0, i.e. an initial state of the chain, maybe from an initial distribution; take n = 0; input T_0;
repeat
  generate y, maybe from Boltzmann's distribution or from another fixed distribution, conditionally on x_n; then the new state of the chain is

x_{n+1} = \begin{cases} y, & \text{with probability } \min(1, e^{-\Delta E / T_n}) \\ x_n, & \text{otherwise,} \end{cases} \qquad \Delta E = E(y) - E(x_n); \qquad (5.14)

  evaluate the energy E(x_{n+1});
  update T (the cooling schedule, usually of the form T_{n+1} := cT_n, 0 < c < 1); take n := n + 1;
until the energy is minimum.
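A compact Python sketch of this loop is given below; the Gaussian neighbour proposal, the geometric cooling schedule and all names are our illustrative assumptions, not prescriptions of the text:

import numpy as np

def simulated_annealing(E, x0, T0=1.0, c=0.95, n_steps=10_000, step=0.5, seed=0):
    # Sketch of the annealing algorithm (5.14), assuming a Gaussian
    # neighbour proposal and the cooling schedule T_{n+1} = c * T_n.
    rng = np.random.default_rng(seed)
    x, T = x0, T0
    for _ in range(n_steps):
        y = x + step * rng.normal()            # generate a neighbour y of x_n
        dE = E(y) - E(x)                       # Delta E = E(y) - E(x_n)
        if dE < 0 or rng.random() < np.exp(-dE / T):
            x = y                              # accept with prob min(1, e^{-dE/T_n})
        T *= c                                 # cooling schedule
    return x

# Example: minimize E(x) = (x^2 - 1)^2 + 0.3 x (two wells; global minimum near x = -1).
x_min = simulated_annealing(lambda x: (x**2 - 1)**2 + 0.3 * x, x0=2.0)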

5.6 Markov Chain Monte Carlo


A Markov Chain Monte Carlo (MCMC) method for the simulation of a distribution f is any method producing an ergodic Markov chain (X_n) whose stationary distribution is f. (See Robert & Casella-1999 and Ross-1997.)

• The standard Metropolis-Hastings (M-H) algorithm for simulating a random vector X ∼ pdf f(x) is based on a conditional density q(y|x), x, y ∈ A ⊂ R^k, which is easy to simulate (and often symmetric). It may be a transition density of a Markov process. Denote

ρ(x, y) = \min\left\{ \frac{f(y)}{f(x)} \frac{q(x|y)}{q(y|x)},\ 1 \right\}.

The M-H algorithm is the following:

take t = 0 (the initial time) and x_t (the initial state); input a large N;
repeat
  generate y_t ∼ q(y|x_t);
  generate u ∼ unif(0, 1); if u < ρ(x_t, y_t) then x_{t+1} := y_t else x_{t+1} := x_t;
  take t := t + 1;
until t = N.

The sampling (simulated) value of X ∼ f is x := x_N. There are many particular forms of the Metropolis-Hastings algorithm; one is when q = q(x, y) is the transition density of the Markov chain; others refer to transition kernels of the form q(x, y) = g(x − y), with g symmetric (the Markov process is, in this case, a random walk), and so on. Particular forms are obtained when f is a particular univariate distribution.
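As a concrete illustration, here is a minimal sketch of the random-walk form of the algorithm (our own code; the Gaussian proposal is symmetric, so the ratio q(x|y)/q(y|x) cancels and only f(y)/f(x) remains):

import numpy as np

def metropolis_hastings(log_f, x0, n_steps=50_000, step=1.0, seed=0):
    # M-H with the random-walk proposal q(y|x) = N(x, step^2); log_f is
    # the log of the target density f, known up to an additive constant.
    rng = np.random.default_rng(seed)
    x = x0
    chain = np.empty(n_steps)
    for t in range(n_steps):
        y = x + step * rng.normal()                     # y_t ~ q(y|x_t)
        if np.log(rng.random()) < log_f(y) - log_f(x):  # u < rho(x_t, y_t)
            x = y
        chain[t] = x
    return chain

# Example: sample from f(x) proportional to exp(-x^4/4).
chain = metropolis_hastings(lambda x: -x**4 / 4, x0=0.0)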

• The Gibbs sampler is one particular form of the M-H algorithm. It is assumed that, in order to simulate the k × 1 vector X ∼ pdf f(x), x ∈ A, x = (x_1, ..., x_k), one can simulate the univariate conditional densities f_1, f_2, ..., f_k, such that X_i | x_1, ..., x_{i−1}, x_{i+1}, ..., x_k ∼ f_i(x_i | x_1, ..., x_{i−1}, x_{i+1}, ..., x_k). If for a fixed i one denotes y = (x_1, ..., x_{i−1}, x, x_{i+1}, ..., x_k) and takes

q(x, y) = \frac{1}{k} f_i(x \mid x_j, j \ne i) \qquad (5.15)

in the Metropolis-Hastings algorithm, then one obtains the Gibbs sampler as:

take x ∈ A, for which f(x) > 0;
for i = 1 to k do
begin
  generate x ∼ f_i(x | x_j, j ≠ i);
  if y ∈ A then take x_i := x, else leave the value of x_i as it is;
end.


Note that the output of this algorithm is a sampling value x of X. There are many applications of the MCMC method. (See the mentioned literature: Robert-Casella-1999, Ross-1997.) Here we mention a usual one in statistics. Assume we have to estimate θ = E[h(X)] and X is simulated with the M-H algorithm. Denote by x_1, x_2, ..., x_N (with N large) a trajectory of the Markov chain produced by the M-H algorithm, and assume that the states i ≥ p are almost stationary. Then the estimate of θ and its mean square error MSE are

\hat{θ} = \frac{1}{N - p} \sum_{i=p+1}^{N} h(x_i), \qquad MSE = E\left[ (\hat{θ} - θ)^2 \right].

To estimate the MSE the batch means method is used: break up the N − p states into s batches of size r, s = (N − p)/r (assuming that s is an integer!). Denote by Y_j, j = 1, 2, ..., s, the sampling means of the batches, which are identically distributed with Var(Y_j) = σ², and calculate its estimate \hat{σ}². Then the estimate of the MSE is \widehat{MSE} = \hat{σ}²/s.
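A direct implementation of this computation might look as follows (a sketch; chain, burn_in and r are illustrative names):

import numpy as np

def batch_means(chain, burn_in, r):
    # Drop the first burn_in (= p) states, split the remaining ones into
    # s batches of size r, and estimate theta and MSE = sigma^2 / s.
    y = np.asarray(chain)[burn_in:]
    s = len(y) // r                                  # number of batches
    batches = y[: s * r].reshape(s, r).mean(axis=1)  # batch means Y_j
    theta_hat = batches.mean()
    sigma2_hat = batches.var(ddof=1)                 # estimate of Var(Y_j)
    return theta_hat, sigma2_hat / s                 # estimate and estimated MSE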

6. Introduction to queueing models

6.1 Preliminaries

A queueing system is one in which discrete units (called customers) arrive in order to receive a service. When the flow of customers arriving in the system is large and the service is slow, the surplus of customers join a waiting queue, where they wait until the service becomes available. Such a queueing system can be a shop selling goods, a bank serving its customers, a railway station selling travel tickets, a naval station (or port) giving service to ships, an assembly line in a factory, and so on. The manager of a queueing system is interested both in decreasing the waiting time of the customers in the system and in decreasing the idle time of the service stations. But these two objectives are in conflict: when one decreases, the other one increases. In this case the idea is to introduce some costs related to the two elements (waiting time and idle time) and to try to balance them (i.e. to make the two costs equal or close to each other). In order to build up a mathematical model for a queueing system, it is necessary first to define some variables and parameters of the system. (The difference between variables and parameters is obvious: parameters keep constant values over large periods of time, while variables change their values during short intervals of time.) Some of the variables are known inputs (e.g. arrivals and services) and some others are unknown outputs (e.g. waiting time and idle time). The purpose of the model is to determine the output elements in terms of the input ones. The problem now is how to measure the arrivals and the services. Both could be measured either by discrete (integer) numeric values or by continuous values. Arrivals could be measured by the arrival time (AT), i.e. the interarrival time of customers, or by an integer number (NA), i.e. the number of units arriving per unit of time (the flow of arrivals). Similarly, the services could be measured by the duration of one service (ST) or by the number of customers served per unit time (SN). These are input variables. All these input variables are random variables whose probability distributions are assumed to be known. The output variables, in the continuous version, are WT, the current waiting time of a customer, and TID, the current idle time of a service station (in the case when there are several stations!). The discrete output variables are WL, the current queue length, and NID, the current number of idle stations. These output variables are also random, but their probability distributions are unknown, and in the ideal situation they should be determined in terms of the distributions of the input variables. Practically this is not always possible, and therefore we need to use moments (or expectations) of these distributions, which are parameters.

Input parameters could be the expected arrival time E[AT] (or E[NA], the expected number of customers arriving per unit of time) and the expected service time E[ST] (or E[SN], the expected number of customers served per unit time). The following relations are valid for the input parameters:

E[AT] = \frac{1}{E[NA]}, \qquad E[ST] = \frac{1}{E[SN]}. \qquad (6.1)

Similarly, between the output parameters there are the relations:

E[WT] = \frac{1}{E[WL]}, \qquad E[TID] = \frac{1}{E[NID]}. \qquad (6.1')

For calculating the output parameters of a queueing model we use a discrete stochastic process N(t), the number of customers in the system at time t, t > 0. Note that N(t) counts both the customers in the waiting queues and the customers under service.

The key element in studying a queueing system is to derive the probability distribution of N(t) in terms of the input elements. The probability distribution of N(t) is of the form

N(t) : \begin{pmatrix} 0 & 1 & 2 & \cdots & n & \cdots \\ P_0(t) & P_1(t) & P_2(t) & \cdots & P_n(t) & \cdots \end{pmatrix} \qquad (6.2)

where P_n(t) = P[N(t) = n], n = 0, 1, 2, .... We shall denote a queueing model as follows:

A/S/c : (L_q, d) \qquad (6.3)


where A is information on the probability distribution of arrivals, S is information on the distribution of services, c is the number of service stations (channels!), together with the topology of these stations when there are many, L_q is the maximum length of the queue, and d is the discipline of service. The topology concerning the c > 1 service stations could be serial (when the service is performed by crossing all stations), parallel (when the service is fulfilled by only one of the service stations), or network (when the service stations are the nodes of a directed graph). The discipline of service could be FIFO (in the order of arrivals), LIFO (in the inverse order of arrivals), with priorities (when some important customers are served before others), or some other rules of service (e.g. when some customers do not wait in the queue more than some maximum time). Note that all elements in the previous notation of a queueing model are known, and we must take them into consideration when searching for the probability distribution of N(t).

If we know the probabilities P_n(t), we can determine some interesting output parameters as follows:

E[N(t)] = \sum_{n=0}^{\infty} n P_n(t), \quad E[WL] = \sum_{n=c}^{\infty} (n - c) P_n(t), \quad E[NID] = \sum_{n=0}^{c} (c - n) P_n(t). \qquad (6.4)

In the following we introduce the birth and death process, which is the mathematical instrument for solving a queueing model.

6.2 Birth and Death Processes

The discrete stochastic process N(t), t > 0, is a birth and death process if it has the following properties:

(a). N(t) counts random events occurring in time, such that N(t) is the number of events occurring in the time interval [0, t]; furthermore, for t_1 < t_2, N(t_2 − t_1) = N(t_2) − N(t_1), and for t_1 < t_2 < t_3 < t_4, N(t_2 − t_1) is stochastically independent from N(t_4 − t_3) (one says that the process has independent stochastic increments);

(b). P([N(t + Δt) = n + 1] | [N(t) = n]) = λ_n Δt + o(Δt), n ≥ 0;

(c). P([N(t + Δt) = n − 1] | [N(t) = n]) = μ_n Δt + o(Δt), n ≥ 1;

(d). For i > 1, P([N(t + Δt) = n ± i] | [N(t) = n]) = o(Δt).

In (b), (c), (d) the notation o(Δt) denotes a quantity which is neglected (i.e. very small) for a small increment Δt > 0. The class of functions O = {o(Δt)} has the properties

f_1(t) o_1(Δt) + f_2(t) o_2(Δt) ∈ O, \quad ∀ |f_i(t)| < ∞, \ o_1, o_2 ∈ O,

o_1 + o_2 + ... + o_n ∈ O, \quad ∀ o_i ∈ O, \qquad (6.5)

\lim_{Δt → 0} \frac{o(Δt)}{Δt} = 0


(i.e. o(Δt) tends to zero faster than Δt).

The properties (b), (c) say that in a short interval of time [t, t + Δt] only one individual is born or dies (i.e. these are random events), and the corresponding probabilities depend only on λ_n, n ≥ 0 (the birth intensities) and on μ_n, n ≥ 1 (the death intensities).

A birth and death process can describe the evolution of a queueing system, the evolution of a human population, the evolution of the traffic of users in a computer network, and so on.

The properties (b), (c), (d) say that the events counted by N(t) are rare events. These properties of a birth and death process allow us to find a system of differential equations satisfied by the probabilities P_n(t) = P[N(t) = n]. This results in the following theorem.

Theorem 6.1. The probabilities P_n(t) of a birth and death process satisfy the following system of differential equations:

P_0'(t) = -λ_0 P_0(t) + μ_1 P_1(t), \qquad (6.6)

P_n'(t) = -(λ_n + μ_n) P_n(t) + λ_{n-1} P_{n-1}(t) + μ_{n+1} P_{n+1}(t), \quad n ≥ 1. \qquad (6.6')

Proof. For n = 0, using the properties (a), (b), (c), (d) and the properties (6.5), one obtains

P[N(t + Δt) = 0] = P[N(t) = 0](1 - λ_0 Δt - o(Δt)) + P[N(t) = 1](μ_1 Δt + o(Δt))(1 - λ_1 Δt - o(Δt)) + P[N(t) = n + i, i > 1]\, o(Δt),

and after some calculations we obtain

P[N(t + Δt) = 0] - P[N(t) = 0] = -λ_0 P[N(t) = 0] Δt + P[N(t) = 1] μ_1 Δt + o(Δt),

and therefore

\frac{P_0(t + Δt) - P_0(t)}{Δt} = -λ_0 P_0(t) + μ_1 P_1(t) + \frac{o(Δt)}{Δt}.

Letting Δt → 0 in the last formula we obtain (6.6). In a similar manner one can also prove (6.6'). The theorem is proved.

The process N(t) is well defined by λ_n > 0, n ≥ 0 (called birth intensities), and μ_n > 0, n ≥ 1 (called death intensities).

If we fix the initial conditions P_0(0) = 1, P_i(0) = 0, i ≥ 1, then the system (6.6), (6.6') has a unique solution (a result known from the theory of differential equations). We need the P_n(t) to satisfy the condition

\sum_{n=0}^{\infty} P_n(t) = 1, \quad ∀t. \qquad (6.6'')

The following theorem, due to Feller, says:


Theorem 6.2. The necessary and sufficient condition for relation (6.6'') to be satisfied is

\sum_{n=0}^{\infty} \prod_{i=1}^{n} \frac{μ_i}{λ_{i-1}} = \infty.

An interesting practical case is when the process becomes stationary, i.e. when

\lim_{t \to \infty} P_n(t) = p_n = \text{const}, \quad ∀n ≥ 0.

In the case of queueing models this means that after some running-in period the system becomes stable, i.e. P_n(t) = p_n = const. In this case, the system (6.6), (6.6') becomes

-λ_0 p_0 + μ_1 p_1 = 0, \qquad (6.7)

-(λ_n + μ_n) p_n + λ_{n-1} p_{n-1} + μ_{n+1} p_{n+1} = 0. \qquad (6.7')

If we denote Z_k = -λ_{k-1} p_{k-1} + μ_k p_k, from (6.7) and (6.7') it results that

Z_1 = 0, \quad Z_{k+1} = Z_k, \ k ≥ 1, \quad \text{hence } Z_k = 0, \ ∀k ≥ 1,

and therefore we have

p_k = \frac{λ_{k-1}}{μ_k} p_{k-1},

i.e.

p_n = \prod_{i=0}^{n-1} \frac{λ_i}{μ_{i+1}}\, p_0. \qquad (6.8)

The condition \sum_{n=0}^{\infty} p_n = 1 finally gives

p_0 = \left[ 1 + \sum_{n=1}^{\infty} \prod_{i=0}^{n-1} \frac{λ_i}{μ_{i+1}} \right]^{-1}. \qquad (6.8')

We note that it is necessary to have

\sum_{n=0}^{\infty} \prod_{i=0}^{n-1} \frac{λ_i}{μ_{i+1}} < \infty.
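Formulae (6.8)-(6.8') are easy to evaluate numerically; the following sketch (our own helper, with the state space truncated at n_max) computes the stationary probabilities from given intensity functions:

import numpy as np

def stationary_probs(lam, mu, n_max=200):
    # p_n proportional to prod_{i=0}^{n-1} lam(i)/mu(i+1), normalized;
    # lam and mu are the birth and death intensities (truncation assumed).
    ratios = np.array([lam(i) / mu(i + 1) for i in range(n_max)])
    prods = np.concatenate(([1.0], np.cumprod(ratios)))
    return prods / prods.sum()

# M/M/1 check (lam_n = 0.5, mu_n = 1, rho = 0.5): p_n = (1 - rho) rho^n.
p = stationary_probs(lambda n: 0.5, lambda n: 1.0)
print(p[:3])   # approx. [0.5, 0.25, 0.125]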

Example 6.1. We will show how the birth and death process can be used to solve the following queueing model:

Exp(λ)/Exp(μ)/1 : (∞; FIFO). \qquad (6.9)

(When the distributions of arrivals and services are not specified, the model is denoted

•/•/1 : (∞; FIFO).) \qquad (6.9')

Therefore, the arrival time AT is exponentially distributed Exp(λ), the service time ST is exponentially distributed Exp(μ), λ ≠ μ, there is only one service station, the maximum length of the queue is ∞, and the discipline of service is FIFO (First In First Out).

Let us determine the intensities λ_n, n ≥ 0, and μ_n, n ≥ 1. Note that the probability that a customer arrives in [0, Δt] is

P[AT ≤ Δt] = 1 - e^{-λΔt} = 1 - (1 - λΔt + o(Δt)) = λΔt + o(Δt),

and therefore we have λ_n = λ = const = 1/E[AT], ∀n ≥ 0.

In a similar manner one finds that μ_n = μ, ∀n ≥ 1. Now, if we denote ρ = λ/μ, we obtain in the stationary case (according to (6.8), (6.8'))

p_n = ρ^n p_0, \qquad p_0 = \left[ \sum_{i=0}^{\infty} ρ^i \right]^{-1} = 1 - ρ,

assuming ρ < 1, and therefore

p_n = (1 - ρ) ρ^n.

Note that ρ (called the traffic intensity) is the ratio of the expected number of arrivals per unit time to the expected number of customers served per unit time; the queueing system can run only when ρ < 1.

Now we can calculate some interesting output parameters of the queueing model (6.9), namely:

E[N(t)] = \sum_{n=0}^{\infty} n p_n = ρ(1-ρ) \sum_{n=1}^{\infty} n ρ^{n-1} = ρ(1-ρ) \frac{d}{dρ}\left( \sum_{n=1}^{\infty} ρ^n \right) = ρ(1-ρ) \frac{1}{(1-ρ)^2} = \frac{ρ}{1-ρ},

E[WL] = \sum_{n=2}^{\infty} (n-1) p_n = ρ^2 (1-ρ) \sum_{n=2}^{\infty} (n-1) ρ^{n-2} = ρ^2 (1-ρ) \sum_{n=1}^{\infty} n ρ^{n-1} = ρ^2 (1-ρ) \frac{d}{dρ}\left( \sum_{n=1}^{\infty} ρ^n \right) = \frac{ρ^2}{1-ρ},

E[WT] = E[ST] \cdot E[WL] = \frac{ρ^2}{μ(1-ρ)}, \qquad (6.10)

E[NID] = \sum_{n=0}^{1} (1-n) p_n = p_0 = 1 - ρ,

E[TID] = E[AT] \cdot E[NID] = \frac{1-ρ}{λ}. \qquad (6.10')

If we denote by W the (random) time a customer spends in the queueing system, then we have

E[W] = E[N(t)] \cdot E[ST] = \frac{ρ}{μ(1-ρ)}.

For a queueing system the so-called efficiency factor is defined as

I_e = \frac{E[W]}{E[ST]}.

Note that in this example AT and ST are exponentially distributed. Usually, in practice, AT may be exponentially distributed, while ST could have any other distribution; in this case the model (6.9) is difficult to solve. Pollaczek stated a conjecture that for the model

Exp(λ)/B/1 : (∞, FIFO),

where B is any distribution of ST for which E[B] and Var[B] exist, the efficiency factor I_e is

I_e = \frac{ρ}{1-ρ} (1 + C_s^2), \qquad (6.11)

where C_s is the variability coefficient of ST, defined as

C_s = \frac{\sqrt{Var[ST]}}{E[ST]}. \qquad (6.11')

The queueing model above assumed exponential arrivals and services.

• Use of discrete simulation. In order to analyze a queueing system with any type of distribution for arrivals and services, we need to build up a simulation model. Such a model consists of an algorithm which produces artificial experiments via computer runs; these experiments (in fact samples of the output variables) are then processed and analyzed, giving solutions for the management of the real queueing system described by the simulation model. The instruments used to build up the simulation model (i.e. the algorithm) are the clock time and the agenda. The clock time has a double purpose: to record the elapsed time of the system and to keep the correct order in time of the events produced by the simulation model. The agenda is a concept related to storing the events produced by the simulation. The clock time is increased by a finite number of values. After each increment of the clock, the simulation algorithm processes the events occurring at that moment of time (these events define the agenda of current events, ACE). When the ACE is empty, the clock is incremented by some value and the process is repeated for the new events from the ACE. When an event is processed, it can produce another event (at a future moment of the clock; the set of these events is the agenda of future events, AFE) or it can cancel some events from the ACE (this is the set of cancelled events, CE). Therefore the agenda A has a dynamic evolution of the form

A = ACE ⊕ AFE ⊖ CE.

The clock time can be incremented in two main ways: with variable increments (called variable increment clock time) or with constant increments (called constant increment clock time). Initially the clock is zero. In the case of the variable increment, the clock is incremented up to the occurrence time of the first event in the AFE. In the case of the constant increment, the clock is incremented by a constant "hour" c. After incrementing the clock, the main cycle of the simulation algorithm selects the events from the AFE having occurrence time equal to the clock and introduces them into the ACE. Then the events from the ACE are processed; when the ACE is empty, the clock is advanced again, and the process is repeated until the clock reaches a given input value Tmax, when the simulation ends. As an alternative, the simulation ends when the number of simulated events of a given type reaches a given value Nmax. Sometimes, instead of the variable clock time we use the equivalent rule, i.e. the next event rule; in this case the end of the simulation is determined by Nmax.

In the book by Zeigler et al. (2000) a general formal description of discrete simulation models is presented, based on formal system theory, particularly on the so-called systems with external discrete events. This formal description is a background for the construction of discrete simulation languages. We do not go into details concerning this formal description of discrete simulation models; we restrict ourselves to the short presentation above.

After this brief introduction to the ideas of building up simulation models, we now describe the simulation model of the system (6.9'). Apart from the known variables AT, ST, WT, TID, the simulation model uses the variables:

TWT - total waiting time;
TTID - total idle time;
ICOUNT - an integer variable which counts the number of services performed;
NS = Nmax - the number of customers to be served;
AWT - average waiting time = TWT/NS;
ATID - average idle time = TTID/NS.

The flowchart of the simulation algorithm is given in Fig. 6.1. The next event rule is used (i.e. an implicit variable clock time). Block (1) of the flowchart reads NS and the input parameters of the probability distributions. The initializations are

WT = TID = 0, \quad TWT = TTID = 0, \quad ICOUNT = 0.

Block (2) simulates an arrival. Because the next event rule is used, block (3) adjusts the arrival time using the waiting time, to allow selection of the first next event. Then block (4) simulates a service time for the current customer. Block (5), according to the discipline FIFO, selects the next event to be processed: when the arrival time is less than the service time, the customer has to wait (block (6) calculates the current WT); on the contrary, block (7) calculates the idle time TID of the service station. Block (8) gives the condition to terminate the simulation. Finally, block (9) calculates the output parameters AWT and ATID and delivers the results. After building up the simulation algorithm, the problem is to transform it into a computer program. This can be done by using a general purpose language (like Fortran, Pascal, C++, etc.) or by using a specialized simulation language (like GPSS, Simula, etc.). A general purpose language has the advantage that it allows more flexibility in processing and interpreting the simulation experiments, while a simulation language allows an easier construction of the simulation algorithm (avoiding, for instance, the difficult handling of the clock and agenda), but has only limited possibilities of analyzing the simulated experiments. The flowchart presented is oriented towards the use of a general purpose language.
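A compact Python transcription of this flowchart logic (a sketch; the function and variable names are ours) is shown below; it uses the next event rule implicitly, exactly as blocks (2)-(9) describe:

import numpy as np

def simulate_queue(gen_at, gen_st, ns, seed=0):
    # Next-event simulation of the model (6.9'); gen_at and gen_st draw
    # one interarrival time AT and one service time ST, respectively.
    rng = np.random.default_rng(seed)
    twt = ttid = wt = 0.0                 # block (1): initializations
    for _ in range(ns):
        at = gen_at(rng) - wt             # blocks (2)-(3): arrival adjusted by WT
        st = gen_st(rng)                  # block (4): service time
        if st > at:                       # block (5)
            wt, tid = st - at, 0.0        # block (6): the customer waits
        else:
            wt, tid = 0.0, at - st        # block (7): the station is idle
        twt += wt
        ttid += tid                       # accumulate TWT, TTID
    return twt / ns, ttid / ns            # block (9): AWT and ATID

# Validation run for Exp(1)/Exp(2)/1 (rho = 0.5): ATID should approach
# E[TID] = (1 - rho)/lambda = 0.5 from (6.10'); AWT can be compared with (6.10).
awt, atid = simulate_queue(lambda r: r.exponential(1.0),
                           lambda r: r.exponential(0.5), ns=100_000)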

The simulation model presented can be changed for various hypotheses. For example, AT and/or ST could have different distributions, such as Gamma, Erlang, Lomax, Weibull, etc. They could have mixture distributions of the form

F(x) = \sum_{i=1}^{k} p_i F_i(x), \quad p_i > 0, \quad \sum_{i=1}^{k} p_i = 1. \qquad (6.12)

In this case customers of the k classes could have different priorities in service, and the algorithm can be developed accordingly. To finish the construction of the simulation model, we need to answer the following question: is the model logically correct? Even if it runs well on the computer, this does not mean that it solves the problem well. Therefore, we must perform the validation of the model.

This can be done by using a mathematical model whose solution is known in some particular case. For instance, the mathematical model (6.9) gives formulae for E[WT] and E[TID] (see (6.10), (6.10')). If for NS large (to satisfy the conditions of the law of large numbers) we obtain E[WT] ≈ AWT and E[TID] ≈ ATID, then the model is valid and it can be used for any types of distributions of AT and ST.

Finally, we note that the same model can be developed by using a constant increment clock time. (See Vaduva-1977.) The construction of such a model is usually easier than the construction of the variable increment clock time model.

[Flowchart figure not reproduced.]
Fig. 6.1. The flowchart of the model •/•/1 : (∞; FIFO), with variable increment clock time.

6.3 Simulation of a queueing system with N parallel stations

Now we build up the simulation model

•/•/N : (∞; FIFO), \qquad (6.13)

i.e. the system contains N parallel stations. Such a system can be imagined in relation to the desks of a bank, the desks of a public service, the tanks of a petrol station, etc. We build up the simulation model to be used via a general purpose language. The variables and parameters used are:

AT - arrival time;
ST(j) - service time at station j, 1 ≤ j ≤ N;
WT and TID - as already used;
TAT - the time of the last arrival (the clock of arrivals);
TT(j) - the clock time of station j;
TTMIN - the clock time of the system (the variable increment clock time); it is defined as

TTMIN = \min_{1 ≤ j ≤ N} TT(j)

(the discipline FIFO is involved);
L - the currently analyzed station, i.e. the station for which TTMIN = TT(L);
NS - the number of customers to be served (i.e. an input parameter giving the condition to finish the simulation);
ICOUNT - integer variable which counts the served customers;
SST(j) - sum of the service times fulfilled by station j;
SWT(j) - sum of the waiting times of the customers served by station j;
TTID(j) - total idle time of station j;
DIF = TAT − TTMIN - a working variable.

The initial values of the variables are

TAT = TT(j) = SST(j) = SWT(j) = TTID(j) = 0, \quad 1 ≤ j ≤ N.

Apart from NS, the parameters of AT and ST(j) are also input parameters. The flowchart of the simulation model is given in Fig. 6.2. We underline the function of each block in Fig. 6.2.

Block (1) reads the input parameters and makes the initializations. Then cycle (2) and block (3) simulate N − 1 arrivals (it is assumed that one customer is already in the system). In a similar way, cycle (5) and block (6) simulate the service of the first arrived customers, while block (4) records the customers served. Block (7) determines the clock time and the station L having this clock. Block (8) records a new customer, and block (9) checks the termination of the simulation.

[Flowchart figure not reproduced.]
Fig. 6.2. Flowchart for the simulation of a queueing system with N parallel stations.

If the simulation continues, then block (10) generates AT, updates TAT and calculates DIF. Block (11) tests the sign of DIF: if it is negative, then WT is calculated (in block (12)) and SWT(L) is updated; on the contrary, when the sign is positive, TID is calculated and TTID(L) is updated (block (13)). Block (14) simulates a new service time (for the customer already recorded in block (8)), and the clock time of station L is updated in block (15). Block (16) calculates the final statistics, such as

AWT = \frac{\sum_{L=1}^{N} SWT(L)}{NS}, \qquad ATID = \frac{\sum_{L=1}^{N} TTID(L)}{NS}, \qquad (6.14)

and delivers the useful results.

Validation of the model. In order to validate the model (6.13), we present the solution of the queueing model

Exp(λ)/Exp(μ)/N : (∞; FIFO), \qquad (6.13')

where it is assumed that all service stations have the same service distribution Exp(μ). Therefore, according to Example 6.1, we have in the stationary case

λ_n = λ, \ ∀n ≥ 0, \qquad μ_n = \begin{cases} nμ, & \text{for } 0 ≤ n ≤ N-1 \\ Nμ, & \text{otherwise.} \end{cases} \qquad (6.15)

Then, denoting ρ = λ/μ, one obtains

p_n = \begin{cases} \frac{ρ^n}{n!} p_0, & \text{for } 1 ≤ n ≤ N-1, \\ \frac{ρ^n}{N!\, N^{n-N}} p_0, & \text{for } N ≤ n < ∞, \end{cases} \qquad (6.15')

and

p_0 = \left[ \sum_{n=0}^{N-1} \frac{ρ^n}{n!} + \sum_{n=N}^{∞} \frac{ρ^n}{N!\, N^{n-N}} \right]^{-1} = \left[ \sum_{n=0}^{N-1} \frac{ρ^n}{n!} + \frac{ρ^N}{N!} \frac{N}{N - ρ} \right]^{-1}. \qquad (6.15'')

Some characteristics of this queueing system are as follows.

The average queue length:

E[WL] = \sum_{n=N}^{∞} (n - N) p_n = p_0 \left[ \frac{N^N}{N!} \sum_{n=N}^{∞} \frac{n ρ^n}{N^n} - \frac{N^{N+1}}{N!} \sum_{n=N}^{∞} \frac{ρ^n}{N^n} \right].

Putting ρ* = ρ/N (i.e. the traffic intensity of the system), we obtain

E[WL] = p_0 \frac{N^N}{N!} \left[ ρ* \frac{d}{dρ*} \left( \sum_{n=N}^{∞} ρ*^n \right) - N \sum_{n=N}^{∞} ρ*^n \right] = p_0 \frac{N^N}{N!} \frac{ρ*^{N+1}}{(1 - ρ*)^2} = \frac{λ μ ρ^N}{(N-1)! (Nμ - λ)^2}\, p_0. \qquad (6.16)

In order to calculate the average waiting time we note that the average service time of the whole system is E[S] = 1/(Nμ), and therefore

E[WT] = E[WL] \cdot E[S] = \frac{λ ρ^N}{N! (Nμ - λ)^2}\, p_0. \qquad (6.16')

The average number of idle stations is

E[NID] = p_0 \left[ \sum_{n=0}^{N-1} (N - n) \frac{ρ^n}{n!} \right] = p_0 \left[ N \sum_{n=0}^{N-1} \frac{ρ^n}{n!} - \sum_{n=0}^{N-1} \frac{n ρ^n}{n!} \right].

Being a finite sum, E[NID] can be easily computed, and therefore

E[TID] = E[NID] \cdot E[AT] = E[NID] \frac{1}{λ}. \qquad (6.17)
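Since these validation formulae are finite sums (or reduce to one), they are easy to evaluate directly; a small sketch (our helper, truncating the infinite sum at n_max) is:

import math

def mmn_characteristics(lam, mu, N, n_max=500):
    # Stationary probabilities (6.15')-(6.15'') of Exp(lam)/Exp(mu)/N and
    # the derived output parameters E[WL] and E[NID].
    rho = lam / mu
    assert rho < N, "requires traffic intensity rho* = rho/N < 1"
    p0 = 1.0 / (sum(rho**n / math.factorial(n) for n in range(N))
                + rho**N * N / (math.factorial(N) * (N - rho)))
    def p(n):
        if n < N:
            return rho**n / math.factorial(n) * p0
        return rho**n / (math.factorial(N) * N**(n - N)) * p0
    ewl = sum((n - N) * p(n) for n in range(N, n_max))   # average queue length
    enid = sum((N - n) * p(n) for n in range(N))         # average idle stations
    return p0, ewl, enid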

In order to validate the simulation model (6.13), we must compare the theoretical waiting time E[WT] given by (6.16') with the simulated average waiting time AWT given by (6.14), and/or compare the theoretical average idle time E[TID] with the simulated average idle time ATID. If, for a large number NS of simulated services, the theoretical and simulated values (under the hypotheses of (6.13')) are close to each other, then the simulation model is valid.

Note. When the model requires service with priorities, AT will have a mixture distribution (e.g. mixed exponential). In this case, for each priority another queue length WL must be introduced, and the simulation model must be completed with new commands to handle these queues.

Exercise. An investor intends to build up a petrol station with N pumps (to sell petrol). He knows that the arrival rate of customers at the petrol station is λ (known), the service rate of each pump is μ, and the probability distributions of the arrival time and the service time are exponential. The question is the following: how many places for cars coming to be served must the investor reserve around the petrol station, in order to be able to keep at least 95% of the cars which arrive at the station? What is the probability for a car to wait for service?


Hint. Use the formulae (6.15'), (6.15'') for α = 0.95 and determine N_α such that

\sum_{n=1}^{N_α} p_n ≥ α.

N_α is the solution. Now the investor can calculate the area of land to be reserved for the cars coming to get petrol.

The probability for a car to wait is

P(WT > 0) = P(\text{all stations busy}) = \sum_{n=N}^{∞} p_n,

with p_n given by (6.15'), (6.15'').

6.4 Simulation of the machine interference problem

The problem is the following. There are N machines which run automatically, and there are M repair stations (e.g. repairmen), with N > M. When a machine breaks down, a free repairman repairs it. The running times of the machines are random, and the service times are also random. When all repairmen are busy and a new machine fails, that machine must wait until a repairman becomes free. Engineers call this waiting time the interference time. The purpose of the manager of such a system is to balance the costs of the waiting times (of the machines) with those of the idle times (of the repairmen). Therefore such a system is a queueing system of the form

•/•/M : (N − M; FIFO), \qquad (6.18)

with M parallel stations and discipline FIFO.

The simulation model uses the following variables and parameters:

N - the number of machines;
M - the number of repairmen;
I - the queue length;
K - the current number of free repairmen, 0 ≤ K ≤ M;
NB - the number of repairs (services) to be performed (i.e. an input parameter);
RT - the running time (i.e. the arrival time);
ST - the service time (duration) of a repair;
WT - the current waiting time;
TID - the idle time of the current busy repairman;
J - an index for the machines, 1 ≤ J ≤ N;


T(J) - the total time (clock) of machine J, 1 ≤ J ≤ N;
SS(J) - the total repair time of machine J;
SR(J) - the total running time of machine J;
SW(J) - the total interference (waiting) time of machine J;
TAT(L) - the total time of repairman L (i.e. the clock time of the current repairman L, 1 ≤ L ≤ M; L denotes the current repairman);
SID(L) - the total idle time of repairman L, 1 ≤ L ≤ M;
TBS(L) - the total busy time of repairman L;
IX(J) - an index (code) associated with machine J, as follows:

IX(J) = \begin{cases} 0, & \text{if machine } J \text{ is running;} \\ 1, & \text{if machine } J \text{ is under repair;} \\ -1, & \text{if machine } J \text{ is waiting;} \end{cases}

JM - the current machine processed by the simulation program;
KM - the waiting machine having the largest waiting time;
LM - the current repairman doing a repair;
LA - the first repairman to become free (when all repairmen are busy);
TMI - the clock time of the simulation;
TATM - the total time of a repairman doing a repair (i.e. TATM = TAT(LM));
TMK - the total time of the machine with the largest waiting time (i.e. TMK = T(KM));
ICOUNT - a counter (i.e. integer variable) which records the repairs (when ICOUNT reaches NB, the simulation is finished).

The input variables are RT and ST (i.e. their probability distributions are known), and the input parameters are N, M, NB; the parameters of RT and ST are also input.

The variables SS(J), SR(J), SW(J), 1 ≤ J ≤ N, and SID(L), TBS(L), 1 ≤ L ≤ M, are output variables. All the variables and parameters constitute the agenda, and the IX(J) are the system states. The initial conditions of the simulation are

I = ICOUNT = 0; \quad K = M;

SS(J) = SW(J) = 0, \quad 1 ≤ J ≤ N; \qquad (6.19)

TAT(L) = TBS(L) = SID(L) = 0, \quad 1 ≤ L ≤ M;

IX(J) = 0, \quad 1 ≤ J ≤ N.

The flowchart of the simulation model •/•/M : (N − M; FIFO) (of the machine interference problem) is given in Fig. 6.3. Block (1) sets the initial conditions and reads the input parameters. Then blocks (2) and (3) (the cycle) start the simulation by generating running times for all the machines and give initial values to the clock times T(J). Block (4) calculates the clock time TMI and determines the machine JM having this clock time. Note that, according to FIFO, the clock time is MIN[T(J)] over the machines running or under repair (i.e. the machines for which IX(J) ≥ 0).

Block (5) tests the state of the machine JM. If it is running (i.e. IX(JM) = 0), then block (6) follows, which selects (according to FIFO) the repairman LM with the smallest clock. Then block (7) tests whether this repairman is not free (i.e. K = 0), and in that case block (8) makes the machine JM wait (i.e. IX(JM) = −1), increases the queue length I by one unit, and the repairman LM (who is not yet free) becomes LA, i.e. he will be the first to do a repair. Otherwise, if in block (7) the repairman LM is available, then block (9) puts him to work (generates ST, updates TBS(LM) and SID(LM)), puts the machine JM under repair (i.e. IX(JM) = 1), updates the number of free repairmen K (i.e. K := K − 1), updates LM and TAT(LM), and records the repair performed (i.e. puts ICOUNT := ICOUNT + 1). If in the test block (5) one finds the alternative situation (i.e. JM is under repair, IX(JM) = 1), this means that the repair of the machine has just finished, and then block (10) puts it to run (i.e. IX(JM) = 0). Then, if there are no machines waiting (i.e. in (11) I = 0), the repairman who finished the repair becomes free (i.e. in (12) K := K + 1). Otherwise, if the test block (11) shows that there are machines waiting (i.e. I > 0), then block (13) performs the following operations: the machine KM is selected from the waiting machines (it has the clock TMK = min_{IX(J)=−1} T(J)), the waiting time WT of the machine KM is calculated, and the sum of the waiting times of machine KM is updated (by SW(KM) := SW(KM) + WT). Then the service time ST for the machine KM is simulated, the state of KM is set to "under repair" (i.e. IX(KM) = 1), the queue is diminished by one unit (i.e. I := I − 1), and SS(KM) is updated. This service is performed by the repairman LA (determined in block (8)), and therefore TBS(LA) is updated by adding ST, and the clock times of the machine KM and of the repairman LA are also updated, becoming TMI + ST.

Finally, a new service is recorded in ICOUNT. For the machine JM selected in block (4) and just repaired, block (14) generates its running time RT, updates its sum of running times SR(JM), and also updates its clock time T(JM). Block (15) tests the termination condition: if NB repairs have not yet been performed, the simulation cycle is continued from block (4); otherwise the simulation is finished and block (16) displays the simulation results.
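An event-driven Python sketch of the machine interference system is given below. It is not a line-by-line transcription of the flowchart: it replaces the explicit state codes IX(J) by a failure-time heap, which implements the same FIFO rule (the names and the structure are ours):

import heapq
import numpy as np

def machine_interference(gen_rt, gen_st, N, M, NB, seed=0):
    # N machines fail after random running times RT and are repaired by the
    # first free of M repairmen (FIFO); accumulates the total interference
    # (waiting) time and the total idle time, per repair.
    rng = np.random.default_rng(seed)
    fail = [(gen_rt(rng), j) for j in range(N)]    # (failure time, machine)
    heapq.heapify(fail)
    free_at = [0.0] * M                            # repairman clocks TAT(L)
    twt = ttid = 0.0
    for _ in range(NB):
        t, j = heapq.heappop(fail)                 # next failing machine
        L = int(np.argmin(free_at))                # first repairman to be free
        if free_at[L] > t:
            twt += free_at[L] - t                  # the machine waits (interference)
        else:
            ttid += t - free_at[L]                 # the repairman was idle
        start = max(t, free_at[L])
        free_at[L] = start + gen_st(rng)           # the repair finishes
        heapq.heappush(fail, (free_at[L] + gen_rt(rng), j))  # the machine runs again
    return twt / NB, ttid / NB                     # AWT and ATID per repair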

Note. If the machines of the system have running times with different probability distributions, this can be modelled by assuming that RT has a mixture distribution with k terms (say). In this case, k priorities could also be used, and instead of one queue we can use several queues I(p), 1 ≤ p ≤ k. Then the machine to be repaired (in blocks (5), (9), (13)) is selected from the machines with the highest priority. The model must also be extended to the case when the repairmen have different probability distributions of the repair time, ST(L), 1 ≤ L ≤ M.

Validation of the model. In order to validate the simulation model (6.18), we proceed in a similar way as in Example 6.1, i.e. we first solve the mathematical queueing model

Exp(λ)/Exp(μ)/M : (N − M; FIFO). \qquad (6.20)

In this model, the number of customers in the system N(t) is n, 0 ≤ n ≤ N. Because the intensity of RT (exponential case) is λ, we have

λ_n = \begin{cases} (N - n)λ, & 0 ≤ n ≤ N \\ 0, & \text{otherwise.} \end{cases} \qquad (6.21)

(This is explained by the fact that when n machines are down, the intensity of a new failure in the system is (N − n)λ.) In a similar manner, the intensity of the repair of one machine is μ; therefore, when n machines are down, the intensity of repair in the system is

μ_n = \begin{cases} nμ, & \text{if } 0 ≤ n ≤ M - 1 \\ Mμ, & \text{if } M ≤ n ≤ N \\ 0, & \text{otherwise.} \end{cases} \qquad (6.21')

Using the notation ρ = λ/μ, the probabilities p_n = P(N(t) = n) are

p_n = \begin{cases} \frac{N!}{n!(N-n)!} ρ^n p_0, & \text{if } 1 ≤ n ≤ M \\ \frac{N!}{M!\, M^{n-M} (N-n)!} ρ^n p_0, & \text{if } M + 1 ≤ n ≤ N \\ 0, & \text{otherwise.} \end{cases} \qquad (6.22)

As usual, p_0 is determined from the condition \sum_{n=0}^{N} p_n = 1, which gives

p_0 = \left[ 1 + \sum_{n=1}^{M} \frac{N!}{n!(N-n)!} ρ^n + \sum_{n=M+1}^{N} \frac{N!}{M!\, M^{n-M} (N-n)!} ρ^n \right]^{-1}. \qquad (6.22')

Formulae (6.22), (6.22') could be simplified, but from a computational point of view they can simply be evaluated by the computer as finite sums. Assuming the p_n, 0 ≤ n ≤ N, calculated, one can easily obtain:


- the average waiting (interference) queue length and waiting time:

E[I] = \sum_{n=M}^{N} (n - M) p_n, \qquad E[WT] = E[I] \cdot E[ST] = E[I] \frac{1}{μ}; \qquad (6.23)

- the average number of customers in the system:

E[N(t)] = \sum_{n=0}^{N} n p_n;

- the average number of busy stations E[NO]:

E[NO] = \sum_{n=1}^{M-1} n p_n + M \sum_{n=M}^{N} p_n; \qquad (6.23')

- the average number of idle stations and the average idle time:

E[NID] = M - E[NO], \qquad E[TID] = E[RT] \cdot E[NID] = \frac{1}{λ} E[NID]; \qquad (6.23'')

- the average busy time:

E[BS] = E[ST] \cdot E[NO] = \frac{1}{μ} E[NO]. \qquad (6.23''')

We can also calculate:

- the interference factor:

IF = \frac{E[WT]}{E[ST]}; \qquad (6.24)

- the efficiency factor of the machines:

EF = \frac{E[RT]}{E[RT] + E[WT] + E[ST]}; \qquad (6.24')

- the efficiency factor of the repairmen:

EM = \frac{E[BS]}{E[BS] + E[TID]}. \qquad (6.24'')

E[WT], E[TID] are theoretical values. From a simulation run we can calculate the simulated (estimated) values of the described factors as

\widehat{IF} = \frac{TWT}{TST}, \quad TWT = \sum_{J=1}^{N} SW(J), \quad TST = \sum_{J=1}^{N} SS(J),

\widehat{EF} = \frac{TRT}{TRT + TWT + TST}, \quad TRT = \sum_{J=1}^{N} SR(J),

\widehat{EM} = \frac{TTBS}{TTBS + TTID}, \quad TTBS = \sum_{L=1}^{M} TBS(L), \quad TTID = \sum_{L=1}^{M} SID(L).

Validation of the simulation model can be done by comparing the theoretical efficiency factors with the estimated ones given by the previous formulae. An alternative validation is to compare the theoretical values of E[WT], E[TID] with the corresponding estimated values calculated as

AWT = \frac{TWT}{NB}, \qquad ATID = \frac{TTID}{NB}.

The queueing model (6.20) was studied in the particular case M = 1. In this case, the interference factor IF has the form

IF = \frac{1}{yρ} + N - 1 - \frac{1}{ρ}, \quad ρ = \frac{λ}{μ}, \quad λ = \frac{1}{E[RT]}, \quad μ = \frac{1}{E[ST]}, \qquad (6.25)

when RT is exponential and ST is either constant or exponential. (The y is different in these two cases.) If ST = const. and RT is Exp(λ), then

y = y_A = 1 + \sum_{k=1}^{N-1} C_{N-1}^{k} (e^{ρ} - 1)(e^{2ρ} - 1) \cdots (e^{kρ} - 1). \qquad (6.26)

This formula was found by Ashcroft. If ST is Exp(μ) and RT is Exp(λ), then one obtains the formula of Palm:

y = y_P = 1 + \sum_{k=1}^{N-1} (N-1)(N-2) \cdots (N-k) ρ^k. \qquad (6.27)

Pollaczek noticed that formulae (6.26), (6.27) can be written in the compressed form

IF = (1 - C_S^2) IF_A + C_S^2 IF_P, \qquad (6.28)

where C_S^2 is the coefficient of variation of ST and IF_A, IF_P are the interference factors corresponding to Ashcroft and Palm.

In the book (Vaduva-1977) the simulation model (6.20) was validated for M = 1 using the formulae (6.25), (6.26), (6.27), with λ = 1.0; the results are presented in the following table.


A table with validation results for model (6.20):

Case      | Serv. param. | Simul. services NB | IF (theoret.) | IF (simulated)
----------|--------------|--------------------|---------------|---------------
Ashcroft  | µ = 5.0      | 8,000              | 1.035578      | 1.034116
Ashcroft  | µ = 2.0      | 8,000              | 3.004391      | 3.00312
Palm      | µ = 5.0      | 5,000              | 1.424339      | 1.404621
Palm      | µ = 2.0      | 5,000              | 3.073394      | 3.132766

The table shows that there is only a small difference between the theoretical interference factor IF and the simulated interference factor \widehat{IF}.

Pollaczek formulated the conjecture that (6.28) is valid, in the case M = 1 with RT exponential, for any distribution of ST (i.e. IF depends only on C_S = CV(ST)). In the book (Vaduva-1977) this conjecture was tested via simulation in the case when ST has an Erlang distribution ERLANG(μ, ν); in this case E[ST] = ν/μ, C_S^2 = 1/ν. The test results for λ = 1.0 and different values of μ and ν are given in the following table.

A table with results for testing Pollaczek's conjecture:

No. of run | µ    | ν | E[ST] | C_S^2  | NB    | IF (theoret.) | IF (simul.)
-----------|------|---|-------|--------|-------|---------------|------------
1          | 16.0 | 8 | 0.5   | 0.125  | 6,660 | 3.013017      | 3.062517
2          | 80.0 | 8 | 0.10  | 0.125  | 6,600 | 0.402161      | 0.420361
3          | 50.0 | 7 | 0.14  | 0.1428 | 6,900 | 0.649463      | 0.678609
4          | 25.0 | 8 | 0.32  | 0.125  | 7,000 | 1.113330      | 1.160812
5          | 25.0 | 5 | 0.2   | 0.2    | 6,600 | 1.113330      | 1.161201
6          | 40.0 | 4 | 0.1   | 0.25   | 7,000 | 0.436070      | 0.407381
7          | 8.0  | 4 | 0.5   | 0.25   | 7,000 | 3.021642      | 2.994786
8          | 20.0 | 4 | 0.2   | 0.25   | 7,000 | 1.132768      | 1.093517
9          | 12.0 | 4 | 0.33  | 0.25   | 7,000 | 2.157796      | 2.156249
10         | 12.0 | 3 | 0.25  | 0.3333 | 7,000 | 1.562721      | 1.618264
11         | 30.0 | 3 | 0.1   | 0.3333 | 8,400 | 0.458676      | 0.467350
12         | 15.0 | 3 | 0.2   | 0.3333 | 8,400 | 1.165165      | 1.150346
13         | 10.0 | 2 | 0.5   | 0.5    | 7,000 | 1.229958      | 1.161601
14         | 20.0 | 2 | 0.1   | 0.5    | 7,000 | 0.503887      | 0.520840
15         | 8.0  | 2 | 0.25  | 0.5    | 7,000 | 3.038893      | 3.090300

From the table it results that the theoretical IF and the simulated \widehat{IF} are close to each other; therefore Pollaczek's conjecture is supported by the test. For a larger number NB of simulated services, the figures would probably be almost equal.

This is an example of how we can use a mathematical experiment (i.e. stochastic simulation) to test formulae which have not yet been proved mathematically.


7. Introduction to inventory models

7.1 Preliminaries

An inventory (or a stock) is any resource which has an economic value, characterized by inputs and outputs. The output from the stock is frequently determined by a demand. (Sometimes the output may also contain expired or damaged elements (goods).) An inventory may consist of various goods (in a shop), food, materials, or even water (in a dam or reservoir). A factory, for instance, can have an inventory of materials or items to be processed in production, or an inventory of processed objects (machines, devices). The output from the inventory of a factory consists in the delivery of smaller quantities to satisfy a demand (either for further processing or to be sent to final users). In this section we deal with mathematical models intended to support the efficient management (or administration) of an inventory. An inventory model uses the following elements:

- the time t, used to describe the dynamic behaviour of the inventory;
- the stock level I(t) at time t;
- the input rate a(t) into the stock at time t;
- the rate of demand r(t) at time t;
- the output rate b(t) at time t (when this must be distinguished from r(t)).

The demand rate and/or the output rate could be random variables. Sometimes (e.g. in the case of a dam), the input rate a(t) (i.e. the rate of flow of the water entering the dam) can also be a random variable. These random variables have known probability distributions. In the following we deal only with inventories of the type used in a factory or in a shop. The input into such a stock is called an order (or an order quantity) and is denoted q; usually the order q enters the stock at discrete moments of time.

In the management of an inventory, some costs (expressed in money) are important, such as:

- the holding cost h (i.e. the cost to stock one unit of good per unit of time);
- the penalty cost or shortage cost d (i.e. the cost of a shortage of one unit in the stock, per unit of time);
- the set-up cost s of an order quantity.

Note the difference between the holding cost and the shortage cost on one side and the set-up cost on the other: the latter is associated with the whole ordered quantity q.

From the above notations the following relation results:

I(t) = I_0 + \int_0^t [a(u) - b(u)]\, du, \qquad (7.1)

which describes the evolution of the stock level I(t) in terms of the input and output.

Using the introduced costs, we can define an objective E[a(t), b(t), r(t)] (e.g. a cost function or a profit) to be optimized. When some of the functions a(t), b(t), r(t) are random, the objective E is random, and it is required that the expectation of E be optimized. From the optimization of E we are mainly interested in determining the optimum order quantity q.

[Figure not reproduced.]
Fig. 7.1. Variation of the inventory level I(t).

The problems we deal with can be classified in two categories: stock of manufactured products (i.e. the stock consists of materials for processing in a factory or of goods to be sold in a shop) or stock supply (when the stock contains manufactured goods such as machines, devices and so on). The mathematical models for these problems are identical.

Before presenting some mathematical models of inventory theory, we first describe the dynamics of the stock level, shown in Fig. 7.1. Note that initially (at time t = 0) the stock level is I_0. Then, on the interval [0, t_1], the stock level decreases because it satisfies a demand. (In the figure, the rate of demand is constant, and therefore the stock level decreases linearly.) At time t_1 the order quantity q_1 enters the stock. Then, on [t_1, t_2], the stock decreases again, another quantity q_2 enters the stock at time t_2, and so on. The time lengths T_i = t_{i+1} − t_i are the time intervals at which the order quantities enter the stock; they are called reorder cycles. (Sometimes the reorder cycle is constant or random.) The dynamics of the supply of the inventory (presented in Fig. 7.2) needs some more elements used by a mathematical model. The order quantity q is set up when the (decreasing) stock level reaches P. This is called the reorder level or reorder point. It is assumed that the quantity P existing in the stock can satisfy the demand until the ordered quantity q enters the stock.

[Figure not reproduced.]
Fig. 7.2. The re-ordering mechanism.

This order enters the stock at time T (the reorder cycle); the time L elapsed from the moment when q is set up until it enters the stock is the lead time. Sometimes (on the interval [T, 2T]) the stock level I(t) becomes negative (at time 2T). This means that from the moment of time T there is an interval of length t' when the stock is positive (i.e. on this interval the demand can be satisfied). On the other interval, of length t'', there is a shortage and therefore the demand cannot be satisfied. The figure shows that when the new order q arrives in the stock (at time 2T), the previously unsatisfied demand (on t'') must be retrieved (i.e. this demand must be satisfied when the order enters the stock). (Usually this is the case of the stock supply in a factory; in the case of a merchant shop, the unsatisfied demand is lost, in the sense that it is not satisfied when the order enters the stock.) In the mathematical models, the costs h, d, s are known, the demand r and the lead time L are known, while the order quantity q and the reorder point P must be determined subject to an optimum requirement (e.g. minimum cost or maximum profit). All the mentioned elements defining the optimum supply of a stock constitute the optimum policy of the inventory management. When the demand r and/or the lead time L are random, the model is stochastic; otherwise it is deterministic. If the time is not explicitly involved in the model, it is a static model; otherwise, the model is dynamic. The simplest models are those which are deterministic and static. The most complex are the stochastic and dynamic models.

7.2 Simple one product models

In the following we present the simplest static, deterministic one-product models.

7.2.1 The economic lot size model

This model assumes that r is constant, the costs h, s are given, there is no lead time (i.e. L = 0), and there is no shortage (i.e. I(t) > 0, ∀t, and d = 0). The stock variation is given in Fig. 7.3. The model will determine the order quantity q and the reorder cycle T which minimize a cost function to be defined. Note that because

T = \frac{q}{r}, \qquad (7.2)

the only objective is to determine the optimum q. In order to specify the cost function, we note first that the cost per reorder cycle C_T is the sum

C_T = C_{h,T} + s, \qquad (7.3)

where C_{h,T} is the holding cost over the time T. We have

C_{h,T} = h \int_0^T I(t)\, dt = h \frac{qT}{2} = \frac{hq^2}{2r}. \qquad (7.3')

The cost function cannot be C_T, because this is minimum when T = 0, which does not make any sense. Therefore, we define the cost function as

C(q) = \frac{C_T}{T} = \frac{hq}{2} + \frac{sr}{q}. \qquad (7.3'')

From the optimality condition (i.e. minimization of C(q)) one obtains

q_0 = \sqrt{\frac{2rs}{h}}, \qquad T_0 = \frac{q_0}{r} = \sqrt{\frac{2s}{rh}}, \qquad (7.4)

and the minimum cost is

C_0 = \sqrt{2rsh}. \qquad (7.4')

Note that (from (7.3')) the holding cost over an interval [0, t] is

C_{h,t} = h \frac{q}{2} t. \qquad (7.4'')

[Figure not reproduced.]
Fig. 7.3. Inventory variation in the economic lot model.
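Formulae (7.4)-(7.4') are immediate to evaluate; for instance (with illustrative numbers):

import math

def economic_lot(r, s, h):
    # Optimal policy of the economic lot size model, cf. (7.4)-(7.4').
    q0 = math.sqrt(2 * r * s / h)      # optimum order quantity
    T0 = q0 / r                        # optimum reorder cycle
    C0 = math.sqrt(2 * r * s * h)      # minimum cost per unit time
    return q0, T0, C0

# Example: r = 100 units/day, s = 50, h = 0.4 gives q0 = 158.1, T0 = 1.58, C0 = 63.2.
q0, T0, C0 = economic_lot(100, 50, 0.4)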

7.2.2 The shortage model

Assume that all the hypotheses of the previous model are satisfied and that, furthermore, there is shortage (the cost d > 0 is given) and the demand is retrieved (i.e. the unsatisfied demand is satisfied when q enters the stock). The stock variation in this case is shown in Fig. 7.4. Note that apart from q, the maximum (optimum) stock level S must also be calculated. From some triangles in the figure one finds

t' = \frac{S}{r}, \qquad t'' = T - t' = T - \frac{S}{r} = \frac{rT - S}{r}.

Similarly to the previous model, one finds that the cost per unit time has three components, namely

C(S, T) = \frac{s}{T} + \frac{hS^2}{2rT} + \frac{d(rT - S)^2}{2rT}. \qquad (7.5)

From the minimum conditions of the cost function,

\frac{∂C}{∂S} = 0, \qquad \frac{∂C}{∂T} = 0,

we derive

T_1 = \sqrt{\frac{2s}{rh}} \sqrt{\frac{1}{ρ}}, \quad ρ = \frac{d}{h + d},

S_1 = \sqrt{\frac{2rs}{h}} \sqrt{ρ}, \quad q_1 = T_1 r = \sqrt{\frac{2rs}{h}} \sqrt{\frac{1}{ρ}}, \quad C_1 = \sqrt{2rsh} \sqrt{ρ}. \qquad (7.6)

[Figure not reproduced.]
Fig. 7.4. Stock variation in the shortage model.

Note that 0 < ρ < 1 (ρ is called the shortage index), which gives

C_1 = \sqrt{ρ}\, C_0 < C_0,

which would mean that the shortage model is better than the economic lot model. But this is not quite true, because ρ = S_1/q_1, and if we assume that α = 1 − ρ is given, then it would result that d = \frac{1-α}{α} h, a relation which means a dependence between h and d, a relation which does not seem to be valid in practice.

This model will be used for building up a simulation model in Section 7.5.

7.3 Multiproduct models

These models derive from the economic lot model. Assume that p different types of products are produced in a factory. For product i, let us denote by c_i the unit price of the raw materials, by r_i the demand rate, and by s_i the set-up cost, and assume that the holding cost h_i is a fraction k_i of c_i (i.e. h_i = k_i c_i), 0 < k_i < 1, 1 ≤ i ≤ p. Therefore the production cost per unit of product is c_i + s_i/q_i, and the cost of the delivered production is r_i(c_i + s_i/q_i). Hence, the cost function to be minimized is

C(q_1, q_2, ..., q_p) = \sum_{i=1}^{p} \left[ r_i \left( c_i + \frac{s_i}{q_i} \right) + k_i \frac{q_i}{2} \left( c_i + \frac{s_i}{q_i} \right) \right]. \qquad (7.7)


The last formula is of the form

C(q_1, q_2, ..., q_p) = K + \sum_{i=1}^{p} \left( \frac{A_i}{q_i} + \frac{B_i}{2} q_i \right), \qquad (7.8)

where

K = \sum_{i=1}^{p} \left( r_i c_i + \frac{k_i}{2} s_i \right), \qquad A_i = r_i s_i, \qquad B_i = k_i c_i.

If the quantities were independent, then the optimum q_i's would be derived as in the economic lot model. In practice, the q_i's are connected by some restrictions (constraints).

One restriction can be defined by an upper limit V on the whole stock. From the economic lot model it results that the average volume of stocked products is \frac{1}{2} \sum_{i=1}^{p} v_i q_i, where v_i is the volume occupied by one unit of product i; i.e. the volume restriction is

\frac{1}{2} \sum_{i=1}^{p} v_i q_i ≤ V. \qquad (7.8')

(Note that v_i, V could also be expressed in currency!)

Another restriction can be defined in terms of production capacity (the production time is limited!). If the total production time T is given and t_i is the time necessary to produce one unit of type i, then the restriction of production capacity is

\sum_{i=1}^{p} \frac{r_i t_i}{q_i} ≤ T. \qquad (7.8'')

If we denote Q = (q_1, ..., q_p) and write the constraints as G_i(Q) ≤ 0 with

G_1(Q) = \frac{1}{2} \sum_{i=1}^{p} q_i v_i - V, \qquad G_2(Q) = \sum_{i=1}^{p} \frac{t_i r_i}{q_i} - T,

and D = \{Q : G_1(Q) ≤ 0,\ G_2(Q) ≤ 0\}, then the optimization problem can be formulated as follows:

\min_{Q ∈ D} C(q_1, q_2, ..., q_p), \qquad (7.9)

with C given by (7.8). These problems are solved by using the method of Lagrange multipliers. The multipliers λ_1, λ_2 are selected depending on the nature of the constraints. If the constraints are equalities, then λ_i ∈ R; if the constraints are inequalities, then

λ_i = \begin{cases} < 0, & \text{if } Q ∈ Fr(D) \\ 0, & \text{if } Q ∈ Int(D), \end{cases}


where

Fr(D) = \{Q : G_1(Q) = 0 \text{ or } G_2(Q) = 0\}, \qquad Int(D) = \{Q : G_1(Q) < 0,\ G_2(Q) < 0\}.

The minimization problem (7.9) is equivalent to

\min_{Q} L(Λ, Q), \quad Λ = (λ_1, λ_2), \quad L(Λ, Q) = C(Q) + λ_1 G_1(Q) + λ_2 G_2(Q). \qquad (7.9')

The minimum condition of L in (7.9'), i.e. ∂L/∂q_i = 0, finally gives in our case

q_i(λ_1, λ_2) = \sqrt{\frac{2(A_i - λ_2 t_i r_i)}{B_i - λ_1 v_i}}. \qquad (7.9'')

Note that ∂L/∂λ_i = 0 corresponds to the constraints G_i(Q) ≤ 0. Now, by applying a suitable numerical algorithm, we must find (λ_1, λ_2) ∈ (−∞, 0) × (−∞, 0) such that G_1(Q) ≤ 0, G_2(Q) ≤ 0. This can also be done by a random search algorithm (see Section 5.5 or 8 of this text).

7.4 Stochastic models

In such models, the demand rate r, the lead time L and (maybe) the reorder cycle T = τ are positive random variables. The number of demands per reorder cycle, n(τ), is a random variable, and the random variable Y = r_1 + r_2 + ... + r_{n(τ)} is the total demand. The following proposition is true:

Proposition 7.1. If τ has the cumulative distribution function (cdf) G(τ), the demand r has the cdf F(x), and the number of demands on [0, t] is Poisson(λt), then the total demand Y on τ has the cdf

H(x) = \int_0^{∞} \sum_{n=0}^{∞} \frac{e^{-λτ} (λτ)^n}{n!} F_n(x)\, dG(τ), \qquad (7.10)

where F_n(x) = (F ∗ F ∗ ... ∗ F)(x) (the n-th convolution power) is the cdf of the sum of n independent demands r_i, 1 ≤ i ≤ n.

(The proof can be found in Vaduva-1964.)

Note that the probability distribution of Y can also be derived by a direct simulation algorithm, namely:

for j = 1 to N do begin (N is very large!)
1. Simulate a random variate τ with cdf G(τ);
2. Simulate a Poisson random variate n with distribution Poisson(λτ);
3. Simulate n independent random variates r_i with cdf F(x);
4. Calculate Y_j = ∑_{i=1}^n r_i;
end;
5. Using Y_j, 1 ≤ j ≤ N, generated in the previous cycle, build up a histogram. This is the empirical distribution of the total demand Y.
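A minimal Python sketch of this algorithm, assuming (as in the exercise below) G = Exp(µ) and F = Exp(θ); the parameter values are illustrative:

    import numpy as np

    rng = np.random.default_rng(1)
    lam, mu, theta, N = 2.0, 1.0, 0.5, 100_000   # illustrative parameters

    tau = rng.exponential(1.0 / mu, size=N)      # step 1: cycles tau with cdf G
    n = rng.poisson(lam * tau)                   # step 2: counts ~ Poisson(lam*tau)
    Y = np.zeros(N)                              # steps 3-4: Y_j = sum of n_j demands;
    pos = n > 0                                  # a sum of n iid Exp(theta) variates
    Y[pos] = rng.gamma(n[pos], 1.0 / theta)      # is a Gamma(n, 1/theta) variate
    freq, edges = np.histogram(Y, bins=50)       # step 5: empirical distribution of Y
    print(freq[:10])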

The distribution of the total demand Y can also be expressed in terms of characteristic functions. If V is a random variable with cdf U(x) then the corresponding characteristic function is

ϕ(t) = E[e^{itV}] = ∫_{−∞}^∞ e^{itx} dU(x),   i = √(−1),   (7.11)

where the integral is in the Stieltjes sense. (If the cdf U is zero on (−∞, 0) then the integral extends only over (0, ∞).) If Φ(t), Ψ(t), W(t) are the characteristic functions of G(x), F(x), H(x) respectively, then from (7.10) the following property is derived:

W(t) = Φ[iλ(1 − Ψ(t))].   (7.10')

The cdf is recovered from the characteristic function W(t) by the inversion formula

H(x + h) − H(x − h) = (1/π) lim_{T→∞} ∫_0^T (sin th)/t [e^{−itx} W(t) + e^{itx} W(−t)] dt,   h > 0.   (7.10'')

If the function H is known, then one can calculate the order quantity q of the stock for a given risk α, 0 < α << 1. Thus, if the "stock on hand" is I_0, then we must have

H(I_0 + q) = P(Y ≤ I_0 + q) = 1 − α,   (7.12)

where 1 − α is a large probability. If y_α is the upper α-quantile of Y, such that P(Y > y_α) = α, then the order quantity is

q = y_α − I_0.   (7.12')

As the cdf H(x) is difficult to handle via the previous formulae, the simplest way to determine y_α is to use the histogram of Y produced by the described simulation algorithm.

Exercise. The reorder cycle has an exponential Exp(µ) distribution and the distribution of a demand X is Exp(θ). If the number of demands per reorder cycle n(τ) has a Poisson(λτ) distribution, give the expression of the cdf of the total demand Y_{n(τ)} = X_1 + X_2 + ... + X_{n(τ)}. Give also the formula of the characteristic function W(t), t ∈ R, of the total demand Y.

Hint. We have

G(u) = P(τ < u) = { 0, u < 0;  1 − e^{−µu}, u ≥ 0 },   F(x) = P(X < x) = { 0, x < 0;  1 − e^{−θx}, x ≥ 0 },

and F^{(n)}(x) = P(Y_n < x) is an Erlang(θ, n) cdf, i.e.

F^{(n)}(x) = θ^n/(n−1)! ∫_0^x u^{n−1} e^{−θu} du,   n ≥ 1.

Hence (with F^{(0)}(x) = 1 for x ≥ 0)

H(x) = P(Y_{n(τ)} < x) = ∫_0^∞ [ e^{−λτ} + ∑_{n=1}^∞ e^{−λτ} (λτ)^n/n! · θ^n/(n−1)! ∫_0^x u^{n−1} e^{−θu} du ] µ e^{−µτ} dτ.

In terms of characteristic functions (according to (7.10')) we have

ϕ_τ(t) = Φ(t) = µ/(µ − it);   ϕ_X(t) = Ψ(t) = θ/(θ − it),   i = √(−1),

giving finally

W(t) = ϕ_{Y_{n(τ)}}(t) = Φ[iλ(1 − Ψ(t))] = µ(θ − it) / (µθ − it(µ + λ)).

7.4.1 A dynamic stochastic model

A model which describes the most common situations is the following. Assume that orders are set up at regular (equal) intervals (periods) of time i = 1, 2, ... (i.e. days, weeks, months, etc). Assume that the lead time L is a given constant integer (L > 1). We use the notations:

I_i - the stock level at the end of period i;
q_i - the order quantity which enters the stock during period i;
r_i - the demand in period i; the r_i are independent, identically distributed random variables with a known probability distribution;
S_i - the generalized stock at the beginning of period i; it consists of the stock on hand I_{i−1} at the beginning of period i plus all quantities q ordered until then which have not yet arrived in the stock, hence

S_i = I_{i−1} + q_i + ... + q_{i+L−1}.   (7.13)

(Note that q_{i+L} is the quantity ordered at time i, which will arrive in the stock at time i + L.)

From the above it results that the demand over L + 1 periods of time,

R_i = r_i + r_{i+1} + ... + r_{i+L},

has a known distribution. Denote by F(r) the cdf of R and by f(r) the pdf of R. If the costs h, d are known and the set-up cost is omitted (i.e. s = 0), the problem is to determine the optimum order q_{i+L}, which is set up at the end of period i and which will enter the stock in period i + L. Let us consider the quantity

Q = S_i + q_{i+L}.

From relation (7.13) and

I_i = I_{i−1} + q_i − r_i

we obtain

S_i = I_i + r_i + q_{i+1} + ... + q_{i+L−1}
    = I_{i+1} + r_i + r_{i+1} + q_{i+2} + ... + q_{i+L−1}
    = .............................
    = I_{i+L−1} + r_i + ... + r_{i+L−1},

hence

Q = S_i + q_{i+L} = I_{i+L} + R,   i.e.   I_{i+L} = Q − R.   (7.14)

Denoting

I^+_{i+L} = { I_{i+L}, if I_{i+L} > 0;  0, otherwise },   I^−_{i+L} = { I_{i+L}, if I_{i+L} < 0;  0, otherwise },

it results from (7.14) that we have to minimize the average cost function

E[C(Q)] = h E[(Q − R)^+] − d E[(Q − R)^−],

E[C] = h ∫_0^Q (Q − R) f(R) dR + d ∫_Q^∞ (R − Q) f(R) dR.   (7.13')

(For a variable I, I^+ is called the surplus and I^− is called the deficit.) If we find the optimum Q then we easily determine the optimum q_{i+L} = Q − S_i. (Note that S_i is known according to (7.13).)

Our problem is now to solve the minimization problem (7.13'), which leads to dE[C]/dQ = 0. Let us denote A(Q, R) = (Q − R)^+ ≥ 0, B(Q, R) = (Q − R)^− ≤ 0. Then (7.13') becomes

C = h A(Q, R) − d B(Q, R),   E[C] = min,   d/dQ E[A(Q, R) + B(Q, R)] = α,   (7.15)

which is a problem of linear control having the objective (7.13') and a (differential) constraint specified by the second formula in (7.15). Because in our case E[A + B] = Q − E[R] (from (7.14)), we have α = 1. From the condition of minimum and from (7.15) we have

h dE[A]/dQ − d(−dE[A]/dQ + α) = 0,

which gives

dE[A]/dQ = αd/(h + d),   or   F(Q) = d/(h + d),   (7.16)

since E[A] = ∫_0^Q (Q − R) f(R) dR implies dE[A]/dQ = F(Q). The solution in Q of the second relation in (7.16) is

Q = F^{−1}( d/(h + d) )

and the model is solved, i.e. q_{i+L} = Q − S_i.

Exercise. The demand r per (unit) period of time has an exponential Exp(λ) distribution. In the hypotheses of the model of 7.4.1, determine the optimum order q_{i+L} when the initial stock level I_0 is known and the previously ordered quantities q_i, q_{i+1}, ..., q_{i+L−1} are known. The lead time L is a given positive integer and the costs h, d are also known.

Hint. The demand over the L + 1 control periods of time is Erlang(λ, L + 1) distributed, i.e. its pdf is

f(R) = λ^{L+1}/L! · R^L e^{−λR},   R > 0.

Hence

F(Q) = λ^{L+1}/L! ∫_0^Q R^L e^{−λR} dR.

The optimum Q is the solution of the equation

F(Q) = λ^{L+1}/L! ∫_0^Q R^L e^{−λR} dR = d/(h + d)

and q_{i+L} = Q − S_i, with S_i = I_0 + q_i + q_{i+1} + ... + q_{i+L−1}. If we apply Proposition 7.2 below, then q_{i+L} = r_{i−1}, where r_{i−1} is simulated from the Exp(λ) distribution.
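Numerically, Q is just a Gamma quantile. A small Python sketch (illustrative parameter values; scipy's gamma with shape L + 1 and scale 1/λ is exactly the Erlang(λ, L + 1) of the hint):

    from scipy.stats import gamma

    lam, L, h, d = 0.5, 3, 1.0, 9.0      # illustrative parameters
    I0 = 5.0                             # known initial stock level
    q_prev = [2.0, 3.0, 1.5]             # known orders q_i, ..., q_{i+L-1} (hypothetical)

    # F is the Erlang(lam, L+1) cdf, so Q = F^{-1}(d/(h+d)) is a Gamma quantile
    Q = gamma.ppf(d / (h + d), a=L + 1, scale=1.0 / lam)
    S_i = I0 + sum(q_prev)               # generalized stock (7.13)
    q_next = max(Q - S_i, 0.0)           # optimum order q_{i+L}
    print(Q, q_next)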

For a cdf F(x) which cannot be inverted analytically (as in the previous exercise), its inverse F^{−1}(t) (which exists for every t ∈ (0, 1)) can be determined by an obvious numerical procedure, as in the sketch above. One can prove the following proposition.

Proposition 7.2 In the hypotheses of this model we have

q_{i+L} = r_{i−1}.   (7.17)

The last formula is called the feedback rule and it is proved in (Vaduva-1974). It states that the best order to set up now, at period i, is equal to the demand r_{i−1} from the previous period i − 1.

7.4.2 A model for reserve stock

Assume that in a manufacturing company there are two automatic machines M1, M2 (see Fig. 7.5)

→ M1 → S → M2 →

Fig. 7.5. Two machines M1, M2 in series

which run in series, i.e. M1 receives material which it partially processes; the output of M1 then enters M2 for further processing. The machine M1 has a random failure (down) time τ with pdf g(τ) and an average running time between failures E[θ] = µ. When the machine M1 breaks down, the machine M2 suffers from the shortage of output of M1. The problem is to determine an optimum reserve S (a safety inventory) of partially processed material (produced by M1 or by a similar parallel machine), to ensure the continuous running of the system of serial machines M1, M2. Apart from the known elements µ and g(τ), we assume that the holding cost h of the stock S and the shortage cost d (related to the machine M2) are known; the demand rate r (the processing rate of material) of the machine M2 is also known, and we must determine the optimum reserve S. We assume also that S remains constant, such that during a failure of M1 the machine M2 gets its supply from S, and the reserve S is quickly refilled by another parallel machine. Another assumption is that the failure duration of M1 plus the time to refill S is smaller than E[θ]. From this it results that the holding cost per unit time for the whole stock S is C_h = hS (and not hS/2 as usual, even though that would not cause any difficulties!).

Note that for the machine M2 there is an idle time t caused by the failure time τ of M1. (During this failure time the machine M2 is supplied from S.) Hence we have

t = { 0, if S ≥ rτ;  τ − S/r, if S < rτ }

and the average shortage cost per failure is

D = d ∫_{S/r}^∞ (τ − S/r) g(τ) dτ.

Because there are on average 1/µ failures per unit time, the shortage cost per unit time is

C_d = (d/µ) ∫_{S/r}^∞ (τ − S/r) g(τ) dτ.

Therefore the cost function of the model (i.e. the average cost) to be minimized is

C(S) = hS + (d/µ) ∫_{S/r}^∞ (τ − S/r) g(τ) dτ.

From the minimum condition we have

dC/dS = h − d/(rµ) [1 − G(S/r)],

where G(x) is the cdf of τ, and hence the optimum S is the solution of the equation

G(S/r) = 1 − rµh/d.

Note that

S = { 0, if rµh ≥ d;  r G^{−1}(1 − rµh/d), if rµh < d. }   (7.18)

Note. The last formula says that when rµh ≥ d no reserve S is necessary. This model gives a rule for managing a production line with machines running in series.

Example. If τ has an exponential distribution Exp(λ) (as usual!), then G(x) = 1 − e^{−λx}, x > 0, therefore

S = −(r/λ) log( rµh/d ),   rµh < d.   (7.18')

If the equation (7.18) cannot be solved analytically (as it can in (7.18')), one can build a numerical solution.
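For instance, the following Python sketch (illustrative parameter values, chosen so that rµh < d) computes the reserve both by the closed form (7.18') and by a generic numerical root-finder applicable to an arbitrary cdf G:

    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import expon

    r, mu, h, d, lam = 5.0, 4.0, 0.2, 30.0, 0.1   # illustrative; r*mu*h < d

    if r * mu * h >= d:
        S = 0.0                                   # by (7.18): no reserve is needed
    else:
        S = -(r / lam) * np.log(r * mu * h / d)   # closed form (7.18')

    # generic numerical route: solve G(S/r) = 1 - r*mu*h/d for any cdf G of tau
    G = lambda x: expon.cdf(x, scale=1.0 / lam)
    S_num = r * brentq(lambda x: G(x) - (1.0 - r * mu * h / d), 0.0, 1e6)
    print(S, S_num)   # the two values agree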

7.5 A simulation model.

Here we present a simulation model using fixed increment clock time (increment c = 1), based on the shortage model from section 7.1. The identifiers used in the model are:

H = the holding cost;
D = the shortage cost;
S = the set-up cost;
CH = total holding cost over the simulated period of time;
CD = total shortage cost over the simulated period;
CS = total set-up cost over the simulated period;
TC = total cost (i.e. TC = CH + CD + CS);
T = the moment of time when an order enters the stock;
R = demand rate (demand per unit of time); a random variable with a known distribution;
VI = the (current) stock level;
Q = the optimum order quantity;
P = the reorder stock level (reorder point);
L = the lead time; a discrete random variable (L = 1, 2, ...) with a known probability distribution;
CLOCK = the clock time (an integer);
BI = the initial stock level;
TT = the period of time for simulation (a large integer!).

Input random variables are R, L and input parameters are H, D, S, P, BI, TT. The reorder point P can be determined from the formula

Prob(R(L) > P) = α,   (7.19)

where α is the risk of not satisfying the demand R(L) = R_1 + ... + R_L during the lead time L. If R is normal N(m, σ) distributed and we denote l = E[L], then R(L) is normal N(lm, √l σ). If we consider the quantile z_α defined by

(1/√(2π)) ∫_{−∞}^{z_α} e^{−t²/2} dt = 1 − α,

then the reorder point is

P = lm + z_α √l σ.   (7.20)

If R is distributed Exp(λ), then R(L) is distributed Gamma(0, λ, l) (i.e. Erlang) and therefore the reorder point P satisfies the relation

Prob(R(L) > P) = ∫_P^∞ λ^l/Γ(l) · t^{l−1} e^{−λt} dt = α,   (7.20')

[The original flowchart cannot be reproduced here; its numbered blocks are, in outline: (1) read the input parameters and set the initial conditions; (2)-(3) generate initial samples of R and L and initialize the moving sums SUMR, SUML and the arrays RR(1..M), LL(1..N); (4)-(6) each period, generate a new demand R and update the moving sums and the stored values RR; (7) advance the clock (CLOCK = CLOCK + 1) and calculate AR, AL, Q and P; (8) if CLOCK > TT, (12) calculate TC, write the output statistics and STOP; (9)-(11) if an order arrives (T = CLOCK), set VI = VI + Q, then subtract the demand, VI = VI − R; (13)-(15) if VI < 0 add the shortage cost CD = CD − VI·D and set VI = 0, otherwise add the holding cost CH = CH + VI·H; (16)-(22) if VI < P and no order is outstanding (T ≤ CLOCK), add the set-up cost CS = CS + S, generate a lead time L, update SUML and the array LL, and set T = CLOCK + L; then return to the next period.]

Fig. 7.6. Flowchart for an inventory simulation model

i.e. P = p_α = the upper α-quantile of R(L). The simulation model calculates the reorder point P from the quantile z_α or p_α via formulae (7.20), (7.20') (depending on the distribution of R). In order to calculate the order Q one uses the formulae from the shortage model, with a further improvement: instead of E[R] one uses its moving-average estimate based on M terms, and instead of l = E[L] its moving-average estimate based on N terms, namely

E[R] ≈ AR = ∑_{i=1}^M R_i / M,   l = E[L] ≈ AL = ∑_{j=1}^N L_j / N.

(It is assumed that R is N(m(t), σ(t)) or Exp(λ(t)) and L has the mean E[L] = l(t).) The order Q is therefore

Q = √(2·AR·S/H) · √((H + D)/D).   (7.21)

The flowchart is given in Fig. 7.6. The initial conditions are

TC = CS = CD = CH = 0,   VI = BI,   T = CLOCK = 0.

All blocks in the flowchart are self-explanatory. Note that the model refers to only one type of material. Using the ideas of this section, it can be extended to other distributions of R and L or modified for multiproduct inventories.
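A compact Python sketch of the flowchart's logic is given below. It is a free rendering under simplifying assumptions (normal demand truncated at 0, lead time uniform on {1, 2, 3}, at most one outstanding order, a fixed normal quantile z_α); all numerical values are hypothetical.

    import numpy as np
    from collections import deque

    rng = np.random.default_rng(2)

    H, D, S_COST = 0.5, 4.0, 100.0         # holding, shortage, set-up costs
    BI, TT = 200.0, 1000                   # initial stock, simulation horizon
    m, sigma = 20.0, 5.0                   # demand R ~ N(m, sigma)
    M, NL = 10, 5                          # moving-average window sizes
    z_alpha = 1.645                        # upper 5% normal quantile

    RR = deque(rng.normal(m, sigma, M), maxlen=M)    # last M demands
    LL = deque(rng.integers(1, 4, NL), maxlen=NL)    # last NL lead times
    CH = CD = CS = 0.0
    VI, T_arrive, Q_pend = BI, -1, 0.0

    for clock in range(1, TT + 1):
        AR = float(np.mean(list(RR)))                # moving average of R
        AL = float(np.mean(list(LL)))                # moving average of L
        P = AL * m + z_alpha * np.sqrt(AL) * sigma   # reorder point (7.20)
        Q = np.sqrt(2 * AR * S_COST / H) * np.sqrt((H + D) / D)  # order (7.21)

        if clock == T_arrive:                        # an order enters the stock
            VI += Q_pend
            T_arrive = -1
        R = max(rng.normal(m, sigma), 0.0)           # demand of the current period
        RR.append(R)
        VI -= R
        if VI < 0:
            CD += -VI * D                            # shortage cost
            VI = 0.0
        else:
            CH += VI * H                             # holding cost
        if VI < P and T_arrive < 0:                  # reorder decision
            L = int(rng.integers(1, 4))              # random lead time in {1,2,3}
            LL.append(L)
            CS += S_COST
            T_arrive, Q_pend = clock + L, Q

    print(f"CH={CH:.0f} CD={CD:.0f} CS={CS:.0f} TC={CH + CD + CS:.0f}")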

8 Optimization using random search

8.1 Introduction

In this section we present methods of optimization based on random search. The problem is the following:

max_{x∈D} f(x),   (8.1)

where D ⊂ R^k is a k-dimensional set. For solving a min problem, the function f(x) is changed into −f(x) or the problem is transformed in another way (e.g. f(x) → 1/f(x) if f(x) ≠ 0). There is a large "classical" literature based on properties of f(x) or of D. Here we are interested in determining the point x* ∈ D such that max_{x∈D} f(x) = f(x*) = f*, i.e. x* is a global maximum point. The global maximum point is selected from several local maxima. Such a local maximum belongs to a capture set, i.e. a set which contains only that maximum point. (In fact the term capture comes from the recurrent methods for determining x* and f*.)

The idea of random search is the following. Generate a large number N of random points X_1, X_2, ..., X_N in D. (When D is bounded, i.e. sup_{x,y∈D} ||x − y|| < ∞, these points can be uniformly distributed on D.) Then calculate f(X_i), 1 ≤ i ≤ N, and take f*_{(N)} = f(X*_{(N)}) = max_{1≤i≤N} f(X_i). A theorem of Gnedenko (1943) says that under some conditions (i.e. f a continuous function) we have lim_{N→∞} f*_{(N)} = f*, lim_{N→∞} X*_{(N)} = x*. When the points X_i, 1 ≤ i ≤ N, are uniform on D, we call this Crude Monte Carlo (CMC). If the optimum solution satisfies x* ∈ T* ⊂ D (T* being a capture set containing the solution) and p = P(X ∈ T*) (this probability being calculated for the assumed distribution of X), then, for a given risk ε, 0 < ε < 1, in order to have

P(at least one X_i falls in T*, 1 ≤ i ≤ N) ≥ 1 − ε,   (8.2)

it is necessary to use

N > [ log ε / log(1 − p) + 1 ] = N*.   (8.2')

As p is not known, one can use p > 1 − ε.

The CMC algorithm, given by Anderssen and Bloomfield (1975), for approximating x* is:

1. Input ε, p and calculate N*; take Z_0 = −A (where A = MAX is a large number) and take Y_0 a point in D;

2. for i = 1 to N* do begin
Generate X uniformly distributed on D and calculate f = f(X);
Calculate Z_1 = max(Z_0, f) and

Y_1 = { Y_0, if f < Z_0;  X, if f ≥ Z_0; }

Take Z_0 = Z_1, Y_0 = Y_1;
end;

3. Take f* = Z_1, x* = Y_1.

Note that the algorithm works under the only assumption |f(x)| ≤ M < ∞, ∀x ∈ D.
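A Python sketch of the CMC algorithm on a box-shaped domain D (the test function and all parameter values are hypothetical):

    import numpy as np

    def cmc_maximize(f, low, high, eps=1e-4, p=1e-3, rng=None):
        # CMC search for max of f over the box [low, high]; N* as in (8.2')
        rng = rng or np.random.default_rng()
        n_star = int(np.log(eps) / np.log(1.0 - p)) + 1
        low, high = np.asarray(low, float), np.asarray(high, float)
        z0, y0 = -np.inf, low                 # Z0 = -A, Y0 a point of D
        for _ in range(n_star):
            x = low + (high - low) * rng.random(low.shape)  # X uniform on D
            fx = f(x)
            if fx >= z0:                      # keep the running maximum
                z0, y0 = fx, x
        return y0, z0

    f = lambda x: -np.sum((x - 0.3) ** 2)     # hypothetical objective
    print(cmc_maximize(f, [0.0, 0.0], [1.0, 1.0]))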

8.2 Importance Monte Carlo for Optimization

In the CMC algorithm, all points X ∈ D are equally likely to be sampled. It would be better to use a sampling method which selects with higher probability those points of D which are close to x* (see Vaduva and Spircu-1981). This can be done if we use a suitable pdf on D: if the function f(x) has a maximum at x*, then x* must be a mode of this distribution.

Therefore, we assume that f(x) has the following properties:
a. f(x) > 0, ∀x ∈ D;
b. f(x) is a continuous function (hence an integrable one);
c. f(x) is bounded on D, i.e. |f(x)| ≤ M (this is true if D is a compact set).
If a. is not true, then we can select P, 0 < P < ∞, and change f(x) into f(x) + P, which has the same maximum point.

Now, we can choose a pdf of the form

g(x) = f(x) / ∫_D f(u) du,   x ∈ D,   I = ∫_D f(u) du.

A random vector X having pdf g(x) can now be simulated using the rejection (enveloping) procedure, with the enveloping pdf h(x) uniform on D,

h(x) = { 1/H, if x ∈ D;  0, otherwise, }   H = mes(D).

If we denote K = sup_{x∈D} g(x) = M/I, then g(x)/h(x) ≤ α = KH = MH/I, and therefore the algorithm for simulating X is the following:

1. Input M;
2. repeat
Generate Y uniformly distributed on D;
Generate U uniform on (0, 1);
until U ≤ f(Y)/M;
3. Take X = Y.

The method based on simulating X with pdf g(x) is called the Importance Monte Carlo (IMC) method. The IMC algorithm is similar to the CMC algorithm, with X simulated from the pdf g(x).
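A sketch of the IMC sampler and search in Python, on D = [0, 1]² with a hypothetical objective; the acceptance test U ≤ f(Y)/M is the one derived above:

    import numpy as np

    rng = np.random.default_rng(3)

    f = lambda x: np.exp(-10.0 * np.sum((x - 0.7) ** 2))  # hypothetical f, 0 < f <= M
    M = 1.0                                               # bound of f on D = [0,1]^2

    def sample_g():
        # rejection from the uniform envelope: accept Y iff U <= f(Y)/M
        while True:
            y = rng.random(2)
            if rng.random() <= f(y) / M:
                return y

    N = 2000
    xs = np.array([sample_g() for _ in range(N)])   # sample from g(x) = f(x)/I
    vals = np.array([f(x) for x in xs])
    i = int(np.argmax(vals))
    print("IMC estimate of x*:", xs[i], "f* ~", vals[i])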

Convergence of IMC. We now discuss the convergence of the IMC method of optimization. For a sample of size N, X_1, X_2, ..., X_N, X_i ∈ D, from the pdf g(x) (whose cdf is G(x)), let us denote f_i = f(X_i), where f is the function to be maximized. Then the estimate of f* = max_{x∈D} f(x) = f(x*) is f*_N = max_{1≤i≤N} f_i. Two questions arise now:

1. Does the convergence of f*_N to f* depend on the cdf G(x)?
2. Assuming that f*_N → f*, what is the influence of G(x) on the speed of convergence?

The answer to question 1 is given by the following proposition of Gnedenko (1943).

Proposition 8.1 If, under the cdf G(x), any neighbourhood V(x*) of x* has a positive probability (i.e. P(X ∈ V(x*)) > 0), then f*_N → f* when N → ∞, almost surely.

From this proposition it follows that the cdf G(x) does not influence qualitatively the convergence of f*_N to f*.

As for question 2, the answer is that G(x) does have a big influence on the convergence. This fact was proved by Rubinstein and Weissman (1979) when G(x) is a uniform cdf. In the following we show that for a given sample size N the IMC algorithm is faster than the CMC algorithm when the function f(x) to be optimized is continuous. To formulate the proposition which states this, we first introduce some notation:

Y is the vector uniformly distributed on D; denote by G_1(x) the cdf of f(Y);
X has the pdf g; denote by G_2(x) the cdf of f(X).

For samples of the same size N, Y_1, Y_2, ..., Y_N and X_1, X_2, ..., X_N, we denote f^1_i = f(Y_i), f^2_i = f(X_i) and f^{1*}_N = max_{1≤i≤N} f^1_i, f^{2*}_N = max_{1≤i≤N} f^2_i, and note that the cdf of f^{1*}_N is H_1(x) = (G_1(x))^N and the cdf of f^{2*}_N is H_2(x) = (G_2(x))^N. Then the following proposition holds.

Proposition 8.2 Let δ > 0 be a positive number and consider the interval (f* − δ, f*) in the neighbourhood of f*. Then, ∀y ∈ (f* − δ, f*), we have

P(f^{1*}_N ∈ (y, f*)) < P(f^{2*}_N ∈ (y, f*)).

Therefore f^{2*}_N enters the interval (y, f*) faster than f^{1*}_N, i.e. the IMC procedure converges to x* faster than the CMC procedure. Note that the presented random search methods assume that the constraint domain D is bounded. However, the IMC method can also be used when the domain D is not bounded. In this case the pdf g(x), x ∈ D, must be properly selected, as in the following example.

Example. Consider the following problem:

max_{x∈D} f(x),   x ∈ R²,   D = (0, ∞) × (0, ∞),

where f(x) is a continuous function. In this case we select the importance pdf g(x, y) of the form

g(x, y) = { λµ e^{−λx−µy}, if x > 0, y > 0;  0, otherwise, }   λ, µ > 0.

If it is known a priori that the maximum point (x*, y*) of f(x, y) is far from the origin (0, 0), then we must choose λ < 1, µ < 1. A good choice would also be λ = µ = 1.

9. Simulation of some discrete stochastic processes

In this section we deal with the simulation of some discrete stochastic processes, such as the Bivariate Uniform Poisson Process (BUPP) and the Bivariate Uniform Binomial Process (BUBP), and we illustrate their application to a healthcare problem.

9.1. Simulation of Poisson Processes and their use to analyze scan statistics

9.1.1 Introduction

Scan statistics study clusters of random points in an interval I. If I = [a, b], −∞ < a < b < ∞, we deal with one-dimensional scan statistics and, if I = [a_1, b_1] × ... × [a_n, b_n], −∞ < a_i < b_i < ∞, 1 ≤ i ≤ n, we have multivariate scan statistics. In scan statistics the main problem is to study clusters of points which describe unusual situations, namely to see whether it is natural to find large or small clusters with a large or a small probability.

The aim of this subsection is to study a bivariate scan statistic and to approximate (by simulation of a BUPP) the critical value of a statistical test related to the continuous scan.

Definition 9.1 Let X_1, X_2, ..., X_N be random points in the interval [0, T]. Denote by S_w the maximum number of points found in an interval of length w, w < T, when such an interval scans [0, T]. The small interval of length w is called the scan window and the random variable S_w is called the one-dimensional scan statistic.

When X_1, X_2, ..., X_N are integer valued random variables, S_w is the discrete one-dimensional scan statistic, and when X_1, X_2, ..., X_N are a trajectory of N points of a Poisson process X_t, t ≥ 0, S_w is the one-dimensional continuous scan statistic (see Alm-1997; Glaz, Naus, Wallenstein-2001).

An alternative to S_w is the random variable W_k, defined as the minimum length of an interval in [0, T] which contains k points. Note that we have

P(W_k ≤ w) = P(S_w ≥ k).   (9.1)

Let us denote

P(S_w ≤ k) = P(k, N, w, T).

We will focus in this subsection on a two-dimensional continuous scan statistic.

Definition 9.2 Let I = [0, L] × [0, T] be a two-dimensional interval and u, v > 0 two positive numbers such that 0 < u < L < ∞, 0 < v < T < ∞. (The numbers u, v define a two-dimensional scan window with dimensions u and v.) Assume that in the interval I there are points X_1, X_2, ..., X_N which are a trajectory of a bivariate Poisson process X_t, t ≥ 0, with intensity λ. Denote ν_{t,s} = ν_{t,s}(u, v) = the number of points which fall in the scanning window [t, t + u] × [s, s + v]. Then the bivariate scan statistic is

S = S((u, v), N, L, T) = max_{0≤t≤L−u, 0≤s≤T−v} ν_{t,s}.   (9.2)

The probability of interest is now

P(S((u, v), N, L, T) ≥ k) = P((u, v), N, L, T, k).   (9.3)

The probability distribution (9.3) is hard to calculate; therefore a simulation procedure is the simplest way to estimate it. Haiman and Preda-2002 introduced a method for estimating the probability distribution (9.3) using the simulation of a conditional scan statistic and the relationship between the scan statistic and conditional scan statistics.

We estimate the probability distribution (9.3) using the simulation of the scan statistic; the main steps of the algorithm that we use are the following:

Algorithm SIMSCAN

Input T, N, w, m, λ;

1. For j = 1 to m do
begin
generate X_1, ..., X_N;
determine S_w, take K_j = S_w;
end;

(In Subsection 9.1.3 we describe the implementation of the algorithm SIMSCAN which determines S_w, denoted there by n_w.)

2. Determine the empirical distribution of the sample K_1, ..., K_m as follows:

2.1 Determine the order statistics K_(1) < K_(2) < ... < K_(r), r ≤ m;

2.2 Determine the frequencies f_i, 1 ≤ i ≤ r, f_i = the number of sampling values K equal to K_(i), 1 ≤ i ≤ r, ∑_{i=1}^r f_i = m;

2.3 Determine the relative frequencies (i.e. sampling probabilities) π_i = f_i/m.

(In fact, step 2 builds up a frequency distribution, i.e. a histogram of the scan statistic.)

If m is large enough, then the sampling distribution converges to (9.3) (according to the consistency property of the estimates π_i).

9.1.2 Algorithms for the simulation of a multivariate Poisson process

For the estimation of the probability distribution of the two-dimensional continuous scan statistic we need a simulation method for the bivariate Poisson process.

Definition 9.3 (see Devroye-1986) A process consisting of randomly occurring points in the plane is said to constitute a two-dimensional Poisson process of rate λ, λ > 0, if:

1. the number of points occurring in any given region of area A is Poisson distributed with mean λA;

2. the numbers of points occurring in disjoint regions are independent.

For m = 1 (i.e. the X's are real numbers), it is well known that the spacings between the random points of the Poisson process Poisson(λt), t ∈ [0, ∞), have an exponential distribution of parameter λ. This gives the following algorithm for simulating a trajectory of k points of the one-dimensional homogeneous Poisson process of intensity λ:

Algorithm SIMPO1

Input λ, k (preparatory step);

1. i = 0, T_0 = 0;

2. repeat
Generate E ↦ Exp(1);
i := i + 1, T_i := T_{i−1} + E/λ;
until i = k.

The algorithm produces the trajectory T_1, T_2, ..., T_k of the univariate Poisson(λ) process on [0, ∞). Simulation of E ↦ Exp(1) can be done by the inverse method or by the rejection method (see Devroye-1986, Fishman-1996, Vaduva-1977). From this algorithm an algorithm for simulating a bivariate Poisson process on A = [0, t] × [0, 1], with rate λ, follows immediately:

Algorithm SIMPOT01

1. Generate T_1, T_2, ..., T_k, a Poisson trajectory on [0, t];

2. Generate U_1, U_2, ..., U_k, uniform and independent random variates on [0, 1].

The points (U_1, T_1), (U_2, T_2), ..., (U_k, T_k) determine a uniform Poisson process on A.

Note that k is an integer sampling value of the Poisson random variable with parameter λt.
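In Python, SIMPO1 amounts to a cumulative sum of exponential spacings; for SIMPOT01 the sketch below uses the equivalent conditional-uniform property (given the Poisson count k on [0, t], the arrival times are distributed as ordered uniforms). Parameter values are illustrative.

    import numpy as np

    def simpo1(lam, k, rng=None):
        # SIMPO1: first k points of a Poisson(lam) process on [0, inf)
        rng = rng or np.random.default_rng()
        return np.cumsum(rng.exponential(1.0, size=k)) / lam  # T_i = T_{i-1} + E/lam

    def simpot01(lam, t, rng=None):
        # SIMPOT01: uniform Poisson process on A = [0, t] x [0, 1]
        rng = rng or np.random.default_rng()
        k = rng.poisson(lam * t)                  # random number of points on [0, t]
        T = np.sort(rng.uniform(0.0, t, size=k))  # equivalent to exponential spacings
        U = rng.uniform(0.0, 1.0, size=k)
        return np.column_stack([U, T])

    print(simpo1(2.0, 5))
    print(simpot01(2.0, 3.0)[:5])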

Similar ideas can be used to build an algorithm for simulating a uniform Poisson process of intensity λ on the n-dimensional interval I = [0, T_1] × ... × [0, T_n]. If we denote by V_0 = ∏_{i=2}^n T_i the volume of the (n − 1)-dimensional interval I_1 = [0, T_2] × ... × [0, T_n], then the following algorithm simulates the required Poisson process.

Algorithm SIMPOMD

1. Simulate 0 < X_{11} < X_{12} < ... < X_{1k}, a one-dimensional Poisson process with parameter λV_0 on [0, T_1] (i.e. k is random). This step is performed as follows:

1.1 Initialize t = 0, k = 0;

1.2 repeat
i. Generate E ↦ Exp(1); (E is an exponential random variable of parameter 1);
ii. k := k + 1; t := t + E/λ; X_{1k} := t/V_0;
until X_{1k} ≥ T_1;

2. Generate P_1, ..., P_k, independent points uniformly distributed on I_1;

3. Output (X_{11}, P_1), ..., (X_{1k}, P_k). (This is a realization of the Poisson process of intensity λ on the n-dimensional interval I.)

The following theorem justifies the algorithm.

Theorem 9.1 The points Q_i = (X_{1i}, P_i), 1 ≤ i ≤ k, determine a uniform Poisson process of parameter λ on I.

Proof. Denote by N the random number of points X_{1i} in [0, T_1] and by N_Q the number of points Q_i generated by the algorithm. Since the points P_i are uniform on I_1 and do not change the count, we have

P(N_Q = k) = P(N = k).   (9.4)

Since the variables E in the algorithm are exponential, the spacings of the X_{1i} are Exp(Λ_0) with Λ_0 = λV_0, so

P(N = k) = F^{(k)}(T_1) − F^{(k+1)}(T_1)   (9.5)

with

F^{(k)}(t) = ∫_0^t ( Λ_0^k u^{k−1} / (k−1)! ) e^{−Λ_0 u} du,

because F^{(k)} is the Erlang cdf of the k-th point (the convolution product of k exponentials). Hence

P(N = k) = ( Λ_0^k / (k−1)! ) ∫_0^{T_1} u^{k−1} e^{−Λ_0 u} (1 − Λ_0 u/k) du =

= ( Λ_0^k / (k−1)! ) [ ∫_0^{T_1} u^{k−1} e^{−Λ_0 u} du − ∫_0^{T_1} (Λ_0 u^k/k) e^{−Λ_0 u} du ] =

= ( Λ_0^k / (k−1)! ) [ ∫_0^{T_1} u^{k−1} e^{−Λ_0 u} du + (u^k/k) e^{−Λ_0 u} |_0^{T_1} − ∫_0^{T_1} u^{k−1} e^{−Λ_0 u} du ] =

= ( Λ_0^k / (k−1)! ) (T_1^k / k) e^{−Λ_0 T_1} = ( (Λ_0 T_1)^k / k! ) e^{−Λ_0 T_1}.

Finally we get

P(N_Q = k) = P(N = k) = (Λ^k / k!) e^{−Λ},   Λ = λV,   V = T_1 V_0,

and the theorem is proved.

In particular, an algorithm for simulating a uniform Poisson process with intensity λ on the bivariate interval [0, T] × [0, L] is the following (see also Devroye-1986):

Algorithm SIMPOTL

1. Simulate 0 < T_1 < T_2 < ... < T_k, a uniform Poisson process of rate λL on the time axis, T_k ≤ T (i.e. k is random). This is done as follows:

1.1 Initialize t = 0, k = 0;

1.2 repeat
i. Generate E ↦ Exp(1);
ii. Take k := k + 1; t := t + E/λ; T_k := t/L;
until T_k ≥ T;

2. Generate L_1, L_2, ..., L_k, uniform on [0, L];

3. Output (T_1, L_1), (T_2, L_2), ..., (T_k, L_k).

The sequence in step 3 defines a bivariate Poisson process of intensity λ, uniform on [0, T] × [0, L].

Now, using the algorithm SIMSCAN for the bivariate case, we can determine an empirical distribution of the scan statistic (i.e. a histogram) and then estimate the critical value S_α corresponding to a risk α, such that P(S ≥ S_α) = α.

Notes.

a) All the algorithms are easily adapted to produce uniform integer values for the one-dimensional Poisson process, or for the two-dimensional Poisson process with integer coordinates of the sampling points.

b) In the one-dimensional case, algorithms can be written for the non-homogeneous Poisson process with intensity λ(t); these algorithms use the cumulative intensity

Λ(t) = ∫_0^t λ(u) du.   (9.6)

c) The algorithm SIMSCAN can then be applied to estimate the α-quantile of the scan statistic S((u, v), k, T, L).

d) If the number N of points in A = [0, T] × [0, L] is binomially distributed (see Subsection 9.2 below for details) with parameters p, n, 0 < p < 1, n ∈ N_+, then one sampling value of the scan statistic is produced by the following algorithm:

Algorithm SCANBIN

1. Input the parameters n and p;

2. Simulate a sampling value of N, binomial with parameters p, n;

3. Simulate N points (T_i, L_i), 1 ≤ i ≤ N, uniformly distributed in A = [0, T] × [0, L], as follows:

3.1 Take i = 0;

3.2 repeat
i. Generate U_1, U_2, uniform and independent (0,1) random numbers;
ii. Take T_i = T·U_1, L_i = L·U_2, i := i + 1;
until i = N.

Then an algorithm similar to SIMSCAN can be applied to produce a sampling value of the bivariate scan statistic.

In the following subsection we give some results on the implementation of the algorithm SIMPOTL for producing the empirical distribution of the scan statistic when the points (T_i, L_i), 1 ≤ i ≤ N, are realizations of a uniform bivariate Poisson process on [0, T] × [0, L] with intensity λ. Comparisons with the results of Alm-1997 and with the results of Haiman and Preda-2002 are presented.

9.1.3 Implementation and test results

Two programs using the algorithm SIMPOTL and a scan algorithm were written. One of them is written in the C language and runs under the Linux operating system, and the other is written in C++ and runs under the Windows 2000 operating system. The two programs contain small differences concerning the scan module. We present here the scan algorithm and the results obtained from the C program. We call the following algorithm SCAN2. In order to understand this implementation we refer to Fig. 9.1 in Subsection 9.2 below.

We suppose that the scan surface and the scanning window are rectangles with sides parallel to the horizontal and vertical axes, respectively. The width of the scan surface is denoted by T and its height by W. The width of the scanning window is denoted by u and its height by v.

Furthermore, we suppose that both the scan surface and the scanning window are defined by two of their corners: the upper right corner and the lower left corner. We denote these corners by S_right and S_left for the surface, and by W_right and W_left for the window. Initially S_right = (T, W) and S_left = (0, 0). After generating the points (T_i, W_i), 1 ≤ i ≤ N, realizations of a uniform bivariate Poisson process on [0, T] × [0, W] with intensity λ, we begin the scanning process. We assume that the first position of the scanning window is characterized by the coordinates W_right = (T_max, W_max), W_left = (T_max − u, W_max − v), where W_max and T_max are the maximum values of the W_i and T_i respectively.

The scanning window moves over the scan surface on vertical bands, parallel to the y-axis. If we assume that the window is characterized by the coordinates W_right = (x_r, y_r), W_left = (x_l, y_l), then the following position of the scan window will be W_right = (x_r, y_r − d), W_left = (x_l, y_l − d), where d = min{y_r − y_max, y_l} and y_max is the biggest coordinate y in the band which is smaller than y_r.

After a band has been entirely scanned, the scanning window is repositioned on the next band in the following way: if the last position on the previous band was characterized by W_right = (x_r, y_r), W_left = (x_l, y_l), then the new position is characterized by W_right = (x_r − h, y_max), W_left = (x_r − h − u, y_max − v), where h = min{x_r − x_max, x_l}, x_max is the biggest coordinate x smaller than x_r, and y_max is the maximum value of W_i over the points with x_r − h − u ≤ T_i ≤ x_r − h. We use this method of scanning because the points of the bivariate Poisson process are generated with T_i in increasing order.

For each position of the window the number of points n_W inside the window is counted. After scanning the whole surface we determine the maximum value of n_W. This maximum is a simulated value of the bivariate scan statistic (i.e. n_w = S_w in the earlier notation).

By repeating the algorithm SCAN2 for N runs (N large), one determines the empirical distribution of the scan statistic.
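A brute-force check of the scan value that SCAN2 computes band by band, together with the surrounding SIMSCAN loop, can be sketched in Python as follows (simpotl is the sketch given earlier; all parameter values are illustrative). It is enough to examine windows whose right edge and top edge each touch some point coordinate, since any maximizing window can be slid into such a position.

    import numpy as np

    def scan_stat(pts, u, v):
        # maximum count of points in any u x v window (O(n^3) brute force)
        if len(pts) == 0:
            return 0
        xs, ys = pts[:, 0], pts[:, 1]
        best = 0
        for x in xs:                      # candidate right edges
            for y in ys:                  # candidate top edges
                inside = (xs >= x - u) & (xs <= x) & (ys >= y - v) & (ys <= y)
                best = max(best, int(inside.sum()))
        return best

    # SIMSCAN: repeat m runs and build the empirical distribution of S
    rng = np.random.default_rng(4)
    m, lam, T, L, u, v = 200, 0.5, 10.0, 10.0, 1.0, 1.0   # illustrative values
    K = [scan_stat(simpotl(lam, T, L, rng), u, v) for _ in range(m)]
    values, freq = np.unique(K, return_counts=True)
    print(dict(zip(values.tolist(), (freq / m).tolist())))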

The following tables contain test results. For comparison, the tables also include simulated results produced by Alm (1997) and approximations produced by a special method due to Haiman and Preda (2002).

At the top of each table the particular values of the input data are given:

• λ, the intensity of the bivariate Poisson process;

• W, T, the dimensions of the rectangle;

• u, v, the dimensions of the scanning window;

• N, the number of simulation runs;

• t, the time in seconds necessary for N simulation runs on a PC with an Athlon processor at 997 MHz and 256 MB of RAM.

λ = 0.01, W = T = 10, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 1, p2 = 0.1, λ′2 = 0.1

k   Poisson   H&P      Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
1   0.9818    0.9826   0.9959   0.9524         0.9545
2   1.0000    0.9998   0.9999   0.9981         0.9988

λ = 0.05, W = T = 10, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 5, p2 = 0.1, λ′2 = 0.5

k   Poisson   H&P      Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
2   0.9859    0.9854   0.9905   0.8524         0.8547
3   0.9998    0.9996   0.9997   0.9825         0.9798
4   1.0000    0.9999   0.9999   0.9982         0.9982

λ = 0.05, W = T = 200, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 5, p2 = 0.1, λ′2 = 0.5

k   Poisson   H&P      Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
3   0.8620    0.8621   0.8935   0.8610         0.8580
4   0.9974    0.9976   0.9981   0.9972         0.9976
5   1.0000    0.9999   0.9999   1.00           1.00

λ = 0.1, W = T = 50, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 10, p2 = 0.1, λ′2 = 1

k   Poisson   H&P      Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
3   0.8762    0.8761   0.9052   0.8719         0.8744
4   0.9957    0.9957   0.9966   0.9944         0.9957
5   1.0000    0.9998   0.9999   0.9998         0.9999

λ = 0.5, W = T = 10, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 50, p2 = 0.1, λ′2 = 5

k   Poisson   H&P      Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
4   0.7865    0.7938   0.8343   0.7911         0.7932
5   0.9692    0.9707   0.9759   0.9731         0.9680
6   0.9968    0.9970   0.9974   0.9976         0.9971
7   0.9999    0.9997   0.9997   0.9999         0.9999

λ = 1, W = T = 10, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 100, p2 = 0.1, λ′2 = 10

k   Poisson   H&P      Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
6   0.8396    0.8248   0.8603   0.8436         0.8335
7   0.9695    0.9468   0.9732   0.9714         0.9690
8   0.9956    0.9691   0.9959   0.9949         0.9954

The following tables compare only our results with the results from the implementation of Alm.

λ = 2, W = T = 20, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 200, p2 = 0.1, λ′2 = 20

k    Poisson   Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
7    0.0007    0.0004   0.0002         0.0001
9    0.5111    0.5283   0.5100         0.5119
11   0.9690    0.9640   0.9692         0.9653

λ = 5, W = T = 20, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 500, p2 = 0.1, λ′2 = 50

k    Poisson   Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
13   0.0010    0.0040   0.0002         0.0005
15   0.2390    0.2535   0.2645         0.2610
17   0.8540    0.8442   0.8509         0.8457

λ = 5, W = T = 30, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 500, p2 = 0.1, λ′2 = 50

k    Poisson   Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
14   0.0003    0.0016   0.0004         0.0007
16   0.3167    0.3346   0.3202         0.3201
18   0.8851    0.8945   0.8908         0.8818
20   0.9893    0.9907   0.9899         0.9899
22   0.9994    0.9994   0.9992         0.9995

During various runs the frequencies converged to the probabilities calculated by Haiman and Preda-2002.

The number of runs N = 10000 considered in the tables seems to be large enough to ensure a good estimate of the probability distribution of the scan.

9.2. On Simulation of a Bivariate Uniform Binomial Process and its Use in Scan Statistics³

9.2.1 Basic notions

The aim of this subsection is to study a Bivariate Uniform Binomial Process (introduced by Vaduva and Alexe-2006), to underline its use, and to approximate (by simulation) the critical value of a bivariate scan test.

In subsection 9.1 and in Suter, Vaduva, Alexe-2005 a scan statistic is considered for a Bivariate Uniform Poisson Process (BUPP) of intensity λ. Here, in a similar manner, we consider a discrete Bivariate Uniform Binomial Process (BUBP) of parameters p, λ, I, p ∈ (0, 1), λ ∈ R, I ⊂ R², defined as follows.

Definition 9.4 Let I be the bivariate interval I = [0, T] × [0, W], 0 < T, W < ∞, and let n = [λ · mes(I)], mes(I) = T · W ([x] = integer part), be a positive integer. Let p be a given probability, 0 < p < 1. Let X_1, X_2, ..., X_N be a set of random points uniformly distributed on I, where N is an integer random variable having a binomial distribution of parameters (n, p) (denoted Bin(n, p)). The set of points X_1, X_2, ..., X_N is called a trajectory of the bivariate uniform binomial process of parameters (p, λ, I) on I (denoted BUBP(p, λ, I)) if:

1) the points X_1, X_2, ..., X_N are stochastically independent;

2) for any disjoint bivariate intervals B_i = [α_{1i}, β_{1i}] × [α_{2i}, β_{2i}], 1 ≤ i ≤ k, α_{mi}, β_{mi} ∈ R, m = 1, 2, and every finite k, the number of points N_i which fall in B_i is binomially distributed Bin(n_i, p), n_i = [λ(β_{1i} − α_{1i})(β_{2i} − α_{2i})], and N_1, N_2, ..., N_k are independent random variables.

This process will be denoted for short by X_t ↦ BUBP(p, λ, I), t ∈ N. The constant λ will also be called the intensity of the process.

³ This research was done in the frame of the research program "PAI-Brancusi" of Romanian-French cooperation, 2005-2006.

This binomial process has a stability property similar to that of a Poisson process, namely:

Theorem 9.2 If B_1, B_2, ..., B_k are disjoint subsets of the interval I = [0, T] × [0, W] and X_t ↦ BUBP(p, λ, I), t ∈ N, then the restricted processes X_i(t), t ∈ N, satisfy X_i(t) ↦ BUBP(p, λ, B_i), and X_i is independent of X_j, i ≠ j. In particular, if B ⊂ I then X_k ↦ BUBP(p, λ, B), k = [λ mes(B)]. On the other hand, if I = B_1 ∪ B_2 ∪ ... ∪ B_m, B_i ∩ B_j = ∅, i ≠ j, and X_t ↦ BUBP(p, λ, B_i), then X_{B_1} + X_{B_2} + ... + X_{B_m} is a BUBP(p, λ, I).

Proof. The proof can easily be done using the characteristic function of the binomial distribution. For the binomial distribution X ↦ Bin(n, p) the characteristic function is

ϕ(t) = E[e^{itX}] = (q + p e^{it})^n,   t ∈ R,   q = 1 − p.   (9.7)

Let us consider the random variables X_{B_i}, 1 ≤ i ≤ m, which are independent (the points defining X_{B_i} being uniformly distributed on B_i), and X_{B_i} ↦ Bin([λ mes(B_i)], p). Then the random variable Y = X_{B_1} + X_{B_2} + ... + X_{B_m} has the characteristic function of a Bin(n, p) distribution, i.e.

ϕ_Y(t) = ∏_{i=1}^m ϕ_{X_{B_i}}(t) = ∏_{i=1}^m (q + p e^{it})^{n_i} = (q + p e^{it})^n,   n = ∑_{i=1}^m n_i,   n_i = [λ mes(B_i)].

The last formula ends the proof.

Using the algorithm SIMSCAN from subsection 9.1, we can estimate the probability distribution of the bivariate scan based on simulation of the BUBP. The empirical distribution of the bivariate scan statistic S_{(u,v)} helps to determine the critical value k_α of significance level α of a scan test.

Given the risk α, 0 < α < 1, the critical test value k_α of the scan statistic is defined by

P(S_{(u,v)} > k_α) = α.   (9.8)

In the next section we discuss the simulation of a trajectory of a BUBP.

9.2.2 Algorithms for the simulation of a Bivariate Uniform Binomial Process

Definition 9.4 leads to the following algorithm for simulating a trajectory of N points of the bivariate uniform binomial process of intensity λ:

Algorithm SIMBIN2

1. (Preparatory step) i = 0; input λ, W, T, p, 0 < p < 1;

2. Calculate n = [λ · W · T];

3. Generate N, a sampling value of Bin(n, p);
repeat
Generate U ↦ uniform[0, T], V ↦ uniform[0, W];
(This can be done as follows:
- generate U_1 uniform (0, 1); take U = U_1·T;
- generate U_2 uniform (0, 1); take V = U_2·W;)
i := i + 1; take X_i = (U, V);
until i = N.

The algorithm produces the trajectory X_1, X_2, ..., X_N of the BUBP(p, λ, I). Now, using the algorithm SIMSCAN for the bivariate case, we can determine an empirical distribution of the scan statistic (i.e. a histogram) and then estimate the critical value k_α.
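In Python, SIMBIN2 is only a few lines (a sketch; numpy's exact binomial generator stands in for the normal approximation BINCL described below):

    import numpy as np

    def simbin2(p, lam, T, W, rng=None):
        # SIMBIN2: a trajectory of the BUBP(p, lam, I) on I = [0,T] x [0,W]
        rng = rng or np.random.default_rng()
        n = int(lam * T * W)                    # n = [lam * mes(I)]
        N = rng.binomial(n, p)                  # N ~ Bin(n, p)
        return np.column_stack([rng.uniform(0, T, N), rng.uniform(0, W, N)])

    print(len(simbin2(0.01, 5.0, 20.0, 20.0)))  # about n*p = 20 points on average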

The simulation of the random variable X, binomially distributed with parameters p, n, 0 < p < 1, n ∈ N_+, can be done in various ways (see Devroye-1986, Vaduva-1977). For large n we use the fact that X ↦ Bin(n, p) ≈ N(np, √(np(1 − p))), i.e. X is approximately normally distributed. Therefore the algorithm to simulate X is the following (see Vaduva-1977):

Algorithm BINCL

1. Input n, p; calculate m = np, σ = √(np(1 − p));

2. Generate Z ↦ normal N(0, 1);

3. Calculate X = m + Zσ; take N = round(X).

The function round(x) gives the integer closest to x.

Simulation of the normal deviate Z ↦ N(0, 1) can be done in several ways; two methods are presented briefly in the following. The first one is based on the Central Limit Theorem (CLT) [6,7].

Algorithm CLNORM (simulates Z ↦ N(0, 1) based on the CLT)

1. Z = 0;

2. for i := 1 to 12 do begin
Generate U uniform (0, 1);
Z := Z + U;
end;

3. Z := Z − 6. (The sum of 12 uniforms has mean 6 and variance 1, so Z is approximately N(0, 1).)

Another algorithm combines a rejection (enveloping) method and a discrete composition method (see Vaduva-1977, Devroye-1986). It looks as follows:

Algorithm RJNORM

1. repeat
Generate U_1 ↦ uniform(0, 1);
Generate Y ↦ Exp(1);
(This can be done by the inverse method as follows:
Generate U ↦ uniform(0, 1);
while U ≤ 0.0000001 do generate U ↦ uniform(0, 1);
Y := − log(U);)
until U_1 ≤ e^{−Y²/2 + Y − 0.5};

2. Take Z_1 := Y;

3. Generate U ↦ uniform(0, 1);
if U ≤ 0.5 then s := 1 else s := −1; (s is a random sign);

4. Take Z := s·Z_1. (Z is normal N(0, 1).)
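A sketch of RJNORM in Python (the acceptance test e^{−Y²/2 + Y − 0.5} = e^{−(Y−1)²/2} envelopes the half-normal density by the Exp(1) density):

    import math
    import random

    def rjnorm(rng=random):
        # RJNORM: standard normal via rejection from Exp(1) plus a random sign
        while True:
            y = rng.expovariate(1.0)                       # Y ~ Exp(1)
            if rng.random() <= math.exp(-(y - 1.0) ** 2 / 2.0):
                return y if rng.random() <= 0.5 else -y    # random sign

    sample = [rjnorm() for _ in range(10000)]
    print(sum(sample) / len(sample))   # approximately 0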

In the following section we give some results on the implementation of the algorithm SIMSCAN for producing the empirical distribution of the scan statistic when the points (T_i, W_i), 1 ≤ i ≤ N, are realizations of a bivariate uniform binomial process on [0, T] × [0, W] with intensity λ. Comparisons with the results of Alm-1997 and with the results of Haiman and Preda-2002, in the case of the BUPP process, are presented. In order to compare the results with the bivariate uniform binomial process, we use the fact that the binomial distribution Bin(n, p), with n large, is approximated by a Poisson distribution Poisson(λ), λ = np. When we refer to the BUPP(λ, I) process and to the binomial BUBP(p, λ′, I) process, both on I = [0, T] × [0, W], we must distinguish between the intensities λ (for Poisson) and λ′ (for binomial). In fact, for n large, we must have

λWT = λ′pWT,   (9.9)

which gives

λ′ = λ/p.   (9.9')

9.2.3 Implementation and test results

In this implementation we use one of the programs presented in Vaduva, Alexe-2006, namely the algorithm called SCAN2. In fact, the main ideas derive from the algorithm SIMSCAN presented in Subsection 9.1.1. The discrete process used in this implementation is either a BUPP or a BUBP produced according to SIMBIN2. In the following we underline the main steps of SCAN2 (see Suter, Vaduva, Alexe-2005). Figure 9.1 gives some hints on the construction of SCAN2.

[Figure 9.1 (three panels) shows the scan surface S, with corners S_left = (0, 0) and S_right = (T, W), the u × v scanning window with corners W_left = (x_l, y_l) and W_right = (x_r, y_r), and the generated points X_i(x_i, y_i): panel a) the initial position of the scan window; panel b) the window moving down a band by the step d; panel c) the window moving to the left by the step h onto the next band.]

Figure 9.1. Hints for the scanning algorithm:
a) initial position of the scan window;
b) moving the scan window down;
c) moving the window to the left.

We suppose that the scan surface (also called the map) and the scanning window are rectangles with sides parallel to the horizontal and vertical axes, respectively. The width of the map is denoted by T and its height by W. The width of the scanning window is denoted by u and its height by v.

Furthermore, we suppose that both the scan surface and the scanning window are defined by two of their corners: the upper right corner and the lower left corner. We denote these corners by S_right and S_left for the surface, and W_right and W_left for the window. Initially S_right = (T, W) and S_left = (0, 0). After generating the points (T_i, W_i), 1 ≤ i ≤ M, realizations of a bivariate uniform binomial process on [0, T] × [0, W] with intensity λ, we begin the scanning process. First we order the simulated points with respect to T_i. Assume this has already been done, and that the first position of the scanning window is characterized by the coordinates W_right = (T, W), W_left = (T − u, W − v).

The scanning window moves over the scan surface on vertical bands, parallel to the y-axis. If we assume that the window is characterized by the coordinates W_right = (x_r, y_r), W_left = (x_l, y_l), then the following position of the scan window will be W_right = (x_r, y_r − d), W_left = (x_l, y_l − d), where d = min{y_r − y_max, y_l} and y_max is the biggest coordinate y in the band which is smaller than y_r. (See Fig. 9.1.)

After a band has been entirely scanned, the scanning window is repositioned on the next band in the following way: if the last position on the previous band was characterized by W_right = (x_r, y_r), W_left = (x_l, y_l), then the new position is characterized by W_right = (x_r − h, y_max), W_left = (x_r − h − u, y_max − v), where h = min{x_r − x_max, x_l}, x_max is the biggest coordinate x smaller than x_r, and y_max is the maximum value of W_i over the points with x_r − h − u ≤ T_i ≤ x_r − h. We use this method of scanning because the points of the BUBP or BUPP have their coordinates T_i in increasing order.

For each position of the window the number of points inside the window is counted, and the largest number n_w of points found during the scanning process is stored. This maximum n_w is a simulated value of the bivariate scan statistic (i.e. n_w = S_w = S in the notation of Section 9.1).

By repeating the algorithm SCAN2 for N runs or iterations (N large), one determines the empirical distribution of the scan statistic.

The following tables contain test results. For comparison, each table also includes simulated results produced by Alm-1997 and approximations produced by the special method of Haiman and Preda-2002. (Some of the tables are reproduced from Alm-1997.) At the top of each table the particular values of the input data are given, namely:

• λ, the intensity of the bivariate Poisson process;
• W, T, the dimensions of the rectangle;
• u, v, the dimensions of the scanning window;
• N, the number of simulation runs;
• p_1, p_2 and λ′_1, λ′_2, the parameters of the binomial processes corresponding to the approximated Poisson process (determined according to (9.9), (9.9'));
• k, the value of the scan statistic for which the empirical probability is calculated;
• H&P refers to the results of Haiman and Preda (2002);
• P refers to the BUPP; A refers to Alm; B refers to the BUBP.

The entries in the following tables represent the probabilities P(S ≤ k), where S = S((u, v), T, W) is the bivariate scan statistic of Definition 9.2.

λ = 0.05, W = T = 10, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 5, p2 = 0.1, λ′2 = 0.5

k   P        H&P      A        B(p1, λ′1)   B(p2, λ′2)
2   0.9859   0.9854   0.9905   0.8524       0.8547
3   0.9998   0.9996   0.9997   0.9825       0.9798
4   1.0000   0.9999   0.9999   0.9982       0.9982

λ = 0.1, W = T = 50, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 10, p2 = 0.1, λ′2 = 1

k   P        H&P      A        B(p1, λ′1)   B(p2, λ′2)
3   0.8762   0.8761   0.9052   0.8719       0.8744
4   0.9957   0.9957   0.9966   0.9944       0.9957
5   1.0000   0.9998   0.9999   0.9998       0.9999

λ = 0.5, W = T = 10, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 50, p2 = 0.1, λ′2 = 5

k   P        H&P      A        B(p1, λ′1)   B(p2, λ′2)
4   0.7865   0.7938   0.8343   0.7911       0.7932
5   0.9692   0.9707   0.9759   0.9731       0.9680
6   0.9968   0.9970   0.9974   0.9976       0.9971
7   0.9999   0.9997   0.9997   0.9999       0.9999

λ = 1, W = T = 10, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 100, p2 = 0.1, λ′2 = 10

k   P        H&P      A        B(p1, λ′1)   B(p2, λ′2)
6   0.8396   0.8248   0.8603   0.8436       0.8335
7   0.9695   0.9468   0.9732   0.9714       0.9690
8   0.9956   0.9691   0.9959   0.9949       0.9954

The following tables compare only our results with the results from the implementation of Alm.

λ = 2, W = T = 20, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 200, p2 = 0.1, λ′2 = 20

k    Poisson   Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
7    0.0007    0.0004   0.0002         0.0001
9    0.5111    0.5283   0.5100         0.5119
11   0.9690    0.9640   0.9692         0.9653

λ = 5, W = T = 20, u = v = 1, N = 10000
p1 = 0.01, λ′1 = 500, p2 = 0.1, λ′2 = 50

k    Poisson   Alm      Bin(p1, λ′1)   Bin(p2, λ′2)
13   0.0010    0.0040   0.0002         0.0005
15   0.2390    0.2535   0.2645         0.2610
17   0.8540    0.8442   0.8509         0.8457

The results in the tables show a good agreement between the distributions of the scan statistic for all the compared cases (i.e. Poisson, Alm, H&P and binomial). For values of k of practical interest (see below), the values of P(S ≤ k) are almost equal for both the BUPP and the BUBP.

During various runs the frequencies converged to the probabilities calculated by Haiman and Preda-2002.

The number of runs N = 10000 considered in the tables seems to be large enough to ensure a good estimate of the probability distribution of the scan. Any N > 10000 will be even more successful.

The tests done here legitimate both assumptions (Poisson or binomial) for defining, via simulation, the critical value of the scan test. Therefore, in the next section (the application) we will use the Poisson process. (Runs for the BUBP are time consuming!)

BUBP and BUPP processes may be used as equal alternatives in various applications where discrete random (uniform) events occur on some surface of material or geographic area.

9.3 An Application to Healthcare

Here we present an application of scan statistics to the analysis of cancer disease for children under age 16 in the region Nord-Pas-de-Calais (north of France). The region consists of two departments, each department contains some arrondissements, and an arrondissement consists of cantons. The data consist of the number of diseased children in each canton (considered the scan window). The total population of the region is about 573500 inhabitants and the total number of ill children is N = 497. In one canton of the first department the largest number of ill children was found, namely 9 from a population of π_1 = 1600, and in another canton, of the second department, 7 ill children were found from a population of π_2 = 2300 inhabitants. These two cantons contain the largest numbers of ill children. Administrative authorities want to know whether these large figures are natural or whether they are determined by some environmental factors of the cantons. (The whole region is a mining region!) Therefore, under the natural hypothesis (denoted H0) we assume that the number of diseased children in the region is a BUBP (or BUPP) process, and we must test the hypotheses H01 and H02 that the numbers of 9 and respectively 7 ill children are normal or dangerous events from the healthcare point of view. We are therefore in the theoretical situation discussed in the previous sections.

The collection of data for our application follows the procedure used in [3,4], which defines the dimensions of the hypothetical geographic region (i.e. the map) taking into consideration the size of the population of the region, and defines the scan window using the size of the population of the cantons with the largest numbers of ill children. As the geographical map of the region is not a regular one, we consider it as a square [0, W] × [0, T] with W = T = √P, where P is the size of the population of the region (in our case P = 573500), hence W = T = 757.3. Similarly, the scan windows are respectively u_1 = v_1 = √π_1 = √1600 = 40, u_2 = v_2 = √π_2 = √2300 = 47.95. The intensity of the Poisson process (over the region) is λ = N/P = 0.0008666 and the parameters of the Poisson processes for the two cantons are Λ_1 = λπ_1 = 1.3865, Λ_2 = λπ_2 = 1.9932. To use the bivariate uniform binomial processes, we need to estimate the parameters p_1, p_2. These are simply defined as p_1 = 7/N = 0.014, p_2 = 9/N = 0.018. Hence, according to (9.9') we have for the BUBP the parameters λ′_1 = Λ_1/p_1 = 99, λ′_2 = Λ_2/p_2 = 110.6. (For the BUPP these figures are not used.)

In order to test the mentioned hypotheses H01, H02 we use the simulation procedure presented in the previous sections. We also use the property of the scan statistic which says that S((u, v), N, W, T) = S((1, 1), N, W/u, T/v). Hence, for the first canton W_1 := W/u_1 = T_1 := T/v_1 = 757.3/40 = 18.93, and for the second canton W_2 := W/u_2 = T_2 := T/v_2 = 15.77.

The simulation of the scan statistic, corresponding to the data under the Poisson hypothesis, is summarized in the following tables, which contain the values S = k and the corresponding frequencies f:

W = T = 18.93, u = v = 1, Λ = 1.3865, N = 100000 iterations
k   6      7       8       9       10     11    12   13
f   2576   36833   42728   14302   3006   471   75   9

W = T = 15.77, u = v = 1, Λ = 1.99318, N = 100000 iterations
k   7      8       9       10      11     12     13    14   15   16
f   1005   22369   44235   23462   6983   1564   308   66   6    2

From the first table one can see that P(S ≤ 9) = 0.96439. Therefore H01 can be accepted with a risk of α = 0.03561. (Hence k_α = 9 and the critical region of the scan test is C = {k | k > k_α}.)

From the second table one can see that P(S ≤ 10) = 0.91071. The hypothesis H02 is also accepted, with α = 0.08929, k_α = 10, and the critical region C = {k | k > k_α}. Since in the second case (canton 2) there are 7 ill children, and this is the second largest value in the region, the frequencies in the second table must be moved one step to the left. Therefore, for the second largest value (i.e. k = 7) the critical region is C = {k | k > k_α}, k_α = 11, α = 0.01946, and this gives an even better reason to accept the hypothesis H02.

In conclusion, the figures of ill children (k = 9, k = 7) are natural. There are no problems for the authorities concerning cancer healthcare.

Bibliography

Alm, S.E. (1997). ”On the Distributions of Scan Statistics of a Two-dimensional

94

Page 95: Stochastic Simulations

Poisson Process”. Advances in Applied Probability., 1, 1-18. The article presentssome approximates for the bivariate scan statistics.

Anderssen, R.S.,Bloomfeld, P.(1975). ”Properties of the random search in Globaloptimization”, JOTA,Vol.16,Nr.5-6. The paper proves the importance of usinguniform random samples the restriction domain D.

Davison, A.C. and Hinkley,D.V.(1997).Bootstrap Methods and their Applications,Cambridge University Press, Cambridge. This is an up-to-date monograph onbootstrap resampling methods and their applications in statistics.

Devroye, Luc.(1986).Non-Uniform Random Variate Generation, Springer Verlag,New York. The book, after fourteen years of being published, is still a completemonograph which contains all methods invented in the field of random variate simu-lation, including algorithms for simulating some stochastic processes and particulartypes of random vectors also.

Efron, B. and Tibshirani, R.J.(1993).An Introduction to the Bootstrap, Chapman& Hall, New York. This is a pioneering book in the field of bootstrap methodology.

Ermakov, E.S.(1971).Monte Carlo Method and Related Questions, (Russian), ”Nauka”Publishing House, Moscow. The book dominated the literature on Monte Carlomethods in eastern Europe during the 70’s. It is written in an accurate mathemat-ical form and discusses some particuler simulation models.

Fishman, G.S.(1996).Monte Carlo:Concepts, Algorithms and Applications, SpringerVerlag, New York. A modern and consistent monograph studying a wide area oftheoretical questions regarding computer simulation and Monte Carlo methods aswell as various applications.

Garzia,R.F. and Garzia, M.R.(1990).Network Modeling, Simulation, and Analysis,Marcel Dekker, New York. The book does not fit within the framework of thiscontribution, but it is important for the algorithmic way of analyzing stochasticnetworks which may help in solving particular types of optimization problems.

Gentle,J.E.(1998).Random Number Generation and Monte-Carlo Methods, (Statis-tics and Computing Series), Springer Verlag, New York. The book has inspired meto discuss some algorithms for simulating uniformly distributed random numbers;it contains almost all ideas in this respect, underlines the way of analyzing thequality of random numbers and also presents in a modern way the problems relatedto Monte Carlo methods.

Glaz, J. Naus, J. and Wallenstein, S. (2001). Scan Statistics, Springer Verlag, NewYork, Berlin, Heidelberg. The monograph studies various types of scan statisticsin one and two dimensions and gives some applications.

Gnedenko, B.V. (1943). "Sur la Distribution Limite du Terme Maximum d'une Série Aléatoire", Annals of Mathematics, Vol. 44, Nr. 3.

Haiman, G. and Preda, C. (2002). "A New Method of Estimating the Distribution of Scan Statistics for a Two-Dimensional Poisson Process". Methodology and Computing in Applied Probability. The article uses a random grid to determine lower and upper bounds for the distribution of a bivariate scan statistic on a rectangle.

Hanssmann, F. (1968). Operations Research in Production and Inventory Control, John Wiley and Sons, New York, London. The book is an introductory textbook on inventory models.

Knuth, D.E. (1981). The Art of Computer Programming, Volume 2, Seminumerical Algorithms, second edition, Addison-Wesley Publishing Company, Reading, Massachusetts. This is the first monograph, dating from the end of the 1960s, which describes and performs a deep analysis of the arithmetic problems of computer generation of random numbers and their testing.

Morgan, Byron T. (1984). Elements of Simulation, Chapman & Hall, New York, London. The book is a good textbook containing a large number of exercises related to the simulation of random variates.

Pham, D.T. and Karaboga, D. (2000). Intelligent Optimisation Techniques, Springer Verlag, Berlin. The book is an introductory text to optimization methods based on simulated annealing and genetic algorithms.

Ripley, B. (1986). Stochastic Simulation, Wiley, New York. This is a concise text that accurately underlines the main problems of Monte Carlo simulation; various applications are also presented.

Robert, C.P. and Casella, G. (1999). Monte Carlo Statistical Methods, Springer Verlag, New York. This is a complete monograph on statistical simulation which performs a good analysis of Markov Chain Monte Carlo methods and their applications. Monte Carlo integration and Monte Carlo optimization are also discussed.

Ross, Sheldon M. (1997). Simulation, Second Edition, Academic Press, San Diego, New York, London. The book describes the discrete event simulation approach with application to queueing theory and presents some variance reduction techniques for Monte Carlo calculations.

Rubinstein, Y., Weissman, I. (1979). "The Monte Carlo Method for Global Optimization", Cahiers d'Etudes de R.O., Vol. 21, Nr. 2. The paper underlines the main problems in global optimization by using random search.

Spriet, Jan, Vansteenkiste, Ghislain C. (1982). Computer Aided Modeling and Simulation, Academic Press, New York. The book is a monograph dedicated to system modeling, based mainly on using differential equations and numerical methods.

Suter, Florentina, Vaduva, I., Alexe, Bogdan. (2005). "On Simulation of Poisson Processes Used to Analyse a Bivariate Scan Statistics". Analele Universitatii "Al.I.Cuza" Iasi, Sectia Informatica, Tom XV, p. 23-35. The paper gives an algorithm to estimate via simulation the critical test value for a bivariate scan statistic.

Vaduva, I., Dinescu, C., Savulescu, B. (1974). Mathematical Methods for Production Management, Vol. 2, Educational Publishing House, Bucharest (published in Romanian). The book contains an introduction to queueing models and inventory models, and applications of graph theory and optimization problems.

Vaduva, I. (1977). Computer Simulation Models, Technical Publishing House, Bucharest, Romania (written in Romanian). The book presents methods for the simulation of random variables, random vectors and stochastic processes, and simulation models for queueing and inventory systems.

Vaduva, I., Spircu, L. (1981). "On Convergence of Monte Carlo Method for Optimization", Proc. Symp. on Economic Cybernetics, Rostock, Germany, 14 p. (microfilm).

Vaduva, I. (1994). "Fast algorithms for computer generation of random vectors used in reliability and applications". Preprint Nr. 1603, January 1994, TH Darmstadt. Here are found methods for simulating various multivariate distributions, including those based on transforming uniformly distributed vectors into other types of random vectors. Algorithms for the simulation of univariate and multivariate distributions used in reliability are particularly discussed.

Zeigler, B.P., Praehofer, H. (2000). Theory of Modeling and Simulation. The monograph gives a formal description of discrete event systems with application to discrete simulation modeling.

Further References on Scan Statistics

Glaz, J., Balakrishnan, N. (1999). Scan Statistics and Applications, Birkhauser, Boston. This is the first book on scan statistics, illustrated by various examples and applications.

Glaz, J., Naus, J. and Wallenstein, S. (2001). Scan Statistics, Springer Verlag, New York, Berlin, Heidelberg. This book is a monograph updating all results on scan statistics in one or many dimensions, referring to rectangular or circular maps.

Haiman, G. and Preda, C. (2002). "A New Method of Estimating the Distribution of Scan Statistics for a Two-Dimensional Poisson Process", Methodology and Computing in Applied Probability.

Vaduva, I., Alexe, B. (2006). "On Simulation of a Bivariate Uniform Binomial Process to be used for Analyzing Scan Statistics", Analele Univ. Bucuresti, Matematica-Informatica, Anul LV, p. 153-164. The paper defines the Bivariate Uniform Binomial Process, gives algorithms for simulating such a process, and describes the estimation of the probability distribution of a bivariate scan statistic.
