Introduction to Simulation of
Communication Networks
5hr-Seminar in the course
«Network Design and Planning»
ECS289i
Massimo Tornatore
(Courtesy of Prof. Fabio Martignon)
Politecnico di Milano
Department of Electronics, Information and
Bioengineering (DEIB)
Introduction to Simulation
2
Summary
What is simulation?
Systems, models and variables
Discrete-event simulation
Generation of pseudo-random numbers
– Synthesis of random variables
Statistical Analysis
– Statistical confidence of simulative results
Next Lecture: example of C/C++ simulator
Introduction to Simulation
3
Introduction to simulation
Since we are going to study queuing theory, why do we also need simulation?
Isn't queuing theory sufficient to determine the performance of telecommunication networks?
No. In fact:
Queuing theory can describe, and give results for, only a small set of very simplified models and systems
How can we study a complex system?
– Queues with non-Poisson arrivals: bursty arrivals, batch arrivals, back-off of rejected requests, etc.
– Queues with complex queue-management mechanisms (PQ, WFQ, RED, etc.)
– Queue networks that do not meet Jackson's assumptions (steady state)
– Analysis of the transient behavior of queuing systems
Introduction to simulation
In addition, there are network systems that cannot be easily described by queuing models, e.g.:
– access interfaces of wireless systems (LTE, WLAN, etc.) with errors due to channel characteristics and interference
– dynamic routing mechanisms for IP or optical networks
– congestion control mechanisms (e.g., TCP)
– channel access control mechanisms (e.g., CSMA/CA)
– complex retransmission protocols, for example protocols with piggybacking, selective reject, etc.
When you start from a real system, it is quite difficult to find a model solvable with queuing theory alone
Introduction to simulation
What is SIMULATION?
Simulation seeks to build an experimental device that behaves like the real system under study in some important respects
Examples:
– scale models of airplanes, cars or trains used in wind tunnels
– SimCity, Railroad Tycoon, and other videogames based on reproducing how a system works
– flight simulators for pilot training
Introduction to simulation
Other:
– predicting the development of ecosystems after
artificial alteration
– verification of stock-exchange tactics
– weather forecasts
– verification of battle tactics
– etc.
Models and Systems
System – a very general concept that can be defined informally as a collection of parts, called components, that interact with each other
Model – a representation of a system. This representation can take many forms (e.g., a physical replica), but here we focus on representation by means of mathematical or software/simulative models
State of a system and level of abstraction
State – the system state describes the current state of all its components
– to each system state there corresponds a state of the system model, and the model represents the evolution of the system through the history of its state changes
The level of abstraction of a model indicates that some features of the system state are omitted
– the level of abstraction is tightly related to the measures the model is aimed at
– the best model is simply the easiest model by which you get the measures (performance) you want
Models and Systems
Variables
– the activities of the model are described as relationships or functions between variables
– a mathematical model is described using variables; the same holds for a simulative model!
– State variables
– state variables define the state of the model
– their evolution defines the evolution of the system
– Input variables
– input variables describe external stimuli on the system under consideration
Models and Systems
– Output variables
– are a function of state and input variables
– they represent, therefore, the probes inserted in the model for the measurements
– solving the model means obtaining the values of the output variables
Solution
– the analytical solution of a model involves, e.g., mathematical methods for solving the equations that describe the relationships between variables
– the simulative solution of a model reproduces the evolution of the system by evolving the state variables and directly measuring the output variables
Simulation vs. Theoretical Models
Simulation Properties:
– simulation is “descriptive” and not “prescriptive”
– simulation provides information on the behavior of the system, given the parameters
– simulation DOES NOT tell you how to set the parameters for the best system behavior, or what the limits of the system are
– Example: M/M/1
– from the analytical model I immediately see the capacity limit
– if I simulate the system, I have to run a lot of simulations, increasing the load, until I find out what the limit is
Deterministic vs. Stochastic Simulations
There are many ways to classify simulations, e.g.:
– deterministic vs. stochastic
– continuous time vs. discrete time
Deterministic vs. stochastic simulations:
– deterministic simulations are completely defined by the model, and their evolution is deterministically associated with the input parameters
– stochastic simulations are based on models that include random variables or processes, and so they require the generation of random variables; the evolution of the model depends on both the input parameters and the generated random variables
Deterministic vs. Stochastic Simulations (2)
Examples
– Deterministic simulation:
– consider the motion of billiard balls on a pool table. Given the position of the balls and the direction and strength of the cue's impact, we can simulate the outcome of the shot (without explicitly solving an analytical model)
– Stochastic simulation:
– consider a GSM cell with N channels (or a phone concentrator) to which connection requests arrive according to a Poisson process with rate λ
– we want to determine the probability of rejection, knowing that with probability p rejected calls retry to connect after a time equal to T
Static and Dynamic Simulations (1)
Stochastic simulations can be classified as static or dynamic
Static simulations
– also called Monte Carlo simulations
– the time variable plays no role
– the basic objective is to determine some statistical characteristics of one or more random variables
– in fact, Monte Carlo simulations typically evaluate statistical measures through independently repeated experiments
Static and Dynamic Simulations (2)
Dynamic simulations
– also called temporal simulations
– time becomes the main variable, tied to the evolution of the model
– the purpose is to collect statistics on random processes observed at different times
Static and Dynamic Simulations (3)
A simple Monte Carlo example (it can also be solved analytically):
– consider a slotted random multiple-access system with 10 users of type A, 10 users of type B and 13 users of type C
– users of type A, B and C have a packet ready for transmission in each slot with probability 3p, 2p and p, respectively
– find the throughput of the system given p
– determine the value of p that maximizes the throughput
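This first Monte Carlo example can be sketched directly. The code below is an illustrative Python sketch (the function name and the success rule — a slot contributes to the throughput iff exactly one user transmits, as in slotted ALOHA — are our assumptions, since the slide does not specify the collision model):

```python
import random

def slotted_throughput(p, n_slots=100000, seed=1):
    """Monte Carlo estimate of the per-slot success probability of the
    slotted random-access system above: 10 users transmit with prob. 3p,
    10 with prob. 2p, 13 with prob. p; a slot is a success iff exactly
    one user transmits (assumed collision model)."""
    probs = [3 * p] * 10 + [2 * p] * 10 + [p] * 13
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_slots):
        transmitters = sum(1 for q in probs if rng.random() < q)
        if transmitters == 1:
            successes += 1
    return successes / n_slots
```

Maximizing over p can then be done by simply sweeping p over a grid and keeping the best estimate, exactly as the slide on simulation being "descriptive, not prescriptive" warns.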
Static and Dynamic Simulations (4)
Another (more complex) Monte Carlo simulation example:
– consider a cellular packet system in which, in each time interval, a packet is transmitted by a mobile user placed in a random position with probability G
– the attenuation of the channel is a function of distance and of a random factor (fading), and the transmitted power is fixed. The packet is received correctly only if the signal-to-interference ratio is greater than 6 dB
– determine the probability of a successful transmission, and the value of G that maximizes the throughput
Static and Dynamic Simulations (5)
Example of dynamic simulation (1):
– consider a queuing system with one server and a queue of up to K packets. Interarrival times are uniformly distributed between a and b, and each arrival is composed of a number of users x, where x is a binomial random variable with parameter p. Determine the rejection probability and the average crossing delay
Static and Dynamic Simulations (6)
Example of dynamic simulation (2):
– consider a slotted polling system with N queues. The packet length x is randomly distributed according to a certain known pdf. Interarrival times are distributed according to a Poisson process with rate λ. Assuming that:
– the server adopts a priority-queuing policy (i.e., it continues to serve a queue until it is empty)
– the server chooses the next queue according to a «longest-queue» policy
– determine the average transfer time through the system
Classification of Dynamic Simulations
Discrete-event dynamic simulation
– the system state changes in response to «events»
– e.g., network simulation (OPNET, NS-2)
Continuous-time dynamic simulation
– the system state evolves in response to the change of a continuous-time variable
– e.g., weather forecasts
NB: simulated time vs. simulation time!
Simulation of discrete events
Simulation of discrete events is of fundamental importance for telecommunication networks
In discrete-event simulation the state variables change value only at discrete instants of time
A change of the system state is called an event and is characterized by an instant of occurrence
– an event has no duration
After an event occurs, an activity starts in the system that persists for some time
– an activity is usually characterized by a start event and an end event
– for example, the beginning and the end of the transmission of a packet are events, while the transmission itself is an activity
Simulation of discrete events
In discrete-event simulation we should:
– define the types of events that can occur
– define the changes in the system state associated with each event
– define a time variable and an ordering of the events in a calendar, based on the instant of event occurrence
– define an initial state
– scroll through the calendar and, each time an event occurs, change the state variables according to that event
– take measurements on the output variables
Example: Simulation of a queuing system
Model:
– queuing system with one server and an infinite queue
– Input variables:
– interarrival times of requests (packets)
– service times of requests
– State variables:
– number of requests in the system
– Initial state:
– e.g., no user in the system
– Output variables:
– average time spent by a packet in the system
Example: Simulation of a queuing system
Events
– 0. Initialization («init»)
– 1. First arrival
– set service start
– schedule service end
– 2. From the second arrival on, we should act differently on the basis of the state:
– Arrival
» in an empty system → immediate service start and scheduling of the service end
» in a non-empty system → add a packet to the queue (its service end? We cannot schedule it yet — we cannot read the future of the calendar…)
– Service end (see next slides)
» with empty queue → hold until the next event
» with non-empty queue → set a new service start and schedule the new service end
Example: simulation of a queuing system
Filling the calendar of events
– PROBLEM: it is not possible to place the service-end event of each queued request, as we do not know the service duration of the requests queued in front of it
– SOLUTION: the calendar can be filled with new events while other events are pending
– example: a new packet is queued even if the service-end events of the packets before it in the queue are not yet known
Example: a queuing system simulation
In summary:
when we have an «arrival» event, we increase the number of users, then:
– if the system was empty, a new service-end event is inserted in the calendar at a time equal to «CLOCK + service time»
– if the system is busy, we add a packet to the queue
when a «service-end» event is reached, we decrease the number of users, then:
– if the queue is empty, no action
– if the queue is not empty, a new service-end event is inserted in the calendar at a time equal to «CLOCK + service time»
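The arrival/service-end rules above can be sketched as a small event-driven simulator. The following is an illustrative Python sketch (the course simulator will be in C/C++; all names here are ours), for the special case of exponential interarrival and service times, with the calendar kept as a heap:

```python
import heapq
import random

def simulate_mm1(lam, mu, num_arrivals, seed=1):
    """Event-driven simulation of an M/M/1 queue.
    Returns the average time spent in the system by the served packets."""
    rng = random.Random(seed)
    calendar = []            # ordered event list: (time, kind)
    n_in_system = 0          # state variable
    arrival_times = []       # FIFO of arrival instants, for delay measurement
    delays = []              # output variable
    # init: schedule the first arrival
    heapq.heappush(calendar, (rng.expovariate(lam), "arrival"))
    arrivals_left = num_arrivals - 1
    while calendar:
        clock, kind = heapq.heappop(calendar)
        if kind == "arrival":
            n_in_system += 1
            arrival_times.append(clock)
            if n_in_system == 1:   # system was empty: service starts now
                heapq.heappush(calendar, (clock + rng.expovariate(mu), "end"))
            if arrivals_left > 0:  # schedule the next arrival
                arrivals_left -= 1
                heapq.heappush(calendar, (clock + rng.expovariate(lam), "arrival"))
        else:                      # service end
            n_in_system -= 1
            delays.append(clock - arrival_times.pop(0))
            if n_in_system > 0:    # queue not empty: next service starts
                heapq.heappush(calendar, (clock + rng.expovariate(mu), "end"))
    return sum(delays) / len(delays)
```

With λ = 0.5 and μ = 1 the estimate approaches the theoretical M/M/1 value E[T] = 1/(μ − λ) = 2 as the number of simulated arrivals grows.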
Example: a queuing system simulation
Measurement of output variables
1. Transfer time
– user arrival: store the time of arrival
– end of service: compute the transfer time (CLOCK − t_arrival)
2. Average number of users in the queue
– weighted average of the users in the system during each activity interval («time slices»)
Simulation of discrete events
Note: the correct insertion of a new event in the calendar is a critical operation if the calendar has many events
– efficient techniques must be used for the insertion of an element into an ordered list
The CLOCK variable
Advancing the «CLOCK» variable
– «clock driven» simulations
– the CLOCK variable is always increased by a fixed step
» e.g., slotted systems
– «event driven» simulations
– the CLOCK variable is increased by the time interval between the occurrence of an event and the occurrence of the following event
– Notes:
– from a computational-time point of view it may be convenient to adopt one method rather than the other, depending on the model
Some final considerations…
Considering the power and efficiency of modern programming languages and computing systems, simulation is today a powerful analysis tool for addressing complex problems
But simulation is also a tool that should be used with care, for the following reasons:
– it is not easy to validate the obtained results
– the computational time can easily become very high
– it is not easy to understand how different parameters affect the result
Simulation of discrete events
The simulation of a stochastic model involves the use of random input variables
→ we need the statistical distributions of the input variables
So, for computer simulation, pseudo-random number generation and the synthesis of statistical variables are needed (the next topic…)
– example: the traffic entering a queuing system, described by the arrival process and the service-time process
The Role of Random Numbers
When the model to be analyzed via simulation is stochastic, two important problems arise:
– the generation of pseudo-random numbers to be used for the generation of the input variables
– the statistical analysis of the results obtained through the output variables
What I assume you know
– basics of statistics
– average (mean), variance
– concept of random variable
– probability density function (pdf) f(x)
– cumulative distribution function (CDF) F(x)
– central limit theorem
– Gaussian (normal) and Student's t distributions
Generation of pseudo-random numbers
Rigorously speaking, numbers generated by a computer cannot be random, due to the deterministic nature of the computer
We can, however, generate pseudo-random sequences that pass a series of statistical tests of randomness
Generation of pseudo-random numbers
The problem of generating pseudo-random numbers can be logically divided in two parts:
– generation of sequences of random numbers uniformly distributed between 0 and 1
– generation of sequences of random numbers with an arbitrary distribution
– Poisson, Bernoulli, Weibull, exponential, etc.
Generation of pseudo-random numbers
Pseudo-random sequences are obtained through the implementation of recursive formulas
Some history
The first method to generate random sequences was Von Neumann's «middle-square» method:
the next number is obtained by squaring the previous number and taking its central digits
Von Neumann's example:
x0 = 3456
which, squared, gives
(x0)² = 11943936
so
x1 = 9439
This method was abandoned: difficult to analyze, relatively slow and statistically unsatisfactory
Generation of pseudo-random numbers
Factors determining the quality of a method:
1) the numbers must be statistically independent
2) we must be able to reproduce the sequence
3) the numbers must be uniformly distributed (i.e., they must all have the same probability of occurring)
4) the sequence must be of arbitrary length
5) the method should execute quickly on a computer and consume a small amount of memory
Generation of pseudo-random numbers
Let's recall some basic math operators
Modulo:
x mod y = x − y·⌊x/y⌋,  y ≠ 0
Congruence:
x ≡ y (mod z)
– property linking the two: x ≡ y (mod z) ⟺ x mod z = y mod z
Generation of pseudo-random numbers
Linear Congruential Method (Lehmer, 1948)
The sequence {Xn}, n ∈ N: X0, X1, X2, ..., Xi, ... is defined by:
X(n+1) = (a·Xn + c) mod m
where:
X0 = initial value, or seed
a = multiplier
c = increment
m = modulus
NB: the method is called:
– multiplicative if c = 0
– mixed if c ≠ 0
Generation of pseudo-random numbers
Example:
X0 = a = c = 7
m = 10
{Xn} = 7, 6, 9, 0, 7, 6, 9, ...
Note: 0 ≤ Xi < m, for all i
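A minimal sketch of the generator (illustrative Python; the seed X0 = 7 is the first element of the slide's sequence, and the generator below yields the values that follow it):

```python
def lcg(x0, a, c, m):
    """Mixed linear congruential generator: X_{n+1} = (a*X_n + c) mod m."""
    x = x0
    while True:
        x = (a * x + c) % m
        yield x

# the slide's example: X0 = a = c = 7, m = 10
g = lcg(7, 7, 7, 10)
first_six = [next(g) for _ in range(6)]   # 6, 9, 0, 7, 6, 9
```

Note how the period of 4 (7, 6, 9, 0, then back to 7) shows up immediately — the next slide discusses exactly this drawback.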
Generation of pseudo-random numbers
Drawbacks of linear congruence
As soon as Xp = X0, the sequence repeats periodically; p is the period of the sequence
Since p ≤ m, the period is at most m
Note (1): if p = m, then every number between 0 and m − 1 appears exactly once per period
Note (2): to obtain a sequence in [0,1), divide by m:
{Rn} = X0/m, X1/m, X2/m, ..., Xi/m, ...
Generation of pseudo-random numbers
We can relate Xn directly to X0 — this emphasizes even more the deterministic nature of the sequence!
X1 = (a·X0 + c) mod m
X2 = (a·X1 + c) mod m = (a²·X0 + c(a + 1)) mod m
X3 = (a³·X0 + c(a² + a + 1)) mod m
...
Xn = (aⁿ·X0 + c(aⁿ − 1)/(a − 1)) mod m
Generation of pseudo-random numbers
How to choose the multiplier a and the increment c:
a and c strongly influence the period and the statistical properties of the sequence
there are rules for choosing a and c that yield period p = m (full period)
Criteria ensuring a full period:
1. c and m must be co-prime, i.e.: gcd(c, m) = 1
2. every prime divisor of m must divide (a − 1)
– ex.: if m = 10, its prime factors are 2 and 5, so (a − 1) must be a multiple of 2 and 5
3. if m is a multiple of 4, (a − 1) must be as well
Generation of pseudo-random numbers
It is not easy to find values that satisfy (1), (2), (3)
– ex.: m = 10, a = 21, c = 3 (Xn = 3, 6, 9, 2, 5, 8, 1, 4, 7, 0, ...)
Some researchers have therefore identified the following values in accordance with these criteria:
KNUTH: m = 2^31; a = int(π · 10^8); c = 453806245
GOODMAN/MILLER: m = 2^31 − 1; a = 7^5; c = 0
GORDON: m = 2^31; a = 5^13; c = 0
LEARMONTH/LEWIS: m = 2^31; a = 2^16 + 3; c = 0
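The GOODMAN/MILLER row is the well-known «minimal standard» Lehmer generator. Below is a quick illustrative sketch, together with the classic sanity check published by Park and Miller: starting from seed 1, the 10,000th value must be 1043618065.

```python
def minstd(seed, n):
    """Multiplicative (c = 0) Lehmer generator with m = 2**31 - 1 and
    a = 7**5 = 16807: the 'minimal standard' of Park and Miller."""
    m, a = 2**31 - 1, 16807
    x = seed
    out = []
    for _ in range(n):
        x = (a * x) % m
        out.append(x)
    return out

seq = minstd(1, 10000)
# seq[0] == 16807; seq[-1] == 1043618065 (Park & Miller's check value)
```

Reproducibility from the seed — requirement (2) of the quality factors above — is exactly what makes such a deterministic check possible.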
Generation of pseudo-random numbers
A simpler condition:
– if the method is multiplicative (c = 0), one can show that if m = 2^b then the maximum period is p = 2^(b−2), provided b ≥ 4
Note: the equivalence of the multiplicative and mixed approaches has been proven
Generation of pseudo-random numbers
Notes on the choice of the modulus m
m influences the period, because p ≤ m
m also affects the computational speed:
– to compute a modulo we should, in general, perform a product, a sum and a division
– it is possible to do everything at once if you choose as modulus the maximum integer representable by the computer, plus one
– with this modulus, the operation corresponds to a truncation
– if b is the number of bits used by the computer, you will choose m = 2^b
Generation of pseudo-random numbers
Other methods:
congruential square method:
– based on the generation of numbers congruent modulo m according to the relation:
X(n+1) = (d·Xn² + a·Xn + c) mod m
Fibonacci, or additive, method:
X(n+1) = (Xn + X(n−k)) mod m
Generation of pseudo-random numbers
Notes on tests for generators
Tests on pseudo-random number generators are applied to verify that:
– the generated numbers are uniformly distributed
– the generated numbers are independent
However, these concepts are defined only for random variables, and must be translated into tests run on a finite set of samples
– generally, we consider a hypothesis verified only if the set of samples satisfies a certain number of tests
Generation of pseudo-random numbers
Notes on tests for generators: the χ² test
A typical test for verifying the distribution is the χ² test
We divide the set of possible values into k categories
Ŷi is the number of sample values falling in the i-th category and Yi = n·pi is the expected value, where n is the number of samples and pi is the probability of category i (pi = 1/k in the uniform case)
A quality index can be defined as:
V = (Ŷ1 − Y1)²/Y1 + (Ŷ2 − Y2)²/Y2 + ... + (Ŷk − Yk)²/Yk
Generation of pseudo-random numbers
Notes on tests for generators: the χ² test
The problem is that the value of V is itself a random variable, which also depends on the absolute values
It is therefore necessary to repeat the test several times on different samples and evaluate the probability that V takes high values
It can be proven that V has a χ² distribution with n = k − 1 degrees of freedom:
f_n(x) = x^(n/2 − 1) · e^(−x/2) / (2^(n/2) · (n/2 − 1)!),  x ≥ 0,  n/2 integer
(for non-integer n/2, the factorial (n/2 − 1)! is replaced by Γ(n/2))
Generation of pseudo-random numbers
Notes on tests for generators: the χ² test
If Px indicates the x-th percentile of the χ² distribution, we can rank the observations of V according to the table:
P0–P1, P99–P100: reject
P1–P5, P95–P99: suspicious
P5–P10, P90–P95: almost suspicious
Generation of pseudo-random numbers
Notes on tests for generators: the gap test
There are several tests to verify the independence of the samples
A frequently used one is the «gap» test:
– we define an event on the observed distribution, such as exceeding a certain threshold
– we estimate the probability p associated with the event
– from the sequence of samples we derive a (0,1) sequence that indicates whether the event occurred or not
Generation of pseudo-random numbers
Notes on tests for generators: the gap test
– we then consider the lengths of the runs of 0s and of the runs of 1s
– since the distribution of these lengths is geometric, we verify the conformity to that distribution using a test (e.g., the χ²)
– or, more simply, we estimate the average run length and compare it with the value predicted from 1 − p and p, respectively
Example of a (0,1) sequence: 000111110010010010010011100001001
Generation of other distributions
We now have a sequence of pseudo-random numbers:
– uniformly distributed between 0 and 1
– satisfying the tests of randomness
Next step:
we use them to obtain samples of variables distributed according to the distribution we need (exponential, Poisson, geometric, etc.)
Generation of other distributions
«Inverse transform» method
Given:
– r: a variable uniformly distributed in [0,1]
– that is, f(r) = 1 and F(r) = r
To obtain a random variable x with a given pdf f(x), we:
– determine F(x), with 0 ≤ F(x) ≤ 1
– generate random samples r
– set r = F(x)
– compute the inverse function F⁻¹(.)
– obtain x = F⁻¹(r)
It can be proven, but we skip the proof!
Generation of other distributions
Inverse transform example (1):
– we want to get x such that f(x) = 1/(b − a)
– from uniform [0,1] to uniform [a,b]
– F(x) = (x − a)/(b − a), with 0 ≤ F(x) ≤ 1
– r = F(x) = (x − a)/(b − a)
– x = r(b − a) + a
Generation of other distributions
Inverse transform example (2):
– we want to get x such that f(x) = λe^(−λx)
– from uniform [0,1] to negative exponential with rate λ (average 1/λ)
– F(x) = 1 − e^(−λx)
– r = F(x) = 1 − e^(−λx)
– x = −ln(1 − r)/λ
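Both inverse-transform examples can be written in a few lines (illustrative Python; the function names are ours):

```python
import math

def uniform_ab(r, a, b):
    """x = F^{-1}(r) = r*(b-a) + a for the uniform [a, b] distribution."""
    return r * (b - a) + a

def exponential(r, lam):
    """x = F^{-1}(r) = -ln(1-r)/lam for the negative exponential with rate lam."""
    return -math.log(1.0 - r) / lam
```

For instance, r = 0.5 maps to the median of the target distribution: the midpoint (a + b)/2 for the uniform case, and ln(2)/λ for the exponential case.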
Generation of other distributions
We can show this more rigorously…
– next 4 slides
Generation of other distributions
Generation of an arbitrary distribution — elements of probability:
The fundamental theorem of functions of random variables:
the p.d.f. f_Y(y) of the r.v. Y = g(X) is given by:
f_Y(y) = Σi f_X(xi) / |g'(xi)|
where the xi are the solutions of the equation y = g(x), and are in turn functions of y: xi = xi(y)
Generation of other distributions
Example:
X is a r.v. with C.D.F. F_X(x). Consider Y = F_X(X).
The fundamental theorem allows us to write:
f_Y(y) = f_X(x1) / F'_X(x1) = 1,  for 0 ≤ y ≤ 1
where x1 is the only solution of the equation y = F_X(x), which exists only if 0 ≤ y ≤ 1.
So Y is uniform in (0,1)!
Generation of other distributions
Synthesis of a r.v. using the percentile method:
it is now easy to see that if you have:
– U, a r.v. uniform in (0,1)
– then, to obtain a r.v. X with CDF equal to F(.), it is enough to set:
X = F⁻¹(U)
Generation of other distributions
This can also be proven as follows.
U is a r.v. uniform in (0,1):
f_U(x) = 1 for 0 ≤ x ≤ 1, and 0 elsewhere
F_U(x) = 0 for x < 0;  x for 0 ≤ x ≤ 1;  1 for x > 1
Set:
X = F⁻¹(U)
It results:
F_X(t) = P(X ≤ t) = P(F⁻¹(U) ≤ t) = P(U ≤ F(t)) = F(t)
Generation of other distributions
The variable U is obtained from the generation of pseudo-random numbers
There remains the problem of finding F⁻¹(.) for the variable we want to synthesize:
– for some processes F⁻¹(.) cannot be obtained in analytical form, and therefore we must resort to other methods
– moreover, for discrete random variables we need to slightly modify the approach
Generation of other distributions
Example: Exponential
– if you want to generate an exponential random variable x with average μ > 0
the pdf is:
f_X(x) = (1/μ)·e^(−x/μ),  x ≥ 0
you have:
F_X(x) = 1 − e^(−x/μ),  x ≥ 0
and therefore:
X = −μ·ln(1 − U), or X = −μ·ln U
(since U and 1 − U have the same uniform distribution)
Generation of other distributions
Example: Rayleigh
– if you want to generate a random variable with Rayleigh pdf
the pdf is:
f_X(x) = (x/σ²)·e^(−x²/2σ²),  x ≥ 0
you have:
F_X(x) = 1 − e^(−x²/2σ²),  x ≥ 0
and therefore:
X = σ√(−2·ln(1 − U)), or X = σ√(−2·ln U)
Generation of other distributions
Example: Gaussian
– if you want to generate a random variable x with Gaussian pdf, with μ = 0 and σ = 1
– to obtain a variable with average μ and variance σ², it is enough to use the transformation z = σx + μ
– the pdf is:
f_X(x) = (1/√(2π))·e^(−x²/2)
– it is well known that the CDF of the Gaussian cannot be expressed in closed form, so it cannot be inverted explicitly
Generation of other distributions
Example: Gaussian
– a first approach is to use an approximation
– the central limit theorem tells us that the sum of N r.v.'s tends to the normal distribution as N increases
– usually for N ≥ 12 we assume we can get a good approximation
– so it is enough to extract 12 uniform variables Ui and set:
X = Σ(i=1..12) Ui − 6
(each Ui has mean 1/2 and variance 1/12, so X has mean 0 and variance 1)
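A sketch of this approximation, with an empirical check of the resulting mean and variance (illustrative Python):

```python
import random

def clt_gaussian(rng):
    """Approximate N(0,1) sample: sum of 12 uniforms minus 6.
    Each uniform has mean 1/2 and variance 1/12, so the sum has
    mean 6 and variance 1; subtracting 6 centers it at 0."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(42)
samples = [clt_gaussian(rng) for _ in range(100000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Note that the approximation cannot produce values outside [−6, 6] and its tails are lighter than the true Gaussian's — one reason the exact approach on the next slides is preferable.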
Generation of other distributions
Example: Gaussian
– a smarter approach gets two independent samples of a normal random variable with only two extractions
– it is based on the observation that a 2-dimensional vector with Gaussian, independent Cartesian components has:
» a modulus with Rayleigh distribution
» a phase uniform in (0, 2π)
Generation of other distributions
Example: Gaussian
– therefore:
– two variables U1 and U2, uniform in (0,1), are extracted
– X and Y are computed as:
X = √(−2·ln U1) · cos(2π·U2)
Y = √(−2·ln U1) · sin(2π·U2)
– X and Y are independent normal random variables
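A sketch of this approach (commonly known as the Box–Muller transform):

```python
import math

def box_muller(u1, u2):
    """Two independent N(0,1) samples from two uniforms in (0,1)."""
    r = math.sqrt(-2.0 * math.log(u1))   # modulus: Rayleigh-distributed
    theta = 2.0 * math.pi * u2           # phase: uniform in (0, 2*pi)
    return r * math.cos(theta), r * math.sin(theta)
```

Compared with the sum-of-12-uniforms approximation, this is exact and costs only two uniform extractions per pair of Gaussian samples.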
Generation of other distributions
Discrete random variables:
consider a discrete random variable described by the probability distribution:
P(X = a_k) = p_k,  k = 1, ..., m
F_X(x) is then a staircase function, with a step of height p_k at each value a_k
Generation of other distributions
Discrete random variables
The staircase CDF, when inverted, gives the relation expressing the variable:
set x = a_k only if:
p_1 + ... + p_(k−1) < u ≤ p_1 + ... + p_(k−1) + p_k
NB: «u» is what we used to call «r», i.e., the pseudo-random number between 0 and 1
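The assignment rule above is a sequential search over the cumulative probabilities; a minimal illustrative sketch:

```python
def discrete_inverse(u, values, probs):
    """Return a_k such that p_1 + ... + p_{k-1} < u <= p_1 + ... + p_k."""
    cum = 0.0
    for a, p in zip(values, probs):
        cum += p
        if u <= cum:
            return a
    return values[-1]   # guard against floating-point round-off in the sum
```

For example, with values (1, 2, 3) and probabilities (0.2, 0.5, 0.3), u = 0.1 maps to 1, u = 0.6 to 2 and u = 0.95 to 3.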
Generation of other distributions
Discrete random variables — example:
generate a random variable X that takes value 1 with probability p and value 0 with probability 1 − p
It is enough to set:
X = 1 if 0 < u ≤ p
X = 0 if p < u ≤ 1
Generation of other distributions
Discrete random variables:
– the approach becomes extremely cumbersome for discrete distributions with infinite m
– we must stop at a finite value m
– m determines the number of comparisons that must be done in the assignment routine, and thus the speed of the routine itself
– in some cases it is possible to adopt some tricks
Generation of other distributions
Example: Geometric distribution
We have:
P(X = k) = p(1 − p)^(k−1),  k = 1, 2, ...
Consider an exponential r.v. Z with average μ; we have:
P(n ≤ Z < n + 1) = e^(−n/μ) − e^(−(n+1)/μ) = e^(−n/μ)·(1 − e^(−1/μ))
this value matches P(X = n + 1) if we require that:
1 − e^(−1/μ) = p,  i.e.,  μ = −1/ln(1 − p)
Generation of other distributions
Example: Geometric distribution
Therefore, to generate a geometric variable it is enough to:
– 1. generate a uniform variable U in (0,1)
– 2. get an exponential variable Z = −μ·ln U = ln U / ln(1 − p)
– 3. set X = 1 + ⌊Z⌋
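Steps 1–3 in code (illustrative Python; here the uniform sample u is passed in as an argument, so the mapping is deterministic):

```python
import math

def geometric(u, p):
    """Geometric sample via an exponential: X = 1 + floor(ln u / ln(1-p)).
    With u uniform in (0,1), P(X = k) = p*(1-p)**(k-1), k = 1, 2, ..."""
    z = math.log(u) / math.log(1.0 - p)   # exponential with mean -1/ln(1-p)
    return 1 + math.floor(z)
```

Sanity check of the construction: P(X = 1) = P(Z < 1) = P(u > 1 − p) = p, as required.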
Generation of other distributions
Example: Poisson distribution
P(X = k) = (a^k / k!)·e^(−a)
With the Poisson distribution things get complicated, and there is no shortcut; we search the CDF sequentially:
– 1. set k := 0, A := e^(−a), p := A
– 2. U := rand(seed)
– 3. while U > p:
» k := k + 1
» A := A·a/k
» p := p + A
– 4. return k
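The algorithm above, transcribed in Python (with the uniform sample passed in as an argument so the result is deterministic):

```python
import math

def poisson(a, u):
    """Poisson sample by sequential search on the CDF, as in steps 1-4:
    A holds P(X = k), p holds the running CDF P(X <= k)."""
    k = 0
    A = math.exp(-a)      # P(X = 0)
    p = A                 # running CDF
    while u > p:
        k += 1
        A = A * a / k     # P(X = k) obtained from P(X = k-1)
        p = p + A
    return k
```

The expected number of loop iterations is roughly a, since the search stops near the mean of the distribution.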
Analysis and validation of the results
Once we have built the simulation model and the software that implements it, we must:
– decide what to measure (which output variables)
– decide the statistical metric (average? variance?)
– note that the output variables are r.v.'s!
– repeat the experiment multiple times!!
– adopt the appropriate estimators for the parameters
– evaluate the accuracy (“confidence”) of the estimates
Analysis and validation of the results
Estimation problem: estimation of the average value
given a population whose distribution is f(x), with average E[x] = η and variance σ²(x) = σ²
[x1, x2, ..., xn] are n independent observations
the average value of the samples is defined by:
x̄ = (1/n) Σ(i=1..n) xi
Analysis and validation of the results
Estimation problem: estimation of the average value
the average of the samples is itself a r.v., with:
E[x̄] = η ;  σ²(x̄) = σ²/n
for large n, the average of the samples is a normal variable, and then the variable:
z = (x̄ − η) / (σ/√n)
can be assumed normal with zero average and unit variance, based on the central limit theorem
Analysis and validation of the results
Estimation problem: estimation of the average value
the normal distribution Φ(z) is tabulated
u_(1−α/2) is the value such that:
Φ(u_(1−α/2)) = 1 − α/2
we have:
P(−u_(1−α/2) ≤ z ≤ u_(1−α/2)) = 1 − α
P(−u_(1−α/2) ≤ (x̄ − η)/(σ/√n) ≤ u_(1−α/2)) = 1 − α
(Figure: CDF Φ(z) with the percentile u_(1−α/2), and pdf f(z) with the interval [−u_(1−α/2), u_(1−α/2)])
Analysis and validation of the results
Estimation problem: estimation of the average value
and therefore:
P(x̄ − u_(1−α/2)·σ/√n ≤ η ≤ x̄ + u_(1−α/2)·σ/√n) = 1 − α
the constant (1 − α) is usually expressed as a percentage and is called the confidence level
the interval
[x̄ − u_(1−α/2)·σ/√n , x̄ + u_(1−α/2)·σ/√n]
is called the confidence interval
Analysis and validation of the results
Estimation problem: estimation of the average value
commonly we adopt a confidence level of 95%, for which we have:
α = 0.05
u_(1−α/2) = 1.96
this means that η falls in the interval:
[x̄ − 1.96·σ/√n , x̄ + 1.96·σ/√n]
with a probability of 95%
Analysis and validation of the results
Estimation problem: estimation of average value
Unfortunately, the variance σ² is not known
σ² must be replaced by the sample variance, defined as:

s² = (1/(n−1)) · Σ_{i=1}^{n} (x_i − x̄)²

In this way, however, the variable:

t = (x̄ − η) / (s/√n)

is no longer normal, but has a t-student distribution with n−1 degrees of freedom
Analysis and validation of the results
in cases with large n (>30) it is possible to
approximate the t-student with the normal
distribution
but for smaller values of n it is necessary to use
t-student distribution with the corresponding
number of degrees of freedom
Note: for Monte Carlo simulations the values of n>30 are quite
common, while for temporal simulations, n is usually smaller
Estimation problem: estimation of average value
Analysis and validation of the results
Table of t-student values
Warning: β = 1 − α/2,  k = n − 1
Analysis and validation of the results
Generation of t-student values
#include <math.h>   /* tan, sqrt, log, exp, HUGE_VAL */

// t-distribution: given p-value and degrees of freedom,
// return t-value; adapted from Peizer & Pratt JASA, vol63, p1416
double tval(double p, int df)
{
    double t;
    int positive = p >= 0.5;
    p = (positive) ? 1.0 - p : p;
    if (p <= 0.0 || df <= 0)
        t = HUGE_VAL;
    else if (p == 0.5)
        t = 0.0;
    else if (df == 1)
        t = 1.0 / tan((p + p) * 1.57079633);
    else if (df == 2)
        t = sqrt(1.0 / ((p + p) * (1.0 - p)) - 2.0);
    else {
        double ddf = df;
        double a = sqrt(log(1.0 / (p * p)));
        double aa = a * a;
        a = a - ((2.515517 + 0.802853 * a + 0.010328 * aa) /
                 (1.0 + 1.432788 * a + 0.189269 * aa + 0.001308 * aa * a));
        t = ddf - 0.666666667 + 1.0 / (10.0 * ddf);
        t = sqrt(ddf * (exp(a * a * (ddf - 0.833333333) / (t * t)) - 1.0));
    }
    return (positive) ? t : -t;
}
Analysis and validation of the results
Estimation problem: operations on confidence intervals
Let’s denote the confidence intervals of two variables x and y as:

P(x_l ≤ x ≤ x_u) ≥ 1 − α_x  ;  P(y_l ≤ y ≤ y_u) ≥ 1 − α_y

it can be proven that:

P( A·x_l + B ≤ A·x + B ≤ A·x_u + B ) ≥ 1 − α_x

P( x_l + y_l ≤ x + y ≤ x_u + y_u ) ≥ 1 − α_x − α_y
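These two rules can be applied mechanically; a minimal sketch with hypothetical names (note the linear rule as coded assumes A ≥ 0, otherwise the endpoints swap):

```c
/* A confidence interval as a pair of endpoints. */
typedef struct { double lo, hi; } interval;

/* Sum rule: [xl + yl, xu + yu] covers x + y with
   confidence at least 1 - ax - ay. */
interval interval_sum(interval x, interval y) {
    interval r = { x.lo + y.lo, x.hi + y.hi };
    return r;
}

/* Linear-transformation rule for A >= 0: [A*xl + B, A*xu + B]
   covers A*x + B with the same confidence as [xl, xu]. */
interval interval_affine(interval x, double A, double B) {
    interval r = { A * x.lo + B, A * x.hi + B };
    return r;
}
```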
Analysis and validation of the results
Estimation problem: variance estimation
a direct method for estimating the variance uses the expression:

σ²(x) = E[x²] − E²[x]

given the populations [x₁, x₂, ..., xₙ] and [(x₁)², (x₂)², ..., (xₙ)²] it is possible to estimate the confidence intervals of the averages of x and x²
the two intervals can then be combined using the previous expressions
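A sketch of the direct estimator (our naming); the confidence analysis is then carried out on the x and x² populations as described:

```c
#include <math.h>

/* Direct variance estimate via sigma^2(x) = E[x^2] - E^2[x],
   with the expectations replaced by sample averages. */
double variance_direct(const double *x, int n) {
    double m1 = 0.0, m2 = 0.0;
    for (int i = 0; i < n; i++) {
        m1 += x[i];          /* accumulates the x population   */
        m2 += x[i] * x[i];   /* accumulates the x^2 population */
    }
    m1 /= n;
    m2 /= n;
    return m2 - m1 * m1;
}
```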
Analysis and validation of the results
Estimation problem:
The results seen so far are based on the
fundamental assumption that:
– the observed variables are stationary
– the measurements are not affected by the initial state
– the observations are independent
the hypothesis of independence is the most difficult to satisfy and verify in practical cases
the independence of the observations depends on the correlation characteristics of the observed variables, which are not known
Analysis and validation of the results
Estimation problem: correlated observations
The estimator of the average continues to be an unbiased estimator:

E[x̄] = η

but its variance is now equal to:

σ²(x̄) = (σ²/n) · [ 1 + 2 · Σ_{k=1}^{n−1} (1 − k/n) · ρ_k ]

where the correlation coefficient ρ_k is:

ρ_k = E[ (x_i − η)(x_{i+k} − η) ] / σ²
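For reference, the lag-k coefficient ρ_k can be estimated from the samples themselves (a sketch with our naming; as noted next, doing this systematically is too expensive in practice):

```c
#include <math.h>

/* Sample estimate of rho_k = E[(x_i - eta)(x_{i+k} - eta)] / sigma^2,
   with eta and sigma^2 replaced by their sample counterparts. */
double rho_hat(const double *x, int n, int k) {
    double mean = 0.0, var = 0.0, cov = 0.0;
    int i;
    for (i = 0; i < n; i++) mean += x[i];
    mean /= n;
    for (i = 0; i < n; i++) var += (x[i] - mean) * (x[i] - mean);
    var /= n;
    for (i = 0; i < n - k; i++) cov += (x[i] - mean) * (x[i + k] - mean);
    cov /= (n - k);
    return cov / var;
}
```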
Analysis and validation of the results
Estimation problem: correlated observations
The estimation of the confidence interval thus requires knowledge of the autocorrelation function of the process, which is generally not known
We could use autocorrelation estimators, but the complexity and computational load would become excessive
In practice we use two different approaches to
build independent sequences
Analysis and validation of the results
Estimation problem: correlated observations
1) repeated tests
– N independent observations of the process are built by repeating the simulation N times with N different random number generators
– the N values estimated, one per simulation, are used as independent samples for the evaluation of the confidence
this approach is, in fact, a generalization of the Monte Carlo simulation
It is useful in many practical situations, but in practice it is only used when the second method cannot be applied
Analysis and validation of the results
Estimation problem: correlated observations
2) subdivision of the observation interval into runs
– the simulation is divided into N blocks, each consisting of K observations
– the average of the output variable is evaluated in each block
– it can be shown that, for sufficiently large K, the block averages are independent
– the confidence interval is estimated on the basis of the estimates obtained in each run
This approach is approximate
sometimes it may not be easy to ensure that the number K of observations is the same for each run
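The run-subdivision (batch means) method can be sketched as follows (our naming); each block average is then treated as one independent sample:

```c
/* Batch means: split the first nblocks*K observations of x into
   nblocks runs of K observations each, and return in block_means[b]
   the average of block b. For sufficiently large K these averages
   can be treated as independent samples. */
void batch_means(const double *x, int K, int nblocks, double *block_means) {
    for (int b = 0; b < nblocks; b++) {
        double s = 0.0;
        for (int i = 0; i < K; i++)
            s += x[b * K + i];
        block_means[b] = s / K;
    }
}
```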
Analysis and validation of the results
Estimation problem: correlated observations
Example:
consider an mD/D/1 queue
m flows with deterministic inter-arrival time T are offered to a server
the service time is also deterministic and equal to S
the relative phases of the flows are random (uniform between 0 and T)
It can be shown that:
– the delays are periodic and depend only on the initial phases
We must repeat the experiment a sufficiently large number N of times, with random phases, to obtain a valid estimate of the average delay
Analysis and validation of the results
Estimation problem: correlated observations
in some cases the measurement process is a renewal process, and we can exploit its renewal structure to obtain independent observations
a renewal process is characterized by a series of renewal instants [b1, b2, b3, ...]
at these instants, the process returns to the same state
the evolution of the process in the intervals [b_{n-1}, b_n] is independent from interval to interval
measurements taken on the process in distinct intervals are therefore independent, and the formulas for the estimation of confidence can be applied
Analysis and validation of the results
Estimation problem: correlated observations
Example 1:
– it is easy to convince yourself that, for queuing systems with general arrivals and general services, the instant when a new request arrives at an empty queue is a renewal instant of the entire system. Indeed:
– the system state is the same
– a new inter-arrival period has just started, and therefore there is no memory
– a new service period has just started, and therefore there is no memory
– the system is empty, and therefore there is no memory of waiting users
Analysis and validation of the results
Estimation problem: correlated observations
Example 2:
– consider an M/G/1 queue system
– conduct a simulation to measure the delay through the system
– take as samples the delays experienced by each user
– it is easy to convince yourself that these samples are correlated
– Indeed, for example, if the first arrival finds the system empty, the immediately following arrivals observe low delays, while consecutive arrivals when the system is heavily loaded observe high delays
– dividing the simulation into runs of the same time length, you do not control the number Kᵢ of samples in each run, and long runs are needed to obtain a low dispersion of the Kᵢ
Analysis and validation of the results
Estimation problem:
even assuming the independence problem solved, the problem of stationarity remains
even if the process is stationary, we are forced to start the simulation from some initial state
the initial state influences the statistics gathered in the first part of the simulation, until the system reaches stationary behavior
the simplest approach is to eliminate from the statistics the results collected during the initial interval
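Discarding the initial interval amounts to a one-line change in the estimator (a sketch, our naming; choosing the warm-up length is the hard part):

```c
/* Steady-state mean estimate: ignore the first 'warmup' observations,
   which are biased by the initial state, and average the rest. */
double steady_state_mean(const double *x, int n, int warmup) {
    double s = 0.0;
    for (int i = warmup; i < n; i++)
        s += x[i];
    return s / (n - warmup);
}
```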
Analysis and validation of the results
Estimation problem:
the problem is deciding how long the period during which no statistics shall be collected must be
Unfortunately there are no precise rules for deciding when to start collecting data
theoretically, one should estimate the autocorrelation of the process being measured and treat as transient the period of time until the autocorrelation becomes negligible
in practice the autocorrelation estimate is itself a complex operation, and therefore "empirical" methods are used
Analysis and validation of the results
Estimation problem:
Of course, if the regeneration points of the process under measurement are known, the transient problem is solved
– just discard the data collected before the first regeneration point
– or start the simulation from the state of the regeneration points and keep all the statistical data
otherwise you need to have an idea of the time constants involved in determining the state of the process
you can then proceed by trial and error
Analysis and validation of the results
Estimation problem:
Example:
in queuing systems, the time needed to stabilize the system depends on the load
as ρ tends to 1, the system takes a longer time to reach a stable state
in a sense, ρ near 1 means an unstable system
Problem: how do you know whether the system is stable or not, if this cannot be inferred from the input parameters?
Further reading
Donald E. Knuth, “The Art of Computer Programming”, Second Edition, Addison-Wesley Publishing Company, Reading MA, 1981 (in particular, Volume 2: “Seminumerical Algorithms”)