Introduction to Simulation of
Communication Networks
5hr-Seminar in the course
«Network Design and Planning»
ECS289i
Massimo Tornatore
(Courtesy of Prof. Fabio Martignon)
Politecnico di Milano
Department of Electronics, Information and
Bioengineering (DEIB)
Introduction to Simulation
2
Summary
What is simulation?
Systems, models and variables
Discrete-event simulation
Generation of pseudo-random numbers
– Synthesis of random variables
Statistical Analysis
– Statistical confidence of simulative results
Next Lecture: example of C/C++ simulator
Introduction to Simulation
3
Introduction to simulation
Since we are going to study queuing theory, why do we also need simulation?
Isn't queuing theory sufficient to determine the performance of telecommunication networks?
No. In fact:
Queuing theory can describe, and give results for, only a small set of very simplified models and systems
How can we study a complex system?
– Queues with non-Poisson arrivals: bursty arrivals, batch arrivals, back-off of rejected requests, etc.
– Queues with complex queue-management mechanisms (PQ, WFQ, RED, etc.)
– Queue networks that do not meet Jackson's assumptions (steady state)
– Analysis of the transient behavior of queuing systems
Introduction to simulation
In addition, there are network systems that cannot be easily described by queuing models, e.g.:
– access interfaces of wireless systems (LTE, WLAN, etc.) with errors due to channel characteristics and interference
– dynamic routing mechanisms for IP or optical networks
– congestion control mechanisms (e.g., TCP)
– channel access control mechanisms (e.g., CSMA/CA)
– complex retransmission protocols, for example protocols with piggybacking, selective reject, etc.
When you start from a real system, it is quite difficult to find a model solvable with queuing theory alone
Introduction to simulation
What is SIMULATION?
Simulation seeks to build an experimental device that behaves like the real system under study in some important respects
Examples:
– scale models of airplanes, cars or trains used in wind tunnels
– SimCity, Railroad Tycoon, and other videogames based on reproducing how a system works
– flight simulators for pilot training
Introduction to simulation
Other:
– predicting the development of ecosystems after
artificial alteration
– verification of stock-exchange tactics
– weather forecasts
– verification of battle tactics
– etc.
Models and Systems
System – a very general concept that can be defined informally as a collection of parts, called components, that interact with each other
Model – a representation of a system. This representation can take many forms (e.g., a physical replica), but here we focus on representation by means of mathematical or software/simulative models
State of a system and level of abstraction
State – the system state describes the current state of all its components
– to each system state there corresponds a state of the system model, and the model represents the evolution of the system through the history of its state changes
The level of abstraction of a model indicates that some features of the system state are omitted
– the level of abstraction is tightly related to the measures the model is aimed at
– the best model is simply the easiest model by which you get the measures (performance) you want
Models and Systems
Variables
– the activities of the model are described as relationships or functions between variables
– a mathematical model is described using variables; the same holds for a simulative model!
– State variables
– state variables define the state of the model
– their evolution defines the evolution of the system
– Input variables
– input variables describe external stimuli on the system under consideration
Models and Systems
– Output variables
– are a function of state and input variables
– they represent, therefore, the probes inserted in the model for the measurements
– solving the model means obtaining the values of the output variables
Solution
– the analytical solution of a model involves, e.g., mathematical methods for solving the equations that describe the relationships between variables
– the simulative solution of a model reproduces the evolution of the system by evolving the state variables and directly measuring the output variables
Simulation vs. Theoretical Models
Simulation Properties:
– simulation is “descriptive” and not “prescriptive”
– simulation provides information on the behavior of the system, given the parameters
– simulation DOES NOT tell you how to set the parameters for the best system behavior, or what the limits of the system are
– Example: M/M/1
– from the analytical model I immediately see the capacity limit
– if I simulate the system, I have to run a lot of simulations, increasing the load, until I find out what the limit is
Deterministic vs. Stochastic Simulations
There are many ways to classify simulations, e.g.:
– deterministic vs. stochastic
– continuous time vs. discrete time
Deterministic vs. stochastic simulations:
– deterministic simulations are completely defined by the model, and their evolution is deterministically associated with the input parameters
– stochastic simulations are based on models that include random variables or processes, and so they require the generation of random variables; the evolution of the model depends on both the input parameters and the generated random variables
Deterministic vs. Stochastic Simulations (2)
Examples
– Deterministic simulation:
– consider the motion of billiard balls on a pool table. Given the position of the balls and the direction and strength of the cue's impact, we can simulate the outcome of the shot (without explicitly solving an analytical model)
– Stochastic simulation:
– consider a GSM cell with N channels (or a phone concentrator) to which connection requests arrive according to a Poisson process with rate λ
– we want to determine the probability of rejection, knowing that with probability p rejected calls retry to connect after a time equal to T
Static and Dynamic Simulations (1)
Stochastic simulations can be classified as static or dynamic
Static simulations
– also called Monte Carlo simulations
– the time variable plays no role
– the basic objective is to determine some statistical characteristics of one or more random variables
– in fact, Monte Carlo simulations typically evaluate statistical measures through independently repeated experiments
Static and Dynamic Simulations (2)
Dynamic simulations
– also called temporal simulations
– time becomes the main variable, tied to the evolution of the model
– the purpose is to collect statistics on random processes observed at different times
Static and Dynamic Simulations (3)
A simple Monte Carlo example (it can also be solved analytically):
– consider a slotted random multiple-access system with 10 users of type A, 10 users of type B and 13 users of type C
– users of type A, B and C have a packet ready for transmission in each slot with probability 3p, 2p and p, respectively
– find the throughput of the system given p
– determine the value of p that maximizes the throughput
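This first Monte Carlo example can be sketched directly. The code below is an illustrative Python sketch (the function name and the success rule — a slot contributes to the throughput iff exactly one user transmits, as in slotted ALOHA — are our assumptions, since the slide does not specify the collision model):

```python
import random

def slotted_throughput(p, n_slots=100000, seed=1):
    """Monte Carlo estimate of the per-slot success probability of the
    slotted random-access system above: 10 users transmit with prob. 3p,
    10 with prob. 2p, 13 with prob. p; a slot is a success iff exactly
    one user transmits (assumed collision model)."""
    probs = [3 * p] * 10 + [2 * p] * 10 + [p] * 13
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_slots):
        transmitters = sum(1 for q in probs if rng.random() < q)
        if transmitters == 1:
            successes += 1
    return successes / n_slots
```

Maximizing over p can then be done by simply sweeping p over a grid and keeping the best estimate, exactly as the slide on simulation being "descriptive, not prescriptive" warns.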
Static and Dynamic Simulations (4)
Another (more complex) Monte Carlo simulation example:
– consider a cellular packet system in which, in each time interval, a packet is transmitted by a mobile user placed in a random position with probability G
– the attenuation of the channel is a function of distance and of a random factor (fading), and the transmitted power is fixed. The packet is received correctly only if the signal-to-interference ratio is greater than 6 dB
– determine the probability of a successful transmission, and the value of G that maximizes the throughput
Static and Dynamic Simulations (5)
Example of dynamic simulation (1):
– consider a queuing system with one server and a queue of up to K packets. Interarrival times are uniformly distributed between a and b, and each arrival is composed of a number of users x, where x is a binomial random variable with parameter p. Determine the rejection probability and the average crossing delay
Static and Dynamic Simulations (6)
Example of dynamic simulation (2):
– consider a slotted polling system with N queues. The packet length x is randomly distributed according to a certain known pdf. Interarrival times are distributed according to a Poisson process with rate λ. Assuming that:
– the server adopts a priority-queuing policy (i.e., it continues to serve a queue until it is empty)
– the server chooses the next queue according to a «longest-queue» policy
– determine the average transfer time through the system
Classification of Dynamic Simulations
Discrete-event dynamic simulation
– the system state changes in response to «events»
– e.g., network simulation (OPNET, NS-2)
Continuous-time dynamic simulation
– the system state evolves in response to the change of a continuous-time variable
– e.g., weather forecasts
NB: simulated time vs. simulation time!
Simulation of discrete events
Simulation of discrete events is of fundamental importance for telecommunication networks
In discrete-event simulation the state variables change value only at discrete instants of time
A change of the system state is called an event and is characterized by an instant of occurrence
– an event has no duration
After an event occurs, an activity starts in the system that persists for some time
– an activity is usually characterized by a start event and an end event
– for example, the beginning and the end of the transmission of a packet are events, while the transmission itself is an activity
Simulation of discrete events
In discrete-event simulation we should:
– define the types of events that can occur
– define the changes in the system state associated with each event
– define a time variable and an ordering of the events in a calendar, based on the instant of event occurrence
– define an initial state
– scroll through the calendar and, each time an event occurs, change the state variables according to that event
– take measurements on the output variables
Example: Simulation of a queuing system
Model:
– queuing system with one server and an infinite queue
– Input variables:
– interarrival times of requests (packets)
– service times of requests
– State variables:
– number of requests in the system
– Initial state:
– e.g., no user in the system
– Output variables:
– average time spent by a packet in the system
Example: Simulation of a queuing system
Events
– 0. Initialization («init»)
– 1. First arrival
– set service start
– schedule service end
– 2. From the second arrival on, we should act differently on the basis of the state:
– Arrival
» in an empty system → immediate service start and scheduling of the service end
» in a non-empty system → add a packet to the queue (its service end? We cannot schedule it yet — we cannot read the future of the calendar…)
– Service end (see next slides)
» with empty queue → hold until the next event
» with non-empty queue → set a new service start and schedule the new service end
Example: simulation of a queuing system
Filling the calendar of events
– PROBLEM: it is not possible to place the service-end event of each queued request, as we do not know the service duration of the requests queued in front of it
– SOLUTION: the calendar can be filled with new events while other events are pending
– example: a new packet is queued even if the service-end events of the packets before it in the queue are not yet known
Example: a queuing system simulation
In summary:
when we have an «arrival» event, we increase the number of users, then:
– if the system was empty, a new service-end event is inserted in the calendar at a time equal to «CLOCK + service time»
– if the system is busy, we add a packet to the queue
when a «service-end» event is reached, we decrease the number of users, then:
– if the queue is empty, no action
– if the queue is not empty, a new service-end event is inserted in the calendar at a time equal to «CLOCK + service time»
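The arrival/service-end rules above can be sketched as a small event-driven simulator. The following is an illustrative Python sketch (the course simulator will be in C/C++; all names here are ours), for the special case of exponential interarrival and service times, with the calendar kept as a heap:

```python
import heapq
import random

def simulate_mm1(lam, mu, num_arrivals, seed=1):
    """Event-driven simulation of an M/M/1 queue.
    Returns the average time spent in the system by the served packets."""
    rng = random.Random(seed)
    calendar = []            # ordered event list: (time, kind)
    n_in_system = 0          # state variable
    arrival_times = []       # FIFO of arrival instants, for delay measurement
    delays = []              # output variable
    # init: schedule the first arrival
    heapq.heappush(calendar, (rng.expovariate(lam), "arrival"))
    arrivals_left = num_arrivals - 1
    while calendar:
        clock, kind = heapq.heappop(calendar)
        if kind == "arrival":
            n_in_system += 1
            arrival_times.append(clock)
            if n_in_system == 1:   # system was empty: service starts now
                heapq.heappush(calendar, (clock + rng.expovariate(mu), "end"))
            if arrivals_left > 0:  # schedule the next arrival
                arrivals_left -= 1
                heapq.heappush(calendar, (clock + rng.expovariate(lam), "arrival"))
        else:                      # service end
            n_in_system -= 1
            delays.append(clock - arrival_times.pop(0))
            if n_in_system > 0:    # queue not empty: next service starts
                heapq.heappush(calendar, (clock + rng.expovariate(mu), "end"))
    return sum(delays) / len(delays)
```

With λ = 0.5 and μ = 1 the estimate approaches the theoretical M/M/1 value E[T] = 1/(μ − λ) = 2 as the number of simulated arrivals grows.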
Example: a queuing system simulation
Measurement of output variables
1. Transfer time
– user arrival: store the time of arrival
– end of service: compute the transfer time (CLOCK − t_arrival)
2. Average number of users in the queue
– weighted average of the users in the system during each activity interval («time slices»)
Simulation of discrete events
Note: the correct insertion of a new event in the calendar is a critical operation if the calendar has many events
– efficient techniques must be used for the insertion of an element into an ordered list
The CLOCK variable
Advancing the «CLOCK» variable
– «clock driven» simulations
– the CLOCK variable is always increased by a fixed step
» e.g., slotted systems
– «event driven» simulations
– the CLOCK variable is increased by the time interval between the occurrence of an event and the occurrence of the following event
– Notes:
– from a computational-time point of view it may be convenient to adopt one method rather than the other, depending on the model
Some final considerations…
Considering the power and efficiency of modern programming languages and computing systems, simulation is today a powerful analysis tool for addressing complex problems
But simulation is also a tool that should be used with care, for the following reasons:
– it is not easy to validate the obtained results
– the computational time can easily become very high
– it is not easy to understand how different parameters affect the result
Simulation of discrete events
The simulation of a stochastic model involves the use of random input variables
→ we need the statistical distributions of the input variables
So, for computer simulation, pseudo-random number generation and the synthesis of statistical variables are needed (the next topic…)
– example: the traffic entering a queuing system, described by the arrival process and the service-time process
The Role of Random Numbers
When the model to be analyzed via simulation is stochastic, two important problems arise:
– the generation of pseudo-random numbers to be used for the generation of the input variables
– the statistical analysis of the results obtained through the output variables
What I assume you know
– basics of statistics
– average (mean), variance
– concept of random variable
– probability density function (pdf) f(x)
– cumulative distribution function (CDF) F(x)
– central limit theorem
– Gaussian (normal) and Student's t distributions
Generation of pseudo-random numbers
Rigorously speaking, numbers generated by a computer cannot be random, due to the deterministic nature of the computer
We can, however, generate pseudo-random sequences that pass a series of statistical tests of randomness
Generation of pseudo-random numbers
The problem of generating pseudo-random numbers can be logically divided in two parts:
– generation of sequences of random numbers uniformly distributed between 0 and 1
– generation of sequences of random numbers with an arbitrary distribution
– Poisson, Bernoulli, Weibull, exponential, etc.
Generation of pseudo-random numbers
Pseudo-random sequences are obtained through the implementation of recursive formulas
Some history
The first method to generate random sequences was Von Neumann's «middle-square» method:
the next number is obtained by squaring the previous number and taking its central digits
Von Neumann's example:
x0 = 3456
which, squared, gives
(x0)² = 11943936
so
x1 = 9439
This method was abandoned: difficult to analyze, relatively slow and statistically unsatisfactory
Generation of pseudo-random numbers
Factors determining the quality of a method:
1) the numbers must be statistically independent
2) we must be able to reproduce the sequence
3) the numbers must be uniformly distributed (i.e., they must all have the same probability of occurring)
4) the sequence must be of arbitrary length
5) the method should execute quickly on a computer and consume a small amount of memory
Generation of pseudo-random numbers
Let's recall some basic math operators
Modulo:
x mod y = x − y·⌊x/y⌋,  y ≠ 0
Congruence:
x ≡ y (mod z)
– property linking the two: x ≡ y (mod z) ⟺ x mod z = y mod z
Generation of pseudo-random numbers
Linear Congruential Method (Lehmer, 1948)
The sequence {Xn}, n ∈ N: X0, X1, X2, ..., Xi, ... is defined by:
X(n+1) = (a·Xn + c) mod m
where:
X0 = initial value, or seed
a = multiplier
c = increment
m = modulus
NB: the method is called:
– multiplicative if c = 0
– mixed if c ≠ 0
Generation of pseudo-random numbers
Example:
X0 = a = c = 7
m = 10
{Xn} = 7, 6, 9, 0, 7, 6, 9, ...
Note: 0 ≤ Xi < m, for all i
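A minimal sketch of the generator (illustrative Python; the seed X0 = 7 is the first element of the slide's sequence, and the generator below yields the values that follow it):

```python
def lcg(x0, a, c, m):
    """Mixed linear congruential generator: X_{n+1} = (a*X_n + c) mod m."""
    x = x0
    while True:
        x = (a * x + c) % m
        yield x

# the slide's example: X0 = a = c = 7, m = 10
g = lcg(7, 7, 7, 10)
first_six = [next(g) for _ in range(6)]   # 6, 9, 0, 7, 6, 9
```

Note how the period of 4 (7, 6, 9, 0, then back to 7) shows up immediately — the next slide discusses exactly this drawback.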
Generation of pseudo-random numbers
Drawbacks of linear congruence
As soon as Xp = X0, the sequence repeats periodically; p is the period of the sequence
Since p ≤ m, the period is at most m
Note (1): if p = m, then every number between 0 and m − 1 appears exactly once per period
Note (2): to obtain a sequence in [0,1), divide by m:
{Rn} = X0/m, X1/m, X2/m, ..., Xi/m, ...
Generation of pseudo-random numbers
We can relate Xn directly to X0 — this emphasizes even more the deterministic nature of the sequence!
X1 = (a·X0 + c) mod m
X2 = (a·X1 + c) mod m = (a²·X0 + c(a + 1)) mod m
X3 = (a³·X0 + c(a² + a + 1)) mod m
...
Xn = (aⁿ·X0 + c(aⁿ − 1)/(a − 1)) mod m
Generation of pseudo-random numbers
How to choose the multiplier a and the increment c:
a and c strongly influence the period and the statistical properties of the sequence
there are rules for choosing a and c that yield period p = m (full period)
Criteria ensuring a full period:
1. c and m must be co-prime, i.e.: gcd(c, m) = 1
2. every prime divisor of m must divide (a − 1)
– ex.: if m = 10, its prime factors are 2 and 5, so (a − 1) must be a multiple of 2 and 5
3. if m is a multiple of 4, (a − 1) must be as well
Generation of pseudo-random numbers
It is not easy to find values that satisfy (1), (2), (3)
– ex.: m = 10, a = 21, c = 3 (Xn = 3, 6, 9, 2, 5, 8, 1, 4, 7, 0, ...)
Some researchers have therefore identified the following values in accordance with these criteria:
KNUTH: m = 2^31; a = int(π · 10^8); c = 453806245
GOODMAN/MILLER: m = 2^31 − 1; a = 7^5; c = 0
GORDON: m = 2^31; a = 5^13; c = 0
LEARMONTH/LEWIS: m = 2^31; a = 2^16 + 3; c = 0
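The GOODMAN/MILLER row is the well-known «minimal standard» Lehmer generator. Below is a quick illustrative sketch, together with the classic sanity check published by Park and Miller: starting from seed 1, the 10,000th value must be 1043618065.

```python
def minstd(seed, n):
    """Multiplicative (c = 0) Lehmer generator with m = 2**31 - 1 and
    a = 7**5 = 16807: the 'minimal standard' of Park and Miller."""
    m, a = 2**31 - 1, 16807
    x = seed
    out = []
    for _ in range(n):
        x = (a * x) % m
        out.append(x)
    return out

seq = minstd(1, 10000)
# seq[0] == 16807; seq[-1] == 1043618065 (Park & Miller's check value)
```

Reproducibility from the seed — requirement (2) of the quality factors above — is exactly what makes such a deterministic check possible.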
Generation of pseudo-random numbers
A simpler condition:
– if the method is multiplicative (c = 0), one can show that if m = 2^b then the maximum period is p = 2^(b−2), provided b ≥ 4
Note: the equivalence of the multiplicative and mixed approaches has been proven
Generation of pseudo-random numbers
Notes on the choice of the modulus m
m influences the period, because p ≤ m
m also affects the computational speed:
– to compute a modulo we should, in general, perform a product, a sum and a division
– it is possible to do everything at once if you choose as modulus the maximum integer representable by the computer, plus one
– with this modulus, the operation corresponds to a truncation
– if b is the number of bits used by the computer, you will choose m = 2^b
Generation of pseudo-random numbers
Other methods:
congruential square method:
– based on the generation of numbers congruent modulo m according to the relation:
X(n+1) = (d·Xn² + a·Xn + c) mod m
Fibonacci, or additive, method:
X(n+1) = (Xn + X(n−k)) mod m
Generation of pseudo-random numbers
Notes on tests for generators
Tests on pseudo-random number generators are applied to verify that:
– the generated numbers are uniformly distributed
– the generated numbers are independent
However, these concepts are defined only for random variables, and must be translated into tests run on a finite set of samples
– generally, we consider a hypothesis verified only if the set of samples satisfies a certain number of tests
Generation of pseudo-random numbers
Notes on tests for generators: the χ² test
A typical test for verifying the distribution is the χ² test
We divide the set of possible values into k categories
Ŷi is the number of sample values falling in the i-th category and Yi = n·pi is the expected value, where n is the number of samples and pi is the probability of category i (pi = 1/k in the uniform case)
A quality index can be defined as:
V = (Ŷ1 − Y1)²/Y1 + (Ŷ2 − Y2)²/Y2 + ... + (Ŷk − Yk)²/Yk
Generation of pseudo-random numbers
Notes on tests for generators: the χ² test
The problem is that the value of V is itself a random variable, which also depends on the absolute values
It is therefore necessary to repeat the test several times on different samples and evaluate the probability that V takes high values
It can be proven that V has a χ² distribution with n = k − 1 degrees of freedom:
f_n(x) = x^(n/2 − 1) · e^(−x/2) / (2^(n/2) · (n/2 − 1)!),  x ≥ 0,  n/2 integer
(for non-integer n/2, the factorial (n/2 − 1)! is replaced by Γ(n/2))
Generation of pseudo-random numbers
Notes on tests for generators: the χ² test
If Px indicates the x-th percentile of the χ² distribution, we can rank the observations of V according to the table:
P0–P1, P99–P100: reject
P1–P5, P95–P99: suspicious
P5–P10, P90–P95: almost suspicious
Generation of pseudo-random numbers
Notes on tests for generators: the gap test
There are several tests to verify the independence of the samples
A frequently used one is the «gap» test:
– we define an event on the observed distribution, such as exceeding a certain threshold
– we estimate the probability p associated with the event
– from the sequence of samples we derive a (0,1) sequence that indicates whether the event occurred or not
Generation of pseudo-random numbers
Notes on tests for generators: the gap test
– we then consider the lengths of the runs of 0s and of the runs of 1s
– since the distribution of these lengths is geometric, we verify the conformity to that distribution using a test (e.g., the χ²)
– or, more simply, we estimate the average run length and compare it with the value predicted from 1 − p and p, respectively
Example of a (0,1) sequence: 000111110010010010010011100001001
Generation of other distributions
We now have a sequence of pseudo-random numbers:
– uniformly distributed between 0 and 1
– satisfying the tests of randomness
Next step:
we use them to obtain samples of variables distributed according to the distribution we need (exponential, Poisson, geometric, etc.)
Generation of other distributions
«Inverse transform» method
Given:
– r: a variable uniformly distributed in [0,1]
– that is, f(r) = 1 and F(r) = r
To obtain a random variable x with a given pdf f(x), we:
– determine F(x), with 0 ≤ F(x) ≤ 1
– generate random samples r
– set r = F(x)
– compute the inverse function F⁻¹(.)
– obtain x = F⁻¹(r)
It can be proven, but we skip the proof!
Generation of other distributions
Inverse transform example (1):
– we want to get x such that f(x) = 1/(b − a)
– from uniform [0,1] to uniform [a,b]
– F(x) = (x − a)/(b − a), with 0 ≤ F(x) ≤ 1
– r = F(x) = (x − a)/(b − a)
– x = r(b − a) + a
Generation of other distributions
Inverse transform example (2):
– we want to get x such that f(x) = λe^(−λx)
– from uniform [0,1] to negative exponential with rate λ (average 1/λ)
– F(x) = 1 − e^(−λx)
– r = F(x) = 1 − e^(−λx)
– x = −ln(1 − r)/λ
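Both inverse-transform examples can be written in a few lines (illustrative Python; the function names are ours):

```python
import math

def uniform_ab(r, a, b):
    """x = F^{-1}(r) = r*(b-a) + a for the uniform [a, b] distribution."""
    return r * (b - a) + a

def exponential(r, lam):
    """x = F^{-1}(r) = -ln(1-r)/lam for the negative exponential with rate lam."""
    return -math.log(1.0 - r) / lam
```

For instance, r = 0.5 maps to the median of the target distribution: the midpoint (a + b)/2 for the uniform case, and ln(2)/λ for the exponential case.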
Generation of other distributions
We can show this more rigorously…
– next 4 slides
Generation of other distributions
Generation of an arbitrary distribution — elements of probability:
The fundamental theorem of functions of random variables:
the p.d.f. f_Y(y) of the r.v. Y = g(X) is given by:
f_Y(y) = Σi f_X(xi) / |g'(xi)|
where the xi are the solutions of the equation y = g(x), and are in turn functions of y: xi = xi(y)
Generation of other distributions
Example:
X is a r.v. with C.D.F. F_X(x). Consider Y = F_X(X).
The fundamental theorem allows us to write:
f_Y(y) = f_X(x1) / F'_X(x1) = 1,  for 0 ≤ y ≤ 1
where x1 is the only solution of the equation y = F_X(x), which exists only if 0 ≤ y ≤ 1.
So Y is uniform in (0,1)!
Generation of other distributions
Synthesis of a r.v. using the percentile method:
it is now easy to see that if you have:
– U, a r.v. uniform in (0,1)
– then, to obtain a r.v. X with CDF equal to F(.), it is enough to set:
X = F⁻¹(U)
Generation of other distributions
This can also be proven as follows.
U is a r.v. uniform in (0,1):
f_U(x) = 1 for 0 ≤ x ≤ 1, and 0 elsewhere
F_U(x) = 0 for x < 0;  x for 0 ≤ x ≤ 1;  1 for x > 1
Set:
X = F⁻¹(U)
It results:
F_X(t) = P(X ≤ t) = P(F⁻¹(U) ≤ t) = P(U ≤ F(t)) = F(t)
Generation of other distributions
The variable U is obtained from the generation of pseudo-random numbers
There remains the problem of finding F⁻¹(.) for the variable we want to synthesize:
– for some processes F⁻¹(.) cannot be obtained in analytical form, and therefore we must resort to other methods
– moreover, for discrete random variables we need to slightly modify the approach
Generation of other distributions
Example: Exponential
– if you want to generate an exponential random variable x with average μ > 0
the pdf is:
f_X(x) = (1/μ)·e^(−x/μ),  x ≥ 0
you have:
F_X(x) = 1 − e^(−x/μ),  x ≥ 0
and therefore:
X = −μ·ln(1 − U), or X = −μ·ln U
(since U and 1 − U have the same uniform distribution)
Generation of other distributions
Example: Rayleigh
– if you want to generate a random variable with Rayleigh pdf
the pdf is:
f_X(x) = (x/σ²)·e^(−x²/2σ²),  x ≥ 0
you have:
F_X(x) = 1 − e^(−x²/2σ²),  x ≥ 0
and therefore:
X = σ√(−2·ln(1 − U)), or X = σ√(−2·ln U)
Generation of other distributions
Example: Gaussian
– if you want to generate a random variable x with Gaussian pdf, with μ = 0 and σ = 1
– to obtain a variable with average μ and variance σ², it is enough to use the transformation z = σx + μ
– the pdf is:
f_X(x) = (1/√(2π))·e^(−x²/2)
– it is well known that the CDF of the Gaussian cannot be expressed in closed form, so it cannot be inverted explicitly
Generation of other distributions
Example: Gaussian
– a first approach is to use an approximation
– the central limit theorem tells us that the sum of N r.v.'s tends to the normal distribution as N increases
– usually for N ≥ 12 we assume we can get a good approximation
– so it is enough to extract 12 uniform variables Ui and set:
X = Σ(i=1..12) Ui − 6
(each Ui has mean 1/2 and variance 1/12, so X has mean 0 and variance 1)
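A sketch of this approximation, with an empirical check of the resulting mean and variance (illustrative Python):

```python
import random

def clt_gaussian(rng):
    """Approximate N(0,1) sample: sum of 12 uniforms minus 6.
    Each uniform has mean 1/2 and variance 1/12, so the sum has
    mean 6 and variance 1; subtracting 6 centers it at 0."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(42)
samples = [clt_gaussian(rng) for _ in range(100000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Note that the approximation cannot produce values outside [−6, 6] and its tails are lighter than the true Gaussian's — one reason the exact approach on the next slides is preferable.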
Generation of other distributions
Example: Gaussian
– a smarter approach gets two independent samples of a normal random variable with only two extractions
– it is based on the observation that a 2-dimensional vector with Gaussian, independent Cartesian components has:
» a modulus with Rayleigh distribution
» a phase uniform in (0, 2π)
Generation of other distributions
Example: Gaussian
– therefore:
– two variables U1 and U2, uniform in (0,1), are extracted
– X and Y are computed as:
X = √(−2·ln U1) · cos(2π·U2)
Y = √(−2·ln U1) · sin(2π·U2)
– X and Y are independent normal random variables
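A sketch of this approach (commonly known as the Box–Muller transform):

```python
import math

def box_muller(u1, u2):
    """Two independent N(0,1) samples from two uniforms in (0,1)."""
    r = math.sqrt(-2.0 * math.log(u1))   # modulus: Rayleigh-distributed
    theta = 2.0 * math.pi * u2           # phase: uniform in (0, 2*pi)
    return r * math.cos(theta), r * math.sin(theta)
```

Compared with the sum-of-12-uniforms approximation, this is exact and costs only two uniform extractions per pair of Gaussian samples.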
Generation of other distributions
Discrete random variables:
consider a discrete random variable described by the probability distribution:
P(X = a_k) = p_k,  k = 1, ..., m
F_X(x) is then a staircase function, with a step of height p_k at each value a_k
Generation of other distributions
Discrete random variables
The staircase CDF, when inverted, gives the relation expressing the variable:
set x = a_k only if:
p_1 + ... + p_(k−1) < u ≤ p_1 + ... + p_(k−1) + p_k
NB: «u» is what we used to call «r», i.e., the pseudo-random number between 0 and 1
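The assignment rule above is a sequential search over the cumulative probabilities; a minimal illustrative sketch:

```python
def discrete_inverse(u, values, probs):
    """Return a_k such that p_1 + ... + p_{k-1} < u <= p_1 + ... + p_k."""
    cum = 0.0
    for a, p in zip(values, probs):
        cum += p
        if u <= cum:
            return a
    return values[-1]   # guard against floating-point round-off in the sum
```

For example, with values (1, 2, 3) and probabilities (0.2, 0.5, 0.3), u = 0.1 maps to 1, u = 0.6 to 2 and u = 0.95 to 3.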
Generation of other distributions
Discrete random variables — example:
generate a random variable X that takes value 1 with probability p and value 0 with probability 1 − p
It is enough to set:
X = 1 if 0 < u ≤ p
X = 0 if p < u ≤ 1
Generation of other distributions
Discrete random variables:
– the approach becomes extremely cumbersome for discrete distributions with infinite m
– we must stop at a finite value m
– m determines the number of comparisons that must be done in the assignment routine, and thus the speed of the routine itself
– in some cases it is possible to adopt some tricks
Generation of other distributions
Example: Geometric distribution
We have:
P(X = k) = p(1 − p)^(k−1),  k = 1, 2, ...
Consider an exponential r.v. Z with average μ; we have:
P(n ≤ Z < n + 1) = e^(−n/μ) − e^(−(n+1)/μ) = e^(−n/μ)·(1 − e^(−1/μ))
this value matches P(X = n + 1) if we require that:
1 − e^(−1/μ) = p,  i.e.,  μ = −1/ln(1 − p)
Generation of other distributions
Example: Geometric distribution
Therefore, to generate a geometric variable it is enough to:
– 1. generate a uniform variable U in (0,1)
– 2. get an exponential variable Z = −μ·ln U = ln U / ln(1 − p)
– 3. set X = 1 + ⌊Z⌋
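Steps 1–3 in code (illustrative Python; here the uniform sample u is passed in as an argument, so the mapping is deterministic):

```python
import math

def geometric(u, p):
    """Geometric sample via an exponential: X = 1 + floor(ln u / ln(1-p)).
    With u uniform in (0,1), P(X = k) = p*(1-p)**(k-1), k = 1, 2, ..."""
    z = math.log(u) / math.log(1.0 - p)   # exponential with mean -1/ln(1-p)
    return 1 + math.floor(z)
```

Sanity check of the construction: P(X = 1) = P(Z < 1) = P(u > 1 − p) = p, as required.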
Generation of other distributions
Example: Poisson distribution
P(X = k) = (a^k / k!)·e^(−a)
With the Poisson distribution things get complicated, and there is no shortcut; we search the CDF sequentially:
– 1. set k := 0, A := e^(−a), p := A
– 2. U := rand(seed)
– 3. while U > p:
» k := k + 1
» A := A·a/k
» p := p + A
– 4. return k
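The algorithm above, transcribed in Python (with the uniform sample passed in as an argument so the result is deterministic):

```python
import math

def poisson(a, u):
    """Poisson sample by sequential search on the CDF, as in steps 1-4:
    A holds P(X = k), p holds the running CDF P(X <= k)."""
    k = 0
    A = math.exp(-a)      # P(X = 0)
    p = A                 # running CDF
    while u > p:
        k += 1
        A = A * a / k     # P(X = k) obtained from P(X = k-1)
        p = p + A
    return k
```

The expected number of loop iterations is roughly a, since the search stops near the mean of the distribution.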
Analysis and validation of the results
Once we have built the simulation model and the software that implements it, we must:
– decide what to measure (which output variables)
– decide the statistical metric (average? variance?)
– note that the output variables are r.v.'s!
– repeat the experiment multiple times!!
– adopt the appropriate estimators for the parameters
– evaluate the accuracy (“confidence”) of the estimates
Analysis and validation of the results
Estimation problem: estimation of the average value
given a population whose distribution is f(x), with average E[x] = η and variance σ²(x) = σ²
[x1, x2, ..., xn] are n independent observations
the average value of the samples is defined by:
x̄ = (1/n) Σ(i=1..n) xi
Analysis and validation of the results
Estimation problem: estimation of the average value
the average of the samples is itself a r.v., with:
E[x̄] = η ;  σ²(x̄) = σ²/n
for large n, the average of the samples is a normal variable, and then the variable:
z = (x̄ − η) / (σ/√n)
can be assumed normal with zero average and unit variance, based on the central limit theorem
Analysis and validation of the results
Estimation problem: estimation of the average value
the normal distribution Φ(z) is tabulated
u_(1−α/2) is the value such that:
Φ(u_(1−α/2)) = 1 − α/2
we have:
P(−u_(1−α/2) ≤ z ≤ u_(1−α/2)) = 1 − α
P(−u_(1−α/2) ≤ (x̄ − η)/(σ/√n) ≤ u_(1−α/2)) = 1 − α
(Figure: CDF Φ(z) with the percentile u_(1−α/2), and pdf f(z) with the interval [−u_(1−α/2), u_(1−α/2)])
Analysis and validation of the results
Estimation problem: estimation of the average value
and therefore:
P(x̄ − u_(1−α/2)·σ/√n ≤ η ≤ x̄ + u_(1−α/2)·σ/√n) = 1 − α
the constant (1 − α) is usually expressed as a percentage and is called the confidence level
the interval
[x̄ − u_(1−α/2)·σ/√n , x̄ + u_(1−α/2)·σ/√n]
is called the confidence interval
Analysis and validation of the results
Estimation problem: estimation of the average value
commonly we adopt a confidence level of 95%, for which we have:
α = 0.05
u_(1−α/2) = 1.96
this means that η falls in the interval:
[x̄ − 1.96·σ/√n , x̄ + 1.96·σ/√n]
with a probability of 95%
Analysis and validation of the results
Estimation problem: estimation of average value
Unfortunately, the variance σ² is not known
σ² must be replaced by the sample variance, defined as:

s² = (1/(n−1)) · Σ_{i=1}^{n} (x_i − x̄)²

In this way, however, the variable:

t = (x̄ − η) / (s/√n)

is no longer normal, but has a t-student distribution with n−1 degrees of freedom
Analysis and validation of the results
in cases with large n (>30) it is possible to
approximate the t-student with the normal
distribution
but for smaller values of n it is necessary to use
t-student distribution with the corresponding
number of degrees of freedom
Note: for Monte Carlo simulations the values of n>30 are quite
common, while for temporal simulations, n is usually smaller
Estimation problem: estimation of average value
Analysis and validation of the results
Table of t-student values
Warning: β = 1 − α/2,  k = n − 1
Analysis and validation of the results
Generation of t-student values
#include <math.h>   /* tan, sqrt, log, exp, HUGE_VAL */

// t-distribution: given p-value and degrees of freedom,
// return t-value; adapted from Peizer & Pratt JASA, vol63, p1416
double tval(double p, int df)
{
    double t;
    int positive = p >= 0.5;
    p = (positive) ? 1.0 - p : p;
    if (p <= 0.0 || df <= 0)
        t = HUGE_VAL;
    else if (p == 0.5)
        t = 0.0;
    else if (df == 1)
        t = 1.0 / tan((p + p) * 1.57079633);
    else if (df == 2)
        t = sqrt(1.0 / ((p + p) * (1.0 - p)) - 2.0);
    else {
        double ddf = df;
        double a = sqrt(log(1.0 / (p * p)));
        double aa = a * a;
        a = a - ((2.515517 + 0.802853 * a + 0.010328 * aa) /
                 (1.0 + 1.432788 * a + 0.189269 * aa + 0.001308 * aa * a));
        t = ddf - 0.666666667 + 1.0 / (10.0 * ddf);
        t = sqrt(ddf * (exp(a * a * (ddf - 0.833333333) / (t * t)) - 1.0));
    }
    return (positive) ? t : -t;
}
Analysis and validation of the results
Estimation problem: operations on confidence intervals
Let’s denote the confidence intervals of two variables x and y as:

P(x_l ≤ x ≤ x_u) ≥ 1 − α_x  ;  P(y_l ≤ y ≤ y_u) ≥ 1 − α_y

it can be proven that:

P( A·x_l + B ≤ A·x + B ≤ A·x_u + B ) ≥ 1 − α_x

P( x_l + y_l ≤ x + y ≤ x_u + y_u ) ≥ 1 − α_x − α_y
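These two rules can be applied mechanically; a minimal sketch with hypothetical names (note the linear rule as coded assumes A ≥ 0, otherwise the endpoints swap):

```c
/* A confidence interval as a pair of endpoints. */
typedef struct { double lo, hi; } interval;

/* Sum rule: [xl + yl, xu + yu] covers x + y with
   confidence at least 1 - ax - ay. */
interval interval_sum(interval x, interval y) {
    interval r = { x.lo + y.lo, x.hi + y.hi };
    return r;
}

/* Linear-transformation rule for A >= 0: [A*xl + B, A*xu + B]
   covers A*x + B with the same confidence as [xl, xu]. */
interval interval_affine(interval x, double A, double B) {
    interval r = { A * x.lo + B, A * x.hi + B };
    return r;
}
```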
Analysis and validation of the results
Estimation problem: variance estimation
a direct method for estimating the variance uses the expression:

σ²(x) = E[x²] − E²[x]

given the populations [x₁, x₂, ..., xₙ] and [(x₁)², (x₂)², ..., (xₙ)²] it is possible to estimate the confidence intervals of the averages of x and x²
the two intervals can then be combined using the previous expressions
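A sketch of the direct estimator (our naming); the confidence analysis is then carried out on the x and x² populations as described:

```c
#include <math.h>

/* Direct variance estimate via sigma^2(x) = E[x^2] - E^2[x],
   with the expectations replaced by sample averages. */
double variance_direct(const double *x, int n) {
    double m1 = 0.0, m2 = 0.0;
    for (int i = 0; i < n; i++) {
        m1 += x[i];          /* accumulates the x population   */
        m2 += x[i] * x[i];   /* accumulates the x^2 population */
    }
    m1 /= n;
    m2 /= n;
    return m2 - m1 * m1;
}
```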
Analysis and validation of the results
Estimation problem:
The results seen so far are based on the
fundamental assumption that:
– the observed variables are stationary
– the measurements are not affected by the initial state
– the observations are independent
the hypothesis of independence is the most difficult to satisfy and verify in practical cases
the independence of the observations depends on the correlation characteristics of the observed variables, which are not known
Analysis and validation of the results
Estimation problem: correlated observations
The estimator of the average continues to be an unbiased estimator:

E[x̄] = η

but its variance is now equal to:

σ²(x̄) = (σ²/n) · [ 1 + 2 · Σ_{k=1}^{n−1} (1 − k/n) · ρ_k ]

where the correlation coefficient ρ_k is:

ρ_k = E[ (x_i − η)(x_{i+k} − η) ] / σ²
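For reference, the lag-k coefficient ρ_k can be estimated from the samples themselves (a sketch with our naming; as noted next, doing this systematically is too expensive in practice):

```c
#include <math.h>

/* Sample estimate of rho_k = E[(x_i - eta)(x_{i+k} - eta)] / sigma^2,
   with eta and sigma^2 replaced by their sample counterparts. */
double rho_hat(const double *x, int n, int k) {
    double mean = 0.0, var = 0.0, cov = 0.0;
    int i;
    for (i = 0; i < n; i++) mean += x[i];
    mean /= n;
    for (i = 0; i < n; i++) var += (x[i] - mean) * (x[i] - mean);
    var /= n;
    for (i = 0; i < n - k; i++) cov += (x[i] - mean) * (x[i + k] - mean);
    cov /= (n - k);
    return cov / var;
}
```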
Analysis and validation of the results
Estimation problem: correlated observations
The estimation of the confidence interval thus requires knowledge of the autocorrelation function of the process, which is generally not known
We could use autocorrelation estimators, but the complexity and computational load would become excessive
In practice we use two different approaches to
build independent sequences
Analysis and validation of the results
Estimation problem: correlated observations
1) repeated tests
– N independent observations of the process are built by repeating the simulation N times with N different random number generators
– the N values estimated, one per simulation, are used as independent samples for the evaluation of the confidence
this approach is, in fact, a generalization of the Monte Carlo simulation
It is useful in many practical situations, but in practice it is only used when the second method cannot be applied
Analysis and validation of the results
Estimation problem: correlated observations
2) subdivision of the observation interval into runs
– the simulation is divided into N blocks, each consisting of K observations
– the average of the output variable is evaluated in each block
– it can be shown that, for sufficiently large K, the block averages are independent
– the confidence interval is estimated on the basis of the estimates obtained in each run
This approach is approximate
sometimes it may not be easy to ensure that the number K of observations is the same for each run
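The run-subdivision (batch means) method can be sketched as follows (our naming); each block average is then treated as one independent sample:

```c
/* Batch means: split the first nblocks*K observations of x into
   nblocks runs of K observations each, and return in block_means[b]
   the average of block b. For sufficiently large K these averages
   can be treated as independent samples. */
void batch_means(const double *x, int K, int nblocks, double *block_means) {
    for (int b = 0; b < nblocks; b++) {
        double s = 0.0;
        for (int i = 0; i < K; i++)
            s += x[b * K + i];
        block_means[b] = s / K;
    }
}
```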
Analysis and validation of the results
Estimation problem: correlated observations
Example:
consider an mD/D/1 queue
m flows with deterministic inter-arrival time T are offered to a server
the service time is also deterministic and equal to S
the relative phases of the flows are random (uniform between 0 and T)
It can be shown that:
– the delays are periodic and depend only on the initial phases
We must repeat the experiment a sufficiently large number N of times, with random phases, to obtain a valid estimate of the average delay
Analysis and validation of the results
Estimation problem: correlated observations
in some cases the measurement process is a renewal process, and we can exploit its renewal structure to obtain independent observations
a renewal process is characterized by a series of renewal instants [b1, b2, b3, ...]
at these instants, the process returns to the same state
the evolution of the process in the intervals [b_{n-1}, b_n] is independent from interval to interval
measurements taken on the process in distinct intervals are therefore independent, and the formulas for the estimation of confidence can be applied
Analysis and validation of the results
Estimation problem: correlated observations
Example 1:
– it is easy to convince yourself that, for queuing systems with general arrivals and general services, the instant when a new request arrives at an empty queue is a renewal instant of the entire system. Indeed:
– the system state is the same
– a new inter-arrival period has just started, and therefore there is no memory
– a new service period has just started, and therefore there is no memory
– the system is empty, and therefore there is no memory of waiting users
Analysis and validation of the results
Estimation problem: correlated observations
Example 2:
– consider an M/G/1 queue system
– conduct a simulation to measure the delay through the system
– take as samples the delays experienced by each user
– it is easy to convince yourself that these samples are correlated
– Indeed, for example, if the first arrival finds the system empty, the immediately following arrivals observe low delays, while consecutive arrivals when the system is heavily loaded observe high delays
– dividing the simulation into runs of the same time length, you do not control the number Kᵢ of samples in each run, and long runs are needed to obtain a low dispersion of the Kᵢ
Analysis and validation of the results
Estimation problem:
even assuming the independence problem solved, the problem of stationarity remains
even if the process is stationary, we are forced to start the simulation from some initial state
the initial state influences the statistics gathered in the first part of the simulation, until the system reaches stationary behavior
the simplest approach is to eliminate from the statistics the results collected during the initial interval
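Discarding the initial interval amounts to a one-line change in the estimator (a sketch, our naming; choosing the warm-up length is the hard part):

```c
/* Steady-state mean estimate: ignore the first 'warmup' observations,
   which are biased by the initial state, and average the rest. */
double steady_state_mean(const double *x, int n, int warmup) {
    double s = 0.0;
    for (int i = warmup; i < n; i++)
        s += x[i];
    return s / (n - warmup);
}
```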
Analysis and validation of the results
Estimation problem:
the problem is deciding how long the period during which no statistics shall be collected must be
Unfortunately there are no precise rules for deciding when to start collecting data
theoretically, one should estimate the autocorrelation of the process being measured and treat as transient the period of time until the autocorrelation becomes negligible
in practice the autocorrelation estimate is itself a complex operation, and therefore "empirical" methods are used
Analysis and validation of the results
Estimation problem:
Of course, if the regeneration points of the process under measurement are known, the transient problem is solved
– just discard the data collected before the first regeneration point
– or start the simulation from the state of the regeneration points and keep all the statistical data
otherwise you need to have an idea of the time constants involved in determining the state of the process
you can then proceed by trial and error
Analysis and validation of the results
Estimation problem:
Example:
in queuing systems, the time needed to stabilize the system depends on the load
as ρ tends to 1, the system takes a longer time to reach a stable state
in a sense, ρ near 1 means an unstable system
Problem: how do you know whether the system is stable or not, if this cannot be inferred from the input parameters?
Further reading
Donald E. Knuth, “The Art of Computer Programming”, Second Edition, Addison-Wesley Publishing Company, Reading MA, 1981 (in particular, Volume 2: “Seminumerical Algorithms”)