Uniform random numbers generators
Lecturer: Dmitri A. Moltchanov
E-mail: [email protected]
http://www.cs.tut.fi/kurssit/TLT-2707/
Network simulation techniques D.Moltchanov, TUT, 2012
OUTLINE:
• The need for random numbers;
• Basic steps in generation;
• Uniformly distributed random numbers;
– Von Neumann’s generator;
– Congruential methods: additive, multiplicative, linear;
– Tausworthe generator;
– Composite generators.
• Statistical tests for uniform random numbers.
– Independence: runs test and correlation test;
– Uniformity: χ² and Kolmogorov test.
Lecture: Uniform random numbers generators 2
1. The need for random numbers
Examples of randomness in telecommunications:
• interarrival times of packets, tasks, etc.;
• service times of packets, tasks, etc.;
• times between failures of various components;
• repair times of various components;
• . . .
Importance for simulations:
• random events are characterized by distributions;
• in a simulation we cannot use a distribution directly: we need concrete values sampled from it.
For example, M/M/1 queuing system:
• arrival process: exponential distribution with mean 1/λ;
• service times: exponential distribution with mean 1/µ.
Discrete-event simulation of M/M/1 queue
INITIALIZATION
  time := 0;
  queue := 0;
  sum := 0;
  throughput := 0;
  generate first interarrival time;
MAIN PROGRAM
  while time < runlength do
    case nextevent of
      arrival event:
        time := arrivaltime;
        add customer to the queue;
        start new service if the server is idle;
        generate next interarrival time;
      departure event:
        time := departuretime;
        throughput := throughput + 1;
        remove customer from the queue;
        if (queue not empty)
          sum := sum + waiting time;
          start new service;
OUTPUT
  mean waiting time = sum / throughput
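The pseudocode above can be turned into a short runnable program. The following Python sketch is one possible reading of it, not the lecturer's code: the names `simulate_mm1`, `lam`, `mu` and `seed` are assumptions, and the exponential samples come from the standard library call `random.expovariate` instead of a hand-written generator.

```python
import random
from collections import deque

def simulate_mm1(lam, mu, runlength, seed=1):
    """Return the mean waiting time in the queue of a simulated M/M/1 system."""
    rng = random.Random(seed)
    waiting = deque()              # arrival times of customers waiting in the queue
    server_busy = False
    total_wait = 0.0               # "sum" in the pseudocode
    throughput = 0                 # number of departures so far
    next_arrival = rng.expovariate(lam)
    next_departure = float("inf")  # no departure scheduled while the server is idle

    while min(next_arrival, next_departure) < runlength:
        if next_arrival <= next_departure:      # arrival event
            time = next_arrival
            if server_busy:
                waiting.append(time)
            else:
                server_busy = True              # served at once, waiting time 0
                next_departure = time + rng.expovariate(mu)
            next_arrival = time + rng.expovariate(lam)
        else:                                   # departure event
            time = next_departure
            throughput += 1
            if waiting:
                total_wait += time - waiting.popleft()
                next_departure = time + rng.expovariate(mu)
            else:
                server_busy = False
                next_departure = float("inf")
    return total_wait / throughput if throughput else 0.0
```

For λ = 0.5 and µ = 1 queueing theory gives a mean waiting time in the queue of λ/(µ(µ − λ)) = 1, so a long run such as `simulate_mm1(0.5, 1.0, 200000.0)` should return a value close to 1.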
2. General notes
All computer-generated numbers are pseudo-random:
• we know the method by which they are generated;
• we can predict any "random" sequence in advance.
The goal is then to imitate random sequences as well as possible.
Requirements for generators:
• must be fast;
• must have low complexity;
• must be portable;
• must have sufficiently long cycles;
• must allow repeatable sequences to be generated;
• numbers must be independent;
• numbers must closely follow a given distribution.
General approach nowadays:
• transforming one random variable to another one;
• as a reference distribution a uniform distribution is often used.
Note the following:
• most languages contain a generator of uniformly distributed numbers in the interval (0, 1);
• most languages do not contain generators of arbitrarily distributed random numbers.
The procedure is to:
• generate a random number with uniform distribution between a and b, b ≫ a;
• transform it to a random number with uniform distribution on (0, 1);
• transform it to a random number with the desired distribution.
2.1. Step 1: uniform random numbers in (a, b)
Basic approach:
• generate random number with uniform distribution on (a, b);
• transform these random numbers to (0, 1);
• transform it somehow to a random number with desired distribution.
Uniform generators:
• old methods: mostly based on radioactivity;
• Von Neumann’s algorithm;
• congruential methods.
Basic approach: the next number is some function of the previous one:
$\gamma_{i+1} = F(\gamma_i), \quad i = 0, 1, \ldots$ (1)
• this is a recurrence relation of the first order;
• $\gamma_0$ is known and directly computed from the seed.
2.2. Step 2: transforming to random numbers in (0, 1)
Basic approach:
• generate random number with uniform distribution on (a, b);
• transform these random numbers to (0, 1);
• transform it somehow to a random number with desired distribution.
The uniform U(0, 1) distribution has the following pdf:
$f(x) = \begin{cases} 1, & 0 \le x \le 1, \\ 0, & \text{otherwise}. \end{cases}$ (2)
Mean and variance are given by:
$E[X] = \int_0^1 x\,dx = \left.\frac{x^2}{2}\right|_0^1 = \frac{1}{2}, \qquad \sigma^2[X] = \frac{1}{12}.$ (3)
How to get U(0, 1):
• by rescaling from U(0, m) as follows:
$y_i = \gamma_i / m,$ (4)
• where m is the largest number that can be generated.
What we get:
• something like: 0.12, 0.67, 0.94, 0.04, 0.65, 0.20, . . . ;
• sequence that appears to be random...
2.3. Step 3: non-uniform random numbers
Basic approach:
• generate random number with uniform distribution on (a, b);
• transform these random numbers to (0, 1);
• transform it somehow to a random number with desired distribution.
If we have a U(0, 1) generator, the following techniques are available:
• discretization: Bernoulli, binomial, Poisson, geometric;
• rescaling: uniform;
• inverse transform: exponential;
• specific transforms: normal;
• rejection method: universal method;
• reduction method: Erlang, binomial;
• composition method: for complex distributions.
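As an illustration of the inverse transform entry in the list above, here is a minimal Python sketch. It assumes the standard result that if U is uniform on (0, 1) then −ln(1 − U)/λ is exponential with rate λ; the function name `exponential_from_uniform` and the use of `random.Random` as the U(0, 1) source are choices made for this example only.

```python
import math
import random

def exponential_from_uniform(u, lam):
    """Map a U(0, 1) sample u to an exponential sample with rate lam."""
    return -math.log(1.0 - u) / lam

rng = random.Random(42)  # stand-in for any U(0, 1) generator
samples = [exponential_from_uniform(rng.random(), 2.0) for _ in range(100000)]
sample_mean = sum(samples) / len(samples)  # should be close to 1/lam = 0.5
```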
3. Uniformly distributed random numbers
The generator is fully characterized by (S, s0, f, U, g):
• S is a finite set of states;
• s0 ∈ S is the initial state;
• f : S → S is the transition function;
• U is a finite set of output values;
• g : S → U is the output function.
The algorithm is then:
• let u0 = g(s0);
• for i = 1, 2, . . . do the following recursion:
– si = f(si−1);
– ui = g(si).
Note: the choice of functions f(·) and g(·) strongly influences the quality of the generator.
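The recursion above maps directly onto a generator function. The sketch below is only an illustration: the particular `f` (a small linear congruential step modulo 16) and `g` (division by 16) are hypothetical choices, not values taken from the lecture.

```python
from itertools import islice

def generator(s0, f, g):
    """Yield u0 = g(s0), u1 = g(f(s0)), u2 = g(f(f(s0))), ..."""
    s = s0
    while True:
        yield g(s)
        s = f(s)

m = 16                            # hypothetical number of states
f = lambda s: (5 * s + 3) % m     # hypothetical transition function
g = lambda s: s / m               # hypothetical output function into [0, 1)
first = list(islice(generator(7, f, g), 5))
```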
Figure 1: Example of the operation of a random number generator: starting from the user-chosen state s0, the states evolve as si = f(si−1) and the outputs are ui = g(si).
Here s0 is the random seed:
• it allows the whole sequence to be repeated;
• it allows you to make sure manually that you get a different sequence.
3.1. Von Neumann’s generator
The basic procedure:
• start with some number u0 of a certain length x (say, x = 4 digits, this is seed);
• square the number;
• take middle 4 digits to get u1;
• repeat...
• example: with seed 1234 we get 1234, 5227, 3215, 3362, 3030, etc.
Shortcomings:
• sensitive to the random seed:
– seed 2345: 2345, 4990, 9001, 180, 324, 1049, 1004, 80, 64, 40, ... (all further values stay below 100);
• may have a very short period:
– seed 2100: 2100, 4100, 8100, 6100, 2100, 4100, 8100, ... (period = 4 numbers).
To generate U(0, 1): divide each obtained number by $10^x$ (x is the length of u0).
Note: this generator is also known as the midsquare generator.
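The procedure and the three example sequences above can be checked with a few lines of Python. This is a sketch under one assumption: the square is treated as a zero-padded number of 2x digits, so that "the middle x digits" is well defined; the function name `midsquare` is chosen for this example.

```python
def midsquare(seed, count, digits=4):
    """Return the first `count` numbers of the midsquare sequence (digits must be even)."""
    values = [seed]
    u = seed
    for _ in range(count - 1):
        # drop the lowest digits/2 digits of u*u, then keep the next `digits` digits
        u = (u * u // 10 ** (digits // 2)) % 10 ** digits
        values.append(u)
    return values

uniform = [u / 10 ** 4 for u in midsquare(1234, 5)]  # rescaled to [0, 1)
```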
3.2. Congruential methods
There are a number of versions:
• additive congruential method;
• multiplicative congruential method;
• linear congruential method;
• Tausworthe generator.
General congruential generator:
$u_{i+1} = f(u_i, u_{i-1}, \ldots) \bmod m,$ (5)
• $u_i, u_{i-1}, \ldots$ are past numbers.
For example, quadratic congruential generator:
$u_{i+1} = (a_1 u_i^2 + a_2 u_{i-1} + c) \bmod m.$ (6)
Note: for a1 = a2 = 1, c = 0 and m a power of 2 this generator is closely related to the midsquare method.
3.3. Additive congruential method
The additive congruential generator is given by:
$u_{i+1} = (a_1 u_i + a_2 u_{i-1} + \cdots + a_k u_{i-k+1}) \bmod m.$ (7)
The following special case is sometimes used:
$u_{i+1} = (a_1 u_i + a_2 u_{i-1}) \bmod m.$ (8)
Characteristics:
• divide by m to get U(0, 1);
• maximum period is $m^k - 1$ (the all-zero state only reproduces itself);
• note: rarely used.
Shortcomings: consider k = 2 with a1 = a2 = 1:
• consider three consecutive numbers ui−2, ui−1, ui;
• we never get ui−2 < ui < ui−1 or ui−1 < ui < ui−2, although each of these orderings should occur in 1/6 of all triples.
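The missing orderings can be observed numerically for the case a1 = a2 = 1. In the sketch below the modulus 1000003 and the seeds are arbitrary values picked for the illustration, and the function names are not from the lecture.

```python
def additive_generator(u0, u1, m, count):
    """Sequence u_{i+1} = (u_i + u_{i-1}) mod m, i.e. (8) with a1 = a2 = 1."""
    values = [u0, u1]
    while len(values) < count:
        values.append((values[-1] + values[-2]) % m)
    return values

def count_middle(values):
    """Count triples in which the newest number lies strictly between the two before it."""
    return sum(1 for a, b, c in zip(values, values[1:], values[2:])
               if a < c < b or b < c < a)

seq = additive_generator(1, 1, 1000003, 20000)
missing = count_middle(seq)  # truly random triples would give about 1/3 of all triples
```

The reason is simple: without wrap-around the sum is at least as large as both summands, and with wrap-around it is smaller than both, so it can never fall strictly between them.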
3.4. Multiplicative congruential method
The multiplicative congruential generator is given by:
$u_{i+1} = (a u_i) \bmod m.$ (9)
Characteristics:
• divide by m to get U(0, 1);
• theoretical maximum period is m − 1, since 0 can never be produced;
• note: rarely used.
Shortcomings:
• can never produce 0.
Choice of a, m is very important:
• recommended $m = 2^p - 1$ with p = 2, 3, 5, 7, 13, 17, 19, 31, 61 (Mersenne primes);
• choosing $m = 2^q$, q ≥ 4 simplifies the calculation of the modulo;
• for $m = 2^q$ the practical maximum period is at best m/4.
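How much the multiplier matters can be seen on a toy modulus. The following sketch uses m = 31 = 2^5 − 1, a value chosen here only because it is small enough to enumerate; the multipliers 3 and 5 and the function name are likewise assumptions of this example.

```python
def multiplicative_cycle(a, m, seed):
    """Return one full cycle of u_{i+1} = (a * u_i) mod m starting at seed.

    Assumes 0 < seed < m and gcd(a, m) = 1, so the sequence returns to seed.
    """
    cycle = [seed]
    u = (a * seed) % m
    while u != seed:
        cycle.append(u)
        u = (a * u) % m
    return cycle

good = multiplicative_cycle(3, 31, 1)  # visits all 30 nonzero values
bad = multiplicative_cycle(5, 31, 1)   # 1, 5, 25 and back to 1
```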
3.5. Linear congruential method
The linear congruential generator is given by:
$u_{i+1} = (a u_i + c) \bmod m,$ (10)
• where a, c, m are all positive.
Characteristics:
• divide by m to get U(0, 1);
• maximum period is m;
• frequently used.
Choice of a, c, m is very important. To get the full period m choose:
• c and m relatively prime (no common divisor other than 1);
• a ≡ 1 (mod q) for every prime divisor q of m;
• a ≡ 1 (mod 4) if 4 is a divisor of m.
The step-by-step procedure is as follows:
• set the seed x0;
• multiply x0 by a and add c;
• divide the result by m;
• the remainder is x1;
• repeat to get x2, x3, . . . .
Examples:
• x0 = 7, a = 7, c = 7, m = 10 we get: 7,6,9,0,7,6,9,0,... (period = 4);
• x0 = 1, a = 1, c = 5, m = 13 we get: 1,6,11,3,8,0,5,10,2,7,12,4,9,1... (period = 13);
• x0 = 8, a = 2, c = 5, m = 13 we get: 8,8,8,8,8,8,8,8,... (period = 1!).
Recommended values: a = 314159269, c = 453806245, $m = 2^{31}$ for a 32-bit machine.
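The three examples above are easy to reproduce. The sketch below follows the step-by-step procedure literally; the helper `lcg_period`, which measures the length of the cycle the sequence eventually enters, is an addition for this illustration.

```python
def lcg(x0, a, c, m, count):
    """Return x0 and the next count - 1 numbers of x_{i+1} = (a * x_i + c) mod m."""
    values = [x0]
    for _ in range(count - 1):
        values.append((a * values[-1] + c) % m)
    return values

def lcg_period(x0, a, c, m):
    """Length of the cycle that the sequence started at x0 eventually enters."""
    first_seen = {}
    x, i = x0, 0
    while x not in first_seen:
        first_seen[x] = i
        x = (a * x + c) % m
        i += 1
    return i - first_seen[x]
```

Only the second example (a = 1, c = 5, m = 13) satisfies all full-period conditions, which is why it is the only one that reaches period m.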
Complexity of the algorithm: addition, multiplication and division:
• division is slow: to avoid it, set m to the number of values a computer word can hold (a power of 2).
Overflow problem when m equals the size of the word:
• values a, c and m are such that the result a xi + c does not fit into the word;
• it may lead to loss of the most significant digits, but it does not hurt!
How to deal with it (example):
• the register can accommodate 2 digits at maximum;
• the largest number that can be stored is 99;
• if m = 100: for a = 8, u0 = 2, c = 10 we get (a u0 + c) mod 100 = 26;
• if m = 100: for a = 8, u0 = 20, c = 10 we get a u0 + c = 170:
– a u0 = 8 ∗ 20 = 160 causing overflow;
– first significant digit is lost and register contains 60;
– the remainder in the register (result) is: (60 + 10) mod 100 = 70;
• the same as 170 mod 100 = 70.
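The two-digit register example, and its binary counterpart, can be imitated in Python, whose integers never overflow by themselves. In this sketch the register is modelled explicitly by a modulo or a bit mask; the function names and the 32-bit word size are assumptions of the example.

```python
MASK32 = 0xFFFFFFFF  # keeps the lowest 32 bits, i.e. reduces modulo 2**32

def two_digit_register_step(u, a, c):
    """One LCG step with m = 100 on a register that holds only two decimal digits."""
    product = (a * u) % 100        # the register silently drops the leading digit
    return (product + c) % 100

def lcg_step_wrapped(x, a, c):
    """One LCG step with m = 2**32 where overflow plays the role of the division."""
    return (a * x + c) & MASK32
```

Losing the leading digits is harmless because they are multiples of m and would be removed by the modulo operation anyway.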
3.6. How to get a good congruential generator
Characteristics of a good generator:
• should provide maximum density:
– no large gaps in [0, 1] are produced by random numbers;
– problem: each number is discrete;
– solution: a very large integer for modulus m.
• should provide maximum period:
– achieve maximum density and avoid cycling;
– achieve by: proper choice of a, c, m, and x0.
• efficient on modern computers:
– set the modulus to a power of 2.
3.7. Tausworthe generator
Tausworthe generator (a case of linear congruential generator of order k):
$z_i = (a_1 z_{i-1} + a_2 z_{i-2} + \cdots + a_k z_{i-k} + c) \bmod 2 = \left( \sum_{j=1}^{k} a_j z_{i-j} + c \right) \bmod 2,$ (11)
• where aj ∈ {0, 1}, j = 1, 2, . . . , k;
• the output is binary: 0011011101011101000101...
Advantages:
• independent of the system (computer architecture);
• independent of the word size;
• very large periods;
• can be used in composite generators (we consider in what follows).
Note: there are several bit selection techniques to get numbers.
A way to generate numbers:
• choose an integer l ≤ k;
• split the bit sequence into blocks of length l and interpret each block as a binary fraction:
$u_n = \sum_{j=0}^{l-1} z_{nl+j} 2^{-(j+1)}.$ (12)
In practice, only two aj are used and set to 1 at places h and k. We get:
$z_i = (z_{i-h} + z_{i-k}) \bmod 2.$ (13)
Example:
• h = 3, k = 4, initial values 1,1,1,1;
• we get: 110101111000100110101111...;
• period is $2^k - 1 = 15$;
• if l = 4: 13/16, 7/16, 8/16, 9/16, 10/16, 15/16, 1/16, 3/16...
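A minimal Python sketch of recurrence (13) and the block conversion (12) follows; the function names are chosen for this example. One observation from running it: started from 1,1,1,1 the recurrence as written produces 111100010011010..., which is the same 15-bit cycle as the printed bit string but at a different phase, while the start 1,1,0,1 reproduces the printed string and the fractions 13/16, 7/16, ... exactly.

```python
def tausworthe_bits(h, k, seed_bits, n):
    """Return n bits of z_i = (z_{i-h} + z_{i-k}) mod 2, including the k seed bits."""
    z = list(seed_bits)
    while len(z) < n:
        z.append(z[-h] ^ z[-k])    # addition modulo 2 is XOR
    return z[:n]

def bits_to_uniform(bits, l):
    """Interpret consecutive blocks of l bits as binary fractions in [0, 1)."""
    return [sum(b * 2.0 ** -(j + 1) for j, b in enumerate(bits[i:i + l]))
            for i in range(0, len(bits) - l + 1, l)]

cycle = tausworthe_bits(3, 4, [1, 1, 1, 1], 30)
numbers = bits_to_uniform(tausworthe_bits(3, 4, [1, 1, 0, 1], 32), 4)
```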
3.8. Composite generator
Idea: use two generators with short periods to build another one with a longer period.
The basic principle:
• use the first generator to fill the shuffling table (address - entry (random number));
• use random numbers of the second generator as addresses in the next step;
• each number read at an address is replaced by a new random number of the first generator.
The following algorithm uses one generator to shuffle itself:
1. create a shuffling table of 100 entries (i, ti = γi, i = 1, 2, . . . , 100);
2. draw a random number γk and normalize it to an index i in the range 1, . . . , 100;
3. entry i of the table gives the output random number ti;
4. draw the next random number γk+1 and update ti = γk+1;
5. repeat from step 2.
Note: table with 100 entries gives fairly good results.
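The five steps of the self-shuffling algorithm can be sketched as follows. The base generator is passed in as a function returning U(0, 1) numbers; using `random.Random(...).random` for it, and indexing the table from 0 instead of 1, are conveniences of this example and not part of the lecture.

```python
import random

def shuffled_stream(base, count, table_size=100):
    """Return `count` numbers produced by shuffling the generator `base` with itself."""
    table = [base() for _ in range(table_size)]   # step 1: fill the table
    output = []
    for _ in range(count):
        i = int(base() * table_size)              # step 2: random address 0..table_size-1
        output.append(table[i])                   # step 3: entry i is the output
        table[i] = base()                         # step 4: refill entry i
    return output                                 # step 5 is the loop itself

numbers = shuffled_stream(random.Random(2012).random, 1000)
```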
4. Tests for random number generators
What do we want to check:
• independence;
• uniformity.
Important notes:
• only if the tests are passed can the numbers be treated as random;
• recall: numbers are actually deterministic!
Commonly used tests for independence:
• runs test;
• correlation test.
Commonly used tests for uniformity:
• Kolmogorov’s test;
• χ² test.
4.1. Independence: runs test
Basic idea:
• compute patterns of numbers (always increase, always decrease, etc.);
• compare to theoretical probabilities.
Figure 2: Illustration of the basic idea.
Do the following:
• consider a sequence of pseudo-random numbers: {ui, i = 0, 1, . . . , n};
• consider unbroken subsequences of numbers where numbers are monotonically increasing;
– such a subsequence is called a run-up;
– example: in 0.78, 0.81, 0.89, 0.81 the first three numbers form a run-up of length 3.
• count all run-ups of length i:
– ri, i = 1, 2, 3, 4, 5;
– all run-ups of length i ≥ 6 are grouped into r6.
• calculate:
$R = \frac{1}{n} \sum_{1 \le i, j \le 6} (r_i - n b_i)(r_j - n b_j) a_{ij},$ (14)
where
$(b_1, b_2, \ldots, b_6) = \left( \frac{1}{6}, \frac{5}{24}, \frac{11}{120}, \frac{19}{720}, \frac{29}{5040}, \frac{1}{840} \right).$ (15)
Coefficients aij must be chosen as an element of the matrix:
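For independent numbers the statistic R approximately follows the χ² distribution:
• degrees of freedom: 6;
• requires n > 4000.
If R does not exceed the critical value of this distribution, the hypothesis that the observations are i.i.d. is not rejected.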
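The counting step of the runs test can be sketched in Python as below. The sketch stops at the counts r1, . . . , r6: the coefficient matrix aij is not reproduced in this text, so the statistic R itself is not computed here. The function name `run_up_counts` and the convention that the last, unfinished run is also counted are assumptions of this example.

```python
def run_up_counts(u):
    """Return [r1, ..., r6]: numbers of run-ups of length 1..5 and of length >= 6."""
    counts = [0] * 6
    if not u:
        return counts
    length = 1
    for prev, cur in zip(u, u[1:]):
        if cur > prev:
            length += 1                       # the run-up continues
        else:
            counts[min(length, 6) - 1] += 1   # the run-up is broken, record it
            length = 1
    counts[min(length, 6) - 1] += 1           # record the final run-up
    return counts
```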
4.2. Independence: correlation test
Basic idea:
• compute the lag-1 autocorrelation coefficient;
• if it is not zero and this result is statistically significant, the numbers are not independent.
Compute the statistic (lag-1 autocorrelation coefficient) as:
$R = \frac{\sum_{j=1}^{N} (u_j - E[u])(u_{j+1} - E[u])}{\sum_{j=1}^{N} (u_j - E[u])^2}.$ (16)
Practice: if |R| is relatively big there is serial correlation.
Important notes:
• exact distribution of R is unknown;
• for large N: if the uj are uncorrelated we have $\Pr\{-2/\sqrt{N} \le R \le 2/\sqrt{N}\} \approx 0.95$;
• therefore: reject the hypothesis of no correlation at the 5% level if R is not in $[-2/\sqrt{N}, 2/\sqrt{N}]$.
Notes: other tests for correlation include the Ljung and Box test, the 'Portmanteau' test, etc.
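A small Python sketch of the correlation test is given below. It assumes that E[u] is estimated by the sample mean and, since a sample of N numbers has only N − 1 neighbouring pairs, sums the numerator over those N − 1 pairs; the function names are chosen for this example.

```python
import math

def lag1_autocorrelation(u):
    """Estimate the lag-1 autocorrelation coefficient of the sequence u."""
    n = len(u)
    mean = sum(u) / n
    numerator = sum((u[j] - mean) * (u[j + 1] - mean) for j in range(n - 1))
    denominator = sum((x - mean) ** 2 for x in u)
    return numerator / denominator

def looks_uncorrelated(u):
    """True if R lies inside the approximate 95% band [-2/sqrt(N), 2/sqrt(N)]."""
    return abs(lag1_autocorrelation(u)) <= 2.0 / math.sqrt(len(u))

alternating = [0.0, 1.0] * 50   # strongly negatively correlated toy sequence
```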
4.3. Uniformity: χ² test
The algorithm:
• divide [0, 1] into k, k > 100 non-overlapping intervals;
• count the number of observations falling into each interval, fi:
– ensure that there are enough numbers to get fi > 5, i = 1, 2, . . . , k;
– values fi, i = 1, 2, . . . , k are called observed values.
• if observations are truly uniformly distributed then:
– these values should be close to ri = n/k, i = 1, 2, . . . , k;
– the values ri are called theoretical values.
• compute the χ² statistic for the uniform distribution:
$\chi^2 = \frac{k}{n} \sum_{i=1}^{k} \left( f_i - \frac{n}{k} \right)^2,$ (17)
– which under H0 approximately has the χ² distribution with k − 1 degrees of freedom.
Hypotheses:
• H0 observations are uniformly distributed;
• H1 observations are not uniformly distributed.
H0 is rejected if:
• the computed value of χ² is greater than the one obtained from the tables;
• you should check the entry with k − 1 degrees of freedom and significance level α (the 1 − α quantile).
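The statistic (17) can be computed with a few lines of Python. The sketch below only evaluates the statistic; the comparison with the tabulated critical value is left to the reader. The function name and the rule that a value of exactly 1.0 goes into the last interval are assumptions of this example.

```python
def chi_square_uniform(u, k):
    """Chi-square statistic of the numbers u against U(0, 1) using k equal intervals."""
    n = len(u)
    counts = [0] * k
    for x in u:
        counts[min(int(x * k), k - 1)] += 1   # observed values f_i
    expected = n / k                          # theoretical value n/k
    # sum((f_i - n/k)^2) / (n/k) is the same as (k/n) * sum((f_i - n/k)^2)
    return sum((f - expected) ** 2 for f in counts) / expected

perfect = [(i + 0.5) / 1000 for i in range(1000)]   # evenly spread toy data
```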
4.4. Kolmogorov test
Facts about this test:
• compares the empirical distribution with the theoretical one;
• empirical: FN(x) is the number of observations smaller than or equal to x, divided by N;
• theoretical: uniform distribution in (0, 1): F (x) = x, 0 < x < 1.
Hypotheses:
• H0: FN(x) follows F (x);
• H1: FN(x) does not follow F (x).
Statistic: maximum absolute difference over the range:
$R = \max_x |F(x) - F_N(x)|.$ (18)
• if R > Rα: H0 is rejected;
• if R ≤ Rα: H0 is accepted.
Note: use tables for N , α (significance level), to find Rα.
Example: we got 0.44, 0.81, 0.14, 0.05, 0.93:
• H0: random numbers follow the uniform distribution;
• we have to compute (R(j) are the sorted numbers, N = 5; negative differences are shown as -):
j                 1     2     3     4     5
R(j)              0.05  0.14  0.44  0.81  0.93
j/N               0.20  0.40  0.60  0.80  1.00
j/N - R(j)        0.15  0.26  0.16  -     0.07
R(j) - (j-1)/N    0.05  -     0.04  0.21  0.13
• compute the statistic as: R = max |F(x) − FN(x)| = 0.26;
• from tables: for α = 0.05, Rα = 0.565 > R;
• H0 is accepted, random numbers are distributed uniformly in (0, 1).
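The table above can be reproduced programmatically. In the following sketch the two rows of differences are computed for the sorted sample and the largest value is returned; the function name is chosen for this example, and the critical value 0.565 is the tabulated one quoted above.

```python
def kolmogorov_statistic(u):
    """Maximum distance between the empirical cdf of u and the U(0, 1) cdf F(x) = x."""
    r = sorted(u)
    n = len(r)
    d_plus = max((j + 1) / n - x for j, x in enumerate(r))   # row j/N - R(j)
    d_minus = max(x - j / n for j, x in enumerate(r))        # row R(j) - (j-1)/N
    return max(d_plus, d_minus)

statistic = kolmogorov_statistic([0.44, 0.81, 0.14, 0.05, 0.93])
h0_rejected = statistic > 0.565   # critical value for N = 5, alpha = 0.05
```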
4.5. Other tests
The serial test:
• consider pairs (u1, u2), (u3, u4), . . . , (u2N−1, u2N);
• count how many observations fall into N² different subsquares of the unit square;
• apply χ² test to decide whether they follow uniform distribution;
• one can formulate M -dimensional version of this test.
The permutation test:
• look at k-tuples: (u1, . . . , uk), (uk+1, . . . , u2k), . . . , (u(N−1)k+1, . . . , uNk);
• in a k-tuple there are k! possible orderings;
• for independent numbers all orderings in a k-tuple are equally likely;
• determine frequencies of orderings in k-tuples;
• apply χ² test to decide whether they follow uniform distribution.
The gap test:
• let J be some fixed subinterval in (0, 1);
• if we have that:
– un+j not in J, 0 ≤ j ≤ k − 1, and both un−1 ∈ J, un+k ∈ J;
– we say that there is a gap of length k.
• H0: numbers are independent and uniformly distributed in (0, 1):
– gap length must be geometrically distributed with some parameter p;
– p is the length of interval J:
$\Pr\{\text{gap of length } k\} = p(1-p)^k.$ (19)
• practice: we observe a large number of gaps, say N;
• choose an integer h and count the number of gaps of length 0, 1, . . . , h − 1 and ≥ h;
• apply the χ² test to decide whether the numbers are independent and follow the uniform distribution.
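The bookkeeping of the gap test can be sketched as follows. The sketch assumes J = [low, high) and measures a gap as the number of values outside J between two consecutive values inside J, which matches (19); the function names are chosen for this example, and the final χ² comparison is not included.

```python
def gap_lengths(u, low, high):
    """Lengths of the gaps between consecutive visits of the sequence u to [low, high)."""
    gaps = []
    last_hit = None
    for i, x in enumerate(u):
        if low <= x < high:
            if last_hit is not None:
                gaps.append(i - last_hit - 1)
            last_hit = i
    return gaps

def expected_gap_counts(n_gaps, p, h):
    """Expected numbers of gaps of length 0, 1, ..., h-1 and >= h under H0."""
    probabilities = [p * (1 - p) ** k for k in range(h)]
    probabilities.append((1 - p) ** h)      # all gaps of length >= h together
    return [n_gaps * q for q in probabilities]
```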
4.6. Important notes
Some important notes on seed number:
• do not use seed 0;
• avoid even values;
• do not use the same sequence for different purposes in a single simulation run.
Note: these instructions may not be applicable to a particular generator.
General notes:
• some common generators have been found to be inadequate;
• even if a generator passes the tests, some underlying pattern may remain undetected;
• if the task is important, use a composite generator.