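Statistical Computing and Simulation. 余清祥, Department of Statistics, National Chengchi University. March 8 to March 16, 2010. Weeks 3 and 4: Simulation. http://csyue.nccu.edu.tw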


Simulation
Dictionary sense: to feign; to pretend to have or feel; to pretend to be, act like, or resemble.
Technical sense: the practice of mimicking some or all of the behavior of one system with a different, dissimilar system.
Examples: wind tunnels; flight simulators; CPU simulation (simulating a chip design through software programs that use models to replicate how a device will perform in terms of timing and results).

Applications: risk analysis, policy simulation, traffic flow, CPU simulation software, cancer treatment.

Monte Carlo methods; random number generation.

Mathematical statistics; experimental design; computer simulation.

Main components of a Monte Carlo method:
Probability distribution function
Random number generator
Sampling rule
Scoring/tallying
Error estimation
Variance reduction techniques
Parallelization/vectorization

Estimating π by numerical integration:
$\pi = \int_0^1 \frac{4}{1+x^2}\,dx = 4\int_0^1 (1 - x^2 + x^4 - x^6 + \cdots)\,dx$

Estimating π by simulation: [proportion] = π/4, hence π = 4 × [proportion].

[Text lost; surviving fragments: 0.01%, 17, 1, and the term "variance reduction".]

Resampling (sample-resample): 500, 20.
[Figure: scatter plot of Mid-term versus Final scores.]

Bootstrap: sample correlation 0.8036; 1000 bootstrap replications; 0.0820 (presumably the bootstrap standard error).
[Figure: histogram of the bootstrap correlations.]

Monte Carlo p-value; power; p-value; permutation test.

Permutation test example (data from Cauchy(5) and Cauchy(10)):
Group 1: 3.58, 4.93, 5.27, 7.24, 8.33
Group 2: 5.08, 8.85, 9.03, 10.28, 11.87
t-test and Kruskal-Wallis test: p-value 0.06.
Permutation test: 5 of the 252 possible splits (5/252 ≈ 0.02).

How did Monte Carlo simulation get its name?
The name and the systematic development of Monte Carlo methods date from about the 1940s. There are, however, a number of isolated and undeveloped instances on much earlier occasions.

In the second half of the nineteenth century a number of people performed experiments in which they threw a needle in a haphazard manner onto a board ruled with parallel straight lines and inferred the value of π ≈ 3.14 from observations of the number of intersections between needle and lines.
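The needle experiment can be imitated on a computer. The sketch below is a minimal illustration, not the lecturer's code: it assumes a needle of length L no longer than the line spacing D, uses the classical result that the crossing probability is 2L/(πD), and the function name `buffon_pi` and its default arguments are hypothetical.

```python
import math
import random


def buffon_pi(n_drops, needle_len=1.0, spacing=2.0, seed=0):
    """Estimate pi from simulated needle drops (requires needle_len <= spacing)."""
    if needle_len > spacing:
        raise ValueError("this sketch assumes needle_len <= spacing")
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_drops):
        # Distance from the needle's centre to the nearest line: U(0, D/2).
        y = rng.uniform(0.0, spacing / 2.0)
        # Acute angle between needle and lines: U(0, pi/2).
        # Note: sampling the angle already uses math.pi, so this only
        # illustrates the experiment; it is not a way to compute pi.
        theta = rng.uniform(0.0, math.pi / 2.0)
        if y <= (needle_len / 2.0) * math.sin(theta):
            hits += 1
    if hits == 0:
        raise ValueError("no crossings observed; increase n_drops")
    # P(cross) = 2L / (pi * D)  =>  pi is estimated by 2 L N / (R D).
    return 2.0 * needle_len * n_drops / (hits * spacing)


if __name__ == "__main__":
    print(buffon_pi(200_000))
```

With a few hundred thousand drops the estimate is typically correct to about two decimal places, which shows how slowly this kind of direct simulation converges.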
In 1899 Lord Rayleigh showed that a one-dimensional random walk without absorbing barriers could provide an approximate solution to a parabolic differential equation.

Buffon's needle
Buffon's original form was to drop a needle of length L at random on a grid of parallel lines of spacing D. For L less than or equal to D we obtain
$P(\text{needle intersects the grid}) = \frac{2L}{\pi D}$.
If we drop the needle N times and count R intersections, we estimate P by R/N, which gives
$\pi \approx \frac{2LN}{RD}$.

In the early part of the twentieth century, British statistical schools indulged in a fair amount of unsophisticated Monte Carlo work. In 1908 Student (W. S. Gosset) used experimental sampling to help him towards his discovery of the distribution of the correlation coefficient. In the same year Student also used sampling to bolster his faith in his so-called t-distribution, which he had derived by a somewhat shaky and incomplete theoretical analysis.

Student: William Sealy Gosset (1876-1937)
"This birth-and-death process is suffering from labor pains; it will be the death of me yet." (Student Sayings)

In 1931 Kolmogorov showed the relationship between Markov stochastic processes and certain integro-differential equations.
A. N. Kolmogorov (1903-1987)

The real use of Monte Carlo methods as a research tool stems from work on the atomic bomb during the Second World War. This work involved a direct simulation of the probabilistic problems concerned with random neutron diffusion in fissile material; but even at an early stage of these investigations, von Neumann and Ulam refined this direct simulation with, in particular, the "Russian roulette" and "splitting" methods. However, the systematic development of these ideas had to await the work of Harris and Herman Kahn in 1948. About 1948 Fermi, Metropolis, and Ulam obtained Monte Carlo estimates for the eigenvalues of the Schrödinger equation.
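John von Neumann (1903-1957)

In about 1970, the newly developing theory of computational complexity began to provide a more precise and persuasive rationale for employing the Monte Carlo method. Karp (1985) shows this property for estimating reliability in a planar multiterminal network with randomly failing edges. Dyer (1989) establishes it for estimating the volume of a convex body in M-dimensional Euclidean space. Broder (1986) and Jerrum and Sinclair (1988) establish the property for estimating the permanent of a matrix or, equivalently, the number of perfect matchings in a bipartite graph.

Random numbers
Physical generation; RAND Corporation (1955).

Mid-Square Method (John von Neumann)
e.g. 7033 → 7033² = 49463089 → 4630 → 4630² = 21436900 → 4369 → 4369² = 19088161 → 0881 → 0881² = 00776161 → 7761
e.g. 33 → 33² = 1089 → 08 → 8² = 64: not good.
e.g. 123456 → 241383 → 265752
(But small numbers tend to lead to small numbers.)

Pseudo-random or quasi-random numbers
Pseudo-random: a deterministic sequence of numbers that has the same relevant statistical properties as random numbers.

Requirements for random numbers on {1, 2, ..., m}:
1. Uniformly distributed
2. Statistically independent
3. Reproducible

Linear Congruential Method (LCG)
Congruential generator, Lehmer (1951), with constants a, c, m:
a: multiplier
c: increment
m: modulus
$X_{i+1} = (a X_i + c) \bmod m$

Linear Congruential Method (continued)
First step: given $X_0$ (seed).
Second step: $X_1 = a X_0 + c - m K_0$, where $K_0$ is the integer part of $(a X_0 + c)/m$.
Example: a = c = 3, m = 5.
Seed 3: $X_i$ = 3, 2, 4, 0, 3, ... (period 4)
Seed 1: $X_i$ = 1, 1, 1, 1, 1, ... (period 1)
The sequence depends on the seed.

Linear Congruential Method (continued)
Further examples:
$X_{i+1} = (101 X_i + 1) \bmod 1000$
$X_{i+1} = (2^{17} + 3) X_i \bmod 2^{35}$

An LCG has full period m if and only if:
(1) m and c are relatively prime;
(2) every prime p that divides m also divides a - 1;
(3) if 4 divides m, then 4 divides a - 1.
Example: m = 16, a = 5, c = 1, $X_0$ = 3 gives 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, ...

Correlation between $X_i$ and $X_{i+1}$ (value shown: 0.311):
$\rho \approx \frac{1}{a} - \frac{6c}{am}\left(1 - \frac{c}{m}\right) \pm \frac{a}{m}$
For $a = \sqrt{m}$: $\rho \approx 2/\sqrt{m}$.

Combined generator (Wichmann and Hill, 1982)
$x_i = 171\, x_{i-1} \bmod 30269$
$y_i = 172\, y_{i-1} \bmod 30307$
$z_i = 170\, z_{i-1} \bmod 30323$
$u_i = \left(\frac{x_i}{30269} + \frac{y_i}{30307} + \frac{z_i}{30323}\right) \bmod 1$

Notes
The generated integers lie between 0 and m - 1; dividing by m gives values between 0 and 1 that are treated as U(0,1).
U(0,1) variates can be transformed into other distributions, e.g. Exp(1) = -log(U(0,1)).
Calculator (Casio) generator: $U_{n+1} = (\pi + U_n)^5 \bmod 1$

Multiplicative Generator
An LCG with c = 0: $X_{i+1} = a X_i \bmod m$. The seed $X_0$ should be relatively prime to m.
SAS: $X_{i+1} = 397{,}204{,}094\, X_i \bmod (2^{31} - 1)$

Shift-register generator
Tausworthe (1965), operating on bits:
$X_{i+1} = (X_i + X_{i-k}) \bmod 2$
Fibonacci-type sequence with lag k:
$X_{i+1} = (X_i + X_{i-k}) \bmod 1$

Pseudo-random sequences can be reproduced.

Testing U(0,1) random numbers
Uniformity (goodness-of-fit): chi-square goodness-of-fit test, Kolmogorov-Smirnov test.
Independence: gap test, up-and-down test, run test.

Random number generators in statistical software: SAS, SPSS, S-Plus, Minitab, Excel.

SAS
SAS 6.12 generates U(0,1) numbers with the multiplier 397,204,094:
$x_{i+1} = 397{,}204{,}094\, x_i \bmod (2^{31} - 1), \quad i = 1, 2, 3, \ldots$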
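The congruential recursions above are easy to check by direct computation. The following is a minimal sketch under stated assumptions: the helper names `lcg`, `cycle_length`, and `uniforms` are hypothetical, the small parameter sets are the slide examples, and the last call reuses the multiplier 397,204,094 and modulus 2^31 - 1 quoted for SAS without claiming to reproduce SAS output.

```python
from itertools import islice


def lcg(a, c, m, seed):
    """Yield X_1, X_2, ... from X_{i+1} = (a * X_i + c) mod m, with X_0 = seed."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def cycle_length(a, c, m, seed):
    """Length of the cycle the sequence eventually falls into."""
    first_seen = {seed: 0}  # value -> step at which it first appeared
    x, step = seed, 0
    while True:
        x = (a * x + c) % m
        step += 1
        if x in first_seen:
            return step - first_seen[x]
        first_seen[x] = step


def uniforms(a, c, m, seed, n):
    """First n values of the generator scaled by the modulus."""
    return [x / m for x in islice(lcg(a, c, m, seed), n)]


if __name__ == "__main__":
    # Slide example with full period: m = 16, a = 5, c = 1, X_0 = 3.
    print(list(islice(lcg(5, 1, 16, 3), 16)))
    print(cycle_length(5, 1, 16, 3))   # full period 16
    # Slide example a = c = 3, m = 5: the period depends on the seed.
    print(cycle_length(3, 3, 5, 3), cycle_length(3, 3, 5, 1))
    # Multiplicative generator (c = 0) with the constants quoted for SAS.
    print(uniforms(397204094, 0, 2**31 - 1, seed=12345, n=3))
```

The cycle-length helper stores every visited state, so it is only practical for small moduli such as the classroom examples.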

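The integers are scaled to U(0,1) by
$u_i = \frac{x_i}{2^{31} - 1}$

SPSS; S-Plus/R; Minitab; Excel
[Software screenshots not reproduced.]

Chi-Square Test
With k classes, $o_i$ the observed count and $e_i$ the expected count in class i:
$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$, with $e_i > 5$.

Kolmogorov-Smirnov Goodness-of-fit Test
Empirical distribution function:
$F_N(x) = \frac{1}{N} \sum_{i=1}^{N} I(x_i \le x) = \frac{1}{N}\,(\#\text{ of } x_i \le x)$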
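The empirical distribution function F_N(x), the proportion of observations not exceeding x, can be computed directly, and the largest gap between it and a hypothesised CDF is the quantity the K-S test is built on. The sketch below is illustrative only: the function names are hypothetical, and the ten observations and the exponential distribution with mean 100 are taken from the example the slides quote from Ross (p.156).

```python
import math


def ecdf(sample, x):
    """Empirical CDF: proportion of observations less than or equal to x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def ks_statistic(sample, cdf):
    """D_N = sup_x |F_N(x) - F(x)| for a continuous hypothesised CDF F."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for j, x in enumerate(xs, start=1):
        f = cdf(x)
        # The supremum is attained just before or at a jump of F_N.
        d = max(d, j / n - f, f - (j - 1) / n)
    return d


def expo_cdf(mean):
    """CDF of the exponential distribution with the given mean."""
    def cdf(x):
        return 1.0 - math.exp(-x / mean) if x > 0 else 0.0
    return cdf


if __name__ == "__main__":
    data = [66, 72, 81, 94, 112, 116, 124, 140, 145, 155]
    print(ecdf(data, 100))
    print(ks_statistic(data, expo_cdf(100.0)))
```

For these data the statistic agrees with the value of about 0.4831 reported in the slides; the p-value is not computed here.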

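At a fixed x, $N F_N(x)$ is binomial:
$P\left(F_N(x) = \frac{k}{N}\right) = \binom{N}{k} [F_X(x)]^k [1 - F_X(x)]^{N-k}$
$\lim_{N\to\infty} P\left[\sup_{-\infty<x<\infty} |F_N(x) - F_X(x)| > \varepsilon\right] = 0$,
i.e., for every $\varepsilon > 0$, $F_N(x) \to F_X(x)$ uniformly.

K-S Test (continued)
$D_N = \sup_{-\infty<x<\infty} |F_N(x) - F_X(x)|$
$\lim_{N\to\infty} P(\sqrt{N}\, D_N \le x) = 1 - 2\sum_{j=1}^{\infty} (-1)^{j-1} \exp(-2 j^2 x^2) = H(x)$
Example (p.156, Ross): do the observations 66, 72, 81, 94, 112, 116, 124, 140, 145, 155 come from Expo(100)?
$D_N \approx 0.4831487$, p-value = 0.012.

Chi-square and K-S tests
[Comparison text lost.]

Chi-square and K-S tests (continued)
Power comparison using $\sum_{i=1}^{12} U_i - 6$, built from 12 U(0,1) variates. Number of rejections in 1000 runs:
Test       α = 0.01   α = 0.05   α = 0.10
χ²-test        20         93        142
KS-test        22        116        202

Cramér-von Mises Goodness-of-fit Test
See also the Anderson-Darling test (Anderson and Darling, 1952, Annals, pp. 193-212).
$H_0: F_X(x) = F_0(x)$ vs. $H_1: F_X(x) \ne F_0(x)$
With order statistics $X_{(1)} \le \cdots \le X_{(N)}$:
$Y = \frac{1}{12N} + \sum_{i=1}^{N} \left[ F_0(X_{(i)}) - \frac{2i-1}{2N} \right]^2$

Gap test
Fix $0 < \alpha < \beta < 1$. The gap length K between successive values falling in $(\alpha, \beta)$ satisfies
$P(K = k) = (\beta - \alpha)\,(1 - (\beta - \alpha))^k, \quad k = 0, 1, \ldots$

Up-and-Down test
Code each increase as 1 (+) and each decrease as 0.
e.g. 0.2, 0.4, 0.1, 0.3, 0.6, 0.7, 0.5 gives 1 0 1 1 1 0, so the number of runs is 4.
For runs of length k, see Levene and Wolfowitz (Ann. Math. Stat., 1944, pp. 58-69); the N! orderings are equally likely.

e.g. Consider the cases N = 2, 3, 4.
(1) N = 2
12 → 1
21 → 0
In total, 2 runs (average = 1).
(2) N = 3
123 → 11
132 → 10
213 → 01
231 → 10
312 → 01
321 → 00
In total, 10 runs (average = 5/3).
(3) N = 4
1234 → 111
1243 → 110
1324 → 101
1342 → 110
1423 → 101
1432 → 100
...
In total, 56 runs (average = 7/3).

Up-and-Down test (continued)
Expected number of runs of length 1: $(5N + 1)/12$
Expected number of runs of length 2: $(11N - 14)/60$
Expected number of runs of length N - 1: $2/N!$
Expected number of runs of length k (k ≤ N - 2): $\frac{2[(k^2 + 3k + 1)N - (k^3 + 3k^2 - k - 4)]}{(k + 3)!}$
Total number of runs U: mean $(2N - 1)/3$, variance $(16N - 29)/90$, and
$Z = \frac{U - (2N - 1)/3}{\sqrt{(16N - 29)/90}} \sim N(0, 1)$

Up-and-down test (continued)
Minitab offers another so-called run test. It uses a number (usually the mean) as a base value and records whether each observation is greater than this value. We would expect about half of the observations to exceed the mean, i.e. a binomial distribution.
e.g. In Minitab, I tried 100 random numbers; the output was:
k = 0.5419
The observed no. of runs = 61
The expected no. of runs = 50.50
55 observations above k, 45 below
The test is significant at 0.0332.

Permutation Tests
Take groups of k consecutive values, $(X_1, X_2, \ldots, X_k), (X_{k+1}, \ldots, X_{2k}), \ldots$; each group shows one of the k! possible orderings of $(1, 2, \ldots, k)$.
The up-and-down test and the permutation test are related.

Coupon Collector's Test
With k types, the number of draws needed to see every type takes values in {k, k+1, k+2, ...}.
Greenwood, 1955, Math. Comp., pp. 1-15.
Both the permutation and the coupon collector's tests depend on the choice of k.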
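Of the independence tests listed above, the up-and-down test is the simplest to sketch, because its mean (2N - 1)/3 and variance (16N - 29)/90 are given explicitly. The code below is a minimal illustration under stated assumptions: the function names are hypothetical, ties are counted as decreases (they have probability zero for continuous data), and the normal approximation is only reasonable for fairly large N.

```python
import math
import random


def up_down_runs(xs):
    """Return the 1/0 (up/down) pattern of xs and its number of runs."""
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    signs = [1 if b > a else 0 for a, b in zip(xs, xs[1:])]
    runs = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    return signs, runs


def up_down_test(xs):
    """Z statistic and two-sided normal p-value for the total number of runs."""
    n = len(xs)
    _, runs = up_down_runs(xs)
    mean = (2 * n - 1) / 3
    var = (16 * n - 29) / 90
    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, p_value


if __name__ == "__main__":
    # Slide example: pattern 1 0 1 1 1 0, hence 4 runs.
    print(up_down_runs([0.2, 0.4, 0.1, 0.3, 0.6, 0.7, 0.5]))
    rng = random.Random(2010)
    sample = [rng.random() for _ in range(1000)]
    print(up_down_test(sample))
```

A monotone sequence such as 0.001, 0.002, ... has a single run and is rejected decisively, even though it would pass a test of uniformity alone.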