Output Analysis: Variance Estimation (IE680)
Jong-hyun Ryu
Contents
• Motivation (Quality Control Chart)
• Collecting output data
– Type of simulations
– Steady-State Simulation
– Stochastic Stationary Process
• Variance Estimation Methods
– Replication Method
– Batch means methods
• Non-overlapping vs. Overlapping
– Standardized Time Series
• Additional Methods
Motivation
• Quality Control Chart
– Simple Control Chart (Shewhart Control Chart)
• Design of Shewhart Control Chart
– Control Limits
• UCL & LCL
$$\mathrm{UCL} = \bar{x} + 3\hat{\sigma}, \qquad \mathrm{LCL} = \bar{x} - 3\hat{\sigma}$$
• Results from incorrect control limits
– If overestimated → detection delay
– If underestimated → high false alarm rate
– Estimating process variance (standard deviation) is critical.
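The control limits above can be sketched in a few lines. This is only an illustration: the function name `shewhart_limits` and the sample data are hypothetical, and the naive sample standard deviation used here is appropriate only for iid data, which is exactly the assumption the rest of these slides relax.

```python
import statistics

def shewhart_limits(samples, width=3.0):
    """Return (LCL, center, UCL) = x_bar -/+ width * sigma_hat."""
    x_bar = statistics.mean(samples)
    sigma_hat = statistics.stdev(samples)  # naive estimator, assumes iid data
    return x_bar - width * sigma_hat, x_bar, x_bar + width * sigma_hat

# Hypothetical in-control measurements
lcl, center, ucl = shewhart_limits([9.0, 10.0, 11.0, 10.0, 10.0])
print(lcl, center, ucl)
```

If sigma_hat is too small (as happens with positively autocorrelated data), the limits are too tight and the false alarm rate rises.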
Type of simulations
• Terminating vs. non-terminating simulations
• Terminating simulations
– Runs for some duration of time TE, where E is a specified event that stops the simulation.
– Starts at time 0 under well-specified initial conditions.
– Ends at the stopping time TE.
– Bank example: Opens at 8:30 am (time 0) with no customers present and 8 of the 11 tellers working (initial conditions), and closes at 4:30 pm (TE).
– The simulation analyst chooses to consider it a terminating system because the object of interest is one day’s operation.
Type of simulations
• Non-terminating simulation
– Runs continuously, or at least over a very long period of time.
– Examples: assembly lines that shut down infrequently, telephone systems, hospital emergency rooms.
– Initial conditions defined by the analyst.
– Runs for some analyst-specified period of time TE.
– Study the steady-state (long-run) properties of the system, properties that are not influenced by the initial conditions of the model.
• Terminating or non-terminating? The choice depends on:
– The objectives of the simulation study
– The nature of the system.
Output Analysis for Steady-State Simulation
• Consider a single run of a simulation model to estimate a steady-state or long-run characteristic of the system.
– The single run produces observations Y1, Y2, ... (generally the samples of an autocorrelated time series).
• Performance measures:
– Point estimator and confidence interval
– Independent of the initial conditions
Output Analysis for Steady-State Simulation
• The sample size is a design choice, with several considerations in mind:
– Any bias in the point estimator that is due to artificial or arbitrary initial conditions (bias can be severe if run length is too short).
– Desired precision of the point estimator.
– Budget constraints on computer resources.
Initialization Bias
• Methods to reduce the point-estimator bias caused by using artificial and unrealistic initial conditions:
– Intelligent initialization.
– Divide the simulation into an initialization phase and a data-collection phase.
• Intelligent initialization
– Initialize the simulation in a state that is more representative of long-run conditions.
– If the system exists, collect data on it and use these data to specify more nearly typical initial conditions.
– If the system can be simplified enough to make it mathematically solvable, e.g. queueing models, solve the simplified model to find long-run expected or most likely conditions, and use them to initialize the simulation.
Initialization Bias
• No widely accepted, objective and proven technique to guide how much data to delete to reduce initialization bias to a negligible level.
• Plots can be misleading but they are still recommended.
– Averages across replications reveal a smoother and more precise trend as the number of replications increases.
– Averages can be smoothed further by plotting a moving average.
– Cumulative average becomes less variable as more data are averaged.
– The more correlation present, the longer it takes for the averages to approach steady state.
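The three kinds of plot data mentioned above (averages across replications, a moving average, and a cumulative average) can be computed as follows. This is a minimal sketch: the function names, the window half-width `w`, and the toy data are all hypothetical choices, not taken from the slides.

```python
def ensemble_averages(runs):
    """Average observation j across replications: one value per time index."""
    return [sum(col) / len(col) for col in zip(*runs)]

def moving_average(ys, w):
    """Centered moving average with half-width w (window of 2w + 1 points)."""
    return [sum(ys[i - w:i + w + 1]) / (2 * w + 1) for i in range(w, len(ys) - w)]

def cumulative_average(ys):
    """Running mean of ys[0..i] for each i."""
    out, total = [], 0.0
    for i, y in enumerate(ys, start=1):
        total += y
        out.append(total / i)
    return out

# Two hypothetical replications that both start in an empty (biased) state
runs = [[0.0, 2.0, 4.0, 4.0, 4.0], [0.0, 4.0, 4.0, 6.0, 4.0]]
avg = ensemble_averages(runs)
smooth = moving_average(avg, 1)
cum = cumulative_average(avg)
```

Plotting `avg`, `smooth`, and `cum` against the observation index is one way to judge by eye where the initial transient ends.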
Introduction to variance estimation
• Simulation output analysis
– Point estimator and confidence interval
– Variance estimation → confidence interval
• Independent and identically distributed (IID)
– Suppose X1, …, Xm are iid
Stochastic Stationary Process
• Stationary time series with positive autocorrelation
• Stationary time series with negative autocorrelation
• Nonstationary time series with an upward trend
The stochastic process X is stationary if, for all t1, …, tk, t ∈ T,
$$(X_{t_1+t}, \ldots, X_{t_k+t}) \overset{d}{=} (X_{t_1}, \ldots, X_{t_k}),$$
where $\overset{d}{=}$ denotes equality in distribution.
Stochastic Stationary Process (2)
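• A discrete-time stationary process $X = \{X_i : i \ge 1\}$ with mean $\mu$ and variance $\sigma_X^2 = \mathrm{Cov}(X_i, X_i)$
• Variance of the sample mean
$$\mathrm{Var}(\bar{X}_n) = \frac{1}{n}\left[\sigma_X^2 + 2\sum_{j=1}^{n-1}\left(1 - \frac{j}{n}\right)\mathrm{Cov}(X_1, X_{1+j})\right]$$
• Easy to see
$$\sigma^2 \equiv \lim_{n\to\infty} n\,\mathrm{Var}(\bar{X}_n) = \sigma_X^2 + 2\sum_{j=1}^{\infty}\mathrm{Cov}(X_1, X_{1+j})$$
• For iid, the covariance terms vanish. In general, the sample variance $S_n^2(X)$ satisfies
$$E\left[\frac{S_n^2(X)}{n}\right] = \frac{1}{n}\left[\sigma_X^2 - \frac{2}{n-1}\sum_{j=1}^{n-1}\left(1 - \frac{j}{n}\right)\mathrm{Cov}(X_1, X_{1+j})\right]$$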
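To see how these formulas behave numerically, the sketch below evaluates the variance of the sample mean and the expected value of S_n^2(X)/n for a given autocovariance function. The geometric autocovariance `ar1_cov` (lag-j covariance phi^j) is an assumed example, not something specified in the slides, and the function names are hypothetical.

```python
def var_sample_mean(n, sigma_x2, cov):
    """Var(X_bar_n) = (1/n) [sigma_X^2 + 2 sum_{j=1}^{n-1} (1 - j/n) cov(j)]."""
    s = sum((1 - j / n) * cov(j) for j in range(1, n))
    return (sigma_x2 + 2 * s) / n

def expected_s2_over_n(n, sigma_x2, cov):
    """E[S_n^2 / n] = (1/n) [sigma_X^2 - (2/(n-1)) sum_{j=1}^{n-1} (1 - j/n) cov(j)]."""
    s = sum((1 - j / n) * cov(j) for j in range(1, n))
    return (sigma_x2 - 2 * s / (n - 1)) / n

phi, sigma_x2, n = 0.5, 1.0, 1000

def ar1_cov(j):
    # Assumed geometric autocovariance: Cov(X_1, X_{1+j}) = sigma_X^2 * phi^j
    return sigma_x2 * phi ** j

v = var_sample_mean(n, sigma_x2, ar1_cov)
e = expected_s2_over_n(n, sigma_x2, ar1_cov)
n_var = n * v   # approaches sigma_X^2 (1 + phi) / (1 - phi) = 3 as n grows
print(n_var, e / v)
```

With positive correlation the ratio `e / v` is well below 1, which is the underestimation discussed on the next slide.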
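Stochastic Stationary Process (3)
• The expected value of the variance estimator is:
$$E\left[\frac{S_n^2(X)}{n}\right] = \frac{n/a_n - 1}{n - 1}\,\mathrm{Var}(\bar{X}_n), \qquad \text{where } a_n = \frac{n\,\mathrm{Var}(\bar{X}_n)}{\sigma_X^2},$$
so that
$$E\left[\frac{S_n^2(X)}{n}\right] < \mathrm{Var}(\bar{X}_n) \quad \text{when positively correlated.}$$
– If Xi are independent, then $S_n^2(X)/n$ is an unbiased estimator of $\mathrm{Var}(\bar{X}_n)$
– If the autocorrelation is positive, then $S_n^2(X)/n$ is biased low as an estimator of $\mathrm{Var}(\bar{X}_n)$
– If the autocorrelation is negative, then $S_n^2(X)/n$ is biased high as an estimator of $\mathrm{Var}(\bar{X}_n)$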
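A short worked example may make the size of this bias concrete. It assumes, purely for illustration, a geometric autocorrelation $\rho_j = \phi^j$ (as in an AR(1) process); this choice is not from the slides. Writing $a_n = n\,\mathrm{Var}(\bar{X}_n)/\sigma_X^2$:

```latex
a_n = 1 + 2\sum_{j=1}^{n-1}\left(1 - \frac{j}{n}\right)\phi^{j}
\;\longrightarrow\; \frac{1+\phi}{1-\phi} \quad (n \to \infty),
\qquad
\frac{E\left[S_n^2(X)/n\right]}{\mathrm{Var}(\bar{X}_n)}
= \frac{n/a_n - 1}{n-1} \approx \frac{1}{a_n} \quad \text{for large } n.

% phi = 0.9:   a_n -> 1.9 / 0.1 = 19,   so S_n^2/n recovers only about 1/19 of Var(X_bar_n)
% phi = -0.9:  a_n -> 0.1 / 1.9 = 1/19, so S_n^2/n overstates Var(X_bar_n) by a factor of about 19
```

Under this assumed correlation structure, a confidence interval built from $S_n^2(X)/n$ with $\phi = 0.9$ would be far too narrow.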
Replication Method
• Used to estimate point-estimator variability and to construct a confidence interval.
• Approach: make R replications, initializing and deleting from each one the same way.
• Important to do a thorough job of investigating the initial-condition bias:
– Bias is not affected by the number of replications; it is affected only by deleting more data or extending the length of each run (i.e. increasing TE).
• Basic raw output data {Xrj, r = 1, ..., R; j = 1, …, n} is derived by:
– Individual observation from within replication r.
– Batch mean from within replication r of some number of discrete-time observations.
– Batch mean of a continuous-time process over time interval j.
Replication Method
• Length of each replication (n) beyond deletion point (d): (n - d) > 10d
• Number of replications (R) should be as many as time permits, up to about 25 replications.
• For a fixed total sample size (n), as fewer data are deleted (d decreases):
– The C.I. shifts: greater bias.
– The variance decreases.
– There is a trade-off between bias and variance.
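The replication method can be sketched as below, assuming the usual t-based interval on the replication means (the slides do not spell out the interval formula). The function name `replication_ci` is hypothetical, and the t quantile is supplied by the caller because the Python standard library has no t distribution.

```python
import math
import statistics

def replication_ci(runs, d, t_quantile):
    """Point estimate and CI half-width from R independent replications.

    runs: list of R lists of observations
    d: number of initial observations deleted from every replication
    t_quantile: t_{R-1, 1-alpha/2}, looked up by the caller
    """
    rep_means = [statistics.mean(run[d:]) for run in runs]
    point = statistics.mean(rep_means)
    half_width = t_quantile * statistics.stdev(rep_means) / math.sqrt(len(rep_means))
    return point, half_width

# Three hypothetical replications; the first d = 2 observations are warm-up
runs = [[0.0, 0.0, 4.0, 6.0], [0.0, 0.0, 5.0, 7.0], [0.0, 0.0, 6.0, 8.0]]
point, half_width = replication_ci(runs, d=2, t_quantile=4.303)  # t_{2, 0.975}
print(point, half_width)
```

Because the replications are independent, the replication means can be treated as iid even though observations within a run are correlated.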
Batch means method
– When using a single, long replication:
– Problem:
• Data are dependent so the usual estimator is biased
• Replication method (more than 25 replications) is expensive
– One solution: Batch means method
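Batch means method (Nonoverlapping batch mean)
• Non-overlapping batch mean (NBM)
– The $n = km$ observations are split into $k$ batches of $m$ observations; batch $i$ has batch mean $Y_{i,m}$, and the grand mean is $\bar{X}_n$:
$$\underbrace{X_1, \ldots, X_m}_{\text{Batch 1: } Y_{1,m}},\ \underbrace{X_{m+1}, \ldots, X_{2m}}_{Y_{2,m}},\ \ldots,\ \underbrace{X_{(i-1)m+1}, \ldots, X_{im}}_{Y_{i,m}},\ \ldots,\ \underbrace{X_{(k-1)m+1}, \ldots, X_{km}}_{\text{Batch } k:\ Y_{k,m}}$$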
Nonoverlapping batch mean (NBM)
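• Suppose the batch means become uncorrelated as $m \to \infty$
$$n\,\mathrm{Var}(\bar{X}_n) \approx \frac{n\,\mathrm{Var}(Y_{i,m})}{k} = m\,\mathrm{Var}(Y_{i,m})$$
• NBM estimator for $\sigma^2$
$$\hat{V}_B(k, m) = \frac{m}{k-1}\sum_{i=1}^{k}\left(Y_{i,m} - \bar{X}_n\right)^2$$
• Confidence Interval
$$\bar{X}_n \pm t_{k-1,\,1-\alpha/2}\sqrt{\frac{\hat{V}_B(k, m)}{n}}$$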
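The NBM estimator and its confidence interval translate directly into code. The sketch below is illustrative: the function names are hypothetical, any incomplete final batch is simply ignored (an assumption), and the t quantile is passed in by the caller.

```python
import math

def batch_means(x, m):
    """Means of the k = len(x) // m nonoverlapping batches of size m."""
    k = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(k)]

def nbm_variance(x, m):
    """V_B(k, m) = m / (k - 1) * sum_i (Y_{i,m} - X_bar_n)^2."""
    y = batch_means(x, m)
    k = len(y)
    grand = sum(y) / k
    return m * sum((yi - grand) ** 2 for yi in y) / (k - 1)

def nbm_ci(x, m, t_quantile):
    """X_bar_n -/+ t_{k-1, 1-alpha/2} * sqrt(V_B / n), with n = k * m."""
    y = batch_means(x, m)
    k = len(y)
    grand = sum(y) / k
    half = t_quantile * math.sqrt(nbm_variance(x, m) / (k * m))
    return grand - half, grand + half

x = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]           # toy data: batch means 2, 3, 4
v_b = nbm_variance(x, m=2)
lo, hi = nbm_ci(x, m=2, t_quantile=4.303)    # t_{2, 0.975}
print(v_b, lo, hi)
```

In practice m must be large enough for the batch means to be nearly uncorrelated; the toy batch size here is only for checking the arithmetic.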
Overlapping Batch Mean (OBM)
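• Batches overlap: batch $i$ covers $X_i, \ldots, X_{i+m-1}$, so
$$Y_{1,m}:\ X_1, X_2, X_3, \ldots, X_m \qquad\qquad Y_{2,m}:\ X_2, X_3, \ldots, X_m, X_{m+1}$$
• OBM estimator for $\sigma^2$
$$\hat{V}_O(m) = \frac{nm}{(n-m+1)(n-m)}\sum_{i=1}^{n-m+1}\left(Y_{i,m} - \bar{X}_n\right)^2$$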
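A minimal sketch of the OBM estimator follows. The function name `obm_variance` is hypothetical; a sliding window sum is used so that all n - m + 1 overlapping batch means are obtained in a single pass.

```python
def obm_variance(x, m):
    """V_O(m) = n*m / ((n-m+1)(n-m)) * sum_{i=1}^{n-m+1} (Y_{i,m} - X_bar_n)^2."""
    n = len(x)
    x_bar = sum(x) / n
    window = sum(x[:m])                       # sum of the first batch X_1..X_m
    total = (window / m - x_bar) ** 2
    for i in range(1, n - m + 1):
        window += x[i + m - 1] - x[i - 1]     # slide the batch by one observation
        total += (window / m - x_bar) ** 2
    return n * m * total / ((n - m + 1) * (n - m))

x = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]   # overlapping batch means: 2, 2.5, 3, 3.5, 4
v_o = obm_variance(x, m=2)
print(v_o)
```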
NBM vs. OBM
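• Under mild conditions
$$E[\hat{V}_B(k, m)] = \sigma^2 + \frac{\gamma\,(k+1)}{n} + o(1/n), \qquad E[\hat{V}_O(m)] = \sigma^2 + \frac{\gamma}{m} + o(1/m),$$
where $\gamma$ is a constant that depends on the autocovariances of $X$.
– Thus, both have similar bias
• Variance of the estimators
$$\frac{\mathrm{Var}(\hat{V}_B(k, m))}{\mathrm{Var}(\hat{V}_O(m))} \to \frac{3}{2} \quad \text{as } k, m \to \infty$$
– Thus, the OBM method gives better MSE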
Standardized Time Series
• Define the ‘centered’ partial sums of Xi as
• Central Limit Theorem
• Define the continuous time process
Question: How does Zn(t) behave as n increases?
[Figures: plots of Zn(t) for n = 100, n = 1000, n = 10000, and n = 1,000,000]
Functional Central Limit Theorem (FCLT)
$$Z_n(t) \equiv \frac{\sum_{i=1}^{\lfloor nt \rfloor} X_i - nt\mu}{\sigma\sqrt{n}} \Rightarrow B(t) \quad \text{as } n \to \infty \quad (0 \le t \le 1),$$
where B(t) is the standard Brownian motion (drift coefficient = 0, diffusion coefficient = 1)
• Brownian motion with drift coefficient $\mu$ and diffusion coefficient $\sigma^2$ is a real-valued stochastic process with stationary and independent increments having continuous sample paths, where
$$B(t_2) - B(t_1) \sim \mathrm{Normal}\left(\mu(t_2 - t_1),\ \sigma^2(t_2 - t_1)\right)$$
• Thus, approximately for large $n$,
$$Z_n(t) = Z_n(t) - Z_n(0) \sim \mathrm{Normal}(0, t)$$
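The process Z_n(t) is simple to evaluate on a grid of time points. The sketch below assumes the mean `mu` and the standard deviation `sigma` (the square root of the variance parameter) are known, as the definition requires; the function name `standardized_path` and the toy data are hypothetical.

```python
import math

def standardized_path(x, mu, sigma, t):
    """Z_n(t) = (X_1 + ... + X_floor(n t) - n t mu) / (sigma sqrt(n)), 0 <= t <= 1."""
    n = len(x)
    partial = sum(x[:math.floor(n * t)])   # partial sum of the first floor(n t) terms
    return (partial - n * t * mu) / (sigma * math.sqrt(n))

x = [1.0, -1.0, 1.0, -1.0]
z = [standardized_path(x, mu=0.0, sigma=1.0, t=i / 4) for i in range(5)]
print(z)
```

Evaluating this on a fine grid for increasing n, with simulated data, is one way to reproduce the kind of plots that illustrate convergence to Brownian motion.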
Functional Central Limit Theorem (FCLT)
• While the CLT says that, for any fixed t, Zn(t) is approximately Normal(0, t) for large n, the FCLT also shows that Zn(t) converges to a stochastic process with continuous sample paths and independent increments.
The Weighted Area Estimator
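For a weight function $f(t)$ on $[0, 1]$ normalized so that
$$\mathrm{Var}\left(\int_0^1 f(t)\,B(t)\,dt\right) = 1,$$
we have
$$\int_0^1 f(t)\,B(t)\,dt \sim N(0, 1)$$
and
$$A(f) \equiv \left[\sigma\int_0^1 f(t)\,B(t)\,dt\right]^2 \sim \sigma^2\chi_1^2, \qquad E[A(f)] = \sigma^2.$$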
The Weighted Area Estimator
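• Since
$$Z_n(t) = \frac{\sum_{i=1}^{\lfloor nt \rfloor} X_i - nt\mu}{\sigma\sqrt{n}}$$
and
$$A(f; n) \equiv \left[\frac{1}{n}\sum_{i=1}^{n} f\!\left(\frac{i}{n}\right)\sigma Z_n\!\left(\frac{i}{n}\right)\right]^2 \to A(f) = \left[\sigma\int_0^1 f(t)\,B(t)\,dt\right]^2 \quad \text{as } n \to \infty,$$
$$A(f; n) \xrightarrow{d} A(f) \sim \sigma^2\chi_1^2, \qquad E[A(f)] = \sigma^2,$$
$A(f; n)$ is a variance estimator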
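The weighted area estimator can be computed without knowing sigma, because sigma * Z_n(i/n) = (X_1 + ... + X_i - i*mu) / sqrt(n). The sketch below assumes the mean `mu` is known, as in the definition of Z_n(t). The function name is hypothetical, and the constant weight f(t) = sqrt(3) is an assumed example that satisfies the normalization, since Var(∫ B(t) dt) over [0, 1] equals 1/3.

```python
import math

def weighted_area_estimator(x, mu, f):
    """A(f; n) = [ (1/n) * sum_{i=1}^n f(i/n) * sigma*Z_n(i/n) ]^2."""
    n = len(x)
    partial, area = 0.0, 0.0
    for i, xi in enumerate(x, start=1):
        partial += xi - mu                          # centered partial sum S_i - i*mu
        area += f(i / n) * partial / math.sqrt(n)   # f(i/n) * sigma * Z_n(i/n)
    return (area / n) ** 2

def constant_weight(t):
    # Assumed example weight: constant sqrt(3), normalized for Brownian motion
    return math.sqrt(3.0)

a_unit = weighted_area_estimator([1.0, -1.0], mu=0.0, f=lambda t: 1.0)
a_const = weighted_area_estimator([1.0, -1.0], mu=0.0, f=constant_weight)
print(a_unit, a_const)
```

A single A(f; n) behaves like a scaled chi-square variable with one degree of freedom, so any one value is a very noisy estimate of the variance parameter.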
Additional Methods
• Regenerative Method (Fishman 1973; Shedler 1993)
• Spectral Theory (Heidelberger and Welch 1981, 1983; Damerdji 1991)
• Quantile estimation (Heidelberger and Lewis 1984; Seila 1982)
• Estimation of functions of means (Munoz and Glynn 1997)
• Estimation of multivariate means (Anderson 1984; Charnes 1989, 1995; Seila 1984)
Reference
• Alexopoulos, C., D. Goldsman, and R. F. Serfozo. 2005. Stationary processes: statistical estimation. In The Encyclopedia of Statistical Sciences, ed. N. L. Johnson and C. B. Read, 2nd ed. New York: John Wiley & Sons.
• Kim, S.-H., C. Alexopoulos, K.-L. Tsui, and J. R. Wilson. 2006. A distribution-free tabular CUSUM chart for autocorrelated data. IIE Transactions, forthcoming.
• Alexopoulos, C., N. T. Argon, D. Goldsman, and G. Tokol. 2004. Overlapping variance estimators for simulations. Technical Report, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia.
• Law, A. M., and W. D. Kelton. 2000. Simulation Modeling and Analysis, 3rd ed. New York: McGraw-Hill.