Fakultät Informatik, Institut für Angewandte Informatik, Professur Modellierung und Simulation
Dresden, 22.11.2011
Introduction to Simulation
Dr. Christoph Laroque
Summer 2012
05.06.2012
Introduction to Simulation, Slide 2
ANALYSIS & EXPERIMENTAL DESIGN
Slide 3
Analysis of simulation models
Terms:
• An experimental design consists of a set of several experiments
• Each experiment consists of several replications (or runs)
Main problems:
• Number of replications per experiment
• Length of a single replication
Slide 4
Types of simulation with respect to results
Terminating simulation
• A natural event determines the end of the simulation
• The system typically starts and ends "empty"
• Examples: bank, restaurant
Slide 5
Types of simulation with respect to results
Non-terminating simulation
• No natural end of the simulation
• The system must "settle" for a while (steady state)
• Steady-state parameters
  • Distributions of the system parameters are stable
  • Examples: factory, communication networks
• Steady-state cycle parameters
  • System parameters repeat in a cycle
  • Example: call center with a typical arrival pattern over a day or a week
• Other parameters
  • Parameters remain unstable
Slide 6
Main problems in detail
Terminating simulation models
• The length of a replication is fixed
• Only the number of replications must be determined
Non-terminating models without cycles
• Problem 1: the length of the warm-up phase is unknown
• Problem 2: the measurements are correlated
• General: length and number of replications
Non-terminating models with cycles
• Problems 1 and 2 as above, plus Problem 3: the cycle length is unknown
• General: length and number of replications
Slide 7
Analysis of Simulation Results
The variance of the mean values determines the quality of the results:
• Consequence: more experiments and runs must be computed
• Problem: correlated measurements (they influence the variance)
• But: high correlation within a run, low correlation between runs
• For non-terminating systems: additional problem of the "empty" system:
  • Measurements at the beginning are not typical, but they affect the total evaluation
  • How long does it take to warm up?
Slide 8
Solution principle
"To achieve better statistical performance, i.e., less variance in the computed performance estimates, more measurements have to be collected"
• Increase the number of runs/replications (works for all types of simulations); input data must be available
• Increase the length of a single run (only useful for non-terminating simulations)
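The effect of collecting more replications can be illustrated with a small experiment. This is a minimal sketch and not part of the original slides: `replication` is a hypothetical stand-in for a simulation run (it only averages exponential random numbers), and the standard error of the overall mean is estimated for a growing number of replications.

```python
import math
import random
import statistics

def replication(seed, n_obs=200):
    """Hypothetical stand-in for one simulation run: mean of n_obs random 'waiting times'."""
    rng = random.Random(seed)
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n_obs))

def standard_error(n_reps):
    """Estimated standard deviation of the mean over n_reps independent replications."""
    means = [replication(seed) for seed in range(n_reps)]
    return statistics.stdev(means) / math.sqrt(n_reps)

# The standard error shrinks roughly like 1 / sqrt(n_reps).
for n in (10, 40, 160):
    print(n, round(standard_error(n), 4))
```

Since the standard error falls with the square root of the number of replications, roughly four times as many replications are needed to halve it.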
Slide 9
Length of the warm-up
Determining the length of the warm-up:
• Best done graphically; theoretical methods are not very robust
• Take measurements from 10 to 20 long test runs
• Evaluate the averages across all runs (no averaging of the values within a single run!)
• Compute a "moving average" with window length w over the entire series
• Test different w until a useful curve results
• If no curve can be identified: make 10 more runs and repeat the procedure
• Determine the index / time at which the system is stable
• In subsequent measurements, delete the values up to that index
Slide 10
Length of the warm-up
[Figure: warm-up period]
Slide 11
Result analysis
Measurement method: Replicate/Delete
• Determine the warm-up, replicate the simulation runs (Replicate) and delete the values of the warm-up period (Delete)
• Run length: a multiple of the warm-up length
• Pros: simple, robust to correlation
• Cons: the warm-up must be determined correctly; a lot of "useless" measurements
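A minimal sketch of Replicate/Delete, under stated assumptions: `run` is a hypothetical simulation whose output settles near 10, the warm-up length of 200 observations is assumed to be known already, and the t quantile is taken from a table (2.262 for 9 degrees of freedom and a 95% interval).

```python
import math
import random
import statistics

def run(seed, length=1000):
    """Hypothetical simulation run whose output settles near 10 after a warm-up."""
    rng = random.Random(seed)
    return [10 * (1 - 0.97 ** i) + rng.gauss(0, 1) for i in range(length)]

def replicate_delete(n_reps, warm_up, t_quantile):
    """Confidence interval from the means of independent, truncated replications."""
    means = [statistics.fmean(run(seed)[warm_up:]) for seed in range(n_reps)]
    center = statistics.fmean(means)
    half_width = t_quantile * statistics.stdev(means) / math.sqrt(n_reps)
    return center - half_width, center + half_width

low, high = replicate_delete(n_reps=10, warm_up=200, t_quantile=2.262)
print(round(low, 3), round(high, 3))
```

Because each replication contributes exactly one value (its truncated mean), the values are independent, which is why the method is robust to correlation within a run.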
Slide 12
Result analysis
Measurement method: Batch means
• Determine the warm-up; compute one very long simulation run and delete the measurements of the warm-up
• Regroup the remaining measurements into intervals (batches)
• Pros: only one warm-up; robust against a wrongly determined warm-up
• Cons: the interval length and the number of intervals are difficult to determine; the batch averages are correlated, especially consecutive ones
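A minimal sketch of the batch-means method, under stated assumptions: `long_run` is a hypothetical, autocorrelated output series (an AR(1) process around 10), and the warm-up length, the number of batches and the t quantile (2.093 for 19 degrees of freedom, 95% interval) are chosen for illustration only.

```python
import math
import random
import statistics

def long_run(seed, length, phi=0.8):
    """Hypothetical correlated output: AR(1) process fluctuating around 10."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for _ in range(length):
        x = phi * x + rng.gauss(0, 1)
        series.append(10 + x)
    return series

def batch_means(series, warm_up, n_batches):
    """Delete the warm-up, then average the rest in n_batches equal intervals."""
    data = series[warm_up:]
    size = len(data) // n_batches
    return [statistics.fmean(data[b * size : (b + 1) * size]) for b in range(n_batches)]

def lag1_correlation(values):
    """Correlation between consecutive batch means (ideally close to 0)."""
    m = statistics.fmean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    return num / sum((v - m) ** 2 for v in values)

means = batch_means(long_run(seed=1, length=20500), warm_up=500, n_batches=20)
half_width = 2.093 * statistics.stdev(means) / math.sqrt(len(means))
print(round(statistics.fmean(means), 3), "+/-", round(half_width, 3))
print("lag-1 correlation of batch means:", round(lag1_correlation(means), 3))
```

Checking the lag-1 correlation of the batch means is one simple way to see whether the batches are long enough; if it is clearly positive, the interval is too optimistic and longer batches are needed.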
Slide 13
KPI: Typical results and their graphics
From your point of view:
• What might typical results of a simulation study look like?
• How should they be presented?
Slide 14
KPI: Analysis of quantiles / box plots
For q (0 < q < 1), the q-quantile of a distribution function F(x) is the smallest value x_q for which F(x_q) >= q holds (for continuous F: F(x_q) = q)
Important quantiles: quartiles
• First quartile = 25% quantile
• Second quartile = median = 50% quantile
• Third quartile = 75% quantile
A good summary of the distribution is the 5-point box plot: minimum, maximum and the 3 quartiles
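The five values behind a box plot can be computed with the Python standard library. This is a small sketch with made-up waiting times; note that `statistics.quantiles` interpolates between data points, so for small samples its quartiles can differ slightly from the "smallest value" definition above.

```python
import statistics

def five_point_summary(values):
    """Minimum, the three quartiles and maximum of a sample."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

waiting_times = [2.1, 3.4, 1.7, 8.9, 4.2, 3.8, 2.9, 5.5, 3.1]  # hypothetical data
print(five_point_summary(waiting_times))
```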
Slide 15
Introduction to experimental design
Task:
• Optimize a system with different inputs and outputs by means of simulation
Questions:
• How should the experiments be designed?
• Which optimization methods can be used?
• Do the parameters have any influence on the solution of the problem?
A more difficult problem ...
Slide 16
Introduction to experimental design
Input parameters and structural assumptions are called factors; output performance measures are called responses
Factors can be quantitative or qualitative
• Quantitative factors take on numerical values
• Qualitative factors often represent structural assumptions and cannot be quantified naturally
• The distinction is not clear for all factors
Additional classification: controllable / uncontrollable
• Depends on what can be changed in the real system
• May also depend on the particular situation
Slide 17
Importance of the design of the simulation
• The design of an experiment decides a priori which configurations will be tested with which simulation effort.
• The effort should of course be minimal, and the results as good as possible ...
• An intelligent design is much more efficient than hit-or-miss!
• Today: some methods to
  a. identify, among many alternatives, those to be examined in more detail, and
  b. finally arrive at a quasi-optimal solution
Slide 18
Note on the Design of Experiments
• In simulation, planned experiments have significant advantages over "normal" experimentation
• Factors that cannot be controlled in reality can be controlled in the model
• Chance is controlled through the properties of the random number generators
• The conditions of the experiments do not have to be randomized, because there are no systematic error sources to eliminate, in contrast to, for example, a biological laboratory (what is representative? ;-) )
• (Assuming the random number generator does what it should!)
Slide 19
2^k factorial design
• For a single output measure, as before: n replications for a given factor level, confidence interval, evaluation via graph, etc.
• Repeat this for all m factor levels to be examined (i.e. at least m * n replications): done, but only for one factor!
• Now given: k factors (k > 1)
• Questions:
  1. How does a single factor influence the response of the system?
  2. How do the factors influence each other?
• Goal: examine the system as efficiently as possible!
Slide 20
2^k factorial design
Suggestion 1: Fix all but one factor and try it out
Fix k - 1 factors and make n replications for different levels of factor k
But:
• A lot of replications are needed
• Measuring factor interactions is impossible
• It even implicitly assumes that there are no interactions between the factors
Suggestion 2: 2^k factorial design
Choose 2 levels for each factor and replicate n times for all 2^k factor combinations (so-called design points)
Slide 21
2^k factorial design
Notation: the first level of a factor is denoted by "-", the second by "+"
The "+" and "-" levels should be clearly different from each other:
• But not so far apart that they become unrealistic
• And close enough to each other not to miss intermediate behavior that may occur
Slide 22
2^k factorial design
Design matrix for k = 3 factors
Design point   Factor 1   Factor 2   Factor 3   Response
1              -          -          -          R1
2              +          -          -          R2
3              -          +          -          R3
4              +          +          -          R4
5              -          -          +          R5
6              +          -          +          R6
7              -          +          +          R7
8              +          +          +          R8
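The design matrix can be generated for any k. A minimal sketch (the function name `design_matrix` is chosen for illustration) that reproduces the row order of the table, with factor 1 alternating fastest:

```python
from itertools import product

def design_matrix(k):
    """All 2^k combinations of '-' and '+'; factor 1 alternates fastest."""
    return [tuple(reversed(levels)) for levels in product("-+", repeat=k)]

for point, levels in enumerate(design_matrix(3), start=1):
    print(point, *levels, f"R{point}")
```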
Slide 23
2^k factorial design
• The response of the system is itself random, i.e. the variances of the effects must be estimated in order to rule out random fluctuations
• Simple approach: replicate the complete design n times
• Attention: statistical significance does not imply practical relevance!
• A linear regression model can be fitted to the responses.
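How effects are computed is not spelled out on the slides; the sketch below uses the usual textbook definition (average response at "+" minus average response at "-") as an assumption, with made-up responses for k = 2. For coded factor levels -1/+1, the coefficients of the linear regression model are the overall mean and half of each effect.

```python
from itertools import product

SIGN = {"-": -1, "+": +1}

def design_matrix(k):
    """All 2^k combinations of '-' and '+'; factor 1 alternates fastest."""
    return [tuple(reversed(levels)) for levels in product("-+", repeat=k)]

def main_effect(design, responses, factor):
    """Average response at '+' minus average response at '-' (factor is 0-based)."""
    half = len(design) / 2
    return sum(SIGN[row[factor]] * r for row, r in zip(design, responses)) / half

def interaction_effect(design, responses, f1, f2):
    """Two-factor interaction: signs of both factors multiplied."""
    half = len(design) / 2
    return sum(SIGN[row[f1]] * SIGN[row[f2]] * r for row, r in zip(design, responses)) / half

design = design_matrix(2)
responses = [10.0, 14.0, 12.0, 20.0]      # hypothetical averaged responses R1..R4
e1 = main_effect(design, responses, 0)
e2 = main_effect(design, responses, 1)
e12 = interaction_effect(design, responses, 0, 1)
b0 = sum(responses) / len(responses)
print(f"R = {b0} + {e1 / 2}*x1 + {e2 / 2}*x2 + {e12 / 2}*x1*x2")
```

In a replicated design, the same calculation would be repeated per replication, which yields n observations of each effect and therefore an estimate of its variance.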
Slide 24
2^k factorial design
Concluding remarks
• In general, no statement can be made about factor levels other than the observed ones, because the results depend directly on the absolute level and on the relative distance between "-" and "+"
• There is no rule for which levels are to be selected as "-" and "+"
• The response will almost always be nonlinear, which must be taken into account when analyzing the results!
Slide 25
Analysis of "many" factors
• Example: k = 11, so 2^k = 2048 design points; with only n = 5 replications per point, each taking 1 min, this gives 10240 min, i.e. about 1 week around the clock!
• Ergo: form an intelligent subset of the experiments that has almost the same significance: fractional factorial design
• Or: filter out irrelevant factors (no influence) as quickly as possible and then fix them at an average or a plausible level
• The fewer factors have to be considered, the higher the chance of later finding an optimum by simulation!
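The arithmetic of the example can be checked quickly. In this sketch the parameter `fraction_exponent` is a hypothetical addition to show the saving of a fractional design with 2^(k-p) design points; the value p = 4 is chosen only for illustration.

```python
def total_minutes(k, n_replications, minutes_per_replication, fraction_exponent=0):
    """Simulation time for a 2^(k-p) design with n replications per design point."""
    design_points = 2 ** (k - fraction_exponent)
    return design_points * n_replications * minutes_per_replication

full = total_minutes(k=11, n_replications=5, minutes_per_replication=1)
print(full, "min =", round(full / (60 * 24), 1), "days")

fraction = total_minutes(11, 5, 1, fraction_exponent=4)
print(fraction, "min =", round(fraction / 60, 1), "hours")
```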
Slide 26
Response-Surface-Method
• A simulation can be regarded as a stochastic vector function
• It is usually very expensive to evaluate, because the calculation involves multiple replications of a comprehensive model
• But: sometimes it can be approximated relatively simply
• The approximation can then be used as a model of the simulation
  • It creates an understanding of the simulation model
  • Optima can certainly be searched for in the simple approximation!
Slide 27
Response-Surface-Method - Example
Consider an inventory model: the average cost C per month is a function of the inputs s and d:
C = R(s, d)
R is stochastic, unknown, and possibly quite nasty
• A complete simulation is required for each evaluation
Plot of the response surface and its contour lines (see next slide)
• 420 combinations (s = 0, 5, 10, ..., 100; d = 5, 10, ..., 100)
• The function is described numerically with relatively high precision, but without an algebraic formula
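The grid of 420 combinations can be evaluated as sketched below. This is illustration only: `simulate_cost` is a made-up stand-in for one replication of the inventory simulation (its formula and noise level are assumptions, not the model from the slides), and 5 replications per grid point are assumed.

```python
import random
import statistics

def simulate_cost(s, d, rng):
    """Hypothetical stand-in for one replication of the inventory simulation."""
    return 110 + 0.02 * (s - 25) ** 2 + 0.03 * (d - 37) ** 2 + rng.gauss(0, 2)

def response_surface(n_reps=5, seed=0):
    """Estimated average cost for every (s, d) combination of the grid."""
    rng = random.Random(seed)
    surface = {}
    for s in range(0, 101, 5):           # s = 0, 5, ..., 100  (21 values)
        for d in range(5, 101, 5):       # d = 5, 10, ..., 100 (20 values)
            surface[(s, d)] = statistics.fmean(
                simulate_cost(s, d, rng) for _ in range(n_reps)
            )
    return surface

surface = response_surface()
best = min(surface, key=surface.get)
print(len(surface), "grid points; lowest estimated cost at (s, d) =", best)
```

With a real model, every call of `simulate_cost` is a full simulation run, which is why this brute-force grid quickly becomes too expensive.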
Slide 28
Response-Surface-Method and Metamodels
Example
Interpretation: the minimum average cost lies somewhere between 110 and 120, for s at approximately 25 and d between 35 and 40
But: if the model is large, the computational effort is too high: possibly hours for a single replication, not counting the calculations needed to create the response surface
Slide 29
Metamodels by regression
In addition, by simple transformations (rearranging the formula), the regression equation can be expressed in terms of the factors
• Factor analysis is then possible for any factor combination with a spreadsheet!
Attention: as before, this is only a rough approximation of the actual response surface
• It depends on the position of the stationary points of the real surface
• A metamodel is also not necessarily valid!
• Examples: improving the approximation by increasing the number of observed points (see next slides)
Slide 30
Example: Approximation of the response surface with 4, 16, 32 points
Slide 31
Evolutionary Operation (EVOP)
• Originally: continuous process improvement during operation, without disturbing the process
• But also: an iterative approach to setting the parameters of a few factors
• Successively better parameter settings are identified
Slide 32
Simulation for optimization
The ultimate goal of simulation
• Find the system configuration that minimizes the costs or maximizes the profits
• Intelligent search for factor combinations in the factor space that optimize specific performance measures (responses).
• A general problem from optimization theory (see, for example, fundamentals of optimization systems)
  • Objective function (here a random variable) R, subject to variance
  • Inputs x_1, ..., x_k = decision variables
  • Upper and lower bounds for the variables: l_i <= x_i <= u_i
  • Additional restrictions, e.g. a_j1 x_1 + a_j2 x_2 + ... + a_jk x_k <= c_j
Slide 33
Simulation for optimization
For simulation this is much more difficult than, for example, for mathematical programming
• No insertion into a simple formula is possible: evaluating a single factor vector requires n replications
• Evaluation is very difficult due to stochastic effects
Various approaches are available that perform optimization by means of simulation using (meta)heuristics
Nevertheless, this has recently been one of the main areas of research in the field of simulation.
Slide 34
Optimization methods
• Complete enumeration
• A few (meta)heuristics
  • Simulated annealing
  • Evolutionary strategies
  • Neural networks
  • Particle swarm optimization
  • etc.
Slide 35
Optimization software
Software cannot guarantee an optimum, but it supports the systematic search
Basic sequence (according to Law / Kelton):
1. Start
2. Optimization software ("optimizes" the variable inputs): specify a new system configuration
3. Simulation software (simulates the model): simulate the specified system configuration (with replications if necessary)
4. The simulation result (outputs, objective function value(s)) is passed back as input to the optimization
5. Termination criterion fulfilled? If no: back to step 2. If yes: create the solution report
6. End
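The interplay of optimizer and simulation can be sketched as a simple loop. Everything here is an assumption for illustration: `simulate` is a made-up noisy cost function with its minimum near 25, the "optimizer" is plain random search rather than one of the (meta)heuristics, and the termination criterion is a fixed iteration budget.

```python
import random
import statistics

def simulate(config, rng, n_reps=5):
    """Hypothetical simulation: noisy cost that is lowest near config = 25."""
    return statistics.fmean((config - 25) ** 2 + rng.gauss(0, 1) for _ in range(n_reps))

def optimize(lower, upper, max_iterations, seed=0):
    """Random search within [lower, upper]; stops when the iteration budget is used up."""
    rng = random.Random(seed)
    best_config, best_value = None, float("inf")
    for _ in range(max_iterations):            # termination criterion
        config = rng.uniform(lower, upper)     # specify a new system configuration
        value = simulate(config, rng)          # simulate it (with replications)
        if value < best_value:                 # result is fed back to the optimizer
            best_config, best_value = config, value
    return best_config, best_value             # content of the solution report

best_config, best_value = optimize(lower=0, upper=100, max_iterations=200)
print(round(best_config, 2), round(best_value, 2))
```

A real optimization package would replace the random proposal step with a smarter search, but the loop structure stays the same.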
Slide 36
Introduction to system comparisons
So far: output analysis of a single system; in practice, however, several different systems are conceivable
Need to compare n alternatives: what is the best solution?
Now: statistical methods for comparing different system designs or alternative control strategies
Slide 37
Introduction to system comparisons
Ratings based on individual simulation runs often fail in practice
Example: purchasing a banking machine (ATM)
Two possible variants A and B
• Buying one machine of type A
• Buying two machines of type B
A is twice as fast and twice as expensive (acquisition, maintenance) as B, so there are no cost differences
Goal: best configuration with respect to service
Criterion: average waiting time in the queue
Slide 38
Introduction
[Figure: configuration A with one machine A; configuration B with two machines B]
A is twice as fast as B
Configuration B: a split queue
Which system is better? Why?
- Analytic calculation with queueing theory: B!
Slide 39
Example: Different system configurations
Experiment:
• Simulate the first 100 customers in each of the two systems
• Calculate the average waiting time of the customers in both systems
• Select the system in which the average waiting time is lowest
Repeat the experiment 100 times
Result: in 52 cases out of 100, system A is "better"
Why?
Slide 40
Example: Different system configurations
The distributions of the mean waiting times of both configurations overlap strongly
Conjecture: averaging over longer runs is a much better basis! Nevertheless, the former approach is often found in practice!
It makes sense:
Let n be the number of independent replications. The average is formed over all n replications
Let X_1j be the average of the 100 delays in independent replication j (= 1, ..., n) of the first configuration, and X_2j analogously for the second
Let X̄_1 and X̄_2 be the sample means of the X_1j and X_2j; then the system with the smaller X̄_i is recommended
Slide 41
Example: Different system configurations
n     Proportion of experiments with the wrong result
1     0.52
5     0.43
10    0.38
20    0.34
Interpretation:
• The larger n, the better (more correct) the recommended decision tends to be
• But even for n = 20 the probability of error is still very high
(There are variance reduction techniques that improve the comparison!)
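An experiment of this kind can be sketched as follows. The slides do not give the model parameters, so everything numeric here is an assumption: exponential interarrival times with mean 1, exponential service times with mean 0.9 for machine A and 1.8 for each machine B, and configuration B modeled as one common queue feeding both machines. The proportions produced by this sketch will therefore not necessarily match the table.

```python
import random
import statistics

def average_delay(n_servers, mean_service, rng, n_customers=100):
    """Average delay in queue of the first n_customers; FIFO queue, system starts empty."""
    free_at = [0.0] * n_servers              # time at which each machine becomes idle
    clock = total_delay = 0.0
    for _ in range(n_customers):
        clock += rng.expovariate(1.0)        # assumed mean interarrival time: 1
        machine = min(range(n_servers), key=free_at.__getitem__)
        start = max(clock, free_at[machine])
        total_delay += start - clock
        free_at[machine] = start + rng.expovariate(1.0 / mean_service)
    return total_delay / n_customers

def proportion_wrong(n, experiments=100, seed=1):
    """Share of experiments recommending A, although B is better according to the slides."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(experiments):
        mean_a = statistics.fmean(average_delay(1, 0.9, rng) for _ in range(n))
        mean_b = statistics.fmean(average_delay(2, 1.8, rng) for _ in range(n))
        wrong += mean_a < mean_b
    return wrong / experiments

for n in (1, 5, 10, 20):
    print(n, proportion_wrong(n))
```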
Slide 42
Confidence intervals for the comparison of two alternative system configurations
Experience: hypothesis tests are of limited use for comparing two systems (they only give a yes / no decision)
Better: construct a confidence interval for the difference of the expected values of a particular performance measure
• It includes the test result ("same" / "different") and a quantitative statement about the difference
• If the confidence interval contains the value 0, it is assumed that the systems do not differ significantly; otherwise they differ
Slide 43
Confidence intervals for the comparison of two alternative system configurations
Forming differences has the advantage that, for two similar distributions of the investigated parameters, skewness is mitigated if it points in the same direction
Let X_i1, ..., X_in, i = 1, 2, be samples of i.i.d. observations of system i, and μ_i = E(X_ij) the expected value of the considered output
Goal: construct a confidence interval for ζ = μ_1 - μ_2
Procedures / calculation rules
• Paired-t
• Welch
Slide 44
Comparison of two alternative configurations
Paired-t interval
Notes
• If the random variables Z_j = X_1j - X_2j are normally distributed, the confidence interval is exact; otherwise the central limit theorem is applied
• X_1j and X_2j need not be assumed independent
• Var(X_1j) = Var(X_2j) need not hold (important for the applicability of some variance reduction techniques for Z_j)
• In principle, this reduces the comparison of two systems to the analysis of a single system
Attention: the X_ij are defined over complete replications, e.g. the average waiting time of 100 customers instead of the waiting time of a single customer!
Slide 45
Comparison of two alternative configurations
2 different storage strategies, paired-t interval
j     X_1j      X_2j      Z_j
1     126.97    118.21    8.76
2     124.31    120.22    4.09
3     126.68    122.45    4.23
4     122.66    122.68    -0.02
5     127.23    119.40    7.83
Mean of the Z_j: 4.98, estimated variance of the mean: 2.44; at α = 0.1 the confidence interval for ζ = μ_1 - μ_2 is [1.65; 8.31]
Both alternatives differ at an approximate 90% confidence level, and Strategy 2 is probably better!
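The numbers of this example can be reproduced with the usual paired-t formula, mean(Z) ± t · sqrt(S²(Z) / n), where S²(Z) is the sample variance of the differences. A minimal sketch; the t quantile 2.132 (4 degrees of freedom, 1 - α/2 = 0.95) is taken from a t table, since the Python standard library has no t distribution.

```python
import math
import statistics

x1 = [126.97, 124.31, 126.68, 122.66, 127.23]   # strategy 1, one value per replication
x2 = [118.21, 120.22, 122.45, 122.68, 119.40]   # strategy 2
T_QUANTILE = 2.132                               # t table: 4 degrees of freedom, 0.95

def paired_t_interval(first, second, t_quantile):
    """Mean of the differences, variance of that mean, and the confidence interval."""
    z = [a - b for a, b in zip(first, second)]
    z_bar = statistics.fmean(z)
    var_of_mean = statistics.variance(z) / len(z)
    half_width = t_quantile * math.sqrt(var_of_mean)
    return z_bar, var_of_mean, (z_bar - half_width, z_bar + half_width)

z_bar, var_of_mean, (low, high) = paired_t_interval(x1, x2, T_QUANTILE)
print(round(z_bar, 2), round(var_of_mean, 2), round(low, 2), round(high, 2))
```

Since the interval does not contain 0, the two strategies are considered different at the approximate 90% level.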
Slide 46
Confidence intervals for more than two systems
3 ways
a. Comparison with a "standard system"
b. Pairwise comparisons
c. Comparison with the best of the remaining systems
a) Comparison with a standard system
Without loss of generality, let system "1" be the standard (e.g. an existing detection system); the others are named 2, ..., k.
Construct k - 1 confidence intervals for the k - 1 differences μ_i - μ_1, i = 2, ..., k, each with confidence level 1 - α/(k - 1)
Slide 47
Confidence intervals for more than two systems
Comparison with a standard system
For all i = 2, ..., k: system i differs from the standard if the constructed interval does not include 0; otherwise the systems are considered equivalent.
(The intervals can again be calculated as paired-t intervals)
For sharper intervals, n can be increased (4n for half the interval length), or variance reduction techniques can be used
Slide 48
Confidence intervals for more than two systems
Pairwise comparisons
b) Pairwise comparisons, e.g. to identify any significant differences
• Example: there is no comparison system; all reasonable alternatives must be treated the same way
Idea: build confidence intervals for all differences μ_i1 - μ_i2 for all i1, i2 in {1, ..., k} with i1 < i2
k(k - 1)/2 intervals, i.e. 1 - α/[k(k - 1)/2] as the confidence level of each interval
Attention: comparison paradoxes may occur!
(The rest as in a)
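The individual confidence levels for both variants follow from dividing α by the number of intervals (Bonferroni correction). A small sketch with illustrative function names:

```python
def individual_level(alpha, n_intervals):
    """Confidence level of each interval so that all hold jointly with at least 1 - alpha."""
    return 1 - alpha / n_intervals

def standard_comparison_level(alpha, k):
    """k systems compared with one standard: k - 1 intervals."""
    return individual_level(alpha, k - 1)

def pairwise_level(alpha, k):
    """All pairwise comparisons of k systems: k(k - 1)/2 intervals."""
    return individual_level(alpha, k * (k - 1) // 2)

# Example: k = 5 systems, overall alpha = 0.1
print(standard_comparison_level(0.1, 5), pairwise_level(0.1, 5))
```

The more intervals are needed, the higher the individual level and the wider each interval becomes, which is the price of the pairwise approach.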
Slide 49
Confidence intervals for more than two systems
c) Multiple comparisons with the best of the other systems
Goal: form k simultaneous confidence intervals for μ_i - max_{l ≠ i} μ_l, i = 1, ..., k
• Assumption: larger averages are better
• Otherwise: choose the min formulation!
There are methods that successively exclude inferior systems
Slide 50
QUESTIONS?