Copyright © 2004 David M. Hassenzahl
Monte Carlo Analysis
David M. Hassenzahl
Purpose of lecture
• Introduce Monte Carlo Analysis as a tool for managing uncertainty
• Demonstrate how it can be used in the policy setting
• Discuss its uses and shortcomings, and how they are relevant to policy making processes
What is Monte Carlo Analysis?
It is a tool for combining distributions, and thereby propagating more than just summary statistics
It uses random number generation, rather than analytic calculations
It is increasingly popular due to high-speed personal computers
Background/History
• “Monte Carlo” from the gambling town of the same name (no surprise)
• First applied in 1947 to model diffusion of neutrons through fissile materials
• Limited use because time consuming
• Much more common since late 80’s
• Too easy now?
• Name…is EPA “gambling” with people’s lives (anecdotal, but reasonable).
Why Perform Monte Carlo Analysis?
• Combining distributions
• With more than two distributions, solving analytically is very difficult
• Simple calculations lose information
– Mean × mean = mean
– 95th %ile × 95th %ile ≠ 95th %ile!
– Gets “worse” with 3 or more distributions
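The percentile point can be checked numerically. The sketch below is illustrative only: the inputs are hypothetical independent lognormal(0, 0.5) variables (not from the lecture), and the helper names are made up for this example. It compares the product of the inputs' 95th percentiles with the 95th percentile of their product, for two and then three inputs.

```python
import math
import random

def p95(values):
    """Empirical 95th percentile (nearest rank on the sorted sample)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]

def conservatism_ratio(k, n=20_000, sigma=0.5, seed=1):
    """Product of k 95th percentiles divided by the 95th percentile of the product."""
    rng = random.Random(seed)
    # Hypothetical inputs: k independent lognormal(0, sigma) variables.
    draws = [[rng.lognormvariate(0.0, sigma) for _ in range(n)] for _ in range(k)]
    product_of_p95 = math.prod(p95(d) for d in draws)
    products = [math.prod(row) for row in zip(*draws)]
    return product_of_p95 / p95(products)

ratio_two = conservatism_ratio(2)
ratio_three = conservatism_ratio(3)
print(f"2 inputs: multiplying 95th percentiles overstates by {ratio_two:.2f}x")
print(f"3 inputs: multiplying 95th percentiles overstates by {ratio_three:.2f}x")
```

A ratio above 1 means the multiplied percentiles sit further out in the tail than the true 95th percentile of the combined distribution, and the gap grows with the number of inputs.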
Monte Carlo Analysis
• Takes an equation – example: Risk = probability × consequence
• Instead of simple numbers, draws randomly from defined distributions
• Multiplies the two, stores the answer
• Repeats this over and over and over…
• Then the set of results is displayed as a new, combined distribution
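The steps on this slide can be sketched as a short loop. This is a minimal illustration, not the lecture's software: `monte_carlo`, `draw_probability`, and `draw_consequence` are hypothetical names, and the two input distributions are placeholders chosen only to make the example run.

```python
import random

def monte_carlo(model, samplers, n_iterations=10_000, seed=0):
    """Draw one value per input, apply the model, store the result, repeat."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iterations):
        inputs = [draw(rng) for draw in samplers]
        results.append(model(*inputs))
    return results

# Hypothetical input distributions, chosen only for illustration.
def draw_probability(rng):
    return rng.uniform(0.01, 0.05)

def draw_consequence(rng):
    return rng.triangular(100.0, 500.0, 200.0)  # low, high, mode

# Risk = probability × consequence, evaluated once per iteration.
risks = monte_carlo(lambda p, c: p * c, [draw_probability, draw_consequence])
print(f"{len(risks)} results, min {min(risks):.2f}, max {max(risks):.2f}")
```

The list `risks` is the "new, combined distribution": it can be summarized with a mean and percentiles or plotted as a histogram.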
Simple (hypothetical) example
• Skin cream additive is an irritant
• Many samples of cream provide information on concentration:
– mean 0.02 mg chemical
– standard dev. 0.005 mg chemical
• Two tests show probability of irritation given application
– low freq of effect per mg exposure = 5/100/mg
– high freq of effect per mg exposure = 10/100/mg
Analytical results
• Risk = exposure × potency
– Mean risk = 0.02 mg × 0.075 / mg = 0.0015
or 15 out of 10,000 applications will result in irritation
Analytical results
• “Conservative estimate”
– Use upper 95th %ile
Risk = 0.03 mg × 0.0975 / mg = 0.0029
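The two analytical estimates can be reproduced with a few lines of arithmetic. The sketch assumes the inputs stated for the skin cream example (normal exposure, uniform potency). Note one reading of the slide that is an inference, not something the slide states: the upper exposure value of 0.03 mg equals the mean plus two standard deviations, whereas the exact one-sided 95th percentile of that normal distribution is somewhat lower, so it is printed for comparison.

```python
from statistics import NormalDist

exposure = NormalDist(mu=0.02, sigma=0.005)   # mg chemical
potency_low, potency_high = 0.05, 0.10        # irritation probability per mg (uniform)

# Mean risk = mean exposure × mean potency.
mean_risk = exposure.mean * (potency_low + potency_high) / 2

# The slide's upper exposure value, 0.03 mg, equals mean + 2 s.d.
upper_exposure = exposure.mean + 2 * exposure.stdev
# 95th percentile of a uniform distribution on [low, high].
upper_potency = potency_low + 0.95 * (potency_high - potency_low)
conservative_risk = upper_exposure * upper_potency

# For comparison: exact one-sided 95th percentile of the normal exposure.
exposure_p95 = exposure.inv_cdf(0.95)

print(f"mean risk          {mean_risk:.4f}")
print(f"conservative risk  {conservative_risk:.4f}")
print(f"exposure 95th %ile {exposure_p95:.4f} mg")
```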
Monte Carlo: Visual example
Exposure = normal (mean 0.02 mg, s.d. = 0.005 mg)
Potency = uniform (range 0.05 / mg to 0.10 / mg)
[Figure: the two input distributions. Exposure (mg chemical), axis 0.01 to 0.03; potency (probability of irritation per mg chemical), axis 0.05 to 0.10]
Random draw one
p(irritate) = 0.0165 mg × 0.063/mg = 0.0010
[Figure: exposure and potency distributions, with the draw marked at exposure 0.0165 mg and potency 0.063 /mg]
Random draw two
p(irritate) = 0.0175 mg × 0.089 /mg = 0.0016
Summary: {0.0010, 0.0016}
[Figure: exposure and potency distributions, with the draw marked at exposure 0.0175 mg and potency 0.089 /mg]
Random draw three
p(irritate) = 0.0152 mg × 0.057 /mg = 0.00087
Summary: {0.0010, 0.0016, 0.00087}
[Figure: exposure and potency distributions, with the draw marked at exposure 0.0152 mg and potency 0.057 /mg]
Random draw four
p(irritate) = 0.0238 mg × 0.085 /mg = 0.0020
Summary: {0.0010, 0.0016, 0.00087, 0.0020}
[Figure: exposure and potency distributions, with the draw marked at exposure 0.0238 mg and potency 0.085 /mg]
After ten random draws
Summary: {0.0010, 0.0016, 0.00087, 0.0020, 0.0011, 0.0018, 0.0024, 0.0016, 0.0015, 0.00062}
mean 0.0014
standard deviation 0.00055
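The summary statistics for the ten draws can be recomputed directly from the listed values. This sketch uses the sample standard deviation (n - 1 denominator), which is an assumption about how the slide's figure was calculated.

```python
from statistics import mean, stdev

# The ten results listed on the slide.
draws = [0.0010, 0.0016, 0.00087, 0.0020, 0.0011,
         0.0018, 0.0024, 0.0016, 0.0015, 0.00062]

draw_mean = mean(draws)
draw_sd = stdev(draws)  # sample standard deviation (n - 1 denominator)
print(f"mean {draw_mean:.4f}, standard deviation {draw_sd:.5f}")
```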
Using software
• Could write this program using a random number generator
• But several software packages are available
• I use Crystal Ball
– user friendly
– customizable
– random number generator good up to about 10,000 iterations
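As the first bullet notes, the calculation can also be written by hand with a random number generator. The sketch below is a stand-in using Python's standard `random` module, not Crystal Ball; the function names and the seed are arbitrary choices for this example. It runs the skin cream model (normal exposure, uniform potency) at 100 and 10,000 iterations. The normal distribution is not truncated here, so a very rare negative exposure is possible.

```python
import random
from statistics import mean, stdev

def simulate_irritation_risk(n_iterations, seed=2004):
    """Return n_iterations draws of risk = exposure × potency."""
    rng = random.Random(seed)
    risks = []
    for _ in range(n_iterations):
        exposure = rng.gauss(0.02, 0.005)   # mg chemical, normal
        potency = rng.uniform(0.05, 0.10)   # irritation probability per mg, uniform
        risks.append(exposure * potency)
    return risks

def summarize(risks):
    """Mean, sample s.d., and empirical 95th percentile of the results."""
    ordered = sorted(risks)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return mean(risks), stdev(risks), ordered[rank - 1]

for n in (100, 10_000):
    m, s, p95 = summarize(simulate_irritation_risk(n))
    print(f"{n:>6} iterations: mean {m:.4f}, s.d. {s:.5f}, 95th %ile {p95:.4f}")

mc_mean, mc_sd, mc_p95 = summarize(simulate_irritation_risk(10_000))
```

Results vary slightly with the seed and the number of iterations, which is the point of comparing a short run with a long one.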
100 iterations (about two seconds)
• Monte Carlo results
– Mean 0.0016
– Standard Deviation 0.00048
– “Conservative” estimate 0.0026
• Compare to analytical results
– Mean 0.0015
– standard deviation n/a
– “Conservative” estimate 0.0029
Summary chart - 100 trials
[Frequency chart. Forecast: P(Irritation); 100 trials, 1 outlier; values marked on the chart: 0.00103, 0.00161, 0.00311]
Summary - 10,000 trials
• Monte Carlo results
– Mean 0.0015
– Standard Deviation 0.000472
– “Conservative” estimate 0.0024
• Compare to analytical results
– Mean 0.0015
– standard deviation n/a
– “Conservative” estimate 0.0029
Summary chart - 10,000 trials
[Frequency chart. Forecast: P(Irritation); 10,000 trials, 88 outliers; values marked on the chart: 0.00069, 0.00150, 0.00331]
About 1.5 minutes run time
Policy applications
• When there are many distributional inputs
• Concern about “excessive conservatism”
– multiplying 95th percentiles
– multiple exposures
• Because we can
• Bayesian calculations
Issues: Sensitivity Analysis
• Sensitivity analysis looks at which input distributions have the greatest effect on the eventual distribution
• Helps to understand which parameters can both be influenced by policy and reduce risks
• Helps understand when better data can be most valuable (information isn’t free…nor even cheap)
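One simple way to rank inputs is to correlate each input's draws with the resulting output. The sketch below applies this to the skin cream example. It uses the Pearson coefficient as a plain stand-in; this is an assumption for illustration and not necessarily the measure any particular software package reports. The `pearson` helper and the seed are hypothetical.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(7)
exposures = [rng.gauss(0.02, 0.005) for _ in range(10_000)]   # mg chemical
potencies = [rng.uniform(0.05, 0.10) for _ in range(10_000)]  # per mg
risks = [e * p for e, p in zip(exposures, potencies)]

# Correlate each input with the output; larger magnitude = more influence.
sensitivity = {
    "exposure": pearson(exposures, risks),
    "potency": pearson(potencies, risks),
}
for name, r in sorted(sensitivity.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:<9} correlation with risk: {r:.2f}")
```

Under these assumptions exposure ranks above potency because its relative spread is larger, which suggests better exposure data would narrow the result more.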
Issues: Correlation
• Two distributions are correlated when a change in one is associated with a change in the other
• Example: People who eat lots of peas may eat less broccoli (or may eat more…)
• Usually doesn’t have much effect unless the correlation is strong (|ρ| > 0.75)
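The effect of correlation can be explored by drawing the two inputs jointly. In this sketch all numbers are hypothetical (peas and broccoli intake each normal with mean 50 g/day and s.d. 10 g/day), and the correlation is induced by mixing a shared standard normal draw with independent noise. It reports the spread of total intake for negative, zero, and positive correlation.

```python
import math
import random
from statistics import stdev

def total_intake_sd(rho, n=20_000, seed=11):
    """S.d. of peas + broccoli intake when the two have correlation rho."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Mix z1 with independent noise so that corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        peas = 50.0 + 10.0 * z1       # g/day, hypothetical
        broccoli = 50.0 + 10.0 * z2   # g/day, hypothetical
        totals.append(peas + broccoli)
    return stdev(totals)

spread = {rho: total_intake_sd(rho) for rho in (-0.75, 0.0, 0.75)}
for rho, sd in spread.items():
    print(f"rho = {rho:+.2f}: s.d. of total intake = {sd:.1f} g/day")
```

Negative correlation narrows the combined distribution and positive correlation widens it, so ignoring a strong correlation can misstate the tails in either direction.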
Generating Distributions
• Invalid distributions create invalid results, which leads to inappropriate policies
• Two options
– empirical
– theoretical
Empirical Distributions
• Most appropriate when developed for the issue at hand.
• Example: local fish consumption
– survey individuals or otherwise estimate
– data from individuals elsewhere may be very misleading
• A number of very large data sets have been developed and published
Empirical Distributions
• Challenge: when there’s very little data
• Example of two data points
– uniform distribution?
– triangular distribution?
– not a hypothetical issue…is an ongoing debate in the literature
• Key is to state clearly your assumptions
• Better yet…do it both ways!
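"Doing it both ways" is cheap once the model is coded. The sketch below reruns the skin cream example with potency drawn from a uniform distribution and then from a triangular distribution over the same two data points (0.05 and 0.10 per mg). The triangular mode at the midpoint is an assumption made for this example, as are the function name and seed.

```python
import random
from statistics import mean, stdev

def risk_summary(draw_potency, n=20_000, seed=3):
    """Mean and s.d. of risk = exposure × potency for a given potency sampler."""
    rng = random.Random(seed)
    risks = [rng.gauss(0.02, 0.005) * draw_potency(rng) for _ in range(n)]
    return mean(risks), stdev(risks)

# Both candidates span the two observed potencies, 0.05 and 0.10 per mg.
uniform_mean, uniform_sd = risk_summary(lambda rng: rng.uniform(0.05, 0.10))
# Assumption: the triangular mode sits at the midpoint, 0.075 per mg.
triangular_mean, triangular_sd = risk_summary(
    lambda rng: rng.triangular(0.05, 0.10, 0.075))

print(f"uniform:    mean {uniform_mean:.4f}, s.d. {uniform_sd:.5f}")
print(f"triangular: mean {triangular_mean:.4f}, s.d. {triangular_sd:.5f}")
```

With these assumptions the means agree while the uniform choice gives a wider result, so the choice of shape matters mainly for the tails.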
Which Distribution?
[Figure: four candidate distributions for potency (probability of irritation per mg chemical), each drawn over the range 0.05 to 0.10]
Random number generation
• Shouldn’t be an issue…@Risk and Crystal Ball are both good to at least 10,000 iterations
• 10,000 iterations is typically enough, even with many input distributions
Theoretical Distributions
• Appropriate when there’s some mechanistic or probabilistic basis
• Example: small sample (say 50 test animals) establishes a binomial distribution
• Lognormal distributions show up often in nature
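The binomial example can be illustrated by simulating repeated 50-animal tests. This is a sketch under stated assumptions: the true response probability of 0.10 is hypothetical, and the names are made up for the example. It shows how much the observed response rate varies from test to test, and compares that spread with the binomial formula sqrt(p(1 - p)/n).

```python
import random
from statistics import mean, stdev

N_ANIMALS = 50
TRUE_P = 0.10   # hypothetical true response probability

def responders(rng):
    """Number of animals responding in one simulated 50-animal test."""
    return sum(rng.random() < TRUE_P for _ in range(N_ANIMALS))

rng = random.Random(5)
observed_rates = [responders(rng) / N_ANIMALS for _ in range(10_000)]

rate_mean = mean(observed_rates)
rate_sd = stdev(observed_rates)
theory_sd = (TRUE_P * (1 - TRUE_P) / N_ANIMALS) ** 0.5  # binomial s.d. of a proportion

print(f"simulated: mean rate {rate_mean:.3f}, s.d. {rate_sd:.3f}")
print(f"binomial theory: s.d. {theory_sd:.3f}")
```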
Some Caveats
• Beware believing that you’ve really “understood” uncertainty
• Beware: misapplication
– ignorance at best
– fraudulent at worst…porcine hoof blister
Example (after Finkel)
Alar “versus” aflatoxin
Exposure has two elements
– Aflatoxin: peanut butter consumption, aflatoxin residue
– Alar: juice consumption, Alar/UDMH residue
Potency has one element
– aflatoxin potency
– UDMH potency
Risk = (consumption × residue × potency) / body weight
Inputs for Alar & aflatoxin
Variable                    Units       Mean     5th %ile   95th %ile   Percentile location of the mean
Peanut butter consumption   g/day       11.38    2.00       31.86       66
Apple juice consumption     g/day       136.84   16.02      430.02      69
Aflatoxin residue           µg/g        2.82     1.00       6.50        61
UDMH residue                µg/g        13.75    0.5        42.00       67
Aflatoxin potency           kg-day/mg   17.5     4.02       28.23       61
UDMH potency                kg-day/mg   0.49     0.00       0.85        43
Alar and aflatoxin point estimates
• aflatoxin estimates:
– Mean = (11.38 g/day × 2.82 µg/g × 17.5 kg-day/mg × 1 mg/1000 µg) / 20 kg = 0.028
– Conservative = 0.29
• Alar (UDMH) estimates:
– Mean = 0.046
– Conservative = 0.77
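All four point estimates follow from Risk = (consumption × residue × potency) / body weight. The sketch below recomputes them from the input table. It assumes a 20 kg body weight and residues expressed in µg/g (hence the division by 1000 to reach mg), both inferred from the numbers in the slide's equation; the function name is hypothetical.

```python
BODY_WEIGHT_KG = 20      # body weight used in the slide's equation
MG_PER_UG = 1 / 1000     # assumption: residues are in µg/g, potency is per mg

def risk(consumption_g_day, residue_ug_g, potency_kg_day_mg):
    """Risk = (consumption × residue × potency) / body weight."""
    dose_mg_day = consumption_g_day * residue_ug_g * MG_PER_UG
    return dose_mg_day * potency_kg_day_mg / BODY_WEIGHT_KG

estimates = {
    # Means of each input.
    "aflatoxin mean": risk(11.38, 2.82, 17.5),
    "Alar mean": risk(136.84, 13.75, 0.49),
    # 95th percentiles of each input multiplied together.
    "aflatoxin conservative": risk(31.86, 6.50, 28.23),
    "Alar conservative": risk(430.02, 42.00, 0.85),
}
for name, value in estimates.items():
    print(f"{name:<24} {value:.3f}")
```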
Alar and aflatoxin Monte Carlo
• 10,000 runs
• Generate distributions – (don’t allow 0)
• Don’t expect correlation
Aflatoxin and Alar Monte Carlo results (point values)
Aflatoxin
               Analytical   Monte Carlo
Mean           0.028        0.028
Conservative   0.29         0.095
Alar
               Analytical   Monte Carlo
Mean           0.046        0.046
Conservative   0.77         0.18
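The gap between the two "conservative" columns can be expressed as a ratio. This small sketch uses only the four conservative values reported on the slide; the dictionary layout is just a convenience for the example.

```python
# Conservative estimates as reported on the slide.
conservative = {
    "aflatoxin": {"analytical": 0.29, "monte_carlo": 0.095},
    "Alar": {"analytical": 0.77, "monte_carlo": 0.18},
}

# How many times larger the multiplied-percentile estimate is.
ratios = {
    name: values["analytical"] / values["monte_carlo"]
    for name, values in conservative.items()
}
for name, ratio in ratios.items():
    print(f"{name}: analytical estimate is {ratio:.1f}x the Monte Carlo estimate")
```

The means agree between the two methods, so the whole difference comes from how the upper tail is constructed.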
Aflatoxin and Alar Monte Carlo results (distributions)
[Frequency chart. Forecast: peanut butter risk; 10,000 trials, 192 outliers; x-axis 0 to 0.15; certainty is 98.05% from -Infinity to 0.1495]
Aflatoxin and Alar Monte Carlo results (distributions)
[Frequency chart. Forecast: apple juice risk; 10,000 trials, 125 outliers; x-axis 0 to 0.45; certainty is 93.93% from -Infinity to 0.15]
Aflatoxin and Alar Monte Carlo results (distributions)
[Cumulative chart. Forecast: peanut butter risk; 10,000 trials, 192 outliers; x-axis 0 to 0.15; certainty is 98.04% from -Infinity to 0.1495]
Aflatoxin and Alar Monte Carlo results (distributions)
[Cumulative chart. Forecast: apple juice risk; 10,000 trials, 125 outliers; x-axis 0 to 0.45; certainty is 93.93% from -Infinity to 0.15]
Aflatoxin and Alar Monte Carlo results (distributions)
Frequency distribution comparison
[Overlay chart: frequency distributions of peanut butter risk and apple juice risk; x-axis 0 to 0.45]
Aflatoxin and Alar Monte Carlo results (distributions)
Cumulative distribution comparison
[Overlay chart: cumulative distributions of peanut butter risk and apple juice risk; x-axis 0 to 0.45]
References and Further Reading
Burmaster, D.E. and Anderson, P.D. (1994). “Principles of good practice for the use of Monte Carlo techniques in human health and ecological risk assessments.” Risk Analysis 14(4):447-81.
Finkel, A. (1995). “Towards less misleading comparisons of uncertain risks: the example of aflatoxin and Alar.” Environmental Health Perspectives 103(4):376-85.
Kammen, D.M. and Hassenzahl, D.M. (1999). Should We Risk It? Exploring Environmental, Health and Technological Problem Solving. Princeton University Press, Princeton, NJ.
Thompson, K.M., Burmaster, D.E., et al. (1992). “Monte Carlo techniques for uncertainty analysis in public health risk assessments.” Risk Analysis 12(1):53-63.
Vose, D. (1997). “Monte Carlo Risk Analysis Modeling.” In Molak, Ed., Fundamentals of Risk Analysis and Risk Management.