
Copyright © 2004 David M. Hassenzahl

Monte Carlo Analysis

David M. Hassenzahl


Purpose of lecture

• Introduce Monte Carlo Analysis as a tool for managing uncertainty

• Demonstrate how it can be used in the policy setting

• Discuss its uses and shortcomings, and how they are relevant to policy making processes


What is Monte Carlo Analysis?

It is a tool for combining distributions, and thereby propagating more than just summary statistics

It uses random number generation, rather than analytic calculations

It is increasingly popular due to high speed personal computers


Background/History

• “Monte Carlo” from the gambling town of the same name (no surprise)

• First applied in 1947 to model diffusion of neutrons through fissile materials

• Limited use because time consuming

• Much more common since late 80’s

• Too easy now?

• Name…is EPA “gambling” with people’s lives (anecdotal, but reasonable).


Why Perform Monte Carlo Analysis?

• Combining distributions

• With more than two distributions, solving analytically is very difficult

• Simple calculations lose information
– Mean × mean = mean (when the inputs are independent)
– 95th %ile × 95th %ile ≠ 95th %ile!
– Gets “worse” with 3 or more distributions
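A minimal Python sketch of the point about percentiles. The two lognormal inputs, the seed, and the sample size are hypothetical choices made only for illustration; they do not come from the lecture. For independent inputs the product of the means tracks the mean of the product, while the product of the two 95th percentiles lands well above the 95th percentile of the product.

```python
import random
import statistics

def percentile(values, q):
    """Empirical q-th percentile (0 < q < 100), nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, round(q / 100 * len(ordered)))
    return ordered[rank - 1]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 20_000

# Hypothetical inputs: two independent, right-skewed (lognormal) quantities.
a = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
b = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
product = [x * y for x, y in zip(a, b)]

mean_of_product = statistics.fmean(product)
product_of_means = statistics.fmean(a) * statistics.fmean(b)
p95_of_product = percentile(product, 95)
product_of_p95s = percentile(a, 95) * percentile(b, 95)

print(f"mean of product      {mean_of_product:.3f}")
print(f"product of means     {product_of_means:.3f}")
print(f"95th %ile of product {p95_of_product:.3f}")
print(f"product of 95th %iles {product_of_p95s:.3f}")
```

With a third input the gap between the last two lines would be expected to widen further, which is the "gets worse" point above.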


Monte Carlo Analysis

• Takes an equation – example: Risk = probability × consequence

• Instead of simple numbers, draws randomly from defined distributions

• Multiplies the two, stores the answer

• Repeats this over and over and over…

• Then the set of results is displayed as a new, combined distribution
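The steps above can be sketched in a few lines of Python. The input distributions here (a uniform probability and a triangular consequence) and the helper names `monte_carlo` and `text_histogram` are hypothetical, chosen only to show the draw, multiply, store, repeat, display cycle.

```python
import random

def monte_carlo(draw_probability, draw_consequence, trials, seed=0):
    """Repeat: draw each input at random, multiply, store the answer."""
    rng = random.Random(seed)
    return [draw_probability(rng) * draw_consequence(rng) for _ in range(trials)]

def text_histogram(values, bins=8, width=40):
    """Bin the stored results and render them as rows of '#' characters."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / step), bins - 1)  # the maximum goes in the last bin
        counts[index] += 1
    tallest = max(counts)
    rows = []
    for i, count in enumerate(counts):
        bar = "#" * round(width * count / tallest)
        rows.append(f"{lo + i * step:10.4f} | {bar}")
    return counts, "\n".join(rows)

# Hypothetical inputs, for illustration only.
risks = monte_carlo(
    draw_probability=lambda rng: rng.uniform(0.01, 0.05),
    draw_consequence=lambda rng: rng.triangular(10, 100, 30),
    trials=5000,
)
counts, chart = text_histogram(risks)
print(chart)  # the combined distribution of Risk = probability x consequence
```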


Simple (hypothetical) example

• Skin cream additive is an irritant

• Many samples of cream provide information on concentration:
– mean 0.02 mg chemical
– standard dev. 0.005 mg chemical

• Two tests show probability of irritation given application
– low freq of effect per mg exposure = 5/100/mg
– high freq of effect per mg exposure = 10/100/mg


Analytical results

• Risk = exposure × potency
– Mean risk = 0.02 mg × 0.075 / mg = 0.0015, or 15 out of 10,000 applications will result in irritation


Analytical results

• “Conservative estimate”
– Use upper 95th %ile
– Risk = 0.03 mg × 0.0975 / mg = 0.0029
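The two analytical results can be checked with a short Python sketch. It assumes potency is uniform between the two test results (0.05 to 0.10 per mg), so that 0.075 is the midpoint and 0.0975 is the 95th percentile of that range. The slide's upper exposure of 0.03 mg equals the mean plus two standard deviations, and the sketch uses it in that form.

```python
mean_exposure_mg = 0.02
sd_exposure_mg = 0.005
potency_low = 0.05    # 5/100 per mg
potency_high = 0.10   # 10/100 per mg

# Mean estimate: mean exposure times the midpoint of the potency range.
mean_potency = (potency_low + potency_high) / 2
mean_risk = mean_exposure_mg * mean_potency

# "Conservative" estimate: upper exposure times upper potency.
upper_exposure_mg = mean_exposure_mg + 2 * sd_exposure_mg           # 0.03 mg, as on the slide
upper_potency = potency_low + 0.95 * (potency_high - potency_low)   # 0.0975 per mg
conservative_risk = upper_exposure_mg * upper_potency

print(f"mean risk:         {mean_risk:.4f}")          # 0.0015
print(f"conservative risk: {conservative_risk:.4f}")  # 0.0029
```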


Monte Carlo: Visual example

Exposure = normal (mean 0.02 mg, s.d. = 0.005 mg)

Potency = uniform (range 0.05 / mg to 0.10 / mg)

[Figure: exposure distribution (mg chemical), x-axis 0.01 to 0.03; potency distribution (probability of irritation per mg chemical), x-axis 0.05 to 0.10]


Random draw one

p(irritate) = 0.0165 mg × 0.063/mg = 0.0010



Random draw two

p(irritate) = 0.0175 mg × 0.089 /mg = 0.0016

Summary: {0.0010, 0.0016}



Random draw three

p(irritate) = 0.0152 mg × 0.057 /mg = 0.00087

Summary: {0.0010, 0.0016, 0.00087}



Random draw four

p(irritate) = 0.0238 mg × 0.085 /mg = 0.0020

Summary: {0.0010, 0.0016, 0.00087, 0.0020}



After ten random draws

Summary: {0.0010, 0.0016, 0.00087, 0.0020, 0.0011, 0.0018, 0.0024, 0.0016, 0.0015, 0.00062}

Mean: 0.0014

Standard deviation: 0.00055
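The summary statistics for the ten draws can be reproduced directly. This sketch uses the sample standard deviation (n − 1 denominator), which is the form that matches the figure on the slide.

```python
import statistics

# The ten stored results from the slide.
draws = [0.0010, 0.0016, 0.00087, 0.0020, 0.0011,
         0.0018, 0.0024, 0.0016, 0.0015, 0.00062]

mean = statistics.mean(draws)
sd = statistics.stdev(draws)  # sample standard deviation (n - 1 denominator)

print(f"mean = {mean:.4f}")                 # mean = 0.0014
print(f"standard deviation = {sd:.5f}")     # standard deviation = 0.00055
```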


Using software

• Could write this program using a random number generator

• But there are several software packages out there.

• I use Crystal Ball
– user friendly
– customizable
– r.n.g. good up to about 10,000 iterations
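As a rough idea of what "writing this program" could look like, here is a sketch for the skin cream example using Python's built-in random number generator. It is not what Crystal Ball does internally. The function names, the seed, and the choice to read the "conservative" estimate as the 95th percentile of the simulated risks are assumptions, and the normal exposure is not truncated at zero.

```python
import random
import statistics

def simulate_irritation_risk(trials, seed=0):
    """Draw exposure and potency at random, multiply, and store each result."""
    rng = random.Random(seed)
    risks = []
    for _ in range(trials):
        exposure_mg = rng.gauss(0.02, 0.005)      # normal(mean, s.d.), not truncated at 0
        potency_per_mg = rng.uniform(0.05, 0.10)  # uniform over the two test results
        risks.append(exposure_mg * potency_per_mg)
    return risks

def summarize(risks):
    """Return mean, sample standard deviation, and empirical 95th percentile."""
    ordered = sorted(risks)
    p95 = ordered[int(0.95 * len(ordered)) - 1]
    return statistics.fmean(risks), statistics.stdev(risks), p95

for trials in (100, 10_000):
    mean, sd, p95 = summarize(simulate_irritation_risk(trials))
    print(f"{trials:>6} trials: mean={mean:.4f}  s.d.={sd:.5f}  95th %ile={p95:.4f}")
```

Results vary from run to run unless the seed is fixed, and the 100-trial summary moves around far more than the 10,000-trial one.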


100 iterations (about two seconds)

• Monte Carlo results
– Mean 0.0016
– Standard Deviation 0.00048
– “Conservative” estimate 0.0026

• Compare to analytical results
– Mean 0.0015
– standard deviation n/a
– “Conservative” estimate 0.0029


Summary chart - 100 trials

[Figure: frequency chart, forecast: P(Irritation); 100 trials, 1 outlier; values marked on the chart: 0.00103, 0.00161, 0.00311]


Summary - 10,000 trials

• Monte Carlo results
– Mean 0.0015
– Standard Deviation 0.000472
– “Conservative” estimate 0.0024

• Compare to analytical results
– Mean 0.0015
– standard deviation n/a
– “Conservative” estimate 0.0029


Summary chart - 10,000 trials

[Figure: frequency chart, forecast: P(Irritation); 10,000 trials, 88 outliers; values marked on the chart: 0.00069, 0.00150, 0.00331]

About 1.5 minutes run time


Policy applications

• When there are many distributional inputs

• Concern about “excessive conservatism”
– multiplying 95th percentiles
– multiple exposures

• Because we can

• Bayesian calculations


Issues: Sensitivity Analysis

• Sensitivity analysis looks at which input distributions have the greatest effect on the eventual distribution

• Helps to understand which parameters can both be influenced by policy and reduce risks

• Helps understand when better data can be most valuable (information isn’t free…nor even cheap)
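One simple way to rank inputs is to correlate each input's draws with the resulting risk. The sketch below does this for the skin cream example using the Pearson correlation. That measure is a simplification chosen here for illustration, and the lecture does not say which measure its software reports.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)
trials = 10_000
exposure = [rng.gauss(0.02, 0.005) for _ in range(trials)]
potency = [rng.uniform(0.05, 0.10) for _ in range(trials)]
risk = [e * p for e, p in zip(exposure, potency)]

sensitivity = {
    "exposure": pearson(exposure, risk),
    "potency": pearson(potency, risk),
}
# List inputs from most to least influential on the simulated risk.
for name, r in sorted(sensitivity.items(), key=lambda item: -abs(item[1])):
    print(f"{name:>8}: correlation with risk = {r:.2f}")
```

Under these assumed inputs, exposure comes out as the more influential of the two, because its spread relative to its mean is larger than that of potency.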


Issues: Correlation

• Two distributions are correlated when a change in one is associated with a change in the other

• Example: People who eat lots of peas may eat less broccoli (or may eat more…)

• Usually doesn’t have much effect unless significant correlation (|ρ| > 0.75, where ρ is the correlation coefficient)
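To see what correlation does to a result, the sketch below reuses the skin cream inputs and forces a correlation between them. Nothing in the lecture says exposure and potency are correlated; this is purely illustrative. The correlation is induced on underlying normal scores (a Gaussian copula), which is one of several possible methods and an assumption of this sketch, so `rho` is the correlation of those scores rather than of the inputs themselves.

```python
import math
import random
import statistics

STANDARD_NORMAL = statistics.NormalDist()

def simulate(rho, trials=10_000, seed=0):
    """Return (mean, s.d.) of risk when the two inputs share correlation rho."""
    rng = random.Random(seed)  # same seed for every rho, so runs differ only in rho
    risks = []
    for _ in range(trials):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        exposure_mg = 0.02 + 0.005 * z1
        # Map the second normal score onto a uniform potency on [0.05, 0.10].
        potency_per_mg = 0.05 + 0.05 * STANDARD_NORMAL.cdf(z2)
        risks.append(exposure_mg * potency_per_mg)
    return statistics.fmean(risks), statistics.stdev(risks)

results = {rho: simulate(rho) for rho in (0.0, 0.5, 0.9)}
for rho, (mean, sd) in results.items():
    print(f"rho={rho:.1f}: mean={mean:.5f}  s.d.={sd:.5f}")
```

Positive correlation widens the combined distribution and nudges its mean upward; a negative `rho` would be expected to narrow it.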


Generating Distributions

• Invalid distributions create invalid results, which leads to inappropriate policies

• Two options
– empirical
– theoretical


Empirical Distributions

• Most appropriate when developed for the issue at hand.

• Example: local fish consumption
– survey individuals or otherwise estimate
– data from individuals elsewhere may be very misleading

• A number of very large data sets have been developed and published


Empirical Distributions

• Challenge: when there’s very little data

• Example of two data points
– uniform distribution?
– triangular distribution?
– not a hypothetical issue…is an ongoing debate in the literature

• Key is to state clearly your assumptions

• Better yet…do it both ways!
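"Doing it both ways" can be sketched for the skin cream potency, where the two data points are 0.05 and 0.10 per mg. The triangular mode at the midpoint (0.075) is an assumption, since two data points do not determine it; the 95th percentile is used here as the "conservative" summary.

```python
import random
import statistics

def simulate(draw_potency, trials=10_000, seed=0):
    """Return (mean, s.d., 95th percentile) of risk for a given potency sampler."""
    rng = random.Random(seed)
    risks = sorted(rng.gauss(0.02, 0.005) * draw_potency(rng) for _ in range(trials))
    p95 = risks[int(0.95 * trials) - 1]
    return statistics.fmean(risks), statistics.stdev(risks), p95

candidates = {
    "uniform": lambda rng: rng.uniform(0.05, 0.10),
    # Mode at the midpoint is an assumption; two data points do not fix it.
    "triangular": lambda rng: rng.triangular(0.05, 0.10, 0.075),
}

results = {name: simulate(draw) for name, draw in candidates.items()}
for name, (mean, sd, p95) in results.items():
    print(f"{name:>10}: mean={mean:.4f}  s.d.={sd:.5f}  95th %ile={p95:.4f}")
```

Both choices give about the same mean, but the triangular shape puts less weight on the extremes and so gives a narrower result. Reporting both makes the effect of the assumption visible.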


Which Distribution?


[Figure: four candidate distribution shapes, each drawn over the range 0.05 to 0.10]


Random number generation

• Shouldn’t be an issue…@Risk and Crystal Ball are both good to at least 10,000 iterations

• 10,000 iterations is typically enough, even with many input distributions


Theoretical Distributions

• Appropriate when there’s some mechanistic or probabilistic basis

• Example: small sample (say 50 test animals) establishes a binomial distribution

• Lognormal distributions show up often in nature
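The binomial case can be illustrated with a short sketch: if each of 50 test animals responds independently with the same probability, the number of responders follows a binomial distribution. The true response probability of 0.10 and the number of repeated experiments are hypothetical values chosen for illustration.

```python
import random
import statistics

def count_responders(rng, animals, true_probability):
    """One experiment: how many of the animals respond."""
    return sum(rng.random() < true_probability for _ in range(animals))

rng = random.Random(0)
animals = 50
true_probability = 0.10  # hypothetical value, not from the lecture

# Repeat the 50-animal experiment many times to see the sampling spread.
observed = [count_responders(rng, animals, true_probability) for _ in range(5_000)]
fractions = [k / animals for k in observed]

mean_fraction = statistics.fmean(fractions)
sd_fraction = statistics.stdev(fractions)
theory_sd = (true_probability * (1 - true_probability) / animals) ** 0.5

print(f"mean observed fraction = {mean_fraction:.3f}")
print(f"s.d. of observed fraction = {sd_fraction:.3f} (binomial theory: {theory_sd:.3f})")
```

The spread of the observed fraction is what a binomial input distribution would carry into a Monte Carlo analysis when only one such experiment is available.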


Some Caveats

• Beware believing that you’ve really “understood” uncertainty

• Beware: misapplication
– ignorance at best
– fraudulent at worst…porcine hoof blister


Example (after Finkel)

Alar “versus” aflatoxin

Exposure has two elements
– aflatoxin: peanut butter consumption, aflatoxin residue
– Alar: juice consumption, Alar/UDMH residue

Potency has one element
– aflatoxin potency
– UDMH potency

Risk = (consumption × residue × potency) / body weight


Inputs for Alar & aflatoxin

Variable                    Units       Mean     5th %ile   95th %ile   Percentile location of the mean
Peanut butter consumption   g/day       11.38    2.00       31.86       66
Apple juice consumption     g/day       136.84   16.02      430.02      69
aflatoxin residue           µg/g        2.82     1.00       6.50        61
UDMH residue                µg/g        13.75    0.5        42.00       67
aflatoxin potency           kg-day/mg   17.5     4.02       28.23       61
UDMH potency                kg-day/mg   0.49     0.00       0.85        43


Alar and aflatoxin point estimates

• aflatoxin estimates:
– Mean = (11.38 g/day × 2.82 µg/g × 17.5 kg-day/mg) / (20 kg × 1000 µg/mg) = 0.028
– Conservative = 0.29

• Alar (UDMH) estimates:
– Mean = 0.046
– Conservative = 0.77
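All four point estimates follow from the input table and the risk equation. The sketch below assumes a 20 kg body weight and a factor of 1000 to convert residue into mg, both taken from the worked aflatoxin calculation; reading the residue units as micrograms per gram is an inference from that conversion factor. The "conservative" rows multiply the three 95th percentiles.

```python
BODY_WEIGHT_KG = 20   # from the worked calculation on the slide
UG_PER_MG = 1000      # assumes residue is in micrograms per gram

def risk(consumption_g_per_day, residue_ug_per_g, potency_kg_day_per_mg):
    """Risk = (consumption x residue x potency) / body weight."""
    dose_mg_per_kg_day = (
        consumption_g_per_day * residue_ug_per_g / UG_PER_MG / BODY_WEIGHT_KG
    )
    return dose_mg_per_kg_day * potency_kg_day_per_mg

# (consumption, residue, potency): means and 95th percentiles from the input table.
inputs = {
    "aflatoxin": {"mean": (11.38, 2.82, 17.5), "conservative": (31.86, 6.50, 28.23)},
    "Alar (UDMH)": {"mean": (136.84, 13.75, 0.49), "conservative": (430.02, 42.00, 0.85)},
}

estimates = {
    name: {kind: risk(*values) for kind, values in rows.items()}
    for name, rows in inputs.items()
}
for name, rows in estimates.items():
    print(f"{name}: mean = {rows['mean']:.3f}, conservative = {rows['conservative']:.2f}")
```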


Alar and aflatoxin Monte Carlo

• 10,000 runs

• Generate distributions – (don’t allow 0)

• Don’t expect correlation
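The lecture does not state which distribution families were generated. As one possible approach, the sketch below fits a lognormal (which never produces 0) to the 5th and 95th percentiles of the two consumption rows in the input table, then checks the fit against the tabulated mean and the percentile location of the mean. The lognormal choice and the helper names are assumptions; the close agreement for these two rows is suggestive, not a confirmation of the method actually used.

```python
import math
import statistics

STANDARD_NORMAL = statistics.NormalDist()
Z95 = STANDARD_NORMAL.inv_cdf(0.95)  # about 1.645

def lognormal_from_percentiles(p5, p95):
    """Return (mu, sigma) of ln(X) for a lognormal with these 5th/95th percentiles."""
    mu = (math.log(p5) + math.log(p95)) / 2
    sigma = (math.log(p95) - math.log(p5)) / (2 * Z95)
    return mu, sigma

def lognormal_mean(mu, sigma):
    return math.exp(mu + sigma ** 2 / 2)

def percentile_of_mean(sigma):
    # ln(mean) = mu + sigma^2 / 2, so the mean sits sigma / 2 standard scores above the median.
    return 100 * STANDARD_NORMAL.cdf(sigma / 2)

# (name, 5th percentile, 95th percentile) in g/day, from the input table.
rows = [("peanut butter", 2.00, 31.86), ("apple juice", 16.02, 430.02)]
fits = {}
for name, p5, p95 in rows:
    mu, sigma = lognormal_from_percentiles(p5, p95)
    fits[name] = (lognormal_mean(mu, sigma), percentile_of_mean(sigma))
    print(f"{name}: mean = {fits[name][0]:.2f} g/day, "
          f"mean is at about the {fits[name][1]:.0f}th percentile")
```

Draws from such a fit could then be generated with `random.lognormvariate(mu, sigma)`.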


Aflatoxin and Alar Monte Carlo results (point values)

                 Analytical   Monte Carlo
Aflatoxin
  Mean           0.028        0.028
  Conservative   0.29         0.095
Alar
  Mean           0.046        0.046
  Conservative   0.77         0.18


Aflatoxin and Alar Monte Carlo results (distributions)

[Figure: frequency chart, forecast: peanut butter risk; x-axis 0 to 0.15; certainty is 98.05% from -Infinity to 0.1495]

[Figure: frequency chart, forecast: apple juice risk; x-axis 0 to 0.45]

[Figure: cumulative chart, forecast: peanut butter risk; x-axis 0 to 0.15; cumulative frequency up to 10,000]

[Figure: cumulative chart, forecast: apple juice risk; x-axis 0 to 0.45; cumulative frequency up to 10,000]



Frequency distribution--comparison

[Figure: overlay chart of the frequency distributions of peanut butter risk and apple juice risk; x-axis 0 to 0.45]



Cumulative distribution--comparison

[Figure: overlay chart of the cumulative distributions of peanut butter risk and apple juice risk; x-axis 0 to 0.45]


References and Further Reading

Burmaster, D.E. and Anderson, P.D. (1994). “Principles of good practice for the use of Monte Carlo techniques in human health and ecological risk assessments.” Risk Analysis 14(4):447-81.

Finkel, A. (1995). “Towards less misleading comparisons of uncertain risks: the example of aflatoxin and Alar.” Environmental Health Perspectives 103(4):376-85.

Kammen, D.M. and Hassenzahl, D.M. (1999). Should We Risk It? Exploring Environmental, Health and Technological Problem Solving. Princeton University Press, Princeton, NJ.

Thompson, K.M., Burmaster, D.E., et al. (1992). “Monte Carlo techniques for uncertainty analysis in public health risk assessments.” Risk Analysis 12(1):53-63.

Vose, D. (1997). “Monte Carlo Risk Analysis Modeling.” In Molak, Ed., Fundamentals of Risk Analysis and Risk Management.
