Copyright © 2010 Lumina Decision Systems, Modeling Uncertainty: Probability Distributions Lonnie Chrisman, Ph.D. Lumina Decision Systems Analytica User Group Webinar Series Session 2: 6 May 2010

Modeling Uncertainty: Probability Distributions

Page 1: Modeling Uncertainty: Probability Distributions

Copyright © 2010 Lumina Decision Systems, Inc.

Modeling Uncertainty: Probability Distributions

Lonnie Chrisman, Ph.D.
Lumina Decision Systems

Analytica User Group Webinar Series

Session 2: 6 May 2010

Page 2: Modeling Uncertainty: Probability Distributions

Today’s Topics

• Review
• How can we characterize uncertainty for continuous quantities?
• The Normal Distribution
    Viewing & interpreting
• LogNormal Distribution
• Why include uncertainty

Page 3: Modeling Uncertainty: Probability Distributions

Course Syllabus (tentative)

Over the coming weeks:
• What is uncertainty? Probability.
• Probability Distributions (today)
• Monte Carlo Sampling
• Measures of Risk and Utility
• Common parametric distributions
• Assessment of Uncertainty
• Risk analysis for portfolios (risk management)
• Hypothesis testing

Page 4: Modeling Uncertainty: Probability Distributions

Review

Page 5: Modeling Uncertainty: Probability Distributions

What is Uncertainty?

• Uncertainty: the lack of perfect and complete knowledge.

• Applies to:
    Future outcomes
    Existing states or quantities
    Physical measurements
    Unknowable (quantum mechanics)

• Exercise: State something that you have perfect and complete knowledge of.

Page 6: Modeling Uncertainty: Probability Distributions

Related Concepts

• Randomness
    Will my next coin toss be heads or tails?
• Variation
    75% of the people in this room have type A blood.
• Vagueness
    How many people worldwide live in warm climates?
• Risk
    You could die during the operation.
• Statistical Confidence/Significance
    The study confirmed the hypothesis at a 95% confidence level.

Page 7: Modeling Uncertainty: Probability Distributions

Probability: A language for uncertainty

Probability: A measure, on a scale from 0 to 1, of how certain we are that a statement is true.

• P(A)=0 : Assertion A is certainly false.
• P(A)=1 : Assertion A is certainly true.
• P(A)=0.5: Equally likely to be true or false.
• P(A)=0.7: A is more likely true than false.

Page 8: Modeling Uncertainty: Probability Distributions

Assertions must be Crisp and Unambiguous

Probability of what?
• Must be a true/false assertion.
• Vagueness not allowed.

✘ “Gas prices will increase substantially in the short term.”

✔ “The average retail price for regular unleaded gas in the state of California, as reported by the U.S. Energy Information Administration, will increase by more than 20% from 26 Apr 2010 to 30 Aug 2010.”

• Truth theoretically knowable

Page 9: Modeling Uncertainty: Probability Distributions

Boolean Chance Variables in Analytica

• Characterized by a single probability – P(B=true).
• Examples:
    Component fails
    Dow drops by >1000 points
    Civil war breaks out in Nigeria
    Subject is male
• Use Chance variable defined as
    Bernoulli(p)
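
For readers without Analytica at hand, the idea of a Boolean chance variable can be sketched in Python. This is only an illustration of the concept, not Analytica's implementation: the `bernoulli` helper and the value `p_fail = 0.1` are hypothetical and chosen for the example.

```python
import random

def bernoulli(p, rng):
    """Return True with probability p, False otherwise."""
    return rng.random() < p

rng = random.Random(0)   # fixed seed so the run is repeatable
p_fail = 0.1             # hypothetical P(component fails)

# Draw many samples of the Boolean variable and check the observed frequency.
draws = [bernoulli(p_fail, rng) for _ in range(100_000)]
observed_frequency = sum(draws) / len(draws)

print(f"P(B=true) = {p_fail}, observed frequency = {observed_frequency:.3f}")
```

The single number `p_fail` is all that is needed to define the variable; the sampled frequency merely approximates it.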

Page 10: Modeling Uncertainty: Probability Distributions

“Subjective” Interpretation of Probability

• Probabilities measure:
    how much we know,
    not frequency of occurrence.

• Calibration: Over many probability assessments, the frequency of true assertions should match our subjective probabilities for the assertions.

Page 11: Modeling Uncertainty: Probability Distributions

Today’s New Topics

Page 12: Modeling Uncertainty: Probability Distributions

Continuous Quantities

• Most variables in quantitative models represent real-valued quantities. Examples:
    Revenue
    Infection rate
    Oil well capacity
    Megawatt power output
    Unit sales (?)

• Saying “Probability of x”, or P(x), is nonsensical.

• We need something more…

Page 13: Modeling Uncertainty: Probability Distributions

Real-valued uncertainty example

At this time (6 May 2010), at what rate (in gallons per hour) is oil leaking into the Gulf of Mexico from the well in Louisiana that exploded on 22 Apr 2010?

• Does this pass the clarity test?
• How can we express our knowledge and degree of uncertainty regarding the true value?

Note: A CNN article gave an estimate of 8,300 gal/hr.

Page 14: Modeling Uncertainty: Probability Distributions

Ways of Expressing Uncertainty

(Attendees' ideas)
Rate of oil leak:
• Minimum & maximum values
• Standard deviation
• Mean + Median (if different)
• Distribution, e.g., triangular with 10% + 90% percentiles.

Page 15: Modeling Uncertainty: Probability Distributions

Average Deviation

Suppose our “best guess” is:
    E[ oil_leak_rate ] = 10K gal/hr

• What is the expected error in our estimate?
    = E[ |10K – trueValue| ]

• Ave. dev. is a simple (intuitive?) one-number measure of how uncertain we are.

Allows us to characterize our knowledge / uncertainty with just two numbers:

Expected value + Expected deviation

Aka: Expected Deviation, (mean/average) Absolute deviation.
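
As a rough illustration of this two-number summary, the sketch below computes an expected value and an average deviation in Python from a handful of equally weighted scenario values. The scenario numbers are hypothetical, chosen only so that the mean comes out at 10K gal/hr; they are not estimates from the webinar.

```python
from statistics import fmean

# Hypothetical, equally likely scenarios for the leak rate (gal/hr).
scenarios = [5_000, 8_000, 10_000, 12_000, 15_000]

# Expected value: the "best guess".
expected_value = fmean(scenarios)

# Average (absolute) deviation: expected size of the error in the best guess.
average_deviation = fmean(abs(x - expected_value) for x in scenarios)

print(f"Expected value:    {expected_value:,.0f} gal/hr")
print(f"Average deviation: {average_deviation:,.0f} gal/hr")
```

With unequal scenario weights, each term would be multiplied by its probability instead of being averaged uniformly.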

Page 16: Modeling Uncertainty: Probability Distributions

Standard Deviation

• Other measures of uncertainty (“dispersion”):
    Variance (expected/average squared error):
        $\text{Variance} = E[(10\text{K} - \text{trueValue})^2]$
    Standard Deviation:
        $\text{Standard Deviation} = \sqrt{\text{Variance}} = \sqrt{E[(10\text{K} - \text{trueValue})^2]}$

• Standard deviation has the same intuitive meaning as average (absolute) deviation.
    Both are a type of best guess for how much error our best guess has.
    Nicer mathematical properties
    More commonly used.

Page 17: Modeling Uncertainty: Probability Distributions

Standard Deviation vs. Average Deviation

• Both are always non-negative.
• Zero indicates absolute certainty.
• Both are measured in the same units as x.

• Q: Which measure gets larger when extreme errors are more likely?

• What is the typical ratio sd/ad?
    Symmetric: sd ≈ 1.25 ad
    One-sided tail: sd ≈ 1.35 ad
    “Heavy” tails: (up to) 1.3 ad ≤ sd ≤ 2.5 ad

$sd = \sqrt{E[(x - x^*)^2]}$ vs. $ad = E[|x - x^*|]$
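
The typical ratios can be checked numerically. The sketch below estimates sd/ad by sampling, using a Normal distribution for the symmetric case and an exponential distribution as one possible example of a one-sided tail. The choice of the exponential is an assumption made for illustration; the slides do not say which distribution the 1.35 figure refers to.

```python
import math
import random
from statistics import fmean

def sd_over_ad(samples):
    """Ratio of standard deviation to average absolute deviation."""
    center = fmean(samples)
    ad = fmean(abs(x - center) for x in samples)
    sd = math.sqrt(fmean((x - center) ** 2 for x in samples))
    return sd / ad

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 100_000

normal_samples = [rng.gauss(0.0, 1.0) for _ in range(n)]       # symmetric
one_sided_samples = [rng.expovariate(1.0) for _ in range(n)]   # one-sided tail

ratio_normal = sd_over_ad(normal_samples)
ratio_one_sided = sd_over_ad(one_sided_samples)

# Closed-form ratios: sqrt(pi/2) for the Normal, e/2 for the exponential.
print(f"Normal:      sd/ad = {ratio_normal:.3f} (theory {math.sqrt(math.pi / 2):.3f})")
print(f"Exponential: sd/ad = {ratio_one_sided:.3f} (theory {math.e / 2:.3f})")
```

Because squaring weights large errors more heavily, sd grows faster than ad as extreme errors become more likely, which is why the ratio rises for tailed distributions.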

Page 18: Modeling Uncertainty: Probability Distributions

Expressing uncertainty for a real-valued quantity

• Expected value + dispersion measure, e.g.:

    Expected value + average deviation
    Expected value + standard deviation

• Exercise: Express your uncertainty for the oil well leak example in the above forms.

• There are no probabilities here. Why?

Page 19: Modeling Uncertainty: Probability Distributions

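Visualization: Normal Distribution

[Figure: probability density of a Normal distribution, annotated with the expected value, average deviation, and standard deviation. EV = 10K, AD = 3K, SD = 3.8K]

This is called a probability density function (PDF) plot.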

Page 20: Modeling Uncertainty: Probability Distributions

Visualization: Normal Distribution

[Figure: Normal PDF (EV = 10K, AD = 3K, SD = 3.8K) with the region within +/- 1 average deviation of the expected value marked.]

58% of area within 1 average deviation.

The connection to probability.

Page 21: Modeling Uncertainty: Probability Distributions

Visualization: Normal Distribution

[Figure: Normal PDF (EV = 10K, AD = 3K, SD = 3.8K) with the region within +/- 1 standard deviation of the expected value marked.]

68% of area within 1 standard deviation.
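
The two area figures can be reproduced with the Python standard library. This sketch assumes the plotted distribution is Normal with mean 10K and standard deviation 3.8K, and uses the Normal-specific identity ad = sd × sqrt(2/π) rather than the rounded 3K from the slide. The helper `prob_within` is a name introduced here for illustration.

```python
import math
from statistics import NormalDist

rate = NormalDist(mu=10_000, sigma=3_800)   # EV and SD from the slides

# For a Normal distribution, ad = sd * sqrt(2/pi), which is about 3.0K here.
ad = rate.stdev * math.sqrt(2 / math.pi)

def prob_within(dist, half_width):
    """P(mean - half_width <= X <= mean + half_width)."""
    return dist.cdf(dist.mean + half_width) - dist.cdf(dist.mean - half_width)

within_ad = prob_within(rate, ad)
within_sd = prob_within(rate, rate.stdev)

print(f"Within 1 average deviation:  {within_ad:.1%}")
print(f"Within 1 standard deviation: {within_sd:.1%}")
```

This is the connection to probability: the area under the PDF over an interval is the probability that the quantity falls in that interval.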

Page 22: Modeling Uncertainty: Probability Distributions

Cumulative Probability Function (CDF)

• Easier to read than PDF.
• P(rate ≤ x)
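
A CDF can be read in both directions: from a value to a probability, or from a probability to a percentile. The sketch below shows both, assuming the oil leak rate is modeled as Normal with mean 10K and standard deviation 3.8K as in the preceding plots. That modeling choice is an assumption for illustration, not a claim about the actual leak.

```python
from statistics import NormalDist

rate = NormalDist(mu=10_000, sigma=3_800)   # EV = 10K, SD = 3.8K (gal/hr)

# Value -> probability: chance the rate is at or below the CNN figure.
p_below_cnn = rate.cdf(8_300)

# Probability -> value: the 10th, 50th and 90th percentiles.
p10, p50, p90 = (rate.inv_cdf(p) for p in (0.10, 0.50, 0.90))

print(f"P(rate <= 8,300) = {p_below_cnn:.2f}")
print(f"10th / 50th / 90th percentiles: {p10:,.0f} / {p50:,.0f} / {p90:,.0f} gal/hr")
```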

Page 23: Modeling Uncertainty: Probability Distributions

Specifying the Normal Distribution in Analytica

• Define your real-valued variable as:
    Normal( mean, stddev )

Take note: Standard Deviation, not expected/average deviation.

Remember to increase your estimated average deviation slightly (e.g., by 25%) to obtain the standard deviation.

Page 24: Modeling Uncertainty: Probability Distributions

Exercise

A toy company must decide how many toys to manufacture for the Christmas season three months in advance.

Demand is: Normal(100K,25K)

It costs $5 to manufacture a toy. The company makes a $10 profit on each toy sold.

They order 100K toys. What is their expected profit?

Page 25: Modeling Uncertainty: Probability Distributions

Exercise (cont.)

Using the toy company example:
• Compare estimated profit when uncertainty is ignored (based on Mean demand) to mean profit.
• Examine how mean profit varies with the number of toys ordered:
    Units_ordered := Sequence(70K,130K,1K)
• What size order should they place?
• What improvement in value results from including explicit uncertainty in the model?
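
The exercise is meant to be built in Analytica, but the reasoning can be sketched in Python with a small Monte Carlo simulation. The profit model below is one possible reading of the problem statement and is an assumption: each toy sold earns $10 net of its manufacturing cost, and each unsold toy loses its $5 manufacturing cost. The names `profit`, `mean_profit`, and `demand_samples` are introduced here for illustration.

```python
import random
from statistics import fmean

UNIT_COST = 5      # $ to manufacture one toy
UNIT_PROFIT = 10   # $ profit on each toy sold (assumed net of the $5 cost)

def profit(ordered, demand):
    """Assumed profit model: $10 per toy sold, $5 lost per unsold toy."""
    sold = min(ordered, demand)
    unsold = ordered - sold
    return UNIT_PROFIT * sold - UNIT_COST * unsold

rng = random.Random(2)   # fixed seed so the run is repeatable
# Demand ~ Normal(100K, 25K); the rare negative draw is clamped to zero.
demand_samples = [max(0.0, rng.gauss(100_000, 25_000)) for _ in range(10_000)]

def mean_profit(ordered):
    """Average profit over the same demand samples for a given order size."""
    return fmean(profit(ordered, d) for d in demand_samples)

# Uncertainty ignored: plug the mean demand straight into the profit model.
profit_ignoring_uncertainty = profit(100_000, 100_000)
mean_profit_at_100k = mean_profit(100_000)

# Python counterpart of Units_ordered := Sequence(70K,130K,1K).
units_ordered = range(70_000, 130_001, 1_000)
best_order = max(units_ordered, key=mean_profit)
improvement = mean_profit(best_order) - mean_profit_at_100k

print(f"Profit ignoring uncertainty: ${profit_ignoring_uncertainty:,.0f}")
print(f"Mean profit at 100K ordered: ${mean_profit_at_100k:,.0f}")
print(f"Best order size:             {best_order:,} toys")
print(f"Gain from the better order:  ${improvement:,.0f}")
```

Under this assumed model the mean profit falls below the profit computed from mean demand, because a shortfall in demand costs money while surplus demand cannot be served. A different reading of the cost structure would change the numbers but not that asymmetry.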

Page 26: Modeling Uncertainty: Probability Distributions

Positive real-valued quantities

• Many real-valued quantities are positive-only, but have no hard upper limit:
    Oil leak rate
    Demand
    Population counts
    Stock prices
    Multiplier for positive quantity
    Capacities

• Normal distribution allows negative values.

Page 27: Modeling Uncertainty: Probability Distributions

Nonsense negatives

Negative oil leak? Nearly impossible?
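
How nearly impossible is it? The sketch below computes the probability mass a Normal distribution places below zero, assuming the oil leak rate is Normal(10K, 3.8K) and, for comparison, the toy demand is Normal(100K, 25K) as in the earlier exercise.

```python
from statistics import NormalDist

leak_rate = NormalDist(mu=10_000, sigma=3_800)
toy_demand = NormalDist(mu=100_000, sigma=25_000)

# Probability that each Normal model produces a nonsensical negative value.
p_negative_leak = leak_rate.cdf(0)
p_negative_demand = toy_demand.cdf(0)

print(f"P(leak rate < 0)  = {p_negative_leak:.4%}")
print(f"P(toy demand < 0) = {p_negative_demand:.4%}")
```

The probability is small but not zero, and it grows as the standard deviation becomes large relative to the mean.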

Page 28: Modeling Uncertainty: Probability Distributions

LogNormal Distribution

[Figure: LogNormal probability density. x-axis: Oil leak rate (gal/hr), 0 to 30K; y-axis: Probability Density, 0 to 140u. The Mode, Median, and Mean are marked.]

• Positive values only.
• Positive skew (most values to right of mode)
• Multiple possible “central” estimates.

Page 29: Modeling Uncertainty: Probability Distributions

Specifying a LogNormal

LogNormal(median,gsdev,mean,stddev)

• You specify any two of these:
    Median: 50th percentile – “typical value”
    Mean: Average value
    Gsdev: geometric standard deviation
    Stddev: (Arithmetic) standard deviation

• When using LogNormal, use named-parameter syntax, e.g.:
    LogNormal(mean:10K,stddev:3.8K)
    LogNormal(median:9350,mean:10K)
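
The four parameters are related through the standard lognormal identities, which is why any two determine the other two. The sketch below converts a mean and standard deviation into the median, geometric standard deviation, and mode. It uses the textbook formulas; the function name `lognormal_from_mean_stddev` is hypothetical, and the slides do not state how Analytica performs the conversion internally.

```python
import math

def lognormal_from_mean_stddev(mean, stddev):
    """Return (mu, sigma) of ln(X) for a lognormal X with the given mean and stddev."""
    sigma_sq = math.log(1 + (stddev / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2
    return mu, math.sqrt(sigma_sq)

mu, sigma = lognormal_from_mean_stddev(10_000, 3_800)

median = math.exp(mu)              # 50th percentile
gsdev = math.exp(sigma)            # geometric standard deviation
mode = math.exp(mu - sigma ** 2)   # peak of the density

print(f"median = {median:,.0f}, gsdev = {gsdev:.3f}, mode = {mode:,.0f}")
```

For mean 10K and stddev 3.8K the computed median lands within a few units of 9350, which suggests the two example calls on the slide describe approximately the same distribution. The ordering mode < median < mean reflects the positive skew.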

Page 30: Modeling Uncertainty: Probability Distributions

Exercise

A mining company obtains rights to extract a gold deposit during a one-week window next year, before a construction project starts on the site.

Extracting the deposit will cost $900K.
The size of the deposit:
    LogNormal(Mean:1K,Stddev:300) oz.
The price of gold next year:
    LogNormal(Mean:$1K, stddev:$500)

What is the expected value of these mining rights? Compare to result ignoring uncertainty.
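
One way to approach this exercise outside Analytica is a Monte Carlo sketch in Python. It rests on assumptions the slide does not state: deposit size and gold price are treated as independent, and the company is assumed able to skip extraction (value zero) when the deposit would be worth less than the $900K cost. The helper `lognormal_params` and the variable names are introduced here for illustration.

```python
import math
import random
from statistics import fmean

def lognormal_params(mean, stddev):
    """Return (mu, sigma) of ln(X) for a lognormal X with the given mean and stddev."""
    sigma_sq = math.log(1 + (stddev / mean) ** 2)
    return math.log(mean) - sigma_sq / 2, math.sqrt(sigma_sq)

EXTRACTION_COST = 900_000
size_mu, size_sigma = lognormal_params(1_000, 300)     # deposit size, oz
price_mu, price_sigma = lognormal_params(1_000, 500)   # gold price, $/oz

rng = random.Random(3)   # fixed seed so the run is repeatable
net_values = []
for _ in range(50_000):
    size = rng.lognormvariate(size_mu, size_sigma)
    price = rng.lognormvariate(price_mu, price_sigma)
    net_values.append(size * price - EXTRACTION_COST)

# Uncertainty ignored: mean size times mean price, minus cost.
value_ignoring_uncertainty = 1_000 * 1_000 - EXTRACTION_COST
# Committed to extract regardless of outcome.
value_if_committed = fmean(net_values)
# Assumed option to walk away when extraction would lose money.
value_with_option = fmean(max(0.0, v) for v in net_values)

print(f"Ignoring uncertainty:         ${value_ignoring_uncertainty:,.0f}")
print(f"Mean value if committed:      ${value_if_committed:,.0f}")
print(f"Mean value with option to skip: ${value_with_option:,.0f}")
```

Under these assumptions, the right to decline an unprofitable extraction makes the rights worth considerably more than the average-case calculation suggests. If the company had to commit in advance, the mean would stay close to the average-case figure.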

Page 31: Modeling Uncertainty: Probability Distributions

How important is choice of distribution?

Exercise:
• Modify mining example to use Normal instead of LogNormal, same mean & stddev.

• How much does this change the result?

Page 32: Modeling Uncertainty: Probability Distributions

Compare Normal to LogNormal

[Figure: probability densities of a Normal and a LogNormal distribution overlaid. x-axis: Oil leak rate (gal/hr), 0 to 30K; y-axis: Probability Density, 0 to 140u. Legend: Normal, LogNormal.]

These have the same mean and same standard deviation.

Page 33: Modeling Uncertainty: Probability Distributions

The Flaw of Averages

Who is this guy?

A: Sam Savage, author of The Flaw of Averages:

An entertaining account of the distortions caused by average-case analysis.

Page 34: Modeling Uncertainty: Probability Distributions

Why model uncertainty explicitly?

• Misleading results otherwise… “Flaw of averages”
• Explicit “precision” of results.
• Some decisions are about uncertainty. E.g.,
    to gather more information
    contingency planning
• Improved combining of information sources.
• Productivity: Probabilities & distributions can often be estimated more quickly than expected values (!)
• Sensitivity analyses
• Causal modeling & abduction (diagnostic reasoning)

Page 35: Modeling Uncertainty: Probability Distributions

What we covered

• Uncertainty about continuous quantities can be largely characterized by:

    Central value (e.g., mean or median)
    Dispersion measure (expected deviation, standard deviation, variance, geometric standard deviation).

• Normal distribution – unbounded quantities

• LogNormal distribution – positive quantities