Theoretical Probability Models
Dr. Yan Liu
Department of Biomedical, Industrial & Human Factors Engineering
Wright State University



Introduction

Use theoretical probability models when they describe the physical model “adequately”:
- The results of intelligence tests ~ Normal distribution
- The length of a telephone call ~ Exponential distribution
- The number of people arriving at a bank within an hour ~ Poisson distribution
- The number of defects in a bottle production line ~ Binomial distribution

Discrete distributions: Binomial and Poisson distributions

Continuous distributions: Exponential, Normal, and Beta distributions


Binomial Distribution

Characteristics
- A total of n trials
- Dichotomous outcomes: each trial results in one of two possible outcomes (e.g. yes/no, true/false)
- Constant probability: each trial has the same probability of success, p
- Independence: different trials are independent

Probability Mass Function (PMF)

X = “# of successes in a sequence of n independent trials, where the probability of success in each trial is p” (success means the occurrence of an event)

$$X \sim B(n, p) \iff \Pr(X = x \mid n, p) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, \ldots, n$$

$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$: number of ways you can choose x successes from n trials

(See Appendix A)


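Binomial Distribution (Cont.)

Expected Value

$$E(X) = np$$

Variance

$$\operatorname{var}(X) = \operatorname{var}(Y) = np(1-p)$$

Y = “# of failures in a sequence of n independent trials”, then Y = n – X

$$Y \sim B(n, 1-p) \iff \Pr(Y = y \mid n, 1-p) = \binom{n}{y} (1-p)^{y} p^{n-y}$$

$$E(Y) = n(1-p)$$

Cumulative Distribution Function

$$X \sim B(n, p) \iff F(x \mid n, p) = \Pr(X \le x \mid n, p) = \sum_{i=0}^{x} \binom{n}{i} p^{i} (1-p)^{n-i}$$

(See Appendix B)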
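The binomial PMF, CDF, and moments above can be checked numerically. The following is a minimal Python sketch using only the standard library; the function names `binomial_pmf` and `binomial_cdf` are illustrative and do not come from the slides.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Pr(X = x | n, p) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """Pr(X <= x | n, p): sum of the PMF from 0 to x."""
    return sum(binomial_pmf(i, n, p) for i in range(x + 1))


# Illustrative parameters (assumed for this sketch): n = 20 trials, p = 0.3.
n, p = 20, 0.3
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

print(f"Pr(X = 5)  = {binomial_pmf(5, n, p):.3f}")
print(f"Pr(X <= 5) = {binomial_cdf(5, n, p):.3f}")
print(f"E(X) = {mean:.3f} (np = {n * p:.3f})")
print(f"var(X) = {variance:.3f} (np(1-p) = {n * p * (1 - p):.3f})")
```

Summing the PMF directly is fine for small n; for large n a statistics library would be the more practical choice.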


Pretzel Example

You are planning to sell a new pretzel, and you want to know whether it will be a success or not. If your pretzel is a “Hit”, you expect to gain 30% of the market. If it is a “Flop”, on the other hand, the market share is only 10%. Initially, you judged these outcomes to be equally likely. You decided to test the market first and found out that 5 out of 20 people preferred your pretzel to the competing product. Given the new data, what do you think of the chance of your pretzel being a Hit?

Let X = number of people out of 20 tasters who preferred the new pretzel

Pr(Hit | X = 5) = ?

By Bayes' theorem, with Pr(Hit) = Pr(Flop) = 0.5:

$$\Pr(\text{Hit} \mid X = 5) = \frac{\Pr(X = 5 \mid \text{Hit}) \Pr(\text{Hit})}{\Pr(X = 5)}$$

$$\Pr(X = 5) = \Pr(X = 5 \cap \text{Hit}) + \Pr(X = 5 \cap \text{Flop}) = \Pr(X = 5 \mid \text{Hit}) \Pr(\text{Hit}) + \Pr(X = 5 \mid \text{Flop}) \Pr(\text{Flop})$$

The two likelihoods are binomial:

$$X \mid \text{Hit} \sim B(n = 20, p = 0.3) \Rightarrow \Pr(X = 5 \mid \text{Hit}) = \binom{20}{5} 0.3^{5} \, 0.7^{15} = 0.179$$

$$X \mid \text{Flop} \sim B(n = 20, p = 0.1) \Rightarrow \Pr(X = 5 \mid \text{Flop}) = \binom{20}{5} 0.1^{5} \, 0.9^{15} = 0.032$$

$$\Pr(X = 5) = 0.179 \cdot 0.5 + 0.032 \cdot 0.5 = 0.1055$$

$$\Pr(\text{Hit} \mid X = 5) = \frac{0.179 \cdot 0.5}{0.1055} = 0.848$$

In conclusion, the new data suggest that the new pretzel is very likely to be a hit.
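The Bayes update in this example can be reproduced in a few lines. This is a hedged Python sketch: the helper `posterior` and the dictionary layout are assumptions made for illustration, while the priors, market shares, and data (5 of 20) are taken from the example.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Pr(X = x | n, p) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def posterior(priors, likelihoods):
    """Bayes' theorem over a finite set of hypotheses (hypothetical helper)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # Pr(data), by total probability
    return {h: joint[h] / total for h in joint}


priors = {"Hit": 0.5, "Flop": 0.5}
market_share = {"Hit": 0.3, "Flop": 0.1}

# Likelihood of seeing 5 of 20 tasters prefer the pretzel under each hypothesis.
likelihoods = {h: binomial_pmf(5, 20, q) for h, q in market_share.items()}
post = posterior(priors, likelihoods)

for h, value in post.items():
    print(f"Pr({h} | X = 5) = {value:.3f}")
```

Without rounding the intermediate likelihoods, the posterior for Hit comes out at about 0.85, in line with the 0.848 obtained from the rounded values.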


Poisson Distribution

Represents occurrences of events over a unit of measure (time or space), e.g. number of customers arriving, number of breakdowns occurring

Assumptions
- Events can happen at any point along a continuum
- At any particular point, the probability of an event is small (i.e. events do not happen frequently)
- Events happen independently of one another
- The average number of events is constant over a unit of measure

Probability Mass Function

X = “# of events in a unit of measure”

$$X \sim \text{Poisson}(m) \iff \Pr(X = x \mid m) = \frac{e^{-m} m^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$

m is the average number of events in a unit of measure

(See Appendix C)


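Poisson Distribution (Cont.)

Expected Value

$$E(X) = m$$

Variance

$$\operatorname{var}(X) = m$$

Y = “# of events in t units of measure”

$$Y \sim \text{Poisson}(m \cdot t) \iff \Pr(Y = y \mid m \cdot t) = \frac{e^{-mt} (mt)^{y}}{y!}$$

$$E(Y) = mt, \quad \operatorname{var}(Y) = mt$$

Cumulative Distribution Function

$$X \sim \text{Poisson}(m) \iff F(x \mid m) = \Pr(X \le x \mid m) = \sum_{i=0}^{x} \frac{e^{-m} m^{i}}{i!}$$

(See Appendix D)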
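As a quick numerical illustration of the Poisson PMF, the rescaling to t units of measure, and the CDF, here is a minimal Python sketch. The function names and the example rate (20 events per hour observed for half an hour) are assumptions chosen for illustration.

```python
from math import exp, factorial


def poisson_pmf(x, m):
    """Pr(X = x | m) for X ~ Poisson(m)."""
    return exp(-m) * m**x / factorial(x)


def poisson_cdf(x, m):
    """Pr(X <= x | m): sum of the PMF from 0 to x."""
    return sum(poisson_pmf(i, m) for i in range(x + 1))


# Assumed example: m = 20 events per hour, observed for t = 0.5 hour,
# so the count Y over that window is Poisson(m * t) = Poisson(10).
m, t = 20, 0.5
print(f"Pr(Y = 7)  = {poisson_pmf(7, m * t):.4f}")
print(f"Pr(Y <= 7) = {poisson_cdf(7, m * t):.4f}")
```

The only change needed for a different observation window is the product m * t passed as the Poisson parameter.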


Pretzel Example (Cont.)

Based on your previous market research, you decide to invest in a pretzel stand. Now you need to select a good location. You consider a location to be “good”, “bad”, or “dismal” if you sell 20, 10, or 6 pretzels per hour, respectively. You have found a new stand and your initial judgment is that the probabilities of the location being good, bad, and dismal are 0.7, 0.2, and 0.1, respectively. After having the stand for a week, you decided to run a test. Within 30 minutes, you sold 7 pretzels. Now, what are your probabilities regarding the quality of the stand?

Let X = number of pretzels sold within 30 minutes (0.5 hour)

Pr(Good | X = 7) = ?

By Bayes' theorem, with Pr(Good) = 0.7, Pr(Bad) = 0.2, and Pr(Dismal) = 0.1:

$$\Pr(\text{Good} \mid X = 7) = \frac{\Pr(X = 7 \mid \text{Good}) \Pr(\text{Good})}{\Pr(X = 7)}$$

$$\Pr(X = 7) = \Pr(X = 7 \cap \text{Good}) + \Pr(X = 7 \cap \text{Bad}) + \Pr(X = 7 \cap \text{Dismal})$$

$$= \Pr(X = 7 \mid \text{Good}) \Pr(\text{Good}) + \Pr(X = 7 \mid \text{Bad}) \Pr(\text{Bad}) + \Pr(X = 7 \mid \text{Dismal}) \Pr(\text{Dismal})$$

The three likelihoods are Poisson, with the hourly rate scaled to half an hour:

$$X \mid \text{Good} \sim \text{Poisson}(m = 0.5 \cdot 20 = 10) \Rightarrow \Pr(X = 7 \mid \text{Good}) = \frac{e^{-10} \, 10^{7}}{7!} = 0.09$$

$$X \mid \text{Bad} \sim \text{Poisson}(m = 0.5 \cdot 10 = 5) \Rightarrow \Pr(X = 7 \mid \text{Bad}) = \frac{e^{-5} \, 5^{7}}{7!} = 0.104$$

$$X \mid \text{Dismal} \sim \text{Poisson}(m = 0.5 \cdot 6 = 3) \Rightarrow \Pr(X = 7 \mid \text{Dismal}) = \frac{e^{-3} \, 3^{7}}{7!} = 0.022$$

$$\Pr(X = 7) = 0.09 \cdot 0.7 + 0.104 \cdot 0.2 + 0.022 \cdot 0.1 = 0.086$$

$$\Pr(\text{Good} \mid X = 7) = \frac{0.09 \cdot 0.7}{0.086} = 0.733$$

In light of the new data, you feel that the chance of the current stand being a good location has slightly increased (from 0.7 to about 0.73), and thus you should stay.
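The same update can be carried out for all three location types at once. The sketch below is illustrative Python, not part of the original slides; the variable names are assumptions, and the rates, priors, and observation (7 pretzels in 0.5 hour) come from the example.

```python
from math import exp, factorial


def poisson_pmf(x, m):
    """Pr(X = x | m) for X ~ Poisson(m)."""
    return exp(-m) * m**x / factorial(x)


rates_per_hour = {"Good": 20, "Bad": 10, "Dismal": 6}
priors = {"Good": 0.7, "Bad": 0.2, "Dismal": 0.1}
hours_observed, pretzels_sold = 0.5, 7

# Likelihood of the observed sales under each location type.
likelihoods = {
    k: poisson_pmf(pretzels_sold, rate * hours_observed)
    for k, rate in rates_per_hour.items()
}

# Total probability of the data, then Bayes' theorem for each hypothesis.
evidence = sum(priors[k] * likelihoods[k] for k in priors)
post = {k: priors[k] * likelihoods[k] / evidence for k in priors}

for k, value in post.items():
    print(f"Pr({k} | X = 7) = {value:.3f}")
```

Besides confirming Pr(Good | X = 7) of about 0.73, the full posterior shows where the remaining probability goes: mostly to Bad, with very little left on Dismal.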


Exponential Distribution

If the number of events occurring within a unit of measure follows a Poisson distribution, then the time or space between the occurrence of two events follows an exponential distribution

Exponential distribution has the same assumptions as Poisson distribution

Probability Density Function

Let T = “Time (space) between two consecutive events”

$$T \sim \text{Exp}(m) \iff f(t \mid m) = m e^{-mt} \quad (t \ge 0)$$

m is the same average rate used in Poisson distribution

Cumulative Distribution Function

$$F(t \mid m) = \Pr(T \le t \mid m) = 1 - e^{-mt}$$


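Exponential Distribution (Cont.)

Expected Value

$$E(T) = 1/m$$

Variance

$$\operatorname{var}(T) = 1/m^{2}$$

Other Important Probabilities

$$\Pr(T > t \mid m) = 1 - \Pr(T \le t \mid m) = 1 - (1 - e^{-mt}) = e^{-mt}$$

$$\Pr(a < T \le b \mid m) = \Pr(T \le b \mid m) - \Pr(T \le a \mid m) = e^{-ma} - e^{-mb}$$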
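These exponential probabilities translate directly into code. The following Python sketch is illustrative only; the function names and the example rate of one event every three minutes are assumptions.

```python
from math import exp


def exp_cdf(t, m):
    """Pr(T <= t | m) for T ~ Exp(m)."""
    return 1.0 - exp(-m * t)


def exp_survival(t, m):
    """Pr(T > t | m), the complement of the CDF."""
    return exp(-m * t)


def exp_interval(a, b, m):
    """Pr(a < T <= b | m) = e^{-ma} - e^{-mb}."""
    return exp(-m * a) - exp(-m * b)


# Assumed rate for illustration: m = 1/3 events per minute (mean gap of 3 minutes).
m = 1 / 3
print(f"E(T) = {1 / m:.1f} minutes")
print(f"Pr(T <= 3.5)    = {exp_cdf(3.5, m):.3f}")
print(f"Pr(T > 3.5)     = {exp_survival(3.5, m):.3f}")
print(f"Pr(1 < T <= 3.5) = {exp_interval(1, 3.5, m):.3f}")
```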


Pretzel Example (Cont.)

You wonder if you can provide fast service to your customers. It takes 3.5 minutes to cook a pretzel, so what is the probability that the next customer arrives before the pretzel is finished? As in the previous example, you assume customers arrive according to a Poisson process, and you consider your location to be good, bad, or dismal if you sell 20, 10, or 6 pretzels per hour, respectively. Your prior belief is that Pr(Good) = 0.7, Pr(Bad) = 0.2, and Pr(Dismal) = 0.1.

Let T = the time (in minutes) between two consecutive customers

Pr(T < 3.5) = ?

$$\Pr(T < 3.5) = \Pr(T < 3.5 \cap \text{Good}) + \Pr(T < 3.5 \cap \text{Bad}) + \Pr(T < 3.5 \cap \text{Dismal})$$

$$= \Pr(T < 3.5 \mid \text{Good}) \Pr(\text{Good}) + \Pr(T < 3.5 \mid \text{Bad}) \Pr(\text{Bad}) + \Pr(T < 3.5 \mid \text{Dismal}) \Pr(\text{Dismal})$$

The hourly rates are converted to rates per minute:

$$T \mid \text{Good} \sim \text{Exp}(m = 20/60 = 1/3 \text{ per min}) \Rightarrow \Pr(T < 3.5 \mid \text{Good}) = 1 - e^{-\frac{1}{3} \cdot 3.5} = 0.689$$

$$T \mid \text{Bad} \sim \text{Exp}(m = 10/60 = 1/6 \text{ per min}) \Rightarrow \Pr(T < 3.5 \mid \text{Bad}) = 1 - e^{-\frac{1}{6} \cdot 3.5} = 0.442$$

$$T \mid \text{Dismal} \sim \text{Exp}(m = 6/60 = 1/10 \text{ per min}) \Rightarrow \Pr(T < 3.5 \mid \text{Dismal}) = 1 - e^{-\frac{1}{10} \cdot 3.5} = 0.295$$

$$\Pr(T < 3.5) = 0.689 \cdot 0.7 + 0.442 \cdot 0.2 + 0.295 \cdot 0.1 = 0.6$$

In other words, about 60% of your customers will have to wait until the pretzel is ready. Therefore, the fast service does not seem very appealing.
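The mixture over location types can be checked with a short Python sketch. The names below are illustrative assumptions; the rates, priors, and the 3.5-minute cooking time are those given in the example.

```python
from math import exp

rates_per_hour = {"Good": 20, "Bad": 10, "Dismal": 6}
priors = {"Good": 0.7, "Bad": 0.2, "Dismal": 0.1}
cook_minutes = 3.5


def prob_arrival_before(t_minutes, rate_per_hour):
    """Pr(T < t) for exponential inter-arrival times, with the rate converted to per minute."""
    m = rate_per_hour / 60.0
    return 1.0 - exp(-m * t_minutes)


# Conditional probabilities for each location type.
conditional = {k: prob_arrival_before(cook_minutes, r) for k, r in rates_per_hour.items()}

# Law of total probability: weight each conditional probability by its prior.
total = sum(priors[k] * conditional[k] for k in priors)

for k, value in conditional.items():
    print(f"Pr(T < 3.5 | {k}) = {value:.3f}")
print(f"Pr(T < 3.5) = {total:.3f}")
```

Note that this calculation uses the prior probabilities of the location types, as the example does; it does not incorporate the updated probabilities from the earlier sales test.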


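Normal Distribution

- Bell-shaped curve
- Particularly good for modeling situations in which the uncertain quantity is subject to many different sources of error
- Fits many measured biological phenomena (e.g. height, weight, length)

Probability Density Function

$$X \sim N(\mu, \sigma) \iff f(x \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}$$

Expected Value: $E(X) = \mu$

Variance: $\operatorname{var}(X) = \sigma^{2}$

Some Handy Empirical Rules

$$\Pr(\mu - \sigma \le X \le \mu + \sigma) \approx 0.68$$

$$\Pr(\mu - 2\sigma \le X \le \mu + 2\sigma) \approx 0.95$$

$$\Pr(\mu - 3\sigma \le X \le \mu + 3\sigma) \approx 0.997$$

[Figure: bell-shaped normal density centered at μ, with μ ± σ, μ ± 2σ, and μ ± 3σ marked on the horizontal axis]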


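Normal Distribution (Cont.)

Standard Normal Distribution

$$Z \sim N(0, 1)$$

Convert to Standard Normal Distribution

$$X \sim N(\mu, \sigma) \Rightarrow Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$

$$\Pr(X \le a \mid \mu, \sigma) = \Pr\left(Z \le \frac{a - \mu}{\sigma} \,\middle|\, 0, 1\right)$$

(See Appendix E for Cumulative Probability, tabulated as z versus P(Z ≤ z))

Example: if X ~ N(μ = 10, σ² = 400), then the probability that X is less than or equal to 35 is

$$\Pr(X \le 35) = \Pr\left(Z \le \frac{35 - 10}{\sqrt{400}}\right) = \Pr(Z \le 1.25) = 0.8944 \quad \text{(Appendix E)}$$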
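In place of a table lookup, the standard normal CDF can be evaluated with the error function, using the identity Φ(z) = (1 + erf(z/√2))/2. The sketch below is illustrative Python; the function names are assumptions, and the example values (μ = 10, σ² = 400) are those used above.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Pr(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """Pr(X <= x) for X with mean mu and standard deviation sigma, by standardizing."""
    return standard_normal_cdf((x - mu) / sigma)


mu, variance = 10, 400
sigma = sqrt(variance)  # standard deviation is 20

z = (35 - mu) / sigma
print(f"z = {z}")
print(f"Pr(X <= 35) = {normal_cdf(35, mu, sigma):.4f}")

# Symmetry about zero: Pr(Z <= -z) equals 1 - Pr(Z <= z).
print(f"Pr(Z <= -1.25) = {standard_normal_cdf(-1.25):.4f}")
```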


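Normal Distribution (Cont.)

Other Important Probabilities

$$\Pr(a \le X \le b) = \Pr\left(\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right) = \Pr\left(Z \le \frac{b - \mu}{\sigma}\right) - \Pr\left(Z \le \frac{a - \mu}{\sigma}\right)$$

Because the standard normal distribution is symmetric around zero,

$$\Pr(Z \le -z) = \Pr(Z \ge z) = 1 - \Pr(Z < z)$$

Example: if X ~ N(μ = 10, σ² = 400), then

$$\Pr(X \le -15) = \Pr\left(Z \le \frac{-15 - 10}{\sqrt{400}}\right) = \Pr(Z \le -1.25) = \Pr(Z \ge 1.25) = 1 - \Pr(Z \le 1.25) = 1 - 0.8944 = 0.1056$$

$$\Pr(-15 \le X \le 35) = \Pr\left(\frac{-15 - 10}{\sqrt{400}} \le Z \le \frac{35 - 10}{\sqrt{400}}\right) = \Pr(-1.25 \le Z \le 1.25)$$

$$= \Pr(Z \le 1.25) - \Pr(Z \le -1.25) = 0.8944 - 0.1056 = 0.7888$$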


[Figure: Standard Normal Distribution, with -z marked on the horizontal axis]


Quality Control Example

Your plant manufactures disk drives for personal computers. One of your machines produces a part that is used in the final assembly. The width of the part is important to the proper functioning of the disk drive. If the width falls below 3.995 mm or above 4.005 mm, the disk drive will not work properly and must be repaired at a cost of $10.40. The machine can be set to produce parts with a width of 4 mm, but it is not perfectly accurate. In fact, the width is normally distributed with mean 4 mm, and the variance depends on the speed of the machine. The standard deviation of the width is 0.0019 mm at the lower speed and 0.0026 mm at the higher speed. Higher speed means lower overall cost of the disk drive. The cost of the drive is $20.45 at the higher speed and $20.75 at the lower speed. Should you run the machine at the higher or lower speed?

Let X = width of the part (mm)

Decision tree (cost per drive):
- Low Speed, Defective (X < 3.995 or X > 4.005), probability P1 = ?: $20.75 + $10.40 = $31.15
- Low Speed, Not Defective (3.995 ≤ X ≤ 4.005), probability 1 - P1: $20.75
- High Speed, Defective (X < 3.995 or X > 4.005), probability P2 = ?: $20.45 + $10.40 = $30.85
- High Speed, Not Defective (3.995 ≤ X ≤ 4.005), probability 1 - P2: $20.45

P1 = Pr(Defective | Low Speed), P2 = Pr(Defective | High Speed)

Low speed:

$$X \mid \text{Low Speed} \sim N(\mu = 4, \sigma = 0.0019)$$

$$\Pr(\text{Not Defective} \mid \text{Low Speed}) = \Pr(3.995 \le X \le 4.005 \mid \mu = 4, \sigma = 0.0019) = \Pr\left(\frac{3.995 - 4}{0.0019} \le Z \le \frac{4.005 - 4}{0.0019}\right)$$

$$= \Pr(-2.63 \le Z \le 2.63) = \Pr(Z \le 2.63) - \Pr(Z \le -2.63) = 0.9957 - 0.0043 = 0.9914$$

$$P_1 = \Pr(\text{Defective} \mid \text{Low Speed}) = 1 - \Pr(\text{Not Defective} \mid \text{Low Speed}) = 1 - 0.9914 = 0.0086$$

High speed:

$$X \mid \text{High Speed} \sim N(\mu = 4, \sigma = 0.0026)$$

$$\Pr(\text{Not Defective} \mid \text{High Speed}) = \Pr(3.995 \le X \le 4.005 \mid \mu = 4, \sigma = 0.0026) = \Pr\left(\frac{3.995 - 4}{0.0026} \le Z \le \frac{4.005 - 4}{0.0026}\right)$$

$$= \Pr(-1.92 \le Z \le 1.92) = \Pr(Z \le 1.92) - \Pr(Z \le -1.92) = 0.9726 - 0.0274 = 0.9452$$

$$P_2 = \Pr(\text{Defective} \mid \text{High Speed}) = 1 - \Pr(\text{Not Defective} \mid \text{High Speed}) = 1 - 0.9452 = 0.0548$$

Expected costs:

E(Cost | Low Speed) = 0.0086 ∙ 31.15 + 0.9914 ∙ 20.75 = $20.84

E(Cost | High Speed) = 0.0548 ∙ 30.85 + 0.9452 ∙ 20.45 = $21.02

Conclusion: Because E(Cost | Low Speed) < E(Cost | High Speed), you should run the machine at the lower speed.
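The expected-cost comparison can be sketched in Python as follows. The helper names and default arguments are assumptions made for illustration; the tolerances, costs, and standard deviations come from the example. Because the code evaluates the normal CDF without rounding z to two decimals, its probabilities differ slightly from the table-based values.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Pr(X <= x) for a normal variable with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def defect_probability(sigma, mu=4.0, lower=3.995, upper=4.005):
    """Probability that the part width falls outside [lower, upper]."""
    p_ok = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
    return 1.0 - p_ok


def expected_cost(base_cost, sigma, repair=10.40):
    """Expected cost per drive: the base cost plus the repair cost weighted by the defect probability."""
    p_defective = defect_probability(sigma)
    return p_defective * (base_cost + repair) + (1.0 - p_defective) * base_cost


low = expected_cost(base_cost=20.75, sigma=0.0019)
high = expected_cost(base_cost=20.45, sigma=0.0026)

print(f"P1 = {defect_probability(0.0019):.4f}, E(Cost | Low Speed)  = ${low:.2f}")
print(f"P2 = {defect_probability(0.0026):.4f}, E(Cost | High Speed) = ${high:.2f}")
print("Run at the", "lower" if low < high else "higher", "speed")
```

The decision is driven by the trade-off between a $0.30 saving per drive at high speed and the roughly sixfold increase in the defect probability.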


Beta Distribution

Useful in modeling an uncertain ratio or proportion (ranging from 0 to 1), e.g. the proportion of voters who will vote for the Republican candidate

Probability Density Function

Let Q = “the proportion of interest”

$$Q \sim \text{beta}(n, r) \iff f(q \mid n, r) = \frac{\Gamma(n)}{\Gamma(r) \, \Gamma(n - r)} \, q^{r-1} (1 - q)^{n-r-1}, \quad 0 \le q \le 1$$

$$\Gamma(n) = (n - 1)!, \quad n = 1, 2, 3, \ldots$$

n and r are parameters that determine the shape of f(q | n, r). n determines the “tightness” of the distribution: the larger n is, the tighter the distribution. r determines the “skewness” of the distribution. In particular, when r = n/2, the distribution is symmetric around 0.5. Otherwise, the distribution is skewed to the right when r < n/2 and to the left when r > n/2.

(See Appendix F for Cumulative Probability)


Beta Distribution

[Figures: f(q) plotted against q for some symmetric beta distributions and for some asymmetric beta distributions]


Beta Distribution (Cont.)

Expected Value

$$E(Q) = \frac{r}{n}$$

Variance

$$\operatorname{var}(Q) = \frac{r (n - r)}{n^{2} (n + 1)}$$

Loosely speaking, r and n can be interpreted as r successes in n trials

Suppose your guess for the preference of the Republican candidate is that 40% of people would vote for the Republican candidate.

You can set n = 10, r = 4. This coincides with the expected proportion of 40%.

What if you set n = 100, r = 40? This still coincides with the expected proportion of 40%. However, the variances of the two cases are different.

When n = 10, r = 4: $\operatorname{var}(Q) = \frac{4 (10 - 4)}{10^{2} (10 + 1)} = 0.022$

When n = 100, r = 40: $\operatorname{var}(Q) = \frac{40 (100 - 40)}{100^{2} (100 + 1)} = 0.0024$
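The effect of n on the spread of the distribution is easy to see numerically. Below is a minimal Python sketch of the mean and variance formulas in the (n, r) parameterization used here; the function names are illustrative assumptions.

```python
def beta_mean(n, r):
    """E(Q) = r / n in the (n, r) parameterization."""
    return r / n


def beta_variance(n, r):
    """var(Q) = r (n - r) / (n^2 (n + 1))."""
    return r * (n - r) / (n**2 * (n + 1))


# Same expected proportion (40%), increasing "tightness" as n grows.
for n, r in [(10, 4), (100, 40), (1000, 400)]:
    print(f"n = {n:4d}, r = {r:3d}: E(Q) = {beta_mean(n, r):.2f}, var(Q) = {beta_variance(n, r):.5f}")
```

The third pair (n = 1000, r = 400) is an extra case added for illustration; it is not in the original example.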


Pretzel Example (Cont.)

You want to re-evaluate your decision to invest in a pretzel stand. At this point, you estimate that you are 50% sure that your market share is less than 20% and 75% sure that your market share is less than 38%.

Let Q = market share. You decide to model the uncertainty in Q as a beta distribution with Pr(Q ≤ 0.20) = 0.5 and Pr(Q ≤ 0.38) = 0.75.

Using the table in Appendix F, you find that

$$\Pr(Q \le 0.20 \mid n = 4, r = 1) = 0.49, \quad \Pr(Q \le 0.38 \mid n = 4, r = 1) = 0.76$$

You think this beta distribution is close enough to your estimates and thus proceed with the analysis:

$$Q \sim \text{Beta}(n = 4, r = 1)$$

The expected value of Q is E(Q) = r/n = 0.25

You estimate that the total market is 100,000 pretzels. You sell a pretzel at $0.50. It costs you $0.10 to produce a pretzel, in addition to $8,000 fixed cost for marketing, financing, and overhead.

Net Profit = Revenue – Cost = 100,000*Q*0.5 – (100,000*Q*0.1 + 8,000) = 40,000Q – 8,000

E(Net Profit) = 40,000*0.25 – 8,000 = $2,000 > 0

So it seems to be a good idea to start a pretzel career.

However, as a careful person, suppose you also want to evaluate your chances of losing money.

Net Profit < 0 => 40,000Q – 8,000 < 0 => Q < 0.2

$$\Pr(Q \le 0.20 \mid n = 4, r = 1) = 0.49 \quad \text{(Appendix F)}$$

Therefore, there is about a 50% chance of losing money. Are you willing to continue to take this risk?
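The table lookups and the profit analysis in this example can be approximated numerically. The sketch below is illustrative Python: it integrates the beta density with a simple midpoint rule (an assumption of this sketch, standing in for the Appendix F table), and the function names and default arguments are hypothetical.

```python
from math import gamma


def beta_pdf(q, n, r):
    """Beta density in the (n, r) parameterization used in these slides."""
    return gamma(n) / (gamma(r) * gamma(n - r)) * q ** (r - 1) * (1 - q) ** (n - r - 1)


def beta_cdf(q, n, r, steps=10_000):
    """Pr(Q <= q), approximated by midpoint-rule integration of the density."""
    width = q / steps
    return sum(beta_pdf((i + 0.5) * width, n, r) for i in range(steps)) * width


def net_profit(q, market=100_000, price=0.50, unit_cost=0.10, fixed=8_000):
    """Net profit = revenue - (variable cost + fixed cost) at market share q."""
    return market * q * price - (market * q * unit_cost + fixed)


n, r = 4, 1
print(f"Pr(Q <= 0.20) = {beta_cdf(0.20, n, r):.3f}")
print(f"Pr(Q <= 0.38) = {beta_cdf(0.38, n, r):.3f}")
print(f"E(Net Profit) = {net_profit(r / n):.0f}")         # profit is linear in Q, so E(profit) = profit(E(Q))
print(f"Break-even market share profit: {net_profit(0.20):.0f}")
print(f"Pr(Net Profit < 0) = {beta_cdf(0.20, n, r):.3f}")
```

Evaluating the profit at E(Q) gives the expected profit only because the profit function is linear in Q; a nonlinear payoff would require integrating over the whole distribution.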


Exercises

Bottle Production

In bottle production, bubbles that appear in the glass are considered defects. Any bottle that has more than two bubbles is classified as “nonconforming” and is sent to recycling. Suppose that a particular production line produces bottles with bubbles at a rate of 1.1 bubbles per bottle. Bubbles occur independently of one another.

a. What is the probability that a randomly chosen bottle is nonconforming?

b. Bottles are packed in cases of 12. An inspector chooses one bottle from each case and examines it for defects. If it is nonconforming, she inspects the entire case, replacing nonconforming bottles with good ones. If the chosen one conforms, then she passes the case. In total, 20 cases are produced. What is the probability that at least 18 of them pass?

Solution

a. X = # of bubbles in a bottle, X ~ Poisson(m = 1.1)

Pr(X > 2 | m = 1.1) = 1 - Pr(X ≤ 2 | m = 1.1) = 1.00 - 0.90 = 0.1

b. Y = # of cases out of 20 cases that do not pass, Y ~ Binomial(n = 20, p = 0.1). At least 18 cases pass exactly when at most 2 cases do not pass:

Pr(Y ≤ 2 | n = 20, p = 0.1) = 0.677
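Both parts of this solution can be verified with a few lines of Python. This is a sketch with illustrative function names; as in the solution above, the per-bottle probability is rounded to 0.1 before it is used in part b, which is an approximation.

```python
from math import comb, exp, factorial


def poisson_cdf(x, m):
    """Pr(X <= x | m) for X ~ Poisson(m)."""
    return sum(exp(-m) * m**i / factorial(i) for i in range(x + 1))


def binomial_cdf(x, n, p):
    """Pr(Y <= x | n, p) for Y ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


# Part a: a bottle is nonconforming when it has more than two bubbles.
p_nonconforming = 1.0 - poisson_cdf(2, 1.1)

# Part b: a case fails when its sampled bottle is nonconforming.
# The solution rounds the probability to 0.1 first; that rounding is kept here.
p_rounded = round(p_nonconforming, 1)
p_at_least_18_pass = binomial_cdf(2, 20, p_rounded)

print(f"Pr(nonconforming) = {p_nonconforming:.4f}")
print(f"Pr(at least 18 of 20 cases pass) = {p_at_least_18_pass:.3f}")
```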


Exercises

Greeting Card

A greeting card shop makes cards that are supposed to fit into 6 in. envelopes. The paper cutter, however, is not perfect. The length of a cut card is normally distributed with mean 5.9 in. and standard deviation 0.0365 in. If a card is longer than 5.975 in., it will not fit into a 6 in. envelope.

a. Find the probability that a card will not fit into a 6 in. envelope.

b. The cards are sold in boxes of 20. What is the probability that in one box there will be two or more cards that do not fit in 6 in. envelopes?

Solution

a. L = the uncertain length of a card, L ~ N(µ = 5.9, σ = 0.0365)

Pr(L > 5.975 | µ = 5.9, σ = 0.0365) = Pr(Z > (5.975 - 5.9)/0.0365) = Pr(Z > 2.055) = 1 - Pr(Z ≤ 2.055) = 1 - 0.98 = 0.02

b. X = # of cards in one box that do not fit in the envelopes, X ~ Binomial(n = 20, p = 0.02)

Pr(X ≥ 2 | n = 20, p = 0.02) = 1 - Pr(X ≤ 1 | n = 20, p = 0.02) = 1 - 0.94 = 0.06
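This solution chains a normal tail probability into a binomial calculation, which can be checked with the Python sketch below. The function names are illustrative assumptions, and the probability from part a is rounded to 0.02 before part b, mirroring the solution above.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mu, sigma):
    """Pr(L <= x) for a normal variable with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def binomial_pmf(x, n, p):
    """Pr(X = x | n, p) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Part a: a card does not fit when it is longer than 5.975 in.
p_no_fit = 1.0 - normal_cdf(5.975, mu=5.9, sigma=0.0365)

# Part b: two or more misfits in a box of 20, using the rounded p = 0.02.
p_rounded = round(p_no_fit, 2)
p_two_or_more = 1.0 - (binomial_pmf(0, 20, p_rounded) + binomial_pmf(1, 20, p_rounded))

print(f"Pr(card does not fit) = {p_no_fit:.4f}")
print(f"Pr(two or more misfits in a box) = {p_two_or_more:.3f}")
```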