
Chapter 3 Lecture Notes

Conditional Probability and Independence

October 1, 2015


MATH 305-02 – Probability Lecture Notes October 1, 2015

Sections 3.1 – 3.2
Conditional Probabilities

Conditional probabilities are one of the most important concepts in probability theory. In most cases, partial information is known before we compute the probability of an event. This means the probability is based on some condition; hence, conditional probability.

Example Suppose we roll two dice. What is the probability that the dice sum to 8? The sample space is

S = {(i, j) | i, j = 1, 2, . . . , 6} ,

which has 36 elements. The event E = {(i, j) | i + j = 8} contains 5 elements. Since any roll of the dice is equally likely, it follows that

P(E) = 5/36.

Now, suppose we know that the first die rolled was a 3. What is the probability that the sum is 8? In this case, our sample space is different:

S = {(3, j) | j = 1, 2, . . . , 6} ,

which only has 6 elements. Furthermore, each element in the space is equally likely with probability 1/6. Since only rolling a 5 with the second die will yield an 8 in this case, it follows that the probability of rolling an 8 given that the first roll is a 3 is

P(8 is rolled | 3 was rolled first) = 1/6.

We note that this conditional probability is a little bit higher than the unconditional probability 5/36. □
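
To make the numbers concrete, here is a small Monte Carlo sketch (Python, not part of the original notes; the seed and trial count are just illustrative) that estimates both the unconditional and the conditional probability by simulating rolls:

import random

random.seed(0)
trials = 1_000_000
sum_is_8 = 0      # rolls whose total is 8
first_is_3 = 0    # rolls whose first die shows 3
both = 0          # rolls with first die 3 and total 8

for _ in range(trials):
    d1, d2 = random.randint(1, 6), random.randint(1, 6)
    if d1 + d2 == 8:
        sum_is_8 += 1
    if d1 == 3:
        first_is_3 += 1
        if d1 + d2 == 8:
            both += 1

print(sum_is_8 / trials)      # close to 5/36 ≈ 0.139
print(both / first_is_3)      # close to 1/6 ≈ 0.167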

In general, if E and F are two events of an experiment, then the conditional probability that E occurs given that F has already occurred is denoted by

P (E |F ).

We can derive a formula for this probability. Suppose F has occurred. Then for E to occur as well, the outcome must lie in both E and F, that is, in EF. Since F has occurred, the event F becomes our new sample space, and we need to see how often E occurs within this new space. Hence, the probability that E occurs given F will equal P(EF) relative to P(F). Therefore, we arrive at the following definition:

Definition If P(F) > 0, then

P(E | F) = P(EF)/P(F).


Example (3.1) Two fair dice are rolled. What is the conditional probability that at least one lands on 6 given that the dice land on different numbers? Here, we let E be the event that at least one die lands on 6 and let F be the event that the two numbers are different. Then we seek P(E | F), which is given by

P(E | F) = P(EF)/P(F).

We can compute each probability on the right-hand side. We first note that the sample space has 36 elements, each equally likely. Consider the event E that at least one die is a six. This event has 11 elements because

E = {(i, 6), (6, i), (6, 6) | i = 1, 2, . . . , 5}.

Of these 11 outcomes, 10 have two different numbers. Hence,

P(EF) = 10/36.

The event F has 30 elements which we can denote by

F = {(i, j) | i ≠ j for i, j = 1, 2, . . . , 6}.

There are only 30 elements because the only dice rolls that yield identical numbers are doubles, and there are only 6 doubles. Hence,

P(F) = 30/36.

Therefore, we can conclude that the desired probability is

P(E | F) = (10/36)/(30/36) = 1/3.

Exercise (2a) Suppose Joe is 80 percent certain that his missing key is in one of the two pockets of his hanging jacket, being 40 percent sure it is in the left pocket and 40 percent sure it is in the right. If he searches the left pocket and does not find the key, what is the conditional probability that it is in the right pocket?

Solution: Here, we let L denote the event that the key is in the left pocket and let R denote the event that the key is in the right pocket. Then the probability that we seek is P(R | L^c). Hence,

P(R | L^c) = P(RL^c)/P(L^c) = P(R)/(1 − P(L)) = 0.4/0.6 = 2/3.

Here, we have used the fact that P(RL^c) = P(R), since R, the event that the key is in the right pocket, is certainly a subset of L^c, the event that the key is not in the left pocket. (The rest of L^c corresponds to the key being in neither pocket.) □


Exercise (3.7) The king comes from a family of 2 children. What is the probability that the other child is his sister?

Solution: Let E be the event that the family has at least one boy (the king) and let F be the event that the family has one boy and one girl, i.e., that the other child is a girl. We first note that the sample space for this problem is

S = {(g, g), (g, b), (b, g), (b, b)},

where b denotes boy and g denotes girl. We wish to find the probability P (F |E), which is

P(F | E) = P(FE)/P(E) = (2/4)/(3/4) = 2/3.

Sometimes working with the reduced sample space is the best way to go.

Example (3.4) What is the probability that at least one of a pair of fair dice lands on 6, given that the sum of the dice is i, for i = 2, 3, 4, . . . , 12? In this case, it is much easier to consider the reduced sample space given by the condition on the sum. Let E be the event that at least one die lands on 6 and let Fi be the event that the sum of the dice is i. Then we quickly see that the desired probability is

P(E | Fi) = 0,   i = 2, 3, 4, 5, 6.

This is because if the sum is 6 or less, a 6 cannot have been rolled (a 6 on either die would force a sum of at least 7). For i = 7, we see that our reduced sample space is restricted to

F7 = {(6, 1), (5, 2), (4, 3), (3, 4), (2, 5), (1, 6)},

of which only 2 elements contain a 6. Therefore,

P(E | F7) = 2/6 = 1/3.

Using the same argument, we see that when i = 8,

F8 = {(6, 2), (5, 3), (4, 4), (3, 5), (2, 6)}.

Our reduced sample space only has 5 elements, 2 of which contain a 6. Therefore,

P(E | F8) = 2/5.

Continuing in this manner, we find

P(E | F9) = 1/2,   P(E | F10) = 2/3,   P(E | F11) = 1,   P(E | F12) = 1.
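
The same answers can be recovered by listing each reduced sample space directly. Here is a short enumeration sketch (Python, not part of the original notes) that does exactly that:

from fractions import Fraction

for i in range(2, 13):
    F_i = [(a, b) for a in range(1, 7) for b in range(1, 7) if a + b == i]
    with_six = [pair for pair in F_i if 6 in pair]
    print(i, Fraction(len(with_six), len(F_i)))
# Prints 0 for i = 2, ..., 6, then 1/3, 2/5, 1/2, 2/3, 1, 1 as computed above.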


Exercise (3.10) Three cards are randomly drawn, without replacement, from an ordinary deck of 52 playing cards. Compute the conditional probability that the first card is a spade given that the second and third cards are spades.

Solution: Let E be the event that the first card is a spade and let F be the event that the second and third cards are spades. This problem is very easy if we look at the reduced sample space. Given that two of the drawn cards are spades, two of the 13 spades are accounted for, and the first card is equally likely to be any of the remaining 50 cards, 11 of which are spades. Hence, the probability is

P(E | F) = 11/50.

Exercise (2c) In the card game of bridge, the 52 cards are dealt out equally to 4 players, called East, West, North, and South, where East and West are on a team and North and South are on a team. If North and South have a total of 8 spades among them, what is the probability that East has 3 of the remaining spades?

Solution: Let E be the event that East has 3 spades and let F be the event that North and South have 8 spades between the two of them. We can think in terms of the reduced sample space. Assuming the North and South hands have been dealt and they received 8 spades, it follows that East could have any of C(26, 13) possible hands, writing C(n, k) for the binomial coefficient. We seek the probability that he received 3 of the remaining 5 spades. There are C(5, 3) C(21, 10) ways this could happen. Hence,

P(E | F) = C(5, 3) C(21, 10) / C(26, 13).
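
Evaluating this ratio is routine; a one-line check (Python, not part of the notes) using math.comb gives the numerical value:

from math import comb

p = comb(5, 3) * comb(21, 10) / comb(26, 13)
print(p)   # ≈ 0.339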

We can rearrange the conditional probability formula and write

P (EF ) = P (F )P (E |F ).

We can generalize the conditional probability formula to include any number of events in succession. This is called the multiplication rule:

Theorem. (Multiplication Rule)

P (E1E2 · · ·En) = P (E1)P (E2 |E1)P (E3 |E1E2) · · ·P (En |E1E2 · · ·En−1).

Proof. We can show this is the case by repeatedly applying the conditional probability formula. Consider the right-hand side of this equation:

P(E1)P(E2 | E1) · · · P(En | E1E2 · · · En−1) = P(E1) · [P(E1E2)/P(E1)] · · · [P(E1E2 · · · En)/P(E1E2 · · · En−1)] = P(E1E2 · · · En).


Example (2d) Celine is undecided as to whether to take a French course or a chemistry course. She estimates that her probability of receiving an A grade would be 1/2 in a French course and 2/3 in a chemistry course. If Celine decides to base her decision on the flip of a fair coin, what is the probability that she gets an A in chemistry? We can use the multiplication rule to answer this question. Let C be the event that she takes the chemistry class, and A be the event that she gets an A in whatever course she chooses. Then, we know that the probability that she gets an A given that she takes chemistry is P(A | C) = 2/3, and we know the probability that she takes chemistry is P(C) = 1/2. We seek the probability that she takes chemistry and gets an A. Hence,

P(CA) = P(C)P(A | C) = (1/2)(2/3) = 1/3.

Example (3.13) Suppose an ordinary deck of 52 cards is randomly divided into 4 hands of 13 cards each. We wish to determine the probability p that each hand has an ace. Let Ei be the event that hand i has exactly one ace (since there are only 4 aces, every hand has an ace precisely when every hand has exactly one); then we can determine p = P(E1E2E3E4) using the multiplication rule. Here, we can write

P(E1E2E3E4) = P(E1)P(E2 | E1)P(E3 | E1E2)P(E4 | E1E2E3).

We need the conditional probabilities on the right-hand side of this equation. Consider the probability that the first hand is dealt exactly one ace. There are a total of C(52, 13) possible hands. There are C(4, 1) possible aces to fill the one spot and there are C(48, 12) remaining cards to fill the remaining 12 spots in the hand. Therefore,

P(E1) = C(4, 1) C(48, 12) / C(52, 13).

Now, assume we know that the first hand has an ace. Then we have a reduced deck of 39 cards, of which only 3 are aces. Hence,

P(E2 | E1) = C(3, 1) C(36, 12) / C(39, 13).

Now, assume we know that the first two hands contain aces. Then our sample space is reduced further and there are only 26 remaining cards, 2 of which are aces. This gives

P(E3 | E1E2) = C(2, 1) C(24, 12) / C(26, 13).

Finally, there is only 1 ace left and there are 13 cards to choose from. Hence,

P(E4 | E1E2E3) = C(1, 1) C(12, 12) / C(13, 13) = 1.

Therefore, p is given by

p = [C(4, 1) C(48, 12) / C(52, 13)] · [C(3, 1) C(36, 12) / C(39, 13)] · [C(2, 1) C(24, 12) / C(26, 13)] ≈ 0.1055.
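
A quick numerical check of this product (Python, not part of the notes) with math.comb:

from math import comb

p = (comb(4, 1) * comb(48, 12) / comb(52, 13)) \
    * (comb(3, 1) * comb(36, 12) / comb(39, 13)) \
    * (comb(2, 1) * comb(24, 12) / comb(26, 13))
print(p)   # ≈ 0.1055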


Exercise (3.12) A recent college graduate is planning to take the first three actuarial exams in the coming summer. She will take the first exam in June. If she passes that exam, then she will take the second exam in July, and if she passes that one, she will take the third exam in September. If she fails an exam, she cannot take any more. The probability that she passes exam 1 is 0.9. If she passes the first, the conditional probability that she passes exam 2 is 0.8, and if she passes both the first and second exams, the conditional probability that she passes exam 3 is 0.7.

a.) What is the probability that she passes all three exams?

b.) Given that she did not pass all three exams, what is the conditional probability that she failed the second exam?

Solution:

a.) Let Ei be the event that she passed the ith exam. Then we seek the probability P(E1E2E3). From the information given, we know P(E1) = 0.9, P(E2 | E1) = 0.8, and P(E3 | E1E2) = 0.7. By the multiplication rule, we have

P(E1E2E3) = P(E1)P(E2 | E1)P(E3 | E1E2) = (0.9)(0.8)(0.7) = 0.504.

b.) We seek the probability P(E2^c | (E1E2E3)^c), which the conditional probability formula gives as

P(E2^c | (E1E2E3)^c) = P(E2^c (E1E2E3)^c) / P((E1E2E3)^c) = P(E1 E2^c) / (1 − P(E1E2E3)).

Here, we have used the fact that P(E2^c (E1E2E3)^c) = P(E1 E2^c), since the only way she could have failed exam 2 is if she passed exam 1 and failed exam 2. This gives

P(E2^c | (E1E2E3)^c) = P(E1)P(E2^c | E1) / (1 − 0.504) = (0.9)(0.2)/0.496 ≈ 0.3629.
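
The arithmetic is simple enough to check in two lines (Python, not from the notes):

p_all = 0.9 * 0.8 * 0.7                   # part a: 0.504
print(p_all, 0.9 * 0.2 / (1 - p_all))     # part b: ≈ 0.3629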

Exercise (3.14) An urn initially contains 5 white and 7 black balls. Each time a ball is selected, its color is noted and it is replaced in the urn along with 2 other balls of the same color. Compute the probability that

a.) the first 2 balls selected are black and the next 2 are white;

b.) of the first 4 balls selected, exactly 2 are black.


Solution:

a.) Let B denote the event that a black ball was drawn and let W denote the event that a white ball was drawn. We seek the probability P(BBWW), which by the multiplication rule is

P(BBWW) = P(B)P(B | B)P(W | BB)P(W | BBW).

The probability that the initial ball is black is simply

P(B) = 7/12,

since there are 7 black balls and 5 white balls initially. Since a black ball was selected, we put it back and add two more black balls. Now, we seek the probability P(B | B), which is given by

P(B | B) = 9/14.

Now, we assume a second black ball was chosen so that there are 11 black balls and still only 5 white. If a white ball is selected next, then the probability would be

P(W | BB) = 5/16.

Finally, the probability that another white ball is chosen would be

P(W | BBW) = 7/18.

Therefore, the probability that two black balls are chosen and then two white balls are chosen is given by

P(BBWW) = (7/12)(9/14)(5/16)(7/18) = 35/768.

b.) We note that there are exactly C(4, 2) ways to arrange 2 black balls and 2 white balls among the first 4 draws. In every case, the probability is the same as that found in part a. Therefore, the probability is

P(2 black, 2 white) = C(4, 2) · 35/768 = 210/768.
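
Both answers can be checked by simulation. The sketch below (Python, not part of the notes; the trial count and seed are illustrative) implements the reinforcement scheme directly and estimates the two probabilities:

import random

random.seed(1)
trials = 200_000
bbww = 0        # first two black, next two white
exactly_2b = 0  # exactly 2 black among the first 4 draws

for _ in range(trials):
    white, black = 5, 7
    draws = []
    for _ in range(4):
        if random.random() < black / (black + white):
            draws.append('B')
            black += 2      # replace the ball and add 2 more black
        else:
            draws.append('W')
            white += 2      # replace the ball and add 2 more white
    if draws == ['B', 'B', 'W', 'W']:
        bbww += 1
    if draws.count('B') == 2:
        exactly_2b += 1

print(bbww / trials)        # close to 35/768 ≈ 0.046
print(exactly_2b / trials)  # close to 210/768 ≈ 0.273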

Recall that we may consider the probability of an event as a long-run relative frequency, i.e.,

lim_{n→∞} n(E)/n,

where n(E) is the number of times event E occurs in n repetitions of an experiment. P(E | F) is consistent with this interpretation. Let n be large; then if we only consider the experiments in which F occurs, P(E | F) will equal the long-run proportion of them in which E also occurs. To verify:

nP (F ) ≈ number of times F occurs

nP (EF ) ≈ number of times both E and F occur.

Then out of nP (F ) experiments in which F occurs, the proportion in which E occurs is

nP(EF) / (nP(F)) = P(EF)/P(F),

which is in agreement with our definition of P (E |F ) as n gets large.


Section 3.3
Bayes's Formula

Consider

E = EF ∪ EF^c.

Here, EF and EF^c are mutually exclusive. By Axiom 3, we have

P(E) = P(EF) + P(EF^c)
⇒ P(E) = P(E | F)P(F) + P(E | F^c)P(F^c)
⇒ P(E) = P(E | F)P(F) + P(E | F^c)[1 − P(F)].

This equation states that the probability of an event E is a weighted average of the conditional probability of E given that F has occurred and the conditional probability of E given that F has not occurred. In fact, each weight is the probability of the event on which it is conditioned. This allows us to find the probability of an event by first "conditioning" on whether or not some second event has occurred.

Example (3.23) Urn I contains 2 white and 4 red balls, whereas urn II contains 1 white and 1 red ball. A ball is randomly drawn from urn I and placed into urn II, then a ball is randomly selected from urn II. What is

a.) the probability that the ball selected from urn II is white?

b.) the conditional probability that the transferred ball was white given that a white ball is selected from urn II?

To answer part a., we consider what happens if a red ball was initially drawn and transferred and what happens if a white ball was drawn and transferred. Let Rt be the event that a red ball was drawn and transferred and let Wt mean that a white ball was transferred. Let W be the event that a white ball was selected from urn II. Then the probability we seek is P(W), which is given by

P(W) = P(W | Rt)P(Rt) + P(W | Wt)P(Wt) = (1/3)(4/6) + (2/3)(2/6) = 4/9.

For part b., we seek the probability P(Wt | W), which, using the conditional probability formula, gives

P(Wt | W) = P(WtW)/P(W) = P(Wt)P(W | Wt)/P(W) = (2/6)(2/3)/(4/9) = 1/2.

Exercise (3a) An insurance company believes that people can be divided into two classes: those who are accident prone and those who are not. The company's statistics show that an accident-prone person will have an accident at some time within a fixed 1-year period with probability 0.4, whereas this probability decreases to 0.2 for a person who is not accident prone.

a.) If we assume that 30% of the population is accident prone, what is the probability that a new policyholder will have an accident within a year of purchasing a policy?

b.) Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the probability that she is accident prone?

Solution:

a.) Let A be the event that a person is accident prone, and let A1 be the event that a person had an accident within a year. Then, we seek the probability P(A1), which we can find by conditioning on whether or not that person is accident prone. We have

P (A1) = P (A1 |A)P (A) + P (A1 |Ac)P (Ac)

= (0.4)(0.3) + (0.2)(0.7)

= 0.26.

b.) Now, we assume the person has had an accident and we want to know whether she was actually accident prone. In this case, we seek the probability P(A | A1). This is given by

P(A | A1) = P(AA1)/P(A1) = P(A)P(A1 | A)/P(A1) = (0.3)(0.4)/0.26 = 6/13.


Example (3d) A blood test is 95% effective in detecting a certain disease when it is, in fact, present. The test also yields a "false positive" result for 1% of the healthy persons tested. If 0.5% of the population actually has the disease, what is the probability that a person has the disease given that the test result is positive? Let D be the event that the person has the disease and let E be the event that the result is positive. Then from the information given we know

P(D) = 0.005,   P(E | D) = 0.95,   P(E | D^c) = 0.01.

We seek the probability P(D | E), which is given by

P(D | E) = P(DE)/P(E) = P(E | D)P(D) / [P(E | D)P(D) + P(E | D^c)P(D^c)] = (0.95)(0.005) / [(0.95)(0.005) + (0.01)(0.995)] ≈ 0.323.
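
This is the standard Bayes computation; a minimal numerical sketch (Python, not part of the notes) makes it easy to experiment with other prevalence or error rates:

p_d, p_pos_given_d, p_pos_given_healthy = 0.005, 0.95, 0.01
p_pos = p_pos_given_d * p_d + p_pos_given_healthy * (1 - p_d)
print(p_pos_given_d * p_d / p_pos)   # ≈ 0.323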

Exercise (3c) In answering a question on a multiple-choice test, a student either knows the answer or guesses. Let p be the probability that the student knows the answer and 1 − p be the probability that the student guesses. Assume that a student who guesses at the answer will be correct with probability 1/m, where m is the number of multiple-choice alternatives. What is the conditional probability that a student knew the answer to a question given that he or she answered it correctly?

Solution: Let C be the event that the student gets the answer correct and let G be the event that the student guessed at the answer. Then we are given the probabilities P(G^c) = p and P(C | G) = 1/m, and we seek the probability P(G^c | C), which is the probability that the student knew the answer given that he/she got it correct. Using our definition for conditional probability gives

P(G^c | C) = P(G^c C)/P(C) = P(C | G^c)P(G^c) / [P(C | G^c)P(G^c) + P(C | G)P(G)] = (1)(p) / [(1)(p) + (1/m)(1 − p)] = p / [p + (1 − p)/m] = mp / [1 + (m − 1)p].


Exercise (3.19) A total of 48% of the women and 37% of the men who took a certain "quit smoking" class remained nonsmokers for at least one year after completing the class. These people then attended a success party at the end of the year. If 62% of the original class was male,

a.) what percentage of those at the party were women?

b.) what percentage of the original class attended the party?

Solution: Let A be the event that a member of the class attended the party and let W indicate that the member was a woman. Then we are given the following probabilities: P(W) = 0.38, P(W^c) = 0.62, P(A | W) = 0.48, and P(A | W^c) = 0.37. Consider the following solutions:

a.) Here, we are concerned with the probability P(W | A). Using the definition of conditional probabilities, we obtain

P(W | A) = P(WA)/P(A) = P(A | W)P(W) / [P(A | W)P(W) + P(A | W^c)P(W^c)],

where we have conditioned the denominator on whether the person selected is a man or a woman. Therefore,

P(W | A) = (0.48)(0.38) / [(0.48)(0.38) + (0.37)(0.62)] ≈ 0.443.

b.) We've actually already answered this question. We seek the probability P(A), which, by conditioning on whether the class member is a man or a woman, gives

P (A) = P (A |W )P (W ) + P (A |W c)P (W c)

= (0.48)(0.38) + (0.37)(0.62)

= 0.4118.

Exercise (3f) At a certain stage of a criminal investigation, the inspector in charge is 60% convinced of the guilt of a certain suspect. Suppose, however, that a new piece of evidence which shows that the criminal has a certain characteristic is uncovered. If 20% of the population possesses this characteristic, how certain of the guilt of the suspect should the inspector now be if it turns out that the suspect has the characteristic?

Solution: Let G be the event that the suspect is guilty and let E be the event that the suspect has this characteristic. Then the probability we seek is P(G | E), which is the probability that the suspect is guilty given that he has the characteristic. From conditional probabilities, we have

P(G | E) = P(GE)/P(E) = P(E | G)P(G) / [P(E | G)P(G) + P(E | G^c)P(G^c)] = (1)(0.6) / [(1)(0.6) + (0.2)(0.4)] ≈ 0.882.

Just as we did before with conditional probabilities, we can generalize the previous result. Suppose F1, F2, . . . , Fn are mutually exclusive events whose union is the entire sample space, so that

S = ∪_{i=1}^{n} Fi,

or equivalently, exactly one of the Fi must occur. Then for any event E we have

E = ∪_{i=1}^{n} EFi,

and the events EFi are mutually exclusive. [Venn Diagram] Then

P(E) = Σ_{i=1}^{n} P(EFi) = Σ_{i=1}^{n} P(E | Fi)P(Fi).

This is called the law of total probability. It shows that we can compute P(E) by first conditioning on which one of the Fi occurs. Again, P(E) is a weighted average of the P(E | Fi), each term being weighted by the probability P(Fi).

Theorem. (Bayes’s Formula)

P(Fj | E) = P(EFj)/P(E) = P(E | Fj)P(Fj) / Σ_{i=1}^{n} P(E | Fi)P(Fi).

Proof. Follows directly from the definition of conditional probability and the law of total probability.

Example (3.32) A family has j children with probability pj, where p1 = 0.1, p2 = 0.25, p3 = 0.35, and p4 = 0.3. A child from this family is randomly chosen. Given that this child is the eldest child in the family, find the conditional probability that the family has

a.) only 1 child;

b.) 4 children.


To answer these questions, we first define each event. Let E be the event that the child selected is the eldest and let Fj be the event that the family has j children. From this, we may conclude that the probability that the child is the eldest, given that there are j children, is P(E | Fj) = 1/j. Furthermore, we know P(Fj) = pj as given in the problem. To answer parts a.) and b.), we seek the probability P(Fj | E). Hence, by Bayes's formula,

P(Fj | E) = P(EFj)/P(E) = P(E | Fj)P(Fj) / Σ_{i=1}^{4} P(E | Fi)P(Fi) = (1/j)pj / Σ_{i=1}^{4} (1/i)pi = (pj/j) / (p1 + p2/2 + p3/3 + p4/4).

Therefore, for part a.), we have

P(F1 | E) = p1 / (p1 + p2/2 + p3/3 + p4/4) = (1/10)/(5/12) = 6/25,

and for part b.) we have

P(F4 | E) = (p4/4) / (p1 + p2/2 + p3/3 + p4/4) = (3/40)/(5/12) = 9/50.

Exercise (3.36) Stores A, B, and C have 50, 75, and 100 employees, respectively, and 50, 60, and 70 percent of them, respectively, are women. Resignations are equally likely among all employees, regardless of sex. One woman employee resigns. What is the probability that she works in store C?

Solution: Here, let W refer to the event that the resignation came from a woman and let A, B, and C represent the events that the person who resigned worked in store A, B, or C, respectively. Then we seek the probability P(C | W). Using Bayes's formula, we obtain

P(C | W) = P(W | C)P(C) / [P(W | A)P(A) + P(W | B)P(B) + P(W | C)P(C)] = (0.7)(100/225) / [(0.5)(50/225) + (0.6)(75/225) + (0.7)(100/225)] = 0.5.


Exercise (3n) A bin contains 3 types of disposable flashlights. The probability that a type 1 flashlight will give more than 100 hours of use is 0.7, with the corresponding probabilities for type 2 and type 3 flashlights being 0.4 and 0.3, respectively. Suppose that 20% of the flashlights in the bin are type 1, 30% are type 2, and 50% are type 3.

a.) What is the probability that a randomly chosen flashlight will give more than 100 hours of use?

b.) Given that a flashlight lasted more than 100 hours, what is the conditional probability that it was a type j flashlight, for j = 1, 2, 3?

Solution: Let E be the event that the chosen flashlight gives more than 100 hours of light and let Fi be the event that a flashlight of type i is chosen. Consider the following solutions:

a.) We seek the probability P(E). We find this probability by conditioning on which flashlight is chosen. The law of total probability gives

P (E) = P (E |F1)P (F1) + P (E |F2)P (F2) + P (E |F3)P (F3)

= (0.7)(0.2) + (0.4)(0.3) + (0.3)(0.5)

= 0.41.

b.) We seek the probabilities P(Fi | E), which are given by Bayes's formula:

P(Fi | E) = P(E | Fi)P(Fi) / 0.41.

This gives the following:

P(F1 | E) = 14/41,   P(F2 | E) = 12/41,   P(F3 | E) = 15/41.
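
The same numbers drop out of a short exact computation (Python with fractions, not part of the notes):

from fractions import Fraction

p_type = [Fraction(2, 10), Fraction(3, 10), Fraction(5, 10)]   # P(F_i)
p_long = [Fraction(7, 10), Fraction(4, 10), Fraction(3, 10)]   # P(E | F_i)

p_E = sum(pl * pt for pl, pt in zip(p_long, p_type))
print(p_E)                                  # 41/100
for j in range(3):
    print(p_long[j] * p_type[j] / p_E)      # 14/41, 12/41, 15/41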

Exercise (3k) A plane is missing, and it is presumed that it was equally likely to have gone down in any of 3 possible regions. Let 1 − βi, for i = 1, 2, 3, denote the probability that the plane will be found upon a search of the ith region when the plane is, in fact, in that region. What is the conditional probability that the plane is in the ith region given that a search of region 1 is unsuccessful?

Solution: Here, the values βi are called overlook probabilities, for obvious reasons. Let Ri be the event that the plane is in region i and let E be the event that a search of region 1 was unsuccessful. Then, using these events, we conclude the following: P(Ri) = 1/3, P(E | R1) = β1, P(E | R2) = 1, and P(E | R3) = 1. From this, we can find the desired probabilities. Let i = 1; then we seek the following:

P(R1 | E) = P(E | R1)P(R1) / [P(E | R1)P(R1) + P(E | R2)P(R2) + P(E | R3)P(R3)] = β1(1/3) / (β1/3 + 1/3 + 1/3) = β1 / (2 + β1).

For i = 2, 3, we obtain the following:

P(Ri | E) = P(E | Ri)P(Ri) / [P(E | R1)P(R1) + P(E | R2)P(R2) + P(E | R3)P(R3)] = (1)(1/3) / (β1/3 + 1/3 + 1/3) = 1 / (2 + β1).

Definition The odds of an event E are defined as

[odds] = P(E)/P(E^c) = P(E)/(1 − P(E)).

The odds tell how much more likely it is that event E occurs than that it does not occur. If the odds are α, then it is common to say "α to 1" in favor of the hypothesis.

We can now compute the odds when new evidence is introduced. Suppose H is true with probability P(H) and let E be new evidence. Then, given the new evidence,

P(H | E) = P(HE)/P(E) = P(E | H)P(H)/P(E),
P(H^c | E) = P(H^cE)/P(E) = P(E | H^c)P(H^c)/P(E).

Dividing the two expressions gives the “new odds” in light of this evidence:

P(H | E)/P(H^c | E) = [P(H)/P(H^c)] · [P(E | H)/P(E | H^c)],

where the left-hand side is the new odds and the first factor on the right is the old odds. We can see that the new odds increase if the new evidence is more likely when H is true than when it is false.


Example (3i) An urn contains two type A coins and one type B coin. When a type A coin is flipped it comes up heads with probability 1/4, whereas when a type B coin is flipped, it comes up heads with probability 3/4. A coin is randomly chosen from the urn and flipped. Given that the flip landed on heads, what is the probability that it was a type A coin? To answer this question, we define H as the event of flipping heads and A as the event that a type A coin is drawn. We find the odds of drawing a type A coin as

[odds of drawing A] = P(A)/P(A^c) = (2/3)/(1/3) = 2.

The odds of drawing a type A coin are two to one. Now, assume we know that the coin flipped came up heads. Then in light of this new evidence, we wish to find the odds that it is a type A coin. This means

[new odds of drawing A] = P(A | H)/P(A^c | H) = [P(A)/P(A^c)] · [P(H | A)/P(H | A^c)] = (2) · [(1/4)/(3/4)] = 2/3.

So the new odds are 2/3 to one, which means the probability that the coin picked was a type A coin given that heads was flipped is P(A | H) = 2/5. □
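
As a quick check, the posterior probability can be recovered from the posterior odds numerically (Python sketch, not part of the notes):

prior_odds = (2/3) / (1/3)                 # odds of a type A coin: 2
likelihood_ratio = (1/4) / (3/4)           # P(H | A) / P(H | A^c)
post_odds = prior_odds * likelihood_ratio  # 2/3
print(post_odds / (1 + post_odds))         # P(A | H) = 0.4 = 2/5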


Section 3.4
Independence

We say E is independent of F if knowledge that F has occurred does not influence the probability that E occurs. In other words, if P(E | F) is the same as P(E), then E is independent of F, which means

P (EF ) = P (E)P (F ).

Furthermore, it follows that if E is independent of F , then F is independent of E.

Definition Two events E and F are said to be independent if P (EF ) = P (E)P (F ).

Example (4a) A card is selected at random from an ordinary deck of 52 playing cards. If E is the event that the selected card is an ace and F is the event that it is a spade, are the events independent? In this case, yes, because P(E) = 4/52 = 1/13 and the probability of drawing an ace knowing the card is a spade is exactly P(E | F) = 1/13. Therefore, knowing the card is a spade gives you no extra information. On the flip side, P(F) = 13/52 = 1/4, and if you know the card you drew was an ace, the probability that it is a spade is P(F | E) = 1/4, since there are four possible suits of the ace, one of which is a spade. You can also check that the probability of drawing the ace of spades is P(EF) = 1/52, while P(E) = 4/52 and P(F) = 13/52, which gives P(EF) = P(E)P(F). □

Example (4b) Two coins are flipped, and all 4 outcomes are assumed to be equally likely. If E is the event that the first coin lands on heads and F the event that the second lands on tails, then are E and F independent? Yes, because P(E) = 1/2, P(F) = 1/2, and P(EF) = P({(H, T)}) = 1/4. Hence, P(EF) = P(E)P(F) and the two events are independent. □

Exercise (4c) Suppose we toss 2 fair dice. Let E1 denote the event that the sum of the dice is 6 and let E2 denote the event that the sum of the two dice is 7. Let F denote the event that the first roll was a 4. Is either E1 or E2 independent of F?

Solution: The sample space for this problem contains 36 elements. The events consist of the following outcomes:

E1 = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)},
E2 = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)},
F = {(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)}.

It follows easily that P(E1) = 5/36 and P(F) = 1/6, since each outcome is equally likely. Also, P(E1F) = 1/36, as there is only one element, namely (4, 2), in common. Thus,

P(E1F) = 1/36 ≠ 5/216 = P(E1)P(F),

and the two events are not independent. But, we have P(E2) = 1/6, P(F) = 1/6, and P(E2F) = 1/36, which means

P(E2F) = 1/36 = P(E2)P(F).
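
Both checks can be automated by enumerating the 36 outcomes. A small sketch (Python, not part of the notes):

from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))

def prob(event):
    # exact probability of an event given as a predicate on outcomes (d1, d2)
    return Fraction(sum(1 for w in S if event(w)), len(S))

E1 = lambda w: w[0] + w[1] == 6
E2 = lambda w: w[0] + w[1] == 7
F = lambda w: w[0] == 4

print(prob(lambda w: E1(w) and F(w)) == prob(E1) * prob(F))   # False: E1 and F are not independent
print(prob(lambda w: E2(w) and F(w)) == prob(E2) * prob(F))   # True: E2 and F are independent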

Proposition If E and F are independent, then so are E and F c.

Proof. We consider the fact that E = EF ∪ EF^c. These two events are mutually exclusive. Therefore,

P (E) = P (EF ) + P (EF c)

= P (E)P (F ) + P (EF c)

since E and F are independent. This equation can be rewritten as

P(E)(1 − P(F)) = P(EF^c)  ⇒  P(E)P(F^c) = P(EF^c),

which means E and F c are independent.

For three events, say E, F , and G, to be independent, we must have

P (EFG) = P (E)P (F )P (G),

and

P(EF) = P(E)P(F),   P(EG) = P(E)P(G),   P(FG) = P(F)P(G).

In general, independence can be extended to more than three events. The events E1, E2, . . . , En are said to be independent if for every subset E1′, E2′, . . . , Er′, with r ≤ n, of these events,

P(E1′E2′ · · · Er′) = P(E1′)P(E2′) · · · P(Er′).

Exercise (4f) An infinite sequence of independent trials is to be performed. Each trial results in a success with probability p and a failure with probability 1 − p. What is the probability that at least 1 success occurs in the first n trials? What is the probability that exactly k successes occur in the first n trials?

Solution: Here, we define S to be a success and F to be a failure. We know that each trial is independent, with P(S) = p and P(F) = 1 − p. To find the probability of at least one success, we can find the probability of no successes and subtract it from one. Hence,

P(at least one success) = 1 − P(no successes) = 1 − (1 − p)^n,

where we have used the fact that the trials are independent by multiplying the probability of failure n times. To answer the second question, we consider a sequence of n trials, k of which are successes and n − k of which are failures:

SS · · · S (k successes)   FF · · · F (n − k failures).


The probability of this particular sequence is simply p^k (1 − p)^(n−k). But this is only one such sequence of k successes and n − k failures. There are exactly C(n, k) sequences of this form, since we are placing k successes among n trials. Therefore, the probability is

P(k successes) = C(n, k) p^k (1 − p)^(n−k).
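
For concreteness, both formulas are one-liners to evaluate (Python sketch, not part of the notes; p, n, and k are illustrative values):

from math import comb

p, n, k = 0.3, 10, 4
print(1 - (1 - p)**n)                         # P(at least one success in n trials)
print(comb(n, k) * p**k * (1 - p)**(n - k))   # P(exactly k successes in n trials)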

Exercise (4i) Suppose there are n types of coupons and that each new coupon collected is, independently of previous selections, a type i coupon with probability pi, where Σ_{i=1}^{n} pi = 1. Suppose k coupons are to be collected. If Ai is the event that there is at least one type i coupon among those collected, then, for i ≠ j, find P(Ai), P(Ai ∪ Aj), and P(Ai | Aj).

Solution: We note here that although each coupon selection is independent of the previous selections, the events Ai and Aj are not independent for i ≠ j. Since Ai is defined as the event that at least one coupon of type i is picked, we again consider the complementary event that no coupon of type i is selected. Therefore, to find P(Ai), we obtain

P(Ai) = 1 − P(no coupon of type i) = 1 − (1 − pi)^k.

The event Ai ∪ Aj is the event that at least one coupon of type i or type j is selected. Again, we can find this by considering the event that neither a type i nor a type j coupon is chosen:

P(Ai ∪ Aj) = 1 − P((Ai ∪ Aj)^c) = 1 − (1 − pi − pj)^k.

Finally, to find the conditional probability P(Ai | Aj), we use the conditional probability formula, which gives

P(Ai | Aj) = P(AiAj)/P(Aj).

To determine P (AiAj), we can use the following identity:

P (Ai ∪Aj) = P (Ai) + P (Aj)− P (AiAj).

Substituting and evaluating, we obtain the following solution:

P(Ai | Aj) = { [1 − (1 − pi)^k] + [1 − (1 − pj)^k] − [1 − (1 − pi − pj)^k] } / [1 − (1 − pj)^k].


Exercise (3.91) Suppose that n independent trials, each of which results in any of the outcomes 0, 1, or 2, with respective probabilities p0, p1, and p2, where p0 + p1 + p2 = 1, are performed. Find the probability that outcomes 1 and 2 both occur at least once.

Solution: Let Ei be the event that outcome i does not occur in the n trials. Then we seek the probability P((E1 ∪ E2)^c), which is the probability that both outcomes 1 and 2 occur at least once. We have

P(E1 ∪ E2) = P(E1) + P(E2) − P(E1E2) = (1 − p1)^n + (1 − p2)^n − (1 − p1 − p2)^n = (1 − p1)^n + (1 − p2)^n − p0^n.

Subtracting this from one yields the desired probability:

P((E1 ∪ E2)^c) = 1 + p0^n − (1 − p1)^n − (1 − p2)^n.
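
A Monte Carlo sanity check of this formula (Python sketch, not part of the notes; the probabilities, n, seed, and trial count are illustrative):

import random

random.seed(2)
p0, p1, p2, n = 0.5, 0.3, 0.2, 6
trials = 200_000
hits = 0
for _ in range(trials):
    outcomes = random.choices([0, 1, 2], weights=[p0, p1, p2], k=n)
    if 1 in outcomes and 2 in outcomes:
        hits += 1

print(hits / trials)                          # simulated estimate
print(1 + p0**n - (1 - p1)**n - (1 - p2)**n)  # exact formula ≈ 0.636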


Section 3.5
P(· | F) is a Probability

We now consider the conditional probability P(· | F) as a probability function that satisfies the three axioms of probability.

Proposition For any event E of a sample space S, the probability P(E | F) satisfies the three axioms of probability:

1. 0 ≤ P(E | F) ≤ 1.

2. P(S | F) = 1.

3. If events E1, E2, E3, . . . are mutually exclusive (i.e., EiEj = ∅ when i ≠ j), then

P(∪_{i=1}^{∞} Ei | F) = Σ_{i=1}^{∞} P(Ei | F).

Proof. For part 1, we must show that 0 ≤ P(E | F) ≤ 1. Here, we can use the formula for conditional probability to write

P(E | F) = P(EF)/P(F).

Here, P(EF) ≥ 0, and since EF ⊂ F, it follows that P(EF) ≤ P(F). Therefore,

0 ≤ P(EF)/P(F) ≤ 1,

so P(E | F) satisfies the first axiom. Part 2 follows since

P(S | F) = P(SF)/P(F) = P(F)/P(F) = 1.

Finally, consider a sequence of mutually exclusive events E1, E2, . . ., then

P(∪_{i=1}^{∞} Ei | F) = P((∪_{i=1}^{∞} Ei)F) / P(F) = P(∪_{i=1}^{∞} EiF) / P(F).

From here, since EiEj = ∅ for i ≠ j, it follows that (EiF)(EjF) = ∅, and we have

P(∪_{i=1}^{∞} EiF) = Σ_{i=1}^{∞} P(EiF).


This is because all other terms in the inclusion-exclusion principle drop out, since the events are mutually exclusive. Therefore,

P(∪_{i=1}^{∞} Ei | F) = Σ_{i=1}^{∞} P(EiF) / P(F) = Σ_{i=1}^{∞} P(Ei | F),

and Axiom 3 is satisfied.

If we define Q(E) = P(E | F), then from the above proposition, Q(E) may be regarded as a probability function on the events of S. In this sense, all propositions previously proved for probabilities hold for Q(E). For example, we have

Q(E1 ∪ E2) = Q(E1) + Q(E2) − Q(E1E2),

or equivalently,

P(E1 ∪ E2 | F) = P(E1 | F) + P(E2 | F) − P(E1E2 | F).

Also, we can define a conditional probability for the probability Q(E). Suppose we wish to find the probability Q(E1) by first conditioning on whether or not E2 occurs; then

Q(E1) = Q(E1 | E2)Q(E2) + Q(E1 | E2^c)Q(E2^c).

If we substitute our formula for conditional probability, Q(E1 | E2) = Q(E1E2)/Q(E2), we obtain

Q(E1 | E2) = Q(E1E2)/Q(E2) = P(E1E2 | F)/P(E2 | F) = [P(E1E2F)/P(F)] / [P(E2F)/P(F)] = P(E1E2F)/P(E2F) = P(E1 | E2F).

Therefore, if we condition the probability Q(E1) on whether or not E2 occurs, it is equivalent to

P(E1 | F) = P(E1 | E2F)P(E2 | F) + P(E1 | E2^cF)P(E2^c | F).

Exercise (5a and 3a) An insurance company believes that people can be divided into two classes: those who are accident prone and those who are not. The company's statistics show that an accident-prone person will have an accident at some time within a fixed 1-year period with probability 0.4, whereas this probability decreases to 0.2 for a person who is not accident prone.

a.) If we assume that 30% of the population is accident prone, what is the probability that a new policyholder will have an accident within a year of purchasing a policy?


b.) Suppose that a new policyholder has an accident within a year of purchasing a policy. What is the probability that she is accident prone?

c.) What is the conditional probability that a new policyholder will have an accident in his or her second year of policy ownership, given that the policyholder had an accident the first year?

Solution: Let A be the event that a person is accident prone, and let A1 be the event that a person had an accident within a year. Also, let A2 denote the event that a person had an accident during the second year of holding a policy. We are given P(A1 | A) = 0.4 and P(A1 | A^c) = 0.2. Consider the following:

a.) We seek the probability P(A1), which we can find by conditioning on whether or not that person is accident prone. We have

P (A1) = P (A1 |A)P (A) + P (A1 |Ac)P (Ac)

= (0.4)(0.3) + (0.2)(0.7)

= 0.26.

b.) Now, we assume the person has had an accident and we want to know whether she was actually accident prone. In this case, we seek the probability P(A | A1). This is given by

P(A | A1) = P(AA1)/P(A1) = P(A)P(A1 | A)/P(A1) = (0.3)(0.4)/0.26 = 6/13.

c.) Now, we wish to find the probability P(A2 | A1). We can find this by conditioning on whether or not the person is accident prone. This gives

P(A2 | A1) = P(A2 | AA1)P(A | A1) + P(A2 | A^cA1)P(A^c | A1).

Here, the probability P(A2 | AA1) = 0.4, since for an accident-prone person the second year is a separate period from the first year; similarly, P(A2 | A^cA1) = 0.2. This gives the following solution:

P(A2 | A1) = (0.4)(6/13) + (0.2)(7/13) ≈ 0.29.

Exercise (5e) There are k + 1 coins in a box. When flipped, the ith coin will turn up heads with probability i/k, for i = 0, 1, . . . , k. A coin is randomly selected from the box and is then repeatedly flipped. If the first n flips all result in heads, what is the conditional probability that the (n + 1)st flip will do likewise?


Solution: Let Ci be the event that the ith coin is drawn, Fn be the event that the first n flips were heads, and H be the event that the (n + 1)st flip is heads. Then the probability we seek is P(H | Fn). We can conclude that the probabilities given are P(Ci) = 1/(k + 1), P(H | Ci) = i/k, and P(Fn | Ci) = (i/k)^n, as each flip of a given coin is independent of the others. To find the probability, we condition on which coin was drawn:

P(H | Fn) = Σ_{i=0}^{k} P(H | CiFn) P(Ci | Fn).

We can assume that P(H | CiFn) = P(H | Ci) = i/k, since once we know which coin was used, the previous flips do not influence the (n + 1)st flip. The only thing that matters is which coin was used. Therefore, we have

P(H | Fn) = Σ_{i=0}^{k} (i/k) P(Ci | Fn).

Now, we seek the probability P(Ci | Fn), which is the probability of drawing the ith coin given that n heads were flipped. Using Bayes's formula, we obtain

P(Ci | Fn) = P(Fn | Ci)P(Ci) / Σ_{j=0}^{k} P(Fn | Cj)P(Cj) = (i/k)^n (1/(k + 1)) / Σ_{j=0}^{k} (j/k)^n (1/(k + 1)) = (i/k)^n / Σ_{j=0}^{k} (j/k)^n.

From this, we update our probability as

P(H | Fn) = Σ_{i=0}^{k} (i/k)^(n+1) / Σ_{j=0}^{k} (j/k)^n.

If k is large, then multiplying the numerator and denominator by 1/k, we obtain the following approximation:

P(H | Fn) = [(1/k) Σ_{i=0}^{k} (i/k)^(n+1)] / [(1/k) Σ_{j=0}^{k} (j/k)^n] ≈ ∫_0^1 x^(n+1) dx / ∫_0^1 x^n dx = (n + 1)/(n + 2).

This approximation follows because, as k gets large, each sum is a Riemann sum approximation to the corresponding integral.
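
A quick numerical sketch (Python, not part of the notes; k and n are illustrative) confirms that the exact ratio approaches (n + 1)/(n + 2):

k, n = 1000, 3
exact = sum((i / k) ** (n + 1) for i in range(k + 1)) / sum((j / k) ** n for j in range(k + 1))
print(exact, (n + 1) / (n + 2))   # both close to 0.8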
