
6334 Day 3 slides: Spanos-lecture-2


PHIL 6334 - Probability/Statistics Lecture Notes 2:

Conditional Probabilities and Bayes’ theorem

Aris Spanos [Spring 2014]

1 The view from the $(S, \mathcal{F}, P(\cdot))$ perspective

1.1 Conditional probability

Consider the probability set-up described by the probability space $(S, \mathcal{F}, P(\cdot))$, where $S$ is the set of all possible outcomes, $\mathcal{F}$ is a field of events of interest, and $P(\cdot)$ is a probability set function assigning probabilities to events in $\mathcal{F}$. For any two events $A$ and $B$ in $\mathcal{F}$ the following formula for conditional probability holds:

$$P(A|B)=\frac{P(A\cap B)}{P(B)},\quad P(B)>0. \tag{1}$$

This formula treats the events $A$ and $B$ symmetrically, and thus:

$$P(B|A)=\frac{P(A\cap B)}{P(A)},\quad P(A)>0. \tag{2}$$

Solving (1) and (2) for $P(A\cap B)$ yields the multiplication rule:

$$P(A\cap B)=P(A|B)\cdot P(B)=P(B|A)\cdot P(A). \tag{3}$$

Substituting (3) into (1) yields the conditional probability:

$$P(A|B)=\frac{P(B|A)\cdot P(A)}{P(B)},\quad P(B)>0. \tag{4}$$


Example. Consider the random experiment of tossing a fair coin twice:

$$S=\{(HH),(HT),(TH),(TT)\}.$$

Let the events of interest be:

$$A=\{(HH),(HT),(TH)\},\ P(A)=.75; \qquad B=\{(HT),(TH),(TT)\},\ P(B)=.75.$$

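The conditional probability of $A$ given $B$ takes the form:

$$P(A|B)=\frac{P(A\cap B)}{P(B)}=\frac{.5}{.75}=\frac{2}{3}, \tag{5}$$

since $P(A\cap B)=P(\{(HT),(TH)\})=.5$. Notice also that:

$$P(B|A)=\frac{.5}{.75}=\frac{2}{3}\ \rightarrow\ P(A\cap B)=P(B|A)\cdot P(A)=\frac{.5}{.75}(.75)=.5.$$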

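Now consider introducing a third event:

$$C=\{(HH),(HT),(TT)\},\ P(C)=.75.$$

What is the conditional probability of $A$ given $B$ and $C$?

$$P(A|B\cap C)=\frac{P(A\cap[B\cap C])}{P(B\cap C)},\quad P(B\cap C)>0,$$

where the condition $P(B\cap C)>0$, in light of the fact that:

$$P(B\cap C)=P(B|C)\cdot P(C),$$

requires $P(B|C)>0$ and $P(C)>0$. In this example:

$$P(A\cap C)=P(\{(HH),(HT)\})=.5,\quad P(B\cap C)=P(\{(HT),(TT)\})=.5,\quad P(A\cap B\cap C)=P(\{(HT)\})=.25,$$

which imply that:

$$P(A|B\cap C)=\frac{.25}{.5}=\frac{1}{2},\quad\text{whereas}\quad P(A|B)=\frac{2}{3}.$$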
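The arithmetic in the coin example can be checked by enumerating the sample space directly. The sketch below is only an illustration: the helper names `prob` and `cond_prob` are hypothetical, and the events are taken to be $A=\{(HH),(HT),(TH)\}$, $B=\{(HT),(TH),(TT)\}$ and $C=\{(HH),(HT),(TT)\}$, each outcome having probability 1/4.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin; outcomes are equally likely.
S = set(product("HT", repeat=2))

def prob(event):
    """P(event) for an event given as a subset of S."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); only defined when P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(a & b) / prob(b)

# Events assumed for the illustration (see the lead-in).
A = {("H", "H"), ("H", "T"), ("T", "H")}
B = {("H", "T"), ("T", "H"), ("T", "T")}
C = {("H", "H"), ("H", "T"), ("T", "T")}

p_a_given_b = cond_prob(A, B)        # formula (1)
p_b_given_a = cond_prob(B, A)        # formula (2)
p_a_given_bc = cond_prob(A, B & C)   # conditioning on both B and C

# Multiplication rule (3): both products recover P(A and B).
assert p_a_given_b * prob(B) == p_b_given_a * prob(A) == prob(A & B)

print(p_a_given_b, p_a_given_bc)     # 2/3 1/2
```

Using `Fraction` keeps the results exact, so the comparison in the multiplication rule is not affected by floating-point rounding.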


1.2 Bayes’ theorem from the $(S, \mathcal{F}, P(\cdot))$ perspective

The conditional probability formula in (4) is transformed into an updating rule by interpreting the two events $A$ and $B$ as a hypothesis $H$ and evidence $E$ to yield Bayes’ formula:

$$P(H|E)=\frac{P(E|H)\cdot P(H)}{P(E)},\quad P(E)>0, \tag{6}$$

where:

(i) $P(H|E)$ is interpreted as the posterior probability of $H$,

(ii) $P(E|H)$ is interpreted as the likelihood of $H$,

(iii) $P(H)$ is interpreted as the prior probability of $H$, and

(iv) $P(E)$ is interpreted as the initial probability of evidence.

Remark 1: Viewed from the probability space $(S, \mathcal{F}, P(\cdot))$ perspective, (6) makes mathematical sense only when the hypothesis $H$ and evidence $E$ belong to the same field $\mathcal{F}$. This is potentially problematic because in empirical modeling $H$ lives in Plato’s world and $E$ lives in the real world. Hence, (6) presumes that the two worlds can be easily merged in $S$, with $H$ and $E$ constituting overlapping events. However, Bayesians feel timid about introducing $P(H\cap E)$ and assigning it a probability using Bayes’ formula in the form:

$$P(H|E)=\frac{P(H\cap E)}{P(E)},\quad P(E)>0.$$

Instead, they replace $P(H\cap E)$ with $P(E|H)\cdot P(H)$; although the two are mathematically equivalent, the terms $P(E|H)$ and $P(H)$ can be given more beguiling interpretations! These issues become more insidious when Bayes’ formula is viewed from the $\{f(\mathbf{x};\theta),\ \theta\in\Theta,\ \mathbf{x}\in\mathbb{R}^{n}\}$ perspective.

The most problematic of the probabilistic assignments (i)-(iv) is $P(E)$, because it’s not obvious where the probability could come from. The Bayesians seek to address this conundrum by defining (iv) in terms of (ii)-(iii). In particular, they use $H$ and not-$H$, denoted by $\neg H$ (the "catch-all"), to define a partition of $S$:

$$S=H\cup\neg H,$$

and then use $P(H\cup\neg H)=P(H)+P(\neg H)=P(S)=1$ to deduce the total probability rule:

$$P(E)=P(H)\cdot P(E|H)+P(\neg H)\cdot P(E|\neg H). \tag{7}$$

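This rule holds for any set of events $(H_1, H_2, \ldots, H_m)$ that constitutes a partition of $S$, in the sense that:

$$(H_1\cup H_2\cup\cdots\cup H_m)=S,\quad H_i\cap H_j=\emptyset\ \text{ for any } i\neq j,\ \ i,j=1,2,\ldots,m,$$

in which case:

$$P(E)=\sum_{i=1}^{m}P(H_i)\cdot P(E|H_i).$$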
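The step from the partition to the total probability rule is not spelled out in the notes; one way to fill it in is sketched below, writing $H_1,\ldots,H_m$ for the partition events and $E$ for the evidence (the symbols are assumed here), and using only additivity for mutually exclusive events and the multiplication rule (3):

```latex
\begin{align*}
E &= E\cap S = E\cap(H_1\cup\cdots\cup H_m) = (E\cap H_1)\cup\cdots\cup(E\cap H_m), \\
P(E) &= \sum_{i=1}^{m} P(E\cap H_i)
      && \text{(the events } E\cap H_i \text{ are mutually exclusive)} \\
     &= \sum_{i=1}^{m} P(H_i)\cdot P(E|H_i)
      && \text{(multiplication rule (3), with } P(H_i)>0\text{)}.
\end{align*}
```

Setting $m=2$ with $H_1=H$ and $H_2=\neg H$ gives the two-term version of the rule.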

The rule in (7) is often used to write Bayes’ formula as:

$$P(H|E)=\frac{P(E|H)\cdot P(H)}{P(H)\cdot P(E|H)+P(\neg H)\cdot P(E|\neg H)},\quad P(E)>0. \tag{8}$$
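As a numerical illustration of how the total probability rule feeds into Bayes’ formula, the following sketch computes $P(E)$ and the posteriors over a partition. The function names and the numbers ($P(H)=3/10$, $P(E|H)=4/5$, $P(E|\neg H)=1/5$) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(E) = sum_i P(H_i) * P(E | H_i) over a partition H_1, ..., H_m."""
    if sum(priors) != 1:
        raise ValueError("priors over a partition must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))

def posteriors(priors, likelihoods):
    """P(H_i | E) for each H_i, with P(E) from the total probability rule."""
    p_e = total_probability(priors, likelihoods)
    if p_e == 0:
        raise ValueError("P(E) must be positive")
    return [p * l / p_e for p, l in zip(priors, likelihoods)]

# Hypothetical numbers for the two-event partition {H, not-H}.
priors = [Fraction(3, 10), Fraction(7, 10)]      # P(H), P(not-H)
likelihoods = [Fraction(4, 5), Fraction(1, 5)]   # P(E|H), P(E|not-H)

p_e = total_probability(priors, likelihoods)
post_h, post_not_h = posteriors(priors, likelihoods)

print(p_e, post_h, post_not_h)   # 19/50 12/19 7/19
```

Note that the value of $P(E)$ here is entirely determined by the assumed priors and likelihoods, which is exactly the move that point (iv) is defined through.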

Remark 2: It is important to distinguish between the formula for conditional probabilities (4), which is totally non-controversial, and Bayes’ formula (8), which is controversial because:

(a) it assumes that a hypothesis $H$ and evidence $E$ are just overlapping events in the same field $\mathcal{F}$, and

(b) it invokes the total probability formula to assign a probability to $E$.


1.3 Bayesian Confirmation Theory

The Bayesian confirmation theory relies on comparing the prior with the posterior probability of hypothesis $H$:

[i] Confirmation: $P(H|E)>P(H)$,

[ii] Disconfirmation: $P(H|E)<P(H)$.

In case [i] evidence $E$ confirms hypothesis $H$, and in case [ii] evidence $E$ disconfirms hypothesis $H$.

The degree of confirmation is measured using some measure $c(H,E)$ of the "degree to which $E$ raises the probability of $H$". The most popular such Bayesian measures are:

$$d(H,E)=P(H|E)-P(H),\qquad s(H,E)=P(H|E)-P(H|\neg E),$$

$$r(H,E)=\frac{P(H|E)}{P(H)},\qquad l(H,E)=\frac{P(E|H)}{P(E|\neg H)}.$$
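To make such measures concrete, the sketch below evaluates four forms that are standard in the confirmation literature: the difference $P(H|E)-P(H)$, the difference $P(H|E)-P(H|\neg E)$, the ratio $P(H|E)/P(H)$, and the likelihood ratio $P(E|H)/P(E|\neg H)$. The labels `d`, `s`, `r`, `l`, the function name, and the input numbers are assumptions made for illustration, not taken from the notes.

```python
from fractions import Fraction

def confirmation_measures(p_h, p_e_given_h, p_e_given_not_h):
    """Evaluate four confirmation measures from P(H), P(E|H), P(E|not-H)."""
    p_not_h = 1 - p_h
    # Total probability rule for P(E) over the partition {H, not-H}.
    p_e = p_h * p_e_given_h + p_not_h * p_e_given_not_h
    if not (0 < p_h < 1 and 0 < p_e < 1 and p_e_given_not_h > 0):
        raise ValueError("need 0 < P(H) < 1, 0 < P(E) < 1 and P(E|not-H) > 0")
    p_h_given_e = p_e_given_h * p_h / p_e
    p_h_given_not_e = (1 - p_e_given_h) * p_h / (1 - p_e)
    return {
        "d": p_h_given_e - p_h,              # P(H|E) - P(H)
        "s": p_h_given_e - p_h_given_not_e,  # P(H|E) - P(H|not-E)
        "r": p_h_given_e / p_h,              # P(H|E) / P(H)
        "l": p_e_given_h / p_e_given_not_h,  # P(E|H) / P(E|not-H)
    }

# Hypothetical inputs: P(H) = 3/10, P(E|H) = 4/5, P(E|not-H) = 1/5.
m = confirmation_measures(Fraction(3, 10), Fraction(4, 5), Fraction(1, 5))
print(m["d"], m["s"], m["r"], m["l"])   # 63/190 315/589 40/19 4
```

With these inputs all four measures signal confirmation (the differences are positive and the ratios exceed 1), but their numerical values are on different scales.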

One can use any one of the above measures to argue that: according to measure $c(H,E)$, evidence $E$ favors hypothesis $H_1$ over $H_0$ iff:

$$c(H_1,E)>c(H_0,E).$$

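For instance, using the measure $r(H,E)$ in the case of two competing hypotheses $H_0$ and $H_1$:

$$\frac{P(H_1|E)}{P(H_1)}>\frac{P(H_0|E)}{P(H_0)}\ \overset{\text{Bayes}}{\Longleftrightarrow}\ \frac{P(E|H_1)}{P(E)}>\frac{P(E|H_0)}{P(E)}\ \Longleftrightarrow\ \frac{P(E|H_1)}{P(E|H_0)}>1,$$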


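where $\frac{P(E|H_1)}{P(E|H_0)}$ is the (Bayesian) likelihood ratio.

For comparison purposes let us contrast this to the ratio of the posteriors:

$$\frac{P(H_1|E)}{P(H_0|E)}=\frac{P(E|H_1)\cdot P(H_1)/P(E)}{P(E|H_0)\cdot P(H_0)/P(E)}=\frac{P(E|H_1)\cdot P(H_1)}{P(E|H_0)\cdot P(H_0)}>1,$$

which is the product of $\frac{P(E|H_1)}{P(E|H_0)}$ and the ratio of the priors $\frac{P(H_1)}{P(H_0)}$.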
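The contrast between the two comparisons can be seen in a small numerical sketch. It assumes, purely for illustration, that $H_0$ and $H_1$ form a partition (so that $P(E)$ follows from the total probability rule) and uses hypothetical priors and likelihoods; the variable names are likewise hypothetical.

```python
from fractions import Fraction

# Hypothetical priors and likelihoods; H0 and H1 are assumed to be a partition.
prior = {"H0": Fraction(3, 4), "H1": Fraction(1, 4)}
lik = {"H0": Fraction(1, 5), "H1": Fraction(3, 5)}   # P(E|H0), P(E|H1)

p_e = sum(prior[h] * lik[h] for h in prior)           # total probability rule
post = {h: lik[h] * prior[h] / p_e for h in prior}    # Bayes' formula
r = {h: post[h] / prior[h] for h in prior}            # posterior / prior

likelihood_ratio = lik["H1"] / lik["H0"]
prior_ratio = prior["H1"] / prior["H0"]
posterior_ratio = post["H1"] / post["H0"]

# The ratio measure favors H1 over H0 exactly when the likelihood ratio > 1.
assert (r["H1"] > r["H0"]) == (likelihood_ratio > 1)
# The posterior ratio is the likelihood ratio times the ratio of the priors.
assert posterior_ratio == likelihood_ratio * prior_ratio

print(likelihood_ratio, prior_ratio, posterior_ratio)   # 3 1/3 1
```

In this made-up case the likelihood ratio is 3, so the ratio measure favors $H_1$ over $H_0$, yet the posterior ratio equals 1 because the prior ratio of 1/3 exactly offsets it.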

Remark 3: It is important to note that the above measures are considered different only when they are not ordinally equivalent, where ordinal equivalence means that they give rise to the same ranking. This, however, raises serious questions about the appropriateness of such measures, since ordinal measures render the differences within the same ranking uninterpretable; how can one interpret such differences as measuring the degree of confirmation?
