PHIL 6334 - Probability/Statistics Lecture Notes 2:
Conditional Probabilities and Bayes’ theorem
Aris Spanos [Spring 2014]
1 The view from the $(S, \mathcal{F}, P(\cdot))$ perspective
1.1 Conditional probability
Consider the probability setup described by the probability space $(S, \mathcal{F}, P(\cdot))$, where $S$ is the set of all possible outcomes, $\mathcal{F}$ is a $\sigma$-field of events of interest, and $P(\cdot)$ is a probability set function assigning probabilities to events in $\mathcal{F}$. For any two events $A$ and $B$ in $\mathcal{F}$ the following formula for conditional probability holds:
$P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0.$ (1)
This formula treats the events $A$ and $B$ symmetrically, and thus:
$P(B|A) = \frac{P(A \cap B)}{P(A)}, \quad P(A) > 0.$ (2)
Solving (1) and (2) for $P(A \cap B)$ yields the multiplication rule:
$P(A \cap B) = P(A|B) \cdot P(B) = P(B|A) \cdot P(A).$ (3)
Substituting (3) into (1) yields the conditional probability:
$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}, \quad P(B) > 0.$ (4)
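Example. Consider the random experiment of tossing a fair coin twice:
$S = \{(HH), (HT), (TH), (TT)\}.$
Let the events of interest be:
$A = \{(HH), (HT), (TH)\}, \ P(A) = .75; \quad B = \{(HT), (TH), (TT)\}, \ P(B) = .75.$
The conditional probability of $A$ given $B$ takes the form:
$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{.5}{.75} = \frac{2}{3},$ (5)
since $P(A \cap B) = P(\{(HT), (TH)\}) = .5$. Notice also that:
$P(B|A) = \frac{.5}{.75} = \frac{2}{3} \ \rightarrow \ P(A \cap B) = P(B|A) \cdot P(A) = \frac{.5}{.75}(.75) = .5.$
Now consider introducing a third event:
$C = \{(HH), (HT), (TT)\}, \ P(C) = .75.$
What is the conditional probability of $A$ given $B$ and $C$?
$P(A | B \cap C) = \frac{P(A \cap [B \cap C])}{P(B \cap C)}, \quad P(B \cap C) > 0,$
which, in light of the fact that $P(B \cap C) = P(B|C) \cdot P(C)$, requires $P(B|C) > 0$ and $P(C) > 0$. Here:
$P(A \cap C) = P(\{(HH), (HT)\}) = .5, \quad P(B \cap C) = P(\{(HT), (TT)\}) = .5, \quad P(A \cap B \cap C) = P(\{(HT)\}) = .25,$
which imply that:
$P(A | B \cap C) = \frac{.25}{.5} = \frac{1}{2}, \quad P(A|B) = \frac{2}{3}.$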
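The arithmetic in the two-toss example can be checked by brute-force enumeration. The following is a minimal sketch, assuming equally likely outcomes and the events as read from the example; the helper names `prob` and `cond_prob` are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin: HH, HT, TH, TT (each with probability 1/4).
S = {"".join(toss) for toss in product("HT", repeat=2)}

def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a ∩ b) / P(b), defined only when P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(a & b) / prob(b)

A = {"HH", "HT", "TH"}
B = {"HT", "TH", "TT"}
C = {"HH", "HT", "TT"}

p_a_given_b = cond_prob(A, B)         # formula (1)
p_b_given_a = cond_prob(B, A)         # formula (2)
p_a_given_bc = cond_prob(A, B & C)    # conditioning on the intersection of B and C

# Multiplication rule (3): both factorizations recover P(A ∩ B).
assert prob(A & B) == p_a_given_b * prob(B) == p_b_given_a * prob(A)

print(p_a_given_b, p_a_given_bc)
```

Using exact fractions rather than floats keeps the comparison with the hand-computed values free of rounding issues.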
1.2 Bayes' theorem from the $(S, \mathcal{F}, P(\cdot))$ perspective
The conditional probability formula in (4) is transformed into an updating rule by interpreting the two events $A$ and $B$ as a hypothesis $H$ and evidence $E$ to yield Bayes' formula:
$P(H|E) = \frac{P(E|H) \cdot P(H)}{P(E)}, \quad P(E) > 0,$ (6)
where:
(i) $P(H|E)$ is interpreted as the posterior probability of $H$,
(ii) $P(E|H)$ is interpreted as the likelihood of $H$,
(iii) $P(H)$ is interpreted as the prior probability of $H$, and
(iv) $P(E)$ is interpreted as the initial probability of evidence $E$.
Remark 1: Viewed from the probability space $(S, \mathcal{F}, P(\cdot))$ perspective, (6) makes mathematical sense only when the hypothesis $H$ and evidence $E$ belong to the same field $\mathcal{F}$. This is potentially problematic because in empirical modeling $H$ lives in Plato's world and $E$ lives in the real world. Hence, (6) presumes that the two worlds can be easily merged in $\mathcal{F}$, with $H$ and $E$ constituting overlapping events. However, Bayesians are reluctant to introduce $(H \cap E)$ and assign it a probability using Bayes' formula in the form:
$P(H|E) = \frac{P(H \cap E)}{P(E)}, \quad P(E) > 0.$
Instead, they replace $P(H \cap E)$ with $P(E|H) \cdot P(H)$ which, although mathematically equivalent, involves terms $P(E|H)$ and $P(H)$ that can be given more beguiling interpretations! These issues become more insidious when Bayes' formula is viewed from the $\{f(\mathbf{x}; \theta),\ \theta \in \Theta,\ \mathbf{x} \in \mathbb{R}^{n}\}$ perspective.
The most problematic of the probabilistic assignments (i)-(iv) is $P(E)$, because it is not obvious where the probability could come from. The Bayesians seek to address this conundrum by defining (iv) in terms of (ii)-(iii). In particular, they use $H$ and not-$H$ (denoted by $\neg H$, the "catch-all") to define a partition of $S$:
$S = H \cup \neg H,$
and then use $P(H \cup \neg H) = P(H) + P(\neg H) = P(S) = 1$ to deduce the total probability rule:
$P(E) = P(H) \cdot P(E|H) + P(\neg H) \cdot P(E|\neg H).$ (7)
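This rule holds for any set of events $(H_1, H_2, \ldots, H_m)$ that constitutes a partition of $S$, in the sense that if:
$(H_1 \cup H_2 \cup \cdots \cup H_m) = S, \quad H_i \cap H_j = \emptyset \ \text{for any } i \neq j, \ i, j = 1, 2, \ldots, m,$
then:
$P(E) = \sum_{i=1}^{m} P(H_i) \cdot P(E|H_i).$
The rule in (7) is often used to write Bayes' formula as:
$P(H|E) = \frac{P(E|H) \cdot P(H)}{P(H) \cdot P(E|H) + P(\neg H) \cdot P(E|\neg H)}, \quad P(E) > 0.$ (8)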
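To make the updating rule concrete, here is a minimal sketch of the total probability rule and Bayes' formula over a finite partition. The function names and the numerical priors and likelihoods are hypothetical, chosen only for illustration; they do not come from the notes.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(E) = sum_i P(H_i) * P(E | H_i) over a partition H_1, ..., H_m."""
    if sum(priors) != 1:
        raise ValueError("priors over a partition must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))

def posteriors(priors, likelihoods):
    """P(H_i | E) = P(E | H_i) * P(H_i) / P(E) for each member of the partition."""
    p_e = total_probability(priors, likelihoods)
    if p_e == 0:
        raise ValueError("P(E) must be positive")
    return [p * l / p_e for p, l in zip(priors, likelihoods)]

# Hypothetical two-member partition {H, not-H}:
# P(H) = 3/10, P(not-H) = 7/10, P(E | H) = 4/5, P(E | not-H) = 1/5.
priors = [Fraction(3, 10), Fraction(7, 10)]
likelihoods = [Fraction(4, 5), Fraction(1, 5)]

p_e = total_probability(priors, likelihoods)   # rule (7)
post = posteriors(priors, likelihoods)         # formula (8) for H and for not-H

print(p_e, post[0], post[1])
```

With these assumed inputs the posterior of the hypothesis exceeds its prior, and the posteriors over the partition sum to one, as they must.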
Remark 2: It is important to distinguish between the formula for conditional probabilities (4), which is totally non-controversial, and Bayes' formula (8), which is controversial because:
(a) it assumes that a hypothesis $H$ and evidence $E$ are just overlapping events in the same field $\mathcal{F}$, and
(b) it invokes the total probability formula to assign a probability to $E$.
1.3 Bayesian Confirmation Theory
The Bayesian confirmation theory relies on comparing the prior with the posterior probability of hypothesis $H$:
[i] Confirmation: $P(H|E) > P(H)$
[ii] Disconfirmation: $P(H|E) < P(H)$
In case [i] evidence $E$ confirms hypothesis $H$, and in case [ii] evidence $E$ disconfirms hypothesis $H$.
The degree of confirmation is measured using some measure $c(H, E)$ of the "degree to which $E$ raises the probability of $H$". The most popular such Bayesian measures are:
$d(H, E) = P(H|E) - P(H)$
$s(H, E) = P(H|E) - P(H|\neg E)$
$r(H, E) = \frac{P(H|E)}{P(H)}$
$l(H, E) = \frac{P(E|H)}{P(E|\neg H)}$
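As a rough illustration of how such measures are evaluated, the sketch below computes a difference measure, a measure contrasting the posterior with the probability of the hypothesis given the negated evidence, a ratio measure, and a likelihood-ratio measure. The labels `d`, `s`, `r`, `l` and the exact forms follow common usage in the confirmation literature and are assumptions here, as are the numerical inputs.

```python
from fractions import Fraction

def confirmation_measures(p_h, p_e_given_h, p_e_given_not_h):
    """Evaluate four common confirmation measures for hypothesis H and evidence E.

    Assumes 0 < P(H) < 1, 0 < P(E) < 1 and P(E | not-H) > 0 so every ratio is defined.
    """
    p_not_h = 1 - p_h
    p_e = p_h * p_e_given_h + p_not_h * p_e_given_not_h      # total probability rule
    p_h_given_e = p_e_given_h * p_h / p_e                     # Bayes' formula
    p_h_given_not_e = (1 - p_e_given_h) * p_h / (1 - p_e)     # Bayes' formula with not-E
    return {
        "d": p_h_given_e - p_h,               # difference measure
        "s": p_h_given_e - p_h_given_not_e,   # contrast with P(H | not-E)
        "r": p_h_given_e / p_h,               # ratio measure
        "l": p_e_given_h / p_e_given_not_h,   # likelihood ratio measure
    }

# Hypothetical inputs: P(H) = 3/10, P(E | H) = 4/5, P(E | not-H) = 1/5.
m = confirmation_measures(Fraction(3, 10), Fraction(4, 5), Fraction(1, 5))
print(m)
```

For these inputs all four measures signal confirmation (the differences are positive and the ratios exceed one), but their numerical values are on different scales.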
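One can use any one of the above measures to argue that:
According to measure $c(H, E)$, evidence $E$ favors hypothesis $H_1$ over $H_0$ iff:
$c(H_1, E) > c(H_0, E).$
For instance, using the measure $r(H, E)$ in the case of two competing hypotheses $H_0$ and $H_1$:
$\frac{P(H_1|E)}{P(H_1)} > \frac{P(H_0|E)}{P(H_0)} \ \overset{\text{Bayes}}{\Leftrightarrow} \ \frac{P(E|H_1)}{P(E)} > \frac{P(E|H_0)}{P(E)} \ \Leftrightarrow \ \frac{P(E|H_1)}{P(E|H_0)} > 1,$
where $\frac{P(E|H_1)}{P(E|H_0)}$ is the (Bayesian) likelihood ratio.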
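For comparison purposes let us contrast this to the ratio of the posteriors:
$\frac{P(H_1|E)}{P(H_0|E)} = \frac{P(E|H_1) \cdot P(H_1) / P(E)}{P(E|H_0) \cdot P(H_0) / P(E)} = \frac{P(E|H_1) \cdot P(H_1)}{P(E|H_0) \cdot P(H_0)} > 1,$
which is the product of $\frac{P(E|H_1)}{P(E|H_0)}$ and the ratio of the priors $\frac{P(H_1)}{P(H_0)}$.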
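The difference between the two criteria shows up as soon as the priors are unequal. The following sketch, using hypothetical likelihoods and priors not taken from the notes, gives a case where the likelihood ratio exceeds one while the ratio of the posteriors does not.

```python
from fractions import Fraction

def likelihood_ratio(p_e_given_h1, p_e_given_h0):
    """P(E | H1) / P(E | H0)."""
    return p_e_given_h1 / p_e_given_h0

def posterior_ratio(p_e_given_h1, p_h1, p_e_given_h0, p_h0):
    """P(H1 | E) / P(H0 | E); the common factor P(E) cancels."""
    return (p_e_given_h1 * p_h1) / (p_e_given_h0 * p_h0)

# Hypothetical inputs: E is twice as probable under H1, but H1 has a small prior.
p_e_h1, p_e_h0 = Fraction(4, 5), Fraction(2, 5)
p_h1, p_h0 = Fraction(1, 10), Fraction(9, 10)

lr = likelihood_ratio(p_e_h1, p_e_h0)
pr = posterior_ratio(p_e_h1, p_h1, p_e_h0, p_h0)

# The posterior ratio is the likelihood ratio times the ratio of the priors.
assert pr == lr * (p_h1 / p_h0)
print(lr, pr)
```

Here the ratio measure would say the evidence favors the first hypothesis, while the posterior ratio still ranks the second hypothesis as more probable.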
Remark 3: It is important to note that the above measures are considered different only when they are not ordinally equivalent, where ordinal equivalence means that they give rise to the same ranking. This, however, raises serious questions about the appropriateness of such measures, since ordinal measures render differences in value within the same ranking uninterpretable; how can one interpret such differences as measuring the degree of confirmation?
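Two measures that are not ordinally equivalent can rank the same pair of hypotheses in opposite ways. The sketch below constructs one such case for a difference measure (posterior minus prior) and a ratio measure (posterior over prior); the three-member partition and all numbers are hypothetical and serve only to illustrate the point.

```python
from fractions import Fraction

# Hypothetical partition {H0, H1, H2} with priors and likelihoods P(E | H_i).
priors = [Fraction(1, 2), Fraction(1, 20), Fraction(9, 20)]
likelihoods = [Fraction(3, 5), Fraction(9, 10), Fraction(1, 10)]

# Total probability rule and Bayes' formula over the partition.
p_e = sum(p * l for p, l in zip(priors, likelihoods))
post = [p * l / p_e for p, l in zip(priors, likelihoods)]

# Difference measure and ratio measure for each hypothesis.
d = [q - p for q, p in zip(post, priors)]
r = [q / p for q, p in zip(post, priors)]

# The difference measure ranks H0 above H1, the ratio measure ranks H1 above H0.
favored_by_d = "H0" if d[0] > d[1] else "H1"
favored_by_r = "H0" if r[0] > r[1] else "H1"
print(favored_by_d, favored_by_r)
```

Since the ratio measure depends only on the likelihood relative to the probability of the evidence, while the difference measure also scales with the prior, a hypothesis with a large prior can win on one and lose on the other.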