8/8/2019 Probability Theory Presentation 13
http://slidepdf.com/reader/full/probability-theory-presentation-13 1/28
BST 401 Probability Theory
Xing Qiu, Ha Youn Lee
Department of Biostatistics and Computational Biology, University of Rochester
October 19, 2010
Qiu, Lee BST 401
Outline
1 Basic Concepts of Probability
Basic Definitions
We will go through Chapter 1, sections 1-5.
I’ll ask you to go through sections 1-3. You will find most of these definitions/theorems/inequalities very, very familiar.
I’ll mention a few things that are not in the appendix.
Some Remarks
(Page 5) We say two r.v.s $X$ and $Y$ are equal in distribution, denoted $X \overset{d}{=} Y$, if they have the same distribution function, i.e., $P(X \le x) = P(Y \le x)$ for all $x \in \mathbb{R}$.
Remember, being equal in distribution is a very weak equality.
(Page 8) Exercise 1.10. It shows you how to compute the density function of a transformed random variable.
Exercise 1.12 is just an important special case of 1.10.
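To see why equality in distribution is such a weak notion, here is a minimal sketch. The coin-flip sample space and the choice $Y = 1 - X$ are assumptions made for illustration and are not taken from the slides. The two variables have the same distribution function, yet they never take the same value.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical sample space: one fair coin flip, omega in {0, 1}.
omega_probs = {0: Fraction(1, 2), 1: Fraction(1, 2)}


def X(omega):
    return omega


def Y(omega):
    # Y flips the outcome, so Y = 1 - X on every omega.
    return 1 - omega


def law(rv):
    """Distribution of rv as a dict mapping value -> probability."""
    dist = Counter()
    for omega, p in omega_probs.items():
        dist[rv(omega)] += p
    return dict(dist)


def prob_equal(rv1, rv2):
    """P(rv1 = rv2), computed by summing over the sample space."""
    return sum(p for omega, p in omega_probs.items() if rv1(omega) == rv2(omega))


same_law = law(X) == law(Y)   # both put mass 1/2 on 0 and on 1
p_equal = prob_equal(X, Y)    # no omega has X(omega) == Y(omega)
print("same distribution:", same_law)
print("P(X = Y) =", p_equal)
```

Exact rational arithmetic is used so the comparison of the two laws involves no rounding.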
Proof of 1.10
Let $\pi$, $\nu$, and $\lambda$ be three measures defined on $\mathcal{B}(\mathbb{R})$ in this way: $\pi(A) = P(g(X) \in A)$, $\nu(A) = P(X \in A)$, and $\lambda$ is the Lebesgue measure on $\mathbb{R}$. In other words, $\nu$ is the probability associated with $X$, and $\pi$ is the probability associated with $g(X)$. The density function of $g(X)$, if it exists, is the Radon-Nikodym derivative $\frac{d\pi}{dx}$.
First, we need to show that such a density function exists. It suffices to show that $\pi \ll \lambda$, i.e., $\pi(A) = 0$ if $\lambda(A) = 0$ (Radon-Nikodym Theorem).
This is true for the following reason (I’ll do it the hard way, using definitions only).
Proof of 1.10 (II)
Strict monotonicity implies the existence of $g^{-1}$. Continuity of $g$ implies that $g^{-1}$ is continuous.
The pre-image $g^{-1}((a,b))$ is an open interval: $(g^{-1}(a), g^{-1}(b))$, taking $g$ to be increasing. Monotonicity implies that this pre-image must be an interval (no hole in the middle); continuity implies that this interval must be open.
Continuity of $g^{-1}$ further implies that when $(a,b)$ shrinks to zero (meaning $a \uparrow c$ and $b \downarrow c$ for some $c \in (a,b)$), $(g^{-1}(a), g^{-1}(b))$ shrinks to zero as well.
Now a Lebesgue null set $A$ has this property: you can find a sequence of open sets $B_n$ to approximate it ($A \subseteq B_n$, $\lambda(B_n) \downarrow 0$). With a bit more work, you will see that $\lambda(g^{-1}(B_n)) \downarrow 0$. Since $\nu \ll \lambda$, it follows that $\pi(B_n) = \nu(g^{-1}(B_n)) \downarrow 0$, and hence $\pi(A) = 0$.
Proof of 1.10 (III)
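Denote $h(x) = \frac{d\pi}{dx}$ and $f(x) = \frac{d\nu}{dx}$. By definition,
\[
\pi(A) = \int_A h(x)\,dx, \qquad \nu(A) = \int_A \frac{d\nu}{dx}\,dx = \int_A f(x)\,dx, \qquad \text{for all } A \in \mathcal{B}.
\]
Let $A = (-\infty, y]$. We get
\[
\begin{aligned}
\int_{-\infty}^{y} h(x)\,dx &= P(g(X) \le y) = P(X \le g^{-1}(y)) \\
&= \int_{-\infty}^{g^{-1}(y)} f(x)\,dx \qquad (\text{let } x = g^{-1}(t)) \\
&= \int_{-\infty}^{y} f(g^{-1}(t))\,d g^{-1}(t) = \int_{-\infty}^{y} \frac{f(g^{-1}(t))}{g'(g^{-1}(t))}\,dt .
\end{aligned}
\]
The two sides agree on every set of the form $(-\infty, y]$, so by the Carathéodory extension theorem they agree for all $A \in \mathcal{B}$, which gives $h(t) = \frac{f(g^{-1}(t))}{g'(g^{-1}(t))}$.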
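As a numerical sanity check of the density formula $h(t) = f(g^{-1}(t)) / g'(g^{-1}(t))$, here is a small sketch. The choices $X \sim N(0,1)$ and $g(x) = e^x$, as well as the midpoint-rule integrator, are assumptions made for illustration and do not come from the slides. Here $g(X)$ lives on $(0, \infty)$, so the integral of $h$ starts at $0$ instead of $-\infty$.

```python
import math


def normal_pdf(x):
    """Density f of X ~ N(0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Distribution function of X ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def g_inverse(t):
    """Inverse of the assumed transformation g(x) = exp(x), for t > 0."""
    return math.log(t)


def g_prime(x):
    """Derivative of g(x) = exp(x)."""
    return math.exp(x)


def h(t):
    """Candidate density of g(X): f(g^{-1}(t)) / g'(g^{-1}(t)), for t > 0."""
    x = g_inverse(t)
    return normal_pdf(x) / g_prime(x)


def integrate_midpoint(func, a, b, n=20000):
    """Midpoint rule; it never evaluates func at the endpoints a and b."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))


y = 2.0
lhs = integrate_midpoint(h, 0.0, y)   # integral of h over (0, y]
rhs = normal_cdf(g_inverse(y))        # P(X <= g^{-1}(y)) = P(g(X) <= y)
print(f"integral of h: {lhs:.6f}, P(X <= log y): {rhs:.6f}")
```

The two printed numbers should agree up to the discretization error of the midpoint rule, which is what the chain of equalities in the proof predicts.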
Inequalities
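Chebyshev’s inequality. Suppose $\varphi : \mathbb{R} \to \mathbb{R}$ is positive. Let $i_A = \inf\{\varphi(y) : y \in A\}$. Then
\[
P(X \in A) \le \frac{1}{i_A} \int_{\{X \in A\}} \varphi(X)\,dP \le \frac{1}{i_A}\, E\varphi(X).
\]
A special case (take $\varphi(x) = x^2$ and $A = [a, \infty)$ with $a > 0$, so that $i_A = a^2$):
\[
P(X \ge a) \le \frac{EX^2}{a^2}.
\]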
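The special case can be checked exactly on a small discrete distribution. The sketch below assumes a fair six-sided die for $X$ (an illustrative choice, not from the slides) and compares $P(X \ge a)$ with $EX^2 / a^2$ for a few positive values of $a$.

```python
from fractions import Fraction

# Hypothetical distribution: X is the outcome of a fair six-sided die.
die = {k: Fraction(1, 6) for k in range(1, 7)}


def prob_at_least(dist, a):
    """P(X >= a) for a discrete distribution given as value -> probability."""
    return sum(p for x, p in dist.items() if x >= a)


def second_moment(dist):
    """E X^2 for a discrete distribution given as value -> probability."""
    return sum(p * x * x for x, p in dist.items())


ex2 = second_moment(die)

# Each entry is (a, P(X >= a), E X^2 / a^2).
checks = [(a, prob_at_least(die, a), ex2 / (a * a)) for a in range(1, 8)]
bound_holds = all(p <= bound for _, p, bound in checks)

for a, p, bound in checks:
    print(f"a={a}: P(X >= a) = {p}, bound = {bound}")
print("bound holds for every a:", bound_holds)
```

The bound is loose for small $a$ (it exceeds 1) and only becomes informative once $a^2 > EX^2$.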
Misc.
Random variables are measurable functions. Measurable transformations (which include continuous transformations) of r.v.s are r.v.s.
Change of variable formula, page 17.
The $k$th moment, $EX^k$.
Theorem of Total Probability
Two compartment case. $\Omega = B_1 \cup B_2$, $B_1 \cap B_2 = \emptyset$.
For any $A \in \mathcal{F}$ we have:
$P(A) = P(A \cap B_1) + P(A \cap B_2)$.
$P(A) = P(B_1)P(A|B_1) + P(B_2)P(A|B_2)$.
You can easily generalize this theorem to the countably infinite case.
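A worked instance of the two compartment case, as a minimal sketch. The two-urn setup, the urn probabilities, and the ball counts are all hypothetical numbers chosen for illustration. $B_1$ and $B_2$ are the events "urn 1 is chosen" and "urn 2 is chosen", and $A$ is the event "a red ball is drawn".

```python
from fractions import Fraction

# Hypothetical setup: pick an urn at random, then draw one ball uniformly from it.
urn_prob = {1: Fraction(1, 3), 2: Fraction(2, 3)}
urn_contents = {
    1: ["red"] * 3 + ["blue"] * 2,
    2: ["red"] * 1 + ["blue"] * 3,
}

# Sample space: (urn, ball index, colour) with its probability.
outcomes = {}
for urn, balls in urn_contents.items():
    for idx, colour in enumerate(balls):
        outcomes[(urn, idx, colour)] = urn_prob[urn] / len(balls)


def P(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum(p for w, p in outcomes.items() if event(w))


def cond(event, given):
    """Conditional probability P(event | given)."""
    return P(lambda w: event(w) and given(w)) / P(given)


def A(w):
    return w[2] == "red"


def B1(w):
    return w[0] == 1


def B2(w):
    return w[0] == 2


p_direct = P(A)
p_intersections = P(lambda w: A(w) and B1(w)) + P(lambda w: A(w) and B2(w))
p_total = P(B1) * cond(A, B1) + P(B2) * cond(A, B2)
print(p_direct, p_intersections, p_total)
```

All three quantities are computed from the same sample space, so their agreement illustrates both forms of the identity on the slide.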
Homework
Go over all the proofs in the book.