BST 401 Probability Theory Xing Qiu Ha Youn Lee Department of Biostatistics and Computational Biology University of Rochester October 19, 2010 Qiu, Lee BST 401

Outline

1 Basic Concepts of Probability

Basic Definitions

We will go through Chapter 1, sections 1-5.

I’ll ask you to go through sections 1-3. You will find most of these definitions/theorems/inequalities very, very familiar.

I’ll mention a few things that are not in the appendix.

Some Remarks

(Page 5) We say two r.v.s $X$ and $Y$ are equal in distribution, denoted $X \overset{d}{=} Y$, if they have the same distribution function, i.e., $P(X \le x) = P(Y \le x)$ for all $x \in \mathbb{R}$.

Remember, being equal in distribution is a very weak equality.

(Page 8) Exercise 1.10. It shows you how to compute the density function of a transformed random variable. Exercise 1.12 is just an important special case of 1.10.
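
A minimal sketch of why equality in distribution is weak, using a hypothetical two-point sample space that is not in the slides: $Y = -X$ has the same distribution function as $X$, yet the two never take the same value. The names `omega`, `prob`, `X`, `Y`, and `cdf` are chosen here for illustration only.

```python
from fractions import Fraction

# Hypothetical sample space with two equally likely outcomes (an assumption).
omega = ["heads", "tails"]
prob = {w: Fraction(1, 2) for w in omega}

# A random variable is a function on omega; here X = +1 or -1, and Y = -X.
X = {"heads": 1, "tails": -1}
Y = {w: -X[w] for w in omega}


def cdf(rv, x):
    """Distribution function P(rv <= x), computed exactly."""
    return sum(prob[w] for w in omega if rv[w] <= x)


# Compare the two distribution functions on a grid covering both jump points.
grid = [-2, -1, 0, 1, 2]
same_cdf = all(cdf(X, x) == cdf(Y, x) for x in grid)

# Probability that X and Y coincide as functions on omega.
p_equal = sum(prob[w] for w in omega if X[w] == Y[w])

print(same_cdf, p_equal)
```

Here `same_cdf` compares the distributions, while `p_equal` compares the random variables outcome by outcome, which is a much stronger notion of equality.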

Proof of 1.10

Let $\pi$, $\nu$, and $\lambda$ be three measures defined on $\mathcal{B}(\mathbb{R})$ in this way: $\pi(A) = P(g(X) \in A)$, $\nu(A) = P(X \in A)$, and $\lambda$ is the Lebesgue measure on $\mathbb{R}$. In other words, $\nu$ is the probability associated with $X$, and $\pi$ is the probability of $g(X)$. The density function of $g(X)$, if it exists, is the Radon-Nikodym derivative $\frac{d\pi}{dx}$.

First, we need to show that such a density function exists. It suffices to show that $\pi \ll \lambda$, i.e., $\pi(A) = 0$ if $\lambda(A) = 0$. (Radon-Nikodym Theorem)

This is true for the following reason (I’ll do it the hard way, using definitions only).

Proof of 1.10 (II)

Strict monotonicity implies the existence of $g^{-1}$. Continuity implies that $g^{-1}$ is continuous.

The pre-image $g^{-1}((a, b))$ is an open interval: $(g^{-1}(a), g^{-1}(b))$. Monotonicity implies that this pre-image must be an interval (no hole in the middle); continuity implies that this interval must be open.

Continuity of $g^{-1}$ further implies that when $(a, b)$ shrinks to length zero (meaning $a \uparrow c$ and $b \downarrow c$ for some $c \in (a, b)$), $(g^{-1}(a), g^{-1}(b))$ shrinks to length zero as well.

Now a Lebesgue null set $A$ has this property: you can find a sequence of open sets $B_n$ to approximate it ($A \subseteq B_n$, $\lambda(B_n) \downarrow 0$). With a bit more work, you will see that $\lambda(g^{-1}(B_n)) \downarrow 0$. Hence $\pi(B_n) = \nu(g^{-1}(B_n)) \downarrow 0$ (because $\nu \ll \lambda$), and since $A \subseteq B_n$, $\pi(A) = 0$.

Proof of 1.10 (III)

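Denote $h(x) = \frac{d\pi}{dx}$ and $f(x) = \frac{d\nu}{dx}$. By definition,

$$\pi(A) = \int_A h(x)\,dx, \qquad \nu(A) = \int_A \frac{d\nu}{dx}\,dx, \qquad \text{for all } A \in \mathcal{B}.$$

Let $A = (-\infty, y]$. We get

$$
\begin{aligned}
\int_{-\infty}^{y} h(x)\,dx &= P(g(X) \le y) = P(X \le g^{-1}(y)) \\
&= \int_{-\infty}^{g^{-1}(y)} f(x)\,dx \qquad (\text{let } x = g^{-1}(t)) \\
&= \int_{-\infty}^{y} f(g^{-1}(t))\,d g^{-1}(t) = \int_{-\infty}^{y} \frac{f(g^{-1}(t))}{g'(g^{-1}(t))}\,dt.
\end{aligned}
$$

By the Carathéodory extension theorem, agreement on the sets $(-\infty, y]$ extends to all $A \in \mathcal{B}$, so we can take $h(t) = \frac{f(g^{-1}(t))}{g'(g^{-1}(t))}$.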
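
As a numerical sanity check of the density formula $f(g^{-1}(t))/g'(g^{-1}(t))$, here is a sketch under assumptions that are not in the slides: $X$ is standard normal and $g(x) = e^x$, so $g(X)$ takes values in $(0, \infty)$ and the lower integration limit becomes $0$. The helper names (`f`, `g_inv`, `g_prime`, `h`, `integrate`) are illustrative.

```python
import math


def f(x):
    """Density of X: standard normal (an assumed example)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def g_inv(t):
    """Inverse of g(x) = exp(x), defined for t > 0."""
    return math.log(t)


def g_prime(x):
    """Derivative of g(x) = exp(x)."""
    return math.exp(x)


def h(t):
    """Density of g(X) given by f(g^{-1}(t)) / g'(g^{-1}(t))."""
    x = g_inv(t)
    return f(x) / g_prime(x)


def normal_cdf(x):
    """P(X <= x) for the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def integrate(func, a, b, n=20_000):
    """Midpoint rule on [a, b] with n equal subintervals."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))


y = 2.0
lhs = integrate(h, 0.0, y)      # integral of h over (0, y]
rhs = normal_cdf(g_inv(y))      # P(g(X) <= y) = P(X <= g^{-1}(y))
print(lhs, rhs)
```

The two printed numbers should agree up to the integration error, which mirrors the chain of equalities in the proof for this particular choice of $f$ and $g$.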

Inequalities

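Chebyshev’s inequality. Suppose $\varphi : \mathbb{R} \to \mathbb{R}$ is positive. Let $i_A = \inf\{\varphi(y) : y \in A\}$. Then, when $i_A > 0$,

$$P(X \in A) \le \frac{1}{i_A} \int_{\{X \in A\}} \varphi(X)\,dP \le \frac{1}{i_A} E\varphi(X).$$

A special case, with $\varphi(x) = x^2$ and $A = [a, \infty)$ for $a > 0$:

$$P(X \ge a) \le \frac{EX^2}{a^2}.$$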
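
The chain of inequalities can be checked exactly on a small example. The sketch below assumes a hypothetical discrete $X$, uniform on $\{-2, -1, 0, 1, 2\}$, with $\varphi(x) = x^2$ and $A = [a, \infty)$ for $a = 2$; none of these choices come from the slides.

```python
from fractions import Fraction

# Hypothetical distribution of X: uniform on five points (an assumption).
values = [-2, -1, 0, 1, 2]
probs = [Fraction(1, 5)] * len(values)


def phi(x):
    """Nonnegative function; phi(x) = x^2 gives the special case."""
    return x * x


a = 2


def in_A(x):
    """Indicator of A = [a, infinity)."""
    return x >= a


# phi is increasing on [a, infinity) for a > 0, so its infimum over A is phi(a).
i_A = phi(a)

p_A = sum(p for v, p in zip(values, probs) if in_A(v))                   # P(X in A)
restricted = sum(phi(v) * p for v, p in zip(values, probs) if in_A(v))   # integral of phi(X) over {X in A}
full = sum(phi(v) * p for v, p in zip(values, probs))                    # E phi(X)

print(p_A, restricted / i_A, full / i_A)
```

Using `Fraction` keeps every quantity exact, so the comparison `p_A <= restricted / i_A <= full / i_A` is not affected by rounding.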

Misc.

Random variables are measurable functions. Measurable transformations (which include continuous transformations) of r.v.s are r.v.s.

Change of variable formula, page 17.

The $k$th moment, $EX^k$.

Theorem of Total Probability

Two compartment case. $\Omega = B_1 \cup B_2$, $B_1 \cap B_2 = \emptyset$.

For any $A \in \mathcal{F}$ we have:

$P(A) = P(A \cap B_1) + P(A \cap B_2)$.

$P(A) = P(B_1)P(A|B_1) + P(B_2)P(A|B_2)$.

You can easily generalize this theorem to the countably infinite case.
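
A small worked instance of the two compartment formula, written so that it also covers any finite partition. The urn numbers and the function name `total_probability` are hypothetical and only serve as an illustration.

```python
from fractions import Fraction


def total_probability(priors, conditionals):
    """Return sum_i P(B_i) * P(A | B_i) for a finite partition B_1, ..., B_n."""
    if len(priors) != len(conditionals):
        raise ValueError("need one conditional probability per compartment")
    if sum(priors) != 1:
        raise ValueError("the P(B_i) must sum to 1 for a partition of Omega")
    return sum(p * c for p, c in zip(priors, conditionals))


# Hypothetical numbers: choose urn 1 with probability 1/3, urn 2 otherwise;
# A is the event "a red ball is drawn".
p_B = [Fraction(1, 3), Fraction(2, 3)]            # P(B_1), P(B_2)
p_A_given_B = [Fraction(3, 4), Fraction(1, 2)]    # P(A | B_1), P(A | B_2)

p_A = total_probability(p_B, p_A_given_B)
print(p_A)
```

Each product `p * c` is $P(A \cap B_i)$, so the sum matches the first form of the theorem as well.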

Homework

Go over all the proofs in the book.
