Probability Theory: Advanced Look

Statistical Methods in Finance, Lecture 2

Ta-Wei Huang

December 7, 2016


Table of Contents

Probability theory is the way we think about randomness. In financial markets, uncertainty and risk are everywhere, so we need probability theory to model market trends.

1 Sample Space and Event

2 Probability Measure

3 Random Variable

4 Distribution Function

5 Next Lecture


Sample Space and Event

Sample Space: Definition

Definition 2.1.1 (Sample Space)

For an experiment and a given index set $A$, the possible outcomes $\omega_\alpha$, $\alpha \in A$, are called sample points. The set $\Omega = \{\omega_\alpha : \alpha \in A\}$ of all sample points, i.e., the collection of all possible outcomes, is called the sample space of that experiment.

1 If $A$ is countable, we say that $\Omega$ is a discrete sample space;

2 if $A$ is uncountable, we say that $\Omega$ is a continuous sample space.

Sample Space: Examples

Example 2.1.2 (Sample Space)

1 If the experiment is tossing a coin and the observation is the face of the coin, then the sample space is $\Omega = \{H, T\}$, which is a discrete sample space.

2 If the experiment is a monetary policy action by the Fed and the observation is the return on the S&P 500 Index one day after that action, then $\Omega = \mathbb{R}$, which is a continuous sample space.

Event

Definition 2.1.3 (Event)

An event $E$ is any collection of possible outcomes of an experiment, that is, any subset of the sample space $\Omega$.

An event is a statement about the result of the experiment. For example, the set $R^+ = \{\text{the S\&P 500 has a positive return}\} = (0, \infty)$ is an event for $\Omega$ in Example 2.1.2, case (2).

σ-algebra: Definition

Definition 2.1.4 (σ-algebra)

A system $\mathcal{F}$ of subsets of $\Omega$ is called a σ-algebra if

(a) $\Omega \in \mathcal{F}$;

(b) $A^c \in \mathcal{F}$ whenever $A \in \mathcal{F}$;

(c) for $A_1, A_2, \dots, A_n, \dots \in \mathcal{F}$, the union $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$.

Why do we need the concept of a σ-algebra? We will assign a probability to each event $E$, which is a subset of the sample space $\Omega$. The collection of events must therefore be closed under set operations (complements and countable unions, hence also countable intersections), so that we can combine events and still compute their probabilities.

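σ-algebra: Example

Example 2.1.5 (σ-algebra)

1 Let $\Omega = \{1, 2, 3\}$. Then

   1 $\mathcal{F}_1 = \{\emptyset, \{1\}, \{2, 3\}, \{1, 2, 3\}\}$ is a σ-algebra.

   2 $\mathcal{F}_2 = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1, 2, 3\}\}$ is not a σ-algebra, since for $A = \{1\} \in \mathcal{F}_2$, $A^c = \{2, 3\} \notin \mathcal{F}_2$.

2 Let $\Omega = \mathbb{R}$. Then $\mathcal{F}$ = the collection of all subsets of $\mathbb{R}$ is a σ-algebra.

3 Let $\Omega = \mathbb{N}$. Then

   1 $\mathcal{F}_1 = \{\emptyset, \{1, 3, 5, 7, \dots\}, \{2, 4, 6, 8, \dots\}, \mathbb{N}\}$ is a σ-algebra.

   2 $\mathcal{F}_2 = \{A \subseteq \mathbb{N} : A \text{ is countable or } A^c \text{ is countable}\}$ is a σ-algebra.

   3 $\mathcal{F}_3 = \{A \subseteq \mathbb{N} : A \text{ is finite or } A^c \text{ is finite}\}$ is not a σ-algebra: each singleton $\{2k\}$ belongs to $\mathcal{F}_3$, but the countable union $\bigcup_{k=1}^{\infty} \{2k\} = \{2, 4, 6, \dots\}$ is neither finite nor has a finite complement, so it is not in $\mathcal{F}_3$.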
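On a finite sample space the three axioms of Definition 2.1.4 can be checked mechanically. The sketch below is only an illustration: `is_sigma_algebra` is a hypothetical helper, and the families `F1` and `F2` are one reading of the two collections on $\Omega = \{1, 2, 3\}$ in Example 2.1.5, which is an assumption about the intended sets.

```python
from itertools import combinations


def is_sigma_algebra(omega, family):
    """Check the sigma-algebra axioms for a family of subsets of a finite omega."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in family}
    # (a) the whole sample space must be an event
    if omega not in sets:
        return False
    # (b) closed under complement
    if any(omega - s not in sets for s in sets):
        return False
    # (c) on a finite omega, countable unions reduce to pairwise unions
    return all(a | b in sets for a, b in combinations(sets, 2))


omega = {1, 2, 3}
# Assumed reading of the two families in Example 2.1.5
F1 = [set(), {1}, {2, 3}, {1, 2, 3}]
F2 = [set(), {1}, {2}, {3}, {1, 2, 3}]
print(is_sigma_algebra(omega, F1))
print(is_sigma_algebra(omega, F2))
```

The check of axiom (c) relies on finiteness: for an infinite $\Omega$, closure under pairwise unions is not enough, which is exactly what goes wrong for $\mathcal{F}_3$ in the example.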

Probability Measure

Defining Probability Measure

Classically, we define the probability $\mathbb{P}(A)$ of an event $A$ by $\mathbb{P}(A) = \frac{\# A}{\# \Omega}$, the number of outcomes in $A$ divided by the number of outcomes in $\Omega$. There are two main problems with this definition.

1 It requires a finite sample space, which does not hold in some cases.

2 It requires symmetric (equally likely) outcomes, that is, $\mathbb{P}(\{\omega_i\}) = \frac{1}{\# \Omega}$ for all $\omega_i \in \Omega$.

Therefore, we need modern probability theory, built on the foundations laid by Andrey Nikolaevich Kolmogorov.

Probability Measure: Definition

Definition 2.1.6 (Measurable Space and Probability Measure)

Let $\Omega$ be a non-empty set and let $\mathcal{F}$ be a σ-algebra on $\Omega$. Then $(\Omega, \mathcal{F})$ is called a measurable space. A probability measure is a real-valued function $\mathbb{P} : \mathcal{F} \to \mathbb{R}$ such that

(1) $\mathbb{P}(\Omega) = 1$;

(2) $\mathbb{P}(E) \ge 0$ for any event $E \in \mathcal{F}$;

(3) (Countable additivity) for any sequence of pairwise disjoint events $E_1, E_2, \dots$ in $\mathcal{F}$, $\mathbb{P}\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} \mathbb{P}(E_i)$.

The triple $(\Omega, \mathcal{F}, \mathbb{P})$ is called a probability space.

Probability Measure: Remark

Remark on the Definition of Probability Measure

This axiomatic definition makes no attempt to tell us which particular function $\mathbb{P}$ to choose.

This definition regards probability as a property of an event in the σ-algebra $\mathcal{F}$ on the sample space $\Omega$.

So we will next discuss how to define probability measures on discrete and on continuous sample spaces.

Properties of Probability Measure 1

Proposition 2.1.7

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Then, for any event $E \in \mathcal{F}$,

(1) $\mathbb{P}(\emptyset) = 0$;

(2) $\mathbb{P}(E) \le 1$;

(3) $\mathbb{P}(E^C) = 1 - \mathbb{P}(E)$.

Proof. It is easier to prove (3) first. Since the sets $E$ and $E^C$ form a partition of the sample space $\Omega$, we have $\mathbb{P}(E \cup E^C) = \mathbb{P}(\Omega) = 1$ by axiom (1). Also, $E$ and $E^C$ are disjoint, so by additivity, $\mathbb{P}(E \cup E^C) = \mathbb{P}(E) + \mathbb{P}(E^C) = 1$, and hence $\mathbb{P}(E^C) = 1 - \mathbb{P}(E)$.

Part (1) follows from (3) with $E = \Omega$: $\mathbb{P}(\emptyset) = \mathbb{P}(\Omega^C) = 1 - \mathbb{P}(\Omega) = 0$.

Since $\mathbb{P}(E^C) = 1 - \mathbb{P}(E) \ge 0$, (2) holds immediately.

Properties of Probability Measure 2

Proposition 2.1.8

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Then, for any events $A, B \in \mathcal{F}$,

(1) $\mathbb{P}(B \cap A^C) = \mathbb{P}(B) - \mathbb{P}(A \cap B)$;

(2) $\mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A \cap B)$;

(3) (Monotonicity) If $A \subseteq B$, then $\mathbb{P}(A) \le \mathbb{P}(B)$.

Proof. For any sets $A$ and $B$, $B = (B \cap A^C) \cup (B \cap A)$, and the two sets on the right are disjoint. Then

$$\mathbb{P}(B) = \mathbb{P}\big((B \cap A^C) \cup (B \cap A)\big) = \mathbb{P}(B \cap A^C) + \mathbb{P}(B \cap A),$$

which gives (1).

To establish (2), we use the identity $A \cup B = A \cup (B \cap A^C)$. (why?)

To establish (3), note that $A \subseteq B \Rightarrow A = A \cap B$, so (1) gives $\mathbb{P}(B) - \mathbb{P}(A) = \mathbb{P}(B \cap A^C) \ge 0$.
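One way to fill in the step for part (2) of Proposition 2.1.8 is the short derivation below. It uses only additivity on disjoint sets and part (1); it is offered as a worked sketch, not as the proof intended on the slide.

```latex
\begin{align*}
A \cup B &= A \cup (B \cap A^C), \qquad A \cap (B \cap A^C) = \emptyset, \\
\mathbb{P}(A \cup B) &= \mathbb{P}(A) + \mathbb{P}(B \cap A^C)
  && \text{(additivity on disjoint sets)} \\
&= \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A \cap B)
  && \text{(by part (1))}.
\end{align*}
```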

Properties of Probability Measure 3

Proposition 2.1.9 (Inclusion-exclusion Identity)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. For $n$ events $A_1, \dots, A_n \in \mathcal{F}$, define $P_1, P_2, \dots, P_n$ by

$$P_1 = \sum_{i=1}^{n} \mathbb{P}(A_i), \quad P_2 = \sum_{1 \le i < j \le n} \mathbb{P}(A_i \cap A_j), \quad P_3 = \sum_{1 \le i < j < k \le n} \mathbb{P}(A_i \cap A_j \cap A_k), \quad \dots, \quad P_n = \mathbb{P}\Big(\bigcap_{i=1}^{n} A_i\Big).$$

Then the probability of the union of $A_1, \dots, A_n$ is given by

$$\mathbb{P}(A_1 \cup A_2 \cup \dots \cup A_n) = \sum_{i=1}^{n} (-1)^{i+1} P_i.$$

Proof. By mathematical induction.
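The identity in Proposition 2.1.9 can be sanity-checked on a small finite sample space with equally likely outcomes. The sketch below is illustrative only: the sample space $\{1, \dots, 12\}$, the three events (multiples of 2, 3 and 4) and the helper names are assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, omega):
    """Classical probability #event / #omega on a finite sample space."""
    return Fraction(len(event), len(omega))


def union_by_inclusion_exclusion(events, omega):
    """Evaluate sum_{i=1}^{n} (-1)^(i+1) * P_i from Proposition 2.1.9."""
    total = Fraction(0)
    for i in range(1, len(events) + 1):
        # P_i sums the probabilities of all i-fold intersections
        p_i = sum(prob(frozenset.intersection(*combo), omega)
                  for combo in combinations(events, i))
        total += (-1) ** (i + 1) * p_i
    return total


omega = frozenset(range(1, 13))
events = [frozenset(k for k in omega if k % d == 0) for d in (2, 3, 4)]
direct = prob(frozenset.union(*events), omega)
print(direct, union_by_inclusion_exclusion(events, omega))
```

Exact rational arithmetic (`Fraction`) is used so that the two sides can be compared with `==` rather than with a floating-point tolerance.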

Properties of Probability Measure 4-6

Example 2.1.10 (The Matching Problem)

Suppose that each of $N$ men at a party throws his hat into the center of the room. The hats are first mixed up, and then each man randomly selects a hat. What is the probability that none of the men selects his own hat?

Solution. We first calculate the complementary probability that at least one man selects his own hat. Let $A_i$ be the event that the $i$-th man selects his own hat, $i = 1, 2, \dots, N$. By the inclusion-exclusion identity, the probability that at least one of the men selects his own hat is

$$\mathbb{P}(A_1 \cup A_2 \cup \dots \cup A_N) = (-1)^{2} \sum_{i=1}^{N} \mathbb{P}(A_i) + (-1)^{3} \sum_{1 \le i_1 < i_2 \le N} \mathbb{P}(A_{i_1} \cap A_{i_2}) + \dots + (-1)^{N+1} \mathbb{P}\Big(\bigcap_{i=1}^{N} A_i\Big).$$

The number of all possible outcomes of this experiment is $N!$. The number of outcomes in the event $A_{i_1} \cap \dots \cap A_{i_n}$ is $(N - n)!$. So $\mathbb{P}(A_{i_1} \cap \dots \cap A_{i_n}) = \frac{(N - n)!}{N!}$.

Since there are $\binom{N}{n}$ terms in the sum $\sum_{1 \le i_1 < \dots < i_n \le N} \mathbb{P}(A_{i_1} \cap \dots \cap A_{i_n})$, we have

$$\sum_{1 \le i_1 < \dots < i_n \le N} \mathbb{P}(A_{i_1} \cap \dots \cap A_{i_n}) = \binom{N}{n} \frac{(N - n)!}{N!} = \frac{1}{n!}.$$

Thus, we get the complementary probability

$$\mathbb{P}(A_1 \cup A_2 \cup \dots \cup A_N) = 1 - \frac{1}{2!} + \frac{1}{3!} - \dots + (-1)^{N+1} \frac{1}{N!}.$$

Hence, the probability that none of the men selects his own hat is

$$1 - \Big(1 - \frac{1}{2!} + \frac{1}{3!} - \dots + (-1)^{N+1} \frac{1}{N!}\Big) = \sum_{n=0}^{N} \frac{(-1)^n}{n!}.$$

Note that as $N \to \infty$, this probability tends to $e^{-1} \approx 0.36788$, not 1!
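The answer to the matching problem in Example 2.1.10 can be cross-checked by brute force for small $N$. The sketch below compares the alternating sum $\sum_{n=0}^{N} (-1)^n / n!$ with a direct count over all $N!$ hat assignments; the function names are hypothetical, and the brute-force count is only practical for small $N$.

```python
from fractions import Fraction
from itertools import permutations
from math import exp, factorial


def p_no_match_formula(n_men):
    """Probability that nobody gets his own hat, via inclusion-exclusion."""
    return sum(Fraction((-1) ** n, factorial(n)) for n in range(n_men + 1))


def p_no_match_bruteforce(n_men):
    """Same probability, counting permutations without fixed points."""
    perms = list(permutations(range(n_men)))
    hits = sum(all(hat != man for man, hat in enumerate(p)) for p in perms)
    return Fraction(hits, len(perms))


for n_men in range(1, 7):
    assert p_no_match_formula(n_men) == p_no_match_bruteforce(n_men)

# The limit as N grows is e^{-1}; N = 10 is already very close.
print(float(p_no_match_formula(10)), exp(-1))
```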

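Properties of Probability Measure 7

Proposition 2.1.11 (Boole's Inequality)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. For countably many events $A_1, A_2, \dots \in \mathcal{F}$,

$$\mathbb{P}\Big(\bigcup_{i=1}^{\infty} A_i\Big) \le \sum_{i=1}^{\infty} \mathbb{P}(A_i).$$

Proof. Let $A'_i$ be the events defined by $A'_1 = A_1$ and $A'_i = A_i - \bigcup_{j=1}^{i-1} A_j$ for $i = 2, 3, \dots$. Then (1) $A'_i \subset A_i$ and (2) the $A'_i$ are pairwise disjoint, since for $i > k$,

$$\begin{aligned}
A'_i \cap A'_k &= \Big(A_i - \bigcup_{j=1}^{i-1} A_j\Big) \cap \Big(A_k - \bigcup_{j=1}^{k-1} A_j\Big) \\
&= \Big(A_i \cap \Big(\bigcup_{j=1}^{i-1} A_j\Big)^C\Big) \cap \Big(A_k \cap \Big(\bigcup_{j=1}^{k-1} A_j\Big)^C\Big) \\
&= \Big(A_i \cap \bigcap_{j=1}^{i-1} A_j^C\Big) \cap \Big(A_k \cap \bigcap_{j=1}^{k-1} A_j^C\Big) = \emptyset,
\end{aligned}$$

because the first factor $A_i \cap \bigcap_{j=1}^{i-1} A_j^C$ is contained in $A_k^C$ (as $k \le i - 1$), while the second factor is contained in $A_k$. Since $\bigcup_i A'_i = \bigcup_i A_i$, countable additivity and monotonicity give

$$\mathbb{P}\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \mathbb{P}\Big(\bigcup_{i=1}^{\infty} A'_i\Big) = \sum_{i=1}^{\infty} \mathbb{P}(A'_i) \le \sum_{i=1}^{\infty} \mathbb{P}(A_i).$$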

Properties of Probability Measure 8

Corollary 2.1.12 (σ-subadditivity)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Then for countably many events $A_1, A_2, \dots \in \mathcal{F}$ and an event $A \subset \bigcup_{i=1}^{\infty} A_i$, $\mathbb{P}(A) \le \sum_{i=1}^{\infty} \mathbb{P}(A_i)$.

Proof. By monotonicity (Proposition 2.1.8 (3)) and Boole's inequality (Proposition 2.1.11).

Properties of Probability Measure 9

Proposition 2.1.13 (Law of Total Probability)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. If $A_1, A_2, \dots \in \mathcal{F}$ is a partition of $\Omega$, that is, the $A_i$ are pairwise disjoint and $\bigcup_{i=1}^{\infty} A_i = \Omega$, then for any event $B \in \mathcal{F}$, $\mathbb{P}(B) = \sum_{i=1}^{\infty} \mathbb{P}(B \cap A_i)$.

Proof. Since $\{A_i\}$ is a partition of $\Omega$, we have $B = B \cap \Omega = B \cap \big(\bigcup_{i=1}^{\infty} A_i\big) = \bigcup_{i=1}^{\infty} (B \cap A_i)$, and the sets $B \cap A_i$ are pairwise disjoint. Therefore,

$$\mathbb{P}(B) = \mathbb{P}\Big(\bigcup_{i=1}^{\infty} (B \cap A_i)\Big) = \sum_{i=1}^{\infty} \mathbb{P}(B \cap A_i).$$

Define Probability Measure: Discrete Sample Space 1-2

Theorem 2.1.14 (Define Probability Measures in a Discrete Sample Space)

Let $\Omega = \{\omega_\alpha : \alpha \in \mathbb{N}\}$ be a discrete (countable) sample space and $\mathcal{F}$ be a σ-algebra on $\Omega$. Define a function $\mathbb{P} : \mathcal{F} \to \mathbb{R}$ by (1) $\mathbb{P}(\Omega) = 1$; (2) $\mathbb{P}(\{\omega\}) \ge 0$ for any $\omega \in \Omega$; (3) $\mathbb{P}(E) = \sum_{\omega \in E} \mathbb{P}(\{\omega\})$. Then $\mathbb{P}$ is a probability measure.

Proof. Axiom (1) is true by the definition of $\mathbb{P}$. Since $\mathbb{P}(\{\omega\}) \ge 0$ for any $\omega \in \Omega$, for any event $E \in \mathcal{F}$ we have $\mathbb{P}(E) = \sum_{\omega \in E} \mathbb{P}(\{\omega\}) \ge 0$, so axiom (2) holds.

Let $E_1, E_2, \dots$ be a sequence of pairwise disjoint events in $\mathcal{F}$. Then, since the terms are nonnegative and each $\omega$ in the union belongs to exactly one $E_i$,

$$\mathbb{P}\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{\omega \in \bigcup_{i=1}^{\infty} E_i} \mathbb{P}(\{\omega\}) = \sum_{i=1}^{\infty} \sum_{\omega \in E_i} \mathbb{P}(\{\omega\}) = \sum_{i=1}^{\infty} \mathbb{P}(E_i),$$

so axiom (3) holds.

Remark on Theorem 2.1.14

The triple $(\Omega, \mathcal{F}, \mathbb{P})$ in the above theorem is called a discrete probability space.
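The construction in Theorem 2.1.14 translates almost directly into code: store the point masses and define the probability of an event as the sum over its outcomes. The sketch below is a minimal illustration for a finite sample space; the class name `DiscreteProbability` and the fair-die example are assumptions made here, not part of the lecture.

```python
from fractions import Fraction


class DiscreteProbability:
    """Probability measure on a finite sample space, built from point masses."""

    def __init__(self, pmf):
        # (2) every point mass must be nonnegative
        if any(p < 0 for p in pmf.values()):
            raise ValueError("point masses must be nonnegative")
        # (1) the total mass P(Omega) must equal 1
        if sum(pmf.values()) != 1:
            raise ValueError("point masses must sum to 1")
        self.pmf = dict(pmf)

    def __call__(self, event):
        # (3) P(E) is the sum of the point masses of the outcomes in E
        return sum((self.pmf[w] for w in event), Fraction(0))


die = DiscreteProbability({k: Fraction(1, 6) for k in range(1, 7)})
even, low = {2, 4, 6}, {1}
print(die(even), die(even | low), die(even) + die(low))
```

Because `even` and `low` are disjoint, the last two printed values agree, which is additivity in the finite case.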

Define Probability Measure: Continuous Sample Space 1

It is much harder to define a probability measure on a continuous sample space. First, we need some basic knowledge of set theory.

Definition 2.1.15 (Convergence of Sets)

A sequence of sets $A_1, A_2, \dots, A_n, \dots$ is said to be increasing to $A$ if $A_1 \subset A_2 \subset \dots \subset A_n \subset \dots$ and $A = \bigcup_{n=1}^{\infty} A_n$. We denote this by $\lim_{n \to \infty} A_n = A$ or $A_n \uparrow A$.

A sequence of sets $A_1, A_2, \dots, A_n, \dots$ is said to be decreasing to $A$ if $A_1 \supset A_2 \supset \dots \supset A_n \supset \dots$ and $A = \bigcap_{n=1}^{\infty} A_n$. We denote this by $\lim_{n \to \infty} A_n = A$ or $A_n \downarrow A$.

Define Probability Measure: Continuous Sample Space 2

[Figure: illustration of the idea of convergence of sets.]

Define Probability Measure: Continuous Sample Space 3-4

Next, we introduce a basic theorem about the continuity of a probability measure. There is a more general definition of the limit of a sequence of sets, but we postpone it, since for now we only want to see how to define a probability measure on a continuous sample space.

Theorem 2.1.16 (Continuity of Probability from Below and Above)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. If $A_1, A_2, \dots \in \mathcal{F}$ is increasing or decreasing to a set $A$, then $\lim_{n \to \infty} \mathbb{P}(A_n) = \mathbb{P}(A)$.

Proof. We only prove the increasing case. Let $A_0 = \emptyset$ and $B_n = A_n - A_{n-1}$. Then the $B_n$ are pairwise disjoint, $\bigcup_{i=1}^{n} B_i = \bigcup_{i=1}^{n} A_i = A_n$, and $\bigcup_{n=1}^{\infty} B_n = A$. Thus,

$$\begin{aligned}
\mathbb{P}(A) = \mathbb{P}\Big(\bigcup_{n=1}^{\infty} B_n\Big) &= \sum_{n=1}^{\infty} \mathbb{P}(B_n) = \lim_{n \to \infty} \sum_{i=1}^{n} \mathbb{P}(B_i) = \lim_{n \to \infty} \mathbb{P}\Big(\bigcup_{i=1}^{n} B_i\Big) \\
&= \lim_{n \to \infty} \mathbb{P}\Big(\bigcup_{i=1}^{n} A_i\Big) = \lim_{n \to \infty} \mathbb{P}(A_n).
\end{aligned}$$

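Define Probability Measure: Continuous Sample Space 5

Example 2.1.17

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space in which half-open intervals have probability $\mathbb{P}((a, b]) = \frac{b - a}{2\pi}$ (the setting implied by the solution). Find $\mathbb{P}([a, b])$.

Solution. Let $A_k = \left(a - \frac{1}{k}, b\right]$. Then $A_k \downarrow [a, b]$, and so by Theorem 2.1.16 we have

$$\mathbb{P}([a, b]) = \lim_{k \to \infty} \mathbb{P}(A_k) = \lim_{k \to \infty} \frac{1}{2\pi}\Big(b - a + \frac{1}{k}\Big) = \frac{b - a}{2\pi} = \mathbb{P}((a, b]).$$

Remark.

Note that $\mathbb{P}(\{a\}) = \mathbb{P}([a, b] - (a, b]) = \mathbb{P}([a, b]) - \mathbb{P}((a, b]) = 0$. Also, the probability $\mathbb{P}(\mathbb{Q} \cap \Omega) = 0$, since $\mathbb{Q} \cap \Omega$ is a countable union of single points, each of probability zero. (Quite different from the discrete case!)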
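The limiting argument in Example 2.1.17 can be watched numerically. The sketch below assumes the measure $\mathbb{P}((a, b]) = (b - a) / (2\pi)$ used in the solution, picks arbitrary endpoints $a = 1$ and $b = 2.5$ for illustration, and evaluates $\mathbb{P}(A_k)$ for $A_k = (a - 1/k, b]$ as $k$ grows.

```python
from math import pi


def p_half_open(a, b):
    """P((a, b]) = (b - a) / (2*pi), the measure assumed in Example 2.1.17."""
    return (b - a) / (2 * pi)


a, b = 1.0, 2.5  # illustrative endpoints, not from the lecture
ks = (1, 10, 100, 1000, 10**6)
approx = [p_half_open(a - 1 / k, b) for k in ks]
limit = p_half_open(a, b)
# gap_k = P(A_k) - P((a, b]) = 1 / (2*pi*k), which shrinks to 0
gaps = [p - limit for p in approx]
for k, gap in zip(ks, gaps):
    print(k, gap)
```

The gap is the probability of the shrinking sliver $(a - 1/k, a]$, and its vanishing is what forces $\mathbb{P}(\{a\}) = 0$.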

General Limit of Sets

Definition 2.1.18 (Limit of Sets)

Let $A_1, A_2, \dots, A_n, \dots$ be a sequence of subsets of $\Omega$.

The limit superior of $A_n$ is defined as

$$\limsup A_n = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} A_i = \{\omega \in \Omega : \omega \in A_n \text{ for infinitely many } n\}.$$

The limit inferior of $A_n$ is defined as

$$\liminf A_n = \bigcup_{n=1}^{\infty} \bigcap_{i=n}^{\infty} A_i = \{\omega \in \Omega : \omega \in A_n \text{ for all but finitely many } n\}.$$

We say that $\lim_n A_n = A$ if $A = \limsup A_n = \liminf A_n$.
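For a sequence of sets that repeats with a fixed period, the two limits in Definition 2.1.18 can be computed exactly: every tail $\bigcup_{i \ge n} A_i$ equals the union over one period, and every tail $\bigcap_{i \ge n} A_i$ equals the intersection over one period. The sketch below covers only this periodic special case, which is an assumption made for illustration; the helper name is hypothetical.

```python
def limsup_liminf_periodic(period):
    """lim sup and lim inf of the sequence that repeats `period` forever."""
    sets = [frozenset(s) for s in period]
    # outcomes that occur infinitely often: they appear somewhere in a period
    limsup = frozenset.union(*sets)
    # outcomes that occur in all but finitely many terms: they appear in every set
    liminf = frozenset.intersection(*sets)
    return limsup, liminf


# A_n = {1, 2} for odd n and {2, 3} for even n: the limit does not exist
print(limsup_liminf_periodic([{1, 2}, {2, 3}]))
# A constant sequence converges to its common value
print(limsup_liminf_periodic([{1, 2}]))
```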

Continuity of Probability Measure 1-2

Theorem 2.1.19 (Continuity of Probability)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. If $A_1, A_2, \dots \in \mathcal{F}$ is a sequence of subsets of $\Omega$ with $\lim_n A_n = A$, then $\lim_{n \to \infty} \mathbb{P}(A_n) = \mathbb{P}(A)$.

Proof. Let $B_n = \bigcup_{i=n}^{\infty} A_i$, a decreasing sequence. Thus,

$$\mathbb{P}(\limsup A_n) = \mathbb{P}\Big(\bigcap_{n=1}^{\infty} B_n\Big) = \lim_{n \to \infty} \mathbb{P}(B_n)$$

by Theorem 2.1.16. Also, let $C_n = \bigcap_{i=n}^{\infty} A_i$, an increasing sequence. Then

$$\mathbb{P}(\liminf A_n) = \mathbb{P}\Big(\bigcup_{n=1}^{\infty} C_n\Big) = \lim_{n \to \infty} \mathbb{P}(C_n).$$

Since $\lim_n A_n$ exists, we have $\limsup A_n = \liminf A_n = A$, and so

$$\mathbb{P}(A) = \lim_{n \to \infty} \mathbb{P}(B_n) = \lim_{n \to \infty} \mathbb{P}(C_n).$$

Since $C_n \subseteq A_n \subseteq B_n$, we have $\mathbb{P}(C_n) \le \mathbb{P}(A_n) \le \mathbb{P}(B_n)$. By the squeeze (pinching) theorem,

$$\mathbb{P}(A) = \lim_{n \to \infty} \mathbb{P}(B_n) = \lim_{n \to \infty} \mathbb{P}(C_n) = \lim_{n \to \infty} \mathbb{P}(A_n).$$

Random Variable

Understanding the Sample Space

From Example 2.1.17, we know that defining a probability measure on a sample space is not as simple as one might think. A better understanding of σ-algebras and probability is therefore needed to help us define an appropriate probability measure. Here, we first discuss properties of σ-algebras. The discussion will be very helpful when constructing the framework of random variables.

Understanding the σ-algebra

Question

Given a sample space $\Omega$ and a collection $\mathcal{C}$ of subsets of $\Omega$, does there exist a collection of subsets of $\Omega$, say $\mathcal{G}$, such that (1) $\mathcal{C} \subseteq \mathcal{G}$ and (2) $\mathcal{G}$ is a σ-algebra?

The answer is definitely yes: we may take $\mathcal{G}$ to be the collection of all subsets of $\Omega$. However, this choice does not help us understand how to define a suitable probability measure on a complicated space. So our goal is to find the smallest σ-algebra that contains $\mathcal{C}$.

Understanding the σ-algebra

Definition 2.2.1

Let $\Omega$ be a sample space and $\mathcal{C}$ a collection of subsets of $\Omega$. If $\mathcal{G}$ is the smallest σ-algebra that contains $\mathcal{C}$, we say that $\mathcal{G}$ is generated by $\mathcal{C}$ and denote it by $\mathcal{G} = \sigma(\mathcal{C})$.

Note. Here are two propositions that are easy to check. First, if $\mathcal{C}_1 \subseteq \mathcal{C}_2$, then $\sigma(\mathcal{C}_1) \subseteq \sigma(\mathcal{C}_2)$. Second, $\sigma(\sigma(\mathcal{C})) = \sigma(\mathcal{C})$, since $\sigma(\mathcal{C})$ is already a σ-algebra.
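On a finite sample space, the generated σ-algebra of Definition 2.2.1 can be computed by repeatedly closing the collection under complements and unions until nothing new appears. The sketch below is a brute-force illustration under that finiteness assumption; `generate_sigma_algebra` is a hypothetical helper name, and the approach does not carry over to infinite spaces such as $\mathbb{R}$.

```python
def generate_sigma_algebra(omega, collection):
    """Smallest sigma-algebra on a finite omega containing `collection`."""
    omega = frozenset(omega)
    sigma = {frozenset(), omega} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        for a in list(sigma):
            for b in list(sigma):
                # close under complement and (pairwise, hence finite) union
                for new in (omega - a, a | b):
                    if new not in sigma:
                        sigma.add(new)
                        changed = True
    return sigma


omega = {1, 2, 3, 4}
small = generate_sigma_algebra(omega, [{1}])
large = generate_sigma_algebra(omega, [{1}, {2}])
print(sorted(sorted(s) for s in large))
```

The two facts in the note can be observed directly: `small` is a subset of `large`, and applying the function to `large` again returns `large` unchanged.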

Borel Set and Borel σ-algebra

Definition 2.2.2 (Borel Sets)

Let $\Omega$ be a topological space and $\mathcal{C}$ the collection of all open subsets of $\Omega$. Then the σ-algebra $\mathcal{B}(\Omega) \equiv \sigma(\mathcal{C})$ is called the Borel σ-algebra. Any element $E \in \mathcal{B}(\Omega)$ is called a Borel set.

Now, let $\mathbb{R}$ be the sample space. Every subset of $\mathbb{R}$ that one meets in everyday practice is a Borel set; it is difficult, but possible, to explicitly construct a subset of $\mathbb{R}$ that is not a Borel set. This means that $\mathcal{B} \equiv \mathcal{B}(\mathbb{R})$ is not equal to the collection of all subsets of $\mathbb{R}$.

Lemma

Events in $\mathcal{B}$ can be quite complicated, but Theorem 2.2.4 shows that $\mathcal{B}$ can be constructed from a very simple structure. Before proving it, we need a simple result, Lemma 2.2.3.

Lemma 2.2.3

Let $\Omega$ be a set. If $\mathcal{F}_\alpha$ is a σ-algebra on $\Omega$ for each $\alpha$ in some non-empty index set $I$, then $\bigcap_{\alpha \in I} \mathcal{F}_\alpha$ is also a σ-algebra.

Proof. This follows immediately from the definition of a σ-algebra.

Structure of a Borel σ-algebra 1-2

Theorem 2.2.4

The Borel σ-algebra $\mathcal{B} \equiv \mathcal{B}(\mathbb{R})$ is generated by each of the following collections of subsets of $\mathbb{R}$:

(1) $\mathcal{C}_1 = \{(a, b) : a < b, a \in \mathbb{R}, b \in \mathbb{R}\}$;

(2) $\mathcal{C}_2 = \{(a, b] : a < b, a \in \mathbb{R}, b \in \mathbb{R}\}$;

(3) $\mathcal{C}_3 = \{[a, b) : a < b, a \in \mathbb{R}, b \in \mathbb{R}\}$;

(4) $\mathcal{C}_4 = \{[a, b] : a < b, a \in \mathbb{R}, b \in \mathbb{R}\}$.

Proof. This is not our main topic, so we only prove (1) as an example.

Let $\mathcal{G}$ be the collection of all open sets in $\mathbb{R}$. Then $\mathcal{B} = \sigma(\mathcal{G})$. Since every interval $(a, b)$ is open, we have $\mathcal{C}_1 \subset \mathcal{G}$, and so $\sigma(\mathcal{C}_1) \subset \sigma(\mathcal{G})$.

Since $(a, \infty) = \bigcup_{n=1}^{\infty} (a, a + n)$ and $(-\infty, b) = \bigcup_{n=1}^{\infty} (b - n, b)$ are countable unions, the unbounded open intervals $(a, \infty)$ and $(-\infty, b)$ are in $\sigma(\mathcal{C}_1)$.

To get the reverse inclusion, note that every open set $U \subset \mathbb{R}$ is a countable union of open intervals, so $U \in \sigma(\mathcal{C}_1)$, and thus $\mathcal{G} \subset \sigma(\mathcal{C}_1)$. This implies that $\sigma(\mathcal{G}) \subset \sigma(\sigma(\mathcal{C}_1)) = \sigma(\mathcal{C}_1)$.

Measurable Function 1

There is still one thing we need to talk about: the notion of "measurability" of a function. This definition is the core concept behind random variables.

Definition 2.2.5 (Measurability)

Let $(\Omega, \mathcal{F})$ and $(S, \mathcal{F}')$ be two measurable spaces. A function $X : \Omega \to S$ is said to be an $(\mathcal{F}, \mathcal{F}')$ measurable map from $(\Omega, \mathcal{F})$ to $(S, \mathcal{F}')$ if

$$X^{-1}(B) = \{\omega : X(\omega) \in B\} \in \mathcal{F}, \quad \forall B \in \mathcal{F}'.$$

Measurable Function 2

The meaning of $(\mathcal{F}, \mathcal{F}')$ measurability is that we can transport the original probability measure defined on $\mathcal{F}$ to a new probability measure $\mathbb{P}_X$ on $\mathcal{F}'$, and thus form a new probability space $(S, \mathcal{F}', \mathbb{P}_X)$. This is the structure behind the concept of a random variable. Now, we give the formal definition of a random variable.

Definition 2.2.6 (Random Variable)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. We say that a real-valued function $X : \Omega \to \mathbb{R}$ is an $(\mathcal{F}, \mathcal{B})$ measurable random variable if for any Borel set $B \subset \mathbb{R}$, $X^{-1}(B) = \{\omega : X(\omega) \in B\} \in \mathcal{F}$.

Remark.

This definition means that a random variable transforms the original probability space $(\Omega, \mathcal{F}, \mathbb{P})$ into a new probability space $(\mathbb{R}, \mathcal{B}, \mathbb{P}_X)$. We have not yet discussed how to transform $\mathbb{P}$ into the new probability measure $\mathbb{P}_X$ with respect to $X$ on $\mathcal{B}$.

Example of Random Variables

Example 2.2.7 (Random Variables)

Let $\Omega = [0, 1]$ and $\mathcal{F} = \mathcal{B} \cap [0, 1] = \{B \cap [0, 1] : B \in \mathcal{B}\}$. Is the real-valued function $X_1(\omega) = \omega$, $\omega \in \Omega$, a random variable? How about $X_2(\omega) = \omega^2$, $\omega \in \Omega$?

Solution. For any Borel set $B \in \mathcal{B}$,

$$X_1^{-1}(B) = \{\omega \in \Omega : X_1(\omega) \in B\} = \{\omega \in [0, 1] : \omega \in B\} = B \cap [0, 1] \in \mathcal{F}.$$

Hence, $X_1(\omega) = \omega$, $\omega \in \Omega$, is a random variable.

For any Borel set $B \in \mathcal{B}$,

$$X_2^{-1}(B) = \{\omega \in \Omega : X_2(\omega) \in B\} = \{\omega \in [0, 1] : \omega^2 \in B\}.$$

It is hard to check whether $X_2^{-1}(B) \in \mathcal{F}$, especially since we need to check every Borel set $B$!

Alternative Definition of Random Variable 1

The definition of a random variable is too hard to check directly! We need an equivalent definition of random variables. We first introduce a lemma.

Lemma 2.2.8 (Transformation of σ-algebra)

Let $(\Omega, \mathcal{F})$ and $(S, \mathcal{F}')$ be two measurable spaces and $\mathcal{C} \subseteq \mathcal{F}'$ be a collection of subsets of $S$. Write $X^{-1}(\mathcal{C}) = \{X^{-1}(B) : B \in \mathcal{C}\}$. If $X : \Omega \to S$ is a map (measurability is not needed for this identity), then $\sigma(X^{-1}(\mathcal{C})) = X^{-1}(\sigma(\mathcal{C}))$.

Proof. The proof is technical and is omitted here.

Alternative Definition of Random Variable 2

Theorem 2.2.9 (Equivalent Definition of a Random Variable)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $X : \Omega \to \mathbb{R}$ a real-valued function. Then $X$ is a random variable if and only if

$$X^{-1}((-\infty, r]) = \{\omega : X(\omega) \le r\} \in \mathcal{F} \text{ for any } r \in \mathbb{R}.$$

Proof.

If $X$ is a random variable, then since $(-\infty, r]$ is a Borel set (note that the collection of all such intervals generates the Borel σ-algebra on $\mathbb{R}$), $X^{-1}((-\infty, r]) \in \mathcal{F}$ must hold.

To prove the converse, let $\mathcal{C} = \{(-\infty, r] : r \in \mathbb{R}\}$, so that $\sigma(\mathcal{C}) = \mathcal{B}$. Then $X^{-1}(\mathcal{B}) = X^{-1}(\sigma(\mathcal{C})) = \sigma(X^{-1}(\mathcal{C}))$ by Lemma 2.2.8. Since $X^{-1}(\mathcal{C}) \subseteq \mathcal{F}$ by assumption and $\mathcal{F}$ is a σ-algebra, we have $\sigma(X^{-1}(\mathcal{C})) = X^{-1}(\mathcal{B}) \subseteq \mathcal{F}$, and so $X$ is a random variable.

Alternative Definition of Random Variable 3

Corollary 2.2.10 (Equivalent Definition of a Random Variable)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $X : \Omega \to \mathbb{R}$ a real-valued function. The following statements are equivalent.

(1) $X$ is a random variable.

(2) $X^{-1}((-\infty, r]) = \{\omega : X(\omega) \le r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

(3) $X^{-1}((-\infty, r)) = \{\omega : X(\omega) < r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

(4) $X^{-1}([r, \infty)) = \{\omega : X(\omega) \ge r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

(5) $X^{-1}((r, \infty)) = \{\omega : X(\omega) > r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

This gives an easier way to check whether a function is a random variable. For example, we can now answer the question about $X_2(\omega) = \omega^2$ on $\Omega = [0, 1]$ in Example 2.2.7.

If $r < 0$, $X_2^{-1}((-\infty, r]) = \emptyset \in \mathcal{F}$. If $0 \le r \le 1$, $X_2^{-1}((-\infty, r]) = [0, \sqrt{r}] \in \mathcal{F}$, since $\sqrt{r} \le 1$. If $r > 1$, $X_2^{-1}((-\infty, r]) = [0, 1] = \Omega \in \mathcal{F}$. Therefore, $X_2$ is a random variable.


Define the Probability Measure on the Borel Set 1-2

There is still one thing to discuss: how do we define a probability measure on $\mathcal{B}$ with respect to the random variable $X$? The following theorem gives the answer.

Theorem 2.2.11

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable. Then the function $\mathbb{P}_X : \mathcal{B} \to [0, 1]$ defined by $\mathbb{P}_X(B) = \mathbb{P}(X^{-1}(B))$, $\forall B \in \mathcal{B}$, is a probability measure on $(\mathbb{R}, \mathcal{B})$.

Proof. To prove this theorem, we need to check the axioms of probability measures.

(1) $\mathbb{P}_X(\mathbb{R}) = \mathbb{P}(X^{-1}(\mathbb{R})) = \mathbb{P}(\Omega) = 1$.

(2) $\mathbb{P}_X(B) = \mathbb{P}(X^{-1}(B)) \ge 0$, since $\mathbb{P}$ is a probability measure.

(3) For pairwise disjoint sets $B_1, B_2, \dots$ in $\mathcal{B}$, the preimages $X^{-1}(B_i)$ are also pairwise disjoint, so

$$\mathbb{P}_X\Big(\bigcup_i B_i\Big) = \mathbb{P}\Big(X^{-1}\Big(\bigcup_i B_i\Big)\Big) = \mathbb{P}\Big(\bigcup_i X^{-1}(B_i)\Big) = \sum_i \mathbb{P}(X^{-1}(B_i)) = \sum_i \mathbb{P}_X(B_i).$$

Therefore, $\mathbb{P}_X$ is a probability measure on $(\mathbb{R}, \mathcal{B})$. Note that $\mathbb{P}_X$ is called the probability distribution function of the random variable $X$.
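In the discrete case the measure $\mathbb{P}_X(B) = \mathbb{P}(X^{-1}(B))$ of Theorem 2.2.11 can be built by pushing each point mass of $\mathbb{P}$ forward through $X$. The sketch below is a minimal illustration on a fair die; the helper name `pushforward` and the parity random variable are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(pmf, X):
    """Point masses of P_X, where P_X({x}) = P({omega : X(omega) = x})."""
    law = defaultdict(Fraction)  # Fraction() is zero
    for omega, p in pmf.items():
        law[X(omega)] += p
    return dict(law)


die = {k: Fraction(1, 6) for k in range(1, 7)}
parity = pushforward(die, lambda w: w % 2)  # X(omega) = omega mod 2
print(parity)
```

Axiom (1) of the proof shows up as the fact that the pushed-forward masses still sum to 1, whatever function `X` is used.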

Define the Probability Measure on the Borel Set 3

We now know how to define a probability measure with respect to a random variable. However, this description is complicated, since it defines a probability measure on all of $(\mathbb{R}, \mathcal{B})$. In the next section we therefore introduce a more common, useful, and intuitive function to describe probabilities with respect to $X$: the (cumulative) distribution function $F_X$.

Distribution Function

Distribution Function: Definition

The distribution of a random variable $X$ is usually described by giving its distribution function, not the probability distribution function.

Definition 2.3.1 (Distribution Function)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable with probability distribution function $\mathbb{P}_X$. The distribution function of $X$, written $F_X : \mathbb{R} \to [0, 1]$, is defined by

$$F_X(x) = \mathbb{P}_X((-\infty, x]) = \mathbb{P}(\{\omega \in \Omega : X(\omega) \le x\}) = \mathbb{P}(X \le x), \quad \forall x \in \mathbb{R}.$$

Distribution Function: Properties 1

The distribution function gives us a simplified framework for discussing a quantitative random phenomenon.

Theorem 2.3.2 (Properties of Distribution Functions)

Any distribution function $F_X$ of a random variable $X$ has the following properties:

(1) $F_X$ is nondecreasing;

(2) $\lim_{x \to \infty} F_X(x) = 1$ and $\lim_{x \to -\infty} F_X(x) = 0$;

(3) $F_X$ is right continuous, that is, $\lim_{x \to a^+} F_X(x) = F_X(a)$;

(4) $F_X(a^-) := \lim_{x \to a^-} F_X(x) = \mathbb{P}(X < a)$;

(5) the probability of the event $\{X = a\}$ is $\mathbb{P}(X = a) = F_X(a) - F_X(a^-)$.

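Distribution Function: Properties 2

Proof. To prove (1), assume that $a \le b$, $a, b \in \mathbb{R}$. Note that $\{\omega : X(\omega) \le a\} \subseteq \{\omega : X(\omega) \le b\}$. Then, by monotonicity,

$$F_X(a) = \mathbb{P}(\{\omega : X(\omega) \le a\}) \le \mathbb{P}(\{\omega : X(\omega) \le b\}) = F_X(b).$$

For (2), if $x \to \infty$, then $\{\omega : X(\omega) \le x\} \uparrow \Omega$, so $\lim_{x \to \infty} F_X(x) = \lim_{x \to \infty} \mathbb{P}(\{\omega : X(\omega) \le x\}) = \mathbb{P}(\Omega) = 1$. Similarly, the fact that $x \to -\infty \Rightarrow \{\omega : X(\omega) \le x\} \downarrow \emptyset$ implies $\lim_{x \to -\infty} F_X(x) = 0$.

To prove (3), we observe that if $x \to a^+$, then $\{\omega : X(\omega) \le x\} \downarrow \{\omega : X(\omega) \le a\}$, and apply Theorem 2.1.16.

To prove (4), if $x \to a^-$, then $\{\omega : X(\omega) \le x\} \uparrow \{\omega : X(\omega) < a\}$, and Theorem 2.1.16 applies again.

For (5), note that $\mathbb{P}(X = a) = \mathbb{P}(X \le a) - \mathbb{P}(X < a)$ and use the definition of $F_X$ together with (4).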
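Properties (1), (3), (4) and (5) of Theorem 2.3.2 are easy to observe for a discrete random variable, where $F_X$ and its left limit are finite sums. The sketch below uses a fair die as an assumed example; `cdf` and `left_limit` are hypothetical helper names.

```python
from fractions import Fraction


def cdf(pmf):
    """Return F(x) = P(X <= x) for a random variable with point masses `pmf`."""
    def F(x):
        return sum((p for v, p in pmf.items() if v <= x), Fraction(0))
    return F


def left_limit(pmf):
    """Return F(a-) = P(X < a), the left limit of the distribution function."""
    def F_minus(a):
        return sum((p for v, p in pmf.items() if v < a), Fraction(0))
    return F_minus


die = {k: Fraction(1, 6) for k in range(1, 7)}
F, F_minus = cdf(die), left_limit(die)
# Jump at a = 3: F(3) - F(3-) equals P(X = 3)
print(F(3), F_minus(3), F(3) - F_minus(3))
```

Just to the right of 3 the function stays at `F(3)` (right continuity), while just to the left it sits at `F_minus(3)`; the difference is the point mass at 3.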

Lebesgue-Stieltjes measure 1

Definition 2.3.3 (Lebesgue-Stieltjes Measure)

Let $\Omega = \mathbb{R}$ and let $\mathcal{C}$ be the collection of intervals of the form $(a, b]$. If $F$ is a function satisfying properties (1) and (3) of Theorem 2.3.2, then we call it a Stieltjes measure function.

Define a function $l$ by $l((a, b]) = F(b) - F(a)$. Then the function

$$m^*(E) = \inf\Big\{\sum_{i=1}^{\infty} l(A_i) : A_i \in \mathcal{C} \text{ for each } i \text{ and } E \subset \bigcup_{i=1}^{\infty} A_i\Big\}$$

is called a Lebesgue-Stieltjes measure. If we define $F(x) = x$, then the measure $m^*$ is called the Lebesgue measure.

Lebesgue-Stieltjes measure 2

The Lebesgue-Stieltjes measure has many important properties. First, every set in the Borel σ-algebra on $\mathbb{R}$ can be measured by the Lebesgue-Stieltjes measure. (Here, we do not define what the term "can be measured" means!) Second, if $a$ and $b$ are finite, then $m^*((a, b]) = l((a, b])$. The most important special case is the Lebesgue measure, which plays a central role in defining expectation.

Conditions for Distribution Functions 1-2

Theorem 2.3.4 (Conditions for Distribution Functions)

If a function $F$ satisfies properties (1), (2) and (3) of Theorem 2.3.2, then it is the distribution function of some random variable.

Proof.

Let $\Omega = (0, 1)$, $\mathcal{F}$ = the Borel sets on $(0, 1)$, and $\mathbb{P}$ = the Lebesgue measure.

Define a real-valued function $X : \Omega \to \mathbb{R}$ by $X(\omega) = \sup\{y : F(y) < \omega\}$.

Fix a number $x$. If $\omega \le F(x)$, then $X(\omega) \le x$ must hold: every $y$ with $F(y) < \omega \le F(x)$ satisfies $y < x$, because $F$ is nondecreasing.

On the other hand, if $\omega > F(x)$, then since $F$ is right continuous, there is an $\varepsilon > 0$ such that $F(x + \varepsilon) < \omega$, and so $X(\omega) \ge x + \varepsilon > x$. As a result, we have $\{\omega : X(\omega) \le x\} = \{\omega : \omega \le F(x)\}$. Since $\mathbb{P}$ is the Lebesgue measure on $(0, 1)$, we get the desired result:

$$\mathbb{P}(\{\omega \in \Omega : X(\omega) \le x\}) = \mathbb{P}(\{\omega \in \Omega : \omega \le F(x)\}) = \mathbb{P}((0, F(x)] \cap \Omega) = F(x).$$
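The construction in the proof of Theorem 2.3.4 is what is commonly called inverse transform sampling: draw $\omega$ uniformly from $(0, 1)$ and return $X(\omega) = \sup\{y : F(y) < \omega\}$. The sketch below works this out for one assumed example, the exponential distribution function $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$, where the supremum has the closed form $-\ln(1 - \omega) / \lambda$. The function names, the rate and the seed are illustrative choices.

```python
import math
import random


def exp_cdf(x, rate=1.0):
    """Exponential distribution function F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def exp_quantile(omega, rate=1.0):
    """X(omega) = sup{y : F(y) < omega} for the exponential F, omega in [0, 1)."""
    return -math.log(1.0 - omega) / rate


def sample(n, rate=1.0, seed=0):
    """Draw n values of X by feeding uniform draws on [0, 1) through X."""
    rng = random.Random(seed)
    return [exp_quantile(rng.random(), rate) for _ in range(n)]


draws = sample(20_000)
empirical = sum(x <= 1.0 for x in draws) / len(draws)
print(empirical, exp_cdf(1.0))
```

The empirical fraction of draws with $X \le 1$ should be close to $F(1)$, which is the statement $\mathbb{P}(X \le x) = F(x)$ from the proof in sampled form.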

Probability Mass Function

Definition 2.3.5 (Probability Mass Function)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable. If the image of $X$, say $X(\Omega)$, is a countable set, we call $X$ a discrete random variable. The probability mass function of $X$ is defined by $p_X(x) = \mathbb{P}_X(\{x\}) = \mathbb{P}(\{\omega : X(\omega) = x\})$. In this case, the distribution function of $X$ has a jump of size $p_X(x)$ at every $x$ with $p_X(x) > 0$.

Probability Density Function

Definition 2.3.6 (Probability Density Function)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable. If the distribution function $F_X$ of $X$ is continuous, we call $X$ a continuous random variable. The function $f_X(x) = \frac{d}{dx} F_X(x)$, where the derivative exists, is called the probability density function of $X$.

By the fundamental theorem of calculus, we have the relationships

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \quad \text{and} \quad \mathbb{P}_X((a - \varepsilon, a + \varepsilon)) = F_X(a + \varepsilon) - F_X(a - \varepsilon) = \int_{a - \varepsilon}^{a + \varepsilon} f_X(x)\,dx.$$

These properties mean that we can start with $f$ and then define a distribution function $F$. In order to end up with a distribution function $F_X$, it is necessary and sufficient that $f(x) \ge 0$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
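The two integral relationships between a density and its distribution function can be checked numerically. The sketch below assumes the exponential density $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ as an example and uses a plain composite trapezoid rule; the helper names, the truncation of the integral at 50 and the step count are illustrative choices.

```python
import math


def exp_pdf(x, rate=1.0):
    """Exponential density f(x) = rate * exp(-rate * x) for x >= 0, else 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exp_cdf(x, rate=1.0):
    """Matching distribution function F(x) = 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def integrate(f, lo, hi, steps=100_000):
    """Composite trapezoid rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * f(lo) + inner + 0.5 * f(hi))


# Total mass: the density integrates to (numerically almost) 1
total = integrate(exp_pdf, 0.0, 50.0)
# P_X((a - eps, a + eps)) as an integral versus a difference of F values
a, eps = 1.0, 0.1
window = integrate(exp_pdf, a - eps, a + eps)
print(total, window, exp_cdf(a + eps) - exp_cdf(a - eps))
```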

Next Lecture

In the next lecture, we will review the idea of a random variable and then introduce random vectors.