L2 probability theory refresh

Probability Theory: Advanced Look

Statistical Methods in Finance

Lecture 2

Ta-Wei Huang

December 7, 2016

Ta-Wei Huang Probability Theory: Advanced Look December 7, 2016 1 / 58

Table of Contents

Probability theory is the way we think about randomness. In financial markets, uncertainty and risk are everywhere, so we need probability theory to model market trends.

1 Sample Space and Event

2 Probability Measure

3 Random Variable

4 Distribution Function

5 Next Lecture

Sample Space and Event

Sample Space: Definition

Definition 2.1.1 (Sample Space)

For an experiment and a given index set $A$, the possible outcomes $\omega_\alpha$, $\alpha \in A$, are called sample points. The set $\Omega = \{\omega_\alpha : \alpha \in A\}$ of all sample points is called the sample space of that experiment.

1 If $A$ is countable, we say that $\Omega$ is a discrete sample space;

2 if $A$ is uncountable, we say that $\Omega$ is a continuous sample space.

Sample Space: Examples

Example 2.1.2 (Sample Space)

1 If the experiment is tossing a coin, and the observation is the face of that coin, then the sample space is $\Omega = \{H, T\}$, which is a discrete sample space.

2 If the experiment is a monetary policy action by the Fed, and the observation is the return on the S&P 500 Index one day after that action, then $\Omega = \mathbb{R}$, which is a continuous sample space.

Event

Definition 2.1.3 (Event)

An event $E$ is any collection of possible outcomes of an experiment, that is, any subset of the sample space $\Omega$.

An event is actually a statement about the result of the experiment. For example, the set $\mathbb{R}^+ = \{\text{the S\&P 500 has a positive return}\}$ is an event for $\Omega$ in Example 2.1.2, case (2).

σ-algebra: Definition

Definition 2.1.4 (σ-algebra)

A system $\mathcal{F}$ of subsets of $\Omega$ is called a σ-algebra if

(a) $\Omega \in \mathcal{F}$,

(b) $A^c \in \mathcal{F}$ whenever $A \in \mathcal{F}$, and

(c) $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$ whenever $A_1, A_2, \dots, A_n, \dots \in \mathcal{F}$.

Why do we need the concept of a σ-algebra? We will assign a probability to each event $E$, which is a subset of the sample space $\Omega$. The collection of events must therefore be closed under set operations (complements and countable unions, hence also countable intersections), so that we can combine events and still compute their probabilities.

σ-algebra: Example

Example 2.1.5 (σ-algebra)

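1 Let $\Omega = \{1, 2, 3\}$. Then

1 $\mathcal{F}_1 = \{\emptyset, \{1\}, \{2, 3\}, \{1, 2, 3\}\}$ is a σ-algebra.

2 $\mathcal{F}_2 = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1, 2, 3\}\}$ is not a σ-algebra, since for $A = \{1\} \in \mathcal{F}_2$ we have $A^c = \{2, 3\} \notin \mathcal{F}_2$.

2 Let $\Omega = \mathbb{R}$. Then $\mathcal{F}$ = the collection of all subsets of $\mathbb{R}$ is a σ-algebra.

3 Let $\Omega = \mathbb{N}$. Then

1 $\mathcal{F}_1 = \{\emptyset, \{1, 3, 5, 7, \dots\}, \{2, 4, 6, 8, \dots\}, \mathbb{N}\}$ is a σ-algebra.

2 $\mathcal{F}_2 = \{A \subseteq \mathbb{N} : A \text{ is countable or } A^c \text{ is countable}\}$ is a σ-algebra.

3 $\mathcal{F}_3 = \{A \subseteq \mathbb{N} : A \text{ is finite or } A^c \text{ is finite}\}$ is not a σ-algebra: each singleton $\{2k\}$ belongs to $\mathcal{F}_3$, but the countable union $\bigcup_{k=1}^{\infty} \{2k\}$ (the set of even numbers) is not finite and does not have a finite complement.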
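On a finite sample space the σ-algebra axioms can be checked mechanically. The sketch below is only an illustration: the helper name `is_sigma_algebra` is hypothetical, and the two families are read as $\{\emptyset, \{1\}, \{2,3\}, \Omega\}$ and $\{\emptyset, \{1\}, \{2\}, \{3\}, \Omega\}$, which is an assumption about the intended sets in part 1 of the example.

```python
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check the sigma-algebra axioms for a family of subsets of a finite set."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in family}
    # (a) Omega itself must belong to the family
    if omega not in sets:
        return False
    # (b) closed under complement
    if any(omega - s not in sets for s in sets):
        return False
    # (c) for a finite family, closure under pairwise unions
    #     already gives closure under countable unions
    return all(a | b in sets for a, b in combinations(sets, 2))

omega = {1, 2, 3}
F1 = [set(), {1}, {2, 3}, {1, 2, 3}]      # assumed reading of the first family
F2 = [set(), {1}, {2}, {3}, {1, 2, 3}]    # assumed reading of the second family

print(is_sigma_algebra(omega, F1))  # True
print(is_sigma_algebra(omega, F2))  # False: the complement of {1} is missing
```

The check only works for finite families; for infinite sample spaces such as $\mathbb{N}$ the axioms have to be verified by argument.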

Probability Measure

Defining Probability Measure

Classically, we define the probability $P(A)$ of an event $A$ by $P(A) = \frac{\#A}{\#\Omega}$, the number of outcomes in $A$ divided by the number of outcomes in $\Omega$. This definition has two main problems.

1 It requires a finite sample space, which does not hold in some cases.

2 It requires symmetric (equally likely) outcomes, that is, $P(\{\omega_i\}) = \frac{1}{\#\Omega}$ for all $\omega_i \in \Omega$.

Therefore, we need modern probability theory, built on the foundations laid by Andrey Nikolaevich Kolmogorov.

Probability Measure: Definition

Definition 2.1.6 (Measurable Space and Probability Measure)

Let $\Omega$ be a non-empty set and let $\mathcal{F}$ be a σ-algebra on $\Omega$. Then $(\Omega, \mathcal{F})$ is called a measurable space. A probability measure is a real-valued function $P : \mathcal{F} \to \mathbb{R}$ such that

(1) $P(\Omega) = 1$;

(2) $P(E) \ge 0$ for any event $E \in \mathcal{F}$;

(3) (Countable additivity) for any sequence of pairwise disjoint events $E_1, E_2, \dots$ in $\mathcal{F}$,
\[ P\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} P(E_i). \]

The triple $(\Omega, \mathcal{F}, P)$ is called a probability space.

Probability Measure: Remark

Remark on the Definition of Probability Measure

This axiomatic definition makes no attempt to tell what particular

function P to choose.

This definition regards the probability as a property of an event in the

σ-algebra F on the sample space Ω.

So, we’ll further discuss how to define probability measures on discrete and

continuous sample spaces, respectively.

Properties of Probability Measure 1

Proposition 2.1.7

Let $(\Omega, \mathcal{F}, P)$ be a probability space. Then, for any event $E \in \mathcal{F}$,

(1) $P(\emptyset) = 0$; (2) $P(E) \le 1$;

(3) $P(E^c) = 1 - P(E)$.

Proof. It is easier to prove (3) first. Since the sets $E$ and $E^c$ form a partition of the sample space $\Omega$, we have $P(E \cup E^c) = P(\Omega) = 1$ by axiom (1). Also, $E$ and $E^c$ are disjoint, so by countable additivity, $P(E \cup E^c) = P(E) + P(E^c) = 1$, and hence $P(E^c) = 1 - P(E)$.

Statement (1) follows from (3) with $E = \Omega$: $P(\emptyset) = P(\Omega^c) = 1 - P(\Omega) = 0$.

Since $P(E^c) = 1 - P(E) \ge 0$, (2) holds immediately.

Properties of Probability Measure 2

Proposition 2.1.8

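Let $(\Omega, \mathcal{F}, P)$ be a probability space. Then, for any events $A, B \in \mathcal{F}$,

(1) $P(B \cap A^c) = P(B) - P(A \cap B)$;

(2) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$;

(3) (Monotonicity) If $A \subseteq B$, then $P(A) \le P(B)$.

Proof. For any sets $A$ and $B$, $B = (B \cap A^c) \cup (B \cap A)$, and the two sets on the right are disjoint. Then,
\[ P(B) = P\big((B \cap A^c) \cup (B \cap A)\big) = P(B \cap A^c) + P(B \cap A), \]
which gives (1).

To establish (2), we use the identity $A \cup B = A \cup (B \cap A^c)$, a disjoint union (why?), together with (1).

To establish (3), note that $A \subseteq B$ implies $A \cap B = A$, so (1) gives $P(B) - P(A) = P(B \cap A^c) \ge 0$.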

Properties of Probability Measure 3

Proposition 2.1.9 (Inclusion-exclusion Identity)

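Let $(\Omega, \mathcal{F}, P)$ be a probability space. For $n$ events $A_1, \dots, A_n \in \mathcal{F}$, define $P_1, P_2, \dots, P_n$ by
\[ P_1 = \sum_{i=1}^{n} P(A_i), \quad P_2 = \sum_{1 \le i < j \le n} P(A_i \cap A_j), \quad P_3 = \sum_{1 \le i < j < k \le n} P(A_i \cap A_j \cap A_k), \quad \dots, \quad P_n = P\Big(\bigcap_{i=1}^{n} A_i\Big). \]
Then the probability of the union of $A_1, \dots, A_n$ is given by
\[ P(A_1 \cup A_2 \cup \dots \cup A_n) = \sum_{i=1}^{n} (-1)^{i+1} P_i. \]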

Proof. By mathematical induction.
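The identity can be checked numerically on a small finite space. The following is a minimal sketch, assuming a uniform measure on a hypothetical 12-point sample space and three arbitrarily chosen events; the function name `inclusion_exclusion` is illustrative only.

```python
from fractions import Fraction
from itertools import combinations

def inclusion_exclusion(events, P):
    """Return sum_{i=1}^{n} (-1)^(i+1) * P_i, where P_i sums P over all i-fold intersections."""
    n = len(events)
    total = Fraction(0)
    for i in range(1, n + 1):
        P_i = sum(
            (P(set.intersection(*combo)) for combo in combinations(events, i)),
            Fraction(0),
        )
        total += (-1) ** (i + 1) * P_i
    return total

# Hypothetical example: uniform measure on {1, ..., 12}
omega = set(range(1, 13))

def P(event):
    return Fraction(len(event), len(omega))

events = [{1, 2, 3, 4}, {3, 4, 5, 6}, {1, 6, 7}]

lhs = P(set.union(*events))           # probability of the union, computed directly
rhs = inclusion_exclusion(events, P)  # alternating sum from the proposition
print(lhs, rhs)  # 7/12 7/12
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a floating-point tolerance.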

Properties of Probability Measure 4

Example 2.1.10 (The Matching Problem)

Suppose that each of N men at a party throws his hat into the center of

the room. The hats are first mixed up, and then each man randomly

selects a hat. What is the probability that none of the men selects his own

hat?

Solution. We first calculate the complementary probability of at least one

man’s selecting his own hat. Let Ai be the event that the i-th man selects

his own hat, i = 1, 2, . . . , N .

Properties of Probability Measure 5

Solution (Cont’d). Then, the probability that at least one of the men selects his own hat is given by
\[ P(A_1 \cup A_2 \cup \dots \cup A_N) = (-1)^{2} \sum_{i=1}^{N} P(A_i) + (-1)^{3} \sum_{1 \le i_1 < i_2 \le N} P(A_{i_1} \cap A_{i_2}) + \dots + (-1)^{N+1} P_N, \]
where $P_N = P\big(\bigcap_{i=1}^{N} A_i\big)$.

The number of all possible outcomes of this experiment is $N!$. The number of outcomes in the event $A_{i_1} \cap \dots \cap A_{i_n}$ is $(N - n)!$. So the probability is $P(A_{i_1} \cap \dots \cap A_{i_n}) = \frac{(N-n)!}{N!}$.

Properties of Probability Measure 6

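Solution (Cont’d). Now, since there are $\binom{N}{n}$ terms in the sum $\sum_{1 \le i_1 < \dots < i_n \le N} P(A_{i_1} \cap \dots \cap A_{i_n})$, we have
\[ \sum_{1 \le i_1 < \dots < i_n \le N} P(A_{i_1} \cap \dots \cap A_{i_n}) = \binom{N}{n} \frac{(N-n)!}{N!} = \frac{1}{n!}. \]

Thus, we get the complementary probability
\[ P(A_1 \cup A_2 \cup \dots \cup A_N) = 1 - \frac{1}{2!} + \frac{1}{3!} - \dots + (-1)^{N+1} \frac{1}{N!}. \]

Hence, the probability that none of the men selects his own hat is
\[ 1 - \Big(1 - \frac{1}{2!} + \frac{1}{3!} - \dots + (-1)^{N+1} \frac{1}{N!}\Big) = \sum_{k=0}^{N} \frac{(-1)^k}{k!}. \]

Note that as $N \to \infty$, this probability converges to $e^{-1} \approx 0.36788$, not 1!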
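The closed form for the matching problem can be compared with a simulation. The sketch below is illustrative: the function names, the number of trials and the random seed are arbitrary choices, not part of the lecture.

```python
import math
import random

def no_match_probability(n):
    """Exact P(no man selects his own hat) = sum_{k=0}^{n} (-1)^k / k!."""
    return sum((-1) ** k / math.factorial(k) for k in range(n + 1))

def simulate_no_match(n, trials=100_000, seed=0):
    """Monte Carlo estimate: shuffle the hats and count trials with no fixed point."""
    rng = random.Random(seed)
    hats = list(range(n))
    hits = 0
    for _ in range(trials):
        rng.shuffle(hats)
        if all(hats[i] != i for i in range(n)):
            hits += 1
    return hits / trials

for n in (3, 5, 10):
    print(n, no_match_probability(n), simulate_no_match(n))
print("limit e^{-1} =", math.exp(-1))
```

Already for moderate $N$ the exact value is very close to $e^{-1}$, because the alternating series converges quickly.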

Properties of Probability Measure 7

Proposition 2.1.11 (Boole’s Inequality)

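Let $(\Omega, \mathcal{F}, P)$ be a probability space. For events $A_1, A_2, \dots \in \mathcal{F}$ and any $n$,
\[ P\Big(\bigcup_{i=1}^{n} A_i\Big) \le \sum_{i=1}^{n} P(A_i). \]

Proof. Let $A'_i$ be the events defined by $A'_1 = A_1$ and $A'_i = A_i - \bigcup_{j=1}^{i-1} A_j$ for $i = 2, 3, \dots$. Then (1) $A'_i \subset A_i$ and (2) the $A'_i$ are pairwise disjoint, since for $i > k$,
\begin{align*}
A'_i \cap A'_k &= \Big(A_i - \bigcup_{j=1}^{i-1} A_j\Big) \cap \Big(A_k - \bigcup_{j=1}^{k-1} A_j\Big) \\
&= \Big(A_i \cap \Big(\bigcup_{j=1}^{i-1} A_j\Big)^c\Big) \cap \Big(A_k \cap \Big(\bigcup_{j=1}^{k-1} A_j\Big)^c\Big) \\
&= \Big(A_i \cap \bigcap_{j=1}^{i-1} A_j^c\Big) \cap \Big(A_k \cap \bigcap_{j=1}^{k-1} A_j^c\Big),
\end{align*}
and $A_i \cap \bigcap_{j=1}^{i-1} A_j^c$ is contained in $A_k^c$ (because $k \le i - 1$), while the second factor is contained in $A_k$. Hence $A'_i \cap A'_k = \emptyset$. Since $\bigcup_i A'_i = \bigcup_i A_i$, we have
\[ P\Big(\bigcup_{i=1}^{n} A_i\Big) = P\Big(\bigcup_{i=1}^{n} A'_i\Big) = \sum_{i=1}^{n} P(A'_i) \le \sum_{i=1}^{n} P(A_i), \]
where the last step uses monotonicity.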
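The construction of the disjoint sets $A'_i$ used in the proof can be sketched in a few lines. This is an illustration only; the helper name `disjointify` and the example events are hypothetical.

```python
from itertools import combinations

def disjointify(events):
    """Return A'_1 = A_1 and A'_i = A_i minus (A_1 ∪ ... ∪ A_{i-1}) for i >= 2."""
    seen, out = set(), []
    for A in events:
        out.append(set(A) - seen)  # keep only points not covered by earlier events
        seen |= set(A)
    return out

events = [{1, 2, 3}, {2, 3, 4}, {1, 4, 5}]   # arbitrary overlapping events
primed = disjointify(events)

pairwise_disjoint = all(not (a & b) for a, b in combinations(primed, 2))
same_union = set().union(*primed) == set().union(*events)
print(primed)                         # [{1, 2, 3}, {4}, {5}]
print(pairwise_disjoint, same_union)  # True True
```

Because each $A'_i$ is a subset of $A_i$ and the unions agree, summing the probabilities of the $A'_i$ gives the probability of the union, which is at most the sum of the $P(A_i)$.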

Properties of Probability Measure 8

Corollary 2.1.12 (σ-subadditivity)

Let $(\Omega, \mathcal{F}, P)$ be a probability space. Then for events $A_1, A_2, \dots \in \mathcal{F}$ and $A \in \mathcal{F}$ with $A \subset \bigcup_{i=1}^{n} A_i$,
\[ P(A) \le \sum_{i=1}^{n} P(A_i). \]

Proof. By Proposition 2.1.8 (3) and Proposition 2.1.11.

Properties of Probability Measure 9

Proposition 2.1.13 (Law of Total Probability)

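Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $A_1, A_2, \dots \in \mathcal{F}$ be a partition of $\Omega$, that is, the $A_i$ are pairwise disjoint and $\bigcup_{i=1}^{\infty} A_i = \Omega$. Then, for any event $B \in \mathcal{F}$,
\[ P(B) = \sum_{i=1}^{\infty} P(B \cap A_i). \]

Proof. Since $\{A_i\}$ is a partition of $\Omega$, we have
\[ B = B \cap \Omega = B \cap \Big(\bigcup_{i=1}^{\infty} A_i\Big) = \bigcup_{i=1}^{\infty} (B \cap A_i), \]
and the sets $B \cap A_i$ are pairwise disjoint. Therefore, by countable additivity,
\[ P(B) = P\Big(\bigcup_{i=1}^{\infty} (B \cap A_i)\Big) = \sum_{i=1}^{\infty} P(B \cap A_i). \]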

Define Probability Measure: Discrete Sample Space 1

Theorem 2.1.14 (Define Probability Measures in a Discrete Sample Space)

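Let $\Omega = \{\omega_\alpha : \alpha \in \mathbb{N}\}$ be a discrete (countable) sample space and $\mathcal{F}$ be a σ-algebra on $\Omega$. Define a function $P : \mathcal{F} \to \mathbb{R}$ by (1) $P(\Omega) = 1$; (2) $P(\{\omega\}) \ge 0$ for any $\omega \in \Omega$; (3) $P(E) = \sum_{\omega \in E} P(\{\omega\})$. Then $P$ is a probability measure.

Proof. Axiom (1) is true by the definition of $P$. Since $P(\{\omega\}) \ge 0$ for any $\omega \in \Omega$, for any event $E \in \mathcal{F}$ we have $P(E) = \sum_{\omega \in E} P(\{\omega\}) \ge 0$, so axiom (2) holds.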

Define Probability Measure: Discrete Sample Space 2

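Proof (Cont’d). Let $E_1, E_2, \dots$ be a sequence of pairwise disjoint events in $\mathcal{F}$. Then,
\[ P\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{\omega \in \bigcup_{i=1}^{\infty} E_i} P(\{\omega\}) = \sum_{i=1}^{\infty} \sum_{\omega \in E_i} P(\{\omega\}) = \sum_{i=1}^{\infty} P(E_i), \]
where the middle equality holds because the $E_i$ are pairwise disjoint, so each $\omega$ in the union is counted exactly once.

⇒ Axiom (3) holds.

Remark on Theorem 2.1.14

The triple $(\Omega, \mathcal{F}, P)$ in the above theorem is called a discrete probability space.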
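The construction in Theorem 2.1.14 translates directly into code: point masses determine the whole measure. The sketch below assumes a finite sample space (a fair die, chosen purely for illustration) and uses the hypothetical helper name `make_measure`.

```python
from fractions import Fraction

def make_measure(pmf):
    """Build P(E) = sum of pmf[w] over w in E from nonnegative point masses summing to 1."""
    assert all(p >= 0 for p in pmf.values())   # condition (2): nonnegative point masses
    assert sum(pmf.values()) == 1              # condition (1): total mass one
    def P(event):
        return sum((pmf[w] for w in event), Fraction(0))   # condition (3)
    return P

# Hypothetical example: a fair six-sided die
pmf = {w: Fraction(1, 6) for w in range(1, 7)}
P = make_measure(pmf)

even, low = {2, 4, 6}, {1}
print(P(set(pmf)))                          # 1
print(P(even))                              # 1/2
print(P(even | low) == P(even) + P(low))    # True: additivity on disjoint events
```

Any other choice of nonnegative point masses summing to one gives another valid probability measure on the same sample space.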

Define Probability Measure: Continuous Sample Space 1

It is much harder to define a probability measure on a continuous sample space. First, we need some basic knowledge of set theory.

Definition 2.1.15 (Convergence of Sets)

A sequence of sets $A_1, A_2, \dots, A_n, \dots$ is said to be increasing to $A$ if $A_1 \subset A_2 \subset \dots \subset A_n \subset \dots$ and $A = \bigcup_{n=1}^{\infty} A_n$. We denote this by $\lim_{n \to \infty} A_n = A$ or $A_n \uparrow A$.

A sequence of sets $A_1, A_2, \dots, A_n, \dots$ is said to be decreasing to $A$ if $A_1 \supset A_2 \supset \dots \supset A_n \supset \dots$ and $A = \bigcap_{n=1}^{\infty} A_n$. We denote this by $\lim_{n \to \infty} A_n = A$ or $A_n \downarrow A$.

Define Probability Measure: Continuous Sample Space 2

[Figure: illustration of the convergence of increasing and decreasing sequences of sets.]

Define Probability Measure: Continuous Sample Space 3

Next, we introduce a basic theorem about the continuity of a probability measure. There is a more general definition of the limit of a sequence of sets, but we postpone it, since for now we only want to see how to define a probability measure on a continuous sample space.

Theorem 2.1.16 (Above and Below Continuity of Probability)

Let $(\Omega, \mathcal{F}, P)$ be a probability space. If $A_1, A_2, \dots \in \mathcal{F}$ is increasing or decreasing to a set $A$, then $\lim_{n \to \infty} P(A_n) = P(A)$.

Define Probability Measure: Continuous Sample Space 4

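Proof. We only prove the increasing case. Let $A_0 = \emptyset$ and $B_n = A_n - A_{n-1}$. Then the $B_n$ are pairwise disjoint, $\bigcup_{i=1}^{n} B_i = \bigcup_{i=1}^{n} A_i = A_n$, and $\bigcup_{n=1}^{\infty} B_n = A$. Thus,
\begin{align*}
P(A) = P\Big(\bigcup_{n=1}^{\infty} B_n\Big) &= \sum_{n=1}^{\infty} P(B_n) \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} P(B_i) = \lim_{n \to \infty} P\Big(\bigcup_{i=1}^{n} B_i\Big) \\
&= \lim_{n \to \infty} P\Big(\bigcup_{i=1}^{n} A_i\Big) = \lim_{n \to \infty} P(A_n).
\end{align*}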

Define Probability Measure: Continuous Sample Space 5

Example 2.1.17

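Let $(\Omega, \mathcal{F}, P)$ be a probability space in which $\Omega \subset \mathbb{R}$ is an interval of length $2\pi$ and $P((a, b]) = \frac{b-a}{2\pi}$ for every $(a, b] \subset \Omega$. Find $P([a, b])$.

Solution. Let $A_k = (a - \frac{1}{k}, b]$. Then $A_k \downarrow [a, b]$, and so by Theorem 2.1.16 we have
\[ P([a, b]) = \lim_{k \to \infty} P(A_k) = \lim_{k \to \infty} \frac{1}{2\pi}\Big(b - a + \frac{1}{k}\Big) = \frac{b-a}{2\pi} = P((a, b]). \]

Remark.

Note that $P(\{a\}) = P([a, b] - (a, b]) = 0$. Also, since $\mathbb{Q} \cap \Omega$ is a countable union of single points, countable additivity gives $P(\mathbb{Q} \cap \Omega) = 0$. (Quite different from the discrete case!)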

General Limit of Sets

Definition 2.1.18 (Limit of Sets)

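Let $A_1, A_2, \dots, A_n, \dots$ be a sequence of subsets of $\Omega$.

The limit superior of $A_n$ is defined as
\[ \limsup A_n = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} A_i = \{\omega \in \Omega : \omega \in A_n \text{ for infinitely many } n\}. \]

The limit inferior of $A_n$ is defined as
\[ \liminf A_n = \bigcup_{n=1}^{\infty} \bigcap_{i=n}^{\infty} A_i = \{\omega \in \Omega : \omega \in A_n \text{ for all but finitely many } n\}. \]

We say that $\lim_n A_n = A$ if $A = \limsup A_n = \liminf A_n$.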
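The two definitions can be explored on a computer by truncating the infinite unions and intersections at a finite horizon. This is only a heuristic sketch: it gives the right answer when the tail behavior of the sequence is already visible within the horizon (as for the periodic example below), and the function names and parameters `N` and `horizon` are assumptions made for illustration.

```python
def tail_union(A, n, horizon):
    """Union of A(n), ..., A(horizon): a truncated version of the tail union."""
    out = set()
    for i in range(n, horizon + 1):
        out |= A(i)
    return out

def tail_intersection(A, n, horizon):
    """Intersection of A(n), ..., A(horizon): a truncated version of the tail intersection."""
    out = set(A(n))
    for i in range(n + 1, horizon + 1):
        out &= A(i)
    return out

def lim_sup(A, N, horizon):
    """Intersect the truncated tail unions for n = 1, ..., N."""
    result = tail_union(A, 1, horizon)
    for n in range(2, N + 1):
        result &= tail_union(A, n, horizon)
    return result

def lim_inf(A, N, horizon):
    """Unite the truncated tail intersections for n = 1, ..., N."""
    result = set()
    for n in range(1, N + 1):
        result |= tail_intersection(A, n, horizon)
    return result

def A(n):
    # Hypothetical alternating sequence: {0, 2} for even n, {1, 2} for odd n
    return {0, 2} if n % 2 == 0 else {1, 2}

print(lim_sup(A, 10, 40) == {0, 1, 2})  # True: 0, 1 and 2 each occur infinitely often
print(lim_inf(A, 10, 40) == {2})        # True: only 2 lies in all but finitely many A_n
```

Here the limit superior and limit inferior differ, so this sequence of sets has no limit in the sense of the definition.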

Continuity of Probability Measure 1

Theorem 2.1.19 (Continuity of Probability)

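Let $(\Omega, \mathcal{F}, P)$ be a probability space. If $A_1, A_2, \dots \in \mathcal{F}$ is a sequence of events with $\lim_n A_n = A$, then $\lim_{n \to \infty} P(A_n) = P(A)$.

Proof. Let $B_n = \bigcup_{i=n}^{\infty} A_i$, a decreasing sequence. Thus,
\[ P(\limsup A_n) = P\Big(\bigcap_{n=1}^{\infty} B_n\Big) = \lim_{n \to \infty} P(B_n) \]
by Theorem 2.1.16. Also, let $C_n = \bigcap_{i=n}^{\infty} A_i$, an increasing sequence. Then,
\[ P(\liminf A_n) = P\Big(\bigcup_{n=1}^{\infty} C_n\Big) = \lim_{n \to \infty} P(C_n). \]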

Continuity of Probability Measure 2

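Proof (Cont’d).

Since $\lim_n A_n$ exists, we have $\limsup A_n = \liminf A_n = A$, and so
\[ P(A) = \lim_{n \to \infty} P(B_n) = \lim_{n \to \infty} P(C_n). \]

Since the relationship $C_n \subseteq A_n \subseteq B_n$ holds, we have
\[ P(C_n) \le P(A_n) \le P(B_n). \]

By the squeeze theorem, one gets
\[ P(A) = \lim_{n \to \infty} P(B_n) = \lim_{n \to \infty} P(C_n) = \lim_{n \to \infty} P(A_n). \]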

Random Variable

Understanding the Sample Space

From Example 2.1.17, we know that defining a probability measure on a sample space is not as simple as one might think. A better understanding of σ-algebras and probability is therefore needed to define an appropriate probability measure. Here, we first discuss properties of σ-algebras. This discussion will be very helpful when constructing the framework of random variables.

Understanding the σ-algebra

Question

Given a sample space $\Omega$ and a collection $\mathcal{C}$ of subsets of $\Omega$, does there exist a collection $\mathcal{G}$ of subsets of $\Omega$ such that (1) $\mathcal{C} \subseteq \mathcal{G}$ and (2) $\mathcal{G}$ is a σ-algebra?

The answer is yes: we may take $\mathcal{G}$ to be the collection of all subsets of $\Omega$. However, this choice does not help us understand how to define a suitable probability measure on a complicated space. So our goal is to find the smallest σ-algebra that contains $\mathcal{C}$.

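Understanding the σ-algebra

Definition 2.2.1

Let $\Omega$ be a sample space and $\mathcal{C}$ a collection of subsets of $\Omega$. If $\mathcal{G}$ is the smallest σ-algebra that contains $\mathcal{C}$, we say that $\mathcal{G}$ is generated by $\mathcal{C}$ and denote it by $\mathcal{G} = \sigma(\mathcal{C})$.

Note. Here are two propositions that are easy to check. First, if $\mathcal{C}_1 \subseteq \mathcal{C}_2$, then $\sigma(\mathcal{C}_1) \subseteq \sigma(\mathcal{C}_2)$. Second, $\sigma(\sigma(\mathcal{C})) = \sigma(\mathcal{C})$, since $\sigma(\mathcal{C})$ is already a σ-algebra.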

Borel Set and Borel σ-algebra

Definition 2.2.2 (Borel Sets)

Let $\Omega$ be a topological space and $\mathcal{C}$ the collection of all open subsets of $\Omega$. Then the σ-algebra $\mathcal{B}(\Omega) \equiv \sigma(\mathcal{C})$ is called the Borel σ-algebra. Any element $E \in \mathcal{B}(\Omega)$ is called a Borel set.

Now, let $\mathbb{R}$ be the sample space. Every subset of $\mathbb{R}$ that one meets in everyday work is a Borel set, and it is difficult, but possible, to construct explicitly a subset of $\mathbb{R}$ that is not a Borel set. This means that $\mathcal{B} \equiv \mathcal{B}(\mathbb{R})$ is not equal to the collection of all subsets of $\mathbb{R}$.

Lemma

Events in $\mathcal{B}$ can be quite complicated, but Theorem 2.2.4 shows that $\mathcal{B}$ can be built from a very simple structure. Before proving it, we need a simple result, Lemma 2.2.3.

Lemma 2.2.3

Let $\Omega$ be a set. If $\mathcal{F}_\alpha$ is a σ-algebra on $\Omega$ for each $\alpha$ in some non-empty index set $I$, then $\bigcap_{\alpha \in I} \mathcal{F}_\alpha$ is also a σ-algebra.

Proof. This follows immediately from the definition of a σ-algebra.

Structure of a Borel σ−algebra 1

Theorem 2.2.4

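The Borel σ-algebra $\mathcal{B} \equiv \mathcal{B}(\mathbb{R})$ is generated by each of the following collections of subsets of $\mathbb{R}$:

(1) $\mathcal{C}_1 = \{(a, b) : a < b,\ a \in \mathbb{R},\ b \in \mathbb{R}\}$;

(2) $\mathcal{C}_2 = \{(a, b] : a < b,\ a \in \mathbb{R},\ b \in \mathbb{R}\}$;

(3) $\mathcal{C}_3 = \{[a, b) : a < b,\ a \in \mathbb{R},\ b \in \mathbb{R}\}$;

(4) $\mathcal{C}_4 = \{[a, b] : a < b,\ a \in \mathbb{R},\ b \in \mathbb{R}\}$.

Proof. This is not our main topic, so we only prove (1) as an example.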

Structure of a Borel σ−algebra 2

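Proof (Cont’d).

Let $\mathcal{G}$ be the collection of all open sets in $\mathbb{R}$. Then $\mathcal{B} = \sigma(\mathcal{G})$. Since every interval $(a, b)$ is open, we have $\mathcal{C}_1 \subset \mathcal{G}$, and so $\sigma(\mathcal{C}_1) \subset \sigma(\mathcal{G})$.

Since $(a, \infty) = \bigcup_{n=1}^{\infty} (a, a + n)$ and $(-\infty, b) = \bigcup_{n=1}^{\infty} (b - n, b)$ are countable unions of sets in $\mathcal{C}_1$, we have $(a, \infty)$ and $(-\infty, b)$ in $\sigma(\mathcal{C}_1)$.

To get the reverse inclusion, if $U \subset \mathbb{R}$ is open, it is a countable union of open intervals, and so $U \in \sigma(\mathcal{C}_1)$. Thus $\mathcal{G} \subset \sigma(\mathcal{C}_1)$, which implies that $\sigma(\mathcal{G}) \subset \sigma(\sigma(\mathcal{C}_1)) = \sigma(\mathcal{C}_1)$.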

Measurable Function 1

There is still one thing we need to talk about: the "measurability" of a function. This definition is the core concept behind random variables.

Definition 2.2.5 (Measurability)

Let $(\Omega, \mathcal{F})$ and $(S, \mathcal{F}')$ be two measurable spaces. A function $X : \Omega \to S$ is said to be an $(\mathcal{F}, \mathcal{F}')$-measurable map from $(\Omega, \mathcal{F})$ to $(S, \mathcal{F}')$ if
\[ X^{-1}(B) = \{\omega : X(\omega) \in B\} \in \mathcal{F} \quad \text{for all } B \in \mathcal{F}'. \]

Measurable Function 2

The meaning of $(\mathcal{F}, \mathcal{F}')$-measurability is that we can transfer a probability measure $P$ defined on $\mathcal{F}$ to a new probability measure $P_X$ on $\mathcal{F}'$, and so form a new probability space $(S, \mathcal{F}', P_X)$. This is the structure behind the concept of a random variable. Now, we give the formal definition of a random variable.

Random Variable

Definition 2.2.6 (Random Variable)

Let $(\Omega, \mathcal{F}, P)$ be a probability space. We say that a real-valued function $X : \Omega \to \mathbb{R}$ is an $(\mathcal{F}, \mathcal{B})$-measurable random variable if for any Borel set $B \subset \mathbb{R}$,
\[ X^{-1}(B) = \{\omega : X(\omega) \in B\} \in \mathcal{F}. \]

Remark.

This definition means that a random variable transforms the original probability space $(\Omega, \mathcal{F}, P)$ into a new probability space $(X(\Omega), \mathcal{B}, P_X)$. So far we have not discussed how to transform $P$ into a new probability measure $P_X$ with respect to $X$ on $\mathcal{B}$.

Example of Random Variables

Example 2.2.7 (Random Variables)

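Let $\Omega = [0, 1]$ and $\mathcal{F} = \{B \cap [0, 1] : B \in \mathcal{B}\}$. Is the real-valued function $X_1(\omega) = \omega$, $\omega \in \Omega$, a random variable? How about $X_2(\omega) = \omega^2$, $\omega \in \Omega$?

Solution. For any Borel set $B \in \mathcal{B}$,
\[ X_1^{-1}(B) = \{\omega \in \Omega : X_1(\omega) \in B\} = \{\omega \in [0, 1] : \omega \in B\} = B \cap [0, 1] \in \mathcal{F}. \]
Hence, $X_1(\omega) = \omega$, $\omega \in \Omega$, is a random variable.

For any Borel set $B \in \mathcal{B}$,
\[ X_2^{-1}(B) = \{\omega \in \Omega : X_2(\omega) \in B\} = \{\omega \in [0, 1] : \omega^2 \in B\}. \]
It is hard to check whether $X_2^{-1}(B) \in \mathcal{F}$, especially since we need to check every Borel set $B$!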

Alternative Definition of Random Variable 1

The definition of a random variable is hard to check directly. We need an equivalent definition that is easier to verify. We first introduce a lemma.

Lemma 2.2.8 (Transformation of σ-algebra)

Let $(\Omega, \mathcal{F})$ and $(S, \mathcal{F}')$ be two measurable spaces and $\mathcal{C} \subseteq \mathcal{F}'$ be a collection of subsets of $S$. If $X : \Omega \to S$ is an $(\mathcal{F}, \mathcal{F}')$-measurable map, then $\sigma(X^{-1}(\mathcal{C})) = X^{-1}(\sigma(\mathcal{C}))$, where $X^{-1}(\mathcal{C}) = \{X^{-1}(C) : C \in \mathcal{C}\}$.

Proof. Omitted.

Alternative Definition of Random Variable 2

Theorem 2.2.9 (Equivalent Definition of a Random Variable)

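Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X : \Omega \to \mathbb{R}$ a real-valued function. Then $X$ is a random variable if and only if
\[ X^{-1}((-\infty, r]) = \{\omega : X(\omega) \le r\} \in \mathcal{F} \quad \text{for any } r \in \mathbb{R}. \]

Proof.

If $X$ is a random variable, then since $(-\infty, r]$ is a Borel set (note that the intervals $(-\infty, r]$ generate the Borel σ-algebra on $\mathbb{R}$), $X^{-1}((-\infty, r]) \in \mathcal{F}$ must hold.

To prove the converse, let $\mathcal{C} = \{(-\infty, r] : r \in \mathbb{R}\}$. Then $X^{-1}(\mathcal{B}) = X^{-1}(\sigma(\mathcal{C})) = \sigma(X^{-1}(\mathcal{C}))$; this identity from Lemma 2.2.8 in fact holds for any map $X$. Since $X^{-1}(\mathcal{C}) \subseteq \mathcal{F}$ and $\mathcal{F}$ is a σ-algebra, we have $\sigma(X^{-1}(\mathcal{C})) = X^{-1}(\mathcal{B}) \subseteq \mathcal{F}$, and so $X$ is a random variable.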

Alternative Definition of Random Variable 3

Corollary 2.2.10 (Equivalent Definition of a Random Variable)

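Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X : \Omega \to \mathbb{R}$ a real-valued function. The following statements are equivalent.

(1) $X$ is a random variable.

(2) $X^{-1}((-\infty, r]) = \{\omega : X(\omega) \le r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

(3) $X^{-1}((-\infty, r)) = \{\omega : X(\omega) < r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

(4) $X^{-1}([r, \infty)) = \{\omega : X(\omega) \ge r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

(5) $X^{-1}((r, \infty)) = \{\omega : X(\omega) > r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$.

This gives an easier way to check whether a function is a random variable. For example, now we can answer Example 2.2.7.

If $r < 0$, $X_2^{-1}((-\infty, r]) = \emptyset \in \mathcal{F}$. If $0 \le r \le 1$, $X_2^{-1}((-\infty, r]) = [0, \sqrt{r}] \in \mathcal{F}$ since $\sqrt{r} \le 1$. If $r > 1$, $X_2^{-1}((-\infty, r]) = [0, 1] = \Omega \in \mathcal{F}$. Therefore, $X_2$ is a random variable.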
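The three cases for $X_2(\omega) = \omega^2$ on $\Omega = [0, 1]$ can be sanity-checked by brute force on a grid. The sketch below is only a numerical illustration, not a proof; the function names and the grid size are arbitrary choices, and the case $r > 1$ is taken to give the whole interval $[0, 1]$.

```python
import math

def preimage_interval(r):
    """Closed-form preimage of (-inf, r] under X2(w) = w**2 on [0, 1]; None means the empty set."""
    if r < 0:
        return None
    return (0.0, min(math.sqrt(r), 1.0))   # [0, sqrt(r)] for r <= 1, all of [0, 1] for r > 1

def brute_force_matches(r, grid_size=10_001):
    """Compare the closed form with a direct scan of a grid on [0, 1]."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    brute = [w for w in grid if w * w <= r]
    interval = preimage_interval(r)
    if interval is None:
        return not brute
    lo, hi = interval
    return brute == [w for w in grid if lo <= w <= hi]

for r in (-0.5, 0.25, 0.5, 1.0, 2.0):
    print(r, preimage_interval(r), brute_force_matches(r))
```

In every case the preimage is either empty or a closed interval inside $[0, 1]$, which is why it belongs to $\mathcal{F}$.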

Define the Probability Measure on the Borel Set 1

Here, we still need to discuss one thing: how do we define a probability measure on $\mathcal{B}$ with respect to the random variable $X$? The following theorem gives the answer.

Theorem 2.2.11

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable. Then the function $P_X : \mathcal{B} \to [0, 1]$ defined by
\[ P_X(B) = P(X^{-1}(B)), \quad \forall B \in \mathcal{B}, \]
is a probability measure on $(\mathbb{R}, \mathcal{B})$.

Proof. To prove this theorem, we need to check the axioms of a probability measure.

Define the Probability Measure on the Borel Set 2

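Proof (Cont’d).

(1) $P_X(\mathbb{R}) = P(X^{-1}(\mathbb{R})) = P(\Omega) = 1$.

(2) $P_X(B) = P(X^{-1}(B)) \ge 0$ since $P$ is a probability measure.

(3) For pairwise disjoint sets $B_1, B_2, \dots$ in $\mathcal{B}$, the preimages $X^{-1}(B_i)$ are also pairwise disjoint, so
\[ P_X\Big(\bigcup_i B_i\Big) = P\Big(X^{-1}\Big(\bigcup_i B_i\Big)\Big) = P\Big(\bigcup_i X^{-1}(B_i)\Big) = \sum_i P(X^{-1}(B_i)) = \sum_i P_X(B_i). \]

Therefore, $P_X$ is a probability measure on $(\mathbb{R}, \mathcal{B})$. Note that $P_X$ is called the probability distribution function of the random variable $X$.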
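On a finite sample space the definition $P_X(B) = P(X^{-1}(B))$ can be computed directly. The following sketch uses a hypothetical example that is not in the lecture: two fair dice with $X$ equal to the sum of the faces.

```python
from fractions import Fraction
from itertools import product

# Hypothetical probability space: two fair dice, uniform point masses
omega = list(product(range(1, 7), repeat=2))
P_point = {w: Fraction(1, 36) for w in omega}

def X(w):
    """Random variable: the sum of the two faces."""
    return w[0] + w[1]

def P_X(B):
    """P_X(B) = P(X^{-1}(B)) for a set B of real numbers."""
    preimage = [w for w in omega if X(w) in B]          # X^{-1}(B)
    return sum((P_point[w] for w in preimage), Fraction(0))

print(P_X({7}))                             # 1/6
print(P_X(set(range(2, 13))))               # 1
print(P_X({2, 3}) == P_X({2}) + P_X({3}))   # True: additivity on disjoint sets
```

The three printed checks mirror steps (1) and (3) of the proof: total mass one and additivity over disjoint sets.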

Define the Probability Measure on the Borel Set 3

We now know how to define a probability measure with respect to a random variable. However, this description is complicated, since it defines a probability measure on $(\mathbb{R}, \mathcal{B})$. As a result, in the next section we will introduce a more common, useful, and intuitive function to describe probabilities with respect to $X$: the (cumulative) distribution function $F_X$.

Distribution Function

Distribution Function: Definition

The distribution of a random variable $X$ is usually described by giving its distribution function, not the probability distribution function.

Definition 2.3.1 (Distribution Function)

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable with probability distribution function $P_X$. The distribution function of $X$, written $F_X : \mathbb{R} \to [0, 1]$, is defined by
\[ F_X(x) = P_X((-\infty, x]) = P(\{\omega \in \Omega : X(\omega) \le x\}) = P(X \le x), \quad \forall x \in \mathbb{R}. \]

Distribution Function: Properties 1

The distribution function gives us a simplified framework for discussing a quantitative random phenomenon.

Theorem 2.3.2 (Properties of Distribution Functions)

Any distribution function $F_X$ of a random variable $X$ has the following properties:

(1) $F_X$ is nondecreasing;

(2) $\lim_{x \to \infty} F_X(x) = 1$ and $\lim_{x \to -\infty} F_X(x) = 0$;

(3) $F_X$ is right continuous, that is, $\lim_{x \to a^+} F_X(x) = F_X(a)$;

(4) $F_X(a^-) := \lim_{x \to a^-} F_X(x) = P(X < a)$;

(5) the probability of the event $\{X = a\}$ is $P(X = a) = F_X(a) - F_X(a^-)$.
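For a discrete random variable these properties are easy to see concretely. The sketch below assumes a fair die as a hypothetical example and computes $F_X(x) = P(X \le x)$ and the left limit $F_X(a^-) = P(X < a)$ exactly.

```python
from fractions import Fraction

# Hypothetical example: X is the face of a fair die
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def F(x):
    """Distribution function F_X(x) = P(X <= x)."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

def F_left(a):
    """Left limit F_X(a-) = P(X < a)."""
    return sum((p for k, p in pmf.items() if k < a), Fraction(0))

print(F(0), F(3), F(3.5), F(6))   # 0 1/2 1/2 1
print(F(3) - F_left(3))           # 1/6, the jump at a = 3 equals P(X = 3)
```

The function is flat between integers, jumps by $1/6$ at each face value, and takes the upper value at the jump, which is the right continuity in property (3).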

Distribution Function: Properties 2

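Proof. To prove (1), assume that $a \le b$. Note that $\{\omega : X(\omega) \le a\} \subseteq \{\omega : X(\omega) \le b\}$. Then, by monotonicity,
\[ F_X(a) = P(\{\omega : X(\omega) \le a\}) \le P(\{\omega : X(\omega) \le b\}) = F_X(b). \]

For (2), if $x \to \infty$, then $\{\omega : X(\omega) \le x\} \uparrow \Omega$, so $\lim_{x \to \infty} F_X(x) = \lim_{x \to \infty} P(\{\omega : X(\omega) \le x\}) = P(\Omega) = 1$. Similarly, $x \to -\infty$ implies $\{\omega : X(\omega) \le x\} \downarrow \emptyset$, so $\lim_{x \to -\infty} F_X(x) = 0$.

To prove (3), we observe that if $x \to a^+$, then $\{\omega : X(\omega) \le x\} \downarrow \{\omega : X(\omega) \le a\}$, so $F_X(x) \to F_X(a)$ by Theorem 2.1.16.

To prove (4), if $x \to a^-$, then $\{\omega : X(\omega) \le x\} \uparrow \{\omega : X(\omega) < a\}$, so $F_X(x) \to P(X < a)$.

For (5), note that $P(X = a) = P(X \le a) - P(X < a)$ and use (4).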

Lebesgue-Stieltjes measure 1

Definition 2.3.3 (Lebesgue-Stieltjes Measure)

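Let $\Omega = \mathbb{R}$ and let $\mathcal{C}$ be the collection of intervals of the form $(a, b]$. If $F$ is a function satisfying properties (1) and (3) of Theorem 2.3.2 (nondecreasing and right continuous), then we call it a Stieltjes measure function.

Define a function $l$ by $l((a, b]) = F(b) - F(a)$. Then the function
\[ m^*(E) = \inf\Big\{ \sum_{i=1}^{\infty} l(A_i) : A_i \in \mathcal{C} \text{ for each } i \text{ and } E \subset \bigcup_{i=1}^{\infty} A_i \Big\} \]
is called a Lebesgue-Stieltjes measure. If we define $F(x) = x$, then the measure $m^*$ is called the Lebesgue measure.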

Lebesgue-Stieltjes measure 2

The Lebesgue-Stieltjes measure has many important properties. First, every set in the Borel σ-algebra on $\mathbb{R}$ can be measured by the Lebesgue-Stieltjes measure. (Here, we do not define what the term "can be measured" means.) Second, if $a$ and $b$ are finite, then $m^*((a, b]) = l((a, b])$. The most important special case is the Lebesgue measure, which plays a central role in defining expectation.

Conditions for Distribution Functions 1

Theorem 2.3.4 (Conditions for Distribution Functions)

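If a function $F$ satisfies properties (1), (2) and (3) of Theorem 2.3.2, then it is the distribution function of some random variable.

Proof.

Let $\Omega = (0, 1)$, $\mathcal{F}$ = the Borel sets on $(0, 1)$, and $P$ = the Lebesgue measure.

Define a real-valued function $X : \Omega \to \mathbb{R}$ by $X(\omega) = \sup\{x : F(x) < \omega\}$.

Then, for a fixed number $x$, if $\omega \le F(x)$, then $X(\omega) \le x$ must hold: since $F$ is nondecreasing, every $y$ with $F(y) < \omega \le F(x)$ satisfies $y < x$.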

Conditions for Distribution Functions 2

On the other hand, if $\omega > F(x)$, then since $F$ is right continuous, there is an $\varepsilon > 0$ such that $F(x + \varepsilon) < \omega$, and so $X(\omega) \ge x + \varepsilon > x$. As a result, we have $\{\omega : X(\omega) \le x\} = \{\omega : \omega \le F(x)\}$. Since $P$ is the Lebesgue measure and this set is an interval of length $F(x)$, we have the desired result:
\[ P(\{\omega \in \Omega : X(\omega) \le x\}) = P(\{\omega \in \Omega : \omega \le F(x)\}) = F(x). \]
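The construction $X(\omega) = \sup\{x : F(x) < \omega\}$ in this proof is what is commonly called inverse transform sampling: draw $\omega$ uniformly from $(0, 1)$ and apply $X$. The sketch below approximates the supremum by bisection. The choice of $F$ (an exponential distribution with rate 1), the search bracket `[lo, hi]`, the iteration count and the seed are all assumptions made for illustration.

```python
import math
import random

def F(x):
    """Example distribution function: exponential with rate 1 (an illustrative choice)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def X(omega, lo=-1.0, hi=50.0, iters=60):
    """Approximate sup{x : F(x) < omega} by bisection, assuming it lies in [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if F(mid) < omega:
            lo = mid      # mid still belongs to {x : F(x) < omega}
        else:
            hi = mid      # mid is an upper bound for that set
    return lo

rng = random.Random(0)
samples = [X(rng.random()) for _ in range(20_000)]   # omega drawn uniformly from [0, 1)

for x in (0.5, 1.0, 2.0):
    empirical = sum(s <= x for s in samples) / len(samples)
    print(x, round(empirical, 3), round(F(x), 3))    # empirical P(X <= x) next to F(x)
```

The empirical frequencies of $\{X \le x\}$ should be close to $F(x)$, which is the statement $P(X \le x) = F(x)$ proved above.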

Probability Mass Function

Definition 2.3.5 (Probability Mass Function)

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable. If the image of $X$, say $X(\Omega)$, is a countable set, we call $X$ a discrete random variable. The probability mass function of $X$ is defined by
\[ p_X(x) = P_X(\{x\}) = P(\{\omega : X(\omega) = x\}). \]
In this case, the distribution function of $X$ must have some jumps.

Probability Density Function

Definition 2.3.6 (Probability Density Function)

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X : \Omega \to \mathbb{R}$ a random variable. If the distribution function $F_X$ of $X$ is continuous, we call $X$ a continuous random variable. The function $f_X(x) = \frac{d}{dx} F_X(x)$, defined where the derivative exists, is called the probability density function of $X$.

Probability Density Function

By the fundamental theorem of calculus, we have the relationships
\[ F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \]
and
\[ P_X((a - \varepsilon, a + \varepsilon)) = F_X(a + \varepsilon) - F_X(a - \varepsilon) = \int_{a - \varepsilon}^{a + \varepsilon} f_X(x)\,dx. \]

These properties mean that we can start with $f$ and then define a distribution function $F$. In order to end up with a distribution function $F$, it is necessary and sufficient that $f(x) \ge 0$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
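Starting from a density and recovering the distribution function can be illustrated numerically. The sketch below uses the standard normal density purely as an example of a valid $f$; the trapezoidal rule, the truncation of the lower limit at $-10$ and the step count are assumptions chosen for simplicity.

```python
import math

def f(x):
    """Standard normal density, used here only as an example of a valid f."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(g, a, b, n=20_000):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * (0.5 * g(a) + sum(g(a + k * h) for k in range(1, n)) + 0.5 * g(b))

def F(x, lower=-10.0):
    """F(x) = integral of f from -infinity to x, with the lower limit truncated at `lower`."""
    return integrate(f, lower, x)

total = integrate(f, -10.0, 10.0)
exact_F1 = 0.5 * (1 + math.erf(1 / math.sqrt(2)))    # known closed form for the normal CDF at 1

print(abs(total - 1.0) < 1e-6)       # True: f integrates to one
print(abs(F(1.0) - exact_F1) < 1e-6) # True: numerical F matches the closed form
```

The first check corresponds to the normalization condition on $f$, and the second to the relationship $F(x) = \int_{-\infty}^{x} f(t)\,dt$.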

Next Lecture

The Next Lecture

In the next lecture, we’ll review the idea of a random variable, and then introduce random vectors.
