Probability Theory: Advanced Look
Statistical Methods in Finance
Lecture 2
Ta-Wei Huang
December 7, 2016
Ta-Wei Huang Probability Theory: Advanced Look December 7, 2016 1 / 58
Table of Contents
Probability theory is the way we think about randomness. In financial markets, uncertainty and risk are everywhere, and thus we need probability theory to model market behavior.
1 Sample Space and Event
2 Probability Measure
3 Random Variable
4 Distribution Function
5 Next Lecture
Sample Space and Event
Sample Space: Definition
Definition 2.1.1 (Sample Space)
For an experiment and a given index set A, all possible outcomes ω_α, α ∈ A, are called sample points. The set Ω = {ω_α : α ∈ A}, the collection of all possible outcomes (i.e., the set of all sample points), is called the sample space of that experiment.
1 If A is countable, we say that Ω is a discrete sample space;
2 if A is uncountable, we say that Ω is a continuous sample space.
Sample Space: Examples
Example 2.1.2 (Sample Space)
1 If the experiment is tossing a coin, and the observation is the face of that coin, then the sample space is Ω = {H, T}, which is a discrete sample space.
2 If the experiment is a monetary policy action by the Fed, and the observation is the return on the S&P 500 Index one day after that policy, then Ω = R, which is a continuous sample space.
Event
Definition 2.1.3 (Event)
An event E is any collection of possible outcomes of an experiment, that is, any subset of the sample space Ω.
An event is really a statement about the result of the experiment. For example, the set R_+ = {the S&P 500 has a positive return} is an event for the Ω in Example 2.1.2, case (2).
σ-algebra: Definition
Definition 2.1.4 (σ-algebra)
A system F of subsets of Ω is called a σ-algebra if
(a) Ω ∈ F; (b) A^c ∈ F if A ∈ F;
(c) for A_1, A_2, … ∈ F, the union ⋃_{i=1}^∞ A_i ∈ F.
Why do we need the concept of a σ-algebra? The reason is that we will assign a probability to any event E, which is a subset of the sample space Ω, and we therefore need the family of events to be closed under set operations, so that we can combine events and still compute probabilities.
σ-algebra: Example
Example 2.1.5 (σ-algebra)
1 Let Ω = {1, 2, 3}. Then
  1 F_1 = {∅, {1}, {2, 3}, {1, 2, 3}} is a σ-algebra.
  2 F_2 = {∅, {1}, {2}, {3}, {1, 2, 3}} is not a σ-algebra since for A = {1} ∈ F_2, A^c = {2, 3} ∉ F_2.
2 Let Ω = R. Then F = the collection of all subsets of R is a σ-algebra.
3 Let Ω = N. Then
  1 F_1 = {∅, {1, 3, 5, 7, …}, {2, 4, 6, 8, …}, N} is a σ-algebra.
  2 F_2 = {A ⊆ N : A is countable or A^c is countable} is a σ-algebra.
  3 F_3 = {A ⊆ N : A is finite or A^c is finite} is not a σ-algebra: each singleton {2k} is in F_3, but their countable union, the set of even numbers, is neither finite nor cofinite, so F_3 is not closed under countable unions.
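To make Definition 2.1.4 concrete, here is a small sketch (not from the lecture) that checks the σ-algebra axioms by brute force for a finite Ω; for a finite family, closure under countable unions reduces to closure under pairwise unions. The function name and representation by `frozenset`s are illustrative choices.

```python
def is_sigma_algebra(omega, family):
    """Check the sigma-algebra axioms for a finite sample space.

    For finite families, closure under countable unions reduces to
    closure under pairwise unions.
    """
    fam = {frozenset(s) for s in family}
    omega = frozenset(omega)
    if omega not in fam:                       # (a) contains Omega
        return False
    for a in fam:
        if omega - a not in fam:               # (b) closed under complement
            return False
    for a in fam:
        for b in fam:
            if a | b not in fam:               # (c) closed under (finite) union
                return False
    return True

omega = {1, 2, 3}
F1 = [set(), {1}, {2, 3}, {1, 2, 3}]
F2 = [set(), {1}, {2}, {3}, {1, 2, 3}]
print(is_sigma_algebra(omega, F1))  # True
print(is_sigma_algebra(omega, F2))  # False: {1}^c = {2, 3} is missing
```

This reproduces the two cases of Example 2.1.5(1) mechanically.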
Probability Measure
Defining Probability Measure
Classically, we define the probability P(A) of an event A by P(A) = (# of outcomes in A) / (# of outcomes in Ω), but there are two main problems:
1 it requires a finite sample space, which is not the case in many situations;
2 it requires equally likely (symmetric) outcomes, that is, ∀ω_i ∈ Ω, P({ω_i}) = 1 / (# of outcomes in Ω).
Therefore, we need modern probability theory, built on foundations laid by Andrey Nikolaevich Kolmogorov.
Probability Measure: Definition
Definition 2.1.6 (Measurable Space and Probability Measure)
Let Ω be a non-empty set and let F be a σ-algebra on Ω; then (Ω, F) is called a measurable space. A probability measure is a real-valued function P : F → R such that
(1) P(Ω) = 1;
(2) P(E) ≥ 0 for any event E ∈ F;
(3) (Countable additivity) for any sequence of pairwise disjoint events {E_n} in F, P(⋃_{i=1}^∞ E_i) = ∑_{i=1}^∞ P(E_i).
The triple (Ω, F, P) is called a probability space.
Probability Measure: Remark
Remark on the Definition of Probability Measure
This axiomatic definition makes no attempt to tell us which particular function P to choose.
This definition regards probability as a property of an event in the σ-algebra F on the sample space Ω.
So, we'll further discuss how to define probability measures on discrete and continuous sample spaces, respectively.
Properties of Probability Measure 1
Proposition 2.1.7
Let (Ω, F, P) be a probability space. Then, for any event E ∈ F,
(1) P(∅) = 0; (2) P(E) ≤ 1; (3) P(E^c) = 1 − P(E).
Proof. It is easier to prove (3) first. Since the sets E and E^c form a partition of the sample space Ω, we have P(E ∪ E^c) = P(Ω) = 1 by the first axiom. Also, E and E^c are disjoint, so by countable additivity, P(E ∪ E^c) = P(E) + P(E^c) = 1, and hence P(E^c) = 1 − P(E).
The proof of (1) is similar, so we omit it.
Since P(E^c) = 1 − P(E) ≥ 0, (2) follows immediately.
Properties of Probability Measure 2
Proposition 2.1.8
Let (Ω, F, P) be a probability space. Then, for any events A, B ∈ F,
(1) P(B ∩ A^c) = P(B) − P(A ∩ B);
(2) P(A ∪ B) = P(A) + P(B) − P(A ∩ B);
(3) (Monotonicity) If A ⊆ B, then P(A) ≤ P(B).
Proof. For any sets A and B, B = (B ∩ A^c) ∪ (B ∩ A), a disjoint union. Then P(B) = P((B ∩ A^c) ∪ (B ∩ A)) = P(B ∩ A^c) + P(B ∩ A), which gives (1).
To establish (2), use the disjoint decomposition A ∪ B = A ∪ (B ∩ A^c). (Why?)
To establish (3), combining A ⊆ B ⇒ A = B ∩ A with (1) gives P(B ∩ A^c) = P(B) − P(A) ≥ 0.
Properties of Probability Measure 3
Proposition 2.1.9 (Inclusion-exclusion Identity)
Let (Ω, F, P) be a probability space. For n events A_1, …, A_n ∈ F, define P_1, P_2, …, P_n by
P_1 = ∑_{i=1}^n P(A_i), P_2 = ∑_{1≤i<j≤n} P(A_i ∩ A_j),
P_3 = ∑_{1≤i<j<k≤n} P(A_i ∩ A_j ∩ A_k), …, P_n = P(A_1 ∩ ⋯ ∩ A_n).
Then the probability of the union of A_1, …, A_n is given by
P(A_1 ∪ A_2 ∪ ⋯ ∪ A_n) = ∑_{i=1}^n (−1)^{i+1} P_i.
Proof. By mathematical induction.
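As a sanity check (not part of the lecture), the inclusion-exclusion identity can be verified by brute force on a small finite probability space; the uniform measure on twelve points and the randomly drawn events below are illustrative assumptions.

```python
from itertools import combinations
import random

random.seed(0)
n_points = 12
P = lambda e: len(e) / n_points          # uniform measure: P(E) = #E / #Omega

# three random events A_1, A_2, A_3
A = [set(random.sample(range(n_points), random.randint(3, 8))) for _ in range(3)]

lhs = P(set().union(*A))                 # P(A_1 ∪ A_2 ∪ A_3)

# rhs = sum_i (-1)^{i+1} P_i, where P_i sums P over all i-fold intersections
rhs = 0.0
for k in range(1, len(A) + 1):
    for idx in combinations(range(len(A)), k):
        inter = set.intersection(*(A[i] for i in idx))
        rhs += (-1) ** (k + 1) * P(inter)

print(abs(lhs - rhs) < 1e-12)  # True
```

The same loop verifies the identity for any number of events, which is exactly what the induction proof establishes.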
Properties of Probability Measure 4
Example 2.1.10 (The Matching Problem)
Suppose that each of N men at a party throws his hat into the center of the room. The hats are first mixed up, and then each man randomly selects a hat. What is the probability that none of the men selects his own hat?
Solution. We first calculate the probability of the complementary event that at least one man selects his own hat. Let A_i be the event that the i-th man selects his own hat, i = 1, 2, …, N.
Properties of Probability Measure 5
Solution (Cont'd). Then, by the inclusion-exclusion identity, the probability that at least one of the men selects his own hat is given by
P(A_1 ∪ A_2 ∪ ⋯ ∪ A_N) = ∑_{i=1}^N P(A_i) − ∑_{1≤i_1<i_2≤N} P(A_{i_1} ∩ A_{i_2}) + ⋯ + (−1)^{N+1} P(A_1 ∩ ⋯ ∩ A_N).
The number of all possible outcomes of this experiment is N!. The number of outcomes in the event A_{i_1} ∩ ⋯ ∩ A_{i_n} is (N − n)!. So, the probability P(A_{i_1} ∩ ⋯ ∩ A_{i_n}) = (N − n)!/N!.
Properties of Probability Measure 6
Solution (Cont'd). Now, since there are C(N, n) = N!/(n!(N − n)!) terms in the sum ∑_{1≤i_1<⋯<i_n≤N} P(A_{i_1} ∩ ⋯ ∩ A_{i_n}), we have
∑_{1≤i_1<⋯<i_n≤N} P(A_{i_1} ∩ ⋯ ∩ A_{i_n}) = C(N, n) · (N − n)!/N! = 1/n!.
Thus, the probability of the complementary event is
P(A_1 ∪ A_2 ∪ ⋯ ∪ A_N) = 1 − 1/2! + 1/3! − ⋯ + (−1)^{N+1} 1/N!.
Hence, the probability that none of the men selects his own hat is
1 − (1 − 1/2! + 1/3! − ⋯ + (−1)^{N+1} 1/N!) = ∑_{n=2}^N (−1)^n/n!.
Note that as N → ∞, this probability converges to e^{−1} ≈ 0.36788, not to 0 or 1!
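The matching-problem answer is easy to check empirically. The sketch below (an illustration, not part of the lecture) compares the exact alternating sum with a Monte Carlo estimate obtained by shuffling the hats; function names and the trial count are arbitrary choices.

```python
import random
from math import factorial

random.seed(42)

def prob_no_match_exact(N):
    # sum_{n=2}^{N} (-1)^n / n!  (the derangement probability)
    return sum((-1) ** n / factorial(n) for n in range(2, N + 1))

def prob_no_match_mc(N, trials=200_000):
    # Monte Carlo estimate: shuffle the hats and count the permutations
    # with no fixed point
    count = 0
    for _ in range(trials):
        hats = list(range(N))
        random.shuffle(hats)
        if all(hats[i] != i for i in range(N)):
            count += 1
    return count / trials

N = 10
print(round(prob_no_match_exact(N), 5))   # 0.36788, already close to 1/e
print(prob_no_match_mc(N))                # Monte Carlo estimate near 0.368
```

Even for N = 10 the exact value agrees with the limit e^{−1} ≈ 0.36788 to five decimal places, which is why the answer is often quoted as "about 1/e regardless of N".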
Properties of Probability Measure 7
Proposition 2.1.11 (Boole's Inequality)
Let (Ω, F, P) be a probability space. For countably many events A_1, A_2, … ∈ F,
P(⋃_{i=1}^∞ A_i) ≤ ∑_{i=1}^∞ P(A_i).
Proof. Let A'_i be the events defined by A'_1 = A_1 and A'_i = A_i − ⋃_{j=1}^{i−1} A_j for i = 2, 3, …. Then (1) A'_i ⊂ A_i and (2) the A'_i are pairwise disjoint, since for i > k,
A'_i ∩ A'_k = (A_i − ⋃_{j=1}^{i−1} A_j) ∩ (A_k − ⋃_{j=1}^{k−1} A_j)
= (A_i ∩ (⋂_{j=1}^{i−1} A_j^c)) ∩ (A_k ∩ (⋂_{j=1}^{k−1} A_j^c)),
and A_i ∩ (⋂_{j=1}^{i−1} A_j^c) is contained in A_k^c (because k ≤ i − 1), so the intersection is empty. Since ⋃_i A'_i = ⋃_i A_i, we have
P(⋃_i A_i) = P(⋃_i A'_i) = ∑_i P(A'_i) ≤ ∑_i P(A_i).
Properties of Probability Measure 8
Corollary 2.1.12 (σ-subadditivity)
Let (Ω, F, P) be a probability space. Then for countably many events A_1, A_2, … ∈ F and any A ⊂ ⋃_{i=1}^∞ A_i, P(A) ≤ ∑_{i=1}^∞ P(A_i).
Proof. By Proposition 2.1.8(3) and Proposition 2.1.11.
Properties of Probability Measure 9
Proposition 2.1.13 (Law of Total Probability)
Let (Ω, F, P) be a probability space and let A_1, A_2, … ∈ F be a partition of Ω, that is, the A_i are pairwise disjoint and ⋃_{i=1}^∞ A_i = Ω. Then, for any event B ∈ F, P(B) = ∑_{i=1}^∞ P(B ∩ A_i).
Proof. Since {A_i} is a partition of Ω, we have B = B ∩ Ω = B ∩ (⋃_{i=1}^∞ A_i) = ⋃_{i=1}^∞ (B ∩ A_i), a disjoint union. Therefore, P(B) = P(⋃_{i=1}^∞ (B ∩ A_i)) = ∑_{i=1}^∞ P(B ∩ A_i).
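A tiny numeric illustration of the law of total probability (my own example, not from the slides): a fair six-sided die with the uniform measure, B = "the roll is prime", and a three-block partition of Ω.

```python
from fractions import Fraction

# Uniform measure on a fair six-sided die roll (illustrative assumption)
omega = set(range(1, 7))
P = lambda e: Fraction(len(e & omega), len(omega))

B = {2, 3, 5}                         # the roll is prime
partition = [{1, 2}, {3, 4}, {5, 6}]  # pairwise disjoint, union = omega

total = sum(P(B & A) for A in partition)
print(P(B) == total)  # True: 1/6 + 1/6 + 1/6 = 1/2
```

Splitting B across the partition and summing the pieces recovers P(B) exactly, which is all the proposition says.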
Define Probability Measure: Discrete Sample Space 1
Theorem 2.1.14 (Define Probability Measures in a Discrete Sample Space)
Let Ω = {ω_α : α ∈ N} be a discrete (countable) sample space and F a σ-algebra on Ω. Define a function P : F → R by (1) P(Ω) = 1; (2) P({ω}) ≥ 0 for any ω ∈ Ω; (3) P(E) = ∑_{ω∈E} P({ω}). Then P is a probability measure.
Proof. Axiom (1) is true by the definition of P. Since P({ω}) ≥ 0 for any ω ∈ Ω, for any event E ∈ F we have P(E) = ∑_{ω∈E} P({ω}) ≥ 0, so axiom (2) holds.
Define Probability Measure: Discrete Sample Space 2
Proof (Cont'd). Let {E_n} be a sequence of pairwise disjoint events in F. Then,
P(⋃_{i=1}^∞ E_i) = ∑_{ω ∈ ⋃_{i=1}^∞ E_i} P({ω}) = ∑_{i=1}^∞ ∑_{ω ∈ E_i} P({ω}) = ∑_{i=1}^∞ P(E_i),
so axiom (3) holds.
Remark on Theorem 2.1.14
The triple (Ω, F, P) in the above theorem is called a discrete probability space.
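The recipe of Theorem 2.1.14 can be sketched directly: pick nonnegative point masses summing to 1 and define P(E) by summation. The geometric weights p(ω) = (1/2)^{ω+1} on Ω = {0, 1, 2, …} below are an illustrative assumption, and the sums are truncated for the numeric demo.

```python
from math import isclose

# Point masses p(w) = (1/2)^(w+1) on Omega = {0, 1, 2, ...}; they sum to 1.
p = lambda w: 0.5 ** (w + 1)

def P(event, tail=200):
    # P(E) = sum of point masses over E (truncated at `tail` for the demo;
    # the neglected mass is below 2^-200)
    return sum(p(w) for w in event if w < tail)

evens = range(0, 200, 2)
odds = range(1, 200, 2)
print(isclose(P(evens) + P(odds), 1.0))       # additivity over a partition
print(isclose(P(evens), 2 / 3, rel_tol=1e-9))  # sum_k (1/2)^(2k+1) = 2/3
```

Countable additivity holds automatically here because absolutely convergent series can be summed in any order.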
Define Probability Measure: Continuous Sample Space 1
It is much harder to define a probability measure on a continuous sample space. First, we need some basic set theory.
Definition 2.1.15 (Convergence of Sets)
A sequence of sets A_1, A_2, … is said to be increasing to A if A_1 ⊂ A_2 ⊂ ⋯ ⊂ A_n ⊂ ⋯ and A = ⋃_{n=1}^∞ A_n. We denote this by lim_{n→∞} A_n = A or A_n ↑ A.
A sequence of sets A_1, A_2, … is said to be decreasing to A if A_1 ⊃ A_2 ⊃ ⋯ ⊃ A_n ⊃ ⋯ and A = ⋂_{n=1}^∞ A_n. We denote this by lim_{n→∞} A_n = A or A_n ↓ A.
Define Probability Measure: Continuous Sample Space 2
[Figure: an increasing sequence of sets converging to its union, and a decreasing sequence converging to its intersection.]
Define Probability Measure: Continuous Sample Space 3
Next, we introduce a basic theorem about the continuity of a probability measure. There is a more general definition of the limit of a sequence of sets (Definition 2.1.18 below), but here we first treat the monotone case, since our aim is to see how to define a probability measure on a continuous sample space.
Theorem 2.1.16 (Continuity from Above and Below)
Let (Ω, F, P) be a probability space. If A_1, A_2, … ∈ F is increasing/decreasing to a set A, then lim_{n→∞} P(A_n) = P(A).
Define Probability Measure: Continuous Sample Space 4
Proof. We prove only the increasing case. Let B_1 = A_1 and B_n = A_n − A_{n−1} for n ≥ 2. Then the B_n are pairwise disjoint, ⋃_{i=1}^n B_i = ⋃_{i=1}^n A_i = A_n, and ⋃_{n=1}^∞ B_n = A. Thus,
P(A) = P(⋃_{n=1}^∞ B_n) = ∑_{n=1}^∞ P(B_n)
= lim_{n→∞} ∑_{i=1}^n P(B_i) = lim_{n→∞} P(⋃_{i=1}^n B_i)
= lim_{n→∞} P(⋃_{i=1}^n A_i) = lim_{n→∞} P(A_n).
Define Probability Measure: Continuous Sample Space 5
Example 2.1.17
Let Ω = (0, 2π] and let P be a probability measure with P((a, b]) = (b − a)/(2π) for every interval (a, b] ⊆ Ω. Find P([a, b]).
Solution. Let A_k = (a − 1/k, b]. Then A_k ↓ [a, b], and so we have
P([a, b]) = lim_{k→∞} P(A_k) = lim_{k→∞} (b − a + 1/k)/(2π) = (b − a)/(2π) = P((a, b]).
Remark.
Note that P({a}) = P([a, b] − (a, b]) = 0. Also, the probability P(Q ∩ Ω) = 0. (Quite different from the discrete case!)
General Limit of Sets
Definition 2.1.18 (Limit of Sets)
Let A_1, A_2, … be a sequence of subsets of Ω.
The limit superior of {A_n} is defined as
lim sup A_n = ⋂_{n=1}^∞ ⋃_{i=n}^∞ A_i = {ω ∈ Ω : ω ∈ A_n for infinitely many n}.
The limit inferior of {A_n} is defined as
lim inf A_n = ⋃_{n=1}^∞ ⋂_{i=n}^∞ A_i = {ω ∈ Ω : ω ∈ A_n for all but finitely many n}.
We say that lim_n A_n = A if A = lim sup A_n = lim inf A_n.
Continuity of Probability Measure 1
Theorem 2.1.19 (Continuity of Probability)
Let (Ω, F, P) be a probability space. If A_1, A_2, … ∈ F is a sequence of subsets of Ω with lim_n A_n = A, then lim_{n→∞} P(A_n) = P(A).
Proof. Let B_n = ⋃_{i=n}^∞ A_i, a decreasing sequence. Thus,
P(lim sup A_n) = P(⋂_{n=1}^∞ B_n) = lim_{n→∞} P(B_n)
by Theorem 2.1.16. Also, let C_n = ⋂_{i=n}^∞ A_i, an increasing sequence. Then,
P(lim inf A_n) = P(⋃_{n=1}^∞ C_n) = lim_{n→∞} P(C_n).
Continuity of Probability Measure 2
Proof (Cont'd).
Since lim_n A_n exists, we have lim sup A_n = lim inf A_n = A, and so
P(A) = lim_{n→∞} P(B_n) = lim_{n→∞} P(C_n).
Since the relationship C_n ⊆ A_n ⊆ B_n holds, we have
P(C_n) ≤ P(A_n) ≤ P(B_n).
By the squeeze theorem, one gets
P(A) = lim_{n→∞} P(B_n) = lim_{n→∞} P(C_n) = lim_{n→∞} P(A_n).
Random Variable
Understanding the Sample Space
From Example 2.1.17, we know that defining a probability measure on a sample space is not as simple as one might think. Therefore, a deeper understanding of σ-algebras and probability is needed to help us define an appropriate probability measure. Here, we first discuss properties of σ-algebras. The discussion will be very helpful when constructing the framework of random variables.
Understanding the σ-algebra
Question
Given a sample space Ω and a collection C of subsets of Ω, does there exist a collection G of subsets of Ω such that (1) C ⊆ G and (2) G is a σ-algebra?
The answer is definitely yes: we may take G to be the collection of all subsets of Ω. However, this choice does not help us understand how to define a suitable probability measure on a complicated space. So our goal is to find the smallest σ-algebra that contains C.
Understanding the σ-algebra
Definition 2.2.1
Let Ω be a sample space and C a collection of subsets of Ω. If G is the smallest σ-algebra that contains C, we say that G is generated by C and denote it by G = σ(C).
Note. Here are two propositions that are easy to check. First, if C_1 ⊆ C_2, then σ(C_1) ⊆ σ(C_2). Second, σ(σ(C)) = σ(C), since σ(C) is itself a σ-algebra.
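For a finite Ω, the generated σ-algebra σ(C) can actually be computed by repeatedly closing the family under complement and union until nothing new appears. This is a sketch of my own (the fixed-point approach only works in the finite case), not a construction from the lecture.

```python
def generate_sigma(omega, C):
    """Compute sigma(C) for a *finite* Omega by closure iteration."""
    omega = frozenset(omega)
    fam = {frozenset(), omega} | {frozenset(s) for s in C}
    while True:
        new = set(fam)
        new |= {omega - a for a in fam}                 # complements
        new |= {a | b for a in fam for b in fam}        # pairwise unions
        if new == fam:                                  # fixed point reached
            return fam
        fam = new

G = generate_sigma({1, 2, 3}, [{1}])
print(len(G))  # 4: the sets {}, {1}, {2, 3}, {1, 2, 3}
```

With C = {{1}} on Ω = {1, 2, 3} this recovers exactly the σ-algebra F_1 of Example 2.1.5(1), which is indeed the smallest one containing {1}.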
Borel Set and Borel σ-algebra
Definition 2.2.2 (Borel Sets)
Let Ω be a topological space and C the collection of all open subsets of Ω. Then the σ-algebra B(Ω) ≡ σ(C) is called a Borel σ-algebra. Any element E ∈ B(Ω) is called a Borel set.
Now, let R be the sample space. Then every subset of R that you meet in everyday practice is a Borel set, and it is difficult, but possible, to explicitly construct a subset of R that is not a Borel set. This means that B ≡ B(R) is not equal to the collection of all subsets of R.
Lemma
Events in B can be quite complicated, but Theorem 2.2.4 shows that B can be generated from a very easily understood family of sets. Before proving it, we need the simple result in Lemma 2.2.3.
Lemma 2.2.3
Let Ω be a set. If F_α is a σ-algebra for each α in some non-empty index set I, then ⋂_{α∈I} F_α is also a σ-algebra.
Proof. This follows immediately from the definition of a σ-algebra.
Structure of a Borel σ-algebra 1
Theorem 2.2.4
The Borel σ-algebra B ≡ B(R) is generated by each of the following collections of subsets of R:
(1) C_1 = {(a, b) : a < b, a ∈ R, b ∈ R};
(2) C_2 = {(a, b] : a < b, a ∈ R, b ∈ R};
(3) C_3 = {[a, b) : a < b, a ∈ R, b ∈ R};
(4) C_4 = {[a, b] : a < b, a ∈ R, b ∈ R}.
Proof. This is not our main topic, so we only prove (1) as an example.
Structure of a Borel σ-algebra 2
Proof (Cont'd).
Let G be the collection of all open sets in R. Then B = σ(G). Since every interval (a, b) is open, we have C_1 ⊂ G, and so σ(C_1) ⊂ σ(G).
Note also that (a, ∞) = ⋃_{n∈N} (a, a + n) and (−∞, b) = ⋃_{n∈N} (b − n, b), so (a, ∞) and (−∞, b) are in σ(C_1).
To get the reverse inclusion, note that every open set U ⊂ R is a countable union of open intervals, so U ∈ σ(C_1); thus G ⊂ σ(C_1). This implies that σ(G) ⊂ σ(C_1).
Measurable Function 1
There is still one thing we need to talk about: the "measurability" of a function. This concept is the core idea behind random variables.
Definition 2.2.5 (Measurability)
Let (Ω, F) and (S, F′) be two measurable spaces. A function X : Ω → S is said to be an (F, F′)-measurable map from (Ω, F) to (S, F′) if
X^{−1}(B) = {ω : X(ω) ∈ B} ∈ F, ∀B ∈ F′.
Measurable Function 2
The point of (F, F′)-measurability is that we can transform the original measure defined on F into a new probability measure P_X on F′, and thereby form a new probability space (S, F′, P_X). This is the structure behind the concept of a random variable. Now, we give the formal definition of a random variable.
Random Variable
Definition 2.2.6 (Random Variable)
Let (Ω, F, P) be a probability space. We say that a real-valued function X : Ω → R is an (F, B)-measurable random variable if for any Borel set B ⊂ R, X^{−1}(B) = {ω : X(ω) ∈ B} ∈ F.
Remark.
This definition means that a random variable transforms the original probability space (Ω, F, P) into a new probability space (X(Ω), B, P_X). We have not yet discussed how to transform P into a new probability measure P_X with respect to X on B.
Example of Random Variables
Example 2.2.7 (Random Variables)
Let Ω = [0, 1] and F = {B ∩ [0, 1] : B ∈ B}. Is the real-valued function X_1(ω) = ω, ω ∈ Ω, a random variable? How about X_2(ω) = ω², ω ∈ Ω?
Solution. For any Borel set B ∈ B,
X_1^{−1}(B) = {ω ∈ Ω : X_1(ω) ∈ B} = {ω ∈ [0, 1] : ω ∈ B} = B ∩ [0, 1] ∈ F.
Hence, X_1(ω) = ω, ω ∈ Ω, is a random variable.
For any Borel set B ∈ B,
X_2^{−1}(B) = {ω ∈ Ω : X_2(ω) ∈ B} = {ω ∈ [0, 1] : ω² ∈ B}.
It is hard to check whether X_2^{−1}(B) ∈ F, especially since we would need to check every Borel set B!
Alternative Definition of Random Variable 1
The definition of a random variable is too hard to check directly! We need an equivalent definition. We first introduce a lemma.
Lemma 2.2.8 (Transformation of σ-algebra)
Let (Ω, F) and (S, F′) be two measurable spaces and C ⊆ F′ a collection of subsets of S. If X : Ω → S is an (F, F′)-measurable map, then σ(X^{−1}(C)) = X^{−1}(σ(C)).
Proof. Quite involved, so we skip it.
Alternative Definition of Random Variable 2
Theorem 2.2.9 (Equivalent Definition of a Random Variable)
Let (Ω, F, P) be a probability space and X : Ω → R a real-valued function. Then X is a random variable if and only if
X^{−1}((−∞, r]) = {ω : X(ω) ≤ r} ∈ F for every r ∈ R.
Proof.
If X is a random variable, then since (−∞, r] is a Borel set (note that the intervals (−∞, r] generate the Borel σ-algebra on R), X^{−1}((−∞, r]) ∈ F must hold.
To prove the converse, let C = {(−∞, r] : r ∈ R}. Then X^{−1}(B) = X^{−1}(σ(C)) = σ(X^{−1}(C)). Since X^{−1}(C) ⊆ F, we have σ(X^{−1}(C)) = X^{−1}(B) ⊂ F, and so X is a random variable.
Alternative Definition of Random Variable 3
Corollary 2.2.10 (Equivalent Definition of a Random Variable)
Let (Ω, F, P) be a probability space and X : Ω → R a real-valued function. The following statements are equivalent.
(1) X is a random variable.
(2) X^{−1}((−∞, r]) = {ω : X(ω) ≤ r} ∈ F for every r ∈ R.
(3) X^{−1}((−∞, r)) = {ω : X(ω) < r} ∈ F for every r ∈ R.
(4) X^{−1}([r, ∞)) = {ω : X(ω) ≥ r} ∈ F for every r ∈ R.
(5) X^{−1}((r, ∞)) = {ω : X(ω) > r} ∈ F for every r ∈ R.
This gives an easier criterion for checking whether a function is a random variable. For example, we can now answer Example 2.2.7.
Alternative Definition of Random Variable 4
Now we can answer Example 2.2.7.
Example 2.2.7 (Random Variables)
Let Ω = [0, 1] and F = {B ∩ [0, 1] : B ∈ B}. Is the real-valued function X_1(ω) = ω, ω ∈ Ω, a random variable? How about X_2(ω) = ω², ω ∈ Ω?
If r < 0, X_2^{−1}((−∞, r]) = ∅ ∈ F. If 0 ≤ r ≤ 1, X_2^{−1}((−∞, r]) = [0, √r] ∈ F since √r ≤ 1. If r > 1, X_2^{−1}((−∞, r]) = [0, 1] = Ω ∈ F. Therefore, X_2 is a random variable.
Define the Probability Measure on the Borel Set 1
We still need to discuss one thing: how do we define a probability measure on B with respect to the random variable X? The following theorem gives the answer.
Theorem 2.2.11
Let (Ω, F, P) be a probability space and X : Ω → R a random variable. Then the function P_X : B → [0, 1] defined by P_X(B) = P(X^{−1}(B)), ∀B ∈ B, is a probability measure on (R, B).
Proof. To prove this theorem, we need to check the axioms of a probability measure.
Define the Probability Measure on the Borel Set 2
Proof (Cont'd).
(1) P_X(R) = P(X^{−1}(R)) = P(Ω) = 1.
(2) P_X(B) = P(X^{−1}(B)) ≥ 0 since P is a probability measure.
(3) For pairwise disjoint sets B_i in B,
P_X(⋃_i B_i) = P(X^{−1}(⋃_i B_i)) = P(⋃_i X^{−1}(B_i)) = ∑_i P(X^{−1}(B_i)) = ∑_i P_X(B_i).
Therefore, P_X is a probability measure on (R, B). Note that P_X is called the probability distribution function of the random variable X.
Define the Probability Measure on the Borel Set 3
We now know how to define a probability measure with respect to a random variable. However, this construction is cumbersome, since it defines a probability measure on all of (R, B). As a result, in the next section we will introduce a more common, useful, and intuitive function to describe the probability law of X: the (cumulative) distribution function F_X.
Distribution Function
Distribution Function: Definition
The distribution of a random variable X is usually described by giving its distribution function rather than the probability distribution function P_X.
Definition 2.3.1 (Distribution Function)
Let (Ω, F, P) be a probability space and X : Ω → R a random variable with probability distribution function P_X. The distribution function of X, written F_X : R → [0, 1], is defined by
F_X(x) = P_X((−∞, x]) = P({ω ∈ Ω : X(ω) ≤ x}) = P(X ≤ x), ∀x ∈ R.
Distribution Function: Properties 1
The distribution function gives us a simplified framework for discussing a quantitative random phenomenon.
Theorem 2.3.2 (Properties of Distribution Functions)
Any distribution function F_X of a random variable X has the following properties:
(1) F_X is nondecreasing;
(2) lim_{x→∞} F_X(x) = 1 and lim_{x→−∞} F_X(x) = 0;
(3) F_X is right continuous, that is, lim_{x→a+} F_X(x) = F_X(a);
(4) F_X(a−) := lim_{x→a−} F_X(x) = P(X < a);
(5) the probability of the event {X = a} is P(X = a) = F_X(a) − F_X(a−).
Distribution Function: Properties 2
Proof. To prove (1), assume that a ≤ b. Note that {ω : X(ω) ≤ a} ⊆ {ω : X(ω) ≤ b}. Then,
F_X(a) = P({ω : X(ω) ≤ a}) ≤ P({ω : X(ω) ≤ b}) = F_X(b).
For (2), if x → ∞, then {ω : X(ω) ≤ x} ↑ Ω, so lim_{x→∞} F_X(x) = lim_{x→∞} P({ω : X(ω) ≤ x}) = P(Ω) = 1. The fact that x → −∞ ⇒ {ω : X(ω) ≤ x} ↓ ∅ implies lim_{x→−∞} F_X(x) = 0.
To prove (3), we observe that if x → a+, then {ω : X(ω) ≤ x} ↓ {ω : X(ω) ≤ a}.
To prove (4), if x → a−, then {ω : X(ω) ≤ x} ↑ {ω : X(ω) < a}.
For (5), note that P(X = a) = P(X ≤ a) − P(X < a) and use (3) and (4).
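Properties (1), (2), and (5) can be illustrated numerically for a discrete distribution. The fair six-sided die below is an illustrative assumption, and the left limit F(a−) is approximated by evaluating just below a.

```python
from fractions import Fraction

# CDF of a fair six-sided die: F(x) = P(X <= x). The die is an
# illustrative assumption; exact arithmetic via Fraction.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def F(x):
    return sum(p for k, p in pmf.items() if k <= x)

def F_left(a, eps=Fraction(1, 10**9)):
    # numerical stand-in for the left limit F(a-)
    return F(a - eps)

xs = [Fraction(i, 2) for i in range(0, 15)]           # grid 0, 0.5, ..., 7
assert all(F(x) <= F(y) for x, y in zip(xs, xs[1:]))  # (1) nondecreasing
print(F(Fraction(3)) - F_left(3))  # 1/6: the jump at a equals P(X = 3), (5)
print(F(10), F(0))                 # 1 0: the two limits of property (2)
```

The CDF of a discrete random variable is a step function, and property (5) reads the point masses off from the jump heights.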
Lebesgue-Stieltjes measure 1
Definition 2.3.3 (Lebesgue-Stieltjes Measure)
Let Ω = R and let C be the collection of intervals of the form (a, b]. Let F be a function satisfying properties (1) and (3) of Theorem 2.3.2 (nondecreasing and right continuous); such an F is called a Stieltjes measure function.
Define a function l by l((a, b]) = F(b) − F(a). Then the function
m*(E) = inf { ∑_{i=1}^∞ l(A_i) : A_i ∈ C for each i and E ⊂ ⋃_{i=1}^∞ A_i }
is called a Lebesgue-Stieltjes measure. If we take F(x) = x, the resulting measure m* is called the Lebesgue measure.
Lebesgue-Stieltjes measure 2
Note that the Lebesgue-Stieltjes measure has many important properties. First, every set in the Borel σ-algebra on R can be measured by the Lebesgue-Stieltjes measure. (Here, we do not define the term "can be measured".) Second, if a and b are finite, then m*((a, b]) = l((a, b]). The most important special case is the Lebesgue measure, which plays a central role when defining expectation.
Conditions for Distribution Functions 1
Theorem 2.3.4 (Conditions for Distribution Functions)
If a function F satisfies properties (1), (2), and (3) of Theorem 2.3.2, then it is the distribution function of some random variable.
Proof.
Let Ω = (0, 1), F = the Borel sets on (0, 1), and P = the Lebesgue measure.
Define a real-valued function X : Ω → R by X(ω) = sup{x : F(x) < ω}. Then, for a fixed number x, if ω ≤ F(x), then X(ω) ≤ x by the definition of X.
Conditions for Distribution Functions 2
On the other hand, if ω > F(x), then since F is right continuous, there is an ε > 0 so that F(x + ε) < ω, and so X(ω) ≥ x + ε > x. As a result, we have {ω : X(ω) ≤ x} = {ω : ω ≤ F(x)}. Since P is the Lebesgue measure on (0, 1), we have the desired result:
P({ω ∈ Ω : X(ω) ≤ x}) = P({ω : ω ≤ F(x)}) = P((0, F(x)]) = F(x).
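The construction in this proof, X(ω) = sup{x : F(x) < ω} applied to a uniform draw ω from (0, 1), is precisely the inverse-transform sampling method. A sketch follows, using F(x) = 1 − e^{−x} (the Exp(1) distribution, an illustrative choice) whose generalized inverse has the closed form −log(1 − ω); the function names are my own.

```python
import random
from math import log, exp

random.seed(1)

def sample_exp():
    # a uniform draw plays the role of omega under Lebesgue measure on (0,1)
    omega = random.random()
    # X(omega) = sup{x : F(x) < omega} = -log(1 - omega) for F(x) = 1 - e^{-x}
    return -log(1.0 - omega)

n = 200_000
draws = [sample_exp() for _ in range(n)]

# the empirical CDF at x = 1 should be close to F(1) = 1 - e^{-1} ≈ 0.632
ecdf_1 = sum(d <= 1.0 for d in draws) / n
print(abs(ecdf_1 - (1 - exp(-1))) < 0.01)  # True
```

This is why Theorem 2.3.4 matters in practice: any F with properties (1)-(3) can be sampled from a single uniform random number.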
Probability Mass Function
Definition 2.3.5 (Probability Mass Function)
Let (Ω, F, P) be a probability space and X : Ω → R a random variable. If the image of X, say X(Ω), is a countable set, we call X a discrete random variable. The probability mass function of X is defined by
p_X(x) = P_X({x}) = P({ω : X(ω) = x}).
In this case, the distribution function of X has jumps at the points x where p_X(x) > 0.
Probability Density Function
Definition 2.3.6 (Probability Density Function)
Let (Ω, F, P) be a probability space and X : Ω → R a random variable. If the distribution function F_X of X is continuous, we call X a continuous random variable. The function f_X(x) = (d/dx) F_X(x) is called the probability density function of X.
Probability Density Function
By the fundamental theorem of calculus, we have the relationships
F_X(x) = ∫_{−∞}^x f(t) dt and P_X((a − ε, a + ε)) = F_X(a + ε) − F_X(a − ε) = ∫_{a−ε}^{a+ε} f(x) dx.
These properties mean that we can start with f and then define a distribution function F. In order to end up with a distribution function F_X, it is necessary and sufficient that f(x) ≥ 0 for all x and ∫_{−∞}^∞ f(x) dx = 1.
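The density/CDF relationship can be checked numerically. The density f(x) = 2x on [0, 1] (so that F(x) = x²) is an illustrative assumption, and the integral is approximated by a simple midpoint rule.

```python
# Numeric sanity check of F_X(x) = integral of f up to x for an
# illustrative density f(x) = 2x on [0, 1], whose CDF is F(x) = x^2.
def f(x):
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def F(x, steps=100_000):
    # midpoint-rule integral of f over (-inf, x]; f vanishes below 0
    if x <= 0.0:
        return 0.0
    h = min(x, 1.0) / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h

print(abs(F(1.0) - 1.0) < 1e-6)    # total mass: f integrates to 1
print(abs(F(0.5) - 0.25) < 1e-6)   # F(0.5) = 0.5^2
```

Both checks confirm the necessary-and-sufficient conditions stated above: f is nonnegative and integrates to 1, and integrating it recovers F_X.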
Next Lecture
The Next Lecture
In the next lecture, we'll review the idea of the random variable, and then introduce the random vector.