Continuous Distributions &
Expectation, Variance, Moment…
May 8, 2019
来嶋 秀治 (Shuji Kijima)
Dept. Informatics,
Graduate School of ISEE
確率統計特論 (Probability & Statistics)
Lesson 3
1. (univariate) continuous distributions
Continuous roulette
Ω = {θ : 0 ≤ θ < 2π},  ℱ = 2^Ω
Pr[X = θ] = ?   (θ ∈ Ω)
Pr[X = π/4] = 0 ???
(continuous) uniform distr.
Ω = [0, 2π)
Pr[X = π/4] = 0 ???
Pr[X ≤ π/4] = 1/8
⇒ the cumulative distribution function seems appropriate.
continuous distr. (distr. on uncountable set R)
probability density function (確率密度関数)
f(x) = (d/dx) F(x)
(cumulative) distribution function ((累積)分布関数)
F(x) = Pr[X ≤ x]   differentiable (continuous)
[figure: graph of a distribution function F(x), increasing from 0 to 1]
Continuous Distribution Function F: ℝ → ℝ≥0
1. F(−∞) = 0, F(+∞) = 1
2. Monotone non-decreasing (単調非減少)
3. Differentiable* (微分可能)
*in the effective domain.
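The relation f(x) = (d/dx) F(x) can be checked numerically. A minimal Python sketch, using the continuous roulette U(0, 2π) as the example (the step h and the test points are arbitrary choices):

import numpy as np

# Rough numerical check that the pdf is the derivative of the CDF,
# using the continuous roulette U(0, 2*pi) from the previous slides.
a, b = 0.0, 2 * np.pi
F = lambda t: (t - a) / (b - a)               # CDF of U(a, b) on [a, b]
f = lambda t: np.full_like(t, 1.0 / (b - a))  # pdf of U(a, b)

x = np.linspace(a + 0.1, b - 0.1, 5)
h = 1e-6
dF = (F(x + h) - F(x - h)) / (2 * h)          # central finite difference of F
print(np.max(np.abs(dF - f(x))))              # ~0, i.e. dF/dx = f on (a, b)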
Uniform distr. (一様分布) U(a, b)
Ω = [a, b]
f(x) = 1/(b − a)   (a ≤ x ≤ b)
F(x) = (x − a)/(b − a)   (a ≤ x ≤ b)
continuous roulette:
Ω = (0, 2π],  ℱ = 2^Ω
F(x) = x/(2π)   (0 < x ≤ 2π)
f(x) = 1/(2π)   (0 < x ≤ 2π)
Normal distr. (正規分布) N(μ, σ²)
Ω = (−∞, ∞)
f(x) = (1/(√(2π)·σ)) exp(−(1/2)·((x − μ)/σ)²)   (−∞ < x < ∞)
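A quick numerical sanity check of this density (a small Python sketch; the values of μ and σ are arbitrary): it integrates to 1 and its mean is μ.

import numpy as np

# Riemann-sum check that the N(mu, sigma^2) density integrates to 1
# and has mean mu, on a wide and fine grid.
mu, sigma = 2.0, 1.5
x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 200_001)
f = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
dx = x[1] - x[0]
print((f * dx).sum())        # ≈ 1.0
print((x * f * dx).sum())    # ≈ mu = 2.0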
Exponential distr. (指数分布) Ex(λ)   (λ > 0)
Ω = [0, ∞)
f(x) = λ e^(−λx)   (x ≥ 0)
Gamma distr. (ガンマ分布) G(α, ν)   (α > 0, ν > 0)
Ω = [0, ∞)
f(x) = (1/Γ(ν)) α^ν x^(ν−1) e^(−αx)   (x ≥ 0)
where Γ(ν) = ∫_0^∞ t^(ν−1) e^(−t) dt.
remark that Γ(1) = 1,  Γ(ν) = (ν − 1) Γ(ν − 1),  hence Γ(ν) = (ν − 1)!   (ν = 1, 2, …)
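These identities are easy to confirm with the standard library (a small Python sketch; ν = 4.7 and n = 6 are arbitrary test values):

import math

# Checking the stated Gamma-function identities with math.gamma.
print(math.gamma(1.0))                                # Γ(1) = 1
nu = 4.7
print(math.gamma(nu), (nu - 1) * math.gamma(nu - 1))  # Γ(ν) = (ν−1)·Γ(ν−1)
n = 6
print(math.gamma(n), math.factorial(n - 1))           # Γ(6) = 5! = 120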
Some Distributions
Discrete distributions
(1) Bernoulli B(1,p)
(2) Binomial B(n,p)
#heads during tossing n coins.
(3) Geometric Ge(p)
# tails before a head.
(4) Poisson Po(λ)
Continuous distributions
(1) Uniform U(a,b)
(2) Exponential Ex(λ)
(3) Normal N(μ, σ²)
(4) Beta Be(α, β)
(5) Gamma G(α, ν)
Distributions of random variables X and Y on (Ω, ℱ, P).
Ex1. two dice.
Ω ={(1,1),(1,2),…,(6,5),(6,6)}
X = sum of casts
Y = product of casts
Ex2. poker
choose five cards,
X = # of A’s
Y = # of spades
Joint Distribution (同時分布; 結合分布)
Joint distribution
F(x, y) ≔ Pr[X ≤ x, Y ≤ y]
(pdf: f(x, y) ≔ ∂²/(∂x∂y) F(x, y))
cf. multivariate distribution
multivariate discrete distribution
distr. fnc.: F(x, y) ≔ Pr[(X, Y) ≤ (x, y)] = Pr[X ≤ x, Y ≤ y]
pmf: f(x, y) ≔ Pr[(X, Y) = (x, y)] = Pr[X = x, Y = y]
multivariate continuous distribution
distr. fnc.: F(x, y) ≔ Pr[(X, Y) ≤ (x, y)] = Pr[X ≤ x, Y ≤ y]
pdf: f(x, y) ≔ ∂²/(∂x∂y) F(x, y)
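For Ex1 above (two dice, X = sum of casts, Y = product of casts), the joint pmf and the marginals obtained from it can be tabulated directly; a small Python sketch:

from itertools import product
from collections import defaultdict

# Joint pmf of X = sum and Y = product for two fair dice (Ex1 above),
# plus the marginal pmfs obtained by summing the joint pmf over one variable.
joint = defaultdict(float)
for d1, d2 in product(range(1, 7), repeat=2):    # 36 equally likely casts
    joint[(d1 + d2, d1 * d2)] += 1 / 36

fX, fY = defaultdict(float), defaultdict(float)
for (x, y), p in joint.items():
    fX[x] += p        # marginal pmf of X
    fY[y] += p        # marginal pmf of Y

print(joint[(7, 12)])   # Pr[X = 7, Y = 12] = 2/36  (casts (3,4) and (4,3))
print(fX[7])            # Pr[X = 7] = 6/36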
terminology 2
X and Y are independent (独立)
⇔ F_XY(x, y) = F_X(x) F_Y(y)
prop. X, Y independent ⇔ f_XY(x, y) = f_X(x) f_Y(y)
X, Y are identically distributed (同一分布に従う)
⇔ f_X = f_Y
X,Y are independent and identically distributed
(i.i.d.;独立同一分布)
Prop.
F_XY(x, y) = F_X(x) F_Y(y)  ⇔  f_XY(x, y) = f_X(x) f_Y(y).
Proof (⇒).
f_XY(x, y) ≔ ∂²/(∂x∂y) F_XY(x, y)
 = ∂²/(∂x∂y) [F_X(x) F_Y(y)]
 = ∂/∂x [ (∂/∂y F_X(x)) F_Y(y) + F_X(x) (∂/∂y F_Y(y)) ]
 = ∂/∂x [ 0 + F_X(x) f_Y(y) ]
 = (∂/∂x F_X(x)) f_Y(y) + F_X(x) (∂/∂x f_Y(y))
 = f_X(x) f_Y(y) + 0
 = f_X(x) f_Y(y).
(The converse follows by integrating f_X(x) f_Y(y) over (−∞, x] × (−∞, y].)
Expectation, variance, moment
Today’s topic 2
Expectation
Expectation (期待値) of a discrete random variable X is defined by
E[X] = Σ_{x∈Ω} x · f(x)
only when the right-hand side converges absolutely (絶対収束),
i.e., Σ_{x∈Ω} |x| · f(x) < ∞ holds.
If that is not the case, we say "the expectation does not exist."
Expectation (期待値) of a continuous random variable X is defined by
E[X] = ∫_{−∞}^{+∞} x · f(x) dx.
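Concretely, the discrete expectation is a weighted sum and the continuous one is an integral; a small Python sketch (a fair die and Ex(λ) with λ = 2 are arbitrary examples):

import numpy as np

# Discrete example: a fair die.
x = np.arange(1, 7)
pmf = np.full(6, 1 / 6)
print((x * pmf).sum())                # E[X] = 3.5

# Continuous example: Ex(lambda) with lambda = 2; E[X] = 1/lambda.
lam = 2.0
t = np.linspace(0.0, 40.0, 400_001)
f = lam * np.exp(-lam * t)
dt = t[1] - t[0]
print((t * f * dt).sum())             # ≈ 0.5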
Compute expectations of distributions
*Ex 2.
Discrete
(*i) Bernoulli distribution B(1, p).
(*ii) Binomial distribution B(n, p).
(iii) Geometric distribution Ge(p).
(iv) Poisson distribution Po(λ).
Continuous
(v) Exponential distribution Ex(α).
(vi) Normal distribution N(μ, σ²).
Ex. Expectation of Binom. distr.
Thm.
The expectation of X ∼ B(n, p) is np.
proof
E[X] = Σ_{k=0}^{n} k (n choose k) p^k (1 − p)^(n−k)
 = Σ_{k=0}^{n} k · n!/(k!(n−k)!) · p^k (1 − p)^(n−k)
 = Σ_{k=1}^{n} k · n!/(k!(n−k)!) · p^k (1 − p)^(n−k)
 = Σ_{k=1}^{n} n!/((k−1)!(n−k)!) · p^k (1 − p)^(n−k)
 = Σ_{k=1}^{n} np · (n−1)!/((k−1)!(n−k)!) · p^(k−1) (1 − p)^(n−k)
 = np Σ_{k'=0}^{n−1} (n−1 choose k') p^(k') (1 − p)^(n−1−k')
 = np
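A direct numerical check of this identity for one concrete choice of (n, p) (a small Python sketch; the values are arbitrary):

from math import comb

# The weighted sum defining E[X] for B(n, p) equals n*p.
n, p = 12, 0.3
mean = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(mean, n * p)   # both 3.6 (up to rounding)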
Ex. Expectation of Geom. distr.
Thm.
The expectation of X ∼ Ge(p) is (1 − p)/p.
Proof
    E[X]           = 0·p + 1·(1 − p)p + 2·(1 − p)²p + 3·(1 − p)³p + ⋯
 −) (1 − p) E[X]   =       0·(1 − p)p + 1·(1 − p)²p + 2·(1 − p)³p + ⋯
 ----------------------------------------------------------------
    p E[X]         = (1 − p)p + (1 − p)²p + (1 − p)³p + ⋯
                   = (1 − p)p / (1 − (1 − p)) = 1 − p
Thus E[X] = (1 − p)/p.
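A numerical check of the same value by truncating the series (a small Python sketch; p = 0.25 is arbitrary):

# Ge(p): Pr[X = k] = (1 - p)^k * p for k = 0, 1, 2, ...
# Truncating the series at a large K gives the mean (1 - p)/p.
p = 0.25
mean = sum(k * (1 - p)**k * p for k in range(10_000))
print(mean, (1 - p) / p)   # both ≈ 3.0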
Properties of Expectations
Thm.
For an arbitrary constant c,
E[c] = c,   E[cX] = c · E[X],   E[X + c] = E[X] + c.
Linearity of expectations (discrete random variables)
Thm. (linearity of expectation; 期待値の線形性)
E[Σ_{i=1}^{n} X_i] = Σ_{i=1}^{n} E(X_i)
proof (for n = 2; the general case follows by induction).
E[X + Y]
 = Σ_x Σ_y (x + y) Pr[X = x ∩ Y = y]
 = Σ_x Σ_y (x + y) f(x, y)
 = Σ_x Σ_y x f(x, y) + Σ_x Σ_y y f(x, y)
 = Σ_x x Σ_y f(x, y) + Σ_y y Σ_x f(x, y)
 = Σ_x x f(x) + Σ_y y f(y)
 = E[X] + E[Y]
Linearity of expectations (continuous random variables)
Thm. (linearity of expectation; 期待値の線形性)
E[Σ_{i=1}^{n} X_i] = Σ_{i=1}^{n} E(X_i)
proof (for n = 2).
E[X + Y]
 = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x + y) f(x, y) dx dy
 = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} x f(x, y) dx dy + ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} y f(x, y) dx dy
 = ∫_{−∞}^{+∞} x (∫_{−∞}^{+∞} f(x, y) dy) dx + ∫_{−∞}^{+∞} y (∫_{−∞}^{+∞} f(x, y) dx) dy
 = ∫_{−∞}^{+∞} x f(x) dx + ∫_{−∞}^{+∞} y f(y) dy
 = E[X] + E[Y]
Application of linearity of expectation
Thm.
The expectation of 𝑋 ∼ B(𝑛; 𝑝) is 𝑛𝑝
proof
Suppose 𝑋1, … , 𝑋𝑛 are i.i.d. B(1; 𝑝),
then 𝑌 ≔ 𝑋1 +⋯+ 𝑋𝑛 follows B(𝑛; 𝑝).
E[X_i] = 1 · p + 0 · (1 − p) = p
E[Y] = E[Σ_i X_i] = Σ_i E[X_i] = Σ_i p = np
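The same conclusion by simulation (a minimal Monte Carlo sketch; n, p, the seed, and the number of trials are arbitrary):

import numpy as np

# Y = X_1 + ... + X_n with i.i.d. X_i ~ B(1, p) has sample mean close to n*p.
rng = np.random.default_rng(0)
n, p = 20, 0.3
y = rng.binomial(1, p, size=(100_000, n)).sum(axis=1)   # Y ~ B(n, p)
print(y.mean(), n * p)                                   # ≈ 6.0 and 6.0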
Moment & Variance
Today’s topic 2
Motivation
Consider the following three distributions.
Distr. 1.
• Pr[X = 0] = 1/3
• Pr[X = 1] = 1/3
• Pr[X = 2] = 1/3
E[X] = 1
Pr[X > 1] = 1/3
Pr[X > 2] = 0
Pr[X > 1000] = 0
Distr. 2.
• Pr[X = k] = 1/2^(k+1)   for k = 0, 1, 2, …
E[X] = 1
Pr[X > 1] = 1/4
Pr[X > 2] = 1/8
Pr[X > 1000] = 1/512
Distr. 3.
• Pr[X = 0] = 2/3
• Pr[X = 1] = 0
• Pr[X = 2^k] = 1/4^k   for k = 1, 2, …
E[X] = 1
Pr[X > 1] = 1/3
Pr[X > 2] = 1/12
Pr[X > 1000] = 1/192
All three distributions have the same expectation E[X] = 1, yet very different tails.
Definitions
k-th moment (k次の積率) of X:
E[X^k]
variance (分散) of X:
Var[X] ≔ E[(X − E[X])²]
standard deviation (標準偏差) of X:
σ(X) ≔ √Var[X]
covariance (共分散) of X and Y:
Cov[X, Y] ≔ E[(X − E[X])(Y − E[Y])]
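These definitions can be evaluated directly for a small example (a Python sketch; the fair die and the pair (first die, sum of two dice) are arbitrary choices):

import numpy as np

# Moments, variance, and standard deviation of a fair die, computed
# directly from the definitions (exact weighted sums over the pmf).
x = np.arange(1, 7)
pmf = np.full(6, 1 / 6)
EX  = (x * pmf).sum()               # 1st moment E[X] = 3.5
EX2 = (x**2 * pmf).sum()            # 2nd moment E[X^2] = 91/6
var = ((x - EX)**2 * pmf).sum()     # Var[X] = E[(X - E[X])^2] = 35/12
print(EX, EX2, var, np.sqrt(var))   # sqrt(var) is the standard deviation

# Covariance of X = first die and Y = sum of two independent dice:
# Cov[X, Y] = E[(X - E[X])(Y - E[Y])], which here equals Var[first die].
cov = sum((d1 - EX) * (d1 + d2 - 2 * EX) / 36
          for d1 in range(1, 7) for d2 in range(1, 7))
print(cov)                          # 35/12 ≈ 2.917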
Compute the variances of distributions
*Ex 2.
Discrete
(*i) Bernoulli distribution B(1, p).
(*ii) Binomial distribution B(n, p).
(iii) Geometric distribution Ge(p).
(iv) Poisson distribution Po(λ).
Continuous
(v) Exponential distribution Ex(α).
(vi) Normal distribution N(μ, σ²).
Properties of variance and covariance
Thm.
Var[X] = E[X²] − (E[X])²
Cov[X, Y] = E[XY] − E[X] E[Y]
Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]
proof (first identity).
E[(X − E[X])²] = E[X² − 2X·E[X] + (E[X])²]
 = E[X²] − 2 E[X] E[X] + (E[X])²
 = E[X²] − (E[X])²
proof (second identity).
Cov[X, Y] = E[(X − E[X])(Y − E[Y])]
 = E[XY − X·E[Y] − Y·E[X] + E[X] E[Y]]
 = E[XY] − 2 E[X] E[Y] + E[X] E[Y]
 = E[XY] − E[X] E[Y]
proof (third identity).
Var[X + Y] = E[(X + Y)²] − (E[X + Y])²
 = E[X² + 2XY + Y²] − (E[X] + E[Y])²
 = (E[X²] − (E[X])²) + (E[Y²] − (E[Y])²) + 2(E[XY] − E[X] E[Y])
 = Var[X] + Var[Y] + 2 Cov[X, Y]
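An empirical illustration of the third identity on deliberately dependent samples (a small Python sketch; the dependence y = 0.5·x + noise and the seed are arbitrary):

import numpy as np

# Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]; the identity holds exactly
# for sample moments when variance and covariance use the same 1/N normalization.
rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)                   # y is correlated with x
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]
print(lhs, rhs)                                          # equal up to rounding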
Properties of var and cov (for independent X and Y)
Thm. If X and Y are independent,
E[XY] = E[X] E[Y]
Cov[X, Y] = 0
Var[X + Y] = Var[X] + Var[Y]
proof (discrete case).
E[XY] = Σ_x Σ_y x y Pr[X = x ∧ Y = y]
 = Σ_x Σ_y x y Pr[X = x] Pr[Y = y]
 = (Σ_x x Pr[X = x]) (Σ_y y Pr[Y = y])
 = E[X] E[Y]
Hence Cov[X, Y] = E[XY] − E[X] E[Y] = 0, and Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y] = Var[X] + Var[Y].
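An exact check of E[XY] = E[X] E[Y] for two independent fair dice (a small Python sketch):

from itertools import product

# Two independent fair dice: the expectation of the product equals
# the product of the expectations.
EXY = sum(x * y for x, y in product(range(1, 7), repeat=2)) / 36
EX = sum(range(1, 7)) / 6
print(EXY, EX * EX)   # both 12.25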
Properties of Var and Cov
Thm. If 𝑋1, … , 𝑋𝑛 are mutually independent,
Var 𝑋1 +⋯+ 𝑋𝑛 = Var 𝑋1 +⋯+ Var 𝑋𝑛
Additivity of variance for independent variables: binomial distr.
Thm.
The variance of 𝑋 ∼ B(𝑛; 𝑝) is 𝑛𝑝(1 − 𝑝)
proof
Suppose 𝑋1, … , 𝑋𝑛 are independent and identically distr. B(1; 𝑝),
then 𝑌 ≔ 𝑋1 +⋯+ 𝑋𝑛 follows B(𝑛; 𝑝).
E[X_i²] = 1² · p + 0² · (1 − p) = p
Var[X_i] = E[X_i²] − (E[X_i])² = p − p² = p(1 − p)
Var[Y] = Var[Σ_{i=1}^{n} X_i] = Σ_{i=1}^{n} Var[X_i] = Σ_{i=1}^{n} p(1 − p) = np(1 − p)
(The step Var[Σ_i X_i] = Σ_i Var[X_i] uses that the X_i are mutually independent.)
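The same value by simulation (a minimal Monte Carlo sketch; n, p, the seed, and the number of trials are arbitrary):

import numpy as np

# Monte Carlo check that Var[Y] ≈ n*p*(1 - p) for Y ~ B(n, p),
# simulating Y as a sum of n independent Bernoulli(p) variables.
rng = np.random.default_rng(2)
n, p = 30, 0.4
y = rng.binomial(1, p, size=(200_000, n)).sum(axis=1)
print(y.var(), n * p * (1 - p))   # ≈ 7.2 and 7.2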