
Lecture 1: Normal approximation
Lecture 2: Poisson approximation and other distributions

A Short Introduction to Stein's Method

Gesine Reinert
Department of Statistics, University of Oxford

Michaelmas Term 2011


Outline

Lecture 1: Normal approximation
  Motivation
  Distributional distances
  The Stein equation
  Local dependence
  Couplings

Lecture 2: Poisson approximation and other distributions
  Poisson approximation
  Poisson approximation: the local approach
  Poisson approximation: size bias couplings
  A general framework
  Chisquare distributions
  One-dimensional Gibbs distributions
  Final Remarks


Lecture 1: Normal approximation

Recall two famous distributional approximations.

Example: $X_1, X_2, \ldots, X_n$ i.i.d., $P(X_i = 1) = p = 1 - P(X_i = 0)$. Then

$$n^{-1/2} \sum_{i=1}^n (X_i - p) \approx_d \mathcal{N}(0, p(1-p)),$$

but also

$$\sum_{i=1}^n X_i \approx_d \mathrm{Poisson}(np).$$

Which approximation should we choose? We would like to assess the distance between distributions via explicit bounds.
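For a first numerical feel, a small sketch (assuming Python with numpy and scipy, which are not part of the lecture) compares the two approximations for a Binomial(n, p) sum; the parameter values are arbitrary:

```python
import numpy as np
from scipy import stats

n, p = 50, 0.05
k = np.arange(n + 1)
binom_pmf = stats.binom.pmf(k, n, p)

# Total variation distance to the Poisson(np) approximation on Z_+,
# including the Poisson mass above n
tv_poisson = 0.5 * (np.sum(np.abs(binom_pmf - stats.poisson.pmf(k, n * p)))
                    + stats.poisson.sf(n, n * p))

# Kolmogorov distance to the normal approximation N(np, np(1-p))
z = (k - n * p) / np.sqrt(n * p * (1 - p))
kol_normal = np.max(np.abs(stats.binom.cdf(k, n, p) - stats.norm.cdf(z)))

print(tv_poisson, kol_normal)  # for small p the Poisson approximation wins
```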


Weak convergence

Often distributional distances are phrased in terms of test functions. This approach is based on weak convergence. For c.d.f.s $F_n$, $n \geq 0$, and $F$ on the line we say that $F_n$ converges weakly (converges in distribution) to $F$, written

$$F_n \xrightarrow{D} F,$$

if

$$F_n(x) \to F(x) \quad (n \to \infty)$$

for all continuity points $x$ of $F$. For the associated probability distributions we write $P_n \xrightarrow{D} P$.


Facts of weak convergence

$P_n \xrightarrow{D} P$ is equivalent to each of:

1. $P_n(A) \to P(A)$ for each $P$-continuity set $A$ (i.e. $P(\partial A) = 0$);

2. $\int f \, dP_n \to \int f \, dP$ for all functions $f$ that are bounded, continuous, real-valued;

3. $\int f \, dP_n \to \int f \, dP$ for all functions $f$ that are bounded, infinitely often differentiable, real-valued.

For a random variable $X$ with distribution $\mathcal{L}(X)$,

$$\mathcal{L}(X_n) \xrightarrow{D} \mathcal{L}(X) \iff E f(X_n) \to E f(X)$$

for all functions $f$ that are bounded, infinitely often differentiable, real-valued.


Total variation distance

Let $\mathcal{L}(X) = P$, $\mathcal{L}(Y) = Q$; define the total variation distance

$$d_{TV}(P, Q) = \sup_{A \text{ measurable}} |P(A) - Q(A)|.$$

If the underlying space is discrete then convergence in total variation is equivalent to convergence in distribution. For probability measures on $\mathbb{Z}_+$ we can write

$$d_{TV}(P, Q) = \frac{1}{2} \sum_{j \geq 0} |P\{j\} - Q\{j\}|.$$

If the underlying space is continuous then the total variation distance is very strong; for example, when approximating a discrete integer-valued random variable by a normal random variable we could choose as set $A$ the set of integers; then the total variation distance would be 1.


Wasserstein distance

Consider the set of Lipschitz-continuous functions with Lipschitz constant 1,

$$\mathcal{L} = \{g : \mathbb{R} \to \mathbb{R};\ |g(y) - g(x)| \leq |y - x|\},$$

and the Wasserstein distance

$$d_W(P, Q) = \sup_{g \in \mathcal{L}} |E g(Y) - E g(X)| = \inf E|Y - X|,$$

where the infimum is over all couplings $X, Y$ such that $\mathcal{L}(X) = P$, $\mathcal{L}(Y) = Q$.
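In one dimension the infimum is attained by the quantile coupling, which is what scipy.stats.wasserstein_distance computes from two samples. A sketch (the choice of $W$ as a standardised Bernoulli sum is illustrative, not from the slides):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, samples = 200, 0.3, 100_000

# W: standardised sum of i.i.d. centred Bernoulli(p) variables
X = rng.binomial(1, p, size=(samples, n))
W = (X - p).sum(axis=1) / np.sqrt(n * p * (1 - p))
Z = rng.standard_normal(samples)

# Empirical d_W between L(W) and N(0,1) via the optimal (quantile) coupling
print(stats.wasserstein_distance(W, Z))
```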


Using

$$\mathcal{F} = \{f \text{ absolutely continuous} : f' \in \mathcal{L},\ f(0) = f'(0) = 0\}$$

we also have

$$d_W(P, Q) = \sup_{f \in \mathcal{F}} |E f'(Y) - E f'(X)|.$$


Stein’s Method for Normal Approximation

Stein (1972, 1986): $Z \sim \mathcal{N}(\mu, \sigma^2)$ if and only if for all smooth functions $f$,

$$E(Z - \mu) f(Z) = \sigma^2 E f'(Z).$$

For a random variable $W$ with $EW = \mu$, $\mathrm{Var}\, W = \sigma^2$: if

$$\sigma^2 E f'(W) - E(W - \mu) f(W)$$

is close to zero for many functions $f$, then $W$ should be close to $Z$ in distribution.
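The characterisation is easy to probe by Monte Carlo. A sketch for $\mu = 0$, $\sigma^2 = 1$, with $f = \sin$ as an arbitrary smooth test function:

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(1_000_000)

# E Z f(Z) and E f'(Z) agree for Z ~ N(0,1); here f = sin, f' = cos
print(np.mean(z * np.sin(z)), np.mean(np.cos(z)))
```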


Sketch of proof for $\mu = 0$, $\sigma^2 = 1$:

Assume $Z \sim \mathcal{N}(0, 1)$. Integration by parts:

$$\frac{1}{\sqrt{2\pi}} \int f'(x) e^{-x^2/2} \, dx = \left[ \frac{1}{\sqrt{2\pi}} f(x) e^{-x^2/2} \right]_{-\infty}^{\infty} + \frac{1}{\sqrt{2\pi}} \int x f(x) e^{-x^2/2} \, dx = \frac{1}{\sqrt{2\pi}} \int x f(x) e^{-x^2/2} \, dx,$$

and so $E f'(Z) = E Z f(Z)$.


Conversely, assume $E Z f(Z) = E f'(Z)$. We solve the differential equation

$$f'(x) - x f(x) = g(x), \qquad \lim_{x \to -\infty} f(x) e^{-x^2/2} = 0,$$

for any bounded function $g$, giving

$$f(y) = e^{y^2/2} \int_{-\infty}^y g(x) e^{-x^2/2} \, dx.$$

Take $g(x) = \mathbf{1}(x \leq x_0) - \Phi(x_0)$; then

$$0 = E(f'(Z) - Z f(Z)) = P(Z \leq x_0) - \Phi(x_0),$$

so $Z \sim \mathcal{N}(0, 1)$.


The Stein equation

Let $\mu = 0$. Given a test function $h$, let $Nh = E h(Z/\sigma)$, and solve for $f$ in the Stein equation

$$\sigma^2 f'(w) - w f(w) = h(w/\sigma) - Nh,$$

giving

$$f(y) = e^{y^2/2} \int_{-\infty}^y \left( h(x/\sigma) - Nh \right) e^{-x^2/2} \, dx.$$

Now evaluate the expectation of the right-hand side via the expectation of the left-hand side. We can bound the solution $f$ and its derivatives in terms of the test function $h$ and its derivatives, e.g. $\|f''\| \leq 2\|h'\|$.
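The solution formula can be evaluated numerically. The sketch below (for $\sigma = 1$, with the illustrative test function $h = \tanh$; none of these choices come from the slides) builds $f$ by quadrature and checks the Stein equation identity on simulated data, using a finite-difference $f'$ so the check is not circular:

```python
import numpy as np
from scipy import stats, integrate

h = np.tanh  # an illustrative smooth test function
Nh = integrate.quad(lambda x: h(x) * stats.norm.pdf(x), -np.inf, np.inf)[0]

def f(y):
    # f(y) = e^{y^2/2} * int_{-inf}^{y} (h(x) - Nh) e^{-x^2/2} dx  (sigma = 1)
    val, _ = integrate.quad(lambda x: (h(x) - Nh) * np.exp(-x**2 / 2),
                            -np.inf, y, epsabs=1e-12)
    return np.exp(y**2 / 2) * val

rng = np.random.default_rng(2)
n, p = 30, 0.5
W = (rng.binomial(1, p, size=(2000, n)) - p).sum(axis=1) / np.sqrt(n * p * (1 - p))

eps = 1e-5  # numerical derivative of f
lhs = np.mean([(f(w + eps) - f(w - eps)) / (2 * eps) - w * f(w) for w in W])
print(lhs, np.mean(h(W)) - Nh)  # E[f'(W) - W f(W)] matches E h(W) - Nh
```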


Example: the sum of i.i.d. random variables

$X, X_1, \ldots, X_n$ i.i.d. with $EX = 0$, $\mathrm{Var}\, X = \frac{1}{n}$; put $W = \sum_{i=1}^n X_i$ and $W_i = W - X_i = \sum_{j \neq i} X_j$. Then

$$E W f(W) = \sum_{i=1}^n E X_i f(W) = \sum_{i=1}^n E X_i f(W_i) + \sum_{i=1}^n E X_i^2 f'(W_i) + R = \frac{1}{n} \sum_{i=1}^n E f'(W_i) + R,$$

by a Taylor expansion of $f(W)$ around $W_i$; here $E X_i f(W_i) = 0$ and $E X_i^2 f'(W_i) = \frac{1}{n} E f'(W_i)$ by the independence of $X_i$ and $W_i$.


So

$$E f'(W) - E W f(W) = \frac{1}{n} \sum_{i=1}^n E\{f'(W) - f'(W_i)\} + R,$$

and we can bound the remainder term $R$.


Theorem: explicit bound on distance

For any smooth $h$,

$$|E h(W) - Nh| \leq \|h'\| \left( \frac{2}{\sqrt{n}} + \sum_{i=1}^n E|X_i|^3 \right).$$

Note: this bound is true for any n; nothing goes to infinity.

This theorem extends to local dependence.
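As a sanity check, the right-hand side of the bound is easy to evaluate for the centred, rescaled Bernoulli variables of the motivating example (a sketch; the constants follow from $X_i = (B_i - p)/\sqrt{np(1-p)}$):

```python
import numpy as np

n, p = 100, 0.3
scale = np.sqrt(n * p * (1 - p))            # X_i = (B_i - p)/scale, Var X_i = 1/n
e_abs3 = p * (1 - p) * ((1 - p)**2 + p**2)  # E|B_i - p|^3
bound = 2 / np.sqrt(n) + n * e_abs3 / scale**3
print(bound)                                # valid for every n, nothing asymptotic
```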


Local dependence

Let $X_1, \ldots, X_n$ have mean zero and finite variances, put $W = \sum_{i=1}^n X_i$, and assume $\mathrm{Var}\, W = 1$. Suppose that for each $i = 1, \ldots, n$ there exist sets $A_i \subset B_i \subset \{1, \ldots, n\}$ such that $X_i$ is independent of $\sum_{j \notin A_i} X_j$, and $\sum_{j \in A_i} X_j$ is independent of $\sum_{j \notin B_i} X_j$. Define

$$\eta_i = \sum_{j \in A_i} X_j \quad \text{and} \quad \tau_i = \sum_{j \in B_i} X_j.$$

Theorem. For any smooth $h$ with $\|h'\| \leq 1$,

$$|E h(W) - Nh| \leq 2 \sum_{i=1}^n \left( E|X_i \eta_i \tau_i| + |E(X_i \eta_i)| \, E|\tau_i| \right) + \sum_{i=1}^n E|X_i \eta_i^2|.$$


Example: word counts in DNA sequences

Let $B = B_1 B_2 \ldots B_n$ be a string of letters which are i.i.d. from the alphabet $\mathcal{A} = \{A, C, G, T\}$. Let $w = w_1 w_2 \ldots w_m$ be a word of length $m$ composed of letters from $\mathcal{A}$. Let $V = V(w)$ be the number of times that $w$ occurs in the sequence $B$.

Words occur in clumps; for example consider

B = T G A A C A A A C A A C A A A G A A C A A A A
w = A A C A A

$V(w) = 4$; there are two clumps of occurrences.
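A direct count of overlapping occurrences reproduces the example (a minimal sketch):

```python
def count_occurrences(seq: str, word: str) -> int:
    """Number of (possibly overlapping) occurrences of `word` in `seq`."""
    m = len(word)
    return sum(seq[j:j + m] == word for j in range(len(seq) - m + 1))

B = "TGAACAAACAACAAAGAACAAAA"
w = "AACAA"
print(count_occurrences(B, w))  # 4 occurrences, falling into two clumps
```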


For $j = 1, \ldots, n - m + 1$ define the word indicators

$$I_j = \begin{cases} 1 & \text{if } B_j B_{j+1} \ldots B_{j+m-1} = w \\ 0 & \text{otherwise.} \end{cases}$$

Let $p(x) = P(B_1 = x)$ for $x = A, C, G, T$; then

$$E I_j = p(w) := \prod_{i=1}^m p(w_i)$$

and

$$EV = (n - m + 1) p(w).$$


The variance

Define, for $\ell = 1, \ldots, m$,

$$p_w(\ell) = p(w_{m-\ell+1}) p(w_{m-\ell+2}) \cdots p(w_m),$$

which is the probability that $w$ occurs, given that $w_1 w_2 \cdots w_{m-\ell}$ has occurred. Define the self-overlap indicator

$$\xi(j) = \mathbf{1}(w_{j+1} = w_1, \ldots, w_m = w_{m-j}).$$

This overlap indicator equals 1 if the last $m - j$ letters of $w$ overlap exactly the first $m - j$ letters of $w$. Then the variance $\sigma^2 = \mathrm{Var}(V)$ is given by

$$2 p(w) \sum_{j=1}^{m-1} (n - m - j + 1)\, \xi(j)\, p_w(j) - 2(m-1)(n - m + 1) p(w)^2.$$

The variance is of order $n$ when $m = o(n)$.


The neighbourhoods of dependence

Put

$$X_j = \frac{1}{\sigma} (I_j - p(w))$$

and

$$A_i = \{j \in \{1, \ldots, n\} : |j - i| \leq m - 1\}$$

as well as

$$B_i = \{j \in \{1, \ldots, n\} : |j - i| \leq 2(m - 1)\}.$$

Then $X_i$ is independent of $\sum_{j \notin A_i} X_j$, and $\sum_{j \in A_i} X_j$ is independent of $\sum_{j \notin B_i} X_j$. Moreover

$$\eta_i = \sum_{j : |j - i| \leq m - 1} X_j \quad \text{and} \quad \tau_i = \sum_{j : |j - i| \leq 2(m-1)} X_j.$$


Consider the statement of the theorem,

$$|E h(W) - Nh| \leq 2 \sum_{i=1}^n \left( E|X_i \eta_i \tau_i| + |E(X_i \eta_i)| \, E|\tau_i| \right) + \sum_{i=1}^n E|X_i \eta_i^2|.$$

We bound $|X_i| \leq \frac{1}{\sigma}$ and $|\eta_i| \leq \frac{2m-1}{\sigma}$ as well as $|\tau_i| \leq \frac{4m-1}{\sigma}$, and hence the right-hand side is at most

$$\frac{(2m-1)(4(4m-1) + 1)}{\sigma^3}\, n.$$


Couplings

The local approach does not work well when there is (weak) global dependence, for example as in simple random sampling. Instead we make use of couplings: changing one random variable should not have much effect on the other random variables if the dependence is weak. Here we consider two such couplings: the size-bias coupling and the zero-bias coupling. There are more, such as the exchangeable pair coupling.


Size-Bias coupling: $\mu > 0$

If $W \geq 0$ and $EW > 0$, then $W^s$ has the $W$-size biased distribution if

$$E W f(W) = EW \, E f(W^s)$$

for all $f$ for which both sides exist.

Example: If $X \sim \mathrm{Bernoulli}(p)$, then $E X f(X) = p f(1)$ and so $X^s = 1$.

Example: If $X \sim \mathrm{Poisson}(\lambda)$, then

$$P(X^s = k) = \frac{k e^{-\lambda} \lambda^k}{k! \, \lambda} = \frac{e^{-\lambda} \lambda^{k-1}}{(k-1)!}$$

and so $X^s = X + 1$, where the equality is in distribution.
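The Poisson example is easy to confirm by simulation (a sketch; the test function is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, N = 2.5, 1_000_000
X = rng.poisson(lam, N)

f = lambda x: np.log1p(x)
print(np.mean(X * f(X)),        # E X f(X)
      lam * np.mean(f(X + 1)))  # EX * E f(X^s) with X^s = X + 1
```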


Using

$$E W f(W) = EW \, E f(W^s)$$

with $f(x) = x$ we get that

$$E W^s = \frac{E W^2}{\mu}.$$


Using

$$E W f(W) = EW \, E f(W^s)$$

with $f(x) = x^2$ we get that

$$E (W^s)^2 = \frac{E W^3}{\mu}.$$


Construction

(Goldstein + Rinott 1997) Suppose $W = \sum_{i=1}^n X_i$ with $X_i \geq 0$ and $E X_i > 0$ for all $i$.

Choose an index $V$ proportional to the mean, $P(V = v) \propto E X_v$. If $V = v$: replace $X_v$ by an independent $X_v^s$ having the $X_v$-size biased distribution, and if $X_v^s = x$: adjust $X_u$, $u \neq v$, such that

$$\mathcal{L}(X_u, u \neq v) = \mathcal{L}(X_u, u \neq v \mid X_v = x).$$

Then $W^s = \sum_{u \neq V} X_u + X_V^s$ has the $W$-size bias distribution.

Example: sum of independent Bernoullis, $X_i \sim \mathrm{Be}(p_i)$ for $i = 1, \ldots, n$. Then $W^s = \sum_{u \neq V} X_u + 1$.
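For independent Bernoullis the construction is a one-liner per sample: pick $V$ with probability proportional to $p_v$ and set that coordinate to 1. A sketch verifying the size-bias identity (test function and parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
p = np.array([0.1, 0.3, 0.5, 0.2])
N = 500_000

X = (rng.random((N, p.size)) < p).astype(float)
W = X.sum(axis=1)

# pick V proportional to EX_v = p_v; Bernoulli size-bias sets X_V^s = 1,
# and by independence the other coordinates need no adjustment
V = rng.choice(p.size, size=N, p=p / p.sum())
Xs = X.copy()
Xs[np.arange(N), V] = 1.0
Ws = Xs.sum(axis=1)

f = lambda x: np.exp(-x)
print(np.mean(W * f(W)), p.sum() * np.mean(f(Ws)))  # E W f(W) = EW E f(W^s)
```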


How does the size-bias coupling help?

$X, X_1, \ldots, X_n$ i.i.d. non-negative, $EX = \mu$, $\mathrm{Var}\, X = \sigma^2$, and $W = \sum_{i=1}^n X_i$. Then

$$\begin{aligned}
E(W - \mu) f(W) &= \mu\, E(f(W^s) - f(W)) \\
&\approx \mu\, E(W^s - W) f'(W) \\
&= \mu \frac{1}{n} \sum_{i=1}^n E(X_i^s - X_i) f'(W) \\
&\approx \mu\, E f'(W)\, E(X^s - \mu) \\
&= \mu\, E f'(W) \left\{ \frac{1}{\mu} E X^2 - \mu \right\} \\
&= \sigma^2 E f'(W).
\end{aligned}$$


As a theorem

(Goldstein + Rinott 1997) Let $W \geq 0$ have mean $\mu > 0$ and variance 1. Let $W^s$ have the $W$-size bias distribution and assume that $W^s$ is defined on the same probability space as $W$. Let $h$ be a smooth function. Then

$$|E h(W - \mu) - E h(Z)| \leq (\sup h - \inf h)\, \mu \sqrt{\mathrm{Var}\, E(W^s - W \mid W)} + \frac{1}{6} \|h'\| \, \mu \, E[(W^s - W)^2].$$


Example: sum of i.i.d.’s

$X, X_1, \ldots, X_n$ i.i.d. non-negative, $EX = \frac{\mu}{n}$, $\mathrm{Var}\, X = \frac{1}{n}$, and $W = \sum_{i=1}^n X_i$. Then

$$\begin{aligned}
E(W^s - W \mid W) &= \frac{1}{n} \sum_{i=1}^n E(X_i^s - X_i \mid W) \\
&= \frac{1}{n} \left( \sum_{i=1}^n \frac{n E X^2}{\mu} - \sum_{i=1}^n X_i \right) \\
&= \frac{n E X^2}{\mu} - \frac{1}{n} W,
\end{aligned}$$

and so

$$\mathrm{Var}\, E(W^s - W \mid W) = \frac{1}{n^2}.$$


Moreover,

$$E[(W^s - W)^2] = \frac{1}{n} \sum_{i=1}^n E(X_i - X_i^s)^2 = E X^2 - 2\, E X \, E X^s + E(X^s)^2 = E X^2 - 2 E X^2 + \frac{n E X^3}{\mu} \leq \frac{n E X^3}{\mu},$$

since $E X^s = \frac{n E X^2}{\mu}$, $E(X^s)^2 = \frac{n E X^3}{\mu}$ and $EX = \frac{\mu}{n}$. Thus the bound gives

$$|E h(W - \mu) - E h(Z)| \leq \frac{1}{2} \|h''\| \frac{\mu}{n} + \frac{1}{6} \|h^{(3)}\| \, n \, E X^3.$$

We think of $X \asymp n^{-1/2}$; then the overall bound is of order $n^{-1/2}$.


Example: Vertex degrees in Bernoulli random graphs

Let $G(n, p)$ be a random graph on $n$ vertices where each pair of vertices has probability $p$ of making up an edge, independently. The degree $D(v)$ of a vertex $v$ is the number of its neighbours. Let $W$ count the number of vertices with degree $d$. Then

$$W = \sum_{v=1}^n \mathbf{1}(D(v) = d).$$

The indicators are not independent: if we know that $D(v) = 0$ for $v = 1, \ldots, n - 1$, then $D(n) = 0$ also.
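Sampling $W$ is straightforward (a sketch, with illustrative parameters):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, d = 50, 0.1, 5

# adjacency matrix of one G(n, p) sample: symmetric, zero diagonal
A = np.triu(rng.random((n, n)) < p, k=1)
A = A | A.T
W = np.sum(A.sum(axis=1) == d)  # number of vertices of degree d
print(W)
```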


Coupling

Take a random graph and pick a vertex $v$ at random. Set its degree equal to $d$. If its degree was $d$ already, nothing needs to be adjusted. If its degree was $D(v) > d$, then choose $D(v) - d$ of its edges uniformly at random and delete them. If the degree was $D(v) < d$, then choose $d - D(v)$ vertices among the remaining $n - D(v) - 1$ non-neighbours of $v$ and create an edge between each of these and $v$.

Goldstein and Rinott (1997) calculate the bound; see also Lin and R. (2012) for a random graph model where $p = p(u, v)$ may depend on the vertices.


For mean-zero random variables the size bias coupling is not easy to implement. In particular, it would not be applicable to normally distributed random variables. This drawback led to the definition of the zero-bias coupling.


Zero bias coupling

Let $X$ be a mean zero random variable with finite, nonzero variance $\sigma^2$. We say that $X^*$ has the $X$-zero biased distribution if for all differentiable $f$ for which $E X f(X)$ exists,

$$E X f(X) = \sigma^2 E f'(X^*).$$

The zero bias distribution $X^*$ exists for all $X$ that have mean zero and finite variance (Goldstein and R. 1997). It is easy to verify that $X^*$ has density

$$p^*(x) = \sigma^{-2} E\{X \, \mathbf{1}(X > x)\}.$$


Examples

Example: If $X \sim \mathcal{N}(0, \sigma^2)$, then $X$ has the $X$-zero bias distribution, and the normal is the only fixed point of the zero bias transformation.

Example: If $X \sim \mathrm{Bernoulli}(p) - p$, then

$$E\{X \, \mathbf{1}(X > x)\} = p(1 - p) \quad \text{for } -p < x < 1 - p,$$

and is zero elsewhere, so $X^* \sim \mathrm{Uniform}(-p, 1 - p)$.
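The centred Bernoulli example can be checked against the definition by simulation (a sketch, with $f = \sin$ as an arbitrary test function):

```python
import numpy as np

rng = np.random.default_rng(6)
p, N = 0.3, 1_000_000
X = rng.binomial(1, p, N) - p          # mean zero, variance p(1-p)
Xstar = rng.uniform(-p, 1 - p, N)      # the claimed zero-bias distribution

# E X f(X) = sigma^2 E f'(X*) with f = sin, f' = cos
print(np.mean(X * np.sin(X)), p * (1 - p) * np.mean(np.cos(Xstar)))
```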


Connection with Wasserstein distance

We have, for $W$ mean zero, variance 1,

$$|E h(W) - Nh| = |E[f'(W) - W f(W)]| = |E[f'(W) - f'(W^*)]| \leq \|f''\| \, E|W - W^*|,$$

where $\|\cdot\|$ is the supremum norm. By (Stein 1986), for $h$ absolutely continuous we have $\|f''\| \leq 2\|h'\|$ and hence

$$|E h(W) - Nh| \leq 2\|h'\| \, E|W - W^*|;$$

thus

$$d_W(\mathcal{L}(W), \mathcal{N}(0, 1)) \leq 2 E|W - W^*|.$$


Construction: sum of i.i.d.s

$W = \sum_{i=1}^n X_i$, where the $X_i$ are independent mean zero variables with finite variances $\sigma_i^2$. Choose an index $I$ proportional to the variance, $P(I = i) = \sigma_i^2/\sigma^2$, and zero-bias in that variable:

$$W^* = W - X_I + X_I^*.$$

Then, for any smooth $f$,

$$\begin{aligned}
E W f(W) &= \sum_{i=1}^n E X_i f(W) = \sum_{i=1}^n E X_i f\Big(X_i + \sum_{t \neq i} X_t\Big) \\
&= \sum_{i=1}^n \sigma_i^2 \, E f'\Big(X_i^* + \sum_{t \neq i} X_t\Big) = \sigma^2 \sum_{i=1}^n \frac{\sigma_i^2}{\sigma^2} \, E f'(W - X_i + X_i^*) \\
&= \sigma^2 E f'(W - X_I + X_I^*) = \sigma^2 E f'(W^*),
\end{aligned}$$

where we have used the independence of $X_i$ and $X_t$, $t \neq i$.


And here is the theorem

Let $X_1, \ldots, X_n$ be independent mean zero variables with variances $\sigma_1^2, \ldots, \sigma_n^2$ and finite third moments, and let $W = (X_1 + \cdots + X_n)/\sigma$, where $\sigma^2 = \sigma_1^2 + \cdots + \sigma_n^2$. Then for all absolutely continuous test functions $h$,

$$|E h(W) - Nh| \leq \frac{2\|h'\|}{\sigma^3} \sum_{i=1}^n E\left( |X_i| + \frac{1}{2} |X_i|^3 \right) \sigma_i^2.$$

When the variables are i.i.d. with variance 1,

$$|E h(W) - Nh| \leq \frac{3\|h'\| \, E|X_1|^3}{\sqrt{n}}.$$


The construction is (much) more involved under dependence. When third moments vanish and fourth moments exist: order $n^{-1}$ bound.

Also: Berry-Esseen bound, combinatorial central limit theorem (Goldstein 2004).


Some remarks

◮ Left out:
  1. Exchangeable pair couplings, also used for variance reduction in simulations
  2. Multivariate versions, also coupling approaches
  3. Infinite-dimensional variables (Malliavin calculus), random measures

◮ Key feature: bounds in the presence of dependence

◮ In the i.i.d. case: the Berry-Esseen inequality is not quite recovered


Further reading

◮ Barbour, A.D. (1990). Stein's method for diffusion approximations. Probability Theory and Related Fields 84, 297-322.

◮ Barbour, A.D. and Chen, L.H.Y., eds (2005). An Introduction to Stein's Method. Lecture Notes Series 4, Inst. Math. Sciences, World Scientific Press, Singapore.

◮ Barbour, A.D. and Chen, L.H.Y., eds (2005). Stein's Method and Applications. Lecture Notes Series 5, Inst. Math. Sciences, World Scientific Press, Singapore.

◮ Chen, L.H.Y., Goldstein, L., and Shao, Q.-M. (2011). Normal Approximation by Stein's Method. Springer, Berlin and Heidelberg.

◮ Diaconis, P., and Holmes, S., eds (2004). Stein's Method: Expository Lectures and Applications. IMS Lecture Notes 46.


More further reading

◮ Reinert, G. (2011). Gaussian approximation of functionals: Malliavin calculus and Stein's method. In: Surveys in Stochastic Processes. J. Blath, P. Imkeller, and S. Roelly, eds. European Mathematical Society, Zurich, pp. 107-126.

◮ Reinert, G. (1998). Couplings for normal approximations with Stein's method. In: Microsurveys in Discrete Probability. D. Aldous, J. Propp, eds. Dimacs series, AMS, 193-207.

◮ Reinert, G., Schbath, S., and Waterman, M.S. (2005). Probabilistic and Statistical Properties of Finite Words in Finite Sequences. In: Lothaire: Applied Combinatorics on Words. J. Berstel, D. Perrin, eds. Cambridge University Press, 268-352.

◮ Stein, C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. Proc. Sixth Berkeley Symp. Math. Statist. Probab. 2, 583-602. Univ. California Press, Berkeley.

◮ Stein, C. (1986). Approximate Computation of Expectations. IMS, Hayward, CA.


Poisson approximation and other distributions

We first look at Poisson approximation; then we will put the approximations in a general framework. We then apply the general framework to chisquare distributions as well as one-dimensional Gibbs measures.


Poisson approximation

The standard reference for this section is Barbour, Holst and Janson (1992). Recall the Poisson($\lambda$) distribution,

$$p_k = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, \ldots.$$

This is a discrete distribution, and for approximations on $\mathbb{Z}_+$ we would usually take the total variation distance,

$$d_{TV}(P, Q) = \sup_{A \subset \mathbb{Z}_+} |P(A) - Q(A)|.$$


A Stein characterisation

Fact: $Z \sim \mathrm{Po}(\lambda)$ if and only if for all bounded functions $g : \mathbb{Z}_+ \to \mathbb{R}$,

$$E\{\lambda g(Z + 1) - Z g(Z)\} = 0.$$

This suggests using as test functions the indicators $I(j \in A)$ for sets $A \subset \mathbb{Z}_+$, and as Stein equation

$$\lambda g(j + 1) - j g(j) = I(j \in A) - \mathrm{Po}\{\lambda\}(A).$$

For any random variable $W$ we hence have

$$\lambda g(W + 1) - W g(W) = I(W \in A) - \mathrm{Po}\{\lambda\}(A)$$

and, taking expectations,

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}\{\lambda\}) \leq \sup_g \left| E\{\lambda g(W + 1) - W g(W)\} \right|.$$

Here the supremum is over all functions $g$ which solve the Stein equation.


The solution of the Stein equation

$$\lambda g(j + 1) - j g(j) = I(j \in A) - \mathrm{Po}\{\lambda\}(A)$$

can be calculated iteratively. We only need the results that

$$\sup_j |g(j)| \leq \min(1, \lambda^{-1/2})$$

and

$$\Delta g := \sup_j |g(j + 1) - g(j)| \leq \min(1, \lambda^{-1}).$$

Here $\lambda^{-1}$ is often referred to as the magic factor.
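Rearranging the iterative solution gives a closed form in terms of Poisson tail probabilities, $g(j+1) = -\big(P(Z \in A, Z > j) - \mathrm{Po}\{\lambda\}(A) P(Z > j)\big) / (\lambda P(Z = j))$; this rearrangement is not on the slides, so take the sketch below as an assumption-laden illustration that also probes the two stated bounds:

```python
import numpy as np
from scipy import stats

lam = 3.0
A = np.array([0, 2, 5])                     # an arbitrary subset of Z_+
PoA = stats.poisson.pmf(A, lam).sum()

def g(j_plus_1):
    # g(j+1) = -(P(Z in A, Z > j) - Po(A) P(Z > j)) / (lam * P(Z = j))
    j = j_plus_1 - 1
    tail_A = stats.poisson.pmf(A[A > j], lam).sum()
    tail = stats.poisson.sf(j, lam)
    return -(tail_A - PoA * tail) / (lam * stats.poisson.pmf(j, lam))

vals = np.array([g(j) for j in range(1, 40)])
print(np.abs(vals).max() <= min(1, lam ** -0.5))       # sup_j |g(j)| bound
print(np.abs(np.diff(vals)).max() <= min(1, 1 / lam))  # Delta g bound
```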


Example: sum of independent Bernoulli variables

Let $X_1, X_2, \ldots$ be independent Bernoulli($p_i$) and $W = \sum_{i=1}^n X_i$. With $\lambda = \sum_{i=1}^n p_i$,

$$E W g(W) = \sum_i E X_i g(W) = \sum_i E X_i g(W - X_i + 1) = \sum_i p_i E g(W - X_i + 1),$$

and thus

$$E\{\lambda g(W + 1) - W g(W)\} = \sum_i p_i E\big(g(W + 1) - g(W - X_i + 1)\big) = \sum_i p_i^2 E\big(g(W + 1) - g(W - X_i + 1) \mid X_i = 1\big)$$

and

$$|E\{\lambda g(W + 1) - W g(W)\}| \leq \Delta g \sum_i p_i^2 \leq \min(1, \lambda^{-1}) \sum_i p_i^2.$$
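Since $W$ has an explicit p.m.f. (by convolving the Bernoulli masses), the bound can be compared with the exact total variation distance; a sketch with arbitrary small $p_i$:

```python
import numpy as np
from scipy import stats

p = np.array([0.05, 0.10, 0.02, 0.08, 0.12, 0.03])
lam = p.sum()

pmf = np.array([1.0])            # exact p.m.f. of W by convolution
for pi in p:
    pmf = np.convolve(pmf, [1 - pi, pi])

k = np.arange(pmf.size)
dtv = 0.5 * (np.abs(pmf - stats.poisson.pmf(k, lam)).sum()
             + stats.poisson.sf(pmf.size - 1, lam))
print(dtv, min(1, 1 / lam) * (p ** 2).sum())  # exact distance vs the bound
```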


The local approach

Again the argument generalises to local dependence. Let $X_1, \ldots, X_n$ be Bernoulli random variables, $X_i \sim \mathrm{Be}(p_i)$, and put $W = \sum_{i=1}^n X_i$. Let $\lambda = \sum_{i=1}^n p_i$. Suppose that for each $i = 1, \ldots, n$ there exists a set $A_i \subset \{1, \ldots, n\}$ such that $X_i$ is independent of $\sum_{j \notin A_i} X_j$. Define

$$\eta_i = \sum_{j \in A_i} X_j.$$

Theorem. With $\lambda = EW$,

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda)) \leq \sum_{i=1}^n \left( p_i E \eta_i + E(X_i \eta_i) \right) \min(1, \lambda^{-1}).$$


Example: word counts in DNA sequences

Let $B = B_1 B_2 \ldots B_n$ be a string of letters which are i.i.d. from the alphabet $\mathcal{A} = \{A, C, G, T\}$, and let $w = w_1 w_2 \ldots w_m$ be a word of length $m$ composed of letters from $\mathcal{A}$. Let $V = V(w)$ be the number of times that $w$ occurs in the sequence $B$. For $j = 1, \ldots, n - m + 1$ define the word indicators

$$X_j = \begin{cases} 1 & \text{if } B_j B_{j+1} \ldots B_{j+m-1} = w \\ 0 & \text{otherwise.} \end{cases}$$

Let $p(x) = P(B_1 = x)$ for $x = A, C, G, T$; then

$$E X_j = p(w) := \prod_{i=1}^m p(w_i).$$


The neighbourhood of dependence

Put $A_i = \{j \in \{1, \ldots, n\} : |j - i| \leq m - 1\}$. Then $X_i$ is independent of $\sum_{j \notin A_i} X_j$. Moreover, $\eta_i = \sum_{j : |j - i| \leq m - 1} X_j$.


Consider the statement of the theorem,

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda)) \leq \sum_{i=1}^n \left( p_i E \eta_i + E(X_i \eta_i) \right) \min(1, \lambda^{-1}).$$

Here $p_i = p(w)$ for all $i$, and we think of $\lambda = O(1)$, so that roughly $p(w) = O(n^{-1})$. Moreover $E\eta_i \leq (2m - 1) p(w)$, and $E(X_i \eta_i) = \sum_{j : |j - i| \leq m - 1} E X_i X_j$ depends on the amount of self-overlap in $w$. If $w$ does not have any self-overlap then $E X_i X_j = 0$ for $0 < |i - j| \leq m - 1$ and hence

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda)) \leq n (2m - 1) p(w)^2 \min(1, \lambda^{-1}).$$

If $w$ has considerable self-overlap then $E(X_i \eta_i)$ will not be small, and indeed a compound Poisson approximation would be advised.


Example: sequencing by hybridisation

(Arratia et al. 1996) Sequencing by hybridisation is a tool to determine a DNA sequence from the unordered list of all $l$-tuples in the sequence; typically $l = 8, 10, 12$. Assume that the multiset of all $l$-tuples is known. The sequence is then uniquely recoverable unless one of the following occurs in the sequence:

◮ a non-trivial rotation

◮ a non-trivial transposition using three way repeats

◮ a non-trivial transposition using two interleaved pairs of repeats.


Using a Poisson approximation for pairs and three way repeats we can bound the probability of the sequence being uniquely recoverable. For example, if all letters are equally likely, the probability of unique recoverability is at least .95 if:

l = 8 and the sequence length is at most 85;
l = 10 and the sequence length is at most 469;
l = 12 and the sequence length is at most 2,288.


Recall: If $W \geq 0$ and $EW = \lambda$, then $W^s$ has the $W$-size biased distribution if

$$E W f(W) = \lambda E f(W^s)$$

for all $f$ for which both sides exist.

Example: sum of independent Bernoullis, $X_i \sim \mathrm{Be}(p_i)$ for $i = 1, \ldots, n$; pick $V$ with $P(V = v) \propto p_v$; then $W^s = \sum_{u \neq V} X_u + 1$.

Theorem. For a sum of possibly dependent Bernoullis, with $\mathcal{L}(X_j, j \neq i) = \mathcal{L}(X_j, j \neq i \mid X_i = 1)$,

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda)) \leq \Big| \sum_{i=1}^n p_i \Big( E g(W + 1) - E g\Big(\sum_{j \neq i} X_j + 1\Big) \Big) \Big| \leq \Delta g \sum_{i=1}^n p_i \, E\Big| W - \sum_{j \neq i} X_j \Big|.$$


If the Bernoulli random variables are independent then

$$E\Big| W - \sum_{j \neq i} X_j \Big| = E X_i = p_i,$$

and we obtain the same bound as from the local approach.


A general framework

Start with a target distribution $\mu$.

1. Find a characterization: an operator $\mathcal{A}$ such that $X \sim \mu$ if and only if for all smooth functions $f$, $E \mathcal{A} f(X) = 0$.

2. For each smooth function $h$, find the solution $f = f_h$ of the Stein equation

$$h(x) - \int h \, d\mu = \mathcal{A} f(x).$$

3. Then for any variable $W$,

$$E h(W) - \int h \, d\mu = E \mathcal{A} f(W).$$

We usually need to bound $f$, $f'$, or $\Delta f$. Here $h$ is a smooth test function; for nonsmooth functions, see techniques used by Shao, Chen, Rinott and Rotar, Gotze.


The generator approach

Barbour 1989, 1990; Gotze 1993. Choose $\mathcal{A}$ as the generator of a Markov process with stationary distribution $\mu$. That is: let $(X_t)_{t \geq 0}$ be a homogeneous Markov process and put $T_t f(x) = E(f(X_t) \mid X(0) = x)$. The generator is defined as

$$\mathcal{A} f(x) = \lim_{t \downarrow 0} \frac{1}{t} \left( T_t f(x) - f(x) \right).$$


Generator facts

(Ethier and Kurtz (1986), for example)

1. If $\mu$ is the stationary distribution, then $X \sim \mu$ if and only if $E \mathcal{A} f(X) = 0$ for all $f$ for which $\mathcal{A} f$ is defined.

2. $T_t h - h = \mathcal{A} \left( \int_0^t T_u h \, du \right)$, and formally

$$\int h \, d\mu - h = \mathcal{A} \left( \int_0^\infty T_u h \, du \right)$$

if the right-hand side exists.


Examples

1. $\mathcal{A} h(x) = h''(x) - x h'(x)$ is the generator of the Ornstein-Uhlenbeck process, with stationary distribution $\mathcal{N}(0, 1)$.

2. $\mathcal{A} h(x) = \lambda (h(x + 1) - h(x)) + x (h(x - 1) - h(x))$, or equivalently, with $f(x) = h(x) - h(x - 1)$,

$$\mathcal{A} f(x) = \lambda f(x + 1) - x f(x),$$

is the generator of an immigration-death process with immigration rate $\lambda$ and unit per capita death rate; the stationary distribution is Poisson($\lambda$).

Advantage: generalisations to multivariate settings, diffusions, measure spaces...


Chisquare distributions

A generator for $\chi^2_p$ is

$$\mathcal{A} f(x) = x f''(x) + \frac{1}{2}(p - x) f'(x).$$

Luk 1994: a Stein operator for Gamma($r, \lambda$) is

$$\mathcal{A} f(x) = x f''(x) + (r - \lambda x) f'(x),$$

and $\chi^2_p = \mathrm{Gamma}(p/2, 1/2)$. For $\chi^2_p$, $\mathcal{A}$ is the generator of a Markov process given by the solution of the stochastic differential equation

$$X_t = x + \frac{1}{2} \int_0^t (p - X_s) \, ds + \int_0^t \sqrt{2 X_s} \, dB_s,$$

where $B_s$ is standard Brownian motion.


The Stein equation is then

$$(\chi^2_p) \qquad h(x) - \chi^2_p h = x f''(x) + \frac{1}{2}(p - x) f'(x),$$

where $\chi^2_p h$ is the expectation of $h$ under the $\chi^2_p$-distribution.

Lemma (Pickett 2002). Suppose $h : \mathbb{R} \to \mathbb{R}$ is absolutely bounded, $|h(x)| \leq c e^{ax}$ for some $c > 0$, $a \in \mathbb{R}$, and the first $k$ derivatives of $h$ are bounded. Then the equation $(\chi^2_p)$ has a solution $f = f_h$ such that

$$\| f^{(j)} \| \leq \frac{\sqrt{2\pi}}{\sqrt{p}} \| h^{(j-1)} \|, \quad j = 1, \ldots, k,$$

with $h^{(0)} = h$.

(Improvement over Luk 1994 by a factor of $\frac{1}{\sqrt{p}}$.)


Example: squared sum (R. + Pickett)

Let $X_i$, $i = 1, \ldots, n$, be i.i.d. with mean zero, variance one, and existing 8th moment. Put

$$S = \frac{1}{\sqrt{n}} \sum_{i=1}^n X_i$$

and $W = S^2$.

Pickett + R. (2004): the bound on the distance to Chisquare(1) for smooth test functions is of order $\frac{1}{n}$.

Also: Pearson's chisquare statistic. Robert Gaunt: variance-gamma distributions.


One-dimensional Gibbs distributions

(Eichelsbacher + R. (2008)) We consider discrete univariate Gibbs measures $\mu$ with $\mathrm{supp}(\mu) = \{0, \ldots, N\}$, where $N \in \mathbb{N}_0 \cup \{\infty\}$. Such a Gibbs measure can be written as

$$\mu(k) = \frac{1}{Z} \exp(V(k)) \frac{\omega^k}{k!}, \quad k = 0, 1, \ldots, N,$$

with $Z = \sum_{k=0}^N \exp(V(k)) \frac{\omega^k}{k!}$, where $\omega > 0$ is fixed. We shall assume that the normalizing constant $Z$ exists.

Conversely, for a given probability distribution $(\mu(k))_{k \in \mathbb{N}_0}$ we can find a representation as a Gibbs measure by choosing

$$V(k) = \log \mu(k) + \log k! + \log Z - k \log \omega, \quad k = 0, 1, \ldots, N,$$

with $V(0) = \log \mu(0) + \log Z$. We have some freedom in the choice of $\omega$ and thus of $V$. Fix a representation.


A Markov process

To each Gibbs measure we associate a Markovian birth-death process with unit per-capita death rate $d_k = k$ and birth rate

$$b_k = \omega \exp\{V(k + 1) - V(k)\} = (k + 1) \frac{\mu(k + 1)}{\mu(k)},$$

for $k, k + 1 \in \mathrm{supp}(\mu)$. It is easy to see that this birth-death process has invariant measure $\mu$.
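The birth rates are directly computable from $\mu$; for a Poisson($\lambda$) target they reduce to the constant $\lambda$, recovering the immigration-death process from the earlier example. A sketch that also checks detailed balance:

```python
import numpy as np
from scipy import stats

lam, K = 2.0, 30
k = np.arange(K + 1)
mu = stats.poisson.pmf(k, lam)

b = k[1:] * mu[1:] / mu[:-1]   # b_k = (k+1) mu(k+1)/mu(k) for k = 0..K-1
d = k[1:]                      # unit per-capita death rate d_k = k

print(np.allclose(b, lam))                   # constant birth rate lambda
print(np.allclose(b * mu[:-1], d * mu[1:]))  # detailed balance: mu invariant
```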


The generator

Following the generator approach to Stein's method, we would therefore choose as generator

$$(\mathcal{A} h)(k) = (h(k + 1) - h(k)) \exp\{V(k + 1) - V(k)\} \, \omega + k (h(k - 1) - h(k))$$

or, with the simplification $f(k) = h(k) - h(k - 1)$,

$$(\mathcal{A} f)(k) = f(k + 1) \exp\{V(k + 1) - V(k)\} \, \omega - k f(k).$$


Examples

1. The Poisson distribution with parameter λ > 0: We use ω = λ, V(k) = −λ, Z = 1. The Stein operator is

(\mathcal{A}f)(k) = f(k+1)\,\lambda - k f(k)

as before.

2. The Binomial distribution with parameters n and 0 < p < 1: We use ω = p/(1−p), V(k) = −log((n−k)!), and Z = (n!(1−p)^n)^{−1}. The Stein operator is

(\mathcal{A}f)(k) = f(k+1)\,\frac{p(n-k)}{1-p} - k f(k).

3. The Hypergeometric distribution (drawing n balls from an urn with a white and b black balls, say): The Stein operator is

(\mathcal{A}f)(k) = (a-k)(n-k)\, f(k+1) - (b-n+k)\, k f(k).
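A numerical sanity check for the Binomial operator (a sketch; E[(Af)(X)] should vanish for X ∼ Bin(n, p)):

    import math

    n, p = 10, 0.3
    mu = lambda k: math.comb(n, k) * p**k * (1 - p) ** (n - k)
    f = lambda k: math.sin(k)                                  # arbitrary bounded f
    Af = lambda k: f(k + 1) * p * (n - k) / (1 - p) - k * f(k)
    print(sum(Af(k) * mu(k) for k in range(n + 1)))            # ~ 0 up to rounding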


The solution of the Stein equation

For a given function h, a solution f of the Stein equation is given by f(0) = 0, f(k) = 0 for k ∉ supp(µ), and

f(j+1) = \frac{j!}{\omega^{j+1}} e^{-V(j+1)} \sum_{k=0}^{j} e^{V(k)} \frac{\omega^k}{k!} \left( h(k) - \mu(h) \right) = -\frac{j!}{\omega^{j+1}} e^{-V(j+1)} \sum_{k=j+1}^{N} e^{V(k)} \frac{\omega^k}{k!} \left( h(k) - \mu(h) \right);

the two expressions agree because the full sum over k = 0, . . . , N vanishes. We can bound the solution and its first difference ∆f.
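A minimal sketch constructing f for the Poisson(λ) representation (V constant, so the e^V factors cancel) and checking the Stein equation pointwise:

    import math

    lam, N = 3.0, 80
    mu = lambda k: math.exp(-lam) * lam**k / math.factorial(k)
    h = lambda k: float(k <= 2)                    # an indicator test function
    mu_h = sum(h(k) * mu(k) for k in range(N))     # mu(h), truncated
    def f(j):                                      # partial-sum formula from the slide
        if j <= 0:
            return 0.0
        s = sum(lam**k / math.factorial(k) * (h(k) - mu_h) for k in range(j))
        return math.factorial(j - 1) / lam**j * s
    for j in range(6):                             # Stein equation: Af = h - mu(h)
        assert abs(f(j + 1) * lam - j * f(j) - (h(j) - mu_h)) < 1e-9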


Exploiting size-biasing

Recall that, for W ≥ 0 with EW > 0, we say that W* has the W-size biased distribution if

E[W g(W)] = E[W]\, E[g(W^*)]

for all g for which both sides exist. In particular we obtain that

E\left\{ \exp\{V(X+1) - V(X)\}\,\omega\, g(X+1) - X g(X) \right\} = \omega\, E\left[ \exp\{V(X+1) - V(X)\}\, g(X+1) \right] - E[X]\, E[g(X^*)]

and, for X ∼ µ,

E[X] = \omega\, E\, e^{V(X+1) - V(X)}.
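A sketch of the size-bias identity for X ∼ Poisson(λ), where X* has the distribution of X + 1 (a standard fact):

    import math

    lam, N = 3.0, 80
    mu = lambda k: math.exp(-lam) * lam**k / math.factorial(k)
    g = lambda k: 1.0 / (1.0 + k)
    lhs = sum(k * g(k) * mu(k) for k in range(N))            # E[X g(X)]
    rhs = lam * sum(g(k + 1) * mu(k) for k in range(N))      # E[X] E[g(X*)], E[X] = lam
    print(lhs, rhs)                                          # agree up to truncation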


A Gibbs measure characterisation

Let X ≥ 0 be such that 0 < E(X) < ∞, and let µ be a discrete univariate Gibbs measure on the non-negative integers. Then X ∼ µ if and only if for all bounded g we have that

\omega\, E\left[ e^{V(X+1) - V(X)}\, g(X+1) \right] = \omega\, E\left[ e^{V(X+1) - V(X)} \right] E[g(X^*)].


And here is the theorem

For any W ≥ 0 with 0 < EW < ∞ we have that

E h(W) - \mu(h) = \omega \left\{ E\left[ e^{V(W+1) - V(W)}\, f(W+1) \right] - E\left[ e^{V(W+1) - V(W)} \right] E[f(W^*)] \right\},

where f is the solution of the Stein equation.
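A numerical sketch of the theorem, with target µ = Poisson(λ) and W ∼ Bin(n, λ/n); here W* has the distribution of W′ + 1 with W′ ∼ Bin(n−1, λ/n) (a standard size-bias fact for sums of i.i.d. indicators), and V is constant, so e^{V(W+1)−V(W)} = 1:

    import math

    lam, n, N = 3.0, 30, 100
    p = lam / n
    mu = lambda k: math.exp(-lam) * lam**k / math.factorial(k)
    binom = lambda m, q, k: math.comb(m, k) * q**k * (1 - q) ** (m - k)
    h = lambda k: float(k <= 2)
    mu_h = sum(h(k) * mu(k) for k in range(N))
    def f(j):                                  # Stein solution, as on the earlier slide
        if j <= 0:
            return 0.0
        s = sum(lam**k / math.factorial(k) * (h(k) - mu_h) for k in range(j))
        return math.factorial(j - 1) / lam**j * s
    lhs = sum(h(k) * binom(n, p, k) for k in range(n + 1)) - mu_h
    rhs = lam * (sum(f(k + 1) * binom(n, p, k) for k in range(n + 1))
                 - sum(f(k + 1) * binom(n - 1, p, k) for k in range(n)))
    print(lhs, rhs)                            # agree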


Example: lattice approximation in statistical physics

Chayes and Klein (1994): Assume A ⊂ R^d is a rectangle; denote its volume by |A|. Consider the intersection of A with the d-dimensional lattice n^{−1}Z^d. With each site m in this intersection we associate a Bernoulli random variable X^n_m which takes the value 1, with probability p^n_m, if a particle is present at m, and 0 otherwise. If the collection (X^n_m)_m is independent, the joint distribution can be interpreted as the Gibbs distribution for an ideal gas on the lattice, and we have Poisson convergence.
We provide bounds on the approximation for such particles while allowing for interactions.


More details

The model is as follows. Pick n ∈ N and suppose that A can be partitioned into a regular array of d(n) sub-rectangles {S^n_1, . . . , S^n_{d(n)}} with volumes v(S^n_m) = z^n_m / z_n. For each m, n choose a point q^n_m ∈ S^n_m. We consider a sequence of functions (f_k)_k satisfying f_0 ≡ 1 and, for each k ≥ 1, f_k(x_1, . . . , x_k) is a nonnegative function, Riemann integrable on A^k, such that f_k is symmetric for each k.


Let X^n_m, 1 ≤ m ≤ d(n), be 0-1 random variables with joint density function

P\left( X^n_1 = a_1, \ldots, X^n_{d(n)} = a_{d(n)} \right) \propto f_k\left( q^n_{i_1}, \ldots, q^n_{i_k} \right) \prod_{m=1}^{d(n)} (z^n_m)^{a_m},

where each a_i ∈ {0, 1}, the indices i_1 < · · · < i_k on the right-hand side are the sites m with a_m = 1, and k = \sum_{m=1}^{d(n)} a_m. Define S_n = \sum_{m=1}^{d(n)} X^n_m.


Then, due to the symmetry of f_k,

P(S_n = k) = \frac{ \dfrac{(z_n)^k}{k!} \sum_{i_1=1}^{d(n)} \cdots \sum_{i_k=1}^{d(n)} f_k\left( q^n_{i_1}, \ldots, q^n_{i_k} \right) \prod_{m=1}^{k} v\left( S^n_{i_m} \right) }{ \sum_{k=0}^{d(n)} \dfrac{(z_n)^k}{k!} \sum_{i_1=1}^{d(n)} \cdots \sum_{i_k=1}^{d(n)} f_k\left( q^n_{i_1}, \ldots, q^n_{i_k} \right) \prod_{m=1}^{k} v\left( S^n_{i_m} \right) }.

Note that we can write this distribution as a Gibbs measure µ_n.


Let S be a nonnegative integer-valued random variable defined by

P(S = k) = \frac{ \dfrac{z^k}{k!} \int_{A^k} f_k(x_1, \ldots, x_k)\, dx_1 \cdots dx_k }{ \sum_{k=0}^{\infty} \dfrac{z^k}{k!} \int_{A^k} f_k(x_1, \ldots, x_k)\, dx_1 \cdots dx_k }.

We write this distribution as a Gibbs measure µ. Using our approach we can bound the total variation distance between µ_n and µ.
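A toy numerical sketch of µ_n versus µ (all choices here are illustrative, not from the lecture): d = 1, A = [0, 1], a hard-core interaction f_k(x) = ∏_{i<j} 1[|x_i − x_j| > r], activity z, both measures truncated at k ≤ K, and the A^k integrals replaced by Riemann sums on a finer grid:

    import itertools, math

    z, r, K = 1.0, 0.2, 3
    def fk(xs):  # hard-core pair interaction; f_0 = 1 (empty product)
        return float(all(abs(a - b) > r for i, a in enumerate(xs) for b in xs[i + 1:]))
    def weights(points, vol):  # z^k/k! * sum_{i_1..i_k} f_k(...) * vol^k, k = 0..K
        return [z**k / math.factorial(k) *
                sum(fk(xs) * vol**k for xs in itertools.product(points, repeat=k))
                for k in range(K + 1)]
    def normalise(w):
        s = sum(w)
        return [x / s for x in w]
    coarse = [(m + 0.5) / 10 for m in range(10)]    # d(n) = 10 cells, midpoints q^n_m
    fine = [(m + 0.5) / 60 for m in range(60)]      # stand-in for the A^k integrals
    mu_n, mu = normalise(weights(coarse, 1 / 10)), normalise(weights(fine, 1 / 60))
    print(0.5 * sum(abs(a - b) for a, b in zip(mu_n, mu)))  # total variation distance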


Comparing distributions via generators

Let µ have generator A and corresponding (ω, V), and let µ_2 have generator A_2 and corresponding (ω_2, V_2). Assume that both birth-death processes have unit per-capita death rates. Then, for X ∼ µ_2, if the solution f of the Stein equation for µ is such that A_2 f exists,

E h(X) - \mu(h) = E[\mathcal{A}f(X)] = E[(\mathcal{A} - \mathcal{A}_2) f(X)].

Using that X* has the X-size bias distribution,

|E h(X) - \mu(h)| \le \|f\|\, E(X) \left\{ \frac{|\omega - \omega_2|}{\omega_2} + \frac{\omega}{\omega_2}\, E\left| e^{(V(X^*) - V(X^*-1)) - (V_2(X^*) - V_2(X^*-1))} - 1 \right| \right\}.


For example, for two Poisson distributions Poisson(λ_1) and Poisson(λ_2) this approach gives

\left| E h(X) - \int h\, d\mu \right| \le \|f\|\, |\lambda_1 - \lambda_2|.

Note that the normalising constant Z in the Gibbs distribution, which is often difficult to calculate, is not needed explicitly in the Stein approach. This is one of the main advantages of the above approach.
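A numerical sketch comparing this bound with the exact total variation distance (using only that the Poisson Stein solution satisfies ‖f‖ ≤ 1 for indicator test functions, a standard bound):

    import math

    l1, l2, N = 3.0, 3.5, 120
    pmf = lambda lam, k: math.exp(-lam) * lam**k / math.factorial(k)
    tv = 0.5 * sum(abs(pmf(l1, k) - pmf(l2, k)) for k in range(N))
    print(tv, abs(l1 - l2))   # the Stein bound dominates the true distance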


Final Remarks

In the first example, X_1, X_2, . . . , X_n i.i.d., P(X_i = 1) = p = 1 − P(X_i = 0), using Stein's method we can show that

\sup_x \left| P\left( (np(1-p))^{-1/2} \sum_{i=1}^{n} (X_i - p) \le x \right) - P(\mathcal{N}(0,1) \le x) \right| \le 6 \sqrt{\frac{p(1-p)}{n}}

and

\sup_x \left| P\left( \sum_{i=1}^{n} X_i = x \right) - P(\mathrm{Po}(np) = x) \right| \le \min(np^2, p).

So, if p < 36/(n + 36), the bound on the Poisson approximation is smaller than the bound on the normal approximation.
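To see the crossover at p = 36/(n + 36), one can simply evaluate both bounds (a sketch; n = 100 is an arbitrary choice):

    import math

    n = 100
    for p in (0.05, 36 / (n + 36), 0.5):
        normal_bound = 6 * math.sqrt(p * (1 - p) / n)
        poisson_bound = min(n * p**2, p)
        print(f"p={p:.4f}  normal={normal_bound:.4f}  Poisson={poisson_bound:.4f}")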


Application areas

◮ sequence analysis

◮ random graph and network statistics

◮ number theory

◮ MCMC convergence

◮ exceedances in time series

◮ interacting particle systems

◮ epidemics

◮ permutation tests


◮ Many more distributions

◮ Exchangeable pair couplings, also used for variance reduction in simulations

◮ Multivariate, also coupling approaches

◮ Infinite-dimensional: random measures, functionals of Gaussian random fields

◮ General distributional transformations

◮ Bounds in the presence of dependence

◮ In the i.i.d. case the Berry-Esseen inequality is not quite recovered.


Further reading

1. Arratia, R., Goldstein, L., and Gordon, L. (1989). Two moments suffice for Poisson approximation: The Chen-Stein method. Ann. Probab. 17, 9-25.

2. Barbour, A.D., Holst, L., and Janson, S. (1992). Poisson Approximation. Oxford University Press.

3. Reinert, G. (2005). Three general approaches to Stein's method. In: A Program in Honour of Charles Stein: Tutorial Lecture Notes. A.D. Barbour, L.H.Y. Chen, eds. World Scientific, Singapore, 183-221.
