Bootstrap (Part 4)
Christof Seiler
Stanford University, Spring 2016, Stats 205
Overview
- So far:
  - Nonparametric bootstrap on the rows (e.g. regression, PCA with random rows and columns)
  - Nonparametric bootstrap on the residuals (e.g. regression)
  - Parametric bootstrap (e.g. PCA with fixed rows and columns)
  - Studentized bootstrap
- Today:
  - Bias-corrected-accelerated (BCa) bootstrap
  - From BCa to ABC
Motivation
- Correlation coefficient of a bivariate normal with $\rho = 0.577$
sigma = matrix(nrow = 2, ncol = 2)
diag(sigma) = 1
rho = 0.577
sigma[1,2] = sigma[2,1] = rho
sigma

##       [,1]  [,2]
## [1,] 1.000 0.577
## [2,] 0.577 1.000
- Distribution of the sample correlation coefficient (n = 10); a simulation sketch follows
- Compare: Percentile, Studentized, and Bias-Corrected-Accelerated (BCa) bootstrap
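
A minimal sketch of how the histogram of corhat on the next slide could be reproduced (the seed, the 10000 replicates, and the use of MASS::mvrnorm are assumptions, not from the original slides; sigma is the covariance matrix defined above):

library(MASS)  # for mvrnorm

set.seed(1)
n = 10
corhat = replicate(10000, {
  xy = mvrnorm(n, mu = c(0, 0), Sigma = sigma)  # one bivariate normal sample
  cor(xy[, 1], xy[, 2])                         # its sample correlation
})
hist(corhat)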
Motivation

[Figure: Histogram of corhat]
bias = rho - mean(corhat); bias
## [1] 0.0217078
Motivation
[Figure: Percentile Bootstrap]
Motivation
- Studentized bootstrap with variance stabilization fails due to numerical problems
Motivation
[Figure: Studentized Bootstrap Without Variance Stabilization]
Motivation
[Figure: BCa Bootstrap]
Motivation
[Figure: three panels comparing Percentile Bootstrap, Studentized Bootstrap Without Variance Stabilization, and BCa Bootstrap]
BCa Bootstrap
- The bias-corrected bootstrap is similar to the percentile bootstrap
- Recall the percentile bootstrap (a minimal R sketch follows this list):
  - Take bootstrap samples $\hat\theta^*_1, \ldots, \hat\theta^*_B$
  - Order them: $\hat\theta^*_{(1)}, \ldots, \hat\theta^*_{(B)}$
  - Define the interval as
    $$(\hat\theta^*_{(B\alpha)}, \hat\theta^*_{(B(1-\alpha))})$$
    (assuming that $B\alpha$ and $B(1-\alpha)$ are integers)
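
A minimal sketch of the percentile interval on a toy bivariate sample (the data, B = 2000, and alpha = 0.05 are illustrative assumptions):

set.seed(1)
n = 10
x = matrix(rnorm(2 * n), ncol = 2)  # toy bivariate sample
B = 2000
thetastar = replicate(B, {
  idx = sample(n, replace = TRUE)   # resample rows with replacement
  cor(x[idx, 1], x[idx, 2])         # bootstrap replicate of the correlation
})
quantile(thetastar, probs = c(0.025, 0.975))  # percentile interval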
BCa Bootstrap
- Assume that there is a monotone increasing transformation $g$ such that $\hat\phi = g(\hat\theta)$ and $\phi = g(\theta)$
- The BCa bootstrap is based on the model
  $$\frac{\hat\phi - \phi}{\sigma_\phi} \sim N(-z_0, 1) \quad \text{with} \quad \sigma_\phi = 1 + a\phi$$
- which is a generalization of the usual normal approximation
  $$\frac{\hat\theta - \theta}{\sigma} \sim N(0, 1)$$
BCa Bootstrap
- $z_0$ is the bias estimate
- $z_0$ measures the discrepancy between the median of $\hat\theta^*$ and $\hat\theta$
- It is estimated as (see the sketch below)
  $$\hat z_0 = \Phi^{-1}\left(\frac{\#\{\hat\theta^*_b < \hat\theta\}}{B}\right)$$
- We obtain $\hat z_0 = 0$ if exactly half of the $\hat\theta^*_b$ values fall below $\hat\theta$
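
Continuing the toy sketch above, the bias estimate could be computed as follows (variable names are assumptions, not the slides' code):

thetahat = cor(x[, 1], x[, 2])             # estimate on the original sample
z0hat = qnorm(mean(thetastar < thetahat))  # Phi^{-1} of the fraction of replicates below thetahat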
BCa Bootstrap
- $a$ is the skewness estimate
- $a$ measures the rate of change of the standard error of $\hat\theta$ with respect to the true parameter $\theta$
- It is estimated using the jackknife (see the sketch below):
  - Delete the $i$th observation from the original sample, denote the estimate on the reduced sample by $\hat\theta_{(i)}$, and compute
    $$\hat\theta_{(\cdot)} = \sum_{i=1}^n \frac{\hat\theta_{(i)}}{n}$$
  - Then
    $$\hat a = \frac{\sum_{i=1}^n (\hat\theta_{(\cdot)} - \hat\theta_{(i)})^3}{6\{\sum_{i=1}^n (\hat\theta_{(\cdot)} - \hat\theta_{(i)})^2\}^{3/2}}$$
BCa Bootstrap
- The bias-corrected version makes two additional corrections to the percentile version
- It redefines the lower level $\alpha_1$ and the upper level $\alpha_2$ as
  $$\alpha_1 = \Phi\left(\hat z_0 + \frac{\hat z_0 + z^{(\alpha)}}{1 - \hat a(\hat z_0 + z^{(\alpha)})}\right), \qquad \alpha_2 = \Phi\left(\hat z_0 + \frac{\hat z_0 + z^{(1-\alpha)}}{1 - \hat a(\hat z_0 + z^{(1-\alpha)})}\right)$$
  with $z^{(\alpha)}$ the $100\alpha$ percentile of the standard normal and $\Phi$ the standard normal CDF
- When $\hat a$ and $\hat z_0$ are equal to zero, $\alpha_1 = \alpha$ and $\alpha_2 = 1 - \alpha$
- The interval is then given by (see the sketch below)
  $$(\hat\theta^*_{(B\alpha_1)}, \hat\theta^*_{(B\alpha_2)})$$
  (assuming that $B\alpha_1$ and $B\alpha_2$ are integers)
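
Putting the pieces of the toy sketch together (a two-sided interval with alpha = 0.05 is an assumption):

alpha = 0.05
z = qnorm(c(alpha / 2, 1 - alpha / 2))                       # z^(alpha) for both tails
adj = pnorm(z0hat + (z0hat + z) / (1 - ahat * (z0hat + z)))  # corrected levels alpha_1, alpha_2
quantile(thetastar, probs = adj)                             # BCa interval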
BCa Bootstrap
- Same asymptotic accuracy as the studentized bootstrap
- Can also handle the out-of-range problem
- See Efron (1987) for a detailed justification of this model
BCa Bootstrap in R
library(bootstrap)
xdata = matrix(rnorm(30), ncol = 2); n = 15
theta = function(x, xdata) {
  cor(xdata[x, 1], xdata[x, 2])
}
results = bcanon(1:n, 100, theta, xdata, alpha = c(0.025, 0.975))
results$confpoints

##      alpha bca point
## [1,] 0.025  -0.39659
## [2,] 0.975   0.69326
Properties of Different Bootstrap Methods

                                      Standard  Percentile  Studentized*  BCa
Asymptotic accuracy                   O(1/√n)   O(1/√n)     O(1/n)        O(1/n)
Range-preserving                      No        Yes         No            Yes
Transformation-invariant              No        Yes         No            Yes
Bias-correcting                       No        No          No            Yes
Skewness-correcting                   No        Yes         Yes           Yes
σ, σ* required                        No        No          Yes           No
Analytic constant or variance-
  stabilizing transformation required No        No          Yes           Yes

* with variance stabilization
Properties of Different Bootstrap Methods

For the nonparametric bootstrap:

[Table omitted; source: Carpenter and Bithell (2000)]
Many More Topics
- Using the bootstrap for better confidence in model selection (Efron 2014)
- Using the jackknife and the infinitesimal jackknife for confidence intervals in random forest prediction or classification (Wager, Hastie, and Efron 2014)
Approximate Bayesian Computation (ABC)
- Goal: we wish to sample from the posterior distribution $p(\theta \mid D)$ given data $D$:
  $$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}$$
- Setting:
  - The likelihood $p(D \mid \theta)$ is hard to evaluate or expensive to compute (e.g. missing normalizing constant)
  - It is easy to sample from the likelihood $p(D \mid \theta)$
  - It is easy to sample from the prior $p(\theta)$
- Examples:
  - Population genetics (latent variables)
  - Ecology, epidemiology, systems biology (models based on differential equations)
Approximate Bayesian Computation (ABC)
- Sampling algorithm (with data $D = \{y_1, \ldots, y_n\}$):
  1. Sample $\theta_i \sim p(\theta)$
  2. Sample $x_i \sim p(x \mid \theta_i)$
  3. Reject $\theta_i$ if $x_i \neq y_j$ for $j = 1, \ldots, n$
- ABC sampling (define a statistic $\mu$, a distance $\rho$, and a tolerance $\varepsilon$; see the sketch below):
  1. Sample $\theta_i \sim p(\theta)$
  2. Sample $D_i = \{x_1, \ldots, x_k\} \sim p(x \mid \theta_i)$
  3. Reject $\theta_i$ if $\rho(\mu(D_i), \mu(D)) > \varepsilon$
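
A toy ABC rejection sampler for the mean of a normal; the prior, the summary statistic (the sample mean), the distance, and the tolerance are all illustrative assumptions:

set.seed(1)
y = rnorm(20, mean = 2)                    # observed data D
eps = 0.1                                  # tolerance epsilon
M = 50000                                  # number of proposals
theta = rnorm(M, mean = 0, sd = 5)         # 1. sample theta_i from the prior
mu_sim = sapply(theta, function(t)
  mean(rnorm(length(y), mean = t)))        # 2. simulate D_i, summarize with mu = mean
posterior = theta[abs(mu_sim - mean(y)) <= eps]  # 3. keep theta_i with rho <= eps
hist(posterior)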
References
- Efron (1987). Better Bootstrap Confidence Intervals
- Hall (1992). The Bootstrap and Edgeworth Expansion
- Efron and Tibshirani (1994). An Introduction to the Bootstrap
- Carpenter and Bithell (2000). Bootstrap Confidence Intervals: When, Which, What? A Practical Guide for Medical Statisticians
- Marin, Pudlo, Robert, and Ryder (2012). Approximate Bayesian Computational Methods
- Efron (2014). Estimation and Accuracy after Model Selection
- Wager, Hastie, and Efron (2014). Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife