Bootstrap (Part 4) Christof Seiler Stanford University, Spring 2016, Stats 205

Slides: christofseiler.github.io/stats205/Lecture27/BiasCorrection.pdf


Page 1:

Bootstrap (Part 4)

Christof Seiler

Stanford University, Spring 2016, Stats 205

Page 2:

Overview

- So far:
  - Nonparametric bootstrap on the rows (e.g. regression, PCA with random rows and columns)
  - Nonparametric bootstrap on the residuals (e.g. regression)
  - Parametric bootstrap (e.g. PCA with fixed rows and columns)
  - Studentized bootstrap
- Today:
  - Bias-corrected and accelerated (BCa) bootstrap
  - From BCa to ABC

Page 3:

Motivation

- Correlation coefficient of bivariate normal with ρ = 0.577

sigma = matrix(nrow = 2, ncol = 2)
diag(sigma) = 1
rho = 0.577
sigma[1,2] = sigma[2,1] = rho
sigma

##       [,1]  [,2]
## [1,] 1.000 0.577
## [2,] 0.577 1.000

- Distribution of the sample correlation coefficient (n = 10)
- Compare: percentile, studentized, and bias-corrected and accelerated (BCa) bootstrap

Page 4:

Motivation

[Figure: histogram of corhat, the bootstrap replicates of the sample correlation coefficient]

bias = rho - mean(corhat); bias

## [1] 0.0217078

Page 5:

Motivation

[Figure: histogram of bootstrap replicates, Percentile Bootstrap]

Page 6:

Motivation

- Studentized bootstrap with variance stabilization fails due to numerical problems

Page 7:

Motivation

[Figure: histogram of bootstrap replicates, Studentized Bootstrap Without Variance Stabilization]

Page 8:

Motivation

[Figure: histogram of bootstrap replicates, BCa Bootstrap]

Page 9:

Motivation

[Figure: three stacked histograms of bootstrap replicates comparing the Percentile Bootstrap, the Studentized Bootstrap Without Variance Stabilization, and the BCa Bootstrap]

Page 10:

BCa Bootstrap

- The bias-corrected bootstrap is similar to the percentile bootstrap
- Recall the percentile bootstrap:
  - Take bootstrap samples $\hat\theta^*_1, \dots, \hat\theta^*_B$
  - Order them: $\hat\theta^*_{(1)}, \dots, \hat\theta^*_{(B)}$
  - Define the interval as

    $$(\hat\theta^*_{(B\alpha)}, \hat\theta^*_{(B(1-\alpha))})$$

    (assuming that $B\alpha$ and $B(1-\alpha)$ are integers)
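The percentile recipe above can be sketched end to end. The lecture's examples are in R; as an illustrative aside, here is a minimal Python/NumPy version for the sample correlation coefficient (the seed, the number of replicates B, and the sample size are arbitrary choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw a small bivariate normal sample with correlation rho = 0.577,
# as in the motivating example
rho = 0.577
cov = np.array([[1.0, rho], [rho, 1.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=10)

# Percentile bootstrap: resample rows with replacement, recompute the
# statistic, and take empirical quantiles of the replicates
B = 2000
theta_star = np.empty(B)
for b in range(B):
    idx = rng.integers(0, len(x), size=len(x))
    theta_star[b] = np.corrcoef(x[idx, 0], x[idx, 1])[0, 1]

alpha = 0.05
lower, upper = np.quantile(theta_star, [alpha, 1 - alpha])
```

Because the statistic is a correlation, the endpoints automatically stay inside [-1, 1], which is the range-preserving property noted later in the comparison table.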

Page 11:

BCa Bootstrap

- Assume that there is a monotone increasing transformation $g$ such that $\varphi = g(\theta)$ and $\hat\varphi = g(\hat\theta)$
- The BCa bootstrap is based on the model

  $$\frac{\hat\varphi - \varphi}{\sigma_\varphi} \sim N(-z_0, 1) \quad \text{with} \quad \sigma_\varphi = 1 + a\varphi$$

- This is a generalization of the usual normal approximation

  $$\frac{\hat\theta - \theta}{\sigma} \sim N(0, 1)$$

Page 12:

BCa Bootstrap

- $z_0$ is the bias estimate
- $z_0$ measures the discrepancy between the median of $\hat\theta^*$ and $\hat\theta$
- It is estimated with

  $$\hat z_0 = \Phi^{-1}\left(\frac{\#\{\hat\theta^*_b < \hat\theta\}}{B}\right)$$

- We obtain $\hat z_0 = 0$ if exactly half of the $\hat\theta^*_b$ values are less than $\hat\theta$
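As a sanity check on this formula, a minimal Python sketch (the helper name `bias_correction_z0` is hypothetical, not from the slides):

```python
import numpy as np
from statistics import NormalDist  # stdlib standard normal (Phi and its inverse)

def bias_correction_z0(theta_star, theta_hat):
    # z0 = Phi^{-1}( #{theta*_b < theta_hat} / B )
    # Note: inv_cdf requires 0 < p < 1, so the fraction must not be 0 or 1
    p = float(np.mean(theta_star < theta_hat))
    return NormalDist().inv_cdf(p)

# Exactly half of the replicates below theta_hat gives z0 = 0
theta_star = np.array([0.1, 0.2, 0.4, 0.5])
z0 = bias_correction_z0(theta_star, theta_hat=0.3)  # 0.0
```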

Page 13:

BCa Bootstrap

- $a$ is the skewness (acceleration) estimate
- $a$ measures the rate of change of the standard error of $\hat\theta$ with respect to the true parameter $\theta$
- It is estimated using the jackknife:
  - Delete the $i$th observation from the original sample and denote the estimate computed from the remaining observations by $\hat\theta_{(i)}$; their average is

    $$\hat\theta_{(\cdot)} = \sum_{i=1}^n \frac{\hat\theta_{(i)}}{n}$$

  - Then

    $$\hat a = \frac{\sum_{i=1}^n (\hat\theta_{(\cdot)} - \hat\theta_{(i)})^3}{6\left\{\sum_{i=1}^n (\hat\theta_{(\cdot)} - \hat\theta_{(i)})^2\right\}^{3/2}}$$
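The jackknife estimate of a can be sketched as follows (illustrative Python; the function name is an assumption, not from the slides):

```python
import numpy as np

def acceleration(x, statistic):
    # a = sum (theta_dot - theta_(i))^3 / ( 6 * { sum (theta_dot - theta_(i))^2 }^(3/2) )
    n = len(x)
    # Leave-one-out estimates theta_(i) and their average theta_dot
    theta_i = np.array([statistic(np.delete(x, i, axis=0)) for i in range(n)])
    theta_dot = theta_i.mean()
    d = theta_dot - theta_i
    return np.sum(d**3) / (6.0 * np.sum(d**2) ** 1.5)

# For a symmetric sample and the sample mean, the acceleration vanishes
a_sym = acceleration(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), np.mean)  # 0.0
# A right-skewed sample gives a positive acceleration
a_skew = acceleration(np.array([1.0, 1.0, 1.0, 10.0]), np.mean)
```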

Page 14:

BCa Bootstrap

- The bias-corrected version makes two additional corrections to the percentile version, redefining the lower level $\alpha_1$ and the upper level $\alpha_2$ as

  $$\alpha_1 = \Phi\left(\hat z_0 + \frac{\hat z_0 + z^{(\alpha)}}{1 - \hat a(\hat z_0 + z^{(\alpha)})}\right) \qquad \alpha_2 = \Phi\left(\hat z_0 + \frac{\hat z_0 + z^{(1-\alpha)}}{1 - \hat a(\hat z_0 + z^{(1-\alpha)})}\right)$$

  with $z^{(\alpha)}$ the $100\alpha$ percentile of the standard normal distribution and $\Phi$ the standard normal CDF
- When $\hat a$ and $\hat z_0$ are both zero, $\alpha_1 = \alpha$ and $\alpha_2 = 1 - \alpha$
- The interval is then given by

  $$(\hat\theta^*_{(B\alpha_1)}, \hat\theta^*_{(B\alpha_2)})$$

  (assuming that $B\alpha_1$ and $B\alpha_2$ are integers)
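The adjusted levels can be computed directly; here is an illustrative Python sketch (the helper name `bca_levels` is an assumption, not from the slides):

```python
from statistics import NormalDist  # stdlib Phi and Phi^{-1}

def bca_levels(z0, a, alpha=0.05):
    # alpha_k = Phi( z0 + (z0 + z)/(1 - a*(z0 + z)) ) for z = z^(alpha), z^(1-alpha)
    Phi = NormalDist().cdf
    z = NormalDist().inv_cdf
    def adjust(z_alpha):
        return Phi(z0 + (z0 + z_alpha) / (1.0 - a * (z0 + z_alpha)))
    return adjust(z(alpha)), adjust(z(1.0 - alpha))

# With z0 = a = 0 the correction vanishes and we recover the plain
# percentile levels alpha and 1 - alpha
a1, a2 = bca_levels(z0=0.0, a=0.0, alpha=0.05)
# The BCa interval is then the (a1, a2) empirical quantiles of the
# bootstrap replicates, e.g. np.quantile(theta_star, [a1, a2])
```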

Page 15:

BCa Bootstrap

- Same asymptotic accuracy as the studentized bootstrap
- Can also handle the out-of-range problem
- See Efron (1987) for a detailed justification of this model

Page 16:

BCa Bootstrap in R

library(bootstrap)
xdata = matrix(rnorm(30), ncol = 2); n = 15
theta = function(x, xdata) {
  cor(xdata[x,1], xdata[x,2])
}
results = bcanon(1:n, 100, theta, xdata, alpha = c(0.025, 0.975))
results$confpoints

##      alpha bca point
## [1,] 0.025  -0.39659
## [2,] 0.975   0.69326

Page 17:

Properties of Different Bootstrap Methods

                              Standard   Percentile   Studentized*   BCa
Asymptotic accuracy           O(1/√n)    O(1/√n)      O(1/n)         O(1/n)
Range-preserving              No         Yes          No             Yes
Transformation-invariant      No         Yes          No             Yes
Bias-correcting               No         No           No             Yes
Skewness-correcting           No         Yes          Yes            Yes
σ, σ∗ required                No         No           Yes            No
Analytic constant or
variance stabilizing
transformation required       No         No           Yes            Yes

* with variance stabilization

Page 18:

Properties of Different Bootstrap Methods

For the nonparametric bootstrap:

Source: Carpenter and Bithell (2000)

Page 19:

Many More Topics

- Using the bootstrap for better confidence in model selection (Efron 2014)
- Using the jackknife and the infinitesimal jackknife for confidence intervals in random forest prediction or classification (Wager, Hastie, and Efron 2014)

Page 20:

Approximate Bayesian Computation (ABC)

- Goal: we wish to sample from the posterior distribution $p(\theta \mid D)$ given data $D$

  $$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}$$

- Setting:
  - The likelihood $p(D \mid \theta)$ is hard to evaluate or expensive to compute (e.g. missing normalizing constant)
  - It is easy to sample from the likelihood $p(D \mid \theta)$
  - It is easy to sample from the prior $p(\theta)$
- Examples:
  - Population genetics (latent variables)
  - Ecology, epidemiology, systems biology (models based on differential equations)

Page 21:

Approximate Bayesian Computation (ABC)

- Sampling algorithm (with data $D = \{y_1, \dots, y_n\}$):
  1. Sample $\theta_i \sim p(\theta)$
  2. Sample $x_i \sim p(x \mid \theta_i)$
  3. Reject $\theta_i$ if $x_i \neq y_j$ for $j = 1, \dots, n$
- ABC sampling (define a summary statistic $\mu$, a distance $\rho$, and a tolerance $\epsilon$):
  1. Sample $\theta_i \sim p(\theta)$
  2. Sample $D_i = \{x_1, \dots, x_k\} \sim p(x \mid \theta_i)$
  3. Reject $\theta_i$ if $\rho(\mu(D_i), \mu(D)) > \epsilon$
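The ABC rejection sampler above can be sketched on a toy model: data from N(θ, 1), summary statistic µ = sample mean, distance ρ = absolute difference. All concrete choices here (prior, tolerance, sample sizes, seed) are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

# Observed data: 50 draws from N(theta_true, 1) with theta_true = 2
y = rng.normal(2.0, 1.0, size=50)
mu_obs = y.mean()   # summary statistic mu(D)

eps = 0.1           # tolerance
accepted = []
for _ in range(5000):
    theta_i = rng.normal(0.0, 10.0)         # 1. sample from the prior N(0, 10^2)
    x = rng.normal(theta_i, 1.0, size=50)   # 2. sample a data set from the likelihood
    if abs(x.mean() - mu_obs) <= eps:       # 3. keep theta_i if rho(mu(D_i), mu(D)) <= eps
        accepted.append(theta_i)

# Accepted draws form an approximate posterior sample concentrated near mu_obs
posterior_sample = np.array(accepted)
```

Shrinking ε makes the approximation more accurate but the acceptance rate lower, which is the basic trade-off of ABC.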

Page 22:

Page 23:

References

- Efron (1987). Better Bootstrap Confidence Intervals
- Hall (1992). The Bootstrap and Edgeworth Expansion
- Efron and Tibshirani (1994). An Introduction to the Bootstrap
- Carpenter and Bithell (2000). Bootstrap Confidence Intervals: When, Which, What? A Practical Guide for Medical Statisticians
- Marin, Pudlo, Robert, and Ryder (2012). Approximate Bayesian Computational Methods
- Efron (2014). Estimation and Accuracy After Model Selection
- Wager, Hastie, and Efron (2014). Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife