
Leave-one-out crossvalidation favors inaccurate estimators

Angelika Geroldinger, Lara Lusa, Mariana Nold, Georg Heinze
Section for Clinical Biometrics

8–12 May 2017, Thessaloniki

This work is supported by the FWF under project number I 2276 (“PREMA”).

Motivation

Study on rare events in logistic regression, comparing 10 different model estimation methods.

Real data example: arterial closure devices in minimally invasive cardiac surgery.

Complication    Conventional access    Arterial closure device
no              82                     342
yes             8                      8

Simulations: all 10 model estimation methods were similar with respect to discrimination ability (c statistic).

In the real data example, leave-one-out crossvalidation (LOO CV) gave:

[Figure: c statistics of the 10 model estimation methods in the real data example, estimated by LOO CV]

Just a matter of sampling variability?

Using the .632+ bootstrap gives different results:

[Figure: c statistics of the 10 model estimation methods in the real data example, estimated by the .632+ bootstrap]

Definitions

C statistic: the proportion of pairs of observations with opposite outcomes that are ranked correctly by the model (equal to the AUC of the ROC curve).

Example: with 4 events and 3 non-events there are 4 × 3 = 12 pairs; if 8 of them are ranked correctly, the c statistic is 8 / (4 × 3) = 0.667.

The c statistic does not have to be ≥ 0.5!
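A minimal sketch of this definition in Python; the predicted probabilities below are hypothetical and chosen only so that 8 of the 12 pairs are concordant, reproducing the 0.667 of the example above:

```python
import numpy as np

def c_statistic(y, p):
    """Proportion of (event, non-event) pairs ranked correctly by the
    predicted probabilities p; ties count as 1/2."""
    y, p = np.asarray(y), np.asarray(p)
    p_events, p_nonevents = p[y == 1], p[y == 0]
    diff = p_events[:, None] - p_nonevents[None, :]   # compare every event with every non-event
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# hypothetical predictions for 4 events and 3 non-events
y = [1, 1, 1, 1, 0, 0, 0]
p = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.1]
print(c_statistic(y, p))   # 8 of the 4 * 3 = 12 pairs are concordant -> 0.667
```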


C statistics with 5-fold CV:

[Figure: schematic of 5-fold CV — each fold serves once as ‘test’ data while the model is fitted on the remaining ‘training’ data]

With LOO CV there is only one observation per fold: calculate the predicted probability for each left-out test observation, then calculate the c statistic for the pooled predicted probabilities.
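A sketch of the pooled LOO CV procedure just described (not the authors' code): each observation is predicted from a model fitted to the remaining n-1 observations, and a single c statistic is computed from the pooled out-of-sample predictions. It reuses the c_statistic helper from the sketch above; fit_ml_logit is an illustrative maximum-likelihood fit via statsmodels.

```python
import numpy as np
import statsmodels.api as sm

def fit_ml_logit(X_train, y_train, X_test):
    """Illustrative helper: ML logistic regression; returns predicted
    probabilities for X_test (may fail if the training data are separated)."""
    fit = sm.Logit(y_train, sm.add_constant(X_train)).fit(disp=0)
    return fit.predict(sm.add_constant(X_test, has_constant='add'))

def loo_cv_c_statistic(X, y, fit_fun=fit_ml_logit):
    """Pooled LOO CV c statistic; X is an (n, p) array, y a 0/1 array."""
    n = len(y)
    pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                      # leave observation i out
        pred[i] = fit_fun(X[keep], y[keep], X[i:i + 1])[0]
    return c_statistic(y, pred)                       # helper from the sketch above
```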

The problem with LOO CV

20 observations of one explanatory variable (N(0,1)); the outcome is independent of it, with 5 events.

[Figures: illustration of LOO CV in this setting; the true c statistic is 0.5]
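A small simulation sketch of this setting (an illustration, not the authors' code, reusing the helpers from the sketches above). With an uninformative covariate the model essentially predicts the training event rate: leaving out an event leaves 4/19 events in the training data, leaving out a non-event leaves 5/19, so the pooled predictions are systematically lower for events than for non-events and the LOO CV c statistic is pushed below 0.5 even though the true value is 0.5.

```python
import numpy as np

rng = np.random.default_rng(1)
loo_cs = []
for _ in range(200):                                   # 200 simulated data sets (illustrative choice)
    x = rng.normal(size=(20, 1))                       # one N(0,1) covariate
    y = np.zeros(20, dtype=int)
    y[rng.choice(20, size=5, replace=False)] = 1       # 5 events, independent of x
    try:
        loo_cs.append(loo_cv_c_statistic(x, y))        # helper from the sketch above
    except Exception:
        pass                                           # skip data sets where a fit fails (separation)
print(np.mean(loo_cs))                                 # typically well below 0.5
```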

Leave-pair-out crossvalidation (LPO CV)

Algorithm (Airola et al., 2011; see the sketch below):
• Each pair of observations with opposite outcomes is used as test data once.
• Calculate the proportion of pairs for which the ranking is correct.

Properties:
• tailored to the estimation of c statistics,
• high computational burden: (#events × #non-events) models to fit,
• does not depend on random sampling.
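A sketch of this algorithm under the same assumptions as above (fit_ml_logit is the illustrative helper from the earlier sketch; the authors' implementation may differ):

```python
import numpy as np

def lpo_cv_c_statistic(X, y, fit_fun=fit_ml_logit):
    """Leave-pair-out CV estimate of the c statistic; ties count as 1/2.
    Requires (#events * #non-events) model fits."""
    events, nonevents = np.where(y == 1)[0], np.where(y == 0)[0]
    score = 0.0
    for i in events:
        for j in nonevents:
            keep = np.ones(len(y), dtype=bool)
            keep[[i, j]] = False                       # hold the pair (i, j) out
            p = fit_fun(X[keep], y[keep], X[[i, j]])   # predict both held-out observations
            score += 1.0 if p[0] > p[1] else (0.5 if p[0] == p[1] else 0.0)
    return score / (len(events) * len(nonevents))
```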

Bootstrap

Enhanced bootstrap (Harrell, 2001):
1. Fit the model on a bootstrap sample.
2. Calculate the c statistic using the model from step 1 both in the bootstrap sample and in the original data.
3. The difference is the estimated “optimism”.
4. Subtract the optimism, averaged over multiple bootstrap samples, from the apparent estimate.

[Figure: schematic of drawing bootstrap samples from the original data]
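A sketch of these four steps, reusing c_statistic and fit_ml_logit from the sketches above; for brevity the bootstrap model is refitted for each of the two evaluations, and failed fits (e.g. under separation) are not handled:

```python
import numpy as np

def enhanced_bootstrap_c(X, y, fit_fun=fit_ml_logit, n_boot=200, seed=1):
    rng = np.random.default_rng(seed)
    n = len(y)
    c_apparent = c_statistic(y, fit_fun(X, y, X))       # apparent (resubstitution) c statistic
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                # step 1: draw a bootstrap sample
        c_boot = c_statistic(y[idx], fit_fun(X[idx], y[idx], X[idx]))  # step 2: on the bootstrap sample
        c_orig = c_statistic(y, fit_fun(X[idx], y[idx], X))            #         and on the original data
        optimism.append(c_boot - c_orig)                # step 3: estimated optimism
    return c_apparent - np.mean(optimism)               # step 4: optimism-corrected c statistic
```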


.632+ bootstrap (Efron & Tibshirani, 1997):
• a weighted average of the apparent c statistic and the c statistic calculated from the observations omitted from each bootstrap sample,
• puts less weight on the apparent c statistic than the .632 bootstrap.
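The .632+ weighting of Efron & Tibshirani (1997) is defined for prediction error; transcribed to the c-statistic scale, where 0.5 plays the role of the no-information value, a hedged sketch of the final weighting step could look as follows (the authors' exact implementation may differ):

```python
def c_632plus(c_apparent, c_oob):
    """c_oob: c statistic of the observations omitted from the bootstrap
    samples, averaged over bootstrap samples."""
    c_oob = max(c_oob, 0.5)                                 # no worse than chance on this scale
    denom = c_apparent - 0.5
    r = (c_apparent - c_oob) / denom if denom > 0 else 0.0  # relative overfitting rate
    r = min(max(r, 0.0), 1.0)
    w = 0.632 / (1 - 0.368 * r)                             # weight on the out-of-bag c statistic
    # w >= 0.632, so the apparent c statistic gets weight <= 0.368,
    # i.e. less than under the plain .632 bootstrap
    return (1 - w) * c_apparent + w * c_oob
```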

Simulation study: set up

[Diagram: structure of the covariables v1–v5 (continuous and categorical); values shown: -0.6, 0.5, -0.2, -0.4]

We evaluated the performance of the resampling methods, simulating 1000 data sets for 12 scenarios with:
• 50 or 100 observations,
• event rates of 0.25 or 0.5,
• 5 covariables (2 categorical, 3 continuous), see Binder et al., 2011,
• none, moderate or strong effects.

Main evaluation criteria: mean difference and root mean squared difference to the true value.
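An illustrative data-generating sketch for one such scenario (not the design of Binder et al., 2011: the correlation of 0.3 and the effect sizes in beta are placeholders chosen only for illustration):

```python
import numpy as np
from scipy.optimize import brentq

def simulate_scenario(n=50, event_rate=0.25,
                      beta=(0.5, -0.5, 0.3, 0.0, -0.3), seed=1):
    rng = np.random.default_rng(seed)
    # five correlated covariables; dichotomize the first two ("categorical")
    cov = np.full((5, 5), 0.3) + 0.7 * np.eye(5)
    X = rng.multivariate_normal(np.zeros(5), cov, size=n)
    X[:, :2] = (X[:, :2] > 0).astype(float)
    lin = X @ np.asarray(beta)
    # choose the intercept so that the expected event rate matches event_rate
    a = brentq(lambda a0: np.mean(1 / (1 + np.exp(-(a0 + lin)))) - event_rate, -10, 10)
    y = rng.binomial(1, 1 / (1 + np.exp(-(a + lin))))
    return X, y
```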


We consider the following resampling methods:
• 5-fold CV,
• LOO CV,
• LPO CV,
• enhanced bootstrap,
• .632+ bootstrap,
in combination with the following model estimation methods:
• maximum likelihood (ML),
• Firth's penalization (Firth, 1993),
• ridge regression with AIC as tuning criterion.

Simulation results

[Figure: median (IQR) difference to the true c statistic; N=50, event rate=0.25, no effect]

[Figure: median (IQR) difference to the true c statistic; N=50, event rate=0.25, small effect]

[Figure: median (IQR) difference to the true c statistic; N=50, event rate=0.25, large effect]

Discussion

• Problem of non-estimability in subsamples: most frequent with the bootstrap and 5-fold CV.
• Different problems arise with the estimation of the discrimination slope.

Conclusion
• LOO CV underestimates the c statistic.
• The bias of LOO CV is larger for methods with small variance.
• The .632+ bootstrap, LPO CV and 5-fold CV were the most accurate.

Literature

• Airola A, Pahikkala T, Waegeman W, De Baets B, Salakoski T. An experimental comparison of cross-validation techniques for estimating the area under the ROC curve. Computational Statistics & Data Analysis. 2011;55(4):1828-44. doi: 10.1016/j.csda.2010.11.018.
• Binder H, Sauerbrei W, Royston P. Multivariable Model-Building with Continuous Covariates: Performance Measures and Simulation Design. 2011. Technical Report FDM-Preprint 105, University of Freiburg, Germany.
• Efron B, Tibshirani R. Improvements on cross-validation: The .632+ bootstrap method. Journal of the American Statistical Association. 1997;92(438):548-60. doi: 10.2307/2965703.
• Firth D. Bias reduction of maximum likelihood estimates. Biometrika. 1993;80(1):27-38.
• Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer; 2001.
• Puhr R, Heinze G, Nold M, Lusa L, Geroldinger A. Firth's logistic regression with rare events: accurate effect estimates AND predictions? Statistics in Medicine, in press. 2017. doi: 10.1002/sim.7273.
