Assessing the skill of decadal predictions, Reidun Gangstø, Andreas P. Weigel, Mark A. Liniger

Federal Department of Home Affairs FDHA

Federal Office of Meteorology and Climatology MeteoSwiss

Assessing the skill of decadal predictions

Reidun Gangstø, Andreas P. Weigel, Mark A. Liniger

EMS Annual Meeting, Berlin, 13 September 2011

Outline

• The ENSEMBLES decadal predictions

• Impact of drift correction on skill

• Is there any skill apart from the trend?

• Impact of cross-validation on skill

• Evaluating skill with the Jackknife bias corrector

• Summary and conclusions

Global average T2 temperature (ENSEMBLES decadal predictions vs ERA-40/Interim re-analysis data)

[Figure: four panels (ECMWF, UKMO, IFM-GEOMAR, CERFACS); x-axis: hindcast year; y-axis: T2 (°C)]

Drift correction methods:

• Subtracting the lead-time dependent bias (CONV)

• Fitting a 3rd-degree polynomial to the lead-time dependent bias (FIT)

Problem: the sample size is too small (8-9) to obtain robust bias estimates for each year separately

The drift correction is done in leave-one-out cross-validation mode

[Figure: example of drift evolution with lead-time; crosses: CONV, solid lines: FIT; x-axis: lead-time (year); y-axis: T2 temperature (°C)]
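The two drift-correction methods and the leave-one-out mode can be sketched in Python as follows. This is a minimal illustration, not the code behind the talk: the function names, the array layout (one row per start date, one column per lead-time year) and the use of `numpy.polyfit` for the 3rd-degree fit are all assumptions.

```python
import numpy as np

def drift_conv(hindcasts, obs):
    """CONV: mean forecast-minus-observation bias for each lead-time year.

    hindcasts, obs: arrays of shape (n_start_dates, n_lead_years), where obs
    holds the verifying observations for each hindcast value (assumed layout).
    """
    return (hindcasts - obs).mean(axis=0)

def drift_fit(hindcasts, obs, degree=3):
    """FIT: smooth the CONV bias with a polynomial in lead-time."""
    bias = drift_conv(hindcasts, obs)
    lead = np.arange(1, bias.size + 1)
    coeffs = np.polyfit(lead, bias, degree)
    return np.polyval(coeffs, lead)

def correct_loo(hindcasts, obs, estimator):
    """Apply a drift estimator in leave-one-out cross-validation mode.

    The drift subtracted from start date i is estimated from all other
    start dates only.
    """
    n = hindcasts.shape[0]
    corrected = np.empty_like(hindcasts, dtype=float)
    for i in range(n):
        keep = np.arange(n) != i
        corrected[i] = hindcasts[i] - estimator(hindcasts[keep], obs[keep])
    return corrected
```

With only 8-9 start dates, each CONV value is an average over very few errors, which is why smoothing across lead-times (FIT) can reduce the noise in the drift estimate.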

Global mean T2, after drift correction

[Figure: panels for ECMWF, UKMO, IFM-GEOMAR, CERFACS and the multi-model; x-axis: hindcast year; y-axis: T2 (°C)]

Correlation after drift correction (in cross-validation)

[Figure: panel titles: "Correlation, FIT (T2 mean over years 1-5)", "Mean of grid point-wise correlation"; axes: lead-time (year) vs correlation, longitude vs latitude]

Removing the model trend

[Figure: T2 temperature (°C) against year; panel labels: 1-5 y, 6-10 y, all lead-times 1-10 y]

Removing the observed trend

[Figure: panel labels: 1-5 y, 6-10 y]
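The slides do not spell out how the two detrending variants are computed. As a hedged sketch, one plausible reading is that a linear trend is fitted either to the forecast series (model trend) or to the observations (observed trend) and then subtracted from both series before correlating. The function name, its `trend_from` argument and that reading are assumptions, not the authors' documented procedure.

```python
import numpy as np

def detrend(years, fcst, obs, trend_from="model"):
    """Subtract one linear trend from both forecasts and observations.

    trend_from="model":    trend fitted to the forecast series
    trend_from="observed": trend fitted to the observations
    (Hypothetical interface; the slides only name the two variants.)
    """
    if trend_from == "model":
        source = fcst
    elif trend_from == "observed":
        source = obs
    else:
        raise ValueError("trend_from must be 'model' or 'observed'")
    x = years - years.mean()  # centre the time axis for a well-conditioned fit
    slope, intercept = np.polyfit(x, source, 1)
    trend = slope * x + intercept
    return fcst - trend, obs - trend
```

The correlation of the two residual series then measures skill apart from the chosen linear trend. Because the two variants subtract different trends, they can give different correlations.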

Correlation after detrending (drift correction with CONV, in cross-validation)

Why is the skill predominantly negative???

[Figure: panel titles: "Correlation, model trend removed (yrs 1-5)", "Mean of grid point-wise correlation"; axes: lead-time (year) vs correlation, longitude vs latitude]

Cross-validation

• 1960: predict 1960; determine the bias from the other start dates

• 1965: predict 1965; determine the bias from the other start dates

• 1970: predict 1970; determine the bias from the other start dates

• 1975: predict 1975; determine the bias from the other start dates

• 1980: predict 1980; determine the bias from the other start dates

… then correlate with observations 1960, 1965, 1970, …

Toy model: bias from cross-validation

Prescribed correlation: 0

Number of experiments: 10’000

Var.fcst / Var.obsv: 1:12

[Figure: correlation as measured, with drift correction (method: CONV) in cross-validation]
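A toy experiment of this kind can be sketched as below. Several settings are assumptions that the slide does not state: 9 start dates (the deck mentions sample sizes of 8-9), Gaussian forecasts and observations, and "1:12" read as a forecast-to-observation variance ratio of 1/12. The function names are illustrative.

```python
import numpy as np

def loo_conv_correct(fcst, obs):
    """CONV correction in leave-one-out mode for a single lead-time.

    The bias subtracted from forecast i is the mean error of all
    other start dates.
    """
    n = fcst.size
    err = fcst - obs
    loo_bias = (err.sum() - err) / (n - 1)
    return fcst - loo_bias

def toy_experiment(n_starts=9, var_ratio=1 / 12, n_experiments=10_000, seed=0):
    """Mean measured correlation without and with cross-validated correction.

    Forecasts and observations are drawn independently, so the prescribed
    correlation is zero.
    """
    rng = np.random.default_rng(seed)
    raw = np.empty(n_experiments)
    cv = np.empty(n_experiments)
    for k in range(n_experiments):
        obs = rng.normal(size=n_starts)
        fcst = rng.normal(scale=np.sqrt(var_ratio), size=n_starts)
        raw[k] = np.corrcoef(fcst, obs)[0, 1]
        cv[k] = np.corrcoef(loo_conv_correct(fcst, obs), obs)[0, 1]
    return raw.mean(), cv.mean()
```

The negative bias follows from the algebra of the correction: for n start dates, the corrected forecast equals n/(n-1) times the raw forecast minus 1/(n-1) times the verifying observation, plus a constant. Each corrected forecast therefore carries a small negative multiple of the value it is later correlated with, and the effect grows as the forecast variance shrinks relative to the observed variance.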

Illustration of cross-validation bias, example: CONV drift correction

[Figure: forecasts against observations (Obsv.); panel "Not bias corrected": NO CORRELATION; panel "Bias corrected": NEGATIVE CORRELATION]

Consequences for verification

• Estimates of the actual prediction skill of decadal forecasts are problematic because:

    • The data situation in the hindcast period is an issue (e.g. ocean data before the 1980s)

    • The small sample size induces a bias in the cross-validation procedure

• It may be better to look at potential predictability, i.e. the skill we would have with an infinite amount of training data, assuming that there are no limitations in data quality

Jackknifing as a pragmatic solution

• Empirical approach frequently used to quantify sample size related biases

• Related to bootstrapping

• The idea is that the estimator is computed from the full sample, then recomputed n times, leaving a different observation out each time

• Reference: B. Efron (1982). The Jackknife, the Bootstrap and other resampling plans. J.W. Arrowsmith, Ltd., Bristol, England, 92 pp.
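The recomputation described above can be sketched as a generic jackknife bias corrector. With n samples, the bias is estimated as (n - 1) times the difference between the mean of the leave-one-out estimates and the full-sample estimate, and then subtracted from the full-sample estimate. The function names and the pairing with a cross-validated correlation estimator are assumptions. The slides do not specify how the jackknife was wired into the verification.

```python
import numpy as np

def jackknife_bias_corrected(estimator, fcst, obs):
    """Jackknife bias-corrected value of estimator(fcst, obs).

    The estimator is computed on the full sample and then recomputed
    n times, leaving a different (forecast, observation) pair out each time.
    """
    n = fcst.size
    idx = np.arange(n)
    theta_full = estimator(fcst, obs)
    theta_loo = np.array(
        [estimator(fcst[idx != i], obs[idx != i]) for i in range(n)]
    )
    bias = (n - 1) * (theta_loo.mean() - theta_full)
    return theta_full - bias

def cv_correlation(fcst, obs):
    """Correlation after leave-one-out CONV correction (illustrative estimator)."""
    err = fcst - obs
    loo_bias = (err.sum() - err) / (fcst.size - 1)
    return np.corrcoef(fcst - loo_bias, obs)[0, 1]
```

A call such as `jackknife_bias_corrected(cv_correlation, fcst, obs)` returns one corrected estimate per forecast series. The correction targets the part of the bias that scales with 1/n and tends to increase the scatter of individual estimates, so it is best judged over many experiments or grid points rather than on a single value.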

Toy model: bias from cross-validation

Prescribed correlation: 0

Number of experiments: 10’000

Var.fcst / Var.obsv: 1:12

[Figure: correlation as measured, with drift correction (method: CONV) in cross-validation, and the Jackknife estimate]

Local correlation after drift correction, with the Jackknife bias corrector (JK) applied

[Figure: panel titles: "Correlation with JK (T2 mean over years 1-5)", "Correlation, CONV, with CV", "Mean of grid point-wise correlation"; axes: lead-time (year) vs correlation, longitude vs latitude]

Local correlation after detrending, with the Jackknife bias corrector applied

[Figure: panel titles: "Correlation with JK, model trend removed (yrs 1-5)", "Correlation CONV, with CV", "Mean of grid point-wise correlation"; axes: lead-time (year) vs correlation, longitude vs latitude]

Difference in correlation between the detrending methods

Uncertainties related to the choice of detrending method are of the same order of magnitude as remaining fluctuations

Summary and conclusions

• Predicted near-surface temperatures from the ENSEMBLES decadal model forecasts are compared to ERA-40/Interim re-analysis data

• Drift correction:

    • Reduction of noise by fitting a suitable polynomial through the annual bias estimates

• Verification:

    • Unbiased estimates of forecast skill are problematic due to small sample sizes

    • It may be more useful to focus on potential predictability (e.g. Jackknife method)

• Trend:

    • By far most of the skill is related to the reproduction of the linear trend

    • Skill in predicting the remaining (interannual) fluctuations is close to zero

    • Exact quantification is difficult due to uncertainties in the detrending methods