Exploring sample size issues for 6-10 day forecasts using ECMWF’s reforecast data set
• Model: 2005 version of ECMWF model; T255 resolution.
• Initial conditions: 15 members, ERA-40 analysis + singular vectors.
• Dates of reforecasts: 1982-2001, once-weekly reforecasts from 01 Sep - 01 Dec, 14 total. So, 20*14 ensemble reforecasts = 280 samples.
• Data sent to NOAA/ESRL: T2m and precipitation ensemble over most of North America, excluding Alaska. Saved on 1-degree lat/lon grid. Forecasts to 10 days lead.
Tom Hamill and Jeff Whitaker, NOAA/ESRL
Data courtesy of Renate Hagedorn & ECMWF
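The sample count quoted on this slide (20 years x 14 weekly start dates = 280) can be checked with a short sketch. The function name `weekly_start_dates` is hypothetical and only illustrates the counting; it is not part of any ECMWF or NOAA tooling.

```python
from datetime import date, timedelta

def weekly_start_dates(year, first=(9, 1), last=(12, 1)):
    """Once-weekly start dates from 01 Sep through 01 Dec of one year."""
    day = date(year, *first)
    end = date(year, *last)
    dates = []
    while day <= end:
        dates.append(day)
        day += timedelta(days=7)
    return dates

# 1982-2001 inclusive gives 20 reforecast years.
years = range(1982, 2002)
all_dates = [d for y in years for d in weekly_start_dates(y)]
print(len(weekly_start_dates(1982)), "dates per year,", len(all_dates), "in total")
```

Because 01 Sep to 01 Dec spans exactly 91 days in every year, the weekly sequence always ends on 01 Dec.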
What we did
• Considered 6-10 day forecasts of T2m and precipitation (longest possible lead from this data set).
• Relevance to weeks 2, 3, 4 forecast? Your guess is as good as mine (I think probably some relevance, less for week 4 than week 2).
• Experiments:
– N-member reforecast, N members real time (N = 1, 3, 5, 7, 9, 11, 13, 15)
– N-member reforecast, 15 members real time
– Use established statistical calibration procedures
Observation locations for 2-meter temperature calibration
Uses stations from NCAR’s DS472.0 database that have more than 96% of the yearly records available, and overlap with the domain that ECMWF sent us.
(Note: precipitation calibration based on NARR over CONUS)
T2m calibration procedure: “NGR” (“Non-homogeneous Gaussian Regression”)
• Reference: Gneiting et al., MWR, 133, p. 1098
• Predictors: ensemble mean and ensemble spread
• Output: mean, spread of calibrated normal distribution
• Advantage: leverages a possible spread/skill relationship appropriately. With a strong spread/skill relationship, c ≈ 0.0 and d ≈ 1.0; with a weak one, d ≈ 0.0.
• Disadvantage: iterative method, slow. No reason to bother (relative to using simple linear regression) if there’s little or no spread/skill relationship.
$f^{\mathrm{CAL}}(x, \sigma) \sim N(a + bx,\; c + d\sigma)$
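A minimal sketch of the pieces an NGR fit needs: the calibrated distribution for given coefficients, and the closed-form CRPS of a normal distribution that an iterative optimizer could minimize over (a, b, c, d). Two assumptions are made here that the slide does not state: the second parameter of N(·,·) is treated as a variance with the ensemble variance as spread predictor (as in the Gneiting et al. formulation), and CRPS is the fitting criterion. The function names are hypothetical, and the optimizer itself is left out.

```python
import math

def ngr_predict(a, b, c, d, ens_mean, ens_var):
    """Return (mean, variance) of the calibrated normal distribution.

    Assumption: the spread term enters as a variance, c + d * ens_var.
    """
    return a + b * ens_mean, c + d * ens_var

def crps_gaussian(mu, var, obs):
    """Closed-form CRPS of N(mu, var) against a single observation."""
    sigma = math.sqrt(var)
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

def mean_crps(coeffs, samples):
    """Average CRPS over (ens_mean, ens_var, obs) training triples.

    This is the quantity an iterative fit would minimize (assumed criterion).
    """
    a, b, c, d = coeffs
    total = 0.0
    for ens_mean, ens_var, obs in samples:
        mu, var = ngr_predict(a, b, c, d, ens_mean, ens_var)
        total += crps_gaussian(mu, var, obs)
    return total / len(samples)
```

With d near 0 the predicted variance collapses to the constant c, which is the case where simple linear regression does about as well.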
Also: T2m calibration procedure: linear regression
• Predictors: ensemble mean of lowest sigma-layer temp
• Output: predicted mean and standard deviation
$f^{\mathrm{CAL}}(x, \sigma) \sim N(a + bx,\; \sigma)$
where σ is determined by
$\sigma = \sqrt{\dfrac{\sum_{S} \left[ y - (a + bx) \right]^{2}}{S - 2}}$
and y denotes the observations and S the training sample size.
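The linear-regression calibration can be sketched as an ordinary least-squares fit followed by the residual standard deviation with S − 2 degrees of freedom. The function name and the use of plain Python lists are illustrative assumptions, not the authors' code.

```python
import math

def fit_linear_calibration(x, y):
    """Least-squares fit y ~ a + b*x; returns (a, b, sigma).

    x: ensemble-mean forecasts, y: matching observations.
    sigma is sqrt(sum of squared residuals / (S - 2)).
    """
    s = len(x)
    if s < 3:
        raise ValueError("need at least 3 training samples")
    x_bar = sum(x) / s
    y_bar = sum(y) / s
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(sse / (s - 2))
    return a, b, sigma
```

The calibrated forecast for a new ensemble mean x is then a normal distribution with mean a + b*x and standard deviation sigma, the same sigma for every forecast.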
Training data for Non-homogeneous Gaussian Regression
(all cross validated)
• 01 Sep: 01 Sep, 08 Sep, 15 Sep
• 08 Sep: 01 Sep, 08 Sep, 15 Sep, 22 Sep
• 15 Sep: 01 Sep, 08 Sep, 15 Sep, 22 Sep, 29 Sep
• ...
• 17 Nov: 03 Nov, 10 Nov, 17 Nov, 24 Nov, 01 Dec
• 24 Nov: 10 Nov, 17 Nov, 24 Nov, 01 Dec
• 01 Dec: 17 Nov, 24 Nov, 01 Dec
Use a centered training data set for weeks 3 - 12, uncentered for weeks 1, 2, 13, and 14
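The window rule on this slide amounts to taking up to two weeks on either side of the forecast date and clipping at the ends of the season. A small sketch, with a hypothetical function name and the intermediate October dates inferred from the weekly spacing:

```python
def training_indices(week, n_weeks=14, half_width=2):
    """0-based indices of the start dates used to train the given week.

    The window is centered where possible and clipped at the season edges.
    """
    lo = max(0, week - half_width)
    hi = min(n_weeks - 1, week + half_width)
    return list(range(lo, hi + 1))

# Weekly start dates, 01 Sep through 01 Dec (October dates inferred).
labels = ["01 Sep", "08 Sep", "15 Sep", "22 Sep", "29 Sep", "06 Oct", "13 Oct",
          "20 Oct", "27 Oct", "03 Nov", "10 Nov", "17 Nov", "24 Nov", "01 Dec"]

for week in (0, 1, 2, 11, 12, 13):
    print(labels[week], "->", [labels[i] for i in training_indices(week)])
```

Only weeks 3 through 12 (indices 2 through 11) get the full five-date centered window.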
T2m results, same ensemble size reforecast as real-time forecast
Notes:
(1) Some sensitivity to ensemble size; more members clearly better, most of benefit by 11 members.
(2) Linear regression slightly better for small ensemble size, NGR slightly better for large ensemble size.
T2m results, smaller reforecast than 15-mbr real-time forecast
Notes:
(1) NGR line replicated from previous plot for sake of comparison.
(2) Linear regression with coefficients developed from a 3-member reforecast and applied to 15 members in real time provides almost all of the benefit of the full 15-member reforecast.
Precipitation forecast calibration: logistic regression
Given predictors x1, …, xN (such as the ensemble mean), find regression coefficients β0, β1, …, βN for the equation
$P(\mathrm{obs} > T) = 1 - \dfrac{1}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_N x_N)}$
This generates an S-shaped curve (here for one predictor)
[Figure: S-shaped logistic curve; x-axis: predictor value, y-axis: probability]
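A short sketch of the logistic curve in its standard form, P(obs > T) = 1 − 1/(1 + exp(β0 + β1x1 + … + βNxN)). The function name and the example coefficients are hypothetical; fitting the coefficients (for example by maximum likelihood) is not shown.

```python
import math

def prob_exceed(betas, predictors):
    """P(obs > T) for coefficients [b0, b1, ..., bN] and predictors [x1, ..., xN].

    Algebraically equal to 1 - 1 / (1 + exp(b0 + b1*x1 + ... + bN*xN)),
    written in a form that avoids overflow for large arguments.
    """
    linear = betas[0] + sum(b * x for b, x in zip(betas[1:], predictors))
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)

# One predictor, made-up coefficients: probability rises along an S-shaped curve.
for x in (0.0, 1.0, 2.0, 4.0):
    print(x, round(prob_exceed([-3.0, 1.5], [x]), 3))
```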
Precipitation calibration training procedure
• Cross-validate (for example, 1983 forecasts use 1982, 1984-2001).
• Use all fall season data together, unlike temperature (1 Sep forecasts use 1 Sep - 1 Dec training data). [seasonal biases assumed less important than training sample size]
• Sole predictor: (ens. mean precip)^0.5
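The leave-one-year-out cross-validation and the square-root predictor transform can be sketched as follows. Both function names are hypothetical, and clipping negative precipitation to zero is an added safeguard, not something the slide specifies.

```python
import math

def training_years(forecast_year, first=1982, last=2001):
    """All reforecast years except the one being forecast."""
    return [y for y in range(first, last + 1) if y != forecast_year]

def precip_predictor(ens_mean_precip):
    """Sole logistic-regression predictor: (ensemble-mean precipitation)^0.5."""
    return math.sqrt(max(ens_mean_precip, 0.0))

print(training_years(1983))   # 1982, then 1984 through 2001
print(precip_predictor(4.0))  # square root of the ensemble mean
```

Each forecast year is therefore trained on the other 19 years, using every fall start date rather than a window around the forecast date.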
Increasing logistic regression sample size by compositing data from different locations
Big dot: location to perform logistic regression.
Small dots: grid points with similar observed climatologies, used to augment training sample at big dot’s location.
Constrained so that the analog composite locations can’t be too near to each other.
Sub-optimal: what if forecast climatologies differ? What if forecast/observed correlations differ? These are not accounted for in choosing analog locations.
Precipitation calibration example
Reliability diagrams
15-member reforecast / 15-member real-time calibrated
15-member, from raw ECMWF ensemble
Precipitation Brier skill scores
Again, fewer members are needed in reforecast, as long as real-time forecast is larger.
Most of the benefit achieved with 5-7 members in the reforecast (larger than the 3 members with temperature calibration).
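For readers unfamiliar with the metric, a minimal sketch of the Brier skill score in its usual form, BSS = 1 − BS/BS_ref. The slides do not say which reference forecast was used; a climatological probability is assumed here, and the function names and sample numbers are made up for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def brier_skill_score(probs, ref_probs, outcomes):
    """1 - BS / BS_ref; 1 is perfect, 0 matches the reference forecast."""
    bs_ref = brier_score(ref_probs, outcomes)
    if bs_ref == 0.0:
        raise ValueError("reference forecast is perfect; skill score undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref

# Made-up example: event observed on days 1 and 4, climatology assumed to be 0.5.
outcomes = [1, 0, 0, 1]
forecast = [0.8, 0.2, 0.2, 0.8]
climatology = [0.5, 0.5, 0.5, 0.5]
print(brier_skill_score(forecast, climatology, outcomes))
```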
Comments from Renate Hagedorn, ECMWF
• “The results itself are pretty much consistent with my results on the importance of the number of ensemble members in the training data set. I've also seen that 5 members are already quite sufficient and increasing the number to 15 doesn't give much benefit. In contrast to that, the number of years seems to be more important. Since increasing the reforecasts from 5 to 15 members is obviously very expensive (and doesn't seem to be justified very much) we'll go for a 5-member reforecast ensemble in our new system.”
• “Why you don't use ECMWF monthly forecast / reforecast data if you are interested in week 2,3,4 aspects? This could help with the problem/question of relevance of the 6-10 day forecasts.”
Reconfiguration of CFS? (intended as a starting point for discussion)
• Real-time:
– Planned: 4x/day to 9 months (= 36 months/day)
– Reconfigured: 2x/day, 10 members out to 1 month, then single member to 9 months (2*(10+8) = 36 months/day)
• Reforecasts:
– Planned: 1 run/day to 9 months (9 months/day)
– Reconfigured:
• 10-member ensemble to 1 month every 2nd day (alternating 00Z, 12Z) = 5 months/day
• 1 member extending for 2-9 months every other day = 4 months/day
• Total = 5 + 4 = 9 months/day
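The cost bookkeeping on this slide, in simulated forecast-months per day, can be verified with a few lines of arithmetic. The variable names are illustrative; the numbers come directly from the slide.

```python
# Real-time configuration (forecast-months simulated per day)
planned_realtime = 4 * 9                   # 4 runs/day, each to 9 months
reconfigured_realtime = 2 * (10 * 1 + 8)   # 2 cycles/day: 10 members x 1 month,
                                           # plus 1 member for months 2-9 (8 months)

# Reforecast configuration (forecast-months simulated per day)
planned_reforecast = 1 * 9                 # 1 run/day to 9 months
ensemble_part = 10 * 1 / 2                 # 10 members x 1 month, every 2nd day
extension_part = 1 * 8 / 2                 # 1 member, months 2-9, every other day
reconfigured_reforecast = ensemble_part + extension_part

print(planned_realtime, reconfigured_realtime)
print(planned_reforecast, reconfigured_reforecast)
```

Both reconfigurations match the planned cost, so the proposal trades long-lead runs for a larger short-lead ensemble at no extra computing expense.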
Conclusions
• Assuming 15-member real-time forecast:
– 3-member reforecast sufficient for calibration of 6-10 day temperatures
– 5-7 member reforecast sufficient for calibration of 6-10 day precipitation
• Relevance to calibration of weeks 2, 3, and 4? (perhaps could explore ECMWF’s monthly data set for greater relevance).
• Reconfiguration/supplementation of CFS to improve sub-seasonal forecasts should be discussed.