Evaluation of a Mesoscale Short-Range Ensemble Forecasting System
over the Northeast United States
Matt Jones & Brian A. Colle
NROW, 2004
Institute for Terrestrial and Planetary Atmospheres, Stony Brook University, Stony Brook, New York
Verification Method
SUMMER = May – September 2003
WINTER = October 2003 – March 2004
Scalar Measures:
Contingency-based Measures:
Prob.-based Measures:
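The scalar measures shown in the ME and MAE panels are taken here to be the usual mean error and mean absolute error. The following is a minimal sketch under that assumption; the function names and the sample 2mT values are hypothetical and not from the study.

```python
def mean_error(forecasts, observations):
    """Mean error (ME): average of forecast minus observation (signed bias)."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


def mean_absolute_error(forecasts, observations):
    """Mean absolute error (MAE): average magnitude of the forecast error."""
    errors = [abs(f - o) for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


# Hypothetical 2mT forecasts and observations (deg C), for illustration only.
fcst = [21.0, 18.5, 25.0, 19.0]
obs = [20.0, 19.5, 23.0, 19.0]
me = mean_error(fcst, obs)
mae = mean_absolute_error(fcst, obs)
print(me, mae)
```

ME keeps the sign of the error, so warm and cool errors can cancel; MAE does not, which is why both are usually reported together.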
[Figure: SUMMER ME and SUMMER MAE panels for 2mT, 2mRH, SLP, 10mWS, and 10mWD, split into night and day periods.]
[Figure: Near-Surface T (Warm/Cool) and lowest level cloud water at ~3K ft. (Moist/Dry).]
Example of PHYS-member spread – Eta-PBL 2mT
[Figure: WINTER ME and WINTER MAE panels for 2mT, 2mRH, SLP, 10mWS, and 10mWD, split into night and day periods. Annotations: NCEP BREDS, GFS. Legend: 21zEta-1, 21zEta-2, 21zEta+1, 21zEta+2, 21zEta-CTL, 00zEta, 00zGFS, IC MEAN, PHYS MEAN.]
[Figure: map panel labeled 2004102200 f48 with low-pressure markers (L992, L).]
[Figure: two SUMMER MAE panels for 2mT, 2mRH, SLP, 10mWS, and 10mWD, split into night and day periods. Legends: 0000UTC Eta vs. 0000UTC ensemble mean; 0000UTC 4-km MM5 and 1200UTC 4-km MM5 vs. 0000UTC ensemble mean.]
Can the ensemble-mean beat 4km MM5 and Eta deterministic forecasts?
[Figure: two WINTER MAE panels for 2mT, 2mRH, SLP, 10mWS, and 10mWD, split into night and day periods. Legends: 0000UTC 4-km MM5 and 1200UTC 4-km MM5 vs. 0000UTC ensemble mean; 0000UTC Eta vs. 0000UTC ensemble mean.]
[Figure: SUMMER 24HP BIAS and WINTER 24HP BIAS versus threshold (0.1 to 1.0; inches above, centimeters below), annotated Over Pred. / Under Pred. Ensembles shown: PHYS, IC, ALL.]
[Figure: SUMMER 24HP ETS and WINTER 24HP ETS (Equitable Threat Score) versus threshold (0.1 to 1.0; inches above, centimeters below), annotated Better / Worse. Ensembles shown: PHYS, IC, ALL.]
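The BIAS and ETS panels are contingency-based measures. A minimal sketch of how they are commonly computed from a 2x2 contingency table at one precipitation threshold follows; it assumes the standard definitions (the slides do not spell them out), and the counts are hypothetical.

```python
def bias_score(hits, misses, false_alarms):
    """Frequency bias: forecast event count over observed event count.

    Values above 1 indicate over-prediction, below 1 under-prediction.
    """
    return (hits + false_alarms) / (hits + misses)


def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """ETS: threat score adjusted for hits expected by random chance."""
    total = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / total
    return (hits - hits_random) / (hits + misses + false_alarms - hits_random)


# Hypothetical counts for one threshold, for illustration only.
bias = bias_score(hits=50, misses=25, false_alarms=25)
ets = equitable_threat_score(hits=50, misses=25, false_alarms=25,
                             correct_negatives=100)
print(bias, round(ets, 4))
```

Repeating the calculation for each threshold on the x-axis would give one curve per ensemble (PHYS, IC, ALL).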
Verification Rank Histogram
●All solutions of ensemble should be equally likely.
●Observation should appear no different than any ensemble member.
●Not a measure of skill; a necessary, but not sufficient condition for a good ensemble.
“flat”: Perfect
“U-shaped”: Under-dispersed
“N-shaped”: Over-dispersed
“L-shaped”: Biased
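A verification rank histogram can be sketched as follows: for each case, count how many members fall below the observation and tally that rank. This is an illustrative sketch, not the study's code; ties are ignored for simplicity and the sample values are hypothetical.

```python
def rank_histogram(ensemble_forecasts, observations):
    """Tally the rank of each observation within its ensemble.

    ensemble_forecasts: list of cases, each a list of member values.
    Returns M + 1 counts for an M-member ensemble.
    """
    n_members = len(ensemble_forecasts[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensemble_forecasts, observations):
        # Rank = number of members strictly below the observation.
        rank = sum(1 for value in members if value < obs)
        counts[rank] += 1
    return counts


# Hypothetical 3-member ensemble over four cases.
cases = [[1.0, 2.0, 3.0]] * 4
obs = [0.5, 1.5, 2.5, 3.5]
counts = rank_histogram(cases, obs)
print(counts)
```

Here each of the four ranks is hit once, which is the “flat” shape. Counts piling up in the first and last bins would give the “U-shaped” (under-dispersed) case.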
Probabilistic Precipitation
Brier Score:
REL = Reliability
RES = Resolution (event discrimination)
UNC = Uncertainty (dependent only on obs.)
f_i = forecast probability
o_i = observed probability (= 1 for occurrence, = 0 for non-occurrence)
N_t = number of forecast/event pairs for threshold t
m = number of ensemble members (m + 1 probability categories)
Diagram labels: Skill; Perfect Reliability; No skill; No resolution
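The Brier score equation itself did not survive on this slide. The sketch below assumes the standard form, the mean of (f_i - o_i)^2 over the N_t pairs, and the standard decomposition BS = REL - RES + UNC with forecasts grouped by probability category. Function names and sample data are hypothetical.

```python
from collections import defaultdict


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome."""
    return sum((f - o) ** 2 for f, o in zip(probs, outcomes)) / len(probs)


def brier_decomposition(probs, outcomes):
    """Return (REL, RES, UNC), grouping cases by forecast probability."""
    n = len(probs)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for f, o in zip(probs, outcomes):
        groups[f].append(o)
    rel = sum(len(obs) * (f - sum(obs) / len(obs)) ** 2
              for f, obs in groups.items()) / n
    res = sum(len(obs) * (sum(obs) / len(obs) - base_rate) ** 2
              for obs in groups.values()) / n
    unc = base_rate * (1 - base_rate)
    return rel, res, unc


# Hypothetical probabilities from a 2-member ensemble (categories 0, 0.5, 1).
probs = [0.0, 0.5, 0.5, 1.0, 0.5, 0.0]
outcomes = [0, 1, 0, 1, 1, 0]
bs = brier_score(probs, outcomes)
rel, res, unc = brier_decomposition(probs, outcomes)
print(bs, rel - res + unc)
```

Because an m-member ensemble can only issue m + 1 distinct probabilities, grouping by exact forecast value makes the decomposition exact here.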
Ensemble Post-processing
Due to model imperfections, significant bias is retained even after ensemble averaging.
Day-15 Day-14 Day-13 Day-12 Day-11 Day-10 Day-9 Day-8 Day-7 Day-6 Day-5 Day-4 Day-3 Day-2 Day-1 TODAY
Use previous 14 complete forecasts to correct forecasts starting 0000UTC today
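A minimal sketch of a 14-day running bias correction follows. It assumes a simple additive correction (subtract the mean error of the previous 14 complete forecasts); the slides do not give the exact formulation, so the function name, the `window` parameter, and the sample values are hypothetical.

```python
def calibrate(today_forecast, past_forecasts, past_observations, window=14):
    """Subtract the mean error of the last `window` complete forecasts."""
    recent_f = past_forecasts[-window:]
    recent_o = past_observations[-window:]
    bias = sum(f - o for f, o in zip(recent_f, recent_o)) / len(recent_f)
    return today_forecast - bias


# Hypothetical history: forecasts run 1.5 deg C too warm for 14 days.
past_obs = [10.0 + 0.5 * day for day in range(14)]
past_fcst = [o + 1.5 for o in past_obs]
corrected = calibrate(20.0, past_fcst, past_obs)
print(corrected)
```

In practice such a correction would be applied separately for each parameter and forecast hour, since the biases differ between night and day.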
[Figure: SUMMER and WINTER missing rate improvement for 2mT, 2mRH, SLP, and 10mWS; Uncalibrated vs. Calibrated.]
The ensemble-mean is more skillful than component members on average for daytime 2mT/10mWS, SLP, and 10mWD. Persistent biases among component members reduce the skill advantage of the ensemble-mean during other periods (e.g. nighttime 2mT/10mWS).
The ensemble-mean can outperform the deterministic Eta model, and can equal the skill of a high-resolution deterministic MM5 initialized 12 hours later.
The PHYS ensemble is more beneficial for forecasting surface parameters during the warm season due to greater variation among component members.
The GFS initial condition leads to a superior SLP forecast compared to the poorly skilled NCEP Eta-bred members, especially during the cool season. The GFS member outperforms the ensemble-mean for SLP and 10mWD in the cool season.
The ensemble has some ability to predict forecast skill and estimate the uncertainty of a forecast through ensemble spread-error correlation, especially for 10mWD. Persistent biases among component members and ensemble underdispersion for other surface parameters reduce the spread-error correlation (e.g. 2mT, 10mWS).
Conclusions (1)
In the warm season, low POPs are reliable for low-threshold precip. events. High POPs are reliable for all thresholds.
In the cool season, low POPs have poor reliability for all precip. event thresholds. High POPs are reliable for all precip. event thresholds.
The PHYS (IC) ensemble is more skillful in POPs during the warm (cool) season. In the warm season, the Hybrid ensemble has the greatest POP skill.
A 14-day bias calibration can reduce much of the bias for most parameters, improving ensemble MRs.
Conclusions (2)
http://fractus.msrc.sunysb.edu/mm5rte
18-mbr Ens output
Ensemble Stats
Ensemble Verif.
REALTIME SBU-SREF PRODUCTS
Acknowledgments
●Eric Grimit – University of Washington
●NWS – OKX
●ITPA – SBU
Website
●http://fractus.msrc.sunysb.edu/mm5rte
Publication
●Jones, M.S., and B. A. Colle, 2004: Evaluation of a mesoscale short-range ensemble forecasting system over the Northeast United States. Wea. Forecasting, in preparation.
Investigate for which synoptic regimes ensemble variance is most/least useful.
Investigate for which synoptic regimes a post-processing technique is most beneficial (MOS vs. historical bias calibration).
Reduce the inequality of skill among members by removing poorly-performing members / replacing with multiple models, multiple analysis initial conditions.
Investigate alternative ensemble quantities (trimmed mean/variance, modal quantile value).
Continue efforts in improving presentation of forecast uncertainty/ensemble confidence.
Future Work
Verification Rank Histogram
MR = “Missing Rate” = summation of extreme ranks
$MR_{exp} = \frac{2}{M + 1} = \frac{2}{6}$
$MR_{adj} = MR - MR_{exp}$
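The missing rate can be sketched from rank-histogram counts as below. This assumes M is the number of ensemble members (so there are M + 1 ranks) and that MR is the fraction of observations falling in the two extreme ranks; the counts are hypothetical.

```python
def missing_rate(rank_counts):
    """Fraction of observations falling in the lowest or highest rank."""
    return (rank_counts[0] + rank_counts[-1]) / sum(rank_counts)


def adjusted_missing_rate(rank_counts):
    """MR minus the value expected from a flat histogram, 2 / (M + 1)."""
    n_members = len(rank_counts) - 1
    expected = 2 / (n_members + 1)
    return missing_rate(rank_counts) - expected


# Hypothetical U-shaped histogram for a 5-member ensemble (6 ranks).
counts = [30, 10, 10, 10, 10, 30]
mr = missing_rate(counts)
mr_adj = adjusted_missing_rate(counts)
print(mr, mr_adj)
```

A positive adjusted value means more observations fall outside the ensemble envelope than a flat histogram would allow, which points to under-dispersion or bias.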
Usability of Ensemble Variance
●The variance of a properly dispersed ensemble is a good representation of forecast uncertainty.
●Ensemble variance should be correlated with ensemble error, leading to an ability of the ensemble to predict ensemble skill (Houtekamer 1993).
Low spread: high skill
High spread: low skill
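One way to check the spread-error relationship is a Pearson correlation between ensemble spread and the absolute error of the ensemble mean. The sketch below makes that assumption, using the standard deviation of the members as the spread measure; the study's exact formulation is not given on the slide and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spread_error_correlation(ensemble_forecasts, observations):
    """Correlate member spread with absolute error of the ensemble mean."""
    spreads = [pstdev(members) for members in ensemble_forecasts]
    errors = [abs(mean(members) - obs)
              for members, obs in zip(ensemble_forecasts, observations)]
    return pearson(spreads, errors)


# Hypothetical cases where error grows in step with spread.
cases = [[10.0, 10.0, 10.0], [9.0, 10.0, 11.0], [8.0, 10.0, 12.0]]
obs = [10.0, 11.0, 12.0]
r = spread_error_correlation(cases, obs)
print(r)
```

A correlation near 1 would mean spread is a useful predictor of skill. Persistent member biases and under-dispersion pull the value down.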
Ensemble Probability Forecasts
●An ensemble distribution should present what is most probable and what is least probable, reducing the “element of surprise” (Brooks and Doswell 1993).
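A simple way to turn an ensemble into a probability forecast is to count the fraction of members exceeding a threshold. The sketch below assumes equal weighting of members, which is consistent with the m + 1 probability categories mentioned for the Brier score but is not confirmed as the study's method. The precipitation values are hypothetical.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members forecasting more than `threshold`."""
    exceed = sum(1 for value in members if value > threshold)
    return exceed / len(members)


# Hypothetical 24-h precipitation forecasts (inches) from four members.
members = [0.0, 0.05, 0.3, 0.6]
pop = exceedance_probability(members, threshold=0.1)
print(pop)
```

With m members the result can only take the values 0, 1/m, ..., 1, so larger ensembles give finer probability steps.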