
Page 1:

James Brown

James.D.Brown@noaa.gov

An introduction to verifying probability forecasts

RFC Verification Workshop

Page 2:

1. Introduction to methods
• What methods are available?

• How do they reveal (or not) particular errors?

• Lecture now, and hands-on training later

2. Introduction to prototype software
• Ensemble Verification System (EVS)

• Part of a larger experimental project (XEFS)

• Lecture now, and hands-on training later

Goals for today

Page 3:

3. To establish user requirements
• EVS in very early (prototype) stage

• Pool of methods may expand or contract

• Need some input on verification products

• AND to address pre-workshop questions…

Goals for today

Page 4:

How is ensemble verification done?

Same for short/long-term ensembles?

What tools, and are they operational?

Which metrics for which situations?

Simple metrics for end-users?

How best to manage the workload?

What data need to be archived, and how?

Pre-workshop questions

Page 5:

1. Background and status

2. Overview of EVS

3. Metrics available in EVS

4. First look at the user-interface (GUI)

Contents for next hour

Page 6:

1. Background and status

Page 7:

A first look at operational needs
• Two classes of verification identified

1. High time sensitivity (‘prognostic’)
• e.g. how reliable is my live flood forecast?…

• …where should I hedge my bets?

2. Less time sensitive (‘diagnostic’)
• e.g. which forecasts do less well and why?

A verification strategy?

Page 8:

Prognostic example

[Plot: temperature (°C) against forecast lead day, showing the live forecast (L), matching historical forecasts (H), and historical observations selected so that μH = μL ± 1.0 °C.]

Page 9:

Diagnostic example

[Plot: probability of warning correctly (‘hit’) against probability of warning incorrectly (‘false alarms’), both from 0 to 1.0, with points marked for climatology, a single-valued forecast, and e.g. a flood warning issued when P ≥ 0.9.]

Page 10:

Motivation for EVS (and XEFS)
• Demand: forecasters and their customers

• Demand for useable verification products

• …limitations of existing software

History
• Ensemble Verification Program (EVP)

• Comprised (too) many parts, lacked flexibility

• Prototype EVS begun in May 2007 for XEFS…

Motivation for EVS

Page 11:

Position in XEFS

[Diagram: EVS (Ensemble Verification Subsystem) within the XEFS architecture, alongside the Ensemble Pre-Processor (EPP3) with its EPP User Interface, the Ensemble Streamflow Prediction Subsystem (ESP2) with OFS, IFP, MODs and flow data, the Ensemble Post-Processor (EnsPost), the HMOS Ensemble Processor, the Hydrologic Ensemble Hindcaster, and the Ensemble Product Generation Subsystem (EPG) with the Ensemble Viewer and Ensemble User Interface. Atmospheric forcing data (precip., temp., etc.), raw and post-processed flow ensembles, streamflow and hydro-meteorological ensembles feed the system; outputs include ensemble verification products and ensemble/probability products.]

Page 12:

2. Overview of EVS

Page 13:

Diagnostic verification
• For diagnostic purposes (less time-sensitive)

• Prognostic verification is built into forecasting systems

Diagnostic questions include…
• Are ensembles reliable?

• Prob[flood] = 0.9: does it occur 9/10 times?

• Are forecaster MODs working well?

• What are the major sources of uncertainty?

Scope of EVS

Page 14:

Verification of continuous time-series
• Temperature, precipitation, streamflow, etc.

• More than one forecast point, but not spatial products

All types of forecast times
• Any lead time (e.g. 1 day – 2 years or longer)

• Any forecast resolution (e.g. hourly, daily)

• Pair forecasts/observations (in different time zones)

• Ability to aggregate across forecast points

Design goals of EVS

Page 15:

Flexibility to target data of interest
• Subset based on forecasts and observations

• Two conditions: 1) time; 2) variable value (see the sketch after this slide)

• e.g. forecasts where ensemble mean < 0˚C

• e.g. max. observed flow in 90-day window

Ability to pool/aggregate forecast points
• Number of observations can be limiting

• Sometimes appropriate to pool points

Design goals of EVS
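A minimal sketch (not EVS code) of the kind of conditional subsetting described above, assuming forecast/observation pairs held as NumPy arrays with matching valid times; all names and parameters are illustrative:

```python
import numpy as np

def subset_pairs(forecasts, observations, valid_times, months=None, mean_below=None):
    """Select forecast/observation pairs by a time condition and a
    variable-value condition (both optional).

    forecasts:    array of shape (n_pairs, n_members), e.g. temperature ensembles
    observations: array of shape (n_pairs,)
    valid_times:  sequence of datetime objects, one per pair
    months:       set of calendar months to keep (time condition)
    mean_below:   keep pairs whose ensemble mean is below this value
                  (value condition, e.g. ensemble mean < 0 degrees C)
    """
    keep = np.ones(len(observations), dtype=bool)
    if months is not None:
        keep &= np.array([t.month in months for t in valid_times])
    if mean_below is not None:
        keep &= forecasts.mean(axis=1) < mean_below
    return forecasts[keep], observations[keep]
```

For example, subset_pairs(fc, obs, times, months={12, 1, 2}, mean_below=0.0) would keep winter pairs with a sub-freezing ensemble mean.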

Page 16:

Carefully selected metrics
• Different levels of detail on errors

• Some are more complex than others, but…

• Use cases and online documentation to assist

To be ‘user-friendly’
• Many factors determine this…

• GUI, I/O, execution speed, batch modes

Design goals of EVS

Page 17:

Example of workflow

How biased are my winter flows > flood level at dam A?

Page 18:

Coordinated across XEFS:

The forecasts
• Streamflow: ESP binary files (.CS)

• Temperature and precip: OHD datacard files

The observations
• OHD datacard files

Unlikely to be a database in the near future

Archiving requirements

Page 19:

3. Metrics available

Page 20:

Many ways to test a probability forecast

1. Tests of a single-valued property (e.g. the mean)

2. Tests of the broader forecast distribution

• Both may involve reference forecasts (“skill”)

Caveats in testing probabilities
• Observed probabilities require many events

• Big assumption 1: we can ‘pool’ events

• Big assumption 2: observations are ‘good’

Types of metrics

Page 21:

Discrete/categorical forecasts
• Many metrics rely on discrete forecasts

• e.g. will it rain? {yes/no} (rain > 0.01)

• e.g. will it flood? {yes/no} (stage > flood level)

What about continuous forecasts?
• An infinite number of events

• Arbitrary event thresholds (i.e. ‘bins’)?

• Typically, yes (and the choice will affect results; see the sketch after this slide)

Problem of cont. forecasts
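The flood and rain examples above reduce a continuous ensemble forecast to a discrete event via a threshold. A minimal sketch of the usual calculation (illustrative names, not EVS's API):

```python
import numpy as np

def event_probability(ensemble, threshold):
    """Fraction of ensemble members exceeding a threshold,
    e.g. stage > flood level, or rain > 0.01."""
    ensemble = np.asarray(ensemble, dtype=float)
    return float(np.mean(ensemble > threshold))

# Example: a 10-member stage forecast against a flood level of 12.0 (made-up values)
members = [10.2, 11.8, 12.4, 13.1, 11.5, 12.9, 12.2, 10.9, 13.4, 12.6]
print(event_probability(members, threshold=12.0))  # 0.6
```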

Page 22:

Detail varies with verification question
• e.g. inspection of ‘blown’ forecasts (detailed)

• e.g. average reliability of flood forecasts (less detail)

• e.g. rapid screening of forecasts (least detail)

All included to some degree in EVS…

Metrics in EVS

Page 23:

Most detailed (box plot)

[Plot: box plots of the ‘errors’ for one ensemble forecast relative to the observation, against time (days since forecast start time, 0–20); each box marks the greatest positive and negative errors and the 90th, 80th, 50th, 20th and 10th percentiles.]
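A rough sketch of how the plotted quantities could be computed for a single forecast: the ‘errors’ are ensemble members minus the observation, summarized by their extremes and the percentiles shown above (a generic illustration; EVS may define the boxes differently).

```python
import numpy as np

def error_box(ensemble, observation, percentiles=(10, 20, 50, 80, 90)):
    """Summarize ensemble forecast 'errors' (member minus observation)
    by their greatest negative/positive values and selected percentiles."""
    errors = np.asarray(ensemble, dtype=float) - observation
    summary = {"greatest -ve": errors.min(), "greatest +ve": errors.max()}
    for p in percentiles:
        summary[f"{p} percent."] = np.percentile(errors, p)
    return summary
```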

Page 24:

Most detailed (box plot)

[Plot: as the previous slide, but the box plots of ensemble forecast ‘errors’ are plotted against observed value (increasing size) rather than time.]

Page 25:

Less detail (Reliability)

[Plot: reliability diagram; observed probability given forecast against forecast probability (probability of flooding), from 0 to 1.0; departure from the diagonal indicates “forecast bias”.]

“On occasions when flooding is forecast with probability 0.5, it should occur 50% of the time.”
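A minimal sketch of how such a reliability curve can be computed from many forecast/observation pairs, assuming the forecast probabilities and 0/1 outcomes are already paired; the binning choice here is illustrative, not EVS's.

```python
import numpy as np

def reliability_curve(forecast_probs, event_occurred, n_bins=10):
    """Observed frequency of the event conditional on forecast probability.

    forecast_probs: array of forecast probabilities in [0, 1]
    event_occurred: array of 0/1 outcomes (did flooding occur?)
    Returns (bin_centres, observed_frequency, count_per_bin).
    """
    forecast_probs = np.asarray(forecast_probs, dtype=float)
    event_occurred = np.asarray(event_occurred, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    centres, freqs, counts = [], [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (forecast_probs >= lo) & (forecast_probs < hi)
        if hi == 1.0:  # include probability 1.0 in the last bin
            in_bin |= forecast_probs == 1.0
        centres.append((lo + hi) / 2.0)
        counts.append(int(in_bin.sum()))
        freqs.append(event_occurred[in_bin].mean() if in_bin.any() else np.nan)
    return np.array(centres), np.array(freqs), np.array(counts)
```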

Page 26:

Less detail (Cumulative Talagrand)

[Plot: cumulative probability against the position of the observation in the forecast distribution (0–100); departure from the diagonal indicates “forecast bias”.]

“If river stage ≤ X is forecast with probability 0.5, it should be observed 50% of the time.”
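One way to compute the x-axis quantity above is the percentile position of each observation within its ensemble, whose empirical cumulative distribution then gives the diagram (a diagonal indicates no bias). This is a generic sketch of a rank/PIT-style calculation, not necessarily EVS's exact definition.

```python
import numpy as np

def observation_positions(forecasts, observations):
    """Percentile position (0-100) of each observation within its ensemble.

    forecasts:    array of shape (n_pairs, n_members)
    observations: array of shape (n_pairs,)
    """
    forecasts = np.asarray(forecasts, dtype=float)
    observations = np.asarray(observations, dtype=float)
    below = (forecasts <= observations[:, None]).mean(axis=1)
    return 100.0 * below

def cumulative_talagrand(positions, n_points=101):
    """Empirical CDF of observation positions; plot against the diagonal."""
    grid = np.linspace(0.0, 100.0, n_points)
    cdf = np.array([(positions <= g).mean() for g in grid])
    return grid, cdf
```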

Page 27:

Least detailed (a score)

[Plot: river stage (0.0–2.0) against time (days, 0–30), showing five forecasts (labelled 1–5) and the corresponding observations relative to the flood stage.]

Brier score = 1/5 × {(0.8 − 1.0)² + (0.1 − 1.0)² + (0.0 − 0.0)² + (0.95 − 1.0)² + (1.0 − 1.0)²}
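The same arithmetic as a short sketch: the five forecast probabilities of exceeding flood stage from the slide, paired with 0/1 outcomes consistent with the formula above.

```python
import numpy as np

def brier_score(forecast_probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    forecast_probs = np.asarray(forecast_probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((forecast_probs - outcomes) ** 2))

# Five forecasts of P(stage > flood stage) vs. whether flooding occurred
print(brier_score([0.8, 0.1, 0.0, 0.95, 1.0], [1, 1, 0, 1, 1]))  # 0.1705
```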

Page 28:

Least detailed (a score)

[Plot: cumulative probability (0.0–1.0) against precipitation amount (0–30), showing a single forecast distribution and the observation as a step function, with the regions between them labelled A and B.]

CRPS = A² + B²

Then average across multiple forecasts: small scores are better
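A minimal sketch of the CRPS for one ensemble forecast, computed numerically as the integral of the squared difference between the empirical forecast distribution and the observation step function (the quantity that the regions A and B above illustrate); the ensemble values are made up for illustration, and the score is then averaged over many forecasts, with smaller values better.

```python
import numpy as np

def crps_ensemble(members, observation, n_grid=2000):
    """CRPS for one ensemble forecast: integral over x of
    (forecast CDF(x) - 1{x >= observation})^2."""
    members = np.sort(np.asarray(members, dtype=float))
    lo = min(members[0], observation) - 1.0
    hi = max(members[-1], observation) + 1.0
    x = np.linspace(lo, hi, n_grid)
    forecast_cdf = np.searchsorted(members, x, side="right") / members.size
    obs_cdf = (x >= observation).astype(float)
    return float(np.trapz((forecast_cdf - obs_cdf) ** 2, x))

# Example with made-up precipitation members and one observation
print(crps_ensemble([2.0, 3.5, 4.0, 5.5, 7.0], observation=4.2))
```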

Page 29:

4. First look at the GUI

Page 30:

Two-hour lab sessions with EVS
• Start with synthetic data (with simple errors)

• Then move on to a couple of real cases

Verification plans and feedback
• Real-time (‘prognostic’) verification

• Screening verification outputs

• Developments in EVS

• Feedback: discussion and survey

Rest of today