Slide 1

TIGGE phase 1: Experience with exchanging large amounts of NWP data in near real-time

Baudouin Raoult

Data and Services Section

ECMWF

Slide 2

The TIGGE core dataset

THORPEX Interactive Grand Global Ensemble

Global ensemble forecasts out to around 14 days, generated routinely at different centres around the world

Outputs collected in near real time and stored in a common format for access by the research community

Easy access to long series of data is necessary for applications such as bias correction and the optimal combination of ensembles from different sources

Slide 3

Building the TIGGE database

Three archive centres: CMA, NCAR and ECMWF

Ten data providers:

- Already sending data routinely: ECMWF, JMA (Japan), UK Met Office (UK), CMA (China), NCEP (USA), MSC (Canada), Météo-France (France), BOM (Australia), KMA (Korea)

- Coming soon: CPTEC (Brazil)

Exchanges using UNIDATA LDM, HTTP and FTP

Operational since 1 October 2006

88 TB, growing by ~ 1 TB/week

- ~1.5 million fields/day

Slide 4

TIGGE Archive Centres and Data Providers

[Diagram: map of the TIGGE network. Archive centres: NCAR, ECMWF, CMA. Current data providers: NCEP, CMC, UKMO, ECMWF, Météo-France, JMA, KMA, CMA, BoM. Future data provider: CPTEC]

Slide 5

Strong governance

Precise definition of:

- Which products: list of parameters, levels, steps, units,…

- Which format: GRIB2

- Which transport protocol: UNIDATA’s LDM

- Which naming convention: WMO file name convention

Only exception: the grid and resolution

- Chosen by the data provider, who must provide an interpolation to a regular lat/lon grid

- Best possible model output

Many tools and examples:

- Sample dataset available

- Various GRIB2 tools, the “tigge_check” validator, … (a minimal validation sketch follows below)

- Scripts that implement exchange protocol

Web site with documentation, sample data set, tools, news, …
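To illustrate what such a validator does, here is a minimal sketch (not the real tigge_check) of a per-field check using the ecCodes Python API; the required-key values below are illustrative assumptions, not the official TIGGE product list.

    import eccodes

    # Keys every TIGGE GRIB2 field could be checked against; the expected
    # values are assumptions for illustration, not the official list.
    REQUIRED = {
        "editionNumber": 2,  # TIGGE mandates GRIB edition 2
        "productionStatusOfProcessedData": 4,  # 4 = TIGGE in Code table 1.3
    }

    def check_file(path):
        """Return (key, expected, actual) triples for every mismatch."""
        problems = []
        with open(path, "rb") as f:
            while True:
                gid = eccodes.codes_grib_new_from_file(f)
                if gid is None:  # end of file
                    break
                for key, expected in REQUIRED.items():
                    actual = eccodes.codes_get(gid, key)
                    if actual != expected:
                        problems.append((key, expected, actual))
                eccodes.codes_release(gid)
        return problems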

Slide 6

Using SMS (ECMWF’s Supervisor Monitor Scheduler) to handle the TIGGE flow

Slide 7

Quality assurance: homogeneity

Homogeneity is paramount for TIGGE to succeed

- The more consistent the archive, the easier it will be to develop applications

There are three aspects to homogeneity:

- Common terminology (parameter names, file names, …)

- Common data format (format, units, …)

- Definition of an agreed list of products (Parameters, Steps, levels, …)

What is not homogeneous:

- Resolution

- Base time (although most providers have a run at 12 UTC)

- Forecast length

- Number of ensemble members

Slide 8

QA: Checking for homogeneity

E.g. cloud cover: instantaneous or six-hourly?
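One way such a check could be automated, sketched here with the ecCodes Python API (the short name “tcc” for total cloud cover is an assumption):

    import eccodes

    def step_types(path, short_name="tcc"):
        """Collect the stepType values used for one parameter in a file."""
        seen = set()
        with open(path, "rb") as f:
            while True:
                gid = eccodes.codes_grib_new_from_file(f)
                if gid is None:
                    break
                if eccodes.codes_get(gid, "shortName") == short_name:
                    # 'instant' vs 'avg'/'accum' reveals how the provider
                    # chose to encode the field
                    seen.add(eccodes.codes_get(gid, "stepType"))
                eccodes.codes_release(gid)
        return seen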

Slide 9

QA: Completeness

The objective is to have 100% complete datasets at the Archive Centres

Completeness may not be achieved for two reasons:

- The transfer of the data to the Archive Centre fails

- Operational activities at a data provider are interrupted and back-filling past runs is impractical

Incomplete datasets are often very difficult to use

Most of the current tools (e.g. epsgrams) used for ensemble forecasts assume a fixed number of members from day to day

- These tools will have to be adapted
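As a sketch of what a completeness check involves, the snippet below compares the fields expected for one run against those actually received; the provider name, member count and step range are illustrative, and the received set would in practice come from an archive inventory.

    from itertools import product

    def missing_fields(received, providers, members, steps, params):
        """List every expected (provider, member, step, param) not received."""
        expected = set(product(providers, members, steps, params))
        return sorted(expected - set(received))

    # Example: a 51-member ensemble, 6-hourly steps out to 240 h, two
    # parameters; an empty 'received' set reports everything as missing.
    gaps = missing_fields(
        received=set(),
        providers=["ecmf"],
        members=range(51),
        steps=range(0, 246, 6),
        params=["t", "d"],
    )
    print(f"{len(gaps)} fields missing")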

Slide 10

QA: Checking completeness

Slide 11

GRIB to NetCDF Conversion

[Diagram: a GRIB file holding one record per (parameter, centre, member) combination, e.g. (t, EGRR, 1), (t, EGRR, 2), (t, ECMF, 1), (t, ECMF, 2), (d, EGRR, 1), (d, EGRR, 2), (d, ECMF, 1), (d, ECMF, 2), is converted into a NetCDF file containing metadata plus the parameter arrays t(1,2,3,4) and d(1,2,3,4), where (1,2,3,4) represents the ensemble member id (Realization)]

The conversion proceeds in three steps:

- Gather metadata and message locations

- Create the NetCDF file structure

- Populate the NetCDF parameter arrays
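A condensed sketch of this two-pass conversion, assuming the ecCodes and netCDF4 Python libraries; for brevity it reads whole field values rather than just message locations, flattens the grid to a single point dimension, and treats each (centre, member) pair as one realization.

    import eccodes
    import netCDF4

    def grib_to_netcdf(grib_path, nc_path):
        # Pass 1: gather the fields, keyed as in the diagram by
        # (parameter, centre, ensemble member).
        fields = {}
        with open(grib_path, "rb") as f:
            while True:
                gid = eccodes.codes_grib_new_from_file(f)
                if gid is None:
                    break
                key = (eccodes.codes_get(gid, "shortName"),
                       eccodes.codes_get(gid, "centre"),
                       eccodes.codes_get(gid, "perturbationNumber"))
                fields[key] = eccodes.codes_get_values(gid)
                eccodes.codes_release(gid)

        # Pass 2: create the NetCDF structure, then populate one array
        # per parameter along a shared 'realization' dimension.
        realizations = sorted({(c, m) for (_, c, m) in fields})
        npoints = len(next(iter(fields.values())))
        with netCDF4.Dataset(nc_path, "w") as nc:
            nc.createDimension("realization", len(realizations))
            nc.createDimension("point", npoints)
            for param in sorted({p for (p, _, _) in fields}):
                var = nc.createVariable(param, "f4", ("realization", "point"))
                for i, (c, m) in enumerate(realizations):
                    if (param, c, m) in fields:
                        var[i, :] = fields[(param, c, m)]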

Slide 12

Ensemble NetCDF File Structure

NetCDF File format

- Based on available CF conventions

- File organization follows the NetCDF file structure proposed by Doblas-Reyes (ENSEMBLES project)

- Provides grid/ensemble-specific metadata for each member:

  - Data provider

  - Forecast type (perturbed, control, deterministic)

- Allows for multiple combinations of initialization times and forecast periods within one file:

  - Pairs of initialization time and forecast step

Slide 13

Ensemble NetCDF File Structure

NetCDF Parameter structure (5 dimensions):

- Reftime

- Realization (Ensemble member id)

- Level

- Latitude

- Longitude

“Coordinate” variables are used to describe:

- Realization

  - Provides the metadata associated with each ensemble grid

- Reftime

  - Allows multiple initialization times and forecast periods to be contained within one file
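A minimal sketch of this five-dimensional structure with the netCDF4 Python library; the dimension sizes, units and attribute names are illustrative assumptions, not the exact ENSEMBLES/CF layout:

    import netCDF4
    import numpy as np

    with netCDF4.Dataset("tigge_t.nc", "w") as nc:
        for name, size in [("reftime", 2), ("realization", 4), ("level", 3),
                           ("latitude", 181), ("longitude", 360)]:
            nc.createDimension(name, size)

        # "Coordinate" variables carrying the per-dimension metadata.
        reftime = nc.createVariable("reftime", "f8", ("reftime",))
        reftime.units = "hours since 2006-10-01 00:00:00"
        realization = nc.createVariable("realization", "i4", ("realization",))
        realization.long_name = "ensemble member id"

        # The parameter itself spans all five dimensions.
        t = nc.createVariable(
            "t", "f4",
            ("reftime", "realization", "level", "latitude", "longitude"))
        t.units = "K"
        t[:] = np.zeros(t.shape, dtype="f4")  # placeholder values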

Slide 14

Tool Performance

GRIB-2 simple packing to NetCDF 32-bit:

- ~2× the GRIB-2 size

GRIB-2 simple packing to NetCDF 16-bit:

- Similar size

GRIB-2 JPEG 2000 to NetCDF 32-bit:

- ~8× the GRIB-2 size

GRIB-2 JPEG 2000 to NetCDF 16-bit:

- ~4× the GRIB-2 size

Issue: packing of 4D fields (e.g. 2D + levels + time steps)

- Packing in NetCDF similar to simple packing in GRIB2:

value = scale_factor * packed_value + add_offset

- All dimensions share the same scale_factor and add_offset

- With 16 bits, only 65536 different values can be encoded; this is a problem if there is a lot of variation across the 4D matrix
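A worked example of why this matters; the pressure values are illustrative, chosen to mix two very different ranges in one packed array:

    import numpy as np

    def pack16(values):
        """Simple packing: value = scale_factor * packed + add_offset."""
        vmin, vmax = values.min(), values.max()
        scale = (vmax - vmin) / 65535  # one scale for the whole array
        packed = np.round((values - vmin) / scale).astype(np.uint16)
        return packed, scale, vmin

    # Surface pressure (~101325 Pa) packed together with stratospheric
    # pressure (~1000 Pa): every representable value is ~1.53 Pa apart,
    # so fine structure within either range is lost.
    field = np.concatenate([np.full(5, 101325.0), np.full(5, 1000.0)])
    packed, scale_factor, add_offset = pack16(field)
    print(f"quantisation step: {scale_factor:.2f} Pa")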

Slide 15

GRIB2

WMO Standard

Fine control on numerical accuracy of grid values

Good compression (lossless JPEG 2000)

GRIB is a record format

- Many GRIBs can be written in a single file

GRIB Edition 2 is template based

- It can easily be extended
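Because each GRIB message is a self-contained record, merging datasets can be as simple as byte-level concatenation; the file names in this sketch are illustrative:

    # Append every message from several GRIB files into one file.
    with open("merged.grib2", "wb") as out:
        for name in ("ecmwf.grib2", "ukmo.grib2"):
            with open(name, "rb") as f:
                out.write(f.read())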

Slide 16

NetCDF

Work on the converter gave us a good understanding of both formats

NetCDF is a file format

- Merging/splitting NetCDF files is non-trivial

Need to agree on a convention (CF)

- Only lat/lon and reduced grids (?) so far; work is in progress to add other grids to the CF conventions

- There is no way to support multiple grids in the same file

A convention must be chosen for storing multiple fields per NetCDF file

- All levels? All variables? All time steps?

Simple packing is possible, but only as a convention

- 2 to 8 times larger than GRIB2

Slide 17

Conclusion

True interoperability

- Data format, Units

- Clear definition of the parameters (semantics)

- Common tools are required (the only guarantee of true interoperability)

Strong governance is needed

GRIB2 vs NetCDF

- Different usage patterns

- NetCDF: file based, little compression, need to agree on a convention

- GRIB2: record based, easier to manage large volumes, WMO Standard