Introduction to Observational DataBase (ODB) Sami Saarinen, Paul Burton ECMWF 22-Mar-2006

ECMWF ODB Training 2006

Overview

Introduction to ODB

Creating a simple database

Use of the simulobs2odb program

Visualizing data using odbviewer, ODBTk

The bigger picture

ODB within IFS/4DVAR-system

A more complex database

Manipulating ODB from Fortran90

Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf

Introduction to ODB

ODB is tailor-made database software developed at ECMWF to manage very large observational data volumes through the IFS/4DVAR system, and to enable flexible post-processing of observational data

An observational database usually contains the following items:

Observation identification, position and time coordinates

Observation value, pressure levels, channel numbers

Various quality control flags

Obs. departures from background and analysis fields

Satellite specific information

Other closely related information

AMSU-A data before screening

Basic components of ODB

ODB/SQL language

Data Definition Language: describes which data items belong to the database, what their data types are, and how (if at all) they are related to each other

Data Query Language: queries and returns the subset of data that satisfies user-specified conditions. This is the key feature of the ODB software!

Fortran90 interface layer

Data manipulation: create, update & remove data

Execute ODB/SQL queries and retrieve filtered data

Control MPI and OpenMP parallelization

ODB/SQL compilation system

Typical ODB usage patterns

A database can be created interactively or in batch mode

We usually run our in-house BUFR2ODB in batch

New observation types can also be fed in via text files

Complete database manipulation is currently best done through the Fortran90 interface, but a read-only database can also be accessed via a rudimentary client-server interface (C/C++)

Once the database has been created, the application program normally queries data and places the result (also known as a view) into a data matrix allocated by the user

There can be virtually any number of active views at any given time. These can be updated and fed back to the database

Creating a simple database

We will create a very simple database using text files

The 3 text files describe

Data layout, i.e. what data items comprise this ODB

Location and time information of observations

Actual observation measurement information for each location at the given pressure levels

Feed these files into the simulobs2odb program

Inspect the data values in the database using odbviewer

Data definition layout: MYDB.ddl

CREATE TABLE hdr AS (
  seqno pk1int,
  obstype pk1int,
  codetype pk1int,
  lat pk9real,
  lon pk9real,
  date yyyymmdd,
  time hhmmss,
  body @LINK,
);

CREATE TABLE body AS (
  entryno pk1int,
  varno pk1int,
  vertco_type pk1int,
  press pk9real,
  obsvalue pk9real,
);

Input file #2: hdr.txt

#hdr
obstype = 2
codetype = 141
seqno lat lon date time body.len
1 45 -15 20041101 000000 1

Input file #3: body.txt

#body
entryno varno vertco_type press obsvalue
1 2 1 50000 251.0
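The two input files share a simple layout: a `#table` line, optional `name = value` lines, a line of column names, then one line per data row. The Python sketch below reads that layout. It is based only on what is visible in the examples above and is not the actual simulobs2odb parser; the function name `parse_table_text` is hypothetical, and treating `name = value` as a constant applied to every row is an assumption.

```python
def parse_table_text(text):
    """Parse '#table', 'name = value' defaults, a column line and data rows."""
    table, defaults, columns, rows = None, {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            table = line[1:].strip()          # e.g. '#hdr' -> 'hdr'
        elif "=" in line:
            name, value = line.split("=", 1)  # assumed: constant for every row
            defaults[name.strip()] = value.strip()
        elif columns is None:
            columns = line.split()            # first plain line names the columns
        else:
            row = dict(defaults)
            row.update(zip(columns, line.split()))
            rows.append(row)
    return table, rows


HDR_TXT = """#hdr
obstype = 2
codetype = 141
seqno lat lon date time body.len
1 45 -15 20041101 000000 1
"""

table, rows = parse_table_text(HDR_TXT)
print(table, rows[0]["obstype"], rows[0]["lat"], rows[0]["body.len"])
```

All values are kept as strings here; converting them to the types declared in MYDB.ddl is left out to keep the sketch short.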

Running simulobs2odb

Initialize the ODB interactive environment:

use odb

Create the database using the following simple command:

simulobs2odb -l MYDB -i hdr.txt -i body.txt

As a result of these commands, a small database called MYDB has been created. It contains one data pool with two tables, hdr and body, which are linked (related) to each other via the special @LINK data type

It is now easy to extend the database by providing more data, specifying more data items, adding more tables, or all of the above at the same time
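The `body.len` column in hdr.txt states how many body rows belong to each hdr row. One way to picture such a link is as an (offset, length) pair pointing into the body table, as in the Python sketch below. The sketch assumes that the body rows of each header are stored contiguously and in header order, which the slides do not state. The first header and body row come from the input files above; the second header and its two body rows are invented for illustration.

```python
from itertools import accumulate

# Row 1 comes from hdr.txt / body.txt; the rest is invented for illustration.
hdr = [
    {"seqno": 1, "lat": 45.0, "lon": -15.0, "body_len": 1},
    {"seqno": 2, "lat": 50.0, "lon": 10.0, "body_len": 2},
]
body = [
    {"entryno": 1, "press": 50000.0, "obsvalue": 251.0},
    {"entryno": 1, "press": 85000.0, "obsvalue": 271.5},
    {"entryno": 2, "press": 50000.0, "obsvalue": 249.8},
]

# Assumed layout: each header's body rows are contiguous, so the offset
# is the running total of the preceding lengths.
lengths = [h["body_len"] for h in hdr]
offsets = [0] + list(accumulate(lengths))[:-1]


def body_rows(i):
    """Return the body rows linked to header row i."""
    return body[offsets[i]:offsets[i] + lengths[i]]


for i, h in enumerate(hdr):
    print(h["seqno"], [b["obsvalue"] for b in body_rows(i)])
```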

Visualizing with odbviewer

History: odbviewer was originally written as a debugging tool for ODB software development

Linked with the ECMWF graphics package MAGICS/MAGICS++, it displays coverage plots

Also a textual report generator

Displays output of data queries

"Sensitive" to the ODB/SQL language: automatically tries to produce both a coverage plot and a textual report for the user

The textual report itself can be an invaluable source of information for further post-processing tasks

Running odbviewer

Go to the database directory

cd MYDB

Run

odbviewer -q 'SELECT lat,lon,press,obsvalue \
              FROM hdr, body \
              WHERE obstype = 2'
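For readers more used to general-purpose SQL, the query above resembles a join between hdr and body. The sketch below reproduces the idea with Python's built-in sqlite3 module as a stand-in; it is not ODB/SQL. SQLite has no @LINK type, so a hypothetical `hdr_seqno` column carries the relationship explicitly, and the ODB packed types are replaced by plain INTEGER and REAL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE hdr  (seqno INTEGER, obstype INTEGER, codetype INTEGER,
                   lat REAL, lon REAL, date INTEGER, time INTEGER);
CREATE TABLE body (hdr_seqno INTEGER,  -- stand-in for the @LINK relation
                   entryno INTEGER, varno INTEGER, vertco_type INTEGER,
                   press REAL, obsvalue REAL);
""")
# The single observation from hdr.txt and body.txt.
con.execute("INSERT INTO hdr VALUES (1, 2, 141, 45, -15, 20041101, 0)")
con.execute("INSERT INTO body VALUES (1, 1, 2, 1, 50000, 251.0)")

result = con.execute("""
    SELECT lat, lon, press, obsvalue
    FROM hdr JOIN body ON body.hdr_seqno = hdr.seqno
    WHERE obstype = 2
""").fetchall()
print(result)
```

In ODB/SQL the join condition does not have to be written out, because the link between hdr and body is part of the data definition.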

Figure: odbviewer coverage plot, with our observation marked

Some odbviewer [options]

-h                    List of options ("help")
-q 'SQL-stmt'         Provide ODB/SQL statement inline
-v viewname/poolno    Choose SQL name (& optionally pool number)
-p "1-10,12,15"       Choose from a subset of pools
-R                    No radians-to-degrees conversion for (lat,lon)
-r                    Enforce radians-to-degrees conversion
-c                    Clean start (i.e. recompile all)
-e editor             Choose preferred editor
-e batch              Run in batch mode (same as -e pipe)
-N                    Do not produce a report at all
-I                    Do not show plot immediately
-P projection         Change projection
-C file.cmap          Supply a color map file
-A plot_area          Choose plotting area
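The `-p "1-10,12,15"` option takes a comma-separated list of pool numbers and ranges. The Python sketch below expands such a specification into explicit pool numbers, assuming ranges are inclusive. The function `expand_pools` is hypothetical, and odbviewer's own parsing may accept forms that are not handled here.

```python
def expand_pools(spec):
    """Expand a spec such as '1-10,12,15' into a sorted list of pool numbers."""
    pools = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(p) for p in part.split("-", 1))
            pools.update(range(first, last + 1))  # ranges assumed inclusive
        else:
            pools.add(int(part))
    return sorted(pools)


print(expand_pools("1-10,12,15"))
```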

ODBTk: The ODB Toolkit

GUI-based ODB visualisation tool

Easy way for non-experts to build SQL

Interactive viewing of observational data

Can refine the SQL "WHERE" condition as you view the data

Portable, lightweight application

Requires ODB, perl, Fortran90 & C compilers

ODBTk: Building an SQL query

Twin views on structure

Hierarchical structure

Allows relationships between tables/columns to be seen

"Flat structure"

Easy to find a given column/member or table

Allows user to sort structure

SQL library

Both local & shared

Visualising Coverage

Visualising X-Y plots

AMSU-A data after screening

Under 10% left active !!

ODB within IFS/4DVAR-system

Diagram elements: ECMA/ODB, CCMA/ODB, output BUFRs

A more complex database

In the real world a database may contain many more tables (>>5) than in the simple example earlier

Each table can contain 10 to 50 data columns

There can also be a sophisticated data hierarchy (next slide) to describe potentially complex relationships between tables

To provide good parallel performance on supercomputers, data tables are furthermore divided into data pools

They behave like sub-databases within a database

They allow much bigger data sets than otherwise possible
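The idea behind data pools can be pictured as splitting rows into independent slices that separate tasks work on at the same time. The Python sketch below does this with threads on a small list of made-up values. The round-robin distribution rule is an assumption; the slides do not say how ODB assigns rows to pools, and in IFS the parallelism comes from MPI and OpenMP rather than Python threads.

```python
from concurrent.futures import ThreadPoolExecutor

NPOOLS = 4
# Made-up observation values; only their count matters here.
observations = [250.0 + 0.5 * i for i in range(10)]

# Assumed distribution rule: row i goes to pool i mod NPOOLS (round-robin).
pools = [observations[p::NPOOLS] for p in range(NPOOLS)]


def pool_mean(rows):
    """Work done independently on one pool, as on one parallel task."""
    return sum(rows) / len(rows)


with ThreadPoolExecutor(max_workers=NPOOLS) as executor:
    means = list(executor.map(pool_mean, pools))

print([len(p) for p in pools], means)
```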

Comprehensive data hierarchy

ECMWF BUFR to ODB conversion

ODBs at ECMWF are normally created using bufr2odb

Enables efficient MPI-parallel database creation

Allows retrospective inspection of feedback BUFR data by converting it into ODB

bufr2odb can also be used interactively, for example:

bufr2odb -i bufr_input_file -I 1-20 -n 4

The preceding example creates 4 pools of the ECMA database from the given BUFR input file, but includes only BUFR subtypes from 1 to 20 (inclusive)

Feedback BUFR to ODB works similarly:

fb2odb -i feedback_bufr_file -n 8 -u 2

Manipulating ODB from Fortran90

Currently Fortran90 is the only way to fill an ODB database

simulobs2odb is also a Fortran90 program underneath

likewise odbviewer and practically any other ODB tool

Also: to fetch and update data, Fortran90 is necessary

The ODB Fortran90 interface layer offers a comprehensive set of functions to

Open & close a database

Attach to & execute precompiled ODB/SQL queries

Load, update & store queried data

An example ODB program

program main
  use odb_module
  implicit none
  integer(4) :: h, rc, nra, nrows, ncols, npools, j, jp
  real(8), allocatable :: x(:,:)

  npools = 0
  h = ODB_open('MYDB', 'OLD', npools=npools)

  ! < data manipulation loop; see "Data manipulation loop" >

  rc = ODB_close(h, save=.TRUE.)
end program main

Data manipulation loop

DO jp=1,npools
  ! Execute SQL, allocate space, get data into matrix
  rc = ODB_select(h,'sqlview',nrows,ncols,poolno=jp)
  allocate(x(nrows,0:ncols))
  rc = ODB_get(h,'sqlview',x,nrows,ncols,poolno=jp)

  ! Update data, put back to DB, deallocate space
  call update(x,nrows,ncols) ! Not an ODB-routine
  rc = ODB_put(h,'sqlview',x,nrows,ncols,poolno=jp)
  deallocate(x)
  rc = ODB_cancel(h,'sqlview',poolno=jp)

  ! Use the following only with READONLY-databases
  ! rc = ODB_release(h,poolno=jp)
ENDDO
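The loop above follows a fixed cycle for every pool: select, get, update, put, cancel. The Python sketch below mimics that cycle on plain lists so the data flow can be followed without an ODB installation. It does not call ODB; `db`, `select_view` and `update` are hypothetical stand-ins, and the data values are illustrative.

```python
# Each pool holds rows of [press, obsvalue]; values are illustrative only.
db = {
    1: [[50000.0, 251.0], [85000.0, 271.5]],
    2: [[50000.0, 249.8]],
}


def select_view(pool):
    """Stand-in for ODB_select + ODB_get: copy the pool's rows into a matrix."""
    return [row[:] for row in db[pool]]


def update(x):
    """User routine (not part of ODB): here, convert pressure from Pa to hPa."""
    for row in x:
        row[0] /= 100.0


for jp in sorted(db):
    x = select_view(jp)   # query and fetch into a working copy
    update(x)             # modify the working copy
    db[jp] = x            # stand-in for ODB_put: write the copy back
    del x                 # stand-in for deallocate / ODB_cancel

print(db)
```

The point of the pattern is that the application only ever touches the working matrix; nothing changes in the database until the matrix is put back.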

Compile, link and run

(1) use odb                                        # once per session
(2) odbcomp MYDB.ddl                               # once only; often from file MYDB.sch
(3) odbcomp sqlview.sql                            # recompile only when changed
(4) odbf90 main.F90 update.F90 -lMYDB -o main.x    # link
(5) ./main.x                                       # run

odbless

A textual browser that lets you look at ODB data page by page (a little like the Unix less command):

By default calculates a statistical summary for each retrieved data column

Cheap, with a near-optimal ODB data access pattern

User can specify the starting row

Usage:

odbless -q 'SELECT column(s) FROM table(s) WHERE …' \
        -s starting_row -n number_of_rows_to_display \
        [-b buffer_size -X]
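The two behaviours described above, a per-column summary and paging from a chosen starting row, can be sketched in a few lines of Python. The slides do not list which statistics odbless reports, so minimum, maximum and mean are assumptions here; the rows are illustrative and the function names are hypothetical.

```python
import statistics

columns = ["press", "obsvalue"]
# Illustrative rows; a real query would supply these.
rows = [(50000.0, 251.0), (85000.0, 271.5), (50000.0, 249.8), (100000.0, 280.1)]


def summary(rows, columns):
    """Return (min, max, mean) for every column."""
    stats = {}
    for j, name in enumerate(columns):
        values = [row[j] for row in rows]
        stats[name] = (min(values), max(values), statistics.mean(values))
    return stats


def page(rows, start, count):
    """Return `count` rows beginning at 1-based row `start`."""
    return rows[start - 1:start - 1 + count]


print(summary(rows, columns))
print(page(rows, start=2, count=2))
```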

odbdiff

Compares two ODB databases for differences

A very useful tool when trying to identify errors/differences between operational and experimental 4DVAR runs

Usage:

odbdiff -q 'SELECT …' DATABASE1 DATABASE2

By default brings up an xdiff window showing the differences

If latitude and longitude are given in the data query, it also produces a difference plot using the odbviewer tool
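Comparing the textual output of the same query on two databases is the core idea. The Python sketch below illustrates it with the standard difflib module on two small, made-up reports. It is an illustration of text differencing only and says nothing about how odbdiff itself is implemented.

```python
import difflib

# Made-up textual reports from the same query run on two databases.
report_oper = ["lat lon obsvalue", "45.0 -15.0 251.0", "50.0 10.0 249.8"]
report_exp = ["lat lon obsvalue", "45.0 -15.0 251.2", "50.0 10.0 249.8"]

diff = list(difflib.unified_diff(report_oper, report_exp,
                                 fromfile="DATABASE1", tofile="DATABASE2",
                                 lineterm=""))
print("\n".join(diff))
```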

odbcompress

Enables creation of a very compact database from an existing one, for archiving purposes or for a smaller footprint

Makes post-processing considerably faster

At this point the user can choose both

Truncating the data precision

Leaving out columns that are of less importance

Early tests show that this new tool achieves compression factors from 2.5X to 11X, the higher compression being for satellite data!
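The two choices named above, truncating precision and leaving out columns, both reduce the stored volume directly. The Python sketch below shows the arithmetic on a toy table, using a switch from 8-byte to 4-byte reals as a stand-in for precision truncation. How odbcompress actually packs data is not described in the slides, so this is an assumption, and the column names `flag_a` and `flag_b` are hypothetical.

```python
from array import array

columns = {
    "lat": [45.0, 50.0, 55.0],
    "lon": [-15.0, 10.0, 20.0],
    "obsvalue": [251.0, 271.5, 249.8],
    "flag_a": [0.0, 0.0, 1.0],   # hypothetical low-importance columns
    "flag_b": [1.0, 0.0, 0.0],
}

keep = ["lat", "lon", "obsvalue"]

# Original storage: every column as 8-byte reals.
original_bytes = sum(len(array("d", v).tobytes()) for v in columns.values())

# Compact storage: kept columns only, truncated to 4-byte reals.
compact = {name: array("f", columns[name]) for name in keep}
compact_bytes = sum(len(a.tobytes()) for a in compact.values())

print(original_bytes, compact_bytes, original_bytes / compact_bytes)
print(compact["obsvalue"][2])   # close to, but no longer exactly, 249.8
```

The last line shows the price of truncation: the stored value differs slightly from the original.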

odbdup

Duplicates database(s) by copying the metadata (low volume) while sharing the actual data (high volume)

Allows database sharing between multiple users

Over a shared (e.g. NFS-mounted) disk

Enables creation of a time-series database, for example:

odbdup -i "200601*/ECMA.conv" -o USERDB

The previous example creates a new database labelled USERDB, which presumably spans all the conventional observations during January 2006

The eureka moment: the user now has access to a whole month of data as if it were in one single database!
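The principle of "copy the metadata, share the data" can be sketched in Python: the combined database stores only the names of its members, and rows are read from the original databases on demand. The in-memory `databases` dictionary and its values are made up for illustration, and shell-style wildcard matching via `fnmatch` is an assumption about how the `-i` pattern is interpreted.

```python
from fnmatch import fnmatch
from itertools import chain

# Made-up per-date databases, each reduced to a list of obsvalues.
databases = {
    "20060101/ECMA.conv": [251.0, 249.8],
    "20060102/ECMA.conv": [252.3],
    "20060201/ECMA.conv": [260.0],      # February: must not match
}

pattern = "200601*/ECMA.conv"
selected = sorted(name for name in databases if fnmatch(name, pattern))

# Metadata only: the combined view stores names, not copies of the data.
userdb = {"members": selected}


def rows(view):
    """Iterate over all members as if they were one database."""
    return chain.from_iterable(databases[name] for name in view["members"])


print(userdb["members"], list(rows(userdb)))
```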

odb2netcdf

Translates the given ODB query (or whole ODB table) into a series of NetCDF files, by default one file for each ODB data pool

Usage:

odb2netcdf -q 'SELECT …'

The result files can be viewed with standard NetCDF tools like ncdump and ncview

The files can also be produced in NetCDF packed format (with the caveat of truncated precision)
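The precision caveat follows from how NetCDF packing commonly works: values are stored as small integers together with `scale_factor` and `add_offset` attributes, and a reader reconstructs `packed * scale_factor + add_offset`. The Python sketch below applies the usual 16-bit recipe to a few illustrative values and measures the resulting error. Whether odb2netcdf uses exactly this recipe is an assumption.

```python
values = [249.8, 251.0, 271.5, 280.1]   # illustrative obsvalues

NBITS = 16
vmin, vmax = min(values), max(values)
scale_factor = (vmax - vmin) / (2 ** NBITS - 1)
add_offset = vmin + 2 ** (NBITS - 1) * scale_factor

# Pack into signed 16-bit integers, then unpack the way a reader would.
packed = [round((v - add_offset) / scale_factor) for v in values]
unpacked = [p * scale_factor + add_offset for p in packed]

max_error = max(abs(u - v) for u, v in zip(unpacked, values))
print(packed, max_error)
```

The error is bounded by about half of `scale_factor`, so the wider the value range of a column, the coarser its packed representation.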

Also … Some interesting facts

Written mainly in C

Except the Fortran90 interface and the IFS/4DVAR interface

Except BUFR2ODB (by Milan Dragosavac)

ODB/SQL is currently converted into C code

10 lines of SQL generate >> 100 lines of C code

Standalone ODB installation (w/o IFS) is also available

Can be built in about 30 minutes on a Linux laptop

Tested at least on the following machines

SGI/Altix, IBM Power3/4, Linux Intel/AMD, VPP, …

Automatic binary data conversion guarantees database portability between different machines

… and some ODB "limitations"

ODB software is clearly meant for large-scale computation, since, given lots of memory and disk space and fast CPUs:

A single program can handle up to 2^31 ODB databases

A single database can have up to 2^31 data pools

A single database can have any number of tables

A single table in a data pool can have up to 2^31 rows and (by default) 9999 columns

A single ODB/SQL query over active data pools can retrieve up to 2^31 rows in one go

These really big numbers show that ODB's potential is on parallel computers, but we haven't forgotten desktop PCs!

Finally…

ODB software was developed to allow unprecedented amounts of satellite data through the IFS/4DVAR system

The software has been operational at ECMWF since June 2000, but is still evolving

Emphasis is now on graphical post-processing and on enabling fast access to very large amounts of data

Other ECMWF member states and co-operating countries are also using, or are just becoming users of, ODB

MeteoFrance, DWD, Hungary, Aladin/HIRLAM nations

MetOffice is considering it via collaboration with BoM

University of Vienna via the ERA40 re-analysis collaboration