Best Practices Writing (Read-only archives of) netCDF (version 3) John Caron Unidata June 28, 2007


Best Practices Writing (Read-only archives of) netCDF (version 3)

John Caron

Unidata

June 28, 2007


Overview

• NetCDF solves file syntax; writers and readers need to agree on semantics
  – Goal 1: intelligible by humans
  – Goal 2: readable by standard tools
• Write a Conventions document
• Types of metadata:
  – Structural metadata: ncdump -h
  – Use metadata: units, coordinates
  – Search metadata: bounding boxes, time ranges, standard variable names, keywords


NetCDF-3 Data Model


Attributes

• Use standard attribute names if possible
  – netCDF Users Guide, CF-1.0
• Use numeric when appropriate
  – :calibration = "23.7"; // string
  – :calibration = 23.7f; // float
• Can be multivalued
  – :special = 23.7f, 10.6f;


Global Attributes

• :Conventions = "NCAR-RAF/nimbus";
  – Put document on the web, send us a link
• Search metadata:
  – bounding boxes, time ranges, keywords
  – NetCDF Attribute Convention for Dataset Discovery
• Many other sources:
  – CF-1.0, FGDC, ISO, Dublin Core


Variable Attributes

– long_name: human-readable plot title
– units: udunits compatible
  • sps vs s-1
  • display_units = "NO3 ppm";
– Missing values:
  • _FillValue: "never written"
  • missing_value
  • valid_min, valid_max, valid_range
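A reader has to combine the missing-value attributes above to decide whether a value is usable. A minimal Python sketch of that decision follows, assuming the variable's attributes have already been read into a plain dictionary. The function name `is_missing` and the dictionary layout are illustrative, not part of any netCDF library, and preferring valid_range over valid_min/valid_max is this sketch's choice.

```python
def is_missing(value, attrs):
    """Return True if `value` should be treated as missing under `attrs`.

    `attrs` is a hypothetical dict of variable attributes, e.g.
    {"_FillValue": -32767.0, "valid_range": (-180.0, 180.0)}.
    """
    # _FillValue marks elements that were never written.
    if "_FillValue" in attrs and value == attrs["_FillValue"]:
        return True

    # missing_value may hold a single value or several.
    mv = attrs.get("missing_value")
    if mv is not None:
        candidates = mv if isinstance(mv, (list, tuple)) else [mv]
        if value in candidates:
            return True

    # Use valid_range when present, otherwise valid_min / valid_max.
    if "valid_range" in attrs:
        lo, hi = attrs["valid_range"]
    else:
        lo = attrs.get("valid_min", float("-inf"))
        hi = attrs.get("valid_max", float("inf"))
    return not (lo <= value <= hi)
```

With the LONC-style attributes `{"_FillValue": -32767.0, "valid_range": (-180.0, 180.0)}`, both the fill value and an out-of-range longitude such as 200.0 are reported as missing.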


Dimensions

• Name: make it meaningful
  – "vector16" vs "bins16", "wind_vector";
  – char date(vector16) vs date(date_strlen)
• Shared dimensions imply shared coordinates

char date(dim16);
float P(time,dim16); // BAD DOG!

versus

char date(date_strlen);
float P(time,bins16);
float T(time,bins16); // GOOD BOY!!


Example Conventions:

• http://www.unidata.ucar.edu/software/netcdf/conventions.html

Debugging Tool:

• http://www.unidata.ucar.edu/software/netcdf-java/v2.2/webstart-dev/index.html


Nimbus: Options (1)

• Data Types
  – Allow any datatype
  – Use scale/offset to save space
• Units
  – Change units to be udunits compatible
  – Add display_units (?) attributes
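The scale/offset idea can be sketched in a few lines of Python. The slide does not name the attributes; this sketch assumes the scale_factor and add_offset attributes from the netCDF Users Guide conventions, where the reader computes stored * scale_factor + add_offset. The function names and the choice of a 16-bit short as the stored type are illustrative.

```python
SHORT_MIN, SHORT_MAX = -32768, 32767  # range of a netCDF-3 short


def pack(value, scale_factor, add_offset):
    """Convert a physical value to the short integer stored in the file."""
    packed = int(round((value - add_offset) / scale_factor))
    if not SHORT_MIN <= packed <= SHORT_MAX:
        raise ValueError("value does not fit in a short with this scale/offset")
    return packed


def unpack(stored, scale_factor, add_offset):
    """Recover the physical value a reader would see."""
    return stored * scale_factor + add_offset
```

Storing a short instead of a float halves the space for that variable, at the cost of a resolution equal to scale_factor.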


Coordinate Variables

dimensions:
  time = 1761;
  lat = 180;
  lon = 360;
  z = 42;
variables:
  int time(time);
    :units = "seconds since 1970-1-1 0:00:00 0:00";
  double lat(lat);
    :units = "degrees_north";
  double lon(lon);
    :units = "degrees_east";
  double z(z);
    :units = "m";
    :positive = "up";
  float data(time,z,lat,lon);
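The udunits-style time units are what let a generic reader turn the integer time values into dates. A minimal Python sketch for the specific units string used here, assuming the trailing 0:00 means a zero offset from UTC; it hard-codes the epoch rather than parsing the units string, and `decode_time` is an illustrative name.

```python
from datetime import datetime, timedelta, timezone

# Epoch taken from the units string "seconds since 1970-1-1 0:00:00 0:00".
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_time(seconds_since_epoch):
    """Convert one value of the time coordinate variable to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds_since_epoch)
```

A general reader would parse the unit and the reference date out of the attribute instead of assuming them.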


Coordinate Variables

• Variable name same as dimension name
• Strictly monotonic values
• No missing values
• Simple case:
  – All coordinates are 1D coordinate variables
  – Data variables have one dimension for each coordinate:
    • data(time,z,lat,lon);
• These rules are correct only for gridded (model) data
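The "strictly monotonic" and "no missing values" rules are easy to check before writing a file. A minimal Python sketch, assuming the coordinate values are available as a plain list; `is_valid_coordinate` and its `fill_value` parameter are illustrative names.

```python
def is_valid_coordinate(values, fill_value=None):
    """Check the coordinate-variable rules on a 1D list of values.

    Returns True only if no element equals `fill_value` and the values
    are strictly increasing or strictly decreasing.
    """
    if not values:
        return False
    if fill_value is not None and any(v == fill_value for v in values):
        return False
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing
```

A repeated value such as `[0, 10, 10]` fails the check, because the sequence is monotonic but not strictly so.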


Stationary Buoy (first attempt)

dimensions:
  time = unlimited;
  lat = 1;
  lon = 1;

float data(time, lat, lon);
int time(time);
double lat(lat);
double lon(lon);

• Only works when lat=1, lon=1 (single buoy per file)


Multiple Buoys per file

float data(buoy,time);

int time(time);

int buoy(buoy);

:long_name = "buoy id";

double lat(buoy);

double lon(buoy);


Aircraft (Trajectory) Coordinates

float data(pt);

int time(pt);

double altitude(pt);

double lat(pt);

double lon(pt);


2D Coordinates

float data(time,z,y,x);

int time(time);

double z(z);

double y(y);

double x(x);

double lat(y,x);

double lon(y,x);


Generalize Coordinate Variable to Coordinate Axis

• Can be multidimensional

• Name can be different from the dimension

• A set of axes for a variable is called a Coordinate System

• How to associate a Coordinate System with a variable?

float data(pt);

data:coordinates = "lat lon altitude time";
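A reader can resolve the coordinates attribute by splitting it on whitespace and looking up each name among the file's variables. A minimal Python sketch, assuming attributes and variables have been loaded into plain dictionaries; `coordinate_axes` and both dictionary layouts are hypothetical stand-ins for whatever a real netCDF library returns.

```python
def coordinate_axes(variable_attrs, variables):
    """Return the coordinate axes named by a variable's coordinates attribute.

    `variable_attrs` is a dict of the data variable's attributes;
    `variables` maps every variable name in the file to its values.
    """
    names = variable_attrs.get("coordinates", "").split()
    unknown = [name for name in names if name not in variables]
    if unknown:
        raise KeyError("coordinates attribute names unknown variables: %s" % unknown)
    # Preserve the order in which the axes are listed.
    return {name: variables[name] for name in names}
```

Failing loudly on an unknown name catches the common writer mistake of renaming a variable without updating the attribute.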


Nimbus Coordinates

:coordinates = "LONC LATC GGALT Time";

float LONC(Time=7741);

:_FillValue = -32767.0f; // float

:units = "degree_E";

:long_name = "GPS-Corrected Inertial Longitude";

:valid_range = -180.0f, 180.0f; // float

:Category = "Position";

:standard_name = "longitude";


Nimbus: Recommend (2)

• Document Coordinates:
  – All variables have the same coordinate system, described by the coordinates global attribute
  – Coordinate variables have a standard_name attribute describing the coordinate type: latitude, longitude, altitude, time
  – Are missing values possible?
  – CF-1.0 units for lat/lon: degrees_east, degrees_north (decimal degrees)


Bin Coordinates

float AS200_RWO(Time=7741, sps1=1, Vector31=31);
  :_FillValue = -32767.0f;
  :units = "count";
  :long_name = "SPP-200 (PCASP) Raw Accumulation (per cell) - DMT";
  :Category = "PMS Probe";
  :missing_value = -32767.0f;
  :SampledRate = 10;
  :DataQuality = "Preliminary";
  :SerialNumber = "PCAS108";
  :FirstBin = 6; // int
  :LastBin = 30; // int
  :CellSizes = 0.05f, 0.065f, 0.08f, 0.095f, 0.11f, 0.125f, 0.14f, 0.155f,
    0.17f, 0.185f, 0.2f, 0.215f, 0.23f, 0.3f, 0.43f, 0.56f, 0.69f, 0.82f,
    0.95f, 1.1f, 1.25f, 1.4f, 1.55f, 1.7f, 1.85f, 2.0f, 2.15f, 2.3f, 2.45f,
    2.6f, 2.75f;
  :CellSizeUnits = "micrometers";


Bin Coordinates (alt)

float AS200_RWO(Time=7741, sps1=1, AS200_RWO_BINS=31);
  :long_name = "SPP-200 (PCASP) Raw Accumulation";
  :_FillValue = -32767.0f;
  :units = "";
  :display_units = "count";
  :coordinates = "LONC LATC GGALT Time AS200_RWO_BINS";

float AS200_RWO_BINS(AS200_RWO_BINS=31);
  :FirstBin = 6;
  :LastBin = 30;

data:
  AS200_RWO_BINS = 0.05f, 0.065f, 0.08f, 0.095f, 0.11f, 0.125f, 0.14f,
    0.155f, 0.17f, 0.185f, 0.2f, 0.215f, 0.23f, 0.3f, 0.43f, 0.56f, 0.69f,
    0.82f, 0.95f, 1.1f, 1.25f, 1.4f, 1.55f, 1.7f, 1.85f, 2.0f, 2.15f, 2.3f,
    2.45f, 2.6f, 2.75f;


Bin Coordinates (alt)

• Advantages:
  – Can be written outside of define mode
  – More likely to be interpreted by standard tools


Station data (same number of pts at each station)

float data(station, time);

int time(time);

double altitude(station);

double lat(station);

double lon(station);


Station data (different number of pts at each station)

dimensions:
  record = unlimited;

char station_name(station, strlen);
double altitude(station);
double lat(station);
double lon(station);
int firstChild(station); // record index
int numChildren(station);

float data1(record);
float data2(record);
float data3(record);
float time(record);
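Reading the observations for one station then amounts to slicing the record variables with firstChild and numChildren. A minimal Python sketch, assuming each station's records are stored contiguously starting at its firstChild index and that the variables have been read into plain lists; `station_obs` is an illustrative name.

```python
def station_obs(station_index, first_child, num_children, record_var):
    """Return the slice of a record variable that belongs to one station.

    `first_child` and `num_children` hold the firstChild(station) and
    numChildren(station) values; `record_var` holds one record variable,
    e.g. data1(record).
    """
    start = first_child[station_index]
    count = num_children[station_index]
    return record_var[start:start + count]
```

Because the slice is contiguous, a reader can fetch all observations for a station with one read per record variable.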


NetCDF-3 file layout

[Figure: NetCDF-3 file layout. Labels: Header; Non-record variables (a series of variables); Record variables (a series of records, each containing several variables); "Obs for one station".]


Unidata Obs Data Conventions

• Different number of groups of observations

• Nested groups

• Linked list, contiguous list

• Additional complexity

• Performance implication

• http://www.unidata.ucar.edu/software/netcdf-java/formats/UnidataObsConvention.html


Conclusions

• NCAR-RAF/nimbus Conventions are quite good

• Unidata is interested in helping out with future revisions, new formats

• Netcdf-4 will offer new options

• Standards are evolving; please help!
  – CF could be the standards umbrella