WEB SERVICES FOR UNIFIED ACCESS TO NATIONAL HYDROLOGIC DATA REPOSITORIES AND REAL TIME OBSERVATION DATA: CUAHSI HIS EXPERIENCE

Ilya Zaslavsky
San Diego Supercomputer Center, UCSD

CUAHSI = Consortium of Universities for the Advancement of Hydrologic Sciences, Inc.; HIS = Hydrologic Information System

Collaborative Project: UT Austin + SDSC + Drexel + Duke + Utah State

www.cuahsi.org/his/




SAN DIEGO SUPERCOMPUTER CENTER, UCSDSciR&D

SDSC Spatial Information Systems Lab

Research and system development
• Services-based spatial information integration infrastructure
• Mediation services for spatial data, query processing, map assembly services
• Long-term spatial data preservation
• Spatial data standards and technologies for online mapping (SVG, WMS/WFS)
• Support of spatial data projects at SDSC and beyond

[Diagram: grid services for map integration (Mediator, Legend Generator, Map Assembler, Ontology) drawing on spatial web services (WSDL/WS) from federal agencies, ESRI (Figure 1.26, The Geography Network), county spatial data and toxicant information, Telesis and other local non-profits, and CA state]

Where the services are used:
• In geosciences (GEON, CUAHSI, CBEO, …)
• In regional development (NIEHS SBRP, Katrina)
• In neurosciences (BIRN, CCDB)
• Student projects
• The CHI ME Model


The Grid is becoming the backbone for collaborative science and data sharing

CI is about RE-USING data and research resources !!


Cyberinfrastructure for hydrology

• Hydrologic observations:
  • Reliance on federally-organized data collection (NWIS, STORET, NCDC, etc.) with huge and complex nomenclatures
    • Simplifying access to federal repositories
    • Relatively lower emphasis on data ownership
  • Handling time in both UTC and local
  • Various spatial offsets
  • Multiple data types: time series, fields, spatial data
• Integrative discipline:
  • Interoperation with atmospheric, ocean, soils, geomorphology, social datasets and services…
• Community:
  • Organized by “natural boundaries”: networks of relatively autonomous self-managed data nodes
  • Partnership with public sector water management
  • 96% use Windows for research; Excel, ArcGIS, Matlab are the most popular tools


The CUAHSI Community, HIS and WATERS

• CUAHSI: 116 Universities (Nov. 2006)
• HIS Team: Texas, SDSC, Utah, Drexel, Duke
• Supercomputer Centers: NCSA, TACC
• Domain Sciences: Unidata, NCAR, LTER, GEON
• Government: USGS, EPA, NCDC, USDA
• Industry: ESRI, Kisters, OpenMI
• Systems shown: CUAHSI HIS, WATERS Testbed, WATERS Network Information System


Hydrologic Information System Service Oriented Architecture

[Architecture diagram]
• WaterOneFlow Web Services: data access through web services (downloads) and data storage through web services (uploads)
• Servers: observatory servers, workgroup HIS, SDSC HIS servers, 3rd party servers (e.g. USGS, NCDC)
• Web services interface for client applications: GIS, Matlab, IDL, Splus, R, D2K, I2K, programming (Fortran, C, VB)
• Web portal interface (HDAS): information input, display, query and output services; preliminary data exploration and discovery (see what is available and perform exploratory analyses)
• Protocols: HTML - XML, WSDL - SOAP


Main Components
• Web services for accessing hydrologic repositories
• Hydrologic Observations Data Model
• Hydrologic Data Access System + Time Series Viewer + desktop clients
• Collection of CUAHSI nodes

[Diagram: CUAHSI Web Services connecting data sources (NWIS, NCAR, Unidata, NASA, Storet, NCDC, Ameriflux) with client applications (ArcGIS, Excel, Matlab, Access, SAS, Fortran, Visual Basic, C/C++)]


Point Observations Information Model

Data Source: USGS
  Network: Streamflow gages
    Sites: Neuse River near Clayton, NC
      Observation Series: Discharge, stage, start, end (Daily or instantaneous)
        Values {Value, Time, Qualifier}: 206 cfs, 13 August 2006

• A data source operates an observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• An observation series is an array of observations at a given site, for a given variable, with start time and end time
• A value is an observation of a variable at a particular time
• A qualifier is a symbol that provides additional information about the value
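The hierarchy above can be sketched as a handful of nested record types. This is only an illustrative sketch, not the ODM schema itself: the class and field names are hypothetical, and keeping `units` on the series is an assumption made so the "206 cfs" example from the slide can be represented.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Value:
    """One observation of a variable at a particular time."""
    value: float
    time: datetime
    qualifier: Optional[str] = None  # symbol with extra information, if any


@dataclass
class ObservationSeries:
    """Observations at one site for one variable."""
    variable: str
    units: str  # assumption: units stored on the series for illustration
    values: List[Value] = field(default_factory=list)

    def start(self) -> Optional[datetime]:
        # Earliest observation time, or None for an empty series.
        return min((v.time for v in self.values), default=None)

    def end(self) -> Optional[datetime]:
        # Latest observation time, or None for an empty series.
        return max((v.time for v in self.values), default=None)


@dataclass
class Site:
    name: str
    series: List[ObservationSeries] = field(default_factory=list)


@dataclass
class Network:
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    name: str
    networks: List[Network] = field(default_factory=list)


# The example from the slide: USGS -> streamflow gages -> Neuse River -> discharge.
discharge = ObservationSeries(
    "Discharge", "cfs", [Value(206.0, datetime(2006, 8, 13))]
)
usgs = DataSource(
    "USGS",
    [Network("Streamflow gages", [Site("Neuse River near Clayton, NC", [discharge])])],
)
```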


Observations Data Model Schema (version 4.0)

[Schema diagram with table groups: Data Source and Network, Sites, Variables, Values, Metadata, Controlled Vocabulary Tables]

Example variables: Depth of snow pack; Streamflow; Landuse, Vegetation; Windspeed, Precipitation
Controlled Vocabulary Tables: e.g. mg/kg, cfs; e.g. depth; e.g. Non-detect, Estimated

• A data source operates an observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• A value is an observation of a variable at a particular time
• Metadata provide information about the context of the observation

From Ernest To, David Maidment, CRWR


Water Data Web Sites


NWISWeb site output

# agency_cd Agency Code
# site_no USGS station number
# dv_dt date of daily mean streamflow
# dv_va daily mean streamflow value, in cubic-feet per-second
# dv_cd daily mean streamflow value qualification code
#
# Sites in this file include:
# USGS 02087500 NEUSE RIVER NEAR CLAYTON, NC
#
agency_cd site_no dv_dt dv_va dv_cd
USGS 02087500 2003-09-01 1190
USGS 02087500 2003-09-02 649
USGS 02087500 2003-09-03 525
USGS 02087500 2003-09-04 486
USGS 02087500 2003-09-05 733
USGS 02087500 2003-09-06 585
USGS 02087500 2003-09-07 485
USGS 02087500 2003-09-08 463
USGS 02087500 2003-09-09 673
USGS 02087500 2003-09-10 517
USGS 02087500 2003-09-11 454

Time series of streamflow at a gaging station
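Output like this has to be parsed by each client before it can be used, which is part of the motivation for a uniform service layer. A minimal sketch of such a parser follows; it assumes the layout exactly as shown on the slide (comment lines starting with `#`, one header row, whitespace-separated fields, and an optional trailing qualification code), and the function name is hypothetical. Real agency files may differ in detail.

```python
from datetime import date

SAMPLE = """\
# agency_cd Agency Code
# site_no USGS station number
# dv_dt date of daily mean streamflow
# dv_va daily mean streamflow value, in cubic-feet per-second
# dv_cd daily mean streamflow value qualification code
#
agency_cd site_no dv_dt dv_va dv_cd
USGS 02087500 2003-09-01 1190
USGS 02087500 2003-09-02 649
USGS 02087500 2003-09-03 525
"""


def parse_daily_values(text):
    """Turn the text listing into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the descriptive comment block
        fields = line.split()
        if header is None:
            header = fields  # first non-comment line names the columns
            continue
        # The qualification code is often absent; pad so every column has a value.
        fields += [""] * (len(header) - len(fields))
        record = dict(zip(header, fields))
        record["dv_dt"] = date.fromisoformat(record["dv_dt"])
        record["dv_va"] = float(record["dv_va"])
        rows.append(record)
    return rows


rows = parse_daily_values(SAMPLE)
```

Note that the station number stays a string, so the leading zero in `02087500` is preserved.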


Challenges… (1/2)

• Sites
  • STORET has stations, and measurement points, at various offsets…
  • Site metadata lacking and inconsistent (e.g. 2/3 no HUC info, 1/3 no state/county info); agency site files need to be upgraded to ODM…
  • A groundwater site is different from a stream gauge…
• Censored values
  • Values have qualifiers, such as “less than”, “censored”, etc., per value. Sometimes mixed data types.
• Units
  • There are multiple renditions of the same units, even within one repository
  • There may be several units for the same parameter code (STORET)
  • If no value is recorded, are there no units??
  • Unit multipliers
    • E.g. NCDC ASOS keeps measurements as integers, and provides a multiplier for each variable
• Sources
  • STORET requires organization IDs (which collected data for STORET) in addition to site IDs
• Time stamps: ISO 8601
  • Data types problem (conversion to PST???)
  • A service to determine UTC offsets given lat/lon and date??
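Three of the issues above (unit multipliers, censored values, and local versus UTC time stamps) can be illustrated with small normalization helpers. This is a hedged sketch only: the function names are hypothetical, the `<0.05` text encoding of a "less than" value is an assumed convention rather than any repository's actual format, and the UTC offset is passed in by the caller because the lookup service mentioned on the slide is still an open question.

```python
from datetime import datetime, timedelta, timezone


def apply_multiplier(stored: int, multiplier: float) -> float:
    """Recover a physical value from an integer stored with a per-variable multiplier."""
    return stored * multiplier


def split_censored(text: str):
    """Split a raw value into (number, qualifier).

    Assumption: a leading '<' marks a "less than" (censored) value.
    """
    text = text.strip()
    if text.startswith("<"):
        return float(text[1:]), "less than"
    return float(text), None


def to_utc(local_iso: str, utc_offset_hours: float) -> datetime:
    """Attach a known UTC offset to a local ISO 8601 time stamp and convert to UTC."""
    local = datetime.fromisoformat(local_iso)
    zone = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=zone).astimezone(timezone.utc)


# Noon at UTC-8 corresponds to 20:00 UTC on the same day.
noon_pst_in_utc = to_utc("2006-08-13T12:00:00", -8)
```

Keeping the qualifier separate from the number avoids the mixed data types noted above: the value column stays numeric and the qualifier travels alongside it.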


Challenges… (2/2)

• Values retrieval
  • USGS: by site, variable, time range
  • EPA: by organization-site, variable, medium, units, time range
  • NCDC: fewer variables, period of record applies to site, not to seriesCatalog
• Variable semantics
  • Variable names and measurement methods don’t match
    • E.g. NWIS parameter # 625 is labeled ‘ammonia + organic nitrogen’; the Kjeldahl method is used for determination but is not mentioned in the parameter description. In STORET this parameter is referred to as Kjeldahl Nitrogen.
  • One-to-one mapping not always possible
    • E.g. NWIS: ‘bed sediment’ and ‘suspended sediment’ medium types vs. STORET’s ‘sediment’.

Ontology tagging, semantic mediation
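One simple way to picture ontology tagging is a lookup table that tags each source-specific term with a shared concept. The sketch below is an assumption-laden illustration, not the HIS ontology: the concept identifiers (`nitrogen_kjeldahl`, `sediment`) and function names are hypothetical, and only the two examples from the slide are encoded.

```python
# Hypothetical tags: (source, source term) -> shared concept identifier.
VARIABLE_CONCEPTS = {
    ("NWIS", "625"): "nitrogen_kjeldahl",  # labeled 'ammonia + organic nitrogen' in NWIS
    ("STORET", "Kjeldahl Nitrogen"): "nitrogen_kjeldahl",
}

MEDIUM_CONCEPTS = {
    ("NWIS", "bed sediment"): "sediment",
    ("NWIS", "suspended sediment"): "sediment",
    ("STORET", "sediment"): "sediment",
}


def same_concept(table, term_a, term_b):
    """True when both (source, term) pairs are tagged with the same concept."""
    concept_a = table.get(term_a)
    return concept_a is not None and concept_a == table.get(term_b)


def source_terms(table, concept, source):
    """All terms a given source uses for a concept; may be more than one."""
    return sorted(
        term for (src, term), tagged in table.items()
        if src == source and tagged == concept
    )


# A mediated query for 'sediment' must fan out to two NWIS medium types.
nwis_sediment_terms = source_terms(MEDIUM_CONCEPTS, "sediment", "NWIS")
```

The many-to-one direction is easy; the reverse lookup returning two NWIS terms shows why a one-to-one mapping is not always possible.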


- From different database structures, data collection procedures, quality control, access mechanisms to uniform signatures … Water Markup Language
- Tested in different environments
- Standards-based
- Can support advanced interfaces via harvested catalogs
- Accessible to community
- Templates for development of new services
- Optimized, error handling, memory management, versioning, run from fast servers

And: working with agencies on setting up services!


WaterOneFlow API

• GetValues: returns a TimeSeries
• GetSiteInfo: station information, including a period of record
• GetVariableInfo: returns variable/parameter information

- Developed to have a low barrier to entry
- Terminology same as Observations Database
- Reuse of common elements
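To make the shape of the three calls concrete, here is a small in-memory stand-in. It is a sketch under stated assumptions, not the WaterOneFlow implementation: the class name, Python method names, argument order, and dictionary return types are all hypothetical, and the real services exchange SOAP/WaterML messages rather than Python objects. The sample data are taken from the response examples elsewhere in this deck.

```python
from datetime import date
from typing import Dict, List, Tuple


class InMemoryWaterService:
    """Hypothetical stand-in exposing the three calls over in-memory data."""

    def __init__(self, sites: Dict[str, dict], variables: Dict[str, dict],
                 values: Dict[Tuple[str, str], List[Tuple[date, float]]]):
        self._sites = sites
        self._variables = variables
        self._values = values

    def get_site_info(self, location: str) -> dict:
        # Station information for a 'network:code' location.
        return self._sites[location]

    def get_variable_info(self, variable: str) -> dict:
        # Variable/parameter information for a 'vocabulary:code' variable.
        return self._variables[variable]

    def get_values(self, location: str, variable: str,
                   begin: date, end: date) -> List[Tuple[date, float]]:
        # Time series restricted to the inclusive [begin, end] range.
        series = self._values.get((location, variable), [])
        return [(day, value) for day, value in series if begin <= day <= end]


service = InMemoryWaterService(
    sites={"NWIS:10263500": {"siteName": "BIG ROCK C NR VALYERMO CA"}},
    variables={"NWIS:00060": {"variableName": "Discharge, cubic feet per second",
                              "units": "cfs"}},
    values={("NWIS:10263500", "NWIS:00060"): [
        (date(2005, 8, 1), 25.0),
        (date(2005, 8, 2), 26.0),
        (date(2005, 8, 3), 24.0),
    ]},
)
```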


GetVariableInfo

• Input: Vocabulary:VariableCode
• Output: VariablesResponse

<variablesResponse xmlns:gml="http://www.opengis.net/gml" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:wtr="http://www.cuahsi.org/waterML/" xmlns="http://www.cuahsi.org/waterML/1.0/">
  <variables>
    <variable>
      <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
      <variableName>Discharge, cubic feet per second</variableName>
      <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
    </variable>
  </variables>
</variablesResponse>
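A client can read this response with any namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name and the `wml` prefix are arbitrary choices for illustration, and the embedded document is a trimmed copy of the response shown above.

```python
import xml.etree.ElementTree as ET

# Prefix chosen locally; it maps to the default namespace of the response.
NS = {"wml": "http://www.cuahsi.org/waterML/1.0/"}

VARIABLES_XML = """\
<variablesResponse xmlns="http://www.cuahsi.org/waterML/1.0/">
  <variables>
    <variable>
      <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
      <variableName>Discharge, cubic feet per second</variableName>
      <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
    </variable>
  </variables>
</variablesResponse>
"""


def read_variables(xml_text):
    """Return one dict per <variable> element in a variablesResponse."""
    root = ET.fromstring(xml_text)
    found = []
    for var in root.findall("wml:variables/wml:variable", NS):
        code = var.find("wml:variableCode", NS)
        units = var.find("wml:units", NS)
        found.append({
            "vocabulary": code.get("vocabulary"),
            "code": code.text,  # kept as text so leading zeros survive
            "name": var.findtext("wml:variableName", namespaces=NS),
            "units": units.get("unitsAbbreviation"),
        })
    return found


variables = read_variables(VARIABLES_XML)
```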


GetSiteInfo

<sitesResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.cuahsi.org/waterML/1.0/">
  <site>
    <siteInfo>
      <siteName>BIG ROCK C NR VALYERMO CA</siteName>
      <siteCode network="NWIS" siteID="4622637">10263500</siteCode>
      <geoLocation>
        <geogLocation xsi:type="LatLonPointType" srs="EPSG:4269">
          <latitude>34.42083115</latitude>
          <longitude>-117.8395072</longitude>
        </geogLocation>
      </geoLocation>
    </siteInfo>
    <seriesCatalog menuGroupName="USGS Daily Values" serviceWsdl="http://localhost/WaterOneFlowDev/DailyValues.asmx">
      <note type="sourceUrl">http://waterdata.usgs.gov/nwis/dv?referred_module=sw&amp;format=rdb&amp;date_format=YYYY-MM-DD&amp;begin_date=2006-11-17&amp;site_no=10263500</note>
      <series><!-- [snip] --></series>
    </seriesCatalog>
    <seriesCatalog menuGroupName="USGS Unit Values" serviceWsdl="http://localhost/WaterOneFlowDev/UnitValues.asmx">
      <!-- [snip] -->
    </seriesCatalog>
  </site>
</sitesResponse>

A <series> element in expanded form:

<series>
  <variable>
    <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
    <variableName>Discharge, cubic feet per second</variableName>
    <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
  </variable>
  <valueCount countIsEstimated="true">30563</valueCount>
  <variableTimeInterval xsi:type="TimeIntervalType">
    <beginDateTime>1923-02-01T00:00:00</beginDateTime>
    <endDateTime>2006-10-07T00:00:00</endDateTime>
  </variableTimeInterval>
</series>
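The period of record carried in each series entry is what lets a client decide which time range to request. A minimal sketch of pulling it out with the standard library follows; the function name is hypothetical, and the embedded fragment is a trimmed version of the GetSiteInfo response with the namespace declarations it needs to parse on its own.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"wml": "http://www.cuahsi.org/waterML/1.0/"}

CATALOG_XML = """\
<seriesCatalog xmlns="http://www.cuahsi.org/waterML/1.0/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               menuGroupName="USGS Daily Values">
  <series>
    <variable>
      <variableCode vocabulary="NWIS">00060</variableCode>
    </variable>
    <valueCount countIsEstimated="true">30563</valueCount>
    <variableTimeInterval xsi:type="TimeIntervalType">
      <beginDateTime>1923-02-01T00:00:00</beginDateTime>
      <endDateTime>2006-10-07T00:00:00</endDateTime>
    </variableTimeInterval>
  </series>
</seriesCatalog>
"""


def periods_of_record(xml_text):
    """Return variable code, value count and time interval for each series."""
    root = ET.fromstring(xml_text)
    periods = []
    for series in root.findall("wml:series", NS):
        count = series.find("wml:valueCount", NS)
        interval = series.find("wml:variableTimeInterval", NS)
        periods.append({
            "code": series.findtext("wml:variable/wml:variableCode", namespaces=NS),
            "count": int(count.text),
            "count_is_estimated": count.get("countIsEstimated") == "true",
            "begin": datetime.fromisoformat(
                interval.findtext("wml:beginDateTime", namespaces=NS)),
            "end": datetime.fromisoformat(
                interval.findtext("wml:endDateTime", namespaces=NS)),
        })
    return periods


periods = periods_of_record(CATALOG_XML)
```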


GetValues

• NWIS, STORET, etc.
  • Location: NWIS:10263500
  • Variable: NWIS:00060
  • Time Range: 2005-08-01 to 2005-08-03
• MODIS, etc.
  • Location: GEOM:BOX(-180 -90,180 90)
  • Variable: MODIS:11/plotarea=landocean
  • Time Range: 2000-10-01 to 2001-03-01
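Both request styles share a `prefix:rest` pattern for locations and variables. The sketch below shows one way a client or service might split such parameters; the function names are hypothetical, and the reading of `BOX(minx miny,maxx maxy)` as two corner points is an assumption based on the example above rather than a documented grammar.

```python
def split_param(param: str):
    """Split 'PREFIX:rest' at the first colon; the rest may contain more colons."""
    prefix, sep, rest = param.partition(":")
    if not sep or not prefix or not rest:
        raise ValueError(f"expected 'prefix:value', got {param!r}")
    return prefix, rest


def parse_box(geometry: str):
    """Parse 'BOX(minx miny,maxx maxy)' into a tuple of four floats."""
    if not (geometry.startswith("BOX(") and geometry.endswith(")")):
        raise ValueError(f"not a BOX geometry: {geometry!r}")
    lower, upper = geometry[len("BOX("):-1].split(",")
    min_x, min_y = (float(part) for part in lower.split())
    max_x, max_y = (float(part) for part in upper.split())
    return min_x, min_y, max_x, max_y


site = split_param("NWIS:10263500")
variable = split_param("MODIS:11/plotarea=landocean")
whole_globe = parse_box(split_param("GEOM:BOX(-180 -90,180 90)")[1])
```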


<timeSeriesResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.cuahsi.org/waterML/1.0/">
  <queryInfo>
    <queryURL>http://waterdata.usgs.gov/nwis/dv?&amp;site_no=10263500&amp;</queryURL>
    <criteria>
      <locationParam>NWIS:10263500</locationParam>
      <variableParam>nwis:00060</variableParam>
      <timeParam>
        <beginDateTime>2005-08-01</beginDateTime>
        <endDateTime>2005-08-03</endDateTime>
      </timeParam>
    </criteria>
  </queryInfo>
  <timeSeries>
    <sourceInfo xsi:type="SiteInfoType">
      <siteName>BIG ROCK C NR VALYERMO CA</siteName>
      <siteCode siteID="4622637">10263500</siteCode>
      <geoLocation>
        <geogLocation xsi:type="LatLonPointType" srs="EPSG:4269">
          <latitude>34.42083115</latitude>
          <longitude>-117.8395072</longitude>
        </geogLocation>
      </geoLocation>
    </sourceInfo>
    <variable>
      <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
      <variableName>Discharge, cubic feet per second</variableName>
      <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
    </variable>
    <values count="3">
      <value qualifiers="Ae" dateTime="2005-08-01T00:00:00">25</value>
      <value qualifiers="Ae" dateTime="2005-08-02T00:00:00">26</value>
      <value qualifiers="Ae" dateTime="2005-08-03T00:00:00">24</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
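Reading the values out of a GetValues response follows the same pattern as the other calls. The sketch below is illustrative only: the function name is hypothetical, and the embedded document keeps just the `<values>` part of the response shown above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"wml": "http://www.cuahsi.org/waterML/1.0/"}

TIME_SERIES_XML = """\
<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.0/">
  <timeSeries>
    <values count="3">
      <value qualifiers="Ae" dateTime="2005-08-01T00:00:00">25</value>
      <value qualifiers="Ae" dateTime="2005-08-02T00:00:00">26</value>
      <value qualifiers="Ae" dateTime="2005-08-03T00:00:00">24</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
"""


def read_values(xml_text):
    """Return (time, value, qualifiers) tuples from a timeSeriesResponse."""
    root = ET.fromstring(xml_text)
    observations = []
    for node in root.findall("wml:timeSeries/wml:values/wml:value", NS):
        observations.append((
            datetime.fromisoformat(node.get("dateTime")),
            float(node.text),
            node.get("qualifiers"),  # kept per value, as in the response
        ))
    return observations


observations = read_values(TIME_SERIES_XML)
mean_discharge = sum(value for _, value, _ in observations) / len(observations)
```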


Hydrologic Data Access System



HIS nodes: cross-platform design

[Deployment diagram]
• Central CUAHSI HIS Node (Windows): IIS Web Server, ASP.NET, SQL Server, ArcGIS technologies; HDAS, HODM, web services, web service proxies; data
• GEON Data Node (Linux): Apache Tomcat, GEON software stack, proxy; data
• Remote CUAHSI HIS Nodes (Windows): IIS Web Server, ASP.NET, SQL Server, ArcGIS technologies; HDAS, HODM, web services, web service proxies; data
• Annotations: application services, handling of spatial data types, etc.; security management, distributed data management, integration with other CI projects


Resource registration
• Shapefiles
• TIFF images, GMT rasters
• Web Services, WMS services
• Relational databases, ASCII
• PDFs, URLs
• “CUAHSI data”
• NetCDF
• Coming: Geodatabases and ODM


Possible Connections

• Review of ODM
  • Dealing with observations/measurements rather than with sensor data?
• Review of WaterOneFlow services schema
  • Aligning WaterOneFlow output schemas with GML/SensorML
  • Carrying WaterOneFlow requests/responses over WFS
• Long term preservation of observation data
• Water Data Interoperability Testbed?


Survey of Observing Systems

• NEON: http://www.neoninc.org
• ORION: http://www.orionprogram.org/
• WATERS
  • CUAHSI: http://www.cuahsi.org, http://river.sdsc.edu/hdas
  • CLEANER: http://cleaner.ncsa.uiuc.edu/home/
• GLEON: http://www.gleon.org/, http://lakemetabolism.org/
• CREON: http://www.coralreefeon.org/
• MoveBank: http://www.princeton.edu/~wikelski/research/index.htm
• Civil Infrastructure: http://healthmonitoring.ucsd.edu/index.jsp
• IRIS/USArray: http://www.iris.edu/USArray/