A PROPOSED EARTH SCIENCE COLLABORATORY

K-S Kuo (1,2), Chris Lynnes (1), Rahul Ramachandran (3)

1 NASA Goddard Space Flight Center, USA
2 Caelum Research Corporation, USA
3 University of Alabama-Huntsville, USA

IGARSS 2011, Vancouver, Canada, 7/27/11



Why ESC?

Data Intensive Science
• Many forms and sources of data
  o In situ measurements
  o Remote sensing observations
  o Model simulations
• Large volumes of data

Effectiveness as a scientist
• Increasing proportion of effort in data management
• Threatening:
  o Reproducibility
  o Correctness
  o Productivity

What is an ESC?

Vision of a rich model development/simulation and data analysis environment that:
• Provides access to various Earth Science models
• Facilitates model and analysis software development
• Provides access across a wide spectrum of Earth Science data
• Provides a diverse set of science analysis services and tools
• Supports the application of services and tools to data
• Supports collaboration, i.e. sharing of data, tools and results
• Supports discovery and publication of all science artifacts

Basically, a new and natural place for Earth scientists to conduct their work and collaborate with others!

The Situation Today

Islands of data and services with selective connectivity

[Diagram: Data Center A, Data Center B, Data Center C]

High-Level View

[Architecture diagram with components: Tool Library, Data Library, Workflow, Laboratory Notebook, Mediator, Data Centers, Cyberinfrastructure]

Tool Library

• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Testing
  o Versioning

Packager
• autoconf
• RPM
• Web wrapper

PROVISIONED
• GrADS
• IDL
• MATLAB
• ncl
• nco
• cdat

COMMUNITY
• Quality filter
• Coincidence
• Feature detection
• Event service
• Visualization

CONTRIBUTED
• [Tool 1]
• [Tool 2]
• [Tool 3]
• [Tool 4]
• [Tool 5]
• …

PERSONAL
• [Tool 1]
• [Tool 2]
• [Tool 3]
• [Tool 4]
• [Tool 5]
• …

Data Library

• Cache
• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Testing
  o Versioning

Packager
• data probe
• format check
• metadata wizard

PROVISIONED
• EOSDIS

COMMUNITY
• Field campaigns
• MEaSUREs
• ACCESS
• Validation

CONTRIBUTED
• [Dataset 1]
• [Dataset 2]
• [Dataset 3]
• [Dataset 4]
• [Dataset 5]
• …

PERSONAL
• [Dataset 1]
• [Dataset 2]
• [Dataset 3]
• …

Workflow Library

• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Testing
  o Versioning

Packager
• Workflow editor

PROVISIONED
• Processing Algorithms

COMMUNITY
• GeoBrain
• SciFlo
• Data Mining
• Giovanni

CONTRIBUTED
• [Workflow 1]
• [Workflow 2]
• [Workflow 3]
• [Workflow 4]
• [Workflow 5]
• …

PERSONAL
• [Workflow 1]
• [Workflow 2]
• [Workflow 3]
• …

Laboratory Notebook

• Discovery
• Social
  o Sharing
  o Tagging
  o Discussion
• Configuration Management
  o Versioning

Packager
• Project Manager
• Experiment manager
• Notebook editor

PROVISIONED
• Tutorials
• User guides
• Example uses
• Educational packages

COMMUNITY
• Project results
• Publications
• Example cases
• Educational packages

PROJECT
• [Project 1]
• [Project 2]
• [Project 3]
• [Project 4]
• [Project 5]
• …

PERSONAL
• Notes
• Journals
• …
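The four libraries share the same shape: entries grouped by tier, with discovery, tagging, and versioning on top. A minimal sketch of that shape follows, assuming a simple in-memory catalog. The class names (`Tier`, `Artifact`, `Library`), the field names, and the owner labels are hypothetical and are not part of the proposal; the tool and dataset names are taken from the slides.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    # Tier names follow the slides; PROJECT appears only in the Laboratory Notebook.
    PROVISIONED = "provisioned"
    COMMUNITY = "community"
    CONTRIBUTED = "contributed"
    PROJECT = "project"
    PERSONAL = "personal"


@dataclass
class Artifact:
    name: str
    kind: str                 # "tool", "dataset", "workflow", or "notebook"
    tier: Tier
    owner: str                # hypothetical owner label
    version: int = 1
    tags: set = field(default_factory=set)


class Library:
    """Hypothetical in-memory catalog covering all four library types."""

    def __init__(self):
        self._items = {}      # (kind, name) -> Artifact

    def add(self, artifact):
        self._items[(artifact.kind, artifact.name)] = artifact

    def tag(self, kind, name, tag):
        # Social tagging: attach a free-form label to an entry.
        self._items[(kind, name)].tags.add(tag)

    def publish_new_version(self, kind, name):
        # Configuration management: bump the version counter.
        item = self._items[(kind, name)]
        item.version += 1
        return item.version

    def find(self, kind=None, tier=None, tag=None):
        # Discovery: every filter that is given must match.
        hits = []
        for item in self._items.values():
            if kind is not None and item.kind != kind:
                continue
            if tier is not None and item.tier != tier:
                continue
            if tag is not None and tag not in item.tags:
                continue
            hits.append(item)
        return sorted(hits, key=lambda a: a.name)


library = Library()
library.add(Artifact("GrADS", "tool", Tier.PROVISIONED, owner="esc"))
library.add(Artifact("Quality filter", "tool", Tier.COMMUNITY, owner="community"))
library.add(Artifact("EOSDIS", "dataset", Tier.PROVISIONED, owner="esc"))
library.tag("tool", "Quality filter", "screening")

tool_names = [a.name for a in library.find(kind="tool")]
provisioned = [a.name for a in library.find(tier=Tier.PROVISIONED)]
```

Keeping one record type for tools, datasets, workflows, and notebooks is a design choice of this sketch: it lets the same discovery and tagging code serve every library.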

Mediator

• Mediates tool interaction with data
• OPeNDAP – a common data model (accessible by most tools)
• Custom modules reformat data for the rest of the tools
• Ontology matches tools with data, and vice versa.
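One way to picture the mediator's job is as a small planning step: check that the dataset offers what the tool needs, use the common access path when the tool supports it, and otherwise look for a reformatting module. The sketch below assumes ontology terms can be modeled as plain strings and that one reformatting hop is enough. `Mediator`, `plan`, `register_converter`, and the example tools and datasets are hypothetical names for illustration only, and nothing here calls a real OPeNDAP library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    protocol: str           # how the data is served, e.g. "opendap"
    variables: frozenset    # ontology terms describing the contents


@dataclass(frozen=True)
class Tool:
    name: str
    accepts: frozenset      # protocols or formats the tool can read
    needs: frozenset        # ontology terms the tool requires as input


class Mediator:
    """Hypothetical mediator: matches tools to data and plans access."""

    def __init__(self):
        self._converters = {}   # (source, target) -> callable

    def register_converter(self, source, target, func):
        self._converters[(source, target)] = func

    def plan(self, tool, dataset):
        """Return the access path for a tool/dataset pair, or None."""
        if not tool.needs <= dataset.variables:
            return None                              # ontology mismatch
        if dataset.protocol in tool.accepts:
            return [dataset.protocol]                # direct access
        for target in sorted(tool.accepts):
            if (dataset.protocol, target) in self._converters:
                return [dataset.protocol, target]    # one reformatting step
        return None

    def tools_for(self, dataset, tools):
        return [t.name for t in tools if self.plan(t, dataset) is not None]

    def datasets_for(self, tool, datasets):
        return [d.name for d in datasets if self.plan(tool, d) is not None]


def to_ascii_table(rows):
    # Stand-in for a custom reformatting module (assumption).
    return "\n".join(" ".join(str(v) for v in row) for row in rows)


rain = Dataset("rain_rate_grid", "opendap", frozenset({"precipitation_rate"}))
plotter = Tool("plotter", frozenset({"opendap"}), frozenset({"precipitation_rate"}))
legacy = Tool("legacy_stats", frozenset({"ascii_table"}), frozenset({"precipitation_rate"}))
sst = Tool("sst_mapper", frozenset({"opendap"}), frozenset({"sea_surface_temperature"}))

mediator = Mediator()
mediator.register_converter("opendap", "ascii_table", to_ascii_table)

usable = mediator.tools_for(rain, [plotter, legacy, sst])
legacy_path = mediator.plan(legacy, rain)
```

In this sketch `plotter` reads the data directly, `legacy_stats` goes through the registered converter, and `sst_mapper` is rejected because the dataset does not carry the variable it needs.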

Cyberinfrastructure

Services used by all other components

Security
• authentication
• authorization
• code audit/padded cell integrity checking

Social
• tagging
• sharing
• discussions
• groups

Cloud
• elastic provisioned storage and computing

Discovery
• data, tools, workflows, experiments
• search by keyword, variable, time, author

Information Management
• provenance
• identifiers
• archive

Semantic Web
• data ontology
• tools ontology
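The Discovery service is described as a search over data, tools, workflows, and experiments by keyword, variable, time, and author. A minimal sketch of such a search follows, assuming each artifact is described by a flat metadata record. The `Record` fields, the `search` function, and all sample records are hypothetical and only illustrate how the four criteria could combine.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    title: str
    kind: str            # "dataset", "tool", "workflow", or "experiment"
    author: str
    variables: tuple     # variable names the artifact covers
    start: date          # first day of the period the artifact covers
    end: date            # last day of that period


def search(records, keyword=None, variable=None, author=None, when=None):
    """Return records matching every criterion that is supplied."""
    hits = []
    for rec in records:
        if keyword is not None and keyword.lower() not in rec.title.lower():
            continue
        if variable is not None and variable not in rec.variables:
            continue
        if author is not None and author != rec.author:
            continue
        if when is not None and not (rec.start <= when <= rec.end):
            continue
        hits.append(rec)
    return hits


# Hypothetical sample records.
records = [
    Record("Rain rate retrievals", "dataset", "author_a",
           ("precipitation_rate",), date(2010, 1, 1), date(2010, 12, 31)),
    Record("Rain rate validation workflow", "workflow", "author_b",
           ("precipitation_rate",), date(2011, 3, 1), date(2011, 6, 30)),
    Record("Cloud mask experiment", "experiment", "author_a",
           ("cloud_fraction",), date(2011, 1, 1), date(2011, 1, 31)),
]

by_keyword = [r.title for r in search(records, keyword="rain")]
by_variable_and_time = [
    r.title
    for r in search(records, variable="precipitation_rate", when=date(2011, 4, 15))
]
by_author = [r.kind for r in search(records, author="author_a")]
```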

Key Advantages of ESC

• Tool availability will be a force multiplier
  o More tools will be usable with more datasets
  o More tools will be more available to more users
• Knowledge sharing evolves from text on paper to a rich mixture of data, tools, workflows and articles
  o A “wikihow” for Earth Science data analysis
  o Incorporating live data, services and workflows
• ESC maintains a record of the analysis process
  o Share, repeat, build upon analysis techniques
  o Transparency of the process is built in
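The idea of a built-in record of the analysis process can be illustrated with a small log that captures each step's tool, inputs, and output, and can later replay them. This is only a sketch under simple assumptions: steps are plain Python functions with JSON-serializable inputs and outputs, and `AnalysisRecord`, `scale`, and `mean` are hypothetical names that do not come from the proposal.

```python
import hashlib
import json


class AnalysisRecord:
    """Hypothetical provenance log for a sequence of analysis steps."""

    def __init__(self):
        self.steps = []

    def run(self, func, *args):
        # Execute the step and remember what was run, on what, with what result.
        result = func(*args)
        self.steps.append(
            {"tool": func.__name__, "inputs": list(args), "output": result}
        )
        return result

    def fingerprint(self):
        # A stable identifier for the whole recorded analysis.
        payload = json.dumps(self.steps, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def replay(self, tools):
        # Re-run every step with the named tools and compare the outputs.
        return all(
            tools[step["tool"]](*step["inputs"]) == step["output"]
            for step in self.steps
        )


def scale(values, factor):
    return [v * factor for v in values]


def mean(values):
    return sum(values) / len(values)


record = AnalysisRecord()
scaled = record.run(scale, [1.0, 2.0, 3.0], 2.0)
average = record.run(mean, scaled)

repeatable = record.replay({"scale": scale, "mean": mean})
identifier = record.fingerprint()
```

A colleague who receives the recorded steps and the same tools can call `replay` to confirm the result, which is the "share, repeat, build upon" pattern in miniature.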

Prior Art

• Talkoot, myExperiment.org – workflow sharing, virtual notebooks
• Earth System Grid – provisioned tools, format standards/checkers
• NASA Earth Exchange (NEX)
• Land Information System – OPeNDAP as access infrastructure
• Earth System Modeling Framework – programmatic approach to integration
• Giovanni, LAS – community services/tools
• Canadian Space Science Data Portal (EOS, Feb. 22, 2011)
• Nebula – cloud provisioning

A Use Case
GPM Precipitation Retrieval Algorithm Development

• GPM Core Satellite: Dual-Frequency Precipitation Radar (JAXA) and GPM Microwave Imager (NASA)
• GPM Constellation: International partner satellites with mostly microwave radiometers
• Retrieval algorithms – 3 types
  o Radar-only
  o Radiometer-only
  o Radar-radiometer-combined
• Participants in algorithm development are distributed across Japan, NASA centers (GSFC, MSFC, JPL), NCAR, and universities (FSU, UWisc, etc.)

A Use Case
GPM Algorithm Development – Current Situation

• Interdependence among 3 types of algorithms
• Communication/Coordination – Narrow bandwidth
  o Periodic workshop meetings and teleconferences
• Data access – Duplicative
  o Each location/group has a copy or subset of required data
• Sharing of data/tools – Individual, not concerted
  o through ftp/email
• Knowledge sharing – Delayed

A Use Case
GPM Algorithm Development – with ESC

[Diagram: ESC Clients A through Z, each with Tools, Data, and "mySci Cat."; a Cloud holding VM Images A and B, each with Tools and Data; a Community Catalog; ESC]

A Use Case
GPM Algorithm Development – Multi-level Membership

[Diagram: members A through M arranged in groups labeled GPM, Radar-Only, Radiometer-Only, Combined, and Algorithm]

A Use Case
GPM Algorithm Development – in ESC

• Enhanced communication/coordination – wide bandwidth
• Efficient data access – less duplication
• Improved sharing – more pervasive
• Effective knowledge sharing – immediate

Thank you!

Why now?

• Because we can do it (finally)!
  o Advances in standards acceptance and implementation (OPeNDAP, autoconf)
  o A consistent, loosely coupled architecture encapsulates complexity and maximizes flexibility
  o Social networking has reached the mainstream
  o Key lessons can be learned from prior efforts
• The need is growing
  o Interest in working with multiple datasets is growing
  o Calls for transparency and reproducibility are growing

What’s New?

• Macro View (forest-level)
  o Systematic approach to making data available to services and vice versa
  o Integration of all major analysis components
  o Consistent view of all architectural components
  o Cyberinfrastructure services for all architectural components
• Micro View (tree-level): Nothing!

How to move forward?

• Option 1
  o RFC to community on feasibility, challenges, approach
  o Followed by RFPs for component and integration
• Option 2
  o Narrow end-to-end prototype
  o Followed by refactoring and broadening