DataONE: A Distributed Earth Science Data Network Supporting the Full Data Life Cycle
Robert Cook
Oak Ridge National Laboratory
W. Michener, D. Vieglais, A. Budden, and R. Koskela
University of New Mexico
Agenda
• Introduction
• Approach
• Cyberinfrastructure
• Community Engagement
• DataONE and Data Life Cycle
Decreasing Spatial Coverage
Increasing Process Knowledge
Adapted from CENR-OSTP
Remote sensing
Intensive science sites and experiments
Extensive science sites
Volunteer & education networks
“Building the Knowledge Pyramid” 80:20 20:80
Objectives: Solve Data Challenges
DataONE Vision and Approach
Enable new science and knowledge creation through universal access to data about life on earth and the environment that sustains it.
1. Build on existing cyberinfrastructure
2. Create new cyberinfrastructure
3. Support communities of practice
Three major components for a flexible, scalable, sustainable network
Member Nodes
• diverse institutions
• serve local community
• provide resources for managing their data
• retain copies of data
Coordinating Nodes
• retain complete metadata catalog
• indexing for search
• network-wide services
• ensure content availability (preservation)
• replication services
Investigator Toolkit
Cyberinfrastructure
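The division of labor between Member Nodes and Coordinating Nodes described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not DataONE's actual software or API: the class names, method names, metadata fields, and the `example:obj-1` identifier are all hypothetical. It shows a member node holding data copies while a coordinating node keeps the metadata catalog, answers searches, and tracks replicas.

```python
from dataclasses import dataclass, field


@dataclass
class MemberNode:
    """Hypothetical member node: retains copies of data for its community."""
    name: str
    objects: dict = field(default_factory=dict)  # identifier -> raw bytes

    def store(self, identifier, data):
        self.objects[identifier] = data


@dataclass
class CoordinatingNode:
    """Hypothetical coordinating node: metadata catalog, search, replication."""
    catalog: dict = field(default_factory=dict)   # identifier -> metadata dict
    replicas: dict = field(default_factory=dict)  # identifier -> set of node names

    def register(self, node, identifier, data, metadata):
        # The data stays on the member node; only metadata is cataloged here.
        node.store(identifier, data)
        self.catalog[identifier] = metadata
        self.replicas.setdefault(identifier, set()).add(node.name)

    def replicate(self, identifier, source, target):
        # Copy the object to a second member node and record the new replica.
        target.store(identifier, source.objects[identifier])
        self.replicas[identifier].add(target.name)

    def search(self, keyword):
        # Case-insensitive match against the (assumed) "title" metadata field.
        kw = keyword.lower()
        return sorted(
            ident for ident, meta in self.catalog.items()
            if kw in meta.get("title", "").lower()
        )


# Usage: node labels are taken from the member-node list; everything else is made up.
knb = MemberNode("KNB")
dryad = MemberNode("Dryad")
cn = CoordinatingNode()
cn.register(knb, "example:obj-1", b"site,count\nA,3\n", {"title": "Bird observations"})
cn.replicate("example:obj-1", knb, dryad)
print(cn.search("bird"), sorted(cn.replicas["example:obj-1"]))
```

In this toy version, losing one member node does not lose the object, because the coordinating node knows which other node holds a replica. A real deployment would add authentication, checksums, and persistent storage, none of which are modeled here.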
Operational core infrastructure
• Three coordinating nodes: ORC, UCSB, UNM
• Seven member nodes: KNB, SANParks, Dryad, ORNL DAAC, Merritt, USGS, Avian Knowledge Network
• Essential investigator toolkit components:
  • Search interface (ONE Mercury)
  • ONE-R Plugin
  • Developer tools in Python and Java
• Design and component documentation
July 2012 Cyberinfrastructure Release
Community Involvement through Working Groups
Engagement Research
• Community Engagement and Education
• Sociocultural Barriers to Data Sharing / Preservation
• Public Participation in Science and Research
• Sustainability and Governance
Infrastructure Research
• Federated Security
• Data Integration and Semantics
• Data Preservation and Metadata
• Distributed Storage
• Scientific Workflows and Provenance
• Exploration, Visualization, and Analysis
• Usability and Assessment
Working Groups Explore the Entire Life Cycle
Working Groups and the Data Life Cycle
Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze
Public Participation in Science and Research
Data Integration and Semantics
Scientific Workflows and Provenance
Distributed Storage
Exploration, Visualization, and Analysis
Sociocultural Barriers to Data Sharing / Preservation
Data Preservation and Metadata
Spatio-Temporal Exploratory Model identifies factors affecting patterns of migration
Diverse bird observations and environmental data from 300,000 locations in the US, integrated and analyzed using high-performance computing resources
Land Cover
Meteorology
MODIS – Remote sensing data
• Analyze patterns of migration
• Predict future bird distributions
Model results
Occurrence of Indigo Bunting (2008)
Jan, Apr, Jun, Sep, Dec
Exploration, Visualization, and Analysis Working Group
Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze
DataONE Investigator Toolkit – 2012
Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze
DataONE Investigator Toolkit – future
Morpho
DataONE and the Data Life Cycle
• Education and training: Providing essential skills (e.g., data management training, best practices) for scientific enquiry
• Discovery and access: Enabling discovery and universal access to data about life on earth from around the world
• Data integration, visualization, and synthesis: Providing transformational tools that enable cross-cutting research
• Building community: Combining expertise and resources across diverse communities to collectively educate, advocate, and support the scientific data life cycle
DataONE Team and Sponsors
• Bertram Ludaescher
• Peter Honeyman
• Jeff Horsburgh
• Robert Sandusky
• Peter Buneman
• Carole Goble
• Cliff Duke
• Donald Hobern
• Ewa Deelman
• Amber Budden, Roger Dahl, Rebecca Koskela, Bill Michener, Robert Nahf, Mark Servilla
• Patricia Cruse, John Kunze
• Dave Vieglais
• Paul Allen, Rick Bonney, Steve Kelling
• Chad Berkley, Stephanie Hampton, Matt Jones
• Suzie Allard, Carol Tenopir, Maribeth Manoff, Robert Waltz, Bruce Wilson
• John Cobb, Bob Cook, Giri Palanisamy, Line Pouchard, Suresh Santhana Vannan
• Mike Frame, Viv Hutchison, Jeff Morisette, Jake Weltzin, Lisa Zolly
• David De Roure
• Ryan Scherle, Todd Vision
LEON LEVY FOUNDATION
• Randy Butler