The iPlant Collaborative Cyberinfrastructure aka Development of Public Cyberinfrastructure to Support Plant Science Presented by Dan Stanzione Co-PI and Cyberinfrastructure Lead, iPlant Collaborative Deputy Director, Texas Advanced Computing Center

The iPlant Collaborative Cyberinfrastructure

aka Development of Public Cyberinfrastructure to Support Plant Science

Presented by Dan Stanzione

Co-PI and Cyberinfrastructure Lead, iPlant Collaborative

Deputy Director, Texas Advanced Computing Center

Today’s Schedule

• Presentations
– Me: Overview and architecture
– Matt: CI for Genotype to Phenotype
– Sheldon: CI for Tree of Life
– Uwe: CI for Educating the next generation of biologists

• A quick break

• Live interactive demos of DNA Subway, Discovery Environment, and MyPlant site.

What is iPlant?

• iPlant’s mission is to build the CI to support plant biology’s Grand Challenge solutions

• Grand Challenges were not defined in advance, but identified through engagement with the community

• A virtual organization with Grand Challenge teams relying on national cyberinfrastructure

• Long-term focus on sustainable food supply, climate change, biofuels, ecological stability, etc.

• Hundreds of participants globally… Working group members at >50 US institutions, USDA, DOE, etc.

Brief History

• Formally approved by National Science Board – 12/2007

• Funding by NSF – February 1st, 2008

• iPlant Kickoff Conference at CSHL – April 2008
– ~200 participants

• Grand Challenge Workshops – Sept-Dec 2008

• CI workshop – Jan 2009

• Grand Challenge White Paper Review – March 2009

• Project Recommendations – March 2009

• Project Kickoffs – May 2009 & August 2009

• First Release of Discovery Environments – April 2010

Grand Challenge & CI Workshops

• Mechanistic Basis of Plant Adaptation (9-30-08)

• Impact of Climate Change on Plant Productivity: Prediction of Phenotype from Genotype (9-30-08)

• Developing common models for molecular mechanisms, crop physiology, and ecology (11-7-08)

• Assembling the Tree of Life to Enable the Plant Sciences (11-19-08)

• Computational Morphodynamics of Plants (12-15-08)

• Botanical Information & Ecology Network

• CI Workshop

GC Projects Recommended by the iPlant Board of Directors, March 2009

Initial Projects:

Plant Tree of Life – iPToL – May ‘09
+ Taxonomic Intelligence
+ APWeb2
+ Social Networking Website

Genotype to Phenotype – iPG2P – Aug ‘09
+ Image Analysis Platform

iPlant Tree of Life Working Groups

Trait Evolution, Brian O’Meara
– Post-tree analysis and mapping of ancestral traits

Tree Reconciliation, Todd Vision
– Large-scale reconciliation of gene trees, co-evolving parasites, etc., with species trees

Big Trees, Alexandros Stamatakis
– HPC phylogenetic inference with 500K taxa

Tree Visualization, Michael Sanderson; Karen Cranston
– Cross-cutting group for the viz needs of all

Data Integration, Val Tannen, Bill Piel
– Cross-cutting group for the data integration needs of all

Data Assembly, Doug Soltis, Pam Soltis, Michael Donoghue
– Community and network building, data assembly

iPlant Genotype to Phenotype Working Groups

• NextGen Sequencing
– Establishing an informatics pipeline that will allow the plant community to process NextGen sequence data

• Statistical Inference
– Developing a platform using advanced computational approaches to statistically link genotype to phenotype

• Modeling Tools
– Developing a framework to support tools for the construction, simulation and analysis of computational models of plant function at various scales of resolution and fidelity

• Visual Analytics
– Generating, adapting, and integrating visualization tools capable of displaying diverse types of data from laboratory, field, in silico analyses and simulations

• Data Integration
– Investigating and applying methods for describing and unifying data sets into virtual systems that support iPG2P activities

NSF Cyberinfrastructure Vision

• High Performance Computing

• Data and Data Analysis

• Virtual Organizations

• Learning and Workforce

Ref: “Cyberinfrastructure Vision for 21st Century Discovery”, NSF Cyberinfrastructure Council, March 2007.

What is Cyberinfrastructure? (Originally about TeraGrid)

[Figure labels: “It’s a Grid!”, “It’s Storage!”, “It’s a Common Software Environ!”, “It’s a Network!”, “They are HPC Centers!”, “It’s Apps and Support!”, “And More!: Viz, Facilities, Data collections”. Source: WWW.TERAGRID.ORG]

It was six men of Indostan,
To learning much inclined,
Who went to see the elephant,
(Though all of them were blind),
That each by observation
Might satisfy his mind.

Cyberinfrastructure versus Bioinformatics

• Leveraging production compute and storage infrastructure; hundreds of millions in NSF investment… these aren’t machines in our lab.

• Focus on a *platform*, not tools
– Methods for leveraging physical resources
– Methods for integrating tools
– Methods for integrating data

• Emphasis on a sustainable, species independent platform.

What is the iPlant CI?

• Two grand challenges:

• iPlant Tree of Life (IPTOL):
– Build a single tree showing the evolutionary relationships of all green plant species on Earth

• iPlant Genotype-to-Phenotype (IPG2P):
– Construct a methodology whereby an investigator, given the genomic and environmental information about a given plant, can predict its characteristics.

Strong focus on data integration, not simulation: plant science is truly data driven.

Still many computational challenges (e.g. inferring phylogenies from genome data)

[Figure: Prototype visualization tool, showing 220,000 taxa phylogenetic tree]

Open Source Philosophy, Commercial Quality Process

• iPlant is open in every sense of the word:
– Open access to source
– Open API to build a community of contributors
– Open standards adopted wherever possible
– Open access to data (where users so choose).

• iPlant code design, implementation, and quality control will be based on best industrial practice

CI Timelines

• Per NSF mandate: No development before conclusion of GC selection in March, 2009

• GC projects kicked off requirements gathering phase in May and July 2009, respectively.

• Software engineering practices established, staffing expansion in summer of 2009

• Architecture design and the first production and prototype coding began in September 2009

• Initial prototype rollouts began Jan. 2010

• First product betas began March 2010

• New releases of DE quarterly, with periodic releases of other products.

IPTOL CI – At a Very High Level

• Goal: Build and use very large trees, perhaps all green plant species

• Needs:
– Most of the data isn’t collected. A lot of what is collected isn’t organized.
– Lots of analysis tools exist (probably plenty of them) – but they don’t work together, and use many different data formats.
– The tree builder tools take too long to run.
– The visualization tools don’t scale to the tree sizes needed.

IPTOL CI – High Level

• Addressing these needs through CI

– MyPlant – the social networking site for phylogenetic data collection (organized by clade)

– Provide a common repository for data without an NCBI home (e.g. 1kP)

– Discovery Environment: Build a common interface, data format, and API to unite tools.

– Enhance tree builder tools (RAxML, NINJA, SATé) with parallelization and checkpointing

– Build a remote visualization tool capable of running where we can guarantee RAM resources
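The checkpointing item above can be illustrated with a generic checkpoint/restart loop. This is only a sketch under stated assumptions: it is not how RAxML, NINJA, or SATé implement checkpointing, the function name `run_with_checkpoints` and the JSON file layout are hypothetical, and the state is assumed to be JSON-serializable.

```python
import json
import os
import tempfile


def run_with_checkpoints(step, initial_state, total_iters, path, every=100):
    """Run step() total_iters times, resuming from a checkpoint file if one exists."""
    if os.path.exists(path):
        # A previous run left a checkpoint: pick up where it stopped.
        with open(path) as fh:
            saved = json.load(fh)
        start, state = saved["iteration"], saved["state"]
    else:
        start, state = 0, initial_state

    for i in range(start, total_iters):
        state = step(state)
        done = i + 1
        if done % every == 0 or done == total_iters:
            # Write to a temporary file and rename it, so a crash in the
            # middle of a write never leaves a truncated checkpoint behind.
            fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
            with os.fdopen(fd, "w") as fh:
                json.dump({"iteration": done, "state": state}, fh)
            os.replace(tmp, path)
    return state
```

If a job is killed by a batch scheduler's wall-clock limit, resubmitting it with the same checkpoint path repeats at most `every` iterations instead of the whole run.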

Support of Existing Tools

• The IPTOL working groups have identified a number of tools that needed to be enhanced to meet initial scientific goals:
– NINJA (Neighbor Joining)
– RAxML (Maximum Likelihood)
– Both pose significant scalability challenges that iPlant staff are helping the developers tackle.

Tree Visualization

Clade-based navigation, scaling to 1M taxa guaranteed

Discovery Environment

First DE

• Support of only one workflow, independent contrasts, but:
– Remote execution of compute tasks on TeraGrid resources seamlessly
– Incorporation of existing informatics tools behind iPlant interface
– Parsing of multiple data formats into iPlant format
– Seamless integration of online data resources
– Role based access and basic provenance support

• Mostly foundation work…

Second DE Release

• Added Functionality for G2P, specifically high throughput sequencing (transcript abundance, variant detection)

• Substantially enhanced UI

• iRODS Integration

Portfolio of Activities

• Maintaining a balance of “past, present, future” strategies
– “Past”: make services, systems, and support available to existing bioinformatics projects, either to enhance them or simply make critical tools more widely available.
– “Present”: build the best bioinformatics software tools that today’s technologies can provide.
– “Future”: track emerging technologies, and where appropriate stimulate research into the creation and use of those technologies.

Portfolio of Activities

• In a nutshell:
– 12 working groups in the two grand challenges, each of which is defining requirements for DE development. Each group not only has discussions that lead to final projects, but also spawns prototyping efforts, tech eval projects, tool support projects, etc.
– Services group: provide cycles, storage, hosting, etc. to users.
– A comprehensive technology evaluation program to find, borrow, or build relevant technologies, headlined by the semantic web effort.
– A number of ancillary projects related to grand challenges, e.g. APWEB, high throughput image analysis
– The core development/integration effort.

The iPlant Cyberinfrastructure

[Layered architecture diagram:]

Physical Infrastructure
– Compute, Storage, Persistent Virtual Machines
– TeraGrid, Open Science Grid, UA/ASU/TACC

iPlant Middleware
– Job Submission, Workflow Management, Service/Data APIs
– iRODS, Grid Technologies, Condor, RESTful Services

iPlant Discovery Environments
– Grand Challenge Workflows, iPlant Interfaces
– Third Party Tools, iPlant-built Tools, Community Contributed Tools and Data!

User

Build a CI that’s robust, leverages national infrastructure, and can grow through community contribution!

Systems and Services

• Provide access for problems like these on large scale systems

• Provide the storage infrastructure for biological data (again, in support of existing projects)

• Provide cloud style VM infrastructure for service hosting.

Existing Systems

• We have made resources available to iPlant users from a number of TeraGrid and local systems
– Ranger (TG/Large Scale Supercomputer)
– Stampede (TACC/High Throughput)
– Longhorn (TG/Remote Visualization and GPU)

• The Contrast tool runs in production on Stampede; TreeViz on Longhorn

• Several groups accessing these systems for real science now; command line only, but open for business!

Storage Services

• We have also begun offering storage to a number of projects connected to the grand challenges in some way, as well as for iPlant internal use.
– iRODS interface
– Corral at TACC, a local storage array at UA

• Data arriving now for 1KP project, Gates C3/C4 project, some labs starting to use… open for business.

Cloud Services

• iPlant is now offering “cloud” style hosting services.

• Dynamically launch virtual servers hosted by iPlant.

• Still in prototype

A Discovery Environment

[Diagram components:]
– Remote repositories for data, models, and algorithms
– Local datasets
– Community annotation
– Computational tools & web services
– Integration layer
– Community ontologies & controlled vocabularies
– Programmatic access (API)
– Collaboration-friendly front end

Discovery Environment Releases

• First release was March; Technology Preview
– Tests of architecture, identity management, remote system integration, etc.
– One supported workflow for IPTOL

• Second release in June 2010
– Six workflows for IPG2P around high throughput sequencing
– Integrated tree visualization tool
– UI refinements based on user feedback

• Third release September 2010
– Basic support for incorporation of 3rd party tools
– Enhanced collaboration features
– Taxon name scrubber

The iPlant Application Programmer Interface

And What it Means to You

API – Why is it important?

• The API is the Application Programmer Interface (it comes with an associated SDK, or software developer kit).

• This is the way bioinformatics tools and data get integrated with iPlant.

• First pieces to be released late Sept/early October.

• The cardinal sin of API support:
– Release lots of versions, each incompatible with the last.
– Our approach: Incremental releases; each release will add new areas of functionality, not change old syntax
– Initial support: Getting files in and out of the environment, running jobs.
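One way to picture the "add new areas, never change old syntax" approach is a route table that accepts new paths in each release but refuses to redefine an existing one. The sketch below is purely illustrative: `ApiRegistry`, the release labels, and the paths `/io/get` and `/job/submit` are hypothetical and are not taken from the iPlant API.

```python
class ApiRegistry:
    """Hypothetical route table: releases may add routes but never redefine them."""

    def __init__(self):
        self._routes = {}       # path -> handler
        self._introduced = {}   # path -> release that first shipped it

    def release(self, name, routes):
        """Register the routes added by one release; reject changes to existing paths."""
        clashes = sorted(set(routes) & set(self._routes))
        if clashes:
            raise ValueError(f"release {name} would change existing routes: {clashes}")
        for path, handler in routes.items():
            self._routes[path] = handler
            self._introduced[path] = name

    def call(self, path, *args):
        """Dispatch a call to whichever release introduced the path."""
        return self._routes[path](*args)

    def paths(self):
        return sorted(self._routes)


api = ApiRegistry()
# Handlers are stand-ins; only the additive release pattern matters here.
api.release("r1", {"/io/get": lambda name: f"contents of {name}"})
api.release("r2", {"/job/submit": lambda app: f"submitted {app}"})
```

A script written against the first release keeps working after the second, because `/io/get` still resolves to the same handler.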

Architecture

Core Services

Application Discovery Services

Application discovery and management (different from semantic web service discovery)

• /apps: add a new application to the iPlant CI
• /apps/list: list all supported applications
• /apps/search: search for a specific application
• /apps/type/list: list all supported application types
• /apps/type/<type_name>: list all supported applications of a specific type
• /apps/name/<app_name>: list all supported applications matching a given name
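A small helper can assemble the endpoint paths listed on this slide. This is a sketch, assuming the apps service lives under the same services host shown elsewhere in the deck; the function name `apps_url`, the example type name `phylogenetics`, and the `q` search parameter are hypothetical, since the slide does not specify them.

```python
from urllib.parse import quote, urlencode

# Assumption: the apps service shares the host used in the file-retrieval
# example in these slides; the real prefix may differ.
BASE_URL = "https://services.iplantcollaborative.org"


def apps_url(*segments, **query):
    """Build a URL under /apps from path segments and optional query parameters."""
    path = "/".join(quote(str(s), safe="") for s in ("apps",) + segments)
    url = f"{BASE_URL}/{path}"
    if query:
        url += "?" + urlencode(query)
    return url


list_all = apps_url("list")                   # /apps/list
list_types = apps_url("type", "list")         # /apps/type/list
by_type = apps_url("type", "phylogenetics")   # /apps/type/<type_name>, example value
by_name = apps_url("name", "contrast")        # /apps/name/<app_name>
# Assumption: how search terms are passed is not stated; "q" is made up.
search = apps_url("search", q="neighbor joining")
```

Percent-encoding each segment keeps application or type names with spaces or slashes from breaking the path.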

API/DE Back End

Publishing Your Service Through iPlant

• Wrap your service in our API (or get us to).

• Give us the package to deploy on our platforms (optional, but a good idea).

• We register it as a service, discoverable through the app API.

• Describe the user interface to our discovery environment (graphical tool to build forms).

Using the API in your work *outside* the Discovery Environment

• You need not come thru the DE to make use of a service.

• Embed calls to the web service in your own code, or even from the command line.

• For example, to get an output file from your Phylip run: https://services.iplantcollaborative.org/contrast/file/get/(<id>)

• While it is nice to do this by hand, the key thing is it can be *automated*.
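The automation point can be sketched with the Python standard library. The URL pattern is copied from the slide; everything else is an assumption: the `(<id>)` placeholder is taken to mean an identifier substituted without the parentheses, the function names are hypothetical, and authentication is omitted because the slide does not describe it.

```python
import shutil
import urllib.request

# URL pattern from the slide, with "(<id>)" replaced by a format field.
FILE_GET = "https://services.iplantcollaborative.org/contrast/file/get/{file_id}"


def output_url(file_id):
    """Return the retrieval URL for one output file."""
    return FILE_GET.format(file_id=file_id)


def fetch_output(file_id, dest_path, opener=urllib.request.urlopen):
    """Download one output file to dest_path and return the path.

    The opener argument exists so the network call can be swapped out,
    for example in tests.
    """
    with opener(output_url(file_id)) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


def fetch_many(file_ids, dest_dir=".", opener=urllib.request.urlopen):
    """Fetch several outputs in a loop, which is the part worth automating."""
    return [fetch_output(i, f"{dest_dir}/{i}.out", opener) for i in file_ids]
```

Calling `fetch_many` with a list of identifiers from a batch of runs retrieves every result without visiting the Discovery Environment.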

Roadmap

API Service      Expected Beta Release   Development Team
io, data, event  September, 2010         TACC, CSH
apps, job        October, 2010           TACC
profile, auth    October, 2010           UA, TACC
audit            November, 2010          TACC, UA
mashup           January, 2010           UA, TACC, CSH

TACC – Texas Advanced Computing Center
CSH – Cold Spring Harbor Laboratory
UA – University of Arizona

Technology Eval Activities

• Largest investment in semantic web activities
– Key for addressing the massive data integration challenges

• Exploring alternate implementations of QTL mapping algorithms

• Experimental Reproducibility

• Policy and Technology for Provenance Management

• Evaluation of HubZero, Workflow engines, numerous other tools

Deployment Strategy

• Broadly, iPlant CI deployment can be grouped into 3 categories:
– Systems, services (middleware), and tools

• In each category, there are a couple of types of development/deployment activities.
– Prototype, production

• The transition from prototype to production can usually follow a relatively robust engineering schedule; prototyping less so.

So, what can I get from iPlant Right NOW!

• Tools:

– Use the Discovery Environment to do transcript abundance, variant detection, trait evolution, or just store your stuff

– Access to prototype tools for large scale tree visualization or very large tree building runs with neighbor joining or maximum likelihood.

– Use MyPlant to find data and colleagues working on related species

– Use DNASubway to do genome annotation and train your students.

So, what can I get from iPlant Right NOW!

• Systems/Services:

– Request a repository to provide command line or WebDAV access to large scale datasets on high integrity storage systems

– Get command line access to the most powerful computing and visualization systems in the world.

– Use the iPlant Cloud to host your web application in a virtual machine.

So, what can I get from iPlant SOON

• Services:
– Use the API to embed access to iPlant tools, systems, and data repositories in your own scripts and workflows.
– Submit your bioinformatics tool to be registered as an iPlant service (run on large platforms, available to others thru API), or make your web service discoverable thru iPlant.
– A little later: Have your tool incorporated in the iPlant DE with its own graphical interface.

• Tools: more coming on line steadily

Collaborations

• More than 80 faculty at 45 institutions involved in working groups.

• Gates Integrated Breeding Platform

• Gates C3/C4 photosynthesis project

• 1KP thousand plant transcriptome project

• Nascent “National Virtual Herbaria”, many existing herbaria.

Discussion

(See demo clips at http://iplantcollaborative.org/videos)

CI Master Project List

• DE

• API

• Semantic Web

• GLM

• GLM – GPU

• MyPlant

• RAxML

• NINJA

• Experimental Reproducibility

• Image management pipeline

• APWEB refit

• Ingest pipeline (Phlawd)

• DNA Subway

• BrachyBio

• DropBox

• Cloud service

• Storage repositories

• Analytics pipeline

• Visualization explorations

• Workflow tools analysis

• Large Scale Tree Visualization

• Name resolution service

DNA Subway

MyPlant

• Social networking for plant biologists

• Organized by clade

• Used to organize the data collection for the “big tree”

Scope: What iPlant won’t do

• iPlant is not a funding agency
– A large grant shouldn’t become a bunch of small grants

• iPlant does not fund data collection

• iPlant will (probably) not continue funding for <favorite tool x> whose funding is ending.

• iPlant will not seek to replace all online data repositories

• iPlant will not *impose* standards on the community.

Scope: What iPlant *will* do

• Provide storage, computation, hosting, and lots of programmer effort to support grand challenge efforts.

• Work with the community to support and develop standards

• Provide forums to discuss the role and design of CI in plant science

• Help organize the community to collect data

• Provide appropriate funding for time spent helping us design and test the CI

Experimental Systems

• We are experimenting with some newer technologies to plug gaps in the existing lineup for demonstrated needs (also leveraging some other funding)
– New model for shared memory (ScaleMP cluster to be deployed soon); will support whole-genome assembly
– “Cloud Storage” models to reduce archive cost, increase capacity (HDFS system on commodity cluster to be deployed this quarter); will also support Hadoop data processing

Deployment Timelines Summary

• Systems:
– Production systems (HPC, storage, throughput, visualization) available *now* and in use.
– Experimental systems (simulated shared memory, cloud, cloud storage) coming up in prototype stage.

• Services:
– Web service API to incorporate 3rd party tools prototyping now, public releases in Q3.

• Tools:
– A number of prototypes available now, many underway
– Contrast workflow, Variant detection, Transcript quantification all released now.

The iPlant CI

• Engagement with the CI Community to leverage best practice and new research

• Unprecedented engagement with the user community to drive requirements

• A single CI for all plant scientists, with customized discovery environments to meet grand challenges

• An exemplar virtual organization for modern computational science

• A Foundation of Computational and Storage Capability

• Open source principles, commercial quality development process