44
Scientific Workflows, Science Gateways, and Apache Airavata: Opening Science, Opening Cyberinfrastructure Marlon Pierce, Suresh Marru, Saminda Wijeratne, and the IU Science Gateway Group {marpierc, smarru}@iu.edu

Scientific

Embed Size (px)

DESCRIPTION

Overview of Science Gateways, scientific workflows, cyberinfrastructure, and Apache Airavata.

Citation preview

Page 1: Scientific

Scientific Workflows, Science Gateways, and Apache Airavata: Opening Science, Opening

Cyberinfrastructure

Marlon Pierce, Suresh Marru, Saminda Wijeratne, and the IU Science Gateway

Group{marpierc, smarru}@iu.edu

Page 2: Scientific

Science Gateway Group• Amila Jayasekara• Chathuri Wimalasena• Jun Wang• Lahiru Gunathilake• Marlon Pierce (Group

Lead)• Raminder Singh• Saminda Wijeratne• Suresh Marru• Viknes Balasubramanee• Yu (Marie) Ma

We are primarily funded by NSF (XSEDE, SDCI, others) and NASA (QuakeSim, E-DECIDER) awards with two base funded positions.

Page 3: Scientific
Page 4: Scientific

What Is Cyberinfrastructure?

“Cyberinfrastructure consists of computing systems,data storage systems, advanced instruments anddata repositories, visualization environments, andpeople, all linked together by software and highperformance networks to improve researchproductivity and enable breakthroughs not otherwisepossible.”

–Craig Stewart, Indiana University

Page 5: Scientific

Compute Resources

Resource Middleware Cloud Interfaces Grid Middleware SSH & Resource

Managers

Computational Clouds Computational Grids

Gateway Services

User Interfacing

Services

Web/Gadget Container

Web Enabled Desktop Applications

User Management

Auditing & Reporting

Fault Tolerance

Application Abstractions

Workflow System

Information ServicesMonitoring

RegistryProvenance &

Metadata Management

Local Resources

Web/Gadget Interfaces

Gateway Abstraction Interfaces

Color Coding

Dependent resource provider components

Complimentary gateway components

Airavata components

Cyberinfrastructure Layers

Security

Page 6: Scientific
Page 7: Scientific

7

Page 8: Scientific

XSEDE Offers Variety

• Leading-edge distributed memory systems• Very large shared memory systems• High throughput systems• Visualization engines• Accelerators like GPUs

8

Many scientific problems have components that call for use of more than one architecture.

Page 9: Scientific

Extended Collaborative Support Services

• Support staff who understand the discipline as well as the systems (perhaps more than one support person working with a project).– 37 FTEs, spread over ~80 people at almost a dozen sites.

• Brings the best available knowledge and skills to bear on computational science problems within the XSEDE user community.

• Wide range of areas, – Performance analysis– Petascale optimization techniques to – Building Science Gateways.

• Novel and Innovative Projects – – bring in diverse and non-traditional users. 9

Page 10: Scientific

ECSS Examples• Algorithmic or solver change; incorporation or implementation of

new or advanced numerical methods and/or math libraries• Optimization (single processor performance, parallel scaling, parallel

I/O performance, memory usage, or benchmarking improvement)• Development of parallel code from serial code/algorithm• Data visualization or analysis• Incorporation of new data management or storage• New grid computing work• Implementation of new workflows for automation of scientific

processes• Incorporation of new visualization methods• Innovative scheduling implementation• Integration of XSEDE resources into a portal or Science Gateway

Page 11: Scientific

Knowledge and Expertise

Computational Resources

Scientific Instruments

Algorithms and Models

Archived Data and Metadata

Advanced Science Tools

Science Gateways: Enabling & Democratizing Scientific Research

Page 12: Scientific

Science Gateway CommunitiesCommunity Gateway Capabilities

Universities and Academic Departments

Provide simplified access to computing resources for students, faculty, and staff. Gateways should support MOOCs

Shared Instrument Facilities

Simplify access to instruments, support research derived from common data products: UltraScan, LIGO, DES, LSST

Multisite Collaborations and Virtual Organizations

Funded or unfunded, including self-organized communities, who need to collaborate and use a common pool of resources: LEAD, ENZO

Businesses and Services Provide Software as a Service for a particular application suite

Small Research Groups “Long tail” of science, need to preserve the work that is done.

Page 13: Scientific

UltraScan: A Science Gateway for Biophysical Analysis

• Support analysis of experiments on molecules in solution environments– Analytical ultracentrifugation (AUC)– Laser Light Scattering (LS)– Small angle X-ray/neutron scattering (SAXS/SANS)

• Provide highest possible resolution in the analysis – Requires HPC

• Offer a flexible model for multiple optimization methods• Integrate a variety of HPC environments into a uniform

user experience• Must be easy to learn and use –

– users should not have to be HPC experts– Provide check-pointing – easy to understand error messages

• Fast turnaround to support serial workflows

Page 15: Scientific

On-DemandGrid Computing

Example: Adapting Weather Prediction to Observational Sources Using Dynamic Adaptivity

StreamingObservations

Storms Forming

Forecast Model

Data Mining

Page 16: Scientific

16

Abstractions

Load LevelerPBS/Torque Sun Grid Engine LSF

GlobusGlobus

Unicore

EC2 APIGateway Frameworks

SSH

Page 17: Scientific

Apache Airavata Overview

http://airavata.apache.org

Page 18: Scientific

Realizing the Universe for the Dark Energy Survey (DES) Using XSEDE Support(Pis: A. Evrard (UM) and A. Kravtsov (UC)

• The Dark Energy Survey (DES) is an upcoming international experiment that aims to constrain the properties of dark energy and dark matter in the universe using a deep, 5000-square degree survey of cosmic structure traced by galaxies.

• To support this science, the DES Simulation Working Group is generating expectations for galaxy yields in various cosmologies.

• Analysis of these simulated catalogs offers a quality assurance capability for cosmological and astrophysical analysis of upcoming DES telescope data.

• These large, multi-staged computations are a natural fit for workflow control atop XSEDE resources.

Fig. 2: A synthetic 2x3 arcmin DES sky image showing galaxies, stars, and observational artifacts. Courtesy Huan Lin, FNAL.

Fig. 1 The density of dark matter in a thin radial slice as seen by a synthetic observer located in the 8 billion light-year computational volume. Image courtesy Matthew Becker, University of Chicago.

Page 19: Scientific

DES Application

Component Description

CAMB Code for Anisotropies in the Microwave Background is a serial FORTRAN code that computes the power spectrum of dark matter, which is necessary for generating the simulation initial conditions. Output is a small ASCII file describing the power spectrum.

2LPTic Second-order Lagrangian Perturbation Theory initial conditions code is an MPI based C code that computes the initial conditions for the simulation from parameters and an input power spectrum generated by CAMB. Output is a set of binary files that vary in size from ~80-250 GB depending on the simulation resolution.

LGadget LGadget is an MPI based C code that evolves a gravitational N-body system. The outputs of this step are system state snapshot files, as well as lightcone files, and some properties of the matter distribution, including the power spectrum at various timesteps. The total output from LGadget depends on resolution and the number of system snapshots stored, and approaches ~10 TB for large DES simulation boxes.

Page 20: Scientific

DES as a WorkflowThere are plenty of issues:• Long running code: Based on simulation box size

L-gadget can run for 3 to 5 days using more than 1024 cores.

• Local HPC provider policies: XSEDE resource provider’s job scheduling policy does not allow jobs to run for more than 24 hours in normal queue

• Do-While Construct: Restart service support is needed in workflow. Do-while construct was developed to address the need.

• Data size and File transfer challenges: L-gadget produces 10~TB for large DES simulation boxes in system scratch so data need to moved to persistent storage ASAP

• File system issues: More than 10,000 lightcone files are doing continues file I/O. This can cause problems with the HPC resource’s file system (usually Lustre-based in XSEDE).

Processing steps to build a synthetic galaxy catalog.

Page 21: Scientific

Workflow Interpreter

Application Factory

Message Box

Registry

Apache Airavata

API

Lorem ipsum

insolens

p1m5

duo x

End

Use

rsG

atew

ay

Dev

elop

er

Scientific Applicati

on

Core Developer

Computational Resources

Apache Airavata

Page 22: Scientific

Apache Airavata Components

Component Description

XBaya Workflow graphical composition tool.

Registry Service Insert and access application, host machine, workflow, and provenance data.

Workflow Interpreter Service

Execute the workflow on one or more resources.

Application Factory Service (GFAC)

Manages the execution and management of an application in a workflow

Messaging System WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events

Airavata API Single wrapping client to provide higher level programming interfaces.

Page 23: Scientific

Domain Description

Astronomy Image processing pipeline for One Degree Imager instrument on XSEDE

Astrophysics Supporting workflow of Dark Energy Survey simulations working group on XSEDE

Bioinformatics Supported workflow executions on Amazon EC2 for BioVLAB project

Biophysics Manage large scale data analysis of analytical ultracentrifugation experiments on XSEDE and campus resources

Computational Chemistry

Manage workflows to support computational chemistry parameter studies for ParamChem.org on XSEDE

Nuclear Physics Workflows for nuclear structure calculations using Leadership Class Configuration Interaction (LCCI) computations on DOE resources

Apache Airavata in Action

Page 24: Scientific

Apache and Open Governance

Page 25: Scientific

Cyberinfrastructure: How open is open source?

• What’s missing?– Open source licensing– Open Standards– Open Code (GitHub,

SourceForge, Google Code e.t.c)

We also need Open Governance

Page 26: Scientific

Open Source Software and Governance

• Open source projects need diversity, governance.– Reproducibility– Sustainability

• Incentives for projects to diversify their developer base.

• Govern– Software releases– Contributions– Credit sharing.– Members are added– Project direction decisions.– IP, legal issues

• Our approach: Apache Software Foundation

Collaborate

Compete

Page 27: Scientific

The Apache Software Foundation

• Apache software powers 65% of web sites worldwide

• 501(c)3 non-profit foundation

• Reasons for creating ASF– Create legal entity– Protect contributors from

liability– Protect Apache assets

• Membership: individual• Apache Incubator

• Governance and Staffing– Board of Directors– Project Management

Committees– ASF Members– Committers– Contributors

• Funding– All-volunteer

staffing/development resources

– Donations– Corporate investment

Page 28: Scientific

Apache Contributions Aren’t Just Software

• Apache committers and PMC members aren’t just code writers.

• Successful communities also include– Important users– Project evangelists – Content providers: documentation, tutorials– Testers, requirements providers, and constructive

complainers • Using Jira and mailing lists

– Anything else that needs doing.

Page 29: Scientific

More Information

• Contact Us:– [email protected], [email protected]

• Websites:– Science Gateway Group:

http://rt.uits.iu.edu/visualization/gateways/index.php

– Apache Airavata: http://airavata.apache.org – Science Gateway Institute:

http://sciencegateways.org

Page 30: Scientific

What Is Apache Airavata?• Science Gateway software

system to• Compose, manage, execute,

and monitor distributed, computational workflows

• Wrap legacy command line scientific applications with Web services.

• Run jobs on computational resources ranging from local resources to computational grids and clouds

• Airavata software is largely derived from NSF-funded academic research.

Page 31: Scientific

Apache Airavata Science Gateway Framework

Page 32: Scientific

Workflow Interpreter

Application Factory

Message Box

Registry

Apache Airavata

API

Lorem ipsum

insolens

p1m5

duo x

End

Use

rsG

atew

ay D

evel

oper

Scientific Applicati

on

Core Developer

Computational Resources

Apache Airavata

Page 33: Scientific

Apache Airavata Components

Component Description

XBaya Workflow graphical composition tool.

Registry Service Insert and access application, host machine, workflow, and provenance data.

Workflow Interpreter Service

Execute the workflow on one or more resources.

Application Factory Service (GFAC)

Manages the execution and management of an application in a workflow

Messaging System WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events

Airavata API Single wrapping client to provide higher level programming interfaces.

Page 34: Scientific

Key Airavata Features

• Graphical user interface to construct, execute, control, manage and reuse scientific workflows.

• Desktop tools and browser-based web interface components to manage applications, workflows and generated data.

• Sophisticated server-side tools to register, schedule and manage scientific applications on high performance computational resources.

• Ability to Interface and interoperate with various external (third party) data, workflow and provenance management tools.

Page 35: Scientific

A Classic Scientific Workflow

• Workflows are composite applications built out of independent parts.– Parts are executables wrapped as network accessible services

• The classic example is that codes A, B, and C need to be executed in a specific sequence.– A, B, C: parallel codes compiled and executable on a cluster, supercomputer,

etc. by schedulers.• A, B, and C do not need to be co-located• A, B, and C may be sequential or parallel• A, B and C may have date or control dependencies

– Data may need to be staged in and out• Some variations on ABC:

– Conditional execution branches– Dynamic execution resource binding– Iterations (Do-while, For-Each) over all or parts of the sequence– Triggers, events, data streams

Page 36: Scientific

Challenges in Scientific Workflows

• Accommodating wide range of execution patterns – Iterations: for-each, do-while, dot and Cartesian

products– Interactivity, adaptivity, non-determinism

• Accommodating error and uncertainties

Page 37: Scientific

NextGen Workflow Systems: Need for Interactivity Across Layers

• Scientific workflow systems and compiled workflow languages have focused on modeling, scheduling, data movement, dynamic service creation and monitoring of workflows.

• Building on these foundations Airavata extends to a interactive and flexible workflow systems.

• Airavata Workflow Features include:– interactive ways of interfering and steering the workflow

execution– interpreted workflow execution model– high level instruction set – flexibility to execute individual workflow activity and wait for

further analysis.

Page 38: Scientific

Interactivity Contd.• Derivations during workflow Execution that does

not affect the structure of the workflow– dynamic change workflow inputs, workflow rerun.

interpreted workflow execution model.– dynamic change in point of execution, workflow smart

rerun.– Fault handling and exception models.

• Derivation that change the workflow DAG during runtime– Reconfiguration of activity..– dynamic addition of activities to the workflow.– Dynamic remove or replace of activity to the workflow

Page 39: Scientific

Interactivity• Mathematical uncertainty:

– PDE’s from domain problems do not have analytical solution and thereby look at numerical methods to find solutions

– These solvers may not converge depending on method, PDE system, initial conditions and expected output tolerances

– statistical techniques lead to nondeterministic results.– closer observation at computational output ensure acceptability of results.

• Domain uncertainty:– Scenarios of running against range of parameter values in an attempt to find the most appropriate

input set. – Initial execution providing estimate of the accuracy of the inputs and facilitating further

refinement.– Outputs are diverse and nondeterministic

• Resource uncertainty:– Failures in distributed systems are norm than an exception– transient failures can be retried if computation is side-effect free/Idempotent. – persistent failures require migration

• Real-time Model refinement– Real-time event processing systems not having data available prior to initialization of model. – models evolve over time and can take advantage of more and more events as they become

available

Page 40: Scientific

Illustrating Interactivity

Page 41: Scientific

Domain Description

Astronomy Image processing pipeline for One Degree Imager instrument on XSEDE

Astrophysics Supporting workflow of Dark Energy Survey simulations working group on XSEDE

Bioinformatics Supported workflow executions on Amazon EC2 for BioVLAB project

Biophysics Manage large scale data analysis of analytical ultracentrifugation experiments on XSEDE and campus resources

Computational Chemistry

Manage workflows to support computational chemistry parameter studies for ParamChem.org on XSEDE

Nuclear Physics Workflows for nuclear structure calculations using Leadership Class Configuration Interaction (LCCI) computations on DOE resources

Apache Airavata in Action

Page 42: Scientific

Workflow Interpreter

Application Factory

Message Box

Registry

Apache Airavata

API

Lorem ipsum

insolens

p1m5

duo x

End

Use

rsG

atew

ay D

evel

oper

Scientific Applicati

on

Core Developer

Computational Resources

Apache Airavata

Page 43: Scientific

Apache Airavata ComponentsComponent Description

XBaya Workflow graphical composition tool.

Registry Service Insert and access application, host machine, workflow, and provenance data.

Workflow Interpreter Service

Execute the workflow on one or more resources.

Application Factory Service (GFAC)

Manages the execution and management of an application in a workflow

Messaging System WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events

Airavata API Single wrapping client to provide higher level programming interfaces.

Page 44: Scientific