Opportunities and Challenges for Running Scientific Workflows on the Cloud. Yong Zhao, Xubo Fei, Ioan Raicu, Shiyong Lu. Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2011 International Conference. Ying Lian, Computer Science, WSU.

Page 1: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Opportunities and Challenges for Running Scientific Workflows on the Cloud

Yong Zhao, Xubo Fei, Ioan Raicu, Shiyong Lu

Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2011 International Conference

Ying Lian

Computer Science, WSU

Page 2: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Overview

INTRODUCTION

OPPORTUNITIES

CHALLENGES

RESEARCH DIRECTIONS

CONCLUSIONS

Page 3: Opportunities and Challenges for Running Scientific Workflows on the Cloud

INTRODUCTION

There is something in the air.

Page 4: Opportunities and Challenges for Running Scientific Workflows on the Cloud

INTRODUCTION

Cloud computing is gaining tremendous momentum in both academia and industry.

“Cloud Computing”: a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet.

Mostly applied to Web applications and business applications. To support workflow applications, a link is missing.

Page 5: Opportunities and Challenges for Running Scientific Workflows on the Cloud

INTRODUCTION

Manage and run workflow applications on the cloud (especially data-intensive scientific workflows)

Several Scientific workflow management systems (SWFMSs) have been applied.

Cloud Workflow: specification, execution, and provenance tracking of scientific workflows, as well as the management of data and computing resources to enable the running of scientific workflows on the Cloud

Following sections: Meaning, challenges, research opportunities

Page 6: Opportunities and Challenges for Running Scientific Workflows on the Cloud

OPPORTUNITIES

Keywords: infinite computing resources

1. The scale of scientific problems that can be addressed by scientific workflows is now greatly increased. It was previously bounded above by the size of a dedicated resource pool, with limited resource-sharing extension in the form of virtual organizations.

Data size (e.g. GenBank doubling every 9-12 months): vast storage space

Complexity of the applications (e.g. protein simulation by iterative algorithm with huge parameters): massive computing resources
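The storage pressure behind the GenBank note above can be put in numbers. This is a minimal sketch that reads the slide's shorthand as a doubling period of 9 to 12 months (an interpretation, not a figure checked against GenBank), and `growth_factor` is a hypothetical helper name.

```python
def growth_factor(months, doubling_months):
    """How many times larger a data set is after `months`,
    assuming it doubles every `doubling_months` (assumed model)."""
    return 2 ** (months / doubling_months)

# Over three years, the two ends of the assumed 9-12 month range
# differ by a factor of two in required storage.
slow = growth_factor(36, 12)  # three doublings
fast = growth_factor(36, 9)   # four doublings
```

Under this assumption a fixed storage pool is outgrown within a few years, which is the case for on-demand storage made on the slide.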

Page 7: Opportunities and Challenges for Running Scientific Workflows on the Cloud

OPPORTUNITIES

2. The on-demand resource allocation mechanism in the Cloud has a number of advantages over traditional cluster/Grid environments for scientific workflows:

Improved resource utilization: different stages require different numbers of resources.

Faster turnaround time for end users: dynamic scale out/in.

Enabling a new generation of workflows: collaborative scientific workflows, in which user interaction and collaboration patterns are favored.
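The utilization argument above can be illustrated with a small sketch. The stage sizes and durations are made-up numbers chosen for illustration, not measurements from the paper, and both function names are hypothetical.

```python
# Hypothetical workflow: (nodes needed, hours) per stage. Illustrative only.
stages = [(4, 2.0), (64, 1.0), (8, 3.0)]

def static_node_hours(stages):
    """Fixed cluster sized for the widest stage and held for the whole run."""
    peak = max(nodes for nodes, _ in stages)
    total_hours = sum(hours for _, hours in stages)
    return peak * total_hours

def elastic_node_hours(stages):
    """On-demand allocation: each stage holds only the nodes it needs."""
    return sum(nodes * hours for nodes, hours in stages)

static = static_node_hours(stages)    # 64 nodes * 6.0 hours
elastic = elastic_node_hours(stages)  # 8.0 + 64.0 + 24.0
utilization = elastic / static        # share of the static pool actually used
```

With these numbers the statically sized cluster is busy only a quarter of the time, while the elastic allocation pays only for the node-hours each stage uses.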

Page 8: Opportunities and Challenges for Running Scientific Workflows on the Cloud

OPPORTUNITIES

3. Much bigger room for trade-off between performance and cost.

Spectrum of resource investment: from dedicated private resources, through hybrid local and Cloud resources, to full outsourcing on Clouds

Cloud computing brings the opportunity to improve the performance/cost ratio

But optimizing this ratio and building an automatic trade-off mechanism remain challenging.
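One way to picture the performance/cost trade-off is a toy model that picks the cheapest number of virtual machines that still meets a deadline. This is only a sketch under stated assumptions: the Amdahl-style runtime model, the whole-hour billing, the price, and all function and parameter names are hypothetical, not taken from the paper or from any provider.

```python
import math

def runtime_hours(n_vms, serial_hours=1.0, parallel_hours=100.0):
    """Assumed model: a serial part plus a parallel part split over n VMs."""
    return serial_hours + parallel_hours / n_vms

def cost(n_vms, price_per_vm_hour=0.10):
    """Assumed pay-as-you-go billing in whole VM-hours."""
    return n_vms * math.ceil(runtime_hours(n_vms)) * price_per_vm_hour

def cheapest_within_deadline(deadline_hours, max_vms=200):
    """Return the VM count with the lowest cost that meets the deadline,
    or None if no count up to max_vms is fast enough."""
    candidates = [n for n in range(1, max_vms + 1)
                  if runtime_hours(n) <= deadline_hours]
    if not candidates:
        return None
    return min(candidates, key=cost)
```

Even this toy version shows why the slide calls automatic trade-off hard: billing granularity makes the cost curve uneven, and a real workflow adds data transfer, storage, and stage-dependent parallelism on top.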

Page 9: Opportunities and Challenges for Running Scientific Workflows on the Cloud

CHALLENGES

Architectural challenges

Integration challenges

Computing challenges

Data management challenges

Language challenges

Service management challenges

Page 10: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Architectural Challenges

User interface customizability and support

Reproducibility support

Heterogeneous and distributed services and software tools integration

Heterogeneous and distributed data product management

High-end computing support

Workflow monitoring and failure handling

Interoperability

Page 11: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Reference Architecture for SWFMSs

Page 12: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Deploy the architecture: solutions

Operation: SWFMS running outside the Cloud; no concern of vendor lock-in; the SWFMS itself cannot benefit from the scalability

Task Management: not on a batch-based schedule; deploy immediately without sequence; cost of storage of provenance & data products

Workflow management: Presentation Layer deployed at a client machine; suitable for ad hoc domain-specific requirements; more dependent on the Cloud platform

All-in-the-Cloud: SWFMS inside the Cloud and accessed via Web browser; highly scalable (Software as a Service); cost, dependency, and vendor lock-in

Page 13: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Integration Challenges

How to integrate scientific workflow systems with Cloud infrastructure and resources?

Operation layer: applications, services, and tools are hosted in the Cloud, while the scheduling and management of a workflow remain outside the Cloud (e.g. with the Google Map service, ad hoc scripts and programs are used to glue the services together)

Task management layer: resource provisioning. (e.g. Nimbus)

Workflow management layer: Debugging, monitoring, and provenance tracking

All in cloud: porting issue. Need a workflow engine at cloud end, and web interface or thin client at user end

Page 14: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Language Challenges

MapReduce: a widely used computing model with two key functions, Map and Reduce (white-box).

SwiftScript serves as a general-purpose coordination language, in which existing applications can be invoked without modification (black-box).
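The white-box versus black-box distinction above can be sketched in a few lines. This is an analogy in Python, not MapReduce or SwiftScript code: all function names are hypothetical, and a Python one-liner stands in for the "existing application" so that the sketch runs anywhere.

```python
import subprocess
import sys
from collections import defaultdict

# White-box style: the programmer writes the map and reduce logic.
def map_words(line):
    return [(word, 1) for word in line.split()]

def reduce_counts(pairs):
    totals = defaultdict(int)
    for word, n in pairs:
        totals[word] += n
    return dict(totals)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_words(line)]
    return reduce_counts(pairs)

# Black-box style: an existing program is invoked unmodified; the
# coordinator only sees its input and output.
def run_black_box(text):
    app = [sys.executable, "-c",
           "import sys; print(sys.stdin.read().upper(), end='')"]
    done = subprocess.run(app, input=text, capture_output=True,
                          text=True, check=True)
    return done.stdout
```

In the white-box case the framework can see inside the computation; in the black-box case it can only schedule the program and move its files, which is why the mapping of inputs and outputs becomes a language concern.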

Page 15: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Language Challenges

Handle the mapping from input and output data into logical structures.

Support large-scale parallelism via either implicit parallelism, or explicit declaratives.

Support data partitioning and task partitioning.

Require a scalable, reliable, and efficient runtime system that can support Cloud-scale task scheduling and dispatching and provide error recovery and fault tolerance.
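Data partitioning with parallel dispatch, as listed above, might look like the following on a single machine. It is a minimal sketch using Python's standard thread pool as a stand-in for a Cloud-scale runtime; `partition`, `process_chunk`, and `run_partitioned` are hypothetical names, and the per-chunk work is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(items, n_parts):
    """Split items into n_parts nearly equal contiguous chunks."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def process_chunk(chunk):
    """Placeholder task: sum of squares over one partition."""
    return sum(x * x for x in chunk)

def run_partitioned(items, n_parts=4):
    """Dispatch one task per partition and combine the partial results."""
    chunks = partition(items, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)
```

The caller never spawns a worker explicitly; the pool decides where each chunk runs, which is the kind of implicit parallelism a workflow language would expose.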

Page 16: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Computing Challenges

A workflow system may not be able to talk to Cloud resources directly, so middleware services are needed (e.g. Nimbus or Falkon to handle resource provisioning and task dispatching).

It becomes more complicated when considering workflow resource requirements, data dependencies, and Cloud virtualization.

An SWFMS will try to recover automatically when non-fatal errors happen. Smart-return: detailed execution information is logged so that a workflow can be restarted.
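The recovery behavior described above, retrying non-fatal errors and restarting from logged execution information, can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the log is a plain dictionary, and `RuntimeError` stands in for whatever a real system classifies as non-fatal.

```python
def run_workflow(tasks, log, max_retries=2):
    """Run tasks in order, skipping any that the log marks as done.

    tasks: list of (name, callable) pairs.
    log:   dict mapping task name to its result, updated in place so a
           later restart can resume where this run stopped.
    """
    for name, action in tasks:
        if name in log:
            continue  # finished in an earlier run, do not repeat
        for attempt in range(max_retries + 1):
            try:
                log[name] = action()
                break
            except RuntimeError:  # assumed to mark a non-fatal error
                if attempt == max_retries:
                    raise  # give up; the log still holds completed tasks
    return log
```

If the log is persisted between runs, a restart repeats only the tasks that never completed, so expensive upstream stages are not recomputed.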

Page 17: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Data management challenges

As data intensiveness increases, the management of data resources and of the dataflow between storage and compute resources becomes the bottleneck.

Data locality: as CPU cycles become cheaper and data sets inflate, data location becomes the main challenge, rather than computational resources.

Combining compute and data management: data movement needs to be minimized. Otherwise, raw resources will be significantly underutilized.

Provenance: the derivation history of a data product. It must be tracked across service providers and across different abstraction layers. Secure access is another piece that is currently missing.
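Provenance as a derivation history can be pictured as a graph from each data product back to its inputs. The sketch below is illustrative only: the store layout, the file names, and the `lineage` function are hypothetical and do not follow any particular provenance model.

```python
# Hypothetical provenance store: each data product maps to the task that
# produced it and the inputs that task consumed.
provenance = {
    "aligned.bam": {"task": "align", "inputs": ["reads.fq", "ref.fa"]},
    "variants.vcf": {"task": "call", "inputs": ["aligned.bam", "ref.fa"]},
}

def lineage(product, store):
    """Return every upstream data product of `product`, sorted by name."""
    seen = set()
    stack = [product]
    while stack:
        current = stack.pop()
        for parent in store.get(current, {}).get("inputs", []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)
```

In a Cloud setting the entries of such a store would be spread across providers and abstraction layers, which is what makes the tracking problem on the slide harder than this single-dictionary version.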

Page 18: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Service management challenges

The engineering of the components of an SWFMS as services:

thousands of services developed and available for the myExperiment project

the LEAD system has developed a tool to wrap and convert ordinary science applications into services

The orchestration and invocation of services from an SWFMS:

managing the large number of service instances

data movements across different service instances

Page 19: Opportunities and Challenges for Running Scientific Workflows on the Cloud

RESEARCH DIRECTIONS

Emphasis on the workflow reference architecture, with research effort directed to the foregoing layers

Great leap on Middleware development: resource management, monitoring, messaging

Many-task computing (MTC): preliminarily applied on Grids and supercomputers; expected to improve considerably on the Cloud

Scripting: mixture of semantics, combination of application of services…

Cost optimization: very challenging, but rewarding too

Page 20: Opportunities and Challenges for Running Scientific Workflows on the Cloud

RESEARCH DIRECTIONS

SWFMS security

Access control: critical because of the nature of clouds (dynamic, large data and service sharing)

Information flow control: ensure that information related to the scientific workflow is propagated only to authorized ends

Secure electronic transaction protocol: pay-as-you-go pricing model

Page 21: Opportunities and Challenges for Running Scientific Workflows on the Cloud

CONCLUSIONS

As more customers and applications migrate into the Cloud, the need for workflow systems to manage complex tasks will become more urgent

For now, mash-ups and MapReduce-style task management have been acting in place of a workflow system in the Cloud

The opportunities and challenges in bringing workflow systems into the Cloud are discussed

The authors identify key research directions in realizing scientific workflows in Cloud environments

Page 22: Opportunities and Challenges for Running Scientific Workflows on the Cloud

Thank You!