Combining Data-Flow and Control-Flow for Scientific Workflows
CSC 8710-001 – Presentation 2
Mohammed Shahnawaz Ali

Csc8710 001 winter2014-mohammed_shahnawazali-ff2687_presentation_2





CSC 8710 - Presentation 2 2

Executive Summary

• Data-centric scientific workflows are modeled as dataflow process networks
• Establish a generic framework for embedding control-flow-intensive tasks
• Make scientific workflows more robust and reusable

2/19/2014


Objective & Structure

Describe:

• Scientific Workflow Systems
  o Usage & Models
• Actor Oriented Workflow
  o Building Blocks
  o Design
  o Extensions
• A 3-tier architecture framework
  o Design
  o Usage
• Closing Notes
• Next Steps


Scientific Workflow Systems – Usage

• Used to construct and execute complex, data-centric scientific analyses
• Require bringing together data retrieval, computation, and visualization
• Support end-to-end workflow management


Scientific Workflow Systems – Models

• Directed acyclic graph (DAG), with arcs for scheduling dependencies between jobs
• Dataflow process networks, with built-in support for stream-based and concurrent execution
  o Efficient analysis
  o Simple and intuitive
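
As a rough illustration of the DAG model, the sketch below orders jobs so that each one runs only after the jobs it depends on. The job names are hypothetical, and Python's standard `graphlib` module stands in for a real workflow scheduler.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs: each key maps to the set of jobs it depends on.
dependencies = {
    "retrieve": set(),
    "compute": {"retrieve"},
    "visualize": {"compute"},
}

def schedule(deps):
    """Return one execution order that respects every dependency arc."""
    return list(TopologicalSorter(deps).static_order())

order = schedule(dependencies)
print(order)
```

A dataflow process network differs in that components run concurrently and exchange streams of data, rather than running once each in a fixed order.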


Actor Oriented Workflow – Building Blocks

Building blocks of actor-oriented modeling and design:

• Actors: workflow components wired together by ports
  o Composite actors encapsulate sub-workflows
• Director: governs overall execution and component interaction; enables behavioral polymorphism
• Ports: input and output
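
A minimal sketch of these building blocks, assuming a deliberately simplified model. The `Actor` class and the two director classes are hypothetical and are not the API of any particular workflow system. The same actor runs unchanged under either director, which is the idea behind behavioral polymorphism.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Actor:
    """A workflow component with named input and output ports."""
    name: str
    inputs: List[str]
    outputs: List[str]
    fire: Callable[[Dict[str, object]], Dict[str, object]]

class OnceDirector:
    """Fires the actor a single time on one set of port values."""
    def run(self, actor, tokens):
        return actor.fire(tokens)

class StreamDirector:
    """Fires the same actor once per item in a stream of port values."""
    def run(self, actor, stream):
        return [actor.fire(tokens) for tokens in stream]

# Hypothetical actor: reads port "x", writes twice its value to port "y".
double = Actor("double", ["x"], ["y"], lambda t: {"y": 2 * t["x"]})

once = OnceDirector().run(double, {"x": 3})
streamed = StreamDirector().run(double, [{"x": 1}, {"x": 2}])
```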


Actor Oriented Model – Design

• Actor-oriented workflow graph: W = ‹A, D› [A: actors, D: dataflow connections]
• Signature of an actor A: Σ_A = in(A) → out(A)
• A dataflow connection d ∈ D is a directed hyperedge d = ‹o, i› [o: output, i: input] with three steps:
  1. Merge step
  2. Copy step
  3. Delivery step
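
The slide names the three steps without defining them, so the sketch below rests on an assumed reading: o and i are sets of ports, merge gathers the tokens from every output port in o, copy duplicates them for each input port in i, and delivery appends each copy to that port's queue. The function and port names are hypothetical.

```python
from collections import deque

def transmit(d, produced, queues):
    """Move tokens across one connection d = (o, i)."""
    outputs, inputs = d
    # 1. Merge: gather the tokens produced on every output port in o.
    merged = [tok for port in outputs for tok in produced.get(port, [])]
    for port in inputs:
        # 2. Copy: each input port in i gets its own copy of the merged tokens.
        copy = list(merged)
        # 3. Delivery: append the copy to that port's queue.
        queues.setdefault(port, deque()).extend(copy)
    return queues

# Hypothetical connection from two output ports to two input ports.
d = (("a.out", "b.out"), ("c.in", "d.in"))
queues = transmit(d, {"a.out": [1, 2], "b.out": [3]}, {})
```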


Actor Oriented Model – Design (cont’d)

• A composite actor A_W encapsulates a sub-workflow W.
• The ports of A_W consist of a set of W's ports.
• A hierarchical workflow contains at least one composite actor, with any level of nesting.
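
One way to picture composite actors and nesting is the sketch below. The `Actor` and `Workflow` classes and the helper functions are hypothetical; they only illustrate the definitions above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Actor:
    name: str
    sub_workflow: Optional["Workflow"] = None  # set only for composite actors

@dataclass
class Workflow:
    actors: List[Actor] = field(default_factory=list)

def is_hierarchical(w):
    """A workflow is hierarchical if it holds at least one composite actor."""
    return any(a.sub_workflow is not None for a in w.actors)

def depth(w):
    """Nesting depth: 1 for a flat workflow, plus 1 per level of composites."""
    inner = [depth(a.sub_workflow) for a in w.actors if a.sub_workflow is not None]
    return 1 + max(inner, default=0)

flat = Workflow([Actor("align"), Actor("plot")])
nested = Workflow([Actor("load"), Actor("analysis", sub_workflow=flat)])
```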


Actor Oriented Model – Extensions

Two extensions to actor-oriented modeling:

• Frames: an abstraction that denotes a set of alternative actor implementations with similar functionality.
• Templates: an abstraction for a set of workflows that specifies the behavior of the workflows it represents.


Actor Oriented Model – Frames

• Used as abstractions for a family of components with similar function.
• Act as placeholders for components that will be instantiated and specialized later.
• Have input, output, and parameter ports with structural types and semantic types, which together make up the frame signature.


Actor Oriented Model – Frames (cont’d)

• An embedding F[C] of a component C in a frame F is a set of port pairings: F[C] ⊆ ports(F) × ports(C)
• The embedded component may:
  o introduce new ports
  o not use all the ports
• Parameter ports can also be connected to input ports, and vice versa


Actor Oriented Model – Frames (cont’d)

• An embedding F[C] is well-formed if the input and output port directions are observed.
• A well-formed embedding may be structurally well-typed and/or semantically well-typed.
• The typing rules can be relaxed when frames occur within a workflow.
• Frames provide a natural mechanism for executing the associated actors in parallel.
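
The well-formedness condition can be sketched as a direction check over the port pairings. This is an illustrative reading, not a definitive rule set: port kinds are encoded as the strings "in", "out", and "param", parameter and input ports are treated as interchangeable as the slides allow, and every port and component name is hypothetical.

```python
def is_well_formed(frame_ports, comp_ports, embedding):
    """Check that every pairing in F[C] respects port direction."""
    consuming = {"in", "param"}  # parameter and input ports may be paired
    for f_port, c_port in embedding:
        f_kind, c_kind = frame_ports[f_port], comp_ports[c_port]
        both_consume = f_kind in consuming and c_kind in consuming
        both_produce = f_kind == "out" and c_kind == "out"
        if not (both_consume or both_produce):
            return False
    return True

# Hypothetical frame F and component C; C adds a port ("log") that F lacks.
frame = {"seqs": "in", "gap_penalty": "param", "alignment": "out"}
component = {"input": "in", "penalty": "in", "result": "out", "log": "out"}

good = {("seqs", "input"), ("gap_penalty", "penalty"), ("alignment", "result")}
bad = {("seqs", "result")}  # pairs an input port with an output port
```

Structural and semantic typing would add further checks on top of this direction test; they are left out here because the slides do not spell out their rules.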


Actor Oriented Model – Workflow Templates

Workflow Template

• Specifies the behavior of the workflows it represents.
• Signature: Σ_T : in(T) → out(T)
• Includes an “inner” workflow graph W_T in which some of the components are frames.


Actor Oriented Model – Workflow Templates (cont’d)

Workflow Template

• T represents a partial workflow specification.
• Frames can be independently specialized by embedding components.
• The resulting embedding is either:
  o a concrete, executable workflow, or
  o a template
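
A small sketch of how specializing frames one at a time yields either another template or a concrete workflow. Representing a template as a dictionary from frame name to embedded component is an assumption made for illustration, and the frame and component names are hypothetical.

```python
def specialize(template, frame, component):
    """Return a copy of the template with one frame filled by a component."""
    if frame not in template:
        raise KeyError(f"unknown frame: {frame}")
    filled = dict(template)
    filled[frame] = component
    return filled

def is_concrete(template):
    """Executable only once every frame has an embedded component."""
    return all(component is not None for component in template.values())

# Hypothetical template with two frames, both still empty.
template = {"Align": None, "Render": None}

partial = specialize(template, "Align", "aligner_v1")    # still a template
concrete = specialize(partial, "Render", "plotter_v1")   # now executable
```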


Actor Oriented Model – Transducer Templates

Transducer Template

• The template T can impose constraints by providing one or more directors.
• An inscribed FST director indicates that the workflow graph W_T is executed as a finite state transducer.
• The director dictates:
  1. The execution model
  2. Constraints on the graph


Generic Control-Flow Component Pattern – Objective

Objective:

Structure frames and templates so that they can be executed using:

1. Alternative control behavior
2. Alternative task implementations


Generic Control-Flow Component Pattern – Design

Design:

Consists of three tiers/levels:

1. Level 1:
  • A frame within a dataflow graph that denotes a particular task.
  • Can be embedded with finite state transducer templates.
2. Level 2:
  • Transducer templates for control-flow behavior.
  • Each has one or more state frames.
  • Offer a more natural, intuitive, and succinct language for expressing control-flow.


Generic Control-Flow Component Pattern – Design (cont’d)

Design:

3. Level 3:
  • State frames, each of which can be embedded with a particular task implementation.

An FST is a tuple M = ‹I, O, Q, q0, T›.
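
To make the tuple concrete, the sketch below reads its components in the conventional way: I is the input alphabet, O the output alphabet, Q the set of states, q0 the initial state, and T the transitions. It assumes a deterministic T stored as a dictionary from (state, input) to (next state, output), which the slides do not specify. The failover behavior and all symbol names are hypothetical.

```python
def run_fst(fst, inputs):
    """Feed input symbols through the transducer and collect its outputs."""
    I, O, Q, q0, T = fst
    state, outputs = q0, []
    for symbol in inputs:
        if symbol not in I:
            raise ValueError(f"symbol not in input alphabet: {symbol!r}")
        state, out = T[(state, symbol)]
        outputs.append(out)
    return state, outputs

# Hypothetical failover behavior: switch data source whenever a task fails.
M = (
    {"ok", "fail"},                           # I: input alphabet
    {"use_primary", "use_backup", "switch"},  # O: output alphabet
    {"primary", "backup"},                    # Q: states
    "primary",                                # q0: initial state
    {                                         # T: (state, input) -> (next state, output)
        ("primary", "ok"): ("primary", "use_primary"),
        ("primary", "fail"): ("backup", "switch"),
        ("backup", "ok"): ("backup", "use_backup"),
        ("backup", "fail"): ("primary", "switch"),
    },
)

final_state, outputs = run_fst(M, ["ok", "fail", "ok"])
```

In the three-tier pattern, each state in Q would correspond to a state frame whose task implementation is chosen separately from this control behavior.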


Generic Control-Flow Component Pattern – Usage

Usage:

The implementation enables workflow designers to configure both the behavior and the underlying implementation.

Specifically, a workflow designer can:

1. Insert a generic component into a workflow.
2. Select an available transducer template behavior.
3. Select task implementations for the state frames and templates.
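
The three designer steps can be sketched as a single configuration call. The `configure` function, the dictionary layout, and the names "FetchData", "failover", "http_fetch", and "cache_fetch" are all hypothetical; the sketch only shows that behavior and implementation are chosen independently.

```python
def configure(component, template, implementations):
    """Bind a behavior and task implementations to a generic component."""
    missing = set(template["state_frames"]) - set(implementations)
    if missing:
        raise ValueError(f"no implementation for state frames: {sorted(missing)}")
    return {
        "component": component,          # step 1: the generic component (level 1)
        "behavior": template["name"],    # step 2: the transducer template (level 2)
        "tasks": dict(implementations),  # step 3: state-frame implementations (level 3)
    }

# Hypothetical transducer template with two state frames.
template = {"name": "failover", "state_frames": ["primary", "backup"]}

configured = configure(
    "FetchData",
    template,
    {"primary": "http_fetch", "backup": "cache_fetch"},
)
```

Swapping the template changes the control behavior, and swapping an entry in the implementations changes only the task behind one state frame.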


Closing Notes

• Scientific workflows are primarily dataflow-oriented, but certain workflows can be control-intensive.
• The generic framework describes how to support structured embedding of generic control-flow components within dataflow process networks.
• Frames and templates can be used to develop robust workflows via reusable control-intensive subtasks.


Next Steps

• Fully integrate frames and templates as first-class modeling constructs.
• Develop additional transducer templates and lower-level implementation components.
• Explore mechanisms for easily combining transducer templates.


Acknowledgements

• Shawn Bowers – UC Davis Genome Center, University of California, Davis.
• Bertram Ludäscher – UC Davis Genome Center, University of California, Davis.
• Anne H.H. Ngu – Department of Computer Science, Texas State University.
• Terence Critchlow – Center for Applied Scientific Computing, Lawrence Livermore National Laboratory.


Thank You
