8/19/2019 Building Scientific Workflows on the Grid- A Comparison between OpenMole and Taverna
Building Scientific Workflows on the Grid: A Comparison
between OpenMole and Taverna
Bojana Koteska Boro Jakimovski Anastas Mishev
November 2014, Bucharest, Romania
Overview
• Scientific workflows
• Characteristics of Scientific Workflow Systems:
  OpenMole and Taverna
• Workflow Implementation in OpenMole and Taverna
• Conclusion and future work
Scientific workflows
• workflows needed for different scientific applications and
  scientific experiments
• designing, automating, controlling and managing the complex
  flow of data and processes
• scientific communities - workflow systems to perform
  experiments with a large set of data and computations
• business companies - workflow systems to automate the
  manufacturing process and to avoid redundant tasks.
Scientific workflows
• Workflow - automation of a process during which documents,
  information, or tasks are passed from one participant to another
  for action, according to a set of procedural rules.
• Workflow engine - software service or "engine" that provides
  the run time execution environment for a workflow instance.
• Workflow management system - a system that defines,
  creates, and manages the execution of workflows through the
  use of software running on one or more workflow engines.
Grid
• coordinated use of many heterogeneous and distributed
  resources
• solving problems in many computing and data intensive
  scientific applications (physics, biology and chemistry)
• large scalability, transparent and consistent access to
  resources, distributed supercomputing support,
  high-throughput computing support, data-intensive computing
  support
OpenMole
• parallel execution environments for naturally parallel
  processes
• advanced numerical experiments on simulation models
• distribution of the workflows on multi-core machines,
  desktop-grids, clusters and grids
• scalable and supports TB of data and millions of tasks.
Mole components
• Tasks
• Transitions
• Prototypes
• Samplings
• Environments
• Hooks
• Sources
• In order to be run, a mole must contain at least one task
  and a starting task (capsule)
Mole components
• Prototype - a variable that operates in the workflow (must
  be a part of a task: input or output)
  - name, type (Integer, String, etc.), dimension (0 - scalar, 1 - vector,
    2 - matrix, etc.).
• Task - several types of tasks: Groovy, Exploration,
  SystemExec, Mole Task, Agent based model, Sensitivity,
  and Stat.
  - name, input and output slots (for receiving or sending a prototype to
    another task), environment (to assign computing resources to a
    given task - grid, cluster, ...)
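The prototype description above can be sketched as a small data structure. This is a minimal Python illustration, not OpenMole's actual API: the class name `Prototype`, the field `kind` and the method `accepts` are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a prototype: a named, typed variable with a dimension.
@dataclass(frozen=True)
class Prototype:
    name: str
    kind: type          # e.g. int, str
    dimension: int = 0  # 0 = scalar, 1 = vector, 2 = matrix

    def accepts(self, value) -> bool:
        """Check that `value` is nested `dimension` levels deep and holds `kind`."""
        def check(v, depth):
            if depth == 0:
                return isinstance(v, self.kind)
            return isinstance(v, list) and all(check(x, depth - 1) for x in v)
        return check(value, self.dimension)


i = Prototype("i", int)                      # scalar integer
files = Prototype("file", str, dimension=1)  # vector of strings

print(i.accepts(1))
print(files.accepts(["a.txt", "b.txt"]))
print(files.accepts("a.txt"))
```

A task's input and output slots could then be modelled as lists of such prototypes, so that a transition is only valid when the receiving slot accepts what the sending slot produces.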
Mole components - Tasks
• Groovy task - runs small pieces of script code or calls code running on
  the JVM (Java, Scala, ...)
• Exploration task - embeds a sampling which provides an array
  of combinations of prototypes at runtime.
• SystemExec task - executes C, C++ and Python code.
• Mole task - runs another Mole. This makes it possible to embed a Mole within a Task.
• Stat task - calculates the average, the sum and the median of a prototype with
  dimension one.
• Sensitivity task - enables a variance-based method
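What the Stat task computes can be shown in a few lines. The sketch below is plain Python using the standard `statistics` module, not OpenMole code; the function name `stat_task` and the sample values are assumptions made for illustration.

```python
import statistics


def stat_task(values):
    """Return the average, sum and median of a one-dimensional prototype."""
    return {
        "average": statistics.mean(values),
        "sum": sum(values),
        "median": statistics.median(values),
    }


# Illustrative input: a prototype of dimension one holding three numbers.
result = stat_task([2, 4, 9])
print(result)
```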
Mole components
• Sampling - can be composed graphically
  - many kinds of samplings: complete, shuffle, zip, combine, ...
  - and domains: range, multiple file, single file, uniform distribution,
    logarithmic range domain, etc.
• Hook - a Mole listener. It performs a particular action on the
  output.
• Source - reads data from a CSV file and maps it to
  prototypes included in the workflow
Taverna
• available as a desktop client application (Taverna
  Workbench), Taverna Server and command line
• integrated with two other myGrid tools: myExperiment
  (workflow sharing environment) and BioCatalogue
  (catalogue of Web services for Life Sciences)
• scalable scientific workflow system which has access to
  local and remote resources and grid services (more than
  900).
Taverna
• support for calling client libraries (in Ruby and Java) and
  tools and scripts on local or remote machines.
• users define how data will flow between services, but they
  do not need to care how the services will be invoked.
• workflows can be shared on myExperiment, and workflows from
  other users can be searched and downloaded.
Taverna components
• Services:
  - WSDL-style Web services
  - XPath services
  - OAuth services
  - component services
  - string constant (for setting a fixed-value input for a service).
Taverna components
• Lists and iterations - Services in Taverna can return single
  values or lists of values or lists of lists.
• Loop - usually used for invoking asynchronous services -
  a web service or similar where an action should be repeated.
• Control link - used to set dependencies between services
  that do not directly share data
• Merges - combination of outputs from several services into
  one single input.
• Parallel service invocation - used when the same service
  should be invoked several times in parallel.
• Component - a reusable unit of functionality which is
  defined by a workflow.
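The "lists and iterations" idea can be illustrated as follows: when a service that expects a single value receives a list (or a list of lists), it is invoked once per element and the results keep the shape of the input. This is a minimal Python sketch of that behaviour, not Taverna code; the helper names `depth` and `invoke` and the parameter `port_depth` are hypothetical.

```python
def depth(value):
    """Nesting depth of a value: 0 for a single value, 1 for a list, and so on."""
    if not isinstance(value, list):
        return 0
    return 1 + (depth(value[0]) if value else 0)


def invoke(service, value, port_depth=0):
    """Call `service`, iterating over `value` while it is deeper than the port expects."""
    if depth(value) > port_depth:
        return [invoke(service, item, port_depth) for item in value]
    return service(value)


# A service that takes a single string (port depth 0).
print(invoke(str.upper, "a"))
print(invoke(str.upper, ["a", "b"]))
print(invoke(str.upper, [["a"], ["b", "c"]]))
```

Because every element is handled by an independent call, the same structure also lends itself to invoking the service several times in parallel.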
Comparison between OpenMole and Taverna
• similarities in terms of functionality
• differ slightly in the way of their implementation
• Taverna - addition of existing Web services, the automatic
  conversion of the list depths in order to connect two
  services
• OpenMole - simpler definition of composite inputs (e.g.
  making combinations of the elements in an array)
The workflow
• a circle represents a file containing some data
• squares a1, a2, a3, ... a64 - file actions - merging of file
  contents
• combinations of 3 files are always chosen
• two copies of the file list in the directory
• each file list is shuffled and the first four files of each of the
  three file lists are chosen
• "folderA" represents the directory where the combinations
  are chosen
• the result of each action a1, a2, a3, ... a64 is a file saved in a
  new directory "folderB"
The workflow
• actions a1, a2, a3, ... a64 are performed in parallel
• when three files are created in "folderB", the action b is
  performed - merges the contents of the 3 files into a new file
• files in "folderB" are deleted
• b is repeated 64/3 ≈ 21 times in each iteration and it is
  performed in parallel with the actions a1, a2, a3, ... a64,
  immediately after creating 3 new files in "folderB"
• The new file is moved to "folderA"
• the workflow can be executed repeatedly
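One iteration of the workflow described on these slides can be simulated with ordinary file operations. The sketch below is a sequential Python approximation under stated assumptions: four files are taken from each of three shuffled file lists (4 x 4 x 4 = 64 combinations), the a and b actions run one after another rather than in parallel, and the function name `run_iteration` and all file names are hypothetical.

```python
import itertools
import random
import tempfile
from pathlib import Path


def run_iteration(folder_a, folder_b, rng, iteration):
    """Run one workflow iteration and return how many times action b ran."""
    files = sorted(p for p in folder_a.iterdir() if p.is_file())
    # Three file lists (the original and two copies), each shuffled; take the first four.
    picks = []
    for _ in range(3):
        copy = list(files)
        rng.shuffle(copy)
        picks.append(copy[:4])
    merges = 0
    # Actions a1, a2, ...: merge the contents of each combination of 3 files into folderB.
    for n, combo in enumerate(itertools.product(*picks), start=1):
        text = "".join(p.read_text() for p in combo)
        (folder_b / f"a{n}.txt").write_text(text)
        # Action b: once three files exist in folderB, merge them, delete them,
        # and place the merged file in folderA.
        ready = sorted(folder_b.iterdir())
        if len(ready) == 3:
            merged = "".join(p.read_text() for p in ready)
            for p in ready:
                p.unlink()
            merges += 1
            (folder_a / f"b_{iteration}_{merges}.txt").write_text(merged)
    return merges


with tempfile.TemporaryDirectory() as tmp:
    folder_a = Path(tmp, "folderA")
    folder_b = Path(tmp, "folderB")
    folder_a.mkdir()
    folder_b.mkdir()
    for k in range(4):
        (folder_a / f"data{k}.txt").write_text(f"{k}\n")
    merges = run_iteration(folder_a, folder_b, random.Random(0), iteration=1)
    count_a = len(list(folder_a.iterdir()))
    count_b = len(list(folder_b.iterdir()))
    print(merges, count_a, count_b)
```

With four starting files, 64 combinations give 21 complete groups of three, so one file is left over in "folderB" at the end of the iteration.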
The Workflow Implementation in OpenMole
• "defining i" capsule - Groovy task, setting the integer prototype i to
  1, number of workflow iterations, output i
• "combinations" capsule - exploration task, defining the sampling
  - Three multiple file domains "in folderA" and three file prototypes "file",
    "file1" and "file2"
  - All three multiple file domains have the same directory path
  - The samplings "zip with string name", "zip with string1 name" and "zip
    with string2 name" are used for making arrays of file names in "folderA".
  - Each array is shuffled (Shuffle sampling) and the first four elements are taken
    (Take (4) sampling).
  - The four file names of each array are combined (Combine sampling)
  - The combinations (4 x 4 x 4) are made.
  - 7 outputs from this task: the prototype i and arrays (file[1],
    file1[1], file2[1], string[1], string1[1] and string2[1]).
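The way the samplings are chained in the "combinations" capsule (zip with name, shuffle, take, combine) can be mimicked with small functions. This is a hedged Python sketch and not OpenMole's sampling API: the function names `zip_with_name`, `shuffle`, `take` and `combine`, and the example file names, are assumptions made for illustration.

```python
import itertools
import random
from pathlib import Path


def zip_with_name(paths):
    """Pair each file path with its file name."""
    return [(p, Path(p).name) for p in paths]


def shuffle(sampling, rng):
    """Return the sampling's elements in random order."""
    items = list(sampling)
    rng.shuffle(items)
    return items


def take(sampling, n):
    """Keep only the first n elements."""
    return list(sampling)[:n]


def combine(*samplings):
    """Build every combination of one element from each sampling."""
    return list(itertools.product(*samplings))


rng = random.Random(42)
domain = [f"folderA/f{k}.txt" for k in range(1, 6)]  # hypothetical directory listing

# One shuffled, truncated sampling per file prototype (file, file1, file2).
per_prototype = [take(shuffle(zip_with_name(domain), rng), 4) for _ in range(3)]
combinations = combine(*per_prototype)
print(len(combinations))
```

Each element of `combinations` holds three (path, name) pairs, which corresponds to one set of values for the three file prototypes and their three name strings.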
The Workflow Implementation in OpenMole
• capsule "comb. 3 files" - Groovy task, includes a ...
The Workflow Implementation in OpenMole
• capsule "increase i" - Groovy task used for increasing the
  integer prototype i.
  - transition between "comb. 3 first" and this capsule is an aggregation
  - executed once in each workflow iteration.
• capsule "comb. 3 first" - Groovy task, includes a ...
The Workflow Implementation in Taverna
• "Workflow 1" - nested workflow - executed n times.
  - input n - list of integer numbers.
• integer input port ni - increasing in each workflow iteration
• Service "DirectoryPath" - beanshell script - the path of the
  directory "folderA" is specified as a string "directory".
• 3 services "List Files", "List Files 1", "List Files 2" - identical,
  but used for making the same array of files in the input
  directory.
• Each array is shuffled and first four elements are taken (Java code)
  - random combinations, each of the three files
• Output of each service - list of four files.
The Workflow Implementation in Taverna
• Service "concatenate and write in files by 3 Files"
  - Inputs with length 0 - it takes one by one element of each
    of the three lists of four files (combinations)
• Also a beanshell script and it merges the contents of each
  combination of three files - repeated 64 times (a1, a2, a3, ... a64).
• Service "combine first three" - checking if three files exist
  and merging them in the directory "folderB"
  - Executed in parallel with the previous service "concatenate and write
    in files by 3 Files".
  - contents are merged into a new file
Conclusion and Future Work
• analyzed the most important characteristics of the two workflow
  systems
• Different implementation logic
  - Taverna offers natural parallelism of processes
  - in OpenMole the user should explicitly define that
• Taverna - more suitable for designing complex workflows
  (plenty of predefined services)
• OpenMole - simple and interactive interface, useful for people
  that do not have experience in designing workflows, defining
  samples automatically
• Future work: analyzing and measuring the execution times of more complex
  workflows in both workflow systems.
Thank You for the attention