Scheduling Algorithms for new Emerging Applications May, 29th - June, 2nd 2006, CIRM, Marseille, France
Advances in a Dag-Scheduling Theory for Internet-Based Computing
Gennaro Cordasco, University of Salerno, Italy
Collaborators: Arnold L. Rosenberg (Univ. of Massachusetts Amherst), Greg Malewicz (Google Inc.), Matt Yurkewych (Univ. of Massachusetts Amherst)
Outline
• Motivation
• The Internet-Based Computing (IC) Pebble Game
• IC-Optimal Schedules for Common Dags
• Decomposition-Based Scheduling Theory
– Priority Relation
– Dag Composition
• Expanding the repertoire of building-block dags
• The ICO-Sweep algorithm: a decision procedure for sums of dags
• Project in Progress
Motivation
Internet-Based computing platform
– Web Computing
– Grid Computing
– P2P Computing
Why Internet-Based Computing?
– Remuneration (Commercial computational grids)
– Reciprocation (Open computational grids)
– Altruism (e.g. FightAIDS@Home)
– Curiosity (e.g. SETI@home)
Internet-Based Computing (IC)
The “owner” of a massive job enlists the aid of remote clients to compute the job’s tasks.
The owner (server) allocates tasks to clients one task at a time.
A client receives its (k+1)-th task after returning the results from its k-th task.
Challenges in Internet-Based Computing
Focus on jobs that have inter-task dependencies (modeled as dags)
– we want to enhance their utilization.
Unfortunately, IC platforms are characterized by a temporal unpredictability:
– communication takes place over the Internet
– remote clients are not dedicated, hence can be unexpectedly slow
Temporal unpredictability precludes the use of “standard” strategies that were developed for older platforms.
An Avenue of Idealization
• Fact
Without further assumptions, adversarial Clients can thwart any strategy the Server adopts.
• Fact (Buyya-Abramson-Giddy, Kondo-Casanova-Wing-Berman, Sun-Wu)
Monitoring a client's past performance and present resources allows one to:
– mitigate the degree of temporal unpredictability;
– match task complexity to client resources.
• Idealization
Via monitoring, one can “approximately” ensure that the temporal unpredictability of clients affects the timing, but not the order, of task executions.
The Formal Idealization
Assumption: Tasks are executed in the order of allocation.
This assumption allows us to:
– let “time” be event-driven (execute a node at each “step”)
– derive scheduling guidelines that are totally under the control of the Server.
The IC Pebble Game
A dag G = (N, A) is used to model a computation (computation-dag):
– each node v ∈ N represents a task in the computation;
– an arc (u → v) ∈ A represents the dependence of task v on task u: v cannot be executed until u is.
– Given an arc (u → v) ∈ A, u is a parent of v, and v is a child of u in G.
Each parentless node of G is a source (node), and each childless node is a sink (node).
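A minimal sketch of these definitions, assuming a dag is given as a list of (u, v) arcs. The function name and the four-node "diamond" example are illustrative and do not come from the talk.

```python
from collections import defaultdict

def sources_and_sinks(arcs):
    """Return (sources, sinks) of the dag given as a list of (u, v) arcs."""
    parents = defaultdict(set)   # node -> set of its parents
    children = defaultdict(set)  # node -> set of its children
    nodes = set()
    for u, v in arcs:
        parents[v].add(u)
        children[u].add(v)
        nodes.update((u, v))
    sources = sorted(n for n in nodes if not parents[n])   # parentless nodes
    sinks = sorted(n for n in nodes if not children[n])    # childless nodes
    return sources, sinks

# Hypothetical "diamond" computation: a feeds b and c, which both feed d.
arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(sources_and_sinks(arcs))
```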
The IC Pebble Game
The Players
– a single Server, S (the owner)
– a (finite or infinite) set of Clients, C1, C2, …
S has unlimited supplies of two types of pebbles:
– ELIGIBLE pebbles, whose presence indicates a task eligible for execution
– EXECUTED pebbles, whose presence indicates an executed task
The IC Pebble Game
Rules of the Game
1. S begins by placing an ELIGIBLE pebble on each unpebbled source of G.
(Unexecuted sources are always eligible for execution, having no parents on whose prior execution they depend.)
2. At each step, S
– selects a node that contains an ELIGIBLE pebble,
– replaces that pebble by an EXECUTED pebble,
– places an ELIGIBLE pebble on each unpebbled node of G all of whose parents contain EXECUTED pebbles.
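The two rules can be sketched as a small simulator. This is an illustration under assumptions, not code from the talk: the dag is given as a map from each node to its set of parents, and `play` and the diamond dag are hypothetical names and data.

```python
def play(parents, schedule):
    """Play the IC Pebble Game; return the ELIGIBLE counts after steps 0, 1, 2, ..."""
    executed = set()
    # Rule 1: every source (node without parents) starts with an ELIGIBLE pebble.
    eligible = {v for v, ps in parents.items() if not ps}
    profile = [len(eligible)]
    for v in schedule:
        if v not in eligible:
            raise ValueError(f"{v!r} is not ELIGIBLE at this step")
        # Rule 2: replace the ELIGIBLE pebble by an EXECUTED pebble ...
        eligible.remove(v)
        executed.add(v)
        # ... and pebble every unpebbled node whose parents are all EXECUTED.
        for w, ps in parents.items():
            if w not in executed and w not in eligible and ps <= executed:
                eligible.add(w)
        profile.append(len(eligible))
    return profile

# Hypothetical diamond dag: a -> b, a -> c, b -> d, c -> d.
parents = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(play(parents, ["a", "b", "c", "d"]))
```

The returned list is the quantity the following slides use to compare schedules: the number of ELIGIBLE nodes at each step.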
IC Quality/Optimality of a Dag-Schedule
The IC quality of a schedule for a dag:
• the rate of producing ELIGIBLE nodes: the larger, the better.
A schedule for a dag is IC optimal (ICO) when:
• it maximizes the number of ELIGIBLE nodes at every step.
How Important is IC Quality?
• Consider the following dag: [figure not recovered; legend: ELIGIBLE, EXECUTED, unpebbled]
Non-optimal schedule: never more than 2 ELIGIBLE nodes
Optimal schedule: roughly √t ELIGIBLE nodes at step t
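The slide's figure was not recovered. The sketch below assumes the dag is a 2-D evolving mesh in which node (i, j) has children (i+1, j) and (i, j+1), which matches both claims above. It compares a schedule that runs along one edge of the mesh with one that proceeds level by level; all names are illustrative.

```python
def parents(node):
    """Parents of (i, j) in the assumed evolving mesh rooted at (0, 0)."""
    i, j = node
    return [p for p in ((i - 1, j), (i, j - 1)) if p[0] >= 0 and p[1] >= 0]

def eligible_counts(schedule):
    """Number of ELIGIBLE nodes after each step of the given execution order."""
    executed, counts = set(), []
    for node in schedule:
        assert all(p in executed for p in parents(node)), "node is not ELIGIBLE"
        executed.add(node)
        # Only children of executed nodes can be ELIGIBLE.
        candidates = {c for i, j in executed for c in ((i + 1, j), (i, j + 1))}
        counts.append(sum(
            1 for c in candidates
            if c not in executed and all(p in executed for p in parents(c))
        ))
    return counts

steps = 10
row_first = [(i, 0) for i in range(steps)]                               # along one edge
by_levels = [(i, lvl - i) for lvl in range(4) for i in range(lvl + 1)]   # diagonal levels
print(eligible_counts(row_first))
print(eligible_counts(by_levels))
```

Under this assumption the edge-following schedule stays at 2 ELIGIBLE nodes, while the level-by-level schedule has k + 2 ELIGIBLE nodes once level k is complete, that is, after (k+1)(k+2)/2 steps.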
Optimality is not always possible
For each step t of a play of the Game on a dag G under a schedule Σ: E_Σ(t) denotes the number of nodes of G that contain ELIGIBLE pebbles at step t.
Consider the following dag: [figure not recovered; its sources are s1, s2, s3]
• For every schedule Σ, E_Σ(0) = 3;
• max_Σ E_Σ(1) = E_Σ'(1) = 3 (where Σ' = (s1, s2, s3) or (s1, s3, s2));
• but E_Σ'(2) = 2.
• However, max_Σ E_Σ(2) = E_Σ''(2) = 3 (where Σ'' = (s2, s3, s1) or (s3, s2, s1)).
No schedule that is maximal at step 1 is maximal at step 2.
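The pictured dag is not available, so the following brute-force check uses a hypothetical dag chosen to reproduce the numbers on this slide: s1 has one child a, while s2 and s3 share the two children b and c. The sketch enumerates every order of the non-sinks and looks for one that attains the per-step maximum at all steps.

```python
from itertools import permutations

def eligible_profile(parents, order):
    """E(0), E(1), ... for the given order, or None if the order is illegal."""
    executed = set()

    def eligible():
        return {v for v, ps in parents.items() if v not in executed and ps <= executed}

    profile = [len(eligible())]
    for v in order:
        if v not in eligible():
            return None
        executed.add(v)
        profile.append(len(eligible()))
    return profile

def ic_optimal_orders(parents):
    """Return (per-step maxima, list of orders of the non-sinks attaining all of them)."""
    nonsinks = sorted(set().union(*parents.values()))
    profiles = {}
    for order in permutations(nonsinks):
        p = eligible_profile(parents, order)
        if p is not None:
            profiles[order] = p
    best = [max(column) for column in zip(*profiles.values())]
    return best, [o for o, p in profiles.items() if p == best]

# Hypothetical dag consistent with the slide: s1 -> a; s2, s3 -> b; s2, s3 -> c.
dag = {"s1": set(), "s2": set(), "s3": set(),
       "a": {"s1"}, "b": {"s2", "s3"}, "c": {"s2", "s3"}}
best, optimal = ic_optimal_orders(dag)
print(best, optimal)
```

For this dag the list of IC-optimal orders is empty: starting with s1 is the only way to reach 3 ELIGIBLE nodes at step 1, and starting with s2 and s3 is the only way to reach 3 at step 2.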
Our Contribution
Under idealized assumptions:
Optimal scheduling strategies for familiar classes of dags:
• 2-D evolving meshes
• 2-D reduction meshes
• (binary) reduction trees
• FFT dags
IC-Optimal Schedules for Common Dags
• Theorem. [R04]
A schedule for an evolving mesh-dag (2-D or 3-D), a reduction-mesh, a reduction-tree, or an FFT dag is IC optimal iff it is parent oriented.
Parent oriented: A node's parents are executed sequentially.
• Meshes: level by level, sequentially across each level
• Tree-dags: in “sibling pairs” (nodes that share a child)
• FFT-dags: in “butterfly pairs” (nodes that share two children)
Decomposition-Based Scheduling Theory
Construct dags from schedulable “building blocks”
1. Choose bipartite “building block” dags that have optimal schedules.
Theorem. [MRY06]
Any schedule for these building blocks that executes all sources sequentially is IC optimal.
The Priority relation
Let two dags G1 and G2 admit IC-optimal schedules Σ1 and Σ2, respectively. Then G1 ▷ G2 means that the schedule that entirely executes G1's non-sinks and then entirely executes G2's non-sinks is at least as good as any other schedule that executes both G1 and G2 (i.e., their sum G1 + G2).
Lemma. [MRY06] The relation ▷ is transitive.
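One way to make "at least as good" checkable is sketched below. It assumes that quality is compared step by step in ELIGIBLE nodes: for every way of executing x non-sinks of G1 and y of G2, the same x + y executions yield at least as many ELIGIBLE nodes when G1 is served first. The inequality, the function name, and the example profiles are an illustrative reading, not a quotation from the slide.

```python
def has_priority(e1, e2):
    """Check the assumed priority condition for G1 over G2.

    e1[x] is the number of ELIGIBLE nodes of G1 after its IC-optimal schedule
    has executed x non-sinks; e2[y] is the same for G2.
    """
    n1, n2 = len(e1) - 1, len(e2) - 1
    for x in range(n1 + 1):
        for y in range(n2 + 1):
            first = min(n1, x + y)        # spend the x + y steps on G1 first
            rest = x + y - first          # and only the remainder on G2
            if e1[x] + e2[y] > e1[first] + e2[rest]:
                return False
    return True

# Hypothetical profiles: a single source with 3 children, and one with 2 children.
three_children = [1, 3]
two_children = [1, 2]
print(has_priority(three_children, two_children))  # larger block first
print(has_priority(two_children, three_children))  # smaller block first
```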
Examples of priorities
[Figures not recovered: three pictured examples of ▷-priority, captioned “… over itself”, “… over …”, and “longer … over …”]
Priorities among some Building Blocks
Dag Composition
Let G1 and G2 be two dags. The composition of G1 and G2 is obtained by merging some k sources of G2 with some k sinks of G1.
The dag obtained is composite of type G1 ⇑ G2.
Composition is associative.
[Example figure not recovered]
Dag Composition
Our familiar dags are compositions of building blocks. E.g.: [figure not recovered]
Dag Composition
Theorem. [MRY06] If the dag G is a composition of connected bipartite dags {Gi}, 1 ≤ i ≤ n, of type G1 ⇑ … ⇑ Gn, and if G1 ▷ … ▷ Gn (G is a ▷-linear composition), then executing G by executing the Gi in ▷-order is IC optimal.
Dag Composition
Composite dags that admit IC-optimal schedules can be very nonuniform in structure. E.g.:
IC-Optimality via Dag-Decomposition
We need a framework that allows us to convert a “real” dag into a simplified, decomposed one.
IC-Optimality via Dag-Decomposition
Step 1: Eliminate all shortcuts from G
Lemma. Every dag G has a unique transitive skeleton, written here Sk(G), which can be found in polynomial time.
Theorem. [MRY06] If a schedule Σ is IC-optimal for Sk(G), then it is IC-optimal for G.
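A minimal sketch of Step 1, assuming that a shortcut is an arc (u → v) for which some other path from u to v exists, so that the skeleton is the transitive reduction of the dag. The function name is illustrative, and the per-arc search is the straightforward polynomial-time approach rather than a tuned one.

```python
def transitive_skeleton(arcs):
    """Remove every arc (u, v) of a dag that is bypassed by a longer u-to-v path."""
    children = {}
    for u, v in arcs:
        children.setdefault(u, set()).add(v)

    def reachable_without(start, goal, skipped_arc):
        # Depth-first search from start to goal that never uses skipped_arc.
        stack, seen = [start], set()
        while stack:
            x = stack.pop()
            for y in children.get(x, ()):
                if (x, y) == skipped_arc or y in seen:
                    continue
                if y == goal:
                    return True
                seen.add(y)
                stack.append(y)
        return False

    return sorted(
        (u, v) for u, v in set(arcs) if not reachable_without(u, v, (u, v))
    )

# Hypothetical dag with one shortcut: a -> c is bypassed by a -> b -> c.
print(transitive_skeleton([("a", "b"), ("b", "c"), ("a", "c")]))
```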
IC-Optimality via Dag-Decomposition
Step 2: Parse the skeleton Sk(G) into building blocks
1. Start from the sources of Sk(G).
2. Detach a maximal connected bipartite subdag whose sources are sources of Sk(G).
3. Recurse.
Theorem. [MRY06] If G is composite, then this process identifies a unique sequence of bipartite building blocks (G1, G2, …, Gn) such that G is composite of type G1 ⇑ G2 ⇑ … ⇑ Gn.
IC-Optimality via Dag-Decomposition
Step 3: Build the super-dag
1. Each node (supernode) of the super-dag is one of the Gi's.
2. There is an arc from supernode u to supernode v just when some sink(s) of u are identified with some source(s) of v.
IC-Optimality via Dag-Decomposition
• Step 4: Look for a ▷-linearization of the super-dag
Theorem. [MRY06] Say that:
• G is composite of type G1 ⇑ G2 ⇑ … ⇑ Gn;
• for all i, j with i ≠ j, either Gi ▷ Gj or Gj ▷ Gi.
Then G is a ▷-linear composition whenever the following holds: whenever Gj is a child of Gi in the super-dag, we have Gi ▷ Gj.
Expanding the repertoire of building-block dags
For any finite sequence δ of integers, each > 1, the δ-strands W[δ] and M[δ] are defined inductively as follows.
1. For each integer d > 1, W[d] is the (single-source) degree-d W-dag. It has one source and d sinks; its d arcs connect the source to each sink.
2. For each sequence δ of integers, each > 1, and each integer d > 1, W[δ, d] is obtained from W[δ] and W[d] by identifying (or, merging) the rightmost sink of the former dag with the leftmost sink of the latter.
Every W-strand W[δ] has a dual M-strand M[δ] that is obtained by reversing all arcs.
[Figures not recovered: W[4], M[4], W[4,2,4,3], M[4,2,4,3]]
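The inductive definition can be sketched as follows. The node names ("s1", "t1", …) and the function names are illustrative choices, not notation from the talk.

```python
def w_strand(degrees):
    """Arcs of the W-strand W[d1, ..., dk], with sources s1..sk and sinks t1, t2, ..."""
    arcs, next_sink = [], 1
    for i, d in enumerate(degrees, start=1):
        if d < 2:
            raise ValueError("every degree must be > 1")
        # Each W-dag after the first reuses the previous rightmost sink as its
        # leftmost sink (the merging step of the definition).
        first = next_sink if i == 1 else next_sink - 1
        for t in range(first, first + d):
            arcs.append((f"s{i}", f"t{t}"))
        next_sink = first + d
    return arcs

def m_strand(degrees):
    """Arcs of the dual M-strand M[d1, ..., dk]: the W-strand with all arcs reversed."""
    return [(v, u) for u, v in w_strand(degrees)]

w = w_strand([4, 2, 4, 3])
print(len(w), len({v for _, v in w}))   # number of arcs, number of sinks
print(m_strand([2]))
```

Each merge removes one sink, so W[d1, …, dk] has d1 + … + dk arcs and d1 + … + dk − (k − 1) sinks.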
Scheduling-Based Duality
Expansive building blocks (W-strands) and reductive building blocks (M-strands) are dual to one another in two strong senses:
1. Given a building block G of either type, and given an IC-optimal schedule Σ for G, one can algorithmically derive from Σ an IC-optimal schedule for the building block obtained by reversing all arcs of G.
2. Given two building blocks, G1 and G2, of the same type, if G1 ▷ G2, then the dual of G2 has ▷-priority over the dual of G1.
Scheduling-Based Duality
Let G be a W-strand having n sources and m sinks. Let Σ be a schedule for G that executes its sources in the order u1, u2, …, un.
Σ renders G's sinks ELIGIBLE in a sequence of “packets” P1, P2, …, Pn (where Pi is the set of sinks that become ELIGIBLE when Σ executes ui).
A schedule for the dual M-strand of G is dual to Σ if it executes the M-strand's sources in an order of the form [[Pn]], [[Pn-1]], …, [[P1]], where [[a, b, …, c]] denotes a fixed, but unspecified, permutation of {a, b, …, c}.
Theorem. [CMR06] Let the W-strand G admit the IC-optimal schedule Σ. Any schedule for the dual M-strand of G that is dual to Σ is IC-optimal.
Scheduling-Based Duality: An example
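• Σ = (1, 2) is IC-optimal for W[3,2]
• P1 = {a, b}, P2 = {c, d}
• The dual schedule ([[c, d]], [[a, b]]) is IC-optimal for M[3,2]
[Figures not recovered: W[3,2] and M[3,2], with nodes a, b, c, d as the sinks of W[3,2] and the sources of M[3,2]]
Here [[a, b, …, c]] denotes a fixed, but unspecified, permutation of {a, b, …, c}.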
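The example can be reproduced with a short sketch. It assumes W[3,2] has sources 1 and 2 and sinks a, b, c, d, with c the merged sink shared by both sources; the helper name `packets` is illustrative.

```python
def packets(sink_parents, source_order):
    """Packets P1, P2, ...: the sinks that become ELIGIBLE as each source executes."""
    executed, emitted, result = set(), set(), []
    for u in source_order:
        executed.add(u)
        packet = sorted(
            t for t, ps in sink_parents.items()
            if t not in emitted and ps <= executed
        )
        emitted.update(packet)
        result.append(packet)
    return result

# Assumed labelling of W[3,2]: source 1 -> a, b, c and source 2 -> c, d.
w32 = {"a": {1}, "b": {1}, "c": {1, 2}, "d": {2}}
p = packets(w32, [1, 2])
# One dual order for M[3,2]: the packets in reverse, each in a fixed internal order.
dual_order = [t for packet in reversed(p) for t in packet]
print(p, dual_order)
```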
On Scheduling Strands IC-Optimally
Theorem. [CMR06] Every sum of W-strands and every sum of M-strands admits an IC-optimal schedule.
– Let Src(G) denote G's sources.
– For X ⊆ Src(G), e(X; G) denotes the number of sinks of G that are rendered ELIGIBLE when precisely the sources in X are EXECUTED.
– For each u ∈ Src(G) and any k ∈ [1, n]: e_u(k) = e({u, …, u + k − 1}; G), and Vu = ⟨e_u(1), …, e_u(n)⟩.
– Order the vectors V1, …, Vn lexicographically, using the notation Va ≥L Vb to denote this order.
– A source s ∈ Src(G) is maximum if Vs ≥L Vs' for all s' ∈ Src(G).
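The greedy schedule: An example
Consider the dag G = W[4,2] + W[3,2], with sources numbered 1, 2, 3, 4.
V1 = ⟨3,5,5,5⟩, V2 = ⟨1,1,1,1⟩, V3 = ⟨2,4,4,4⟩, V4 = ⟨1,1,1,1⟩; hence 1 is maximum, and the schedule executes 1.
G' = W[2] + W[3,2] (remaining sources 2, 3, 4)
V2 = ⟨2,2,2⟩, V3 = ⟨2,4,4⟩, V4 = ⟨1,1,1⟩; hence 3 is maximum, and the schedule executes 3.
G'' = W[2] + W[2] (remaining sources 2, 4)
V2 = ⟨2,2⟩, V4 = ⟨2,2⟩; hence 2 and 4 are both maximum, and the schedule executes 2 and then 4 (or vice versa).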
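The worked example can be replayed with the sketch below. It rests on two assumptions read off the example's numbers rather than stated on the slide: the run {u, …, u + k − 1} stops at the end of u's strand (or at an already executed source), and each vector is padded with its last entry. Ties are broken toward the smaller source index. All names are illustrative.

```python
def w_strand_sum(strands):
    """Build a sum of W-strands; return (runs of global source ids, sink -> parents)."""
    parents, runs, src = {}, [], 0
    for s_idx, degrees in enumerate(strands):
        run, sink = [], 0
        for k, d in enumerate(degrees):
            src += 1
            run.append(src)
            first = sink if k == 0 else sink - 1   # reuse previous rightmost sink
            for t in range(first, first + d):
                parents.setdefault((s_idx, t), set()).add(src)
            sink = first + d
        runs.append(run)
    return runs, parents

def vector(u, runs, parents, executed, length):
    """V_u on the residual dag: sinks newly ELIGIBLE after u, then u and u+1, ..."""
    run = next(r for r in runs if u in r)
    done = set(executed)
    base = sum(1 for ps in parents.values() if ps <= done)  # already ELIGIBLE sinks
    vec = []
    for v in run[run.index(u):]:
        if v in executed:
            break
        done.add(v)
        vec.append(sum(1 for ps in parents.values() if ps <= done) - base)
    return vec + [vec[-1]] * (length - len(vec))

def greedy_schedule(strands):
    runs, parents = w_strand_sum(strands)
    remaining = [u for run in runs for u in run]
    executed, order = set(), []
    while remaining:
        best = max(
            remaining,
            key=lambda u: (vector(u, runs, parents, executed, len(remaining)), -u),
        )
        order.append(best)
        executed.add(best)
        remaining.remove(best)
    return order

print(greedy_schedule([[4, 2], [3, 2]]))
```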
The ICO-Sweep Algorithm
Consider a sequence of p ≥ 2 disjoint dags, G1, …, Gp, having respectively n1, …, np non-sinks.
Lemma. If the sum G = G1 + G2 + ··· + Gp admits an IC-optimal schedule Σ, then, for each i ∈ [1, p]:
1. Gi admits an IC-optimal schedule Σi.
2. Σ must execute the non-sinks of G that come from Gi in the same order as some IC-optimal schedule Σi for Gi.
Algorithm ICO-Sweep on two dags
Algorithm 2-ICO-Sweep:
1. Use the schedules Σ1 and Σ2 to construct an (n1 + 1) × (n2 + 1) table E such that: (∀ i ∈ [0, n1]) (∀ j ∈ [0, n2]) E(i, j) is the maximum number of nodes of G1 + G2 that can be rendered ELIGIBLE by the execution of i non-sinks of G1 and j non-sinks of G2.
2. Perform a left-to-right pass along each diagonal i + j of E in turn, and fill in the (n1 + 1) × (n2 + 1) Verification Table V, as follows.
a) Initialize all V(i, j) to “NO”
b) Set V(0, 0) to “YES”
c) for each t ∈ [1, n1 + n2]:
i. for each (i, j) with i + j = t: if (V(i−1, j) = “YES” OR V(i, j−1) = “YES”) AND E(i, j) = max over a + b = i + j of E(a, b), then set V(i, j) to “YES”
ii. if no entry V(i, j) with i + j = t has been set to “YES” then HALT and report “There is no IC-optimal schedule.”
d) HALT and report “There is an IC-optimal schedule.”
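A compact sketch of the two-table procedure follows. It assumes that, because the dags are disjoint, E(i, j) is the sum of the two per-dag counts under Σ1 and Σ2, which agrees with the additive example tables later in the talk. The function name, the traceback, and the return convention (a list of 1s and 2s naming which dag to serve at each step, or None) are illustrative.

```python
def two_ico_sweep(e1, e2):
    """Return an interleaving of G1 and G2 that is IC-optimal for G1 + G2, or None.

    e1[i] counts the nodes of G1 rendered ELIGIBLE after Sigma1 executes i
    non-sinks; e2[j] is the same for G2 under Sigma2.
    """
    n1, n2 = len(e1) - 1, len(e2) - 1
    E = [[e1[i] + e2[j] for j in range(n2 + 1)] for i in range(n1 + 1)]
    V = [[False] * (n2 + 1) for _ in range(n1 + 1)]   # step (a): all "NO"
    V[0][0] = True                                    # step (b)
    for t in range(1, n1 + n2 + 1):                   # step (c): sweep diagonals
        cells = [(i, t - i) for i in range(max(0, t - n2), min(n1, t) + 1)]
        best = max(E[i][j] for i, j in cells)
        found = False
        for i, j in cells:
            reachable = (i > 0 and V[i - 1][j]) or (j > 0 and V[i][j - 1])
            if reachable and E[i][j] == best:
                V[i][j] = True
                found = True
        if not found:
            return None                               # no IC-optimal schedule
    # Trace one chain of "YES" entries back from (n1, n2) to (0, 0).
    path, i, j = [], n1, n2
    while (i, j) != (0, 0):
        if i > 0 and V[i - 1][j]:
            path.append(1)
            i -= 1
        else:
            path.append(2)
            j -= 1
    return path[::-1]

print(two_ico_sweep([0, 4, 6], [0, 3, 5]))
print(two_ico_sweep([0, 1], [0, 0, 2]))
```

The second call uses hypothetical profiles for which the best entry on the second diagonal cannot be reached from a "YES" entry on the first, so the sweep reports that no IC-optimal schedule exists.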
Algorithm ICO-Sweep on two dags
Theorem. Given disjoint dags G1 and G2 having, respectively, n1 and n2 non-sinks, Algorithm 2-ICO-Sweep determines, within time O(n1 n2):
1. whether or not the sum G1 + G2 admits an IC-optimal schedule; in the positive case, the Algorithm provides such a schedule;
2. whether or not either G1 ▷ G2, or G2 ▷ G1, or both.
Example tables (the pictured dags G1 and G2 were not recovered):

G1 \ G2    0    1    2
   0       0    3    5
   1       4    7    9
   2       6    9   11

G1 \ G2    0    1
   0       0    4
   1       5    9
Algorithm ICO-Sweep on multiple dags
Algorithm ICO-Sweep:
1. Perform Algorithm 2-ICO-Sweep on the sum G1 + G2.
2. For each step k = 2, 3, …:
a) if Step k−1 does not succeed (i.e., the (k−1)-th invocation of Algorithm 2-ICO-Sweep halts with the answer “NO”), then HALT and give the answer “NO”;
b) else if Step k−1 succeeds (i.e., the (k−1)-th invocation of Algorithm 2-ICO-Sweep halts with the answer “YES”), then perform Algorithm 2-ICO-Sweep on the sum (G1 + ··· + Gk) + Gk+1.
Project in Progress
1. Extend the priority relation to include topological order.
– Order of composition (rather than ▷-priority) may force a schedule to execute G1 before G2.
2. Determine how to invoke schedules that execute building blocks in an interleaved – rather than sequential – fashion.
– Can now do this for bipartite dags. (ICO-Sweep)
3. A new dag decomposition procedure which exploits the advantages provided by the ICO-Sweep algorithm.
4. Augment the repertoire of building-block dags.
– Can now handle blocks for “expansive” and “reductive” dags.
5. Experimentally determine the significance of IC-optimality.
– Initial results suggest significant speedup.
References
• [R04] A.L. Rosenberg (2004): “On scheduling mesh-structured computations for Internet-based computing.” IEEE Trans. Comput. 53, 1176–1186.
• [RY05] A.L. Rosenberg and M. Yurkewych (2005): “Guidelines for scheduling some common computation-dags for Internet-based computing.” IEEE Trans. Comput. 54, 428–438.
• [MRY06] G. Malewicz, A.L. Rosenberg, M. Yurkewych (2006): “Toward a theory for scheduling dags in Internet-based computing.” IEEE Trans. Comput. 55. See also Intl. Parallel and Distr. Processing Symp., 2005.
• [CMR06] G. Cordasco, G. Malewicz, A.L. Rosenberg (2006): “On scheduling expansive and reductive dags for Internet-based computing.” 26th Intl. Conf. on Distributed Computing Systems, to appear.