View
213
Download
0
Category
Preview:
Citation preview
Dynamic Reconfiguration of Grid-Aware Applications in ASSIST
Euro-Par 2005Lisboa, Portugal
M. AldinucciA. Petrocelli, E. Pistoletti, M. Torquati,
M. Vanneschi, L. Veraldi, C. Zoccolo
ISTI - CNR, Pisa, ItalyUniversity of Pisa, Italy
Dynamic Reconfiguration of Grid-Aware Applications in ASSIST
Euro-Par 2005Lisboa, Portugal
M. AldinucciA. Petrocelli, E. Pistoletti, M. Torquati,
M. Vanneschi, L. Veraldi, C. Zoccolo
ISTI - CNR, Pisa, ItalyUniversity of Pisa, Italy
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms • autonomic QoS managers
• Demo & experiments
• Concluding remarks
Outline
2
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms • autonomic QoS managers
• Demo & experiments
• Concluding remarks
Outline
2
The grid“... coordinated resource sharing and problem
solving in dynamic, multi institutional virtual organizations.”
Foster, Anatomy
“1) coordinates resources that are not subject to centralized control …”“2) … using standard, open, general-purpose protocols and interfaces”“3) … to deliver nontrivial qualities of service.”
Foster, What is the Grid?
Did you see J. Fortes invited talk?
The grid“... coordinated resource sharing and problem
solving in dynamic, multi institutional virtual organizations.”
Foster, Anatomy
“1) coordinates resources that are not subject to centralized control …”“2) … using standard, open, general-purpose protocols and interfaces”“3) … to deliver nontrivial qualities of service.”
Foster, What is the Grid?
Did you see J. Fortes invited talk?
Moreover, since this is not Euro-Seq, I assume applications we are focusing on should be parallel (and hopefully high-performance).
// progr. & the grid
• concurrency exploitation, concurrent activities set up, mapping/scheduling, communication/synchronization handling and data allocation, ...
• manage resources heterogeneity and unreliability, networks latency and bandwidth unsteadiness, resources topology and availability changes, firewalls, private networks, reservation and jobs schedulers, ...
// progr. & the grid
• concurrency exploitation, concurrent activities set up, mapping/scheduling, communication/synchronization handling and data allocation, ...
• manage resources heterogeneity and unreliability, networks latency and bandwidth unsteadiness, resources topology and availability changes, firewalls, private networks, reservation and jobs schedulers, ...
... and a non trivial QoS for applications
// progr. & the grid
• concurrency exploitation, concurrent activities set up, mapping/scheduling, communication/synchronization handling and data allocation, ...
• manage resources heterogeneity and unreliability, networks latency and bandwidth unsteadiness, resources topology and availability changes, firewalls, private networks, reservation and jobs schedulers, ...
... and a non trivial QoS for applications
not easy leveraging only on middleware
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms • autonomic QoS managers
• Demo & experiments
• Concluding remarks
Outline
5
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms • autonomic QoS managers
• Demo & experiments
• Concluding remarks
Outline
5
Grid
Abstract
Machine
Application(Manager(
(AM: non-functional aspects & QoS control)
ASSIST(applications
Abstraction(of(the(basic(services:(
resource(management(&(scheduling,(
monitoring,(...
standard(middleware(tools
(Globus,(WS,(CCM,(...)
Ap
p. d
ep
en
de
nt
co
mp
iler g
en
era
ted
ASSIST idea
“moving most of the Grid specific efforts needed while
developing high-performance Grid applications from
programmers to grid tools and run-time systems”
ASSIST is a high-level programming environment for grid-aware // applications. Developed at Uni. Pisa within several national/EU projects.
First version in 2001. Open source under GPL.
app = graph of modules
7
P2 P3
P4P1
input output
Sequential or parallel module
Typed streamsof data items
Programmable, possibly nondeterministic input behaviour
native + standard
8
P2 P3
P4P1
ASSIST native or wrap (MPI, CORBA, CCM, WS)
TCP/IP, Globus,IIOP CORBA,HTTP/SOAP
ASSIST native parmod
9
VP VP
VP VP
VP VP
An “input section” can be programmed in a CSP-like way
Data items can be distributed (scattered,
broadcasted, multicasted) to a set of
Virtual Processes which are named accordingly to a
topology
ASSIST native parmod
9
VP VP
VP VP
VP VP
An “input section” can be programmed in a CSP-like way
Data items can be distributed (scattered,
broadcasted, multicasted) to a set of
Virtual Processes which are named accordingly to a
topology
Data items partitions are elaborated by VPs, possibly in
iterative way
while(...) forall VP(in, out) barrier
data is logically shared by VPs (owner-computes)
ASSIST native parmod
9
VP VP
VP VP
VP VP
An “input section” can be programmed in a CSP-like way
Data items can be distributed (scattered,
broadcasted, multicasted) to a set of
Virtual Processes which are named accordingly to a
topology
Data items partitions are elaborated by VPs, possibly in
iterative way
while(...) forall VP(in, out) barrier
data is logically shared by VPs (owner-computes)
Data is eventually gathered accordingly to
an user defined way
ASSIST native parmod
9
VP VP
VP VP
VP VP
An “input section” can be programmed in a CSP-like way
Data items can be distributed (scattered,
broadcasted, multicasted) to a set of
Virtual Processes which are named accordingly to a
topology
Data items partitions are elaborated by VPs, possibly in
iterative way
while(...) forall VP(in, out) barrier
data is logically shared by VPs (owner-computes)
Data is eventually gathered accordingly to
an user defined way
Easy to express standard paradigms(skeltons), such as
farm, deal, haloswap, map, apply-to-all,
forall, ...
parmod implementation
10
inputmanager
VP VP
VP manager (VPM)
VP VP
VP manager (VPM)
inputmanager
VP VP
VP manager (VPM)
processes VP Virtual Processes
11
Compiling & Running
QoScontract
ASSISTprogram
ASSISTcompiler
resourcedescription
XML
executablecode
(linux, mac,uindoz)
11
Compiling & Running
QoScontract
ASSISTprogram
ASSISTcompiler
resourcedescription
XML
executablecode
(linux, mac,uindoz)
launch
query new resources
reco
nf
com
man
ds
Managers
AM+MAMs
Grid execution
agent (GEA)
ISM OSM
VPM
seqseq
Network of processes
Run
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms (+ demo)• autonomic QoS managers
• Demo & experiments
• Concluding remarks
Outline
12
P1 P2
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms (+ demo)• autonomic QoS managers
• Demo & experiments
• Concluding remarks
Outline
12
P3
Application adaptivity
• Adaptivity aims to dynamically control program configuration (e.g. parallel degree) and mapping
• for performance (high-performance is a natural sub-target)
• for fault-tolerance (enable to cope with unsteadiness of resources, and some kind of faults)
13
Ingredients for the adaptivity recipe
1. Mechanism for adaptivity
• reconf-safe points• in which points a parallel code can be safely reconfigured?
• reconf-safe point consensus• different parallel activities may not proceed in lock-step fashion
• add/remove/migrate computation & data
14
Ingredients for the adaptivity recipe
1. Mechanism for adaptivity
• reconf-safe points• in which points a parallel code can be safely reconfigured?
• reconf-safe point consensus• different parallel activities may not proceed in lock-step fashion
• add/remove/migrate computation & data
2. Managing adaptivity
• QoS contracts• Describing high-level QoS requirement for modules/applications
• “self-optimizing” module• under the control of an autonomic manager
14
reconf-safe points
• In which points of the code the execution can be reconfigured?
• low-level approach• the programmer places in the code calls to a suitable API, e.g.
safe_point(); • error-prone, time-consuming
• ASSIST • automatically generated by the compiler, driven by program
semantics• no artifactual synchronization added, already existing
synchronizations are rather instrumented
• overhead w.r.t. not adaptive code < 0.04%
15
Mechanisms
Distributed agreement
• The program reconfiguration actually starts only when all interested entities are ready to react
• i.e. all processes have reached a suitable reconf-safe point
• they agreed on which one• fresh resources are up and running
• distributed protocol
16
Mechanisms
Basic operations
• Change parallelism degree
• Add n VPMs to parmod• Remove n VPMs from a parmod
• Change mapping
• Move k VPs from a VPM to another• Move a VPM from a PE to another• Dynamic load-balancing as sequence of
migrate operations
17
Mechanisms
VPM
Example: Add VPM
VP VP
ISM OSM
MAM
VPVPM
VP VPVPM
data
VP
data
1. Gexec(newPE, VPM)
18
Mechanisms
VPM
Example: Add VPM
VP VP
ISM OSM
MAM
VPVPM
VP VPVPM
data
VP
data
1. Gexec(newPE, VPM)
2. acquire consensus
18
Mechanisms
VPM
Example: Add VPM
VP
ISM OSM
MAM
VPVPM
VP VPVPM
data
VP VP
data
1. Gexec(newPE, VPM)
2. acquire consensus
3. move VP and data
Only 3. is in the critical path 18
Mechanisms
!"# $"#
%&#
#'#
!"!#!$%&'("!)*+$#!("&+*",&-.&/0,1
!"# $"#
#'#
%&#
!"# $"#
%&#
#'#
%&#
2+('3,,
("&/04
!"$%&'("!)*+$#!("&+*",&-.56&/0,1
$"$%7832%$" 9:9
;:9&-<!==%3>$+31!"#$%&'()*#'(&"$'
"33=&6&/0 /04
+,-.'/0-12$3*#'(&"$'
343'*#3
2$+<(=&+3$'?3,&$+3'("@A,$@3&2(!"#
+3'("@B&%$#3"'7
+3'("@B&#!<3
<("!#(+
#!<3
C$*"'?-D/9E/041 $'.
4("5*6%2",%(*#'(&"$'
D/,&$+3+3=!,#+!F*#3=
G?3&"3>&2+('3,,'("#$'#,&#?3&9:9
Fig. 2. Reconfiguration dynamics and metrics.
TCP/IP or Globus provided communication channels. The two applications arecomposed by one parmod and two sequential modules. The first is a data-parallelapplication receiving a stream of integer arrays and computing a forall of sim-ple function for each stream item; the matrix is stored in the parmod sharedstate. The second is a farm application computing a simple function on di!erentstream items. Since Rt also depends on sequential function cost, in both caseswe choose sequential functions with a close to zero computational cost in orderto evaluate mechanism on the finest possible grain.
The reconfiguration overhead (Ro) measured during our experiments, with-out any reconfiguration change actually performed, is practically negligible, re-maining under the limit of 0,004%, the measurement of the other two metricsare reported in Table 1.
Notice that in the case of a data-parallel parmod, Rl grows linearly with(x + y) for the reconfiguration x ! y for both kinds of reconf-safe points, anddepends on shared state size and mapping. Farm parmod cannot be reconfiguredon-barrier since it has no barrier, and achieves a negligible Rl (below 10!3 ms).This is due to the fact that no processes are stopped in the transition from oneconfiguration to the next. Rt, which includes both the protocol cost and time toreach next reconf-safe point, grows linearly with (x + y) for the former cost andheavily depends on user-function cost for the latter.
parmod kind Data-parallel (with shared state) Farm (without shared state)
reconf. kind add PEs remove PEs add PEs remove PEs
# of PEs involved 1"2 2"4 4"8 2"1 4"2 8"4 1"2 2"4 4"8 2"1 4"2 8"4
Rl on-barrier 1.2 1.6 2.3 0.8 1.4 3.7 – – – – – –Rl on-stream-item 4.7 12.0 33.9 3.9 6.5 19.1 # 0 # 0 # 0 # 0 # 0 # 0
Rt 24.4 30.5 36.6 21.2 35.3 43.5 24.0 32.7 48.6 17.1 21.6 31.9
Table 1. Evaluation of reconfiguration overheads (ms). On this cluster, 50 ms areneeded to ping 200KB between two PEs, or to compute a 1M integer additions.
Overheads (milliseconds)
GrADS papers reports overhead in the order of hundreds of seconds (K. Kennedy et al. 2004), this is mainly due to the stop/restart behavior, not to the different running env.
19
Mechanisms
Demo Here !
run beginwith 1 VPM
lines arrivesslowly one
after the other(par. degree=1)
fresh VPMs are added to the app.
lines arrives much faster
(par. degree=5)
reconf: release 3 VPMs
lines arrivesa bit slower
(par. degree=2)
reconf: add the 4 VPMs
4 fresh VPMsare started
reconf. console
Managing adaptivity
21
Management
Grid execution agent
(Globus, ACE, ...)
ISM OSM
VPM
seqseq
parmod
(network of processes)
QoSdata
Executenext
config
brokencontracts
Analyze
PlanMonitorManagers
AM+MAMs
launch launch
reconf. commands
new resources
queries
parmod autonomic manager
1. monitor• collect execution stats: machine load, VPM service
time, input/output queues lenghts, ...
2. analyze• instanciate performance models with monitored
data, detect broken contract, in and in the case try to indivituate the problem
3. plan• select a (predefined or user defined) strategy to
reconvey the contract to valid status. The strategy is actually a list of mechanism to apply.
4. execute• leverage on mechanism to apply the plan
Management
QoS contract(of the experiment I’ll show you in a minute)
Perf. features QLi (input queue level), QLo (input queuelevel), TISM (ISM service time), TOSM
(OSM service time), Nw (number of VPMs),Tw[i] (VPMi avg. service time), Tp (parmodavg. service time)
Perf. model Tp = max{TISM ,!n
i=1Tw[i]/n, TOSM},
Tp < K (goal)
Deployment arch = (i686-pc-linux-gnu ! powerpc-apple-darwin*)
Adapt. policy goal based
Management
Performance models:an example (DP load balancing)
Time
barrier
barrier
idle
T1=T2=3 T3=2 T4=1
n1=n2=n3=n4=7
24
Management
Performance models:an example (DP load balancing)
Time
barrier
barrier
idle
T1=T2=3 T3=2 T4=1
n1=n2=n3=n4=7
barrier
barrier
T1=T2=3 T3=2 T4=1
n1=5 n2=5 n3=7 n4=12idle
24
Management
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms (+ demo)• autonomic QoS managers
• Experiments
• Concluding remarks
Outline
25
Perf(P1)
Perf(P2)
Perf(P3)
Perf(P4)
• Motivating ...
• high-level programming for the grid• application adaptivity for the grid
• ASSIST basics
• Adaptivity in ASSIST
• mechanisms (+ demo)• autonomic QoS managers
• Experiments
• Concluding remarks
Outline
25
Perf(P1)
Perf(P2)
Perf(P3)
Perf(P4)
N. of VPMs in parmod
Input stream pressureVPMs aggregated power
QoS contract
Wall Clock Time (s)
0 50
Ite
ms/s
N.
of
VP
Ms
100
2 4 6 8
10
2 4 6 8
20 80 60Input stream queue fill level
200 40 180 160 140 120
Fill
%
100
Farm: contract: keep a given service time contract change along the run
26
0
0.2
0.4
0.6
0.8
1
1.2
P4@2.5GHz P4@2GHz P3@868MHz P4@2.8GHz
Platform kind
Ap
p.
pro
f. (
ite
ratio
ns/s
)
0K
1K
2K
3K
4K
5K
6K
Lin
ux B
og
oM
IPS
App. proling BogoMIPS
A B C D
Running Env27
0
400 350 300 250
150 200
100 50
D
A
C
B
additional load started on platform B
VP
s t
o V
PM
s m
ap
pin
gse
co
nd
s
Iteration count 0 50 100 150 200 250 300 350 400
Max unbalance timeIteration time 3
1
4
2
Data parallel (shortest path)Machine B externally overloaded
after a while
0
0.2
0.4
0.6
0.8
1
1.2
P4@2.5GHz P4@2GHz P3@868MHz P4@2.8GHz
Platform kind
Ap
p.
pro
f. (
ite
ratio
ns/s
)
0K
1K
2K
3K
4K
5K
6K
Lin
ux B
og
oM
IPS
App. proling BogoMIPS
A B C D
28
Conclusions 1/2
• Application adaptivity in ASSIST
• complex, but trasparent (no burden for the programmers)• they should just define they QoS requirements• perf. models are automatically generated from
program structure (and don’t depend on seq. funct.)
• dynamically controlled, efficiently managed• catch both platforms unsteadiness and code
irregular behavior in running time• performance models not critical, reconfiguration
does not stop the application• key feature for the grid
29
Conclusions 2/2
• ASSIST cope with• grid platform unsteadiness• interoperability with standards
• and rely on them for many features
• high-performance• app deployment problems on grid
• private networks, job schedulers, firewalls, ...
30
Conclusions 2/2
• ASSIST cope with• grid platform unsteadiness• interoperability with standards
• and rely on them for many features
• high-performance• app deployment problems on grid
• private networks, job schedulers, firewalls, ...
• We currently working on• QoS of the whole application through hierarchy of
managers• components, fault-tolerance, efficient launch time
mapping• in cooperation with many coreGRID partners
30
Recommended