Upload
others
View
8
Download
0
Embed Size (px)
Citation preview
1/ 43
Models, Simulation, Emulation, Experimention
for Grid Computing
Arnaud Legrand
ID laboratory, [email protected]
Grid 5000 school: March 8, 2006
2/ 43
Grid Research
Grid researchers often ask questions in the broad area of distributedcomputing
I Which scheduling algorithm is best for this application on agiven Grid?
I Which design is best for implementing a distributed Grid re-source information service?
I Which caching strategy is best for enabling a community ofusers that to distributed data analysis?
I What are the properties of several resource management poli-cies in terms of fairness and throughput?
I What is the scalability of my publish-subscribe system undervarious levels of failures
I . . .
3/ 43
Analytical or Experimental?
Analytical Grid Computing researchI Some have developed purely analytical / mathematical models
for Grid computingI makes it possible to prove interesting theoremsI often too simplistic to convince practitionersI but generally useful for understanding principles in spite of du-
bious applicability
I One often uncovers NP-complete problems anyway e.g., rout-ing, partitioning, scheduling problems
I And one must run experiments
; Grid computing research is based on experiments
4/ 43
Grid Computing as Science?
You can tell you are a scientific discipline if
I You can read a paper, easily reproduce (at least a subset of)its results, and improve
I You can tell to a grad student: “Here are the standard tools,go learn how to use them and come back in one month”.
I You can give a 1-hour seminar on widely accepted tools thatare the basis for doing research in the area
We are not there today. . .
But perhaps I can give a 1-hour seminar on emerging tools thatcould be the basis for doing research in the area, provided a fewopen questions are addressed.
Need for standard ways to run Grid experiments
5/ 43
Grid Experiments
Real-world experiments are good
I Eminently believableI Demonstrates that proposed approach can be implemented
in practice
But...
Can be time-intensive Execution of “applications” for hours, days,months, ...
Can be labor-intensive Entire application needs to be built and func-tional (including all design / algorithms alternatives include allhooks for deployment).Is it a good engineering practice to build many full-fledge solu-tions to find out which ones work best?
6/ 43
Grid Experiments: What experimental testbed?
My own little testbed well-behaved, controlled, stable, often not rep-resentative of real Grids.
Real grid platforms
I (Still) challenging for many grid researchers to obtainI Not built as a computer scientist’s playpen (other users may
disrupt experiments, other users may find experiments dis-ruptive)
I Platform will experience failures that may disrupt the exper-iments
I Platform configuration may change drastically while experi-ments are being conducted
I Experiments are uncontrolled and unrepeatable: even if dis-ruption from other users is part of the experiments, it pre-vents back-to-back runs of competitor designs / algorithms
7/ 43
Grid Experiments are Limited and Non Reproducible
; Difficult to obtain statistically significant results on an appropri-ate testbed
And to make things worse...
Experiments are limited to the testbed
I What part of the results are due to idiosyncrasies of thetestbed?
I Extrapolations are possible, but rarely convincingI Must use a collection of testbeds...I Still limited explorations of “what if” scenarios (what if the
network were different? what if we were in 2015? what ifwe had a different workload?)
Difficult for others to reproduce results This is the basis for scien-tific advances!
8/ 43
Simulation
Simulation can solve many (all) of these difficulties
I No need to build a real system
I Conduct controlled / repeatable experiments
I In principle, no limits to experimental scenarios
I Possible for anybody to reproduce results
Definition: Simulation.
Attempting to predict aspects of the behaviour of some system bycreating an approximate (mathematical) model of it.
Key question: Validation (correspondence between simulation andreal-world)
9/ 43
Simulation in Computer Science
Microprocessor Design A few standard “cycle-accurate” simulatorsare used extensively
http://www.cs.wisc.edu/∼arch/www/tools.html
Possible to reproduce simulation results
Networking Network-guys are better at standards than we are... ,I A few standard “packet-level” simulators NS-2, DaSSF, OM-
NeT++I Well-known datasets for network topologiesI Well-known generators of synthetic topologiesI SSF standard: http://www.ssfnet.org/I Possible to reproduce simulation results
Grid Computing? None of the above up until a few years ago (mostpeople built their own “ad-hoc” solutions).
10/ 43
Simulation of Distributed Computing Platforms?
Simulation of parallel platforms used for decades. Most simulationshave made drastic assumptions
Simplistic platform model
I Processors perform work at some fixed rate (Flops)I Network links send data at some fixed rateI Topology is fully connected (no communication interference)
or a bus (simple communication interference)I Communication and computation are perfectly overlappable
Simplistic application model
I All computations are CPU intensiveI Clear-cut communication and computation phasesI Application is deterministic
Straightforward simulation in most cases Just fill in a Gantt chartwith a computer rather than by hand. No need for a “simulationstandard”
11/ 43
Grid Simulations?
Simple models are perhaps justifiable for a switched dedicated clusterrunning a matrix multiply application.They are hardly justifiable for grid platforms. . .
I Complex and wide-reaching network topologies (multi-hop net-works, heterogeneous bandwidths and latencies, non-negligiblelatencies, complex bandwidth sharing behaviors, contention withother traffic)
I Overhead of middleware
I Complex resource access/management policies
I Interference of communication and computation
12/ 43
Grid Experiments
Recognized as a critical area
GRID 5000 A reconfigurable grid to enable reproductibility.
Grid eXplorer (GdX) project (INRIA) Build an actual scientific in-strument
I Databases of “experimental conditions”I 1000-node cluster for simulation/emulationI Visualization tools
What simulation technology???
What kind of results in my paper?
I Results for a few “real” platforms
I Results for many synthetic platforms
Two main questions:
1 What does a “representative” Grid look like?
2 How does one do simulation on a synthetic representative Grid?
13/ 43
Outline
1 Simulation for Grid Computing
2 Generating Synthetic GridsTopologyCompute resourcesBackground conditions
3 Simulating Applications on Synthetic GridsIntroductionMicroGridModelNetPlanetLabChicagoSim, OptorSim, GridSimSimGrid
4 Conclusion
14/ 43
Outline
1 Simulation for Grid Computing
2 Generating Synthetic GridsTopologyCompute resourcesBackground conditions
3 Simulating Applications on Synthetic GridsIntroductionMicroGridModelNetPlanetLabChicagoSim, OptorSim, GridSimSimGrid
4 Conclusion
15/ 43
Why Synthetic Platforms?
Two goals of simulations:
I Simulate platforms beyond the ones at hand
I Perform sensitivity analyses
Need: Synthetic platforms
I Examine real platforms
I Discover principles
I Implement “platform generators”
16/ 43
Generation of Synthetic Grids
Network Topology
I Graph
I Bandwidth and Latencies
Compute Resources And other re-sources “Background” Conditions
I CPU capacity
I Load
I Failures
What is Representative and Tractable?
16/ 43
Generation of Synthetic Grids
Network Topology
I Graph
I Bandwidth and Latencies
Compute Resources And other re-sources “Background” Conditions
I CPU capacity
I Load
I Failures
What is Representative and Tractable?
16/ 43
Generation of Synthetic Grids
Network Topology
I Graph
I Bandwidth and Latencies
Compute Resources And other re-sources “Background” Conditions
I CPU capacity
I Load
I Failures
What is Representative and Tractable?
16/ 43
Generation of Synthetic Grids
Network Topology
I Graph
I Bandwidth and Latencies
Compute Resources And other re-sources “Background” Conditions
I CPU capacity
I Load
I Failures
What is Representative and Tractable?
16/ 43
Generation of Synthetic Grids
Network Topology
I Graph
I Bandwidth and Latencies
Compute Resources And other re-sources “Background” Conditions
I CPU capacity
I Load
I Failures
What is Representative and Tractable?
16/ 43
Generation of Synthetic Grids
Network Topology
I Graph
I Bandwidth and Latencies
Compute Resources And other re-sources “Background” Conditions
I CPU capacity
I Load
I Failures
What is Representative and Tractable?
17/ 43
Synthetic Network Topologies
The network community has wondered about the properties of theInternet topology for years
I The Internet grows in a decentralized fashion with seeminglycomplex rules and incentives
I Could it have a mathematical structure?
I Could we then have generative models?
Three “generations of graph generators”
I Random
I Structural
I Degree-based
18/ 43
Flat models
Brain-dead N dots are randomly chosen (using a uniform distribu-tion) in a square. Then they are randomly connected with auniform probability α.
Waxman [Wax88] Dots are randomly placed on a square of side cand are randomly connected with a probability P (u, v) = αe−d/(βL),0 < α, β 6 1 where d is the Euclidean distance between u and vand L = c
√2. The edge number increases with α and the edge
length heterogeneity increases with β.
Exponential Dots are randomly placed and are connected with aprobability P (u, v) = αe−d/(L−d).
Locality [ZCD97] This model is due to Zegura. Dots are randomlyplaced and are connected with a probability
P (u, v) =
{α if d < L× r
β if d > L× r.
18/ 43
Flat models
Brain-dead N dots are randomly chosen (using a uniform distribu-tion) in a square. Then they are randomly connected with auniform probability α.
Waxman [Wax88] Dots are randomly placed on a square of side cand are randomly connected with a probability P (u, v) = αe−d/(βL),0 < α, β 6 1 where d is the Euclidean distance between u and vand L = c
√2. The edge number increases with α and the edge
length heterogeneity increases with β.
Exponential Dots are randomly placed and are connected with aprobability P (u, v) = αe−d/(L−d).
Locality [ZCD97] This model is due to Zegura. Dots are randomlyplaced and are connected with a probability
P (u, v) =
{α if d < L× r
β if d > L× r.
18/ 43
Flat models
Brain-dead N dots are randomly chosen (using a uniform distribu-tion) in a square. Then they are randomly connected with auniform probability α.
Waxman [Wax88] Dots are randomly placed on a square of side cand are randomly connected with a probability P (u, v) = αe−d/(βL),0 < α, β 6 1 where d is the Euclidean distance between u and vand L = c
√2. The edge number increases with α and the edge
length heterogeneity increases with β.
Exponential Dots are randomly placed and are connected with aprobability P (u, v) = αe−d/(L−d).
Locality [ZCD97] This model is due to Zegura. Dots are randomlyplaced and are connected with a probability
P (u, v) =
{α if d < L× r
β if d > L× r.
18/ 43
Flat models
Brain-dead N dots are randomly chosen (using a uniform distribu-tion) in a square. Then they are randomly connected with auniform probability α.
Waxman [Wax88] Dots are randomly placed on a square of side cand are randomly connected with a probability P (u, v) = αe−d/(βL),0 < α, β 6 1 where d is the Euclidean distance between u and vand L = c
√2. The edge number increases with α and the edge
length heterogeneity increases with β.
Exponential Dots are randomly placed and are connected with aprobability P (u, v) = αe−d/(L−d).
Locality [ZCD97] This model is due to Zegura. Dots are randomlyplaced and are connected with a probability
P (u, v) =
{α if d < L× r
β if d > L× r.
18/ 43
Flat models
Brain-dead N dots are randomly chosen (using a uniform distribu-tion) in a square. Then they are randomly connected with auniform probability α.
Waxman [Wax88] Dots are randomly placed on a square of side cand are randomly connected with a probability P (u, v) = αe−d/(βL),0 < α, β 6 1 where d is the Euclidean distance between u and vand L = c
√2. The edge number increases with α and the edge
length heterogeneity increases with β.
Exponential Dots are randomly placed and are connected with aprobability P (u, v) = αe−d/(L−d).
Locality [ZCD97] This model is due to Zegura. Dots are randomlyplaced and are connected with a probability
P (u, v) =
{α if d < L× r
β if d > L× r.
19/ 43
Node placement
Uniform
19/ 43
Node placement
Heavy Tailed
20/ 43
What about hierarchy ?
Top-Down
N-level [ZCD97] Starting from a connected graph, at each step, anode is replaced by another connected graph (Tiers, GT-ITM).
Transit-stub [ZCD97] 2-levels of hierarchy and some additional edges(GT-ITM, BRITE).
Stub Domains
Multi-homed stubTransit Domains
Stub-Stub Edge
20/ 43
What about hierarchy ?
Top-Down
N-level [ZCD97] Starting from a connected graph, at each step, anode is replaced by another connected graph (Tiers, GT-ITM).
Transit-stub [ZCD97] 2-levels of hierarchy and some additional edges(GT-ITM, BRITE).
AS-level Topology
Router Level
EdgeConnectionMethod
Topologies
(1)
(2)
(3)
AS Nodes
21/ 43
Power-Law : rank exponent
Faloutsos brothers [FFF99] have analyzed the topology at the ASlevel and have established power-laws describing this topology.The rank rv of a note v is its index in the order of decreasing degree.
21/ 43
Power-Law : rank exponent
Faloutsos brothers [FFF99] have analyzed the topology at the ASlevel and have established power-laws describing this topology.The rank rv of a note v is its index in the order of decreasing degree.
0.1
1
10
100
1000
1 10 100 1000 10000
"971108.rank"exp(6.34763) * x ** ( −0.811392 )
0.1
1
10
100
1000
1 10 100 1000 10000
"980410.rank"exp(6.62082) * x ** ( −0.821274 )
0.1
1
10
100
1000
1 10 100 1000 10000
"981205.rank"exp(6.16576) * x ** ( −0.74496 )
1
10
100
1 10 100 1000 10000
"routes.rank"exp(4.39519) * x ** ( −0.487592 )
21/ 43
Power-Law : rank exponent
Faloutsos brothers [FFF99] have analyzed the topology at the ASlevel and have established power-laws describing this topology.The rank rv of a note v is its index in the order of decreasing degree.
Power Law (rank exponent).
Given a graph, the degree dv of a node v is propor-tional to the rank of the node rv to the power of aconstant R.
dv ∝ rRv
Nov. 97 Apr. 98 Dec. 98 Router 95
R 0,81 0,82 0,74 0,48
22/ 43
What about Power Laws ?
Incremental growth and affinity lead to Power Laws [BA99].Nodes are incrementally added. The probability that v is connectedto u depends on du:
P (u, v) =du∑k dk
22/ 43
What about Power Laws ?
Incremental growth and affinity lead to Power Laws [BA99].Nodes are incrementally added. The probability that v is connectedto u depends on du:
P (u, v) =du∑k dk
23/ 43
Let us check two Power-Laws
1
10
100
1 10 100 1000
outd
egre
e_ra
nk
rank
1
10
100
1000
1 10
freq
uenc
youtdegree_freq
Interdomain November 1997
23/ 43
Let us check two Power-Laws
100
1 10 100 1000
outd
egre
e_ra
nk
rank
1
10
100
freq
uenc
youtdegree_freq
GT-ITM flat ?
23/ 43
Let us check two Power-Laws
10
1 10 100 1000 10000
outd
egre
e_ra
nk
rank
1
10
100
1000
1 10
freq
uenc
youtdegree_freq
Waxman (BRITE)
23/ 43
Let us check two Power-Laws
1
10
1 10 100 1000
outd
egre
e_ra
nk
rank
1
10
100
1000
1 10
freq
uenc
youtdegree_freq
Transit-Stub (GT-ITM)
23/ 43
Let us check two Power-Laws
10
100
1 10 100 1000 10000
outd
egre
e_ra
nk
rank
1
10
100
1000
1 10
freq
uenc
youtdegree_freq
Barabasi Albert (BRITE)
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
24/ 43
Are these measurements meaningful?
We know network have structure AND power laws. Some projectstry to combine both (GridG [LD03]).However, some of the power laws observed by the Faloutsos brothersare correlated. Power laws could be a measurement bias. What kindof measurements can be used ?
I Expension
I Resilience
I Distortion
I Excentricity distri-bution
I Eigenvalues distribu-tion
I Set cover size, . . .
Observation [TGJ+02]:
I AS-level and router-level have similar characteristics
I Degree-based generators are significantly better at representinglarge scale properties of the Internet than structural ones.
I Hierarchy seem to arise from degree-based generators.
25/ 43
I’m lost. . . what should I do then?
I For a 10 000 nodes platform, degree-based generators seem togive good results.
I For a 100 node platform, power laws do not make sense andstructural generators may be more appropriate.
We need some additional informations(e.g. routing, bandwidth, latency, sharing capacity, . . . ).
25/ 43
I’m lost. . . what should I do then?
I For a 10 000 nodes platform, degree-based generators seem togive good results.
I For a 100 node platform, power laws do not make sense andstructural generators may be more appropriate.
We need some additional informations(e.g. routing, bandwidth, latency, sharing capacity, . . . ).
25/ 43
I’m lost. . . what should I do then?
I For a 10 000 nodes platform, degree-based generators seem togive good results.
I For a 100 node platform, power laws do not make sense andstructural generators may be more appropriate.
We need some additional informations(e.g. routing, bandwidth, latency, sharing capacity, . . . ).
26/ 43
Bandwidths, latencies, traffic, etc.
Topology generators only produce a graph: We need link character-istics as well
1 Model physical characteristics.I Some “models” in topology generatorsI Need to simulate background trafficI No accepted model for generating background trafficI Simulation can be very costly
2 Model end-to-end performance. Models ([LS01]) or Measure-ments (NWS, ...). Go from path modeling to link modeling?
Turns out to be a difficult question (DARPA workshop on networkmodeling).
27/ 43
Bandwidths, latencies, traffic, etc.
Maybe none of this matters?
I Fiber inside the network mostly unused
I Communication bottleneck is the local link
I Appropriate tuning of TCP or better protocols should saturatethe local link
I Don’t care about topology at all!
I Maybe none of this matters for my application (no networkcontention)
28/ 43
Compute Resources
What resources do we put at the end- points?
“Ad-hoc” generalization Look at the TeraGrid. Generate new sitesbased on existing sites
Statistical modeling
I Examing many production resourcesI Identify key statistical characteristicsI Come up with a generative/predictive model
29/ 43
Synthetic Clusters?
Many Grid resources are clusters. What is the “typical” distributionof clusters?“Commodity Cluster synthesizer” [KCC04a]
I Examined 114 production clusters (10K+ procs)I Came up with statistical models
I Linear fit between clock-rate and release-year within a processorfamily
I Quadratic fraction of processors released on a given year
I Validated model against a set of 191 clusters (10K+ procs)
I Models allow “extrapolation” for future configurations
I Models implemented in a resource generator
30/ 43
Resource Availability / Workload
Probabilistic models
I Naive: experimental distributed availability and unavailabil-ity intervals
I Sophisticated: Weibull distributions [NBW05]
Traces NWS, Desktop Grid resources [KCC04b]
Workload models e.g. batch schedulers
I TracesI Models [LF03]: job inter-arrival times (Gamma), amount
of work requested (Hyper-Gamma), number of processorsrequested: Compounded (2p, 1, ...)
I Adversary ?
31/ 43
A Sample Synthetic Grid
I Generate a 5,000 node graph with BRITE
I Annotate latency according to BRITE’s Euclidian distance method(scaling to obtain the desired network diameter)
I Annotate bandwidth based on a set of end-to-end NWS mea-surements
I Pick 30% of the end-points
I Generate a cluster at each end-point according to Kee’s syn-thesizer for Year 2006
I Model cluster load with Feitelson’s model with a range of pa-rameters for the random distributions
I Model resource failures based on Inca measurements on Tera-Grid
32/ 43
Outline
1 Simulation for Grid Computing
2 Generating Synthetic GridsTopologyCompute resourcesBackground conditions
3 Simulating Applications on Synthetic GridsIntroductionMicroGridModelNetPlanetLabChicagoSim, OptorSim, GridSimSimGrid
4 Conclusion
33/ 43
Simulation in a nutshell
Simulations are configurable, repeatable, fast.Key question: “Validation: correspondence between simulation andreal world”.
Boundaries above are blurred (d.e. simulation/simulation).A simulation can combine all paradigms at different levels.I MicroGrid
I ModelNet
I Emulab/DummyNet
I PlatetLab
I ChicagoSim,OptorSim
I SimGrid
33/ 43
Simulation in a nutshell
Simulations are configurable, repeatable, fast.Key question: “Validation: correspondence between simulation andreal world”.
Based solely on equations
Abstraction of system as a setof dependant actions and events(fine- or coarse- grain)
Trapping and virtualization of low-levelapplication/system actions
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Boundaries above are blurred (d.e. simulation/simulation).A simulation can combine all paradigms at different levels.I MicroGrid
I ModelNet
I Emulab/DummyNet
I PlatetLab
I ChicagoSim,OptorSim
I SimGrid
33/ 43
Simulation in a nutshell
Simulations are configurable, repeatable, fast.Key question: “Validation: correspondence between simulation andreal world”.
Microscopic: packet-level(fine-grain d.e. simulation)
Macroscopic: flows in a pipe(coarse-grain d.e simulation + math. simulation)
Actual flows go through some network
Network
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Boundaries above are blurred (d.e. simulation/simulation).A simulation can combine all paradigms at different levels.I MicroGrid
I ModelNet
I Emulab/DummyNet
I PlatetLab
I ChicagoSim,OptorSim
I SimGrid
33/ 43
Simulation in a nutshell
Simulations are configurable, repeatable, fast.Key question: “Validation: correspondence between simulation andreal world”.
Microscopic: Cycle-accurate simulation(fine-grain d.e. simulation)
Macroscopic: flows in a pipe(coarse-grain d.e simulation + math. simulation)
Virtualization via another CPU/virtual machine
CPU
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Boundaries above are blurred (d.e. simulation/simulation).A simulation can combine all paradigms at different levels.I MicroGrid
I ModelNet
I Emulab/DummyNet
I PlatetLab
I ChicagoSim,OptorSim
I SimGrid
33/ 43
Simulation in a nutshell
Simulations are configurable, repeatable, fast.Key question: “Validation: correspondence between simulation andreal world”.
Application
Virtualization (emulation of actual codewith trapping of application generated events)
Macroscopic: application = analytical ”flow”
with resource needs and dependanciesLess Macroscopic: set of abstract tasks
Application
Virtualization (emulation of actual codewith trapping of application generated events)
Macroscopic: application = analytical ”flow”
with resource needs and dependanciesLess Macroscopic: set of abstract tasks
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Boundaries above are blurred (d.e. simulation/simulation).A simulation can combine all paradigms at different levels.I MicroGrid
I ModelNet
I Emulab/DummyNet
I PlatetLab
I ChicagoSim,OptorSim
I SimGrid
33/ 43
Simulation in a nutshell
Simulations are configurable, repeatable, fast.Key question: “Validation: correspondence between simulation andreal world”.
Application
Virtualization (emulation of actual codewith trapping of application generated events)
Macroscopic: application = analytical ”flow”
with resource needs and dependanciesLess Macroscopic: set of abstract tasks
Application
Virtualization (emulation of actual codewith trapping of application generated events)
Macroscopic: application = analytical ”flow”
with resource needs and dependanciesLess Macroscopic: set of abstract tasks
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Boundaries above are blurred (d.e. simulation/simulation).A simulation can combine all paradigms at different levels.I MicroGrid
I ModelNet
I Emulab/DummyNet
I PlatetLab
I ChicagoSim,OptorSim
I SimGrid
33/ 43
Simulation in a nutshell
Simulations are configurable, repeatable, fast.Key question: “Validation: correspondence between simulation andreal world”.
Application
Virtualization (emulation of actual codewith trapping of application generated events)
Macroscopic: application = analytical ”flow”
with resource needs and dependanciesLess Macroscopic: set of abstract tasks
Application
Virtualization (emulation of actual codewith trapping of application generated events)
Macroscopic: application = analytical ”flow”
with resource needs and dependanciesLess Macroscopic: set of abstract tasks
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Boundaries above are blurred (d.e. simulation/simulation).A simulation can combine all paradigms at different levels.I MicroGrid
I ModelNet
I Emulab/DummyNet
I PlatetLab
I ChicagoSim,OptorSim
I SimGrid
34/ 43
MicroGrid
MicroGrid is a UCSD project [SLJ+00] lead by Andrew Chien.Applications are supported by emulation and virtualization: Actualapplication code is executed on “virtualized” resourcesMicrogrid accounts for CPU and network
Resource gethostnames, sockets,GIS, MDS, NWS are wrapped
CPU Direct execution on a frac-tion of physical CPU: find agood mapping
Network Packet-level simulation(parallel version of MaSSF)
Time Synchronize real time andvirtual time: find the good ex-ecution rate
Virtual
Resources
Physical
Ressources
Application
MicroGrid
34/ 43
MicroGrid
MicroGrid is a UCSD project [SLJ+00] lead by Andrew Chien.Applications are supported by emulation and virtualization: Actualapplication code is executed on “virtualized” resourcesMicrogrid accounts for CPU and network
Resource gethostnames, sockets,GIS, MDS, NWS are wrapped
CPU Direct execution on a frac-tion of physical CPU: find agood mapping
Network Packet-level simulation(parallel version of MaSSF)
Time Synchronize real time andvirtual time: find the good ex-ecution rate
Virtual
Resources
Physical
Ressources
Application
MicroGrid
MicroGrid in a Nutshell
Network
Application Slow but hopefullyaccurate
CPU Can have high overheadsBut captures the overhead!
Slow but hopefullyaccurate
MicroGrid
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
35/ 43
ModelNet
ModelNet is a UCSD/Duke project [VYW+02] lead by Amin Vahdat.Applications are supported by emulation and virtualization: Actualapplication code is executed on “virtualized” resourcesA key tradeoff in ModelNet is scalability versus accuracy.ModelNet accounts for network but not for CPU:
Resource gethostnames, socketsare wrapped
CPU Plain mapping, slowdown isnot taken into account
Network one emulator runningFreeBSD, a gigabit LAN, somehost machines with IP aliasingfor the virtual nodes ; emula-tion of heterogeneous links
Similar ideas have been used in other projects (Emulab [WLS+02],DummyNet, Panda [KBM+02], . . . )
35/ 43
ModelNet
ModelNet is a UCSD/Duke project [VYW+02] lead by Amin Vahdat.Applications are supported by emulation and virtualization: Actualapplication code is executed on “virtualized” resourcesA key tradeoff in ModelNet is scalability versus accuracy.ModelNet accounts for network but not for CPU:
Resource gethostnames, socketsare wrapped
CPU Plain mapping, slowdown isnot taken into account
Network one emulator runningFreeBSD, a gigabit LAN, somehost machines with IP aliasingfor the virtual nodes ; emula-tion of heterogeneous links
ModelNet in a Nutshell
Network
Application Slow but hopefullyaccurate
CPU
Fast and scalableSomehow realist
Doesn’t take computationinto account
ModelNet
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Similar ideas have been used in other projects (Emulab [WLS+02],DummyNet, Panda [KBM+02], . . . )
36/ 43
PlanetLab
PlanetLab is an open platform for developping, deploying, and ac-cessing planetary-scale services.
Planetary-scale 350 machines, 140 sites, 20 countries
Distribution Virtualization each user can get a slice of the platform.
Unbundled Management
I OS defines only local (per-node) behavior (global (network-wide) behavior implemented by services)
I multiple competing services running in parallel (shared, un-privileged interfaces)
Unstable like the real world Convenient to try P2P applications ordemonstrate the feasability of a middleware.
36/ 43
PlanetLab
PlanetLab is an open platform for developping, deploying, and ac-cessing planetary-scale services.
Planetary-scale 350 machines, 140 sites, 20 countries
Distribution Virtualization each user can get a slice of the platform.
Unbundled Management
I OS defines only local (per-node) behavior (global (network-wide) behavior implemented by services)
I multiple competing services running in parallel (shared, un-privileged interfaces)
Unstable like the real world Convenient to try P2P applications ordemonstrate the feasability of a middleware.
PlanetLab in a Nutshell
Application
CPU
Network
PlanetLab
This is not reproducible!!!
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
37/ 43
ChicagoSim, OptorSim, GridSim
Previous solutions are kind of heavy. What about Higher-Level con-cerns ?
I data placement/replication
I grid economy
PhD students need a all-in-one Grid simulator to try a few things.They’d like to only have to plug in their algorithm.Amongst many simulators we can cite:
ChicSim built on ParSec [RF02]
OptorSim developped for Data-Grid [opt]
GridSim focus on Grid economy [MBA01]
Unfortunately, different needs require different simulation paradigms.
37/ 43
ChicagoSim, OptorSim, GridSim
Previous solutions are kind of heavy. What about Higher-Level con-cerns ?
I data placement/replication
I grid economy
PhD students need a all-in-one Grid simulator to try a few things.They’d like to only have to plug in their algorithm.Amongst many simulators we can cite:
ChicSim built on ParSec [RF02]
OptorSim developped for Data-Grid [opt]
GridSim focus on Grid economy [MBA01]
ChicSim, OptorSim, GridSim in a Nutshell
Application
Network
CPU
ChicSim, OptorSim, GridSim
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Unfortunately, different needs require different simulation paradigms.
37/ 43
ChicagoSim, OptorSim, GridSim
Previous solutions are kind of heavy. What about Higher-Level con-cerns ?
I data placement/replication
I grid economy
PhD students need a all-in-one Grid simulator to try a few things.They’d like to only have to plug in their algorithm.Amongst many simulators we can cite:
ChicSim built on ParSec [RF02]
OptorSim developped for Data-Grid [opt]
GridSim focus on Grid economy [MBA01]
ChicSim, OptorSim, GridSim in a Nutshell
Application
Network
CPU
ChicSim, OptorSim, GridSim
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
Unfortunately, different needs require different simulation paradigms.
38/ 43
SIMGRID
Originally developed for scheduling research ; must be fast to allowfor thousands of simulation
Application No real application code is executed. Simulation is ex-pressed in term of communicating process that can perform com-putations (task communication or computation).
Resources No virtualization. A resource is defined by a rate a whichit does “work”, a fixed overhead that must be paid by each task,traces of the above if needed + failures.
Tasks Tasks can use multiple resources (data transfer over multiplelinks, computation that uses a disk and a CPU)
In previous projects, resource sharing “emerges” out of the low-levelemulation and simulation.SIMGRID uses a blend of mathematical simulation and coarse-graindiscrete event simulation.Simple API to “specify” an application rather than having it alreadyimplemented [CALLM03].
38/ 43
SIMGRID
Originally developed for scheduling research ; must be fast to allowfor thousands of simulation
Application No real application code is executed. Simulation is ex-pressed in term of communicating process that can perform com-putations (task communication or computation).
Resources No virtualization. A resource is defined by a rate a whichit does “work”, a fixed overhead that must be paid by each task,traces of the above if needed + failures.
Tasks Tasks can use multiple resources (data transfer over multiplelinks, computation that uses a disk and a CPU)
In previous projects, resource sharing “emerges” out of the low-levelemulation and simulation.SIMGRID uses a blend of mathematical simulation and coarse-graindiscrete event simulation.Simple API to “specify” an application rather than having it alreadyimplemented [CALLM03].
SimGrid in a Nutshell
SIMGRID
Application
CPUNetwork
less abstract
more abstract
Discrete-EventSimulation
SimulationMathematical
Emulation
39/ 43
Outline
1 Simulation for Grid Computing
2 Generating Synthetic GridsTopologyCompute resourcesBackground conditions
3 Simulating Applications on Synthetic GridsIntroductionMicroGridModelNetPlanetLabChicagoSim, OptorSim, GridSimSimGrid
4 Conclusion
40/ 43
So what should I use?
It really depends on your goal / resources
I SimGrid’s models have clear limitations (e.g. for short trans-fers)
I SimGrid simulations are easy to set upI MicroGrid simulations take a lot of time (although they can
be parallelized)I ModelNet requires some hardware setupI SimGrid does not require for a full application to be writtenI MicroGrid models overhead of system calls implicitlyI PlanetLab does not enable reproducible experiments
Key trade-off: accuracy and speed
I The more abstract the simulation the fastestI The less abstract the simulation the most accurate
Does this trade-off really hold?
41/ 43
Simulation Validation
The crux of most simulation work in most domains of computerscience.
Validation is difficult and almost never done convincingly.
I Provide justification that the model is plausible
I Convince people that the simulator implements the model (ver-ification)
I Provide a few graphs that show that it’s reasonable (validationin a few special cases at best or validation against another“validated” simulator)
I Argue that although absolute values are off, the trends are re-spected
I Conclude that the simulator is useful to compare algorithms/designs
I Obtain scientific results?????
42/ 43
FLASH vs. FLASH
FLASH vs. (Simulated) FLASH: Closing the Simulation Loop [GKOH00]
FLASH project at Stanford
I building large-scale shared-memory multiprocessorsI Went from conception, to design, to actual hardware (32-node)I Used simulation heavily over 6 years
The authors went back and compared simulation to the real world!
I Simulation error is unavoidable(30% error in their case was notrare; Negating the impact of “we got 1.5% improvement”)
I One should focus on simulating the important thingsI A more complex simulator does not ensure better simulation
I Simple simulators worked better than sophisticated ones, which wereunstable
I Simple simulators predicted trends as well as slower, sophisticatedones
I It is key to use the real-world to tune/calibrate the simulator
42/ 43
FLASH vs. FLASH
FLASH vs. (Simulated) FLASH: Closing the Simulation Loop [GKOH00]
FLASH project at Stanford
I building large-scale shared-memory multiprocessorsI Went from conception, to design, to actual hardware (32-node)I Used simulation heavily over 6 years
The authors went back and compared simulation to the real world!
I Simulation error is unavoidable(30% error in their case was notrare; Negating the impact of “we got 1.5% improvement”)
I One should focus on simulating the important thingsI A more complex simulator does not ensure better simulation
I Simple simulators worked better than sophisticated ones, which wereunstable
I Simple simulators predicted trends as well as slower, sophisticatedones
I It is key to use the real-world to tune/calibrate the simulator
42/ 43
FLASH vs. FLASH
FLASH vs. (Simulated) FLASH: Closing the Simulation Loop [GKOH00]
FLASH project at Stanford
I building large-scale shared-memory multiprocessorsI Went from conception, to design, to actual hardware (32-node)I Used simulation heavily over 6 years
The authors went back and compared simulation to the real world!
I Simulation error is unavoidable(30% error in their case was notrare; Negating the impact of “we got 1.5% improvement”)
I One should focus on simulating the important thingsI A more complex simulator does not ensure better simulation
I Simple simulators worked better than sophisticated ones, which wereunstable
I Simple simulators predicted trends as well as slower, sophisticatedones
I It is key to use the real-world to tune/calibrate the simulator
42/ 43
FLASH vs. FLASH
FLASH vs. (Simulated) FLASH: Closing the Simulation Loop [GKOH00]
FLASH project at Stanford
I building large-scale shared-memory multiprocessorsI Went from conception, to design, to actual hardware (32-node)I Used simulation heavily over 6 years
The authors went back and compared simulation to the real world!
I Simulation error is unavoidable(30% error in their case was notrare; Negating the impact of “we got 1.5% improvement”)
I One should focus on simulating the important thingsI A more complex simulator does not ensure better simulation
I Simple simulators worked better than sophisticated ones, which wereunstable
I Simple simulators predicted trends as well as slower, sophisticatedones
I It is key to use the real-world to tune/calibrate the simulator
42/ 43
FLASH vs. FLASH
FLASH vs. (Simulated) FLASH: Closing the Simulation Loop [GKOH00]
FLASH project at Stanford
I building large-scale shared-memory multiprocessorsI Went from conception, to design, to actual hardware (32-node)I Used simulation heavily over 6 years
The authors went back and compared simulation to the real world!
I Simulation error is unavoidable(30% error in their case was notrare; Negating the impact of “we got 1.5% improvement”)
I One should focus on simulating the important thingsI A more complex simulator does not ensure better simulation
I Simple simulators worked better than sophisticated ones, which wereunstable
I Simple simulators predicted trends as well as slower, sophisticatedones
I It is key to use the real-world to tune/calibrate the simulator
42/ 43
FLASH vs. FLASH
FLASH vs. (Simulated) FLASH: Closing the Simulation Loop [GKOH00]
FLASH project at Stanford
I building large-scale shared-memory multiprocessorsI Went from conception, to design, to actual hardware (32-node)I Used simulation heavily over 6 years
The authors went back and compared simulation to the real world!
I Simulation error is unavoidable(30% error in their case was notrare; Negating the impact of “we got 1.5% improvement”)
I One should focus on simulating the important thingsI A more complex simulator does not ensure better simulation
I Simple simulators worked better than sophisticated ones, which wereunstable
I Simple simulators predicted trends as well as slower, sophisticatedones
I It is key to use the real-world to tune/calibrate the simulator
ConclusionFor FLASH, the simple simula-tor was all that was needed. . .
43/ 43
Conclusion
Simulation is difficult : What does really matters?
Grid researchers are actively working on it : Usable tools exist; Gridsimulation today should not “re-invent the wheel”.
Two crucial next steps
I Repository of synthetic Grids, Grid measurement datasets,and Grid simulation software
I Scientifically sound validation experiments. Validate simu-lators. Understand what matters and what does not.
Only then will we have a scientific discipline with a standard wayto conduct experiments and a way for researcher to reproduceeach other’s results.
43/ 43
A.L. Barabasi and R. Albert.Emergence of scaling in random networks.Science, 59:509–512, 1999.Available at http://www.nd.edu/∼networks/Papers/science.pdf.
Henri Casanova, Arnaud Legrand, and Loris Marchal.Scheduling Distributed Applications: the SimGrid SimulationFramework.In Proceedings of the third IEEE International Symposium onCluster Computing and the Grid (CCGrid’03). IEEE ComputerSociety Press, May 2003.
Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos.On power-law relationships of the internet topology.In SIGCOMM, pages 251–262, 1999.Available at http://citeseer.nj.nec.com/faloutsos99powerlaw.html.
Jeff Gibson, Robert Kunz, David Ofelt, and Mark Heinrich.FLASH vs. (simulated) FLASH: Closing the simulation loop.
43/ 43
In Architectural Support for Programming Languages and Op-erating Systems, pages 49–58, 2000.
Thilo Kielmann, Henri E. Bal, Jason Maassen, Rob van Nieuw-poort, Lionel Eyraud, Rutger Hofman, and Kees Verstoep.Programming environments for high-performance grid comput-ing: the albatross project.Future Generation Computer Systems, 2002.Available at http://www.cs.vu.nl/∼kielmann/papers/fgcs01.ps.gz.
Yang-Suk Kee, Henri Casanova, and Andrew A. Chien.Realistic modeling and synthesis of resources for computationalgrids.In Proceedings of Supercomputing 2004. IEEE Computer Soci-ety Press, 2004.
Derrick Kondo, Andrew A. Chien, and Henri Casanova.Resource management for rapid application turnaround on en-terprise desktop grids.
43/ 43
In Proceedings of Supercomputing 2004. IEEE Computer Soci-ety Press, 2004.
D. Lu and P. Dinda.GridG: Generating realistic computational grids.Performance Evaluation Review, 30(4), March 2003.Available at http://www.cs.nwu.edu/∼pdinda/Papers/lu-dinda-per.pdf.
Uri Lublin and Dror G. Feitelson.The workload on parallel supercomputers: modeling the char-acteristics of rigid jobs.Journal of Parallel and Distributed Computing, 63(11):1105–1122, 2003.
C. Lee and J. Stepanek.On future global grid communication performance.In 10th Heterogeneous Computing Workshop (HCW’2001).IEEE Computer Society Press, 2001.
M. Murshed, R. Buyya, and D. Abramson.
43/ 43
Gridsim: A toolkit for the modeling and simulation of globalgrids.Technical report, Monash University, October 2001.Available at http://www.csse.monash.edu.au/∼rajkumar/gridsim/.
Daniel Nurmi, John Brevik, , and Rich Wolski.Modeling machine availability in enterprise and wide-area dis-tributed computing environments.In Proceedings of EuroPar 2005, 2005.
OptorSim: Simulating data access optimization algorithms.URL:http://edg-wp2.web.cern.ch/edg-wp2/optimization/optorsim.html.
Kavitha Ranganathan and Ian Foster.Decoupling computation and data scheduling in distributeddata-intensive applications.In Proceedings of the 11 th IEEE International Symposiumon High Performance Distributed Computing HPDC-11 20002(HPDC’02), Chicago, IL, 2002. IEEE Computer Science Press.
43/ 43
H. J. Song, X. Liu, D. Jakobsen, R. Bhagwan, X. Zhang, KenjiroTaura, and Andrew A. Chien.The microgrid: a scientific tool for modeling computationalgrids.In Supercomputing, 2000.Available at http://www.sc2000.org/techpapr/papers/pap.pap286.pdf.
H. Tangmunarunkit, R. Govindan, S. Jamin, S. Shenker, andW. Willinger.Network topology generators: Degree-based vs structural.In Proceedings of SIGCOMM’02. ACM Press, August 2002.Available at http://citeseer.nj.nec.com/tangmunarunkit02network.html.
Amin Vahdat, Ken Yocum, Kesvin Walsh, Priya Mahadevan,Dejan Kosti0107, Jeffrey Chase, and David Becker.Scalability and accuracy in a largescale network emulator.In Proceedings of the 5th ACM/USENIX Symposium on Oper-ating System Design and Implementation (OSDI), 2002.
43/ 43
Bernard M. Waxman.Routing of Multipoint Connections.IEEE Journal on Selected Areas in Communications, 6(9):1617–1622, December 1988.
Brian White, Jay Lepreau, Leigh Stoller, Robert Ricci, ShashiGuruprasad, Mac Newbold, Mike Hibler, Chad Barb, and Abhi-jeet Joglekar.An integrated experimental environment for distributed systemsand networks.pages 255–270, Boston, MA, December 2002.
Ellen W. Zegura, Kenneth L. Calvert, and Michael J. Donahoo.
A quantitative comparison of graph-based models for Internettopology.IEEE/ACM Transactions on Networking, 5(6):770–783, 1997.Available at http://citeseer.nj.nec.com/zegura97quantitative.html.