Using Application-Domain Knowledge in the Runtime Support of Multi-Experiment Computational Studies
Siu Yau, Dissertation Defense, Dec 08


Page 1

Using Application-Domain Knowledge in the Runtime Support of Multi-Experiment Computational Studies

Siu Yau

Dissertation Defense, Dec 08

Page 2

Multi-Experiment Study (MES)

• Simulation software rarely runs in isolation

• Multi-Experiment Computational Study
  – Multiple executions of a simulation experiment
  – Goal: identify interesting regions in the input space of the simulation code

• Examples in engineering, science, medicine, finance

• Interested in the aggregate result
  – Not in individual experiments

Page 3

MES Challenges

• Systematically cover input space
  – Refinement + high dimensionality ⇒ large number of experiments (100s or 1000s) and/or user interaction

• Accurate individual experiments
  – Spatial + temporal refinement ⇒ long-running individual experiments (days or weeks per experiment)

• Subjective goal
  – Requires study-level user guidance

Page 4

MES on Parallel Architectures

• Parallel architecture maps well to MES

• Dedicated, local access to small- to medium-sized parallel computers
  – Interactive MES ⇒ user-directed coverage of the exploration space

• Massively-parallel systems
  – Multiple concurrent parallel experiments ⇒ exploit the power of massively parallel systems

• Traditional systems lack a high-level view

Page 5

Thesis Statement

To meet the interactive and computational requirements of Multi-Experiment Studies, a parallel run-time system must view an entire study as a single entity, and use application-level knowledge that is made available from the study context to inform its scheduling and resource allocation decisions.

Page 6

Outline

• MES Formulation, motivating examples – Defibrillator Design, Helium Model Validation

• Related Work

• Research Methodology

• Research Test bed: SimX

• Optimization techniques
  – Sampling, result reuse, resource allocation

• Contributions

Page 7

MES Formulation

• Simulation Code: maps input to result– Design Space: Space of possible inputs to

simulation code

• Evaluation Code: maps result to performance metric– Performance Space: Space of outputs of

evaluation code

• Goal: Find Region of Interest in Design & Performance Space
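To make the formulation concrete, the pipeline of one experiment can be sketched as below; every name and the toy formulas are illustrative stand-ins, not the dissertation's actual API:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Illustrative types only; the dissertation does not prescribe these names.
using DesignPoint = std::vector<double>;  // a point in the design space
using Metric      = std::vector<double>;  // a point in the performance space

// Simulation code: maps a design-space input to a raw result
// (a stand-in for an expensive solver).
double runSimulation(const DesignPoint& x) {
    return std::sin(x[0]) * std::cos(x[1]);
}

// Evaluation code: maps the raw result to a performance metric.
Metric evaluate(double simResult) {
    return { simResult, 1.0 - simResult };
}

int main() {
    // One experiment = simulation + evaluation at one design point;
    // an MES runs many such experiments and keeps only the aggregate answer.
    DesignPoint x = {0.3, 0.7};
    Metric m = evaluate(runSimulation(x));
    std::printf("metric = (%.3f, %.3f)\n", m[0], m[1]);
}
```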

Page 8

Example: Defibrillator Design

• Help design implantable defibrillators

• Simulation Code:
  – Electrode placements + shock voltage ⇒ torso potential

• Evaluation Code:
  – Torso potential + activation/damage thresholds ⇒ % activated & damaged heart tissue

• Goal: the placement + voltage combination that maximizes activation and minimizes damage

Page 9

Example: Gas Model Validation

• Validate gas-mixing model

• Simulation Code:
  – Prandtl number + gas inlet velocity ⇒ helium plume motion

• Evaluation Code:
  – Helium plume motion ⇒ velocity-profile deviation from real-life data

• Goal: find the Prandtl number + inlet velocity that minimize deviation

Page 10

Example: Pareto Optimization

• Set of inputs that cannot be improved in all objectives

[Plot: performance space with damage on one axis and activation on the other; Pareto-optimal points highlighted.]

Page 11

Example: Pareto Optimization

• Set of inputs that cannot be improved in all objectives
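For instance, the dominance test and a brute-force frontier filter for the defibrillator objectives (maximize activation, minimize damage) could look as follows; an illustrative O(n²) sketch, not necessarily how SimX computes the frontier:

```cpp
#include <cstdio>
#include <vector>

struct Point { double activation, damage; };  // performance-space coordinates

// p dominates q if p is at least as good in both objectives
// (higher activation, lower damage) and strictly better in at least one.
bool dominates(const Point& p, const Point& q) {
    return p.activation >= q.activation && p.damage <= q.damage &&
           (p.activation > q.activation || p.damage < q.damage);
}

// Brute-force Pareto filter: keep every point that no other point dominates.
std::vector<Point> paretoFrontier(const std::vector<Point>& pts) {
    std::vector<Point> frontier;
    for (const Point& p : pts) {
        bool dominated = false;
        for (const Point& q : pts)
            if (dominates(q, p)) { dominated = true; break; }
        if (!dominated) frontier.push_back(p);
    }
    return frontier;
}

int main() {
    std::vector<Point> pts = {{0.9, 0.4}, {0.8, 0.1}, {0.7, 0.3}, {0.95, 0.5}};
    for (const Point& p : paretoFrontier(pts))
        std::printf("activation %.2f, damage %.2f\n", p.activation, p.damage);
}
```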

Page 12

Challenge: Defibrillator Design

• Interactive exploration of the Pareto frontier
  – Changing the set-up (voltage, back electrode, etc.) ⇒ a new study
  – Interactive exploration of the “study space”

• One user action ⇒ one aggregate result

• Need a study-level view at an interactive rate

Page 13

Challenge: Model Validation

• Multiple executions of long-running code
  – 6x6 grid = 36 experiments
  – ~3000 timesteps per experiment @ 8 seconds per timestep ⇒ 6.5 hours per experiment
  – 36 experiments × 6.5 hours ⇒ ~10 days per study

• Schedule and allocate resources for the study as a single entity: how should parallel resources be distributed?

Page 14

Related Work: Grid Schedulers

• Grid schedulers
  – Condor, Globus
  – Each experiment treated as a “black box”

• Application-aware grid infrastructures:
  – Nimrod/O and Virtual Instrument
  – Take advantage of application knowledge, but in an ad-hoc fashion
  – No consistent set of APIs reusable across different MESs

Page 15

Related Work: Parallel Steering

• Grid-based Steering
  – RealityGrid, WEDS
  – Steer execution of inter-dependent tasks
  – Different focus: Grid vs. cluster

• Parallel Steering Systems
  – Falcon, CUMULVS, CSE
  – Steer single executions (not collections) on parallel machines

Page 16

Methodology

• Four example MESs, varying properties

Study               | Bridge Design       | Defibrillator Design | Animation Design  | Gas Model Validation
User interaction    | No                  | Yes                  | Yes               | No
No. of experiments  | 100K                | 65K                  | ~100K             | 36
Time per experiment | 7 secs              | 2 secs               | < 1 sec           | 6.5 hours
Parallel code?      | No                  | No                   | No                | Yes
Study goal          | Pareto Optimization | Pareto Optimization  | Aesthetic Measure | Pareto Optimization

Page 17

Methodology (cont’d)

• Identify application-aware system policies
  – Scheduling, resource allocation, user interface, storage support

• Construct research test bed (SimX)
  – API to import application knowledge
  – Implemented on parallel clusters

• Conduct example MESs
  – Implement techniques, measure the effect of application-aware system policies

Page 18

Test bed: SimX

• Parallel System for Interactive Multi-Experiment Studies (SIMECS)

• Supports MESs on parallel clusters

• Functionality-based components
  – UI, Sampler, Task Queue, Resource Allocator, Simulation Container, SISOL

• Each component has a specific API

• Adapt the API to the needs of the MES

Page 19

Test bed: SimX

[Architecture diagram: a front-end manager process hosting the User Interface (visualisation & interaction), Sampler, Resource Allocator, and Task Queue; a worker process pool of Simulation Containers, each running simulation and evaluation code behind a FUEL interface; and a SISOL server pool (a directory server plus data servers) reached through the SISOL API.]

Page 20

Optimization techniques

• Reduce the number of experiments needed:
  – Automatic sampling
  – Study-level user steering
  – Study-level result reuse

• Reduce the run time of individual experiments:
  – Reuse results from another experiment: checkpoints, internal states

• Improve the resource utilization rate:
  – Minimize parallelization overhead & maximize reuse potential
  – Preemption: claim idle resources

Page 21

Active Sampling

• If the MES is an optimization study (i.e., the region of interest optimizes a function)
  – Incorporate the search algorithm in the scheduler

• Pareto optimizations: Active Sampling
  – Cover the design space from a coarse to a fine grid
  – Use aggregate results from the coarse level to identify promising regions

• Reduces the number of experiments needed (see the sketch below)
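A toy sketch of the coarse-to-fine idea, with a single scalar objective standing in for the Pareto test; the function and the promising-region threshold are invented for illustration:

```cpp
#include <cmath>
#include <cstdio>

// Stand-in for one experiment (simulation + evaluation) at design point (x, y);
// a real study would run the solver, here "promising" just means a small value.
static double metric(double x, double y) {
    return std::pow(x - 0.37, 2) + std::pow(y - 0.61, 2);
}

// Coarse-to-fine sampling: run the experiment at the cell centre, and recurse
// into the four sub-cells only when this level's result looks promising.
static void refine(double x0, double y0, double x1, double y1,
                   int level, int maxLevel) {
    double cx = (x0 + x1) / 2, cy = (y0 + y1) / 2;
    std::printf("level %d: experiment at (%.3f, %.3f)\n", level, cx, cy);
    bool promising = metric(cx, cy) < 0.5 / (level + 1);  // ad-hoc threshold
    if (!promising || level == maxLevel) return;
    refine(x0, y0, cx, cy, level + 1, maxLevel);
    refine(cx, y0, x1, cy, level + 1, maxLevel);
    refine(x0, cy, cx, y1, level + 1, maxLevel);
    refine(cx, cy, x1, y1, level + 1, maxLevel);
}

int main() { refine(0, 0, 1, 1, 0, 3); }  // 3 refinement levels, unit square
```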

Page 22

Active Sampler (cont’d)

[Diagram: sampling alternates between grids and results, from coarse to fine: Initial Grid → 1st level results → First Refinement → 2nd level results → 2nd Refinement → 3rd level results.]

Page 23

Support for Sampling

[Architecture diagram: the Sampler slot of the front-end manager process accepts interchangeable implementations: a Naïve (Sweep) Sampler, a Random Sampler, a Custom Sampler, or the Active (Pareto) Sampler; the rest of the SimX architecture is unchanged.]

SimX Sampler API:

void setStudy(StudySpec)
void registerResult(experiment, performance)
experiment getNextPointToRun()
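As an illustration, a sweep sampler could sit behind this API roughly as follows; the three method names come from the slide, while StudySpec, Experiment, and hasNext are hypothetical stand-ins (a 2-D study with steps >= 2 per axis is assumed):

```cpp
#include <cstdio>
#include <queue>

// Hypothetical stand-ins; only setStudy/registerResult/getNextPointToRun
// are from the slide.
struct Experiment { double x, y; };
struct StudySpec  { double lo[2], hi[2]; int steps; };  // steps >= 2 assumed

class SweepSampler {  // the naive full-factorial ("sweep") sampler
public:
    void setStudy(const StudySpec& s) {
        // Enqueue every grid point of the study up front.
        for (int i = 0; i < s.steps; ++i)
            for (int j = 0; j < s.steps; ++j)
                pending_.push({s.lo[0] + i * (s.hi[0] - s.lo[0]) / (s.steps - 1),
                               s.lo[1] + j * (s.hi[1] - s.lo[1]) / (s.steps - 1)});
    }
    // An active sampler would use registered results to choose refinements;
    // a sweep ignores them.
    void registerResult(const Experiment&, double /*performance*/) {}

    bool hasNext() const { return !pending_.empty(); }
    Experiment getNextPointToRun() {
        Experiment e = pending_.front();
        pending_.pop();
        return e;
    }
private:
    std::queue<Experiment> pending_;
};

int main() {
    SweepSampler s;
    s.setStudy({{0, 0}, {1, 1}, 3});  // a 3x3 sweep of the unit square
    while (s.hasNext()) {
        Experiment e = s.getNextPointToRun();
        std::printf("(%.1f, %.1f)\n", e.x, e.y);
    }
}
```

The Active (Pareto) Sampler plugs into the same three calls, but uses registerResult to decide which points the next refinement level should run.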

Page 24

Evaluation: Active Sampling

• Helium validation study
  – Resolve the Pareto frontier on a 6x6 grid
  – Reduces the number of experiments from 36 to 24

• Defibrillator study
  – Resolve the Pareto frontier on a 256x256 grid
  – Reduces the number of experiments from 65K to 7.3K
  – Non-perfect scaling due to dependencies
  – At 128 workers: active sampling 349 secs; grid sampling 900 secs

Page 25

Result reuse

• MES: many similar runs of the simulation code

• Share information between experiments
  – Speeds up the experiments that reuse information
  – Only need to calculate deltas

• Many types, depending on the information used
  – Varying degrees of generality

• Reduces individual experiment run time
  – Except study-level reuse

Page 26

Result reuse types

Type                      | Result reused             | Applicability
Checkpoint reuse          | Simulation code output    | Time-stepping code, iterative solver
Preconditioner reuse      | Preconditioner            | Iterative linear solver
Intermediate result reuse | Internal state            | Simulation code with shared internal states
Simulation result reuse   | Simulation code output    | Interactive MESs
Performance metric reuse  | Evaluation code output    | Interactive MESs
Study-level reuse         | Aggregate result of study | Interactive MESs

Page 27

Intermediate Result Reuse

• The defibrillator simulation code solves 3 linear systems and linearly combines the solutions

• The same system is needed by different experiments (e.g. A_a x = b_a, A_b x = b_b, A_c x = b_c, A_d x = b_d across experiments)

• Cache the solutions: store A_c^-1 b_c and A_b^-1 b_b for reuse

Page 28

Support for Result Reuse

[Architecture diagram: worker processes publish intermediate solutions (e.g. A_a^-1 b_a, A_b^-1 b_b) to the SISOL data-server pool, where other experiments can fetch them through the SISOL API.]

SISOL API:

object StartRead(objSet, coord)
void EndRead(object)
object StartWrite(objSet, coord)
void EndWrite(objSet, object)
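The reuse pattern behind these four calls is read-before-compute. The sketch below mocks them in-process purely for illustration; the real SISOL serves shared objects from the distributed data-server pool, and its actual types and locking semantics may differ:

```cpp
#include <cstdio>
#include <map>

// In-process mock of the four SISOL calls; types are illustrative stand-ins.
using Coord  = int;                      // identifies a cached object
using Object = double;                   // stand-in for a stored solution
using ObjSet = std::map<Coord, Object>;  // stand-in for a SISOL object set

Object* StartRead(ObjSet& s, Coord c) {
    auto it = s.find(c);
    return it == s.end() ? nullptr : &it->second;
}
void    EndRead(Object*) {}                        // would release a read lock
Object* StartWrite(ObjSet& s, Coord c) { return &s[c]; }
void    EndWrite(ObjSet&, Object*) {}              // would publish and unlock

// Reuse pattern: try the shared store first; compute only on a miss.
Object solveWithReuse(ObjSet& cache, Coord systemId) {
    if (Object* hit = StartRead(cache, systemId)) {
        Object v = *hit;                 // e.g. a cached A_b^-1 b_b
        EndRead(hit);
        return v;
    }
    Object v = 42.0;                     // stand-in for the expensive solve
    Object* slot = StartWrite(cache, systemId);
    *slot = v;                           // publish for other workers
    EndWrite(cache, slot);
    return v;
}

int main() {
    ObjSet cache;
    Object first  = solveWithReuse(cache, 7);  // miss: computes and publishes
    Object second = solveWithReuse(cache, 7);  // hit: reuses the cached value
    std::printf("%.1f %.1f\n", first, second);
}
```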

Page 29

Checkpoint Result Reuse

• The helium code terminates when the kinetic energy (KE) stabilizes
• Starting from another experiment's checkpoint stabilizes faster
• The checkpoint must come from a run with the same inlet velocity

Page 30

Study-level Result Reuse

• Interactive study: successive studies are similar

• Use Pareto frontier from first study as a guide for next study

Page 31

Evaluation: Result Reuse

• Checkpoint reuse in the helium model study:
  – No reuse: 3000 timesteps; with reuse: 1641
  – 18 experiments out of 24 able to reuse
  – 28% improvement overall

• Defibrillator study
  – No reuse: 7.3K experiments @ 2 secs each = 349 secs total on 128 procs
  – With reuse: 6.5K experiments @ 1.5 secs = 123 secs total on 128 procs
  – 35% improvement overall

Page 32

Resource Allocation

• An MES is made up of parallel simulation codes

• How to divide the cluster among experiments?
  – Parallelization overhead ⇒ fewer processes per experiment
  – Active sampling + reuse ⇒ some experiments are more important; give those experiments more processes

• Adapt the allocation policy to the MES:
  – Use application knowledge to decide which experiments are prioritized

Page 33

Resource Allocation

• Batching strategy: select a subset (batch), assign it high priority, and run it concurrently
  – Considerations for batching policies:
    • Scaling behavior: maximize batch size
    • Sampling policy: prioritize “useful” samples
    • Reuse potential: prioritize experiments with reuse

• Preemption strategy:
  – Claim unused processing elements and assign them to experiments in progress

Page 34

Resource Allocation: Batching

• Batching for Active Sampling
• Identify independent experiments in the sampler
• Maximize parallelism while allowing active sampling

[Plot: design space (Prandtl number vs. inlet velocity) showing the refinement sequence: First Batch → 1st Pareto-Optimal; Second Batch → 1st & 2nd Pareto Opt.; 3rd Batch → 1st to 3rd Pareto Opt.; 4th Batch → Pareto Frontier.]

Page 35

Resource Allocation: Batching

• Active Sampling batching

[Figure: experiments grouped into the 1st through 4th batches.]

Page 36

Resource Allocation: Batching

• Batching for reuse classes
• Sub-divide each batch into 2 smaller batches:
  – 1st sub-batch: the first experiment of each reuse class; no two belong to the same reuse class
  – No two concurrent from-scratch experiments can reuse each other's checkpoints (maximizes reuse potential)
  – Experiments in the same batch have comparable run times (reduces scheduling holes)

[Plot: design space (Prandtl number vs. inlet velocity) with batches sub-divided by reuse class.]

Page 37

Resource Allocation: Batching

• Batching for reuse classes

[Figure: experiments grouped into the 1st through 6th batches by reuse class.]

Page 38

Resource Allocation: Preemption

• With preemption

[Figure: the same batches (1st through 6th) with preemption; idle processors are reclaimed by experiments still in progress.]

Page 39

Support for Resource Allocation

[Architecture diagram: the Resource Allocator and Task Queue in the front-end manager process schedule experiments onto the worker process pool; the SISOL server pool is unchanged.]

SimX Task Queue API:

TaskQueue::AddTask(Experiment)
TaskQueue::CreateBatch(set<Experiment>&)
TaskQueue::GetIdealGroupSize()
TaskQueue::AssignNextTask(GroupID)
Reconfigure(const int* assignment)
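A sketch of how an application-aware allocator might drive this API; the call names are from the slide, but the bodies, types, and the first-in-reuse-class batching rule in main() are illustrative mocks, not SimX's implementation:

```cpp
#include <cstdio>
#include <deque>
#include <set>

// Illustrative stand-ins; only the TaskQueue method names are from the slide.
struct Experiment {
    int id, reuseClass;
    bool operator<(const Experiment& o) const { return id < o.id; }
};
using GroupID = int;

class TaskQueue {
public:
    void AddTask(const Experiment& e) { pending_.push_back(e); }
    // Promote a set of experiments to the current high-priority batch.
    void CreateBatch(std::set<Experiment>& b) { batch_ = b; }
    // Workers per experiment that keeps parallelization overhead acceptable.
    int GetIdealGroupSize() const { return 4; }  // placeholder policy
    // Hand the next pending experiment to a worker group; false when drained.
    bool AssignNextTask(GroupID g) {
        if (pending_.empty()) return false;
        std::printf("group %d <- experiment %d\n", g, pending_.front().id);
        pending_.pop_front();
        return true;
    }
private:
    std::deque<Experiment> pending_;
    std::set<Experiment>   batch_;
};

int main() {
    TaskQueue q;
    std::set<Experiment> batch;
    for (int i = 0; i < 6; ++i) {
        Experiment e{i, i % 3};  // six experiments in three reuse classes
        q.AddTask(e);
        // Batching rule from the slides: the first experiment of each reuse
        // class runs from scratch, so later ones can reuse its checkpoint.
        if (e.id == e.reuseClass) batch.insert(e);
    }
    q.CreateBatch(batch);
    // Drain the queue over two worker groups (preemption would Reconfigure
    // idle groups onto experiments still in progress).
    for (GroupID g = 0; q.AssignNextTask(g); g = (g + 1) % 2) {}
}
```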

Page 40

Evaluation: Resource Allocation

Knowledge used          | Total time   | Utilization rate | Avg. time per run | Improvement
None (run on 1 worker)  | 12 hr 35 min | 56.3%            | 6 hr 17 min       | N/A
None (run 1 experiment) | 20 hr 35 min | 100%             | 34.3 min          | N/A
+ Active Sampling       | 6 hr 10 min  | 71.1%            | 63.4 min          | 51% / 70%
+ Reuse classes         | 5 hr 10 min  | 71.3%            | 39.7 min          | 59% / 75%
+ Preemption            | 4 hr 30 min  | 91.8%            | 34.5 min          | 64% / 78%

Page 41

Contributions

• Demonstrate the need to consider the entire end-to-end system

• Identify system policies that can benefit from application-level knowledge
  – Scheduling (sampling): for optimization MESs
  – User steering: for MESs with subjective goals and MESs with high design-space dimensionality
  – Result reuse: for MESs made up of similar executions of simulation code
  – Resource allocation: for MESs made up of parallel simulation codes

Page 42

Contributions

• Demonstrate with a prototype system
  – API to import relevant application knowledge

• Quantify the benefits of application-aware techniques
  – Sampling: orders-of-magnitude improvement in the bridge design and defibrillator studies; 33% improvement in the helium model validation study
  – User steering: enables interactivity in the animation design and defibrillator studies
  – Result reuse: multi-fold improvement in the bridge design, defibrillator, and helium model validation studies
  – Application-aware resource allocation: multi-fold improvement in the helium model validation study