Parallel Simulation of Continuous Systems: A Brief Introduction Oct. 19, 2005 CS6236 Lecture

Background

Sample applications of continuous systems
  Civil engineering: building construction
  Aerospace engineering: aircraft design
  Mechanical engineering: machining
  Systems biology: heart simulations
  Computer engineering: semiconductor simulations

Computer simulations

Discrete models

Continuous models

Outline

Mathematical models and methods

Parallel algorithm methodology

Some active research areas

Mathematical Models

Ordinary/partial differential equations
  Laplace equation
  Heat (diffusion) equation
Steady-state vs. time-dependent
Convert into discrete problem through numerical discretization
  Finite difference methods: structured grids
  Finite element methods: local basis functions
  Spectral methods: global basis functions
  Finite volume methods: conservation

Example: 1-D Laplace Equation

Laplace equation in one dimension

with boundary conditions

Finite difference approximation

with Jacobi iteration
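The four items on this slide can be written out as below. This is a sketch under assumptions, not the slide's exact notation: the unknown is called y to match the y_i used in the task programs, the endpoints a, b and boundary values \alpha, \beta are placeholder symbols introduced here, and the grid is assumed uniform with spacing h, with y_0 = \alpha and y_{n+1} = \beta.

```latex
\begin{align*}
  y''(x) &= 0, \qquad a < x < b, \\
  y(a) &= \alpha, \qquad y(b) = \beta, \\
  \frac{y_{i+1} - 2\,y_i + y_{i-1}}{h^2} &= 0, \qquad i = 1, \dots, n, \\
  y_i^{(k+1)} &= \frac{y_{i-1}^{(k)} + y_{i+1}^{(k)}}{2}, \qquad i = 1, \dots, n.
\end{align*}
```

The last line is the finite difference equation solved for y_i, with the neighbouring values taken from the previous iterate k.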

Example: 2-D Laplace Equation

Laplace equation in two dimensions

with boundary conditions at four sides
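A serial Jacobi sweep for the 2-D case might look like the sketch below. It assumes a square n-by-n interior grid, a five-point finite difference stencil, and a constant boundary value along each side; the function name `jacobi_2d` and its parameters are illustrative and not taken from the lecture.

```python
def jacobi_2d(n, top, bottom, left, right, sweeps):
    """Run `sweeps` Jacobi iterations on an n-by-n interior grid.

    Boundary values are constant along each side (an assumption made
    here for brevity); interior points start at zero.
    """
    # (n+2) x (n+2) array: interior points plus one ring of boundary points.
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    for j in range(n + 2):
        u[0][j] = top
        u[n + 1][j] = bottom
    for i in range(1, n + 1):
        u[i][0] = left
        u[i][n + 1] = right

    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                # Five-point stencil: average of the four neighbours,
                # all taken from the previous iterate.
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                    + u[i][j - 1] + u[i][j + 1])
        u = new
    return u
```

Each interior point depends only on its four neighbours from the previous sweep, which is what makes the grid easy to partition among tasks.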

Parallel Programming Model

Parallel computation: two or more tasks executing concurrently

Task encapsulates sequential program and local memory

Tasks can be mapped to processors in various ways, including multiple tasks per processor

Performance Considerations

Load balance: work divided evenly
Concurrency: work done simultaneously
Overhead: work not present in serial computation
  Communication
  Synchronization
  Redundant work
  Speculative work

Example: 1-D Laplace Equation

Define n tasks, one for each y_i

Program for task i, i=1,…,n

initialize y_i
for k=1,…
    if i>1, send y_i to task i-1
    if i<n, send y_i to task i+1
    if i<n, recv y_{i+1} from task i+1
    if i>1, recv y_{i-1} from task i-1
    y_i = (y_{i-1}+y_{i+1})/2
end
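The task program above can be tried out with one Python thread per task and queues standing in for messages. This is only a sketch under assumptions: indices are 0-based, the fixed boundary values `y_left` and `y_right` are supplied to the two end tasks, and the iteration runs for a fixed number of sweeps. The name `run_fine_grained` is made up for this example.

```python
import queue
import threading


def run_fine_grained(y_init, y_left, y_right, sweeps):
    """One thread per unknown y_i; neighbours exchange values via queues."""
    n = len(y_init)
    from_left = [queue.Queue() for _ in range(n)]   # values arriving from task i-1
    from_right = [queue.Queue() for _ in range(n)]  # values arriving from task i+1
    result = list(y_init)

    def task(i):
        y = y_init[i]
        for _ in range(sweeps):
            if i > 0:
                from_right[i - 1].put(y)   # send y_i to task i-1
            if i < n - 1:
                from_left[i + 1].put(y)    # send y_i to task i+1
            # End tasks use the fixed boundary value instead of a message.
            y_next = from_right[i].get() if i < n - 1 else y_right
            y_prev = from_left[i].get() if i > 0 else y_left
            y = (y_prev + y_next) / 2
        result[i] = y

    threads = [threading.Thread(target=task, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Sends go first and never block, so every receive eventually finds its message. Each queue has a single sender and a single receiver, which keeps the values of successive sweeps in order.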

Design Methodology

Partition (Decomposition): decompose problem into fine-grained tasks to maximize potential parallelism

Communication: determine communication pattern among tasks

Agglomeration: combine into coarser-grained tasks, if necessary, to reduce communication requirements or other costs

Mapping: assign tasks to processors, subject to tradeoff between communication cost and concurrency

Types of Partitioning

Domain decomposition: partition data
  Example: grid points in 1-, 2-, or 3-D mesh
Functional decomposition: partition computation
  Example: components in climate model (atmosphere, ocean, land, etc.)

Example: Domain Decomposition

3-D mesh can be partitioned along any combination of one, two, or all three of its dimensions

Partitioning Checklist

Identify at least an order of magnitude more tasks than processors in target parallel system
Avoid redundant computation or storage
Make tasks reasonably uniform in size
Number of tasks, rather than size of each task, should grow as problem size increases

Communication Issues

Latency and bandwidth
Routing and switching
Contention, flow control, and aggregate bandwidth
Collective communication
  One-to-many: broadcast, scatter
  Many-to-one: gather, reduction, scan
  All-to-all
  Barrier

Communication Checklist

Communication should be
  Reasonably uniform across tasks in frequency and volume
  As localized as possible
  Concurrent
  Overlapped with computation, if possible
  Not inhibiting concurrent execution of tasks

Agglomeration

Communication is proportional to surface area of subdomain, whereas computation is proportional to volume of subdomain

Higher-dimensional decompositions have more favorable communication-to-computation ratio

Increasing task sizes reduces communication but also reduces potential concurrency and flexibility

Surface-to-Volume Ratio
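The surface-to-volume argument can be made concrete with a small calculation. The sketch below assumes an n x n x n mesh split evenly among p tasks along 1, 2, or 3 of its dimensions, and it counts only an interior subdomain (two faces per split dimension). The function name and parameters are chosen here for illustration.

```python
def surface_to_volume(n, p, dims):
    """Communication-to-computation ratio of one interior subdomain.

    n    : mesh points per side of the full cubic mesh
    p    : number of tasks
    dims : number of dimensions that are split (1, 2, or 3)
    """
    parts = p ** (1.0 / dims)   # pieces along each split dimension
    side = n / parts            # subdomain extent in a split dimension
    # Each split dimension contributes two faces of area volume/side,
    # so surface/volume = 2 * dims / side.
    return 2 * dims / side
```

For n = 256 and p = 64 this gives about 0.5 for slabs (1-D), 0.125 for pencils (2-D), and 0.094 for cubes (3-D), so the ratio falls as more dimensions are split.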

Example: Agglomeration

Define p tasks, each with n/p of the y_i's

Program for task j, j=1,...,p

initialize y_l,...,y_h
for k=1,...
    if j>1, send y_l to task j-1
    if j<p, send y_h to task j+1
    if j<p, recv y_{h+1} from task j+1
    if j>1, recv y_{l-1} from task j-1
    for i=l to h
        z_i = (y_{i-1}+y_{i+1})/2
    end
    y = z
end
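A quick way to check the agglomerated program is to emulate the p tasks one after another in a single process and compare the result for different p. The sketch below does that under assumptions: indices are 0-based, the boundary values `y_left` and `y_right` are fixed, n is at least p, and "receiving" a ghost value is modelled as reading the neighbouring block's end value. `block_range` and `agglomerated_sweep` are hypothetical helper names.

```python
def block_range(j, p, n):
    """Index range (l, h), inclusive and 0-based, owned by task j of p."""
    base, extra = divmod(n, p)
    l = j * base + min(j, extra)            # first `extra` tasks get one more point
    h = l + base - 1 + (1 if j < extra else 0)
    return l, h


def agglomerated_sweep(y, p, y_left, y_right):
    """One Jacobi sweep carried out block by block, as p tasks would."""
    n = len(y)
    z = [0.0] * n
    for j in range(p):
        l, h = block_range(j, p, n)
        # Ghost values: taken from the neighbouring block, or fixed at the ends.
        ghost_lo = y[l - 1] if j > 0 else y_left
        ghost_hi = y[h + 1] if j < p - 1 else y_right
        local = [ghost_lo] + y[l:h + 1] + [ghost_hi]
        for i in range(1, len(local) - 1):
            z[l + i - 1] = (local[i - 1] + local[i + 1]) / 2
    return z
```

Only two values per task cross a block boundary in each sweep, regardless of the block size, while the arithmetic per task grows with n/p.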

Example: Overlap Comm/Comp

Program for task j, j=1,...,p

initialize y_l,...,y_h
for k=1,...
    if j>1, send y_l to task j-1
    if j<p, send y_h to task j+1
    for i=l+1 to h-1
        z_i = (y_{i-1}+y_{i+1})/2
    end
    if j<p, recv y_{h+1} from task j+1
    z_h = (y_{h-1}+y_{h+1})/2
    if j>1, recv y_{l-1} from task j-1
    z_l = (y_{l-1}+y_{l+1})/2
    y = z
end
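The same ordering (send, update the interior, then receive and finish the two end points) can be sketched with threads and queues. Assumptions made here: equal blocks of at least two points, 0-based indices, fixed boundary values, and a fixed number of sweeps. Python threads do not give real overlap of communication and computation; the sketch only shows the order of operations, which a message-passing library with nonblocking operations could exploit. `run_overlapped` is a made-up name.

```python
import queue
import threading


def run_overlapped(y_init, p, y_left, y_right, sweeps):
    """Block-partitioned Jacobi; interior updates precede the receives."""
    n = len(y_init)
    if n % p != 0 or n // p < 2:
        raise ValueError("sketch assumes equal blocks of size >= 2")
    m = n // p
    from_left = [queue.Queue() for _ in range(p)]   # from task j-1
    from_right = [queue.Queue() for _ in range(p)]  # from task j+1
    blocks = [None] * p

    def task(j):
        y = list(y_init[j * m:(j + 1) * m])
        for _ in range(sweeps):
            if j > 0:
                from_right[j - 1].put(y[0])    # send y_l to task j-1
            if j < p - 1:
                from_left[j + 1].put(y[-1])    # send y_h to task j+1
            z = y[:]
            # Interior points need no remote data, so they are updated
            # before waiting for any message.
            for i in range(1, m - 1):
                z[i] = (y[i - 1] + y[i + 1]) / 2
            hi = from_right[j].get() if j < p - 1 else y_right
            z[-1] = (y[-2] + hi) / 2
            lo = from_left[j].get() if j > 0 else y_left
            z[0] = (lo + y[1]) / 2
            y = z
        blocks[j] = y

    threads = [threading.Thread(target=task, args=(j,)) for j in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [v for block in blocks for v in block]
```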

Mapping

Two basic strategies for assigning tasks to processors:
  Place tasks that can execute concurrently on different processors
  Place tasks that communicate frequently on same processor
Problem: these two strategies often conflict
In general, finding optimal solution to this tradeoff is NP-complete, so heuristics are used to find reasonable compromise

Dynamic vs static strategies

Mapping Issues

Partitioning
Granularity
Mapping
Scheduling
Load balancing
Particularly challenging for irregular problems
Some software tools: Metis, Chaco, Zoltan, etc.

Example: Atmosphere Model

Partitioning
  Grid points in 3-D finite difference model
  Typically yields 10^5 to 10^7 tasks
Communication
  9-point stencil horizontally and 3-point stencil vertically
  Physics computations in vertical columns
  Global operations to compute total mass

Other Equations

Heat (diffusion) equation
Laplace equation
Advection equation
Wave equation
Classification of second-order equations
  Parabolic, hyperbolic, and elliptic
Methods for time-dependent equations
  Explicit vs. implicit
  Finite-difference, finite-volume, finite-element
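For reference, the four equations named on this slide are commonly written as below. The symbols are assumptions made here (u for the unknown, t for time, x and y for space, c for a constant coefficient or speed) and may differ from the notation used in the lecture.

```latex
\begin{align*}
  \text{Heat (diffusion):} \quad & u_t = c\,u_{xx} && \text{(parabolic)} \\
  \text{Laplace:}          \quad & u_{xx} + u_{yy} = 0 && \text{(elliptic)} \\
  \text{Advection:}        \quad & u_t + c\,u_x = 0 && \text{(first-order, hyperbolic)} \\
  \text{Wave:}             \quad & u_{tt} = c^2\,u_{xx} && \text{(hyperbolic)}
\end{align*}
```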

CFL Condition for Stability

Necessary condition named after Courant, Friedrichs, and Lewy

Computational domain of dependence must contain physical domain of dependence

Implies an upper bound on the time step in terms of the mesh spacing and the propagation speed
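As an illustration, for the advection equation u_t + c u_x = 0 solved with the explicit first-order upwind scheme, the CFL bound is dt <= dx / |c|. The sketch below is not from the lecture: it assumes c > 0, a periodic domain, and hypothetical function names, and it lets one compare a time step inside the bound with one outside it.

```python
def upwind_advect(u0, c, dx, dt, steps):
    """Advance u_t + c u_x = 0 (c > 0) with first-order upwind, periodic in x."""
    nu = c * dt / dx                 # Courant number
    u = list(u0)
    for _ in range(steps):
        # u[i - 1] with i = 0 wraps to the last entry: periodic boundary.
        u = [(1 - nu) * u[i] + nu * u[i - 1] for i in range(len(u))]
    return u


def max_stable_dt(c, dx):
    """Largest time step allowed by the CFL bound for this scheme."""
    return dx / abs(c)
```

With a Courant number of at most 1, each new value is a weighted average of two old values, so the solution cannot grow. Beyond that, one weight becomes negative and errors are amplified at every step.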

Active Research Areas

Discrete-event simulation (DES) of continuous systems

Coupling of different physics
  Different mathematical models
  Continuous vs. discrete techniques
Load balancing
  Manager-worker model
  Irregular/unstructured problems
  Dynamic load balancing
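The manager-worker model can be sketched with a shared work queue: the manager fills the queue, and each worker repeatedly takes the next available item, so faster workers simply process more items. This is a minimal illustration with Python threads; `manager_worker` and its parameters are names chosen here, and a distributed implementation would replace the queue with messages.

```python
import queue
import threading


def manager_worker(work_items, num_workers, process):
    """Apply `process` to every item, with workers pulling work on demand."""
    tasks = queue.Queue()
    for idx, item in enumerate(work_items):   # manager: enqueue all work
        tasks.put((idx, item))
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                idx, item = tasks.get_nowait()
            except queue.Empty:
                return                        # no work left: worker stops
            value = process(item)
            with lock:
                results[idx] = value

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Return results in the original item order.
    return [results[i] for i in range(len(work_items))]
```

Because work is handed out one item at a time, the load balances itself even when item costs vary, at the price of one queue access per item.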

Summary

Mathematical models for continuous systems
  Ordinary and partial differential equations
  Finite difference, finite volume, and finite element
Parallel algorithm design
  Partitioning
  Communication
  Agglomeration
  Mapping

Active research areas
