Parallel Simulation of Continuous Systems: A Brief Introduction Oct. 19, 2005 CS6236 Lecture


Parallel Simulation of Continuous Systems: A Brief Introduction

Oct. 19, 2005

CS6236 Lecture


Background

Sample applications of continuous systems
  Civil engineering: building construction
  Aerospace engineering: aircraft design
  Mechanical engineering: machining
  Systems biology: heart simulations
  Computer engineering: semiconductor simulations

Computer simulations
  Discrete models
  Continuous models


Outline

Mathematical models and methods

Parallel algorithm methodology

Some active research areas


Mathematical Models

Ordinary/partial differential equations
  Laplace equation
  Heat (diffusion) equation
Steady-state vs. time-dependent
Convert into discrete problem through numerical discretization
  Finite difference methods: structured grids
  Finite element methods: local basis functions
  Spectral methods: global basis functions
  Finite volume methods: conservation
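The two equations named above are commonly written as follows. This is a sketch in standard textbook notation, assuming two space dimensions for the Laplace equation, one space dimension for the heat equation, and a positive diffusion coefficient c; the slide itself does not fix any of these symbols.

```latex
\begin{align*}
\text{Laplace:} \quad & u_{xx} + u_{yy} = 0 \\
\text{Heat (diffusion):} \quad & u_t = c\, u_{xx}, \qquad c > 0
\end{align*}
```

The Laplace equation is the steady-state case: setting u_t = 0 in the heat equation (in the same number of space dimensions) leaves only the spatial derivatives.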


Example: 1-D Laplace Equation

Laplace equation in one dimension

with boundary conditions

Finite difference approximation

with Jacobi iteration
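The four pieces named on this slide can be written out as below. This is a sketch in standard notation: the interval endpoints a, b and the boundary values alpha, beta are assumed symbols, while the Jacobi update matches the y_i = (y_{i-1} + y_{i+1})/2 step used in the task programs later in the lecture.

```latex
\begin{align*}
u''(x) &= 0, \qquad a < x < b \\
u(a) &= \alpha, \qquad u(b) = \beta \\
y_{i+1} - 2 y_i + y_{i-1} &= 0, \qquad i = 1, \dots, n, \quad y_0 = \alpha, \ y_{n+1} = \beta \\
y_i^{(k+1)} &= \frac{y_{i-1}^{(k)} + y_{i+1}^{(k)}}{2}
\end{align*}
```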


Example: 2-D Laplace Equation

Laplace equation in two dimensions

with boundary conditions at four sides
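A minimal serial sketch of how the 2-D problem might be iterated, assuming the usual five-point finite difference stencil (each interior value becomes the average of its four neighbors) and Jacobi iteration as in the 1-D example. The function name `jacobi_2d` and the 6 x 6 grid are illustrative choices, not taken from the lecture.

```python
def jacobi_2d(u, sweeps):
    """Run Jacobi sweeps on the interior of a 2-D grid; edge values stay fixed."""
    rows, cols = len(u), len(u[0])
    for _ in range(sweeps):
        new = [row[:] for row in u]  # copy so every update reads old values only
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                    + u[i][j - 1] + u[i][j + 1])
        u = new
    return u


# Boundary value 1.0 on all four sides, interior initialized to 0.0.
n = 6
grid = [[0.0] * n for _ in range(n)]
for k in range(n):
    grid[0][k] = grid[n - 1][k] = 1.0
    grid[k][0] = grid[k][n - 1] = 1.0

result = jacobi_2d(grid, 200)
print(result[2][2])  # interior values approach the constant boundary value
```

With the same value on all four sides the exact solution is that constant, which makes the convergence easy to check by eye.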


Parallel Programming Model

Parallel computation: two or more tasks executing concurrently

Task encapsulates sequential program and local memory

Tasks can be mapped to processors in various ways, including multiple tasks per processor


Performance Considerations

Load balance: work divided evenly
Concurrency: work done simultaneously
Overhead: work not present in serial computation
  Communication
  Synchronization
  Redundant work
  Speculative work


Example: 1-D Laplace Equation

Define n tasks, one for each y_i

Program for task i, i = 1, ..., n

    initialize y_i
    for k = 1, ...
        if i > 1, send y_i to task i-1
        if i < n, send y_i to task i+1
        if i < n, recv y_{i+1} from task i+1
        if i > 1, recv y_{i-1} from task i-1
        y_i = (y_{i-1} + y_{i+1})/2
    end
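The task program above can be emulated on one machine to see the message pattern at work. The sketch below assumes Python threads stand in for tasks and `queue.Queue` objects stand in for message channels; the names `run_tasks`, `from_left`, and `from_right`, the fixed boundary values, and the fixed iteration count are all illustrative assumptions rather than part of the lecture.

```python
import queue
import threading


def run_tasks(n, left_bc, right_bc, iterations):
    """Emulate n tasks, one per grid point, exchanging values through queues."""
    # from_left[i] holds messages sent to task i by task i-1;
    # from_right[i] holds messages sent to task i by task i+1.
    from_left = [queue.Queue() for _ in range(n + 2)]
    from_right = [queue.Queue() for _ in range(n + 2)]
    result = [0.0] * (n + 1)  # result[i] is the final y_i; index 0 is unused

    def task(i):
        yi = 0.0  # initialize y_i
        for _ in range(iterations):
            if i > 1:
                from_right[i - 1].put(yi)  # send y_i to task i-1
            if i < n:
                from_left[i + 1].put(yi)   # send y_i to task i+1
            # Boundary tasks use the fixed boundary value instead of a message.
            y_next = from_right[i].get() if i < n else right_bc
            y_prev = from_left[i].get() if i > 1 else left_bc
            yi = (y_prev + y_next) / 2
        result[i] = yi

    threads = [threading.Thread(target=task, args=(i,)) for i in range(1, n + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[1:]


print(run_tasks(5, 0.0, 1.0, 500))  # approaches a straight line from 0 to 1
```

Because each queue is first-in first-out and every task sends exactly one value per neighbor per iteration, a task that runs ahead cannot mix up values from different iterations.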


Design Methodology

Partition (Decomposition): decompose problem into fine-grained tasks to maximize potential parallelism

Communication: determine communication pattern among tasks

Agglomeration: combine into coarser-grained tasks, if necessary, to reduce communication requirements or other costs

Mapping: assign tasks to processors, subject to tradeoff between communication cost and concurrency


Types of Partitioning

Domain decomposition: partition data
  Example: grid points in 1-, 2-, or 3-D mesh
Functional decomposition: partition computation
  Example: components in climate model (atmosphere, ocean, land, etc.)


Example: Domain Decomposition

3-D mesh can be partitioned along any combination of one, two, or all three of its dimensions
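As a rough illustration of partitioning along one, two, or three dimensions, the sketch below splits a mesh into contiguous index ranges. The helper names `split_1d` and `decompose` and the 8 x 8 x 8 mesh are hypothetical; real codes typically rely on a library or a partitioning tool for this step.

```python
from itertools import product


def split_1d(n, parts):
    """Split range(n) into `parts` contiguous, nearly equal (start, stop) pairs."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for k in range(parts):
        stop = start + base + (1 if k < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def decompose(shape, parts):
    """Partition a mesh of `shape` using parts[d] pieces along dimension d."""
    per_dim = [split_1d(n, p) for n, p in zip(shape, parts)]
    return list(product(*per_dim))


mesh = (8, 8, 8)
slabs = decompose(mesh, (4, 1, 1))    # cut along one dimension
pencils = decompose(mesh, (2, 2, 1))  # cut along two dimensions
cubes = decompose(mesh, (2, 2, 2))    # cut along all three dimensions
print(len(slabs), len(pencils), len(cubes))
print(cubes[0])  # index ranges of the first subdomain, one pair per dimension
```

Each subdomain is a tuple of (start, stop) pairs, so it can be turned directly into array slices.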


Partitioning Checklist

Identify at least an order of magnitude more tasks than processors in target parallel system

Avoid redundant computation or storage

Make tasks reasonably uniform in size

Number of tasks, rather than size of each task, should grow as problem size increases


Communication Issues

Latency and bandwidth
Routing and switching
Contention, flow control, and aggregate bandwidth
Collective communication
  One-to-many: broadcast, scatter
  Many-to-one: gather, reduction, scan
  All-to-all
  Barrier
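To make the collective operations listed above concrete, here is a small sketch of their data-movement semantics only, with a Python list standing in for "one value per task". No real communication happens, and the function names are illustrative rather than any particular library's API. A barrier moves no data, so it is omitted.

```python
import operator
from functools import reduce
from itertools import accumulate


def broadcast(value, p):
    """One-to-many: every one of p tasks receives the same value."""
    return [value] * p


def scatter(data, p):
    """One-to-many: task k receives the k-th chunk (assumes len(data) % p == 0)."""
    size = len(data) // p
    return [data[k * size:(k + 1) * size] for k in range(p)]


def gather(parts):
    """Many-to-one: the root concatenates the chunks held by all tasks."""
    return [x for part in parts for x in part]


def reduction(values, op=operator.add):
    """Many-to-one: combine one value per task into a single result."""
    return reduce(op, values)


def scan(values, op=operator.add):
    """Prefix reduction: task k ends up with the combination of values 0..k."""
    return list(accumulate(values, op))


def all_to_all(send):
    """send[i][j] is what task i sends to task j; returns what each task receives."""
    return [list(column) for column in zip(*send)]


chunks = scatter([1, 2, 3, 4, 5, 6], 3)
print(chunks, gather(chunks))
print(reduction([1, 2, 3, 4]), scan([1, 2, 3, 4]))
```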


Communication Checklist

Communication should be reasonably uniform across tasks in frequency and volume

As localized as possible

Concurrent

Overlapped with computation, if possible

Not inhibiting concurrent execution of tasks


Agglomeration

Communication is proportional to surface area of subdomain, whereas computation is proportional to volume of subdomain

Higher-dimensional decompositions have more favorable communication-to-computation ratio

Increasing task sizes reduces communication but also reduces potential concurrency and flexibility


Surface-to-Volume Ratio
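Communication scales with the surface of a subdomain and computation with its volume, so the ratio of the two can be tabulated for different decompositions. The sketch below does this for a hypothetical 64 x 64 x 64 mesh split among 64 tasks; it assumes an interior subdomain (neighbors on both sides of every cut dimension) and one exchanged grid point per face point. The function name `comm_to_comp` is illustrative.

```python
def comm_to_comp(block, cut_dims):
    """Return (surface, volume, ratio) for one interior subdomain.

    block    -- subdomain extent in grid points along each of 3 dimensions
    cut_dims -- indices of the dimensions along which the mesh was partitioned
    """
    volume = block[0] * block[1] * block[2]
    # Each cut dimension contributes two faces whose points must be exchanged.
    surface = sum(2 * volume // block[d] for d in cut_dims)
    return surface, volume, surface / volume


n = 64  # hypothetical 64 x 64 x 64 mesh split among 64 tasks
print(comm_to_comp((1, n, n), (0,)))          # 1-D decomposition: slabs
print(comm_to_comp((8, 8, n), (0, 1)))        # 2-D decomposition: pencils
print(comm_to_comp((16, 16, 16), (0, 1, 2)))  # 3-D decomposition: cubes
```

For this mesh the ratios come out as 2.0, 0.5, and 0.375, consistent with higher-dimensional decompositions having the more favorable communication-to-computation ratio.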


Example: Agglomeration

Define p tasks, each with n/p of the y_i's

Program for task j, j = 1, ..., p

    initialize y_l, ..., y_h
    for k = 1, ...
        if j > 1, send y_l to task j-1
        if j < p, send y_h to task j+1
        if j < p, recv y_{h+1} from task j+1
        if j > 1, recv y_{l-1} from task j-1
        for i = l to h
            z_i = (y_{i-1} + y_{i+1})/2
        end
        y = z
    end
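The agglomerated program can be checked against a plain serial Jacobi loop. The sketch below emulates the p tasks one after another in a single process, storing each block with one ghost cell per side to hold y_{l-1} and y_{h+1}. The names `block_bounds` and `jacobi_blocks`, the handling of n not divisible by p, and the fixed iteration count are assumptions made for illustration.

```python
def block_bounds(n, p, j):
    """Return (l, h), the 1-based inclusive index range owned by task j of p."""
    base, extra = divmod(n, p)
    l = (j - 1) * base + min(j - 1, extra) + 1
    h = l + base - 1 + (1 if j <= extra else 0)
    return l, h


def jacobi_blocks(n, p, left_bc, right_bc, iterations):
    """Emulate p agglomerated tasks (requires n >= p); returns y_1..y_n."""
    bounds = [block_bounds(n, p, j) for j in range(1, p + 1)]
    # Each block is [ghost, y_l, ..., y_h, ghost].
    blocks = [[0.0] * (h - l + 3) for (l, h) in bounds]
    blocks[0][0] = left_bc     # outer ghost cells hold the boundary values
    blocks[-1][-1] = right_bc
    for _ in range(iterations):
        # Communication phase: only the two edge values of each block move.
        for j in range(p):
            if j > 0:
                blocks[j - 1][-1] = blocks[j][1]   # send y_l to the left task
            if j < p - 1:
                blocks[j + 1][0] = blocks[j][-2]   # send y_h to the right task
        # Computation phase: every task updates all of its own points.
        for j in range(p):
            y = blocks[j]
            z = y[:]
            for i in range(1, len(y) - 1):
                z[i] = (y[i - 1] + y[i + 1]) / 2
            blocks[j] = z
    return [v for y in blocks for v in y[1:-1]]


print(jacobi_blocks(10, 3, 0.0, 1.0, 50))
```

Per iteration each task now sends two values regardless of block size, while its computation grows with n/p.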


Example: Overlap Comm/Comp

Program for task j, j = 1, ..., p

    initialize y_l, ..., y_h
    for k = 1, ...
        if j > 1, send y_l to task j-1
        if j < p, send y_h to task j+1
        for i = l+1 to h-1
            z_i = (y_{i-1} + y_{i+1})/2
        end
        if j < p, recv y_{h+1} from task j+1
        z_h = (y_{h-1} + y_{h+1})/2
        if j > 1, recv y_{l-1} from task j-1
        z_l = (y_{l-1} + y_{l+1})/2
        y = z
    end
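The reordering above can be tried out with threads as stand-in tasks. In this sketch, `queue.Queue.put` returns immediately and so plays the role of a nonblocking send, while `get` blocks like a receive; interior points are updated between the two. This only emulates the ordering, since Python threads do not give true parallel speedup here. The function name `run_overlapped` and the block-size rule are illustrative assumptions.

```python
import queue
import threading


def run_overlapped(n, p, left_bc, right_bc, iterations):
    """Emulate p block tasks that update interior points before receiving."""
    from_left = [queue.Queue() for _ in range(p + 2)]
    from_right = [queue.Queue() for _ in range(p + 2)]
    pieces = [None] * (p + 1)

    def task(j):
        # Task j owns m points y[1..m]; y[0] and y[m+1] are ghost cells.
        m = n // p + (1 if j <= n % p else 0)
        y = [0.0] * (m + 2)
        if j == 1:
            y[0] = left_bc
        if j == p:
            y[m + 1] = right_bc
        for _ in range(iterations):
            if j > 1:
                from_right[j - 1].put(y[1])     # send y_l to task j-1
            if j < p:
                from_left[j + 1].put(y[m])      # send y_h to task j+1
            z = [0.0] * (m + 2)
            for i in range(2, m):               # interior points need no messages
                z[i] = (y[i - 1] + y[i + 1]) / 2
            if j < p:
                y[m + 1] = from_right[j].get()  # recv y_{h+1} from task j+1
            z[m] = (y[m - 1] + y[m + 1]) / 2
            if j > 1:
                y[0] = from_left[j].get()       # recv y_{l-1} from task j-1
            z[1] = (y[0] + y[2]) / 2            # computed last, so both ghosts are fresh
            y[1:m + 1] = z[1:m + 1]
        pieces[j] = y[1:m + 1]

    threads = [threading.Thread(target=task, args=(j,)) for j in range(1, p + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [v for piece in pieces[1:] for v in piece]


print(run_overlapped(10, 3, 0.0, 1.0, 50))
```

The result is the same as without overlap; only the order of work inside each iteration changes, so message latency can hide behind the interior loop.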


Mapping

Two basic strategies for assigning tasks to processors:
  Place tasks that can execute concurrently on different processors
  Place tasks that communicate frequently on same processor
Problem: These two strategies often conflict
In general, finding optimal solution to this tradeoff is NP-complete, so heuristics are used to find reasonable compromise
Dynamic vs. static strategies
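As one example of what such a heuristic can look like, the sketch below uses a greedy "largest task first" rule for static mapping. This particular rule is chosen here for illustration and is not named in the lecture; it balances load only and ignores communication between tasks. The task names and costs are made up.

```python
import heapq


def greedy_map(task_costs, num_procs):
    """Assign each task to the currently least-loaded processor, largest first."""
    heap = [(0, proc) for proc in range(num_procs)]  # (load, processor id)
    assignment = {}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, proc = heapq.heappop(heap)
        assignment[task] = proc
        heapq.heappush(heap, (load + cost, proc))
    loads = [0] * num_procs
    for load, proc in heap:
        loads[proc] = load
    return assignment, loads


costs = {"a": 5, "b": 4, "c": 3, "d": 3, "e": 3}  # hypothetical task costs
assignment, loads = greedy_map(costs, 2)
print(assignment, loads)
```

On this input the greedy rule ends with loads of 8 and 10, while placing a and b together against c, d, and e would give 9 and 9. The heuristic is fast but not optimal, which is the compromise the slide describes.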


Mapping Issues

Partitioning
Granularity
Mapping
Scheduling
Load balancing
Particularly challenging for irregular problems
Some software tools: Metis, Chaco, Zoltan, etc.


Example: Atmosphere Model

Partitioning
  Grid points in 3-D finite difference model
  Typically yields 10^5 to 10^7 tasks
Communication
  9-point stencil horizontally and 3-point stencil vertically
  Physics computations in vertical columns
  Global operations to compute total mass


Other Equations

Heat (diffusion) equation
Laplace equation
Advection equation
Wave equation
Classification of second-order equations
  Parabolic, hyperbolic, and elliptic
Methods for time-dependent equations
  Explicit vs. implicit
  Finite-difference, finite-volume, finite-element
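The advection and wave equations and the classification rule can be sketched in standard notation as below. The constant speed c and the generic coefficients A, B, C are symbols assumed for illustration.

```latex
\begin{align*}
\text{Advection:} \quad & u_t = -c\, u_x \\
\text{Wave:} \quad & u_{tt} = c^2\, u_{xx} \\
\text{General second-order form:} \quad & A u_{xx} + B u_{xy} + C u_{yy} + (\text{lower-order terms}) = 0
\end{align*}

\begin{align*}
B^2 - 4AC > 0 &: \ \text{hyperbolic (e.g. wave equation)} \\
B^2 - 4AC = 0 &: \ \text{parabolic (e.g. heat equation)} \\
B^2 - 4AC < 0 &: \ \text{elliptic (e.g. Laplace equation)}
\end{align*}
```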


CFL Condition for Stability

Necessary condition named after Courant, Friedrichs, and Lewy

Computational domain of dependence must contain physical domain of dependence

Implies time step must satisfy
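One common concrete instance, assumed here because the slide does not name a scheme: for the 1-D advection equation u_t + c u_x = 0 with c > 0, solved with an explicit first-order upwind scheme, the condition becomes c*dt/dx <= 1. The sketch below runs that scheme on a periodic grid with the Courant number c*dt/dx just below and just above 1. The function name `upwind_advect`, the grid size, and the pulse are illustrative choices.

```python
def upwind_advect(u, courant, steps):
    """Advance u_t + c u_x = 0 (c > 0) on a periodic grid; courant = c*dt/dx."""
    for _ in range(steps):
        # u[i - 1] wraps around to the last point when i == 0 (periodic domain).
        u = [u[i] - courant * (u[i] - u[i - 1]) for i in range(len(u))]
    return u


# Square pulse of height 1 on a 100-point periodic grid.
pulse = [1.0 if 40 <= i <= 60 else 0.0 for i in range(100)]

stable = upwind_advect(pulse, 0.9, 200)    # satisfies c*dt/dx <= 1
unstable = upwind_advect(pulse, 1.1, 200)  # violates it

print(max(abs(v) for v in stable))    # remains bounded by the initial maximum
print(max(abs(v) for v in unstable))  # grows by many orders of magnitude
```

With a Courant number of at most 1, each new value is a weighted average of two old values, so the solution cannot grow. Above 1 one of the weights turns negative and the shortest-wavelength components are amplified at every step.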


Active Research Areas

Discrete event simulation (DES) of continuous systems


Coupling of different physics
  Different mathematical models
  Continuous vs. discrete techniques
Load balancing
  Manager-worker model
  Irregular/unstructured problems
  Dynamic load balancing


Summary

Mathematical models for continuous systems
  Ordinary and partial differential equations
  Finite difference, finite volume, and finite element
Parallel algorithm design
  Partitioning
  Communication
  Agglomeration
  Mapping
Active research areas


References

I. T. Foster, Designing and Building Parallel Programs, Addison-Wesley, 1995

A. Grama, A. Gupta, G. Karypis, and V. Kumar, Introduction to Parallel Computing, 2nd ed., Addison-Wesley, 2003

M. J. Quinn, Parallel Computing: Theory and Practice, McGraw-Hill, 1994

K. M. Chandy and J. Misra, Parallel Program Design: A Foundation, Addison-Wesley, 1988