1/17 Design Patterns and Computer Architecture Mark Murphy, Scott Beamer, Henry Cook, Andrew Waterman, Krste Asanovic, Kurt Keutzer

Design Patterns and Computer Architecture

Mark Murphy, Scott Beamer, Henry Cook, Andrew Waterman, Krste Asanovic, Kurt Keutzer

Design Patterns and Architecture

Design patterns (so far) are good at exposing ||ism
  Only half of the battle / There is parallelism everywhere we look!

We need to incorporate Architectural information
  But not too much: we don't want to drown in detail!

Computer Architects need patterns too!
  Dwarfs were supposed to supplant benchmarks, remember?
  Dwarfs -> Computational Patterns: too vague for architects

Do design pattern writers need architectural patterns?
  Standardize a vocabulary to discuss performance issues?

2/17

Work In Progress

The point of this talk is not to present any results
I want your input on the result of brainstorming sessions between myself and the Architecture research group

There are 40 minutes for this -- ~20 of me presenting slides and the rest for discussion

3/17

Pattern Language Exposes ||ism

[Figure: the pattern language drawn as layers beneath "Applications", with side labels "Productivity Layer" and "Efficiency Layer"]

Applications

Structural Patterns: Choose your high level structure
  Pipe-and-filter, Agent and repository, Process control, Event based / implicit invocation, Puppeteer
  Model view controller, Iterative refinement, Map reduce, Layered systems, Arbitrary static task graph

Computational Patterns: Identify the key computations
  Graph algorithms, Dynamic programming, Dense linear algebra, Sparse linear algebra
  Unstructured grids, Structured grids, Graphical models, Finite state machine
  Backtrack branch and bound, N-body methods, Circuits, Spectral methods, Monte Carlo methods

Parallel Algorithm Strategy Patterns: Refine the structure - what concurrent approach do I use? (Guided re-organization)
  Task Parallelism, Recursive Splitting, Data Parallelism, Pipeline, Discrete Event, Geometric Decomposition

Implementation Strategy Patterns: Utilize Supporting Structures - how do I implement my concurrency? (Guided mapping)
  Program Structure: SPMD, Strict data parallel, Fork/Join, Actors, Master/Worker, Graph partitioning, Loop parallelism, BSP, Task queue
  Data Structure: Shared queue, Shared hash table, Distributed array, Shared data
  Also listed: Memory parallelism

Concurrent Execution Patterns: Implementation methods - what are the building blocks of parallel programming? (Guided implementation)
  Advancing Program Counters: MIMD, SIMD, Thread pool, Task graph, Speculation, Data flow, Digital circuits
  Coordination: Message passing, Collective communication, Mutual exclusion, Transactional memory, P2P synchronization, Collective synchronization

Pattern Language Exposes ||ism

Example from Machine Learning:
  Compute the gradient of a scalar function w.r.t. a matrix B
  Each entry of gradient requires NxN BLAS2 matrix computations

5/17

Pattern Language Exposes ||ism

Example from Quantum Chemistry:
  Need to compute a matrix <# basis functions> x <# electrons>
  Each entry of matrix requires evaluating a number of functions, and summing the results

6/17

Pattern Language Exposes ||ism

In both examples, we have (at least) two levels of ||ism
  Many entries in matrix (Task Parallel)
  Much work in computing each entry (Map/Reduce Data Parallel)
  The pattern language can pretty much tell us this

However, the right parallel program for a GPU-like manycore processor looks different in the two cases
  for the Machine Learning problem, only parallelize the computation of each matrix element
  for the Chemistry problem, parallelize at both levels

Knowing this requires understanding that GPU-like processors implement fine-grained data parallelism best
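The two levels can be sketched in a few lines of Python. This is only an illustration of the structure, not code from either application: `term`, `entry`, and `matrix` are hypothetical names, the per-entry work is a made-up stand-in, and CPython threads show the shape of the decomposition rather than a real speedup. The `outer_parallel` and `inner_parallel` flags correspond to the two choices above: inner only (the Machine Learning case) or both levels (the Chemistry case).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add


def term(i, j, k):
    # Hypothetical stand-in for one unit of the work behind entry (i, j).
    return (i + 1) * (j + 1) * k


def entry(i, j, n_terms, inner_pool=None):
    """Map/Reduce level: map `term` over k, then sum the results."""
    mapper = inner_pool.map if inner_pool else map
    return reduce(add, mapper(lambda k: term(i, j, k), range(n_terms)), 0)


def matrix(rows, cols, n_terms, outer_parallel, inner_parallel, workers=4):
    """Task Parallel level: one independent task per matrix entry."""
    coords = [(i, j) for i in range(rows) for j in range(cols)]
    # Separate pools, so outer tasks never wait on workers they occupy.
    inner = ThreadPoolExecutor(workers) if inner_parallel else None
    outer = ThreadPoolExecutor(workers) if outer_parallel else None
    try:
        run = outer.map if outer else map
        flat = list(run(lambda ij: entry(ij[0], ij[1], n_terms, inner), coords))
    finally:
        for pool in (outer, inner):
            if pool:
                pool.shutdown()
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# "Machine Learning" style: parallelize only inside each entry.
inner_only = matrix(2, 3, 5, outer_parallel=False, inner_parallel=True)
# "Chemistry" style: parallelize at both levels.
both_levels = matrix(2, 3, 5, outer_parallel=True, inner_parallel=True)
```

Both calls produce the same matrix; the pattern language exposes both options, and choosing between them is the architecture-dependent part.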

7/17

SW writers understand HW arch?

There has been a sentiment that the pattern language should be architecture-agnostic

Architectural savvy required for decisions like these.

Otherwise, the options are all unattractive:
  Implement every possible parallelization, choose best? ...
  Choose one parallelization, hope it works? ...
  Ask Bryan to parallelize your code?

But clearly we can't write a pattern language around GTX200, just as we can't write it around LRB or Nehalem

8/17

Performance Models?

Abstract, simplistic models to capture the essence of low-level performance issues.
Extant example: LogP for distributed memory machines
  l -- Network latency for a message
  o -- CPU overhead of sending (or receiving) a message
  g -- gap = inverse of NIC bandwidth
  P -- number of processors
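As a rough illustration of how such a model is used, the sketch below encodes the four parameters and two textbook-style estimates. The class and method names are hypothetical, the field `L` is the slide's `l`, and the stream estimate assumes messages are injected back to back with nothing else competing for the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    L: float  # network latency for one message (the slide's "l")
    o: float  # CPU overhead to send or to receive one message
    g: float  # gap: minimum interval between consecutive messages
    P: int    # number of processors

    def one_message(self) -> float:
        # Send overhead, then network latency, then receive overhead.
        return self.o + self.L + self.o

    def message_stream(self, n: int) -> float:
        # n back-to-back messages between one pair of processors:
        # each extra message is delayed by the larger of g and o.
        if n < 1:
            raise ValueError("n must be at least 1")
        return (n - 1) * max(self.g, self.o) + self.one_message()


# Illustrative numbers only, in arbitrary time units.
net = LogP(L=6, o=2, g=4, P=8)
single = net.one_message()      # 2 + 6 + 2
stream = net.message_stream(3)  # 2 * 4 + 10
```

The appeal is that four numbers are enough to compare communication schedules on paper before writing any code.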

9/17

[Figure label: l-latency network]

Performance Models?

Could imagine a similar model for current manycores. How about this one?
The BLIMP model:
  B(L) -- Bandwidth as function of load/store block size
  I -- # Instruction Fetch units
  M -- # Load/Store units
  P -- # Execution Pipelines
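The slide names the BLIMP parameters but does not say how they combine into a time estimate. One plausible reading, offered here purely as an assumption, is a bottleneck bound: every unit handles one item per cycle and the busiest resource sets the running time. The names `Blimp` and `cycles_lower_bound`, the bandwidth curve, and the value `M=2` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Blimp:
    B: Callable[[int], float]  # bytes per cycle as a function of block size
    I: int                     # instruction fetch units
    M: int                     # load/store units
    P: int                     # execution pipelines


def cycles_lower_bound(m, instrs, mem_ops, alu_ops, block_bytes):
    """ASSUMPTION: one item per unit per cycle; the slowest resource wins."""
    bytes_moved = mem_ops * block_bytes
    return max(
        instrs / m.I,                      # instruction fetch limit
        mem_ops / m.M,                     # load/store issue limit
        alu_ops / m.P,                     # execution pipeline limit
        bytes_moved / m.B(block_bytes),    # memory bandwidth limit
    )


# Hypothetical machine: large blocks get 16 bytes/cycle, small ones 4.
machine = Blimp(B=lambda size: 16.0 if size >= 64 else 4.0, I=4, M=2, P=8)

wide = cycles_lower_bound(machine, instrs=1000, mem_ops=200,
                          alu_ops=1600, block_bytes=64)
narrow = cycles_lower_bound(machine, instrs=1000, mem_ops=200,
                            alu_ops=1600, block_bytes=4)
```

With these made-up numbers the 64-byte case is bandwidth bound and the 4-byte case is instruction-fetch bound, which is the kind of coarse answer such a model is meant to give.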

10/17

[Figure: example machine with I = 4 and P = 8]

Performance Models?

Problems are obvious
  Sure -- you can analyze the FFT algorithm and Matrix Multiply
  But what about my code?

Can't handle data dependence in computational intensity
  Example: SIFT Feature Extraction
    Compute a "scale space"
    For each maximum in scale space:
      Do a whole bunch of work
    How many maxima are there?

"Interesting" architectural features cannot be described

Still ... better than nothing?

11/17
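The data-dependence problem in the SIFT example can be made concrete with a toy sketch. This is not SIFT: a 1D list stands in for the scale space, and `local_maxima`, `work_units`, and the cost of 100 per maximum are hypothetical. The point is only that two inputs of the same size can need very different amounts of work.

```python
def local_maxima(signal):
    """Indices whose value is strictly greater than both neighbours."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] > signal[i + 1]]


def work_units(signal, cost_per_maximum=100):
    """Fixed scan cost plus a data-dependent cost per maximum found."""
    scan_cost = len(signal)
    return scan_cost + cost_per_maximum * len(local_maxima(signal))


# Same length, different content, very different total work.
smooth = [0, 1, 2, 3, 4, 3, 2, 1, 0]
bumpy = [0, 2, 0, 2, 0, 2, 0, 2, 0]

smooth_work = work_units(smooth)
bumpy_work = work_units(bumpy)
```

A static model that only sees the input size would predict the same cost for both inputs.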

Architects need patterns too!

"Benchmark Addiction" was part of motivation for Dwarfs
  Reliance upon C-source code benchmarks pigeon-holed architectural innovation
  Dwarfs were supposed to be anti-benchmarks: provide a non-source code description of the computations that were important

We (i.e. Tim) quickly discovered that Dwarfs were far too vague and high-level to serve this purpose
  A Computational Pattern (~Dwarf) doesn't even imply a particular problem to be solved, much less a particular algorithm

Can the fleshed-out pattern language be the solution?

13/17

Anti-Benchmarks?

Architecture-agnostic patterns-based analysis of a program enumerates space of implementations

[Figure labels: Task Parallel, Map/Reduce]

But architects still need their benchmark fix
  What does this actually tell them?
  They need to know:
    Is my cache big enough?
    Should I include my whiz-bang u-arch widget?

14/17

Anti-Benchmarks

Suppose that the pattern language somehow included the architectural savvy needed to make every possible implementation decision

What happens when the architect changes the rules?

15/17

Multiple Levels of Description

  Level 0: A patterns-based description
  Level 1: An "Abstract Machine" model?
  Level 2: A performance model?
  Level 3: A cycle-accurate simulation?
  Level 4: A joule-accurate simulation?

16/17

Abstract Machines

Alternate proposal for performance model (K. Asanovic)
  Given a microarchitectural widget, how does its presence/absence affect the performance of a program?
  Map the program to two different machines (one with, one without the widget). How are the programs different?
  Mapping process TBD. SEJITS?

Examples:
  An "Infinite ILP" machine. The superscalar analogue of PRAM
  An Infinite Vector-width machine.
  An infinite thread machine
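A minimal sketch of the "with and without the widget" comparison, using the infinite-ILP example. It assumes unit-latency instructions and an acyclic dependence graph; on the idealized machine every instruction issues as soon as its inputs are ready, so run time is the longest dependence chain, while a single-issue machine needs one cycle per instruction. The function names and the two toy programs are hypothetical.

```python
def infinite_ilp_cycles(deps):
    """deps maps each instruction to the instructions it depends on.

    ASSUMPTIONS: unit latency, acyclic graph. Time = longest chain.
    """
    finish = {}

    def finish_time(node):
        if node not in finish:
            finish[node] = 1 + max(
                (finish_time(d) for d in deps[node]), default=0)
        return finish[node]

    return max((finish_time(n) for n in deps), default=0)


def single_issue_cycles(deps):
    """Baseline machine without the widget: one instruction per cycle."""
    return len(deps)


# Tree reduction of four values: plenty of ILP to exploit.
tree = {"a": [], "b": [], "c": [], "d": [],
        "ab": ["a", "b"], "cd": ["c", "d"], "sum": ["ab", "cd"]}
# Serial chain: no ILP, so the widget buys nothing.
chain = {"x0": [], "x1": ["x0"], "x2": ["x1"], "x3": ["x2"]}

tree_speedup = single_issue_cycles(tree) / infinite_ilp_cycles(tree)
chain_speedup = single_issue_cycles(chain) / infinite_ilp_cycles(chain)
```

The gap between the two machines is an upper bound on what any real superscalar widget could deliver for that program, which is the kind of question the abstract-machine proposal is meant to answer.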

17/17

Architectural Meta-Patterns

Hopefully by now I've conveyed my concern about the lack of architectural / performance information in design patterns

Also, hopefully it is clear that I don't know the answer

Maybe someone can write me a pattern?
  How should I tell you what I know about architecture?

19/17

Thank You

20/17