Adaptive Execution Support for Malleable Computation

Speaker: LIN Qian
http://www.comp.nus.edu.sg/~linqian

Outline

• Introduce the key ideas of 3 selected papers
• Discussion

FORMLESS

• FORMLESS: Scalable Utilization of Embedded Manycores in Streaming Applications [LCTES’12]
  – Functionally-cOnsistent stRucturally-MalLEable Streaming Specification
  – Actor-oriented specification models
  – Space exploration scheme
    • to customize the application specification to better fit the target platform.

FORMLESS (cont.)

• Space exploration for platform-driven instantiation

FORMLESS (cont.)

• Example:

Dynamic Load Balancing

• A Distributed and Adaptive Dynamic Load Balancing Scheme for Parallel Processing of Medium-Grain Tasks [IEEE Journal, 1990]
  – Challenge: Allocate and distribute tasks dynamically with minimum run-time overhead.
  – Design: A distributed and adaptive load balancing scheme for medium-grain tasks

Dynamic Load Balancing (cont.)

• Key idea 1: Neighborhood average strategy
  – Attempts to balance load within a neighborhood by distributing tasks
    • such that all neighbors have loads close to the neighborhood average.
  – The decision when to balance load is based on the neighborhood state information, which is checked periodically.
    • Each processor maintains status information of all its neighbors.
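
The neighborhood average idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the scheme from the cited paper: the names `neighborhood_average` and `balance_step`, the integer task counts, and the `threshold` parameter are all hypothetical, and the periodic status exchange between processors is replaced by a shared `loads` dictionary.

```python
def neighborhood_average(loads, neighbors, node):
    """Average load over `node` and its direct neighbors."""
    group = [node] + neighbors[node]
    return sum(loads[n] for n in group) / len(group)


def balance_step(loads, neighbors, node, threshold=1):
    """Move tasks from `node` to under-loaded neighbors (hypothetical policy).

    Returns a list of (destination, task_count) transfers and updates
    `loads` in place. Nothing moves unless the surplus reaches `threshold`.
    """
    avg = neighborhood_average(loads, neighbors, node)
    surplus = int(loads[node] - avg)
    transfers = []
    if surplus < threshold:
        return transfers  # node is not overloaded enough to act
    # Serve the most under-loaded neighbors first.
    for nb in sorted(neighbors[node], key=lambda n: loads[n]):
        if surplus <= 0:
            break
        deficit = int(avg - loads[nb])
        if deficit <= 0:
            continue  # this neighbor is already at or above the average
        moved = min(surplus, deficit)
        loads[node] -= moved
        loads[nb] += moved
        surplus -= moved
        transfers.append((nb, moved))
    return transfers
```

With `loads = {0: 10, 1: 2, 2: 0}` and processor 0 connected to processors 1 and 2, one call to `balance_step(loads, neighbors, 0)` brings all three loads to the neighborhood average of 4.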

Dynamic Load Balancing (cont.)

• Key idea 2: Grain Size Control
  – If the cost of making work available to another processor exceeds the cost of executing it at the local processor, then it does not make sense to decompose and parallelize work beyond a certain size or granularity.
  – Granularity control: determine when to stop breaking down a computation into parallel computations at a frontier node, treating it as a leaf node and executing it sequentially.
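
A minimal sketch of granularity control, assuming a simple linear cost model that is not taken from the paper: executing `n` items locally costs `n * per_item_cost`, and handing a piece to another processor costs a fixed `handoff_cost`. The function names and both parameters are hypothetical.

```python
def choose_grain(handoff_cost, per_item_cost):
    """Largest piece size whose local execution cost does not exceed
    the cost of handing it off (assumed linear cost model)."""
    return max(1, int(handoff_cost // per_item_cost))


def decompose(lo, hi, grain):
    """Split the range [lo, hi) recursively into parallel pieces.

    A piece of at most `grain` items is a frontier node: it is treated
    as a leaf and is meant to be executed sequentially.
    """
    if hi - lo <= grain:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return decompose(lo, mid, grain) + decompose(mid, hi, grain)


def run_leaf(data, lo, hi):
    """Sequential execution of one leaf; here it just sums a slice."""
    return sum(data[lo:hi])
```

For example, with `handoff_cost=100` and `per_item_cost=25` the grain is 4, so `decompose(0, 16, 4)` stops at four leaves of four items each instead of sixteen single-item tasks.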

Adaptive Load Balancing

• Compiler and Run-Time Support for Adaptive Load Balancing in Software Distributed Shared Memory Systems [1998]
  – Use information provided by the compiler to help the run-time system distribute the work of the parallel loops
    • according to the relative power of the processors
    • in a way that minimizes communication and page sharing
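
Distributing loop work according to relative processor power can be illustrated as below. This is a sketch only: `partition_by_power` is a hypothetical helper, the `powers` values are assumed to be given, and the communication and page-sharing side of the problem is not modeled.

```python
def partition_by_power(n_iters, powers):
    """Split iterations 0..n_iters-1 into contiguous blocks whose sizes
    are proportional to each processor's relative power.

    Returns one (start, end) pair per processor, with `end` exclusive.
    """
    total = sum(powers)
    bounds = []
    start = 0
    cumulative = 0.0
    for p in powers:
        cumulative += p
        # Rounding the cumulative share keeps the blocks gap-free.
        end = round(n_iters * cumulative / total)
        bounds.append((start, end))
        start = end
    return bounds
```

With `powers = [1, 1, 2]` and 100 iterations, the third processor, assumed to be twice as fast, receives the block `(50, 100)` while the other two receive 25 iterations each.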

Adaptive Load Balancing (cont.)

• Compile-Time Support for Load Balancing
  – The compiler is built on the SUIF system, which is organized as a set of compiler passes.
  – The SUIF pass extracts the shared data access patterns in each of the SPMD regions, and feeds this information to the run-time system.
    • It is also responsible for adding hooks in the parallelized code that allow the run-time library to change the load distribution.

--------
SUIF: Stanford University Intermediate Format
SPMD: Single-Program Multiple-Data

Adaptive Load Balancing (cont.)

  – Access pattern extraction
    • The SUIF pass walks through the program looking for accesses to shared memory.
  – Prefetching
    • Use the access pattern information to prefetch data through prefetching calls.
  – Load balancing interface and strategy
    • The compiler can direct the run-time to choose between two partitioning strategies for distributing the parallel loops:
      1. Shifting of loop boundaries
      2. Multiple loop bounds
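
The slide names the two strategies without defining them. The sketch below shows one plausible reading, stated here as an assumption: shifting keeps each process on a single contiguous block and moves the block boundaries, while multiple loop bounds lets a process keep its old block and pick up extra, possibly non-contiguous, ranges. Both function names are hypothetical, and `shares` (the target iteration count per process) must sum to the total number of iterations.

```python
def shift_boundaries(shares):
    """Contiguous blocks in process order; process i gets shares[i]
    iterations. Each process ends up with exactly one range."""
    ranges, start = [], 0
    for s in shares:
        ranges.append([(start, start + s)])
        start += s
    return ranges


def multiple_bounds(old_bounds, shares):
    """Each process keeps as much of its old block as its new share
    allows; leftover pieces go to processes that need more work, so a
    process may own several ranges."""
    ranges, spare = [], []
    for (lo, hi), s in zip(old_bounds, shares):
        keep = min(hi - lo, s)
        ranges.append([(lo, lo + keep)])
        if lo + keep < hi:
            spare.append((lo + keep, hi))  # iterations given away
    for i, s in enumerate(shares):
        need = s - sum(h - l for l, h in ranges[i])
        while need > 0:
            lo, hi = spare.pop(0)
            take = min(need, hi - lo)
            ranges[i].append((lo, lo + take))
            if lo + take < hi:
                spare.insert(0, (lo + take, hi))
            need -= take
    return ranges
```

In a toy case with old blocks `[(0, 4), (4, 8), (8, 12)]` and new shares `[2, 4, 6]`, shifting changes the owner of four iterations (2, 3, 6 and 7), whereas multiple bounds changes the owner of only two (2 and 3), at the price of giving the third process two separate ranges.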

Adaptive Load Balancing (cont.)

• Run-Time Load Balancing Support
  – The run-time library is responsible for keeping track of the progress of each process
    • collect statistics about the execution time of each parallel task, and
    • adjust the load accordingly
  – Load balancing vs. locality management
    • need to avoid unnecessary movement of data and minimize page sharing
    • Locality-conscious load balancing: the run-time library uses the information supplied by the compiler about what loop distribution strategy to use.
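
The "collect statistics, then adjust the load" step could look roughly like the following. It is a sketch under assumptions that do not come from the paper: relative power is estimated as iterations completed per second in the previous round, and the helper names `relative_power` and `rebalance` are hypothetical. Locality is ignored here.

```python
def relative_power(iterations_done, elapsed_seconds):
    """Estimate each process's share of total throughput from the
    iterations it completed and the time it took."""
    rates = [n / t for n, t in zip(iterations_done, elapsed_seconds)]
    total = sum(rates)
    return [r / total for r in rates]


def rebalance(n_iters, iterations_done, elapsed_seconds):
    """Return the number of iterations each process should run next,
    proportional to its measured relative power."""
    powers = relative_power(iterations_done, elapsed_seconds)
    shares = [int(n_iters * p) for p in powers]
    # Truncation can leave a few iterations unassigned; give them to
    # the fastest process.
    shares[powers.index(max(powers))] += n_iters - sum(shares)
    return shares
```

If two processes each ran 100 iterations but the second took four times as long, `rebalance(200, [100, 100], [1.0, 4.0])` assigns most of the next 200 iterations to the first process.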

Algorithms for Scheduling

• Scheduling Malleable Parallel Tasks: An Asymptotic Fully Polynomial-Time Approximation Scheme [2002]

• Mapping and Scheduling Heterogeneous Tasks using Genetic Algorithms [1995]