23
Speedup for Multi-Level Parallel Computing School of Computer Engineering Nanyang Technological University 21 st May 2012 Shanjiang Tang, Bu-Sung Lee, Bingsheng He

Speedup for Multi-Level Parallel Computing

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Speedup for Multi-Level Parallel Computing

Speedup for Multi-Level Parallel Computing

School of Computer Engineering

Nanyang Technological University

21st May 2012

Shanjiang Tang, Bu-Sung Lee, Bingsheng He

Page 2: Speedup for Multi-Level Parallel Computing

OutLine

•  Background & Motivation •  Multi-Level Parallel Speedup •  Evaluation •  Conclusion

Page 3: Speedup for Multi-Level Parallel Computing

Multi-Level Computing Architecture and Paradigm

Page 4: Speedup for Multi-Level Parallel Computing

Multi-Level Computing Architecture and Paradigm

•  MPI+OpenMP •  MPI+CUDA •  MPI+OpenMP+CUDA …..

Page 5: Speedup for Multi-Level Parallel Computing

Multi-Level Parallel Computing Model

L Lm L3 L2 L1

LLLLLLLL

Notes: Sequential Part Parallel Part

PE2,

2

PE1,

1 PE2,

1

PE3,

1

PE3,

2

PE3,

3

PE3,

4

PE3,

5

PE3,

6

PE3,

7

PE3,

8

Page 6: Speedup for Multi-Level Parallel Computing

Parallel Speedup

•  Definition •  Classification

Ø Absolute Speedup

Ø Relative Speedup

SequentialExecutionTimeSpeedupParallelExecutionTime

=

BestSequentialALGExecutionTimeSpeedupParallelALGExecutionTime

=

ParallelALGSequentialExecutionTimeSpeedupParallelALGExecutionTime

=

Page 7: Speedup for Multi-Level Parallel Computing

Relative Speedup Model

•  Fixed-size Speedup

Ø Amdahl’s Law

where is parallel fraction workload of the program, is the

number of processors.

•  Fixed-time Speedup

Ø Gustafson’s Law

pmeparallelTiTimesequentialSpeedup

αα +−

==1

1

ppmeparallelTiTimesequentialSpeedup αα

αααα

+−=+−

+−== 111

α p

Page 8: Speedup for Multi-Level Parallel Computing

Motivation Example—NAS Benchmark (MPI+OpenMP)

Page 9: Speedup for Multi-Level Parallel Computing

Motivation Example—NAS Benchmark (MPI+OpenMP)

Amdahl’s Law is UNSUITABLE for Multi-Level Parallel Computing

Page 10: Speedup for Multi-Level Parallel Computing

OutLine

•  Background & Motivation •  Multi-Level Parallel Speedup •  Evaluation •  Conclusion

Page 11: Speedup for Multi-Level Parallel Computing

E-Amdahl’s Law

•  Awareness of Different Grained-Level Parallelism

L Lm L3 L2 L1

LLLLLLLL

Notes:

Sequential Part

Parallel Part

PE2,

2

PE1,

1 PE2,

1

PE3,

1

PE3,

2

PE3,

3

PE3,

4

PE3,

5

PE3,

6

PE3,

7

PE3,

8

⎪⎪⎪⎪

⎪⎪⎪⎪

<≤

++−

=+−

=

)1(

)1()()()(1

1

)(

)()()(1

1

)(

mi

ispipifif

mi

mpmfmf

isp

Page 12: Speedup for Multi-Level Parallel Computing

E-Amdahl’s Law

•  Two-Level Parallelism Speedup Model (MPI+OpenMP)

where is the parallel fraction of coarse-grained (MPI-level) parallelism. is the parallel fraction of fine-grained (OpenMP-level) parallelism. is the number of processes spawned. is the number of threads spawned per process.

pt

pt

tpsp)1(

1

1),,,(β

βαα

βα+−

+−

=

αβ

Page 13: Speedup for Multi-Level Parallel Computing

E-Gustafson’s Law

•  Awareness of Different Grained-Level Parallelism

L Lm L3 L2 L1

LLLLLLLL

Notes:

Sequential Part

Parallel Part

PE2,

2

PE1,

1 PE2,

1

PE3,

1

PE3,

2

PE3,

3

PE3,

4

PE3,

5

PE3,

6

PE3,

7

PE3,

8

⎩⎨⎧

<≤++−

=+−=

)1()1()()()(1)()()()(1

)(miispipififmimpmfmf

isp

Page 14: Speedup for Multi-Level Parallel Computing

OutLine

•  Background & Motivation •  Multi-Level Parallel Speedup •  Evaluation •  Conclusion

Page 15: Speedup for Multi-Level Parallel Computing

Experiment Setup

•  Platform and Configuration

Ø A linux cluster consisting of eight computing nodes each with two quad-core chips Ø Configuration: One thread per CPU core

•  Benchmarks

NAS Parallel Benchmark (NPB) Multi-Zone (MZ) Version: Ø BT-MZ (Unbalanced Workload Partitioning) Ø SP-MZ (balanced Workload Partitioning) Ø  LU-MZ (balanced Workload Partitioning)

Page 16: Speedup for Multi-Level Parallel Computing

Performance Prediction

Page 17: Speedup for Multi-Level Parallel Computing

Prediction Result Comparison

Page 18: Speedup for Multi-Level Parallel Computing

OutLine

•  Background & Motivation •  Multi-Level Parallel Speedup •  Evaluation •  Conclusion

Page 19: Speedup for Multi-Level Parallel Computing

Conclusion

•  Traditional speedup models are unsuitable for multi-level parallelism

–  Unable to be awareness of different granularities of parallelism for multi-level parallel computing.

•  Multi-level Parallelism Model

–  A guidance model for multi-level optimization. –  A prediction model for multi-level parallelism.

Page 20: Speedup for Multi-Level Parallel Computing
Page 21: Speedup for Multi-Level Parallel Computing

Argument Estimation

Page 22: Speedup for Multi-Level Parallel Computing

Speedup Under E-Amdahl’s Law

Page 23: Speedup for Multi-Level Parallel Computing

Speedup Under E-Gustafson’s Law