Speedup for Multi-Level Parallel Computing School of Computer Engineering Nanyang Technological University 21 st May 2012 Shanjiang Tang, Bu-Sung Lee, Bingsheng He

Page 1

Speedup for Multi-Level Parallel Computing

School of Computer Engineering

Nanyang Technological University

21st May 2012

Shanjiang Tang, Bu-Sung Lee, Bingsheng He

Page 2

Outline

• Background & Motivation
• Multi-Level Parallel Speedup
• Evaluation
• Conclusion

Page 3

Multi-Level Computing Architecture and Paradigm

Page 4

Multi-Level Computing Architecture and Paradigm

• MPI+OpenMP
• MPI+CUDA
• MPI+OpenMP+CUDA
• ...

Page 5

Multi-Level Parallel Computing Model

[Figure: processing elements PE1,1; PE2,1 and PE2,2; PE3,1 to PE3,8, arranged across levels L1, L2, L3, ..., Lm. Legend: Sequential Part, Parallel Part.]

Page 6

Parallel Speedup

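• Definition

$$\text{Speedup} = \frac{\text{Sequential execution time}}{\text{Parallel execution time}}$$

• Classification

Absolute Speedup

$$\text{Speedup} = \frac{\text{Execution time of the best sequential algorithm}}{\text{Execution time of the parallel algorithm}}$$

Relative Speedup

$$\text{Speedup} = \frac{\text{Sequential execution time of the parallel algorithm}}{\text{Execution time of the parallel algorithm}}$$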
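The two classes of speedup differ only in the numerator. The following minimal sketch makes that concrete; the timing values are hypothetical placeholders chosen for illustration, not measurements from the talk.

```python
def speedup(sequential_time: float, parallel_time: float) -> float:
    """Ratio of a sequential execution time to a parallel execution time."""
    if sequential_time <= 0 or parallel_time <= 0:
        raise ValueError("execution times must be positive")
    return sequential_time / parallel_time


# Hypothetical timings in seconds (illustrative only, not measured data).
best_sequential = 100.0     # best sequential algorithm
parallel_on_one_pe = 120.0  # parallel algorithm run on one processing element
parallel_on_many = 20.0     # parallel algorithm run on many processing elements

# Absolute speedup compares against the best sequential algorithm.
absolute = speedup(best_sequential, parallel_on_many)
# Relative speedup compares the parallel algorithm against itself on one PE.
relative = speedup(parallel_on_one_pe, parallel_on_many)

print(f"absolute speedup = {absolute:.2f}")
print(f"relative speedup = {relative:.2f}")
```

With these placeholder numbers the relative speedup is larger than the absolute one, because the parallel algorithm is slower than the best sequential algorithm when run on a single processing element.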

Page 7

Relative Speedup Model

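• Fixed-size Speedup

Amdahl’s Law

$$\text{Speedup} = \frac{\text{sequentialTime}}{\text{parallelTime}} = \frac{1}{(1-\alpha) + \frac{\alpha}{p}}$$

• Fixed-time Speedup

Gustafson’s Law

$$\text{Speedup} = \frac{\text{sequentialTime}}{\text{parallelTime}} = \frac{(1-\alpha) + \alpha p}{(1-\alpha) + \frac{\alpha p}{p}} = (1-\alpha) + \alpha p$$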
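Both laws can be written as one-line functions. This is a small sketch that assumes, as the formulas imply, that α is the parallel fraction of the workload and p is the number of processing elements; the sample values α = 0.9 and p = 8 are arbitrary.

```python
def amdahl_speedup(alpha: float, p: int) -> float:
    """Fixed-size speedup: 1 / ((1 - alpha) + alpha / p)."""
    if not 0.0 <= alpha <= 1.0 or p < 1:
        raise ValueError("need 0 <= alpha <= 1 and p >= 1")
    return 1.0 / ((1.0 - alpha) + alpha / p)


def gustafson_speedup(alpha: float, p: int) -> float:
    """Fixed-time speedup: (1 - alpha) + alpha * p."""
    if not 0.0 <= alpha <= 1.0 or p < 1:
        raise ValueError("need 0 <= alpha <= 1 and p >= 1")
    return (1.0 - alpha) + alpha * p


# Arbitrary illustrative values, not taken from the talk.
alpha, p = 0.9, 8
print(f"Amdahl:    {amdahl_speedup(alpha, p):.3f}")
print(f"Gustafson: {gustafson_speedup(alpha, p):.3f}")
```

Under Amdahl's Law the speedup is bounded by 1 / (1 - α) however large p becomes, whereas under Gustafson's Law it grows linearly in p because the problem size scales with the machine.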

Page 8

Motivation Example—NAS Benchmark (MPI+OpenMP)

Page 9

Motivation Example—NAS Benchmark (MPI+OpenMP)

Amdahl’s Law is UNSUITABLE for Multi-Level Parallel Computing

Page 10

Outline

• Background & Motivation
• Multi-Level Parallel Speedup
• Evaluation
• Conclusion

Page 11

E-Amdahl’s Law

• Awareness of Different Grained-Level Parallelism

$$sp(i) = \begin{cases} \dfrac{1}{(1 - f(m)) + \dfrac{f(m)}{p(m)}}, & i = m \\[2ex] \dfrac{1}{(1 - f(i)) + \dfrac{f(i)}{p(i)\, sp(i+1)}}, & 1 \le i < m \end{cases}$$
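The recursion can be evaluated bottom-up, starting at the finest level m and working back to level 1. The sketch below assumes f(i) is the parallel fraction at level i and p(i) is the number of processing elements spawned at level i; the function name and the list-based interface are illustrative choices, not part of the talk.

```python
from typing import Sequence


def e_amdahl_speedup(f: Sequence[float], p: Sequence[int]) -> float:
    """Evaluate sp(1) for the multi-level fixed-size recursion.

    f[k] and p[k] describe level k + 1, so index 0 is the coarsest level
    and the last index is the finest level m.
    """
    if len(f) != len(p) or len(f) == 0:
        raise ValueError("f and p must be non-empty and of equal length")
    # Starting with sp = 1 makes the first iteration reproduce the base
    # case i = m: 1 / ((1 - f(m)) + f(m) / p(m)).
    sp = 1.0
    for fi, pi in zip(reversed(f), reversed(p)):
        sp = 1.0 / ((1.0 - fi) + fi / (pi * sp))
    return sp


# Arbitrary illustrative values: two levels, 8 processing elements each.
print(f"one level:  {e_amdahl_speedup([0.9], [8]):.3f}")
print(f"two levels: {e_amdahl_speedup([0.9, 0.8], [8, 8]):.3f}")
```

With a single level the function reduces to the classic Amdahl formula, which is a quick sanity check on the recursion.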

[Figure: processing elements PE1,1; PE2,1 and PE2,2; PE3,1 to PE3,8, arranged across levels L1, L2, L3, ..., Lm. Legend: Sequential Part, Parallel Part.]

Page 12

E-Amdahl’s Law

• Two-Level Parallelism Speedup Model (MPI+OpenMP)

$$sp(\alpha, \beta, p, t) = \frac{1}{(1-\alpha) + \dfrac{\alpha \left( (1-\beta) + \dfrac{\beta}{t} \right)}{p}}$$

where $\alpha$ is the parallel fraction of coarse-grained (MPI-level) parallelism, $\beta$ is the parallel fraction of fine-grained (OpenMP-level) parallelism, $p$ is the number of processes spawned, and $t$ is the number of threads spawned per process.
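The two-level model is the multi-level recursion with m = 2, f(1) = α, f(2) = β, p(1) = p and p(2) = t. The sketch below compares it with the single-level Amdahl formula applied to the same total of p × t cores. The values α = 0.98 and β = 0.80, and the list of process/thread combinations, are hypothetical and chosen only to illustrate the effect.

```python
def flat_amdahl(alpha: float, n: int) -> float:
    """Single-level Amdahl speedup on n processing elements."""
    return 1.0 / ((1.0 - alpha) + alpha / n)


def two_level_e_amdahl(alpha: float, beta: float, p: int, t: int) -> float:
    """Two-level speedup for p processes with t threads each."""
    inner = (1.0 - beta) + beta / t  # per-process time with t threads
    return 1.0 / ((1.0 - alpha) + alpha * inner / p)


# Hypothetical parallel fractions (not estimated from any benchmark).
ALPHA, BETA = 0.98, 0.80
# Different ways to use 64 cores in total: (processes, threads per process).
CONFIGS = [(64, 1), (32, 2), (16, 4), (8, 8)]

rows = [
    (p, t, flat_amdahl(ALPHA, p * t), two_level_e_amdahl(ALPHA, BETA, p, t))
    for p, t in CONFIGS
]
for p, t, flat, two_level in rows:
    print(f"p={p:2d} t={t} flat Amdahl={flat:6.2f} two-level={two_level:6.2f}")
```

The single-level formula returns the same value for every combination because it only sees the product p × t, while the two-level model distinguishes between them. Under these assumed fractions it predicts lower speedup as more of the 64 cores are used as threads rather than processes.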

Page 13

E-Gustafson’s Law

• Awareness of Different Grained-Level Parallelism

$$sp(i) = \begin{cases} (1 - f(m)) + f(m)\, p(m), & i = m \\ (1 - f(i)) + f(i)\, p(i)\, sp(i+1), & 1 \le i < m \end{cases}$$
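The fixed-time recursion can be evaluated bottom-up in the same way as its fixed-size counterpart. As before, this sketch assumes f(i) is the parallel fraction and p(i) the number of processing elements at level i; the function name and interface are illustrative.

```python
from typing import Sequence


def e_gustafson_speedup(f: Sequence[float], p: Sequence[int]) -> float:
    """Evaluate sp(1) for the multi-level fixed-time recursion.

    f[k] and p[k] describe level k + 1, so index 0 is the coarsest level
    and the last index is the finest level m.
    """
    if len(f) != len(p) or len(f) == 0:
        raise ValueError("f and p must be non-empty and of equal length")
    # Starting with sp = 1 makes the first iteration reproduce the base
    # case i = m: (1 - f(m)) + f(m) * p(m).
    sp = 1.0
    for fi, pi in zip(reversed(f), reversed(p)):
        sp = (1.0 - fi) + fi * pi * sp
    return sp


# Arbitrary illustrative values: two levels, 8 processing elements each.
print(f"one level:  {e_gustafson_speedup([0.9], [8]):.2f}")
print(f"two levels: {e_gustafson_speedup([0.9, 0.8], [8, 8]):.2f}")
```

For two levels the recursion unrolls to (1 - α) + α p ((1 - β) + β t), using the α, β, p, t notation of the two-level fixed-size model.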

[Figure: processing elements PE1,1; PE2,1 and PE2,2; PE3,1 to PE3,8, arranged across levels L1, L2, L3, ..., Lm. Legend: Sequential Part, Parallel Part.]

Page 14

Outline

• Background & Motivation
• Multi-Level Parallel Speedup
• Evaluation
• Conclusion

Page 15

Experiment Setup

• Platform and Configuration

A Linux cluster consisting of eight computing nodes, each with two quad-core chips

Configuration: One thread per CPU core

• Benchmarks

NAS Parallel Benchmark (NPB) Multi-Zone (MZ) Version:

– BT-MZ (unbalanced workload partitioning)
– SP-MZ (balanced workload partitioning)
– LU-MZ (balanced workload partitioning)

Page 16

Performance Prediction

Page 17

Prediction Result Comparison

Page 18

Outline

• Background & Motivation
• Multi-Level Parallel Speedup
• Evaluation
• Conclusion

Page 19

Conclusion

• Traditional speedup models are unsuitable for multi-level parallelism

– They are unaware of the different granularities of parallelism in multi-level parallel computing.

• Multi-level Parallelism Model

– A guidance model for multi-level optimization.
– A prediction model for multi-level parallelism.

Page 20
Page 21

Argument Estimation
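The estimation procedure used in the talk is not preserved in this transcript. As one possible approach (an assumption, not the authors' method), the two-level fixed-size model can be rearranged into a form that is linear in α and αβ: 1 - 1/sp = α (1 - 1/p) + αβ (1 - 1/t) / p. A least-squares fit over a few measured (p, t, speedup) samples then yields both fractions. The sample data below is synthetic, generated from known fractions to check that the fit recovers them.

```python
from typing import Iterable, Tuple

Sample = Tuple[int, int, float]  # (processes p, threads per process t, speedup)


def estimate_alpha_beta(samples: Iterable[Sample]) -> Tuple[float, float]:
    """Least-squares fit of y = a*x1 + b*x2 with a = alpha, b = alpha*beta."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for p, t, sp in samples:
        x1 = 1.0 - 1.0 / p
        x2 = (1.0 - 1.0 / t) / p
        y = 1.0 - 1.0 / sp
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        s1y += x1 * y
        s2y += x2 * y
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("samples do not determine alpha and beta uniquely")
    a = (s1y * s22 - s12 * s2y) / det  # alpha
    b = (s11 * s2y - s12 * s1y) / det  # alpha * beta
    if a == 0.0:
        raise ValueError("alpha is zero, so beta is undefined")
    return a, b / a


def two_level_speedup(alpha: float, beta: float, p: int, t: int) -> float:
    """Two-level fixed-size model, used here only to make synthetic samples."""
    return 1.0 / ((1.0 - alpha) + alpha * ((1.0 - beta) + beta / t) / p)


# Synthetic samples generated from known fractions (not measured data).
TRUE_ALPHA, TRUE_BETA = 0.95, 0.70
samples = [
    (p, t, two_level_speedup(TRUE_ALPHA, TRUE_BETA, p, t))
    for p, t in [(2, 1), (4, 2), (8, 4), (8, 8)]
]
alpha_hat, beta_hat = estimate_alpha_beta(samples)
print(f"alpha = {alpha_hat:.4f}, beta = {beta_hat:.4f}")
```

The fit needs at least two samples whose (p, t) settings are not redundant. A run with p = 1 and t = 1, for example, contributes nothing, because both regressors are zero.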

Page 22

Speedup Under E-Amdahl’s Law

Page 23

Speedup Under E-Gustafson’s Law