27
1 Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 7 Performance Analysis Performance Analysis Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Learning Objectives Predict performance of parallel programs Predict performance of parallel programs Predict performance of parallel programs Predict performance of parallel programs Understand barriers to higher performance Understand barriers to higher performance

Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

Embed Size (px)

Citation preview

Page 1: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

1

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Chapter 7

Performance AnalysisPerformance Analysis

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Learning Objectives

Predict performance of parallel programsPredict performance of parallel programs Predict performance of parallel programsPredict performance of parallel programs

Understand barriers to higher performanceUnderstand barriers to higher performance

Page 2: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

2

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Speedup Formula

iii l

timeexecution Parallel

timeexecution Sequential Speedup

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Execution Time Components

Inherently sequential computations:Inherently sequential computations: ((nn)) Inherently sequential computations: Inherently sequential computations: ((nn))

Potentially parallel computations: Potentially parallel computations: ((nn))

Parallel overhead: Parallel overhead: ((n,pn,p))

CommunicationCommunication

Redundant operationsRedundant operationsRedundant operationsRedundant operations

Page 3: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

3

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Speedup Expression

)()()(

nn ),(/)()(

)()(),(

pnpnn

nnpn

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Computation Component - (n)/p

Processors

Page 4: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

4

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Communication Component -(n,p)

Processors

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

(n)/p + (n,p)

Processors

Page 5: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

5

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Speedup Plot“elbowing out”

Processors

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Efficiencytimeexecution Sequential

Speedup

used Processors

Speedup Efficiency

timeexecution Parallel used Processors

timeexecution Sequential Efficiency

Processors

Speedup Speedup

timeexecution Parallel Processors Speedup

Page 6: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

6

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Efficiency 0 (n,p) 1

),()()(

)()(),(

pnpnnp

nnpn

All terms > 0 (n,p) > 0

Denominator > numerator (n,p) < 1

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Amdahl’s Law Provides an upper bound on the speedup Provides an upper bound on the speedup

achievable by applying a certain number of achievable by applying a certain number of processors to solve the problem in parallelprocessors to solve the problem in parallel

Can be used to determine the asymptotic Can be used to determine the asymptotic speedup achievable as the number of speedup achievable as the number of processors increasesprocessors increasesprocessors increasesprocessors increases

Page 7: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

7

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Amdahl’s LawThe speedup of a program using multiple processorsusing multiple processors in parallel computing is limited by the sequential fraction of the program. For example, if 95% of the program can be parallelized, the theoretical maximum speedup using parallel computing would be 20× as shown in the diagram, no matter how many processors are used.

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Amdahl’s Law

nn )()()(

pnn

nn

pnpnnpn

/)()(

)()(

),(/)()(

)()(),(

Let f = (n)/((n) + (n))f ( ) ( ( ) ( ))

pff /)1(

1

Page 8: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

8

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 1 95% of a program’s execution time occurs 95% of a program’s execution time occurs

inside a loop that can be executed in inside a loop that can be executed in parallel.parallel.

What What is the maximum speedup we should is the maximum speedup we should expect from a parallel version of the expect from a parallel version of the program executing on 8 CPUs?program executing on 8 CPUs?program executing on 8 CPUs?program executing on 8 CPUs?

9.58/)05.01(05.0

1

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 2

20% of a program’s execution time is spent20% of a program’s execution time is spent 20% of a program s execution time is spent 20% of a program s execution time is spent within inherently sequential code. What is within inherently sequential code. What is the limit to the speedup achievable by a the limit to the speedup achievable by a parallel version of the program?parallel version of the program?

115

2.0

1

/)2.01(2.0

1lim

pp

Page 9: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

9

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Pop Quiz

An oceanographer gives you a serialAn oceanographer gives you a serial An oceanographer gives you a serial An oceanographer gives you a serial program and asks you how much faster it program and asks you how much faster it might run on 8 processors. You can only might run on 8 processors. You can only find one function amenable to a parallel find one function amenable to a parallel solution. Benchmarking on a single solution. Benchmarking on a single processor reveals 80% of the execution time processor reveals 80% of the execution time ppis spent inside this function. What is the is spent inside this function. What is the best speedup a parallel version is likely to best speedup a parallel version is likely to achieve on 8 processors?achieve on 8 processors?

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Pop Quiz

A computer animation program generates aA computer animation program generates a A computer animation program generates a A computer animation program generates a feature movie framefeature movie frame--byby--frame. Each frame frame. Each frame can be generated independently and is can be generated independently and is output to its own file. If it takes 99 seconds output to its own file. If it takes 99 seconds to render a frame and 1 second to output it, to render a frame and 1 second to output it, h h d b hi d bh h d b hi d bhow much speedup can be achieved by how much speedup can be achieved by rendering the movie on 100 processors?rendering the movie on 100 processors?

Page 10: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

10

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Limitations of Amdahl’s Law

IgnoresIgnores ((nn pp)) Ignores Ignores ((nn,,pp))

Overestimates speedup achievableOverestimates speedup achievable

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Amdahl Effect

TypicallyTypically ((nn pp) has lower complexity than) has lower complexity than Typically Typically ((nn,,pp) has lower complexity than ) has lower complexity than ((nn)/)/pp

As As nn increases, increases, ((nn)/)/pp dominates dominates ((nn,,pp))

As As nn increases, speedup increases, speedup increasesincreases

For a fixed number of processors speedupFor a fixed number of processors speedup For a fixed number of processors, speedup For a fixed number of processors, speedup is usually an increasing function of the is usually an increasing function of the problem sizeproblem size

Page 11: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

11

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Illustration of Amdahl Effect

Speedup

n = 1,000

n = 10,000Speedup

n = 100

Processors

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Review of Amdahl’s Law

Treats problem size as a constantTreats problem size as a constant Treats problem size as a constantTreats problem size as a constant

Shows how execution time decreases as Shows how execution time decreases as number of processors increasesnumber of processors increases

Page 12: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

12

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Another Perspective

We often use faster computers to solveWe often use faster computers to solve We often use faster computers to solve We often use faster computers to solve larger problem instanceslarger problem instances

Let’s treat time as a constant and allow Let’s treat time as a constant and allow problem size to increase with number of problem size to increase with number of processorsprocessors

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Gustafson-Barsis’s Law

Given a parallel program solving a problem ofGiven a parallel program solving a problem ofGiven a parallel program solving a problem of Given a parallel program solving a problem of size n using p processors, let s denote the size n using p processors, let s denote the fraction of total execution time spent in fraction of total execution time spent in serial code. The maximum speedup serial code. The maximum speedup achievable by this program is achievable by this program is

spp )1(

Page 13: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

13

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Gustafson-Barsis’s Law

nn )()( pnn

nnpn

/)()(

)()(),(

Let s = (n)/((n)+(n)/p)

spp )1( pp )(

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Gustafson-Barsis’s Law

Begin with parallel execution timeBegin with parallel execution time Begin with parallel execution timeBegin with parallel execution time

Estimate sequential execution time to solve Estimate sequential execution time to solve same problemsame problem

Problem size is an increasing function of Problem size is an increasing function of pp

PredictsPredicts scaled speedupscaled speedup Predicts Predicts scaled speedupscaled speedup

Page 14: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

14

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 1

An application running on 10 processorsAn application running on 10 processors An application running on 10 processors An application running on 10 processors spends 3% of its time in serial code. What is spends 3% of its time in serial code. What is the scaled speedup of the application?the scaled speedup of the application?

73.927.010)03.0)(101(10

Execution on 1 CPU takes 10 times as long…

…except 9 do not have to execute serial code

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 2

What is the maximum fraction of aWhat is the maximum fraction of a What is the maximum fraction of a What is the maximum fraction of a program’s parallel execution time that can program’s parallel execution time that can be spent in serial code if it is to achieve a be spent in serial code if it is to achieve a scaled speedup of 7 on 8 processors?scaled speedup of 7 on 8 processors?

140)81(87 ss 14.0)81(87 ss

Page 15: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

15

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Pop Quiz

A parallel program executing on 32A parallel program executing on 32 A parallel program executing on 32 A parallel program executing on 32 processors spends 5% of its time in processors spends 5% of its time in sequential code. What is the scaled speedup sequential code. What is the scaled speedup of this program?of this program?

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

The Karp-Flatt Metric

Amdahl’s Law and GustafsonAmdahl’s Law and Gustafson--Barsis’ LawBarsis’ Law Amdahl s Law and GustafsonAmdahl s Law and Gustafson--Barsis Law Barsis Law ignore ignore ((nn,,pp))

They can overestimate speedup or scaled They can overestimate speedup or scaled speedupspeedup

Karp and Flatt proposed another metricKarp and Flatt proposed another metricp p pp p p

Page 16: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

16

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Experimentally Determined Serial Fraction

I h l i l

)()(

),()(

nn

pnne

Inherently serial componentof parallel computation +processor communication andsynchronization overhead

Single processor execution time

p

pe

/11

/1/1

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Experimentally Determined Serial Fraction

Takes into account parallel overheadTakes into account parallel overhead

Detects other sources of overhead or Detects other sources of overhead or inefficiency ignored in speedup modelinefficiency ignored in speedup model

Process startup timeProcess startup timepp

Process synchronization timeProcess synchronization time

Imbalanced workloadImbalanced workload

Architectural overheadArchitectural overhead

Page 17: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

17

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 1

2 3 4 5 6 7 8p 2 3 4 5 6 7

1.8 2.5 3.1 3.6 4.0 4.4

8

4.7

What is the primary reason for speedup of only 4.7 on 8 CPUs?

e 0.1 0.1 0.1 0.1 0.1 0.1 0.1

Since e is constant, large serial fraction is the primary reason.

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 2

2 3 4 5 6 7 8p 2 3 4 5 6 7

1.9 2.6 3.2 3.7 4.1 4.5

8

4.7

What is the primary reason for speedup of only 4.7 on 8 CPUs?

e 0.070 0.075 0.080 0.085 0.090 0.095 0.100

Since e is steadily increasing, overhead is the primary reason.

Page 18: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

18

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Pop Quiz

4 8 12

Is this program likely to achieve a speedup of 10 Is this program likely to achieve a speedup of 10 on 12 processors?on 12 processors?

p 4

3.9

8

6.5

12

?

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Isoefficiency Metric

Parallel system: parallel program executingParallel system: parallel program executing Parallel system: parallel program executing Parallel system: parallel program executing on a parallel computeron a parallel computer

Scalability of a parallel system: measure of Scalability of a parallel system: measure of its ability to increase performance as its ability to increase performance as number of processors increasesnumber of processors increases

A scalable system maintains efficiency asA scalable system maintains efficiency as A scalable system maintains efficiency as A scalable system maintains efficiency as processors are addedprocessors are added

Isoefficiency: way to measure scalabilityIsoefficiency: way to measure scalability

Page 19: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

19

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Isoefficiency Derivation Steps

Begin with speedup formulaBegin with speedup formula Begin with speedup formulaBegin with speedup formula

Compute total amount of overheadCompute total amount of overhead

Assume efficiency remains constantAssume efficiency remains constant

Determine relation between sequential Determine relation between sequential execution time and overheadexecution time and overheadexecution time and overheadexecution time and overhead

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Deriving Isoefficiency RelationDetermine overhead

),()()1(),( pnpnppnTo Substitute overhead into speedup equation

),()()())()((

0),( pnTnn

nnppn

Substitute T(n,1) = (n) + (n). Assume efficiency is constant.

),()1,( 0 pnCTnT Isoefficiency Relation

Page 20: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

20

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Isoefficiency Relation

In order to maintain the same level ofIn order to maintain the same level ofIn order to maintain the same level of In order to maintain the same level of efficiency as the number of processors efficiency as the number of processors increases, increases, nn must be increased so that the must be increased so that the following inequality is satisfied where following inequality is satisfied where TT((nn,1) ,1) and and CTCT00((nn,,pp) are as defined in the book) are as defined in the book

),()1,( 0 pnCTnT

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Scalability Function

SupposeSuppose isoefficiencyisoefficiency relation isrelation is nn f(p)f(p) Suppose Suppose isoefficiencyisoefficiency relation is relation is nn f(p)f(p)

Let Let M(n)M(n) denote memory required for denote memory required for problem of size problem of size nn

M(f(p))/p M(f(p))/p shows how memory usage shows how memory usage per per processorprocessor must increase to maintain same must increase to maintain same ppefficiencyefficiency

We call We call M(f(p))/p M(f(p))/p the the scalability functionscalability function

Page 21: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

21

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Meaning of Scalability Function

To maintain efficiency when increasingTo maintain efficiency when increasing pp wewe To maintain efficiency when increasing To maintain efficiency when increasing pp, we , we must increase must increase nn

Maximum problem size limited by available Maximum problem size limited by available memory, which is linear in memory, which is linear in pp

Scalability function shows how memory usage per Scalability function shows how memory usage per processor must grow to maintain efficiencyprocessor must grow to maintain efficiencyprocessor must grow to maintain efficiencyprocessor must grow to maintain efficiency

Scalability function a constant means parallel Scalability function a constant means parallel system is perfectly scalablesystem is perfectly scalable

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Interpreting Scalability Function

sor Cplogp

need

ed p

er p

roce

ss

Cp

Memory Size

Can maintainffi i

Cannot maintainefficiency

Number of processors

Mem

ory

Clogp

C

efficiency

Page 22: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

22

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 1: Reduction

Sequential algorithm complexitySequential algorithm complexity Sequential algorithm complexitySequential algorithm complexityTT((nn,1) = ,1) = ((nn))

Parallel algorithmParallel algorithm

Computational complexity = Computational complexity = ((n/pn/p))

Communication complexity =Communication complexity = (log(log pp))Communication complexity Communication complexity (log(log pp))

Parallel overheadParallel overheadTT00((nn,,pp) = ) = ((pp log log pp))

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Reduction (continued)

Isoefficiency relation:Isoefficiency relation: nn C pC p loglog pp Isoefficiency relation: Isoefficiency relation: nn C pC p log log pp We ask: To maintain same level of We ask: To maintain same level of

efficiency, how must efficiency, how must nn increase when increase when ppincreases?increases?

M(n)M(n) = = nn

l/l/)l(

The system has good scalabilityThe system has good scalability

pCppCpppCpM log/log/)log(

Page 23: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

23

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 2: Floyd’s Algorithm

Sequential time complexity:Sequential time complexity: ((nn33)) Sequential time complexity: Sequential time complexity: ((nn ))

Parallel computation time: Parallel computation time: ((nn33//pp))

Parallel communication time: Parallel communication time: ((nn22log log pp))

Parallel overhead: Parallel overhead: TT00((nn,,pp) = ) = ((pnpn22log log pp))

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Floyd’s Algorithm (continued)

Isoefficiency relationIsoefficiency relation Isoefficiency relationIsoefficiency relationnn33 CC((p np n33 log log pp) ) nn C pC p log log pp

M(n) = nM(n) = n22

ppCpppCppCpM 22222 log/log/)log(

The parallel system has poor scalabilityThe parallel system has poor scalability

Page 24: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

24

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example 3: Finite Difference

Sequential time complexity per iteration:Sequential time complexity per iteration: Sequential time complexity per iteration: Sequential time complexity per iteration: ((nn22))

Parallel communication complexity per Parallel communication complexity per iteration: iteration: ((nn//p)p)

Parallel overhead: Parallel overhead: ((nn p)p)(( p)p)

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Finite Difference (continued)

Isoefficiency relationIsoefficiency relation Isoefficiency relationIsoefficiency relationnn22 CnCnp p nn CC pp

M(n)M(n) = = nn22

22 //)( CppCppCM This algorithm is perfectly scalableThis algorithm is perfectly scalable

Page 25: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

25

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Summary

Performance termsPerformance terms Performance termsPerformance terms

SpeedupSpeedup

EfficiencyEfficiency

Model of speedupModel of speedup

Serial componentSerial componentSerial componentSerial component

Parallel componentParallel component

Communication componentCommunication component

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Summary

What prevents linear speedup?What prevents linear speedup? What prevents linear speedup?What prevents linear speedup?

Serial operationsSerial operations

Communication operationsCommunication operations

Process startProcess start--upup

Imbalanced workloadsImbalanced workloads Imbalanced workloadsImbalanced workloads

Architectural limitationsArchitectural limitations

Page 26: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

26

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Summary

Analyzing parallel performanceAnalyzing parallel performance Analyzing parallel performanceAnalyzing parallel performance

Amdahl’s LawAmdahl’s Law

GustafsonGustafson--Barsis’ LawBarsis’ Law

KarpKarp--Flatt metricFlatt metric

Isoefficiency metricIsoefficiency metric Isoefficiency metricIsoefficiency metric

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Amdahl’s Law

Based on the assumption that we are trying toBased on the assumption that we are trying to Based on the assumption that we are trying to Based on the assumption that we are trying to solve a problem of fixed size as quickly as solve a problem of fixed size as quickly as possiblepossible Provides an upper bound on the speedup achievable by Provides an upper bound on the speedup achievable by

applying a certain number of processors to solve the applying a certain number of processors to solve the problem in parallelproblem in parallel

Can be used to determine the asymptotic speedup Can be used to determine the asymptotic speedup achievable as the number of processors increasesachievable as the number of processors increases

Page 27: Copyright © The McGraw-Hill Companies, Inc. Permission ...digital.cs.usu.edu/~allan/PP/Slides/Chapter07.pdf · ... Inc. Permission required for reproduction or ... Hill Companies,

27

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Gustafson-Barsis’s Law

Time is treated as a constant and we let theTime is treated as a constant and we let the Time is treated as a constant and we let the Time is treated as a constant and we let the problem size increase with the number of problem size increase with the number of processorsprocessors

It begins with a parallel computation and It begins with a parallel computation and estimates how much faster the parallel estimates how much faster the parallel computation is than the same computation computation is than the same computation executing on a single processorexecuting on a single processor

Called scaled speedupCalled scaled speedup

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Karp-Flatt Metric Takes into account parallel overheadTakes into account parallel overhead

Helps detect other sources of overhead orHelps detect other sources of overhead or Helps detect other sources of overhead or Helps detect other sources of overhead or inefficiency inefficiency

For a fixed problem size, the efficiency of a For a fixed problem size, the efficiency of a parallel computation typically decreases as the parallel computation typically decreases as the number of processors increasenumber of processors increase By using the experimentally determined serial fraction,By using the experimentally determined serial fraction, By using the experimentally determined serial fraction, By using the experimentally determined serial fraction,

can determine whether is efficiency decrease is due to:can determine whether is efficiency decrease is due to: Limited opportunities for parallelismLimited opportunities for parallelism

Increases in algorithmic or architectural overheadIncreases in algorithmic or architectural overhead