
Multi-threading model · 2016. 10. 3. · Multi-threading model Many languages (e.g. Java) support the production of separately runnable processes called threads. Each thread looks


Multi-threading model

High-level model of thread processes using spawn and sync. Does not consider the underlying hardware.

Algorithm Algorithm-A

begin
    · · ·
    spawn Algorithm-B    (do Algorithm-B in parallel with this code)
    · · · other stuff · · ·
    sync                 (wait here for all previously spawned parallel computations to complete)
    · · ·
end


Multi-threading model

Many languages (e.g. Java) support the production of separately runnable processes called threads. Each thread looks like it is running on its own, and the operating system shares time and processors between the threads. In the multi-threading model, the exact parallel implementation is left to the operating system.


Multi-threading Model

We look at some examples.

• Fibonacci

• Complexity measures

See CLRS: Cormen, Leiserson, Rivest and Stein, Introduction to Algorithms (3rd edition), Chapter 27, Multithreaded Algorithms: https://mitpress.mit.edu/sites/default/files/titles/sample/0262533057chap27.pdf


Multi-threading Fibonacci

Reminder: Recursive Fibonacci

Algorithm FIB(n)

if n ≤ 1 return n
else
    return FIB(n − 1) + FIB(n − 2)

Parallel version

Algorithm Par-FIB(n)

if n ≤ 1 return n
else
    x = spawn Par-FIB(n − 2)
    y = Par-FIB(n − 1)
    sync
    return x + y
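As a rough illustration of how spawn and sync might map onto a real thread API, here is a minimal Python sketch of Par-FIB. It assumes that spawn corresponds to starting a `threading.Thread` and sync to calling `join` on it; the names `par_fib`, `child` and `result` are hypothetical and not part of the slides. It is only meant to show the control flow, so it should be run with small n, since every spawn creates an operating-system thread.

```python
import threading


def par_fib(n):
    """Par-FIB: spawn the (n - 2) branch, compute the (n - 1) branch, then sync."""
    if n <= 1:
        return n
    result = {}  # holder for the spawned branch's return value

    def child():
        result["x"] = par_fib(n - 2)

    t = threading.Thread(target=child)  # spawn
    t.start()
    y = par_fib(n - 1)                  # runs alongside the spawned branch
    t.join()                            # sync: wait for the spawned branch
    return result["x"] + y


print(par_fib(10))
```

In standard CPython builds the global interpreter lock typically prevents any real speed up for CPU-bound code like this, so the sketch shows the structure of the algorithm rather than its performance.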


Recursive Fibonacci. The recursion tree for FIB1(6)

Figure: From CLRS, Introduction to Algorithms, Chapter 27 (downloaded)


Multi-threading Fibonacci


Complexity measures for multi-threading

DAG: directed acyclic graph. Vertices are the circles for spawn, sync or procedure call. For a problem of size n:

• Span S or $T_\infty(n)$. Number of vertices on the longest directed path from start to finish in the computation DAG (the critical path). The run time if each vertex of the DAG has its own processor.

• Work W or $T_1(n)$. Total time to execute the entire computation on one processor. Defined as the number of vertices in the computation DAG.

• $T_p(n)$. Total time to execute the entire computation with p processors.

• Speed up = $T_1/T_p$. How much faster it is.

• Parallelism = $T_1/T_\infty$. The maximum possible speed up.
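The definitions above can be tried out on a small graph. The following Python sketch assumes the computation DAG is given as a dictionary mapping each vertex to its successors; that representation, the function name `work_and_span` and the example graph are all hypothetical choices made here for illustration.

```python
from functools import lru_cache


def work_and_span(dag):
    """Return (work, span) for a DAG given as {vertex: [successor, ...]}.

    Work is the number of vertices; span is the number of vertices on the
    longest directed path.
    """
    vertices = set(dag)
    for succs in dag.values():
        vertices.update(succs)

    @lru_cache(maxsize=None)
    def longest_from(v):
        # Number of vertices on the longest path starting at v.
        return 1 + max((longest_from(w) for w in dag.get(v, [])), default=0)

    work = len(vertices)
    span = max(longest_from(v) for v in vertices)
    return work, span


# Hypothetical DAG: a spawn vertex, two parallel strands, then a sync vertex.
example = {
    "spawn": ["a1", "b1"],
    "a1": ["a2"],
    "a2": ["sync"],
    "b1": ["sync"],
    "sync": [],
}
work, span = work_and_span(example)
print(work, span, work / span)  # work, span, parallelism
```

Here the critical path is spawn, a1, a2, sync, so the shorter strand through b1 does not affect the span.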


Example 1: Fibonacci

Let's look at the answer first. (For details see pages 776–784 of CLRS.)

• $T_1(n) = \Theta(\varphi^n)$ where $\varphi \approx 1.62$ (see page 776)

• $T_\infty(n) = \Theta(n)$. The critical path of Par-FIB(n) has length proportional to n.

• Parallelism = $T_1/T_\infty = \Theta(\varphi^n/n)$. Let $t = T_1(n) \propto (1.62)^n$ be the (sequential) time. Then $\log t \propto n \log 1.62 = \Theta(n)$.

• Parallelism = $T_1/T_\infty = \Theta(t/\log t)$. Almost linear speed up relative to our chosen sequential algorithm.
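These growth rates can be checked numerically. The sketch below tabulates work and span for Par-FIB from their recurrences, under the assumption that every call costs exactly one unit (the slides only say Θ(1)); the function name `fib_work_and_span` is hypothetical.

```python
def fib_work_and_span(n_max):
    """Tabulate unit-cost work and span for Par-FIB(0..n_max)."""
    work = [1, 1]  # base cases cost one unit each (an assumption)
    span = [1, 1]
    for n in range(2, n_max + 1):
        work.append(work[n - 1] + work[n - 2] + 1)      # both branches plus this call
        span.append(max(span[n - 1], span[n - 2]) + 1)  # only the longer branch counts
    return work, span


work, span = fib_work_and_span(30)
for n in (10, 20, 30):
    print(n, work[n], span[n], work[n] / span[n])  # n, work, span, parallelism
```

The ratio of successive work values settles near 1.62 while the span grows by one per step, which matches the exponential work and linear span stated above.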


Span and work


Back to Fib. FIB1(n) is exponential

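• For Par-Fib(n) we have $T_\infty(n) = \max(T_\infty(n-1), T_\infty(n-2)) + \Theta(1) = \Theta(n)$

• The value $(1.62)^n$ is tricky. We can show $T_1(n) = \Omega(\sqrt{2}^{\,n})$. Since $T_1$ is non-decreasing, $T_1(n-1) \ge T_1(n-2)$, so

$T_1(n) = T_1(n-1) + T_1(n-2) + \Theta(1)$

$\ge 2\,T_1(n-2)$

$\ge 2^2\,T_1(n-4) \ge 2^3\,T_1(n-6) \ge \cdots \ge 2^{n/2}\,T_1(0)$

$= (\sqrt{2})^n\,\Theta(1)$

where $\sqrt{2} \approx 1.41$, so $T_1(n) \ge (1.4)^n$ up to a constant factor, an exponential run time.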
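The Θ(n) solution of the span recurrence for Par-Fib can be checked by unrolling it. This is a short sketch that writes c for the constant hidden in the Θ(1) term (c is a label introduced here, not in the slides) and assumes $T_\infty$ is non-decreasing, so the maximum of the two branches is the n − 1 branch:

```latex
\begin{align*}
T_\infty(n) &= \max\bigl(T_\infty(n-1),\, T_\infty(n-2)\bigr) + c \\
            &= T_\infty(n-1) + c \\
            &= T_\infty(n-2) + 2c \\
            &\;\;\vdots \\
            &= T_\infty(1) + (n-1)\,c \\
            &= \Theta(n)
\end{align*}
```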


Example 2. Add up numbers

S(n) = 1 + · · · + 1, i.e. S(n) = n

Algorithm SUM1(n)

if n = 0 return 0
SUM1 = 0
for i = 1, ..., n do SUM1 = SUM1 + 1
return SUM1

How to make SUM 'look parallel'? Recursive version!

Algorithm SUM(n)

if n = 1 return 1
else
    return SUM(n/2) + SUM(n/2)

Add up the first half and then the second half. Not very practical? But it has a good parallel counterpart.


Example 2. Add up numbers 1 + · · ·+ 1

Sequential recursion. Assume n is a power of 2, i.e. $n = 2^m$.

Algorithm SUM(n)

if n = 1 return 1
else
    return SUM(n/2) + SUM(n/2)

Parallel version

Algorithm Par-SUM(n)

if n = 1 return 1
else
    x = spawn Par-SUM(n/2)
    y = Par-SUM(n/2)
    sync
    return x + y


Example 2. Complexity comparison

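• $T_1(n) = \Theta(n)$

• $T_\infty(n) = \Theta(m) = \Theta(\log_2 n)$. Why? Ans: If $n = 2^m$ then $m = \log_2 n$, and

$T_\infty(n) = \Theta(1) + \max(T_\infty(n/2), T_\infty(n/2))$

$= \Theta(1) + T_\infty(n/2)$

$= m\,\Theta(1) + T_\infty(n/2^m)$

$= m\,\Theta(1) + T_\infty(1) = \Theta(m)$

• Parallelism = $T_1(n)/T_\infty(n) = \Theta(n/\log_2 n)$: almost linear speed up compared to our initial algorithm.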
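The work and span of Par-SUM can also be measured directly by instrumenting the recursion. The sketch below is a sequential simulation, not a threaded program: it assumes each call costs one unit and combines the costs of the two branches the way the definitions of work and span prescribe. The function name `par_sum_cost` is hypothetical.

```python
def par_sum_cost(n):
    """Return (value, work, span) for Par-SUM(n), counting one unit per call.

    Runs sequentially; the spawned and the continuing branch are evaluated
    one after the other and their costs are then combined.
    """
    if n == 1:
        return 1, 1, 1
    x, work_x, span_x = par_sum_cost(n // 2)  # the spawned branch
    y, work_y, span_y = par_sum_cost(n // 2)  # the continuing branch
    work = work_x + work_y + 1                # every call is executed
    span = max(span_x, span_y) + 1            # the branches overlap in time
    return x + y, work, span


for m in (4, 8, 12):
    n = 2 ** m
    value, work, span = par_sum_cost(n)
    print(n, value, work, span, work / span)  # n, S(n), work, span, parallelism
```

With unit costs the work comes out as 2n − 1 and the span as m + 1, consistent with Θ(n) work and Θ(log n) span.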


Example 3: Add up squares

$S(n) = 1^2 + 2^2 + \cdots + n^2 = n^2 + (n-1)^2 + \cdots + 1^2$

Algorithm SQUARE(n)

if n = 1 return 1
x = SQUARE(n − 1)
y = n ∗ n
return x + y

Simple parallel version.

Algorithm Par-SQUARE(n)

if n = 1 return 1
x = spawn Par-SQUARE(n − 1)
y = n ∗ n
sync
return x + y


Example 3: Computation DAG


Example 3. Complexity comparison

• $T_1(n) = \Theta(n)$

• $T_\infty(n) = \Theta(1) + \max(1, T_\infty(n-1)) = \Theta(n)$

• Parallelism = $T_1(n)/T_\infty(n) = \Theta(1)$. No speed up over the sequential algorithm. Bad parallel implementation.
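To see why the spawn buys nothing here, the recursion can be instrumented as a sequential simulation. The sketch assumes one unit of cost per call and treats the continuation y = n ∗ n as a single unit; the function name `par_square_cost` is hypothetical.

```python
def par_square_cost(n):
    """Return (value, work, span) for Par-SQUARE(n), counting one unit per call."""
    if n == 1:
        return 1, 1, 1
    x, work_x, span_x = par_square_cost(n - 1)  # the spawned branch
    y = n * n                                   # continuation: one unit, no recursion
    work = work_x + 1                           # every call is executed
    span = max(span_x, 1) + 1                   # the spawned branch dominates
    return x + y, work, span


for n in (10, 100, 500):
    value, work, span = par_square_cost(n)
    print(n, value, work, span, work / span)  # n, S(n), work, span, parallelism
```

The spawned branch carries the whole chain of n calls while the parallel continuation does only constant work, so work and span are both n and the parallelism stays at 1.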


The bounds on speed up for p processors

• Speed up = $T_1/T_p$.

• In reality: How much faster does the program run with p processors?

• What are the bounds on $T_p(n)$?

• Crude lower bound: $T_p \ge T_1/p$. Why?

• p processors complete at most p units of work per time step, and it is difficult to divide the work perfectly between them, i.e. $p\,T_p \ge T_1$.

• If p is very large this lower bound is inaccurate. Why?

We need more accurate bounds.


Greedy scheduling I

• A scheduler is greedy if it immediately allocates any free processor to an available task.

• The greedy scheduling principle says that if a computation is run on p processors using a greedy scheduler then the total time $T_p$ is bounded by

$T_p \le \frac{W}{p} + S$

• The span S measures the unavoidably sequential part of the algorithm.


Greedy scheduling II

• The lower bound is

$T_p \ge \max\left(\frac{W}{p},\, S\right)$

• W/p allocates work equally to processors so they all finish at the same time; S is the span.

• Thus

$\max\left(\frac{W}{p},\, S\right) \le T_p \le \frac{W}{p} + S$

• This means that if we increase the number of processors p so that $W/p \ll S$ we are wasting resources. The algorithm still takes time at least S.
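The two bounds are easy to evaluate for concrete numbers. The sketch below uses a hypothetical computation with W = 1024 and S = 10 (values chosen here purely for illustration) and a hypothetical helper `greedy_bounds` to show how the bounds behave as p grows.

```python
def greedy_bounds(work, span, p):
    """Return (lower, upper) bounds on T_p under a greedy scheduler."""
    lower = max(work / p, span)
    upper = work / p + span
    return lower, upper


# Hypothetical computation with W = 1024 and S = 10.
W, S = 1024, 10
for p in (1, 4, 64, 1024, 4096):
    lower, upper = greedy_bounds(W, S, p)
    print(p, lower, upper)
```

Once W/p drops well below S, both bounds are pinned close to S, so adding further processors no longer reduces the running time.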


This material (and much more) is covered in Cormen, Leiserson, Rivest and Stein, Introduction to Algorithms (3rd edition), Chapter 27, Multithreaded Algorithms, downloadable from https://mitpress.mit.edu/sites/default/files/titles/sample/0262533057chap27.pdf

See also the free book at: http://www.parallel-algorithms-book.com/ (Sections 3.3.2, 3.4)