
Lecture #21: Shared Objects and Concurrent Programming

This material is not available in the textbook. The online powerpoint presentations contain the text explanations given in class.

Art of Multiprocessor Programming

Moore’s Law

[Figure: transistor count still rising; clock speed flattening sharply]

Vanishing from your Desktops: The Uniprocessor

[Figure: a single cpu connected to memory]

Your Server: The Shared Memory Multiprocessor (SMP)

[Figure: several caches connected by a bus to shared memory]

Your New Server or Desktop: The Multicore Processor (CMP)

[Figure: several caches connected by a bus to shared memory, all on the same chip; example: Sun T2000 Niagara]


From the 2008 press…

…Intel has announced a press conference in San Francisco on November 17th, where it will officially launch the Core i7 Nehalem processor…

…Sun’s next generation Enterprise T5140 and T5240 servers, based on the 3rd Generation UltraSPARC T2 Plus processor, were released two days ago…


Why is Kunle Smiling?

Niagara 1

© 2006 Herlihy and Shavit

Traditional Software Scaling Process

[Figure: user code on a traditional uniprocessor; speedup of 1.8x, 3.6x, 7x over time (Moore’s law)]

Multicore Software Scaling Process

[Figure: user code on a multicore; speedup of 1.8x, 3.6x, 7x]

Unfortunately, not so simple…

Real-World Software Scaling Process

[Figure: user code on a multicore; speedup of only 1.8x, 2x, 2.9x]

Parallelization and synchronization require great care…

Concurrent Programming

[Figure: objects residing in shared memory]

Challenge: coordinating access


Persistent vs. Transient Communication

• Persistent communication medium: the sending of information changes the state of the medium forever. Example: a blackboard.

• Transient communication medium: the change of state lasts only for some limited time period. Example: talking.


Parallel Primality Testing

Task: Print all primes from 1 to 10^10, in some order.
Available: A machine with 10 processors.

Solution: Speed the work up 10 times, that is, the new time to print all primes will be 1/10 of the time for a single processor.

Parallel Primality Testing

[Figure: the range 1 to 10^10 cut at 10^9, 2x10^9, ..., one interval for each of P1, P2, ..., P10]

Split the work among processors!

Each processor Pi gets 10^9 numbers to test.


Parallel Primality Testing

(define (P i)
  (let ((counter (+ 1 (* (- i 1) (power 10 9))))   ; first number in Pi's range
        (upto (* i (power 10 9))))                  ; last number in Pi's range
    (define (iter)
      (if (<= counter upto)
          (begin
            (if (prime? counter) (display counter) #f)
            (set! counter (+ counter 1))            ; private counter, nothing shared
            (iter))
          'done))
    (iter)))

(parallel-execute (P 1) (P 2) ... (P 10))

Problem: work is split unevenly

• Some processors have fewer primes to test…
• Some composite numbers are easier to test…

[Figure: the same static split of 1 to 10^10 among P1, P2, ..., P10]

Need to split the work range dynamically!

Shared Counter

[Figure: a shared counter handing out successive values (17, 18, 19); each thread takes a number]


A Shared Counter Object

(define (make-shared-counter value)
  (define (fetch) value)
  (define (increment) (set! value (+ 1 value)))
  (define (dispatch m)
    (cond ((eq? m 'fetch) (fetch))
          ((eq? m 'increment) (increment))
          (else (error "unknown request"))))
  dispatch)

(define shared-counter (make-shared-counter 1))


Using the Shared Counter

(define (P i)
  (define (iter)
    (let ((index (shared-counter 'fetch)))        ; read the next number to test
      (if (< index (power 10 10))
          (begin
            (if (prime? index) (display index) #f)
            (shared-counter 'increment)           ; separate step: advance the counter
            (iter))
          'done)))
  (iter))

(parallel-execute (P 1) (P 2) ... (P 10))


This Solution Doesn’t Work

Increment: (set! value (+ 1 value))

time →
• P1 reads value: 77
• P2 increments 10 times: value is now 87
• P1 executes set!: value becomes 78. Error!

Fetch: (let ((index (shared-counter 'fetch))) ...)

• P1 fetches 77
• P2 fetches 77
• Both test the same number, 77. Error!
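The first schedule above can be replayed deterministically in a single thread. The following Java sketch is only an illustration: the class name LostUpdateReplay and the method replay are hypothetical, and the interleaving is hard-coded rather than produced by real threads.

```java
final class LostUpdateReplay {
    /**
     * Replays the slide's schedule step by step and returns the final
     * counter value. The read and the write of P1's increment are
     * separated by `othersIncrements` complete increments of P2.
     */
    static long replay(long start, int othersIncrements) {
        long value = start;                  // the shared counter
        long p1Read = value;                 // P1 reads the value (77 on the slide)
        for (int k = 0; k < othersIncrements; k++) {
            value = value + 1;               // P2 completes one full increment
        }
        value = p1Read + 1;                  // P1 writes back its stale result
        return value;
    }

    public static void main(String[] args) {
        // P2's ten increments are lost: the counter ends at 78 instead of 88.
        System.out.println(replay(77, 10));
    }
}
```

Because the read and the write are separate steps, any number of updates by other threads can fall between them and be overwritten.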

Is this problem inherent?

If we could only glue reads and writes together…

[Figure: each read glued to the write that follows it]


The Fetch-and-Increment Operation

(define (make-shared-counter value)
  (define (fetch-and-increment)
    (let ((old value))
      (set! value (+ old 1))
      old))                                       ; return the value before the increment
  (define (dispatch m)
    (cond ((eq? m 'fetch-and-increment) (fetch-and-increment))
          (else (error "unknown request -- counter" m))))
  dispatch)

[Figure: the shared counter with an instantaneous fetch-and-inc operation]
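A minimal Java sketch of the same idea, assuming the standard library's AtomicLong.getAndIncrement as the instantaneous fetch-and-increment. The class name SharedCounterPrimes and the helper countPrimes are hypothetical, and the limit is shrunk from 10^10 so the demo finishes quickly; it counts primes instead of printing them so the result can be checked.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

final class SharedCounterPrimes {
    /** Simple trial division; stands in for the slides' prime? predicate. */
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    /** Counts the primes in [1, limit) using `threads` workers and one shared counter. */
    static int countPrimes(int threads, long limit) {
        AtomicLong counter = new AtomicLong(1);      // the shared counter, starting at 1
        AtomicInteger found = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                while (true) {
                    long index = counter.getAndIncrement();   // read and advance in one step
                    if (index >= limit) return;
                    if (isPrime(index)) found.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return found.get();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(10, 1000));   // 168 primes below 1000
    }
}
```

Since every index is handed to exactly one thread, the count does not depend on how the threads interleave.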

Where Things Reside

[Figure: caches connected by a bus to shared memory; the shared counter (value 1) resides in shared memory, while the code and its local variables are private to each thread]

void primePrint() {
    int i = ThreadID.get();                  // thread IDs in {0..9}
    long block = 1_000_000_000L;             // 10^9 numbers per thread
    for (long j = i * block + 1; j <= (i + 1) * block; j++) {
        if (isPrime(j))
            print(j);
    }
}


A Correct Shared Counter

(define shared-counter (make-shared-counter 1))

(define (P i)
  (define (iter)
    (let ((index (shared-counter 'fetch-and-increment)))   ; take the next number
      (if (< index (power 10 10))
          (begin
            (if (prime? index) (display index) #f)
            (iter))
          'done)))
  (iter))

(parallel-execute (P 1) (P 2) ... (P 10))


Implementing Fetch-and-Inc

To make the program work we need an “instantaneous” implementation of fetch-and-increment. How can we do this?

• Special hardware: built-in synchronization instructions.
• Special software: use regular instructions; the solution will involve waiting.

Software: Mutual Exclusion


Mutual Exclusion

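(define (fetch-and-increment)
  (mutex 'start)
  (let ((old value))
    (set! value (+ old 1))
    (mutex 'end)
    old))                      ; return the old value after releasing the mutex

Only one process at a time can execute these instructions

[Figure: P1, P2, ..., P10 all call fetch-and-increment; the mutex admits one at a time, here P2, which returns 1]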
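The software route can be sketched in Java with a lock playing the role of the mutex. This is an illustration under assumptions: ReentrantLock from the standard library stands in for (mutex 'start) and (mutex 'end), and the names LockedCounter and hammer are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

final class LockedCounter {
    private final ReentrantLock mutex = new ReentrantLock();
    private long value;

    LockedCounter(long initial) {
        value = initial;
    }

    /** Returns the current value and advances the counter, one thread at a time. */
    long fetchAndIncrement() {
        mutex.lock();                 // corresponds to (mutex 'start)
        try {
            long old = value;
            value = old + 1;
            return old;
        } finally {
            mutex.unlock();           // corresponds to (mutex 'end)
        }
    }

    /** Lets `threads` threads each take `perThread` numbers; returns the next free number. */
    static long hammer(int threads, int perThread) {
        LockedCounter counter = new LockedCounter(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    counter.fetchAndIncrement();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.fetchAndIncrement();
    }

    public static void main(String[] args) {
        // 10 threads x 1000 calls starting from 1: the next number handed out is 10001.
        System.out.println(hammer(10, 1000));
    }
}
```

Threads that find the lock taken must wait, which is the waiting the slide refers to.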


The Story of Alice and Bob

Bob Alice

Yard

* As told by Leslie Lamport


The Mutual Exclusion Problem

Requirements:

• Mutual Exclusion: there will never be two dogs simultaneously in the yard.
• No Deadlock: if only one dog wants to be in the yard it will succeed, and if both dogs want to go out, at least one of them will succeed.


Cell Phone Solution

Bob Alice

Yard


Coke Can Solution

Bob Alice

Yard


Flag Solution -- Alice

(define (Alice)
  (let loop ()                          ; repeat forever
    (set! Alice-flag 'up)               ; Alice wants to enter
    (let wait ()                        ; loop until Bob lowers his flag
      (if (eq? Bob-flag 'up) (wait)))
    (Alice-dog-in-yard)                 ; dog can enter the yard
    (set! Alice-flag 'down)             ; Alice is leaving
    (loop)))

Bob Alice


Flag Solution -- Bob

(define (Bob)
  (let loop ()                          ; repeat forever
    (set! Bob-flag 'up)                 ; Bob wants to enter
    (let retry ()
      (if (eq? Alice-flag 'up)          ; if Alice wants to enter
          (begin
            (set! Bob-flag 'down)       ; Bob is a gentleman
            (let wait ()                ; loop till Alice leaves
              (if (eq? Alice-flag 'up) (wait)))
            (set! Bob-flag 'up)         ; raise flag
            (retry))))                  ; and check Alice's flag again
    (Bob-dog-in-yard)                   ; dog can enter the yard
    (set! Bob-flag 'down)               ; Bob is leaving
    (loop)))
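The two flag protocols can be tried out together in Java. This sketch makes several assumptions that are not in the slides: the flags are volatile booleans (which Java keeps sequentially consistent, as the protocol requires), each owner makes a bounded number of visits instead of looping forever, and an AtomicInteger is added only to detect two dogs in the yard at once. All names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class FlagProtocol {
    static volatile boolean aliceFlag;              // true means the flag is up
    static volatile boolean bobFlag;
    static volatile boolean violation;              // set if both dogs are ever in the yard
    static final AtomicInteger dogsInYard = new AtomicInteger();

    /** The critical section: records a violation if another dog is already here. */
    static void yard() {
        if (dogsInYard.incrementAndGet() != 1) violation = true;
        dogsInYard.decrementAndGet();
    }

    static void alice(int visits) {
        for (int k = 0; k < visits; k++) {
            aliceFlag = true;                       // Alice wants to enter
            while (bobFlag) Thread.yield();         // wait until Bob lowers his flag
            yard();
            aliceFlag = false;                      // Alice is leaving
        }
    }

    static void bob(int visits) {
        for (int k = 0; k < visits; k++) {
            bobFlag = true;                         // Bob wants to enter
            while (aliceFlag) {                     // if Alice wants to enter
                bobFlag = false;                    // Bob defers
                while (aliceFlag) Thread.yield();   // wait till Alice leaves
                bobFlag = true;                     // raise flag and check again
            }
            yard();
            bobFlag = false;                        // Bob is leaving
        }
    }

    /** Runs both protocols concurrently; returns true if exclusion was never violated. */
    static boolean run(int visits) {
        aliceFlag = false;
        bobFlag = false;
        violation = false;
        dogsInYard.set(0);
        Thread a = new Thread(() -> alice(visits));
        Thread b = new Thread(() -> bob(visits));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !violation;
    }

    public static void main(String[] args) {
        System.out.println(run(10000) ? "mutual exclusion held" : "violation observed");
    }
}
```

A passing run does not prove the protocol correct; it only fails to find a counterexample. With plain (non-volatile) fields the compiler and hardware may reorder the flag accesses, and the argument would no longer apply.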



Intuition: Why Mutual Exclusion is Preserved

Each performs two steps:

• First raise the flag, to signal interest.
• Then look to see if the other one has raised the flag.

One can claim that the following flag principle holds: since Alice and Bob each raise their own flag and then look at the other's flag, the last one to start looking must notice that both flags are up.


Proof of Mutual Exclusion

• Assume both dogs are in the yard
• Derive a contradiction
• By reasoning backwards

• Consider the last time Alice and Bob each looked before letting the dogs in

• Without loss of generality assume Alice was the last to look…

Proof

• Alice last raised her flag, and after that took her last look.
• Bob last raised his flag, and after that took his last look.
• Alice was the last to look, so Bob's flag was already up, and stayed up, when she looked.

Alice must have seen Bob’s flag. A contradiction. QED


Why is there no Deadlock?

Alice has priority over Bob: if both are repeatedly trying to enter and neither is in the critical section, Bob lowers his flag and waits, so Alice gets in.

Unfortunately, the algorithm is not a fair one, and Bob's dogs might eventually grow very anxious :-)


The Morals of our Story

• The Mutual Exclusion problem cannot be solved using transient communication (i.e., cell phones).

• The Mutual Exclusion problem cannot be solved using interrupts or interrupt bits (i.e., cans).

• The Mutual Exclusion problem can be solved with one-bit registers (i.e., flags): memory locations that can be read and written (set!-ed).

We cheated a little: the arbiter problem…


The Arbiter Problem (an aside)

Pick a point

Pick a point


The Solution and Conclusion

(define (Alice)
  (let loop ()              ; repeat forever
    (mutex 'begin)
    (Alice-dog-in-yard)     ; critical section
    (mutex 'end)
    (loop)))

Question: then why not execute all the code of the parallel prime-printing algorithm in a critical section?


Answer: Amdahl’s Law

$$ \text{Speedup} = \frac{\text{OldExecutionTime}}{\text{NewExecutionTime}} $$

…of computation given n CPUs instead of 1

Amdahl’s Law

$$ \text{Speedup} = \frac{1}{1 - p + \frac{p}{n}} $$

where p is the parallel fraction, 1 - p is the sequential fraction, and n is the number of processors.


Example

• Ten processors
• 60% concurrent, 40% sequential
• How close to 10-fold speedup?

$$ \text{Speedup} = \frac{1}{1 - 0.6 + \frac{0.6}{10}} = 2.17 $$


Example

• Ten processors
• 80% concurrent, 20% sequential
• How close to 10-fold speedup?

$$ \text{Speedup} = \frac{1}{1 - 0.8 + \frac{0.8}{10}} = 3.57 $$


Example

• Ten processors
• 90% concurrent, 10% sequential
• How close to 10-fold speedup?

$$ \text{Speedup} = \frac{1}{1 - 0.9 + \frac{0.9}{10}} = 5.26 $$


Example

• Ten processors
• 99% concurrent, 1% sequential
• How close to 10-fold speedup?

$$ \text{Speedup} = \frac{1}{1 - 0.99 + \frac{0.99}{10}} = 9.17 $$
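The arithmetic in these examples is easy to reproduce. A small Java sketch, where the class name Amdahl and the method speedup are hypothetical names chosen for illustration:

```java
final class Amdahl {
    /** Amdahl's Law: speedup on n processors when a fraction p of the work is parallel. */
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // The four ten-processor examples from the slides.
        double[] fractions = {0.60, 0.80, 0.90, 0.99};
        for (double p : fractions) {
            System.out.printf("p = %.2f, n = 10: speedup = %.2f%n", p, speedup(p, 10));
        }
    }
}
```

Even with 99% of the work parallel, ten processors fall short of a 10-fold speedup, because the sequential 1% does not shrink as processors are added.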

Back to Real-World Multicore Scaling

[Figure: user code on a multicore; speedup of only 1.8x, 2x, 2.9x]

Why the bad performance?

Amdahl’s Law:

Pay for N = 8 cores, SequentialPart = 25%

$$ \text{Speedup} = \frac{1}{0.25 + \frac{0.75}{8}} \approx 2.9 $$

Speedup = only 2.9 times!

As the number of cores grows, the effect of the 25% becomes more acute: a speedup of 2.3 on 4 cores, 2.9 on 8, 3.4 on 16, 3.7 on 32…

Must parallelize applications on a very fine grain!

Where is sequential code coming from…

Need Fine-Grained Locking

[Figure: a program that is 75% unshared and 25% shared, run on several cores (c). Coarse-grained: the whole shared 25% is handled as one unit, which is the reason we get only 2.9 speedup. Fine-grained: the shared 25% is itself divided among the cores.]
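One common way to make locking finer-grained is lock striping: instead of one lock for a whole shared structure, each part gets its own lock. The Java sketch below is an assumption-laden illustration and is not taken from the slides; the class StripedSet, its 16 stripes, and the helper fill are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StripedSet {
    private final List<Set<Integer>> buckets = new ArrayList<>();
    private final Object[] locks;                   // one lock per bucket

    StripedSet(int stripes) {
        locks = new Object[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object();
            buckets.add(new HashSet<>());
        }
    }

    /** Adds a key while holding only the lock of the bucket it hashes to. */
    boolean add(int key) {
        int i = Math.floorMod(key, locks.length);
        synchronized (locks[i]) {                   // other buckets stay available
            return buckets.get(i).add(key);
        }
    }

    /** Sums the bucket sizes, locking one bucket at a time. */
    int size() {
        int total = 0;
        for (int i = 0; i < locks.length; i++) {
            synchronized (locks[i]) {
                total += buckets.get(i).size();
            }
        }
        return total;
    }

    /** Lets `threads` threads insert disjoint key ranges; returns the final size. */
    static int fill(int threads, int keysPerThread) {
        StripedSet set = new StripedSet(16);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * keysPerThread;
            workers[t] = new Thread(() -> {
                for (int k = 0; k < keysPerThread; k++) set.add(base + k);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(8, 1000));          // 8 threads x 1000 distinct keys
    }
}
```

With a single lock, every add would serialize; with stripes, two threads wait for each other only when their keys land in the same bucket, which shrinks the sequential fraction that Amdahl's Law penalizes.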


Multicores are here …


“Life is the synchronicity of chance”

You just saw a bit of what concurrent programming is about

Today we don’t have sufficient expertise yet on how to make use of multicore machines…

You guys are the generation that will get to use them and hopefully develop this expertise.

Programming Multicore Machines