Read-Write Lock Allocation in Software Transactional Memory
Amir Ghanbari Bavarsad and Ehsan Atoofian, Lakehead University

Page 1: Read-Write Lock Allocation in Software Transactional Memory

Read-Write Lock Allocation in Software Transactional Memory

Amir Ghanbari Bavarsad and Ehsan Atoofian

Lakehead University

Page 2: Read-Write Lock Allocation in Software Transactional Memory

Transactional Memory

[Figure: processors P1 … Pn, each with a private cache ($), sharing a single global clock]

Software transactional memory (STM) exploits a global clock to validate transactional data
  Pros: reduces validation overhead
  Cons: contention

Alternative: Read-Write Lock Allocation (RWLA)
  Pros: no central clock
  Cons: overhead if a TX aborts

Speculative RWLA: changes validation policy dynamically → Speedup: up to 66%

Page 3: Read-Write Lock Allocation in Software Transactional Memory

Outline

Background

RWLA

Speculative RWLA

Conclusion

Page 4: Read-Write Lock Allocation in Software Transactional Memory

Counter in STM

T1

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

Page 5: Read-Write Lock Allocation in Software Transactional Memory

Validation in STM

Transactional data are validated using:
  Global clock
    Shared variable
    Timestamp for transactions
  Lock
    Memory is mapped to Lock Table
    Each entry of the table: Version #

[Figure: the global clock, and memory mapped to lock-table entries that each hold a version #]

Page 6: Read-Write Lock Allocation in Software Transactional Memory

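Updating Global Clock & Lock

When a transaction commits its writes:
  Increment Global Clock
  Version # = global_clock

[Figure: memory location "counter" mapped to a lock-table entry whose version # is taken from the global clock]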
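The two update steps on this slide can be sketched in C11 as follows. This is an illustration only, not TL2's code: the table size, the address hash `lock_for`, and the function name `publish_write` are hypothetical, and the clock is bumped with a plain atomic fetch-and-add, which is an assumption rather than the GV4 clock-update scheme.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_TABLE_SIZE 1024u   /* hypothetical table size (power of two) */

static atomic_uint_fast64_t global_clock;                 /* shared by all threads */
static atomic_uint_fast64_t lock_table[LOCK_TABLE_SIZE];  /* one version # per entry */

/* Hypothetical address-to-entry hash: drop the low bits, wrap around the table. */
static atomic_uint_fast64_t *lock_for(const void *addr) {
    return &lock_table[((uintptr_t)addr >> 3) & (LOCK_TABLE_SIZE - 1u)];
}

/* Commit-time update for one written location: increment the global
 * clock, then stamp the location's lock entry with the new value. */
static uint_fast64_t publish_write(const void *addr) {
    assert(addr != NULL);
    uint_fast64_t wv = atomic_fetch_add(&global_clock, 1) + 1;  /* increment global clock */
    atomic_store(lock_for(addr), wv);                           /* version # = global_clock */
    return wv;
}
```

Every call writes `global_clock`, so with many committing threads the cache line holding the clock keeps moving between cores.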

Page 7: Read-Write Lock Allocation in Software Transactional Memory

Validation in STM

At TM_BEGIN(), rv (read version) is set to global_clock

T1

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

[Figure: metadata for TX1 holds rv, copied from the global clock]

Page 8: Read-Write Lock Allocation in Software Transactional Memory

Successful Read Validation

rv >= version#
The most recent write to counter occurred before TM_BEGIN()

T1

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

[Figure: metadata for TX1 (rv) shown next to the global clock]

Page 9: Read-Write Lock Allocation in Software Transactional Memory

Failed Read Validation

rv < version#
The most recent write to counter occurred after TM_BEGIN()

T1

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

[Figure: metadata for TX1 (rv) shown next to the global clock]
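The rv comparison from the last three slides can be written out as a short C11 sketch. The names `tx_meta`, `tx_begin`, and `read_is_valid` are hypothetical and chosen for illustration; the slides only name `rv`, the version #, and the global clock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static atomic_uint_fast64_t global_clock;   /* shared clock */

/* Per-transaction metadata (hypothetical layout). */
typedef struct {
    uint_fast64_t rv;   /* read version, sampled at TM_BEGIN() */
} tx_meta;

/* TM_BEGIN(): remember the clock value the transaction started at. */
static void tx_begin(tx_meta *tx) {
    assert(tx != NULL);
    tx->rv = atomic_load(&global_clock);
}

/* Read validation: succeeds iff rv >= version#, i.e. the most recent
 * write to the location happened before this transaction began. */
static bool read_is_valid(const tx_meta *tx, atomic_uint_fast64_t *version) {
    return tx->rv >= atomic_load(version);
}
```

A transaction whose read fails this check would abort and retry with a fresh `rv`.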

Page 10: Read-Write Lock Allocation in Software Transactional Memory

Overhead of Validation

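This method, called GV4, results in many cache coherence misses if transactions commit frequently, because every commit updates the global clock that all processors share

[Figure: processors P1 … Pn, each with a private cache ($), sharing a single global clock]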

Page 11: Read-Write Lock Allocation in Software Transactional Memory

Outline

Background

RWLA

Speculative RWLA

Conclusion

Page 12: Read-Write Lock Allocation in Software Transactional Memory

Read Write Lock Allocation (RWLA)

Lock
  Memory is mapped to Lock Table
  Each entry of the table:
    Lock bit
    Read bits

[Figure: memory mapped to the lock table; each entry holds a lock bit followed by read bits labelled P0, P1, …, Pn-1]
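One possible encoding of such an entry packs the lock bit and the read bits into a single word. The sketch below assumes one read bit per thread, a 16-thread limit, and the lock bit in bit 0; none of these choices comes from the slides, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_THREADS 16u              /* hypothetical thread limit */
#define LOCK_BIT    ((uint32_t)1u)   /* bit 0: write lock */

/* Read bit of thread tid lives at bit position tid + 1. */
static uint32_t read_bit(unsigned tid) {
    assert(tid < MAX_THREADS);
    return (uint32_t)1u << (tid + 1u);
}

/* Mask covering every read bit (bits 1..MAX_THREADS). */
static uint32_t all_read_bits(void) {
    return (((uint32_t)1u << MAX_THREADS) - 1u) << 1u;
}
```

Keeping the whole entry in one word lets a single atomic operation test the lock bit and update a read bit together.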

Page 13: Read-Write Lock Allocation in Software Transactional Memory

TM_READ

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

[Lock entry: 000000 … (lock bit and read bits all clear)]

Page 14: Read-Write Lock Allocation in Software Transactional Memory

TM_READ

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

TM_READ() → Lock bit is free?
  Yes → Set read bit in the corresponding lock entry

[Lock entry: 000000 …, with the reader's read bit set to 1; the first bit is the lock bit]

Page 15: Read-Write Lock Allocation in Software Transactional Memory

TM_READ

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

TM_READ() → Lock bit is free?
  Yes → Set read bit in the corresponding lock entry
  No → Abort

[Lock entry: 100000 … (lock bit set)]
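The TM_READ decision above can be sketched in C11. The entry layout (lock bit in bit 0, thread `tid`'s read bit at bit `tid + 1`), the 16-thread limit, and the name `rwla_read` are assumptions for illustration. A compare-and-swap loop is used so that checking the lock bit and setting the read bit happen as one atomic step; the slides do not say how the original does this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREADS 16u              /* hypothetical thread limit */
#define LOCK_BIT    ((uint32_t)1u)   /* bit 0: write lock (assumed layout) */

/* Returns true if the read may proceed, false if the transaction must abort. */
static bool rwla_read(_Atomic uint32_t *entry, unsigned tid) {
    assert(tid < MAX_THREADS);
    uint32_t mine = (uint32_t)1u << (tid + 1u);   /* this thread's read bit */
    uint32_t old = atomic_load(entry);
    for (;;) {
        if (old & LOCK_BIT)
            return false;   /* lock bit is not free -> abort */
        /* Lock bit is free: try to set our read bit. On failure the CAS
         * reloads 'old' and the lock bit is checked again. */
        if (atomic_compare_exchange_weak(entry, &old, old | mine))
            return true;
    }
}
```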

Page 16: Read-Write Lock Allocation in Software Transactional Memory

TM_WRITE

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

TM_WRITE → All read bits are clear?
  No → Abort

[Lock entry: 000100 … (a read bit is set)]

Page 17: Read-Write Lock Allocation in Software Transactional Memory

TM_WRITE

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

TM_WRITE → All read bits are clear?
  No → Abort
  Yes → Acquire lock
    failed → Abort

[Lock entry: 100000 … (lock bit set)]

Page 18: Read-Write Lock Allocation in Software Transactional Memory

TM_WRITE

    TM_BEGIN();
    local_counter = TM_READ(counter);
    local_counter++;
    TM_WRITE(counter, local_counter);
    TM_END();

TM_WRITE → All read bits are clear?
  No → Abort
  Yes → Acquire lock
    failed → Abort

[Lock entry bit labels on the slide: 00000 …, 10]
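A matching C11 sketch of the TM_WRITE flowchart follows, under the same assumed entry layout (lock bit in bit 0, thread `tid`'s read bit at bit `tid + 1`). Two details are assumptions that the slides do not settle: the writer's own read bit is ignored in the "all read bits are clear?" test, since T1 reads counter before writing it, and a single failed compare-and-swap is treated as "acquire lock failed". The name `rwla_write` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREADS 16u              /* hypothetical thread limit */
#define LOCK_BIT    ((uint32_t)1u)   /* bit 0: write lock (assumed layout) */

/* Returns true if the lock was acquired, false if the transaction must abort. */
static bool rwla_write(_Atomic uint32_t *entry, unsigned tid) {
    assert(tid < MAX_THREADS);
    uint32_t mine = (uint32_t)1u << (tid + 1u);                        /* own read bit */
    uint32_t all_reads = (((uint32_t)1u << MAX_THREADS) - 1u) << 1u;   /* every read bit */
    uint32_t old = atomic_load(entry);
    if (old & all_reads & ~mine)
        return false;   /* another thread's read bit is set -> abort */
    if (old & LOCK_BIT)
        return false;   /* lock already held -> acquire fails -> abort */
    /* One CAS attempt: any concurrent change counts as a failed acquire. */
    return atomic_compare_exchange_strong(entry, &old, old | LOCK_BIT);
}
```

Clearing the lock bit and the read bits at commit or abort is left out of this sketch.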

Page 19: Read-Write Lock Allocation in Software Transactional Memory

Experimental Framework

Benchmarks: STAMP v0.9.7
  Run up to completion
  Measured statistics over 10 runs

TL2 as an STM framework

Two Intel Xeon E5660, 6-way CMP

Page 20: Read-Write Lock Allocation in Software Transactional Memory

Performance of RWLA

[Figure: bar chart, y-axis from 0 to 1.6, for Bayes, Kmeans, Labyrinth, Ssca2, Vacation, and Genome; bars for 2, 4, 8, and 16 threads plus AVG.; an arrow marks the "better" direction]

Page 21: Read-Write Lock Allocation in Software Transactional Memory

Speculative RWLA

Conflict occurs frequently → select GV4
Conflict occurs rarely → select RWLA
How to predict conflict?

Page 22: Read-Write Lock Allocation in Software Transactional Memory

Contention Predictor

[Figure: perceptron with inputs 1, x1, …, xn, weights w0, w1, …, wn, and output y]

$$ y = w_0 + \sum_{i=1}^{n} x_i w_i $$

xi: global transaction history, bipolar value
wi: weight vector

Prediction:
  y ≥ 0 → predict commit
  y < 0 → predict abort

Update:
  If outcome of current TX and TXi agree/disagree → increment/decrement wi
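The predictor can be sketched in C as a small perceptron over the recent commit/abort history. Several points are assumptions made for illustration: the history length of 8, integer weights starting at zero, a history that starts as all commits, training after every transaction, and the mapping "predicted commit → RWLA, predicted abort → GV4" in the hypothetical `select_policy`. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HISTORY_LEN 8   /* hypothetical n */

typedef enum { POLICY_RWLA, POLICY_GV4 } policy;

typedef struct {
    int w[HISTORY_LEN + 1];   /* w[0] is the bias weight w0 */
    int x[HISTORY_LEN];       /* +1 = commit, -1 = abort, newest first */
} predictor;

static void predictor_init(predictor *p) {
    assert(p != NULL);
    for (int i = 0; i <= HISTORY_LEN; i++) p->w[i] = 0;
    for (int i = 0; i < HISTORY_LEN; i++) p->x[i] = 1;   /* assume past commits */
}

/* y = w0 + sum over i of x_i * w_i */
static int predictor_output(const predictor *p) {
    int y = p->w[0];
    for (int i = 0; i < HISTORY_LEN; i++) y += p->x[i] * p->w[i + 1];
    return y;
}

/* y >= 0 -> predict commit, y < 0 -> predict abort */
static bool predict_commit(const predictor *p) {
    return predictor_output(p) >= 0;
}

/* Assumed mapping: few expected conflicts -> RWLA, otherwise GV4. */
static policy select_policy(const predictor *p) {
    return predict_commit(p) ? POLICY_RWLA : POLICY_GV4;
}

/* Train on the actual outcome, then shift it into the history. */
static void predictor_update(predictor *p, bool committed) {
    int t = committed ? 1 : -1;
    p->w[0] += t;                     /* bias input is the constant 1 */
    for (int i = 0; i < HISTORY_LEN; i++)
        p->w[i + 1] += t * p->x[i];   /* agree -> increment, disagree -> decrement */
    for (int i = HISTORY_LEN - 1; i > 0; i--) p->x[i] = p->x[i - 1];
    p->x[0] = t;
}
```

A caller would consult `select_policy` before starting a transaction and call `predictor_update` once the transaction has committed or aborted.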

Page 23: Read-Write Lock Allocation in Software Transactional Memory

Performance of Speculative RWLA

# of threads changes between 2 and 16
On average, the performance change ranges from 21% in Bayes to 47% in Labyrinth

[Figure: bar chart, y-axis from 0 to 1.2, for Bayes, Kmeans, Labyrinth, Ssca2, Vacation, and Genome; bars for 2, 4, 8, and 16 threads plus AVG.; an arrow marks the "better" direction]

Page 24: Read-Write Lock Allocation in Software Transactional Memory

Conclusion

RWLA overcomes contention over the global clock

Applications react differently to GV4 and RWLA

Speculative RWLA changes validation policy dynamically

Speculative RWLA improves performance of STMs by up to 66%

Page 25: Read-Write Lock Allocation in Software Transactional Memory

Thank You!

Questions?