Adaptive Insertion Policies for High Performance Caching Qureshi, et al. EECE527 - Paper Summary Jose Pinilla

Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al


Page 1: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al

Adaptive Insertion Policies for High Performance Caching
Qureshi, et al.

EECE527 - Paper Summary
Jose Pinilla


Cache Replacement Policies

● Victim Selection Policy
  ○ LRU

● Insertion Policy
  ○ MRU
  ○ LRU


LRU (Baseline)

LRU replacement (commonly used):


Belady’s OPT

Optimal page replacement algorithm (Changes Victim Selection Policy):

LRU replacement (commonly used):
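As a rough illustration of how OPT changes only the victim selection, the sketch below evicts the resident page whose next use lies furthest in the future. The reference string and the frame count of 3 are assumptions taken from the usual textbook example, not values confirmed by the slide, and `opt_faults` is a hypothetical helper name.

```python
def opt_faults(refs, frames):
    """Count misses when the victim is the page reused furthest in the future."""
    resident = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in resident:
            continue  # hit: OPT keeps no recency state to update
        faults += 1
        if len(resident) == frames:
            future = refs[i + 1:]

            def next_use(p):
                # Pages never referenced again are the best victims.
                return future.index(p) if p in future else len(future)

            resident.remove(max(resident, key=next_use))
        resident.add(page)
    return faults


# Assumed reference string (classic textbook example) with 3 frames.
REFS = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(opt_faults(REFS, 3))  # 9 misses
```

OPT needs knowledge of future references, so it serves as a bound for comparison rather than as an implementable policy.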


LIP (LRU Insertion Policy)

LRU replacement (commonly used):

[Figure: step-by-step cache frame contents for a sample reference string; values not recoverable from the extracted text]


Belady’s OPT

LIP (LRU Insertion Policy)

LRU replacement (commonly used):

[Figure: step-by-step cache frame contents for a sample reference string; values not recoverable from the extracted text]


Cyclic Reference Model

for j = 1 to N
    instructions read (a1 ... aT)

for j = 1 to N
    instructions read (b1 ... bT)

Let there be an access pattern in which (a1 ... aT)^N is followed by (b1 ... bT)^N

Cache Size K (K < T)

N >> T, N >> K/ϵ
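The thrashing behaviour of the baseline on this model can be reproduced with a small simulation. This is a minimal sketch: the values T = 8, K = 4, N = 100 and the helper names `cyclic_pattern` and `lru_hits` are illustrative assumptions, not taken from the paper.

```python
from collections import OrderedDict


def cyclic_pattern(prefix, T, N):
    """Return the sequence (x1 ... xT) repeated N times."""
    return [f"{prefix}{i}" for i in range(1, T + 1)] * N


def lru_hits(accesses, K):
    """Count hits in a K-line cache with MRU insertion and LRU eviction."""
    cache = OrderedDict()  # first key = LRU line, last key = MRU line
    hits = 0
    for line in accesses:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # promote to MRU on a hit
        else:
            if len(cache) == K:
                cache.popitem(last=False)  # evict the LRU line
            cache[line] = True             # insert at MRU
    return hits


T, K, N = 8, 4, 100  # assumed values with K < T
trace = cyclic_pattern("a", T, N) + cyclic_pattern("b", T, N)
print(lru_hits(trace, K))  # 0: every line is evicted before it is reused
print(lru_hits(trace, T))  # 1584: the working set fits, only cold misses
```

With K < T, each line is pushed out by the T - 1 lines that follow it, so LRU never hits no matter how large N is.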


Access Pattern: LRU Step 1

[Diagram: access sequence a1, a2, a3, ..., aT; labels K and TN]


Access Pattern: LRU Step 2

[Diagram: access sequence a1, a2, a3, ..., aT; labels K and TN]


Access Pattern: LRU Step X

[Diagram: access sequence a1, a2, a3, ..., aT; labels K and TN]


Access Pattern: LRU Step X>T*N

[Diagram: access sequences a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels TN, K, TN]


Access Pattern: LIP Step 1

[Diagram: access sequences a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, TN, N, K]


Access Pattern: LIP Step 2

[Diagram: access sequences a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, TN, N, K-1]


Access Pattern: LIP Step X>T*N

[Diagram: access sequences a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, TN, N, K-1]


Bimodal Insertion

Control the percentage of incoming lines placed as MRU

ϵ = Bimodal throttle parameter
ϵ=1 => LRU
ϵ=0 => LIP
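The role of ϵ can be sketched as a single probabilistic choice at insertion time, with victim selection left unchanged. The code below is an illustrative simulation, not the authors' implementation: `simulate`, the seed, ϵ = 1/32 and T = 8, K = 4, N = 100 are assumptions chosen for the example.

```python
import random


def simulate(trace, K, eps, seed=0):
    """Return the hit count for a K-line cache with LRU victim selection.

    eps is the probability that a missing line is inserted at the MRU
    position; otherwise it is inserted at the LRU position.
    """
    rng = random.Random(seed)
    stack = []  # index 0 = LRU position, last index = MRU position
    hits = 0
    for line in trace:
        if line in stack:
            hits += 1
            stack.remove(line)
            stack.append(line)     # a hit always promotes the line to MRU
            continue
        if len(stack) == K:
            stack.pop(0)           # the victim is always the LRU line
        if rng.random() < eps:
            stack.append(line)     # MRU insertion (eps = 1 gives LRU)
        else:
            stack.insert(0, line)  # LRU insertion (eps = 0 gives LIP)
    return hits


T, K, N = 8, 4, 100  # assumed values with K < T
a_phase = [f"a{i}" for i in range(1, T + 1)] * N
b_phase = [f"b{i}" for i in range(1, T + 1)] * N

for name, eps in [("LRU", 1.0), ("LIP", 0.0), ("BIP", 1 / 32)]:
    print(name, simulate(a_phase, K, eps), simulate(a_phase + b_phase, K, eps))
```

In this sketch LIP retains K - 1 lines of the first working set, giving (N - 1)(K - 1) hits on the a phase, but it gains nothing on the b phase because new lines never leave the LRU position. A small non-zero ϵ lets some b lines reach MRU, which is the motivation for BIP.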


Access Pattern: BIP

[Diagram: access sequences a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, TN, N, K-1]


Access Pattern: BIP

[Diagram: access sequences a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, TN, N]



Hit Rate

Cache Size K (K < T)

ϵ = Bimodal throttle parameter
ϵ=1 => LRU
ϵ=0 => LIP

N >> T, N >> K/ϵ


Benchmarks

mcf
art
health

250M instructions obtained with SimPoint


Results 1

So they proved that it works…

...but don’t overdo it (ϵ)...

...actually, let’s choose LRU at run time sometimes.


DIP: Select Mechanism

DIP - Global / DSS
DIP - Set Dueling

ATD: Auxiliary Tag Directory

MTD: Main Tag Directory


DIP: Select Mechanism

DIP - Global / DSS
DIP - Set Dueling

Dedicated-Set Selection Policy: Static or Dynamic (+2 bits)


DIP: Select Mechanism

DIP - Global / DSS
DIP - Set Dueling

Dedicated-Set Size
Selection Policy


Run-time adaptation: PSEL values

PSEL >= 512: follower sets use BIP
PSEL < 512: follower sets use LRU
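The threshold of 512 corresponds to the most significant bit of a 10-bit saturating counter. The sketch below shows one way such a selector could behave, assuming, as in the DIP paper, that the two dueling policies are LRU and BIP and that misses in the sets dedicated to each policy push the counter in opposite directions. The class name, method names and the mid-range starting value are hypothetical.

```python
PSEL_BITS = 10
PSEL_MAX = (1 << PSEL_BITS) - 1    # 1023
THRESHOLD = 1 << (PSEL_BITS - 1)   # 512, i.e. the MSB of PSEL


class PolicySelector:
    """Saturating counter that picks the insertion policy for follower sets."""

    def __init__(self):
        self.psel = THRESHOLD  # assumed initial value, not from the slides

    def miss_in_lru_set(self):
        # A miss in a set dedicated to LRU counts against LRU.
        self.psel = min(self.psel + 1, PSEL_MAX)

    def miss_in_bip_set(self):
        # A miss in a set dedicated to BIP counts against BIP.
        self.psel = max(self.psel - 1, 0)

    def follower_policy(self):
        return "BIP" if self.psel >= THRESHOLD else "LRU"


sel = PolicySelector()
for _ in range(10):
    sel.miss_in_lru_set()
print(sel.psel, sel.follower_policy())  # 522 BIP
for _ in range(2000):
    sel.miss_in_bip_set()
print(sel.psel, sel.follower_policy())  # 0 LRU
```

Because the counter saturates, a long run of misses under one policy cannot delay the switch back by more than 512 opposing misses.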


Hardware advantages

● LIP, BIP and DIP similar to current LRU approximations

● DIP does not require extra bits in the tag-store entry

● No major logic overhead means the cache access time is unaffected


Related Work

R: Random, N: Random from the less recent half, F: Frequently

● Bypass

● Early Eviction

● Dynamic Exclusion


Remarks

Retain some fraction of the working set

Dynamically adapt to workloads and patterns

Low overhead (Set dueling)


Questions?

What would be the behaviour if DIP used ATDs dedicated to LRU and LIP?

● Compare Amean

Dynamic ϵ

● Can ϵ be extracted from PSEL?


References

“Cache Replacement with Dynamic Exclusion”. Scott McFarling

“Set-Dueling-Controlled Adaptive Insertion for High-Performance Caching”. Qureshi et al.

“Using SimPoint for Accurate and Efficient Simulation”. Perelman et al.

“Adaptive Caching for High-Performance Memory Systems”. PhD Dissertation. Qureshi et al.


McFarling: Conflict Between Loops

for i = 1 to 10
    for j = 1 to 10
        instruction a
    for j = 1 to 10
        instruction b

*(a^10 b^10)^10 = 0%

(a_m a_h^9 b_m b_h^9)^10 = 10%

* ignoring loop

Source: “Cache Replacement with Dynamic Exclusion”. Scott McFarling


McFarling: Conflict Between Loop Levels

for i = 1 to 10
    for j = 1 to 10
        instruction a
    instruction b

Direct-mapped
(a_m a_h^9 b_m)^10 = 18%

Optimal
a_m a_h^9 b_m (a_h^10 b_m)^9 = 10%

Source: “Cache Replacement with Dynamic Exclusion”. Scott McFarling


McFarling: Conflict within Loops

for i = 1 to 10
    instruction a
    instruction b

Direct-mapped
(a_m b_m)^10 = 100%

Optimal
a_m b_m (a_h b_m)^9 = 55%

Source: “Cache Replacement with Dynamic Exclusion”. Scott McFarling
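The miss rates on the three McFarling slides can be checked with a short simulation. This is a sketch under the assumption that instructions a and b map to the same line of a direct-mapped cache, and that "optimal" means the incoming line may bypass the cache when the resident line is needed sooner. The function names are hypothetical.

```python
def direct_mapped_misses(trace):
    """Single conflicting line: every miss replaces the resident line."""
    resident, misses = None, 0
    for ref in trace:
        if ref != resident:
            misses += 1
            resident = ref
    return misses


def optimal_misses(trace):
    """Single conflicting line, replacing only if the new line is reused sooner."""
    def next_use(line, start):
        for k in range(start, len(trace)):
            if trace[k] == line:
                return k
        return len(trace)  # never referenced again

    resident, misses = None, 0
    for i, ref in enumerate(trace):
        if ref == resident:
            continue
        misses += 1
        if resident is None or next_use(ref, i + 1) < next_use(resident, i + 1):
            resident = ref  # otherwise the incoming line bypasses the cache
    return misses


patterns = {
    "between loops": (["a"] * 10 + ["b"] * 10) * 10,
    "between loop levels": (["a"] * 10 + ["b"]) * 10,
    "within loop": ["a", "b"] * 10,
}
for name, trace in patterns.items():
    dm = 100 * direct_mapped_misses(trace) / len(trace)
    opt = 100 * optimal_misses(trace) / len(trace)
    print(f"{name}: direct-mapped {dm:.0f}%, optimal {opt:.0f}%")
```

Under these assumptions the simulation gives 10% for the direct-mapped cache on the conflict between loops, 18% versus 10% between loop levels, and 100% versus 55% within a loop, which agrees with the figures quoted from McFarling.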