Characterization of Silent Stores

Gordon B. Bell, Kevin M. Lepak, Mikko H. Lipasti

University of Wisconsin–Madison
http://www.ece.wisc.edu/~pharm

PACT 2000 G. Bell, K. Lepak and M. Lipasti

Background

• Lepak, Lipasti: On the Value Locality of Store Instructions, ISCA 2000
• Introduced Silent Stores
  • A memory write that does not change the system state
• Silent stores are real and non-trivial
  • 20%-60% of all dynamic stores are silent in SPECINT-95 and MP benchmarks (32% average)

Why Do We Care?

• Reducing cache writebacks
• Reducing writeback buffering
• Reducing true and false sharing
• Write operations are generally more expensive than reads

Code Size / Efficiency

R(I1,I2,I3) = V(I1,I2,I3) &
    - A(0)*(U(I1,I2,I3)) &
    - A(1)*(U(I1-1,I2,I3) + U(I1+1,I2,I3) &
          + U(I1,I2-1,I3) + U(I1,I2+1,I3) &
          + U(I1,I2,I3-1) + U(I1,I2,I3+1)) &
    - A(2)*(U(I1-1,I2-1,I3) + U(I1+1,I2-1,I3) &
          + U(I1-1,I2+1,I3) + U(I1+1,I2+1,I3) &
          + U(I1,I2-1,I3-1) + U(I1,I2+1,I3-1) &
          + U(I1,I2-1,I3+1) + U(I1,I2+1,I3+1) &
          + U(I1-1,I2,I3-1) + U(I1-1,I2,I3+1) &
          + U(I1+1,I2,I3-1) + U(I1+1,I2,I3+1)) &
    - A(3)*(U(I1-1,I2-1,I3-1) + U(I1+1,I2-1,I3-1) &
          + U(I1-1,I2+1,I3-1) + U(I1+1,I2+1,I3-1) &
          + U(I1-1,I2-1,I3+1) + U(I1+1,I2-1,I3+1) &
          + U(I1-1,I2+1,I3+1) + U(I1+1,I2+1,I3+1))

Example from mgrid (SPECFP-95)

Eliminating this expression (when silent) removes over 100 static instructions (2.4% of the total dynamic instructions)

This Talk

• Characterize silent stores
  • Why do they occur? Source code case studies
  • Silent store statistics
  • Critical silent stores
• Goal: provide insight into silent stores that can lead to innovations in detecting and exploiting them

Terminology

• Silent Store: A memory write that does not change the system state
• Store Verify: A load, compare, and conditional store (if non-silent) operation
• Store Squashing: Removal of a silent store from program execution
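
The store verify operation defined above can be sketched in software as follows. This is only an illustration of the load, compare, and conditional store sequence, not the hardware mechanism the talk evaluates; the function name `store_verify` and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from the slides): store verify on one word.
 * Loads the current value, compares it with the new one, and writes
 * only when the store would be non-silent.
 * Returns true if the store was silent and was therefore squashed. */
bool store_verify(int *addr, int value)
{
    assert(addr != NULL);
    if (*addr == value) {   /* load + compare */
        return true;        /* silent: squash the store */
    }
    *addr = value;          /* non-silent: perform the store */
    return false;
}
```

With `int x = 5;`, the call `store_verify(&x, 5)` reports a silent store and leaves `x` untouched, while `store_verify(&x, 7)` performs the write.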

An Example

    for (i = 0; i < 32; i++){
        time_left[i] -= MIN(time_left[i],time_to_kill);
    }

Example from m88ksim

• This store is silent in over 95% of the dynamic executions of this loop

• Difficult for compiler to eliminate because how often the store is silent may depend on program inputs
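
To make the input dependence concrete, the loop can be instrumented to count silent stores. The sketch below is an assumption-laden illustration: the function name, the counter, and the `NUM_SLOTS` constant are hypothetical, and the slide does not say why the store is usually silent. One way it can be silent is when an entry is already zero, so that `MIN(...)` evaluates to zero and the subtraction leaves the value unchanged.

```c
#include <assert.h>
#include <stddef.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define NUM_SLOTS 32   /* loop bound taken from the slide */

/* Runs the loop from the slide over time_left[] and returns how many of
 * the 32 stores were silent. The counter is instrumentation added for
 * this sketch; silent stores are skipped, non-silent ones are performed. */
int kill_time_count_silent(int time_left[NUM_SLOTS], int time_to_kill)
{
    int silent = 0;
    assert(time_left != NULL);
    for (int i = 0; i < NUM_SLOTS; i++) {
        int old_value = time_left[i];
        int new_value = old_value - MIN(old_value, time_to_kill);
        if (new_value == old_value) {
            silent++;                  /* store would not change memory */
        } else {
            time_left[i] = new_value;  /* non-silent store */
        }
    }
    return silent;
}
```

With an array that is zero everywhere except one entry, 31 of the 32 stores are silent on a pass; a different input array would give a different count, which is the property that makes static elimination hard.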

Value Distribution

[Figure: Value Distribution of Silent Stores (SPECINT-95); y-axis: Frequency, 0% to 50%]

Both values and addresses are likely to be silent

Silent Store Dynamic Distribution (SPECINT)

[Figure: Contribution to Total Silent Stores (y-axis, 0 to 100) versus % of Static Silent Store Instructions (x-axis, log scale from 0.01 to 100); one curve per benchmark: compress, m88ksim, ijpeg, gcc, perl, li, go, vortex; annotated "Frequency of Execution"]

Few static instructions contribute to most silent stores

Stack / Heap

[Figure: Silent Stores Addresses (SPECINT); y-axis: Percent of Total Stores, 0% to 100%; series: heap non-silent, heap silent, stack silent, stack non-silent]

• Uniform stack silent stores (25%-50%)
• Variable heap silent stores

Stores Likely to be Silent

• 4 categories based on previous execution of that particular static store
• Same Location, Same Value
  • A silent store stores the same value to the same location as the last time it was executed
  • Common in loops

Same Location, Same Value

for (anum = 1; anum <= maxarg; anum++) {

argflags = arg[anum].arg_flags;

Example from perl

• argflags is a stack-allocated temporary variable (same location)

• arg_flags is often zero (same value)

• Silent 71% of the time

Stores Likely to be Silent

• Different Location, Same Value
  • A silent store stores the same value as the last time it was executed, but to a different location
  • Common in instructions that store to an array indexed by a loop induction variable

Different Location, Same Value

    for(x = xmin; x <= xmax; ++x)
        for(y = ymin; y <= ymax; ++y){
            s = y*boardsize+x;
            ...
            ltrscr -= ltr2[s];
            ltr2[s] = 0;
            ltr1[s] = 0;
            ltrgd[s] = FALSE;
        }

Example from go

• Clears game board array

• Board is likely to be mostly zero in subsequent clearings

• Silent 86%, 43%, 77% of the time, respectively

Stores Likely to be Silent

• Same Location, Different Value
  • A silent store stores a different value to the same location as the last time it was executed
  • Rare, but can be caused by:
    • Intervening static stores to the same address
    • Stack frame manipulations

Same Location, Different Value

    for(x = xmin; x <= xmax; ++x)
        for(y = ymin; y <= ymax; ++y){
            s = y*boardsize+x;
            ...
            ltrscr -= ltr2[s];
            ltr2[s] = 0;
            ltr1[s] = 0;
            ltrgd[s] = FALSE;
        }

Example from go

• ltrscr is a global variable (same location)

• ltr2 is indexed by loop induction variable (different value)

• Silent 86%, but of that 98% is Same Location, Same Value

Callee-Saved Registers

    call foo()
    call bar()
    …
    call foo()

    *** void foo() ***
    sw $17,28($fp)
    ...

    *** void bar() ***
    sw $17,28($fp)
    ...

$17 is callee-saved

[Animation, slides 18-25: the calls to foo() and bar() execute in sequence and both static stores save $17 to 28($fp); the values 2 and 6 appear in the animation frames, ending with an equality (=) comparison]

Same location, different value, silent
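
One plausible reading of the callee-saved register case can be replayed with a toy model. Everything here is an assumption for illustration: the struct, the function names, and the scenario in which $17 holds 2 on the first call to foo() and 6 afterwards (the values 2 and 6 are taken from the slide residue, but their exact role in the animation is not recoverable).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one register ($17) and one stack word (28($fp)).
 * Both foo() and bar() are assumed to run at the same stack depth,
 * so their prologue stores hit the same address. */
struct machine {
    int reg17;       /* callee-saved register $17 */
    int stack_slot;  /* memory word at 28($fp)    */
};

/* Models the prologue store "sw $17,28($fp)".
 * Returns true if the store is silent. */
bool save_reg17(struct machine *m)
{
    assert(m != NULL);
    bool silent = (m->stack_slot == m->reg17);
    m->stack_slot = m->reg17;
    return silent;
}

/* Replays call foo(); call bar(); call foo() under the assumed values
 * and reports whether foo's second store is silent. */
bool second_foo_store_is_silent(void)
{
    struct machine m = { .reg17 = 2, .stack_slot = 0 };
    save_reg17(&m);         /* foo(): its static store writes 2        */
    m.reg17 = 6;            /* caller changes $17 (assumed)            */
    save_reg17(&m);         /* bar(): writes 6 to the same stack word  */
    return save_reg17(&m);  /* foo(): writes 6; last time it wrote 2   */
}
```

Under these assumptions foo's second store is silent even though it stores a different value than foo's previous execution, because the intervening store in bar() already placed that value at the same address.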

Stores Likely to be Silent

• Different Location, Different Value
  • A static silent store stores a different value to a different location than the last time it was executed
  • Example: nested loops

Different Location, Different Value

    NODE ***xlsave(NODE **nptr,...)
    {
        ...
        for (; nptr != (NODE **) NULL; nptr = va_arg(pvar, NODE **)){
            ...
            *--xlstack = nptr;
            ...
        }

Example from li

• xlstack is continually decremented (different location)

• nptr is set to next function argument (different value)

• Silent if subsequent calls to xlsave store the same set of nodes to the same starting stack address

Likelihood of Being Silent

[Figure: % of Stores that are Silent (y-axis, 0% to 100%) for each category: Same Location Same Value; Different Location Same Value; Different Location Different Value; Same Location Different Value]

Silence can be accurately predicted based on category
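
The four categories compare each dynamic store with the previous execution of the same static store. A minimal sketch of that bookkeeping follows; the type and function names are hypothetical, and the sketch assumes the caller supplies an initial history entry (how a first execution is categorized is not stated in the slides). Note that the category is distinct from silence itself, which depends on the value currently in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum store_category {
    SAME_LOC_SAME_VAL,
    DIFF_LOC_SAME_VAL,
    SAME_LOC_DIFF_VAL,
    DIFF_LOC_DIFF_VAL
};

/* History kept per static store: address and value of its last execution. */
struct store_history {
    uintptr_t last_addr;
    int       last_value;
};

/* Classifies one dynamic execution of a static store against its
 * previous execution, then updates the history. */
enum store_category classify_store(struct store_history *h,
                                   uintptr_t addr, int value)
{
    assert(h != NULL);
    bool same_loc = (addr == h->last_addr);
    bool same_val = (value == h->last_value);
    h->last_addr = addr;
    h->last_value = value;
    if (same_loc) {
        return same_val ? SAME_LOC_SAME_VAL : SAME_LOC_DIFF_VAL;
    }
    return same_val ? DIFF_LOC_SAME_VAL : DIFF_LOC_DIFF_VAL;
}
```

A profiler built this way would keep one `store_history` per static store instruction and tally, per category, how many dynamic stores turn out to be silent.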

Silent Store Breakdown

[Figure: Percent of Dynamic Silent Stores (y-axis, 0% to 100%) per benchmark (go, m88ksim, gcc, compress, li, ijpeg, perl, vortex), broken down by category: Same Location Same Value; Same Location Different Value; Different Location Same Value; Different Location Different Value]

Stores that can be predicted silent (Same Value) are a large portion of all silent stores

Critical Silent Stores

• Critical Silent Store: A specific dynamic silent store that, if not squashed, will cause a cacheline to be marked as dirty and hence require a writeback

Critical Silent Stores

[Figure: a dirty bit alongside cache blocks A B C D E F]

Critical Silent Stores

    sw 0, C
    sw 2, E

[Figure: dirty bit and cache blocks A B C D E F, with C holding 0 and E holding 2; each store compares equal (=) to the value already present]

Both silent stores are critical because the dirty bit would not have been set if the silent stores were squashed

Non-Critical Silent Stores

    sw 0, C
    sw 2, E
    sw 4, D

[Figure: dirty bit and cache blocks A B C D E F, with C holding 0 and E holding 2; D changes from 0 to 4]

No silent stores are critical because the dirty bit is set by a non-silent store (regardless of squashing)
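
The two cases above suggest a simple rule: the silent stores to a line are critical only if no non-silent store touches that line while it is resident. The sketch below illustrates this under stated assumptions: a single line of six words standing in for blocks A through F, hypothetical names throughout, and a dirty flag that only non-silent stores may set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WORDS_PER_LINE 6   /* stands in for blocks A..F on the slides */

struct cache_line {
    int  word[WORDS_PER_LINE];
    bool dirty_nonsilent;  /* set only by non-silent stores */
    int  silent_stores;    /* silent stores seen since the line was filled */
};

/* Applies one store; a silent store leaves the dirty flag untouched. */
void line_store(struct cache_line *line, int index, int value)
{
    assert(line != NULL && index >= 0 && index < WORDS_PER_LINE);
    if (line->word[index] == value) {
        line->silent_stores++;
    } else {
        line->word[index] = value;
        line->dirty_nonsilent = true;
    }
}

/* At eviction: how many silent stores were critical in this residency.
 * If a non-silent store dirtied the line, a writeback is needed anyway,
 * so none of the silent stores were critical. */
int critical_silent_stores(const struct cache_line *line)
{
    assert(line != NULL);
    return line->dirty_nonsilent ? 0 : line->silent_stores;
}
```

Replaying `sw 0, C` and `sw 2, E` against a line that already holds those values yields two critical silent stores; adding `sw 4, D` to the same residency drops the count to zero.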

Critical Silent Stores: Who Cares?

• It is sufficient to squash only critical silent stores to obtain maximal writeback reduction
• Squashing non-critical silent stores:
  • Incurs store verify overhead with no reduction in writebacks
  • Can cause additional address bus transactions in multiprocessors

Critical Silent Stores: Example

    do{
        *(htab_p-16) = -1;
        *(htab_p-15) = -1;
        *(htab_p-14) = -1;
        *(htab_p-13) = -1;
        ...
        *(htab_p-2) = -1;
        *(htab_p-1) = -1;
    } while ((i -= 16) >= 0);

Example from compress

• These 16 stores fill entire cache lines

• If all stores to a line are silent, then they are all critical as well

• 19% of all writebacks can be eliminated

Writeback Reduction

[Figure: % SS Critical and % WB reduced per benchmark (go, m88ksim, gcc, compress, li, ijpeg, perl, vortex); y-axis 0% to 90%]

Squashing only a subset of silent stores results in significant writeback reduction

Conclusion

• Silent Stores occur for a variety of values and execution frequencies
• Silent Store causes:
  • Algorithmic (bad programming?)
  • Architecture / compiler conventions
• Squashing only critical silent stores is sufficient to obtain the maximal writeback reduction

Future Work

• Silence prediction
  • Store verify only if there is reason to believe that the store is:
    • Silent
    • Critical
• Multiprocessor Silent Stores
  • Extend the notion of criticality to include silent stores that cause sharing misses as well as writebacks