WALL: A Writeback-Aware LLC Management for PCM-based Main Memory Systems Bahareh Pourshirazi * , Majed Valad Beigi , Zhichun Zhu * , and Gokhan Memik * University of Illinois at Chicago Northwestern University DATE-2018 IEEE/ACM Design, Automation, and Test in Europe March 21 Dresden, Germany


WALL: A Writeback-Aware LLC Management for PCM-based Main Memory Systems

Bahareh Pourshirazi*, Majed Valad Beigi†, Zhichun Zhu*, and Gokhan Memik†

* University of Illinois at Chicago
† Northwestern University

DATE-2018, IEEE/ACM Design, Automation, and Test in Europe

March 21, Dresden, Germany


Introduction

• Increasing demand for memory capacity
  – Increasing number of cores on multicore processors
    • Intel Sandy Bridge: 8 cores (16 threads)
    • IBM POWER7: 8 cores (32 threads)
  – Increasing data set sizes
    • Graph, database, scientific workloads
• Problems with DRAM
  – Scalability limitations
    • Scaling has slowed down
    • Scaling below 16nm seems difficult
  – Periodic refresh operations

2DATE 2018 – Pourshirazi et al. 3/21/2018


Phase Change Memory

• Promising technology
  – Denser than DRAM (3−12×)
  – Non-volatile storage
• Shortcomings, especially for WRITE operations
  – Higher access latency (4−12× DRAM)
  – Higher dynamic energy (2−40× DRAM)
  – Limited write endurance

[Figure: PCM cell showing the wordline, the bitline, and the storage element]


Existing Solutions

• DRAM + PCM hybrid main memories
  – DRAM as a cache to PCM
• Modifications to PCM-based main memories
  – Optimization on PCM architecture
  – Reducing the number of writebacks from LLC to PCM

[Figure: two organizations. One shows CPU cores, a shared LLC, and PCM main memory. The other shows CPU cores, a shared LLC, a DRAM cache, and large PCM storage as main memory (MM).]


Summary of Contributions

• In this work
  – We propose WALL, a novel Writeback-Aware LLC management scheme
  – WALL reduces the number of writebacks from the Last Level Cache (LLC) to a PCM-based main memory
  – WALL consists of
    • Writeback-aware set-balancing mechanism
    • Writeback-aware replacement policy


Outline

• Introduction
• Background
• Motivation
• WALL
• Evaluation Results
• Conclusion


Background

• Impact of reducing write traffic on PCM
  – Lifetime enhancement
    • $\text{PCM lifetime} \propto \dfrac{1}{\text{write traffic}}$
  – Performance improvement
    • Writes increase the latency of reads by 1.2 to 1.8 times [Arjomand_ISCA2016]
  – Reduction in energy consumption
    • Writes consume ~10× the energy of reads

[Figure: normalized lifetime (×, 0 to 10) versus write reduction (%, 0 to 90)]
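The inverse relation between lifetime and write traffic can be made concrete with a short sketch. It assumes an idealized model in which lifetime depends on total write traffic alone, so cutting writes by a fraction r gives a normalized lifetime of 1 / (1 − r). The function name is illustrative and not from the slides. Measured lifetime gains can differ from this ideal curve because other factors also matter.

```python
def normalized_lifetime(write_reduction: float) -> float:
    """Ideal lifetime gain (x) when write traffic shrinks by `write_reduction`.

    Assumption: lifetime is inversely proportional to write traffic and
    depends on nothing else.
    """
    if not 0.0 <= write_reduction < 1.0:
        raise ValueError("write_reduction must be in [0, 1)")
    return 1.0 / (1.0 - write_reduction)


# Print the ideal curve at the same 10% steps as the plot's x-axis.
for percent in range(0, 100, 10):
    gain = normalized_lifetime(percent / 100)
    print(f"{percent:2d}% fewer writes -> {gain:.2f}x lifetime")
```

Under this model, halving the write traffic doubles the lifetime, and the curve rises steeply as the reduction approaches 100%.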


Related Work

• WADE [Wang_TACO2013]
  – Reduces the number of writebacks to PCM
    • Partitions a set’s blocks into “frequent writeback” and “non-frequent writeback”
    • Tries to keep the frequent writeback blocks in the set
  – Considers the set’s blocks as the only replacement candidates
  – Complex implementation
• Set-Balancing Cache (SBC) [Rolán_Micro2009]
  – Balances the pressure on cache sets to reduce miss rate
  – It does not reduce writebacks


Motivation

• Writebacks are not uniformly distributed among LLC sets
• A set with few writebacks can be used to store the dirty eviction victims of a set with many writebacks

[Figure: three plots of percentage of writebacks (0 to 100) versus percentage of sets (0 to 100): sp from NAS (annotations 22.9%, 5.3%, 94.7), gcc from SPEC2006 (annotations 29.6%, 5.1%, 94.9), and streamcluster from PARSEC (annotations 27.4%, 8.7%, 91.3)]
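One way to quantify this skew is to ask what share of all writebacks comes from the busiest fraction of sets. The sketch below is illustrative only. The function name and the sample counters are hypothetical and are not data from the slides.

```python
def writeback_share_of_top_sets(writebacks_per_set, top_fraction):
    """Fraction of all writebacks produced by the `top_fraction` busiest sets."""
    if not 0.0 <= top_fraction <= 1.0:
        raise ValueError("top_fraction must be in [0, 1]")
    total = sum(writebacks_per_set)
    if total == 0:
        return 0.0
    # Rank sets from most to fewest writebacks and keep the top slice.
    ranked = sorted(writebacks_per_set, reverse=True)
    top_count = round(top_fraction * len(ranked))
    return sum(ranked[:top_count]) / total


# Hypothetical per-set writeback counters for an eight-set cache.
example_counts = [90, 60, 20, 10, 8, 6, 4, 2]
share = writeback_share_of_top_sets(example_counts, 0.25)
print(f"Top 25% of sets produce {share:.1%} of the writebacks")
```

A uniform distribution would give a share equal to `top_fraction`, so a much larger share indicates that a few sets dominate the write traffic.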


Set Balancing

• LLC sets are classified into three categories
  – Writer: frequent writebacks
  – Non-writer: infrequent writebacks, underutilized
  – Neutral: neither writer nor non-writer
• Each writer set is partnered with a non-writer set

[Figure: a WRITER set and its partner NON-WRITER set; labels: INSERT, EVICT DIRTY (LRU), write to PCM main memory]
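The partnering idea can be sketched with two small LRU sets. This is a minimal illustration under stated assumptions, not the authors' implementation: the `CacheSet` class and `fill_writer_set` function are hypothetical, clean victims are simply dropped, a dirty block displaced from the partner set is assumed to be written to PCM, and lookups across partnered sets are ignored.

```python
from collections import OrderedDict


class CacheSet:
    """One LLC set with LRU ordering: the first key is LRU, the last is MRU."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # tag -> dirty flag

    def insert(self, tag, dirty):
        """Insert at MRU; return the evicted (tag, dirty) pair, or None."""
        victim = None
        if tag not in self.blocks and len(self.blocks) >= self.ways:
            victim = self.blocks.popitem(last=False)  # evict the LRU block
        # A block that was already dirty stays dirty.
        self.blocks[tag] = dirty or self.blocks.get(tag, False)
        self.blocks.move_to_end(tag)
        return victim


def fill_writer_set(writer, partner, tag, dirty, pcm_writes):
    """Insert into a writer set and divert a dirty victim to its partner.

    Assumption: only a dirty block displaced from the partner reaches PCM.
    """
    victim = writer.insert(tag, dirty)
    if victim is None or not victim[1]:
        return  # nothing evicted, or the victim was clean
    displaced = partner.insert(victim[0], True)
    if displaced is not None and displaced[1]:
        pcm_writes.append(displaced[0])


writer = CacheSet(ways=2)
partner = CacheSet(ways=2)
pcm_writes = []
for tag in ["A", "B", "C", "D"]:
    fill_writer_set(writer, partner, tag, dirty=True, pcm_writes=pcm_writes)
print("blocks written to PCM:", pcm_writes)
```

In this run the two dirty victims of the writer set fit into the partner set, so no write reaches PCM until the partner set itself overflows.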


Set Balancing (cont.)

• To determine set types, two counters are used
  – Writeback counter
  – Saturation counter [Rolán_Micro2009]
    • To measure the degree to which a set can hold its working set
• Counter thresholds
  – Writeback
    • Overall average: arithmetic mean of $wb_1, wb_2, \dots, wb_n$
    • $\tau_{low\_wb}$: arithmetic mean of $wb_1, wb_2, \dots, wb_{m-1}$
    • $\tau_{high\_wb}$: arithmetic mean of $wb_m, wb_{m+1}, \dots, wb_n$
  – Saturation
    • $\tau_{sat} = K/4$, where $K$ is the set associativity

[Figure: saturation counter updated on each access miss and access hit]
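The threshold computation can be sketched as follows. Several details are assumptions rather than facts from the slides: the counters are sorted and split at the first value that is at least the overall average (the slides do not state how the split index m is chosen), the saturation counter adds one on a miss and subtracts one on a hit, and it saturates in the range 0 to 2K − 1 in the style of SBC. All names are illustrative.

```python
from statistics import mean


def writeback_thresholds(writeback_counts):
    """Return (tau_low_wb, tau_high_wb) from per-set writeback counters.

    Assumption: the split point is the first sorted counter that is at
    least the overall average; each threshold is the mean of one side.
    """
    ordered = sorted(writeback_counts)
    overall = mean(ordered)
    low_side = [wb for wb in ordered if wb < overall]
    high_side = [wb for wb in ordered if wb >= overall]
    tau_low = mean(low_side) if low_side else overall
    tau_high = mean(high_side)  # never empty: the maximum is >= the mean
    return tau_low, tau_high


def saturation_threshold(associativity):
    """tau_sat = K / 4, where K is the set associativity."""
    return associativity / 4


class SaturationCounter:
    """Saturating per-set counter (assumed: +1 on miss, -1 on hit)."""

    def __init__(self, associativity):
        self.maximum = 2 * associativity - 1  # assumed range [0, 2K - 1]
        self.value = 0

    def record(self, hit):
        step = -1 if hit else 1
        self.value = min(self.maximum, max(0, self.value + step))


# Hypothetical counters for six sets; K = 16 is also a made-up value.
tau_low_wb, tau_high_wb = writeback_thresholds([0, 2, 4, 10, 20, 24])
tau_sat = saturation_threshold(16)
print(tau_low_wb, tau_high_wb, tau_sat)
```

With this update rule, a low counter value means the set mostly hits, which fits the description of a non-writer set as underutilized.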


Set Balancing (cont.)

• For a set with a writeback count of $wb$ and a saturation counter of $sat$
  – Set is a writer if $wb \geq \tau_{high\_wb}$
  – Set is a non-writer if $sat \leq \tau_{sat}$ and $wb \leq \tau_{low\_wb}$

[Figure: percentage of sets (0 to 100) per counter range for sp, ua, stream, dedup, gcc, mcf, mix1, and mix2; each benchmark has a wb bar split into $< \tau_{low\_wb}$, $[\tau_{low\_wb}, \tau_{high\_wb}]$, and $> \tau_{high\_wb}$, and a sat bar split into $< \tau_{sat}$ and $> \tau_{sat}$; the writer and non-writer regions are marked]
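The two rules above translate directly into a small classifier. This is a sketch: the `SetType` enum and function name are illustrative, and checking the writer rule first is an assumption that only matters if the two thresholds coincide. Any set matching neither rule is treated as neutral, as in the earlier three-way classification.

```python
from enum import Enum


class SetType(Enum):
    WRITER = "writer"
    NON_WRITER = "non-writer"
    NEUTRAL = "neutral"


def classify_set(wb, sat, tau_low_wb, tau_high_wb, tau_sat):
    """Classify one LLC set from its writeback and saturation counters."""
    if wb >= tau_high_wb:
        return SetType.WRITER
    if sat <= tau_sat and wb <= tau_low_wb:
        return SetType.NON_WRITER
    return SetType.NEUTRAL


# Hypothetical thresholds and counters, chosen only to exercise each branch.
thresholds = dict(tau_low_wb=2, tau_high_wb=18, tau_sat=4)
for wb, sat in [(20, 30), (1, 3), (1, 10), (9, 2)]:
    print(wb, sat, classify_set(wb, sat, **thresholds).value)
```

Note that a set with few writebacks but a high saturation counter stays neutral, since it is not underutilized and cannot spare room for a partner's victims.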


Replacement Policy

• Frequent writeback block: a block that is frequently reused after being evicted from the cache
• Frequent writeback blocks are given a second chance upon eviction to stay in the cache and be accessed
• To avoid a performance penalty, the replacement policy is applied to the neutral or non-writer sets

[Figure: replacement in a non-writer or neutral set; labels: INSERT, ACCESS, EVICT DIRTY (LRU), FV = 0, FV = 1, write to PCM main memory]


Design

• WALL storage overhead
  – Less than 0.6% of the LLC capacity

[Figure: victim-handling flow for Set(n), driven by the set type bits (ST[0], ST[1]; ST: Set Type, encoding writer, neutral, and non-writer) and the victim's dirty and FV bits; the outcomes shown are "to PCM", "move to MRU" followed by "finding another victim", and "insert into partner of Set(n)"]


Experimental Setup

• Simulator
  – GEM5 integrated with NVMAIN
• Cores
  – 8 cores, out-of-order, 2.0GHz
• Caches
  – 32KB L1 (2 cycles), 256KB L2 (12 cycles), Shared LLC 8MB (35 cycles)
  – MOESI directory
• PCM Main Memory
  – 4GB, 4 channels, 1 rank/channel, 4 banks/rank
  – t_SET = 150ns, t_RESET = 100ns, t_RCD = 120ns
  – Cell endurance = 32×10^6 writes
  – Four memory controllers
  – One read and write queue, write drain threshold: high = 80%, low = 50%
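The slides give only the two write drain thresholds. A common reading of such a pair is a hysteresis scheme: the controller starts prioritizing writes once queue occupancy reaches the high mark and returns to serving reads once it falls to the low mark. The sketch below assumes that interpretation. The class name and the queue capacity of 10 entries are hypothetical, and the slides do not say whether the simulator behaves exactly this way.

```python
class WriteDrainPolicy:
    """Hysteresis on write-queue occupancy with high and low watermarks."""

    def __init__(self, capacity, high=0.80, low=0.50):
        self.capacity = capacity
        self.high = high
        self.low = low
        self.draining = False

    def update(self, queued_writes):
        """Return True while the controller should prioritize writes."""
        occupancy = queued_writes / self.capacity
        if not self.draining and occupancy >= self.high:
            self.draining = True   # queue is nearly full: start draining
        elif self.draining and occupancy <= self.low:
            self.draining = False  # enough room again: go back to reads
        return self.draining


# Hypothetical queue of 10 entries and a made-up occupancy trace.
policy = WriteDrainPolicy(capacity=10)
trace = [5, 7, 8, 7, 6, 5, 6]
decisions = [policy.update(queued) for queued in trace]
print(decisions)
```

The gap between the two marks keeps the controller from flipping between read and write modes on every request.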


Experimental Setup

• Workloads
  – Multi-threaded applications
    • NAS and PARSEC benchmarks
  – Multi-programmed workloads
    • SPEC CPU2006
• We run the workloads for 2 billion instructions, after a 2-billion-instruction cache warm-up phase
• We compare WALL with
  – Baseline: uses the LRU replacement policy
  – Baseline double-way: a baseline with double the associativity
  – WADE: the scheme proposed by Wang et al. [Wang_TACO2013]


LLC Writeback

• Compared to baseline, reduced by 26.6% on average
  – For writer sets, reduced by 39.5% on average
  – For non-writer sets, increased from 10.4% to 13.1%
  – For neutral sets, reduced by 28.6% on average

[Figure: normalized writebacks (0 to 1.2), Baseline versus WALL, for sp, ua, stream, dedup, gcc, mcf, mix1, mix2, and Average; each bar is broken down into non-writer, neutral, and writer sets; the average reduction of 26.6% is marked]


WALL Writeback Reduction

• Compared to baseline double-way, reduced by 23.3% on average
• Compared to WADE, reduced by 16.4% on average

[Figure: LLC Writeback Reduction (y-axis 0 to 1.2) for Baseline, Baseline double-way, WADE, and WALL across sp, ua, stream, dedup, gcc, mcf, mix1, mix2, and Average; the 23.3% and 16.4% gaps are marked]


MPKI

• Compared to baseline, reduced by 2.4% on average
  – For writer sets, reduced by 27.8% on average
  – For non-writer sets, increased from 12.0% to 16.2%
  – For neutral sets, increased from 57.3% to 59.1%

[Figure: normalized MPKI (0 to 1.2), Baseline versus WALL, for sp, ua, stream, dedup, gcc, mcf, mix1, mix2, and Average; each bar is broken down into non-writer, neutral, and writer sets; the average reduction of 2.4% is marked]


WALL MPKI

• Compared to baseline double-way, increased by 1.0% on average
• Compared to WADE, reduced by 0.3% on average

[Figure: normalized MPKI (0.5 to 1.2) for Baseline, Baseline double-way, WADE, and WALL across sp, ua, stream, dedup, gcc, mcf, mix1, mix2, and Average]


Main Memory Energy

• Compared to baseline, reduced by 19.2% on average
• Compared to baseline double-way, reduced by 16.5% on average
• Compared to WADE, reduced by 11.3% on average

[Figure: main memory energy (0.5 to 1.1) for Baseline, Baseline double-way, WADE, and WALL across sp, ua, stream, dedup, gcc, mcf, mix1, mix2, and GMEAN; the 19.2%, 16.5%, and 11.3% gaps are marked]


IPC

• Compared to baseline, improved by 6.7% on average
• Compared to baseline double-way, improved by 4.9% on average
• Compared to WADE, improved by 3.2% on average

[Figure: normalized IPC (0.9 to 1.3) for Baseline, Baseline double-way, WADE, and WALL across sp, ua, stream, dedup, gcc, mcf, mix1, mix2, and GMEAN; the 6.7%, 4.9%, and 3.2% gaps are marked]


PCM Lifetime

• Compared to baseline scheme, increased by 1.25× on average
• Compared to baseline double-way, increased by 1.21× on average
• Compared to WADE, increased by 1.17× on average

[Figure: lifetime in years (y-axis ticks from 1 to 256 in powers of two) for Baseline, Baseline double-way, WADE, and WALL across sp, ua, stream, dedup, gcc, mcf, mix1, mix2, and GMEAN]


Conclusion

• We proposed WALL to reduce the number of writebacks from the LLC to the PCM main memory
• WALL includes:
  – A set-balancing mechanism
    • Uses the non-writer sets as storage for the writer sets' writebacks
  – A writeback-aware replacement policy
    • Keeps the frequently reused dirty lines of the sets
• Results show that WALL can achieve
  – Writeback reduction, by 26.6% on average
  – PCM lifetime enhancement, by 1.25× on average
  – Main memory energy reduction, by 19.2% on average


Thank You! Questions?
