Virtual Private Caches, ISCA’07. Kyle J. Nesbit, James Laudon, James E. Smith. Presenter: Yan Li


Page 1: Virtual Private Caches

Virtual Private Caches

ISCA’07

Kyle J. Nesbit, James Laudon, James E. Smith

Presenter: Yan Li

Page 2: Virtual Private Caches

CMP-based System

Chip-level Multiprocessor (CMP)
  Multiple processor cores are implemented on a single chip
  Multithreading support

Intel Core 2 Duo E6750

Page 3: Virtual Private Caches

CMP-based System (2)

Resource sharing: cache capacity/bandwidth, main memory, ...
  Pros: higher resource utilization
  Cons: inter-thread interference

Unpredictable performance / no QoS!

Many applications running on CMP-based systems require Quality of Service

Page 4: Virtual Private Caches

Quality of Service

QoS is required by many applications:
  Soft real-time applications: video games
  Fine-grain parallel applications: scheduling & synchronization
  Server consolidation: hosting services

QoS objective in a CMP-based system: provide an upper bound on thread execution time regardless of other thread activity

Page 5: Virtual Private Caches

Outline

Introduction
QoS Framework
Virtual Private Cache - VPC Arbiter
Virtual Private Cache - Capacity Manager
Performance Evaluation
Conclusions

Page 6: Virtual Private Caches

Overview of VPM

Virtual Private Machine (VPM): a set of allocated hardware resources
  Processors, bandwidth, memory space, ...
Each thread is allocated a share of hardware resources based on policies
  Policies are set by applications & system software
Hardware mechanisms enforce the allocated resources

Page 7: Virtual Private Caches

[Figure labels: System hardware, VPM]

Page 8: Virtual Private Caches

Objectives of VPM

Performance isolation
  Thread performance is as good as on a real private machine having the same resources
Dynamic distribution of excess resources
  Unallocated resources
  Allocated but unused resources
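
The allocation idea behind a VPM can be sketched in a few lines. This is only an illustration, not the paper's interface: the `VPM` class, its field names, and `validate_allocation` are hypothetical, and only two cache resources are modeled. The sketch checks that a policy does not over-commit a resource and reports the unallocated share that would be available for dynamic redistribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VPM:
    """Hypothetical description of one thread's Virtual Private Machine."""
    thread_id: int
    cache_bandwidth_share: float  # fraction of shared-cache bandwidth, 0..1
    cache_capacity_share: float   # fraction of shared-cache capacity, 0..1


def validate_allocation(vpms):
    """Reject over-committed policies and return the unallocated shares."""
    total_bw = sum(v.cache_bandwidth_share for v in vpms)
    total_cap = sum(v.cache_capacity_share for v in vpms)
    if total_bw > 1.0 or total_cap > 1.0:
        raise ValueError("allocated shares exceed the physical resource")
    # Whatever is left over is unallocated (excess) resource.
    return 1.0 - total_bw, 1.0 - total_cap
```

For example, two threads holding bandwidth shares of 0.5 and 0.25 leave 0.25 of the bandwidth unallocated.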

Page 9: Virtual Private Caches

Virtual Private Cache

Microarchitecture-level mechanism
Main components
  VPC Arbiter: tag & data array bandwidth sharing
  VPC Capacity Manager: cache capacity sharing
Advantages
  Performance isolation
  Improved utilization

Page 10: Virtual Private Caches

Outline

Introduction
QoS Framework
Virtual Private Cache - VPC Arbiter
Virtual Private Cache - Capacity Manager
Performance Evaluation
Conclusions

Page 11: Virtual Private Caches

VPC Arbiter - Implementation (1)

Each data & tag array has an arbiter
Each arbiter has
  A FIFO buffer for each thread
  1 clock register R.clk: determines arrival time
  R.Li & R.Si for thread i: virtual service/start time

Page 12: Virtual Private Caches

VPC Arbiter - Implementation (2)

R.Li: virtual service time of a request from thread i
  Determined by L, the latency of the shared cache, and thread i's fraction of resources
R.Si: virtual start time of the next request of thread i
  The time at which the resource is available for the next request of thread i
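
The defining formula did not survive on this slide. A plausible reconstruction, assuming the usual fair-queuing convention that virtual service time is the real latency scaled by the inverse of the thread's share, is shown below. The symbol \phi_i for thread i's fraction of resources is an assumption, as are the numbers in the worked example.

```latex
R.L_i = \frac{L}{\phi_i},
\qquad \text{e.g. } L = 8 \text{ cycles},\ \phi_i = 0.25
\;\Rightarrow\; R.L_i = \frac{8}{0.25} = 32 \text{ cycles}
```

Under this reading, a thread with a quarter of the bandwidth sees each access as four times as long as on an unshared cache, which is how a private cache with a quarter of the bandwidth would behave.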

Page 13: Virtual Private Caches


Fair Queuing Scheduling

Request Arrival:

Arbiter Calculation of virtual finish time:

Arbiter Selection: select the request with the earliest Fi
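
The three steps above can be sketched as a small simulation. The slide's equations are not preserved, so the rules used here are assumptions taken from standard fair queuing: on arrival at an empty FIFO the virtual start time becomes max(R.clk, R.Si), the virtual finish time is Fi = R.Si + R.Li, and R.Li is latency divided by share. The class and method names are hypothetical.

```python
from collections import deque


class FairQueuingArbiter:
    """Sketch of a fair-queuing arbiter for one shared cache array.

    Assumed rules (not taken from the slides): R.Li = latency / share,
    and an idle thread's R.Si is pulled up to R.clk on arrival.
    """

    def __init__(self, latency, shares):
        self.clk = 0                                          # R.clk
        self.L = {i: latency / s for i, s in shares.items()}  # R.Li
        self.S = {i: 0.0 for i in shares}                     # R.Si
        self.fifo = {i: deque() for i in shares}              # one FIFO per thread

    def tick(self, cycles=1):
        """Advance the arbiter clock."""
        self.clk += cycles

    def arrive(self, thread, request):
        # An idle thread cannot bank unused bandwidth: its virtual
        # start time is never earlier than the current clock.
        if not self.fifo[thread]:
            self.S[thread] = max(self.clk, self.S[thread])
        self.fifo[thread].append(request)

    def select(self):
        """Serve the head request with the earliest virtual finish time Fi."""
        ready = [i for i, q in self.fifo.items() if q]
        if not ready:
            return None
        winner = min(ready, key=lambda i: self.S[i] + self.L[i])
        self.S[winner] += self.L[winner]  # next request starts at this Fi
        return winner, self.fifo[winner].popleft()
```

With shares of 0.5 and 0.25 and both threads continuously backlogged, this sketch serves the first thread twice as often as the second, matching the ratio of their shares.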

Page 14: Virtual Private Caches

Arbiter Fairness Policy

Excess bandwidth is distributed to the threads that have received the least excess bandwidth in the past

Page 15: Virtual Private Caches

Outline

Introduction
QoS Framework
Virtual Private Cache - VPC Arbiter
Virtual Private Cache - Capacity Manager
Performance Evaluation
Conclusions

Page 16: Virtual Private Caches

Implementation

Set-associative replacement policy
Each thread receives
  The same number of sets as the shared cache
  At least its allocated fraction of the ways in the shared cache
Replacement policy
  LRU line owned by a thread i such that thread i owns more than its allocated ways
  Otherwise, LRU line owned by the thread that is requesting the replacement
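
The two replacement rules can be sketched as a victim-selection function for a single cache set. This is an illustration under assumptions: the function name, the `(owner, lru_age)` representation of a line, and the `allocated_ways` mapping are hypothetical, and a larger `lru_age` is taken to mean less recently used.

```python
def choose_victim(lines, allocated_ways, requester):
    """Return the index of the way to evict from one cache set.

    lines: list of (owner_thread, lru_age) tuples, one per way.
    allocated_ways: mapping thread -> number of ways it is entitled to.
    requester: thread that missed and needs a line.
    """
    # Count how many ways each thread currently owns in this set.
    owned = {}
    for owner, _ in lines:
        owned[owner] = owned.get(owner, 0) + 1

    # Rule 1: LRU line among threads that own more ways than allocated.
    over = [idx for idx, (owner, _) in enumerate(lines)
            if owned[owner] > allocated_ways.get(owner, 0)]
    if over:
        return max(over, key=lambda idx: lines[idx][1])

    # Rule 2: otherwise the requester evicts its own LRU line.
    # (This sketch assumes the requester has a nonzero allocation,
    # so it owns at least one line whenever rule 1 does not apply.)
    mine = [idx for idx, (owner, _) in enumerate(lines) if owner == requester]
    return max(mine, key=lambda idx: lines[idx][1])
```

In a 4-way set where each of two threads is allocated 2 ways and thread 0 currently holds 3, a miss by thread 1 evicts thread 0's least recently used line, which restores thread 1's share.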

Page 17: Virtual Private Caches

Outline

Introduction
QoS Framework
Virtual Private Cache - VPC Arbiter
Virtual Private Cache - Capacity Manager
Performance Evaluation
Conclusions

Page 18: Virtual Private Caches

Experiment Setup

Two microbenchmarks to stress the performance isolation feature
  Loads: load operations with continuous read hits
  Stores: store operations with continuous write hits
SPEC CPU2000 benchmark suite
QoS performance metrics
  IPC
  Data array utilization

Page 19: Virtual Private Caches

Other Arbiters

Read over Write
  Prioritize reads over writes
Read over Write, First Come First Served
  Prioritize reads over writes
  Prioritize oldest requests
Round Robin
  Interleave requests uniformly and consistently
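
As a point of comparison, the Read over Write, First Come First Served policy can be sketched as a single ordering rule. The function name and the tuple layout are hypothetical. Note that the requesting thread plays no part in the decision, so a policy of this kind by itself gives no per-thread bandwidth guarantee.

```python
def row_fcfs_pick(pending):
    """Pick the next request under Read over Write, First Come First Served.

    pending: list of (arrival_cycle, is_write, thread) tuples.
    """
    # Reads (is_write=False) sort before writes; ties go to the oldest request.
    return min(pending, key=lambda r: (r[1], r[0]))
```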

Page 20: Virtual Private Caches


Microbenchmark

Page 21: Virtual Private Caches


SPEC

Page 22: Virtual Private Caches

Conclusions

VPC: hardware mechanism of the VPM QoS framework
  VPC arbiter & capacity manager
VPC can achieve global QoS objectives
Issues
  Local QoS objectives assume performance monotonicity

Page 23: Virtual Private Caches

Thank You! & Questions?