52
Memory Servers for Multicore Systems Rodolfo Pellizzoni and Heechul Yun

Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

Memory Servers for Multicore Systems

Rodolfo Pellizzoni and Heechul Yun

Page 2: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Outline

•  Delay analysis and compositionality

•  Solution: Memory Server

•  Core-level analysis

•  System-level analysis

•  Evaluation

•  Conclusions

1

Page 3: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Outline

•  Delay analysis and compositionality

•  Solution: Memory Server

•  Core-level analysis

•  System-level analysis

•  Evaluation

•  Conclusions

1

Page 4: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Resource Contention

2

Core Core Core

Shared Physical Resource

INTERFERENCE!!!

t t t

We need a resource delay analysis to bound such effects

Page 5: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Analysis Taxonomy

•  Request-driven bound: –  Compute maximum per-Request Delay –  : # requests task under analysis –  Delay = –  Computed delay can be very pessimistic!

•  Job-driven bound: –  : bound on # requests by other cores –  Delay = –  Non compositional – requires knowledge of tasks

running on all cores! 3

RD

H

RD ⇥H

Hf(H,H)

Page 6: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Example: DRAM Write Buffering •  Reads are prioritized over writes; writes are buffered and

transmitted in non-preemptive batches (ARM A15 – 18 req.) •  If a write batch begins right before a read under analysis

arrives, the request is delayed by 18 write requests…

4

Memory Controller

13

Read request buffer

Write request buffer

Bank 1 scheduler

Channel scheduler

Bank 2 scheduler

Bank N scheduler

DRAM chip

Memory requests from cores

Writes are buffered and processed opportunistically

Page 7: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

COTS Arbiters are Unfair

•  The main issue: unfair arbitration –  cores –  A request can be delayed by other requests

•  Many other examples… –  First-Ready FCFS DRAM arbitration –  Non-blocking caches –  Wormhole NoC arbitration

5

M

� M � 1

Page 8: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

How bad can it be?

6

Results with SPEC2006 Benchmarks

• Main source of pessimism: – The pathological case of write (LLC write-backs) processing

25

Even with just 4 cores, measured execution time is ~8x larger

Provable bound is much higher due to write buffering

Page 9: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Summary: The Problem •  Compositional analysis is highly desirable •  Develop each application (core) independently •  But bounds are bad due to request-driven analysis

7

Core#1–Applica/on#1

Shared Physical Resource

Core#3–Applica/on#3

Core#2–Applica/on#2

Page 10: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Summary: The Problem •  Compositional analysis is highly desirable •  Develop each application (core) independently •  But bounds are bad due to request-driven analysis

7

Shared Physical Resource

CoreCoret

t

t

Page 11: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Summary: The Problem •  Or forget about compositionality to use job-driven bounds... •  ... but then any change to other applications forces me to

redo timing analysis

8

Shared Physical Resource

t

t

t t

t

t

t

t

Page 12: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Summary: The Problem •  Or forget about compositionality to use job-driven bounds... •  ... but then any change to other applications forces me to

redo timing analysis

8

Shared Physical Resource

t

t

t t

t

t

t

t

Let me change a compiler flag here…

Page 13: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Summary: The Problem •  Or forget about compositionality to use job-driven bounds... •  ... but then any change to other applications forces me to

redo timing analysis

8

Shared Physical Resource

t

t

t t

t

t

t

t

Let me change a compiler flag here… OMG!!!

Page 14: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Summary: The Problem •  Or forget about compositionality to use job-driven bounds... •  ... but then any change to other applications forces me to

redo timing analysis

8

Shared Physical Resource

t

t

t t

t

t

t

t

Let me change a compiler flag here… OMG!!!

We need a way to keep compositionality but use job-driven bounds!

Page 15: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Outline

•  Delay analysis and compositionality

•  Solution: Memory Server

•  Core-level analysis

•  System-level analysis

•  Evaluation

•  Conclusions

1

Page 16: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory Server •  How do we solve the problem?

1.  Add a server with budget to each core p

2.  Server constraints the core to make no more than accesses to the shared resource every time units

3.  Now we know the maximum resource accesses by other cores, so we can use job-driven analysis!

9

Qp

Qp

P

Page 17: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Server Implementation

10

•  Reserve per-core memory bandwidth via the OS scheduler –  Use h/w PMC to monitor memory request rate

1ms 2ms 0

Suspend the RT task

Budget

Core activity

2 1

computation memory fetch

Qp

Regulation period P

Heechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isolation in Multi-core Platforms, RTAS’13

Page 18: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory vs Hierarchical Server •  Isn’t this similar to a hierarchical server?

11

t

t

t t

t

Application#1 Application#2

Sched. Interface Sched. Interface

Yes

•  Multiple applications on the same core

Page 19: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory vs Hierarchical Server •  Isn’t this similar to a hierarchical server?

11

t

t

t t

t

Application#1 Application#2

Sched. Interface Sched. Interface

•  Step 1: perform local analysis to derive sched. interface

Yes

Page 20: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory vs Hierarchical Server •  Isn’t this similar to a hierarchical server? Yes

11

t

t

t t

t

Application#1 Application#2

Sched. Interface Sched. Interface

System-level Analysis

•  Step 2: perform system-level analysis to check sched. –  Ex:

XQp/P Ub

Page 21: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory vs Hierarchical Server •  For memory server:

12

t

t

t t

t

Application#1 Application#2

Sched. Interface Sched. Interface

•  Multiple application on different cores

Page 22: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory vs Hierarchical Server •  For memory server:

12

t

t

t t

t

Application#1 Application#2

Sched. Interface Sched. Interface

•  Step 1: perform core-level analysis to derive interface

Page 23: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory vs Hierarchical Server •  For memory server:

12

t

t

t t

t

Application#1 Application#2

Sched. Interface Sched. Interface

System-level Analysis

•  Step 2: perform system-level analysis to check sched.

Page 24: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Interfaces and Compositionality •  In both cases, the interface represents a contract

–  Each application’s developer guarantees that he will respect the contract

–  Check schedulability based on the contracts for the other applications, not the tasks inside

13

I want to add a new task to my application...

Do what you want, just respect the contract

Page 25: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory vs Hierarchical Server •  Key difference: interfaces •  For fixed , the schedulability of an application in a

hierarchical server depends only on the assigned budget

14

PQp

0 2 4 6 8 Qp

Page 26: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

2

4

6

8

10

QBp

Memory vs Hierarchical Server •  Key difference: interfaces •  … but the delay (hence schedulability) for memory server also

depends on the total budget assigned to other cores!

14 0 2 4 6 8

0 Qp

Key intuition: there is no global resource utilization bound – it depends on the tasks within the application

QBp

Page 27: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Detailed Contribution •  Core-level analysis computes 2D feasibility region for a core

–  Based on state-of-the-art DRAM delay analysis; but any request-driven / job-driven analysis could be used

•  System-level analysis determines feasibility –  Is there a budget assignment that satisfies all interfaces?

•  Both analyses have reasonable complexity

15

Page 28: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Outline

•  Delay analysis and compositionality

•  Solution: Memory Server

•  Core-level analysis

•  System-level analysis

•  Evaluation

•  Conclusions

1

Page 29: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Core-level Analysis •  Fixed priority scheduling

16

t

t

t

Bini & Buttazzo: Schedulability analysis of periodic fixed priority systems, TC’04

For each task…

•  Compute set of time instants •  Iff for any time instant : computation demand( ) ,

task is schedulable t̄ [0, t̄) t̄

Page 30: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Core-level Analysis

16

t

t

t

•  Fixed priority scheduling Bini & Buttazzo: Schedulability analysis of periodic fixed priority systems, TC’04

For each task…

•  With resource interference: computation demand( ) + resource delay( ) [0, t̄) [0, t̄) t̄

Page 31: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Resource Delay: Regulation •  Regulation: tasks in the busy interval are delayed because

they exhaust the budget

•  Regulation delay decreases with larger budget values

17

resource delay( ) computation demand( ) [0, t̄) [0, t̄) t̄�

Qp

QBp

Qp

Qp

Page 32: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Resource Delay: Contention •  Contention: tasks in the busy interval are delayed due to

requests of other cores

•  Contention delay decreases with smaller budget values

18

QBp

Qp

QBp

QBp

resource delay( ) computation demand( ) [0, t̄) [0, t̄) t̄�

Page 33: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Memory Delay: One Time Instant •  Key obs.: resource delay is the max of regulation, contention

–  Independent of analysis; true bc of how regulation works •  Task is schedulable at if it meets both regulation and

contention constraints

19

Qp

QBp

Qp

QBp

Page 34: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Interface: One Task •  Task is schedulable if it meets constraints at any time

instant

20

Qp

QBp

Qp

QBp

t̄1Region for Region for t̄2

Page 35: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Interface: One Task

20

Qp

QBp

Region for ⌧1

•  Task is schedulable if it meets constraints at any time instant t̄

Page 36: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Interface: One Core

21

•  Task set is schedulable if all tasks are schedulable

Qp

QBp

Qp

QBp

Region for ⌧1Region for ⌧2

Page 37: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Interface: One Core

21

•  Task set is schedulable if all tasks are schedulable

Qp

QBp

Region for the core

Page 38: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Outline

•  Delay analysis and compositionality

•  Solution: Memory Server

•  Core-level analysis

•  System-level analysis

•  Evaluation

•  Conclusions

1

Page 39: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

System-level analysis

22

0 2 4 6 8

0

2

4

6

8

Qp

QBp

0 2 4 6 8

0

2

4

6

8

Qp

QBp

4

4

3

3

•  Start with smallest budget values •  Determine budget of other cores •  If not feasible, move to next budget value

Interface for Core#1 Interface for Core#2

Qp

QBp

Page 40: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

System-level analysis

22

0 2 4 6 8

0

2

4

6

8

Qp

0 2 4 6 8

0

2

4

6

8

Qp

6

6

5

5

Interface for Core#1 Interface for Core#2

QBp QBp

•  Start with smallest budget values •  Determine budget of other cores •  If not feasible, move to next budget value

Qp

QBp

Page 41: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

System-level analysis

22

0 2 4 6 8

0

2

4

6

8

Qp

0 2 4 6 8

0

2

4

6

8

Qp

6

6

7

7

Interface for Core#1 Interface for Core#2

QBp QBp

•  Start with smallest budget values •  Determine budget of other cores •  If not feasible, move to next budget value

Qp

QBp

OK!

Page 42: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Result Summary •  System-level analysis is optimal (based on computed

core-level interfaces) –  If there is a feasible budget assignment, it will find it

•  Computation complexity –  cores, tasks per core, time instances per task –  Core-level analysis is –  System-level analysis is

•  Note: is , but generally much smaller

23

M N K

O�N ·K · log(N ·K)

O�M2 ·N ·K)

K O�N ·max(deadline)/min(period)

Page 43: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Outline

•  Delay analysis and compositionality

•  Solution: Memory Server

•  Core-level analysis

•  System-level analysis

•  Evaluation

•  Conclusions

1

Page 44: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Evaluation

•  Randomly generated task sets

•  Only shared resource is DRAM

•  FR-FCFS arbitration, private banks, write buffering

•  ARM A15, LPDDR2@533Mhz, = 1ms

•  Random task bandwidth in –  : fraction of maximum memory bandwidth

24

Heechul Yun et al: Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems, ECRTS’15

P

[0, Bmax

]

Bmax

Page 45: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Analyses •  non-comp

–  Non-compositional: all tasks on all cores are known –  No regulation, but can use job-driven analysis

•  rw-reg •  read-reg

–  Memory server. PMCs measure either read+write requests, or read only

•  legacy –  All servers have the same fixed budget

•  unreg –  No regulation, compositional, request-driven only

25

Page 46: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

M = 4, N = 10, = 1%

26

Bmax

Per-Core Utilization U0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Perc

enta

ge S

ched

ulab

le T

ask

Sets

0

10

20

30

40

50

60

70

80

90

100

non-comprw-regread-reglegacyunreg

This is the price of compositionality with our approach

Page 47: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

M = 4, N = 10, = 1%

26

Bmax

Per-Core Utilization U0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Perc

enta

ge S

ched

ulab

le T

ask

Sets

0

10

20

30

40

50

60

70

80

90

100

non-comprw-regread-reglegacyunreg

This is the price of compositionality without regulation

Page 48: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

M = 4, N = 10, = 1%

26

Bmax

Per-Core Utilization U0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Perc

enta

ge S

ched

ulab

le T

ask

Sets

0

10

20

30

40

50

60

70

80

90

100

non-comprw-regread-reglegacyunreg

This is the improvement over previous regulation approach

Renato Mancuso et al: WCET(m) Estimation in Multi-Core Systems using Single Core Equivalence, ECRTS’15

Page 49: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

M = 4, N = 10,

27

Bmax

2 [0%, 10%]

Maximum Task Bandwidth Bmax in %0 1 2 3 4 5 6 7 8 9 10

Sche

dula

ble

Util

izat

ion

for 5

0% T

hres

hold

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

non-comprw-regread-reglegacyunreg

Page 50: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Outline

•  Delay analysis and compositionality

•  Solution: Memory Server

•  Core-level analysis

•  System-level analysis

•  Evaluation

•  Conclusions

1

Page 51: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

/ 28

Conclusions •  Resource regulation leads to a “better” compositionality by

establishing contracts (interface) on resource usage

•  The approach works if the cost of regulation is less than the improvement over request-driven analysis

•  Downside: interface is more complex compared to hierarchical scheduling –  Note: this paper computes a tight interface – it might be

easier to use a (linearized?) subset of feasibility region

•  Future work: –  Multiple applications per core –  Multiple resources –  More complex resources (NoC) 28

Page 52: Memory Servers for Multicore Systems - RTAS 20162016.rtas.org/wp-content/uploads/2016/06/81.pdfHeechul Yun et al: MemGuard: Memory Bandwidth Reservation System for Efficient Performance

Thank You!

Questions?