MemTracker: Efficient and Programmable Support for Memory Access Monitoring and Debugging
Guru Venkataramani, Brandyn Roemer, Yan Solihin, Milos Prvulovic

Venkataramani HPCA’07 2

Introduction

• Software is increasingly complex

• More complexity means more bugs

• Memory bugs are most common
  – Many are security vulnerabilities

• How to catch them efficiently?

Debugging and Monitoring

• Maintain information/state about memory
• Low performance overhead in “always-on” mode
  – Hard problem
• Flexible/Programmable
  – Even harder!

Challenges

• Software approach
  – Flexible
  – Large (2X to 30X) slowdown
• Hardware approach
  – Faster
  – Most are checker-specific
  – Others need software intervention too often

Related Work

• DISE [ISCA’03]
  + Pattern-matches instructions, dynamically injects instrumentation
  – Modifies front end of pipeline
  – Adds extra code to instruction stream
• Mondrian [ASPLOS’02]
  + Fine-grain protection: different permissions for adjacent words
  – Software intervention for permission updates
  – Complex hardware (trie structure)

Objectives

• MemTracker
  * Maintains state for every memory word
  * No software intervention for most state checks and updates
  * Efficient checks and updates even when nearby locations have different states
  * Programmable (can implement different checkers)

What is MemTracker?

• A programmable state machine
  – (State, Event) → (State, Exception)
• Supports up to 16 states (4 state bits/word)
  – Not a fundamental limit; can be extended
• All memory actions are events
  – Memory accesses: loads, stores
  – User events (affect only state)

Example Heap Checker

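[Figure: state diagram of the heap checker with states NON-HEAP, UNALLOC, ALLOC/UNINIT and INIT]

• NON-HEAP: Load/Store stay in NON-HEAP; Malloc/Free → ERROR
• UNALLOC: Malloc → UNINIT; Load/Store/Free → ERROR
• UNINIT (allocated, uninitialized): Store → INIT; Free → UNALLOC; Load → ERROR; Malloc → ERROR
• INIT: Load/Store stay in INIT; Free → UNALLOC; Malloc → ERROR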

MemTracker State Table

State ↓ / Event →   UEVT0 (Alloc)   UEVT1 (Free)   LOAD   STORE
0 (Non-Heap)        0 E             0 E            0      0
1 (UnAlloc)         2               1 E            1 E    1 E
2 (UnInit)          2 E             1              2 E    3
3 (Init)            3 E             1              3      3

Each entry is the next state; E marks an exception.
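The heap-checker state table can be exercised in software to see how a sequence of events moves one memory word between states. The following is a minimal Python sketch, not the hardware design: the names `Event`, `TABLE`, `step` and `run` are hypothetical, and starting a word in the UnAlloc state is an assumption made only for the example.

```python
from enum import IntEnum


class Event(IntEnum):
    ALLOC = 0  # UEVT0 in the slide's table
    FREE = 1   # UEVT1 in the slide's table
    LOAD = 2
    STORE = 3


# State numbers as used in the slide's table.
NON_HEAP, UNALLOC, UNINIT, INIT = range(4)

# TABLE[state][event] = (next_state, raises_exception), copied from the table.
TABLE = [
    [(0, True), (0, True), (0, False), (0, False)],   # 0 (Non-Heap)
    [(2, False), (1, True), (1, True), (1, True)],    # 1 (UnAlloc)
    [(2, True), (1, False), (2, True), (3, False)],   # 2 (UnInit)
    [(3, True), (1, False), (3, False), (3, False)],  # 3 (Init)
]


def step(state, event):
    """Look up (next_state, raises_exception) for one event on one word."""
    return TABLE[state][event]


def run(events, state=UNALLOC):
    """Apply events in order; return the final state and the indices that raised."""
    faults = []
    for i, event in enumerate(events):
        state, raised = step(state, event)
        if raised:
            faults.append(i)
    return state, faults


# Use-after-free: the final LOAD hits a word that is back in UnAlloc.
final_state, faults = run(
    [Event.ALLOC, Event.STORE, Event.LOAD, Event.FREE, Event.LOAD]
)
```

Because the checker is only a table, implementing a different checker in this sketch means supplying a different `TABLE`, which mirrors the programmability claim on the slides.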

State Storage

[Figure: the application’s virtual address space, split into two regions]

• Protected, reserved virtual space for state data (State)
• Normal virtual memory space for code, data, stack and heap

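State Lookup – Word access only

[Figure: state lookup datapath]

• Inputs: State Base Reg (0xF0000000), data address (0xABCD = 1010 1011 1100 1101), number of state bits (2)
• The upper data address bits (1010 1011 1100) are added to State Base Reg to form the state address (0xF0000ABC)
• The cache returns the state bits stored at the state address (1100 1010)
• A MUX, driven by the remaining low-order address bits (1101), selects the state of the accessed word: State (11)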
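The lookup on this slide can be reproduced with a few lines of address arithmetic. This is a sketch under stated assumptions rather than the actual hardware: it assumes 4-byte words, state fields packed starting from the least significant bits of each state byte (a bit ordering chosen because it reproduces the slide's result), and a state width that divides 8 so that no field straddles a byte. The function names are hypothetical.

```python
WORD_BYTES = 4  # assumption: 32-bit words, consistent with the slide's example


def state_location(data_addr, state_base, bits_per_word):
    """Return (state byte address, bit shift within that byte) for a data address."""
    if 8 % bits_per_word != 0:
        raise ValueError("sketch assumes the state width divides 8")
    word_index = data_addr // WORD_BYTES
    bit_offset = word_index * bits_per_word
    return state_base + bit_offset // 8, bit_offset % 8


def extract_state(state_byte, shift, bits_per_word):
    """Select one word's state field out of a state byte (the MUX on the slide)."""
    return (state_byte >> shift) & ((1 << bits_per_word) - 1)


# Values from the slide: data address 0xABCD, base 0xF0000000, 2 state bits.
state_addr, shift = state_location(0xABCD, 0xF0000000, 2)
# The slide shows the cache returning the bits 1100 1010 for that state address.
state = extract_state(0b11001010, shift, 2)
```

With these inputs the sketch yields state address 0xF0000ABC and state 0b11, matching the values shown on the slide.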

Caching State information

Shared Caching

• No additional resources for state
• Data and state blocks compete for cache lines in existing caches
• Loads/stores already have data lookups; now they also need state lookups
• These state lookups double the L1 port contention

Caching State information

Interleaved Caching

• Expand each cache line to store state for its data
  + One lookup finds both data and state
  – L1 cache is larger and slower even when not checking

Caching State information

Split Caching

• Dedicated (small) state L1 cache
• Provides separate ports for state lookups
• Leaves data L1 cache alone
• When NOT checking, turn SL1 off

Caching State information

• Options: Shared Caching, Interleaved Caching, Split Caching
• L2 and below use shared caching (no additional space for state)
  – L2 is single-ported, but rarely a contention problem (L1 filters out most accesses)
  – State is smaller than data, so it needs less bandwidth and capacity
• We use Split L1 and Shared L2 and below in the rest of the talk

Pipeline

[Figure: baseline pipeline]

• Stages: IF, ID, REN, REG, EXE, MEM, WB, CMT
• Regions: Front End, Out of Order, Back End
• Cache: Data L1

Pipeline Modifications

[Figure: modified pipeline]

• Stages: IF, ID, REN, REG, EXE, MEM, WB, Pre-CMT, CHK, CMT
• Caches: State L1, Data L1
• Other labeled elements: State Forwarding, Prefetch

Other Issues

• OS issues
  – Context switches (fast)
  – Paging (same as data)
• Multiprocessor implementation
  – Coherence
    • State information treated same as data
  – Consistency
    • Key issue: atomicity of state and data
    • Example: the same instruction accesses new data, old state
• More details in paper!

Evaluation Platform

• SESC, Out of Order, 5 GHz
• L1 Data cache
  – 16 KB, 2-way, 2 ports, 32B block
• L1 State cache (split caching)
  – 2 KB, 2-way, 2 ports, 32B block
• L2 cache
  – 2 MB, 4-way, 1 port, 32B block
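The cache sizes above can be related to each other with simple arithmetic on how much state a given amount of data needs. The sketch below assumes 32-bit words and 4 state bits per word (the maximum mentioned earlier in the talk); the function name is hypothetical, and the slide itself does not say how the 2 KB size was chosen.

```python
WORD_BITS = 32  # assumption: 32-bit words


def state_bytes_for(data_bytes, state_bits_per_word):
    """Bytes of state needed to cover data_bytes of data."""
    words = data_bytes * 8 // WORD_BITS
    return words * state_bits_per_word // 8


# State needed to cover everything a 16 KB data L1 can hold, at 4 bits/word.
l1_state_bytes = state_bytes_for(16 * 1024, 4)

# State needed to cover 2 MB of data at 4 bits/word.
l2_state_bytes = state_bytes_for(2 * 1024 * 1024, 4)
```

Under these assumptions state is one eighth the size of the data it describes, so a 2 KB state L1 covers as much data as the 16 KB data L1 holds.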

Checkers used in Evaluation

• Heap Checker (example seen before)
  – 4 states: NonHeap, UnAlloc, Alloc, Init
• Return Address Checker
  – Detects return address modifications
  – 3 states: NotRA, GoodRA, BadRA
• HeapChunks Checker
  – Detects sequential heap buffer overflows
  – 2 states: Delimit, NotDelimit
• Combined Checker
  – Combines all the above
  – 7 states, 4 state bits per word (although only 3 are actually needed)
  – Most demanding; default in evaluation
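The state-bit counts quoted for these checkers follow from the number of states each one uses. As a small illustration, the hypothetical helper below computes the minimum number of bits needed to encode a given number of states; it only checks the minimum and does not explain why the combined checker is configured with 4 bits, which the slide does not state.

```python
def state_bits_needed(num_states):
    """Minimum number of bits that can encode num_states distinct states."""
    if num_states < 1:
        raise ValueError("need at least one state")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2; use 1 bit for n == 1.
    return max(1, (num_states - 1).bit_length())


heap_bits = state_bits_needed(4)      # Heap Checker
ra_bits = state_bits_needed(3)        # Return Address Checker
chunks_bits = state_bits_needed(2)    # HeapChunks Checker
combined_bits = state_bits_needed(7)  # Combined Checker
```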

Performance of Checkers on SPEC

[Bar chart. Y-axis: run time overhead, 0% to 6%. X-axis: bzip2, eon, mcf, art, swim, Average. Series: HeapChunks, Heap, Stack, Combined. Labeled value: 2.7%.]

Sensitivity - Prefetching

[Bar chart. Y-axis: run time overhead, 0% to 16%. X-axis: mcf, vortex, art, equake, mgrid, Average. Series: Imprecise, Precise without Prefetch, Precise with Prefetch.]

MemTracker vs. other schemes

[Bar chart. Y-axis: run time overhead, 0% to 50%. X-axis: gcc, gzip, art, apsi, swim, average. Series: MemTracker, Mondrian (30-cycle update), Software (5-cycle check).]

Conclusions

• MemTracker
  – Monitors and checks memory accesses
  – Can be programmed to implement different checkers
  – Low performance overheads
    • 2.7% average and 4.7% worst case for the combined checker on SPEC
  – Tested on injected bugs: it finds them!
• More details in paper

Thank you!

Questions?

BACKUP SLIDES

Sensitivity – State Cache Size

[Bar chart. Y-axis: run time overhead, 0% to 10%. X-axis: crafty, eon, mcf, art, applu, Average. Series (state cache size): 1 KByte, 2 KByte, 4 KByte, 16 KByte.]

Caching Configurations on SPEC

[Bar chart. Y-axis: run time overhead, 0% to 25%. X-axis: mcf, parser, vortex, art, applu, Average. Series: Shared, Interleaved, Interleaved (+1 Port), Split.]