Whose Cache Line Is It Anyway? Operating System Support for Live Detection and Repair of False Sharing. Mihir Nanavati, Mark Spear, Nathan Taylor, Shriram Rajagopalan, Dutch T. Meyer, William Aiello, and Andrew Warfield, University of British Columbia.
Whose Cache Line Is It Anyway?
Operating System Support for Live Detection and Repair of False Sharing
Mihir Nanavati, Mark Spear, Nathan Taylor, Shriram Rajagopalan, Dutch T. Meyer, William Aiello, and Andrew Warfield
University of British Columbia
2
3
4
Mondriaan Memory Protection [ASPLOS ’02, SOSP ’05]
5
Byte-granularity, software-only remapping
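To make "byte-granularity remapping" concrete, here is a minimal sketch of an address translation that redirects only selected byte ranges and leaves every other byte where it was. It is an illustration under stated assumptions, not the mechanism from the talk: `RemapRule`, `translate`, and the example addresses are all hypothetical.

```python
from typing import NamedTuple, Sequence


class RemapRule(NamedTuple):
    """Hypothetical rule: redirect `length` bytes at `start` to `target`."""
    start: int
    length: int
    target: int


def translate(address: int, rules: Sequence[RemapRule]) -> int:
    """Return the redirected address, or the original one if no rule matches."""
    for rule in rules:
        if rule.start <= address < rule.start + rule.length:
            return rule.target + (address - rule.start)
    return address  # bytes not covered by any rule stay where they are


# Assumed example: move the 8 bytes at 0x308 to a separate location,
# leaving the neighbouring bytes at 0x300 untouched.
rules = [RemapRule(start=0x308, length=8, target=0x5308)]

print(hex(translate(0x300, rules)))  # 0x300
print(hex(translate(0x30C, rules)))  # 0x530c
```

The point of the sketch is the granularity: a page-granularity scheme would have to move all 4096 bytes of a page together, whereas a rule here can cover a single field.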
6
False Sharing
7
8
Target System
Xen
Control VM (Dom0) +
Hardware
Memory
9
Dynamic Detection and Mitigation of False Sharing
10
11
[Diagram: threads T1 and T2, Cache, and Main Memory holding lines at 0x300 and 0x340. Accesses shown: Read 0x300; Write 0x300; Write 0x308]
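The addresses in the diagram can be checked with a few lines of arithmetic. The sketch below assumes 64-byte cache lines, which is consistent with 0x300 and 0x340 appearing as neighbouring lines; the helper names are hypothetical.

```python
CACHE_LINE_SIZE = 64  # bytes; assumed, 0x340 - 0x300 == 64


def cache_line_base(address: int, line_size: int = CACHE_LINE_SIZE) -> int:
    """Return the base address of the cache line containing `address`."""
    return address - (address % line_size)


def same_cache_line(a: int, b: int, line_size: int = CACHE_LINE_SIZE) -> bool:
    """True when both addresses fall inside the same cache line."""
    return cache_line_base(a, line_size) == cache_line_base(b, line_size)


# Writes to 0x300 and 0x308 touch different bytes but the same line,
# so each write invalidates the other core's cached copy.
print(hex(cache_line_base(0x308)))    # 0x300
print(same_cache_line(0x300, 0x308))  # True
print(same_cache_line(0x300, 0x340))  # False
```

Two threads that never read each other's data can therefore still contend, because coherence is tracked per line rather than per byte.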
12
Cache Line
C Structure
With Padding
With Allocator Metadata
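The effect of padding on a C structure's layout can be inspected from Python with `ctypes`, which lays fields out using the platform's C rules. This is a sketch under assumptions: 64-byte cache lines, and two hypothetical per-thread counters (`t1_count`, `t2_count`) that do not come from the talk. The `base` parameter stands in for wherever the object happens to start, for example after an allocator's header.

```python
import ctypes

CACHE_LINE_SIZE = 64  # bytes; assumed


class Counters(ctypes.Structure):
    """Two per-thread counters packed next to each other (hypothetical)."""
    _fields_ = [
        ("t1_count", ctypes.c_uint64),
        ("t2_count", ctypes.c_uint64),
    ]


class PaddedCounters(ctypes.Structure):
    """Same counters, with padding so they start one full line apart."""
    _fields_ = [
        ("t1_count", ctypes.c_uint64),
        ("_pad", ctypes.c_uint8 * (CACHE_LINE_SIZE - ctypes.sizeof(ctypes.c_uint64))),
        ("t2_count", ctypes.c_uint64),
    ]


def line_of(struct_type, field_name: str, base: int = 0) -> int:
    """Cache line index of a field when the structure starts at `base`."""
    offset = getattr(struct_type, field_name).offset
    return (base + offset) // CACHE_LINE_SIZE


print(Counters.t2_count.offset)        # 8
print(PaddedCounters.t2_count.offset)  # 64
print(line_of(Counters, "t1_count") == line_of(Counters, "t2_count"))  # True
```

Because the padded fields start exactly one line apart, they land on different lines for any starting offset of the object, at the cost of a larger structure.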
13
[Figure: Time (s), 0 to 40, against No. of Cores, 1 to 8. Legend text: Serial Parallel Regular (FS) Source Fixed]
14
15
16
17
[Figure: Time (s), 0 to 40, against No. of Cores, 1 to 8. Legend text: Serial Parallel Regular (FS) Source Fixed]
7.5x
Linux Kernel [OSDI ’10], JVM [Dice, 2012], Software Transactional Memory [HPCA ’06]
18
Dynamic Detection and Mitigation of False Sharing
19
Modify access locations
Modify access frequency
Sheriff [OOPSLA ’11]
20
21
Isolated Page
Underlay Page
T1 T2
22
Dynamic Detection and Mitigation of False Sharing
23
Persistent, high-frequency false sharing
24
Very Fast and Imprecise
Fast and Somewhat Precise
Slow and Precise
25
Performance Counters: Does contention exist?
Log Page Reads: What pages are involved in the contention?
Instruction Emulation: What are the byte ranges being accessed?
Log-Analysis: Does this signify false sharing?
Rules for remapper
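The last stage, turning a byte-level access log into remapper rules, can be sketched as follows. This is a simplified illustration, not the analysis used in the talk: the `Access` record, the 64-byte line size, and the decision rule (several threads, at least one write, no byte touched by more than one thread) are assumptions.

```python
from collections import defaultdict
from typing import NamedTuple

CACHE_LINE_SIZE = 64  # bytes; assumed


class Access(NamedTuple):
    """Hypothetical log record produced by byte-level tracing."""
    thread: int
    address: int
    size: int
    is_write: bool


def find_false_sharing(log, line_size=CACHE_LINE_SIZE):
    """Return {line_base: {thread: (first_byte, last_byte)}} for lines that
    several threads use, at least one writes, and no byte is shared."""
    bytes_by_line = defaultdict(lambda: defaultdict(set))
    written_lines = set()
    for acc in log:
        for addr in range(acc.address, acc.address + acc.size):
            base = addr - addr % line_size
            bytes_by_line[base][acc.thread].add(addr)
            if acc.is_write:
                written_lines.add(base)

    rules = {}
    for base, per_thread in bytes_by_line.items():
        if len(per_thread) < 2 or base not in written_lines:
            continue  # one thread only, or read-only sharing
        touched = [b for byte_set in per_thread.values() for b in byte_set]
        if len(touched) != len(set(touched)):
            continue  # a byte is used by several threads: true sharing
        # Simplification: each thread's bytes are summarised as one range.
        rules[base] = {t: (min(s), max(s)) for t, s in per_thread.items()}
    return rules


log = [
    Access(thread=1, address=0x300, size=8, is_write=True),
    Access(thread=2, address=0x308, size=8, is_write=True),
    Access(thread=1, address=0x340, size=4, is_write=False),  # read-only line
    Access(thread=2, address=0x340, size=4, is_write=False),
    Access(thread=1, address=0x380, size=8, is_write=True),   # truly shared
    Access(thread=2, address=0x380, size=8, is_write=False),
]
rules = find_false_sharing(log)
print({hex(base): ranges for base, ranges in rules.items()})
```

Only the line at 0x300 qualifies in this example: the line at 0x340 is never written, and the line at 0x380 has bytes that both threads touch, so redirecting them would not remove the contention.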
26
Dynamic Detection and Mitigation of False Sharing
27
Isolated Page
Underlay Page
T1 T2
28
Don’t be Evil Harmful
29
Fault Driven Redirection
30
Original Code Code Cache
?!It’s a Fault?!
31
Original Code Code Cache
32
Avoid code trampolines
Catch all accesses via data path
Amortize page fault cost
33
“Know When You are Beaten”
34
Isolated Page
Underlay Page
T1 T2
35
Evaluation
36
[Figure: Progress (million records), 0 to 600, against Time (ms), 0 to 6000. Curves: Version with false sharing under Plastic; Source-fixed Version. Labels: Remappings Established; Coherence Invalidations; 110 M/sec; 160 M/sec]
37
[Figure: Normalized Performance, 0 to 1, for CCBench, Phoenix, and Parsec. Bars: Regular; w/Plastic. Labels: 1.4x, 3.6x, 5.4x]
38
Low overhead runtime detection
Byte-granularity remapping
Speedup of up to 5.4x
39
Performance Optimizations
Security Enhancements
40