Responsible : Prof. Frédéric Pétrot
Supervisor : Luc Michel
TIMA Laboratory - SLS Group
Grenoble, France
Translation cache policies for dynamic binary translation
Ecole Nationale des Sciences de l'Informatique
Saber Ferjani
DBT (dynamic binary translation): a CPU simulation technique that reads a short sequence of target code, translates it, and executes it on a different CPU (the host).
[Figure: the host machine runs the simulated target CPU; target asm code is translated into a sequence of TBs]
Translation cache: a buffer in the host machine that stores the Translated Blocks (TBs).
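The two definitions above can be illustrated with a toy sketch. It is not Qemu code: the "target" is a hypothetical list of (op, arg) tuples, a translated block is a Python closure, and the names translate_block and run are invented for this example. It only shows the idea of translating a block once and reusing it from the cache.

```python
# Toy illustration only: the instruction set and all names are hypothetical.

def translate_block(program, pc):
    """Translate target instructions from pc up to and including the next jump."""
    ops = []
    while True:
        op, arg = program[pc]
        ops.append((op, arg))
        pc += 1
        if op in ("jmp", "jnz", "halt"):
            break
    fallthrough = pc

    def tb(state):
        # "Host" version of the block: returns the next target pc, or None to stop.
        for op, arg in ops:
            if op == "add":
                state["acc"] += arg
            elif op == "dec":
                state["cnt"] -= arg
            elif op == "jmp":
                return arg
            elif op == "jnz":
                return arg if state["cnt"] != 0 else fallthrough
            elif op == "halt":
                return None
        return fallthrough

    return tb

def run(program, state):
    cache = {}            # translation cache: target pc -> translated block
    translations = 0
    pc = 0
    while pc is not None:
        tb = cache.get(pc)
        if tb is None:    # miss: translate once, then reuse on later visits
            tb = translate_block(program, pc)
            cache[pc] = tb
            translations += 1
        pc = tb(state)    # execute the translated block
    return translations

program = [
    ("add", 2),    # 0: acc += 2
    ("dec", 1),    # 1: cnt -= 1
    ("jnz", 0),    # 2: loop back to 0 while cnt != 0
    ("halt", 0),   # 3: stop
]
state = {"acc": 0, "cnt": 5}
translations = run(program, state)
```

The loop body runs five times but is translated only once, which is the saving a translation cache is meant to provide.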
Outline
1. Virtualization and simulation techniques
2. Qemu Internals
3. Typical cache algorithms
4. Cache algorithm proposal
5. Simulation results
6. Conclusion & Perspectives
1. Virtualization and simulation techniques
1.1. Just In Time Compiler
1.2. Hosted & Native Hypervisors
1.3. Virtualization tools
VirtualBox
Virtual PC
VMware
Xen
Bochs
Valgrind
Qemu
KVM
1.4. Simulation techniques
Interpretive technique ► Extremely slow!
Native simulation ► Needs source code!
Binary Translation:
Static ► Cannot handle indirect branches
Dynamic ► Quite fast & flexible
2. Qemu internals
2.1. Overview
Generic & Open source machine emulator
Created by Fabrice Bellard in 2003
Supported targets: IA32, ARM, SPARC, MIPS, PPC…
2.2. Execution flow example
2.3. Main execution loop
2.4. Translation cache size
2.5. TB allocation
3. Typical cache algorithms
Optimal cache algorithm (offline)
Basic cache algorithms:
Flush, Random, FIFO, LRU, LFU
Advanced cache algorithms:
LRFU, 2Q, LIRS, ARC
Qemu constraints:
TBs are not movable
TB size is variable
TB size is unpredictable
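To see why these constraints favour the simple "Flush" policy, here is a minimal sketch, assuming a code buffer with bump-pointer allocation that is emptied entirely when a new block does not fit. The class name FlushCache and its fields are hypothetical and are not taken from Qemu.

```python
class FlushCache:
    """Code buffer with bump-pointer allocation and a full flush when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity   # buffer size in bytes
        self.used = 0              # bump pointer: next free offset
        self.blocks = {}           # target pc -> (offset, size)
        self.flushes = 0

    def lookup(self, pc):
        return self.blocks.get(pc)

    def insert(self, pc, size):
        """Place a block of `size` bytes; flush everything if it does not fit."""
        if size > self.capacity:
            raise ValueError("block larger than the whole cache")
        if self.used + size > self.capacity:
            # Blocks cannot be moved and have variable sizes, so evicting a
            # single block would leave holes; the whole buffer is dropped.
            self.blocks.clear()
            self.used = 0
            self.flushes += 1
        offset = self.used
        self.blocks[pc] = (offset, size)
        self.used += size
        return offset

cache = FlushCache(capacity=100)
offsets = [cache.insert(pc, size) for pc, size in [(0x10, 40), (0x20, 50), (0x30, 30)]]
```

The third block does not fit in the remaining 10 bytes, so the buffer is flushed and the first two blocks would have to be re-translated on their next use.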
4. Cache algorithm proposal
4.1. Algorithm design
4.2. Data structure
Constant insertion overhead
Frequently referenced TBs are selected for
re-translation into a separate cache area
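A possible shape for such a data structure is sketched below. It is an interpretation rather than the author's implementation: it assumes two independently flushed areas (named after the CSA and HSA labels used on later slides), assumes that "quota" is the share of the cache given to the HSA, and uses a plain Python set as a stand-in for the table of hot addresses. All class and field names are hypothetical.

```python
class Area:
    """One cache area with bump-pointer allocation and whole-area flush."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.blocks = {}       # target pc -> offset inside this area
        self.flushes = 0

    def insert(self, pc, size):
        # Assumes size <= capacity; insertion cost does not depend on occupancy.
        if self.used + size > self.capacity:
            self.blocks.clear()
            self.used = 0
            self.flushes += 1
        self.blocks[pc] = self.used
        self.used += size

class TwoAreaCache:
    def __init__(self, total, quota):
        hsa_size = int(total * quota)
        self.hsa = Area(hsa_size)            # area for hot spots
        self.csa = Area(total - hsa_size)    # area for all other blocks
        self.hot = set()                     # stand-in for the hot-address table

    def find(self, pc):
        return pc in self.hsa.blocks or pc in self.csa.blocks

    def insert(self, pc, size):
        """Route a new translation to the HSA if its address is known to be hot."""
        area = self.hsa if pc in self.hot else self.csa
        area.insert(pc, size)
        return "HSA" if area is self.hsa else "CSA"

cache = TwoAreaCache(total=1000, quota=0.25)
cache.hot.add(0x100)
placed = [cache.insert(0x100, 60), cache.insert(0x200, 60)]
```

Keeping hot blocks in their own area means a flush of the other area does not discard them.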
4.3. HST update
Before a CSA flush, add to the HST the addresses of all TBs
that were executed more than F_th times
The HST is used as a circular buffer
HST size is fixed to half of the HSA size
[Figure: HST as a circular buffer holding the hot-spot addresses @HS1 to @HS5]
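The update rule can be sketched as follows. This is a minimal reading of the slide, with several assumptions: execution counters are kept in a dictionary keyed by target address, duplicates are skipped, and the counters are reset after the update. The names HotSpotTable and update_before_csa_flush are hypothetical.

```python
class HotSpotTable:
    """Fixed-size circular buffer of hot-spot target addresses."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0                      # next slot to overwrite

    def add(self, addr):
        if addr in self.slots:             # linear scan: cost grows with HST size
            return
        self.slots[self.head] = addr       # overwrite the oldest entry
        self.head = (self.head + 1) % len(self.slots)

    def __contains__(self, addr):
        return addr in self.slots

def update_before_csa_flush(hst, exec_counts, f_th):
    """Record every TB executed more than f_th times, then reset the counters."""
    for addr, count in exec_counts.items():
        if count > f_th:
            hst.add(addr)
    exec_counts.clear()                    # assumption: counters restart after a flush

hst = HotSpotTable(size=2)
counts = {0xA0: 10, 0xB0: 1, 0xC0: 7, 0xD0: 9}
update_before_csa_flush(hst, counts, f_th=5)
```

With only two slots, the third hot address wraps around and overwrites the first one, which is the circular-buffer behaviour described above.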
5. Simulation results
5.1. Qemu log
Qemu monitor: back-end configuration console interface
Log options:
out_asm: show generated host code
in_asm: show target assembly code
exec: show trace before each executed TB
…etc
Log generated by "log exec":
Trace (Host Address) [(Target Address)]
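A trace in this shape is easy to turn into input for a cache simulator. The sketch below assumes each relevant line looks like "Trace 0x<host address> [<target address>]" with hexadecimal values; the exact layout may differ between Qemu versions, and the sample lines are made up for illustration.

```python
import re

# Assumed line shape, following the slide: Trace <host address> [<target address>]
TRACE_RE = re.compile(r"^Trace\s+(0x[0-9a-fA-F]+)\s+\[([0-9a-fA-F]+)\]")

def parse_trace(lines):
    """Yield (host_addr, target_addr) integer pairs from exec-log lines."""
    for line in lines:
        m = TRACE_RE.match(line)
        if m:                              # silently skip any other log output
            yield int(m.group(1), 16), int(m.group(2), 16)

# Hypothetical sample, not taken from a real log
sample = [
    "Trace 0x40001000 [c0100000]",
    "some unrelated log line",
    "Trace 0x40001040 [c0100010]",
]
pairs = list(parse_trace(sample))
```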
5.2. TB-trace: Translation cache simulator
5.3. Simulated cache algorithms
LRU
LFU
• A-LRU
• A-LFU
• A-2Q
[Figure: CSA and HSA cache areas, with the HST holding hot-spot addresses (@)]
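For reference, the two basic policies in this list can be replayed over an address trace as sketched below. This is a simplified illustration, not the TB-trace tool: capacity is counted in entries rather than bytes, so it ignores the variable TB sizes discussed earlier, and the function names and the trace are hypothetical.

```python
from collections import Counter, OrderedDict

def simulate_lru(trace, capacity):
    """Return the number of hits for an LRU cache holding `capacity` entries."""
    cache = OrderedDict()
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[addr] = True
    return hits

def simulate_lfu(trace, capacity):
    """Return the number of hits for an LFU cache holding `capacity` entries."""
    cache = set()
    freq = Counter()                       # reference counts kept across evictions
    hits = 0
    for addr in trace:
        freq[addr] += 1
        if addr in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                victim = min(cache, key=lambda a: freq[a])
                cache.remove(victim)       # evict the least frequently used
            cache.add(addr)
    return hits

trace = [1, 2, 1, 3, 1, 4, 1, 2]
lru_hits = simulate_lru(trace, capacity=2)
lfu_hits = simulate_lfu(trace, capacity=2)
```

Both policies keep the frequently reused address 1 resident on this small trace; they diverge on workloads where recency and frequency disagree.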
5.4. Guest machines used with Qemu
LZMA benchmark
Linux Kernel
Windows XP start-up
5.5. Guest 1: LZMA benchmark over Debian
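[Chart: CSA flushes for LRU, LFU and 2Q at Quota = 0.25, 0.375, 0.5; bar labels in extraction order: 62, 89, 72, 50, 55, 52, 56, 68, 88]
[Chart: Hotspot hit for LRU, LFU and 2Q at Quota = 0.25, 0.375, 0.5; bar labels in extraction order: 18.5%, 39.6%, 26.1%, 86.9%, 91.3%, 90.1%, 81.8%, 81.9%, 81.8%]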
5.6. Guest 2: Linux kernel 2.6.20
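[Chart: CSA flushes for LRU, LFU and 2Q at Quota = 0.25, 0.375, 0.5; bar labels in extraction order: 15, 18, 22, 15, 17, 21, 16, 19, 23; two bars annotated "+1 HSA flush"]
[Chart: Hotspot hit for LRU, LFU and 2Q at Quota = 0.25, 0.375, 0.5; bar labels in extraction order: 24.1%, 32.1%, 43.6%, 24.4%, 61.9%, 57.4%, 30.0%, 64.1%, 65.2%]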
5.7. Guest 3: Windows XP start-up
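[Chart: CSA flushes for LRU, LFU and 2Q at Quota = 0.25, 0.375, 0.5; bar labels in extraction order: 15, 18, 21, 15, 17, 21, 16, 19, 24; three bars annotated "+1 HSA flush"]
[Chart: Hotspot hit for LRU, LFU and 2Q at Quota = 0.25, 0.375, 0.5; bar labels in extraction order: 16.0%, 45.2%, 52.1%, 23.4%, 56.5%, 51.4%, 29.0%, 45.3%, 64.7%]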
6. Conclusion & Perspectives
6.1. Conclusion
Qemu's translation cache is inefficient
Cache algorithms based on page
replacement cannot be used
Advantages of the proposed algorithm:
Reduces unneeded re-translations
TB insertion overhead is constant
Drawbacks:
Invalidated TBs remain allocated
Address lookup time depends on the HST size
6.2. Perspectives
Use a hash function for the HST to accelerate
the TB lookup before each new translation
Use an op-code buffer to accelerate
re-translation of hot spots
Estimate the size of the next translation, and try
to overwrite invalidated TBs
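The first perspective could look like the sketch below: the HST stays a circular buffer, but a hash-based index answers membership queries without scanning it. This is one possible design under stated assumptions, not the author's: Python's built-in set plays the role of the hash function and table, and the class name HashedHotSpotTable is hypothetical.

```python
class HashedHotSpotTable:
    """Circular buffer of hot-spot addresses with a hash-set index for lookups."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0                      # next slot to overwrite
        self.index = set()                 # mirrors the contents of `slots`

    def add(self, addr):
        if addr in self.index:             # average O(1) instead of a linear scan
            return
        old = self.slots[self.head]
        if old is not None:
            self.index.discard(old)        # keep the index in sync on overwrite
        self.slots[self.head] = addr
        self.index.add(addr)
        self.head = (self.head + 1) % len(self.slots)

    def __contains__(self, addr):
        return addr in self.index

hst = HashedHotSpotTable(size=2)
for addr in (0xA0, 0xC0, 0xD0):
    hst.add(addr)
```

The lookup cost no longer grows with the HST size, at the price of the extra memory used by the index.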
Questions?