-2- Optimizing Your Dyninst Program
Optimizing Dyninst• Dyninst is being used to insert more
instrumentation into bigger programs. Forexample:– Instrumenting every memory instruction– Working with binaries 200MB in size
• Performance is a big consideration
• What can we do, and what can you do tohelp?
-3- Optimizing Your Dyninst Program
Performance Problem Areas• Parsing: Analyzing the binary and reading
debug information.
• Instrumenting: Rewriting the binary toinsert instrumentation.
• Runtime: Instrumentation slows down amutatee at runtime.
-4- Optimizing Your Dyninst Program
Optimizing Dyninst• Programmer Optimizations
– Telling Dyninst not to output tramp guards.
• Dyninst Optimizations– Reducing the number of registers saved around
instrumentation.
-5- Optimizing Your Dyninst Program
Parsing Overview• Control Flow
– Identifies executable regions• Data Flow
– Analyzes code prior to instrumentation• Debug
– Reads debugging information, e.g. line info• Symbol
– Reads from the symbol table
• Lazy Parsing: Not parsed until it is needed
-6- Optimizing Your Dyninst Program
Control Flow• Dyninst needs to analyze a binary before
it can instrument it.– Identifies functions and basic blocks
• Granularity– Parses all of a module at once.
• Triggers– Operating on a BPatch_function– Requesting BPatch_instPoint objects– Performing Stackwalks (on x86)
-7- Optimizing Your Dyninst Program
Data Flow• Dyninst analyzes a function before
instrumenting it.– Live register analysis– Reaching allocs on IA-64
• Granularity– Analyzes a function at a time.
• Triggers– The first time a function is instrumented
-8- Optimizing Your Dyninst Program
Debug Information• Reads debug information from a binary.
• Granularity– Parses all of a module at once.
• Triggers– Line number information– Type information– Local variable information
-9- Optimizing Your Dyninst Program
Symbol Table• Extracts function and data information
from the symbol
• Granularity– Parses all of a module at once.
• Triggers– Not done lazily. At module load.
-10- Optimizing Your Dyninst Program
Lazy Parsing Overview
AutomaticallyModuleSymbol
Debug InfoQueries
ModuleDebug
InstrumentationFunctionData Flow
BPatch_functionQueries
ModuleControl Flow
Triggered ByGranularity
Lazy parsing allows you to avoid or defercosts.
-11- Optimizing Your Dyninst Program
foo:0x1000: push ebp0x1001: movl esp,ebp0x1002: push $10x1004: call bar0x1005: leave0x1006: ret
foo:0x2000: jmp entry_instr0x2005: push ebp0x2006: movl esp,ebp0x2007: push $10x2009: call bar0x200F: leave0x2010: ret
foo:0x3000: jmp entry_instr0x3005: push ebp0x3006: movl esp,ebp0x3007: push $10x3009: jmp call_instr0x300F: call bar0x3014: leave0x3015: ret
foo:0x4000: jmp entry_instr0x4005: push ebp0x4006: movl esp,ebp0x4007: push $10x4009: jmp call_instr0x400F: call bar0x4014: leave0x4015: jmp exit_instr0x401A: ret
Inserting Instrumentation• What happens when we re-instrument a
function?
-12- Optimizing Your Dyninst Program
Inserting Instrumentation• Bracket instrumentation requests with:
beginInsertionSet()...endInsertionSet()
• Batches instrumentation– Allows for transactional instrumentation– Improves efficiency (rewrite)
-13- Optimizing Your Dyninst Program
Runtime Overhead• Two factors determine how
instrumentation slows down a program.– What does the instrumentation cost?
• Increment a variable• Call a cache simulator
– How often does instrumentation run?• Every time read/write are called• Every memory instruction
• Additional Dyninst overhead on eachinstrumentation point.
-14- Optimizing Your Dyninst Program
Runtime Overhead - Basetramps
save all GPRsave all FPRt = DYNINSTthreadIndex()if (!guards[t]) { guards[t] = true jump to minitramps guards[t] = false}restore all FPRrestore all GPR
Save Registers
Restore Registers
CalculateThread IndexCheck TrampGuardsRun AllMinitramps
A Basetramp
-15- Optimizing Your Dyninst Program
Runtime Overhead – Registers
save all GPRsave all FPRt = DYNINSTthreadIndex()if (!guards[t]) { guards[t] = true jump to minitramps guards[t] = false}restore all FPRrestore all GPR
A Basetramp
•Analyzes minitrampsfor register usage.
•Analyzes functions forregister liveness.
•Only saves what is liveand used.
-16- Optimizing Your Dyninst Program
Runtime Overhead – Registers
save all GPRsave all FPRt = DYNINSTthreadIndex()if (!guards[t]) { guards[t] = true jump to minitramps guards[t] = false}restore all FPRrestore all GPR
A Basetramp•Called functions arerecursively analyzed toa max call depth of 2.
minitramp
foo()
bar()
call
call
-17- Optimizing Your Dyninst Program
Runtime Overhead – Registers
save live GPRt = DYNINSTthreadIndex()if (!guards[t]) { guards[t] = true jump to minitramps guards[t] = false}restore live GPR
A Basetramp•Use shallow functioncall chains underinstrumentation, soDyninst can analyze allreachable code.
•UseBPatch::setSaveFPR()to disable all floatingpoint saves.
-18- Optimizing Your Dyninst Program
Runtime Overhead – Tramp Guards
save live GPRt = DYNINSTthreadIndex()if (!guards[t]) { guards[t] = true jump to minitramps guards[t] = false}Restore live GPR
A Basetramp•Prevents recursiveinstrumentation.
•Needs to be threadaware.
-19- Optimizing Your Dyninst Program
Runtime Overhead – Tramp GuardsA Basetramp
•Build instrumentationthat doesn’t makefunction calls (noBPatch_funcCallExprsnippets)
•UsesetTrampRecursive()if you’re sureinstrumentation won’trecurse.
save live GPRt = DYNINSTthreadIndex()
jump to minitramps
restore live GPR
-20- Optimizing Your Dyninst Program
Runtime Overhead – ThreadsA Basetramp
•Returns an index value(0..N) unique to thecurrent thread.
•Used by tramp guardsand for thread localstorage byinstrumentation
•Expensive
save live GPRt = DYNINSTthreadIndex()
jump to minitramps
restore live GPR
-21- Optimizing Your Dyninst Program
Runtime Overhead – ThreadsA Basetramp
•Not needed if thereare no tramp guards.
•Only used on mutateeslinked with a threadinglibrary (e.g. libpthread)
save live GPR
jump to minitramps
restore live GPR
-22- Optimizing Your Dyninst Program
Runtime Overhead – MinitrampsA Basetramp
•Minitramps containthe actualinstrumentation.
•What can we do withminitramps?
save live GPR
jump to minitramps
restore live GPR
-23- Optimizing Your Dyninst Program
Runtime Overhead – Minitramps
//Increment var by 1load var -> regreg = reg + 1store reg -> varjmp Minitramp B
Minitramp A•Created by our codegenerator, whichassumes a RISC likearchitecture.
•Instrumentationlinked by jumps.
//Call foo(arg1)push arg1call foojmp BaseTramp
Minitramp B
-24- Optimizing Your Dyninst Program
//Increment var by 1load var -> regreg = reg + 1store reg -> varjmp Minitramp B
//Increment var by 1
inc var
jmp Minitramp B
Runtime Overhead – MinitrampsMinitramp A
//Call foo(arg1)push arg1call foojmp BaseTramp
Minitramp B
•New code generatorrecognizes commoninstrumentationsnippets and outputsCISC instructions.
• Works on simplearithmetic, and stores.
-25- Optimizing Your Dyninst Program
//Increment var by 1
inc var
jmp Minitramp B
Runtime Overhead – Minitramps
//Increment var by 1load var -> regreg = reg + 1store reg -> varjmp Minitramp B
Minitramp A
Minitramp B
//Increment var by 1
inc var
jmp Minitramp B
Mergetramp
//Call foo(arg1)push arg1call foojmp BaseTramp
•New merge trampscombine minitrampstogether withbasetramp.
•Faster execution,slower re-instrumentation.
•Change behavior withBPatch::setMergeTramp
-26- Optimizing Your Dyninst Program
Runtime Overhead
• Where does the Dyninst’s runtimeoverhead go?– 87% Calculating thread indexes– 12% Saving registers– 1% Trampoline Guards
• Dyninst allows inexpensive instrumentationto be inexpensive.
-27- Optimizing Your Dyninst Program
Summary
• Make use of lazy parsing
• Use insertion sets when insertinginstrumentation.
• Small, easy to understand snippets areeasier for Dyninst to optimize.– Try to avoid function calls in instrumentation.