30
April 2007 The Deconstruction of Dyninst: Part 1- the SymtabAPI The Deconstruction of Dyninst Part 1: The SymtabAPI Giridhar Ravipati University of Wisconsin, Madison

The Deconstruction of Dyninst Part 1: The SymtabAPI

  • Upload
    marlie

  • View
    54

  • Download
    0

Embed Size (px)

DESCRIPTION

The Deconstruction of Dyninst Part 1: The SymtabAPI. Giridhar Ravipati University of Wisconsin, Madison. Motivation. Binary tools are increasingly common Two categories of operation Analysis : Derive semantic meaning from the binary code Symbol tables (if present) - PowerPoint PPT Presentation

Citation preview

Page 1: The Deconstruction of Dyninst Part 1: The SymtabAPI

April 2007The Deconstruction of Dyninst: Part 1- the SymtabAPI

The Deconstruction of DyninstPart 1: The SymtabAPI

Giridhar RavipatiUniversity of Wisconsin, Madison

Page 2: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 2 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Motivation

Binary tools are increasingly common Two categories of operation

• Analysis : Derive semantic meaning from the binary code– Symbol tables (if present)– Decode (disassemble) instructions– Control-flow information: basic blocks, loops, functions– Data-flow information: from basic register information to

highly sophisticated (and expensive) analyses.

• Modification– Insert, remove, or change the binary code, producing a new

binary.

Page 3: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 3 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Wide Use of Binary Tools

• Binary Modification– Eel, Vulcan, Etch, Atom,

Diablo, Diota

• Binary Matching– BMT

• Forensics– Fenris

• Reverse engineering– IDA Pro

• Binary Translation– Objcopy, UQBT

• Program tracing– QPT

• Program debugging– Total view, gdb, STAT

• Program testing– Eraser

• Performance modeling– METRIC

• Performance profiling– Paradyn, Valgrind, TAU,

OSS

Analysis and Modification are used in a wide variety of applications

Page 4: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 4 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Lack of Code Sharing

Some tools do analysis and some tools do modification•Only a few do both

Tools usually depend on•Similar analysis •Similar modification techniques

Too many different interfaces•Usually too low level

Developers are forced to reinvent the wheel rather than use existing code

Page 5: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 5 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Lack of Portability

Myriad number of differences between•File formats•Architectures•Operating systems• Compilers•…

Building a portable binary tool is highly expensive•Many platforms in common use

Page 6: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 6 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

High-level goals

To build a toolkit that•Has components for analysis •Has components for modification•Is portable & extensible•Has an abstract interfaces•Encourage sharing of functionality

Deconstruct Dyninst into a toolkit that can achieve these goals

Page 7: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 7 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

DyninstAPI

Library that provides a platform-independent interface to dynamic binary analysis and modification

Goal•Simplify binary tool development

Why is Dyninst successful?•Analysis and modification capabilities•Portability•Abstract interface

Page 8: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 8 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Drawbacks of Dyninst

Dyninst is complex Dyninst internal components are portable

but not sharable Sometimes Dyninst is not a perfect match

for user requirements Dyninst is feature-rich in some cases

•Provides unnecessary extra functionality

Page 9: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 9 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Example Scenarios

Hidden functionality•Statically parse and analyze a binary

without executing it•Just perform stackwalking on a binary

compiled without frame pointer information Build new tools

•Static binary rewriter•Tool to add a symbol table to stripped

binaries

Page 10: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 10 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Our Approach

Deconstruct the monolithic Dyninst into a suite of components

Each component provides a platform-independent interface to a core piece of Dyninst functionality

Page 11: The Deconstruction of Dyninst Part 1: The SymtabAPI

April 2007The Deconstruction of Dyninst: Part 1- the SymtabAPI

AST

BinaryCode

InstrumentationRequests

MonolithicDyninst

Page 12: The Deconstruction of Dyninst Part 1: The SymtabAPI

April 2007The Deconstruction of Dyninst: Part 1- the SymtabAPI

BinaryCode

Code Gen StackWalker

ProcessControl

SymtabAPICode

Parser

IdiomDetector

InstrumentationRequests

InstructionDecoder

Instrumenter

AST

Page 13: The Deconstruction of Dyninst Part 1: The SymtabAPI

April 2007The Deconstruction of Dyninst: Part 1- the SymtabAPI

SymtabAPI

InstructionDecoder

AST

BinaryCode

CodeParser

IdiomDetector

IA32

AMD64

POWER

IA64

SPARC

InstrumentationRequests Process

ControlInstrumenter

IA32

AMD64

POWER

IA64

SPARC

Code Gen

IA32

AMD64

POWER

IA64

SPARC

StackWalker

Linux

AIX

Solaris

Windows

PE

ELF

XCOFF

IA32

AMD64

POWER

IA64

SPARC

Page 14: The Deconstruction of Dyninst Part 1: The SymtabAPI

April 2007The Deconstruction of Dyninst: Part 1- the SymtabAPI

SymtabAPI

PE

InstructionDecoder

AST

BinaryCode

ELF

XCOFF

CodeParser

IdiomDetector

IA32

AMD64

POWER

IA64

SPARC

SymbolTable

Disassembly

FunctionObjects

CallGraph

Intra ProcCFGs

IdiomSignatures

InstrumentationRequests Process

ControlInstrumenter

IA32

AMD64

POWER

IA64

SPARC

Code Gen

IA32

AMD64

POWER

IA64

SPARC

StackWalker

Linux

AIX

Solaris

Windows

IA32

AMD64

POWER

IA64

SPARC

Page 15: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 15 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Goals of Deconstruction

Separate the key capabilities of Dyninst Each Component

•Is responsible for a specific functionality•Provides a general solution

Encourage sharing •Share our functionality when building new

tools•Share functionality of other tools

Page 16: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 16 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Benefits of Deconstruction

Access to the hidden features of Dyninst

Interoperability with other tools•Standardized interfaces and sharing of

components

Finer grain testing of Dyninst

Page 17: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 17 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Benefits of Deconstruction [contd.]

Code reuse among the tool community

Make tools more portable

Unexpected benefits with new application of components

Page 18: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 18 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Our Plan

1. Identify the key functionality2. Refine and generalize the abstract

interfaces to these components3. Extract and separate the functionality

from Dyninst4. Rebuild Dyninst on top of these

components5. Create new tools

• Multi-platform static binary rewriter

Page 19: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 19 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

SymtabAPI

The first component of the deconstructed Dyninst

Multi platform library for parsing symbol table information from object files

Leverages the experience and implementation gained from building the DyninstAPI

Page 20: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 20 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

SymtabAPI Goals

Abstraction•Be file format-independent

Interactivity•Update data incrementally

Extensibility•User-extensible data structures

Generality•Parse ELF/XCOFF/PE object files•On-Disk/In-Memory parsing

Page 21: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 21 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

SymtabAPI Abstractions

Represents an object file in a canonical format

Hides the multi-platform dependences

Header

Modules

Symbols

Relocations

Excp Blocks

Archive

Debug Info

Header

Modules

Symbols

Relocations

Excp Blocks

Debug Info

Page 22: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 22 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

SymtabAPI Extensibility

Abstractions are designed to be extensible

Can annotate particular abstractions with tool specific data•e.g. : Store type information for every

symbol in the symbol table

Page 23: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 23 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Interactivity/Extensibility

Symbol Address

func1

func2 0x0804cd1d

variable1

0x0804cc84

0x0804cd00

Register Is Live?

R1 Yes

R2 No

R3 Yes

R4 Yes

... ...

Type Informationint

Size

100

4

050

Page 24: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 24 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

SymtabAPI Interface

Information from a parsed-binary is kept in run time data structures

Intuitive query-based interface•e.g. findSymbolByType(name,type)•Returns matching symbols

Data can then be updated by the user Modifications available for future queries

Page 25: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 25 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Query/Update/Export/Emit

SymtabAPI

Parse

Binary Tool

Query

Response

Update Export/Emit

BinaryCode

XML

NewBinary

Page 26: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 26 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Summary of Operations

Parse the symbols in a binary Query for symbols Update existing symbol information Add new symbols Export/Emit symbols

More details/operations in the SymtabAPI programmer’s guide

Page 27: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 27 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Current Status

Released the initial version of SymtabAPI with the 5.1 release of Dyninst

Dyninst on top of SymtabAPI XML export Emit on Linux and AIX

Page 28: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 28 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Ongoing & Future Work

Import XML Emit a new binary on windows Debugging information for symbols Interfaces for the remaining components Multi-platform static binary rewriter

Page 29: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 29 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Demo

Please stop by and see our demo of stripped binary parsing with the SymtabAPI’s emit functionality on Linux

Tuesday, May 1, 2007Room No – 206

2:00 PM – 3:00 PM

Page 30: The Deconstruction of Dyninst Part 1: The SymtabAPI

– 30 –The Deconstruction of Dyninst: Part 1- the SymtabAPI

Downloads

SymtabAPI•http://www.paradyn.org/html/

downloads.html SymtabAPI Programmer’s guide

•http://www.paradyn.org/html/symtabAPI.html

Ravipati, G., Bernat, A., Miller, B.P. and Hollingsworth, J.K., "Toward the Deconstruction of Dyninst", Technical Report