trenton-dempsey
Part 2: Advanced Static Analysis
Chapter 4: A Crash Course in x86 Disassembly
Chapter 5: IDA Pro
Chapter 6: Recognizing C Code Constructs in Assembly
How software works
– The gcc compiler driver pre-processes, compiles, assembles, and links to generate an executable
  • Links together object code (e.g., game.o) and static libraries (e.g., libc.a) to form the final executable
  • Links in references to dynamic libraries for code loaded at load time (e.g., libc.so.1)
  • The executable may still load additional dynamic libraries at run-time
[Figure: compilation pipeline] hello.c (program source) → pre-processor → hello.i (modified source) → compiler → hello.s (assembly code) → assembler → hello.o (object code) → linker → hello (executable code)
Static libraries
– Suppose you have utility code in x.c, y.c, and z.c that all of your programs use
– Option 1: link the individual .o files together
    gcc -o hello hello.o x.o y.o z.o
– Option 2: create a library libmyutil.a using ar and ranlib, and link the library in statically
    libmyutil.a : x.o y.o z.o          (Makefile rule)
        ar rvu libmyutil.a x.o y.o z.o
        ranlib libmyutil.a
    gcc -o hello hello.c -L. -lmyutil
– Note: library code is copied directly into the binary
Dynamic libraries
– Goal: avoid having multiple copies of common code on disk
– Problem: libc
  • Linked statically, “gcc program.c -lc” creates an a.out with libc object code (from libc.a) copied into it
  • Almost all programs use libc!
– Solution: compile binaries with a reference to a library of shared objects rather than a full copy of the library
  • Libraries are loaded at run-time from the file system
  • “ldd <binary>” shows which dynamic libraries a program relies upon
  • gcc flag “-shared” and linker option “-soname” (passed through gcc as “-Wl,-soname,<name>”) for generating dynamic shared object files
The linking process (ld)
– Merges object files
  • Merges multiple relocatable (.o) object files into a single executable program
– Resolves external references
  • References to symbols defined in another object file
– Relocates symbols
  • Relocates symbols from their relative locations in the .o files to new absolute positions in the executable
  • Updates all references to these symbols to reflect their new positions
  • References occur in both code and data
    » code: a();           /* reference to symbol a */
    » data: int *xp = &x;  /* reference to symbol x */
Executables
– Various file formats
  • Linux = Executable and Linkable Format (ELF)
  • Windows = Portable Executable (PE)
ELF
– Standard binary format for object files in Linux
– One unified format for
  • Relocatable object files (.o)
  • Shared object files (.so)
  • Executable object files
– Better support for shared libraries than the old a.out format
– More complete information for debuggers
ELF Object File Format
– ELF header
  • Magic number, type (.o, exec, .so), machine, byte ordering, entry point, etc.
– Program header table
  • Page size, virtual addresses of memory segments (sections), segment sizes
– .text section
  • Code
– .data section
  • Initialized (static) data
– .bss section
  • Uninitialized (static) data
  • “Block Started by Symbol”
[Figure: ELF file layout, starting at offset 0] ELF header | Program header table (required for executables) | .text section | .data section | .bss section | .symtab | .rel.text | .rel.data | .debug | Section header table (required for relocatables)
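The ELF header fields listed above (magic number, type, byte ordering) sit at fixed offsets at the start of the file, so they can be read straight from a byte buffer. The following is a minimal sketch, not a full parser: the struct and function names (elf_summary, elf_summarize) are made up for illustration, and only the identification bytes and the 2-byte type field at offset 16 are examined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative summary of a few ELF header fields (names are hypothetical). */
struct elf_summary {
    int is_64bit;       /* 1 if the class byte says 64-bit, 0 if 32-bit */
    int little_endian;  /* 1 if the data byte says little-endian, 0 if big-endian */
    uint16_t type;      /* 1 = relocatable (.o), 2 = executable, 3 = shared object (.so) */
};

/* Returns 0 on success, -1 if buf does not look like an ELF header. */
int elf_summarize(const unsigned char *buf, size_t len, struct elf_summary *out)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (len < 18 || memcmp(buf, magic, 4) != 0)
        return -1;
    /* Byte 4 is the class (1 or 2); byte 5 is the byte ordering (1 or 2). */
    if ((buf[4] != 1 && buf[4] != 2) || (buf[5] != 1 && buf[5] != 2))
        return -1;

    out->is_64bit = (buf[4] == 2);
    out->little_endian = (buf[5] == 1);

    /* The type field is 2 bytes at offset 16, stored in the file's byte order. */
    if (out->little_endian)
        out->type = (uint16_t)(buf[16] | (buf[17] << 8));
    else
        out->type = (uint16_t)((buf[16] << 8) | buf[17]);
    return 0;
}
```

Reading the first 18 bytes of a file with fread and passing them to elf_summarize is enough to tell a .o from an executable or a .so, which is the same check the file utility performs before looking any deeper.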
ELF Object File Format (cont)
– .symtab section
  • Symbol table
  • Procedure and static variable names
  • Section names and locations
– .rel.text section
  • Relocation info for .text section
  • Addresses of instructions that will need to be modified in the executable
  • Instructions for modifying
– .rel.data section
  • Relocation info for .data section
  • Addresses of pointer data that will need to be modified in the merged executable
– .debug section
  • Info for symbolic debugging (gcc -g)
PE (Portable Executable) file format
– Windows file format for executables
– Based on the COFF format
  • Magic numbers, headers, tables, directories, sections
– Disassemblers
  • Overlay data with C structures
  • Load the file as the OS loader would
  • Identify entry points (default and exported)
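The “magic numbers” mentioned above are the first thing a tool checks. A PE file begins with a DOS header whose first two bytes are “MZ”; the 4-byte little-endian field at offset 0x3c of that header gives the file offset of the “PE\0\0” signature. The sketch below checks only those two markers; the function name looks_like_pe is hypothetical, and a real loader validates far more.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf starts with an "MZ" DOS header whose field at
 * offset 0x3c points at a "PE\0\0" signature inside the buffer, else 0. */
int looks_like_pe(const unsigned char *buf, size_t len)
{
    uint32_t pe_off;

    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;

    /* Offset of the PE signature: 4 bytes, little-endian, at 0x3c. */
    pe_off = (uint32_t)buf[0x3c]
           | ((uint32_t)buf[0x3d] << 8)
           | ((uint32_t)buf[0x3e] << 16)
           | ((uint32_t)buf[0x3f] << 24);

    /* The 4 signature bytes must lie inside the buffer. */
    if (pe_off > len || len - pe_off < 4)
        return 0;

    return buf[pe_off] == 'P' && buf[pe_off + 1] == 'E'
        && buf[pe_off + 2] == 0 && buf[pe_off + 3] == 0;
}
```

The bounds check before the signature comparison matters for malware work in particular, since a hostile file can put any value in the offset field.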
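Example C Program
m.c:
    #include <stdlib.h>   /* for exit() */
    int a(void);          /* defined in a.c */
    int e = 7;
    int main() {
        int r = a();
        exit(0);
    }
a.c:
    extern int e;         /* defined in m.c */
    int *ep = &e;
    int x = 15;
    int y;
    int a() {
        return *ep + x + y;
    }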
Merging Relocatable Object Files into an Executable Object File
[Figure: relocatable object files on the left, executable object file on the right]
– Relocatable object files
  • System: system code (.text), system data (.data)
  • m.o: main() in .text; int e = 7 in .data
  • a.o: a() in .text; int *ep = &e and int x = 15 in .data; int y in .bss
– Executable object file (starting at offset 0)
  • headers
  • .text: system code, main(), a(), more system code
  • .data: system data, int e = 7, int *ep = &e, int x = 15
  • .bss: uninitialized data
  • .symtab, .debug
Program execution
– The operating system provides
  • Protection and resource allocation
  • Abstract view of resources (files, system calls)
  • Virtual memory
    » Uniform memory space abstraction for each process
    » Gives the illusion that each process has the entire memory space
How does a program get loaded?
– The operating system creates a new process
  • Including, among other things, a virtual memory space
  • Important: any hardware-based debugger must know the OS state in the page tables to map accesses to virtual addresses
– The system loader reads the executable file from the file system into the memory space
  • The executable contains code and statically linked libraries
  • Done via DMA (direct memory access)
  • The executable in the file system remains and can be executed again
– Loads dynamic shared objects/libraries into memory
– Resolves addresses in code given where the code/data is loaded
– Then it starts the thread of execution running
Loading Executable Binaries
[Figure: executable object file for example program p (left) mapped into the process image (right)]
– Executable object file, starting at offset 0: ELF header, program header table (required for executables), .text section, .data section, .bss section, .symtab, .rel.text, .rel.data, .debug, section header table (required for relocatables)
– Process image, by virtual address
  • 0x080483e0: init and shared lib segments
  • 0x08048494: .text segment (r/o)
  • 0x0804a010: .data segment (initialized r/w)
  • 0x0804a3b0: .bss segment (uninitialized r/w)
More on relocation
– Assembly code uses both relative and absolute addresses
– With the VM abstraction, old linkers decide the layout and can supply definitive addresses
  • Windows “.com” format
  • Linker can statically bind the program to virtual addresses
  • Now, they provide hints as to where they would like to be placed
– But… this could also be done at load time (address space layout randomization)
  • Windows “.exe” format
  • Loader rewrites addresses to proper offsets
  • System needs to force position-independent code
    » Force the compiler to make all jumps and branches relative to the current location or relative to a base register set at run-time
  • ELF uses a Global Offset Table (GOT)
    » Symbol addresses are obtained from the GOT before access
    » Can be targeted for hooks!
    » Implementation determines exploit
Program execution
– Programmer-visible state
  • EIP: instruction pointer, a.k.a. program counter
    » Address of next instruction
  • Register file
    » Heavily used program data
  • Condition codes
    » Store status information about the most recent arithmetic operation
    » Used for conditional branching
– Memory
  • Byte-addressable array
  • Code, user data, OS data
  • Includes the stack used to support procedures
[Figure: the CPU (EIP, registers, condition codes) sends addresses to memory (object code, program data, OS data, stack) and exchanges data and instructions with it]
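Run-time data structures
[Figure: Linux/IA32 process address space, from high to low addresses]
– 0xffffffff: top of the address space
– kernel virtual memory (code, data, heap, stack): memory invisible to user code
– 0xc0000000: user stack (created at run-time) grows down from here; %esp (stack pointer) marks its top
– memory-mapped region for shared libraries, starting at 0x40000000
– run-time heap (managed by malloc); brk marks its top
– read/write segment (.data, .bss), loaded from the executable file
– read-only segment (.init, .text, .rodata), loaded from the executable file, starting at 0x08048000
– unused, down to address 0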
Registers
– The processor operates on data in registers (usually)
  • movl (%eax), %ecx
    » Fetch data at the address contained in %eax
    » Store it in register %ecx
  • movl $array, %ecx
    » Move the address of variable array into %ecx
  • Typically, data is loaded into registers, manipulated or used, and then written back to memory
– The IA32 architecture is “register poor”
  • Few general-purpose registers
  • A source or destination operand is often a memory location
  • Makes context-switching amongst processes easy (less register state to store)
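IA32 General Registers
    32-bit (bits 31-0)   16-bit (bits 15-0)   8-bit (bits 15-8)   8-bit (bits 7-0)
    %eax                 %ax                  %ah                 %al
    %ecx                 %cx                  %ch                 %cl
    %edx                 %dx                  %dh                 %dl
    %ebx                 %bx                  %bh                 %bl
    %esi                 %si
    %edi                 %di
    %esp                 %sp                  (stack pointer)
    %ebp                 %bp                  (frame pointer)
– General-purpose registers (mostly): %eax, %ecx, %edx, %ebx, %esi, %edi
– Special-purpose registers: %esp, %ebp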
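The register table says that %ax, %ah, and %al are not separate registers but views onto parts of %eax. As a rough illustration, the following sketch models a 32-bit register as a uint32_t and extracts the same bit ranges with masks and shifts; the helper names are invented for this example.

```c
#include <stdint.h>

/* View bits 15..0 of a 32-bit register value (what %ax is to %eax). */
uint16_t reg_low16(uint32_t e) { return (uint16_t)(e & 0xffffu); }

/* View bits 15..8 (what %ah is to %eax). */
uint8_t reg_high8(uint32_t e) { return (uint8_t)((e >> 8) & 0xffu); }

/* View bits 7..0 (what %al is to %eax). */
uint8_t reg_low8(uint32_t e) { return (uint8_t)(e & 0xffu); }

/* Writing the low byte replaces bits 7..0 and leaves bits 31..8 untouched. */
uint32_t write_low8(uint32_t e, uint8_t v) { return (e & 0xffffff00u) | v; }
```

This is why disassembly that writes %al and then reads %eax still sees the old upper 24 bits, a detail that matters when reading code such as “testb $1, %al”.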
Operand types
– A typical instruction acts on one or more operands
  • addl %ecx, %edx adds the contents of ecx to edx
– Three general types of operands
  • Immediate
    » Like a C constant, but preceded by $
    » e.g., $0x1F, $-533
    » Encoded with 1, 2, or 4 bytes based on the instruction
  • Register: the value in one of the 8 integer registers
  • Memory: a memory address
    » There are many modes for addressing memory
Operand examples using mov
    Source   Destination   Example              C analog
    Imm      Reg           movl $0x4,%eax       temp = 0x4;
    Imm      Mem           movl $-147,(%eax)    *p = -147;
    Reg      Reg           movl %eax,%edx       temp2 = temp1;
    Reg      Mem           movl %eax,(%edx)     *p = temp;
    Mem      Reg           movl (%eax),%edx     temp = *p;
– Memory-memory transfers cannot be done with a single instruction
Addressing Modes
– Immediate and register operands have only one mode
– Memory, on the other hand…
  • Absolute: specify the address of the data
  • Indirect: use a register to calculate the address
  • Base + displacement: use a register plus an absolute address to calculate the address
  • Indexed: add the contents of an index register
  • Scaled index: add the contents of an index register scaled by a constant
Summary of IA32 Operand Forms
    Type        Form              Operand value                  Name
    Immediate   $Imm              Imm                            Immediate
    Register    Ea                R[Ea]                          Register
    Memory      Imm               M[Imm]                         Absolute
    Memory      (Ea)              M[R[Ea]]                       Indirect
    Memory      Imm(Eb)           M[Imm + R[Eb]]                 Base + displacement
    Memory      (Eb, Ei)          M[R[Eb] + R[Ei]]               Indexed
    Memory      Imm(Eb, Ei)       M[Imm + R[Eb] + R[Ei]]         Indexed
    Memory      (, Ei, s)         M[R[Ei] * s]                   Scaled indexed
    Memory      Imm(, Ei, s)      M[Imm + R[Ei] * s]             Scaled indexed
    Memory      (Eb, Ei, s)       M[R[Eb] + R[Ei] * s]           Scaled indexed
    Memory      Imm(Eb, Ei, s)    M[Imm + R[Eb] + R[Ei] * s]     Scaled indexed
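Every memory form in the table is a special case of the last row, Imm(Eb, Ei, s), with unused parts set to zero (or s set to 1). A small sketch of that address computation follows; the function name effective_address is made up here, register contents are passed in as plain integers, and the arithmetic wraps to 32 bits as it would on IA32.

```c
#include <stdint.h>

/* Address named by Imm(Eb, Ei, s): Imm + R[Eb] + R[Ei] * s, modulo 2^32.
 * base and index stand in for the current contents of Eb and Ei.
 * On real hardware s is limited to 1, 2, 4, or 8; this sketch does not check. */
uint32_t effective_address(int32_t imm, uint32_t base, uint32_t index, uint32_t scale)
{
    return (uint32_t)imm + base + index * scale;
}
```

For example, with %ebp holding 0x1000, the operand -8(%ebp) corresponds to effective_address(-8, 0x1000, 0, 1), and an int array element a[3] based at 0x1000 corresponds to effective_address(0, 0x1000, 3, 4).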
x86 instructions
– Rules
  • Source operand can be memory, register, or constant
  • Destination can be memory or register
  • Only one of source and destination can be memory
  • Source and destination must be the same size
– Flags (EFLAGS)
  • Set by arithmetic and logical instructions (mov, for example, leaves them unchanged)
  • Conditional branches are handled via EFLAGS
What’s the “l” for on the end?
– addl 8(%ebp),%eax
  • It stands for “long” and is 32 bits
  • It tells the size of the operand
  • Baggage from the days of 16-bit processors
– For x86 and x86_64
  • 8 bits is a byte
  • 16 bits is a word
  • 32 bits is a double word
  • 64 bits is a quad word
IA32 Standard Data Types
    C declaration    Intel data type       GAS suffix   Size in bytes
    char             Byte                  b            1
    short            Word                  w            2
    int              Double word           l            4
    unsigned         Double word           l            4
    long int         Double word           l            4
    unsigned long    Double word           l            4
    char *           Double word           l            4
    float            Single precision      s            4
    double           Double precision      l            8
    long double      Extended precision    t            10/12
Global vs. Local variables
– Global variables are stored in either the .data or .bss section of the process
– Local variables are stored on the stack
Global vs local example
Global version:
    int x = 1;
    int y = 2;
    void a()
    {
        x = x+y;
        printf("Total = %d\n",x);
    }
    int main() { a(); }
Disassembly of the global version (x and y are accessed at the absolute addresses 0x804966c and 0x8049670):
    080483c4 <a>:
     80483c4: push %ebp
     80483c5: mov %esp,%ebp
     80483c7: sub $0x8,%esp
     80483ca: mov 0x804966c,%edx
     80483d0: mov 0x8049670,%eax
     80483d5: lea (%edx,%eax,1),%eax
     80483d8: mov %eax,0x804966c
     80483dd: mov 0x804966c,%eax
     80483e2: mov %eax,0x4(%esp)
     80483e6: movl $0x80484f0,(%esp)
     80483ed: call 80482dc <printf@plt>
     80483f2: leave
     80483f3: ret
Local version:
    void a()
    {
        int x = 1;
        int y = 2;
        x = x+y;
        printf("Total = %d\n",x);
    }
    int main() { a(); }
Disassembly of the local version (x and y live on the stack at -0x8(%ebp) and -0x4(%ebp)):
    080483c4 <a>:
     80483c4: push %ebp
     80483c5: mov %esp,%ebp
     80483c7: sub $0x18,%esp
     80483ca: movl $0x1,-0x8(%ebp)
     80483d1: movl $0x2,-0x4(%ebp)
     80483d8: mov -0x4(%ebp),%eax
     80483db: add %eax,-0x8(%ebp)
     80483de: mov -0x8(%ebp),%eax
     80483e1: mov %eax,0x4(%esp)
     80483e5: movl $0x80484f0,(%esp)
     80483ec: call 80482dc <printf@plt>
     80483f1: leave
     80483f2: ret
Arithmetic operations
    void f()
    {
        int a = 0;
        int b = 1;
        a = a+11;
        a = a-b;
        a--;
        b++;
    }
    int main() { f(); }

    08048394 <f>:
     8048394: push %ebp
     8048395: mov %esp,%ebp
     8048397: sub $0x10,%esp
     804839a: movl $0x0,-0x8(%ebp)
     80483a1: movl $0x1,-0x4(%ebp)
     80483a8: addl $0xb,-0x8(%ebp)
     80483ac: mov -0x4(%ebp),%eax
     80483af: sub %eax,-0x8(%ebp)
     80483b2: subl $0x1,-0x8(%ebp)
     80483b6: addl $0x1,-0x4(%ebp)
     80483ba: leave
     80483bb: ret
Machine Instruction Example
– C code
  • Add two signed integers
    int sum(int x, int y)
    {
        int t = x+y;
        return t;
    }
– Assembly
  • Add two 4-byte integers
    » “Long” words in GCC parlance
    » Same instruction whether signed or unsigned
  • Operands of addl 8(%ebp),%eax
    » x: register %eax
    » y: memory M[%ebp+8]
    » t: register %eax (return function value in %eax)
    _sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        movl %ebp,%esp
        popl %ebp
        ret
– Object code
  • 3-byte instruction
  • Stored at address 0x401046
    0x401046: 03 45 08
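The three object-code bytes 03 45 08 can be read by hand. In the IA32 encoding, opcode 03 is the 32-bit add with a register destination and a register-or-memory source, the second byte is a ModRM byte, and the third is an 8-bit displacement. The sketch below only splits a ModRM byte into its three bit fields; the struct and function names are illustrative, and the sketch does not attempt full instruction decoding.

```c
#include <stdint.h>

/* The three bit fields packed into an x86 ModRM byte. */
struct modrm {
    unsigned mod;  /* bits 7..6: addressing mode (1 = register + 8-bit displacement) */
    unsigned reg;  /* bits 5..3: register operand (0 = %eax) */
    unsigned rm;   /* bits 2..0: base register or second register (5 = %ebp) */
};

struct modrm split_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (byte >> 6) & 0x3u;
    m.reg = (byte >> 3) & 0x7u;
    m.rm  = byte & 0x7u;
    return m;
}
```

For the byte 0x45 this gives mod = 1, reg = 0, rm = 5: destination %eax, source at %ebp plus the 8-bit displacement that follows (08), which matches addl 8(%ebp),%eax.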
Condition codes
– The IA32 processor has a register called eflags (extended flags)
– Each bit is a flag, or condition code
  • CF: Carry Flag      SF: Sign Flag
  • ZF: Zero Flag       OF: Overflow Flag
– As programmers, we don’t write to this register and seldom read it directly
– Flags are set or cleared by hardware depending on the result of an instruction
Condition Codes (cont.)
– Setting condition codes via the compare instruction: cmpl b,a
  • Computes a-b without setting the destination
  • CF set if carry out from most significant bit
    » Used for unsigned comparisons
  • ZF set if a == b
  • SF set if (a-b) < 0
  • OF set if two’s complement overflow
    » (a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)
  • Byte and word versions: cmpb, cmpw
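The flag rules for cmpl b,a can be checked by computing them in C. This is a sketch under the assumption of 32-bit operands; the names cmp_flags and struct flags are invented for the example. For a subtraction, x86 sets CF when a borrow is needed, which is the same as a < b when both are treated as unsigned.

```c
#include <stdint.h>

/* The four condition codes discussed above, each 0 or 1. */
struct flags { int cf, zf, sf, of; };

/* Flags produced by "cmpl b,a", which computes a - b and discards the result. */
struct flags cmp_flags(uint32_t a, uint32_t b)
{
    struct flags f;
    uint32_t r = a - b;                       /* wraps modulo 2^32 */

    f.cf = a < b;                             /* unsigned borrow */
    f.zf = (r == 0);                          /* a == b */
    f.sf = (int)(r >> 31);                    /* sign bit of the result */
    /* Signed overflow: a and b have different signs, and the result's
     * sign differs from a's sign. */
    f.of = (int)(((a ^ b) & (a ^ r)) >> 31);
    return f;
}
```

Comparing 1 with 2 sets CF and SF (1 is below 2 both unsigned and signed), while comparing the most negative int with 1 sets OF, because the true difference does not fit in 32 bits.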
Condition Codes (cont.)
– Setting condition codes via the test instruction: testl b,a
  • Computes a&b without setting the destination
    » Sets condition codes based on the result
    » Useful to have one of the operands be a mask
  • Often used to test for zero or positive
    » testl %eax, %eax
  • ZF set when a&b == 0
  • SF set when a&b < 0
  • Byte and word versions: testb, testw
if statements
    void f()
    {
        int x = 1;
        int y = 2;
        if (x==y) {
            printf("x equals y.\n");
        } else {
            printf("x is not equal to y.\n");
        }
    }
    int main() { f(); }

    080483c4 <f>:
     80483c4: push %ebp
     80483c5: mov %esp,%ebp
     80483c7: sub $0x18,%esp
     80483ca: movl $0x1,-0x8(%ebp)
     80483d1: movl $0x2,-0x4(%ebp)
     80483d8: mov -0x8(%ebp),%eax
     80483db: cmp -0x4(%ebp),%eax
     80483de: jne 80483ee <f+0x2a>
     80483e0: movl $0x80484f0,(%esp)
     80483e7: call 80482d8 <puts@plt>
     80483ec: jmp 80483fa <f+0x36>
     80483ee: movl $0x80484fc,(%esp)
     80483f5: call 80482d8 <puts@plt>
     80483fa: leave
     80483fb: ret
if statements
    int a = 1, b = 3, c;
    if (a > b)
        c = a;
    else
        c = b;

    00000018: C7 45 FC 01 00 00 00  mov dword ptr [ebp-4],1      ; store a = 1
    0000001F: C7 45 F8 03 00 00 00  mov dword ptr [ebp-8],3      ; store b = 3
    00000026: 8B 45 FC              mov eax,dword ptr [ebp-4]    ; move a into EAX register
    00000029: 3B 45 F8              cmp eax,dword ptr [ebp-8]    ; compare a with b (subtraction)
    0000002C: 7E 08                 jle 00000036                 ; if (a<=b) jump to address 00000036
    0000002E: 8B 4D FC              mov ecx,dword ptr [ebp-4]    ; else move a (1) into ECX register
    00000031: 89 4D F4              mov dword ptr [ebp-0Ch],ecx  ; move ECX into c (12 bytes down)
    00000034: EB 06                 jmp 0000003C                 ; unconditional jump to 0000003C
    00000036: 8B 55 F8              mov edx,dword ptr [ebp-8]    ; move b (3) into EDX register
    00000039: 89 55 F4              mov dword ptr [ebp-0Ch],edx  ; move EDX into c (12 bytes down)
Loops
    int factorial_do(int x)
    {
        int result = 1;
        do {
            result *= x;
            x = x-1;
        } while (x > 1);
        return result;
    }

    factorial_do:
        pushl %ebp
        movl %esp, %ebp
        movl 8(%ebp), %edx
        movl $1, %eax
    .L2:
        imull %edx, %eax
        decl %edx
        cmpl $1, %edx
        jg .L2
        leave
        ret
C switch statements
– Implementation options
  • Series of conditionals
    » cmpl (or testl) followed by je
    » Good if few cases
    » Slow if many cases
  • Jump table (example below)
    » Look up the branch target from a table
    » Possible with a small range of integer constants
– GCC picks the implementation based on structure
– Example:
    switch (x) {
    case 1:
    case 5:
        code at L0
    case 2:
    case 3:
        code at L1
    default:
        code at L2
    }
– Jump table at .L3, entries for x = 0 through 5: .L2, .L0, .L1, .L1, .L2, .L0
  1. init jump table at .L3
  2. get address at .L3+4*x
  3. jump to that address
Example
    int switch_eg(int x)
    {
        int result = x;
        switch (x) {
        case 100:
            result *= 13;
            break;
        case 102:
            result += 10;
            /* Fall through */
        case 103:
            result += 11;
            break;
        case 104:
        case 106:
            result *= result;
            break;
        default:
            result = 0;
        }
        return result;
    }

        leal -100(%edx),%eax
        cmpl $6,%eax
        ja .L9
        jmp *.L10(,%eax,4)
        .p2align 4,,7
    .section .rodata
        .align 4
        .align 4
    .L10:
        .long .L4
        .long .L9
        .long .L5
        .long .L6
        .long .L8
        .long .L9
        .long .L8
    .text
        .p2align 4,,7
    .L4:
        leal (%edx,%edx,2),%eax
        leal (%edx,%eax,4),%edx
        jmp .L3
        .p2align 4,,7
    .L5:
        addl $10,%edx
    .L6:
        addl $11,%edx
        jmp .L3
        .p2align 4,,7
    .L8:
        imull %edx,%edx
        jmp .L3
        .p2align 4,,7
    .L9:
        xorl %edx,%edx
    .L3:
        movl %edx,%eax
– Key is the jump table at .L10
  • Array of pointers to jump locations
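The jump-table idea can be mimicked in C with an array of function pointers, which makes the “subtract 100, bounds-check, index the table” sequence in the assembly easier to follow. This is only a sketch of the technique: the handler names and switch_eg_table are invented for illustration, and a compiler's real table holds code addresses inside one function rather than separate functions.

```c
/* One handler per jump-table target; names are illustrative only. */
typedef int (*case_fn)(int);

static int case_100(int r)     { return r * 13; }
static int case_102(int r)     { return r + 10 + 11; }   /* falls through into case 103 */
static int case_103(int r)     { return r + 11; }
static int case_square(int r)  { return r * r; }          /* cases 104 and 106 */
static int case_default(int r) { (void)r; return 0; }

/* Table indexed by x - 100, mirroring the seven .long entries at .L10. */
static const case_fn jump_table[7] = {
    case_100, case_default, case_102, case_103,
    case_square, case_default, case_square
};

int switch_eg_table(int x)
{
    /* Like "leal -100(%edx),%eax; cmpl $6,%eax; ja .L9": one unsigned
     * comparison rejects both x < 100 and x > 106. */
    unsigned idx = (unsigned)x - 100u;
    if (idx > 6u)
        return 0;
    return jump_table[idx](x);
}
```

The unsigned comparison is the detail worth noticing in disassembly: a single ja after subtracting the lowest case value is a strong hint that a jump table follows.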
x86-64 conditionals
– Modern CPUs have deep pipelines
  • Instructions are fetched far in advance of execution
  • Masks the latency of going to memory
  • Problem: what if you hit a conditional branch?
    » Must predict which branch to take!
    » Branch prediction in CPUs is well-studied and fairly effective
    » But it is best to avoid conditional branching altogether
– Conditional instruction execution
Conditional Move
– Conditional move instruction
  • cmovXX src, dest
  • Moves value from src to dest if condition XX holds
  • No branching
  • Handled as an operation within the execution unit
  • Added with the P6 microarchitecture (Pentium Pro onward)
– Example
    movl 8(%ebp),%edx     # Get x
    movl 12(%ebp),%eax    # rval=y
    cmpl %edx, %eax       # rval:x
    cmovll %edx,%eax      # If <, rval=x
– GCC won’t use this instruction when it thinks it’s compiling for a 386 (the default 32-bit target when these slides were written); targeting a P6-class CPU (e.g., -march=i686) allows it
– Performance
  • 14 cycles on all data
  • More efficient than conditional branching (simple control flow)
  • But overhead: both branches are evaluated
x86-64 conditional example
    int absdiff(int x, int y)
    {
        int result;
        if (x > y) {
            result = x-y;
        } else {
            result = y-x;
        }
        return result;
    }

    absdiff:                  # x in %edi, y in %esi
        movl %edi, %eax       # eax = x
        movl %esi, %edx       # edx = y
        subl %esi, %eax       # eax = x-y
        subl %edi, %edx       # edx = y-x
        cmpl %esi, %edi       # x:y
        cmovle %edx, %eax     # eax=edx if <=
        ret
IA32 Stack
– Region of memory managed with stack discipline
– Grows toward lower addresses
– Register %esp indicates the lowest stack address
  • Address of the top element
[Figure: stack “bottom” at the highest address, stack “top” at the lowest; addresses increase upward, the stack grows down, and the stack pointer %esp points at the top]
IA32 Stack Pushing
– pushl Src
  • Decrement %esp by 4
  • Fetch operand at Src
  • Write operand at address given by %esp
– e.g. pushl %eax is equivalent to
    subl $4, %esp
    movl %eax,(%esp)
[Figure: after the push, the stack pointer %esp has moved down by 4 and points at the new stack “top”]
IA32 Stack Popping
– popl Dest
  • Read operand at address given by %esp
  • Write to Dest
  • Increment %esp by 4
– e.g. popl %eax is equivalent to
    movl (%esp),%eax
    addl $4,%esp
[Figure: after the pop, the stack pointer %esp has moved up by 4 and points at the new stack “top”]
Stack Operation Examples
[Figure: registers and stack contents at three points in time]
                      Initially   After pushl %eax   After popl %edx
    %eax              213         213                213
    %edx              555         555                213
    %esp              0x108       0x104              0x108
    memory at 0x108   123         123                123
    memory at 0x104               213                213
– The stack top is at 0x108 initially, at 0x104 after the push, and back at 0x108 after the pop
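The push and pop steps can be replayed with a toy model: a small array standing in for memory and a variable standing in for %esp. This is a sketch for following the figure, not an emulator; struct machine, push32, and pop32 are hypothetical names, and the model assumes word-aligned addresses below 0x200.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 0x200u   /* toy memory covering addresses 0x000..0x1ff */

struct machine {
    uint32_t esp;                  /* stack pointer, a byte address */
    uint32_t mem[MEM_BYTES / 4];   /* memory as 32-bit words */
};

/* pushl: decrement %esp by 4, then store the value at the new top. */
void push32(struct machine *m, uint32_t value)
{
    assert(m->esp >= 4 && m->esp <= MEM_BYTES && m->esp % 4 == 0);
    m->esp -= 4;
    m->mem[m->esp / 4] = value;
}

/* popl: read the value at the top, then increment %esp by 4. */
uint32_t pop32(struct machine *m)
{
    uint32_t value;
    assert(m->esp + 4 <= MEM_BYTES && m->esp % 4 == 0);
    value = m->mem[m->esp / 4];
    m->esp += 4;
    return value;
}
```

Starting from esp = 0x108 with 123 stored there, pushing 213 leaves esp at 0x104, and popping returns 213 with esp back at 0x108, matching the three columns of the figure. The popped word is not erased; it simply falls below the stack top.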
Procedure Control Flow
– Procedure call: call label
  • Push address of next instruction (after the call) on stack
  • Jump to label
– Procedure return: ret
  • Pop address from stack into eip register
Procedure Call Example
    804854e: e8 3d 06 00 00    call 8048b90 <main>
    8048553: 50                next instruction
[Figure: stack and registers before and after call 8048b90]
– Before the call: %esp = 0x108, %eip = 0x804854e; the stack holds 123 at 0x108
– After the call: %esp = 0x104, %eip = 0x8048b90; the return address 0x8048553 has been pushed at 0x104
– %eip is the program counter
Procedure Return Example
    8048e90: c3                ret
[Figure: stack and registers before and after ret]
– Before ret: %esp = 0x104, %eip = 0x8048e90; the stack holds 0x8048553 at 0x104 and 123 at 0x108
– After ret: %esp = 0x108, %eip = 0x8048553
– %eip is the program counter
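The bytes e8 3d 06 00 00 in the call example encode a relative call: opcode e8 followed by a 32-bit little-endian offset that is added to the address of the next instruction. The sketch below reproduces that arithmetic; decode_call_rel32 and struct call_info are names invented here, and only the 5-byte e8 form of call is handled.

```c
#include <stdint.h>

struct call_info {
    uint32_t return_addr;  /* address pushed on the stack by the call */
    uint32_t target;       /* address loaded into %eip */
};

/* Decodes a 5-byte "call rel32" located at address addr.
 * Returns 0 on success, -1 if the first byte is not the e8 opcode. */
int decode_call_rel32(uint32_t addr, const unsigned char insn[5], struct call_info *out)
{
    uint32_t rel;

    if (insn[0] != 0xe8)
        return -1;

    /* The offset is stored least significant byte first. */
    rel = (uint32_t)insn[1]
        | ((uint32_t)insn[2] << 8)
        | ((uint32_t)insn[3] << 16)
        | ((uint32_t)insn[4] << 24);

    out->return_addr = addr + 5;            /* the instruction is 5 bytes long */
    out->target = out->return_addr + rel;   /* wraps modulo 2^32 for backward calls */
    return 0;
}
```

For the example, the offset is 0x63d, the return address is 0x804854e + 5 = 0x8048553, and the target is 0x8048553 + 0x63d = 0x8048b90, the values shown in the figure.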
Procedure Control Flow
– When procedure foo calls who:
  • foo is the caller, who is the callee
  • Control is transferred to the ‘callee’
– When procedure returns
  • Control is transferred back to the ‘caller’
– Last-called, first-return (LIFO) order
  • Naturally implemented via the stack
    foo(…)
    {
        • • •
        who();
        • • •
    }
    who(…)
    {
        • • •
        amI();
        • • •
        amI();
        • • •
    }
    amI(…)
    {
        • • •
        • • •
    }
[Figure: arrows mark the calls from foo to who and from who to amI, and the matching rets back to each caller]
Procedure calls and stack frames
– How does the ‘callee’ know where to return later?
  • Return address placed in a well-known location on the stack within a “stack frame”
– How are arguments passed to the ‘callee’?
  • Arguments placed in a well-known location on the stack within a “stack frame”
– Upon procedure invocation
  • Stack frame created for the procedure
  • Stack frame is pushed onto the program stack
– Upon procedure return
  • Its frame is popped off of the stack
  • Caller’s stack frame is recovered
[Figure: call chain foo => who => amI. From the stack bottom (highest address) down: foo’s stack frame, who’s stack frame, amI’s stack frame; addresses increase upward and the stack grows downward]
Keeping track of stack frames
– The stack pointer (%esp) moves around
  • Can be changed within a procedure
– Problem
  • How can we consistently find our parameters?
– The base pointer (%ebp)
  • Points to the base of our current stack frame
  • Also called the frame pointer
  • Within each function, %ebp stays constant
– Most information on the stack is referenced relative to the base pointer
– Base pointer setup is the programmer’s job
  • Actually, usually the compiler’s job
IA32/Linux Stack Frame
– Current stack frame (yellow), from top to bottom
  • Parameters for the function about to be called
    » “Argument build” of the caller
  • Local variables
    » If they can’t be kept in registers
  • Saved register context
  • Old frame pointer
– Caller stack frame (pink)
  • Return address
    » Pushed by the call instruction
  • Arguments for this call
    » The caller’s “argument build” for the callee
  • etc…
[Figure: from high to low addresses: caller frame (…, arguments, return addr), then the current frame: old %ebp (frame pointer %ebp points here), saved registers + local variables, argument build (stack pointer %esp points at its end)]
Calling swap from call_swap
    void swap(int *xp, int *yp)
    {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

    int zip1 = 15213;
    int zip2 = 91125;

    void call_swap()
    {
        swap(&zip1, &zip2);
    }

    call_swap:
        • • •
        pushl $zip2    # Global Var
        pushl $zip1    # Global Var
        call swap
        • • •
– Resulting stack (high to low addresses): …, &zip2, &zip1, Rtn adr; %esp points at Rtn adr
swap
    void swap(int *xp, int *yp)
    {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

    swap:
        # Setup
        pushl %ebp
        movl %esp,%ebp
        pushl %ebx
        # Body
        movl 12(%ebp),%ecx
        movl 8(%ebp),%edx
        movl (%ecx),%eax
        movl (%edx),%ebx
        movl %eax,(%edx)
        movl %ebx,(%ecx)
        # Finish
        movl -4(%ebp),%ebx
        movl %ebp,%esp
        popl %ebp
        ret
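swap Setup #1
    swap:
        pushl %ebp
        movl %esp,%ebp
        pushl %ebx
– Step shown: pushl %ebp
– Entering stack (high to low addresses): …, &zip2, &zip1, Rtn adr; %esp points at Rtn adr; %ebp still points into the caller’s frame
– Resulting stack: …, yp, xp, Rtn adr, Old %ebp; %esp points at Old %ebp; %ebp is unchanged
swap Setup #2
– Step shown: movl %esp,%ebp
– Stack before instruction: …, yp, xp, Rtn adr, Old %ebp; %esp points at Old %ebp
– Resulting stack: same contents; %ebp and %esp both point at Old %ebp
swap Setup #3
– Step shown: pushl %ebx
– Stack before instruction: …, yp, xp, Rtn adr, Old %ebp; %ebp and %esp both point at Old %ebp
– Resulting stack: …, yp, xp, Rtn adr, Old %ebp, Old %ebx; %ebp points at Old %ebp; %esp points at Old %ebx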
Effect of swap Setup
– Entering stack (high to low addresses): …, &zip2, &zip1, Rtn adr; %esp points at Rtn adr
– Resulting stack
    Offset (relative to %ebp)   Contents
    12                          yp
    8                           xp
    4                           Rtn adr
    0                           Old %ebp    (%ebp points here)
    -4                          Old %ebx    (%esp points here)
– Body
    movl 12(%ebp),%ecx    # get yp
    movl 8(%ebp),%edx     # get xp
    . . .
swap Finish #1
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
– Step shown: movl -4(%ebp),%ebx
– swap’s stack, by offset from %ebp: 12 yp, 8 xp, 4 Rtn adr, 0 Old %ebp (%ebp), -4 Old %ebx (%esp)
– The stack is unchanged by this step
– Observation: saved & restored register %ebx
swap Finish #2
– Step shown: movl %ebp,%esp
– Before: %esp points at Old %ebx (offset -4)
– After: %esp and %ebp both point at Old %ebp (offset 0)
swap Finish #3
– Step shown: popl %ebp
– Before: %esp and %ebp both point at Old %ebp (offset 0)
– After: %ebp holds the caller’s frame pointer again; %esp points at Rtn adr
swap Finish #4
– Step shown: ret
– Before: %esp points at Rtn adr
– Exiting stack (high to low addresses): …, &zip2, &zip1; %esp points at &zip1; %ebp points into the caller’s frame
– Observation
  • Saved & restored register %ebx
  • Didn’t do so for %eax, %ecx, or %edx
swap
    void swap(int *xp, int *yp)
    {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

    swap:
        # Setup
        pushl %ebp            # Save old %ebp of caller frame
        movl %esp,%ebp        # Set new %ebp for callee (current) frame
        pushl %ebx            # Save state of %ebx register from caller
        # Body
        movl 12(%ebp),%ecx    # Retrieve parameter yp from caller frame
        movl 8(%ebp),%edx     # Retrieve parameter xp from caller frame
        movl (%ecx),%eax      # Perform swap
        movl (%edx),%ebx
        movl %eax,(%edx)
        movl %ebx,(%ecx)
        # Finish
        movl -4(%ebp),%ebx    # Restore the state of caller’s %ebx register
        movl %ebp,%esp        # Set stack pointer to bottom of callee frame (%ebp)
        popl %ebp             # Restore %ebp to original state
        ret                   # Pop return address from stack to %eip
– “movl %ebp,%esp” followed by “popl %ebp” is equivalent to a single leave instruction
Local variables
– Where are they in relation to ebp?
  • Stored “above” %ebp (at lower addresses)
– How are they preserved if the current function calls another function?
  • Compiler updates %esp beyond the local variables before issuing “call”
– What happens to them when the current function returns?
  • They are lost (i.e., no longer valid)
Register Saving Conventions
– When procedure foo calls who:
  • foo is the caller, who is the callee
– Can a register be used for temporary storage?
– Conventions
  • “Caller Save”
    » Caller saves temporary in its frame before calling
  • “Callee Save”
    » Callee saves temporary in its frame before using
IA32 Register Usage
– Integer registers
  • Two have special uses
    » %ebp, %esp
  • Three managed as callee-save
    » %ebx, %esi, %edi
    » Old values saved on stack prior to using
  • Three managed as caller-save
    » %eax, %edx, %ecx
    » Do what you please, but expect any callee to do so, as well
  • Return value in %eax
    Caller-save temporaries:   %eax, %edx, %ecx
    Callee-save temporaries:   %ebx, %esi, %edi
    Special:                   %esp, %ebp
simple.c
    int simple(int *xp, int y)
    {
        int t = *xp + y;
        *xp = t;
        return t;
    }

    gcc -O2 -c simple.c

    _simple:
        pushl %ebp             # Set up stack frame pointer
        movl %esp, %ebp
        movl 8(%ebp), %edx     # get xp
        movl 12(%ebp), %ecx    # get y
        movl (%edx), %eax      # move *xp to t
        addl %ecx, %eax        # add y to t
        movl %eax, (%edx)      # store t at *xp
        popl %ebp              # restore frame pointer
        ret                    # return to caller
Function pointers
– Pointers in C can also point to code locations
  • Function pointers
    » Store and pass references to code
  • Some uses
    » Dynamic “late-binding” of functions: dynamically “set” a random number generator
    » Replace large switch statements for implementing dynamic event handlers (example: dynamically setting the behavior of GUI buttons)
    » Emulating “virtual functions” and polymorphism from OOP
    » qsort() with a user-supplied callback function for comparison (man qsort)
    » Operating on lists of elements: multiplication, addition, min/max, etc.
– Malware leverages this to execute its own code
Using pointers to functions
    // function prototypes
    int doEcho(char*);
    int doExit(char*);
    int doHelp(char*);
    int setPrompt(char*);

    // dispatch table section
    typedef int (*func)(char*);

    typedef struct {
        char* name;
        func function;
    } func_t;

    func_t func_table[] = {
        { "echo",   doEcho },
        { "exit",   doExit },
        { "quit",   doExit },
        { "help",   doHelp },
        { "prompt", setPrompt },
    };

    #define cntFuncs (sizeof(func_table) / sizeof(func_table[0]))

    // find the function and dispatch it
    for (i = 0; i < cntFuncs; i++) {
        if (strcmp(command, func_table[i].name) == 0) {
            done = func_table[i].function(argument);
            break;
        }
    }
    if (i == cntFuncs)
        printf("invalid command\n");
Function pointers example
    #include <sys/time.h>
    #include <stdio.h>
    void fp1(int i) { printf("Even %d\n", i); }
    void fp2(int i) { printf("Odd %d\n", i); }

    int main(int argc, char **argv) {
        void (*fp)(int);
        int i = argc;

        if (argc%2) fp=fp2;
        else fp=fp1;
        fp(i);
    }

    mashimaro % ./funcp a
    Even 2
    mashimaro % ./funcp a b
    Odd 3
    mashimaro %
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $4, %esp
movl (%ecx), %eax
movl $fp2, %edx
testb $1, %al
jne .L4
movl $fp1, %edx
.L4:
movl %eax, (%esp)
call *%edx
addl $4, %esp
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
Uses in operating system
– Interrupt descriptor table (IDT)
  • Pointers to interrupt handler functions
  • IDTR points to the IDT
– System services descriptor table
  • Pointers to system call functions
– Import address table
  • Pointers to imported library calls
– Malware attacks all of these
More disassembly
– Code patterns in assembly
  • Calling conventions (fast vs. standard vs. cdecl)
  • ebp omission
  • ecx use as C++ this pointer
  • C++ vtables (virtual function table)
  • WinXP SP2 prologue with patching support
    » For detours
  • Exception handlers (FS register)
    » Linked list of functions stored in exception frames on stack
Advanced disassembly
– Windows examples
  • Largely the same with small modifications
  • Size of operands (e.g., dword) specified explicitly (not in operator suffix)
  • Reverse ordering of operands
Disassembly example
0000 mov ecx, 5
0003 push aHello
0009 call printf
000E loop 00000003h
0014 ...
for(int i=0;i<5;i++)
{
    printf("Hello");
}
0000 cmp ecx, 100h
0003 jnz 001Bh
0009 push aYes
000F call printf
0015 jmp 0027h
001B push aNo
0021 call printf
0027 ...
if(x == 256)
{
    printf("Yes");
}
else
{
    printf("No");
}
Disassembly example
int main(int argc, char **argv)
{
WSADATA wsa;
SOCKET s;
struct sockaddr_in name;
unsigned char buf[256];
// Initialize Winsock
if(WSAStartup(MAKEWORD(1,1),&wsa))
return 1;
// Create Socket
s = socket(AF_INET,SOCK_STREAM,0);
if(INVALID_SOCKET == s)
goto Error_Cleanup;
name.sin_family = AF_INET;
name.sin_port = htons(PORT_NUMBER);
name.sin_addr.S_un.S_addr = htonl(INADDR_ANY);
// Bind Socket To Local Port
if(SOCKET_ERROR == bind(s,(struct sockaddr*)&name,sizeof(name)))
goto Error_Cleanup;
// Set Backlog parameters
if(SOCKET_ERROR == listen(s,1))
goto Error_Cleanup;
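push ebp
mov ebp, esp
sub esp, 2A8h
lea eax, [ebp+0FFFFFE70h]                    ; &wsa
push eax
push 101h                                    ; MAKEWORD(1,1)
call 4012BEh                                 ; WSAStartup
test eax, eax
jz 401028h
mov eax, 1                                   ; return 1
jmp 40116Fh
push 0
push 1                                       ; SOCK_STREAM
push 2                                       ; AF_INET
call 4012B8h                                 ; socket
mov dword ptr [ebp+0FFFFFE6Ch], eax          ; s
cmp dword ptr [ebp+0FFFFFE6Ch], byte 0FFh    ; INVALID_SOCKET == s ?
jnz 401047h
jmp 401165h                                  ; goto Error_Cleanup
mov word ptr [ebp+0FFFFFE5Ch], 2             ; name.sin_family = AF_INET
push 800h                                    ; PORT_NUMBER
call 4012B2h                                 ; htons
mov word ptr [ebp+0FFFFFE5Eh], ax            ; name.sin_port
push 0                                       ; INADDR_ANY
call 4012ACh                                 ; htonl
mov dword ptr [ebp+0FFFFFE60h], eax          ; name.sin_addr
push 10h                                     ; sizeof(name)
lea ecx, [ebp+0FFFFFE5Ch]                    ; &name
push ecx
mov edx, [ebp+0FFFFFE6Ch]                    ; s
push edx
call 4012A6h                                 ; bind
cmp eax, byte 0FFh                           ; SOCKET_ERROR ?
jnz 40108Dh
jmp 401165h                                  ; goto Error_Cleanup
push 1                                       ; backlog
mov eax, [ebp+0FFFFFE6Ch]                    ; s
push eax
call 4012A0h                                 ; listen
cmp eax, byte 0FFh                           ; SOCKET_ERROR ?
jnz 4010A5h
jmp 401165h                                  ; goto Error_Cleanup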
Tools for disassembling
IDA Pro, IDA Pro Free
– Disassembler
– Execution graph
– Cross-referencing
– Searching
– Function analysis
– Function and variable labeling
Tools for disassembling
– objdump
  • objdump -d <object_file>
  • Analyzes the bit pattern of a series of instructions
  • Produces an approximate rendition of the assembly code
  • Can be run on either an executable or a relocatable (.o) file
– gdb debugger
  • gdb p
  • disassemble sum
    » Disassemble procedure
  • x/13b sum
    » Examine the 13 bytes starting at sum
In-class exercise
Lab 5-1 (Steps 1-17)
– Use IDA Pro to bring up the code of DllMain
– Bring up Figures 5-1L, the equivalent of 5-2L, and 5-3L
– Find the remote shell routine in which memcmp is used to compare command strings received over the network
– Show the code for the function called if the command robotwork is invoked
– Show IDA Pro graphs of DllMain and sub_10004E79
– Explain what the assembly code on p. 499 does
– Find the socket call referred to in Table 5-1L and change its integer constants to symbolic ones
– Show the assembly on p. 500. Find the routine that calls this assembly which shows that it is an anti-VM check.
In-class exercise
Lab 6-1
– Show the imported network functions in any tool
– Show the output of executing the binary
– Load binary in IDA Pro to generate Figure 6-1L
Lab 6-2
– Generate Listing 6-1L and 6-2L using a tool of your choice. What calls hint at this code's function?
– Using either Wireshark or netcat with Apate DNS, execute the malware to generate Listing 6-3L
– In IDA Pro, show the functions called by main. What does each one do?
– In IDA Pro, show the order that the WinINet calls are used and explain what each one does.
– Generate Listing 6-5L and explain what each cmp does.
Windows
Chapter 7: Analyzing Malicious Windows Programs
Types
– Hungarian notation
  • word (w) = 16-bit value
  • double word (dw) = dword = 32-bit value
    » dwSize = a variable whose type is a 32-bit value
  • Handles (H)
    » HWND = a handle to a window
  • Long Pointer (LP)
  • Callback
File system functions
Malware often hits file system
CreateFile, ReadFile, WriteFile
Memory mapping calls: CreateFileMapping, MapViewOfFile
Trickiness
• Alternate Data Streams (special file data)
• \Device\PhysicalMemory (accesses memory)
• \\.\ (accesses device)
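The three tricky path forms listed above are easy to spot in strings pulled from a sample. The sketch below is a simple heuristic under that assumption; the function name and the sample paths are hypothetical.

```python
def classify_path(path: str) -> list:
    """Return the unusual file-system features suggested by a path string."""
    findings = []
    lowered = path.lower()
    if lowered.startswith("\\\\.\\"):
        findings.append("device namespace")
    if lowered.startswith("\\device\\physicalmemory"):
        findings.append("physical memory object")
    if not findings:
        # Drop a leading drive letter ("C:") so its colon is not miscounted.
        has_drive = len(path) >= 2 and path[0].isalpha() and path[1] == ":"
        rest = path[2:] if has_drive else path
        last_component = rest.replace("/", "\\").rsplit("\\", 1)[-1]
        # An alternate data stream is written as filename:streamname.
        if ":" in last_component:
            findings.append("alternate data stream")
    return findings


samples = [
    r"C:\temp\normal.txt:hidden.exe",
    r"\Device\PhysicalMemory",
    r"\\.\PhysicalDrive0",
    r"C:\Windows\notepad.exe",
]
for sample in samples:
    print(sample, "->", classify_path(sample))
```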
Registry functions
Malware often hits registry
Registry stores OS and program configuration information
HKEY_LOCAL_MACHINE (HKLM) – Settings global to the machine
HKEY_CURRENT_USER (HKCU) – Settings for current user
Regedit tool for examining values
Functions: RegOpenKeyEx, RegSetValueEx, RegGetValue (Listing 7-1)
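When registry paths turn up in strings output or tool logs, it helps to normalise the root key first. The sketch below expands the two abbreviations from this slide and reports the scope of the setting; the function name is hypothetical and the sample path is only an illustration.

```python
# Root keys and scopes as described on the slide above.
HIVES = {
    "HKLM": ("HKEY_LOCAL_MACHINE", "global to the machine"),
    "HKCU": ("HKEY_CURRENT_USER", "current user"),
}
FULL_NAMES = {full: (full, scope) for full, scope in HIVES.values()}


def split_registry_path(path: str):
    """Split 'HKCU\\Software\\...' into (root key, scope, subkey)."""
    root, _, subkey = path.partition("\\")
    root = root.upper()
    full, scope = HIVES.get(root) or FULL_NAMES.get(root) or (root, "unknown")
    return full, scope, subkey


print(split_registry_path(r"HKCU\Software\Microsoft\Internet Explorer\Main"))
```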
Networking APIs
Berkeley sockets API
• socket, bind, listen, accept, connect, recv, send
• Listing 7-3
WinINet API
• InternetOpen, InternetOpenUrl, InternetReadFile
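The Berkeley call sequence is the same one that shows up in disassembly, so it is worth seeing once in a high-level language. The sketch below uses Python's socket module (not Winsock) on the loopback interface; the function name is hypothetical, and the tiny payload keeps the single-threaded exchange from blocking.

```python
import socket


def loopback_exchange(payload: bytes) -> bytes:
    """Run socket/bind/listen/accept on one side and socket/connect/send on the other."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        client.connect(("127.0.0.1", port))
        conn, _peer = server.accept()
        try:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)  # signal end of data to the receiver
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
        finally:
            conn.close()
    finally:
        client.close()
        server.close()
    return b"".join(chunks)


print(loopback_exchange(b"hello"))
```

AF_INET and SOCK_STREAM have the numeric values 2 and 1, which is why a socket call in a listing often appears as three pushes of 0, 1 and 2 before the call.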
DLLs
Dynamic link libraries
Store code that is re-used amongst applications including malware
Can be used to store malicious code for injection into a process
Malware uses standard Windows DLLs to interact with OS
Malware uses third-party DLLs (e.g. Firefox DLL) to avoid re-implementing functions
Processes
Execute code outside of current process
• CreateProcess
• Listing 7-4
Hijack execution of current process
Injecting code via debugger or DLLs
Companion execution
• Store executable in resource section of PE
• Program extracts executable and writes it to disk upon execution
Threads
Windows threads share same memory space but have separate registers and stack
Used by malware to insert a malicious DLL into a process's address space
• CreateThread with address of LoadLibrary as start address
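The shared-memory versus private-stack distinction can be seen in any threaded program. The sketch below uses Python threads as a stand-in for Windows threads (an analogy only, with hypothetical names): every thread writes to the same list, while each keeps its own local variable.

```python
import threading

shared_log = []                  # one object, visible to every thread
log_lock = threading.Lock()      # serialises access to the shared list


def worker(worker_id: int, count: int) -> None:
    local_total = 0              # lives in this thread's own stack frame
    for i in range(count):
        local_total += i
    with log_lock:
        shared_log.append((worker_id, local_total))


def run_workers(n: int, count: int) -> list:
    """Start n threads and return what they wrote to the shared list."""
    shared_log.clear()
    threads = [threading.Thread(target=worker, args=(i, count)) for i in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(shared_log)


print(run_workers(3, 5))
```

Because code started in a thread sees the whole address space of its process, a thread whose start routine loads a DLL brings that DLL into the process.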
Services
Processes run in the background
Scheduled and run by Windows service manager without user input
OpenSCManager, CreateService, StartService
Allows malware to maintain persistence on a machine
Types
• WIN32_SHARE_PROCESS = allows multiple processes to contact service (e.g. svchost.exe)
• WIN32_OWN_PROCESS = independent process
• KERNEL_DRIVER = loads code into kernel
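The service type is passed to CreateService as a bit field, so a value seen in disassembly can be decoded by testing bits. The sketch below uses the numeric values from the Windows headers for the three types on this slide; the function name is hypothetical and other service-type bits are deliberately ignored.

```python
# Windows header values for the three types named on the slide.
SERVICE_TYPES = {
    0x00000001: "KERNEL_DRIVER",
    0x00000010: "WIN32_OWN_PROCESS",
    0x00000020: "WIN32_SHARE_PROCESS",
}


def decode_service_type(value: int) -> list:
    """Return the names of the known type bits set in a service type value."""
    return [name for bit, name in sorted(SERVICE_TYPES.items()) if value & bit]


for value in (0x01, 0x10, 0x20):
    print(hex(value), "->", decode_service_type(value))
```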
COM
Microsoft Component Object Model
Interface standard that allows software components to call each other
• OleInitialize, CoInitializeEx
• CLSID = class identifier, IID = interface identifier
“Navigate” function in IWebBrowser2 interface
• Used by malware to launch browser
• Listing 7-11
Malware implemented as COM server
• Browser helper objects
• Detect COM servers by the functions they export
– DllCanUnloadNow, DllGetClassObject, DllInstall, DllRegisterServer, DllUnregisterServer
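Both checks on this slide are simple to script: spotting the COM server functions in an export list, and turning the 16 raw bytes of a CLSID or IID into its usual text form. The sketch below assumes the export names and GUID bytes have already been extracted with another tool; the function names are hypothetical, and the sample bytes are the CLSID commonly documented for Internet Explorer.

```python
import uuid

# Export names listed on the slide above.
COM_SERVER_EXPORTS = {
    "DllCanUnloadNow",
    "DllGetClassObject",
    "DllInstall",
    "DllRegisterServer",
    "DllUnregisterServer",
}


def com_server_exports(exports) -> list:
    """Return the COM server functions present in a DLL's export list."""
    return sorted(COM_SERVER_EXPORTS.intersection(exports))


def guid_from_memory(raw: bytes) -> str:
    """Format 16 bytes as laid out in memory (first three fields little-endian)."""
    return "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"


sample_exports = ["DllGetClassObject", "DllCanUnloadNow", "SomeOtherExport"]
print(com_server_exports(sample_exports))

sample_clsid = bytes.fromhex("01DF0200 0000 0000 C000 000000000046")
print(guid_from_memory(sample_clsid))
```

The byte swap matters when reading rclsid or riid from a hex dump: the first field appears as 01 DF 02 00 in memory but prints as 0002DF01.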
Exceptions
Allow program to handle exceptional conditions during program execution
Windows Structured Exception Handling
• Exception handling information stored on stack
• Listing 7-13
• Not all handlers respond to all exceptions
• Thrown to caller's frame if not handled
Used by malware to hijack execution
• Handler address replaced by address of injected malicious code
• Adversary then triggers exception
Kernel-mode malware
Windows API calls (Kernel32.dll)
• Typically call into underlying Native API (Ntdll.dll)
• Code in Ntdll then transfers to kernel (Ntoskrnl.exe) via INT 0x2E, SYSENTER, SYSCALL
• Figure 7-3
Malware often calls Ntdll directly to avoid detection via interposition of security programs between Kernel32.dll and Ntdll.dll
• Example: Windows API (ReadFile, WriteFile) versus Native API (NtReadFile, NtWriteFile)
• Figure 7-4
Kernel-mode malware
Other Native API calls
NtQuerySystemInformation, NtQueryInformationProcess, NtQueryInformationThread, NtQueryInformationFile, NtQueryInformationKey
• Can also carry “Zw” prefix
NtContinue
• Used to return from an exception
• Location to return is specified in exception context, but can be modified to transfer execution in nefarious ways
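Direct use of the Native API shows up in a sample's import table. The sketch below assumes the imports have already been extracted into a dictionary by another tool (that layout and the function name are hypothetical) and lists the Nt- and Zw-prefixed functions imported from Ntdll.dll.

```python
def direct_native_imports(imports: dict) -> list:
    """imports maps a DLL name to the function names imported from it."""
    hits = []
    for dll, functions in imports.items():
        if dll.lower() != "ntdll.dll":
            continue
        for name in functions:
            # Native API functions carry an Nt prefix and can also carry Zw.
            if name.startswith(("Nt", "Zw")):
                hits.append(name)
    return sorted(hits)


sample_imports = {
    "KERNEL32.dll": ["ReadFile", "WriteFile"],
    "ntdll.dll": ["NtWriteFile", "ZwQuerySystemInformation", "RtlInitUnicodeString"],
}
print(direct_native_imports(sample_imports))
```

A hit is only a lead for further analysis, since some legitimate programs also import a few Native API functions.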
Kernel-mode malware
Legitimate programs typically do not use the Native API exclusively
Programs that are native applications (as specified in the subsystem field of the PE header) are likely malicious
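The subsystem value can be read straight from the file. The sketch below follows the PE layout (e_lfanew at offset 0x3C, a 4-byte signature, a 20-byte COFF header, then Subsystem at offset 68 of the optional header) and is exercised on a synthetic stub rather than a real sample; the function names and the stub builder are hypothetical.

```python
import struct

# Subsystem values from the PE format; only the common ones are named here.
SUBSYSTEMS = {1: "NATIVE", 2: "WINDOWS_GUI", 3: "WINDOWS_CUI"}


def pe_subsystem(data: bytes) -> str:
    """Return the subsystem named in a PE image's optional header."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    optional_header = e_lfanew + 4 + 20      # skip signature and COFF header
    (subsystem,) = struct.unpack_from("<H", data, optional_header + 68)
    return SUBSYSTEMS.get(subsystem, f"other ({subsystem})")


def make_stub(subsystem: int) -> bytes:
    """Build a minimal fake header for demonstration (not a loadable PE file)."""
    buf = bytearray(0x40 + 24 + 70)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)
    buf[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", buf, 0x40 + 24 + 68, subsystem)
    return bytes(buf)


print(pe_subsystem(make_stub(1)))
print(pe_subsystem(make_stub(2)))
```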
In-class exercise
Lab 7-2
– Using strings, identify the network resource being used by the malware
– What imports give away the mechanism this malware uses to launch the browser?
– Go to the code snippet shown on p. 518. Follow the references to show the values of rclsid and riid in memory.
– Debug the program and break at the call shown on p. 519. Run the call to show the browser being launched with the embedded URL
Extra
Run-time data structures
More code snippets
Registry modifications for disabling task manager and changing browser default page
HKEY_CURRENT_USER\Software\Policies\Microsoft\Internet Explorer\Control Panel,Homepage
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\System
DisableRegistryTools
HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main
Start Page
HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_buzz
content url
HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_Launchcast
DisableTaskMgr
More code snippets
Kills anti-virus, ZoneAlarm, firewall processes
More code snippets
New variants
Download worm update files and register them as services
regsvr32 MSINET.OCX
Internet Transfer ActiveX Control
Check for updates