Analysis and Visualization of Common Packers
PowerOfCommunity, Seoul
Ero Carrera - [email protected]
Reverse Engineer at zynamics GmbH
Chief Research Officer at VirusTotal
Introduction
Originally meant to save space by reducing the redundancy in executable file formats
Simply compressed part or all of the executable
Created a new "envelope" around it that restored the original executable and then passed control to it
The decompressing envelope did little more than restore the executable
A historical perspective
Compression provided a trivial degree of obfuscation, but obfuscation nonetheless
It was easy to add additional measures to the decompressing envelope
Evolution of the techniques
Overview of the techniques
Import address table
Simple late reconstruction into an original form
Construction of new connectivity artifacts between the original code and imported modules
Strings
Destruction of informational components
Aimed at making tracing hard
Using SEHs triggered by hard-to-handle exceptions
Confusing debuggers by raising the INTs they rely on
Calling hard-to-hook low-level APIs/syscalls
Checking for hooks
Anti-debug
VM detection.
VMWare, VirtualPC, etc
Techniques aimed against specific tools
OllyDBG, IDA, Softice, etc
Anti-environment
Tricks detecting, confusing or aimed at crashing some of the most common tools
IDA
OllyDBG
Procdump
Softice
Breaking tools
Code obfuscation
Adding junk code, using opaque predicates
Code transformation
Virtual machines
Flow obfuscation (SEH, Nanomites)
Anti-analysis
Bochs
Provides a high-level view
No need to worry about most of the anti-* techniques
Windbg
Can do kernel-mode debugging, hook syscalls, look deeper than user-mode tools
Inspection of physical/virtual memory
Memoryze, for the real hardcore
Tools
Obfuscation & Anti-Analysis
Most tools will linearly disassemble a chunk of code
Introduce a non-terminal flow-branching instruction (not a ret or jmp)
Make it point later in the code, to the middle of what would be an instruction if disassembling linearly
Result => confusion
(latest IDA, 5.3, has some workarounds against this)
Also: indirect obfuscation through heavy optimization
Basic trickery against analysis algorithms
0101F001 60________ pusha
0101F002 E803000000 call near ptr loc_101F007+1
0101F007 E9EB045D45 jmp near ptr 465EF4F7h
Example: ASPack (original)
0101F001 60________ pusha
0101F002 E803000000 call loc_101F008
0101F007 E9________ db 0E9h ; T
0101F008 EB04______ jmp short loc_101F00E
Example: ASPack (fixed)
0DFE000000 or eax, 0FEh
3D02000000 cmp eax, 2
E901000000 jmp $+6
75B8______ jnz short near ptr 0FFFFFFC9h
F3C001C0__ rep rol byte ptr [ecx], 0C0h
C3________ ret
Example: Linear Disassembly
0DFE000000 or eax, 0FEh
3D02000000 cmp eax, 2
E901000000 jmp $+6 // HERE
75________ db 0x75
HERE: B8F3C001C0 mov eax, 0xc001c0f3
C3________ ret
Example: Linear Disassembly
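The difference between the two listings above can be reproduced with a small sketch. It is only an illustration: the decoder below is a hypothetical toy that understands just the handful of opcodes appearing in the example bytes, not a real x86 disassembler. A linear sweep decodes the stray 0x75 byte as a jnz and loses the mov, while a pass that follows the jmp recovers the intended instruction.

```python
import struct

# Bytes from the example above: or, cmp, jmp $+6, stray 0x75, mov, ret
CODE = bytes.fromhex("0DFE000000" "3D02000000" "E901000000" "75" "B8F3C001C0" "C3")

def decode(code, off):
    """Toy decoder: returns (length, text, branch_target, falls_through)."""
    op = code[off]
    if op in (0x0D, 0x3D, 0xB8):                      # or/cmp/mov eax, imm32
        imm = struct.unpack_from("<I", code, off + 1)[0]
        name = {0x0D: "or", 0x3D: "cmp", 0xB8: "mov"}[op]
        return 5, f"{name} eax, {imm:#x}", None, True
    if op == 0xE9:                                    # jmp rel32
        rel = struct.unpack_from("<i", code, off + 1)[0]
        return 5, "jmp", off + 5 + rel, False
    if op in (0x74, 0x75):                            # jz/jnz rel8
        rel = struct.unpack_from("<b", code, off + 1)[0]
        return 2, "jz" if op == 0x74 else "jnz", off + 2 + rel, True
    if code[off:off + 2] == b"\xF3\xC0":              # rep rol byte ptr [ecx], imm8
        return 4, "rep rol byte ptr [ecx], imm8", None, True
    if op == 0xC3:                                    # ret
        return 1, "ret", None, False
    return 1, f"db {op:#04x}", None, True

def linear_sweep(code):
    """Decode one instruction after another, ignoring control flow."""
    out, off = [], 0
    while off < len(code):
        length, text, _, _ = decode(code, off)
        out.append((off, text))
        off += length
    return out

def recursive_traversal(code, entry=0):
    """Decode only what control flow can reach from the entry point."""
    seen, work = {}, [entry]
    while work:
        off = work.pop()
        while 0 <= off < len(code) and off not in seen:
            length, text, target, falls_through = decode(code, off)
            seen[off] = text
            if target is not None:
                work.append(target)
            if not falls_through:
                break
            off += length
    return sorted(seen.items())

for off, text in linear_sweep(CODE):
    print(f"linear    {off:02d}: {text}")
for off, text in recursive_traversal(CODE):
    print(f"recursive {off:02d}: {text}")
```

The linear sweep starts an instruction at offset 15 (the stray byte), whereas the flow-following pass resumes at offset 16, where the mov really begins.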
2502000000 and eax, 2
3D02000000 cmp eax, 2
7401______ jz $+3
75B8______ jnz short near ptr 0FFFFFFC6h
F3C001C0__ rep rol byte ptr [ecx], 0C0h
C3________ ret
Example: Opaque Predicate
2502000000 and eax, 2
3D02000000 cmp eax, 2
7401______ jz $+3
75________ db 0x75
B8F3C001C0 mov eax, 0xc001c0f3
C3________ ret
Example: Opaque Predicate
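An opaque predicate is a condition whose outcome is fixed but not obvious to a static analysis tool, so both branches look reachable. The sketch below is a minimal illustration in Python and deliberately uses a different, arithmetic predicate than the listing above; the function names are made up for the example.

```python
def opaque_true(x: int) -> bool:
    # x and x + 1 are consecutive integers, so their product is always even.
    return (x * (x + 1)) % 2 == 0

def protected(x: int) -> int:
    if opaque_true(x):
        return x + 1        # the real code path
    return x - 1000         # dead code that only exists to mislead analysis

print(protected(41))
```

A tool that cannot prove the predicate constant has to analyze the dead branch as well, which is where junk bytes or misaligned instructions are usually placed.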
[Figure: Function. An executable image spans several memory pages; a function is made up of several function chunks, each a run of "address instruction (operand, ...)" lines.]
[Figure: Shared Blocks. Function A and Function B are each made up of blocks of "address instruction (operand, ...)" lines.]
Junk. Polymorphic and static
pusha
popa
Non-Standard Branching
Junk JMP insertion
Junk Code
018900CE 48________ dec eax
018900CF 60________ pusha
018900D0 B9FBDEF000 mov ecx, 0F0DEFBh
018900D5 50________ push eax
018900D6 9C________ pushf
018900D7 E912000000 jmp loc_18900EE
018900DC [junk data]
Example: Junk Code I (Themida)
018900EE loc_18900EE:
018900EE E90E000000__ jmp loc_1890101
018900F3 [junk data]
01890101 loc_1890101:
01890101 9D__________ popf
01890102 5E__________ pop esi
01890103 61__________ popa
01890104 0F844E06FA7A jz loc_7C830758
Example: Junk Code II (Themida)
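One simple way to read through junk JMP insertion is to follow the unconditional jumps and keep only the instructions that actually execute. The sketch below does this for the two Themida listings above. The table is transcribed by hand from those listings, with instruction sizes taken from the byte columns; the data structure and function name are assumptions made for the illustration.

```python
# address -> (size in bytes, text, target of an unconditional jmp or None)
LISTING = {
    0x018900CE: (1, "dec eax", None),
    0x018900CF: (1, "pusha", None),
    0x018900D0: (5, "mov ecx, 0F0DEFBh", None),
    0x018900D5: (1, "push eax", None),
    0x018900D6: (1, "pushf", None),
    0x018900D7: (5, "jmp", 0x018900EE),
    0x018900EE: (5, "jmp", 0x01890101),
    0x01890101: (1, "popf", None),
    0x01890102: (1, "pop esi", None),
    0x01890103: (1, "popa", None),
    0x01890104: (6, "jz loc_7C830758", None),
}

def follow(listing, start):
    """Return the executed instructions, skipping the junk that the jmps hop over."""
    executed, addr = [], start
    while addr in listing:
        size, text, target = listing[addr]
        if target is not None:
            addr = target          # unconditional jmp: continue at its target
            continue
        executed.append(text)
        addr += size               # fall through to the next instruction
    return executed

for text in follow(LISTING, 0x018900CE):
    print(text)
```

The walk stops after the jz because its successors are not part of the listing. The junk bytes between each jmp and its target never appear in the output.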
Visual Basic, Java, Python, Ruby, Perl, .NET
Starforce, VMProtect, x86 Virtualizer, Themida/CodeVirtualizer
At a high level it is a fetch, decode, handle algorithm
Virtual Machines
[Figure: Standard code runs on the real CPU. Virtualized code runs on a virtual CPU, which in turn runs on the real CPU.]
[Figure: Virtual CPU. Virtual CPU opcodes (opA reg1, reg2; opB reg2; branchA XYZ; ...) are processed in four steps:
1. Fetch, using the instruction pointer
2. Decode: look up operand in table, call handler
3. Execute handler (handlers for opA, opB, opC, branchA, ..., written in real CPU opcodes)
4. Update registers (general purpose, instruction pointer, stack pointer)]
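The fetch, decode, handle loop can be sketched in a few lines. This is a toy interpreter, not the design of any of the protectors named above: the opcode numbers, the fixed three-byte instruction format and the handler names are all invented for the illustration.

```python
# Hypothetical virtual opcodes: each instruction is 3 bytes (opcode, a, b).
HALT, ADD, DEC, JNZ = 0, 1, 2, 3

def h_add(vm, a, b):
    vm["r"][a] += vm["r"][b]          # r[a] = r[a] + r[b]

def h_dec(vm, a, b):
    vm["r"][a] -= 1                   # r[a] = r[a] - 1

def h_jnz(vm, a, b):
    if vm["r"][a] != 0:
        vm["ip"] = b                  # branch: overwrite the virtual instruction pointer

HANDLERS = {ADD: h_add, DEC: h_dec, JNZ: h_jnz}

def run(bytecode, regs):
    vm = {"r": list(regs), "ip": 0}
    while True:
        op, a, b = bytecode[vm["ip"]:vm["ip"] + 3]   # 1. fetch
        vm["ip"] += 3
        if op == HALT:
            return vm["r"]
        handler = HANDLERS[op]                        # 2. decode: table lookup
        handler(vm, a, b)                             # 3. execute, 4. update registers

# r0 += r1, repeated r2 times (a multiplication loop)
PROGRAM = bytes([ADD, 0, 1,  DEC, 2, 0,  JNZ, 2, 0,  HALT, 0, 0])
print(run(PROGRAM, [0, 5, 3]))
```

An analyst looking at the native code only sees the loop and the handlers; the protected logic lives in the bytecode, which is why recovering the handler table is usually the first step.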
Rolf Rolles and Boris Lau have already shown that optimization/reduction techniques can help
Translating to an intermediate representation and performing optimization on the code leads to reduced forms
You could use a tool like Peter’s “Find executable code” to discover instruction handlers from a memory dump of the VM
Virtual Machine Countermeasures
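As a rough idea of what "reduced forms" means, the sketch below folds runs of constant additions and subtractions in a made-up intermediate representation. It is not the technique from the cited work, only a minimal peephole pass under simplifying assumptions: instructions are (op, register, immediate) tuples and 32-bit wraparound is ignored.

```python
def simplify(ir):
    """Fold consecutive add/sub of constants on the same register; drop no-ops."""
    out = []
    for op, reg, imm in ir:
        if op in ("add", "sub"):
            delta = imm if op == "add" else -imm
            if out and out[-1][0] == "add" and out[-1][1] == reg:
                delta += out.pop()[2]          # merge with the previous add
            if delta != 0:
                out.append(("add", reg, delta))
        else:
            out.append((op, reg, imm))
    return out

obfuscated = [
    ("add", "eax", 5),
    ("sub", "eax", 3),
    ("add", "eax", -2),     # the three operations on eax cancel out
    ("mov", "ebx", 1),
]
print(simplify(obfuscated))
```

Real deobfuscators apply many such rewrites (constant folding, dead-store elimination, copy propagation) until the handler or the virtualized code shrinks to something readable.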
Some of the hardest current packers are VMProtect, Themida, Armadillo
They incorporate some complex, custom techniques
Usually protectors for commercial products
Advanced Packers
Armadillo
Double process debugging, debug blocker
Nanomites
Strategic Code Splicing
Armadillo's invalid instructions
LOCK prefix
Invalid MOV
Armadillo
[Figure: Armadillo nanomites. The parent process debugs the child process. When the child executes an INT 3, control transfers to the parent, which catches it, looks up the address, finds the target, sets the target in the child's context and resumes the child; control then transfers back to the child.]
Child's code
push ebp
mov ebp, esp
push 0
push 0
call XYZ
cmp eax, 0
INT 3
...
mov [UWZ], 0xff
pop ebp
ret
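The parent's side of the nanomite scheme (catch the INT 3, look up the address, find the target, set it in the child's context, resume) can be modelled as a table lookup. The sketch is a simulation only: the table layout, the addresses and the context dictionary are hypothetical, and a real parent would work through the Windows debugging API on an encrypted table rather than on plain Python data.

```python
ZF = 0x40   # zero flag, bit 6 of EFLAGS

# Hypothetical nanomite table: INT 3 address -> (removed branch, taken target, fall-through)
NANOMITES = {
    0x00401012: ("jz", 0x00401030, 0x00401014),
}

def parent_handle_breakpoint(context, table):
    """Emulate the branch that the INT 3 replaced and fix up the child's EIP."""
    addr = context["eip"] - 1                 # INT 3 is one byte; EIP points past it
    branch, taken, fall_through = table[addr] # look up address, find target
    zero = bool(context["eflags"] & ZF)
    take = zero if branch == "jz" else not zero
    context["eip"] = taken if take else fall_through   # set target in child context
    return context                                     # the child is then resumed

child = {"eip": 0x00401013, "eflags": ZF}     # the preceding cmp set the zero flag
print(hex(parent_handle_breakpoint(child, NANOMITES)["eip"]))
```

Because the child is already being debugged by its parent, a second debugger cannot attach, and a dump of the child alone is missing every branch that was turned into a nanomite.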
Themida
The general algorithm can be summarized as:
Retrieve the API's function body
Perform a basic analysis and disassembly
Reconstruct the API's function body inserting junk in between each of the real instructions
Re-assemble functionality, keep the semantics, change the syntax
Themida's API obfuscation
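The reconstruction step can be pictured as interleaving the real instructions with semantically neutral junk. The sketch below is a toy model of that idea and not Themida's actual algorithm: instructions are plain strings, the junk pairs are illustrative choices that leave registers, flags and stack unchanged, and the function name is made up.

```python
import random

# Illustrative junk sequences with no net effect
JUNK = [
    ["pusha", "popa"],
    ["push eax", "pop eax"],
    ["pushf", "popf"],
]

def obfuscate(body, rng):
    """Insert one junk sequence after every real instruction."""
    out = []
    for instruction in body:
        out.append(("real", instruction))
        out.extend(("junk", j) for j in rng.choice(JUNK))
    return out

api_body = ["mov edi, edi", "push ebp", "mov ebp, esp", "pop ebp", "ret"]
for kind, text in obfuscate(api_body, random.Random(0)):
    print(f"{kind:4}  {text}")
```

The semantics are unchanged because every inserted sequence undoes itself, while the syntax of the function no longer matches the original API body byte for byte.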
[Figure: Standard imports. The executable calls an exported DLL function in the imported DLL; that function references other code, such as an internal DLL function.]
[Figure: Themida-protected executable. The executable contains an obfuscated copy of the DLL function; some of its references into the imported DLL (exported and internal DLL functions) are kept.]
The algorithm has limitations
References to other functions within the DLL are kept
Same for true branches of conditional branches
Those two points can allow us to do API discovery by studying their connectivity
Reconstruction
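Since references into the DLL survive the obfuscation, an obfuscated function can be matched against known APIs by comparing the sets of addresses they reference. The sketch below shows one possible way to score that, using Jaccard similarity. The export names and addresses are placeholders, and the scoring choice is an assumption, not something taken from the talk.

```python
def identify_api(kept_refs, known_apis):
    """Return (best matching API name, similarity) based on shared references."""
    def score(name):
        refs = known_apis[name]
        union = kept_refs | refs
        return len(kept_refs & refs) / len(union) if union else 0.0
    best = max(known_apis, key=score)
    return best, score(best)

# Hypothetical reference sets collected from the original DLL
KNOWN = {
    "ExportA": {0x7C801000, 0x7C802000},
    "ExportB": {0x7C803000},
    "ExportC": {0x7C801000, 0x7C804000, 0x7C805000},
}

# References still visible in one obfuscated function body
print(identify_api({0x7C801000, 0x7C802000}, KNOWN))
```

Functions that reference nothing else in the DLL cannot be told apart this way, which is one reason the result is API discovery for most, not all, imports.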
Adds lots of branching and junk
Keeps few "real" instructions per obfuscated block
IDA can “easily” deal with the branching
Although bogus calls break IDA analysis and lead to broken obfuscated functions
Some scripting can make this look better
Themida's obfuscation
Packing vs unpacking
Packing is not always a symmetric process; sometimes it can't be undone perfectly
You won’t get the original process back
Can it be done generically? In some cases the answer is "mostly" yes
You will almost always be able to obtain code close to its original form
Current state
skape documented an elegant trick in Uninformed 10 a few weeks ago
Attacks a basic heuristic used by most “generic” unpackers
Tracking execution transfer to “dirty-memory”
Recent techniques
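The "dirty memory" heuristic can be sketched as a set of written pages that is consulted on every execution transfer. The class below is a simplified model with invented names, tracking virtual addresses only, which is the assumption the dual-mapping trick attacks: if the same physical memory is written through one virtual range and executed through another, the tracker never sees the executed range as dirty.

```python
PAGE_SIZE = 0x1000

class DirtyTracker:
    """Flag execution from any virtual page that was previously written."""

    def __init__(self):
        self.dirty_pages = set()

    def on_write(self, addr, size=1):
        first = addr // PAGE_SIZE
        last = (addr + size - 1) // PAGE_SIZE
        self.dirty_pages.update(range(first, last + 1))

    def on_execute(self, addr):
        # True means "probably unpacked code": dump and analyze here
        return addr // PAGE_SIZE in self.dirty_pages

# Ordinary packer: unpack into a range, then jump into it
plain = DirtyTracker()
plain.on_write(0x00401800, 0x1000)
print(plain.on_execute(0x00402010))

# Dual mapping: write through one virtual range, execute through another
evasive = DirtyTracker()
evasive.on_write(0x00900000, 0x1000)
print(evasive.on_execute(0x00401000))
```

The second case goes unnoticed because the tracker has no notion of which physical pages back each virtual range.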
[Figure: Timeline of a WRITE to a virtual address range followed by execution from that range, as tracked by a generic unpacker.]
[Figure: Dual mapping. The MMU maps virtual address ranges onto the same physical memory; the WRITE goes through one mapping and the EXECUTE through the other.]
Windbg can see the mappings from virtual to physical
Not too hard to spot double-mapped regions
Bochs and other low level emulators can easily do it as well
Requires kernel-mode access or “higher”
Countermeasures
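Once the virtual-to-physical mappings are available, spotting double-mapped regions amounts to grouping virtual pages by the physical frame behind them. The sketch below assumes the mappings have already been exported from a tool such as Windbg into a plain dictionary; the page and frame numbers are invented.

```python
from collections import defaultdict

def double_mapped(page_table):
    """Return {physical frame: [virtual pages]} for frames mapped more than once."""
    by_frame = defaultdict(list)
    for virtual_page, frame in page_table.items():
        by_frame[frame].append(virtual_page)
    return {frame: sorted(pages) for frame, pages in by_frame.items() if len(pages) > 1}

# Hypothetical mappings of one process: virtual page number -> physical frame number
PAGE_TABLE = {
    0x00401: 0x1A2,
    0x00402: 0x1A3,
    0x00900: 0x1A2,   # second view of the frame behind 0x00401
}
print(double_mapped(PAGE_TABLE))
```

Shared frames also occur legitimately, for example with shared sections, so a hit is a lead to investigate and not proof of an evasion attempt.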
Reversing: Secrets of Reverse Engineering, Eldad Eilam
Déprotection semi-automatique de binaire, Yoann Guillot & Alexandre Gazet
http://metasm.cr0.org/SSTIC08-article-Guillot_Gazet-Deprotection_Semi_Automatique_Binaire.pdf
Virtual Machine Threats, Peter Ferrie
http://www.symantec.com/avcenter/reference/Virtual_Machine_Threats.pdf
References
A Quick Survey on Automatic Unpacking Techniques, Daniel Reynaud
http://indefinitestudies.wordpress.com/2008/09/25/automatic-unpacking/
Using dual-mappings to evade automated unpackers, skape
http://www.uninformed.org/?v=10&a=1
Dealing with Virtualization packer, Boris Lau
http://www.datasecurity-event.com/uploads/boris_lau_virtualization_obfs.pdf
References II
Rolf Rolles' blog on OpenRCE
https://www.openrce.org/blog/browse/RolfRolles
Oreans Themida/CodeVirtualizer
http://www.oreans.com
ReWolf's x86 Virtualizer
http://www.openrce.org/blog/view/847/x86_Virtualizer_-_source_code
References III
VMProtect
http://www.vmprotect.ru/
Deroko's Nanomites write-up
http://www.phearless.org/i3/Nanomites_And_Misc_Stuff.txt
Memoryze, Mandiant
http://www.mandiant.com/software/memoryze.htm
References IV
Thanks to Vangelis and the PoC crew!
Q&A