Implementing Oblivious Hashing Using Overlapped Instruction Encodings


ACM Multimedia and Security ’07

Dallas, TX (USA)

September 20-21, 2007

Mariusz H. Jakubowski

Ramarathnam Venkatesan

Microsoft Research

Matthias Jacob

Nokia Research


Introduction

• Field of work: Software protection
  – Obfuscation and tamper-resistance
  – Prevention (or delaying) of reverse engineering and hacking
  – Securing of content-rights systems (DRM)

• Background: Two specific protection techniques
  – Oblivious hashing (OH): Computing hashes (“fingerprints”) of execution traces
  – Overlapped code: “Jumping into the middle of instructions” to obfuscate and protect against disassembly

• Goals of our work:
  – Apply overlapped code towards obfuscation and tamper-resistance via OH.
  – Study new techniques in terms of formal models, avoiding “ad hoc” approaches.


Overview

• Introduction
• Background
  – Software protection
  – Oblivious hashing (OH)
  – Overlapped code
• Code interleaving: Oblivious hashing via overlapped code
• Conclusion


Software Protection

• Obfuscation
  – Making programs “hard to understand”

• Tamper-resistance
  – Making programs “hard to modify”

• Obfuscation ⇒ tamper-resistance

• Tamper-resistance ⇒ obfuscation?


Formal Obfuscation

• Impossible in general
  – Black-box model (Barak et al.): “Source code” doesn’t help an adversary who can examine input-output behavior.
  – Worst-case programs and poly-time attackers

• Possible in specific limited scenarios
  – Secret hiding by hashing (Lynn et al.)
  – Point functions (Wee, Kalai et al.)

• Results difficult to use in practice.


Tamper-Resistance

• Many techniques used in practice, e.g.:
  – Code-integrity checksums (e.g., Atallah et al.’s software guards)
  – Anti-debugging and anti-disassembly methods
  – Virtual machines and interpreters
  – Polymorphic and metamorphic code

• Never-ending battle in a very active field
  – Targets: DRM, CD/DVD protection, games, dongles, licensing, etc.
  – Defenses: Binary packers and “cryptors,” special compilers, transformation tools, programming strategies, etc.

• Current techniques tend to be “ad hoc”:
  – No provable security
  – No analysis of time required to crack protected instances


Tamper-Resistance Model

• Program: A graph G
• Execution: A “random” walk on G
• Integrity checks:
  – Probabilistic monitoring of a set of G’s nodes
  – Detection of failures that lead to delayed responses

• Security analysis: “Graph game” on G between attacker and defender

• OH and overlapped code in context of model:
  – Provide a source of integrity checks.
  – Help enforce “local indistinguishability” and other engineering assumptions about implementation.

Abstraction of software tamper-resistance (Dedić et al., IH ’07)


Oblivious Hashing

• Computation of hashes over program traces
  – Initialize hash values at specific points.
  – Update hashes upon assignments and branches.

Original code:

    int x = 123;

    if (GetUserInput() > 10)
    {
        x = x + 1;
    }
    else
    {
        printf("Hello\n");
    }

Hashed code (after the hash transform):

    INITIALIZE_HASH(hash1);

    int x = 123;
    UPDATE_HASH(hash1, x);

    if (GetUserInput() > 10)
    {
        UPDATE_HASH(hash1, BRANCH_ID_1);
        x = x + 1;
        UPDATE_HASH(hash1, x);
    }
    else
    {
        UPDATE_HASH(hash1, BRANCH_ID_2);
        printf("Hello\n");
    }

    VERIFY_HASH(hash1);
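The hash transform above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the mixing function is an FNV-1a-style update over 32-bit words (the slides do not specify a hash), the `BRANCH_ID_*` values are arbitrary, and the `initial_x` parameter is a hypothetical stand-in for an attacker changing program state.

```python
MASK = 0xFFFFFFFF
FNV_OFFSET = 2166136261      # assumption: FNV-1a 32-bit offset basis
FNV_PRIME = 16777619         # assumption: FNV-1a 32-bit prime
BRANCH_ID_1 = 0x1001         # hypothetical branch identifiers
BRANCH_ID_2 = 0x1002


def update_hash(h, value):
    """Mix one 32-bit trace value into the running hash."""
    return ((h ^ (value & MASK)) * FNV_PRIME) & MASK


def hashed_program(user_input, initial_x=123):
    """Run the hashed code and return (x, trace hash)."""
    h = FNV_OFFSET                       # INITIALIZE_HASH(hash1)
    x = initial_x
    h = update_hash(h, x)                # UPDATE_HASH(hash1, x)
    if user_input > 10:
        h = update_hash(h, BRANCH_ID_1)  # record the branch taken
        x = x + 1
        h = update_hash(h, x)
    else:
        h = update_hash(h, BRANCH_ID_2)
        print("Hello")
    return x, h


def verify_hash(h, expected):
    """VERIFY_HASH: compare against a hash from a trusted reference run."""
    return h == expected


# Reference hashes would be recorded from an untampered run per path.
_, reference = hashed_program(5)
_, tampered = hashed_program(5, initial_x=124)
print(verify_hash(reference, reference), verify_hash(tampered, reference))
```

Because the hash depends on both the values assigned and the branches taken, two runs agree only if they follow the same path with the same data.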


Overlapped Code

• Code sharing among different paths
  – Semantic: Sharing of code blocks among execution paths.
  – Physical: Sharing of code bytes among machine or byte-code instructions.

• Purposes
  – Anti-disassembly and anti-decompilation
  – Obfuscation
  – Tamper-resistance from code sharing and explicit OH


Semantic Overlap

Code section is shared along different paths:

    increase_ctr(*ctr) { (*ctr)++; }

    increase_win()  { increase_ctr(&win);  return win; }

    increase_loss() { increase_ctr(&loss); return loss; }

Automated via code outlining


Physical Overlap

Sample x86 code: B8 B8 04 05 2D 05 90

Execution and disassembly depend on entry point into code.

    Offset 0:  B8 B8 04 05 2D   mov eax, 2D0504B8
               05 90            add eax, 90
    Offset 1:  B8 04 05 2D 05   mov eax, 52D0504
               90               nop
    Offset 2:  04 05            add al, 5
               2D 05 90         sub eax, 9005
    Offset 3:  05 2D 05 90      add eax, 90052D
    Offset 4:  2D 05 90         sub eax, 9005
    Offset 5:  05 90            add eax, 90

Instructions whose 32-bit immediate runs past the end of the sample are shown with only the bytes available.

Note: Disassembly tends to resynchronize naturally, but we can prevent this.
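To make the dependence on the entry point concrete, here is a toy decoder for just the five opcodes that occur in the sample bytes. It is a sketch and not a general x86 disassembler: the opcode table is hand-written for this example and assumes 32-bit operand size, and `decode_from` is a hypothetical helper name.

```python
SAMPLE = bytes.fromhex("B8 B8 04 05 2D 05 90")

# Opcodes followed by a 32-bit little-endian immediate (32-bit mode assumed).
IMM32 = {0xB8: "mov eax", 0x05: "add eax", 0x2D: "sub eax"}


def decode_from(code, offset):
    """Linear-sweep decode of `code` starting at `offset`."""
    listing = []
    pc = offset
    while pc < len(code):
        op = code[pc]
        if op == 0x90:                      # one-byte nop
            listing.append("nop")
            pc += 1
            continue
        if op == 0x04:                      # add al, imm8
            name, width = "add al", 1
        elif op in IMM32:
            name, width = IMM32[op], 4
        else:
            raise ValueError(f"opcode {op:02X} not in this toy table")
        operand = code[pc + 1 : pc + 1 + width]
        if len(operand) < width:            # immediate runs off the sample
            listing.append(f"{name}, <truncated>")
            break
        value = int.from_bytes(operand, "little")
        listing.append(f"{name}, {value:0{2 * width}X}")
        pc += 1 + width
    return listing


for start in range(len(SAMPLE)):
    print(start, decode_from(SAMPLE, start))
```

The same seven bytes yield a different instruction stream for nearly every starting offset, which is the property that physical overlap exploits.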


Disassembly Synchronization

• Often observed in practice, but previously not explained mathematically.
• Limits effectiveness of code overlapping for security.
• Requires explicit anti-synchronization measures to enforce protection.
• Rigorous explanation: Kruskal count

Example of corruption and synchronization:

Original code:

    00411410  55                 push ebp
    00411411  8B EC              mov ebp,esp
    00411413  81 EC C0 00 00 00  sub esp,0C0h
    00411419  53                 push ebx
    0041141A  56                 push esi
    0041141B  57                 push edi
    0041141C  8D BD 40 FF FF FF  lea edi,[ebp-0C0h]

After corrupting one byte (81 changed to 12 at 00411413):

    00411410  55                 push ebp
    00411411  8B EC              mov ebp,esp
    00411413  12 EC              adc ch,ah                   <- corrupted byte
    00411415  C0 00 00           rol byte ptr [eax],0
    00411418  00 53 56           add byte ptr [ebx+56h],dl
    0041141B  57                 push edi                    <- synchronization point
    0041141C  8D BD 40 FF FF FF  lea edi,[ebp-0C0h]


Disassembly Synchronization

• Disassembly: A “leapfrog” process over code bytes
  – Each byte address contains an instruction of a definite length.
  – After disassembling an instruction, a disassembler skips to the next instruction.

• Example: Sequence of instruction lengths at consecutive offsets:

    3 4 6 2 6 3 4 5 3 3 5 4 2 7 3 1 4

  Instruction lengths encountered when disassembly starts at each offset:

    Offset 0:  3 2 3 3 4 1 4
    Offset 1:  4 3 3 4 1 4
    Offset 2:  6 3 4 1 4
    Offset 3:  2 3 3 4 1 4
    Offset 4:  6 5 1 4

  Synchronization point: all five disassemblies end with the same two instructions (lengths 1 and 4).

• Kruskal count: Such disassembly synchronizes in about B^2/16 steps, where B = average # of bytes per instruction.
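The leapfrog process on the example sequence can be replayed directly. The sketch below uses the instruction lengths from the slide; the function names `leapfrog` and `sync_point` are illustrative, not taken from the authors' tool.

```python
# Instruction length found at each consecutive byte offset (from the slide).
LENGTHS = [3, 4, 6, 2, 6, 3, 4, 5, 3, 3, 5, 4, 2, 7, 3, 1, 4]


def leapfrog(lengths, start):
    """Return the offsets a linear disassembler visits from `start`."""
    visited = []
    addr = start
    while addr < len(lengths):
        visited.append(addr)
        addr += lengths[addr]       # skip to the next instruction
    return visited


def sync_point(lengths, a, b):
    """First offset reached by both disassemblies, or None if they never meet."""
    seen = set(leapfrog(lengths, a))
    for addr in leapfrog(lengths, b):
        if addr in seen:
            return addr
    return None


for start in range(5):
    path = leapfrog(LENGTHS, start)
    print(start, [LENGTHS[a] for a in path])

print(sync_point(LENGTHS, 0, 4))
```

Once two disassemblies land on the same offset they stay together from then on, since the next step depends only on the current address.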


Disassembly Synchronization

Model of the disassembly process

• Let InstructionLength(address) = length of instruction found at address.

• Starting at “slightly different” addresses x and y, a disassembler iterates:

    x ← x + InstructionLength(x)   (“leapfrog x”)
    y ← y + InstructionLength(y)   (“leapfrog y”)

• Our goal: Compute N = approximate number of steps before any intermediate x is equal to any intermediate y.

• Treat all possible values of x-y as states of a Markov chain.
• N is the coupling time of this Markov chain.
• Kruskal count: N is about B^2/16, where B is the average instruction length.
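A rough Monte Carlo experiment can give a feel for the coupling time in this model. The sketch below makes strong assumptions that are not from the slides: instruction lengths are drawn independently and uniformly from 1..`max_len` (real x86 code has a different distribution), and a "step" is counted as one leapfrog of whichever pointer is currently behind. It is meant for exploring the model and does not reproduce the B^2/16 estimate.

```python
import random


def steps_to_sync(rng, max_len, offset, limit=100_000):
    """Count leapfrog steps until disassemblies from 0 and `offset` meet."""
    lengths = {}

    def length_at(addr):
        # Assumption: lengths are i.i.d. uniform on 1..max_len, fixed per address.
        if addr not in lengths:
            lengths[addr] = rng.randint(1, max_len)
        return lengths[addr]

    x, y = 0, offset
    steps = 0
    while x != y and steps < limit:
        if x < y:
            x += length_at(x)       # leapfrog x
        else:
            y += length_at(y)       # leapfrog y
        steps += 1
    return steps


def mean_steps(max_len=8, offset=1, trials=2000, seed=0):
    """Average number of steps to synchronization over random byte streams."""
    rng = random.Random(seed)
    total = sum(steps_to_sync(rng, max_len, offset) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for max_len in (2, 4, 8, 16):
        print(max_len, mean_steps(max_len=max_len))
```

Advancing only the trailing pointer finds the first address common to both disassemblies, because each sequence of visited addresses is strictly increasing.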


Code Interleaving

• A method to overlap arbitrary code blocks
  – Explicitly prevents disassembly resynchronization
  – Adds tamper-resistance
    • Hash of instruction bytes only (like traditional code checksums)
    • Hash of instruction bytes and program state (like oblivious hashing)

• Basic algorithm
  – Code interspersing: Create a block of interleaved instructions from two code blocks.
  – Code merging: Inject hashing instructions overlapped with existing instructions.


Code Interleaving: Basic Idea

Two input code blocks:

    SEQ1:  INST_1
           INST_2

    SEQ2:  INST_A
           INST_B

After code interspersing:

    SEQ1:  INST_1
           JMP L2
    SEQ2:  INST_A
           JMP LB
    L2:    INST_2
           JMP L3
    LB:    INST_B
    L3:

After code merging:

    Disassembly at SEQ1:  INST_1  HASH_1  INST_2  HASH_2
    Disassembly at SEQ2:  INST_A  HASH_A  INST_B

• Code interspersing: Interleave instructions, injecting jumps as needed to maintain control flow.

• Code merging: Replace jumps with hash instructions, maintaining control flow.
  – E.g.: JMP L2; INST_A; JMP LB transforms into HASH_1.
  – HASH_1 contains INST_A and part of HASH_A.

• Suitable hash instructions must be found (and fit together like puzzle pieces).
  – Various possibilities identified on x86.
  – Can also design custom byte-codes to maximize utility of overlapping.
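The interspersing step can be sketched on symbolic instructions. This is an illustrative reconstruction, not the authors' algorithm: the layout format, the label names (`A0`, `B0`, ..., `END`) and the rule of dropping a jump whose target immediately follows are all assumptions. The small `trace` interpreter checks that each entry point still executes its original instructions in order.

```python
def intersperse(seq1, seq2):
    """Interleave two instruction lists, adding jumps to preserve control flow."""
    layout = []
    blocks = {"A": seq1, "B": seq2}
    for i in range(max(len(seq1), len(seq2))):
        for name, seq in blocks.items():
            if i < len(seq):
                layout.append(("label", f"{name}{i}"))
                layout.append(("inst", seq[i]))
                target = f"{name}{i + 1}" if i + 1 < len(seq) else "END"
                layout.append(("jmp", target))
    layout.append(("label", "END"))
    # Drop jumps whose target is the very next item (plain fall-through).
    pruned = []
    for k, item in enumerate(layout):
        if item[0] == "jmp" and layout[k + 1] == ("label", item[1]):
            continue
        pruned.append(item)
    return pruned


def trace(layout, entry):
    """Follow control flow from label `entry`; return the instructions executed."""
    labels = {arg: k for k, (kind, arg) in enumerate(layout) if kind == "label"}
    pc = labels[entry]
    executed = []
    while pc < len(layout):
        kind, arg = layout[pc]
        if kind == "inst":
            executed.append(arg)
            pc += 1
        elif kind == "jmp":
            pc = labels[arg]
        else:                       # labels are no-ops
            pc += 1
    return executed


layout = intersperse(["INST_1", "INST_2"], ["INST_A", "INST_B"])
for item in layout:
    print(item)
print(trace(layout, "A0"), trace(layout, "B0"))
```

Code merging would then replace each jump, together with the bytes it skips, by a single hash instruction; that step depends on concrete instruction encodings and is not modeled here.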


Code Interleaving: Example

Two input code blocks (x86):

    SEQ1:  C1 E0 02            shl eax, 2
    I11:   40                  inc eax
           C3                  ret

    SEQ2:  48                  dec eax
    I21:   C1 E8 03            shr eax, 3
           C3                  ret

After code interspersing:

    SEQ1:  C1 E0 02            shl eax, 2
           EB 03               jmp I11
    SEQ2:  48                  dec eax
           EB 04               jmp I21
    I11:   90                  nop
           40                  inc eax
           EB 03               jmp O
    I21:   C1 E8 03            shr eax, 3
    O:     90                  nop
           C3                  ret

After code merging (OH instructions marked with *):

  Disassembly at SEQ1:

    SEQ1:  C1 E0 02            shl eax, 2
           81 F1 48 81 E9 90   xor ecx, 90E98148    *
    I11:   40                  inc eax
           81 C1 C1 E8 03 90   add ecx, 9003E8C1    *
    O:     C3                  ret

  Disassembly at SEQ2:

    SEQ2:  48                  dec eax
           81 E9 90 40 81 C1   sub ecx, C1814090    *
    I21:   C1 E8 03            shr eax, 3
    O:     90                  nop
           C3                  ret
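The merged bytes can be checked with a tiny interpreter that executes the same 17-byte buffer from both entry points. This is a sketch that assumes the opcode meanings shown in the listings and supports only those opcodes; treating `ecx` as the OH accumulator and the name `run` are illustrative choices.

```python
MASK = 0xFFFFFFFF

# Merged code: SEQ1 enters at offset 0, SEQ2 enters at offset 5.
MERGED = bytes.fromhex("C1 E0 02 81 F1 48 81 E9 90 40 81 C1 C1 E8 03 90 C3")


def run(code, entry, eax, ecx=0):
    """Interpret the handful of opcodes in the example; return (eax, ecx)."""
    pc = entry
    while True:
        op = code[pc]
        if op == 0xC3:                              # ret
            return eax, ecx
        if op == 0x90:                              # nop
            pc += 1
        elif op == 0x40:                            # inc eax
            eax = (eax + 1) & MASK
            pc += 1
        elif op == 0x48:                            # dec eax
            eax = (eax - 1) & MASK
            pc += 1
        elif op == 0xC1:                            # shift eax by imm8
            modrm, imm = code[pc + 1], code[pc + 2]
            if modrm == 0xE0:                       # shl eax, imm8
                eax = (eax << imm) & MASK
            elif modrm == 0xE8:                     # shr eax, imm8
                eax >>= imm
            else:
                raise ValueError("unsupported shift form")
            pc += 3
        elif op == 0x81:                            # ALU op on ecx with imm32
            modrm = code[pc + 1]
            imm = int.from_bytes(code[pc + 2 : pc + 6], "little")
            if modrm == 0xC1:                       # add ecx, imm32
                ecx = (ecx + imm) & MASK
            elif modrm == 0xE9:                     # sub ecx, imm32
                ecx = (ecx - imm) & MASK
            elif modrm == 0xF1:                     # xor ecx, imm32
                ecx ^= imm
            else:
                raise ValueError("unsupported ALU form")
            pc += 6
        else:
            raise ValueError(f"unsupported opcode {op:02X}")


print(run(MERGED, 0, 5))     # SEQ1: eax = 5*4 + 1
print(run(MERGED, 5, 17))    # SEQ2: eax = (17 - 1) >> 3

# Changing one shared byte alters the hash operand seen from both entries.
tampered = bytearray(MERGED)
tampered[8] ^= 0x01
print(run(bytes(tampered), 0, 5), run(bytes(tampered), 5, 17))
```

Offset 8 lies inside the immediate of the `xor` seen from SEQ1 and inside the immediate of the `sub` seen from SEQ2, so a single-byte patch changes the value accumulated in `ecx` on both paths.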


Code Interleaving

• Observations
  – Tamper-resistance comes from two main sources:
    • Implicit: Shared instruction bytes
    • Explicit: OH instructions
  – Disassembly synchronization is explicitly prevented.
  – Method enables code-byte hashes even on architectures that do not allow explicit access to code bytes.

• Extensions
  – Iteration to build up complexity
    • Enhances security at little or no implementation cost.
    • Complex (emergent) code patterns and behaviors can arise.
  – Implementation over custom byte codes designed to maximize utility of overlapping (unlike x86)


Experimental Results

• Tool implementation using Vulcan (binary-rewriting framework)
• Reasonable impact on performance, depending on desired security level
• Remaining work on analyzing security in practice

Chart: Performance impact on SpecINT benchmarks (0 = no overlapping, 1 = full overlapping)


Conclusion

• Contributions
  – Investigation of overlapped code for software protection
    • Study of disassembly synchronization and other roadblocks
    • Design of code interleaving and outlining to address limitations
    • Integrity checking via oblivious hashing
    • Placement in context of security models, not ad hoc methods
  – Tool implementations to verify practical effectiveness
    • Code interleaving and outlining for x86 binaries
    • Iteration framework to enhance security

• Future work
  – Security analysis in theory and practice
  – Other overlapped-code methods
  – Porting to custom byte-codes
