chia-hao-tsai
OUTLINE
NEXT 45 MIN
▸ In the next 45 min
▸ Learn the Mach-O binary format
▸ X86-64 Assembly Language / Machine Code
▸ Trivial Binary Bugs
▸ Covered in reverse (descending) order
BUG TO VULNERABILITY
SIGNAL
▸ There are many signals in *nix-like systems
▸ Some are helpful
▸ Some exist to prevent bugs
▸ Understanding the bug helps find the vulnerability
▸ SIGFPE - division by zero
▸ SIGILL - illegal instruction
▸ SIGSEGV - invalid virtual memory reference
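As a hedged illustration of the signals listed above, the C sketch below installs a handler and delivers the signal with `raise()` rather than triggering a real fault, since a real division by zero or bad pointer access is undefined behavior in C. The names `record_signal` and `deliver_and_record` are made up for this example.

```c
#include <assert.h>
#include <signal.h>

/* Last signal number seen by the handler (hypothetical helper state). */
static volatile sig_atomic_t last_signal = 0;

static void record_signal(int signo) {
    last_signal = signo;
}

/* Install the handler, deliver `signo` to this process, restore the default.
   Returns the signal number the handler saw, or -1 if it could not be installed. */
int deliver_and_record(int signo) {
    assert(signo > 0);
    last_signal = 0;
    if (signal(signo, record_signal) == SIG_ERR) {
        return -1;
    }
    raise(signo);            /* the handler runs before raise() returns */
    signal(signo, SIG_DFL);
    return (int)last_signal;
}
```

Calling `deliver_and_record(SIGSEGV)` returns `SIGSEGV` without crashing, because the signal was raised artificially. After a genuine fault the handler cannot simply return and continue.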
BUG TO VULNERABILITY
ILLEGAL & INVALID
▸ Caused by the compiler, a library, or program logic
▸ Compiler - switch to a newer compiler
▸ Run-time library - switch to a newer library
▸ Run-time logic - supply correct input
▸ It is always "their" fault
VULNERABILITY
INPUT
▸ User Input
▸ User-Name, Age, email-address, Gender
▸ Store the user input into memory space
▸ ISSUE
A. How
B. What
C. Where
CPU
X86-64
▸ Registers - extended to 64 bits
▸ 8 / 16 / 32 / 64 bits
▸ 128 bits (SSE)
▸ NX (No-Execute) bit
▸ Registers are limited
▸ 16 general-purpose registers
▸ 16 SSE registers
CPU
X86-64
▸ Von Neumann model
▸ Code and data share the same memory
▸ When data needs to be stored / loaded
▸ store: from register to memory
▸ load: from memory to register
STORAGE
SOMETHING IN MEMORY
▸ Code vs Data vs BSS vs Stack vs Heap
▸ Code is read-execute
▸ Data is read-write
▸ BSS stores uninitialized data
▸ Stack stores temporary (local) data
▸ Heap stores dynamic data
▸ All of these live in memory
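A minimal C sketch of the regions above, assuming a typical compiler and linker (the C standard itself does not name sections). The variable and function names are illustrative only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

int age_initialized = 18;   /* has an initializer: typically placed in Data */
int age_uninitialized;      /* no initializer: typically placed in BSS, starts as 0 */

/* Sums `n` values through a temporary copy, to touch both stack and heap. */
int sum_with_temporaries(const int *values, size_t n) {
    int total = 0;                          /* local variable: lives on the stack */
    int *copy;
    assert(values != NULL && n > 0);
    copy = malloc(n * sizeof *copy);        /* dynamic allocation: lives on the heap */
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, values, n * sizeof *copy);
    for (size_t i = 0; i < n; i++) {
        total += copy[i];
    }
    free(copy);                             /* heap memory must be released by hand */
    return total;
}
```

The function body itself is the Code part: it is loaded read-execute, while the two globals and the heap block are read-write.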
HOPE YOU HAVE …
DATA IN PROGRAM
▸ Data
▸ Gender - one letter or full description
▸ Age - a plausible integer or an impossible one
▸ Name - alphabet or Unicode
▸ All data in registers / memory is integer-like
▸ from 8-bit (0 to 255) up to 128-bit SSE (0 to about 3.4e38)
▸ signed or unsigned is a matter of interpretation
HOPE YOU HAVE …
DATA IN PROGRAM
▸ Can simply put age into register
▸ Gender could be
▸ one letter - to ASCII and put in register
▸ fixed-length - store in memory
▸ Name should be
▸ stored in memory
MEMORY
WHERE TO STORE
▸ Memory
▸ Store user input sequentially
▸ decoded by the program / programmer
▸ ISSUE
▸ size
▸ permission
MEMORY
WHERE TO STORE
▸ Data vs BSS vs Stack vs Heap
▸ Fit the scenario (assumption)
▸ the data is
1. temporary
2. globally visible
3. variable in size
MOV
▸ In x86-64 opcodes
▸ lots of opcodes are MOV
▸ moving data to and from memory is a frequent action
▸ mov ch, dl
▸ mov rax, [rax-0x10]
▸ mov [r8], rsp
▸ lea cx, [rbx]
▸ But each of these has a different opcode!
AGE
SAVE DATA
▸ Save 18 as age into program
▸ mov rax, 18 ; keep in a register
▸ mov [rax], 18 ; store to memory at the address in rax
▸ push 18 ; push onto the stack
GENDER
SAVE DATA
▸ Save ‘F’ (0x46) as gender into program
▸ mov rax, 0x46 ; keep in a register
▸ mov [rax], 0x46 ; store to memory at the address in rax
▸ push 0x46 ; push onto the stack
GENDER
SAVE DATA
▸ Save ‘Female’ as gender into program
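▸ mov dword [rax], 0x616D6546 ; "Fema" (x86-64 is little-endian: lowest byte is stored first)
▸ mov dword [rax+0x04], 0x0000656C ; "le" plus zero padding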
▸ push 0x46
▸ push 0x65
▸ push …
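Because x86-64 stores the lowest byte of a value first, the characters of a string appear in reverse order inside a 32-bit immediate. The sketch below is a portable model of that store, not real machine code. The helpers `store_le32`, `pack_chars` and `store_female` are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value lowest byte first, as a 32-bit store does on x86-64. */
static void store_le32(uint8_t *dst, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        dst[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Pack up to 4 characters into the immediate that reproduces them in memory order. */
uint32_t pack_chars(const char *s) {
    uint32_t imm = 0;
    size_t n = strlen(s);
    assert(n <= 4);
    for (size_t i = 0; i < n; i++) {
        imm |= (uint32_t)(uint8_t)s[i] << (8 * i);
    }
    return imm;
}

/* Store "Female" with two 32-bit writes; buf must hold at least 8 bytes. */
void store_female(uint8_t *buf) {
    store_le32(buf, pack_chars("Fema"));     /* immediate 0x616D6546 */
    store_le32(buf + 4, pack_chars("le"));   /* immediate 0x0000656C */
}
```

After `store_female(buf)`, the buffer reads `F e m a l e \0 \0` from low to high address, which is what a C string reader expects.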
MEMORY
SIZE MATTERS
▸ Steps to store data in memory
1. decide the size of memory
2. how to encode/decode data
3. decide the location of memory
4. put into / get from memory
MEMORY
OVERESTIMATE VS UNDERESTIMATE
▸ Over
▸ memory leak - OOM
▸ wasted resources
▸ Under
▸ data corruption
▸ overflow
MEMORY
▸ move to memory space
▸ Where is the space? BSS or Data or Heap
▸ Compile-time or Run-time
▸ fixed-length or variable-length
▸ Save onto the stack
▸ The stack is not unlimited
IN C LANGUAGE
ASSUMPTION
▸ Struct in C
  struct foo {
    int age;
    char gender[8];
    char email[128];
  };
‣ What happens if gender overflows?
‣ email is corrupted (it follows gender); age is corrupted if the write runs backwards
[Figure: memory layout of struct foo with age and gender labeled, addresses 0x1230 and 0x12B9]
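A hedged sketch of the overflow scenario: it computes how many bytes of `email` an unchecked copy into `gender` would overwrite, and shows one bounded alternative. The helper names `email_bytes_clobbered` and `set_gender` are assumptions for illustration. The actual overflow is not performed because it is undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct foo {
    int  age;
    char gender[8];
    char email[128];
};

/* Bytes of `email` that an unchecked copy of `input` (plus its terminating
   NUL) into `gender` would overwrite. Relies on email directly following gender. */
size_t email_bytes_clobbered(const char *input) {
    size_t needed = strlen(input) + 1;
    size_t room = sizeof(((struct foo *)0)->gender);
    assert(offsetof(struct foo, email) == offsetof(struct foo, gender) + room);
    return needed > room ? needed - room : 0;
}

/* Bounded alternative: truncates the input instead of spilling into `email`. */
void set_gender(struct foo *f, const char *input) {
    assert(f != NULL && input != NULL);
    snprintf(f->gender, sizeof f->gender, "%s", input);
}
```

With an 8-byte `gender`, the 15-character input `"Female-and-more"` needs 16 bytes, so an unchecked copy would overwrite the first 8 bytes of `email`. `set_gender` keeps only `"Female-"`.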
IN ASM
ASSUMPTION
[0x400000] call 0x400043
…
[0x400043] mov r8, [rip+0x08]
[0x40004A] mov [r8], 18
[0x400051] ret
LEGACY
CODE/DATA BOTH IN MEMORY
▸ First: call is a combination of push and jump
▸ call 0x400035
1. push rip
2. jump 0x400035
‣ ret
1. pop rip
2. jump rip
‣ And more
▸ call rax
▸ call [rax]
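The push-and-jump decomposition of call and ret can be modeled with a toy CPU that has only an instruction pointer and a small stack. This is a simplified sketch, not an emulator, and all names (`toy_cpu`, `toy_call`, `toy_ret`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 16

/* Toy CPU: an instruction pointer and a small downward-growing stack. */
struct toy_cpu {
    uint64_t rip;
    uint64_t stack[STACK_SLOTS];
    int sp;                 /* index of the top element; STACK_SLOTS means empty */
};

void toy_init(struct toy_cpu *cpu, uint64_t entry) {
    cpu->rip = entry;
    cpu->sp = STACK_SLOTS;
}

/* call target: push the address of the next instruction, then jump. */
void toy_call(struct toy_cpu *cpu, uint64_t target, uint64_t insn_len) {
    assert(cpu->sp > 0);
    cpu->stack[--cpu->sp] = cpu->rip + insn_len;  /* 1. push return address */
    cpu->rip = target;                            /* 2. jump */
}

/* ret: pop the saved address back into rip. */
void toy_ret(struct toy_cpu *cpu) {
    assert(cpu->sp < STACK_SLOTS);
    cpu->rip = cpu->stack[cpu->sp++];
}
```

Since the return address is ordinary data on the stack, overwriting `cpu->stack[cpu->sp]` before `toy_ret` redirects execution. That is why code and data sharing memory matters here.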
QUESTION
▸ If a vulnerability can be introduced
▸ from source code to assembly code
▸ is there really NO BUG from assembly code to machine code?
ASSEMBLE
▸ From assembly code to machine code
▸ 1-1 mapping
▸ platform-dependent
▸ Example
▸ pop rax - 58
▸ syscall - 0F 05
▸ xor r8, 0x10 - 49 83 F0 10
▸ mov eax, 0xDEADBEEF - B8 EF BE AD DE
INSTRUCTION
X86-64 MACHINE CODE
▸ X86-64 machine code layout
▸ [prefix] [opcode] [ModR/M] [SIB] [Displacement] [Immediate]
▸ At most 15 bytes per instruction
▸ Displacement and immediate are each at most 8 bytes (64-bit address or value)
▸ R(educed)ISC vs C(omplex)ISC
STFW
OPCODE
▸ X86-64 opcode
▸ Intel Manual[0]
▸ Web Resource[1]
▸ A one-byte OPCODE ranges from 00 to FF
▸ Each value is either assigned to an instruction or invalid
[0]: https://software.intel.com/sites/default/files/managed/ad/01/253666-sdm-vol-2a.pdf
[1]: http://ref.x86asm.net/coder64.html
SIMPLE LIFE
OPCODE
▸ Simple (frequently-used) opcode
▸ No-OPeration
▸ NOP 90 (maybe xchg eax, eax)
▸ NOP 0F 0D
▸ FNOP D9 D0 (FPU nop)
[0]: http://stackoverflow.com/questions/25008772/whats-the-difference-between-the-x86-nop-and-fnop-instructions
X86-64
SLIGHTLY COMPLICATED
▸ Extension OPCODE
▸ add (01) supports 16- / 32- / 64-bit operands
▸ add r/m16/32/64, r16/32/64
▸ One opcode does multiple things?
▸ REX prefix 48 ~ 4F (W = 1) extends the operand size to 64-bit
  7         3   2   1   0
+---------+---+---+---+---+
| 0 1 0 0 | W | R | X | B |
+---------+---+---+---+---+
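The bit layout above translates directly into a one-line helper. This is a small sketch, and the function name `rex_prefix` is an assumption for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Build a REX prefix byte 0100WRXB from its four flag bits (each 0 or 1). */
uint8_t rex_prefix(unsigned w, unsigned r, unsigned x, unsigned b) {
    assert(w <= 1 && r <= 1 && x <= 1 && b <= 1);
    return (uint8_t)(0x40 | (w << 3) | (r << 2) | (x << 1) | b);
}
```

`rex_prefix(1, 0, 0, 0)` gives 0x48 (64-bit operand size), and setting B as well gives 0x49, the prefix used to reach r8 through r15 in the r/m or opcode register field.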
X86-64
REGISTER EXTENSION
▸ Extension
▸ Size (32-bit to 64-bit)
▸ Registers (general to extended)
▸ mov eax, 0xdeadbeef B8 EF BE AD DE
▸ mov rax, 0xdeadbeef 48 B8 EF BE AD DE 00 00 00 00
▸ mov r8, 0xdeadbeef 49 B8 EF BE AD DE 00 00 00 00
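A hedged sketch of how the `mov r64, imm64` form is put together: a REX.W prefix (with REX.B for r8 through r15), opcode B8 plus the low three register bits, then a full 8-byte little-endian immediate. The function name `encode_mov_r64_imm64` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode `mov r64, imm64` into out[10]. reg is the register number 0..15
   (0 = rax, 8 = r8). Returns the number of bytes written. */
size_t encode_mov_r64_imm64(uint8_t out[10], unsigned reg, uint64_t imm) {
    assert(out != NULL && reg < 16);
    out[0] = (uint8_t)(0x48 | (reg >> 3));      /* REX.W, plus REX.B for r8..r15 */
    out[1] = (uint8_t)(0xB8 + (reg & 7));       /* opcode with low 3 register bits */
    for (int i = 0; i < 8; i++) {
        out[2 + i] = (uint8_t)(imm >> (8 * i)); /* little-endian immediate */
    }
    return 10;
}
```

With `reg = 8` and `imm = 0xdeadbeef` the output starts with 49 B8 EF BE AD DE and ends with four zero bytes, since the immediate of this form is always 8 bytes wide.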
X86-64
PRIMARY OPCODE
▸ Some opcodes are merged
▸ OPCODE + second opcode
▸ push r16/64 folds the register number into the 1-byte opcode (50+r)
▸ push ax 66 50
▸ push rax 50
▸ push r9w 66 41 51
▸ push r9 41 51
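The four push encodings above follow one pattern, sketched here under the assumption that only the `push r16` and `push r64` register forms are needed. `encode_push` is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode `push r16` (is_16bit != 0) or `push r64`; reg is 0..15.
   Returns the number of bytes written (1 to 3). */
size_t encode_push(uint8_t out[3], unsigned reg, int is_16bit) {
    size_t n = 0;
    assert(out != NULL && reg < 16);
    if (is_16bit) {
        out[n++] = 0x66;                     /* operand-size prefix */
    }
    if (reg >= 8) {
        out[n++] = 0x41;                     /* REX.B selects r8..r15 */
    }
    out[n++] = (uint8_t)(0x50 + (reg & 7));  /* register folded into the opcode */
    return n;
}
```

For example, `encode_push(out, 9, 1)` yields 66 41 51 (push r9w) and `encode_push(out, 0, 0)` yields the single byte 50 (push rax).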
X86-64
SOME PROBLEM
▸ Trivial case - condition check
▸ jz LABEL 48 0F 84 06 00 00 00
▸ Can be modified as
▸ nop 90 90 90 90 90 90 90
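Patching out a condition check amounts to overwriting the jump's bytes with 0x90. A minimal sketch, assuming the code bytes are already in a writable buffer; `nop_out` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overwrite `len` bytes starting at `offset` with single-byte NOPs (0x90). */
void nop_out(uint8_t *code, size_t code_len, size_t offset, size_t len) {
    assert(code != NULL);
    assert(offset <= code_len && len <= code_len - offset);
    memset(code + offset, 0x90, len);
}
```

The replacement must cover exactly the bytes of the original instruction. A partial overwrite would leave stray displacement bytes that decode as unrelated instructions.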
X86-64
SOME PROBLEM
▸ If we have
▸ add ax, 0x5150 66 05 50 51
▸ Changing one byte (66 to 0F) turns it into
▸ syscall 0F 05
▸ push rax 50
▸ push rcx 51
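The effect of that one-byte change can be checked with a tiny length decoder that knows only the three encodings on this slide. It is a deliberately incomplete sketch, and `insn_len` and `count_insns` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of the instruction at p, for the three encodings used here only.
   Returns 0 for anything else. */
static size_t insn_len(const uint8_t *p, size_t avail) {
    if (avail >= 4 && p[0] == 0x66 && p[1] == 0x05) return 4; /* add ax, imm16 */
    if (avail >= 2 && p[0] == 0x0F && p[1] == 0x05) return 2; /* syscall */
    if (avail >= 1 && p[0] >= 0x50 && p[0] <= 0x57) return 1; /* push r64 */
    return 0;
}

/* Count how many known instructions the byte sequence decodes into. */
size_t count_insns(const uint8_t *code, size_t len) {
    size_t count = 0, off = 0;
    assert(code != NULL);
    while (off < len) {
        size_t l = insn_len(code + off, len - off);
        if (l == 0) {
            break;              /* unknown byte: stop decoding */
        }
        off += l;
        count++;
    }
    return count;
}
```

The bytes 66 05 50 51 decode as one instruction. After setting the first byte to 0F, the same four bytes decode as three instructions.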
POSSIBILITY
MACHO
▸ Mach-O is a binary format
▸ Header
▸ Commands
▸ Sections
▸ Segment
▸ Binary payload
▸ Multi-architecture binaries
MACH-O 64
HEADER
▸ Magic Number 0xFEEDFACF
▸ 64-bit
▸ CPU info
▸ X86_64 / ARM / ARM64 / POWERPC64 / …
▸ File Type
▸ Execute / Preload / DYLIB / …
▸ Number of commands (section/segment)
▸ Flags
▸ PIE / NOUNDEFS / DYLDLINK / LAZY_INIT / …
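A hedged sketch of reading those header fields from raw bytes. The struct mirrors the 32-byte layout of `mach_header_64` but is declared locally so the example builds without the macOS headers. The names `macho64_header`, `read_le32` and `parse_macho64_header` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MACHO64_MAGIC 0xFEEDFACFu   /* same value as MH_MAGIC_64 */

/* Local mirror of the 64-bit Mach-O header: eight 32-bit fields, 32 bytes. */
struct macho64_header {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds, sizeofcmds, flags, reserved;
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *h if buf starts with a little-endian 64-bit header. */
int parse_macho64_header(const uint8_t *buf, size_t len, struct macho64_header *h) {
    assert(buf != NULL && h != NULL);
    if (len < 32 || read_le32(buf) != MACHO64_MAGIC) {
        return 0;
    }
    h->magic      = read_le32(buf);
    h->cputype    = read_le32(buf + 4);
    h->cpusubtype = read_le32(buf + 8);
    h->filetype   = read_le32(buf + 12);
    h->ncmds      = read_le32(buf + 16);
    h->sizeofcmds = read_le32(buf + 20);
    h->flags      = read_le32(buf + 24);
    h->reserved   = read_le32(buf + 28);
    return 1;
}
```

On disk the magic appears as CF FA ED FE because the file is little-endian. An x86_64 executable has `cputype` 0x01000007 and `filetype` 2 (the value of MH_EXECUTE).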
MACH-O 64
COMMANDS
▸ Lots of commands
▸ LC_SEGMENT_64
▸ LC_SYMTAB
▸ LC_LOAD_DYLIB
▸ LC_UNIXTHREAD
▸ LC_MAIN
▸ LC_RPATH
MACH-O 64
SEGMENT
▸ Segment
▸ command name
▸ memory address
▸ memory size
▸ file offset
▸ file size
▸ max VM protection
▸ initial VM protection
▸ number of sections
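The segment fields listed above correspond to the layout of `segment_command_64`. The sketch below declares a local mirror of that layout (so it builds on any platform) and checks whether a segment is mapped executable. The struct name, macro names and `segment_is_executable` are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG64_CMD     0x19u  /* same value as LC_SEGMENT_64 */
#define VMP_READ      0x1u   /* protection bits: read */
#define VMP_WRITE     0x2u   /* write */
#define VMP_EXECUTE   0x4u   /* execute */

/* Local mirror of the 64-bit segment load command (72 bytes). */
struct segment64 {
    uint32_t cmd;            /* command type, SEG64_CMD for a segment */
    uint32_t cmdsize;        /* size of this command including its sections */
    char     segname[16];    /* e.g. "__TEXT" */
    uint64_t vmaddr;         /* memory address */
    uint64_t vmsize;         /* memory size */
    uint64_t fileoff;        /* file offset */
    uint64_t filesize;       /* file size */
    uint32_t maxprot;        /* maximum VM protection */
    uint32_t initprot;       /* initial VM protection */
    uint32_t nsects;         /* number of sections */
    uint32_t flags;
};

/* Returns 1 if the segment is initially mapped with execute permission. */
int segment_is_executable(const struct segment64 *seg) {
    assert(seg != NULL);
    return (seg->initprot & VMP_EXECUTE) != 0;
}
```

A typical code segment has `initprot` of read plus execute (0x5), while a data segment has read plus write (0x3) and is reported as not executable.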
MACH-O 64
MINIMAL
▸ Minimal Mach-O 64 binary
▸ Small footprint - 4K
▸ Header
▸ 7 commands - 664 bytes
▸ Machine Code - 12 bytes
▸ Padding with \x00
ZASM
ASSEMBLER
▸ Assembler
▸ From assembly language to machine code
▸ Target format (ELF / Mach-O / …)
▸ Target platform (x86-64 / ARMv8 / …)
▸ Generator
[0]: https://github.com/cmj0121/Zerg/tree/master/src/zasm