Assembler
IA32‐x86
Paulo Lopes
Intel and AT&T Syntax
• Source: GNU assembler manual (80386-dependent features)
• We will use Intel syntax
  – Intel
    • Registers: eax
    • Constants: 4
    • Operand order: dest, source
  – AT&T
    • Registers: %eax
    • Constants: $4
    • Operand order: source, dest
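To make the two notations concrete, here is a minimal sketch that writes the same three instructions twice, once per syntax. It assumes GNU as targeting i386; the `.intel_syntax` and `.att_syntax` directives switch the notation within one file.

```asm
        .text
        .intel_syntax noprefix      # Intel: no % on registers, destination first
        mov     eax, 4              # eax = 4
        add     eax, ebx            # eax = eax + ebx
        mov     ecx, [ebp - 8]      # load from memory at ebp-8

        .att_syntax prefix          # AT&T: % on registers, $ on constants, source first
        movl    $4, %eax            # eax = 4
        addl    %ebx, %eax          # eax = eax + ebx
        movl    -8(%ebp), %ecx      # load from memory at ebp-8
```

Both halves describe the same machine instructions; only the notation differs.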
Intel and AT&T Syntax
  – Intel
    • Operand size: ‘byte ptr’, ‘word ptr’, ‘dword ptr’ and ‘qword ptr’
    • Far calls and jumps: call/jmp far section:offset
  – AT&T
    • Operand size: suffix ‘b’, ‘w’, ‘l’ or ‘q’ at the end of the instruction mnemonic
    • Far calls and jumps: lcall/ljmp $section, $offset
Registers
• The 8 32-bit registers
  – eax (the accumulator)
  – ebx, ecx, edx, edi, esi
  – ebp (the frame pointer)
  – esp (the stack pointer)
• The low 16 bits of these
  – ax (the accumulator)
  – bx, cx, dx, di, si
  – bp (the frame pointer)
  – sp (the stack pointer)
Registers
• The 8 8-bit registers
  – ah, al, bh, bl, ch, cl, dh, dl
  – (the high and low halves of ax, bx, cx, dx)
• The 6 section (segment) registers
  – cs (code section): CS:IP points to the next instruction
  – ds (data section): string instructions read from DS:SI and write to ES:DI
  – ss (stack section): SS:SP points to the top of the stack
  – es, fs, and gs
• Other registers
  – Control, debug and test registers
  – Floating-point registers
  – MMX and SSE registers
  – AMD64 registers
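The 8-bit and 16-bit names are views onto the same 32-bit register, not separate storage. A minimal sketch, assuming GNU as with Intel syntax, shows how writes to al, ah and ax change eax:

```asm
        .intel_syntax noprefix
        .text
        mov     eax, 0x12345678     # eax = 0x12345678
        mov     al, 0xFF            # low byte only:    eax = 0x123456FF
        mov     ah, 0x00            # bits 8..15 only:  eax = 0x123400FF
        mov     ax, 0xABCD          # low 16 bits only: eax = 0x1234ABCD
```

The upper 16 bits of eax are untouched by all three partial writes.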
IA32‐X86 Execution Environment
FLAGS Register
Instruction prefixes
• Section override prefixes: ‘cs’, ‘ds’, ‘ss’, ‘es’, ‘fs’, ‘gs’
• Operand/address size prefixes: ‘data16’, ‘addr16’, ‘data32’ and ‘addr32’
• ‘lock’: locks the bus so that the memory access of the following instruction is atomic
• ‘wait’: waits for the coprocessor (should not be needed)
• ‘rep’, ‘repe’, and ‘repne’: repeat a string instruction ecx times
• ‘rex’: x86-64 extensions to the i386 instruction set
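The prefixes are easiest to read next to the instruction they modify. The sketch below assumes GNU as with Intel syntax; the labels `src`, `dst`, `counter` and `copy16` are made up for the illustration.

```asm
        .intel_syntax noprefix
        .data
src:    .ascii  "0123456789abcdef"      # 16 source bytes
counter: .long  0
        .bss
dst:    .skip   16                      # 16 destination bytes
        .text
copy16:
        cld                             # direction flag = 0: addresses go up
        mov     esi, offset src         # source      DS:ESI
        mov     edi, offset dst         # destination ES:EDI
        mov     ecx, 16                 # repeat count
        rep movsb                       # 'rep': copy ecx bytes
        lock inc dword ptr [counter]    # 'lock': atomic read-modify-write
        mov     al, es:[edi - 1]        # 'es' override: read the last byte copied
        ret
```

After `rep movsb`, ecx is 0 and esi/edi point just past the copied bytes.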
Memory references
• section:[base + index*scale + disp]
  – Base: 32-bit register
  – Index: 32-bit register
  – Scale: 1, 2, 4 or 8 (default = 1)
  – Section (optional): overrides the default section register
    • The section register selects a segment, and that segment's base address is added to the offset
    • In x86-64, sections are mostly unused (flat address space)
  – Disp: constant
  – x86-64: [rip + 1234] (relative to the PC)
• Examples:
  mov ax, [ebp - 4]
  mov ax, [foo + eax*4]
  mov ax, [foo]
  mov ax, gs:foo
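A typical use of base + index*scale + disp is array indexing. The following sketch assumes GNU as with Intel syntax; `table` is a hypothetical array of four 32-bit integers.

```asm
        .intel_syntax noprefix
        .data
table:  .long   10, 20, 30, 40
        .text
        mov     ebx, 2                      # index
        mov     eax, [table + ebx*4]        # disp + index*scale:        eax = 30
        mov     esi, offset table           # base register
        mov     eax, [esi + ebx*4 + 4]      # base + index*scale + disp: eax = 40
        lea     edx, [esi + ebx*8]          # same address arithmetic, no memory access
```

The scale matches the element size, so the index register counts elements rather than bytes.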
Memory references
• BYTE PTR, WORD PTR, and DWORD PTR
• In some cases the size of the operands in an instruction is ambiguous. Example:
  – mov [ebx], 2
• Solution: state the size explicitly
  – mov dword ptr [ebx], 2
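The size annotation decides how many bytes are written. A short sketch, assuming GNU as with Intel syntax and a hypothetical 4-byte variable `cell`; the AT&T equivalents are given in the comments.

```asm
        .intel_syntax noprefix
        .bss
cell:   .skip   4
        .text
        mov     ebx, offset cell
        mov     byte ptr  [ebx], 2      # stores 1 byte   (AT&T: movb $2, (%ebx))
        mov     word ptr  [ebx], 2      # stores 2 bytes  (AT&T: movw $2, (%ebx))
        mov     dword ptr [ebx], 2      # stores 4 bytes  (AT&T: movl $2, (%ebx))
        mov     [ebx], eax              # no annotation needed: eax fixes the size at 32 bits
```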
Near and Far Pointers (32bit mode)
Jumps
• Jump instructions are always optimized to use the smallest possible displacement.
  – In AT&T syntax, absolute (as opposed to PC-relative) call and jump operands must be prefixed with ‘*’. In Intel syntax they are written without a delimiter.
• The ‘jcxz’, ‘jecxz’, ‘loop’, ‘loopz’, ‘loope’, ‘loopnz’ and ‘loopne’ instructions only come with byte (8-bit) displacements
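As an illustration of a byte-displacement instruction and of an absolute jump, here is a minimal sketch assuming GNU as with Intel syntax. The labels `sum5` and `done` are made up, and `1:` / `1b` are GNU as local labels.

```asm
        .intel_syntax noprefix
        .text
sum5:
        mov     ecx, 5
        xor     eax, eax
1:      add     eax, ecx            # eax += ecx
        loop    1b                  # ecx -= 1; jump back while ecx != 0
                                    # here eax = 5+4+3+2+1 = 15
        mov     ebx, offset done
        jmp     ebx                 # absolute jump through a register
                                    # (AT&T would write: jmp *%ebx)
done:
        ret
```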
Numbers
• Floating-point constructors are ‘.float’ or ‘.single’, ‘.double’, and ‘.tfloat’ for the 32-, 64-, and 80-bit formats
  – The corresponding AT&T instruction suffixes are ‘s’, ‘l’, and ‘t’
• Integer constructors are ‘.word’, ‘.long’ or ‘.int’, and ‘.quad’ for the 16-, 32-, and 64-bit integer formats
  – The corresponding AT&T instruction suffixes are ‘s’ (single), ‘l’ (long), and ‘q’ (quad)
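The constructors and the suffixes meet when a value is loaded onto the floating-point stack. A sketch assuming GNU as for i386; the data labels are hypothetical, and the second half repeats the first in AT&T syntax to show the suffixes.

```asm
        .intel_syntax noprefix
        .data
half32:  .float  0.5                # 32-bit float
half64:  .double 0.5                # 64-bit float
count16: .word   3                  # 16-bit integer
count32: .long   3                  # 32-bit integer
        .text
        fld     dword ptr [half32]  # load 32-bit float
        fld     qword ptr [half64]  # load 64-bit float
        fild    word ptr [count16]  # load 16-bit integer
        fild    dword ptr [count32] # load 32-bit integer

        .att_syntax prefix
        flds    half32              # 's': 32-bit float
        fldl    half64              # 'l': 64-bit float
        filds   count16             # 's': 16-bit integer
        fildl   count32             # 'l': 32-bit integer
```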
Specifying CPU Architecture
• .arch cpu_type
  – ‘i8086’ ‘i186’ ‘i286’ ‘i386’ ‘i486’ ‘i586’ ‘i686’ ‘pentium’ ‘pentiumpro’ ‘pentium4’ ‘k6’ ‘athlon’ ‘sledgehammer’
• Optional modifier: ‘jumps’ or ‘nojumps’
  – ‘jumps’ enables jump promotion (conditional jumps that do not fit a byte displacement are promoted to longer forms)
AS Assembler
Syntax
• Comments
  – /* comment */
  – Line comment: starts with #
• Symbols
  – Formed from letters, digits, ‘_’, ‘.’ and ‘$’
• Statements
  – One statement per line
• Constants
  – .byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
  – .ascii "Ring the bell\7" # A string constant.
  – .octa 0x123456789abcdef0123456789ABCDEF0 # A bignum.
  – .float 0f-314159265358979E-40 # - pi, a flonum.
Constants
• Character constants
  – .byte 'J
• Strings
  – .ascii "Ring the bell\7" # A string constant.
  – Special characters are written as backslash escapes, for example \n, \t, \\ and \x followed by hex digits
• Integers
  – Binary: ‘0b0100111’ or ‘0B0100111’
  – Octal: ‘01234567’ (starts with 0)
  – Decimal: 123456789
  – Hexadecimal: ‘0x45ab4f’ or ‘0X45AB4F’
• Floating point
  – 0f-314.15E-2
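The notations above can be mixed freely in one data definition. A minimal sketch assuming GNU as for x86 (where `#` starts a comment); the labels `same`, `bell` and `bell_len` are hypothetical.

```asm
        .data
same:   .byte   74, 0112, 0x4A, 0b01001010, 'J  # five bytes, each with the value 0x4A
bell:   .ascii  "Ring the bell\7"               # 14 bytes: 13 characters plus BEL (7)
bell_len = . - bell                             # 14
```

Decimal 74, octal 0112, hexadecimal 0x4A, binary 0b01001010 and the character 'J all denote the same byte.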
Sections
• text, data and bss sections
  – .text
    • Code
  – .data
    • Initialized data
  – .bss
    • Uninitialized data (the whole section is zeroed at startup)
• .section name
• Subsections
  – .text 0
  – .text 1
  – etc.
Symbols
• Labels
  – label_1: statement
  – Dot (‘.’)
    • The current address AS is assembling into
• Expressions
  – Operators
    • * Multiplication, / Division, % Remainder, << Shift Left, >> Shift Right
    • | Bitwise Inclusive Or, & Bitwise And, ^ Bitwise Exclusive Or, ! Bitwise Or Not
    • + Addition, - Subtraction
    • == Is Equal To, <> Is Not Equal To, < Is Less Than, > Is Greater Than
    • >= Is Greater Than Or Equal To, <= Is Less Than Or Equal To
    • && Logical And, || Logical Or
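Dot and the expression operators are mostly used to compute sizes and masks at assembly time. A small sketch assuming GNU as; all symbol names are hypothetical.

```asm
        .data
table:  .long   1, 2, 3, 4
table_bytes = . - table             # 16: '.' is the address just after the last .long
table_count = table_bytes / 4       # 4 elements
flag_bits   = (1 << 4) | 0x3        # 0x13
```

None of these symbols occupies memory; they are constants known to the assembler.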
Directives
• .ascii "string" ...
• .asciz or .string "string" ... (each string is followed by a zero byte)
• .byte
• .word
• .long or .int
• .quad
• .equ or .set symbol, expression
Directives
• .global symbol
• .if, .endif, .else, .elseif
• .data subsection
• .text subsection
• .section name
• .fill repeat, size, value
  – Size = 1 to 8 (byte to quad word)
• .skip or .space size, fill
  – Size in bytes
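Several of these directives can be seen working together in a few lines. This is only a sketch, assuming GNU as; `BUFSZ`, `pad`, `buf` and `big` are names invented for the example.

```asm
        .equ    BUFSZ, 32           # assembly-time constant
        .global buf                 # make 'buf' visible to the linker
        .data
pad:    .fill   4, 2, 0xFFFF        # 4 units of 2 bytes, each 0xFFFF: 8 bytes
buf:    .skip   BUFSZ, 0            # BUFSZ bytes of zero
        .if     BUFSZ > 16
big:    .byte   1                   # assembled, because 32 > 16
        .else
big:    .byte   0                   # skipped
        .endif
```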
Directives
.macro name arg1=def1, arg2=def2
    statement \arg1 \arg2
.endm
Using the macro:
    name arg1=arg1v, arg2=arg2v
or
    name arg1v, arg2v
These are like inline functions. Example:
.macro Add5 arg1
    ADD \arg1, 5
.endm
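To show what the assembler does with a macro call, here is a sketch assuming GNU as with Intel syntax. `Add5` is the macro from the slide; `AddK` is a hypothetical second macro added to illustrate a default argument.

```asm
        .intel_syntax noprefix
        .macro  Add5 arg1
        add     \arg1, 5
        .endm
        .macro  AddK reg, k=1       # hypothetical macro with a default for k
        add     \reg, \k
        .endm
        .text
        Add5    eax                 # expands to: add eax, 5
        AddK    ebx                 # expands to: add ebx, 1
        AddK    ebx, 8              # positional: add ebx, 8
        AddK    reg=ecx, k=2        # by keyword: add ecx, 2
```

Each call is replaced by the macro body at assembly time, so no call or return instruction is generated.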
Instruction listing
Instruction format
• A series of fields, most of them optional; instructions have variable length.
Prefixes
• Group 1
  – Lock and repeat prefixes
• Group 2
  – Segment override prefixes
    • CS, SS, DS, ES, FS, GS
  – Branch hints
    • Branch not taken
    • Branch taken
• Group 3
  – Operand-size override prefix
• Group 4
  – Address-size override prefix
ModR/M and SIB Bytes
• The mod field combines with the r/m field to form 32 possible values: eight registers and 24 addressing modes.
• The reg/opcode field typically specifies a register number; in some instructions it carries extra opcode bits instead
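The fields become clearer when a few encodings are decoded by hand. The sketch below assumes GNU as with Intel syntax in 32-bit mode; the bytes in the comments follow from the encoding rules and are what GNU as is expected to emit, which `objdump -d` can confirm.

```asm
        .intel_syntax noprefix
        .text
        mov     eax, ebx                # 89 D8        opcode 89, ModR/M D8 = 11 011 000
                                        #              mod=11 (register), reg=ebx, r/m=eax
        mov     ax, bx                  # 66 89 D8     same, with operand-size prefix 66
        mov     eax, [ebx + ecx*4 + 8]  # 8B 44 8B 08  ModR/M 44 = 01 000 100
                                        #              mod=01 (disp8), reg=eax, r/m=100 (SIB follows)
                                        #              SIB 8B = 10 001 011: scale=4, index=ecx, base=ebx
                                        #              disp8 = 08
```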
Instruction listing
ADC Add with carry
ADD Add
AND Logical AND
CALL Call procedure
CBW, CWD, CDQ Convert size
CLC Clear carry flag
CLD Clear direction flag
CLI Clear interrupt flag
CMC Complement carry flag
CMP Compare operands (D-S)
CMPSB, CMPSW, CMPSD, CMPSQ Compare B, W, D, Q in memory ([DS:ESI] - [ES:EDI])
DEC Decrement by 1
DIV Unsigned divide
HLT Enter halt state
IDIV Signed divide
IMUL Signed multiply
IN Input from port
INC Increment by 1
INT Call to interrupt
IRET Return from interrupt
Jxx (JA, JAE, JB, JBE, JC, JCXZ, JE, JG, JGE, JL, JLE, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ) Conditional jump: above/below compare unsigned values, greater/less compare signed values, etc.
JMP Jump
LAHF Load flags into AH register
LEA Load Effective Address
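The difference between the unsigned (above/below) and signed (greater/less) conditions shows up as soon as a negative number is involved. A minimal sketch assuming GNU as with Intel syntax; the labels are made up.

```asm
        .intel_syntax noprefix
        .text
        mov     eax, -1             # bit pattern 0xFFFFFFFF
        cmp     eax, 1              # sets the flags from eax - 1
        jl      is_less             # signed:   -1 < 1, so this jump is taken
        nop                         # not reached
is_less:
        cmp     eax, 1
        jb      is_below            # unsigned: 0xFFFFFFFF > 1, so this jump is not taken
        nop                         # reached
is_below:
        ret
```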
Instruction listing
LDS, LES, LFS, LGS, LSS Load far pointer into segment register : DST
LOCK Assert BUS LOCK# signal
LODSB, LODSW, LODSD, LODSQ Load B, W, D, Q (for strings): EAX <- [DS:ESI]
LOOP/LOOPx Loop control
MOV Move
MOVSB, MOVSW, MOVSD, MOVSQ Move B, W, D, Q from string to string: [ES:EDI] <- [DS:ESI]
MUL Unsigned multiply
NEG Two's complement negation
NOP No operation
NOT Invert every bit of the operand (logical NOT)
OR Logical OR
OUT Output to port
POP Pop data from stack
POPF Pop data into flags register
PUSH Push data onto stack
PUSHF Push flags onto stack
RCL Rotate left (with carry)
RCR Rotate right (with carry)
REPxx Repeat (CMPS/MOVS/SCAS/STOS); this is a prefix
RET, RETN, RETF Return from procedure (near and far)
ROL Rotate left
ROR Rotate right
Instruction listings
SAHF Store AH into flags
SAL Shift Arithmetically left
SAR Shift Arithmetically right
SBB Subtraction with borrow
SCASB, SCASW, SCASD, SCASQ Compare B, W, D, Q string: EAX - [ES:EDI]
SHL Shift left (unsigned shift left)
SHR Shift right (unsigned shift right)
STC Set carry flag
STD Set direction flag
STI Set interrupt flag
STOSB, STOSW, STOSD, STOSQ Store B, W, D, Q in string: [ES:EDI] <- EAX
SUB Subtraction: D=D‐S
TEST Logical compare (AND)
XCHG Exchange data
XOR Exclusive OR
• Further reference: Intel manuals
  – Intel® 64 and IA-32 Architectures Software Developer's Manual
    • Volume 1: Basic Architecture, chapter 5, Instruction Set Summary
    • Volume 2: Instruction Set Reference
Using AS
Hello World
.intel_syntax noprefix
.data                           # section declaration
msg: .ascii "Hello, world!\n"   # our dear string
len = . - msg                   # string length
.text                           # section declaration
# we must export the entry point
.global _start
_start:
# write our string to stdout
mov edx, offset len             # 3rd arg: message length
mov ecx, offset msg             # 2nd arg: pointer to message
mov ebx, 1                      # 1st arg: file handle (stdout)
mov eax, 4                      # system call number (sys_write)
int 0x80                        # call kernel
# and exit
mov ebx, 0                      # 1st arg: exit code
mov eax, 1                      # system call number (sys_exit)
int 0x80                        # call kernel
Invoking
• Assembling:
  as -g -o hello.o hello.s
  – Option -o: names the object file, “hello.o”
  – The source file is hello.s
  – Option -g: generates debug information
• Linking:
  ld -o hello hello.o
  – Option -o: names the executable file, “hello”
  – The input is the object file “hello.o”
  – Option -s (optional): strips the symbol information
• On a 64-bit host, pass --32 to as and -m elf_i386 to ld to build 32-bit code
Running and debugging
• Running
  – Just type (./ specifies the current directory)
    >> ./hello
• Debugging
  – Use “kdbg”, the KDE front end to the GNU debugger (gdb)
  – Type
    >> kdbg hello
System Calls
• System call 3: read
  – Reads from a file descriptor (fs/read_write.c)
  – ssize_t read(int fd, void *buf, size_t count);
• System call 4: write
  – Writes to a file descriptor (fs/read_write.c)
  – ssize_t write(int fd, const void *buf, size_t count);
• Parameters are passed in the registers
  – EBX(1), ECX(2), EDX(3), ESI(4), EDI(5), EBP(6)
• EAX contains the system call number
• The kernel is accessed by int 0x80
mov edx, offset len   # third parameter (string length)
mov ecx, offset msg   # second parameter (string pointer)
mov ebx, 1            # first parameter (stdout)
mov eax, 4            # system call number (write)
int 0x80              # call kernel
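Combining both calls gives a one-shot echo program: read once from standard input, then write back what was read. This is a sketch under the same assumptions as the Hello World example (32-bit Linux, int 0x80, GNU as with Intel syntax); the buffer name and its 64-byte size are arbitrary choices.

```asm
        .intel_syntax noprefix
        .bss
buf:    .skip   64                  # hypothetical 64-byte input buffer
        .text
        .global _start
_start:
        mov     eax, 3              # system call number (sys_read)
        mov     ebx, 0              # 1st arg: file descriptor 0 (stdin)
        mov     ecx, offset buf     # 2nd arg: buffer pointer
        mov     edx, 64             # 3rd arg: maximum byte count
        int     0x80                # eax = bytes read, or a negative error code
        test    eax, eax
        jle     exit                # nothing read, or error: skip the write
        mov     edx, eax            # 3rd arg: number of bytes actually read
        mov     ecx, offset buf     # 2nd arg: buffer pointer
        mov     ebx, 1              # 1st arg: file descriptor 1 (stdout)
        mov     eax, 4              # system call number (sys_write)
        int     0x80
exit:
        mov     ebx, 0              # 1st arg: exit code
        mov     eax, 1              # system call number (sys_exit)
        int     0x80
```

The return value of each call comes back in eax, which is why the byte count from read can be passed straight on to write.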
System Calls
• 162: nanosleep
  – int nanosleep(const struct timespec *req, struct timespec *rem);
  – struct timespec:
    time_t (long) seconds;
    long nanoseconds;
  – In assembler:
    Time:
    .long 0 # seconds (set as needed)
    .long 0 # nanoseconds (set as needed)
  – EAX = 162, ARG1: EBX, ARG2: ECX
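A complete call might look like the sketch below, which sleeps for 1.5 seconds and then exits. It assumes 32-bit Linux (where both timespec fields are 32-bit longs), GNU as with Intel syntax, and passes NULL for `rem`; the label `delay` is made up.

```asm
        .intel_syntax noprefix
        .data
delay:  .long   1                   # tv_sec:  1 second
        .long   500000000           # tv_nsec: 0.5 second
        .text
        .global _start
_start:
        mov     eax, 162            # system call number (nanosleep)
        mov     ebx, offset delay   # 1st arg: req
        mov     ecx, 0              # 2nd arg: rem = NULL (remaining time not wanted)
        int     0x80
        mov     ebx, 0              # exit code
        mov     eax, 1              # system call number (sys_exit)
        int     0x80
```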
Mixing C with assembler
• C calling convention (caller)
  – Push the arguments onto the stack in reverse order (last argument first)
    • This means that the first argument will be on top of the stack
  – Call the function
  – Free the arguments
• Compilers reserve stack space for locals and arguments in advance in order to avoid pushes and pops
• Callee:
  – Optional for us: define a new stack frame
    • Push the old ebp
    • ebp = esp
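Both sides of the convention can be sketched in a few lines. The example assumes 32-bit code and GNU as with Intel syntax; `add2` and `call_add2` are hypothetical functions, with `add2` matching the C prototype `int add2(int a, int b);`.

```asm
        .intel_syntax noprefix
        .text
        .global add2
add2:                               # callee
        push    ebp                 # save the caller's frame pointer
        mov     ebp, esp            # new stack frame
        mov     eax, [ebp + 8]      # first argument  (a)
        add     eax, [ebp + 12]     # second argument (b); result is returned in eax
        pop     ebp                 # restore the caller's frame pointer
        ret

        .global call_add2
call_add2:                          # caller: computes add2(2, 3)
        push    3                   # last argument first
        push    2                   # first argument ends up on top of the stack
        call    add2
        add     esp, 8              # free the two arguments
        ret                         # eax = 5
```

Inside the callee, [ebp] holds the old ebp and [ebp + 4] the return address, so the arguments start at [ebp + 8].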
Mixing C with assembler
• Compiling
  – gcc -o executable_file file.c file.s
• Linking with a library (ncurses): -l
  – gcc -o executable_file file.s -lncurses
• The entry point is now
  – “main” and not “_start”
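A program started through gcc therefore defines `main` and can call C library functions directly. The sketch below assumes a 32-bit, non-position-independent build, for example `gcc -m32 -no-pie -o hello hello.s` on a 64-bit host; the file name and the message are arbitrary.

```asm
        .intel_syntax noprefix
        .data
msg:    .asciz  "Hello from main"   # zero-terminated, as puts expects
        .text
        .global main
main:
        push    offset msg          # the only argument of puts
        call    puts                # C library function
        add     esp, 4              # free the argument
        xor     eax, eax            # return 0 from main
        ret
```

The C runtime supplies `_start`, sets up the process and then calls `main`, so the program ends with `ret` instead of the exit system call.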