42
Assembly Language Fundamentals

Assembly fundamentals

Embed Size (px)

Citation preview

Page 1: Assembly fundamentals

Assembly Language Fundamentals

Page 2: Assembly fundamentals

Starting with an Example

2

TITLE Add and Subtract (AddSub.asm); Adds and subtracts three 32-bit integers; (10000h + 40000h + 20000h)INCLUDE Irvine32.inc.codemain PROC

mov eax,10000h ; EAX = 10000hadd eax,40000h ; EAX = 50000hsub eax,20000h ; EAX = 30000hcall DumpRegs ; display registersexit

main ENDPEND main

Title/header

Include file

Code section

Page 3: Assembly fundamentals

3

Meanings of the Code

Assembly code Machine codeMOV EAX, 10000h B8 00010000(Move 10000h into EAX)

ADD EAX, 40000h 05 00040000(Add 40000h to EAX)

SUB EAX, 20000h 2D 00020000(SUB 20000h from EAX)

3

Operand in instruction

Page 4: Assembly fundamentals

4

Fetched MOV EAX, 10000h

4

ALU

MemoryRegisterEAXEBX

address

00

MOV EAX, 10000h

SUB EAX, 20000h

data

PC

0000011B8 00010000

IR

B8000100

00

ADD EAX, 40000h

05000400

Page 5: Assembly fundamentals

5

Execute MOV EAX, 10000h

5

ALU

RegisterEAXEBX

data

PC

0000011B8 00010000

IR

00010000

00

MOV EAX, 10000h

SUB EAX, 20000h

B8000100

00

ADD EAX, 40000h

05000400

Memory

address

Page 6: Assembly fundamentals

6

Fetched ADD EAX, 40000h

6

ALU

RegisterEAXEBX

data

PC

000100005 00040000

IR

00010000

Memory

address

00

MOV EAX, 10000h

SUB EAX, 20000h

B8000100

00

ADD EAX, 40000h

05000400

Page 7: Assembly fundamentals

7

Execute ADD EAX, 40000h

7

ALU

RegisterEAXEBX

data

PC

000100005 00040000

IR

0001000000050000

Memory

address

00

MOV EAX, 10000h

SUB EAX, 20000h

B8000100

00

ADD EAX, 40000h

05000400

Page 8: Assembly fundamentals

Chapter Overview

• Basic Elements of Assembly Language• Integer constants and expressions• Character and string constants• Reserved words and identifiers• Directives and instructions• Labels• Mnemonics and Operands• Comments

• Example: Adding and Subtracting Integers• Assembling, Linking, and Running Programs• Defining Data• Symbolic Constants• Real-Address Mode Programming

8

Page 9: Assembly fundamentals

Reserved Words, Directives

• TITLE: • Define program listing title• Reserved word of directive

• Reserved words• Instruction mnemonics,

directives, type attributes, operators, predefined symbols

• See MASM reference in Appendix A

• Directives:• Commands for assembler

9

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 10: Assembly fundamentals

Directive vs Instruction

• Directives: tell assembler what to do• Commands that are recognized and acted upon by the assembler, e.g. declare

code, data areas, select memory model, declare procedures, etc.• Not part of the Intel instruction set• Different assemblers have different directives

• Instructions: tell CPU what to do• Assembled into machine code by assembler• Executed at runtime by the CPU• Member of the Intel IA-32 instruction set• Format:

LABEL (option), Mnemonic, Operands, Comment

10

Page 11: Assembly fundamentals

Comments

•Single-line comments• begin with semicolon (;)

•Multi-line comments• begin with COMMENT

directive and a programmer-chosen character, end with the same character, e.g.

COMMENT ! Comment line 1

Comment line 2!

11

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 12: Assembly fundamentals

Include Files

• INCLUDE directive:• Copies necessary

definitions and setup information from a text file named Irvine32.inc, located in the assembler’s INCLUDE directory (see Chapt 5)

12

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 13: Assembly fundamentals

Code Segment

• .code directive:• Marks the beginning of the

code segment, where all executable statements in a program are located

13

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 14: Assembly fundamentals

Procedure Definition

• Procedure defined by:• [label] PROC• [label] ENDP

• Label:• Place markers: marks the

address (offset) of code and data

• Assigned a numeric address by assembler

• Follow identifier rules• Data label: must be unique,

e.g. myArray• Code label: target of jump and

loop instructions, e.g. L1:

14

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 15: Assembly fundamentals

Identifiers

• Identifiers:• A programmer-chosen

name to identify a variable, a constant, a procedure, or a code label

• 1-247 characters, including digits

• not case sensitive• first character must be a

letter, _, @, ?, or $

15

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 16: Assembly fundamentals

Instructions

[label:] mnemonic operand(s) [;comment]

• Instruction mnemonics:• help to memorize• examples: MOV, ADD, SUB,

MUL, INC, DEC• Operands:

• constant• constant expression• register• memory (data label,

register)

16

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main immediate valuesDestinatio

noperand

Sourceoperand

Page 17: Assembly fundamentals

I/O• Not easy, if program by

ourselves• Will use the library

provided by the author• Two steps:

• Include the library (Irvine32.inc) in your code

• Call the subroutines• call DumpRegs:

• Calls the procedure to displays current values of processor registers

17

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 18: Assembly fundamentals

Remaining• exit:

• Halts the program• Not a MSAM keyword, but

a command defined in Irvine32.inc

• END main:• Marks the last line of the

program to be assembled• Identifies the name of the

program’s startup procedure

18

TITLE Add and …; Adds and subtracts; (10000h + …INCLUDE Irvine32.inc.codemain PROC

mov eax,10000hadd eax,40000hsub eax,20000hcall DumpRegsexit

main ENDPEND main

Page 19: Assembly fundamentals

Example Program Output

• Program output, showing registers and flags

19

EAX=00030000 EBX=7FFDF000 ECX=00000101 EDX=FFFFFFFFESI=00000000 EDI=00000000 EBP=0012FFF0 ESP=0012FFC4EIP=00401024 EFL=00000206 CF=0 SF=0 ZF=0 OF=0

Page 20: Assembly fundamentals

Alternative Version of AddSub

20

TITLE Add and Subtract (AddSubAlt.asm); adds and subtracts 32-bit integers.386.MODEL flat,stdcall.STACK 4096ExitProcess PROTO, dwExitCode:DWORDDumpRegs PROTO.codemain PROC

mov eax,10000h ; EAX = 10000hadd eax,40000h ; EAX = 50000hsub eax,20000h ; EAX = 30000hcall DumpRegsINVOKE ExitProcess,0

main ENDPEND main

Page 21: Assembly fundamentals

Explanations

• .386 directive:• Minimum processor required for this code

• .MODEL directive:• Generate code for protected mode program• Stdcall: enable calling of Windows functions

• PROTO directives:• Prototypes for procedures• ExitProcess: Windows function to halt process

• INVOKE directive:• Calls a procedure or function• Calls ExitProcess and passes it with a return code of zero

21

Page 22: Assembly fundamentals

Suggested Program Template

22

TITLE Program Template (Template.asm); Program Description:; Author:; Creation Date:; Revisions: ; Date: Modified by:INCLUDE Irvine32.inc.data

; (insert variables here).codemain PROC

; (insert executable instructions here)exit

main ENDP; (insert additional procedures here)

END main

Page 23: Assembly fundamentals

What's Next

• Basic Elements of Assembly Language• Example: Adding and Subtracting Integers• Assembling, Linking, and Running Programs• Defining Data• Symbolic Constants• Real-Address Mode Programming

23

Page 24: Assembly fundamentals

Assemble-Link-Execute Cycle

• Steps from creating a source program through executing the compiled program

http://kipirvine.com/asm/gettingStarted/index.htm

24

SourceFile

ObjectFile

ListingFile

LinkLibrary

ExecutableFile

MapFile

Output

Step 1: text editor

Step 2:assembler

Step 3:linker

Step 4:OS loader

Page 25: Assembly fundamentals

Listing File

• Use it to see how your program is compiled• Contains

• source code• addresses• object code (machine language)• segment names• symbols (variables, procedures, and constants)

• Example: addSub.lst

25

Page 26: Assembly fundamentals

Listing File

00000000 .code00000000 main PROC

00000000 B8 00010000 mov eax,10000h00000005 05 00040000 add eax,40000h0000000A 2D 00020000 sub eax,20000h0000000F E8 00000000E call DumpRegs

exit 00000014 6A 00 * push +000000000h 00000016 E8 00000000E * call ExitProcess 0000001B main ENDP

END main

26

address memorycontent

Page 27: Assembly fundamentals

Data Types

BYTE 8-bit unsigned integerSBYTE 8-bit signed integerWORD 16-bit unsigned integerSWORD 16-bit signed integerDWORD 32-bit unsigned integerSDWORD 32-bit signed integerFWORD 48-bit integer (Far pointer in protected mode)QWORD 64-bit integerTBYTE 80-bit (10-byte) integerREAL4 32-bit (4-byte) short realREAL8 64-bit (8-byte) long realREAL10 80-bit (10-byte) extended real

27

Page 28: Assembly fundamentals

Defining BYTE and SBYTE Data• Each of following defines a single byte of storage:

28

value1 BYTE 'A' ; character constantvalue2 BYTE 0 ; smallest unsigned bytevalue3 BYTE 255 ; largest unsigned bytevalue4 SBYTE -128 ; smallest signed bytevalue5 SBYTE +127 ; largest signed bytevalue6 BYTE ? ; uninitialized byte

• MASM does not prevent you from initializing a BYTE with a negative value, but it is considered poor style

• If you declare a SBYTE variable, the Microsoft debugger will automatically display its value in decimal with a leading sign

Page 29: Assembly fundamentals

Defining Byte Arrays

• Examples that use multiple initializers:

29

list1 BYTE 10,20,30,40list2 BYTE 10,20,30,40 BYTE 50,60,70,80 BYTE 81,82,83,84

list3 BYTE ?,32,41h,00100010blist4 BYTE 0Ah,20h,‘A’,22h

Page 30: Assembly fundamentals

Defining Strings

• An array of characters• Usually enclosed in quotation marks• Will often be null-terminated• To continue a single string across multiple lines, end each line with a comma

30

str1 BYTE "Enter your name",0str2 BYTE 'Error: halting program',0str3 BYTE 'A','E','I','O','U'greeting BYTE "Welcome to the Encryption Demo program " BYTE "created by Kip Irvine.",0menu BYTE "Checking Account",0dh,0ah,0dh,0ah,

"1. Create a new account",0dh,0ah,"2. Open an existing account",0dh,0ah,"Choice> ",0

Is str1 an array? End-of-line sequence:• 0Dh = carriage return• 0Ah = line feed

Page 31: Assembly fundamentals

Using the DUP Operator

• Use DUP to allocate (create space for) an array or string • Syntax:

counter DUP (argument)• Counter and argument must be constants or constant expressions

31

var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zerovar2 BYTE 20 DUP(?) ; 20 bytes, uninitializedvar3 BYTE 4 DUP("STACK"); 20 bytes,

; "STACKSTACKSTACKSTACK”

Page 32: Assembly fundamentals

Defining WORD and SWORD

• Define storage for 16-bit integers• or double characters• single value or multiple values

32

word1 WORD 65535 ; largest unsigned valueword2 SWORD –32768 ; smallest signed valueword3 WORD ? ; uninitialized, unsignedword4 WORD "AB" ; double charactersmyList WORD 1,2,3,4,5 ; array of wordsarray WORD 5 DUP(?) ; uninitialized array

Page 33: Assembly fundamentals

Defining Other Types of Data

• Storage definitions for 32-bit integers, quadwords, tenbyte values, and real numbers:

33

val1 DWORD 12345678h ; unsignedval2 SDWORD –2147483648 ; signedval3 DWORD 20 DUP(?) ; unsigned arrayval4 SDWORD –3,–2,–1,0,1 ; signed arrayquad1 QWORD 1234567812345678hval1 TBYTE 1000000000123456789AhrVal1 REAL4 -2.1rVal2 REAL8 3.2E-260rVal3 REAL10 4.6E+4096ShortArray REAL4 20 DUP(0.0)

Page 34: Assembly fundamentals

Adding Variables to AddSub

34

TITLE Add and Subtract, Version 2 (AddSub2.asm); This program adds and subtracts 32-bit unsigned; integers and stores the sum in a variable.INCLUDE Irvine32.inc.dataval1 DWORD 10000hval2 DWORD 40000hval3 DWORD 20000hfinalVal DWORD ?.codemain PROC

mov eax,val1 ; start with 10000hadd eax,val2 ; add 40000hsub eax,val3 ; subtract 20000hmov finalVal,eax ; store the result (30000h)call DumpRegs ; display the registersexit

main ENDPEND main

Page 35: Assembly fundamentals

Listing File00000000 .data00000000 00010000 val1 DWORD 10000h00000004 00040000 val2 DWORD 40000h00000008 00020000 val3 DWORD 20000h0000000C 00000000 finalVal DWORD ?

00000000 .code00000000 main PROC00000000 A1 00000000 R mov eax,val1 ; start with 10000h00000005 03 05 00000004 R add eax,val2 ; add 40000h0000000B 2B 05 00000008 R sub eax,val3 ; subtract 20000h00000011 A3 0000000C R mov finalVal,eax; store result00000016 E8 00000000 E call DumpRegs ; display registers

exit00000022 main ENDP

35

Page 36: Assembly fundamentals

C vs Assembly

36

.dataval1 DWORD 10000hval2 DWORD 40000hval3 DWORD 20000hfinalVal DWORD ?.codemain PROC

mov eax,val1add eax,val2sub eax,val3mov finalVal,eaxcall DumpRegsexit

main ENDP

main(){

int val1=10000h;int val2=40000h;int val3=20000h;int finalVal;

finalVal = val1 + val2 - val3;

}

Page 37: Assembly fundamentals

What's Next

• Basic Elements of Assembly Language• Example: Adding and Subtracting Integers• Assembling, Linking, and Running Programs• Defining Data• Symbolic Constants

• Equal-Sign Directive• Calculating the Sizes of Arrays and Strings• EQU Directive• TEXTEQU Directive

• Real-Address Mode Programming

37

Page 38: Assembly fundamentals

Equal-Sign Directive

• name = expression• expression is a 32-bit integer (expression or constant)• may be redefined• name is called a symbolic constant• Also OK to use EQU

• good programming style to use symbols

38

COUNT = 500…mov al,COUNT

Page 39: Assembly fundamentals

Calculating the Size of Arrays• Current location counter: $• Size of a byte array

• Subtract address of list and difference is the number of bytes

• Size of a word array• Divide total number of bytes by 2 (size of a word)

39

list BYTE 10,20,30,40ListSize = ($ - list)

list WORD 1000h,2000h,3000h,4000hListSize = ($ - list) / 2

Page 40: Assembly fundamentals

EQU Directive• Define a symbol as either an integer or text expression• Cannot be redefined• OK to use expressions in EQU:

• Matrix1 EQU 10 * 10• Matrix1 EQU <10 * 10>

• No expression evaluation if within < >• EQU accepts texts too

40

PI EQU <3.1416>pressKey EQU <"Press any key to continue",0>.dataprompt BYTE pressKey

Page 41: Assembly fundamentals

TEXTEQU Directive

• Define a symbol as either an integer or text expression• Called a text macro• Can be redefined

41

continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">rowSize = 5.dataprompt1 BYTE continueMsgcount TEXTEQU %(rowSize * 2) ; evaluates expressionsetupAL TEXTEQU <mov al,count>.codesetupAL ; generates: "mov al,10"

Page 42: Assembly fundamentals

Summary

• Integer expression, character constant• Directive – interpreted by the assembler• Instruction – executes at runtime• Code, data, and stack segments• Source, listing, object, map, executable files• Data definition directives:

• BYTE, SBYTE, WORD, SWORD, DWORD, SDWORD, QWORD, TBYTE, REAL4, REAL8, and REAL10

• DUP operator, location counter ($)

• Symbolic constant• EQU and TEXTEQU

42