Author
neil-pearson
View
232
Download
5
Tags:
Embed Size (px)
Overview of Assembly LanguageChapter 9S. Dandamudi
OutlineAssembly language statementsData allocationWhere are the operands?Addressing modesRegisterImmediateDirectIndirectData transfer instructionsmov, xchg, and xlatPTR directiveOverview of assembly language instructionsArithmeticConditionalLogicalShiftRotateDefining constantsEQU and = directivesMacrosIllustrative examples
Assembly Language StatementsThree different classesInstructionsTell CPU what to doExecutable instructions with an op-codeDirectives (or pseudo-ops)Provide information to assembler on various aspects of the assembly processNon-executableDo not generate machine language instructionsMacrosA shorthand notation for a group of statementsA sophisticated text substitution mechanism with parameters
Assembly Language Statements (contd)Assembly language statement format:[label] mnemonic [operands] [;comment]Typically one statement per lineFields in [ ] are optionallabel serves two distinct purposes:To label an instruction Can transfer program execution to the labeled instructionTo label an identifier or constantmnemonic identifies the operation (e.g., add, or)operands specify the data required by the operationExecutable instructions can have zero to three operands
Assembly Language Statements (contd)commentsBegin with a semicolon (;) and extend to the end of the line
Examplesrepeat: inc result ; increment resultCR EQU 0DH ; carriage return characterWhite space can be used to improve readabilityrepeat:inc result
Data AllocationVariable declaration in a high-level language such as Cchar responseint valuefloat totaldouble average_valuespecifiesAmount storage required (1 byte, 2 bytes, )Label to identify the storage allocated (response, value, )Interpretation of the bits stored (signed, floating point, )Bit pattern 1000 1101 1011 1001 is interpreted as -29,255 as a signed number 36,281 as an unsigned number
Data Allocation (contd)In assembly language, we use the define directiveDefine directive can be usedTo reserve storage spaceTo label the storage spaceTo initializeBut no interpretation is attached to the bits storedInterpretation is up to the program codeDefine directive goes into the .DATA part of the assembly language program Define directive format[var-name] D? init-value [,init-value],...
Data Allocation (contd)Five define directivesDB Define Byte ;allocates 1 byteDW Define Word ;allocates 2 bytesDD Define Doubleword ;allocates 4 bytesDQ Define Quadword ;allocates 8 bytesDT Define Ten bytes ;allocates 10 bytesExamplessorted DB yresponse DB ? ;no initializationvalue DW 25159float1 DD 1.234float2 DQ 123.456
Data Allocation (contd)Multiple definitions can be abbreviatedExample message DB B DB y DB e DB 0DH DB 0AHcan be written asmessage DB B,y,e,0DH,0AHMore compactly asmessage DB Bye,0DH,0AH
Data Allocation (contd)Multiple definitions can be cumbersome to initialize data structures such as arrays ExampleTo declare and initialize an integer array of 8 elements marks DW 0,0,0,0,0,0,0,0What if we want to declare and initialize to zero an array of 200 elements?There is a better way of doing this than repeating zero 200 times in the above statementAssembler provides a directive to do this (DUP directive)
Data Allocation (contd)Multiple initializationsThe DUP assembler directive allows multiple initializations to the same valuePrevious marks array can be compactly declared asmarks DW 8 DUP (0)Examplestable1 DW 10 DUP (?) ;10 words, uninitializedmessage DB 3 DUP (Bye!) ;12 bytes, initialized ; as Bye!Bye!Bye!Name1 DB 30 DUP (?) ;30 bytes, each ; initialized to ?
Data Allocation (contd)The DUP directive may also be nestedExamplestars DB 4 DUP(3 DUP (*),2 DUP (?),5 DUP (!))Reserves 40-bytes space and initializes it as***??!!!!!***??!!!!!***??!!!!!***??!!!!!Examplematrix DW 10 DUP (5 DUP (0))defines a 10X5 matrix and initializes its elements to 0This declaration can also be done by matrix DW 50 DUP (0)
Data Allocation (contd)Symbol TableAssembler builds a symbol table so we can refer to the allocated storage space by the associated labelExample.DATAname offsetvalue DW 0value 0sum DD 0sum 2marks DW 10 DUP (?)marks 6message DB The grade is:,0message 26char1 DB ?char1 40
Data Allocation (contd)Correspondence to C Data Types
DirectiveC data type DBchar DWint, unsigned DDfloat, long DQdouble DTinternal intermediate float value
Data Allocation (contd)LABEL DirectiveLABEL directive provides another way to name a memory locationFormat:name LABEL typetype can beBYTE1 byteWORD2 bytesDWORD4 bytesQWORD8 bytesTWORD10 bytes
Data Allocation (contd)LABEL DirectiveExample.DATAcount LABEL WORDLo-count DB 0Hi_count DB 0.CODE ...movLo_count,ALmov Hi_count,CLcount refers to the 16-bit valueLo_count refers to the low byteHi_count refers to the high byte
Where Are the Operands?Operands required by an operation can be specified in a variety of waysA few basic ways are:operand in a registerregister addressing modeoperand in the instruction itselfimmediate addressing modeoperand in memoryvariety of addressing modesdirect and indirect addressing modesoperand at an I/O portdiscussed in Chapter 19
Where Are the Operands? (contd)Register addressing modeOperand is in an internal registerExamplesmov EAX,EBX ; 32-bit copymov BX,CX ; 16-bit copy mov AL,CL ; 8-bit copy
The mov instructionmov destination,sourcecopies data from source to destination
Where Are the Operands? (contd)Register addressing mode (contd)Most efficient way of specifying an operandNo memory access is requiredInstructions using this mode tend to be shorterFewer bits are needed to specify the registerCompilers use this mode to optimize code total := 0 for (i = 1 to 400) total = total + marks[i] end forMapping total and i to registers during the for loop optimizes the code
Where Are the Operands? (contd)Immediate addressing modeData is part of the instructionOoperand is located in the code segment along with the instructionEfficient as no separate operand fetch is neededTypically used to specify a constantExamplemov AL,75This instruction uses register addressing mode for destination and immediate addressing mode for the source
Where Are the Operands? (contd)Direct addressing modeData is in the data segmentNeed a logical address to access dataTwo components: segment:offsetVarious addressing modes to specify the offset componentoffset part is called effective addressThe offset is specified directly as part of instructionWe write assembly language programs using memory labels (e.g., declared using DB, DW, LABEL,...)Assembler computes the offset value for the labelUses symbol table to compute the offset of a label
Where Are the Operands? (contd)Direct addressing mode (contd)Examplesmov AL,responseAssembler replaces response by its effective address (i.e., its offset value from the symbol table)mov table1,56table1 is declared astable1 DW 20 DUP (0)Since the assembler replaces table1 by its effective address, this instruction refers to the first element of table1 In C, it is equivalent totable1[0] = 56
Where Are the Operands? (contd)Direct addressing mode (contd)Problem with direct addressingUseful only to specify simple variablesCauses serious problems in addressing data types such as arraysAs an example, consider adding elements of an arrayDirect addressing does not facilitate using a loop structure to iterate through the arrayWe have to write an instruction to add each element of the arrayIndirect addressing mode remedies this problem
Where Are the Operands? (contd)Indirect addressing modeThe offset is specified indirectly via a registerSometimes called register indirect addressing modeFor 16-bit addressing, the offset value can be in one of the three registers: BX, SI, or DIFor 32-bit addressing, all 32-bit registers can be usedExamplemov AX,[BX]Square brackets [ ] are used to indicate that BX is holding an offset valueBX contains a pointer to the operand, not the operand itself
Where Are the Operands? (contd)Using indirect addressing mode, we can process arrays using loopsExample: Summing array elementsLoad the starting address (i.e., offset) of the array into BXLoop for each element in the arrayGet the value using the offset in BXUse indirect addressingAdd the value to the running totalUpdate the offset in BX to point to the next element of the array
Where Are the Operands? (contd)Loading offset value into a registerSuppose we want to load BX with the offset value of table1We cannot writemov BX,table1Two ways of loading offset valueUsing OFFSET assembler directiveExecuted only at the assembly timeUsing lea instructionThis is a processor instructionExecuted at run time
Where Are the Operands? (contd)Loading offset value into a register (contd)Using OFFSET assembler directiveThe previous example can be written asmov BX,OFFSET table1
Using lea (load effective address) instructionThe format of lea instruction islea register,sourceThe previous example can be written aslea BX,table1
Where Are the Operands? (contd)Loading offset value into a register (contd)Which one to use -- OFFSET or lea?Use OFFSET if possibleOFFSET incurs only one-time overhead (at assembly time)lea incurs run time overhead (every time you run the program)May have to use lea in some instancesWhen the needed data is available at run time onlyAn index passed as a parameter to a procedureWe can writelea BX,table1[SI]to load BX with the address of an element of table1 whose index is in SI registerWe cannot use the OFFSET directive in this case
Default SegmentsIn register indirect addressing mode16-bit addressesEffective addresses in BX, SI, or DI is taken as the offset into the data segment (relative to DS)For BP and SP registers, the offset is taken to refer to the stack segment (relative to SS)32-bit addressesEffective address in EAX, EBX, ECX, EDX, ESI, and EDI is relative to DSEffective address in EBP and ESP is relative to SSpush and pop are always relative to SS
Default Segments (contd)Default segment overridePossible to override the defaults by using override prefixesCS, DS, SS, ES, FS, GSExample 1We can useadd AX,SS:[BX] to refer to a data item on the stackExample 2We can useadd AX,DS:[BP] to refer to a data item in the data segment
Data Transfer InstructionsWe will look at three instructionsmov (move)Actually copyxchg (exchange)Exchanges two operandsxlat (translate)Translates byte values using a translation tableOther data transfer instructions such asmovsx (move sign extended)movzx (move zero extended)are discussed in Chapter 12
Data Transfer Instructions (contd)The mov instructionThe format ismov destination,sourceCopies the value from source to destinationsource is not altered as a result of copyingBoth operands should be of same sizesource and destination cannot both be in memoryMost Pentium instructions do not allow both operands to be located in memoryPentium provides special instructions to facilitate memory-to-memory block copying of data
Data Transfer Instructions (contd)The mov instructionFive types of operand combinations are allowed:Instruction type Examplemov register,register mov DX,CXmov register,immediate mov BL,100mov register,memory mov BX,countmov memory,register mov count,SImov memory,immediate mov count,23
The operand combinations are valid for all instructions that require two operands
Data Transfer Instructions (contd)Ambiguous moves: PTR directiveFor the following data definitions.DATAtable1 DW 20 DUP (0)status DB 7 DUP (1)the last two mov instructions are ambiguousmov BX,OFFSET table1mov SI,OFFSET statusmov [BX],100mov [SI],100Not clear whether the assembler should use byte or word equivalent of 100
Data Transfer Instructions (contd)Ambiguous moves: PTR directiveThe PTR assembler directive can be used to clarifyThe last two mov instructions can be written asmov WORD PTR [BX],100mov BYTE PTR [SI],100WORD and BYTE are called type specifiersWe can also use the following type specifiers:DWORD for doubleword valuesQWORD for quadword valuesTWORD for ten byte values
Data Transfer Instructions (contd)The xchg instructionThe syntax isxchg operand1,operand2Exchanges the values of operand1 and operand2Examplesxchg EAX,EDXxchg response,CLxchg total,DXWithout the xchg instruction, we need a temporary register to exchange values using only the mov instruction
Data Transfer Instructions (contd)The xchg instructionThe xchg instruction is useful for conversion of 16-bit data between little endian and big endian formsExample:mov AL,AHconverts the data in AX into the other endian formPentium provides bswap instruction to do similar conversion on 32-bit databswap 32-bit registerbswap works only on data located in a 32-bit register
Data Transfer Instructions (contd)The xlat instructionThe xlat instruction translates bytesThe format isxlatb To use xlat instructionBX should be loaded with the starting address of the translation tableAL must contain an index in to the tableIndex value starts at zeroThe instruction reads the byte at this index in the translation table and stores this value in ALThe index value in AL is lostTranslation table can have at most 256 entries (due to AL)
Data Transfer Instructions (contd)The xlat instructionExample: Encrypting digits Input digits: 0 1 2 3 4 5 6 7 8 9Encrypted digits: 4 6 9 5 0 3 1 8 7 2.DATAxlat_table DB 4695031872....CODEmov BX,OFFSET xlat_tableGetCh ALsub AL,0 ; converts input character to indexxlatb ; AL = encrypted digit characterPutCh AL ...
Pentium Assembly InstructionsPentium provides several types of instructionsBrief overview of some basic instructions:Arithmetic instructionsJump instructionsLoop instructionLogical instructionsShift instructionsRotate instructionsThese instructions allow you to write reasonable assembly language programs
Arithmetic InstructionsINC and DEC instructionsFormat:inc destination dec destinationSemantics:destination = destination +/- 1destination can be 8-, 16-, or 32-bit operand, in memory or registerNo immediate operandExamplesinc BX dec value
Arithmetic Instructions (contd)Add instructionsFormat:add destination,sourceSemantics:destination = destination + sourceExamplesadd EBX,EAX add value,35inc EAX is better than add EAX,1inc takes less spaceBoth execute at about the same speed
Arithmetic Instructions (contd)Add instructionsAddition with carryFormat:adc destination,sourceSemantics:destination = destination + source + CF
Example: 64-bit additionadd EAX,ECX ; add lower 32 bits adc EBX,EDX ; add upper 32 bits with carry64-bit result in EBX:EAX
Arithmetic Instructions (contd)Subtract instructionsFormat:sub destination,sourceSemantics:destination = destination - sourceExamplessub EBX,EAX sub value,35dec EAX is better than sub EAX,1dec takes less spaceBoth execute at about the same speed
Arithmetic Instructions (contd)Subtract instructionsSubtract with borrowFormat:sbb destination,sourceSemantics:destination = destination - source - CFLike the adc, sbb is useful in dealing with more than 32-bit numbersNegationneg destinationSemantics:destination = 0 - destination
Arithmetic Instructions (contd)CMP instructionFormat:cmp destination,sourceSemantics:destination - sourcedestination and source are not alteredUseful to test relationship (>, =) between two operandsUsed in conjunction with conditional jump instructions for decision making purposesExamplescmp EBX,EAX cmp count,100
Unconditional JumpFormat:jmp labelSemantics:Execution is transferred to the instruction identified by labelTarget can be specified in one of two waysDirectlyIn the instruction itselfIndirectlyThrough a register or memoryDiscussed in Chapter 12
Unconditional Jump (contd)ExampleTwo jump instructionsForward jumpjmp CX_init_doneBackward jumpjmp repeat1Programmer specifies target by a labelAssembler computes the offset using the symbol table
. . . mov CX,10 jmp CX_init_doneinit_CX_20: mov CX,20CX_init_done: mov AX,CXrepeat1: dec CX . . . jmp repeat1 . . .
Unconditional Jump (contd)Address specified in the jump instruction is not the absolute addressUses relative address Specifies relative byte displacement between the target instruction and the instruction following the jump instructionDisplacement is w.r.t the instruction following jmpReason: IP points to this instruction after reading jumpExecution of jmp involves adding the displacement value to current IPDisplacement is a signed 16-bit numberNegative value for backward jumpsPositive value for forward jumps
Target LocationInter-segment jumpTarget is in another segmentCS = target-segment (2 bytes)IP = target-offset (2 bytes)Called far jumps (needs five bytes to encode jmp)Intra-segment jumpsTarget is in the same segmentIP = IP + relative-displacement (1 or 2 bytes)Uses 1-byte displacement if target is within -128 to +127Called short jumps (needs two bytes to encode jmp)If target is outside this range, uses 2-byte displacementCalled near jumps (needs three bytes to encode jmp)
Target Location (contd)In most cases, the assembler can figure out the type of jump For backward jumps, assembler can decide whether to use the short jump form or notFor forward jumps, it needs a hint from the programmerUse SHORT prefix to the target labelIf such a hint is not given Assembler reserves three bytes for jmp instructionIf short jump can be used, leaves one byte of nop (no operation)See the next example for details
Example . . . 8 0005 EB 0C jmp SHORT CX_init_done 0013 - 0007 = 0C 9 0007 B9 000A mov CX,10 10 000A EB 07 90 jmp CX_init_done nop0013 - 000D = 07 11 init_CX_20: 12 000D B9 0014 mov CX,20 13 0010 E9 00D0 jmp near_jump 00E3 - 0013 = D0 14 CX_init_done: 15 0013 8B C1 mov AX,CX
Example (contd)16 repeat1: 17 0015 49 dec CX 18 0016 EB FD jmp repeat10015 - 0018 = -3 = FDH . . . 84 00DB EB 03 jmp SHORT short_jump00E0 - 00DD = 3 85 00DD B9 FF00 mov CX, 0FF00H 86 short_jump: 87 00E0 BA 0020 mov DX, 20H 88 near_jump: 89 00E3 E9 FF27 jmp init_CX_20000D - 00E6 = -217 = FF27H
Conditional Jumps (contd)Format:j labExecution is transferred to the instruction identified by label only if is metExample: Testing for carriage returnread_char: . . . cmp AL,0DH ; 0DH = ASCII carriage return je CR_received inc CL jmp read_char . . .CR_received:
Conditional Jumps (contd)Some conditional jump instructionsTreats operands of the CMP instruction as signed numbers
jejump if equaljgjump if greaterjljump if lessjgejump if greater or equaljlejump if less or equaljnejump if not equal
Conditional Jumps (contd)Conditional jump instructions can also test values of the individual flags
jzjump if zero (i.e., if ZF = 1)jnzjump if not zero (i.e., if ZF = 0) jcjump if carry (i.e., if CF = 1)jncjump if not carry (i.e., if CF = 0)
jz is synonymous for jejnz is synonymous for jne
A Note on Conditional JumpsAll conditional jumps are encoded using 2 bytesTreated as short jumpsWhat if the target is outside this range?
target:. . . cmp AX,BXje targetmov CX,10. . .
traget is out of range for a short jumpUse this code to get aroundtarget:. . . cmp AX,BXjne skip1jmp targetskip1:mov CX,10. . .
Loop InstructionsUnconditional loop instructionFormat:loop targetSemantics:Decrements CX and jumps to target if CX 0CX should be loaded with a loop count valueExample: Executes loop body 50 times mov CX,50repeat: loop repeat ...
Loop Instructions (contd)The previous example is equivalent to mov CX,50repeat: dec CX jnz repeat ...Surprisingly, dec CX jnz repeat executes faster than loop repeat
Loop Instructions (contd)Conditional loop instructionsloope/loopzLoop while equal/zeroCX = CX 1ff (CX = 0 and ZF = 1) jump to target
loopne/loopnzLoop while not equal/not zeroCX = CX 1ff (CX = 0 and ZF = 0) jump to target
Logical InstructionsFormat:and destination,sourceor destination,sourcexor destination,sourcenot destinationSemantics:Performs the standard bitwise logical operationsresult goes to destination test is a non-destructive and instructiontest destination,sourcePerforms logical AND but the result is not stored in destination (like the CMP instruction)
Logical Instructions (contd)Example: . . . and AL,01H ; test the least significant bit jz bit_is_zero
jmp skip1bit_is_zero: skip1: . . .test instruction is better in place of and
Shift InstructionsTwo types of shiftsLogicalArithmeticLogical shift instructionsShift leftshl destination,count shl destination,CLShift rightshr destination,count shr destination,CLSemantics:Performs left/right shift of destination by the value in count or CL register CL register contents are not altered
Shift Instructions (contd)Logical shiftBit shifted out goes into the carry flagZero bit is shifted in at the other end
Shift Instructions (contd)count is an immediate valueshl AX,5Specification of count greater than 31 is not allowedIf a greater value is specified, only the least significant 5 bits are usedCL version is useful if shift count is known at run time Ex: when the shift count value is passed as a parameter in a procedure callOnly the CL register can be usedShift count value should be loaded into CLmov CL,5shl AX,CL
Shift Instructions (contd)Arithmetic shiftTwo versions as in logical shiftsal/sar destination,countsal/sar destination,CL
Double Shift InstructionsDouble shift instructions work on either 32- or 64-bit operandsFormat Takes three operands
shld dest,src,count ; left shiftshrd dest,src,count ; right shiftdest can be in memory or registersrc must be a registercount can be an immediate value or in CL as in other shift instructions
Double Shift Instructions (contd)src is not modified by doubleshift instructionOnly dest is modifiedShifted out bit goes into the carry flag
Rotate InstructionsTwo types of ROTATE instructions Rotate without carryrol (ROtate Left)ror (ROtate Right)Rotate with carryrcl (Rotate through Carry Left)rcr (Rotate through Carry Right)Format of ROTATE instructions is similar to the SHIFT instructionsSupports two versionsImmediate count valueCount value in CL register
Rotate Instructions (contd)Bit shifted out goes into the carry flag as in SHIFT instructions
Rotate Instructions (contd)Bit shifted out goes into the carry flag as in SHIFT instructions
Rotate Instructions (contd)Example: Shifting 64-bit numbersMultiplies a 64-bit value in EDX:EAX by 16Rotate versionmov CX,4shift_left:shl EAX,1rcl EDX,1loop shift_leftDoubleshift versionshld EDX,EAX,4shl EAX,4Division can be done in a similar a way
Defining ConstantsAssembler provides two directives:EQU directiveNo reassignmentString constants can be defined= directiveCan be reassignedNo string constantsDefining constants has two advantages:Improves program readabilityHelps in software maintenanceMultiple occurrences can be changed from a single placeConventionWe use all upper-case letters for names of constants
Defining Constants (contd)The EQU directiveSyntax:name EQU expressionAssigns the result of expression to nameThe expression is evaluated at assembly timeSimilar to #define in CExamplesNUM_OF_ROWS EQU 50NUM_OF_COLS EQU 10ARRAY_SIZE EQU NUM_OF_ROWS * NUM_OF_COLSCan also be used to define string constantsJUMP EQU jmp
Defining Constants (contd)The = directiveSyntax:name = expressionSimilar to EQU directiveTwo key differences:Redefinition is allowedcount = 0. . .count = 99is validCannot be used to define string constants or to redefine keywords or instruction mnemonicsExample: JUMP = jmp is not allowed
MacrosMacros can be defined with MACRO and ENDMFormatmacro_name MACRO[parameter1, parameter2,...] macro body ENDMA macro can be invoked usingmacro_name [argument1, argument2, ]Example: Definition InvocationmultAX_by_16 MACRO ... sal AX,4 mov AX,27 ENDM multAX_by_16 ...
Macros (contd)Macros can be defined with parametersMore flexibleMore useful
Examplemult_by_16 MACRO operand sal operand,4 ENDMTo multiply a byte in DL registermult_by_16 DLTo multiply a memory variable countmult_by_16 count
Macros (contd)Example: To exchange two memory wordsMemory-to-memory transferWmxchg MACRO operand1, operand2 xchg AX,operand1 xchg AX,operand2 xchg AX,operand1 ENDM
Illustrative ExamplesFive examples in this chapterConversion of ASCII to binary representation (BINCHAR.ASM)Conversion of ASCII to hexadecimal by character manipulation (HEX1CHAR.ASM) Conversion of ASCII to hexadecimal using the XLAT instruction (HEX2CHAR.ASM)Conversion of lowercase letters to uppercase by character manipulation (TOUPPER.ASM)Sum of individual digits of a number (ADDIGITS.ASM)Last slide