The Pentium Processor

The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003. S. Dandamudi Chapter 7:


Pentium Family

• Intel introduced its first microprocessor in 1971

4-bit microprocessor: 4004

8-bit microprocessors

» 8080

» 8085

16-bit processors

» 8086 introduced in 1978

– 20-bit address bus, 16-bit data bus

» 8088 is a less expensive version

– Uses 8-bit data bus

» Can address up to 4 segments of 64 KB

» Referred to as the real mode


Pentium Family (cont’d)

80186

» A faster version of 8086

» 16-bit data bus and 20-bit address bus

» Improved instruction set

80286 was introduced in 1982

» 24-bit address bus

» 16 MB address space

» Enhanced with memory protection capabilities

» Introduced protected mode

– Segmentation in protected mode is different from the real mode

» Backwards compatible


Pentium Family (cont’d)

80386 was introduced in 1985

» First 32-bit processor

» 32-bit data bus and 32-bit address bus

» 4 GB address space

» Segmentation can be turned off (flat model)

» Introduced paging

80486 was introduced in 1989

» Improved version of 386

» Combined coprocessor functions for performing floating-point arithmetic

» Added parallel execution capability to instruction decode and execution units

– Achieves scalar execution of 1 instruction/clock

» Later versions introduced energy savings for laptops


Pentium Family (cont’d)

Pentium (80586) was introduced in 1993

» Similar to 486 but with 64-bit data bus

» Wider internal datapaths

– 128- and 256-bit wide

» Added second execution pipeline

– Superscalar performance

– Two instructions/clock

» Doubled on-chip L1 cache

– 8 KB data

– 8 KB instruction

» Added branch prediction


Pentium Family (cont’d)

Pentium Pro was introduced in 1995

» Three-way superscalar

– 3 instructions/clock

» 36-bit address bus

– 64 GB address space

» Introduced dynamic execution

– Out-of-order execution

– Speculative execution

» In addition to the L1 cache

– Has 256 KB L2 cache


Pentium Family (cont’d)

Pentium II was introduced in 1997

» Introduced multimedia (MMX) instructions

» Doubled on-chip L1 cache

– 16 KB data

– 16 KB instruction

» Introduced comprehensive power management features

– Sleep

– Deep sleep

» In addition to the L1 cache

– Has 256 KB L2 cache

Pentium III, Pentium IV,…


Pentium Family (cont’d)

Itanium processor

» RISC design

– Previous designs were CISC

» 64-bit processor

» Uses 64-bit address bus

» 128-bit data bus

» Introduced several advanced features

– Speculative execution

– Predication to eliminate branches

– Branch prediction


Pentium Processor


Pentium Processor (cont’d)

• Data bus (D0 – D63)

64-bit data bus

• Address bus (A3 – A31)

Only 29 lines

» No A0-A2 (due to 8-byte wide data bus)

• Byte enable (BE0# – BE7#)

Identifies the set of bytes to read or write

» BE0# : least significant byte (D0 – D7)

» BE1# : next byte (D8 – D15)

» …

» BE7# : most significant byte (D56 – D63)

Any combination of bytes can be specified
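The interplay between the missing A0-A2 lines and the byte enables can be sketched in Python. This is only an illustrative model, not a description of the bus hardware: the function name `byte_enables` and the restriction to accesses that stay inside one 8-byte quadword are assumptions made for the example.

```python
def byte_enables(address, size):
    """Return (quadword_address, be_mask) for an access of `size` bytes.

    be_mask mimics the active-low BE# pins: bit i is 0 when byte lane i
    (data lines D[8i] to D[8i+7]) takes part in the transfer.
    Assumption: the access does not cross an 8-byte boundary.
    """
    offset = address & 0x7                 # what A0-A2 would have carried
    if size < 1 or offset + size > 8:
        raise ValueError("access must fit inside one 8-byte quadword")
    active = ((1 << size) - 1) << offset   # 1 = byte lane in use
    be_mask = ~active & 0xFF               # invert: BE# signals are active low
    quadword_address = address & ~0x7 & 0xFFFFFFFF  # value driven on A3-A31
    return quadword_address, be_mask
```

For a 2-byte access at address 1003H, lanes 3 and 4 are active, so only BE3# and BE4# would be driven low in this model.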


Pentium Processor (cont’d)

• Data parity (DP0 – DP7)

Even parity for 8 bytes of data

» DP0 : D0 – D7

» DP1 : D8 – D15

» …

» DP7 : D56 – D63

• Parity check (PCHK#)

Indicates the parity check result on data read

Parity is checked only for valid bytes

» Indicated by BE# signals
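As a rough illustration of even parity, the sketch below computes one parity bit per byte lane of a 64-bit value. The function names are hypothetical; the only property taken from the slide is that each data byte together with its DP bit should contain an even number of 1s.

```python
def even_parity_bit(byte):
    """Parity bit that makes the total number of 1s (byte + bit) even."""
    ones = bin(byte & 0xFF).count("1")
    return ones & 1          # 1 only when the byte itself has an odd count

def data_parity(quadword):
    """Return [DP0, ..., DP7] for a 64-bit value; DP0 covers D0-D7."""
    return [even_parity_bit((quadword >> (8 * lane)) & 0xFF)
            for lane in range(8)]
```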


Pentium Processor (cont’d)

• Parity enable (PEN#)

Determines whether parity check should be used

• Address parity (AP)

Bad address parity during inquire cycles

• Memory/IO (M/IO#)

Defines bus cycle: memory or I/O

• Write/Read (W/R#)

Distinguishes between write and read cycles

• Data/Code (D/C#)

Distinguishes between data and code


Pentium Processor (cont’d)

• Cacheability (CACHE#)

Read cycle: indicates internal cacheability

Write cycle: burst write-back

• Bus lock (LOCK#)

Used in read-modify-write cycle

Useful in implementing semaphores

• Interrupt (INTR)

External interrupt signal

• Nonmaskable interrupt (NMI)

External NMI signal


Pentium Processor (cont’d)

• Clock (CLK)

System clock signal

• Bus ready (BRDY#)

Used to extend the bus cycle

» Introduces wait states

• Bus request (BREQ)

Used in bus arbitration

• Backoff (BOFF#)

Aborts all pending bus cycles and floats the bus

Useful to resolve deadlock between two bus masters


Pentium Processor (cont’d)

• Bus hold (HOLD)

Completes outstanding bus cycles and floats bus

Asserts HLDA to give control of bus to another master

• Bus hold acknowledge (HLDA)

Indicates the Pentium has given control to another local master

Pentium continues execution from its internal caches

• Cache enable (KEN#)

If asserted, the current cycle is transformed into cache line fill


Pentium Processor (cont’d)

• Write-back/Write-through (WB/WT#)

Determines the cache write policy to be used

• Reset (RESET)

Resets the processor

Starts execution at FFFFFFF0H

Invalidates all internal caches

• Initialization (INIT)

Similar to RESET but internal caches and FP registers are not flushed

After powerup, use RESET (not INIT)


Pentium Registers

• Four 32-bit registers can be used as

Four 32-bit registers (EAX, EBX, ECX, EDX)

Four 16-bit registers (AX, BX, CX, DX)

Eight 8-bit registers (AH, AL, BH, BL, CH, CL, DH, DL)

• Some registers have special use

ECX for count in loop instructions
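The overlapping register names can be pictured as different bit fields of one 32-bit value. The Python sketch below (the helper name `register_views` is made up for illustration) shows how AX, AH, and AL relate to EAX; the same pattern holds for EBX, ECX, and EDX.

```python
def register_views(eax):
    """Split a 32-bit EAX value into its AX, AH and AL views."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF            # low 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # high byte of AX
        "AL": ax & 0xFF,         # low byte of AX
    }
```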


Pentium Registers (cont’d)

• Two index registers

16- or 32-bit registers

Used in string instructions

» Source (SI) and destination (DI)

Can be used as general-purpose data registers

• Two pointer registers

16- or 32-bit registers

Used exclusively to maintain the stack


Pentium Registers (cont’d)

• Control registers

(E)IP

» Program counter

(E)FLAGS

» Status flags

– Record status information about the result of the last arithmetic/logical instruction

» Direction flag

– Forward/backward direction for data copy

» System flags

– IF : interrupt enable

– TF : Trap flag (useful in single-stepping)


Pentium Registers (cont’d)

• Segment registers

Six 16-bit registers

Support segmented memory architecture

At any time, only six segments are accessible

Segments contain distinct contents

» Code

» Data

» Stack


Real Mode Architecture

• Pentium supports two modes

Real mode

» Uses 16-bit addresses

» Runs 8086 programs

» Pentium acts as a faster 8086

Protected mode

» 32-bit mode

» Native mode of Pentium

» Supports segmentation and paging


Real Mode Architecture (cont’d)

• Segmented organization

16-bit wide segments

Two components

» Base (16 bits)

» Offset (16 bits)

• Two-component specification is called logical address

Also called effective address

• 20-bit physical address

• 20-bit physical address


Real Mode Architecture (cont’d)

• Conversion from logical to physical addresses

11000 (append a hex 0 to the base 1100, i.e., multiply it by 16)

+ 450 (offset)

11450 (physical address)
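The conversion above amounts to shifting the 16-bit base left by four bits and adding the offset. A minimal Python sketch follows; the function name is illustrative, and wrapping the result to 20 bits is an assumption that matches the 20-bit address bus of the 8086.

```python
def real_mode_physical(segment, offset):
    """Translate a real-mode segment:offset pair to a 20-bit physical address."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # Shifting left by 4 bits appends one hex zero to the base.
    return ((segment << 4) + offset) & 0xFFFFF  # assumed 20-bit wraparound
```

With base 1100H and offset 450H this gives 11450H, as in the worked example.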


Real Mode Architecture (cont’d)

Two logical addresses map to the same physical address
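Because segments can start every 16 bytes and overlap, many base:offset pairs name the same byte. The following Python sketch checks this for a few pairs; the helper names and the sample addresses are chosen for illustration and are not taken from the original figure.

```python
def physical(segment, offset):
    """Real-mode translation: base * 16 + offset (20-bit result assumed)."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF

def same_location(a, b):
    """True when two (segment, offset) pairs refer to the same byte."""
    return physical(*a) == physical(*b)
```

For instance, 1100:0450, 1145:0000 and 1000:1450 all resolve to 11450H in this model.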


Real Mode Architecture (cont’d)

• Programs can access up to six segments at any time

• Two of these are for

Data

Code

• Another segment is typically used for

Stack

• Other segments can be used for data, code, ...


Protected Mode Architecture

• Supports sophisticated segmentation

• Segment unit translates 32-bit logical address to 32-bit linear address

• Paging unit translates 32-bit linear address to 32-bit physical address

If no paging is used

» Linear address = physical address


Protected Mode Architecture (cont’d)

Address translation


Protected Mode Architecture (cont’d)

• Index

Selects a descriptor from one of two descriptor tables

» Local

» Global

• Table Indicator (TI)

Selects the descriptor table to be used

» 0 = Global descriptor table

» 1 = Local descriptor table

• Requestor Privilege Level (RPL)

Privilege level to provide protected access to data

» Smaller the RPL, higher the privilege level
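The three selector fields can be pulled apart with shifts and masks. The sketch below assumes the standard IA-32 layout of a 16-bit selector (13-bit index in bits 15-3, TI in bit 2, RPL in bits 1-0); the bit positions come from that layout rather than from this slide, and the function name is illustrative.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into index, TI and RPL fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,      # bits 15-3: descriptor table entry
        "ti": (selector >> 2) & 1,   # bit 2: table indicator
        "rpl": selector & 0x3,       # bits 1-0: requestor privilege level
    }
```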


Protected Mode Architecture (cont’d)

Visible part

» Instructions to load segment selector

– mov, pop, lds, les, lss, lgs, lfs

Invisible part

» Automatically loaded from a descriptor table when the visible part is loaded


Protected Mode Architecture (cont’d)

Segment descriptor


Protected Mode Architecture (cont’d)

• Base address

32-bit segment starting address

• Granularity (G)

Indicates whether the segment size is in units of

» 0 = bytes, or

» 1 = 4 KB

• Segment Limit

20-bit value specifies the segment size

» G = 0: 1 byte to 1 MB

» G = 1: 4 KB to 4 GB, in increments of 4 KB
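The two size ranges follow directly from the 20-bit limit and the G bit. The Python sketch below assumes the size is (limit + 1) units, which reproduces the ranges listed on the slide; the function name is hypothetical.

```python
def segment_size(limit, g):
    """Segment size in bytes for a 20-bit limit and granularity bit g."""
    units = (limit & 0xFFFFF) + 1          # limit 0 still describes one unit
    return units * 4096 if g else units    # G = 1 counts in 4 KB units
```

The smallest and largest limits give 1 byte and 1 MB with G = 0, and 4 KB and 4 GB with G = 1.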


Protected Mode Architecture (cont’d)

• D/B bit

Code segment

» D bit: default size of operands and offset value

– D = 0: 16-bit values

– D = 1: 32-bit values

Data segment

» B bit: controls the size of the stack and stack pointer

– B = 0: SP is used with an upper bound of FFFFH

– B = 1: ESP is used with an upper bound of FFFFFFFFH

Cleared for real mode

Set for protected mode


Protected Mode Architecture (cont’d)

• S bit

Identifies whether the segment is a

» System segment, or

» Application segment

• Descriptor privilege level (DPL)

Defines segment privilege level

• Type

Identifies type of segment

» Data segment: read-only, read-write, …

» Code segment: execute-only, execute/read-only, …

• P bit

Indicates whether the segment is present


Protected Mode Architecture (cont’d)

• Three types of segment descriptor tables

Global descriptor table (GDT)

» Only one in the system

» Contains OS code and data

» Available to all tasks

Local descriptor table (LDT)

» Several LDTs

» Contains descriptors of a program

Interrupt descriptor table (IDT)

» Used in interrupt processing

» Details in Chapter 20


Protected Mode Architecture (cont’d)

• Segmentation Models

Pentium can turn off segmentation

Flat model

» Consists of one segment of 4 GB

» E.g., used by UNIX

Multisegment model

» Up to six active segments

» Can have more than six segments

– Descriptors must be in the descriptor table

» A segment becomes active by loading its descriptor into one of the segment registers


Mixed-Mode Operation

• Pentium allows mixed-mode operation

Possible to combine 16-bit and 32-bit operands and addresses

D/B bit indicates the default size

» 0 = 16-bit mode

» 1 = 32-bit mode

Pentium provides two override prefixes

» One for operands

» One for addresses

Details and examples in Chapter 11


Default Segments

• Pentium uses default segments depending on the purpose of the memory reference

Instruction fetch

» CS register

Stack operations

» SS register

» Offset: SP in 16-bit mode, ESP in 32-bit mode

Accessing data

» DS register

» Offset depends on the addressing mode


Overview of Assembly Language


Assembly Language Statements

• Three different classes

Instructions

» Tell CPU what to do

» Executable instructions with an op-code

Directives (or pseudo-ops)

» Provide information to assembler on various aspects of the assembly process

» Non-executable

– Do not generate machine language instructions

Macros

» A shorthand notation for a group of statements

» A sophisticated text substitution mechanism with parameters


Assembly Language Statements (cont’d)

• Assembly language statement format:

[label] mnemonic [operands] [;comment]

Typically one statement per line

Fields in [ ] are optional

label serves two distinct purposes:

» To label an instruction

– Can transfer program execution to the labeled instruction

» To label an identifier or constant

mnemonic identifies the operation (e.g., add, or)

operands specify the data required by the operation

» Executable instructions can have zero to three operands


Assembly Language Statements (cont’d)

comments

» Begin with a semicolon (;) and extend to the end of the line

Examples

repeat: inc result ; increment result
CR EQU 0DH ; carriage return character

• White space can be used to improve readability

repeat:
        inc     result


Data Allocation

• Variable declaration in a high-level language such as C

char response
int value
float total
double average_value

specifies

» Amount of storage required (1 byte, 2 bytes, …)

» Label to identify the storage allocated (response, value, …)

» Interpretation of the bits stored (signed, floating point, …)

– Bit pattern 1000 1101 1011 1001 is interpreted as

−29,255 as a signed number

36,281 as an unsigned number
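The two readings of the same bit pattern can be checked with a short Python sketch. It assumes a 16-bit two's-complement representation for the signed case; the function name is illustrative.

```python
def interpret_16(bits):
    """Return (signed, unsigned) readings of a 16-bit pattern given as text."""
    unsigned = int(bits.replace(" ", ""), 2)
    # In two's complement, a set top bit means the value is unsigned - 2**16.
    signed = unsigned - 0x10000 if unsigned & 0x8000 else unsigned
    return signed, unsigned
```

For the pattern 1000 1101 1011 1001 (8DB9H), the signed reading is negative because the most significant bit is 1.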


Data Allocation (cont’d)

• In assembly language, we use the define directive

Define directive can be used

» To reserve storage space

» To label the storage space

» To initialize

» But no interpretation is attached to the bits stored

– Interpretation is up to the program code

Define directive goes into the .DATA part of the assembly language program

• Define directive format

[var-name] D? init-value [,init-value],...


Data Allocation (cont’d)

• Five define directives

DB   Define Byte         ;allocates 1 byte
DW   Define Word         ;allocates 2 bytes
DD   Define Doubleword   ;allocates 4 bytes
DQ   Define Quadword     ;allocates 8 bytes
DT   Define Ten bytes    ;allocates 10 bytes

Examples

sorted    DB  ’y’
response  DB  ?         ;no initialization
value     DW  25159
float1    DD  1.234
float2    DQ  123.456


Data Allocation (cont’d)

• Multiple definitions can be abbreviated

Example

message  DB  ’B’
         DB  ’y’
         DB  ’e’
         DB  0DH
         DB  0AH

can be written as

message DB ’B’,’y’,’e’,0DH,0AH

• More compactly as

message  DB  ’Bye’,0DH,0AH


Data Allocation (cont’d)

• Multiple definitions can be cumbersome for initializing data structures such as arrays

Example

To declare and initialize an integer array of 8 elements

marks  DW  0,0,0,0,0,0,0,0

• What if we want to declare and initialize to zero an array of 200 elements?

There is a better way of doing this than repeating zero 200 times in the above statement

» Assembler provides a directive to do this (DUP directive)


Data Allocation (cont’d)

• Multiple initializations

The DUP assembler directive allows multiple initializations to the same value

Previous marks array can be compactly declared as

marks  DW  8 DUP (0)

Examples

table1   DW  10 DUP (?)       ;10 words, uninitialized
message  DB  3 DUP (’Bye!’)   ;12 bytes, initialized
                              ; as Bye!Bye!Bye!
Name1    DB  30 DUP (’?’)     ;30 bytes, each
                              ; initialized to ?


Data Allocation (cont’d)

• The DUP directive may also be nested

Example

stars  DB  4 DUP (3 DUP (’*’),2 DUP (’?’),5 DUP (’!’))

Reserves 40 bytes of space and initializes it as

***??!!!!!***??!!!!!***??!!!!!***??!!!!!

Example

matrix  DW  10 DUP (5 DUP (0))

defines a 10×5 matrix and initializes its elements to 0

This declaration can also be done by

matrix DW 50 DUP (0)
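The effect of nested DUP can be modelled as nested repetition. The Python sketch below is only an analogy for byte (DB) data and does not reproduce assembler syntax; the helper `dup` is hypothetical.

```python
def dup(count, *items):
    """Repeat the sequence of items `count` times, like `count DUP (items)`."""
    out = bytearray()
    for _ in range(count):
        for item in items:
            # An item is either an already expanded byte string or one byte value.
            out += item if isinstance(item, (bytes, bytearray)) else bytes([item])
    return bytes(out)

# Model of: stars DB 4 DUP (3 DUP ('*'), 2 DUP ('?'), 5 DUP ('!'))
stars = dup(4, dup(3, b"*"), dup(2, b"?"), dup(5, b"!"))
```

In this model `stars` holds 4 × (3 + 2 + 5) = 40 bytes, which matches the size stated on the slide.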


Data Allocation (cont’d)

Symbol Table

Assembler builds a symbol table so we can refer to the allocated storage space by the associated label

Example

.DATA                               name      offset
value    DW  0                      value     0
sum      DD  0                      sum       2
marks    DW  10 DUP (?)             marks     6
message  DB  ’The grade is:’,0      message   26
char1    DB  ?                      char1     40
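The offsets in the symbol table are running sums of the storage allocated so far. A small Python sketch of that bookkeeping is given below; the tuple format for the definitions is an assumption made for the example, and each entry lists the element count by hand (14 for message: 13 characters plus the terminating 0).

```python
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}  # bytes per element

def build_symbol_table(definitions):
    """definitions: list of (name, directive, element_count) tuples."""
    table, offset = {}, 0
    for name, directive, count in definitions:
        table[name] = offset                 # label refers to the current offset
        offset += SIZES[directive] * count   # advance past the allocated space
    return table

data_segment = [
    ("value", "DW", 1),
    ("sum", "DD", 1),
    ("marks", "DW", 10),
    ("message", "DB", 14),
    ("char1", "DB", 1),
]
```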


Data Allocation (cont’d)

Correspondence to C Data Types

Directive   C data type
DB          char
DW          int, unsigned
DD          float, long
DQ          double
DT          internal intermediate float value


Data Allocation (cont’d)

LABEL Directive

LABEL directive provides another way to name a memory location

Format:

name LABEL type

type can be

BYTE    1 byte
WORD    2 bytes
DWORD   4 bytes
QWORD   8 bytes
TWORD   10 bytes


Data Allocation (cont’d)

LABEL Directive

Example

.DATA
count     LABEL  WORD
Lo_count  DB  0
Hi_count  DB  0
.CODE
...
mov  Lo_count,AL
mov  Hi_count,CL

count refers to the 16-bit value

Lo_count refers to the low byte

Hi_count refers to the high byte
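The aliasing set up by LABEL can be imitated with a two-byte buffer in Python. This is a rough model rather than assembler behaviour: the offsets and helper names are invented for the example, and the little-endian byte order is the one used by the Pentium.

```python
memory = bytearray(2)                  # the two bytes reserved by the DB lines
COUNT, LO_COUNT, HI_COUNT = 0, 0, 1    # count shares its offset with Lo_count

def store_bytes(al, cl):
    """Model of: mov Lo_count,AL / mov Hi_count,CL."""
    memory[LO_COUNT] = al & 0xFF
    memory[HI_COUNT] = cl & 0xFF

def read_count():
    """Read the 16-bit word at count (low byte stored first)."""
    return int.from_bytes(memory[COUNT:COUNT + 2], "little")
```

Storing 34H in the low byte and 12H in the high byte makes the word at count read as 1234H.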


Where Are the Operands?

• Operands required by an operation can be specified in a variety of ways

• A few basic ways are:

operand in a register

– register addressing mode

operand in the instruction itself

– immediate addressing mode

operand in memory

– variety of addressing modes: direct and indirect addressing modes

operand at an I/O port

Where Are the Operands? (cont’d)

Register addressing mode
• Operand is in an internal register
• Examples
    mov EAX,EBX ; 32-bit copy
    mov BX,CX   ; 16-bit copy
    mov AL,CL   ; 8-bit copy
• The mov instruction
    mov destination,source
  copies data from source to destination

Where Are the Operands? (cont’d)

Register addressing mode (cont’d)
• Most efficient way of specifying an operand
    » No memory access is required
• Instructions using this mode tend to be shorter
    » Fewer bits are needed to specify the register
• Compilers use this mode to optimize code
    total := 0
    for (i = 1 to 400)
        total = total + marks[i]
    end for
    » Mapping total and i to registers during the for loop optimizes the code

Where Are the Operands? (cont’d)

Immediate addressing mode
• Data is part of the instruction
    » Operand is located in the code segment along with the instruction
    » Efficient as no separate operand fetch is needed
    » Typically used to specify a constant
• Example
    mov AL,75
    » This instruction uses register addressing mode for the destination and immediate addressing mode for the source

Where Are the Operands? (cont’d)

Direct addressing mode
• Data is in the data segment
    » Need a logical address to access data
        – Two components: segment:offset
    » Various addressing modes to specify the offset component
        – offset part is called effective address
• The offset is specified directly as part of the instruction
• We write assembly language programs using memory labels (e.g., declared using DB, DW, LABEL, ...)
    » Assembler computes the offset value for the label
        – Uses symbol table to compute the offset of a label

Where Are the Operands? (cont’d)

Direct addressing mode (cont’d)
• Examples
    mov AL,response
    » Assembler replaces response by its effective address (i.e., its offset value from the symbol table)
    mov table1,56
    » table1 is declared as
        table1 DW 20 DUP (0)
    » Since the assembler replaces table1 by its effective address, this instruction refers to the first element of table1
        – In C, it is equivalent to table1[0] = 56

Where Are the Operands? (cont’d)

Direct addressing mode (cont’d)
• Problem with direct addressing
    » Useful only to specify simple variables
    » Causes serious problems in addressing data types such as arrays
        – As an example, consider adding elements of an array
        – Direct addressing does not facilitate using a loop structure to iterate through the array
        – We have to write an instruction to add each element of the array
• Indirect addressing mode remedies this problem

Where Are the Operands? (cont’d)

Indirect addressing mode
• The offset is specified indirectly via a register
    » Sometimes called register indirect addressing mode
    » For 16-bit addressing, the offset value can be in one of the three registers: BX, SI, or DI
    » For 32-bit addressing, all 32-bit registers can be used
• Example
    mov AX,[BX]
    » Square brackets [ ] are used to indicate that BX is holding an offset value
    » BX contains a pointer to the operand, not the operand itself

Where Are the Operands? (cont’d)

• Using indirect addressing mode, we can process arrays using loops

• Example: Summing array elements
    » Load the starting address (i.e., offset) of the array into BX
    » Loop for each element in the array
        – Get the value using the offset in BX (use indirect addressing)
        – Add the value to the running total
        – Update the offset in BX to point to the next element of the array
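The summing steps above correspond closely to a pointer-based loop in C. The following is a minimal sketch, not code from the text: the function name sum_words is hypothetical, the pointer p plays the role of BX, and the elements are assumed to be 16-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum n 16-bit elements starting at table; p stands in for BX. */
int32_t sum_words(const int16_t *table, size_t n)
{
    const int16_t *p = table;   /* load the starting address */
    int32_t total = 0;

    for (size_t i = 0; i < n; i++) {
        total += *p;            /* get the value through the pointer */
        p++;                    /* point to the next element (2 bytes on) */
    }
    return total;
}
```

Note that p++ advances by one element, whereas in assembly the offset in BX must be advanced by the element size in bytes (2 for a word array).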

Where Are the Operands? (cont’d)

Loading offset value into a register
• Suppose we want to load BX with the offset value of table1
• We cannot write
    mov BX,table1
    » This uses direct addressing, so it loads the contents of table1 rather than its offset
• Two ways of loading offset value
    » Using OFFSET assembler directive
        – Executed only at the assembly time
    » Using lea instruction
        – This is a processor instruction
        – Executed at run time

Where Are the Operands? (cont’d)

Loading offset value into a register (cont’d)
• Using OFFSET assembler directive
    » The previous example can be written as
        mov BX,OFFSET table1
• Using lea (load effective address) instruction
    » The format of lea instruction is
        lea register,source
    » The previous example can be written as
        lea BX,table1

Where Are the Operands? (cont’d)

Loading offset value into a register (cont’d)
• Which one to use -- OFFSET or lea?
    » Use OFFSET if possible
        – OFFSET incurs only one-time overhead (at assembly time)
        – lea incurs run time overhead (every time you run the program)
    » May have to use lea in some instances
        – When the needed data is available at run time only (e.g., an index passed as a parameter to a procedure)
        – We can write
            lea BX,table1[SI]
          to load BX with the address of an element of table1 whose index is in SI register
        – We cannot use the OFFSET directive in this case
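What lea BX,table1[SI] computes can be sketched in C as plain 16-bit address arithmetic. The function names here are hypothetical. The sketch assumes 16-bit addressing, in which SI is added as a byte displacement and is not scaled by the element size, so an element index into a DW table has to be doubled first.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of table1[SI]: table offset plus SI, modulo 64 KB. */
uint16_t lea_indexed(uint16_t table_offset, uint16_t si)
{
    return (uint16_t)(table_offset + si);
}

/* Byte displacement of element `index` in a DW (2-byte) table. */
uint16_t word_index_to_si(uint16_t index)
{
    return (uint16_t)(index * 2u);
}
```

For a table at offset 0100H, element 3 is at lea_indexed(0x0100, word_index_to_si(3)), which is 0106H. No memory is read: lea only produces the address.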

Default Segments

• In register indirect addressing mode
    » 16-bit addresses
        – The effective address in BX, SI, or DI is taken as the offset into the data segment (relative to DS)
        – For BP and SP registers, the offset is taken to refer to the stack segment (relative to SS)
    » 32-bit addresses
        – The effective address in EAX, EBX, ECX, EDX, ESI, or EDI is relative to DS
        – The effective address in EBP or ESP is relative to SS
    » push and pop are always relative to SS

Default Segments (cont’d)

• Default segment override
    » Possible to override the defaults by using override prefixes
        – CS, DS, SS, ES, FS, GS
    » Example 1
        – We can use
            add AX,SS:[BX]
          to refer to a data item on the stack
    » Example 2
        – We can use
            add AX,DS:[BP]
          to refer to a data item in the data segment

Data Transfer Instructions

• We will look at three instructions
    » mov (move)
        – Actually copy
    » xchg (exchange)
        – Exchanges two operands
    » xlat (translate)
        – Translates byte values using a translation table
• Other data transfer instructions include
    » movsx (move sign extended)
    » movzx (move zero extended)

Data Transfer Instructions (cont’d)

The mov instruction
• The format is
    mov destination,source
    » Copies the value from source to destination
    » source is not altered as a result of copying
    » Both operands should be of the same size
    » source and destination cannot both be in memory
        – Most Pentium instructions do not allow both operands to be located in memory
        – Pentium provides special instructions to facilitate memory-to-memory block copying of data

Data Transfer Instructions (cont’d)

The mov instruction
• Five types of operand combinations are allowed:
    Instruction type          Example
    mov register,register     mov DX,CX
    mov register,immediate    mov BL,100
    mov register,memory       mov BX,count
    mov memory,register       mov count,SI
    mov memory,immediate      mov count,23
• The operand combinations are valid for all instructions that require two operands

Data Transfer Instructions (cont’d)

Ambiguous moves: PTR directive
• For the following data definitions
    .DATA
    table1 DW 20 DUP (0)
    status DB 7 DUP (1)
  the last two mov instructions are ambiguous
    mov BX,OFFSET table1
    mov SI,OFFSET status
    mov [BX],100
    mov [SI],100
    » Not clear whether the assembler should use byte or word equivalent of 100

Data Transfer Instructions (cont’d)

Ambiguous moves: PTR directive
• The PTR assembler directive can be used to clarify
• The last two mov instructions can be written as
    mov WORD PTR [BX],100
    mov BYTE PTR [SI],100
    » WORD and BYTE are called type specifiers
• We can also use the following type specifiers:
    » DWORD for doubleword values
    » QWORD for quadword values
    » TWORD for ten byte values
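Why the type specifier matters can be seen by modeling the two stores in C. This is a sketch with hypothetical function names; it assumes little-endian byte order, as on the Pentium. The same constant 100 changes one byte in the BYTE PTR case and two bytes in the WORD PTR case.

```c
#include <assert.h>
#include <stdint.h>

/* mov BYTE PTR [p],value : writes exactly one byte. */
void store_byte(uint8_t *p, uint8_t value)
{
    p[0] = value;
}

/* mov WORD PTR [p],value : writes two bytes, low byte first. */
void store_word(uint8_t *p, uint16_t value)
{
    p[0] = (uint8_t)(value & 0xFF);
    p[1] = (uint8_t)(value >> 8);
}
```

With memory pre-filled with AAH, store_byte leaves the second byte untouched while store_word overwrites it with 0, which is the ambiguity the assembler cannot resolve on its own.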

Data Transfer Instructions (cont’d)

The xchg instruction
• The syntax is
    xchg operand1,operand2
    » Exchanges the values of operand1 and operand2
• Examples
    xchg EAX,EDX
    xchg response,CL
    xchg total,DX
• Without the xchg instruction, we need a temporary register to exchange values using only the mov instruction

Data Transfer Instructions (cont’d)

The xchg instruction
• The xchg instruction is useful for conversion of 16-bit data between little endian and big endian forms
    » Example:
        xchg AL,AH
      converts the data in AX into the other endian form
• Pentium provides bswap instruction to do similar conversion on 32-bit data
    bswap 32-bit register
    » bswap works only on data located in a 32-bit register
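The effect of the two byte-swapping instructions can be written out in C. This is a minimal sketch with hypothetical function names: swap16 models exchanging the two halves of AX, and swap32 models bswap on a 32-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Exchange the low and high bytes of a 16-bit value (like AL and AH). */
uint16_t swap16(uint16_t ax)
{
    return (uint16_t)((ax << 8) | (ax >> 8));
}

/* Reverse the order of the four bytes of a 32-bit value (like bswap). */
uint32_t swap32(uint32_t eax)
{
    return (eax << 24)
         | ((eax << 8) & 0x00FF0000u)
         | ((eax >> 8) & 0x0000FF00u)
         | (eax >> 24);
}
```

Applying either function twice returns the original value, so the same operation converts in both directions.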

Data Transfer Instructions (cont’d)

The xlat instruction
• The xlat instruction translates bytes
• The format is
    xlatb
• To use xlat instruction
    » BX should be loaded with the starting address of the translation table
    » AL must contain an index into the table
        – Index value starts at zero
    » The instruction reads the byte at this index in the translation table and stores this value in AL
        – The index value in AL is lost
    » Translation table can have at most 256 entries (due to AL)

Data Transfer Instructions (cont’d)

The xlat instruction
• Example: Encrypting digits
    Input digits:     0 1 2 3 4 5 6 7 8 9
    Encrypted digits: 4 6 9 5 0 3 1 8 7 2

    .DATA
    xlat_table  DB  '4695031872'
    ...
    .CODE
    mov    BX,OFFSET xlat_table
    GetCh  AL
    sub    AL,'0'   ; converts input character to index
    xlatb           ; AL = encrypted digit character
    PutCh  AL
    ...
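The same table lookup can be sketched in C, which makes the role of the index explicit. The function name encrypt_digit is hypothetical, and the range check is an addition for safety in C; the assembly fragment performs no such check.

```c
#include <assert.h>

static const char xlat_table[] = "4695031872";

/* Return the encrypted digit character; non-digits are returned unchanged
   (this guard is an assumption, not part of the assembly example). */
char encrypt_digit(char ch)
{
    if (ch < '0' || ch > '9')
        return ch;
    unsigned index = (unsigned)(ch - '0');   /* sub AL,'0' */
    return xlat_table[index];                /* xlatb */
}
```

For example, the input character '3' gives index 3, and the table entry at index 3 is '5'.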

Pentium Assembly Instructions

• Pentium provides several types of instructions
• Brief overview of some basic instructions:
    » Arithmetic instructions
    » Jump instructions
    » Loop instruction
    » Logical instructions
    » Shift instructions
    » Rotate instructions
• These instructions allow you to write reasonable assembly language programs

Arithmetic Instructions

INC and DEC instructions
• Format:
    inc destination
    dec destination
• Semantics:
    destination = destination +/- 1
    » destination can be 8-, 16-, or 32-bit operand, in memory or register
    » No immediate operand
• Examples
    inc BX
    dec value

Arithmetic Instructions (cont’d)

Add instructions
• Format:
    add destination,source
• Semantics:
    destination = destination + source
• Examples
    add EBX,EAX
    add value,35
    » inc EAX is better than add EAX,1
        – inc takes less space
        – Both execute at about the same speed

Arithmetic Instructions (cont’d)

Add instructions
• Addition with carry
• Format:
    adc destination,source
• Semantics:
    destination = destination + source + CF
• Example: 64-bit addition
    add EAX,ECX ; add lower 32 bits
    adc EBX,EDX ; add upper 32 bits with carry
    » 64-bit result in EBX:EAX
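The add/adc pair can be modeled in C with two 32-bit halves. This is a sketch under the assumption that the carry flag is simply the carry out of the low-half addition; the type u64_pair and function add64 are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t hi, lo; } u64_pair;   /* hi:lo, like EBX:EAX */

u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;              /* add EAX,ECX (wraps modulo 2^32) */
    uint32_t cf = (r.lo < a.lo);     /* carry out of the low half */
    r.hi = a.hi + b.hi + cf;         /* adc EBX,EDX */
    return r;
}
```

Adding 00000001:FFFFFFFFH and 00000002:00000001H produces a carry out of the low half, giving 00000004:00000000H.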

Arithmetic Instructions (cont’d)

Subtract instructions
• Format:
    sub destination,source
• Semantics:
    destination = destination - source
• Examples
    sub EBX,EAX
    sub value,35
    » dec EAX is better than sub EAX,1
        – dec takes less space
        – Both execute at about the same speed

Arithmetic Instructions (cont’d)

Subtract instructions
• Subtract with borrow
    » Format:
        sbb destination,source
    » Semantics:
        destination = destination - source - CF
    » Like the adc, sbb is useful in dealing with more than 32-bit numbers
• Negation
        neg destination
    » Semantics:
        destination = 0 - destination

Arithmetic Instructions (cont’d)

CMP instruction
• Format:
    cmp destination,source
• Semantics:
    destination - source
    » destination and source are not altered
    » Useful to test relationship (>, =) between two operands
    » Used in conjunction with conditional jump instructions for decision making purposes
• Examples
    cmp EBX,EAX
    cmp count,100

Unconditional Jump

• Format:
    jmp label
• Semantics:
    » Execution is transferred to the instruction identified by label
• Target can be specified in one of two ways
    » Directly
        – In the instruction itself
    » Indirectly
        – Through a register or memory

Unconditional Jump (cont’d)

Example
        . . .
        mov CX,10
        jmp CX_init_done
    init_CX_20:
        mov CX,20
    CX_init_done:
        mov AX,CX
    repeat1:
        dec CX
        . . .
        jmp repeat1
        . . .
• Two jump instructions
    » Forward jump
        jmp CX_init_done
    » Backward jump
        jmp repeat1
• Programmer specifies target by a label
• Assembler computes the offset using the symbol table

Unconditional Jump (cont’d)

• Address specified in the jump instruction is not the absolute address
    » Uses relative address
        – Specifies relative byte displacement between the target instruction and the instruction following the jump instruction
        – Displacement is w.r.t. the instruction following jmp
        – Reason: IP points to this instruction after reading jump
    » Execution of jmp involves adding the displacement value to current IP
    » Displacement is a signed 16-bit number
        – Negative value for backward jumps
        – Positive value for forward jumps

Target Location

• Inter-segment jump
    » Target is in another segment
        CS = target-segment (2 bytes)
        IP = target-offset (2 bytes)
        – Called far jumps (needs five bytes to encode jmp)
• Intra-segment jumps
    » Target is in the same segment
        IP = IP + relative-displacement (1 or 2 bytes)
    » Uses 1-byte displacement if target is within -128 to +127
        – Called short jumps (needs two bytes to encode jmp)
    » If target is outside this range, uses 2-byte displacement
        – Called near jumps (needs three bytes to encode jmp)

Target Location (cont’d)

• In most cases, the assembler can figure out the type of jump
    » For backward jumps, assembler can decide whether to use the short jump form or not
• For forward jumps, it needs a hint from the programmer
    » Use SHORT prefix to the target label
    » If such a hint is not given
        – Assembler reserves three bytes for jmp instruction
        – If short jump can be used, leaves one byte of nop (no operation)
        – See the next example for details

Example

        . . .
0005  EB 0C        jmp SHORT CX_init_done   ; 0013 - 0007 = 0C
0007  B9 000A      mov CX,10
000A  EB 07 90     jmp CX_init_done         ; 90 is the nop filler; 0013 - 000C = 07
               init_CX_20:
000D  B9 0014      mov CX,20
0010  E9 00D0      jmp near_jump            ; 00E3 - 0013 = D0
               CX_init_done:
0013  8B C1        mov AX,CX

Example (cont’d)

               repeat1:
0015  49           dec CX
0016  EB FD        jmp repeat1              ; 0015 - 0018 = -3 = FDH
        . . .
00DB  EB 03        jmp SHORT short_jump     ; 00E0 - 00DD = 3
00DD  B9 FF00      mov CX,0FF00H
               short_jump:
00E0  BA 0020      mov DX,20H
               near_jump:
00E3  E9 FF27      jmp init_CX_20           ; 000D - 00E6 = -217 = FF27H
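The displacement arithmetic in this listing can be checked with a few lines of C. This is only a sketch: the function names are hypothetical, and the instruction length (2 bytes for a short jump, 3 for a near jump) is passed in explicitly rather than decoded from the opcode.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement stored in a relative jmp: target minus the address of the
   instruction that follows the jmp (jmp_addr + jmp_len). */
int16_t jump_displacement(uint16_t jmp_addr, uint16_t jmp_len, uint16_t target)
{
    return (int16_t)(uint16_t)(target - (jmp_addr + jmp_len));
}

/* A short jump is possible when the displacement fits in one signed byte. */
int fits_short(int16_t disp)
{
    return disp >= -128 && disp <= 127;
}
```

For the jump at 0016H, jump_displacement(0x0016, 2, 0x0015) is -3, matching the encoded byte FDH. For the jump at 0010H the displacement to 00E3H is D0H (208), which does not fit in a signed byte, so the near form is required.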

Conditional Jumps (cont’d)

• Format:
    j<cond> label
    – Execution is transferred to the instruction identified by label only if <cond> is met
• Example: Testing for carriage return
    read_char:
        . . .
        cmp AL,0DH ; 0DH = ASCII carriage return
        je CR_received
        inc CL
        jmp read_char
        . . .
    CR_received:

Conditional Jumps (cont’d)

• Some conditional jump instructions
    – Treat operands of the CMP instruction as signed numbers
    je     jump if equal
    jg     jump if greater
    jl     jump if less
    jge    jump if greater or equal
    jle    jump if less or equal
    jne    jump if not equal
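Because these jumps interpret the compared operands as signed numbers, the same bit patterns can order differently than they would as unsigned values. The C sketch below illustrates this for 8-bit operands; the function names are hypothetical, and the cast to int8_t assumes the usual two's complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Would jg be taken after cmp AL,BL? The bytes are read as signed values. */
int jg_taken(uint8_t al, uint8_t bl)
{
    return (int8_t)al > (int8_t)bl;
}

/* For contrast only: the same bit patterns compared as unsigned values. */
int unsigned_greater(uint8_t al, uint8_t bl)
{
    return al > bl;
}
```

With AL = FFH and BL = 01H, jg is not taken, since FFH is -1 when read as a signed byte, even though 255 is greater than 1 as an unsigned value.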

Conditional Jumps (cont’d)

• Conditional jump instructions can also test values of the individual flags
    jz     jump if zero (i.e., if ZF = 1)
    jnz    jump if not zero (i.e., if ZF = 0)
    jc     jump if carry (i.e., if CF = 1)
    jnc    jump if not carry (i.e., if CF = 0)
    » jz is a synonym for je
    » jnz is a synonym for jne

A Note on Conditional Jumps

• All conditional jumps are encoded using 2 bytes
    » Treated as short jumps
• What if the target is outside this range?
    target:
        . . .
        cmp AX,BX
        je target
        mov CX,10
        . . .
    » target is out of range for a short jump
• Use this code to get around
    target:
        . . .
        cmp AX,BX
        jne skip1
        jmp target
    skip1:
        mov CX,10
        . . .

Loop Instructions

Unconditional loop instruction
• Format:
    loop target
• Semantics:
    » Decrements CX and jumps to target if CX ≠ 0
        – CX should be loaded with a loop count value
• Example: Executes loop body 50 times
        mov CX,50
    repeat:
        <loop body>
        loop repeat
        ...
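The semantics of loop map onto a do-while in C, since the body runs before CX is decremented and tested. The sketch below, with the hypothetical name loop_iterations, counts how many times the body would run for a given initial CX.

```c
#include <assert.h>
#include <stdint.h>

/* Count how many times the body of a loop-terminated block runs. */
uint32_t loop_iterations(uint16_t cx)
{
    uint32_t body_runs = 0;
    do {
        body_runs++;        /* <loop body> */
        cx--;               /* loop: decrement CX ... */
    } while (cx != 0);      /* ... and jump back while CX is not zero */
    return body_runs;
}
```

One consequence worth noting: starting with CX = 0 does not skip the loop. The decrement wraps CX to FFFFH, so the body runs 65,536 times.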

Loop Instructions (cont’d)

• The previous example is equivalent to
        mov CX,50
    repeat:
        <loop body>
        dec CX
        jnz repeat
        ...
• Surprisingly,
        dec CX
        jnz repeat
  executes faster than
        loop repeat

Loop Instructions (cont’d)

• Conditional loop instructions
    » loope/loopz
        – Loop while equal/zero
            CX = CX - 1
            if (CX ≠ 0 and ZF = 1)
                jump to target
    » loopne/loopnz
        – Loop while not equal/not zero
            CX = CX - 1
            if (CX ≠ 0 and ZF = 0)
                jump to target

Logical Instructions

• Format:
    and destination,source
    or  destination,source
    xor destination,source
    not destination
• Semantics:
    » Performs the standard bitwise logical operations
        – result goes to destination
• test is a non-destructive and instruction
    test destination,source
    » Performs logical AND but the result is not stored in destination (like the CMP instruction)

Logical Instructions (cont’d)

• Example:
        . . .
        and AL,01H ; test the least significant bit
        jz bit_is_zero
        <bit 1 code>
        jmp skip1
    bit_is_zero:
        <bit 0 code>
    skip1:
        . . .
• The test instruction is a better choice than and here, since it sets the flags without modifying AL

Shift Instructions

• Two types of shifts
    » Logical
    » Arithmetic
• Logical shift instructions
    » Shift left
        shl destination,count
        shl destination,CL
    » Shift right
        shr destination,count
        shr destination,CL
    » Semantics:
        – Performs left/right shift of destination by the value in count or CL register
        – CL register contents are not altered

Shift Instructions (cont’d)

• Logical shift
    » Bit shifted out goes into the carry flag
    » Zero bit is shifted in at the other end

Shift Instructions (cont’d)

• count is an immediate value
    shl AX,5
    » Specification of count greater than 31 is not allowed
        – If a greater value is specified, only the least significant 5 bits are used
• CL version is useful if shift count is known at run time
    » Ex: when the shift count value is passed as a parameter in a procedure call
    » Only the CL register can be used
    » Shift count value should be loaded into CL
        mov CL,5
        shl AX,CL
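A bit-at-a-time model in C shows the two rules stated here: the count is reduced to its least significant 5 bits, and each bit shifted out passes through the carry flag. The function shl16 is a hypothetical name, and the sketch makes no claim about the flag value the processor reports when the count is as large as the operand size.

```c
#include <assert.h>
#include <stdint.h>

/* Logical left shift of a 16-bit value. *cf receives the last bit shifted
   out and is left unchanged when the masked count is 0. */
uint16_t shl16(uint16_t ax, uint8_t count, int *cf)
{
    count &= 31;                        /* only the low 5 bits are used */
    for (uint8_t i = 0; i < count; i++) {
        *cf = (ax >> 15) & 1;           /* bit leaving the top goes to CF */
        ax = (uint16_t)(ax << 1);       /* a zero enters at the bottom */
    }
    return ax;
}
```

Shifting 0001H left by 5 gives 0020H. A count of 32 is masked to 0, so the value is returned unchanged.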

Shift Instructions (cont’d)

• Arithmetic shift
    » Two versions as in logical shift
        sal/sar destination,count
        sal/sar destination,CL

Double Shift Instructions

• Double shift instructions work on either 32- or 64-bit operands

• Format
  Takes three operands
        shld dest,src,count ; left shift
        shrd dest,src,count ; right shift
  dest can be in memory or register
  src must be a register
  count can be an immediate value or in CL, as in other shift instructions

Page 107: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Double Shift Instructions (cont’d)

src is not modified by a double shift instruction
Only dest is modified
Shifted out bit goes into the carry flag
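The double shift behaviour can be sketched in Python as follows. The function names and the `width` parameter are hypothetical; the sketch assumes 32-bit operands by default and returns only the new dest value and the carry flag. The `src` variable is changed inside the loop, but that is a local copy: the real src operand is left untouched.

```python
def shld(dest, src, count, width=32):
    """Model of shld: dest shifts left, vacated bits come from the top of src."""
    mask = (1 << width) - 1
    count &= 31
    dest &= mask
    src &= mask
    carry = 0
    for _ in range(count):
        carry = (dest >> (width - 1)) & 1                     # bit leaving dest
        dest = ((dest << 1) & mask) | (src >> (width - 1))    # top bit of src enters dest
        src = (src << 1) & mask                               # local copy only
    return dest, carry


def shrd(dest, src, count, width=32):
    """Model of shrd: dest shifts right, vacated bits come from the bottom of src."""
    mask = (1 << width) - 1
    count &= 31
    dest &= mask
    src &= mask
    carry = 0
    for _ in range(count):
        carry = dest & 1                                      # bit leaving dest
        dest = (dest >> 1) | ((src & 1) << (width - 1))       # low bit of src enters dest
        src >>= 1                                             # local copy only
    return dest, carry


print(hex(shld(0x00000001, 0x80000000, 4)[0]))
print(hex(shrd(0x00000010, 0x00000003, 4)[0]))
```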

Page 108: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Rotate Instructions

Two types of ROTATE instructions
  Rotate without carry
    » rol (ROtate Left)
    » ror (ROtate Right)
  Rotate with carry
    » rcl (Rotate through Carry Left)
    » rcr (Rotate through Carry Right)
Format of ROTATE instructions is similar to the SHIFT instructions
    » Supports two versions
      – Immediate count value
      – Count value in CL register

Page 109: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Rotate Instructions (cont’d)

Bit shifted out goes into the carry flag as in SHIFT instructions

Page 110: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Rotate Instructions (cont’d)

Bit shifted out goes into the carry flag as in SHIFT instructions
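The difference between the two rotate families can be sketched in Python. The function names and the `width` parameter are hypothetical. In `rol`/`ror` the bit that leaves one end re-enters at the other end and is also copied to the carry flag; in `rcl`/`rcr` the carry flag acts as one extra bit in the rotation, so the incoming carry value is an input.

```python
def rol(value, count, width=16):
    """Rotate left without carry; returns (result, carry_flag)."""
    mask = (1 << width) - 1
    value &= mask
    carry = 0
    for _ in range(count & 31):
        carry = (value >> (width - 1)) & 1
        value = ((value << 1) & mask) | carry          # leaving bit re-enters on the right
    return value, carry


def ror(value, count, width=16):
    """Rotate right without carry; returns (result, carry_flag)."""
    mask = (1 << width) - 1
    value &= mask
    carry = 0
    for _ in range(count & 31):
        carry = value & 1
        value = (value >> 1) | (carry << (width - 1))  # leaving bit re-enters on the left
    return value, carry


def rcl(value, carry, count, width=16):
    """Rotate left through carry; returns (result, carry_flag)."""
    mask = (1 << width) - 1
    value &= mask
    for _ in range(count & 31):
        out = (value >> (width - 1)) & 1
        value = ((value << 1) & mask) | carry          # old carry enters on the right
        carry = out
    return value, carry


def rcr(value, carry, count, width=16):
    """Rotate right through carry; returns (result, carry_flag)."""
    mask = (1 << width) - 1
    value &= mask
    for _ in range(count & 31):
        out = value & 1
        value = (value >> 1) | (carry << (width - 1))  # old carry enters on the left
        carry = out
    return value, carry


print(rol(0x8001, 1))      # top bit wraps around to bit 0
print(rcl(0x8000, 0, 1))   # top bit moves into the carry flag instead
```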

Page 111: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Rotate Instructions (cont’d)

• Example: Shifting 64-bit numbers
  Multiplies a 64-bit value in EDX:EAX by 16
    » Rotate version
            mov CX,4
        shift_left:
            shl EAX,1
            rcl EDX,1
            loop shift_left
    » Double shift version
            shld EDX,EAX,4
            shl EAX,4
• Division can be done in a similar way
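To check that the two versions agree, both can be traced in Python on a sample value. This is a sketch under stated assumptions: the function names and the sample value are made up for illustration, and only the EDX and EAX contents are modelled.

```python
MASK32 = 0xFFFFFFFF


def rotate_version(edx, eax):
    """Four rounds of: shl EAX,1 / rcl EDX,1."""
    for _ in range(4):
        carry = (eax >> 31) & 1                  # shl EAX,1: top bit goes to the carry flag
        eax = (eax << 1) & MASK32
        edx = ((edx << 1) & MASK32) | carry      # rcl EDX,1: carry enters bit 0 of EDX
    return edx, eax


def double_shift_version(edx, eax):
    """shld EDX,EAX,4 followed by shl EAX,4."""
    edx = ((edx << 4) & MASK32) | (eax >> 28)    # top four bits of EAX fill the low end of EDX
    eax = (eax << 4) & MASK32
    return edx, eax


value = 0x0000000FF0000001                       # sample 64-bit value held in EDX:EAX
edx, eax = value >> 32, value & MASK32
expected = (value * 16) & 0xFFFFFFFFFFFFFFFF

for version in (rotate_version, double_shift_version):
    new_edx, new_eax = version(edx, eax)
    print(version.__name__, hex((new_edx << 32) | new_eax), hex(expected))
```

Both lines print the same pair of values, since each version moves the four bits that leave EAX into the low end of EDX.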

Page 112: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Defining Constants

• Assembler provides two directives:
    » EQU directive
      – No reassignment
      – String constants can be defined
    » = directive
      – Can be reassigned
      – No string constants
• Defining constants has two advantages:
  Improves program readability
  Helps in software maintenance
    » Multiple occurrences can be changed from a single place
• Convention
    » We use all upper-case letters for names of constants

Page 113: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Defining Constants (cont’d)

The EQU directive
• Syntax:
        name EQU expression
  Assigns the result of expression to name
  The expression is evaluated at assembly time
    » Similar to #define in C
  Examples
        NUM_OF_ROWS EQU 50
        NUM_OF_COLS EQU 10
        ARRAY_SIZE  EQU NUM_OF_ROWS * NUM_OF_COLS
  Can also be used to define string constants
        JUMP EQU jmp

Page 114: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Defining Constants (cont’d)

The = directive
• Syntax:
        name = expression
  Similar to EQU directive
  Two key differences:
    » Redefinition is allowed
            count = 0
            . . .
            count = 99
      is valid
    » Cannot be used to define string constants or to redefine keywords or instruction mnemonics
      Example:
            JUMP = jmp
      is not allowed

Page 115: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Macros

• Macros can be defined with MACRO and ENDM
• Format
        macro_name MACRO [parameter1, parameter2,...]
            macro body
        ENDM
• A macro can be invoked using
        macro_name [argument1, argument2, …]
  Example:
    Definition
        multAX_by_16 MACRO
            sal AX,4
        ENDM
    Invocation
            ...
            mov AX,27
            multAX_by_16
            ...

Page 116: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Macros (cont’d)

• Macros can be defined with parameters
    » More flexible
    » More useful
• Example
        mult_by_16 MACRO operand
            sal operand,4
        ENDM
  To multiply a byte in DL register
        mult_by_16 DL
  To multiply a memory variable count
        mult_by_16 count

Page 117: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Macros (cont’d)

Example: To exchange two memory words
  Memory-to-memory transfer
        Wmxchg MACRO operand1, operand2
            xchg AX,operand1
            xchg AX,operand2
            xchg AX,operand1
        ENDM
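It may not be obvious why three xchg instructions are needed. Tracing the macro body in Python shows that the two memory words are exchanged and that AX ends up with its original value. The dictionary standing in for memory, the variable names, and the sample values are all hypothetical.

```python
def wmxchg(ax, mem, operand1, operand2):
    """Trace the three xchg instructions of Wmxchg; returns the final AX value."""
    ax, mem[operand1] = mem[operand1], ax    # xchg AX,operand1
    ax, mem[operand2] = mem[operand2], ax    # xchg AX,operand2
    ax, mem[operand1] = mem[operand1], ax    # xchg AX,operand1
    return ax


memory = {"value1": 0x1111, "value2": 0x2222}
ax = wmxchg(0xABCD, memory, "value1", "value2")
print({name: hex(word) for name, word in memory.items()}, hex(ax))
```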

Page 118: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Illustrative Examples

• Five examples in this chapter
  Conversion of ASCII to binary representation (BINCHAR.ASM)
  Conversion of ASCII to hexadecimal by character manipulation (HEX1CHAR.ASM)
  Conversion of ASCII to hexadecimal using the XLAT instruction (HEX2CHAR.ASM)
  Conversion of lowercase letters to uppercase by character manipulation (TOUPPER.ASM)
  Sum of individual digits of a number (ADDIGITS.ASM)

Page 119: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Procedures and the Stack

Page 120: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

What is a Stack?

• Stack is a last-in-first-out (LIFO) data structure
  If we view the stack as a linear array of elements, both insertion and deletion operations are restricted to one end of the array
  Only the element at the top-of-stack (TOS) is directly accessible
• Two basic stack operations
  push
    » Insertion
  pop
    » Deletion

Page 121: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

What is a Stack? (cont’d)

• Example
  Insertion of data items into the stack
    » Arrow points to the top-of-stack

Page 122: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

What is a Stack? (cont’d)

• Example
  Deletion of data items from the stack
    » Arrow points to the top-of-stack

Page 123: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Implementation of the Stack

• Stack segment is used to implement the stack
  Registers SS and (E)SP are used
  SS:(E)SP represents the top-of-stack
• Pentium stack implementation characteristics are
  Only words (i.e., 16-bit data) or doublewords (i.e., 32-bit data) are saved on the stack, never a single byte
  Stack grows toward lower memory addresses
    » Stack grows “downward”
  Top-of-stack (TOS) always points to the last data item placed on the stack

Page 124: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Stack Example - 1

Page 125: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Stack Example - 2

Page 126: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Stack Instructions

• Pentium provides two basic instructions:
        push source
        pop destination
  source and destination can be a
    » 16- or 32-bit general register
    » a segment register
    » a word or doubleword in memory
  source of push can also be an immediate operand of size 8, 16, or 32 bits

Page 127: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Stack Instructions: Examples

• On an empty stack created by
        .STACK 100H
  the following sequence of push instructions
        push 21ABH
        push 7FBD329AH
  results in the stack state shown in (a) in the last figure
• On this stack, executing
        pop EBX
  results in the stack state shown in (b) in the last figure, and the register EBX gets the value 7FBD329AH
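Since the figure is not reproduced here, the same sequence can be followed with a toy Python model of the stack segment. The `Stack` class is hypothetical, and the sketch assumes that the first push stores a 16-bit word, that the second stores a 32-bit doubleword, and that SP starts at 100H for the empty stack.

```python
class Stack:
    """Toy model of a stack segment; SP starts just past the end (empty stack)."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)
        self.sp = size                      # assumed: .STACK 100H gives SP = 100H when empty

    def push(self, value, nbytes):
        assert nbytes in (2, 4)             # only words or doublewords, never a single byte
        self.sp -= nbytes                   # stack grows toward lower addresses
        self.mem[self.sp:self.sp + nbytes] = value.to_bytes(nbytes, "little")

    def pop(self, nbytes):
        value = int.from_bytes(self.mem[self.sp:self.sp + nbytes], "little")
        self.sp += nbytes                   # TOS moves back up to the previous item
        return value


s = Stack()
s.push(0x21AB, 2)                           # push 21ABH
s.push(0x7FBD329A, 4)                       # push 7FBD329AH
sp_after_pushes = s.sp
ebx = s.pop(4)                              # pop EBX
print(hex(sp_after_pushes), hex(ebx), hex(s.sp))
```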

Page 128: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Additional Pentium Stack Instructions

Stack Operations on Flags
• push and pop instructions cannot be used on the Flags register
• Two special instructions for this purpose are
        pushf (push 16-bit flags)
        popf (pop 16-bit flags)
• No operands are required
• Use pushfd and popfd for 32-bit flags (EFLAGS)

Page 129: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Additional Pentium Stack Instructions (cont’d)

Stack Operations on 8 General-Purpose Registers

• pusha and popa instructions can be used to save and restore the eight general-purpose registers

AX, CX, DX, BX, SP, BP, SI, and DI

• pusha pushes these eight registers in the above order (AX first and DI last)

• popa restores these registers except that SP value is not loaded into the SP register

• Use pushad and popad for saving and restoring 32-bit registers
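The push order and the special treatment of SP can be sketched in Python. The function names, the register dictionary, and the list used as a stack are hypothetical; the sketch assumes 16-bit registers, so each push moves SP by 2.

```python
ORDER = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def pusha(regs, stack):
    """Push the eight registers, AX first and DI last."""
    original_sp = regs["SP"]                 # the SP value saved is the one before pusha began
    for name in ORDER:
        stack.append(original_sp if name == "SP" else regs[name])
        regs["SP"] -= 2


def popa(regs, stack):
    """Pop in the reverse order; the saved SP value is discarded."""
    for name in reversed(ORDER):
        value = stack.pop()
        regs["SP"] += 2
        if name != "SP":
            regs[name] = value


regs = {"AX": 1, "CX": 2, "DX": 3, "BX": 4, "SP": 0x0100, "BP": 5, "SI": 6, "DI": 7}
saved = dict(regs)
stack = []
pusha(regs, stack)
sp_inside = regs["SP"]                       # SP after the eight pushes
for name in ("AX", "CX", "DX", "BX", "BP", "SI", "DI"):
    regs[name] = 0xFFFF                      # the procedure body is free to change these
popa(regs, stack)
print(regs == saved, hex(sp_inside))
```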

Page 130: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Uses of the Stack

• Three main uses
    » Temporary storage of data
    » Transfer of control
    » Parameter passing
Temporary Storage of Data
  Example: Exchanging value1 and value2 can be done by using the stack to temporarily hold data
        push value1
        push value2
        pop value1
        pop value2

Page 131: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Uses of the Stack (cont’d)

• Often used to free a set of registers

        ;save EBX & ECX registers on the stack
        push EBX
        push ECX
        . . . . . .
        <<EBX and ECX can now be used>>
        . . . . . .
        ;restore EBX & ECX from the stack
        pop ECX
        pop EBX

Page 132: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Uses of the Stack (cont’d)

Transfer of Control
• In procedure calls and interrupts, the return address is stored on the stack
  Our discussion on procedure calls clarifies this particular use of the stack
Parameter Passing
• Stack is extensively used for parameter passing
  Our discussion later on parameter passing describes how the stack is used for this purpose

Page 133: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Assembler Directives for Procedures

• Assembler provides two directives to define procedures: PROC and ENDP

• To define a NEAR procedure, use
        proc-name PROC NEAR
  In a NEAR procedure, both calling and called procedures are in the same code segment
• A FAR procedure can be defined by
        proc-name PROC FAR
  Called and calling procedures are in two different segments in a FAR procedure

Page 134: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Assembler Directives for Procedures (cont’d)

• If FAR or NEAR is not specified, NEAR is assumed (i.e., NEAR is the default)

• We focus on NEAR procedures
• A typical NEAR procedure definition
        proc-name PROC
            . . . . .
            <procedure body>
            . . . . .
        proc-name ENDP
  proc-name should match in PROC and ENDP

Page 135: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Instructions for Procedures

• Pentium provides two instructions: call and ret
• call instruction is used to invoke a procedure
• The format is
        call proc-name
  proc-name is the procedure name
• Actions taken during a near procedure call
        SP = SP - 2
        (SS:SP) = IP
        IP = IP + relative displacement

Page 136: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Instructions for Procedures (cont’d)

• ret instruction is used to transfer control back to the calling procedure

• How will the processor know where to return?
  Uses the return address pushed onto the stack as part of executing the call instruction
  Important that TOS points to this return address when ret instruction is executed
• Actions taken during the execution of ret are:
        IP = (SS:SP)
        SP = SP + 2

Page 137: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pentium Instructions for Procedures (cont’d)

• We can specify an optional integer in the ret instruction
  The format is
        ret optional-integer
  Example:
        ret 6
• Actions taken on ret with optional-integer are:
        IP = (SS:SP)
        SP = SP + 2 + optional-integer
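The register-transfer steps for a near call and for ret can be written out as a small Python model. The `Cpu` class is hypothetical, the sample IP, SP, and displacement values are arbitrary, and a dictionary stands in for the stack segment. The model assumes IP already points to the instruction after the call when the call executes.

```python
class Cpu:
    """Toy model of IP, SP, and the stack for near call and ret."""

    def __init__(self, ip, sp):
        self.ip = ip
        self.sp = sp
        self.mem = {}                                    # stand-in for the stack segment

    def call(self, displacement):
        self.sp = (self.sp - 2) & 0xFFFF                 # SP = SP - 2
        self.mem[self.sp] = self.ip                      # (SS:SP) = IP
        self.ip = (self.ip + displacement) & 0xFFFF      # IP = IP + relative displacement

    def ret(self, optional_integer=0):
        self.ip = self.mem[self.sp]                      # IP = (SS:SP)
        self.sp = (self.sp + 2 + optional_integer) & 0xFFFF


cpu = Cpu(ip=0x0103, sp=0x0100)
cpu.call(0x0020)
print(hex(cpu.ip), hex(cpu.sp))      # inside the procedure
cpu.ret(6)                           # ret 6 also skips 6 bytes of parameters
print(hex(cpu.ip), hex(cpu.sp))
```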

Page 138: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

How Is Program Control Transferred?

Offset(hex)   machine code(hex)
                                  main PROC
                                      . . . . . .
cs:000A       E8000C                  call sum
cs:000D       8BD8                    mov BX,AX
                                      . . . . . .
                                  main ENDP

                                  sum PROC
cs:0019       55                      push BP
                                      . . . . . .
                                  sum ENDP

                                  avg PROC
                                      . . . . . .
cs:0028       E8FFEE                  call sum
cs:002B       8BD0                    mov DX,AX
                                      . . . . . .
                                  avg ENDP
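The displacements in the two call instructions can be checked with a little arithmetic. In this Python sketch the function name is hypothetical; it assumes the call instruction is 3 bytes long (as the offsets 000A to 000D and 0028 to 002B in the listing suggest) and that IP arithmetic wraps around at 16 bits.

```python
def call_target(call_offset, displacement, call_length=3):
    """Target of a near call: offset of the next instruction plus the displacement."""
    next_ip = call_offset + call_length          # IP already points past the call
    return (next_ip + displacement) & 0xFFFF     # 16-bit wrap-around handles negative values


print(hex(call_target(0x000A, 0x000C)))   # call sum in main: 000DH + 000CH
print(hex(call_target(0x0028, 0xFFEE)))   # call sum in avg: FFEEH is -18 in two's complement
```

Both calls resolve to offset 0019H, the first instruction of sum. The displacement is positive in main (forward jump) and negative in avg (backward jump).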

Page 139: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Parameter Passing

• Parameter passing is different from, and more complicated than, parameter passing in a high-level language
• In assembly language
    » First place all required parameters in a mutually accessible storage area
    » Then call the procedure
• Type of storage area used
    » Registers (general-purpose registers are used)
    » Memory (stack is used)
• Two common methods of parameter passing
    » Register method
    » Stack method

Page 140: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Parameter Passing: Register Method

• Calling procedure places the necessary parameters in the general-purpose registers before invoking the procedure through the call instruction

• Examples:
  PROCEX1.ASM
    » call-by-value using the register method
    » a simple sum procedure
  PROCEX2.ASM
    » call-by-reference using the register method
    » string length procedure

Page 141: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Pros and Cons of the Register Method

• Advantages
  Convenient and easier
  Faster
• Disadvantages
  Only a few parameters can be passed using the register method
      – Only a small number of registers are available
  Often these registers are not free
      – Freeing them by pushing their values onto the stack negates the second advantage

Page 142: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Parameter Passing: Stack Method

• All parameter values are pushed onto the stack before calling the procedure

• Example:
        push number1
        push number2
        call sum

Page 143: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Accessing Parameters on the Stack

• Parameter values are buried inside the stack
• We cannot use
        mov BX,[SP+2] ;illegal
  to access number2 in the previous example
• We can use
        mov BX,[ESP+2] ;valid
  Problem: The ESP value changes with push and pop operations
    » Relative offset depends on the stack operations performed
    » Not desirable

Page 144: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Accessing Parameters on the Stack (cont’d)

• We can also use
        add SP,2
        mov BX,[SP] ;valid
  Problem: cumbersome
    » We have to remember to update SP to point to the return address on the stack before the end of the procedure
• Is there a better alternative?
  Use the BP register to access parameters on the stack

Page 145: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Using BP Register to Access Parameters

• Preferred method of accessing parameters on the stack is
        mov BP,SP
        mov BX,[BP+2]
  to access number2 in the previous example
• Problem: BP contents are lost!
  We have to preserve the contents of BP
  Use the stack (caution: offset value changes)
        push BP
        mov BP,SP
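The caution about the changed offset can be made concrete with a Python sketch of the stack contents. Everything here is illustrative: the function name, the dictionary used as memory, and the sample return address, old BP value, and initial SP are assumptions. Once BP itself has been pushed, number2 is found at [BP+4] rather than [BP+2].

```python
def build_frame(number1, number2, return_ip=0x000D, old_bp=0x1234, sp=0x0100):
    """Replay the pushes made by the caller, the call, and the procedure entry."""
    mem = {}

    def push(value):
        nonlocal sp
        sp -= 2                  # each item is a 16-bit word
        mem[sp] = value

    push(number1)                # push number1
    push(number2)                # push number2
    push(return_ip)              # call sum pushes the return address
    push(old_bp)                 # push BP
    bp = sp                      # mov BP,SP
    return mem, bp


mem, bp = build_frame(25, 70)
print("old BP         [BP]   =", hex(mem[bp]))
print("return address [BP+2] =", hex(mem[bp + 2]))
print("number2        [BP+4] =", mem[bp + 4])
print("number1        [BP+6] =", mem[bp + 6])
```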

Page 146: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Clearing the Stack Parameters

Stack state after push BP

Stack state after pop BP

Stack state after executing ret

Page 147: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Clearing the Stack Parameters (cont’d)

• Two ways of clearing the unwanted parameters on the stack:
  Use the optional-integer in the ret instruction
    » Use
            ret 4
      in the previous example
  Add the constant to SP in calling procedure (C uses this method)
        push number1
        push number2
        call sum
        add SP,4
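Both methods leave SP where it was before the parameters were pushed, which a short Python trace of SP can confirm. The function name and the starting SP value are hypothetical, and the push BP / pop BP pair inside the procedure is left out because it cancels itself.

```python
def sp_after_call(cleanup, sp=0x0100):
    """Follow SP through two parameter pushes, a near call, and the cleanup."""
    sp -= 2              # push number1
    sp -= 2              # push number2
    sp -= 2              # call sum pushes the return address
    if cleanup == "callee":
        sp += 2 + 4      # ret 4: pop the return address, then skip the two parameters
    else:
        sp += 2          # ret: pop the return address only
        sp += 4          # add SP,4 in the calling procedure
    return sp


print(hex(sp_after_call("callee")), hex(sp_after_call("caller")))
```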

Page 148: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Housekeeping Issues

• Who should clean up the stack of unwanted parameters?
  Calling procedure
    » Need to update SP with every procedure call
    » Not really needed if procedures use fixed number of parameters
    » C uses this method because C allows variable number of parameters
  Called procedure
    » Code becomes modular (parameter clearing is done in only one place)
    » Cannot be used with variable number of parameters

Page 149: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Housekeeping Issues (cont’d)

• Need to preserve the state across a procedure call
    » Stack is used for this purpose
• Which registers should be saved?
  Save those registers that are used by the calling procedure but are modified by the called procedure
    » Might cause problems
  Save all registers (brute force method)
    » Done by using pusha
    » Increased overhead
      – pusha takes 5 clocks as opposed to 1 to save a register

Page 150: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Housekeeping Issues (cont’d)

Stack state after pusha

Page 151: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Housekeeping Issues (cont’d)

• Who should preserve the state of the calling procedure?
  Calling procedure
    » Need to know the registers used by the called procedure
    » Need to include instructions to save and restore registers with every procedure call
    » Causes program maintenance problems
  Called procedure
    » Preferred method as the code becomes modular (state preservation is done only once and in one place)
    » Avoids the program maintenance problems mentioned

Page 152: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Stack Frame Instructions

• ENTER instruction
  Facilitates stack frame (discussed later) allocation
        enter bytes,level
    bytes = local storage space
    level = nesting level (we use 0)
  Example
        enter XX,0
  Equivalent to
        push BP
        mov BP,SP
        sub SP,XX

Page 153: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Stack Frame Instructions (cont’d)

• LEAVE instruction
  Releases stack frame
        leave
    » Takes no operands
    » Equivalent to
            mov SP,BP
            pop BP
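The two equivalences can be followed step by step in Python. The `Frame` class, the starting SP and BP values, and the dictionary standing in for the stack are hypothetical, and only nesting level 0 is modelled.

```python
class Frame:
    """Toy model of SP and BP across enter XX,0 and leave."""

    def __init__(self, sp=0x0100, bp=0x1234):
        self.sp, self.bp, self.mem = sp, bp, {}

    def enter(self, nbytes):
        self.sp -= 2
        self.mem[self.sp] = self.bp      # push BP
        self.bp = self.sp                # mov BP,SP
        self.sp -= nbytes                # sub SP,XX reserves the local storage space

    def leave(self):
        self.sp = self.bp                # mov SP,BP releases the local storage space
        self.bp = self.mem[self.sp]      # pop BP
        self.sp += 2


f = Frame()
f.enter(4)                               # models: enter 4,0
print(hex(f.sp), hex(f.bp))
f.leave()
print(hex(f.sp), hex(f.bp))
```

After leave, both SP and BP hold the values they had before enter, so the stack is ready for ret.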

Page 154: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

A Typical Procedure Template

        proc-name PROC
            enter XX,0
            . . . . . .
            <procedure body>
            . . . . . .
            leave
            ret YY
        proc-name ENDP

Page 155: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Stack Parameter Passing: Examples

• PROCEX3.ASM
  call-by-value using the stack method
  a simple sum procedure
• PROCSWAP.ASM
  call-by-reference using the stack method
  first two characters of the input string are swapped
• BBLSORT.ASM
  implements bubble sort algorithm
  uses pusha and popa to save and restore registers

Page 156: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Variable Number of Parameters

• For most procedures, the number of parameters is fixed
  Same number of arguments in each call
• In procedures that can have variable number of parameters
  Number of arguments can vary from call to call
  C supports procedures with variable number of parameters
• Easy to support variable number of parameters using the stack method

Page 157: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Variable Number of Parameters (cont’d)

• To implement variable number of parameter passing:
  Parameter count should be one of the parameters passed
  This count should be the last parameter pushed onto the stack
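Pushing the count last puts it at a fixed offset from BP, whatever the number of parameters, so the procedure always knows where to find it. The Python sketch below imitates this. The function name, the sample return address and old BP value, and the use of 16-bit words are assumptions made for illustration.

```python
def call_sum(*numbers):
    """Imitate a caller pushing parameters plus a count, and a callee summing them."""
    mem, sp = {}, 0x0100

    def push(value):
        nonlocal sp
        sp -= 2
        mem[sp] = value

    # calling procedure
    for n in numbers:
        push(n)                      # push each parameter
    push(len(numbers))               # parameter count is pushed last
    push(0x000D)                     # call pushes the return address (arbitrary value)

    # called procedure
    push(0x1234)                     # push BP (arbitrary old value)
    bp = sp                          # mov BP,SP
    count = mem[bp + 4]              # count is always just above the return address
    total = 0
    for i in range(count):
        total += mem[bp + 6 + 2 * i] # parameters follow the count
    return total


print(call_sum(10, 20, 30), call_sum(7), call_sum())
```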

Page 158: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Local Variables

• Local variables are dynamic in nature
  Come into existence when the procedure is invoked
  Disappear when the procedure terminates
• Cannot reserve space for these variables in the data segment for two reasons:
    » Such space allocation is static
      – Remains active even when the procedure is not
    » It does not work with recursive procedures
• Space for local variables is reserved on the stack
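The recursion problem can be demonstrated with a small Python experiment. This is an analogy, not assembly: a module-level variable plays the role of a local variable given one fixed slot in the data segment, while an ordinary Python local plays the role of a local kept on the stack. Both function names are hypothetical.

```python
static_n = 0                         # one fixed slot, as if reserved in the data segment


def sum_static(n):
    """Intended to return 0 + 1 + ... + n, but keeps its local in the fixed slot."""
    global static_n
    static_n = n                     # every activation writes the same slot
    if n == 0:
        return 0
    partial = sum_static(n - 1)      # the recursive call overwrites static_n
    return static_n + partial


def sum_stack(n):
    """Same computation with a per-activation local variable."""
    local_n = n                      # a fresh copy exists for every activation
    if n == 0:
        return 0
    partial = sum_stack(n - 1)
    return local_n + partial


print(sum_static(3), sum_stack(3))
```

The static version gives the wrong answer because, by the time each activation resumes, the innermost call has already overwritten the shared slot.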

Page 159: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Local Variables (cont’d)

Example
• N and temp
  Two local variables
  Each requires two bytes of storage

Page 160: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Local Variables (cont’d)

• The information stored in the stack
    » parameters
    » return address
    » old BP value
    » local variables
  is collectively called the stack frame
• In high-level languages, the stack frame is also referred to as the activation record
    » Each procedure activation requires all this information
• The BP value is referred to as the frame pointer
    » Once the BP value is known, we can access all the data in the stack frame

Page 161: The Pentium Processor. 2003 To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.  S. Dandamudi Chapter 7:

Local Variables: Examples

• PROCFIB1.ASM
  For simple procedures, registers can also be used for local variable storage
  Uses registers for local variable storage
  Outputs the largest Fibonacci number that is less than the given input number
• PROCFIB2.ASM
  Uses the stack for local variable storage
  Performance implications of using registers versus stack are discussed later


Multiple Module Programs

• In multi-module programs, a single program is split into multiple source files

• Advantages

» If a module is modified, only that module needs to be reassembled (not the whole program)

» Several programmers can share the work

» Making modifications is easier with several short files

» Unintended modifications can be avoided

• To facilitate separate assembly, two assembler directives are provided:

» PUBLIC and EXTRN


PUBLIC Assembler Directive

• The PUBLIC directive makes the associated labels public

» Makes these labels available for other modules of the program

• The format is

    PUBLIC label1, label2, . . .

• Almost any label can be made public, including

» procedure names

» variable names

» equated labels

• In the PUBLIC statement, it is not necessary to specify the type of label


Example: PUBLIC Assembler Directive

            . . . . .
            PUBLIC error_msg, total, sample
            . . . . .
    .DATA
    error_msg   DB   'Out of range!',0
    total       DW   0
            . . . . .
    .CODE
            . . . . .
    sample  PROC
            . . . . .
    sample  ENDP
            . . . . .


EXTRN Assembler Directive

• The EXTRN directive tells the assembler that certain labels are not defined in the current module

» The assembler leaves “holes” in the OBJ file for the linker to fill in later on

• The format is

    EXTRN label:type

where label is a label made public by a PUBLIC directive in some other module and type is the type of the label


EXTRN Assembler Directive (cont’d)

    Type      Description
    UNKNOWN   Undetermined or unknown type
    BYTE      Data variable (size is 8 bits)
    WORD      Data variable (size is 16 bits)
    DWORD     Data variable (size is 32 bits)
    QWORD     Data variable (size is 64 bits)
    FWORD     Data variable (size is 6 bytes)
    TBYTE     Data variable (size is 10 bytes)
    PROC      A procedure name (NEAR or FAR according to .MODEL)
    NEAR      A near procedure name
    FAR       A far procedure name


EXTRN Assembler Directive (cont’d)

Example

    .MODEL SMALL
    . . . .
    EXTRN error_msg:BYTE, total:WORD
    EXTRN sample:PROC
    . . . .

Note: EXTRN (not EXTERN)

Example

» module1.asm (main procedure)

» module2.asm (string-length procedure)


Addressing Modes


Addressing Modes

• Addressing mode refers to the specification of the location of data required by an operation

• Pentium supports three fundamental addressing modes:

» Register mode

» Immediate mode

» Memory mode

• Specification of operands located in memory can be done in a variety of ways

» Mainly to support high-level language constructs and data structures


Pentium Addressing Modes (32-bit Addresses)


Memory Addressing Modes (16-bit Addresses)


Simple Addressing Modes

• Register addressing mode

» Operands are located in registers

» It is the most efficient addressing mode

• Immediate addressing mode

» Operand is stored as part of the instruction

– This mode is used mostly for constants

» It imposes several restrictions

» Efficient as the data comes with the instruction

– Instructions are generally prefetched

• Both addressing modes were discussed earlier


Memory Addressing Modes

• Pentium offers several addressing modes to access operands located in memory

» Primary reason: To efficiently support high-level language constructs and data structures

• Available addressing modes depend on the address size used

» 16-bit modes (shown before)

– same as those supported by the 8086

» 32-bit modes (shown before)

– supported by Pentium

– more flexible set


32-Bit Addressing Modes

• These addressing modes use 32-bit registers

Segment + Base + (Index * Scale) + displacement

    Segment   Base   Index   Scale   Displacement
    CS        EAX    EAX     1       no displacement
    SS        EBX    EBX     2       8-bit displacement
    DS        ECX    ECX     4       32-bit displacement
    ES        EDX    EDX     8
    FS        ESI    ESI
    GS        EDI    EDI
              EBP    EBP
              ESP


Differences between 16- and 32-bit Modes

                      16-bit addressing   32-bit addressing
    Base register     BX, BP              EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
    Index register    SI, DI              EAX, EBX, ECX, EDX, ESI, EDI, EBP
    Scale factor      None                1, 2, 4, 8
    Displacement      0, 8, 16 bits       0, 8, 32 bits
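
The 32-bit form base + (index * scale) + displacement can be modelled in a few lines. The Python sketch below is only an illustration under stated assumptions: the function name effective_address and the register dictionary are hypothetical, and the segment part of the address is ignored.

```python
# Hypothetical model of the 32-bit effective-address computation.
VALID_SCALES = (1, 2, 4, 8)

def effective_address(regs, base=None, index=None, scale=1, disp=0):
    """Return base + (index * scale) + disp, wrapped to 32 bits."""
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4, or 8")
    if index == "ESP":
        # ESP appears in the base column only, not in the index column
        raise ValueError("ESP cannot be used as an index register")
    ea = disp                      # signed displacement (may be negative)
    if base is not None:
        ea += regs[base]
    if index is not None:
        ea += regs[index] * scale
    return ea & 0xFFFFFFFF         # offsets are 32 bits wide

# Illustrative register contents (made-up values)
regs = {"EBX": 0x1000, "ESI": 3, "ESP": 0x8000}
```

For instance, with these made-up register values, base EBX, index ESI, scale 4, and displacement 8 give 0x1000 + 12 + 8 = 0x1014.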


16-bit or 32-bit Addressing Mode?

• How does the processor know?

• Uses the D bit in the CS segment descriptor

» D = 0: default size of operands and addresses is 16 bits

» D = 1: default size of operands and addresses is 32 bits

• We can override these defaults

» Pentium provides two size override prefixes

– 66H operand size override prefix

– 67H address size override prefix

• Using these prefixes, we can mix 16- and 32-bit data and addresses
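
How the D bit and the two prefixes combine can be summarized as a small truth table. The following Python sketch is a simplified model, not processor documentation: the function name effective_sizes is hypothetical, and it assumes each prefix simply toggles the corresponding default.

```python
def effective_sizes(d_bit, prefixes=()):
    """Return (operand_size, address_size) in bits for one instruction.

    d_bit    -- D bit of the CS segment descriptor (0 or 1)
    prefixes -- prefix bytes present on the instruction, e.g. [0x66, 0x67]
    """
    default = 32 if d_bit else 16
    other = 16 if d_bit else 32
    operand = other if 0x66 in prefixes else default   # 66H: operand size override
    address = other if 0x67 in prefixes else default   # 67H: address size override
    return operand, address
```

With D = 0, effective_sizes(0, [0x66]) yields 32-bit operands with 16-bit addresses, and supplying both prefixes switches both sizes to 32 bits.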


Examples: Override Prefixes

• Our default mode is 16-bit data and addresses

Example 1: Data size override

    mov AX,123             ==> B8 007B
    mov EAX,123            ==> 66 | B8 0000007B

Example 2: Address size override

    mov AX,[EBX+ESI*2]     ==> 67 | 8B0473

Example 3: Address and data size override

    mov EAX,[EBX+ESI*2]    ==> 66 | 67 | 8B0473
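
The memory operand in Examples 2 and 3 can be read back from the bytes 8B 04 73 that follow the prefixes: 8B is the mov opcode, 04 is the ModR/M byte, and 73 is the SIB byte. The Python sketch below splits those two bytes into their fields. The function names are hypothetical, and the sketch ignores the special encodings (for example, index field 100 meaning "no index").

```python
# 32-bit register numbering used in the ModR/M and SIB fields
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def decode_modrm(modrm):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    return modrm >> 6, (modrm >> 3) & 0b111, modrm & 0b111

def decode_sib(sib):
    """Split an SIB byte into (scale, index register, base register)."""
    scale = 1 << (sib >> 6)              # bits 7-6: 00, 01, 10, 11 -> 1, 2, 4, 8
    index = REG32[(sib >> 3) & 0b111]    # bits 5-3
    base = REG32[sib & 0b111]            # bits 2-0
    return scale, index, base
```

decode_modrm(0x04) gives mod = 0, reg = 0 (AX or EAX), and rm = 4 (an SIB byte follows); decode_sib(0x73) gives scale 2, index ESI, and base EBX, with no displacement.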


Memory Addressing Modes

• Direct addressing mode

» Offset is specified as part of the instruction

– Assembler replaces variable names by their offset values

– Useful to access only simple variables

Example

    total_marks = assign_marks + test_marks + exam_marks

translated into

    mov EAX,assign_marks
    add EAX,test_marks
    add EAX,exam_marks
    mov total_marks,EAX


Memory Addressing Modes (cont’d)

• Register indirect addressing mode

» Effective address is placed in a general-purpose register

» In 16-bit segments

– only BX, SI, and DI are allowed to hold an effective address

    add AX,[BX]     is valid
    add AX,[CX]     is NOT allowed

» In 32-bit segments

– any of the eight 32-bit registers can hold an effective address

    add AX,[ECX]    is valid


Memory Addressing Modes (cont’d)

• Default segments

» 16-bit addresses

– BX, SI, DI: data segment

– BP, SP: stack segment

» 32-bit addresses

– EAX, EBX, ECX, EDX, ESI, EDI: data segment

– EBP, ESP: stack segment

• Possible to override these defaults

» Pentium provides segment override prefixes


Based Addressing

• Effective address is computed as

    base + signed displacement

» Displacement:

– 16-bit addresses: 8- or 16-bit number

– 32-bit addresses: 8- or 32-bit number

• Useful to access fields of a structure or record

» Base register ==> points to the base address of the structure

» Displacement ==> relative offset within the structure

• Useful to access arrays whose element size is not 2, 4, or 8 bytes

» Displacement ==> points to the beginning of the array

» Base register ==> relative offset of an element within the array


Based Addressing (cont’d)


Indexed Addressing

• Effective address is computed as

    (index * scale factor) + signed displacement

» 16-bit addresses:

– displacement: 8- or 16-bit number

– scale factor: none (i.e., 1)

» 32-bit addresses:

– displacement: 8- or 32-bit number

– scale factor: 2, 4, or 8

• Useful to access elements of an array (particularly if the element size is 2, 4, or 8 bytes)

» Displacement ==> points to the beginning of the array

» Index register ==> selects an element of the array (array index)

» Scaling factor ==> size of the array element


Indexed Addressing (cont’d)

Examples

    add AX,[DI+20]

– We have seen similar usage to access parameters off the stack (in Chapter 10)

    add AX,marks_table[ESI*4]

– Assembler replaces marks_table by a constant (i.e., supplies the displacement)

– Each element of marks_table takes 4 bytes (the scale factor value)

– ESI needs to hold the element subscript value

    add AX,table1[SI]

– SI needs to hold the element offset in bytes

– When we use the scale factor we avoid such byte counting
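
The difference between holding a subscript (with a scale factor) and holding a byte offset (without one) can be illustrated with a toy memory model. In the Python sketch below, the bytearray standing in for the data segment, the offset 16 chosen for marks_table, and the helper name read_element are all assumptions made for the illustration.

```python
def read_element(memory, table_offset, subscript, elem_size):
    """Model of table[index*scale]: fetch one little-endian element."""
    ea = table_offset + subscript * elem_size      # displacement + index * scale
    return int.from_bytes(memory[ea:ea + elem_size], "little")

# Toy data segment: three 4-byte marks stored from (hypothetical) offset 16
memory = bytearray(64)
marks_table = 16
for i, mark in enumerate([70, 85, 92]):
    memory[marks_table + 4 * i: marks_table + 4 * i + 4] = mark.to_bytes(4, "little")

# With a scale factor, the register holds the subscript (2) ...
scaled = read_element(memory, marks_table, 2, 4)
# ... without one, the register must hold the byte offset (2 * 4 = 8)
unscaled = read_element(memory, marks_table, 8, 1)
```

Both reads refer to the same address; the unscaled read fetches only the low byte because its element size is given as 1 in this model.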


Based-Indexed Addressing

Based-indexed addressing with no scale factor

• Effective address is computed as

    base + index + signed displacement

• Useful in accessing two-dimensional arrays

» Displacement points to the beginning of the array

» Base and index registers point to a row and an element within that row

• Useful in accessing arrays of records

» Displacement represents the offset of a field in a record

» Base and index registers hold a pointer to the base of the array and the offset of an element relative to the base of the array


Based-Indexed Addressing (cont’d)

• Useful in accessing arrays passed on to a procedure

» Base register points to the beginning of the array

» Index register represents the offset of an element relative to the base of the array

Example

Assuming BX points to table1

    mov AX,[BX+SI]
    cmp AX,[BX+SI+2]

compares two successive elements of table1


Based-Indexed Addressing (cont’d)

Based-indexed addressing with scale factor

• Effective address is computed as

    base + (index * scale factor) + signed displacement

• Useful in accessing two-dimensional arrays when the element size is 2, 4, or 8 bytes

» Displacement ==> points to the beginning of the array

» Base register ==> holds offset to a row (relative to start of array)

» Index register ==> selects an element of the row

» Scaling factor ==> size of the array element


Illustrative Examples

• Insertion sort

» ins_sort.asm

» Sorts an integer array using the insertion sort algorithm

– Inserts a new number into the sorted array in its right place

• Binary search

» bin_srch.asm

» Uses binary search to locate a data item in a sorted array

– Efficient search algorithm


Arrays

One-Dimensional Arrays

• Array declaration in HLL (such as C)

int test_marks[10];

specifies a lot of information about the array:

» Name of the array (test_marks)

» Number of elements (10)

» Element size (2 bytes)

» Interpretation of each element (int i.e., signed integer)

» Index range (0 to 9 in C)

• You get very little help in assembly language!


Arrays (cont’d)

• In assembly language, a declaration such as

    test_marks DW 10 DUP (?)

only assigns a name and allocates storage space

» You, as the assembly language programmer, have to “properly” access the array elements by taking into account the element size and the range of subscripts

• Accessing an array element requires its displacement or offset relative to the start of the array in bytes


Arrays (cont’d)

• To compute displacement, we need to know how the array is laid out

» Simple for 1-D arrays

• Assuming C style subscripts

    displacement = subscript * element size in bytes

• If the element size is 2, 4, or 8 bytes

» a scale factor can be used to avoid counting displacement in bytes


Multidimensional Arrays

• We focus on two-dimensional arrays

» Our discussion can be generalized to higher dimensions

• A 5 × 3 array can be declared in C as

    int class_marks[5][3];

• Two-dimensional arrays can be stored in one of two ways:

» Row-major order

– Array is stored row by row

– Most HLLs, including C and Pascal, use this method

» Column-major order

– Array is stored column by column

– FORTRAN uses this method


Multidimensional Arrays (cont’d)


Multidimensional Arrays (cont’d)

• Why do we need to know the underlying storage representation?

» In a HLL, we really don’t need to know

» In assembly language, we need this information as we have to calculate the displacement of the element to be accessed

• In assembly language,

    class_marks DW 5*3 DUP (?)

allocates 30 bytes of storage

• There is no support for using row and column subscripts

» Need to translate these subscripts into a displacement value


Multidimensional Arrays (cont’d)

• Assuming C language subscript convention (and row-major order), we can express the displacement of an element in a 2-D array at row i and column j as

    displacement = (i * COLUMNS + j) * ELEMENT_SIZE

where

    COLUMNS = number of columns in the array
    ELEMENT_SIZE = element size in bytes

Example: Displacement of the class_marks[3][1] element is (3*3 + 1) * 2 = 20
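
The displacement formula can be checked with a short Python sketch. The function names are hypothetical; the column-major variant is not given on the slides and is included only for comparison, using the standard formula (j * ROWS + i) * ELEMENT_SIZE.

```python
def row_major_disp(i, j, columns, elem_size):
    """Byte displacement of element [i][j] when rows are stored one after another."""
    return (i * columns + j) * elem_size

def col_major_disp(i, j, rows, elem_size):
    """Byte displacement of element [i][j] when columns are stored one after another."""
    return (j * rows + i) * elem_size

# class_marks is a 5 x 3 array of 2-byte elements
ROWS, COLUMNS, ELEMENT_SIZE = 5, 3, 2
disp = row_major_disp(3, 1, COLUMNS, ELEMENT_SIZE)   # (3*3 + 1) * 2
```

The same element lands at a different displacement under column-major storage, which is why the assembly programmer has to know the layout.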


Examples of Arrays

Example 1

• One-dimensional array

» Computes array sum (each element is 4 bytes long, e.g., long integers)

» Uses scale factor 4 to access elements of the array by using a 32-bit addressing mode (uses ESI rather than SI)

» Also illustrates the use of the predefined location counter $

Example 2

• Two-dimensional array

» Finds sum of a column

» Uses “based-indexed addressing with scale factor” to access elements of a column
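
The column-sum idea of Example 2 can be sketched in Python. This is not the program from the book: the function name, the bytearray used as memory, the offset 8, and the sample marks are all made up for the illustration. The row offset plays the role of the base register, the column subscript the role of the index register, and the element size the role of the scale factor.

```python
def column_sum(memory, array_disp, rows, columns, col, elem_size=2):
    """Sum one column of a row-major 2-D array stored in `memory`."""
    total = 0
    row_offset = 0                           # base register: offset of current row
    for _ in range(rows):
        # displacement + base + (index * scale factor)
        ea = array_disp + row_offset + col * elem_size
        total += int.from_bytes(memory[ea:ea + elem_size], "little")
        row_offset += columns * elem_size    # advance base to the next row
    return total

# Hypothetical 5 x 3 table of 2-byte marks stored at offset 8
marks = [[90, 89, 99], [79, 66, 70], [70, 60, 77], [60, 55, 68], [51, 59, 57]]
memory = bytearray(8)
for row in marks:
    for value in row:
        memory += value.to_bytes(2, "little")
```

Only the base value changes from one iteration to the next; the index and scale factor stay fixed for the whole column.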


Recursion

• A recursive procedure calls itself

» Directly, or

» Indirectly

• Some applications can be naturally expressed using recursion

    factorial(0) = 1
    factorial(n) = n * factorial(n - 1)   for n > 0

• From implementation viewpoint

» Very similar to any other procedure call

– Activation records are stored on the stack
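
The recursive definition of factorial translates directly into code. In the Python sketch below, the optional trace list is an addition made for this illustration: it records (depth, n) for each activation, mirroring the way each call gets its own activation record on the stack.

```python
def factorial(n, depth=0, trace=None):
    """Recursive factorial; optionally records (depth, n) for each activation."""
    if trace is not None:
        trace.append((depth, n))     # one entry per activation record
    if n == 0:
        return 1                     # factorial(0) = 1
    return n * factorial(n - 1, depth + 1, trace)

calls = []
result = factorial(3, trace=calls)
```

Computing factorial(3) creates four activations, for n = 3, 2, 1, and 0, before any of them returns.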


Recursion (cont’d)


Recursion (cont’d)

• Example 1

» Factorial

– Discussed before

• Example 2

» Quicksort (on an N-element array)

» Basic algorithm

– Selects a partition element x

– Assume that the final position of x is array[i]

– Moves elements less than x into array[0]…array[i-1]

– Moves elements greater than x into array[i+1]…array[N-1]

– Applies quicksort recursively to sort these two subarrays
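
The basic algorithm can be sketched in Python as follows. The slides do not say how the partition element is chosen or how the elements are moved, so this sketch makes two assumptions: it picks the last element of the subarray as x and uses a simple left-to-right partitioning pass.

```python
def quicksort(array, lo=0, hi=None):
    """Sort array[lo..hi] in place."""
    if hi is None:
        hi = len(array) - 1
    if lo >= hi:
        return                       # zero or one element: already sorted
    x = array[hi]                    # partition element (assumed choice: last element)
    i = lo                           # next free slot for an element less than x
    for k in range(lo, hi):
        if array[k] < x:
            array[i], array[k] = array[k], array[i]
            i += 1
    array[i], array[hi] = array[hi], array[i]   # x is now at its final position i
    quicksort(array, lo, i - 1)      # sort the elements before x
    quicksort(array, i + 1, hi)      # sort the elements after x

data = [5, 2, 9, 1, 5, 6]
quicksort(data)
```

Each recursive call works on a strictly smaller subarray, so the recursion terminates when a subarray has at most one element.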


Selected Pentium Instructions


Status Flags


Status Flags (cont’d)

• Status flags are updated to indicate certain properties of the result

» Example: If the result is zero, the zero flag is set

• Once a flag is set, it remains in that state until another instruction that affects the flags is executed

• Not all instructions affect all status flags

» add and sub affect all six flags

» inc and dec affect all but the carry flag

» mov, push, and pop do not affect any flags


Status Flags (cont’d)

• Example; initially, assume ZF = 0

mov AL,55H ; ZF is still zero

sub AL,55H ; result is 0 ; ZF is set (ZF = 1)

push BX ; ZF remains 1

mov BX,AX ; ZF remains 1

pop DX ; ZF remains 1

mov CX,0 ; ZF remains 1

inc CX ; result is 1 ; ZF is cleared (ZF = 0)


Status Flags (cont’d)

• Zero flag

» Indicates zero result

– If the result is zero, ZF = 1

– Otherwise, ZF = 0

» Zero can result in several ways (e.g., overflow)

    mov AL,0FH       mov AX,0FFFFH    mov AX,1
    add AL,0F1H      inc AX           dec AX

– All three examples result in zero result and set ZF

» Related instructions

    jz     jump if zero (jump if ZF = 1)
    jnz    jump if not zero (jump if ZF = 0)


Status Flags (cont’d)

• Uses of zero flag

» Two main uses of zero flag

– Testing equality

    Often used with the cmp instruction

    cmp char,'$'     ; ZF = 1 if char is $
    cmp AX,BX

– Counting to a preset value

    Initialize a register with the count value
    Decrement it using the dec instruction
    Use jz/jnz to transfer control


Status Flags (cont’d)

• Consider the following code

sum := 0

for (i = 1 to M)

for (j = 1 to N)

sum := sum + 1

end for

end for

• Assembly code

            sub  AX,AX          ; AX := 0
            mov  DX,M
    outer_loop:
            mov  CX,N
    inner_loop:
            inc  AX
            loop inner_loop
            dec  DX
            jnz  outer_loop
    exit_loops:
            mov  sum,AX


Status Flags (cont’d)

• Two observations

» The two-instruction sequence

    dec  DX
    jnz  outer_loop

plays the same role for the outer loop as the loop instruction does for the inner loop, with DX as the counter

– This two-instruction sequence is more efficient than the loop instruction (takes less time to execute)

– Unlike dec, the loop instruction does not affect any flags!

» This two-instruction sequence is better than initializing DX = 1 and executing

    inc  DX
    cmp  DX,M
    jle  outer_loop


Status Flags (cont’d)

• Carry flag

» Records the fact that the result of an arithmetic operation on unsigned numbers is out of range

» The carry flag is set in the following examples

    mov AL,0FH       mov AX,12AEH
    add AL,0F1H      sub AX,12AFH

» Range of 8-, 16-, and 32-bit unsigned numbers

    size       range
    8 bits     0 to 255 (2^8 - 1)
    16 bits    0 to 65,535 (2^16 - 1)
    32 bits    0 to 4,294,967,295 (2^32 - 1)


Status Flags (cont’d)

» Carry flag is not affected by the inc and dec instructions

– The carry flag is not set in the following examples

    mov AL,0FFH      mov AX,0
    inc AL           dec AX

» Related instructions

    jc     jump if carry (jump if CF = 1)
    jnc    jump if no carry (jump if CF = 0)

» Carry flag can be manipulated directly using

    stc    set carry flag (set CF to 1)
    clc    clear carry flag (clears CF to 0)
    cmc    complement carry flag (inverts CF value)


Status Flags (cont’d)

• Uses of carry flag

» To propagate carry/borrow in multiword addition/subtraction

                       1                 <== carry from lower 32 bits
    x = 3710 26A8   1257 9AE7H
    y = 489B A321   FE60 4213H
        7FAB C9CA   10B7 DCFAH

» To detect overflow/underflow condition

– In an unsigned addition such as the one above, a carry out of the leftmost bit indicates overflow

» To test a bit using the shift/rotate instructions

– Bit shifted/rotated out is captured in the carry flag

– We can use jc/jnc to test whether this bit is 1 or 0
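
The multiword addition above can be reproduced with a short Python sketch that adds the two 64-bit values as a pair of 32-bit halves. The function names are hypothetical; add32 models a 32-bit addition that also reports its carry out, in the way an add followed by an add-with-carry (adc) would be used on the two halves.

```python
MASK32 = 0xFFFFFFFF

def add32(a, b, carry_in=0):
    """32-bit addition: return (result, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32

def add64(x, y):
    """Add two 64-bit values using two 32-bit additions."""
    lo, carry = add32(x & MASK32, y & MASK32)         # lower halves
    hi, carry_out = add32(x >> 32, y >> 32, carry)    # upper halves plus the carry
    return (hi << 32) | lo, carry_out

x = 0x371026A812579AE7
y = 0x489BA321FE604213
total, carry_out = add64(x, y)
```

Adding the lower halves 12579AE7H and FE604213H produces 10B7DCFAH with a carry of 1, which is then added into the upper halves to give 7FABC9CAH.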


Status Flags (cont’d)

• Overflow flag

» Indicates out-of-range result on signed numbers

– Signed number counterpart of the carry flag

» The following code sets the overflow flag but not the carry flag

    mov AL,72H     ; 72H = 114D
    add AL,0EH     ; 0EH = 14D

» Range of 8-, 16-, and 32-bit signed numbers

    size       range
    8 bits     -128 to +127                         -2^7 to (2^7 - 1)
    16 bits    -32,768 to +32,767                   -2^15 to (2^15 - 1)
    32 bits    -2,147,483,648 to +2,147,483,647     -2^31 to (2^31 - 1)


Status Flags (cont’d)

Unsigned interpretation

            mov AL,72H
            add AL,0EH
            jc  overflow
    no_overflow:
            (no overflow code here)
            . . . .
    overflow:
            (overflow code here)
            . . . .

Signed interpretation

            mov AL,72H
            add AL,0EH
            jo  overflow
    no_overflow:
            (no overflow code here)
            . . . .
    overflow:
            (overflow code here)
            . . . .

• Signed or unsigned: How does the system know?

» The processor does not know the interpretation

» It sets carry and overflow under each interpretation
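
That the processor computes both flags regardless of interpretation can be seen in a small model of an 8-bit add. The Python sketch below is an illustration, not a full emulator: the function name add8 is hypothetical, and only CF, ZF, SF, and OF are modelled.

```python
def add8(a, b):
    """Model of an 8-bit add: return (result, flags)."""
    raw = a + b
    result = raw & 0xFF
    flags = {
        "CF": int(raw > 0xFF),       # unsigned result out of range
        "ZF": int(result == 0),      # result is zero
        "SF": result >> 7,           # copy of the most significant bit
        # signed overflow: both operands have the same sign bit
        # and the result's sign bit differs from it
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

signed_case = add8(0x72, 0x0E)       # 114 + 14 = 128, too large for signed 8 bits
unsigned_case = add8(0x0F, 0xF1)     # 15 + 241 = 256, too large for unsigned 8 bits
```

For 72H + 0EH the model reports OF = 1 and CF = 0; for 0FH + 0F1H it reports CF = 1, ZF = 1, and OF = 0. The program decides which flag to test (jc or jo) according to the interpretation it intends.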


Status Flags (cont’d)

» Related instructions

    jo     jump if overflow (jump if OF = 1)
    jno    jump if no overflow (jump if OF = 0)

» There is a special software interrupt instruction

    into   interrupt on overflow

– Details on this instruction in Chapter 20

• Uses of overflow flag

» Main use

– To detect out-of-range result on signed numbers


Status Flags (cont’d)

• Sign flag
  Indicates the sign of the result
    – Useful only when dealing with signed numbers
    – Simply a copy of the most significant bit of the result
  Examples
        mov   AL,15                     mov   AL,15
        add   AL,97                     sub   AL,97
    clears the sign flag as         sets the sign flag as
    the result is 112               the result is -82
    (or 01110000 in binary)         (or 10101110 in binary)
  Related instructions
        js    jump if sign (jump if SF = 1)
        jns   jump if no sign (jump if SF = 0)
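One possible use of jns, sketched under the assumption that AX holds a signed 16-bit value: computing its absolute value. The label name is hypothetical.

```asm
; Sketch: AX := |AX| for a signed 16-bit value
        or    AX,AX         ; sets SF from AX without changing AX
        jns   abs_done      ; SF = 0: already non-negative
        neg   AX            ; SF = 1: negate (8000H has no positive counterpart and stays 8000H)
abs_done:
```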


Status Flags (cont’d)

• Consider the count down loop:
        for (i = M downto 0)
            <loop body>
        end for
  The count down loop can be implemented as
        mov   CX,M
for_loop:
        <loop body>
        dec   CX
        jns   for_loop
• If we don't use the jns, we need cmp as shown below:
        cmp   CX,0
        jge   for_loop
• Usage of sign flag
  To test the sign of the result
  Also useful to efficiently implement countdown loops


Status Flags (cont’d)

• Auxiliary flag
  Indicates whether an operation produced a carry or borrow in the low-order 4 bits (nibble) of 8-, 16-, or 32-bit operands (i.e., operand size doesn't matter)
  Example (a carry of 1 comes out of the lower 4 bits)
        mov   AL,43         43D  = 0010 1011B
        add   AL,94         94D  = 0101 1110B
                            137D = 1000 1001B
    » As there is a carry from the lower nibble, auxiliary flag is set


Status Flags (cont’d)

Related instructions
    » No conditional jump instructions with this flag
    » Arithmetic operations on BCD numbers use this flag
        aaa   ASCII adjust for addition
        aas   ASCII adjust for subtraction
        aam   ASCII adjust for multiplication
        aad   ASCII adjust for division
        daa   Decimal adjust for addition
        das   Decimal adjust for subtraction
        – Appendix I has more details on these instructions
Usage
    » Main use is in performing arithmetic operations on BCD numbers
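A short sketch of how the auxiliary flag feeds a BCD adjustment, using packed BCD operands chosen for illustration:

```asm
; Sketch: packed BCD addition 29 + 18 = 47
        mov   AL,29H        ; packed BCD 29
        add   AL,18H        ; binary sum is 41H; AF = 1 (9 + 8 carries out of the low nibble)
        daa                 ; because AF = 1, daa adds 6: AL = 47H, the packed BCD sum
```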


Status Flags (cont’d)

• Parity flag
  Indicates even parity of the low 8 bits of the result
    – PF is set if the lower 8 bits contain an even number of 1 bits
    – For 16- and 32-bit values, only the least significant 8 bits are considered for computing the parity value
  Example
        mov   AL,53         53D  = 0011 0101B
        add   AL,89         89D  = 0101 1001B
                            142D = 1000 1110B
    » As the result has an even number of 1 bits, parity flag is set
  Related instructions
        jp    jump on even parity (jump if PF = 1)
        jnp   jump on odd parity (jump if PF = 0)


Status Flags (cont’d)

Usage of parity flag
    » Useful in writing data encoding programs
    » Example: Encodes the byte in AL (MSB is the parity bit)
parity_encode PROC
        shl   AL,1              ; discard the old MSB; PF reflects the remaining 7 bits
        jp    parity_zero
        stc                     ; CF = 1
        jmp   move_parity_bit
parity_zero:
        clc                     ; CF = 0
move_parity_bit:
        rcr   AL,1              ; rotate CF into the MSB
        ret
parity_encode ENDP


Arithmetic Instructions

• Pentium provides several arithmetic instructions that operate on 8-, 16- and 32-bit operands

» Addition: add, adc, inc

» Subtraction: sub, sbb, dec, neg, cmp

» Multiplication: mul, imul

» Division: div, idiv

» Related instructions: cbw, cwd, cdq, cwde, movsx, movzx

There are a few other instructions such as aaa, aas, etc. that operate on decimal numbers

» See Appendix I for details


Arithmetic Instructions (cont’d)

• Multiplication
  More complicated than add/sub
    » Produces double-length results
        – E.g., multiplying two 8-bit numbers produces a 16-bit result
    » Cannot use a single multiply instruction for signed and unsigned numbers
        – add and sub instructions work both on signed and unsigned numbers
        – For multiplication, we need separate instructions
              mul   for unsigned numbers
              imul  for signed numbers


Arithmetic Instructions (cont’d)

• Unsigned multiplication
        mul   source
    » Depending on the source operand size, the location of the other source operand and destination are selected


Arithmetic Instructions (cont’d)

Example
        mov   AL,10
        mov   DL,25
        mul   DL
  produces 250D in AX register (result fits in AL)

• The imul instruction can use the same syntax
    » Also supports other formats
Example
        mov   DL,0FFH       ; DL = -1
        mov   AL,0BEH       ; AL = -66
        imul  DL
  produces 66D in AX register (again, result fits in AL)
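Both examples above produce results that fit in the lower half. As a contrasting sketch, here is a 16-bit multiplication whose product spills into the upper half; with a 16-bit source, mul takes the other operand from AX and leaves the product in DX:AX. The label name is hypothetical.

```asm
; Sketch: 16-bit unsigned multiply whose product needs DX
        mov   AX,1000
        mov   CX,100
        mul   CX            ; DX:AX = 100000D = 000186A0H (DX = 0001H, AX = 86A0H)
        jnc   fits_in_ax    ; mul sets CF = OF = 1 when the upper half is nonzero
        ; ... handle the 32-bit product in DX:AX ...
fits_in_ax:
```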


Arithmetic Instructions (cont’d)

• Division instruction
  Even more complicated than multiplication
    » Produces two results
        – Quotient
        – Remainder
    » In multiplication, using a double-length register, there will not be any overflow
        – In division, divide overflow is possible
          Pentium provides a special software interrupt when a divide overflow occurs
  Two instructions as in multiplication
        div   source    for unsigned numbers
        idiv  source    for signed numbers


Arithmetic Instructions (cont’d)

• Dividend is twice the size of the divisor
• Dividend is assumed to be in
        AX (8-bit divisor)
        DX:AX (16-bit divisor)
        EDX:EAX (32-bit divisor)
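A hedged sketch of guarding an 8-bit unsigned division against divide overflow. It relies on the observation that the quotient of AX / CL exceeds 255 exactly when AH >= CL. The label names and the error handling are hypothetical.

```asm
; Sketch: unsigned AX / CL with a guard against divide overflow
        cmp   CL,0
        je    cannot_divide     ; division by zero
        cmp   AH,CL
        jae   cannot_divide     ; AH >= CL: quotient would not fit in AL
        div   CL                ; AL = quotient, AH = remainder
        jmp   SHORT div_done
cannot_divide:
        ; ... report the error ...
div_done:
```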


Arithmetic Instructions (cont’d)

• Example
        mov   AX,251
        mov   CL,12
        div   CL
  produces 20D in AL and 11D as remainder in AH
• Example
        sub   DX,DX         ; clear DX
        mov   AX,141BH      ; AX = 5147D
        mov   CX,012CH      ; CX = 300D
        div   CX
  produces 17D in AX and 47D as remainder in DX


Arithmetic Instructions (cont’d)

• Signed division requires some help
    » We extended an unsigned 16-bit number to 32 bits by placing zeros in the upper 16 bits
    » This will not work for signed numbers
        – To extend signed numbers, you have to copy the sign bit into those upper bit positions
  Pentium provides three instructions to aid sign extension
    » All three take no operands
        cbw   converts byte to word (extends AL into AH)
        cwd   converts word to doubleword (extends AX into DX)
        cdq   converts doubleword to quadword (extends EAX into EDX)


Arithmetic Instructions (cont’d)

Some additional related instructions
    » Sign extension
        cwde  converts word to doubleword (extends AX into EAX)
    » Two move instructions
        movsx dest,src      (move sign-extended src to dest)
        movzx dest,src      (move zero-extended src to dest)
    » For both move instructions, dest has to be a register
    » The src operand can be in a register or memory
        – If src is 8 bits, dest must be either a 16- or 32-bit register
        – If src is 16 bits, dest must be a 32-bit register
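To contrast the two move instructions, the sketch below widens the same byte both ways. It assumes a 386 or later processor, since movsx and movzx are not available on earlier ones.

```asm
; Sketch: one byte widened with sign extension and with zero extension
        mov   BL,0F0H       ; F0H = 240D unsigned, -16D signed
        movsx AX,BL         ; AX = FFF0H (-16D): sign bit copied into the upper byte
        movzx CX,BL         ; CX = 00F0H (240D): upper byte cleared
```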


Arithmetic Instructions (cont’d)

• Example
        mov   AL,-95
        cbw                 ; AH = FFH
        mov   CL,12
        idiv  CL
  produces -7D in AL and -11D as remainder in AH
• Example
        mov   AX,-5147
        cwd                 ; DX := FFFFH
        mov   CX,300
        idiv  CX
  produces -17D in AX and -47D as remainder in DX


Arithmetic Instructions (cont’d)

• Use of Shifts for Multiplication and Division
  Shifts are more efficient
  Example: Multiply AX by 32
        mov   CX,32
        imul  CX
    takes 12 clock cycles
  Using
        sal   AX,5
    takes just one clock cycle
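Shifts can also handle multipliers that are not powers of two by combining partial results. A minimal sketch for multiplying by 10, assuming BX is free to use as scratch and that overflow out of 16 bits can be ignored:

```asm
; Sketch: AX := AX * 10, computed as AX*8 + AX*2
        mov   BX,AX
        sal   AX,3          ; AX = 8 * original value
        sal   BX,1          ; BX = 2 * original value
        add   AX,BX         ; AX = 10 * original value
```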


Application Examples

• PutInt8 procedure
  To display a number, repeatedly divide it by 10 and display the remainders obtained

                   quotient    remainder
        108/10        10           8
        10/10          1           0
        1/10           0           1

  To display digits, they must be converted to their character form
    » This means simply adding the ASCII code for zero
        line 24:    add   AH,'0'
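The repeated-division idea can be sketched as follows. This is not the PutInt8 listing itself: to_digits is a hypothetical helper that handles only unsigned values and stores the digits least significant first (for 108 it stores '8', '0', '1'), so a caller would display them in reverse.

```asm
; Hypothetical helper: AL = unsigned number, DS:DI -> buffer of at least 3 bytes
; Returns the digit count in CX; AX, BL, and DI are modified
to_digits PROC
        mov   BL,10
        xor   CX,CX             ; digit count := 0
next_digit:
        xor   AH,AH             ; AX = current quotient, zero-extended
        div   BL                ; AL = quotient, AH = remainder
        add   AH,'0'            ; convert the remainder to its character form
        mov   [DI],AH
        inc   DI
        inc   CX
        cmp   AL,0
        jne   next_digit        ; stop when the quotient reaches 0
        ret
to_digits ENDP
```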


Application Examples (cont’d)

• GetInt8 procedure
  To read a number, read each digit character
    » Convert to its numeric equivalent
    » Multiply the running total by 10 and add this digit

        Input digit      Numeric value (N)    Number := Number*10 + N
        Initial value    --                   0
        '1'              1                    0 * 10 + 1 = 1
        '5'              5                    1 * 10 + 5 = 15
        '8'              8                    15 * 10 + 8 = 158
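The accumulation in the table can be sketched as below. This is not the GetInt8 listing: str_to_num is a hypothetical helper that assumes the digits are already in an ASCIIZ string and performs no sign handling, validity checks, or overflow checks.

```asm
; Hypothetical helper: DS:SI -> ASCIIZ string of decimal digits
; Returns the value in AX; BX, CX, DX, and SI are modified
str_to_num PROC
        xor   AX,AX             ; Number := 0
        mov   BX,10
        xor   CH,CH             ; CX will hold each digit value zero-extended
next_char:
        mov   CL,[SI]           ; next digit character
        cmp   CL,0
        je    conv_done         ; NULL ends the string
        sub   CL,'0'            ; numeric value N
        mul   BX                ; DX:AX := Number * 10 (DX is ignored)
        add   AX,CX             ; Number := Number*10 + N
        inc   SI
        jmp   next_char
conv_done:
        ret
str_to_num ENDP
```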


Indirect Jumps

• Direct jump
  Target address is encoded in the instruction itself
• Indirect jump
  Introduces a level of indirection
    » Address is specified either through memory or a general-purpose register
  Example
        jmp   CX
    jumps to the address in CX
  Address is absolute
    » Not relative as in direct jumps
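A minimal sketch of loading a target offset into a register before the indirect jump. The label target is hypothetical.

```asm
; Sketch: register-indirect jump to a label in the same code segment
        mov   CX,OFFSET target  ; CX := absolute offset of the label
        jmp   CX                ; IP := CX
        ; ... code here is skipped ...
target:
        ; execution continues here
```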


Indirect Jumps (cont’d)

switch (ch) {
    case '0':
        count[0]++; break;
    case '1':
        count[1]++; break;
    case '2':
        count[2]++; break;
    case '3':
        count[3]++; break;
    default:
        count[4]++;
}


Indirect Jumps (cont’d)

Turbo C assembly code for the switch statement
_main   PROC  NEAR
        . . .
        mov   AL,ch
        cbw
        sub   AX,48             ; 48 = ASCII for 0
        mov   BX,AX
        cmp   BX,3
        ja    default
        shl   BX,1              ; BX = BX * 2
        jmp   WORD PTR CS:jump_table[BX]    ; indirect jump


Indirect Jumps (cont’d)

case_0:     inc   WORD PTR [BP-10]
            jmp   SHORT end_switch
case_1:     inc   WORD PTR [BP-8]
            jmp   SHORT end_switch
case_2:     inc   WORD PTR [BP-6]
            jmp   SHORT end_switch
case_3:     inc   WORD PTR [BP-4]
            jmp   SHORT end_switch
default:    inc   WORD PTR [BP-2]
end_switch:
            . . .
_main   ENDP


Indirect Jumps (cont’d)

Jump table for the indirect jump
jump_table  LABEL WORD
            DW    case_0
            DW    case_1
            DW    case_2
            DW    case_3
            . . .
• Indirect jump uses this table to jump to the appropriate case routine
• The indirect jump instruction uses segment override prefix to refer to the jump_table in the CODE segment


Conditional Jumps

• Three types of conditional jumps
  Jumps based on the value of a single flag
    » Arithmetic flags such as zero, carry can be tested using these instructions
  Jumps based on unsigned comparisons
    » Operands of cmp instruction are treated as unsigned numbers
  Jumps based on signed comparisons
    » Operands of cmp instruction are treated as signed numbers
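The difference between the last two groups shows up when the same cmp result is tested both ways. In this sketch (label names hypothetical), FFH is 255 as an unsigned number but -1 as a signed number:

```asm
; Sketch: one cmp, two interpretations of AL = FFH
        mov   AL,0FFH           ; 255D unsigned, -1D signed
        cmp   AL,1
        jg    signed_greater    ; not taken: -1 > 1 is false
        ja    unsigned_above    ; taken: 255 > 1 is true (jg left the flags unchanged)
        jmp   SHORT cmp_done
signed_greater:
        ; ... not reached in this example ...
        jmp   SHORT cmp_done
unsigned_above:
        ; ... reached in this example ...
cmp_done:
```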


Jumps Based on Single Flags

Testing for zero
        jz      jump if zero          jumps if ZF = 1
        je      jump if equal         jumps if ZF = 1
        jnz     jump if not zero      jumps if ZF = 0
        jne     jump if not equal     jumps if ZF = 0
        jcxz    jump if CX = 0        jumps if CX = 0
                                      (Flags are not tested)


Jumps Based on Single Flags (cont’d)

Testing for carry
        jc      jump if carry           jumps if CF = 1
        jnc     jump if no carry        jumps if CF = 0
Testing for overflow
        jo      jump if overflow        jumps if OF = 1
        jno     jump if no overflow     jumps if OF = 0
Testing for sign
        js      jump if negative        jumps if SF = 1
        jns     jump if not negative    jumps if SF = 0


Jumps Based on Single Flags (cont’d)

Testing for parity
        jp      jump if parity            jumps if PF = 1
        jpe     jump if parity is even    jumps if PF = 1
        jnp     jump if not parity        jumps if PF = 0
        jpo     jump if parity is odd     jumps if PF = 0


Jumps Based on Unsigned Comparisons

        Mnemonic    Meaning                       Condition
        je          jump if equal                 ZF = 1
        jz          jump if zero                  ZF = 1
        jne         jump if not equal             ZF = 0
        jnz         jump if not zero              ZF = 0
        ja          jump if above                 CF = ZF = 0
        jnbe        jump if not below or equal    CF = ZF = 0


Jumps Based on Unsigned Comparisons

        Mnemonic    Meaning                       Condition
        jae         jump if above or equal        CF = 0
        jnb         jump if not below             CF = 0
        jb          jump if below                 CF = 1
        jnae        jump if not above or equal    CF = 1
        jbe         jump if below or equal        CF=1 or ZF=1
        jna         jump if not above             CF=1 or ZF=1


Jumps Based on Signed Comparisons

        Mnemonic    Meaning                       Condition
        je          jump if equal                 ZF = 1
        jz          jump if zero                  ZF = 1
        jne         jump if not equal             ZF = 0
        jnz         jump if not zero              ZF = 0
        jg          jump if greater               ZF=0 & SF=OF
        jnle        jump if not less or equal     ZF=0 & SF=OF


Jumps Based on Signed Comparisons (cont’d)

        Mnemonic    Meaning                         Condition
        jge         jump if greater or equal        SF = OF
        jnl         jump if not less                SF = OF
        jl          jump if less                    SF ≠ OF
        jnge        jump if not greater or equal    SF ≠ OF
        jle         jump if less or equal           ZF=1 or SF ≠ OF
        jng         jump if not greater             ZF=1 or SF ≠ OF


Implementing HLL Decision Structures

• High-level language decision structures can be implemented in a straightforward way
• See Section 12.4 for examples that implement
        if-then-else
        if-then-else with a relational operator
        if-then-else with logical operator AND
        if-then-else with logical operator OR
        while loop
        repeat-until loop
        for loops
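As one illustration (not taken from Section 12.4), an if-then-else with a relational operator might be translated as sketched below. The variables value1, value2, and bigger are hypothetical signed word variables, and the labels are likewise made up.

```asm
; Sketch of:  if (value1 > value2) bigger = value1; else bigger = value2;
        mov   AX,value1
        cmp   AX,value2
        jle   else_part         ; signed test: skip the then part if value1 <= value2
        mov   bigger,AX         ; then part
        jmp   SHORT end_if
else_part:
        mov   AX,value2
        mov   bigger,AX         ; else part
end_if:
```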


Logical Expressions in HLLs

• Representation of Boolean data
  Only a single bit is needed to represent Boolean data
  Usually a single byte is used
    » For example, in C
        – All zero bits represent false
        – A non-zero value represents true
• Logical expressions
  Logical instructions AND, OR, etc. are used
• Bit manipulation
  Logical, shift, and rotate instructions are used


Evaluation of Logical Expressions

• Two basic ways
  Full evaluation
    » Entire expression is evaluated before assigning a value
    » PASCAL uses full evaluation
  Partial evaluation
    » Assigns as soon as the final outcome is known without blindly evaluating the entire logical expression
    » Two rules help:
        – cond1 AND cond2
          If cond1 is false, no need to evaluate cond2
        – cond1 OR cond2
          If cond1 is true, no need to evaluate cond2


Evaluation of Logical Expressions (cont’d)

• Partial evaluation
  Used by C
• Useful in certain cases to avoid run-time errors
• Example
        if ((X > 0) AND (Y/X > 100))
  If X is 0, full evaluation results in divide error
  Partial evaluation will not evaluate (Y/X > 100) if X = 0
• Partial evaluation is used to test if a pointer value is NULL before accessing the data it points to
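A hedged sketch of how partial evaluation of the example condition might look in assembly, assuming X and Y are signed word variables (hypothetical names, as is the label):

```asm
; Sketch of:  if ((X > 0) AND (Y/X > 100)) <then-part>
        mov   CX,X
        cmp   CX,0
        jle   skip_then         ; X <= 0: (Y/X > 100) is never evaluated
        mov   AX,Y
        cwd                     ; sign-extend AX into DX:AX for idiv
        idiv  CX                ; AX := Y/X (X > 0 here, so no divide by zero)
        cmp   AX,100
        jle   skip_then
        ; ... then-part ...
skip_then:
```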


Bit Instructions

• Bit Test and Modify Instructions
  Four bit test instructions
  Each takes the position of the bit to be tested

        Instruction                       Effect on the selected bit
        bt  (Bit Test)                    No effect
        bts (Bit Test and Set)            selected bit ← 1
        btr (Bit Test and Reset)          selected bit ← 0
        btc (Bit Test and Complement)     selected bit ← NOT(selected bit)


Bit Instructions (cont’d)

• All four instructions have the same format
• We use bt to illustrate the format
        bt    operand,bit_pos
  operand is word or doubleword
    » Can be in a register or memory
  bit_pos indicates the position of the bit to be tested
    » Can be an immediate value or in a 16/32-bit register
• Instructions in this group affect only the carry flag
    » Other five flags are undefined
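The carry flag receives the original value of the selected bit, so jc/jnc can follow any of the four instructions. A small sketch, assuming a 386 or later processor and a hypothetical label:

```asm
; Sketch: test bit 3 of AX, then set bit 0
        mov   AX,0008H      ; only bit 3 is 1
        bt    AX,3          ; CF = 1 (bit 3 is set); AX is unchanged
        jnc   bit3_clear    ; not taken here
        bts   AX,0          ; CF = 0 (old value of bit 0); AX becomes 0009H
bit3_clear:
```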


Bit Scan Instructions

• These instructions scan the operand for a 1 bit and return the bit position in a register
• Two instructions
        bsf   dest_reg,operand      ; bit scan forward
        bsr   dest_reg,operand      ; bit scan reverse
    » operand can be a word or doubleword in a register or memory
    » dest_reg receives the bit position
        – Must be a 16- or 32-bit register
  Only ZF is updated (other five flags undefined)
        – ZF = 1 if all bits of operand are 0
        – ZF = 0 otherwise (position of first 1 bit in dest_reg)
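A brief sketch of the two scan directions on the same operand, assuming a 386 or later processor. The label is hypothetical.

```asm
; Sketch: lowest and highest 1 bit of AX
        mov   AX,0028H      ; 0000 0000 0010 1000B: bits 5 and 3 are 1
        bsf   BX,AX         ; scans upward from bit 0:    BX = 3, ZF = 0
        jz    no_bits_set   ; would be taken only if AX were 0
        bsr   CX,AX         ; scans downward from bit 15: CX = 5, ZF = 0
no_bits_set:
```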


Illustrative Examples

• Example 1
  Linear search of an integer array
• Example 2
  Selection sort on an integer array
• Example 3
  Multiplication using shift and add operations
    » Multiplies two unsigned 8-bit numbers
        – Uses a loop that iterates 8 times
• Example 4
  Multiplication using bit instructions
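The shift-and-add technique named in Example 3 can be sketched as below. This is not the book's listing: mult8 and its register conventions are assumptions made for illustration.

```asm
; Hypothetical helper: AL = multiplicand, BL = multiplier (both unsigned)
; Returns the 16-bit product in AX; BL, CX, and DX are modified
mult8   PROC
        xor   DX,DX             ; DX = running product
        xor   AH,AH             ; AX = multiplicand, zero-extended
        mov   CX,8              ; one iteration per multiplier bit
next_bit:
        shr   BL,1              ; lowest multiplier bit moves into CF
        jnc   skip_add
        add   DX,AX             ; bit was 1: add the shifted multiplicand
skip_add:
        shl   AX,1              ; multiplicand * 2 for the next bit position
        loop  next_bit
        mov   AX,DX
        ret
mult8   ENDP
```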


String Representation

• Two types
  Fixed-length
  Variable-length
• Fixed-length strings
  Each string uses the same length
    » Shorter strings are padded (e.g., by blank characters)
    » Longer strings are truncated
  Selection of string length is critical
    » Too large ==> inefficient
    » Too small ==> truncation of larger strings


String Representation (cont’d)

• Variable-length strings
  Avoids the pitfalls associated with fixed-length strings
• Two ways of representation
  Explicitly storing string length (used in PASCAL)
        string    DB   'Error message'
        str_len   DW   $-string
        – $ represents the current value of the location counter
          $ points to the byte after the last character of string
  Using a sentinel character (used in C)
    » Uses NULL character
        – Such NULL-terminated strings are called ASCIIZ strings


String Instructions

• Five string instructions
        LODS    LOaD String       source
        STOS    STOre String      destination
        MOVS    MOVe String       source & destination
        CMPS    CoMPare String    source & destination
        SCAS    SCAn String       destination
• Specifying operands
  32-bit segments:
        DS:ESI = source operand     ES:EDI = destination operand
  16-bit segments:
        DS:SI = source operand      ES:DI = destination operand


String Instructions (cont’d)

• Each string instruction
  Can operate on 8-, 16-, or 32-bit operands
  Updates index register(s) automatically
    » Byte operands: increment/decrement by 1
    » Word operands: increment/decrement by 2
    » Doubleword operands: increment/decrement by 4
• Direction flag
  DF = 0: Forward direction (increments index registers)
  DF = 1: Backward direction (decrements index registers)
• Two instructions to manipulate DF
        std   set direction flag (DF = 1)
        cld   clear direction flag (DF = 0)


Repetition Prefixes

• String instructions can be repeated by using a repetition prefix

• Two types
  Unconditional repetition
        rep             REPeat
  Conditional repetition
        repe/repz       REPeat while Equal
                        REPeat while Zero
        repne/repnz     REPeat while Not Equal
                        REPeat while Not Zero


Repetition Prefixes (cont’d)

rep
        while (CX ≠ 0)
            execute the string instruction
            CX := CX - 1
        end while
• CX register is first checked
  If zero, string instruction is not executed at all
  More like the JCXZ instruction


Repetition Prefixes (cont’d)

repe/repz
        while (CX ≠ 0)
            execute the string instruction
            CX := CX - 1
            if (ZF = 0)
            then
                exit loop
            end if
        end while
• Useful with cmps and scas string instructions


Repetition Prefixes (cont’d)

repne/repnz
        while (CX ≠ 0)
            execute the string instruction
            CX := CX - 1
            if (ZF = 1)
            then
                exit loop
            end if
        end while
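A common use of repne is scanning for a sentinel. The sketch below finds the length of an ASCIIZ string with scasb, which compares AL with the byte at ES:DI. It assumes the NULL lies within the first 65,535 bytes.

```asm
; Sketch: length of the ASCIIZ string at ES:DI, result in CX (NULL not counted)
        cld                     ; scan in the forward direction
        xor   AL,AL             ; AL = 0, the byte to look for
        mov   CX,0FFFFH         ; maximum repeat count
        repne scasb             ; stops once the NULL has been compared (ZF = 1)
        not   CX                ; CX = number of bytes scanned, NULL included
        dec   CX                ; CX = string length
```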


String Move Instructions

• Three basic instructions
  movs, lods, and stos
Move a string (movs)
• Format
        movs  dest_string,source_string
        movsb               ; operands are bytes
        movsw               ; operands are words
        movsd               ; operands are doublewords
• First form is not used frequently
  Source and destination are pointed to by DS:(E)SI and ES:(E)DI, respectively


String Move Instructions (cont’d)

movsb --- move a byte string
        ES:DI := (DS:SI)        ; copy a byte
        if (DF=0)               ; forward direction
        then
            SI := SI+1
            DI := DI+1
        else                    ; backward direction
            SI := SI-1
            DI := DI-1
        end if
        Flags affected: none


String Move Instructions (cont’d)

Example

    .DATA
    string1  DB   'The original string',0
    strLen   EQU  $ - string1
    string2  DB   80 DUP (?)
    .CODE
        .STARTUP
        mov   AX,DS            ; set up ES
        mov   ES,AX            ; to the data segment
        mov   CX,strLen        ; strLen includes NULL
        mov   SI,OFFSET string1
        mov   DI,OFFSET string2
        cld                    ; forward direction
        rep   movsb
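The effect of rep movsb in this example can be traced with a small Python model. It is only a sketch under stated assumptions: memory is one flat bytearray (DS and ES refer to the same segment, as in the example), and the helper names movsb and rep_movsb are illustrative.

```python
def movsb(mem, si, di, df):
    """One movsb step: copy a byte, then move SI and DI by one."""
    mem[di] = mem[si]                       # ES:DI := (DS:SI)
    delta = -1 if df else 1                 # DF=0 forward, DF=1 backward
    return (si + delta) & 0xFFFF, (di + delta) & 0xFFFF


def rep_movsb(mem, si, di, cx, df=0):
    """rep movsb: repeat movsb until CX reaches 0."""
    while cx != 0:
        si, di = movsb(mem, si, di, df)
        cx -= 1
    return si, di, cx


src = b"The original string\x00"            # string1, including the NULL
mem = bytearray(src) + bytearray(80)        # string1 followed by string2
string1, string2 = 0, len(src)              # offsets inside mem
si, di, cx = rep_movsb(mem, string1, string2, len(src))
copied = bytes(mem[string2:string2 + len(src)])
print(copied, si, di, cx)
```

After the copy, SI and DI point one byte past the end of the source and destination strings, and CX is 0.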


String Move Instructions (cont’d)

Load a String (LODS)
• Copies the value from the source string at DS:(E)SI to
    » AL (lodsb)
    » AX (lodsw)
    » EAX (lodsd)

• Repetition prefix does not make sense
    It leaves only the last value in AL, AX, or EAX register


String Move Instructions (cont’d)

lodsb --- load a byte string

    AL := (DS:SI)        ; copy a byte
    if (DF=0)            ; forward direction
    then
        SI := SI+1
    else                 ; backward direction
        SI := SI-1
    end if

Flags affected: none


String Move Instructions (cont’d)

Store a String (STOS)
• Performs the complementary operation
• Copies the value in
    » AL (stosb)
    » AX (stosw)
    » EAX (stosd)
  to the destination string at ES:(E)DI

• Repetition prefix can be used to initialize a block of memory


String Move Instructions (cont’d)

stosb --- store a byte string

    ES:DI := AL          ; copy a byte
    if (DF=0)            ; forward direction
    then
        DI := DI+1
    else                 ; backward direction
        DI := DI-1
    end if

Flags affected: none


String Move Instructions (cont’d)

Example: Initializes array1 with -1

    .DATA
    array1   DW   100 DUP (?)
    .CODE
        .STARTUP
        mov   AX,DS            ; set up ES
        mov   ES,AX            ; to the data segment
        mov   CX,100
        mov   DI,OFFSET array1
        mov   AX,-1
        cld                    ; forward direction
        rep   stosw
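A Python sketch of what rep stosw does to array1 in this example. The function name rep_stosw and the flat bytearray standing in for the data segment are assumptions made for illustration; words are stored little-endian, as on the Pentium.

```python
import struct


def rep_stosw(mem, di, ax, cx, df=0):
    """rep stosw: store AX at ES:DI, CX times, moving DI by 2 each time."""
    delta = -2 if df else 2                          # word-sized step
    while cx != 0:
        struct.pack_into("<H", mem, di, ax & 0xFFFF)  # ES:DI := AX
        di = (di + delta) & 0xFFFF
        cx -= 1
    return di, cx


mem = bytearray(200)                     # array1: 100 words, uninitialized
di, cx = rep_stosw(mem, 0, -1, 100)      # AX = -1, CX = 100, DF = 0
words = struct.unpack("<100h", mem)      # read back as signed 16-bit words
print(words[0], words[99], di, cx)
```

Every word holds FFFFH afterwards, which reads as -1 in two's complement, and DI points just past the end of the array.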


String Move Instructions (cont’d)

• In general, repeat prefixes are not useful with lods and stos (initializing a block of memory with rep stos is the exception)

• Used in a loop to do conversions while copying

        mov   CX,strLen
        mov   SI,OFFSET string1
        mov   DI,OFFSET string2
        cld                    ; forward direction
    loop1:
        lodsb
        or    AL,20H           ; set bit 5 (uppercase letter to lowercase)
        stosb
        loop  loop1
    done:
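The lodsb / or / stosb loop can be modelled in Python to see exactly which bytes the conversion changes. This is a sketch only: lower_copy is an illustrative name, the sample string is not from the slides, and strLen is assumed to include the terminating NULL as in the earlier movsb example.

```python
def lower_copy(mem, si, di, cx):
    """Model of the lodsb / or AL,20H / stosb / loop sequence with DF = 0."""
    while True:
        al = mem[si]               # lodsb: AL := (DS:SI)
        si += 1
        al |= 0x20                 # or AL,20H: force bit 5 on
        mem[di] = al               # stosb: ES:DI := AL
        di += 1
        cx = (cx - 1) & 0xFFFF     # loop: CX := CX-1 ...
        if cx == 0:                # ... and fall through when CX reaches 0
            break
    return si, di


src = b"Hello, World\x00"          # sample data, not from the slides
mem = bytearray(src) + bytearray(len(src))
lower_copy(mem, 0, len(src), len(src))
result = bytes(mem[len(src):])
print(result)
```

Setting bit 5 lowers uppercase letters and leaves lowercase letters, the comma, and the blank unchanged, but it also turns the NULL terminator (00H) into a blank (20H). Under the assumption above, a real routine would either exclude the NULL from the count or test each character before converting it.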


String Compare Instruction

cmpsb --- compare two byte strings

    Compare two bytes at DS:SI and ES:DI and set flags
    if (DF=0)            ; forward direction
    then
        SI := SI+1
        DI := DI+1
    else                 ; backward direction
        SI := SI-1
        DI := DI-1
    end if

Flags affected: As per cmp instruction (DS:SI)-(ES:DI)


String Compare Instruction (cont’d)

    .DATA
    string1  DB   'abcdfghi',0
    strLen   EQU  $ - string1
    string2  DB   'abcdefgh',0
    .CODE
        .STARTUP
        mov   AX,DS            ; set up ES
        mov   ES,AX            ; to the data segment
        mov   CX,strLen
        mov   SI,OFFSET string1
        mov   DI,OFFSET string2
        cld                    ; forward direction
        repe  cmpsb
        dec   SI
        dec   DI               ; leaves SI & DI pointing to the first characters that differ
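Tracing this example in Python shows where SI and DI end up. The sketch assumes a flat bytearray with string2 placed right after string1; repe_cmpsb is an illustrative name.

```python
def repe_cmpsb(mem, si, di, cx):
    """repe cmpsb with DF = 0: compare while the bytes are equal."""
    zf = None
    while cx != 0:
        zf = 1 if mem[si] == mem[di] else 0   # flags from (DS:SI)-(ES:DI)
        si += 1                               # pointers advance even on
        di += 1                               # the mismatching byte
        cx -= 1
        if zf == 0:
            break                             # first mismatch ends the loop
    return si, di, cx, zf


s1 = b"abcdfghi\x00"
s2 = b"abcdefgh\x00"
mem = bytearray(s1 + s2)
string1, string2 = 0, len(s1)
si, di, cx, zf = repe_cmpsb(mem, string1, string2, len(s1))
si -= 1                                       # dec SI
di -= 1                                       # dec DI
print(chr(mem[si]), chr(mem[di]), cx, zf)
```

Because the pointers are updated before the loop exits, the two dec instructions are needed to step back to the mismatching pair ('f' in string1, 'e' in string2).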


String Compare Instruction (cont’d)

    .DATA
    string1  DB   'abcdfghi',0
    strLen   EQU  $ - string1 - 1
    string2  DB   'abcdefgh',0
    .CODE
        .STARTUP
        mov   AX,DS            ; set up ES
        mov   ES,AX            ; to the data segment
        mov   CX,strLen
        mov   SI,OFFSET string1 + strLen - 1
        mov   DI,OFFSET string2 + strLen - 1
        std                    ; backward direction
        repne cmpsb
        inc   SI               ; leaves SI & DI pointing to the first character that matches
        inc   DI               ; in the backward direction
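The backward scan can be traced the same way. In this sketch (flat bytearray, illustrative name repne_cmpsb_backward) the pointers start at the last character of each string, since strLen here excludes the NULL.

```python
def repne_cmpsb_backward(mem, si, di, cx):
    """repne cmpsb with DF = 1: compare while the bytes differ."""
    zf = None
    while cx != 0:
        zf = 1 if mem[si] == mem[di] else 0   # flags from (DS:SI)-(ES:DI)
        si -= 1                               # DF = 1: move backward
        di -= 1
        cx -= 1
        if zf == 1:
            break                             # first match ends the loop
    return si, di, cx, zf


s1 = b"abcdfghi\x00"
s2 = b"abcdefgh\x00"
mem = bytearray(s1 + s2)
string1, string2 = 0, len(s1)
str_len = len(s1) - 1                         # strLen EQU $ - string1 - 1
si, di, cx, zf = repne_cmpsb_backward(
    mem, string1 + str_len - 1, string2 + str_len - 1, str_len)
si += 1                                       # inc SI
di += 1                                       # inc DI
print(chr(mem[si]), chr(mem[di]), cx, zf)
```

Scanning from the end, the pairs i/h, h/g, g/f, and f/e all differ; the first matching pair is d/d, which is where SI and DI point after the two inc instructions.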


String Scan Instruction

scasb --- scan a byte string

    Compare AL to the byte at ES:DI and set flags
    if (DF=0)            ; forward direction
    then
        DI := DI+1
    else                 ; backward direction
        DI := DI-1
    end if

Flags affected: As per cmp instruction AL-(ES:DI)

• scasw uses AX and scasd uses EAX registers instead of AL


String Scan Instruction (cont’d)

Example 1

    .DATA
    string1  DB   'abcdefgh',0
    strLen   EQU  $ - string1
    .CODE
        .STARTUP
        mov   AX,DS            ; set up ES
        mov   ES,AX            ; to the data segment
        mov   CX,strLen
        mov   DI,OFFSET string1
        mov   AL,'e'           ; character to be searched
        cld                    ; forward direction
        repne scasb
        dec   DI               ; leaves DI pointing to e in string1
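Example 1 amounts to a character search. A Python sketch of repne scasb follows, assuming a flat bytearray for the ES segment; the name repne_scasb is illustrative.

```python
def repne_scasb(mem, di, al, cx, df=0):
    """repne scasb: scan until a byte equal to AL is found or CX is 0."""
    delta = -1 if df else 1
    zf = None
    while cx != 0:
        zf = 1 if mem[di] == al else 0    # flags from AL-(ES:DI)
        di += delta
        cx -= 1
        if zf == 1:
            break                         # found the character
    return di, cx, zf


mem = bytearray(b"abcdefgh\x00")          # string1
di, cx, zf = repne_scasb(mem, 0, ord("e"), len(mem))
di -= 1                                   # dec DI
print(di, chr(mem[di]), cx, zf)
```

ZF tells the caller whether the search succeeded. If the character is absent, the loop ends with CX = 0 and ZF = 0, and DI-1 is then just the last byte examined rather than a match.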


String Scan Instruction (cont’d)

Example 2

    .DATA
    string1  DB   ' abc',0
    strLen   EQU  $ - string1
    .CODE
        .STARTUP
        mov   AX,DS            ; set up ES
        mov   ES,AX            ; to the data segment
        mov   CX,strLen
        mov   DI,OFFSET string1
        mov   AL,' '           ; character to be searched
        cld                    ; forward direction
        repe  scasb
        dec   DI               ; leaves DI pointing to the first non-blank character a
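Example 2 skips leading blanks. The Python sketch below models repe scasb and wraps it in a helper that also checks ZF; the names repe_scasb and first_non_blank are illustrative, and the sample string with three leading blanks is chosen for the demonstration rather than taken from the slide.

```python
def repe_scasb(mem, di, al, cx):
    """repe scasb with DF = 0: scan while the byte at ES:DI equals AL."""
    zf = None
    while cx != 0:
        zf = 1 if mem[di] == al else 0    # flags from AL-(ES:DI)
        di += 1
        cx -= 1
        if zf == 0:
            break                         # first byte that differs from AL
    return di, cx, zf


def first_non_blank(mem, start, length):
    """Offset of the first non-blank byte, or None if all are blanks."""
    di, cx, zf = repe_scasb(mem, start, ord(" "), length)
    return di - 1 if zf == 0 else None    # di - 1 plays the role of dec DI


mem = bytearray(b"   abc\x00")            # three leading blanks (sample)
pos = first_non_blank(mem, 0, len(mem))
print(pos, chr(mem[pos]))
```

Because strLen in the slide includes the NULL, the scan always meets a non-blank byte there; the ZF check only matters when the scanned range might contain nothing but blanks.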


Illustrative Examples

LDS and LES instructions
• String pointer can be loaded into DS/SI or ES/DI register pair by using lds or les instructions
• Syntax

    lds   register,source
    les   register,source

    register should be a 16-bit register
    source is a pointer to a 32-bit memory operand

• register is typically SI in lds and DI in les


Illustrative Examples (cont’d)

• Actions of lds and les

    lds
        register := (source)
        DS := (source+2)

    les
        register := (source)
        ES := (source+2)

• Pentium also supports lfs, lgs, and lss to load the other segment registers
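The actions of lds and les amount to reading two little-endian words from the 32-bit memory operand: the offset first, then the segment. A minimal Python sketch, with an illustrative helper name and made-up pointer values:

```python
import struct


def load_far_pointer(mem, source):
    """Model of lds/les: returns (register, segment) read from memory.

    register := word at source, segment := word at source+2.
    """
    offset, segment = struct.unpack_from("<HH", mem, source)
    return offset, segment


mem = bytearray(16)
# Store a sample far pointer 5678H:1234H at offset 4 (values are made up).
struct.pack_into("<HH", mem, 4, 0x1234, 0x5678)
si, ds = load_far_pointer(mem, 4)          # like: lds SI,source
print(hex(si), hex(ds))
```

The same helper models les when the results are taken as DI and ES.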


Illustrative Examples (cont’d)

• Seven popular string processing routines are given as examples in string.asm
    str_len
    str_mov
    str_cpy
    str_cat
    str_cmp
    str_chr
    str_cnv

Given in the text


Indirect Procedure Call

• Direct procedure calls specify the offset of the first instruction of the called procedure

• In an indirect procedure call, the offset is specified through memory or a register
    If BX contains a pointer to the procedure, we can use

        call  BX

    If the word in memory at target_proc_ptr contains the offset of the called procedure, we can use

        call  target_proc_ptr

• These are similar to direct and indirect jumps
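A common use of indirect calls is a table of procedure pointers indexed at run time. The Python sketch below is only an analogy: function references stand in for procedure offsets, and the names proc_table and call_indirect are illustrative.

```python
def add(a, b):
    return a + b


def sub(a, b):
    return a - b


# Plays the role of a table of procedure offsets in memory.
proc_table = [add, sub]


def call_indirect(index, a, b):
    target = proc_table[index]    # like loading the offset into BX
    return target(a, b)           # like: call BX


print(call_indirect(0, 7, 5), call_indirect(1, 7, 5))
```

The call site stays the same while the table entry decides which procedure runs, which is what distinguishes an indirect call from a direct one.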