58
204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

Embed Size (px)

Citation preview

Page 1: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 1

Lecture 3Instruction Set Architecture

Pradondet Nilagupta

Fall 2000(original notes from Prof. Mike Schulte)

Page 2: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 2

Overview ISA (I)

• Concentrate on ISA• Introduce wide variety of design alternative to

instruction set architecture– Focus on four topics

• Classification of instruction set alternative– Give some qualitative assessment of the advantage

and disadvantage of various approach• Present and analyze some instruction set measurement

that are largely independent of a specific instruction

Page 3: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 3

• Address the issue of a languages and compiler and their bearing on ISA

• Show how these idea are reflected in DLX instruction set, which is typical of recent instruction set architectures

• Examine a wide variety of architectural measurement– Measurements depend on the programs

measured and on the compiler used in making these measurements

Overview ISA (II)

Page 4: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 4

Hot Topics in Computer Architecture

• 1950s and 1960s:– Computer Arithmetic

• 1970 and 1980s: – Instruction Set Design– ISA Appropriate for Compilers

• 1990s: – Design of CPU– Design of memory system– Design of I/O system– Multiprocessors– Instruction Set Extensions

Page 5: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 5

Instruction Set Architecture

• Instruction set architecture is the structure of a computer that a machine language programmer must understand to write a correct (timing independent) program for that machine.

• The instruction set architecture is also the machine description that a hardware designer must understand to design a correct implementation of the computer.

Page 6: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 6

Instruction Set Architecture

• The instruction set architecture serves as the interface between software and hardware

instruction set

software

hardware

Page 7: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 7

Interface Design

• A good interface:

– Lasts through many implementations (portability, compatibility)

– Is used in many different ways (generality)

– Provides convenient functionality to higher levels

– Permits an efficient implementation at lower levels

Page 8: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 8

What Are the Components of an ISA?

• Sometimes known as The Programmer’s Model of the machine

• Storage cells– General and special purpose registers in the CPU– Many general purpose cells of same size in memory– Storage associated with I/O devices

• The machine instruction set– The instruction set is the entire repertoire of machine

operations– Makes use of storage cells, formats, and results of the

fetch/execute cycle– i.e., register transfers

Page 9: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 9

• The instruction format– Size and meaning of fields within the instruction

• The nature of the fetch-execute cycle– Things that are done before the operation code

is known

What Are the Components of an ISA?

Page 10: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 10

Programmer’s Models of Various Machines

216 bytes of main memorycapacity

Fewer than 100

instructions

7

15

A

216 – 1

B

IX

SP

PC

0

12 generalpurposeregisters

More than 300instructions

More than 250instructions

More than 120instructions

232 – 1

252 – 1

0

PSW

Status

R0

PC

R11

AP

FP

SP

0 31 0

32 64-bit

floating pointregisters

(introduced 1993)(introduced 1981)(introduced 1975) (introduced 1979)

0

31

0 63

32 32-bitgeneral purposeregisters

0

31

0 31

More than 50 32-bit special

purposeregisters

0 31

252 bytes of main mem orycapacity

0

M6800 VAX11 PPC601

220 – 1

AX

BX

CX

DX

SP

BP

SI

DI

15 7 08

IP

Status

Addressand

countregisters

CS

DS

SS

ES

M emorysegm entregisters

220 bytes of main memorycapacity

0

I8086

232 bytes of main mem orycapacity

Dataregisters

6 specialpurposeregisters

Page 11: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 11

• Which operation to perform add r0, r1, r3– Ans: Op code: add, load, branch, etc.

• Where to find the operand or operands add r0, r1, r3

– In CPU registers, memory cells, I/O locations, or part of instruction

• Place to store result add r0, r1, r3– Again CPU register or memory cell

What Must an Instruction Specify?(I)

Data Flow

Page 12: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 12

• Location of next instruction add r0, r1, r3 br endloop

– Almost always memory cell pointed to by program counter—PC

• Sometimes there is no operand, or no result, or no next instruction. Can you think of examples?

What Must an Instruction Specify?(II)

Page 13: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 13

Instructions Can Be Divided into 3 Classes (I)

• Data movement instructions– Move data from a memory location or register to

another memory location or register without changing its form

– Load—source is memory and destination is register– Store—source is register and destination is memory

• Arithmetic and logic (ALU) instructions– Change the form of one or more operands to

produce a result stored in another location– Add, Sub, Shift, etc.

Page 14: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 14

• Branch instructions (control flow instructions)– Alter the normal flow of control from executing

the next instruction in sequence– Br Loc, Brz Loc2,—unconditional or

conditional branches

Instructions Can Be Divided into 3 Classes (II)

Page 15: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 15

Examples of Data Movement Instructions

• Lots of variation, even with one instruction type

Instruction Meaning Machine

MOV A, B Move 16 bits from memory location A to VAX11 Location B

LDA A, Addr Load accumulator A with the byte at memory M6800 location Addr

lwz R3, A Move 32-bit data from memory location A to PPC601 register R3

li $3, 455 Load the 32-bit integer 455 into register $3 MIPS R3000

mov R4, dout Move 16-bit data from R4 to output port dout DEC PDP11

IN, AL, KBD Load a byte from in port KBD to accumulator Intel Pentium

LEA.L (A0), A2 Load the address pointed to by A0 into A2 M6800

Page 16: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 16

Examples of ALUInstructions

Instruction Meaning Machine

MULF A, B, C multiply the 32-bit floating point values at VAX11mem loc’ns. A and B, store at C

nabs r3, r1 Store abs value of r1 in r3 PPC601

ori $2, $1, 255 Store logical OR of reg $ 1 with 255 into reg $2 MIPS R3000

DEC R2 Decrement the 16-bit value stored in reg R2 DEC PDP11

SHL AX, 4 Shift the 16-bit value in reg AX left by 4 bit pos’ns. Intel 8086

• Notice again the complete dissimilarity of both syntax and semantics.

Page 17: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 17

Examples of Branch Instructions

Instruction Meaning Machine

BLSS A, Tgt Branch to address Tgt if the least significant VAX11bit of mem loc’n. A is set (i.e. = 1)

bun r2 Branch to location in R2 if result of previous PPC601floating point computation was Not a Number (NAN)

beq $2, $1, 32 Branch to location (PC + 4 + 32) if contents MIPS R3000of $1 and $2 are equal

SOB R4, Loop Decrement R4 and branch to Loop if R4 น 0 DEC PDP11

JCXZ Addr Jump to Addr if contents of register CX น 0. Intel 8086

Page 18: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 18

ISA Metrics• Orthogonality

– No special registers, few special cases, all operand modes available with any data type or instruction type

• Completeness– Support for a wide range of operations and target

applications• Regularity

– No overloading for the meanings of instruction fields• Streamlined

– Resource needs easily determined• Ease of compilation (programming?), Ease of

implementation, Scalability

Page 19: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 19

Instruction Set Design Issues

• Instruction set design issues include:– Where are operands stored?

• registers, memory, stack, accumulator– How many explicit operands are there?

• 0, 1, 2, or 3 – How is the operand location specified?

• register, immediate, indirect, . . . – What type & size of operands are supported?

• byte, int, float, double, string, vector. . .– What operations are supported?

• add, sub, mul, move, compare . . .

Page 20: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 20

Evolution of Instruction SetsSingle Accumulator (EDSAC 1950)

Accumulator + Index Registers(Manchester Mark I, IBM 700 series 1953)

Separation of Programming Model from Implementation

High-level Language Based Concept of a Family(B5000 1963) (IBM 360 1964)

General Purpose Register Machines

Complex Instruction Sets Load/Store Architecture

RISC

(Vax, Intel 8086 1977-80) (CDC 6600, Cray 1 1963-76)

(Mips,Sparc,88000,IBM RS6000, . . .1987+)

Page 21: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 21

Evolution of Instruction Sets

• Major advances in computer architecture are typically associated with landmark instruction set designs– Ex: Stack VS. GPR (System 360)

• Design decisions must take into account:– technology– machine organization– programming languages– compiler technology– operating systems

• The design decisions in turn influence these.

Page 22: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 22

Classifying ISAsAccumulator (before 1960):

1 address add A acc นacc + mem[A]

Stack (1960s to 1970s):0 address add tos นtos + next

Memory-Memory (1970s to 1980s):2 address add A, B mem[A] นmem[A] + mem[B]3 address add A, B, C mem[A] นmem[B] + mem[C]

Register-Memory (1970s to present): 2 address add R1, A R1 นR1 + mem[A]

load R1, A R1 นmem[A]

Register-Register (Load/Store) (1960s to present):3 address add R1, R2, R3 R1 นR2 + R3

load R1, R2 R1 นmem[R2]store R1, R2 mem[R1] นR2

Page 23: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 23

Ex. Expression Evaluation for 3-, 2-, 1-, and 0-Address Machines

• Number of instructions & number of addresses both vary• Discuss as examples: size of code in each case

3 - a d d r e s s 2 - a d d r e s s 1 - a d d r e s s S t a c k

add a, b, cmpy a, a, dsub a, a, e

load a, badd a, cmpy a, dsub a, e

load badd cmpy dsub estore a

push bpush caddpush dmpypush esubpop a

Evaluat e a = (b+c) *d - e

Page 24: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 24

Stack Architectures• Instruction set:

add, sub, mult, div, . . .

push A, pop A• Example: A*B - (A+C*B)

push A

push B

mul

push A

push C

push B

mul

add

sub

A BA

A*BA*B

A*BA*B

AAC

A*BA A*B

A C B B*C A+B*C result

Page 25: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 25

The 0-Address, or Stack, Machine and Instruction Format

Memory

Op1Addr:

TOS

SOS

etc.

Op1

Programcounter

NextiAddr: Nexti

Bits:

Format

Format

8 24

CPU

Where to findnext instruction

Stack

24

push Op1 (TOS น Op1)

Instruction formats

add (TOS น TOS + SOS)

push Op1Addr

Operation

Bits: 8

add

Which operation

Result

W here to find operands, and where to put result

(on the stack)

Page 26: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 26

Stacks: Pros and Cons• Pros

– Good code density (implicite top of stack)

– Low hardware requirements

– Easy to write a simpler compiler for stack architectures

• Cons

– Stack becomes the bottleneck

– Little ability for parallelism or pipelining

– Data is not always at the top of stack when need, so additional instructions like TOP and SWAP are needed

– Difficult to write an optimizing compiler for stack architectures

Page 27: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 27

Accumulator Architectures• Instruction set:

add A, sub A, mult A, div A, . . .

load A, store A

• Example: A*B - (A+C*B)load B

mul C

add A

store D

load A

mul B

sub D

B B*C A+B*C AA+B*C A*B result

Page 28: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 28

1-Address Machine and Instruction Format

• Special CPU register, the accumulator, supplies 1 operand and stores result

• One memory address used for other operand

Need instructions to load and store operands:LDA OpAddrSTA OpAddr

Memory

Op1Addr: Op1

NextiProgramcounter

Accumulator

NextiAddr:

CPU

Where to findnext instruction

24

add Op1 (Acc น Acc + Op1)

Bits: 8 24

Instruction format

add Op1Addr

Whichoperation

Where to find operand1

Where to find operand2, and

where to put result

Page 29: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 29

Accumulators: Pros and Cons

• Pros– Very low hardware requirements

– Easy to design and understand

• Cons– Accumulator becomes the bottleneck

– Little ability for parallelism or pipelining

– High memory traffic

Page 30: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 30

Memory-Memory Architectures

• Instruction set: (3 operands) add A, B, C sub A, B, C mul A, B, C

(2 operands) add A, B sub A, B mul A, B

• Example: A*B - (A+C*B)– 3 operands 2 operands

mul D, A, B mov D, A

mul E, C, B mul D, B

add E, A, E mov E, C

sub E, D, E mul E, B

add E, A

sub E, D

Page 31: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 31

The 2-Address Machine and Instruction Format

• Result overwrites Operand 2• Needs only 2 addresses in instruction but less

choice in placing data

M em ory

O p1Addr:

O p2Addr:

Op1

Programcounter

Op2,Res

N extiNextiAddr:

CPU

W here to findnext instruction

24

add Op2, Op1 (O p2 น O p2 + Op1)

B its: 8 24 24

Instruction format

add Op2Addr Op1Addr

W hichoperation

W here toput result

W here to find operands

Page 32: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 32

Memory-Memory:Pros and Cons

• Pros

– Requires fewer instructions (especially if 3 operands)

– Easy to write compilers for (especially if 3 operands)

• Cons

– Very high memory traffic (especially if 3 operands)

– Variable number of clocks per instruction

– With two operands, more data movements are required

Page 33: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 33

Register-Memory Architectures• Instruction set:

add R1, A sub R1, A mul R1, B

load R1, A store R1, A

• Example: A*B - (A+C*B)load R1, A

mul R1, B /* A*B */

store R1, D

load R2, C

mul R2, B /* C*B */

add R2, A /* A + CB */

sub R2, D /* AB - (A + C*B) */

Page 34: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 34

Memory-Register: Pros and Cons

• Pros

– Some data can be accessed without loading first

– Instruction format easy to encode

– Good code density

• Cons

– Operands are not equivalent (poor orthorganality)

– Variable number of clocks per instruction

– May limit number of registers

Page 35: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 35

Load-Store Architectures• Instruction set:

add R1, R2, R3 sub R1, R2, R3 mul R1, R2, R3load R1, R4 store R1, R4

• Example: A*B - (A+C*B)load R1, &Aload R2, &Bload R3, &Cload R4, R1load R5, R2load R6, R3mul R7, R6, R5 /* C*B */add R8, R7, R4 /* A + C*B */mul R9, R4, R5 /* A*B */sub R10, R9, R8 /* A*B - (A+C*B) */

Page 36: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 36

The 3-Address Machine and Instruction format

• Address of next instruction kept in processor state register—the PC (except for explicit branches/jumps)

• Rest of addresses in instruction– Discuss: savings in instruction word size

add, Res, Op1, Op2 (Res น Op2 + Op1)

Op1Addr:

Op2Addr:

Op1

Programcounter

Op2

ResAddr:

NextiAddr:

Res

Nexti

Where to findnext instruction

24Bits: 8 24 24

Instruction format

24

add ResAddr Op1Addr Op2Addr

Whichoperation

Where toput result Where to find operands

Memory CPU

Page 37: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 37

Load-Store: Pros and Cons

• Pros

– Simple, fixed length instruction encoding

– Instructions take similar number of cycles

– Relatively easy to pipeline

• Cons

– Higher instruction count

– Not all instructions need three operands

– Dependent on good compiler

Page 38: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 38

Registers:Advantages and Disadvantages

• Advantages

– Faster than cache (no addressing mode or tags)

– Deterministic (no misses)

– Can replicate (multiple read ports)

– Short identifier (typically 3 to 8 bits)

– Reduce memory traffic

• Disadvantages

– Need to save and restore on procedure calls and context switch

– Can’t take the address of a register (for pointers)

– Fixed size (can’t store strings or structures efficiently)

– Compiler must manage

Page 39: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 39

General Register Machine and Instruction Formats

Memory

Op1Addr: Op1load

Nexti Programcounter

load R8, Op1 (R8 น Op1)

CPU

Registers

R8

R6

R4

R2

Instruction formats

R8load Op1Addr

add R2, R4, R6 (R2 น R4 + R6)

R2add R6R4

Page 40: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 40

• It is the most common choice in today’s general-purpose computers

• Which register is specified by small “address” (3 to 6 bits for 8 to 64 registers)

• Load and store have one long & one short address: 1- addresses

• Arithmetic instruction has 3 “half” addresses

General Register Machine and Instruction Formats

Page 41: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 41

Real Machines Are Not So Simple

• Most real machines have a mixture of 3, 2, 1, 0, and 1- address instructions

• A distinction can be made on whether arithmetic instructions use data from memory

• If ALU instructions only use registers for operands and result, machine type is load-store

– Only load and store instructions reference memory

• Other machines have a mix of register-memory and memory-memory instructions

Page 42: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 42

Big Endian Addressing

• With Big Endian addressing, the byte binary address

x . . . x00

is in the most significant position (big end) of a 32 bit word (IBM, Motorola, Sun, HP).

MSB LSB0 1 2 34 5 6 7

Page 43: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 43

Little Endian Addressing

• With Little Endian addressing, the byte binary address

x . . . x00

is in the least significant position (little end) of a 32 bit word (DEC, Intel).

MSB LSB3 2 1 07 6 5 4

Page 44: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 44

Operand Alignment

• An access to an operand of size s bytes at byte address A is said to be aligned if

A mod s = 0

40 41 42 43 44D0 D1 D2 D3

D0 D1 D2 D3

Page 45: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 45

Unrestricted Alignment

• If the architecture does not restrict memory accesses to be aligned then

– Software is simple

– Hardware must detect misalignment and make 2 memory accesses

– Expensive detection logic is required

– All references can be made slower

• Sometimes unrestricted alignment is required for backwards compatibility

Page 46: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 46

Restricted Alignment

• If the architecture restricts memory accesses to be aligned then

– Software must guarantee alignment

– Hardware detects misalignment access and traps

– No extra time is spent when data is aligned

• Since we want to make the common case fast, having restricted alignment is often a better choice, unless compatibility is an issue.

Page 47: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 47

Types of Addressing Modes (VAX)

1. Register direct Ri

2. Immediate (literal) #n

3. Displacement M[Ri + #n]

4. Register indirect M[Ri]

5. Indexed M[Ri + Rj]

6. Direct (absolute) M[#n]

7. Memory Indirect M[M[Ri] ]

8. Autoincrement M[Ri++]

9. Autodecrement M[Ri - -]

10. Scaled M[Ri + Rj*d + #n]

• Studies by [Clark and Emer] indicate that modes 1-4 account for 93% of all operands on the VAX.

memory

reg. file

Page 48: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 48

Frequency of Immediate Addressing on DLX

Operation SPECint92 SPECfp92Loads 10% 45%

Compares 87% 77%ALU ops 58% 78%

Overall 35% 10%

• Not all instructions can take advantage of immediate addressing.

Page 49: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 49

Types of Operations

• Arithmetic and Logic: AND, ADD• Data Transfer: MOVE, LOAD, STORE• Control BRANCH, JUMP, CALL• System OS CALL, VM • Floating Point ADDF, MULF, DIVF• Decimal ADDD, CONVERT• String MOVE, COMPARE• Graphics (DE)COMPRESS

Page 50: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 50

80x86 Instruction Frequency Rank Instruction Frequency

1 load 22%2 branch 20%3 compare 16%4 store 12%5 add 8%6 and 6%7 sub 5%8 register move 4%

9

9 call 1%10 return 1%

Total 96%

Page 51: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 51

Relative Frequency of Control Instructions

Operation SPECint92 SPECfp92Call/Return 13% 11%

Jumps 6% 4%Branches 81% 87%

• Design hardware to handle branches quickly, since these occur most frequently

Page 52: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 52

Frequency of Operand Sizeson 32-bit Load-Store Machine

Size SPECint92 SPECfp9264 bits 0% 69%32 bits 74% 31%16 bits 19% 0%

8 bits 19% 0%

• For floating-point want good performance for 64 bit operands. • For integer operations want good performance for 32 bit operands.

Page 53: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 53

Encoding an Instruction set

• a desire to have as many registers and addressing mode as possible

• the impact of size of register and addressing mode fields on the average instruction size and hence on the average program size

• a desire to have instruction encode into lengths that will be easy to handle in the implementation

Page 54: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 54

Three choice for encoding the instruction set

• Variable– Instruction length varies based on opcode and address

specifiers– For example, VAX instructions vary between 1 and 53

bytes– Good code density, but difficult to decode

• Fixed– Only a single size for all instructions– For example, DLX, MIPS, Power PC, Sparc all have 32 bit

instructions– Not as good code density, but easier to decode

• Hybrid– Have multiple format lengths specified by the opcode– For example, IBM 360/370 and Intel 80x86– Compromise between code density and ease of decode

Page 55: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 55

Compilers and ISA

• Compiler Goals

– All correct programs compile correctly

– Most compiled programs execute quickly

– Most programs compile quickly

– Achieve small code size

– Provide debugging support

• Multiple Source Compilers

– Same compiler can compiler different languages

• Multiple Target Compilers

– Same compiler can generate code for different machines

Page 56: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 56

Compilers Phases

• Compilers use phases to manage complexity– Front end

• Convert language to intermediate form– High level optimizer

• Procedure inlining and loop transformations– Global optimizer

• Global and local optimization, plus register allocation

– Code generator (and assembler)• Dependency elimination, instruction selection,

pipeline scheduling

Page 57: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 57

Allocation of Variables

• Stack – used to allocate local variables– grown and shrunk on procedure calls and returns– register allocation works best for stack-allocated objects

• Global data area– used to allocate global variables and constants– many of these objects are arrays or large data structures– impossible to allocate to registers if they are aliased

• Heap– used to allocate dynamic objects– heap objects are accessed with pointers– never allocated to registers

Page 58: 204521 Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)

204521 Digital System Design 58

Designing ISA to Improve Compilation

• Provide enough general purpose registers to ease register allocation ( more than 16).

• Provide regular instruction sets by keeping the operations, data types, and addressing modes orthogonal.

• Provide primitive constructs rather than trying to map to a high-level language.

• Simplify trade-off among alternatives. • Allow compilers to help make the common case

fast.