49
INSTRUCTION SET ARCHITECTURE 1

03 isa computer architecture

Embed Size (px)

Citation preview

Page 1: 03 isa computer architecture

INSTRUCTION SET ARCHITECTURE

1

Page 2: 03 isa computer architecture

Instruction Set Architecture (ISA)

2

What is an ISA? A functional contract

All ISAs similar in high-level ways But many design choices in

details Two “philosophies”:

CISC/RISC Difference is blurring

Good ISA… Enables high-performance At least doesn’t get in the way

Compatibility is a powerful force Tricks: binary translation,

µISAs

Application

OS

Firmware

Compiler

CPU I/O

Memory

Digital Circuits

Gates & Transistors

Page 3: 03 isa computer architecture

CISC vs RISC3

CISC RISC

Emphasis on hardware Emphasis on software

Includes multi-clockcomplex instructions

Single-clock,reduced instruction only

Memory-to-memory:"LOAD" and "STORE"incorporated in instructions

Register to register:"LOAD" and "STORE"are independent instructions

Transistors used for storingcomplex instructions

Spends more transistorson memory registers

Read article CISC vs RISC placed in generalshare folder of computer architecture

Page 4: 03 isa computer architecture

Program Compilation4

Program written in a “high-level” programming language C, C++, Java, C# structured control: loops, functions, conditionals structured data: scalars, arrays, pointers, structures

Compiler: translates program to assembly Parsing and straight-forward translation Compiler also optimizes Compiler itself another application … who compiled

compiler?

CPUMem I/O

System softwareAppApp App

int array[100], sum;

void array_sum() {

for (int i=0; i<100;i++) {

sum += array[i];

}

}

Page 5: 03 isa computer architecture

Assembly & Machine Language5

Assembly language Human-readable representation

Machine language Machine-readable representation 1s and 0s (often displayed in “hex”)

Assembler Translates assembly to machine

CPUMem I/O

System softwareAppApp App

Example is in “LC4” a toy instruction set architecture, or ISA

Page 6: 03 isa computer architecture

Example Assembly Language & ISA

6

MIPS: example of real ISA 32/64-bit operations 32-bit instructions 32/64 registers

32 integer, 64 floating point ~100 different instructions

CPUMem I/O

System softwareAppApp App .datadata

aarray: .space 100

sum: .word 0sum: .word 0

.text.text

array_sum:array_sum:

li $5, 0li $5, 0

la $1, array

la $2, sumla $2, sum

array_sum_loop:array_sum_loop:

lw $3, 0($1)

lw $4, 0($2)lw $4, 0($2)

add $4, $3, $4

sw $4, 0($2)sw $4, 0($2)

addi $1, $1, 1

addi $5, $5, 1addi $5, $5, 1

li $6, 100

blt $5, $6, array_sum_loopblt $5, $6, array_sum_loop

Example code is MIPS, but all ISAs are pretty similar

Page 7: 03 isa computer architecture

Instruction Execution Model7

A computer is just a finite state machine Registers (few of them, but fast) Memory (lots of memory, but slower) Program counter (next insn to execute)

Called “instruction pointer” in x86 A computer executes instructions

Fetches next instruction from memory Decodes it (figure out what it does) Reads its inputs (registers & memory) Executes it (adds, multiply, etc.) Write its outputs (registers & memory) Next insn (adjust the program counter)

Program is just “data in memory” Makes computers programmable

(“universal”)

CPUMem I/O

System softwareAppApp App

FetchFetchDecodeDecodeRead InputsRead InputsExecuteExecuteWrite OutputWrite OutputNext InsnNext Insn

Instruction = Insn

Page 8: 03 isa computer architecture

Designing a Computer8

Control

Datapath MemoryCentral Processing

Unit (CPU)or “processor”

Input

Output

FIVE PIECES OF HARDWARE

Page 9: 03 isa computer architecture

Start by Defining ISA9

What is instruction set architecture (ISA)? ISA

Defines registers Defines data transfer modes (instructions) between

registers, memory and I/O There should be s uffic ie nt instructions to efficiently

translate any program for machine processing Next, define instruction set format – binary

representation used by the hardware Variable-length vs. fixed-length instructions

Page 10: 03 isa computer architecture

Types of ISA10

Complex instruction set computer (CISC) Many instructions (several hundreds) An instruction takes many cycles to execute Example: Intel Pentium

Reduced instruction set computer (RISC) Small set of instructions (typically 32) Simple instructions, each executes in one clock

cycle –Well, almost. Effective use of pipelining Example: ARM, MIPS

Page 11: 03 isa computer architecture

On Two Types of ISA11

Brad Smith, “ARM and Intel Battle over the Mobile Chip’s Future,” Computer, vol. 41, no. 5, pp. 15-18, May 2008.

Compare 3Ps: Performance Power consumption Price

Page 12: 03 isa computer architecture

Pipelining of RISC Instructions12

FetchInstruction

DecodeOpcode

FetchOperands

ExecuteOperation

StoreResult

Although an instruction takes five clock cycles,one instruction can be completed every cycle.

Page 13: 03 isa computer architecture

MIPS Instruction Set (RISC)13

Instructions execute simple functions. Maintain regularity of format – each instruction

is one word, contains o p c o d e and a rg um e nts . Minimize memory accesses – whenever

possible use registers as arguments. Three types of instructions:

Register (R)-type – only registers as arguments. Immediate (I)-type – arguments are registers and

numbers (constants or memory addresses). Jump (J)-type – argument is an address.

Page 14: 03 isa computer architecture

MIPS Arithmetic Instructions14

All instructions have 3 operands Operand order is fixed (destination first)

Example:

C code: a = b + c;

MIPS ‘code’: add a, b, c

“The natural number of operands for an operation like addition is three… requiring every instruction to have exactly three operands conforms to the philosophy of keeping the hardware simple”

2004 © Morgan Kaufman Publishers

Page 15: 03 isa computer architecture

Arithmetic Instr. (Continued)15

Design Principle: simplicity favors regularity. Of course this complicates some things...

C code: a = b + c + d;

MIPS code: add a, b, cadd a, a, d

Operands must be registers 32 registers provided Each register contains 32 bits

Page 16: 03 isa computer architecture

Registers vs. Memory16

Arithmetic instructions operands must be registers 32 registers provided

Compiler associates variables with registers. What about programs with lots of variables? Mus t us e

m e m o ry .

Processor I/O

Control

Datapath

Memory

Input

Output

Page 17: 03 isa computer architecture

Memory Organization17

Viewed as a large, single-dimension array, with an address.

A memory address is an index into the array. "Byte addressing" means that the index points to a byte

of memory.

2004 © Morgan Kaufman Publishers

32 bit word

32 bit word

32 bit word

32 bit word

.

.

.

8 bits of data 8 bits of data 8 bits of data 8 bits of data

8 bits of data 8 bits of data 8 bits of data 8 bits of data

8 bits of data 8 bits of data

8 bits of data

8 bits of data

8 bits of data

8 bits of data 8 bits of data

8 bits of data

8 bits of data

8 bits of data

8 bits of data

8 bits of data

Byte 0 byte 1 byte 2 byte 3

byte 4 byte 10

Page 18: 03 isa computer architecture

Memory Organization18

Bytes are nice, but most data items use larger "words" For MIPS, a word contains 32 bits or 4 bytes.

word addresses

232 bytes with addresses from 0 to 232 – 1 230 words with addresses 0, 4, 8, ... 232 – 4 Words are aligned

i.e., what are the least 2 significant bits of a word address?

...

Registers hold 32 bits of data

Use 32 bit address

2004 © Morgan Kaufman Publishers

0

4

8

12

.

32 bits of data

32 bits of data

32 bits of data

32 bits of data

32 bits of data

Page 19: 03 isa computer architecture

Instructions19

Load and store instructions Example:

C code: A[12] = h + A[8];

MIPS code: lw $t0, 32($s3) #addr of A in reg s3add $t0, $s2, $t0 #h in reg s2sw $t0, 48($s3)

Can refer to registers by name (e.g., $s2, $t2) instead of number Store word has destination last Remember arithmetic operands are registers, not memory!

Can’t write: add 48($s3), $s2, 32($s3)

2004 © Morgan Kaufman Publishers

Page 20: 03 isa computer architecture

Our First Example20

Can we figure out the code of subroutine?

Initially, k is in reg 5; base address of v is in reg 4; return addr is in reg 31

swap(int v[], int k);{ int temp;

temp = v[k]v[k] = v[k+1];v[k+1] = temp;

}

swap:sll $2, $5, 2add $2, $4, $2lw $15, 0($2)lw $16, 4($2)sw $16, 0($2)sw $15, 4($2)jr $31

Page 21: 03 isa computer architecture

What Happens?21

When the program reaches “call swap” statement: Jump to swap routine

Registers 4 and 5 contain the arguments (register convention) Register 31 contains the return address (register convention)

Swap two words in memory Jump back to return address to continue rest of the

program

.

.call swap...

return address

Page 22: 03 isa computer architecture

Memory and Registers22

Word 0

Word 1

Word 2

v[0] (Word n)

048

12 .

4n . . .

4n+4k.

v[1] (Word n+1)

Register 0

Register 1

Register 2

Register 3

Register 4

Register 31

Register 5

v[k] (Word n+k)

4n

k

Memorybyte addr.

Ret. addr.v[k+1] (Word n+k+1)

.

.

Page 23: 03 isa computer architecture

Our First Example23

Now figure out the code:

swap(int v[], int k);{ int temp;

temp = v[k]v[k] = v[k+1];v[k+1] = temp;

}

swap:sll $2, $5, 2add $2, $4, $2lw $15, 0($2)lw $16, 4($2)sw $16, 0($2)sw $15, 4($2)jr $31

2004 © Morgan Kaufman Publishers

Page 24: 03 isa computer architecture

So Far We’ve Learned24

MIPS— loading words but addressing bytes— arithmetic on registers only

Instruction Meaning

add $s1, $s2, $s3 $s1 = $s2 + $s3sub $s1, $s2, $s3 $s1 = $s2 – $s3lw $s1, 100($s2) $s1 = Memory[$s2+100]

sw $s1, 100($s2) Memory[$s2+100] = $s1

2004 © Morgan Kaufman Publishers

Page 25: 03 isa computer architecture

Stored Program Concept25

Instructions are bits Programs are stored in memory

to be read or written just like data

Fetch and Execute Cycles Instructions are fetched and put into a special register Opcode bits in the register "control" the subsequent actions Fetch the “next” instruction and continue

Processor Memorymemory for data, programs,

compilers, editors, etc.

Page 26: 03 isa computer architecture

Control26

Decision making instructions alter the control flow, i.e., change the "next" instruction to be executed

MIPS conditional branch instructions:

bne $t0, $t1, Label beq $t0, $t1, Label

Example: if (i==j) h = i + j;

bne $s0, $s1, Labeladd $s3, $s0, $s1

Label: ....

2004 © Morgan Kaufman Publishers

Page 27: 03 isa computer architecture

Control27

MIPS unconditional branch instructions:j label

Example:if (i!=j) beq $s4, $s5, Lab1 h=i+j; add $s3, $s4, $s5else j Lab2 h=i-j; Lab1: sub $s3,

$s4, $s5Lab2: ...

Can you build a simple fo r loop?

Page 28: 03 isa computer architecture

So Far We’ve Learned28

Instruction Meaningadd $s1,$s2,$s3 $s1 = $s2 + $s3sub $s1,$s2,$s3 $s1 = $s2 – $s3lw $s1,100($s2) $s1 = Memory[$s2+100] sw $s1,100($s2) Memory[$s2+100] = $s1bne $s4,$s5,Label Next instr. is at Label if

$s4 ≠ $s5beq $s4,$s5,Label Next instr. is at Label if

$s4 = $s5j Label Next instr. is at Label

Formats:OP

OP

OP

rs rt rd sa funct

rs rt immediate

R Format

I Format

6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

6 bits 5 bits 5 bits 16 bits

J Format6 bits 26 bits

jump target

Page 29: 03 isa computer architecture

Three Ways to Jump: j, jr, jal29

j ins tr # jump to machine instruction ins tr (unconditional jump)

jr $ra # jump to address in register ra (used by calee to go back to caller)

jal a ddr # set $ra = PC+4 and go to a ddr (jump and link; used to jump to a procedure)

Page 30: 03 isa computer architecture

Control Flow30

We have: beq, bne, what about Branch-if-less-than? New instruction:

if $s1 < $s2 then $t0 = 1

slt $t0, $s1, $s2 else $t0 = 0

Can use this instruction to build new “pseudoinstruction”blt $s1, $s2, Label

Note that the assembler needs a register to do this,— there are policy of use conventions for registers

2004 © Morgan Kaufman Publishers

Page 31: 03 isa computer architecture

Pseudoinstructions31

blt $s1, $s2, reladdr Assembler converts to:

slt $1, $s1, $s2bne $1, $zero, reladdr

Other pseudoinstructions: bgt, ble, bge, li, move Not implemented in hardware Assembler expands pseudoinstructions into

machine instructions Register 1, called $at, is reserved for converting

pseudoinstructions into machine code.

Page 32: 03 isa computer architecture

Constants32

Small constants are used quite frequently (50% of operands) e.g., A = A + 5;

B = B + 1;C = C – 18;

Solutions? Why not? put 'typical constants' in memory and load them. create hard-wired registers (like $zero) for constants like one.

MIPS Instructions: addi $29, $29, 4

slti $8, $18, 10andi $29, $29, 6ori $29, $29, 4

Design Principle: Make the common case fast. Which fo rm a t?2004 © Morgan Kaufman Publishers

Page 33: 03 isa computer architecture

How About Larger Constants?33

We'd like to be able to load a 32 bit constant into a register Must use two instructions, new "load upper immediate" instruction

lui $t0, 1010101010101010

Then must get the lower order bits right, i.e.,ori $t0, $t0, 1010101010101010

1010101010101010 0000000000000000

0000000000000000 1010101010101010

1010101010101010 1010101010101010

ori

1010101010101010 0000000000000000

filled with zeros

2004 © Morgan Kaufman Publishers

Page 34: 03 isa computer architecture

Assembly Language vs. Machine Language

34

Assembly provides convenient symbolic representation much easier than writing down numbers e.g., destination first

Machine language is the underlying reality e.g., destination is no longer first

Assembly can provide 'pseudoinstructions' e.g., “move $t0, $t1” exists only in Assembly implemented using “add $t0, $t1, $zero”

When considering performance you should count real instructions and clock cycles

2004 © Morgan Kaufman Publishers

Page 35: 03 isa computer architecture

Overview of MIPS35

simple instructions, all 32 bits wide very structured, no unnecessary baggage only three instruction formats

rely on compiler to achieve performance

R-type

I-type

J-type

2004 © Morgan Kaufman Publishers

Page 36: 03 isa computer architecture

Addresses in Branches and Jumps36

Instructions:bne $t4, $t5, Label Next instruction is at Label

if $t4 ≠ $t5

beq $t4, $t5, Label Next instruction is at Label if $t4 = $t5

j Label Next instruction is at Label Formats:

op rs rt 16 bit rel. address

op 26 bit absolute address

I

J

Page 37: 03 isa computer architecture

Addresses in Branches37

Instructions:bne $t4,$t5,Label Next instruction is at Label if $t4 ≠ $t5beq $t4,$t5,Label Next instruction is at Label if $t4 = $t5

Formats: – 215 to 215 – 1 ~ ±32 Kwords

Relative addressing 226 = 64 Mwords with respect to PC (program counter) most branches are local (principle of locality)

Jump instruction just uses high order bits of PC address boundaries of 256 MBytes (maximum jump 64 Mwords)

op rs rt 16 bit address

op 26 bit address

Page 38: 03 isa computer architecture

Example: Loop in C38

while ( save[i] == k ) i += 1;

Given a value for k, set i to the index of element in array save [ ] that does not equal k.

Page 39: 03 isa computer architecture

Alternative Architectures39

Design alternative: provide more powerful operations goal is to reduce number of instructions executed danger is a slower cycle time and/or a higher CPI

Let’s look (briefly) at IA-32

–“The path toward operation complexity is thus fraught with peril.

To avoid these problems, designers have moved toward simpler

instructions”

Page 40: 03 isa computer architecture

IA–32 (a.k.a. x86)40

1978: The Intel 8086 is announced (16 bit architecture) 1980: The 8087 floating point coprocessor is added 1982: The 80286 increases address space to 24 bits,

+instructions 1985: The 80386 extends to 32 bits, new addressing modes 1989-1995: The 80486, Pentium, Pentium Pro add a few instructions

(mostly designed for higher performance) 1997: 57 new “MMX” instructions are added, Pentium II 1999: The Pentium III added another 70 instructions (SSE –

streaming SIMD extensions) 2001: Another 144 instructions (SSE2) 2003: AMD extends the architecture to increase address space to

64 bits, widens all registers to 64 bits and makes other changes (AMD64)

2004: Intel capitulates and embraces AMD64 (calls it EM64T) and adds more media extensions

“This his to ry illus tra te s the im p a c t o f the “g o ld e n ha ndcuffs ” o f c o m p a tibility :“a dd ing ne w fe a ture s a s s o m e o ne m ig ht a dd c lo thing to a p a cke d ba g ”“a n a rchite c ture tha t is d ifficult to e x p la in a nd im p o s s ible to lo ve ”

Page 41: 03 isa computer architecture

IA-32 Overview41

Complexity: Instructions from 1 to 17 bytes long one operand must act as both a source and destination one operand can come from memory complex addressing modes

e.g., “base or scaled index with 8 or 32 bit displacement” Saving grace:

the most frequently used instructions are not too difficult to build compilers avoid the portions of the architecture that are slow

“wha t the x 8 6 la c ks in s ty le is m a d e up in q ua ntity , m a king it be a utiful fro m the rig ht p e rs p e c tive ”

Page 42: 03 isa computer architecture

IA-32 Register Restrictions42

Fourteen major registers. Eight 32-bit general purpose registers.

ESP or EBP cannot contain memory address. ESP cannot contain displacement from base address. . . .

Page 43: 03 isa computer architecture

IA-32 Typical Instructions43

Four major types of integer instructions: Data movement including move, push, pop Arithmetic and logical (destination register or memory) Control flow (use of condition codes / flags ) String instructions, including string move and string compare

Page 44: 03 isa computer architecture

Some IA-32 Instructions44

PUSH 5-bit opcode, 3-bit register operand

JE 4-bit opcode, 4-bit condition, 8-bit jump offset

5-b | 3-b

4-b | 4-b | 8-b

Page 45: 03 isa computer architecture

Some IA-32 Instructions45

MOV 6-bit opcode, 8-bit register/mode*, 8-bit offset

XOR 8-bit opcode, 8-bit reg/mode*, 8-bit base, 8-b index

6-b |d|w| 8-b | 8-b

bit indicates move to or from memory

bit indicates byte or double word operation

8-b | 8-b | 8-b | 8-b

Page 46: 03 isa computer architecture

Some IA-32 Instructions46

ADD 4-bit opcode, 3-bit register, 32-bit immediate

TEST 7-bit opcode, 8-bit reg/mode, 32-bit immediate

4-b | 3-b |w| 32-b

7-b |w| 8-b | 32-b

Page 47: 03 isa computer architecture

Additional References47

IA-32, IA-64 (CISC) A. S. Tanenbaum, Struc ture d Co m p ute r O rg a niz a tio n, Fifth

Ed itio n, Upper Saddle River, New Jersey: Pearson Prentice-Hall, 2006, Chapter 5.

ARM (RISC) D. Seal, ARM Archite c ture Re fe re nc e Ma nua l, Se c o nd Ed itio n ,

Addison-Wesley Professional, 2000. SPARC (Scalable Processor Architecture) PowerPC

V. C. Hamacher, Z. G. Vranesic and S. G. Zaky, Computer Organization, Fourth Edition, New York: McGraw-Hill, 1996.

Page 48: 03 isa computer architecture

Instruction Complexity48

Increasing instruction complexityPro

gra

m s

ize

in m

ach

ine

inst

ruct

ion

s (P

)

Av.

exe

cutio

n tim

e pe

r in

stru

ctio

n (

T)

PT

P×T

Page 49: 03 isa computer architecture

Summary49

Instruction complexity is only one variable lower instruction count vs. higher CPI / lower clock rate

– we will s e e p e rfo rm anc e m e a s ure s la te r Design Principles:

simplicity favors regularity smaller is faster good design demands compromise make the common case fast

Instruction set architecture a very important abstraction indeed!