New ECE369 Chapter 2 · 2008. 9. 15.

ECE369

Chapter 2

Instruction Set Architecture

• A very important abstraction
  – interface between hardware and low-level software
  – standardizes instructions, machine language bit patterns, etc.
  – advantage: different implementations of the same architecture
  – disadvantage: sometimes prevents using new innovations

• Modern instruction set architectures:
  – IA-32, PowerPC, MIPS, SPARC, ARM, and others

Instructions:

• Language of the Machine
• We’ll be working with the MIPS instruction set architecture
  – Simple instruction set and format
  – Similar to other architectures developed since the 1980s
  – Almost 100 million MIPS processors manufactured in 2002
  – used by NEC, Nintendo, Cisco, Silicon Graphics, Sony, …

[Figure: chart for the years 1998 to 2002 (vertical axis 0 to 1400), broken down by architecture: Other, SPARC, Hitachi SH, PowerPC, Motorola 68K, MIPS, IA-32, ARM]

MIPS arithmetic

• All instructions have 3 operands
• Operand order is fixed (destination first)

Example:

C code: a = b + c

MIPS ‘code’: add a, b, c

(we’ll talk about registers in a bit)

“The natural number of operands for an operation like addition is three… requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple”

MIPS arithmetic

• Design Principle: simplicity favors regularity.
• Of course this complicates some things...

C code: a = b + c + d;

MIPS code: add a, b, c
           add a, a, d

• Operands must be registers; only 32 registers are provided
• Each register contains 32 bits

• Design Principle: smaller is faster. Why?

Registers vs. Memory

[Figure: block diagram with Processor (Control, Datapath), Memory, and I/O (Input, Output)]

• Arithmetic instruction operands must be registers
  – only 32 registers provided

• Compiler associates variables with registers
• What about programs with lots of variables?

Memory Organization

• Viewed as a large, single-dimension array, with an address.
• A memory address is an index into the array.
• "Byte addressing" means that the index points to a byte of memory.

Address  Contents
0        8 bits of data
1        8 bits of data
2        8 bits of data
3        8 bits of data
4        8 bits of data
5        8 bits of data
6        8 bits of data
...

Memory Organization

• Bytes are nice, but most data items use larger "words"
• For MIPS, a word is 32 bits or 4 bytes.

• 2^32 bytes with byte addresses from 0 to 2^32 - 1
• 2^30 words with byte addresses 0, 4, 8, ... 2^32 - 4
• Words are aligned,
  i.e., what are the 2 least significant bits of a word address?

Address  Contents
0        32 bits of data
4        32 bits of data
8        32 bits of data
12       32 bits of data
...

Registers hold 32 bits of data
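The byte and word counts on this slide, and the alignment question, can be checked with a short Python sketch. The constant and function names here are illustrative assumptions, not part of the slides.

```python
# Sketch of the memory-size arithmetic for a 32-bit byte-addressed machine.
# Names are hypothetical, chosen only for this illustration.
ADDRESS_BITS = 32
WORD_BYTES = 4

num_bytes = 2 ** ADDRESS_BITS            # byte addresses 0 .. 2^32 - 1
num_words = num_bytes // WORD_BYTES      # 2^30 words
last_word_address = num_bytes - WORD_BYTES

def is_word_aligned(address):
    """A word address is a multiple of 4, so its 2 least significant bits are 00."""
    return address & 0b11 == 0

print(num_words == 2 ** 30)                      # True
print(is_word_aligned(8), is_word_aligned(6))    # True False
```

The mask `0b11` picks out exactly the two low bits the slide asks about.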

Instructions

• Load and store instructions
• Example:

C code: A[12] = h + A[8];    # $s3 stores base address of A and $s2 stores h

MIPS code: lw  $t0, 32($s3)
           add $t0, $s2, $t0
           sw  $t0, 48($s3)

• Can refer to registers by name (e.g., $s2, $t2) instead of number
• Store word has destination last
• Remember arithmetic operands are registers, not memory!

Can’t write: add 48($s3), $s2, 32($s3)
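The load/add/store sequence for A[12] = h + A[8] can be mimicked in a few lines of Python. This is only a sketch: the dictionary-based `memory`, the base address 1000, the value of `h`, and the helper names `lw` and `sw` are assumptions made for illustration.

```python
# Sketch of A[12] = h + A[8] with byte addresses and 4-byte words.
# The memory model, base address, and helper names are illustrative assumptions.
WORD_BYTES = 4

def lw(memory, base, offset):
    """Load the word stored at byte address base + offset."""
    return memory[base + offset]

def sw(memory, value, base, offset):
    """Store value as the word at byte address base + offset."""
    memory[base + offset] = value

base_A = 1000                        # assumed contents of $s3
h = 7                                # assumed contents of $s2
# Word i of A lives at byte address base_A + 4*i; here A[i] = 10*i.
memory = {base_A + WORD_BYTES * i: 10 * i for i in range(16)}

t0 = lw(memory, base_A, 32)          # lw  $t0, 32($s3)   reads A[8]
t0 = h + t0                          # add $t0, $s2, $t0
sw(memory, t0, base_A, 48)           # sw  $t0, 48($s3)   writes A[12]

print(memory[base_A + 48])           # 87
```

The offsets 32 and 48 are just 4*8 and 4*12: the index is scaled by the word size because memory is byte addressed.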

Instructions

• Example:

C code: g = h + A[i];

# $s3 stores base address of A and
# g, h and i in $s1, $s2 and $s4

add $t1, $s4, $s4    # t1 = 2*i
add $t1, $t1, $t1    # t1 = 4*i
add $t1, $t1, $s3    # t1 = 4*i + s3
lw  $t0, 0($t1)      # t0 = A[i]
add $s1, $s2, $t0    # g = h + A[i]
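The same address computation, written as a Python sketch. The function names and the sample memory contents are hypothetical; only the sequence of additions mirrors the slide.

```python
# Sketch: form the byte address of A[i] using additions only, as in the MIPS code.
def element_address(base, i):
    t1 = i + i          # add $t1, $s4, $s4   -> 2*i
    t1 = t1 + t1        # add $t1, $t1, $t1   -> 4*i
    t1 = t1 + base      # add $t1, $t1, $s3   -> 4*i + base
    return t1

def g_from_h_and_A(memory, base, h, i):
    t0 = memory[element_address(base, i)]   # lw  $t0, 0($t1)
    return h + t0                           # add $s1, $s2, $t0

# Assumed example data: A starts at byte address 2000 and A[n] = n*n.
memory = {2000 + 4 * n: n * n for n in range(8)}
print(element_address(2000, 5))             # 2020
print(g_from_h_and_A(memory, 2000, 10, 5))  # 35
```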

So far we’ve learned:

• MIPS
  – loading words but addressing bytes
  – arithmetic on registers only

• Instruction            Meaning

  add $s1, $s2, $s3      $s1 = $s2 + $s3
  sub $s1, $s2, $s3      $s1 = $s2 - $s3
  lw  $s1, 100($s2)      $s1 = Memory[$s2+100]
  sw  $s1, 100($s2)      Memory[$s2+100] = $s1

Policy of Use Conventions

Name      Register number  Usage
$zero     0                the constant value 0
$v0-$v1   2-3              values for results and expression evaluation
$a0-$a3   4-7              arguments
$t0-$t7   8-15             temporaries
$s0-$s7   16-23            saved
$t8-$t9   24-25            more temporaries
$gp       28               global pointer
$sp       29               stack pointer
$fp       30               frame pointer
$ra       31               return address

Register 1 ($at) reserved for assembler, 26-27 for operating system

MIPS Format

• Instructions, like registers and words
  – are also 32 bits long
  – add $t1, $s1, $s2
  – Registers: $t1=9, $s1=17, $s2=18

• Instruction Format:

  000000  10001  10010  01001  00000  100000
  op      rs     rt     rd     shamt  funct

Fetch, Decode, Execute, PC !!
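As a sketch of how the six fields pack into one 32-bit word, the following Python assembles add $t1, $s1, $s2 by shifting each field into place. The function name `encode_r` is an assumption for illustration; the field widths (6, 5, 5, 5, 5, 6 bits) follow the format shown on the slide.

```python
# Sketch: pack the R-type fields op | rs | rt | rd | shamt | funct into 32 bits.
def encode_r(op, rs, rt, rd, shamt, funct):
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t1, $s1, $s2  with  $t1=9, $s1=17, $s2=18, op=0, funct=32
word = encode_r(op=0, rs=17, rt=18, rd=9, shamt=0, funct=32)

print(f"{word:032b}")   # 00000010001100100100100000100000
print(f"{word:#010x}")  # 0x02324820
```

Note that the destination register $t1 is written first in assembly but sits in the rd field, fourth in the machine word.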

Our Goal

add $t1, $s1, $s2 ($t1=9, $s1=17, $s2=18)
  – 000000 10001 10010 01001 00000 100000

MIPS Format

• How about
  – lw $s1, 100($s2)

Machine Language

• Consider the load-word and store-word instructions,
  – What would the regularity principle have us do?
  – New principle: Good design demands a compromise

• Introduce a new type of instruction format
  – I-type for data transfer instructions
  – other format was R-type for register

• Example: lw $t0, 32($s2)

  35  18  8   32
  op  rs  rt  16 bit number

• Where's the compromise?

Summary

A[300] = h + A[300]    # $t1 = base address of A, $s2 stores h
                       # use $t0 for temporary register

lw  $t0, 1200($t1)
add $t0, $s2, $t0
sw  $t0, 1200($t1)

instruction  format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     na
sub          R       0   reg  reg  reg  0      34     na
lw           I       35  reg  reg  na   na     na     address
sw           I       43  reg  reg  na   na     na     address

lw   op, rs, rt, address             35, 9, 8, 1200
add  op, rs, rt, rd, shamt, funct    0, 18, 8, 8, 0, 32
sw   op, rs, rt, address             43, 9, 8, 1200
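The three field lists for A[300] = h + A[300] can be turned into machine words with a small Python sketch. The function names `encode_i` and `encode_r` are assumptions for illustration; the field values are the ones listed on the slide.

```python
# Sketch: encode the lw / add / sw sequence for A[300] = h + A[300].
def encode_i(op, rs, rt, address):
    """I-type: op (6 bits) | rs (5) | rt (5) | 16-bit address."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

def encode_r(op, rs, rt, rd, shamt, funct):
    """R-type: op (6 bits) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

lw_word = encode_i(35, 9, 8, 1200)         # lw  $t0, 1200($t1)
add_word = encode_r(0, 18, 8, 8, 0, 32)    # add $t0, $s2, $t0
sw_word = encode_i(43, 9, 8, 1200)         # sw  $t0, 1200($t1)

for word in (lw_word, add_word, sw_word):
    print(f"{word:#010x}")
# 0x8d2804b0
# 0x02484020
# 0xad2804b0
```

The lw and sw words differ only in the opcode field, which is the point of sharing one I-type layout.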

Summary of Instructions We Have Seen So Far

Shift and Logical Operations

Summary of New Instructions

Control Instructions

Using If-Else

$s0 = f, $s1 = g, $s2 = h, $s3 = i, $s4 = j, $s5 = k

Where are 0, 1, 2, 3 stored?

Addresses in Branches

• Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5

• Formats:

  I    op  rs  rt  16 bit address

• What if the “Label” is too far away (16 bit address is not enough)?

Addresses in Branches and Jumps

• Instructions:
  bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5
  beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
  j Label              Next instruction is at Label

• Formats:

  I    op  rs  rt  16 bit address
  J    op  26 bit address
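The slides do not spell out how the 16-bit and 26-bit fields become full addresses. The Python sketch below assumes the standard MIPS rules (branches are relative to PC + 4 with a sign-extended word offset; jumps keep the upper 4 bits of PC + 4), which go beyond what the slides state. The function names are hypothetical.

```python
# Sketch, assuming standard MIPS address formation (not stated on the slides).
def sign_extend_16(value):
    """Interpret a 16-bit field as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, imm16):
    """beq/bne: target = PC + 4 + (signed 16-bit word offset * 4)."""
    return (pc + 4 + (sign_extend_16(imm16) << 2)) & 0xFFFFFFFF

def jump_target(pc, addr26):
    """j: target = upper 4 bits of PC + 4, then the 26-bit field * 4."""
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)

print(hex(branch_target(0x00400000, 3)))       # 0x400010
print(hex(branch_target(0x00400010, 0xFFFF)))  # 0x400010 (offset -1 word)
print(hex(jump_target(0x00400000, 0x100000)))  # 0x400000
```

Under these assumptions a branch reaches roughly 2^15 words in either direction, which is why a far-away label needs a jump instead.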

Control Flow

• We have: beq, bne, what about Branch-if-less-than?

  if (a < b)    # a in $s0, b in $s1

  slt $t0, $s0, $s1       # $t0 gets 1 if a < b
  bne $t0, $zero, Less    # go to Less if $t0 is not 0

Combination of slt and bne implements branch on less than.

While Loop

while (save[i] == k)    # i, j and k correspond to registers
    i = i + j;          # $s3, $s4 and $s5

# array base address at $s6

Loop: add $t1, $s3, $s3     # $t1 = 2*i
      add $t1, $t1, $t1     # $t1 = 4*i
      add $t1, $t1, $s6     # $t1 = address of save[i]
      lw  $t0, 0($t1)       # $t0 = save[i]
      bne $t0, $s5, Exit    # leave the loop if save[i] != k
      add $s3, $s3, $s4     # i = i + j
      j   Loop              # repeat
Exit:
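To see how the test at the top of the assembly loop maps back to the C `while`, here is a Python sketch that follows the MIPS control flow step by step. The function name and the sample array are assumptions for illustration.

```python
# Sketch: the while loop written to mirror the MIPS control flow.
def run_while_loop(save, i, j, k):
    while True:
        t0 = save[i]        # add, add, add, lw: $t0 = save[i]
        if t0 != k:         # bne $t0, $s5, Exit
            break
        i = i + j           # add $s3, $s3, $s4
        # j Loop: fall through to the next pass of the Python loop
    return i                # Exit: i is the first index where save[i] != k

print(run_while_loop([5, 5, 5, 2], i=0, j=1, k=5))   # 3
```

The sample input is chosen so the loop terminates inside the array; like the assembly, the sketch does no bounds checking.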

What does this code do?

Overview of MIPS

• simple instructions all 32 bits wide
• very structured, no unnecessary baggage
• only three instruction formats

  R    op  rs  rt  rd  shamt  funct
  I    op  rs  rt  16 bit address
  J    op  26 bit address

Policy of Use Conventions

[Table: Name, Register number, Usage]

Register 1 ($at) reserved for assembler, 26-27 for operating system

What is the use of $ra?

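Example

swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}

Let $s5 store the value of k and $s4 store the base address of v[]

swap: muli $s2, $s5, 4      # $s2 = k * 4 (muli as written on the slide; sll $s2, $s5, 2 also multiplies by 4)
      add  $s2, $s4, $s2    # $s2 = address of v[k]
      lw   $t0, 0($s2)      # $t0 = v[k]
      lw   $t1, 4($s2)      # $t1 = v[k+1]
      sw   $t1, 0($s2)      # v[k] = old v[k+1]
      sw   $t0, 4($s2)      # v[k+1] = old v[k]
      jr   $31              # return ($31 is $ra)

Byte addressing problem???

swap: muli $s2, $s5, 4
      add  $s2, $s4, $s2
      :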

Summary

Other Issues

• More reading:
  support for procedures
  linkers, loaders, memory layout
  stacks, frames, recursion
  manipulating strings and pointers
  interrupts and exceptions
  system calls and conventions

• Some of these we'll talk more about later
• We have already talked about compiler optimizations

Nested Procedures

function_main(){
    function_a(var_x);    /* passes argument using $a0 */
    :                     /* function is called with “jal” instruction */
    return;
}

function_a(int size){
    function_b(var_y);    /* passes argument using $a0 */
    :                     /* function is called with “jal” instruction */
    return;
}

function_b(int count){
    :
    return;
}

Resource Conflicts ???

Function Call and Stack Pointer

jr

Recursive Procedures Invoke Clones !!!

int fact (int n) {
    if (n < 1)
        return (1);
    else
        return (n * fact(n-1));
}

“n” corresponds to $a0

Program starts with the label of the procedure “fact”

How many registers do we need to save on the stack?

Registers $a0 and $ra

Factorial Code

fact: addi $sp, $sp, -8      # make room for 2 items
      sw   $ra, 4($sp)       # save return address
      sw   $a0, 0($sp)       # save argument n
      slti $t0, $a0, 1       # is n<1?
      beq  $t0, $zero, L1    # if not go to L1

      addi $v0, $zero, 1     # return value is 1
      addi $sp, $sp, 8       # pop 2 items
      jr   $ra               # return to caller

L1:   addi $a0, $a0, -1      # argument becomes n-1
      jal  fact              # call fact(n-1)
      lw   $a0, 0($sp)       # restore “n”
      lw   $ra, 4($sp)       # restore address
      addi $sp, $sp, 8       # pop 2 items
      mul  $v0, $a0, $v0     # return n*fact(n-1)
      jr   $ra               # return to caller

Stack for n=3: Ra, 3, Ra, 2, Ra, 1
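The way each call pushes its return address and argument can be traced with a Python sketch. The explicit `stack` list, the `"Ra"` placeholder, and the `snapshots` log are assumptions for illustration; a real stack holds addresses and grows toward lower memory.

```python
# Sketch: factorial with an explicit stack that records what each call saves.
def fact(n, stack, snapshots):
    stack.append("Ra")                 # sw $ra, 4($sp)
    stack.append(n)                    # sw $a0, 0($sp)
    snapshots.append(list(stack))      # record the stack after this push
    if n < 1:                          # slti / beq
        result = 1
    else:
        result = n * fact(n - 1, stack, snapshots)   # jal fact, then mul
    stack.pop()                        # pop 2 items
    stack.pop()
    return result

stack, snapshots = [], []
print(fact(3, stack, snapshots))       # 6
print(snapshots[-1])                   # ['Ra', 3, 'Ra', 2, 'Ra', 1, 'Ra', 0]
```

The deepest snapshot also contains the frame for the base case n = 0, which pushes before it tests n.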

What is the Use of Frame Pointer?

Variables local to procedure do not fit in registers !!!

Elaboration

[Table: Name, Register number, Usage]

What if there are more than 4 parameters for a function call?
  Addressable via frame pointer
  References to variables in the stack have the same offset

Use of jal?
  Saves address of the next instruction that follows jal into $ra
  Procedure return can now be simply jr $ra

Data and Stack Segments

• Given the following initial layout of memory
• Each box represents a word (two bytes),
• $s0 initially contains the value 5,
• what is the result of each of the following instructions?
• (Assume that each instruction begins with memory laid out as shown)

• How is memory affected by:
  sw $s0, 4($sp)

• What is the value in $s0 after
  lw $s0, 0($sp)

• What is contained in $s0 after
  lw $s1, -4($sp)
  lw $s0, 6($s1)

• What is contained in $s0 after
  lw $s1, -2($sp)
  lw $s0, 0($s1)

• Write the sequence of instructions to store contents of $s0 into address stored in $sp+6

Using If-Else

$s1-$s5: g-k

Jump Address Table

MIPS Example, Use of Jump Address Table

Name      Register number  Usage
$zero     0                the constant value 0
$v0-$v1   2-3              values for results and expression evaluation
$a0-$a3   4-7              arguments
$t0-$t7   8-15             temporaries
$s0-$s7   16-23            saved
$t8-$t9   24-25            more temporaries
$gp       28               global pointer
$sp       29               stack pointer
$fp       30               frame pointer
$ra       31               return address

Using Jump Table

Same Code, this time using If-Else

CPI

Arrays vs. Pointers

clear1(int array[], int size)
{
    int i;
    for (i=0; i<size; i++)
        array[i]=0;
}

clear2(int* array, int size)
{
    int* p;
    for (p=&array[0]; p<&array[size]; p++)
        *p=0;
}

Clear1

clear1(int array[], int size)
{
    int i;
    for (i=0; i<size; i++)
        array[i]=0;
}

array in $a0
size in $a1
i in $t0

       add  $t0,$zero,$zero    # i=0, register $t0=0
loop1: add  $t1,$t0,$t0        # $t1=i*2
       add  $t1,$t1,$t1        # $t1=i*4
       add  $t2,$a0,$t1        # $t2=address of array[i]
       sw   $zero,0($t2)       # array[i]=0
       addi $t0,$t0,1          # i=i+1
       slt  $t3,$t0,$a1        # $t3=(i<size)
       bne  $t3,$zero,loop1    # if (i<size) go to loop1

Clear2

clear2(int* array, int size)
{
    int* p;
    for (p=&array[0]; p<&array[size]; p++)
        *p=0;
}

Array and size to registers $a0 and $a1

       add  $t0,$a0,$zero      # p = address of array[0]
loop2: sw   $zero,0($t0)       # memory[p]=0
       addi $t0,$t0,4          # p = p+4
       add  $t1,$a1,$a1        # $t1 = size*2
       add  $t1,$t1,$t1        # $t1 = size*4
       add  $t2,$a0,$t1        # $t2 = address of array[size]
       slt  $t3,$t0,$t2        # $t3=(p<&array[size])
       bne  $t3,$zero,loop2    # if (p<&array[size]) go to loop2

for (i=0; i<size; i++)array[i]=0;

Clear2, Version 2

clear2(int* array, int size)
{
    int* p;
    for (p=&array[0]; p<&array[size]; p++)
        *p=0;
}

Array and size to registers $a0 and $a1

       add  $t0,$a0,$zero      # p = address of array[0]
       add  $t1,$a1,$a1        # $t1 = size*2
       add  $t1,$t1,$t1        # $t1 = size*4
       add  $t2,$a0,$t1        # $t2 = address of array[size]
loop2: sw   $zero,0($t0)       # memory[p]=0
       addi $t0,$t0,4          # p = p+4
       slt  $t3,$t0,$t2        # $t3=(p<&array[size])
       bne  $t3,$zero,loop2    # if (p<&array[size]) go to loop2

Array vs. Pointer

       add  $t0,$a0,$zero      # p = address of array[0]
       add  $t1,$a1,$a1        # $t1 = size*2
       add  $t1,$t1,$t1        # $t1 = size*4
       add  $t2,$a0,$t1        # $t2 = address of array[size]
loop2: sw   $zero,0($t0)       # memory[p]=0
       addi $t0,$t0,4          # p = p+4
       slt  $t3,$t0,$t2        # $t3=(p<&array[size])
       bne  $t3,$zero,loop2    # if (p<&array[size]) go to loop2

       add  $t0,$zero,$zero    # i=0, register $t0=0
loop1: add  $t1,$t0,$t0        # $t1=i*2
       add  $t1,$t1,$t1        # $t1=i*4
       add  $t2,$a0,$t1        # $t2=address of array[i]
       sw   $zero,0($t2)       # array[i]=0
       addi $t0,$t0,1          # i=i+1
       slt  $t3,$t0,$a1        # $t3=(i<size)
       bne  $t3,$zero,loop1    # if (i<size) go to loop1
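A rough way to compare the two versions is to count executed instructions. The Python sketch below assumes size >= 1 (both loops test at the bottom, so the body always runs once) and assumes the pointer version keeps only sw, addi, slt, bne inside the loop, with its four setup instructions executed once. The function names are hypothetical.

```python
# Sketch: dynamic instruction counts for the array and pointer loops (size >= 1).
def clear1_instructions(size):
    setup = 1            # add $t0,$zero,$zero
    per_iteration = 7    # add, add, add, sw, addi, slt, bne
    return setup + per_iteration * size

def clear2_instructions(size):
    setup = 4            # p = &array[0], then three adds for &array[size]
    per_iteration = 4    # sw, addi, slt, bne
    return setup + per_iteration * size

print(clear1_instructions(100))   # 701
print(clear2_instructions(100))   # 404
```

Instruction count is only one factor in run time; it says nothing about CPI or clock rate.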

Big Picture

Compiler

Addressing Modes

Our Goal

add $t1, $s1, $s2 ($t1=9, $s1=17, $s2=18)
  – 000000 10001 10010 01001 00000 100000


Assembly Language vs. Machine Language

• Assembly provides convenient symbolic representation
  – much easier than writing down numbers
  – e.g., destination first

• Machine language is the underlying reality
  – e.g., destination is no longer first

• Assembly can provide 'pseudoinstructions'
  – e.g., “move $t0, $t1” exists only in Assembly
  – would be implemented using “add $t0,$t1,$zero”

• When considering performance you should count real instructions

Summary

• Instruction complexity is only one variable
  – lower instruction count vs. higher CPI / lower clock rate

• Design Principles:
  – simplicity favors regularity
  – smaller is faster
  – good design demands compromise
  – make the common case fast

• Instruction set architecture
  – a very important abstraction indeed!
