ECE369
Chapter 2
Instruction Set Architecture
• A very important abstraction
– interface between hardware and low-level software
– standardizes instructions, machine language bit patterns, etc.
– advantage: different implementations of the same architecture
– disadvantage: sometimes prevents using new innovations
• Modern instruction set architectures:
– IA-32, PowerPC, MIPS, SPARC, ARM, and others
Instructions:
• Language of the Machine
• We’ll be working with the MIPS instruction set architecture
– Simple instruction set and format
– Similar to other architectures developed since the 1980's
– Almost 100 million MIPS processors manufactured in 2002
– used by NEC, Nintendo, Cisco, Silicon Graphics, Sony, …
[Figure: chart for the years 1998 to 2002, vertical axis from 0 to 1400. Series: Other, SPARC, Hitachi SH, PowerPC, Motorola 68K, MIPS, IA-32, ARM]
MIPS arithmetic
• All instructions have 3 operands
• Operand order is fixed (destination first)
Example:
C code: a = b + c
MIPS ‘code’: add a, b, c
(we’ll talk about registers in a bit)
“The natural number of operands for an operation like addition is three… requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple”
MIPS arithmetic
• Design Principle: simplicity favors regularity.
• Of course this complicates some things...
C code: a = b + c + d;
MIPS code: add a, b, c
           add a, a, d
• Operands must be registers, only 32 registers provided
• Each register contains 32 bits
• Design Principle: smaller is faster. Why?
Registers vs. Memory
[Figure: processor (control, datapath), memory, and I/O (input, output)]
• Operands of arithmetic instructions must be registers
– only 32 registers provided
• Compiler associates variables with registers
• What about programs with lots of variables?
Memory Organization
• Viewed as a large, single-dimension array, with an address.
• A memory address is an index into the array
• "Byte addressing" means that the index points to a byte of memory.
Address   Contents
0         8 bits of data
1         8 bits of data
2         8 bits of data
3         8 bits of data
4         8 bits of data
5         8 bits of data
6         8 bits of data
...
Memory Organization
• Bytes are nice, but most data items use larger "words"
• For MIPS, a word is 32 bits or 4 bytes.
• 2^32 bytes with byte addresses from 0 to 2^32-1
• 2^30 words with byte addresses 0, 4, 8, ... 2^32-4
• Words are aligned
i.e., what are the 2 least significant bits of a word address?
Address   Contents
0         32 bits of data
4         32 bits of data
8         32 bits of data
12        32 bits of data
...
Registers hold 32 bits of data
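The alignment question can be checked with a little bit arithmetic. The following is a minimal sketch in C, assuming 4-byte words as stated on this slide; the function names are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* A word is 4 bytes, so an aligned word address is a multiple of 4:
   its two least significant bits are both 0. */
bool is_word_aligned(uint32_t byte_address)
{
    return (byte_address & 0x3u) == 0;
}

/* Byte address of the n-th word (word 0 at byte 0, word 1 at byte 4, ...). */
uint32_t word_to_byte_address(uint32_t word_index)
{
    return word_index << 2;   /* same as multiplying by 4 */
}
```

The last of the 2^30 words (index 2^30-1) maps to byte address 2^32-4, which matches the range given in the bullets.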
Instructions
• Load and store instructions
• Example:
C code: A[12] = h + A[8]; # $s3 stores base address of A and $s2 stores h
MIPS code: lw $t0, 32($s3)
           add $t0, $s2, $t0
           sw $t0, 48($s3)
• Can refer to registers by name (e.g., $s2, $t2) instead of number
• Store word has destination last
• Remember arithmetic operands are registers, not memory!
Can’t write: add 48($s3), $s2, 32($s3)
Instructions
• Example:
C code: g = h + A[i];
# $s3 stores base address of A and
# g, h and i in $s1, $s2 and $s4
add $t1, $s4, $s4   # $t1 = 2*i
add $t1, $t1, $t1   # $t1 = 4*i
add $t1, $t1, $s3   # $t1 = 4*i + $s3
lw  $t0, 0($t1)     # $t0 = A[i]
add $s1, $s2, $t0   # g = h + A[i]
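The same address arithmetic can be traced in C. This is a sketch only: memory is modeled as a hypothetical array of 32-bit words (word k at byte address 4*k), and the names load_word and h_plus_a_i are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical word-organized memory: word k lives at byte address 4*k. */
uint32_t load_word(const uint32_t *memory, uint32_t byte_address)
{
    return memory[byte_address / 4];
}

/* Mirrors the five-instruction sequence: scale i by 4 with two additions,
   add the base address, load the word, then add h. */
uint32_t h_plus_a_i(const uint32_t *memory, uint32_t base, uint32_t h, uint32_t i)
{
    uint32_t t1 = i + i;                    /* t1 = 2*i            */
    t1 = t1 + t1;                           /* t1 = 4*i            */
    t1 = t1 + base;                         /* t1 = address of A[i] */
    uint32_t t0 = load_word(memory, t1);    /* t0 = A[i]           */
    return h + t0;                          /* g = h + A[i]        */
}
```

The index is doubled twice because the machine addresses bytes while the array holds 4-byte words.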
So far we’ve learned:
• MIPS
– loading words but addressing bytes
– arithmetic on registers only
• Instruction           Meaning
add $s1, $s2, $s3       $s1 = $s2 + $s3
sub $s1, $s2, $s3       $s1 = $s2 – $s3
lw $s1, 100($s2)        $s1 = Memory[$s2+100]
sw $s1, 100($s2)        Memory[$s2+100] = $s1
Policy of Use Conventions
Name       Register number   Usage
$zero      0                 the constant value 0
$v0-$v1    2-3               values for results and expression evaluation
$a0-$a3    4-7               arguments
$t0-$t7    8-15              temporaries
$s0-$s7    16-23             saved
$t8-$t9    24-25             more temporaries
$gp        28                global pointer
$sp        29                stack pointer
$fp        30                frame pointer
$ra        31                return address
Register 1 ($at) reserved for assembler, 26-27 for operating system
MIPS Format
• Instructions, like registers and words
– are also 32 bits long
– add $t1, $s1, $s2
– Registers: $t1=9, $s1=17, $s2=18
• Instruction Format:
op       rs      rt      rd      shamt   funct
000000   10001   10010   01001   00000   100000
Fetch, Decode, Execute, PC!!
Our Goal
add $t1, $s1, $s2 ($t1=9, $s1=17, $s2=18)
op       rs      rt      rd      shamt   funct
000000   10001   10010   01001   00000   100000
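Packing the six fields into one 32-bit word can be sketched in C as follows. The function name encode_r_type is hypothetical; the field widths (6, 5, 5, 5, 5, 6 bits) are read off the bit pattern above.

```c
#include <stdint.h>

/* Pack the R-type fields, left to right: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6).
   Each field is masked to its width before being shifted into place. */
uint32_t encode_r_type(uint32_t op, uint32_t rs, uint32_t rt,
                       uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) <<  6) |
            (funct & 0x3Fu);
}
```

For add $t1, $s1, $s2 the call encode_r_type(0, 17, 18, 9, 0, 32) gives 0x02324820, which is the bit pattern on the slide written in hexadecimal. Note that the destination $t1 comes first in assembly but sits in the rd field, after both sources, in the machine word.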
MIPS Format
• How about
– lw $s1, 100($s2)
Machine Language
• Consider the load-word and store-word instructions,
– What would the regularity principle have us do?
– New principle: Good design demands a compromise
• Introduce a new type of instruction format
– I-type for data transfer instructions
– other format was R-type for register
• Example: lw $t0, 32($s2)
op   rs   rt   16 bit number
35   18   8    32
• Where's the compromise?
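The I-type layout can be sketched in C the same way. The function name encode_i_type is hypothetical; the opcodes 35 (lw) and 43 (sw) and the register numbers come from the tables in these slides.

```c
#include <stdint.h>

/* Pack the I-type fields: op(6) rs(5) rt(5) immediate(16).
   The immediate is a signed 16-bit offset; only its low 16 bits are stored. */
uint32_t encode_i_type(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm)
{
    return ((op & 0x3Fu) << 26) |
           ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) |
           (uint16_t)imm;
}
```

For lw $t0, 32($s2), with $s2 = 18 as the base register in rs and $t0 = 8 in rt, encode_i_type(35, 18, 8, 32) gives 0x8E480020. The first 16 bits keep the same op/rs/rt positions as the R-type format, and the remaining 16 bits are reused for the offset.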
Summary
A[300]=h+A[300] # $t1 = base address of A, $s2 stores h
                # use $t0 for temporary register
lw  $t0, 1200($t1)
add $t0, $s2, $t0
sw  $t0, 1200($t1)
instruction   format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      na
sub           R        0    reg   reg   reg   0       34      na
lw            I        35   reg   reg   na    na      na      address
sw            I        43   reg   reg   na    na      na      address
Machine code fields:
lw  $t0, 1200($t1)    op, rs, rt, address: 35, 9, 8, 1200
add $t0, $s2, $t0     op, rs, rt, rd, shamt, funct: 0, 18, 8, 8, 0, 32
sw  $t0, 1200($t1)    op, rs, rt, address: 43, 9, 8, 1200
Summary of Instructions We Have Seen So Far
Shift and Logical Operations
Summary of New Instructions
Control Instructions
Using If-Else
$s0 = f
$s1 = g
$s2 = h
$s3 = i
$s4 = j
$s5 = k
Where is 0,1,2,3 stored?
Addresses in Branches
• Instructions:
bne $t4,$t5,Label    Next instruction is at Label if $t4 ≠ $t5
beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
• Formats:
I    op rs rt 16 bit address
• What if the “Label” is too far away (16 bit address is not enough)?
Addresses in Branches and Jumps
• Instructions:
bne $t4,$t5,Label    Next instruction is at Label if $t4 != $t5
beq $t4,$t5,Label    Next instruction is at Label if $t4 = $t5
j Label              Next instruction is at Label
• Formats:
I    op rs rt 16 bit address
J    op 26 bit address
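How far a 16 bit or 26 bit field can reach depends on how the hardware interprets it. The sketch below uses the standard MIPS rules, which these slides do not spell out: the branch field is a signed word offset relative to the instruction after the branch, and the jump field is a word address combined with the upper 4 bits of the PC. The function names are hypothetical.

```c
#include <stdint.h>

/* Branch target: address of the following instruction (pc + 4) plus the
   sign-extended 16-bit offset, counted in words and scaled to bytes. */
uint32_t branch_target(uint32_t pc, int16_t offset_words)
{
    return pc + 4u + (uint32_t)((int32_t)offset_words * 4);
}

/* Jump target: upper 4 bits of (pc + 4) joined with the 26-bit word
   address shifted left by 2. */
uint32_t jump_target(uint32_t pc, uint32_t address26)
{
    return ((pc + 4u) & 0xF0000000u) | ((address26 & 0x03FFFFFFu) << 2);
}
```

Under these rules a branch reaches roughly 2^15 instructions in either direction, and a jump reaches anywhere within the current 256 MB region. A label that is farther away needs a different instruction sequence.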
Control Flow
• We have: beq, bne, what about Branch-if-less-than?
if (a < b) # a in $s0, b in $s1
slt $t0, $s0, $s1     # $t0 gets 1 if a<b
bne $t0, $zero, Less  # go to Less if $t0 is not 0
Combination of slt and bne implements branch on less than.
While Loop
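while (save[i] == k) # i, j and k correspond to registers
    i = i+j;         # $s3, $s4 and $s5
                     # array base address at $s6
Loop: add $t1, $s3, $s3   # $t1 = 2*i
      add $t1, $t1, $t1   # $t1 = 4*i
      add $t1, $t1, $s6   # $t1 = address of save[i]
      lw  $t0, 0($t1)     # $t0 = save[i]
      bne $t0, $s5, Exit  # leave the loop if save[i] != k
      add $s3, $s3, $s4   # i = i + j
      j   Loop            # repeat
Exit: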
What does this code do?
Overview of MIPS
• simple instructions all 32 bits wide
• very structured, no unnecessary baggage
• only three instruction formats
R    op rs rt rd shamt funct
I    op rs rt 16 bit address
J    op 26 bit address
Policy of Use Conventions
What is the use of $ra?
Example
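void swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}
Let $s5 store k value and $s4 store base address of the v[]
swap: sll $s2, $s5, 2     # $s2 = k * 4
      add $s2, $s4, $s2   # $s2 = address of v[k]
      lw  $t0, 0($s2)     # $t0 = v[k]
      lw  $t1, 4($s2)     # $t1 = v[k+1]
      sw  $t1, 0($s2)     # v[k] = old v[k+1]
      sw  $t0, 4($s2)     # v[k+1] = old v[k]
      jr  $31             # return ($31 is $ra)
Byte addressing problem??? The index k is scaled by 4 to get a byte offset:
swap: sll $s2, $s5, 2
      add $s2, $s4, $s2
      :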
Summary
Other Issues
• More reading:
– support for procedures
– linkers, loaders, memory layout
– stacks, frames, recursion
– manipulating strings and pointers
– interrupts and exceptions
– system calls and conventions
• Some of these we'll talk more about later
• We have already talked about compiler optimizations
Nested Procedures
function_main(){
    function_a(var_x); /* passes argument using $a0 */
    :                  /* function is called with “jal” instruction */
    return;
}
function_a(int size){
    function_b(var_y); /* passes argument using $a0 */
    :                  /* function is called with “jal” instruction */
    return;
}
function_b(int count){
    :
    return;
}
Resource Conflicts ???
Function Call and Stack Pointer
Recursive Procedures Invoke Clones !!!
int fact (int n) {
    if (n < 1)
        return (1);
    else
        return (n * fact(n-1));
}
“n” corresponds to $a0
Program starts with the label of the procedure “fact”
How many registers do we need to save on the stack?
Registers $a0 and $ra
Factorial Code
fact: addi $sp, $sp, -8    # make room for 2 items
      sw   $ra, 4($sp)     # save return address
      sw   $a0, 0($sp)     # save n
      slti $t0, $a0, 1     # is n<1?
      beq  $t0, $zero, L1  # if not go to L1
      addi $v0, $zero, 1   # return 1
      addi $sp, $sp, 8     # pop 2 items
      jr   $ra             # return to caller
L1:   addi $a0, $a0, -1    # n = n-1
      jal  fact            # call fact(n-1)
      lw   $a0, 0($sp)     # restore “n”
      lw   $ra, 4($sp)     # restore address
      addi $sp, $sp, 8     # pop 2 items
      mul  $v0, $a0, $v0   # return n*fact(n-1)
      jr   $ra             # return to caller
Stack for n=3 (return address and n per frame): Ra, 3 / Ra, 2 / Ra, 1
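The stack traffic of the recursive procedure can be modeled without recursion. This is a sketch only: fact_with_stack and STACK_WORDS are hypothetical names, the return address is represented by a placeholder word, and the model simply pushes two words per call and pops them on the way back.

```c
#include <stdint.h>

#define STACK_WORDS 64   /* hypothetical stack capacity, in words */

/* Iterative model of fact: every call pushes two words (a placeholder for
   $ra and the argument n); every return pops them and multiplies.
   Returns 0 if the model stack would overflow. */
uint32_t fact_with_stack(uint32_t n, uint32_t *max_words_used)
{
    uint32_t stack[STACK_WORDS];
    uint32_t sp = STACK_WORDS;      /* stack grows toward lower indices */
    uint32_t a0 = n;
    uint32_t v0 = 1;

    *max_words_used = 0;

    /* Call phase: push a frame per call until the base case n < 1. */
    for (;;) {
        if (sp < 2)
            return 0;               /* out of stack space */
        sp -= 2;
        stack[sp + 1] = 0;          /* placeholder for the return address */
        stack[sp] = a0;             /* saved n */
        if (STACK_WORDS - sp > *max_words_used)
            *max_words_used = STACK_WORDS - sp;
        if (a0 < 1) {               /* base case: return 1, pop the frame */
            v0 = 1;
            sp += 2;
            break;
        }
        a0 = a0 - 1;                /* argument for the next call */
    }

    /* Return phase: restore n from each frame, pop it, and multiply. */
    while (sp < STACK_WORDS) {
        a0 = stack[sp];
        sp += 2;
        v0 = a0 * v0;
    }
    return v0;
}
```

For n=3 the model returns 6. It briefly holds four frames (for n = 3, 2, 1, 0); the n = 0 frame is popped immediately by the base case, leaving the three frames shown in the stack picture.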
What is the Use of Frame Pointer?
Variables local to procedure do not fit in registers !!!
Elaboration
What if there are more than 4 parameters for a function call?
– Addressable via frame pointer
– References to variables in the stack have the same offset
Use of jal?
– Saves address of the next instruction that follows jal into $ra
– Procedure return can now be simply jr $ra
Data and Stack Segments
• Given the following initial layout of memory
• Each box represents a word (two bytes),
• $s0 initially contains the value 5,
• what is the result of each of the following instructions?
• (Assume that each instruction begins with memory laid out as shown)
• How is memory affected by:
sw $s0, 4($sp)
• What is the value in $s0 after
lw $s0, 0($sp)
• What is contained in $s0 after
lw $s1, -4($sp)
lw $s0, 6($s1)
• What is contained in $s0 after
lw $s1, -2($sp)
lw $s0, 0($s1)
• Write the sequence of instructions to store contents of $s0 into address stored in $sp+6
Using If-Else
$s1-$s5: g-k
Jump Address Table
MIPS Example, Use of Jump Address Table
Using Jump Table
Same Code, this time using If-Else
CPI
Arrays vs. Pointers
void clear1(int array[], int size)
{
    int i;
    for (i=0; i<size; i++)
        array[i]=0;
}
void clear2(int* array, int size)
{
    int* p;
    for (p=&array[0]; p<&array[size]; p++)
        *p=0;
}
Clear1
void clear1(int array[], int size)
{
    int i;
    for (i=0; i<size; i++)
        array[i]=0;
}
array in $a0, size in $a1, i in $t0
       add  $t0,$zero,$zero  # i=0, register $t0=0
loop1: add  $t1,$t0,$t0      # $t1=i*2
       add  $t1,$t1,$t1      # $t1=i*4
       add  $t2,$a0,$t1      # $t2=address of array[i]
       sw   $zero, 0($t2)    # array[i]=0
       addi $t0,$t0,1        # i=i+1
       slt  $t3,$t0,$a1      # $t3=(i<size)
       bne  $t3,$zero,loop1  # if (i<size) go to loop1
Clear2
void clear2(int* array, int size)
{
    int* p;
    for (p=&array[0]; p<&array[size]; p++)
        *p=0;
}
Array and size to registers $a0 and $a1
       add  $t0,$a0,$zero    # p = address of array[0]
loop2: sw   $zero,0($t0)     # memory[p]=0
       addi $t0,$t0,4        # p = p+4
       add  $t1,$a1,$a1      # $t1 = size*2
       add  $t1,$t1,$t1      # $t1 = size*4
       add  $t2,$a0,$t1      # $t2 = address of array[size]
       slt  $t3,$t0,$t2      # $t3=(p<&array[size])
       bne  $t3,$zero,loop2  # if (p<&array[size]) go to loop2
Clear2, Version 2
void clear2(int* array, int size)
{
    int* p;
    for (p=&array[0]; p<&array[size]; p++)
        *p=0;
}
Array and size to registers $a0 and $a1
       add  $t0,$a0,$zero    # p = address of array[0]
       add  $t1,$a1,$a1      # $t1 = size*2
       add  $t1,$t1,$t1      # $t1 = size*4
       add  $t2,$a0,$t1      # $t2 = address of array[size]
loop2: sw   $zero,0($t0)     # memory[p]=0
       addi $t0,$t0,4        # p = p+4
       slt  $t3,$t0,$t2      # $t3=(p<&array[size])
       bne  $t3,$zero,loop2  # if (p<&array[size]) go to loop2
Array vs. Pointer
Pointer version (clear2):
       add  $t0,$a0,$zero    # p = address of array[0]
       add  $t1,$a1,$a1      # $t1 = size*2
       add  $t1,$t1,$t1      # $t1 = size*4
       add  $t2,$a0,$t1      # $t2 = address of array[size]
loop2: sw   $zero,0($t0)     # memory[p]=0
       addi $t0,$t0,4        # p = p+4
       slt  $t3,$t0,$t2      # $t3=(p<&array[size])
       bne  $t3,$zero,loop2  # if (p<&array[size]) go to loop2
Array version (clear1):
       add  $t0,$zero,$zero  # i=0, register $t0=0
loop1: add  $t1,$t0,$t0      # $t1=i*2
       add  $t1,$t1,$t1      # $t1=i*4
       add  $t2,$a0,$t1      # $t2=address of array[i]
       sw   $zero, 0($t2)    # array[i]=0
       addi $t0,$t0,1        # i=i+1
       slt  $t3,$t0,$a1      # $t3=(i<size)
       bne  $t3,$zero,loop1  # if (i<size) go to loop1
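One way to compare the two listings is to count the instructions they execute. The sketch below assumes size is at least 1 (both loops test at the bottom, so the body always runs once) and assumes the pointer version computes the address of array[size] once, before the loop. The function names are hypothetical.

```c
#include <stdint.h>

/* Array version (clear1): 1 setup instruction, then 7 instructions per
   iteration (add, add, add, sw, addi, slt, bne). Assumes size >= 1. */
uint64_t clear1_instruction_count(uint32_t size)
{
    return 1u + 7u * (uint64_t)size;
}

/* Pointer version (clear2) with the end address computed before the loop:
   4 setup instructions, then 4 per iteration (sw, addi, slt, bne).
   Assumes size >= 1. */
uint64_t clear2_instruction_count(uint32_t size)
{
    return 4u + 4u * (uint64_t)size;
}
```

For size = 100 this gives 701 instructions for the array version and 404 for the pointer version. The saving comes from not recomputing the element address from the index on every pass.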
Big Picture
Compiler
Addressing Modes
Assembly Language vs. Machine Language
• Assembly provides convenient symbolic representation
– much easier than writing down numbers
– e.g., destination first
• Machine language is the underlying reality
– e.g., destination is no longer first
• Assembly can provide 'pseudoinstructions'
– e.g., “move $t0, $t1” exists only in Assembly
– would be implemented using “add $t0,$t1,$zero”
• When considering performance you should count real instructions
Summary
• Instruction complexity is only one variable
– lower instruction count vs. higher CPI / lower clock rate
• Design Principles:
– simplicity favors regularity
– smaller is faster
– good design demands compromise
– make the common case fast
• Instruction set architecture
– a very important abstraction indeed!
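The trade-off between instruction count, CPI and clock rate is usually written as the standard CPU time relation. The symbol IC for instruction count is a name chosen here, not notation from these slides:

```latex
\[
\text{CPU time}
  = \text{IC} \times \text{CPI} \times \text{clock cycle time}
  = \frac{\text{IC} \times \text{CPI}}{\text{clock rate}}
\]
```

A more complex instruction set may lower IC while raising CPI or lengthening the cycle time, so only the product shows which design is faster.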