Upload
leola
View
33
Download
0
Embed Size (px)
DESCRIPTION
CSE 331 Computer Organization and Design Fall 2007 Week 5. Section 1: Mary Jane Irwin ( www.cse.psu.edu/~mji ) Section 2: Krishna Narayanan Course material on ANGEL: cms.psu.edu [ adapted from D. Patterson slides ]. Head’s Up. Last week’s material Supporting procedure calls and returns - PowerPoint PPT Presentation
Citation preview
CSE331 W05.1 Irwin Fall 2007 PSU
CSE 331Computer Organization and
DesignFall 2007
Week 5
Section 1: Mary Jane Irwin (www.cse.psu.edu/~mji)
Section 2: Krishna Narayanan
Course material on ANGEL: cms.psu.edu
[adapted from D. Patterson slides]
CSE331 W05.2 Irwin Fall 2007 PSU
Head’s UpLast week’s material
Supporting procedure calls and returns
This week’s material Addressing modes; Assemblers, linkers and loaders
- Reading assignment - PH: 2.10, A.1-A.5
Next week’s material (after Exam #1) Introduction to VHDL and basic VHDL constructs
- Reading assignment – Y, Chapters 1 through 3
Reminders HW 4 is due Thursday, Sept 27th (by 11:55pm) Quiz 3 will close Sunday, Sept 30th (at 11:55pm) Exam #1 is Tuesday, Oct 2, 6:30 to 7:45pm, 113 IST
CSE331 W05.3 Irwin Fall 2007 PSU
CDT Article on IBM’s BlueGene/P (9/25/07) Calculations per second
Gigaflop 1 billion (1,000,000,000) Teraflop 1 trillion (1,000,000,000,000) Petaflop 1 quadrillion
(1,000,000,000,000,000) Exaflop 1 quintrillion (1,000,000,000,000,000,000)Year Performance Maker
1993 16,000,000,000 Thinking Machine C-5
1994 236,000,000,000 Fujitsu Numerical Wind Tunnel
1996 368,000,000,000 Hitachi CP-PACS
1997 1,800,000,000,000 intel ASCI Red
2000 12,300,000,000,000 IBM ASCI White
2002 36,000,000,000,000 NEC Earth Simulator
2004 280,000,000,000,000 IBM BlueGene/L
2008 1,000,000,000,000,000 IBM BlueGene/P
2011 10,000,000,000,000,000 IBM BlueGene/P+
2017 20,000,000,000,000,000 IBM BlueGene/P++
202? 1,000,000,000,000,000,000 TBD
CSE331 W05.4 Irwin Fall 2007 PSU
MIPS Operand Addressing Modes SummaryRegister addressing – operand is in a register
Base (displacement) addressing – operand’s address in memory is the sum of a register and a 16-bit constant contained within the instruction
Immediate addressing – operand is a 16-bit constant contained within the instruction
1. Register addressing
op rs rt rd funct Register
word operand
op rs rt offset
2. Base addressing
base register
Memory
word or byte operand
3. Immediate addressing
op rs rt operand
CSE331 W05.5 Irwin Fall 2007 PSU
MIPS Instruction Addressing Modes SummaryPC-relative addressing – instruction’s address
in memory is the sum of the PC and a 16-bit constant contained within the instruction
Pseudo-direct addressing – instruction’s address in memory is the 26-bit constant contained within the instruction concatenated with the upper 4 bits of the PC
4. PC-relative addressing
op rs rt offset
Program Counter (PC)
Memory
branch destination instruction
5. Pseudo-direct addressing
op jump address
Program Counter (PC)
Memory
jump destination instruction||
CSE331 W05.6 Irwin Fall 2007 PSU
Review: MIPS Instructions, so farCategory Instr OpC Example Meaning
Arithmetic
(R & I format)
add 0 & 20 add $s1, $s2, $s3 $s1 = $s2 + $s3
subtract 0 & 22 sub $s1, $s2, $s3 $s1 = $s2 - $s3
add immediate 8 addi $s1, $s2, 4 $s1 = $s2 + 4
shift left logical 0 & 00 sll $s1, $s2, 4 $s1 = $s2 << 4
shift right logical
0 & 02 srl $s1, $s2, 4 $s1 = $s2 >> 4 (fill with zeros)
shift right arithmetic
0 & 03 sra $s1, $s2, 4 $s1 = $s2 >> 4 (fill with sign bit)
and 0 & 24 and $s1, $s2, $s3 $s1 = $s2 & $s3
or 0 & 25 or $s1, $s2, $s3 $s1 = $s2 | $s3
nor 0 & 27 nor $s1, $s2, $s3 $s1 = not ($s2 | $s3)
and immediate c and $s1, $s2, ff00 $s1 = $s2 & 0xff00
or immediate d or $s1, $s2, ff00 $s1 = $s2 | 0xff00
load upper immediate
f lui $s1, 0xffff $s1 = 0xffff0000
CSE331 W05.7 Irwin Fall 2007 PSU
Review: MIPS Instructions, so farCategory Instr OpC Example Meaning
Datatransfer(I format)
load word 23 lw $s1, 100($s2) $s1 = Memory($s2+100)
store word 2b sw $s1, 100($s2) Memory($s2+100) = $s1
load byte 20 lb $s1, 101($s2) $s1 = Memory($s2+101)
store byte 28 sb $s1, 101($s2) Memory($s2+101) = $s1
load half 21 lh $s1, 101($s2) $s1 = Memory($s2+102)
store half 29 sh $s1, 101($s2) Memory($s2+102) = $s1
Cond. branch
(I & R format)
br on equal 4 beq $s1, $s2, L if ($s1==$s2) go to L
br on not equal 5 bne $s1, $s2, L if ($s1 !=$s2) go to L
set on less than immediate
a slti $s1, $s2, 100
if ($s2<100) $s1=1; else $s1=0
set on less than
0 & 2a slt $s1, $s2, $s3 if ($s2<$s3) $s1=1; else $s1=0
Uncond. jump
jump 2 j 2500 go to 10000
jump register 0 & 08 jr $t1 go to $t1
jump and link 3 jal 2500 go to 10000; $ra=PC+4
CSE331 W05.8 Irwin Fall 2007 PSU
Review: MIPS R3000 ISA Instruction Categories
Load/Store Computational Jump and Branch Floating Point
- coprocessor Memory Management Special
3 Instruction Formats: all 32 bits wide
R0 - R31
PCHI
LO
OP rs rt rd shamt funct
OP rs rt 16 bit number
OP 26 bit jump target
Registers
R format
I format
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
J format
CSE331 W05.9 Irwin Fall 2007 PSU
RISC Design Principles ReviewSimplicity favors regularity
fixed size instructions – 32-bits small number of instruction formats
Smaller is faster limited instruction set limited number of registers in register file limited number of addressing modes
Good design demands good compromises three instruction formats
Make the common case fast arithmetic operands from the register file (load-
store machine) allow instructions to contain immediate operands
CSE331 W05.10 Irwin Fall 2007 PSU
The Code Translation Hierarchy
C program
compiler
assembly code
CSE331 W05.11 Irwin Fall 2007 PSU
Compiler
Transforms the C program into an assembly language program
Advantages of high-level languages many fewer lines of code easier to understand and debug …
Today’s optimizing compilers can produce assembly code nearly as good as an assembly language programming expert and often better for large programs
smaller code size, faster execution
To learn more take CSE 421
CSE331 W05.12 Irwin Fall 2007 PSU
The Code Translation Hierarchy
C program
compiler
assembly code
assembler
object code
CSE331 W05.13 Irwin Fall 2007 PSU
CSE331 W05.14 Irwin Fall 2007 PSU
Assembler
Does a syntactic check of the code (i.e., did you type it in correctly) and then transforms the symbolic assembler code into object (machine) code
Advantages of assembler much easier than remembering instr’s binary codes can use labels for addresses – and let the assembler
do the arithmetic can use pseudo-instructions
- e.g., “move $t0, $t1” exists only in assembler (would be implemented using “add $t0,$t1,$zero”)
When considering performance, you should count instructions executed, not code size
CSE331 W05.15 Irwin Fall 2007 PSU
The Two Main Tasks of the Assembler1. Builds a symbol table which holds the
symbolic names (labels) and their corresponding addresses A label is local if it is used only within the file where
its defined. Labels are local by default. A label is external (global) if it refers to code or
data in another file or if it is referenced from another file. Global labels must be explicitly declared global (e.g., .globl main)
2. Translates each assembly language statement into object (machine) code by “assembling” the numeric equivalents of the opcodes, register specifiers, shift amounts, and jump/branch targets/offsets
CSE331 W05.16 Irwin Fall 2007 PSU
MIPS (spim) Memory AllocationMemory
230
words
0000 0000
f f f f f f f c
TextSegment
Reserved
Static data
Mem Map I/O
0040 0000
1000 00001000 8000
7f f f f f fcStack
Dynamic data
$sp
$gp
PC
Kernel Code & Data
CSE331 W05.17 Irwin Fall 2007 PSU
Other Tasks of the AssemblerConverts pseudo-instr’s to legal assembly code
register $at is reserved for the assembler to do this
Converts branches to far away locations into a branch followed by a jump
Converts instructions with large immediates into a lui followed by an ori
Converts numbers specified in decimal and hexidecimal into their binary equivalents and characters into their ASCII equivalents
Deals with data layout directives (e.g., .asciiz)Expands macros (frequently used sequences of
instructions)
CSE331 W05.18 Irwin Fall 2007 PSU
Typical Object File Pieces Object file header: size and position of the following
pieces of the file Text (code) segment (.text) : assembled object
(machine) code Data segment (.data) : data accompanying the code
static data - allocated throughout the program dynamic data - grows and shrinks as needed
Relocation information: identifies instructions (data) that use (are located at) absolute addresses – not relative to a register (including the PC)
on MIPS only j, jal, and some loads and stores (e.g., lw $t1, 100($zero) ) use absolute addresses
Symbol table: global labels with their addresses (if defined in this code segment) or without (if defined external to this code segment)
Debugging information
CSE331 W05.19 Irwin Fall 2007 PSU
An Example
.data
.align 0str: .asciiz "The answer is "cr: .asciiz "\n"
.text
.align 2
.globl main
.globl printfmain: ori $2, $0, 5
syscallmove $8, $2
loop: beq $8, $9, doneblt $8, $9, brncsub $8, $8, $9j loop
brnc: sub $9, $9, $8j loop
done: jal printf
Gbl? Symbol Addressstr 1000 0000
cr 1000 000b
yes main 0040 0000
loop 0040 000c
brnc 0040 001c
done 0040 0024
yes printf ???? ????
Relocation Info
Address Data/Instr1000 0000 str
1000 000b cr
0040 0018 j loop
0040 0020 j loop
0040 0024 jal printf
CSE331 W05.20 Irwin Fall 2007 PSU
The Code Translation Hierarchy
C program
compiler
assembly code
assembler
object code library routines
executable
linker
machine code
main text segment
printf text segment
CSE331 W05.21 Irwin Fall 2007 PSU
LinkerTakes all of the independently assembled code segments
and “stitches” (links) them together Faster to recompile and reassemble a patched segment, than it
is to recompile and reassemble the entire program
1. Decides on memory allocation pattern for the code and data segments of each module Remember, modules were assembled in isolation so each has
assumed its code’s starting location is 0x0040 0000 and its static data starting location is 0x1000 0000
2. Relocates absolute addresses to reflect the new starting location of the code segment and its data segment
3. Uses the symbol tables information to resolve all remaining undefined labels branches, jumps, and data addresses to/in external modules
Linker produces an executable file
CSE331 W05.22 Irwin Fall 2007 PSU
Linker Code Schematic
printf: . . .
main: . . . jal ????
call, printf
Linker
Object file
C library
Relocation info
main: . . . jal printfprintf: . . .
Executable file
CSE331 W05.23 Irwin Fall 2007 PSU
Linking Two Object FilesH
dr
T
xtse
g
D
seg
R
eloc
S
mtb
l D
bg
File 1
Hdr
T
xtse
g
Dse
g
Rel
oc S
mtb
l D
bg
File 2
+
Executable
Hdr
T
xtse
g
Dse
g
R
eloc
CSE331 W05.24 Irwin Fall 2007 PSU
The Code Translation Hierarchy
C program
compiler
assembly code
assembler
object code library routines
executable
linker
loader
memory
machine code
CSE331 W05.25 Irwin Fall 2007 PSU
Loader
Loads (copies) the executable code now stored on disk into memory at the starting address specified by the operating system
Copies the parameters (if any) to the main routine onto the stack
Initializes the machine registers and sets the stack pointer to the first free location (0x7fff fffc)
Jumps to a start-up routine (at PC addr 0x0040 0000 on xspim) that copies the parameters into the argument registers and then calls the main routine of the program with a jal main
To learn more take CSE 311 and CSE 411
CSE331 W05.26 Irwin Fall 2007 PSU
Dynamically Linked LibrariesStatically linking libraries mean that the library
becomes part of the executable code It loads the whole library even if only a small part is
used (e.g., standard C library is 2.5 MB) What if a new version of the library is released ?
(Lazy) dynamically linked libraries (DLL) – library routines are not linked and loaded until a routine is called during execution
The first time the library routine called, a dynamic linker-loader must
- find the desired routine, remap it, and “link” it to the calling routine (see book for more details)
DLLs require extra space for dynamic linking information, but do not require the whole library to be copied or linked
CSE331 W05.27 Irwin Fall 2007 PSU
First Evening Exam
Exam #1: Tuesday, Oct 2, 6:30 to 7:45pm, 113 IST
Look at the practice exam (and solutions) on ANGEL In Section 1 – two people requesting a conflict exam In Section 2 – one person requesting a conflict exam
Questions?