CISC 662 Graduate Computer Architecture Lecture 3 - ISA Michela Taufer Powerpoint Lecture Notes from John Hennessy and David Patterson’s: Computer Architecture, 4th edition ---- Additional teaching material from: Jelena Mirkovic (U Del) and John Kubiatowicz (UC Berkeley)


CISC 662 Graduate Computer Architecture

Lecture 3 - ISA
Michela Taufer

Powerpoint Lecture Notes from John Hennessy and David Patterson’s: Computer Architecture, 4th edition
----
Additional teaching material from: Jelena Mirkovic (U Del) and John Kubiatowicz (UC Berkeley)


Memory Addressing


Alignment Restrictions
• Computer systems place restrictions on the allowable addresses for some objects
• An access to an object of size s bytes at byte address A is aligned if A mod s = 0
• Why do machines have alignment restrictions?
  – The hardware to access memory is simpler
  – Programs with aligned accesses run faster
  – A misaligned memory access may take multiple aligned memory references
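The alignment rule (A mod s = 0) can be written as a one-line check. This is a minimal sketch in C; the function name `is_aligned` is an assumption made for illustration and does not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when an access of `size` bytes at byte address `addr`
 * is aligned, i.e. addr mod size == 0. */
bool is_aligned(uintptr_t addr, size_t size) {
    assert(size > 0);            /* a size of 0 would divide by zero */
    return addr % size == 0;
}
```

For example, `is_aligned(0x2000, 4)` holds, while `is_aligned(0x2001, 4)` does not.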


Alignment Restrictions cont’d
• A 32-bit processor typically requires a 4-byte integer to reside at a memory address that is evenly divisible by 4
• An aligned 4-byte int has an address that is a multiple of 4, e.g., 0x2000 or 0x2004 -> the value can be read or written with a single memory operation
• An unaligned 4-byte int has an address that is not a multiple of 4, e.g., 0x2001 -> the object may be split across two 4-byte blocks and is therefore read or written with two memory operations
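To see why the object at 0x2001 costs two memory operations, one can count how many aligned 4-byte blocks it overlaps. The sketch below assumes memory is accessed in aligned blocks of `BLOCK_BYTES` bytes; the names `BLOCK_BYTES` and `blocks_touched` are illustrative, not from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_BYTES 4u  /* assumed width of one memory operation */

/* Number of aligned BLOCK_BYTES-wide blocks overlapped by an object of
 * `size` bytes that starts at byte address `addr`. */
unsigned blocks_touched(uint32_t addr, uint32_t size) {
    assert(size > 0);
    uint32_t first = addr / BLOCK_BYTES;               /* block holding the first byte */
    uint32_t last  = (addr + size - 1) / BLOCK_BYTES;  /* block holding the last byte  */
    return last - first + 1;
}
```

With these assumptions, a 4-byte object at 0x2000 touches one block and the same object at 0x2001 touches two.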


Addressing Modes
• Addressing mode = how architectures specify the address of an object they will access
• Addressing modes may:
  – Reduce instruction counts
  – Add to the complexity of building a computer
  – Increase the average CPI
• Figure B.6 lists all the addressing modes in recent computers
• Some examples in the next slide


Addressing Modes cont’d
• Register
  – Add R4, R3          R4 <- R4 + R3
  – When a value is in a register
• Immediate
  – Add R4, #3          R4 <- R4 + 3
  – For constants
• Displacement
  – Add R4, 100(R1)     R4 <- R4 + M[100+R1]
  – Accessing local variables
• Register indirect
  – Add R4, (R1)        R4 <- R4 + M[R1]
  – Accessing using a pointer or a computed address
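The four addressing modes can be mimicked with a toy register file and memory. This is only a sketch: the array sizes, the word-indexed memory, and all function names are assumptions chosen for illustration, not part of any real ISA.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS  8
#define MEM_WORDS 256

int32_t R[NUM_REGS];   /* register file R0..R7 */
int32_t M[MEM_WORDS];  /* memory, indexed by word for simplicity (assumption) */

/* Bounds-checked memory read used by the memory-operand modes. */
int32_t load(int32_t addr) {
    assert(addr >= 0 && addr < MEM_WORDS);
    return M[addr];
}

/* Register:          Add Rd, Rs       Rd <- Rd + Rs       */
void add_register(int rd, int rs)                { R[rd] += R[rs]; }

/* Immediate:         Add Rd, #imm     Rd <- Rd + imm      */
void add_immediate(int rd, int32_t imm)          { R[rd] += imm; }

/* Displacement:      Add Rd, d(Rb)    Rd <- Rd + M[d+Rb]  */
void add_displacement(int rd, int32_t d, int rb) { R[rd] += load(d + R[rb]); }

/* Register indirect: Add Rd, (Rb)     Rd <- Rd + M[Rb]    */
void add_register_indirect(int rd, int rb)       { R[rd] += load(R[rb]); }
```

Each function differs only in where the second operand comes from, which is the whole point of an addressing mode.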


Operands and Operations


Operands and Operations

Operands and operations are encoded in instructions

Instruction fields: opcode | result | operand 1 | … | operand n

opcode:
• which operation (ADD, MULT …)
• type of operands (INT, FP)

result and operands:
• operand location (memory or register)
• type (INT, FP)

Examples:
ADD R1, R3, R4
ADD F1, F2, F3
SUB R1, R2, R3
FADD R1, R2, R3


Frequency of Data Access
• The frequency of access to different data types helps in deciding which types are most important to support efficiently


Operations in the Instruction Set
Arithmetic: Add, multiply, subtract, divide
Logical: And, or
Control: branch, jump, procedure call and return
System: OS call, virtual memory management
FP operations: add, multiply, subtract, divide
Decimal: add, multiply, convert
String: move, compare, search
Graphics: pixel and vertex operations


Frequency of Instructions
• The most widely executed instructions are the simple operations of an ISA


Control Flow Instructions (CFI)
• Conditional branches
• Jumps
• Procedure calls
• Procedure returns


Frequency of CFI
• Each event is different and may use different instructions and have different behaviors


How To Specify Branch Condition?
Condition code
• ALU operation sets special bits, so the condition comes for free
• Constrains instruction ordering
Condition register
• Write 0 (false) or 1 (true) into a register after comparison
• Support only BZ and BNZ instructions
Compare and branch
• Compare operands (BLT, BGT, BEQ …) and branch
• The instruction may take a long time to execute
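The condition-register and compare-and-branch styles can be contrasted with a small simulation that returns the next program counter. This is a sketch under stated assumptions: instructions are taken to be 4 bytes wide, and the function names `set_less_than`, `bnz`, and `blt` are illustrative stand-ins rather than the encoding of any particular ISA.

```c
#include <assert.h>
#include <stdint.h>

/* Fall-through address, assuming fixed 4-byte instructions. */
uint32_t next_pc(uint32_t pc) {
    assert(pc % 4 == 0);
    return pc + 4;
}

/* Condition register: the comparison writes 0 (false) or 1 (true). */
int32_t set_less_than(int32_t a, int32_t b) { return a < b ? 1 : 0; }

/* BNZ: branch to `target` when the register is non-zero. */
uint32_t bnz(int32_t reg, uint32_t pc, uint32_t target) {
    return reg != 0 ? target : next_pc(pc);
}

/* Compare and branch: BLT compares and branches in one instruction. */
uint32_t blt(int32_t a, int32_t b, uint32_t pc, uint32_t target) {
    return a < b ? target : next_pc(pc);
}
```

The two-instruction sequence `set_less_than` followed by `bnz` reaches the same target as a single `blt`, at the cost of one extra instruction and one extra register.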


Procedure Invocation Options
Return address and some state must be saved
Caller saving:
• Calling procedure saves registers that it will need upon return
• Must be used for globally accessed variables
Callee saving:
• Called procedure saves registers that it will overwrite


Encoding The Instruction Set

Design decisions affect the size of the instruction, which in turn affects:
• Size of the compiled program
• Ease of decoding


Encoding The opcode Field

Depends on whether every operation can be combined with every addressing mode
• If it can, a separate address specifier is needed for each operand
• If it can’t, the opcode can signify the addressing mode


Instruction Set Design Trade-offs

More registers are better for compiler optimization
More addressing modes bring faster operation
More registers and addressing modes make instructions longer
Shorter instructions and instructions with similar CPI are better for pipelining
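The claim that more registers make instructions longer follows from the number of bits needed to name a register. The sketch below assumes a hypothetical instruction made of one opcode field and three register fields; `bits_needed` and `three_register_bits` are illustrative names.

```c
#include <assert.h>

/* Smallest number of bits that can distinguish `n` different items. */
unsigned bits_needed(unsigned n) {
    assert(n > 0 && n <= (1u << 31));
    unsigned bits = 0;
    while ((1u << bits) < n) {
        bits++;
    }
    return bits;
}

/* Bits used by a hypothetical instruction with one opcode field and
 * three register specifiers (result, operand 1, operand 2). */
unsigned three_register_bits(unsigned num_opcodes, unsigned num_regs) {
    return bits_needed(num_opcodes) + 3 * bits_needed(num_regs);
}
```

Under these assumptions, going from 32 to 64 registers with 64 opcodes grows the instruction from 21 to 24 bits, one extra bit per register field.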


Instruction Formats

Variable
operation and number of operands | addressing mode and address 1 | … | addressing mode and address n
• Works best if there are many operations and addressing modes
• All addressing modes can be used with all instructions
• As few bits as possible to encode the program
• Decoding might be complicated


Instruction Formats
Fixed
operation and addressing mode | address 1 | address 2 | address 3
• Works best if there is a small number of operations and addressing modes
• Larger programs
• Always the same number of bits to encode instructions
• Easy decoding
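The "easy decoding" property of the fixed format can be illustrated by packing the four fields into one word and extracting them with constant shifts. The layout below (four 8-bit fields in a 32-bit word) and the names `Instr`, `encode`, `decode`, and `check_round_trip` are assumptions made for this sketch, not the format of a real machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed 32-bit layout:
 * bits 31..24 operation and addressing mode, 23..16 address 1,
 * bits 15..8  address 2,                      7..0  address 3. */
typedef struct {
    uint8_t op;
    uint8_t a1, a2, a3;
} Instr;

uint32_t encode(Instr in) {
    return ((uint32_t)in.op << 24) | ((uint32_t)in.a1 << 16) |
           ((uint32_t)in.a2 << 8)  |  (uint32_t)in.a3;
}

/* Every field sits at a fixed position, so decoding is a few shifts. */
Instr decode(uint32_t word) {
    Instr in = { (uint8_t)(word >> 24), (uint8_t)(word >> 16),
                 (uint8_t)(word >> 8),  (uint8_t)word };
    return in;
}

/* Sanity check: decoding an encoded instruction returns the same fields. */
void check_round_trip(Instr in) {
    Instr out = decode(encode(in));
    assert(out.op == in.op && out.a1 == in.a1 &&
           out.a2 == in.a2 && out.a3 == in.a3);
}
```

A variable-length format would first have to inspect the operation field to learn how many address fields follow, which is what makes its decoding more complicated.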


Instruction Formats
Hybrid
operation | address spec 1 | address 1
operation | address spec 1 | address spec 2 | address 1
operation | addressing mode 1 | address 1 | address 2


CISC vs. RISC
Complex Instruction Set Computer (CISC)
• Instructions are highly specialized
• Support for a variety of instructions, addressing modes, etc.
• Different CPI and instruction size
Reduced Instruction Set Computer (RISC)
• Short, simple instructions, support for a few addressing modes
• More complex instructions must be programmed
• Same low CPI


Reduced Code Size
Important for embedded applications
Design hybrid version of instruction set with both 16-bit and 32-bit instructions
• 16-bit instructions are simpler, support fewer operations and addressing modes
Compressed code
• Instruction cache contains full instructions
• Memory contains compressed instructions
• On cache miss, instruction is fetched and decompressed


Role of Compilers


Role of Compilers
Compiler generates object code in machine language from a high-level language such as C
Instruction set is compiler’s target
In addition to generating the code, compiler optimizes the code to make it:
• Shorter – 25% to 90%
• Faster
• Better suited to pipelining


Compilation
Compiler makes two to four passes through the code
• In each pass it performs one of the optimizations
• The optimizations are optional and may be skipped to achieve faster compilation
• Passes are sequential
  • If compiler could go back and repeat steps it might discover better optimizations but this would increase time and complexity
Compiler design goals:
• Correctness
• Speed of compilation


Compilation

Front end per language

High-level optimizations

Global optimizer

Code generator


Front End

Transforms high-level language into common intermediate representation
When a new language becomes popular only the front end needs to be rewritten


High-Level Optimizations

Transform the code to take advantage of parallelism and increase speed of execution:
• Loop unrolling – expand the body of the loop to encompass several iterations, thus reducing the number of conditional branches
• Procedure inlining – eliminates procedure call overhead
• Prefetch insertion – prefetch array references in loops

for (i = 0; i < 100; i++) {
    g();
}

⇒

for (i = 0; i < 100; i += 2) {
    g();
    g();
}
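The effect of unrolling by two can be checked by counting calls and loop iterations. In this sketch `g` simply increments a counter, and the names `calls`, `run_rolled`, and `run_unrolled_by_2` are assumptions made for illustration; the unrolled version also assumes an even trip count.

```c
#include <assert.h>

int calls;                 /* number of times g() has run */
void g(void) { calls++; }

/* Original loop: one call and one loop-closing branch per iteration.
 * Returns the number of iterations executed. */
int run_rolled(int n) {
    int iterations = 0;
    for (int i = 0; i < n; i++) {
        g();
        iterations++;
    }
    return iterations;
}

/* Unrolled by two: the same calls with half as many iterations. */
int run_unrolled_by_2(int n) {
    assert(n % 2 == 0);    /* this simple form needs an even trip count */
    int iterations = 0;
    for (int i = 0; i < n; i += 2) {
        g();
        g();
        iterations++;
    }
    return iterations;
}
```

For n = 100 both versions call `g` 100 times, but the unrolled loop executes 50 iterations instead of 100, so it evaluates the loop branch about half as often.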


Global Optimizations
Global and local optimizations
• Global common subexpression elimination – locates several expressions that compute the same value and replaces the later ones with a temporary variable that holds the first result
• Local optimization is done only within a basic block
• Copy propagation: if A=X, replace all later references to A with X
Register allocation
• Allocate most accessed variables to registers
• Since the number of registers is limited, must find a strategy that does not result in too many transfers between the memory and the registers
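Common subexpression elimination and copy propagation can be shown as source-level before and after pairs. Real compilers apply these on an intermediate representation; the hand-written C functions below, and all their names, are only an illustrative sketch.

```c
#include <assert.h>

/* Before CSE: the expression (a + b) is computed twice. */
int before_cse(int a, int b, int c) {
    int x = (a + b) * c;
    int y = (a + b) - c;
    return x + y;
}

/* After CSE: the value is computed once and kept in a temporary. */
int after_cse(int a, int b, int c) {
    int t = a + b;
    int x = t * c;
    int y = t - c;
    return x + y;
}

/* Before copy propagation: a is just a copy of x. */
int before_copy_prop(int x) {
    int a = x;
    return a + a * 2;
}

/* After copy propagation: later references to a are replaced with x. */
int after_copy_prop(int x) {
    return x + x * 2;
}

/* An optimization must not change the result. */
void check_equivalent(int a, int b, int c) {
    assert(before_cse(a, b, c) == after_cse(a, b, c));
    assert(before_copy_prop(a) == after_copy_prop(a));
}
```

After copy propagation the variable `a` is no longer used, so it no longer competes for a register.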


Code Generator
Takes advantage of design features of a specific architecture
• Reorder instructions to improve pipeline performance
• Replace multiplication with addition and shifts
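Replacing a multiplication with shifts and an addition can be illustrated with multiplication by 10, since 10x = 8x + 2x. The constant 10 and the function names are assumptions chosen for this sketch; unsigned arithmetic is used so that both forms wrap around identically on overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward form: one multiply. */
uint32_t times10_mul(uint32_t x) { return x * 10u; }

/* Strength-reduced form: x*8 + x*2 using two shifts and one add. */
uint32_t times10_shift(uint32_t x) { return (x << 3) + (x << 1); }

/* Both forms must agree for every input. */
void check_times10(uint32_t x) {
    assert(times10_mul(x) == times10_shift(x));
}
```

Whether the shifted form is actually faster depends on the target machine, which is why this rewrite belongs in the architecture-specific code generator.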


Which Variables → Registers
Program data allocation
• Stack
  • Local scalar variables and activation records for procedures
  • Best for register allocation
• Global area
  • Global variables and constants
  • Should be allocated to registers if accessed frequently
• Heap
  • Dynamic objects accessed with pointers
  • Should not be allocated to registers
Aliased variables should also not be allocated to registers
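A short example shows why an aliased variable is a poor candidate for a register: once its address has been taken, a store through a pointer can change it, and a copy cached in a register would be stale. The function names below are illustrative assumptions.

```c
#include <assert.h>

/* Writes through a pointer; in general the compiler cannot tell
 * which variable the pointer refers to. */
void store_through(int *p, int value) {
    assert(p != 0);
    *p = value;
}

int aliased_read(void) {
    int v = 1;
    store_through(&v, 42);  /* v is aliased: its address escapes */
    return v;               /* must observe the store, so the result is 42 */
}
```

If `v` were held in a register across the call without being reloaded, the function would wrongly return 1.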


How Can Architecture Help?
Provide regularity
• Operations, data types and addressing modes should be orthogonal
Provide primitives not solutions
• Special features that match kernels or high-level languages are often unusable
Simplify trade-offs among alternatives
• Compilers strive to generate efficient code
• Specify benefits and costs of each alternative
Make use of everything that is known at compile time


Next Weeks …

Week  Date    Topics                              Reading assigned  Quiz
1     Sep 4   Lec01 - Introduction                Chap 1; App B
2     Sep 9   Lec02 - Performance and ISAs                          Q1
2     Sep 11  Lec03 - ISAs and Role of Compilers  App A1-A6
3     Sep 16  Lec04 - MIPS Overview
3     Sep 18  Lec05 - Pipeline                                      Q2
4     Sep 23  Lec06 - Hazards
4     Sep 25  Lec07 - Multi-cycles                App A.7; Chap 2
      Sep 29  Homework 1 due
5     Sep 30  Homework review                                       Q3