
CS510 Computer Architectures, Lecture 4: Instruction Set Architecture


  • Lecture 4: Instruction Set Architecture

    Computer Architectures

  • Instruction Set Architecture

    What a Computer Architecture course covered:
    - 1950s to 1960s: Computer Arithmetic
    - 1970 to mid 1980s: Instruction Set Design, especially ISA appropriate for compilers
    - 1990s: Design of CPU, memory system, I/O system, Multiprocessors


  • Languages of Computers

    Machine Language (ML)
    - Programs consist of machine instructions
    - Directly executable without preprocessing
    - Direct manipulation of machine registers
    - Efficient in terms of machine resource utilization
    - Difficult to program

    Assembly Language (AL)
    - Improved version of machine language with emphasis on user-friendliness
    - Symbolic machine language (symbols for operations and addresses)
    - An assembler is needed to translate into a machine language program

    High-Level Language (HLL)
    - Programs consist of statements, each of which can be translated into several machine language instructions
    - A compiler is needed to translate into a machine language program
    - Relatively easy to program compared to ML or AL
    - Hardware resource utilization may be inefficient


  • Semantic Gap Between ML and HLL

    As hardware cost goes down, software cost goes up
    - Shortage of programmers
    - Unreliable software => unreliable computers

    Response: keep the programming cost down
    - Develop powerful, complex, user-friendly HLLs
    - HLL programmers are easy to train

    Greater semantic gap between HLL and machine language
    - Execution inefficiency
    - Software complexity
    - Compiler complexity

    To offset the semantic gap
    - Large instruction set
    - Variety of addressing modes
    - Hardware/firmware implementation of HLL primitives


  • Instruction Set

    Boundary between designers (architects) and programmers
    - For designers: specification of the function of the CPU
    - For programmers: a pool of functions from which they choose what to use in the program

    "One would expect that human language should directly reflect the characteristics of human intellectual capabilities, that language should be a direct mirror of mind in ways which other systems of knowledge and belief cannot." - Noam Chomsky

    Instruction set
    - Language of a machine
    - Characterizes the machine's capability and behavior

    Performance issues
    - Memory bandwidth (MB) is used 1/2 for instructions and 1/2 for data
    - For efficient utilization of MB, instruction representation must be as compact as possible whilst still being compatible with data
    - The von Neumann bottleneck exists in MB


  • Memory Bandwidth Issue

    - Memory bandwidth is used by the CPU and I/O
    - Memory bandwidth given to the CPU is used for instruction fetches and for operand fetches or operand stores
    - Consider an AC machine executing ADD X or LDA X: each takes one instruction fetch and one operand fetch


  • Machine Language

    Vocabulary
    - Operations
    - Addressing modes for operand addresses and the next instruction address

    Syntax
    - Methods of representing the operation (OP-code), operands, and addresses in an instruction
    - Instruction format
    - Encoding of instruction fields

    Grammar
    - Rules for using instructions to make a program


  • Components of an Instruction

    Operation Code (OP-code)
    - Format specifier: long / short, field definition
    - Operation
    - Types of operands

    Operand Address(es)
    - Operand itself
    - Addresses themselves (including abbreviated)
    - Address modification specification: automatic indexing, relative address

    Sequencing


  • Instruction Set and Computer Architecture

    Computer architectures are classified into three classes according to the register structure used for operand storage:
    - AC (Accumulator) Computer Architecture
    - Stack Computer Architecture
    - General Purpose Register (GPR) Computer Architecture


  • Stack Computer Architecture

    Instruction / Operation (recovered from the slide's table):
    - PUSH X: if F=1, then S overflow
    - POP X: if E=1, then empty S
    - Unary instruction, e.g. (Shift Left): if E=1, then empty S
    - Binary instruction, e.g. (ADD): if E=1, then empty S; if SP=(n-1), ...


  • Characteristics of the Stack Architecture

    - Instruction length is short: no need to represent the address(es) of operand(s) in functional instructions
    - Instruction execution time is fast: operand access is fast because the operands are in the stack (register)
    - Operand(s) must be stored in the stack before operating on them
      - Inconvenient to prepare data in the stack
      - Frequent use of PUSH and POP instructions to prepare data in the stack, which means memory accesses


  • AC Computer Architecture

    Characteristics:

    - Instruction execution time of binary instructions is slow: one of the operands must be read from memory

    - Instruction length is longer than in the stack architecture: one operand's memory address must be specified in the instruction, although the AC (a data register) can be implied

    - Frequency of LDA/STA instructions is high: there is only one data register

    Figure labels: (CPA), (ADD X), Transfer Instruction


  • GPR Computer Architecture

    Characteristics:
    - Instruction length is short because register addresses are used for operands
    - Instruction execution time is fast because all the operands are in registers
    - Frequency of LD/ST instructions depends on the number of registers
    - Results of operations are likely to be kept in GPRs because there are many registers

    Instruction types: Unary Instruction, Binary Instruction, Transfer Instruction


  • Computer Architecture?

    ". . . the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation." - Amdahl, Blaauw, and Brooks, 1964


  • Towards Evaluation of ISA and Organization


  • Interface Design

    A good interface:
    - Lasts through many implementations (portability, compatibility)
    - Is used in many different ways (generality)
    - Provides convenient functionality to higher levels
    - Permits an efficient implementation at lower levels


  • Evolution of Instruction Sets

    Single Accumulator (EDSAC 1950)


  • Evolution of Instruction Sets

    - Major advances in computer architecture are typically associated with landmark instruction set designs
      - Ex: Stack (B1700) vs GPR (System/360)
    - Design decisions must take into account:
      - technology (components)
      - machine organization
      - programming languages
      - compiler technology
      - operating systems
    - And they in turn influence these


  • Design Space of ISA

    Five primary dimensions
    - Number of explicit operands: 0, 1, 2, 3
    - Operand storage: where besides memory?
    - Effective address: how is the memory location specified?
    - Type and size of operands: byte, int, float, vector, . . . How is it specified?
    - Operations: add, sub, mul, . . . How is it specified?

    Other aspects
    - Successor: how is it specified?
    - Conditions: how are they determined?
    - Encoding: fixed or variable? Wide?
    - Parallelism


  • Number of Explicit Operands

    To optimize the memory bandwidth required for fetching instructions from memory, the number of explicitly specified operands in an instruction needs to be reduced.
    - 2 operands (GPR machine): 2 source operands; 1 of the source operands is destroyed after execution to store the result
    - 1 operand (AC machine): 1 of the operands is implied to be a specific hardware register called the Accumulator (AC); the result of the execution is also stored in this register
    - 0 operands (Stack machine): both of the operands and the result are implied to be on a stack


  • Operand Storage

    Memory
    - Long memory addressing
    - Need to represent the address with a few bits
      - Relative addressing with displacement
      - Page/segment addressing

    Register
    - General purpose register: short register addressing
    - AC
    - Stack (register): does not need addresses


  • Address Space and Storage Space

    - Address Space: consists of the addresses that programmers can use
    - Storage Space: consists of the physical storage locations
    - For a simple low-cost machine, the Address Space and the Storage Space are identical
      - Programmers program with the actual storage addresses
    - Modern computers provide storage systems with independent Address and Storage Spaces
      - An Effective Address (EA) needs to be obtained from the address used in the program to access the operand in memory
      - Usually the Address Space is much larger than the Storage Space: Virtual Storage System


  • Effective Address

    - Address and physical storage location are two different concepts
    - Addresses of operands are represented or implied in the instruction
    - An operand's address needs to be mapped into an Effective Address (EA) of the physical storage location

    Basic Addressing Modes (A or R in instructions)

    Mode            Algorithm       Advantage             Disadvantage
    Immediate       opd = A         no M reference        limited operand value
    Direct          EA = A          simple                limited address space
    Indirect        EA = M[A]       large address space   multiple M references
    Register        EA = R          no M reference        limited address space
    R Indirect      EA = [R]        large address space   extra M reference
    Displacement    EA = A + [R]    flexibility           complexity
    Stack           opd = S[TOP]    no M reference        limited applications
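The basic addressing modes can be tried out on a toy machine. The following is a minimal sketch, not part of the lecture: the `MEM` and `REGS` contents and the function name `fetch_operand` are made up for illustration, and the reference count covers only operand-related memory accesses (not the instruction fetch).

```python
# Hypothetical toy machine: memory and registers are plain dictionaries.
MEM = {100: 200, 200: 7, 205: 9}   # address -> contents
REGS = {1: 200}                    # register number -> contents


def fetch_operand(mode, a=None, r=None, mem=MEM, regs=REGS):
    """Return (operand, operand_memory_references) for one addressing mode.

    `a` is the address field A of the instruction, `r` the register field R.
    """
    if mode == "immediate":            # opd = A
        return a, 0
    if mode == "direct":               # EA = A
        return mem[a], 1
    if mode == "indirect":             # EA = M[A]
        return mem[mem[a]], 2
    if mode == "register":             # operand is held in register R
        return regs[r], 0
    if mode == "register_indirect":    # EA = contents of register R
        return mem[regs[r]], 1
    if mode == "displacement":         # EA = A + contents of register R
        return mem[a + regs[r]], 1
    raise ValueError(f"unknown addressing mode: {mode}")


results = {
    "immediate": fetch_operand("immediate", a=5),
    "direct": fetch_operand("direct", a=200),
    "indirect": fetch_operand("indirect", a=100),
    "register": fetch_operand("register", r=1),
    "register_indirect": fetch_operand("register_indirect", r=1),
    "displacement": fetch_operand("displacement", a=5, r=1),
}
```

The memory-reference counts line up with the trade-offs listed for each mode: immediate and register need none, indirect needs two.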


  • Specification of Type and Size of Operand

    Specification of the type of the operand
    - Usually different op-codes for different types of operands

    Specification of the size of the operand
    - Op-code represents the resolution of the operand address: bit, byte, half word (upper/lower half), word, ...

    Length of operands
    - Implicit
    - Variable length
      - Specified explicitly in the instruction
      - Specified by a designated register
      - Specified by delimiter marks in the operand
        - reserved-bit delimiter (field or word mark)
        - reserved-bit configuration (record or group mark)


  • Operation

    Specification
    - Encoded, to reduce the instruction length

    Types
    - Minimal Instruction Set
    - Complex Instruction Set vs RISC


  • Four Types of Operations

    - Functional: ADD, AND, CPA, CPC, ROL, CLA, CLC, INC, ...
    - Transfer: LDA, STA (LD, ST), ...
    - Control: JMP, JNA, JZA, JZC (SMA, SZA, SZC), ...
    - Input/Output: INP, OUT, ...


  • Minimal Instruction Set


  • Why NOT Use a Minimal Instruction Set?

    - Inefficient
      - Program size (M bandwidth)
      - Large IC and CPI
    - Programming difficulty


  • Instruction Set Design: Operations to Include in the Instruction Set

    Trade-off among the 3 Es (Elegance, Efficiency, Environment)

    Elegance
    - Completeness (even Bn instruction is complete)
    - Symmetry: AC

  • ISA Metrics

    Aesthetics:
    - Orthogonal: no special registers, few special cases, all operand modes available with any data type or instruction type
    - Completeness: support for a wide range of operations and target applications
    - Regularity: no overloading for the meanings of instruction fields
    - Streamlined: resource needs easily determined

    Ease of compilation (programming?)

    Ease of implementation

    Scalability


  • Powerful Instruction

    Rich, powerful instruction:
    - Instruction with a longer Execution Time (E) to balance the overhead penalty (O)
    - Instruction which has a large E/O

    Figure labels: Overhead for Execution (O), Execution (E)


  • Powerful Instructions

    - Extended arithmetic functions: multiply, divide, trigonometric functions, etc.
    - Automatic indexing: BCT R1, addr (R1

  • Powerful Instructions

    Process State Exchange (Context Switch)
    - Instructions required in multiprogramming environments
    - Otherwise: LD R1, addr; LD R2, addr+1; ... LD R5, addr+4
    - XJ (Exchange Jump of the CDC 6000 series)


  • Basic ISA Classes: Type of Internal Storage


  • Stack Machines

    Instruction set:
    - Arithmetic operators (+, -, *, /, . . .)
    - push A, pop A
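A stack machine with this instruction set can be sketched in a few lines. This is an illustrative interpreter only: the mnemonics `add`, `sub`, `mul`, `div` and the variable names are assumptions, since the slide names only the operators and `push A` / `pop A`.

```python
import operator

# Hypothetical mnemonics for the arithmetic operators (+, -, *, /).
BINARY_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,
}


def run_stack_program(program, memory):
    """Execute a list of stack-machine instructions and return the memory."""
    stack = []
    for instruction in program:
        opcode, *operand = instruction.split()
        if opcode == "push":        # push A: memory -> top of stack
            stack.append(memory[operand[0]])
        elif opcode == "pop":       # pop A: top of stack -> memory
            memory[operand[0]] = stack.pop()
        else:                       # arithmetic: both operands are implied
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[opcode](left, right))
    return memory


# C = A + B * D, with illustrative variable names and values.
memory = run_stack_program(
    ["push A", "push B", "push D", "mul", "add", "pop C"],
    {"A": 2, "B": 3, "D": 4},
)
```

Only `push` and `pop` carry an address; the arithmetic instructions need no operand fields, which is why functional instructions on a stack machine are short.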


  • The Case Against Stacks

    - Performance is derived from the existence of several fast registers, not from the way they are organized
    - Data does not always surface when needed
      - Constants, repeated operands, common sub-expressions
      - so TOP and Swap instructions are required
    - Code density is about equal to that of GPR instruction sets
      - Registers have short addresses
      - Keep things in registers and reuse them
    - Slightly simpler to write a poor compiler, but not an optimizing compiler


  • GPR (General Purpose Register) Machines

    - Faster than memory
    - Easier for a compiler to use
    - Used to hold variables and intermediate operands
      - memory traffic is reduced
      - code density improves
    - How many registers? Depends on how they are used by the compiler


  • How Many Registers in RF

    - We need to try to keep the live registers in the register file (RF)
    - Data: 6 algorithms from CALGO (ACM) written in 4 languages: ALGOL, BASIC, BLISS, FORTRAN


  • GPR Machines

    - Maximum number of operands (O): two or three operands
    - Number of memory addresses (M): 0, 1, 2, 3


  • GPR Machines

    Type (M, O): number of memory addresses, maximum number of operands

    Register-register (0,3)
    - Advantages: Simple, fixed-length instruction encoding. Simple code generation model.
    - Disadvantages: Higher instruction count. Some instructions are short and bit encoding may be wasteful.

    Register-memory (1,2)
    - Advantages: Data can be accessed without loading first. Instruction format tends to be easy to encode and yields good density.
    - Disadvantages: A source operand is destroyed. Clocks per instruction vary by operand location.

    Memory-memory (3,3)
    - Advantages: Program becomes most compact. No waste of registers for temporaries.
    - Disadvantages: Large variation in instruction sizes and in work per instruction. Memory accesses create a memory bottleneck.


  • R-R vs RM

    A+B+C

    RR instructions
      LD  R1,A
      LD  R2,B
      LD  R3,C
      ADD R4,R1,R2
      ADD R5,R4,R3

    RM instructions
      LD  R1,A
      ADD R1,B
      ADD R1,C

    RM instructions reduce IC
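The two instruction sequences for A+B+C can be run side by side to compare instruction count (IC) and data memory references. This is a minimal sketch: the tuple encoding of instructions, the function name `run`, and the values stored in `memory` are assumptions made for illustration.

```python
def run(program, memory):
    """Execute LD/ADD instructions; return (registers, IC, data memory refs)."""
    regs = {}
    data_refs = 0
    for opcode, *args in program:
        if opcode == "LD":                       # LD Rd, X
            rd, addr = args
            regs[rd] = memory[addr]
            data_refs += 1
        elif opcode == "ADD" and len(args) == 3:  # register-register ADD
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif opcode == "ADD" and len(args) == 2:  # register-memory ADD
            rd, addr = args
            regs[rd] += memory[addr]
            data_refs += 1
        else:
            raise ValueError(f"unsupported instruction: {opcode} {args}")
    return regs, len(program), data_refs


memory = {"A": 1, "B": 2, "C": 3}  # illustrative values

rr_program = [
    ("LD", "R1", "A"), ("LD", "R2", "B"), ("LD", "R3", "C"),
    ("ADD", "R4", "R1", "R2"), ("ADD", "R5", "R4", "R3"),
]
rm_program = [("LD", "R1", "A"), ("ADD", "R1", "B"), ("ADD", "R1", "C")]

rr_regs, rr_ic, rr_refs = run(rr_program, memory)
rm_regs, rm_ic, rm_refs = run(rm_program, memory)
```

Both sequences read the same three operands from memory; the RM version saves instructions by folding each load into an ADD.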


  • What About Actual Programs

    Consider a GPR machine with a large register file.
    - It is highly probable that intermediate data can be found in a register
    - Thus, LD/ST instructions will be used less frequently
    - However, the frequency of LD/ST instructions in computers that use RM instructions will be reduced further


  • VAX-11

    - Variable format, 2- and 3-address instructions
    - 32-bit word size, 16 GPRs (4 reserved)
    - Rich set of addressing modes (apply to any operand)
    - Rich set of operations: bit field, stack, call, case, loop, string, poly, system
    - Rich set of data types (B, W, L, Q, O, F, D, G, H)
    - Condition codes


  • Kinds of Addressing Modes

    Addressing mode        Value in [ ] is the operand
    Register direct        [Ri]
    Immediate (literal)    v
    Direct (absolute)      M[v]
    Register indirect      M[[Ri]]
    Base+Displacement      M[[Ri] + v]
    Base+Index             M[[Ri] + [Rj]]
    Scaled Index           M[[Ri] + [Rj]*d + v], e.g. d=8
    Autoincrement          M[[Ri]+1]
    Autodecrement          M[[Ri] - 1]
    Memory Indirect        M[ M[Ri] ]   [Indirection Chains]


  • Memory Addressing Modes (VAX)


  • Operand Address Bits: Displacement Values

    - This value is related to the operand address field when the address is represented by the displacement from the base address
    - Wide distribution
    - The vast majority: positive
    - A majority of the large displacements: negative


  • Operand Address Bits: Immediate Addressing Mode

    Figure: Percentage of operations that use immediates


  • Operand Address bits: Immediate Addressing Mode


  • Operations in the Instr. Set

    Operator type             Examples
    Arithmetic and logical    Add, Subtract, ...
    Data transfer             Load, Store, Move, ...
    Control                   Branch, Jump, Procedure Call, Return, Trap
    System                    Operating System Call, VMM instructions
    Floating Point            Floating Point Add
    Decimal                   Decimal Add, Decimal-to-Character Conversion
    String                    String Move, String Compare, String Search
    Graphics                  Pixel operations, Compress/Decompress operations


  • Operations in the Instr. Set

    Rank    80x86 instruction     Integer average (% total executed)
    1       load                  22%
    2       conditional branch    20%
    3       compare               16%
    4       store                 12%
    5       add                    8%
    6       and                    6%
    7       sub                    5%
    8       move reg-reg           4%
    9       call                   1%
    10      return                 1%
    Total                         96%


  • Control Flow Instructions


  • RISC


  • Instruction Execution Characteristics: Type of Operations

    What type of statement is most frequent?
    - Assignment statements dominate
      - Functional instructions and Transfer instructions
      - Movement of data must be made simple, thus fast
    - Conditional statements (if and loop together)
      - Instructions with a Control function
      - Sequence control mechanism is important

    Figure: Relative dynamic frequencies of statements in HLL programs


  • Instruction Execution Characteristics: Time Consumed by Statements

    Machine instruction weighted = [Average No. of machine instructions / Statement] x [Frequency of Occurrence]

    Memory reference weighted = [Average No. of memory references / Statement] x [Frequency of Occurrence]

    The most time-consuming statement is procedure CALL/RETURN


  • Instruction Execution Characteristics: Type of Operands

    - The majority of references are to scalars
      - 80% are local to a procedure
    - References to arrays/structures require an index or pointer
    - Locations of operands (average per instruction)
      - 0.5 operands in memory
      - 1.4 operands in registers


  • Instruction Execution Characteristics: Procedure Calls

    Two most significant aspects in implementing procedure Call/Return
    - Number of parameters
    - Depth of nesting

    Statistics on number of parameters
    - 98% of dynamically called procedures were passed fewer than 6 parameters
    - 92% of them used fewer than 6 local scalar variables


  • Multiple Register Sets

    - Assume that we have several sets of registers, so that each set can be used by a different procedure
    - This saves time in procedure CALL/RETURN, which only needs to change the register set pointer value


  • Instruction Execution Characteristics: Depth of Procedure Nesting

    Depth statistics
    - A window depth of 8 will need to shift on less than 1% of calls and returns

    Procedure nesting and register set window
    - Shifting the register set window: the information in one register set must be saved in memory so that the register set can be used by the new procedure
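How often a window shift is needed depends on how the nesting depth moves over time, not on the absolute depth. The following is a simplified model for experimenting with that idea, not the measurement method behind the 1% figure: the function name, the `"call"`/`"return"` trace encoding, and the one-window-per-shift policy are all assumptions.

```python
def count_window_shifts(trace, num_windows):
    """Count window shifts (spills to or restores from memory) for a trace.

    `trace` is a sequence of "call" / "return" events. The register file
    holds the `num_windows` most recently used nesting levels.
    """
    depth = 0     # current nesting depth
    lowest = 0    # shallowest nesting level still held in the register file
    shifts = 0
    for event in trace:
        if event == "call":
            depth += 1
            if depth - lowest >= num_windows:
                # All windows are in use: save the oldest one to memory.
                lowest += 1
                shifts += 1
        elif event == "return":
            depth -= 1
            if depth < lowest:
                # The caller's window was saved earlier: restore it.
                lowest -= 1
                shifts += 1
        else:
            raise ValueError(f"unknown event: {event}")
    return shifts


trace = ["call"] * 3 + ["return"] * 3
shifts_with_8_windows = count_window_shifts(trace, 8)
shifts_with_2_windows = count_window_shifts(trace, 2)
```

With enough windows to cover the depth swings of the trace, no call or return touches memory; with too few, every excursion beyond the window count costs a spill and a later restore.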


  • RISC Philosophy (1): Make the Most Frequent Statements Execute Fast

    The most frequent statements are assignment statements, and each of them is translated by the compiler into a set of Functional Instructions and/or Transfer Instructions. Thus Functional and Transfer Instructions need to be made to execute fast.

    Figure: Instruction cycle of a Functional Instruction or Transfer Instruction


  • Assignment Statements

    To make the Instruction Fetch fast
    - Short OP-code part: small number of instructions in the instruction set
    - Short operand address part: keep the operands in registers instead of memory

    To make the Instruction Preparation fast
    - Fixed-length instructions
    - Fixed-format instructions
    - Simple addressing modes

    To make the Operand Fetch fast
    - Make the operands available from registers instead of memory
    - Needs a large register file

    To make the Instruction Execution fast
    - Multiple register sets (MRS); overlapping MRS
    - Instruction execution pipeline


  • RISC Philosophy (2): Make the Most Time-Consuming Statements Execute Fast

    Methods of passing parameters

    Through memory
    - Parameters are stored in memory locations that are accessible by both the calling and the called procedures
    - Execution of CALL and RETURN instructions is very slow due to the memory accesses, especially when there are many parameters to pass

    Through registers
    - Parameters are stored in registers in the CPU
    - The calling procedure needs to save in memory the registers that are not used for passing parameters. This results in a lot of memory accesses and makes these instructions slow.


  • CISC and RISC

    RISC
    - A limited and simple instruction set
    - A large number of GPRs (register file)
    - An emphasis on optimizing the instruction pipeline


  • Large Register File

    If the number of registers is small, a strategy is needed to keep the most frequently accessed operands in registers and minimize register-memory traffic

    - Software approach: maximize register usage by the compiler (requires sophisticated program analysis)

    - Hardware approach: more registers in the register file

    Quick access to operands is desirable
    - Assignment statements rely on Functional and Transfer Instructions
    - Functional Instructions rely heavily on registers
    - Frequency of Transfer Instructions depends on the number of registers in the register file


  • Register Window

    Fact
    - Statistically, most operand references (80%) are to local scalars
    - Variables local to a procedure cannot be accessed by other procedures

    Problem
    - Locals change with each procedure CALL/RETURN
    - CALL/RETURN occurs frequently
    - Parameters need to be passed around

    Observations
    - Statistically, a few parameters(

  • Multiple Register Set

    Each register set is assigned to a different procedure
    - The size of a register set is equal to the size of a window
    - Parameters need to be copied into the called/calling procedure's register set; however, there is no need to copy all the registers from the switched-off register set
    - Requires register move instructions


  • Overlapping Register Window

    - When multiple register sets are implemented in a large register file, each register set is called a Register Window
    - Multiple register sets still require copying the parameter values between register sets

    Overlapping Register Window
    - Portions of register windows overlap for passing parameters
    - At any time only one window is visible
    - No need to move information for parameter passing

    How about global variables?


  • Global Variables

    Global variables are accessible by all procedures

    - Assign to memory locations by the compiler
      - Straightforward, but inefficient for frequently accessed global variables because of frequent memory accesses
    - Set aside a set of global variable registers
      - Available to all procedures
      - Unified register numbering system to simplify the instruction format, e.g. R0 ~ R7: global, R8 ~ R13: current window


  • Linear Organization of Register Windows


  • Code Size

    Smaller programs
    - Program takes less memory space
    - Smaller program improves performance
      - Fewer instructions
      - Fewer bytes to fetch
      - In a paging environment, occupies fewer pages and reduces page faults

    CISC
    - Smaller number of instructions in the program (the program may be shorter, but does not necessarily take less space)


  • Example

    CISC
    - Memory traffic
      - Instruction: 56 bits
      - Data: 32 x 3 = 96 bits
      - Total MB used: 56 + 96 = 152 bits

    RISC
      LD  Rb B
      LD  Rc C
      ADD Ra Rb Rc
      ST  Ra A
    - Memory traffic
      - Instruction: 104 bits
      - Data: 96 bits
      - Total MB used: 104 + 96 = 200 bits
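The memory traffic totals can be re-derived from field widths. The widths below (8-bit opcode, 16-bit memory address, 4-bit register number, 32-bit data word) are assumptions, not stated on the slide; they are chosen because they reproduce the 56-bit CISC instruction and the 152-bit and 200-bit totals, with the RISC instruction traffic coming to 104 bits.

```python
# Assumed field widths in bits (not given in the lecture).
OPCODE_BITS = 8
ADDRESS_BITS = 16
REGISTER_BITS = 4
WORD_BITS = 32


def instruction_bits(memory_addresses, registers):
    """Size of one instruction with the given numbers of operand fields."""
    return (OPCODE_BITS
            + memory_addresses * ADDRESS_BITS
            + registers * REGISTER_BITS)


# CISC: a single memory-to-memory instruction computing A = B + C.
cisc_instruction = instruction_bits(memory_addresses=3, registers=0)
cisc_data = 3 * WORD_BITS                 # read B, read C, write A
cisc_total = cisc_instruction + cisc_data

# RISC: LD Rb B; LD Rc C; ADD Ra Rb Rc; ST Ra A.
risc_instruction = (3 * instruction_bits(memory_addresses=1, registers=1)
                    + instruction_bits(memory_addresses=0, registers=3))
risc_data = 3 * WORD_BITS                 # two loads and one store
risc_total = risc_instruction + risc_data
```

The data traffic is identical in both cases; the difference comes entirely from fetching four instructions instead of one.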


  • Characteristics of RISC (1, 2)

    (1) 1 instruction per cycle (memory cycle)
    - Machine cycle: IF + IP + time to fetch the operands from registers + perform operation + store the result in a register
    - RISC instruction ≈ CISC micro-instruction => no need to microprogram (hardwired control)

    (2) Register-to-register operation
    - With only simple Load and Store operations for accessing memory (Load/Store architecture)
    - Simplifies the instruction set and the control unit


  • Characteristics of RISC (3, 4)

    (3) Simple addressing modes: shorten EA generation time
    - Almost all instructions use register addressing
    - Relative addressing using PC, BAR, and index address
    - Other complex modes may be synthesized by software

    (4) Simple instruction format: shorten instruction decoding time
    - Usually one format
    - Fixed length, aligned on a word boundary
    - Fixed field length


  • Characteristics of RISC (5)

    (5) Pipelining (we will learn this later in detail)

    At this time, you just need to know that
    - Instruction execution hardware can be made of a few interconnected independent sub-modules, called pipeline STAGEs
    - An instruction's execution progresses through each pipeline stage in sequence
    - When an instruction completes its execution at the i-th stage, the next instruction commences its execution at the i-th stage
    - Thus, in the ideal situation, throughput increases nearly n times, where n is the number of pipeline stages
    - Branch instructions make pipelined execution inefficient


  • Pipelined Execution

    Execution of a sequence of instructions (1 instruction execution takes 4t)
    - Completion times: I0 at 4t, I1 at 5t, I2 at 6t, I3 at 7t, I4 at 8t
    - n instructions complete at (n+3)t
    - When n is large, this becomes nt
    - Thus, 1 instruction completes every t
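The (n+3)t completion time is the 4-stage case of the general ideal-pipeline formula (n + k - 1)t for k stages. A small sketch, assuming equal stage times and no stalls; the function names are illustrative:

```python
def pipelined_time(n, stages=4, t=1):
    """Time for n instructions on an ideal pipeline: (n + stages - 1) * t."""
    return (n + stages - 1) * t


def unpipelined_time(n, stages=4, t=1):
    """Time for n instructions when each one takes all stages back to back."""
    return n * stages * t


def speedup(n, stages=4):
    """Ideal speedup of pipelined over unpipelined execution."""
    return unpipelined_time(n, stages) / pipelined_time(n, stages)


first_five = [pipelined_time(i + 1) for i in range(5)]  # I0..I4 completion times
large_n_speedup = speedup(1000)
```

For large n the speedup approaches the number of stages, which is the "throughput increases nearly n times" claim made for pipelining.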


  • A "Typical" RISC

    - 32-bit fixed-format instructions (3 formats)
    - 32 32-bit GPRs (R0 contains zero, DP takes a pair)
    - 3-address, R-R functional instructions
    - Single address mode for load/store: base + displacement, no indirection
    - Simple branch conditions
    - Delayed branch

    see: SPARC, MIPS, MC88100, AMD 29000, i960, i860, PA-RISC, DEC Alpha, Clipper, CDC 6600, CDC 7600, Cray-1, Cray-2, Cray-3


  • Branch Displacement


  • Implementation of Conditional Branch Instructions

    Evaluating branch conditions

    Condition Code (CC)
    - How condition is tested: Test special bits set by ALU operations, possibly under program control.
    - Advantages: Sometimes condition is set for free; if not, 2 instrs for a branch.
    - Disadvantages: CC is an extra state. CCs constrain the ordering of instrs since they pass info from one instr to a branch.

    Condition Register
    - How condition is tested: Test arbitrary registers set by the result of a comparison.
    - Advantages: Simple (2 instrs for a branch).
    - Disadvantages: Uses up a register.

    Compare and Branch
    - How condition is tested: Compare is part of the branch. Often compare is limited to a subset.
    - Advantages: 1 instr rather than 2 for a branch.
    - Disadvantages: May be too much work per instruction.


  • Putting It All Together: DLX Architecture

    Read Section 2.8 (MUST)

    DLX emphasizes
    - A simple load-store instruction set
    - Design for pipelining efficiency
    - A fixed instruction set encoding
    - Efficiency as a compiler target


  • Example: MIPS


  • The Different Goals for VAX and MIPS

    VAX: simple compilers and code density
    - powerful addressing modes
    - powerful instructions
    - efficient instruction encoding
    - few registers

    MIPS: high performance via pipelining, ease of HW implementation, compatibility with highly optimizing compilers
    - simple instructions
    - simple addressing modes
    - fixed-length instruction formats
    - a large number of registers


  • VAX vs. MIPS


  • Fallacies and Pitfalls

    - Pitfall: Designing a high-level instruction set feature specifically oriented to supporting a high-level language structure.
    - Fallacy: There is such a thing as a typical program.
    - Fallacy: An architecture with flaws cannot be successful.
      - 80x86: supports segmentation, while others support paging
      - Extended AC for integer, while others use GPRs
      - Stack for FP operations, while others abandoned the stack
    - Fallacy: You can design a flawless architecture.
      - All architecture design involves trade-offs made in the context of a set of HW and SW technologies.


  • Most Popular ISA of All Time: Intel 80x86

    - 1971: Intel invents the microprocessor: 4004/8008, 8080 in 1975
    - 1975: Gordon Moore realized there was one more chance for a new ISA before the ISA was locked in for decades
      - Hired CS people in Oregon
      - Weren't ready in 1977 (CS people did the 432 in 1980)
      - Started crash effort for a 16-bit microcomputer
    - 1978: 8086: dedicated registers, segmented address, 16-bit
      - 8088: 8-bit external bus version of the 8086; added as an afterthought


  • Most Popular ISA of All Time: Intel 80x86

    - 1980: IBM selects the 8088 as the basis for the IBM PC
    - 1980: 8087 floating-point coprocessor: adds 60 instructions using a hybrid stack/register scheme
    - 1982: 80286: 24-bit address, protection, memory mapping
    - 1985: 80386: 32-bit address, 32-bit GP registers, paging
    - 1989: 80486, and Pentium in 1992: faster, plus a few MP instructions


  • 80x86 Addressing/Protection


  • 80x86 Instruction Format

    Figure: 8086 in blue; 80386 extensions in red


  • 80x86 Instruction Encoding: Address Specifier Field: Mod, Reg, R/M

    Address specifier: Reg = 3 bits, R/M = 3 bits, Mod = 2 bits

    reg field
    reg    w=0    w=1 (16b)    w=1 (32b)
    0      AL     AX           EAX
    1      CL     CX           ECX
    2      DL     DX           EDX
    3      BL     BX           EBX
    4      AH     SP           ESP
    5      CH     BP           EBP
    6      DH     SI           ESI
    7      BH     DI           EDI

    r/m field
    r/m    mod=0 (16b)    mod=0 (32b)    mod=1 (16b)    mod=1 (32b)    mod=2 (16b)    mod=2 (32b)
    0      addr=BX+SI     =EAX           mod=0 + d8     mod=0 + d8     mod=0 + d16    mod=0 + d32
    1      addr=BX+DI     =ECX           mod=0 + d8     mod=0 + d8     mod=0 + d16    mod=0 + d32
    2      addr=BP+SI     =EDX           mod=0 + d8     mod=0 + d8     mod=0 + d16    mod=0 + d32
    3      addr=BP+DI     =EBX           mod=0 + d8     mod=0 + d8     mod=0 + d16    mod=0 + d32
    4      addr=SI        =(sib)         SI+d8          (sib)+d8       SI+d16         (sib)+d32
    5      addr=DI        =d32           DI+d8          EBP+d8         DI+d16         EBP+d32
    6      addr=d16       =ESI           BP+d8          ESI+d8         BP+d16         ESI+d32
    7      addr=BX        =EDI           BX+d8          EDI+d8         BX+d16         EDI+d32

    mod=3: same as the reg field


  • 80x86 Instruction Encoding: Sc/Index/Base field

    Value    Index       Base
    0        EAX         EAX
    1        ECX         ECX
    2        EDX         EDX
    3        EBX         EBX
    4        no index    ESP
    5        EBP         if mod=0, d32; if mod≠0, EBP
    6        ESI         ESI
    7        EDI         EDI

    Base + Scaled Index Mode
    Used when: mod = 0, 1, 2 in 32-bit mode and r/m = 4

    2-bit Scale Field, 3-bit Index Field, 3-bit Base Field
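The address specifier and Sc/Index/Base bytes can be split into their fields with shifts and masks. This sketch assumes the standard x86 bit layout, which the slides do not spell out: mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0, and likewise scale, index, base for the second byte, with the scale field encoding a multiplier of 1, 2, 4, or 8. The function names are illustrative.

```python
def decode_modrm(byte):
    """Split an address specifier byte into its mod, reg, and r/m fields."""
    return {
        "mod": (byte >> 6) & 0b11,    # 2-bit Mod field
        "reg": (byte >> 3) & 0b111,   # 3-bit Reg field
        "rm": byte & 0b111,           # 3-bit R/M field
    }


def needs_sib(modrm):
    """In 32-bit mode, an Sc/Index/Base byte follows when mod is 0, 1, or 2 and r/m = 4."""
    return modrm["mod"] in (0, 1, 2) and modrm["rm"] == 4


def decode_sib(byte):
    """Split an Sc/Index/Base byte into scale factor, index, and base."""
    return {
        "scale": 1 << ((byte >> 6) & 0b11),  # 2-bit field -> factor 1, 2, 4, or 8
        "index": (byte >> 3) & 0b111,        # 3-bit Index field (4 = no index)
        "base": byte & 0b111,                # 3-bit Base field
    }


modrm = decode_modrm(0x44)   # binary 01 000 100
sib = decode_sib(0x88)       # binary 10 001 000
```

Here `0x44` decodes to mod=1, reg=0, r/m=4, so an Sc/Index/Base byte follows and an 8-bit displacement is added; `0x88` then selects index ECX scaled by 4 with base EAX.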


  • 80x86 Addressing Mode Usage for 32-bit Mode

    Usage per program (four programs; the last column is their average)

    Register indirect                   10%    10%     6%     2%     7%
    Base + 8-bit disp                   46%    43%    32%     4%    31%
    Base + 32-bit disp                   2%     0%    24%    10%     9%
    Indexed                              1%     0%     1%     0%     1%
    Based indexed + 8b disp              0%     0%     4%     0%     1%
    Based indexed + 32b disp             0%     0%     0%     0%     0%
    Base + Scaled Indexed               12%    31%     9%     0%    13%
    Base + Scaled Index + 8b disp        2%     1%     2%     0%     1%
    Base + Scaled Index + 32b disp       6%     2%     2%    33%    11%
    32-bit Direct                       19%    12%    20%    51%    26%


  • 80x86 Length Distribution


  • Instruction Counts: 80x86 vs. DLX

    Program     80x86             DLX               Ratio
    gcc          3,771,327,742     3,892,063,460    1.03
    espresso     2,216,423,413     2,801,294,286    1.26
    spice       15,257,026,309    16,965,928,788    1.11
    nasa7       15,603,040,963     6,118,740,321    0.39


  • Intel Compiler vs. Compilers YOU Can Buy

    66 MHz Pentium Comparison                                   SpecInt92    SpecFP92
    Intel Internal Optimizing Compiler                          64.6         59.7
    Best 486 Compiler (June 1993)                               57.6         39.9
    Typical 486 Compiler in 1990, when Intel started project    41.0         32.5
    Integer: Intel 1.1X faster, FP: 1.5X faster

    486 Comparison                                              SpecInt92    SpecFP92
    Intel Internal Optimizing Compiler                          35.5         17.5
    Best 486 Compiler (June 1993)                               32.2         16.0
    Typical 486 Compiler in 1990, when Intel started project    23.0         12.8
    Integer: Intel 1.1X faster, FP: 1.1X faster


  • Intel Summary

    - Archeology: history of instruction design in a single product
      - Address size: 16-bit vs. 32-bit
      - Protection: segmentation vs. paged
      - Temp. storage: accumulator vs. stack vs. registers
    - "Golden Handcuffs" of binary compatibility affect design 20 years later, as Moore predicted
    - Not too difficult to make faster, as Intel has shown
    - HP/Intel announcement of a common future instruction set by 2000 means the end of the 80x86???
    - "Beauty is in the eye of the beholder"
      - At 50M/year sold, it is a beautiful business
