rohit
Super Scalar Issue Despatch
CSL718: Superscalar Processors
Issue and Despatch
23rd Jan, 2006
Early proposals/prototypes
  [Timeline figure, 1982 to 1989]
  IBM:        Cheetah, America project (4)
  DEC:        Multititan project (2)
  Stanford U: Match (2), Torch (4)
  Kyushu U:   SIMP (4), DSNS (4)
  Timeline marker: term "Superscalar"
Commercial superscalars: RISCs
  Vendor     Processors                  Year
  Intel      960KA/KB, 960CA (3)         1989
  IBM        Power 1, RS/6000 (4)        1990
  HP         PA7000, PA7100 (2)          1992
  SUN        SPARC, SuperSparc (3)       1992
  DEC        Alpha 21064 (2)             1992
  Motorola   MC88100, MC88110 (2)        1993
  Motorola   PowerPC 601/603 (3)         1993
  MIPS       R4000, R8000 (4)            1994
Commercial superscalars: CISCs
  Vendor     Processors                  Year
  Intel      80486, Pentium (2)          1993
  Motorola   MC68040, MC68060 (2)        1993
  Gmicro     Gmicro/100p, Gmicro 500 (2) 1993
  AMD        K5 (2), 4 RISC instr        1995
  CYRIX      M1 (2)                      1995
Tasks of superscalar processing
  - Parallel decoding and issue
  - Parallel instruction execution
  - Preserving the sequential consistency of instruction execution and exception processing
Superscalar decode and issue
  [Figure: scalar issue and superscalar issue side by side. In each, the I-cache feeds an instruction buffer, which feeds Decode & Issue. Pipeline stage labels in the figure: "IF D/I" and "IF D I".]
Parallel Decoding
  - Fetch multiple instructions into the instruction buffer
  - Decode multiple instructions in parallel (instruction window)
  - Possibly check dependencies among these, as well as with the instructions already under execution
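The dependency check mentioned on this slide can be sketched in a few lines of Python. This is only an illustration: the `Instr` tuple, the register names and the `raw_dependencies` helper are hypothetical, only read-after-write dependencies are tracked, and real hardware performs the comparisons in parallel with comparators rather than in a loop.

```python
from collections import namedtuple

# Hypothetical instruction format: a name, one destination, a tuple of sources.
Instr = namedtuple("Instr", "name dst srcs")

def raw_dependencies(window, in_flight_dsts):
    """Map each instruction in the window to the producers it must wait for.

    A producer is either an earlier instruction in the same window or the
    marker "in-flight" for a register still being written by an instruction
    that is already under execution.
    """
    last_writer = {reg: "in-flight" for reg in in_flight_dsts}
    deps = {}
    for ins in window:
        deps[ins.name] = sorted(
            {last_writer[s] for s in ins.srcs if s in last_writer}
        )
        last_writer[ins.dst] = ins.name   # later readers depend on this one
    return deps

window = [
    Instr("i1", "r1", ("r2", "r3")),
    Instr("i2", "r4", ("r1", "r5")),   # reads r1 written by i1
    Instr("i3", "r6", ("r7", "r8")),   # reads r7, still being produced
    Instr("i4", "r9", ("r4", "r6")),   # reads results of i2 and i3
]
deps = raw_dependencies(window, in_flight_dsts={"r7"})
```

Instructions with an empty dependency list (here only `i1`) are the ones that could be issued immediately under blocking issue.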
Pre-decoding
  - Do partial decoding while instructions are being loaded into the I-cache
  - Decoded information is appended to the instruction
  - This includes instruction class, resources required, etc.
  [Figure: second level cache or main memory feeds the pre-decode unit at N bits/cycle; the pre-decode unit feeds the I-cache at N + n bits/cycle.]
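A minimal sketch of pre-decoding, assuming a 32-bit instruction word (N = 32), two predecode bits (n = 2) and a toy opcode-to-class table. None of these values come from the slides; real processors use their own encodings and bit counts.

```python
N_BITS = 32    # assumed instruction width N
PRE_BITS = 2   # assumed number of predecode bits n

# Toy opcode-to-class table (hypothetical): the class is what gets appended.
CLASS_ALU, CLASS_MEM, CLASS_BRANCH, CLASS_OTHER = 0b00, 0b01, 0b10, 0b11
OPCODE_CLASS = {0x00: CLASS_ALU, 0x23: CLASS_MEM, 0x2B: CLASS_MEM, 0x04: CLASS_BRANCH}

def predecode(word):
    """Append the class bits to an instruction on its way into the I-cache."""
    opcode = (word >> (N_BITS - 6)) & 0x3F          # top 6 bits, by assumption
    cls = OPCODE_CLASS.get(opcode, CLASS_OTHER)
    return (word << PRE_BITS) | cls                 # N + n bits are stored

def stored_class(entry):
    """Read the predecode bits back without decoding the instruction."""
    return entry & ((1 << PRE_BITS) - 1)

def stored_instruction(entry):
    """Recover the original N-bit instruction word."""
    return entry >> PRE_BITS

word = (0x23 << 26) | 0x1234        # a toy memory-class instruction
entry = predecode(word)
```

The decoder can then route an instruction from `stored_class(entry)` alone, which is the point of paying for the n extra bits per cache entry.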
Number of Pre-decode bits
  Processor             No. of predecode bits
  PA 7200 (1995)        5
  PA 8000 (1996)        5
  PowerPC 620 (1996)    7
  UltraSparc (1995)     4
  HAL PM1 (1995)        4
  AMD K5 (1995)         5 (per byte)
  R 10000 (1996)        4
Issue vs Dispatch
  Blocking Issue
    - Decode and issue to EU
    - Instructions may be blocked due to data dependency
  Non-blocking Issue
    - Decode and issue to buffer
    - From buffer dispatch to EU
    - Instructions are not blocked due to data dependency
Blocking Issue
  [Figure: the instruction buffer (issue window) feeds Decode, Check & Issue, which feeds three EUs directly.]
Non-blocking (shelved) Issue
  [Figure: the instruction buffer feeds Decode & Issue, which feeds three reservation stations. Each reservation station has its own dependency checking/dispatch logic and feeds one EU.]
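The difference between the two schemes can be sketched as follows. This is a simplified model, not a description of any particular processor: instructions are plain strings, `operands_ready` is a hypothetical predicate supplied by the caller, and the reservation stations are modelled as one list with an assumed capacity.

```python
def blocking_issue(window, operands_ready):
    """Issue directly to the EUs; the first dependent instruction blocks the rest."""
    issued = []
    for ins in window:
        if not operands_ready(ins):
            break
        issued.append(ins)
    return issued

def shelved_issue(window, stations, capacity):
    """Issue into the shelving buffer without a dependency check.

    Only a full buffer can stop issue in this model.
    """
    issued = []
    for ins in window:
        if len(stations) >= capacity:
            break
        stations.append(ins)
        issued.append(ins)
    return issued

def dispatch_ready(stations, operands_ready):
    """Dependency checking is deferred to dispatch: forward whatever is ready."""
    ready = [ins for ins in stations if operands_ready(ins)]
    stations[:] = [ins for ins in stations if not operands_ready(ins)]
    return ready

window = ["a", "b", "c", "d"]
waiting = {"b"}                                  # b has an unresolved dependency
is_ready = lambda ins: ins not in waiting

direct = blocking_issue(window, is_ready)        # only "a" gets through
stations = []
shelved = shelved_issue(window, stations, capacity=8)
dispatched = dispatch_ready(stations, is_ready)  # "b" stays shelved
```

With shelving, the decode/issue stage keeps its full rate even though `b` is not ready; the wait is moved into the reservation station.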
Handling of Issue Blockages
  Preserving issue order
    - in-order
    - out of order
  Alignment of instruction issue
    - aligned
    - unaligned
Issue Order
  [Figure: two issue windows, each showing the instructions to be issued and the instructions issued. One illustrates issue in strict program order, the other out of order issue. Legend: independent instruction, dependent instruction, issued instruction.]
  Out of order issue, example: MC 88110, PowerPC 601
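Alignment
  [Figure: aligned issue (fixed window) versus unaligned issue (gliding window) over three cycles, showing which instructions are checked and which are issued in cycles 1, 2 and 3.]
  - Aligned issue: the next window is taken only after every instruction of the current fixed window has been issued
  - Unaligned issue: the window glides forward, so it is refilled as soon as instructions have been issued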
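The effect of issue order and alignment on issue throughput can be tried out with a small cycle-level model. It is a sketch under simplifying assumptions: `simulate_issue` and `blocked_until` are hypothetical names, instruction names must be unique, and blockages are given as "first cycle in which the instruction may issue" instead of being derived from real dependencies.

```python
def simulate_issue(program, blocked_until, width=4, aligned=True, in_order=True):
    """Return, per cycle, the list of instructions issued in that cycle."""
    issued = [False] * len(program)
    cycles = []
    win_start = 0
    cycle = 0
    while not all(issued):
        cycle += 1
        oldest = issued.index(False)
        if not aligned:
            win_start = oldest                 # gliding window
        elif oldest >= win_start + width:
            win_start += width                 # fixed window completely issued
        this_cycle = []
        for i in range(win_start, min(win_start + width, len(program))):
            if issued[i]:
                continue
            if blocked_until.get(program[i], 1) > cycle:
                if in_order:
                    break                      # younger instructions must wait
                continue                       # out of order: skip the blocked one
            issued[i] = True
            this_cycle.append(program[i])
        cycles.append(this_cycle)
    return cycles

program = list("abcdefghij")
blocked = {"c": 2}                             # c cannot issue before cycle 2

aligned_run = simulate_issue(program, blocked, aligned=True)
unaligned_run = simulate_issue(program, blocked, aligned=False)
ooo_run = simulate_issue(program, blocked, aligned=False, in_order=False)
```

In this example the aligned, in-order run needs four cycles, while the gliding window needs three, because the fixed window wastes issue slots in the cycle in which it is drained.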
Design choices in instruction issue
  Coping with false data dependencies
    - no
    - register renaming
  Coping with unresolved control dependencies
    - wait
    - speculative
  Use of shelving
    - blocking
    - shelved
  Handling of issue blockages
  Issue rate (2-6)
Frequently used issue policies in scalar processors
  Traditional scalar issue:                             i386, MC68030, R3000, Sparc
  Traditional scalar issue with shelving:               CDC 6600
  Traditional scalar issue with shelving and renaming:  IBM 360/91
  Traditional scalar issue with spec. execution:        i486, MC68040, R4000, MicroSparc
Frequently used issue policies in superscalar processors
  (speculative execution in all)
  - Straightforward superscalar issue (aligned or unaligned)
  - Straightforward superscalar issue with shelving
  - Straightforward superscalar issue with renaming
  - Advanced superscalar issue (renaming + shelving)
  Examples (in slide order): Pentium, PowerPC601, PA7100, SuperSparc, Alpha21164, MC68060, PA7200, UltraSparc, MC88110, R8000, PowerPC602, R10000, PentiumPro, PowerPC602, PA8000, Sparc64, Am29000, K5
Frequently used issue policies
  - Traditional scalar issue
  - Traditional scalar issue with spec. execution
  - Straightforward superscalar issue (aligned or unaligned)
  - Advanced superscalar issue
Design Space of Shelving
  Scope of shelving
    - partial
    - full
  Layout of shelving buffers
  Operand fetch policy
  Instruction dispatch scheme
Layout of Shelving Buffers
  Type of the shelving buffers
    - Stand alone (RS): individual, group, central
    - Combined with renaming and reordering
  Number of shelving buffer entries
    - individual: 2-4
    - group: 6-16
    - central: 20
    - total: 15-40
  Number of read and write ports
    - depends on no. of EUs connected
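Reservation Stations (RS)
  [Figure: three layouts. Individual RSs: one reservation station in front of each EU. Group RSs: one reservation station shared by a group of EUs. Central RS: one reservation station serving all EUs.]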
Combined Buffer (for Shelving, Renaming, Reordering)
  [Figure: instructions from decode/issue enter the DRIS, which feeds the EUs.]
  DRIS: Deferred scheduling, Register renaming and Instruction Shelving
Operand Fetch Policies
  - Issue bound fetch
  - Dispatch bound fetch
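Issue bound operand fetch (with single register file)
  [Figure: Decode/issue, one RF and four EUs; instruction paths and data paths are drawn separately.]
Dispatch bound operand fetch (with single register file)
  [Figure: Decode/issue and four EUs.]
Issue bound operand fetch (with multiple register files)
  [Figure: Decode/issue, two RFs and four EUs; instruction paths and data paths are drawn separately.]
Dispatch bound operand fetch (with multiple register files)
  [Figure: Decode/issue and four EUs.]
Updating RFs and RSs
  [Figure: Decode/issue, two RFs and four EUs; instruction paths and data paths are drawn separately.]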
Instruction dispatch scheme
  Dispatch policy
  Dispatch rate
    - single instr/cycle (individual RS)
    - multiple instr/cycle (group or central RS)
  Checking operand availability
  Treatment of empty RS
Dispatch policy
  Selection rule
    - Rule for identifying instructions which are ready for execution (data dependency check)
  Arbitration rule
    - Rule for choosing one out of several ready instructions (earlier instruction has priority)
  Dispatch order
Dispatch order
  - in-order
  - partially out of order
  - out of order
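The three parts of the dispatch policy can be put together in a short sketch. The names `dispatch_one`, `is_ready` and `lookahead` are hypothetical. In particular, modelling "partially out of order" as checking only the oldest few entries is an assumption made for this example; the slides do not define it, and real processors restrict out-of-order dispatch in other ways too.

```python
def dispatch_one(rs, is_ready, order="out_of_order", lookahead=2):
    """Select at most one instruction from a reservation station.

    rs holds the shelved instructions, oldest first.
    """
    if order == "in_order":
        candidates = rs[:1]            # only the oldest entry is checked
    elif order == "partially_out_of_order":
        candidates = rs[:lookahead]    # assumed: the oldest few entries are checked
    else:
        candidates = list(rs)          # every entry is checked
    # Arbitration rule: scanning oldest first gives earlier instructions priority.
    for ins in candidates:
        if is_ready(ins):              # selection rule: operands are available
            rs.remove(ins)
            return ins
    return None

ready = {"i3"}
in_order_pick = dispatch_one(["i1", "i2", "i3"], lambda i: i in ready, order="in_order")
partial_pick = dispatch_one(["i1", "i2", "i3"], lambda i: i in ready,
                            order="partially_out_of_order")
station = ["i1", "i2", "i3"]
ooo_pick = dispatch_one(station, lambda i: i in ready)
```

Only the out-of-order variant finds `i3` here; the other two leave the EU idle because the entries they are allowed to check are not ready.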
Checking availability of operands
  Direct check of score-board bits
    - usual for dispatch bound operand fetch
    - control flow approach
  Check of explicit status bits in RS
    - usual for issue bound operand fetch
    - data flow approach
  (control flow / data flow: Flynn's terminology)
Score-board
  [Figure: register file in which every register has a data field and a status bit.]
  Introduced with CDC6600
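Checking in dispatch bound fetch
  [Figure: decoded instruction, reservation station, register file and EU.]
  - Reservation station entry: OC (opcode), Rs1, Rs2, Rd
  - Decoded instruction supplies Rs1, Rs2, Rd; reset V bit of Rd
  - Check V bits of sources in the register file
  - OC, Os1, Os2 (operand values) go to the EU
  - EU returns result, Rd: update Rd, set V bit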
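A minimal sketch of dispatch bound operand fetch with a scoreboard, following the labels of the figure. The class and function names and the two-opcode instruction set are hypothetical, and there is no renaming, so the model assumes that no instruction reads its own destination and that no two shelved instructions write the same register.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}    # toy opcode set (assumed)

class RegisterFile:
    def __init__(self, values):
        self.value = list(values)
        self.valid = [1] * len(values)               # scoreboard: one V bit per register

def issue(rf, rs, oc, rs1, rs2, rd):
    """Shelve register identifiers only and mark the destination as pending."""
    rs.append((oc, rs1, rs2, rd))
    rf.valid[rd] = 0                                 # reset V bit of Rd

def dispatch(rf, rs):
    """Dispatch the oldest entry whose sources are valid; operands are read now."""
    for entry in rs:
        oc, rs1, rs2, rd = entry
        if rf.valid[rs1] and rf.valid[rs2]:          # check V bits of sources
            rs.remove(entry)
            return oc, rf.value[rs1], rf.value[rs2], rd
    return None

def execute_and_write_back(rf, oc, os1, os2, rd):
    rf.value[rd] = OPS[oc](os1, os2)                 # update Rd
    rf.valid[rd] = 1                                 # set V bit

rf = RegisterFile([0, 2, 3, 0, 0])
rs = []
issue(rf, rs, "add", 1, 2, 3)     # r3 = r1 + r2
issue(rf, rs, "mul", 3, 1, 4)     # r4 = r3 * r1, must wait for r3
first = dispatch(rf, rs)
stalled = dispatch(rf, rs)        # None: V bit of r3 is still reset
execute_and_write_back(rf, *first)
second = dispatch(rf, rs)
execute_and_write_back(rf, *second)
```

The reservation station never stores operand values in this scheme; readiness is decided entirely by the V bits in the register file.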
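Checking in issue bound fetch
  [Figure: decoded instruction, register file, reservation station and EU.]
  - Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd
  - Decoded instruction supplies Rs1, Rs2, Rd to the register file; reset V bit of Rd
  - Register file supplies Os1, Os2 (operand values) to the reservation station
  - Reservation station: check Vs1, Vs2
  - OC, Os1, Os2, Rd go to the EU
  - EU returns result, Rd: update Rd and set V bit in the register file; associative update of Is1, Is2 with Rd, set Vs bits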
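The issue bound scheme can be sketched in the same style. Here each reservation station entry carries either an operand value (Os) or the identifier of the register it is waiting for (Is), plus a valid bit. The `Entry` class, the helper names and the toy opcode set are assumptions for illustration, and the model again omits renaming.

```python
from dataclasses import dataclass
import operator

OPS = {"add": operator.add, "mul": operator.mul}   # toy opcode set (assumed)

@dataclass
class Entry:
    oc: str
    s1: int     # operand value Os1 if vs1 is set, else register identifier Is1
    vs1: int
    s2: int     # operand value Os2 if vs2 is set, else register identifier Is2
    vs2: int
    rd: int

def read_source(value, valid, reg):
    """Fetch the operand at issue time, or remember which register to wait for."""
    return (value[reg], 1) if valid[reg] else (reg, 0)

def issue(value, valid, rs, oc, rs1, rs2, rd):
    s1, vs1 = read_source(value, valid, rs1)
    s2, vs2 = read_source(value, valid, rs2)
    rs.append(Entry(oc, s1, vs1, s2, vs2, rd))
    valid[rd] = 0                                   # reset V bit of Rd

def dispatch(rs):
    for entry in rs:
        if entry.vs1 and entry.vs2:                 # check Vs1, Vs2 in the RS itself
            rs.remove(entry)
            return entry
    return None

def write_back(value, valid, rs, entry):
    result = OPS[entry.oc](entry.s1, entry.s2)
    value[entry.rd] = result                        # update Rd
    valid[entry.rd] = 1                             # set V bit
    for waiting in rs:                              # associative update of Is1, Is2
        if not waiting.vs1 and waiting.s1 == entry.rd:
            waiting.s1, waiting.vs1 = result, 1
        if not waiting.vs2 and waiting.s2 == entry.rd:
            waiting.s2, waiting.vs2 = result, 1
    return result

value, valid, rs = [0, 2, 3, 0, 0], [1, 1, 1, 1, 1], []
issue(value, valid, rs, "add", 1, 2, 3)    # r3 = r1 + r2, both operands fetched
issue(value, valid, rs, "mul", 3, 1, 4)    # r4 = r3 * r1, waits for r3
first = dispatch(rs)
stalled = dispatch(rs)                     # None: Vs1 of the mul entry is 0
write_back(value, valid, rs, first)
second = dispatch(rs)
final = write_back(value, valid, rs, second)
```

Unlike the dispatch bound version, dispatch never touches the register file here: the result is broadcast to the waiting entries, which is the data flow behaviour the earlier slide refers to.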
Treatment of an empty RS
  - Straightforward approach: at least one cycle stay in RS
  - Bypassing RS if empty
  Examples (in slide order): Nx586, Sparc64, PowerPC 604
Approaches in dispatching
  Approach          Dispatch order           Dispatch rate          RS layout           Examples
  Straightforward   in order                 single instr/cycle     individual RSs      Power1, PPC603, Nx586, Am29000
  Enhanced          partially out of order   single instr/cycle     individual RSs      Power2, PPC604, 620
  Advanced          out of order             multiple instr/cycle   group/central RSs   PM1, PentiumPro, PA8000, R10000