Upload
others
View
7
Download
1
Embed Size (px)
Citation preview
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell
CS352H: Computer Systems Architecture
Topic 9: MIPS Pipeline - Hazards
October 1, 2009
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 2
Data Hazards in ALU Instructions
Consider this sequence:sub $2, $1,$3and $12,$2,$5or $13,$6,$2add $14,$2,$2sw $15,100($2)
We can resolve hazards with forwardingHow do we detect when to forward?
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 3
Dependencies & Forwarding
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 4
Detecting the Need to Forward
Pass register numbers along pipelinee.g., ID/EX.RegisterRs = register number for Rssitting in ID/EX pipeline register
ALU operand register numbers in EX stage aregiven by
ID/EX.RegisterRs, ID/EX.RegisterRtData hazards when1a. EX/MEM.RegisterRd = ID/EX.RegisterRs1b. EX/MEM.RegisterRd = ID/EX.RegisterRt2a. MEM/WB.RegisterRd = ID/EX.RegisterRs2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
Fwd fromEX/MEMpipeline reg
Fwd fromMEM/WBpipeline reg
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 5
Detecting the Need to Forward
But only if forwarding instruction will write to aregister!
EX/MEM.RegWrite, MEM/WB.RegWriteAnd only if Rd for that instruction is not $zero
EX/MEM.RegisterRd ≠ 0,MEM/WB.RegisterRd ≠ 0
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 6
Forwarding Paths
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 7
Forwarding Conditions
EX hazardif (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10
MEM hazardif (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 8
Double Data Hazard
Consider the sequence:add $1,$1,$2add $1,$1,$3add $1,$1,$4
Both hazards occurWant to use the most recent
Revise MEM hazard conditionOnly fwd if EX hazard condition isn’t true
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 9
Revised Forwarding Condition
MEM hazardif (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 10
Datapath with Forwarding
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 11
Load-Use Data Hazard
Need to stallfor one cycle
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 12
Load-Use Hazard Detection
Check when using instruction is decoded in ID stageALU operand register numbers in ID stage are given by
IF/ID.RegisterRs, IF/ID.RegisterRtLoad-use hazard when
ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt))
If detected, stall and insert bubble
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 13
How to Stall the Pipeline
Force control values in ID/EX registerto 0
EX, MEM and WB do nop (no-operation)
Prevent update of PC and IF/ID registerUsing instruction is decoded againFollowing instruction is fetched again1-cycle stall allows MEM to read data for lw
Can subsequently forward to EX stage
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 14
Stall/Bubble in the Pipeline
Stall insertedhere
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 15
Stall/Bubble in the Pipeline
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 16
Datapath with Hazard Detection
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 17
Stalls and Performance
Stalls reduce performanceBut are required to get correct results
Compiler can arrange code to avoid hazards and stallsRequires knowledge of the pipeline structure
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 18
Branch Hazards
If branch outcome determined in MEM
PC
Flush theseinstructions(Set controlvalues to 0)
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 19
Reducing Branch Delay
Move hardware to determine outcome to ID stageTarget address adderRegister comparator
Example: branch taken36: sub $10, $4, $840: beq $1, $3, 744: and $12, $2, $548: or $13, $2, $652: add $14, $4, $256: slt $15, $6, $7 ...72: lw $4, 50($7)
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 20
Example: Branch Taken
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 21
Example: Branch Taken
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 22
Data Hazards for Branches
If a comparison register is a destination of 2nd or 3rd
preceding ALU instruction
…
IF ID EX MEM WB
IF ID EX MEM WB
IF ID EX MEM WB
IF ID EX MEM WB
add $4, $5, $6
add $1, $2, $3
beq $1, $4, target
Can resolve using forwarding
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 23
Data Hazards for Branches
If a comparison register is a destination of preceding ALUinstruction or 2nd preceding load instruction
Need 1 stall cycle
beq stalled
IF ID EX MEM WB
IF ID EX MEM WB
IF ID
ID EX MEM WB
add $4, $5, $6
lw $1, addr
beq $1, $4, target
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 24
Data Hazards for Branches
If a comparison register is a destination of immediatelypreceding load instruction
Need 2 stall cycles
beq stalled
IF ID EX MEM WB
IF ID
ID
ID EX MEM WB
beq stalled
lw $1, addr
beq $1, $0, target
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 25
Dynamic Branch Prediction
In deeper and superscalar pipelines, branchpenalty is more significantUse dynamic prediction
Branch prediction buffer (aka branch history table)Indexed by recent branch instruction addressesStores outcome (taken/not taken)To execute a branch
Check table, expect the same outcomeStart fetching from fall-through or targetIf wrong, flush pipeline and flip prediction
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 26
1-Bit Predictor: Shortcoming
Inner loop branches mispredicted twice!outer: … …inner: … … beq …, …, inner … beq …, …, outer
Mispredict as taken on last iteration of inner loopThen mispredict as not taken on first iteration ofinner loop next time around
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 27
2-Bit Predictor
Only change prediction on two successive mispredictions
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 28
Calculating the Branch Target
Even with predictor, still need to calculate the targetaddress
1-cycle penalty for a taken branch
Branch target bufferCache of target addressesIndexed by PC when instruction fetched
If hit and instruction is branch predicted taken, can fetch targetimmediately
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 29
Exceptions and Interrupts
“Unexpected” events requiring changein flow of control
Different ISAs use the terms differently
ExceptionArises within the CPU
e.g., undefined opcode, overflow, syscall, …
InterruptFrom an external I/O controller
Dealing with them without sacrificing performance is hard
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 30
Handling Exceptions
In MIPS, exceptions managed by a SystemControl Coprocessor (CP0)Save PC of offending (or interrupted) instruction
In MIPS: Exception Program Counter (EPC)Save indication of the problem
In MIPS: Cause registerWe’ll assume 1-bit
0 for undefined opcode, 1 for overflow
Jump to handler at 8000 00180
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 31
An Alternate Mechanism
Vectored InterruptsHandler address determined by the cause
Example:Undefined opcode: C000 0000Overflow: C000 0020…: C000 0040
Instructions eitherDeal with the interrupt, orJump to real handler
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 32
Handler Actions
Read cause, and transfer to relevant handlerDetermine action requiredIf restartable
Take corrective actionuse EPC to return to program
OtherwiseTerminate programReport error using EPC, cause, …
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 33
Exceptions in a Pipeline
Another form of control hazardConsider overflow on add in EX stageadd $1, $2, $1
Prevent $1 from being clobberedComplete previous instructionsFlush add and subsequent instructionsSet Cause and EPC register valuesTransfer control to handler
Similar to mispredicted branchUse much of the same hardware
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 34
Pipeline with Exceptions
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 35
Exception Properties
Restartable exceptionsPipeline can flush the instructionHandler executes, then returns to the instruction
Refetched and executed from scratch
PC saved in EPC registerIdentifies causing instructionActually PC + 4 is saved
Handler must adjust
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 36
Exception Example
Exception on add in40 sub $11, $2, $444 and $12, $2, $548 or $13, $2, $64C add $1, $2, $150 slt $15, $6, $754 lw $16, 50($7)…
Handler80000180 sw $25, 1000($0)80000184 sw $26, 1004($0)…
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 37
Exception Example
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 38
Exception Example
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 39
Multiple Exceptions
Pipelining overlaps multiple instructionsCould have multiple exceptions at once
Simple approach: deal with exception from earliestinstruction
Flush subsequent instructions“Precise” exceptions
In complex pipelinesMultiple instructions issued per cycleOut-of-order completionMaintaining precise exceptions is difficult!
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 40
Imprecise Exceptions
Just stop pipeline and save stateIncluding exception cause(s)
Let the handler work outWhich instruction(s) had exceptionsWhich to complete or flush
May require “manual” completion
Simplifies hardware, but more complex handler softwareNot feasible for complex multiple-issueout-of-order pipelines
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41
Fallacies
Pipelining is easy (!)The basic idea is easyThe devil is in the details
e.g., detecting data hazards
Pipelining is independent of technologySo why haven’t we always done pipelining?More transistors make more advanced techniquesfeasiblePipeline-related ISA design needs to take account oftechnology trends
e.g., predicated instructions
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 42
Pitfalls
Poor ISA design can make pipelining hardere.g., complex instruction sets (VAX, IA-32)
Significant overhead to make pipelining workIA-32 micro-op approach
e.g., complex addressing modesRegister update side effects, memory indirection
e.g., delayed branchesAdvanced pipelines have long delay slots