7/29/2019 Lec4b Supp
1/56
Chapter 4B: The Processor, Part B
Mary Jane Irwin ( www.cse.psu.edu/~mji )
[Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, 2008]
CSE431 Chapter 4B.1 Irwin, PSU, 2008

Review: Why Pipeline? For Performance!
[Figure: pipeline diagram, time (clock cycles) across and instruction order down; Inst 1 through Inst 4 each pass through IM, Reg, ALU, DM, Reg, each starting one cycle after the previous one]
Once the pipeline is full, one instruction is completed every cycle, so CPI = 1

Review: MIPS Pipeline Data and Control Paths
[Figure: pipelined MIPS datapath with control. Pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB; PC, Instruction Memory, Register File, Sign Extend (16 to 32), ALU and ALU cntrl, Data Memory, Shift left 2 and Add for the branch target; control signals PCSrc, Branch, RegWrite, RegDst, ALUSrc, ALUOp, MemWrite, MemRead, MemtoReg]

Review: Can Pipelining Get Us Into Trouble?
Yes: Pipeline Hazards
  - structural hazards: attempt to use the same resource by two different instructions at the same time
  - data hazards: attempt to use data before it is ready
      - An instruction's source operand(s) are produced by a prior instruction still in the pipeline
  - control hazards: attempt to make a decision about program control flow before the condition has been evaluated and the new PC target address calculated
      - branch and jump instructions, exceptions
Pipeline control must detect the hazard and then take action to resolve hazards

Review: Register Usage Can Cause Data Hazards
Value of $1 by clock cycle: 10 10 10 10 10/-20 -20 -20 -20 -20
[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg per instruction) for the sequence below, each instruction starting one cycle after the previous one]
  add $1,
  sub $4,$1,$5
  and $6,$1,$7
  or  $8,$1,$9

One Way to Fix a Data Hazard
Can fix data hazard by waiting (stall), but impacts CPI
[Figure: pipeline diagram with two stall cycles inserted between the producer and its first consumer]
  add $1,
  stall
  stall
  sub $4,$1,$5
  and $6,$1,$7

Another Way to Fix a Data Hazard
Fix data hazards by forwarding results as soon as they are available to where they are needed
[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg per instruction) with forwarding paths from the first instruction to the dependent instructions below it]
  add $1,
  sub $4,$1,$5
  and $6,$1,$7
  or  $8,$1,$9
  xor $4,$1,$5

Data Forwarding (aka Bypassing)
Take the result from the earliest point that it exists in any of the pipeline state registers and forward it to the functional units (e.g., the ALU) that need it that cycle
For ALU functional unit: the inputs can come from any pipeline register rather than just from ID/EX by
  - adding multiplexors to the inputs of the ALU
  - connecting the Rd write data in EX/MEM or MEM/WB to either (or both) of the EX stage's Rs and Rt ALU mux inputs
  - adding the proper control hardware to control the new muxes
Other functional units may need similar forwarding logic (e.g., the DM)
With forwarding can achieve a CPI of 1 even in the presence of data dependencies

Data Forwarding Control Conditions
1. EX Forward Unit: forwards the result from the previous instr. to either input of the ALU
  if (EX/MEM.RegWrite
  and (EX/MEM.RegisterRd != 0)
  and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 10
  if (EX/MEM.RegWrite
  and (EX/MEM.RegisterRd != 0)
  and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 10
2. MEM Forward Unit: forwards the result from the second previous instr. to either input of the ALU
  if (MEM/WB.RegWrite
  and (MEM/WB.RegisterRd != 0)
  and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01
  if (MEM/WB.RegWrite
  and (MEM/WB.RegisterRd != 0)
  and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01

Forwarding Illustration
[Figure: pipeline diagram for the sequence below, with one path labeled "EX forwarding" and one labeled "MEM forwarding" carrying $1 from the first instruction to the ALU inputs of the two that follow]
  add $1,
  sub $4,$1,$5
  and $6,$7,$1

Yet Another Complication!
Another potential data hazard can occur when there is a conflict between the result of the WB stage instruction and the MEM stage instruction: which should be forwarded?
[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg per instruction) for the sequence below]
  add $1,$1,$2
  add $1,$1,$3
  add $1,$1,$4

Corrected Data Forwarding Control Conditions
1. EX Forward Unit: forwards the result from the previous instr. to either input of the ALU
  if (EX/MEM.RegWrite
  and (EX/MEM.RegisterRd != 0)
  and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 10
  if (EX/MEM.RegWrite
  and (EX/MEM.RegisterRd != 0)
  and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 10
2. MEM Forward Unit: forwards the result from the second previous instr. to either input of the ALU
  if (MEM/WB.RegWrite
  and (MEM/WB.RegisterRd != 0)
  and (EX/MEM.RegisterRd != ID/EX.RegisterRs)
  and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01
  if (MEM/WB.RegWrite
  and (MEM/WB.RegisterRd != 0)
  and (EX/MEM.RegisterRd != ID/EX.RegisterRt)
  and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01
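
The corrected forwarding conditions can be tried out in software. The following is a minimal Python sketch, not the hardware itself: the `PipeRegs` class, its field names, and the "00" encoding for "no forwarding" are assumptions made for illustration; the tests follow the slide's conditions as written.

```python
from dataclasses import dataclass


@dataclass
class PipeRegs:
    """Pipeline register fields read by the forward unit (names are illustrative)."""
    ex_mem_reg_write: bool
    ex_mem_rd: int
    mem_wb_reg_write: bool
    mem_wb_rd: int
    id_ex_rs: int
    id_ex_rt: int


def forward_select(r: PipeRegs, src: int) -> str:
    """Return the mux select for one ALU input whose source register is src."""
    # EX forwarding: the previous instruction (now in MEM) writes src.
    if r.ex_mem_reg_write and r.ex_mem_rd != 0 and r.ex_mem_rd == src:
        return "10"
    # MEM forwarding: the second previous instruction (now in WB) writes src,
    # and EX/MEM does not name the same register (the "corrected" test).
    if (r.mem_wb_reg_write and r.mem_wb_rd != 0
            and r.ex_mem_rd != src and r.mem_wb_rd == src):
        return "01"
    # Assumed encoding for "take the operand from ID/EX, no forwarding".
    return "00"


def forward_unit(r: PipeRegs) -> tuple:
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""
    return forward_select(r, r.id_ex_rs), forward_select(r, r.id_ex_rt)


# Third add of: add $1,$1,$2 ; add $1,$1,$3 ; add $1,$1,$4
# Both EX/MEM and MEM/WB hold a result for $1; the newer one must win.
third_add = PipeRegs(True, 1, True, 1, id_ex_rs=1, id_ex_rt=4)
print(forward_unit(third_add))
```

For the three-add sequence, the Rs input takes the EX/MEM value (the most recent $1), which is the conflict the corrected conditions are meant to resolve.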

Datapath with Forwarding Hardware
[Figure: pipelined datapath (PC, Instruction Memory, Register File, Sign Extend 16 to 32, ALU and ALU cntrl, Data Memory, pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB, Branch and PCSrc) with a Forward Unit added. Forward Unit inputs: ID/EX.RegisterRs, ID/EX.RegisterRt, EX/MEM.RegisterRd, MEM/WB.RegisterRd]

Memory-to-Memory Copies
Loads immediately followed by stores (memory-to-memory copies) can avoid a stall by adding forwarding hardware from the MEM/WB register to the data memory input.
  - Would need to add a Forward Unit and a mux to the MEM stage
[Figure: pipeline diagram (IM, Reg, ALU, DM, Reg per instruction) for the two instructions below]
  lw $1,4($2)
  sw $1,4($3)

Forwarding with Load-use Data Hazards
[Figure: pipeline diagram for the sequence below, with a stall inserted after the lw]
  lw  $1,4($2)
  sub $4,$1,$5
  and $6,$1,$7
  or  $8,$1,$9
  xor $4,$1,$5
Will still need one stall cycle even with forwarding

Load-use Hazard Detection Unit
Need a Hazard detection Unit in the ID stage that inserts a stall between the load and its use
  if (ID/EX.MemRead
  and ((ID/EX.RegisterRt = IF/ID.RegisterRs)
  or (ID/EX.RegisterRt = IF/ID.RegisterRt)))
      stall the pipeline
The first line tests to see if the instruction now in the EX stage is a lw; the next two lines check to see if the destination register of the lw matches either source register of the instruction in the ID stage (the load-use instruction)
After this one cycle stall, the forwarding logic can handle the remaining data hazards

Hazard/Stall Hardware
Prevent the instructions in the IF and ID stages from progressing down the pipeline, done by preventing the PC register and the IF/ID pipeline register from changing
  - Hazard detection Unit controls the writing of the PC (PC.write) and IF/ID (IF/ID.write) registers
Insert a "bubble" between the lw instruction (in the EX stage) and the load-use instruction (in the ID stage) (i.e., insert a noop in the execution stream)
  - Set the control bits in the EX, MEM, and WB control fields of the ID/EX pipeline register to 0 (noop). The Hazard Unit controls the mux that chooses between the real control values and the 0's.
Let the lw instruction and the instructions after it in the pipeline (before it in the code) proceed normally down the pipeline
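
The load-use test and the stall actions described above can be sketched as a single function. This is a minimal Python sketch under stated assumptions: the function name, the argument names, and the `ZeroControl` key (standing in for the select of the mux that zeroes the ID/EX control fields) are hypothetical and not taken from the slides.

```python
def hazard_unit(id_ex_mem_read: bool, id_ex_rt: int,
                if_id_rs: int, if_id_rt: int) -> dict:
    """Model the Hazard detection Unit for one clock cycle."""
    # Stall when the instruction in EX is a load whose destination (Rt)
    # matches either source register of the instruction in ID.
    stall = bool(id_ex_mem_read
                 and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))
    return {
        "PC.Write": not stall,      # hold the PC during a stall
        "IF/ID.Write": not stall,   # hold the IF/ID register during a stall
        "ZeroControl": stall,       # hypothetical name: insert the bubble
    }


# lw $1,4($2) in EX, sub $4,$1,$5 in ID: the sub reads $1, so stall.
print(hazard_unit(True, id_ex_rt=1, if_id_rs=1, if_id_rt=5))
```

Deasserting both write signals keeps the load-use instruction in ID and the next instruction in IF for one cycle, while the zeroed control fields travel down the pipeline as the bubble.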

Adding the Hazard/Stall Hardware
[Figure: datapath with forwarding hardware plus a Hazard Unit. Hazard Unit inputs shown: ID/EX.MemRead and ID/EX.RegisterRt; a mux with a 0 input sits between Control and the ID/EX pipeline register; the Forward Unit remains in the EX stage]

Control Hazards
When the flow of instruction addresses is not sequential (i.e., PC = PC + 4); incurred by change of flow instructions
  - Unconditional branches (j, jal, jr)
  - Conditional branches (beq, bne)
  - Exceptions
Possible approaches
  - Stall (impacts CPI)
  - Move decision point as early in the pipeline as possible, thereby reducing the number of stall cycles
  - Delay decision (requires compiler support)
  - Predict and hope for the best!
Control hazards occur less frequently than data hazards, but there is nothing as effective against control hazards as forwarding is for data hazards

Datapath Branch and Jump Hardware
[Figure: datapath with forwarding hardware, highlighting the branch hardware (Shift left 2 and Add for the branch target, Branch, PCSrc) and the jump hardware (Jump, Shift left 2, PC+4[31-28])]

Jumps Incur One Stall
Jumps not decoded until ID, so one flush is needed
  - To flush, set IF.Flush to zero the instruction field of the IF/ID pipeline register (turning it into a noop)
[Figure: pipeline diagram, captioned "Fix jump hazard by waiting (flush)"]
  j
  flush
  j target
Fortunately, jumps are very infrequent in the SPECint instruction mix

Two Types of Stalls
Noop instruction (or bubble) inserted between two instructions in the pipeline (as done for load-use situations)
  - Keep the instructions earlier in the pipeline (later in the code) from progressing down the pipeline for a cycle ("bounce" them in place with write control signals)
  - Insert noop by zeroing control bits in the pipeline register at the appropriate stage
  - Let the instructions later in the pipeline (earlier in the code) progress normally down the pipeline
Flushes (or instruction squashing) where an instruction in the pipeline is replaced with a noop instruction (as done for instructions located sequentially after j instructions)
  - Zero the control bits for the instruction to be flushed

Supporting ID Stage Jumps
[Figure: datapath with forwarding hardware and the jump hardware resolved in the ID stage (Jump, Shift left 2, PC+4[31-28]), with a 0 input used to flush the instruction in IF/ID]

Review: Branch Instrs Cause Control Hazards
[Figure: pipeline diagram of a branch and the instructions fetched after it (Inst 3 and Inst 4 shown), each passing through IM, Reg, ALU, DM, Reg]

One Way to Fix a Branch Control Hazard
Fix branch hazard by waiting (flush), but affects CPI
[Figure: pipeline diagram with three flushed slots after the branch]
  beq
  flush
  flush
  flush
  beq target
  Inst 3

Another Way to Fix a Branch Control Hazard
Move the branch decision hardware to as early in the pipeline as possible, i.e., during the decode cycle
[Figure: pipeline diagram, captioned "Fix branch hazard by waiting (flush)", now with a single flushed slot]
  beq
  flush
  beq target
  Inst 3

Reducing the Delay of Branches
Move the branch decision hardware back to the EX stage
  - Reduces the number of stall (flush) cycles to two
  - Adds an and gate and a 2x1 mux to the EX timing path
Add hardware to compute the branch target address and evaluate the branch decision to the ID stage
  - Reduces the number of stall (flush) cycles to one (like with jumps)
      - But now need to add forwarding hardware in ID stage
  - Computing branch target address can be done in parallel with RegFile read (done for all instructions, only used when needed)
  - Comparing the registers can't be done until after the RegFile read, so comparing and updating the PC adds a mux, a comparator, and an and gate to the ID timing path
For deeper pipelines, branch decision points can be even later in the pipeline, incurring more stalls

ID Branch Forwarding Issues
Forwarding from the WB stage is taken care of by the normal RegFile write (write before read)
  MEM  add2 $3,
  EX   add1 $4,
  ID   beq $1,$2,Loop
  IF   next_seq_instr
Need to forward from the EX/MEM pipeline stage to the ID comparison hardware for cases like
  MEM  add2 $1,
  EX   add1 $4,
  ID   beq $1,$2,Loop
  IF   next_seq_instr
Forwards the result from the second previous instr. to either input of the compare
  if (IDcontrol.Branch
  and (EX/MEM.RegisterRd != 0)
  and (EX/MEM.RegisterRd = IF/ID.RegisterRs))
      ForwardC = 1
  if (IDcontrol.Branch
  and (EX/MEM.RegisterRd != 0)
  and (EX/MEM.RegisterRd = IF/ID.RegisterRt))
      ForwardD = 1

ID Branch Forwarding Issues, cont
If the instruction immediately before the branch produces one of the branch source operands, then a stall needs to be inserted (between the beq and add1) since the EX stage ALU operation is occurring at the same time as the ID stage branch compare operation
  WB   add3 $3,
  MEM  add2 $4,
  EX   add1 $1,
  ID   beq $1,$2,Loop
  IF   next_seq_instr
  - "Bounce" the beq (in ID) and next_seq_instr (in IF) in place (ID Hazard Unit deasserts PC.Write and IF/ID.Write)
  - Insert a stall between the add in the EX stage and the beq in the ID stage by zeroing the control bits going into the ID/EX pipeline register (done by the ID Hazard Unit)
If the branch is found to be taken, then flush the instruction currently in IF (IF.Flush)

Supporting ID Stage Branches
[Figure: datapath with the branch hardware moved to the ID stage: Compare unit on the RegFile read data, Shift left 2 and Add for the branch target, Branch and PCSrc, IF.Flush, Hazard Unit, and two Forward Units (one for the ID stage compare, one for the EX stage ALU)]

Delayed Branches
If the branch hardware has been moved to the ID stage, then we can eliminate all branch stalls with delayed branches, which are defined as always executing the next sequential instruction after the branch instruction; the branch takes effect after that next instruction
  - MIPS compiler moves an instruction that is not affected by the branch (a safe instruction) to immediately after the branch, thereby hiding the branch delay
With deeper pipelines, the branch delay grows, requiring more than one delay slot
  - Delayed branches have lost popularity compared to more expensive but more flexible (dynamic) hardware branch prediction
  - Growth in available transistors has made hardware branch prediction relatively cheaper

Scheduling Branch Delay Slots
A (from before the branch):
  add $1,$2,$3
  if $2=0 then
      delay slot
becomes
  if $2=0 then
      add $1,$2,$3
B (from the branch target):
  sub $4,$5,$6
  ...
  add $1,$2,$3
  if $1=0 then
      delay slot
becomes
  add $1,$2,$3
  if $1=0 then
      sub $4,$5,$6
C (from the fall through):
  add $1,$2,$3
  if $1=0 then
      delay slot
  sub $4,$5,$6
becomes
  add $1,$2,$3
  if $1=0 then
      sub $4,$5,$6
A is the best choice, fills delay slot and reduces IC
In B and C, the sub instruction may need to be copied, increasing IC
In B and C, must be okay to execute sub when branch fails

Static Branch Prediction
Resolve branch hazards by assuming a given outcome and proceeding without waiting to see the actual branch outcome
1. Predict not taken: always predict branches will not be taken, continue to fetch from the sequential instruction stream
  - If taken, flush instructions after the branch (earlier in the pipeline)
      - in IF, ID, and EX stages if branch logic in MEM: three stalls
      - in IF and ID stages if branch logic in EX: two stalls
      - in IF stage if branch logic in ID: one stall
  - ensure that those flushed instructions haven't changed the machine state; this is automatic in the MIPS pipeline since machine state changing operations are at the tail end of the pipeline (MemWrite (in MEM) or RegWrite (in WB))
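
The flush counts above translate directly into a CPI estimate for predict-not-taken. The sketch below is a back-of-the-envelope model, assuming a base CPI of 1 and no other stalls; the function name and the example branch and taken fractions are made up for illustration and do not come from the slides.

```python
# Instructions flushed on a taken branch, by the stage holding the branch logic.
FLUSH_PENALTY = {"MEM": 3, "EX": 2, "ID": 1}


def effective_cpi(decision_stage: str, branch_fraction: float,
                  taken_fraction: float) -> float:
    """Estimate CPI with predict-not-taken; only taken branches pay a penalty."""
    penalty = FLUSH_PENALTY[decision_stage]
    return 1.0 + branch_fraction * taken_fraction * penalty


# Hypothetical mix: 20% of instructions are branches, half of them taken.
for stage in ("MEM", "EX", "ID"):
    print(stage, round(effective_cpi(stage, 0.2, 0.5), 2))
```

Under these assumed fractions, moving the branch decision from MEM to ID cuts the added CPI to a third, which is the motivation for the ID stage branch hardware.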

Flushing with Misprediction (Not Taken)
[Figure: pipeline diagram; the instruction at address 8 is flushed and fetch continues at the branch target at address 16, followed by an or instruction]
   4  beq $1,$2,2
   8  sub $4,$1,$5   (flush)
  16  and $6,$1,$7
To flush the IF stage instruction, assert IF.Flush to zero the instruction field of the IF/ID pipeline register (transforming it into a noop)

Branching Structures
Predict not taken works well for "top of the loop" branching structures
  Loop: beq $1,$2,Out
        1st loop instr
        .
        .
        .
        last loop instr
        j Loop
  Out:  fall out instr
  - But such loops have jumps at the bottom of the loop to return to the top of the loop and incur the jump stall overhead
Predict not taken doesn't work well for "bottom of the loop" branching structures
  Loop: 1st loop instr
        2nd loop instr
        .
        .
        .
        last loop instr
        bne $1,$2,Loop
        fall out instr

Static Branch Prediction, cont
Resolve branch hazards by assuming a given outcome and proceeding
2. Predict taken: predict branches will always be taken
  - Predict taken always incurs one stall cycle (if branch destination hardware has been moved to the ID stage)
  - Is there a way to "cache" the address of the branch target instruction?
As the branch penalty increases (for deeper pipelines), a simple static prediction scheme will hurt performance. With more hardware, it is possible to try to predict branch behavior dynamically during program execution
3. Dynamic branch prediction: predict branches at run-time using run-time information

Dynamic Branch Prediction
A branch prediction buffer (aka branch history table (BHT)) in the IF stage, addressed by the lower bits of the PC, contains bit(s) passed to the ID stage through the IF/ID pipeline register that tell whether the branch was taken the last time it was executed
  - Prediction bit may predict incorrectly (may be a wrong prediction for this branch, or may come from a different branch with the same low order PC bits) but this doesn't affect correctness, just performance
      - Branch decision occurs in the ID stage after determining that the fetched instruction is a branch and checking the prediction bit(s)
  - If the prediction is wrong, flush the incorrect instruction(s) in pipeline, restart the pipeline with the right instruction, and invert the prediction bit(s)
      - A 4096 bit BHT varies from 1% misprediction (nasa7, tomcatv) to 18% (eqntott)
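
A branch history table indexed by the low-order PC bits can be modeled in a few lines. This is a minimal Python sketch: the class name is hypothetical, one prediction bit per entry is assumed, and dropping the two lowest PC bits before indexing is an assumption based on word-aligned MIPS instructions rather than something the slides state.

```python
class BranchHistoryTable:
    """1-bit-per-entry BHT indexed by low-order PC bits (illustrative model)."""

    def __init__(self, entries: int = 4096):
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.mask = entries - 1
        self.bits = [False] * entries  # False means "predict not taken"

    def _index(self, pc: int) -> int:
        # Assumption: instructions are word aligned, so skip the low 2 bits.
        return (pc >> 2) & self.mask

    def predict(self, pc: int) -> bool:
        return self.bits[self._index(pc)]

    def update(self, pc: int, taken: bool) -> None:
        # A 1-bit scheme just remembers the last outcome for this entry.
        self.bits[self._index(pc)] = taken


bht = BranchHistoryTable(entries=4)
bht.update(0x10, True)
print(bht.predict(0x10))  # same branch
print(bht.predict(0x20))  # different branch, same low-order bits (aliasing)
print(bht.predict(0x14))  # different entry
```

The second lookup shows the aliasing case from the slide: a different branch that shares the low-order PC bits reads the same prediction bit, which can cost performance but never correctness.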

Branch Target Buffer
The BHT predicts when a branch is taken, but does not tell where it's taken to!
  - A branch target buffer (BTB) in the IF stage caches the branch target address, but we also need to fetch the next sequential instruction. The prediction bit in IF/ID selects which "next" instruction will be loaded into IF/ID at the next clock edge
      - Would need a two read port instruction memory
  - Or the BTB can cache the branch taken instruction while the instruction memory is fetching the next sequential instruction
[Figure: the PC drives both the BTB and the Instruction Memory Read Address; a mux selects between the two instructions]
If the prediction is correct, stalls can be avoided no matter which direction they go

1-bit Prediction Accuracy
  - Assume predict_bit = 0 to start (indicating branch not taken) and loop control is at the bottom of the loop code
1. First time through the loop, the predictor mispredicts the branch since the branch is taken back to the top of the loop; invert prediction bit (predict_bit = 1)
2. As long as branch is taken (looping), prediction is correct
3. Exiting the loop, the predictor again mispredicts the branch since this time the branch is not taken, falling out of the loop; invert prediction bit (predict_bit = 0)
  Loop: 1st loop instr
        2nd loop instr
        .
        .
        .
        last loop instr
        bne $1,$2,Loop
        fall out instr
For 10 times through the loop we have an 80% prediction accuracy for a branch that is taken 90% of the time
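
The 80% figure can be reproduced by stepping a 1-bit predictor through the loop branch's outcomes. A minimal Python sketch follows; the function name and the list-of-booleans encoding of the branch history (True for taken) are choices made here for illustration.

```python
def one_bit_accuracy(outcomes, predict_bit=False):
    """Fraction of correct predictions for a single 1-bit predictor."""
    correct = 0
    for taken in outcomes:
        if predict_bit == taken:
            correct += 1
        # A 1-bit predictor always remembers the most recent outcome,
        # so a misprediction inverts the bit.
        predict_bit = taken
    return correct / len(outcomes)


# Bottom-of-loop bne executed 10 times: taken 9 times, then falls out.
loop_branch = [True] * 9 + [False]
print(one_bit_accuracy(loop_branch))  # 0.8
```

The two misses are the first iteration (bit starts at 0) and the loop exit; because the exit resets the bit to 0, the same two misses recur every time the loop is re-entered.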

2-bit Predictors
A 2-bit scheme: a prediction must be wrong twice before the prediction bit is changed
[Figure: four-state FSM with two Predict Taken states (11, 10) and two Predict Not Taken states (01, 00), connected by Taken and Not taken transitions. Annotations: "wrong on loop fall out", "right on 1st iteration"; the BHT entry stores the initial FSM state]
  Loop: 1st loop instr
        2nd loop instr
        .
        .
        .
        last loop instr
        bne $1,$2,Loop
        fall out instr
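
The benefit of requiring two wrong predictions can be checked with a small simulation. The sketch below assumes a 2-bit saturating counter (states 0 to 3, predict taken when the state is 2 or 3), which is one common realization and may differ in its exact transitions from the state machine drawn on the slide; the function name and the starting state are also assumptions.

```python
def two_bit_correct(outcomes, state=3):
    """Count correct predictions of a 2-bit saturating-counter predictor.

    States 3 and 2 predict taken; states 1 and 0 predict not taken.
    """
    correct = 0
    for taken in outcomes:
        if (state >= 2) == taken:
            correct += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# The bottom-of-loop branch, with the 10-iteration loop executed twice.
loop_branch = [True] * 9 + [False]
history = loop_branch * 2
print(two_bit_correct(history), "of", len(history))  # 18 of 20
```

Only the loop fall out is mispredicted in each run: the single not-taken outcome moves the counter from 3 to 2, so the first iteration of the next run is still predicted taken.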

Dealing with Exceptions
Exceptions are just another form of control hazard. Exceptions arise from
  - R-type arithmetic overflow
  - Trying to execute an undefined instruction
  - An I/O device request
  - An OS service request (e.g., a page fault, TLB exception)
  - A hardware malfunction
The pipeline has to stop executing the offending instruction in midstream, let all prior instructions complete, flush all following instructions, set a register to show the cause of the exception, save the address of the offending instruction, and then jump to a prearranged address (the address of the exception handler code)
The software (OS) looks at the cause of the exception and "deals" with it

Two Types of Exceptions
Interrupts: asynchronous to program execution
  - caused by external events
  - may be handled between instructions, so can let the instructions currently active in the pipeline complete before passing control to the OS interrupt handler
  - simply suspend and resume user program
Traps (Exception): synchronous to program execution
  - caused by internal events
  - condition must be remedied by the trap handler for that instruction, so must stop the offending instruction midstream
  - the offending instruction may be retried (or simulated by the OS) and the program may continue or it may be aborted

Where in the Pipeline Exceptions Occur
[Figure: the five pipeline stages IM, Reg, ALU, DM, Reg]
  Exception               Stage(s)?   Synchronous?
  Arithmetic overflow     EX          yes
  Undefined instruction   ID          yes
  TLB or page fault       IF, MEM     yes
  I/O service request     any         no
  Hardware malfunction    any         no
Beware that multiple exceptions can occur simultaneously in a single clock cycle

Multiple Simultaneous Exceptions
[Figure: pipeline diagram of Inst 0 through Inst 4 (IM, Reg, ALU, DM, Reg each), with a D$ page fault, an arithmetic overflow, and an I$ page fault raised by different instructions in the same clock cycle]
Hardware sorts the exceptions so that the earliest instruction is the one interrupted first
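
The sorting rule can be illustrated with a short Python sketch. It relies on one observation: within a single clock cycle, the instruction furthest down the pipeline is the earliest in program order. The function name, the stage list, and the (stage, cause) tuple format are assumptions made for this example.

```python
# Pipeline stages in the order an instruction passes through them.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def first_to_service(pending):
    """Pick the exception belonging to the earliest instruction.

    pending: list of (stage, cause) pairs raised in the same clock cycle.
    The instruction in the latest stage entered the pipeline first.
    """
    return max(pending, key=lambda exc: STAGES.index(exc[0]))


same_cycle = [
    ("IF", "I$ page fault"),
    ("EX", "arithmetic overflow"),
    ("MEM", "D$ page fault"),
]
print(first_to_service(same_cycle))  # ('MEM', 'D$ page fault')
```

Servicing the MEM stage fault first means the later instructions are flushed; their exceptions would reappear when they are re-executed after the handler returns.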

Additions to MIPS to Handle Exceptions (Fig 6.42)
Cause register (records exceptions): hardware to record in Cause the exceptions and a signal to control writes to it (CauseWrite)
EPC register (records the addresses of the offending instructions): hardware to record in EPC the address of the offending instruction and a signal to control writes to it (EPCWrite)
  - Exception software must match exception to instruction
A way to load the PC with the address of the exception handler
  - Expand the PC input mux where the new input is hardwired to the exception handler address (e.g., 8000 0180hex for arithmetic overflow)
A way to flush the offending instruction and the ones that follow it
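
The register updates listed above can be summarized in a few lines of Python. This is a sketch of the architectural effect only, not of the datapath: the `ExceptionState` class, the `take_exception` function, and the numeric cause code used in the example are hypothetical; only the handler address 8000 0180hex comes from the slide.

```python
from dataclasses import dataclass

HANDLER_ADDRESS = 0x8000_0180  # exception handler address given on the slide


@dataclass
class ExceptionState:
    pc: int = 0
    cause: int = 0
    epc: int = 0


def take_exception(state: ExceptionState, offending_pc: int,
                   cause_code: int) -> ExceptionState:
    """Apply the writes that CauseWrite, EPCWrite, and the PC mux perform."""
    state.cause = cause_code      # record why the exception happened
    state.epc = offending_pc      # record where it happened
    state.pc = HANDLER_ADDRESS    # next fetch comes from the handler
    return state


# Hypothetical example: overflow at address 0x44, with an assumed cause code.
s = take_exception(ExceptionState(pc=0x48), offending_pc=0x44, cause_code=12)
print(hex(s.pc), hex(s.epc), s.cause)
```

Flushing the offending instruction and the ones behind it is not modeled here; in the pipeline it happens in the same cycle as these writes.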

Datapath with Controls for Exceptions
[Figure: the ID stage branch datapath (Compare, Hazard Unit, two Forward Units, IF.Flush) extended with Cause and EPC registers, ID.Flush and EX.Flush signals with muxes that select 0 for the control fields, and an extra PC mux input hardwired to 8000 0180hex]

Summary
All modern day processors use pipelining for performance (a CPI of 1 and a fast CC)
Pipeline clock rate limited by slowest pipeline stage, so designing a balanced pipeline is important
Must detect and resolve hazards
  - Structural hazards: resolved by designing the pipeline correctly
  - Data hazards
      - Stall (impacts CPI)
      - Forward (requires hardware support)
  - Control hazards: put the branch decision hardware in as early a stage in the pipeline as possible
      - Stall (impacts CPI)
      - Delay decision (requires compiler support)
      - Static and dynamic prediction (requires hardware support)