The Processor Pipeline
Chapter 4, Patterson and Hennessy, 4ed.
Section 5.3, 5.4: J P Hayes.
Pipeline
A Basic MIPS Implementation
● Memory-reference instructions – Load Word (lw) and Store Word (sw)
● ALU instructions – add, sub, AND, OR and slt
● Branch on equal (beq)
Instruction Fetch – Elements
Instruction Fetch
ALU Operations – Elements
ADD R1, R2, R3
[Figure: register file with ports labelled Addr, Data, Data, Write]
Loads and Stores – Elements
LW R1, -8(R2)
Branches – Elements
BEQ R1, R2, LABEL
BEQ R1, R2, -16
Memory and R-type Instructions
Memory Instruction – Load
LW R1, -8(R2)
Memory Instruction – Store
SW R1, -8(R2)
R Type Instruction – ADD
ADD R1, R2, R3
The MIPS Datapath
The MIPS Datapath – BEQ
BEQ R1, R2, -16
MIPS Datapath and Control Lines
Pipeline Stages
● IF: Instruction Fetch
● ID: Instruction decode/Register file read
● EX: Execution/Address Calculation
● MEM: Memory Access
● WB: Write Back
Pipelined Datapath
[Figure: pipelined datapath divided into the five stages IF, ID, EX, MEM, WB]
Pipelined vs. Nonpipelined Implementation
● Ratio of total execution times between the two versions for 10^6 instructions?
● Pipelining increases instruction throughput; it does not reduce the execution time of an individual instruction.
IF ID EX MEM WB
Speedup of the Pipeline
● The speedup of a k-stage pipelined processor over an unpipelined processor
$$S_k = \frac{T_{\text{unpipelined}}}{T_{\text{pipelined}}} = \frac{n \cdot k}{k + (n - 1)}$$
n: number of instructions in the program.
k: number of pipeline stages
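The speedup formula above can be checked numerically. The following is a minimal Python sketch, assuming an ideal pipeline (no stalls, equal stage times); the function name `pipeline_speedup` is illustrative and not from the slides, and k = 5 is assumed to match the five stages IF, ID, EX, MEM, WB.

```python
def pipeline_speedup(n, k):
    """Ideal speedup of a k-stage pipeline over an unpipelined processor.

    n: number of instructions in the program
    k: number of pipeline stages
    """
    return (n * k) / (k + (n - 1))


# A single instruction gains nothing from pipelining.
print(pipeline_speedup(1, 5))

# For a long program the speedup approaches the number of stages, k.
print(pipeline_speedup(10**6, 5))
```

For large n the ratio tends towards k, which is why the ideal speedup of a five-stage pipeline is usually quoted as 5.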
Efficiency of the Pipeline
● Percentage of stages accomplishing tasks related to the instruction in execution
$$\eta = \frac{\text{No. of Instructions}}{\text{Instruction Execution Time}} = \frac{n}{k + (n - 1)}$$
n: number of instructions in the program.
k: number of pipeline stages
Throughput of the Pipeline
● Number of tasks completed in unit time (one second)
$$w = \eta \times f$$
f: frequency of operation
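Efficiency and throughput follow directly from the definitions above. The sketch below is illustrative only: the function names are not from the slides, and the 1 GHz clock frequency is a hypothetical value chosen for the example.

```python
def pipeline_efficiency(n, k):
    """Efficiency of a k-stage pipeline executing n instructions."""
    return n / (k + (n - 1))


def pipeline_throughput(n, k, f_hz):
    """Throughput w = efficiency * f, in instructions per second."""
    return pipeline_efficiency(n, k) * f_hz


# Hypothetical example: 5 stages, 10^6 instructions, 1 GHz clock (assumed).
eta = pipeline_efficiency(10**6, 5)
w = pipeline_throughput(10**6, 5, 1e9)
print(f"efficiency = {eta:.6f}, throughput = {w:.3e} instructions/s")
```

As n grows, the efficiency approaches 1 and the throughput approaches one instruction per clock cycle.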
Pipeline Hazards
● Hazard: n. An unavoidable danger or risk, even though often foreseeable.
● Situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle
● Hazards reduce performance from the ideal speedup gained by pipelining
Structural Hazard
Clock cycle   1    2    3    4    5    6    7    8    9
i1            MEM  ID   EX   MEM  WB
i2                 MEM  ID   EX   MEM  WB
i3                      MEM  ID   EX   MEM  WB
i4                           MEM  ID   EX   MEM  WB
i5                                MEM  ID   EX   MEM  WB
...
HAZARD!!!
● Lack of resources
● Solution: Increase resources
– Use of separate Data and Instruction memories in the MIPS pipeline
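The structural hazard above can be located with a small Python sketch. It assumes one instruction is fetched per cycle, that fetch happens in cycle i for instruction i, and that a load or store accesses memory three cycles after its fetch; the function name and the `unified_memory` flag are illustrative, not from the slides.

```python
def memory_conflicts(is_mem_op, unified_memory=True):
    """Return the clock cycles in which a fetch and a data access collide.

    is_mem_op[i] is True if instruction i+1 is a load or store.
    With separate instruction and data memories there is no conflict.
    """
    if not unified_memory:
        return []
    n = len(is_mem_op)
    fetch_cycles = {i + 1 for i in range(n)}  # instruction i+1 is fetched in cycle i+1
    conflicts = []
    for i, uses_data_memory in enumerate(is_mem_op):
        mem_cycle = i + 4  # fetch in i+1, ID in i+2, EX in i+3, MEM in i+4
        if uses_data_memory and mem_cycle in fetch_cycles:
            conflicts.append(mem_cycle)
    return conflicts


# Five memory instructions i1..i5 sharing one memory: i1's data access
# collides with i4's fetch (cycle 4), and i2's with i5's fetch (cycle 5).
print(memory_conflicts([True] * 5))
print(memory_conflicts([True] * 5, unified_memory=False))
```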
Data Hazard
Clock cycle       1    2    3    4    5    6    7    8    9
ADD R1, R2, R3    IF   ID   EX   MEM  WB
SUB R4, R1, R5         IF   ID   EX   MEM  WB   WRONG!
● Data (input operands) required by the instruction are not ready/available
● Data dependence
● RAW, WAR, WAW dependences
ADD R1, R2, R3
SUB R2, R4, R5
ADD R1, R2, R3
SUB R1, R4, R5
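The instruction pairs above can be classified mechanically. The following Python sketch is illustrative: each instruction is represented as a hypothetical `(destination, sources)` tuple, and `classify_dependences` is not a name from the slides.

```python
def classify_dependences(first, second):
    """Classify the dependences of `second` on `first` (program order).

    Each instruction is a tuple (destination_register, [source_registers]).
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    kinds = []
    if dest1 in srcs2:
        kinds.append("RAW")  # second reads what first writes
    if dest2 in srcs1:
        kinds.append("WAR")  # second writes what first reads
    if dest1 == dest2:
        kinds.append("WAW")  # both write the same register
    return kinds


add = ("R1", ["R2", "R3"])                              # ADD R1, R2, R3
print(classify_dependences(add, ("R4", ["R1", "R5"])))  # SUB R4, R1, R5
print(classify_dependences(add, ("R2", ["R4", "R5"])))  # SUB R2, R4, R5
print(classify_dependences(add, ("R1", ["R4", "R5"])))  # SUB R1, R4, R5
```

The three pairs yield RAW, WAR and WAW respectively. In a simple in-order five-stage pipeline, only the RAW case produces the hazard shown in the diagram.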
Data Hazard
DADD R1,R2,R3
DSUB R4,R1,R5
AND R6,R1,R7
OR R8,R1,R9
XOR R10,R1,R11
[Figure: pipeline diagram over time (clock cycles); each instruction passes through IM, REG, ALU, DM, REG]
Avoiding Data Hazards – Forwarding
DADD R1,R2,R3
DSUB R4,R1,R5
AND R6,R1,R7
OR R8,R1,R9
XOR R10,R1,R11
[Figure: pipeline diagram over time (clock cycles); each instruction passes through IM, REG, ALU, DM, REG]
Pipeline without Forwarding
Pipeline with Forwarding
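The effect of forwarding on the DADD sequence can be estimated with a short Python sketch. It is a simplified model, not the datapath from the slides: it handles ALU producers only, and it assumes the register file is written in the first half of a cycle and read in the second half, so a consumer may decode in the cycle its producer writes back. The function name and the tuple encoding are illustrative.

```python
def count_stalls(program, forwarding):
    """Count stall cycles for a list of ALU instructions.

    program: list of (destination_register, [source_registers]).
    Minimum distance between a producer's ID stage and a dependent
    consumer's ID stage: 3 cycles without forwarding (the consumer reads
    in the producer's WB cycle), 1 cycle with forwarding.
    """
    min_gap = 1 if forwarding else 3
    last_writer_id = {}  # register -> ID cycle of its most recent writer
    prev_id = 0          # ID cycle of the previous instruction
    stalls = 0
    for dest, sources in program:
        earliest = prev_id + 1  # in-order, one instruction per cycle
        needed = max(
            (last_writer_id[r] + min_gap for r in sources if r in last_writer_id),
            default=earliest,
        )
        id_cycle = max(earliest, needed)
        stalls += id_cycle - earliest
        last_writer_id[dest] = id_cycle
        prev_id = id_cycle
    return stalls


program = [
    ("R1", ["R2", "R3"]),    # DADD R1,R2,R3
    ("R4", ["R1", "R5"]),    # DSUB R4,R1,R5
    ("R6", ["R1", "R7"]),    # AND  R6,R1,R7
    ("R8", ["R1", "R9"]),    # OR   R8,R1,R9
    ("R10", ["R1", "R11"]),  # XOR  R10,R1,R11
]
print(count_stalls(program, forwarding=False))  # 2
print(count_stalls(program, forwarding=True))   # 0
```

Under these assumptions, DSUB waits two cycles without forwarding, which also pushes the later instructions far enough back that they need no further stalls.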
Data Hazard – Load Instruction
LD R1,0(R2)
DSUB R4,R1,R5
AND R6,R1,R7
OR R8,R1,R9
[Figure: pipeline diagram over time (clock cycles); each instruction passes through IM, REG, ALU, DM, REG]
Data Hazards – Stalls
LD R1,0(R2)
DSUB R4,R1,R5
AND R6,R1,R7
OR R8,R1,R9
[Figure: pipeline diagram over time (clock cycles) with stages IM, REG, ALU, DM, REG; the instructions after LD are stalled]
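Why a load still needs a stall even with forwarding can be expressed in a few lines of Python. This is a sketch under the usual five-stage assumptions (an ALU result is available at the end of EX, a loaded value at the end of MEM, and full forwarding to the ALU inputs); the function name and the `"alu"`/`"load"` labels are illustrative.

```python
def stalls_with_forwarding(producer, distance):
    """Stall cycles for a consumer that uses the producer's result in EX.

    producer: "alu" or "load"
    distance: 1 if the consumer immediately follows the producer, 2 if one
              instruction lies in between, and so on.
    """
    # Cycles after which the result can first be forwarded to an EX stage.
    ready_gap = {"alu": 1, "load": 2}[producer]
    return max(0, ready_gap - distance)


print(stalls_with_forwarding("load", 1))  # LD R1,0(R2) then DSUB R4,R1,R5
print(stalls_with_forwarding("load", 2))  # LD then AND R6,R1,R7
print(stalls_with_forwarding("alu", 1))   # DADD then DSUB
```

Only the instruction immediately after the load is stalled, by one cycle. The distances here are measured in the original program order, before any stall is inserted.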
Data Hazard – Solutions
● Data Forwarding
● Instruction Reordering
Control Hazard
● Arise from the pipelining of branches and other instructions that change the PC
● Also called Branch Hazards
Branch Hazards
[Figure: pipeline diagram over time (clock cycles 1 to 9) with rows BEQ, ADD, Branch Successor, Branch Successor + 1, Branch Successor + 2; each instruction passes through IF, ID, EX, MEM, WB]
Assumption: Branch condition evaluation completed in the ID stage
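The cost of branch hazards can be estimated with the standard effective-CPI relation, which is not stated on the slides and is used here as an assumption: effective CPI = base CPI + branch fraction × branch penalty. In the Python sketch below, the one-cycle penalty follows from resolving the branch in ID, while the 20% branch fraction is a hypothetical figure chosen for illustration.

```python
def effective_cpi(branch_fraction, penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction when every branch pays the penalty."""
    return base_cpi + branch_fraction * penalty_cycles


def speedup_with_branches(k, branch_fraction, penalty_cycles):
    """Speedup of a k-stage pipeline over an unpipelined processor,
    for a long program, once branch stalls are included."""
    return k / effective_cpi(branch_fraction, penalty_cycles)


# Hypothetical workload: 20% branches, 1-cycle penalty (branch resolved in ID).
print(effective_cpi(0.2, 1))
print(speedup_with_branches(5, 0.2, 1))
```

With these assumed numbers, the five-stage pipeline delivers a speedup of about 4.17 rather than the ideal 5.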
Reducing Pipeline Branch Penalties
● Freeze the pipeline
● Predict Taken
● Predict Untaken
● Fill Branch Delay Slot
[Figure: pipeline diagram over time (clock cycles 1 to 9) with rows BEQ, AND, Branch Successor, Branch Successor + 1 and instruction labels i, i-1, i+16, i+17; each instruction passes through IF, ID, EX, MEM, WB]
Dynamic Branch Prediction
● Branch prediction buffers
– Single bit predictors
– Change prediction with branch behaviour
– No. of wrong predictions?
BRANCH PREDICTION BUFFER
PC       Prediction
0x0100   1
0x0154   0
0x0210   1
...      1
T T T T N T T T T T T T T T T T T
Wrong Predictions
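The number of wrong predictions for the outcome sequence above can be counted with a short simulation. This Python sketch assumes a single-bit predictor that simply remembers the last outcome of the branch, with an initial prediction of taken; the function name is illustrative.

```python
def one_bit_mispredictions(outcomes, initial_prediction=True):
    """Count mispredictions of a single-bit predictor.

    outcomes: list of booleans, True for taken (T), False for not taken (N).
    The predictor always predicts the most recent outcome.
    """
    prediction = initial_prediction
    wrong = 0
    for taken in outcomes:
        if taken != prediction:
            wrong += 1
        prediction = taken  # the single bit records the last outcome
    return wrong


outcomes = [c == "T" for c in "T T T T N T T T T T T T T T T T T".split()]
print(one_bit_mispredictions(outcomes))  # 2
```

A single not-taken outcome costs two mispredictions: one at the N itself, and one at the following T, because the bit has just been flipped.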
Dynamic Branch Prediction
● 2-bit predictors
[Figure: 2-bit predictor state diagram (states 00, 01, 10, 11) and a Branch Prediction Buffer indexed by PC (0x0100, 0x0154, 0x0210) holding 2-bit entries]
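A 2-bit predictor can be simulated in the same way as the single-bit one. The exact state diagram on the slide is not recoverable, so the Python sketch below assumes the common saturating-counter scheme: states 00 to 11 are the integers 0 to 3, the branch is predicted taken when the counter is 2 or 3, and the counter starts at 11. The function name is illustrative.

```python
def two_bit_mispredictions(outcomes, counter=3):
    """Count mispredictions of a 2-bit saturating-counter predictor.

    outcomes: list of booleans, True for taken (T), False for not taken (N).
    counter:  initial state, 0 (strongly not taken) to 3 (strongly taken).
    """
    wrong = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            wrong += 1
        # Move one step towards the actual outcome, saturating at 0 and 3.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return wrong


outcomes = [c == "T" for c in "T T T T N T T T T T T T T T T T T".split()]
print(two_bit_mispredictions(outcomes))  # 1
```

Under this assumption, the isolated N causes only one misprediction, because a single not-taken outcome moves the counter from 11 to 10 and the prediction stays taken.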