EE524/CptS561 Jose G. Delgado-Frias 1
Processor: basic steps to process an instruction
• IF: Instruction Fetch
• ID/OF: Instruction Decode / Operand Fetch
• EX: Execute
• MEM: Memory Access
• WB: Write Back
Datapath
[Figure: datapath with PC, +4, Inst. Mem., IR, Reg, A, B, imm, multiplexers (mux), ALU, zero, Data Mem. and NPC, divided into Instruction Fetch, Inst. Dec. / Op. Fetch, Execute, Memory Access and Write Back]
IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
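The register transfers above can be sketched in Python. This is a minimal sketch under stated assumptions: bit 0 is taken to be the most significant bit of a 32-bit IR (DLX-style numbering), and the immediate is taken to be the 16-bit field in bits 16..31, which is what a 16-fold replication of the sign bit implies. The helper names `field`, `sign_extend16` and `fetch_decode` are hypothetical.

```python
def field(ir, first, last):
    """Extract IR bits first..last, where bit 0 is the most significant of 32."""
    width = last - first + 1
    return (ir >> (31 - last)) & ((1 << width) - 1)

def sign_extend16(value):
    """Interpret a 16-bit field as a signed number (replicates bit 15 upward)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def fetch_decode(pc, inst_mem, regs):
    """Perform the IF and ID/OF register transfers for one instruction."""
    ir = inst_mem[pc]                        # IR  <- Mem[PC]
    npc = pc + 4                             # NPC <- PC + 4
    a = regs[field(ir, 6, 10)]               # A   <- Reg[IR 6..10]
    b = regs[field(ir, 11, 15)]              # B   <- Reg[IR 11..15]
    imm = sign_extend16(field(ir, 16, 31))   # Imm <- sign-extended IR 16..31 (assumed field)
    return ir, npc, a, b, imm
```

Calling `fetch_decode(0, {0: word}, regs)` with a dictionary as instruction memory and a 32-entry list as register file returns the values latched at the end of the decode step.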
Datapath (Arith/Logic Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, A, B, imm, ALU, zero, Data Mem.)]
IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← A op B
ALUoutput ← A op Imm
Reg[IR 16..20] ← ALUoutput
Datapath (Load Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, A, B, imm, ALU, zero, Data Mem.)]
IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← A op Imm
Reg[IR 11..15] ← LMD
Datapath (Store Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, A, B, imm, ALU, zero, Data Mem.)]
IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← A op Imm
Mem[ALUoutput] ← B
Datapath (Branch Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, A, B, imm, ALU, zero, Data Mem.)]
IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← (PC+4) op Imm
[Figure: instructions of a program versus time (clock cycles); instruction 1 shows IF ID EX MEM WB, instruction 2 shows IF ID EX WB, instruction 3 shows IF ID]
[Figure: instructions of a program (1 to 8) versus clock cycle, each passing through IF, ID, EX, MEM and WB]
Pipelining Lessons
• Pipelining does not help the latency of a single task; it helps the throughput of the entire workload
• Pipeline rate is limited by the slowest pipeline stage
• Multiple tasks operate simultaneously
• Potential speedup = number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to “fill” the pipeline and time to “drain” it reduce speedup
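The fill and drain effect can be made concrete with a small calculation. This is a sketch under idealized assumptions that are not stated on the slide: all stages take exactly one clock cycle, there are no stalls, and the unpipelined machine spends one cycle per stage per instruction. The function names are hypothetical.

```python
def pipelined_cycles(n_instructions, n_stages):
    """Cycles to run n instructions on an ideal pipeline with no stalls.

    The first instruction needs n_stages cycles to fill the pipeline;
    every later instruction completes one cycle after its predecessor.
    """
    return n_stages + n_instructions - 1

def speedup(n_instructions, n_stages):
    """Speedup over an unpipelined machine using n_stages cycles per instruction."""
    unpipelined = n_instructions * n_stages
    return unpipelined / pipelined_cycles(n_instructions, n_stages)

for n in (1, 5, 1000):
    print(n, pipelined_cycles(n, 5), round(speedup(n, 5), 2))
```

For a single instruction the speedup is exactly 1 (latency is unchanged), and as the number of instructions grows the speedup approaches the number of stages without reaching it.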
Datapath w/ pipeline
[Figure: pipelined datapath (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.) with pipeline registers and Clock]
Pipeline
              CLOCK CYCLE
INSTRUCTIONS  1      2      3      4      5      6      7      8      9
1             IF     ID/OF  EX     MEM    WB
2                    IF     ID/OF  EX     MEM    WB
3                           IF     ID/OF  EX     MEM    WB
4                                  IF     ID/OF  EX     MEM    WB
5                                         IF     ID/OF  EX     MEM    WB
6                                                IF     ID/OF  EX     MEM
7                                                       IF     ID/OF  EX
8                                                              IF     ID/OF
9                                                                     IF
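The staggered pattern of the pipeline diagram above follows from one rule: instruction i enters IF in clock cycle i and advances one stage per cycle. A minimal sketch that regenerates the diagram, assuming no stalls (the names `stage_at` and `diagram` are hypothetical):

```python
STAGES = ["IF", "ID/OF", "EX", "MEM", "WB"]

def stage_at(instruction, cycle):
    """Stage occupied by a 1-based instruction in a 1-based clock cycle, or None."""
    index = cycle - instruction  # instruction i enters IF in cycle i
    return STAGES[index] if 0 <= index < len(STAGES) else None

def diagram(n_instructions, n_cycles):
    """Render one text row per instruction, one column per clock cycle."""
    rows = []
    for i in range(1, n_instructions + 1):
        cells = [(stage_at(i, c) or "").ljust(7) for c in range(1, n_cycles + 1)]
        rows.append(f"{i:<3}" + "".join(cells).rstrip())
    return "\n".join(rows)

print(diagram(9, 9))
```

Reading a column of the output shows which instructions share the machine in one cycle; from cycle 5 onward all five stages are busy.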
Pipeline Hazards
• Structural Hazards
– Two or more instructions use the same hardware at the same time.
• Data Hazards
– Data dependencies
– Result from inst. j is needed by inst. k
• Control Hazards
– A branch changes the flow; what happens to the following instruction(s)?
Resources
[Figure: resources used by overlapping instructions in successive clock cycles: Mem (IM), Reg, ALU, Mem (DM), Reg]
Data Hazards
[Figure: pipeline resources (Mem (IM), Reg, ALU, Mem (DM), Reg) over time for the sequence below]
R1 ← R2+R3
R5 ← R1+R3
R8 ← R1-R6
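The dependences in this three-instruction sequence can be found mechanically by tracking the most recent writer of each register. A minimal sketch; the tuple encoding of instructions and the name `raw_dependences` are assumptions made for illustration. Whether a dependence turns into a stall depends on the pipeline and on forwarding, which this sketch does not model.

```python
# Each instruction is (destination, sources); this is the sequence from the slide.
program = [
    ("R1", ("R2", "R3")),  # R1 <- R2 + R3
    ("R5", ("R1", "R3")),  # R5 <- R1 + R3
    ("R8", ("R1", "R6")),  # R8 <- R1 - R6
]

def raw_dependences(program):
    """Return (producer_index, consumer_index, register) for each read of a written register."""
    found = []
    last_writer = {}  # register name -> index of the latest instruction that wrote it
    for k, (dest, sources) in enumerate(program):
        for reg in sources:
            if reg in last_writer:
                found.append((last_writer[reg], k, reg))
        last_writer[dest] = k
    return found

print(raw_dependences(program))
```

Both later instructions depend on the first one through R1, which is the situation the figure illustrates.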
Data Forwarding
[Figure: pipeline resources (Mem (IM), Reg, ALU, Mem (DM), Reg) over time for the sequence below]
R1 ← R2+R3
R5 ← R1+R3
R8 ← R1-R6
Datapath w/ pipeline
[Figure: pipelined datapath (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.) with a forwarding unit]
Example
[Figure: pipelined datapath with forwarding unit, executing the sequence below]
ADD R1,R2,R3
SUB R4,R3,R1
XOR R7,R8,R1
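The decision a forwarding unit makes for each ALU operand can be sketched as a comparison of register numbers. This is an illustrative sketch, not the circuit on the slide: the pipeline-register names EX/MEM and MEM/WB and the function `forward_source` are assumptions borrowed from common textbook descriptions of a five-stage pipeline.

```python
def forward_source(src_reg, ex_mem_dest, mem_wb_dest):
    """Choose where an ALU operand comes from for the instruction now in EX.

    ex_mem_dest: destination of the instruction one ahead (None if it writes nothing).
    mem_wb_dest: destination of the instruction two ahead (None if it writes nothing).
    """
    if src_reg == ex_mem_dest:   # newest value wins: produced by the previous instruction
        return "EX/MEM"
    if src_reg == mem_wb_dest:   # produced two instructions earlier
        return "MEM/WB"
    return "REG"                 # no forwarding: use the value read from the register file

# SUB R4,R3,R1 is in EX while ADD R1,R2,R3 is one stage ahead.
print(forward_source("R1", "R1", None))
# XOR R7,R8,R1 is in EX; SUB (dest R4) is one ahead, ADD (dest R1) is two ahead.
print(forward_source("R1", "R4", "R1"))
```

Checking the nearer pipeline register first matters when both earlier instructions write the same register, since the more recent result is the one the operand must see.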
Example
[Figure: pipelined datapath with forwarding unit, executing the sequence below]
ADD R1..
SUB R4,R3,R1
XOR R7,R8,R1
Example
[Figure: pipelined datapath with forwarding unit, executing the sequence below]
ADD R1..
SUB R8,R3,R1
XOR R7,R8,R1
Data Hazard Classification
• RAW (Read After Write): j: R1 ← …, k: RY ← R1
– With forwarding, only a load presents a problem
• WAW (Write After Write): j: R1 ← …, k: R1 ← …
• WAR (Write After Read): j: … ← R1, k: R1 ← …
• RAR (Read After Read): j: … ← R1, k: … ← R1
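The four classes differ only in which of the two instructions reads and which writes the shared register. A minimal sketch of that classification, with instruction j earlier in program order than k; the pair-of-sets encoding and the name `classify` are assumptions made for illustration.

```python
def classify(j, k):
    """Classify the dependences between an earlier instruction j and a later one k.

    Each instruction is a (writes, reads) pair of sets of register names.
    """
    j_writes, j_reads = j
    k_writes, k_reads = k
    kinds = set()
    if j_writes & k_reads:
        kinds.add("RAW")  # k reads what j writes
    if j_writes & k_writes:
        kinds.add("WAW")  # both write the same register
    if j_reads & k_writes:
        kinds.add("WAR")  # k overwrites what j reads
    if j_reads & k_reads:
        kinds.add("RAR")  # both only read: never a conflict
    return kinds

j = ({"R1"}, {"R2", "R3"})   # R1 <- R2 + R3
k = ({"R5"}, {"R1", "R3"})   # R5 <- R1 + R3
print(sorted(classify(j, k)))
```

A pair of instructions can fall into several classes at once, as here, where R1 gives a RAW dependence and R3 is merely read by both.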
Data Forwarding (load)
[Figure: pipeline resources (Mem (IM), Reg, ALU, Mem (DM), Reg) over time for the sequence below]
R1 ← LD[Mem]
R5 ← R1+R3
R8 ← R1-R6
Data hazard (load)
[Figure: pipeline timing (IF, ID, EX, MEM, WB) for the sequence below, with stall cycles inserted while the loaded value “R1” is not yet available]
LW R1,0(R1)
SUB R4,R1,R5
AND R6,R1,R7
OR R8,R1,R9
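The condition that triggers a load stall can be written as a simple check between adjacent instructions. This is a sketch under an assumption the slide does not state explicitly: a five-stage pipeline with full forwarding, where only an instruction that immediately follows a load and reads its destination has to wait. The tuple encoding and the name `load_use_stall` are hypothetical.

```python
def load_use_stall(prev, curr):
    """Return True if curr must stall behind prev (load-use hazard).

    Each instruction is (opcode, destination, sources).
    """
    prev_op, prev_dest, _ = prev
    _, _, curr_sources = curr
    return prev_op == "LW" and prev_dest in curr_sources

lw = ("LW", "R1", ("R1",))           # LW  R1,0(R1)
sub = ("SUB", "R4", ("R1", "R5"))    # SUB R4,R1,R5
and_ = ("AND", "R6", ("R1", "R7"))   # AND R6,R1,R7

print(load_use_stall(lw, sub))    # SUB reads R1 right after the load
print(load_use_stall(sub, and_))  # AND follows an ALU instruction
```

Only the first pair needs a bubble under this assumption; the later instructions are delayed because they queue behind the stalled one, not because they stall on their own.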
Branch
BR R1, LABEL_A
ADD R2,R3,R7
AND R5,R7,R11
:
:
LABEL_A: LD R4,R2,005
Branch
[Figure: pipeline resources (Mem (IM), Reg, ALU, Mem (DM), Reg) over time for the sequence below]
BR R1, LABEL_A
ADD R2,R3,R7
AND R5,R7,R11
LD R4,R2,005
Datapath w/ pipeline
[Figure: pipelined datapath (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.) with a forwarding unit]
What to do w/ branch
• Reduce the number of cycles to decide on a branch.
• Delayed branch (software solutions)
– NO-OP
– Move instructions: from before, from target, or from fall through
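Why the number of cycles needed to decide a branch matters can be seen from a simple cost model. This is a sketch with made-up numbers: the 20% branch frequency and the penalties of 3 and 1 cycles are hypothetical, and the model assumes every branch pays the full penalty and no other stalls occur.

```python
def effective_cpi(base_cpi, branch_fraction, branch_penalty_cycles):
    """Average cycles per instruction when every branch costs a fixed penalty."""
    return base_cpi + branch_fraction * branch_penalty_cycles

# Hypothetical workload: 20% of instructions are branches.
slow = effective_cpi(1.0, 0.20, 3)   # branch decided late in the pipeline
fast = effective_cpi(1.0, 0.20, 1)   # branch decided earlier
print(round(slow, 2), round(fast, 2), round(slow / fast, 2))
```

Under these assumed numbers, resolving the branch two cycles earlier lowers the CPI from about 1.6 to about 1.2. A delayed branch attacks the same penalty from the software side by putting useful instructions into the cycles that would otherwise be wasted.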
Branch
[Figure: pipeline resources (Mem (IM), Reg, ALU, Mem (DM), Reg) over time for the sequence below]
BR R1, LABEL_A
ADD R2,R3,R7
LD R4,R2,005
Out-of-order completion (execution starts in order)
Example
Cycle   1   2   3   4   5   6   7   8   9   10  11
MULTD   IF  ID  m1  m2  m3  m4  m5  m6  m7  M   W
ADDD        IF  ID  a1  a2  a3  a4  M   W
LD              IF  ID  X   M   W
SD                  IF  ID  X   M   W
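The completion order in the example above follows from the different execute latencies. A minimal sketch that recomputes the write-back cycles, with the execute lengths read off the example (7 for MULTD, 4 for ADDD, 1 for LD and SD) and assuming one instruction is fetched per cycle with no stalls; the names `EX_CYCLES` and `writeback_cycle` are hypothetical.

```python
# Number of execute cycles per instruction, read off the example above.
EX_CYCLES = {"MULTD": 7, "ADDD": 4, "LD": 1, "SD": 1}

def writeback_cycle(issue_position, opcode):
    """Cycle of the W stage: IF in cycle issue_position, then ID, the EX cycles, M, W."""
    fetch = issue_position  # one instruction fetched per cycle, in program order
    return fetch + 1 + EX_CYCLES[opcode] + 2

program = ["MULTD", "ADDD", "LD", "SD"]
finish = {op: writeback_cycle(i, op) for i, op in enumerate(program, start=1)}
completion_order = sorted(program, key=finish.get)

print(finish)
print(completion_order)
```

The instructions start in program order but finish as LD, SD, ADDD, MULTD, which is what out-of-order completion means here.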
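MIPS R4000 (Superpipelining)
[Figure: eight-stage pipeline IF, IS, RF, EX, DF, DS, TC, WB; instruction memory spans IF and IS, Reg is read in RF, ALU in EX, data memory spans DF, DS and TC, Reg is written in WB]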
IF: Instruction fetch First half
IS: Instruction fetch Second half
RF: Inst. Decode & Register Fetch
EX: Execution
DF: Data fetch First half
DS: Data fetch Second half
TC: Tag Check
WB: Write Back
Load
[Figure: R4000 pipeline (instruction memory, Reg, ALU, data memory, Reg) over clock cycles CC1 to CC7 for the sequence below]
LW R1
Instruction 1
Instruction 2
ADD R2,R1
Branch
[Figure: R4000 pipeline (instruction memory, Reg, ALU, data memory, Reg) for BEQZ and the instructions that follow it]
Branch (taken)
Branch inst IF IS RF EX DF DS TC WB
Delay slot IF IS RF EX DF DS TC WB
stall S S S S S S S S
stall S S S S S S S S
Branch target IF IS RF EX DF DS TC WB