CIS 662 – Computer Architecture – Fall 2004 - Class 11 – 10/12/04

Scoreboarding


Example: scoreboard state at Time = 1 (issue first load)

Instruction status (columns: Issue, Read operands, Execution complete, Write result)
L.D F6, 34(R2)
L.D F2, 45(R3)
MUL.D F0, F2, F4
SUB.D F8, F6, F2
DIV.D F10, F0, F6
ADD.D F6, F8, F2

Functional unit status (columns: Busy, Op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
Rows: Integer ALU, FP Mult1, FP Mult2, FP Add, FP Div
Integer ALU at Time = 1: Busy = Yes, Op = Load, Fi = F6, source R2, ready = Yes

Register result status (F0, F2, F4, F6, F8, F10, F12, …)
Functional unit: Integer (the unit that will write F6)

The following four steps replace the ID, EX and WB stages
ID: Issue – if a functional unit for instruction is free and no other active instruction has the same destination register (WAW) it can proceed, otherwise it stalls
ID: Read operands – a source operand is available if no earlier instruction is going to write it
EX: Execute – once the execution is complete this stage notifies the scoreboard
WB: Write results – scoreboard checks for WAR hazards and may stall write back
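The checks in the four steps above can be sketched in Python. This is a minimal illustration, not the course's reference implementation: the class `FunctionalUnit`, the helper names (`can_issue`, `issue`, `can_read_operands`, `read_operands`, `can_write_result`, `write_result`) and the `result` dictionary standing in for the register result status are all assumptions made for this example. The field names follow the functional unit status columns (Busy, Op, Fi, Fj, Fk, Qj, Qk, Rj, Rk).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FunctionalUnit:
    """One row of the functional unit status table (hypothetical layout)."""
    name: str
    busy: bool = False
    op: Optional[str] = None
    fi: Optional[str] = None  # destination register
    fj: Optional[str] = None  # first source register
    fk: Optional[str] = None  # second source register
    qj: Optional[str] = None  # unit that will produce fj (None: no pending writer)
    qk: Optional[str] = None  # unit that will produce fk
    rj: bool = False          # fj is ready and has not been read yet
    rk: bool = False          # fk is ready and has not been read yet


def can_issue(fu: FunctionalUnit, dest: str, result: Dict[str, str]) -> bool:
    """Issue: the unit is free (no structural hazard) and no active
    instruction already has the same destination register (no WAW)."""
    return not fu.busy and dest not in result


def issue(fu: FunctionalUnit, op: str, dest: str, src1: Optional[str],
          src2: Optional[str], result: Dict[str, str]) -> None:
    """Record the instruction in the unit and mark the unit as writer of dest."""
    fu.busy, fu.op, fu.fi, fu.fj, fu.fk = True, op, dest, src1, src2
    fu.qj, fu.qk = result.get(src1), result.get(src2)
    fu.rj, fu.rk = fu.qj is None, fu.qk is None
    result[dest] = fu.name


def can_read_operands(fu: FunctionalUnit) -> bool:
    """Read operands: both sources are available (no earlier writer pending)."""
    return fu.rj and fu.rk


def read_operands(fu: FunctionalUnit) -> None:
    """After reading, the unit no longer depends on the old register values."""
    fu.rj = fu.rk = False
    fu.qj = fu.qk = None


def can_write_result(fu: FunctionalUnit, units: Dict[str, FunctionalUnit]) -> bool:
    """Write result: stall while another active unit still has to read the
    old value of the destination register (WAR hazard)."""
    return not any(
        f.busy and f is not fu
        and ((f.fj == fu.fi and f.rj) or (f.fk == fu.fi and f.rk))
        for f in units.values()
    )


def write_result(fu: FunctionalUnit, units: Dict[str, FunctionalUnit],
                 result: Dict[str, str]) -> None:
    """Wake up units waiting on this result, then free the unit."""
    for f in units.values():
        if f.qj == fu.name:
            f.qj, f.rj = None, True
        if f.qk == fu.name:
            f.qk, f.rk = None, True
    result.pop(fu.fi, None)
    fu.busy = False
```

For example, with `units = {n: FunctionalUnit(n) for n in ["Integer", "Mult1", "Mult2", "Add", "Divide"]}` and an empty `result`, issuing the first load with `issue(units["Integer"], "Load", "F6", None, "R2", result)` makes `can_issue(units["Integer"], "F2", result)` return `False`, which is the structural hazard that holds back the second load.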
Scoreboarding
Operands are always read from register file – no advantage is taken of forwarding
The penalty is small because the write occurs immediately after execution completes, not after the MEM stage
The read operands and write result stages cannot overlap, which adds 1 cycle of latency
Scoreboard snapshots for the example sequence (L.D F6, 34(R2); L.D F2, 45(R3); ...), one per slide. Each slide shows the instruction status, the functional unit status (Integer ALU, FP Mult1, FP Mult2, FP Add, FP Div) and the register result status. Surviving annotations and table cells, in slide order:

due to structural hazard
due to structural hazard
and frees ALU; due to structural hazard
(cells: Yes, Load, F2, R3, No, Yes, Yes, Yes, Integer, Integer)
Add cannot be issued due to structural hazard
Div is stalled waiting for F0 (cells: 10, 10)
Sub completes execution (cells: 10, 10)
Sub writes result, frees adder; Div is stalled waiting for F0; Add cannot be issued due to structural hazard
Div is stalled waiting for F0
Div is stalled waiting for F0 (cells: 10, No, No)
Div is stalled waiting for F0 (cells: No, No, Add, 15)
Div is stalled waiting for F0 (cells: 10, 15)
Mult in execution (8 out of 10); Div is stalled waiting for F0
Mult completes execution
Mult writes result (cells: Yes, Mult, F0, F2, F4, No, No, Mult1, Mult1, Divide)
(cells: 22, Yes, Add, No, No, Add, F6, F8, F2)
Tomasulo’s Algorithm
Use reservation stations that will hold operands for instructions waiting to issue
Reservation station fetches the operand as soon as it is available
Pending instructions read operands from reservation stations
When writes overlap in execution, only the last write actually updates the register
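The last point, that only the last of several overlapping writes updates the register, can be illustrated with a small Python sketch. It assumes each register remembers the name of the reservation station that will produce its newest value; the class `RegisterFile`, its methods and the station names `Add1` and `Add2` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional


class RegisterFile:
    """Registers plus, for each one, the station expected to write it last."""

    def __init__(self, names: List[str]) -> None:
        self.value: Dict[str, float] = {n: 0.0 for n in names}
        self.producer: Dict[str, Optional[str]] = {n: None for n in names}

    def rename(self, reg: str, station: str) -> None:
        """A newly issued instruction in `station` becomes the writer of reg,
        replacing any earlier pending writer."""
        self.producer[reg] = station

    def broadcast(self, station: str, value: float) -> List[str]:
        """A result arrives from `station`; update only the registers that
        still name this station as their producer. Returns those registers."""
        updated = []
        for reg, tag in self.producer.items():
            if tag == station:
                self.value[reg] = value
                self.producer[reg] = None
                updated.append(reg)
        return updated
```

If two instructions in stations `Add1` and `Add2` both target F6 and `Add2` issued later, a result broadcast from `Add1` leaves F6 untouched, and only the result from `Add2` is written.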
Tomasulo’s Algorithm
Each reservation station holds the opcode for the pending instruction and either operand values or names of reservation stations that will provide them
Load and store buffers hold data and addresses for memory access
Transfer of all data goes over the common data bus
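A reservation station and the common data bus can be sketched as follows. This is a simplified illustration under stated assumptions: the field names `vj`, `vk` (operand values) and `qj`, `qk` (names of the stations that will provide them) follow a common textbook convention rather than anything given on the slide, and the `broadcast` function stands in for the common data bus. Load and store buffers are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReservationStation:
    """Holds the opcode of a pending instruction and, for each operand,
    either its value or the name of the station that will provide it."""
    name: str
    op: str
    vj: Optional[float] = None  # value of first operand, if already known
    vk: Optional[float] = None  # value of second operand, if already known
    qj: Optional[str] = None    # station that will provide the first operand
    qk: Optional[str] = None    # station that will provide the second operand

    def ready(self) -> bool:
        """The instruction can execute once no operand is still pending."""
        return self.qj is None and self.qk is None


def broadcast(stations: List[ReservationStation], producer: str, value: float) -> None:
    """Common data bus: every station waiting on `producer` captures the value."""
    for rs in stations:
        if rs.qj == producer:
            rs.vj, rs.qj = value, None
        if rs.qk == producer:
            rs.vk, rs.qk = value, None
```

A station created as `ReservationStation("Mult1", "MUL.D", vk=4.0, qj="Load2")` is not ready until `broadcast(stations, "Load2", 2.0)` delivers the missing operand. The station name `Load2` is hypothetical.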
Due Tuesday, October 19 by the end of the class
Submit either in class (paper) or by E-mail (PS or PDF only) or bring the paper copy to my office
Show scheduling of the following code using scoreboard
(assume one integer ALU, two FP multipliers, one FP adder and one FP divider)
LD F2, 0(R2)
LD F4, 100(R3)