Advanced Pipelining • Optimally Scheduling Code • Optimally Programming Code • Scheduling for Superscalars (6.9) • Exceptions (5.6, 6.8)

Page 1

Advanced Pipelining

• Optimally Scheduling Code

• Optimally Programming Code

• Scheduling for Superscalars (6.9)

• Exceptions (5.6, 6.8)

Page 2

Optimally schedule code

• for(i=0;i<N;i++)
      A[i] = A[i] + 10;

• & (A[0]) in $s1
• & (A[i]) in $s2

slt $t1, $s3, $s0

beq $t1, $0, end

loop:

lw $t0, 0($s1)

addi $t0, $t0, 10

sw $t0, 0($s1)

addi $s1, $s1, 4

slt $t1, $s1, $s2

bne $t1, $0, loop

Page 3

1. Identify Dependencies

lw $t0, 0($s1)

addi $t0, $t0, 10

sw $t0, 0($s1)

addi $s1, $s1, 4

slt $t1, $s1, $s2

bne $t1, $0, loop

$t0 – lw->addi – RAW
$t0 – addi->sw – RAW
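The dependency check in this step can be sketched in C. This is only an illustration: the `Instr` record, the `DEP_*` flags and `classify_deps` are hypothetical names, not part of the slides, and registers are identified by their MIPS numbers ($t0 = 8, $s1 = 17). Because the sketch reports every register relation between two instructions, it also flags the WAW on $t0 between lw and addi, which the slide does not list.

```c
#include <stdbool.h>

/* Hypothetical instruction record: one destination and up to two source
   registers; -1 means "no register in this slot". */
typedef struct {
    int dst;
    int src1;
    int src2;
} Instr;

enum { DEP_RAW = 1, DEP_WAR = 2, DEP_WAW = 4 };

/* True if instruction i reads register reg. */
static bool reads(const Instr *i, int reg) {
    return reg >= 0 && (i->src1 == reg || i->src2 == reg);
}

/* Returns a bit mask of the register dependences from `first` to a later
   instruction `second`. */
int classify_deps(const Instr *first, const Instr *second) {
    int deps = 0;
    if (reads(second, first->dst)) {
        deps |= DEP_RAW;   /* second reads what first wrote */
    }
    if (reads(first, second->dst)) {
        deps |= DEP_WAR;   /* second overwrites what first read */
    }
    if (first->dst >= 0 && first->dst == second->dst) {
        deps |= DEP_WAW;   /* both write the same register */
    }
    return deps;
}
```

For example, comparing `sw $t0, 0($s1)` with the following `addi $s1, $s1, 4` reports only `DEP_WAR`, which is the kind of false dependency that later steps target.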

Page 4

2. Draw timing diagram WITH DATA FORWARDING

lw $t0, 0($s1)

addi $t0, $t0, 10

sw $t0, 0($s1)

addi $s1, $s1, 4

slt $t1, $s1, $s2

bne $t1, $0, loop

F D X M W

Page 5

3. Remove WAR/WAW dependencies

lw $t0, 0($s1)

addi $t0, $t0, 10

sw $t0, 0($s1)

addi $s1, $s1, 4

slt $t1, $s1, $s2

bne $t1, $0, loop

RAW, WAR, WAW

[Timing diagram: F D X M W pipeline stages for lw, addi, sw, addi, slt, bne]

Target the false dependencies

Page 6

3. Remove WAR/WAW dependencies

Original              Incorrect             Correct
lw $t0, 0($s1)        lw $t0, 0($s1)        lw $t0, 0($s1)
sw $t0, 0($s1)        addi $s1, $s1, 4      addi
addi $s1, $s1, 4      sw $t0, 0($s1)        sw

Page 7

lw $t0, 0($s1)

addi $s1, $s1, 4

addi $t0, $t0, 10

sw $t0, ____($s1)

slt $t1, $s1, $s2

bne $t1, $0, loop

lw $t0, 0($s1)

addi $t0, $t0, 10

sw $t0, 0($s1)

addi $s1, $s1, 4

slt $t1, $s1, $s2

bne $t1, $0, loop

Page 8

3. Remove WAR/WAW dependencies

lw $t0, 0($s1)

addi $s1, $s1, 4

addi $t0, $t0, 10

slt $t1, $s1, $s2

sw $t0, -4($s1)

bne $t1, $0, loop

[Timing diagram: F D X M W pipeline stages for the six instructions]
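To see why the reordered loop helps, here is a rough C sketch that counts load-use stalls in a straight-line sequence. It assumes a five-stage pipeline with full forwarding, where the only stall is an instruction that reads a lw result in the very next slot. Branch hazards are ignored, and `Instr` and `count_load_use_stalls` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical instruction record; registers are MIPS register numbers,
   -1 means "unused". */
typedef struct {
    int is_load;  /* nonzero for lw */
    int dst;
    int src1;
    int src2;
} Instr;

/* Counts one-cycle load-use stalls: an instruction that reads the
   destination of the lw immediately before it must wait one cycle,
   even with forwarding. */
int count_load_use_stalls(const Instr *code, size_t n) {
    int stalls = 0;
    for (size_t i = 1; i < n; i++) {
        const Instr *prev = &code[i - 1];
        const Instr *cur = &code[i];
        if (prev->is_load && prev->dst >= 0 &&
            (cur->src1 == prev->dst || cur->src2 == prev->dst)) {
            stalls++;
        }
    }
    return stalls;
}
```

Under these assumptions the original order (lw, addi $t0, sw, addi $s1, slt, bne) has one load-use stall per iteration, while the rescheduled order (lw, addi $s1, addi $t0, slt, sw, bne) has none.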

Page 9

Software Control Hazard Removal

if ((x % 2) == 1)
    isodd = 1;
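One possible branch-free rewrite, offered as a sketch rather than the slide's official answer: compute the low bit and OR it into the flag. It assumes `x` is unsigned (in C, `x % 2` is -1 for negative odd values, so the original test would miss them) and that `isodd` only ever holds 0 or 1. The function name `set_isodd` is made up for illustration.

```c
/* Branch-free form of:  if ((x % 2) == 1) isodd = 1;
   Assumes isodd is a 0/1 flag and x is unsigned. */
unsigned set_isodd(unsigned isodd, unsigned x) {
    /* (x & 1u) is 1 exactly when x is odd; OR leaves isodd unchanged
       when x is even and forces it to 1 when x is odd. */
    return isodd | (x & 1u);
}
```

The calculation replaces a conditional branch with one AND and one OR, so there is nothing for the pipeline to predict.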

Page 10

Software Control Hazard Removal

if (x == true)
    y = false;
else
    y = true;
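A sketch of the same assignment without a branch, assuming `x` and `y` are C `bool` values (the function name `invert` is hypothetical):

```c
#include <stdbool.h>

/* Equivalent to:  if (x == true) y = false; else y = true;
   Returns the value that would be assigned to y. */
bool invert(bool x) {
    return !x;   /* a single logical negation, no control flow */
}
```

Usage would be `y = invert(x);`, or simply `y = !x;` inline.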

Page 11

Software Control Hazard Removal

if ((x == MON) || (x == TUE) || (x == WED)) {}
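If the day constants happen to be consecutive integers, the three comparisons can be folded into one range check. The enum below is an assumption made for this sketch (the slides do not define MON, TUE, WED), and `is_mon_to_wed` is a hypothetical name.

```c
/* Assumed encoding: consecutive values, so MON..WED is a contiguous range. */
typedef enum { SUN, MON, TUE, WED, THU, FRI, SAT } Day;

/* Equivalent to (x == MON) || (x == TUE) || (x == WED) under the assumed
   encoding. Subtracting MON and comparing as unsigned makes values below
   MON wrap around to a huge number, so one comparison covers both ends. */
int is_mon_to_wed(Day x) {
    return (unsigned)(x - MON) <= (unsigned)(WED - MON);
}
```

This trades up to three branches for one subtraction and one comparison; it stops being valid if the constants are not contiguous.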

Page 12

Increasing Branch Performance

if ((TheCoinTossIsHeads) || (StudentStudiedForExam)) {}

Page 13

What does it all mean?

• Does that mean that error-checking code is bad? That is a whole lot of branches if you do it well!!!

Page 14

The moral is…..

• Calculation is less expensive than …..

Page 15

Superscalars - Parallelism

Ford mass produces cars. We want to “mass produce” instructions.

Increase Depth – assembly line – build many cars at the same time, but each car is in a different stage of assembly.

Increase Width – multiple assembly lines – build many cars at the same time by building many lines, all of which operate simultaneously.

Page 16

“Superpipelining” (deep pipelining – many stages)

• Limiting returns because….

• Register delays are __________________________ of clock

• Difficult to __________________

Page 17

SuperScalars

• __________ parts of pipeline

• Multiple instructions in _______ stage at once

Page 18

SuperScalars

• Which instructions can execute in parallel?

• Fetching multiple instructions per cycle

Page 19

Static Scheduling – VLIW or EPIC (Itanium)

• __________ schedules the instructions

• If one instruction stalls, all following instructions stall

• Book Example: SuperScalar MIPS:
  • Two instructions / cycle
  • one alu/branch, one ld/st each cycle

Page 20

Schedule for SS MIPS

Loop: lw   $t0, 0($s1)
      addu $t0, $t0, $s2
      sw   $t0, 0($s1)
      addi $s1, $s1, -4
      bne  $s1, $zero, Loop

PC    ALU/branch    ld/st
0
8
16
24
32

Page 21

SuperScalars - Static

[Pipeline diagram: stages Fetch, Decode, Execute, Memory, WriteBack, with labels "Read Values" and "Write Values"; instructions shown: lw, addu, sw, addi, bne]

Page 22

Loop Problem

• Problem:
  – Too many _______________ in loop
  – Not enough ______________ to fill in holes

• Solution:
  – Do ______________ at once
  – More instructions
  – Only one branch

Page 23

Loop Unrolling
1. Unroll Loop

Loop: lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, 4($s1)
      lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, 4($s1)
      bne  $s1, $zero, Loop

Loop: lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, 4($s1)
      bne  $s1, $zero, Loop

Page 24

Loop Unrolling
2. Rename Registers

Loop: lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, 4($s1)
      lw   $t1, 0($s1)
      addi $s1, $s1, -4
      addu $t1, $t1, $s2
      sw   $t1, 4($s1)
      bne  $s1, $zero, Loop

But wait!!! How has this helped? There are tons of dependencies! Whatever are we to do? Register Renaming!!!

Page 25

Loop Unrolling
2. Rename Registers

Loop: lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, 4($s1)
      lw   $t1, 0($s1)
      addi $s1, $s1, -4
      addu $t1, $t1, $s2
      sw   $t1, 4($s1)
      bne  $s1, $zero, Loop

(Repeated slide for your reference)

Loop: lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, 4($s1)
      lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, 4($s1)
      bne  $s1, $zero, Loop

Page 26

Loop Unrolling
3. Reduce Instructions

Loop: lw   $t0, 0($s1)
      addi $s1, $s1, -8
      addu $t0, $t0, $s2
      sw   $t0, 8($s1)
      lw   $t1, 4($s1)
      addu $t1, $t1, $s2
      sw   $t1, 4($s1)
      bne  $s1, $zero, Loop

Loop: lw   $t0, 0($s1)
      addi $s1, $s1, -4
      addi $s1, $s1, -4
      addu $t0, $t0, $s2
      sw   $t0, ___($s1)
      lw   $t1, ___($s1)
      addu $t1, $t1, $s2
      sw   $t1, 4($s1)
      bne  $s1, $zero, Loop
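The same three steps (unroll, rename, merge the pointer updates) can be sketched at the C level. This is an illustration under assumptions: the function names are hypothetical, the loop walks upward through the array rather than downward as the MIPS code does, and a cleanup step is added for odd lengths, which the slides do not discuss.

```c
#include <stddef.h>

/* Rolled loop: one load, add, store and branch per element. */
void add_scalar(int *a, size_t n, int s) {
    for (size_t i = 0; i < n; i++) {
        a[i] += s;
    }
}

/* Unrolled by two. t0 and t1 play the role of $t0 and $t1 after
   renaming, and the single `i += 2` mirrors merging the two addi
   instructions into one. */
void add_scalar_unrolled(int *a, size_t n, int s) {
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        int t0 = a[i];        /* first copy of the body  */
        int t1 = a[i + 1];    /* second copy, independent temporary */
        a[i] = t0 + s;
        a[i + 1] = t1 + s;
    }
    if (i < n) {              /* leftover element when n is odd */
        a[i] += s;
    }
}
```

Because the two copies of the body use different temporaries, a scheduler is free to interleave them; only one loop branch is executed per two elements.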

Page 27

Loop Unrolling
4. Schedule

Loop: lw1   $t0, 0($s1)
      addi  $s1, $s1, -8
      addu1 $t0, $t0, $s2
      sw1   $t0, 8($s1)
      lw2   $t1, 4($s1)
      addu2 $t1, $t1, $s2
      sw2   $t1, 4($s1)
      bne   $s1, $zero, Loop

ALU/branch    lw/sw
              lw1

Page 28

Performance Comparison

Original

ALU/branch            ld/st
                      lw $t0, 0($s1)
addi $s1, $s1, -4
addu $t0, $t0, $s2
bne $s1, $zero, L     sw $t0, 4($s1)

Unrolled

Page 29

Static Scheduling Summary

• Code size ______________ (because of nops)

• It cannot resolve __________ dependencies

• If one instruction stalls, ___________________

Page 30

Dynamic Scheduling

• _________ schedules ready instructions

• Only ___________ instructions stall

• _______________ resolved in hardware

Page 31

4-wide Dynamic Superscalar: Fetch

[Diagram: Register File, Register Alias Table, Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions shown with renamed operands: addu r2, ldst1, r5; sw 1add1, 0(s1); bne 2add1, r7, Loop

Fetch 4 instructions each cycle

Page 32

4-wide Dynamic Superscalar: Decode

[Diagram: Register File, Register Alias Table (entries shown: 2add1, 1add1, 2, 3), Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions shown with renamed operands: addu r2, ldst1, r5; sw 1add1, 0(s1); bne 2add1, r7, Loop

Register Alias Table records
1. Current Register Number (WAW/WAR Register Renaming)
or

Page 33

4-wide Dynamic Superscalar: Decode

[Diagram: Register File, Register Alias Table (entries shown: 2add1, 1add1, 2, 3), Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions shown with renamed operands: addu r2, ldst1, r5; sw 1add1, 0(s1); bne 2add1, r7, Loop

Register Alias Table records
1. Current Register Number (WAW/WAR Register Renaming)
or
2. Functional Unit (RAW – result not ready)
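A very reduced C sketch of what a register alias table does at decode. It is an assumption-laden model, not the machine drawn on the slide: each architectural register maps either to "value is in the register file" or to the tag of the instruction that will produce it. Tags here are plain counters rather than functional-unit slot names such as ldst1 or 1add1, entries are never cleared at commit, and all names (`AliasTable`, `rat_rename`, and so on) are hypothetical.

```c
#define NUM_REGS 32
#define IN_REGFILE (-1)   /* the value can be read from the register file */

typedef struct {
    int producer[NUM_REGS];  /* tag of the pending producer, or IN_REGFILE */
    int next_tag;            /* next free tag (simple counter) */
} AliasTable;

typedef struct {
    int dst_tag;   /* tag allocated for the result, IN_REGFILE if none */
    int src1_tag;  /* IN_REGFILE: operand ready; otherwise wait for this tag */
    int src2_tag;
} Renamed;

void rat_init(AliasTable *t) {
    for (int r = 0; r < NUM_REGS; r++) {
        t->producer[r] = IN_REGFILE;
    }
    t->next_tag = 0;
}

/* dst, src1, src2 are architectural register numbers, or -1 if unused.
   Sources are looked up BEFORE the destination is remapped, so that
   "addu r2, r2, r5" reads the old producer of r2. */
Renamed rat_rename(AliasTable *t, int dst, int src1, int src2) {
    Renamed out;
    out.src1_tag = (src1 >= 0) ? t->producer[src1] : IN_REGFILE;
    out.src2_tag = (src2 >= 0) ? t->producer[src2] : IN_REGFILE;
    out.dst_tag = IN_REGFILE;
    if (dst >= 0) {
        out.dst_tag = t->next_tag++;
        t->producer[dst] = out.dst_tag;  /* later readers wait on this tag */
    }
    return out;
}
```

Renaming the loop body in order gives the same pattern as the slide: addu waits on the tag of lw, sw waits on the tag of addu, and bne waits on the tag of addi.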

Page 34

4-wide Dynamic Superscalar: Execute

[Diagram: Register File, Register Alias Table, Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions shown with renamed operands: addu r2, ldst1, r5; sw 1add1, 0(s1); bne 2add1, r7, Loop

Wait until your inputs are ready

Page 35

4-wide Dynamic Superscalar: Execute

[Diagram: Register File, Register Alias Table, Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions shown with renamed operands: addu r2, ldst1, r5; sw 1add1, 0(s1); bne 2add1, r7, Loop

Execute once they are ready

Page 36

4-wide Dynamic Superscalar: Memory

[Diagram: Register File, Register Alias Table, Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions shown with renamed operands: addu r2, ldst1, r5; sw 1add1, 0(s1); bne 2add1, r7, Loop

First calculate the address

Page 37

4-wide Dynamic Superscalar: Memory

[Diagram: Register File, Register Alias Table, Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions shown with renamed operands: addu r2, ldst1, r5; sw 1add1, 0(s1); bne 2add1, r7, Loop

Ld/St Queue checks memory addresses – out of order lw/sw

Page 38

4-wide Dynamic Superscalar: Commit

[Diagram: Register File, Register Alias Table, Instruction Window, functional units (Ld/St, 1Add, 2Add, 3Add), Ld/St Queue, Commit Buffer]

KEY: Waiting for value / Reading value

Loop: lw   r2, 0(r1)
      addu r2, r2, r5
      sw   r2, 0(r1)
      addi r1, r1, -4
      bne  r1, r7, Loop

Instructions wait until all previous instructions have completed

Page 39

Fallacies & Pitfalls

• Pipelining is easy
  – ______________ is difficult

• Instruction set has no impact on pipelining
  – Complicated _____________ & _____________________ instructions complicate pipelining immensely

Page 40

Technology Influences

• Pipelining ideas are good ideas regardless of technology
  – Only recently, with extra chip space, has ___________________ become better than ____________________
  – Now, pipelining limited by ________

Page 41

Exceptions – Unexpected Events

• Internal
• External

Page 42

Definitions

a. Anything unexpected happens
b. External event occurs
c. Internal event occurs
d. Change in control flow

            Exception    Interrupt
PowerPC
Intel
MIPS

Page 43

Exception-Handling

• Stop
• Transfer control to OS
• Tell OS what happened
• Begin executing where we left off

Page 44

1. Detect Exception

• Add control lines to detect errors

Page 45

Step 2: Store PC into EPC

[Datapath diagram: PC, +4 adder, Instruction Memory, Register File, Sign Ext (16 to 32), <<2 shifters, Data Memory]

Page 46

Step 3: Tell OS the problem

• Store error code in the _________

• Use vectored interrupts

– Use error code to determine _________

Page 47

Cause Register

• Set a flag in the cause register

• How does the OS find out if an overflow occurred if the bit corresponding to an overflow is bit 5?
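One way to answer the question is to isolate the bit with a shift and a mask. The sketch below assumes bits are numbered from 0 at the least significant end, so bit 5 has the value 0x20; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define CAUSE_OVERFLOW_BIT 5u   /* bit position taken from the slide's question */

/* Returns 1 if the overflow flag is set in the cause register value,
   0 otherwise. Other flags in the register are ignored. */
int overflow_occurred(uint32_t cause) {
    return (int)((cause >> CAUSE_OVERFLOW_BIT) & 1u);
}
```

In MIPS assembly the same test is an `andi` with the mask 0x20 followed by a compare against zero.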

Page 48

Vectored Interrupts

• The address of trap handler is determined by cause

Exception type           Exception vector address (in hex)
Undefined Instruction    C0 00 00 00hex
Arithmetic Overflow      C0 00 00 20hex
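The table can be read as a lookup from exception type to handler address. A minimal sketch, using only the two entries given on the slide; the enum and function names are hypothetical:

```c
#include <stdint.h>

typedef enum {
    EXC_UNDEFINED_INSTRUCTION,
    EXC_ARITHMETIC_OVERFLOW,
    EXC_COUNT
} ExceptionType;

/* Vector addresses copied from the slide's table. */
static const uint32_t vector_table[EXC_COUNT] = {
    0xC0000000u,  /* Undefined Instruction */
    0xC0000020u   /* Arithmetic Overflow   */
};

/* With vectored interrupts the cause selects the handler address
   directly, so the OS does not need to decode a cause register first. */
uint32_t handler_address(ExceptionType type) {
    return vector_table[type];
}
```

The two entries are 0x20 bytes apart, which leaves room for eight 4-byte instructions per handler stub.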

Page 49

Cause Register – Go to OS

[Datapath diagram: PC, +4 adder, Instruction Memory, Register File, Sign Ext (16 to 32), <<2 shifters, Data Memory; added elements: EPC, -4, Cause, Handler PC]

Page 50

Vectored Interrupt – Go to OS

[Datapath diagram: PC, +4 adder, Instruction Memory, Register File, Sign Ext (16 to 32), <<2 shifters, Data Memory; added elements: EPC, -4, Cause, Vector Table]

Page 51

Steps for Exceptions

• Detect exception

• Place processor in state before offending instruction

• Record exception type

• Record instruction’s PC in EPC

• Transfer control to OS
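The bookkeeping part of these steps can be sketched as a tiny software model. It is only a sketch: the `Cpu` struct and function names are hypothetical, `HANDLER_PC` is a placeholder value chosen for illustration and not taken from the slides, and restoring processor state (flushing later instructions) is not modeled.

```c
#include <stdint.h>

#define HANDLER_PC 0x80000180u  /* placeholder OS entry point (assumption) */

typedef struct {
    uint32_t pc;     /* program counter */
    uint32_t epc;    /* exception program counter */
    uint32_t cause;  /* recorded exception type */
} Cpu;

/* Called once an exception has been detected for the instruction at
   offending_pc. */
void take_exception(Cpu *cpu, uint32_t offending_pc, uint32_t cause_code) {
    cpu->cause = cause_code;   /* record exception type */
    cpu->epc = offending_pc;   /* record instruction's PC in EPC */
    cpu->pc = HANDLER_PC;      /* transfer control to OS */
}

/* Resume where we left off. Here the offending instruction is retried;
   a handler that wants to skip it would resume at epc + 4 instead. */
void return_from_exception(Cpu *cpu) {
    cpu->pc = cpu->epc;
}
```

With a cause register, every exception enters the OS at the same address and the handler inspects `cause`; with vectored interrupts the target address itself would depend on the exception type.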

Page 52

1. Detection

What happens if the third instruction is undefined?

add $s0, $0, $0
lw $s1, 0($t0)
undefined
or $s3, $s4, $t3

[Timing diagram: cycles 1 to 8 along the Time-> axis, with IF, ID, MEM, WB stage labels for each instruction]

In what stage is it detected? In what cycle?

Page 53

1. Detection

• Must associate exception with proper instruction

• What happens if multiple exceptions happen in the same cycle?

Page 54

2. Preserve state before instruction

add $s0, $0, $0
lw $s1, 0($t0)
undefined
or $s3, $s4, $t3

[Timing diagram: cycles 1 to 8 along the Time-> axis, with IF, ID, MEM stage labels]

What? What does that mean?!?

Page 55

3. Record exception type

• Place value in cause register or

• Use vectored interrupts
  – (exception routine address dependent on exception type)

Page 56

4. Record PC in EPC
Machine in detection cycle

[Pipelined datapath diagram: PC, +4 adder, Inst Mem, RegFile, ALU, DataMem; instructions in the pipeline: Undef, add, lw, or]

Page 57

4. Record PC in EPC
Machine before transfer

[Pipelined datapath diagram: PC, +4 adder, Inst Mem, RegFile, ALU, DataMem; instruction in the pipeline: Undef]

Where is the proper PC? Long gone!!!

Page 58

4. Record PC in EPC

• Non-trivial because PC changes each cycle, and exceptions can be detected in several stages (decode, execute, memory)

• Precise exceptions

• Imprecise exceptions

Page 59

5. Transfer control to OS

• Same as before