Csci 136 Computer Architecture II – Branch Hazards, Exceptions
Xiuzhen Cheng [email protected]


Page 1: Csci 136 Computer Architecture II – Branch Hazards, Exceptions

Xiuzhen Cheng [email protected]

Page 2: Announcement

Homework assignment #10: due before class, April 12

Readings: Sections 6.4 – 6.5

Problems: 6.17-6.19, 6.21-6.22, 6.33-6.36, 6.39-6.40 (six of them will be graded; your TA will give hints in the lab sections)

Project #3 is due on April 10, 2005

Quiz #4: April 12, 2005

Final: Thursday, May 12, 12:40PM-2:40PM

Note: you must pass the final to pass this course!

Page 3: Review on Data Hazards, Forwarding, Stall

When does a data hazard happen?
  Data dependencies

Using forwarding to overcome data hazards
  Data is available after the ALU stage
  Forwarding conditions

Stall the pipeline for load-use instructions
  Data is available after the MEM stage (lw instruction)
  Hazard detection conditions
  Why in the ID stage?
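The forwarding and hazard detection conditions reviewed above can be written out as a rough Python sketch. The function and parameter names are illustrative stand-ins for pipeline-register fields (for example, `ex_mem_rd` for EX/MEM.RegisterRd), and the conditions follow the usual textbook formulation, which this slide does not spell out.

```python
def forward_a(id_ex_rs, ex_mem_reg_write, ex_mem_rd, mem_wb_reg_write, mem_wb_rd):
    """Pick the source of the first ALU operand.

    Returns 0b10 (forward from EX/MEM), 0b01 (forward from MEM/WB),
    or 0b00 (use the value read from the register file).
    """
    # EX hazard: the instruction one ahead produced the value in its ALU stage.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_rs:
        return 0b10
    # MEM hazard: the instruction two ahead is about to write the value back.
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == id_ex_rs:
        return 0b01
    return 0b00


def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Checked in ID: stall when the instruction in EX is a load whose
    destination register is read by the instruction currently in ID."""
    return bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))


# lw $2, 20($1) in EX while and $4, $2, $5 is in ID: one stall cycle is needed.
print(load_use_stall(True, 2, 2, 5))
```

The stall check sits in ID because that is the last point at which the dependent instruction can be held back before it enters EX with a stale operand.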

Page 4: Review on Data Hazards

Page 5: Review on Data Hazards, Forwarding, Stall

[Figure: pipelined datapath diagram; visible labels: Sign-extend, PC+4]

Page 6: LW and SW

lw $5, 0($15)
sw $5, 100($15)

lw $5, 0($15)
beq $5, $0, Exit
sw $5, 100($15)

lw $5, 0($15)
add $8, $8, $8
sw $5, 100($15)

[Figure: datapath diagram; visible label: Sign-Ext]

Page 7: SW is in MEM Stage

lw $5, 0($15)
sw $5, 100($15)

Forwarding condition:
MEM/WB.RegWrite and EX/MEM.MemWrite and
MEM/WB.RegisterRd = EX/MEM.RegisterRd and
MEM/WB.RegisterRd != 0

[Figure: datapath diagram with lw and sw in flight; visible labels: Sign-Ext, EX/MEM, Data memory]

Page 8: SW is in EX Stage

Forwarding condition:
ID/EX.MemWrite and MEM/WB.RegWrite and
MEM/WB.RegisterRd = ID/EX.RegisterRt and
MEM/WB.RegisterRd != 0

[Figure: datapath diagram with lw and sw in flight; visible label: Sign-Ext]
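The two store-data forwarding conditions (sw in MEM, sw in EX) translate directly into boolean checks. The sketch below is only an illustration: the function and parameter names are made up here and mirror the pipeline-register fields named on the slides.

```python
def forward_store_data_sw_in_mem(mem_wb_reg_write, mem_wb_rd,
                                 ex_mem_mem_write, ex_mem_rd):
    """sw is in MEM and the producing instruction (e.g. lw) is in WB."""
    return bool(mem_wb_reg_write and ex_mem_mem_write
                and mem_wb_rd == ex_mem_rd
                and mem_wb_rd != 0)          # $0 is never forwarded


def forward_store_data_sw_in_ex(mem_wb_reg_write, mem_wb_rd,
                                id_ex_mem_write, id_ex_rt):
    """sw is in EX and the producing instruction is in WB."""
    return bool(id_ex_mem_write and mem_wb_reg_write
                and mem_wb_rd == id_ex_rt
                and mem_wb_rd != 0)


# lw $5, 0($15) immediately followed by sw $5, 100($15):
# when lw reaches WB, sw is in MEM and needs the loaded value.
print(forward_store_data_sw_in_mem(True, 5, True, 5))
```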

Page 9: More Cases

lw $15, 0($8)      # load-use:
sw $5, 100($15)    # stall the pipeline

R-Type followed by sw?
  The result from the R-Type will be saved into memory
  The R-Type will overwrite the base register for sw

Page 10: An Example

40: lw $2, 20($1)

44: and $4, $2, $5

48: or $8, $2, $4

Clock Cycle 1:

Clock Cycle 2:

Clock Cycle 3:

Clock Cycle 4:
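To trace this example cycle by cycle, a small scheduling sketch can help. It assumes a five-stage pipeline with full forwarding, so the only stall is one cycle for a load followed immediately by an instruction that reads the loaded register. The tuple layout and the `schedule` function are hypothetical helpers written for this illustration.

```python
def schedule(program):
    """Return, per instruction, the clock cycle in which it enters each stage.

    program is a list of (text, dest, sources, is_load) tuples.
    Assumes full forwarding: only a load-use pair costs one stall cycle.
    """
    rows = []
    prev = None  # (dest, is_load, cycles) of the previous instruction
    for text, dest, sources, is_load in program:
        if prev is None:
            if_c, id_c, stall = 1, 2, 0
        else:
            prev_dest, prev_is_load, prev_cycles = prev
            if_c = prev_cycles["ID"]   # fetched once the previous instruction leaves IF
            id_c = prev_cycles["EX"]   # decoded once the previous instruction leaves ID
            uses_load = prev_is_load and prev_dest != 0 and prev_dest in sources
            stall = 1 if uses_load else 0
        ex_c = id_c + 1 + stall        # a stalled instruction waits one extra cycle in ID
        cycles = {"IF": if_c, "ID": id_c, "EX": ex_c, "MEM": ex_c + 1, "WB": ex_c + 2}
        rows.append((text, cycles))
        prev = (dest, is_load, cycles)
    return rows


program = [
    ("lw $2, 20($1)", 2, (1,), True),
    ("and $4, $2, $5", 4, (2, 5), False),
    ("or $8, $2, $4", 8, (2, 4), False),
]
for text, cycles in schedule(program):
    print(f"{text:16} {cycles}")
```

Under these assumptions the and instruction enters EX in cycle 5 instead of cycle 4; the empty EX slot in cycle 4 is the bubble.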

Page 11: Clock 1

[Figure: datapath snapshot at clock 1. Instruction in the pipeline: lw $2, 20($1). Other visible labels: Sign-extend, PC+4, 44]

Page 12: Clock 2

[Figure: datapath snapshot at clock 2. Instructions in the pipeline, newest first: and $4, $2, $5; lw $2, 20($1). Other visible labels: Sign-extend, PC+4]

Page 13: Clock 3

[Figure: datapath snapshot at clock 3. Instructions in the pipeline, newest first: or $8, $2, $4; and $4, $2, $5; lw $2, 20($1). Other visible labels: Sign-extend, PC+4]

Page 14: Clock 4

[Figure: datapath snapshot at clock 4. Instructions in the pipeline, newest first: or $8, $2, $4; and $4, $2, $5; bubble; lw $2, 20($1). Other visible labels: Sign-extend, PC+4]

Page 15: Clock 5

[Figure: datapath snapshot at clock 5. Instructions in the pipeline, newest first: or $8, $2, $4; and $4, $2, $5; bubble; lw $2, 20($1). Other visible labels: Sign-extend, PC+4]

Page 16: Branch Hazards

Control hazard: an attempt to make a decision before the condition is evaluated

Page 17: Branch Hazards

[Figure: pipeline diagram with three following instructions marked "flush" and the label "Decision is made here"]

Page 18: Observations

Branch decision does not occur until the MEM stage; 3 CCs are wasted. (Current design, non-optimized)

Is it possible to reduce the branch delay? YES

In the EX stage? Two CCs branch delay

In the ID stage? One CC branch delay
How? For beq $x, $y, label: compute $x xor $y, then OR all the bits of the result; this is much faster than an ALU operation. We also have a separate ALU to compute the branch address.

3 strategies: delayed branch; static branch prediction; dynamic branch prediction
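The ID-stage equality test mentioned above (XOR the two registers, then OR all the result bits) can be mimicked in software. This is only a sketch of the logic, not of the hardware; `registers_equal` and its `width` parameter are names chosen for this illustration.

```python
def registers_equal(x, y, width=32):
    """Equality via XOR followed by an OR-reduction of the result bits."""
    diff = (x ^ y) & ((1 << width) - 1)    # a 1 bit marks every position that differs
    any_bit_set = 0
    for bit in range(width):
        any_bit_set |= (diff >> bit) & 1   # OR all the bits together
    return any_bit_set == 0                # no differing bit means the values are equal


# beq $x, $y, label is taken exactly when this returns True.
print(registers_equal(5, 5), registers_equal(5, 4))
```

In hardware the XOR gates and the OR tree need no carry chain, which is why this comparison fits in the ID stage more easily than a full ALU subtraction.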

Page 19: Delayed Branch

Will always execute the instruction following the branch.

Only one instruction will be executed

Done by compiler or assembler
  50% success rate

Losing popularity
  Why?
  More pipeline stages
  Superscalar

Page 20: Scheduling the Branch Delay Slot

Independent instruction: best choice

B is good when the probability of taking the branch is high

It must be OK to execute the sub instruction when the branch goes in the unexpected direction

Page 21: Static Branch Prediction

Assume the branch will not be taken; if the prediction is wrong, clear the effect of the sequential instruction execution.

How to discard instructions in the pipeline?

Branch decision is made at the MEM stage: instructions in the IF, ID, and EX stages need to be discarded.

Branch decision is made at the ID stage: only flush the IF/ID pipeline register!
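How much work is discarded on a wrong "not taken" guess depends only on where the branch is resolved. A minimal sketch, assuming the classic five-stage pipeline; `stages_to_flush` is a hypothetical helper for this illustration.

```python
STAGE_ORDER = ["IF", "ID", "EX", "MEM", "WB"]


def stages_to_flush(decision_stage):
    """Stages holding instructions fetched after the branch at the moment the
    branch decision is made; these are discarded when the prediction is wrong."""
    return STAGE_ORDER[:STAGE_ORDER.index(decision_stage)]


for stage in ("MEM", "EX", "ID"):
    flushed = stages_to_flush(stage)
    print(f"decision in {stage}: flush {flushed}, {len(flushed)} cycle(s) lost")
```

The number of flushed stages equals the misprediction penalty in clock cycles: three when the decision is made in MEM, one when it is made in ID.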

Page 22: Static Branch Prediction

[Figure: pipeline diagram with three following instructions marked "flush" and the label "Decision is made here"]

Page 23: Static Branch Prediction

[Figure: pipelined datapath diagram; visible label: IF.Flush]

Page 24: Pipelined Branch – An Example

[Figure: datapath snapshot for a branch example; visible labels: instruction addresses 36, 40, 44, and 72, registers $4 and $8, IF.Flush]

Page 25: Pipelined Branch – An Example

[Figure: datapath snapshot for the same branch example; visible label: instruction address 72]

Page 26: Dynamic Branch Prediction

Static branch prediction is crude!

Take history into consideration
  If a branch was taken last time, fetch the next instruction from the same place as last time

Branch prediction buffer: indexed by the lower bits of the branch instruction's address

This memory contains a bit (or bits) that tells whether the branch was recently taken or not

Is the prediction correct? Any bad effect?

1-bit prediction scheme

2-bit prediction scheme

[Figure: 2-bit prediction state diagram with two "Prediction Taken" states and two "Prediction not Taken" states, connected by "taken" and "not taken" transitions]
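A branch prediction buffer of this kind can be sketched as a small table of counters. The version below is one common realization, a 2-bit saturating counter per entry; the exact transitions of the slide's state diagram are not recoverable from the text, so treat the update rule, the class name, and the `index_bits` parameter as assumptions.

```python
class BranchPredictionBuffer:
    """Table of 2-bit saturating counters indexed by low bits of the branch address."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Assumption: every entry starts at 0, the strongest "not taken" state.
        self.counters = [0] * (1 << index_bits)

    def _index(self, pc):
        # Instructions are word aligned, so the two lowest address bits are dropped.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        """True means "predict taken" (counter value 2 or 3)."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move the counter one step toward the actual outcome, saturating at 0 and 3."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(self.counters[i] + 1, 3)
        else:
            self.counters[i] = max(self.counters[i] - 1, 0)


bpb = BranchPredictionBuffer()
bpb.update(0x40, taken=True)
bpb.update(0x40, taken=True)
print(bpb.predict(0x40))   # the branch at 0x40 is now predicted taken
print(bpb.predict(0x80))   # 0x80 shares the same entry, so it inherits that prediction
```

Because only the low address bits are used, two different branches can share an entry. That aliasing is one answer to the "any bad effect?" question: the prediction may belong to another branch, which costs performance but never correctness.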

Page 27: Observation

Since we move the branch decision to the ID stage, we need to copy the forwarding-control hardware to the ID stage too!

beq following lw
  The hazard detection unit should work.

Page 28: In-Class Exercise

Consider a loop branch that branches nine times in a row, then is not taken once. What is the prediction accuracy for this branch, assuming the prediction bit for this branch remains in the prediction buffer?

With 1-bit prediction?

With 2-bit prediction?

[Figure: 2-bit prediction state diagram with two "Prediction Taken" states and two "Prediction not Taken" states, connected by "taken" and "not taken" transitions]
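One way to check an answer to this exercise is to simulate the loop. The sketch below models both schemes as saturating counters (an assumption; for this branch pattern the result does not depend on the finer details of the 2-bit state diagram) and scores only steady-state behavior by running one unscored warm-up pass of the loop first. The `simulate` function is a hypothetical helper.

```python
def simulate(outcomes, bits, warmup=()):
    """Count correct predictions of an n-bit saturating-counter predictor.

    Outcomes in `warmup` train the predictor but are not scored.
    Returns (correct, total) for the scored outcomes.
    """
    top = (1 << bits) - 1
    threshold = 1 << (bits - 1)   # predict taken when the counter is in its upper half
    counter = 0                   # assumption: the entry starts at "not taken"
    warmup = list(warmup)
    correct = 0
    for step, taken in enumerate(warmup + list(outcomes)):
        predicted_taken = counter >= threshold
        if step >= len(warmup) and predicted_taken == taken:
            correct += 1
        counter = min(counter + 1, top) if taken else max(counter - 1, 0)
    return correct, len(outcomes)


loop = [True] * 9 + [False]       # taken nine times in a row, then not taken once
print("1-bit:", simulate(loop * 10, bits=1, warmup=loop))
print("2-bit:", simulate(loop * 10, bits=2, warmup=loop))
```

In steady state the 1-bit scheme mispredicts twice per loop execution (the first and the last iteration), giving 80 correct out of 100, while the 2-bit scheme mispredicts only the final iteration, giving 90 out of 100.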

Page 29: Performance Comparison

Compare the performance of single-cycle, multi-cycle, and pipelined datapaths

200ps for memory access, 100ps for ALU operation, 50ps for register file access

25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU ops

For the pipelined datapath, 50% of loads are immediately followed by an instruction that uses the result

Branch delay on misprediction is 1 clock cycle, and 25% of branches are mispredicted

Jump delay is 1 clock cycle
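The comparison can be worked through numerically. The sketch below uses the latencies and instruction mix given above; the slide does not state the rest, so these are assumptions: the single-cycle clock is set by lw (instruction fetch, register read, ALU, data memory, register write), the multi-cycle design takes 5/4/4/3/3 clock cycles for loads/stores/ALU ops/branches/jumps, and both the multi-cycle and pipelined clocks are set by the 200 ps memory access.

```python
MEM_PS, ALU_PS, REG_PS = 200, 100, 50
mix = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "alu": 0.52}

# Single-cycle: every instruction takes as long as the slowest one (assumed: lw).
single_cycle_ps = MEM_PS + REG_PS + ALU_PS + MEM_PS + REG_PS

# Multi-cycle: assumed clock cycles per instruction class, 200 ps clock.
multi_cycle_clocks = {"load": 5, "store": 4, "alu": 4, "branch": 3, "jump": 3}
multi_cpi = sum(mix[kind] * multi_cycle_clocks[kind] for kind in mix)
multi_cycle_ps = multi_cpi * MEM_PS

# Pipelined: base CPI of 1 plus the stall cycles described on the slide.
load_use_stalls = mix["load"] * 0.50 * 1      # half of the loads stall one cycle
branch_stalls = mix["branch"] * 0.25 * 1      # a quarter of the branches lose one cycle
jump_stalls = mix["jump"] * 1                 # every jump loses one cycle
pipe_cpi = 1 + load_use_stalls + branch_stalls + jump_stalls
pipelined_ps = pipe_cpi * MEM_PS

print(f"single-cycle: {single_cycle_ps} ps per instruction")
print(f"multi-cycle:  CPI {multi_cpi:.2f}, {multi_cycle_ps:.1f} ps per instruction")
print(f"pipelined:    CPI {pipe_cpi:.4f}, {pipelined_ps:.1f} ps per instruction")
```

Under these assumptions the average time per instruction is 600 ps for the single-cycle design, about 824 ps for the multi-cycle design (CPI 4.12), and about 234.5 ps for the pipelined design (CPI 1.1725).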

Page 30: Exceptions

Exceptions: events other than branches or jumps that change the normal flow of instruction execution

Arithmetic overflow, undefined instruction, etc.
  Internal to the processor

Interrupts come from outside the processor (I/O interrupts)

Use arithmetic overflow as an example
  When an overflow is detected, we need to transfer control to the exception handling routine at location 0x 8000 0180 immediately, because we do not want this invalid value to contaminate other registers or memory locations

Similar idea to the branch hazard

Detected in the EX stage

Deassert all control signals in the EX and ID stages, flush IF/ID
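As a rough illustration of the overflow case, the sketch below detects signed 32-bit overflow for an add and picks the next fetch address accordingly. It models only the control-flow redirection described above; the function names are hypothetical, and the operands are assumed to already be valid signed 32-bit values.

```python
HANDLER_ADDRESS = 0x80000180   # exception handler location given on the slide


def to_signed32(value):
    """Wrap an integer to the signed 32-bit value a register would hold."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def add_overflows(a, b):
    """True when the exact sum of two signed 32-bit values does not fit in 32 bits."""
    exact = a + b
    return exact != to_signed32(exact)


def next_pc_after_add(pc, a, b):
    """Address fetched next: the handler on overflow, otherwise the following instruction."""
    return HANDLER_ADDRESS if add_overflows(a, b) else pc + 4


print(hex(next_pc_after_add(0x4C, 2**31 - 1, 1)))   # overflow: redirect to the handler
print(hex(next_pc_after_add(0x4C, 1, 2)))           # no overflow: continue sequentially
```

As with a mispredicted branch, the instructions already fetched behind the overflowing add must be turned into bubbles so that the invalid result never reaches a register or memory.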

Page 31: Exceptions

[Figure: pipelined datapath with exception support; visible label: 80000180]

Page 32: Example

sub $11, $2, $4
and $12, $2, $5
or $13, $2, $6
add $1, $2, $1     # overflow occurs
slt $15, $6, $7
lw $16, 50($7)

Exception handling routine:

0x 8000 0180 sw $25, 1000($0)
0x 8000 0184 sw $26, 1004($0)

Page 33: Example

[Figure: datapath snapshot at clock 6; visible label: 80000180]

Page 34: Example

[Figure: datapath snapshot at clock 7; visible label: 80000180]

Page 35: Questions?