Computer Architecture: A Constructive Approach Branch Direction Prediction Pipeline Integration Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology April 23, 2012 L20-1 http://csg.csail.mit.edu/ 6.S078

Page 1: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Computer Architecture: A Constructive Approach

Branch Direction Prediction – Pipeline Integration

Joel Emer
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology

April 23, 2012 L20-1
http://csg.csail.mit.edu/6.S078

Page 2: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

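NA pred with decode feedback

[Figure: pipeline diagram. Stages Fetch (F), Decode (D), RegRead (R), Execute (X), Memory (M), and Write-back (W) are linked by the FIFOs fr, dr, rr, xr, and mr. Feedback paths xf and df carry execute and decode feedback; predictor blocks are labeled Next Address Prediction and Direction Prediction.]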

Page 3: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Direction prediction recipe

Execute
  • Send redirects on mispredicts (unchanged)
  • Send direction prediction training

Decode
  • Check if next address matches direction pred
  • Send redirect if different (update naPred)

Fetch
  • Generate prediction
  • Learn from feedback
  • Accept redirects from later stages

Page 4: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Epoch management recipe

Execute
  • On exec epoch mismatch – poison instruction
  • Otherwise, on mispredict – change exec epoch and redirect.

Decode
  • On new exec epoch – update local exec/decode epochs
  • Otherwise, on decode epoch mismatch – drop instruction
  • If not dropped, on next addr mispredict – change decode epoch and redirect.

Fetch
  • On exec redirect – update local exec epoch
  • On decode redirect – if for current exec epoch then update local decode epoch
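The Decode part of the epoch recipe can be modelled in a few lines of software. The following is a minimal Python sketch, not the lab's BSV: the class name `DecodeEpochs`, the method `accept`, and the string results are hypothetical names chosen for illustration, and epoch wrap-around is ignored.

```python
from dataclasses import dataclass


@dataclass
class DecodeEpochs:
    """Hypothetical model of the epoch state kept by the Decode stage."""
    exec_epoch: int = 0  # decode's local copy of the execute epoch
    dec_epoch: int = 0   # decode's own epoch

    def accept(self, inst_exec_epoch, inst_dec_epoch, next_addr_mispredicted):
        """Classify one incoming instruction as 'drop', 'pass', or 'redirect'."""
        if inst_exec_epoch != self.exec_epoch:
            # New exec epoch: adopt both epochs carried by the instruction.
            self.exec_epoch = inst_exec_epoch
            self.dec_epoch = inst_dec_epoch
        elif inst_dec_epoch != self.dec_epoch:
            # Same exec epoch but stale decode epoch: wrong-path instruction.
            return "drop"
        if next_addr_mispredicted:
            # Decode disagrees with the fetched next address: new decode epoch.
            self.dec_epoch += 1
            return "redirect"
        return "pass"
```

After a redirect, instructions still tagged with the old decode epoch are dropped until Fetch starts tagging instructions with the new one.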

Page 5: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Add direction feedback

typedef struct {
  Bool correct;
  NaInfo naPredInfo;
  Addr nextAddr;
  // Feedback needs information for training direction predictor
  DirInfo dirPredInfo;
  Bool taken;
} Feedback deriving (Bits, Eq);

// Tuple3: execute epoch, decode epoch, feedback
FIFOF#(Tuple3#(Epoch,Epoch,Feedback)) decFeedback <- mkFIFOF;
// Tuple2: execute epoch, feedback
FIFOF#(Tuple2#(Epoch,Feedback)) execFeedback <- mkFIFOF;

Page 6: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Execute (branch analysis)

// after executing instruction...
let nextEeEpoch = eeEpoch;
let cond = execData.execInst.cond;
let nextPc = cond ? execData.execInst.addr : execData.pc+4;
// Note: nextAddrPred may have been reset in decode
let correctPred = (nextPc == execData.nextAddrPred);

if (!correctPred) nextEeEpoch = nextEeEpoch + 1;
eeEpoch <= nextEeEpoch;
// Always send feedback
execFeedback.enq(tuple2(nextEeEpoch,
    Feedback{correct: correctPred,
             taken: cond,
             dirPredInfo: execData.dirPredInfo,
             naPredInfo: execData.naPredInfo,
             nextAddr: nextPc}));

// enqueue instruction to next stage

Page 7: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Decode with mispredict detect

rule doDecode;
  let decData = newDecData(fr.first);
  // Determine if epoch of incoming instruction is on good path:
  // new exec epoch, or same dec epoch
  let correctPath = (decData.execEpoch != deEpoch)
                 || (decData.decEpoch == ddEpoch);

  let instResp = decData.fInst.instResp;
  let pcPlus4 = decData.pc+4;

  if (correctPath) begin
    decData.decInst = decode(instResp, pcPlus4);
    let target = knownTargetAddr(decData.decInst);
    let brClass = getBrClass(decData.decInst);
    let predTarget = decData.nextAddrPred;
    let predDir = decData.dirPred;

Page 8: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Decode with mispredict detect

    // Calculate target as best as decode can
    let decodedTarget = case (brClass)
        NonBranch: pcPlus4;
        UncondKnown: target;
        CondBranch: (predDir?target:pcPlus4);
        default: decData.nextAddrPred;
      endcase;
    // Wrong next addr?
    if (decodedTarget != predTarget) begin
      // New dec epoch
      decData.decEpoch = decData.decEpoch + 1;
      // Tell exec addr of next instruction!
      decData.nextAddrPred = decodedTarget;
      // Send feedback
      decFeedback.enq(
        tuple3(decData.execEpoch, decData.decEpoch,
               Feedback{correct: False,
                        naPredInfo: decData.naPredInfo,
                        nextAddr: decodedTarget,
                        dirPredInfo: decData.dirPredInfo,
                        taken: decData.takenPred}));
    end
    // Enqueue to next stage on correct path
    dr.enq(decData);
  end // of correct path

Page 9: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Decode with mispredict detect

  else begin // incorrect path
    // Preserve current epoch if instruction on incorrect path
    decData.decEpoch = ddEpoch;
    decData.execEpoch = deEpoch;
  end
  // decData.*Epoch have been set properly so we always save them.
  ddEpoch <= decData.decEpoch;
  deEpoch <= decData.execEpoch;
  fr.deq;
endrule

Page 10: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Integration into Fetch

rule doFetch();
  function Action enqInst();
    action
      let d <- mem.side(MemReq{op: Ld, addr: fetchPc, data:?});
      match {.nAddrPred, .naPredInfo} <- naPred.predict(fetchPc);
      match {.dirPred, .dirPredInfo} <- dirPred.predict(fetchPc);
      FBundle fInst = FBundle{instResp: d};
      FData fData = FData{pc: fetchPc,
                          fInst: fInst,
                          inum: iNum,
                          execEpoch: feEpoch,
                          naPredInfo: naPredInfo,
                          nextAddrPred: nAddrPred,
                          dirPredInfo: dirPredInfo,
                          dirPred: dirPred};
      iNum <= iNum + 1;
      fetchPc <= nAddrPred;
      fr.enq(fData);
    endaction
  endfunction

Page 11: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Handling redirect from execute

if (execFeedback.notEmpty) begin
  match {.execEpoch, .fb} = execFeedback.first;
  execFeedback.deq;
  if (!fb.correct) begin
    // Train and repair on redirect
    dirPred.repair(fb.dirPredInfo, fb.taken);
    dirPred.train(fb.dirPredInfo, fb.taken);
    naPred.repair(fb.naPredInfo, fb.nextAddr);
    naPred.train(fb.naPredInfo, fb.nextAddr);
    feEpoch <= execEpoch;
    fetchPc <= fb.nextAddr;
  end
  else begin
    // Just train on correct prediction
    dirPred.train(fb.dirPredInfo, fb.taken);
    naPred.train(fb.naPredInfo, fb.nextAddr);
    enqInst;
  end
end

Page 12: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Handling redirect from decode

else if (decFeedback.notEmpty) begin
  decFeedback.deq;
  match {.execEpoch, .decEpoch, .fb} = decFeedback.first;
  if (execEpoch == feEpoch) begin
    if (!fb.correct) begin // epoch unchanged
      // Just repair, never train, on feedback from decode
      fdEpoch <= decEpoch;
      dirPred.repair(fb.dirPredInfo, fb.taken);
      naPred.repair(fb.naPredInfo, fb.nextAddr);
      fetchPc <= fb.nextAddr;
    end
    else // dec feedback on correct prediction
      enqInst;
  end
  else // dec feedback, but fetch is in new exec epoch
    enqInst;
end
else // no feedback
  enqInst;

Page 13: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Immediate update issues

If the direction predictor does not update immediately on predictions, things are easy. But if the predictor updates immediately, we will predict and update the predictor on non-branches.

Possible solutions:
  • Move direction prediction to decode, so we know not to update on non-branches. But this makes timing more critical.
  • Simply use the direction predictor even on non-branch instructions.

Note: for superscalar issue designs this is a less significant problem.

Note: In the lab code we communicate the branch type of each instruction to allow training and repair to decide if they want to perform updates or not based on instruction type.

Page 14: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Predictor Primitive

Indexed table holding values

Operations
  • Predict
  • Update

Algebraic notation

Prediction = P[Width, Depth](Index; Update)

[Figure: table P of Depth entries, each Width bits wide, with an Index input (I), an Update input (U), and a Prediction output.]
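To make the notation concrete, here is a minimal Python sketch of the primitive P[Width, Depth](Index; Update). The class name `PredictorTable` and its method names are hypothetical; wrapping the index modulo the depth and masking updates to the entry width are assumptions of this sketch, not something the slide specifies.

```python
class PredictorTable:
    """Hypothetical model of P[width, depth]: `depth` entries of `width` bits."""

    def __init__(self, width, depth, initial=0):
        self.depth = depth
        self.mask = (1 << width) - 1          # keeps values within `width` bits
        self.table = [initial & self.mask] * depth

    def predict(self, index):
        # Predict: read the entry selected by the index.
        return self.table[index % self.depth]

    def update(self, index, fn):
        # Update: replace the entry with fn(old value), truncated to `width` bits.
        i = index % self.depth
        self.table[i] = fn(self.table[i]) & self.mask
```

In this sketch the update is passed as a function of the stored value, mirroring how the slides' update expressions refer to the current entry as P.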

Page 15: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

One-bit Predictor

Simple temporal prediction

A21064(PC; T) = P[ 1, 2K ](PC; T)

[Figure: table P with 1-bit entries, indexed (I) by PC and updated (U) with Taken; the output is the Prediction.]

What happens on loop branches?

At best, mispredicts twice for every use of loop.
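The loop-branch behaviour can be checked with a small software model. This is a Python sketch for a single branch (one table entry), with the hypothetical function name `count_mispredicts`; the loop shape (taken four times, then not taken on exit) is an illustrative assumption.

```python
def count_mispredicts(outcomes, entry=False):
    """One-bit predictor for one branch: predict the last direction seen."""
    mispredicts = 0
    for taken in outcomes:
        if entry != taken:
            mispredicts += 1
        entry = taken  # update: remember the latest direction
    return mispredicts


# A loop branch taken 4 times, then not taken on exit; the loop is used 3 times.
loop = ([True] * 4 + [False]) * 3
print(count_mispredicts(loop))  # 6: two mispredicts for every use of the loop
```

Each use of the loop mispredicts once on entry (the bit still says not taken from the previous exit) and once on exit.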

Page 16: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Two-bit Predictor

Counter[W,D](I; T) = P[W, D](I; if T then P+1 else P-1)

A21164(PC; T) = MSB(Counter[2, 2K](PC; T))

[Figure: table P with 2-bit entries, indexed (I) by PC; a +/- adder driven by Taken produces the update (U), and the table output gives the Prediction.]
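A rough Python sketch of the two-bit scheme for a single branch follows. The function name `count_mispredicts_2bit` is hypothetical, and saturating the counter at 0 and 3 is an assumption of this sketch (the formula on the slide only says P+1 or P-1).

```python
def count_mispredicts_2bit(outcomes, counter=0):
    """Two-bit counter for one branch: predict taken when the MSB is set."""
    mispredicts = 0
    for taken in outcomes:
        prediction = counter >= 2  # MSB of a 2-bit value
        if prediction != taken:
            mispredicts += 1
        # Assumed saturating update: count up on taken, down on not taken.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return mispredicts


# Same loop shape as before: taken 4 times, then not taken; used 3 times.
loop = ([True] * 4 + [False]) * 3
print(count_mispredicts_2bit(loop))             # 5: includes two warm-up misses
print(count_mispredicts_2bit(loop, counter=2))  # 3: one miss per use of the loop
```

Once warmed up, a single not-taken outcome at loop exit no longer flips the prediction, so only the exit itself is mispredicted.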

Page 17: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

History Register

History(PC, T) = P(PC; P || T)

[Figure: table P indexed (I) by PC; the update (U) concatenates the stored value with Taken, and the output is the History.]

Page 18: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Global History

GHist(;T) = MSB(Counter(History(0, T); T))

Ind-Ghist(PC;T) = MSB(Counter(PC || Hist(GHist(;T);T)))

[Figure: a history register at index 0 concatenates (Concat) Taken into the Global History, which indexes a +/- counter table that produces the Prediction.]

Can we take advantage of a pattern at a particular PC?
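As an illustration of GHist, here is a minimal Python sketch in which a single global history register indexes a table of two-bit counters. The class name `GlobalHistoryPredictor`, the 4-bit history length, the initial counter value, and the saturating update are all assumptions made for this sketch.

```python
class GlobalHistoryPredictor:
    """Hypothetical model of GHist: counters indexed by global history only."""

    def __init__(self, history_bits=4):
        self.mask = (1 << history_bits) - 1
        self.history = 0                            # History(0, T)
        self.counters = [1] * (1 << history_bits)   # 2-bit, weakly not taken

    def predict(self):
        return self.counters[self.history] >= 2     # MSB of the counter

    def update(self, taken):
        c = self.counters[self.history]
        self.counters[self.history] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the outcome into the history register (P || T).
        self.history = ((self.history << 1) | int(taken)) & self.mask


predictor = GlobalHistoryPredictor()
for i in range(20):
    taken = (i % 2 == 0)          # strictly alternating branch outcomes
    guess = predictor.predict()
    predictor.update(taken)
```

On a strictly alternating outcome stream, the history value identifies which phase comes next, so the model stops mispredicting after a short warm-up, which a plain two-bit counter cannot do.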

Page 19: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Local History

LHist(PC, T) = MSB(Counter(History(PC; T); T))

[Figure: a history table indexed by PC concatenates (Concat) Taken into the Local History, which indexes a +/- counter table that produces the Prediction.]

Can we take advantage of the global pattern at a particular PC?
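The LHist formula can be modelled in the same style: a per-PC history table selects a two-bit counter. In this Python sketch the class name `LocalHistoryPredictor`, the table sizes, the initial counter value, and the saturating update are illustrative assumptions.

```python
class LocalHistoryPredictor:
    """Hypothetical model of LHist: per-PC history indexes a counter table."""

    def __init__(self, entries=16, history_bits=3):
        self.entries = entries
        self.mask = (1 << history_bits) - 1
        self.histories = [0] * entries              # History(PC; T)
        self.counters = [1] * (1 << history_bits)   # 2-bit, weakly not taken

    def predict(self, pc):
        history = self.histories[pc % self.entries]
        return self.counters[history] >= 2          # MSB of the counter

    def update(self, pc, taken):
        slot = pc % self.entries
        history = self.histories[slot]
        c = self.counters[history]
        self.counters[history] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the outcome into this branch's own history.
        self.histories[slot] = ((history << 1) | int(taken)) & self.mask
```

With three history bits, a branch that repeats taken, taken, not taken sees a distinct history before each outcome, so the model learns the whole pattern after a brief warm-up.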

Page 20: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Two-level Predictor

2Level(PC, T) = MSB(Counter(History(0; T)||PC; T))

[Figure: a history register at index 0 concatenates (Concat) Taken into the Global History; the Global History is concatenated (Concat) with the PC to index a +/- counter table that produces the Prediction.]

Page 21: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Two-Level Branch Predictor

Pentium Pro uses the result from the last two branches to select one of the four sets of BHT bits (~95% correct)

[Figure: k bits of the Fetch PC index the BHT. A 2-bit global branch history shift register, which shifts in the Taken/¬Taken result of each branch, selects one of the four sets to give the Taken/¬Taken? prediction.]

Page 22: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Gshare Predictor

Gshare(PC, T) = MSB(Counter(History(0; T) xor PC; T))

[Figure: a history register at index 0 concatenates (Concat) Taken into the Global History; the Global History is xor-ed with the PC to index a +/- counter table that produces the Prediction.]
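A minimal Python sketch of a gshare-style predictor is given below. The class name `GsharePredictor`, the 10-bit index, the initial counter value, and the saturating two-bit update are assumptions of the sketch; a real design would also choose which PC bits to use.

```python
class GsharePredictor:
    """Hypothetical gshare model: (global history xor PC) indexes 2-bit counters."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.history = 0
        self.counters = [1] * (1 << index_bits)  # weakly not taken

    def _index(self, pc):
        # xor folds history and PC into one index instead of concatenating them.
        return (self.history ^ pc) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2  # MSB of the counter

    def update(self, pc, taken):
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

Compared with concatenation, the xor lets both the full history and the PC bits influence the index without making the counter table larger.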

Page 23: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Choosing Predictors

Chooser = MSB(P(PC; P + (A==T) - (B==T)))

or

Chooser = MSB(P(GHist(PC; T); P + (A==T) - (B==T)))

[Figure: the LHist and GHist predictors feed a Chooser, which selects the final Prediction.]
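The first chooser formula can be sketched in Python as a table of counters indexed by PC. The class name `Chooser`, the two-bit saturating counters, the initial value, and the convention that a set MSB selects predictor A are all assumptions made for illustration.

```python
class Chooser:
    """Hypothetical chooser: a per-PC counter tracks which predictor does better."""

    def __init__(self, entries=16):
        self.entries = entries
        self.counters = [2] * entries  # 2-bit counters, start weakly favouring A

    def choose(self, pc, pred_a, pred_b):
        use_a = self.counters[pc % self.entries] >= 2  # MSB set: trust A
        return pred_a if use_a else pred_b

    def update(self, pc, pred_a, pred_b, taken):
        i = pc % self.entries
        # P + (A==T) - (B==T): move toward whichever predictor was right.
        delta = int(pred_a == taken) - int(pred_b == taken)
        self.counters[i] = max(0, min(3, self.counters[i] + delta))
```

When both predictors agree with the outcome, or both disagree, the delta is zero and the counter is left unchanged.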

Page 24: Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration

Tournament Branch Predictor (Alpha 21264)

  • Choice predictor learns whether best to use local or global branch history in predicting next branch
  • Global history is speculatively updated but restored on mispredict
  • Claim 90-100% success on range of applications

[Figure: the PC indexes the local history table (1,024x10b), which feeds the local prediction (1,024x3b). The global history (12b) feeds the global prediction (4,096x2b) and the choice prediction (4,096x2b), which selects the final Prediction.]