34
Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology http:// csg.csail.mit.edu/6.375 March 8, 2010 L10-1

Modular Refinement

  • Upload
    sera

  • View
    50

  • Download
    0

Embed Size (px)

DESCRIPTION

Modular Refinement. Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology. rf. pc. CPU. rf. pc. write- back. fetch. memory. execute. decode. dMem. iMem. CPU. fetch & decode. execute. bu. Successive refinement & Modular Structure. - PowerPoint PPT Presentation

Citation preview

Page 1: Modular Refinement

Modular Refinement Arvind Computer Science & Artificial Intelligence LabMassachusetts Institute of Technology

http://csg.csail.mit.edu/6.375March 8, 2010 L10-1

Page 2: Modular Refinement

Successive refinement & Modular Structure

fetch execute

iMem

rf

CPU

decode memory

pc

write-back

dMem

Can we derive the 5-stage pipeline by successive refinement of a 2-stage pipeline? fetch &

decode execute

pc rfCPU

buhttp://csg.csail.mit.edu/6.375March 8, 2010 L10-2

Page 3: Modular Refinement

Architectural refinementsSeparating Fetch and DecodeReplace magic memory by multicycle memoryMulticycle functional units …

http://csg.csail.mit.edu/6.375

First, let us examine our two-stage pipeline

March 8, 2010 L10-3

Page 4: Modular Refinement

Two-stage Pipelinerule fetch_and_decode (!stallfunc(instr, bu)); bu.enq(newIt(instr,rf)); pc <= predIa;endrule

rule execAdd (it matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}); rf.upd(rd, va+vb); bu.deq(); endrulerule BzTaken(it matches tagged Bz {cond:.cv,addr:.av})

&&& (cv == 0); pc <= av; bu.clear(); endrule rule BzNotTaken(it matches tagged Bz {cond:.cv,addr:.av}); &&& !(cv == 0); bu.deq(); endrulerule execLoad(it matches tagged ELoad{dst:.rd,addr:.av}); rf.upd(rd, dMem.read(av)); bu.deq(); endrulerule execStore(it matches tagged EStore{value:.vv,addr:.av}); dMem.write(av, vv); bu.deq(); endrule

fetch & decode execute

pc rfCPU

bufetch rule can execute concurrently with every execute rule except the BzTaken ruleX

http://csg.csail.mit.edu/6.375March 8, 2010 L10-4

Page 5: Modular Refinement

Properties Required of Register File and FIFO for Instruction Pipelining

Bypass Register File: rf.upd(r1, v) < rf.sub(r2)

Pipeline SFIFO bu: {first , deq} < {find, enq}

bu.first < bu.find bu.first < bu.enq bu.deq < bu.find bu.deq < bu.enq

http://csg.csail.mit.edu/6.375March 8, 2010 L10-5

Page 6: Modular Refinement

Bypass Register Filemodule mkBypassRFFull(RegFile#(RName,Value));

RegFile#(RName,Value) rf <- mkRegFileFullWCF(); RWire#(Tuple2#(RName,Value)) rw <- mkRWire();

method Action upd (RName r, Value d); rf.upd(r,d); rw.wset(tuple2(r,d)); endmethod

method Value sub(RName r); case rw.wget() matches tagged Valid {.wr,.d}:

return (wr==r) ? d : rf.sub(r); tagged Invalid: return rf.sub(r); endcase endmethodendmodule

“Config reg file”

http://csg.csail.mit.edu/6.375March 8, 2010 L10-6

Page 7: Modular Refinement

One Element Searchable Pipeline SFIFOmodule mkSFIFO1#(function Bool findf(tr r, t x)) (SFIFO#(t,tr)); Reg#(t) data <- mkRegU(); Reg#(Bool) full <- mkConfigReg(False); RWire#(void) deqEN <- mkRWire(); Bool deqp = isValid (deqEN.wget())); method Action enq(t x) if (!full || deqp); full <= True; data <= x; endmethod method Action deq() if (full); full <= False; deqEN.wset(?); endmethod method t first() if (full); return (data); endmethod method Action clear(); full <= False; endmethod method Bool find(tr r); return (findf(r, data) && full); endmethod endmodule

bu.enq > bu.deqbu.enq > bu.firstbu.enq < bu.clear

(full && !deqp));

bu.find < bu.enqbu.find < bu.deqbu.find < bu.clear

bu.deq > bu.firstbu.deq < bu.clear

http://csg.csail.mit.edu/6.375

bu.find < bu.enqbu.find > bu.deqbu.find < bu.clear

March 8, 2010 L10-7

Page 8: Modular Refinement

Suppose we used the wrong SFIFO?bu.find < bu.deq

http://csg.csail.mit.edu/6.375

Will the system produce wrong results? NO because the fetch rule will simply

conflict with the execute rules

March 8, 2010 L10-8

Page 9: Modular Refinement

CPU as one module

http://csg.csail.mit.edu/6.375

Read method callAction method call

Method calls embody both data and control (i.e., protocol)

iMem dMem

fetch & decode

pc executeRFile rf SFIFO

bu

CPU

March 8, 2010 L10-9

Page 10: Modular Refinement

A Modular organization

Suppose we include rf and pc in Fetch and bu in ExecuteFetch delivers decoded instructions to Execute and needs to consult Execute for the stall conditionExecute writes back data in rf and supplies the pc value in case of a branch misprediction

iMem RFile rf FIFO bu dMem

fetch & decode

pc

executesetPc

CPU

enqIt

WB

stall

module

s call

each o

ther

(recur

sive)

http://csg.csail.mit.edu/6.375March 8, 2010 L10-10

Page 11: Modular Refinement

Interface definitions:Fetch and Executeinterface Fetch; method Action setPC (Iaddress cpc); method Action writeback (RName dst, Value v); endinterface

interface Execute; method Action enqIt(InstTemplate it); method Bool stall(Instr instr)endinterface

http://csg.csail.mit.edu/6.375March 8, 2010 L10-11

Page 12: Modular Refinement

Recursive modular organizationmodule mkCPU2#(Mem iMem, Mem dMem)(); Execute execute <- mkExecute(dMem, fetch); Fetch fetch <- mkFetch(iMem, execute);endmodule

recursive calls

Unfortunately, the recursive module syntax is not so simple

http://csg.csail.mit.edu/6.375March 8, 2010 L10-12

Page 13: Modular Refinement

IssueA recursive call structure can be wrong in the sense of “circular calls”; fortunately the compiler can perform this checkUnfortunately recursive call structure amongst modules is supported by the compiler in a limited way.

The syntax is complicated Recursive modules cannot be synthesized separately

http://csg.csail.mit.edu/6.375March 8, 2010 L10-13

Page 14: Modular Refinement

Syntax for Recursive Modules

moduleFix is like the Y combinator F = Y F

module mkFix#(Tuple2#(Fetch, Execute) fe)(Tuple2#(Fetch, Execute));

match{.f, .e} = fe; Fetch fetch <- mkFetch(e); Execute execute <- mkExecute(f); return(tuple2(fetch,execute));endmodule

(* synthesize *)module mkCPU(Empty); match {.fetch, .execute} <- moduleFix(mkFix);endmodule

http://csg.csail.mit.edu/6.375March 8, 2010 L10-14

Page 15: Modular Refinement

Passing parametersmodule mkCPU#(IMem iMem, DMem dMem)(Empty); module mkFix#(Tuple2#(Fetch, Execute) fe) (Tuple2#(Fetch, Execute)); match{.f, .e} = fe; Fetch fetch <- mkFetch(iMem,e); Execute execute <- mkExecute(dMem,f); return(tuple2(fetch,execute);endmodule

match {.fetch, .execute} <- moduleFix(mkFix);endmodule

http://csg.csail.mit.edu/6.375March 8, 2010 L10-15

Page 16: Modular Refinement

Fetch Modulemodule mkFetch#(IMem iMem, Execute execute) (Fetch);

Instr instr = iMem.read(pc);Iaddress predIa = pc + 1;

Reg#(Iaddress) pc <- mkReg(0);RegFile#(RName, Bit#(32)) rf <- mkBypassRegFile();

rule fetch_and_decode (!execute.stall(instr)); execute.enqIt(newIt(instr,rf));pc <= predIa;endrule

method Action writeback(RName rd, Value v); rf.upd(rd,v);

endmethodmethod Action setPC(Iaddress newPC);

pc <= newPC;endmethod

endmodule

http://csg.csail.mit.edu/6.375March 8, 2010 L10-16

Page 17: Modular Refinement

Execute Modulemodule mkExecute#(DMem dMem, Fetch fetch) (Execute);

SFIFO#(InstTemplate) bu <- mkSPipelineFifo(findf);InstTemplate it = bu.first;

rule execute …

method Action enqIt(InstTemplate it);bu.enq(it);

endmethodmethod Bool stall(Instr instr);

return (stallFunc(instr, bu));endmethod

endmodule no change

http://csg.csail.mit.edu/6.375March 8, 2010 L10-17

Page 18: Modular Refinement

Execute Module Rulerule execute (True); case (it) matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin fetch.writeback(rd, va+vb); bu.deq();

end tagged EBz {cond:.cv,addr:.av}:

if (cv == 0) then beginfetch.setPC(av); bu.clear(); end

else bu.deq(); tagged ELoad{dst:.rd,addr:.av}: begin fetch.writeback(rd, dMem.read(av)); bu.deq();

end tagged EStore{value:.vv,addr:.av}: begin dMem.write(av, vv); bu.deq();

end endcaseendrule

http://csg.csail.mit.edu/6.375March 8, 2010 L10-18

Page 19: Modular Refinement

Subtle Architecture Issues

After setPC is called the next instruction enqueued via enqIt must correspond to iMem(cpc)stall and writeback methods are closely related;

writeback affects the results of subsequent stalls the effect of writeback must be reflected immediately in

the decoded instructions

interface Fetch; method Action setPC (Iaddress cpc); method Action writeback (RName dst, Value v); endinterfaceinterface Execute; method Action enqIt(InstTemplate it); method Bool stall(Instr instr)endinterface

Any modular refinement must preserve these extra-linguistic semantic properties

http://csg.csail.mit.edu/6.375March 8, 2010 L10-19

Page 20: Modular Refinement

Modular refinement: Separating Fetch and Decode

http://csg.csail.mit.edu/6.375March 8, 2010 L10-20

Page 21: Modular Refinement

Fetch Module RefinementSeparating Fetch and Decodemodule mkFetch#(IMem iMem, Execute execute) (Fetch); FIFO#(Instr) fetch2DecodeQ <- mkPipelineFIFO(); Instr instr = iMem.read(pc); Iaddress predIa = pc + 1; … rule fetch(True); pc <= predIa; fetch2DecodeQ.enq(instr); endrule

rule decode (!execute.stall(fetch2DecodeQ.first())); execute.enqIt(newIt(fetch2DecodeQ.first(),rf)); fetch2DecodeQ.deq();

endrule method Action setPC …method Action writeback …

endmodule Are any changes needed in the methods?

http://csg.csail.mit.edu/6.375March 8, 2010 L10-21

Page 22: Modular Refinement

Fetch Module Refinementmodule mkFetch#(IMem iMem, Execute execute) (Fetch); FIFO#(Instr) fetchDecodeQ <- mkFIFO(); … rule fetch … rule decode …

method Action writeback(RName rd, Value v); rf.upd(rd,v);

endmethod

method Action setPC(Iaddress newPC); pc <= newPC; fetch2DecodeQ.clear();

endmethodEndmodule

no change

The stall signal definition guarantees that any order for the execution of the decode rule and writeback method is valid.

http://csg.csail.mit.edu/6.375March 8, 2010 L10-22

Page 23: Modular Refinement

Modular refinement: Replace magic memory by multicycle memory

http://csg.csail.mit.edu/6.375March 8, 2010 L10-23

Page 24: Modular Refinement

The desired behavior

rule fetch_and_decode(True); instr <- actionvalue

imem.req(pc) <$> i <- imem.resp(); return i; endactionvalue execute.enqIt(newIt(instr,rf))

when (!execute.stall(instr));pc <= predIa; endrule

a guard is needed to avoid repeatedly

requesting the stalled instr

imem req and res change state

http://csg.csail.mit.edu/6.375

iMem.read(pc) needs to be split into a pair of request-response actions:

method Action iMem.req(Addr pc)method Actionvalue#(Inst) iMem.res()

Unfortunately, BSV does not have multicycle rules <$>

March 8, 2010 L10-24

Page 25: Modular Refinement

Action Connectives:Par vs. Seq

Parallel compositions (a1;a2) Neither action observes others’ updates Writes are disjoint Natural in hardware

Sequential Connective (a1 <$> a2) a2 observes a1’s updates Still atomic Not present in BSV because of implementation

complications

(r1 <= r2 ; r2 <= r1) swaps r1 & r2

We need to split the rule to get rid of <$> http://csg.csail.mit.edu/6.375March 8, 2010 L10-25

Page 26: Modular Refinement

Predicating Actions:Guards vs. Ifs

Guards affect their surroundings(a1 when p1) ; a2 ==> (a1 ; a2) when p1

The effect of an “if” is local

(if p1 then a1) ; a2 ==> if p1 then (a1 ; a2) else a2

p1 has no effect on a2

http://csg.csail.mit.edu/6.375March 8, 2010 L10-26

Page 27: Modular Refinement

Splitting the rulerule fetch(True); imem.req(pc) pc <= predIa;endrule

rule decode(True); let instr <- imem.resp(); execute.enqIt(newIt(instr,rf)) when (!execute.stall(instr));endrule

Suppose the PC was also needed to decode the instruction

http://csg.csail.mit.edu/6.375March 8, 2010 L10-27

Page 28: Modular Refinement

Passing data from Fetch to Decoderule fetch(True); imem.req(pc); fet2decQ.enq(pc); pc <= predIa;endrule

rule decode(True); let instr <- imem.resp(); let pc = fet2decQ.first(); fet2decQ.deq(); execute.enqIt(newIt(instr,rf,pc)) when (!execute.stall(instr));endrule

All data between actions passed through state

(imem + fet2decQ)

http://csg.csail.mit.edu/6.375March 8, 2010 L10-28

Page 29: Modular Refinement

Methods of Fetch moduleTo finish we need to change method

setPC so that the next instruction sent to the Execute module is the correct one

method setPC(Iaddress npc); pc <= npc; fet2decQ.clear();endmethod

http://csg.csail.mit.edu/6.375March 8, 2010 L10-29

Page 30: Modular Refinement

Multicycle memory:Refined Execute Module Rulerule execute (True); case (it) matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin fetch.writeback(rd, va+vb); bu.deq(); end tagged EBz {cond:.cv,addr:.av}:

if (cv == 0) then beginfetch.setPC(av); bu.clear(); end

else bu.deq(); tagged ELoad{dst:.rd,addr:.av}: begin let val <- actionvalue dMem.req(Read {av}) <$> v <- dMem.resp(); return v; endactionvalue; fetch.writeback(rd, val); bu.deq(); end tagged EStore{value:.vv,addr:.av}: begin dMem.req(Write {av, vv}); <$> let val <- dMem.resp(); bu.deq(); end endcase endrule

http://csg.csail.mit.edu/6.375March 8, 2010 L10-30

Page 31: Modular Refinement

Splitting the Backend Rules:The execute rulerule execute (True); bu.deq(); case (it) matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin exec2wbQ.enq ( WWB {dst: rd, val: va+vb}); end tagged EBz {cond:.cv,addr:.av}:

if (cv == 0) then begin fetch.setPC(av); bu.clear(); end;

tagged ELoad{dst:.rd,addr:.av}: begin dMem.req(Read {addr:av}); exec2wbQ.enq(WLd {dst:rd}); end tagged EStore{value:.vv,addr:.av}: begin dMem.req(Write {addr:av, val:vv}); exec2wbQ.enq(WSt {}); end endcase endrule

rule writeback(True); ….

http://csg.csail.mit.edu/6.375March 8, 2010 L10-31

Page 32: Modular Refinement

Splitting the Backend Rules The writeback rulerule execute (True);… endrule

rule writeback(True); exec2wbQ.deq(); case exec2wbQ.first() matches tagged WWB {dst: .rd, val: .v}: fetch.writeback(rd,v); tagged WLd {dst: .rd}: begin let v <- dMem.resp(); fetch.writeback(rd,v); end tagged WSt {} : begin let ack <- dMem.resp(); end default: noAction; endcase endrule

http://csg.csail.mit.edu/6.375March 8, 2010 L10-32

Page 33: Modular Refinement

Final stepStepwise refinement makes the verification task easier but does not eliminate it We still need to prove that each step

is correct

http://csg.csail.mit.edu/6.375March 8, 2010 L10-33

Page 34: Modular Refinement

Splitting the Backend Rules:The execute rulerule execute (True); bu.deq(); case (it) matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin exec2wbQ.enq ( WWB {dst: rd, val: va+vb}); end tagged EBz {cond:.cv,addr:.av}:

if (cv == 0) then begin fetch.setPC(av); bu.clear(); end;

tagged ELoad{dst:.rd,addr:.av}: begin dMem.req(Read {addr:av}); exec2wbQ.enq(WLd {dst:rd}); end tagged EStore{value:.vv,addr:.av}: begin dMem.req(Write {addr:av, val:vv}); exec2wbQ.enq(WSt {}); end endcase endrule

rule writeback(True); ….

What about exec2wbQ?Should it be cleared?

No because all instructions in exec2wbQ are non speculative and must commit

Stall condition prevents fetched instruction from reading stale data

http://csg.csail.mit.edu/6.375March 8, 2010 L10-34