63
SEQ part 4 / pipelining 0 1

SEQ part 4 / pipelining 0

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

SEQ part 4 / pipelining 0

1

last timebuilding processors: add MUX for each decisionsingle-cycle processor timing

storage (memory, registers, register file) write at rising edgetherefore, rising edge changes storage outputs = other circuit inputseverything else adjusts as their inputs changeclock signal long enough to give storage stable inputs (at rising edge)

so-called “stages”conceptual division of the processor (later: timing division)fetch (read instruction; split into pieces; compute next instruction addr)‘decode’ (read from register file)execute (ALU operation; condition codes)memory (setup data memory access)writeback (setup write to register file)PC update (setup next PC value)

2

SEQ: instruction fetchread instruction memory at PC

split into seperate wires:icode:ifun — opcoderA, rB — register numbersvalC — call target or mov displacement

compute next instruction address:valP — PC + (instr length)

3

instruction fetch

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

4

SEQ: instruction “decode”read registers

valA, valB — register values

5

instruction decode (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

exercise: for which instructions would there be a problem ?nop, addq, mrmovq, rmmovq, jmp, pushq

6

instruction decode (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

exercise: for which instructions would there be a problem ?nop, addq, mrmovq, rmmovq, jmp, pushq

6

instruction decode (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

exercise: for which instructions would there be a problem ?nop, addq, mrmovq, rmmovq, jmp, pushq of these: only pushq

6

SEQ: srcA, srcBalways read rA, rB?

Problems:push rApopcallret

book: extra signals: srcA, srcB — computed input register

MUX controlled by icode

7

SEQ: possible registers to read

instruction srcA srcBhalt, nop, jCC, irmovq none nonecmovCC, rrmovq rA nonemrmovq none rBrmmovq, OPq rA rBcall, ret none? %rsppushq, popq rA %rsp

MUX srcB

rB

%rsp

(none) F

logic functionicode

8

SEQ: possible registers to read

instruction srcA srcBhalt, nop, jCC, irmovq none nonecmovCC, rrmovq rA nonemrmovq none rBrmmovq, OPq rA rBcall, ret none? %rsppushq, popq rA %rsp

MUX srcB

rB

%rsp

(none) F

logic functionicode

8

instruction decode (2)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

9

SEQ: executeperform ALU operation (add, sub, xor, and)

valE — ALU output

read prior condition codesCnd — condition codes based on ifun (instruction type for jCC/cmovCC)

write new condition codes

10

using condition codes: cmov(always) 1

(le) SF | ZF(l) SF

cc(from instr)

rB0xF dstENOT

11

execute (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

exercise: which of these instructions would there be a problem ?nop, addq, mrmovq, popq, call,

12

execute (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

exercise: which of these instructions would there be a problem ?nop, addq, mrmovq, popq, call,

12

SEQ: ALU operations?ALU inputs always valA, valB (register values)?

no, inputs from instruction: (Displacement + rB)MUX aluA

valAvalC

mrmovqrmmovq

no, constants: (rsp +/- 8)pushqpopqcallret

extra signals: aluA, aluBcomputed ALU input values

13

execute (2)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

14

SEQ: Memoryread or write data memory

valM — value read from memory (if any)

15

memory (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

exercise: which of these instructions would there be a problem ?nop, rmmovq, mrmovq, popq, call,

16

memory (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

exercise: which of these instructions would there be a problem ?nop, rmmovq, mrmovq, popq, call,

16

SEQ: control signals for memoryread/write — read enable? write enable?

Addr — addressmostly ALU outputspecial cases (need extra MUX): popq, ret

Data — value to writemostly valAspecial cases (need extra MUX): call

17

memory (2)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

18

SEQ: write backwrite registers

19

write back (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

textbook convention:E used for storing ALU results (e.g. add)M used for storing memory results (e.g. rmmovq)(you don’t have to do this…)

exercise: which of these instructions would there be a problem ?nop, irmovq, mrmovq, rmmovq, addq, popq

20

write back (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

textbook convention:E used for storing ALU results (e.g. add)M used for storing memory results (e.g. rmmovq)(you don’t have to do this…)

exercise: which of these instructions would there be a problem ?nop, irmovq, mrmovq, rmmovq, addq, popq

20

write back (1)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

textbook convention:E used for storing ALU results (e.g. add)M used for storing memory results (e.g. rmmovq)(you don’t have to do this…)

exercise: which of these instructions would there be a problem ?nop, irmovq, mrmovq, rmmovq, addq, popq

20

SEQ: control signals for WBtwo write inputs — two needed by popq

valM (memory output), valE (ALU output)

two register numbersdstM, dstE

write disable — use dummy register number 0xF

MUX dstErBF

%rsp

21

write back (2a)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

22

write back (2b)

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

23

SEQ: Update PCchoose value for PC next cycle (input to PC register)

usually valP (following instruction)exceptions: call, jCC, ret

24

PC update

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Stat

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+valP

25

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

26

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

26

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

26

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

27

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

27

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

28

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

29

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

30

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

31

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

32

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

33

circuit: setting MUXes

PC

Instr.Mem.

register filesrcA

srcB

R[srcA]R[srcB]

dstE

next R[dstE]

dstM

next R[dstM]

DataMem.

ZF/SF

Data in

Addr inData out

valC

0xF

0xF%rsp

%rsp

0xF

0xF%rsp

rA

rB

ALUaluA

aluBvalE8

0

add/subxor/and(functionof instr.)

write?

functionof opcode

PC+9

instr.length+

8

9

PC+2

M[PC+1]

rA=8

rB=9

R[8]

R[9]

aluA + aluB

M[PC+2]

add

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select when running addq %r8, %r9?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for rmmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for irmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for mrmovq?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for jle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for cmovle?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for ret?MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for popq?

MUXes — PC, dstM, dstE, aluA, aluB, dmemIn, dmemAddr, …Exercise: what do they select for call?

34

Human pipeline: laundry

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

35

Human pipeline: laundry

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

35

Waste (1)

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

wasted time!wasted time!

36

Waste (1)

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

wasted time!wasted time!

36

Waste (2)

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

37

Latency — Time for One

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

pipelined latency (2.1 h)

colors colors colors

normal latency (1.8 h)

38

Latency — Time for One

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

pipelined latency (2.1 h)

colors colors colors

normal latency (1.8 h)

38

Latency — Time for One

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

pipelined latency (2.1 h)

colors colors colors

normal latency (1.8 h)

38

Throughput — Rate of Many

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

time between finishes (0.83 h)

1 load0.83h = 1.2 loads/h

time between starts (0.83 h)

39

Throughput — Rate of Many

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

time between finishes (0.83 h)

1 load0.83h = 1.2 loads/h

time between starts (0.83 h)

39

Throughput — Rate of Many

Washer

Dryer

FoldingTable

11:00 12:00 13:00 14:00

whites

whites

whites

colors

colors

colors

sheets

sheets

sheets

time between finishes (0.83 h)

1 load0.83h = 1.2 loads/h

time between starts (0.83 h)

39

times three circuit

AADDADD

ADDADD

2 × A

3 × A

Aadd A + A

2 × Aadd 2A + A

3 × A

7

14

21

0 ps 50 ps 100 ps100 ps latency =⇒ 10 results/ns throughput

40

times three circuit

AADDADD

ADDADD

2 × A

3 × A

Aadd A + A

2 × Aadd 2A + A

3 × A

7

14

21

0 ps 50 ps 100 ps

100 ps latency =⇒ 10 results/ns throughput

40

times three circuit

AADDADD

ADDADD

2 × A

3 × A

Aadd A + A

2 × Aadd 2A + A

3 × A

7

14

21

0 ps 50 ps 100 ps100 ps latency =⇒ 10 results/ns throughput

40

times three and repeatA

add A + A2 × A

add 2A + A3 × A

7

14

17

34

4

8

1

2

23

46

0 ps 100 ps 200 ps 300 ps 400 ps 500 ps

Aadd A + A

2 × Aadd 2A + A

3 × A

7

14

21

17

34

51

4

8

12

1

2

3

23

46

69

0 ps 100 ps 200 ps 300 ps 400 ps 500 ps

41

times three and repeatA

add A + A2 × A

add 2A + A3 × A

7

14

17

34

4

8

1

2

23

46

0 ps 100 ps 200 ps 300 ps 400 ps 500 ps

Aadd A + A

2 × Aadd 2A + A

3 × A

7

14

21

17

34

51

4

8

12

1

2

3

23

46

69

0 ps 100 ps 200 ps 300 ps 400 ps 500 ps

41

backup slides

42

comparing to yis$ ./hclrs nopjmp_cpu.hcl nopjmp.yo......+--------------------- (end of halted state) ---------------------------+Cycles run: 7$ ./tools/yis nopjmp.yoStopped in 7 steps at PC = 0x1e. Status 'HLT', CC Z=1 S=0 O=0Changes to registers:

Changes to memory:

43

HCLRS summarydeclare/assign values to wires

MUXes with[ test1: value1; test2: value2; 1: default; ]

register banks with register iO:next value on i_name; current value on O_name

fixed functionalityregister file (15 registers; 2 read + 2 write)memories (data + instruction)Stat register (start/stop/error)

44