58
EI209 Chapter 4B.1 Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor, Control and Multi-cycle Datapath [Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, © 2012, MK]

EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

Embed Size (px)

DESCRIPTION

EI209 Chapter 4B.3Haojin Zhu, SJTU 2015 Step 5: Assemble Control logic ALUctr RegDst ALUSrc ExtOp MemtoRegMemWr Equal Instruction Imm16RdRsRt nPC_sel Adr Inst Memory DATA PATH Decoder Op Fun RegWr

Citation preview

Page 1: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.1 Haojin Zhu, SJTU 2015

EI 209 Computer Organization

Fall 2015

Chapter 4B: The Processor, Control and Multi-cycle Datapath

[Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2012, MK]

Page 2: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.2 Haojin Zhu, SJTU 2015

Recap: A Summary of Control Signalsinst Register Transfer

ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4

ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”

SUB R[rd] <– R[rs] – R[rt]; PC <– PC + 4

ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”

ORi R[rt] <– R[rs] + zero_ext(Imm16); PC <– PC + 4

ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”

LOAD R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4

ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”

STORE MEM[ R[rs] + sign_ext(Imm16)] <– R[rs]; PC <– PC + 4

ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”

BEQ if ( R[rs] == R[rt] ) then PC <– PC + sign_ext(Imm16)] || 00 else PC <– PC + 4

nPC_sel = “Br”, ALUctr = “sub”

Page 3: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.3 Haojin Zhu, SJTU 2015

Step 5: Assemble Control logic

ALUctrRegDst ALUSrcExtOp MemtoRegMemWr Equal

Instruction<31:0>

<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

nPC_sel

Adr

InstMemory

DATA PATH

Decoder

Op

<21:25>

Fun

RegWr

Page 4: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.4 Haojin Zhu, SJTU 2015

add sub ori lw sw beq jumpRegDstALUSrcMemtoRegRegWriteMemWriteBranchJumpExtOpALUctr<2:0>

1001000x

Add

1001000x

Subtr

01010000

Or

01110001

Add

x1x01001

Add

x0x0010x

Subtr

xxx0001x

xxx

op target address

op rs rt rd shamt func061116212631

op rs rt immediate

R-type

I-type

J-type

add, sub

ori, lw, sw, beq

jump

funcop 00 0000 00 0000 00 110110 0011 10 1011 00 0100 00 0010

10 0000 10 0010 We Don’t Care :-)

A Summary of the Control Signals

Page 5: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.5 Haojin Zhu, SJTU 2015

R-type ori lw sw beq jumpRegDstALUSrcMemtoRegRegWriteMemWriteBranchJumpExtOpALUctr

1001000x

Add/Subtr

01010000

Or

01110001

Add

x1x01001

Add

x0x0010x

Subtr

xxx0001x

xxx

op 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010

funcMain

Controlop

6

ALUControl(Local)N=?

6ALUop

ALUctr3

ALU

Two levels of decoding: Main Control and ALU Control

ALUctr is determined by ALUop and func, while other control signals are determined by op。How many bits will N need?

R、 I-ori、 I-lw/sw、 I-beq、J

The Concept of Local Decoding

3, why?

Page 6: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.6 Haojin Zhu, SJTU 2015

R-type ori lw sw beq jump

ALUop (Symbolic) “R-type” Or Add Add Subtr xxx

ALUop<2:0> 1 xx 0 10 0 00 0 00 0x1 xxx

MainControl

op

6

ALUControl(Local)

func

N

6ALUop

ALUctr

3

000000 rs rt rd shamt func

061116212631

R-type

func<5:0> Instruction Operation

10 0000

10 0010

10 0100

10 0101

10 1010

add

subtract

and

or

set-on-less-than

ALUctr<2:0> ALU Operation

000

001

100

101

010

Add

Subtract

And

Or

Subtract

ALUctr

ALU

Encoding ALUop as follows

Could ALUop use just 2 bits?Yes! Since jump is X, we could use 2 bits: R:11, I-ori:10, I-beq:01, I-lw/sw:00, J-xx

R-Type is1xx, could identify R-instruction by using 1 bit!

The Decoding of the “func” Field

Page 7: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.7 Haojin Zhu, SJTU 2015

R-type ori lw sw beqALUop(Symbolic) “R-type” Or Add Add Subtr

ALUop<2:0> 1 00 0 10 0 00 0 00 0 x1

ALUop funcbit2 bit1 bit0 bit<2> bit<1> bit<0>bit<3>

0 0 0 x x x x

ALUctrALUOperation

Add 0 0 0bit<2> bit<1> bit<0>

0 x 1 x x x x Subtract 0 0 10 1 0 x x x x Or 1 1 01 x x 0 0 0 0 Add 0 0 01 x x 0 0 1 0 Subtract 0 0 11 x x 0 1 0 0 And 0 1 01 x x 0 1 0 1 Or 1 1 01 x x 1 0 1 0 Subtract 0 0 1

funct<3:0> Instruction Op.00000010010001011010

addsubtractandorset-on-less-than

The Truth Table for ALUctrNon-R-type Instructionsdetermined by ALUop

R-type Instructionsdetermined by funct

Page 8: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.8 Haojin Zhu, SJTU 2015

The Logic Equation for ALUctr<0>

ALUctr<0> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>

ALUop funcbit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3> ALUctr<0>

0 x 1 x x x x 11 x x 0 0 1 0 11 x x 1 0 1 0 1

This makes func<3> a don’t care

Choose the rows with ALUctr[0]=1

Page 9: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.9 Haojin Zhu, SJTU 2015

The Logic Equation for ALUctr<1>

ALUop funcbit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3>

0 1 0 x x x x 1ALUctr<1>

1 x x 0 1 0 0 11 x x 0 1 0 1 1

ALUctr<1> = !ALUop<2> & ALUop<1> & ! ALUop<0> + ALUop<2> & !func<3> & func<2> & !func<1>

Choose the rows with ALUctr[1]=1

Page 10: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.10 Haojin Zhu, SJTU 2015

The Logic Equation for ALUctr<2>

ALUop funcbit<2> bit<1> bit<0> bit<2> bit<1> bit<0>bit<3> ALUctr<2>

0 1 0 x x x x 11 x x 0 1 0 1 1

ALUctr<2> = !ALUop<2> & ALUop<1> & !ALUop<0> + ALUop<2> & !func<3> & func<2> & !func<1>

& func<0>

Choose the rows with ALUctr[2]=1

Page 11: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.11 Haojin Zhu, SJTU 2015

Summery of Control Logic of Local Control

ALUControl(Local)

func

3

6ALUop

ALUctr

3

ALUctr<0> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>

ALUctr<1> = !ALUop<2> & ALUop<1> & !ALUop<0> + ALUop<2> & !func<3> & func<2> & !func<1>

ALUctr<2> = !ALUop<2> & ALUop<1> & !ALUop<0> + ALUop<2> & !func<3> & func<2> & !func<1> &

func<0>

Page 12: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.12 Haojin Zhu, SJTU 2015

R-type ori lw sw beq jumpRegDstALUSrcMemtoRegRegWriteMemWriteBranchJumpExtOpALUop (Symbolic)

1001000x

“R-type”

01010000

Or

01110001

Add

x1x01001

Add

x0x0010x

Subtr

xxx0001x

xxx

op 00 0000 00 1101 10 0011 10 101100 010000 0010

ALUop <2> 1 0 0 0 0 xALUop <1> x 1 0 0 x xALUop <0> x 0 0 0 1 x

MainControl

op6

ALUControl(Local)

func

3

6

ALUop

ALUctr3

RegDstALUSrc

:

Input of main controlOutput of Main Control

The “Truth Table” for the Main Control

Page 13: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.13 Haojin Zhu, SJTU 2015

R-type ori lw sw beq jumpRegWrite 1 1 1 0 0 0

op 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010

RegWrite = R-type + ori + lw= !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type) + !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori) + op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)

op<0>

op<5>. .op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

R-type ori lw sw beq jumpRegWrite

Decoder

The “Truth Table” for RegWrite

Page 14: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.14 Haojin Zhu, SJTU 2015

op<0>

op<5>. .op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

op<5>. .<0>

R-type ori lw sw beq jumpRegWrite

ALUSrc

MemtoRegMemWrite

BranchJump

RegDst

ExtOp

ALUop<2>ALUop<1>ALUop<0>

Decoder

The “Truth Table” for RegWrite

Page 15: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.15 Haojin Zhu, SJTU 2015

The Complete Single Cycle Data Path with Control

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Ext

Mux

Mux

3216imm16

ALUSrc

ExtOp

Mux

MemtoReg

Clk

Data InWrEn32 Adr

DataMemory

32

MemWrA

LU

InstructionFetch Unit

Clk

Zero

Instruction<31:0>

Jump

Branch

0

1

0

1

01<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

MainControl

op6

ALUControlfunc

6

3ALUop

ALUctr3

RegDst

ALUSrc:

Instr<5:0>

Instr<31:26>

Instr<15:0>

Page 16: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.16 Haojin Zhu, SJTU 2015

Time Delay for LW: Critical PathClk

PC

Rs, Rt, Rd,Op, Func

Clk-to-Q

ALUctr

Instruction Memory Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busARegister File Access Time

Old Value New Value

busB

ALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

ExtOp Old Value New Value

ALUSrc Old Value New Value

MemtoReg Old Value New Value

Address Old Value New Value

busW Old Value New

Delay through Extender & Mux

RegisterWrite Occurs

Data Memory Access Time

PC+4

PC+4 PC

Page 17: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.17 Haojin Zhu, SJTU 2015

Performance of Single-Cycle Machine Assume that the operation times for the major functional units in this

implementation are the following Memory units: 200 picoseconds ALU and adders: 100 ps Register file (read or write): 50 ps

Assuming that the multiplexors, control unit, PC accesses, sign extension unit, and wires have no delay, which of the following implementations would be faster and by how much?

1. An implementation in which every instruction operates in 1 clock cycle of a fixed length.

2. An implementation where every instruction executes in 1 clock cycle using a variable-length clock.

To compare the performance, assume the following instruction mix: 25% loads, 10% stores, 45% ALU instructions, 15 % branches, and 5% jumps.

Page 18: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.18 Haojin Zhu, SJTU 2015

What’s wrong with our CPI=1 processor?

PC Inst Memory mux ALU Data Mem mux

PC Reg FileInst Memory mux ALU mux

PC Inst Memory mux ALU Data Mem

PC Inst Memory cmp mux

Reg File

Reg File

Reg File

Arithmetic & Logical

Load

Store

Branch

Critical Path

setup

setup

Page 19: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.19 Haojin Zhu, SJTU 2015

200 50 100 0 350

200 200

Instr. I Mem Reg Rd ALU Op D Mem Reg Wr TotalR-typeloadstorebeqjump

200 50 100 200 50 600

200 50 100 200 550

Instruction Critical Paths

200 50 100 0 50 400

Calculate cycle time assuming negligible delays (for muxes, control unit, sign extend, PC access, shift left 2, wires, setup and hold times) except: Instruction and Data Memory (200 ps) ALU and adders (100 ps) Register File access (reads or writes) (50 ps)

Page 20: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.20 Haojin Zhu, SJTU 2015

Comparison of clock cycles for two implementation

2. An implementation where every instruction executes in 1 clock cycle using a variable-length clock.

1. An implementation in which every instruction operates in 1 clock cycle of a fixed length.

CPU clock cycle=600*25%+550*10%+400*45%+350*15%+200*5% =447.5 ps

CPU clock cycle=600 ps (longest instruction)

Page 21: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.21 Haojin Zhu, SJTU 2015

Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently – the clock cycle must

be timed to accommodate the slowest instruction especially problematic for more complex instructions like

floating point multiply

May be wasteful of area since some functional units (e.g., adders) must be duplicated since they can not be shared during a clock cycle

but Is simple and easy to understand

Clk

lw sw Waste

Cycle 1 Cycle 2

Page 22: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.22 Haojin Zhu, SJTU 2015

Multicycle Implementation Overview Each instruction step takes 1 clock cycle

Therefore, an instruction takes more than 1 clock cycle to complete

Not every instruction takes the same number of clock cycles to complete

Multicycle implementations allow faster clock rates different instructions to take a different number of clock cycles functional units to be used more than once per instruction as long as

they are used on different clock cycles, as a result- only need one memory- only need one ALU/adder

Page 23: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.23 Haojin Zhu, SJTU 2015

The Multicycle Datapath – A High Level View

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB A

LUou

t

Registers have to be added after every major functional unit to hold the output value until it is used in a subsequent clock cycle

Page 24: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.24 Haojin Zhu, SJTU 2015

Clocking the Multicycle Datapath

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB A

LUou

t

System Clock

MemWrite RegWrite

clock cycle

Page 25: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.25 Haojin Zhu, SJTU 2015

Break up the instructions into steps where each step takes a clock cycle while trying to balance the amount of work to be done in each step use only one major functional unit per clock cycle

At the end of a clock cycle Store values needed in a later clock cycle by the current instruction in a

state element (internal register not visible to the programmer)IR – Instruction RegisterMDR – Memory Data RegisterA and B – Register File read data registersALUout – ALU output register

- All (except IR) hold data only between a pair of adjacent clock cycles (so they don’t need a write control signal)

Data used by subsequent instructions are stored in programmer visible state elements (i.e., Register File, PC, or Memory)

Our Multicycle Approach

Page 26: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.26 Haojin Zhu, SJTU 2015

The Complete Multicycle Data with Control

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 27: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.27 Haojin Zhu, SJTU 2015

Reading from or writing to any of the internal registers, Register File, or the PC occurs (quickly) at the beginning (for read) or the end of a clock cycle (for write)

Reading from the Register File takes ~50% of a clock cycle since it has additional control and access overhead (but reading can be done in parallel with decode)

Had to add multiplexors in front of several of the functional unit input ports (e.g., Memory, ALU) because they are now shared by different clock cycles and/or do multiple jobs

All operations occurring in one clock cycle occur in parallel This limits us to one ALU operation, one Memory access, and

one Register File access per clock cycle

Our Multicycle Approach, con’t

Page 28: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.28 Haojin Zhu, SJTU 2015

1. Instruction Fetch

2. Instruction Decode and Register Fetch

3. R-type Instruction Execution, Memory Read/Write Address Computation, Branch Completion, or Jump Completion

4. Memory Read Access, Memory Write Completion or R-type Instruction Completion

5. Memory Read Completion (Write Back)

INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!

Five Instruction Steps

Page 29: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.29 Haojin Zhu, SJTU 2015

RTL for Instructions Common Steps:

Instr fetch IR = Memory[PC]; PC Updating PC = PC + 4;

Decode and Register reading A = Reg[IR[25-21]];B = Reg[IR[20-16]];

Instruction Dependent operation

Page 30: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.30 Haojin Zhu, SJTU 2015

RTL for Instructions Instruction Dependent operation

Execute ALUOut = A op B;

ALUOut = A + sign-

extend (IR[15-0]);

if (A==B) PC = ALUOut;

ALUOut = PC +(sign-

extend(IR[15-0])<< 2);

PC = PC[31-28] ||(IR[25-0]

<< 2);

Memory access

Reg[IR[15-11]] = ALUOut;

MDR = Memory[ALUOu

t]; or

Memory[ALUOut] = B;

Write-back Reg[IR[20-

16]] = MDR;

Page 31: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.31 Haojin Zhu, SJTU 2015

Group Discussion 1 Four-six members form a group and discuss the

following questions related to multi-cycle datapath? 1. What are the task for each step?(balance the load of

each step to minimize the number of clock cycles) Recommend one member to present your results?

Page 32: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.32 Haojin Zhu, SJTU 2015

Multi-cycle datapath

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 33: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.33 Haojin Zhu, SJTU 2015

RTL SummaryStep R-type Mem Ref Branch Jump

Instr fetch

IR = Memory[PC]; PC = PC + 4;

Decode A = Reg[IR[25-21]];B = Reg[IR[20-16]];

ALUOut = PC +(sign-extend(IR[15-0])<< 2);

Execute ALUOut = A op B;

ALUOut = A + sign-extend

(IR[15-0]);

if (A==B) PC =

ALUOut;

PC = PC[31-28] ||(IR[25-0] << 2);

Memory access

Reg[IR[15-11]] = ALUOut;

MDR = Memory[ALUOut];

orMemory[ALUOut]

= B; Write-back

Reg[IR[20-16]] = MDR;

Page 34: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.34 Haojin Zhu, SJTU 2015

Group Discussion 2 Four-six members form a group and discuss the

following questions related to multi-cycle datapath? 1. What are data paths and control signals for each step

of each instruction? Recommend one member to present your results?

Page 35: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.35 Haojin Zhu, SJTU 2015

Multi-cycle datapath

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 36: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.36 Haojin Zhu, SJTU 2015

Use PC to get instruction from the memory and put it in the Instruction Register

Increment the PC by 4 and put the result back in the PC Can be described succinctly using the RTL "Register-

Transfer Language“

IR = Memory[PC];PC = PC + 4;

Step 1: Instruction Fetch

Can we figure out the values of the control signals?

What is the advantage of updating the PC now?

Page 37: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.37 Haojin Zhu, SJTU 2015

Datapath Activity During Instruction Fetch

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

00

Page 38: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.39 Haojin Zhu, SJTU 2015

Fetch Control Signals Settings

Start

Instr FetchIorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

Page 39: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.40 Haojin Zhu, SJTU 2015

Don’t know what the instruction is yet, so can only Read registers rs and rt in case we need them Compute the branch address in case the instruction is a branch

The RTL:

A = Reg[IR[25-21]];B = Reg[IR[20-16]];ALUOut = PC

+(sign-extend(IR[15-0])<< 2);

Note we aren't setting any control lines based on the instruction (since we don’t know what it is (the control logic is busy "decoding" the op code bits))

Step 2: Instruction Decode and Register Fetch

Page 40: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.42 Haojin Zhu, SJTU 2015

Datapath Activity During Instruction Decode

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

00

Page 41: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.44 Haojin Zhu, SJTU 2015

Decode Control Signals Settings

Start

Instr Fetch DecodeALUSrcA=0ALUSrcB=11ALUOp=00

PCWriteCond=0

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

Page 42: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.45 Haojin Zhu, SJTU 2015

ALU is performing one of four functions, based on instruction type

Memory reference (lw and sw):

ALUOut = A + sign-extend(IR[15-0]);

R-type:

ALUOut = A op B;

Branch:

if (A==B) PC = ALUOut; Jump:

PC = PC[31-28] || (IR[25-0] << 2);

Step 3 (instruction dependent)

Page 43: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.47 Haojin Zhu, SJTU 2015

Datapath Activity During lw & sw Execute

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

00

Page 44: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.49 Haojin Zhu, SJTU 2015

Datapath Activity During R-type Execute

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

10

Page 45: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.51 Haojin Zhu, SJTU 2015

Datapath Activity During beq Execute

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

01

Page 46: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.53 Haojin Zhu, SJTU 2015

Datapath Activity During j Execute

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 47: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.55 Haojin Zhu, SJTU 2015

Execute Control Signals Settings

Start

Instr Fetch Decode

(Op = R-ty

pe)

(Op =

beq)

(Op = lw or sw

) (Op = j)

ALUSrcA=1ALUSrcB=10ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=10

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=01

PCSource=01PCWriteCond

PCSource=10PCWriteExecute

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

ALUSrcA=0ALUSrcB=11ALUOp=00

PCWriteCond=0

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

Page 48: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.56 Haojin Zhu, SJTU 2015

Memory reference:

MDR = Memory[ALUOut]; -- lwor

Memory[ALUOut] = B; -- sw

R-type instruction completion

Reg[IR[15-11]] = ALUOut;

Remember, the register write actually takes place at the end of the cycle on the clock edge

Step 4 (also instruction dependent)

Page 49: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.58 Haojin Zhu, SJTU 2015

Datapath Activity During lw Memory Access

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 50: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.60 Haojin Zhu, SJTU 2015

Datapath Activity During sw Memory Access

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 51: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.62 Haojin Zhu, SJTU 2015

Datapath Activity During R-type Completion

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 52: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.64 Haojin Zhu, SJTU 2015

Memory Access Control Signals Settings

Start

Instr Fetch Decode

Memory Access

Execute

(Op = R-ty

pe)

(Op =

beq)

(Op = lw or sw

) (Op = j)

(Op = lw)(Op = sw)

MemReadIorD=1

PCWriteCond=0

MemWriteIorD=1

PCWriteCond=0

RegDst=1RegWriteMemtoReg=0

PCWriteCond=0

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

ALUSrcA=0ALUSrcB=11ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=10ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=10

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=01

PCSource=01PCWriteCond

PCSource=10PCWrite

Page 53: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.65 Haojin Zhu, SJTU 2015

All we have left is the write back into the register file the data just read from memory for lw instruction

Reg[IR[20-16]]= MDR;

What about all the other instructions?

Step 5: Memory Read Completion (Write Back)

Page 54: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.67 Haojin Zhu, SJTU 2015

Datapath Activity During lw Write Back

Address

Read Data(Instr. or Data)

Memory

PC

Write Data

Read Addr 1

Read Addr 2

Write Addr

Register

File

Read Data 1

Read Data 2

ALU

Write Data

IRM

DR

AB

ALU

out

SignExtend

Shiftleft 2 ALU

control

Shiftleft 2

ALUOpControl

IRWriteMemtoReg

MemWriteMemRead

IorDPCWrite

PCWriteCond

RegDstRegWrite

ALUSrcAALUSrcB

zero

PCSource

1

1

1

1

1

10

0

0

0

0

0

2

2

3

4

Instr[5-0]

Instr[25-0]

PC[31-28]

Instr[15-0]

Instr[31-26]

32

28

Page 55: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.69 Haojin Zhu, SJTU 2015

Write Back Control Signals Settings

Start

Instr Fetch Decode

Write Back

Memory Access

Execute

(Op = R-ty

pe)

(Op =

beq)

(Op = lw or sw

) (Op = j)

(Op = lw)(Op = sw)

RegDst=0RegWrite

MemtoReg=1PCWriteCond=0

Unless otherwise assigned

PCWrite,IRWrite, MemWrite,RegWrite=0 others=X

IorD=0MemRead;IRWrite

ALUSrcA=0ALUsrcB=01

PCSource,ALUOp=00PCWrite

ALUSrcA=0ALUSrcB=11ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=10ALUOp=00

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=10

PCWriteCond=0

ALUSrcA=1ALUSrcB=00ALUOp=01

PCSource=01PCWriteCond

PCSource=10PCWrite

MemReadIorD=1

PCWriteCond=0

MemWriteIorD=1

PCWriteCond=0

RegDst=1RegWriteMemtoReg=0

PCWriteCond=0

Page 56: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.70 Haojin Zhu, SJTU 2015

RTL SummaryStep R-type Mem Ref Branch Jump

Instr fetch

IR = Memory[PC]; PC = PC + 4;

Decode A = Reg[IR[25-21]];B = Reg[IR[20-16]];

ALUOut = PC +(sign-extend(IR[15-0])<< 2);

Execute ALUOut = A op B;

ALUOut = A + sign-extend

(IR[15-0]);

if (A==B) PC =

ALUOut;

PC = PC[31-28] ||(IR[25-0] << 2);

Memory access

Reg[IR[15-11]] = ALUOut;

MDR = Memory[ALUOut];

orMemory[ALUOut]

= B; Write-back

Reg[IR[20-16]] = MDR;

Page 57: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.71 Haojin Zhu, SJTU 2015

Example

Using the following instruction mix, what is the CPI, assuming that each state in the multicycle CPU requires 1 clock cycle?

Load 25%store 10%

branches 11%

jumps 2%

ALU 52%

Page 58: EI209 Chapter 4B.1Haojin Zhu, SJTU 2015 EI 209 Computer Organization Fall 2015 Chapter 4B: The Processor,…

EI209 Chapter 4B.72 Haojin Zhu, SJTU 2015

Answer:

Loads:5Stores:4ALU: 4Branches:3Jumps:3

CPI=0.25*5+0.10*4+0.52*4+0.11*3+0.02*3=4.12