58
순순순순순순 순순순순순순순 순 순 순 1 5. The Processor: Datapath and Con trol

순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

Embed Size (px)

Citation preview

Page 1: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 1

5. The Processor: Datapath and Control

Page 2: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 2

Computer Architecture

5. The Processor: Datapath and Control

We're ready to look at an implementation of the MIPS

Simplified to contain only:• memory-reference instructions: lw, sw • arithmetic-logical instructions: add, sub, and, or, slt• control flow instructions: beq, j

Generic Implementation:• use the program counter (PC) to supply instruction address• get the instruction from memory• read registers• use the instruction to decide exactly what to do

All instructions use the ALU after reading the registersWhy? memory-reference? arithmetic? control

flow?

The Processor: Datapath & Control

Page 3: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 3

Computer Architecture

5. The Processor: Datapath and Control

Abstract / Simplified View:

Two types of functional units:• elements that operate on data values (combinational)• elements that contain state (sequential)

More Implementation Details

Page 4: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 4

Computer Architecture

5. The Processor: Datapath and Control

Unclocked vs. Clocked Clocks used in synchronous logic

• when should an element that contains state be updated?

State Elements

Clock period Rising edge

Falling edge

cycle time

Page 5: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 5

Computer Architecture

5. The Processor: Datapath and Control

The set-reset latch• output depends on present inputs and also on past inputs

An unclocked state element

R

S

Q

Q

Page 6: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 6

Computer Architecture

5. The Processor: Datapath and Control

Output is equal to the stored value inside the element

(don't need to ask for permission to look at the value)

Change of state (value) is based on the clock Latches: whenever the inputs change, and the clock

is asserted Flip-flop: state changes only on a clock edge

(edge-triggered methodology)"logically true", — could mean electrically low

A clocking methodology defines when signals can be read and written— wouldn't want to read a signal at the same time it was being written

Latches and Flip-flops

Page 7: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 7

Computer Architecture

5. The Processor: Datapath and Control

Two inputs:• the data value to be stored (D)• the clock signal (C) indicating when to read & store D

Two outputs:• the value of the internal state (Q) and it's complement

D-latch

Q

C

D

_Q

D

C

Q

Page 8: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 8

Computer Architecture

5. The Processor: Datapath and Control

D flip-flop

Output changes only on the clock edge

D

C

Q

D

C

Dlatch

D

C

QD

latch

D

C

Q Q

QQ

Page 9: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 9

Computer Architecture

5. The Processor: Datapath and Control

Our Implementation

An edge triggered methodology Typical execution:

• read contents of some state elements, • send values through some combinational logic• write results to one or more state elements

Stateelement

1

Stateelement

2Combinational logic

Clock cycle

Page 10: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 10

Computer Architecture

5. The Processor: Datapath and Control

Built using D flip-flops

Register File

Read registernumber 1 Read

data 1Read registernumber 2

Readdata 2

Writeregister

WriteWritedata

Register file

Read registernumber 1

Register 0

Register 1

. . .

Register n – 2

Register n – 1

M

u

x

Read registernumber 2

M

u

x

Read data 1

Read data 2

Do you understand? What is the “Mux” above?

Page 11: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 11

Computer Architecture

5. The Processor: Datapath and Control

Abstraction

Make sure you understand the abstractions! Sometimes it is easy to think you do, when you

don’t

Mux

C

Select

32

32

32

B

A

Mux

Select

B31

A31

C31

Mux

B30

A30

C30

Mux

B0

A0

C0

...

...

Page 12: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 12

Computer Architecture

5. The Processor: Datapath and Control

Register File

Note: we still use the real clock to determine when to write

Write

01

n-to-2n

decoder

n – 1

n

Register 0

C

D

Register 1

C

D

Register n – 2

C

D

Register n – 1

C

D

...

Register number...

Register data

Page 13: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 13

Computer Architecture

5. The Processor: Datapath and Control

Elements for Instruction Fetch

Basic Instruction Fetch Unit

Page 14: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 14

Computer Architecture

5. The Processor: Datapath and Control

Elements for R-format Instructions /Load/Store Instuctions

Two elements for R-format• R[rd] <- R[rs] op R[rt]

Additional elements for Load/Store• R[rt] <- Mem[R[rs] + SignExt[imm16]]

Page 15: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 15

Computer Architecture

5. The Processor: Datapath and Control

Datapath for a Branch

If (R[rs] == R[rt]) PC <- PC + 4 + ( SignExt(imm16) x 4 ) Else PC <- PC + 4

Page 16: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 16

Computer Architecture

5. The Processor: Datapath and Control

Datapath for Lw/Sw and R-type

Page 17: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 17

Computer Architecture

5. The Processor: Datapath and Control

Putting it Altogether

Page 18: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 18

Computer Architecture

5. The Processor: Datapath and Control

Control

Selecting the operations to perform (ALU, read/write, etc.)

Controlling the flow of data (multiplexor inputs)

Information comes from the 32 bits of the instruction

Example:

add $8, $17, $18

000000 10001 10010 01000 00000 100000

op rs rt rd shamt funct

ALU's operation based on instruction type and function code

Page 19: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 19

Computer Architecture

5. The Processor: Datapath and Control

Three Instruction Classes

op. 6 bits and funct. 6 bits can be used to generate control signals I-type an J-type instructions only have op. 6 bits R-type instructions have both op. 6 bits and funct. 6 bits

op(6) rs(5) rt(5) rd(5) shamt(5)func(6)

op(6) rs(5) rt(5) 16 bit address

op(6) 26 bit address

R

I

J

Page 20: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 20

Computer Architecture

5. The Processor: Datapath and Control

e.g., what should the ALU do with this instruction Example: lw $1, 100($2)

35 2 1 100

op rs rt 16 bit offset

ALU control input

0000 AND0001 OR0010 add0110 subtract0111 set-on-less-than1100 NOR

ALU Control

Page 21: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 21

Computer Architecture

5. The Processor: Datapath and Control

Control with Local Decoding

SmallControl 1

(fast)

op

6

SmallControl 2

(fast)

func

2

6ALUop

ALUctr

4

Big Single Control Logic(becomes slow)

op

6

func

6

ALUctr

MemRead

RegWrite

MemtoReg

ALUSrc

MemWrite

RegDst

Branch

4

MemRead

RegWriteMemtoReg

ALUSrcMemWrite

RegDstBranch

Main Control

ALU Control

Page 22: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 22

Computer Architecture

5. The Processor: Datapath and Control

ALU Control (ALUctr)

InstructionOpcode

ALUOp Instructionoperation

Functionfield

DesiredALU action

ALU control input

LW 00 Load word xxxxxx Add 0010

SW 00 Store word xxxxxx Add 0010

Branch equal

01 Branch equal xxxxxx Subtract 0110

R-type 10 Add 100000 Add 0010

R-type 10 Subtract 100010 Subtract 0110

R-type 10 AND 100100 And 0000

R-type 10 OR 100101 Or 0001

R-type 10 Set on less than

101010 Set on less than

0111

Page 23: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 23

Computer Architecture

5. The Processor: Datapath and Control

Can turn into gates:

Truth Table for ALU Control

Page 24: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 24

Computer Architecture

5. The Processor: Datapath and Control

Control for R-type Instruction

Data path and active control signals are highlighted

Page 25: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 25

Computer Architecture

5. The Processor: Datapath and Control

Control for “load” instruction

Page 26: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 26

Computer Architecture

5. The Processor: Datapath and Control

Control for “branch equal”

Page 27: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 27

Computer Architecture

5. The Processor: Datapath and Control

Instruction RegDst ALUSrcMemto-

RegReg

WriteMemRead

MemWrite Branch ALUOp1 ALUp0

R-format 1 0 0 1 0 0 0 1 0lw 0 1 1 1 1 0 0 0 0sw X 1 X 0 0 1 0 0 0beq X 0 X 0 0 0 1 0 1

Readregister 1

Readregister 2

Writeregister

Writedata

Writedata

Registers

ALU

Add

Zero

Readdata 1

Readdata 2

Signextend

16 32

Instruction[31–0] ALU

result

Add

ALUresult

Mux

Mux

Mux

Address

Datamemory

Readdata

Shiftleft 2

4

Readaddress

Instructionmemory

PC

1

0

0

1

0

1

Mux

0

1

ALUcontrol

Instruction [5–0]

Instruction [25–21]

Instruction [31–26]

Instruction [15–11]

Instruction [20–16]

Instruction [15–0]

RegDstBranchMemReadMemtoRegALUOpMemWriteALUSrcRegWrite

Control

Page 28: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 28

Computer Architecture

5. The Processor: Datapath and Control

Truth Table for Main Control

Input or output Signal name R-format lw sw beq

Inputs Op5 0 1 1 0

Op4 0 0 0 0

Op3 0 0 1 0

Op2 0 0 0 1

Op1 0 1 1 0

Op0 0 1 1 0

Outputs RegDst 1 0 X X

ALUSrc 0 1 1 0

MemtoReg 0 1 X X

RegWrite 1 1 0 0

MemRead 0 1 0 0

MemWrite 0 0 1 0

Branch 0 0 0 1

ALUOp1 1 0 0 0

ALUOp0 0 0 0 1

Page 29: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 29

Computer Architecture

5. The Processor: Datapath and Control

Control

Simple combinational logic (truth tables)

Operation2

Operation1

Operation0

Operation

ALUOp1

F3

F2

F1

F0

F (5– 0)

ALUOp0

ALUOp

ALU control block

R-format Iw sw beq

Op0

Op1

Op2

Op3

Op4

Op5

Inputs

Outputs

RegDst

ALUSrc

MemtoReg

RegWrite

MemRead

MemWrite

Branch

ALUOp1

ALUOpO

Page 30: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 30

Computer Architecture

5. The Processor: Datapath and Control

Implementing Jump

op Exit (26 bit address)

xxxx 00

j Exit

$pc

jump addr. Exit (26 bit address)

xxxx

Page 31: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 31

Computer Architecture

5. The Processor: Datapath and Control

All of the logic is combinational

We wait for everything to settle down, and the right thing to be done• ALU might not produce “right answer” right away

• we use write signals along with clock to determine when to write

Cycle time determined by length of the longest path

Our Simple Control Structure

We are ignoring some details like setup and hold times

Stateelement

1

Stateelement

2Combinational logic

Clock cycle

Page 32: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 32

Computer Architecture

5. The Processor: Datapath and Control

Performance of Single Cycle Datapath

Calculate cycle time assuming negligible delays except:• memory (200ps),

ALU and adders (100ps), register file access (50ps)

Minimum Cycle Time > 600ps!Instruction class Functional units used by the instruction class Required Time

R-type Instruction fetch

Register access

ALU Register access

400ps

lw Instruction fetch

Register access

ALU Memory access

Register access

600ps

sw Instruction fetch

Register access

ALU Memory access

550ps

beq Instruction fetch

Register access

ALU 350ps

Page 33: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 33

Computer Architecture

5. The Processor: Datapath and Control

Where we are headed

Single Cycle Problems:• what if we had a more complicated instruction like floating

point?• wasteful of area

One Solution:• use a “smaller” cycle time• have different instructions take different numbers of cycles• a “multicycle” datapath:

Data

Register #

Register #

Register #

PC Address

Instructionor dataMemory Registers ALU

Instructionregister

Memorydata

register

ALUOut

A

BData

Page 34: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 34

Computer Architecture

5. The Processor: Datapath and Control

What’s wrong with our CPI=1 processor?

Long Cycle Time All instructions take as much time as the slowest Real memory is not so nice as our idealized memory

• cannot always get the job done in one (short) cycle

PC Inst Memory mux ALU Data Mem mux

PC Reg FileInst Memory mux ALU mux

PC Inst Memory mux ALU Data Mem

PC Inst Memory cmp mux

Reg File

Reg File

Reg File

Arithmetic & Logical

Load

Store

Branch

Critical Path

setup

setup

Page 35: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 35

Computer Architecture

5. The Processor: Datapath and Control

Reducing Cycle Time

Cut combinational dependency graph and insert register / latch

Do same work in two fast cycles, rather than one slow one storage element

CombinationalLogic (A)

storage element

storage element

CombinationalLogic (B)

storage element

CombinationalLogic

storage element

=>

Page 36: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 36

Computer Architecture

5. The Processor: Datapath and Control

Partitioning the CPI=1 Datapath

Add registers between smallest stepsP

C

Nex

t P

C

Ope

rand

Fet

ch Exec Reg

. F

ile

Mem

Acc

ess

Inst

ruct

ion

Fet

ch

AL

Uct

r

Reg

Dst

AL

USr

c

Ext

Op

nPC

_sel

Reg

Wr

Mem

Wr

Mem

Rd

Page 37: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 37

Computer Architecture

5. The Processor: Datapath and Control

Example Multicycle Datapath

PC

Nex

t P

C

Ope

rand

Fet

ch

Ext

ALU Reg

. F

ile

Mem

Acc

es s

Inst

ruct

ion

Fet

ch

Res

ult

Sto

re

AL

Uct

r

Reg

Dst

AL

USr

c

Ext

Op

nPC

_sel

Reg

Wr

Mem

Wr

Mem

Rd

IRA

B

S

M

RegFile

Mem

ToR

eg

Page 38: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 38

Computer Architecture

5. The Processor: Datapath and Control

R-rtype (add, sub, . . .)

Logical Register Transfer Physical Register Transfers

inst Logical Register Transfers

ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4

inst Physical Register Transfers

IR <– MEM[pc]

ADD A<– R[rs]; B <– R[rt]

S <– A + B

R[rd] <– S; PC <– PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

ess

A

B

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

em

Page 39: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 39

Computer Architecture

5. The Processor: Datapath and Control

Load

Logical Register Transfer

inst Logical Register Transfers

LW R[rt] <– MEM(R[rs] + sx(Im16);

PC <– PC + 4

inst Physical Register Transfers

IR <– MEM[pc]

LW A<– R[rs]; B <– R[rt]

S <– A + SignEx(Im16)

M <– MEM[S]

R[rt] <– M; PC <– PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

es s

A

B

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

em

Physical Register Transfers

Page 40: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 40

Computer Architecture

5. The Processor: Datapath and Control

Store

Logical Register Transfer

inst Logical Register Transfers

SW MEM(R[rs] + sx(Im16) <– R[rt];

PC <– PC + 4

inst Physical Register Transfers

IR <– MEM[pc]

SW A<– R[rs]; B <– R[rt]

S <– A + SignEx(Im16);

MEM[S] <– B PC <– PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

ess

A

B

S

M

Reg

File

PC

Nex

t P

C

IR

Inst

. M

em

Physical Register Transfers

Page 41: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 41

Computer Architecture

5. The Processor: Datapath and Control

Branch

Logical Register Transfer

inst Logical Register Transfers

BEQ if R[rs] == R[rt]

then PC <= PC + 4 + sx(Im16) || 00

else PC <= PC + 4

Exe

c

Reg

. F

ile

Mem

Acc

es s

A

B

S

M

Reg

File

Equ

al

PC

Nex

t P

C

IR

Inst

. M

eminst Physical Register Transfers

IR <– MEM[pc]

BEQ A<– R[rs]; B <– R[rt]

S <– A - B

Eq:PC<-PC+4+sx(Im16)||00, ~Eq:PC <– PC + 4

Physical Register Transfers

Page 42: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 42

Computer Architecture

5. The Processor: Datapath and Control

Summary:

Page 43: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 43

Computer Architecture

5. The Processor: Datapath and Control

Multiple Cycle Datapath

Miminizes Hardware: 1 memory, 1 adder

IdealMemoryWrAdrDin

RAdr

32

32

32Dout

MemWr

32

AL

U

3232

ALUOp

ALUControl

Instru

ction R

eg

32

IRWr

32

Reg File

Ra

Rw

busW

Rb5

5 busA

busB

RegWr

Rs

Rt

Mux

0

1Rd

PCWr

ALUSelA

Mux 01

RegDst

Mux

0

1

32

PC

MemtoReg

Extend

ExtOp

Mux

0

132

0

1

23

4

16Imm 32

<< 2

ALUSelB

Mux

1

0

32

Zero

ZeroPCWrCond PCSrc

32

IorD

S

A

B5

M

Page 44: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 44

Computer Architecture

5. The Processor: Datapath and Control

Multi-Cycle Datapath & Control Signals

Figure 5.28 in page 323

Shiftleft 2

PCMux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

Instruction[15– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

ALUcontrol

ALUresult

ALUZero

Memorydata

register

A

B

IorD

MemRead

MemWrite

MemtoReg

PCWriteCond

PCWrite

IRWrite

ALUOp

ALUSrcB

ALUSrcA

RegDst

PCSource

RegWrite

Control

Outputs

Op[5– 0]

Instruction[31-26]

Instruction [5– 0]

Mux

0

2

Jumpaddress [31-0]Instruction [25– 0] 26 28

Shiftleft 2

PC [31-28]

1

1 Mux

0

3

2

Mux

0

1ALUOut

Memory

MemData

Writedata

Address

Page 45: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 45

Computer Architecture

5. The Processor: Datapath and Control

High-Level Control Flow

Instruction fetch / decode and register fetchInstruction fetch / decode and register fetch

Memory accessMemory accessinstructionsinstructions

R-type instructionsR-type instructions Branch instructionsBranch instructions Jump instructionJump instruction

start

• Common 2-clock sequence to fetch/decode any instruction• Separate sequences of 1 to 3 clocks to execute specific types of instruction• Two techniques for control

• finite state machine• microprogramming

Page 46: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 46

Computer Architecture

5. The Processor: Datapath and Control

Review of Finite State Machine(FSM)

Finite state machine• a set of states and• next state function (determined by current state and input)• output function (determined by current state and input)

We will use a Moore machine • output based only on current state• cf. Mealy machine: output based on both current state and input

Current stateCurrent state Next-stateNext-statefunctionfunction

OutputOutputfunctionfunction

Inputs

Outputs

clock

Page 47: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 47

Computer Architecture

5. The Processor: Datapath and Control

Control Finite State Machine(FSM) Diagram

PCWritePCSource = 10

ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond

PCSource = 01

ALUSrcA =1ALUSrcB = 00ALUOp= 10

RegDst = 1RegWrite

MemtoReg = 0

MemWriteIorD = 1

MemReadIorD = 1

ALUSrcA = 1ALUSrcB = 10ALUOp = 00

RegDst = 0RegWrite

MemtoReg =1

ALUSrcA = 0ALUSrcB = 11ALUOp = 00

MemReadALUSrcA = 0

IorD = 0IRWrite

ALUSrcB = 01ALUOp = 00

PCWritePCSource = 00

Instruction fetchInstruction decode/

register fetch

Jumpcompletion

BranchcompletionExecution

Memory addresscomputation

Memoryaccess

Memoryaccess R-type completion

Write-back step

(Op = 'LW') or (Op = 'SW') (Op = R-type)

(Op

= 'B

EQ')

(Op

= 'J

')

(Op = 'SW

')

(Op

= 'L

W')

4

01

9862

753

Start

Page 48: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 48

Computer Architecture

5. The Processor: Datapath and Control

Control Finite State Machine Structure

Outputs to datapath control determined by current state Next state determined by current state and input from instruction register

CombinationalCombinationalControl LogicControl Logic

(random logic or PLA)(random logic or PLA)

State registerState register

inputs

outputs

Next state

Datapath control outputs

inputs from instructionregister opcode field

Page 49: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 49

Computer Architecture

5. The Processor: Datapath and Control

PLA Implementation

Op5

Op4

Op3

Op2

Op1

Op0

S3

S2

S1

S0

IorD

IRWrite

MemReadMemWrite

PCWritePCWriteCond

MemtoRegPCSource1

ALUOp1

ALUSrcB0ALUSrcARegWriteRegDstNS3NS2NS1NS0

ALUSrcB1ALUOp0

PCSource0

OP code

current state

ANDplane

ORplane

Page 50: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 50

Computer Architecture

5. The Processor: Datapath and Control

Microprogramming

Microprogramming is an alternate structure for implementing the control finite state machine• represent outputs and next state selection as microinstructions in a memory (the c

ontrol store) addressed by the current state (often called the microprogram counter)

• the control store can be implemented in ROM or RAM (writable control store)• provides level of abstraction between software and datapath hardware: firmware

Page 51: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 51

Computer Architecture

5. The Processor: Datapath and Control

“Macroprogram” Interpretation

MainMemory

execution

unit

controlmemory

CPU

User program plus Data

AND microsequence

e.g., Fetch Calc Operand Addr Fetch Operand(s) Calculate Save Answer(s)

one of these ismapped into oneof these

ANDSUBADD…

DATA

Page 52: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 52

Computer Architecture

5. The Processor: Datapath and Control

Exceptions

Normal Control Flow• sequential, jumps, branches, calls, returns

Exception = unprogrammed control transfer• system takes action to handle the exception

• must record the address of the offending instruction• returns control to user• must save & restore user state

user programSystemExceptionHandlerException:

return fromexception

Page 53: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 53

Computer Architecture

5. The Processor: Datapath and Control

Exceptions vs. Interrupts

MIPS convention:• exception means any unexpected change in control flow, without

distinguishing internal or external

• use the term interrupt only when the event is externally caused.

Type of event From where? MIPS terminologyI/O device request External InterruptInvoke OS from user program Internal ExceptionArithmetic overflow Internal ExceptionUsing an undefined instruction Internal ExceptionHardware malfunctions Either Exception or Interrupt

Page 54: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 54

Computer Architecture

5. The Processor: Datapath and Control

Exception Handling in MIPS

Save PC in the dedicated register EPC• like $ra but can’t assume that $ra is unused

Records the reason for the exception in the Cause register• used by operating system to figure out what went wrong

Load special value into PC, the exception handler’s address• transfer execution to exception handler

The instruction RFE restores the PC from EPC• like jr $ra for subroutines but a special register just for exceptions

Page 55: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 55

Computer Architecture

5. The Processor: Datapath and Control

Updated Datapath to Implement Exceptions

Shiftleft 2

Memory

MemData

Writedata

Mux

0

1

Instruction[15– 11]

Mux

0

1

4

Instruction[15– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

ALUcontrol

ALUresult

ALUZero

Memorydata

register

A

B

IorD

MemRead

MemWrite

MemtoReg

PCWriteCond

PCWrite

IRWrite

Control

Outputs

Op[5– 0]

Instruction[31-26]

Instruction [5– 0]

Mux

0

2

Jumpaddress [31-0]Instruction [25– 0] 26 28

Shiftleft 2

PC [31-28]

1

Address

EPC

CO 00 00 00 3

Cause

ALUOp

ALUSrcB

ALUSrcA

RegDst

PCSource

RegWrite

EPCWriteIntCauseCauseWrite

1

0

1 Mux

0

3

2

Mux

0

1

Mux

0

1

PC

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

ALUOut

Page 56: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 56

Computer Architecture

5. The Processor: Datapath and Control

Updated Finite State Machine

ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond

PCSource = 01

ALUSrcA = 1ALUSrcB = 00ALUOp = 10

RegDst = 1RegWrite

MemtoReg = 0

MemWriteIorD = 1

MemReadIorD = 1

ALUSrcA = 1ALUSrcB = 00ALUOp = 00

RegWriteMemtoReg = 1

RegDst = 0

ALUSrcA = 0ALUSrcB = 11ALUOp = 00

MemReadALUSrcA = 0

IorD = 0IRWrite

ALUSrcB = 01ALUOp = 00

PCWritePCSource = 00

Instruction fetchInstruction decode/

Register fetch

Jumpcompletion

BranchcompletionExecution

Memory addresscomputation

Memoryaccess

Memoryaccess R-type completion

Write-back step

(Op = 'LW') or (Op = 'SW') (Op = R-type)

(Op

= 'B

EQ')

(Op

= 'J

')

(Op = 'SW

')

(Op

= 'L

W')

4

01

9862

7 11 1053

Start

(Op = other)

Overflow

Overflow

ALUSrcA = 0ALUSrcB = 01ALUOp = 01

EPCWritePCWrite

PCSource = 11

IntCause = 0CauseWrite

ALUSrcA = 0ALUSrcB = 01ALUOp = 01

EPCWritePCWrite

PCSource = 11

IntCause = 1CauseWrite

PCWritePCSource = 10

Page 57: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 57

Computer Architecture

5. The Processor: Datapath and Control

Pentium 4

Pipelining is important (last IA-32 without it was 80386 in 1985)

Pipelining is used for the simple instructions favored by compilers

“Simply put, a high performance implementation needs to ensure that the simple instructions execute quickly, and that the burden of the complexities of the instruction set penalize the complex, less frequently used, instructions”

Control

Control

Control

Enhancedfloating pointand multimedia

Control

I/Ointerface

Instruction cache

Integerdatapath

Datacache

Secondarycacheandmemoryinterface

Advanced pipelininghyperthreading support

Chapter 6

Chapter 7

Page 58: 순천향대학교 정보기술공학부 이 상 정 1 5. The Processor: Datapath and Control

순천향대학교 정보기술공학부 이 상 정 58

Computer Architecture

5. The Processor: Datapath and Control

Pentium 4

Somewhere in all that “control we must handle complex instructions

Processor executes simple microinstructions, 70 bits wide (hardwired) 120 control lines for integer datapath (400 for floating point) If an instruction requires more than 4 microinstructions to implement,

control from microcode ROM (8000 microinstructions) Its complicated!

Control

Control

Control

Enhancedfloating pointand multimedia

Control

I/Ointerface

Instruction cache

Integerdatapath

Datacache

Secondarycacheandmemoryinterface

Advanced pipelininghyperthreading support