25
1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

Embed Size (px)

Citation preview

Page 1: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1

Chapter Five

The Processor: Datapath and Control(Parte B: multiciclo)

Page 2: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-2

• Break up the instructions into steps, each step takes a cycle

– balance the amount of work to be done

– restrict each cycle to use only one major functional unit

• 1 ALU, 1 Memória, 1 Banco de Registradores

• At the end of a cycle

– store values for use in later cycles (easiest thing to do)

– introduce additional “internal” registers

Multicycle Approach

PC

Memory

Address

Instructionor data

Data

Instructionregister

Registers

Register #

Data

Register #

Register #

ALU

Memorydata

register

A

B

ALUOut

Page 3: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-3

• IR (Instruction Register) e MDR (Memory Data Register) salvam saída da memória

• Registradores A e B salvam saída do banco de registradores

• ALUout salva saída da ALU

• Todos, exceto IR, guardam dados por um clock controle de escrita é desnecessário

• novos MUXes: endereço da memória, operandos da ALU

Multicycle Approach

Shiftleft 2

PC

Memory

MemData

Writedata

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Mux

0

1

Mux

0

1

4

Instruction[15– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

1 Mux

0

3

2

Mux

ALUresult

ALUZero

Memorydata

register

Instruction[15– 11]

A

B

ALUOut

0

1

Address

Page 4: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-4

Via de dados multiciclo e sinais de controle

Shiftleft 2

MemtoReg

IorD MemRead MemWrite

PC

Memory

MemData

Writedata

Mux

0

1

RegistersWriteregister

Writedata

Readdata 1

Readdata 2

Readregister 1

Readregister 2

Instruction[15– 11]

Mux

0

1

Mux

0

1

4

ALUOpALUSrcB

RegDst RegWrite

Instruction[15– 0]

Instruction [5– 0]

Signextend

3216

Instruction[25– 21]

Instruction[20– 16]

Instruction[15– 0]

Instructionregister

1 Mux

0

3

2

ALUcontrol

Mux

0

1ALU

resultALU

ALUSrcA

ZeroA

B

ALUOut

IRWrite

Address

Memorydata

register

Falta implementar 3 possíveis fontes de carga para PC: PC+4, beq e j

Page 5: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-5

Via de dados completa

S h ift

le f t 2

P C

Mux

0

1

R e gis te rsW ritereg is te r

W rited a ta

R e a dda ta 1

R e a dda ta 2

R e a dreg is te r 1

R e a dreg is te r 2

In stru c tio n[1 5– 11 ]

Mux

0

1

Mux

0

1

4

Ins truc t io n[1 5– 0 ]

S ig n

ex te n d

3 216

In s t ru ct ion[25 – 21 ]

In s t ru ct ion[20 – 16 ]

In s t ru ct ion[1 5 – 0 ]

Ins tru c tio n

re g is te r

A L U

c on t ro l

A L Ur es u l t

A L U

Z er o

M e m o ry

d a ta

re g is te r

A

B

Io r D

M e m R e a d

M e m W r ite

M e m to R e g

P C W r it e C o n d

P C W rit e

IR W ri te

A L U O p

A L U S r c B

A L U S rc A

R e g D s t

P C S o u r c e

R e g W rite

C o n tr o l

O u tp u ts

O p

[5 – 0 ]

Ins truc tio n[3 1 - 26 ]

Ins tru c t io n [5 – 0]

Mu

x

0

2

J u m p

a d d re s s [3 1 -0 ]Ins truc t io n [2 5 – 0 ] 2 6 28S h ift

le f t 2

P C [3 1 - 2 8 ]

1

1 Mux

0

3

2

Mux

0

1

A L U O u t

M e m or y

M em D ata

W r ited a ta

A d d re s s

Page 6: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-6

Sinais de controle (1 bit)

Desligado Ligado

RegDst Reg de escrita rt Reg de escrita rd

RegWR Escreve no banco

ALUSrcA operando é o PC operando é o reg A

MemRD Lê a memória

MemWR Escreve na memória

Memto Reg Reg WR data ALUout Reg WR data MDR

IorD Endereço do PC Endereço de ALUout

IRWrite Escreve no IR

PCWrite Escreve no PC

PCWriteCond Escr PC se ALU=0

Page 7: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-7

Sinais de controle (2 bits)

00 Soma 01 Subtração

ALUop

10 Function 00 Segundo operando é o registrador B 01 “ constante 4 10 “ sign-extend, 16 bits do IR

ALUSrcB

11 “ idem acima, deslocado 2 bits à esquerda 00 PC + 4 01 ALUout (target address)

PCSrc

10 Jump (4 bits PC & 26 bits ender & 00)

Page 8: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-8

• Instruction Fetch

• Instruction Decode and Register Fetch

• Execution, Memory Address Computation, or Branch Completion

• Memory Access or R-type instruction completion

• Write-back step

INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!

Five Execution Steps

Page 9: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-9

• Use PC to get instruction and put it in the Instruction Register.

• Increment the PC by 4 and put the result back in the PC.

• Can be described succinctly using RTL "Register-Transfer Language"

IR = Memory[PC];PC = PC + 4;

Can we figure out the values of the control signals?

What is the advantage of updating the PC now?

Step 1: Instruction Fetch

Page 10: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-10

• Read registers rs and rt in case we need them

• Compute the branch address in case the instruction is a branch

• RTL:

A = Reg[IR[25-21]];B = Reg[IR[20-16]];ALUOut = PC + (sign-extend(IR[15-0]) << 2);

• We aren't setting any control lines based on the instruction type (we are busy "decoding" it in our control logic)

Step 2: Instruction Decode and Register Fetch

Page 11: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-11

• ALU is performing one of three functions, based on instruction type

• Memory Reference:

ALUOut = A + sign-extend(IR[15-0]);

• R-type:

ALUOut = A op B;

• Branch:

if (A==B) PC = ALUOut;

Step 3 (instruction dependent)

Page 12: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-12

• Loads and stores access memory

MDR = Memory[ALUOut];or

Memory[ALUOut] = B;

• R-type instructions finish

Reg[IR[15-11]] = ALUOut;

The write actually takes place at the end of the cycle on the edge

Step 4 (R-type or memory-access)

Page 13: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-13

• Reg[IR[20-16]]= MDR;

What about all the other instructions?

Write-back step

Page 14: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-14

Summary:

Step nameAction for R-type

instructionsAction for memory-reference

instructionsAction for branches

Action for jumps

Instruction fetch IR = Memory[PC]PC = PC + 4

Instruction A = Reg [IR[25-21]]decode/register fetch B = Reg [IR[20-16]]

ALUOut = PC + (sign-extend (IR[15-0]) << 2)

Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] IIcomputation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)jump completion

Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]completion ALUOut or

Store: Memory [ALUOut] = B

Memory read completion Load: Reg[IR[20-16]] = MDR

Page 15: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-15

Outra visão, fluxograma

IR=Mem[PC]PC=PC+4

A=Reg[IR[25:21]]B=Reg[IR[20:16]]

ALUout=PC+(SignExt(IR(15:0))<<2)

ALUout=A op B

Reg[IR[15:11]]=ALUout

ALUout=A+SignExt(IR(15:0))

Mem(ALUout)=BMDR=Mem(ALUout)

Reg[IR[15:11]]=MDR

If A==B PC=ALUout

PC=PC[31:28]&IR[25:0]<<2

5

0

1

6 2 8 9

7 3

4

Page 16: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-16

• How many cycles will it take to execute this code?

lw $t2, 0($t3)lw $t3, 4($t3)beq $t2, $t3, Label #assume notadd $t5, $t2, $t3sw $t5, 8($t3)

Label: ...

• What is going on during the 8th cycle of execution?• In what cycle does the actual addition of $t2 and $t3 takes place?

Simple Questions

Page 17: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-17

• Value of control signals is dependent upon:

– what instruction is being executed

– which step is being performed

• Use the information we’ve acculumated to specify a finite state machine

– specify the finite state machine graphically, or

– use microprogramming

• Implementation can be derived from specification

Implementing the Control

Page 18: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-18

Tabela dos sinais de controle

Page 19: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

• How many state bits will we need?

Graphical Specification of FSM

PCWritePCSource = 10

ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond

PCSource = 01

ALUSrcA =1ALUSrcB = 00ALUOp= 10

RegDst = 1RegWrite

MemtoReg = 0MemWriteIorD = 1

MemReadIorD = 1

ALUSrcA = 1ALUSrcB = 10ALUOp = 00

RegDst=0RegWrite

MemtoReg=1

ALUSrcA = 0ALUSrcB = 11ALUOp = 00

MemReadALUSrcA = 0

IorD = 0IRWrite

ALUSrcB = 01ALUOp = 00

PCWritePCSource = 00

Instruction fetchInstruction decode/

register fetch

Jumpcompletion

BranchcompletionExecution

Memory addresscomputation

Memoryaccess

Memoryaccess R-type completion

Write-back step

(Op = 'LW') or (Op = 'SW') (Op = R-type)

(Op

= 'BE

Q')

(Op

= 'J

' )

(Op = 'SW')

(Op

= ' L

W' )

4

01

9862

753

Start

Page 20: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-20

Finite State Machine for Control

PCWrite

PCWriteCond

IorD

MemtoReg

PCSource

ALUOp

ALUSrcB

ALUSrcA

RegWrite

RegDst

NS3NS2NS1NS0

Op5

Op4

Op3

Op2

Op1

Op0

S3

S2

S1

S0

State register

IRWrite

MemRead

MemWrite

Instruction registeropcode field

Outputs

Control logic

Inputs

Page 21: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-21

Desempenho desta máquina para o gcc

• CPI = 0,23*5 + 0,13*4 + 0,43*4 + 0,19*3 + 0,02*3 = 4,02

• Melhor do que se todas as instruções tomassem 5 ciclos

• Melhoria em MIPS (supor ck = 100 MHz)

– CPI = 4 -> 25 MIPS

– CPI = 5 -> 20 MIPS

– CPI = 1 (fazer tudo em um ciclo) -> 100 MIPS

lw sw Tipo R beq jump

% 23 13 43 19 2

CPI 5 4 4 3 3

Page 22: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-22

PLA Implementation

• If I picked a horizontal or vertical line could you explain it?Op5

Op4

Op3

Op2

Op1

Op0

S3

S2

S1

S0

IorD

IRWrite

MemReadMemWrite

PCWritePCWriteCond

MemtoRegPCSource1

ALUOp1

ALUSrcB0ALUSrcARegWriteRegDstNS3NS2NS1NS0

ALUSrcB1ALUOp0

PCSource0

Page 23: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-23

• ROM = "Read Only Memory"– values of memory locations are fixed ahead of time

• A ROM can be used to implement a truth table– if the address is m-bits, we can address 2m entries in the ROM.– our outputs are the bits of data that the address points to.

m is the "heigth", and n is the "width"

ROM Implementation

m n

0 0 0 0 0 1 10 0 1 1 1 0 00 1 0 1 1 0 00 1 1 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 11 1 0 0 1 1 01 1 1 0 1 1 1

Page 24: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-24

• How many inputs are there?6 bits for opcode, 4 bits for state = 10 address lines(i.e., 210 = 1024 different addresses)

• How many outputs are there?16 datapath-control outputs, 4 state bits = 20 outputs

• ROM is 210 x 20 = 1K x 20 bits (and a rather unusual size)

• Rather wasteful, since for lots of the entries, the outputs are the same

— i.e., opcode is often ignored

ROM Implementation

Page 25: 1998 Morgan Kaufmann Publishers Mario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-1 Chapter Five The Processor: Datapath and Control (Parte B: multiciclo)

1998 Morgan Kaufmann PublishersMario Côrtes - MO401 - IC/Unicamp- 2002s1 Ch5B-25

• Break up the table into two parts

— 4 state bits tell you the 16 outputs, 24 x 16 bits of ROM

— 10 bits tell you the 4 next state bits, 210 x 4 bits of ROM

— Total: 4.3K bits of ROM

• PLA is much smaller

— can share product terms

— only need entries that produce an active output

— can take into account don't cares

• Size is (#inputs #product-terms) + (#outputs #product-terms)

For this example = (10x17)+(20x17) = 460 PLA cells

• PLA cells usually about the size of a ROM cell (slightly bigger)

ROM vs PLA