CPE 335 Computer Organization
Basic MIPS Architecture – Part II
Dr. Iyad Jafar
Adapted from Dr. Gheith Abandah slides
http://www.abandah.com/gheith/Courses/CPE335_S08/index.html


Multicycle Datapath Approach

Let an instruction take more than 1 clock cycle to complete.

Break up instructions into steps where
- each step takes a cycle, while trying to balance the amount of work to be done in each step
- each cycle is restricted to use only one major functional unit, unless units are used in parallel

Not every instruction takes the same number of clock cycles.

In addition to faster clock rates, multicycle allows functional units to be used more than once per instruction, as long as they are used on different clock cycles. As a result:
- Need one memory only, but only one memory access per cycle
- Need one ALU/adder only, but only one ALU operation per cycle


Multicycle Datapath Approach, cont'd

At the end of a cycle
- Store values needed in a later cycle by the current instruction in internal registers (A, B, IR, and MDR). These registers are invisible to the programmer.
- All of these registers, except IR, hold data only between a pair of adjacent clock cycles, thus they don't need a write control signal.
- Data used by subsequent instructions are stored in programmer-visible registers (i.e., register file, PC, or memory).

[Figure: multicycle datapath showing the PC, Memory (instruction or data), IR, MDR, Register File, A and B registers, ALU, and ALUout]

IR – Instruction Register
MDR – Memory Data Register
A, B – regfile read data registers
ALUout – ALU output register


Multicycle Datapath Approach, cont'd

Similar to single cycle, shared functional units should have multiplexers at their inputs.

There is only one adder that will be used to update PC, perform ALU operations, comparison for beq, memory address computation, and branch address computation.


Multicycle Datapath Approach- Control Signals


The Multicycle Datapath with Control Signals

[Figure: the multicycle datapath with its control unit. Control outputs: PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst. Datapath elements: PC, Memory (instruction or data), IR, MDR, Register File, A, B, Sign Extend (Instr[15-0] to 32 bits), two Shift left 2 units, ALU with zero output, ALU control (driven by Instr[5-0]), ALUout, and the jump address formed from PC[31-28] and Instr[25-0] shifted left by 2.]


Multicycle Machine: 1-bit Control Signals

| Signal | Effect when deasserted | Effect when asserted |
| --- | --- | --- |
| RegDst | The destination register number comes from the rt field | The destination register number comes from the rd field |
| RegWrite | None | Write is enabled to the selected destination register |
| ALUSrcA | The first ALU operand is the PC | The first ALU operand is register A |
| MemRead | None | Content of the memory address is placed on Memory data out |
| MemWrite | None | Memory location specified by the address is replaced by the value on the Write data input |
| MemtoReg | The value fed to the register file is from ALUOut | The value fed to the register file is from memory |
| IorD | PC is used as an address to memory | ALUOut is used to supply the address to the memory unit |
| IRWrite | None | The output of memory is written into IR |
| PCWrite | None | PC is written; the source is controlled by PCSource |
| PCWriteCond | None | PC is written if the Zero output from the ALU is also active |
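
The last two rows of the table can be read as a single write-enable for the PC. The sketch below assumes the usual gate arrangement (PCWrite OR-ed with PCWriteCond AND Zero); the function name `pc_write_enable` is hypothetical and not taken from the slides.

```python
def pc_write_enable(pc_write: int, pc_write_cond: int, zero: int) -> int:
    """Return 1 when the PC register should be loaded in this cycle.

    Assumed gating: PCWrite forces a write (fetch, jump), while
    PCWriteCond writes only when the ALU Zero output is active (beq taken).
    """
    return pc_write | (pc_write_cond & zero)


if __name__ == "__main__":
    # Unconditional write, taken branch, untaken branch, no write requested.
    for signals in [(1, 0, 0), (0, 1, 1), (0, 1, 0), (0, 0, 1)]:
        print(signals, "->", pc_write_enable(*signals))
```

Which value is actually loaded is decided separately by PCSource.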


Multicycle Machine: 2-bit Control Signals

| Signal | Value | Effect |
| --- | --- | --- |
| ALUOp | 00 | ALU performs add operation |
| ALUOp | 01 | ALU performs subtract operation |
| ALUOp | 10 | The funct field of the instruction determines the ALU operation |
| ALUSrcB | 00 | The second input to the ALU comes from register B |
| ALUSrcB | 01 | The second input to the ALU is 4 (to increment PC) |
| ALUSrcB | 10 | The second input to the ALU is the sign-extended offset (lower 16 bits of IR) |
| ALUSrcB | 11 | The second input to the ALU is the sign-extended lower 16 bits of the IR, shifted left by two bits |
| PCSource | 00 | Output of ALU (PC + 4) is sent to the PC for writing |
| PCSource | 01 | The contents of ALUOut are sent to the PC for writing (branch address) |
| PCSource | 10 | The jump address is sent to the PC for writing |
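
As an illustration of the ALUSrcB rows, the four-way multiplexer on the second ALU input can be modeled in a few lines. This is a minimal sketch, not the course's implementation; the helper names `sign_extend16` and `alu_src_b` are hypothetical, and values are kept as unsigned 32-bit integers by assumption.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def alu_src_b(sel: int, b: int, ir: int) -> int:
    """Model the ALUSrcB multiplexer; the result is masked to 32 bits."""
    imm = sign_extend16(ir)
    options = {
        0b00: b,         # register B
        0b01: 4,         # constant 4, used for PC + 4
        0b10: imm,       # sign-extended offset (memory address)
        0b11: imm << 2,  # sign-extended offset shifted left by 2 (branch)
    }
    return options[sel] & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(alu_src_b(0b10, 0, 0xFFFC)))  # offset of -4 as a 32-bit pattern
    print(alu_src_b(0b11, 0, 0x0003))       # word offset 3 becomes byte offset 12
```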


Breaking Instruction Execution into Clock Cycles

Cycle 1   Cycle 2   Cycle 3   Cycle 4   Cycle 5
IFetch    Dec       Exec      Mem       WB

1. IFetch: Instruction fetch and update PC (same for all instructions)

Operations
1.1 Instruction Fetch: IR <= Memory[PC]
1.2 Update PC: PC <= PC + 4

Control signal values
- IorD = 0, MemRead = 1, IRWrite = 1
- ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCWrite = 1
- PCSource = 00


Breaking Instruction Execution into Clock Cycles

2. Decode: Instruction decode and register fetch (same for all instructions)

We don't know the instruction yet, so do only non-harmful operations.

Operations
2.1 Read the two source registers rs and rt and place them in registers A and B, respectively.
    A <= Reg[IR[25:21]]
    B <= Reg[IR[20:16]]
2.2 Compute the branch address
    ALUOut <= PC + (sign-extend(IR[15:0]) << 2)

Control signal values
- ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
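
The decode-step transfers can be checked with a small model. The sketch below is illustrative only: `decode_step` is a hypothetical name, the register file is a plain list, and `pc` is assumed to already hold PC + 4 from the fetch step.

```python
def decode_step(ir: int, pc: int, regs: list) -> dict:
    """Model cycle 2: read rs and rt, and compute the speculative branch target."""
    rs = (ir >> 21) & 0x1F      # IR[25:21]
    rt = (ir >> 16) & 0x1F      # IR[20:16]
    imm = ir & 0xFFFF           # IR[15:0]
    if imm & 0x8000:            # sign-extend the 16-bit offset
        imm -= 0x10000
    return {
        "A": regs[rs],
        "B": regs[rt],
        "ALUOut": (pc + (imm << 2)) & 0xFFFFFFFF,
    }


if __name__ == "__main__":
    regs = [0] * 32
    regs[1], regs[2] = 10, 20
    beq = (0x04 << 26) | (1 << 21) | (2 << 16) | 0xFFFE  # beq $1, $2, -2
    print(decode_step(beq, 0x104, regs))
```

The branch target is computed for every instruction because the ALU is otherwise idle in this cycle; it is simply ignored if the instruction turns out not to be a branch.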


Breaking Instruction Execution into Clock Cycles

3. Execution, memory address computation, or branch completion

The operation in this cycle depends on the instruction type.

Operations
* if memory reference, compute address
    ALUOut <= A + sign-extend(IR[15:0])
    ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
* if arithmetic-logic instruction, perform operation
    ALUOut <= A op B
    ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10


Breaking Instruction Execution into Clock Cycles

3. Execution, memory address computation, or branch completion (continued)

The operation in this cycle depends on the instruction type.

Operations
* if branch instruction
    if (A == B) PC <= ALUOut
    ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond = 1, PCSource = 01
* if jump instruction
    PC <= {PC[31:28], IR[25:0], 2'b00}
    PCSource = 10, PCWrite = 1
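
The jump address concatenation can be written out with masks and shifts. This is a small sketch with a hypothetical function name, `jump_target`; `pc` is assumed to be the already-incremented PC.

```python
def jump_target(pc: int, ir: int) -> int:
    """Concatenate PC[31:28], IR[25:0], and two zero bits into a 32-bit address."""
    upper = pc & 0xF0000000            # PC[31:28] kept in place
    lower = (ir & 0x03FFFFFF) << 2     # IR[25:0] followed by 2'b00
    return upper | lower


if __name__ == "__main__":
    j = (0x02 << 26) | 0x100           # j with word index 0x100
    print(hex(jump_target(0x40000004, j)))
```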


Breaking Instruction Execution into Clock Cycles

4. Memory access or R-type completion

The operation in this cycle depends on the instruction type.

Operations
* if load instruction: read value from memory into MDR
    MDR <= Memory[ALUOut]
    MemRead = 1, IorD = 1
* if store instruction: store rt into memory
    Memory[ALUOut] <= B
    MemWrite = 1, IorD = 1
* if arithmetic-logical instruction: write ALU result into rd
    Reg[IR[15:11]] <= ALUOut
    MemtoReg = 0, RegDst = 1, RegWrite = 1


Breaking Instruction Execution into Clock Cycles

5. Memory read completion

Needed for the load instruction only.

Operations
5.1 Store the loaded value in MDR into rt
    Reg[IR[20:16]] <= MDR
    RegWrite = 1, MemtoReg = 1, RegDst = 0
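
To see the five steps work together, the following sketch steps one instruction through the register transfers listed on these slides and reports how many cycles it used. It is a behavioral model under stated assumptions, not the course's hardware: `run_instruction`, the `state` dictionary layout, and the restriction of R-type to add (funct 0x20) and sub (funct 0x22) are all choices made for illustration, and writes to register 0 are not suppressed.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


# MIPS opcodes for the instruction subset covered by the slides.
LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02


def run_instruction(state):
    """Execute one instruction and return the number of clock cycles used.

    state: {"pc": int, "regs": list of 32 ints, "mem": dict addr -> word}
    """
    regs, mem = state["regs"], state["mem"]

    # Cycle 1: IR <= Memory[PC]; PC <= PC + 4
    ir = mem[state["pc"]]
    state["pc"] = (state["pc"] + 4) & 0xFFFFFFFF

    # Cycle 2: A, B <= register file; ALUOut <= branch target
    opcode = ir >> 26
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    alu_out = (state["pc"] + (sign_extend16(ir) << 2)) & 0xFFFFFFFF

    # Cycle 3: branch/jump completion, address computation, or ALU operation
    if opcode == BEQ:
        if a == b:
            state["pc"] = alu_out
        return 3
    if opcode == J:
        state["pc"] = (state["pc"] & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
        return 3
    if opcode in (LW, SW):
        alu_out = (a + sign_extend16(ir)) & 0xFFFFFFFF
    elif opcode == RTYPE and (ir & 0x3F) == 0x20:   # add
        alu_out = (a + b) & 0xFFFFFFFF
    elif opcode == RTYPE and (ir & 0x3F) == 0x22:   # sub
        alu_out = (a - b) & 0xFFFFFFFF
    else:
        raise ValueError("instruction not modeled in this sketch")

    # Cycle 4: memory access or R-type completion
    if opcode == SW:
        mem[alu_out] = b
        return 4
    if opcode == RTYPE:
        regs[(ir >> 11) & 0x1F] = alu_out
        return 4
    mdr = mem[alu_out]                               # lw: MDR <= Memory[ALUOut]

    # Cycle 5: memory read completion (lw only)
    regs[(ir >> 16) & 0x1F] = mdr
    return 5
```

Running it on one instruction of each class should give 5 cycles for a load, 4 for a store or an arithmetic-logical instruction, and 3 for a branch or a jump.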


Breaking Instruction Execution into Clock Cycles

In this implementation, not all instructions take 5 cycles.

| Instruction Class | Clock Cycles Required |
| --- | --- |
| Load | 5 |
| Store | 4 |
| Branch | 3 |
| Arithmetic-logical | 4 |
| Jump | 3 |


Multicycle Performance

Compute the average CPI for the multicycle implementation for a SPECINT2000 program which has the following instruction mix: 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU. Assume the CPI for each instruction class as given in the previous table.

CPI = Σ (CPI_i x IC_i) / IC
    = 0.25 x 5 + 0.10 x 4 + 0.11 x 3 + 0.02 x 3 + 0.52 x 4
    = 4.12

Compare to CPI = 1 for single cycle. Assume CC_M = (1/5) CC_S, where CC_M and CC_S are the multicycle and single-cycle clock cycle times.

Then Performance_M / Performance_S = (IC x 1 x CC_S) / (IC x 4.12 x (1/5) CC_S) = 1.21

Multicycle is also cost-effective in terms of hardware.
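
The arithmetic above can be reproduced with a short script. The dictionary and function names are hypothetical; the numbers are the instruction mix and per-class cycle counts given on the slides, and the 1/5 clock ratio is the slide's own assumption.

```python
# Cycles per instruction class and the instruction mix from the slides.
CYCLES = {"load": 5, "store": 4, "branch": 3, "jump": 3, "alu": 4}
MIX = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "alu": 0.52}


def average_cpi(cycles, mix):
    """Weighted average of cycles per instruction over the instruction mix."""
    return sum(cycles[name] * fraction for name, fraction in mix.items())


def multicycle_speedup(cpi_multi, clock_ratio=1 / 5):
    """Performance_M / Performance_S with single-cycle CPI = 1.

    clock_ratio is CC_M / CC_S; IC and CC_S cancel out of the ratio.
    """
    return 1.0 / (cpi_multi * clock_ratio)


if __name__ == "__main__":
    cpi = average_cpi(CYCLES, MIX)
    print(f"CPI = {cpi:.2f}")
    print(f"speedup = {multicycle_speedup(cpi):.2f}")
```

Changing `clock_ratio` shows how quickly the advantage shrinks when the multicycle clock cannot reach one fifth of the single-cycle period.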


Multicycle Control Unit

Multicycle datapath control signals are not determined solely by the bits in the instruction
- e.g., opcode bits tell what operation the ALU should be doing, but not what instruction cycle is to be done next
- Since the instruction is broken into multiple cycles, we need to know what we did in the previous cycle(s) in order to determine the current action

Must use a finite state machine (FSM) for control
- a set of states (current state stored in State Register)
- next state function (determined by current state and the input)
- output function (determined by current state and the input)

[Figure: FSM control. Combinational control logic takes the instruction opcode and the current state from the State Register as inputs, and produces the datapath control points and the next state.]


The States of the Control Unit

10 states are required in the FSM control

The sequence of states is determined by five steps of execution and the instruction
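
The state diagram itself did not survive in this text, so the sketch below assumes the common textbook arrangement of the ten states (fetch, decode, memory address computation, memory read, write-back, memory write, execution, R-type completion, branch completion, jump completion). The state names and the `next_state` and `cycles_for` functions are hypothetical.

```python
# Assumed state numbering; the slide's own diagram is not available here.
(FETCH, DECODE, MEM_ADDR, MEM_READ, MEM_WB,
 MEM_WRITE, EXEC, RTYPE_DONE, BRANCH_DONE, JUMP_DONE) = range(10)

LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02


def next_state(state, opcode):
    """Next-state function: depends on the current state and the opcode."""
    if state == FETCH:
        return DECODE
    if state == DECODE:
        return {LW: MEM_ADDR, SW: MEM_ADDR, RTYPE: EXEC,
                BEQ: BRANCH_DONE, J: JUMP_DONE}[opcode]
    if state == MEM_ADDR:
        return MEM_READ if opcode == LW else MEM_WRITE
    if state == MEM_READ:
        return MEM_WB
    if state == EXEC:
        return RTYPE_DONE
    return FETCH  # every completion state returns to instruction fetch


def cycles_for(opcode):
    """Count the states visited by one instruction, starting at FETCH."""
    state, cycles = FETCH, 1
    while True:
        state = next_state(state, opcode)
        if state == FETCH:
            return cycles
        cycles += 1


if __name__ == "__main__":
    for name, op in [("lw", LW), ("sw", SW), ("R-type", RTYPE),
                     ("beq", BEQ), ("j", J)]:
        print(name, cycles_for(op))
```

Only the decode and memory-address states branch on the opcode; all other transitions are fixed, which is why the sequence is determined by the execution step and the instruction together.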


The Control Unit

1. Logic gates
- inputs: present state + opcode, #bits = 10
- outputs: control + next state, #bits = 20
- truth table size = 2^10 rows x 20 columns

2. ROM
- Can be used to implement the truth table above (2^10 x 20 bit = 20 Kbit)
- Each location stores the control signal values and the next state
- Each location is addressable by the opcode and present state value
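
The ROM size follows directly from the bit counts. The split used below (4 state bits plus 6 opcode bits for the address, 16 control bits plus 4 next-state bits per word) is an assumption that matches the totals of 10 and 20 given above; the function name `rom_size_bits` is hypothetical.

```python
def rom_size_bits(state_bits=4, opcode_bits=6, control_bits=16):
    """Return (address_bits, word_bits, total_bits) for the control ROM.

    Assumed split: the address is present state plus opcode, and each word
    holds the control signal values plus the next state.
    """
    address_bits = state_bits + opcode_bits
    word_bits = control_bits + state_bits
    return address_bits, word_bits, (2 ** address_bits) * word_bits


if __name__ == "__main__":
    address_bits, word_bits, total = rom_size_bits()
    print(f"{2 ** address_bits} words x {word_bits} bits = {total // 1024} Kbit")
```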


Micro-programmed Control Unit

A ROM implementation is vulnerable to bugs and expensive, especially for a complex CPU. Its size increases as the number and complexity of instructions (states) increases.

Use microprogramming
- The next state value may not be sequential
- Generate the next state outside the storage element
- Each state is a microinstruction and the signals are specified symbolically
- Use labels for sequencing


Sequencer


Microprogram

The microassembler converts the microcode into actual signal values


The sequencing field is used along with the opcode to determine the next state


Multicycle Advantages & Disadvantages

Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step

[Timing diagram: Clk, cycles 1 to 10. lw occupies cycles 1-5 (IFetch, Dec, Exec, Mem, WB), sw occupies cycles 6-9 (IFetch, Dec, Exec, Mem), and an R-type instruction begins with IFetch in cycle 10.]

Multicycle implementations allow functional units to be used more than once per instruction as long as they are used on different clock cycles

but

Requires additional internal state registers, more muxes, and more complicated (FSM) control


Single Cycle vs. Multiple Cycle Timing

Single Cycle Implementation:

[Timing diagram: Clk, Cycle 1 and Cycle 2. lw takes all of Cycle 1; sw finishes before the end of Cycle 2 and the remainder of the cycle is waste.]

Multiple Cycle Implementation:

[Timing diagram: Clk, cycles 1 to 10. lw: IFetch, Dec, Exec, Mem, WB (cycles 1-5); sw: IFetch, Dec, Exec, Mem (cycles 6-9); R-type: IFetch (cycle 10).]

The multicycle clock is slower than 1/5th of the single cycle clock due to state register overhead.