michael-fiorelli
Lecture about Single-Cycle MIPS processors. Provides detailed overview of the datapath. Great resource for computer and electrical engineering students.
1
ECE 3445 Computer Architecture
Lecture 10 & 11
Building a Single-Cycle MIPS Processor: Datapath
Dr. X. Wang ECE@Villanova
2
The Big Picture: What is Coming?
• Three Classic Components of a Computer
[Figure: a computer consisting of a Processor (Control and Datapath), Memory, Input, and Output; memory contents are drawn as 16-bit binary words.]
• Datapath deals with moving data around
• Control generates signals to control proper operation of the datapath
3
Formal Definition: Datapath and Control Unit
Control Unit: The component that decodes instructions and generates control signals to command proper operation of the datapath, memory, and I/O devices.
Datapath: The collection of state elements, computation elements, and interconnections that together provide a conduit for the flow and transformation of data in the processor during execution.
4
Overview of Chapter 4
• Implementation of the MIPS Processor: Datapath and Control Unit
  – We've built a small ALU
  – How the processor is designed and implemented to execute instructions
• Datapath and control
  – Three different implementations will be examined
    • A single-cycle processor
    • A multi-cycle processor (not included in the 4th edition of the textbook)
    • A pipelined processor
• Outcome of this chapter
  – You will be able to design and implement your own processor that executes your own instructions
5
General Processor Operation Steps (von Neumann model)
• A processor uses 3 steps to execute an instruction:
  1. Use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
  2. Decode the instruction (and read registers) in the processor
  3. Execute the instruction in the processor
    • Perform operation
    • Read/write data
[Figure: Fetch (PC = PC + 4) -> Decode -> Exec loop; the Processor (Control, Datapath, PC) sends an Address to Memory and receives an Instruction.]
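The three steps above can be sketched as a small software loop. This is only an illustration: the tuple-based instruction encoding and the `run` helper are assumptions made for the sketch, not the MIPS binary format or anything taken from the slides.

```python
# Toy fetch/decode/execute loop.
# Assumption: an instruction is a tuple (op, dst, src_a, src_b), not real MIPS encoding.

def run(program, registers, max_steps=100):
    """Execute `program`, a dict mapping byte addresses to instruction tuples."""
    pc = 0
    for _ in range(max_steps):
        if pc not in program:              # no instruction at this address: stop
            break
        instr = program[pc]                # 1. fetch the instruction at address PC
        pc += 4                            #    ...and update the PC
        op, dst, a, b = instr              # 2. decode: split into fields
        x, y = registers[a], registers[b]  #    ...and read the source registers
        if op == "add":                    # 3. execute
            registers[dst] = x + y
        elif op == "sub":
            registers[dst] = x - y
        else:
            raise ValueError(f"unsupported op: {op}")
    return registers

regs = run({0: ("add", 3, 1, 2), 4: ("sub", 4, 3, 1)},
           {1: 5, 2: 7, 3: 0, 4: 0})
```

In hardware all three steps happen within one clock cycle of the single-cycle design; the loop only mirrors their logical order.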
6
Processor Design
• The specification of the MIPS instruction set architecture drives the design of the hardware.
• Design hardware that performs the following:
  – Fetches instructions from memory
  – Decodes instructions
  – Executes instructions as specified by the ISA
• Design considerations: Cost, Speed, Power, etc.
• Start with a simplified MIPS
• Initial design supports a key subset of instructions:
  – Memory access: lw, sw
  – Arithmetic/Logical: add, sub, and, or, slt
  – Branch/Jump: beq, j
• Add instructions as we go
7
Design a Processor: Step-by-Step
1. Analyze instruction set -> datapath requirements
  – meaning of each instruction is given by the register transfer operations
  – datapath must include storage elements for ISA registers
  – datapath must support each register transfer
2. Select a set of datapath components and establish clocking methodology
3. Assemble datapath meeting the requirements
4. Analyze implementation of each instruction to determine the setting of control points that effects the register transfer.
5. Design the control logic unit
8
Step 1a: Analyze the MIPS Instruction Subset (Use Handout 2)
1. ADD and SUB
  – addu rd, rs, rt
  – subu rd, rs, rt
  R-type format: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | rd [15:11] (5 bits) | shamt [10:6] (5 bits) | funct [5:0] (6 bits)
2. OR Immediate:
  – ori rt, rs, imm16
  I-type format: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | immediate [15:0] (16 bits)
3. LOAD and STORE
  – lw rt, imm16(rs)
  – sw rt, imm16(rs)
  I-type format (same fields as ori)
4. BRANCH:
  – beq rs, rt, imm16
  I-type format (same fields as ori)
5. Jump
  – j imm26
  J-type format: op [31:26] (6 bits) | target address [25:0] (26 bits)
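As an illustration of the field layouts above, the sketch below extracts every field from a 32-bit instruction word with shifts and masks. The function name `decode` and the dictionary keys are hypothetical; the bit positions are the ones given in the formats.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into all possible fields.

    Which fields are meaningful depends on the format (R, I, or J).
    """
    return {
        "op":     (word >> 26) & 0x3F,   # bits 31..26
        "rs":     (word >> 21) & 0x1F,   # bits 25..21
        "rt":     (word >> 16) & 0x1F,   # bits 20..16
        "rd":     (word >> 11) & 0x1F,   # bits 15..11 (R-type)
        "shamt":  (word >> 6) & 0x1F,    # bits 10..6  (R-type)
        "funct":  word & 0x3F,           # bits 5..0   (R-type)
        "imm16":  word & 0xFFFF,         # bits 15..0  (I-type)
        "target": word & 0x3FFFFFF,      # bits 25..0  (J-type)
    }

# addu $3, $1, $2: op = 0, rs = 1, rt = 2, rd = 3, shamt = 0, funct = 0x21
addu_word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (0 << 6) | 0x21
fields = decode(addu_word)
```

In the hardware datapath this "decoding" is just wiring: each field is a fixed slice of the instruction bus.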
9
Step 1b: Logical Register Transfers
• All start by fetching the instruction
  Instruction <- MEM[PC]
• Execution
  Instr.  Syntax              Register Transfers
  ADD     add rd, rs, rt      R[rd] <- R[rs] + R[rt]; PC <- PC + 4
  SUB     sub rd, rs, rt      R[rd] <- R[rs] - R[rt]; PC <- PC + 4
  AND     and rd, rs, rt      R[rd] <- R[rs] & R[rt]; PC <- PC + 4
  ORI     ori rt, rs, imm16   R[rt] <- R[rs] | zero_ext(Imm16); PC <- PC + 4
  LOAD    lw rt, imm16(rs)    R[rt] <- MEM[R[rs] + sign_ext(Imm16)]; PC <- PC + 4
  STORE   sw rt, imm16(rs)    MEM[R[rs] + sign_ext(Imm16)] <- R[rt]; PC <- PC + 4
  BEQ     beq rs, rt, imm16   if (R[rs] == R[rt]) then PC <- PC + 4 + (sign_ext(Imm16) || 00) else PC <- PC + 4
  J       j imm26             PC <- (upper 4 bits of PC + 4) || (imm26 << 2)
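The register transfers for ORI, LOAD, and STORE can be written out as a minimal sketch. The function names, the list `R` for registers, and the dictionary `MEM` for word-addressed data are assumptions chosen for illustration; each function returns the next PC.

```python
MASK32 = 0xFFFFFFFF

def zero_ext(imm16):
    """Zero-extend a 16-bit immediate to 32 bits."""
    return imm16 & 0xFFFF

def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to a 32-bit (unsigned) bit pattern."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16

def ori(R, rt, rs, imm16, pc):
    R[rt] = R[rs] | zero_ext(imm16)           # R[rt] <- R[rs] | zero_ext(Imm16)
    return (pc + 4) & MASK32                  # PC <- PC + 4

def lw(R, MEM, rt, rs, imm16, pc):
    addr = (R[rs] + sign_ext(imm16)) & MASK32
    R[rt] = MEM[addr]                         # R[rt] <- MEM[R[rs] + sign_ext(Imm16)]
    return (pc + 4) & MASK32

def sw(R, MEM, rt, rs, imm16, pc):
    addr = (R[rs] + sign_ext(imm16)) & MASK32
    MEM[addr] = R[rt]                         # MEM[R[rs] + sign_ext(Imm16)] <- R[rt]
    return (pc + 4) & MASK32

R = [0] * 32
R[1] = 0x100
MEM = {0xFC: 42}
pc = lw(R, MEM, 2, 1, 0xFFFC, 0)   # offset 0xFFFC is -4 after sign extension
pc = sw(R, MEM, 2, 1, 0x0004, pc)  # store R[2] at address 0x104
```

Note that ORI zero-extends its immediate while LOAD and STORE sign-extend theirs, which is why a negative offset such as -4 addresses the word below `R[rs]`.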
10
Summary: Instruction Set Requirements
• Memory
  – Store and supply instructions (instruction memory)
  – Read and write data (data memory)
• PC (program counter)
• Registers - 32
  – read (from rs field in instruction)
  – read (from rt field in instruction)
  – write (from rd or rt field in instruction)
• Sign Extender
• Add, Subtract, OR, AND (register values) -> ALU
• Add 4 or extended immediate to PC
Combinational Logic: Outputs are always just a logic function of the inputs (with some delay). No clock.
Sequential Circuits: The input is written into the state element at the rising or falling edge of the clock.
11
Step 2: Select Datapath Components: Memory and PC
• Memory
  – Instruction Memory: Address input, Instruction output
  – Data Memory: Address and Write Data inputs, Read Data output; control inputs MemWrite and MemRead
• PC (Program Counter)
***Blue signals represent control signals from control unit
12
Step 2: Select Datapath Components: Registers
• Similar to the D Flip Flop except
  – N-bit input and output
  – Write Enable input
    • invalid (deasserted): Data Out will not change
    • valid (asserted): Data Out will become Data In on the clock edge
[Figure: N-bit register with inputs Data In (N bits), Write Enable, and Clk, and output Data Out (N bits).]
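The write-enable behavior described above can be modeled in a few lines. This is a behavioral sketch only; the class name `Register` and the method `clock_edge` are hypothetical, and one call to `clock_edge` stands for one active clock edge.

```python
class Register:
    """Behavioral model of an N-bit register with a Write Enable input."""

    def __init__(self, width=32):
        self.mask = (1 << width) - 1   # keep values within N bits
        self.data_out = 0

    def clock_edge(self, data_in, write_enable):
        """Model one clock edge: capture data_in only if write_enable is asserted."""
        if write_enable:
            self.data_out = data_in & self.mask
        return self.data_out

r = Register(width=8)
after_write = r.clock_edge(0x1FF, write_enable=True)   # truncated to 8 bits
after_hold = r.clock_edge(0x05, write_enable=False)    # value is held
```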
13
Step 2: Select Datapath Components: Register File
• 32 registers in MIPS are organized as a group called the register file
• Register file consists of 32 regs:
  – Two 32-bit output busses: Read data 1 and Read data 2 (why?)
  – One 32-bit input bus: Write Data
• Register is selected by:
  – 5-bit Read reg. 1 (number) selects the register to output on Read data 1
  – 5-bit Read reg. 2 (number) selects the register to output on Read data 2
  – 5-bit Write reg. (number) selects the register to be written via Write Data when RegWrite is 1
[Figure: register file with 5-bit inputs Read Register1, Read Register2, and Write Register; 32-bit input Write Data; 32-bit outputs Read Data1 and Read Data2; control input RegWrite; and Clock.]
14
[Figure: register file Read Port and Write Port.]
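A behavioral sketch of the register file follows, with two read ports and one clocked write port. The class and method names are hypothetical. The hardwired-zero behavior of register 0 is the standard MIPS convention and is added here as an assumption; the slide does not mention it.

```python
class RegisterFile:
    """32 x 32-bit register file: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        """Combinational read: returns (Read Data 1, Read Data 2)."""
        return self.regs[read_reg1 & 0x1F], self.regs[read_reg2 & 0x1F]

    def clock_edge(self, write_reg, write_data, reg_write):
        """Model one clock edge: write only when RegWrite is 1."""
        write_reg &= 0x1F
        # Assumption: register 0 always reads as zero (MIPS convention).
        if reg_write and write_reg != 0:
            self.regs[write_reg] = write_data & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(write_reg=5, write_data=123, reg_write=1)
rf.clock_edge(write_reg=6, write_data=456, reg_write=0)   # ignored: RegWrite is 0
data1, data2 = rf.read(5, 6)
```

Two read ports are needed because instructions such as add read both rs and rt in the same cycle.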
15
Step 2: Selecting Datapath Components: more
• ALU
• Sign Extender
• Add 4 or extended immediate to PC
• Multiplexers
[Figure: ALU with 32-bit inputs A and B, control input ALU Operation, 32-bit output ALU Result, and status outputs Zero, Overflow, Carryout, and Negative; Sign extend unit (16 bits in, 32 bits out); 32-bit Adder.]
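The ALU's role can be sketched as a function from two 32-bit inputs and an operation to a result and a Zero flag. The string operation names are an assumption for readability; the real ALU Operation input is a binary code whose encoding is not given here, and the Overflow, Carryout, and Negative outputs are omitted.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value

def alu(a, b, operation):
    """Return (ALU Result, Zero) for 32-bit inputs a and b."""
    a &= MASK32
    b &= MASK32
    if operation == "and":
        result = a & b
    elif operation == "or":
        result = a | b
    elif operation == "add":
        result = (a + b) & MASK32
    elif operation == "sub":
        result = (a - b) & MASK32
    elif operation == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError(f"unsupported operation: {operation}")
    zero = 1 if result == 0 else 0
    return result, zero

equal_check = alu(5, 5, "sub")   # beq compares by subtracting and testing Zero
```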
16
Step 3: Assemble Datapath Meeting Requirements
• We need to "glue" the components together to construct a datapath that allows data to move among the components according to the needs of the instruction.
• The data moving around in the datapath include:
  – Register values
  – Memory values
  – ALU results
  – The program counter
• We combine the functional units with wires and multiplexers
• How? Trace the instructions and resolve conflicts
17
Fetching Instructions
• Fetching instructions involves
  1. Reading an instruction from the Instruction Memory
  2. Updating the PC to hold the address of the next instruction
[Figure: the PC register drives the Read Address of the Instruction Memory, which sends the Instruction to the next stage; an Add unit computes PC + 4 and feeds it back to the PC.]
  – PC: updated every cycle; no need for an explicit write signal
  – Instruction Memory: read every cycle; no need for an explicit read signal
18
Decoding Instructions
• Decoding instructions involves
  – Sending the fetched instruction's opcode and function field bits to the control unit
  – Reading two values from the Register File
    • Register File addresses are contained in the instruction
[Figure: the Instruction (from the fetch stage), with fields op, rs, rt, rd, shamt, funct, feeds the Control Unit and the Register File inputs Read Addr 1, Read Addr 2, Write Addr, and Write Data; the Register File outputs Read Data 1 and Read Data 2.]
19
Execution Stage
• The action depends on the instruction class:
  – Similarities across instruction classes:
  – e.g., all instructions (except j) use the ALU after reading the registers.
    • Memory-reference: use ALU to calculate addresses
    • Arithmetic: operation execution
    • Branches: comparison
  – Simplicity and regularity of instructions simplify the implementation of a processor
  – After using the ALU, actions differ…
    • Memory-reference: access data memory
    • Arithmetic: write result back to register file
    • Branches: may need to update the PC with the final address based on the comparison result.
20
Datapath for R-Type Instructions
add rd, rs, rt
R[rd] <- R[rs] + R[rt];
[Figure: the rs and rt fields select Read Register1 and Read Register2, and rd selects Write Register (5 bits each); Read Data1 and Read Data2 (32 bits) feed the ALU; the 32-bit ALU Result returns to Write Data; RegWrite and ALU Operation are control inputs.]
21
Datapath for Load Instructions
lw rt, offset(rs)
R[rt] <- MEM[R[rs] + s_extend(offset)];
[Figure: rs selects Read Register1 and rt selects Write Register; the ALU adds Read Data1 to the sign-extended 16-bit Offset/Immediate; the ALU Result is the Data Memory Address; Read Data from the Data Memory goes to Write Data of the Register File.]
22
Datapath for Store Instructions
sw rt, offset(rs)
MEM[R[rs] + s_extend(offset)] <- R[rt]
[Figure: rs selects Read Register1 and rt selects Read Register2; the ALU adds Read Data1 to the sign-extended 16-bit Offset/Immediate; the ALU output is the Data Memory Address; Read Data2 goes to Write Data of the Data Memory.]
23
Load and Store Together
sw rt, offset(rs)
lw rt, offset(rs)
[Figure: the load and store datapaths merged: Register File, Sign extend, ALU, and Data Memory with MemRead and MemWrite control inputs.]
24
Branch Instructions
• beq $1, $2, addr # if ($1 == $2) go to addr
• Actually
    if (($1 - $2) == 0)
      PC = addr
    else
      PC = newPC
  where:
    newPC = PC + 4
    addr = newPC + (imm << 2) # in MIPS, the offset is relative to newPC
  I-type format: op [31:26] (6 bits) | rs [25:21] (5 bits, $1) | rt [20:16] (5 bits, $2) | immediate [15:0] (16 bits, imm)
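The branch decision above can be sketched directly. The function name `next_pc_beq` is hypothetical; the comparison is done by subtraction and a zero test, mirroring how the ALU's Zero output is used.

```python
MASK32 = 0xFFFFFFFF

def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc_beq(pc, reg1, reg2, imm16):
    """Return the next PC for `beq $1, $2, addr`."""
    new_pc = (pc + 4) & MASK32                         # newPC = PC + 4
    addr = (new_pc + (sign_ext(imm16) << 2)) & MASK32  # addr = newPC + (imm << 2)
    zero = ((reg1 - reg2) & MASK32) == 0               # ($1 - $2) == 0
    return addr if zero else new_pc

taken = next_pc_beq(0x1000, 7, 7, 3)        # equal: branch 3 words past newPC
not_taken = next_pc_beq(0x1000, 7, 8, 3)    # not equal: fall through
backward = next_pc_beq(0x1000, 1, 1, 0xFFFF)  # imm = -1: branch back to itself
```

Because the offset counts words from newPC, an immediate of -1 targets the branch instruction itself.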
25
Datapath for Branch Instructions
beq rs, rt, offset
if (R[rs] == R[rt]) then PC <- PC + 4 + (s_extend(offset) << 2)
[Figure: rs and rt select Read Register1 and Read Register2; the ALU compares Read Data1 and Read Data2 and drives Zero; the 16-bit Offset/Immediate is sign-extended to 32 bits, shifted left by 2, and added to PC + 4 by a separate ADD unit to form the branch target PC + 4 + (S_Ext(imm) << 2).]
26
Putting it all together…
• Goal: merge datapaths for each function
  – Instruction Fetch
  – R-Type Instructions
  – Load/Store Instructions
  – Branch instructions
• Add multiplexers to steer data as needed
27
[Figure: the R-type datapath and the load/store (L/S) datapath side by side, with three numbered points (1, 2, 3) where they differ: the Write Register source (rd (15-11) vs. rt (20-16)), the second ALU input (Read Data2 vs. the sign-extended imm (15-0)), and the Register File Write Data source (ALU Result vs. Data Memory Read Data).]
28
Example: combine R-Type and Load/Store Datapaths
1. Select write register from rt or rd
  – rd for R-type
  – rt for Load
2. Select an ALU input from either
  • Register File output Read data 2 (for R-Type)
  • Sign-extender output (for LW/SW)
3. Select Register File input WD from either
  • ALU output (for R-Type)
  • Memory output RD (for LW)
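The three selections above are each a 2-to-1 multiplexer. The sketch below models them with a hypothetical `mux` helper. The select-signal names (`reg_dst`, `alu_src`, `mem_to_reg`) and the choice of which input is 0 and which is 1 follow the usual textbook convention and are assumptions here.

```python
def mux(select, in0, in1):
    """2-to-1 multiplexer: returns in0 when select is 0, in1 when select is 1."""
    return in1 if select else in0

def steer(rt, rd, read_data2, sign_ext_imm, alu_result, mem_read_data,
          reg_dst, alu_src, mem_to_reg):
    """Return (write register, second ALU input, register write data)."""
    write_register = mux(reg_dst, rt, rd)                  # 1. rt (load) or rd (R-type)
    alu_input_b = mux(alu_src, read_data2, sign_ext_imm)   # 2. register or immediate
    write_data = mux(mem_to_reg, alu_result, mem_read_data)  # 3. ALU or memory
    return write_register, alu_input_b, write_data

# Assumed settings: R-type uses (1, 0, 0); lw uses (0, 1, 1).
r_type = steer(rt=2, rd=3, read_data2=10, sign_ext_imm=4,
               alu_result=15, mem_read_data=99,
               reg_dst=1, alu_src=0, mem_to_reg=0)
load = steer(rt=2, rd=3, read_data2=10, sign_ext_imm=4,
             alu_result=15, mem_read_data=99,
             reg_dst=0, alu_src=1, mem_to_reg=1)
```

A store needs only the second mux (immediate into the ALU), since it writes memory rather than the register file.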
29
Arithmetic and Load/Store Together
[Figure: merged datapath with three multiplexers, each with inputs labeled 0 and 1: RegDst selects the Write Register from rt (20-16) or rd (15-11); ALUSrc selects the second ALU input from Read Data2 or the sign-extended Imm (15-0); MemtoReg selects the Register File Write Data from the ALU Result or the Data Memory Read Data.]
30
Complete Datapath of Single-cycle MIPS
31
Supporting Jumps (Fig. 4.24)
FIGURE 4.24 The simple control and datapath are extended to handle the jump instruction. An additional multiplexor (at the upper right) is used to choose between the jump target and either the branch target or the sequential instruction following this one. This multiplexor is controlled by the jump control signal. The jump target address is obtained by shifting the lower 26 bits of the jump instruction left 2 bits, effectively adding 00 as the low-order bits, and then concatenating the upper 4 bits of PC + 4 as the high-order bits, thus yielding a 32-bit address. Copyright © 2009 Elsevier, Inc. All rights reserved.
[Figure 4.24 labels: PC, Instruction memory, Registers, ALU, ALU control, Control, Data memory, Sign-extend, Shift left 2, Add, and Mux units; control signals RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; instruction fields Instruction [31–26], [25–21], [20–16], [15–11], [15–0], [5–0], and [25–0]; Jump address [31–0] is formed from Instruction [25–0] shifted left 2 (26 bits to 28 bits) and PC + 4 [31–28].]
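The jump-address construction described in the Figure 4.24 caption can be checked with a short sketch. The function name `jump_address` is hypothetical; the bit manipulation follows the caption: shift the lower 26 bits left by 2, then take the upper 4 bits from PC + 4.

```python
MASK32 = 0xFFFFFFFF

def jump_address(pc, instruction):
    """Form the 32-bit jump target from the PC and a j instruction word."""
    target26 = instruction & 0x03FFFFFF   # Instruction [25-0]
    low28 = target26 << 2                 # shift left 2: low-order bits become 00
    high4 = ((pc + 4) & MASK32) & 0xF0000000   # PC + 4 [31-28]
    return high4 | low28

# j with a 26-bit target field of 1 (opcode 2 in bits 31..26)
j_word = (2 << 26) | 1
target = jump_address(0x10000008, j_word)
```

The upper 4 bits come from PC + 4, so a jump can only reach addresses within the same 256 MB region as the following instruction.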
32
The Effect of the Seven Signals
33
Design a Processor: Step-by-Step
1. Analyze instruction set -> datapath requirements
  – the meaning of each instruction is given by the register transfers
  – datapath must include storage elements for ISA registers
  – datapath must support each register transfer
2. Select a set of datapath components and establish clocking methodology
3. Assemble datapath meeting the requirements
4. Analyze implementation of each instruction to determine the setting of control points that effects the register transfer.
5. Assemble the control logic
34
Complete Datapath Executing add
add rd, rs, rt
op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
35
Complete Datapath Executing lw
lw rt, offset(rs)
op (6 bits) | rs (5 bits) | rt (5 bits) | Offset (16 bits)
36
Complete Datapath Executing sw
sw rt, offset(rs)
op (6 bits) | rs (5 bits) | rt (5 bits) | Offset (16 bits)
37
Complete Datapath Executing Beq (Taken)
beq rs, rt, label # rs == rt
op (6 bits) | rs (5 bits) | rt (5 bits) | Imm (16 bits)
38
Complete Datapath Executing Beq (NOT Taken)
beq rs, rt, label # rs != rt
op (6 bits) | rs (5 bits) | rt (5 bits) | Imm (16 bits)
39
Complete Datapath Executing J
j label
op (6 bits) | Imm (26 bits)
40
Summary of Control Signals