COMP541
Datapaths II & Control I
Montek Singh
Mar 22, 2010

Topics
• Single cycle MIPS
• Reading: Chapter 7
• Verilog code for MIPS at the end (!)
• If you don’t feel comfortable with assembly language, pls review Ch. 6

First, Top Level of CPU
module top(input         clk, reset,
           output [31:0] writedata, dataadr,
           output        memwrite);

  wire [31:0] pc, instr, readdata;

  // instantiate processor and memories
  mips mips(clk, reset, pc, instr, memwrite, dataadr,
            writedata, readdata);
  imem imem(pc[7:2], instr);
  dmem dmem(clk, memwrite, dataadr, writedata,
            readdata);
endmodule
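
The top level instantiates imem and dmem without showing them. The following is a minimal sketch of what they might look like, not the course's actual memories. It assumes 64-word memories (which matches the 6-bit pc[7:2] address), a combinational read, and a clocked write. The file name memfile.dat and the use of address bits [7:2] in dmem are assumptions made for illustration.

```verilog
// Sketch of the two memories instantiated by top.
// Assumptions: 64 words each, asynchronous read, synchronous write.
module imem(input  [5:0]  a,
            output [31:0] rd);
  reg [31:0] RAM[63:0];

  // "memfile.dat" is a hypothetical hex file holding the program
  initial $readmemh("memfile.dat", RAM);

  assign rd = RAM[a];            // word address comes from pc[7:2]
endmodule

module dmem(input         clk, we,
            input  [31:0] a, wd,
            output [31:0] rd);
  reg [31:0] RAM[63:0];

  assign rd = RAM[a[7:2]];       // word-aligned read (assumed address bits)

  always @(posedge clk)
    if (we) RAM[a[7:2]] <= wd;   // write on the clock edge when enabled
endmodule
```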

Top Level Schematic (ISE)
[Figure: top-level schematic with the MIPS, imem, and dmem blocks]

Top Level of MIPS
module mips(input         clk, reset,
            output [31:0] pc,
            input  [31:0] instr,
            output        memwrite,
            output [31:0] aluout, writedata,
            input  [31:0] readdata);

  wire       memtoreg, branch,
             pcsrc, zero,
             alusrc, regdst, regwrite, jump;
  wire [2:0] alucontrol;

  controller c(instr[31:26], instr[5:0], zero,
               memtoreg, memwrite, pcsrc,
               alusrc, regdst, regwrite, jump,
               alucontrol);
  datapath dp(clk, reset, memtoreg, pcsrc,
              alusrc, regdst, regwrite, jump,
              alucontrol,
              zero, pc, instr,
              aluout, writedata, readdata);
endmodule

MIPS Schematic
[Figure: MIPS schematic]

Datapath

MIPS State Elements
[Figure: the state elements: PC register (input PC', output PC), Instruction Memory (A, RD), Register File (A1, A2, A3, WD3, WE3, RD1, RD2), and Data Memory (A, WD, WE, RD). Data buses are 32 bits wide; register addresses are 5 bits.]
We’ll fill out the datapath and control logic for basic single cycle MIPS
• First the datapath
• then the control logic

Let’s Design lw
What does it do?

Single-Cycle Datapath: lw fetch
First consider executing lw
How does lw work?
STEP 1: Fetch instruction
[Figure: PC drives the Instruction Memory address input A; the read output RD is Instr]

Single-Cycle Datapath: lw register read
STEP 2: Read source operands from register file
[Figure: Instr[25:21] drives Register File address input A1]

Single-Cycle Datapath: lw immediate
STEP 3: Sign-extend the immediate
[Figure: Instr[15:0] passes through Sign Extend to produce SignImm]
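
Step 3 can be written in one line of Verilog. The sketch below assumes a module named signext with ports a and y (names chosen for illustration). It replicates the sign bit of the 16-bit immediate into the upper 16 bits.

```verilog
// Sign extension of the 16-bit immediate, Instr[15:0], to 32 bits.
// Module and port names are illustrative.
module signext(input  [15:0] a,
               output [31:0] y);
  // copy bit 15 into the upper half, keep the immediate in the lower half
  assign y = {{16{a[15]}}, a};
endmodule
```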

Single-Cycle Datapath: lw address
STEP 4: Compute the memory address
[Figure: the ALU adds SrcA (RD1) and SrcB (SignImm) to produce ALUResult, with ALUControl2:0 = 010. Note the control signal.]

Single-Cycle Datapath: lw memory read
STEP 5: Read data from memory and write it back to register file
[Figure: ALUResult addresses the Data Memory; ReadData returns to WD3 and Instr[20:16] drives A3. RegWrite = 1, ALUControl2:0 = 010.]
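
The register file used in steps 2 and 5 has two read ports and one write port. The sketch below is one common way to write it, under the assumptions that reads are combinational, the write happens on the rising clock edge when WE3 is asserted, and register 0 always reads as zero. The port names ra1, ra2, wa3, wd3, rd1, and rd2 are illustrative stand-ins for A1, A2, A3, WD3, RD1, and RD2 in the figure.

```verilog
// Three-ported register file: two combinational reads, one clocked write.
// Assumption: register 0 is hardwired to 0, as in MIPS.
module regfile(input         clk,
               input         we3,
               input  [4:0]  ra1, ra2, wa3,
               input  [31:0] wd3,
               output [31:0] rd1, rd2);
  reg [31:0] rf[31:0];

  // write port 3 on the rising edge when enabled
  always @(posedge clk)
    if (we3) rf[wa3] <= wd3;

  // read ports 1 and 2; reading register 0 returns 0
  assign rd1 = (ra1 != 0) ? rf[ra1] : 32'b0;
  assign rd2 = (ra2 != 0) ? rf[ra2] : 32'b0;
endmodule
```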

Single-Cycle Datapath: lw PC increment
STEP 6: Determine the address of the next instruction
[Figure: an adder computes PCPlus4 = PC + 4, which feeds PC'. ReadData is labeled Result at WD3. RegWrite = 1, ALUControl2:0 = 010.]
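
Step 6 needs only a resettable register and an adder. The sketch below assumes helper modules named flopr and adder and a wrapper named pclogic; all three names are chosen for illustration, and the asynchronous reset to 0 is an assumption.

```verilog
// Resettable register, used here for the PC.
module flopr #(parameter WIDTH = 8)
              (input                  clk, reset,
               input      [WIDTH-1:0] d,
               output reg [WIDTH-1:0] q);
  always @(posedge clk, posedge reset)
    if (reset) q <= 0;
    else       q <= d;
endmodule

// 32-bit adder
module adder(input  [31:0] a, b,
             output [31:0] y);
  assign y = a + b;
endmodule

// PC' = PC + 4 (no branches or jumps yet)
module pclogic(input         clk, reset,
               output [31:0] pc);
  wire [31:0] pcplus4;

  flopr #(32) pcreg(clk, reset, pcplus4, pc);
  adder       pcadd1(pc, 32'd4, pcplus4);
endmodule
```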

Let’s be Clear
• Although the slides said “STEP” …
• … all that stuff executed in one cycle!!!
• Let’s look at sw and then R-type

Single-Cycle Datapath: sw, write back
Write data in rt to memory
[Figure: Instr[20:16] drives A2; RD2 is WriteData into the Data Memory WD input. MemWrite = 1, RegWrite = 0, ALUControl2:0 = 010.]

Single-Cycle Datapath: R-type instr
• Read from rs and rt
• Write ALUResult to register file
• Write to rd (instead of rt)
[Figure: three muxes are added. ALUSrc selects SrcB from RD2 or SignImm, RegDst selects WriteReg4:0 from Instr[20:16] or Instr[15:11], and MemtoReg selects Result from ALUResult or ReadData. RegWrite = 1, RegDst = 1, ALUSrc = 0, ALUControl2:0 varies, MemWrite = 0, MemtoReg = 0.]
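
The three muxes added for R-type instructions can be sketched with one parameterized 2:1 mux. The module names mux2 and rtype_muxes and the port names are illustrative. The select polarity follows the control values used in these slides: RegDst = 1 picks rd, ALUSrc = 1 picks SignImm, and MemtoReg = 1 picks ReadData.

```verilog
// Parameterized 2:1 mux: y = d1 when s is 1, else d0.
module mux2 #(parameter WIDTH = 8)
             (input  [WIDTH-1:0] d0, d1,
              input              s,
              output [WIDTH-1:0] y);
  assign y = s ? d1 : d0;
endmodule

// The three datapath muxes introduced for R-type support.
module rtype_muxes(input  [31:0] instr, rd2, signimm,
                   input  [31:0] aluresult, readdata,
                   input         regdst, alusrc, memtoreg,
                   output [4:0]  writereg,
                   output [31:0] srcb, result);
  // destination register: rt (20:16) or rd (15:11)
  mux2 #(5)  wrmux(instr[20:16], instr[15:11], regdst, writereg);
  // second ALU operand: register value or sign-extended immediate
  mux2 #(32) srcbmux(rd2, signimm, alusrc, srcb);
  // value written back: ALU result or data memory output
  mux2 #(32) resmux(aluresult, readdata, memtoreg, result);
endmodule
```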

Single-Cycle Datapath: beq
• Determine whether values in rs and rt are equal
• Calculate branch target address: BTA = (sign-extended immediate << 2) + (PC+4)
[Figure: SignImm is shifted left by 2 and added to PCPlus4 to form PCBranch; a mux controlled by PCSrc selects PCBranch or PCPlus4 for PC'. RegWrite = 0, RegDst = x, ALUSrc = 0, ALUControl2:0 = 110, Branch = 1, MemWrite = 0, MemtoReg = x.]
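
The branch target calculation and the PC mux can be sketched in a few lines. The module name branchtarget and its port names are illustrative; the PCSrc expression is the same one the controller uses (branch & zero).

```verilog
// Next-PC selection for beq.
// BTA = (SignImm << 2) + (PC + 4); taken only when Branch and Zero are both 1.
module branchtarget(input  [31:0] pcplus4, signimm,
                    input         branch, zero,
                    output [31:0] pcnext);
  wire [31:0] signimmsh = {signimm[29:0], 2'b00};  // shift left by 2
  wire [31:0] pcbranch  = pcplus4 + signimmsh;     // branch target address
  wire        pcsrc     = branch & zero;           // take the branch?

  assign pcnext = pcsrc ? pcbranch : pcplus4;
endmodule
```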

Complete Single-Cycle Processor (w/control)
[Figure: the datapath with a Control Unit added. Op = Instr[31:26] and Funct = Instr[5:0] go in; MemtoReg, MemWrite, Branch, ALUControl2:0, ALUSrc, RegDst, and RegWrite come out. PCSrc is derived from Branch and Zero.]
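
Control Unit
[Figure: the Control Unit contains a Main Decoder and an ALU Decoder. The Main Decoder takes Opcode5:0 and produces MemtoReg, MemWrite, Branch, ALUSrc, RegDst, RegWrite, and ALUOp1:0. The ALU Decoder takes Funct5:0 and ALUOp1:0 and produces ALUControl2:0.]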

Review: ALU
[Figure: ALU symbol with N-bit inputs A and B, 3-bit function select F, and N-bit output Y]
F2:0 Function
000 A & B
001 A | B
010 A + B
011 not used
100 A & ~B
101 A | ~B
110 A - B
111 SLT
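
Review: ALU Design
[Figure: ALU built from an N-bit adder and muxes. F2 selects B or ~B and supplies the adder carry in; F1:0 selects among the AND, OR, sum, and zero-extended sum bit [N-1] (for SLT) to produce Y.]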
F2:0 Function
000 A & B
001 A | B
010 A + B
011 not used
100 A & ~B
101 A | ~B
110 A - B
111 SLT
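
The function table above maps directly onto a small Verilog module. This is a sketch assuming 32-bit operands and a zero output; the module name alu and signal names are illustrative. As in the design shown, SLT takes the sign bit of A - B and does not correct for overflow.

```verilog
// 32-bit ALU following the F2:0 function table.
// F[2] inverts B and supplies the carry in; F[1:0] picks the result.
module alu(input      [31:0] a, b,
           input      [2:0]  f,
           output reg [31:0] y,
           output            zero);
  wire [31:0] bb  = f[2] ? ~b : b;     // B or ~B
  wire [31:0] sum = a + bb + f[2];     // A + B, or A - B when F[2] = 1

  always @(*)
    case (f[1:0])
      2'b00: y = a & bb;               // 000: A & B,  100: A & ~B
      2'b01: y = a | bb;               // 001: A | B,  101: A | ~B
      2'b10: y = sum;                  // 010: A + B,  110: A - B
      2'b11: y = {31'b0, sum[31]};     // 111: SLT (011 is not used)
    endcase

  assign zero = (y == 32'b0);
endmodule
```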

Review: R-Type
R-Type
op      rs      rt      rd      shamt   funct
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
• Fields:
  – op: the operation code or opcode (0 for R-type instructions)
  – funct: the function
  – together, the opcode and function tell the computer what operation to perform

Controller (2 modules)
module controller(input  [5:0] op, funct,
                  input        zero,
                  output       memtoreg, memwrite,
                  output       pcsrc, alusrc,
                  output       regdst, regwrite,
                  output       jump,
                  output [2:0] alucontrol);

  wire [1:0] aluop;
  wire       branch;

  maindec md(op, memtoreg, memwrite, branch,
             alusrc, regdst, regwrite, jump,
             aluop);
  aludec  ad(funct, aluop, alucontrol);

  assign pcsrc = branch & zero;
endmodule

Main Decoder
module maindec(input  [5:0] op,
               output       memtoreg, memwrite, branch, alusrc,
               output       regdst, regwrite, jump,
               output [1:0] aluop);

  reg [8:0] controls;

  assign {regwrite, regdst, alusrc,
          branch, memwrite,
          memtoreg, jump, aluop} = controls;

  always @(*)
    case(op)
      6'b000000: controls <= 9'b110000010; //Rtype
      6'b100011: controls <= 9'b101001000; //LW
      6'b101011: controls <= 9'b001010000; //SW
      6'b000100: controls <= 9'b000100001; //BEQ
      6'b001000: controls <= 9'b101000000; //ADDI
      6'b000010: controls <= 9'b000000100; //J
      default:   controls <= 9'bxxxxxxxxx; //???
    endcase
endmodule
Why do this?

ALU Decoder
module aludec(input      [5:0] funct,
              input      [1:0] aluop,
              output reg [2:0] alucontrol);

  always @(*)
    case(aluop)
      2'b00: alucontrol <= 3'b010;  // add
      2'b01: alucontrol <= 3'b110;  // sub
      default: case(funct)          // RTYPE
          6'b100000: alucontrol <= 3'b010; // ADD
          6'b100010: alucontrol <= 3'b110; // SUB
          6'b100100: alucontrol <= 3'b000; // AND
          6'b100101: alucontrol <= 3'b001; // OR
          6'b101010: alucontrol <= 3'b111; // SLT
          default:   alucontrol <= 3'bxxx; // ???
        endcase
    endcase
endmodule

Control Unit: ALU Decoder
ALUOp1:0  Meaning
00        Add
01        Subtract
10        Look at Funct
11        Not Used

ALUOp1:0  Funct         ALUControl2:0
00        X             010 (Add)
X1        X             110 (Subtract)
1X        100000 (add)  010 (Add)
1X        100010 (sub)  110 (Subtract)
1X        100100 (and)  000 (And)
1X        100101 (or)   001 (Or)
1X        101010 (slt)  111 (SLT)

Control Unit: Main Decoder
Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
[Figure: complete single-cycle processor with control unit]

Single-Cycle Datapath Example: or
[Figure: complete single-cycle processor annotated with the control signal values for an or instruction: RegWrite = 1, RegDst = 1, ALUSrc = 0, Branch = 0, MemWrite = 0, MemtoReg = 0, ALUControl2:0 = 001]

Extended Functionality: addi
No change to datapath
[Figure: complete single-cycle processor, unchanged]

Control Unit: addi
Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
addi         001000  1         0       1       0       0         0         00
[Figure: complete single-cycle processor with control unit]

Extended Functionality: j
[Figure: datapath extended for j. PCJump is formed from PCPlus4[31:28] and Instr[25:0] shifted left by 2 (bits 27:0); a second mux controlled by Jump selects PCJump for PC'.]
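
The jump target and the extra PC mux can be sketched as follows. The module name jumpmux and its port names are illustrative; pcnextbr stands for the output of the existing branch mux.

```verilog
// Next-PC selection with j support.
// PCJump = {PCPlus4[31:28], Instr[25:0], 2'b00}
module jumpmux(input  [31:0] pcplus4,
               input  [31:0] pcnextbr,  // output of the branch (PCSrc) mux
               input  [25:0] jaddr,     // Instr[25:0]
               input         jump,
               output [31:0] pcnext);
  wire [31:0] pcjump = {pcplus4[31:28], jaddr, 2'b00};

  assign pcnext = jump ? pcjump : pcnextbr;
endmodule
```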

Control Unit: Main Decoder
Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0  Jump
R-type       000000  1         1       0       0       0         0         10        0
lw           100011  1         0       1       0       0         1         00        0
sw           101011  0         X       1       0       1         X         00        0
beq          000100  0         X       0       1       0         X         01        0
j            000010  0         X       X       X       0         X         XX        1
[Figure: complete single-cycle processor with the Jump mux and PCJump path]

Review: Processor Performance
Program Execution Time = (# instructions)(cycles/instruction)(seconds/cycle)
= # instructions x CPI x TC

Single-Cycle Performance
TC is limited by the critical path (lw)
[Figure: complete single-cycle processor annotated with the control signal values for lw]

Single-Cycle Performance
• Single-cycle critical path:
  Tc = tpcq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup
• In most implementations, limiting paths are:
  – memory, ALU, register file.
  – Tc = tpcq_PC + 2tmem + tRFread + tALU + tRFsetup + tmux
[Figure: complete single-cycle processor annotated with the control signal values for lw]

Single-Cycle Performance Example
Tc = tpcq_PC + 2tmem + tRFread + tmux + tALU + tRFsetup
   = [30 + 2(250) + 150 + 25 + 200 + 20] ps = 925 ps

Single-Cycle Performance Example
For a program with 100 billion instructions executing on a single-cycle MIPS processor,
Execution Time = # instructions x CPI x TC
               = (100 × 10^9)(1)(925 × 10^-12 s) = 92.5 seconds

Any potential problems?
• How do our Block RAMs differ from the RAM illustrated here?
• Do we want a Harvard architecture?
  – instruction memory and data memory are separate

Next Time
• We’ll look at multi-cycle MIPS
• Adding functionality to our design