Ch.4 RTL Design
Standard Cell Design
TAIST ICTES Program
VLSI Design Methodology
Hiroaki Kunieda
Tokyo Institute of Technology
4.1 Basic Components
[Design flow figure. Steps: Logic Design, Functional Verification, Logic Synthesis, Scan Path Design, Timing Analysis, Functional Verification. Artifacts and tools: RTL, RTL Simulation, Synthesis Netlist, Scan Netlist.]
VerilogHDL I
Behavior Level
This level describes a system by concurrent algorithms (behavioral). Each algorithm is itself sequential: it consists of a set of instructions that are executed one after the other. Functions, tasks, and always blocks are the main elements. The structural realization of the design is not considered.
RTL Level (Structural Level)
Designs at the Register-Transfer Level specify the characteristics of a circuit by operations and by the transfer of data between registers. An explicit clock is used. An RTL design contains exact timing bounds: operations are scheduled to occur at certain times. A common modern definition is "Any code that is synthesizable is called RTL code".
Gate Level
At the logic level, the characteristics of a system are described by logical links and their timing properties. All signals are discrete and can only take definite logical values ('0', '1', 'X', 'Z'). The usable operations are predefined logic primitives (AND, OR, NOT, and similar gates). Writing gate-level models by hand is rarely a good idea at any level of logic design. Gate-level code is normally generated by tools such as synthesis tools, and the resulting netlist is used for gate-level simulation and for the back end.
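To make the three levels concrete, the following sketch describes the same 2-to-1 selector three times. The module names (mux_behavioral, mux_dataflow, mux_gate) are hypothetical and chosen only for this illustration, and the dataflow version stands in for synthesizable RTL-style code even though this small circuit has no clock.

```verilog
// Behavioral: a procedural algorithm; no structure is implied.
module mux_behavioral(f, a, b, sel);
  output f;
  input  a, b, sel;
  reg    f;
  always @(a or b or sel)
    if (sel) f = b;
    else     f = a;
endmodule

// Synthesizable dataflow (RTL-style): one continuous assignment.
module mux_dataflow(f, a, b, sel);
  output f;
  input  a, b, sel;
  assign f = sel ? b : a;
endmodule

// Gate level: only predefined primitives, as a synthesis tool might emit.
module mux_gate(f, a, b, sel);
  output f;
  input  a, b, sel;
  wire   not_sel, f1, f2;
  not g0(not_sel, sel);
  and g1(f1, a, not_sel);
  and g2(f2, b, sel);
  or  g3(f, f1, f2);
endmodule
```

All three select a when sel is 0 and b when sel is 1; only the amount of structural detail differs.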
VerilogHDL II
reg: memory elements. Assigned in an "always" statement (<=, =).
wire: signal wire in modules. Assigned in an "assign" statement.
= Blocking assignment: takes effect immediately, so later statements see the updated value and execute sequentially.
  a = b; c = a; // c ends up equal to b
<= Non-blocking assignment: all right-hand sides are sampled first, and the targets change together at the clock edge.
  a <= b; c <= a; // c and a behave as a shift register
Signal levels: x, 0, 1, z
Signal strengths: supply, strong, pull, large, weak, medium, small, highz
parameter: used to set constants such as the bit size.
assign #10 x = a & b; // assign after 10 time units (10 ns with a 1 ns time unit)
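The items above can be seen together in one small module. This is a minimal sketch, not taken from the slides: the module name and_reg and its ports are hypothetical, and the `timescale directive is an assumption so that #10 means 10 ns.

```verilog
`timescale 1ns/1ns
module and_reg(q, a, b, clk);
  parameter size = 4;            // parameter decides the bit size
  input  [size-1:0] a, b;
  input  clk;
  output [size-1:0] q;
  wire   [size-1:0] x;           // wire: driven by an assign statement
  reg    [size-1:0] q;           // reg: assigned inside an always block

  assign #10 x = a & b;          // x follows a & b after 10 ns

  always @(posedge clk)
    q <= x;                      // non-blocking: q updates at the clock edge
endmodule
```

An instance with a different width can be created with a parameter override, for example and_reg #(8) u1(q8, a8, b8, clk), where q8, a8, and b8 would be 8-bit signals in the enclosing module.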
VerilogHDL III
initial begin
  a = 1'b0;      // a=0 at t=0
  #10 a = 1'b1;  // a=1 at t=10
  #20 a = 1'b0;  // a=0 at t=30 (delays accumulate)
end

reg out;
wire a, b, sel;
always @(a or b or sel)
  if (sel == 1'b1) out = a;
  else if (sel == 1'b0) out = b;
  else out = 1'bx;

Note: the left-hand side of an assignment inside a procedural block must be a reg.
Behavior Description with Procedural Blocks
initial: executes once
always: executes repeatedly
VerilogHDL IV
Blocking
always @(posedge clock) // q and qr are updated in order
begin
  q = d;
  qr = ~d;
end

Non Blocking
// exchange a and b at the positive edge of the clock
always @(posedge clock)
begin
  a <= b;
  b <= a;
end // with a=b; b=a; both a and b would end up with the old value of b
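The difference can be checked in simulation. The following testbench is a minimal sketch (the module name swap_demo and the registers c and d are introduced here for illustration) that applies one rising clock edge to both assignment styles.

```verilog
module swap_demo;
  reg clock;
  reg [3:0] a, b;   // updated with non-blocking assignments
  reg [3:0] c, d;   // updated with blocking assignments

  always @(posedge clock) begin
    a <= b;         // both right-hand sides are sampled first,
    b <= a;         // so a and b exchange values
  end

  always @(posedge clock) begin
    c = d;          // c changes immediately,
    d = c;          // so d is assigned the new c, which is the old d
  end

  initial begin
    clock = 0;
    a = 4'd1; b = 4'd2;
    c = 4'd1; d = 4'd2;
    #5 clock = 1;   // one rising edge
    #1 $display("non-blocking: a=%0d b=%0d", a, b);  // a=2 b=1
       $display("blocking:     c=%0d d=%0d", c, d);  // c=2 d=2
    $finish;
  end
endmodule
```

The non-blocking pair swaps, while the blocking pair ends with both registers holding the old value of d.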
VerilogHDL V
function [7:0] sign_extend;
  input [3:0] a;
  if (a[3]) sign_extend = {4'b1111, a};
  else      sign_extend = {4'b0000, a};
endfunction

x <= sign_extend(a);

task sign_extend;
  input [3:0] a;
  output [7:0] x;
  if (a[3]) x = {4'b1111, a};
  else      x = {4'b0000, a};
endtask

sign_extend(a, x);

Tasks exist in most programming languages, where they are generally known as procedures or subroutines. The lines of code are enclosed between task and endtask. Data is passed to the task, the processing is done, and the result is returned. A task has to be called explicitly, with its data inputs and outputs, rather than being wired into the general netlist. Because tasks are included in the main body of code, they can be called many times, which reduces code repetition.
A Verilog HDL function is similar to a task, with a few differences: a function returns exactly one value and cannot contain delays.
Concatenation: the 8-bit result of sign_extend is made by concatenating two 4-bit values.
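As a variation on the sign_extend function above, the same result can be obtained with the replication operator inside a concatenation. This is a sketch under that substitution; the testbench module name sign_extend_demo is hypothetical.

```verilog
module sign_extend_demo;
  reg [3:0] a;
  reg [7:0] x;

  // Same result as the if/else version, written with replication.
  function [7:0] sign_extend;
    input [3:0] a;
    sign_extend = {{4{a[3]}}, a};   // four copies of the sign bit, then a
  endfunction

  initial begin
    a = 4'b0101; x = sign_extend(a);
    $display("%b -> %b", a, x);     // 0101 -> 00000101
    a = 4'b1010; x = sign_extend(a);
    $display("%b -> %b", a, x);     // 1010 -> 11111010
  end
endmodule
```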
EXOR Gates with Delay
module hard_eor(c, a, b);
  output c;
  input a, b;
  wire d, e, f;

  nand #4 g1(d, a, b);
  nand #4 g2(e, a, d);
  nand #4 g3(f, b, d);
  nand #8 g4(c, e, f);
endmodule
Multiplexer

module mux(f, a, b, sel);
  output f;
  input a, b, sel;
  wire not_sel, f1, f2;

  and g1(f1, a, not_sel),
      g2(f2, b, sel);
  or  g3(f, f1, f2);
  not g4(not_sel, sel);
endmodule
Decoder
module decoder(data_in, data_out);
  input [1:0] data_in;
  output [3:0] data_out;
  reg [3:0] data_out; // assigned in an always block, so it must be a reg

  always @(data_in) begin
    case (data_in)
      2'b00: data_out = 4'b0001;
      2'b01: data_out = 4'b0010;
      2'b10: data_out = 4'b0100;
      2'b11: data_out = 4'b1000;
      default: data_out = 4'bxxxx; // any case not listed above
    endcase
  end
endmodule
Priority Encoder
module encoder(data_in, data_out);
  input [3:0] data_in;
  output [1:0] data_out;
  reg [1:0] data_out; // assigned in an always block, so it must be a reg

  always @(data_in) begin
    casez (data_in) // casez treats ? as a don't-care bit
      4'b0001: data_out = 2'b00;
      4'b001?: data_out = 2'b01;
      4'b01??: data_out = 2'b10;
      4'b1???: data_out = 2'b11;
      default: data_out = 2'bxx; // any case not listed above
    endcase
  end
endmodule
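A plain case statement compares all four signal values exactly, so an x in a case item matches only a literal x on the input. Don't-care matching needs casez (or casex). The following self-contained sketch shows this with a priority encoder and a small testbench; the module names prio_encoder and prio_encoder_tb are hypothetical.

```verilog
module prio_encoder(data_in, data_out);
  input  [3:0] data_in;
  output [1:0] data_out;
  reg    [1:0] data_out;

  always @(data_in)
    casez (data_in)                // ? matches 0, 1, or z
      4'b1???: data_out = 2'b11;   // highest set bit wins
      4'b01??: data_out = 2'b10;
      4'b001?: data_out = 2'b01;
      4'b0001: data_out = 2'b00;
      default: data_out = 2'bxx;   // no input bit set
    endcase
endmodule

module prio_encoder_tb;
  reg  [3:0] in;
  wire [1:0] out;
  prio_encoder dut(in, out);

  initial begin
    in = 4'b0001; #1 $display("%b -> %b", in, out);  // 0001 -> 00
    in = 4'b0110; #1 $display("%b -> %b", in, out);  // 0110 -> 10
    in = 4'b1011; #1 $display("%b -> %b", in, out);  // 1011 -> 11
  end
endmodule
```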
Adder (structure description)
module adder(sum, a, b);
  output sum;
  input a, b;
  wire [1:0] a, b;
  wire [2:0] sum;
  wire c;

  half_adder ha1(c, sum[0], a[0], b[0]);
  full_adder fa1(sum[2], sum[1], a[1], b[1], c);
endmodule
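The structural adder relies on half_adder and full_adder modules that the slides do not show. A minimal sketch of what they could look like follows; the port orders (carry, sum, a, b) and (cout, sum, a, b, cin) are assumptions inferred from the instantiations above.

```verilog
// Port order (carry, sum, a, b) is assumed from the instantiation.
module half_adder(carry, sum, a, b);
  output carry, sum;
  input  a, b;
  assign sum   = a ^ b;
  assign carry = a & b;
endmodule

// Port order (cout, sum, a, b, cin) is likewise an assumption.
module full_adder(cout, sum, a, b, cin);
  output cout, sum;
  input  a, b, cin;
  assign sum  = a ^ b ^ cin;
  assign cout = (a & b) | (cin & (a ^ b));
endmodule
```

With these definitions, the carry out of bit 0 feeds the full adder for bit 1, and its carry out becomes sum[2].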
Adder (behavior description)
module adder(sum, a, b);
  parameter size=12, delay=8;
  input [size-1:0] a, b;
  output [size-1:0] sum;
  reg [size-1:0] sum;

  always @(a or b)
    #delay sum = a + b;
endmodule
ALU (Arithmetic and Logic Unit)
module alu(out, in_a, in_b, cntrl);
  parameter size=8;
  input in_a, in_b, cntrl;
  output out;
  wire [size-1:0] in_a, in_b;
  reg [size-1:0] out;
  wire [5:0] cntrl;

  always @(cntrl or in_a or in_b) begin
    case (cntrl)
      6'b000010: out = ~in_a;
      6'b000110: out = ~(in_a | in_b);
      6'b001010: out = (~in_a) & in_b;
      6'b001110: out = 0;
      6'b010010: out = ~(in_a & in_b);
      6'b010110: out = ~in_b;
      6'b101110: out = in_a & in_b;
      6'b110010: out = ~0; // all ones
      6'b110110: out = in_a | (~in_b);
      6'b111010: out = in_a | in_b;
      6'b111110: out = in_a;
      default: out = {size{1'bx}};
    endcase
  end
endmodule
Register
module register(data_out, data_in, load, resetn, clk);
  parameter size=16;
  input data_in, load, resetn, clk;
  output data_out;
  wire [size-1:0] data_in;
  reg [size-1:0] data_out;
  wire resetn, load, clk;

  always @(posedge clk or negedge resetn) begin
    if (~resetn) data_out <= 0;
    else if (load) data_out <= data_in;
  end
endmodule
Counter_Register

module counter_register(data_out, data_in, load, inc, resetn, clk);
  parameter size=16;
  input data_in, load, inc, resetn, clk;
  output data_out;
  wire [size-1:0] data_in;
  reg [size-1:0] data_out;
  wire resetn, load, inc, clk;

  always @(posedge clk or negedge resetn) begin
    if (~resetn) data_out <= 0;
    else if (load) data_out <= data_in;
    else if (inc) data_out <= data_out + 1;
  end
endmodule
Tristate Buffer (Bus driver)

module tristate_buffer(data_out, data_in, enable);
  parameter size=16;
  input data_in, enable;
  output data_out;
  wire [size-1:0] data_in;
  reg [size-1:0] data_out;
  wire enable;

  always @(data_in or enable) begin
    if (enable == 1) data_out = data_in;
    else if (enable == 0) data_out = 'bz;
    else data_out = 'bx;
  end
endmodule
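A tristate driver matters once several sources share one bus. The following sketch (module name bus_demo and all signal names are hypothetical) uses the continuous-assignment form of a tristate driver, which is an alternative to the always-block form above, and lets two registers take turns on an 8-bit bus.

```verilog
module bus_demo;
  reg  [7:0] reg_a, reg_b;
  reg        en_a, en_b;
  wire [7:0] bus;                    // shared bus, resolved from all drivers

  // Each driver either drives its data or releases the bus (high impedance).
  assign bus = en_a ? reg_a : 8'bz;
  assign bus = en_b ? reg_b : 8'bz;

  initial begin
    reg_a = 8'h12; reg_b = 8'h34;
    en_a = 1; en_b = 0; #1 $display("bus=%h", bus);  // bus=12
    en_a = 0; en_b = 1; #1 $display("bus=%h", bus);  // bus=34
    en_a = 0; en_b = 0; #1 $display("bus=%h", bus);  // bus=zz
    en_a = 1; en_b = 1; #1 $display("bus=%h", bus);  // contention: conflicting bits become x
  end
endmodule
```

At most one enable should be active at a time; the last step shows why, since bits driven to different values resolve to x.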
State Machine

parameter s0=2'b00, s1=2'b01, s2=2'b11, s3=2'b10;

// "in" and "out" stand for the machine's input and output signals,
// because input and output are reserved words in Verilog.
always @(posedge clock)
  current_state <= next_state;

always @(current_state or in) begin
  case (current_state)
    s0: next_state = (in[0]) ? s1 : s0;
    s1: next_state = (in[1]) ? s2 : s0;
    s2: next_state = s3;
    s3: next_state = s0;
    default: next_state = s0;
  endcase
end

always @(current_state) begin
  case (current_state)
    s0: out = 0;
    s1: out = 0;
    s2: out = 0;
    s3: out = 1;
    default: out = 0;
  endcase
end
4.2 Processor Example
Data Path 1

module datapath1(InputA, OutputB, loadA, loadB, clk);
  input InputA, loadA, loadB, clk;
  output OutputB;
  wire [7:0] InputA;
  wire loadA, loadB, clk;
  reg [7:0] OutputA, OutputB;

  always @(posedge clk) begin
    if (loadA == 1) OutputA <= InputA;
    if (loadB == 1) OutputB <= OutputA;
  end
endmodule
Controller (State Machine)

module controller(start, Input, loadA, loadB, clk);
  input start, Input, clk;
  output loadA, loadB;
  reg loadA, loadB;
  reg [2:0] current_state, next_state;
  parameter S0=3'b000, S1=3'b010, S2=3'b100;

  always @(posedge clk)
    current_state <= next_state;

  always @(current_state or Input) begin
    case (current_state)
      S0: next_state = (Input) ? S1 : S0;
      S1: next_state = S2; // register A loaded, load register B next
      S2: next_state = S0;
      default: next_state = S0;
    endcase
  end

  always @(current_state) begin
    loadA = 0; loadB = 0; // default
    case (current_state)
      S1: loadA = 1;
      S2: loadB = 1;
    endcase
  end
endmodule
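Architecture of Micro Processor
[Block diagram. Blocks: Memory, AR, DR, PC, IR, AC, ALU, INPR, OUTR, Decoder. Status flags: V, C, S, Z. Buses: Address_Bus, Data_Bus. The Decoder produces the control words F1, F2, F3 and receives the status.]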
Computer System
module CPU(resetn, clk);
  input resetn, clk;
  wire [12:0] A_bus;
  wire [15:0] D_bus;
  wire [7:0] cntrl1, cntrl2, cntrl3;
  wire CEn, WEn, OEn;

  data_path dp1(A_bus, D_bus, cntrl1, cntrl2, cntrl3, resetn, clk);
  memory sram1(A_bus, CEn, WEn, OEn, D_bus);
  controller cntl1(cntrl1, cntrl2, cntrl3, resetn, clk);
endmodule
Data Path I
module data_path(A_bus, D_bus, cntrl1, cntrl2, cntrl3, resetn, clk);
  input cntrl1, cntrl2, cntrl3, resetn, clk;
  inout A_bus, D_bus;
  wire [12:0] A_bus;
  wire [15:0] D_bus;
  wire [7:0] cntrl1, cntrl2, cntrl3;
  wire resetn, clk;
  reg [15:0] AC_out, IR_out;
  reg [11:0] PC_out;
  reg [7:0] INPR_out, OUTR_out;
  wire [15:0] ALU_out, IR_in;
  wire [11:0] PC_in;
  wire [7:0] INPR_in, OUTR_in;
Data Path II
always // Control Circuits
begin
AC_in=ALU_out;
ld_PC=
tbuff_PC=
inc_PC=
ld_IR=
tbuff_IR=
op_ALU=
ld_AC=
tbuff_AC=
tbuff_INPR=
ld_OUTR=
CEn=
OEn=
WEn=;
end
Data Path III

  RAM32 ram1(ABUS, CEn, WEn, OEn, DBUS);
  alu alu1(ALU_out, AC_out, D_bus, c_ALU);
  register #(16) AC1(AC_out, AC_in, ld_AC, resetn, clk);
  tristate_buffer #(16) AC_buffer1(D_bus, AC_out, tbuff_AC);
  counter_register #(12) PC1(PC_out, PC_in, ld_PC, inc_PC, resetn, clk);
  tristate_buffer #(12) PC_buffer1(D_bus, PC_out, tbuff_PCDBUS);
  tristate_buffer #(12) PC_buffer2(A_bus, PC_out, tbuff_PCABUS);
  register #(16) IR1(IR_out, D_bus, ld_IR, resetn, clk);
  tristate_buffer #(16) IR_buffer1(D_bus, IR_out, tbuff_IRDBUS);
  tristate_buffer #(12) IR_buffer2(A_bus, IR_out[11:0], tbuff_IRABUS);
  register #(8) INPR(INPR_out, INPR_in, ld_INPR, resetn, clk);
  tristate_buffer #(8) INPR_buffer(D_bus, INPR_out, tbuff_INPR);
  register #(8) OUTR(OUTR_out, D_bus, ld_OUTR, resetn, clk);
  tristate_buffer #(8) OUTR_buffer(D_bus, OUTR_out, tbuff_OUTR);
endmodule
ALU

module alu(out, a, b, c_alu);
  parameter size=8;
  input a, b, c_alu;
  output out;
  wire [size-1:0] a, b;
  reg [size-1:0] out;
  wire [2:0] c_alu;

  always @(c_alu or a or b) begin
    case (c_alu)
      3'b000: out = a;            // transfer
      3'b001: out = a + 1;        // increment
      3'b010: out = a + b;        // add
      3'b011: out = a + (~b) + 1; // subtract
      3'b100: out = b;            // load
      3'b101: out = a & b;        // and
      3'b110: out = a + (~b) + 1; // subtract
      3'b111: out = (~a);         // complement
      default: out = {size{1'bx}};
    endcase
  end
endmodule
4.3 Memory
SRAM read cycle (timing diagram): CEn = OEn = 0
SRAM write cycle (timing diagrams): WEn controlled, CEn controlled
Asynchronous SRAM I

module RAM32 (Adr, CEn, WEn, OEn, DQ);
  input [25:2] Adr;     // External memory address
  inout [31:0] DQ;      // External memory data I/O
  input CEn;            // Chip enable
  input WEn;            // Write enable
  input OEn;            // Output enable

  `define RAMDEPTH 1024 // Memory depth in K words
  reg [31:0] Ram [0:((`RAMDEPTH * 1024) - 1)]; // Memory register array
  reg PosedgeWEn;       // Rising edge of write enable
  reg [23:0] Adr_Latch; // Latched address during writes
  reg [31:0] TRI_DQ;    // Tri-state data out

  always @(posedge WEn) // Detects the rising edge of WEn
  begin
    PosedgeWEn = 1'b1;
    #5;
    PosedgeWEn = 1'b0;
  end

  // Read cycle: CEn=OEn=0, WEn=1
  always @(CEn or WEn or OEn or Adr or PosedgeWEn)
  begin
    if (~CEn & ~OEn & WEn)
      TRI_DQ = Ram[Adr];
    else if (~CEn & ~WEn)
    begin
      Adr_Latch = Adr;  // Latch address at start of write
      TRI_DQ = 32'bz;
    end

Asynchronous SRAM II

    else if (PosedgeWEn)
    begin
      Ram[Adr_Latch] = DQ;
      PosedgeWEn = #1 1'b0; // Delay added so that the pulse shows up on the waveform view
    end
    else
      TRI_DQ = 32'bz;
  end

  assign #2 DQ = TRI_DQ;
endmodule
4.4 State Machine
Control Circuit (State Machine Type)
[Block diagram. Blocks: 3-bit Counter, IR, Combinational Logic, Decoder. Status inputs: S, Z, ~FGI, ~FGO. Outputs (control words): CF1[7:0], CF2[7:0], CF3[7:0].]
Controller (State Machine)

module controller( ... );
  parameter T0=4'b0000, T1=4'b0001, T2=4'b0010, T3=4'b0011,
            T4=4'b0100, T5=4'b0101, T6=4'b0110, T7=4'b0111;

  always @(posedge clock)
    current_state <= next_state;

  always @(current_state or S or T) begin
    case (current_state)
      T0: next_state = (S) ? T1 : T0;
      T1: next_state = T2;
      T2: next_state = T3;
      T3: next_state = (T) ? T4 : T0;
      T4: next_state = (T) ? T5 : T0;
      T5: next_state = (T) ? T6 : T0;
      T6: next_state = (T) ? T7 : T0;
      T7: next_state = T0;
      default: next_state = T0;
    endcase
  end
State Machine III (output)
In the output equations, Tn stands for the condition that the controller is in state Tn.

  CF1[1] <= T3 & AI[2];
  CF1[2] <= T5 & MI[1];
  CF1[3] <= (T5 & MI[2]) | (T5 & MI[6]);
  CF1[4] <= T5 & MI[3];
  CF1[5] <= T5 & MI[3];
  CF1[6] <= T3 & AI[1];
  CF1[7] <= T3 & AI[0];

State Machine IV (output)

  CF2[1] <= (T3 & MIALL) | T2;
  CF2[2] <= (T4 & MI[0]) | (T4 & MI[0]) | (T4 & MI[2]) | (T4 & MI[3]) | (T4 & MI[4]) | (T4 & MI[6]);
  CF2[3] <= T1;
  CF2[4] <= T3 & IO[5];
  CF2[5] <= T5a & MI[5];
  CF3[1] <= (T3 & (~FGI) & IO[2]) | (T3 & (~FGO) & IO[3]) | (S & T3 & AI[3]) | (Z & T3 & AI[4]) | (T6 & Z & MI[6]) | T1;
  CF3[2] <= T4 & MI[5];
  CF3[4] <= T3 & IO[0];
  CF3[5] <= T3 & IO[1];
  CF3[6] <= T0;
  CF3[7] <= T3 & IO[4];
  T <= ((M[1] | M[2] | M[3] | M[4] | M[5]) & T5) | (M[6] & T6) | (AIALL & T3) | (IOALL & T3);
end
endmodule
4.5 DMA Controller
[Block diagram. Blocks: DMA Controller, Micro Processor, Memory, I/O Unit.]
DMA stands for Direct Memory Access. The I/O unit accesses memory directly while the microprocessor is idle.
DMA memory to I/O
[Timing diagram. Signals: CLOCK, DMA_REQ (input), HOLD_REQ, HOLD_ACK (input), Dbout[7:0], DMA_ACK, ADR_EN, ADR_STB, EOP_Inn (input, end of operation), IOR_OUTn. The diagram labels the states S0 to S5 along the clock and marks two valid-data intervals on Dbout[7:0].]
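State Diagram
[State diagram. S0 stays in S0 while DMA_REQ=0 and moves to S1 when DMA_REQ=1. S1 stays in S1 while HOLD_ACK=0 and moves to S2 when HOLD_ACK=1. S2, S3, S4, and S5 follow in sequence. From S5 the machine returns to S3 when EOP_INn=1 and to S0 when EOP_INn=0.]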
State Diagram (outputs)

Current_state  HOLD_REQ  ADR_EN  ADR_STB  DMA_ACK  IOR_OUTn  Dbout_STB
S0             0         0       0        0        1         0
S1             1         0       0        0        1         0
S2             1         1       1        0        1         0
S3             1         1       1        1        1         0
S4             1         1       0        1        0         1
S5             1         1       0        1        0         1
DMA_REQ  HOLD_ACK  EOP_Inn  Current_state  Next_state
0        *         *        S0             S0
1        *         *        S0             S1
*        0         *        S1             S1
*        1         *        S1             S2
*        *         *        S2             S3
*        *         *        S3             S4
*        *         *        S4             S5
*        *         1        S5             S3
*        *         0        S5             S0
(* = don't care)
State Machine I

parameter s0=3'b000, s1=3'b001, s2=3'b010, s3=3'b011, s4=3'b100, s5=3'b101;

always @(posedge clock)
  current_state <= next_state;

always @(current_state or DMA_REQ or HOLD_ACK or EOP_Inn) begin
  case (current_state)
    s0: next_state = (DMA_REQ) ? s1 : s0;
    s1: next_state = (HOLD_ACK) ? s2 : s1;
    s2: next_state = s3;
    s3: next_state = s4;
    s4: next_state = s5;
    s5: next_state = (EOP_Inn) ? s3 : s0; // EOP_Inn=0 ends the operation
    default: next_state = s0;
  endcase
end

State Machine II

always @(current_state) begin
  // default
  HOLD_REQ = 0; ADR_EN = 0; ADR_STB = 0;
  DMA_ACK = 0; IOR_OUTn = 1; Dbout_STB = 0;
  case (current_state)
    s1: HOLD_REQ = 1;
    s2: begin HOLD_REQ = 1; ADR_EN = 1; ADR_STB = 1; end
    s3: begin HOLD_REQ = 1; ADR_EN = 1; ADR_STB = 1; DMA_ACK = 1; end
    s4: begin HOLD_REQ = 1; ADR_EN = 1; DMA_ACK = 1; IOR_OUTn = 0; Dbout_STB = 1; end
    s5: begin HOLD_REQ = 1; ADR_EN = 1; DMA_ACK = 1; IOR_OUTn = 0; Dbout_STB = 1; end
    default: ; // keep the default values
  endcase
end
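The next-state table can be exercised in simulation. The following sketch wraps only the state register and next-state logic in a module and drives it from a testbench. The module names dma_fsm and dma_fsm_tb, the asynchronous reset resetn, and the stimulus timing are assumptions added for this illustration and are not part of the slides.

```verilog
module dma_fsm(current_state, DMA_REQ, HOLD_ACK, EOP_Inn, resetn, clock);
  output [2:0] current_state;
  input  DMA_REQ, HOLD_ACK, EOP_Inn, resetn, clock;
  reg    [2:0] current_state, next_state;

  parameter s0=3'b000, s1=3'b001, s2=3'b010,
            s3=3'b011, s4=3'b100, s5=3'b101;

  // State register; the active-low reset is an addition for simulation.
  always @(posedge clock or negedge resetn)
    if (~resetn) current_state <= s0;
    else         current_state <= next_state;

  // Next-state logic following the transition table.
  always @(current_state or DMA_REQ or HOLD_ACK or EOP_Inn)
    case (current_state)
      s0: next_state = DMA_REQ  ? s1 : s0;
      s1: next_state = HOLD_ACK ? s2 : s1;
      s2: next_state = s3;
      s3: next_state = s4;
      s4: next_state = s5;
      s5: next_state = EOP_Inn  ? s3 : s0;  // EOP_Inn=0 ends the transfer
      default: next_state = s0;
    endcase
endmodule

module dma_fsm_tb;
  reg  DMA_REQ, HOLD_ACK, EOP_Inn, resetn, clock;
  wire [2:0] state;
  dma_fsm dut(state, DMA_REQ, HOLD_ACK, EOP_Inn, resetn, clock);

  always #5 clock = ~clock;                  // 10-unit clock period
  always @(posedge clock)
    #1 $display("t=%0t state=S%0d", $time, state);

  initial begin
    clock = 0; resetn = 0;
    DMA_REQ = 0; HOLD_ACK = 0; EOP_Inn = 1;
    #12 resetn   = 1;                        // release reset
    #10 DMA_REQ  = 1;                        // I/O unit requests a transfer
    #20 HOLD_ACK = 1;                        // processor grants the bus
    #60 EOP_Inn  = 0;                        // signal end of operation
    #40 $finish;
  end
endmodule
```

The printed trace should show the machine waiting in S0 and S1, cycling through S3, S4, S5 while EOP_Inn is 1, and returning to S0 after EOP_Inn goes low.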