CSCE 611 Advanced Digital Design RISC-V Microarchitecture

CSCE 611 Advanced Digital Design - cse.sc.edu


Exam 1

• Topics:
  1. RISC-V ISA
     • Write or analyze snippets of code
     • Translate between C and RISC-V
     • R-, I-, U-, B-, and J-types
     • Control hazards
  2. Microarchitecture
     • Control signal / datapath generation
  3. Fixed point arithmetic
  4. Logic circuits
     • Combinational vs sequential
     • Logic values
  5. Hardware Description Language
     • Simulation vs synthesis
     • Behavioral vs structural
  6. SystemVerilog
     • Simulation waveforms
     • Modules and structure
     • Continuous assignment
     • Numbers
     • Bit manipulation
     • Reduction
     • always statement
     • sensitivity list
     • default assignments
     • case statement
     • Sequential logic
     • Blocking vs nonblocking assignment
     • Registers and RAMs
     • Testbenches

Quiz 2 Q 1

Consider the following SystemVerilog module:

module foo (input logic a,b, output logic c);
  assign c = ~a;
  always_comb
    if (b) c = a & b;
endmodule

What is NOT a problem with this module?

a) c is double driven
b) no sensitivity list for the always statement
c) cannot use if statement in SystemVerilog
d) use of blocking assignment in always statement
e) cannot combine always_comb and assign statements in one module
f) the always block has no default assignment for c

Quiz 2 Q 2

Consider the Verilog module shown below:

module rand_num (input logic clk, rst, load,
                 input logic [7:0] seed,
                 output logic [7:0] val);
  always_ff @(posedge clk) begin
    if (rst) val <= 8'b0; else
    if (load) val <= seed; else
    val <= {val[5] ^ val[2],val[7:1]};
  end
endmodule

How many input signals are specified?

Quiz 2 Q 3

Consider the following SystemVerilog module:

module seq(input logic clk,rst);
  logic [2:0] state;
  always_ff @(posedge clk,posedge rst) begin
    if (rst) state <= 3'b100; else begin
      state[2]<=1'b0;
      state[1]<=state[2];
      state[0]<=state[1] | state[0];
    end
  end
endmodule

Assume the module is reset with a reset pulse at the start of execution. What is the value of "state" in the first three clock cycles after the reset pulse?

cycle 0: 100
cycle 1: state[2] <= 0, state[1] <= 1, state[0] <= 0 | 0 = 0, giving 010
cycle 2: state[2] <= 0, state[1] <= 0, state[0] <= 1 | 0 = 1, giving 001
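The hand trace for this question follows from nonblocking-assignment semantics: every right-hand side is evaluated with the values held before the clock edge. A minimal Python sketch of that rule is given here as a software model for checking the trace. It is not part of the course material, and the names `step` and `trace` are hypothetical.

```python
def step(state):
    """One rising clock edge of module seq with rst low.

    All right-hand sides read the OLD state, which is what the
    nonblocking assignment (<=) means.
    """
    s2 = (state >> 2) & 1
    s1 = (state >> 1) & 1
    s0 = state & 1
    n2 = 0          # state[2] <= 1'b0
    n1 = s2         # state[1] <= state[2]
    n0 = s1 | s0    # state[0] <= state[1] | state[0]
    return (n2 << 2) | (n1 << 1) | n0


def trace(reset_value=0b100, cycles=3):
    """Return the register value in each cycle as 3-bit binary strings."""
    values = [reset_value]
    for _ in range(cycles - 1):
        values.append(step(values[-1]))
    return [format(v, "03b") for v in values]
```

Calling `trace()` returns the three values from the hand trace: 100, 010, 001.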

Quiz 2 Q 4

Give the value of signal "a" in the first three clock cycles (after reset).

logic [2:0] a;
always_ff @(posedge clk,posedge rst) begin
  if (rst) a <= 3'b101; else begin
    a = a << 1'b1;    // note the blocking assignment
    a <= a ^ 3'b111;  // note the nonblocking assignment
  end
end

cycle 0: 101
cycle 1: a = 010 (blocking shift), then a <= 010 ^ 111 = 101, giving 101
cycle 2: a = 010 (blocking shift), then a <= 010 ^ 111 = 101, giving 101
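The difference between the two assignment kinds in this question can be mimicked in software. The sketch below is a simulation-style model only, with hypothetical names `step_q4` and `trace_q4`. It assumes the blocking assignment updates `a` immediately and the nonblocking assignment then uses that updated value on its right-hand side.

```python
MASK3 = 0b111  # 'a' is declared as logic [2:0]


def step_q4(a):
    """Value held by 'a' after one rising clock edge with rst low."""
    # Blocking assignment (=): takes effect immediately inside the block.
    a = (a << 1) & MASK3
    # Nonblocking assignment (<=): the right-hand side is evaluated now,
    # using the already-shifted 'a', and becomes the register value
    # when the time step ends.
    return a ^ 0b111


def trace_q4(reset_value=0b101, cycles=3):
    """Return the value of 'a' in each cycle as 3-bit binary strings."""
    values = [reset_value]
    for _ in range(cycles - 1):
        values.append(step_q4(values[-1]))
    return [format(v, "03b") for v in values]
```

`trace_q4()` returns 101 for all three cycles, matching the worked answer.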

Quiz 2 Q 5

What is the problem with the following code?

always_comb begin
  a = a + 3'b1;
end

a) always statement should be combinational but has a cycle
b) missing sensitivity list
c) cannot use addition in always_comb statement
d) use of blocking assignment

Quiz 2 Q 6

Consider the following testbench.

module tb;
  logic a,b,c;
  somemodule dut (a,b,c);
  initial begin
    a=1'b1; b=1'b1; #10;
    if (c != 1'b1) $display("error!");
    a=1'b0; #10;
    if (c != 1'b0) $display("error!");
    b=1'b0; #10;
    if (c != 1'b0) $display("error!");
    a=1'b1; #10;
    if (c != 1'b0) $display("error!");
  end
endmodule

What function does module "somemodule" perform?

a) or gate
b) and gate
c) none of the above
d) inverter
e) xor gate

a,b   Output expected
11    1
01    0
00    0
10    0

Quiz 2 Q 7

What are the first three values of "a" generated from the snippet below?

logic [4:0] a;
always_ff @(posedge clk,posedge rst) begin
  if (rst) a <= 5'b10101; else
  a <= {a[0],{2{a[1]}},~a[3],a[2]^a[4]};
end

cycle  a[4]  a[3]  a[2]  a[1]  a[0]
0      1     0     1     0     1
1      1     0     0     1     0
2      0     1     1     1     1
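The concatenation and replication in this question can be checked bit by bit. The following Python sketch is only an illustration. The helper names `bit` and `step_q7` are hypothetical, and it assumes the usual SystemVerilog rule that the leftmost item of a concatenation becomes the most significant bit.

```python
def bit(value, index):
    """Return bit 'index' of an integer."""
    return (value >> index) & 1


def step_q7(a):
    """Next value of the 5-bit register 'a' after one clock edge."""
    # {a[0], {2{a[1]}}, ~a[3], a[2]^a[4]}, listed most significant bit first
    bits = [
        bit(a, 0),
        bit(a, 1), bit(a, 1),      # {2{a[1]}} replicates a[1] twice
        1 - bit(a, 3),             # ~a[3] on a single bit
        bit(a, 2) ^ bit(a, 4),
    ]
    result = 0
    for b in bits:
        result = (result << 1) | b
    return result
```

Starting from the reset value 10101, `step_q7` yields 10010 and then 01111, the same values as the table.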

Quiz 3 Q 1

• Consider the RISC-V microarchitecture shown below. What is the value of {regwrite_WB,alusrc_EX,regsel_WB} when the following instructions are in the following pipeline stages? 'X' signifies a don't care.

Quiz 3 Q 2

• Consider the RISC-V microarchitecture shown below. What is the value of {regwrite_WB,alusrc_EX,regsel_WB} when the following instructions are in the following pipeline stages? 'X' signifies a don't care.

Quiz 3 Q 3

• Consider the RISC-V microarchitecture shown below. What is the value of {regwrite_WB,alusrc_EX,regsel_WB} when the following instructions are in the following pipeline stages? 'X' signifies a don't care.

Quiz 3 Q 4

• How many cycles are required to perform a write to a synchronous RAM?

• 1

Quiz 3 Q 5

• What is the result from performing the RISC-V mulh instruction on input values 4'b1111 and 4'b0001 assuming the registers are 4 bits wide?

• 4'b1111
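The mulh answer can be reproduced with a width-parameterized model. This Python sketch is an assumption-laden illustration rather than the course's implementation. The helpers `to_signed`, `mulh`, and `mulhu` are hypothetical names, and `width` stands in for the register width.

```python
def to_signed(value, width):
    """Interpret the low 'width' bits of value as two's complement."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def mulh(a, b, width=32):
    """Upper 'width' bits of the signed x signed product."""
    product = to_signed(a, width) * to_signed(b, width)
    # Python's >> on negative integers is an arithmetic shift.
    return (product >> width) & ((1 << width) - 1)


def mulhu(a, b, width=32):
    """Upper 'width' bits of the unsigned x unsigned product."""
    mask = (1 << width) - 1
    return ((a & mask) * (b & mask)) >> width
```

With 4-bit registers, 1111 is -1, so the signed product is -1 = 1111 1111 and `mulh(0b1111, 0b0001, width=4)` returns 1111. The unsigned variant returns 0000 for the same inputs.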

Register

• Register:

logic rst,en,clk;
logic [3:0] q,d;

always_ff @(posedge clk)
  if (rst) q <= 4'b0; else if (en) q <= d;

• Contains one memory location, width = 4

Memory Arrays

[Figure: a 1024-word x 32-bit memory array with a 10-bit Address input and a 32-bit Data port.]

Memory Arrays

• 2-dimensional array of bit cells
• Each bit cell stores one bit
• N address bits and M data bits:
  – 2^N rows and M columns
  – Depth: number of rows (number of words)
  – Width: number of columns (size of word)
  – Array size: depth × width = 2^N × M

[Figure: a generic array with N address bits and M data bits, and an example array with 2 address bits and 3 data bits.]

Memory Array Example

• 2^2 × 3-bit array
• Number of words: 4
• Word size: 3 bits
• For example, the 3-bit word stored at address 10 is 100

Address  Data
11       010
10       100
01       110
00       011

RAM

• 8192x32 RAM (asynchronous read):

logic [31:0] mem[8191:0];
logic [12:0] addr;
logic [31:0] readdata,writedata;
logic we,clk;

initial $readmemh("mem.txt",mem);

assign readdata = mem[addr];

always_ff @(posedge clk)
  if (we) mem[addr] <= writedata;

RAM

• 8192x32 RAM (synchronous read):

logic [31:0] mem[8191:0];
logic [12:0] addr;
logic [31:0] readdata,writedata;
logic we,clk;

initial $readmemh("mem.txt",mem);

always_ff @(posedge clk) begin
  if (we) mem[addr] <= writedata;
  readdata <= mem[addr];
end

Asynchronous vs Synchronous Read
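The practical difference between the two RAM listings is when read data becomes visible. The Python sketch below models both read styles in one hypothetical class `Ram`. It assumes simulation semantics in which the registered read samples the memory contents from before the clock edge, as the nonblocking assignments in the synchronous listing imply.

```python
class Ram:
    """Toy model of the 8192x32 RAM with both read styles."""

    def __init__(self, depth=8192):
        self.mem = [0] * depth
        self.readdata_sync = 0  # registered read port

    def read_async(self, addr):
        """Asynchronous read: data follows the address with no clock."""
        return self.mem[addr]

    def clock(self, addr, we=False, writedata=0):
        """One rising clock edge: optional write plus registered read."""
        old = self.mem[addr]             # both <= statements see old contents
        if we:
            self.mem[addr] = writedata & 0xFFFFFFFF
        self.readdata_sync = old         # readdata <= mem[addr]
```

In this model a write completes in one edge, and a synchronous read of the same address returns the new data one edge later, which is the one-cycle load latency the pipeline has to accommodate.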

Register File

module regfile32x32 (input logic we,clk,
                     input logic [4:0] readaddr1,
                                       readaddr2,
                                       writeaddr,
                     output logic [31:0] readdata1,readdata2,
                     input logic [31:0] writedata);

  logic [31:0] mem[31:0];

  assign readdata1 = readaddr1 == 5'b0 ? 32'b0 :
                     readaddr1==writeaddr && we ? writedata :
                     mem[readaddr1];

  assign readdata2 = readaddr2 == 5'b0 ? 32'b0 :
                     readaddr2==writeaddr && we ? writedata :
                     mem[readaddr2];

  always_ff @(posedge clk)
    if (we) mem[writeaddr] <= writedata;

endmodule

[Figure: regfile32x32 block with inputs readaddr1[4:0], readaddr2[4:0], writeaddr[4:0], writedata[31:0], we, clk and outputs readdata1[31:0], readdata2[31:0].]
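Two details of the register file are easy to miss in the listing: reads of register 0 always return zero, and a read of the register being written in the same cycle returns the new data (a bypass). The Python sketch below restates that read priority. The class `RegFile` and its method names are hypothetical, chosen only for this illustration.

```python
class RegFile:
    """Toy model of regfile32x32: two read ports share the same rule."""

    def __init__(self):
        self.mem = [0] * 32

    def read(self, readaddr, we=False, writeaddr=0, writedata=0):
        """Combinational read with the same priority as the assign chain."""
        if readaddr == 0:
            return 0                            # x0 is hardwired to zero
        if we and readaddr == writeaddr:
            return writedata & 0xFFFFFFFF       # bypass the pending write
        return self.mem[readaddr]

    def clock(self, we, writeaddr, writedata):
        """Rising clock edge: commit the write if enabled."""
        if we:
            self.mem[writeaddr] = writedata & 0xFFFFFFFF
```

The bypass matters in the three-stage pipeline because an instruction in the execute stage may read a register that the instruction in write back is updating in the same cycle.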

Arithmetic Logic Unit (ALU)

op    Function
0000  A and B
0001  A or B
0010  A xor B
0011  A + B
0100  A - B
0101  A * B (low)
0110  A * B (high, signed)
0111  A * B (high, unsigned)
1000  A << B
1001  A >> B
1010  A >>> B
1011  A >>> B
1100  A < B (signed)
1101  A < B (unsigned)
1110  A < B (unsigned)
1111  A < B (unsigned)

• 4 bit multiply:
  • 0001 * 1111 = 0000 1111 (unsigned)
  • 0001 * 1111 = 1111 1111 (signed)
• 4 bit SLT:
  • 0001 < 1111 = TRUE (unsigned)
  • 0001 < 1111 = FALSE (signed)

[Figure: ALU block with inputs A[31:0], B[31:0], op[3:0] and outputs R[31:0], zero.]

ALU Design

module alu(
  input  logic [31:0] A,
  input  logic [31:0] B,
  input  logic [ 3:0] op,
  output logic [31:0] R,
  output logic        zero
);

  logic [31:0] mulls, mullu, mulhu, mulhs;
  logic [ 4:0] shamt;
  logic [31:0] shifted;

  assign zero = R==32'd0 ? 1'b1 : 1'b0;

  assign {mulhu, mullu} = A*B;
  assign {mulhs, mulls} = $signed(A)*$signed(B);

  assign shamt = B[4:0];

  // Arithmetic right shift does not work under ModelSim, so we work
  // around this by implementing our own shift. Notice the blocking assignments.
  always_comb begin
    shifted = A;
    if (shamt[0]) shifted = {{1{A[31]}},shifted[31:1]};
    if (shamt[1]) shifted = {{2{A[31]}},shifted[31:2]};
    if (shamt[2]) shifted = {{4{A[31]}},shifted[31:4]};
    if (shamt[3]) shifted = {{8{A[31]}},shifted[31:8]};
    if (shamt[4]) shifted = {{16{A[31]}},shifted[31:16]};
  end

  assign R =
    (op == 4'b0000) ? A & B :
    (op == 4'b0001) ? A | B :
    (op == 4'b0010) ? A ^ B :
    (op == 4'b0011) ? A + B :
    (op == 4'b0100) ? A - B :
    (op == 4'b0101) ? mulls :
    (op == 4'b0110) ? mulhs :
    (op == 4'b0111) ? mulhu :
    (op == 4'b1000) ? A << shamt :
    (op == 4'b1001) ? A >> shamt :
    (op == 4'b1010) ? shifted :
    (op == 4'b1011) ? shifted :
    (op == 4'b1100) ? ($signed(A) < $signed(B)) :
    (op == 4'b1101) ? (A < B) :
    (op == 4'b1110) ? (A < B) : (A < B);

endmodule
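A reference model is handy when writing a testbench for this ALU. The Python sketch below mirrors the op encoding of the listing, including the staged arithmetic shift. It is a hedged illustration: the function names `signed`, `sra`, and `alu` are hypothetical, and the model is not a replacement for simulating the SystemVerilog.

```python
MASK = 0xFFFFFFFF


def signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def sra(a, shamt):
    """Arithmetic right shift built from five conditional stages.

    Stage i shifts by 2**i when bit i of shamt is set and fills the
    vacated upper bits with the sign bit of the original operand.
    """
    sign = (a >> 31) & 1
    shifted = a & MASK
    for i in range(5):
        if (shamt >> i) & 1:
            n = 1 << i
            fill = (((1 << n) - 1) << (32 - n)) if sign else 0
            shifted = fill | (shifted >> n)
    return shifted


def alu(a, b, op):
    """Return (R, zero) for a 4-bit op code."""
    a &= MASK
    b &= MASK
    shamt = b & 0x1F
    results = {
        0b0000: a & b,
        0b0001: a | b,
        0b0010: a ^ b,
        0b0011: a + b,
        0b0100: a - b,
        0b0101: a * b,                          # low word
        0b0110: (signed(a) * signed(b)) >> 32,  # high word, signed
        0b0111: (a * b) >> 32,                  # high word, unsigned
        0b1000: a << shamt,
        0b1001: a >> shamt,
        0b1010: sra(a, shamt),
        0b1011: sra(a, shamt),
        0b1100: int(signed(a) < signed(b)),
    }
    # Op codes 1101, 1110, and 1111 all fall through to unsigned compare.
    r = results.get(op, int(a < b)) & MASK
    return r, int(r == 0)
```

For example, `alu(0x88000000, 17, 0b1010)` returns `0xFFFFC400` as R, while the logical shift `0b1001` returns `0x00004400` for the same operands.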

22 RISC-V Instructions (Lab 3)

• Goal: design a pipelined processor that can execute a minimal set of RISC-V instructions:
  – Arithmetic R-type: add, sub, mul, mulh, mulhu
  – Arithmetic I-type: addi
  – Comparison R-type: slt, sltu
  – Logical R-type: and, or, xor
  – Logical I-type: andi, ori, xori
  – Shift R-type: sll, srl, sra
  – Shift I-type: slli, srai, srli
  – U-type: lui
  – I/O I-type: csrrw

RISC-V Instruction Formats

R-type

31:25 (7 bits)  24:20 (5 bits)  19:15 (5 bits)  14:12 (3 bits)  11:7 (5 bits)  6:0 (7 bits)
funct7          rs2             rs1             funct3          rd             opcode

  add rd,rs1,rs2
  sub rd,rs1,rs2
  mul rd,rs1,rs2
  mulh rd,rs1,rs2
  mulhu rd,rs1,rs2
  slt rd,rs1,rs2
  sltu rd,rs1,rs2
  and rd,rs1,rs2
  or rd,rs1,rs2
  xor rd,rs1,rs2
  sll rd,rs1,rs2
  srl rd,rs1,rs2
  sra rd,rs1,rs2

I-type

31:20 (12 bits)  19:15 (5 bits)  14:12 (3 bits)  11:7 (5 bits)  6:0 (7 bits)
imm12[11:0]      rs1             funct3          rd             opcode

  addi rd,rs1,imm
  andi rd,rs1,imm
  ori rd,rs1,imm
  xori rd,rs1,imm
  slli rd,rs1,imm
  srai rd,rs1,imm
  srli rd,rs1,imm
  csrrw rd,imm,rs1

U-type

31:12 (20 bits)  11:7 (5 bits)  6:0 (7 bits)
imm20[31:12]     rd             opcode

  lui rd,imm
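The field positions of the three formats translate directly into shifts and masks. The Python sketch below extracts every field at once, leaving it to the caller to use only the fields that apply to the instruction's format. The names `decode` and `sign_extend` are hypothetical helpers for this illustration.

```python
def decode(instr):
    """Split a 32-bit instruction word into R/I/U-format fields."""
    return {
        "opcode": instr & 0x7F,            # bits 6:0
        "rd":     (instr >> 7) & 0x1F,     # bits 11:7
        "funct3": (instr >> 12) & 0x7,     # bits 14:12
        "rs1":    (instr >> 15) & 0x1F,    # bits 19:15
        "rs2":    (instr >> 20) & 0x1F,    # bits 24:20
        "funct7": (instr >> 25) & 0x7F,    # bits 31:25
        "imm12":  (instr >> 20) & 0xFFF,   # bits 31:20 (I-type)
        "imm20":  (instr >> 12) & 0xFFFFF, # bits 31:12 (U-type)
    }


def sign_extend(value, bits):
    """Sign-extend the low 'bits' bits of value to a Python integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)
```

For instance, decoding the encoding of `add x3,x1,x2` yields rs1 = 1, rs2 = 2, rd = 3, and opcode 0x33, and `sign_extend(0x800, 12)` gives -2048.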

csrrw Instruction

• csrrw rd,csr,rs1
  – "Control and status register read and write"
  – Concurrently update CSR[imm12] and place old value of CSR[imm12] into rd
    • t = CSRs[imm12];
    • CSRs[imm12] = R[rs1];
    • R[rd] = t;
• For us:
  – csrrw rd,0,x0: write the state of the switches into rd
  – csrrw x0,2,x1: write the value in x1 to the HEX displays

Top-Level Design

• Use on-chip memory for program
  – Word addressed

[Figure: top-level design on the FPGA. CLOCK_50 drives clk and KEY[0] drives rst of the CPU; a 32-bit RAM serves as instruction memory; SW feeds io0_in; io2_out drives HEX_out through the HEX7…HEX0 decoders. Buses in the figure are labeled 32 bits and 10 bits.]

Execution Stages

• Recall basic steps: fetch, decode, execute, memory, write back
  – R-type computational instructions (ex. ADD x1, x2, x3):
    • fetch, decode, execute, write back
  – Branch instructions
    • fetch, decode, execute
  – Load instruction
    • fetch, decode, execute, memory, write back
  – Store instruction
    • fetch, decode, execute, memory

Execution Stages

• Single cycle CPU:

[Figure: single-cycle timing. inst 1, inst 2, and inst 3 each perform fetch, decode, execute, memory, and WB within one cycle (cycle 0, cycle 1, cycle 2).]

CPU Design

• Loads have a one-cycle latency
  – Need to separate MEM and WB
• Fetches have a one-cycle latency
  – Need to separate FETCH and DECODE
• Solution:
  – Three-stage pipeline:
    • FETCH, DECODE/EX/MEM, WB
    • F E W
  – Add pipeline registers between E and W
  – Loaded data is already delayed, so don’t need a register for this
  – Leads to control hazard
    • Must flush FETCH for taken branches
    • Zero out the control signals if a branch was taken in the previous cycle

Execution Stages

[Figure: one instruction passing through fetch, decode, execute, memory, and WB across cycles 0 to 2, with the two memory-latency boundaries marked.]

Execution Stages

• Three stage pipeline:

[Figure: Instruction n, Instruction n+1, and Instruction n+2 overlapped across cycles 0 to 4; each spends one cycle in fetch, one in decode/execute/memory, and one in WB.]

RISC-V Fetch/Decode

• Instruction bits:
  – To control unit:
    • [6:0] opcode
    • [14:12] funct3
    • [31:25] funct7
    • [31:20] csr
  – To register file:
    • [19:15] rs1 address (readaddr1)
    • [24:20] rs2 address (readaddr2)

[Figure: fetch/decode datapath. PC_FETCH addresses the instruction memory and is incremented by 1; the fetched instruction is registered as instruction_EX, which feeds the control unit and the register file (readaddr1 = [19:15], readaddr2 = [24:20]); writeaddr comes from [11:7] delayed to regdest_WB.]

RISC-V Fetch/Decode

[Figure: fetch/decode datapath with a decoder. instruction_EX feeds a Decoder that produces opcode_EX, funct3_EX, funct7_EX, csr_EX, imm12_EX, imm20_EX, rs1_EX, rs2_EX, and rd_EX. The register file uses rs1_EX as readaddr1, rs2_EX as readaddr2, and rd_EX delayed to rd_WB as writeaddr.]

Initializing Instruction Memory

• Use script:
  – Use RARS to assemble (F3)
  – File | Dump Memory
  – Use this file to initialize instruction RAM:

    logic [31:0] instruction_mem [4095:0];
    initial
      $readmemh("hexcode.txt", instruction_mem);

  – Implementation:

    always_ff @(posedge clk)
      if (rst) begin
        instruction_EX <= 32'b0;
        PC_FETCH <= 12'b0;
      end else begin
        instruction_EX <= instruction_mem[PC_FETCH];
        PC_FETCH <= PC_FETCH + 12'b1;
      end

EX/WB Stage

[Figure: EX/WB stage. The ALU (op_EX, inputs A_EX and B_EX) produces R_EX, which is registered as R_WB. A mux selected by regsel_WB chooses writedata_WB: 0 = GPIO_in, 1 = {instruction_EX[31:12],12'b0} (or {imm20_EX,12'b0}), 2 = R_WB. regwrite_EX and regsel_EX from the control unit are registered to regwrite_WB and regsel_WB; regwrite_WB is the register file write enable.]

EX Stage: I-Type ARITH

[Figure: EX stage for I-type arithmetic. A mux controlled by alusrc_EX selects the ALU B input: 0 = register file output, 1 = sign-extended instruction_EX[31:20] (or imm12_EX). instruction_EX[11:7] (or rd_EX) is registered to rd_WB. The figure also shows GPIO_out and GPIO_we.]

Whole CPU

[Figure: whole CPU, combining the fetch/decode, EX, and WB datapaths: PC_FETCH, instruction memory, instruction_EX, control unit, register file (readaddr1, readaddr2, writeaddr, readdata1, readdata2, writedata, we), alusrc_EX mux with sign extend, ALU, GPIO_in, GPIO_out, GPIO_we, and the regsel_WB mux.]


Control Unit Implementation (NOT COMPLETE)

           inst[31:25]  inst[31:20]    inst[14:12]  inst[6:0]
inst       funct7       imm12          funct3       opcode  aluop    alusrc  regsel  regwrite  gpio_we
add        7'h0         X              3'b000       7'h33   4'b0011  1'b0    2'b10   1'b1      1'b0
sub        7'h20        X              3'b000       7'h33   4'b0100  1'b0    2'b10   1'b1      1'b0
mul        7'h1         X              3'b000       7'h33   4'b0101  1'b0    2'b10   1'b1      1'b0
slt        7'h0         X              3'b010       7'h33   4'b1100  1'b0    2'b10   1'b1      1'b0
and        7'h0         X              3'b111       7'h33   4'b0000  1'b0    2'b10   1'b1      1'b0
andi       X            X              3'b111       7'h13   4'b0000  1'b1    2'b10   1'b1      1'b0
sll        7'h0         X              3'b001       7'h33   4'b1000  1'b0    2'b10   1'b1      1'b0
slli       7'h0         X              3'b001       7'h13   4'b1000  1'b1    2'b10   1'b1      1'b0
lui        X            X              X            7'h37   4'bX     1'bX    2'b01   1'b1      1'b0
csrrw HEX  X            12'hf02 (io2)  3'b001       7'h73   4'bX     1'bX    2'bX    1'b0      1'b1
csrrw SW   X            12'hf00 (io0)  3'b001       7'h73   4'bX     1'bX    2'b00   1'b1      1'b0
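One way to read the control table is as a lookup keyed on the decoded fields. The Python sketch below covers only a subset of the rows (add, sub, mul, slt, and, sll, andi, lui, and the two csrrw cases) and uses `None` for don't-care outputs. The names `control`, `R_TYPE_ALUOP`, and `X` are hypothetical, and the structure is an assumption about how such a unit could be organized, not the course's control unit.

```python
X = None  # stands for a don't-care output

# (funct7, funct3) -> aluop for R-type instructions (opcode 7'h33)
R_TYPE_ALUOP = {
    (0x00, 0b000): 0b0011,  # add
    (0x20, 0b000): 0b0100,  # sub
    (0x01, 0b000): 0b0101,  # mul
    (0x00, 0b010): 0b1100,  # slt
    (0x00, 0b111): 0b0000,  # and
    (0x00, 0b001): 0b1000,  # sll
}


def control(funct7, imm12, funct3, opcode):
    """Return (aluop, alusrc, regsel, regwrite, gpio_we)."""
    if opcode == 0x33:
        return (R_TYPE_ALUOP[(funct7, funct3)], 0, 0b10, 1, 0)
    if opcode == 0x13 and funct3 == 0b111:    # andi
        return (0b0000, 1, 0b10, 1, 0)
    if opcode == 0x37:                        # lui
        return (X, X, 0b01, 1, 0)
    if opcode == 0x73 and funct3 == 0b001:    # csrrw
        if imm12 == 0xF02:                    # HEX displays (io2)
            return (X, X, X, 0, 1)
        if imm12 == 0xF00:                    # switches (io0)
            return (X, X, 0b00, 1, 0)
    raise ValueError("instruction not covered by this partial table")
```

In hardware the same mapping would typically be a case statement with default assignments so that unlisted instructions write nothing.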

Example Test Program

.text
  addi  x1, x0, 12       # (x1) <= 12 / 0x0000000C
  addi  x2, x0, 17       # (x2) <= 17 / 0x00000011
  add   x3, x1, x2       # (x3) <= 29 / 0x0000001D
  sub   x3, x3, x1       # (x3) <= 17 / 0x00000011
  slli  x3, x3, 27       # (x3) <= -2013265920 / 0x88000000
  mul   x4, x2, x3       # (x4) <= 134217728 / 0x08000000
  mulh  x4, x2, x3       # (x4) <= -8 / 0xFFFFFFF8
  mulhu x5, x2, x3       # (x5) <= 9 / 0x00000009
  slt   x6, x4, x1       # (x6) <= 1 / 0x00000001
  sltu  x6, x4, x1       # (x6) <= 0 / 0x00000000
  and   x7, x4, x1       # (x7) <= 8 / 0x00000008
  or    x8, x4, x1       # (x8) <= -4 / 0xFFFFFFFC
  xor   x9, x1, x2       # (x9) <= 29 / 0x0000001D
  andi  x10, x3, -2048   # (x10) <= -2013265920 / 0x88000000
  ori   x11, x2, -2048   # (x11) <= -2031 / 0xFFFFF811
  xori  x12, x2, 14      # (x12) <= 31 / 0x0000001F
  sll   x13, x2, x1      # (x13) <= 69632 / 0x00011000
  srl   x14, x3, x2      # (x14) <= 17408 / 0x00004400
  sra   x15, x3, x2      # (x15) <= -15360 / 0xFFFFC400
  slli  x16, x1, 1       # (x16) <= 24 / 0x00000018
  srli  x17, x2, 1       # (x17) <= 8 / 0x00000008
  srai  x18, x3, 1       # (x18) <= -1006632960 / 0xC4000000
  lui   x19, 0xDBEEF     # (x19) <= -605097984 / 0xDBEEF000
  csrrw x20, 0xf02, x3   # HEX <= 0x88000000 / -2013265920
  csrrw x21, 0xf00, x3   # (x21) <= SW

Binary to Decimal Conversion

val    %10   /10
5678   8     567
567    7     56
56     6     5
5      5     0
0

• reg = 0
• while val != 0
  – reg = reg << 4
  – reg = reg | val % 10
  – val = val / 10
• This will generate the digits in reverse, so you can connect the least significant BCD digit of the register to the most significant HEX
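The loop on this slide can be run directly as software to see the reversed digit order. The Python sketch below is a plain transcription of the loop. The function name `to_reversed_bcd` is hypothetical, and it uses Python's `%` and `//`, which the CPU itself does not have.

```python
def to_reversed_bcd(val):
    """Pack the decimal digits of val into nibbles, last digit first."""
    reg = 0
    while val != 0:
        reg = (reg << 4) | (val % 10)  # append the current last digit
        val = val // 10                # drop that digit
    return reg
```

For val = 5678 the result is 0x8765: the least significant nibble holds 5, the most significant decimal digit. One caveat of this sketch is that the digit count is not recorded, so 100 and 1 both produce 0x1; a fixed iteration count would avoid that.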

Work Around Divide

• Our CPU doesn’t have a divide or a modulo instruction
• Since we’re using a constant factor 10, we can use multiply
• Idea: use fixed-point multiply
  – Assume a decimal point at some location in a value:
  – Example (6,4)-fixed format:
    • 10.1101 = 2 + 13/16 = 2.8125
  – Now assume we multiply a (32,0) value by a (32,32) value
  – Result would be (64,32) value
  – Use this to multiply a (32,0) value by 0.1
  – (32,32) representation of 0.1 = 2^32 / 10

Example

• Assume value = 234

– Step 1: temp = 234 x 0.1 = 23.4

– Step 2: temp2 = fractional(temp) = .4

– Step 3: temp = whole(temp) = 23

– Step 4: digit = temp2 x 10 = 4

– Step 5: temp = temp x 0.1 = 2.3

– Step 6: temp2 = fractional(temp) = .3

– Step 7: temp = whole(temp) = 2

– Step 8: digit = temp2 x 10 = 3

– Step 9: temp = temp x 0.1 = 0.2

– Step 10: temp2 = fractional(temp) = .2

– Step 11: temp = whole(temp) = 0

– Step 12: digit = temp2 x 10 = 2

– Step 13: temp == 0 so finish
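The steps above map onto the CPU's mul and mulhu results: the upper 32 bits of the (64,32) product are the whole part and the lower 32 bits are the fraction. The Python sketch below follows that mapping. It assumes the constant 2^32 / 10 is rounded up (0x1999999A), because a truncated constant returns digit 3 instead of 4 for the value 234. With the rounded-up constant the results work out to be exact for inputs below 2^30. The names `ONE_TENTH`, `div10_step`, and `decimal_digits` are hypothetical.

```python
ONE_TENTH = (2**32 + 9) // 10  # 0.1 in (32,32) format, rounded up: 0x1999999A


def div10_step(value):
    """Return (value // 10, value % 10) using multiplies only."""
    product = value * ONE_TENTH    # (64,32) fixed-point result
    whole = product >> 32          # upper 32 bits, what mulhu returns
    frac = product & 0xFFFFFFFF    # lower 32 bits, what mul returns
    digit = (frac * 10) >> 32      # fraction times 10, whole part only
    return whole, digit


def decimal_digits(value):
    """Digits of value, least significant first, as in the worked example."""
    digits = []
    while True:
        value, digit = div10_step(value)
        digits.append(digit)
        if value == 0:
            return digits
```

`decimal_digits(234)` yields the digits 4, 3, 2 in that order, matching steps 4, 8, and 12.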

Branch Instructions

• Branch offset = immediate field
  – Number of instructions to advance the PC if the branch is taken
  – Which PC?
    • PC_FETCH: the address of the instruction currently being fetched
    • PC_EX: the address of the instruction currently being executed (in the execute stage)
    • RISC-V: use PC_EX
    • MIPS: use PC_FETCH

Branch Instructions

• Branch instructions:
  – B-type
    • beq, bne, blt, bge, bltu, bgeu
    • Ex: beq x2, x3, loop
    • If taken:
      PC_FETCH <= PC_EX + SE(imm<<1)
      // assuming PC has byte address ^^^
      instruction_EX <= NOP

Branch Instructions

• 12-bit immediate (rep. 13-bit offset):

assign branch_offset_EX =
  {instruction_EX[31], instruction_EX[7], instruction_EX[30:25], instruction_EX[11:8], 1'b0};

assign branch_addr_EX = PC_EX + {branch_offset_EX[12],branch_offset_EX[12:2]};

instruction_EX bits  offset bits
31                   12
7                    11
30:25                10:5
11:8                 4:1
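The bit shuffling of the B-type immediate is easy to get wrong, so a software cross-check can help. The Python sketch below rebuilds the signed 13-bit byte offset from an instruction word and then forms a word-addressed target, dividing the byte offset by 4 as the `[12:2]` slice does. The names `branch_offset` and `branch_addr` and the 12-bit PC width parameter are assumptions made for this illustration.

```python
def branch_offset(instr):
    """Reassemble the signed 13-bit B-type offset from a 32-bit word."""
    offset = (
        (((instr >> 31) & 0x1) << 12)    # instruction[31]    -> offset[12]
        | (((instr >> 7) & 0x1) << 11)   # instruction[7]     -> offset[11]
        | (((instr >> 25) & 0x3F) << 5)  # instruction[30:25] -> offset[10:5]
        | (((instr >> 8) & 0xF) << 1)    # instruction[11:8]  -> offset[4:1]
    )
    return offset - (1 << 13) if offset & (1 << 12) else offset


def branch_addr(pc_ex, instr, pc_bits=12):
    """Word-addressed target: PC_EX plus the byte offset divided by 4."""
    return (pc_ex + (branch_offset(instr) >> 2)) & ((1 << pc_bits) - 1)
```

As a usage example, the word 0xFE000CE3 carries a byte offset of -8, so a branch at word address 10 targets word address 8.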

Jump Instruction

• Jump-and-link (jal)
  – J-type:
  – 20-bit immediate (rep. 21-bit offset)
  – R[rd] <= PC_FETCH
  – PC_FETCH <= PC_EX + SE(imm<<1) // assuming byte-address

assign jal_offset_EX = {instruction_EX[31],instruction_EX[19:12],instruction_EX[20],instruction_EX[30:21],1'b0};

assign jal_addr_EX = PC_EX + jal_offset_EX[13:2];

instruction_EX bits  offset bits
31                   20
19:12                19:12
20                   11
30:21                10:1

Jump Register Instruction

• Jump-and-link-register (jalr)
• I-type:
  – PC_FETCH <= [ R[rs1] + SE(imm) ] & 0xffff fffe // assuming byte address
  – R[rd] <= PC_FETCH

assign jalr_offset = instruction_EX[31:20];

assign jalr_addr = readdata1_EX[13:2] + {{2{jalr_offset[11]}},jalr_offset[11:2]};

Three Stage Pipeline

cycle          0  1  2  3     4     5  6  7
add            F  E  W
add               F  E  W
branch (t)           F  E     W
fall through            F     E(S)  W(S)
target                        F     E  W
branch (nt)                         F  E  W

Fetch Stage

• Add a separate bus for branch vs. j-type jumps vs. r-type jumps

[Figure: fetch stage. A PC mux controlled by pcsrc_EX selects among PC_FETCH + 1, branch_addr_EX, jal_addr_EX, and jalr_addr_EX. PC_FETCH addresses the instruction memory, whose output is registered as instruction_EX; PC_FETCH is delayed to PC_EX and stall_FETCH is delayed to stall_EX.]

EX/WB Stage

[Figure: EX/WB stage with the regsel mux extended. The ALU (op_EX, input B_EX) produces R_EX, registered as R_WB. The mux selected by regsel_WB chooses writedata_WB: 0 = SW_in, 1 = {instruction_EX[31:12],12'b0}, 2 = R_WB, 3 = PC_FETCH. The figure also shows SE(imm), and regwrite_EX and regsel_EX registered to regwrite_WB and regsel_WB.]

Changes to CPU

• Structure:
  – Add branch_addr, jal_addr, and jalr_addr as possible values for PC_FETCH
  – Add pcsrc_EX to control PC mux
  – Add stall_FETCH as output of control unit
  – Add stall_EX as input to control unit
  – Add PC_FETCH as possible input to the regsel mux (register write data)
• Control:
  – If stall_EX is asserted, don’t write registers or PC (no-op)
  – For branches:
    • Set aluop to SLT for blt, bge, SLTU for bltu, bgeu, and SUB for beq, bne
    • Use ALU output R_EX to resolve branches

Branch Resolution

Instruction  ALU op  Branch (pcsrc_EX=2'b01) if:
beq          sub     R_EX == 32'b0
bne          sub     R_EX != 32'b0
blt          slt     R_EX == 32'b1
bltu         sltu    R_EX == 32'b1
bge          slt     R_EX == 32'b0
bgeu         sltu    R_EX == 32'b0
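The branch resolution table pairs each branch with an ALU operation and a condition on R_EX. The Python sketch below encodes those pairs and evaluates them on 32-bit register values. The names `ALU`, `RULES`, and `branch_taken` are hypothetical, and the three ALU operations are modeled in software only for this illustration.

```python
MASK = 0xFFFFFFFF


def _signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


ALU = {
    "sub":  lambda a, b: (a - b) & MASK,
    "slt":  lambda a, b: int(_signed(a) < _signed(b)),
    "sltu": lambda a, b: int((a & MASK) < (b & MASK)),
}

# mnemonic -> (ALU op, R_EX value meaning "taken"; None means any nonzero)
RULES = {
    "beq":  ("sub", 0),
    "bne":  ("sub", None),
    "blt":  ("slt", 1),
    "bltu": ("sltu", 1),
    "bge":  ("slt", 0),
    "bgeu": ("sltu", 0),
}


def branch_taken(mnemonic, rs1, rs2):
    """Apply the table: run the ALU op, then test R_EX."""
    op, taken_value = RULES[mnemonic]
    r_ex = ALU[op](rs1, rs2)
    if taken_value is None:
        return r_ex != 0
    return r_ex == taken_value
```

With rs1 = -1 and rs2 = 2, `blt` is taken (signed compare) and `bgeu` is also taken, because -1 is 0xFFFFFFFF when compared unsigned.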

How to Update Lab 3 CPU to Lab 4 CPU

1. Add code to generate:
   – branch_addr
   – jal_addr
   – jalr_addr
2. Add "PC mux" to control next state of PC_FETCH via pcsrc_EX control signal
3. Add stall_FETCH as output of control unit, add stall_EX as input to control unit
   – Add the delay register to create stall_EX from stall_FETCH
4. Add decoding maps for J- and B-type instructions
5. Add PC_FETCH into "regsel" mux
6. Add entries to control unit
   – Add nested if-statement for branches that resolves the branch using R_EX and zero_EX

Test Program

PC
0             li x4,-1
1             li x5,2
2             beq x4,x5,target3    # not taken
3             bne x4,x5,target3    # taken
4   target1:  blt x4,x5,target4    # taken
5             jal target5
6             j exit
7   target2:  nop
8   target3:  jal x1,target1
9   target4:  jalr x1
10  target5:  bge x4,x5,target1    # not taken
11            bgeu x4,x5,target6   # taken
12            beq x0,x0,target1    # not executed
13  target6:  bltu x4,x4,target1   # not taken
14  exit:

fetch sequence: 0,1,2,3,4(stall),8,9(stall),4,5(stall),9,10(stall),9,10(stall),10,11,12(stall),13

Lab 3: Branch/Jump Instructions

• Objectives:
  1. Implement a RISC-V processor that supports the instructions:
     beq, bne, blt, bge, bltu, bgeu, jal, jalr
  2. Test with test program
  3. Write a RISC-V test program:
     • Computes square root of a value on switches
     • Displays value on 7-segment LEDs
     • Implement on DE2
     • Value on switches is (18,0) value
     • Solution stored as (32,23) internally
     • Value displayed on HEXs is (8,5)
     • You can’t light up decimal point on HEXs (not connected on DE2 board)

Fractional Conversion

• Switches allow a value from 0 to (2^18 - 1)
  – 0 <= x <= 262143
  – 0 <= x^0.5 < 512
• Assume our BCD display is (8,5)
  – Need 3 digits to left of decimal point
  – Gives precision of 1/100000 (least significant BCD value is 10^-5)
  – Roughly equivalent to 2^-14 (2^-14 = 6.1x10^-5)
• Binary representation: (32,14) fixed-point
  – shift initial step 14 bits to left
  – initial guess = 0, initial step = 256.0
• Use same algorithm as in lab 4, but first:
  – Multiply value by 10^5 (100000)
  – Shift 14 bits to the right
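The last two bullets turn a (32,14) fixed-point result into an integer count of 10^-5 units, which can then go through the usual decimal digit extraction. The Python sketch below shows that scaling for an (8,5) display. The names `FRAC_BITS`, `SCALE`, and `to_display_digits` are hypothetical, and Python's unbounded integers hide the fact that on the 32-bit CPU the product can exceed 32 bits for large values, so the mul and mulhu pair would presumably be needed there (an inference, not something the slides state).

```python
FRAC_BITS = 14    # (32,14) fixed point
SCALE = 100000    # 10^5: five BCD digits after the decimal point


def to_display_digits(value_fx, total_digits=8):
    """Convert a (32,14) value to BCD digits, most significant first."""
    scaled = (value_fx * SCALE) >> FRAC_BITS  # integer count of 10^-5 units
    digits = []
    for _ in range(total_digits):
        digits.append(scaled % 10)
        scaled //= 10
    return digits[::-1]
```

For example, 1.5 is stored as 1.5 × 2^14 = 24576, and `to_display_digits(24576)` gives the digits 0 0 1 5 0 0 0 0, read as 001.50000.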