
COEN-4710 Computer Hardware

Lecture 3

Processor Part 1: Datapath and Control (Ch.4)

Cristinel Ababei

Marquette University

Department of Electrical and Computer Engineering

Goals of this Chapter

❖ Design a datapath and control that implement the RISC-V instruction set architecture (ISA).

❖ By the end of this chapter, you should:

◆ Be able to design a datapath for an instruction set

◆ Be able to design the control logic for the datapath

◆ Understand the importance of the clocking methodology on the processor design

❖ We will examine two RISC-V implementations

◆ A simplified version

◆ A more realistic pipelined version

What are datapath and control?

❖ Datapath

◆ The path along which the “data” flow and undergo computations.

◆ Realized by hardware components connected so that they perform the operations on data that implement the machine instructions.

❖ Control

◆ Control is the logic (sequential in general) that configures the datapath so that the “data” flow properly through the hardware components.

◆ Responsible for generating all control signals that “orchestrate” the correct flow of data through the datapath.

◆ Can be implemented as finite-state machine(s).

◆ Can also be implemented as a computer inside of a computer (microcode).

Design Process

1. Select a subset of the instruction set to implement. A simple subset shows most aspects:

◆ Memory reference: ld, sd

◆ Arithmetic/logical: add, sub, and, or

◆ Control transfer: beq

2. Order the steps within the instruction cycle (performed during instruction execution).

3. Select the hardware components.

4. Connect the hardware components.

5. Design the control to make the components work together properly.

Order the steps: F I D E

❖ FIDE: the sequence of activities that happen during instruction execution

1. Fetch (the instruction)

2. “Increment” (the Program Counter)

3. Decode (the Instruction Register)

4. Execute (using datapath hardware)
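
The four FIDE steps can be sketched as a small software loop. This is only an illustrative model, not the hardware design from the lecture: the function names (`decode`, `execute`, `run`) are hypothetical, instruction memory is modeled as a list of 32-bit words, and only the R-type instructions add, sub, and, or are handled, using the standard RISC-V funct7/funct3 encodings.

```python
# Minimal fetch-increment-decode-execute (FIDE) loop.
# Hypothetical model: only R-type add/sub/and/or are handled.
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def decode(instr):
    """Split a 32-bit R-format word into its fields."""
    return {
        "opcode": instr & 0x7F,
        "rd": (instr >> 7) & 0x1F,
        "funct3": (instr >> 12) & 0x7,
        "rs1": (instr >> 15) & 0x1F,
        "rs2": (instr >> 20) & 0x1F,
        "funct7": (instr >> 25) & 0x7F,
    }


def execute(fields, regs):
    """Perform the R-type operation selected by funct7/funct3."""
    if fields["opcode"] != 0b0110011:
        raise ValueError("only R-type instructions are modeled")
    a, b = regs[fields["rs1"]], regs[fields["rs2"]]
    key = (fields["funct7"], fields["funct3"])
    if key == (0b0000000, 0b000):
        result = a + b          # add
    elif key == (0b0100000, 0b000):
        result = a - b          # sub
    elif key == (0b0000000, 0b111):
        result = a & b          # and
    elif key == (0b0000000, 0b110):
        result = a | b          # or
    else:
        raise ValueError("unsupported funct7/funct3 combination")
    if fields["rd"] != 0:       # x0 is hard-wired to zero
        regs[fields["rd"]] = result & MASK64


def run(imem, regs):
    """Execute each word of imem once, in order; return the final PC."""
    pc = 0
    while pc < 4 * len(imem):
        instr = imem[pc // 4]   # 1. Fetch
        pc += 4                 # 2. Increment
        fields = decode(instr)  # 3. Decode
        execute(fields, regs)   # 4. Execute
    return pc
```

For example, with `regs[20] = 5` and `regs[21] = 7`, running the single word `0x015A04B3` (add x9, x20, x21) leaves 12 in `regs[9]` and returns a PC of 4.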

Fetch and Increment Hardware

[Figure: the fetch and increment components, not yet connected: PC, Instruction Memory (input: instruction address; output: instruction), and Adder.]

Fetch and Increment Connections

[Figure: the PC, a 64-bit register, supplies the instruction address to the Instruction Memory, which outputs the instruction. The Adder adds the constant 4 to the PC: increment by 4 for the next instruction.]

Decode and Execute

❖ Decode

◆ Take the Instruction Register (IR) and compute the bits needed to control the datapath (R/W flags, enables, mux selects, etc.)

◆ We will work more on the control later

❖ Execute

◆ Take the inputs specified by the instruction and complete the required operation

Instruction Execution

❖ PC → instruction memory, fetch instruction

❖ Register numbers → register file, read registers

❖ Depending on instruction class

◆ Use ALU to calculate

➢ Arithmetic result

➢ Memory address for load/store

➢ Branch comparison

◆ Access data memory for load/store

◆ PC ← target address or PC + 4

R-format Example: “ADD” Instruction

add x9, x20, x21

❖ For the execute step of FIDE, what hardware do we need?

funct7    rs2     rs1     funct3   rd      opcode
7 bits    5 bits  5 bits  3 bits   5 bits  7 bits
0         21      20      0        9       51
0000000   10101   10100   000      01001   0110011
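
As a quick check of the field layout, the sketch below packs the six R-format fields into one 32-bit word. The helper name `encode_r` is hypothetical and introduced only for this illustration; the field positions are the ones used elsewhere in these slides (rd in bits 11-7, rs1 in 19-15, rs2 in 24-20, funct7 in 31-25).

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack R-format fields into a 32-bit instruction word."""
    # Reject values that do not fit their field width.
    for value, width in ((funct7, 7), (rs2, 5), (rs1, 5),
                         (funct3, 3), (rd, 5), (opcode, 7)):
        if not 0 <= value < (1 << width):
            raise ValueError("field value out of range")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


# add x9, x20, x21
word = encode_r(funct7=0, rs2=21, rs1=20, funct3=0, rd=9, opcode=0b0110011)
print(f"{word:032b}")  # 00000001010110100000010010110011
print(hex(word))       # 0x15a04b3
```

Reading the printed bit string in groups of 7, 5, 5, 3, 5, and 7 bits gives back the fields of the add instruction.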

R-Format Instructions - Hardware

❖Read two register operands

❖Perform arithmetic/logical operation

❖Write register result

With Hardware Connections

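[Figure: the instruction fields rs1 [19-15], rs2 [24-20], and rd [11-7] connect to the Registers ports read reg1, read reg2, and write reg. The read data1 and read data2 outputs feed the ALU, which is set to operation 0010 and produces the zero and result outputs; write data is the remaining Registers input.]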

Complete “Add” Datapath

[Figure, built up over six slides: the fetch hardware (PC, Instruction Memory, Adder with constant input 4) combined with the Registers block (read reg1, read reg2, write reg, write data, read data1, read data2) and the ALU (zero and result outputs). Later builds label the instruction fields rs1 [19-15], rs2 [24-20], and rd [11-7] and the ALU operation code 0010.]

Control of R-format Instructions

❖ Simplicity favors regularity!

◆ and, or, add, subtract, and set-on-less-than all use the same datapath

❖ Need to decode the instructions to control the ALU.

◆ Input: function codes of each instruction (recall from Chapter 2)

◆ Output: ALU control lines (will look at these later)

ALU Decoding for R-format

[Figure: an ALU Control block drives the ALU (zero and result outputs); the instruction fields rs1 [19-15], rs2 [24-20], and rd [11-7] are labeled.]

R-format Datapath and Control

[Figure, built up over seven slides: the add datapath (PC, Instruction Memory, Adder with constant input 4, Registers with ports read reg1, read reg2, write reg, write data, read data1, read data2, and the ALU with zero and result outputs) with an ALU Control block added. Later builds label the instruction fields rs1 [19-15], rs2 [24-20], and rd [11-7].]

Load/Store Instructions - Hardware

❖Read register operands

❖Calculate address using 12-bit offset

◆Use ALU, but sign-extend offset

❖Load: Read memory and update register

❖Store: Write register value to memory

I-format Instructions

❖ Immediate arithmetic and load instructions

◆ rs1: source or base address register number

◆ immediate: constant operand, or offset added to base address

➢ 2s-complement, sign extended

immediate   rs1     funct3   rd      opcode
12 bits     5 bits  3 bits   5 bits  7 bits

S-format Instructions

❖ Different immediate format for store instructions

◆ rs1: base address register number

◆ rs2: source operand register number

◆ immediate: offset added to base address

➢ Split so that rs1 and rs2 fields always in the same place

imm[11:5]   rs2     rs1     funct3   imm[4:0]   opcode
7 bits      5 bits  5 bits  3 bits   5 bits     7 bits
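
The two immediate layouts can be compared with a short sketch that extracts and sign-extends the 12-bit offset from an I-format and an S-format word, then forms the load/store address. The function names (`sign_extend`, `imm_i`, `imm_s`, `effective_address`) are hypothetical and chosen only for this illustration; registers are assumed to be 64 bits wide.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit registers


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2s-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def imm_i(instr):
    """I-format: the immediate occupies instruction bits 31-20."""
    return sign_extend(instr >> 20, 12)


def imm_s(instr):
    """S-format: imm[11:5] is in bits 31-25, imm[4:0] in bits 11-7."""
    upper = (instr >> 25) & 0x7F
    lower = (instr >> 7) & 0x1F
    return sign_extend((upper << 5) | lower, 12)


def effective_address(base, offset):
    """Base register value plus sign-extended offset, wrapped to 64 bits."""
    return (base + offset) & MASK64
```

In hardware the sign extension is just wiring (the sign bit is replicated); the arithmetic form used here gives the same value in software.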

Composing the Elements

❖ The first-cut datapath executes each instruction in one clock cycle

◆ Each datapath element can only do one function at a time

◆ Hence, we need separate instruction and data memories

❖ Use multiplexers where alternate data sources are used for different instructions

R-Type/Load/Store Datapath

Branch Instructions

❖Read register operands

❖Compare operands

◆Use ALU, subtract and check Zero output

❖Calculate target address

◆Sign-extend displacement

◆Shift left 1 place (halfword displacement)

◆Add to PC value

SB-format - Branch Addressing

❖ Branch to a labeled instruction if a condition is true

◆ Otherwise, continue sequentially

❖ beq rs1, rs2, L1

◆ if (rs1 == rs2) branch to instruction labeled L1

❖ SB format:

imm[12]   imm[10:5]   rs2   rs1   funct3   imm[4:1]   imm[11]   opcode

◼ PC-relative addressing

◼ Target address = PC + immediate × 2
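
The branch target calculation can be sketched as follows. The function names (`branch_offset`, `next_pc`) are hypothetical. The sketch assumes the standard RISC-V placement of the SB-format immediate bits (imm[12] in bit 31, imm[10:5] in bits 30-25, imm[4:1] in bits 11-8, imm[11] in bit 7). Because it reassembles the bits with an implicit 0 in bit position 0, the value it returns is already the byte offset, which corresponds to "immediate × 2" for the halfword displacement stored in the instruction.

```python
def branch_offset(instr):
    """Reassemble and sign-extend the SB-format byte offset."""
    imm12 = (instr >> 31) & 0x1
    imm10_5 = (instr >> 25) & 0x3F
    imm4_1 = (instr >> 8) & 0xF
    imm11 = (instr >> 7) & 0x1
    # Bit 0 is always 0: this is the "shift left 1 place" step.
    offset = (imm12 << 12) | (imm11 << 11) | (imm10_5 << 5) | (imm4_1 << 1)
    sign = 1 << 12                 # the offset is a 13-bit signed value
    return (offset ^ sign) - sign


def next_pc(pc, instr, rs1_value, rs2_value):
    """beq: subtract the operands and test the Zero result, as the ALU does."""
    zero = (rs1_value - rs2_value) == 0
    return pc + branch_offset(instr) if zero else pc + 4
```

When the two register values are equal the next PC is the PC plus the byte offset; otherwise execution continues sequentially at PC + 4.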

Branch Instructions

[Figure: branch datapath with two callouts. The shift left by 1 just re-routes wires; the sign extension replicates the sign-bit wire.]

R-Type/Load/Store Datapath

Full Datapath

Overall Control

❖ Divide and conquer

1. “ALU Control” unit

➢ Uses the 2-bit ALUOp generated by the Main Control unit

➢ Generates the output that directly controls the function executed by the ALU

2. “Main Control” unit

➢ Control signals derived from the instruction

➢ Generates the 2-bit ALUOp used by ALU Control

1) “ALU Control” unit

❖ ALU used for

◆ Load/Store: F = add

◆ Branch: F = subtract

◆ R-type: F depends on the funct7 and funct3 fields

“ALU control”   Function
0000            AND
0001            OR
0010            add
0110            subtract
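
A behavioral sketch of an ALU driven by these four control codes is shown below. It is an illustration only: the function name `alu` is hypothetical, operands are assumed to be unsigned 64-bit values, and results wrap modulo 2^64.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit datapath


def alu(control, a, b):
    """Return (result, zero) for the 4-bit ALU control code."""
    if control == 0b0000:
        result = a & b              # AND
    elif control == 0b0001:
        result = a | b              # OR
    elif control == 0b0010:
        result = (a + b) & MASK64   # add
    elif control == 0b0110:
        result = (a - b) & MASK64   # subtract
    else:
        raise ValueError("unsupported ALU control code")
    zero = int(result == 0)         # Zero output, used by beq
    return result, zero
```

For a branch, the control code is 0110 and only the zero output matters: `alu(0b0110, 7, 7)` returns `(0, 1)`, meaning the operands are equal.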

ALU Control

❖ Generates “ALU Control” signals

◆ Based on inputs: ALUOp and the Funct7 and Funct3 fields of the instruction

❖ 2-bit ALUOp derived from opcode by “Main Control” unit

❖ Can be implemented by simple combinational logic

◆ By logic synthesis from Truth Table on next slide

Opcode   ALUOp   Operation         Funct7    Funct3   ALU function   ALU Operation
ld       00      load register     XXXXXXX   XXX      add            0010
sd       00      store register    XXXXXXX   XXX      add            0010
beq      01      branch on equal   XXXXXXX   XXX      subtract       0110
R-type   10      add               0000000   000      add            0010
R-type   10      subtract          0100000   000      subtract       0110
R-type   10      AND               0000000   111      AND            0000
R-type   10      OR                0000000   110      OR             0001
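
The mapping from ALUOp and function fields to the 4-bit ALU operation can be written as a small lookup. This is a sketch, not the synthesized logic: the function name `alu_control` is hypothetical, and the R-type rows use the RISC-V Funct7/Funct3 encodings for add, sub, and, or.

```python
# (funct7, funct3) -> 4-bit ALU operation for R-type instructions
R_TYPE_OPERATION = {
    (0b0000000, 0b000): 0b0010,  # add
    (0b0100000, 0b000): 0b0110,  # subtract
    (0b0000000, 0b111): 0b0000,  # AND
    (0b0000000, 0b110): 0b0001,  # OR
}


def alu_control(alu_op, funct7=0, funct3=0):
    """Return the 4-bit ALU operation for a 2-bit ALUOp and funct fields."""
    if alu_op == 0b00:           # ld / sd: address = base + offset
        return 0b0010
    if alu_op == 0b01:           # beq: compare by subtracting
        return 0b0110
    if alu_op == 0b10:           # R-type: the funct fields decide
        try:
            return R_TYPE_OPERATION[(funct7, funct3)]
        except KeyError:
            raise ValueError("unsupported funct7/funct3 combination")
    raise ValueError("unsupported ALUOp")
```

For ALUOp values 00 and 01 the funct fields are don't-cares, which is why they default to 0 here and are never inspected.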

ALU Control – Truth Table

❖ Input signals:

◆ ALUOp (2 bits), Funct7 (7 bits, Instruction [31-25]), Funct3 (3 bits, Instruction [14-12])

❖ Output signals:

◆ ALU Operation (4 bits)

❖ Table from Figure 4.13 in Textbook

2) “Main Control” unit

❖ Generates control signals for: Register file, data memory, multiplexers, AND gate (branch related), ALUOp (2 bits), etc. (will see more later)

❖ Control signals derived from instruction opcode

Main Control – Truth Table

❖ Input signals:

◆ Opcode (7 bits, Instruction [6-0])

❖ Output signals:

◆ ALUSrc, MemtoReg, RegWrite, etc.

❖ Table from Figure 4.22 in Textbook
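
The main control unit can be sketched as a lookup keyed on the 7-bit opcode. The slides name only ALUSrc, MemtoReg, RegWrite, and ALUOp explicitly; the remaining signal names here (MemRead, MemWrite, Branch) and the per-instruction values are assumptions that follow the usual textbook single-cycle design, and `None` marks a don't-care. The function name `main_control` is hypothetical.

```python
# Opcode -> control signals (assumed textbook-style single-cycle design).
CONTROL_TABLE = {
    0b0110011: dict(ALUSrc=0, MemtoReg=0, RegWrite=1, MemRead=0,
                    MemWrite=0, Branch=0, ALUOp=0b10),     # R-format
    0b0000011: dict(ALUSrc=1, MemtoReg=1, RegWrite=1, MemRead=1,
                    MemWrite=0, Branch=0, ALUOp=0b00),     # ld
    0b0100011: dict(ALUSrc=1, MemtoReg=None, RegWrite=0, MemRead=0,
                    MemWrite=1, Branch=0, ALUOp=0b00),     # sd
    0b1100011: dict(ALUSrc=0, MemtoReg=None, RegWrite=0, MemRead=0,
                    MemWrite=0, Branch=1, ALUOp=0b01),     # beq
}


def main_control(instr):
    """Look up the control signals from Instruction [6-0]."""
    opcode = instr & 0x7F
    if opcode not in CONTROL_TABLE:
        raise ValueError("opcode not in the implemented subset")
    return dict(CONTROL_TABLE[opcode])  # copy so callers cannot edit the table
```

For example, `main_control(0x015A04B3)` (add x9, x20, x21) yields RegWrite = 1 and ALUOp = 10, so the ALU Control unit then selects the operation from the funct fields.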

Datapath With Control

R-Type Instruction

Load Instruction

BEQ Instruction

Review Control Structure

❖ All of the logic is combinational (“Single cycle”)

❖ We wait for everything to settle down and for the right thing to be done

◆ We use write signals along with the clock to determine when to write

❖ Cycle time is determined by the length of the longest path

Note: We are ignoring some details like setup and hold times

Practice – Performance Analysis

❖ Calculate cycle time assuming:

◆ memory (2 ns), ALU and adders (2 ns), register file access (1 ns)
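
One way to work this exercise is to sum the unit delays along each instruction's path and take the maximum. The sketch below makes several assumptions that the slide does not state: multiplexers, control logic, sign extension, and wires have negligible delay; the PC + 4 and branch-target adders work in parallel with the main path and are not on the critical path; and each instruction class uses the units listed in `PATHS`. All names are hypothetical.

```python
# Unit delays in nanoseconds, from the exercise statement.
DELAY_NS = {
    "instruction memory": 2,
    "register read": 1,
    "ALU": 2,
    "data memory": 2,
    "register write": 1,
}

# Units on the critical path of each instruction class (an assumption).
PATHS = {
    "R-type": ["instruction memory", "register read", "ALU", "register write"],
    "ld": ["instruction memory", "register read", "ALU", "data memory",
           "register write"],
    "sd": ["instruction memory", "register read", "ALU", "data memory"],
    "beq": ["instruction memory", "register read", "ALU"],
}


def instruction_time(name):
    """Total delay along one instruction's path, in ns."""
    return sum(DELAY_NS[unit] for unit in PATHS[name])


def cycle_time():
    """A single-cycle design must fit the slowest instruction."""
    return max(instruction_time(name) for name in PATHS)


for name in PATHS:
    print(name, instruction_time(name), "ns")
print("cycle time:", cycle_time(), "ns")
```

Under these assumptions the paths take 6 ns (R-type), 8 ns (ld), 7 ns (sd), and 5 ns (beq), so the load sets the cycle time at 8 ns.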

Limitations of single-cycle operations

❖ Each instruction uses the ENTIRE datapath until it finishes.

◆ Clock cycle time based on slowest instruction

◆ What if we add more stuff, like floating-point?

➢ Worst-case time delay drastically exceeds average

➢ Need really long cycle time to accommodate

❖ Goal: Use as much of the hardware as much of the time as possible

◆ PIPELINING: Break the datapath into smaller chunks, and let new instructions start while others are finishing
