CS61C L26 CPU Design : Designing a Single-Cycle CPU II (1) Garcia, Spring 2007 © UCB

UC Berkeley CS61C : Machine Structures
Lecture 26 CPU Design: Designing a Single-cycle CPU, pt 2
2007-03-19
Lecturer SOE Dan Garcia
www.cs.berkeley.edu/~ddgarcia

3.6 TB DVDs? Maybe! Researchers at Harvard have found a way to use light-focusing “optical antennas” to get way past the diffraction limit and focus light smaller.
technologyreview.com/Infotech/18295


inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures



How to Design a Processor: step-by-step

1. Analyze instruction set architecture (ISA) => datapath requirements
   • meaning of each instruction is given by the register transfers
   • datapath must include storage element for ISA registers
   • datapath must support each register transfer
2. Select set of datapath components and establish clocking methodology
3. Assemble datapath meeting requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
5. Assemble the control logic


Clocking Methodology

• Storage elements clocked by same edge
• Being physical devices, flip-flops (FF) and combinational logic have some delays
  • Gates: delay from input change to output change
  • Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF (set-up time), and we have the usual clock-to-Q delay
• “Critical path” (longest path through logic) determines length of clock period

[Figure: clocking diagram, signal Clk]
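The timing rule above can be sketched numerically. This is a minimal illustration, not part of the original slides: the function name `min_clock_period` and all delay values are hypothetical, and it assumes the clock period must cover clock-to-Q delay, the longest combinational path, and the set-up time.

```python
def min_clock_period(clk_to_q, logic_delays, setup_time):
    """Smallest clock period that lets every FF-to-FF path settle.

    logic_delays holds the combinational delay of each path between
    flip-flops; the longest one is the critical path.
    """
    critical_path = max(logic_delays)
    return clk_to_q + critical_path + setup_time


# Hypothetical delays in picoseconds, chosen only for illustration.
period = min_clock_period(clk_to_q=30, logic_delays=[200, 450, 120], setup_time=20)
print(period)  # 500
```

Only the longest path matters here: shortening the 200 ps or 120 ps paths would not change the result.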


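Register-Register Timing: One complete cycle

[Timing diagram: each signal goes from Old Value to New Value within one cycle of Clk]
• PC: Old Value → New Value
• Rs, Rt, Rd, Op, Func: New Value after Instruction Memory Access Time
• ALUctr: New Value after Delay through Control Logic
• RegWr: New Value after Delay through Control Logic
• busA, B: New Value after Register File Access Time
• busW: New Value after ALU Delay
• Register Write Occurs Here

[Figure: RegFile (ports Rw, Ra, Rb, 5 bits each, fed by Rd, Rs, Rt; inputs RegWr, clk, busW) drives busA and busB (32 bits) into the ALU (ALUctr); busW is 32 bits]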


3c: Logical Operations with Immediate

• R[rt] = R[rs] op ZeroExt[imm16]

  field:  op      rs      rt      immediate
  bits:   31-26   25-21   20-16   15-0
  width:  6 bits  5 bits  5 bits  16 bits

ZeroExt[imm16]:
  bits 31-16: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (16 bits)
  bits 15-0:  immediate (16 bits)

[Figure: the register-register datapath (RegFile with Rw, Ra, Rb fed by Rd, Rs, Rt; busA, busB, busW; ALU; ALUctr; RegWr; clk)]

But we’re writing to Rt register??


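• Already defined 32-bit MUX; Zero Ext?

What about Rt register read??

[Figure: datapath with two muxes added. The RegDst mux chooses between Rd and Rt for the RegFile write port Rw; the ALUSrc mux chooses between busB and ZeroExt(imm16) (16 to 32 bits) for the second ALU input. Other labels: Ra = Rs, Rb = Rt, busA, ALUctr, RegWr, clk.]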
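The two muxes can be sketched as a small behavioral model. This is an illustrative sketch, not the lecture's implementation: the function names are hypothetical, and the select encodings (RegDst = 1 picks Rd, ALUSrc = 1 picks the immediate) are assumptions, since the slide text does not spell them out.

```python
import operator

MASK32 = 0xFFFFFFFF


def zero_ext(imm16):
    """Zero-extend a 16-bit immediate to 32 bits (upper 16 bits are 0)."""
    return imm16 & 0xFFFF


def mux(sel, in0, in1):
    """2-to-1 mux: in0 when sel == 0, in1 when sel == 1."""
    return in1 if sel else in0


def execute(regs, rs, rt, rd, imm16, alu_op, reg_dst, alu_src):
    """One register-transfer step through the RegFile, muxes, and ALU."""
    bus_a = regs[rs]                                  # Ra = Rs
    bus_b = regs[rt]                                  # Rb = Rt
    alu_in_b = mux(alu_src, bus_b, zero_ext(imm16))   # ALUSrc mux (assumed encoding)
    result = alu_op(bus_a, alu_in_b) & MASK32
    rw = mux(reg_dst, rt, rd)                         # RegDst mux (assumed encoding)
    new_regs = list(regs)
    new_regs[rw] = result                             # RegWr asserted
    return new_regs


regs = [0] * 32
regs[1] = 0x0000F0F0
# ori-style transfer: R[rt] = R[rs] | ZeroExt[imm16]
after = execute(regs, rs=1, rt=2, rd=16, imm16=0x8001,
                alu_op=operator.or_, reg_dst=0, alu_src=1)
print(hex(after[2]))  # 0xf0f1
```

With `reg_dst=1` and `alu_src=0` the same function models a register-register operation that writes Rd.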


3d: Load Operations

• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt,rs,imm16

  field:  op      rs      rt      immediate
  bits:   31-26   25-21   20-16   15-0
  width:  6 bits  5 bits  5 bits  16 bits

[Figure: the datapath from 3c, with the RegDst and ALUSrc muxes and ZeroExt on imm16]


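[Figure: datapath extended for loads. ZeroExt is replaced by an Extender (imm16, 16 to 32 bits) controlled by ExtOp; the ALU output feeds Adr of the Data Memory (WrEn driven by MemWr, Data In, clk); the MemtoReg mux chooses what is written back on busW. One connection is marked "?". Other labels: RegFile (Rw, Ra, Rb), Rs, Rt, Rd, RegDst, ALUSrc, busA, busB, ALUctr, RegWr, clk.]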
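The load transfer R[rt] = Mem[R[rs] + SignExt[imm16]] can be sketched as follows. This is a minimal model under stated assumptions: memory is a hypothetical Python dict keyed by byte address, and the function names are illustrative rather than taken from the lecture.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to a 32-bit two's-complement value."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def load_word(regs, mem, rs, rt, imm16):
    """Model lw: the ALU adds R[rs] and SignExt[imm16] to form the address,
    Data Memory is read, and the result is written back to R[rt]."""
    addr = (regs[rs] + sign_ext(imm16)) & MASK32   # ALU add, wraps at 32 bits
    data_out = mem[addr]                           # Data Memory read
    new_regs = list(regs)
    new_regs[rt] = data_out                        # write-back selects memory data
    return new_regs


regs = [0] * 32
regs[1] = 0x1008
mem = {0x1004: 0xDEADBEEF}                         # hypothetical memory contents
after = load_word(regs, mem, rs=1, rt=2, imm16=0xFFFC)  # offset -4
print(hex(after[2]))  # 0xdeadbeef
```

The negative offset shows why zero extension is not enough here: 0xFFFC must become 0xFFFFFFFC so the address lands at 0x1004.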


3e: Store Operations

• Mem[ R[rs] + SignExt[imm16] ] = R[rt]
  Ex.: sw rt, rs, imm16

  field:  op      rs      rt      immediate
  bits:   31-26   25-21   20-16   15-0
  width:  6 bits  5 bits  5 bits  16 bits

[Figure: the load datapath: RegFile (Rw, Ra, Rb; Rs, Rt, Rd; RegDst; RegWr; clk), busA, busB, busW, Extender (imm16, 16 to 32 bits, ExtOp), ALUSrc mux, ALU (ALUctr), Data Memory (WrEn driven by MemWr, Adr, Data In, clk), MemtoReg mux]
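The store transfer Mem[R[rs] + SignExt[imm16]] = R[rt] can be sketched the same way. As before, this is an illustration under assumptions: the dict-based memory and the function names are hypothetical, and the register file is left unchanged because a store writes no register.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to a 32-bit two's-complement value."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def store_word(regs, mem, rs, rt, imm16):
    """Model sw: the ALU forms the address, R[rt] goes to Data In,
    and the Data Memory write enable is asserted."""
    addr = (regs[rs] + sign_ext(imm16)) & MASK32   # ALU add, wraps at 32 bits
    new_mem = dict(mem)
    new_mem[addr] = regs[rt]                       # Data In = R[rt], MemWr asserted
    return new_mem


regs = [0] * 32
regs[1] = 0x2000
regs[2] = 42
mem_after = store_word(regs, {}, rs=1, rt=2, imm16=8)
print(mem_after)  # {8200: 42}
```

Note that rt is read here rather than written, which is why the register file needs Rt on a read port as well as on the write-address mux.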



3f: The Branch Instruction

beq rs, rt, imm16
• mem[PC]: Fetch the instruction from memory
• Equal = R[rs] == R[rt]: Calculate branch condition
• if (Equal): Calculate the next instruction’s address
    PC = PC + 4 + ( SignExt(imm16) x 4 )
  else
    PC = PC + 4

  field:  op      rs      rt      immediate
  bits:   31-26   25-21   20-16   15-0
  width:  6 bits  5 bits  5 bits  16 bits
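The next-PC rule for beq can be sketched directly from the register transfers above. The function names are hypothetical and the addresses are illustrative values only.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit immediate to a 32-bit two's-complement value."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def next_pc(pc, equal, imm16):
    """PC = PC + 4 + SignExt(imm16) x 4 if Equal, else PC + 4."""
    pc_plus_4 = (pc + 4) & MASK32
    if equal:
        offset = (sign_ext(imm16) << 2) & MASK32   # word offset -> byte offset
        return (pc_plus_4 + offset) & MASK32
    return pc_plus_4


pc = 0x00400010                                    # illustrative address
print(hex(next_pc(pc, equal=False, imm16=3)))      # 0x400014
print(hex(next_pc(pc, equal=True, imm16=3)))       # 0x400020
print(hex(next_pc(pc, equal=True, imm16=0xFFFF)))  # 0x400010
```

The last call uses an immediate of -1, which branches back to the beq itself because the offset is relative to PC + 4.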


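Datapath for Branch Operations

• beq rs, rt, imm16
  Datapath generates condition (equal)

  field:  op      rs      rt      immediate
  bits:   31-26   25-21   20-16   15-0
  width:  6 bits  5 bits  5 bits  16 bits

• Already have mux, adder, need special sign extender for PC, need equal compare (sub?)

[Figure: next-address logic and compare. Labels in the fetch part: PC (clk), "00", 4, PC Ext (imm16), two Adders, Mux (nPC_sel), Inst Address. Labels in the compare part: RegFile (Rw, Ra, Rb; Rs, Rt; RegWr; clk; busW), busA and busB (32 bits) into an "=" comparator that produces Equal; ALU (ALUctr).]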


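Putting it All Together: A Single Cycle Datapath

[Figure: complete single-cycle datapath]
• Instruction fetch: PC (clk), "00", 4, PC Ext (imm16), two Adders, Mux (nPC_sel), Inst Memory (Adr) producing Instruction<31:0>
• Instruction fields: Rs = <21:25>, Rt = <16:20>, Rd = <11:15>, Imm16 = <0:15>
• Register file: RegFile (Rw, Ra, Rb, 5 bits each; RegWr; clk), RegDst mux (Rd or Rt), busA, busB, busW (32 bits)
• Execute: Extender (imm16, 16 to 32 bits, ExtOp), ALUSrc mux, ALU (ALUctr), "=" comparator producing Equal
• Memory: Data Memory (WrEn driven by MemWr, Adr, Data In, clk), MemtoReg mux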
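The field wiring in the figure (Rs, Rt, Rd, and Imm16 taken from fixed bit positions of Instruction<31:0>) can be sketched as a decoder. The function name `decode` is hypothetical; the example word is an `ori $2, $1, 0x8001` encoding, used only as test input.

```python
def decode(instr):
    """Slice a 32-bit instruction word into the fields the datapath uses.

    Every field is extracted unconditionally, as in the figure; the
    control signals later decide which ones matter for a given instruction.
    """
    return {
        "op": (instr >> 26) & 0x3F,     # bits 31-26
        "rs": (instr >> 21) & 0x1F,     # bits 25-21
        "rt": (instr >> 16) & 0x1F,     # bits 20-16
        "rd": (instr >> 11) & 0x1F,     # bits 15-11
        "imm16": instr & 0xFFFF,        # bits 15-0
    }


fields = decode(0x34228001)             # ori $2, $1, 0x8001
print(fields)
# {'op': 13, 'rs': 1, 'rt': 2, 'rd': 16, 'imm16': 32769}
```

The rd value of 16 is just the top five bits of the immediate. For an immediate-format instruction it is meaningless, which is why the RegDst mux must pick Rt instead.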


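An Abstract View of the Implementation

[Figure: two blocks, Control and Datapath. Control receives the Instruction and Conditions and drives Control Signals into the Datapath. Datapath labels: PC (clk), Next Address, Ideal Instruction Memory (Instruction Address in, Instruction out), Register File (Rw, Ra, Rb, 5 bits each, fed by Rd, Rs, Rt; clk), A and B (32 bits) into the ALU, Ideal Data Memory (Data Addr, Data In, Data Out; clk).]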


An Abstract View of the Critical Path

Critical Path (Load Instruction) =
  Delay clock through PC (FFs)
  + Instruction Memory’s Access Time
  + Register File’s Access Time
  + ALU to Perform a 32-bit Add
  + Data Memory Access Time
  + Stable Time for Register File Write

(Assumes a fast controller)

[Figure: the abstract datapath view: PC, Next Address, Ideal Instruction Memory, Register File (Rw, Ra, Rb; Rd, Rs, Rt), ALU (A, B), Ideal Data Memory (Data Addr, Data In), clk]
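The load critical path is a plain sum of the stage delays listed above. A minimal sketch, with hypothetical stage names and made-up delays in picoseconds (the slides give no numbers):

```python
# Stages on the load instruction's critical path, in the order listed above.
LOAD_PATH = [
    "pc_clk_to_q",        # delay clock through PC (FFs)
    "inst_mem_access",    # Instruction Memory access time
    "reg_file_access",    # Register File access time
    "alu_add",            # ALU performing a 32-bit add
    "data_mem_access",    # Data Memory access time
    "reg_write_setup",    # stable time for Register File write
]


def load_critical_path(delays):
    """Sum the stage delays along the load path (assumes a fast controller)."""
    return sum(delays[stage] for stage in LOAD_PATH)


# Hypothetical delays in picoseconds, for illustration only.
delays = {
    "pc_clk_to_q": 30,
    "inst_mem_access": 200,
    "reg_file_access": 100,
    "alu_add": 200,
    "data_mem_access": 200,
    "reg_write_setup": 20,
}
print(load_critical_path(delays))  # 750
```

In a single-cycle design this sum sets the clock period for every instruction, including those that never touch data memory.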


Administrivia

• Friday will be a webcast archived lecture
  • …no need to attend class, but you can if you want to (I’ll be here)


Peer Instruction

A. Truth table for mux with 4-bits of signals has 2^4 rows
B. We could cascade N 1-bit shifters to make 1 N-bit shifter for sll, srl
C. If 1-bit adder delay is T, the N-bit adder delay would also be T

   ABC
0: FFF
1: FFT
2: FTF
3: FTT
4: TFF
5: TFT
6: TTF
7: TTT


Peer Instruction Answer

A. Truth table for mux with 4-bits of signals controls 16 inputs, for a total of 20 inputs, so truth table is 2^20 rows … FALSE
B. We could cascade N 1-bit shifters to make 1 N-bit shifter for sll, srl … TRUE
C. What about the cascading carry? … FALSE

Answer: 2: FTF
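The three answers can be checked with a few lines of arithmetic. This is an illustrative sketch: the function names are hypothetical, and the adder model assumes a ripple-carry design in which the carry passes through every 1-bit stage in turn.

```python
def mux_truth_table_rows(select_bits):
    """A mux with s select bits has 2**s data inputs, so s + 2**s inputs total."""
    data_inputs = 2 ** select_bits
    return 2 ** (select_bits + data_inputs)


def shift_left_cascade(value, amount, width=32):
    """Shift by `amount` bits using a chain of 1-bit left shifters."""
    mask = (1 << width) - 1
    for _ in range(amount):
        value = (value << 1) & mask   # one 1-bit shifter stage
    return value


def ripple_carry_delay(n_bits, t):
    """Worst-case delay when the carry ripples through n 1-bit adders."""
    return n_bits * t


print(mux_truth_table_rows(4))        # 1048576, i.e. 2^20, not 2^4
print(hex(shift_left_cascade(1, 5)))  # 0x20
print(ripple_carry_delay(32, 1))      # 32, not 1
```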


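Summary: A Single Cycle Datapath

[Figure: the complete single-cycle datapath, with instruction fetch collapsed into one block]
• Instruction fetch: instr fetch unit (nPC_sel, clk) producing Instruction<31:0>
• Instruction fields: Rs = <21:25>, Rt = <16:20>, Rd = <11:15>, Imm16 = <0:15>
• Register file: RegFile (Rw, Ra, Rb, 5 bits each; RegWr; clk), RegDst mux (Rd or Rt), busA, busB, busW (32 bits)
• Execute: Extender (imm16, 16 to 32 bits, ExtOp), ALUSrc mux, ALU (ALUctr), "=" comparator producing zero
• Memory: Data Memory (WrEn driven by MemWr, Adr, Data In, clk), MemtoReg mux

• We have everything except control signals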