ECE 550D Fundamentals of Computer Systems and Engineering

Fall 2017

Storage and Clocking

Prof. John Board

Duke University

Slides are derived from work by Profs. Tyler Bletsch and Andrew Hilton (Duke)

2

VHDL: Behavioral vs Structural

• A few words about VHDL

• Structural:

• Spell out at (roughly) gate level

• Package pieces into entities for abstraction/re-use

• Very easy to understand what synthesis does to it

• Behavioral:

• Spell out at higher level

• Sequential statements, for loops, process blocks

• Can be difficult to understand how it synthesizes

• Difficult to resolve performance issues

3

Last time…

• Who can remind us what we did last time?

• Add

• Subtract

• Bit shift

• Floating point

4

So far…

• We can make logic to compute “math”

• Add, subtract,… (we’ll see multiply/divide later)

• Bitwise: AND, OR, NOT,…

• Shifts

• Selection (MUX)

• …pretty much anything

• But processors need state, i.e. memory (hold value)

• Registers

• Main Memory

• Storage (disks)

5

Memory Elements

• All the circuits we looked at so far are combinational circuits: the output is (only) a Boolean function of the present inputs (a specific combination of inputs always results in exactly the same output)

• We need circuits that can remember values (registers, memory)

• For sequential circuits, the output of the circuit is a function of both the present inputs and the history (sequence of events) of how we got here, represented by a value which we call the state

• Circuits with storage are called sequential circuits

• Key to storage: feedback

6

Ideal Storage – Where We’re Headed

• Ultimately, we want something that can hold 1 bit and we want to control when it is re-written

• However, instead of just giving it to you as a magic black box, we’re going to first dig a bit into the box

“flip flop” = device that holds one bit (0 or 1)

• Input: bit to be written

• Input: bit to control when we write

• Output: bit currently being held

7

FF Step #1: NOR-based Set-Reset (SR) Latch

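[Figure: two cross-coupled NOR gates with inputs R and S and outputs Q and !Q, shown in their two stable states with R = S = 0: Q = 0, !Q = 1 (left) and Q = 1, !Q = 0 (right).]

If I am crazy enough to wire up 2 NOR gates this way, fire?? No: there are two stable configurations, as shown here. But how do we move between them?

Let’s start with the left side, which we consider to be the “reset” state.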

8

Set-Reset Latch (Continued)

[Figure: the SR latch in its reset and set states, with a timing diagram of S, R, and Q versus time.]

We activate SET for a brief time, then lower it again.

9

Set-Reset Latch (Continued)

[Timing diagram: S, R, and Q versus time.]

Set signal goes high. Output signal goes high (2 gate delays later).

10

Set-Reset Latch (Continued)

[Timing diagram: S, R, and Q versus time.]

Set signal goes low. Output signal stays high.

11

Set-Reset Latch (Continued)

[Timing diagram: S, R, and Q versus time.]

Output stays high until the Reset signal goes high. Then the output signal goes low (again 2 gate delays later).

12

FF Step #1: NOR-based Set-Reset (SR) Latch

[Figure: the SR latch in its two stable states.]

R S Q   Meaning
0 0 Q   R, S both low: remember the previous command
0 1 1   S sets the latch to 1 (even if it is already 1)
1 0 0   R resets the latch to 0 (even if it is already 0)
1 1 -   R, S together cause oscillation, random value: bad

Don’t set both S & R to 1. Seriously, don’t do it.
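The truth table can be checked with a small Python model of the two cross-coupled NOR gates. This is only a sketch: the names nor and sr_latch_settle are made up for illustration, and it assumes both gates switch in lock-step, one gate delay per loop pass, which real hardware does not guarantee.

```python
def nor(a: int, b: int) -> int:
    """2-input NOR on 0/1 values."""
    return 0 if (a or b) else 1


def sr_latch_settle(s: int, r: int, q: int, q_bar: int, max_steps: int = 8):
    """Re-evaluate both NOR gates until the outputs stop changing.

    (q, q_bar) is the state before the inputs are applied.
    Each loop pass stands for one gate delay (simplifying assumption).
    Returns the settled (q, q_bar), or the last value seen if it never settles.
    """
    for _ in range(max_steps):
        new_q = nor(r, q_bar)      # gate driving Q sees R and !Q
        new_q_bar = nor(s, q)      # gate driving !Q sees S and Q
        if (new_q, new_q_bar) == (q, q_bar):
            break                  # stable: nothing changed this pass
        q, q_bar = new_q, new_q_bar
    return q, q_bar


# Start in the "reset" state, pulse S, release it, then pulse R.
state = (0, 1)
state = sr_latch_settle(1, 0, *state)   # set
state = sr_latch_settle(0, 0, *state)   # hold: remembers the 1
state = sr_latch_settle(0, 1, *state)   # reset
```

In this model, raising S and R together forces both outputs to 0, and releasing them at the same instant makes the outputs flip back and forth without settling, which is the behavior the last table row warns about.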

13

SR Latch

• Upside: IT HAS MEMORY! If we never invented something better/fancier, we could build computers with these!!!

• Upside: it’s cheap – just 2 nor gates

• Downside: S and R at once = chaos

• Downside: Bad interface, bad timing behavior

• So let’s build on it to do better

14

Building up to the D Flip-Flop and beyond

SR Latch (too awkward) → D Latch (bad timing) → D Flip-Flop (okay but only one bit) → Register (nice!)

[Figure: each stage is built from the previous one: the D latch adds D and E inputs in front of an SR latch, the D flip-flop chains two D latches clocked by C, and the 32-bit register groups DFFs.]

15

FF Step #2: Data Latch (“D Latch”)

Starting with SR Latch

[Figure: SR latch with inputs R and S and outputs Q and !Q.]

16

Data Latch (D Latch)

Starting with SR Latch

Change interface to

Data + Enable (D + E)

If E=0, then R=S=0.

If E=1, then S=D and R=!D

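[Figure: the SR latch with added gating logic: Data and Enable are the new inputs and drive the internal R and S; outputs are Q and !Q.]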
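The two rules above (E=0 gives R=S=0; E=1 gives S=D and R=!D) can be sketched in Python as follows. The function names are illustrative assumptions, and the SR latch is modeled behaviorally rather than gate by gate.

```python
def sr_latch_next(s: int, r: int, q: int) -> int:
    """Behavioral SR latch: return the next Q given the current Q."""
    if s and r:
        raise ValueError("S and R must not both be 1")
    if s:
        return 1
    if r:
        return 0
    return q  # R = S = 0: remember the previous value


def d_latch_next(d: int, e: int, q: int) -> int:
    """D latch built on the SR latch: S = D AND E, R = (NOT D) AND E."""
    s = d & e
    r = (1 - d) & e
    return sr_latch_next(s, r, q)


# Walk through the D latch truth table.
q = 0
q = d_latch_next(1, 1, q)  # enabled: Q follows D and becomes 1
q = d_latch_next(0, 0, q)  # disabled: D is ignored, Q stays 1
```

Because S and R are derived from a single D input, the forbidden S=R=1 case can no longer be produced from outside.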

17

Data Latch (D Latch)

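D E Q
0 1 0
1 1 1
- 0 Q

[Figure: D latch (Data and Enable gated into the R and S inputs of an SR latch) with a timing diagram of D, E, and Q versus time.]

E goes high: D is “latched” and stays as the output.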

18

Data Latch (D Latch)

D E Q
0 1 0
1 1 1
- 0 Q

[Timing diagram: D, E, and Q versus time.]

E goes low: the output is unchanged by changes to D (D does not affect the output).

19

Data Latch (D Latch)

D E Q
0 1 0
1 1 1
- 0 Q

[Timing diagram: D, E, and Q versus time.]

E goes high: D is “latched” and becomes the new output.

20

Data Latch (D Latch)

D E Q
0 1 0
1 1 1
- 0 Q

[Timing diagram: D, E, and Q versus time.]

Slight delay (logic gates take time): here about 4 gate delays’ worth.

21

Logic Takes Time

• The D latch is a good next step but still has messy timing for using in a computer

• Logic takes time:

• Gate delays: delay to switch each gate

• Wire delays: delay for signal to travel down wire

• Other factors (not going into them here)

• Need to make sure that signal timing is right

• Don’t want races or other unpredictable or unexpected conditions.

22

Clocks

• Processors have a clock:

• Alternates 0 1 0 1

• Like the processor’s internal metronome

• Latch → logic → latch in one clock cycle

• 3.4 GHz processor = 3.4 billion clock cycles/sec

[Figure: clock waveform with one clock cycle marked.]

23

FF Step #3: Using Level-Triggered D Latches

• First thoughts: Level Triggered

• Latch enabled when clock is high

• Hold value when clock is low

[Figure: two D latches with a block of logic between them; Clk drives both enables; the data paths are 3 bits wide.]

This slide describes how D-latches can malfunction because they were level triggered. Real D-flip-flops are edge-triggered, and we’re showing you why that’s important.

24

Strawman: Level Triggered

• How we’d like this to work

• Clock is low, all values stable

[Figure: the two-latch pipeline. Values shown, left to right: 010, 111, 100, 001.]

25

Strawman: Level Triggered

• How we’d like this to work

• Clock goes high, latches capture and xmit new val

[Figure: the two-latch pipeline. Values shown, left to right: 010, 010, 100, 100.]

26

Strawman: Level Triggered

• How we’d like this to work

• Signals work their way through logic w/ high clk

[Figure: the two-latch pipeline. Values shown, left to right: 010, 010, 100, 100.]

27

Strawman: Level Triggered

• How we’d like this to work

• Clock goes low after all logic but just before signals reach next latch

[Figure: the two-latch pipeline. Values shown, left to right: 010, 010, 100, 100.]

28

Strawman: Level Triggered

• How we’d like this to work

• Clock goes low before signals reach next latch

[Figure: the two-latch pipeline. Values shown, left to right: 111, 010, 000, 100.]

29

Strawman: Level Triggered

• How we’d like this to work

• Everything stable before clk goes high a second time

[Figure: the two-latch pipeline. Values shown, left to right: 111, 010, 000, 100.]

30

Strawman: Level Triggered

• How we’d like this to work

• Clk goes high again, repeat

[Figure: the two-latch pipeline. Values shown, left to right: 111, 111, 000, 000.]

31

Strawman: Level Triggered

• Problem: Really hard to time this! What if signal reaches latch too early while clk is still high the first time?

[Figure: the two-latch pipeline. Values shown, left to right: 111, 111, 101, 000.]

32

Strawman: Level Triggered

• Problem: What if signal reaches latch too early?

• Signal goes right through latch, into next stage..

[Figure: the two-latch pipeline. Values shown, left to right: 111, 111, 101, 101.]

33

That would be bad…

• Getting into a stage too early is bad

• Something else is still going on there, so it gets corrupted

• Also may be a loop with one latch

• And this assumed all signals take the same amount of time to get through the logic cloud; in reality some are faster than others

• Consider incrementing counter (or PC)

• Too fast: increment twice? Eeek…

[Figure: a single 3-bit D latch in a loop with a +1 block (a 3 bit adder with one side wired permanently to 001). Values shown: 010 and 011.]
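The double-increment hazard can be illustrated with a toy timing model in Python. The function names and the time units are assumptions made for this sketch: the latch is treated as perfectly transparent while the clock is high, and the adder as a single fixed delay.

```python
def latch_counter_step(q: int, high_time: int, logic_delay: int, bits: int = 3) -> int:
    """Value held by a level-triggered latch counter after one clock-high phase.

    q           -- value in the latch before the clock rises
    high_time   -- how long the clock stays high (arbitrary time units)
    logic_delay -- time the +1 adder needs to produce a new sum
    """
    if logic_delay <= 0:
        raise ValueError("logic_delay must be positive")
    mask = (1 << bits) - 1
    q = (q + 1) & mask          # clock rises: the waiting sum passes through
    elapsed = logic_delay       # time at which the adder finishes the next sum
    while elapsed < high_time:  # latch still transparent: the sum races through
        q = (q + 1) & mask
        elapsed += logic_delay
    return q


def dff_counter_step(q: int, bits: int = 3) -> int:
    """Edge-triggered version: exactly one increment per rising edge."""
    return (q + 1) & ((1 << bits) - 1)


slow_logic = latch_counter_step(0b010, high_time=5, logic_delay=10)  # one increment
fast_logic = latch_counter_step(0b010, high_time=5, logic_delay=2)   # several increments
```

With the slow adder the counter goes from 010 to 011 as intended; with the fast adder it reaches 101 within a single clock-high phase, because the new sum keeps passing through the still-open latch.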

35

FF Step #4: Edge Triggered

• Instead of level triggered latch

• Latch a new value at a clock level (high or low)

• We use edge triggered flip flop

• Latch a value (only) at a clock edge (rising or falling)

[Figure: clock waveform with the rising edges and falling edges marked.]

36

Our Ultimate Goal: D Flip-Flop

• Rising edge triggered D Flip-flop

• Two D Latches w/ opposite clocking of enables

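[Figure: two D latches in series. D feeds the first latch, whose Q feeds the second latch’s D; the clock C drives the two enables with opposite polarity; the second latch provides Q and !Q.]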

37

D Flip-Flop

• Rising edge triggered D Flip-flop

• Two D Latches w/ opposite clocking of enables

• On Low Clock, first latch enabled (propagates value)

• Second not enabled, maintains value


38

D Flip-Flop

• Rising edge triggered D Flip-flop

• Two D Latches w/ opposite clocking of enables

• On Low Clock, first latch enabled (propagates value)

• Second not enabled, maintains value

• On High Clock, second latch enabled

• First latch not enabled, maintains value

• Value from first latch flips into second latch at instant of rising clock edge, stays constant until next rising edge

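The two-latch construction can be modeled in a few lines of Python. This is a behavioral sketch with illustrative class names (DLatch, DFlipFlop); gate delays are ignored, so each call to update represents the circuit after it has settled for the given D and clock values.

```python
class DLatch:
    """Level-sensitive latch: Q follows D while E is 1, holds while E is 0."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, d: int, e: int) -> int:
        if e:
            self.q = d
        return self.q


class DFlipFlop:
    """Rising-edge D flip-flop made of two D latches with opposite enables."""

    def __init__(self) -> None:
        self.first = DLatch()    # enabled while the clock is low
        self.second = DLatch()   # enabled while the clock is high

    def update(self, d: int, clk: int) -> int:
        mid = self.first.update(d, 1 - clk)   # first latch sees NOT clk
        return self.second.update(mid, clk)   # second latch sees clk


ff = DFlipFlop()
outputs = [
    ff.update(1, 0),  # clock low: first latch takes D=1, output still 0
    ff.update(1, 1),  # rising edge: the 1 moves into the second latch
    ff.update(0, 1),  # D changes while clock is high: ignored
    ff.update(0, 0),  # clock low: first latch takes D=0, output holds 1
    ff.update(0, 1),  # next rising edge: output becomes 0
]
```

The output only changes on the calls where the clock has just gone from 0 to 1, which is the edge-triggered behavior described above.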

39

D Flip-Flop

• No possibility of “races” anymore

• Even if I put 2 DFFs back-to-back…

• By the time the signal gets through the 2nd latch of the 1st DFF, the 1st latch of the 2nd DFF is disabled

• Still must ensure signals reach DFF before clk rises

• Important concern in logic design: “making timing”

• Having signals ready in time for next clock edge (in fact, they need to be stable a few moments before the edge)

[Figure: two D flip-flops back-to-back, each built from two D latches, driven by the same clock C.]

40

Making Timing

• Making timing is important in a design

• If you don’t make timing, your logic won’t compute right

• Synthesis tool (Quartus) tells you the maximum frequency at which your design will run safely

• If you run above this, your logic doesn’t “finish” in time
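As a rough illustration of where a maximum frequency comes from, the Python sketch below takes the slowest flip-flop-to-flip-flop path and inverts its delay. The path names and delay numbers are invented for the example, and the model ignores setup time, clock skew, and everything else a real timing analyzer accounts for.

```python
def critical_path_ns(paths: dict) -> float:
    """Longest total delay among the paths; each path is a list of delays in ns."""
    return max(sum(delays) for delays in paths.values())


def fmax_mhz(paths: dict) -> float:
    """Highest clock frequency (MHz) at which the slowest path still finishes."""
    return 1000.0 / critical_path_ns(paths)


def makes_timing(paths: dict, clock_mhz: float) -> bool:
    """True if every path finishes within one period of the requested clock."""
    return clock_mhz <= fmax_mhz(paths)


# Hypothetical design with two DFF-to-DFF paths (made-up gate and wire delays).
example_paths = {
    "counter_q -> counter_d": [0.5, 1.0, 1.0, 1.5],   # 4.0 ns total
    "state_q -> next_state_d": [0.5, 1.0],            # 1.5 ns total
}

limit = fmax_mhz(example_paths)              # 250.0 MHz for these numbers
ok_at_200 = makes_timing(example_paths, 200.0)
ok_at_300 = makes_timing(example_paths, 300.0)
```

Here the 4.0 ns path sets the limit: a 200 MHz clock (5 ns period) makes timing, while a 300 MHz clock (about 3.3 ns period) does not.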

41

D Flip-flops (continued…)

• Could also do falling edge triggered

• Switch which latch has NOT on clock

• D Flip-flop is ubiquitous

• Must properly differentiate between latches (level sensitive) and flip-flops (edge sensitive)

• Which edge: doesn’t matter

• As long as consistent in entire design

• We’ll use rising edge

• And often both – rising for writing, falling for reading, to allow data transfer in a single clock cycle, but more on that later!

42

D flip flops

• Generally don’t draw clk input

• Have one global clk, assume it goes there

• Should always see “>” to indicate edge triggered device

• Maybe have explicit enable

• Might not want to write every cycle

• If no enable signal shown, implies always enabled

• Get output and NOT(output) for “free”

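[Figure: three DFF symbols, each with input D and outputs Q and !Q: one with no clock marking, one with the “>” edge-trigger marker, and one with the “>” marker plus an explicit enable E.]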

43

DFFs in VHDL

x: dffe port map (
    clk  => clk,        -- the clock
    d    => someInput,  -- the input is d
    q    => theOutput,  -- the output is q
    ena  => en,         -- the enable
    clrn => '1',        -- clear (active low, so '1' means not clearing)
    prn  => '1');       -- set (PReset, active low, so '1' means not presetting)

Also, comes in “dff” with no enable

44

A word of advice

signal x_q : std_logic;

signal x_d: std_logic;

x : dffe port map (…, d=> x_d, q=>x_q,…);

• Use naming convention: x_d, x_q

• Write x_d, read x_q

• Remember new value shows up next cycle

[Figure: DFF x with input x_d and output x_q.]
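The “write x_d, read x_q, new value shows up next cycle” rule can be seen in a tiny Python model. The Dffe class here is a hypothetical stand-in for the flip-flop, not the Quartus primitive, and the toggle logic is just an example of combinational logic between x_q and x_d.

```python
class Dffe:
    """Hypothetical model of a D flip-flop with enable."""

    def __init__(self, init: int = 0) -> None:
        self.q = init

    def rising_edge(self, d: int, ena: int = 1) -> None:
        if ena:
            self.q = d   # the value on d becomes visible on q after the edge


def run_toggle(cycles: int):
    """Clock a 1-bit toggle circuit and record (x_d, x_q) during each cycle."""
    x = Dffe()
    trace = []
    for _ in range(cycles):
        x_q = x.q            # read the current state
        x_d = 1 - x_q        # logic writes the NEXT state
        trace.append((x_d, x_q))
        x.rising_edge(x_d)   # clock edge: x_d is captured
    return trace


trace = run_toggle(4)
```

In every cycle x_q still shows the old value while x_d already carries the new one; the value written to x_d in one cycle is what x_q reads in the following cycle.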

45

A few words about timing

• Homework 2: VGA Controller

• Requires certain clock frequency

• Else won’t control monitor properly

• Quartus will tell you what maximum timing your current circuit makes

• Fmax : how fast can this be clocked

• Tells you your worst timing paths

• From which dff to which dff

• Can see in schematic viewer (usually)

• Homework 2

• Should be plenty of slack

• But if not…

46

Fixing timing misses

• Typical approach: reduce logic (gate delays)

• Better adder?

• Rethink approach?

• Change “don’t care” behavior?

• Fix high fanout

• Duplicate high Fan Out/simple logic

• Also, feel free to ask for help from me/TAs

• Quartus’s tools to help you fix them aren’t the best

48

Stick a bunch of DFFs together to make a register

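[Figure: a 32 bit register built from edge-triggered DFFs side by side. Inputs in0, in1, in2, …, in31 feed the D inputs, outputs out0, out1, out2, …, out31 come from the Q outputs, and one shared enable signal drives every E input.]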
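A register is just the same flip-flop repeated once per bit, all sharing one clock and one enable. The Python sketch below models that; the class and method names are illustrative assumptions.

```python
class Dff:
    """Edge-triggered D flip-flop with enable (behavioral model)."""

    def __init__(self) -> None:
        self.q = 0

    def rising_edge(self, d: int, enable: int) -> None:
        if enable:
            self.q = d


class Register:
    """N-bit register: one Dff per bit, sharing clock and enable."""

    def __init__(self, width: int = 32) -> None:
        self.dffs = [Dff() for _ in range(width)]

    def rising_edge(self, value: int, enable: int) -> None:
        # Bit i of the input goes to flip-flop i (in0 .. in31 on the slide).
        for i, ff in enumerate(self.dffs):
            ff.rising_edge((value >> i) & 1, enable)

    def read(self) -> int:
        # Reassemble out0 .. out31 into one integer.
        return sum(ff.q << i for i, ff in enumerate(self.dffs))


reg = Register()
reg.rising_edge(0xDEADBEEF, enable=1)   # written
reg.rising_edge(0x12345678, enable=0)   # ignored: enable is low
held = reg.read()
```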

49

Next evolution: multiple registers

Register (nice!) → Register File (Tremendous!)

[Figure: 32 bit registers with enables En0, En1, …, En30, En31 share a common WrData input; their outputs pass through “tristate” buffers (more in a moment!).]

50

Register File

• Can store one value… How about many?

• Register File

• In processor, holds values it computes on

• MIPS, 32 32-bit registers

• How do we build a Register File using D Flip-Flops?

• What other components do we need?

51

Register File: Interface

• 4 inputs

• 3 register numbers (5 bit): 2 registers to read, 1 register to write

• 1 register write value (32 bits)

• 2 outputs

• 2 register values (32 bits)

[Figure: Register File block. Inputs: r_numA (5 bits), r_numB (5 bits), r_numW (5 bits), Wval (32 bits). Outputs: Aval (32 bits), Bval (32 bits).]

52

Register file strawman

• Use a mux to pick which register to read?

• 32 input mux = slow

• (other regs not pictured)

[Figure: a column of 32 bit registers (only some pictured) on the read side.]

53

Register File Design

• Two problems: write and read

• Writing the registers

• Need to pick which reg

• Have reg num (e.g., 19)

• Need to make En19=1

• En0, En1,… = 0

• Read: Use a mux to pick?

• 32-input mux = slow

• Need a better method…

• Let’s talk about writing first.

[Figure: 32 bit registers sharing the WrData input, with separate enables En0, En1, …, En30, En31.]

54

First: A Decoder

• First task: convert binary number to “one hot”

• Saw this before

• Take register number

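[Figure: decoder with a 3-bit input. Input 101 gives the one-hot output 0 0 0 0 0 1 0 0: a single output line is 1 and all others are 0.]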
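A decoder can be sketched in Python as one AND term per output line, where each term checks every input bit either directly or inverted. The function name and the list-of-bits output format are assumptions for this example.

```python
def decoder(value: int, n_bits: int) -> list:
    """Convert an n-bit binary number into a one-hot list of 2**n_bits lines."""
    outputs = []
    for line in range(1 << n_bits):
        match = 1
        for b in range(n_bits):
            in_bit = (value >> b) & 1
            want = (line >> b) & 1
            # AND together each input bit, inverted where this line expects 0.
            match &= in_bit if want else 1 - in_bit
        outputs.append(match)
    return outputs


three_bit = decoder(0b101, 3)     # line 5 is the only 1
write_enables = decoder(19, 5)    # En19 = 1, every other En = 0
```

Used on the write side of a register file, the 5-bit register number goes in and the 32 outputs become En0 through En31.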

55

Register File

• Now we know how to write:

• Use decoder to convert reg # to one hot

• Send write data to all regs

• Use one hot encoding of reg # to enable right reg

• Still need to fix read side

• 32 input mux (the way we’ve made it) not realistic

• To do this: expand our world from {1,0} to {1, 0, Z}

[Figure: 32 bit registers sharing the WrData input, with enables En0, En1, …, En30, En31.]

56

Tri-states have a third logic level

• Z simply means “high impedance” which for our purposes means high resistance, effectively disconnected from both Vcc and Ground

57

So this third option: Z

• There is a third possibility: Z (“high impedance”)

• Just closed off/blocked

• Prevents electricity from flowing through

• Gate that gives us Z : Tri-state

D E Q
0 1 0
1 1 1
- 0 Z

[Figure: tri-state symbol with data input D, enable E, and output Q.]

58

CMOS: Complementary MOS

• 2 inputs: E and D. What does this do?

• Write truth table for output

[Figure: CMOS circuit between Vcc and Gnd with enable inputs E, data input D, and one Output.]

E D Output
0 0
0 1
1 0
1 1

59

CMOS: Complementary MOS

• 2 inputs: E and D. What does this do?

• Write truth table for output

• When E =1, straightforward

[Figure: CMOS circuit between Vcc and Gnd with enable inputs E, data input D, and one Output.]

E D Output
0 0
0 1
1 0 1
1 1 0

60

CMOS: Complementary MOS

• 2 inputs: E and D. What does this do?

• Write truth table for output

• When E =1, straightforward

• When E= 0, no connection: Z

[Figure: CMOS circuit between Vcc and Gnd with enable inputs E, data input D, and one Output.]

E D Output
0 0 Z
0 1 Z
1 0 1
1 1 0

When E=0, both enable transistors are not conducting. They are both effectively million ohm resistors, isolating the output from either 1 or 0.

61

High Impedance: Z

• Z = High Impedance

• No path to power or ground

• “Gate” does not produce a 1 or a 0

• Previous slide: tri-state inverter

• More commonly drawn: tri-state buffer

• E = enable, D = data

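D E Out
0 1 0
1 1 1
X 0 Z

X = “Don’t care”

[Figure: tri-state buffer symbol with input D, enable E, and output Out, next to a plain buffer from D to Out.]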

62

Remember this rule?

• Remember I told you not to connect two outputs?

• If one gate tries to drive a 1 and the other drives a 0

• One pumps water in.. The other sucks it out

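• Except it’s electric charge, not water

• “Short circuit”: lots of current -> lots of heat

[Figure: two gates, one with inputs a and b and one with inputs c and d, with their outputs wired together. BAD!]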

63

We’ve had this rule one day… and you break it

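It’s ok to connect multiple outputs together under one circumstance: all but at most one must be outputting Z at any time.

Then no shorts, no fire.

One-hot circuits guarantee this.

[Figure: tri-state buffers with data inputs D0, D1, …, Dn-2, Dn-1 and enables E0, E1, …, En-2, En-1, with all outputs tied to one wire.]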
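The rule can be written down as a small Python model of a shared wire. The names Z, tristate, and bus are illustrative; the model simply refuses any situation where more than one buffer is driving, which is the condition the slide says must never happen.

```python
Z = "Z"  # marker for "high impedance": not driving the wire


def tristate(d: int, e: int):
    """Tri-state buffer: pass D through when enabled, otherwise output Z."""
    return d if e else Z


def bus(driver_outputs):
    """Resolve a wire shared by several tri-state outputs.

    At most one driver may be active; otherwise two gates could fight
    over the wire (a short circuit), so the model raises an error.
    """
    active = [v for v in driver_outputs if v != Z]
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return active[0] if active else Z


data = [1, 0, 1, 1]
one_hot = [0, 1, 0, 0]   # only driver 1 is enabled
value = bus([tristate(d, e) for d, e in zip(data, one_hot)])
```

With a one-hot enable pattern exactly one buffer drives the wire, so the bus carries that buffer’s data bit; with all enables low the wire floats at Z.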

64

Mux, implemented with tri-states

• We can build effectively a mux from tri-states

• Much more efficient for large #s of inputs (e.g., 32)

[Figure: decoder with a 5-bit input. Register number 11110 produces enables 0, 0, …, 1, 0, so only one 32 bit register’s tri-states drive the shared output.]

65

Register File

• Now we can write and read in one clock cycle!

[Figure: the register file: 32 bit registers share WrData and have write enables En0, En1, …, En30, En31, with a second set of enables En0, En1, …, En30, En31 on the read side.]

66

Ports

• What we just saw: read port

• Ability to do one read / clock cycle

• May want more: read 2 source registers per instruction

• Maybe even more if we do many instructions at once

• This design: can just replicate port

• Another decoder

• Another set of tri-states

• Another output bus (wire connecting the tri-states)

• Ok to read same register on both ports – just driving two lines with same bits!

• Earlier: write port

• Ability to do one write/cycle

• Could add more: need muxes to pick write values
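Putting the pieces together, a register file with two read ports and one write port might be modeled in Python like this. The port names follow the interface slide (r_numA, r_numB, r_numW, Wval, Aval, Bval); the class itself, the write_enable input, and the choice that reads see the values held before the clock edge are assumptions of this sketch.

```python
class RegisterFile:
    """Behavioral sketch: 32 registers, 2 read ports, 1 write port."""

    def __init__(self, n_regs: int = 32, width: int = 32) -> None:
        self.regs = [0] * n_regs
        self.mask = (1 << width) - 1

    def _one_hot(self, reg_num: int) -> list:
        """Decoder: exactly one enable line is 1."""
        if not 0 <= reg_num < len(self.regs):
            raise ValueError("register number out of range")
        return [1 if i == reg_num else 0 for i in range(len(self.regs))]

    def _read_port(self, reg_num: int) -> int:
        """One decoder plus one set of tri-states: only the enabled register drives."""
        enables = self._one_hot(reg_num)
        driven = [q for q, en in zip(self.regs, enables) if en]
        return driven[0]

    def clock(self, r_num_a: int, r_num_b: int, r_num_w: int, w_val: int,
              write_enable: int = 1):
        """One clock cycle: read two registers, then write one on the edge."""
        a_val = self._read_port(r_num_a)
        b_val = self._read_port(r_num_b)
        if write_enable:
            enables = self._one_hot(r_num_w)
            # WrData goes to every register; only the enabled one captures it.
            self.regs = [w_val & self.mask if en else q
                         for q, en in zip(self.regs, enables)]
        return a_val, b_val


rf = RegisterFile()
first = rf.clock(0, 0, 19, 42)                    # write 42 into register 19
second = rf.clock(19, 19, 5, 7)                   # read r19 on both ports, write r5
third = rf.clock(19, 5, 0, 0, write_enable=0)     # read only
```

Adding another read port in this model means another call to _read_port, mirroring the extra decoder, tri-states, and output bus described above.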

67

Minor Detail

• FYI: This is not how most real register files are implemented

• (Though it is how other things are implemented, and it would work for a register file, just not as well as other techniques we’ll learn later)

• Actually done with SRAM

• We’ll see how those work soon

68

Summary

Can lay out logic to compute things

Add, subtract,…

Now can store things

SR, D latches

D flip-flops

Registers

Register files

Also begin to understand clocks
