Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
CSE140: Components and Design Techniques for Digital Systems
Tajana Simunic Rosing
Sources: TSR, Katz, Boriello, Vahid
1
Announcements & Outline
• HW#7 due, HW#8 outMidt b k Th d• Midterm back Thursday
• Today: Memory– ROM– RAM– FIFO– Queue
• Next: Register Transfer Level Design
Sources: TSR, Katz, Boriello, Vahid
2
CSE140: Components and Design Techniques CSE140: Components and Design Techniques for Digital Systems
Memory
Tajana Simunic Rosing
Sources: TSR, Katz, Boriello, Vahid
3
Memory: basic conceptsy p
• Stores large number of bitsm x n: m words of n bits each
m × n memory
– m x n: m words of n bits each– k = Log2(m) address input signals– or m = 2^k words
4 096 8
…
…
mw
ords
– e.g., 4,096 x 8 memory:• 32,768 bits• 12 address input signals• 8 input/output data signals
n bits per word
8 input/output data signals
• Memory access– r/w: selects read or write
enable: read or write only when asserted
enable2k × n read and write
memory
A0
r/w
memory external view
– enable: read or write only when asserted– multiport: multiple accesses to different
locations simultaneously
0…
…
Ak-1
Sources: TSR, Katz, Boriello, Vahid4
Q0Qn-1
Write ability/ storage permanencey g p
• Traditional ROM/RAM age
nenc
e
– ROM• read only, bits stored
without powerRAM EPROM
Mask-programmed ROM
EEPROM FLASH
Stor
ape
rman
Ideal memory
OTP ROM
Tens of
Life ofproduct
– RAM• read and write, lose stored
bits without power
• Distinctions blurred
Batterylife (10years)
EPROM EEPROM FLASH
NVRAM
SRAM/DRAM
Nonvolatile
In-systemprogrammable
Tens ofyears
• Distinctions blurred– Advanced ROMs can be
written to• e.g., EEPROM
Externalprogrammer
OR in-system,
Writeability
programmable
Duringfabrication
only
Externalprogrammer,
1,000s
Externalprogrammer,one time only
Externalprogrammer
OR in-system,
In-system, fastwrites,
unlimited
Nearzero
e g , O– Advanced RAMs can hold
bits without power• e.g., NVRAM Write ability and storage permanence of memories,
showing relative degrees along each axis (not to scale)
block-orientedwrites, 1,000s
of cycles
of cycles 1,000sof cycles
unlimitedcycles
Sources: TSR, Katz, Boriello, Vahid5
showing relative degrees along each axis (not to scale).
Random Access Memory (RAM)y ( )
• RAM – Readable and writable memory32
4
32
4W_data
W dd
R_data
R ddy
– Logically the same as register file• RAM just one port; register file two or more
RAM vs register file
W_addr
W_en
R_addr
R_en16×32
register file– RAM vs. register file
• RAM is larger• RAM stores bits using a bit storage vs. FFs• RAM implemented on a chip in a square
Register file
• RAM implemented on a chip in a square –keeps longest wires (hence delay) short
32
10data
addr
rw1024× 32
RAM
en
RAM block symbol
Sources: TSR, Katz, Boriello, Vahid
6
y
RAM Internal Structure32
10data
addr
rw1024x32
RAM
Let A = log2Mwdata(N-1) wdata(N-2)wdata0
bit storageworden
RAM
addr0addr1
gblock(aka “cell”)
word
word
a0a1
d0
d1AxMd d
enable
addr(A-1)
clkword
enableword
enablerw
data cell
data
d(M-1)
a(A-1)
e
decoder
enrw to all cells
rdata(N-1) rdata(N-2) rdata0 RAM cell
rw data
• Similar internal structure as register file– Decoder enables appropriate word based on address inputs
Sources: TSR, Katz, Boriello, Vahid
7
– rw controls whether cell is written or read– Let’s see what’s inside each RAM cell
Static RAM (SRAM) - writing( ) g
addr0addr1r
Let A = log2 M
a0a1
d0
A M
wordenable
wdata(N-1) wdata(N-2) wdata0
bit storageblock(aka cell )
word
,,,,
SRAM celldata data’
d’dcell
32
10data
addr
rw1024x32
RAM addr1
addr(A-1)
clkenrw
add
r a1 d1
d(M-1)
a(A-1)
e
A ⋅ Mdecoder
to all cells
cell
wordenable
wordenable
rw
data
data
0wordbl
enRAM
• “Static” RAM cell– 6 transistors (recall inverter is 2 transistors)
Writing this cell
rdata(N-1) rdata(N-2) rdata0 enable
1 0
SRAM celldata data’
– Writing this cell• word enable input comes from decoder• When 0, value d loops around inverters
– That loop is where a bit stays stored
1 0
d
• When 1, the data bit value enters the loop– data is the bit to be stored in this cell– data’ enters on other side– Example shows a “1” being written into cell
1wordenable
Sources: TSR, Katz, Boriello, Vahid
8
Static RAM (SRAM) - reading( ) g
addr0addr1r
Let A = log2 M
a0a1
d0
A M
wordenable
wdata(N-1) wdata(N-2) wdata0
bit storageblock(aka cell )
word
,,,,
32
10data
addr
rw1024x32
RAM addr1
addr(A-1)
clkenrw
add
r a1 d1
d(M-1)
a(A-1)
e
A ⋅ Mdecoder
to all cells
cell
wordenable
wordenable
rw
data
data
SRAM cell
enRAM
• “Static” RAM cell - reading– When rw set to read, the RAM logic sets
rdata(N-1) rdata(N-2) rdata0
data data’
d1 1
both data and data’ to 1– The stored bit d will pull either the left line
or the right bit down slightly below 1– “Sense amplifiers” detect which side is
1wordenable
1 0
1 <1
Sense amplifiers detect which side is slightly pulled down To sense amplifiers
Sources: TSR, Katz, Boriello, Vahid
9
Dynamic RAM (DRAM)y ( )
addr0addr1r
Let A = log2 M
a0a1
d0
A M
wordenable
wdata(N-1) wdata(N-2) wdata0
bit storageblock(aka cell )
word
,,,,
32
10data
addr
rw1024x32
RAM addr1
addr(A-1)
clkenrw
add
r a1 d1
d(M-1)
a(A-1)
e
A ⋅ Mdecoder
to all cells
cell
wordenable
wordenable
rw
data
data DRAM cell
enRAM
data
• “Dynamic” RAM cell– 1 transistor (rather than 6)
rdata(N-1) rdata(N-2) rdata0
wordenable
data
cell
d
– Relies on large capacitor to store bit• Write: Transistor conducts, data voltage
level gets stored on top plate of capacitor (a)
dcapacitorslowlydischarging
• Read: Just look at value of d• Problem: Capacitor discharges over time
– Must “refresh” regularly, by reading d and h i i i i h b k
data
enable
d discharges
Sources: TSR, Katz, Boriello, Vahid
10
then writing it right back (b)d discharges
Comparing Memory Typesp g y yp• Register file
– FastestB t bi t i
MxN Memoryimplemented as a:
register– But biggest size• SRAM
– FastMore compact than register file
registerfile
SRAM
– More compact than register file• DRAM
– Slowest• And refreshing takes time
DRAM
And refreshing takes time– But very compact– Different technology for large caps.
Size comparison for the samenumber of bits (not to scale)
SRAM DRAMREGISTER FILE
DataData'
SRAM
DataW
DRAM
R S R S R S
OUT1 OUT2 OUT3 OUT4
R S
REGISTER FILE
Sources: TSR, Katz, Boriello, Vahid
11W
D Q D Q D Q D Q
CLK
IN1 IN2 IN3 IN4
Generic SRAM timingg
CE’
R/W’
Adrs
Data
d it
From SRAM From CPU
Sources: TSR, Katz, Boriello, Vahid
timeread write
Generic DRAM timingg
CE’CE’
R/W’
RAS’
CAS’CAS’
Adrs rowadrs
coladrs
Data
adrs adrs
data
Sources: TSR, Katz, Boriello, Vahid13
time
Page mode accessg
CE’CE’
R/W’
RAS’
CAS’CAS’
Adrs rowadrs
coladrs
coladrs
coladrs
Data
adrs adrs
data
adrs adrs
data data
Sources: TSR, Katz, Boriello, Vahid14
time
Ram variations
• PSRAM: Pseudo-static RAMPSRAM: Pseudo static RAM– DRAM with built-in memory refresh controller– Popular low-cost high-density alternative to SRAM
NVRAM N l til RAM• NVRAM: Nonvolatile RAM– Holds data after external power removed– Battery-backed RAMy
• SRAM with own permanently connected battery• writes as fast as reads• no limit on number of writes unlike nonvolatile ROM-based
memory– SRAM with EEPROM or flash
• stores complete RAM contents on EEPROM or flash before power turned off
Sources: TSR, Katz, Boriello, Vahid15
power turned off
Extended data out DRAM
• Improvement of FPM (full page mode) DRAM• Extra latch before output buffer
– allows strobing of cas before data read operation completed
R d d/ it l t b dditi l l• Reduces read/write latency by additional cycle
row col col col
data data data
ras
cas
address
data
Sources: TSR, Katz, Boriello, Vahid16
data data data
Speedup through overlap
data
(S)ynchronous and Enhanced Synchronous (ES) DRAMEnhanced Synchronous (ES) DRAM
• SDRAM latches data on active edge of clock• SDRAM latches data on active edge of clock• Eliminates time to detect ras/cas and rd/wr signals• A counter is initialized to column address thenA counter is initialized to column address then
incremented on active edge of clock to access consecutive memory locations
• ESDRAM improves SDRAM– added buffers enable overlapping of column addressing– faster clocking and lower read/write latency possible
clock
ras
cas
Sources: TSR, Katz, Boriello, Vahid17
address
data
row col
data data data
Rambus DRAM (RDRAM)( )
• More of a bus interface architecture than DRAM architecture
• Data is latched on both rising and falling edge of clock
• Broken into 4 banks each with own row decoder
h 4 t ti– can have 4 pages open at a time• Capable of very high throughput
Sources: TSR, Katz, Boriello, Vahid18
DRAM integration problemg p• SRAM easily integrated with CPU• DRAM more difficultDRAM more difficult
– Different chip making process between DRAM and conventional logicGoal of conventional logic (IC) designers:– Goal of conventional logic (IC) designers:
• minimize parasitic capacitance to reduce signal propagation delays and power consumption
Goal of DRAM designers:– Goal of DRAM designers:• create capacitor cells to retain stored information
Sources: TSR, Katz, Boriello, Vahid19
RAM Example: Digital Sound Recorderp g
data
addr
rw en
4096⋅ 16RAM
wiremicrophone
wireanalog-to-
digitalconverter
digital-to-analog
converterd ld d ld
Rrw RenRa12
16
ad_buf
speaker
ad_ld da_ldprocessor
• Behavior– Record: Digitize sound, store as series of 4096 12-bit digital values in RAMRecord: Digitize sound, store as series of 4096 12 bit digital values in RAM
• We’ll use a 4096x16 RAM (12-bit wide RAM not common)– Play back later from RAM
Sources: TSR, Katz, Boriello, Vahid
20
RAM Example: Digital Sound Recorderp gRecord behavior
d ld 1
S
a=0
a<4095T
Local register: a (12 bits)
16
4096x16RAM
ad_ld=1ad_buf=1Ra=aRrw=1Ren=1
a=0
a=a+1U
analog-to-digital
converter
digital-to-analog
converterad_ld da_ld
Rw RenRa12
16
processor
ad_buf
Keep local register a
a=4095
Keep local register a– Stores current address, ranges from 0 to 4095 – Create state machine that counts from 0 to 4095 using a– For each a
Sources: TSR, Katz, Boriello, Vahid
21
For each a• Read analog-to-digital conv: ad_ld=1, ad_buf=1• Write to RAM at address a: Ra=a, Rrw=1, Ren=1
RAM Example: Digital Sound Recorderp g
Play behavior
d b f 0
V
a=0
a<4095W
Local register: a (12 bits) data bus4096x16RAM
da_ld=1
ad_buf=0Ra=aRrw=0Ren=1
a=0
a=a+1
X analog-to-digital
converter
digital-to-analog
converterad_ld da_ld
Rw RenRa12
16
processor
ad_buf
a=4095
• Create state machine that counts from 0 to 4095; for each a:– Read RAM
W it t di it l t l
Sources: TSR, Katz, Boriello, Vahid
22
– Write to digital-to-analog conv.– Note: Must write d-to-a one cycle after reading RAM, when the read
data is available on the data bus
Read-Only Memory – ROMy y• Memory that can only be read from
– Data lines are output onlyAd t RAM
32
10data
addr 1024x32• Advantages over RAM
– Nonvolatile– Low power
Compact
addr
enROM
ROM block symbol– Compact y
Let A = log2M
d0
wordenable
bit storageblock(aka “cell”)
addr0addr1
addr(A-1)addr
a0a1
d0
d1
a(A-1)
AxMdecoder
(aka cell )
word
dataaddr(A 1)
clken
addrd(M-1)
a(A 1)
eword
enableword
enabledata
Sources: TSR, Katz, Boriello, Vahid
23ROM cellrdata(N-1) rdata(N-2) rdata0
ROM Typesyp• Mask-programmed ROM
– Bits are hardwired as 0s or 1s duringcell cell
data line data line01
Bits are hardwired as 0s or 1s during chip manufacturing
• Fuse-Based Programmable ROM
wordenable
Fuse Based Programmable ROM– Each cell has a fuse– Programmer blows certain fuses
(using higher-than-normal voltage)data line data line11
( g g g )• Those cells will be read as 0s
(involving some special electronics) • Cells with unblown fuses will be read
as 1s
cell cell
wordenable
as 1s– Aka. One-Time Programmable ROM
blown fusefuse
Sources: TSR, Katz, Boriello, Vahid
24
ROM Typesyp• Erasable Programmable ROM (EPROM)
– Uses “floating-gate transistor” in each cellP hi h th l lt t– Programmer uses higher-than-normal voltage to cause electrons to tunnel into the gate
• Electrons become trapped in the gate• Only done for cells that should store 0
cell cell
data line data line
g-ga
te
tor
• Other cells will be 1– To erase, shine ultraviolet light onto chip
• Gives trapped electrons energy to escape• Requires chip package to have window
cell cell
wordenable
eÐeÐ
01
float
ing
trans
ist
• Requires chip package to have window • Electronically-Erasable Programmable ROM
(EEPROM)– Programming similar to EPROM
trapped electrons
32
10data
g g– Erasing one word at a time electronically
• Flash memory– Like EEPROM, but large blocks of words can be
10addr
en
write
1024x32EEPROM
Sources: TSR, Katz, Boriello, Vahid
25
gerased simultaneously
• EEPROM & FLASH are in-system programmablebusy
ROM Example: Digital Telephone Answering Machinep g g
• Record the outgoing announcement– When rec=1, record digitized sound in locations 0 to 4095When rec 1, record digitized sound in locations 0 to 4095 – When play=1, play those stored sounds to digital-to-analog converter
busy
4096x16 Flash
“We’re not home.”
analog-to-digital
converterdigital-to-
analogconverterad_ld
d ld
Rrw Rener buRa12
16
processor
ad_buf
da_ld
recplayrecord
microphone speaker
Sources: TSR, Katz, Boriello, Vahid
26
microphone speaker
ROM Example: Digital Telephone Answering Machinep g g
4096x16 FlashLocal register: a (13 bits)
da rw en
analog-to-digital digital-to-Rrw Ren er buRa
1216
ad buf
Ter=0
bu
bu’S
Local register: a (13 bits)
a<4096U
ad ld=1a=0 digitalconverter
digital toanalog
converterad_ldda_ld
Rrw Ren er buRa
processor
ad_buf
recplayrecord
er=1rec
a=4096
V
ad_ld 1ad_buf=1Ra=aRrw=1Ren=1a=a+1
• High-level state machine
microphone speaker
High level state machine– Once rec=1, begin erasing flash by setting er=1– Wait for flash to finish erasing by waiting for bu=0– Execute loop that sets local register a from 0 to 4095 reading
Sources: TSR, Katz, Boriello, Vahid
27
– Execute loop that sets local register a from 0 to 4095, reading analog-to-digital converter and writing to flash for each a
Composing Memory – Wider Wordsp g y
1024x8addr
10
1024x8addr
1024x8addr
1024x8addr
1024x8ROM
endata
8 8 8en
addr1024x8ROM
endata
1024x8ROM
endata
1024x8ROM
endata
8
1024x3210
data(31..0)
addrROM
data
32
en
data• Making memory words wider
– Easy – just place memories side-by-side until desired width obtainedShare address/control lines concatenate data lines
data
Sources: TSR, Katz, Boriello, Vahid
28
– Share address/control lines, concatenate data lines– Example: Compose 1024x8 ROMs into 1024x32 ROM
Composing Memory – More Words
0 0 0 0 0 0 0 0 0 0 0a0a10a9a8
ProvinProvin
1024x8ROM
addr
en data
0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 1 0
0 1 1 1 1 1 1 1 1 1 0a10 just chooses
11
1024x8ROM
addr
d t
a9..a0
a10 d0
d1
addr 1x2dcdi0vin
ce 2ovince 1
1024x8ROM
addr
0 1 1 1 1 1 1 1 1 1 1
1 0 0 0 0 0 0 0 0 0 01 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 1 0
a10 just chooses which memory to access
en data
8
1024x8addr
d1
en
e
ROMen data1 1 1 1 1 1 1 1 1 1 0
1 1 1 1 1 1 1 1 1 1 1 2048x8ROM
data
111024x8ROM
en data
8addren
• Creating memory with more words – Combine memories until the number of desired words is achieved
Use decoder to select
8
Sources: TSR, Katz, Boriello, Vahid
29
– Use decoder to select– Example: Compose 1024x8 memories into 2048x8 memory
• More words and wider words – first make enough words, then widen
Queues
frontback1 7
0
B
1 7
f
write itemsto the backof the queue
read (andremove) itemsfrom front ofth
2 6rr
• FIFO Queue (first-in-first-out)– Write at the back: push, Read at the front: pop– Treat memory as a circle
C
the queue
3 5
4
• Common uses:– Computer keyboard
• Pushes pressed keys onto queue; Meanwhile pop and send to computer– Digital video recorder
• Pushes frames onto queue; Meanwhile pops frames, compresses them, and stores them
Sources: TSR, Katz, Boriello, Vahid
30
– Routers• Pushes incoming packets onto queue; Meanwhile pops packets, processes destination information, and
forwards each packet out over appropriate port
Queues• Two conditions have
front=rear need FSM to 8⋅ 16 register file
wdata rdata16 16wdata rdata
detect:– Full: No pushes until a
popEmpty: No pops until a
33
wdata
waddrwr
rdata
raddr
rd– Empty: No pops until a
push• Use Register file for
storage
clr
3-bit 3-bit
incclr
inc
wr
rdg• Implement Rear and front
with up counters: – rear as RF’s write
dd f t d
3 bitup counter
3 bitup counter
rear front
rd
reset Cont
rolle
raddress, front as read address
=eq
full
empty
Sources: TSR, Katz, Boriello, Vahid
31
p y8-word 16-bit queue
Summaryy• Memory
– RAM– ROM– Combining Memory
Queue– Queue
Sources: TSR, Katz, Boriello, Vahid
32