4.Cache Memory

7/27/2019 4.Cache Memory


Computer Architecture and Organization

Lecture on Cache Memory

55:035 Computer Architecture and Organization



Introduction

Memory access time is important to performance!

Users want large memories with fast access times; ideally, unlimited fast memory.

To use an analogy, think of a bookshelf containing many books. Suppose you are writing a paper on birds. You go to the bookshelf, pull out some of the books on birds, and place them on the desk. As you start to look through them, you realize that you need more references. So you go back to the bookshelf, get more books on birds, and put them on the desk. Now, as you begin to write your paper, you have many of the references you need on the desk in front of you.

This is an example of the principle of locality. This principle states that programs access a relatively small portion of their address space at any instant of time.




Levels of the Memory Hierarchy


Levels, from closest to the CPU to farthest away:

CPU

Registers: part of the on-chip CPU datapath; ISA 16-128 registers

Cache level(s): one or more levels (static RAM)
    Level 1: on-chip, 16-64K
    Level 2: on-chip, 256K-2M
    Level 3: on- or off-chip, 1M-16M

Main memory: dynamic RAM (DRAM), 256M-16G

Magnetic disk: 80G-300G; interface: SCSI, RAID, IDE, 1394

Optical disk or magnetic tape

Farther away from the CPU:
    Lower cost/bit
    Higher capacity
    Increased access time/latency
    Lower throughput/bandwidth



Memory Hierarchy Comparisons


CPU registers: 100s of bytes



Memory Hierarchy

We can exploit the natural locality in programs by implementing the memory of a computer as a memory hierarchy.

Multiple levels of memory with different speeds and sizes.

The fastest memories are more expensive, and usually much smaller in size (see figure).

The user has the illusion of a memory that is both large and fast. This is accomplished by using efficient methods for memory structure and organization.





Inventor of Cache

M. V. Wilkes, "Slave Memories and Dynamic Storage Allocation," IEEE Transactions on Electronic Computers, vol. EC-14, no. 2, pp. 270-271, April 1965.




Cache: Processor does all memory operations with cache.

Miss: If the requested word is not in cache, a block of words containing the requested word is brought to cache, and then the processor request is completed.

Hit: If the requested word is in cache, the read or write operation is performed directly in cache, without accessing main memory.

Block: Minimum amount of data transferred between cache and main memory.

[Figure: Processor, Cache (small, fast memory), and Main memory (large, inexpensive, slow). Words move between the processor and the cache; blocks move between the cache and main memory.]
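The hit, miss, and block definitions above can be illustrated with a minimal sketch. It assumes a block size of 4 words and an unbounded cache with no replacement, both chosen only for illustration; the function name `run_accesses` is hypothetical and not part of the lecture.

```python
BLOCK_WORDS = 4  # assumed block size in words (illustrative only)


def run_accesses(word_addresses, block_words=BLOCK_WORDS):
    """Classify each word access as a "hit" or a "miss".

    The cache is modelled as an unbounded set of block numbers, so
    nothing is ever evicted. On a miss the whole block containing the
    requested word is brought in, which is why neighbouring words hit.
    """
    cached_blocks = set()
    outcomes = []
    for addr in word_addresses:
        block = addr // block_words  # block that contains this word
        if block in cached_blocks:
            outcomes.append("hit")
        else:
            cached_blocks.add(block)  # bring the whole block into cache
            outcomes.append("miss")
    return outcomes


print(run_accesses([0, 1, 2, 3, 4, 0]))
# ['miss', 'hit', 'hit', 'hit', 'miss', 'hit']
```

Words 1 to 3 hit only because the miss on word 0 fetched their whole block, which is the reason transfers are done in blocks rather than single words.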




The Locality Principle

A program tends to access data that form a physical cluster in the memory: multiple accesses may be made within the same block.

Physical localities are temporal and may shift over longer periods of time: data not used for some time is less likely to be used in the future. Upon a miss, the least recently used (LRU) block can be overwritten by a new block.

P. J. Denning, "The Locality Principle," Communications of the ACM, vol. 48, no. 7, pp. 19-24, July 2005.
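As a rough sketch of the LRU policy mentioned above, the following model keeps a fixed number of blocks and overwrites the least recently used one on a miss. The class name `LRUCache` and the capacity of 2 blocks are assumptions made for this example; real hardware usually approximates LRU rather than tracking exact order.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache that holds block numbers and evicts the LRU block."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = OrderedDict()  # ordered from least to most recently used

    def access(self, block):
        """Return True on a hit, False on a miss."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            return True
        if len(self.blocks) >= self.num_blocks:
            self.blocks.popitem(last=False)  # overwrite the LRU block
        self.blocks[block] = None
        return False


cache = LRUCache(num_blocks=2)
results = [cache.access(b) for b in [1, 2, 1, 3, 2]]
print(results)             # [False, False, True, False, False]
print(list(cache.blocks))  # [3, 2]
```

In this trace the access to block 3 evicts block 2, because block 1 was touched more recently; the final access to block 2 therefore misses again.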



Temporal & Spatial Locality

There are two types of locality:

TEMPORAL LOCALITY (locality in time): If an item is referenced, it will likely be referenced again soon. Data is reused.

SPATIAL LOCALITY (locality in space): If an item is referenced, items in neighboring addresses will likely be referenced soon.

Most programs contain natural locality in structure. For example, most programs contain loops in which the instructions and data need to be accessed repeatedly. This is an example of temporal locality.

Instructions are usually accessed sequentially, so they contain a high amount of spatial locality.

Also, data access to elements in an array is another example of spatial locality.
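One way to make the two kinds of locality concrete is to count how many distinct memory blocks an access pattern touches. The sketch below assumes 4-byte array elements and 16-byte blocks; both figures, and the helper name `blocks_touched`, are illustrative assumptions.

```python
def blocks_touched(byte_addresses, block_size=16):
    """Count the distinct memory blocks covered by a list of byte addresses."""
    return len({addr // block_size for addr in byte_addresses})


ELEMENT_SIZE = 4   # assumed bytes per array element
NUM_ACCESSES = 64

# Spatial locality: walk consecutive array elements.
sequential = [ELEMENT_SIZE * i for i in range(NUM_ACCESSES)]

# Poor spatial locality: jump 16 elements (64 bytes) between accesses.
strided = [ELEMENT_SIZE * 16 * i for i in range(NUM_ACCESSES)]

# Temporal locality: a loop that re-reads the same 4 elements 16 times.
looped = [ELEMENT_SIZE * (i % 4) for i in range(NUM_ACCESSES)]

print(blocks_touched(sequential))  # 16
print(blocks_touched(strided))     # 64
print(blocks_touched(looped))      # 1
```

All three patterns make 64 accesses, but the fewer blocks a pattern touches, the fewer block transfers a cache needs to serve it.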





Data Locality, Cache, Blocks

[Figure: the data needed by a program spans Block 1 and Block 2 in memory, which are held in the cache. Increase block size to match locality size. Increase cache size to include most blocks.]



Basic Caching Concepts

The memory system is organized as a hierarchy, with the level closest to the processor being a subset of any level further away, and all of the data is stored at the lowest level (see figure).

Data is copied between only two adjacent levels at any given time. We call the minimum unit of information that can be either present or not present in a two-level hierarchy a block or line. See the highlighted square shown in the figure.

If data requested by the user appears in some block in the upper level, it is known as a hit. If data is not found in the upper level, it is known as a miss.



Basic Cache Organization

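[Figure: the full byte address is divided into Tag, Index (Idx), and Offset (Off) fields; Tag and Index together form the block address. The index goes through Decode & Row Select to pick a row of the Tags and Data Array, the stored tag is compared with the address tag (Compare Tags) to produce Hit, and the offset drives the Mux select that picks the Data Word.]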
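The Tag / Idx / Off split of the full byte address can be sketched with shifts and masks. The field widths below (6 offset bits for 64-byte blocks, 8 index bits for 256 rows) are assumptions chosen for the example, not values given in the lecture, and `split_address` is a hypothetical helper name.

```python
OFFSET_BITS = 6  # assumed: 64-byte blocks
INDEX_BITS = 8   # assumed: 256 rows in the tag and data arrays


def split_address(addr):
    """Split a byte address into (tag, index, offset) fields."""
    offset = addr & ((1 << OFFSET_BITS) - 1)                 # byte within the block
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # row to select
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                 # compared on lookup
    return tag, index, offset


tag, index, offset = split_address(0x12345678)
print(hex(tag), hex(index), hex(offset))  # 0x48d1 0x59 0x38

# Tag and index together form the block address.
block_address = 0x12345678 >> OFFSET_BITS
assert block_address == (tag << INDEX_BITS) | index
```

A lookup would use `index` to select a row, compare the stored tag against `tag` to decide hit or miss, and use `offset` to pick the word within the block.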





Direct-Mapped Cache

[Figure: memory containing Block 1 and Block 2 with the data needed by a program, and the cache. Labels: Data needed, Swap-in, LRU.]




Set-Associative Cache

[Figure: memory containing Block 1 and Block 2 with the data needed by a program, and the cache. Labels: Data needed, Swap-in, LRU.]



Three Major Placement Schemes




Direct-Mapped Placement

A block can only go into one place in the cache, determined by the block's address (in memory space).

The index number for block placement is usually given by some low-order bits of the block's address.

This can also be expressed as:

(Index) = (Block address) mod (Number of blocks in cache)

Note that in a direct-mapped cache, block placement and replacement choices are both completely determined by the address of the new block that is to be accessed.
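The placement rule above can be tried out with a small model. This is a sketch under assumptions: the cache has 8 blocks, each slot stores only the block address it currently holds, and the class name `DirectMappedCache` is made up for the example.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each block address maps to exactly one slot."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.slots = [None] * num_blocks  # block address held in each slot

    def access(self, block_address):
        """Return (index, hit) for one access."""
        index = block_address % self.num_blocks  # the only possible slot
        hit = self.slots[index] == block_address
        if not hit:
            # No replacement choice exists: the new block overwrites
            # whatever currently occupies its slot.
            self.slots[index] = block_address
        return index, hit


cache = DirectMappedCache(num_blocks=8)
for block in [12, 12, 20, 12]:
    print(block, cache.access(block))
# 12 (4, False)
# 12 (4, True)
# 20 (4, False)
# 12 (4, False)
```

Blocks 12 and 20 both map to index 4 (12 mod 8 = 20 mod 8 = 4), so they keep evicting each other even though the other seven slots are empty. When the number of blocks is a power of two, the mod operation is the same as taking the low-order bits of the block address.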

