Notes on: Cache Comparison Problem

A 450MHz Pentium with 32KBytes L1 cache, 128MBytes RAM, and a 133MHz system bus runs a program with an average working set size of 80KBytes. While in a working set the program has a 0.9997 probability that the next memory request will be from this working set and a 0.9 probability that the next memory request will be the next instruction/data value in memory (i.e. 10% of the time a request is from a random memory address in the working set). (Note: when the program changes working sets, it will begin making memory requests from the new working set with 0.9997 probability.)

(1) Determine how much (if any) performance improvement could be achieved by adding a 256KByte L2 cache (access speed = 450/2 MHz, i.e. half the processor clock) to the processor.
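As a starting point for question (1), the clock rates in the problem can be turned into per-access times. The sketch below is only illustrative: the names L1_ACC, L2_ACC and RAM_ACC are made up here, and it assumes that L1 runs at the full processor clock and that every access costs exactly one cycle of the relevant clock, neither of which the problem states.

```python
# Clock rates taken from the problem statement (MHz).
CPU_MHZ = 450.0
L2_MHZ = CPU_MHZ / 2      # "access speed = 450/2 MHz"
BUS_MHZ = 133.0


def cycle_ns(mhz: float) -> float:
    """Return the length of one clock cycle in nanoseconds."""
    return 1000.0 / mhz


# ASSUMPTION: L1 is accessed at the core clock and each access takes
# one cycle of its clock. Real DRAM latency is several bus cycles.
L1_ACC = cycle_ns(CPU_MHZ)
L2_ACC = cycle_ns(L2_MHZ)
RAM_ACC = cycle_ns(BUS_MHZ)

print(f"L1  access: {L1_ACC:.2f} ns")
print(f"L2  access: {L2_ACC:.2f} ns")
print(f"RAM access: {RAM_ACC:.2f} ns")
```

Under these assumptions an L2 access costs twice an L1 access, and a RAM access costs a little under three and a half L1 accesses.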

(2) Determine what size memory blocks should be moved between cache and RAM.

(3) Give an outline of a memory caching strategy that makes sense.

It is strongly recommended that you take the time to investigate the details of cache memory, especially the operation of the cache controller chip-set. As a guide, try to answer the following questions about the operation of the Pentium II/III and the related cache controller.

When there is a cache miss, how many words of memory does the CPU need to be able to continue processing?

When the cache memory is replaced, how many words are transferred? (assume 4K).

What hardware component is responsible for the transfer of blocks of memory to cache?

Given the stats listed in the problem, what will happen to the hit-ratio while a new block of memory is being loaded into cache?

Flowchart for L1 Only Cache Analysis

Decision boxes (each with a yes and a no branch):

is memory access at next address?

is memory access in working set?

is memory access in cache?

is num >= numtotal?

Action boxes:

start

tacc = tacc + L1acc; num = num + 1

tacc = tacc + block_size*(.75*L1acc + .25*RAMacc); num = num + block_size (simulates block read for 4-way set associative L1 cache)

tavg = tacc/num

stop
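One way to read the L1-only flowchart as a loop is sketched below. The branch wiring is an assumption: a request at the next address, or one that is in the working set and in cache, is charged L1acc, and everything else is charged the block-read expression. The function name, the 8-word block size, and the 0.4 chance of finding a random working-set address in cache (32 KBytes of L1 over an 80 KByte working set) are likewise illustrative choices, not values given on the slide.

```python
import random


def simulate_l1_only(num_total, l1_acc, ram_acc, block_size=8,
                     p_next=0.9, p_working_set=0.9997, p_in_cache=0.4,
                     seed=0):
    """Monte Carlo estimate of tavg for the L1-only flowchart.

    ASSUMPTIONS (not from the slides): block_size is in words,
    p_in_cache = 32/80, and both kinds of miss trigger a block read.
    """
    rng = random.Random(seed)
    tacc = 0.0   # accumulated access time
    num = 0      # number of words accessed so far
    while num < num_total:
        if rng.random() < p_next:
            # Request is the next instruction/data value: L1 hit.
            tacc += l1_acc
            num += 1
        elif rng.random() < p_working_set and rng.random() < p_in_cache:
            # Random address in the working set that is already cached.
            tacc += l1_acc
            num += 1
        else:
            # Block read, using the expression from the flowchart.
            tacc += block_size * (0.75 * l1_acc + 0.25 * ram_acc)
            num += block_size
    return tacc / num


# Example use with one-cycle access times in ns (see assumptions above).
tavg = simulate_l1_only(200_000, l1_acc=1000 / 450, ram_acc=1000 / 133)
print(f"tavg (L1 only) = {tavg:.3f} ns per word")
```

Because every word is charged either L1acc or .75*L1acc + .25*RAMacc, the result always lies between those two values; how close it sits to L1acc depends mostly on the assumed p_in_cache.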

Flowchart for L1/L2 Cache Analysis

Decision boxes:

is memory access at next address? (yes / no)

is memory access in working set? (yes / no)

which cache? (L1 / L2)

is num >= numtotal?

Action boxes:

start

tacc = tacc + L1acc (two boxes); num = num + 1

tacc = tacc + block_size*(.75*L2acc + .25*RAMacc); num = num + block_size (simulates block read for 4-way set associative L2 cache)

tavg = tacc/num

stop

You may assume a write-through operation between L1 and L2 cache (i.e. no additional delay).
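The L1/L2 flowchart can also be evaluated without a simulation, by taking the long-run ratio of expected time to expected words per pass through the loop. The sketch below assumes a particular wiring: a next-address request or an L1 hit costs L1acc, and every other request (served from L2, or outside the working set) is charged the block-read expression. The function name, the 8-word block size and p_l1 = 0.4 (32 KBytes of L1 over an 80 KByte working set) are illustrative assumptions.

```python
def expected_tavg_l1_l2(l1_acc, l2_acc, ram_acc, block_size=8,
                        p_next=0.9, p_working_set=0.9997, p_l1=0.4):
    """Long-run average time per word for the L1/L2 flowchart.

    ASSUMPTIONS (not from the slides): block_size is in words,
    p_l1 = 32/80, and every non-L1 request causes a block read.
    """
    # Probability that one pass through the loop is a single L1 access.
    p_fast = p_next + (1.0 - p_next) * p_working_set * p_l1
    p_block = 1.0 - p_fast

    # Cost of one block read, using the expression from the flowchart.
    block_time = block_size * (0.75 * l2_acc + 0.25 * ram_acc)

    # Expected time and expected word count added per pass.
    time_per_pass = p_fast * l1_acc + p_block * block_time
    words_per_pass = p_fast + p_block * block_size
    return time_per_pass / words_per_pass


# Example use with one-cycle access times in ns (an assumption).
tavg = expected_tavg_l1_l2(l1_acc=1000 / 450, l2_acc=1000 / 225,
                           ram_acc=1000 / 133)
print(f"tavg (L1/L2) = {tavg:.3f} ns per word")
```

The same ratio with L1acc in place of L2acc in the block-read term gives the L1-only figure, so the two flowcharts can be compared directly for question (1).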