Chapter Seven: Memory Hierarchy-3, by Patterson (2004 Morgan Kaufmann Publishers)

7.4 Virtual Memory

• The technique of using the main memory as a cache for the secondary storage is called virtual memory.

• The principle of locality enables virtual memory as well as caches, and virtual memory allows us to efficiently share the processor as well as main memory.

• Two major reasons for using virtual memory:

– Allow efficient and safe sharing of memory among multiple programs.

– Remove the programming burdens of a small, limited amount of main memory. (To avoid exceeding the size of main memory.)

• The programs sharing the memory change dynamically while the programs are running. Each program is compiled into its own address space: a separate range of memory locations accessible only to this program.

• Virtual memory implements the translation of a program’s address space to physical addresses (An address in main memory.)

• The translation process enforces protection of a program’s address space from other programs.

7.4 Virtual Memory

• Virtual memory also allows a single user program to exceed the size of primary memory.

• A virtual memory block is called a page, and a virtual memory miss is called a page fault.

• With virtual memory, the processor produces a virtual address, which is translated by a combination of hardware and software to a physical address, which in turn can be used to access main memory.

• A virtual address is an address that corresponds to a location in virtual space and is translated by address mapping to a physical address when memory is accessed.

• Address translation (address mapping) is the process by which a virtual address is mapped to an address used to access memory.

7.4 Virtual Memory

Address mapping is also called address translation.

• Advantages:

– illusion of having more physical memory

– program relocation

– protection

[Figure: address translation maps virtual addresses either to physical addresses or to disk addresses.]

7.4 Virtual Memory

• Virtual memory simplifies loading the program for execution by providing relocation.

• Relocation: It maps the virtual address used by a program to different physical addresses before the addresses are used to access memory.

– This relocation allows us to load the program anywhere in main memory.

– It relocates the program in fixed-size blocks (pages).

• This eliminates the need to find a contiguous block of memory in which to allocate a program.

• In virtual memory, the address is broken into a virtual page number and a page offset.

• The virtual page number is translated to a physical page number.

• Physical address = physical page number concatenated with the page offset (the offset is not changed by translation).
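The split into page number and offset can be sketched in a few lines of Python. This is only an illustration: the 12-bit offset assumes 4 KB pages, and the function names and the example physical page number are hypothetical, not taken from the slides.

```python
PAGE_OFFSET_BITS = 12  # assumption: 4 KB pages, so the offset is 12 bits


def split_virtual_address(vaddr: int) -> tuple[int, int]:
    """Return (virtual page number, page offset) for a virtual address."""
    vpn = vaddr >> PAGE_OFFSET_BITS                 # upper bits
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)  # lower 12 bits
    return vpn, offset


def make_physical_address(ppn: int, offset: int) -> int:
    """Concatenate a physical page number with the unchanged page offset."""
    return (ppn << PAGE_OFFSET_BITS) | offset


if __name__ == "__main__":
    vpn, offset = split_virtual_address(0x00403ABC)
    # Hypothetical mapping: suppose virtual page 0x403 lives in physical page 0x12.
    paddr = make_physical_address(0x12, offset)
    print(hex(vpn), hex(offset), hex(paddr))
```

Only the page number is translated; the offset is carried over bit for bit, which is why "+" in the formula above is a concatenation rather than an arithmetic sum.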

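Pages: virtual memory blocks

[Figure: translation of a virtual address to a physical address. The 32-bit virtual address consists of a virtual page number (bits 31-12) and a page offset (bits 11-0). Translation replaces the virtual page number with a physical page number (bits 29-12 of the physical address); the page offset (bits 11-0) is unchanged.]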

7.4 Virtual Memory

• The number of bits in the page-offset field determines the page size.

• Page faults: the data is not in memory, retrieve it from disk

– huge miss penalty, thus pages should be fairly large (e.g., 4KB)

– reducing page faults is important (LRU is worth the price)

– can handle the faults in software instead of hardware

– using write-through is too expensive so we use write-back

• Because page faults are so expensive, designers reduce page fault frequency by optimizing page placement.

• If we allow a virtual page to be mapped to any physical page, the operating system can then choose to replace any page it wants when a page fault occurs.

• Since a full search is impractical, we locate pages by using a table that indexes the memory; this structure is called a page table and resides in memory.

7.4 Virtual Memory

• Each program has its own page table. The page table contains the virtual-to-physical address translations in a virtual memory system. The table, which is stored in memory, is typically indexed by the virtual page number; each entry in the table contains the physical page number for that virtual page if the page is currently in memory.

• To indicate the location of the page table in memory, the hardware includes a register called the page table register that points to the start of the page table.

Page Tables

[Figure 7-21: The page table register points to the start of the page table. The 20-bit virtual page number (bits 31-12 of the virtual address) indexes the page table; each entry holds a valid bit and an 18-bit physical page number. If the valid bit is 0, the page is not present in memory. The physical page number is combined with the unchanged 12-bit page offset to form the physical address.]

Figure 7-21
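The lookup shown in Figure 7-21 might be sketched as follows. This is a software model for illustration only: the class and function names are hypothetical, the page table is a plain Python list indexed by virtual page number, and 4 KB pages are assumed.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 12  # assumption: 4 KB pages


class PageFault(Exception):
    """Raised when the valid bit of the page table entry is off."""


@dataclass
class PageTableEntry:
    valid: bool = False            # 0 means the page is not present in memory
    physical_page_number: int = 0


def translate(page_table: list[PageTableEntry], vaddr: int) -> int:
    """Translate a virtual address using a page table indexed by VPN."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    entry = page_table[vpn]        # no tag needed: one entry per virtual page
    if not entry.valid:
        raise PageFault(f"virtual page {vpn:#x} is not in memory")
    return (entry.physical_page_number << PAGE_OFFSET_BITS) | offset


if __name__ == "__main__":
    table = [PageTableEntry() for _ in range(4)]   # tiny table for the demo
    table[2] = PageTableEntry(valid=True, physical_page_number=0x5)
    print(hex(translate(table, 0x2ABC)))
```

In real hardware the table's base address comes from the page table register; here the list object itself plays that role.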

Page Faults

• If the valid bit for a virtual page is off, a page fault occurs.

• The virtual address alone does not immediately tell us where the page is on disk. We have to keep track of the location on disk of each page in the virtual address space.

• The operating system usually creates the space on disk for all the pages of a process when it creates the process. This space is called the swap space.

– A data structure is also created to record where each virtual page is stored on disk.

– The operating system also creates a data structure that tracks which processes and which virtual addresses use each physical page.

• The operating system follows the least recently used (LRU) replacement scheme when a page fault occurs and all the pages in main memory are in use.
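A minimal sketch of LRU page replacement, assuming exact LRU bookkeeping on every access (practical systems usually only approximate this, for example with reference bits). The class name and its fields are hypothetical and exist only for this illustration.

```python
from collections import OrderedDict


class LRUPageFrames:
    """Tracks which virtual pages are resident in a fixed number of frames."""

    def __init__(self, num_frames: int):
        self.num_frames = num_frames
        self.resident = OrderedDict()  # vpn -> frame, least recently used first
        self.page_faults = 0

    def access(self, vpn: int):
        """Touch a page; return the evicted virtual page number, or None."""
        if vpn in self.resident:
            self.resident.move_to_end(vpn)   # now the most recently used
            return None
        self.page_faults += 1                # page was not in memory
        evicted = None
        if len(self.resident) >= self.num_frames:
            # All frames in use: replace the least recently used page.
            evicted, frame = self.resident.popitem(last=False)
        else:
            frame = len(self.resident)       # next free frame
        self.resident[vpn] = frame
        return evicted


if __name__ == "__main__":
    memory = LRUPageFrames(num_frames=2)
    for page in (1, 2, 1, 3):
        print(page, "evicts", memory.access(page))
```

With two frames and the access sequence 1, 2, 1, 3, the final access replaces page 2, because page 1 was touched more recently.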

Page Tables

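[Figure 7-22: The page table is indexed by virtual page number. Each entry holds a valid bit and a physical page or disk address: entries with valid = 1 point to pages in physical memory, and entries with valid = 0 point to pages in disk storage.]

The page table maps each page in virtual memory to either a page in main memory or a page stored on disk, which is the next level in the hierarchy.

Figure 7-22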

Page Faults

• Elaboration:

With a 32-bit virtual address, 4 KB pages, and 4 bytes per page table entry, we can compute the total page table size:

Number of page table entries = $2^{32} / 2^{12} = 2^{20}$

Size of page table = $2^{20}$ page table entries $\times$ $2^{2}$ bytes per page table entry = $2^{22}$ bytes = 4 MB
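The same arithmetic can be written as a small helper. This is a sketch under the assumptions of the calculation above (a single-level page table with one entry per virtual page and a power-of-two page size); the function name is hypothetical.

```python
def page_table_size_bytes(virtual_address_bits: int,
                          page_size_bytes: int,
                          bytes_per_entry: int) -> int:
    """Size of a flat page table with one entry per virtual page."""
    page_offset_bits = page_size_bytes.bit_length() - 1  # log2 of the page size
    if 1 << page_offset_bits != page_size_bytes:
        raise ValueError("page size must be a power of two")
    num_entries = 1 << (virtual_address_bits - page_offset_bits)
    return num_entries * bytes_per_entry


if __name__ == "__main__":
    size = page_table_size_bytes(32, 4 * 1024, 4)
    print(size // (1024 * 1024), "MB")
```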

Making Address Translation Fast

• With virtual memory, every memory access by a program can take at least twice as long: one memory access to obtain the physical address and a second access to get the data.

• The key to improving access performance is to rely on locality of reference to the page table.

• See Figure 7-23 for the TLB.

• The TLB (translation-lookaside buffer) acts as a cache on the page table, holding only entries that map to physical pages. It contains a subset of the virtual-to-physical page mappings that are in the page table.

Making Address Translation Fast

• A cache for address translations: translation-lookaside buffer (TLB)

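[Figure 7-23: The TLB acts as a cache on the page table. Each TLB entry holds valid, dirty, and reference bits, a tag, and a physical page address. Each page table entry, indexed by virtual page number, holds valid, dirty, and reference bits and a physical page or disk address, pointing either into physical memory or to disk storage.]

Figure 7-23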

Making Address Translation Fast

Typical values:

– size: 16-512 entries

– block size: 1-2 page table entries

– miss rate: 0.01%-1%

– miss penalty: 10-100 cycles

• Each tag entry in the TLB holds a portion of the virtual page number, and each data entry of the TLB holds a physical page number.

• The TLB contains a subset of the virtual-to-physical page mappings that are in the page table.

• Because the TLB is a cache, it must have a tag field.

• If there is no matching entry in the TLB for a page, the page table must be examined.

• The page table either supplies a physical page number for the page or indicates that the page resides on disk, in which case a page fault occurs.

• Because the page table has an entry for every virtual page, no tag field is needed.
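The hit and miss behaviour described above could be modelled roughly as below. This is a hedged sketch, not a description of any real TLB: it assumes a fully associative TLB with LRU replacement (the slides do not fix a replacement policy), uses the full virtual page number as the tag, and represents the page table as a dictionary in which `None` stands for a page that resides on disk. All names are hypothetical.

```python
from collections import OrderedDict


class PageFault(Exception):
    """Raised when the page table says the page resides on disk."""


class TLB:
    def __init__(self, page_table: dict, num_entries: int = 16):
        self.page_table = page_table    # vpn -> ppn, or None if on disk
        self.num_entries = num_entries
        self.entries = OrderedDict()    # tag (vpn) -> ppn, LRU entry first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn: int) -> int:
        """Return the physical page number for a virtual page number."""
        if vpn in self.entries:         # tag match: TLB hit
            self.hits += 1
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        self.misses += 1                # TLB miss: examine the page table
        ppn = self.page_table.get(vpn)
        if ppn is None:
            raise PageFault(f"virtual page {vpn:#x} is on disk")
        if len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)   # assumed LRU replacement
        self.entries[vpn] = ppn         # cache the translation
        return ppn


if __name__ == "__main__":
    tlb = TLB({0: 7, 1: None, 2: 9}, num_entries=2)
    print(tlb.lookup(0), tlb.lookup(0), tlb.hits, tlb.misses)
```

Note that only translations to physical pages are ever stored in the TLB; a page that is on disk causes a page fault and is not cached.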

TLBs and Caches

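[Figure 7-24: The TLB and cache. The 32-bit virtual address is split into a 20-bit virtual page number (bits 31-12) and a 12-bit page offset. The virtual page number is compared against the tags of the TLB entries (valid, dirty, tag, physical page number); on a TLB hit, the 20-bit physical page number is combined with the page offset to form the physical address. The physical address is then divided into an 18-bit physical address tag, an 8-bit cache index, a 4-bit block offset, and a 2-bit byte offset. On a cache hit, the cache (valid, tag, data) delivers 32 bits of data.]

Figure 7-24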

TLBs and Caches

• Figure 7-24 depicts the Intrinsity FastMath TLB.

• 4 KB pages, 32-bit address space; the virtual page number is 20 bits long.

• The physical address is the same size as the virtual address.

• The TLB contains 16 entries, is fully associative, and is shared between the instruction and data references.

• Each TLB entry is 64 bits wide and contains:

– a 20-bit tag (the virtual page number)

– the corresponding physical page number (20 bits)

– a valid bit

– a dirty bit

– other bookkeeping bits

TLBs and caches

[Flowchart, reconstructed as steps:]

• Virtual address: TLB access.

• TLB hit? If no: TLB miss exception. If yes: the physical address is available.

• Write? If no: try to read data from the cache.

– Cache hit? If no: cache miss stall while reading the block. If yes: deliver data to the CPU.

• Write? If yes: is the write access bit on?

– If no: write protection exception.

– If yes: try to write data to the cache. Cache hit? If no: cache miss stall while reading the block. If yes: write data into the cache, update the dirty bit, and put the data and the address into the write buffer.

Figure 7-25 Processing a read or a write-through in the Intrinsity FastMath TLB and cache.
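The decision sequence of Figure 7-25 can be traced with a short sketch. It is a simplification under stated assumptions: the TLB is a dictionary from virtual page number to a hypothetical `TLBEntry`, the cache is modelled only as a set of resident block numbers, 4 KB pages and 64-byte blocks (4-bit block offset plus 2-bit byte offset) are assumed, and the function returns the name of the outcome rather than moving any data.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 12   # assumption: 4 KB pages
BLOCK_OFFSET_BITS = 6   # assumption: 4-bit block offset + 2-bit byte offset


@dataclass
class TLBEntry:
    physical_page_number: int
    write_access: bool      # the "write access bit" of the flowchart


def process_access(tlb: dict, cached_blocks: set,
                   vaddr: int, is_write: bool) -> str:
    """Return the flowchart outcome for one read or write."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    entry = tlb.get(vpn)
    if entry is None:
        return "TLB miss exception"
    if is_write and not entry.write_access:
        return "write protection exception"
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    paddr = (entry.physical_page_number << PAGE_OFFSET_BITS) | offset
    if (paddr >> BLOCK_OFFSET_BITS) not in cached_blocks:
        return "cache miss stall while reading block"
    if is_write:
        return "write data into cache and write buffer"
    return "deliver data to the CPU"


if __name__ == "__main__":
    tlb = {0x1: TLBEntry(physical_page_number=0x2, write_access=False)}
    cached = {0x2040 >> BLOCK_OFFSET_BITS}   # block holding physical 0x2040
    print(process_access(tlb, cached, 0x1040, is_write=False))
    print(process_access(tlb, cached, 0x1040, is_write=True))
```

The ordering of the checks mirrors the flowchart: translation first, then the protection check for writes, and only then the cache access with the physical address.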

Homework #3-2

• Page 558

7.32 Consider three processors with different cache configurations:

– Cache 1: Direct-mapped with one-word blocks

– Cache 2: Direct-mapped with four-word blocks

– Cache 3: Two-way set associative with four-word blocks

The following miss rate measurements have been made:

– Cache 1: Instruction miss rate is 4%; data miss rate is 6%

– Cache 2: Instruction miss rate is 2%; data miss rate is 4%

– Cache 3: Instruction miss rate is 2%; data miss rate is 3%

For these processors, one-half of the instructions contain a data reference. Assume that the cache miss penalty is 6 + Block size in words. The CPI for this workload was measured on a processor with cache 1 and was found to be 2.0. Determine which processor spends the most cycles on cache misses.

Homework #3-2

• Page 559

7.33 The cycle times for the processors in Exercise 7.32 are 420 ps for the first and second processors and 310 ps for the third processor. Determine which processor is the fastest and which is the slowest.

7.39 Consider a virtual memory system with the following properties:

– 40-bit virtual byte address

– 16 KB pages

– 36-bit physical byte address

What is the total size of the page table for each process on this processor, assuming that the valid, protection, dirty, and use bits take a total of 4 bits and that all the virtual pages are in use? (Assume that disk addresses are not stored in the page table.)