Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Objectives
Be able to explain the rationale for virtual memory (VM)
Be able to describe the functionality provided by VM
Be able to translate virtual addresses to physical addresses
Be able to explain how VM benefits fork() and execve()
2Virtual MemoryCox / Fagan
Programs refer to virtual memory addresses
movl (%ecx),%eax Conceptually very large array of bytes Each byte has its own address System provides address space private
to particular “process”
Compiler and run-time system allocate VM
Where different program objects should be stored
All allocation within single virtual address space
Virtual Memory
00 0∙∙∙∙∙∙
FF F∙∙∙∙∙∙
3Virtual MemoryCox / Fagan
Problem 1: How Does Everything Fit?
64-bit addresses:16 Exabyte (16 billion GB!) Physical main memory:
Tens or Hundreds of Gigabytes
?
And there are many processes ….
4Virtual MemoryCox / Fagan
Problem 2: Memory Management
Physical main memory
What goes where?
stackheap
.text
.data…
Process 1Process 2Process 3
…Process n
x
5Virtual MemoryCox / Fagan
Problem 3: How To Protect
Physical main memory
Process i
Process j
Problem 4: How To Share?Physical main memory
Process i
Process j
6Virtual MemoryCox / Fagan
Solution: Indirection“All problems in computer science can be solved by another level of indirection... Except for the problem of too many layers of indirection.” – David Wheeler
Mapping solves the previous problemsEach process gets its own private memory space
Physical memory
Virtual memory
Virtual memory
Process 1
Process n
mapping
7Virtual MemoryCox / Fagan
Address Spaces
Virtual address space: Set of N = 2n virtual addresses{0, 1, 2, 3, …, N-1}
Physical address space: Set of M = 2m physical addresses{0, 1, 2, 3, …, M-1}
Clear distinction between data (bytes) and their attributes (addresses)
Each data object can have multiple addressesEvery byte in main memory: one physical address, one (or more) virtual addresses
8Virtual MemoryCox / Fagan
A System Using Physical Addressing
Used in “simple” systems like embedded microcontrollers in devices like cars, elevators, and digital picture frames
0:1:
M-1:
Main memory
CPU
2:3:4:5:6:7:
Physical address(PA)
Data word
8: ...
9Virtual MemoryCox / Fagan
A System Using Virtual Addressing
Used in all modern servers, desktops, laptops, tablets, and cell phones
0:1:
M-1:
Main memory
MMU
2:3:4:5:6:7:
Physical address(PA)
Data word
8: ...CPU
Virtual address(VA)
CPU Chip
10Virtual MemoryCox / Fagan
Why Virtual Memory (VM)?
Efficient use of limited main memory (RAM) Use RAM as a cache for parts of a virtual address space
• some non-cached parts stored on disk• some (unallocated) non-cached parts stored nowhere
Keep only active areas of virtual address space in RAM• transfer data back and forth to disk as needed
Share immutable code and data
Simplifies memory management for programmers Each process gets a full private address space
Isolates address spaces One process can’t interfere with another’s memory
• because they operate in different address spaces User process cannot access privileged information
• different sections of address spaces have different permissions
11Virtual MemoryCox / Fagan
Outline
Virtual memory (VM) Overview and motivation VM as tool for caching VM as tool for memory management VM as tool for memory protection Address translation mmap(), fork(), exec()
12Virtual MemoryCox / Fagan
VM as a Tool for CachingVirtual memory: array of N = 2n contiguous bytes
think of the array (allocated part) as being stored on diskPhysical main memory (DRAM) = cache for allocated virtual
memoryBasic unit: page; size = 2p
PP 2m-p-1
Physical memory
Empty
Empty
Uncached
VP 0VP 1
VP 2n-p-1
Virtual memory
UnallocatedCached
UncachedUnallocated
CachedUncached
PP 0PP 1
EmptyCached
0
2n-12m-1
0
Virtual pages (VP's) stored on disk
Physical pages (PP's) cached in DRAM
Disk
13Virtual MemoryCox / Fagan
An Example Memory Hierarchy
Disk
Main Memory
L2 unified cache
L1 I-cache
L1 D-cache
CPU Reg
2 B/cycle8 B/cycle16 B/cycle 1 B/30 cyclesThroughput:Latency: 100 cycles14 cycles3 cycles millions
MBsKBs GBs TBs
Not drawn to scale
L1/L2 cache: 64 B blocks
Miss penalty (latency): 30x
Miss penalty (latency): 10,000x
14Virtual MemoryCox / Fagan
DRAM Cache OrganizationDRAM cache organization driven by the enormous miss penalty
DRAM is about 10x slower than the SRAM used to implement CPU caches
A solid-state disk is about 100x slower than DRAM and a mechanical disk is about 10,000x slower than DRAM• For first byte, faster for next byte
Consequences Large page size: at least 4-8 KB, sometimes 2-4 MB Fully associative
• Any VP can be placed in any PP
Highly sophisticated replacement algorithms• Too complicated and open-ended to be implemented in
hardware
Write-back rather than write-through
15Virtual MemoryCox / Fagan
A System Using Virtual Addressing
Used in all modern servers, laptops, and cell phones
0:1:
M-1:
Main memory
MMU
2:3:4:5:6:7:
Physical address(PA)
Data word
8: ...CPU
Virtual address(VA)
CPU Chip
16Virtual MemoryCox / Fagan
Address Translation: Page TablesA page table is an array of page table entries (PTEs) that maps virtual pages to physical pages. Here: 8 VPs
Per-process kernel data structure in DRAM
null
null
Memory residentpage table
(DRAM)
Physical memory(DRAM)
VP 7VP 4
Virtual memory(disk)
Valid01
010
10
1
Physical pagenumber or
disk addressPTE 0
PTE 7
PP 0VP 2VP 1
PP 3
VP 1
VP 2
VP 4
VP 6
VP 7
VP 3
17Virtual MemoryCox / Fagan
Address Translation With a Page Table
Virtual page number (VPN) Virtual page offset (VPO)
Virtual address
Physical page number (PPN) Physical page offset (PPO)
Physical address
Page table base register
(PTBR)
Valid Physical page number (PPN)Page table Page table address
for process
Valid bit = 0:page not in memory
(page fault)
18Virtual MemoryCox / Fagan
Page HitPage hit: reference to VM word that is in physical memory
null
null
Memory residentpage table
(DRAM)
Physical memory(DRAM)
VP 7VP 4
Virtual memory(disk)
Valid01
010
10
1
Physical pagenumber or
disk addressPTE 0
PTE 7
PP 0VP 2VP 1
PP 3
VP 1
VP 2
VP 4
VP 6
VP 7
VP 3
Virtual address
19Virtual MemoryCox / Fagan
Page MissPage miss: reference to VM word that is not in physical memory
null
null
Memory residentpage table
(DRAM)
Physical memory(DRAM)
VP 7VP 4
Virtual memory(disk)
Valid01
010
10
1
Physical pagenumber or
disk addressPTE 0
PTE 7
PP 0VP 2VP 1
PP 3
VP 1
VP 2
VP 4
VP 6
VP 7
VP 3
Virtual address
20Virtual MemoryCox / Fagan
Handling Page FaultPage miss causes page fault (an exception)
null
null
Memory residentpage table
(DRAM)
Physical memory(DRAM)
VP 7VP 4
Virtual memory(disk)
Valid01
010
10
1
Physical pagenumber or
disk addressPTE 0
PTE 7
PP 0VP 2VP 1
PP 3
VP 1
VP 2
VP 4
VP 6
VP 7
VP 3
Virtual address
21Virtual MemoryCox / Fagan
Handling Page FaultPage miss causes page fault (an exception)Page fault handler selects a victim to be evicted (here VP 4)
null
null
Memory residentpage table
(DRAM)
Physical memory(DRAM)
VP 7VP 4
Virtual memory(disk)
Valid01
010
10
1
Physical pagenumber or
disk addressPTE 0
PTE 7
PP 0VP 2VP 1
PP 3
VP 1
VP 2
VP 4
VP 6
VP 7
VP 3
Virtual address
22Virtual MemoryCox / Fagan
Handling Page FaultPage miss causes page fault (an exception)Page fault handler selects a victim to be evicted (here VP 4)
null
null
Memory residentpage table
(DRAM)
Physical memory(DRAM)
VP 7VP 3
Virtual memory(disk)
Valid01
100
10
1
Physical pagenumber or
disk addressPTE 0
PTE 7
PP 0VP 2VP 1
PP 3
VP 1
VP 2
VP 4
VP 6
VP 7
VP 3
Virtual address
23Virtual MemoryCox / Fagan
Handling Page FaultPage miss causes page fault (an exception)Page fault handler selects a victim to be evicted (here VP 4)Offending instruction is restarted: page hit!
null
null
Memory residentpage table
(DRAM)
Physical memory(DRAM)
VP 7VP 3
Virtual memory(disk)
Valid01
100
10
1
Physical pagenumber or
disk addressPTE 0
PTE 7
PP 0VP 2VP 1
PP 3
VP 1
VP 2
VP 4
VP 6
VP 7
VP 3
Virtual address
24Virtual MemoryCox / Fagan
Disk
Servicing a Page Fault
(1) Processor signals disk controller
Read block of length P starting at disk address X and store starting at memory address Y
(2) Read occurs Direct Memory Access
(DMA) Under control of I/O
controller
(3) Controller signals completion
Interrupts processor OS resumes suspended
process Disk
Memory-I/O busMemory-I/O bus
ProcessorProcessor
CacheCache
MemoryMemoryI/O
controllerI/O
controller
Reg
(2) DMA Transfer
(1) Initiate Block Read
(3) Read Done
25Virtual MemoryCox / Fagan
Why does it work? LocalityVirtual memory works because of locality
At any point in time, programs tend to access a set of active virtual pages called the working set
Programs with better temporal locality will have smaller working sets
If (working set size < main memory size) Good performance for one process after initial
compulsory misses
If ( SUM(working set sizes) > main memory size )
Thrashing: Performance meltdown where pages are swapped (copied) in and out continuously
26Virtual MemoryCox / Fagan
Outline
Virtual memory (VM) Overview and motivation VM as tool for caching VM as tool for memory management VM as tool for memory protection Address translation mmap(), fork(), exec()
27Virtual MemoryCox / Fagan
VM as a Tool for Memory ManagementMemory allocation
Each virtual page can be mapped to any physical page A virtual page can be stored in different physical pages at
different timesSharing code and data among processes
Map virtual pages to the same physical page (here: PP 6)
Virtual Address
Space for Process 1:
Physical Address
Space (DRAM)
0
N-1(e.g., read-only
library code)
Virtual Address
Space for Process 2:
VP 1VP 2
...
0
N-1
VP 1VP 2
...
PP 2
PP 6
PP 8
...
0
M-1
Address translation
28Virtual MemoryCox / Fagan
Simplifying Linking and Loading
Linking Each program has similar
virtual address space Code, shared libraries,
and stack can start at the same address in each process
Loading execve() allocates virtual
pages for .text and .data sections = creates PTEs marked as invalid
Pages within the .text and .data sections are loaded on demand by the virtual memory system
Kernel virtual memory
Memory-mapped region forshared libraries
Run-time heap(created by malloc)
User stack(created at runtime)
Unused
%rsp (stack
pointer)
Memoryinvisible touser code
Read/write segment(.data, .bss)
Read-only segment(.init, .text, .rodata)
Loaded from the
executable file
29Virtual MemoryCox / Fagan
Outline
Virtual memory (VM) Overview and motivation VM as tool for caching VM as tool for memory management VM as tool for memory protection Address translation mmap(), fork(), exec()
30Virtual MemoryCox / Fagan
VM as a Tool for Memory Protection
Extend PTEs with permission bits
Page fault handler checks these before remapping
If violated, send process SIGSEGV (segmentation fault)
Process i: AddressREAD WRITE
PP 6Yes NoPP 4Yes YesPP 2Yes
VP 0:VP 1:VP 2:
•••
Process j:
Yes
SUP
NoNoYes
AddressREAD WRITE
PP 9Yes NoPP 6Yes Yes
PP 11Yes Yes
SUP
NoYesNo
VP 0:VP 1:VP 2:
Physical Address Space
PP 2
PP 4
PP 6
PP 8PP 9
PP 11
kernel/supervisor mode needed
31Virtual MemoryCox / Fagan
Outline
Virtual memory (VM) Overview and motivation VM as tool for caching VM as tool for memory management VM as tool for memory protection Address translation mmap(), fork(), exec()
32Virtual MemoryCox / Fagan
Address Translation: Page Hit
1) Processor sends virtual address to MMU
2-3) MMU fetches PTE from page table in memory
4) MMU sends physical address to cache/memory
5) Cache/memory sends data word to processor
MMU Cache/MemoryPA
Data
CPUVA
CPU Chip PTEA
PTE1
2
3
4
5
33Virtual MemoryCox / Fagan
Address Translation: Page Fault
1) Processor sends virtual address to MMU
2-3) MMU fetches PTE from page table in memory
4) Valid bit is zero, so MMU triggers page fault exception
5) Handler identifies victim (and, if dirty, pages it out to disk)
6) Handler pages in new page and updates PTE in memory
7) Handler returns to original process, restarting faulting instruction
MMU Cache/Memory
CPU VA
CPU Chip PTEA
PTE1
2
3
4
5
Disk
Page fault handler
Victim page
New page
Exception
6
7
34Virtual MemoryCox / Fagan
Page Tables are often Multi-level in reality; accessing them is slowGiven:
4KB (212) page size 48-bit address space 8-byte PTE
Problem: Would need a 512 GB page table!
• 248 * 2-12 * 23 = 239 bytesHardware solution:
Multi-level page tables Example: 2-level page table
• Level 1 table: each PTE points to a level 2 page table
• Level 2 table: each PTE points to a physical page
Complementary software: Level 1 table stays in memory Level 2 tables paged in and out like
other data
Level 1Table
...
Level 2Tables
...
35Virtual MemoryCox / Fagan
A Two-Level Page Table HierarchyLevel 1
page table
...
Level 2page tables
VP 0
...
VP 1023
VP 1024
...
VP 2047
Gap
0
PTE 0
...
PTE 1023
PTE 0
...
PTE 1023
1023 nullPTEs
PTE 1023 1023 unallocated
pagesVP 9215
Virtualmemory
(1K - 9)null PTEs
PTE 0
PTE 1
PTE 2 (null)
PTE 3 (null)
PTE 4 (null)
PTE 5 (null)
PTE 6 (null)
PTE 7 (null)
PTE 8
2K allocated VM pagesfor code and data
6K unallocated VM pages
1023 unallocated pages
1 allocated VM pagefor the stack
36Virtual MemoryCox / Fagan
Translating with a k-level Page Table
VPN 10p-1n-1
VPOVPN 2 ... VPN k
PPN
0p-1m-1
PPOPPN
Virtual Address
Physical Address
... ...Level 1
page tableLevel 2
page tableLevel k
page table
37Virtual MemoryCox / Fagan
Speeding up Translation with a TLB
Page table entries (PTEs) are cached in L1 like any other memory word
PTEs may be evicted by other data references PTE hit still incurs memory access delay
Solution: Translation Lookaside Buffer (TLB) Small hardware cache in MMU Maps virtual page numbers to physical page
numbers Contains complete page table entries for small
number of pages
38Virtual MemoryCox / Fagan
Without TLB
1) Processor sends virtual address to MMU
2-3) MMU fetches PTE from page table in memory
4) MMU sends physical address to cache/memory
5) Cache/memory sends data word to processor
MMU Cache/MemoryPA
Data
CPUVA
CPU Chip PTEA
PTE1
2
3
4
5
39Virtual MemoryCox / Fagan
TLB Hit
MMU Cache/Memory
PA
Data
CPUVA
CPU Chip
PTE
1
2
4
5
A TLB hit eliminates a memory access
TLB
VPN 3
40Virtual MemoryCox / Fagan
TLB Miss
MMU Cache/MemoryPA
Data
CPUVA
CPU Chip
PTE
1
2
5
6
TLB
VPN
4
PTEA3
A TLB miss incurs an add’l memory access (the PTE)Fortunately, TLB misses are rare
Might need multipleiterations if a multi-level
page table is used
41Virtual MemoryCox / Fagan
Simple Memory System Example
Addressing 14-bit virtual addresses 12-bit physical address Page size = 64 bytes
13 12 11 10 9 8 7 6 5 4 3 2 1 0
11 10 9 8 7 6 5 4 3 2 1 0
VPO
PPOPPN
VPN
Virtual Page Number Virtual Page Offset
Physical Page Number Physical Page Offset42Virtual MemoryCox / Fagan
Simple Memory System Page Table
Only show first 16 entries (out of 256)Assume all other entries are invalid
10D0F1110E12D0D0–0C0–0B1090A1170911308
ValidPPNVPN
0–070–06116050–0410203133020–0112800
ValidPPNVPN
43Virtual MemoryCox / Fagan
Simple Memory System TLB
16 entries4-way associative
13 12 11 10 9 8 7 6 5 4 3 2 1 0
VPOVPN
TLBITLBT
0–021340A10D030–073
0–030–060–080–022
0–0A0–040–0212D031
102070–0010D090–030
ValidPPNTagValidPPNTagValidPPNTagValidPPNTagSet
44Virtual MemoryCox / Fagan
Simple Memory System Cache16 lines, 4-byte block sizePhysically addressedDirect mapped
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
03DFC2111167
––––0316
1DF0723610D5
098F6D431324
––––0363
0804020011B2
––––0151
112311991190
B3B2B1B0ValidTagIdx
––––014F
D31B7783113E
15349604116D
––––012C
––––00BB
3BDA159312DA
––––02D9
8951003A1248
B3B2B1B0ValidTagIdx
Cox / Fagan 45Virtual Memory
Address Translation Example #1
Virtual Address: 0x03D4
VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____
Physical Address
CO ___ CI___ CT ____ Hit? __ Byte: ____
13 12 11 10 9 8 7 6 5 4 3 2 1 0
VPOVPN
TLBITLBT
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
00101011110000
0x0F 3 0x03 Y N 0x0D
0001010 11010
0 0x5 0x0D Y 0x36
46Virtual MemoryCox / Fagan
Address Translation Example #2
Virtual Address: 0x0B8F
VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____
Physical Address
CO ___ CI___ CT ____ Hit? __ Byte: ____
13 12 11 10 9 8 7 6 5 4 3 2 1 0
VPOVPN
TLBITLBT
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
11110001110100
0x2E 2 0x0B N Y TBD
47Virtual MemoryCox / Fagan
Address Translation Example #3
Virtual Address: 0x0020
VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____
Physical Address
CO___ CI___ CT ____ Hit? __ Byte: ____
13 12 11 10 9 8 7 6 5 4 3 2 1 0
VPOVPN
TLBITLBT
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
00000100000000
0x00 0 0x00 N N 0x28
0000000 00111
0 0x8 0x28 N In DRAM, don’t know value
48Virtual MemoryCox / Fagan
Outline
Virtual memory (VM) Overview and motivation VM as tool for caching VM as tool for memory management VM as tool for memory protection Address translation mmap(), fork(), exec()
49Virtual MemoryCox / Fagan
Memory MappingCreation of new VM area done via “memory mapping”
Create new vm_area_struct and page tables for area
Area can be backed by (i.e., get its initial values from) : Regular file on disk (e.g., an executable object file)
• Initial page bytes come from a section of a file
Nothing (e.g., .bss)• First fault will allocate a physical page full of 0's (demand-
zero)• Once the page is written to (dirtied), it is like any other page
Dirty pages are swapped back and forth between a special swap area of the disk.
Key point: no virtual pages are copied into physical memory until they are referenced!
Known as “demand paging” Crucial for time and space efficiency
50Virtual MemoryCox / Fagan
User-Level Memory Mappingvoid *mmap(void *start, size_t len, int prot, int flags, int fd, off_t offset)
len bytes
start(or address
chosen by kernel)
Process virtual memoryDisk file specified by file descriptor fd
len bytes
offset(bytes)
51Virtual MemoryCox / Fagan
User-Level Memory Mappingvoid *mmap(void *start, size_t len, int prot, int flags, int fd, off_t offset)
Map len bytes starting at offset offset of the file specified by file descriptor fd, preferably at address start
start: may be 0 for “pick an address” prot: PROT_READ, PROT_WRITE, ... flags: MAP_PRIVATE, MAP_SHARED, ...
Return a pointer to start of mapped area (may not be start)
Example: fast file-copy Useful for applications like Web servers that need to
quickly copy files. mmap()allows file transfers without copying into user space.
52Virtual MemoryCox / Fagan
mmap() Example: Fast File Copy
/* mmap.c - a program that uses mmap to copy itself to stdout *//* include <unistd.h>, <sys/mman.h>, <sys/stat.h>, and <fcntl.h> */intmain(void){ struct stat stat; int fd; char *bufp;
/* Open the file & get its size. */ fd = open("./mmap.c", O_RDONLY); fstat(fd, &stat);
/* Map the file to a new VM area. */ bufp = mmap(NULL, stat.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
/* Write the VM area to stdout. */ write(STDOUT_FILENO, bufp, stat.st_size);
return (0);}
53Virtual MemoryCox / Fagan
Cox / Fagan Virtual Memory 54
fork() Revisited
To create a new process using fork(): Make copies of the old process’ page table, etc.
• The two processes now share all of their pages Copy-on-write
• Allows each process to have a separate address space without copying all of the virtual pages
• Make pages of writeable areas read-only• Flag these areas as private “copy-on-write” in OS• Writes to these pages will cause protection faults
– Fault handler recognizes copy-on-write, makes a copy of the page, and restores write permissions
Net result: Processes have identical address spaces Copies are deferred until absolutely necessary
Cox / Fagan Virtual Memory 55
exec() Revisited
To load p using exec: Delete existing page
tables, etc. Create new page tables,
etc.:• Stack/heap/.bss are
anonymous, demand-zero• Code and data is mapped
to ELF executable file p Shared libraries are
dynamically linked and mapped
Set program counter to entry point in .text• OS will swap in pages
from disk as they are used
User Stack
Shared Libraries
Heap
Read/Write Data
Read-only Code and Data
Unused
%rsp
brk
.rodata.text
p
demand-zero (.bss)
demand-zero
libc.so
.data.text
.data
Cox / Fagan Virtual Memory 56
Virtual Memory
Supports many OS-related functions Process creation
• Initial• Forking children
Task switching Protection/sharing
Combination of hardware & software implementation
Software manages page tables, page allocations Hardware reads page tables
• Page fault when no entry Hardware enforcement of protection
• Protection fault when invalid access
Next Time
System-Level I/O
57Virtual MemoryCox / Fagan