10/26/98 2
OS kernel inherently concurrent
From 60s: multiprogramming
– Context switch on I/O wait
– Reentrant interrupts
– Threads simplified the implementation
90s servers, 00s PCs: multiprocessing
– Multiple CPUs executing kernel code
Thread-centric concurrency control
Single-CPU kernel:
– Only one thread in kernel at a time
– No locks
– Disable interrupts to control concurrency
MP kernels inherit this mindset:
1. Control concurrency of threads
2. Add locks to objects only where required
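The two MP styles above can be contrasted in a short sketch (Python threading used purely for illustration; the class and method names are invented): one "big kernel lock" serializes everything, while per-object locks let unrelated operations proceed in parallel.

```python
import threading

class BigKernelLock:
    """Thread-centric: one lock serializes every entry into the kernel."""
    def __init__(self):
        self.lock = threading.Lock()
        self.counter_a = 0
        self.counter_b = 0

    def bump_a(self):
        with self.lock:          # all kernel work contends on this one lock
            self.counter_a += 1

    def bump_b(self):
        with self.lock:
            self.counter_b += 1

class PerObjectLocks:
    """Data-centric: each shared object carries its own lock."""
    def __init__(self):
        self.lock_a = threading.Lock()
        self.lock_b = threading.Lock()
        self.counter_a = 0
        self.counter_b = 0

    def bump_a(self):
        with self.lock_a:        # touching A never blocks users of B
            self.counter_a += 1

    def bump_b(self):
        with self.lock_b:
            self.counter_b += 1

k = PerObjectLocks()
threads = [threading.Thread(target=k.bump_a) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(k.counter_a)  # 4
```

The trade-off the slide points at: the per-object version scales better but forces you to decide, object by object, which invariants each lock protects.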
Case study: memory mapping
Background
Page faults
– challenges
– pseudocode
Page victimization
– challenges
– pseudocode
Discussion & design lessons
Other interesting patterns
– nonblocking queues
– asymmetric reader-writer locks
– one lock/object, different lock for chain
– priority donation locks
– immutable message buffers
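One of these patterns, the nonblocking queue, can be sketched as a single-producer/single-consumer ring buffer: it needs no lock because the producer and consumer each write a disjoint index (a toy Python sketch with invented names; a real kernel version would use atomic operations and memory barriers).

```python
class SPSCQueue:
    """Single-producer/single-consumer ring buffer.

    Lock-free because only the producer writes `head` and only the
    consumer writes `tail`; one slot stays empty to tell full from empty.
    """
    def __init__(self, capacity):
        self.buf = [None] * (capacity + 1)
        self.head = 0   # next slot to write (producer-owned)
        self.tail = 0   # next slot to read (consumer-owned)

    def push(self, item):
        nxt = (self.head + 1) % len(self.buf)
        if nxt == self.tail:
            return False          # full; caller retries, never blocks
        self.buf[self.head] = item
        self.head = nxt           # publish only after the slot is filled
        return True

    def pop(self):
        if self.tail == self.head:
            return None           # empty
        item = self.buf[self.tail]
        self.tail = (self.tail + 1) % len(self.buf)
        return item

q = SPSCQueue(2)
assert q.push(1) and q.push(2)
assert not q.push(3)        # capacity 2: third push is refused
assert q.pop() == 1
```

Neither side ever blocks the other, which is what makes the pattern attractive inside interrupt handlers.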
Life cycle of a page frame

[State diagram — states: unallocated, invalid, IO_pending, valid, dirty, pushout. Labeled transitions: Allocate (unallocated → invalid); Read from disk (invalid → IO_pending → valid); Modify (valid → dirty); Victimize (from valid, and from dirty → pushout); Write to disk (from pushout).]
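The diagram's edges can be transcribed into a small transition table and checked on every status change (state names are from the slide; the checker itself is invented, and the successor of pushout after its disk write completes is not shown on the slide, so it is left out of this sketch).

```python
# Legal pfdat state transitions, transcribed from the life-cycle diagram.
TRANSITIONS = {
    ("unallocated", "invalid"),   # Allocate
    ("invalid", "IO_pending"),    # start read from disk
    ("IO_pending", "valid"),      # read completed
    ("valid", "dirty"),           # Modify
    ("valid", "unallocated"),     # Victimize a clean page
    ("dirty", "pushout"),         # Victimize a dirty page
}

class IllegalTransition(Exception):
    pass

class Pfdat:
    def __init__(self):
        self.status = "unallocated"

    def set_status(self, new):
        if (self.status, new) not in TRANSITIONS:
            raise IllegalTransition(f"{self.status} -> {new}")
        self.status = new

p = Pfdat()
for s in ("invalid", "IO_pending", "valid", "dirty", "pushout"):
    p.set_status(s)
print(p.status)  # pushout
```

An explicit table like this makes the later invariant discussion mechanical: any status write that is not an edge of the diagram is a bug.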
Page fault challenges
– Multiple processes fault to same file page
– Multiple processes fault to same pfdat
– Multiple threads of same process fault to same segment (low frequency)
– Bidirectional mapping between segment pointers and pfdats
– Stop only minimal process set during disk I/O
– Minimize locking/unlocking on fast path
Page fault stage 1

vfault(virtual_address addr)
  segment.lock();
  if ((pfdat = segment.fetch(addr)) == null)
    pfdat = lookup(segment.file, segment.pageNum(addr));
    /* returns locked pfdat */
    if (pfdat.status == PUSHOUT)
      /* do something complicated */
    install pfdat in segment;
    add segment to pfdat owner list;
  else
    pfdat.lock();
Page fault stage 2

if (pfdat.status == IO_PENDING)
  segment.unlock();
  pfdat.wait();
  goto top of vfault;
else if (pfdat.status == INVALID)
  pfdat.status = IO_PENDING;
  pfdat.unlock();
  fetch_from_disk(pfdat);
  pfdat.lock();
  pfdat.status = VALID;
  pfdat.notify_all();
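The IO_PENDING protocol above (mark pending, unlock across the slow disk read, relock, mark VALID, wake waiters) can be exercised with an ordinary condition variable. In this sketch the `time.sleep` is a stand-in for `fetch_from_disk`, and `disk_reads` counts real I/Os to show that many concurrent faults trigger exactly one read.

```python
import threading, time

class Page:
    def __init__(self):
        self.lock = threading.Lock()
        self.cond = threading.Condition(self.lock)
        self.status = "INVALID"
        self.disk_reads = 0

    def fault(self):
        with self.lock:
            while True:
                if self.status == "VALID":
                    return
                if self.status == "IO_PENDING":
                    self.cond.wait()      # like pfdat.wait(); loop and retry
                    continue
                # status == INVALID: this thread performs the I/O
                self.status = "IO_PENDING"
                self.lock.release()       # drop the lock across the slow read
                try:
                    time.sleep(0.01)      # stand-in for fetch_from_disk
                finally:
                    self.lock.acquire()
                self.disk_reads += 1
                self.status = "VALID"
                self.cond.notify_all()    # like pfdat.notify_all()
                return

page = Page()
workers = [threading.Thread(target=page.fault) for _ in range(8)]
for t in workers: t.start()
for t in workers: t.join()
print(page.status, page.disk_reads)  # VALID 1
```

Dropping the lock across the read is the whole point of the IO_PENDING state: the page stays findable while the disk is busy, yet no second thread starts a duplicate read.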
Page fault stage 3

segment.insert_TLB(addr, pfdat.paddr());
pfdat.unlock();
segment.unlock();
restart application
Page victimization challenges
– Bidirectional mapping between segment pointers and pfdats
– Stop no processes during batch writes
– Deadlock caused by paging thread racing with faulting thread
Page victimization stage 1

next_victim:
  pfdat p = choose_victim();
  p.lock();
  if (! (p.status == VALID
         || p.status == DIRTY))
    p.unlock();
    goto next_victim;
Page victimization stage 2

foreach segment s in p.owner_list
  if (s.trylock() == ALREADY_LOCKED)
    p.unlock();
    /* do something! (p.r.d.) */
  remove p from s;
  /* also deletes any TLB mappings */
  delete s from p.owner_list;
  s.unlock();
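The trylock here exists because the paging thread acquires pfdat-then-segment while a faulting thread acquires segment-then-pfdat; blocking in stage 2 could therefore deadlock. A minimal sketch of the back-off technique (Python, invented names):

```python
import threading

segment_lock = threading.Lock()
pfdat_lock = threading.Lock()

def victimize_attempt():
    """Paging thread: holds pfdat_lock and must NOT block on segment_lock,
    because a faulting thread takes these two locks in the opposite order."""
    with pfdat_lock:
        if not segment_lock.acquire(blocking=False):  # trylock
            return "backed_off"   # release pfdat and retry later (the p.r.d. point)
        try:
            return "victimized"   # safe: both locks held, no cycle possible
        finally:
            segment_lock.release()

# Uncontended: the trylock succeeds.
print(victimize_attempt())        # victimized

# Simulate a faulting thread that already holds the segment lock:
segment_lock.acquire()
print(victimize_attempt())        # backed_off -- no deadlock
segment_lock.release()
```

Backing off and choosing another victim trades a little wasted work for freedom from lock-ordering cycles.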
Page victimization stage 3

if (p.status == DIRTY)
  p.status = PUSHOUT;
  schedule p for disk write;
  p.unlock();
  goto next_victim;
else
  unbind(p.file, p.pageNum, p);
  p.status = UNALLOCATED;
  add_to_free_list(p);
  p.unlock();
Discussion questions (1)
– Why have an IO_PENDING state; why not just keep the pfdat locked until its data are valid?
What happens when:
– Some thread discovers IO_PENDING and blocks. Before it restarts, that page is victimized.
– The page chosen as victim is being actively used by application code.
Discussion questions (2)
– What mechanisms ensure that a page is only read from disk once despite multiple processes faulting at the same time?
– Why is it safe to skip checking for PUSHOUT in fault stage 2?
Write out the invariants that support your reasoning.
Discussion questions (3)
– Louis Reasoner suggests releasing the segment lock at the end of fault stage 1 and reacquiring it for stage 3, to speed up parallel threads. What could go wrong?
– At the point marked p.r.d. (victimization stage 2), Louis suggests
    goto next_victim;
  What could go wrong?
Design lessons
Causes of complexity:
– data structure traversed in multiple directions
– high level of concurrency for performance
Symptoms of complexity:
– nontrivial mapping from locks to objects
– invariants relating thread, lock, and object states across multiple data structures
Loose vs tight concurrency
Loose:
– Separate subsystems connected by simple protocols
– Use often, for performance or simplicity
Tight:
– Shared data structures with complex invariants
– Only use where you have to
– Minimize code and states involved
Page frame sample invariants

All pfdat p:
  (p.status == UNALLOCATED)
      || lookup(p.file, p.pageNum) == p
  ; all processes will find the same pfdat

  p.status != INVALID (while p is unlocked)
  ; therefore only 1 process will read disk

  (p.status == UNALLOCATED
      || p.status == PUSHOUT)
      => p.owner_list empty
  ; therefore no TLB mappings to PUSHOUT,
  ; avoiding cache consistency problems
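These invariants translate almost directly into an executable check. In this sketch the field names follow the pseudocode, `pages` is an invented stand-in for the file-page index that `lookup` consults, and the second invariant is read as holding only while the pfdat is unlocked, which is what the fault pseudocode maintains.

```python
def check_invariants(pages):
    """pages: dict mapping (file, pageNum) -> pfdat; each pfdat has
    .status, .file, .pageNum, .owner_list, and .locked attributes."""
    for p in pages.values():
        # All processes doing lookup(file, pageNum) find the same pfdat.
        assert (p.status == "UNALLOCATED"
                or pages.get((p.file, p.pageNum)) is p)
        # No unlocked pfdat is INVALID, so only one process reads disk.
        if not p.locked:
            assert p.status != "INVALID"
        # UNALLOCATED/PUSHOUT pages have no owners, hence no TLB
        # mappings, avoiding cache-consistency problems.
        if p.status in ("UNALLOCATED", "PUSHOUT"):
            assert not p.owner_list

class Pfdat:
    def __init__(self, file, pageNum, status, owner_list=(), locked=False):
        self.file, self.pageNum = file, pageNum
        self.status = status
        self.owner_list = list(owner_list)
        self.locked = locked

good = Pfdat("f", 0, "VALID", owner_list=["seg1"])
check_invariants({("f", 0): good})        # passes silently

bad = Pfdat("f", 1, "PUSHOUT", owner_list=["seg2"])
try:
    check_invariants({("f", 1): bad})
except AssertionError:
    print("invariant violated")           # PUSHOUT page still owned
```

Running such a checker at lock-release points is a cheap way to catch violations of cross-structure invariants like these during testing.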