COSC 3407: Operating Systems Lecture 7: Implementing Mutual Exclusion


This lecture:
– Hardware support for synchronization
– Building higher-level synchronization programming abstractions (e.g., semaphores) on top of that hardware support

The Big Picture
The abstraction of threads is good, but concurrent threads sharing state is still too complicated. Implementing a concurrent program directly with loads and stores would be tricky and error-prone. So we would like to provide a synchronization abstraction that hides/manages most of the complexity and puts the burden of coordinating multiple activities on the OS instead of the programmer.
– Give the programmer higher-level operations, such as locks.

Ways of implementing locks
All require some level of hardware support. One option is to directly implement locks and context switches in hardware.
– But that makes the hardware slow! One has to be careful not to slow down the common case in order to speed up a special case.

Concurrent Programs
– High-level atomic operations (API): locks, semaphores, monitors, send & receive
– Low-level atomic operations (hardware): load/store, interrupt disable, test&set, compare&swap

Disable interrupts (uniprocessor only)
Two ways for the dispatcher to get control:
– internal events – the thread does something to relinquish the CPU
– external events – interrupts cause the dispatcher to take the CPU away
On a uniprocessor, an operation will be atomic as long as a context switch does not occur in the middle of the operation. So we need to prevent both internal and external events. Preventing internal events is easy (although virtual memory makes it a bit tricky). We prevent external events by disabling interrupts, in effect telling the hardware to delay handling of external events until after we are done with the atomic operation.

A flawed, but very simple solution
Why not do the following:

    Lock::Acquire() { disable interrupts; }
    Lock::Release() { enable interrupts; }

Problems:
1. We need to support synchronization operations in user-level code, and the kernel can't allow user code to get control with interrupts disabled (it might never give the CPU back!).
2. Real-time systems need to guarantee how long it takes to respond to interrupts, but critical sections can be arbitrarily long. Thus, one should leave interrupts off for the shortest time possible.
3. This simple solution might work for locks, but it wouldn't work for more complex primitives, such as semaphores or condition variables.

Implementing locks by disabling interrupts
Key idea: maintain a lock variable and impose mutual exclusion only on the operations of testing and setting that variable.

    class Lock {
        int value = FREE;
    };

    Lock::Acquire() {
        Disable interrupts;
        if (value == BUSY) {
            // enable position 1: here, before enqueueing?
            Put on queue of threads waiting for lock;
            // enable position 2: here, after enqueueing but before sleeping?
            Go to sleep;
            // enable position 3: here, after waking up? See the next slides.
        } else {
            value = BUSY;
        }
        Enable interrupts;
    }

Implementing locks by disabling interrupts
Why do we need to disable interrupts at all? Otherwise, one thread could be trying to acquire the lock and could get interrupted between checking and setting the lock value, so two threads could each think they hold the lock.

    Lock::Release() {
        Disable interrupts;
        if anyone on wait queue {
            Take a waiting thread off the wait queue;
            Put it at the front of the ready queue;
        } else {
            value = FREE;
        }
        Enable interrupts;
    }

Implementing locks by disabling interrupts
By disabling interrupts, the check and set operations occur without any other thread having the chance to execute in the middle.
When should Acquire re-enable interrupts on its way to sleep?
Before putting the thread on the wait queue?
– Then Release can check the queue and not wake the thread up.
After putting the thread on the wait queue, but before going to sleep?
– Then Release puts the thread on the ready queue, but the thread still thinks it needs to go to sleep!
– It goes to sleep, missing the wakeup from Release, and still holds the lock (deadlock!).
We want to re-enable interrupts after sleep(). But – how?

Implementing locks by disabling interrupts
To fix this, in Nachos, interrupts are disabled when you call Thread::Sleep; it is the responsibility of the next thread to run to re-enable interrupts.
When the sleeping thread wakes up, it returns from Thread::Sleep back to Acquire. Interrupts are still disabled at that point, so Acquire turns them back on.

Interrupt disable/enable pattern across context switches (time flows downward):

    Thread A                          Thread B
    disable interrupts
    sleep
                    (context switch)
                                      return from sleep
                                      enable interrupts
                                      ...
                                      disable interrupts
                                      sleep
                    (context switch)
    return from sleep
    enable interrupts

Interrupt disable and enable pattern across context switches
An important point about structuring code:
– If you look at the Nachos code you will see lots of comments about the assumptions made concerning when interrupts are disabled.
This is an example of a situation where modifications to, and assumptions about, program state cannot be localized within a small body of code. When that is the case, there is a very good chance that your program will eventually "acquire" bugs: as people modify the code, they may forget or ignore the assumptions being made and end up invalidating them.
Can you think of other examples where this is a concern?
– What about acquiring and releasing locks when a C++ exception exits a procedure?

For example, if an exception exits the procedure between the acquire and the release, the lock is never released:

    mylock.acquire();
    a = b / 0;          // may raise an exception, so the release below never runs
    mylock.release();
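One common C++ answer is RAII: tie the release to a destructor so it runs even when an exception unwinds the stack. The sketch below is illustrative, not part of the lecture; MyLock, Guard, and divide are made-up names, and in practice std::mutex with std::lock_guard/std::scoped_lock plays exactly this role.

    #include <mutex>
    #include <stdexcept>

    // Minimal RAII sketch: acquire in the constructor, release in the destructor,
    // so the lock is released even if an exception exits the procedure.
    struct MyLock {
        std::mutex m;
        void acquire() { m.lock(); }
        void release() { m.unlock(); }
    };

    class Guard {
        MyLock& lock;
    public:
        explicit Guard(MyLock& l) : lock(l) { lock.acquire(); }
        ~Guard() { lock.release(); }   // also runs during stack unwinding
        Guard(const Guard&) = delete;
        Guard& operator=(const Guard&) = delete;
    };

    int divide(MyLock& mylock, int a, int b) {
        Guard g(mylock);               // acquired here
        if (b == 0) throw std::runtime_error("divide by zero");
        return a / b;                  // released when g goes out of scope,
    }                                  // whether we return or throw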

Atomic read-modify-write instructions
Problems with the interrupt-disable solution:
– We can't give the lock implementation to users.
– On a multiprocessor, disabling interrupts does not provide atomicity. It stops context switches from occurring on that CPU, but it does not stop the other CPUs from entering the critical section. One could provide support to disable interrupts on all CPUs, but that would be expensive: it stops everyone else, regardless of what each CPU is doing.

Atomic read-modify-write instructions
Instead, every modern processor architecture provides some kind of atomic read-modify-write instruction. These instructions atomically read a value from memory into a register and write a new value back. The hardware is responsible for implementing this correctly on both uniprocessors (not too hard) and multiprocessors (requires special hooks in the multiprocessor cache-coherence strategy). Unlike disabling interrupts, this approach can be used on both uniprocessors and multiprocessors.

Examples of read-modify-write instructions
– test&set (most architectures): read the old value, write 1 back to memory.
– exchange (x86): swap a value between a register and memory.
– compare&swap (68000): read the value; if it matches a register, do the exchange.
– load-linked / store-conditional (R4000, Alpha): designed to fit better with a load/store architecture (speculative computation). Read the value with one instruction, do some operations, and when the store occurs, check whether the value has been modified in the meantime. If not, OK; if it has changed, abort and jump back to the start.
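As an aside not in the original slides, the same retry pattern is what C++'s std::atomic exposes portably: read the old value, compute a new one, and store it only if the location has not changed in the meantime, otherwise loop. A minimal sketch:

    #include <atomic>

    // Compare-and-swap retry loop: like load-linked/store-conditional, the store
    // succeeds only if the value is still the one we read; otherwise we retry.
    int atomic_add(std::atomic<int>& counter, int delta) {
        int old_val = counter.load();
        while (!counter.compare_exchange_weak(old_val, old_val + delta)) {
            // on failure, compare_exchange_weak reloads old_val; just retry
        }
        return old_val + delta;
    }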

Test-and-Set Instruction
The Test-and-Set instruction is executed atomically by the hardware; its semantics are:

    boolean TestAndSet(boolean &target) {
        boolean rv = target;    // read the old value
        target = true;          // set the location
        return rv;              // return the old value
    }

A spinlock built on it (initially: boolean lock = false;):

    void acquire(lock) {
        while (TestAndSet(lock))
            ;                   // spin while BUSY
    }

    void release(lock) {
        lock = false;
    }

    Thread Ti:
    while (true) {
        acquire(lock);
        // critical section
        release(lock);
        // remainder section
    }

Busy-waiting: the thread consumes CPU cycles while it is waiting.
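For reference (not part of the lecture), the same spinlock can be written in portable C++ with std::atomic_flag, whose test_and_set() is exactly a test-and-set:

    #include <atomic>

    // Busy-waiting spinlock sketch built on atomic test-and-set.
    class SpinLock {
        std::atomic_flag locked = ATOMIC_FLAG_INIT;
    public:
        void acquire() {
            while (locked.test_and_set(std::memory_order_acquire))
                ;   // spin: keep setting until the previous value was clear
        }
        void release() {
            locked.clear(std::memory_order_release);
        }
    };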

Problem: Busy-Waiting for the Lock
Positives of this solution:
– The machine can still receive interrupts.
– User code can use this lock.
– It works on a multiprocessor.
Negatives:
– It is very inefficient, because the busy-waiting thread consumes cycles while waiting.
– The waiting thread may take cycles away from the thread holding the lock (no one wins!).
– Priority inversion: if the busy-waiting thread has higher priority than the thread holding the lock, there is no progress! (This is the priority-inversion problem made famous by the Mars Pathfinder mission.)
For semaphores and monitors, a waiting thread may wait for an arbitrary length of time!
– Thus even if busy-waiting were OK for locks, it is definitely not OK for these other primitives.

Test-and-Set (minimal busy-waiting)
Idea: only busy-wait to atomically check the lock value; if the lock is busy, give up the CPU. Use a guard on the lock itself (multiple layers of critical sections!). The waiter gives up the processor so that Release can go forward more quickly:

    Acquire(lock) {
        while (test&set(guard))
            ;                           // short busy-wait time
        if (value == BUSY) {
            Put on queue of threads waiting for lock;
            Go to sleep & set guard to false;
        } else {
            value = BUSY;
            guard = false;
        }
    }

Test-and-Set (minimal busy-waiting)
Notice that the sleep operation itself has to reset the guard variable. Why can't we do it just before or just after the sleep? Clearing the guard before sleeping lets Release run in between and miss the not-yet-sleeping thread; clearing it after sleeping means the thread goes to sleep still holding the guard, so Release can never get past its busy-wait.

    Release(lock) {
        while (test&set(guard))
            ;                           // short busy-wait time
        if anyone on wait queue {
            Take a waiting thread off the wait queue;
            Put it at the front of the ready queue;
        } else {
            value = FREE;
        }
        guard = false;
    }
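A user-level sketch of the same idea, assuming C++20 (not part of the lecture): check the lock with one atomic instruction, and if it is busy, block instead of spinning; the kernel machinery behind std::atomic::wait plays roughly the role of the guard and wait queue here.

    #include <atomic>

    // Sketch: short atomic check, then sleep until the holder signals release.
    class BlockingLock {
        std::atomic<int> value{0};              // 0 = FREE, 1 = BUSY
    public:
        void acquire() {
            int expected = 0;
            while (!value.compare_exchange_weak(expected, 1,
                                                std::memory_order_acquire)) {
                value.wait(1);                  // block while the lock stays BUSY
                expected = 0;                   // then retry the test-and-set
            }
        }
        void release() {
            value.store(0, std::memory_order_release);
            value.notify_one();                 // wake one waiting thread, if any
        }
    };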

Mutual Exclusion with Swap
The Swap instruction operates on the contents of two words and is executed atomically (it atomically swaps two variables):

    void Swap(boolean &a, boolean &b) {
        boolean temp = a;
        a = b;
        b = temp;
    }

Shared data:

    boolean lock = false;
    boolean waiting[n];     // declared on the slide, but not used in this simple version

Each thread Ti also has a local boolean key:

    do {
        key = true;
        while (key == true)
            Swap(lock, key);    // spin until we swap in a false
        // critical section
        lock = false;
        // remainder section
    } while (1);
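The same algorithm maps directly onto std::atomic<bool>::exchange in C++ (a sketch, not from the slides): atomically swap true into the lock and look at what was there before.

    #include <atomic>
    #include <thread>

    // Exchange-based spinlock: we hold the lock once the swapped-out value is false.
    class ExchangeLock {
        std::atomic<bool> locked{false};
    public:
        void acquire() {
            while (locked.exchange(true, std::memory_order_acquire))
                std::this_thread::yield();      // slightly friendlier than pure spinning
        }
        void release() {
            locked.store(false, std::memory_order_release);
        }
    };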

Higher-level Primitives than Locks
Goal of the last couple of lectures:
– What is the right abstraction for synchronizing threads that share memory?
– We want as high-level a primitive as possible.
Good primitives and practices are important!
– Since execution is not entirely sequential, it is really hard to find bugs, since they happen rarely.
– UNIX is pretty stable now, but up until about the mid-80s (10 years after it started), systems running UNIX would crash every week or so because of concurrency bugs.
Synchronization is a way of coordinating multiple concurrent activities that are using shared state.
– The next lecture presents a couple of ways of structuring the sharing.

Semaphores
A synchronization primitive:
– higher level than locks
– invented by Dijkstra in 1968, as part of the THE operating system
– used in the original UNIX.
A semaphore is:
– a non-negative integer value S, and
– two atomic operations, wait()/P() and signal()/V():

    wait(S) {       // also called P()
        while (S <= 0)
            ;       // no-op: spin
        S--;
    }

    signal(S) {     // also called V()
        S++;
    }

Busy-waiting problem
Busy waiting wastes CPU cycles. Spinlocks are nevertheless useful in multiprocessor systems:
– no context switch is required when a process must wait on a lock;
– spinlocks are useful when locks are held only for short times.
To avoid busy waiting, each semaphore has an associated queue of processes/threads:
– wait(S): block until S is greater than zero, then decrement S.
– signal(S): increment S by one and wake one waiting thread (if any).
– Classic semaphores have no other operations.

Hypothetical Implementation

    type semaphore = record
        value: integer;
        L: list of processes;
    end

    wait(S) {
        S.value--;
        if (S.value < 0) {
            add this process to S.L;
            block();            // a system call
        }
    }

    signal(S) {
        S.value++;
        if (S.value <= 0) {
            remove a process P from S.L;
            wakeup(P);          // a system call
        }
    }

wait() and signal() are themselves critical sections! Hence, they must be executed atomically with respect to each other. Busy waiting is then limited to the critical sections of the wait and signal operations themselves, and these are short.
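For comparison (not part of the lecture), here is a minimal user-level sketch of a semaphore in C++, where a mutex provides the required atomicity of wait()/signal() and a condition variable provides the queue of blocked threads:

    #include <condition_variable>
    #include <mutex>

    // Counting semaphore sketch built from a mutex and a condition variable.
    class Semaphore {
        std::mutex m;
        std::condition_variable cv;
        int value;
    public:
        explicit Semaphore(int initial) : value(initial) {}

        void wait() {                                       // P()
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [this] { return value > 0; });      // block while value == 0
            --value;
        }

        void signal() {                                     // V()
            {
                std::lock_guard<std::mutex> lk(m);
                ++value;
            }
            cv.notify_one();                                // wake one waiter, if any
        }
    };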

Two Types of Semaphores
Binary semaphore: like a lock (has a Boolean value).
– Initialized to 1.
– A thread performing wait() blocks until the value is 1 and then sets it to 0.
– signal() sets the value to 1, waking up a waiting thread, if any.
Counting semaphore:
– represents a resource with many identical units available;
– allows threads/processes to proceed as long as more units are available;
– the counter is initialized to N
  » N = number of units available
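As a present-day aside (not in the original slides), C++20 ships both flavors directly; the pool size of 4 below is just an assumed example:

    #include <semaphore>

    std::binary_semaphore      lock_sem{1};     // binary: initialized to 1, lock-like
    std::counting_semaphore<4> pool_sem{4};     // counting: N = 4 identical units

    void use_one_unit() {
        pool_sem.acquire();     // wait(): blocks if all 4 units are in use
        // ... use one unit of the resource ...
        pool_sem.release();     // signal(): return the unit
    }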

Semaphore as a General Synchronization Tool
Execute B in Pj only after A has executed in Pi. Use a semaphore flag initialized to 0:

    Pi:                 Pj:
      A                   wait(flag);
      signal(flag);       B
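A sketch of this ordering pattern with C++20 threads and a semaphore initialized to 0 (the names Pi and Pj follow the slide; this is illustrative, not lecture code):

    #include <semaphore>
    #include <thread>

    std::binary_semaphore flag{0};      // initialized to 0

    void Pi() {
        // A
        flag.release();                 // signal(flag)
    }

    void Pj() {
        flag.acquire();                 // wait(flag): blocks until Pi has done A
        // B
    }

    int main() {
        std::thread tj(Pj), ti(Pi);
        ti.join();
        tj.join();
    }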

Deadlock and Starvation
Deadlock – two or more processes are waiting indefinitely for an event (here, the execution of a signal operation) that can be caused by only one of the waiting processes.
Let S and Q be two semaphores initialized to 1:

    P0:                 P1:
      wait(S);            wait(Q);
      wait(Q);            wait(S);
      ...                 ...
      signal(S);          signal(Q);
      signal(Q);          signal(S);

Starvation – indefinite blocking: a process may never be removed from the semaphore queue in which it is suspended. Indefinite blocking may occur, for example, if we add and remove processes from the list associated with a semaphore in LIFO order.
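The same interleaving can be reproduced with two C++20 binary semaphores (a sketch, not lecture code). If P0 acquires S and P1 acquires Q before either gets its second semaphore, each blocks forever waiting for the other; acquiring S and Q in the same order in both threads avoids the deadlock.

    #include <semaphore>

    std::binary_semaphore S{1}, Q{1};   // both initialized to 1

    void P0() {
        S.acquire();                    // wait(S)
        Q.acquire();                    // wait(Q)  -- may block forever if P1 holds Q
        // ... use both resources ...
        S.release();                    // signal(S)
        Q.release();                    // signal(Q)
    }

    void P1() {
        Q.acquire();                    // wait(Q)
        S.acquire();                    // wait(S)  -- may block forever if P0 holds S
        // ...
        Q.release();                    // signal(Q)
        S.release();                    // signal(S)
    }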

Two uses of semaphores
1. Mutual exclusion (initially S = 1)
– Binary semaphores can be used for mutual exclusion.

Process Pi:

    do {
        wait(S);
        // critical section
        signal(S);
        // remainder section
    } while (1);

Two uses of semaphores
2. Scheduling constraints
– Locks are fine for mutual exclusion, but what if you want a thread to wait for something?
– For example, suppose you had to implement Thread::Join, which must wait for a thread to terminate.
– By setting the initial value to 0 instead of 1, we can implement this kind of waiting on a semaphore:

    Initially: S = 0

    After forking the child, Thread::Join calls wait(S)   // waits until something makes
                                                          // the semaphore positive
    The finishing thread calls signal(S)                  // makes the semaphore positive
                                                          // and wakes up the thread
                                                          // waiting in Join
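A sketch of this "initial value 0" trick in C++20 (done, worker, and join_demo are made-up names; std::thread::join is used only to clean up the thread object afterwards):

    #include <semaphore>
    #include <thread>

    std::binary_semaphore done{0};      // initially 0: nothing to wait for yet

    void worker() {
        // ... the thread's work ...
        done.release();                 // the finishing thread calls signal()
    }

    void join_demo() {
        std::thread t(worker);
        done.acquire();                 // Join calls wait(): blocks until worker signals
        t.join();                       // clean up the underlying thread object
    }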

Summary
Important concept: atomic operations.
– An operation that runs to completion or not at all.
– These are the primitives on which we construct various synchronization primitives.
Talked about hardware atomicity primitives:
– disabling interrupts, test&set, swap, compare&swap, load-linked/store-conditional.
Showed several constructions of locks:
– Must be very careful not to waste or tie up machine resources:
  » shouldn't disable interrupts for long
  » shouldn't spin-wait for long
– Key idea: keep a separate lock variable and use hardware mechanisms to protect modifications of that variable.
Talked about semaphores.
