
Process Synchronization

Background
The Critical-Section Problem
Synchronization Hardware
Semaphores
Classical Problems of Synchronization
Critical Regions
Monitors

Concurrent access to shared data may result in data inconsistency.

Maintaining data consistency requires mechanisms to ensure the orderly execution of cooperating processes.

Background

Shared-memory solution to bounded-buffer problem allows at most n – 1 items in buffer at the same time.

A solution where all N buffers are used is not simple.

◦ Suppose that we modify the producer-consumer code by adding a variable counter, initialized to 0 and incremented each time a new item is added to the buffer

Background

Shared data:

#define BUFFER_SIZE 10
typedef struct {
    . . .
} item;

item buffer[BUFFER_SIZE];
int in = 0;
int out = 0;

Solution is correct, but can only use BUFFER_SIZE-1 elements

Bounded-Buffer (Old)

item nextProduced;

while (1) {

while (((in + 1) % BUFFER_SIZE) == out); /* do nothing */

buffer[in] = nextProduced;

in = (in + 1) % BUFFER_SIZE;

}

Bounded-Buffer (Old) – Producer Process

item nextConsumed;

while (1) {

while (in == out); /* do nothing */

nextConsumed = buffer[out];

out = (out + 1) % BUFFER_SIZE;

}

Bounded-Buffer (Old) – Consumer Process

Shared data

#define BUFFER_SIZE 10
typedef struct {
    . . .
} item;

item buffer[BUFFER_SIZE];
int in = 0;
int out = 0;
int counter = 0;

Bounded-Buffer (Proper)

Producer process:

item nextProduced;
while (1) {
    while (counter == BUFFER_SIZE); // spin
    buffer[in] = nextProduced;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
}

Bounded-Buffer

Consumer process

item nextConsumed;
while (1) {
    while (counter == 0); // do nothing
    nextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
}

Bounded-Buffer

The statement “counter++” may be implemented in machine language as:

register1 = counter
register1 = register1 + 1
counter = register1

The statement “counter--” may be implemented as:

register2 = counter
register2 = register2 – 1
counter = register2

Bounded Buffer

If both the producer and consumer attempt to update the buffer concurrently, the assembly language statements may get interleaved.

Interleaving depends upon how the producer and consumer processes are scheduled.

Bounded Buffer

Assume counter is initially 5. One interleaving of statements is:

producer: register1 = counter           {register1 = 5}
producer: register1 = register1 + 1     {register1 = 6}
consumer: register2 = counter           {register2 = 5}
consumer: register2 = register2 – 1     {register2 = 4}
consumer: counter = register2           {counter = 4}
producer: counter = register1           {counter = 6}

The correct result should be 5, but this interleaving leaves counter = 6 (and a different interleaving leaves 4).

Race Condition

Race condition: a situation where several processes access and manipulate shared data concurrently, and the final value of the shared data depends on which process finishes last.

To prevent race conditions, concurrent processes must be synchronized.

Race Condition

Uniprocessor:
◦ Disable interrupts to solve the lock race condition.

Multiprocessor:
◦ Disabling interrupts requires sending an interrupt message to all processors.
◦ Delay in message passing.
◦ System degradation.

Solution?
◦ Handle the lock variable with an atomic assignment.
◦ Swap the variable in hardware.

Disable Interrupts ?

The statements:
counter++;
counter--;
must be performed atomically.

Atomic operation means an operation that completes in its entirety without interruption.

Bounded Buffer

n processes, all competing for some shared data

Each process has a code segment, called critical section, in which the shared data is accessed.

Problem:
◦ ensure that when one process is executing in its critical section, no other process is allowed to execute in its critical section.

The Critical-Section Problem

General structure of process Pi (other process Pj)

do {
    entry section
    critical section
    exit section
    remainder section
} while (1);

Processes may share some common variables to synchronize their actions.

General Structure

There are 3 requirements that must be met for a solution to the critical-section problem:

Mutual Exclusion Progress Bounded Waiting

Critical-Section Solution

Mutual Exclusion

If process Pi is executing in its critical section, then no other process can be executing in its critical section.

Critical-Section Solution

Progress

If
a) no process is executing in its critical section, and
b) there exist some processes that wish to enter their critical sections,

then only those processes that are not executing in their remainder sections can participate in the decision on which will enter its critical section next.

The selection of the processes cannot be postponed indefinitely.

Critical-Section Solution

Bounded Waiting

A bound must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.
◦ Assume that each process executes at a non-zero speed (and completes in finite time).
◦ No assumption is made concerning the relative speeds of the n processes.

Critical-Section Solution

Shared variables:
◦ int turn; // initially turn = 0
◦ turn == i ⇒ Pi can enter its critical section

Process Pi:

do {
    while (turn != i); /* NOP */
    critical section
    turn = j;
    remainder section
} while (1);

Algorithm 1 (2 processes)

Satisfies mutual exclusion (strict alternation), but not progress.

E.g., when P0 exits its CS, it sets turn to 1 and then executes its remainder section for quite some time. P1 executes its CS, sets turn to 0, and immediately wants to execute its CS again, but has to wait for its turn.
◦ Remember that P0 is still in its remainder section, and is preventing P1's entry into the CS because it is P0's turn.

i.e., if turn == 0 and P1 is ready to enter its CS, it can't. Recall the progress requirement:
◦ only processes not in their remainder sections can decide on who goes next.

P1 can't enter even though P0 may be in its remainder section.

Algorithm 1

Problem with Algorithm 1: does not have info on state of processes, only remembers who is next.

Shared variables:

boolean flag[2]; // remembers the state of each process
// initially flag[0] = flag[1] = false
// flag[i] = true ⇒ Pi ready to enter its critical section

do {
    flag[i] = true; // I am ready to enter my CS
    while (flag[j]); /* NOP */
    critical section
    flag[i] = false;
    remainder section
} while (1);

Algorithm 2

Satisfies mutual exclusion, but (still) not the progress requirement.

Counterexample:

If flag[0] = true at time t0, then P0 gets interrupted and P1 gets the CPU and sets flag[1] = true at time t1 …

… they both spin-lock indefinitely.

Algorithm 2

Combines the shared variables of algorithms 1 and 2.

Process Pi:

do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j);
    critical section
    flag[i] = false;
    remainder section
} while (1);

Meets all three requirements; solves the critical-section problem for two processes.

Algorithm 3 – Peterson’s Solution

Before entering its critical section, process receives a number. Holder of the smallest number enters the critical section.

If processes Pi and Pj receive the same number, check process/thread numbers – if i < j, then Pi is served first; else Pj is served first.

The numbering scheme always generates numbers in nondecreasing order; e.g., 1,2,3,3,3,3,4,5...

Bakery Algorithm(Critical section for n processes)

Notation: < is lexicographical order on (ticket #, process id #)
◦ (a,b) < (c,d) if a < c, or a = c and b < d (ties broken by smaller pid)
◦ max(a0,…, an-1) is a number k such that k ≥ ai for i = 0, …, n – 1

Bakery Algorithm

Shared data:◦ boolean choosing[n]; ◦ int number[n];

Data structures are initialized to false and 0 respectively

Bakery Algorithm

do {
    choosing[i] = true; /* Pi is choosing (being assigned) a number */
    number[i] = max(number[0], number[1], …, number[n – 1]) + 1;
    choosing[i] = false;
    for (j = 0; j < n; j++) {
        while (choosing[j]);
        while ((number[j] != 0) &&
               ((number[j], j) < (number[i], i)));
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);

Bakery Algorithm

Synchronization Hardware

Simple in a uniprocessor environment, i.e., forbid interrupts while a shared variable is modified.
◦ Generally too inefficient on multiprocessor systems.
◦ Operating systems using this are not broadly scalable.

Multiprocessor environment:
◦ Interrupt disabling is infeasible, i.e., time-consuming, since a message must be passed to the other processors.

Synchronization Hardware

Modern machines provide special atomic hardware instructions.
◦ Atomic = non-interruptible

These can either:
◦ test a memory word and set its value, or
◦ swap the contents of two memory words

They must accomplish this as one uninterruptable operation

Synchronization Hardware

Test and modify the content of a word atomically

boolean TestAndSet(boolean *target) {
    boolean rv = *target;
    *target = true; /* blocks others until lock is set to false upon CS exit */
    return rv;
}

Note: If two TestAndSet() instructions are executed simultaneously (on separate processors), they will be executed sequentially in some arbitrary order.

Test and Set

Shared data: boolean lock = false;

Process Pi:

do {
    while (TestAndSet(&lock)); /* NOP */
    critical section
    lock = false;
    remainder section
} while (1);

Mutual Exclusion with Test-and-Set

Atomically swap two variables.

void swap(boolean *a, boolean *b) {
    boolean temp = *a;
    *a = *b;
    *b = temp;
}

Hardware Swap

Process Pi

do {
    key = true;
    while (key == true)
        swap(&lock, &key);
    critical section
    lock = false;
    remainder section
} while (true);

Mutual Exclusion with Swap

Synchronization tool that does not require busy waiting.

Semaphore S – integer variable
◦ can only be accessed via two indivisible (atomic) operations:

wait(S):
    while (S <= 0); // NOP
    S--;

signal(S):
    S++;

Semaphores

Counting semaphore
◦ integer value can range over an unrestricted domain.

Binary semaphore
◦ integer value can range only between 0 and 1; can be simpler to implement. Also known as a mutex lock.

Can implement a counting semaphore S using binary semaphores.

Provides mutual exclusion:
◦ semaphore S; // initialized to 1
◦ wait(S);
◦ critical section
◦ signal(S);

Semaphores

Must guarantee that no two processes can execute wait() and signal() on the same semaphore at the same time

Thus, the implementation itself becomes a critical-section problem, where the wait and signal code are placed in the critical section.
◦ Could now have busy waiting in the critical-section implementation.
◦ But the implementation code is short, and there is little busy waiting if the critical section is rarely occupied.

Note that applications may spend lots of time in critical sections, so this is not a good solution for them.

Semaphores

Semaphores – w/o Busy Waiting

• Associate a waiting queue with each semaphore.
• Each semaphore has two data items:
    value (of type integer)
    pointer to the list of waiting processes

Two operations:
    block – place the process invoking the operation on the appropriate waiting queue.
    wakeup – remove one of the processes from the waiting queue and place it in the ready queue.

Define a semaphore as a record:

typedef struct {
    int value;
    struct process *L;
} semaphore;

Assume two simple operations:
◦ block suspends the process that invokes it.
◦ wakeup(P) resumes the execution of a blocked process P.

Semaphore Implementation

Semaphore operations now defined as

wait(S):
    S.value--;
    if (S.value < 0) {
        add this process to S.L;
        block();
    }

Implementation

Semaphore operations now defined as

signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a process P from S.L;
        wakeup(P);
    }

Implementation

Shared data:
semaphore mutex; // initially mutex = 1

Process Pi:

do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);

Critical Section of n Processes

Execute B in Pj only after A is executed in Pi.
Use a semaphore flag initialized to 0.

Code:
    Pi              Pj
    A               wait(flag)
    signal(flag)    B

Semaphore as a General Synchronization Tool

Starvation – indefinite blocking. A process may never be removed from the semaphore queue in which it is suspended.

Deadlock and Starvation

Deadlock – two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes.

Let S and Q be two semaphores initialized to 1.

    P0              P1
    wait(S);        wait(Q);
    wait(Q);        wait(S);
    …               …
    signal(S);      signal(Q);
    signal(Q);      signal(S);

Deadlock and Starvation

Classical Problems of Synchronization

Bounded-Buffer Problem

Readers and Writers Problem

Dining-Philosophers Problem

Shared data

semaphore full, empty, mutex;

Initially:

full = 0, empty = n, mutex = 1

Bounded-Buffer Problem

do {
    …
    produce an item in nextp
    …
    wait(empty);
    wait(mutex);
    …
    add nextp to buffer
    …
    signal(mutex);
    signal(full);
} while (1);

Bounded-Buffer Producer Process

do {
    wait(full);
    wait(mutex);
    …
    remove an item from buffer to nextc
    …
    signal(mutex);
    signal(empty);
    …
    consume the item in nextc
    …
} while (1);

Bounded-Buffer Consumer Process

Often, processes will want to access some shared data set (perhaps in a database)

The processes need to control how concurrent readers and writers interact:
◦ Multiple readers are OK.
◦ Writers must have exclusive access.
◦ Priority between the two must be decided: readers-first? writers-first?

Readers-Writers Problem

Shared data:

semaphore mutex, wrt;
int readcount;

Initially:

mutex = 1, wrt = 1, readcount = 0;

Readers-Writers Solution

wait(wrt);
    …
    writing is performed
    …
signal(wrt);

Readers-Writers: Writer Process

wait(mutex);
readcount++;
if (readcount == 1)
    wait(wrt);
signal(mutex);
    …
    reading is performed
    …
wait(mutex);
readcount--;
if (readcount == 0)
    signal(wrt);
signal(mutex);

Readers-Writers: Reader Process

Consider:
◦ 5 philosophers sitting around a circular table.
◦ They spend their lives doing two things: thinking and eating.
◦ In the center of the table is a bowl of rice.
◦ Between each pair of philosophers is a single chopstick.
◦ When going from thinking to eating, a philosopher attempts to pick up the two adjacent chopsticks (one at a time) and eat.

Dining-Philosophers Problem

Shared data:

semaphore chopstick[5]; // initially all values are 1

Dining-Philosophers Problem

Philosopher i:

do {
    wait(chopstick[i]);
    wait(chopstick[(i+1) % 5]);
    …
    eat
    …
    signal(chopstick[i]);
    signal(chopstick[(i+1) % 5]);
    …
    think
    …
} while (1);

Dining-Philosophers Problem

Is this a good solution?

People make mistakes.

signal(mutex);
    …
    critical section
    …
wait(mutex);

When would you discover this?

The Big Problem with Semaphores

High-level synchronization construct that allows the safe sharing of an abstract data type among concurrent processes.

monitor monitor-name
{
    shared variable declarations

    procedure body P1 (…) {
        . . .
    }
    // more procedure definitions here
    …
    {
        initialization code
    }
}

Monitors

To allow a process to wait within the monitor, a condition variable must be declared, as

condition x, y;

Condition variable can only be used with the operations wait and signal.

Monitors

x.wait() means that the process invoking this operation is suspended until another process invokes x.signal().

x.signal( ) resumes exactly one suspended process. If no process is suspended, then the signal operation has no effect.

Monitors

Schematic View of a Monitor

Monitor With Condition Variables

monitor dp {
    enum {THINK, HUNGRY, EAT} state[5];
    condition self[5];

    void pickup(int i)   // following slides
    void putdown(int i)  // following slides
    void test(int i)     // following slides

    void init() {
        for (int i = 0; i < 5; i++)
            state[i] = THINK;
    }
}

Dining Philosophers Example

void pickup(int i) {
    state[i] = HUNGRY;
    test(i);
    if (state[i] != EAT)
        self[i].wait();
}

Dining Philosophers

void putdown(int i) {
    state[i] = THINK;
    // test left and right neighbors
    test((i+4) % 5);
    test((i+1) % 5);
}

Dining Philosophers

void test(int i) {
    if ((state[(i + 4) % 5] != EAT) &&
        (state[i] == HUNGRY) &&
        (state[(i + 1) % 5] != EAT)) {
        state[i] = EAT;
        self[i].signal();
    }
}

Dining Philosophers

Philosopher i's code:

    …
    dp.pickup(i);
    …
    eat
    …
    dp.putdown(i);
    …

Philosophers Execution

Variables:

semaphore mutex;  // (initially = 1)
semaphore next;   // (initially = 0)
int next_count = 0;

Monitor Implementation Using Semaphores

Each external procedure F will be replaced by:

wait(mutex);
…
body of F;
…
if (next_count > 0)
    signal(next);
else
    signal(mutex);

Mutual exclusion within a monitor is ensured.

Monitor Implementation Using Semaphores

For each condition variable x:

semaphore x_sem; // (initially = 0)
int x_count = 0;

x.wait can be implemented as:

x_count++;
if (next_count > 0)
    signal(next);
else
    signal(mutex);
wait(x_sem);
x_count--;

Monitor Implementation

x.signal can be implemented as:

if (x_count > 0) {
    next_count++;
    signal(x_sem);
    wait(next);
    next_count--;
}

Monitor Implementation

Conditional-wait construct: x.wait(c);
◦ c: an integer expression evaluated when the wait operation is executed.
◦ The value of c (a priority number) is stored with the name of the process that is suspended.
◦ When x.signal is executed, the process with the smallest associated priority number is resumed next.

Monitor Implementation

Check two conditions to establish correctness of system:

◦ User processes must always make their calls on the monitor in a correct sequence.

◦ Must ensure that an uncooperative process does not ignore the mutual-exclusion gateway provided by the monitor, and try to access the shared resource directly, without using the access protocols.

Monitor Implementation

Synchronization Implementations

SolarisWinXPLinuxpThreads

Implements a variety of locks to support multitasking, multithreading (including real-time threads), and multiprocessing.

Uses adaptive mutexes for efficiency when protecting data from short code segments.

Uses condition variables and readers-writers locks when longer sections of code need access to data.

Uses turnstiles to order the list of threads waiting to acquire either an adaptive mutex or reader-writer lock.

Solaris 2 Synchronization

Uses interrupt masks to protect access to global resources on uniprocessor systems.

Uses spinlocks on multiprocessor systems.

Also provides dispatcher objects which may act as either mutexes or semaphores.

Dispatcher objects may also provide events. An event acts much like a condition variable.

Windows XP Synchronization

Linux Synchronization

Linux:
◦ Prior to kernel version 2.6, disables interrupts to implement short critical sections.
◦ Version 2.6 and later: fully preemptive.

Linux provides:
◦ semaphores
◦ spin locks

Pthreads Synchronization

The Pthreads API is OS-independent. It provides:
◦ mutex locks
◦ condition variables

Non-portable extensions include:
◦ read-write locks
◦ spin locks

Atomic Transactions
System Model
Log-based Recovery
Checkpoints
Concurrent Atomic Transactions
Transactional Memory

Assures that a set of operations happens as a single logical unit of work: in its entirety, or not at all.

Challenge is assuring atomicity despite computer system failures

Transaction – a collection of instructions or operations that performs a single logical function.
◦ Here we are concerned with changes to stable storage – disk.
◦ A transaction is a series of read and write operations, terminated by a commit (transaction successful) or abort (transaction failed) operation.
◦ An aborted transaction must be rolled back to undo any changes it performed.

Atomic Transaction Model

Types of Storage Media

Volatile storage – information stored here does not survive system crashes.
◦ Example: main memory, cache.

Nonvolatile storage – information usually survives crashes.
◦ Example: disk and tape.

Stable storage – information is never lost.
◦ Not actually possible, so approximated via replication or RAID on devices with independent failure modes.

The goal is to assure transaction atomicity where failures cause loss of information on volatile storage.

Log-Based Recovery

Record to stable storage information about all modifications made by a transaction.
Most common is write-ahead logging:
◦ Log on stable storage; each log record describes a single transaction write operation, including:
    Transaction name
    Data item name
    Old value
    New value
◦ <Ti starts> is written to the log when transaction Ti starts.
◦ <Ti commits> is written when Ti commits.
◦ A log entry must reach stable storage before the operation on the data occurs.

Log-Based Recovery Algorithm

Using the log, the system can handle any volatile-memory errors:
◦ undo(Ti) restores the value of all data updated by Ti.
◦ redo(Ti) sets the values of all data in transaction Ti to the new values.

undo(Ti) and redo(Ti) must be idempotent:
◦ Multiple executions must have the same result as one execution.

If the system fails, restore the state of all updated data via the log:
◦ If the log contains <Ti starts> without <Ti commits>, undo(Ti).
◦ If the log contains <Ti starts> and <Ti commits>, redo(Ti).

Checkpoints

The log could become long, and recovery could take a long time! Checkpoints shorten the log and the recovery time.

Checkpoint scheme:
1. Output all log records currently in volatile storage to stable storage.
2. Output all modified data from volatile to stable storage.
3. Output a log record <checkpoint> to the log on stable storage.

Now recovery only includes Ti such that Ti started executing before the most recent checkpoint, and all transactions after Ti. All other transactions are already on stable storage.

Concurrent Transactions

Must be equivalent to a serial execution – serializability.

Could perform all transactions in a critical section.
◦ Inefficient, too restrictive.

Concurrency-control algorithms provide serializability.

Serializability

Consider two data items A and B, and transactions T0 and T1.
Execute T0, T1 atomically.
An execution sequence is called a schedule.
An atomically executed transaction order is called a serial schedule.
For N transactions, there are N! valid serial schedules.

Schedule 1: T0 then T1

Nonserial Schedule

A nonserial schedule allows overlapped execution.
◦ The resulting execution is not necessarily incorrect.

Consider schedule S with operations Oi, Oj:
◦ They conflict if they access the same data item and at least one is a write.

If Oi and Oj are consecutive operations of different transactions, and Oi and Oj don't conflict:
◦ then S' with the swapped order Oj, Oi is equivalent to S.

If S can become S' via swapping nonconflicting operations:
◦ S is conflict serializable.

Schedule 2: Concurrent Serializable Schedule

Ensure serializability by associating a lock with each data item.
◦ Follow a locking protocol for access control.

Locks:
◦ Shared – if Ti has a shared-mode lock (S) on item Q, Ti can read Q but not write Q.
◦ Exclusive – if Ti has an exclusive-mode lock (X) on Q, Ti can read and write Q.

Require every transaction on item Q to acquire the appropriate lock.
If the lock is already held, a new request may have to wait.
◦ Similar to the readers-writers algorithm.

Locking Protocol

Two-phase Locking Protocol

Generally ensures conflict serializability.

Each transaction issues lock and unlock requests in two phases:
◦ Growing – obtaining locks
◦ Shrinking – releasing locks

Does not prevent deadlock.

Managing locks/semaphores/mutexes/etc. imposes additional complexity on programmers.
◦ Harder to debug or tune for performance.

update() {
    acquire();
    /* modify shared stuff */
    release();
}

Much like other high-level constructs, it would be nice if the system/hardware did this for us.

Transactional Memory

Transactional memory systems allow sequences of read/write statements to be declared atomic – and treated like transactions.

update() {
    atomic {
        /* modify shared stuff */
    }
}

These systems (which can be implemented either in software or hardware) do the synchronization and optimization for us.

Transactional Memory