CS194-3/CS16x Introduction to Systems Lecture 7 Monitors, Readers/Writers problem, Language support for concurrency control, Relational model September 19, 2007 Prof. Anthony D. Joseph http://www.cs.berkeley.edu/~adj/cs16x


CS194-3/CS16x Introduction to Systems

Lecture 7

Monitors, Readers/Writers problem, Language support for concurrency control, Relational model

September 19, 2007

Prof. Anthony D. Joseph

http://www.cs.berkeley.edu/~adj/cs16x


Lec 7.2 9/19/07 Joseph CS194-3/16x ©UCB Fall 2007

Review: Semaphores

• Definition: a semaphore has a non-negative integer value and supports the following two operations:
  – P(): an atomic operation that waits for the semaphore to become positive, then decrements it by 1
    » Think of this as the wait() operation
  – V(): an atomic operation that increments the semaphore by 1, waking up a waiting P, if any
    » Think of this as the signal() operation
  – The only time the integer can be set directly is at initialization time
• Semaphore from railway analogy
  – Here is a semaphore initialized to 2 for resource control:

[Figure: railway semaphore animation; Value = 2, 1, 0]
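The P()/V() behavior reviewed above can be sketched with Java's `java.util.concurrent.Semaphore`, where `acquire` plays the role of P() and `release` the role of V(). This is only an illustration: the class name `SemaphoreReview` and the `run()` helper are made up for this example and are not part of the lecture.

```java
import java.util.Arrays;
import java.util.concurrent.Semaphore;

class SemaphoreReview {
    // Returns what was observed after each step:
    // [permits after 1st P, permits after 2nd P, 1 if a 3rd P succeeded else 0, permits after V]
    static int[] run() {
        Semaphore s = new Semaphore(2);      // initialized to 2, as on the slide
        int[] seen = new int[4];
        s.acquireUninterruptibly();          // P(): 2 -> 1
        seen[0] = s.availablePermits();
        s.acquireUninterruptibly();          // P(): 1 -> 0
        seen[1] = s.availablePermits();
        // A third P() would block here; tryAcquire() reports failure instead of waiting.
        seen[2] = s.tryAcquire() ? 1 : 0;
        s.release();                         // V(): 0 -> 1
        seen[3] = s.availablePermits();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

The value is only ever set directly in the constructor; afterwards it changes through acquire and release alone.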


Review: ACID Summary

• Atomicity: guarantee that either all of the tasks of a transaction are performed, or none of them are

• Consistency: database is in a legal state when the transaction begins and when it ends – a transaction cannot break the rules, or integrity constraints

• Isolation: operations inside a transaction appear isolated from all other operations – no operation outside transaction can see data in an intermediate state

• Durability: guarantee that once the user has been notified of success, the transaction will persist (survive system failure)


Goals for Today

• Continue with Synchronization Abstractions
  – Monitors and condition variables
• Readers-Writers problem and solution
• Java language support for synchronization
• Relational model

Note: Some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne. Slides courtesy of Kubiatowicz, AJ Shankar, George Necula, Alex Aiken, Eric Brewer, Ras Bodik, Ion Stoica, Doug Tygar, and David Wagner.


Motivation for Monitors and Condition Variables

• Semaphores are a huge step up, but:
  – They are confusing because they are dual purpose:
    » Both mutual exclusion and scheduling constraints
    » Example: the fact that flipping the P’s in the bounded buffer gives deadlock is not immediately obvious
  – Cleaner idea: Use locks for mutual exclusion and condition variables for scheduling constraints
• Definition: Monitor: a lock and zero or more condition variables for managing concurrent access to shared data
  – Use of Monitors is a programming paradigm
  – Some languages like Java provide monitors in the language
• The lock provides mutual exclusion to shared data:
  – Always acquire before accessing shared data structure
  – Always release after finishing with shared data
  – Lock initially free


Simple Monitor Example (version 1)

• Here is an (infinite) synchronized queue

    Lock lock;
    Queue queue;

    AddToQueue(item) {
      lock.Acquire();          // Lock shared data
      queue.enqueue(item);     // Add item
      lock.Release();          // Release Lock
    }

    RemoveFromQueue() {
      lock.Acquire();          // Lock shared data
      item = queue.dequeue();  // Get next item or null
      lock.Release();          // Release Lock
      return(item);            // Might return null
    }


Condition Variables

• How do we change the RemoveFromQueue() routine to wait until something is on the queue?
  – Could do this by keeping a count of the number of things on the queue (with semaphores), but error prone
• Condition Variable: a queue of threads waiting for something inside a critical section
  – Key idea: allow sleeping inside critical section by atomically releasing lock at time we go to sleep
  – Contrast to semaphores: Can’t wait inside critical section
• Operations:
  – Wait(&lock): Atomically release lock and go to sleep. Re-acquire lock later, before returning.
  – Signal(): Wake up one waiter, if any
  – Broadcast(): Wake up all waiters
• Rule: Must hold lock when doing condition variable ops!
  – In Birrell paper, he says can perform signal() outside of lock – IGNORE HIM (this is only an optimization)


Complete Monitor Example (with condition variable)

• Here is an (infinite) synchronized queue

    Lock lock;
    Condition dataready;
    Queue queue;

    AddToQueue(item) {
      lock.Acquire();             // Get Lock
      queue.enqueue(item);        // Add item
      dataready.signal();         // Signal any waiters
      lock.Release();             // Release Lock
    }

    RemoveFromQueue() {
      lock.Acquire();             // Get Lock
      while (queue.isEmpty()) {
        dataready.wait(&lock);    // If nothing, sleep
      }
      item = queue.dequeue();     // Get next item
      lock.Release();             // Release Lock
      return(item);
    }
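As a rough Java counterpart to the pseudocode above, the same monitor can be written with `ReentrantLock` and `Condition` from `java.util.concurrent.locks`. This is a minimal sketch, not the course's code: the class name `SyncQueue`, the method names, and the `demo()` helper are chosen for this example, and `ArrayDeque` is an assumed stand-in for the slide's `Queue`.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class SyncQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition dataReady = lock.newCondition();
    private final Queue<T> queue = new ArrayDeque<>();   // rejects null items

    void addToQueue(T item) {
        lock.lock();                      // Get lock
        try {
            queue.add(item);              // Add item
            dataReady.signal();           // Signal one waiter, if any
        } finally {
            lock.unlock();                // Release lock, even on exceptions
        }
    }

    T removeFromQueue() throws InterruptedException {
        lock.lock();
        try {
            while (queue.isEmpty()) {     // Re-check after every wakeup
                dataReady.await();        // Atomically releases the lock and sleeps
            }
            return queue.remove();
        } finally {
            lock.unlock();
        }
    }

    // One consumer and one producer; the consumer may or may not have to block.
    static int demo() {
        SyncQueue<Integer> q = new SyncQueue<>();
        int[] got = {-1};
        Thread consumer = new Thread(() -> {
            try {
                got[0] = q.removeFromQueue();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        q.addToQueue(42);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return got[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The `while` loop around `await()` is the same re-check the pseudocode performs around `dataready.wait(&lock)`.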


Mesa vs. Hoare monitors

• Need to be careful about the precise definition of signal and wait. Consider a piece of our dequeue code:

    while (queue.isEmpty()) {
      dataready.wait(&lock);  // If nothing, sleep
    }
    item = queue.dequeue();   // Get next item

  – Why didn’t we do this?

    if (queue.isEmpty()) {
      dataready.wait(&lock);  // If nothing, sleep
    }
    item = queue.dequeue();   // Get next item

• Answer: depends on the type of scheduling
  – Hoare-style (most textbooks):
    » Signaler gives lock, CPU to waiter; waiter runs immediately
    » Waiter gives up lock, processor back to signaler when it exits critical section or if it waits again
  – Mesa-style (Nachos, most real operating systems):
    » Signaler keeps lock and processor
    » Waiter placed on ready queue with no special priority
    » Practically, need to check condition again after wait


Readers/Writers Problem

• Motivation: Consider a shared database
  – Two classes of users:
    » Readers – never modify database
    » Writers – read and modify database
  – Is using a single lock on the whole database sufficient?
    » Like to have many readers at the same time
    » Only one writer at a time

[Figure: a shared database accessed by readers (R) and a writer (W)]


Basic Readers/Writers Solution

• Correctness Constraints:
  – Readers can access database when no writers
  – Writers can access database when no readers or writers
  – Only one thread manipulates state variables at a time
• Basic structure of a solution:
  – Reader()
      Wait until no writers
      Access database
      Check out – wake up a waiting writer
  – Writer()
      Wait until no active readers or writers
      Access database
      Check out – wake up waiting readers or writer
  – State variables (Protected by a lock called “lock”):
    » int AR: Number of active readers; initially = 0
    » int WR: Number of waiting readers; initially = 0
    » int AW: Number of active writers; initially = 0
    » int WW: Number of waiting writers; initially = 0
    » Condition okToRead = NIL
    » Condition okToWrite = NIL


Code for a Reader

    Reader() {
      // First check self into system
      lock.Acquire();

      while ((AW + WW) > 0) {   // Is it safe to read?
        WR++;                   // No. Writers exist
        okToRead.wait(&lock);   // Sleep on cond var
        WR--;                   // No longer waiting
      }

      AR++;                     // Now we are active!
      lock.Release();

      // Perform actual read-only access
      AccessDatabase(ReadOnly);

      // Now, check out of system
      lock.Acquire();
      AR--;                     // No longer active
      if (AR == 0 && WW > 0)    // No other active readers
        okToWrite.signal();     // Wake up one writer
      lock.Release();
    }

Callout on the release before AccessDatabase(): Why release the lock here?


Code for a Writer

    Writer() {
      // First check self into system
      lock.Acquire();
      while ((AW + AR) > 0) {    // Is it safe to write?
        WW++;                    // No. Active users exist
        okToWrite.wait(&lock);   // Sleep on cond var
        WW--;                    // No longer waiting
      }
      AW++;                      // Now we are active!
      lock.Release();

      // Perform actual read/write access
      AccessDatabase(ReadWrite);

      // Now, check out of system
      lock.Acquire();
      AW--;                      // No longer active
      if (WW > 0) {              // Give priority to writers
        okToWrite.signal();      // Wake up one writer
      } else if (WR > 0) {       // Otherwise, wake reader
        okToRead.broadcast();    // Wake all readers
      }
      lock.Release();
    }

Callouts:
• Why give priority to writers?
• Why broadcast() here instead of signal()?
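The reader and writer code can be transcribed into Java roughly as follows. This is a sketch under stated assumptions: the entry and exit halves of Reader() and Writer() are split into hypothetical `startRead`/`endRead`/`startWrite`/`endWrite` methods, and the `demo()` method (four writers incrementing an unsynchronized counter while four readers read it) stands in for AccessDatabase().

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class ReadersWriters {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition okToRead = lock.newCondition();
    private final Condition okToWrite = lock.newCondition();
    private int ar = 0, wr = 0, aw = 0, ww = 0;   // AR, WR, AW, WW from the slides

    void startRead() throws InterruptedException {
        lock.lock();
        try {
            while (aw + ww > 0) {                 // Is it safe to read?
                wr++;                             // No: writers exist
                try { okToRead.await(); } finally { wr--; }
            }
            ar++;                                 // Now we are active
        } finally {
            lock.unlock();
        }
    }

    void endRead() {
        lock.lock();
        try {
            ar--;
            if (ar == 0 && ww > 0) okToWrite.signal();   // Last reader wakes one writer
        } finally {
            lock.unlock();
        }
    }

    void startWrite() throws InterruptedException {
        lock.lock();
        try {
            while (aw + ar > 0) {                 // Is it safe to write?
                ww++;                             // No: active users exist
                try { okToWrite.await(); } finally { ww--; }
            }
            aw++;
        } finally {
            lock.unlock();
        }
    }

    void endWrite() {
        lock.lock();
        try {
            aw--;
            if (ww > 0) okToWrite.signal();           // Give priority to writers
            else if (wr > 0) okToRead.signalAll();    // Otherwise wake all readers
        } finally {
            lock.unlock();
        }
    }

    // Four writers each add 1000 to a plain int; readers run alongside.
    static int demo() {
        ReadersWriters rw = new ReadersWriters();
        int[] shared = {0};
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            final boolean writer = (i % 2 == 0);
            threads[i] = new Thread(() -> {
                try {
                    for (int k = 0; k < 1000; k++) {
                        if (writer) { rw.startWrite(); shared[0]++; rw.endWrite(); }
                        else { rw.startRead(); int ignored = shared[0]; rw.endRead(); }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return shared[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the writers were not mutually exclusive, some increments of the plain counter could be lost; with the protocol in place the demo should return 4000.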


Simulation of Readers/Writers solution

• Consider the following sequence of operators:
  – R1, R2, W1, R3
• On entry, each reader checks the following:

    while ((AW + WW) > 0) {   // Is it safe to read?
      WR++;                   // No. Writers exist
      okToRead.wait(&lock);   // Sleep on cond var
      WR--;                   // No longer waiting
    }
    AR++;                     // Now we are active!

• First, R1 comes along:
    AR = 1, WR = 0, AW = 0, WW = 0
• Next, R2 comes along:
    AR = 2, WR = 0, AW = 0, WW = 0
• Now, readers may take a while to access database
  – Situation: Locks released
  – Only AR is non-zero


Simulation(2)

• Next, W1 comes along:

    while ((AW + AR) > 0) {    // Is it safe to write?
      WW++;                    // No. Active users exist
      okToWrite.wait(&lock);   // Sleep on cond var
      WW--;                    // No longer waiting
    }
    AW++;

• Can’t start because of readers, so go to sleep:
    AR = 2, WR = 0, AW = 0, WW = 1
• Finally, R3 comes along:
    AR = 2, WR = 1, AW = 0, WW = 1
• Now, say that R2 finishes before R1:
    AR = 1, WR = 1, AW = 0, WW = 1
• Finally, last of first two readers (R1) finishes and wakes up writer:

    if (AR == 0 && WW > 0)     // No other active readers
      okToWrite.signal();      // Wake up one writer


Simulation(3)

• When writer wakes up, get:
    AR = 0, WR = 1, AW = 1, WW = 0
• Then, when writer finishes:

    if (WW > 0) {              // Give priority to writers
      okToWrite.signal();      // Wake up one writer
    } else if (WR > 0) {       // Otherwise, wake reader
      okToRead.broadcast();    // Wake all readers
    }

  – Writer wakes up reader, so get:
      AR = 1, WR = 0, AW = 0, WW = 0
• When reader completes, we are finished


Questions

• Can readers starve? Consider Reader() entry code:

    while ((AW + WW) > 0) {   // Is it safe to read?
      WR++;                   // No. Writers exist
      okToRead.wait(&lock);   // Sleep on cond var
      WR--;                   // No longer waiting
    }
    AR++;                     // Now we are active!

• What if we erase the condition check in Reader exit?

    AR--;                     // No longer active
    if (AR == 0 && WW > 0)    // No other active readers
      okToWrite.signal();     // Wake up one writer

• Further, what if we turn the signal() into broadcast()?

    AR--;                     // No longer active
    okToWrite.broadcast();    // Wake up all waiting writers

• Finally, what if we use only one condition variable (call it “okToContinue”) instead of two separate ones?
  – Both readers and writers sleep on this variable
  – Must use broadcast() instead of signal()


Can we construct Monitors from Semaphores?

• Locking aspect is easy: Just use a mutex
• Can we implement condition variables this way?

    Wait()   { semaphore.P(); }
    Signal() { semaphore.V(); }

  – Doesn’t work: Wait() may sleep with lock held
• Does this work better?

    Wait(Lock lock) {
      lock.Release();
      semaphore.P();
      lock.Acquire();
    }
    Signal() { semaphore.V(); }

  – No: Condition vars have no history, semaphores have history:
    » What if thread signals and no one is waiting? NO-OP
    » What if thread later waits? Thread Waits
    » What if thread V’s and no one is waiting? Increment
    » What if thread later does P? Decrement and continue
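The "history" difference can be observed directly in Java. The sketch below is only an illustration with made-up names (`HistoryDemo`, `semaphoreRemembers`, `conditionRemembers`): a V() issued before any P() is remembered by a `Semaphore`, while a `signal()` issued before any wait on a `Condition` is lost, so a later timed wait runs out its timeout. It assumes no spurious wakeup occurs during the short timed wait.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class HistoryDemo {
    // V() with no waiter increments the value, so a later P() succeeds at once.
    static boolean semaphoreRemembers() {
        Semaphore s = new Semaphore(0);
        s.release();               // V: value becomes 1
        return s.tryAcquire();     // P: decrement and continue
    }

    // signal() with no waiter is a no-op, so a later wait is not satisfied by it.
    static boolean conditionRemembers() {
        ReentrantLock lock = new ReentrantLock();
        Condition c = lock.newCondition();
        lock.lock();
        try {
            c.signal();            // nobody is waiting: nothing is recorded
            // awaitNanos returns a value <= 0 when the timeout elapsed without a signal
            long remaining = c.awaitNanos(TimeUnit.MILLISECONDS.toNanos(50));
            return remaining > 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(semaphoreRemembers() + " " + conditionRemembers());
    }
}
```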


Construction of Monitors from Semaphores (con’t)

• Problem with previous try:
  – P and V are commutative – result is the same no matter what order they occur
  – Condition variables are NOT commutative
• Does this fix the problem?

    Wait(Lock lock) {
      lock.Release();
      semaphore.P();
      lock.Acquire();
    }
    Signal() {
      if semaphore queue is not empty
        semaphore.V();
    }

  – Not legal to look at contents of semaphore queue
  – There is a race condition – signaler can slip in after lock release and before waiter executes semaphore.P()
• It is actually possible to do this correctly
  – Complex solution for Hoare scheduling in book
  – Can you come up with simpler Mesa-scheduled solution?


Monitor Conclusion

• Monitors represent the logic of the program
  – Wait if necessary
  – Signal when change something so any waiting threads can proceed
• Basic structure of monitor-based program:

    lock
    while (need to wait) {     // Check and/or update state variables;
      condvar.wait();          // wait if necessary
    }
    unlock

    do something so no need to wait

    lock
    condvar.signal();          // Check and/or update state variables
    unlock


Administrivia

• First design document due 9/26
  – Has to be in by 11:59pm
  – Good luck!
• Design reviews: [polls]
  – Everyone must attend! (no exceptions)
  – 2 points off for one missing person
  – 1 additional point off for each additional missing person
  – Penalty for arriving late (plan on arriving 5 to 10 mins early)


Java Language Support for Synchronization

• Java has explicit support for threads and thread synchronization
• Bank Account example:

    class Account {
      private int balance;
      // object constructor
      public Account (int initialBalance) {
        balance = initialBalance;
      }
      public synchronized int getBalance() {
        return balance;
      }
      public synchronized void deposit(int amount) {
        balance += amount;
      }
    }

  – Every object has an associated lock which gets automatically acquired and released on entry and exit from a synchronized method.
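To see the per-object lock at work, the Account class can be exercised from several threads. The `demo()` method and its numbers (four threads, 10000 deposits of 1 each) are assumptions added for illustration; only the Account fields and methods come from the slide.

```java
class Account {
    private int balance;

    // object constructor
    public Account(int initialBalance) {
        balance = initialBalance;
    }

    public synchronized int getBalance() {
        return balance;
    }

    public synchronized void deposit(int amount) {
        balance += amount;   // runs with this object's lock held
    }

    // Four threads deposit concurrently into one account.
    static int demo() {
        Account acct = new Account(0);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < 10000; k++) {
                    acct.deposit(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return acct.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Without `synchronized` on `deposit`, the read-modify-write in `balance += amount` could interleave across threads and lose updates.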


Java Language Support for Synchronization (con’t)

• Java also has synchronized statements:

    synchronized (object) {
      …
    }

  – Since every Java object has an associated lock, this type of statement acquires and releases the object’s lock on entry and exit of the body
  – Works properly even with exceptions:

    synchronized (object) {
      …
      DoFoo();
      …
    }
    void DoFoo() {
      throw errException;
    }


Java Language Support for Synchronization (con’t 2)

• In addition to a lock, every object has a single condition variable associated with it
  – How to wait inside a synchronized method or block:
    » void wait(long timeout); // Wait for timeout
    » void wait(long timeout, int nanoseconds); // variant
    » void wait();
  – How to signal in a synchronized method or block:
    » void notify(); // wakes up one waiter (Java does not specify which one)
    » void notifyAll(); // like broadcast, wakes everyone
  – Condition variables can wait for a bounded length of time. This is useful for handling exception cases:

    t1 = time.now();
    while (!ATMRequest()) {
      wait(CHECKPERIOD);
      t2 = time.now();
      if (t2 - t1 > LONG_TIME) checkMachine();
    }

  – Not all Java VMs equivalent!
    » Different scheduling policies, not necessarily preemptive!
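The bounded-wait idiom above can be written as real Java along these lines. This is a sketch with hypothetical names (`RequestBox`, `post`, `awaitRequest`); it gives up after a maximum wait rather than calling the slide's checkMachine(), and it assumes `checkPeriodMillis` is greater than zero, since `wait(0)` waits indefinitely.

```java
class RequestBox {
    private boolean requestPending = false;

    // Producer side: record the request and wake any waiters.
    synchronized void post() {
        requestPending = true;
        notifyAll();
    }

    // Waits in slices of checkPeriodMillis, giving up once maxWaitMillis has elapsed.
    // Returns true if a request arrived, false on timeout.
    synchronized boolean awaitRequest(long checkPeriodMillis, long maxWaitMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        while (!requestPending) {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= maxWaitMillis) {
                return false;                 // waited long enough: report the timeout
            }
            wait(checkPeriodMillis);          // releases this object's lock while asleep
        }
        requestPending = false;               // consume the request
        return true;
    }

    // First wait times out (nothing posted); second wait sees the posted request.
    static boolean[] demo() {
        RequestBox box = new RequestBox();
        boolean[] results = new boolean[2];
        try {
            results[0] = box.awaitRequest(10, 50);
            Thread poster = new Thread(box::post);
            poster.start();
            results[1] = box.awaitRequest(10, 5000);
            poster.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1]);
    }
}
```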


Concurrency Summary

• Semaphores: Like integers with restricted interface
  – Two operations:
    » P(): Wait if zero; decrement when becomes non-zero
    » V(): Increment and wake a sleeping task (if exists)
    » Can initialize value to any non-negative value
  – Use separate semaphore for each constraint
• Monitors: A lock plus zero or more condition variables
  – Always acquire lock before accessing shared data
  – Use condition variables to wait inside critical section
    » Three Operations: Wait(), Signal(), and Broadcast()
• Readers/Writers
  – Readers can access database when no writers
  – Writers can access database when no readers or writers
  – Only one thread manipulates state variables at a time
• Language support for synchronization:
  – Java provides synchronized keyword and one condition-variable per object (with wait() and notify())


Components of a DBMS

• DBMS: a set of cooperating software modules

[Figure: DBMS architecture diagram. Labels: query, transaction, Query Compiler, Execution Engine, Logging/Recovery, LOCK TABLE, Concurrency Control, Storage Manager, BUFFER POOL, BUFFERS, Buffer Manager, Schema Manager, Data Definition, Transaction Manager]


Data Models

• DBMS models real world
• Data Model is link between user’s view of the world and bits stored in computer
• Many models exist
• We will concentrate on the Relational Model

[Figure: a bit string (1010111101) shown next to the schema below]

Student (sid: string, name: string, login: string, age: integer, gpa: real)


Why Study the Relational Model?

• Most widely used model

• “Legacy systems” in older models – e.g., IBM’s IMS

• Object-oriented concepts merged in
  – “Object-Relational” model
    » Early work done in POSTGRES research project at Berkeley

• XML features in most relational systems
  – Can export XML interfaces
  – Can embed XML inside relational fields


Relational Database: Definitions

• Relational database: a set of relations
• Relation: made up of 2 parts:
  – Schema: specifies name of relation, plus name and type of each column.
    » E.g. Students(sid: string, name: string, login: string, age: integer, gpa: real)
  – Instance: a table, with rows and columns.
    » #rows = cardinality
    » #fields = degree / arity
• Can think of a relation as a set of rows or tuples – i.e., all rows are distinct


Ex: Instance of Students Relation

    sid    name   login       age  gpa
    53666  Jones  jones@cs    18   3.4
    53688  Smith  smith@eecs  18   3.2
    53650  Smith  smith@math  19   3.8

• Cardinality = 3, arity = 5 , all rows distinct

• Do all values in each column of a relation instance have to be distinct?


SQL - A language for Relational DBs

• SQL (a.k.a. “Sequel”), standard language
• Data Definition Language (DDL)
  – create, modify, delete relations
  – specify constraints
  – administer users, security, etc.
• Data Manipulation Language (DML)
  – Specify queries to find tuples that satisfy criteria
  – add, modify, remove tuples


SQL Overview

• CREATE TABLE <name> ( <field> <domain>, … )

• INSERT INTO <name> (<field names>) VALUES (<field values>)

• DELETE FROM <name> WHERE <condition>

• UPDATE <name> SET <field name> = <value> WHERE <condition>

• SELECT <fields> FROM <name> WHERE <condition>


Creating Relations in SQL

• Creates the Students relation.
  – Note: the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified.

    CREATE TABLE Students
      (sid CHAR(20),
       name CHAR(20),
       login CHAR(10),
       age INTEGER,
       gpa FLOAT)


Table Creation (continued)

• Another example: the Enrolled table holds information about courses students take

CREATE TABLE Enrolled(sid CHAR(20), cid CHAR(20), grade CHAR(2))


Adding and Deleting Tuples

• Can insert a single tuple using:

    INSERT INTO Students (sid, name, login, age, gpa)
      VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)

• Can delete all tuples satisfying some condition (e.g., name = Smith):

    DELETE FROM Students S
      WHERE S.name = 'Smith'

Powerful variants of these commands are available; more later!


Keys

• Keys are a way to associate tuples in different relations

• Keys are one form of integrity constraint (IC)

Enrolled (sid is the FOREIGN key)

    sid    cid          grade
    53666  Carnatic101  C
    53666  Reggae203    B
    53650  Topology112  A
    53666  History105   B

Students (sid is the PRIMARY key)

    sid    name   login       age  gpa
    53666  Jones  jones@cs    18   3.4
    53688  Smith  smith@eecs  18   3.2
    53650  Smith  smith@math  19   3.8


Primary Keys

• A set of fields is a superkey if:
  – No two distinct tuples can have same values in all key fields
• A set of fields is a key for a relation if:
  – It is a superkey
  – No proper subset of the fields is a superkey
• What if >1 key for a relation?
  – One of the keys is chosen (by DBA) to be the primary key. Other keys are called candidate keys.
• E.g.
  – sid is a key for Students.
  – What about name?
  – The set {sid, gpa} is a superkey.
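The "What about name?" question can be checked mechanically against the Students instance shown earlier. The sketch below uses a hypothetical helper, `violatesKey`, that reports whether two rows agree on every listed column. A violation proves the columns are not a superkey; the absence of a violation in one instance proves nothing, since a key is a statement about all legal instances.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeyCheck {
    // Columns: 0 = sid, 1 = name, 2 = login, 3 = age, 4 = gpa
    static final String[][] STUDENTS = {
        {"53666", "Jones", "jones@cs", "18", "3.4"},
        {"53688", "Smith", "smith@eecs", "18", "3.2"},
        {"53650", "Smith", "smith@math", "19", "3.8"},
    };

    // True if two distinct rows have the same values in all of the given columns.
    static boolean violatesKey(String[][] rows, int... cols) {
        Set<List<String>> seen = new HashSet<>();
        for (String[] row : rows) {
            List<String> projection = new ArrayList<>();
            for (int c : cols) {
                projection.add(row[c]);
            }
            if (!seen.add(projection)) {
                return true;   // this combination of values was already seen
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("name:       " + violatesKey(STUDENTS, 1));
        System.out.println("sid:        " + violatesKey(STUDENTS, 0));
        System.out.println("{sid, gpa}: " + violatesKey(STUDENTS, 0, 4));
    }
}
```

Here name is ruled out because two students are called Smith, while sid and {sid, gpa} are merely not contradicted by this instance.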


Primary and Candidate Keys in SQL

• Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key.
• Keys must be used carefully!
• “For a given student and course, there is a single grade.”

    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid, cid))

  vs.

  “Students can take only one course, and no two students in a course receive the same grade.”

    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid),
       UNIQUE (cid, grade))


Foreign Keys, Referential Integrity

• Foreign key: Set of fields in one relation that is used to `refer’ to a tuple in another relation.
  – Must correspond to the primary key of the other relation.
  – Like a `logical pointer’.
• If all foreign key constraints are enforced, referential integrity is achieved (i.e., no dangling references.)


Foreign Keys in SQL

• E.g. Only students listed in the Students relation should be allowed to enroll for courses.
  – sid is a foreign key referring to Students:

    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid, cid),
       FOREIGN KEY (sid) REFERENCES Students)

Enrolled

    sid    cid          grade
    53666  Carnatic101  C
    53666  Reggae203    B
    53650  Topology112  A
    53666  History105   B

Students

    sid    name   login       age  gpa
    53666  Jones  jones@cs    18   3.4
    53688  Smith  smith@eecs  18   3.2
    53650  Smith  smith@math  19   3.8

[Figure also shows a separate Enrolled tuple (11111, English102, A); its sid does not appear in Students]


BREAK


Enforcing Referential Integrity

• Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students.

• What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!)

• What should be done if a Students tuple is deleted?
  – Also delete all Enrolled tuples that refer to it?
  – Disallow deletion of a Students tuple that is referred to?
  – Set sid in Enrolled tuples that refer to it to a default sid?
  – (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ or `inapplicable’.)

• Similar issues arise if primary key of Students tuple is updated.


Integrity Constraints (ICs)

• IC: condition that must be true for any instance of the database; e.g., domain constraints.
  – ICs are specified when schema is defined.
  – ICs are checked when relations are modified.
• A legal instance of a relation is one that satisfies all specified ICs.
  – DBMS should not allow illegal instances.
• If the DBMS checks ICs, stored data is more faithful to real-world meaning.
  – Avoids data entry errors, too!


Where do ICs Come From?

• ICs are based upon the semantics of the real-world that is being described in the database relations
• We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance
  – An IC is a statement about all possible instances!
  – From example, we know name is not a key, but the assertion that sid is a key is given to us
• Key and foreign key ICs are the most common; more general ICs supported too


Relational Query Languages

• A major strength of the relational model: supports simple, powerful querying of data
• Queries can be written intuitively, and the DBMS is responsible for efficient evaluation
  – The key: precise semantics for relational queries.
  – Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change


The SQL Query Language

• The most widely used relational query language.
  – Current std is SQL:2003; SQL92 is a basic subset
• To find all 18 year old students, we can write:

    SELECT *
    FROM Students S
    WHERE S.age=18

• To find just names and logins, replace the first line:

    SELECT S.name, S.login

    sid    name   login       age  gpa
    53666  Jones  jones@cs    18   3.4
    53688  Smith  smith@ee    18   3.2
    53650  Smith  smith@math  19   3.8


Querying Multiple Relations

• What does the following query compute?

    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid=E.sid AND E.grade='A'

Given the following instance of Enrolled

    sid    cid          grade
    53831  Carnatic101  C
    53831  Reggae203    B
    53650  Topology112  A
    53666  History105   B

we get:

    S.name  E.cid
    Smith   Topology112


Semantics of a Query

• A conceptual evaluation method for the previous query:
  1. do FROM clause: compute cross-product of Students and Enrolled
  2. do WHERE clause: Check conditions, discard tuples that fail
  3. do SELECT clause: Delete unwanted fields
• Remember, this is conceptual. Actual evaluation will be much more efficient, but must produce the same answers.
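The three conceptual steps can be traced in a few lines of Java over the Students and Enrolled instances from the slides. This is a deliberately naive sketch of the conceptual semantics, not how a DBMS evaluates the query; the class name `ConceptualQuery` and the array-of-strings representation are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ConceptualQuery {
    // Students: sid, name, login, age, gpa
    static final String[][] STUDENTS = {
        {"53666", "Jones", "jones@cs", "18", "3.4"},
        {"53688", "Smith", "smith@ee", "18", "3.2"},
        {"53650", "Smith", "smith@math", "19", "3.8"},
    };
    // Enrolled: sid, cid, grade
    static final String[][] ENROLLED = {
        {"53831", "Carnatic101", "C"},
        {"53831", "Reggae203", "B"},
        {"53650", "Topology112", "A"},
        {"53666", "History105", "B"},
    };

    // SELECT S.name, E.cid FROM Students S, Enrolled E
    // WHERE S.sid = E.sid AND E.grade = 'A'
    static List<String> run() {
        // 1. FROM: cross-product of Students and Enrolled (3 x 4 = 12 pairs)
        List<String[][]> cross = new ArrayList<>();
        for (String[] s : STUDENTS) {
            for (String[] e : ENROLLED) {
                cross.add(new String[][] {s, e});
            }
        }
        // 2. WHERE: discard pairs that fail the conditions
        List<String[][]> kept = new ArrayList<>();
        for (String[][] pair : cross) {
            String[] s = pair[0];
            String[] e = pair[1];
            if (s[0].equals(e[0]) && e[2].equals("A")) {
                kept.add(pair);
            }
        }
        // 3. SELECT: keep only S.name and E.cid
        List<String> result = new ArrayList<>();
        for (String[][] pair : kept) {
            result.add(pair[0][1] + " " + pair[1][1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A real optimizer would typically apply the join and grade conditions while scanning rather than materializing all 12 pairs first, but the answer has to match this conceptual result.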


Cross-product of Students and Enrolled Instances

    S.sid  S.name  S.login     S.age  S.gpa  E.sid  E.cid        E.grade
    53666  Jones   jones@cs    18     3.4    53831  Carnatic101  C
    53666  Jones   jones@cs    18     3.4    53831  Reggae203    B
    53666  Jones   jones@cs    18     3.4    53650  Topology112  A
    53666  Jones   jones@cs    18     3.4    53666  History105   B
    53688  Smith   smith@ee    18     3.2    53831  Carnatic101  C
    53688  Smith   smith@ee    18     3.2    53831  Reggae203    B
    53688  Smith   smith@ee    18     3.2    53650  Topology112  A
    53688  Smith   smith@ee    18     3.2    53666  History105   B
    53650  Smith   smith@math  19     3.8    53831  Carnatic101  C
    53650  Smith   smith@math  19     3.8    53831  Reggae203    B
    53650  Smith   smith@math  19     3.8    53650  Topology112  A
    53650  Smith   smith@math  19     3.8    53666  History105   B


Relational Model: Summary

• A tabular representation of data
• Simple and intuitive, currently the most widely used
  – Object-relational support in most products
  – XML support added in SQL:2003, most systems
• Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations.
  – Two important ICs: primary and foreign keys
  – In addition, we always have domain constraints.
• Powerful query languages exist
  – SQL is the standard commercial one
    » DDL - Data Definition Language
    » DML - Data Manipulation Language