48
Notes on Chapter 8 Transactions from Chapter 6 Views and Indexes from Chapter 8

Notes on Chapter 8

  • Upload
    gaia

  • View
    40

  • Download
    3

Embed Size (px)

DESCRIPTION

Notes on Chapter 8. Transactions from Chapter 6 Views and Indexes from Chapter 8. Why Transactions?. Controlling Concurrent Behavior Database systems are normally being accessed by many users or processes at the same time. Both queries and modifications. - PowerPoint PPT Presentation

Citation preview

Page 1: Notes on Chapter 8

Notes on Chapter 8

Transactions from Chapter 6Views and Indexes from Chapter 8

Page 2: Notes on Chapter 8

Why Transactions?

• Controlling Concurrent Behavior

– Database systems are normally being accessed by many users or processes at the same time.• Both queries and modifications.

– Unlike operating systems, which support interaction of processes, a DMBS needs to keep processes from troublesome interactions.

Page 3: Notes on Chapter 8

Examples of Interactions

• The airline seat choice in 6.6.1• You and your domestic partner each take $100 from

different ATM’s at about the same time.– The DBMS better make sure one account deduction

doesn’t get lost.• Compare: An OS allows two people to edit a

document at the same time. If both write, one’s changes get lost.

• What happens when there is a crash in the middle of an update?

Page 4: Notes on Chapter 8

The Bottom Line

• We want serializability and atomicity

• The answer is the concept of a transaction

Page 5: Notes on Chapter 8

Transactions

• Transaction = process involving database queries and/or modification.

• Normally with some strong properties regarding concurrency.

• Formed in SQL from single statements or explicit programmer control.

Page 6: Notes on Chapter 8

ACID Transactions

• ACID transactions are:– Atomic : Whole transaction or none is done.– Consistent : Database constraints preserved.– Isolated : It appears to the user as if only one process

executes at a time.– Durable : Effects of a process survive a crash.

• Optional: weaker forms of transactions are often supported as well.

Page 7: Notes on Chapter 8

Transactions and SQL

• SQL allows the programmer to group several statements into a single transaction.

• The SQL command START TRANSACTION is used to mark the beginning of a transaction.

• We execute BEGIN TRANSACTION before accessing the database.

Page 8: Notes on Chapter 8

Ending a Transaction in SQL - 1

• The SQL statement COMMIT causes a transaction to complete.– It’s database modifications are now permanent in the

database.

Page 9: Notes on Chapter 8

Ending a Transaction in SQL - 2

• The SQL statement ROLLBACK also causes the transaction to end, but by aborting.– No effects on the database.

• Failures like division by 0 or a constraint violation can also cause rollback, even if the programmer does not request it.

Page 10: Notes on Chapter 8

Read- Only Transactions

• SET TRANSACTION READ ONLY;

• Readers and Writers problem• Generally possible that SQL system will be able to

take advantage of that

Page 11: Notes on Chapter 8

Dirty Reads

• Dirty data is a term for data written by a transaction that has not committed

• A dirty read is a read of dirty data written by another transaction

Page 12: Notes on Chapter 8

Interacting Processes Example

• Assume the usual Sells(bar,beer,price) relation, and suppose that Joe’s Bar sells only Bud for $2.50 and Miller for $3.00.

• Sally is querying Sells for the highest and lowest price Joe charges.

• Joe decides to stop selling Bud and Miller, but to sell only Heineken at $3.50.

Page 13: Notes on Chapter 8

Sally’s Program

• Sally executes the following two SQL statements called (min) and (max) to help us remember what they do.

(max) SELECT MAX(price) FROM SellsWHERE bar = ’Joe’’s Bar’;

(min) SELECT MIN(price) FROM SellsWHERE bar = ’Joe’’s Bar’;

Page 14: Notes on Chapter 8

Joe’s Program

• At about the same time, Joe executes the following steps: (del) and (ins).

(del) DELETE FROM Sells WHERE bar = ’Joe’’s Bar’;

(ins) INSERT INTO Sells VALUES(’Joe’’s Bar’, ’Heineken’,

3.50);

Page 15: Notes on Chapter 8

Interleaving of Statements

• Although (max) must come before (min), and (del) must come before (ins), there are no other constraints on the order of these statements, unless we group Sally’s and/or Joe’s statements into transactions.

Page 16: Notes on Chapter 8

Problems with the Interleaving

• Suppose the steps execute in the order (max)(del)(ins)(min).

Joe’s Prices:Statement:Result:

• Sally sees MAX < MIN!

{2.50,3.00}

(del) (ins)

{3.50}

(min)

3.50

{2.50,3.00}

(max)

3.00

Page 17: Notes on Chapter 8

Fixing the Problem Using Transactions

• If we group Sally’s statements (max)(min) into one transaction, then she cannot see this inconsistency.

• She sees Joe’s prices at some fixed time.– Either before or after he changes prices, or in the middle,

but the MAX and MIN are computed from the same prices.

Page 18: Notes on Chapter 8

Another Problem: Rollback

• Suppose Joe executes (del)(ins), not as a transaction, but after executing these statements, thinks better of it and issues a ROLLBACK statement.

• If Sally executes her statements after (ins) but before the rollback, she sees a value, 3.50, that never existed in the database.

Page 19: Notes on Chapter 8

Solution to the Rollback Problem

• If Joe executes (del)(ins) as a transaction, its effect cannot be seen by others until the transaction executes COMMIT.– If the transaction executes ROLLBACK instead, then its

effects can never be seen.

Page 20: Notes on Chapter 8

Isolation LEvels

• SQL defines four isolation levels– These are choices about what interactions are allowed by

transactions that execute at about the same time.• Only one level (“serializable”) = ACID transactions.• Each DBMS implements transactions in its own way.

Page 21: Notes on Chapter 8

Choosing the Isolation Level

• Within a transaction, we can say:SET TRANSACTION ISOLATION LEVEL Xwhere X =1. SERIALIZABLE2. REPEATABLE READ3. READ COMMITTED4. READ UNCOMMITTED

Page 22: Notes on Chapter 8

Serializable Transactions

• If Sally = (max)(min) and Joe = (del)(ins) are each transactions, and Sally runs with isolation level SERIALIZABLE, then she will see the database either before or after Joe runs, but not in the middle.

Page 23: Notes on Chapter 8

Isolation Level Is Personal Choice

• Your choice, e.g., run serializable, affects only how you see the database, not how others see it.

• For example, if Joe Runs serializable, but Sally doesn’t, then Sally might see no prices for Joe’s Bar.– i.e., it looks to Sally as if she ran in the middle of Joe’s

transaction.

Page 24: Notes on Chapter 8

Read-Committed Transactions

• If Sally runs with isolation level READ COMMITTED, then she can see only committed data, but not necessarily the same data each time.

• For example, under READ COMMITTED, the interleaving (max)(del)(ins)(min) is allowed, as long as Joe commits.– Sally sees MAX < MIN.

Page 25: Notes on Chapter 8

Repeatable-Read Transactions

• Requirement is like read-committed, plus: if data is read again, then everything seen the first time will be seen the second time.– But the second and subsequent reads may see more

tuples as well.

• Suppose Sally runs under REPEATABLE READ, and the order of execution is (max)(del)(ins)(min).– (max) sees prices 2.50 and 3.00.– (min) can see 3.50, but must also see 2.50 and 3.00,

because they were seen on the earlier read by (max).

Page 26: Notes on Chapter 8

Read Uncommitted

• A transaction running under READ UNCOMMITTED can see data in the database, even if it was written by a transaction that has not committed (and may never).

• For example, if Sally runs under READ UNCOMMITTED, she could see a price 3.50 even if Joe later aborts.

Page 27: Notes on Chapter 8

8.1 Virtual Views

• A view is a relation defined in terms of stored tables (called base tables ) and other views.

• Two kinds:1. Virtual = not stored in the database; just a query for

constructing the relation.2. Materialized = actually constructed and stored.

Page 28: Notes on Chapter 8

Declaring Views

• Declare virtual view by:CREATE VIEW <view-name> AS <view-definition>;

• Materialized view byCREATE MATERIALIZED VIEW <name> AS <query>

• Default is virtual.

Page 29: Notes on Chapter 8

Example of a View Definition

• CanDrink(drinker, beer) is a view “containing” the drinker-beer pairs such that the drinker frequents at least one bar that serves the beer:

CREATE VIEW CanDrink ASSELECT drinker, beerFROM Frequents, SellsWHERE Frequents.bar = Sells.bar;

Page 30: Notes on Chapter 8

Example of Accessing a View

• Query a view as if it were a base table.– Also: a limited ability to modify views if it makes sense as a

modification of one underlying base table.• Example query:

SELECT beer FROM CanDrinkWHERE drinker = ’Sally’;

Page 31: Notes on Chapter 8

Modifying Views

• Since a virtual view does not exist, it doesn’t make to much sense to modify one by insertion. (You can delete and update views.)

• SQL, however, provides a formal definition of when modifications of a view are permitted.– Such as “the list in the SELECT clause must include enough attributes

that for the every tuple inserted into the view, we can fill the other attributes with NULL values or the proper default”

• An insertion on a view can be applied directly to the underlying relation R.

Page 32: Notes on Chapter 8

Text Example

CREATE VIEW ParamountMovies ASSELECT studioName, title, yearFROM MoviesWHERE studioName = ‘Paramount’;

Then insert the Star-Trek tuple into the view byINSERT INTO ParamountMoviesVALUES(‘Paramount’, ‘Star trek’, 1979);

Same effect on Movies asINSERT INTO Movies(studioName, title, year)VALUES(‘Paramount’, ‘Star trek’, 1979);

Page 33: Notes on Chapter 8

Instead-Of Triggers on Views

• An INSTEAD OF trigger lets us interpret view modifications in a way that makes sense.

• An instead-of trigger intercepts attempts to modify the view and in its place performs whatever action the database designer deems appropriate.

• Consider the following view named Synergy that has (drinker, beer, bar) triples such that the bar serves the beer, the drinker frequents the bar and likes the beer.

Page 34: Notes on Chapter 8

The View Named Synergy

CREATE VIEW Synergy ASSELECT Likes.drinker, Likes.beer, Sells.barFROM Likes, Sells, FrequentsWHERE Likes.drinker = Frequents.drinker

AND Likes.beer = Sells.beerAND Sells.bar = Frequents.bar;

Natural join of Likes,Sells, and Frequents

Pick one copy ofeach attribute

Page 35: Notes on Chapter 8

Interpreting a View Insertion

• We cannot insert into Synergy --- it is a virtual view that does not meet the criteria of a single relation in the FROM clause.

• But we can use an INSTEAD OF trigger to turn a (drinker, beer, bar) triple into three insertions of projected pairs, one for each of Likes, Sells, and Frequents.– Sells.price will have to be NULL.

Page 36: Notes on Chapter 8

The Trigger

CREATE TRIGGER ViewTrigINSTEAD OF INSERT ON SynergyREFERENCING NEW ROW AS nFOR EACH ROWBEGIN

INSERT INTO LIKES VALUES(n.drinker, n.beer);INSERT INTO SELLS(bar, beer) VALUES(n.bar,

n.beer);INSERT INTO FREQUENTS VALUES(n.drinker, n.bar);

END;

Page 37: Notes on Chapter 8

8.5 Materialized Views

• If a view is used frequently enough, it may be efficient to materialize it– That is, maintain its value at all times.

• The problem is that each time a base table changes, the materialized view may change.– Cannot afford to re-compute the view with each

change.

Page 38: Notes on Chapter 8

Maintaining a Materialized View

• It is possible to make changes to a materialized view in an incremental manner – Insertions, deletions, and updates to a base table can be

implemented in a join view by a small number of queries to the base tables followed by modification statements on the materialized view.

• A second solution is periodic reconstruction of the materialized view, which is otherwise “out of date.”

Page 39: Notes on Chapter 8

Class list during scheduling

• The class list of CSC570 students is in effect a materialized view of the class enrollment in Banner.

• Actually should be updated four times/day.

Page 40: Notes on Chapter 8

How about a Data Warehouse?

• Wal-Mart stores every sale at every store in a database.

• Overnight, the sales for the day are used to update a data warehouse = materialized views of the sales.

• The warehouse is used by analysts to predict trends and move goods to where they are selling best.

Page 41: Notes on Chapter 8

8.3 Indexes in SQL

• Index = data structure used to speed access to tuples of a relation, given values of one or more attributes.– Could be expensive to scan all tuples to find only those

that match a given condition

• Could be a hash table, but in a DBMS it is always a balanced search tree with giant nodes (a full disk page) called a B-tree.

Page 42: Notes on Chapter 8

Declaring Indexes

• There does not appear to be a SQL standard

• Typical syntax:CREATE INDEX BeerInd ON Beers(manf);CREATE INDEX SellInd ON Sells(bar, beer);

Page 43: Notes on Chapter 8

Using Indexes

• Given a value v, the index takes us to only those tuples that have v in the attribute(s) of the index.

• For example use BeerInd and SellInd to find the prices of beers manufactured by Pete’s and sold by Joe. (next slide)

Page 44: Notes on Chapter 8

Using Indexes - 2

SELECT price FROM Beers, SellsWHERE manf = ’Pete’’s’ AND

Beers.name = Sells.beer ANDbar = ’Joe’’s Bar’;

1. Use BeerInd to get all the beers made by Pete’s.2. Then use SellInd to get prices of those beers, with

bar = ’Joe’’s Bar’

Page 45: Notes on Chapter 8

8.4 Selection of Indexes (i.e. Database Tuning)

• A major problem in making a database run fast is deciding which indexes to create.

• Pro: An index speeds up queries that can use it.• Con: An index slows down all modifications on its

relation because the index must be modified too.

• Often, the most useful index we can put on a relation is an index on its key.– An index on the key will get used frequently

Page 46: Notes on Chapter 8

Tuning Example

• Suppose the only things we did with our beers database was:

1. Insert new facts into a relation (10%).2. Find the price of a given beer at a given bar (90%).

• Then SellInd on Sells(bar, beer) would be wonderful, but BeerInd on Beers(manf) would be harmful.

Page 47: Notes on Chapter 8

Calculating the Best Indexes to Create

• The authors suggest tuning advisors.• Appears to be a major research thrust.– Because hand tuning is so hard.

• An advisor gets a query load, e.g.:1. Choose random queries from the history of

queries run on the database, or2. Designer provides a sample work

Page 48: Notes on Chapter 8

Tuning Advisors - 2

• The advisor generates candidate indexes and evaluates each on the workload.– Feed each sample query to the query optimizer, which

assumes only this one index is available.– Measure the improvement/degradation in the average

running time of the queries.