CMPT 354: Database System I
Lecture 11. Transaction Management
Why this lecture
• DB application developer
• What if a crash occurs, the power goes out, etc.?
• Single user → Multiple users
Outline
• Transaction Basics
  • Definition
  • Motivation for Transactions
  • ACID Properties
• Concurrency Control
  • Scheduling
  • Anomaly Types
  • Conflict Serializability
Transactions: Basic Definition
A transaction (“TXN”) is a sequence of one or more operations (reads or writes) which reflects a single real-world transition.
In the real world, a TXN either happened completely or not at all.
Examples:
• Transfer money between accounts
• Purchase a group of products
• Register for a class (either waitlist or allocated)
Transactions in SQL
• In “ad-hoc” SQL:
  • Default: each statement = one transaction
• In a program, multiple statements can be grouped together as a transaction:

    START TRANSACTION;
    UPDATE Bank SET amount = amount - 100 WHERE name = 'Bob';
    UPDATE Bank SET amount = amount + 100 WHERE name = 'Joe';
    COMMIT;
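The grouping above can be tried from a program. The following is a minimal sketch using Python's built-in sqlite3 module. The starting balances and the `transfer` helper are assumptions made for illustration, and SQLite spells the opening statement `BEGIN` rather than `START TRANSACTION`.

```python
import sqlite3

# Hypothetical in-memory Bank table mirroring the slide's example.
# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN/COMMIT/ROLLBACK statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Bank (name TEXT PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO Bank VALUES ('Bob', 500), ('Joe', 200)")


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Bank SET amount = amount - ? WHERE name = ?", (amount, src)
        )
        conn.execute(
            "UPDATE Bank SET amount = amount + ? WHERE name = ?", (amount, dst)
        )
        conn.execute("COMMIT")
    except Exception:
        # Undo both updates if either one failed.
        conn.execute("ROLLBACK")
        raise


transfer(conn, "Bob", "Joe", 100)
balances = dict(conn.execute("SELECT name, amount FROM Bank"))
```

Both updates become visible together at `COMMIT`; if either statement raises, the `ROLLBACK` leaves both balances unchanged.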
Motivation for Transactions
Grouping user actions (reads & writes) into transactions helps with two goals:
1. Recovery & Durability: Keeping the DBMS data consistent and durable in the face of crashes, aborts, system shutdowns, etc.
2. Concurrency: Achieving better performance by parallelizing TXNs without creating anomalies
Motivation
1. Recovery & Durability of user data is essential for reliable DBMS usage
• The DBMS may experience crashes (e.g. power outages, etc.)
• Individual TXNs may be aborted (e.g. by the user)
Idea: Make sure that TXNs are either durably stored in full, or not at all; keep a log to be able to “roll back” TXNs
Protection against crashes/aborts
Client 1:
    INSERT INTO SmallProduct(name, price)
    SELECT pname, price
    FROM Product
    WHERE price <= 0.99;

    DELETE FROM Product
    WHERE price <= 0.99;
Crash/abort! What goes wrong?
Protection against crashes/aborts
Client 1:
    START TRANSACTION;
    INSERT INTO SmallProduct(name, price)
    SELECT pname, price
    FROM Product
    WHERE price <= 0.99;

    DELETE FROM Product
    WHERE price <= 0.99;
    COMMIT;  -- or ROLLBACK
Now we'd be fine!
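A rough way to see the difference is to simulate a failure between the INSERT and the DELETE. The sketch below uses Python's sqlite3 module; the sample rows, the `fail_midway` flag, and the helper names are assumptions for illustration only, and the "crash" is just a raised exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.execute("CREATE TABLE Product (pname TEXT, price REAL)")
conn.execute("CREATE TABLE SmallProduct (name TEXT, price REAL)")
conn.execute(
    "INSERT INTO Product VALUES ('gum', 0.50), ('pen', 0.99), ('book', 12.00)"
)


def move_cheap_products(conn, fail_midway=False):
    """Copy cheap products to SmallProduct, then delete them, atomically."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO SmallProduct(name, price) "
            "SELECT pname, price FROM Product WHERE price <= 0.99"
        )
        if fail_midway:
            # Stand-in for a crash/abort between the two statements.
            raise RuntimeError("simulated crash between INSERT and DELETE")
        conn.execute("DELETE FROM Product WHERE price <= 0.99")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def counts(conn):
    """Return (rows in SmallProduct, rows in Product)."""
    small = conn.execute("SELECT COUNT(*) FROM SmallProduct").fetchone()[0]
    prod = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]
    return small, prod


try:
    move_cheap_products(conn, fail_midway=True)
except RuntimeError:
    pass
after_failure = counts(conn)  # the INSERT was rolled back

move_cheap_products(conn)
after_success = counts(conn)  # both statements took effect
```

Without the transaction, the failed run would leave the cheap products in both tables; with it, the database is either in the state before the move or the state after it.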
Motivation
2. Concurrent execution of user programs is essential for good DBMS performance.
• Disk accesses may be frequent and slow - optimize for throughput (# of TXNs), trade for latency (time for any one TXN)
• Users should still be able to execute TXNs as if in isolation and such that consistency is maintained
Idea: Have the DBMS handle running several user TXNs concurrently, in order to keep CPUs humming…
Multiple users: single statements
Client 1: UPDATE Employee SET Salary = Salary + 1000;
Client 2: UPDATE Employee SET Salary = Salary * 2;
Two managers attempt to increase employee salaries concurrently - what could go wrong?
Multiple users: single statements
Client 1:
    START TRANSACTION;
    UPDATE Employee SET Salary = Salary + 1000;
    COMMIT;
Client 2:
    START TRANSACTION;
    UPDATE Employee SET Salary = Salary * 2;
    COMMIT;
Now works like a charm - we'll see how/why later…
Transaction Properties: ACID
• Atomic
  • State shows either all the effects of a TXN, or none of them
• Consistent
  • A TXN moves from a state where integrity holds to another where integrity holds
• Isolated
  • The effect of TXNs is the same as TXNs running one after another (i.e. looks like batch mode)
• Durable
  • Once a TXN has committed, its effects remain in the database
ACID continues to be a source of great debate!
ACID: Atomic
• A TXN's activities are atomic: all or nothing
  • Intuitively: in the real world, a transaction is something that would either occur completely or not at all
• Two possible outcomes for a TXN
  • It commits: all the changes are made
  • It aborts: no changes are made
ACID: Consistent
• The tables must always satisfy user-specified constraints
• Examples:
  • Account number is unique
  • Stock amount can't be negative
  • Sum of debits and of credits is 0
• How consistency is achieved:
  • Programmer makes sure a TXN takes a consistent state to a consistent state
  • System makes sure that the TXN is atomic
ACID: Isolated
• A transaction executes concurrently with other transactions
• Isolation: the effect is as if each transaction executes in isolation of the others.
  • E.g. a TXN should not be able to observe changes from other transactions during its run
ACID: Durable
• The effect of a TXN must continue to exist (“persist”) after the TXN
  • And after the whole program has terminated
  • And even if there are power failures, crashes, etc.
  • And etc…
• Means: Write data to disk
A Note: ACID is contentious!
• Many debates over ACID, both historically and currently
• Many newer “NoSQL” DBMSs relax ACID
• In turn, now “NewSQL” reintroduces ACID compliance to NoSQL-style DBMSs…
ACID is an extremely important & successful paradigm, but still debated!
Transaction Management (Big Picture)
Definition
• Storage supports two operations: Write and Read
• Transaction = a list of writes and reads
• For example: {Read, Read, Write}
Two Big Problems
1. Support multiple transactions at the same time
2. Make sure the data stored is reliable
Techniques
1. Concurrency Control
2. Database Recovery
Properties
• Atomicity
• Consistency
• Isolation
• Durability
Concurrency: Isolation & Consistency
• The DBMS must handle concurrency such that…
1. Isolation is maintained: Users must be able to execute each TXN as if they were the only user
  • DBMS handles the details of interleaving various TXNs
2. Consistency is maintained: TXNs must leave the DB in a consistent state
  • DBMS handles the details of enforcing integrity constraints
Example - consider two TXNs:
T1:
    START TRANSACTION;
    UPDATE Accounts SET Amt = Amt + 100 WHERE Name = 'A';
    UPDATE Accounts SET Amt = Amt - 100 WHERE Name = 'B';
    COMMIT;
T2:
    START TRANSACTION;
    UPDATE Accounts SET Amt = Amt * 1.06;
    COMMIT;
T1 transfers $100 from B's account to A's account
T2 credits both accounts with a 6% interest payment
Example - consider two TXNs:
We can look at the TXNs in a timeline view - serial execution:
Schedule (in time order): T1: A += 100; T1: B -= 100; T2: A *= 1.06; T2: B *= 1.06
Example - consider two TXNs:
The TXNs could occur in either order… DBMS allows!
Schedule (in time order): T2: A *= 1.06; T2: B *= 1.06; T1: A += 100; T1: B -= 100
Example - consider two TXNs:
The DBMS can also interleave the TXNs
Schedule (in time order): T2: A *= 1.06; T1: A += 100; T2: B *= 1.06; T1: B -= 100
T2 credits A's account with a 6% interest payment, then T1 transfers $100 to A's account…
T2 credits B's account with a 6% interest payment, then T1 transfers $100 from B's account…
Example - consider two TXNs:
The DBMS can also interleave the TXNs
[Timeline: an interleaving of T1 (A += 100, B -= 100) and T2 (A *= 1.06, B *= 1.06)]
Is it correct?
Why Interleave TXNs?
• Interleaving TXNs might lead to anomalous outcomes… why do it?
• Several important reasons:
  • Individual TXNs might be slow - don't want to block other users during!
  • Disk access may be slow - let some TXNs use CPUs while others are accessing disk!
All concern large differences in performance
Ignore all issues?
• At Facebook, only 0.0004% of results returned are inconsistent
• But,
Interleaving & Isolation
• The DBMS has freedom to interleave TXNs
• However, it must pick an interleaving or schedule such that isolation and consistency are maintained
  • Must be as if the TXNs had executed serially!
DBMS must pick a schedule which maintains isolation & consistency
“With great power comes great responsibility”
Scheduling examples
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06

Schedule                   A      B
Starting balance           $50    $200
Serial schedule T1, T2     $159   $106
Interleaved schedule 1     $159   $106

Same result!
Scheduling examples
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06

Schedule                   A      B
Starting balance           $50    $200
Serial schedule T1, T2     $159   $106
Interleaved schedule 2     $159   $112

Different result than serial T1, T2!
Scheduling examples
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06

Schedule                   A      B
Starting balance           $50    $200
Serial schedule T2, T1     $153   $112
Interleaved schedule 2     $159   $112

Different result than serial T2, T1 ALSO!
Scheduling examples
Interleaved schedule B:
[Timeline: an interleaving of T1 (A += 100, B -= 100) and T2 (A *= 1.06, B *= 1.06)]
This schedule is different than any serial order! We say that it is not serializable.
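The balances on the scheduling slides can be reproduced with a few lines of Python. This is only a sketch: the exact step order of the two interleaved schedules is not recoverable from the slides, so the orders below are assumptions chosen to be consistent with the balances shown, and the `run` helper is hypothetical.

```python
def run(schedule, a=50.0, b=200.0):
    """Apply a list of (txn, op) steps, in order, to balances A and B."""
    ops = {
        "A+=100": lambda a, b: (a + 100, b),
        "B-=100": lambda a, b: (a, b - 100),
        "A*=1.06": lambda a, b: (a * 1.06, b),
        "B*=1.06": lambda a, b: (a, b * 1.06),
    }
    for _txn, op in schedule:
        a, b = ops[op](a, b)
    # Round to cents to hide floating-point noise.
    return round(a, 2), round(b, 2)


serial_t1_t2 = [("T1", "A+=100"), ("T1", "B-=100"),
                ("T2", "A*=1.06"), ("T2", "B*=1.06")]
serial_t2_t1 = [("T2", "A*=1.06"), ("T2", "B*=1.06"),
                ("T1", "A+=100"), ("T1", "B-=100")]
# Assumed interleavings (consistent with the slide's results).
interleaved_1 = [("T1", "A+=100"), ("T2", "A*=1.06"),
                 ("T1", "B-=100"), ("T2", "B*=1.06")]
interleaved_2 = [("T1", "A+=100"), ("T2", "A*=1.06"),
                 ("T2", "B*=1.06"), ("T1", "B-=100")]

serial_results = {run(serial_t1_t2), run(serial_t2_t1)}
schedule_1_matches_a_serial = run(interleaved_1) in serial_results
schedule_2_matches_a_serial = run(interleaved_2) in serial_results
```

Matching a serial result for one starting balance is only a necessary check; serializability requires the match to hold for every database state.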
Scheduling Definitions
• A serial schedule is one that does not interleave the actions of different transactions
• A serializable schedule is a schedule that is equivalent to some serial schedule.
• Schedule 1 and Schedule 2 are equivalent if, for any database state, the effect on the DB of executing Schedule 1 is identical to the effect of executing Schedule 2
Serializable?
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06

Serial schedules:
              A               B
T1, T2        1.06*(A+100)    1.06*(B-100)
T2, T1        1.06*A+100      1.06*B-100

Interleaved schedule:
              A               B
              1.06*(A+100)    1.06*(B-100)

Same as a serial schedule for all possible values of A, B = serializable
Serializable?
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06

Serial schedules:
              A               B
T1, T2        1.06*(A+100)    1.06*(B-100)
T2, T1        1.06*A+100      1.06*B-100

Interleaved schedule:
              A               B
              1.06*(A+100)    1.06*B-100

Not equivalent to any serial schedule = not serializable
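The comparison can be written out as a short worked derivation. Here A and B are the starting balances and primes denote final values (the prime notation is an assumption added for illustration; the block uses the amsmath `align*` environment).

```latex
\begin{align*}
\text{Interleaved:}\quad
  & A' = 1.06\,(A + 100), \qquad B' = 1.06\,B - 100 \\
\text{Serial } T_1, T_2:\quad
  & B' = 1.06\,(B - 100) = 1.06\,B - 106 \;\neq\; 1.06\,B - 100 \\
\text{Serial } T_2, T_1:\quad
  & A' = 1.06\,A + 100 \;\neq\; 1.06\,A + 106 = 1.06\,(A + 100)
\end{align*}
```

The interleaved result disagrees with serial T1, T2 on B and with serial T2, T1 on A, for every starting value, so it matches neither serial order.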
What else can go wrong with interleaving?
• Various anomalies which break isolation/serializability
  • Often referred to by name…
• Occur because of/with certain “conflicts” between interleaved TXNs
The DBMS's view of the schedule
User's view:
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06
DBMS's view:
T1: R(A), W(A), R(B), W(B)
T2: R(A), W(A), R(B), W(B)
Each action in the TXNs reads a value from global memory and then writes one back to it
Scheduling order matters!
Conflict Types
Two actions conflict if they are part of different TXNs, involve the same variable, and at least one of them is a write
• Thus, there are three types of conflicts:
  • Read-Write conflicts (RW)
  • Write-Read conflicts (WR)
  • Write-Write conflicts (WW)
Why no “RR Conflict”?
Interleaving anomalies occur with/because of these conflicts between TXNs (but these conflicts can occur without causing anomalies!)
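The definition of a conflict translates directly into a predicate. The sketch below is illustrative only: the `Action` tuple and both function names are assumptions, not part of the lecture.

```python
from typing import NamedTuple, Optional


class Action(NamedTuple):
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "R" for read or "W" for write
    var: str   # variable accessed, e.g. "A"


def conflicts(x: Action, y: Action) -> bool:
    """Different TXNs, same variable, and at least one write."""
    return x.txn != y.txn and x.var == y.var and "W" in (x.kind, y.kind)


def conflict_type(first: Action, second: Action) -> Optional[str]:
    """Name the conflict by action order ("RW", "WR", "WW"), or None."""
    if not conflicts(first, second):
        return None
    return first.kind + second.kind


read_read = conflict_type(Action("T1", "R", "A"), Action("T2", "R", "A"))
write_read = conflict_type(Action("T1", "W", "A"), Action("T2", "R", "A"))
```

Two reads never conflict because swapping them cannot change what either transaction observes, which is why there is no "RR" case.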
Classic Anomalies with Interleaved Execution
“Unrepeatable read”:
1. T1 reads some data from A
2. T2 writes to A
3. Then, T1 reads from A again and now gets a different/inconsistent value
Example (in time order): T1: R(A); T2: R(A), W(A), C; T1: R(A)
Occurring with/because of a RW conflict
Classic Anomalies with Interleaved Execution
“Dirty read” / Reading uncommitted data:
1. T1 writes some data to A
2. T2 reads from A, then writes back to A & commits
3. T1 then aborts - now T2's result is based on an obsolete/inconsistent value
Example (in time order): T1: W(A); T2: R(A), W(A), C; T1: A (abort)
Occurring with/because of a WR conflict
Classic Anomalies with Interleaved Execution
“Lost update”:
1. T1 blind writes some data to A
2. T2 blind writes to A and B
3. T1 then blind writes to B; now we have T2's value for A and T1's value for B - not equivalent to any serial schedule!
Example (in time order): T1: W(A); T2: W(A), W(B), C; T1: W(B), C
Occurring because of a WW conflict
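The lost-update steps can be replayed with a toy model in which every blind write stores the name of the transaction that wrote it. The `apply_writes` helper and this encoding of values are assumptions for illustration.

```python
def apply_writes(schedule):
    """Replay blind writes in order; each step is (txn, variable).

    The value stored is simply the writer's name, so the final state
    shows whose write survived for each variable.
    """
    db = {}
    for txn, var in schedule:
        db[var] = txn
    return db


# T1: W(A); T2: W(A), W(B); T1: W(B)
interleaved = [("T1", "A"), ("T2", "A"), ("T2", "B"), ("T1", "B")]
serial_t1_t2 = [("T1", "A"), ("T1", "B"), ("T2", "A"), ("T2", "B")]
serial_t2_t1 = [("T2", "A"), ("T2", "B"), ("T1", "A"), ("T1", "B")]

result = apply_writes(interleaved)
matches_some_serial = result in (
    apply_writes(serial_t1_t2),
    apply_writes(serial_t2_t1),
)
```

In either serial order the last transaction's values win for both A and B; the interleaving leaves a mix, which no serial order can produce.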
Schedules
Serial Schedule:
[Figure: T1 and T2 each perform R(A), W(A), R(B), W(B), with no interleaving]
Serializable Schedule:
[Figure: T1 and T2 each perform R(A), W(A), R(B), W(B), interleaved]
Conflict Serializable Schedule
Serial Schedules ⊂ Conflict Serializable Schedules ⊂ Serializable Schedules ⊂ All Schedules
Conflicts
• Two actions conflict if all the conditions hold:
  • i) they are part of different TXNs,
  • ii) they involve the same variable,
  • iii) at least one of them is a write
[Figure: a schedule in which T1 and T2 each perform R(A), W(A), R(B), W(B), with one conflict marked]
Exercise
• Two actions conflict if all the conditions hold:
  • i) they are part of different TXNs,
  • ii) they involve the same variable,
  • iii) at least one of them is a write
Find all the other conflicts
[Figure: a schedule in which T1 and T2 each perform R(A), W(A), R(B), W(B)]
Exercise: Answer
• Two actions conflict if all the conditions hold:
  • i) they are part of different TXNs,
  • ii) they involve the same variable,
  • iii) at least one of them is a write
[Figure: the same schedule of T1 and T2 with all conflicting pairs of actions marked]
Conflict serializable
• Schedule S is conflict serializable if S is conflict equivalent to some serial schedule
• Two schedules are conflict equivalent if:
  • They involve the same actions of the same TXNs
  • Every pair of conflicting actions of two TXNs is ordered in the same way
Are they conflict equivalent?
[Figure: two schedules of T1 and T2; in the first, T2 performs R(A), W(A), R(B) but no W(B); in the second, both TXNs perform R(A), W(A), R(B), W(B)]
NO
• “They involve the same actions of the same TXNs” does not hold
Are they conflict equivalent?
[Figure: two schedules in which T1 and T2 each perform R(A), W(A), R(B), W(B), with some conflicting actions in a different order]
NO
• “Every pair of conflicting actions of two TXNs is ordered in the same way” does not hold
Are they conflict equivalent?
[Figure: two schedules in which T1 and T2 each perform R(A), W(A), R(B), W(B), with all conflicting actions in the same order]
YES
• They involve the same actions of the same TXNs
• Every pair of conflicting actions of two TXNs is ordered in the same way
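The two conditions for conflict equivalence can be checked mechanically. The sketch below encodes a schedule as a list of (txn, kind, variable) tuples in execution order; the encoding, the helper names, and the three sample schedules are assumptions, and the check assumes each transaction performs any given action at most once.

```python
from itertools import combinations


def per_txn(schedule):
    """Group actions by transaction, preserving each TXN's own order."""
    out = {}
    for txn, kind, var in schedule:
        out.setdefault(txn, []).append((kind, var))
    return out


def conflicting_pairs(schedule):
    """Set of (earlier action, later action) pairs that conflict."""
    pairs = set()
    for x, y in combinations(schedule, 2):  # x comes before y
        if x[0] != y[0] and x[2] == y[2] and "W" in (x[1], y[1]):
            pairs.add((x, y))
    return pairs


def conflict_equivalent(s1, s2):
    """Same actions of the same TXNs, and same order for every conflict."""
    return per_txn(s1) == per_txn(s2) and (
        conflicting_pairs(s1) == conflicting_pairs(s2)
    )


def steps(txn, variables):
    """R then W on each variable, e.g. R(A) W(A) R(B) W(B)."""
    return [(txn, kind, v) for v in variables for kind in ("R", "W")]


serial = steps("T1", "AB") + steps("T2", "AB")
# T1 finishes with A, then T2 works on A, then T1 on B, then T2 on B.
good = steps("T1", "A") + steps("T2", "A") + steps("T1", "B") + steps("T2", "B")
# Same, but T2 reaches B before T1 does.
bad = steps("T1", "A") + steps("T2", "A") + steps("T2", "B") + steps("T1", "B")

good_is_equivalent = conflict_equivalent(good, serial)
bad_is_equivalent = conflict_equivalent(bad, serial)
```

Here `good` keeps T1 ahead of T2 on both A and B, so it is conflict equivalent to the serial schedule T1, T2, while `bad` reverses the order on B.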
The Conflict Graph
• Consider a graph where the nodes are TXNs, and there is an edge from Ti → Tj if any actions in Ti precede and conflict with any actions in Tj
[Figure: a schedule of T1 and T2, each performing R(A), W(A), R(B), W(B), and its conflict graph on nodes T1 and T2]
Theorem: A schedule is conflict serializable if and only if its conflict graph is acyclic
Is this schedule conflict serializable?
1. Find all conflicts
2. Model the schedule as a conflict graph
3. Check whether the graph has a cycle
  • Yes → not conflict serializable
  • No → conflict serializable
[Figure: a schedule of T1 and T2, each performing R(A), W(A), R(B), W(B), and its conflict graph on nodes T1 and T2]
Answer: NO
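The three-step test can be sketched in a few lines: build the conflict graph, then look for a cycle with a depth-first search. The schedule encoding as (txn, kind, variable) tuples, the function names, and the two sample schedules are assumptions for illustration.

```python
def conflict_graph(schedule):
    """Edges (Ti, Tj) where an action of Ti precedes and conflicts with one of Tj."""
    edges = set()
    for i, (ti, ki, vi) in enumerate(schedule):
        for tj, kj, vj in schedule[i + 1:]:
            if ti != tj and vi == vj and "W" in (ki, kj):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search with three colors to detect a directed cycle."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY  # node is on the current DFS path
        for nxt in graph[node]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and visit(nxt)):
                return True  # reached a node already on the path
        color[node] = BLACK  # fully explored, no cycle through it
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(conflict_graph(schedule))


# T1 precedes T2 on both A and B: only the edge T1 -> T2.
ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
# T1 precedes T2 on A, but T2 precedes T1 on B: edges in both directions.
not_ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
          ("T2", "R", "B"), ("T2", "W", "B"), ("T1", "R", "B"), ("T1", "W", "B")]

ok_result = is_conflict_serializable(ok)
not_ok_result = is_conflict_serializable(not_ok)
```

When the graph is acyclic, any topological order of its nodes gives a serial schedule that is conflict equivalent to the input.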
Isolation Levels
• Transactions in SQLite are serializable (https://www.sqlite.org/isolation.html)
• Isolation Levels in SQL Server (https://docs.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-2017)
Check out CMPT 454
Concurrency Control Algorithms
• Locking
  • 2-phase locking
• Timestamp Ordering
  • Multi-version concurrency control (MVCC)
Check out CMPT 454
Summary
• Transaction Basics
  • Definition
  • Motivation for Transactions
  • ACID Properties
• Concurrency Control
  • Scheduling
  • Anomaly Types
  • Conflict Serializability
Acknowledgements
• Some lecture slides were copied from or inspired by the following course materials
  • “W4111: Introduction to databases” by Eugene Wu at Columbia University
  • “CSE344: Introduction to Data Management” by Dan Suciu at University of Washington
  • “CMPT354: Database System I” by John Edgar at Simon Fraser University
  • “CS186: Introduction to Database Systems” by Joe Hellerstein at UC Berkeley
  • “CS145: Introduction to Databases” by Peter Bailis at Stanford
  • “CS348: Introduction to Database Management” by Grant Weddell at University of Waterloo