

CS 133: Databases
Fall 2019, Lec 24 – 12/05
Parallel and Distributed DBMSs
Prof. Beth Trushkowsky

Warm-up Exercise
(See exercise sheet. You can start before class.)
Answer: performance, availability, and more storage space to fit the data.

Goals for Today
•  Discuss distributing data and workload for increased performance in a DBMS
•  Reason about DBMS concepts in a parallel and distributed setting
   –  How to achieve distributed ACID transactions
   –  Data consistency across copies
•  Understand why some of the disadvantages of distributed databases have led to some relaxation of consistency

Some Parallelism Terminology
•  Throughput
   –  Amount of work done per unit time
•  Latency (response time)
   –  Time to complete one unit of work
•  Speed-up
   –  More resources means less time for a given unit of work → do more units of work in the same time
•  Scale-up
   –  If resources are increased in proportion to the increase in units of work, time per unit is constant.

[Figure: two plots against degree of parallelism — Xact/sec (throughput) rising along the ideal speed-up line, and sec/Xact (response time) flat along the ideal scale-up line.]

Resource contention impacts scale-up/speed-up.
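
As a worked illustration (not from the slides; the timings are made up), speed-up and scale-up can be computed from measured times:

    # Speed-up: same workload, more resources (hypothetical numbers).
    t_1cpu = 100.0                      # seconds for a fixed workload on 1 CPU
    t_4cpu = 30.0                       # seconds for the same workload on 4 CPUs
    print(t_1cpu / t_4cpu)              # 3.33; ideal speed-up would be 4.0

    # Scale-up: resources and workload grow in proportion.
    t_small = 100.0                     # 1 CPU, 1x workload
    t_big = 120.0                       # 4 CPUs, 4x workload
    print(t_small / t_big)              # 0.83; ideal scale-up would be 1.0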


Parallel DBMS Architectures
•  Multiple processors (CPUs) can do work in parallel
   –  How do they communicate about what work to do?
   –  Three main parallel DB architectures:

[Figure: Shared Memory (aka shared everything) — all CPUs share RAM and all disks. Shared Disk — each CPU has its own RAM; all CPUs share the disks. Shared Nothing — each CPU has its own RAM and its own disk.]

Data Partitioning
Partitioning (aka sharding, aka horizontally fragmenting) a table splits its tuples across sites, e.g., into ranges A...E, F...J, K...N, O...S, T...Z:
•  Range partitioning: good for equijoins, range queries, and group-by
•  Hash partitioning: good for equijoins
•  Round robin: good for spreading load
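
A minimal sketch (not from the slides) of the three partitioning functions, assuming string keys and n partitions:

    # Illustrative partitioning functions; names and layout are assumptions.
    def range_partition(key, boundaries):
        # boundaries: sorted upper bounds, e.g. ["E", "J", "N", "S"] -> 5 partitions
        for i, bound in enumerate(boundaries):
            if key <= bound:
                return i
        return len(boundaries)

    def hash_partition(key, n):
        return hash(key) % n            # same key always lands on the same partition

    def round_robin(counter, n):
        return counter % n              # spreads inserts evenly; caller increments counter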

Different Types of DBMS Parallelism
•  Inter-query parallelism: different queries run on different nodes
•  Intra-operator "partitioned" parallelism
   –  Multiple machines working together to execute an operator (e.g., scan, sort, join)
   –  Machines work on disjoint partitions of the data
•  Parallelizing a relational operator: merge and split
   –  Merge streams of output to serve as input to an operator
   –  Split output of an operator to be processed in parallel

[Figure: copies of the same sequential operator code running in parallel on different data partitions, connected by merge and split steps.]

Different Types of DBMS Parallelism
•  Inter-operator "pipelined" parallelism
   –  Each relational operator may run concurrently on a different machine
   –  Output of the first operator is consumed on-the-fly as input to the second operator

[Figure: two sequential operators running on different machines, connected in a pipeline.]

Could be limited by blocking operators such as sort or aggregation.


Exercise 2
•  (a) inter-query parallelism
•  (b) intra-operator "partitioned" parallelism

Distributed DBMS (Shared Nothing)
•  Data is stored at several sites (geo-distributed), each managed by a DBMS that can run independently
•  Distributed Data Independence: users should not have to know where data is located
   –  Note: the catalog needs to keep track of where data is
•  Distributed Transaction Atomicity: users should be able to write Xacts accessing multiple sites just like local Xacts

Extends the Physical and Logical Data Independence principles.

Distributed Query Processing: Joins
Setup: Sailors (500 pages) stored at LONDON, Reserves (1000 pages) stored at PARIS; join on sid.
•  Approach 1 -- Fetch as Needed: page NLJ, Sailors as outer (query submitted at London):
   –  D is the cost to read/write a page; S is the cost to ship a page
   –  Cost: 500D + 500*1000(D+S) = 500,500D + 500,000S
   –  If the query is not submitted at London, must add the cost of shipping the result to the query site
•  Approach 2 -- Ship to One Site: ship Reserves to London
   –  Cost: 1000(D+S) + 500D + 500*1000D = 501,500D + 1000S

Could also consider other single-site join methods.
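
A small sketch (illustrative only) that reproduces both cost formulas as coefficients of D and S:

    SAILORS, RESERVES = 500, 1000       # pages, per the slide

    # Approach 1: fetch as needed (page NLJ, Sailors as outer).
    d1 = SAILORS + SAILORS * RESERVES   # read Sailors once; read each Reserves page per outer page
    s1 = SAILORS * RESERVES             # ship each Reserves page per outer page
    print(f"{d1}D + {s1}S")             # 500500D + 500000S

    # Approach 2: ship Reserves to London, then join there.
    d2 = RESERVES + SAILORS + SAILORS * RESERVES  # ship-and-store Reserves + local page NLJ
    s2 = RESERVES                                 # Reserves crosses the network once
    print(f"{d2}D + {s2}S")             # 501500D + 1000S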

Semijoin
•  Why ship all of Reserves to London?
   –  Some of these tuples may not even end up being joined, so wasted communication cost
   –  Idea: only ship the Reserves tuples that will match Sailors tuples
•  Bottom line: trade off the cost of computing and shipping the projection against the cost of shipping the full Reserves relation.
•  Note: especially useful if there is a selection on Sailors and the join selectivity is also high

[Figure: Sailors (500 pages) at LONDON, Reserves (1000 pages) at PARIS. Steps: (1) ship π_sid(Sailors) to Paris; (2) compute the semijoin Reserves ⋉ Sailors and ship it to London; (3) finish the join at London.]
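
A minimal sketch of those three steps (illustrative; the tuples are made up):

    sailors = [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bo"}]   # at London
    reserves = [{"sid": 1, "bid": 101}, {"sid": 3, "bid": 102}]       # at Paris

    # (1) Project sid at London and ship just those values to Paris.
    sids = {s["sid"] for s in sailors}

    # (2) At Paris, keep only the Reserves tuples that will join (Reserves ⋉ Sailors).
    reduced = [r for r in reserves if r["sid"] in sids]

    # (3) Ship the reduced relation to London and finish the join there.
    result = [{**s, **r} for r in reduced for s in sailors if s["sid"] == r["sid"]]
    print(result)                       # [{'sid': 1, 'name': 'Ann', 'bid': 101}]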


An aside: Bloom Filter
•  A Bloom filter is a bit vector used to quickly determine whether an element belongs to a set
•  Hash elements to one of k buckets

Refinement: Bloom join
•  Idea: rather than shipping the join column, ship a more compact data structure that captures (almost) the same info
   –  Bloom filter bit vector
   –  Bit vector is cheaper to ship, almost as effective (false positives possible)
•  Hash Sailors.sid values into range [0, k-1]
   –  If a tuple hashes to slot i, set bit i to 1
•  Hash Reserves tuples into the same range [0, k-1]
   –  Discard tuples that hash to a 0 in the Sailors bit vector

[Figure: Sailors (500 pages) at LONDON, Reserves (1000 pages) at PARIS. Steps: (1) ship the Sailors.sid bit vector to Paris; (2) reduce Reserves; (3) finish the join at London.]
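
A minimal Bloom-join sketch (illustrative; a real Bloom filter typically uses several hash functions, this uses one):

    sailors = [{"sid": 1}, {"sid": 2}]                              # at London
    reserves = [{"sid": 1, "bid": 101}, {"sid": 3, "bid": 102}]     # at Paris
    K = 16                              # bit-vector size; more bits -> fewer false positives

    # (1) Build the Sailors.sid bit vector at London and ship it.
    bits = [0] * K
    for s in sailors:
        bits[hash(s["sid"]) % K] = 1    # a tuple hashing to slot i sets bit i

    # (2) At Paris, discard tuples that hash to a 0 bit. False positives are
    # possible (a non-matching sid can hash to a 1 bit), so step (3), the
    # real join at London, still re-checks sid equality.
    reduced = [r for r in reserves if bits[hash(r["sid"]) % K]]
    print(reduced)                      # keeps at least the matching sid=1 tuple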

Query Optimization
•  New considerations for the cost-based approach
   –  Communication costs
   –  New distributed join methods
•  Also optimizing for query response time?
   –  Might want to consider a larger space of plans

[Figure: two plans for joining R, S, T, U on a=b, b=c, c=d — a left-deep tree, which may cost less, vs. a bushy tree, which may allow parallelism that yields overall lower response time.]

Exercise 3
•  Concerns: correctness, deadlock, performance
•  Some options:
   –  All lock requests go through a central location
   –  Lock requests distributed; a request is sent to the site that holds the desired data


Distributed Transactions
•  With data at distributed sites, a transaction may operate on data at multiple sites
   –  The xact is broken down into sub-xacts that execute at each site
•  Example: read and update inventory at four sites (A, B, C, D) with horizontal partitions of the data
   –  T: R(A), R(B), R(C), R(D), W(A), W(B), W(C), W(D), commit
•  How do we guarantee atomicity??
   –  Need a commit protocol (a type of consensus protocol)
   –  A log is maintained at each site, as in a centralized DBMS, and commit protocol actions are additionally logged

Where should locks be managed?

Two-Phase Commit (2PC)
•  The site at which the xact originates is the coordinator; the other sites at which it executes are subordinates
•  Suppose the xact wants to commit (normal execution):

[Figure: message flow — (1) coordinator sends PREPARE; (2a) each subordinate logs no or ready and (2b) replies YES or NO; (3a) if all YES the coordinator logs commit, else logs abort, and (3b) sends COMMIT or ABORT; (4a) each subordinate logs abort or commit and (4b) sends ACK; (5) coordinator logs end.]

2PC: Steps
•  When a Xact wants to commit:
① Coordinator sends a prepare msg to each subordinate.
② Subordinate force-writes a no or ready log record and then sends a no or yes msg to the coordinator.
③ If the coordinator gets unanimous yes votes, it force-writes a commit log record and sends a commit msg to all subs. Else, it force-writes an abort log rec and sends an abort msg.
④ Subordinates force-write an abort/commit log rec based on the msg they get, then send an ack msg to the coordinator.
⑤ Coordinator writes an end log rec after getting all acks.
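
A coordinator-side sketch of these five steps (illustrative; the send/ack and logging interfaces are assumptions, not a real DBMS API):

    def run_2pc(subordinates, log, xid):
        # (1) Send prepare to every subordinate; (2b) collect the votes.
        votes = [sub.send_prepare(xid) for sub in subordinates]

        # (3) Unanimous yes -> force-write commit; otherwise force-write abort.
        decision = "commit" if all(v == "yes" for v in votes) else "abort"
        log.force_write(decision, xid)

        # (3b) Send the decision; (4b) wait for every ack.
        for sub in subordinates:
            sub.send_decision(decision, xid)
        for sub in subordinates:
            sub.wait_for_ack(xid)

        # (5) All acks received: write the end record.
        log.write("end", xid)
        return decision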

Site and Link Failures
•  If the coordinator detects that a subordinate failed, e.g., after a timeout
   –  Before “yes”? → assume it was “abort”
   –  After “yes”? → continue as normal
•  If the coordinator for Xact T fails, subordinates who have voted yes cannot decide whether to commit or abort T until the coordinator recovers
   –  Xact T is blocked

Exercise 4: when the failed subordinate site wakes up, can it tell if a global transaction committed or aborted?


Exercise 4
•  The site that wakes up should interpret its log:
   –  (a) T committed → it should REDO actions for T
   –  (b) T aborted → it should UNDO actions for T
   –  (c) Can't tell... needs to consult the coordinator about what the coordinator decided
   –  (d) T aborted → it should write abort, and UNDO
      •  Since it never even wrote ready, the coordinator couldn't possibly have decided to commit (since it would have waited for the site to say yes)
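
That decision procedure as a sketch (illustrative; the log and coordinator interfaces are assumptions):

    def recover(log, coordinator, xid):
        # Interpret the last log record this site wrote for xact xid.
        last = log.last_record_for(xid)       # assumed helper
        if last == "commit":
            return "REDO"                     # (a) T committed
        if last == "abort":
            return "UNDO"                     # (b) T aborted
        if last == "ready":
            # (c) Voted yes but saw no decision: only the coordinator knows.
            return "REDO" if coordinator.ask(xid) == "commit" else "UNDO"
        # (d) Never wrote ready, so the coordinator cannot have committed.
        log.force_write("abort", xid)
        return "UNDO"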

Data Replication
•  Replication: keep copies of data at different sites
•  Benefits
   –  Gives increased availability
   –  Faster query evaluation
•  Flavors
   –  Synchronous (eager) vs. asynchronous (lazy)
      •  Vary in how current copies are (i.e., how consistent they are)
•  Can be used in addition to data partitioning
   –  Full replication: copy of all data at every site (vs. partial)

Locking: on a primary copy, or fully distributed.

Updating Distributed Data
•  Synchronous (Eager) Replication: the set of copies of a modified relation (not necessarily all copies!) must be updated before the modifying xact commits
   –  Exclusive locks on all the copies that are modified
   –  Users/apps do not need to know data location(s)
•  Asynchronous (Lazy) Replication: copies of a modified relation are only periodically updated
   –  Different copies may get out of sync in the meantime
   –  Users/applications must be aware of data location(s)

Synchronous Replication: Majority
•  The majority technique can guarantee data consistency:
   –  A xact must write a majority of copies to modify an object
   –  Each copy has a version number for the object
   –  A xact must read enough copies to be sure of seeing at least one most recent copy
•  Example: 6 copies of data
   –  Written: 4 nodes; Read: 3 nodes
•  Impact on performance of READ queries??
   –  Could use a Read-One-Write-All (ROWA) policy instead
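
The quorum arithmetic behind the example, as a sketch (illustrative): a read always overlaps the most recent write when W + R > N, and writes need a majority so two concurrent writes cannot both succeed on disjoint copies.

    def quorum_ok(n, w, r):
        # Reads intersect the last write, and writes intersect each other.
        return (w + r > n) and (2 * w > n)

    print(quorum_ok(6, 4, 3))           # True: the slide's example (N=6, W=4, R=3)
    print(quorum_ok(6, 3, 3))           # False: a read could miss the last write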