

Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores

Xiangyao Yu¹, George Bezerra¹, Andrew Pavlo²

Srinivas Devadas¹, Michael Stonebraker¹

¹ CSAIL, Massachusetts Institute of Technology

² Dept. of Computer Science, Carnegie Mellon University

Published in VLDB 2014

Presenter: Vaibhav Jain

Motivation (1)

• The era of single-core CPU speed-up is over.
• The number of cores on a chip is increasing exponentially.
  - Increase computation power by thread-level parallelism.
  - 1000-core chips are near…

Xeon Phi (up to 61 cores), Tilera (up to 100 cores)

Motivation (2)

• Is the DBMS ready to be scaled?
  - Most DBMSs still focus on single-threaded performance.
  - Existing works on multi-cores focus on small core counts.

Objective

• To evaluate transaction processing at 1000 cores.
• Focus on one scalability challenge: concurrency control.
• Discuss the bottlenecks and improvements needed.

Implementation

• Concurrency Control Schemes
• DBMS Test Bed

Concurrency Control Schemes

CC Scheme    Category                   Description
DL_DETECT    Two-Phase Locking (2PL)    2PL with deadlock detection
NO_WAIT      Two-Phase Locking (2PL)    2PL with non-waiting deadlock prevention
WAIT_DIE     Two-Phase Locking (2PL)    2PL with wait-and-die deadlock prevention
TIMESTAMP    Timestamp Ordering (T/O)   Basic T/O algorithm
MVCC         Timestamp Ordering (T/O)   Multi-version T/O
OCC          Timestamp Ordering (T/O)   Optimistic concurrency control
HSTORE       Partitioning               T/O with partition-level locking

Two-Phase Locking (1)

Two-Phase Locking (2)

• Lock conflict
  - DL_DETECT: always wait (deadlock detection).
  - NO_WAIT: always abort (deadlock prevention).
  - WAIT_DIE: wait if older, otherwise abort (deadlock prevention).
• Example systems
  - Ingres, Informix, IBM DB2, MS SQL Server, MySQL (InnoDB)
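The three lock-conflict rules can be written as one small decision function. This is a minimal sketch, not the testbed's code: the names `Scheme` and `on_lock_conflict` are hypothetical, and it assumes that a smaller timestamp means an older transaction.

```python
from enum import Enum


class Scheme(Enum):
    DL_DETECT = "DL_DETECT"
    NO_WAIT = "NO_WAIT"
    WAIT_DIE = "WAIT_DIE"


def on_lock_conflict(scheme: Scheme, requester_ts: int, holder_ts: int) -> str:
    """Return "wait" or "abort" for a transaction that requested a held lock.

    Assumption: a smaller timestamp means an older transaction.
    """
    if scheme is Scheme.DL_DETECT:
        # Always wait; a separate deadlock detector breaks cycles later.
        return "wait"
    if scheme is Scheme.NO_WAIT:
        # Never wait, so a wait-for cycle cannot form.
        return "abort"
    if scheme is Scheme.WAIT_DIE:
        # Only older transactions may wait on younger ones.
        return "wait" if requester_ts < holder_ts else "abort"
    raise ValueError(f"unknown scheme: {scheme}")
```

Under WAIT_DIE, waits only ever go from older to younger transactions, which is why no cycle of waiters can appear.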


Timestamp Ordering (T/O) (1)

Each transaction has a unique timestamp indicating the serial order.

1. TIMESTAMP (Basic Timestamp Ordering)
   • R/W request rejected if tx timestamp < timestamp of last write.
2. MVCC (Multi-Version Concurrency Control)
   • Every write op creates a new timestamped version.
   • For a read op, the DBMS decides which version it accesses.
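The two rules above can be sketched on a single record. This is an illustrative sketch only, with hypothetical class names (`BasicTORecord`, `MVCCRecord`, `Abort`). The extra write check against the last read timestamp follows the textbook basic T/O rule rather than the slide, and the MVCC part shows version selection for reads only, leaving out the write-side checks.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


class Abort(Exception):
    """Raised when a request arrives too late in timestamp order."""


@dataclass
class BasicTORecord:
    value: Any = None
    rts: int = 0  # timestamp of the latest read
    wts: int = 0  # timestamp of the latest write

    def read(self, ts: int) -> Any:
        if ts < self.wts:
            raise Abort("read is older than the last write")
        self.rts = max(self.rts, ts)
        return self.value

    def write(self, ts: int, value: Any) -> None:
        # Textbook rule (assumption): also reject if a later read happened.
        if ts < self.wts or ts < self.rts:
            raise Abort("write is older than the last read or write")
        self.value, self.wts = value, ts


@dataclass
class MVCCRecord:
    # (write timestamp, value) pairs kept sorted by write timestamp
    versions: List[Tuple[int, Any]] = field(default_factory=list)

    def write(self, ts: int, value: Any) -> None:
        # Every write creates a new timestamped version.
        self.versions.append((ts, value))
        self.versions.sort(key=lambda v: v[0])

    def read(self, ts: int) -> Any:
        # Pick the newest version written at or before the reader's timestamp.
        visible = [v for v in self.versions if v[0] <= ts]
        if not visible:
            raise Abort("no version is visible at this timestamp")
        return visible[-1][1]
```

With MVCC, a reader with timestamp 7 still sees the version written at 5 even after a version at 10 exists, whereas basic T/O would have to reject a late read.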

Timestamp Ordering (T/O) (2)

3. OCC (Optimistic Concurrency Control)
   • Private workspace of each transaction.
   • At commit time, if any overlap, tx is aborted and restarted.
   • Advantage: short contention period.

Example systems: Oracle, Postgres, MySQL (InnoDB), SAP HANA, MemSQL, MS Hekaton
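A minimal single-threaded sketch of the OCC idea follows. Each transaction buffers its writes privately and validates its reads at commit time. The `Store` and `Txn` classes are hypothetical, and per-key version counters are just one common way to detect overlap. A real system would also run validation and write-back inside a short critical section.

```python
class Store:
    """Shared database state: key -> (version, value)."""

    def __init__(self):
        self.data = {}


class Txn:
    def __init__(self, store: Store):
        self.store = store
        self.read_versions = {}  # key -> version seen at first read
        self.writes = {}         # private workspace, invisible to others

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        version, value = self.store.data.get(key, (0, None))
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self) -> bool:
        # Validation: abort if anything we read was overwritten meanwhile.
        for key, seen in self.read_versions.items():
            current, _ = self.store.data.get(key, (0, None))
            if current != seen:
                return False  # caller restarts the transaction
        # Write phase: publish the private workspace.
        for key, value in self.writes.items():
            version, _ = self.store.data.get(key, (0, None))
            self.store.data[key] = (version + 1, value)
        return True
```

If two transactions read and then write the same key, the first to commit succeeds and the second fails validation.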


H-Store

• Database divided into disjoint memory subsets called partitions.
• Each partition protected by locks.
• Tx acquires locks to all partitions it needs to access.
• DBMS assigns it a timestamp and adds it to lock queues.
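The partition-level locking described above can be sketched with one timestamp-ordered queue per partition. This is a simplified illustration under stated assumptions: the `PartitionedDB` class and its method names are hypothetical, and a transaction is treated as holding its partition locks once it is at the head of every queue it joined.

```python
import heapq


class PartitionedDB:
    def __init__(self, num_partitions: int):
        # One min-heap of (timestamp, txn_id) per partition.
        self.queues = [[] for _ in range(num_partitions)]
        self.next_ts = 0

    def begin(self, txn_id, partitions):
        """Assign a timestamp and enqueue the txn on each partition it needs."""
        self.next_ts += 1
        ts = self.next_ts
        for p in partitions:
            heapq.heappush(self.queues[p], (ts, txn_id))
        return ts

    def can_run(self, txn_id, partitions) -> bool:
        """True once the txn is the oldest waiter on all of its partitions."""
        return all(
            bool(self.queues[p]) and self.queues[p][0][1] == txn_id
            for p in partitions
        )

    def finish(self, txn_id, partitions) -> None:
        """Release the partitions so the next txn in each queue can run."""
        for p in partitions:
            if not self.queues[p] or self.queues[p][0][1] != txn_id:
                raise RuntimeError("txn does not hold this partition")
            heapq.heappop(self.queues[p])
```

A multi-partition transaction has to reach the head of several queues at once, so it blocks every partition it touches while it waits.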

DBMS Test Bed (1)

Graphite: CPU simulator, scales up to 1024 cores.
• Application threads mapped to simulated core threads.
• Simulated threads mapped to multiple processes on host machines.

DBMS Test Bed (2)

• Implemented a lightweight pthread-based DBMS.
• Allows swapping between different concurrency control schemes.
• Ensures there are no bottlenecks other than concurrency control.
• Reports transaction statistics.

General Optimizations

1. Memory Allocation: custom malloc, resizable memory pool for each thread.
2. Lock Table: instead of a centralized lock table, per-tuple locks.
3. Mutexes: avoid mutexes on the critical path.
   - For 2PL: centralized deadlock detector.
   - For T/O: allocating unique timestamps.

Scalable 2PL

1. Deadlock Detection
   - Making the deadlock detector lock-free by keeping a local wait-for graph.
   - Thread searches for cycles in the partial wait-for graph.
2. Lock Thrashing
   - Holding locks until commit => bottleneck in concurrent Txs.
   - Timeout threshold: abort Tx if wait time exceeds timeout.
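Both ideas can be sketched in a few lines. This is an illustration, not the testbed's implementation: `find_deadlock` and `should_abort_on_timeout` are hypothetical names. The `waits_for` dictionary stands in for the per-thread wait-for edges that a detecting thread would read without taking locks.

```python
from typing import Dict, Hashable, List, Optional, Set


def find_deadlock(waits_for: Dict[Hashable, Set[Hashable]]) -> Optional[List[Hashable]]:
    """Return one cycle of transactions in the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    path = []

    def dfs(node):
        color[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is on the current path, so this is a cycle.
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = dfs(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for start in list(waits_for):
        if color.get(start, WHITE) == WHITE:
            cycle = dfs(start)
            if cycle:
                return cycle
    return None


def should_abort_on_timeout(waited: float, timeout: float) -> bool:
    """Lock-thrashing guard: give up once the wait exceeds the threshold."""
    return waited > timeout
```

For example, `find_deadlock({"A": {"B"}, "B": {"C"}, "C": {"A"}})` reports a cycle through A, B, and C. One of them would then be chosen as the victim and aborted.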

Scalable T/O

1. Timestamp Allocation
   a) Batched atomic addition
      - Manager returns multiple timestamps for a request.
   b) CPU clocks
      - Read logical clock of core, concatenate with thread id.
      - Requires synchronized clocks.
   c) Hardware counters
      - Physically located at center of CPU.
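The first two allocation methods can be sketched as follows. The names `BatchedAllocator`, `clock_timestamp`, and `THREAD_ID_BITS` are hypothetical. A `threading.Lock` stands in for the hardware atomic-add instruction, which Python does not expose. The 10-bit thread id is an assumption chosen to cover 1024 threads.

```python
import threading


class BatchedAllocator:
    """Hands out a whole range of timestamps per request."""

    def __init__(self, batch_size: int = 8):
        self._lock = threading.Lock()  # stand-in for an atomic add
        self._next = 0
        self.batch_size = batch_size

    def next_batch(self) -> range:
        with self._lock:
            start = self._next
            self._next += self.batch_size
        # The caller consumes these locally without contacting the manager.
        return range(start, start + self.batch_size)


THREAD_ID_BITS = 10  # assumption: enough for 1024 threads


def clock_timestamp(clock_value: int, thread_id: int) -> int:
    """Concatenate a core's clock reading with the thread id.

    Timestamps are unique because thread ids differ, and they are ordered
    by clock value first, which is why the clocks must be synchronized.
    """
    if not 0 <= thread_id < (1 << THREAD_ID_BITS):
        raise ValueError("thread_id out of range")
    return (clock_value << THREAD_ID_BITS) | thread_id
```

Batching cuts the number of trips to the shared counter by a factor of `batch_size`. The clock-based method avoids the shared counter entirely.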

Evaluation

Read-Only Workload

• 2PL schemes are scalable for read-only benchmarks.
• Timestamp allocation limits scalability.
• Memory copy hurts performance.

Write Intensive (Medium Contention)

• NO_WAIT and WAIT_DIE scale better than the others.
• DL_DETECT is inhibited by lock thrashing.

Write Intensive (High Contention)

• Scaling stops at a small core count (64).
• NO_WAIT has good performance but falls due to thrashing.
• OCC wins at 1000 cores, as one Tx always commits.

More Analysis

1. Short transactions => low lock contention.
   Longer transactions => timestamp allocation is not a bottleneck.
2. More read transactions => better throughput.
3. Multi-partition transactions => the H-Store scheme performs badly.
   Partitioned workloads => H-Store is the best algorithm.

Bottlenecks Summary

[Table: concurrency control schemes (DL_DETECT, NO_WAIT, WAIT_DIE, TIMESTAMP, MULTIVERSION, OCC, HSTORE) against bottlenecks (Waiting (Thrashing), High Abort Rate, Timestamp Allocation, Multi-partition)]

Summary

All algorithms fail to scale as the number of cores increases.
• Thrashing limits the scalability of 2PL algorithms.
• Timestamp allocation limits the scalability of T/O algorithms.

Project Ideas

• New concurrency control approaches to tackle the scalability problem.
• Hardware solutions to DBMS bottlenecks that cannot be solved in software.
• Hybrid approach: switch between schemes depending on the workload.

Questions


Thrashing

[Figure: transactions A, B, C, D locking and waiting on tuples x, y, z, u, v]
