Upload
danglien
View
220
Download
1
Embed Size (px)
Citation preview
Gumball:A RaceConditionPreventionTechniquefor Cache
AugmentedSQL DatabaseManagementSystems�
Shahram Ghandeharizadeh, Jason Yap
DatabaseLaboratoryTechnicalReport2012-01
ComputerScienceDepartment,USC
Los Angeles,California90089-0781
September11,2012
Abstract
Queryintensiveapplicationsaugmenta RelationalDatabaseManagementSystem(RDBMS) with a
middle-tiercacheto enhanceperformance.An exampleis memcachedin useby very largewell known
sitessuchasFacebook.In thepresenceof updatesto thenormalizedtablesof theRDBMS, invalidation
basedconsistency techniquesdeletethe impactedkey-valuepairsresidingin the cache.A subsequent
referencefor thesekey-valuepairsobservesacachemiss,re-computesthenew valuesfrom theRDBMS,
and insertsthe new key-valuepairs in the cache. Thesetechniquessuffer from raceconditionsthat
result in cachestatesthat producestaledata. The GumballTechnique(GT) addressesthis limitation
by preventingraceconditions.Experimentalresultsshow GT enhancesthe accuracy of an application
hundredsof folds and,in somecases,mayreducesystemperformanceslightly.
A Introduction
The workloadof certainapplicationclassessuchassocialnetworking is dominatedby queriesthat read
data[7]. An exampleis theuserprofile page.A usermayupdateherprofile pagerarely, every few hoursif
not daysandweeks.At thesametime, theseprofile pagesarereferencedanddisplayedfrequently:Every
time the userlogs in and navigatesbetweenpages. To enhancesystemperformance,theseapplications
augmentastandardSQLbasedRDBMS,e.g.,MySQL, with acachemanager. Typically, thecachemanager
is a Key-ValueStore(KVS), materializingkey-valuepairscomputedusingthenormalizedrelationaldata.
A key-valuepair might be finely tunedto the requirementsof an application,e.g.,dynamicallygenerated�A shortversionof this paperappearedin theproceedingsof theSecondACM SIGMOD Workshopon DatabasesandSocial
Networks(DBSocial),May 20,2012.
1
HTML formattedpages[12, 27, 5, 26, 10, 17], and the KVS may managea large number(billions) of
suchhighly optimizedrepresentation.A CacheAugmentedSQLRDBMS,CASQL,enhancesperformance
dramaticallybecausea KVS look up is significantly fasterthan processingSQL queries. This explains
thepopularityof memcached,anin-memorydistributedKVS deployedby sitessuchasYouTube[14] and
Facebook[35].
With CASQLs, a consistency techniquemaintainsthe relationshipbetweenthe normalizeddataand
its key-value representation,detectschangesto the normalizeddata,and invalidates1 the corresponding
key-value(s)storedin the KVS. Almost all techniquessuffer from raceconditions,seeSectionB. The
significanceof theseraceconditionsis highlightedin [33] which describeshow a web site may decide
to not materializefailed key-value lookupsbecausethe KVS may becomeinconsistentwith the database
permanently.
As anexample,considerAlice whois trying to retrieveherprofilepagewhile thewebsite’sadministrator
is trying to deleteher profile pagedue to her violation of site’s termsof use. SectionC shows how an
interleavedexecutionof thesetwo logicaloperationsleave theKVS inconsistentwith thedatabasesuchthat
theKVS reflectstheexistenceof Alice’s profile pagewhile thedatabaseis left with no recordspertaining
to Alice. A subsequentreferencefor the key-value pair correspondingto Alice’s profile pagesucceeds,
reflectingAlice’s existencein thesystem.
ThispaperpresentstheGumballTechnique(GT) to detectraceconditionsandpreventthemfrom caus-
ing the key-value pairs to becomeinconsistentwith the tabular data. GT is an applicationtransparent,
deadlockfree(non-blocking)techniqueimplementedby theKVS. Its key insight is that it is permissibleto
ignorecacheputoperationsto preventinconsistentstates.Its advantagesareseveralfolds. First, it appliesto
all applicationsthatusea CASQL, freeingeachapplicationfrom implementingits own raceconditionde-
tectiontechnique.Thisreducesthecomplexity of theapplicationsoftware,minimizingcosts.Second,in our
experiments,it reducedthenumberof observed inconsistenciesdramatically(morethanten folds). Third,
GT requiresno specificfeaturefrom anRDBMS andcanbeusedwith all off-the-shelfRDBMSs. Fourth,
GT adjuststo varyingsystemloadsandhasno externalknobsthatrequireadjustmentsby anadministrator.
Fifth, while GT employstimestamps,it doesnotsuffer from clockdrift andrequiresnosynchronizedclocks
becauseits timestampsarelocal to a(partitioned)cacheserver. Thedisadvantageof usingGT is thatit may
slow down anapplicationslightly whenalmostall (99%)requestsareservicedusingKVS. This is because
GT re-directsrequeststhatmayretrieve stalecachedatato processSQLqueriesusingtheRDBMS.
Theprimarycontribution of this paperis thedesignof GT andhow it detectsandpreventsracecondi-
tions. To thebestof our knowledge,GT is novel andnot presentedelsewhere.A secondarycontribution is
animplementationandevaluationof GT usinga socialnetworking benchmark.Obtainedresultsshow GT
imposesnegligible overheadwhile reducingthepercentageof inconsistenciesdramatically.1Otherpossibilitiesincluderefreshing[12, 23] or incrementallyupdating[24] thecorrespondingkey-value.
2
��������� ���������
�������� ������������ � � � ��! "��#$���! "�%������&'&�&'&�&
( )+*, -.���� $ � � � !�� /�0#!�������/�
Figure1: Architectureof aCASQL.Arrowsshow processingof a13254"6�7�8
with akey referencewhosevalueis not foundin KVS.
The restof this paperis organizedasfollows. Theraceconditionis formalizedin SectionB andSec-
tion C describesGT to detectandprevent it. We highlight the tradeoffs associatedwith GT in SectionD.
SectionE presentsrelatedwork. Brief conclusionsareofferedin SectionF.
B Problem Statement
Figure1 shows the architectureof a typical CASQL. Both the KVS andthe RDBMS consistof a client
anda server componentthat communicateusingtheTCPprotocol. Theapplicationemploys theRDBMS
client(e.g.,JDBC)to issueSQLqueriesto theRDBMSserver. Similarly, it employs theKVS client to issue
insert,getanddeletecommandsto oneor moreinstancesof a KVS server deployedon several(potentially
thousandsof) PCswith asubstantialamountof memory.
Thekey-valuepairsin KVS might pertainto eithertheresultsof a query[23] or a semistructureddata
obtainedby executingseveralqueriesandglueingtheir resultstogetherusingapplicationspecificlogic [12,
10, 33, 23]. With theformer, thequerystringis thekey andits resultsetis thevalue.Thelattermightbethe
outputof eitheradeveloperdesignatedread-onlyfunction[33] or codesegmentthatconsumessomeinputto
produceanoutput[23]. In thepresenceof updatesto theRDBMS,a consistency techniquedeployedeither
at the applicationor the RDBMS may deletethe impactedcachedkey-valuepairs. This deleteoperation
mayracewith a look up thatobservesacachemiss,resultingin stalecacheddata.
To illustratea racecondition,assumetheuserissuesa requestthat invokesa segmentof code(192:4"6�7�8
)
that referencesa ;=< - >?< pair that is not KVS residentbecauseit wasjust deletedby anupdateissuedto the
RDBMS.Thiscorrespondsto Alice in theexampleof SectionA referencingherprofilepageafterupdating
herprofile information.Theadministratorwho is trying to deleteAlice from thesysteminvokesadifferent
codesegment(132A@CBED
) to delete;�< - >?< . Eventhoughboth192A@FBED
and132:4"6G708
employ theconceptof trans-
actions,theirKVS andRDBMS operationsarenon-transactionalandmayleave theKVS inconsistent.One
3
1. Get(Ki) returns cache
miss.
2. Issue queries to DBMS to compute
K i-V i.3. Insert Ki-V i
in the cache.
CS mod
1. UpdateDBMS
2. Consistencytechnique deletes Ki
from the cache.
CS modCS
fuseCS
fuseCS
fuse
T miss
T Delete
Time
2.a)Acceptable.
1. Get(Ki) returns cache
miss.
2. Issue queries to DBMS to compute
K i-V i.3. Insert Ki-V i
in the cache.
CS mod
1. UpdateDBMS
2. Consistencytechnique deletes Ki
from the cache.
CS mod
CS fuse
CS fuse
CS fuse
T miss
Time T Delete
2.b)Undesirable.
Figure2: Two interleavedprocessingof132:4"6G708
and132A@CBHD
referencingthesamekey-valuepair.
scenariois shown in Figure2.b.192:4"6�7�8
looksuptheKVS andobservesamiss,Arrows1 and2 of Figure1,
andcomputes;�< - >?< by processingits bodyof codethat issuesSQL queries(a transaction)to theRDBMS
to computes>?< , Arrows 3 and4 of Figure1. Prior to132:4"6G708
executingArrow 5,192A@FBED
issuesboth its
transactionto updatetheRDBMS anddeletecommandto updatetheKVS. Next,132:4"6G708
inserts;�< - >?< into
theKVS. Thisschedule,seeFigure2.b,renderstheKVS inconsistentwith theRDBMS.A subsequentlook
up of ;�< from KVS producesastalevalue >/< with no correspondingtabular datain theRDBMS.
In sum,a race condition is an interleaved executionof192:4"6�7�8
and192A@FBED
with both referencingthe
samekey-valuepair. Not all raceconditionsareundesirable;only thosethat causethe key-valuepairsto
becomeinconsistentwith thetabular data.An undesirableraceconditionis aninterleavedexecutionof one
or more threadsexecuting132A@CBHD
with oneor more threadsexecuting132:4"6G708
that satisfy the following
criteria. First, the thread(s)executing192:4"6�7�8
mustconstructa key-valuepair prior to thosethreadsthat
execute192A@FBED
that updatethe RDBMS. And,132A@CBHD
threadsmustdeletetheir impactedkey-valuepair
from KVS prior to13254"6�7�8
threadsinsertingtheir computedkey-valuepairsin theKVS. Figure2.b shows
aninterleavedprocessingthatsatisfiestheseconditions,resultingin anundesirableracecondition.Therace
conditionof Figure2.adoesnot resultin aninconsistentstateandis acceptable.
C Gumball Technique
GumballTechnique(GT) is designedto preventtheraceconditionsof SectionB from causingthekey-value
pairsto becomeinconsistentwith tabular data. It is implementedwithin theKVS by extendingits simple
operations(delete,getandput) to managegumballs,seeFigure3. Its detailsareasfollows. Whentheserver
4
delete(I�J )1. If I�J - K?J existsthendeleteI=J - K?J andgenerategumball L=J , i.e., I=J - L=J , with MONHP setto currenttime.2. If I�J - L=J existsthenchangeMQN�P to thecurrenttime.3. If noentryexistsfor I J thengenerateL J , i.e., I J - L J , with M N P setto currenttime.
get(I�J )1. If I�J - K?J existsthenreturn K?J2. If either I=J - L=J existsor no entryexistsfor I�J thenreporta cachemisswith currenttimeas M�RSJUTVT timestamp.
put(I J ,K J ,M RSJWTVT )1. Let M�X betheserversystemtime.2. If ( MQX - M�RSJWTVT�Y[Z ) thenignoretheput operation.3. If ( L=J existsand M�RSJWTVT is beforeMQNHP ) thenignoretheputoperation.4. If ( K?J existsandits time stampis after M�RSJWT0T ) thenignoretheput.5. If ( M RSJUTVT�\ M%]�^�_�` T!a ) thenignoretheputoperation.6. Otherwise,insert I J - K J with its time stampsetto M RSJUTVT .
Figure3: GT enableddelete,get,andput pseudo-code.All time stampsarelocal to theserver containing;cb - >db .
receivesadelete(;cb ) request,andthereis novaluefor ;eb in theKVS, GT storesthearrival timeof thedelete
( f D�8�gh80i'8 ) in a gumball jdb andinsertsit in theKVS with key ;eb . With severaldelete(;cb ) requestsissuedback
to back,GT maintainsonly one jGb denotingthetime stampof thelatestdelete(;cb ). GT assignsa fixedtime
to live, k , to each;cb - jdb to prevent themfrom occupying KVS memorylongerthannecessary. Thevalueof
k is computeddynamically, seeSectionC.1.1.
When the server processesa get(;eb ) requestandobserves a KVS miss, GT provides the KVS client
component(client for short)with the misstime stamp,f @ b 7H7 . The client maintains;eb andits f @ b 7�7 time
stamp.Once13254"6�7�8
computesavaluefor ;eb andperformsa putoperation,theclient extendsthis call with
f @ b 7�7 . With this put(;eb ,>db ,f @ b 7H7 ), a GT enabledKVS server comparesf @ b 7�7 with thecurrenttime ( f5l ). If
their differenceexceedsk , f5lnmof @ b 7�79p k , thenit ignorestheput operation.This is becausea gumball
might have existedandit is no longerin theKVS asit timed out. Otherwise,therearethreepossibilities:
Either (1) thereexists a gumball for ;cb , ;eb - jdb , (2) the KVS server hasno entry for ;cb , or (3) thereis an
existing valuefor ;eb , ;cb - >Gb . Considereachcasein turn. With thefirst, theserver comparesf @ b 7�7 with the
timestampof thegumball.If themisshappenedbeforethe jGb timestamp,f @ b 7�7rq f�s 6?@Ftvu�gwg , thenthereis a
raceconditionandtheput operationis ignored.Otherwise,theput operationsucceeds.This meansjdb (i.e.,
the gumball) is overwrittenwith >db . Moreover, the server maintainsf @ b 7H7 asmetadatafor this ;eb - >db (this
f @ b 7�7 is usedin thethird scenarioto detectstaleputoperations,seediscussionsof thethird scenario).
In the secondscenario,the server inserts ;cb - >db in the KVS and maintainsf @ b 7�7 as metadataof this
key-valuepair.
In the third scenario,a KVS server may implementtwo possiblesolutions. With the first, the server
comparesf @ b 7�7 of theputoperationwith themetadataof theexisting ; b - > b pair. Theformermustbegreater
in orderfor theputoperationto over-write theexistingvalue.Otherwise,theremight bearaceconditionand
5
theputoperationis ignored.A moreexpensivealternative is for theKVS to performabyte-wisecomparison
of theexisting valuewith theincomingvalue.If they differ thenit maydelete;eb - >db to forcetheapplication
to produceaconsistentvalue.
GT ignorestheput operationwith bothacceptableandundesirableraceconditions,seediscussionsof
Figure2 in SectionB. For example,with theacceptableraceconditionof Figure2.a,GT rejectstheputop-
erationof132:4"6G708
becauseits f @ b 7H7 is beforef�s 6?@Ctvu�ghg . Thesereducethenumberof requestsservicedusing
theKVS. Instead,they executethefusioncodethatissuesSQLqueriesto theRDBMS.This is significantly
slower thanaKVS look up,degradingsystemperformance.
C.1 Value of xIdeally, k shouldbesetto theelapsedtime from when
192:4"6�7�8observesa KVS missfor ; b to the time it
issuesaput(; b ,> b ,f @ b 7�7 ) operation.k valuesgreaterthanidealareundesirablebecausethey causegumballs
to occupy memorylonger thannecessary, reducingthe KVS hit rateof the application. k valueslower
thanideal causeGT to rejectKVS insertoperationsun-necessarily, seeStep2 of the put pseudo-codein
Figure3. They slow down a CASQL significantlybecausethey prevent theserver from cachingkey-value
pairs.In oneexperiment,GT configuredwith asmall k valueslowedthesystemdown tenfoldsby causing
theKVS to sit emptyandre-directall requeststo theRDBMS for processing.This sectiondescribeshow
GT computesthevalueof k dynamically.
C.1.1 Dynamic computation of kGT adjuststhe valueof k dynamicallyin responseto CASQL load to avoid valuesthat renderthe KVS
emptyandidle. Thedynamictechniqueis basedon theobservation that theKVS server mayestimatethe13254"6�7�8responsetime, RT, by subtractingthecurrenttime ( f:l ) from f @ b 7�7 : yzf|{}f5lnm~f @ b 7�7 . Whena
put is rejectedbecauseits RT is higherthan k , GT setsthevalueof k to this yzf multipliedby aninflation
( � ) value, k = yzf���� . For example,� mightbesetto 1.1to inflate k to be10%higherthanthemaximum
observedresponsetime. (Seebelow for adiscussionof � andits value.)
Increasingthevalueof k meansrequeststhatobserved a missprior to this changemay now passthe
raceconditiondetectioncheck. This is becauseGT may have rejectedoneor moreof theseput requests
with the smaller k valuewhenperformingthe check f5l�m�f @ b 7H7�p k . To prevent suchrequestsfrom
polluting thecache,GT maintainsthetimestampof whenit increasedthevalueof k , f u�D < 6G7Vi . It ignoresall
putoperationswith f @ b 7�7 prior to f u�D < 6�70i .GT reducesthevalueof k whena ;cb - jdb is replacedwith a ;cb - >db . It maintainsthemaximumresponse
time, yzf @Cu�� , usingaslidingwindow of 60seconds(durationof slidingwindow is aconfigurationparameter
of theKVS server). If this maximummultiplied by aninflation value( � ) is lower thanthecurrentvalueof
6
k thenit resetsk to this lowervalue, k|{�yzf @Cu�� ��� . Notethatdecreasingthevalueof k doesnotrequire
settingf u�D < 6�70i to thecurrenttimestamp:Thoseput requeststhatsatisfythecondition f5l�m�f @ b 7H7�p k will
continueto satisfyit with thesmaller k value.
Thedynamic k computationtechniqueuses� valuesgreaterthan1 to maintain k slightly higherthan
its ideal value. In essence,it tradesmemoryto minimize the likelihoodof Step2 of the put pseudo-code
(seeFigure3) from ignoringits cacheinsertunnecessarilyandredirectingfuturereferencesto theRDBMS.
This preventsthe possibility of an applicationobservingdegradedsystemperformancedue to a burst of
requeststhatincurKVS missesandaresloweddown bycompetingwith oneanotherfor RDBMSprocessing.
Moreover, gumballshaveasmallmemoryfootprint. Thisin combinationwith thelow probabilityof updates
minimizesthelikelihoodof extragumballsfrom impactingtheKVS hit rateadversely.
D Evaluation
Thissectionanalyzestheperformanceof anapplicationconsistency techniquewith andwithoutGT usinga
realisticsocialnetworkingbenchmarkbasedonawebsitenamedRAYS.Weconsideredusingotherpopular
benchmarkingtools suchas RUBiS [3], YCSB [13] and YCSB++ [32]. We could not useRUBiS and
YCSB[13] becauseneitherquantifiestheamountof staledata.Theinconsistency window metricquantified
by YCSB++ [32] measuresthedelayfrom whenanupdateis issueduntil it is consistentlyreflectedin the
system. This metric is inadequatebecauseit doesnot measurethe amountof staledataproduceddueto
raceconditionsby multiple threads.Below, we startwith a descriptionof our workload.Subsequently, we
presentperformanceresultsandcharacterizetheperformanceof Gumballwith different k values.
D.1 RAYS and a Social Networking Benchmark
RecallAll You See(RAYS) [22] envisions a social networking systemthat empowers its usersto store,
retrieve, andsharedataproducedby devicesthatstreamcontinuousmedia,audioandvideodata.Example
devicesincludethepopularApple iPhoneandinexpensive camerasfrom PanasonicandLinksys.Similar to
othersocialnetworking sites,a userregistersa profile with RAYS andproceedsto invite othersasfriends.
A usermayregisterstreamingdevicesandinvite othersto view andrecordfrom them.Moreover, theuser’s
profileconsistsof a“Li veFriends”sectionthatdisplaysthosefriendswith adevicethatis actively streaming.
Theusermaycontactthesefriendsto view their streams.
We usetwo popularnavigation pathsof RAYS to evaluateGT: BrowseandTogglestreaming(Toggle
for short). While Browseis a read-onlyworkload,Toggleresultsin updatesto the databaserequiringthe
key-valuepairsto remainconsistentwith thetabular data.Wedescribeeachin turn.
Browseemulatesfour clicks to modelauserviewing herprofile,herinvitationsto view streams,andher
list of friendsfollowedwith theprofileof afriend. It issues38SQLqueriesto theRDBMS.With aCASQL,
7
Databaseparameters� Numberof usersin thedatabase.�Numberof friendsperuser.
Workloadparameters�Numberof simultaneoususers/threads.� Numberof usersemulatedby a thread.� Think time betweenuserclicksexecutinga sequence.�Inter-arrival timebetweenusersemulatedby a thread.� Probabilityof auserinvoking theTogglesequence.
Table1: Workloadparametersandtheirdefinitions.
Browsingissues8 KVS getoperations.For eachgetthatobservesa miss,it performsaput operation.With
anemptyKVS, thegetoperationsobserve no hitsandthissequenceperforms8 putoperations.
Togglecorrespondsto a sequenceof threeclicks wherea userviews her profile, her list of registered
devicesandtogglesthestateof a device. Thefirst two resultin a total of 23 SQL queries.With a CASQL,
Toggle issues7 get operationsand,with an empty KVS, observes a miss for all 7. This causesToggle
to perform7 put operationsto populatethe KVS. With the last userclick, if the device is streamingthen
theuserstopsthis stream.Otherwise,theuserinitiatesa streamfrom thedevice. This resultsin 3 update
commandsto thedatabase.With Trig, theseupdatesinvoke triggersthatdeleteKVS entriescorresponding
to boththeprofile2 anddevicespages.With apopulatedKVS, thenumberof deletesis higherbecauseeach
toggleinvalidatesthe“Li veFriends”sectionof thosefriendswith aKVS entry.
Our multi-threadedworkloadgeneratortargetsa databasewith a fixed numberof users,� . A thread
simulatessequentialarrival of � usersperformingonesequenceat a time. Thereis a fixed delay, inter-
arrival time � , betweentwo usersissuedby thethread.A threadselectstheidentityof auserby employing a
randomnumbergeneratorconditionedusingaZipfiandistribution with ameanof 0.27. � threadsmodel�simultaneoususersaccessingthesystem.In thesingleuser(1 thread,� =1) experiments,this means20%
of usershave 80%likelihoodof beingselected.Oncea userarrivesandheridentity is selected,shepicksa
Togglesequencewith probabilityof � anda Browsingsequencewith probability �E�zmo�:� . Thereis a fixed
think time � betweentheuserclicks thatconstitutea sequence.
Theworkloadgeneratormaintainsboththestructureof thesyntheticdatabaseandtheactivities of dif-
ferentusersto detectkey-valuepairs(HTML pages)thatarenotconsistentwith thestateof thetabulardata,
termedstale data. The workloadgeneratorproducesuniqueusersaccessingRAYS simultaneously. This
meansamoreuniformdistribution of accessto datawith a largernumberof threads.While this is no longer
a trueZipfian distribution,obtainedresultsfrom a systemwith andwithout GT arecomparablebecausethe
sameworkloadis usedwith eachalternative.2Theuser’sprofile pagedisplaysthecountof devicesthatarestreaminganda toggleof astreamingdeviceeitherincrementsor
decrementsthis counter.
8
0 1 2 3 4 5 61
10
100
1,000
10,000
100,000
1,000,000
Total Execution Time (Hours) Time (hours)
Number of Stale Reads
System with GT
System without GT
Figure4: Numberof stalereadsin oneexperiment;� =1000, � =10, � =100, � =10,000,� = � =0, � =1%.
To measurethe amountof staledatawith 100%accurracy, the workloadgeneratormustmaintainthe
statusof different devices managedby RAYS and serializesimultaneoususerrequestsand issueone to
CASQLatatime. This is unrealisticandwouldeliminateall raceconditions.Instead,weallowedthework-
loadgeneratorto issuerequestssimultaneouslyandusedtime stampsto detectits internalraceconditions.
This resultsin falsepositiveswhereanidentifiedstaledatais dueto anin-progresschangeto a time stamp.
Thesefalsepositivesareobservedwhentheworkloadgeneratoris usingRDBMS only.
D.2 Performance Results
Weconductedmany experimentsto quantifya) theamountof staledataeliminatedby GT, b) theimpactof
GT on systemperformance,andc) how quickly GT adaptsk to changingworkloadcharacteristics.In all
experiments,GT reducedthe amountof staledata100 folds or more. Below, we presentoneexperiment
with 300fold reductionin staledataanddiscusstheothertwo metricsin turn.
In this experiment,we focuson an invalidationbasedtechniqueimplementedin the application. We
targetasmalldatabasethatfits in memoryto quantifytheoverheadof GT. With largerdatasetsthatresultin
cachemisses,theapplicationmustissuequeriesto theRDBMS. This resultsin higherresponsetimesthat
hidetheoverheadof GT. If raceconditionsoccurfrequentlythenGT will slow down aCASQLby reducing
its cachehit rate. In our experiments,raceconditionsoccurlessthan3% of thetime3. Thus,GT’s impact
on systemperformanceis negligible.
Figure4 shows thenumberof requeststhatobserve staledatawhenasystemis configuredto eitheruse
GT or not useGT. Thex-axisof this figure is theexecutiontime of thework-load. They-axis is log scale
andshows thenumberof requeststhatobserve staledata.With GT, only 343requests(lessthan0.009%of
the total numberof requests)observe staledata. We attribute theseto the falsepositivesproducedby our
workloadgenerator, seeSectionD.1. Without GT, morethan100,000requests(2.4%of thetotal requests)
observe staledata.Thecachehit rateis approximately85%with andwithoutGT. Eventhoughthedatabase3Approximatedby theamountof staledataproducedwithout GT.
9
0 1000 2000 3000 4000 50000
10
20
30
40
50
60
70
Time (Seconds)
Delta in Seconds
MAX RT
∆
N=1 10 20 50 100 50 20 10 1
2) Valueof k .
Figure5: Varyingsystemload; � =1000, � =2, � =10,000,� =100msec,� =0, � =10%.
is small enoughto fit in memory, the cachehit ratecannotapproximate100%because10% of requests
executeToggle( � =10%)which invalidatecacheentries.
GT adaptsto changingworkloadsby adjustingthe value of k . We varied systemload by varying
thenumberof simultaneoususersaccessingthesystem,� . We looked at differentpatternsrangingfrom
thosethatchangetheloadabruptly(switchfrom 1 to 100simultaneoususers)to thosethatchangetheload
gracefully. In eachcase,GT adjuststhevalueof k quickly, minimizing thenumberof KVS insertsrejected
due to a small value of k . Suchrejectionsare typically a negligible percentageof the total numberof
requestsprocessed.Below, wereporton oneexperiment.
Thisexperimentvariedthenumberof simultaneoususers(� ) from 1 to 10,20,50,100andbackto 50,
20,10 and1. For eachsetting,a userissues1000requests.Figure5 shows thevalueof k whencompared
with themaximumobserved RT, seediscussionsof SectionC.1.1. As we increasethe load,GT increases
the valueof k to prevent rejectionof KVS insertsunnecessarily. Similarly, whenwe decreasethe load,
GT reducesthevalueof k to freememoryby preventinggumballsfrom occupying thecachelongerthan
necessary. k is higherthantheobservedmaximumresponsetimebecauseits inflationvalueis setto 2. This
experimentissuesmorethantwo hundredthousandput requestsandGT rejectsfewer than600dueto small
k values.
E Related Work
CacheaugmentedRDBMSshave beenanactive areaof researchdatingbackto 1990s[27, 12, 41, 18, 15,
10, 31, 29, 1, 40, 30, 9, 17, 2, 4, 5, 33, 24]. The focus of this paperis on a subsetwith the following
characteristic.First, the cachemustsupportsimpleinsert,get, anddeleteoperations[6, 33]. Thereexist
complex cacheswith theability toprocessSQLqueries,e.g.,TimesTen[40], DBProxy[2], DBCache[31, 9],
CacheTables[1], MTCache[30], Ferdinand[21]. While thesefall beyond the focusof GT, a KVS might
betheir building componentwhich mayrenderGT relevant. Second,thecache(KVS) mustbeat thesame
10
abstractionasthe RDBMS wheresecurityandprivacy of contentis guaranteedby the applicationandits
infrastructure[27, 12, 41,18, 15, 29, 4, 5, 33, 24, 23]. It doesnotapplyto proxycaches[10, 16,17] thatare
externalto theapplication.
TxCache[33] presentstheconceptof raceconditionsandhow they mayleave thecache(KVS) incon-
sistentwith the RDBMS. It proposesa slight modificationto thoseRDBMSsthat implementMVCC [8]
to (a) maintainvalidity interval for read-onlyquerieswhoseresultsarerepresentedaskey-valuepairs,and
(b) generateinvalidation streamsin thepresenceof updatetransactions.This streamcontainstransaction’s
time stampandall invalidationtagscorrespondingto key-valuepairsin thecache.This enablesTxCache
to implementtransactionswith snapshotisolation. GT is different in several ways. First, while both GT
andTxCacheemploy time stampsto detectraceconditions,thesemanticsof thesetime stampsarediffer-
ent. GT’s time stamppertainsto whenKVS fails to producea valuefor a referencedkey while TxCache’s
time stampis the updatetransactiontime stampissuedby the RDBMS. Second,GT is a building block
of a consistency techniquewhile TxCacheis a consistency technique.Certainpropertiesof a transaction
(atomicity, consistency, isolation,durability) may imposesignificantoverheadandnot beapplicableto an
application[38, 11, 24, 32]. A consistency techniquemayabandonthesepropertieswhile utilizing GT to
preventraceconditionsthatleave thecacheinconsistentpermanently. Thesamecannotberealizedwith Tx-
Cachebecauseit is aconsistency techniquethatimplementstransactionswith snapshotisolation.Third, GT
canbeusedwith all SQLRDBMSsbecauseit doesnoteithermodify or requireaprespecifiedfunctionality
from theRDBMS.
Oneapproachto preventraceconditionsis to serializeall writesusingtheRDBMS[24]. If theRDBMS
is thesystembottleneck,GT is abetteralternative becauseit imposesnoadditionalloadontheRDBMS.Its
entireimplementationis by thecache.
Tombstones[20] mark deletedobjectsin systemsthat allow replica contentto diverge [37]. A key
designchallengeis when to deletethem. Somesystemsemploy a two-phasecommit protocol to delete
tombstonessafely[25, 34, 36]. To continueoperationin thepresenceof failuresandintermittentnetwork
connectivity, somestudieshave proposeddeletionof tombstonesat eachsite after a fixed period, long
enoughfor mostupdatesto completepropagation,but shortenoughto keepthespaceoverheadlow [39, 28].
Clearinghouse[19] removestombstonesfrom mostsitesafter the expirationperiodandretainsthemon a
few designatedsitesindefinitely. Whena staleoperationarrivesaftertheexpirationperiod,somesitesmay
incorrectlyapply thatoperation.However, the designatedsiteswill distribute an operationthat re-installs
tombstoneson thesesites.
A gumball is similar to a tombstonebecauseit identifiesa deletedkey-value pair. However, GT’s
managementof gumballsis novel. First, GT deletesa gumballaftera fixed time interval k . GT controls
thevalueof k dynamicallyusingtheresponsetimeof thekey-valuepair, i.e., theRDBMSservicetimeand
queuingdelaysattributedto systemload. Second,GT ignoresput operationsfor ;cb - >db that incur response
11
timesgreaterthan k (independentof gumballexistence).GT is notasubstitutefor tombstonemanagement
algorithmsandtheirassumedoptimisticreplicationenvironmentandvice versa.
F Conclusion
GT is a racecondition detectionand prevention techniquefor mid-tier in-memorycachesthat comple-
menta RDBMS to enhanceperformance.This simpletechniqueworkswith all RDBMSsandalternative
invalidation-basedapproachesto cacheconsistency. Its evaluationhighlightstheneedfor abenchmarkthat
canaccuratelymeasuretheamountof staledataproducedby a framework. Our socialnetworking bench-
mark suffers from a few falsepositives (thousandthof onepercentof issuedrequest). Theseshouldbe
eliminatedwithoutslowing down theworkloadgenerator.
G Acknowledgments
Wewish to thanktheanonymousrefreesof DBSocial2012for their valuablecomments.
References
[1] M. Altinel, C. Bornhovd, S.Krishnamurthy, C. Mohan,H. Pirahesh,andB. Reinwald. CacheTables:
Paving theWay for anAdaptive DatabaseCache.In VLDB, 2003.
[2] K. Amiri, S. Park, andR. Tewari. DBProxy: A dynamicdatacachefor Webapplications.In ICDE,
2003.
[3] C. Amza, A. Chanda,A. Cox, S. Elnikety, R. Gil, K. Rajamani,W. Zwaenepoel,E. Cecchet,and
J.Marguerite.SpecificationandImplementationof DynamicWebSiteBenchmarks.In Workshop on
Workload Characterization, 2002.
[4] C. Amza,A. L. Cox,andW. Zwaenepoel.A comparative evaluationof transparentscalingtechniques
for dynamiccontentservers. In ICDE, 2005.
[5] C.Amza,G.Soundararajan,andE.Cecchet.TransparentCachingwith StrongConsistency in Dynamic
ContentWebSites.In Supercomputing, ICS ’05, pages264–273,New York, NY, USA, 2005.ACM.
[6] Six Apart. Memcached Specification, http://code.sixapart.com/svn/memcached/trunk/server
/doc/protocol.txt.
[7] Fabrıcio Benevenuto,Tiago Rodrigues,MeeyoungCha,andVirgılio A. F. Almeida. Characterizing
userbehavior in onlinesocialnetworks. In Internet Measurement Conference, 2009.
12
[8] P. BernsteinandN. Goodman. MultiversionConcurrency Control - TheoryandAlgorithms. ACM
Transactions on Database Systems, 8:465–483,February1983.
[9] C. Bornhovdd,M. Altinel, C. Mohan,H. Pirahesh,andB. Reinwald. AdaptiveDatabaseCachingwith
DBCache.IEEE Data Engineering Bull., pages11–18,2004.
[10] K. S. Candan,W. Li, Q. Luo, W. Hsiung,andD. Agrawal. Enablingdynamiccontentcachingfor
database-drivenwebsites.In SIGMOD Conference, pages532–543,2001.
[11] R. Cattell. ScalableSQLandNoSQLDataStores.SIGMOD Rec., 39:12–27,May 2011.
[12] J.Challenger, P. Dantzig,andA. Iyengar. A ScalableSystemfor ConsistentlyCachingDynamicWeb
Data.In Proceedings of the 18th Annual Joint Conference of the IEEE Computer and Communications
Societies, 1999.
[13] B. F. Cooper, A. Silberstein,E. Tam,R. Ramakrishnan,andR. Sears.BenchmarkingCloudServing
Systemswith YCSB. In Cloud Computing, 2010.
[14] C. D. Cuong.YouTubeScalability.GoogleSeattleConferenceonScalability, June2007.
[15] A. Datta,K. Dutta,H. Thomas,D. VanderMeer, D. VanderMeer, K. Ramamritham,andD. Fishman.
A ComparativeStudyof AlternativeMiddle Tier CachingSolutionsto SupportDynamicWebContent
Acceleration.In VLDB, pages667–670,2001.
[16] A. Datta, K. Dutta, H. Thomas,D. VanderMeer, D. VanderMeer, Suresha,and K. Ramamritham.
Proxy-BasedAccelerationof DynamicallyGeneratedContenton theWorld Wide Web:An Approach
andImplementation.In SIGMOD, pages97–108,2002.
[17] A. Datta,K. Dutta,H. M. Thomas,D. E.VanderMeer, andK. Ramamritham.Proxy-basedAcceleration
of DynamicallyGeneratedContenton theWorld WideWeb:An ApproachandImplementation.ACM
Transactions on Database Systems, pages403–443,2004.
[18] L. Degenaro,A. Iyengar, I. Lipkind, and I. Rouvellou. A MiddlewareSystemWhich Intelligently
CachesQueryResults.In IFIP/ACM International Conference on Distributed systems platforms, 2000.
[19] A. Demers,D. Greene,C. Hauser, W. Irish, and J. Larson. EpidemicAlgorithms for Replicated
DatabaseMaintenance.In Symposium on Principles of Distributed Computing, pages1–12,1987.
[20] M. J.FischerandA. Michael.SacrificingSerializabilityto Attain Availability of Datain anUnreliable
Network. In Symposium on Principles of Database Systems (PODS) Conference, pages70–75,1982.
13
[21] C. Garrod,A. Manjhi, A. Ailamaki, B. Maggs,T. Mowry, C. Olston,andA. Tomasic.ScalableQuery
ResultCachingfor WebApplications.August2008.
[22] S.Ghandeharizadeh,S.Barahmand,A. Ojha,andJ.Yap.RecallAll YouSee,http://rays.shorturl.com,
2010.
[23] S.GhandeharizadehandJ.Yap. SQL QueryTo TriggerTranslation:A Novel Consistency Technique
for CacheAugmentedDBMSs. In Submitted for Publication, 2012.
[24] P. Gupta,N. Zeldovich, andS. Madden.A Trigger-BasedMiddlewareCachefor ORMs. In Middle-
ware, 2011.
[25] R. Guy, G. Popek,andT. Page.Consistency Algorithmsfor OptimisticReplication.In IEEE Interna-
tional Conference on Network Protocols, 1993.
[26] V. Holmedahl,B. Smith,andT. Yang.CooperativeCachingof DynamicContentonaDistributedWeb
Server. In HPDC, pages243–250,1998.
[27] A. IyengarandJ. Challenger. Improving WebServer Performanceby CachingDynamicData. In In
Proceedings of the USENIX Symposium on Internet Technologies and Systems, pages49–60,1997.
[28] J.KistlerandM. Satyanarayanan.DisconnectedOperationin theCodaFileSystem.ACM Transactions
on Computing Systems, 10:3–25,February1992.
[29] A. LabrinidisandN. Roussopoulos.ExploringtheTradeoff BetweenPerformanceandDataFreshness
in Database-DrivenWebServers.The VLDB Journal, 2004.
[30] P. Larson,J. Goldstein,and J. Zhou. MTCache: TransparentMid-Tier DatabaseCachingin SQL
Server. In ICDE, pages177–189,2004.
[31] Q.Luo,S.Krishnamurthy, C.Mohan,H. Pirahesh,H. Woo,B. G.Lindsay, andJ.F. Naughton.Middle-
Tier DatabaseCachingfor e-Business.In SIGMOD, 2002.
[32] S. Patil, M. Polte,K. Ren,W. Tantisiriroj, L. Xiao, J. Lopez,G. Gibson,A. Fuchs,andB. Rinaldi.
YCSB++: BenchmarkingandPerformanceDebuggingAdvancedFeaturesin ScalableTableStores.
In Cloud Computing, New York, NY, USA, 2011.ACM.
[33] D. R. K. Ports,A. T. Clements,I. Zhang,S. Madden,andB. Liskov. Transactionalconsistency and
automaticmanagementin anapplicationdatacache.In OSDI. USENIX, October2010.
[34] D. Ratner. Roam:A ScalableReplicationSystemfor Mobile andDistributedComputing,Ph.D.thesis,
UC LosAngeles,TechReportUCLA-CSD-970044,1998.
14
[35] P. Saab. ScalingmemcachedatFacebook,http://www.facebook.com/note.php?note id=39391378919,
Dec.2008.
[36] Y. SaitoandH. Levy. OptimisticReplicationfor InternetDataServices.In Conference on Distributed
Computing (DISC), pages297–314,2000.
[37] Y. SaitoandM. Shapiro.OptimisticReplication.ACM Computing Surveys, 5:12–27,2005.
[38] M. I. Seltzer. BeyondRelationalDatabases.Commun. ACM, 51(7):52–58,2008.
[39] H. SpencerandD. Lawrence. ManagingUsent,O’Reilly & Associates,Sebastopol,CA, ISBN 1-
56592-198-4,1998.
[40] TheTimesTenTeam. Mid-Tier Caching:TheTimesTenApproach. In Proceedings of the SIGMOD,
2002.
[41] K. Yagoub,D. Florescu,V. Issarny, andP. Valduriez.CachingStrategiesfor Data-Intensive WebSites.
In VLDB, pages188–199,2000.
15