Deploying data streaming applications in the Fog
Valeria Cardellini, [email protected]
University of Rome Tor Vergata
2nd Workshop “Through the Fog” – Pisa, Italy

Deploying data streaming applications in the Fog · pages.di.unipi.it/throughthefog/wp-content/uploads/sites/... · 2017-02-28



Data stream processing (DSP)

• A variety of low-latency and location-aware applications in diverse domains:
  – Situation-aware applications (e.g., intelligent urban transport, surveillance, and traffic congestion)
  – Social data mining
• Require
  – Continuous real-time processing of unbounded data streams generated by multiple, distributed sources
  – To extract valuable information in a timely and reliable manner

V. Cardellini - Through the Fog 2017

In a new distributed environment

• To increase scalability and availability, reduce latency, network traffic, and power consumption
  – Edge/fog computing (“the cloud close to the ground”): many micro data centers located at the network edge

Exploit distributed and near-edge computation

… that poses old and new challenges

• Network latencies are significant
• Computing and networking resources are heterogeneous (e.g., business constraints, capacity limits, …)
• Computing and network resources are not always available
• Data cannot be processed everywhere
• …

Goal of the talk

• Give a flavor of some challenges and their possible solutions that arise when deploying data stream processing applications in a fog/edge environment

DSP application basics

• A network of operators connected by data streams, at least one data source and one data sink
• Represented by a directed graph
  – Graph vertices: operators
  – Graph edges: data streams
  – Usually directed acyclic graph (DAG)
• Operator:
  – Processing element that transforms one or more input streams into another stream
  – Can be stateless or stateful
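The graph model above can be sketched in a few lines of Python. This is only an illustration: the class `DspApplication`, its methods, and the four operator names are hypothetical and do not come from the talk or from any DSP framework.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class DspApplication:
    """A DSP application as a directed graph of operators and streams."""
    operators: dict = field(default_factory=dict)  # name -> is the operator stateful?
    streams: set = field(default_factory=set)      # (upstream, downstream) pairs

    def add_operator(self, name, stateful=False):
        self.operators[name] = stateful

    def add_stream(self, upstream, downstream):
        for name in (upstream, downstream):
            if name not in self.operators:
                raise KeyError(f"unknown operator: {name}")
        self.streams.add((upstream, downstream))

    def sources(self):
        """Operators with no incoming stream."""
        with_input = {dst for _, dst in self.streams}
        return sorted(set(self.operators) - with_input)

    def sinks(self):
        """Operators with no outgoing stream."""
        with_output = {src for src, _ in self.streams}
        return sorted(set(self.operators) - with_output)

    def is_dag(self):
        """True if the stream graph has no cycle."""
        predecessors = {name: set() for name in self.operators}
        for src, dst in self.streams:
            predecessors[dst].add(src)
        try:
            TopologicalSorter(predecessors).prepare()
        except CycleError:
            return False
        return True


# Hypothetical four-operator pipeline with one stateful operator.
app = DspApplication()
for name, stateful in [("source", False), ("parser", False),
                       ("counter", True), ("sink", False)]:
    app.add_operator(name, stateful)
app.add_stream("source", "parser")
app.add_stream("parser", "counter")
app.add_stream("counter", "sink")
```

Keeping the stateful flag on each operator matters later in the talk, since stateful operators are the expensive ones to migrate or scale.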

Challenge 1: Operator placement

• How to assign the DSP operators to computing nodes which are distributed in a Fog environment

[Figure: placement of an application graph with operators 1–6 and data streams (1,2), (2,3), (2,4), (3,5), (4,5), (4,6)]

The beginning: Distributed Storm

• Current DSP systems (e.g., Storm, Flink, Heron) are designed to run in single data centers
• Our initial goal: to extend Storm for a large-scale distributed and heterogeneous environment

V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, Distributed QoS-aware scheduling in Storm. DEBS ’15.

Network latency estimation

• How to provide an efficient estimation of the network delay between pairs of nodes?
• Use a network coordinates system
  – To predict latencies without performing direct measurements
    • E.g., Vivaldi network coordinates: decentralized and gossip-based scheme
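To make the idea of network coordinates concrete, here is a simplified 2-D sketch in the spirit of Vivaldi: each node moves its coordinate a little after every latency sample, so that Euclidean distances come to approximate round-trip times. The class name `VivaldiNode`, the gains `cc` and `ce`, the random initial offsets, and the three-node latency table are assumptions made for this illustration; this is not the implementation used in the Storm extension.

```python
import math
import random


class VivaldiNode:
    """One node's state in a simplified 2-D Vivaldi-style coordinate system."""

    def __init__(self, rng, cc=0.25, ce=0.25):
        self.rng = rng
        # Assumption: start at a small random offset so nodes are not co-located.
        self.coord = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        self.error = 1.0   # local error estimate (1.0 = no confidence yet)
        self.cc = cc       # gain for coordinate updates
        self.ce = ce       # gain for error updates

    def predict(self, remote_coord):
        """Predicted latency = Euclidean distance between coordinates."""
        return math.dist(self.coord, remote_coord)

    def update(self, rtt, remote_coord, remote_error):
        """Adjust the local coordinate after measuring `rtt` to a remote node."""
        dist = self.predict(remote_coord)
        # Trust the sample more when the remote node is more confident than us.
        weight = self.error / max(self.error + remote_error, 1e-12)
        sample_error = abs(dist - rtt) / rtt
        self.error = (sample_error * self.ce * weight
                      + self.error * (1.0 - self.ce * weight))
        if dist > 0.0:
            direction = [(a - b) / dist for a, b in zip(self.coord, remote_coord)]
        else:  # co-located: pick a random direction to separate the nodes
            angle = self.rng.uniform(0.0, 2.0 * math.pi)
            direction = [math.cos(angle), math.sin(angle)]
        # Move away from the remote node if the prediction is too short,
        # towards it if the prediction is too long.
        step = self.cc * weight * (rtt - dist)
        self.coord = [c + step * d for c, d in zip(self.coord, direction)]


# Hypothetical round-trip times (ms) between three nodes.
true_rtt = {("a", "b"): 30.0, ("b", "c"): 40.0, ("a", "c"): 50.0}


def rtt(u, v):
    return true_rtt[(u, v)] if (u, v) in true_rtt else true_rtt[(v, u)]


rng = random.Random(7)
nodes = {name: VivaldiNode(rng) for name in "abc"}
for _ in range(500):  # gossip rounds: every node samples every other node
    for u in nodes:
        for v in nodes:
            if u != v:
                nodes[u].update(rtt(u, v), nodes[v].coord, nodes[v].error)
```

After the rounds, `nodes["a"].predict(nodes["b"].coord)` estimates the a–b latency without a new measurement, which is what a placement policy needs when it evaluates many candidate node pairs.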

Operator placement policies

• Operator placement: NP-hard problem
• Several placement policies in the literature (mainly heuristics) address this problem, but with
  – Different assumptions (system model, application topology, QoS attributes and metrics, …)
  – Different objectives
  – Not easily comparable

ODP: Optimal DSP Placement

• We propose ODP
  – Centralized policy for optimal placement of DSP applications
  – Formulated as Integer Linear Programming (ILP) problem
• Our goals:
  – To compute the optimal placement (of course!)
  – To provide a unified general formulation of the placement problem for DSP applications (but not only!)
  – To consider multiple QoS attributes of applications and resources
  – To provide a benchmark for heuristics

V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, Optimal Operator Placement for Distributed Stream Processing Applications, DEBS ’16.

ODP: model
DSP application

Operators
• C_i: required computing resources
• R_i: execution time per data unit

Data streams
• λ_{i,j}: data rate from operator i to j

ODP: model
Computing and network resources

(Logical) Network links
• d_{u,v}: network delay from u to v
• B_{u,v}: bandwidth from u to v
• A_{u,v}: link availability

Computing resources
• C_u: amount of resources
• S_u: processing speed
• A_u: resource availability

ODP: model
Optimal DSP Placement Model

Decision variables
• Where to map operators and data streams

[Figure: operators i and j mapped onto nodes u and v among the nodes u, z, v, w, with x_{i,u} = 1, x_{j,v} = 1 and y_{(i,j),(u,v)} = 1]

ODP: some QoS metrics

• Latency
  – Max end-to-end delay between sources and destinations
• Availability
  – Prob. that all operators/links are up and running
• Latency and bandwidth
  – Inter-node traffic
  – Network usage
    • In-flight bytes: Σ_{l ∈ links} rate(l) · Lat(l)
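The three metrics can be evaluated for a given placement as sketched below. The sketch makes several assumptions that the slides do not state: an operator's execution time on node u is taken as R_i / S_u, co-located operators communicate with zero delay and perfect availability, and each used node or link counts once in the availability product. All function and variable names are hypothetical.

```python
import math

# Assumption: operators on the same node exchange data with no delay or failure.
LOCAL = {"delay": 0.0, "availability": 1.0}


def link(links, u, v):
    return LOCAL if u == v else links[(u, v)]


def max_latency(operators, streams, nodes, links, placement):
    """Longest source-to-sink delay: execution times plus link delays."""
    successors = {}
    for i, j in streams:
        successors.setdefault(i, []).append(j)
    cache = {}

    def longest_from(i):
        if i not in cache:
            u = placement[i]
            exec_time = operators[i] / nodes[u]["speed"]  # R_i / S_u (assumed)
            tail = max(
                (link(links, u, placement[j])["delay"] + longest_from(j)
                 for j in successors.get(i, [])),
                default=0.0,
            )
            cache[i] = exec_time + tail
        return cache[i]

    return max(longest_from(i) for i in operators)


def availability(streams, nodes, links, placement):
    """Probability that every used node and link is up (independence assumed)."""
    used_nodes = set(placement.values())
    used_links = {(placement[i], placement[j]) for i, j in streams}
    node_part = math.prod(nodes[u]["availability"] for u in used_nodes)
    link_part = math.prod(link(links, u, v)["availability"] for u, v in used_links)
    return node_part * link_part


def network_usage(streams, links, placement):
    """In-flight data: sum over streams of rate times link latency."""
    return sum(rate * link(links, placement[i], placement[j])["delay"]
               for (i, j), rate in streams.items())


# Hypothetical instance: R_i per operator, data rate per stream, two nodes.
operators = {"source": 1.0, "filter": 2.0, "sink": 1.0}
streams = {("source", "filter"): 100.0, ("filter", "sink"): 10.0}
nodes = {"edge": {"speed": 1.0, "availability": 0.99},
         "cloud": {"speed": 2.0, "availability": 0.999}}
links = {("edge", "cloud"): {"delay": 20.0, "availability": 0.98},
         ("cloud", "edge"): {"delay": 20.0, "availability": 0.98}}
placement = {"source": "edge", "filter": "edge", "sink": "cloud"}

latency = max_latency(operators, streams, nodes, links, placement)
avail = availability(streams, nodes, links, placement)
usage = network_usage(streams, links, placement)
```

With this placement the heavy source-to-filter stream stays on the edge node, so only the filtered stream crosses the 20 ms link.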

ODP: ILP formulation
Optimal DSP Placement Model

• Tunable knobs to set the optimal placement goals

[Figure: ILP formulation, annotated with: latency; availability; network bandwidth and node capacity constraints; assignment and integer constraints]
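The formulation itself did not survive extraction. As a rough guide, the capacity, bandwidth, assignment, and integrality constraints named above could take the following form, written with the symbols of the model slides (x_{i,u}, y_{(i,j),(u,v)}, C_i, C_u, λ_{i,j}, B_{u,v}). This is a reconstruction under assumptions, not the authors' exact formulation; in particular, the two linking constraints are one standard way to tie y to x, and (u,v) is assumed to include the case u = v for co-located operators.

```latex
\begin{align}
\sum_{u} x_{i,u} &= 1 && \forall i
  && \text{(each operator on exactly one node)} \\
x_{i,u} &= \sum_{v} y_{(i,j),(u,v)} && \forall (i,j),\ \forall u
  && \text{(stream leaves from the node hosting } i\text{)} \\
x_{j,v} &= \sum_{u} y_{(i,j),(u,v)} && \forall (i,j),\ \forall v
  && \text{(stream arrives at the node hosting } j\text{)} \\
\sum_{i} C_i \, x_{i,u} &\le C_u && \forall u
  && \text{(node capacity)} \\
\sum_{(i,j)} \lambda_{i,j} \, y_{(i,j),(u,v)} &\le B_{u,v} && \forall (u,v)
  && \text{(link bandwidth)} \\
x_{i,u},\ y_{(i,j),(u,v)} &\in \{0,1\}
  && && \text{(integrality)}
\end{align}
```

The latency and availability terms would then enter the objective function, weighted by the tunable knobs mentioned on the slide.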

ODP and Apache Storm

• We can use ODP
  – to determine the optimal placement
  – as benchmark to evaluate existing heuristics

[Figure: ODP and Apache Storm]
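The benchmarking idea can be illustrated on a toy instance: compute the true optimum and compare a heuristic's result against it. The sketch below uses exhaustive search instead of an ILP solver, which is only viable for very small instances, and it minimizes network usage alone under node capacity constraints. The instance, the pinning of source and sink, and the "everything in the cloud" baseline are assumptions made for this example.

```python
from itertools import product


def network_usage(streams, delay, placement):
    """Sum over streams of data rate times the delay of the link used."""
    return sum(rate * delay(placement[i], placement[j])
               for (i, j), rate in streams.items())


def feasible(operators, capacity, pinned, placement):
    """Check pinning and node capacity constraints for one placement."""
    if any(placement[op] != node for op, node in pinned.items()):
        return False
    load = {node: 0 for node in capacity}
    for op, node in placement.items():
        load[node] += operators[op]
    return all(load[node] <= capacity[node] for node in capacity)


def exhaustive_optimum(operators, streams, capacity, delay, pinned):
    """Try every assignment of operators to nodes and keep the cheapest one."""
    names = list(operators)
    best, best_cost = None, float("inf")
    for choice in product(capacity, repeat=len(names)):
        placement = dict(zip(names, choice))
        if not feasible(operators, capacity, pinned, placement):
            continue
        cost = network_usage(streams, delay, placement)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


# Hypothetical instance: required resources C_i, node capacities C_u.
operators = {"source": 1, "filter": 1, "aggregate": 2, "sink": 1}
streams = {("source", "filter"): 100, ("filter", "aggregate"): 10,
           ("aggregate", "sink"): 1}
capacity = {"edge": 2, "cloud": 10}
pinned = {"source": "edge", "sink": "cloud"}  # data enters at the edge


def delay(u, v):
    return 0 if u == v else 20  # ms between edge and cloud (assumed)


best, best_cost = exhaustive_optimum(operators, streams, capacity, delay, pinned)

# Naive baseline: run everything that is not pinned in the cloud.
cloud_only = {op: pinned.get(op, "cloud") for op in operators}
baseline_cost = network_usage(streams, delay, cloud_only)
```

Here the optimum keeps `filter` next to the source on the edge node, and the ratio `baseline_cost / best_cost` quantifies how far the baseline is from optimal, which is the role ODP plays for real heuristics.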

ODP: Benchmark for placement heuristics

• Pietzuch et al.: distributed placement heuristic that minimizes network usage

P. Pietzuch et al., Network-aware operator placement for stream-processing systems, ICDE ’06.

Challenge 2: placement and replication

• Exploit application-level parallelism by replicating operators

[Figure: operator A followed by operator B; A is replicated into three instances placed between a Split and a Merge operator]
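The split/replicate/merge pattern in the figure can be sketched as follows. The example assumes a stateful word-count operator as "A" and key-based routing in the split stage, so that all tuples with the same key reach the same replica; both choices, and all function names, are illustrative rather than taken from the talk.

```python
import zlib
from collections import Counter


def split(tuples, num_replicas, key=lambda t: t):
    """Route each tuple to one replica by hashing its key.

    A stable hash (CRC32) is used so that the same key always lands on the
    same replica, which keeps per-key state local to that replica.
    """
    partitions = [[] for _ in range(num_replicas)]
    for t in tuples:
        index = zlib.crc32(str(key(t)).encode()) % num_replicas
        partitions[index].append(t)
    return partitions


def count_words(partition):
    """Replicated operator A: a stateful per-key counter."""
    return Counter(partition)


def merge(partial_results):
    """Combine the outputs of all replicas into a single stream result."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return total


words = "fog edge fog cloud edge fog storm".split()
partitions = split(words, num_replicas=3)
result = merge(count_words(p) for p in partitions)
```

Whatever the number of replicas, the merged result matches what a single instance of the operator would have produced; only the degree of parallelism changes.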

Operator placement and replication

ODRP: Optimal DSP Replication and Placement

• We propose ODRP
  – Centralized policy for optimal replication and placement of DSP applications
  – Formulated as Integer Linear Programming (ILP) problem
• Our goals:
  – To jointly determine the optimal number of replicas and their placement
  – To consider multiple QoS attributes of applications and resources
  – To provide a unified general formulation
  – To provide a benchmark for heuristics

V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, Optimal operator replication and placement for distributed stream processing systems. ACM Perf. Eval. Rev., 2017.

ODRP performance

DSP application: DEBS 2015 Grand Challenge

[Figure: application topology with the operators data source, parser, filterByCoordinates, metronome, computeRouteID, countByWindow, partialRank, globalRank and a sink; external systems: RabbitMQ, Redis]

[Figure: response time (s, log scale from 0.001 to 1000) vs. source data rate (20 to 120 tuples/s) for S-ODP_R and S-ODRP_R]

Challenge 3: runtime deployment

• Many factors may change at runtime, e.g.,
  – Load variations, QoS attributes of resources, cost of resources (e.g., due to dynamic pricing schemes), network characteristics, node mobility, …
• How to adapt the placement and replication when changes occur?

Exploit self-adaptive deployment

Self-adaptive deployment

• MAPE (Monitor, Analyze, Plan and Execute)
• Plan phase: how to reconfigure the application deployment

Reconfiguration challenges

• Reconfiguring the deployment has a non-negligible cost!
• Can negatively affect application performance in the short term
  – Application freezing times caused by operator migration and scaling, especially for stateful operators

Perform reconfiguration only when needed

Take into account the overhead for migrating and scaling the operators
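A toy MAPE loop shows how a reconfiguration penalty changes the Plan phase. The cost model below is an assumption made for illustration (a per-replica cost, a fixed penalty when the operator is overloaded, and a one-off penalty for every reconfiguration); it is not the model used in the authors' work, and all names and numbers are hypothetical.

```python
import math


def interval_cost(replicas, rate, capacity, replica_cost, overload_penalty):
    """Cost of running `replicas` replicas for one control interval."""
    overloaded = rate > replicas * capacity
    return replicas * replica_cost + (overload_penalty if overloaded else 0.0)


def plan(current, desired, rate, capacity, replica_cost,
         overload_penalty, reconfiguration_penalty):
    """Reconfigure only if the new deployment is cheaper even after paying
    the one-off reconfiguration penalty."""
    if desired == current:
        return current
    stay = interval_cost(current, rate, capacity, replica_cost, overload_penalty)
    move = (interval_cost(desired, rate, capacity, replica_cost, overload_penalty)
            + reconfiguration_penalty)
    return desired if move < stay else current


def mape_loop(load_trace, replicas, **params):
    """Run one MAPE iteration per observed input rate; return replica counts."""
    history = []
    for rate in load_trace:                                     # Monitor
        desired = max(1, math.ceil(rate / params["capacity"]))  # Analyze
        replicas = plan(replicas, desired, rate, **params)      # Plan
        history.append(replicas)                                # Execute (simulated)
    return history


trace = [80, 150, 160, 90, 250, 240]  # tuples/s per control interval
common = dict(capacity=100, replica_cost=1.0, overload_penalty=10.0)
with_penalty = mape_loop(trace, 1, reconfiguration_penalty=1.5, **common)
without_penalty = mape_loop(trace, 1, reconfiguration_penalty=0.0, **common)
```

With the penalty, the loop does not scale in during the short dip to 90 tuples/s, because the saving of one replica does not repay the reconfiguration; without it, the loop scales in and then has to scale out again one interval later.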

Elastic stateful migration in Storm

• We develop mechanisms for elastic stateful migration in Apache Storm

[Figure: extended Storm architecture; components shown: Nimbus, ZooKeeper, scheduler, MigrationNotifier, ElasticityManager, four Supervisors with worker slots and worker processes, four DDS instances, and the network]

V. Cardellini, M. Nardelli, D. Luzi, Elastic stateful stream processing in Storm, HPCS ’16.

EDRP: Elastic DSP Replication and Placement

• Unified framework for the QoS-aware initial deployment and runtime elasticity management of DSP applications
• We model reconfiguration costs
  – Related to migrating or scaling in/out the operators
• Centralized policy formulated as Integer Linear Programming (ILP) problem

V. Cardellini, F. Lo Presti, M. Nardelli, G. Russo Russo, Optimal operator deployment and replication for elastic distributed data stream processing, under review, 2017.

EDRP performance

• With reconfiguration penalties: availability 95.5%, downtime < 90 s
• Without reconfiguration penalties: availability 81.5%, downtime 370 s

Future work

• Study efficient heuristics to deal with large problem instances
• Deal with uncertainty: take uncertainty of parameters into account and design robust placement algorithms
• Study how to deploy multiple competing applications in the Fog
• Integrate placement decision with SDN
  – With SDN, bring the network into the control loop
• Study cross-layer strategies that involve multiple Big data frameworks in the Fog
  – E.g., Heron + Apache Aurora + Mesos

Acknowledgments – Co-authors

Vincenzo Grassi, Francesco Lo Presti, Matteo Nardelli

Thank you! Any questions?

[email protected]

http://www.ce.uniroma2.it/~valeria