Efficient Point to Multipoint Transfers Across Datacenters

Mohammad Noormohammadpour¹, Cauligi S. Raghavendra¹, Sriram Rao², Srikanth Kandula²

¹University of Southern California, ²Microsoft


Source: https://azure.microsoft.com/en-us/overview/datacenters/how-to-choose/ (Jun 14, 2017)


Inter-Datacenter Networks

• Dedicated WAN networks for a single organization
• Connect many datacenters
  – Increased reliability
  – Load balancing
  – Content is usually served by datacenters closest to users
    • Lower RTT to users ⇒ Higher average throughput (TCP)
    • Fewer hops to users ⇒ Saves WAN bandwidth

Source: S. Jain et al., "B4: Experience with a Globally-Deployed Software Defined WAN", ACM SIGCOMM 2013

Source: C. Hong et al., "Achieving High Utilization with Software-Driven WAN", ACM SIGCOMM 2013


Need data delivery from one point to multiple points

Application
  • CDN, Web
  • Data Recovery
  • Search
  • Recommendation, Ads
  • Databases
  • Geo-Distributed Data Analytics

Reason for delivery to multiple datacenters
  • Getting closer to users
  • Making backup copies
  • Synchronization of state
  • Global load balancing
  • Input for next processing stages


Point to Multipoint (P2MP) Transfers

• An abstraction model
  – Single source
    • Content is located on a source datacenter
  – Receivers are fixed once transfer begins
    • No joins/leaves

[Figure: example topology with datacenters A, B, C, and D]


P2MP Transfers Today

• Usually performed as separate unicast transfers
  – Wastes bandwidth and can increase completion times
• Multicasting
  – Network-driven (e.g. IP Multicast)
    • Locally and gradually built trees far from optimal
    • No load distribution management
    • Complex session management protocols
  – Client-driven (e.g. Overlay Networks)
    • Limited visibility into network status
    • Limited control over routing
• Using Store-and-Forward
  – Storage and bandwidth costs on intermediate datacenters
  – Can lead to excessive delays
  – More engineering work (running agents, chunking, etc.)

[Figure: four example diagrams over datacenters A, B, C, and D with numbered edges]


Our Solution: DCCast

• Send traffic to all destinations over a forwarding tree
  – Saves bandwidth
  – A controller with global view of network status can examine options
  – Selection according to current network load conditions and transfer parameters
• Use rate-allocation and rate-limiting
  – A slotted timeline with fixed rates during timeslots
  – Rate-allocation at controller according to available bandwidth
  – Rate-limiting at end-points
• Main contribution
  – Forwarding tree selection
    • Weight/Cost assignment to edges

[Figure: datacenters A, B, C, and D with a TE Server]


DCCast Procedures

• Update()
  – Is executed at the end of every timeslot
    • Dispatches rate-allocations to end-points (i.e., senders) for rate-limiting
• Allocate($R$)
  – Is executed upon arrival of a transfer request $R$
    1. Selects a forwarding tree $T$ for request $R$
    2. Performs rate-allocation over $T$

[Figure: TE Server with Update(), Allocate(R), and a Rate Allocation Database, exchanging Requests and Rates with DC1, DC2, and DC3]


Selection of Forwarding Trees

• Assume a directed inter-datacenter graph $G$
  – $L_e$ is the total outstanding amount of traffic allocated over any edge $e$
  – Upon arrival of request $R$ with size of $V_R$, every edge $e$ gets a weight of $W_e = V_R + L_e$
  – $R$'s forwarding tree is obtained by finding a minimum weight Steiner Tree
  – Fast heuristics available that often provide results close to optimal

[Figure: rate on edge $e$ over time $t$, divided into timeslots, with capacity $C_e$ marked; $L_e$ is the sum of the blue (allocated) areas]

[Figure: example with datacenters A, B, C, D and request R (Size = 4): edge loads 5, 7, 3, 1, 4, 2, 9 give edge weights 9, 11, 7, 5, 8, 6, 13]
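The weight assignment and tree selection on this slide can be sketched as follows. This is a minimal illustration, not the DCCast implementation: the function names, the example topology and loads, and the choice of a greedy shortest-path heuristic (repeatedly attaching the nearest unreached receiver to the tree) are all assumptions made for the example. The slide only says that fast Steiner tree heuristics exist, not which one is used.

```python
import heapq

def edge_weights(load, volume):
    """Assign W_e = V_R + L_e to every directed edge e."""
    return {edge: volume + l for edge, l in load.items()}

def closest_receiver_path(weights, tree_nodes, remaining):
    """Multi-source Dijkstra from the current tree to the nearest remaining receiver.

    Returns the list of edges leading from the tree to that receiver.
    """
    adjacency = {}
    for (u, v), w in weights.items():
        adjacency.setdefault(u, []).append((v, w))
    dist = {n: 0 for n in tree_nodes}
    prev = {}
    heap = [(0, n) for n in tree_nodes]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u in remaining:
            path = []
            while u in prev:  # walk back until a node already in the tree
                path.append((prev[u], u))
                u = prev[u]
            return path[::-1]
        for v, w in adjacency.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    raise ValueError("a receiver is unreachable from the source")

def forwarding_tree(weights, source, receivers):
    """Greedy heuristic: grow the tree from the source one receiver at a time."""
    tree_nodes, tree_edges = {source}, []
    remaining = set(receivers) - tree_nodes
    while remaining:
        for u, v in closest_receiver_path(weights, tree_nodes, remaining):
            tree_edges.append((u, v))
            tree_nodes.add(v)
        remaining -= tree_nodes
    return tree_edges

# Hypothetical outstanding load L_e per directed edge (not the slide's topology).
load = {("A", "B"): 5, ("A", "C"): 1, ("C", "B"): 2, ("C", "D"): 3, ("B", "D"): 7}
weights = edge_weights(load, volume=4)
tree = forwarding_tree(weights, "A", {"B", "C", "D"})
cost = sum(weights[e] for e in tree)
print(sorted(tree), cost)
```

Because every weight is a volume plus a load, all weights are non-negative, which is what Dijkstra's algorithm requires. The heuristic is not guaranteed to return a minimum weight tree in general.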


Analysis of DCCast forwarding tree selection

• Any forwarding tree has a cost that is the sum of its edge weights
  – Using this cost assignment we stay away from
    • Highly loaded edges
    • Large trees
• Implications of this cost assignment
  – Smaller trees for larger requests ($V_R \gg L_e \Rightarrow W_e \approx V_R$)
  – Trees are selected according to edge loads for smaller requests ($V_R \ll L_e \Rightarrow W_e \approx L_e$)
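To make the two effects explicit, the tree cost can be expanded using the edge weight $W_e = V_R + L_e$ (request size plus outstanding load on the edge). The symbol $|T|$, the number of edges in tree $T$, is notation introduced here for illustration:

```latex
\mathrm{cost}(T) = \sum_{e \in T} W_e
                 = \sum_{e \in T} (V_R + L_e)
                 = |T| \, V_R + \sum_{e \in T} L_e
```

The first term grows with the number of edges and the request volume, which penalizes large trees. The second term grows with the load already on the chosen edges. For large $V_R$ the first term dominates, and for small $V_R$ the second does.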


Rate-allocation

• Complex problem: Trade-offs
  – Static policies: FCFS, ALAP (as late as possible)
    • More predictability
  – Dynamic policies: SRPT, Fair Sharing
    • Better mean times (by resolving priority inversion)
• We used FCFS policy
  – Simple, no rate recalculations
  – Guaranteed completion times given no failures
  – Senders send at maximum available rate starting next timeslot
• Calculation of available rates across timeslots over trees
  – $A_e(t)$ is the available rate over edge $e$ at time $t$
  – Maximum rate of tree $T$ at time $t$ is $r_T(t) = \min_{e \in T} A_e(t)$

[Figure: allocations for requests R1, R2, and R3]
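The FCFS allocation described above can be sketched as follows, under stated assumptions. The data layout (a per-edge list of already allocated rates, one entry per timeslot), the function names, and the example capacities and loads are hypothetical and chosen only to illustrate $r_T(t) = \min_{e \in T} A_e(t)$ on a slotted timeline.

```python
def available_rate(capacity, used, edge, t):
    """A_e(t): capacity of the edge minus what is already allocated in timeslot t."""
    allocated = used[edge][t] if t < len(used[edge]) else 0.0
    return capacity[edge] - allocated

def allocate_fcfs(tree, volume, capacity, used, next_slot, slot_len=1.0):
    """Allocate `volume` over `tree`, starting at `next_slot`, at the maximum
    available rate in every timeslot. Earlier allocations are never changed.

    Returns ({timeslot: rate}, index of the timeslot in which the transfer completes).
    """
    if any(capacity[e] <= 0 for e in tree):
        raise ValueError("every tree edge needs positive capacity")
    rates, remaining, t = {}, volume, next_slot
    while remaining > 1e-9:
        r = min(available_rate(capacity, used, e, t) for e in tree)
        r = min(r, remaining / slot_len)  # do not send more than what is left
        if r > 0:
            for e in tree:
                # Extend the edge's timeline with empty slots up to t if needed.
                used[e].extend([0.0] * (t + 1 - len(used[e])))
                used[e][t] += r
            rates[t] = r
            remaining -= r * slot_len
        t += 1
    return rates, t - 1

# Hypothetical example: three edges of capacity 10; an earlier request already
# uses 6 units on edge (A, C) during timeslots 1 and 2.
tree = [("A", "C"), ("C", "B"), ("C", "D")]
capacity = {e: 10.0 for e in tree}
used = {("A", "C"): [0.0, 6.0, 6.0], ("C", "B"): [], ("C", "D"): []}
rates, done = allocate_fcfs(tree, volume=20.0, capacity=capacity, used=used, next_slot=1)
print(rates, done)
```

Since the new request only fills leftover capacity and never alters earlier allocations, its completion timeslot is known as soon as it is admitted, which matches the "guaranteed completion times given no failures" property listed for FCFS.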


Evaluation

• Evaluated Techniques
  – Selection of Forwarding Trees (Random, MINMAX, DCCast)
  – Rate-allocation policy (FCFS and SRPT)
  – DCCast (P2MP) vs. Point-to-Point (P2P-FCFS and P2P-SRPT)
• Performance Metrics
  – Mean TCT (transfer completion time)
  – Tail TCT
  – Total bandwidth usage
• Traffic Patterns
  – Artificially generated
    • Poisson arrivals
    • Exponential transfer size distribution


Evaluation: Selection of Forwarding Trees

• We considered three approaches
  – Randomly selecting a forwarding tree (Random)
  – Picking the tree with minimal maximum $L_e$ over any edge (MINMAX)
    • Greedy approach
    • Method used in many research works (minimizing maximum utilization)
  – Picking the tree with minimal sum of $W_e$ (DCCast)
• Results
  – Overall bandwidth usage (not shown)
    • Same for all schemes
  – Mean and Tail TCT
    • DCCast < MINMAX < Random


Benefits of DCCast cost assignment over MINMAX

• DCCast limits load balancing for improved BW savings
  – MINMAX does not account for number of edges
  – MINMAX does not account for request volume
• DCCast cost assignment makes it easier to find trees
  – Edge decomposable costs

[Figure: example topology labeled with edge loads $L_e$, comparing the trees chosen by MINMAX and by DCCast for a small request with volume of 1 and a large request with volume of 10]
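The difference between the two selection rules can be illustrated with a small sketch. The two candidate trees and their edge loads below are hypothetical and are not the values from the slide's figure; each candidate is described only by the list of loads $L_e$ on its edges.

```python
def minmax_choice(candidates):
    """MINMAX: pick the tree whose most loaded edge has the smallest load."""
    return min(candidates, key=lambda name: max(candidates[name]))

def dccast_choice(candidates, volume):
    """DCCast: pick the tree with the smallest sum of W_e = V_R + L_e."""
    return min(candidates, key=lambda name: sum(volume + l for l in candidates[name]))

def bandwidth_used(candidates, name, volume):
    """Every edge of the chosen tree carries the full request volume."""
    return volume * len(candidates[name])

# Hypothetical candidates: a short tree over busy edges, a long tree over idle ones.
candidates = {"short": [8, 8], "long": [1, 1, 1, 1, 1]}

for volume in (1, 10):
    m = minmax_choice(candidates)
    d = dccast_choice(candidates, volume)
    print(volume, m, bandwidth_used(candidates, m, volume),
          d, bandwidth_used(candidates, d, volume))
```

In this toy setting MINMAX picks the long tree for both requests because it looks only at the maximum load. The DCCast cost picks the long tree for the small request, where load dominates, and the short tree for the large request, where the number of edges dominates, so the large request consumes less total bandwidth.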


Evaluation: Scheduling Policy

• We proposed use of FCFS for DCCast
  – Simple scheduling and resources guaranteed once scheduled
  – But how much will it lose on Mean TCT?
• SRPT is the best policy for mean times
  – Challenging to implement: Tree eviction and rate recalculation as new requests arrive
  – Starvation of very large transfers
• Results
  – FCFS performs slightly better in Tail times
  – FCFS increases mean times by 50%
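Why SRPT helps mean times can be seen in a toy calculation. The sketch below assumes a single link with unit rate and three transfers that all arrive at time 0, in which case SRPT reduces to serving the shortest transfer first. The sizes are hypothetical, and the example is not meant to reproduce the 50% figure reported above, which comes from the evaluation over forwarding trees.

```python
def completion_times(sizes_in_service_order, rate=1.0):
    """Completion time of each transfer when served one after another."""
    t, out = 0.0, []
    for size in sizes_in_service_order:
        t += size / rate
        out.append(t)
    return out

def mean(values):
    return sum(values) / len(values)

arrival_order = [10, 1, 1]                       # hypothetical transfer sizes
fcfs = completion_times(arrival_order)           # serve in arrival order
srpt = completion_times(sorted(arrival_order))   # shortest remaining size first

print("FCFS mean:", mean(fcfs), "tail:", max(fcfs))
print("SRPT mean:", mean(srpt), "tail:", max(srpt))
```

Under FCFS the two small transfers wait behind the large one, which raises the mean. Under SRPT the large transfer is served last, which is the starvation risk noted on the slide when small transfers keep arriving.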


Evaluation: Comparison with Point-to-Point (P2P)

• Properties of P2P-SRPT scheme
  – Based on K-Shortest paths (for every transfer)
  – Uses SRPT policy to achieve best Mean TCT
  – Rates are calculated using Linear Programming
• Results
  – Both Tail times and BW Usage improved by up to 50% using DCCast
  – DCCast better in Mean times when making larger number of copies


Summary

• Many inter-datacenter transfers follow the P2MP abstraction model
  – One object is to be delivered to many destinations
  – Source and destinations known upon arrival of transfers
  – No joins/leaves
• Perform every P2MP transfer jointly using a forwarding tree
  – Achieve bandwidth savings and reduce tail times
• Opportunistically and dynamically select forwarding trees
  – Allowing all available paths to be potentially used
  – The opposite would be pre-calculating and using K-Minimal Trees


Thank you!


Future Work & Discussion

• Improving Mean TCT
  – Multiple trees each connected to a subset of receivers (addressing the slow receiver)
  – Parallel trees to same subsets of receivers (increasing throughput)
  – SRPT with only BW preemption (trees selected upon request arrivals)
  – Combining forwarding trees with store-and-forward
  – Applying batching techniques for bursty arrival patterns (e.g. apply SJF policy to batches)
  – Applying the fair-sharing policy (rather than FCFS)
• Evaluation using real traces of inter-datacenter traffic
  – Choose scheduling policy according to traffic patterns
• Handling failures
  – Proactive approaches (leaving spare capacity, backup trees)
  – Reactive approaches (rescheduling affected transfers, local activation)
