
Peer-Assisted Content Distribution Networks: Techniques and Challenges Pei Cao Stanford University

Page 1: Peer-Assisted Content Distribution Networks: Techniques and Challenges Pei Cao Stanford University

Peer-Assisted Content Distribution Networks: Techniques and Challenges

Pei Cao

Stanford University

Page 2:

Traditional Intra-Provider Content Distribution Networks

[Figure: distribution hierarchy with levels National Center, Regional Center, Branch, Users]

Page 3:

Peer-to-Peer Content Distribution

[Figure: the same hierarchy of National Center, Regional Center, Branch, Users]

Page 4:

P2P vs CDN

• P2P:
  – No infrastructure cost
  – Supply grows linearly with demand
  – Simple distributed, randomized algorithms
  – No QoS

• CDN:
  – Initial infrastructure cost
  – Centralized scheduling algorithms
  – Network efficiency
  – Capable of supporting QoS

Page 5:

Combine P2P with CDN?

• Use P2P to complement CDN
  – P2P reduces load on the CDN, covers areas where CDN is not installed
  – Must be able to control, or “shape”, P2P traffic

• Use CDN to complement P2P
  – CDN steps in when peer-based distribution is falling short, enabling QoS
  – Must be able to detect when peers won’t meet the delivery time guarantee

Page 6:

Outline

• Review of BitTorrent

• Traffic-shaping BitTorrent: biased neighbor selection

• QoS in BitTorrent: delivery time prediction

Page 7:

BitTorrent File Sharing Network

Goal: replicate K chunks of data among N nodes

• Form neighbor connection graph

• Neighbors exchange data

Page 8:

BitTorrent: Neighbor Selection

[Figure: new peer A, the tracker, file.torrent, a seed holding the whole file, and peers numbered 1 to 5]

Page 9:

BitTorrent: Piece Replication

[Figure: peer A, the tracker, file.torrent, a seed holding the whole file, and peers 1, 3, and 5]

Page 10:

BitTorrent: Piece Replication Algorithms

• “Tit-for-tat” (choking/unchoking):
  – Each peer only uploads to 7 other peers at a time
  – 6 of these are chosen based on amount of data received from the neighbor in the last 20 seconds
  – The last one is chosen randomly, with a 75% bias toward newcomers

• (Local) Rarest-first replication:
  – When peer 3 unchokes peer A, A selects which piece to download
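The unchoking rule above can be sketched as follows. This is a minimal illustration, not the BitTorrent client code: the function name, the `newcomers` input, and the reading of "75% bias" as "pick a newcomer with probability 0.75 when one is available" are all assumptions.

```python
import random

def choose_unchoked(received_bytes, newcomers, rng, slots=7, newcomer_bias=0.75):
    """Return the set of neighbors to unchoke for the next round.

    received_bytes: dict neighbor -> bytes received from it in the last 20 s.
    newcomers: set of neighbors treated as newly joined (hypothetical input).
    """
    # Regular slots: the neighbors that uploaded the most to us recently.
    ranked = sorted(received_bytes, key=lambda n: (-received_bytes[n], n))
    regular = ranked[:slots - 1]

    # Last slot: a random choked neighbor, biased toward newcomers.
    choked = [n for n in ranked if n not in regular]
    if not choked:
        return set(regular)
    new = [n for n in choked if n in newcomers]
    old = [n for n in choked if n not in newcomers]
    if new and (not old or rng.random() < newcomer_bias):
        pool = new
    else:
        pool = old
    return set(regular) | {rng.choice(pool)}
```

With ten neighbors, the six best uploaders always get a slot and the seventh slot goes to one of the remaining four.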

Page 11:

Analysis of BitTorrent

• Conclusion from modeling studies: BitTorrent is nearly optimal in idealized, homogeneous networks
  – Demonstrated by simulation studies
  – Confirmed by theoretical modeling studies

• Intuition: in a random graph, Prob(Peer A’s content is a subset of Peer B’s) ≤ 50%

Page 12:

Traffic-Shaping BitTorrent

Page 13:

Random Neighbor Graph

• Existing studies all assume random neighbor selection
  – BitTorrent no longer optimal if nodes in the same ISP only connect to each other

• Random neighbor selection → high cross-ISP traffic

Page 14:

Difficulty in Traffic-Shaping P2P Applications

• ISPs:
  – Different links have different monetary costs
  – Prefer “clustering” of traffic

• P2P Applications:
  – No knowledge of underlying ISP topology
  – Use randomized algorithms that don’t do well under clustering

• Current solution: throttling → users suffer

Page 15:

A Network-Friendly BitTorrent?

• ISPs inform BitTorrent of their link preferences

• Algorithm of BitTorrent is adjusted such that both users and ISPs benefit

• Example: Biased Neighbor Selection
  – Works when cost function is transitive

Page 16:

Biased Neighbor Selection

• Idea: of N neighbors, choose N-k from peers in the same ISP, and choose k randomly from peers outside the ISP

[Figure: peers clustered inside an ISP]
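A minimal sketch of the N-k / k split, assuming the selector already knows each peer's ISP. The function and parameter names are illustrative, and the fallback of filling from outside when the ISP has too few peers is an assumption, not something the slides specify.

```python
import random

def biased_neighbors(peer, isp_of, candidates, n, k, rng):
    """Choose up to n neighbors: n - k inside the peer's ISP, k outside.

    isp_of: dict peer -> ISP identifier (assumed to be known to the selector).
    """
    home = isp_of[peer]
    others = [c for c in candidates if c != peer]
    internal = [c for c in others if isp_of[c] == home]
    external = [c for c in others if isp_of[c] != home]

    inside = rng.sample(internal, min(n - k, len(internal)))
    # Assumption: if the ISP has fewer than n - k peers, fill from outside.
    outside = rng.sample(external, min(n - len(inside), len(external)))
    return inside + outside
```

For example, with 14 ISPs of 50 peers each, n = 35 and k = 1 gives each peer 34 neighbors inside its own ISP and one outside.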

Page 17:

Implementing Biased Neighbor Selection

• By Tracker
  – Need ISP affiliations of peers
    • Peer to AS maps
    • Public IP address ranges from ISPs
    • Special “X-” HTTP header

• By traffic shaping devices
  – Intercept “peer ↔ tracker” messages and manipulate responses
  – No need to change tracker or client
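One way a tracker might derive ISP affiliation from published IP address ranges is a longest-prefix lookup. The sketch below uses Python's standard `ipaddress` module; the table layout, function names, and ISP labels are hypothetical, and the addresses come from the reserved documentation ranges.

```python
import ipaddress

def build_isp_table(ranges):
    """ranges: iterable of (cidr_string, isp_name) pairs published by ISPs."""
    return [(ipaddress.ip_network(cidr), isp) for cidr, isp in ranges]

def isp_of_ip(table, ip):
    """Return the ISP whose most specific range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, isp in table:
        if addr.version == net.version and addr in net:
            # Prefer the longest matching prefix.
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, isp)
    return best[1] if best else None
```

A linear scan is enough for a sketch; a real tracker with many ranges would likely want a prefix tree instead.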

Page 18:

Evaluation Methodology

• Event-driven simulator
  – Use actual client and tracker code as much as possible
  – Calculate bandwidth contention, assume perfect fair-share from TCP

• Network settings
  – 14 ISPs, each with 50 peers, 100 Kb/s upload, 1 Mb/s download
  – Seed node, 400 Kb/s upload
  – Optional “university” nodes (1 Mb/s upload)
  – Optional ISP bottleneck to other ISPs
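The slides do not describe how the simulator computes fair shares. One simple reading, shown here purely as an assumption, splits each node's upload and download capacity equally among its active flows and gives every flow the smaller of its two shares. This is a simplification and not a full max-min fair allocation.

```python
from collections import Counter

def flow_rates(flows, upload_cap, download_cap):
    """Per-flow rate under equal sharing at both endpoints.

    flows: list of (uploader, downloader) pairs, one per active transfer.
    upload_cap / download_cap: dict node -> capacity in Kb/s.
    """
    up_count = Counter(u for u, _ in flows)
    down_count = Counter(d for _, d in flows)
    return {
        (u, d): min(upload_cap[u] / up_count[u], download_cap[d] / down_count[d])
        for u, d in flows
    }
```

With the settings above, a 400 Kb/s seed serving two peers gives each 200 Kb/s, well under a peer's 1 Mb/s download capacity, so upload bandwidth is the binding constraint.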

Page 19:

Limitation of Throttling

Page 20:

Throttling: Cross-ISP Traffic

[Figure: bar chart of redundancy (y-axis, 0 to 50) versus bottleneck bandwidth (x-axis: no throttling, 2.5 Mbps, 1.5 Mbps, 500 kbps)]

Redundancy: average number of times a data chunk enters the ISP
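The redundancy metric can be computed from a transfer log as sketched below. The log format (source, destination, chunk) and the function name are assumptions made for illustration.

```python
def redundancy(transfers, isp_of, isp, num_chunks):
    """Average number of times a chunk enters `isp` from outside.

    transfers: iterable of (src_peer, dst_peer, chunk_id) records.
    isp_of: dict peer -> ISP identifier.
    """
    entries = 0
    for src, dst, _chunk in transfers:
        # Count only transfers that cross into the ISP.
        if isp_of[dst] == isp and isp_of[src] != isp:
            entries += 1
    return entries / num_chunks
```

A redundancy of 1 means every chunk crossed into the ISP exactly once, which is the minimum when the seed is outside; with 50 peers per ISP, 50 would mean no peer fetched any chunk from inside its own ISP.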

Page 21:

Biased Neighbor Selection: Download Times

Page 22:

Biased Neighbor Selection: Cross-ISP Traffic

[Figure: bar chart of redundancy (y-axis, 0 to 50) versus neighbor selection (x-axis: regular, k=17, k=5, k=1)]

Page 23:

Importance of Rarest-First Replication

• Random piece replication performs badly
  – Increases download time by 84% - 150%
  – Increases traffic redundancy from 3 to 14

• Biased neighbors + Rarest-First → More uniform progress of peers
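For reference, local rarest-first piece selection can be sketched as follows. The data layout and names are illustrative assumptions; rarity is counted only over the downloader's own neighbors, which is what makes it "local".

```python
import random
from collections import Counter

def pick_rarest_piece(my_pieces, uploader_pieces, neighbor_pieces, rng):
    """Pick the piece to request from an uploader using local rarest-first.

    neighbor_pieces: dict neighbor -> set of piece ids that neighbor holds.
    Returns None if the uploader has nothing we still need.
    """
    wanted = set(uploader_pieces) - set(my_pieces)
    if not wanted:
        return None
    # Count how many of our neighbors hold each piece.
    counts = Counter()
    for pieces in neighbor_pieces.values():
        counts.update(set(pieces))
    rarest = min(counts[p] for p in wanted)
    candidates = sorted(p for p in wanted if counts[p] == rarest)
    # Break ties randomly so peers do not all chase the same piece.
    return rng.choice(candidates)
```

Random piece replication, the baseline on this slide, would replace the rarity filter with `rng.choice(sorted(wanted))`.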

Page 24:

Presence of External High-Bandwidth Peers

• Biased neighbor selection alone:
  – Average download time same as regular BitTorrent
  – Cross-ISP traffic increases as # of “university” peers increases
    • Result of tit-for-tat

• Biased neighbor selection + Throttling:
  – Download time only increases by 12%
    • Most neighbors do not cross the bottleneck
  – Traffic redundancy (i.e. cross-ISP traffic) same as the scenario without “university” peers

Page 25:

Comparison with Simple Clustering

• Gateway peer: only one peer connects to the peers outside the ISP, all other peers only connect to peers inside the ISP
  – Gateway peer must have high bandwidth
    • It is the “seed” for this ISP
  – Ends up benefiting peers in other ISPs

Page 26:

Combining Biased Neighbor Selection with Caches

• Under random neighbor selection
  – bandwidth requirement of cache is high

• Under biased neighbor selection
  – bandwidth needed from the cache is reduced by an order of magnitude

Page 27:

Conclusions

• By choosing neighbors well, BitTorrent can achieve high peer performance without increasing ISP cost
  – Biased neighbor selection: choose initial set of neighbors well
  – Can be combined with throttling and caching

BitTorrent’s algorithm can be shaped!

Page 28:

Delivery Time Prediction

Page 29:

Motivation

• Provide delivery time guarantee under P2P+CDN

• What contributes to delivery time of a download via BitTorrent?
  – From simulations: seed bandwidth and even replication of blocks
  – Missing: node join/leave dynamics, TCP effects, etc.

Page 30:

Side-by-Side Live Experiments

• Two clients, running on the same machine, starting at the same time, downloading the same file

• 13 experiments from Apr-May 2006

• File sizes: 700 MB ~ 1.4 GB

• Network size: 1100 ~ 2100 peers

• Duration: 10 hrs ~ 2 days

Page 31:

Results from Experiments

• Effective download rate: 10 ~ 30 KB/s

• Speed difference between the two peers: 3% ~ 82%

• What made the slower peer slow?

Page 32:

Suspicion #1: Slower Neighbors?

• Calculate unweighted average of observed throughput at application level
  – R1: average from all neighbors
  – R2: average from neighbors uploading >250 KB of data
  – R3: average from neighbors uploading >2.5 MB of data

• Low correlation between download-time ratio and neighbor-speed ratio
  – 0.57 for R1, 0.43 for R2, 0.47 for R3
  – Faster neighbors correspond to slower downloads in 3 experiments
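The three averages and a correlation coefficient can be computed as in the sketch below. The record layout is hypothetical, 1 KB is taken as 1000 bytes, and Pearson correlation is an assumption, since the slides do not say which correlation measure was used.

```python
import math

def average_neighbor_rate(neighbors, min_bytes=None):
    """Unweighted mean throughput over neighbors that uploaded > min_bytes.

    neighbors: list of (bytes_uploaded, throughput_kBps) pairs.
    min_bytes=None includes every neighbor (the R1 case).
    """
    rates = [rate for nbytes, rate in neighbors
             if min_bytes is None or nbytes > min_bytes]
    return sum(rates) / len(rates) if rates else 0.0

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# R1, R2, R3 for one peer, following the thresholds on the slide.
sample = [(0, 0.0), (100_000, 5.0), (300_000, 20.0), (3_000_000, 40.0)]
r1 = average_neighbor_rate(sample)
r2 = average_neighbor_rate(sample, 250_000)
r3 = average_neighbor_rate(sample, 2_500_000)
```

Computing the ratio of each average between the two side-by-side peers in every experiment, and correlating it with their download-time ratio across the 13 experiments, would reproduce the kind of comparison the slide reports.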

Page 33:

Suspicion #2: Fewer Neighbors Uploading to the Peer?

• Slot analysis: calculate download concurrency
  – Maximum number of neighbors: 35
  – Neighbors come and go → align neighbors into 35 slots
  – Calculate time-average of number of concurrent slots with neighbors uploading

• Upload concurrency varies from 7 to 11
  – Explains one of the download-time/neighbor-speed reversal cases
  – But doesn’t explain the two others
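The time-averaged concurrency can be sketched without modeling the 35 slots explicitly: the integral of "number of neighbors uploading" over time equals the sum of the individual upload durations. The interval format and function name are assumptions.

```python
def average_concurrency(intervals, start, end):
    """Time-average number of neighbors uploading at once over [start, end].

    intervals: list of (t_begin, t_end), one per period in which some
    neighbor was uploading to the peer.
    """
    total = 0.0
    for begin, finish in intervals:
        # Clip each upload period to the observation window.
        overlap = min(finish, end) - max(begin, start)
        if overlap > 0:
            total += overlap
    return total / (end - start)
```

Two neighbors uploading for the whole window give a concurrency of 2.0; the values of 7 to 11 on this slide would come from applying the same average to a full download trace.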

Page 34:

“Close” Neighbors

• 90% of data downloaded from 1-4% of neighbors

• Let F(p) and G(p) be the number of neighbors that provide fraction p of the data to peers F and G; then

F(p) > G(p) → peer F is slower than G
  – This holds for p = 90%, 75%, and 50%
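One way to compute F(p) for a peer is to count how many of its largest contributors are needed to cover fraction p of the downloaded bytes. This reading of the definition, the function name, and the use of an integer percentage (to avoid floating-point boundary effects) are assumptions.

```python
def close_neighbor_count(bytes_by_neighbor, percent):
    """Smallest number of top neighbors that supply `percent`% of the data.

    bytes_by_neighbor: dict neighbor -> bytes downloaded from that neighbor.
    percent: integer percentage, e.g. 90 for p = 90%.
    """
    total = sum(bytes_by_neighbor.values())
    if total == 0:
        return 0
    acc = 0
    count = 0
    for nbytes in sorted(bytes_by_neighbor.values(), reverse=True):
        # Stop once the running total covers the requested share.
        if acc * 100 >= percent * total:
            break
        acc += nbytes
        count += 1
    return count
```

Comparing this count for the two side-by-side peers at 90, 75, and 50 percent corresponds to the F(p) versus G(p) comparison above.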

Page 35:

What makes a neighbor close?

• Not related to speed, or order of connection to peer, or order of unchoking by peer

Page 36:

Cost of Departure of a Close Neighbor

• Departure cost: if one close neighbor leaves, calculate the time until the earliest next close neighbor

• The average departure cost: 30 min

→ The convergence time of the tit-for-tat algorithm is slow
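The departure cost is only loosely defined on the slide. The sketch below takes one possible reading, stated as an assumption: for each close neighbor that leaves, measure the time until the next close-neighbor relationship begins. The input format and function name are hypothetical.

```python
import bisect

def departure_costs(close_periods):
    """Time from each close-neighbor departure to the next close-neighbor start.

    close_periods: list of (start, end) times, one per close-neighbor
    relationship of a single peer. Departures with no later start are skipped.
    """
    starts = sorted(s for s, _ in close_periods)
    costs = []
    for _, end in close_periods:
        # First relationship that begins at or after this departure.
        i = bisect.bisect_left(starts, end)
        if i < len(starts):
            costs.append(starts[i] - end)
    return costs
```

Averaging the returned list over a trace gives a single figure comparable to the 30-minute average reported above.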

Page 37:

Why Do Close Neighbors Leave?

• Five possible reasons
  – A: Random disconnect
  – B: Finished downloading
  – C: Peer broke off the relationship
  – D: Neighbor broke off the relationship

• Results: B is most common, followed by C/D, then A

Page 38:

Conclusions

• Content delivery time in BitTorrent is determined by:
  – Neighbor upload speed
  – Stability of neighbor relationship
    • Disruption of the pairing leads to long delivery time
    • Neighbors may leave due to random disconnection, completion of download, or finding faster neighbors

Page 39:

Using CDN to Complement P2P

• Use CDN nodes as high-speed, specially managed seeds

• Seeds are called to help whenever a node loses a close neighbor

Page 40:

Summary

• A way to shape BitTorrent traffic

• Predicting BitTorrent performance by monitoring close peer relationship

Page 41:

Related Work

• Many modeling studies of BitTorrent

• Simulation studies

• Measurements of real torrents

Page 42:

Ongoing Work

• Live experiments with biased neighbor selection

• A k-regular graph algorithm with faster convergence

• Prototype implementation of “P2P+CDN”