Tapestry: Finding Nearby Objects in Peer-to-Peer Networks

Joint with: Ling Huang, Anthony Joseph, Robert Krauthgamer, John Kubiatowicz, Satish Rao, Sean Rhea, Jeremy Stribling, Ben Zhao


Object Location

Behind the Cloud

Why nearby? (DHT vs. DOLR)

Nearby = low stretch: the ratio of the distance traveled to find an object to the distance to the closest copy of that object

• Objects are services, so distance isn’t a one-time cost (see COMPASS)

• (Smart) publishers put objects at chosen locations in the network
  – Bob Miller places the retreat schedule at a node in Berkeley

• Wildly popular objects
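
The stretch definition on this slide can be made concrete with a small calculation. This is only an illustrative sketch: the function name `stretch` and the latency figures are hypothetical and do not come from the talk.

```python
def stretch(distance_traveled: float, distance_to_closest_copy: float) -> float:
    """Ratio of the distance actually traveled to find an object
    to the distance to the closest copy of that object."""
    if distance_to_closest_copy <= 0:
        raise ValueError("distance to the closest copy must be positive")
    return distance_traveled / distance_to_closest_copy


# Hypothetical numbers: the query traveled 30 ms worth of network distance,
# while the closest copy of the object was only 10 ms away.
example = stretch(distance_traveled=30.0, distance_to_closest_copy=10.0)
print(f"stretch = {example}")
```

A stretch of 1 would mean the query went straight to the closest copy; larger values mean more wasted distance.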

Well-Placed Objects

Popular Objects

Outline

• Low stretch dynamic peer-to-peer network
• Tolerate failures in network
• Adapting to network variation
• Future work

Distributed Hash Tables

System          Neighbors   Motivating Structure   Hops
CAN, 2001       O(r)        grid                   O(r n^(1/r))
Chord, 2001     O(log n)    hypercube              O(log n)
Pastry, 2001    O(log n)    hypercube              O(log n)
Tapestry, 2001  O(log n)    hypercube              O(log n)

• These systems give
  – Guaranteed location
  – Join and leave algorithms
  – Load-balanced storage

• No stretch guarantees

Low Stretch Approaches

System                Stretch   Space          Balanced  Metric
Awerbuch-Peleg, 1991  polylog   polylog        no        General
PRR, 1997             O(1)      O(log n)       yes       Special
Thorup-Zwick          O(k^2)    O(k n^(1/k))   yes       General
RRVV, 2001            polylog   polylog        yes       General

• These schemes are not dynamic

Tapestry is the first dynamic low-stretch scheme

PRR/Tapestry

[Figure labels: City, State, Country]

PRR/Tapestry

Two object types: red and blue, so two trees

[Figure labels: Level 1, Level 2, Level 3]

Neighbor Table for “5471” (Octal)

Routing Levels
  1      2      3      4
0xxx   50xx   540x   5470
1xxx   51xx   541x   5471
2xxx   52xx   542x   5472
3xxx   53xx   543x   5473
4xxx   54xx   544x   Ø
5xxx   55xx   545x   5475
6xxx   56xx   546x   Ø
7xxx   57xx   547x   5477
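
To make the neighbor table for node 5471 easier to read, here is a minimal sketch of how a prefix-routing lookup could pick a table slot: match as many leading digits of the destination as possible, then use the next digit to choose the entry at the next level. The helper names (`shared_prefix_len`, `table_entry`, `next_hop_pattern`) are hypothetical, and the sketch only returns the table pattern from the slide; it does not model what a real node does when a slot is empty (Ø).

```python
from typing import Optional

NODE_ID = "5471"                 # the node whose table is shown on the slide
EMPTY_SLOTS = {"5474", "5476"}   # the two Ø entries at routing level 4


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that a and b have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def table_entry(level: int, digit: int) -> Optional[str]:
    """Pattern stored at routing level `level` (1-based) for next digit `digit`,
    or None if the slide shows that slot as empty."""
    prefix = NODE_ID[: level - 1]
    pattern = prefix + str(digit) + "x" * (len(NODE_ID) - level)
    return None if pattern in EMPTY_SLOTS else pattern


def next_hop_pattern(dest: str) -> Optional[str]:
    """Table pattern this node would forward toward for destination `dest`."""
    if dest == NODE_ID:
        return NODE_ID  # already at the destination
    level = shared_prefix_len(NODE_ID, dest) + 1
    return table_entry(level, int(dest[level - 1]))


for dest in ["2300", "5061", "5416", "5475", "5474"]:
    print(dest, "->", next_hop_pattern(dest))
```

Each hop extends the matched prefix by at least one digit, which is why the number of hops is bounded by the number of digits in an ID.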

Balancing Load

[Figure: nodes labeled NodeID 5123, 5471, 5416, 5061, 5432, 5455, and 5470, connected by links annotated with numbers from 1 to 4]

Big Challenge: Joining Nodes

Theorem 1 [HKRZ02] When peer A is finished inserting, it knows about all relevant peers that have finished insertion.

Results

• Correctness; O(log n) insert & delete
  – Concurrent inserts in a lock-free fashion

• Neighbor-search routine
  – Required to keep low stretch
  – All low-stretch schemes do something like this

• Zhao, Huang, Stribling, Rhea, Joseph & Kubiatowicz (JSAC)
  – This works! Implemented algorithms
  – Measured performance

Neighbor Search

In growth-restricted networks (with no additional space!):

Theorem 2 [HKRZ02] Can find the nearest neighbor with high probability using O(log^2 n) messages

Theorem 3 [HKMR04] Can find the nearest neighbor, and the number of messages is O(log n) with high probability

Outline

• Low stretch dynamic peer-to-peer network
• Tolerate failures in network
• Adapting to network variation
• Future work

Behind the Cloud Again

Dealing with faults

• Multiple paths
  – Castro et al.
  – One failure along a path, and the path breaks

• Wide path
  – Paths must be faulty at the same place to break

• Exponential difference in the effect of width

• “Retrofit” Tapestry to do the latter in slightly malicious networks
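
The contrast between the two approaches can be illustrated with a deliberately simplified probability model. This is an assumption made for illustration, not the analysis from the talk: every node is bad independently with probability `p`, a route has `hops` intermediate steps, "multiple paths" means `k` node-disjoint single paths, and "wide path" means `width` candidate nodes at every step. The function names are hypothetical.

```python
def multipath_failure(p: float, hops: int, k: int) -> float:
    """Probability that all k independent single paths break.
    A single path breaks if any one of its hops is bad."""
    path_ok = (1.0 - p) ** hops
    return (1.0 - path_ok) ** k


def wide_path_failure(p: float, hops: int, width: int) -> float:
    """Probability that a wide path breaks.
    A step fails only if all `width` nodes at that step are bad."""
    step_ok = 1.0 - p ** width
    return 1.0 - step_ok ** hops


# Hypothetical parameters: 20% bad nodes, 4 hops, redundancy of 3 either way.
p, hops, redundancy = 0.2, 4, 3
print("multiple paths:", multipath_failure(p, hops, redundancy))
print("wide path:     ", wide_path_failure(p, hops, redundancy))
```

Under this toy model the same amount of redundancy helps a wide path far more, because the bad nodes have to line up at the same step to break it.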

Failed!

Still good…

Effective even for small overhead

Theorem 4 In growth-restricted spaces, can make the probability of a failed route less than 1/n^c for width O(c log n) [Hildrum & Kubiatowicz, DISC02]

[Plot: % failed routes (0 to 100) versus fraction of bad nodes (0.1 to 0.5); six series labeled 1 to 6]

Wide path vs. multiple paths

[Plot: failed paths (0 to 90) versus fraction of bad nodes (0 to 0.6); legend entries “4” and “4 Single”]

Outline

• Low stretch dynamic peer-to-peer network
• Tolerate failures in network
• Adapting to network variation
• Future work

Digit size affects performance

[Plot: work (0 to 600) versus base (0 to 25)]
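
One way to see why the digit size matters, sketched under stated assumptions: in a prefix-routing scheme over `n_nodes` IDs written in base `base`, an ID needs roughly log_base(n_nodes) digits, so a route takes about that many levels, while each level of the neighbor table holds `base` slots. The "work" measured in the plot is not defined in this transcript, so the sketch below only tabulates these two quantities; the function names and the network size are hypothetical.

```python
def routing_levels(n_nodes: int, base: int) -> int:
    """Smallest number of base-`base` digits that can name n_nodes distinct IDs."""
    if n_nodes < 1 or base < 2:
        raise ValueError("need n_nodes >= 1 and base >= 2")
    levels, capacity = 0, 1
    while capacity < n_nodes:
        capacity *= base
        levels += 1
    return levels


def table_slots(n_nodes: int, base: int) -> int:
    """Total neighbor-table slots: `base` slots at each routing level."""
    return base * routing_levels(n_nodes, base)


n_nodes = 4096  # hypothetical network size
for base in (2, 4, 8, 16):
    print(f"base {base:2d}: {routing_levels(n_nodes, base):2d} levels, "
          f"{table_slots(n_nodes, base):2d} table slots")
```

A larger base shortens routes but enlarges every node's table, so the best choice depends on what each of those costs in a given part of the network.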

Network not homogeneous

Previous schemes picked a digit size
• How do we find a good one?
• But what if there isn’t one?

[Figure labels: San Francisco, Nebraska, Paris]

New Result

• Pick digit size based on local measurements
• Don’t need to guess
• Vary digit size depending on location
  – No, it’s not obvious that this works, but it does!

Hildrum, Krauthgamer & Kubiatowicz [SPAA04]:

Dynamic, locally optimal low-stretch network

Conclusions and Future Work

Conclusion
– Low stretch object location is practical
  • System provably good [HKRZ02]
  • System built [ZHSJK]

Open Questions
– Do we need a DOLR?
  • Object placement schemes? Workload?
– Examples where low stretch, load balance, and low storage are not possible simultaneously
  • What is the tradeoff between degree, stretch, and load balance as a function of the graph?
  • Can we get the best possible? Trade off smoothly?

Tapestry People

• Ling Huang

• Anthony Joseph

• John Kubiatowicz

• Sean Rhea

• Jeremy Stribling

• Ben Zhao

• and… OceanStore group members