Minseok Kwon Department of Computer Science Rochester Institute of Technology [email protected]

1

Minseok Kwon

Department of Computer ScienceRochester Institute of Technology

[email protected]

http://www.cs.rit.edu/~jmk

Week 2: P2P Overlay Week 2: P2P Overlay NetworksNetworks

2

Peer-to-Peer (P2P) Networks

Internet

PublisherKey = “Davinci Code”

File = mpg data

ClientLookup(“Davinci Code”)

File = mpg data

• Lookup problem: Given a network of peers and a keyword, how can we find the data?

N1

N4

N5

N6

N2

N7

N3

N8

Client: Lookup (“title”)

• Structured P2P: Can we find a target efficiently, specifically in O(log n) steps?

3

Distributed Hash Table (DHT)• Question: Can we do the lookup in the logarithmic time?

If yes, how?• In CS 3, we know how to do this if numbers are sorted in

an array. Run binary search!• In P2P networks, we first compute node identifiers and

key identifiers using a hash function. • Key-identifier = SHA-1(keyword)• Node-identifier = SHA-1(IP address)

• We then arrange nodes and data items in a certain order in the same ID space. This is called DHT!

• Now, how can we map key IDs to node IDs to support the lookup?

• At the same time, the system must be scalable, robust, self-organizing, and completely decentralized.

4

CAN: Insert/Retrieve Files

y = hy(K)

I

• Node I wants to add file F with keyword K.

• First, get the target coordinates x and y.

• Second, route to the target (x, y).

• Third, store F at the target.

• How can we retrieve F later?

• Answer: get the target coordinates using the same hash functions.

(K,F)

x = hx(K)

5

CAN: Routing

(a,b)

(x,y)

• Greedy approach• Select one of the

neighbors which gets closer to the target.

6

CAN: Node Join

(p,q)

I

X

new node

• A new node contacts a random node I initially.

• I routes the new node to other randomly picked target (p, q), which is node X.

• The node X zone is split in half in which the new node takes one half.

new node

7

• Suppose that there are n nodes and d dimensions.• Scalable?

• Inserting a new node affects only a single other node and its immediate neighbors

• A node only maintains state for its immediate neighboring nodes. The number of neighbors is 2d.

• Efficient?• Average routing path: (dn1/d)/4 hops.

• Robust?• Resilient to node and link failures• No single point of failure since the system is completely

distributed.

How Good is CAN?

8

Chord: Basic Lookup

N32

N105

N90

Circular 7-bitID space

K5

K20

Key 5:

Node 105:

Which node is K80 stored at?

A key is stored at its successor

9

Chord: Advanced Lookup

N80

N120 1/2

1/4

1/8

1/161/32

Finger i points to successor of n+2^i

N80

N99

N110

N5

N10

N20

N32

N60

K19

Lookup(K19)

Lookup takes O(log(N)) steps

10

Chord: Join

N25

N40

K30, K38

K36 wants to join

N25

N40

K30, K38

K36 wants to join N36 K30

11

How Good is Chord?• Suppose that there are n nodes.• Scalable?

• O(log(n)) states per lookup

• Efficient?• O(log(n)) messages (or steps) per lookup

• Robust?• Survives massive failures

12

Is DHT a Silver Bullet?• What is the dominant P2P applications?

• Mass-market file sharing mostly for music, video, and software

• Potential problems with these applications?• Wide range of heterogeneity• Large transient user population

• Flooding does not scale; how about DHTs?• DHTs scale, especially for finding the location of a given

filename.• One problem with DHTs: require lots of extra work including

caching, keyword searching

• Do we really need DHTs for mass-market file sharing?• NOT necessarily!• DHTs are great at finding rare files, but most queries are about

popular files.

13

Suggested Solution: GIA• Can we make Gnutella-like P2P systems

scalable?• Our answer is yes, and we designed GIA!• Idea

• Unstructured (based on flooding), but take node capacity into account.

• High-capacity nodes have room for more queries; why not sending more queries to them?

• This will work only if: • High-capacity nodes have correspondingly more answers.• High-capacity nodes are easily reachable from other

nodes.

14

GIA (Gianduia) Design• Make high-capacity nodes easily reachable

• Dynamic topology adaptation

• Make high-capacity nodes have more answers• One-hop replication

• Search efficiently• Biased random walks

• Prevent overloaded nodes• Active flow control Query

15

Dynamic Topology Adaptation• Goals

• High capacity nodes are high-degree nodes.• Low capacity nodes are close to higher capacity ones.

new nodehigh-capacity node

• How does a new node join?• A new node selects the node with max capacity (> its own

capacity) among a small number of randomly selected nodes.

• Uses a level of satisfaction as a function of capacity, degree, and age.

• Neighbors must have outgoing capacity to handle forwarded queries.

16

One-hop Replication

• Content information is exchanged during connection, and updated incrementally.

• High-capacity peers can act as a proxy for low capacity peers.• Nodes keep an index of their neighbors’ shared files.

high-capacity node

high-capacity node maintains indices for bothitself and neighbors

17

Active Flow Control• Sender sends queries to a neighbor only if that

neighbor accepts queries.

• How to know whether or not accept queries?• If that neighbor sends a token…

• Active flow control periodically assigns tokens to neighbors.

• Token allocation rate varies on query processing capability and buffer queue.

18

Search Protocol• Biased random walk

• Rather than forwarding incoming queries to randomly chosen neighbors, a node selects the highest capacity neighbor for which it has flow-control tokens and sends the query to that neighbor.

• Use GUID to send queries to different paths.• TTL and max_responses bound propagation.

• Advantage: reduce flooding• Disadvantage: sensitive to peer failures

19

Performance

0.00001

0.001

0.1

10

1000

0.01 0.1 1Replication Rate (percentage)

Collapse Point (qps/node)

GIA: N=10,000

SUPER: N=10,000

RWRT: N=10,000

FLOOD: N=10,000

20

Transient Behavior

• GIA outperforms under heavy churn.

0.001

0.01

0.1

1

10

100

1000

10 100 1000 10000Per-node max-lifetime (seconds)

Collapse point (qps/node)

replication rate = 1.0%



21

Summary• GIA: scalable Gnutella

• 3-5 orders of magnitude improvement in system capacity.

• Unstructured approach is good enough• DHTs may be overkill!

Documents

Minseok Kwon Department of Computer Science Rochester Institute of Technology [email protected]