46
An Overview of Peer- to-Peer Sami Rollins 11/14/02

An Overview of Peer-to-Peer

  • Upload
    mali

  • View
    63

  • Download
    0

Embed Size (px)

DESCRIPTION

An Overview of Peer-to-Peer. Sami Rollins 11/14/02. Outline. P2P Overview What is a peer? Example applications Benefits of P2P P2P Content Sharing Challenges Group management/data placement approaches Measurement studies. What is Peer-to-Peer (P2P)?. Napster? Gnutella? - PowerPoint PPT Presentation

Citation preview

Page 1: An Overview of Peer-to-Peer

An Overview of Peer-to-Peer

Sami Rollins

11/14/02

Page 2: An Overview of Peer-to-Peer

Outline

• P2P Overview– What is a peer?– Example applications– Benefits of P2P

• P2P Content Sharing– Challenges– Group management/data placement approaches– Measurement studies

Page 3: An Overview of Peer-to-Peer

What is Peer-to-Peer (P2P)?

• Napster?

• Gnutella?

• Most people think of P2P as music sharing

Page 4: An Overview of Peer-to-Peer

What is a peer?

• Contrasted with Client-Server model

• Servers are centrally maintained and administered

• Client has fewer resources than a server

Page 5: An Overview of Peer-to-Peer

What is a peer?

• A peer’s resources are similar to the resources of the other participants

• P2P – peers communicating directly with other peers and sharing resources

Page 6: An Overview of Peer-to-Peer

Levels of P2P-ness

• P2P as a mindset– Slashdot

• P2P as a model– Gnutella

• P2P as an implementation choice– Application-layer multicast

• P2P as an inherent property– Ad-hoc networks

Page 7: An Overview of Peer-to-Peer

P2P Application Taxonomy

P2P Systems

Distributed ComputingSETI@home

File SharingGnutella

CollaborationJabber

PlatformsJXTA

Page 8: An Overview of Peer-to-Peer

P2P Goals/Benefits

• Cost sharing

• Resource aggregation

• Improved scalability/reliability

• Increased autonomy

• Anonymity/privacy

• Dynamism

• Ad-hoc communication

Page 9: An Overview of Peer-to-Peer

P2P File Sharing

• Content exchange– Gnutella

• File systems– Oceanstore

• Filtering/mining– Opencola

Page 10: An Overview of Peer-to-Peer

P2P File Sharing Benefits

• Cost sharing

• Resource aggregation

• Improved scalability/reliability

• Anonymity/privacy

• Dynamism

Page 11: An Overview of Peer-to-Peer

Research Areas

• Peer discovery and group management

• Data location and placement

• Reliable and efficient file exchange

• Security/privacy/anonymity/trust

Page 12: An Overview of Peer-to-Peer

Current Research

• Group management and data placement– Chord, CAN, Tapestry, Pastry

• Anonymity– Publius

• Performance studies– Gnutella measurement study

Page 13: An Overview of Peer-to-Peer

Management/Placement Challenges

• Per-node state

• Bandwidth usage

• Search time

• Fault tolerance/resiliency

Page 14: An Overview of Peer-to-Peer

Approaches

• Centralized

• Flooding

• Document Routing

Page 15: An Overview of Peer-to-Peer

Centralized

• Napster model• Benefits:

– Efficient search

– Limited bandwidth usage

– No per-node state

• Drawbacks:– Central point of failure

– Limited scale

Bob Alice

JaneJudy

Page 16: An Overview of Peer-to-Peer

Flooding

• Gnutella model• Benefits:

– No central point of failure

– Limited per-node state

• Drawbacks:– Slow searches

– Bandwidth intensive

Bob

Alice

Jane

Judy

Carl

Page 17: An Overview of Peer-to-Peer

Document Routing

• FreeNet, Chord, CAN, Tapestry, Pastry model

• Benefits:– More efficient searching

– Limited per-node state

• Drawbacks:– Limited fault-tolerance vs

redundancy

001 012

212

305

332

212 ?

212 ?

Page 18: An Overview of Peer-to-Peer

Document Routing – CAN

• Associate to each node and item a unique id in an d-dimensional space

• Goals– Scales to hundreds of thousands of nodes

– Handles rapid arrival and failure of nodes

• Properties – Routing table size O(d)

– Guarantees that a file is found in at most d*n1/d steps, where n is the total number of nodes

Slide modified from another presentation

Page 19: An Overview of Peer-to-Peer

CAN Example: Two Dimensional Space

• Space divided between nodes• All nodes cover the entire space• Each node covers either a square or a

rectangular area of ratios 1:2 or 2:1• Example:

– Node n1:(1, 2) first node that joins cover the entire space

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1

Slide modified from another presentation

Page 20: An Overview of Peer-to-Peer

CAN Example: Two Dimensional Space

• Node n2:(4, 2) joins space is divided between n1 and n2

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

Slide modified from another presentation

Page 21: An Overview of Peer-to-Peer

CAN Example: Two Dimensional Space

• Node n2:(4, 2) joins space is divided between n1 and n2

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3

Slide modified from another presentation

Page 22: An Overview of Peer-to-Peer

CAN Example: Two Dimensional Space

• Nodes n4:(5, 5) and n5:(6,6) join

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3 n4n5

Slide modified from another presentation

Page 23: An Overview of Peer-to-Peer

CAN Example: Two Dimensional Space

• Nodes: n1:(1, 2); n2:(4,2); n3:(3, 5); n4:(5,5);n5:(6,6)

• Items: f1:(2,3); f2:(5,1); f3:(2,1); f4:(7,5);

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3 n4n5

f1

f2

f3

f4

Slide modified from another presentation

Page 24: An Overview of Peer-to-Peer

CAN Example: Two Dimensional Space

• Each item is stored by the node who owns its mapping in the space

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3 n4n5

f1

f2

f3

f4

Slide modified from another presentation

Page 25: An Overview of Peer-to-Peer

CAN: Query Example• Each node knows its neighbors

in the d-space• Forward query to the neighbor

that is closest to the query id• Example: assume n1 queries f4• Can route around some failures

– some failures require local flooding

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3 n4n5

f1

f2

f3

f4

Slide modified from another presentation

Page 26: An Overview of Peer-to-Peer

CAN: Query Example• Each node knows its neighbors

in the d-space• Forward query to the neighbor

that is closest to the query id• Example: assume n1 queries f4• Can route around some failures

– some failures require local flooding

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3 n4n5

f1

f2

f3

f4

Slide modified from another presentation

Page 27: An Overview of Peer-to-Peer

CAN: Query Example• Each node knows its neighbors

in the d-space• Forward query to the neighbor

that is closest to the query id• Example: assume n1 queries f4• Can route around some failures

– some failures require local flooding

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3 n4n5

f1

f2

f3

f4

Slide modified from another presentation

Page 28: An Overview of Peer-to-Peer

CAN: Query Example• Each node knows its neighbors

in the d-space• Forward query to the neighbor

that is closest to the query id• Example: assume n1 queries f4• Can route around some failures

– some failures require local flooding

1 2 3 4 5 6 70

1

2

3

4

5

6

7

0

n1 n2

n3 n4n5

f1

f2

f3

f4

Slide modified from another presentation

Page 29: An Overview of Peer-to-Peer

Node Failure Recovery

• Simple failures– know your neighbor’s neighbors– when a node fails, one of its neighbors takes

over its zone

• More complex failure modes– simultaneous failure of multiple adjacent nodes – scoped flooding to discover neighbors– hopefully, a rare event

Slide modified from another presentation

Page 30: An Overview of Peer-to-Peer

Document Routing – Chord

• MIT project• Uni-dimensional ID

space• Keep track of log N

nodes• Search through log N

nodes to find desired key

N32

N10

N5

N20

N110

N99

N80

N60

K19

Page 31: An Overview of Peer-to-Peer

Doc Routing – Tapestry/Pastry

• Global mesh• Suffix-based routing• Uses underlying network

distance in constructing mesh

13FE

ABFE

1290239E

73FE

9990

F990

993E

04FE

43FE

Page 32: An Overview of Peer-to-Peer

Comparing Guarantees

logbNNeighbor map

Pastry

b logbNlogbNGlobal MeshTapestry

2ddN1/dMulti-dimensional

CAN

log Nlog NUni-dimensional

Chord

StateSearchModel

b logbN + b

Page 33: An Overview of Peer-to-Peer

Remaining Problems?

• Hard to handle highly dynamic environments

• Usable services

• Methods don’t consider peer characteristics

Page 34: An Overview of Peer-to-Peer

Measurement Studies

• “Free Riding on Gnutella”

• Most studies focus on Gnutella

• Want to determine how users behave

• Recommendations for the best way to design systems

Page 35: An Overview of Peer-to-Peer

Free Riding Results

• Who is sharing what?

• August 2000

The top Share As percent of whole333 hosts (1%) 1,142,645 37%

1,667 hosts (5%) 2,182,087 70%

3,334 hosts (10%) 2,692,082 87%

5,000 hosts (15%) 2,928,905 94%

6,667 hosts (20%) 3,037,232 98%

8,333 hosts (25%) 3,082,572 99%

Page 36: An Overview of Peer-to-Peer

Saroiu et al Study

• How many peers are server-like…client-like?– Bandwidth, latency

• Connectivity

• Who is sharing what?

Page 37: An Overview of Peer-to-Peer

Saroiu et al Study

• May 2001

• Napster crawl– query index server and keep track of results– query about returned peers– don’t capture users sharing unpopular content

• Gnutella crawl– send out ping messages with large TTL

Page 38: An Overview of Peer-to-Peer

Results Overview

• Lots of heterogeneity between peers– Systems should consider peer capabilities

• Peers lie– Systems must be able to verify reported peer

capabilities or measure true capabilities

Page 39: An Overview of Peer-to-Peer

Measured Bandwidth

Page 40: An Overview of Peer-to-Peer

Reported Bandwidth

Page 41: An Overview of Peer-to-Peer

Measured Latency

Page 42: An Overview of Peer-to-Peer

Measured Uptime

Page 43: An Overview of Peer-to-Peer

Number of Shared Files

Page 44: An Overview of Peer-to-Peer

Connectivity

Page 45: An Overview of Peer-to-Peer

Points of Discussion

• Is it all hype?

• Should P2P be a research area?

• Do P2P applications/systems have common research questions?

• What are the “killer apps” for P2P systems?

Page 46: An Overview of Peer-to-Peer

Conclusion

• P2P is an interesting and useful model

• There are lots of technical challenges to be solved