NETE4631 Network Information Systems (NISs): Peer-to-Peer (P2P)

Suronapee, PhD
suronape@mut.ac.th


P2P

Peer-to-peer:
- "Direct" connections between peers
- Peers are all equal: each is both a sender and a receiver of content

P2P core principles:
- Self-organizing: no central management; peers are completely independent
- Large collection of resources: millions of simultaneous users, voluntary participation
- Scalability: the system scales with respect to the number of nodes


P2P principle

- P2P is an overlay network (on top of the Internet)
  - A virtual network on top of the underlying IP network

[Figure: Overlay graph]


Overlay network


Peer-to-Peer (P2P) Systems

- Old idea
  - 1979: USENET news service (still in use)
- Popular around 1999
  - Napster, Kazaa and Gnutella for sharing files and music
  - 2003: Skype launched (by the creators of Kazaa); acquired by eBay (2005) and Microsoft (2011)
  - 2001: BitTorrent launched; heavily used for file and music sharing
- Still very popular today for sharing multimedia content
  - BitTorrent: 30% of Internet traffic (mid 2000s)
  - Skype: 663M users (2010), 700M minutes a day
- Problem: free riders, who only consume and do not contribute


Current State of P2P

- P2P networks are going strong all over the world
  - Currently P2P accounts for almost 70% of network traffic
- P2P networks are currently used mostly for illegal sharing of copyrighted material
  - Music, videos, software, ...
- Content providers are not so happy
  - They sue companies making P2P software (e.g., Napster), sue software developers (Winny), and sue users sharing material


P2P Application

- The P2P principle is applicable to many kinds of systems
- Content distribution
  - Most current P2P systems target one application: file sharing
  - Users share files (e.g., music, video, software) and others download them
  - Content is also often shared illegally (except BitTorrent)
  - Examples: BitTorrent, Napster, Gnutella, KaZaA
- From academia
  - Chord
- Communication
  - Skype


Napster

- Napster was launched in 1999 by Shawn Fanning
  - The term "P2P" was coined by Napster
  - In 2000: 25% of traffic out of the Univ. of Wisconsin-Madison, 60M users
  - Centralized real-time directory, distributed files, mostly MP3 music
- Based in the USA; lawsuits put it out of business
  - RIAA sued Napster, asking $100K per download
  - Charge: indirectly helping users to infringe copyright
- Currently a paid service
  - Pays a percentage to songwriters and music companies, as copyright requires
  - The Napster protocol is open; anyone is free to develop for it


Napster

- Connect to the Napster server
- Upload the list of music files that you want to share
- The server stores no files
- It maintains a list of <filename, ip_address, portnum> tuples

[Figure: Structure]


Napster search

- Send the server keywords to search with
- The server returns a list of hosts (<ip_address, portnum> tuples) to the client
- The client pings each host in the list to find transfer rates
- The client fetches the file from the best host
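
The directory and search steps can be sketched in a few lines of Python. This is only an illustration of the idea, not the real Napster protocol: the class name `NapsterDirectory`, the helper `pick_best_host`, the addresses and the rate table are all hypothetical, and the "ping" step is replaced by rates supplied by the caller.

```python
from collections import defaultdict


class NapsterDirectory:
    """Central index of <filename, ip_address, portnum>; it never stores files."""

    def __init__(self):
        self.index = defaultdict(set)  # filename -> {(ip, port), ...}

    def register(self, ip, port, filenames):
        # A peer uploads the list of files it is willing to share.
        for name in filenames:
            self.index[name].add((ip, port))

    def search(self, keyword):
        # Return every host sharing a file whose name contains the keyword.
        keyword = keyword.lower()
        hits = set()
        for name, hosts in self.index.items():
            if keyword in name.lower():
                hits |= hosts
        return sorted(hits)


def pick_best_host(hosts, measured_rate):
    # measured_rate maps (ip, port) -> observed transfer rate. A real client
    # would obtain it by pinging each host; here the caller supplies it.
    return max(hosts, key=lambda host: measured_rate[host])


directory = NapsterDirectory()
directory.register("10.0.0.1", 6699, ["song_a.mp3", "song_b.mp3"])
directory.register("10.0.0.2", 6699, ["song_a.mp3"])

hosts = directory.search("song_a")
best = pick_best_host(hosts, {("10.0.0.1", 6699): 40.0, ("10.0.0.2", 6699): 95.0})
print(hosts, best)
```

The file transfer itself would then go directly between the client and `best`, without touching the server.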


Napster Problem

- The centralized server is a source of congestion
- The centralized server is a single point of failure
- Napster.com was declared responsible for users' copyright violation
  - "Indirect infringement"
- Next system: Gnutella


Gnutella

- Eliminates the servers
- Clients search and retrieve amongst themselves
- Clients act as servers too, called servents
- Released by AOL in 2000; 88K users by 2003


How a peer joins a network

- To join the network, a peer needs the address of another peer that is currently a member
- The new peer sends a connect message to the existing peer
  - GNUTELLA CONNECT
- The reply is simply "OK"


Gnutella search

- Gnutella routes different messages within the overlay graph
- The Gnutella protocol has 5 main message types
  - Query (search)
  - QueryHit (response to query)
  - Ping (to probe the network for other peers)
  - Pong (reply to ping, contains the address of another peer)
  - Push (used to initiate file transfer)


Gnutella Message Header Format
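
The figure for this slide did not survive extraction. As a stand-in, the sketch below assumes the Gnutella 0.4 header layout: a 16-byte DescriptorID, a 1-byte payload descriptor, a 1-byte TTL, a 1-byte hop count and a 4-byte little-endian payload length (23 bytes in total). The function names `pack_header` and `unpack_header` are hypothetical helpers, not part of any Gnutella library.

```python
import struct

# Assumed Gnutella 0.4 header: 16-byte DescriptorID, payload descriptor,
# TTL, hops (1 byte each), then a 4-byte little-endian payload length.
HEADER = struct.Struct("<16sBBBI")

# Payload descriptor values for the five main message types.
PAYLOAD_TYPES = {0x00: "Ping", 0x01: "Pong", 0x40: "Push", 0x80: "Query", 0x81: "QueryHit"}


def pack_header(descriptor_id, payload_type, ttl, hops, payload_length):
    """Serialize one message header; descriptor_id must be 16 bytes."""
    if len(descriptor_id) != 16:
        raise ValueError("DescriptorID must be exactly 16 bytes")
    return HEADER.pack(descriptor_id, payload_type, ttl, hops, payload_length)


def unpack_header(data):
    """Parse the first 23 bytes of a message into a dict of header fields."""
    descriptor_id, payload_type, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {
        "descriptor_id": descriptor_id,
        "type": PAYLOAD_TYPES.get(payload_type, "Unknown"),
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }


raw = pack_header(bytes(range(16)), 0x80, ttl=7, hops=0, payload_length=12)
header = unpack_header(raw)
print(len(raw), header["type"], header["ttl"])
```

The DescriptorID is what lets peers recognize duplicates and route replies, and the TTL and hops fields bound how far a flooded message travels.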


Flooding query message

[Figure: Query message]


How do search results come back?


Avoiding excessive traffic

- To avoid duplicate transmissions, each peer maintains a list of recently received messages
- A Query is forwarded to all neighbors except the peer from which it was received
- Each Query (identified by DescriptorID) is forwarded only once
- A QueryHit is routed back only to the peer from which the Query with the same DescriptorID was received
- Duplicates with the same DescriptorID and Payload descriptor (message type) are dropped
- A QueryHit with a DescriptorID for which no Query was seen is dropped
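
These rules can be tried out on a toy overlay. The sketch below is a simplified simulation, not the real protocol: it handles a single Query (one DescriptorID), so the `came_from` table stands in for each peer's list of recently received messages. The function name `flood_query`, the graph and the file table are hypothetical.

```python
from collections import deque


def flood_query(graph, files, origin, keyword, ttl):
    """Flood one Query; return reverse paths of QueryHits and messages sent."""
    came_from = {origin: None}  # peer -> neighbor the Query first arrived from
    queue = deque([(origin, ttl)])
    hits = []
    messages = 0
    while queue:
        peer, remaining = queue.popleft()
        if peer != origin and keyword in files.get(peer, ()):
            # The QueryHit walks back hop by hop along the reverse path.
            path = [peer]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            hits.append(path)
        if remaining == 0:
            continue  # TTL exhausted: do not forward further
        for neighbor in graph[peer]:
            if neighbor == came_from[peer]:
                continue  # never send back to the peer it came from
            messages += 1
            if neighbor in came_from:
                continue  # receiver has seen this DescriptorID: dropped
            came_from[neighbor] = peer
            queue.append((neighbor, remaining - 1))
    return hits, messages


graph = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D"], "D": ["B", "C"]}
files = {"D": {"song.mp3"}}
hits, messages = flood_query(graph, files, "A", "song.mp3", ttl=2)
print(hits, messages)
```

Even in this four-peer overlay, half of the six Query transmissions are duplicates that the receiver drops, which hints at why flooding becomes expensive as the network grows.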


After receiving QueryHit messages

- The requestor chooses the "best" QueryHit responder
  - It initiates an HTTP request directly to the responder's ip+port
- The responder then replies with file packets after this message:


Dealing with Firewalls

- The requestor sends a Push to the responder asking for file transfer
- The responder establishes a TCP connection at the ip_address, port specified. Sends
- The requestor then sends GET to the responder (as before) and the file is transferred as explained earlier


PING-PONG

- Peers initiate Pings periodically
- Pings are flooded out like Queries; Pongs are routed along the reverse path like QueryHits
- Pong replies are used to update the set of neighboring peers
  - This keeps neighbor lists fresh in spite of peers joining, leaving and failing


Problem

- Flooding a query is extremely inefficient
  - Wastes a lot of network and peer resources
  - Repeated searches with the same keywords
  - Solution:
- Gnutella's network management is not efficient
  - Periodic Ping/Pongs consume a lot of resources
  - Ping/Pong constituted 50% of traffic
- Modem-connected hosts do not have enough bandwidth for passing Gnutella traffic
- Another solution: FastTrack System


FastTrack

- Hybrid between Gnutella and Napster
- Takes advantage of "healthier" participants in the system
- Underlying technology in Kazaa, KazaaLite, Grokster
- Like Gnutella, but with some peers designated as supernodes


FastTrack (2)

- A supernode stores a directory listing for a subset of nearby peers (<filename, peer pointer>), similar to Napster servers
- Supernode membership changes over time
- Any peer can become (and stay) a supernode, provided it has earned enough reputation
  - KazaaLite: the participation level (= reputation) of a user is between 0 and 1000, initially 10, then affected by the length of periods of connectivity and the total number of uploads
  - More sophisticated reputation schemes have been invented, especially based on economics (see the P2PEcon workshop)
- A peer searches by contacting a nearby supernode
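
The two-tier idea can be illustrated with a small sketch. FastTrack itself is proprietary, so nothing here reflects its actual messages: the `Supernode` class, its methods and the peer names are hypothetical. It only shows ordinary peers registering their files with a supernode and a search being flooded among supernodes only.

```python
class Supernode:
    """Keeps a Napster-like directory for nearby peers and talks to other supernodes."""

    def __init__(self, name):
        self.name = name
        self.directory = {}  # filename -> set of ordinary peers sharing it
        self.neighbors = []  # other supernodes this one forwards queries to

    def register(self, peer, filenames):
        # An ordinary peer reports its shared files to its supernode.
        for filename in filenames:
            self.directory.setdefault(filename, set()).add(peer)

    def search(self, filename, visited=None):
        # Answer from the local directory, then flood among supernodes only.
        visited = set() if visited is None else visited
        if self.name in visited:
            return set()
        visited.add(self.name)
        found = set(self.directory.get(filename, ()))
        for supernode in self.neighbors:
            found |= supernode.search(filename, visited)
        return found


sn1, sn2 = Supernode("sn1"), Supernode("sn2")
sn1.neighbors.append(sn2)
sn2.neighbors.append(sn1)
sn1.register("peer-a", ["song.mp3"])
sn2.register("peer-b", ["song.mp3", "clip.avi"])

result = sn1.search("song.mp3")
print(sorted(result))
```

Ordinary peers never see the flooded query; only the supernodes do, which is where the saving over plain Gnutella comes from.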


Strength

- Combines good points from Napster and Gnutella
  - Efficient searching under each supernode
  - Flooding restricted to supernodes only
  - Result: efficient searching with "low" resource usage
- Most popular network
  - Lots of content, lots of users
  - Currently most file-sharing networks have adopted this architecture


BitTorrent

- Developed by Bram Cohen in 2001
  - Written in Python, available on many platforms
- BitTorrent is a new approach for sharing large files
  - Distributed directories, distributed files
  - Each file is divided into chunks
    - Each chunk contains 32 KB to 256 KB
    - Each chunk can traverse a different path
- BitTorrent is also widely used for legal content
  - For example, Linux distributions, software patches, official movies
  - Currently there is lots of illegal content on BitTorrent too
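
Dividing a file into chunks can be sketched as follows. The 256 KB default matches the upper end of the range given above; the SHA-1 digest per chunk is an addition that the slides do not mention (it is how BitTorrent clients check each downloaded chunk), and the function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 256 KB, the upper end of the range on the slide


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut the file contents into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_digests(chunks):
    """One SHA-1 digest per chunk, so each chunk can be verified on its own."""
    return [hashlib.sha1(chunk).hexdigest() for chunk in chunks]


data = bytes(600 * 1024)  # a 600 KB dummy file of zero bytes
chunks = split_into_chunks(data)
digests = chunk_digests(chunks)
print(len(chunks), [len(chunk) // 1024 for chunk in chunks])
```

Because every chunk is independent, a downloader can fetch different chunks from different peers at the same time and reassemble them in order.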


Topology of BitTorrent

- Overlay graph
  - (1) physical (2) neighboring peer (3) peering relationship
- A tracker
  - A server which tracks the currently active clients
  - Serves as a centralized directory
- The topology can change regularly
  - Tracker factors: content, distance, peer churn, randomization


BitTorrent: Players

- Three entities are needed to start distribution of a file
- Terminology:
  - A "torrent" file: the metadata about the file
  - Seed: client with a complete copy of the file
  - Leecher: client still downloading the file


BitTorrent Start Up

- A new client gets the torrent file and gets the peer list from the tracker


BitTorrent Operation


Summary of BitTorrent operation

- A new peer A receives a .torrent file from one of the BitTorrent web servers, including the name, size, and number of chunks of a particular file, together with the IP address and port number of the corresponding tracker.
- It then registers with the right tracker. It will also periodically send keep-alive messages to the tracker.
- The tracker sends peer A a list of potential peers (peer set = 50 peers).
- Peer A selects a subset of five peers (following the tit-for-tat and randomization rules) and establishes connections with them.
- Peer A downloads chunks from peers in the peer set and provides them with its own chunks (possibly in parallel)
  - Chunks are typically 256 KB
  - Starting with the rarest chunks
- Every now and then, each peer updates its peer list.
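
The "rarest chunks first" rule can be sketched as a simple ordering over what the neighbors currently hold. This is a minimal illustration, assuming each neighbor advertises the set of chunk indices it has; the function name `rarest_first` and the sample data are hypothetical.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Return the missing chunk indices, least replicated first."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)  # how many neighbors hold each chunk
    wanted = [chunk for chunk in counts if chunk not in my_chunks]
    # Sort by replication count; break ties by chunk index for determinism.
    return sorted(wanted, key=lambda chunk: (counts[chunk], chunk))


neighbor_chunks = {
    "peer-1": {0, 1, 2},
    "peer-2": {1, 2},
    "peer-3": {2, 3},
}
order = rarest_first(my_chunks={0}, neighbor_chunks=neighbor_chunks)
print(order)
```

Chunk 3 is held by a single neighbor, so it is requested before chunks 1 and 2, which are easier to find later.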


Peering construction methods

- The tracker suggests a set of 50 peers
- The new peer picks 5 peers (at this time!) for exchanging chunks
- Contents are exchanged evenly between them (rarest chunk first)
- A peer serves 4 peers in the peer set simultaneously (tit-for-tat)
  - Seeks the 4 best downloaders in the last time slot if it is a seed
  - Seeks the 4 best uploaders in the last time slot if it is a leecher
- The fifth peer is selected at 50% randomly (randomization)
- Choking: limit the number of neighbors with concurrent uploads or downloads to at most a fixed number
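
The "4 best plus 1 random" selection can be sketched as follows. It is a simplification under stated assumptions: `rates` holds what was measured in the last time slot (upload rates toward us for a leecher, download rates from us for a seed), and the fifth peer is drawn uniformly from the remaining ones. The function name `choose_unchoked` and the sample rates are hypothetical.

```python
import random


def choose_unchoked(rates, regular_slots=4, rng=random):
    """Pick the best peers from the last time slot plus one random extra peer."""
    ranked = sorted(rates, key=rates.get, reverse=True)
    regular = ranked[:regular_slots]      # tit-for-tat: reward the best peers
    others = ranked[regular_slots:]
    # Randomization: give one of the remaining peers a chance as well.
    optimistic = rng.choice(others) if others else None
    return regular, optimistic


rates = {"p1": 120, "p2": 300, "p3": 50, "p4": 210, "p5": 10, "p6": 90, "p7": 0}
regular, optimistic = choose_unchoked(rates, rng=random.Random(0))
print(regular, optimistic)
```

All peers outside `regular` and `optimistic` stay choked until the next time slot, when the ranking is recomputed.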


Strength

- Tit-for-tat
  - A peer serves peers that serve it
  - Encourages cooperation, discourages free-riding
- Rarest chunk first
  - Prefer early download of blocks that are least replicated among neighbors
  - Avoids the problem that most peers have most of the chunks but all must wait for the few rare chunks
- Randomization avoids unfairness toward nodes with little upload capacity


Weakness

- The file needs to be quite large
  - 256 KB chunks
  - Rarest first needs a large number of chunks
- Everyone must contribute
  - Low-bandwidth clients are at a disadvantage


How can BitTorrent be free?

- It leverages peer uplink capacities to send chunks of files to each other without deploying many media servers.
- P2P is used for sharing content in BitTorrent.

Scalable?

- Many nodes can be added as the network scales up without creating a bottleneck
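
A rough calculation shows why using peer uplinks helps. The sketch below uses a standard textbook lower-bound model for distribution time, which is not taken from the slides: with client-server, the server must upload the file N times; with P2P, the total upload capacity grows with every peer that joins. All function names, parameter names and numbers are illustrative assumptions.

```python
def client_server_time(file_bits, n_clients, server_up, min_client_down):
    """Lower bound on the time for one server to deliver the file to all clients."""
    return max(n_clients * file_bits / server_up, file_bits / min_client_down)


def p2p_time(file_bits, n_peers, server_up, min_peer_down, peer_ups):
    """Lower bound when peers also upload the chunks they have received."""
    total_up = server_up + sum(peer_ups)
    return max(
        file_bits / server_up,           # the seed must send each bit at least once
        file_bits / min_peer_down,       # the slowest downloader limits completion
        n_peers * file_bits / total_up,  # all copies must fit through total uplink
    )


F = 8e9          # 1 GB file, in bits
N = 100          # number of downloaders
u_s = 100e6      # server/seed uplink: 100 Mbit/s
d_min = 20e6     # slowest downlink: 20 Mbit/s
u_peer = 5e6     # each peer's uplink: 5 Mbit/s

cs = client_server_time(F, N, u_s, d_min)
p2p = p2p_time(F, N, u_s, d_min, [u_peer] * N)
print(round(cs), round(p2p))
```

With these assumed numbers the client-server bound is 8000 seconds while the P2P bound is about 1333 seconds, and the gap widens as N grows because each new peer brings its own uplink.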


P2P versus client-server architecture

- Client-server architecture
  - Each client requests data from the server
  - Clients do not help each other
- P2P
  - A peer is both a sender and a receiver of content
  - Peers help each other in a distributed manner
  - Data transmission is distributed
  - Although the control plane for signaling is centralized


Summary

- Most existing P2P networks are built on searching; however, searching does not scale in the same way
  - Either a centralized system with all its problems
  - Or a distributed system with all its problems
  - Hybrid systems cannot guarantee discovery either
- Alternatively, use addressing instead of searching
  - Distributed hash tables (DHTs): efficient searching and object location in a P2P network
  - Examples: Chord, CAN, Plaxton, Pastry, Tapestry
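
"Addressing instead of searching" can be illustrated with a tiny hash ring in the spirit of Chord. This is a minimal sketch, not the Chord protocol: there are no finger tables, joins or failures, and every node is assumed to know the whole ring. The names `ring_id` and `HashRing`, the 16-bit identifier space and the peer names are hypothetical.

```python
import hashlib
from bisect import bisect_left


def ring_id(name, bits=16):
    """Hash a node name or a key onto a ring of 2**bits identifiers."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((ring_id(node), node) for node in nodes)
        self.ids = [node_id for node_id, _ in self.ring]

    def lookup(self, key):
        # The key belongs to its successor: the first node whose identifier
        # is >= the key's identifier, wrapping around at the end of the ring.
        index = bisect_left(self.ids, ring_id(key)) % len(self.ring)
        return self.ring[index][1]


nodes = ["peer-a", "peer-b", "peer-c", "peer-d"]
ring = HashRing(nodes)
owner = ring.lookup("song.mp3")
print(owner)
```

Every peer computes the same owner for a given key, so a lookup goes straight to the responsible node instead of being flooded through the overlay.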


Reference

- Kangasharju: Peer-to-Peer Networks
- Brinton, Christopher; Chiang, Mung (2013-06-10). Networks Life: 20 Questions and Answers