25
DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park [email protected] DPNM Lab., Dept. of CSE, POSTECH

DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park [email protected] DPNM Lab., Dept

Embed Size (px)

Citation preview

Page 1: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 1/25 -

Peer-to-Peer Algorithms and System

CS600 Assignment #5

Nov. 22. 2006

Byungchul [email protected]

DPNM Lab., Dept. of CSE, POSTECH

Page 2: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 2/25 -

Presentation Outline History of P2P Motivation Definition of P2P Most popular P2P application - Skype Overlay Network

– Unstructured P2P– Structured P2P

Comparison of P2P models Application Services & technology Open Issues Conclusion

Page 3: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 3/25 -

History of P2P Origin of Internet

– ARPANET was a P2P system (not client/server model)– FTP : killer application of early Internet – Usenet and DNS

Revolution of network model – Napster– Developed by Shawn Fanning– MP3 file sharing– Centralized P2P concept

Obstacles of P2P– Firewall– Dynamic IP– NAT

Success of Skype

Page 4: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 4/25 -

Motivation Requirements of Internet application

– Scalability– Security & Reliability– Flexibility & QoS

• Client/Server model can not satisfies the requirements

=>DOS attack, Bottleneck

P2P Network– Self-organizing– More flexible than Client/Server model

Page 5: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 5/25 -

Definition of P2P [1]

Significant autonomy from central servers Exploits resources at the edges of the Internet

– Storage and contents– Computing power– Connectivity and presence

Resources at edge have intermittent connectivity, being added & removed

Page 6: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 6/25 -

Skype [12] One of the most popular VoIP applic

ation Skype provide VoIP and IM services

that works behind NAT and firewalls 50 million users Valued at $2.6 billon Encrypted protocol & proprietary net

work How P2P VoIP traffic in Skype differ

s from other P2P traffic and from traffic in traditional voice-communication?

Resource consumption?

Page 7: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 7/25 -

Skype Overview Skype’s three services

– VoIP : two-way audio stream, conferences of up to 4user– IM (instant messaging) : small text message– File-transfer – Paid service : initiate & receive calls via regular telephone nu

mber through VoIP-PSTN gateway

Related to KaZaA – Supernode-based P2P network

Inherent Structure

Supernode : plenty ofspare bandwidth &publicly reachable

Page 8: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 8/25 -

Overlay Network

Figure 1. from [1]

Page 9: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 9/25 -

Overlay Network (cont) Virtual edge

– TCP connection or a pointer to an IP address

Maintenance of Overlay networks (typical)– Periodically ping to make sure neighbor is still alive– verify liveness while messaging– If neighbor goes down, may establish new edge– New node needs to bootstrap

Page 10: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 10/25 -

Overlay Network (cont) Design flexibility

– Topology– Protocol– Messaging over TCP, UDP, I

CMP

Underlying physical net is transparent to developer

Page 11: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 11/25 -

Unstructured P2P [2]

Unstructured P2P– No relations between peer and contents– Searching : query flooding to central server or neighbor peer

Centralized Server model – Central server keeps peer information – Scalability problem and copyright problem

Pure P2P– No central server (or directory)– Searching : query flooding to neighbor peer– excessive traffic generation

Hybrid P2P– Super node– Hierarchical architecture

Page 12: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 12/25 -

Unstructured P2P Applications Napster (Centralized model) [6]

Application-level, client-server point-to-point TCP

Steps:– connect to Napster– upload list of files– give server search query– receive “best” of correct

answers (ping)

Page 13: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 13/25 -

Unstructured P2P Applications (cont) Gnutella [7,8]

Focus: decentralized method of searching for files– Central directory server no longer problem (bottleneck)– Harder control of the system

Each application instances serves to:– store/serve files– route queries to its neighbors– respond to request queries

Limited scope query– if you don’t have the file you want, query 7 of your neighbors.– if they don’t have it, they contact 7 of their neighbors, for a

maximum hop count of 10.– reverse path forwarding for responses (not files)

Page 14: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 14/25 -

Unstructured P2P Applications (cont) Overlay management of Pure P2P New node uses bootstrap node to get IP addresses of

existing Gnutella nodes New node establishes neighboring relations by sending

join messages

Page 15: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 15/25 -

Unstructured P2P Applications (cont) KaZaA & Skype [3]

Every peer is either a supernode or is assigned to a supernode

Each supernode knows where the other supernodes are (Mesh structure)

Cross between Napster and Gnutella

Page 16: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 16/25 -

Structured P2P Structured P2P

– Contents-peer mapping– Content-addressing tech.– Searching complexity : O(logN) -> scalability

What is DHT?– Distributed Hash Table– files are associated with a key (produced, by hashing the file nam

e) and each node in the system is responsible for storing a certain range of keys [3]

Example– Hash (“The Devil Wears Prada”) = 8045– Pastry, Tapestry, Chord, CAN, Viceroy

Page 17: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 17/25 -

Comparison of P2P models [5]

Page 18: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 18/25 -

Application Service & Tech.

1. Information Sharing

2. File Sharing

3. Bandwidth efficiency

4. Computing power sharing

Page 19: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 19/25 -

Information Sharing Presence information

– Basic operation of P2P application– e.g. buddy list of Skype or MSN

Document Management– Most documents are stored with personal PCs– P2P networking provides the repository that can coordinates distri

buted local data– e.g. NextPage-NXT 4 [9]

Collaboration– P2P networking provides collaboration service without central ma

nagement system– e.g. Groove Virtual Office [10] (ad-hoc collaboration system)

• Instance messaging, File sharing• Real-time DB synchronization, etc

Page 20: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 20/25 -

File Sharing Representative P2P application service Over 70% of Internet traffic generated by P2P file sharing

File searching mode– Flooding model (e.g. Gnutella)– Central directory model (e.g. Napster)– Document routing model (e.g. Freenet)

Document routing model– A file saved on anonymous peer– An identifier assigned to each file and peer– A file saved on a peer that has similar identifier

Page 21: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 21/25 -

Bandwidth Efficiency Client/Server model involved with bottleneck problem and

queuing delay

Load balancing– Multimedia streaming service and VoD service– A contents copied to numerous peers– e.g. PeerCast, Peer-to-Peer-Radio

Improving download speed– If file is found in multiple nodes, user can select parallel downloadi

ng– Most likely HTTP byte-range header used to request different porti

ons of the file from different nodes– Automatic recovery when server peer stops sending file

Page 22: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 22/25 -

Computing Power Sharing seti@home [11]

– Search for ET intelligence– Central site collects radio telescope data– Data is divided into work chunks of 300Kbytes– User obtains client, which runs in background– Peer sets up TCP connection to central computer, downloads chu

nk– Peer does FFT on chunk, uploads results, gets new chunk

» Not peer to peer, but exploits» resources at network edge

Page 23: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 23/25 -

Open Issues Copyright issues Avoid Firewall or detection

– Dynamic port allocation– NAT traversal

Page 24: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 24/25 -

Conclusion P2P networking is a promising paradigm for services oper

ating at the edges of the network [4]

Significant autonomy from central servers

Scalability, Security & Reliability, Flexibility & QoS

P2P networks still have a long way to go

Page 25: DP&NM Lab. CSE, POSTECH - 1/25 - Peer-to-Peer Algorithms and System CS600 Assignment #5 Nov. 22. 2006 Byungchul Park fates@postech.ac.kr DPNM Lab., Dept

DP&NM Lab.CSE, POSTECH- 25/25 -

References [1] K.W. Ross and Dan Rubenstein, "Tutorial on P2P Systems " presented at Infocom 2

003, http://cis.poly.edu/~ross/p2pTheory/P2Preading.htm. [2] Peer-to-peer information systems: concepts and models, state-of-the-art, and future

systems, Karl Aberer, Manfred Hauswirth Tutorial at the 18th International Conference on Data Engineering, February 26-March 1, 2002, San Jose, California.

[3]Nathaniel Leibowitz, Matei Ripeanu, and Adam Wierzbicki, "Deconstructing the Kazaa Network", 3rd IEEE Workshop on Internet Applications (WIAPP'03), June 2003.

[4] Management of Peer-to-Peer Networks, K. Tutschku., http://www3.informatik.uni-uerzburg.de/ITG/2001/vortrag/vortrag13.pdf.

[5] 전자통신동향분석 제 21 권 제 5 호 , ETRI, 2006 년 10 월 [6] Napster, “Napster Messages,” http://opennap.sourceforge.net/napster.txt, 2000. M.

Ripeanu, I. Foster, and A. Iamnitchi, “Mapping [7]the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implicati

ons for System Design,” IEEE Internet Computing Journal, Vol.6, No.1, 2002. [8] M. Ripeanu, “Peer-to-Peer Architecture Case Study: Gnutella Network,” In Proc. of I

EEE 1st Int’l Conf. on Peer-to-Peer Computing, 2001. [9] Next Page Inc, http://www.nextpage.com/pdfs/collateral/wp/nxt4 build wp090903lo.p

df, 2004. [10] Groove Networks, http://www.groove.net, 2004. [11] D. Anderson, SETI@home, chapter 5, O’Reilly, Sebastopol, 2001, pp.67–76. [12] Saikat Guha, Neil Daswani, Ravi Jain, “An Experimental Study of the Skype Peer-t

o-Peer VoIP System”, IPTPS 2006.