
Opportunistic and Peer-to-Peer Networks

Peer-to-Peer Networks

Kalman Graffi Heinrich Heine Universität Düsseldorf 16 April 2015 2

Call for Papers


Tools to be used

Conference Web Site –  http://www.tsn.hhu.de/teaching/lectures/2015ss/oppp2pnet.html

Conference management tool EasyChair –  For seminar paper & reviews management –  https://easychair.org/conferences/?conf=oppp2pnet2015

Simulator to use: PeerfactSim.KOM –  www.peerfact.org –  Visit website for documentation

Doodle for topic selection –  http://doodle.com/59payaptnc2e3ygu


Schedule

09 April 2015
–  Organizational details
–  Presentation of the topics
–  Time for questions

09-21 April 2015
–  Doodle-based selection of seminar topics

23 April 2015
–  Assignment of the topics

23 July 2015
–  Final presentation


Overview on Topics

Peer-to-Peer Networks
PA1 - Simple Replication - TA
PA2 - Erasure Code based Replication - RA
PA3 - File Sharing of linked and connected Files - RA
PO1 - Location-based Overlay using Space-filling Curves - TA
PO2 - Location-based 2-dimensional Overlay - TA
PM1 - Security in P2P Monitoring Networks - AD
PM2 - Trees also want to Gossip - AD
PM3 - Memory-based Mechanism for Document Statistics - AD
PM4 - Publish/Subscribe-based Mechanism for Document Statistics - AD
PM5 - Publish/Subscribe-based Mechanism for Document Statistics - AD

Mobile Ad Hoc and Opportunistic Networks
MO1 - MANET Testbed with Raspberry Pis - AC
MO2 - Opportunistic Network Testbed with Raspberry Pis - AC
MO3 - Scheduling Messages in Opportunistic Networks - SS
MO4 - Local Connectivity Options between Android Devices - AI
MO5 - Local Connectivity Options for Android Devices and Raspberry Pis - AI


http://doodle.com/59payaptnc2e3ygu

Peer-to-Peer Systems

More information is available at: http://www.tsn.hhu.de/teaching/lectures/2014ws/p2p.html


Networks

Peer-to-Peer Networks
–  Network of equal participants (application layer)
–  Freedom to create logical topologies and harness node resources
–  Strong heterogeneity, churn (online/offline behavior), no trust
–  Focus on data management

Opportunistic Networks
–  Mobile ad hoc networks that span “local communication islands”
–  Mobility of nodes
   •  → connects islands over time → delay-tolerant communication
–  Focus on routing / communication (network layer)

Combination of both
–  Mobile decentralized network with a P2P network on top


A few Definitions for Peer-to-Peer Systems

Peer-to-peer systems and applications are distributed systems without any centralized control or hierarchical organization, where the software running at each node is equivalent in functionality. [...] The core operation in peer-to-peer systems is efficient location of data items.

–  I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications”

Peer-to-peer systems can be characterized as distributed systems in which all nodes have identical capabilities and responsibilities and all communication is symmetric.

–  A. I. T. Rowstron and P. Druschel, “Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems“

The sheer scale and dynamism in which P2P networks are supposed to operate make the design of P2P systems challenging even for relatively simple applications.

–  M. Naor and U. Wieder, “Novel architectures for p2p applications: the continuous-discrete approach“


Detailed Characteristics - Properties

1. Self-organizing system
–  Relevant mechanisms performed by peers
   •  No central control
   •  Decentralized resource search, allocation and scheduling
–  (Sometimes servers assist → centralized P2P systems)

2. Combined client and server functionality
–  Resources provided by end systems
   •  Storage, communication (forwarding messages)
–  Mostly similar rights: same code!
   •  Roles based on capabilities

3. Direct interaction between peers (= “peer to peer”)
–  Provision of services such as search, data hosting, communication


Detailed Characteristics - Challenges

4. Relevant resources located at (private) nodes (peers)
–  Uncontrolled, voluntary offers
–  Widely spread
–  Often operating behind firewalls or NAT gateways
–  Require a proper mechanism to be found and used

5. Capacities of peers are heterogeneous
–  Bandwidth, CPU power, storage space, …
–  Quality depends on device / connectivity

6. Churn: variable connectivity
–  Peers are online for a limited time
–  Very unpredictable, not reliable


Overlay Networks

A network
–  consists of interconnected nodes
–  provides services (service model)
–  defines how nodes interact
–  needs addressing, routing, …

Overlay network
–  = network built ON TOP of one or more existing networks
–  adds an additional layer of
   •  abstraction
   •  indirection/virtualization

[Figure: peers forming an overlay network on top of several TCP/IP underlay networks]


Schematic View on P2P Systems

IP Network (Underlay) providing:
–  Routing

Overlay Network providing:
–  Search / Lookup of Data
–  Multicast, Data Storage
–  Publish/Subscribe, …

Peer-to-Peer Service Delivery

[Figure: peers located in several TCP/IP networks, partly behind firewall + NAT, connected through the overlay]


Typical Service / Content Discovery and Provision

Intelligence in the network
–  Enabling search for resources
–  “Content”-based routing
–  Provider and client matching
–  All roles are fulfilled in a distributed manner by a large number of nodes

[Figure: nodes attached to a plain network vs. nodes attached to an “advanced network”]


Summary on the motivation

A huge number of nodes participate in the network
–  They have resources to share
–  They have demands for resources which may not be satisfied easily or by a single node

Main question for an “intelligent network”
–  How to find nodes providing the desired resources
–  How to organize the exchange of resources

Peer-to-Peer (P2P)
–  P2P builds overlay network(s)
–  P2P overlay offers mechanisms to find / look up what is wanted

Mode of operation
–  After locating the node providing the desired service:
–  Interact directly from peer to peer


Challenges in P2P related Data Management and Retrieval

Essential challenges in (most) Peer-to-Peer systems
–  Locating data items in a distributed system
   •  Where shall the item be stored by the provider?
   •  How does a requester find the actual location of an item?

Challenges
–  Scale: potentially millions of nodes
–  Dynamism: nodes regularly go online and offline

[Figure: data item “D” and a set of nodes (7.31.10.25, peer-to-peer.info, 12.5.7.31, 95.7.6.10, 86.8.10.18, planet-lab.org, berkeley.edu, 89.11.20.15). Provider: “I have item D. Where to place D?” Requester: “I want item D. Where can I find D?”]


Structured and Unstructured P2P Networks

Unstructured P2P Networks
–  objects have no special identifier
–  location of desired object a priori not known
–  each peer is only responsible for objects it submitted

Structured P2P Networks
–  peers and objects have identifiers
–  objects are stored on peers according to their ID: responsibleFor(ObjID) = PeerID
–  distributed indexing points to object location

Search:
–  Find all (or some) objects in the P2P network which fit given criteria

Lookup / Addressing:
–  Retrieve the object which is identified by a given identifier


Structured Overlay Networks: Interconnection Networks

Structured Overlay Networks
–  Give peers and objects (unique) identifiers
   •  PeerIDs and ObjectIDs shall be from the SAME key set
   •  Each peer is responsible for a specific range of ObjectIDs
–  Indexing (knowledge on location of resources) to be distributed
–  No search needed anymore (local indexing)
–  No server knowing all (global indexing) available

New challenge: to find peer(s) with a specific ID in the overlay
–  Lookup:
   •  “Route” queries across the overlay network to the peer with a specific ID
–  Once the peer is found
   •  Initiate direct communication
   •  Upload / download resources


Schematic View on Distributed Hash Table


Functions in a Structured P2P Overlay (all)

–  IsMyKey(K) → true if node is responsible for key K
–  Route(K, M, hint) → send message M to the node responsible for K
   •  hint: optional first hop
–  GetNodeHandle(K, hint) → get contact details of the responsible node
–  Send(M, q) → send message M to node q


Additional Functions in a Distributed Hash Table

–  Put(Data D, Key K) → copies the data to the node responsible for K
–  GetData(Key K) → gets the data stored under the key K

Optional further functions:
–  Replication
–  Secure Communication
–  Access Control
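The interplay of IsMyKey, Put and GetData can be sketched in a single process. This is only an illustrative toy, not an overlay implementation: the class `ToyDht`, the 16-bit key space and the rule "first node ID at or after the key is responsible" are assumptions made here for the example, and "routing" is replaced by a lookup in a sorted list.

```python
import bisect
import hashlib

KEY_BITS = 16  # assumed small key space for readability


def key_of(name: str) -> int:
    """Hash a name into the key space shared by peers and objects."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << KEY_BITS)


class ToyDht:
    """All nodes live in one process; routing is a lookup in a sorted list."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)
        self.storage = {node_id: {} for node_id in self.node_ids}

    def responsible_node(self, key: int) -> int:
        # First node ID >= key, wrapping around to the smallest ID.
        index = bisect.bisect_left(self.node_ids, key)
        return self.node_ids[index % len(self.node_ids)]

    def is_my_key(self, node_id: int, key: int) -> bool:
        return self.responsible_node(key) == node_id

    def put(self, key: int, data) -> int:
        """Store data at the responsible node and return that node's ID."""
        node_id = self.responsible_node(key)
        self.storage[node_id][key] = data
        return node_id

    def get_data(self, key: int):
        return self.storage[self.responsible_node(key)].get(key)
```

With the node IDs used in the ring figures of this deck, `ToyDht([611, 709, 1008, 1622, 2011, 2207, 2906, 3485]).put(3107, "my data")` places the item on node 3485.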

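[Figure: DHT ring with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906 and 3485 (hosts 7.31.10.25, peer-to-peer.info, 12.5.7.31, 95.7.6.10, 86.8.10.18, planet-lab.org, berkeley.edu, 89.11.20.15). Question: which node stores the item with H(“my data”) = 3107?]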


Look up in Structured P2P Systems

Principle
–  Location of the objects is found via routing
   1. Node A (provider) advertises the object at the responsible peer B
      »  Advertisement is routed to B.
   2. Node C, looking for the object, sends a query
      »  Query is routed to the responsible node.
   3. Node B replies to C by sending the contact information of A

[Figure: (1) Node A publishes the link at the responsible peer B. (2) “Routing” to / lookup of the desired object by Node C. (3) Node C gets the link to the object, followed by P2P communication.]


Strategies for Data Retrieval: Distributed Indexing

Goal is scalable complexity for
–  Communication effort: O(log(N)) hops
–  Node state: O(log(N)) routing entries

[Figure: DHT ring with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906 and 3485; lookup of H(“my data”) = 3107. Routing in O(log(N)) steps to the node storing the data; nodes store O(log(N)) routing information to other nodes.]

The content of this slide has been adapted from “Peer-to-Peer Systems and Applications”, ed. by Steinmetz, Wehrle

Peer-to-Peer Systems

Structured Homogenous P2P Overlay Networks – Chord

This slide set is based on the lecture "Communication Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt


Chord: An Efficient Lookup Network

Chord uses the SHA-1 hash function
–  Results in a 160-bit object/node identifier
–  Same hash function for objects and nodes

Node ID hashed from, e.g., the IP address
Object ID hashed from the object name
–  Object names assumed to be known

Chord is organized in a ring which wraps around
–  Nodes keep track of predecessor and successor
   •  System invariant for valid network operation
–  Node responsible for
   •  objects between its predecessor and itself
–  Fingers used to enable efficient content addressing
   •  O(log(N)) fingers lead to lookup operations of O(log(N)) length

Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications (2001) by Ion Stoica et al.
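As a minimal sketch of the two ideas on this slide, the following derives 160-bit identifiers with SHA-1 and checks the responsibility rule "objects between predecessor and itself". The function names and the choice of hashing a UTF-8 string are assumptions for illustration; the `modulus` parameter exists only so that small rings can be tried out.

```python
import hashlib

ID_BITS = 160  # SHA-1 digest size, as stated on the slide


def chord_id(text: str) -> int:
    """Map a node address or object name to a 160-bit identifier."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def is_responsible(node_id: int, predecessor_id: int, object_id: int,
                   modulus: int = 1 << ID_BITS) -> bool:
    """True if object_id lies in the ring interval (predecessor_id, node_id]."""
    if predecessor_id == node_id:
        return True  # a single node owns the whole ring
    distance = (object_id - predecessor_id) % modulus
    return 0 < distance <= (node_id - predecessor_id) % modulus
```

The modulo arithmetic handles the wrap-around of the ring, so the node with the smallest ID is also responsible for the keys after the largest ID.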


Chord: Network Topology

Basic ring topology
–  Successor / predecessor links

Uses SHA-1 to map
–  IP address / object name onto
–  160-bit ID

[Figure: circular key space with links to the ring successors]

Peer ID   Responsibility range
611       3486-4047, 0-611
659       612-659
709       660-709
1008      710-1008
1622      1009-1622
2011      1623-2011
2207      2012-2207
2682      2208-2682
2906      2683-2906
3485      2907-3485

Peers are responsible for their own ID and the IDs back to their predecessor


Chord: Network Topology

Enhanced topology
–  The k-th finger of peer n is a shortcut pointing to the peer responsible for ObjectID (n + 2^k)
–  k ranges from 0 to log(N)
–  O(log(N)) fingers lead to lookup operations of O(log(N)) hops

Fingers point to peers with ObjectIDs increasing exponentially.
Here: 709 + 2^k = …, 965, 1221, 1733, 2757

[Figure: the Chord ring of the previous slide with finger shortcuts]


Example Peer ID 7, ID range: 0 - 127

17 peers in the network:
–  3, 7, 10, 19, 21, 31, 36, 37, 51, 60, 65, 78, 82, 90, 93, 101, 105

log_2(128) = 7 → 7 fingers
–  Calculate finger IDs, find corresponding peers

Responsibility range

Peer ID   From   To
3         106    3
7         4      7
10        8      10
19        11     19
21        20     21
31        22     31
36        32     36
37        37     37
51        38     51
60        52     60
65        61     65
78        66     78
82        79     82
90        83     90
93        91     93
101       94     101
105       102    105

Fingers of node 10

Finger No.   Calculation   Finger ID   Real Peer
0            10+2^0        11          19
1            10+2^1        12          19
2            10+2^2        14          19
3            10+2^3        18          19
4            10+2^4        26          31
5            10+2^5        42          51
6            10+2^6        74          78
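The finger table of the example can be recomputed with a few lines. This is a sketch under the assumptions that the peer list is kept sorted and that the peer responsible for a finger ID is the first peer at or after it on the ring; the helper names `successor` and `finger_table` are chosen here for illustration.

```python
import bisect

PEERS = [3, 7, 10, 19, 21, 31, 36, 37, 51, 60, 65, 78, 82, 90, 93, 101, 105]


def successor(peers, key, modulus):
    """Return the first peer ID >= key on the ring (peers must be sorted)."""
    index = bisect.bisect_left(peers, key % modulus)
    return peers[index % len(peers)]


def finger_table(peers, node, bits):
    """Return (finger number, finger ID, real peer) for k = 0 .. bits - 1."""
    modulus = 1 << bits
    table = []
    for k in range(bits):
        finger_id = (node + 2 ** k) % modulus
        table.append((k, finger_id, successor(peers, finger_id, modulus)))
    return table


for row in finger_table(PEERS, node=10, bits=7):
    print(row)
```

For node 10 this yields the real peers 19, 19, 19, 19, 31, 51 and 78, matching the table of the example.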


Chord: Addressing Content

Query
–  Contains the hash value of the queried content
–  On each step the distance to the destination is halved (remember fingers)

Use fingers to locate the destination faster
–  Without fingers: no shortcuts, walk the circle

[Figure: node 1008 queries item 3000 on the Chord ring. Step 1: forward to peer 2207, which is responsible for 1008 + 1024. Step 2: forward to peer 2906, which is responsible for 2207 + 512. Step 3: peer 3485 is responsible for 3000; responsible peer found.]
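The query from node 1008 for item 3000 can be replayed with a small simulation. This is a sketch with global knowledge, not a distributed implementation: it assumes a 12-bit key space (4096 IDs, whereas the figure labels the last range as ending at 4047), recomputes fingers on the fly, and forwards to the closest finger that precedes the key, as in the Chord paper. All function names are illustrative.

```python
import bisect

RING = [611, 659, 709, 1008, 1622, 2011, 2207, 2682, 2906, 3485]


def successor(peers, key, modulus):
    """First peer ID >= key on the ring (peers must be sorted)."""
    index = bisect.bisect_left(peers, key % modulus)
    return peers[index % len(peers)]


def in_open_closed(x, left, right, modulus):
    """True if x lies in the ring interval (left, right]."""
    return 0 < (x - left) % modulus <= (right - left) % modulus


def in_open_open(x, left, right, modulus):
    """True if x lies in the ring interval (left, right)."""
    return 0 < (x - left) % modulus < (right - left) % modulus


def lookup_path(peers, start, key, bits):
    """Return the list of peers visited until the responsible peer is reached."""
    modulus = 1 << bits
    path = [start]
    if successor(peers, key, modulus) == start:
        return path  # the start node itself is responsible
    node = start
    while True:
        next_on_ring = successor(peers, node + 1, modulus)
        if in_open_closed(key, node, next_on_ring, modulus):
            path.append(next_on_ring)
            return path
        next_node = next_on_ring  # fallback: walk the circle
        for k in reversed(range(bits)):
            finger = successor(peers, node + 2 ** k, modulus)
            if in_open_open(finger, node, key, modulus):
                next_node = finger  # closest finger preceding the key
                break
        node = next_node
        path.append(node)


print(lookup_path(RING, start=1008, key=3000, bits=12))
```

The printed path is 1008, 2207, 2906, 3485, i.e. the three steps shown in the figure.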


Chord: Join Procedure (1)

Request to join the Chord ring (new peer 1289)

1. Contact a member of the ring
2. Route the query in the ring
3. Provide the new peer’s successor

[Figure: new peer 1289 contacting the Chord ring of the previous slides]


Chord: Join Procedure (2)

Request to join the Chord ring

1. Set successor
2. Redistribute indexing information (e.g. 1009-1289)
3. Update successor of predecessor
4. Build fingers

[Figure: new peer Nz = 1289 (range 1009-1289) joins between Ny = 1008 (range 710-1008) and Nx = 1622 (range now 1290-1622)]

1. & 2. Notify successor. Actions:
–  Nz: Set successor (Nx)
–  Nz: Notify Nx
–  Nx: Set predecessor
–  Nx: Copy items (index) to Nz

3. & 4. Stabilize. Actions:
–  Ny: Ask for predecessor of Nx
–  Ny: Set successor (Nz)
–  Ny: Notify Nz
–  Nz: Set predecessor (Ny)
–  Nx: Clear moved items
–  All: Fix fingers

Fingers of peer n point to the peers responsible for ObjectID n + 2^k; thus, log(N) fingers are built


Chord – Evaluation

Advantages
–  Lookups will find the target (if it exists)
–  Good scalability of O(log n)
–  Very popular, shows main idea of DHTs
–  Often used as basis for research extensions

Drawbacks
–  Heterogeneity not supported
   •  All nodes are treated equally, although some are strong/weak
–  Maintaining unidirectional links (fingers) is only beneficial for one party
   •  Traffic costs not optimally used
–  Only one-directional routing: not optimally efficient

Peer-to-Peer Systems

Location-based P2P Overlay Networks – Space-filling Curves – Geodemlia

This slide set is based on the lecture "Communication Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt


Location-aware Services

Answering the questions:
–  “Where am I?” = Locating
–  “What is nearby? Where is …?” = Searching
–  “How can I get to …?” = Navigating

Location-aware services…
–  are highly attractive for end-users and providers
–  can offer highly personalized services based on the user’s location

Example:
–  List Italian restaurants within walking distance

[Figure: map with the question “Closest Italian restaurant?”]


Space-Filling Curves

Main idea:
–  Fill 2-dimensional space with a 1-dimensional curve
–  Map the 1-dimensional curve to Chord
–  → Each position in the 2D map is managed by a node in Chord
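One way to obtain such a 1-dimensional key is a Z-order (Morton) curve, which interleaves the bits of the two grid coordinates. The slides do not say which curve is meant, so the curve, the grid resolution of 16 bits per axis and the function names below are assumptions made only for illustration.

```python
def interleave_bits(x: int, y: int, bits: int) -> int:
    """Z-order key: bit i of x goes to position 2i, bit i of y to 2i + 1."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def position_to_key(longitude: float, latitude: float, bits: int = 16) -> int:
    """Map (longitude, latitude) in degrees to a key of 2 * bits bits."""
    cells = 1 << bits
    # Scale to grid cells; clamp so that +180 and +90 stay inside the grid.
    x = min(int((longitude + 180.0) / 360.0 * cells), cells - 1)
    y = min(int((latitude + 90.0) / 180.0 * cells), cells - 1)
    return interleave_bits(x, y, bits)
```

The resulting integer can then be used as an ObjectID in a Chord-like ring. Nearby positions often, but not always, receive nearby keys, since any such curve has jumps.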

Expected limitations:


Geodemlia

Geodemlia
–  Structured P2P overlay
–  Location-based
–  Similar to Kademlia

Geographical position: Pos = (long, lat)
–  Longitude: -180° .. 180°
–  Latitude: -90° .. 90°

Each node:
–  Geographical position
–  IP contacts
–  Random identifier, 160 bits

Each object:
–  Geographical position
–  Set of tags
–  Random identifier, 160 bits


Geodemlia - Interface

FIND_NODES(pos, k, b)
–  pos: point in the world which is searched
–  k: number of closest nodes to be returned
–  b: Bloom filter of already known nodes

STORE(o, pos)
–  o: object to be stored
–  pos: position of the object

AREA_SEARCH(pos, r, s, b)
–  pos: point in the world whose surroundings are searched
–  r: radius around the point to be covered
–  s: set of search terms
–  b: Bloom filter of already known nodes


Geodemlia – Routing Table

Routing Table
–  Predefined directions: 0 … n-1
   •  Based on the angle (-π .. π) starting north
–  For each direction j
   •  K-buckets are maintained
   •  160 buckets for distances between 2^i and 2^(i+1)
–  Routing table entry, per peer
   •  Position (long, lat)
   •  Peer ID (160 bits)
   •  IP contacts

[Figure: example routing table with n = 4 directions]


Geodemlia Routing

FIND_NODES(pos, k, b)
–  “Like Kademlia”
–  Parallel routing (alpha)
–  Iterative routing: k closest nodes

Approaching the target position
–  Like in Kademlia
–  Only position-based
–  Using Bloom filters to signal already known nodes

[Figure: iterative lookup. The first requests FIND_NODES(Pos, 2, ---) return the k closest nodes to Pos; the follow-up requests FIND_NODES(Pos, 2, Bloom) carry a Bloom filter. Legend: newly learned closest nodes, already queried nodes.]


Geodemlia Routing

STORE(o, pos)
–  First find nodes at position pos
–  Store object o at the k closest peers

AREA_SEARCH(pos, r, s, b)
–  Similar to FIND_NODES, just with a larger scope
–  All nodes in the radius around the position are contacted
–  They return their matching objects
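The selection steps behind STORE and AREA_SEARCH can be sketched on a local view of nodes and objects. This is not the Geodemlia protocol itself: the great-circle (haversine) distance, the dictionary layout of nodes and objects, and the rule that an object matches when it carries all search terms are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(pos_a, pos_b):
    """Great-circle distance between two (longitude, latitude) pairs in degrees."""
    lon1, lat1 = map(math.radians, pos_a)
    lon2, lat2 = map(math.radians, pos_b)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def k_closest(nodes, pos, k):
    """Return the k known nodes closest to pos (candidates for STORE)."""
    return sorted(nodes, key=lambda node: haversine_km(node["pos"], pos))[:k]


def area_search(objects, pos, radius_km, terms):
    """Return objects within radius_km of pos whose tags include all terms."""
    return [obj for obj in objects
            if haversine_km(obj["pos"], pos) <= radius_km
            and set(terms) <= obj["tags"]]
```

In the real overlay each contacted node would run such a filter on its own stored objects, and the requester would merge the replies.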

Peer-to-Peer Systems

Improvements of P2P Overlays


Simple Replication

Motivation
–  Object stored at only one node: lost if node leaves

Idea
–  Replication through multiple IDs per object
–  File published under each ID
–  File looked up under next ID if lookup for previous IDs failed
–  Number of replicas flexible

Examples
–  Multiple hash functions
   •  ID_1: h1(object), ID_2: h2(object), ID_3: h3(object), …
–  Sequential hashing
   •  ID_1: h(object), ID_2: h(ID_1), ID_3: h(ID_2), …
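Both ways of deriving replica IDs can be sketched with SHA-1. The slides do not define h1, h2, h3, so emulating them by prefixing the object name with an index is an assumption made here, as are the function names.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Base hash function h (SHA-1, as used by Chord)."""
    return hashlib.sha1(data).digest()


def ids_multiple_hashes(name: str, replicas: int) -> list:
    """ID_i = h_i(object); h_i is emulated by salting the name with i."""
    return [int.from_bytes(h(f"{i}:{name}".encode("utf-8")), "big")
            for i in range(1, replicas + 1)]


def ids_sequential(name: str, replicas: int) -> list:
    """ID_1 = h(object), ID_2 = h(ID_1), ID_3 = h(ID_2), ..."""
    ids = []
    current = name.encode("utf-8")
    for _ in range(replicas):
        current = h(current)
        ids.append(int.from_bytes(current, "big"))
    return ids
```

With sequential hashing a requester only needs the object name and h to walk the replica chain; with multiple hash functions all replica IDs can be computed independently and looked up in parallel.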


Evaluation of Simple Replication

Good
–  Easy to implement

Shortcomings
–  Hashes of ID_i and ID_(i+1) are not close, each one requires a new lookup
–  If the number of replicas is fixed (e.g. i = 5)
   •  Too much replication in a stable scenario, too little in a dynamic scenario
–  If the number of replicas is flexible
   •  Determined by each replica holder to adapt to load / churn risk
   •  How to know how long the replica chain is?

Improvement of simple replication: PAST
–  Replication around a single ID


Redundancy through full copy replication

Traditional replication:
–  Several copies of the same content

Example PAST (extension to Pastry): Nodes[] replicaSet(key → k, int → max rank)
–  Returns an ordered set of up to (max rank) peers on which replicas of the object with key k can be stored
–  These are the nodes which become roots for the key k when the local node fails


PAST Evaluation

Good
–  ID-related replication: 1 lookup sufficient to find object
–  Replication ratio flexible (might depend on object / peer properties)
–  Failed replica nodes are detected by overlay: easy to react

Drawback
–  Replication not aware of peer heterogeneity
   •  Weak nodes might be overloaded by replication task
–  Security
   •  Replicas all in one ID area: easier to attack


Erasure Codes

Erasure codes
–  Split content in blocks (symbols) (e.g. 50)
–  Create redundancy in blocks
   •  → e.g. 200 blocks
–  For restoring content
   •  50 out of 200 blocks needed

Slide based on Anwitaman Datta, “Peer-to-Peer Storage Systems: Crowdsourcing the Storage Cloud”, ICDCN’10


Erasure Codes: Static Resilience

Assume: Peer failure is i.i.d. with failure probability f

Replicated r times
–  Faults that can be tolerated: r - 1
–  Probability of failure: f^r
–  Storage efficiency: 1/r
–  Access: Find any one good replica

Erasure coded (k of n)
–  Faults that can be tolerated: n - k
–  Probability of failure: \sum_{j=1}^{k} \binom{n}{n-k+j} f^{n-k+j} (1-f)^{k-j}
–  Storage efficiency: k/n
–  Access: Find k good blocks

Slide based on Anwitaman Datta, “Peer-to-Peer Storage Systems: Crowdsourcing the Storage Cloud”, ICDCN’10
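The two failure probabilities can be evaluated numerically. This is a minimal sketch under the slide's i.i.d. assumption; it takes the erasure-coded object as lost as soon as more than n - k of the n blocks fail, and the function names are illustrative.

```python
from math import comb


def replication_failure(f: float, r: int) -> float:
    """All r replicas fail independently with probability f each."""
    return f ** r


def erasure_failure(f: float, n: int, k: int) -> float:
    """Probability that more than n - k of the n blocks fail."""
    return sum(comb(n, n - k + j) * f ** (n - k + j) * (1 - f) ** (k - j)
               for j in range(1, k + 1))


def storage_efficiency(k: int, n: int) -> float:
    """Useful data per stored data; replication r times is the case k=1, n=r."""
    return k / n
```

For k = 1 the erasure-code expression reduces to f^n, i.e. plain replication with n copies, which is a quick sanity check of the formula.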


Example: Static Resilience

Churn rate, f = 0.25

Replication
–  r = 5
   •  → Availability ~ 0.73
   •  Too low
–  r = 24
   •  → Availability > 0.999
   •  Too much overhead!

Erasure code
–  n = 517, k = 100
   •  → Availability ~ 0.999
   •  Overhead = 5.17

Slide based on Anwitaman Datta, “Peer-to-Peer Storage Systems: Crowdsourcing the Storage Cloud”, ICDCN’10

Peer-to-Peer Systems

Monitoring the Quality of Peer-to-Peer Systems – Introduction to P2P Quality Monitoring – Gossip-based solutions – Tree-based solutions


Quality of Service in P2P Networks

Quality of service (QoS)
–  “The well-defined and controllable behavior of a system with respect to quantitative parameters”

Challenges for providing (constant) quality in P2P systems:
–  Various scenarios
   •  Distributed storage
   •  Content delivery
   •  Discovery and contacting of users
–  Dynamics over time
   •  Network size
   •  Churn
–  Peer heterogeneity
   •  Peer capacities
   •  Connectivity

[Figure: management spanning user, overlay, application, devices and network]


Monitoring and Controlling System Metrics

Examples
–  Statistics: minimum, maximum, average, standard deviation
–  P2P system: peer count, online time
–  Overlay “metrics”: hop count, routing delay
–  Overhead: bandwidth consumption and traffic (per message type)
–  Resources: free / used CPU, memory, storage space, bandwidth

Monitoring
–  Obtain (global) knowledge on the system statistics
   •  Aggregatable: size of statistics for 10 or 1000 peers is equal
   •  Statistics must be fresh, monitoring mechanism of low overhead
–  Information on system status can be used for optimized decisions
   •  E.g. peer count defines size of time-to-live
   •  E.g. churn pattern defines stabilization frequency


P2P Monitoring

Goal: Obtaining statistics on the global system
–  Input: local states x_i(t) of peers p_i at time t
–  Output: global status (aggregate) at time t: F(t) = f(x_1(t), …, x_n(t))

Aggregate function: f: R* → R
–  Maps a non-empty set of values to a single value
–  Associative (aggregation of aggregations)
–  Commutative (order not relevant)

Operations:
–  Add local statistics
–  Get global statistics

Peer-to-Peer Systems

Monitoring the Quality of Peer-to-Peer Systems – Gossip-based Approaches


► Gossiping Protocols

Idea:
–  Communicate only with neighbors (gossip)
   •  Assumes no specific overlay topology
–  Exchange and aggregate information
   •  E.g. calculate averages, minimum, maximum

Characteristics
–  Gossip protocols are round-based (epochs)
–  For every round
   •  Each node selects a subset of nodes to interact with (pairwise)
   •  The selection function is often probabilistic
   •  Nodes interact via “small” messages
   •  Local state changes due to new information
–  In general: “quick” convergence

D. Kempe, A. Dobra, J. Gehrke, “Gossip-Based Computation of Aggregate Information,” IEEE Symposium on Foundations of Computer Science (FOCS’03)


Example Average Calculation

Example: 12 nodes
–  Initial state
–  After 1 round
   •  With communication links
–  After 5 rounds
–  After 10 rounds


Gossip-Protocol: PushSum


Initialization of PushSum


Calculation of the Sum

–  One node starts with 1, all others with 0
–  Compute the average; once converged, 1 / average yields the number of nodes (the sum of one “1” per node)

–  Example:
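A round-based simulation illustrates the idea. The sketch follows the Push-Sum scheme of Kempe et al. as commonly described: every node keeps a pair (sum, weight), keeps one half of it and sends the other half to a uniformly chosen node, and uses sum / weight as its estimate. Synchronous rounds, the global node list and the seeded random generator are simplifications assumed here.

```python
import random


def push_sum(values, rounds, seed=0):
    """Return each node's estimate of the average after the given rounds."""
    rng = random.Random(seed)
    n = len(values)
    sums = [float(v) for v in values]
    weights = [1.0] * n
    for _ in range(rounds):
        new_sums = [0.0] * n
        new_weights = [0.0] * n
        for i in range(n):
            target = rng.randrange(n)  # may be the node itself
            half_sum, half_weight = sums[i] / 2, weights[i] / 2
            for receiver in (i, target):
                new_sums[receiver] += half_sum
                new_weights[receiver] += half_weight
        sums, weights = new_sums, new_weights
    return [s / w for s, w in zip(sums, weights)]


# Node count: one node starts with 1, all others with 0.
estimates = push_sum([1.0] + [0.0] * 11, rounds=200)
print(round(1 / estimates[0]))
```

Because every node keeps half of its own weight, the weights never reach zero, and the total sum and total weight in the system stay constant from round to round.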


Initializing an Epoch and Peer Election

Observation:
–  During the gossiping: local input x_i cannot be changed
–  Aggregations are round-based (epochs)

Initializing an Epoch
–  Centralized
   •  Several leader election algorithms exist
      –  E.g. node with maximum specific value
   •  Epoch may be restarted periodically by the leader
–  Decentralized
   •  Besides local measure (x_i) and weight (w_i): add version number
   •  Every node may start a new epoch concurrently
      –  Larger version numbers dominate smaller ones
      –  Epoch runs out after convergence and minimal value variations (< ε)


Performance and Complexity of Push-Sum

Performance: precision
–  Simulations with 1M nodes
   •  Gossip every 5 seconds
–  For most of the time:
   •  False (imprecise) values
   •  Although convergence exists
–  Problem
   •  Peer count always starts at 0

Convergence time
   •  n = number of nodes
   •  ε = accepted relative error
–  Push-Sum converges quickly
–  Problem:
   •  Huge message overhead per node

W. Terpstra, C. Leng, A. Buchmann: Brief Announcement: Practical Summation via Gossip, ACM Symposium on Principles of Distributed Computing (PODC 2007)

Peer-to-Peer Systems

Monitoring the Quality of Peer-to-Peer Systems – Tree-based Approach: SkyEye.KOM


► Tree-based Monitoring Mechanism

Idea:
–  Create (additional) tree topology on top of DHT
–  Protocol:
   •  Periodically
      –  Calculate aggregate of own local view and the views received from child nodes
      –  Send aggregate to parent node
   •  Root calculates global view
      –  And passes global view to all peers

Example: SkyEye.KOM (2009)
–  Assumes structured P2P overlay
–  Aims at high precision with low overhead


SkyEye.KOM: Tree Topology

Tree of information domains
–  Domain: ID interval
   •  E.g. [0, 0.5[ or [0.75, 0.875[
   •  Largest domain, level 0: [0,1[
–  Domain ID: “middle value” of the interval
–  Domain size split in β parts per level

Domain IDs build the tree topology
–  Node degree: β child nodes
–  Tree topology of Domain IDs does not change over time!
   •  Only the peers responsible for the Domain IDs might change
–  Assignment of peers to domains is dynamic

[Figure: peers in the Internet mapped onto the ID space [0,1[ of the P2P overlay (e.g. at 0.09, 0.2, 0.31, 0.4, 0.5, 0.6, 0.75, 0.9). The Domain IDs 0.5, 0.25, 0.75, 0.125, 0.375, 0.625, 0.875 and 0.3125 form the tree.]
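The Domain IDs of the tree can be derived mechanically from the interval rule. This is a sketch, with intervals represented as (low, high) tuples standing for half-open intervals [low, high[; the function names are illustrative and not taken from SkyEye.KOM.

```python
def domain_id(interval):
    """Domain ID = middle value of the half-open interval [low, high[."""
    low, high = interval
    return (low + high) / 2


def child_domains(interval, beta):
    """Split a domain into beta equally sized child domains."""
    low, high = interval
    width = (high - low) / beta
    return [(low + i * width, low + (i + 1) * width) for i in range(beta)]


def domains_at_level(level, beta):
    """All domains of one tree level, starting from [0, 1[ at level 0."""
    domains = [(0.0, 1.0)]
    for _ in range(level):
        domains = [child for d in domains for child in child_domains(d, beta)]
    return domains


for level in range(3):
    print(level, [domain_id(d) for d in domains_at_level(level, beta=2)])
```

For β = 2 this gives 0.5 at level 0, then 0.25 and 0.75, then 0.125, 0.375, 0.625 and 0.875. Since the intervals depend only on β and the level, the tree of Domain IDs stays fixed while the responsible peers change.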


SkyEye.KOM: Communication

Example tree
–  Tree degree (β) = 2
   •  Results in logarithmic tree size
–  Balanced, if ID space balanced
–  Not always β children
   •  Peers may be Coordinators at various levels


SkyEye.KOM: Communication Protocol

Gathering global view
–  All peers measure their local status
–  Periodically sent to the parent peer
   •  → Update Interval (UI)

Aggregation
–  Direct: count, sum, minimum, maximum, sum of squares
–  Derived: mean, variance, std. deviation

Dissemination of global view
–  Global view in root
–  Every update message is acknowledged
–  ACK contains global view from level above

[Figure: tree with β child nodes per peer. (1) Independent updates in UI intervals per node carry local measures (synchronized signal in simulations) and aggregated views upwards. (2) ACKs with the view of the parent peer for every update carry the global view downwards.]
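The split into direct and derived values can be sketched as a small mergeable record. The class and field names are illustrative and not taken from SkyEye.KOM; the variance shown is the population variance computed from count, sum and sum of squares.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    count: int
    total: float
    minimum: float
    maximum: float
    sum_of_squares: float

    @staticmethod
    def of(value: float) -> "Aggregate":
        """Aggregate describing a single local measurement."""
        return Aggregate(1, value, value, value, value * value)

    def merge(self, other: "Aggregate") -> "Aggregate":
        """Combine two aggregates; associative and commutative."""
        return Aggregate(self.count + other.count,
                         self.total + other.total,
                         min(self.minimum, other.minimum),
                         max(self.maximum, other.maximum),
                         self.sum_of_squares + other.sum_of_squares)

    @property
    def mean(self) -> float:
        return self.total / self.count

    @property
    def variance(self) -> float:
        return self.sum_of_squares / self.count - self.mean ** 2

    @property
    def std_deviation(self) -> float:
        return math.sqrt(max(self.variance, 0.0))
```

A parent peer would merge its own `Aggregate.of(local_value)` with the aggregates received from its children and forward the result; the message size stays the same whether it summarizes 10 or 1000 peers.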


Evaluation of SkyEye.KOM

Simulation Setup
–  Node count: 5000
–  Churn: Join, KAD churn
–  Tree degree = 4
–  Update interval = 60 sec

Observation:
–  Time-delayed, precise monitoring
–  Very low overhead
   •  < 100 bytes/s
   •  Overhead is precisely monitored


► Benchmarking of Monitoring Solutions

Structured / tree-based solutions
–  Limited in applicability → key-based routing needed

Unstructured / gossip-based solutions
–  Need only a connected graph → not limited in application

Question:
–  Why don’t we use only gossip-based solutions?

Answer:
–  They are inefficient
–  Imprecise OR costly

Comparative Evaluation
–  Node count: 1000, churn
–  SkyEye.KOM: UI = 15 sec, β = 8, synchronized (every 20 min)
–  PushSum: 30 messages per epoch
–  Centralized for comparison, UI = 60 sec
–  Same overhead allowed for all monitoring approaches


Churn – Reference: Node Count

Simulation setup
–  Churn with joining and instantly leaving nodes
–  Both decentralized solutions
   •  Use ca. 200 bytes/s per node
   •  For better comparability

Aggregation time
–  Of global view
–  Performance of tree similar to centralized solution
–  Gossip approach slower
   •  Sawtooth: epochs


Reference Signals: Steps, Sawtooth and Sine

PushSum
–  Imprecise monitoring
–  Epochs are visible
–  Although same traffic overhead

Centralized and tree-based
–  Precise
–  Tree becomes imprecise with too much churn
