18/12/2013 - OPODIS (Nice)
1
An Optimal Broadcast Algorithm for Content-Addressable Networks
Ludovic Henrio
Fabrice Huet
Justine Rochas
2
Background
Efficient Algorithm
Experiments
3
General Motivation – RDF Storage
Context: Semantic Web, RDF data
Challenge: store and retrieve RDF data in a large-scale setting
Our solution: a Content-Addressable Network
4
Content-Addressable Networks (CAN)
Overlay network: nodes are peers
[Figure: a 2D CAN partitioned into zones A, B, C, D and E; axes dim #1 and dim #2, each ranging from 0 to 1]
Structured organization: a multidimensional Cartesian space, entirely partitioned
A zone is managed by one peer; a zone is a (hyper)rectangle
Neighborhood is based on adjacent zones
Routing = successively approaching the target value in all dimensions
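The routing rule can be illustrated with a minimal Python sketch. It is an illustration under assumptions, not the CAN protocol itself: zones are boxes with integer coordinates and half-open intervals, the names `owner` and `route` are hypothetical, and the global `owner` lookup stands in for a peer searching its own neighbor table.

```python
def owner(zones, point):
    """Index of the zone whose half-open box [lo, hi) contains the point."""
    for i, (lo, hi) in enumerate(zones):
        if all(l <= x < h for l, x, h in zip(lo, point, hi)):
            return i
    raise ValueError("point lies outside the CAN space")


def route(zones, start, target):
    """Greedy routing: hop across one zone face at a time towards the target.

    Returns the list of zone indices visited, from `start` to the zone
    that owns `target`.
    """
    path = [start]
    current = start
    while True:
        lo, hi = zones[current]
        # Closest integer point of the current zone to the target.
        probe = [min(max(t, l), h - 1) for l, t, h in zip(lo, target, hi)]
        for d, (l, t, h) in enumerate(zip(lo, target, hi)):
            if t < l:        # target lies below the zone on dimension d
                probe[d] = l - 1
                break
            if t >= h:       # target lies above the zone on dimension d
                probe[d] = h
                break
        else:
            return path      # the current zone contains the target
        # Simulation shortcut (assumption): a real peer would pick this
        # zone from its own neighbor list instead of a global lookup.
        current = owner(zones, tuple(probe))
        path.append(current)


# A 2 x 2 grid of zones over the space [0, 4) x [0, 4).
grid = [((0, 0), (2, 2)), ((2, 0), (4, 2)), ((0, 2), (2, 4)), ((2, 2), (4, 4))]
print(route(grid, 0, (3, 3)))
```

Each hop moves the closest point of the current zone one step nearer to the target, so the route always terminates in the zone that owns the target.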
5
Problem: Cost of Queries
2 queries over 2 variables: conjunction of two 2-dimensional broadcasts
1 query over 2 variables
1 query over 1 variable
Naive broadcast does not scale
[Figure labels: OK, OK, NOT OK]
6
Problem: Duplicated Messages
Duplicated messages: 11 peers, 40 messages!
How to eliminate duplicates? For each peer P, find the peer that is responsible for sending the message to P
[Figure: 2D CAN with a zone labeled E; axes dim #1 and dim #2, each ranging from 0 to 1]
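The duplication problem can be reproduced with a small Python simulation. This is a sketch under assumptions: "naive broadcast" is taken to mean flooding, where each peer forwards on first reception to every neighbor except the one it heard from. The function names and the 2 x 2 example grid are hypothetical and do not correspond to the 11-peer figure.

```python
from collections import deque


def are_neighbors(a, b):
    """Zones are neighbors if they touch on one dimension and overlap,
    with non-zero extent, on every other dimension."""
    (alo, ahi), (blo, bhi) = a, b
    dims = range(len(alo))
    for d in dims:
        touch = alo[d] == bhi[d] or blo[d] == ahi[d]
        overlap = all(alo[j] < bhi[j] and blo[j] < ahi[j]
                      for j in dims if j != d)
        if touch and overlap:
            return True
    return False


def naive_flood(zones, init):
    """Flood from `init`; return (messages sent, receptions per peer)."""
    receptions = [0] * len(zones)
    messages = 0
    seen = {init}
    queue = deque([(init, None)])  # (peer, peer it first heard from)
    while queue:
        peer, sender = queue.popleft()
        for other in range(len(zones)):
            if other == sender or not are_neighbors(zones[peer], zones[other]):
                continue
            messages += 1
            receptions[other] += 1
            if other not in seen:  # forward only on first reception
                seen.add(other)
                queue.append((other, peer))
    return messages, receptions


# A 2 x 2 grid of zones over the space [0, 4) x [0, 4).
grid = [((0, 0), (2, 2)), ((2, 0), (4, 2)), ((0, 2), (2, 4)), ((2, 2), (4, 4))]
messages, receptions = naive_flood(grid, 0)
print(messages, receptions)
```

On this grid, flooding sends 5 messages where 3 would suffice, and two peers receive the message twice.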
7
Existing Solutions
Use the CAN structure to route messages
Meghdoot [1]: « upperLeft » predicate
M-CAN [2]
M-CAN principles:
The initiator peer sends to all its neighbors
Other peers forward to neighbors on the same dimension (opposite side) and on lower dimensions (all sides)
Forwarding on the last dimension depends on a constraint
[1] A. Gupta, O. D. Sahin, D. Agrawal, A. El Abbadi: Meghdoot: Content-Based Publish/Subscribe over P2P Networks. Middleware 2004
Meghdoot: start from a corner
[Figure: CAN with zones A, B and C]
8
M-CAN Execution
[Figure: M-CAN execution. Legend: INIT, corner constraint, message, message that leads to duplication]
[2] S. Ratnasamy, M. Handley, R. M. Karp, S. Shenker: Application-Level Multicast Using Content-Addressable Networks. Networked Group Communication 2001
9
Preliminary Work
Existence of an optimal algorithm proved [3]
A solution that exhibits existence: valid for a very generic definition of CAN, but not efficient (execution time)
Parallelizes message sending only when reaching a « border »
[Figure label: INIT]
[3] Francesco Bongiovanni, Ludovic Henrio: A Mechanized Model for CAN Protocols. FASE 2013
10
Background
Efficient Algorithm
Experiments
11
Hypothesis and Goals
CAN = adjacent rectangles
No additional structure
Not implementation-dependent
Tolerate churn between two broadcasts
Do not tolerate churn during a broadcast
Optimal in number of messages, with good parallelization
A spanning tree
[Figure label: INIT]
Efficient Algorithm – Principle
Removes all duplicates, in all dimensions
How? Uses the corner constraint, plus a spatial constraint
Spatial constraint: a set of fixed values; reduces the problem
Applies recursively
[Figures: spatial constraint in a 3D CAN; spatial constraint in a 2D CAN]
13
Efficient Algorithm
Observation #1: easy to forward in 1D
Observation #2: only one zone touches a corner
Idea of the algorithm:
Suppose an efficient broadcast in dimension N
Apply it on a hyperplane of dimension N - 1
Send to both sides of this hyperplane using the corner constraint
Repeat until the hyperplane is just a line (dimension 1)
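One way to read this idea is sketched below in Python. It is a reconstruction under stated assumptions, not the authors' implementation: zones are integer boxes with half-open intervals, the fixed values of the spatial constraint are taken to be the initiator's lower corner `p`, and a peer that received on dimension D forwards on D in the same direction and on all lower dimensions in both directions. The names `build_can`, `should_send` and `broadcast` are hypothetical, and the scan over all zones stands in for a peer's neighbor list.

```python
import random
from collections import deque


def build_can(n_peers, dims, size=1024, seed=0):
    """Build a CAN by halving randomly chosen zones (assumed split policy)."""
    rng = random.Random(seed)
    zones = [((0,) * dims, (size,) * dims)]
    while len(zones) < n_peers:
        i = rng.randrange(len(zones))
        lo, hi = zones[i]
        d = rng.randrange(dims)
        if hi[d] - lo[d] < 2:
            continue  # too thin to split on d, draw again
        mid = (lo[d] + hi[d]) // 2
        zones[i] = (lo, hi[:d] + (mid,) + hi[d + 1:])
        zones.append((lo[:d] + (mid,) + lo[d + 1:], hi))
    return zones


def should_send(sender, receiver, d, direction, p):
    """Decide whether `sender` forwards to `receiver` on dimension d."""
    (slo, shi), (rlo, rhi) = sender, receiver
    if direction > 0:
        touching = rlo[d] == shi[d]
    else:
        touching = rhi[d] == slo[d]
    # Spatial constraint: on dimensions below d, the receiver must
    # contain the fixed coordinates of p.
    spatial = all(rlo[j] <= p[j] < rhi[j] for j in range(d))
    # Corner constraint: on dimensions above d, the receiver's lower
    # corner must fall inside the sender's extent.
    corner = all(slo[j] <= rlo[j] < shi[j] for j in range(d + 1, len(p)))
    return touching and spatial and corner


def broadcast(zones, init):
    """Return (messages sent, receptions per peer) for a broadcast from init."""
    p = zones[init][0]  # assumption: fixed values = initiator's lower corner
    dims = len(p)
    receptions = [0] * len(zones)
    messages = 0
    # The initiator acts as if it had received on a virtual dimension `dims`.
    queue = deque([(init, dims, 0)])
    while queue:
        s, recv_dim, recv_dir = queue.popleft()
        plan = [(d, sign) for d in range(recv_dim) for sign in (+1, -1)]
        if recv_dim < dims:
            plan.append((recv_dim, recv_dir))  # keep going the same way
        for d, direction in plan:
            for r, zone in enumerate(zones):
                if should_send(zones[s], zone, d, direction, p):
                    messages += 1
                    receptions[r] += 1
                    queue.append((r, d, direction))
    return messages, receptions


zones = build_can(11, dims=2, seed=1)
messages, receptions = broadcast(zones, init=0)
print(messages, max(receptions))
```

In this sketch every peer other than the initiator has exactly one possible sender, namely the zone just behind its constrained corner, so the messages form a spanning tree.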
14
Efficient Algorithm – Execution
[Figure: execution of the efficient algorithm. Legend: INIT, corner constraint, message, message that leads to duplication, spatial constraint]
15
Efficient Algorithm – Properties
Proved to be correct: all peers receive a broadcast message at least once
Proved to be minimal: all peers receive a broadcast message at most once
Elements of proof, when receiving on dimension D:
For dim < D, the spatial constraint is satisfied
For dim = D, ascending or descending direction
For dim > D, the corner constraint is satisfied
This algorithm is optimal: all peers receive a broadcast message exactly once
16
Background
Efficient Algorithm
Experiments
17
Experimental Setup
Using the Grid5000 platform: multisite experimentation
Deployment: from 50 to 1500 peers, up to 200 physical machines
CAN setting: successively split zones in half; the zone to split is chosen randomly
[Figure: CAN with zones A, B and C]
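The CAN setting can be mimicked locally with a short Python sketch. It assumes integer coordinates on a space whose side is a power of two (so that halves are exact) and a randomly chosen split dimension, which the slide does not specify. The names `build_can` and `volume` are hypothetical.

```python
import random


def build_can(n_peers, dims=2, size=1024, seed=None):
    """Split randomly chosen zones in half until n_peers zones exist.

    A zone is a pair (lo, hi) of integer tuples; intervals are half-open.
    """
    if not 1 <= n_peers <= size ** dims:
        raise ValueError("n_peers does not fit in the coordinate space")
    rng = random.Random(seed)
    zones = [((0,) * dims, (size,) * dims)]
    while len(zones) < n_peers:
        i = rng.randrange(len(zones))      # zone to split, chosen randomly
        lo, hi = zones[i]
        splittable = [d for d in range(dims) if hi[d] - lo[d] >= 2]
        if not splittable:
            continue                       # unit zone, draw another one
        d = rng.choice(splittable)         # assumption: random dimension
        mid = (lo[d] + hi[d]) // 2
        zones[i] = (lo, hi[:d] + (mid,) + hi[d + 1:])
        zones.append((lo[:d] + (mid,) + lo[d + 1:], hi))
    return zones


def volume(zone):
    """Number of unit cells covered by a zone."""
    lo, hi = zone
    result = 1
    for l, h in zip(lo, hi):
        result *= h - l
    return result


zones = build_can(50, dims=2, seed=42)
# The zones should still tile the whole space after all the splits.
print(len(zones), sum(volume(z) for z in zones) == 1024 ** 2)
```

Because the zone is drawn uniformly among existing zones rather than by area, small zones are as likely to be split as large ones, which produces an irregular partition.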
18
Number of messages
Maximum gain of 5.3 MB
19
Number of messages
20
Execution Time
Significant speedup
21
Conclusion: Broadcast on CAN
We found an optimal solution: proved to be correct and optimal, efficient in large-scale settings
Supports range multicast
Currently in use in the EventCloud project [4]: management of RDF data; algorithm used for one year; tested and approved!
[4] http://www.play-project.eu/solutions/event-cloud
EventCloud
A range multicast
22
[Figure axes: dim #1, dim #2, dim #3]