Large-Scale Distributed (LSD) Systems
Peer-to-peer file sharing networks: eMule with 2-3 million peers, Skype with 5,801,651 peers (as of 4/4/06 16:14). Grid systems: EGEE with 10,000 CPUs and 10 petabytes of storage. Wireless sensor networks. Introduction
Slide 4
Computing in LSD Systems is Difficult. Global synchronization is impossible, yet synchronization is needed after each iteration. The input constantly changes: it is hard to keep a large system static. Failures are frequent: if each PC fails once a week, a system with a million PCs sees about two failures every second. And of course, scalability is necessary. Introduction
Slide 5
Current State of LSD Computing. Embarrassingly parallel tasks, used in current grid systems: many interesting problems are not embarrassingly parallel. Data storage and retrieval, used in current peer-to-peer systems: no computation is performed here. Introduction
Slide 6
Desired State of LSD Computing. We want to be able to solve more elaborate problems: data mining and optimization problems. In this research we solve the facility location problem in LSD systems.
Introduction
Slide 7
The Facility Location Problem. We are given a set of facilities, a set of clients, and a cost function. We need to choose which facilities to open and which facility serves each client, such that the total cost is minimized. Introduction
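For concreteness, the standard way to write this objective, in notation of our own (f_i is the cost of opening facility i, d(c, i) is the cost of serving client c from facility i, F and C are the facility and client sets):

```latex
\min_{S \subseteq F}\;\sum_{i \in S} f_i \;+\; \sum_{c \in C}\,\min_{i \in S} d(c, i)
```

The first sum is the total facility opening cost and the second is the total service cost.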
Slide 8
Related Work: Data Mining. Most distributed data mining algorithms were designed for small systems: they make extensive use of global synchronization and do not tolerate failures. Meta-learning needs no synchronization and tolerates failures, but its result quality decreases with the number of nodes. Related Work
Slide 9
Related Work: LSD Computing Approaches. Gossip and random-walk based approaches asymptotically converge to the exact result with high probability. Local algorithms eventually achieve the exact result; this is where our work lies.
Related Work
Slide 10
What is a local algorithm? The output of each node depends only on the data of a group of neighbors. Eventual correctness is guaranteed. The size of the group may depend on the problem at hand.
Prerequisites
Slide 11
Local vs. Centralized. Local: 2 link delays, independent of the network size. Centralized: 16 link delays, equal to the network diameter. Prerequisites
Slide 12
Local Facility Location Architecture. Our contribution builds on the local majority voting algorithm proposed by Wolff and Schuster in ICDM 2003, which was used in a local association rule mining algorithm; we extend it here. Prerequisites
Slide 13
Majority Voting. Each node holds a local poll with votes for the red or the green party. Each node wants to know which party won the election. Local Majority Voting
Slide 14
Slides 15-19: figures only.
Global constants: the majority threshold and bias. Input of node u: c_u, the number of local votes; s_u, the number of local red votes; G_u, a set of neighbors. Output of node u: true if the total number of red votes over all nodes exceeds the majority threshold times the total number of votes, and false otherwise. The input is free to change. An ad-hoc output is always available; its accuracy gradually increases, and eventually it becomes exact. Local Majority Voting
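A minimal sketch of how a node might derive its ad-hoc output from its own votes plus the counts most recently reported by its neighbors. The class and field names are ours, and the communication/pruning rules of Wolff and Schuster's algorithm are deliberately omitted; only the output rule is shown.

```python
class MajorityVotingNode:
    def __init__(self, c_u, s_u, threshold):
        self.c_u = c_u                  # number of local votes
        self.s_u = s_u                  # number of local red votes
        self.threshold = threshold      # global majority threshold (e.g., 0.5)
        self.reported = {}              # neighbor -> (c, s) last received

    def on_message(self, neighbor, c, s):
        # Record the counts this neighbor last reported to us.
        self.reported[neighbor] = (c, s)

    def adhoc_output(self):
        # Current knowledge: own votes plus everything neighbors reported.
        c = self.c_u + sum(c for c, _ in self.reported.values())
        s = self.s_u + sum(s for _, s in self.reported.values())
        # True iff, to the best of this node's knowledge, red has a majority.
        return s > self.threshold * c

# Example: 10 local votes, 7 red; a neighbor reports 20 votes, 5 red.
node = MajorityVotingNode(c_u=10, s_u=7, threshold=0.5)
node.on_message("v1", c=20, s=5)
print(node.adhoc_output())  # False: 12 red out of 30 is below the threshold
```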
Slide 20
Distributed Facility Location. Global constants: M, a set of possible facility locations, and the facility (opening) cost. Input of node u: DB_u, a set of clients local to node u; G_u, a set of neighbors; the service cost function. Output of node u: C_u, a set of open facilities such that Cost(C_u) is minimal. The input is free to change. An ad-hoc output is always available; its accuracy gradually increases, and eventually it becomes exact. Distributed Facility Location
Slide 21
Finding the Optimal Solution. Facility location is NP-hard, so we use a hill climbing heuristic; in this case, hill climbing provides a factor-3 approximation. In each step we move, open, or close one facility. We stop when the cost doesn't improve (see the sketch below). Distributed Facility Location
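A minimal, centralized sketch of this local search, under our own assumptions: the data layout, the `manhattan` distance, and the starting configuration are placeholders. The actual algorithm performs the same open/close/move steps, but evaluates them distributively via majority votes as the following slides describe.

```python
import math

def total_cost(open_facs, clients, facility_cost, dist):
    # Opening cost for each open facility plus, for every client,
    # the distance to its nearest open facility.
    if not open_facs:
        return math.inf
    service = sum(min(dist(c, f) for f in open_facs) for c in clients)
    return facility_cost * len(open_facs) + service

def hill_climb(clients, facilities, facility_cost, dist):
    current = {facilities[0]}            # arbitrary starting configuration
    best = total_cost(current, clients, facility_cost, dist)
    improved = True
    while improved:
        improved = False
        # Candidate next configurations: open, close, or move one facility.
        candidates = [current | {f} for f in facilities if f not in current]
        for f in current:
            candidates.append(current - {f})
            candidates.extend((current - {f}) | {g}
                              for g in facilities if g not in current)
        for cand in candidates:
            c = total_cost(cand, clients, facility_cost, dist)
            if c < best:                 # keep the best improving step
                current, best, improved = cand, c, True
    return current, best

# Toy usage: clients and candidate facility sites in the plane.
clients = [(0, 0), (5, 5), (9, 1)]
facilities = [(1, 1), (8, 2), (4, 6)]
manhattan = lambda c, f: abs(c[0] - f[0]) + abs(c[1] - f[1])
print(hill_climb(clients, facilities, facility_cost=3.0, dist=manhattan))
```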
Slide 22
Choosing the Next Step. The current configuration C0 and the candidate next configurations C1, ..., C5 are known to every node; the client data is distributed over the whole network. The local majority voting algorithm can be used to compare the costs of two configurations. Distributed Facility Location
Slide 23
Comparing Two Configurations. Each node votes in favor of one configuration or the other: green for C1, red for C2. The configuration that wins the election has lower cost than the other. The numbers of green and red votes of node u are derived from its local service costs under C1 and C2, together with global constants that account for the facility costs (one possible instantiation is sketched below). Distributed Facility Location
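One plausible way to instantiate this reduction; this is our illustration, not necessarily the exact vote counts used in the algorithm. Each node weights the vote "for C1" by its local cost under C2 and vice versa, so that C1 wins a 1/2-threshold majority exactly when its total service cost is lower.

```python
def vote_counts(local_clients, C1, C2, dist):
    # Local service cost of this node's clients under each configuration.
    cost1 = sum(min(dist(c, f) for f in C1) for c in local_clients)
    cost2 = sum(min(dist(c, f) for f in C2) for c in local_clients)
    # Vote weight in favor of C1 is cost2; total vote weight is cost1 + cost2.
    # Summed over all nodes, "for C1" exceeds half the total exactly when
    # the total service cost of C1 is lower than that of C2.
    votes_for_C1 = cost2
    total_votes = cost1 + cost2
    return total_votes, votes_for_C1

# The facility opening costs, f*|C1| and f*|C2|, enter once as global bias
# terms on the opposing sides of the vote.
```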
Slide 24
Why the ArgMin? Finding the best next step out of k possible options using pairwise majority votes would require O(k^2) comparisons. The local ArgMin algorithm does it in O(k) comparisons on average. The internals of the ArgMin are described at the end. Distributed Facility Location
Slide 25
The ArgMin Interface. Global constants: B, the bias vector. Input of node u: A_u, the addendum vector; G_u, a set of neighbors. Output of node u: the index i that minimizes the i-th entry of B plus the sum of the addendum vectors of all nodes. ArgMin is anytime: its output may change, and like majority voting it never terminates. Distributed Facility Location
Slide 26
Speculative Execution. If we never finish computing the first step, how can we start the second one? The answer: we make a guess and base the next step on it. If the guess turns out to be wrong, we backtrack and recompute (see the sketch below). Speculative Execution
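A minimal sketch of this idea, assuming a `best_step(config)` routine whose anytime answer may still change; the function names and the chain structure are ours.

```python
def speculate(initial_config, best_step, num_iterations):
    # chain[i] is the (speculative) result of iteration i; chain[0] is the input.
    chain = [initial_config]
    while len(chain) <= num_iterations:
        chain.append(best_step(chain[-1]))

    def on_change(i, new_config):
        # The anytime answer of iteration i changed: replace it, discard all
        # later iterations, and recompute them from the corrected guess.
        del chain[i:]
        chain.append(new_config)
        while len(chain) <= num_iterations:
            chain.append(best_step(chain[-1]))
        return chain

    return chain, on_change
```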
Slide 27
Every Node Speculates. 1. Eventually, the first iteration will converge to the exact result, and it will be the same in every node. 2. Then, the second iteration will be able to converge, and so on until all iterations are exact. 3. When all iterations are exact, every node will output the exact solution. Speculative Execution
Slide 28
ArgMin Internals. ArgMin uses majority votes to compare pairs of vector elements, and it is also speculative: in every iteration, each configuration is compared to a pivot (see the sketch below). Local ArgMin
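A centralized sketch of the pivot idea, using an abstract `less(i, j)` comparator in place of a distributed majority vote; the recursion and names are ours.

```python
import random

def argmin_by_pivot(indices, less):
    # Find the index of the minimum using only pairwise comparisons.
    # Comparing every candidate against a random pivot and recursing on the
    # smaller side uses O(k) comparisons in expectation; in the distributed
    # setting each comparison stands for one concurrent majority vote.
    indices = list(indices)
    while len(indices) > 1:
        pivot = random.choice(indices)
        smaller = [i for i in indices if i != pivot and less(i, pivot)]
        indices = smaller if smaller else [pivot]
    return indices[0]

# Example with an explicit vector; less(i, j) stands in for a majority vote.
values = [7.0, 3.5, 9.2, 1.1, 4.4]
print(argmin_by_pivot(range(len(values)), lambda i, j: values[i] < values[j]))  # 3
```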
Slide 29
Experimental Results The number of messages each node sends
does not depend on the network size Experimental Results
Slide 30
The majority of the nodes provide the exact result even when the input continuously changes. Experimental Results
Slide 31
Conclusions. We have described a new facility location algorithm suitable for large-scale distributed systems. The algorithm is scalable, communication-efficient, and able to sustain failures efficiently.
Slide 32
The End. Special thanks to Ran Wolff for his help in supervising this research.