Distributed Computing Group Locality and the Hardness of Distributed Approximation Thomas Moscibroda Joint work with: Fabian Kuhn, Roger Wattenhofer


Thomas Moscibroda @ DANSS 2004

Locality...

• Communication in multi-hop networks is inherently local!
• The issue of locality is crucial in distributed systems!
• Direct communication is possible only between neighbors
• Obtaining information about distant nodes requires multi-hop communication

Obtaining information from the entire network requires plenty of communication!


Locality...

• Many modern networks are large-scale and highly complex
  – Internet
  – Peer-to-Peer Networks
  – Wireless Sensor Networks
• Or even dynamic...
  – Wireless Ad Hoc Networks

No node has global information.
Each node has information from its vicinity only (local information).
Yet, nodes have to achieve a global goal!


Example: Global Goal – Local Information

• Clustering in Wireless Sensor Networks
• Choose clusterheads such that
  – Every node is either a clusterhead or...
  – ...has a clusterhead in its neighborhood.
• Idea: Clusterheads sense the environment. Non-clusterheads can go to sleep mode. → Save energy!
• Goal: We want as few clusterheads as possible! (Minimum Dominating Set Problem)

Nodes have only local information.
Nodes have to optimize a global goal.
Crucial for fast algorithms!


k-Neighborhood

• What does "local" mean?
• How far does "locality" go?
• Neighborhood, 2-hop neighborhood, ... or something else?

[Figure: information spreading in communication rounds 1, 2, 3]


k-Neighborhood

In k rounds of communication, each node can gather information only from its k-neighborhood!

• If message size is unbounded: the entire information from the k-neighborhood (IDs, topology, edge weights, ...) can be collected!
  This is the strongest model for lower bounds on locality!
• If message size is bounded: only a subset of this information can be gathered.
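As a rough illustration of the k-neighborhood idea, the following sketch computes which nodes a given node can possibly have heard from after k synchronous rounds. The function name `k_neighborhood` and the adjacency-dictionary representation are assumptions made for this example, not part of the talk.

```python
from collections import deque


def k_neighborhood(adj, source, k):
    """Return {node: hop distance} for all nodes within k hops of source.

    adj maps each node to an iterable of its neighbors (hypothetical format).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # nodes farther away cannot be reached within k rounds
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


# Path 0 - 1 - 2 - 3 - 4: after 2 rounds, node 0 can only know nodes 0, 1, 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
view = k_neighborhood(path, 0, 2)
```

With unbounded messages, everything inside this set can be collected; with bounded messages, only part of it.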


What can be computed locally? [Naor, Stockmeyer; 1993]

• We want to establish a trade-off between the amount of communication and the quality of the global solution.

[Figure: trade-off between LOCALITY (communication rounds) and GLOBAL GOAL (approximation)]

• Upper Bounds: Constant-Time Approximation Algorithms
• Lower Bounds: Hardness of Distributed Approximation



• How well can global tasks be locally approximated?
  – Minimum Dominating Set
    (Choose a minimum $S \subseteq V$ such that each $v \in V$ is in $S$ or has at least one neighbor in $S$)
  – Minimum Vertex Cover
    (Choose a minimum $S \subseteq V$ such that each $e \in E$ has at least one endpoint in $S$)

Both problems appear to be local in nature!
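The two definitions can be made concrete with small feasibility checks. This is a minimal sketch; the helper names and the adjacency-dictionary format are assumptions chosen for illustration.

```python
def is_dominating_set(adj, chosen):
    """Every node is in `chosen` or has at least one neighbor in `chosen`."""
    return all(v in chosen or any(u in chosen for u in adj[v]) for v in adj)


def is_vertex_cover(adj, chosen):
    """Every edge {u, v} has at least one endpoint in `chosen`."""
    return all(u in chosen or v in chosen for u in adj for v in adj[u])


# Path 0 - 1 - 2 - 3 (hypothetical example graph).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
inner = {1, 2}   # covers every edge and dominates every node
outer = {0, 3}   # dominates every node, but misses the edge {1, 2}
```

The example shows that the two conditions differ: `outer` is a dominating set of the path but not a vertex cover.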



An answer to the previous question helps in answering the following question about exact variants of the problems:

How large must the locality be in order to compute a maximal independent set or a maximal matching?

An answer to this question implies important time lower bounds for distributed algorithms!


Overview

• Introduction to Locality

• Related Work

• Vertex Cover Upper Bound

• Vertex Cover Lower Bound

• Conclusions & Open Problems


Related Work

On Locality

• Which locally checkable labelings can be computed in constant time? [Naor, Stockmeyer; 1993]
• O(log n) algorithm for maximal independent set [Luby; 1986]
• O(log n) algorithm for maximal matching [Israeli, Itai; 1986]
• 3-coloring of a ring in O(log* n) time [Cole, Vishkin; 1986]
• O(log* n) was shown to be optimal [Linial; 1992]

Only previous lower bound on locality!


Related Work

On Distributed Approximation

Upper Bounds (examples...)
  – Minimum Dominating Set Problem [Jia, Rajaraman, Suel; 2001], [Kuhn, Wattenhofer; 2003], ...
  – Minimum Edge-Coloring [Panconesi, Srinivasan; 1997], ...
  – General covering and packing problems [Bartal, Byers, Raz; 1997], [Kuhn, Moscibroda, Wattenhofer; submitted]
  – Facility Location

Lower Bounds
  – Results based on Linial's $\Omega(\log^* n)$ lower bound
  – Recently, a strong lower bound on the distributed approximability of the MST [Elkin; 2004]

For general graphs, we drastically improve this result.







Minimum Vertex Cover

• We consider the most basic coordination problem: Minimum Vertex Cover (MVC)
  Choose as few nodes as possible to cover all edges
• We give an approximation algorithm with...
  – O(k) communication rounds
  – $O(\Delta^{1/k})$ approximation ($\Delta$: maximum node degree)
  – O(log n) bits message size
• General idea: Consider the integer linear program of MVC.
• Distributed primal-dual approach


Minimum Vertex Cover

• (Fractional) MVC is captured by the following linear program

• Its dual is the fractional maximum matching (MM) problem
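The linear programs are not reproduced in the slide text. For orientation, the standard textbook formulations are sketched here; the indexing ($x_i$ per node $v_i$, $y_j$ per edge $e_j$) follows the notation used on the next slide, and the exact form shown in the talk is an assumption.

```latex
\begin{align*}
\text{(fractional MVC)}\quad & \min \sum_{v_i \in V} x_i
  && \text{s.t. } x_i + x_{i'} \ge 1 \;\; \forall \{v_i, v_{i'}\} \in E, \qquad x_i \ge 0 \\
\text{(fractional MM)}\quad & \max \sum_{e_j \in E} y_j
  && \text{s.t. } \sum_{e_j \ni v_i} y_j \le 1 \;\; \forall v_i \in V, \qquad y_j \ge 0
\end{align*}
```

By weak duality, every feasible dual solution lower-bounds every feasible primal solution. Hence, if a cover $x$ comes with dual values $y$ such that $\sum_i x_i = \sum_j y_j$ and $y/\alpha$ is dual feasible, the cover is an $\alpha$-approximation.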


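Distributed MVC Algorithm

• Each node stores a value $x_i$
• Each edge stores a value $y_j$
  – In a real network: the edge is simulated by an incident node!
• Idea:
  – Compute a feasible solution for MVC
  – While doing so: distribute the dual values $y_j$ among incident, uncovered edges, hence $\sum_i x_i = \sum_j y_j$
  – We show that the dual values incident to any node sum to at most $3 + \Delta^{1/k}$

This yields an $O(\Delta^{1/k})$ approximation!

[Figure: a node with $x_i = 1$ and incident edges $y_1, \dots, y_4$; the edges that receive dual value get 1/3 each]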


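Distributed MVC Algorithm

[Figure: pseudocode of the algorithm, with the following annotations]

• Number of incident, uncovered edges ($\delta_i$)
• Maximum $\delta_i$ in the neighborhood
• If the relative number of uncovered edges is high → join VC
• If the sum of $y_j$ in the neighborhood is $\ge 1$: pick the node and distribute $y_j$ proportionally!
• It can be shown that: $\delta_i \le \Delta^{(\ell+1)/k}$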
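The pseudocode itself is not preserved in the slide text, so the following is only a sequential simulation sketched from the annotations and from the join condition quoted in the analysis (a node joins in iteration l if its number of uncovered edges is at least the neighborhood maximum raised to l/(l+1)). The function name, the use of the closed neighborhood for the maximum, and the scaling factor `1 + 1/Y` for the proportional distribution are assumptions, not the authors' confirmed algorithm.

```python
def local_vertex_cover(adj, k):
    """Simulate k synchronous iterations; adj maps node -> set of neighbors.

    Returns (x, y): x[v] == 1 if v joined the cover, y[e] is the dual value
    of edge e (edges are frozensets of their two endpoints).
    """
    x = {v: 0 for v in adj}
    y = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}

    for l in range(k - 1, -1, -1):
        # Snapshot of uncovered edges: both endpoints are still outside the cover.
        uncovered = {v: [u for u in adj[v] if x[u] == 0] if x[v] == 0 else []
                     for v in adj}
        delta = {v: len(uncovered[v]) for v in adj}
        # Assumption: the maximum is taken over the closed neighborhood.
        delta1 = {v: max([delta[v]] + [delta[u] for u in adj[v]]) for v in adj}

        # delta >= delta1^(l/(l+1)), checked with integers to avoid rounding.
        joiners = [v for v in adj
                   if delta[v] > 0 and delta[v] ** (l + 1) >= delta1[v] ** l]
        for v in joiners:
            for u in uncovered[v]:
                y[frozenset((u, v))] += 1.0 / delta[v]  # distribute x_v = 1
        for v in joiners:
            x[v] = 1

        # Nodes whose incident dual sum reached 1 join and scale it up by 1.
        Y = {v: sum(y[frozenset((u, v))] for u in adj[v]) for v in adj}
        for v in adj:
            if x[v] == 0 and Y[v] >= 1:
                x[v] = 1
                for u in adj[v]:
                    y[frozenset((u, v))] *= 1 + 1 / Y[v]
    return x, y


# Star with center 0 and five leaves (hypothetical example), k = 2.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
x, y = local_vertex_cover(star, k=2)
cover = {v for v in x if x[v] == 1}
```

In the last iteration (l = 0) the threshold is 1, so every node that still has an uncovered edge joins, which is why the result is always a feasible cover in this sketch. Each join adds exactly 1 to the total dual value, so the sums of the x and y values stay equal.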


Analysis

Lemma: At the end of the algorithm, for all nodes $v_i \in V$: $Y_i \le 3 + \Delta^{1/k}$

Idea: Bound the sum $Y_i$ of the incident dual variables for each node $v_i$.

Proof: Let $\ell_i$ denote the $i$-th iteration of the loop.

Case 1: Consider a node which does not join the vertex cover!
  1) Until $\ell_0$, $Y_i$ is smaller than 1
  2) In $\ell_0$, $\delta_i$ must be 0
  3) All neighbors must have joined VC before $\ell_0$ → $Y_i \le 1$


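Analysis

Case 2: Consider a node $v_i$ that joins VC in line 5 of iteration $\ell$.

• Before $\ell$: $Y_i \le 1$
• During $\ell$:
  – $x_i := 1$ → $Y_i \le 2$
  – Neighbors $v_k$ may join VC, too → increase $Y_i$
  – All these nodes have $\delta_k \ge (\delta_k^{(1)})^{\ell/(\ell+1)} \ge \delta_i^{\ell/(\ell+1)}$
  – We get at most $1/\delta_k$ from each of the $\delta_i$ uncovered neighbors! → $\delta_i/\delta_k \le \Delta^{1/k}$
• Additional nodes joining VC can increase $Y_i$ by at most 1

→ $Y_i \le 3 + \Delta^{1/k}$

[Figure: dual values on the edges incident to $v_i$ before and during iteration $\ell$, with a neighbor $v_k$ that also joins]

Case 3 is similar.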


Summary VC-Algorithm

• Algorithm runs in O(k) rounds
  Equivalent: Locality is O(k) hops!
• Message size is O(log n) bits
• Approximation quality is $\Delta^{1/k} + O(1)$

How many rounds are necessary for an O(1) or $O(\mathrm{polylog}\,\Delta)$ approximation?
  – $O(\log \Delta)$ time → O(1) approximation
  – $O(\log \Delta / \log\log \Delta)$ time → $O(\mathrm{polylog}\,\Delta)$ approximation

Can we do better?
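The two running times follow by plugging a suitable k into the approximation guarantee. A short worked calculation, assuming the guarantee has the form $\Delta^{1/k} + O(1)$ with $\Delta$ the maximum degree, and taking logarithms to base 2 for concreteness:

```latex
\begin{align*}
k = \log_2 \Delta
  &\;\Rightarrow\; \Delta^{1/k} = 2^{(\log_2 \Delta)/k} = 2 = O(1), \\
k = \frac{\log_2 \Delta}{\log_2 \log_2 \Delta}
  &\;\Rightarrow\; \Delta^{1/k} = 2^{\log_2 \log_2 \Delta} = \log_2 \Delta = O(\mathrm{polylog}\,\Delta).
\end{align*}
```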







Model

• Network graph = graph on which we compute VC

• Nodes have unique identifiers

• Message size and local computation are unbounded

• Strongest possible model for lower bounds
  → Lower bounds are a consequence of locality alone


Basic Idea

• $S_0$ and $S_1$ contain $n_0$ and $n_0/\delta$ nodes, resp. ($\delta$: parameter of the construction)
• The optimal VC does not contain nodes of $S_0$
• Basic structure of our proof:
  1. Construct a graph such that nodes in $S_0$ and $S_1$ have the same view
  2. The algorithm has to take nodes of $S_0$ in order to cover the edges between $S_0$ and $S_1$
  3. $|VC_{ALG}| \gg |VC_{OPT}|$ because $|S_0| \gg |S_1|$

[Figure: clusters $S_0$ and $S_1$]


One Round Lower Bound

[Figure: construction with clusters $S_0$ and $S_1$]


Two Round Lower Bound

[Figure: construction with clusters $S_0$ and $S_1$]


Two Round Lower Bound: Views

[Figure: view of a node in $S_0$ and view of a node in $S_1$]


The Cluster Tree I


The Cluster Tree II

• Cluster tree = a tree of clusters of nodes
• Recursively defined for k > 0
• Defines the structure of a graph $G_k$
• Each link of the tree is a bipartite subgraph of $G_k$
• If the girth of $G_k$ is at least 2k+1, nodes in $S_0$ and $S_1$ have the same view up to distance k


Construction of Gk

• How can we achieve high girth?
• $G_k$ is a bipartite graph (even-level clusters / odd-level clusters)
• For a prime power q, D(r,q) is a bipartite graph with $2q^r$ nodes and girth at least r+5 [Lazebnik, Ustimenko; Explicit Construction of Graphs With an Arbitrary Large Girth and of Large Size; 1995]
• If $\delta > k$, $G_k$ can be constructed as a subgraph of D(2k-4, q) for $q = \delta^{O(k)}$
  → $G_k$ has $n = \delta^{O(k^2)}$ nodes

This is according to intuition, because every node in $S_0$ must see a tree up to distance k.


Bounding the Optimum

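• The number of nodes decreases by a factor of at least $\delta/k$ on each level:
  level 0: $n_0$, level 1: $\le n_0 k/\delta$, level 2: $\le n_0 k^2/\delta^2$, level 3: $\le n_0 k^3/\delta^3$, ... (geometric series)
• If $\delta > 2k$: $n < n_0 + 2 n_0 k/\delta$
• All nodes in $V \setminus S_0$ form a feasible vertex cover, hence $|VC_{OPT}| < 2 n_0 k/\delta$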
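The bound on n is the sum of the geometric series over the levels. A short worked version, where $\delta$ is assumed to be the name of the construction parameter (the symbol is not preserved in the slide text) and level $i$ is taken to contain at most $n_0 (k/\delta)^i$ nodes:

```latex
\begin{align*}
n \;\le\; \sum_{i \ge 0} n_0 \left(\frac{k}{\delta}\right)^{i}
  \;=\; n_0 + n_0 \cdot \frac{k/\delta}{1 - k/\delta}
  \;<\; n_0 + \frac{2 n_0 k}{\delta}
  \qquad \text{for } \delta > 2k,
\end{align*}
```

because $1 - k/\delta > 1/2$ in that case. Since all nodes outside $S_0$ form a feasible cover, $|VC_{OPT}| \le n - n_0 < 2 n_0 k/\delta$.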


Bounding any distributed algorithm

• Assume that the labeling (IDs) is chosen uniformly at random.
• Nodes $v_0$ in $S_0$ and $v_1$ in $S_1$ see
  – the same topology
  – the same probability distribution of labels
  – hence both have the same probability p of being in VC
• p is at least 1/2, otherwise there is a positive probability that VC is not feasible!
  Therefore: in expectation, at least half of the nodes in $S_0$ join VC!
• For all algorithms, there is a labeling with $|VC_{ALG}| \ge n_0/2$
• Randomized: $|VC_{ALG}| \ge n_0/2$ by Yao's minimax principle

[Figure: nodes $v_0$ and $v_1$]
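The step from "same view" to "p is at least 1/2" can be spelled out with a union bound. This sketch assumes that $v_0$ and $v_1$ are joined by an edge and that the algorithm always outputs a feasible cover:

```latex
\begin{align*}
1 \;=\; \Pr[\,v_0 \in VC \;\lor\; v_1 \in VC\,]
  \;\le\; \Pr[v_0 \in VC] + \Pr[v_1 \in VC] \;=\; 2p
  \quad\Rightarrow\quad p \ge \tfrac{1}{2},
\qquad
\mathbb{E}\bigl[\,|VC_{ALG} \cap S_0|\,\bigr] \;=\; p \cdot n_0 \;\ge\; \frac{n_0}{2}.
\end{align*}
```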


Approximation Lower Bound

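• We have $|VC_{ALG}| \ge n_0/2$ and $|VC_{OPT}| < 2 n_0 k/\delta$
  → Approximation $\ge |VC_{ALG}|/|VC_{OPT}| > \delta/(4k)$
• $n = \delta^{O(k^2)}$, $\Delta = \delta^{k+1}$

In k communication rounds, no algorithm can approximate MVC better than $\Omega(n^{c/k^2}/k)$ or $\Omega(\Delta^{1/k}/k)$.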


Time Lower Bound

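• For a constant/polylog approximation, we need $\Omega(\sqrt{\log n/\log\log n})$ and $\Omega(\log \Delta/\log\log \Delta)$ rounds
• Recall our vertex cover algorithm:
  – $O(\log \Delta)$ time → O(1) approximation
  – $O(\log \Delta/\log\log \Delta)$ time → $O(\mathrm{polylog}\,\Delta)$ approximation

Can we do better?

The algorithm is tight for polylog approximation and tight up to $O(\log\log \Delta)$ for constant approximation!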


Hardness of Approximation → Exact Problems

• Approximation theory is a very active area of research (see STOC, FOCS, SODA, ...)
• Study of lower bounds on approximation → Hardness of Approximation!
• This has led to new insights in complexity theory (PCP, ...)

Study of approximability → better understanding of exact problems!


Hardness of Approximation → Exact Problems

• The endpoints of a maximal matching (MM) form a 2-approximation for MVC...
• MM is a maximal independent set (MIS) on the line graph, ...

Does the same hold in distributed computing? To some degree, it does!

→ Time lower bounds for MIS and maximal matching

Compare with the $\Omega(\log^* n)$ lower bound on the ring and the O(log n) upper bound.
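The link between maximal matching and vertex cover can be illustrated with a small sequential sketch. It uses a simple greedy maximal matching, not a distributed algorithm from the talk, and the function names are assumptions. Any cover must contain at least one endpoint of every matching edge, so taking both endpoints is at most twice the optimum.

```python
def maximal_matching(edges):
    """Greedily pick edges whose endpoints are both still unmatched."""
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def cover_from_matching(matching):
    """Both endpoints of every matching edge form a vertex cover."""
    return {node for edge in matching for node in edge}


# Path 0 - 1 - 2 - 3 (hypothetical example); an optimal cover is {1, 2}.
edges = [(0, 1), (1, 2), (2, 3)]
matching = maximal_matching(edges)
cover = cover_from_matching(matching)
```

On this path the greedy matching picks the two outer edges, so the resulting cover has four nodes, exactly twice the optimum of two.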


What about Dominating Sets ?

• For each VC instance, there is a graph on which the dominating set is the same
  → MVC bounds also hold for MDS
• The approximation lower bound can also be extended to maximum matching (more than just a reduction)


Conclusions

• Locality is vital in distributed systems.
• Not much is known so far...

• In this talk, lower bounds on local computation

tight up to a factor of

and

Vertex Cover, Dominating Set, Maximum Matching,

MIS, Maximal Matching,...

• The hardness of distributed approximation is an interesting research topic



Questions? Comments?
