Dense subgraphs of random graphs Uriel Feige Weizmann Institute


Page 1: Dense subgraphs  of random graphs

Dense subgraphs of random graphs

Uriel Feige

Weizmann Institute

Page 2: Dense subgraphs  of random graphs

Talk Outline

Discuss problems related to dense subgraphs of random graphs:

Planted k-clique.

Dense k-subgraph (if time permits).

Page 3: Dense subgraphs  of random graphs

Random Clique

Random graph G on n vertices and edge probability ½.

Maximum clique size is almost surely about 2 log n (logarithms are base 2).

Upper bound: first moment (expectation).

Lower bound: second moment (expectation + variance).

Not constructive.
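The first-moment calculation behind the upper bound can be checked numerically. The sketch below uses the standard formula E[number of k-cliques] = C(n,k) · 2^(-C(k,2)) for G(n, 1/2) and finds the smallest k at which this expectation drops below 1. The function names and the choice n = 1000 are illustrative and do not come from the talk.

```python
from math import comb, log2

def expectation_below_one(n, k):
    """True if C(n, k) * 2^(-C(k, 2)) < 1, compared exactly with integers."""
    return comb(n, k) < 2 ** comb(k, 2)

def first_moment_threshold(n):
    """Smallest k whose expected number of k-cliques in G(n, 1/2) is below 1."""
    k = 1
    while not expectation_below_one(n, k):
        k += 1
    return k

n = 1000  # illustrative size, not from the talk
k0 = first_moment_threshold(n)
print(k0, 2 * log2(n))
```

The threshold lands somewhat below 2 log n for moderate n, which is consistent with the lower-order terms that the slide omits.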

Page 4: Dense subgraphs  of random graphs

How to actually find the clique?

Greedy(degree) algorithm finds clique of size log n (plus low order terms).

No better polytime algorithm known.

Exhaustive search in time n^{O(log n)}.
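The talk does not spell out the Greedy(degree) rule. A minimal sketch under one plausible reading: repeatedly pick the candidate with the most neighbors among the remaining candidates, then restrict the candidates to its neighborhood. The function names, n = 500, and the seed are assumptions for illustration only.

```python
import random

def random_graph(n, seed=0):
    """Sample G(n, 1/2) as a list of neighbor sets."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_clique(adj):
    """Assumed greedy rule: take the best-connected candidate, keep its neighbors."""
    candidates = set(range(len(adj)))
    clique = []
    while candidates:
        v = max(candidates, key=lambda u: len(adj[u] & candidates))
        clique.append(v)
        candidates &= adj[v]  # v itself drops out because v is not in adj[v]
    return clique

adj = random_graph(500, seed=1)
clique = greedy_clique(adj)
print(len(clique))
```

Each step roughly halves the candidate set, which is where the log n clique size comes from.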

Page 5: Dense subgraphs  of random graphs

Cryptographic applications [Juels and Peinado]

Assuming state of the art is not improved:

• One-way functions.

• Hierarchical keys.

(Idea: distribution does not change if a small number of cliques of size 1.5 log n are planted in the graph.)

Page 6: Dense subgraphs  of random graphs

Planted/hidden clique

Random graph G on n vertices and edge probability ½.

A random set H of k vertices turned into a clique.

If k > 2log n, H will almost surely be the unique maximum clique in G.

Find H. Becomes easier the larger k is.
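The input distribution just described can be sampled directly. The following is a small sketch; the function name, the parameter values, and the use of Python's built-in pseudorandom generator are illustrative choices and not part of the talk.

```python
import random

def planted_clique_graph(n, k, seed=0):
    """Sample G(n, 1/2), then turn a random k-subset H into a clique."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    hidden = set(rng.sample(range(n), k))
    for u in hidden:
        adj[u] |= hidden - {u}  # add every missing edge inside H
    return adj, hidden

adj, hidden = planted_clique_graph(200, 30, seed=7)
print(sorted(hidden)[:5])
```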

Page 7: Dense subgraphs  of random graphs

Degree concentration

• Degrees of vertices in G strongly concentrated around n/2.

• The degree distribution of H-vertices differs statistically from that of the other vertices if k is larger than the standard deviation of the degrees (about n^{1/2}/2).

• Kucera: if k > c(n log n)^{1/2}, H is simply all vertices of largest degree.

(Greedy(degree) algorithm outputs H)
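The degree-based recovery can be illustrated on a small instance. This sketch plants a clique that is large compared with (n log n)^{1/2} and returns the k vertices of highest degree. The sizes n = 400 and k = 200 are chosen to make the degree gap wide; they are assumptions for illustration, not values from the talk.

```python
import random

def planted_clique_graph(n, k, seed=0):
    """Sample G(n, 1/2) and plant a clique on a random k-subset."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    hidden = set(rng.sample(range(n), k))
    for u in hidden:
        adj[u] |= hidden - {u}
    return adj, hidden

def top_degree_guess(adj, k):
    """Return the k vertices of largest degree."""
    by_degree = sorted(range(len(adj)), key=lambda v: len(adj[v]), reverse=True)
    return set(by_degree[:k])

adj, hidden = planted_clique_graph(400, 200, seed=3)
guess = top_degree_guess(adj, 200)
print(len(guess & hidden))
```

An H-vertex gains about k/2 in expected degree, so the two degree distributions separate once k/2 exceeds the largest fluctuation among n vertices, which is of order (n log n)^{1/2}.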

Page 8: Dense subgraphs  of random graphs

Use of eigenvectors [Alon, Krivelevich and Sudakov]

• Normalize adjacency matrix of G to sum up to 0.

• Eigenvalues of G strongly concentrated around 0. No value larger than n^{1/2}.

• If k > cn^{1/2}, H contributes a larger eigenvalue.

• H can be recovered from the eigenvector that corresponds to largest eigenvalue (takes some work).
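A rough sketch of the spectral idea follows, under stated assumptions: "normalized to sum up to 0" is read as the matrix with +1 for edges, -1 for non-edges and 0 on the diagonal; the top eigenvector is approximated by plain power iteration; and the clean-up step (keep vertices with at least 3k/4 neighbors in the candidate set) is one illustrative way to do the "some work". None of the names or parameter values come from the talk.

```python
import math
import random

def planted_clique_graph(n, k, seed=0):
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    hidden = set(rng.sample(range(n), k))
    for u in hidden:
        adj[u] |= hidden - {u}
    return adj, hidden

def top_eigenvector_pm1(adj, iterations=50):
    """Power iteration on the +1/-1 matrix, computed from neighbor sets."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(iterations):
        total = sum(x)
        # (Mx)_i = sum over neighbors - sum over non-neighbors (excluding i)
        y = [2.0 * sum(x[j] for j in adj[i]) - (total - x[i]) for i in range(n)]
        norm = math.sqrt(sum(t * t for t in y))
        x = [t / norm for t in y]
    return x

def spectral_guess(adj, k):
    x = top_eigenvector_pm1(adj)
    order = sorted(range(len(adj)), key=lambda v: x[v], reverse=True)
    candidate = set(order[:k])
    # Assumed clean-up rule: keep vertices adjacent to most of the candidate set.
    recovered = {v for v in range(len(adj)) if len(adj[v] & candidate) >= 0.75 * k}
    return candidate, recovered

adj, hidden = planted_clique_graph(200, 80, seed=5)
candidate, recovered = spectral_guess(adj, 80)
print(len(candidate & hidden), len(recovered))
```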

Page 9: Dense subgraphs  of random graphs

Constant improvements

• Guess a vertex from H, and restrict problem to its neighborhood.

• Clique relative size increases, and graph remains random.

• Can find planted cliques of size n^{1/2}/2^t in time n^{O(t)}.

• Polynomial (but very slow) for fixed t.
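The effect of restricting to a neighborhood can be seen in a small sketch: if the guessed vertex v lies in H, the rest of H stays inside the neighborhood of v while the number of vertices roughly halves, so the ratio of clique size to the square root of the graph size goes up. The helper names and the values n = 400, k = 40 are illustrative assumptions.

```python
import math
import random

def planted_clique_graph(n, k, seed=0):
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    hidden = set(rng.sample(range(n), k))
    for u in hidden:
        adj[u] |= hidden - {u}
    return adj, hidden

def restrict_to_neighborhood(adj, v):
    """Induced subgraph on the neighbors of v, as a dict of neighbor sets."""
    keep = adj[v]
    return {u: adj[u] & keep for u in keep}

n, k = 400, 40
adj, hidden = planted_clique_graph(n, k, seed=2)
v = min(hidden)  # stands in for a correct guess of a vertex of H
sub = restrict_to_neighborhood(adj, v)
ratio_before = k / math.sqrt(n)
ratio_after = (k - 1) / math.sqrt(len(sub))
print(len(sub), ratio_before, ratio_after)
```

In an actual algorithm v is not known, so all n choices (or all t-tuples) are tried, which is the source of the n^{O(t)} running time.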

Page 10: Dense subgraphs  of random graphs

Use of SDP [Feige and Krauthgamer]

The Lovász theta function provides an upper bound on the clique size.

On random graphs, its value is known to be O(n^{1/2}).

Can be used both to find H and to certify its optimality when k > n^{1/2}.

Page 11: Dense subgraphs  of random graphs

Going below n^{1/2}

• A certain Markov chain approach fails [Jerrum].

• Use of t levels of Lovász-Schrijver SDP relaxations is no better than simply guessing t vertices of the clique [Feige and Krauthgamer].

• For k > n^{1/3}, H can be recovered from a global maximum of a certain cubic form [Frieze and Kannan].

Page 12: Dense subgraphs  of random graphs

Why care about planted clique?

Seems to require the development of new algorithmic techniques.

A concrete challenge for understanding observable properties of random graphs (does planting a large clique make a noticeable difference?).

Related to some other problems.

Page 13: Dense subgraphs  of random graphs

Interesting connection

• In a 2-person game, an approximate Nash equilibrium with nearly best payoffs (compared to a true Nash equilibrium) can be found in time n^{O(log n)} [Lipton, Markakis and Mehta].

• A poly-time algorithm for approximate best Nash would solve the hidden clique problem in polynomial time [Hazan and Krauthgamer].

Page 14: Dense subgraphs  of random graphs

The experimental approach to the design and analysis of algorithms

For hidden clique, the input distribution is well defined and can be sampled from efficiently.

To evaluate a candidate algorithm, run it on a random sample and observe performance.

• If not good, modify the algorithm.

• If good, analyze the algorithm.

In practice, graphs for experiments are generated using pseudorandom generators.

Page 15: Dense subgraphs  of random graphs

Experimental results (with Dorit Ron)

• n = 40,000.

• m = 400,000,000.

• n^{1/2} = 200.

For success rate roughly ½:

• k = 158 (Alg1 - LDR), 137 (Alg2 - TPMR).

Is this good or bad?

• 2 log n = 30.

• n^{1/4} = 14.

Page 16: Dense subgraphs  of random graphs

Understanding large sets of results

Estimating the success probability to within 1% error requires roughly 10,000 experiments.

To see patterns, it helps to display the results graphically.

Do our algorithms work when k = n^{0.49}?

Need experiments with large n.
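The figure of roughly 10,000 experiments matches a standard sample-size calculation. The sketch below assumes a normal approximation, a 95% confidence level (z = 1.96) and a true success rate near 1/2; these assumptions are not stated in the talk.

```python
from math import ceil, sqrt

def trials_needed(margin, p=0.5, z=1.96):
    """Trials so that the z-level confidence interval has half-width `margin`."""
    return ceil(z * z * p * (1 - p) / margin ** 2)

def standard_error(trials, p=0.5):
    """Standard deviation of the empirical success rate after `trials` runs."""
    return sqrt(p * (1 - p) / trials)

needed = trials_needed(0.01)
print(needed, standard_error(10_000))
```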

Page 17: Dense subgraphs  of random graphs
Page 18: Dense subgraphs  of random graphs

Jumping to conclusions

Care is needed.

• Is the PRG the issue?

• Is n sufficiently large to draw asymptotic conclusions?

• Might the choice of scaling of the x-axis be biasing our interpretation?

Page 19: Dense subgraphs  of random graphs
Page 20: Dense subgraphs  of random graphs

Jump to the analysis?

The TPMR algorithm (Truncated Power Method Removal) looks promising.

Difficult to analyze, but worth it, because the algorithm is so special.

Or is it? (there was also Alg1 …)

Page 21: Dense subgraphs  of random graphs
Page 22: Dense subgraphs  of random graphs
Page 23: Dense subgraphs  of random graphs

Information on the algorithms

General idea:

• Sort vertices by likelihood of being in H.

• Remove (one or more) least likely vertices.

• Repeat.

Our algorithms take linear time (in m).

Page 24: Dense subgraphs  of random graphs

Low Degree Removal (LDR)

Iterative removal phase:

• If current graph is a clique, move to expansion phase.

• Remove vertex of lowest degree (breaking ties arbitrarily).

Iterative expansion phase:

• Add vertices that are adjacent to every vertex of the clique.
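The two phases of LDR can be sketched as follows. This is a simple quadratic-time version for illustration, not the linear-time implementation mentioned in the talk. The order in which removed vertices are reconsidered during expansion (most recently removed first) is an assumption, as are the function names and the instance size.

```python
import random

def planted_clique_graph(n, k, seed=0):
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    hidden = set(rng.sample(range(n), k))
    for u in hidden:
        adj[u] |= hidden - {u}
    return adj, hidden

def ldr(adj):
    alive = set(range(len(adj)))
    deg = [len(nbrs) for nbrs in adj]  # degrees within the current graph
    removed = []
    # Removal phase: the current graph is a clique iff its minimum degree is |alive| - 1.
    while min(deg[v] for v in alive) < len(alive) - 1:
        v = min(alive, key=lambda u: deg[u])  # ties broken arbitrarily
        alive.remove(v)
        removed.append(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
    # Expansion phase: add back vertices adjacent to the whole current clique.
    clique = set(alive)
    for v in reversed(removed):
        if clique <= adj[v]:
            clique.add(v)
    return clique

adj, hidden = planted_clique_graph(300, 100, seed=11)
found = ldr(adj)
print(len(found), found == hidden)
```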

Page 25: Dense subgraphs  of random graphs

Theorem

For every ε < 1 there is a constant c such that if k > c n^{1/2} then LDR finds the hidden k-clique H for at least an ε fraction of the input instances.

Page 26: Dense subgraphs  of random graphs

Sketch of proof of theorem

Lemma 1. In every subgraph with t > 11k/10 vertices, some vertex not in H has degree at most t/2 + c_1 n^{1/2}.

Proof. Straightforward. Large deviation bounds on average degree + union bound.

Page 27: Dense subgraphs  of random graphs

Corollary

As long as t > 11k/10 vertices remain, LDR removes a vertex of degree “not much larger” than t/2 (at most t/2 + c_1 n^{1/2}).

Page 28: Dense subgraphs  of random graphs

Lemma 2

For any vertex v,

with high probability (say 99/100),

up to the point v was removed (if at all),

v’s average degree to removed vertices not in H is at most 1/2,

with a total deviation no larger than c_2 n^{1/2}.

Page 29: Dense subgraphs  of random graphs

Sketch of proof of Lemma 2

Reveal the edges of v only when needed.

Given a candidate vertex u for removal, if no edge (u,v) then remove u. Otherwise perhaps delay removal.

Average rate of removal at most 1/2.

Probability of excursion larger than c_2 n^{1/2} is small.

Page 30: Dense subgraphs  of random graphs

Most vertices of H survive LDR.

Almost all vertices of H start with “very high” degree (assuming that c > 4(c_1 + c_2)).

There are always vertices of not high degree available for removal. (Lemma 1.)

The first k/10 high degree vertices of H to be removed must have lost degree at a high rate. This is a low probability event, by Lemma 2 and Markov’s inequality.

Page 31: Dense subgraphs  of random graphs

Finishing the proof

At least 9k/10 vertices of H are among the last 11k/10 survivors.

Hence no vertex not in H can survive the removal phase.

Expansion phase will pick up remaining vertices from H.

Page 32: Dense subgraphs  of random graphs

Conjectures

• The leading constant c is small: when ε = 1/2, c < 1 suffices.

• Order of quantifiers can be switched: for some c, the fraction tends to 1 as n grows.

• Lower bounds: LDR fails when k = o(n^{1/2}).

Page 33: Dense subgraphs  of random graphs

Open question

Does the size of the planted clique exhibit threshold behavior with respect to the success probability of the LDR algorithm?

Page 34: Dense subgraphs  of random graphs

Truncated Power Method Removal (TPMR) algorithm

• Initially x is the vector of degrees.

• Compute x’ = Ax.

• Normalize x’ to sum up to 0.

• Average x and x’ to get a new x.

• Repeat 6 times.

Sort vertices by their x value.

Remove the lower 10%.

Etc.
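One removal round of the steps above can be sketched as follows. Several details are assumptions, because the slide leaves them open: both x and x’ are shifted to sum to 0 and scaled to unit Euclidean norm so that averaging them is meaningful, A is taken to be the 0/1 adjacency matrix of the current graph, and the rule for repeating rounds and for stopping (the "Etc.") is deliberately left out. All names and parameter values are illustrative.

```python
import math
import random

def planted_clique_graph(n, k, seed=0):
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    hidden = set(rng.sample(range(n), k))
    for u in hidden:
        adj[u] |= hidden - {u}
    return adj, hidden

def center_and_scale(x):
    """Shift x to sum to 0, then scale to unit norm (assumed normalization)."""
    mean = sum(x.values()) / len(x)
    centered = {v: t - mean for v, t in x.items()}
    norm = math.sqrt(sum(t * t for t in centered.values()))
    return centered if norm == 0 else {v: t / norm for v, t in centered.items()}

def tpmr_round(adj, alive, iterations=6, drop=0.10):
    """Return the vertices that one truncated-power-method round would remove."""
    nbrs = {v: adj[v] & alive for v in alive}
    x = center_and_scale({v: float(len(nbrs[v])) for v in alive})  # degrees
    for _ in range(iterations):
        x_new = center_and_scale({v: sum(x[u] for u in nbrs[v]) for v in alive})
        x = {v: 0.5 * (x[v] + x_new[v]) for v in alive}  # average x and x'
    order = sorted(alive, key=lambda v: x[v])
    cut = max(1, int(drop * len(alive)))
    return set(order[:cut])

adj, hidden = planted_clique_graph(300, 100, seed=4)
dropped = tpmr_round(adj, set(range(300)))
print(len(dropped), len(dropped & hidden))
```

On an easy instance such as this one, the lowest-scoring 10% should contain no vertex of H; behavior near the threshold k ≈ n^{1/2} is what the experiments in the talk investigate.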

Page 35: Dense subgraphs  of random graphs

Some observations on TPMR

• Linear time in m, though slower than LDR.

• Finds smaller planted cliques than LDR.

Why not let x converge?

• Faster.

• Performs better in our experiments.

Any hope of analysing TPMR?

Page 36: Dense subgraphs  of random graphs

Summary

Experimental approach suggests interesting observations.

• Commit in small steps. (Related to “decimation” in message passing algs.)

• Truncated power method is better than power method.

Challenge: support observations by analysis.

Page 37: Dense subgraphs  of random graphs

Running times

Lenovo, 2.53 GHz, 3 GB RAM.

20 samples with around 50% success rates.

N | GEN | LDR | TPMR
2500 | 14 | 17 (3) | 48 (34)
5000 | 72 | 80 (8) | 199 (127)
10000 | 334 | 365 (31) | 832 (498)
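A quick arithmetic check of the table: the parenthesized values appear to equal the listed time minus the GEN column, i.e. the time spent in the algorithm itself once graph generation is excluded. This reading is an inference from the numbers; the slide states neither it nor the time units.

```python
# N -> (GEN, LDR, TPMR), copied from the table above
rows = {
    2500: (14, 17, 48),
    5000: (72, 80, 199),
    10000: (334, 365, 832),
}

# Subtract generation time from each algorithm's listed time.
net = {n: (ldr - gen, tpmr - gen) for n, (gen, ldr, tpmr) in rows.items()}
print(net)
```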