47
Computer Science and Engineering Computing A Near-Maximum Independent Set in Linear Time by Reducing-Peeling Lijun Chang University of New South Wales, Australia [email protected] Joint work with Wei Li, Wenjie Zhang

Computing A Near-Maximum Independent Set in Linear Time by

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Computing A Near-Maximum Independent Set in Linear Time by

Computer Science and Engineering

ComputingANear-MaximumIndependentSetinLinearTimebyReducing-Peeling

Lijun Chang

University of New South Wales, [email protected]

Joint work with Wei Li, Wenjie Zhang

Page 2: Computing A Near-Maximum Independent Set in Linear Time by

Outline

q Introduction

q Existing Works

q Our Reducing-Peeling Framework

q Our Approaches

q Experimental Studies

q Conclusion

Page 3: Computing A Near-Maximum Independent Set in Linear Time by

Introduction

Given a graph ! = ($, &), a vertex subset ( ⊆ $ is an independent set if for any two vertices * and + in (, there is no edge between *and + in !.

Independent Set

An independent set(of ! is a maximum independent set if its size is the largest among all independent sets of !.

Maximum Independent Set

Page 4: Computing A Near-Maximum Independent Set in Linear Time by

Introduction

Given a graph ! = ($, &), a vertex subset ( ⊆ $ is an independent set if for any two vertices * and + in (, there is no edge between *and + in !.

Independent Set

An independent set(of ! is a maximum independent set if its size is the largest among all independent sets of !.sets of !.

Maximum Independent Set

Independent Set

Page 5: Computing A Near-Maximum Independent Set in Linear Time by

Introduction

Given a graph ! = ($, &), a vertex subset ( ⊆ $ is an independent set if for any two vertices * and + in (, there is no edge between *and + in !.

Independent Set

An independent set(of ! is a maximum independent set if its size is the largest among all independent sets of !.

Maximum Independent Set

MaximumIndependent

Set

Page 6: Computing A Near-Maximum Independent Set in Linear Time by

Introduction

Applicationsv Build index for shortest path/distance queries [Cheng et al.

SIGMOD’12, Fu et al. VLDB’13]v Refine the result of matching two graphs [Zhu et al. VLDB J’13]v Social network coverage [Puthal et al. BigData’15]; vertex

cover

Hardnessv NP-hard to compute a maximum independent set [Garey et al.

Book’79]v Hard to approximate

§ NP-hard to approximate within a factor of -./0 for any 0 <3 < 1 [J. Håstad. FOCS’96]

Page 7: Computing A Near-Maximum Independent Set in Linear Time by

Outline

q Introduction

q Existing Works

q Our Reducing-Peeling Framework

q Our Approaches

q Experimental Studies

q Conclusion

Page 8: Computing A Near-Maximum Independent Set in Linear Time by

Existing WorksExact algorithms -- branch-and-reduce paradigm

v [F. V. Fomin et al .J.ACM’09] § Theoretically runs in 5∗ 1.22019 time

v [T. Akiba et al. Theor. Comput. Sci.’16]§ Practically computes the exact solution for many small and

medium-sized graphs

Approximation algorithmsv [U. Feige J. Discrete Math’04, M. M. Halldórsson et al.

Algorithmica’97, P. Berman. Theor.Comput. Sys.’99]§ Approximation ratio largely depends on n or Δ§ Not practically useful

Page 9: Computing A Near-Maximum Independent Set in Linear Time by

Existing Works

Heuristic algorithms for large graphsv Linear-time algorithms

§ Greedy, dynamic update§ Efficient, but can only find small independent sets in

practicev Iterative randomized searching

§ Local search algorithm: ARW [D. V. Andrade. J.Heuristics’12]

§ Evolutionary algorithm: ReduMIS [S. Lamm. ALENEX’16]§ Local search + simple reduction rules: OnlineMIS [J.

Dahlum. SEA’16]§ Can find large independent sets, but take long time

Our goal: find large independent sets in a time-efficientand space-effective manner

Page 10: Computing A Near-Maximum Independent Set in Linear Time by

Outline

q Introduction

q Existing Works

q Our Reducing-Peeling Framework

q Our Approaches

q Experimental Studies

q Conclusion

Page 11: Computing A Near-Maximum Independent Set in Linear Time by

Three Observations Utilized in Our Frameworkv Observation–I: Real networks are usually power-law graphs with

many low-degree vertices

v Observation-II: Reduction rules have been effectively used for low-degree vertices

v Observation-III: High-degree vertices are less likely to be in a maximum independent set

;< =>? = @ ∝B

@C

Page 12: Computing A Near-Maximum Independent Set in Linear Time by

Three Observations Utilized in Our Frameworkv Observation–I: Real networks are usually power-law graphs with

many low-degree vertices

v Observation-II: Reduction rules have been effectively used for low-degree vertices

(b) Isolation D ! = D !\ +, F(c) Folding D ! = D !/ *, +, F + 1

v Observation-III: High-degree vertices are less likely to be in a maximum independent set

(a) D ! = D !\ +

Degree-one Reduction

Degree-two ReductionsD ! : J-KLML-KL-NL-*OPLQRS!

Page 13: Computing A Near-Maximum Independent Set in Linear Time by

Three Observations Utilized in Our Frameworkv Observation–I: Real networks are usually power-law graphs with

many low-degree vertices

v Observation-II: Reduction rules have been effectively used for low-degree vertices

v Observation-III: High-degree vertices are less likely to be in a maximum independent set

Ø If a high-degree vertex is added into the independent set, then all its neighbors, which are of a large quantity, are ruled out from the independent set [J. Dahlum et al SEA’16]

Ø Removing/peeling high-degree vertices can further sparsify the graph [Y. Lim et al TKDE’14]

Page 14: Computing A Near-Maximum Independent Set in Linear Time by

The Reducing-Peeling FrameworkDefinition 3.1: (Inexact Reduction) Given a graph !, we remove/peel the vertex with the highest degree from !.

v Phase 1: ReducingØ While a reduction rule can be applied on a vertex * then

Apply the exact reduction rule on *

v Phase 2: PeelingØ Apply the inexact reduction rule to temporarily remove a high-

degree vertex

v Repeat the above two phases until there is no edge in the graph

v Post-process: Iteratively add a temporarily removed vertex to the solution if the independence requirement is not violated

Page 15: Computing A Near-Maximum Independent Set in Linear Time by

Outline

q Introduction

q Existing Works

q Our Reducing-Peeling Framework

q Our Approaches

q Experimental Studies

q Conclusion

Page 16: Computing A Near-Maximum Independent Set in Linear Time by

Overview of Our Approaches

v Compute large independent set for large graphs in a time-efficient and space-effective manner§ Subquadratic (or even linear) time.§ 2m + O(n) space: m is the number of undirected edges.

§ A graph is stored in 2m + n + O(1) space by the adjacency array (aka, Compressed Sparse Row) graph representation

§ A graph with one billion edges takes slightly more than 8GBmemory

Page 17: Computing A Near-Maximum Independent Set in Linear Time by

An Efficient Baseline Algorithmv BDOne

Step 1:While $T. ≠ ∅ or $WX ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily

removed vertices

=>? YB = B

Page 18: Computing A Near-Maximum Independent Set in Linear Time by

An Efficient Baseline Algorithmv BDOne =>? YB = B

YZ[\][^_^_>_[?_>\^=>?<>>

Step 1:While $T. ≠ ∅ or $WX ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily

removed vertices

Page 19: Computing A Near-Maximum Independent Set in Linear Time by

An Efficient Baseline Algorithmv BDOne =>? YB = B

YZ[\][^_^_>_[?_>\^=>?<>>

Step 1:While $T. ≠ ∅ or $WX ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily

removed vertices

Page 20: Computing A Near-Maximum Independent Set in Linear Time by

An Efficient Baseline Algorithmv BDOne =>? YB = B

YZ[\][^_^_>_[?_>\^=>?<>>

Step 1:While $T. ≠ ∅ or $WX ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily

removed vertices

Page 21: Computing A Near-Maximum Independent Set in Linear Time by

An Efficient Baseline Algorithmv BDOne =>? YB = B

YZ[\][^__[?_>\^=>?<>>

Complexity AnalysisTime: 5 OSpace: 2O + 5 -

Step 1:While $T. ≠ ∅ or $WX ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily

removed vertices

Page 22: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Baseline Algorithmv BDTwo

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwo-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily removed

vertices

=>? YB = B

Page 23: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Baseline Algorithmv BDTwo =>? YB = B

=>? Ya = b

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwo-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily removed

vertices

Page 24: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Baseline Algorithmv BDTwo =>? YB = B

=>? Ya = b

Complexity AnalysisTime: 5 -×O andg O + -hRi-Space: 6O + 5 -

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwo-Reduction

ElseInexact-Reduction

Step 2:Recover temporarily removed

vertices

Page 25: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Lemma 4.1: (Degree-two Path Reductions) Consider a graph! = $, & with minimum degree two. For a maximal degree-two path k = +., +X, … , +m , let + ∉ k and F ∉ k be the unique vertices connected to +. and +m, respectively.

Case 1: + = F

⟹ D ! = D !\ +

Page 26: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Lemma 4.1: (Degree-two Path Reductions) Consider a graph! = $, & with minimum degree two. For a maximal degree-two path k = +., +X, … , +m , let + ∉ k and F ∉ k be the unique vertices connected to +. and +m, respectively.

Case 2: k is odd and +, F ∈ &

⟹ D ! = D !\ +, F

Page 27: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Lemma 4.1: (Degree-two Path Reductions) Consider a graph! = $, & with minimum degree two. For a maximal degree-two path k = +., +X, … , +m , let + ∉ k and F ∉ k be the unique vertices connected to +. and +m, respectively.

Case 3: k is odd and +, F ∉ &

⟹ D !

= D !\ +X, … , +m ⋃ +., F +k − 1

2

Page 28: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Lemma 4.1: (Degree-two Path Reductions) Consider a graph! = $, & with minimum degree two. For a maximal degree-two path k = +., +X, … , +m , let + ∉ k and F ∉ k be the unique vertices connected to +. and +m, respectively.

Case 4: k is even and +, F ∈ &

⟹ D !

= D !\ +., … , +m +k

2

Page 29: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Lemma 4.1: (Degree-two Path Reductions) Consider a graph! = $, & with minimum degree two. For a maximal degree-two path k = +., +X, … , +m , let + ∉ k and F ∉ k be the unique vertices connected to +. and +m, respectively.

Case 5: k is even and +, F ∉ &

⟹ D != D !\ +., … , +m ⋃ +, F

+k

2

Page 30: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwoPath-Reduction

ElseInexact-Reduction

Step 2: Recover temporarily removed vertices

Page 31: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwoPath-Reduction

ElseInexact-Reduction

Step 2: Recover temporarily removed vertices

Page 32: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwoPath-Reduction

ElseInexact-Reduction

Step 2: Recover temporarily removed vertices

Page 33: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwoPath-Reduction

ElseInexact-Reduction

Step 2: Recover temporarily removed vertices

Page 34: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwoPath-Reduction

ElseInexact-Reduction

Step 2: Recover temporarily removed vertices

Page 35: Computing A Near-Maximum Independent Set in Linear Time by

An Effective Linear-Time Algorithmv LinearTime

Step 1:While $T. ≠ ∅ or $TX ≠ ∅ or $W` ≠ ∅

If $T. ≠ ∅ thenDegreeOne-Reduction

Else if $TX ≠ ∅ thenDegreeTwoPath-Reduction

ElseInexact-Reduction

Step 2: Recover temporarily removed vertices

Complexity AnalysisTime: 5 OSpace: 2O + 5 -

Page 36: Computing A Near-Maximum Independent Set in Linear Time by

A Near-Linear-Time Algorithmv NearLinear

Lemma 5.1: (Dominance Reduction) [F. V. Fomin et al. JACM’09]Vertex v dominates vertex u if +, * ∈ & and all neighbors of v other than u are also connected to u (i.e., s + \ * ⊆ s * ). If v dominates u, then there exists a maximum independent set of G the excludes u; thus, we can remove u from G, and D ! = D !\ * .

Lemma 5.2: Vertex v dominates its neighbor u iff ∆ +, * = K + − 1, where ∆ +, * is the number of triangles containing u and v

Page 37: Computing A Near-Maximum Independent Set in Linear Time by

A Near-Linear-Time Algorithmv NearLinearStep1: Maintain the set u of candidate dominated vertices, and also maintain ∆ +, *for every edge +, *

Step 2:While $TX ≠ ∅ or u ≠ ∅ or $W` ≠ ∅

If $TX ≠ ∅ thenDegreeTwoPath-Reduction

Else if u ≠ ∅thendominance reduction

ElseInexact-Reduction

Step 2: Recover temporarily removed vertices

Complexity AnalysisTime: 5 O×Δ(Δ is the maximum degree in !)

Space: 4O + 5 - in worst case and 2O + 5 - in practice

Page 38: Computing A Near-Maximum Independent Set in Linear Time by

Extensions of Our Algorithms

v Accelerate ARW

v Compute Upper Bound of D !

Page 39: Computing A Near-Maximum Independent Set in Linear Time by

Outline

q Introduction

q Existing Works

q Our Reducing-Peeling Framework

q Our Approaches

q Experimental Studies

q Conclusion

Page 40: Computing A Near-Maximum Independent Set in Linear Time by

Experimental Settingsv Datasets

v EnvironmentsØ All algorithms are

implemented in C++Ø All experiments are

conducted on a machine with an Intel(R) Xeon(R) 3.4GHz CPU and 16GB main memory running Linux

Page 41: Computing A Near-Maximum Independent Set in Linear Time by

Accuracy

v Gap to the maximum independent set size

Page 42: Computing A Near-Maximum Independent Set in Linear Time by

Processing Time

(a) Compared with ExistingTechniques

(b) Compare Our Techniques

Page 43: Computing A Near-Maximum Independent Set in Linear Time by

Memory Usage(a) Compared

with ExistingTechniques

(b) Compare Our Techniques

Page 44: Computing A Near-Maximum Independent Set in Linear Time by

Boost ARW

ARW-NL, ARW-LT: ARW boosted by NearLinear and LinearTime, respectively.

Convergence plots of local search algorithms

Page 45: Computing A Near-Maximum Independent Set in Linear Time by

Outline

q Introduction

q Existing Works

q Our Reducing-Peeling Framework

q Our Approaches

q Experimental Studies

q Conclusion

Page 46: Computing A Near-Maximum Independent Set in Linear Time by

Conclusion

p A new Reducing-Peeling framework

p Time-efficient and space-effective techniques to implement the reducing-peeling framework

p Find large independent sets efficiently for large real-world graphs with billions of edges

Page 47: Computing A Near-Maximum Independent Set in Linear Time by