Eccentricity Heuristics through Sublinear Analysis Lenses

EccentricityHeuristicsthroughSublinearAnalysisLenses

TalWagnerMIT

GraphEccentricities

• Let𝐺(𝑉, 𝐸) byagraph

• Shortest-pathmetric:Δ: 𝑉×𝑉 → ℝ

• Eccentricities:𝒆 𝑣 = max

2∈4Δ(𝑣, 𝑢)

• Max𝒆 𝑣 =diameter;Min𝒆 𝑣 =radius

𝒆 𝑣 = 3

Max𝒆 𝑣 =diameter;Min𝒆 𝑣 =radius90thpercentile𝒆 𝑣 =“effectivediameter”(excludesoutliers)

Applications:Networktopologyanalysis(computers,social,biological),hardwareverification,sparselinearsystemsolving,…

EccentricityDistributionofLargeGraphs

Eccentricity

Kangetal.TKDD2011

Linkedin (Aug2006)

Outsiders

Leskovec etal.WWW2008

EccentricityDistributionofLargeGraphs

Eccentricity Eccentricity

Eccentricity

Kangetal.TKDD2011 Iwabuchi etal.CLUSTER2018

U.S.Patent(1985) Linkedin (Aug2006)

YahooWeb (GCConly)

Wikipedia Twitter

com-Friendster Webgraph

(GCCsonly)

ComputingAllEccentricities

• Exactcomputation:𝑂 𝑚𝑛 (e.g.BFSfromeachnode)• Approximatealgorithms• Theoretical:

• Empirical: [Kangetal.’11],[Boldi etal.’11],[Takes&Kosters ‘13],[…], [Shun’15]

4-approx. 𝑂(𝑚) time [OneBFS]

(2 + 𝛿)-approx. 𝑂>(𝑚 𝛿⁄ ) time [Backurs-Roditty-Segal-V.Williams-Wein’18]

5 3⁄ -approx. 𝑂>(𝑚A.C) time [Chechik-Larkin-Roditty-Schoenebeck-Tarjan-V.Williams’14]

TightunderSETH

Parallel𝑘-BFSHeuristics[Shun’15]

• 𝑆A ← 𝑘 randomnodes

• ComputeBFSfromeach𝑢 ∈ 𝑆A

• 𝒆G𝟏 𝑣 ←maxdistancefrom𝑆A

• 𝑆I ← 𝑘 furthestnodesfrom𝑆A

• ComputeBFSfromeach𝑢 ∈ 𝑆I

• 𝒆G𝟐 𝑣 ←maxdistancefrom𝑆A ∪ 𝑆I

𝑘-BFS1:

𝑘-BFS2:

Parallel𝑘-BFSHeuristics[Shun’15]

• 𝑆A ← 𝑘 randomnodes

• ComputeBFSfromeach𝑢 ∈ 𝑆A

• 𝒆G𝟏 𝑣 ←maxdistancefrom𝑆A

• 𝑆I ← 𝑘 furthestnodesfrom𝑆A

• ComputeBFSfromeach𝑢 ∈ 𝑆I

• 𝒆G𝟐 𝑣 ←maxdistancefrom𝑆A ∪ 𝑆I

𝑘-BFS1:

𝑘-BFS2:

EmpiricalResultsin[Shun’15]

• 𝑘-BFS1 performsreasonablewell• E.g.,medianaveragerelativeerror7.55%

• 𝑘-BFS2 beatsallothermethodsby

ordersofmagnitude

• Oftencomputesalleccentricitiesexactly

Why?

Reagan’sPrinciple

“They'rethesortofpeoplewhoseesomethingworksinpracticeandwonderifitwouldworkintheory.”

ThisWork

• Analyze heuristicsinordertoexplain andimprove• Willgetprovable variantswithbetterempiricalperformance• Needtogobeyondworst-case(duetoSETH-hardness)

• 𝑘-BFS2: ConnectiontoStreamingSetCover• [Demaine,Indyk,Mahabadi,Vakilian ’14]

• 𝑘-BFS1: ConnectiontoDiameterPropertyTesting• [Parnas &Ron’02]

Empiricalvalidationoftheory-basedalgorithms

𝑘-BFS2 byStreamingSetCover

SetCoverFormulation

• SetCover:Givenelements𝑉 andsubsets𝒮 ⊂ 24 ,findsmallestcover𝐶 ⊂ 𝒮 of𝑉.

• EccentricitiesasSetCover:• Nodesareelements• Nodesaresets:𝒮 = 𝐴P: 𝑣 ∈ 𝑉

𝑣 𝐴P

𝐴P = 𝑢 ∈ 𝑉: 𝒆 𝑢 = Δ 𝑣, 𝑢

SetCoverFormulation



𝑣𝐴P

𝐴P = 𝑢 ∈ 𝑉: 𝒆 𝑢 = Δ 𝑣, 𝑢

SetCoverFormulation



𝐴P = 𝑢 ∈ 𝑉: 𝒆 𝑢 = Δ 𝑣, 𝑢

𝑣

𝐴P = ∅

SetCoverFormulation



• Covercomputesalleccentricities

• Optimalcover=“eccentriccover”,𝜿𝐴P = 𝑢 ∈ 𝑉: 𝒆 𝑢 = Δ 𝑣, 𝑢

ComputationalConstraints

• Computingaset𝐴P isprohibitive• 𝑂(𝑚𝑛) work

• Computingwhichsetscover𝑣 isexpensive• SingleBFS,𝑂(𝑚) work

• KnownSetCoveralgorithms?Yes

𝑣𝐴P

𝑣 𝐴PS𝐴PT𝐴PU

StreamingSetCover[Demaine-Indyk-Mahabadi-Vakilian’14]

• 𝑆A ← 𝑘 randomelements

• 𝐶 ← Cover forsample(e.g.greedy)

ElementSamplingLemma:

Ifglobaloptimumissmall,𝐶 covers

almostallelements.

𝑘-BFS2 vs.DIMV

𝑘-BFS2 Streaming SetCover[DIMV’14]

𝑆 ← Random sample ⟺ 𝑆 ← Random sample

Compute BFSfromeach𝑣 ∈ 𝑆 ⟺ Compute coveringsetsforeach𝑣 ∈ 𝑆

𝐶 ← 𝑘 nodeswith maxΔ(𝑣, 𝑆) ≉ 𝐶 ← Greedycoverfor 𝑆

𝑘-BFSSC𝑘-BFS2 Streaming SetCover[DIMV’14]

𝑆 ← Random sample ⟺ 𝑆 ← Random sample

Compute BFSfromeach𝑣 ∈ 𝑆 ⟺ Compute coveringsetsforeach𝑣 ∈ 𝑆

𝐶 ← 𝑘 nodeswith maxΔ(𝑣, 𝑆) ≉ 𝐶 ← Greedycoverfor 𝑆

𝐶 ← Parallelgreedycoverfor𝑆[Blelloch-Peng-Tangwongsan’11]

[Blelloch-Simhadri-Tangwongsan’12]

𝑘-BFSSCTheorem:Suppose𝐺(𝑉, 𝐸) haseccentriccoversize𝜿.

𝑘-BFSSC with𝑘 = 𝑂> 𝜿 ⋅ 𝜖ZA log 𝑛 satisfies:

• Expectedwork:𝑂(𝑘𝑚),expecteddepth:𝑂>(diam 𝐺 )

• Computesexacteccentricitiesofallbutan𝜖-fractionofnodesw.h.p.

???

EccentricCover:Warm-Up

• Path,star,clique:𝜿 = 2

• Evencycle,hypercube:𝜿 = 𝑛

• Oddcycle:𝜿 = AI𝑛 + 1

EccentricCoverintheWild

• 8real-worldgraphsin[Shun’15]

• 1M-4M nodeseach

• Upperboundsoneccentriccoversize:

• 2graphs:𝜿 ≤ 𝟏𝟐𝟖

• 5graphs:𝜿 ≲ 𝟏, 𝟎𝟎𝟎

• 1graph:𝜿 ≲ 𝟏𝟎, 𝟎𝟎𝟎

Real-worldgraphshavesmalleccentriccovers

Experiments

𝑘-BFS2 vs.𝑘-BFSSC(Real-worldgraphsfromStanford

NetworkAnalysisProject)

BFScount BFScount

averagerelativeerror

Graph1:𝑛 = 36,646𝑚 = 88,303𝜿 ≤ 1,024

Exactratio at𝑘 = 16:𝑘-BFS2:80%𝑘-BFSSC:99%

Graph2:𝑛 = 11,174𝑚 = 23,409𝜿 ≤ 32

Exactratio at𝑘 = 16:𝑘-BFS2:97%𝑘-BFSSC:100%

𝑘-BFS1 byPropertyTesting

PropertyTestingApproximation

• Usualapproximation:𝒆G 𝑣 iscloseto𝒆(𝑣)

• Propertytestingapproximation:𝒆G 𝑣 isexactonsome𝑮k closeto𝑮• Graphsare𝝐-closeifupto𝝐 ⋅ 𝒎 edgescanbeadded/removedtoget𝑮k from𝑮

• Nosparsity/densityassumption(“GeneralGraphModel”)

• Notation:𝒆G 𝑣 ≤ 𝒆 𝑣 ≼𝝐 𝒆G 𝑣

𝒆G 𝑣 = 1𝒆 𝑣 = 2

𝑮k is𝟒𝟗-closeto𝑮

𝑘-BFS1 vs.DiameterTesting

𝑘-BFS1 with𝑘 = 𝑂 𝜖ZA log 𝑛 satisfies𝒆G 𝑣 ≤ 𝒆 𝑣 ≼𝝐 𝒆G 𝑣 forall𝑣.• Work:𝑂>(𝜖ZA𝑚),depth:𝑂>(diam 𝐺 )• Algorithm:StartBFSat𝑘 randomnodes

Theorem [Parnas &Ron]:Givenagraph𝐺,computeadiameterestimate𝑫k suchthat𝑫k ≤ diam 𝐺 ≼𝝐 2𝑫k + 2.• Time:𝑝𝑜𝑙𝑦 𝜖ZA

• Algorithm:Starttruncated BFSat𝑘 randomnodes

Butthisislineartime

EccentricityTesting

Aux.Theorem:Given𝐺 and𝑣,compute𝒆G 𝑣 s.t. 𝒆G 𝑣 ≤ 𝒆 𝑣 ≼𝝐 𝒆G 𝑣intime𝑝𝑜𝑙𝑦 𝜖ZA .• Corollary– Diametertesting:𝑫k ≤ diam 𝐺 ≼𝝐 2𝑫k (shavedoff+2)

• Corollary– Radiustesting:𝑹k ≤ radius 𝐺 ≼𝝐 𝑹k + 𝟏

Impliesvariantof𝑘-BFS1:𝑘-BFSTST

𝑘-BFSTSTTheorem:𝑘-BFSTST satisfies𝒆G 𝑣 ≤ 𝒆 𝑣 ≼𝝐 𝒆G 𝑣 forall𝑣.• Work:𝑂(𝜖ZI𝑛),depth:𝑂>(𝜖ZA log 𝑛)

Sameguaranteeas𝑘-BFS1butinsublinear workand

depth,independentofgraph.

• Algorithm:truncatedBFS• 𝑆A ← 𝑘 randomnodes

• Fromeach𝑢 ∈ 𝑆A,startaBFSuptofirstlevelℓ2where𝑂>(𝜖ZA) nodesareseen.Allunseennodes

areconsideredat“distance”ℓ2 + 1 from𝑢.

• 𝒆G𝐓𝐒𝐓 𝑣 ←max“distance”from𝑆A

Experiments

𝑘-BFS1 vs.𝑘-BFSTST (withdifferentBFScutoffs)

Edgetraversals Edgetraversals

averagerelativeerror

Graph1:𝑛 = 36,646𝑚 = 88,303

Graph2:𝑛 = 11,174𝑚 = 23,409

Conclusion

• Explainandimprovehigh-performingheuristics

• Practicalalgorithm->“fit”analysis->practicalimprovementwithguarantees

• Inter-connectionsofparallel,streaming,sketching,andproperty

testingalgorithms

• All“pointtosamedirection”Thankyou

Documents

Eccentricity Heuristics through Sublinear Analysis Lenses