Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley


Lower Bounds for Property Testing

Luca Trevisan

U.C. Berkeley

Sub-linear Time Algorithms

• This talk:
– Algorithms that run in less than linear time (cannot read the entire input).
– No preprocessing. (Unstructured data)
– Must be probabilistic and approximate.

• For optimization problems:
– Compute a numerical approximation of the optimum cost (and an implicit representation of an approximate solution?)

• For decision problems:
– What is approximation for decision problems?

(Graph) Property Testing

Testing a property P with accuracy ε in adjacency matrix representation:

• Given a graph G that has property P, accept with probability > 3/4

• Given a graph G that is ε-far from property P, accept with probability < 1/4

ε-far = must change an ε-fraction of the adjacency matrix to get property P (add/remove > εn^2 edges)

Example [GGR,AK]

Testing bipartiteness of a given graph G

• Pick (1/ε)polylog(1/ε) vertices, and check if they induce a bipartite graph; if so accept, otherwise reject

• If G is bipartite then the algorithm accepts with probability 1

• If G is ε-far from bipartite, then whp the algorithm discovers an odd cycle (non-trivial to prove)

• Running time: O((1/ε)polylog(1/ε))

Paleontologist’s approach

Lower Bounds [BT]

• Alon-Krivelevich’s algorithm
– has one-sided error, is non-adaptive and has running time (1/ε^2)polylog(1/ε)

• Lower Bounds:
– Ω(1/ε^2) for non-adaptive algorithms
– Ω(1/ε^1.5) for adaptive algorithms
– Both results hold even for two-sided error

Two Distributions

• Gfar: every edge exists with probability ε
– whp it is ε/3-far from bipartite

• Gbip: pick a random partition, then every edge that crosses the partition exists with probability 2ε

• Indistinguishable by non-adaptive algorithms making o(1/ε^2) queries

• Indistinguishable by adaptive algorithms making o(1/ε^1.5) queries
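The two distributions can be sampled as in the sketch below. It assumes that "a random partition" means each vertex picks a side independently and uniformly, and that the edge probability `eps` is at most 1/2; the function names are illustrative, not from the talk.

```python
import random

def sample_gfar(n, eps, rng):
    """Every pair of vertices is joined independently with probability eps."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < eps:
                adj[i][j] = adj[j][i] = 1
    return adj

def sample_gbip(n, eps, rng):
    """Random 2-partition; only crossing pairs are joined, with probability 2*eps."""
    side = [rng.randrange(2) for _ in range(n)]  # assumed: uniform, independent sides
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if side[i] != side[j] and rng.random() < 2 * eps:
                adj[i][j] = adj[j][i] = 1
    return adj, side
```

Under this assumption a fixed pair crosses the partition with probability 1/2, so each single entry of the matrix is 1 with probability `eps` in both distributions; a tester has to look at correlations between several entries to tell them apart.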

Bounded Degree Graphs

Testing a property P with accuracy ε in adjacency lists representation:

• Given a graph G that has property P, accept with probability > 3/4

• Given a graph G that is ε-far from property P, accept with probability < 1/4

ε-far = must change an ε-fraction of the adjacency lists entries to get property P (add/remove > εdn edges)

Bipartiteness [GR]

Testing bipartiteness

• Repeat polylog n times:
– Start at a random vertex, and pick sqrt(n) random walks of length polylog n; if two of them combine to form an odd cycle reject, otherwise accept

• Analysis:
– in a graph where you need to remove a constant fraction of the edges to make it bipartite, the algorithm finds an odd cycle

Matching Lower Bound [GR]

• Define two distributions of graphs:
– Gfar: a random Hamiltonian circuit, plus a random matching (whp 1/100-far from bipartite)
– Gbip: a random Hamiltonian circuit, plus a random matching conditioned on making the graph bipartite

• Gfar and Gbip are indistinguishable by algorithms of query complexity o(sqrt(n)).

Sublinear Time Approximation

Problems restricted to dense instances:

• Max CUT and other graph problems can be approximated within (1+ε) in graphs with at least εn^2 edges in time 2^poly(1/ε) [GGR]

• Max 3SAT can be approximated within (1+ε) in instances with at least εn^3 clauses in time 2^poly(1/ε), and similar results hold for other satisfiability problems [AFKK]

Sub-linear Time Approximation

Problems on bounded-degree instances

• Minimum spanning tree
– given a connected weighted graph of degree d with weights in range {1,…,w}, can approximate MST weight within (1+ε) in time about O(dw/ε^2) [Chazelle, Rubinfeld, T]
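One way to see why a sublinear algorithm is plausible here: for a connected graph on n vertices with integer weights in {1,…,w}, the MST weight equals n - w + c_1 + … + c_(w-1), where c_i is the number of connected components of the subgraph that keeps only edges of weight at most i. The sketch below checks this identity with exact component counts. It is not the sublinear algorithm itself, which would have to estimate each c_i by sampling; the function names are illustrative.

```python
def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def count_components(n, edges, max_weight):
    """Components of the graph restricted to edges of weight <= max_weight."""
    parent = list(range(n))
    components = n
    for u, v, wt in edges:
        if wt <= max_weight:
            ru, rv = _find(parent, u), _find(parent, v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
    return components

def mst_weight_from_components(n, edges, w):
    """MST weight via the identity n - w + sum of c_i for i = 1..w-1."""
    return n - w + sum(count_components(n, edges, i) for i in range(1, w))

def mst_weight_kruskal(n, edges):
    """Reference value computed with Kruskal's algorithm."""
    parent = list(range(n))
    total = 0
    for u, v, wt in sorted(edges, key=lambda e: e[2]):
        ru, rv = _find(parent, u), _find(parent, v)
        if ru != rv:
            parent[ru] = rv
            total += wt
    return total
```

The identity follows from Kruskal's algorithm: the tree contains exactly c_i - 1 edges of weight greater than i, and summing these counts over i = 0,…,w-1 (with c_0 = n) gives the total weight.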

General Goals

• When looking for polynomial-time algorithms:
– Several algorithmic techniques of general applicability
– A general technique to “prove” impossibility (NP-completeness)

• For sublinear-time algorithms:
– General algorithmic techniques?
– Impossibility results?

Dense Graphs

Some general algorithmic results

• All problems with a certain logical representation are testable in time dependent only on ε [AFKS]

• All regular languages are testable in time dependent only on ε [AFNS]

• Only one one-sided error algorithm [GT] (pick a random subgraph and check it is consistent with the property)
– Adaptivity does not help
– “Only one algorithm” result also for 2-sided error.

Few lower bounds

Bounded-Degree Graphs

Fewer and less general algorithms.

Some results are different from the dense case

• Adaptivity helps
– No property testable with o(sqrt(n)) non-adaptive queries. Several problems testable with O(1) adaptive queries.

• 2-sided better than 1-sided for natural monotone properties
– Property “being a forest” has no o(sqrt(n)) one-sided algorithm, but has an O(1) two-sided algorithm

Few lower bounds

Testing 3-Colorability

• Easy in adjacency matrix representation

• NP-hard in adjacency list representation

• Only for small enough ε
– Can find a 3-coloring good for 80% of the edges in a 3-colorable graph using SDP
– NP-hard to find a 3-coloring good for a 98% (?) fraction of the edges

• Implies non-tight, and conditional, lower bound for query complexity

Other problems

• The query complexity of the following problems is equivalent to the query complexity of testing 3-colorability
– Testing satisfiability of a 3SAT instance (every variable occurs in O(1) clauses, “adjacency list” representation)
– Approximating max cut, vertex cover, independent set, . . ., in bounded-degree graphs
– Approximating Max SAT, Max 2SAT, . . .

• Lower bound of sqrt(n) for all problems implied by the [GR] lower bound for testing bipartiteness

Some Results from [BOT]

• For one-sided error algorithms:
– Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are (1/3 – ε)-far
– Lower bound applies to testing problems that are solvable in polynomial time

• For two-sided error algorithms:
– For some ε, Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are ε-far.

Additional Results

• Unconditionally, algorithms running in time o(n) cannot:
– Approximate Max 3SAT better than 7/8
– Approximate Max Cut in bounded-degree graphs better than 16/17
– . . .

• Håstad ’97 proved the above problems are NP-hard

The 3-Coloring Lower Bound

• Consider first one-sided error algorithms

• It’s enough to find a graph G that is (1/3 – ε)-far from 3-colorable, but every subgraph of size < δn is 3-colorable
– (for every ε there is a δ such that . . .)

• Then an algorithm of query complexity < δn either accepts G (which is wrong) or rejects some 3-colorable graph (which means the algorithm does not have one-sided error)

The Graph

• Pick a graph of degree O(1/ε^2) at random (pick that many random matchings)

• Then it is (1/3 – ε)-far whp

• But, for some δ, whp, every subgraph induced by k < δn vertices contains < 1.5k edges

• In a minimal non-3-colorable graph, every vertex has degree at least 3

• Every subgraph induced by < δn vertices is 3-colorable

[Erdős]
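The step from "fewer than 1.5k edges on every k vertices" to 3-colorability can be made constructive. A graph with k vertices and fewer than 1.5k edges has average degree below 3, hence a vertex of degree at most 2; remove it, color the rest, and put it back with a color its (at most two) neighbours do not use. The sketch below implements this peeling argument; the function name and the dict-of-sets input format are illustrative assumptions.

```python
def three_color_by_peeling(adj):
    """adj maps each vertex to the set of its neighbours.

    Returns a proper colouring with colours {0, 1, 2}, or None if peeling
    gets stuck on a subgraph whose minimum degree is at least 3.
    """
    remaining = {v: set(ns) for v, ns in adj.items()}
    order = []
    while remaining:
        # Find a vertex with at most two neighbours among the remaining ones.
        v = next((u for u, ns in remaining.items() if len(ns) <= 2), None)
        if v is None:
            return None
        order.append(v)
        for u in remaining.pop(v):
            remaining[u].discard(v)
    color = {}
    # Re-insert in reverse order: each vertex then sees at most two coloured
    # neighbours, so one of the three colours is always free.
    for v in reversed(order):
        used = {color[u] for u in adj[v] if u in color}
        color[v] = min({0, 1, 2} - used)
    return color
```

A result of `None` only means the sparsity hypothesis fails for some subgraph; it does not by itself show that the graph is not 3-colorable.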

Explicit Construction

Can the previous construction be derandomized?

• For constants d, ε, δ, and for every sufficiently large n, we can explicitly construct a graph
– on n vertices, with max degree d,
– ε-far from 3-colorable,
– every subset of δn vertices induces a 3-colorable subgraph.

Explicit Construction

• We construct a 3SAT formula such that for constants k, ε', δ'
– Every variable occurs k times
– No assignment satisfies more than a 1-ε' fraction of clauses
– Every δ' fraction of clauses is satisfiable

• Then we use a (slightly new) reduction from 3SAT to 3-Coloring

The Formula

• Fix a degree-d expander graph G=(V,E) such that for every cut (S,V-S) at least min{|S|,|V-S|} edges cross the cut (d=14 is enough)

• Have two variables x_uv and x_vu for each edge (u,v)

• For every vertex v have the (3SAT equivalent of) the constraint
– Σ_u x_uv = 1 + Σ_w x_vw
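A short derivation shows why the vertex constraints cannot all hold at once. It is a sketch under the assumption that the variables take values in {0,1}, that the sums are over the integers, and that they range over the neighbours of v; the talk does not spell these details out.

```latex
% Assumption: x_{uv} \in \{0,1\}, integer sums over the neighbours of v.
% Sum the constraint of every vertex v:
\begin{align*}
\sum_{v \in V} \sum_{u : (u,v) \in E} x_{uv}
  &= \sum_{v \in V} \Bigl( 1 + \sum_{w : (v,w) \in E} x_{vw} \Bigr) \\
  &= |V| + \sum_{v \in V} \sum_{w : (v,w) \in E} x_{vw}.
\end{align*}
% Each double sum adds up every variable x_{ab} exactly once, so the two
% double sums are equal, which would force 0 = |V|.
```

Read as a flow, each constraint asks for one more unit entering v than leaving it, which no assignment can achieve at every vertex simultaneously.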

Structure of the Analysis

• Impossible to satisfy more than a fraction 1/(d+1) of the constraints

• Can always satisfy half of the constraints
– define an auxiliary network
– show that the auxiliary network has no small cut because of expansion
– then there is a large flow
– use the large flow to find an assignment for a subset of the constraints

Flow Argument

• Want to satisfy constraints corresponding to vertices in C, with |C| < |V|/2

[Figure: flow network with nodes s, C, and t, where t stands for V-C]

Construct a flow network with a new source s, a sink t obtained by collapsing V-C, and the vertices in C

Flow Argument

[Figure: flow network with s, A, C-A, and t; edge labels: |A| edges, |C-A| edges]

• Every cut has size at least |C|

• There is a 0/1 flow of value at least |C|

• Interpreted as an assignment, satisfies all constraints in C

Two-Sided Error Algorithms

Need to define two distributions of graphs Gcol and Gfar such that:

• Graphs in Gcol are (almost) always 3-colorable

• Graphs in Gfar are (almost) always far from 3-colorable

• To an algorithm of bounded query complexity, Gcol and Gfar look (almost) the same

Main Step

• Define two distributions Dsat and Dfar of instances of E3LIN-2 (systems over GF(2) with 3 variables per equation)
– Systems in Dsat are always satisfiable
– Systems in Dfar are (almost) always (1/2-ε)-far from satisfiable
– To an algorithm of bounded query complexity, Dsat and Dfar look the same

• We get Gcol and Gfar using a reduction from approximate E3LIN-2 to approximate 3-coloring


E3LIN-2

X1 + X3 + X10 = 0 mod 2

X2 + X3 + X4 = 1 mod 2

X1 + X2 + X9 = 0 mod 2

. . .

Main Building Block

• We show that for every c there is δ such that there exists a left-hand side with
– n variables, cn equations, 3 variables per equation, every variable occurs in 3c equations
– every δn equations are linearly independent

• Pick the left-hand side at random
– repeat 3c times: pick at random a set of n/3 disjoint triples of variables

• Explicit construction?
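The random choice of the left-hand side can be sketched as below, assuming n is divisible by 3 and that each round partitions the variables by shuffling them and cutting the shuffled list into consecutive triples. The function name is illustrative.

```python
import random

def random_left_hand_side(n, c, rng):
    """Return c*n equations, each a sorted triple of variable indices.

    Each of the 3c rounds partitions the n variables into n/3 disjoint
    triples, so every variable occurs in exactly 3c equations.
    """
    assert n % 3 == 0, "sketch assumes n is a multiple of 3"
    equations = []
    for _ in range(3 * c):
        perm = list(range(n))
        rng.shuffle(perm)
        equations.extend(tuple(sorted(perm[i:i + 3])) for i in range(0, n, 3))
    return equations
```

For example, `random_left_hand_side(12, 2, random.Random(0))` returns 24 triples in which each of the 12 variables appears 6 times. The sketch does not check the linear-independence property, which the talk claims only with high probability.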

Distributions

• The left-hand side is always as before

• In Dsat, we pick a random assignment to the variables, and set the right-hand side consistently
– always satisfiable

• In Dfar, we pick the right-hand side uniformly at random
– With high probability, (1/2 – O(1/sqrt(c)))-far

Indistinguishability

• The two distributions differ only in the right-hand side

• In Dfar, uniformly distributed

• In Dsat, δn-wise independent
– Linear independence implies statistical independence

• Look the same to an algorithm that sees fewer than δn equations
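The claim that linear independence implies statistical independence can be checked by brute force on a tiny system. If the chosen left-hand sides are linearly independent over GF(2), then a uniformly random assignment produces a uniformly random right-hand side on those equations, exactly as in Dfar. The sketch below uses illustrative function names and enumerates all assignments, so it is only meant for very small n.

```python
from collections import Counter
from itertools import product

def gf2_rank(rows):
    """Rank over GF(2) of equations given as tuples of variable indices."""
    basis = {}  # highest set bit -> reduced bitmask with that leading bit
    rank = 0
    for row in rows:
        vec = 0
        for i in row:
            vec ^= 1 << i
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = vec
                rank += 1
                break
            vec ^= basis[top]
    return rank

def rhs_distribution(rows, n):
    """Count how often each right-hand side arises over all 2^n assignments."""
    counts = Counter()
    for x in product((0, 1), repeat=n):
        rhs = tuple(sum(x[i] for i in row) % 2 for row in rows)
        counts[rhs] += 1
    return counts
```

With the three independent equations `[(0, 1, 2), (1, 2, 3), (0, 3, 4)]` on 5 variables, all 8 right-hand sides occur equally often. With a repeated equation the right-hand sides are forced to agree, so only 2 of the 4 possible vectors occur.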

Conclusion of the Argument

• No algorithm of “query complexity” o(n) can distinguish satisfiable instances of E3LIN-2 from instances that are (1/2-ε)-far from satisfiable

• For some ε, no algorithm of query complexity o(n) can distinguish 3-colorable graphs from graphs that are ε-far from 3-colorable.

• No algorithm of query complexity o(n) can approximate Max 3SAT better than 7/8 . . .

Open Questions

• Show that distinguishing 3-colorable graphs from (1/3-ε)-far graphs requires query complexity Ω(n)
– we can only prove it for one-sided error

• Show that approximating Max SAT better than ¾ and Max CUT better than ½ requires query complexity Ω(n)
– we only know Ω(sqrt(n)) [implicit in GR]
– would “explain” why we need SDP

Some more open questions

• In adjacency matrix representation, most interesting problems are solvable in time that depends only on ε

• For some problems (e.g. testing triangle-freeness) the analysis uses Szemerédi’s regularity lemma, and the constant is hyper-exponential in 1/ε

• Lower bound (1/ε)^(log 1/ε), and only for one-sided error

• Alternative analysis / stronger lower bounds?