Testing Collections of Properties Reut Levi Dana Ron Ronitt Rubinfeld ICS 2011


Testing Collections of Properties

Reut Levi Dana Ron

Ronitt Rubinfeld

ICS 2011


Shopping distribution

What properties do your distributions have?


Testing closeness of two distributions:

Transactions in California vs. transactions in New York

Trend change?


Testing independence:

Shopping patterns: independent of zip code?


This work: Many distributions


One distribution:

D is an arbitrary black-box distribution over [n] that generates i.i.d. samples.

Sample complexity in terms of n? (can it be sublinear?)

[Diagram: D → samples → Test → Pass/Fail?]


Some answers…

Uniformity: $\Theta(n^{1/2})$ [Goldreich, Ron 00], [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], [Paninski 08]

Identity: $\Theta(n^{1/2})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01]

Closeness: $\Theta(n^{2/3})$ [Batu, Fortnow, Rubinfeld, Smith, White], [Valiant 08]

Independence: $O(n_1^{2/3} n_2^{1/3})$, $\Omega(n_1^{2/3} n_2^{1/3})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], this work

Entropy: $n^{1/\beta^2 + o(1)}$ [Batu, Dasgupta, Kumar, Rubinfeld 05], [Valiant 08]

Support size: $\Theta(n/\log n)$ [Raskhodnikova, Ron, Shpilka, Smith 09], [Valiant, Valiant 10]

Monotonicity on total order: $\Theta(n^{1/2})$ [Batu, Kumar, Rubinfeld 04]

Monotonicity on poset: $n^{1-o(1)}$ [Bhattacharyya, Fischer, Rubinfeld, Valiant 10]
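To give a feel for why on the order of $n^{1/2}$ samples can be enough for uniformity, here is a minimal sketch of a collision-counting test. It is an illustration of the general idea, not the exact algorithm of any cited paper. The function name, the threshold, and the sample sizes used in the example are assumptions chosen for illustration.

```python
import random
from collections import Counter


def collision_uniformity_test(sample_fn, n, eps, num_samples):
    """Hypothetical sketch: accept iff the empirical collision rate is small.

    sample_fn   -- black-box sampler returning one element of range(n)
    n           -- domain size
    eps         -- distance parameter (L1 distance from uniform)
    num_samples -- number of i.i.d. samples to draw (an assumption; the
                   analysis in the literature uses on the order of sqrt(n))
    """
    samples = [sample_fn() for _ in range(num_samples)]
    counts = Counter(samples)
    # Number of unordered sample pairs that landed on the same element.
    collisions = sum(c * (c - 1) // 2 for c in counts.values())
    pairs = num_samples * (num_samples - 1) // 2
    collision_rate = collisions / pairs
    # The uniform distribution has collision probability exactly 1/n; a
    # distribution eps-far from uniform in L1 has at least (1 + eps^2)/n.
    # The midpoint is used as the (illustrative) acceptance threshold.
    threshold = (1 + eps ** 2 / 2) / n
    return collision_rate <= threshold


rng = random.Random(0)
uniform_ok = collision_uniformity_test(lambda: rng.randrange(100), 100, 0.5, 2000)
half_support_ok = collision_uniformity_test(lambda: rng.randrange(50), 100, 0.5, 2000)
```

In this example the first sampler is uniform over 100 elements and the second is uniform over only half of them, so its collision probability is about twice as large.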


Collection of distributions:

Two models:

Sampling model: get $(i, x)$ for a random $i$ and $x \sim D_i$

Query model: get $(i, x)$ for a queried $i$ and $x \sim D_i$

Sample complexity in terms of n,m?

[Diagram: D1, D2, …, Dm → samples → Test → Pass/Fail?]

Further refinement: Known or unknown distribution on i’s?
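The difference between the two access models can be sketched as two methods on one oracle. This is a minimal illustration that stores the distributions explicitly, whereas a tester would only see the samples. The class name, method names, and the `weights` parameter (the distribution on the i's, uniform by default) are hypothetical.

```python
import random


class DistributionCollection:
    """Hypothetical stand-in for black-box access to (D_1, ..., D_m)."""

    def __init__(self, dists, weights=None, seed=None):
        # dists[i][x] is the probability of x under D_i (0-indexed here).
        self.dists = dists
        m = len(dists)
        # Distribution on the indices i; assumed uniform when not given.
        self.weights = weights if weights is not None else [1.0 / m] * m
        self.rng = random.Random(seed)

    def _draw_from(self, i):
        n = len(self.dists[i])
        return self.rng.choices(range(n), weights=self.dists[i])[0]

    def sample(self):
        """Sampling model: i is drawn at random, then x ~ D_i."""
        m = len(self.dists)
        i = self.rng.choices(range(m), weights=self.weights)[0]
        return i, self._draw_from(i)

    def query(self, i):
        """Query model: the tester picks i, then x ~ D_i."""
        return i, self._draw_from(i)


# Two point-mass distributions over a domain of size 2.
coll = DistributionCollection([[1.0, 0.0], [0.0, 1.0]], seed=0)
pair_from_query = coll.query(1)
pair_from_sample = coll.sample()
```

In the sampling model the tester cannot choose which distribution it sees, which is why the known-versus-unknown question about the distribution on the i's only arises there.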


Properties considered:

Equivalence: all distributions are equal

"Clusterability": distributions can be clustered into k clusters such that, within each cluster, all distributions are close


Equivalence vs. independence

Process of drawing pairs: draw $i \in [m]$, then $x \sim D_i$; output $(i, x)$

Easy fact: $i$ and $x$ are independent iff the $D_i$'s are all equal
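The easy fact can be checked in a few lines. The derivation below is a sketch under one assumption that is not stated on the slide: index $i$ is drawn with some weight $w_i > 0$ for every $i \in [m]$. The symbols $w_i$ and $\bar{D}$ are introduced here for illustration only.

```latex
% Assumption (not on the slide): i is drawn with weight w_i > 0 for all i in [m].
\begin{align*}
\Pr[(i,x)] &= w_i \, D_i(x), \qquad
\Pr[x] = \sum_{j=1}^{m} w_j \, D_j(x) =: \bar{D}(x). \\
\text{$i$ and $x$ independent}
  &\iff w_i \, D_i(x) = w_i \, \bar{D}(x) \quad \text{for all } i, x \\
  &\iff D_i = \bar{D} \quad \text{for all } i \\
  &\iff D_1 = D_2 = \dots = D_m .
\end{align*}
```

The last step uses that if all $D_i$ equal a common $D$, then the mixture $\bar{D}$ is also $D$.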


Results

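Def: $(D_1, \dots, D_m)$ has the Equivalence property if $D_i = D_{i'}$ for all $1 \le i, i' \le m$.

Regime    Lower bound                  Upper bound
n > m     $\Omega(n^{2/3} m^{1/3})$    $\tilde{O}(n^{2/3} m^{1/3})$ (unknown weights)
m > n     $\Omega(n^{1/2} m^{1/2})$    $\tilde{O}(n^{1/2} m^{1/2})$ (known weights)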

Also yields “tight” lower bound for independence testing
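To get a sense of scale, the leading terms of the bounds can be evaluated for concrete n and m. The helper below is a hypothetical illustration: it ignores constants, polylogarithmic factors, and any dependence on the distance parameter, and it treats n = m as belonging to the first regime, which is an arbitrary choice since both expressions coincide there.

```python
import math


def equivalence_leading_term(n, m):
    """Hypothetical helper: leading term of the equivalence-testing bound.

    n -- domain size, m -- number of distributions.
    Constants and polylog factors are ignored (an assumption).
    """
    if n >= m:
        return n ** (2 / 3) * m ** (1 / 3)
    return math.sqrt(n * m)


# Compare against n * m, the number of parameters needed to write down
# all m distributions explicitly.
for n, m in [(10**6, 10**3), (10**2, 10**4)]:
    bound = equivalence_leading_term(n, m)
    print(f"n={n}, m={m}: leading term ~ {bound:.3g}, n*m = {n * m}")
```

In both regimes the leading term is far below n·m, which is the sense in which the sample complexity is sublinear in the size of the collection.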


Clusterability

Can we cluster the distributions s.t. within each cluster the distributions are (very) close?

Sample complexity of the test is $O(k n^{2/3})$, for n = domain size, k = number of clusters

No dependence on the number of distributions

Closeness requirement is very stringent


Open Questions

• Clusterability in the sampling model, less stringent notion of close

• Other properties of collections?

• E.g., all distributions are shifts of each other?


Thank you