

Compression schemes, stable definable

families, and o-minimal structures

H. R. Johnson

Department of Mathematics & CS

John Jay College

CUNY

M. C. Laskowski∗

Department of Mathematics

University of Maryland

March 18, 2009

Abstract

We show that any family of sets uniformly definable in an o-

minimal structure has an extended compression scheme of size equal

to the number of parameters in the defining formula.

As a consequence, the combinatorial complexity (or density) of any
definable family in a structure with an o-minimal theory is bounded by
the number of parameters in the defining formula.
Extended compression schemes for uniformly definable families
corresponding to stable formulas are also shown to exist.

∗Partially supported by NSF grant DMS-0600217.

1 Introduction

Problems concerning the combinatorial complexity of set systems arise in

many mathematical disciplines. They appear in the context of range-space
searching [9] in computational geometry, PAC learning in computational learning

theory, and the study of types in mathematical logic. The basic situation of

interest is: Given a collection of subsets C from some universe X, analyze the

relationship of A ∈ C to the finite subsets of X. Warmuth and Littlestone

[12] noted a structural characteristic of some set systems C which allows this

question to be reduced to questions about subsets of X of some uniformly

bounded size. They referred to their discovery as a sample compression

scheme.

Compression schemes have many applications. It has been shown that

the existence of a compression scheme on a class of sets is sufficient to ensure

PAC learnability of the class [7, 12], and that the size of a compression scheme

can serve to replace the VC dimension of a class in PAC sample size bounds

[7]. Moreover, there are several kinds of learning machine which operate by

using compression sets to make predictions (e.g. [13, 14]).

Warmuth and Littlestone proposed the following definition. Start with a

(possibly infinite) set X of elements and a set C of subsets of X, which repre-

sent concepts. Warmuth and Littlestone say that C admits a d-dimensional

compression if, given any finite subset F of X, and any set A ∈ C, there is a


d-element subset S of F such that the set A ∩ F can be recovered from the

sets S ∩A and S \A. For example, if X is the line of real numbers then the

set C of open intervals has a 2-dimensional compression. Given any finite

set F of real numbers and any open interval I, membership in all of F ∩ I

can be deduced by knowing the leftmost and rightmost elements of the set

F ∩ I. Arguing similarly, if X denotes $\mathbb{R}^2$, the set of ordered pairs of real

numbers and C denotes the set of open axis-parallel rectangles, then C has a

4-dimensional compression.

Figure 1: Select the leftmost, rightmost, lowest and highest points included in the rectangle.
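The rectangle example can be sketched in code. The following is a minimal illustration, not taken from the original notes: the names `in_rect`, `compress` and `reconstruct` are hypothetical, points are assumed to be pairs of numbers, and a rectangle is assumed to be given by its four bounds.

```python
from typing import Iterable, Set, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_lo, x_hi, y_lo, y_hi), an open rectangle


def in_rect(p: Point, r: Rect) -> bool:
    """Membership of the point p in the open axis-parallel rectangle r."""
    x_lo, x_hi, y_lo, y_hi = r
    return x_lo < p[0] < x_hi and y_lo < p[1] < y_hi


def compress(F: Iterable[Point], r: Rect) -> Set[Point]:
    """Keep the leftmost, rightmost, lowest and highest points of F inside r."""
    inside = [p for p in F if in_rect(p, r)]
    if not inside:
        return set()  # nothing to remember when the trace is empty
    return {
        min(inside, key=lambda p: p[0]),
        max(inside, key=lambda p: p[0]),
        min(inside, key=lambda p: p[1]),
        max(inside, key=lambda p: p[1]),
    }


def reconstruct(F: Iterable[Point], S: Set[Point]) -> Set[Point]:
    """Recover the trace on F as the points of F in the bounding box of S."""
    if not S:
        return set()
    x_lo, x_hi = min(p[0] for p in S), max(p[0] for p in S)
    y_lo, y_hi = min(p[1] for p in S), max(p[1] for p in S)
    return {p for p in F if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi}


F = [(0, 0), (1, 1), (2, 3), (3, 2), (5, 5), (2, 2)]
r = (0.5, 4.0, 0.5, 4.0)
S = compress(F, r)
assert len(S) <= 4
assert reconstruct(F, S) == {p for p in F if in_rect(p, r)}
```

Here the stored set S lies entirely inside the rectangle, so S \ A is empty. The reconstruction works because the bounding box of the four extreme points is contained in the rectangle and contains every point of F that the rectangle contains.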

While compression schemes are useful, more is needed to achieve closure

under the standard set theoretic operations (i.e. union, complement, etc.).

To this end, Floyd and Warmuth [7] proposed the notion of an extended com-

pression scheme with b extra bits. They conceive of storing some uniformly

bounded amount of information, represented as a binary string, in addition

to S ∩ A and S \ A. From this, as in an ordinary compression scheme, one

must recover the trace of A on F . Ben-David and Litman’s notion of a size

d-array compression [4] is a modification of the same idea.

Here we investigate how these notions compare with existing notions from

model theory, which is a branch of mathematical logic. To enable this
correspondence, it is useful to represent a subset A of X by its characteristic
function $f_A : X \to \{0,1\}$, so a set of concepts C should be thought of as a
subset of ${}^X\{0,1\}$, the set of all characteristic functions with domain X.

For d any integer, $X^d$ denotes the set of (ordered) d-tuples from X and
$[X]^d$ (resp. $[X]^{\le d}$) denotes the set of d-element (resp. ≤ d-element) subsets of X.
For B ⊆ X, the notation $C|_B$ denotes the set of restrictions $\{f|_B : f \in C\}$
and

$$C|_{\mathrm{fin}} = \bigcup \{\, C|_B : B \text{ a finite subset of } X \text{ with } |B| \ge 2 \,\}$$

The requirement that |B| ≥ 2 is technical and is used in the verification of

Proposition 2.1. We write f ⊑ g if and only if f is a restriction of g, i.e.,

dom(f) ⊆ dom(g) and f(x) = g(x) for all x ∈ dom(f).

Definition Fix $C \subseteq {}^X\{0,1\}$. C is said to have an extended d-compression
if there is a compression function $\kappa : C|_{\mathrm{fin}} \to [X]^{\le d}$ and a finite set R of
reconstruction functions $\rho : [X]^{\le d} \to {}^X\{0,1\}$ such that for every $f \in C|_{\mathrm{fin}}$

1. κ(f) ⊆ dom(f)

2. f ⊑ ρ(κ(f)) for at least one ρ ∈ R.

We say that C has an extended d-sequence compression if there is a
compression function $\kappa : C|_{\mathrm{fin}} \to X^d$ and a finite set R of reconstruction
functions $\rho : X^d \to {}^X\{0,1\}$ such that for every $f \in C|_{\mathrm{fin}}$, range(κ(f)) ⊆
dom(f), and f ⊑ ρ(κ(f)) for at least one ρ ∈ R.
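For a finite universe, the two conditions of an extended d-compression can be checked by brute force. The sketch below is only an illustration under assumptions of our own: functions are stored as dictionaries, the helper names `restrictions` and `is_extended_compression` are hypothetical, and the toy concept class (indicators of integer intervals on a six-point universe) is chosen for convenience.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, Iterable, List

Func = Dict[Hashable, int]  # a {0,1}-valued function, stored as a dict on its domain


def restrictions(concepts: List[Func], X: List[Hashable]) -> Iterable[Func]:
    """Yield each restriction f|B for f in concepts and B a subset of X with |B| >= 2."""
    for n in range(2, len(X) + 1):
        for B in combinations(X, n):
            seen = set()
            for f in concepts:
                g = tuple((x, f[x]) for x in B)
                if g not in seen:
                    seen.add(g)
                    yield dict(g)


def is_extended_compression(concepts: List[Func], X: List[Hashable], d: int,
                            kappa: Callable, recons: List[Callable]) -> bool:
    """Check conditions 1 and 2 of the definition on every restriction."""
    for g in restrictions(concepts, X):
        S = set(kappa(g))
        if len(S) > d or not S <= set(g):  # condition 1: at most d points of dom(g)
            return False
        if not any(all(rho(S)[x] == g[x] for x in g) for rho in recons):  # condition 2
            return False
    return True


# Toy instance (an assumption for illustration): indicators of integer intervals [i, j).
X = list(range(6))
concepts = [{x: int(i <= x < j) for x in X} for i in range(7) for j in range(i, 7)]


def kappa(g: Func) -> set:
    """Keep the smallest and largest points of dom(g) labelled 1, if any."""
    ones = [x for x in g if g[x] == 1]
    return {min(ones), max(ones)} if ones else set()


def rho(S: set) -> Func:
    """Label a point 1 exactly when it lies between the extremes of S."""
    return {x: int(bool(S) and min(S) <= x <= max(S)) for x in X}


assert is_extended_compression(concepts, X, 2, kappa, [rho])
assert not is_extended_compression(concepts, X, 1, kappa, [rho])
```

In this toy instance a single reconstruction function suffices, so |R| = 1. The second assertion shows the check failing when the size bound d is set too small for the chosen κ.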

With Proposition 2.1 we will show that for any concept class C, the ex-

istence of either of these d-compressions is equivalent to having a Floyd-

Warmuth extended d-compression with b extra bits for some b.


In model theory, one is concerned with uniformly definable families of an

algebraic structure. The idea of definability is important in logic, and can

be described as follows.

A language L is a collection of function, relation, and constant symbols,

together with the familiar logical operators, quantifiers, and variables, which

operate according to the axioms of first-order logic. One recursively defines

the set of L-formulas. An L-structure $\mathcal{M}$ consists of a nonempty universe M,
together with an interpretation of each of the symbols in L. It is
straightforward to recursively define truth in $\mathcal{M}$ and we write $\mathcal{M} \models \varphi(a)$ if ϕ(a) is
true in $\mathcal{M}$.

We generally partition the free variables of a formula into two sequences

and write ϕ(x; y) to denote this partition. Intuitively, we think of x as the

free variables and y as the parameters or instantiated instances.

If $y = (y_1, \dots, y_m)$ is an ordered set of variables, then we denote the
length of y by lg(y) = m.

Suppose ϕ(x; y) is a partitioned formula with lg(x) = k and lg(y) = m.
For any $A \subseteq M^k$ and $b \in M^m$, we define

$$\varphi(A; b) = \{a \in A : \mathcal{M} \models \varphi(a; b)\}$$

which is visibly a subset of A. The uniformly definable family

$$C_{\varphi(x;y)} = \{\varphi(M^k; b) : b \in M^m\}$$

is a set of subsets of $M^k$.
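As a concrete instance of this notation (an illustration added here, with the structure and formula chosen by us rather than taken from the text), consider the ordered reals $(\mathbb{R}, <)$ and a formula with one free variable and two parameters. The resulting uniformly definable family consists of the open intervals, together with the empty set when $b_1 \ge b_2$.

```latex
% Assumed example: M = (\mathbb{R}, <), k = \lg(x) = 1, m = \lg(y) = 2.
\varphi(x; y_1, y_2) := (y_1 < x) \wedge (x < y_2)

\varphi(\mathbb{R}; b_1, b_2) = \{a \in \mathbb{R} : b_1 < a < b_2\} = (b_1, b_2)

C_{\varphi(x;y)} = \{(b_1, b_2) : (b_1, b_2) \in \mathbb{R}^2\}

% Trace on a finite set A \subseteq \mathbb{R}:
\varphi(A; b_1, b_2) = A \cap (b_1, b_2)
```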


In this paper, we prove two theorems. The notion of a stable formula is

defined in section 3.

Theorem 1.1. If ϕ(x; y) is any stable formula in a structure M, then the

uniformly definable family

$$C_{\varphi(x;y)} = \{\varphi(M^k; c) : c \in M^m\}$$

has an extended d-compression for some d.

The proof of 1.1 is an application of a certain kind of type definition from

model theoretic stability theory. The relationship between type definitions

and compression schemes is intimate, as will be made clear in Proposition

3.3.

Theorem 1.2. If M is an o-minimal structure and ϕ(x; y) is any partitioned

formula, then

$$C_{\varphi(x;y)} = \{\varphi(M^k; c) : c \in M^m\}$$

has an extended m-compression, where m = lg(y).

Theorem 1.2 generalizes results from Floyd-Warmuth [7] and Basu [3],

who proved a similar result for o-minimal expansions of the real field. The

proof follows by an induction on parameters, and makes use of the finitary

character of o-minimal systems.

As an example of the utility of these theorems, consider a ‘generalized

polynomial’ or ‘fewnomial’ in the sense of Khovanskii [10]

$$p(x; a_1, \dots, a_n, b_1, \dots, b_n) = a_1 x^{b_1} + \cdots + a_n x^{b_n}$$

in which the exponents themselves are given by positive real parameters. For

each 2n-tuple (a, b) of real numbers, let A(a, b) = {x > 0 : p(x; a, b) ≥ 0},

and let $C = \{A(a, b) : (a, b) \in \mathbb{R}^{2n}\}$. Then Theorem 1.2, together with the
fact that the sets A(a, b) form a uniformly definable family in an o-minimal
structure (the real exponential field), implies that C has an extended
2n-compression. This extends results of Floyd and Warmuth [7], who proved a

similar result for polynomials in which the exponents were fixed.

2 Extended compression schemes and combi-

natorial density

Here we compare our two notions of a compression with preexisting notions

and show how their existence relates to the combinatorial complexity of a

family of sets.

Proposition 2.1. Let C be any set of subsets of a universe X. For any

positive integer d, the following notions are equivalent:

1. C has an extended d-compression;

2. For some b, C has a Floyd-Warmuth extended d-compression with b

extra bits;

3. C has an extended d-sequence compression.

Proof. First, the equivalence of (1) and (2) is immediate. Given an extended

d-compression, take any b such that $2^b \ge |R|$. Use the same compression

function κ, and use the b bits to code which reconstruction function to use.


Conversely, given a Floyd-Warmuth extended compression, let R be the set

of reconstruction functions coded by the b extra bits.

To obtain the equivalence of (1) and (3), suppose that C has an extended

d-compression. To show that C has an extended d-sequence compression,

fix an arbitrary linear ordering < on X. Then form a d-element sequence

coding κ(f), which is a subset of dom(f) of size at most d, by writing κ(f) in

ascending order, repeating the largest element as needed to make the length

d, and adjust the reconstruction functions accordingly. Conversely, fix an

extended d-sequence compression of C. Form an extended d-compression by

taking the new compression function to be range(κ(f)). Since there are at

most $d^d$ length d sequences of these elements, we can replace each of the
original reconstruction functions by a set of at most $d^d$ new reconstruction

functions coding each of the possibilities.
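The two conversions in this proof can be sketched as follows. This is a minimal illustration under assumptions of our own: the universe is taken to be numbers with their usual order (standing in for the arbitrary linear ordering), the compressed set is assumed nonempty, and the function names are hypothetical.

```python
from itertools import product
from typing import FrozenSet, Iterable, List, Tuple


def set_to_sequence(S: Iterable[int], d: int) -> Tuple[int, ...]:
    """Write S in ascending order and repeat the largest element up to length d."""
    items = sorted(set(S))
    if not items or len(items) > d:
        raise ValueError("expected a nonempty set with at most d elements")
    return tuple(items + [items[-1]] * (d - len(items)))


def sequence_to_set(seq: Iterable[int]) -> FrozenSet[int]:
    """Forget order and repetition: the new compression is range(kappa(f))."""
    return frozenset(seq)


def sequences_over(S: Iterable[int], d: int) -> List[Tuple[int, ...]]:
    """All length-d sequences over S; there are |S|**d <= d**d of them when |S| <= d."""
    return list(product(sorted(set(S)), repeat=d))


seq = set_to_sequence({3, 1}, 3)
assert sequence_to_set(seq) == frozenset({1, 3})
assert seq in sequences_over({1, 3}, 3)
assert len(sequences_over({1, 3}, 3)) <= 3 ** 3
```

In the converse direction, a reconstruction function on sequences is replaced by one new reconstruction function per candidate sequence over the stored set, which is where the bound $d^d$ comes from.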

Furthermore, if one allows the d’s to vary, it is routine to show that a set

C of concepts has an extended d-sequence compression for some d if and only

if C has a size d′-array compression for some d′ in the sense of Ben-David

and Litman [4].

We compare the existence of an extended d-compression with the measure

of combinatorial density introduced by Assouad in [1].

Definition For $C \subseteq {}^X\{0,1\}$, the combinatorial density of C, denoted dens(C),
is a real number defined as the infimum over all r > 0 for which there is a
number N for which

$$|C|_A| \le N \cdot |A|^r$$

for all finite subsets A ⊆ X.


One of the primary virtues of VC dimension is that it provides an upper

bound for combinatorial density. See, for instance, Chapter 4 of [6]. The

following shows that compression schemes can perform a similar function.

Lemma 2.2. Suppose C has an extended d-compression. Then the combina-

torial density of C is at most d.

Proof. Fix any finite A ⊆ X. Since C has an extended d-compression it also

has an extended d-sequence compression. Fix one such compression and let N
denote the number of reconstruction functions. Then any $g \in C|_A$ is uniquely
determined by its compression $\kappa(g) \in A^d$ and any reconstruction function
ρ ∈ R satisfying g ⊑ ρ(κ(g)). That is, $|C|_A| \le N \cdot |A|^d$, so dens(C) ≤ d.
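The counting bound can be checked numerically on a simple family. The sketch below is an illustration under assumptions of our own: the family is the open intervals of the real line (for which d = 2), the helper name `interval_traces` is hypothetical, and N = 2 is an assumed number of reconstruction functions (one for nonempty traces, one for the empty trace).

```python
from typing import FrozenSet, List, Set


def interval_traces(A: List[float]) -> Set[FrozenSet[float]]:
    """Distinct traces on the nonempty finite set A of open intervals (lo, hi)."""
    pts = sorted(set(A))
    # One candidate endpoint below all points, one in each gap, one above all points;
    # every open interval has the same trace on A as some pair of these cuts.
    cuts = [pts[0] - 1] + [(p + q) / 2 for p, q in zip(pts, pts[1:])] + [pts[-1] + 1]
    return {frozenset(a for a in pts if lo < a < hi) for lo in cuts for hi in cuts}


for n in range(1, 9):
    traces = interval_traces(list(range(n)))
    assert len(traces) == n * (n + 1) // 2 + 1  # contiguous runs plus the empty trace
    assert len(traces) <= 2 * n ** 2            # the bound N * |A|**d with N = 2, d = 2
```

The number of traces grows quadratically in |A|, matching the exponent d = 2 given by the lemma for this family.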

Finally, we call an extended d-compression consistent if the image of each

reconstruction function ρ is an element of C as opposed to an arbitrary func-

tion g : X → {0, 1}. Consistent compressions are the subject of Subsec-

tion 4.1.

3 Structures and uniformly definable families

We are concerned with finding sufficient conditions on ϕ and M which

will guarantee that uniformly definable families Cϕ(x;y) have extended d-

compressions.

As notation, given a partitioned formula ϕ(x; y), its dual is the formula

ϕ∗(y; x). Formally, the dual ϕ∗ is the same formula as ϕ, but with the roles

of the free variables and parameters reversed. Thus,

$$C_{\varphi^*(y;x)} = \{\varphi(a; M^m) : a \in M^k\}$$

is a uniformly definable family of subsets of $M^m$.

3.1 Stable formulas

There are many equivalent definitions of a partitioned formula ϕ(x; y) being

stable in a structure M. Stability of a formula morally means that the

parameters cannot be used to linearly order arbitrarily large sets of points

from the domain. For example, the formula x < y interpreted in any infinite

linear order is not stable, whereas x = y is stable in any structure.

We call ϕ(x; y) stable if, for some integer N, there are no elements $\{(a_i, b_i) :
1 \le i \le N\}$ from $M^k \times M^m$ such that for all 1 ≤ i, j ≤ N

$$\mathcal{M} \models \varphi(a_i; b_j) \text{ if and only if } i < j$$
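The forbidden configuration in this definition can be tested directly on small finite domains. The sketch below is only an illustration: the names `is_ladder` and `has_ladder` are hypothetical, and a finite search can exhibit such a configuration but cannot by itself prove stability in an infinite structure.

```python
from itertools import product
from typing import Callable, Iterable, Sequence, Tuple


def is_ladder(phi: Callable[[int, int], bool], pairs: Sequence[Tuple[int, int]]) -> bool:
    """Check that phi(a_i, b_j) holds exactly when i < j for the given (a_i, b_i)."""
    n = len(pairs)
    return all(phi(pairs[i][0], pairs[j][1]) == (i < j)
               for i in range(n) for j in range(n))


def has_ladder(phi: Callable[[int, int], bool], domain: Iterable[int], n: int) -> bool:
    """Brute-force search for a configuration of length n with entries from domain."""
    candidates = list(product(list(domain), repeat=2))
    return any(is_ladder(phi, pairs) for pairs in product(candidates, repeat=n))


def less_than(x: int, y: int) -> bool:
    return x < y


def equals(x: int, y: int) -> bool:
    return x == y


# x < y orders the diagonal pairs (i, i), so such configurations exist at every length.
assert is_ladder(less_than, [(i, i) for i in range(10)])
# x = y admits a configuration of length 2 but none of length 3.
assert has_ladder(equals, range(4), 2)
assert not has_ladder(equals, range(4), 3)
```

For equality the failure at length 3 does not depend on the domain: the conditions force $a_1 = b_2 = b_3 = a_2$, which contradicts the requirement that ϕ(a_2; b_2) fail.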

An example of a non-trivial stable formula is f(x; y) = 0, where f is a

parameterized polynomial, when evaluated in any field (F,+, ·). In this case,

$C_{f(x;y)=0}$ is the parameterized family of zero sets of f(x; a) for various a from

F . The reader may want to verify this fact for the simplest case, lg(x) = 1,

where it follows from the fact that the cardinality of a zero set is uniformly

bounded by deg(f) over all choices of parameters (coefficients).

To distinguish this section from the subsequent one, we should say that

o-minimal structures, since they are ordered, always have many unstable


formulas. In fact most natural geometric families are unstable. For example,

the family of all 2-discs in $\mathbb{R}^2$ is unstable, because any finite sequence of

points in the plane arranged along a line can be linearly ordered (in the

sense of stability) by discs emanating from one of the endpoints.

There are many texts giving the salient features of stable formulas. See

e.g., [17] (Chapter 1, Lemma 2.1), or [18] for proof of the following easy fact.

Fact 3.1. For every L-structure M the set of stable formulas is closed under

Boolean combinations and duality. That is, if ϕ(x; y) and ψ(x; z) are both

stable, then so are the dual ϕ∗(y; x), ¬ϕ(x; y), and (ϕ ∧ ψ)(x; y, z).

There are many consequences of stability. For our purposes, the most

relevant one is the existence of uniform type definitions, which we now intro-

duce.

Definition A formula ϕ(x; y) with lg(x) = k, lg(y) = m has a uniform type
definition if there is an integer d ≥ 1 and a formula $\psi(y; z_1, \dots, z_d)$ with
$\lg(z_i) = m$ for each i such that for all $a \in M^k$ and all $B \subseteq M^m$ with more
than one element, there is $(c_1, \dots, c_d) \in B^d$ such that

$$\varphi(a; B) = \psi(B; c_1, \dots, c_d)$$

Thus, the dual formula ϕ∗(y; x) having a uniform type definition means
that there is a formula $\psi(x; z_1, \dots, z_d)$ such that for all $b \in M^m$ and all
$A \subseteq M^k$ with more than one element, there is $(e_1, \dots, e_d) \in A^d$ such that
$\varphi(A; b) = \psi(A; e_1, \dots, e_d)$.
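A small worked instance may help here. It is an illustration under assumptions of our own rather than an example from the text: take ϕ(x; y) to be the formula x = y, take d = 2, and let ψ be the formula chosen below. The second case also shows why sets with only one element are excluded.

```latex
% Assumed example: \varphi is equality, and \psi is our choice of defining formula.
\varphi(x; y) := (x = y), \qquad
\psi(x; z_1, z_2) := (x = z_1) \wedge (z_1 = z_2)

% Case b \in A: choose (e_1, e_2) = (b, b).
\psi(A; b, b) = \{a \in A : a = b\} = \{b\} = \varphi(A; b)

% Case b \notin A: choose distinct e_1 \neq e_2 in A, which is possible since |A| \ge 2.
\psi(A; e_1, e_2) = \emptyset = \varphi(A; b)
```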


The following fact is well known. See e.g., Theorem II 2.2 of [18] for a

proof.

Fact 3.2. If ϕ(x; y) is stable then it has a uniform type definition.

The following Proposition simply amounts to unpacking the definitions.

Proposition 3.3. If the dual formula ϕ∗(y; x) has a uniform type definition

via the formula ψ(x; z1, . . . , zd), then the uniformly definable family Cϕ(x;y)

has an extended d-compression.

Proof. We will show that Cϕ(x;y) has an extended d-sequence compression.

Fix $A \subseteq M^k$ finite. For each $b \in M^m$, let $\chi_{\varphi(A;b)} : A \to \{0,1\}$ denote the
characteristic function of ϕ(A; b). Define the compression function κ to send
$\chi_{\varphi(A;b)}$ to any $(e_1, \dots, e_d) \in A^d$ such that $\varphi(A; b) = \psi(A; e_1, \dots, e_d)$. Then
define the unique reconstruction function ρ by taking $\rho(e_1, \dots, e_d)$ to be the
characteristic function of $\psi(A; e_1, \dots, e_d)$.

Theorem 3.4. For any structure M and any stable formula ϕ(x; y), the

uniformly definable family $C_{\varphi(x;y)} = \{\varphi(M^k; b) : b \in M^m\}$ of subsets of $M^k$

has an extended compression scheme.

Proof. By Fact 3.1 the dual formula ϕ∗(y; x) is stable, hence has a uniform

type definition by Fact 3.2. Thus Cϕ(x;y) has an extended compression scheme

by Proposition 3.3.

There is no hope of relating the size of a compression scheme of a stable

formula to the number of its parameters. For example, for any d, let $\mathcal{M}_d$ be
a structure in the language L = {U, R}, in which $U^{\mathcal{M}_d}$ is interpreted as an
infinite set X, $(\neg U)^{\mathcal{M}_d}$ is interpreted as $[X]^d$, and R(x, y) is interpreted as
membership x ∈ y. Then R(x, y) is a stable formula with a d-compression,
but no extended d′-compression for any d′ < d.

It may be the case that the size of an extended compression on a sta-

ble formula can be bounded by some other combinatorial characteristic. A

leading candidate would be the Shelah 2-rank of the formula.

4 O-minimal structures

Definition Let L = {<, . . .} and let M be an L-structure. We say that M

is o-minimal if < is interpreted as a dense linear order without endpoints,

and for every partitioned L-formula ψ(x; y) in a single variable x, the set

ψ(M; c) is a finite union of points and intervals for every $c \in M^m$.

An excellent reference for o-minimality is [5]. The most important o-

minimal structures for applications are expansions of the ordered real num-

bers, but there are other examples, in particular various structures built on

the rationals.

It follows from the Tarski-Seidenberg theorem that the real field is o-

minimal. Two other important cases are due to results by Wilkie and

Gabrielov. They are:

• The real field with a function symbol added for exponentiation [21].

• The real field with symbols added for the analytic functions, restricted

to a compact domain [22, 8].


Since the real field with exponentiation is o-minimal, our result shows

that any feedforward sigmoidal neural network with standard sigmoid has an

extended compression scheme of size w, where w is the number of weights.

Similarly, concept classes associated with sets of positivity of exponential

expressions such as

$$e^{e^{x_1 y_1} + x_2^2} + x_1^2 e^{y_2} \ge 0$$

will be subject to Corollary 4.5, as will any “exponential polynomial” or

fewnomial in the sense of Wilkie and Khovanskii [16, 21, 10]. It follows from

[4], as well as from our result, that any family of real algebraic hypersurfaces

defined by polynomials with k non-zero monomials admits an extended k-

compression.

Note that if M is o-minimal and ϕ(x; y) is any partitioned formula

with a single x-variable, then the boundary of ϕ(M ; c), which we denote

by ∂ϕ(M ; c), is a finite subset of M . A proof of the following Fact is given

in Chapter 3 of [5].

Fact 4.1. Suppose $\mathcal{M}$ is o-minimal and ϕ(x; y) is any partitioned formula
with a single x-variable. Then there is some integer N such that for every
$c \in M^m$, the boundary ∂ϕ(M; c) of the set ϕ(M; c) consists of fewer than N
points.

Lemma 4.2. For any o-minimal structure M and for any formula θ(x; y, z)

with lg(x) = k and lg(y) = 1, there is a finite set $F_\theta$ of formulas, each of the
form ψ(x; w, z) with lg(w) = k, such that for every finite $A \subseteq M^k$, c ∈ M
and $e \in M^{\lg(z)}$ there is a ∈ A and $\psi \in F_\theta$ such that θ(A; c, e) = ψ(A; a, e).

Proof. Fix an o-minimal structure $\mathcal{M}$ and a formula θ(x; y, z). By o-minimality,
choose an integer N so that the boundary ∂θ(a, M, e) has size at most N for
all choices of a and e. For each i < N, let $f_i(a, e)$ denote the ith boundary
point (with respect to the ordering on M) of ∂θ(a, M, e) if it exists. Note
that each partial function $f_i$ is definable in $\mathcal{M}$. We define four groups of
formulas:

$$\begin{aligned}
\psi_{i,1}(x; w, z) &:= \theta(x; f_i(w, z), z) \\
\psi_{i,2}(x; w, z) &:= \forall u\,[u < f_i(w, z) \to \exists v\,(u < v < f_i(w, z) \wedge \theta(x; v, z))] \\
\psi_{i,3}(x; w, z) &:= \forall u\,[u > f_i(w, z) \to \theta(x; u, z)] \\
\psi_{*}(x; w, z) &:= \forall u\,\theta(x; u, z)
\end{aligned}$$

Let $F_\theta$ be this finite set of formulas.

Now choose a finite $A \subseteq M^k$, c ∈ M, and $e \in M^{\lg(z)}$. Let

$$D = \bigcup_{a \in A} \partial\theta(a, M, e)$$

Since A is finite, D is a finite subset of M . The argument now splits into

four cases:

Case 1. c ∈ D.

Then θ(A; c, e) = ψi,1(A; a, e) for any a ∈ A and i < N such that c =

fi(a, e).

Case 2. c ∉ D, but c < d for some d ∈ D.

Since D is finite, we can choose d∗ ∈ D least such that c < d∗. By

definition of D, for any a ∈ A, the truth of θ(a, y, e) is invariant for y’s

taken from the half-open interval [c, d∗). Thus θ(A; c, e) = ψi,2(A; a, e) for


any a ∈ A and i < N such that fi(a, e) = d∗.

Case 3. D ≠ ∅ but c > d for every d ∈ D.

Let d∗ be the maximum element in D. For every a ∈ A, the truth of

θ(a, y, e) is invariant on the open interval (d∗,∞), so θ(A; c, e) = ψi,3(A; a, e)

for any a ∈ A and i < N such that fi(a, e) = d∗.

Case 4. D = ∅.

In this case, for any a ∈ A, the truth of θ(a, y, e) is invariant on all of M ,

hence θ(A; c, e) = ψ∗(A; a, e) for any a from A.
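The case analysis can be traced on the simplest example. The sketch below is an illustration under assumptions of our own: the structure is the ordered reals, θ(x; y) is the formula x < y with no extra parameters z, and the names `trace`, `compress` and `RECON` are hypothetical. For this θ the boundary of {y : a < y} is {a}, so D = A, and Case 4 cannot arise for nonempty A.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def trace(A: List[float], c: float) -> FrozenSet[float]:
    """theta(A; c) for theta(x; y) := x < y."""
    return frozenset(a for a in A if a < c)


# The formulas psi_1, psi_2, psi_3 of the proof, evaluated by hand for this theta:
#   psi_1(x; w): x < w
#   psi_2(x; w): for all u < w there is v with u < v < w and x < v, i.e. x < w
#   psi_3(x; w): for all u > w, x < u, i.e. x <= w
RECON: Dict[str, Callable[[List[float], float], FrozenSet[float]]] = {
    "psi_1": lambda A, w: frozenset(a for a in A if a < w),
    "psi_2": lambda A, w: frozenset(a for a in A if a < w),
    "psi_3": lambda A, w: frozenset(a for a in A if a <= w),
}


def compress(A: List[float], c: float) -> Tuple[str, float]:
    """Replace the parameter c by a formula tag and a single element of A."""
    if not A:
        raise ValueError("A is assumed nonempty in this sketch")
    D = sorted(A)  # D = A for this theta
    if c in D:
        return ("psi_1", c)  # Case 1: c is a boundary point
    above = [d for d in D if d > c]
    if above:
        return ("psi_2", above[0])  # Case 2: least boundary point above c
    return ("psi_3", D[-1])  # Case 3: c lies above every boundary point


A = [1.0, 2.5, 4.0]
for c in [0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0]:
    tag, a = compress(A, c)
    assert a in A
    assert RECON[tag](A, a) == trace(A, c)
```

The parameter c, which may lie anywhere on the line, is thus traded for one point of A and one of finitely many formulas.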

Note that there was no constraint on lg(z) in the previous Lemma. Thus,

if we are given a formula ϕ(x; y) with lg(y) = m, we can inductively shave

off elements from y in favor of elements from A. More precisely, we have the

following Proposition.

Proposition 4.3. Let M be any o-minimal structure. For all formulas

ϕ(x; y) with lg(x) = k and lg(y) = m, and for all 1 ≤ r ≤ m, there is

a finite set $F^r_\varphi$ of formulas, each of the form $\psi(x; y_1, \dots, y_{m-r}, w_1, \dots, w_r)$,
such that for all finite $A \subseteq M^k$ and all $c \in M^m$, there are $(a_1, \dots, a_r) \in A^r$
and $\psi \in F^r_\varphi$ such that $\varphi(A; c) = \psi(A; c_1, \dots, c_{m-r}, a_1, \dots, a_r)$.

Proof. By induction on r ≤ m. When r = 1 this follows immediately from
Lemma 4.2 applied to ϕ. Assuming the result holds for r < m, in order to
establish it for r + 1 one simply applies Lemma 4.2 to each of the formulas
$\psi \in F^r_\varphi$ and takes $F^{r+1}_\varphi = \bigcup_{\psi \in F^r_\varphi} F_\psi$.

By taking r = m in the previous Proposition we obtain the following

theorem.


Theorem 4.4. Let M be any o-minimal structure. For all formulas ϕ(x; y)

with lg(x) = k and lg(y) = m, there is a finite set F of formulas, each of
the form $\psi(x; w_1, \dots, w_m)$, such that for all finite $A \subseteq M^k$ and all $c \in M^m$,
there is $(a_1, \dots, a_m) \in A^m$ and ψ ∈ F such that $\varphi(A; c) = \psi(A; a_1, \dots, a_m)$.

Corollary 4.5. For M any o-minimal structure, every uniformly definable

family Cϕ(x;y) has an extended m-compression, where m = lg(y).

Proof. By Proposition 2.1 it suffices to exhibit an extended m-sequence
compression. Let F be the finite set of formulas from Theorem 4.4. Fix a finite
$A \subseteq M^k$. Define the compression function by

$$\kappa(\chi_{\varphi(A;c)}) = (a_1, \dots, a_m)$$

where $(a_1, \dots, a_m) \in A^m$ is found via Theorem 4.4. Take $R = \{\rho_\psi : \psi \in F\}$,
where $\rho_\psi(a_1, \dots, a_m) = \chi_{\psi(A; a_1, \dots, a_m)}$.

The next Corollary follows immediately from Corollary 4.5 and Lemma 2.2.

It generalizes a result of Basu [3], which makes a similar assertion for o-

minimal expansions of real closed fields.

To see the relation between our result and that of Basu, note that Basu

considers arrangements of n objects, thought of as the fibers of some fixed

definable map $\pi : T \to R^l$ where T is a definable subset of $R^{l+k}$. He concludes
that the combinatorial complexity of any such finite arrangement (i.e. the
number of cells) is $O(n^k)$. (This is phrased as a result on the Betti numbers

of the cells in Theorem 2.2 of [3]). This is dual to our result on density,

though we phrase things in terms of relations rather than projections. Fun-


damentally, the number of subsets traced on a set of points, and the number

of cells given by the fibers of these points under a projection, are the same.

Our result is therefore equivalent, but has a slightly increased generality

stemming from the fact that we do not assume (as Basu does) that our

o-minimal structure is an expansion of a real closed field.

Corollary 4.6. For M any o-minimal structure, every uniformly definable

family Cϕ(x;y) has combinatorial density at most lg(y).

4.1 Consistent compressions

If one is willing to double the size of the data set, then in any o-minimal

structure M, every uniformly definable family Cϕ(x;y) has a consistent d-

compression, where d = 2 · lg(y). That is, the range of the reconstruction

functions can be taken to be $\{\chi_{\varphi(A;c)} : c \in M^{\lg(y)}\}$.

The proof is analogous to the series of statements 4.2–4.5 given above.

Lemma 4.7. For any o-minimal structure M and for any formula θ(x; y, z)

with lg(x) = k and lg(y) = 1, there is a finite set $F_\theta$ of formulas, each of
the form $\psi(y; w_1, w_2, z)$ with $\lg(w_1) = \lg(w_2) = k$ such that for every finite
$A \subseteq M^k$, c ∈ M and $e \in M^{\lg(z)}$ there is $(a_1, a_2) \in A^2$ and $\psi \in F_\theta$ such that

1. The set $\psi(M; a_1, a_2, e)$ is connected i.e., an interval

2. $c \in \psi(M; a_1, a_2, e)$ and

3. θ(A; c, e) = θ(A; d, e) for every $d \in \psi(M; a_1, a_2, e)$.

Proof. The proof is similar to that of Lemma 4.2. Fix M, θ, the integer N ,

and definable functions fi as in the proof of 4.2.

Here we define five groups of formulas, each with free variables among
$(y, w_1, w_2, z)$:

$$\begin{aligned}
\psi_{i,1} &:= y = f_i(w_1, z) \\
\psi_{2} &:= y = y \quad \text{(i.e., always true)} \\
\psi_{i,3} &:= y < f_i(w_1, z) \\
\psi_{i,4} &:= y > f_i(w_1, z) \\
\psi_{i,j,5} &:= f_i(w_1, z) < y < f_j(w_2, z)
\end{aligned}$$

Let $F_\theta$ be this finite set of formulas.

As before, choose a finite $A \subseteq M^k$, c ∈ M, and $e \in M^{\lg(z)}$. Let

$$D = \bigcup_{a \in A} \partial\theta(a, M, e)$$

Since A is finite, D is a finite subset of M . Here the argument now splits

into five cases:

Case 1. c ∈ D.

Then ψi,1(M ; a, e) works for any a ∈ A and i < N such that c = fi(a, e).

Case 2. D = ∅.

Then the always true formula y = y suffices for ψ.

Case 3. D ≠ ∅ but c < d for every d ∈ D.

Let d∗ be the smallest element of D. Then ψi,3(M ; a, e) works for any

a ∈ A and i < N such that fi(a, e) = d∗.

Case 4. D ≠ ∅ but c > d for every d ∈ D.

Let d∗ be the maximum element in D. Take ψi,4(M ; a, e) for any a ∈ A

and i < N such that fi(a, e) = d∗.

Case 5. c ∉ D, but there are elements of D above and below c.

In this case, let d− be the maximum element of D below c and d+ be the

minimum element of D above c. Take ψi,j,5(M ; a1, a2, e) for any a1 such that

d− = fi(a1, e) and any a2 so that d+ = fj(a2, e).

As in the proof of Proposition 4.3, a routine induction yields the following

Proposition for any m and any 1 ≤ r ≤ m.

Proposition 4.8. Let M be any o-minimal structure. For all formulas

ϕ(x; y) with lg(x) = k and lg(y) = m, and for all 1 ≤ r ≤ m, there is a

finite set $F^r_\varphi$ of formulas, each of the form $\psi(y_1, \dots, y_r; w_1, \dots, w_{2r})$, such
that for all finite $A \subseteq M^k$ and all $c \in M^m$, there are $(a_1, \dots, a_{2r}) \in A^{2r}$ and
$\psi \in F^r_\varphi$ such that the set $B = \psi(M^r; a_1, \dots, a_{2r})$ satisfies

1. B is an r-dimensional box i.e., an r-fold product of intervals

2. $(c_{m-r+1}, \dots, c_m) \in B$ and

3. $\varphi(A; c_1, \dots, c_{m-r}, d) = \varphi(A; c_1, \dots, c_{m-r}, d')$ for all d, d′ ∈ B.

Taking r = m yields

Theorem 4.9. Let M be any o-minimal structure. For all formulas ϕ(x; y)

with lg(x) = k and lg(y) = m, there is a finite set F of formulas, each of the

form $\psi(y; w_1, \dots, w_{2m})$ such that for all finite $A \subseteq M^k$ and all $c \in M^m$ there
are $(a_1, \dots, a_{2m}) \in A^{2m}$ and ψ ∈ F such that the set $B = \psi(M^m; a_1, \dots, a_{2m})$
is an m-dimensional box containing c on which the set ϕ(A; b) is invariant
among b ∈ B.

Corollary 4.10. Let M be any o-minimal structure. For all formulas ϕ(x; y)

with lg(y) = m, Cϕ(x;y) has a consistent extended 2m-compression scheme.

Proof. As in the proof of Corollary 4.5 it suffices to produce a consistent

extended 2m-sequence compression. Let F be the finite set of formulas given

by Theorem 4.9. Let R be the set of reconstruction functions indexed by ψ ∈ F,
where

$$\rho_\psi(a_1, \dots, a_{2m}) = \chi_{\varphi(A;d)}$$

for some $d \in \psi(M^m; a_1, \dots, a_{2m})$.

Given any finite $A \subseteq M^k$ and any $c \in M^m$, the compression function is
defined by

$$\kappa(\chi_{\varphi(A;c)}) = (a_1, \dots, a_{2m})$$

where $(a_1, \dots, a_{2m})$ are obtained by an application of Theorem 4.9.
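A consistent compression can be sketched on the same simple family of half-lines. The code below is an illustration under assumptions of our own: the structure is the ordered reals, θ(x; y) is x < y, the set A is nonempty (so the case D = ∅ does not arise), and the names `compress`, `witness` and `reconstruct` are hypothetical. Picking a witness by arithmetic uses the field structure of the reals, which is a convenience of this sketch and not part of the general o-minimal setting.

```python
from typing import FrozenSet, List, Tuple


def trace(A: List[float], c: float) -> FrozenSet[float]:
    """theta(A; c) for theta(x; y) := x < y."""
    return frozenset(a for a in A if a < c)


def compress(A: List[float], c: float) -> Tuple[str, float, float]:
    """Describe, using two points of A, an interval around c on which the trace is constant."""
    pts = sorted(A)
    if c in pts:
        return ("point", c, c)
    below = [d for d in pts if d < c]
    above = [d for d in pts if d > c]
    if not below:
        return ("left_ray", above[0], above[0])
    if not above:
        return ("right_ray", below[-1], below[-1])
    return ("between", below[-1], above[0])


def witness(tag: str, a1: float, a2: float) -> float:
    """Pick some parameter d inside the interval named by (tag, a1, a2)."""
    if tag == "point":
        return a1
    if tag == "left_ray":
        return a1 - 1
    if tag == "right_ray":
        return a1 + 1
    return (a1 + a2) / 2


def reconstruct(A: List[float], tag: str, a1: float, a2: float) -> FrozenSet[float]:
    """Return the trace of an actual concept theta(.; d), which makes the scheme consistent."""
    return trace(A, witness(tag, a1, a2))


A = [1.0, 2.5, 4.0]
for c in [0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0]:
    tag, a1, a2 = compress(A, c)
    assert a1 in A and a2 in A
    assert reconstruct(A, tag, a1, a2) == trace(A, c)
```

Two points of A are stored for the single parameter, matching the bound 2 · lg(y), and every reconstructed function is the characteristic function of a set in the family.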

5 Discussion

Results in [11] increased the number of known VC classes by considering

parallel results in mathematical logic concerning independence dimension.

Here we have sought to make an analogous connection between compression

schemes and type definability. The matching in this case is less symmetric,

but interesting nonetheless.

Warmuth and Littlestone conjectured, in their original unpublished manuscript

on compression schemes, that all set systems with finite VC dimension ad-

mit an extended compression scheme of some finite size. In this paper we

have confirmed that all set systems which are stable and all those definable


in o-minimal structures have extended compressions. It is of longstanding

interest to determine whether, in fact, every class of finite VC dimension has

an extended compression scheme, or, in model theoretic language, whether

the theories with uniform type definitions over finite sets are exactly those

which are dependent.

It is also of interest to relate the size of the extended compression, both

its dimension and the number of reconstruction functions, to other metrics of

the class. In the stable case, the Shelah 2-rank [18] is the obvious possibility.

Our results depend modestly but necessarily on appeals to the compact-

ness theorem. Work remains to be done to translate our results into a work-

able method in which both |R| and the size of the compression are managed.

This will clearly have to be done on a situational basis, but many of the

pieces already exist for the real field (Milnor bounds) and the real field with

exponentiation (Khovanskii’s fewnomial bounds). One can also approach

compression schemes in o-minimal structures from the point of view of finite

cell decomposition.

Most previous work on compression schemes has relied on the nice prop-

erties of set systems which are maximum, which means that on any finite

set of points, they trace out the maximum number of sets allowed by Sauer’s

lemma. Previous research (e.g. [7, 4]) has shown that definable families

which are natural (intervals, half-spaces, algebraic sets) have a tendency to

be maximum, or nearly so. Our work has made no use of the maximum

condition, but the link between maximum families and the families we have

considered is largely unexplored. It may be the case, for instance, that any

family definable in an o-minimal structure at some level embeds in a max-


imum family. Ben-David and Litman showed a similar fact for algebraic

sets.

References

[1] P. Assouad, Densité et dimension. Annales de l’institut Fourier, 33 no.

3, (1983), 233–282.

[2] J. Baldwin, Fundamentals of Stability Theory, Springer-Verlag, New

York, 1988.

[3] S. Basu, Combinatorial complexity in o-minimal geometry, Proc. of the

39th ann. ACM symp. on Theory of Computing, (2007), 47 – 56.

[4] S. Ben-David, A. Litman, Combinatorial variability of Vapnik-

Chervonenkis classes with applications to sample compression schemes,

Discrete Applied Mathematics 86(1), (1998), 3-25.

[5] L. van den Dries, Tame Topology and O-minimal Structures, Number

248 in London Mathematical Society Lecture Notes Series, Cambridge

University Press, Cambridge, 1998.

[6] R. M. Dudley, Uniform Central Limit Theorems, Cambridge University

Press, New York, 1999.

[7] S. Floyd and M. Warmuth, Sample compression, learnability, and the

Vapnik-Chervonenkis dimension, Machine Learning, Vol.21 (3), (1995),

269–304.


[8] A. Gabrielov, Projections of semi-analytic sets, Functional Anal. Appl.

2 (1968), 282–291.

[9] D. Haussler, E. Welzl, ε-Nets and Simplex Range Queries. Discrete &

Computational Geometry 2, (1987), 127-151.

[10] A. Hovanskii, On a class of systems of transcendental equations, Soviet

Math. Dokl. 22, (1980), 762–765.

[11] M. C. Laskowski, Vapnik-Chervonenkis classes of definable sets, J. Lon-

don Math. Soc. 45 (2), (1992), 377-384.

[12] N. Littlestone and M. Warmuth, Unpublished notes.

[13] M. Marchand and J. Shawe-Taylor. The Set Covering Machine. Journal

of Machine Learning Research, 3, (2002), 723-746.

[14] M. Marchand and J. Shawe-Taylor. The Decision List Machine. Advances

in Neural Information Processing Systems,15, (2003), 921-928.

[15] D. Marker, Model Theory: An Introduction, Springer, New York, 2002.

[16] D. Marker, Model theory and exponentiation, Notices Amer. Math. Soc.

43 (1996), 753–759.

[17] A. Pillay, Geometric Stability Theory, Oxford University Press, 1996.

[18] S. Shelah, Classification theory, Second edition, North Holland, 1990.

[19] A. Tarski, A Decision method for Elementary Algebra and Geometry,

second edition revised, Rand Corporation, Berkeley and Los Angeles,

1951.


[20] V. Vapnik, A. Chervonenkis, On the uniform convergence of relative

frequencies of events to their probabilities, Theory of Probability and

its Applications, 16(2), (1971), 264–280.

[21] A. J. Wilkie, Model completeness results for expansions of the real field

by restricted Pfaffian functions and the exponential function, J. Amer.

Math. Soc. 9, (1996), 1051–1094.

[22] A. J. Wilkie, A theorem of the complement and some new o-minimal

structures, Selecta Mathematica, New Series, Vol. 5, No. 4, Dec (1999),

397-421.
