VC Dimension – definition and impossibility result Lecturer: Yishay Mansour Eran Nir and Ido Trivizki

Page 1: VC Dimension – definition and impossibility result

VC Dimension – definition and impossibility result

Lecturer: Yishay Mansour

Eran Nir and Ido Trivizki

Page 2: VC Dimension – definition and impossibility result

VC Dimension – Lecture Overview

  • PAC Model – Review
  • VC dimension – motivation
  • Definitions
  • Some examples of geometric concepts
  • Sample size lower bounds
  • More examples

Page 3: VC Dimension – definition and impossibility result

The PAC Model - Review

  • A fixed, unknown distribution $D$ from which the examples are chosen independently.
  • The target concept $c_t$ is a computable function.
  • Our goal: finding $h$ such that $\Pr[\mathrm{error}(h) \le \epsilon] \ge 1 - \delta$, where $\mathrm{error}(h) = \Pr_D[c_t(x) \ne h(x)]$, $\epsilon$ is the accuracy parameter and $\delta$ is the confidence parameter.
  • An algorithm $A$ learns a family of concepts $C$ if for any $c_t \in C$ and any distribution $D$, $A$ outputs a function $h$ such that $\Pr[\mathrm{error}(h) \le \epsilon] \ge 1 - \delta$.

Page 4: VC Dimension – definition and impossibility result

VC Dimension - Motivation

Question: How many examples does a learning algorithm need?

For PAC and a finite concept class $C$ we proved:

$$m \ge \frac{1}{\epsilon} \ln \frac{|C|}{\delta}$$

We would like to be able to handle infinite concept classes. The VC dimension will provide us with a substitute for $\ln |C|$ for infinite concept classes.
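
To get a feel for the finite-class bound, the following sketch evaluates $\frac{1}{\epsilon}\ln\frac{|C|}{\delta}$ for given parameters. The function name and the numbers are illustrative assumptions, not part of the lecture.

```python
from math import ceil, log

def finite_class_sample_size(class_size, epsilon, delta):
    """Smallest integer m with m >= (1/epsilon) * ln(class_size / delta)."""
    return ceil(log(class_size / delta) / epsilon)

# Hypothetical numbers: |C| = 2^10 or 2^20 concepts, epsilon = 0.1, delta = 0.05.
m_small = finite_class_sample_size(2 ** 10, 0.1, 0.05)
m_large = finite_class_sample_size(2 ** 20, 0.1, 0.05)
print(m_small, m_large)
```

The bound grows only logarithmically in $|C|$, which is why a quantity playing the role of $\ln |C|$ is what is needed for infinite classes.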

Page 5: VC Dimension – definition and impossibility result

VC Dimension - Definitions

Given a concept class $C$ defined over the instance space $X$, let $S \subseteq X$.

The projection of $C$ on $S$ is all the possible functions that $C$ induces on $S$:

$$\Pi_C(S) = \{ c \cap S \mid c \in C \}, \qquad |\Pi_C(S)| \le 2^m \quad (|S| = m)$$

A concept class $C$ shatters $S$ if $|\Pi_C(S)| = 2^{|S|}$.

In other words: a class shatters a set if every possible function on the set is in the class.
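
The definitions above can be checked mechanically on small finite examples. The sketch below represents each concept as a 0/1-valued Python function. The helper names `projection` and `shatters` and the toy classes are assumptions made for illustration.

```python
from itertools import product

def projection(concepts, points):
    """Set of labelings (as tuples) that the concepts induce on `points`."""
    return {tuple(c(x) for x in points) for c in concepts}

def shatters(concepts, points):
    """True if every 0/1 labeling of `points` is induced by some concept."""
    return len(projection(concepts, points)) == 2 ** len(points)

# Toy class (illustrative): all 8 boolean functions on X = {0, 1, 2}.
X = [0, 1, 2]
all_functions = [lambda x, t=t: t[x] for t in product([0, 1], repeat=len(X))]
# A much smaller class: only the two constant functions.
constants = [lambda x: 0, lambda x: 1]

print(shatters(all_functions, X))   # every labeling of X appears
print(shatters(constants, [0]))     # both labels of a single point appear
print(shatters(constants, [0, 1]))  # only 2 of the 4 labelings appear
```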

Page 6: VC Dimension – definition and impossibility result

VC Dimension – Definitions Cont.

VCdim (Vapnik-Chervonenkis dimension) of $C$: the maximum size of a set shattered by $C$:

$$\mathrm{VCdim}(C) = \max \{ d : \exists S,\ |S| = d,\ \Pi_C(S) = \{0,1\}^d \}$$

If a maximum value doesn't exist then $\mathrm{VCdim}(C) = \infty$.

For a finite class $C$: $\mathrm{VCdim}(C) \le \log_2 |C|$.
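
For a finite class over a finite domain, the maximum in this definition can be found by brute force. The sketch below is one way to do that. The function `vc_dimension` and the example class (closed intervals on a five-point grid) are assumptions chosen for illustration.

```python
from itertools import combinations
from math import log2

def shatters(concepts, points):
    labelings = {tuple(c(x) for x in points) for c in concepts}
    return len(labelings) == 2 ** len(points)

def vc_dimension(concepts, domain):
    """Largest d such that some d-subset of `domain` is shattered."""
    best = 0
    for d in range(1, len(domain) + 1):
        if any(shatters(concepts, s) for s in combinations(domain, d)):
            best = d
        else:
            break  # subsets of shattered sets are shattered, so we can stop
    return best

# Illustrative finite class: indicators of intervals [a, b] on a small grid.
domain = [0, 1, 2, 3, 4]
intervals = [lambda x, a=a, b=b: int(a <= x <= b)
             for a in domain for b in domain if a <= b]

d = vc_dimension(intervals, domain)
print(d, log2(len(intervals)))  # VCdim and the log2|C| upper bound
```

The search is exponential in the domain size, so it is only meant as a sanity check on toy examples.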

Page 7: VC Dimension – definition and impossibility result

VC Dimension – Examples

In order to show that the VCdim of a class is $d$ we have to show:

  • $\mathrm{VCdim} \ge d$: find some shattered set of size $d$.
  • $\mathrm{VCdim} < d+1$: show that no set of size $d+1$ is shattered.

Page 8: VC Dimension – definition and impossibility result

VC Dimension – Examples: Half Lines (C1)

The concepts are $c_\theta$ for $\theta \in [0,1]$, over $X = [0,1]$, where:

$$c_\theta(x) = \begin{cases} 0 & x < \theta \\ 1 & x \ge \theta \end{cases}$$

Page 9: VC Dimension – definition and impossibility result

VC Dimension – Examples: Half Lines (C1) Cont.

Claim: $\mathrm{VCdim}(C_1) = 1$.

  • $\mathrm{VCdim}(C_1) \ge 1$: $c_{1/4}(1/2) = 1$ and $c_{3/4}(1/2) = 0$, thus $|\Pi_{C_1}(\{1/2\})| = 2$.
  • $\mathrm{VCdim}(C_1) < 2$: for any set of size 2 there is an assignment which is not in the concept class: for $S = \{x, y\}$ with $x < y$, the assignment which lets $x$ be 1 and $y$ be 0 is impossible.
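
Both halves of the argument can be illustrated numerically. The sketch below reproduces the two witnesses for the point 1/2 and then scans a grid of thresholds for one pair $x < y$. The grid resolution and the chosen pair are arbitrary assumptions, and the scan only illustrates the monotonicity argument rather than proving it.

```python
def half_line(theta):
    """Concept c_theta: label 1 iff x >= theta."""
    return lambda x: int(x >= theta)

# The two witnesses used for the lower bound.
witnesses = (half_line(0.25)(0.5), half_line(0.75)(0.5))
print(witnesses)

# For a pair x < y, collect the labelings produced by a grid of thresholds.
x, y = 0.3, 0.6
thetas = [i / 100 for i in range(101)]
labelings = {(half_line(t)(x), half_line(t)(y)) for t in thetas}
print(sorted(labelings))  # the labeling (1, 0) never appears
```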

Page 10: VC Dimension – definition and impossibility result

VC Dimension – Examples: Linear halfspaces (C2)

The concepts are $c_w$ where, for $w = (w_1, w_2, \theta)$ and $x = (x_1, x_2) \in \mathbb{R}^2$, we let $c_w(x) = 1 \iff w_1 x_1 + w_2 x_2 \ge \theta$. The $c_w$ are lines in the plane, where positive points are above or on the line and negative points are below.

Page 11: VC Dimension – definition and impossibility result

VC Dimension – Examples: Linear halfspaces (C2) Cont.

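Claim: $\mathrm{VCdim}(C_2) = 3$.

  • $\mathrm{VCdim}(C_2) \ge 3$: any three points that are not collinear can be shattered.
  • $\mathrm{VCdim}(C_2) < 4$: no set of four points can be shattered. Either one point lies in the convex hull of the other three (label it 0 and the others 1), or the four points form a convex quadrilateral (label one pair of opposite corners 1 and the other pair 0). Neither labeling is realized by a halfspace.

Generally: half spaces in $\mathbb{R}^d$ have VCdim of $d+1$.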

Page 12: VC Dimension – definition and impossibility result

VC Dimension – Examples: Axis-aligned rectangles in the plane (C3)

Positive examples are points inside the rectangle, and negative examples are points outside the rectangle.

Page 13: VC Dimension – definition and impossibility result

VC Dimension – Examples: Axis-aligned rectangles in the plane (C3)

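Claim: $\mathrm{VCdim}(C_3) = 4$.

  • $\mathrm{VCdim}(C_3) \ge 4$: a set of four points arranged as in the slide's figure can be shattered.

[Figure: a four-point set shattered by axis-aligned rectangles]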

Page 14: VC Dimension – definition and impossibility result

VC Dimension – Examples: Axis-aligned rectangles in the plane (C3)

Claim: $\mathrm{VCdim}(C_3) = 4$.

  • $\mathrm{VCdim}(C_3) < 5$: Given a set of five points in the plane, there must be some point that is neither the extreme left, right, top or bottom point of the five. If we label this non-extremal point negative and the remaining four extremal points positive, no rectangle can satisfy the assignment.
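
Both bounds for rectangles can be checked on concrete point sets. The sketch below relies on the observation that a labeling is realizable by an axis-aligned rectangle exactly when the bounding box of the positive points contains no negative point. The diamond arrangement and the added center point are assumed examples, not necessarily the configuration drawn on the slide.

```python
from itertools import product

def realizable(points, labels):
    """True if some axis-aligned rectangle contains exactly the points labeled 1."""
    pos = [p for p, l in zip(points, labels) if l == 1]
    if not pos:
        return True  # a rectangle placed away from all points labels them 0
    lo_x, hi_x = min(p[0] for p in pos), max(p[0] for p in pos)
    lo_y, hi_y = min(p[1] for p in pos), max(p[1] for p in pos)
    # The bounding box of the positives is the smallest candidate rectangle.
    return not any(
        l == 0 and lo_x <= p[0] <= hi_x and lo_y <= p[1] <= hi_y
        for p, l in zip(points, labels)
    )

def shattered(points):
    return all(realizable(points, labels)
               for labels in product([0, 1], repeat=len(points)))

diamond = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # assumed four-point arrangement
five = diamond + [(0, 0)]                     # adds a non-extremal point

print(shattered(diamond))
print(shattered(five))
```

For the five-point set, the failing labeling is the one from the proof: the center point negative and the four extremal points positive.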

Page 15: VC Dimension – definition and impossibility result

VC Dimension – Examples: A finite union of intervals (C4)

For any set of points we could cover the positive points by choosing the intervals small enough, so $\mathrm{VCdim}(C_4) = \infty$.
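
The covering argument is constructive, and a minimal sketch of it follows. It assumes the points are distinct and picks the interval radius as a third of the smallest gap. That radius, the function names and the sample points are illustrative assumptions.

```python
from itertools import product

def union_of_intervals(intervals):
    """Concept: label 1 iff x falls in at least one of the closed intervals."""
    return lambda x: int(any(a <= x <= b for a, b in intervals))

def cover_positives(points, labels):
    """One small interval around each positive point (points assumed distinct)."""
    ordered = sorted(points)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    r = min(gaps) / 3 if gaps else 1.0
    return [(x - r, x + r) for x, l in zip(points, labels) if l == 1]

points = [0.1, 0.4, 0.5, 0.9, 2.0]
all_realized = all(
    tuple(union_of_intervals(cover_positives(points, b))(x) for x in points) == b
    for b in product([0, 1], repeat=len(points))
)
print(all_realized)
```

The same construction works for any number of distinct points, which is why no finite bound on the VC dimension exists.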

Page 16: VC Dimension – definition and impossibility result

VC Dimension – Examples: Convex Polygons on the plane (C5)

Points inside the convex polygon are positive and outside are negative.

There is no bound on the number of edges.

Claim: $\mathrm{VCdim}(C_5) = \infty$.

Page 17: VC Dimension – definition and impossibility result

VC Dimension – Examples: Convex Polygons on the plane (C5)

Proof: For every labeling of $d$ points on the circle perimeter, there exists $c_t \in C$ that is consistent with the labeling.

This $c_t$ is a polygon which includes all the positive examples and none of the negative. Thus the group of points is shattered.

This holds for every $d$, and so $\mathrm{VCdim}(C_5) = \infty$.
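
The circle argument can be verified for small $d$. The sketch below places $d$ equally spaced points on the unit circle, takes the positive points as polygon vertices, and checks that no negative point falls inside. The equal spacing, the tolerance, and the handling of fewer than three positive points (treated as containing only those points, standing in for a sufficiently thin polygon) are assumptions of this sketch.

```python
from itertools import product
from math import cos, sin, pi

def cross(o, a, b):
    """Cross product of (a - o) and (b - o); positive if b is left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def inside_convex(polygon, q, tol=1e-9):
    """q inside a convex polygon whose vertices are in counter-clockwise order."""
    if len(polygon) < 3:
        return q in polygon  # degenerate case: only the vertices themselves
    n = len(polygon)
    return all(cross(polygon[i], polygon[(i + 1) % n], q) >= -tol
               for i in range(n))

def shattered_on_circle(d):
    pts = [(cos(2 * pi * k / d), sin(2 * pi * k / d)) for k in range(d)]
    for labels in product([0, 1], repeat=d):
        polygon = [p for p, l in zip(pts, labels) if l == 1]
        negatives = [p for p, l in zip(pts, labels) if l == 0]
        if any(inside_convex(polygon, q) for q in negatives):
            return False
    return True

results = [shattered_on_circle(d) for d in range(1, 9)]
print(results)
```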

Page 18: VC Dimension – definition and impossibility result

Sample Size Lower Bounds

Goal: we want to show that for a concept class with a finite VCdim $d$ there is a function $m$ of $\epsilon$, $\delta$ and $d$ such that if we sample less than $m(\epsilon, \delta, d)$ points, any PAC learning algorithm would fail.

Theorem: If a concept class $C$ has VCdim $d+1$ then:

$$m(\epsilon, \delta, d) \ge \frac{d}{16\epsilon} = \Omega\left(\frac{d}{\epsilon}\right)$$

Page 19: VC Dimension – definition and impossibility result

Sample Size Lower Bounds - Proof

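For contradiction: let $T = \{z_0, z_1, \ldots, z_d\}$ such that $C$ shatters $T$ (possible because $\mathrm{VCdim}(C) = d+1$).

Let $D(x)$ be:

$$D(x) = \begin{cases} 1 - 8\epsilon & x = z_0 \\ \frac{8\epsilon}{d} & x = z_i,\ 1 \le i \le d \\ 0 & \text{otherwise} \end{cases}$$

Choose $c_t(x)$ randomly so that it's:

$$c_t(x) = \begin{cases} 1 & x = z_0 \\ 0/1 \text{ (with probability } 0.5\text{)} & x = z_i,\ 1 \le i \le d \\ 0 & \text{otherwise} \end{cases}$$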

Page 20: VC Dimension – definition and impossibility result

Sample Size Lower Bounds – Proof Cont.

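$c_t(x)$ is in $C$ because $C$ shatters $T$.

Claim: if we sample less than $d/2$ points out of $\{z_1, \ldots, z_d\}$ then the error is at least $2\epsilon$.

Proof: Let RARE be $\{z_1, \ldots, z_d\}$. A point of RARE that was not sampled is mislabeled with probability $1/2$, because $c_t$ is chosen at random there, so:

$$\Pr[\mathrm{ERROR}] = \Pr[\mathrm{RARE}] \cdot \Pr[\mathrm{UNSEEN} \mid \mathrm{RARE}] \cdot \frac{1}{2} \ge 8\epsilon \cdot \frac{1}{2} \cdot \frac{1}{2} = 2\epsilon$$

  • Sample size: the expected number of points we sample from RARE is at most $8\epsilon m = d/2$ (for $m = d/(16\epsilon)$).
  • Error: This implies that with probability of at least 0.5 we sample at most $d/2$ points of RARE and thus have error of at least $2\epsilon$.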
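
The arithmetic of this construction can be checked exactly with rational numbers. The sketch below builds the distribution from the proof for assumed values $d = 10$ and $\epsilon = 1/100$, taking $m = d/(16\epsilon)$. The function name is hypothetical.

```python
from fractions import Fraction

def hard_distribution(d, eps):
    """Mass 1 - 8*eps on z_0 and 8*eps/d on each of z_1, ..., z_d."""
    return [1 - 8 * eps] + [8 * eps / d] * d

d, eps = 10, Fraction(1, 100)   # illustrative values
D = hard_distribution(d, eps)
rare_mass = sum(D[1:])          # Pr[RARE] = 8 * eps

m = d / (16 * eps)              # sample size d / (16 * eps)
expected_rare = m * rare_mass   # expected number of samples landing in RARE

# If at most d/2 points of RARE were seen, at least half of RARE's mass is
# unseen, and a random target makes any hypothesis wrong there w.p. 1/2.
error_bound = rare_mass * Fraction(1, 2) * Fraction(1, 2)

print(sum(D), expected_rare, error_bound)
```

With these values the total mass is 1, the expected number of RARE samples is $d/2$, and the error bound equals $2\epsilon$.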

Page 21: VC Dimension – definition and impossibility result

VC Dimension – Examples: Parity (C6)

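Let $X = \{0,1\}^n$. The concept class is $\chi_S(x) = \bigoplus_{i \in S} x_i$ where $S \subseteq \{1, \ldots, n\}$.

Claim: $\mathrm{VCdim}(C_6) = n$.

  • $\mathrm{VCdim}(C_6) \ge n$: Let $e_i = 0 \ldots 010 \ldots 0$ (a single 1 in position $i$). For any bits assignment $b_1, \ldots, b_n$ for the vectors $e_1, \ldots, e_n$ we choose the set $S = \{ i \mid b_i = 1 \}$. We get

$$\chi_S(e_j) = \begin{cases} 0 & j \notin S \\ 1 & j \in S \end{cases}$$

    and so $e_1, \ldots, e_n$ is shattered.
  • $\mathrm{VCdim}(C_6) \le n$: There are $2^n$ parity functions, thus $\mathrm{VCdim}(C_6) \le \log_2 2^n = n$.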
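
The lower-bound construction for parity is easy to check by enumeration. The sketch below uses 0-based indices (an implementation choice, unlike the 1-based indices in the text) and $n = 4$ as an assumed example size.

```python
from itertools import product

def parity(S):
    """chi_S(x): XOR of the bits of x at the (0-based) positions in S."""
    return lambda x: sum(x[i] for i in S) % 2

n = 4
unit_vectors = [tuple(int(i == j) for i in range(n)) for j in range(n)]

# For each target labeling b, the choice S = {i : b_i = 1} reproduces b.
all_realized = all(
    tuple(parity([i for i in range(n) if b[i] == 1])(e)
          for e in unit_vectors) == b
    for b in product([0, 1], repeat=n)
)
print(all_realized)
```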

Page 22: VC Dimension – definition and impossibility result

VC Dimension – Examples: OR of n literals (C7)

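Let $X = \{0,1\}^n$ and $S, S' \subseteq \{1, \ldots, n\}$. The concept class is $c_{S,S'}(x) = \left(\bigvee_{i \in S} x_i\right) \vee \left(\bigvee_{i \in S'} \bar{x}_i\right)$.

Claim: $\mathrm{VCdim}(C_7) = n$.

  • $\mathrm{VCdim}(C_7) \ge n$: use $n$ unit vectors (see prev. proof).
  • $\mathrm{VCdim}(C_7) < n+1$: Use ELIM algorithm to show $\mathrm{VCdim}(C_7) \le n$. Show the $(n+1)$-th vector cannot be assigned 1, thus no set of $(n+1)$ vectors can be shattered.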

Page 23: VC Dimension – definition and impossibility result

Radon Theorem

Definitions:

  • Convex Set: $A$ is convex if for every $x, y \in A$ the line segment connecting $x, y$ is in $A$.
  • Convex Hull: The convex hull of $S$ is the smallest convex set which contains all the points of $S$. We denote it as $\mathrm{conv}(S)$.

Theorem (Radon): Let $E$ be a set of $d+2$ points in $\mathbb{R}^d$. There is a subset $S$ of $E$ such that $\mathrm{conv}(S) \cap \mathrm{conv}(E \setminus S) \ne \emptyset$.
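
A Radon partition can be computed explicitly. The sketch below follows the standard argument, which is not spelled out on the slide: $d+2$ points in $\mathbb{R}^d$ admit a nonzero solution $a$ of $\sum_i a_i x_i = 0$, $\sum_i a_i = 0$, and the indices with $a_i > 0$ form $S$. The helper names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def null_vector(rows, n_cols):
    """One nonzero solution a of rows * a = 0 (assumes fewer rows than columns)."""
    A = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for c in range(n_cols):
        p = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        A[r] = [v / A[r][c] for v in A[r]]          # normalize the pivot row
        for i in range(len(A)):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [vi - f * vr for vi, vr in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == len(A):
            break
    free = next(c for c in range(n_cols) if c not in pivots)
    a = [Fraction(0)] * n_cols
    a[free] = Fraction(1)                           # set one free variable to 1
    for row, c in zip(A, pivots):
        a[c] = -row[free]
    return a

def radon_partition(points):
    """Indices of S and a point lying in both conv(S) and conv(E \\ S)."""
    d = len(points[0])
    assert len(points) == d + 2
    rows = [[p[k] for p in points] for k in range(d)] + [[1] * len(points)]
    a = null_vector(rows, len(points))
    S = [i for i, ai in enumerate(a) if ai > 0]
    total = sum(a[i] for i in S)
    common = tuple(sum(a[i] * points[i][k] for i in S) / total for k in range(d))
    return S, common

square = [(0, 0), (1, 0), (0, 1), (1, 1)]
S, common = radon_partition(square)
print(S, common)
```

For the unit square the partition separates the two diagonals, and the common point is the center where they cross.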

Page 24: VC Dimension – definition and impossibility result

VC Dimension – Examples: Hyper-Planes (C8)

The concept class assigns 1 to a point if it’s above or on a corresponding hyper-plane, 0 otherwise.

Claim: $\mathrm{VCdim}(C_8) = n+1$.

  • $\mathrm{VCdim}(C_8) \ge n+1$: use $n$ unit vectors and the zero vector to form a set of size $n+1$ that can be shattered.
  • $\mathrm{VCdim}(C_8) < n+2$: use Radon theorem (next page).
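
The lower bound can be made concrete with an explicit halfspace for each labeling. In the sketch below a concept labels $x$ with 1 iff $w \cdot x \ge \theta$. The particular choice of $w$ and $\theta$ in `witness` is one possible construction assumed for illustration, not necessarily the one intended in the lecture.

```python
from itertools import product

def halfspace(w, theta):
    """Label 1 iff the point is on or above the hyperplane w . x = theta."""
    return lambda x: int(sum(wi * xi for wi, xi in zip(w, x)) >= theta)

n = 3
zero = tuple([0] * n)
unit_vectors = [tuple(int(i == j) for i in range(n)) for j in range(n)]
points = [zero] + unit_vectors

def witness(labels):
    """A halfspace realizing `labels` on (zero vector, e_1, ..., e_n)."""
    b0, rest = labels[0], labels[1:]
    w = [1 if b else -1 for b in rest]   # w . e_i is +1 or -1
    theta = -0.5 if b0 else 0.5          # w . 0 = 0 is compared against theta
    return halfspace(w, theta)

all_realized = all(
    tuple(witness(b)(p) for p in points) == b
    for b in product([0, 1], repeat=n + 1)
)
print(all_realized)
```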

Page 25: VC Dimension – definition and impossibility result

VC Dimension – Examples: Hyper-Planes (C8) Cont.

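Assume a set $E$ of $d+2$ points can be shattered. Use Radon Theorem to find $S$ such that $\mathrm{conv}(S) \cap \mathrm{conv}(E \setminus S) \ne \emptyset$.

Assume there is a separating hyper-plane that classifies points in $S$ as '1' and points not in $S$ as '0'.

No way to classify points in $\mathrm{conv}(S) \cap \mathrm{conv}(E \setminus S)$: both sides of a hyper-plane are convex, so such a point would have to be labeled both 1 and 0.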