3.6 Support Vector Machines

K. M. Koo


Goal of SVM: Find Maximum Margin

goal: find a separating hyperplane with maximum margin

margin: minimum distance between a separating hyperplane and the sets of $\omega_1$ or $\omega_2$


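Goal of SVM: Find Maximum Margin

assume that $\omega_1, \omega_2$ are linearly separable

find separating hyperplane with maximum margin

[Figure: two linearly separable classes $\omega_1$ and $\omega_2$, with the margin marked around the separating hyperplane]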


Calculate margin

separating hyperplane

$$g(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + w_0 = 0$$

$\mathbf{w}$ and $w_0$ are not uniquely determined

under the constraint $\min_{\mathbf{x}} |\mathbf{w}^T\mathbf{x} + w_0| = 1$, $\mathbf{w}$ and $w_0$ are uniquely determined


Calculate margin

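distance between a point $\mathbf{x}$ and the hyperplane $g(\mathbf{x}) = 0$ is given by

$$\frac{|\mathbf{w}^T\mathbf{x} + w_0|}{\|\mathbf{w}\|}$$

thus, the margin is given by

$$\min_{\mathbf{x}} \frac{|\mathbf{w}^T\mathbf{x} + w_0|}{\|\mathbf{w}\|} = \frac{\min_{\mathbf{x}} |\mathbf{w}^T\mathbf{x} + w_0|}{\|\mathbf{w}\|} = \frac{1}{\|\mathbf{w}\|}$$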
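The distance and margin formulas can be checked numerically. The following is a minimal sketch in plain Python; the function names and the four sample points are illustrative assumptions, not part of the slides.

```python
import math

def distance_to_hyperplane(w, w0, x):
    """Distance |w^T x + w0| / ||w|| from point x to the hyperplane g(x) = 0."""
    g = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return abs(g) / math.sqrt(sum(wi * wi for wi in w))

def margin(w, w0, points):
    """Smallest distance from any of the given points to the hyperplane."""
    return min(distance_to_hyperplane(w, w0, x) for x in points)

# Illustrative data (assumed for this sketch).
points = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
w, w0 = (1.0, 0.0), 0.0  # scaled so that min |w^T x + w0| = 1

print(margin(w, w0, points))                # 1 / ||w||
print(margin((2.0, 0.0), 0.0, points))      # same hyperplane, different scaling
```

Both calls describe the same geometric hyperplane, so they return the same distance; only under the normalization $\min_{\mathbf{x}} |\mathbf{w}^T\mathbf{x} + w_0| = 1$ does that distance equal $1/\|\mathbf{w}\|$.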


Optimization of margin

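maximization of margin

$$\frac{1}{\|\mathbf{w}\|} + \frac{1}{\|\mathbf{w}\|} = \frac{2}{\|\mathbf{w}\|}$$

requiring that

$$\mathbf{w}^T\mathbf{x} + w_0 \ge 1, \quad \forall \mathbf{x} \in \omega_1$$

$$\mathbf{w}^T\mathbf{x} + w_0 \le -1, \quad \forall \mathbf{x} \in \omega_2$$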


Optimization of margin

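therefore, we want to

$$\text{minimize} \quad J(\mathbf{w}) = \frac{1}{2}\|\mathbf{w}\|^2$$

$$\text{subject to} \quad y_i(\mathbf{w}^T\mathbf{x}_i + w_0) \ge 1, \quad i = 1, 2, \ldots, N$$

where $y_i = +1$ for $\mathbf{x}_i \in \omega_1$ and $y_i = -1$ for $\mathbf{x}_i \in \omega_2$

separating hyperplane with maximal margin = separating hyperplane with minimum $\|\mathbf{w}\|$

This is an optimization problem with inequality constraints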
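The step from "maximal margin" to "minimum norm" can be spelled out as a short chain of equivalences. The sketch below assumes the usual label convention $y_i = +1$ for $\omega_1$ and $y_i = -1$ for $\omega_2$, which lets the two class-wise constraints collapse into one.

```latex
\max_{\mathbf{w}, w_0} \frac{2}{\|\mathbf{w}\|}
\;\Longleftrightarrow\;
\min_{\mathbf{w}, w_0} \|\mathbf{w}\|
\;\Longleftrightarrow\;
\min_{\mathbf{w}, w_0} \tfrac{1}{2}\|\mathbf{w}\|^2

\left.
\begin{aligned}
\mathbf{w}^T\mathbf{x}_i + w_0 &\ge +1 && (y_i = +1)\\
\mathbf{w}^T\mathbf{x}_i + w_0 &\le -1 && (y_i = -1)
\end{aligned}
\right\}
\;\Longleftrightarrow\;
y_i(\mathbf{w}^T\mathbf{x}_i + w_0) \ge 1
```

Squaring the norm and adding the factor $\tfrac{1}{2}$ does not move the minimizer; it only makes the cost differentiable with gradient $\mathbf{w}$.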


optimization with constraints

[Figure: two panels in the $(\theta_1, \theta_2)$ plane, "optimization with equality constraints" and "optimization with inequality constraints", each showing the cost function $J(\boldsymbol{\theta})$, the constraint, and the constrained min-value]


Lagrange Multiplier

optimization problem under constraints can be solved by the method of Lagrange Multipliers

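let $f, g : U \to \mathbb{R}$, for $U \subset \mathbb{R}^n$, be real valued functions, let $\mathbf{x}_0 \in U$ and $g(\mathbf{x}_0) = c$, and let $S = g^{-1}(c)$, the level set for $g$ with value $c$. assume $\nabla g(\mathbf{x}_0) \ne 0$. if $f|_S$ has a local minimum or maximum on $S$ at $\mathbf{x}_0$, which is called a critical point of $f|_S$, then there is a real number $\lambda$, called a Lagrange multiplier, such that

$$\nabla f(\mathbf{x}_0) = \lambda \nabla g(\mathbf{x}_0)$$

equivalently, $\nabla L(\mathbf{x}, \lambda) = 0$ at $\mathbf{x}_0$ for the Lagrangian $L(\mathbf{x}, \lambda) = f(\mathbf{x}) - \lambda (g(\mathbf{x}) - c)$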


The Method of Lagrange Multiplier

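$$\text{minimize} \quad J(\boldsymbol{\theta})$$

$$\text{subject to} \quad \mathbf{a}^T\boldsymbol{\theta} = b$$

at the constrained minimum $\boldsymbol{\theta}_*$

$$\frac{\partial}{\partial \boldsymbol{\theta}} J(\boldsymbol{\theta}_*) = \lambda \mathbf{a}$$

[Figure: contours $J(\boldsymbol{\theta}) = c_1$, $J(\boldsymbol{\theta}) = c_2$, $J(\boldsymbol{\theta}) = c_3$ in the $(\theta_1, \theta_2)$ plane, with the constraint line $\mathbf{a}^T\boldsymbol{\theta} = b$ and its normal vector $\mathbf{a}$]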
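As a concrete instance of the method, take the cost $J(\boldsymbol{\theta}) = \tfrac{1}{2}\|\boldsymbol{\theta}\|^2$ (an assumption chosen to mirror the SVM cost; the slide leaves $J$ generic) with one linear equality constraint. The stationarity condition then gives $\boldsymbol{\theta} = \lambda \mathbf{a}$, and the constraint fixes $\lambda$. The helper name and the numbers are hypothetical.

```python
def solve_equality_constrained(a, b):
    """Minimize J(theta) = 0.5 * ||theta||^2 subject to a^T theta = b.

    Stationarity: grad J(theta) = theta = lam * a.
    Substituting into the constraint: lam * (a^T a) = b.
    """
    lam = b / sum(ai * ai for ai in a)
    theta = [lam * ai for ai in a]
    return theta, lam

# Illustrative values (assumed for this sketch).
a, b = [3.0, 4.0], 5.0
theta, lam = solve_equality_constrained(a, b)

# The gradient of J at the solution is theta itself, parallel to a.
print(theta, lam)
```

At the solution the gradient of the cost is a multiple of the constraint normal $\mathbf{a}$, which is the tangency picture the contour figure on this slide illustrates.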


Lagrange Multiplier

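Lagrangian is obtained as follows:

for equality constraints

$$L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = J(\boldsymbol{\theta}) - \sum_{i=1}^{m} \lambda_i (\mathbf{a}_i^T\boldsymbol{\theta} - b_i)$$

for inequality constraints

$$L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = J(\boldsymbol{\theta}) - \sum_{i=1}^{m} \lambda_i (\mathbf{a}_i^T\boldsymbol{\theta} - b_i), \quad \lambda_i \ge 0, \quad i = 1, \ldots, m$$

in our case (inequality constraints)

$$L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{N} \lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1], \quad \lambda_i \ge 0, \quad i = 1, 2, \ldots, N$$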


Convex

a subset $C \subseteq X$ is convex iff for any $x, y \in C$, the line segment joining $x$ and $y$ is also a subset of $C$, i.e. for any $t \in [0, 1]$,

$$t x + (1 - t) y \in C$$

a real-valued function $f$ on $C$ is convex iff for any two points $x, y \in C$ and for any $t \in [0, 1]$,

$$f(t x + (1 - t) y) \le t f(x) + (1 - t) f(y)$$
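The defining inequality of a convex function can be probed on random samples. The sketch below is only a falsification test (it can expose non-convexity but cannot prove convexity); the function names, the sampling box and the tolerance are assumptions made for illustration.

```python
import random

def is_convex_on_samples(f, sample, trials=1000, tol=1e-9, seed=0):
    """Return False if f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y) is violated
    on any sampled triple (x, y, t); True if no violation was found."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = sample(rng), sample(rng)
        t = rng.random()
        z = [t * xi + (1 - t) * yi for xi, yi in zip(x, y)]
        if f(z) > t * f(x) + (1 - t) * f(y) + tol:
            return False
    return True

def half_sq_norm(w):
    """The SVM cost J(w) = 0.5 * ||w||^2."""
    return 0.5 * sum(wi * wi for wi in w)

def neg_sq_norm(w):
    """A concave function, used as a counterexample."""
    return -sum(wi * wi for wi in w)

def sample_2d(rng):
    """Draw a point uniformly from the box [-5, 5] x [-5, 5]."""
    return [rng.uniform(-5, 5), rng.uniform(-5, 5)]

print(is_convex_on_samples(half_sq_norm, sample_2d))
print(is_convex_on_samples(neg_sq_norm, sample_2d))
```

The first function is the SVM cost, for which no violation is found; the second is concave, so a violating triple turns up quickly.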


Convex

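[Figure: a convex set and a non-convex ("concave") set, each with two points $x$, $y$; plots of $f(x)$ for a convex function, a concave function, and a function that is neither convex nor concave, with $f(x_1)$ and $f(x_2)$ marked at the points $x_1$, $x_2$]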


Convex Optimization

an optimization problem is said to be convex iff the cost function as well as the constraints are convex

the optimization problem for SVM is convex

the solution to a convex problem with a strictly convex cost function, if it exists, is unique. that is, there is no local optimum other than the global one!

for a convex optimization problem with linear constraints, the KKT (Karush-Kuhn-Tucker) conditions are necessary and sufficient for the solution


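KKT (Karush-Kuhn-Tucker) condition

1. The gradient of the Lagrangian with respect to the original variables is 0

$$\frac{\partial}{\partial \mathbf{w}} L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \mathbf{0}, \qquad \frac{\partial}{\partial w_0} L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = 0$$

2. The original constraints are satisfied

$$y_i(\mathbf{w}^T\mathbf{x}_i + w_0) \ge 1, \quad i = 1, 2, \ldots, N$$

3. Multipliers for inequality constraints are non-negative

$$\lambda_i \ge 0, \quad i = 1, 2, \ldots, N$$

4. (Complementary KKT) the product of each multiplier and its constraint equals 0

$$\lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1] = 0, \quad i = 1, 2, \ldots, N$$

for convex optimization problems, 1-4 are necessary and sufficient for the solution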


KKT condition for the optimization of margin

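recall

$$\text{minimize} \quad J(\mathbf{w}) = \frac{1}{2}\|\mathbf{w}\|^2$$

$$\text{subject to} \quad y_i(\mathbf{w}^T\mathbf{x}_i + w_0) \ge 1, \quad i = 1, 2, \ldots, N$$

KKT condition

$$\frac{\partial}{\partial \mathbf{w}} L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \mathbf{0} \tag{3.62}$$

$$\frac{\partial}{\partial w_0} L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = 0 \tag{3.63}$$

$$\lambda_i \ge 0, \quad i = 1, 2, \ldots, N \tag{3.64}$$

$$\lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1] = 0, \quad i = 1, 2, \ldots, N \tag{3.65}$$

where

$$L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_{i=1}^{N} \lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1] \tag{3.66}$$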


KKT condition for the optimization of margin

Combining (3.66) with (3.62) and (3.63)

$$\mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i \tag{3.67}$$

$$\sum_{i=1}^{N} \lambda_i y_i = 0 \tag{3.68}$$
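The two results follow by differentiating the Lagrangian $L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \tfrac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_i \lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1]$ term by term. A worked version of that step, using only the symbols already on the slides:

```latex
\frac{\partial L}{\partial \mathbf{w}}
  = \mathbf{w} - \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i = \mathbf{0}
  \;\Longrightarrow\;
  \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i

\frac{\partial L}{\partial w_0}
  = -\sum_{i=1}^{N} \lambda_i y_i = 0
  \;\Longrightarrow\;
  \sum_{i=1}^{N} \lambda_i y_i = 0
```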


Remarks-support vector

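$\mathbf{w}$ of the optimal solution is a linear combination of $N_s \le N$ feature vectors which are associated with $\lambda_i \ne 0$

$$\mathbf{w} = \sum_{i=1}^{N_s} \lambda_i y_i \mathbf{x}_i$$

support vectors $\mathbf{x}_i$ are associated with $\lambda_i \ne 0$; by the complementary condition

$$\lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1] = 0, \quad i = 1, 2, \ldots, N$$

they lie on one of the two hyperplanes $\mathbf{w}^T\mathbf{x} + w_0 = \pm 1$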


Remarks-support vector

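[Figure: the separating hyperplane $\mathbf{w}^T\mathbf{x} + w_0 = 0$ and the two margin hyperplanes $\mathbf{w}^T\mathbf{x} + w_0 = 1$ and $\mathbf{w}^T\mathbf{x} + w_0 = -1$, with points labeled "support vector" ($\lambda_i \ne 0$) and "non-support vector" ($\lambda_i = 0$)]

The resulting hyperplane classifier is insensitive to the number and position of non-support vectors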


Remark-computation of $w_0$

$w_0$ can be implicitly obtained from any of the conditions satisfying strict complementarity (i.e. $\lambda_i \ne 0$)

In practice, $w_0$ is computed as an average value obtained using all conditions of the type

$$\lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1] = 0, \quad \lambda_i \ne 0$$
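The averaging can be sketched as follows. For a support vector the bracket must vanish, so $y_i(\mathbf{w}^T\mathbf{x}_i + w_0) = 1$, and because $y_i \in \{+1, -1\}$ this gives $w_0 = y_i - \mathbf{w}^T\mathbf{x}_i$. The function name, the threshold `tol` used to decide which multipliers count as nonzero, and the sample data are assumptions for illustration.

```python
def compute_w0(w, xs, ys, lambdas, tol=1e-8):
    """Average the estimates w0 = y_i - w^T x_i over all support vectors,
    i.e. over the points whose multiplier lambda_i exceeds tol."""
    estimates = [
        y - sum(wi * xi for wi, xi in zip(w, x))
        for x, y, lam in zip(xs, ys, lambdas)
        if lam > tol
    ]
    if not estimates:
        raise ValueError("no support vectors: all multipliers are zero")
    return sum(estimates) / len(estimates)

# Illustrative data (assumed): four points, labels, and a set of multipliers.
xs = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
ys = [1, 1, -1, -1]
w = (1.0, 0.0)
w0 = compute_w0(w, xs, ys, [0.25, 0.25, 0.25, 0.25])
print(w0)
```

Averaging over every support vector, rather than picking one, reduces the effect of numerical error in the multipliers returned by a solver.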


Remark-optimal hyperplane is unique

the optimal hyperplane classifier of a support vector machine is unique under two conditions

the cost function is convex

the inequality constraints consist of linear functions, so the constraints are convex

an optimization problem is said to be convex iff the target (or cost) function as well as the constraints are convex (the optimization problem for SVM is convex)

the solution to a convex problem with a strictly convex cost function, if it exists, is unique. that is, there is no local optimum other than the global one!


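Computation of the optimal Lagrange multipliers

the optimization problem belongs to the convex programming family of problems (convex optimization problems)

It can be solved by considering the so-called Lagrangian duality and can be stated equivalently by its Wolfe dual representation form

Lagrangian duality

$$\min_{\boldsymbol{\theta}} J(\boldsymbol{\theta}) = \min_{\boldsymbol{\theta}} \max_{\boldsymbol{\lambda} \ge 0} L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = \max_{\boldsymbol{\lambda} \ge 0} \min_{\boldsymbol{\theta}} L(\boldsymbol{\theta}, \boldsymbol{\lambda})$$

Wolfe dual representation

$$\max_{\boldsymbol{\lambda} \ge 0} L(\boldsymbol{\theta}, \boldsymbol{\lambda})$$

$$\text{subject to} \quad \frac{\partial}{\partial \boldsymbol{\theta}} L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = \mathbf{0}$$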


Wolfe dual representation form

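$$\text{maximize} \quad L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_{i=1}^{N} \lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1]$$

$$\text{subject to} \quad \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i$$

$$\sum_{i=1}^{N} \lambda_i y_i = 0$$

$$\boldsymbol{\lambda} \ge \mathbf{0}$$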


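Computation of the optimal Lagrange multipliers

the optimal Lagrange multipliers solve

$$\max_{\boldsymbol{\lambda}} \left( \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j \mathbf{x}_i^T\mathbf{x}_j \right) \tag{3.75}$$

$$\text{subject to} \quad \sum_{i=1}^{N} \lambda_i y_i = 0, \quad \boldsymbol{\lambda} \ge \mathbf{0} \tag{3.76}$$

once the optimal Lagrange multipliers have been computed, the optimal hyperplane is obtained via (3.67), $\mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i$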
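To make the dual concrete, the sketch below evaluates the dual cost $\sum_i \lambda_i - \tfrac{1}{2}\sum_{i,j} \lambda_i \lambda_j y_i y_j \mathbf{x}_i^T\mathbf{x}_j$ for a given multiplier vector and recovers $\mathbf{w} = \sum_i \lambda_i y_i \mathbf{x}_i$. It does not solve the maximization; the function names, the data and the multiplier values are assumptions for illustration.

```python
def dual_objective(lambdas, xs, ys):
    """sum_i lambda_i - 0.5 * sum_{i,j} lambda_i lambda_j y_i y_j x_i^T x_j."""
    n = len(xs)
    quad = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(xs[i], xs[j]))
            quad += lambdas[i] * lambdas[j] * ys[i] * ys[j] * dot
    return sum(lambdas) - 0.5 * quad

def recover_w(lambdas, xs, ys):
    """w = sum_i lambda_i y_i x_i, computed component by component."""
    dim = len(xs[0])
    return [sum(l * y * x[d] for l, y, x in zip(lambdas, ys, xs))
            for d in range(dim)]

# Illustrative data (assumed): feasible multipliers, since sum_i lambda_i y_i = 0.
xs = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
ys = [1, 1, -1, -1]
lambdas = [0.25, 0.25, 0.25, 0.25]

w = recover_w(lambdas, xs, ys)
primal = 0.5 * sum(wi * wi for wi in w)
print(w, dual_objective(lambdas, xs, ys), primal)
```

Note that the data enter only through the inner products $\mathbf{x}_i^T\mathbf{x}_j$. For these multipliers the dual value equals the primal cost $\tfrac{1}{2}\|\mathbf{w}\|^2$, which is what one expects at the optimum of a convex problem.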


Remarks

the cost function does not depend explicitly on the dimensionality of the input space; this allows for efficient generalizations in the case of nonlinearly separable classes

although the resulting optimal hyperplane is unique, there is no guarantee about the uniqueness of the Lagrange multipliers


Simple example

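consider the two-class classification task that consists of the following points

$$\omega_1: [1, 1]^T, [1, -1]^T$$

$$\omega_2: [-1, 1]^T, [-1, -1]^T$$

its Lagrangian function, with $\mathbf{w} = [w_1, w_2]^T$

$$L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_{i=1}^{N} \lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1]$$

$$= \frac{w_1^2 + w_2^2}{2} - \lambda_1(w_1 + w_2 + w_0 - 1) - \lambda_2(w_1 - w_2 + w_0 - 1) - \lambda_3(w_1 - w_2 - w_0 - 1) - \lambda_4(w_1 + w_2 - w_0 - 1)$$

KKT condition

$$\frac{\partial L}{\partial w_1} = 0 \;\Rightarrow\; w_1 = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4$$

$$\frac{\partial L}{\partial w_2} = 0 \;\Rightarrow\; w_2 = \lambda_1 - \lambda_2 - \lambda_3 + \lambda_4$$

$$\frac{\partial L}{\partial w_0} = 0 \;\Rightarrow\; \lambda_1 + \lambda_2 - \lambda_3 - \lambda_4 = 0$$

$$\lambda_1(w_1 + w_2 + w_0 - 1) = 0$$

$$\lambda_2(w_1 - w_2 + w_0 - 1) = 0$$

$$\lambda_3(w_1 - w_2 - w_0 - 1) = 0$$

$$\lambda_4(w_1 + w_2 - w_0 - 1) = 0$$

$$\lambda_1, \lambda_2, \lambda_3, \lambda_4 \ge 0$$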


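Simple example: Lagrangian duality

$$\max_{\boldsymbol{\lambda}} \left( \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 - \lambda_1^2 - \lambda_2^2 - \lambda_3^2 - \lambda_4^2 - 2\lambda_1\lambda_4 - 2\lambda_2\lambda_3 \right)$$

$$\text{subject to} \quad \lambda_1 + \lambda_2 - \lambda_3 - \lambda_4 = 0, \quad \lambda_1, \lambda_2, \lambda_3, \lambda_4 \ge 0$$

optimize with equality constraint: setting the derivatives with respect to the $\lambda_i$ to zero (the multiplier of the equality constraint turns out to be zero) gives

$$2(\lambda_1 + \lambda_4) = 1, \qquad 2(\lambda_2 + \lambda_3) = 1$$

result: more than one solution for $\boldsymbol{\lambda}$, all of which give the same hyperplane

$$\mathbf{w} = [w_1, w_2]^T = [1, 0]^T, \quad w_0 = 0$$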
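The non-uniqueness of the multipliers can be checked directly. The sketch below takes the four points $[1,1]^T, [1,-1]^T$ (label $+1$) and $[-1,1]^T, [-1,-1]^T$ (label $-1$), tests two different multiplier vectors against the KKT conditions, and confirms that both yield the same $\mathbf{w}$. The helper name `kkt_report`, the two candidate vectors and the tolerance are assumptions made for illustration.

```python
def kkt_report(lambdas, xs, ys, w0, tol=1e-9):
    """Check the KKT conditions of the hard-margin SVM for given multipliers."""
    dim = len(xs[0])
    # w = sum_i lambda_i y_i x_i
    w = [sum(l * y * x[d] for l, y, x in zip(lambdas, ys, xs)) for d in range(dim)]
    # y_i (w^T x_i + w0) for every training point
    margins = [y * (sum(wi * xi for wi, xi in zip(w, x)) + w0)
               for x, y in zip(xs, ys)]
    return {
        "w": w,
        "sum_lambda_y_is_zero": abs(sum(l * y for l, y in zip(lambdas, ys))) <= tol,
        "constraints_satisfied": all(m >= 1 - tol for m in margins),
        "multipliers_nonnegative": all(l >= -tol for l in lambdas),
        "complementary": all(abs(l * (m - 1)) <= tol
                             for l, m in zip(lambdas, margins)),
    }

xs = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
ys = [1, 1, -1, -1]

# Two hypothetical candidates with lambda1 + lambda4 = lambda2 + lambda3 = 1/2.
report_a = kkt_report([0.25, 0.25, 0.25, 0.25], xs, ys, w0=0.0)
report_b = kkt_report([0.5, 0.0, 0.5, 0.0], xs, ys, w0=0.0)
print(report_a)
print(report_b)
```

Both candidates pass every check and produce the same weight vector, which illustrates that the hyperplane is unique even when the multipliers are not.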


SVM for Non-separable Classes

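in the non-separable case, the training feature vectors belong to one of the following three categories

vectors that fall outside the band and are correctly classified

$$y_i(\mathbf{w}^T\mathbf{x}_i + w_0) \ge 1$$

vectors that fall inside the band and are correctly classified

$$0 \le y_i(\mathbf{w}^T\mathbf{x}_i + w_0) < 1$$

vectors that are misclassified

$$y_i(\mathbf{w}^T\mathbf{x}_i + w_0) < 0$$

[Figure: the hyperplanes $\mathbf{w}^T\mathbf{x} + w_0 = 1$, $\mathbf{w}^T\mathbf{x} + w_0 = 0$ and $\mathbf{w}^T\mathbf{x} + w_0 = -1$, with training points from the three categories]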


SVM for Non-separable Classes

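All three cases can be treated under a single type of constraint, by introducing slack variables $\xi_i \ge 0$

$$y_i[\mathbf{w}^T\mathbf{x}_i + w_0] \ge 1 - \xi_i$$

first category: $\xi_i = 0$

second category: $0 < \xi_i \le 1$

third category: $\xi_i > 1$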
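For a fixed hyperplane, the smallest slack that satisfies $y_i[\mathbf{w}^T\mathbf{x}_i + w_0] \ge 1 - \xi_i$ with $\xi_i \ge 0$ is $\xi_i = \max(0, 1 - y_i(\mathbf{w}^T\mathbf{x}_i + w_0))$, and its value identifies the category of the point. A minimal sketch, with function names, category labels and sample points chosen here for illustration:

```python
def slack(w, w0, x, y):
    """Smallest xi >= 0 such that y * (w^T x + w0) >= 1 - xi."""
    g = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return max(0.0, 1.0 - y * g)

def category(xi):
    """Map a slack value to one of the three categories."""
    if xi == 0.0:
        return "outside the band, correctly classified"
    if xi <= 1.0:
        return "inside the band, correctly classified"
    return "misclassified"

# Illustrative hyperplane and points (assumed); every point has label +1.
w, w0 = (1.0, 0.0), 0.0
for x in [(2.0, 0.0), (0.5, 0.0), (-1.0, 0.0)]:
    xi = slack(w, w0, x, 1)
    print(x, xi, category(xi))
```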


SVM for Non-separable Classes

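goal is

make the margin as large as possible

keep the number of points with $\xi_i > 0$ as small as possible

$$J(\mathbf{w}, w_0, \boldsymbol{\xi}) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} I(\xi_i) \tag{3.79}$$

$$I(\xi_i) = \begin{cases} 1, & \xi_i > 0 \\ 0, & \xi_i = 0 \end{cases}$$

(3.79) is intractable because of the discontinuous function $I(\cdot)$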


SVM for Non-separable Classes

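as is common in such cases, we choose to optimize a closely related cost function

$$\text{minimize} \quad J(\mathbf{w}, w_0, \boldsymbol{\xi}) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} \xi_i$$

$$\text{subject to} \quad y_i[\mathbf{w}^T\mathbf{x}_i + w_0] \ge 1 - \xi_i, \quad i = 1, 2, \ldots, N$$

$$\xi_i \ge 0, \quad i = 1, 2, \ldots, N$$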
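The relaxed cost can be evaluated for any candidate hyperplane by taking each slack at its smallest feasible value, $\xi_i = \max(0, 1 - y_i(\mathbf{w}^T\mathbf{x}_i + w_0))$. The sketch below does only that evaluation, not the minimization; the function name, the data and the value of $C$ are assumptions for illustration.

```python
def soft_margin_cost(w, w0, xs, ys, C):
    """Return (J, slacks) with J = 0.5 * ||w||^2 + C * sum_i xi_i,
    where each xi_i is the smallest slack satisfying the constraints."""
    slacks = [
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + w0))
        for x, y in zip(xs, ys)
    ]
    cost = 0.5 * sum(wi * wi for wi in w) + C * sum(slacks)
    return cost, slacks

# Illustrative one-dimensional layout embedded in 2-D (assumed).
xs = [(2.0, 0.0), (0.5, 0.0), (-1.0, 0.0), (-2.0, 0.0)]
ys = [1, 1, 1, -1]
cost, slacks = soft_margin_cost((1.0, 0.0), 0.0, xs, ys, C=10.0)
print(slacks)  # [0.0, 0.5, 2.0, 0.0]
print(cost)    # 0.5 + 10 * 2.5 = 25.5
```

A larger $C$ penalizes margin violations more heavily, while a smaller $C$ favors a wider margin at the price of more points with $\xi_i > 0$.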


SVM for Non-separable Classes

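the corresponding Lagrangian is

$$L(\mathbf{w}, w_0, \boldsymbol{\xi}, \boldsymbol{\lambda}, \boldsymbol{\mu}) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \mu_i \xi_i - \sum_{i=1}^{N} \lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 + \xi_i]$$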


SVM for Non-separable Classes

The corresponding KKT condition

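$$\frac{\partial L}{\partial \mathbf{w}} = \mathbf{0} \quad \text{or} \quad \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i \tag{3.85}$$

$$\frac{\partial L}{\partial w_0} = 0 \quad \text{or} \quad \sum_{i=1}^{N} \lambda_i y_i = 0 \tag{3.86}$$

$$\frac{\partial L}{\partial \xi_i} = 0 \quad \text{or} \quad C - \mu_i - \lambda_i = 0, \quad i = 1, 2, \ldots, N \tag{3.87}$$

$$\lambda_i [y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 + \xi_i] = 0, \quad i = 1, 2, \ldots, N \tag{3.88}$$

$$\mu_i \xi_i = 0, \quad i = 1, 2, \ldots, N \tag{3.89}$$

$$\mu_i \ge 0, \quad \lambda_i \ge 0, \quad i = 1, 2, \ldots, N \tag{3.90}$$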


SVM for Non-separable Classes

The associated Wolfe dual representation now becomes

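$$\text{maximize} \quad L(\mathbf{w}, w_0, \boldsymbol{\xi}, \boldsymbol{\lambda}, \boldsymbol{\mu})$$

$$\text{subject to} \quad \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i$$

$$\sum_{i=1}^{N} \lambda_i y_i = 0$$

$$C - \mu_i - \lambda_i = 0, \quad i = 1, 2, \ldots, N$$

$$\lambda_i \ge 0, \quad \mu_i \ge 0, \quad i = 1, 2, \ldots, N$$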


SVM for Non-separable Classes

equivalent to

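$$\max_{\boldsymbol{\lambda}} \left( \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j \mathbf{x}_i^T\mathbf{x}_j \right)$$

$$\text{subject to} \quad 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, N$$

$$\sum_{i=1}^{N} \lambda_i y_i = 0$$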
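Why the slack variables drop out and the multipliers become bounded can be seen in two lines, using the conditions numbered (3.87) and (3.90) in these slides ($C - \mu_i - \lambda_i = 0$ and $\mu_i \ge 0$). A short worked step:

```latex
% All terms of the Lagrangian that contain \xi_i cancel by (3.87):
C\sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \mu_i \xi_i - \sum_{i=1}^{N} \lambda_i \xi_i
  = \sum_{i=1}^{N} (C - \mu_i - \lambda_i)\,\xi_i = 0

% Eliminating \mu_i with (3.87) and using \mu_i \ge 0 from (3.90):
\mu_i = C - \lambda_i \ge 0
  \;\Longrightarrow\;
  0 \le \lambda_i \le C
```

What remains is the same quadratic form in $\boldsymbol{\lambda}$ as in the separable case, with the extra upper bound $C$ on each multiplier.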


Remarks-difference with the linearly separable case

Lagrange multipliers ($\lambda_i$) need to be bounded by $C$

the slack variables, $\xi_i$, and their associated Lagrange multipliers, $\mu_i$, do not enter into the problem explicitly; they are reflected indirectly through $C$


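Remarks: M-class problem

SVM for M-class problem

design M separating hyperplanes $g_i(\mathbf{x}) = 0$, $i = 1, \ldots, M$, so that $g_i(\mathbf{x}) = 0$ separates class $\omega_i$ from all the others: $g_i(\mathbf{x}) > 0$ for $\mathbf{x} \in \omega_i$ and $g_i(\mathbf{x}) < 0$ otherwise

assign $\mathbf{x}$ in $\omega_i$ if $i = \arg\max_k \{g_k(\mathbf{x})\}$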
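The assignment rule can be sketched as a plain argmax over the $M$ discriminant values $g_k(\mathbf{x}) = \mathbf{w}_k^T\mathbf{x} + w_{0k}$. The function name, the zero-based class indices and the three hyperplanes below are assumptions made for illustration; in practice each pair would come from training one SVM per class against all the others.

```python
def assign_class(x, hyperplanes):
    """Return the index i that maximizes g_i(x) = w_i^T x + w0_i.

    hyperplanes is a list of (w, w0) pairs, one per class (zero-based index).
    """
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + w0
              for w, w0 in hyperplanes]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical discriminants for M = 3 classes.
hyperplanes = [((1.0, 0.0), 0.0), ((-1.0, 0.0), 0.0), ((0.0, 1.0), -0.5)]

print(assign_class((2.0, 0.1), hyperplanes))   # scores  2.0, -2.0, -0.4
print(assign_class((-1.0, 0.0), hyperplanes))  # scores -1.0,  1.0, -0.5
print(assign_class((0.0, 3.0), hyperplanes))   # scores  0.0,  0.0,  2.5
```

Taking the maximum also resolves points for which more than one $g_k$ is positive, or none is, which a per-class sign test alone would leave unassigned.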