Trading Convexity for Scalability Marco A. Alvarez CS7680 Department of Computer Science Utah State University

Page 1: Trading Convexity for Scalability

Trading Convexity for Scalability

Marco A. Alvarez

CS7680

Department of Computer Science

Utah State University

Page 2: Trading Convexity for Scalability

Paper

Collobert, R., Sinz, F., Weston, J., and Bottou, L. 2006. Trading convexity for scalability. In Proceedings of the 23rd International Conference on Machine Learning (Pittsburgh, Pennsylvania, June 25 - 29, 2006). ICML '06, vol. 148. ACM Press, New York, NY, 201-208.

Page 3: Trading Convexity for Scalability

Introduction

Previously in Machine Learning

  • Non-convex cost function in MLPs
      - Difficult to optimize
      - Works efficiently in practice
  • SVMs are defined by a convex function
      - Easier optimization (algorithms)
      - Unique solution (we can write theorems)

Goal of the paper

  • Sometimes non-convexity has benefits
      - Faster training and testing (fewer support vectors)
      - Non-convex SVMs (faster and sparser)
      - Fast transductive SVMs

Page 4: Trading Convexity for Scalability

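From SVM

Decision function

$$\hat{y}(x) = w \cdot x + b$$

Primal formulation

$$\min_{w,b} \; \frac{1}{2}\|w\|^2 + C \sum_i H_1\big[\, y_i \, \hat{y}(x_i) \,\big]$$

  • Minimize ||w|| so that the margin is maximized
  • w is a combination of a small number of data points (sparsity)
  • Decision boundary is determined by the support vectors

Dual formulation

$$\min_{\alpha} \; G(\alpha) = \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, x_i \cdot x_j - \sum_i y_i \alpha_i$$

$$\text{s.t.} \quad \sum_i \alpha_i = 0, \qquad 0 \le y_i \alpha_i \le C$$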
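As a rough illustration of the primal formulation on this slide, the sketch below evaluates the objective (half the squared norm of w plus C times the summed hinge terms) for a fixed w and b, with the decision function written as w·x + b. The function names, the four-point dataset, and the values of w, b, and C are made up for illustration and do not come from the slides; the hinge term is assumed to be H_1(z) = max(0, 1 - z).

```python
from typing import Sequence


def decision(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear decision function w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def primal_objective(w, b, X, y, C: float) -> float:
    """0.5 * ||w||^2 + C * sum_i H_1[y_i * (w . x_i + b)],
    with H_1(z) = max(0, 1 - z) (assumed hinge definition)."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    hinge_total = sum(
        max(0.0, 1.0 - yi * decision(w, b, xi)) for xi, yi in zip(X, y)
    )
    return regularizer + C * hinge_total


# Hypothetical toy data: the last point sits on the wrong side of the boundary.
X = [(2.0, 0.0), (0.0, 2.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, 1, -1, -1]
w, b = (1.0, 1.0), 0.0

print(primal_objective(w, b, X, y, C=1.0))  # 2.5
```

Only the misclassified fourth point contributes a hinge term here (1.5); the other three lie outside the margin and cost nothing, so the total is 1.0 + 1.5.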

Page 5: Trading Convexity for Scalability

SVM problem

  • Number of support vectors increases linearly with L
  • Cost attributed to one example (x, y):

From:

$$C \, H_1\big[\, y \, \hat{y}(x) \,\big]$$

Page 6: Trading Convexity for Scalability

Ramp Loss Function

Given: $z = y \, \hat{y}(x)$

$$R_s(z) = H_1(z) - H_s(z)$$

[Figure: plot of the ramp loss, with regions labeled "Outliers" and "Non SV"]
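The ramp loss above can be sketched in a few lines. This assumes the usual hinge definition H_s(z) = max(0, s - z), which the slide does not spell out, and uses s = -1 purely as an illustrative value; the function names are hypothetical.

```python
def hinge(z: float, s: float = 1.0) -> float:
    """Hinge loss H_s(z) = max(0, s - z) (assumed definition)."""
    return max(0.0, s - z)


def ramp(z: float, s: float = -1.0) -> float:
    """Ramp loss R_s(z) = H_1(z) - H_s(z), intended for s < 1."""
    return hinge(z, 1.0) - hinge(z, s)


# Compare both losses across the margin value z = y * (w . x + b).
for z in (-3.0, -1.0, 0.0, 0.5, 1.0, 2.0):
    print(f"z={z:+.1f}  hinge={hinge(z):.1f}  ramp={ramp(z):.1f}")
```

With s = -1 the ramp loss is flat at 1 - s = 2 for z below -1, so badly misclassified points (the outliers) stop adding to the cost, while the hinge loss keeps growing linearly. For z at or above 1 both losses are zero, which corresponds to the non-support-vector region.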

Page 7: Trading Convexity for Scalability

Concave-Convex Procedure (CCCP)

  • Given a cost function: $J(\theta)$
  • Decompose into a convex part and a concave part:

$$J(\theta) = J_{vex}(\theta) + J_{cav}(\theta)$$

  • The cost is guaranteed to decrease at each iteration
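A minimal sketch of the CCCP iteration on a one-dimensional toy problem may help. The cost J(θ) = θ⁴ - 2θ² is an illustrative choice, not taken from the slides, split into a convex part θ⁴ and a concave part -2θ². Each step replaces the concave part by its tangent at the current point and minimizes the resulting convex function, θ_{t+1} = argmin_θ [J_vex(θ) + J_cav'(θ_t)·θ], which for this toy cost has a closed form.

```python
import math


def J(theta: float) -> float:
    """Toy cost (hypothetical): J_vex = theta^4, J_cav = -2 * theta^2."""
    return theta ** 4 - 2.0 * theta ** 2


def cccp_step(theta_t: float) -> float:
    """One CCCP step: linearize the concave part at theta_t, then
    minimize theta^4 + slope * theta exactly."""
    slope = -4.0 * theta_t  # derivative of J_cav at theta_t
    # Stationarity: 4 * theta^3 + slope = 0  =>  theta = cbrt(-slope / 4)
    target = -slope / 4.0
    return math.copysign(abs(target) ** (1.0 / 3.0), target)


theta = 0.1            # arbitrary starting point
history = [J(theta)]   # cost after each iteration
for _ in range(30):
    theta = cccp_step(theta)
    history.append(J(theta))

print(f"theta = {theta:.6f}, J = {history[-1]:.6f}")
```

Starting from 0.1, the iterates move toward the local minimum at θ = 1 and the recorded cost never goes up, which is the monotonic behaviour the slide refers to. Which local minimum is reached depends on the starting point, since the overall cost is non-convex.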

Page 8: Trading Convexity for Scalability

Using the Ramp Loss

Page 9: Trading Convexity for Scalability

CCCP for Ramp Loss

Page 10: Trading Convexity for Scalability

Results

Page 11: Trading Convexity for Scalability

Speedup

Page 12: Trading Convexity for Scalability

Time and Number of SVs

Page 13: Trading Convexity for Scalability

Transductive SVMs

Page 14: Trading Convexity for Scalability

Loss Function

Cost to be minimized:

Page 15: Trading Convexity for Scalability

Balancing Constraint

  • Necessary for TSVMs

Page 16: Trading Convexity for Scalability

Results

Page 17: Trading Convexity for Scalability

Training Time

Page 18: Trading Convexity for Scalability

Quadratic Fit