On-line Learning with Passive-Aggressive Algorithms
Joseph Keshet, The Hebrew University
Learning Seminar, 2004




On-line Learning w/ Passive-Aggressive Joseph Keshet, The Hebrew University

Slide 2 of 33

Supervised Learning Components

• Instance space X

• Label space Y

• Mappings from X to Y are called classifiers

• There exists an unknown target classifier c: X → Y

• Goal: produce a hypothesis that is a good approximation of the target


Slide 3 of 33

Outline

• Binary classification
  – Problem setting
  – On-line algorithm
  – Mistake bound analysis
  – Kernels

• Regression

• Novelty detection / "one-class"

• Hierarchical classification


Slide 4 of 33

Binary Classification

• Input: examples (x_t, y_t), where x_t ∈ R^n and y_t ∈ {−1, +1}

• Restriction: linear classification functions, f(x) = sign(w · x)

• Goal: find w that attains a small error


Slide 5 of 33

Online Learning

Initialize: w_1 = (0, …, 0)

For t = 1, 2, …

  Receive instance vector x_t

  Predict label ŷ_t = sign(w_t · x_t)

  Receive correct label y_t

  Suffer error if ŷ_t ≠ y_t

  Apply update rule to obtain w_{t+1}


Slide 6 of 33

Margin & Loss

• Margin: y_t (w_t · x_t)
• The binary error is a combinatorial quantity and thus difficult to minimize directly
• Define the hinge loss: ℓ(w; (x, y)) = max{0, 1 − y (w · x)}

[figure: the hinge loss upper-bounds the binary "0-1" error as a function of the margin]


Slide 7 of 33

The Update Rule

w_{t+1} = argmin_w ½‖w − w_t‖²   s.t.   ℓ(w; (x_t, y_t)) = 0

• Classify the current example correctly

• Keep the new hyperplane close to the last one


Slide 8 of 33

The Update Rule

Closed-form solution:

  w_{t+1} = w_t + τ_t y_t x_t,   where τ_t = ℓ_t / ‖x_t‖²
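The update rule on this slide can be sketched in a few lines of Python with NumPy; the function name and the toy example stream are illustrative, not from any library.

```python
import numpy as np

def pa_update(w, x, y):
    """One Passive-Aggressive step for binary classification.

    Computes the hinge loss max(0, 1 - y * (w . x)); on a margin
    violation, w moves the minimal distance needed to classify
    (x, y) with margin 1: w <- w + tau * y * x, tau = loss / ||x||^2.
    """
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    if loss > 0.0:
        tau = loss / np.dot(x, x)
        w = w + tau * y * x
    return w, loss

# Online loop over a small stream of examples
w = np.zeros(2)
stream = [(np.array([1.0, 0.0]), 1), (np.array([0.0, 1.0]), -1)]
for x, y in stream:
    w, loss = pa_update(w, x, y)
```

After each step the current example is classified with a full margin, so replaying it immediately suffers zero loss.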


Slide 9 of 33

Loss Bound Theorem

• Let (x_1, y_1), …, (x_T, y_T) be a sequence of examples

• Assume some u satisfies ℓ(u; (x_t, y_t)) = 0 for all t

• Then

  Σ_t ℓ_t² ≤ R² ‖u‖²

where R = max_t ‖x_t‖


Slide 10 of 33

The Non-Separable Case

w_{t+1} = argmin_w ½‖w − w_t‖² + C ξ   s.t.   ℓ(w; (x_t, y_t)) ≤ ξ, ξ ≥ 0

Slack variable ξ; the aggressiveness parameter C trades off progress on the current example against stability


Slide 11 of 33

The Update Story

• Correct classification with margin: no update

• Misclassification or margin violation: move w the minimal distance that restores a unit margin

• The non-separable case: the step size is additionally capped by the parameter C
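A sketch of the capped step for the non-separable case, assuming the standard PA-I form in which τ is clipped at C (the function name is illustrative):

```python
import numpy as np

def pa1_update(w, x, y, C=1.0):
    """PA-I step for the non-separable case.

    Same as the aggressive update, but the slack-variable relaxation
    caps the step size: tau = min(C, loss / ||x||^2).
    """
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    tau = min(C, loss / np.dot(x, x))
    return w + tau * y * x

# With a small C the step is clipped: tau = min(0.1, 1/4) = 0.1
w = pa1_update(np.zeros(2), np.array([2.0, 0.0]), 1, C=0.1)
```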


Slide 12 of 33

Kernels

• Since w_{t+1} = w_t + τ_t y_t x_t and w_1 = 0, we have w_t = Σ_{i<t} τ_i y_i x_i

• Note that the instances appear only through inner products

• Therefore every inner product can be replaced by a kernel: w_t · x = Σ_{i<t} τ_i y_i K(x_i, x)
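A minimal kernelized sketch: instead of maintaining w explicitly, store the (τ_i, y_i, x_i) triples and predict with kernel evaluations. The RBF kernel and the function names here are illustrative choices.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    d = a - b
    return np.exp(-gamma * np.dot(d, d))

def predict(support, x, kernel=rbf):
    # w . x expanded as sum_i tau_i * y_i * K(x_i, x)
    return sum(tau * y * kernel(xi, x) for tau, y, xi in support)

def kernel_pa_step(support, x, y, kernel=rbf):
    loss = max(0.0, 1.0 - y * predict(support, x, kernel))
    if loss > 0.0:
        tau = loss / kernel(x, x)      # ||phi(x)||^2 = K(x, x)
        support.append((tau, y, x))    # store the example, not w
    return support

support = kernel_pa_step([], np.array([1.0, 0.0]), 1)
```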


Slide 13 of 33

On-line Regression

• Input: examples (x_t, y_t), where x_t ∈ R^n and y_t ∈ R

• Restriction: linear regression functions, f(x) = w · x

• Goal: find w that attains a small discrepancy |w · x_t − y_t|


Slide 14 of 33

On-line Regression

• Define the ε-insensitive loss: ℓ(w; (x, y)) = max{0, |w · x − y| − ε}

• Update rule: w_{t+1} = w_t − sign(w_t · x_t − y_t) τ_t x_t, with τ_t = ℓ_t / ‖x_t‖²
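A sketch of the regression step under these definitions (the function name is illustrative):

```python
import numpy as np

def pa_regression_update(w, x, y, eps=0.1):
    """PA regression step with the eps-insensitive loss.

    loss = max(0, |w . x - y| - eps); on a violation, w moves the
    minimal distance that brings the prediction back into the
    eps-tube around y.
    """
    err = np.dot(w, x) - y
    loss = max(0.0, abs(err) - eps)
    if loss > 0.0:
        tau = loss / np.dot(x, x)
        w = w - np.sign(err) * tau * x
    return w, loss

w, loss = pa_regression_update(np.zeros(1), np.array([1.0]), 1.0)
```

After the step the prediction sits exactly on the boundary of the ε-tube, so the same example suffers zero loss.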


Slide 15 of 33

On-line Novelty Detection

• Input: examples x_t ∈ R^n, with no labels

• Restriction: a single center point w ∈ R^n

• Goal: find the w that is the center of the smallest ball enclosing the examples


Slide 16 of 33

On-line Novelty Detection

• Define the loss: ℓ(w; x) = max{0, ‖x − w‖ − ε}

• Update rule: shift the center the minimal distance that brings x back into the ε-ball, w_{t+1} = w_t + (ℓ_t / ‖x_t − w_t‖)(x_t − w_t)
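One way to sketch this one-class step, assuming the ball-based loss above (names illustrative):

```python
import numpy as np

def pa_uniclass_update(w, x, eps=1.0):
    """One-class PA step: keep every instance within an eps-ball
    centered at w.

    If x falls outside the ball, shift the center the minimal
    distance that puts x back on the boundary.
    """
    d = np.linalg.norm(x - w)
    loss = max(0.0, d - eps)
    if loss > 0.0:
        w = w + (loss / d) * (x - w)
    return w, loss

w, loss = pa_uniclass_update(np.zeros(2), np.array([3.0, 0.0]), eps=1.0)
```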


Slide 17 of 33

Hierarchical Classification

• Goal: spoken phoneme recognition

[figure: the phoneme hierarchy; PHONEMES splits into Silences, Sonorants, and Obstruents; Sonorants into Nasals (n, m, ng), Liquids (l, y, w, r), and Vowels (Front: iy, ih, ey, eh, ae; Center/Back: oy, ow, uh, uw, aa, ao, er, aw, ay); Obstruents into Plosives (b, d, g, k, p, t), Fricatives (f, v, sh, s, th, dh, zh, z), and Affricates (jh, ch)]


Slide 18 of 33

Metric Over Label Tree

• A given hierarchy induces a metric over the set of labels: the tree distance, the number of edges along the path between two labels
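For instance, with a hierarchy stored as parent pointers, the tree distance can be computed by counting edges to the lowest common ancestor from each side (the toy hierarchy and names are illustrative):

```python
# Toy hierarchy as parent pointers; node 0 is the root
PARENT = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}

def root_path(v):
    """Nodes from v up to the root, in order."""
    out = []
    while v is not None:
        out.append(v)
        v = PARENT[v]
    return out

def tree_distance(a, b):
    """gamma(a, b): number of edges on the unique path from a to b."""
    pa, pb = root_path(a), root_path(b)
    common = set(pa) & set(pb)
    lca = next(u for u in pa if u in common)  # lowest common ancestor
    return pa.index(lca) + pb.index(lca)
```

Siblings end up at distance 2, while labels in different subtrees are further apart.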


Slide 19 of 33

Metric Over Label Tree

• A given hierarchy induces a metric over the set of labels: the tree distance

[figure: the path between two labels a and b in the tree]


Slide 20 of 33

Metric Over Labels

• Metric semantics: γ(a, b) is the severity of predicting label "b" instead of the correct label "a"

• Our high-level goal: tolerate minor errors…
  • sibling errors
  • under-confident predictions (predicting a parent)
  …but avoid major errors

[figure: labels a and b in the tree]


Slide 21 of 33

Hierarchical Classifier

• Assume X = R^n and Y = {0, …, k}, the nodes of the label tree

• Associate a prototype W_v ∈ R^n with each label v

• Score of label v on instance x: W_v · x

• Classify by: ŷ = argmax_v W_v · x

[figure: label tree with prototype W0 at the root, W1, W2, W3 below it, and leaves W4 through W10]


Slide 22 of 33

Hierarchical Classifier

• Goal: keep each prototype W_v close to the prototype of its parent

• Define w_v = W_v − W_{parent(v)}, so that W_v = Σ_{u ∈ path(v)} w_u

• Goal: maintain small Σ_v ‖w_v‖²

[figure: the same tree with the node vectors w0 through w10]


Slide 23 of 33

Online Learning

For t = 1, 2, …

  Receive instance x_t

  Predict label ŷ_t = argmax_v W_v · x_t

  Receive correct label y_t

  Suffer tree-based penalty γ(y_t, ŷ_t)

  Apply update rule to obtain the next set of prototypes

Goal: suffer a small cumulative tree error Σ_t γ(y_t, ŷ_t)


Slide 24 of 33

Tree Loss

• The tree-based penalty γ(y, ŷ) is difficult to minimize directly

• Instead, upper bound it by

  ℓ = max{0, √γ(y, ŷ) − (W_y · x − W_ŷ · x)}

where ŷ is the predicted label. Whenever ŷ ≠ y we have W_ŷ · x ≥ W_y · x, so γ(y, ŷ) ≤ ℓ².

This is the tree loss


Slide 25 of 33

The Update Rule

w_u ← w_u + τ_t x_t   for nodes u on the path to y_t but not to ŷ_t
w_u ← w_u − τ_t x_t   for nodes u on the path to ŷ_t but not to y_t

[figure: tree in which only the nodes between the correct and predicted labels are highlighted]

Local update: only the nodes along the path from y_t to ŷ_t are updated
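A sketch of this local update under the path-sum classifier described above; the toy hierarchy, the fixed step size, and the names are illustrative:

```python
import numpy as np

# Toy hierarchy as parent pointers; node 0 is the root
PARENT = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}

def path(v):
    """Set of nodes on the path from the root down to label v."""
    nodes = set()
    while v is not None:
        nodes.add(v)
        v = PARENT[v]
    return nodes

def score(W, x, v):
    # W[u] holds the node vector w_u; the score of label v sums
    # the node vectors along its root path
    return sum(np.dot(W[u], x) for u in path(v))

def hieron_update(W, x, y, y_hat, tau):
    """Local update: only nodes on the two root paths move.

    Nodes on the correct path (and not the predicted one) gain
    tau * x, nodes on the predicted path lose tau * x; shared
    ancestors cancel and stay put.
    """
    for u in path(y) - path(y_hat):
        W[u] = W[u] + tau * x
    for u in path(y_hat) - path(y):
        W[u] = W[u] - tau * x
    return W

W = {u: np.zeros(2) for u in PARENT}
W = hieron_update(W, np.array([1.0, 0.0]), y=3, y_hat=2, tau=0.5)
```

Note that the root (a shared ancestor of both labels) is untouched, while the score of the correct label rises and that of the predicted label falls.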


Slide 26 of 33

Loss Bound Theorem

• Let (x_1, y_1), …, (x_T, y_T) be a sequence of examples

• Assume a set of prototypes {u_v} satisfies ℓ = 0 on every example

• Then

  Σ_t γ(y_t, ŷ_t) ≤ Σ_t ℓ_t² ≤ c ‖u‖² R²

where R = max_t ‖x_t‖ and ‖u‖² = Σ_v ‖u_v‖² (c a constant depending on the tree)


Slide 27 of 33

Extension: Kernels

• Since each w_v is a sum of terms of the form ±τ_i x_i, every score is a sum of inner products

• Note that the instances appear only through inner products

• Therefore every inner product can be replaced by a kernel K(x_i, x)


Slide 28 of 33

Experiments

• Synthetic data: tree of depth 4, 121 labels. Generated using an orthogonal set with Gaussian noise (variance 0.16); 100 train instances and 50 test instances per label

• Phoneme recognizer: 41 phonemes taken from the TIMIT corpus; MFCC+∆+∆∆ front-end, concatenation of 5 frames, RBF kernel; 2000 train vectors and 500 test vectors per phoneme


Slide 29 of 33

Experiments

• Flat: ignore the hierarchy and solve as a single multiclass problem

• Greedy approach: solve a multiclass problem at every node with at least 2 children, descending from the root

[figure: one classifier C for the flat approach vs. a classifier C at each internal node for the greedy approach]


Slide 30 of 33

Results

                          Averaged Tree Error   Multiclass Error
  Synthetic data (tree)          0.05                  5
  Synthetic data (flat)          0.11                  8.6
  Synthetic data (greedy)        0.52                 34.9
  Phonemes (tree)                1.3                  40.6
  Phonemes (flat)                1.41                 41.8
  Phonemes (greedy)              2.48                 58.2


Slide 31 of 33

Results

• Difference between the error rate of Hieron and the flat multiclass classifier

[figure: per-label histograms of (Hieron error − Flat error) for the synthetic data and the phonemes; Hieron trades gross errors for minor errors]


Slide 32 of 33

Hierarchy vs. Flat

Similarity between the prototypes



Slide 33 of 33

Thanks!