Page 1: Importance Weighted Active Learning

Importance Weighted Active Learning

by

Alina Beygelzimer, Sanjoy Dasgupta and John Langford

(ICML 2009)

Presented by Lingbo Li
ECE, Duke University
September 25, 2009

Page 2: Importance Weighted Active Learning

Outline

• Introduction

• The Importance Weighting Skeleton

• Setting the rejection threshold

• Label Complexity

• Implementing IWAL

• Conclusion

Page 3: Importance Weighted Active Learning

Introduction

• Active learning: at each step t, the learner receives an unlabeled point $x_t \in X$ and decides whether to query its label $y_t \in Y$. The hypothesis space is $H \subseteq \{h : X \to Z\}$, where $Z$ is the prediction space, and the loss function is $\ell : Z \times Y \to [0, \infty)$.

• Drawback of earlier work: not consistent.

• Active learning with PAC-convergence guarantees: 1) only the 0-1 loss function; 2) internal use of generalization bounds.

• Importance weighted approaches: 1) non-adaptive; 2) only asymptotic guarantees.

• Motivation: use importance weighting to build a consistent binary classifier under general loss functions, which removes sampling bias and improves label complexity.

Page 4: Importance Weighted Active Learning

The Importance Weighting Skeleton

• Data are drawn i.i.d. from a distribution D over $X \times Y$.

• The expected loss is $L(h) = \mathbb{E}_{(x,y) \sim D}[\ell(h(x), y)]$.

• The importance weighted estimate of the loss at time T is

  $L_T(h) = \frac{1}{T} \sum_{t=1}^{T} \frac{Q_t}{p_t} \, \ell(h(x_t), y_t)$,

  where $Q_t \in \{0, 1\}$ indicates whether the label $y_t$ was queried and $p_t$ is the probability of querying it; then $\mathbb{E}[L_T(h)] = L(h)$, so the estimate is unbiased.

• IWAL algorithms are consistent, provided that $p_t$ does not equal zero.
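To make the skeleton concrete, here is a minimal Python sketch (not the authors' code; the names `query_prob` and `history` are assumptions, while the coin flip with bias $p_t$ and the $1/p_t$ weighting follow the estimator above):

```python
import random

def iwal_skeleton(stream, query_prob, T):
    """Minimal sketch of the IWAL skeleton.

    stream     : iterable of (x_t, y_t) pairs (y_t is only looked at if queried)
    query_prob : rule giving p_t = query_prob(x_t, history) in (0, 1]
    """
    history = []                        # (x_t, y_t, p_t) for the queried points
    for t, (x, y) in enumerate(stream, start=1):
        if t > T:
            break
        p = query_prob(x, history)      # rejection threshold for this point
        if random.random() <= p:        # flip a coin with bias p_t
            history.append((x, y, p))   # keep the label with importance weight 1/p_t
    return history

def importance_weighted_loss(h, history, loss, T):
    """L_T(h) = (1/T) * sum over queried t of loss(h(x_t), y_t) / p_t.
    Unqueried points contribute 0, so the estimate is unbiased when p_t > 0."""
    return sum(loss(h(x), y) / p for (x, y, p) in history) / T
```

Here `query_prob` stands in for whichever rejection-threshold rule is plugged into the skeleton; the loss-weighting rule on the next slide is one such choice.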

Page 5: Importance Weighted Active Learning

Setting the rejection threshold

• The minimization is done over $H_t$ instead of $H$, where

  $H_t = \{h \in H_{t-1} : L_{t-1}(h) \le L_{t-1}^* + \Delta_{t-1}\}$

  is the set of hypotheses whose importance weighted loss so far is within a threshold $\Delta_{t-1}$ of the current minimum $L_{t-1}^*$.

• IWAL never performs worse than supervised learning.
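A rough sketch of this loss-weighting threshold (assuming a finite list of hypotheses, a loss normalized to [0, 1], and helper names such as `est_loss` that are not from the slides):

```python
def rejection_threshold(x, hypotheses, labels, loss, est_loss, delta):
    """Sketch of the loss-weighting rejection threshold.

    Keep only the hypotheses whose importance weighted loss so far is within
    delta of the current minimum (the set H_t), then set p_t to the largest
    possible disagreement in loss on x between two surviving hypotheses.
    """
    best = min(est_loss(h) for h in hypotheses)
    H_t = [h for h in hypotheses if est_loss(h) <= best + delta]
    p_t = max(
        loss(f(x), y) - loss(g(x), y)
        for f in H_t for g in H_t for y in labels
    )
    return min(p_t, 1.0)   # a small p_t means the survivors nearly agree on x
```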

Page 6: Importance Weighted Active Learning

Label Complexity – upper bound

Previous work on active learning has addressed only the 0-1 loss, with the number of queries $O(\theta\, d \log^2 T)$; for arbitrary loss functions under similar conditions, the number of queries is $O(\theta \sqrt{d\,T \log T})$.

Page 7: Importance Weighted Active Learning

Label Complexity – lower bound

The lower bound on the number of queries is higher for general loss functions than for the 0-1 loss, so the upper bound above cannot be substantially improved.

Page 8: Importance Weighted Active Learning

Implementing IWAL (1)

• linear separators;

• logistic loss;

• MNIST data set of handwritten digits with 3’s and 5’s as two classes;

• 1000 exemplars for training;

• another 1000 for testing;

• Use PCA to reduce dimensions;

• An optimistic (tighter) bound is used for the rejection threshold;

• Active learning performs comparably to supervised learning while querying fewer than 1/3 of the labels.
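For intuition, a toy end-to-end sketch of this kind of experiment in Python/NumPy (synthetic data stands in for the PCA-reduced 3-vs-5 digits, and the query probability here is a simple placeholder schedule, not the paper's rejection threshold):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the PCA-reduced 3-vs-5 digits: 1000 train / 1000 test.
d = 25
X_train, y_train = rng.normal(size=(1000, d)), rng.choice([-1.0, 1.0], size=1000)
X_test,  y_test  = rng.normal(size=(1000, d)), rng.choice([-1.0, 1.0], size=1000)

w = np.zeros(d)                # linear separator trained with the logistic loss
eta, queried = 0.05, 0
for t, (x, y) in enumerate(zip(X_train, y_train), start=1):
    p = max(0.1, 1.0 / np.sqrt(t))     # placeholder query probability, not IWAL's rule
    if rng.random() <= p:
        queried += 1
        grad = -y * x / (1.0 + np.exp(y * (w @ x)))   # gradient of log(1 + exp(-y w.x))
        w -= eta * grad / p                           # importance weight 1 / p_t
accuracy = np.mean(np.sign(X_test @ w) == y_test)
print(f"queried {queried}/1000 labels, test accuracy {accuracy:.2f}")
```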

Page 9: Importance Weighted Active Learning

Implementing IWAL (2)

• Bootstrapping scheme to set the query probability $p_t$;

• Binary and multiclass classification loss;

• MNIST dataset.
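A sketch of how such a bootstrap scheme could set $p_t$ (a committee trained on bootstrap resamples of the queried data; the floor parameter `p_min` and the helper names are assumptions, not taken from the slides):

```python
import random

def bootstrap_committee(history, train, k=10):
    """Train k predictors on bootstrap resamples of the queried (x, y, p) triples."""
    return [train(random.choices(history, k=len(history))) for _ in range(k)]

def bootstrap_query_prob(x, committee, labels, loss, p_min=0.1):
    """p_t = p_min + (1 - p_min) * (largest disagreement in loss on x
    between any two committee members), assuming a loss normalized to [0, 1]."""
    spread = max(
        loss(h_i(x), y) - loss(h_j(x), y)
        for h_i in committee for h_j in committee for y in labels
    )
    return p_min + (1.0 - p_min) * spread
```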

Page 10: Importance Weighted Active Learning

Conclusion

• IWAL is a consistent algorithm that can be implemented with general loss functions.

• Theoretical label complexity bounds are provided, with substantial improvements.

• Practical experiments support these results.