Importance Weighted Active Learning
by
Alina Beygelzimer, Sanjoy Dasgupta and John Langford
(ICML 2009)
Presented by Lingbo Li, ECE, Duke University, September 25, 2009
Outline
• Introduction
• The Importance Weighting Skeleton
• Setting the rejection threshold
• Label Complexity
• Implementing IWAL
• Conclusion
Introduction
• Active learning: at each step $t$, the learner receives an unlabeled point $x_t \in X$ and decides whether to query its label $y_t \in Y$. The hypothesis space is $H = \{h : X \to Z\}$, where $Z$ is the prediction space, and the loss function is $l : Z \times Y \to [0, \infty)$.
• Drawbacks of earlier work: many active learning schemes are not consistent; PAC-convergence-guarantee active learning 1) handles only the 0-1 loss function and 2) makes internal use of generalization bounds.
• Earlier importance weighted approaches are 1) non-adaptive and 2) only analyzed asymptotically.
• Motivation: use importance weighting to build a consistent binary classifier under general loss functions, which removes sampling bias and improves label complexity.
The Importance Weighting Skeleton
• Data points $(x_t, y_t)$ are drawn i.i.d. from a distribution $D$ over $X \times Y$; only queried labels are observed.
• At time $t$, the label is queried with probability $p_t$, and a queried point receives importance weight $1/p_t$.
• The expected loss of $h \in H$ is $L(h) = \mathbb{E}_{(x,y) \sim D}\, l(h(x), y)$.
• The importance weighted estimate of the loss at time $T$ is $L_T(h) = \frac{1}{T} \sum_{t=1}^{T} \frac{Q_t}{p_t}\, l(h(x_t), y_t)$, where $Q_t \in \{0,1\}$ indicates whether the label $y_t$ was queried; then $\mathbb{E}\, L_T(h) = L(h)$ for every $h \in H$.
• IWAL algorithms are consistent provided $p_t$ does not equal zero. A minimal sketch of the skeleton follows.
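The following is a minimal sketch of the IWAL skeleton, assuming a finite list of hypotheses and a user-supplied `rejection_threshold` subroutine; all names here are illustrative rather than taken from the paper's code.

```python
import random

def iwal_skeleton(stream, hypotheses, loss, rejection_threshold):
    """Run the IWAL skeleton over a stream of (x_t, y_t) pairs.

    stream: iterable of (x_t, y_t); y_t is only inspected when queried.
    hypotheses: a finite list of predictors h: X -> Z.
    loss: function l(z, y) -> [0, inf).
    rejection_threshold: maps (x_t, history) to a probability p_t in (0, 1].
    """
    history = []                             # (x_t, y_t, 1/p_t) of queried points
    weighted_loss = [0.0] * len(hypotheses)  # running importance-weighted losses

    for x_t, y_t in stream:
        p_t = rejection_threshold(x_t, history)
        if random.random() <= p_t:           # flip a coin with bias p_t
            history.append((x_t, y_t, 1.0 / p_t))
            # the 1/p_t importance weight keeps each loss estimate unbiased
            for i, h in enumerate(hypotheses):
                weighted_loss[i] += loss(h(x_t), y_t) / p_t

    # return the minimizer of the importance-weighted empirical loss
    best = min(range(len(hypotheses)), key=lambda i: weighted_loss[i])
    return hypotheses[best]
```

For instance, `iwal_skeleton(data, [h1, h2], lambda z, y: float(z != y), lambda x, hist: 0.5)` would query roughly half of the labels under the 0-1 loss.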
Setting the rejection threshold
• To make the minimization tractable, the loss-weighting scheme minimizes over a shrinking set $H_t \subseteq H$ instead of all of $H$: $H_t = \{h \in H_{t-1} : L_{t-1}(h) \le \min_{h' \in H_{t-1}} L_{t-1}(h') + \Delta_{t-1}\}$ with $H_0 = H$,
where the query probability is the largest loss disagreement among surviving hypotheses, $p_t = \max_{f, g \in H_t,\, y \in Y} \big( l(f(x_t), y) - l(g(x_t), y) \big)$.
• IWAL never performs worse than supervised learning. A sketch of this rejection threshold is given below.
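A hedged sketch of the loss-weighting rejection threshold; the form of $\Delta_t$ is reproduced from the paper's analysis up to constants, and the function names are illustrative.

```python
import math

def loss_weighting_p(x_t, surviving, labels, loss):
    """p_t: largest loss disagreement between surviving hypotheses on x_t."""
    p_t = 0.0
    for y in labels:
        vals = [loss(h(x_t), y) for h in surviving]
        p_t = max(p_t, max(vals) - min(vals))
    # p_t == 0 means all survivors agree on x_t, so its label is uninformative
    return p_t

def shrink(surviving, weighted_loss, t, n_hypotheses, delta=0.05):
    """Keep hypotheses whose estimated loss is within Delta_t of the best.

    Delta_t follows the sqrt((8/t) * log(2t(t+1)|H|^2 / delta)) form of the
    paper's deviation bound; treat the constants as indicative.
    """
    losses = {h: weighted_loss[h] / t for h in surviving}
    best = min(losses.values())
    delta_t = math.sqrt((8.0 / t) * math.log(2.0 * t * (t + 1) * n_hypotheses**2 / delta))
    return [h for h in surviving if losses[h] <= best + delta_t]
```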
Label Complexity – upper bound
• Previous label-complexity analyses of active learning cover only the 0-1 loss, with the number of queries bounded by $O(\theta(\eta T + \sqrt{dT \log T}))$, where $\theta$ is the disagreement coefficient, $d$ the VC dimension, and $\eta$ the loss of the best hypothesis in $H$.
• For arbitrary loss functions under similar conditions, the number of queries is of the same order up to a loss-dependent constant (the slope asymmetry $K_l$): $O(\theta K_l(\eta T + \sqrt{dT \log T}))$.
Label Complexity – lower bound
• The paper also strengthens the known lower bound on label complexity, showing that the dependence on the loss of the best hypothesis is unavoidable, so the upper bound cannot be substantially improved.
Implementing IWAL (1)
• linear separators;
• logistic loss;
• MNIST data set of handwritten digits with 3’s and 5’s as two classes;
• 1000 exemplars for training;
• another 1000 for testing;
• Use PCA to reduce dimensions;
• An "optimistic" bound is used to set $\Delta_t$ (and hence the rejection threshold) in practice;
• Active learning performs comparably to supervised learning while querying fewer than 1/3 of the labels. A rough sketch of this experimental setup follows.
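A rough, hedged sketch of the first experiment's pipeline. The number of PCA components, the constant stand-in query probability, and the scikit-learn calls are assumptions for illustration; only the 3-vs-5 split, the 1000/1000 train/test sizes, PCA, and the logistic-loss linear separator come from the slide.

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.decomposition import PCA
from sklearn.linear_model import SGDClassifier

# MNIST with 3's and 5's as the two classes; 1000 train, another 1000 test
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
mask = np.isin(mnist.target, ["3", "5"])
X, y = mnist.data[mask] / 255.0, (mnist.target[mask] == "5").astype(int)
X_tr, y_tr = X[:1000], y[:1000]
X_te, y_te = X[1000:2000], y[1000:2000]

# PCA to reduce dimensionality (25 components is a guess, not the paper's value)
pca = PCA(n_components=25).fit(X_tr)
X_tr, X_te = pca.transform(X_tr), pca.transform(X_te)

# Stand-in for the IWAL querying loop: a constant query probability, just to
# show how the 1/p_t importance weights enter the logistic-loss fit.
rng = np.random.default_rng(0)
p = np.full(len(X_tr), 0.3)
queried = rng.random(len(X_tr)) <= p
weights = 1.0 / p[queried]

clf = SGDClassifier(loss="log_loss")  # linear separator under logistic loss
clf.fit(X_tr[queried], y_tr[queried], sample_weight=weights)
print("labels queried:", int(queried.sum()), "test acc:", clf.score(X_te, y_te))
```

(`loss="log_loss"` is the current scikit-learn name; older versions call it `"log"`.)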
Implementing IWAL (2)
• A bootstrapping scheme sets the query probability $p_t$: a committee of hypotheses is trained on an initial labeled sample, and $p_t$ grows with the committee's loss disagreement on $x_t$;
• Evaluated with binary and multiclass classification loss on the MNIST dataset. A sketch of the committee-based threshold follows.
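A hedged sketch of the bootstrap instantiation. The committee size, `p_min`, and the exact form $p_t = p_{\min} + (1 - p_{\min}) \cdot \text{disagreement}$ are assumptions, and only the binary case is shown; the multiclass version would iterate over all labels.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def bootstrap_committee(X_seed, y_seed, n_members=10, seed=0):
    """Train a committee on bootstrap resamples of an initial labeled seed."""
    rng = np.random.default_rng(seed)
    committee = []
    while len(committee) < n_members:
        idx = rng.integers(0, len(X_seed), size=len(X_seed))
        if len(np.unique(y_seed[idx])) < 2:  # resample must contain both classes
            continue
        committee.append(SGDClassifier(loss="log_loss").fit(X_seed[idx], y_seed[idx]))
    return committee

def query_probability(committee, x, labels=(0, 1), p_min=0.1):
    """p_t = p_min + (1 - p_min) * max committee disagreement in 0-1 loss."""
    x = np.asarray(x).reshape(1, -1)
    disagreement = 0.0
    for y in labels:
        losses = [float(m.predict(x)[0] != y) for m in committee]
        disagreement = max(disagreement, max(losses) - min(losses))
    return p_min + (1.0 - p_min) * disagreement
```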
Conclusion
• IWAL is a consistent active learning algorithm that can be implemented with flexible loss functions.
• Theoretical label-complexity bounds show a substantial improvement over passive learning.
• Practical experiments confirm these gains.