
Adaboost and its application

2007.3.2

Outline

• Introduction
• Adaboost algorithm
  - Two-class and training error
  - Multi-class
• Application
  - Face detection

Introduction

The horse-track problem: use historical horse-race data to derive some rules of thumb and predict the winner.

Possible cues and rules of thumb:
• Odds
• Dry or muddy track
• Jockey
• "Bet on the horse with the most favorable odds"
• "On a muddy track, bet on the lightest horse"

Introduction

How should the horse-race data be chosen? At random?
• Resample the data for each classifier's design.

How should the rules of thumb be combined into a single decision? With equal importance?
• Combine the results of multiple "weak" classifiers into a single "strong" classifier.

Introduction

The two most popular ensemble methods:
• Bagging (Breiman, 1994)
• Boosting
  - Adaboost (Freund and Schapire, 1996)

Assume a two-class classification problem: a weak classifier $h_t(x)$, $t = 1, \ldots, T$, assigns an input $x$ to class_1 or class_2, and each $h_t$ is built from the training data.

Bagging

[Diagram: the training data are resampled $T$ times; weak classifiers $h_1, \ldots, h_T$ are trained on the resamples, and an input $x$ is classified by the equal-weight vote:]

$$H(x) = \operatorname{sign}\!\left(\frac{1}{T}\sum_{i=1}^{T} h_i(x)\right)$$
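As a rough sketch of this scheme (an illustration, not code from the slides; the weak_learner callable and the bootstrap resampling scheme are assumptions):

import numpy as np

def bagging(X, y, weak_learner, T, seed=0):
    """Train T weak classifiers on bootstrap resamples (equal sample weights)."""
    rng = np.random.default_rng(seed)
    m = len(X)
    classifiers = []
    for _ in range(T):
        idx = rng.integers(0, m, size=m)      # resample with replacement
        classifiers.append(weak_learner(X[idx], y[idx]))
    return classifiers

def bagging_predict(classifiers, X):
    """Equal-weight vote: H(x) = sign((1/T) * sum_i h_i(x))."""
    votes = np.mean([h(X) for h in classifiers], axis=0)
    return np.sign(votes)

Here weak_learner(X, y) is assumed to return a callable h with h(X) in {-1, +1}.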

Boosting

[Diagram: weak classifiers $h_1, \ldots, h_T$ are trained sequentially on reweighted training data, and an input $x$ is classified by the weighted vote:]

$$H(x) = \operatorname{sign}\!\left(\sum_{i=1}^{T} \alpha_i h_i(x)\right)$$

Bagging vs. Boosting

• Bagging: samples drawn with equal weight; weak classifiers combined with equal weight.
• Boosting: samples weighted unequally; weak classifiers combined with unequal weight.

Outline

• Introduction
• Adaboost algorithm
  - Two-class and training error
  - Multi-class
• Application
  - Face detection

A Formal View of Boosting

Given a training set $(x_1, y_1), \ldots, (x_m, y_m)$, where $y_i \in \{-1, +1\}$ is the correct label of instance $x_i \in X$.

For $t = 1, \ldots, T$:
• Construct a distribution $D_t$ on $\{1, \ldots, m\}$.
• Find a weak hypothesis ("rule of thumb") $h_t : X \to \{-1, +1\}$ with small error $\epsilon_t = \Pr_{i \sim D_t}[h_t(x_i) \neq y_i]$ on $D_t$.

Output the final hypothesis $H_{\mathrm{final}}$.

Adaboost Concept

Adaboost starts with a uniform distribution of "weights" over the training examples; the weights tell the learning algorithm how important each example is.

At each iteration, obtain a weak classifier $h_j(x)$ from the weak learning algorithm, then increase the weights on the training examples that were misclassified.

Repeat, and at the end carefully take a linear combination of the weak classifiers obtained at all iterations:

$$f_{\mathrm{final}}(x) = \alpha_{\mathrm{final},1}\, h_1(x) + \cdots + \alpha_{\mathrm{final},n}\, h_n(x)$$

Adaboost

Constructing $D_t$: start with $D_1(i) = 1/m$.

Given $D_t$ and $h_t$:

$$D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{-\alpha_t} & \text{if } y_i = h_t(x_i) \\ e^{\alpha_t} & \text{if } y_i \neq h_t(x_i) \end{cases} = \frac{D_t(i)}{Z_t}\,\exp\!\big(-\alpha_t\, y_i\, h_t(x_i)\big)$$

for $t = 1, \ldots, T$ and $i = 1, \ldots, m$, where $Z_t$ is a normalization constant and

$$\alpha_t = \frac{1}{2}\ln\!\left(\frac{1-\epsilon_t}{\epsilon_t}\right) > 0.$$

Final hypothesis:

$$H_{\mathrm{final}}(x) = \operatorname{sign}\!\left(\sum_t \alpha_t h_t(x)\right)$$
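Putting the update together, here is a minimal NumPy sketch (an illustration, not the slides' code; the decision-stump weak learner and the exhaustive threshold search are assumptions):

import numpy as np

def train_stump(X, y, D):
    """Pick the decision stump (feature, threshold, polarity) with the
    lowest weighted error under the distribution D."""
    best_err, best = np.inf, None
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for s in (+1, -1):
                pred = np.where(s * (X[:, j] - thresh) >= 0, 1, -1)
                err = D[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thresh, s)
    j, thresh, s = best
    return (lambda Z: np.where(s * (Z[:, j] - thresh) >= 0, 1, -1)), best_err

def adaboost(X, y, T):
    m = len(X)
    D = np.full(m, 1.0 / m)                    # D_1(i) = 1/m
    hs, alphas = [], []
    for t in range(T):
        h, eps = train_stump(X, y, D)
        eps = np.clip(eps, 1e-10, 1 - 1e-10)   # guard against eps = 0 or 1
        alpha = 0.5 * np.log((1 - eps) / eps)  # alpha_t = (1/2) ln((1-eps_t)/eps_t)
        D *= np.exp(-alpha * y * h(X))         # weight up if wrong, down if right
        D /= D.sum()                           # divide by Z_t to renormalize
        hs.append(h); alphas.append(alpha)
    # H_final(x) = sign(sum_t alpha_t h_t(x))
    return lambda Z: np.sign(sum(a * h(Z) for a, h in zip(alphas, hs)))

The labels y are assumed to be a NumPy array of +1/-1 values, matching the two-class setting above.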

Adaboost (Example)

[Five figure slides stepping through Adaboost on a toy example, round by round; images not preserved.]

Advantages of Adaboost

• The weight update focuses more on "hard" samples (those misclassified in previous iterations).
• Simple and easy to program.
• No parameters to tune (except T).
• Can be combined with many classifiers to find weak hypotheses: neural networks, decision trees, nearest-neighbor classifiers, ...

Training Error

Let $\epsilon_t = 1/2 - \gamma_t$. Then

$$\text{training error}(H_{\mathrm{final}}) \leq \prod_t \left[ 2\sqrt{\epsilon_t (1-\epsilon_t)} \right] = \prod_t \sqrt{1 - 4\gamma_t^2} \leq \exp\!\left(-2\sum_t \gamma_t^2\right).$$

So if $\forall t: \gamma_t \geq \gamma > 0$, then

$$\text{training error}(H_{\mathrm{final}}) \leq e^{-2\gamma^2 T}.$$
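For a concrete feel of the bound (a worked instance, not from the slides): if every weak classifier achieves edge $\gamma_t = 0.1$, i.e. error $\epsilon_t = 0.4$, only slightly better than chance, then after $T = 100$ rounds the training error of $H_{\mathrm{final}}$ is at most $e^{-2 \cdot 0.1^2 \cdot 100} = e^{-2} \approx 0.135$, and it keeps decaying exponentially as $T$ grows.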

Multi-class Problem

Adaboost.MH reduces the problem to binary problems.

E.g., if the possible labels are {a, b, c, d, e}, each training sample is replaced by five $\{-1, +1\}$-labeled samples. A sample $x$ with true label $c$ becomes:

• $((x, a), -1)$
• $((x, b), -1)$
• $((x, c), +1)$
• $((x, d), -1)$
• $((x, e), -1)$
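A small sketch of this label replication (a hypothetical helper matching the example above):

def reduce_to_binary(x, true_label, labels=('a', 'b', 'c', 'd', 'e')):
    # Replace one multi-class sample by |labels| binary samples:
    # ((x, l), +1) if l is the true label, ((x, l), -1) otherwise.
    return [((x, l), +1 if l == true_label else -1) for l in labels]

# reduce_to_binary('x', 'c') ->
# [(('x','a'),-1), (('x','b'),-1), (('x','c'),+1), (('x','d'),-1), (('x','e'),-1)]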

Adaboost.MH

Formally: $h_t : X \times Y \to \{-1, +1\}$ (or $\mathbb{R}$), with

$$D_{t+1}(i, y) = \frac{D_t(i, y)\,\exp\!\big(-\alpha_t\, v_i(y)\, h_t(x_i, y)\big)}{Z_t}$$

where

$$v_i(y) = \begin{cases} +1 & \text{if } y = y_i \\ -1 & \text{if } y \neq y_i \end{cases}$$

Final hypothesis:

$$H_{\mathrm{final}}(x) = \arg\max_{y \in Y} \sum_t \alpha_t h_t(x, y)$$
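The corresponding prediction step, as a sketch (hypothetical names; each h_t is assumed to be a callable on an (instance, label) pair):

def adaboost_mh_predict(x, labels, hypotheses, alphas):
    # H_final(x) = argmax_{y in Y} sum_t alpha_t * h_t(x, y)
    score = lambda y: sum(a * h(x, y) for a, h in zip(alphas, hypotheses))
    return max(labels, key=score)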

Outline

• Introduction
• Adaboost algorithm
  - Two-class and training error
  - Multi-class
• Application
  - Face detection

Face Detection

[Diagram: a training set of face and non-face images is fed to Adaboost, which produces the detection result.]

Classifiers Design

Haar-like features for the weak classifiers $h_t$, computed within the 24×24 detector window:
• Two-rectangle (A, B)
• Three-rectangle (C)
• Four-rectangle (D)

Classifiers Design

Why use Haar-like features? At the detector resolution of 24×24 there are about 160,000 possible features in total (quite large), so an efficient way to evaluate them is needed.

Classifiers Design

Use the "integral image":

$$ii(x, y) = \sum_{x' \leq x,\; y' \leq y} i(x', y')$$

Feature computation: the pixel sum of rectangle D is obtained from four corner values of the integral image,

$$\text{sum}(D) = ii(4) + ii(1) - ii(2) - ii(3).$$
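A minimal NumPy sketch of both steps (illustrative; the inclusive-corner indexing convention is an assumption):

import numpy as np

def integral_image(img):
    # ii(x, y) = sum of i(x', y') over x' <= x, y' <= y
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Pixel sum over the inclusive rectangle [top..bottom] x [left..right],
    via the four-corner rule ii(4) + ii(1) - ii(2) - ii(3)."""
    total = ii[bottom, right]                      # corner 4
    if top > 0:
        total -= ii[top - 1, right]                # corner 2
    if left > 0:
        total -= ii[bottom, left - 1]              # corner 3
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]             # corner 1
    return total

With this, any rectangle feature costs a constant number of lookups regardless of its size.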

Classifiers Design

Choose the best features by adaptive reweighting.

[Diagram: the pool of Haar-like features and the face/non-face training set feed into Adaboost, which selects the best features.]

Face Detection

Computation cost, e.g.:
• Image size: 320×240
• Sub-window size: 24×24
• Frame rate: 15 frames/sec

Each feature must be evaluated (320-24+1) × (240-24+1) × 15 = 966,735 times per second (ignoring scaling): a huge computation cost!

Face Detection

Use cascade classifiers.

Example: replace a single 200-feature classifier with ten 20-feature classifiers.

Face Detection

Advantages of cascade classifiers:
• Maintains accuracy.
• Speeds up detection.
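A toy sketch of the cascade's control flow (hypothetical; each stage is assumed to be a small boosted classifier returning True to pass the window on):

def cascade_classify(stages, window):
    # A window must pass every stage to be declared a face.
    # Most non-face windows are rejected by the first, cheapest stages,
    # which is where the speed-up over a single large classifier comes from.
    for stage in stages:
        if not stage(window):
            return False   # reject early, skip the remaining stages
    return True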

Experiments