Variations of Minimax Probability Machine
Huang, Kaizhu
2003-09-16
Overview
• Classification: types and problems
• Minimax Probability Machine
• Main work
  – Biased Minimax Probability Machine
  – Minimum Error Minimax Probability Machine
• Experiments
• Future work
Classification
[Figure: two data classes x and y separated by the hyperplane a^T z = b, with a^T x ≥ b on one side and a^T y ≤ b on the other]
Types of Classifiers
• Generative Classifiers
• Discriminative Classifiers
Classification—Generative Classifier
[Figure: class-conditional densities p1 and p2 fitted to the two classes, with the boundary a^T z = b drawn where they intersect]
A generative model assumes specific distributions on the two classes of data and uses these distributions to construct the classification boundary.
Problems of the Generative Model
• "All models are wrong, but some are useful" (Box)
• The distributional assumptions lack generality and are often invalid in real cases
It seems that a generative model should not assume a specific model on the data.
Classification—Discriminative Classifier: SVM
[Figure: the SVM decision hyperplane a^T z = b, with the margin determined by the support vectors]
Problems of SVM
[Figure: the SVM boundary is determined only by the support vectors and ignores how the rest of each class is distributed]
It seems that SVM should consider the distribution of the data.
SVM vs. Generative Models
• It seems that a generative model should not assume specific models on the data.
• It seems that SVM should consider the distribution of the data.
Minimax Probability Machine (MPM)
• Features:
  – With distribution considerations
  – With no specific distribution assumption
Minimax Probability Machine
• With distribution considerations
  – Assume the mean and covariance directly estimated from the data reliably represent the real mean and covariance
• Without a specific distribution assumption
  – Directly construct classifiers from the data
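The plug-in estimates the slides refer to can be computed directly with NumPy; the synthetic data and variable names below are illustrative, not from the slides.

```python
# Plug-in estimates of the per-class mean and covariance used by MPM.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(200, 2))    # class x samples
Y = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(200, 2))  # class y samples

x_bar, y_bar = X.mean(axis=0), Y.mean(axis=0)  # empirical means
Sigma_x = np.cov(X, rowvar=False)              # empirical covariances
Sigma_y = np.cov(Y, rowvar=False)
```

MPM then treats (x_bar, Sigma_x) and (y_bar, Sigma_y) as reliable stand-ins for the true class statistics.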
Minimax Probability Machine (Formulation)

    max_{α, a≠0, b}  α
    s.t.  inf_{x ~ (x̄, Σ_x)} Pr{a^T x ≥ b} ≥ α,
          inf_{y ~ (ȳ, Σ_y)} Pr{a^T y ≤ b} ≥ α

• Objective: maximize α, the worst-case classification accuracy over all distributions with the given means and covariances.
Minimax Probability Machine (Cont’d)
• The MPM problem leads to Second Order Cone Programming
• Dual problem:

    min_{a≠0}  sqrt(a^T Σ_x a) + sqrt(a^T Σ_y a)
    s.t.  a^T (x̄ − ȳ) = 1

  The optimal worst-case bound satisfies κ(α) = sqrt(α / (1 − α)) = 1 / (optimal value).
• Geometric interpretation: grow the ellipsoids {z : (z − x̄)^T Σ_x^{-1} (z − x̄) ≤ κ^2} and {z : (z − ȳ)^T Σ_y^{-1} (z − ȳ) ≤ κ^2} around the two means; the optimal κ is reached when the two ellipsoids first touch, and the optimal hyperplane is tangent to both at the touching point.
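As a rough numerical sketch of the dual above, the problem can be handed to a general-purpose solver; SLSQP here stands in for a dedicated second-order-cone solver, and the synthetic data and all names are illustrative.

```python
# Solve the MPM dual: min sqrt(a'Sx a) + sqrt(a'Sy a)  s.t.  a'(x_bar - y_bar) = 1.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
X = rng.normal([2.0, 0.0], 1.0, size=(300, 2))   # class x samples
Y = rng.normal([-2.0, 0.0], 1.0, size=(300, 2))  # class y samples
x_bar, y_bar = X.mean(axis=0), Y.mean(axis=0)
Sx, Sy = np.cov(X, rowvar=False), np.cov(Y, rowvar=False)

def dual_obj(a):
    # sqrt(a' Sx a) + sqrt(a' Sy a): the MPM dual objective
    return np.sqrt(a @ Sx @ a) + np.sqrt(a @ Sy @ a)

d = x_bar - y_bar
cons = {"type": "eq", "fun": lambda a: a @ d - 1.0}  # a'(x_bar - y_bar) = 1
res = minimize(dual_obj, x0=d / (d @ d), constraints=[cons])
a = res.x
kappa = 1.0 / dual_obj(a)            # optimal kappa
alpha = kappa**2 / (1 + kappa**2)    # worst-case accuracy lower bound
b = a @ x_bar - kappa * np.sqrt(a @ Sx @ a)  # threshold: classify as x if a'z >= b
```

The threshold b makes the x-class constraint tight, matching the symmetric role of the two classes in MPM.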
Minimax Probability Machine (Cont’d)
• Summary
  – Distribution-free
  – In the general case, the accuracy of classifying future data is bounded below by α
  – Demonstrated to achieve performance comparable with SVM
Problems of MPM
(Recall that in the MPM formulation both classes share the same lower bound α.)
1. In real cases the two classes are not always equally important, which implies the lower bound α for the two classes is not necessarily the same.
   – Motivates the Biased Minimax Probability Machine
2. On the other hand, no reason exists to require the two bounds to be equal. The derived model is thus non-optimal in this sense.
   – Motivates the Minimum Error Minimax Probability Machine
Biased Minimax Probability Machine
• Observation: in diagnosing a severe epidemic disease, misclassifying the positive class causes more serious consequences than misclassifying the negative class.
• A typical setting: as long as the classification accuracy of the less important class is maintained at an acceptable level (specified by the practitioners), the classification accuracy of the important class should be made as high as possible.
• Objective: maximize the worst-case accuracy α of the important class (α has the same meaning as before), subject to the other class keeping an acceptable accuracy level β0.
Biased Minimax Probability Machine (BMPM)

    max_{α, β, a≠0, b}  α
    s.t.  inf_{x ~ (x̄, Σ_x)} Pr{a^T x ≥ b} ≥ α,
          inf_{y ~ (ȳ, Σ_y)} Pr{a^T y ≤ b} ≥ β,
          β ≥ β0

• Equivalently, with κ(α) = sqrt(α / (1 − α)) and κ(β0) = sqrt(β0 / (1 − β0)):

    max_{α, a≠0}  κ(α)
    s.t.  a^T (x̄ − ȳ) ≥ κ(α) sqrt(a^T Σ_x a) + κ(β0) sqrt(a^T Σ_y a)
BMPM (Cont’d)
• Equivalently (a fractional program):

    max_{a≠0}  [a^T (x̄ − ȳ) − κ(β0) sqrt(a^T Σ_y a)] / sqrt(a^T Σ_x a)

• Equivalently, after normalizing a^T (x̄ − ȳ) = 1:

    max_{a≠0}  [1 − κ(β0) sqrt(a^T Σ_y a)] / sqrt(a^T Σ_x a)
    s.t.  a^T (x̄ − ȳ) = 1
BMPM (Cont’d)
• Parametric method
  1. Find a_n by solving

       max_{a≠0}  1 − κ(β0) sqrt(a^T Σ_y a) − λ_n sqrt(a^T Σ_x a)   s.t.  a^T (x̄ − ȳ) = 1,

     or equivalently

       min_{a≠0}  κ(β0) sqrt(a^T Σ_y a) + λ_n sqrt(a^T Σ_x a)   s.t.  a^T (x̄ − ȳ) = 1

  2. Update

       λ_{n+1} = [1 − κ(β0) sqrt(a_n^T Σ_y a_n)] / sqrt(a_n^T Σ_x a_n)

• Least-squares approach
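The two-step parametric iteration above can be sketched numerically as follows; SLSQP again stands in for a proper solver, and beta0, the synthetic data, and all names are illustrative assumptions.

```python
# Parametric (Dinkelbach-style) iteration for the BMPM fractional program.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
X = rng.normal([2.0, 0.0], 1.0, size=(300, 2))   # important class
Y = rng.normal([-2.0, 0.0], 1.0, size=(300, 2))  # less important class
x_bar, y_bar = X.mean(axis=0), Y.mean(axis=0)
Sx, Sy = np.cov(X, rowvar=False), np.cov(Y, rowvar=False)

beta0 = 0.6                            # acceptable accuracy level (illustrative)
kappa0 = np.sqrt(beta0 / (1.0 - beta0))  # kappa(beta0)
d = x_bar - y_bar
cons = {"type": "eq", "fun": lambda a: a @ d - 1.0}
a = d / (d @ d)                        # feasible starting point
lam = 0.0
for _ in range(50):
    # Step 1: min kappa0*sqrt(a'Sy a) + lam*sqrt(a'Sx a)  s.t.  a'(x_bar - y_bar) = 1
    res = minimize(lambda a: kappa0 * np.sqrt(a @ Sy @ a)
                   + lam * np.sqrt(a @ Sx @ a),
                   x0=a, constraints=[cons])
    a = res.x
    # Step 2: lam <- (1 - kappa0*sqrt(a'Sy a)) / sqrt(a'Sx a)
    new_lam = (1 - kappa0 * np.sqrt(a @ Sy @ a)) / np.sqrt(a @ Sx @ a)
    if abs(new_lam - lam) < 1e-6:
        break
    lam = new_lam

alpha = lam**2 / (1 + lam**2)  # at convergence lam = kappa(alpha)
b = a @ x_bar - lam * np.sqrt(a @ Sx @ a)  # threshold: classify as x if a'z >= b
```

At convergence the ratio λ equals the optimal fractional objective, i.e. κ(α) for the important class.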
Biased Minimax Probability Machine
[Figure: compared with the MPM hyperplane a^T z = b, the BMPM hyperplane a_bmpm^T z = b_bmpm shifts toward the less important class y, keeping that class at an acceptable accuracy level while raising the accuracy on the important class x]
Minimum Error Minimax Probability Machine

[Figure: two class densities p1 and p2 with the decision plane obtained when α = β]

MPM:

    max_{α, a≠0, b}  α
    s.t.  inf_{x ~ (x̄, Σ_x)} Pr{a^T x ≥ b} ≥ α,
          inf_{y ~ (ȳ, Σ_y)} Pr{a^T y ≤ b} ≥ α

MEMPM:

    max_{α, β, a≠0, b}  θα + (1 − θ)β
    s.t.  inf_{x ~ (x̄, Σ_x)} Pr{a^T x ≥ b} ≥ α,
          inf_{y ~ (ȳ, Σ_y)} Pr{a^T y ≤ b} ≥ β

[Figure: the same densities with the optimal decision plane, which need not equalize the two bounds]

The MEMPM achieves the distribution-free Bayes optimal hyperplane in the worst-case setting.
Minimum Error Minimax Probability Machine
• MEMPM achieves the Bayes optimal hyperplane when we assume a specific distribution, e.g. a Gaussian distribution, on the data.
Lemma: If the distribution of the normalized random variable is independent of a, the classifier derived by MEMPM exactly represents the real Bayes optimal hyperplane.
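As a toy check of the Gaussian case mentioned above: for two equal-prior, equal-variance 1-D Gaussians, the Bayes-optimal decision point is the midpoint of the means. The means and variance below are illustrative.

```python
# Locate the Bayes boundary of two equal-prior 1-D Gaussians numerically:
# Bayes picks the class with higher density, so the boundary is where the
# two densities cross.
import numpy as np
from scipy.stats import norm

m1, m2, s = 2.0, -2.0, 1.0
xs = np.linspace(-6, 6, 2001)
diff = norm.pdf(xs, m1, s) - norm.pdf(xs, m2, s)
crossing = xs[np.argmin(np.abs(diff))]
# crossing equals (m1 + m2) / 2 = 0 up to grid resolution
```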
MEMPM (Cont’d)
• Objective:

    max_{α, β, a≠0}  θα + (1 − θ)β
    s.t.  1 ≥ κ(α) sqrt(a^T Σ_x a) + κ(β) sqrt(a^T Σ_y a),
          a^T (x̄ − ȳ) = 1

• Equivalently, writing κ1 = κ(α), κ2 = κ(β) and using 1 − α = 1 / (1 + κ(α)^2):

    min_{(κ1, κ2), a≠0}  θ / (1 + κ1^2) + (1 − θ) / (1 + κ2^2)
    s.t.  1 ≥ κ1 sqrt(a^T Σ_x a) + κ2 sqrt(a^T Σ_y a),
          a^T (x̄ − ȳ) = 1
MEMPM (Cont’d)
• Objective (eliminating κ2 through the constraint):

    max_{κ1, a≠0}  θ κ1^2 / (1 + κ1^2) + (1 − θ) κ2^2 / (1 + κ2^2)
    s.t.  a^T (x̄ − ȳ) = 1,
    where  κ2 = [1 − κ1 sqrt(a^T Σ_x a)] / sqrt(a^T Σ_y a)

• Solved by a line search over κ1 combined with a sequential BMPM method
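The rewriting above leans on the bijection between an accuracy bound α and κ(α) = sqrt(α / (1 − α)); a quick numeric sanity check (function names are mine, for illustration):

```python
# alpha <-> kappa conversion used throughout the MEMPM derivation.
import math

def kappa(alpha):
    # kappa(alpha) = sqrt(alpha / (1 - alpha)), increasing in alpha
    return math.sqrt(alpha / (1.0 - alpha))

def alpha_of(k):
    # inverse mapping: alpha = kappa^2 / (1 + kappa^2)
    return k**2 / (1.0 + k**2)

assert abs(alpha_of(kappa(0.9)) - 0.9) < 1e-12
```

Because the map is monotone, maximizing κ(α) and maximizing α are interchangeable, which is what licenses the line search over κ1.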
Kernelized Version
• Map the data via φ: R^n → R^f (a mapping function):

    φ(x) ~ (x̄_φ, Σ_φx),   φ(y) ~ (ȳ_φ, Σ_φy)

• Kernelized BMPM: write a as an expansion over the mapped training points,

    a = Σ_{i=1}^{Nx} μ_i φ(x_i) + Σ_{j=1}^{Ny} υ_j φ(y_j),

  and solve

    max_{a≠0}  [1 − κ(β0) sqrt(a^T Σ_φy a)] / sqrt(a^T Σ_φx a)
    s.t.  a^T (x̄_φ − ȳ_φ) = 1

• where the plug-in estimates in feature space are

    x̄_φ = (1/Nx) Σ_{i=1}^{Nx} φ(x_i),   ȳ_φ = (1/Ny) Σ_{j=1}^{Ny} φ(y_j),
    Σ_φx = (1/Nx) Σ_{i=1}^{Nx} (φ(x_i) − x̄_φ)(φ(x_i) − x̄_φ)^T,
    Σ_φy = (1/Ny) Σ_{j=1}^{Ny} (φ(y_j) − ȳ_φ)(φ(y_j) − ȳ_φ)^T.
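The feature-space quantities above never need φ explicitly; everything reduces to Gram-matrix blocks, which can be sketched in NumPy (the Gaussian kernel, the synthetic data, and all names are illustrative assumptions):

```python
# Gram-matrix quantities for the kernelized formulation: z stacks the x- and
# y-samples, K is the full Gram matrix, k_x / k_y are the block-row averages,
# and Kx_t / Ky_t are the centered blocks.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(1.0, 1.0, size=(20, 2))
Y = rng.normal(-1.0, 1.0, size=(30, 2))
Z = np.vstack([X, Y])          # z_i = x_i for i <= Nx, then the y_j
Nx, Ny = len(X), len(Y)

def gauss_kernel(A, B, sigma=1.0):
    # K(u, v) = exp(-||u - v||^2 / (2 sigma^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

K = gauss_kernel(Z, Z)         # (Nx+Ny) x (Nx+Ny) Gram matrix
Kx, Ky = K[:Nx, :], K[Nx:, :]  # block rows for each class
k_x = Kx.mean(axis=0)          # k~_x: averages over the x-block
k_y = Ky.mean(axis=0)          # k~_y: averages over the y-block
Kx_t = Kx - k_x                # K~_x = K_x - 1 k~_x^T
Ky_t = Ky - k_y                # K~_y = K_y - 1 k~_y^T
```

The centered blocks satisfy (1/Nx) w^T Kx_t^T Kx_t w = a^T Σ_φx a for the expansion coefficients w, which is exactly the substitution the kernelized BMPM uses.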
Kernelized Version (Cont’d)
• Kernelized BMPM, written in terms of the Gram matrix:

    max_{w≠0}  [1 − κ(β0) sqrt((1/Ny) w^T K̃_y^T K̃_y w)] / sqrt((1/Nx) w^T K̃_x^T K̃_x w)
    s.t.  w^T (k̃_x − k̃_y) = 1

• where w = [μ_1, …, μ_Nx, υ_1, …, υ_Ny]^T and, with z_i = x_i for i = 1, …, Nx and z_{Nx+j} = y_j for j = 1, …, Ny:

    K_ij = φ(z_i)^T φ(z_j),
    [k̃_x]_i = (1/Nx) Σ_{j=1}^{Nx} K(x_j, z_i),   [k̃_y]_i = (1/Ny) Σ_{j=1}^{Ny} K(y_j, z_i),
    K̃_x = K_x − 1_{Nx} k̃_x^T,   K̃_y = K_y − 1_{Ny} k̃_y^T,

  and K_x, K_y are the first Nx and last Ny rows of K.
• Decision function:

    f(z) = Σ_{i=1}^{Nx} w_i* K(x_i, z) + Σ_{j=1}^{Ny} w_{Nx+j}* K(y_j, z) + b*
Illustration of Kernel Methods
[Figure: a linear boundary vs. the boundary obtained after the kernel mapping]
Experimental Results (BMPM)
• Five benchmark datasets: Twonorm, Breast, Ionosphere, Pima, Sonar
• Procedure: 5-fold cross validation, with a linear and a Gaussian kernel
• Parameter setting (acceptable level β0): 20.0% for Pima, 60.0% for the others
Experimental results
Experiments for MEMPM
• Six benchmark datasets– Twonorm, Breast, Ionosphere, Pima, Heart, Vote
• Procedure – 10-fold cross validation– Linear
– Gaussian Kernel
Results for MEMPM
Conclusions and Future Work
• Conclusions
  – First quantitative method to analyze the biased classification task
  – Minimizes the classification error rate in the worst case
• Future work
  – Improve the efficiency of the algorithm, especially in the kernelized version (any decomposition method?)
  – Robust estimation
  – Relation between the VC bound in Support Vector Machines and the bound in MEMPM
  – Regression model?
References
• Popescu, I. and Bertsimas, D. (2001). Optimal inequalities in probability theory: A convex optimization approach. Technical Report TM62, INSEAD.
• Lanckriet, G. R. G., El Ghaoui, L., and Jordan, M. I. (2002). Minimax probability machine. In Advances in Neural Information Processing Systems (NIPS) 14, Cambridge, MA. MIT Press.
• Huang, K., Yang, H., King, I., Lyu, M. R., and Chan, L. (2003). Biased minimax probability machine.
• Huang, K., Yang, H., King, I., Lyu, M. R., and Chan, L. (2003). Minimum error minimax probability machine.