A Semi-naive Bayes Classifier with Grouping of Cases J. Abellán, A. Cano, A. R. Masegosa, S. Moral Department of Computer Science and A.I. University of Granada Spain


Page 1: A Semi-naive Bayes Classifier with Grouping of Cases

A Semi-naive Bayes Classifier with Grouping of Cases

J. Abellán, A. Cano, A. R. Masegosa, S. Moral

Department of Computer Science and A.I. University of Granada

Spain

Page 2: A Semi-naive Bayes Classifier with Grouping of Cases

Outline
  1. Introduction.
  2. Semi-Naive Bayes Classifier with Grouping of Cases.
       • General Description
       • The Joining Criteria
       • The Grouping Criteria
  3. Experimental Evaluation.
  4. Conclusions and Future Work.

Page 3: A Semi-naive Bayes Classifier with Grouping of Cases

Introduction
Information from a database

Data Base
(attribute variables: Calcium, Tumor, Coma, Migraine; class variable: Cancer)

Calcium   Tumor    Coma      Migraine   Cancer
normal    a1       absent    absent     absent
high      a1       present   absent     present
normal    a1       absent    absent     absent
normal    a1       absent    absent     absent
high      ao       present   present    absent
......    ......   ......    ......     ......

Page 4: A Semi-naive Bayes Classifier with Grouping of Cases

Introduction
Naive Bayes (Duda & Hart, 1973)

  • Attribute variables {X_i | i = 1, ..., r}. Class variable C = {c_1, ..., c_k}.
  • New observation z = (z_1, ..., z_r), i.e. (X_1 = z_1, ..., X_r = z_r).
  • Select the state of C: $\arg\max_{c_i} P(c_i \mid z)$.
  • Assuming the attributes are independent once the class variable is known: $\arg\max_{c_i} P(c_i) \prod_{j=1}^{r} P(z_j \mid c_i)$.

Graphical Structure: the class node C is the parent of X_1, X_2, ..., X_r.
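The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the training rows are the five visible rows of the example database, and Laplace smoothing is an assumption added here to avoid zero probabilities.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    n_attrs = len(rows[0])
    # value_counts[j][c][v] = number of rows with class c and X_j = v
    value_counts = [defaultdict(Counter) for _ in range(n_attrs)]
    values = [set() for _ in range(n_attrs)]
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[j][c][v] += 1
            values[j].add(v)
    return class_counts, value_counts, values


def predict(model, z):
    """Return arg max over c of P(c) * prod_j P(z_j | c), in log space."""
    class_counts, value_counts, values = model
    n = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / n)
        for j, v in enumerate(z):
            # Laplace smoothing (an assumption of this sketch).
            smoothed = (value_counts[j][c][v] + 1) / (n_c + len(values[j]))
            score += math.log(smoothed)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Attributes: Calcium, Tumor, Coma, Migraine. Class: Cancer.
rows = [
    ("normal", "a1", "absent", "absent"),
    ("high", "a1", "present", "absent"),
    ("normal", "a1", "absent", "absent"),
    ("normal", "a1", "absent", "absent"),
    ("high", "ao", "present", "present"),
]
labels = ["absent", "present", "absent", "absent", "absent"]

model = train_naive_bayes(rows, labels)
prediction = predict(model, ("normal", "a1", "absent", "absent"))
```

Working in log space keeps the product of many small probabilities from underflowing when r is large.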

Page 5: A Semi-naive Bayes Classifier with Grouping of Cases

Introduction
Naive Bayes Classifiers

NB's performance is comparable to that of some state-of-the-art classifiers, even though its independence assumption usually does not hold in practice.

Question: "Can the performance be better when the conditional independence assumption of NB is relaxed?"

Page 6: A Semi-naive Bayes Classifier with Grouping of Cases

Introduction
Semi-Naive Bayes Classifiers

Semi-Naive Bayesian Classifiers (SNB)
  • A looser assumption than NB.
  • Independence is assumed among the joined variables given the class variable C.

Page 7: A Semi-naive Bayes Classifier with Grouping of Cases

Introduction
Semi-Naive Bayes Classifiers

Main problem of the Semi-NB approach: when to join two variables? (Joining Criterion)
  • Kononenko's criterion is entropy based: class entropy reduction.
  • Pazzani's criterion is accuracy based: wrapper estimation, with very high complexity when the number of variables is high.

Page 8: A Semi-naive Bayes Classifier with Grouping of Cases

A SNB with Grouping of Cases
Joining Method

Three new proposals for Joining Criteria:
  • BDe: Bayesian Dirichlet Equivalent.
  • L1O: The Expected Log-likelihood under leaving-one-out.
  • LRT: Log-likelihood Ratio Test.

Page 9: A Semi-naive Bayes Classifier with Grouping of Cases

A SNB with Grouping of Cases
Grouping Method

Joining two variables increases the number of parameters to estimate:
  • Independent: P(X_i | C) P(X_j | C). No. of parameters: #(C) (#(X_i) + #(X_j))
  • Dependent: P(X_i, X_j | C). No. of parameters: #(C) #(X_i) #(X_j)

Solution: "Grouping cases of the new variable".

Figure label: Similar Information
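A small worked example makes the growth in parameters concrete. The cardinalities below are hypothetical and the function names are illustrative; the counts follow the two expressions on the slide as written.

```python
def params_independent(n_class, n_xi, n_xj):
    """#(C) * (#(Xi) + #(Xj)): two separate tables P(Xi | C) and P(Xj | C)."""
    return n_class * (n_xi + n_xj)


def params_joined(n_class, n_xi, n_xj):
    """#(C) * #(Xi) * #(Xj): one table P(Xi, Xj | C)."""
    return n_class * n_xi * n_xj


# Hypothetical cardinalities: 2 classes, Xi with 3 cases, Xj with 4 cases.
independent = params_independent(2, 3, 4)
joined = params_joined(2, 3, 4)
```

With these numbers the joined variable needs 24 parameters instead of 14, which is the cost that grouping cases of the new variable is meant to reduce.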

Page 10: A Semi-naive Bayes Classifier with Grouping of Cases

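A SNB with Grouping of Cases
Example

Initial structure: C is the parent of X1, X2, ..., Xr.

Joining Phase: each pair of variables is evaluated using a joining criterion (JC).
Resulting structure: C is the parent of X5 x X9, X1, ..., Xr.

Grouping Phase: each pair of cases is evaluated using a grouping criterion (GC); the figure marks the grouped cases with the label "Similar Information".
Resulting structure: C is the parent of X5 x X9, X1, ..., Xr.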
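The two phases act on the data as simple transformations, sketched below. This shows only the data operations, not the criteria that decide when to apply them; the function names, the toy rows, and the choice to place the compound attribute first are all assumptions of this sketch.

```python
def join_variables(rows, i, j):
    """Replace attributes i and j by one compound attribute (x_i, x_j).

    The compound attribute is placed first, followed by the remaining
    attributes in their original order.
    """
    joined_rows = []
    for row in rows:
        rest = [v for k, v in enumerate(row) if k not in (i, j)]
        joined_rows.append(tuple([(row[i], row[j])] + rest))
    return joined_rows


def group_cases(rows, k, case_a, case_b):
    """Merge two cases of attribute k into a single grouped case."""
    merged = frozenset([case_a, case_b])
    grouped_rows = []
    for row in rows:
        new_row = list(row)
        if row[k] in (case_a, case_b):
            new_row[k] = merged
        grouped_rows.append(tuple(new_row))
    return grouped_rows


# Toy data with three attributes (hypothetical values).
rows = [("a", "x", 1), ("a", "y", 2), ("b", "x", 1)]

joined = join_variables(rows, 0, 1)
grouped = group_cases(joined, 0, ("a", "x"), ("b", "x"))
n_cases_before = len({row[0] for row in joined})
n_cases_after = len({row[0] for row in grouped})
```

After joining, the compound attribute has three observed cases; grouping two of them leaves two, so fewer parameters have to be estimated.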

Page 11: A Semi-naive Bayes Classifier with Grouping of Cases

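Joining Criteria
BDe criterion

Bayesian Dirichlet equivalent Metric (BDe)

"Bayesian scores measure the quality of a model, M, as the posterior probability of the model given the learning data D"

JC(BDe) = Score(M1 : D) - Score(M2 : D)

Figure: the two structures being compared, labelled M1 and M2: one in which C is the parent of X and Y separately, and one in which C is the parent of the joined variable X x Y.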
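A score difference of this kind can be sketched with the standard BDeu form of the Bayesian Dirichlet equivalent metric. The slide does not give the prior or the sign convention, so the equivalent sample size `ess`, the uniform prior, the function names, and the choice "joined minus separate" (positive values favour joining) are assumptions of this sketch. The class node's own term is omitted because it is identical in both models and cancels.

```python
from collections import Counter
from math import lgamma


def bde_family_score(values, classes, n_values, n_classes, ess=1.0):
    """Log marginal likelihood of one attribute given the class (BDeu prior)."""
    alpha_c = ess / n_classes
    alpha_cv = ess / (n_classes * n_values)
    score = 0.0
    for count in Counter(classes).values():
        score += lgamma(alpha_c) - lgamma(alpha_c + count)
    # Cells with zero count contribute nothing, so only observed cells are summed.
    for count in Counter(zip(classes, values)).values():
        score += lgamma(alpha_cv + count) - lgamma(alpha_cv)
    return score


def bde_joining_criterion(xs, ys, classes, ess=1.0):
    """Score of the model with X x Y joined minus the score with X, Y separate."""
    n_classes = len(set(classes))
    n_x, n_y = len(set(xs)), len(set(ys))
    joined = bde_family_score(list(zip(xs, ys)), classes, n_x * n_y, n_classes, ess)
    separate = (bde_family_score(xs, classes, n_x, n_classes, ess)
                + bde_family_score(ys, classes, n_y, n_classes, ess))
    return joined - separate


# Hypothetical data: 40 samples, 20 per class.
classes = [0] * 20 + [1] * 20
xs = ([0] * 10 + [1] * 10) * 2
ys_dependent = list(xs)                    # Y always equals X
ys_independent = ([0] * 5 + [1] * 5) * 4   # Y independent of X given the class

jc_dependent = bde_joining_criterion(xs, ys_dependent, classes)
jc_independent = bde_joining_criterion(xs, ys_independent, classes)
```

On this toy data the criterion is positive when Y duplicates X and negative when the two are independent given the class, since the Bayesian score penalizes the extra parameters of the joined table.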

Page 12: A Semi-naive Bayes Classifier with Grouping of Cases

Joining Criteria
L1O criterion

Expected Log-Likelihood Under Leave-One-Out (L1O).
  • Leave-one-out estimation
  • Laplace estimation

"The estimation of the log-likelihood of the class is carried out with a leave-one-out scheme computed with a closed equation"

Page 13: A Semi-naive Bayes Classifier with Grouping of Cases

Joining Criteria
LRT criterion

Log-likelihood Ratio Test (LRT):

"Comparison of two nested models: M1, with merged variables, and M2, in which the variables are independent"

Corrector Factor: accounts for the number of total comparisons over the n active variables.
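The test statistic for two nested models of this kind can be sketched as follows. This is the textbook likelihood-ratio (G) statistic for conditional independence of X and Y given C, computed from maximum-likelihood estimates; it is conventionally compared against a chi-square distribution with the returned degrees of freedom. The corrector factor from the slide is not reproduced, and the function name and toy data are assumptions.

```python
from collections import Counter
from math import log


def lrt_statistic(xs, ys, classes):
    """Return (G, degrees of freedom) for joined vs. independent X, Y given C.

    G = 2 * sum over observed (c, x, y) of
        N_cxy * ln(N_cxy * N_c / (N_cx * N_cy))
    """
    n_c = Counter(classes)
    n_cx = Counter(zip(classes, xs))
    n_cy = Counter(zip(classes, ys))
    n_cxy = Counter(zip(classes, xs, ys))
    g = 0.0
    for (c, x, y), n in n_cxy.items():
        g += 2.0 * n * log(n * n_c[c] / (n_cx[c, x] * n_cy[c, y]))
    # Extra free parameters of the joined model, per class: (#X - 1)(#Y - 1).
    dof = len(n_c) * (len(set(xs)) - 1) * (len(set(ys)) - 1)
    return g, dof


# Hypothetical data: 40 samples, 20 per class.
classes = [0] * 20 + [1] * 20
xs = ([0] * 10 + [1] * 10) * 2
ys_dependent = list(xs)                    # Y always equals X
ys_independent = ([0] * 5 + [1] * 5) * 4   # Y independent of X given the class

g_dep, dof_dep = lrt_statistic(xs, ys_dependent, classes)
g_ind, dof_ind = lrt_statistic(xs, ys_independent, classes)
```

A large G relative to the chi-square critical value suggests joining the pair; when many pairs are tested, some correction for multiple comparisons is needed, which is the role the slide assigns to its corrector factor.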

Page 14: A Semi-naive Bayes Classifier with Grouping of Cases

Grouping Method
Hypotheses

Hypotheses: Model Selection Problem
  • Sample data D is restricted to X = x_i or X = x_j.
  • Consider x_i and x_j the only possible cases of X.
  • Grouping x_i and x_j implies X has only one case.

Figure label: Similar Information

Page 15: A Semi-naive Bayes Classifier with Grouping of Cases

Grouping Method
Criteria

BDe score:

L1O score:

LRT score:

Page 16: A Semi-naive Bayes Classifier with Grouping of Cases

Experimental Evaluation
Details

  • SNB-G was implemented in Elvira.
  • Integrated in Weka for evaluation.
  • Tested on 13 databases without missing values from the UCI repository.
  • 10-fold cross-validation repeated 10 times.
  • Comparison with a corrected paired t-test at the 5% level.
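The slide does not say which correction is used. The sketch below assumes the corrected resampled t-test of Nadeau and Bengio, which is the correction commonly applied to repeated cross-validation results; the function name and the accuracy differences are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def corrected_resampled_t(differences, test_train_ratio):
    """Corrected paired t statistic over k paired performance differences.

    t = mean(d) / sqrt((1/k + n_test/n_train) * var(d))
    The extra n_test/n_train term widens the variance estimate to account
    for the overlap between training sets across folds and repetitions.
    """
    k = len(differences)
    return mean(differences) / sqrt((1.0 / k + test_train_ratio) * variance(differences))


# 10-fold cross-validation repeated 10 times gives k = 100 differences,
# and each test fold is 1/9 the size of its training set.
differences = [0.01, 0.03, 0.02, 0.00, 0.04] * 20  # hypothetical accuracy gaps
t_corrected = corrected_resampled_t(differences, 1.0 / 9.0)

# Uncorrected paired t statistic, for comparison only.
t_plain = mean(differences) / sqrt(variance(differences) / len(differences))
```

The statistic is compared against a Student t distribution with k - 1 degrees of freedom. The corrected value is always smaller in magnitude than the uncorrected one, so fewer differences are declared significant.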

Page 17: A Semi-naive Bayes Classifier with Grouping of Cases

Evaluating Joining Criteria
Naive Bayes Comparison

  • The trade-off between accuracy and log-likelihood is better for LRT.
  • L1O works badly as a joining criterion.

Page 18: A Semi-naive Bayes Classifier with Grouping of Cases

Evaluating Joining Criteria
Pazzani's semi-NB comparison

  • LRT works slightly better than BDe.
  • Similar performance with a lower time complexity.

LRT is the best joining criterion

Page 19: A Semi-naive Bayes Classifier with Grouping of Cases

Evaluating Grouping Criteria
Naive Bayes Comparison

LRT Joining + Grouping Method
  • No strong differences among the criteria.
  • L1O slightly better.

L1O is the best grouping criterion

Page 20: A Semi-naive Bayes Classifier with Grouping of Cases

Pazzani's Semi-NB Comparison
SNB-G = LRT Joining + L1O Grouping

Similar performance:

Dramatic building time reduction:

Page 21: A Semi-naive Bayes Classifier with Grouping of Cases

State-of-the-art Classifiers
AODE, TAN and LBR comparison

  • Three wins against NB.
  • One win vs. one defeat against AODE.
  • No difference against TAN and LBR.
  • One win against Pazzani's Semi-NB.

Page 22: A Semi-naive Bayes Classifier with Grouping of Cases

Conclusions and Future Work

A preprocessing step for Naive Bayes:
  • Method for joining variables.
  • Combined method for grouping cases.

Very efficient, with similar performance with respect to Pazzani's Semi-NB classifier.

Future work:
  • Application to high-dimensionality data sets.
  • Generalization of the methodology to other models: decision trees and the TAN model.