A Semi-naive Bayes Classifier with Grouping of Cases

J. Abellán, A. Cano, A. R. Masegosa, S. Moral

Department of Computer Science and A.I., University of Granada, Spain

Outline

1. Introduction.
2. Semi-Naive Bayes Classifier with Grouping of Cases.
   General Description
   The Joining Criteria
   The Grouping Criteria
3. Experimental Evaluation.
4. Conclusions and Future Work.

Introduction: Information from a data base

Data Base (attribute variables: Calcium, Tumor, Coma, Migraine; class variable: Cancer)

Calcium   Tumor    Coma      Migraine   Cancer
normal    a1       absent    absent     absent
high      a1       present   absent     present
high      a0       present   present    absent
......    ......   ......    ......     ......

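Introduction: Naive Bayes (Duda & Hart, 1973)

Attribute variables $\{X_i \mid i = 1, \dots, r\}$. Class variable $C = \{c_1, \dots, c_k\}$.
New observation $z = (z_1, \dots, z_r)$, i.e. $(X_1 = z_1, \dots, X_r = z_r)$.
Select the state of $C$: $\arg\max_{c_i} P(c_i \mid z)$.
Assuming the attributes are independent given the class variable: $\arg\max_{c_i} P(c_i) \prod_{j=1}^{r} P(z_j \mid c_i)$.

Graphical Structure: $C$ is the parent of each attribute $X_1, X_2, \dots, X_r$.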
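
The decision rule on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `train_nb` and `predict_nb`, the Laplace smoothing, and the toy data are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count class frequencies and per-attribute value frequencies given the class."""
    class_counts = Counter(labels)
    # value_counts[j][c][v] = number of training rows with class c and X_j = v
    value_counts = [defaultdict(Counter) for _ in rows[0]]
    values = [set() for _ in rows[0]]
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[j][c][v] += 1
            values[j].add(v)
    return class_counts, value_counts, values

def predict_nb(model, z):
    """Return arg max_c P(c) * prod_j P(z_j | c), computed in log space."""
    class_counts, value_counts, values = model
    n = sum(class_counts.values())
    k = len(class_counts)
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        # Laplace-smoothed estimates (an assumption of this sketch)
        score = math.log((n_c + 1) / (n + k))
        for j, v in enumerate(z):
            score += math.log((value_counts[j][c][v] + 1) / (n_c + len(values[j])))
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical toy data: two attributes, class "yes"/"no"
rows = [("high", "present"), ("high", "present"), ("normal", "absent"), ("normal", "absent")]
labels = ["yes", "yes", "no", "no"]
model = train_nb(rows, labels)
prediction = predict_nb(model, ("high", "present"))
```

Working in log space avoids numerical underflow when the number of attributes r is large.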

Introduction: Naive Bayes Classifiers

Naive Bayesian Classifiers: NB's performance is comparable with some state-of-the-art classifiers, even though its independence assumption usually does not hold in practice.

Question: "Can the performance be better when the conditional independence assumption of NB is relaxed?"

Introduction: Semi-Naive Bayes Classifiers

Semi-Naive Bayesian Classifiers (SNB): a looser assumption than NB.
Independence is assumed among the joined variables given the class variable C.

Introduction: Semi-Naive Bayes Classifiers

Main problems of the Semi-NB approach: when to join two variables? (Joining Criterion)

Kononenko's criterion is entropy based.

Pazzani's criterion is accuracy based: wrapper estimation, with very high complexity when the number of variables is high.

Class entropy reduction

A SNB with Grouping of Cases: Joining Method

Three new proposals for joining criteria:
  BDe: Bayesian Dirichlet Equivalent.
  L1O: the Expected Log-likelihood under leaving-one-out.
  LRT: Log-likelihood Ratio Test.

A SNB with Grouping of Cases: Grouping Method

Increment in parameter estimations:
  Independent: P(Xi | C) P(Xj | C). Number of parameters: #(C) (#(Xi) + #(Xj))
  Dependent: P(Xi, Xj | C). Number of parameters: #(C) #(Xi) #(Xj)

Solution: "Grouping cases of the new variable".

Similar Information
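
The two parameter counts on this slide can be compared with a few lines of Python. The function names and the example sizes are hypothetical, and `n_params_after_grouping` is an extrapolation under the same counting convention (each merge of two cases removes one case of the joined variable), not a formula taken from the slide.

```python
def n_params_independent(n_c, n_xi, n_xj):
    """#(C) (#(Xi) + #(Xj)): Xi and Xj kept as separate variables."""
    return n_c * (n_xi + n_xj)

def n_params_joined(n_c, n_xi, n_xj):
    """#(C) #(Xi) #(Xj): Xi and Xj replaced by their Cartesian product."""
    return n_c * n_xi * n_xj

def n_params_after_grouping(n_c, n_xi, n_xj, n_merges):
    """Assumption: every merge of two cases removes one case of the joined variable."""
    return n_c * (n_xi * n_xj - n_merges)

# Hypothetical sizes: 2 classes, Xi with 4 cases, Xj with 5 cases
independent = n_params_independent(2, 4, 5)        # 18
joined = n_params_joined(2, 4, 5)                  # 40
grouped = n_params_after_grouping(2, 4, 5, 11)     # 18
```

With these sizes, joining more than doubles the number of parameters, and eleven merges bring the count back to that of the independent model.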

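A SNB with Grouping of Cases: Example

Joining Phase: each pair of variables is evaluated using a joining criterion (JC).
[Figure: the structure C -> X1, X2, ..., Xr becomes C -> X5 x X9, X1, ..., Xr.]

Grouping Phase: each pair of cases is evaluated using a grouping criterion (GC); cases with similar information are grouped.
[Figure: the structure C -> X5 x X9, X1, ..., Xr after grouping.]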
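
As a rough sketch of what the two phases do to the data, the following Python shows one join and one grouping step. The helpers `join_columns` and `group_cases` are hypothetical names; the choice of which pair to join or group (the JC and GC evaluation) is deliberately left out, and the joined column is appended last purely for convenience.

```python
def join_columns(rows, i, j):
    """Replace columns i and j by one column whose cases are the pairs (x_i, x_j)."""
    joined = []
    for row in rows:
        rest = [v for k, v in enumerate(row) if k not in (i, j)]
        joined.append(tuple(rest) + ((row[i], row[j]),))
    return joined

def group_cases(rows, col, case_a, case_b):
    """Merge two cases of column `col` into a single case (a frozenset of both)."""
    merged = frozenset([case_a, case_b])
    return [
        tuple(merged if (k == col and v in (case_a, case_b)) else v
              for k, v in enumerate(row))
        for row in rows
    ]

# Hypothetical toy data with three attribute columns
rows = [("a", "x", "p"), ("b", "y", "q"), ("a", "y", "p")]
joined_rows = join_columns(rows, 0, 1)
grouped_rows = group_cases(joined_rows, 1, ("a", "x"), ("a", "y"))
n_cases = len({row[1] for row in grouped_rows})
```

After the grouping step the joined column has two distinct cases instead of three.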

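Joining Criteria: BDe criterion

Bayesian Dirichlet equivalent metric (BDe):
"Bayesian scores measure the quality of a model, M, as the posterior probability of the model given the learning data D."

$JC(BDe) = Score(M_1 : D) - Score(M_2 : D)$

[Figure: the two structures compared, labelled M1 and M2: C as parent of the separate variables X and Y, and C as parent of the joined variable X x Y.]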
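
A score difference of this kind can be sketched in Python under stated assumptions. The slide does not give the prior, so a uniform (BDeu-style) Dirichlet prior with equivalent sample size `ess` is assumed here; the function names are hypothetical, and the sketch makes no claim about which structure the slide labels M1 or M2. The term for P(C) is the same in both models and cancels, so it is omitted.

```python
import math
from collections import Counter

def log_marginal(values, classes, n_states, n_classes, ess=1.0):
    """Log marginal likelihood of P(V | C) under a uniform Dirichlet prior (assumption)."""
    alpha_cell = ess / (n_classes * n_states)   # prior count per (class, state) cell
    alpha_class = ess / n_classes               # prior count per class
    n_cv = Counter(zip(classes, values))
    n_c = Counter(classes)
    score = 0.0
    for n in n_c.values():
        score += math.lgamma(alpha_class) - math.lgamma(alpha_class + n)
    for n in n_cv.values():
        # cells with zero count contribute nothing, so only observed cells are visited
        score += math.lgamma(alpha_cell + n) - math.lgamma(alpha_cell)
    return score

def bde_joined_minus_independent(xs, ys, classes, n_x, n_y, n_classes, ess=1.0):
    """Score of the joined model X x Y minus score of the model with X and Y separate."""
    joined = log_marginal(list(zip(xs, ys)), classes, n_x * n_y, n_classes, ess)
    independent = (log_marginal(xs, classes, n_x, n_classes, ess)
                   + log_marginal(ys, classes, n_y, n_classes, ess))
    return joined - independent

# Hypothetical data: Y copies X (strong dependence) versus a balanced independent design
cls = [0] * 100
dependent = bde_joined_minus_independent([0] * 50 + [1] * 50, [0] * 50 + [1] * 50, cls, 2, 2, 1)
balanced = bde_joined_minus_independent([0, 0, 1, 1] * 25, [0, 1, 0, 1] * 25, cls, 2, 2, 1)
```

In this toy setting the difference is positive when Y copies X and negative when the two variables are balanced and independent, which is the behaviour a joining criterion needs.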

Joining Criteria: L1O criterion

Expected Log-Likelihood Under Leave-One-Out (L1O).

Leave-one-out estimation. Laplace estimation.

"The estimation of the log-likelihood of the class is carried out with a leave-one-out scheme computed with a closed equation."
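
To illustrate why a leave-one-out scheme admits a closed form, here is a sketch for the simplest case: the log-likelihood of the class given a single discrete variable X, with Laplace estimation. This is an assumption-laden illustration (one variable, hypothetical function names), not the equation from the slide. Removing one instance (x, c) lowers both N(x, c) and N(x) by one, so every instance in the same cell contributes the same term.

```python
import math
from collections import Counter

def l1o_loglik_closed_form(xs, cs, n_classes):
    """Sum over cells of N(x,c) * log(((N(x,c) - 1) + 1) / ((N(x) - 1) + n_classes))."""
    n_xc = Counter(zip(xs, cs))
    n_x = Counter(xs)
    total = 0.0
    for (x, c), n in n_xc.items():
        total += n * math.log(((n - 1) + 1) / ((n_x[x] - 1) + n_classes))
    return total

def l1o_loglik_brute_force(xs, cs, n_classes):
    """Same quantity, re-counting the data with each instance removed in turn."""
    total = 0.0
    for i in range(len(xs)):
        rest = [(x, c) for j, (x, c) in enumerate(zip(xs, cs)) if j != i]
        n_xc = sum(1 for x, c in rest if (x, c) == (xs[i], cs[i]))
        n_x = sum(1 for x, _ in rest if x == xs[i])
        total += math.log((n_xc + 1) / (n_x + n_classes))
    return total

# Hypothetical toy data
xs = ["a", "a", "a", "b", "b", "c"]
cs = [0, 0, 1, 1, 1, 0]
closed = l1o_loglik_closed_form(xs, cs, 2)
brute = l1o_loglik_brute_force(xs, cs, 2)
```

The closed form needs a single pass over the count table, whereas the brute-force version re-counts the data once per instance.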

Joining Criteria: LRT criterion

Log-likelihood Ratio Test (LRT):
"Comparison of two nested models: M1, with merged variables, and M2, where the variables are independent."

Corrector Factor:

Number of total comparisons over n active variables
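
The comparison of the two nested models can be sketched with the standard G statistic for conditional independence of X and Y given C. This is a generic likelihood-ratio sketch with hypothetical function names, not the slide's exact formula; in particular, reading the number of comparisons over n active variables as the number of unordered pairs is an assumption, and the corrector factor itself is not reproduced here.

```python
import math
from collections import Counter

def g_statistic(xs, ys, classes):
    """G = 2 * sum N(x,y,c) * ln(N(x,y,c) * N(c) / (N(x,c) * N(y,c)))."""
    n_xyc = Counter(zip(xs, ys, classes))
    n_xc = Counter(zip(xs, classes))
    n_yc = Counter(zip(ys, classes))
    n_c = Counter(classes)
    g = 0.0
    for (x, y, c), n in n_xyc.items():
        g += n * math.log(n * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)]))
    return 2.0 * g

def degrees_of_freedom(n_x, n_y, n_classes):
    """Difference in free parameters between the merged and the independent model."""
    return (n_x - 1) * (n_y - 1) * n_classes

def n_pairwise_comparisons(n_active):
    """Assumption: one comparison per unordered pair of active variables."""
    return n_active * (n_active - 1) // 2

# Hypothetical data: a balanced independent design versus Y copying X
cls = [0] * 100
g_independent = g_statistic([0, 0, 1, 1] * 25, [0, 1, 0, 1] * 25, cls)
g_dependent = g_statistic([0] * 50 + [1] * 50, [0] * 50 + [1] * 50, cls)
```

Under the independent model, G is asymptotically chi-square distributed with `degrees_of_freedom(...)` degrees of freedom, so a large value favours merging the variables.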

Grouping Method: Hypotheses

Model selection problem:
  The sample data D is restricted to the instances with X = xi or X = xj.
  xi and xj are considered the only possible cases of X.
  Grouping xi and xj implies that X has only one case.

Similar Information

Grouping Method: Criteria

BDe score:

L1O score:

LRT score:

Experimental Evaluation: Details

SNB-G was implemented in Elvira and integrated in Weka for evaluation.
Tested on 13 data bases without missing values from the UCI repository.
10-fold cross-validation repeated 10 times.
Comparison with a corrected paired t-test at the 5% level.
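
The slide does not say which correction is applied to the paired t-test. Assuming it is the corrected resampled t-test commonly used with repeated cross-validation, the statistic can be sketched as below. The function name is hypothetical, `diffs` holds the per-fold accuracy differences between two classifiers, and `test_train_ratio` is the test-set size divided by the training-set size (1/9 for 10-fold cross-validation).

```python
import math

def corrected_resampled_t(diffs, test_train_ratio):
    """t = mean(d) / sqrt((1/k + n_test/n_train) * var(d)), with k = len(diffs)."""
    k = len(diffs)
    mean = sum(diffs) / k
    var = sum((d - mean) ** 2 for d in diffs) / (k - 1)  # sample variance
    return mean / math.sqrt((1.0 / k + test_train_ratio) * var)

# Hypothetical differences, only to exercise the formula
t_value = corrected_resampled_t([1.0, 2.0, 3.0], 1.0 / 9.0)
```

For 10 repetitions of 10-fold cross-validation, `diffs` would hold 100 values and the statistic would be compared against a t distribution with 99 degrees of freedom at the 5% level.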

Evaluating Joining Criteria: Naive Bayes Comparison

The trade-off between accuracy and log-likelihood is better for LRT.

L1O works badly as a joining criterion.

Evaluating Joining Criteria: Pazzani's semi-NB comparison

LRT works slightly better than BDe.
Similar performance with a lower time complexity.

LRT is the best joining criterion.

Evaluating Grouping Criteria: Naive Bayes Comparison

LRT Joining + Grouping Method.
No strong differences among the criteria. L1O is slightly better.

L1O is the best grouping criterion.

Pazzani’s Semi-NB Comparison

SNB-G = LRT Joining + L1O Grouping

Similar performance:

Dramatic building time reduction:

State-of-the-art Classifiers: AODE, TAN and LBR comparison

Three wins against NB.
1 W vs 1 D against AODE.
No difference against TAN and LBR.
One win against Pazzani’s Semi-NB.

Conclusions and Future Work

A preprocessing step for Naive Bayes:
  Method for joining variables.
  Combined method for grouping cases.

Very efficient, with performance similar to Pazzani’s Semi-NB classifier.

Application to high-dimensionality data sets.
Generalization of the methodology to other models: decision trees and the TAN model.