Feature Selection using Mutual Information
SYDE 676 Course Project
Eric Hui
November 28, 2002



2

Outline

- Introduction … prostate cancer project
- Definition of ROI and Features
- Estimation of PDFs … using Parzen Density Estimation
- Feature Selection … using MI Based Feature Selection
- Evaluation of Selection … using Generalized Divergence
- Conclusions

3

Ultrasound Image of Prostate

4

Prostate Outline

5

“Guesstimated” Cancerous Region

6

Regions of Interest (ROI)

Cancerous ROIs

Benign ROIs

7

Features as Mapping Functions

Cancerous ROIs

Benign ROIs

Mapping from image space to feature space…


8

Parzen Density Estimation

- Histogram bins: bad estimation with limited data available!
- Parzen Density Estimation: reasonable approximation with limited data.

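The contrast between the two estimators can be sketched in a few lines: a Parzen estimate places one smooth kernel "bump" on each sample and averages them, so it stays well behaved even when few samples are available. The function below is a minimal illustration (the Gaussian kernel and the window width `h` are assumptions; the slides do not specify the kernel used).

```python
import numpy as np

def parzen_density(samples, x, h):
    """Parzen (kernel) density estimate at points x, averaging one
    Gaussian window of width h per sample (hypothetical kernel choice)."""
    samples = np.asarray(samples, dtype=float)
    x = np.asarray(x, dtype=float)
    diffs = (x[:, None] - samples[None, :]) / h
    kernels = np.exp(-0.5 * diffs**2) / (h * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

# Small demo: a limited sample where a histogram would be jagged
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=30)
grid = np.linspace(-4.0, 4.0, 401)
p_hat = parzen_density(samples, grid, h=0.5)

# A valid density: non-negative, and its Riemann sum over the grid is ~1
dx = grid[1] - grid[0]
print(p_hat.min() >= 0.0, dx * p_hat.sum())
```

Larger `h` trades variance for bias: the estimate gets smoother but washes out detail, which matters when, as here, each class has few ROIs to estimate from.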

9

Features

- Gray-Level Difference Matrix (GLDM): Contrast, Mean, Entropy, Inverse Difference Moment (IDM), Angular Second Moment (ASM)
- Fractal Dimension (FD)
- Linearized Power Spectrum: Slope, Y-Intercept

10

P(X|C=Cancerous), P(X|C=Benign), and P(X)

11

Entropy and Mutual Information

- Mutual Information I(C;X) measures the degree of interdependence between X and C.
- Entropy H(C) measures the degree of uncertainty of C.
- I(X;C) = H(C) − H(C|X).
- I(X;C) ≤ H(C), so H(C) is the upper bound.
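The identity I(X;C) = H(C) − H(C|X) can be computed directly once the joint distribution P(C, X) is estimated (in the project, via Parzen estimates; here a toy discrete table stands in for it, and the 2-class, 2-bin numbers are made up for illustration):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(C;X) = H(C) - H(C|X) for a joint table joint[c][x]."""
    p_c = [sum(row) for row in joint]           # marginal P(C)
    p_x = [sum(col) for col in zip(*joint)]     # marginal P(X)
    h_c = entropy(p_c)
    # H(C|X) = sum_x P(x) * H(C | X = x)
    h_c_given_x = sum(
        px * entropy([joint[c][x] / px for c in range(len(joint))])
        for x, px in enumerate(p_x) if px > 0
    )
    return h_c - h_c_given_x

# Hypothetical joint P(C, X) for a binary class and a 2-bin feature
joint = [[0.4, 0.1],   # C = cancerous
         [0.1, 0.4]]   # C = benign
i_cx = mutual_information(joint)
print(round(i_cx, 4))  # -> 0.2781, well below the upper bound H(C) = 1 bit
```

Reporting I(C;X) as a percentage of H(C), as the next slide does, is just this value divided by H(C), which makes features comparable even when H(C) is not exactly 1 bit.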

12

Results: Mutual Information I(C;X)

Feature         I(C;X)    % of H(C)
GLDM Contrast   0.51152   87%
GLDM Mean       0.51152   87%
GLDM Entropy    0.57265   98%
GLDM IDM        0.32740   56%
GLDM ASM        0.58069   99%
FD              0.02127    4%
PSD Slope       0.27426   47%
PSD Y-int       0.38622   66%

13

Feature Images – GLDM
(panels: Contrast, Mean, Entropy, Inverse Difference Moment, Angular Second Moment, All features)

14

Feature Images – Fractal Dimension

15

Feature Images – PSD
(panels: Linearized PSD Slope and Y-intercept, each computed Horizontal, Vertical, and Both; All features)

16

Interdependence between Features

- Expensive to compute all features.
- Some features might be similar to each other.
- Thus, need to measure the interdependence between features: I(Xi; Xj).

17

Results: Interdependence between Features

           Contrast  Mean    Entropy  IDM     ASM     FD      PSD Slope  PSD Y-int
Contrast   n/a       0.1971  0.1973   0.8935  1.0261  0.0354  0.0988     1.1055
Mean       0.1971    n/a     0.1973   0.8935  1.0261  0.0354  0.0988     1.1055
Entropy    0.1973    0.1973  n/a      1.1012  1.5323  0.0335  0.0888     0.9615
IDM        0.8935    0.8935  1.1012   n/a     0.2046  0.2764  0.4227     0.1184
ASM        1.0261    1.0261  1.5323   0.2046  n/a     0.1353  0.4904     0.1355
FD         0.0354    0.0354  0.0335   0.2764  0.1353  n/a     0.0541     0.2753
PSD Slope  0.0988    0.0988  0.0888   0.4227  0.4904  0.0541  n/a        1.0338
PSD Y-int  1.1055    1.1055  0.9615   0.1184  0.1355  0.2753  1.0338     n/a

18

Mutual Information Based Feature Selection (MIFS)

1. Select the first feature with the highest I(C;X).
2. Select the next feature X with the highest I(C;X) − β · Σ_{S ∈ Selected} I(X;S).
3. Repeat until the desired number of features is selected.
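The greedy loop above is short enough to sketch directly. The feature names and MI values below are made up for illustration (the real ones would come from the tables on slides 12 and 17), but the structure mirrors the slides: with β = 0 the most relevant pair wins even if redundant, while β > 0 penalizes overlap with already-selected features.

```python
def mifs(relevance, redundancy, beta, k):
    """Greedy MIFS: repeatedly pick the candidate feature maximizing
    I(C;X) - beta * sum over selected S of I(X;S).

    relevance:  dict feature -> I(C;X)
    redundancy: dict sorted-pair (fi, fj) -> I(Xi;Xj)
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def score(f):
            penalty = sum(redundancy[tuple(sorted((f, s)))] for s in selected)
            return relevance[f] - beta * penalty
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical numbers: A is most relevant; B is relevant but redundant with A
relevance = {"A": 0.58, "B": 0.57, "C": 0.39}
redundancy = {("A", "B"): 1.53, ("A", "C"): 0.14, ("B", "C"): 0.96}
print(mifs(relevance, redundancy, beta=0.0, k=2))  # -> ['A', 'B']
print(mifs(relevance, redundancy, beta=1.0, k=2))  # -> ['A', 'C']
```

This matches the pattern in the results slide: at β = 0 the second pick is the individually strongest remaining feature, while at β > 0 a less relevant but less redundant feature takes its place.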

19

Mutual Information Based Feature Selection (MIFS)

This method takes into account both:
- the interdependence between class and features, and
- the interdependence between selected features.

The parameter β controls the amount of interdependence between selected features.

20

Varying β in MIFS

From the candidate set {X1, X2, X3, …, X8}:
- β = 0:   S = {X2, X3}
- β = 0.5: S = {X2, X7}
- β = 1:   S = {X2, X4}

21

Generalized Divergence J

- If the features are “biased” towards a class, J is large.
- A good set of features should have small J.

J = E_x[ (P(X, Cancerous) − P(X, Benign)) · log( P(X, Cancerous) / P(X, Benign) ) ],  X = X1, X2, …, XF
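For discrete (binned) density estimates, the expectation in J reduces to a sum over bins. The sketch below uses the formula as reconstructed above (the exact form on the original slide is partially garbled), with made-up two-bin distributions: identical class densities give J = 0, and the more the two class densities differ, the larger J grows.

```python
import math

def generalized_divergence(p_cancerous, p_benign):
    """J = sum over bins of (P(x,Cancerous) - P(x,Benign)) *
    log(P(x,Cancerous) / P(x,Benign)) -- a symmetric, Jeffreys-style
    divergence; every term is >= 0, so J >= 0 overall."""
    return sum(
        (pc - pb) * math.log(pc / pb)
        for pc, pb in zip(p_cancerous, p_benign)
        if pc > 0 and pb > 0
    )

# Identical class densities: J = 0; separated densities: J grows
same = generalized_divergence([0.25, 0.25], [0.25, 0.25])
apart = generalized_divergence([0.4, 0.1], [0.1, 0.4])
print(same, apart)
```

Note each term is non-negative because (pc − pb) and log(pc/pb) always share a sign, so J is zero exactly when the two class-conditional densities agree on every bin.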

22

Results: J with respect to β

First feature selected: GLDM ASM
Second feature selected: …

β     Feature       J
0     GLDM Entropy  0.6553
0.5   PSD Y-int     0.2970
1     PSD Y-int     0.2970

23

Conclusions

- Mutual Information Based Feature Selection (MIFS): greedily select the feature maximizing I(C;X) − β · Σ_{S ∈ Selected} I(X;S), i.e. maximize interdependence with the class C while minimizing interdependence with the already-selected features X1, …, XN.
- Generalized Divergence: J = E_x[ (P(X, Cancerous) − P(X, Benign)) · log( P(X, Cancerous) / P(X, Benign) ) ]; a good feature set should have small J.
- Example, from the candidate set {X1, X2, X3, …, X8}: β = 0 gives S = {X2, X3}; β = 0.5 gives S = {X2, X7}; β = 1 gives S = {X2, X4}.

24

Questions and Comments