

Small bowel image classification using cross-co-occurrence matrices on wavelet domain

Julien Bonnel a, April Khademi b, Sridhar Krishnan a,*, Cornel Ioana c

a Ryerson University, Department of Electrical and Computer Engineering, Canada
b University of Toronto, Department of Electrical and Computer Engineering, Canada
c E.N.S. Ingénieurs Électriciens de Grenoble, Laboratoire des Images et des Signaux, France

Biomedical Signal Processing and Control 4 (2009) 7–15

ARTICLE INFO

Article history:

Received 21 December 2007

Received in revised form 16 July 2008

Accepted 18 July 2008

Available online 24 September 2008

Keywords:

Small bowel images

Shift-invariant discrete wavelet transform

Color cross-co-occurrence matrices

Feature extraction

Classification

ABSTRACT

This paper presents a novel system for the automated classification of wireless capsule endoscope images. Classification is achieved by a classical statistical approach, but novel features are extracted from the wavelet domain and they contain both color and texture information. First, a shift-invariant discrete wavelet transform (SIDWT) is computed to ensure that the multiresolution feature extraction scheme is robust to shifts. The SIDWT expands the signal (in a shift-invariant way) over the basis functions which maximize information. Then cross-co-occurrence matrices of wavelet subbands are calculated and used to extract both texture and color information. Canonical discriminant analysis is utilized to reduce the feature space, and then a simple 1D classifier with the leave-one-out method is used to automatically classify normal and abnormal small bowel images. A classification rate of 94.7% is achieved with a database of 75 images (41 normal and 34 abnormal cases). The high success rate can be attributed to the robust feature set, which combines multiresolutional color and texture features with shift, scale and semi-rotational invariance. This result is very promising and the method could be used in a computer-aided diagnosis system or a content-based image retrieval scheme.

© 2008 Elsevier Ltd. All rights reserved.


1. Introduction

Medical imaging is now a common diagnostic procedure in every hospital. It allows physicians to non-invasively obtain images of human anatomy and physiology. To view the gastrointestinal tract of a patient, doctors traditionally used endoscopes (long metal tubes with a camera at the end). The endoscope is inserted in the mouth of the patient and can be advanced to the level of the duodenum, which is just before the small bowel. It is a painful procedure performed only on sedated patients by highly skilled doctors, and it does not allow visualization and diagnosis of diseases of the small bowel. To overcome these drawbacks, Given Imaging Ltd. developed a wireless capsule endoscope called the PillCam™ SB [1–3]. This device non-invasively captures images of the complete gastrointestinal tract (including the small bowel).

Internal imaging with the PillCam is a powerful way to diagnose diseases of the small bowel, especially the four main types of cancers: adenocarcinoma, sarcoma, carcinoid, and lymphoma. However, diagnosis is a laborious, repetitive task which requires the interpretation of a large volume of small bowel images. Consequently, such a procedure is very expensive (for the patient or for the health care system), as it requires several hours of the specialist's time. Furthermore, even a highly trained doctor could miss some important clues.

* Corresponding author.
E-mail addresses: [email protected] (J. Bonnel), [email protected] (A. Khademi), [email protected] (S. Krishnan), [email protected] (C. Ioana).

1746-8094/$ – see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.bspc.2008.07.002

In order to introduce quantitative, objective analysis, speed up analysis times and reduce the cost of such procedures, it is possible to develop a computer-aided diagnostic tool which automatically analyzes the small bowel images. Such a computer-based diagnosis system could be used as a second opinion, or it could be used as a first stage which locates the most interesting or suspicious images for the physician. In this context, this paper presents an automatic classification method for small bowel color images. This preliminary work is performed on a small public database which includes a variety of images a physician could encounter. This work follows on from the work of Khademi [4], who used gray-level images. The additional color information significantly improves the classification rate from 75.1% to 94.7%.

The outline of the rest of this paper is as follows. Section 2 investigates the small bowel images' characteristics and the methods which will be used to analyze them. Sections 3 and 4 respectively present the shift-invariant wavelet transform and the cross-co-occurrence matrices for color images which will be utilized in the feature extraction phase. Section 5 discusses the feature extraction and classification scheme. The last two sections describe the results and the conclusions of this work.

Fig. 1. Time-frequency tiles of 1D wavelet transform.

2. Characteristics and analysis of small bowel images

Classification of small bowel images is a challenging problem. The PillCam™ travels freely through the gastrointestinal tract, so the camera's orientation is continuously changing and the location of suspicious pathologies is unpredictable. Additionally, the proposed work aims to detect all types of pathologies within one framework, which means that features must robustly detect pathology of different shapes, sizes and colors. Two other main challenges exist: (1) the presence of intestinal content may mask useful information, and (2) the images are stored in lossy JPEG format, which may cause the images to contain compression artifacts. Consequently, the aim is to extract robust and discriminant features – in the compressed domain – to characterize our images, which may contain pathologies of various sizes, shapes and colors at a variety of orientations.

The CAD scheme is designed to analyze and classify the wireless capsule endoscope images into two classes: normal (healthy) and abnormal (diseased), where all the pathologies are grouped into one category (abnormal). Some examples of the abnormalities found in the database are polypoid masses, carcinoid tumors, multifocal carcinomas and more. To achieve high classification rates under these scenarios, it is important to extract discriminant features which are robust to translation, scale and rotation. Thus, in order to design such discriminatory features, a good understanding of the images and their characteristics is needed. The following section will describe the types of image processing tools used to analyze the small bowel images for abnormality detection.

2.1. Nonstationary behaviour

Generally, a medical image is a non-deterministic two-dimensional signal. Its content is unpredictable due to its random nature and must be represented using probabilistic models and statistical quantities. Moreover, as with most natural images, the frequency content of medical images is spatially varying, and such images are usually classified as "nonstationary" signals. Consequently, nonstationary image analysis tools are required to efficiently analyze and represent the space-frequency content of medical images. The work presented here could easily be adapted to other real images, as they also exhibit nonstationary behaviour.

As stated, nonstationary images have spatially varying frequency components. Consequently, it is useful to make use of specific tools which can precisely identify and localize the interesting features. It is common to work on a transformation of the signal, which ideally puts the emphasis on what is interesting in the initial signal. Such a transformation is in fact a projection of the signal onto a set of mathematical functions; the difficulty is to choose the most appropriate basis set, which is suitable for the current application.

2.2. Multiresolution analysis

The most well-known decomposition technique is the Fourier transform (FT). Although popular, it only gives a description of the global frequency content of the signal or image. In particular, it does not give the spatial location of these frequency components. It is easy to understand that this method is not suitable for nonstationary images.

To combat this problem of localization, a new tool was introduced: the short-time Fourier transform (STFT) (for a complete review, see [5]). To compute the STFT, the analyzed signal is divided into smaller parts, and each part is assumed to be stationary. In fact, this division of the signal is obtained by expanding the signal on sinusoid basis functions multiplied by window functions (functions which are null everywhere except in a certain time interval: the window). It is the equivalent of computing a classical FT on each part of the signal, and it gives the frequency components localized in each window (the time is sampled). It is well known that it is impossible to obtain good precision both in frequency and in time (or space for a 2D signal); this is linked to the Heisenberg uncertainty principle. Applied to signal analysis, it means that a wide window gives good frequency precision but poor spatial localization, while a narrow window gives good spatial localization but poor frequency resolution. The downfall of the STFT is that the size of the window is fixed for the entire decomposition, which does not account for all the localized anomalies found in nonstationary signals.

Small bowel images (and biomedical images in general) contain both diffuse structures and more localized phenomena. The localized structures require small windows and the broad structures require wide ones, so a technique which can extract both of these characteristics is required. Multiresolution analysis (MRA) is such a tool [6]. Its implementation in discrete space can be done with the discrete wavelet transform (DWT). MRA solves the precision problem by dyadically changing the size of the windows. The basis of decomposition is built to obtain good frequency precision at low frequencies and precise spatial resolution at high frequencies. For a better understanding, the time-frequency tiles of a 1D dyadic wavelet transform are represented in Fig. 1.

It is easy to notice that all the tiles have the same area, where each tile represents the amount of uncertainty according to the Heisenberg principle (i.e. uncertainty is related to the area of the tiles). The important point is that the tile's shape changes, so the precision is not the same all over the time-frequency plane. In wireless capsule endoscope images, this permits a good characterization of low-frequency regions (small oscillations, textures) and a good localization of high-frequency structures (transients, edges).

2.3. Small bowel texture

To extract discriminant features, it is important to understand what differentiates normal and abnormal images. It can be noticed in the experimental dataset that normal images contain smooth, homogeneous texture elements with very little disruption in uniformity except for folds and crevices (Fig. 2). On the other hand, abnormal images (benign and malignant) contain various types of pathologies, but all seem to have some common textural characteristics. Firstly, the diseased region presents many different textured areas simultaneously, and secondly, these diseased areas are mostly composed of heterogeneous texture components. Understanding these differences between the two classes of images (abnormal and normal) will aid in the design of the feature extraction scheme.

Fig. 2. Different small bowel images (JPEG, 256 × 256) collected by the PillCam™.

2.4. Color images

As stated, to improve on the work of Khademi [4], this work incorporates color information. Therefore, it is important to understand how color is represented. The color space is a three-dimensional space, and a color image is a combination of three gray-level images. Each one represents a projection of the color image onto an axis of the basis of the color space. One of the classical spaces is RGB (the basis vectors R, G and B respectively represent the red, green and blue components of the color). In this work, the features were extracted from these color planes (in the JPEG-compressed domain).

3. Shift-invariant discrete wavelet transform

3.1. Discrete wavelet transform

The discrete wavelet transform (DWT) can be implemented with filterbanks [7]. The image is expanded on wavelet and scaling basis functions using a set of 1D low-pass (h0(n)) and high-pass (h1(n)) filter coefficients. The filters are applied separately on rows and columns to obtain a 2D transformation.

Let us consider an image I(m, n) of size N × N. First the filters H0(z) and H1(z) are applied to the rows of I. This creates two images (each of size N × N) which respectively contain the low and high frequencies of I. After this, the rows of these two images are subsampled by a factor of 2, creating two (N/2) × N images. Then the filters are reapplied along the columns, followed by decimation by a factor of 2 (also along the columns). Finally, at the output there are four subband images of size (N/2) × (N/2), labeled LL, LH, HL and HH. For more decomposition levels, the operation is recursively repeated on the LL band. The filterbank implementation method is given in Fig. 3.
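As a minimal sketch of the separable filterbank step described above, the following numpy code computes one decomposition level: rows are filtered and downsampled, then the same is done along the columns. The 5/3 filter coefficients are the standard Le Gall analysis filters; the subband labeling (which filter pair maps to LH vs. HL) is a convention and may differ from the paper's figure.

```python
import numpy as np

# Analysis filters of the Le Gall 5/3 wavelet (standard values; cf. Table 1)
H0 = np.array([-1/8, 2/8, 6/8, 2/8, -1/8])  # low-pass
H1 = np.array([-1/2, 1.0, -1/2])            # high-pass

def filter_rows(img, h):
    """Filter every row with h, then downsample along the rows by 2."""
    out = np.apply_along_axis(lambda r: np.convolve(r, h, mode="same"), 1, img)
    return out[:, ::2]

def dwt2_level(img):
    """One level of the separable 2D DWT: rows first, then columns."""
    lo, hi = filter_rows(img, H0), filter_rows(img, H1)
    ll = filter_rows(lo.T, H0).T   # low rows,  low columns
    lh = filter_rows(lo.T, H1).T   # low rows,  high columns
    hl = filter_rows(hi.T, H0).T   # high rows, low columns
    hh = filter_rows(hi.T, H1).T   # high rows, high columns
    return ll, lh, hl, hh
```

Recursing on `ll` yields further decomposition levels, exactly as the text describes for the LL band.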

This work uses the 5/3 Le Gall wavelet, as in [4]. The filter lengths are small, which allows an efficient implementation. The coefficients of the decomposition filters are given in Table 1.

3.2. Properties of the discrete wavelet transform

The wavelet representation is an efficient tool because it allows the extraction of discriminative events which are representative of images, even color ones [8].

First and foremost, the DWT is computed using filters in the vertical and horizontal directions, which allows the isolation of specific frequency events. High-pass filters emphasize high-frequency events, like edges. Combined with the direction (row or column) of the filter, this gives interesting results: the HH, HL and LH subbands respectively isolate diagonal, horizontal and vertical edges. This can be used to extract features related to the orientation of the texture in small bowel images. In addition, the DWT is a powerful tool for the removal of background noise in the LL subband. Indeed, biomedical images are often noisy; in our case study, cancerous lesions could be masked by the tissue wall of the small bowel.

Fig. 3. Filterbank implementation of the 2D discrete wavelet transform.


The DWT is scale-invariant. Therefore, a scaled version of the input image will be matched with the corresponding function in the dictionary. This is important because pathologies can have various sizes and can be imaged from different distances (thus creating different scales).

Although the DWT has many advantages, it has one main drawback: it is shift-variant. As the PillCam™ travels freely in the small bowel, a suspicious pathology may be present at any location in the images. Consequently, it becomes important to use shift-invariant tools in order to extract features which are robust to shifts (translations), so that all pathologies can be localized regardless of their location. To combat this, a shift-invariant DWT (SIDWT) is utilized; its details are presented in the next subsection.

3.3. Shift-invariant discrete wavelet transform (SIDWT)

For simplicity, this section will first focus on gray-level images. There are numerous methods to compute a shift-invariant discrete wavelet transform (SIDWT); a good review can be found in [9]. However, in general these methods have downfalls: computational complexity, loss of orthogonality, redundancy, or only near shift-invariance. The algorithm proposed by Beylkin [10] can be used to address these limitations. It computes the DWT for every elementary shift of one pixel and finds the best basis of decomposition. This transformation uses orthogonal wavelets, which reduces redundancy compared with the other methods. It has been extended to 2D signals by Liang and Parks [11,12], and the efficiency of this method for biomedical image classification (and especially small bowel images) has been proven by Khademi [4].

The idea of this algorithm is to observe the decompositions of all the translations of the input image, and to find the decomposition which carries the most information. This particular decomposition is kept and is considered as the SIDWT of the image.

Table 1
Analysis filter coefficients for the 5/3 wavelet

i      h0(i)    h1(i)
0      6/8      1
±1     2/8      −1/2
±2     −1/8     —

A translation along any vector can be seen as a combination of elementary shifts: (0, 0), (0, 1), (1, 0) and (1, 1), where the indices respectively correspond to the row and column shifts. In fact, all possible shifts can be browsed by applying the four elementary shifts at the input of each level of decomposition. That means that the LL band from each level is shifted by the four translates (0, 0), (0, 1), (1, 0) and (1, 1), which creates a tree with all possible shifts.

For the input shift (a, b), the four subbands at the j-th decomposition level can be computed by

LL_j^(a,b)(x, y) = Σ_m Σ_n h0(m − 2x) h0(n − 2y) LL_{j−1}(m − a, n − b)    (1)

HL_j^(a,b)(x, y) = Σ_m Σ_n h0(m − 2x) h1(n − 2y) LL_{j−1}(m − a, n − b)    (2)

LH_j^(a,b)(x, y) = Σ_m Σ_n h1(m − 2x) h0(n − 2y) LL_{j−1}(m − a, n − b)    (3)

HH_j^(a,b)(x, y) = Σ_m Σ_n h1(m − 2x) h1(n − 2y) LL_{j−1}(m − a, n − b)    (4)

The index j − 1 corresponds to the previous decomposition level; m and n are the coordinates in that level, while x and y are the coordinates in the bands of level j. Only elementary shifts are considered, so (a, b) is one of (0, 0), (0, 1), (1, 0) or (1, 1).

Once the full decomposition is performed, the result is a tree which contains all the DWT coefficients. If the size of the input image is N × N, the tree contains all the N² circular translates. Fig. 4 shows the tree for three decomposition levels. Each square corresponds to the three detail subband images HH, LH and HL, since the fourth (approximation) subband LL is shifted and decomposed at the next level.

The final step is to find the best basis for decomposition, which corresponds to the particular path in the tree which minimizes a cost (see Fig. 4). The cost is computed by calculating the entropy of each subband, and the path with the minimal entropy is kept. This is Coifman and Wickerhauser's best basis selection technique; more information can be found in [13]. This best basis corresponds to a translation of vector t1 from the input image I. If the input image is a translation of I by a vector t2, the best basis will correspond to a translation of vector t, with t = t1 − t2. The best basis is the same for every shift of the input image.

Fig. 4. SIDWT decomposition tree for three decomposition levels with the best basis selection corresponding to the minimum cost path.

This process can easily be generalized to color images. Three complete decomposition trees are computed, one for each color band (each one is considered as a gray-level image). The cost is then computed simultaneously in the three trees by assuming that the total entropy for the color image is the sum of the entropies of each subband. Thus a global decomposition is obtained: the three bands are decomposed in the same way.
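The shift-selection idea above can be sketched as follows: for each of the four elementary shifts, decompose every color channel, sum the entropy costs of the detail subbands over the three channels, and keep the shift with minimal total cost. This is a simplified single-level illustration: it uses a Haar step as a stand-in transform (the paper uses the 5/3 wavelet) and a generic entropy of normalized coefficient energies as the cost.

```python
import numpy as np

def entropy_cost(band):
    """Shannon entropy of the normalized coefficient energies (the
    Coifman-Wickerhauser style additive cost)."""
    p = band.astype(float).ravel() ** 2
    total = p.sum()
    if total == 0:
        return 0.0
    p = p[p > 0] / total
    return float(-(p * np.log(p)).sum())

def haar_level(img):
    """One level of a 2D Haar DWT, used here as a stand-in transform."""
    a = (img[::2, :] + img[1::2, :]) / 2   # row average
    d = (img[::2, :] - img[1::2, :]) / 2   # row difference
    ll = (a[:, ::2] + a[:, 1::2]) / 2
    lh = (a[:, ::2] - a[:, 1::2]) / 2
    hl = (d[:, ::2] + d[:, 1::2]) / 2
    hh = (d[:, ::2] - d[:, 1::2]) / 2
    return ll, (lh, hl, hh)

def best_shift(channels):
    """Among the four elementary shifts, keep the one whose detail
    subbands, summed over the color channels, have minimal entropy."""
    shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
    costs = {}
    for a, b in shifts:
        cost = 0.0
        for ch in channels:
            shifted = np.roll(np.roll(ch, -a, axis=0), -b, axis=1)  # circular shift
            _, details = haar_level(shifted)
            cost += sum(entropy_cost(band) for band in details)
        costs[(a, b)] = cost
    return min(costs, key=costs.get)
```

In the full algorithm this choice is made recursively at every level of the tree, whereas the sketch only picks the best elementary shift for one level.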

4. Cross-co-occurrence matrices

As stated earlier, the texture differences between abnormal and normal small bowel images provide important visual clues (i.e. ways to differentiate between these two classes). By utilizing the SIDWT, we have obtained a localized description of the texture elements in the images. In conjunction with texture, it is also desired to include color information for maximum discrimination. The following section will describe a method which allows the extraction of both color and texture information (from the wavelet domain). Recent works have shown that it is a good idea to utilize both color and texture in parallel, and a way to perform such an analysis is with cross-co-occurrence matrices (CCM) [14–17].

4.1. Generating cross-co-occurrence matrices (CCM)

Cross-co-occurrence matrices are generalizations to color images of the classical gray-level co-occurrence matrices defined by Haralick [18]. Gray-level co-occurrence matrices estimate the joint probability distribution of two gray levels, which indirectly contains the textural information. Such a matrix is large and sparse, so the information is not directly usable. However, various metrics may be extracted from the matrix to get a more useful representation of image features. Such a feature extraction scheme will be discussed in Section 4.2.

Let I be an N × N discrete image with three color bands b1, b2, b3 and G intensity levels in each band. Let x and y be color bands: x ∈ {b1, b2, b3} and y ∈ {b1, b2, b3}. Let X and Y respectively be the projections of I onto x and y: X and Y are both N × N discrete images with G intensity levels. Let d be a displacement vector.

C^xy_d is a CCM if each coefficient C^xy_d(i, j) represents the probability for the intensity levels i in x and j in y to occur at two locations separated by the displacement vector d (d ∈ [1, N]²). Mathematically:

C^xy_d(i, j) = Card({(s, s + d) ∈ ([1, N]², [1, N]²) | X(s) = i, Y(s + d) = j})    (5)

Both i and j are intensity levels in [1, G]. Consequently, C^xy_d is a G × G matrix, independent of the initial image's size.

For a given displacement vector d, there are nine different possible CCM (the number of combinations of two color bands among three). However, C^xy_d and C^yx_d carry the same information. Consequently, we can define a new CCM as

M^xy_d(i, j) = (C^xy_d(i, j) + C^yx_d(i, j)) / 2    (6)

If x = y, then M^xy_d = C^xy_d; in that case, the CCM is the same as the gray-level co-occurrence matrix computed on one channel. If x ≠ y, M^xy_d carries the common information of C^xy_d and C^yx_d and takes into account the correlation between the two color channels. With this new definition, for a given displacement vector, there are six different CCM.
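Eqs. (5) and (6) can be sketched directly in numpy: count the joint occurrences of intensity pairs at positions separated by d, then symmetrize. This version counts in-bounds pairs only (the paper does not specify the border handling, so circular indexing would be an equally valid reading).

```python
import numpy as np

def ccm(X, Y, d, G):
    """Cross-co-occurrence matrix C^xy_d of Eq. (5): joint counts of the
    intensity pairs (X[s], Y[s + d]) over all in-bounds positions s."""
    dr, dc = d
    N, M = X.shape
    r0, r1 = max(0, -dr), min(N, N - dr)   # valid row range for s
    c0, c1 = max(0, -dc), min(M, M - dc)   # valid column range for s
    A = X[r0:r1, c0:c1].ravel()
    B = Y[r0 + dr:r1 + dr, c0 + dc:c1 + dc].ravel()
    C = np.zeros((G, G), dtype=np.int64)
    np.add.at(C, (A, B), 1)               # accumulate joint counts
    return C

def sym_ccm(X, Y, d, G):
    """Symmetrized CCM M^xy_d of Eq. (6)."""
    return (ccm(X, Y, d, G) + ccm(Y, X, d, G)) / 2
```

Normalizing the counts by their sum turns the matrix into the joint probability estimate that the text describes.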

4.2. The features

The gray-level co-occurrence matrix contains texture information [18]. Haralick defines 14 different parameters for co-occurrence feature extraction. These parameters contain texture information, but correlation may exist between them. Consequently, the entire set of 14 features is not necessary, and only a few of them will be extracted. It is important to exploit the textural characteristics of the small bowel images. As discussed in Section 2.3, it is necessary to use texture features which describe the relative homogeneity or non-uniformity of the images. Thus, only four features per matrix will be extracted: contrast (C), energy (N), homogeneity (H) and entropy (E) [19]. Considering a co-occurrence matrix M, these features are defined below.

C = Σ_{i,j} |i − j|² M(i, j)    (7)

Contrast is a measure of the intensity contrast between a pixel and its neighbor over the whole image.

H = Σ_{i,j} M(i, j) / (1 + |i − j|)    (8)

Homogeneity measures the closeness of the distribution of elements in the co-occurrence matrix to the matrix diagonal; it describes how uniform the texture is.

E = −Σ_{i,j} M(i, j) log(M(i, j))    (9)

Entropy is a statistical measure of randomness of the color levels in the image; it measures non-uniformity.

N = Σ_{i,j} M(i, j)²    (10)

Energy is the sum of the squared elements. Experimental observations and tests show that it is an important feature and should be used (see [4]).
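Eqs. (7)-(10) translate directly into numpy. As a sketch, the matrix is normalized to a joint probability first (an assumption: the text does not state whether M is already normalized, but entropy in Eq. (9) only behaves as expected on probabilities).

```python
import numpy as np

def ccm_features(M):
    """Contrast, energy, homogeneity and entropy (Eqs. (7)-(10)) of a
    co-occurrence matrix M, normalized here to a joint probability."""
    P = M.astype(float)
    P /= P.sum()
    i, j = np.indices(P.shape)
    contrast = float((np.abs(i - j) ** 2 * P).sum())       # Eq. (7)
    homogeneity = float((P / (1 + np.abs(i - j))).sum())   # Eq. (8)
    nz = P[P > 0]                                          # avoid log(0)
    entropy = float(-(nz * np.log(nz)).sum())              # Eq. (9)
    energy = float((P ** 2).sum())                         # Eq. (10)
    return contrast, energy, homogeneity, entropy
```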

5. Classification of small bowel images

The aim of this work is to combine color CCM with shift-invariant MRA; to the best of our knowledge, this has never been done before. The classification follows a classical process: first the features are extracted, and then a classification algorithm is applied on the feature space.

5.1. Extraction of the features

Consider a small bowel image. First, the SIDWT algorithm is computed on this color image as explained in Section 3.3. Consequently, we obtain the decomposition of the image with the basis set that contains the most information. It has been previously reported that the middle bands of the detail coefficients are the ones containing the most information [20]. Therefore, in order not to obtain too many features, the CCM are computed only on the second level of decomposition.

We compute co-occurrence matrices with displacement vectors of length two (2 pixels long) [4]. Define these displacement vectors as d_i^j, with i and j respectively representing the column and row coordinates of the vector. As all d_i^j are 2 pixels long, i and j are both in {−1, 0, 1}. A representation of all these vectors can be found in Fig. 5.

As mentioned in Section 3.2, the DWT subbands contain oriented (directional) high-frequency information. Consequently, only some particular CCM are computed for each subband. C_{d_1^0} and C_{d_{−1}^0} are computed on HL2 to capture horizontal high-frequency content. C_{d_0^1} and C_{d_0^{−1}} are computed on LH2 for vertical edges, while C_{d_1^1}, C_{d_{−1}^1}, C_{d_1^{−1}} and C_{d_{−1}^{−1}} are computed on HH2 for diagonal high-frequency information. In each wavelet subband, the CCM are averaged in order to obtain a single matrix with all the directional information per wavelet subband (which also lends a semi-rotation-invariant representation, see [4]). As a result, we work with 18 matrices: 6 different matrices (for the different combinations of color bands) for each of the three wavelet subbands.

Fig. 5. Computation of cross-co-occurrence matrices: all the displacement vectors of length 2 pixels.

Then, four features per matrix are extracted as explained in Section 4.2, giving a total of 72 features per small bowel image.
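The feature-count accounting above can be sketched explicitly. The displacement directions assigned to each subband below follow the reading given in the text (each scaled to a 2-pixel length in the paper), and the six unordered band pairs come from the symmetrization of Eq. (6).

```python
from itertools import combinations_with_replacement

bands = ["R", "G", "B"]
# Assumed direction assignments per second-level subband (cf. Fig. 5)
subband_dirs = {
    "HL2": [(1, 0), (-1, 0)],                      # horizontal content
    "LH2": [(0, 1), (0, -1)],                      # vertical edges
    "HH2": [(1, 1), (-1, 1), (1, -1), (-1, -1)],   # diagonals
}
# Six unordered band pairs (RR, RG, RB, GG, GB, BB) after symmetrization
pairs = list(combinations_with_replacement(bands, 2))
# The directional CCM of each subband are averaged into ONE matrix, so:
# subbands x band pairs x 4 features (contrast, energy, homogeneity, entropy)
n_features = len(subband_dirs) * len(pairs) * 4
```

Running this confirms the 3 × 6 × 4 = 72 features per image stated in the text.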

5.2. The classification technique

The classification can now be performed on the feature space. Several methods are conceivable; for example, neural networks are a classical (and complicated) choice known to be suitable for wireless capsule endoscopic images [21,22]. However, our approach is different. As we want to prove that our feature set is discriminative, we only use basic classification methods; thus, the results depend more on the feature choice than on an advanced classification algorithm. However, two problems have to be overcome: (1) the image database is small, and (2) the dimension of the feature space is large. How these two technical challenges are dealt with is explained in the following subsections.

5.2.1. The leave-one-out method

For classification, a supervised classifier is applied (discussed shortly). Consequently, we need a training set, which must be constructed from the (small) database of small bowel images (75 images in total). In most scenarios, a training set comprises half of the entire database. Here, however, the database is small, which would limit the practicality of that training scenario (small sample sizes lead to inaccurate estimation of classifier parameters). Therefore, the leave-one-out method (LOOM) is used, as it is a robust method for the small-database scenario [23]. LOOM consists of leaving one image of the database ``out'' and using all the other samples for training. After the classifier has been trained, the left-out image is classified by the algorithm. The process is iterated so that each image of the database is left out once. This permits the computation of the classification accuracy over the entire database, while accommodating the small database size.
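The LOOM loop described above can be sketched generically. The `train`/`classify` callables and the toy nearest-class-mean classifier below are placeholders for illustration, not the paper's actual classifier:

```python
def leave_one_out_accuracy(samples, labels, train, classify):
    """Leave-one-out method: hold each sample out once, train on the rest,
    classify the held-out sample, and return the overall accuracy."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)
        if classify(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

# Toy nearest-class-mean classifier for 1-D features (stand-in only):
def train_means(xs, ys):
    return {cls: sum(x for x, y in zip(xs, ys) if y == cls)
                 / sum(1 for y in ys if y == cls)
            for cls in set(ys)}

def classify_nearest(means, x):
    return min(means, key=lambda cls: abs(x - means[cls]))
```

On a 75-image database this costs 75 training runs, which is affordable for the simple classifiers used here.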

5.2.2. Feature space dimensionality reduction

As the feature space is large, canonical discriminant analysis (CDA) is first applied to the training set in order to reduce the dimensionality of the feature space. Such a scheme appears to be the best solution according to [24]. CDA is a descriptive method which transforms the data into a new feature-space representation that offers better discrimination between classes.

Consider a dataset with n observations of J variables. The observations are split into C classes of n_c elements. Let m_c be the means of the groups and W_c their covariance matrices. The aim is to construct new axes Z_h so that members of the same group projected on an axis are close to each other (minimal within-group scatter), while members of different groups are far from each other (maximal between-class scatter). The total dispersion inside the clusters (the intraclass variance, or within-class scatter) W is given by

$$W = \frac{1}{n}\sum_{c} n_c W_c \qquad (11)$$

where W_c is the intraclass variance of class c. The dispersion between groups, which is the interclass variance (between-class scatter), B, is

Fig. 6. Feature distribution after canonical discriminant analysis.

Table 2
Results for small bowel image classification (using cross-co-occurrence matrices on the wavelet domain), displayed as a confusion matrix (rows: true class; columns: assigned class)

            Normal        Abnormal
Normal      38 (92.7%)    3 (7.3%)
Abnormal    1 (3%)        33 (97%)

Fig. 8. Receiver operating characteristic curve, with an area of 0.97.


given by

$$B = \frac{1}{n}\sum_{c} n_c (m_c - m)(m_c - m)' \qquad (12)$$

where m is the global mean and n_c is the number of elements in class c. The global dispersion, which is the total covariance matrix V, is given by Huygens' theorem: V = B + W. The first axis Z_1 is defined by the direction vector u_1 which maximizes the quantity (u_1' B u_1)/(u_1' V u_1). Solving this problem amounts to solving the linear equation V^{-1} B u = λu (the solution is given by the eigenvalues and eigenvectors of the matrix V^{-1} B). The ith axis Z_i is obtained from the eigenvector u_i corresponding to the eigenvalue λ_i. The axes Z_h are determined by the non-zero eigenvalues, so in the classical case the dimension of the new basis is C − 1. Here there are only two classes: one for the normal images and one for the abnormal images. As a result, we project our feature set onto a one-dimensional space. A representation of all the features projected into one dimension after CDA can be found in Fig. 6; one can see that the CDA allows an easy differentiation between the two classes. A simple classification algorithm using thresholds can then be utilized. The whole classification process is summarized in Fig. 7.
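For C = 2 the between-class matrix B has rank one, so the single non-trivial eigenvector of V^{-1}B is proportional to V^{-1}(m_1 − m_0). A minimal pure-Python sketch for two-dimensional features (the dimensions and helper names are illustrative assumptions; the paper works in a 72-dimensional feature space):

```python
def mean_vec(vectors):
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def total_scatter(vectors, m):
    """2x2 total scatter matrix around the global mean (proportional to V)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - m[0], v[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def cda_direction(class0, class1):
    """Two-class discriminant axis u ~ V^{-1}(m1 - m0); for C = 2 this is
    the only non-trivial eigenvector of V^{-1}B, since B has rank one."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    m = mean_vec(class0 + class1)
    v = total_scatter(class0 + class1, m)
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    inv = [[v[1][1] / det, -v[0][1] / det],
           [-v[1][0] / det, v[0][0] / det]]
    d = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * d[0] + inv[0][1] * d[1],
            inv[1][0] * d[0] + inv[1][1] * d[1]]

def project(u, x):
    """Expand a feature vector onto the one-dimensional discriminant axis."""
    return u[0] * x[0] + u[1] * x[1]
```

After projection, well-separated classes occupy disjoint intervals on the axis, which is what makes the simple threshold classifier viable.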

6. Results

The proposed feature extraction and classification scheme achieved high accuracy, with an overall classification rate of 94.7%. The results, presented as a confusion matrix, and the receiver operating characteristic (ROC) curve are shown in Table 2 and Fig. 8, respectively. The ROC curve measures the performance of our binary classifier by plotting sensitivity versus 1 − specificity. Sensitivity is the proportion of true positives (the proportion of diseased images correctly classified as abnormal) and specificity is the proportion of true negatives (the proportion of images known not to contain disease (normal) that are classified as normal). The accuracy of the classification is measured by the area under the curve: an area of 1 represents a perfect classifier (the curve y = 1), while an area of 1/2 represents a worthless classifier (the curve y = x). The area under the ROC curve in Fig. 8 is 0.97.

Fig. 7. The proposed small bowel image classification process.
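As a sanity check, the reported rates follow directly from the counts in Table 2, taking "abnormal" as the positive class:

```python
# Counts from Table 2 (rows: true class, columns: assigned class)
tn, fp = 38, 3    # 41 normal images: 38 classified normal, 3 classified abnormal
fn, tp = 1, 33    # 34 abnormal images: 1 classified normal, 33 classified abnormal

sensitivity = tp / (tp + fn)                # true-positive rate
specificity = tn / (tn + fp)                # true-negative rate
accuracy = (tp + tn) / (tp + tn + fp + fn)  # overall classification rate

print(round(100 * sensitivity, 1))  # 97.1
print(round(100 * specificity, 1))  # 92.7
print(round(100 * accuracy, 1))     # 94.7
```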

However, as our database is small, it is important to be careful with the results of our LOOM estimator: such a cross-validated estimator can display undesirable properties such as high variance or large outliers [23]. In our case, the classification algorithm estimates, for each image, the probability that the sample belongs to a particular class: p_n for normal images and p_a for abnormal images. Since there are only two classes, p_n = 1 − p_a. Let us define p as the probability of being in the correct class (for a normal image p = p_n, for an abnormal one p = p_a). The mean value of p was found to be 0.89 and its variance 0.03. As our classification scheme is quite simple, the undesirable behavior that the LOOM estimator could have caused is not significant (as shown by the mean and variance of p). This result is promising, in that it shows that the current classification algorithm is sound and could probably be generalized to a larger dataset.

As previously discussed, the database contains 75 images. There are 41 normal images, comprising various healthy regions of the small bowel: jejunum, duodenum, intestinal wall, ileo-cecal valve, etc. The other 34 (abnormal) images contain several benign and malignant pathologies (carcinoid tumors, polyps, submucosal jejunal mass, etc.) at different locations throughout the gastrointestinal tract, and in a variety of sizes, shapes, colors, and orientations. Despite these challenges, the results are very good. The misclassified normal images are marginal cases: the shadowed region and the part of the gastrointestinal tract shown (low-grade MALT lymphoma) are completely different from all the other normal images. The only misclassified abnormal image is a small bowel polyp, but the image is of poor quality, which likely caused the algorithm to misclassify it.

Table 3
Results for small bowel image classification (using cross-co-occurrence matrices with the conventional DWT, i.e. without the SIDWT), displayed as a confusion matrix

            Normal        Abnormal
Normal      34 (83%)      7 (17%)
Abnormal    9 (26%)       25 (74%)

The success of this method can be attributed to several factors. First, the feature extraction scheme was designed to be shift and scale invariant, and also nearly rotation invariant. As a result, the scheme is robust to the camera angle and position, as well as to differently sized pathologies. The co-occurrence matrix is an excellent way to extract texture information while incorporating color information (as shown by the results), and color carries important information for the recognition of pathology. To account for the color information, the cross-co-occurrence matrices were computed on every combination of color planes. This unified framework therefore extracts important color and texture information while being robust to the capsule's position as well as the natural variability of pathology. In essence, this computer vision system mimics the human visual system, as radiologists combine all of these cues when interpreting an image: they diagnose based on color and texture information, while not being affected by the location and size of the pathology or the orientation of the camera.

To achieve such a high classification rate, all steps of the method are necessary and important. Color carries a significant amount of useful information for the classification: recall that in the precursor work [4], which utilized only gray-level co-occurrence matrices, the classification accuracy was 75.1% (see Table 4 for the results). This clearly shows that the use of color information is critical to achieving high classification rates.

Additionally, use of the SIDWT is also an important step. Using only CCM-based features without the SIDWT (i.e. feature extraction with the conventional DWT), a separate simulation showed that the global accuracy of the classification drops to 78% (see Table 3 for detailed results). As shown in Tables 3 and 4, without the SIDWT the result is very near to that achieved by the original work (gray-level information only). By employing the SIDWT, the accuracy reaches 94.7%: Table 2 (the confusion matrix for classification with CCM and SIDWT) shows a real improvement over Table 3 (the confusion matrix for classification with CCM only).

The proposed work is a preliminary study of the automated classification of wireless capsule endoscope images, using a database of 75 images. The results are very promising and show the utility of the method: invariant features (scale, translation and semi-rotation) that capture both color and texture information in the wavelet domain. For further validation, future work will include testing the algorithm on a larger database, with several images from every patient.

Table 4
Results for small bowel gray-level image classification (using co-occurrence matrices on the wavelet domain), displayed as a confusion matrix

            Normal        Abnormal
Normal      33 (80%)      8 (20%)
Abnormal    10 (30%)      24 (70%)

7. Conclusion

This small bowel image analysis system is the first reported work in the area to detect both malignant and benign pathologies with a high classification rate (94.7% on a database of 75 images). A new method for feature extraction has been developed: it extracts both color and texture information simultaneously and is computed in the wavelet domain (to capture multiresolution texture information). The shift-invariant discrete wavelet transform is computed to ensure that the features are robust to translations. Cross-co-occurrence matrices are computed from the color planes, for the high-frequency subbands of the second decomposition level, to capture both texture and color information. The incorporation of color was an important contribution of this work; it increased the performance by almost 20 percentage points compared with the previous system, which relied only on gray-level information (and achieved 75.1%). Additionally, the cross-co-occurrence matrices were computed along a variety of directions, to account for oriented texture elements (and to ensure some rotational invariance). The feature space was reduced by canonical discriminant analysis, which permitted the use of a simple threshold classifier. The leave-one-out method was used to combat the small-database scenario. Finally, all processing was performed on JPEG-compressed images; the algorithm is therefore also robust to compression artifacts.

In summary, the high classification rate of the system strongly supports the importance of including both color and texture information in a robust and efficient manner. According to the results, this system could be used for computer-aided diagnosis or as part of a content-based image retrieval system.

References

[1] Given Imaging Ltd., PillCam™ SB Capsule Endoscopy [online], 2006, http://www.givenimaging.com/.
[2] Given Imaging Ltd., Patient Brochure [online], 2006, http://www.givenimaging.com/.
[3] Given Imaging Ltd., Patient Information Guide [online], 2006, http://www.givenimaging.com/.
[4] A. Khademi, Multiresolution analysis for classification and compression of medical images, M.A.Sc. Thesis, Ryerson University, 2006, pp. 1–92, 129–156.
[5] P. Flandrin, Temps-fréquence (2ème édition, revue et corrigée), Éditions Hermès, Paris, 1998.
[6] M. Unser, A. Aldroubi, A review of wavelets in biomedical applications, Proceedings of the IEEE 84 (4) (1996) 626–638.
[7] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, USA, 1998.
[8] S. Abdelmounaime, F.B. Mohamed, I. Tahar, La transformation en ondelettes pour l'extraction de la texture-couleur. Application à la classification combinée des images (HRV) de SPOT, International Journal of Remote Sensing 27 (18) (2006) 3977–3990.
[9] A.P. Bradley, Shift-invariance in the discrete wavelet transform, in: Proceedings of the Seventh International Conference on Digital Image Computing: Techniques and Applications, 2003, pp. 29–38.
[10] G. Beylkin, On the representation of operators in bases of compactly supported wavelets, SIAM Journal on Numerical Analysis (1992) 1716–1740.
[11] J. Liang, T. Parks, Translation invariant wavelet transforms with symmetric extensions, IEEE Digital Signal Processing Workshop (1996) 69–72.
[12] J. Liang, T. Parks, Image coding using translation invariant wavelet transforms with symmetric extensions, IEEE Transactions on Image Processing 7 (1998) 762–769.
[13] R. Coifman, M. Wickerhauser, Entropy-based algorithms for best basis selection, IEEE Transactions on Information Theory 38 (1992) 713–718.
[14] J.K. Shuttleworth, A.G. Todman, R.N.G. Naguib, B.M. Newman, M.K. Bennett, Colour texture analysis using co-occurrence matrices for classification of colon cancer images, in: Proceedings of the 2002 IEEE Canadian Conference on Electrical and Computer Engineering 2 (2002) 1134–1139.
[15] V. Arvis, C. Debain, M. Berducat, A. Benassi, Generalization of the cooccurrence matrix for colour images: application to colour texture classification, Image Analysis and Stereology 23 (2004) 63–72.
[16] R. Muniz, J.A. Corrales, Novel techniques for color texture classification, in: Proceedings of the IPCV 2006, 2006, pp. 114–120.
[17] C. Palm, T. Lehmann, K. Spitzer, Color texture analysis of moving vocal cords using approaches from statistical signal theory, Braunschweig Journal, 2000, pp. 49–56.
[18] R. Haralick, K. Shanmugam, I. Dinstein, Textural features for image classification, IEEE Transactions on Systems, Man, and Cybernetics 3 (6) (1973) 610–621.
[19] R. Rangayyan, Biomedical Image Analysis, CRC Press LLC, United States of America, 2005.
[20] S.A. Karkanis, D.K. Iakovidis, D.E. Maroulis, D.A. Karras, M. Tzivras, Computer-aided tumor detection in endoscopic video using color wavelet features, IEEE Transactions on Information Technology in Biomedicine 7 (2003) 141–152.
[21] V.S. Kodogiannis, M. Boulougoura, E. Wadge, J.N. Lygouras, The usage of soft-computing methodologies in interpreting capsule endoscopy, Engineering Applications of Artificial Intelligence 20 (4) (2007) 539–553.
[22] N. Bourbakis, S. Makrogiannis, D. Kavraki, A neural network-based detection of bleeding in sequences of WCE images, in: Proceedings of the 5th IEEE Symposium on Bioinformatics and Bioengineering (BIBE'05), 2005, pp. 324–327.
[23] U.M. Braga-Neto, E.R. Dougherty, Is cross-validation valid for small-sample microarray classification? Bioinformatics 20 (2004) 374–380.
[24] A.K. Jain, R.P.W. Duin, J. Mao, Statistical pattern recognition: a review, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (1) (2000).