1530 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 6, NO. 11, NOVEMBER 1997

Automatic Watershed Segmentation of Randomly Textured Color Images

Leila Shafarenko, Maria Petrou, Member, IEEE, and Josef Kittler, Member, IEEE

Abstract—A new method is proposed for processing randomly textured color images. The method is based on a bottom-up segmentation algorithm that takes into consideration both color and texture properties of the image. An LUV gradient is introduced, which provides both a color similarity measure and a basis for applying the watershed transform. The patches of watershed mosaic are merged according to their color contrast until a termination criterion is met. This criterion is based on the topology of the typical processed image. The resulting algorithm does not require any additional information, be it various thresholds, marker extraction rules, and suchlike, thus being suitable for automatic processing of color images. The algorithm is demonstrated within the framework of the problem of automatic granite inspection. The segmentation procedure has been found to be very robust, producing good results not only on granite images, but on the wide range of other noisy color images as well, subject to the termination criterion.

Index Terms—Color texture, image segmentation, watershed algorithm.

I. INTRODUCTION

COLOR IMAGE processing has been an active field of research during the last few years (see, for example, [1] and references therein). However, little research has been done in the area of processing color textured images, especially for the case of randomly textured color images, such as granites. In this paper, we address the problem of automatic segmentation of such images. Our goal is to segment a typical granite image in such a way that it mimics human perception of the image. Although this task is not well defined due to the very notion of human perception being somewhat vague, it often arises in practice. For example, the granite tiles are accepted or rejected by the manufacturer according to human perception. In this case, the segmentation algorithm has to extract those features from the granite sample that are perceived as salient by humans.

This task is complicated by an overwhelmingly large number of details that are present in a typical granite image. It is immediately noticeable, though, that not all of these details are perceived as significant by the human eye. On the one hand, if one tries to filter out insignificant features by applying size-based filters such as opening or closing [2], or their combinations, then small but significant features in terms of color saliency may be removed. For example, a small-sized greenish discoloration would be quite noticeable against red background, but would be masked completely by similar-color surroundings. On the other hand, a method based only on the color would not work well either, since spatial characteristics of colored features are quite significant for the overall perception of the image.

Manuscript received August 18, 1995; revised November 27, 1996. This work was supported by the CEC Project 0946 under the BRITE-EURAM Program. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. John Goutsias.

The authors are with the Department of Electronic and Electrical Engineering, University of Surrey, Guildford GU2 5XH, U.K. (e-mail: [email protected]).

Publisher Item Identifier S 1057-7149(97)07030-9.

The chromato-structural approach proposed in [3] gives much better results, but this method assumes that colors found in granite images form distinct clusters in the color space, which is not always the case. The color difference between the features is often small, just sufficient for the human eye to single out the feature or make it stand out a little more strongly. The problem is, therefore, one of identifying features that appear distinct to the human eye, but which do not manifest themselves as well-defined clusters in any of the color spaces.

The segmentation task this problem presents is far from easy. Indeed, the conventional morphological segmentation technique is the watershed transform [4]. However, watershed is intrinsically a gray-level transformation, with its applicability depending on the existence of an order relation on pixel values. Dealing with color images, one can try to use a general region-growing algorithm [6] instead of the watershed one. Then, only a similarity measure between a given point and its neighbors is required, rather than a total order relation. This approach was used in [7]. The results are satisfactory, but there is a drawback too: The procedure requires a set of markers, which makes it unsuitable for automatic segmentation. This problem was dealt with in [8] where an auxiliary mask was derived from all three color bands to be used in the flat-zone merging procedure. Here again the hierarchy of flat zones was required, which made it necessary to choose one color band arbitrarily. This approach would not be applicable to our case, because we are dealing with very subtle color shades, and the results could be biased were any color band to be treated differently.

We have come to a conclusion, however, that there exists a way of dealing with this problem. It has always been assumed that it is some kind of value of a pixel that we should represent numerically for the watershed segmentation, whereas this need not be the case. In fact, the watershed method is routinely applied to the modulus of intensity gradient rather than the intensity itself. Modulus of the gradient is a measure of distance between neighboring pixels in the sense of their intensity, and is a scalar function of the coordinates. The idea is to use an appropriate color distance between neighboring pixels for the same purpose (i.e., watershed segmentation), subject to the accurate quantification of the perceived difference between two colors of different intensity, which requires a common metric in the whole color space. Such a metric exists. It is the Euclidean metric in the LUV color space.

1057–7149/97$10.00 © 1997 IEEE

This approach produces good results not only for granite images it has been meant to work with, but for a wide range of color images as well, even when a significant amount of noise is present.

In this paper, we describe our experience of adapting the watershed transform to the LUV gradient of images with small color saliency as well as developing appropriate techniques for the perceptually acceptable segmentation of several complex color images. The structure of the rest of the paper is as follows. Section II introduces the watershed method and its implementation. Section III presents the results of the chromatic evaluation of borders. Section IV contains experimental results with real images and discussion.

II. WATERSHED METHOD

The idea of watershed is drawn from a topographic analogy. Consider the gray-level intensity as a topographic relief. Find the minima and “pierce” them. Immerse the whole relief into water and let the water flood the areas adjacent to the piercing points. As the relief goes down some of the flooded areas will tend to merge; prevent this happening by raising infinitely tall dams along the watershed lines. When finished, the resulting network of dams defines the watershed of the image. The procedure is as follows.

1) Preselection of the minima: The process begins with marking up the seeds, i.e., the points at which the flooding will begin. In terms of “topography,” the relief is “pierced” at those points and then gradually “immersed” in water. The seeds do not have to coincide with any of the minima on the relief, in fact they need not be single points, but they may be areas of arbitrary topology. Each of the seeds has a unique identifier (ID), which will be used for identifying the regions that will be grown out of it (see a chapter on marker selection in [12]).

2) Fragmenting: As the relief goes deeper into the water, the regions surrounding the seeds become flooded. Eventually two or more such regions expand to a point at which they would come into contact unless the waters are separated. This is the moment that a dam is raised. In the watershed method, the dams are all infinitely tall and are arbitrarily complex sets of pixels depending on the line of contact. This is a very informal definition though, since in the situation of discrete altitude of the relief, which is laid out on a discrete grid, there is no way of gradually bringing the flooding water up to the point of contact. However, it helps to visualize the procedure.

Technically, for every pixel with an identifier, the neighboring pixels are checked for being “under water,” and if any of them are, they receive the same identifier, provided that they have not been identified with a different flood area already. Note that the relation of neighborhood on pixels is in fact what replaces the gradual flooding in the continuous case: The “waters” are stopped when they are about to flood neighboring pixels. It should be said, however, that this relation is completely arbitrary, and if stated differently will result in a different watershed configuration. For the purposes of our study we shall assume the four-connectivity of the grid hereafter.

Fig. 1. Schematic representation of the functions involved and the approximations made in the calculation of the termination criterion in the Appendix.

It is evident from the description above that eventually the whole image will be partitioned into nonintersecting areas, called catchment basins, bordered by the watershed lines, and the outer border of the image. The number of catchment basins cannot be different from the number of seeds, since no further identifiers are created in the course of flooding. This poses the problem of finding the optimal seeds, as using just any singularity of the relief as a seed leads to a considerable overfragmentation of the image, while too few seeds may result in the absorption of some important small details by larger areas.

It should be stressed that the above discussion by no means exhausts the matter. There are algorithms that result in very precise watershed lines (see, for example, [18]), as well as variations of the method that are suitable for parallel implementation (see [14]).

The watershed transform applied to the image itself still does not produce contours of the features; rather, it partitions the image into the areas associated with each seed. The desired contours could be extracted by applying the method to the modulus of the intensity gradient rather than the intensity itself. This way, both the minima and maxima become minima of the new function, which reaches its maxima at the points of the highest intensity gradient. A well in the image intensity becomes a crater in the modulus gradient representation and so does a peak; the process of flooding will stop at the crater’s border, which should highlight the real shape of the extremum as the eye catches it. There is generally no need to use any high-order approximation for spatial derivatives, since the original is usually contaminated with high-frequency noise that is going to be amplified by the differentiation anyway (this is, apparently, one of the weaknesses of the method). However, since the features we seek to extract are naturally masked by plenty of small details, there is a limit to the precision with which the notion of gradient is defined. In our experiments we used the simplest approximation

[equation missing from the source text]

which proved to be sufficient for our purposes.

Fig. 2. Reconstruction of function $f$ from function $g$.

Interestingly, the finding of the minima has nothing to do with the gradient approximation above. A minimum cannot be defined as a zero of the gradient since, first, that may occur at a saddle point or at a maximum, and second, plateaus of intensity sweeping significant areas are not uncommon, for which it is impossible to determine the local minimum from a small neighborhood anyway. (In the continuous analogy, one would have to check spatial derivatives up to a very high order in order to differentiate a true minimum from an intricate saddle configuration.) One would ask if the numerical procedure required to find the minima is computationally viable. In fact, mathematical morphology provides a clever way of finding minima without embarking on an exhaustive search.

The procedure used to extract the extrema of a function is called the geodesic reconstruction [5]. To illustrate it, consider two functions $f$ and $g$ and suppose that $g \le f$ (i.e., $g(p) \le f(p)$ for every pixel $p$) (see Fig. 2). Function $f$ is called the mask image and $g$ is the marker. Denote by $B$ the elementary ball of the grid being used. For example, $B$ is a hexagon in six-connectivity, a five-pixel cross in four-connectivity, or a nine-pixel square in eight-connectivity. The reconstruction of $f$ by $g$ is obtained by iterating the following operation until stability is reached:

$$g^{(n+1)} = \min\bigl(g^{(n)} \oplus B,\; f\bigr)$$

where $n$ is the level of iteration performed, and $\oplus$ stands for dilation.

In the binary case, reconstructing $f$ from $g$ allows us to extract those connected components of binary image $f$ that contain at least one pixel of $g$ [10]. This extends to the gray-level case in terms of peaks: as illustrated by Fig. 2, only the peaks of $f$ that are marked by $g$ are preserved through reconstruction.

Among various applications of this useful transformation, extraction of local extrema is of interest for us now. To find the maxima of an image $f$ it suffices to reconstruct $f$ from $f - 1$. By algebraic difference between $f$ and the reconstructed function, one gets the desired maxima. To extract the minima, one can reverse the image or use a dual reconstruction of $f$ from $f + 1$.
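The reconstruction and the maxima extraction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes integer-valued images, four-connectivity, and NumPy; the function names `dilate4`, `reconstruct`, and `regional_maxima` are hypothetical, and the plain iteration to stability is the simplest (not the fastest) way to compute the result.

```python
import numpy as np

def dilate4(img):
    """Gray-level dilation by the five-pixel elementary ball (four-connectivity)."""
    # With edge replication an out-of-image neighbour repeats the pixel itself,
    # so the border introduces no new values.
    p = np.pad(img, 1, mode="edge")
    return np.maximum.reduce([
        p[1:-1, 1:-1],  # the pixel itself
        p[:-2, 1:-1],   # neighbour above
        p[2:, 1:-1],    # neighbour below
        p[1:-1, :-2],   # neighbour to the left
        p[1:-1, 2:],    # neighbour to the right
    ])

def reconstruct(mask, marker):
    """Iterate marker <- min(dilate(marker), mask) until nothing changes."""
    current = np.minimum(marker, mask)
    while True:
        nxt = np.minimum(dilate4(current), mask)
        if np.array_equal(nxt, current):
            return current
        current = nxt

def regional_maxima(f):
    """Boolean map of the maxima: where f exceeds its reconstruction from f - 1."""
    f = np.asarray(f, dtype=np.int64)
    return (f - reconstruct(f, f - 1)) > 0

f = np.array([[1, 1, 1, 1, 1],
              [1, 3, 1, 2, 1],
              [1, 1, 1, 1, 1]])
maxima = regional_maxima(f)  # True only at the two peaks (values 3 and 2)
```

Minima can be obtained with the same functions by reversing the image first, for example `regional_maxima(f.max() - f)`.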

A. The Procedure

The process of flooding was implemented as follows. A data structure was used, called ordered queue, whereby pixels are scheduled for processing. Initially, this is a pack of $N$ first-in-first-out (FIFO) buffers, $N$ being the number of levels of intensity in the image. An ordering is established on the buffers with buffer zero being the most senior and buffer $N-1$ the most junior one in the pack. FIFO elements are scheduled for processing beginning with the most senior buffer elements. The head element (pixel) from a FIFO buffer is taken only if all FIFOs senior to the buffer are empty.

The process of flooding begins with enqueuing all seeds (with their IDs) to the buffers corresponding to the seed value. As seeds, we used all minima of the relief, which we had found using the method described above. At every iteration, the head of the most senior nonempty FIFO is retrieved. As well as any other FIFO cell, it contains coordinates of a pixel, its ID and value (which, in our case, is a value of the intensity gradient). The rest of the iteration is as follows.

1) The neighboring pixels of the image are retrieved (based on the coordinates supplied by the cell).

2) Those neighbors currently having no ID are given the ID of the cell, which is noted on the image, and are immediately enqueued according to their value.

3) The ones that have some ID (no matter same or different) are ignored.

4) The current cell is removed from the buffer.
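The four steps above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the relief is taken to be a non-negative integer array, seeds are given as an array of positive IDs with 0 meaning "no ID", four-connectivity is used, and each cell stores only pixel coordinates because the ID and value are read back from the arrays. The function name `flood` is hypothetical.

```python
from collections import deque
import numpy as np

def flood(relief, seeds):
    """Propagate seed IDs over an integer relief using an ordered queue."""
    n_levels = int(relief.max()) + 1
    queues = [deque() for _ in range(n_levels)]  # buffer 0 is the most senior
    labels = seeds.copy()
    rows, cols = relief.shape

    # Enqueue every seed pixel in the buffer matching its relief value.
    for r, c in zip(*np.nonzero(seeds)):
        queues[int(relief[r, c])].append((int(r), int(c)))

    level = 0
    while level < n_levels:
        if not queues[level]:
            level += 1  # move on to the next, more junior buffer
            continue
        r, c = queues[level].popleft()  # head of the most senior nonempty FIFO
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and labels[nr, nc] == 0:  # neighbours with an ID are ignored
                labels[nr, nc] = labels[r, c]
                value = int(relief[nr, nc])
                queues[value].append((nr, nc))
                level = min(level, value)  # a lower value is served first
    return labels

# Two seeds at the ends of a flat strip: the FIFO order splits the plateau evenly.
plateau = np.zeros((1, 6), dtype=int)
seeds = np.array([[1, 0, 0, 0, 0, 2]])
split = flood(plateau, seeds)
```

In this variant every pixel receives an ID, so the watershed lines are implicit: they run between neighboring pixels that carry different IDs.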

Consider now the issue of whether the distribution of IDs over the image is uniquely defined by the watershed method. It is clear that there is an ambiguity with respect to the position of the watershed lines on the plateaus. Indeed, consider two minima embedded in a large plateau. It is unclear where the border between them should be drawn. However, it is desirable to put the border exactly “halfway between” the two minima, since that reflects equal significance of these in topographical terms. There are implementations of the watershed that draw borders very accurately (see [18]). However, in most real-life applications, and certainly in the granite images we are dealing with, this issue hardly ever arises. Indeed, in order to have large plateaus, image acquisition would have to be perfect, for the result changes dramatically whenever the value of just one pixel is altered due to noise. Having said that, if a plateau occurs nevertheless, the use of a queue as a sorting mechanism provides for an even split between neighboring catchment basins.

Fig. 3. The RGB coordinates of the (a) samples are (246, 255, 155) and (255, 255, 132). For the samples (b) the coordinates are (25, 20, 0) and (0, 20, 0). Although the (a) samples and the (b) ones are the same Euclidean RGB distance apart, the perceived color difference is much greater for the (b) samples. Although the RGB space is not fully defined, all RGB spaces are related to the perceptually uniform spaces by a nonlinear transformation, and therefore the gradient in all RGB spaces is not expected to reflect the perceived color difference.

B. LUV Gradient

The watershed method is meaningful only for gray-level analysis. It is based on the existence of a complete order relation on the data being processed. Since such a relation does not exist in color space, the watershed transform is not applicable immediately [19].

Rather than trying to introduce an order relation into a color space, we propose to use a transformation that is analogous to the gradient transform in the gray-level case. To this end, we consider the transformation that maps each pixel onto the distance to its farthest neighbor. By distance we mean Euclidean distance in the LUV space, and the neighboring relation is given by the connectivity on the grid. The proposed transformation assigns the rate of “color variance” to each pixel, because LUV space is designed so that Euclidean distance quantifies color similarity between two given points. It is this particular property of the LUV space that facilitates its use for color gradient computation. Indeed, consider two sets of colors as those in Fig. 3. Although the two (a) and the two (b) samples are the same red–green–blue (RGB) distance apart, the perceived difference between the (b) samples is much greater than that between the (a) ones. However, if we were to calculate the color gradient in the RGB space, the (a) samples and the (b) samples would give the same value for the color gradient, provided the (a) samples happened to be neighboring pixels, and likewise the (b) samples.
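The transformation defined above (each pixel mapped to the Euclidean LUV distance to its farthest neighbor) can be sketched as below. The sketch assumes the image has already been converted to LUV and stored as an H x W x 3 NumPy array, uses four-connectivity, and simply ignores neighbors that fall outside the image; the function name `luv_gradient` is hypothetical.

```python
import numpy as np

def luv_gradient(luv):
    """Distance from each pixel to its farthest 4-neighbour in LUV space."""
    luv = np.asarray(luv, dtype=float)
    grad = np.zeros(luv.shape[:2])

    # Distances between vertically adjacent pixels, shape (H-1, W).
    d = np.linalg.norm(luv[1:] - luv[:-1], axis=-1)
    grad[1:] = np.maximum(grad[1:], d)    # seen from the lower pixel
    grad[:-1] = np.maximum(grad[:-1], d)  # seen from the upper pixel

    # Distances between horizontally adjacent pixels, shape (H, W-1).
    d = np.linalg.norm(luv[:, 1:] - luv[:, :-1], axis=-1)
    grad[:, 1:] = np.maximum(grad[:, 1:], d)    # seen from the right pixel
    grad[:, :-1] = np.maximum(grad[:, :-1], d)  # seen from the left pixel
    return grad

# One pixel differs from a uniform background by the LUV vector (3, 4, 0).
image = np.zeros((2, 3, 3))
image[0, 1] = (3.0, 4.0, 0.0)
gradient = luv_gradient(image)  # 5.0 at that pixel and at its three neighbours
```

The output is real-valued, so it would need to be quantized to a finite number of levels before being used with a flooding scheme that keeps one buffer per level.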

Note that there are two issues not to be confused:

1) the choice of the color space;

2) the choice of the metric in a color space.

While the first one is very important for the adequate calculation of the color gradient, the second one is merely a way of defining distance between colors. We have used the Euclidean distance as a metric because the LUV space was designed with this metric in view. There are other possibilities, too. For example, if one wants to exclude the change of illumination from consideration, then

[equation missing from the source text]

perhaps would serve the purpose. On the other hand, for the samples in Fig. 3, it is difficult to imagine any homogeneous metric in the RGB space that would bring the RGB gradient close to human perception of the color change [9].
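The comparison of the Fig. 3 samples can be checked numerically, with a strong caveat: the paper does not define its RGB space, so the sketch below assumes linear RGB with Rec. 709 primaries and a D65 white point, and uses the standard CIE 1976 L*u*v* formulas. The names `rgb_to_luv`, `chromaticity`, and `distance` are hypothetical, and the outcome depends on the assumed RGB encoding (a gamma-encoded interpretation of the same triplets can give a different ordering).

```python
import numpy as np

# Assumption: linear RGB with Rec. 709 primaries and a D65 white point.
RGB_TO_XYZ = np.array([[0.4124, 0.3576, 0.1805],
                       [0.2126, 0.7152, 0.0722],
                       [0.0193, 0.1192, 0.9505]])
WHITE_XYZ = RGB_TO_XYZ @ np.ones(3)

def chromaticity(xyz):
    """CIE 1976 (u', v') coordinates; black is mapped to (0, 0)."""
    x, y, z = xyz
    den = x + 15.0 * y + 3.0 * z
    return (4.0 * x / den, 9.0 * y / den) if den > 0 else (0.0, 0.0)

def rgb_to_luv(rgb):
    """Convert one 8-bit RGB triplet to (L, U, V) under the assumptions above."""
    xyz = RGB_TO_XYZ @ (np.asarray(rgb, dtype=float) / 255.0)
    y_rel = xyz[1] / WHITE_XYZ[1]
    if y_rel > (6.0 / 29.0) ** 3:
        lightness = 116.0 * np.cbrt(y_rel) - 16.0
    else:
        lightness = (29.0 / 3.0) ** 3 * y_rel  # linear segment near black
    u, v = chromaticity(xyz)
    un, vn = chromaticity(WHITE_XYZ)
    return np.array([lightness,
                     13.0 * lightness * (u - un),
                     13.0 * lightness * (v - vn)])

def distance(p, q):
    """Euclidean distance between two 3-vectors."""
    return float(np.linalg.norm(np.asarray(p, float) - np.asarray(q, float)))

pair_a = ((246, 255, 155), (255, 255, 132))  # samples (a) of Fig. 3
pair_b = ((25, 20, 0), (0, 20, 0))           # samples (b) of Fig. 3

rgb_a, rgb_b = distance(*pair_a), distance(*pair_b)
luv_a = distance(rgb_to_luv(pair_a[0]), rgb_to_luv(pair_a[1]))
luv_b = distance(rgb_to_luv(pair_b[0]), rgb_to_luv(pair_b[1]))
```

Under these assumptions the two RGB distances differ by less than one unit, while the LUV distance of the (b) pair is more than twice that of the (a) pair, in line with the perceptual argument made for Fig. 3.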

One might argue that regardless of the color space used, the change in color is usually accompanied by an illumination change, which makes gray-level gradient sufficient, especially when the image is expected to be oversegmented with a view to reduce oversegmentation at a later stage. Fig. 4 illustrates the LUV gradient in comparison with the gradient of the gray-level version of the image. One can see that the LUV gradient offers a much better starting point for segmentation than does the gray-level gradient, which is not too surprising, as it is essential to take color characteristics of the image into account as soon as possible.

Fig. 4. (a) Original image. (b) LUV gradient. (c) Gradient of the gray-level version of the image.

III. CHROMATIC EVALUATION OF BORDERS

We have found that the watershed transformation performed on the LUV gradient of the color image identifies distinct features very well. Moreover, what is usually believed to be a disadvantage of the watershed transformation, namely its tendency to significantly oversegment the image, turns out to be an advantage in our case. Indeed, the standard way of avoiding oversegmentation is to use some additional information to define some meaningful set of markers, thus excluding some of the local minima from consideration. This approach is not applicable in our case, because our aim is to develop an algorithm that does not rely on any additional information and which works well on images within a wide range of texture and color characteristics typical of granite images. We therefore make sure that all the features, however insignificant, are extracted. It is only then that we consider suppression of insignificant minima in order to reduce the computational complexity.

The technique we use for this purpose is known as waterfall [21], [22]. It considerably reduces oversegmentation without compromising the generality of the approach.

Consider a positive, bounded function $f$ (see Fig. 5). It is easy to notice that while two of the minima are significant and are likely to correspond to the features in the image, the rest of the local minima are less significant and are likely to be caused by noise. In order to suppress the irrelevant minima, let us first perform the watershed on the original image with all the local minima being considered. Fig. 5(a) shows the positions of the watershed dams found. Let $W$ be the loci of the watershed dams. Consider the following function:

$$g(x) = \begin{cases} f(x), & x \in W, \\ f_{\max}, & x \notin W, \end{cases}$$

where $f_{\max}$ is the upper bound of $f$. Function $g$ is obviously greater than $f$. Let us now perform the closing by dual reconstruction of $f$ from $g$. The result of this reconstruction is shown in Fig. 5(b). It is easy to see that the minima of the resulting image correspond to the significant minima of the original image.

If we suppress the irrelevant minima of the LUV gradient using this method, we could use the significant minima as markers for the watershed. The result is by far less oversegmented than the original watershed (see Fig. 6). This approach is somewhat similar to using a threshold in dynamics [26]. This similarity, however, is superficial: The dynamics are controlled by the relative altitudes of the minima. The waterfalls, on the contrary, are related to the relative heights of the watershed lines [21]. Waterfalls have some advantages over threshold in dynamics [21]. In our particular case, the drawback of dynamics is that a threshold is needed to extract the minima with high dynamics, which contradicts the automatic nature of the algorithm.

The segmentation produced so far is not good enough to be considered final, because the image is still oversegmented. Indeed, the boundaries produced so far do not have the same significance. Those that are inside regions with a high degree of uniformity are less significant. We need to suppress insignificant borders similarly to what the human eye does. It is our assumption that the human eye compares borders between the neighboring regions according to the color difference of their associated regions, rather than colors themselves. Our aim is therefore to find a color space where the color difference would represent the human perception.

Fig. 5. Detection of significant minima.

Fig. 6. (a) Original image. (b) Watershed with all minima. (c) Watershed with significant minima.

We have tried RGB, HSV, and LUV systems, with the color difference being either the Euclidean distance or the maximum of the three component differences. We have found that the most successful combinations are HSV with the color difference being the maximum of the three component differences, and LUV with the color difference being the Euclidean distance. In the discussion that follows, we use the latter.

Before removing the “weak” borders, we assign a vector to every catchment basin of the initial watershed whose components are the average L, U, and V over the catchment basin. Note that the transformation from RGB to LUV space is not linear, thus the noise distribution is nonuniform over LUV space, with noise being amplified near the origin of the axes. For this reason, we require that the noise from the physical device (be it the camera or other means of recording the image) is less than the natural fluctuations of the scene. In our case that means that the noise could be neglected compared with the deviation in L, U, and V.

We have tried the median, as well as mean, as the representation of a catchment basin, but we found that the former gives somewhat less satisfactory results. This leads us to believe that human vision “averages” the color of a near-uniform fragment of an image. Consequently the mean value of color has been used thereafter.
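Assigning the mean (L, U, V) vector to every catchment basin can be sketched with NumPy as follows. The sketch assumes an H x W x 3 LUV array and an H x W integer label image whose IDs run from 0 to K-1 with no ID unused (an unused ID would cause a division by zero); the function name `basin_means` is hypothetical.

```python
import numpy as np

def basin_means(luv, labels):
    """Return a (K, 3) array holding the mean L, U, V of each basin."""
    flat_labels = labels.ravel()
    flat_luv = np.asarray(luv, dtype=float).reshape(-1, 3)
    counts = np.bincount(flat_labels)  # number of pixels per basin
    sums = np.stack(
        [np.bincount(flat_labels, weights=flat_luv[:, k]) for k in range(3)],
        axis=1,
    )  # per-basin sums of L, U and V
    return sums / counts[:, None]

labels = np.array([[0, 0],
                   [1, 1]])
luv = np.array([[[10.0, 0.0, 0.0], [20.0, 2.0, 4.0]],
                [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]]])
means = basin_means(luv, labels)
patchwork = means[labels]  # image in which every basin has constant color
```

Indexing `means` with the label image yields the image of constant-color pieces that the subsequent merging works on.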

By now, the image has been transformed into a patchworkof pieces of constant color, which, in fact, form larger regionsidentifiable by the human eye. The contrast between spots ofcolor inside such a region is not very high, whereas the bordersbetween regions separate more contrasting colors. In orderto reduce this oversegmentation, ahierarchical segmentation[10] procedure is usually applied. It starts at the minimaltransition and proceeds in a way resembling the watershedtransform of a graph. The vertices of the graph correspondto the boundaries and the edges of the graph connect thevertices related to the same catchment basin. Having tried thismethod, we have rejected it, the reason being that hierarchicalsegmentation takes into consideration only relative contrastsbetween the borders while totally disregards the absolute valueof the border contrast. This is illustrated by Fig. 7, where bothdiagrams depict the patchwork image after the watershed isperformed.

For both cases, the sequence of relative contrasts between the borders is exactly the same, namely (in ascending order)

• the border between regions 1 and 2;
• the border between regions 3 and 4, and between 3 and 5;
• the border between regions 1 and 3;
• the border between regions 2 and 3.

It is clear that the desired segmentation for the left image of Fig. 7 would be to segment region 3 and consider all other regions as the background, while for the right image it would be to segment regions 3, 4, and 5 as a single region and regions 1 and 2 as the background. The watershed on the graph, however, would not distinguish between the two cases. Moreover, in the configuration in question, the border between regions 3 and 4 is a local minimum, that is, no border related to the regions 3 or 4 is smaller. With this being the case, this border would be the first one to go in the neighborhood, were the watershed on the graph to be performed. However, Fig. 7, left, shows that the removal of this border would be inappropriate.

Fig. 7. Patchwork image after the watershed has been performed [panels (a) and (b)].

Taking the points made above into consideration, we have chosen the following method of reducing oversegmentation (similar ideas were proposed in [24] and [25]).

1) Every border is assigned the LUV distance between the two patches it separates.
2) We sort the borders in increasing order.
3) We merge two regions separated by the least contrast border.
4) We repeat step 2 until the termination criterion is satisfied.
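The steps above can be sketched as a simple loop. This is a brute-force illustration under stated assumptions, not the authors' code: regions are held in hypothetical dictionaries (`means`, `sizes`, `neighbors`), the merged region is assumed to take the area-weighted mean color, the border contrasts are recomputed after every merge instead of keeping a sorted list, and the termination criterion is passed in as a callback `stop`.

```python
import math

def luv_distance(c1, c2):
    """Euclidean distance between two mean colors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def merge_weakest(means, sizes, neighbors):
    """Merge, in place, the two adjacent regions with the least color contrast."""
    contrast, a, b = min(
        (luv_distance(means[p], means[q]), p, q)
        for p in neighbors for q in neighbors[p] if p < q
    )
    total = sizes[a] + sizes[b]
    # Assumption: the merged patch takes the area-weighted mean color.
    means[a] = tuple((sizes[a] * x + sizes[b] * y) / total
                     for x, y in zip(means[a], means[b]))
    sizes[a] = total
    # Rewire the neighbors of b to a; duplicate borders collapse in the sets.
    for q in neighbors.pop(b):
        neighbors[q].discard(b)
        if q != a:
            neighbors[q].add(a)
            neighbors[a].add(q)
    del means[b], sizes[b]
    return contrast

def merge_until(means, sizes, neighbors, stop):
    """Repeat the weakest-border merge until stop(...) returns True."""
    while len(means) > 1 and not stop(means, sizes, neighbors):
        merge_weakest(means, sizes, neighbors)

# Hypothetical four-patch example; stop when two regions remain.
means = {1: (50, 0, 0), 2: (52, 0, 0), 3: (80, 0, 0), 4: (81, 0, 0)}
sizes = {1: 10, 2: 10, 3: 10, 4: 10}
neighbors = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
merge_until(means, sizes, neighbors, lambda m, s, n: len(m) <= 2)
print(means)
```

A priority queue keyed on border contrast would avoid rescanning all borders at every iteration; the rescan is kept here only for brevity.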

This procedure, being rather straightforward, is somewhat more computationally expensive than watershed on a graph, but we found that it gives much better results. However, no matter what form of hierarchical segmentation is used, the issue of terminating the iterations at some point remains, depending on the desirable degree of segmentation.

So far, we have managed to keep our algorithm free of the need of any additional information, be it various thresholds, marker extraction rules, and the like. We would like to keep it that way, that is, to introduce a termination criterion that relies entirely on the information contained within the image. This is not an easy task, however, due to the nature of the granite images we are dealing with. It is pointless to base the termination criterion we are seeking on color or contrast characteristics of the image, because these characteristics vary significantly within the range of colors and contrasts present.

IV. TERMINATION CRITERION

We have approached the question of terminating the algorithm from a different point of view. Indeed, the topology of the finally segmented granite image should remain the same regardless of its color or texture characteristics; that is, we are aiming to segment the image into separate blobs on a singly connected background. Consider the image as a simple undirected graph with its nodes corresponding to the patches of constant color and its edges corresponding to the boundaries between patches [15]. Every time we merge two regions, the graph is modified by merging two nodes into a new one, with its neighbors being the neighbors of the original two, and the redundant edges removed. The node with the maximum degree (i.e., maximum number of edges) corresponds to the background. This is illustrated by Fig. 8.

It seems appropriate now to iterate this procedure until the graph becomes a tree. Indeed, the resulting segmentation would be exactly what we seek. Consider, however, what happens when two salient blobs are close to each other, so that they have a common boundary (see Fig. 9).

If a termination criterion were to be the graph becoming a tree, the procedure would iterate until only the strongest of the blobs survives. This is obviously not what we intend to achieve, since the two blobs in question may be the strongest features in the image. In such a case, the rest of the blobs would be removed regardless of their saliency.

For this reason, we have to take care of the situation where two blobs have a common boundary: The algorithm has to terminate before the graph degenerates to a tree, and so a few “extra” edges should be left in place. The question is how many additional edges need be left or, in other words, how many blobs are likely to touch each other given the size of the image, the number of blobs, and their size distribution.

In order to establish the likely rate of intersection for a set of blobs on a plane, we should assume certain statistical properties of the blob center locations¹. Let $P(k, A)$ denote the probability to find $k$ blobs centred within a shape of area $A$. Furthermore, let the symbol $o(h)$ represent any function such that $\lim_{h \to 0} o(h)/h = 0$; for example, the function $h^2$ is $o(h)$, and $o(h) + o(h) = o(h)$.

Our assumptions are the following.

1) $P(1, h) = \lambda h + o(h)$, where $\lambda$ is a positive constant and $h > 0$.
2) $\sum_{k=2}^{\infty} P(k, h) = o(h)$.
3) The numbers of blobs centred in nonoverlapping areas are statistically independent.

¹ We are grateful to an anonymous referee who suggested using the Poisson distribution for this process.

Fig. 8. Segmented image and its corresponding adjacency graph representation. (a) One extra edge remains. (b) The graph is a tree.

Fig. 9. An example illustrating a case when the graph can become a tree only after one of the touching blobs goes. For this to happen, the rest of the blobs have to go first, because the borders of the touching blobs are the strongest in the image.

Assumptions 1 and 3 state that the probability of finding a blob centred within a small shape of area $h$ is independent of the number of blobs centred within other nonoverlapping areas and is approximately proportional to the shape area $h$. The meaning of 2 is that the probability of finding centers of two or more blobs within the same small shape is practically equal to zero. It could be shown [27] that the above postulates ensure a Poisson process with the distribution

$$P(k, A) = \frac{(\lambda A)^k e^{-\lambda A}}{k!} \qquad (1)$$

where $\lambda$ is the density of a Poisson process. The condition for the above distribution to be applicable is

(2)
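As a numerical illustration of the Poisson assumptions, the sketch below evaluates the standard Poisson probability of finding exactly k centres in a region, (λA)^k e^{-λA} / k!, for a small area. The density and area values are hypothetical; they are chosen only to show that the probability of two or more centres in a small shape is much smaller than that of one.

```python
import math

def poisson_prob(k, density, area):
    """Probability of exactly k blob centres in a region of the given area."""
    mean = density * area
    return mean ** k * math.exp(-mean) / math.factorial(k)

density = 100.0     # hypothetical: blob centres per unit area
small_area = 0.001  # hypothetical small shape, so density * area = 0.1

p0 = poisson_prob(0, density, small_area)
p1 = poisson_prob(1, density, small_area)
p_two_or_more = 1.0 - p0 - p1
print(p1, p_two_or_more)  # roughly 0.09 versus 0.005
```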

Let us model the blobs as being “roundish” with their radius being a random variable $r$ distributed according to some probability density $f(r)$. We are interested in the case of small total area of blobs (compared to the total image area $S$), so that intersection of more than two blobs is statistically unlikely (assumption 2). The sum of radii of two blobs, $\rho$, is distributed according to the convolution of the probability density function of the blob size with itself, as follows:

$$g(\rho) = \int_0^{\rho} f(r)\, f(\rho - r)\, dr \qquad (3)$$
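The self-convolution of the radius density can be checked numerically. The sketch below is an illustration under assumptions: the density is sampled on a uniform grid of step `dr`, and a uniform radius distribution between 1.0 and 2.0 (arbitrary units) is used as a hypothetical example. It verifies that the resulting density of the sum of two radii integrates to one and has twice the mean radius.

```python
def self_convolution(f, dr):
    """Discrete convolution of sampled density f with itself.

    f[i] approximates the density at radius i * dr; the returned g[k]
    approximates the density of the sum of two radii at k * dr.
    """
    n = len(f)
    g = [0.0] * (2 * n - 1)
    for i in range(n):
        if f[i] == 0.0:
            continue
        for j in range(n):
            g[i + j] += f[i] * f[j] * dr
    return g

dr = 0.01
# Hypothetical radius density: uniform between 1.0 and 2.0.
f = [1.0 if 100 <= i < 200 else 0.0 for i in range(300)]
g = self_convolution(f, dr)

total = sum(g) * dr
mean_r = sum(i * dr * fi for i, fi in enumerate(f)) * dr
mean_rho = sum(k * dr * gk for k, gk in enumerate(g)) * dr
print(total, mean_r, mean_rho)
```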

Now we are in a position to define the probability of two blobs to intersect. Let us fix the centre of blob 1. The two blobs intersect if

• the sum of their radii falls within the interval from $\rho$ to $\rho + d\rho$ (the probability of this happening is $g(\rho)\,d\rho$, with $g$ the density of the sum of radii in (3));
• the center of blob 2 falls within the disk of radius $\rho$ from the centre of blob 1 (the probability of this happening is ).

Multiplying the corresponding probabilities and summing over all possible values of $\rho$ we obtain

(4)


The density of the Poisson process is $\lambda = N/S$, where $N$ is the number of blobs and $S$ is the total area of the image. The upper limit in the outer integral is set to $\infty$, since the blobs are very unlikely to have sizes commensurate with that of the image, and so the inner integral falls off very quickly as $\rho$ increases, which makes it possible to neglect both the boundary effects on the function and the finite area of integration. The above formula should be multiplied by to get the number of expected intersections as opposed to the probability of an intersection to occur, and we have finally the number of intersections as follows:

(5)

Let us estimate the above number. Consider first the case of the blob distribution being single-sized, i.e., the radius density being a $\delta$-function set at some $r_0$. Then the density of the sum of two radii is also a $\delta$-function, but it is set at the point $2r_0$. The number of intersections according to (5) should be

(6)
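The single-sized case lends itself to a quick numerical check. The sketch below is a Monte Carlo illustration under assumptions of its own and does not reproduce formula (6): N disc centres are dropped uniformly on a square of side `side` with periodic (torus) boundaries, all discs share a hypothetical radius `radius`, and the simulated number of overlapping pairs is compared with the elementary pair count N(N−1)/2 · π(2·radius)²/side², which follows from each pair overlapping with probability π(2·radius)²/side² when boundary effects are absent.

```python
import math
import random

def count_touching_pairs(n, radius, side, rng):
    """Count overlapping pairs among n equal discs dropped on a torus."""
    pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    limit_sq = (2 * radius) ** 2
    count = 0
    for i in range(n):
        xi, yi = pts[i]
        for j in range(i + 1, n):
            dx = abs(xi - pts[j][0])
            dy = abs(yi - pts[j][1])
            dx = min(dx, side - dx)  # periodic wrap-around
            dy = min(dy, side - dy)
            if dx * dx + dy * dy < limit_sq:
                count += 1
    return count

def expected_pairs(n, radius, side):
    """Pairwise estimate of the number of overlapping pairs (no boundary effects)."""
    return n * (n - 1) / 2 * math.pi * (2 * radius) ** 2 / side ** 2

rng = random.Random(0)
n, radius, side, trials = 150, 0.01, 1.0, 20   # hypothetical values
simulated = sum(count_touching_pairs(n, radius, side, rng)
                for _ in range(trials)) / trials
print(simulated, expected_pairs(n, radius, side))
```

With a small total blob area, as assumed in the text, triple overlaps are rare, so the pair count is a reasonable proxy for the number of touching blobs.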

In order to quantify the influence of the finite width of the blob size distribution function, neglected in the previous case, we can change the order of integration and substitute to obtain

(7)

Note that for we arrive at the same result as (6). Let us now estimate (7) for the finite width of the distribution. For we have from (7) in the first order in

(8)

where the overline denotes average. To obtain higher order terms we should expand about the point

This, however, would involve higher moments of the distribution. Instead, we estimate (7) as follows (see Appendix for details):

(9)

(we have substituted from (8)). Note that for and we again arrive at the same result as (6), while for we arrive at (8) in the first order in

The above calculation is along similar lines to the one given in [29]. However, the derivation in [29] is given for the general case for which the complete distribution of blob sizes is available. As an example, the formula derived is also applied for the case when all blobs are of exactly the same size. Our derivation here differs in the sense that we allow a blob size distribution, which, however, we assume to be narrow. At the limit when our assumed distribution collapses to a delta function, our derived formulas exactly agree with those in [29].

The approximation (9) is valid if (see Appendix)

(10)

For the small ratio the approximation (9) is better than the straightforward expansion of (7) in because the condition for such expansion to be valid would be This condition is stronger than (10), because it does not take into consideration the fact that is small.

Note that if the Poisson distribution is applicable at all, condition (2) holds. Substituting therefor we find that

(11)

It follows then, that even for not-so-small values of our approximation is still valid, because condition (11) would see it through.

Since during the computation the list of blobs’ areas was maintained, we express (9) in terms of rather than , as follows:

(12)

Both and could be easily obtained from the list of blobs’ areas. It is interesting that we have arrived at this result without making any assumptions on the exact type of the distribution. All information on the type of this distribution is contained in its moments.

Note that the only place where the assumption about blobs’ roundness was explicitly used is formula (3); the rest of the discussion could be reformulated in terms of blob area, rather than its radius. Thus, if the blobs could be described by their “characteristic linear size” so that (3) holds, the rest of the derivation is valid. Indeed, we have found that in practice (12) gives a good estimate for the number of blob intersections.

According to the above estimate, segmentation should stop when the number of redundant edges in the graph of the segmented image [corresponding to loops, like edge 7–8 in Fig. 8(a)] is roughly equal to the expected number of intersections of blobs, as predicted by the statistical analysis above. In the early stages of iteration the image is severely oversegmented, and thus the background is not recognisable as the node with the highest number of neighbors. This does not invalidate the method, because the number of intersections is so high that it does not matter which node is taken to be the background, since the termination criterion is not going to be met anyway.
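The stopping test described above can be sketched on an adjacency structure. The sketch makes several assumptions: the graph is stored as a hypothetical dictionary mapping each region to the set of its neighbors, it is connected, the number of redundant edges is taken as the number of edges in excess of a spanning tree (E − V + 1), and "roughly equal" is implemented as "not greater than". The expected number of intersections is supplied by the caller.

```python
def background_node(neighbors):
    """Node with the largest number of neighbors, taken to be the background."""
    return max(neighbors, key=lambda v: len(neighbors[v]))

def redundant_edges(neighbors):
    """Edges in excess of a spanning tree of a connected graph: E - V + 1."""
    n_edges = sum(len(nb) for nb in neighbors.values()) // 2
    return n_edges - len(neighbors) + 1

def should_stop(neighbors, expected_intersections):
    """Assumed reading of the criterion: stop once the surplus edges no longer
    exceed the expected number of touching blobs."""
    return redundant_edges(neighbors) <= expected_intersections

# Hypothetical example: background 0 with blobs 1, 2, 3; blobs 1 and 2 touch.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(background_node(graph))   # 0
print(redundant_edges(graph))   # 1
print(should_stop(graph, 1.0))  # True
```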

V. EXPERIMENTAL RESULTS

The method described has been extensively tested on images of granite textures provided by a processing factory. The process was entirely automatic with the termination criterion as derived above. Fig. 10 shows the typical segmentation sequence based on the described algorithm. We notice that as the method proceeds, more and more boundaries are removed and larger areas are created. The removed boundaries, however, are those that perceptually seem less significant, with the most salient features preserved in the end as expected. Fig. 11 displays some more segmentation results using our method on six other granite images. These images were chosen to be of different granite types as well as of different scale. We note that in each case the algorithm stops when the most salient features in each image have been isolated. It is important that the resulting segmentation corresponds to human perception of the image. Indeed, a possible application of the algorithm is defect identification. An abnormally large blob or a blob of an unexpected color could constitute a defect. There are more subtle defects too, such as the one shown in the last row of Fig. 11, where the grain size is nonuniform over the image.

Fig. 10. Successive stages of iterating the algorithm.

We have found the segmentation procedure to be very robust. Although it has been designed to work specifically on granite images, when applied to images of a different kind it still produces very reasonable results, subject to the termination criterion. Indeed, we cannot make assumptions about the topology of an arbitrary color image. Instead, we stopped the procedure when the border to be removed had a contrast greater than a given fraction of the strongest border contrast. One can argue that this introduces a parameter that has to be set up prior to the processing, which makes the procedure less automatic. However, this parameter is merely a quantitative measure of the desired degree of segmentation, and could be set once for a wide range of images. Fig. 12 illustrates the application of the algorithm to arbitrary color images with this parameter set to 0.125. It can be seen that the algorithm produces very satisfactory results for a large variety of images and with the same set of parameters. Finally, in Fig. 13 we show some results when our algorithm is applied to a noisy image. Note that the parameter was set to exactly the same value as that for the images in Fig. 12.

Fig. 11. Segmentation results.
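The contrast-based stopping rule used for arbitrary images can be sketched as follows. The value 0.125 is the one quoted in the text; the function name is hypothetical, and taking the "strongest border contrast" over the borders currently present in the image is an assumption of this sketch.

```python
def stop_on_relative_contrast(border_contrasts, fraction=0.125):
    """Return True when the weakest remaining border is stronger than
    `fraction` times the strongest remaining border."""
    weakest = min(border_contrasts)
    strongest = max(border_contrasts)
    return weakest > fraction * strongest

# Hypothetical contrasts: the threshold is 0.125 * 40.0 = 5.0.
print(stop_on_relative_contrast([1.0, 10.0, 40.0]))  # False: keep merging
print(stop_on_relative_contrast([6.0, 10.0, 40.0]))  # True: stop
```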

VI. CONCLUSIONS

In this paper, we have addressed the problem of color segmentation in randomly textured images. A bottom-up segmentation algorithm has been proposed based on both color and texture properties of the image. The nature of the images under study did not allow any size-based preprocessing, the reason being that no small feature could be eliminated without considering its color relative to its neighborhood. Due to this constraint, it would be impossible to use, for example, the flat-zone approach [23], which relies upon initial filtering. Another fact that limits our use of traditional approaches is that the final segmentation sought was that based on the topology of the image rather than its intensity (or derivatives thereof, such as gradient, contrast, etc.).

The other problem was one of introducing the order relation in the color space to be used with any segmentation algorithm based on a hierarchical queue (such as traditional watershed, flat-zone, or any other region-growing algorithm). We have successfully used the LUV gradient to introduce such a hierarchy.

The introduction of this concept was crucial to the development of the algorithm. Many color processing schemes start from the gray-level image and introduce the color component later on in the process. Indeed, initially we experimented with such an approach where the minima used for the watershed algorithm were computed from the gray image [20]. It turned out that the results obtained were not particularly satisfactory. One can be convinced of that by looking again at Fig. 4. In the gray image [Fig. 4(c)] there is no contrast between the upper leg of the doll and the background and yet, as is obvious from the color version of the same image (Fig. 12), the upper leg of the doll is very distinguishable. It is clear that any algorithm, no matter how sophisticated, when it starts from Fig. 4(c), will never be able to recover the lost information. So it seems that it is imperative for the color information to be introduced in the process as early as possible. It can be seen that the watershed algorithm applied to the LUV gradient image we introduced [Fig. 4(b)] will be able to preserve all relevant boundaries.

Fig. 12. Segmentation of nongranite images.

Further, the termination criterion we have derived seems to work very well for the particular type of images we were interested in, although it is obviously only appropriate for bloblike images. For other types of images we have shown that one has to introduce only one arbitrary parameter for our scheme to work, namely a termination parameter. Although this is clearly dependent on the degree of saliency one wishes to preserve in segmenting the image, we showed that very satisfactory results could be achieved on a variety of images with this parameter set to a fixed value.

Finally, we may say that the algorithm is suitable for the automatic processing of granite or any other bloblike image, because it can be fully automatic and it does not require any fine tuning of parameters.

APPENDIX

We shall estimate the value of

(13)

assuming that is small. We can rewrite the above as follows:

(14)

Suppose that is the width of the distribution and the point where is centered. For small the averaged function is a very flat exponential, as shown in Fig. 1 for

Expanding around we obtain

(15)

where

(16)

is the width of the distribution of the random variable Taking into consideration that substituting for in the series above will introduce errors that are of higher order in we obtain

(17)

Differentiating the above with respect to and substituting in (14), we have

(18)

Thus

if (19)

For we have and [28], and (19) becomes

if

(20)

Fig. 13. Segmentation of a noisy image.

ACKNOWLEDGMENT

Special thanks are extended to an anonymous referee for comments concerning the termination criterion.

REFERENCES

[1] V. Cappellini, K. N. Plataniotis, and A. N. Venetsanopoulos, “Applications of color image processing,” in Proc. Int. Conf. on Digital Signal Processing, Limassol, Cyprus, 1995.

[2] J. Serra, Image Analysis and Mathematical Morphology. New York: Academic, 1982.

[3] K. Y. Song, J. Kittler, and M. Petrou, “Defect detection in random color textures,” Image Vis. Computing, vol. 14, pp. 667–683, 1996.

[4] H. Digabel and C. Lanteujoul, “Iterative algorithms,” in Proc. 2nd Europ. Symp. on Quantitative Analysis of Microstructures in Material Science, Biology and Medicine, Caen, France, 1977.

[5] L. Vincent, “Morphological grayscale reconstruction in image analysis: Applications and efficient algorithms,” IEEE Trans. Image Processing, vol. 2, pp. 176–201, Apr. 1993.

[6] J. L. Horowitz and T. Pavlidis, “Picture segmentation by a direct split and merge procedure,” in Proc. 2nd Int. Conf. on Pattern Recognition, 1974, pp. 424–433.

[7] F. Meyer, “Color image segmentation,” in Proc. 4th Int. Conf. Image Processing and Its Applications, Maastricht, The Netherlands, Apr. 1992.

[8] J. Crespo and R. Schafer, “The flat zone approach and color images,” Mathematical Morphology and Its Application to Image Processing. New York: Kluwer, 1994, pp. 85–92.

[9] G. Wyszecki and W. S. Stiles, Color Concepts and Methods, Quantitative Data and Formulae. New York: Wiley, 1982.

[10] S. Beucher and F. Meyer, “The morphological approach to segmentation: The watershed transformation,” Mathematical Morphology in Image Processing. New York: Marcel Dekker, 1993, pp. 443–481.

[11] L. Vincent, “Morphological algorithms,” Mathematical Morphology in Image Processing. New York: Marcel Dekker, 1993, pp. 255–288.

[12] P. Salembier, “Morphological multiscale segmentation for image coding,” Signal Process., vol. 38, pp. 359–386.

[13] S. Beucher and C. Lanteujoul, “Use of watersheds in contour detection,” in Proc. Int. Workshop on Image Processing, Real-Time Edge and Motion Detection/Estimation, Rennes, France, 1979.

[14] B. Cramariuc, A. Moga, and M. Gabbouj, “Image segmentation by component labeling,” in Proc. Int. Conf. Digital Signal Processing, Limassol, Cyprus, 1995, pp. 360–365.

[15] F. Meyer, “Minimum spanning forests for morphological segmentation,” Mathematical Morphology and Its Applications to Image Processing. Boston, MA: Kluwer, 1994, pp. 77–84.

[16] H. Digabel and C. Lanteujoul, “Iterative algorithms,” in Proc. 2nd Europ. Symp. on Quantitative Analysis of Microstructures in Material Science, Biology and Medicine, J.-L. Chermant, Ed., Caen, France, Oct. 1977, pp. 85–99.

[17] G. Matheron, Random Sets and Integral Geometry. New York: Wiley, 1975.

[18] L. Vincent and P. Soille, “Watersheds in digital spaces: An efficient algorithm based on immersion simulations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 6, June 1991, pp. 583–598.

[19] J. Serra, “Anamorphoses and functional lattices (multivalued morphology),” Mathematical Morphology in Image Processing. New York: Marcel Dekker, 1993, pp. 483–523.

[20] L. Shafarenko, M. Petrou, and J. Kittler, “The application of morphological filters in the color analysis of complex images,” in Proc. EUSIPCO-94, Edinburgh, U.K., 1994, pp. 439–442.

[21] S. Beucher, “Watershed, hierarchical segmentation and waterfall algorithm,” in Mathematical Morphology and Its Applications to Image Processing. Boston, MA: Kluwer, 1994, pp. 69–76.

[22] , Ph.D. dissertation, School of Mines, Paris, France.

[23] J. Crespo and J. Serra, “Morphological pyramids for image coding,” SPIE vol. 2094, pp. 159–170, 1993.

[24] B. Marcotegui, J. Crespo, and F. Meyer, “Morphological segmentation using texture and coding cost,” in Proc. IEEE Workshop on Nonlinear Signal and Image Processing, Neos Marmaras, Greece, June 1995, pp. 246–250.

[25] L. Shafarenko, M. Petrou, and J. Kittler, “Nonlinear filtering for color segmentation and enhancement,” in Proc. IEEE Workshop on Nonlinear Signal and Image Processing, Neos Marmaras, Greece, June 1995, pp. 190–193.

[26] M. Grimaud, “A new measure of contrast: Dynamics,” SPIE, Image Algebra and Morphological Image Processing III, vol. 1769, San Diego, CA, 1992.

[27] R. Hogg and A. Craig, Introduction to Mathematical Statistics, New York, 1966.

[28] A. Papoulis, Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill, 1965.

[29] P. Hall, Introduction to the Theory of Coverage Processes. New York: Wiley, 1988, p. 244.

Leila Shafarenko received the M.Sc. degree in physics in 1983 from the University of Novosibirsk, U.S.S.R., and the Ph.D. degree in 1996 from the University of Surrey, U.K.

She is currently with AuthenTec, Aylesbury, U.K., working on data encoding. The work presented here is part of her Ph.D. dissertation.

Maria Petrou (M’91) received the B.Sc. degree in physics in 1975 from the University of Thessaloniki, Greece. She studied applied mathematics, Part III, at the University of Cambridge, Cambridge, U.K., in 1977, and received the Ph.D. degree in astronomy in 1981 from the University of Cambridge.

She was a Lecturer in Astronomy at the University of Athens from 1981 to 1983, and a post-doctoral Research Assistant at the Department of Theoretical Physics, University of Oxford, U.K., from 1983 to 1986. She has been working on computer vision since 1986. She was a Research Associate at the NERC Unit for Thematic Information Systems (NUTIS), Geography Department, Reading University, Reading, U.K., and later an Atlas Research Fellow at Rutherford Appleton Laboratory, St. Hilda’s College, Oxford. Since 1988, she has been with the Department of Electronic and Electrical Engineering, University of Surrey, Surrey, U.K., where she is currently a Reader. She has worked on various aspects of image processing and computer vision, including texture and color analysis, fault detection, object recognition and reconstruction. She has more than 150 publications (more than 60 of them in refereed journals).

Dr. Petrou is a member of the British Machine Vision Association, the IEEE Society for Pattern Recognition, and the Optical Engineering Society. She has served as Associate Editor of IEEE TRANSACTIONS ON IMAGE PROCESSING and is newsletter editor for the International Association for Pattern Recognition.


Josef Kittler (M’74) received the B.A. degree in electrical engineering in 1971, the Ph.D. in pattern recognition in 1974, and the Sc.D. degree in 1992, all from the University of Cambridge, Cambridge, U.K.

He was a Research Assistant in the Engineering Department, Cambridge University, from 1973 to 1975, a SERC Research Fellow, Department of Electronics, University of Southampton, from 1975 to 1977, Royal Society European Research Fellow at the Ecole Nationale Superieure des Telecommunications, Paris, France, from 1977 to 1978, IBM Research Fellow at Balliol College, Oxford, U.K., from 1978 to 1980, Principal Research Associate at SERC Rutherford Appleton Laboratory from 1980 to 1984, and was Principal Scientific Officer, SERC Rutherford Appleton Laboratory, in 1985. He also worked as the SERC Coordinator for Pattern Analysis (1982), and was Rutherford Research Fellow at Oxford University. He joined the Department of Electrical Engineering, University of Surrey, in 1986, as a Reader in Information Technology. He became Professor of machine intelligence in 1991. He has worked on various theoretical aspects of pattern recognition and on many applications including system identification, automatic inspection, ECG diagnosis, remote sensing, robotics, speech recognition, character recognition, and line drawing processing. His current research interests include pattern recognition, image processing, and computer vision.

Dr. Kittler is co-author of Pattern Recognition: A Statistical Approach (Englewood Cliffs, NJ: Prentice-Hall). He is a member of the Committee of the British Machine Vision Association and Society for Pattern Recognition, and has served as the President of the International Association for Pattern Recognition. He is a member of the editorial boards of IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, Pattern Recognition Journal, Image and Vision Computing, Pattern Recognition Letters, Pattern Recognition, and Artificial Intelligence.