9

Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

Coalescing Texture Descriptors�

Yossi Rubner and Carlo TomasiComputer Science Department, Stanford University

Stanford, CA 94305

Abstract

ARPA Image UnderstandingWorkshop 1996

As an alternative to texture segmentation forthe description of the images, we propose amethod for coalescing descriptors of adjacentimage patches with similar textural content intotight clusters. We achieve this result by extend-ing the notion of edge-preserving smoothing andanisotropic di�usion from gray-level and colorimages to vector-valued images that describethe textural content of each image patch. Tothis end, we �rst compute raw texture descrip-tors, that is, vectors that describe local textureappearance. We then de�ne texture contrast asthe maximum derivative of the texture descrip-tor vector, and we de�ne signi�cant regions asthose in which contrast is low. We proposean e�cient implementation of edge-preservingsmoothing and show that it is closely related toanisotropic di�usion. Experiments on texturemosaics and real images show that clear tex-ture edges are found even where intensity edgesare weak or poorly de�ned, and that signi�canttexture regions yield tight clusters of texture de-scriptors.

1 Introduction

Describing the textural content of images is animportant task in the indexing, perceptual orga-nization, and interpretation of pictures. For in-stance, in our work on image databases [Tomasiand Guibas, 1994; Guibas et al., 1995] we indexpictures by vectors that encode the texture con-tent of di�erent parts of an image. To this end,it is useful to determine areas of approximatelyuniform texture. In fact, each of these areas canthen be analyzed separately and as a unit. In-tegration of texture information over an entireregion, as opposed to a small patch, yields de-scriptor vectors that are at the same time moreaccurate and fewer in number. In addition, ar-

�Research supported by ARPA grant DAAH04-94-G-0284.

eas of mixed texture content can be excludedfrom analysis, thereby keeping the space of in-dexes clean of spurious vectors.

The traditional approach to the determinationof uniform texture regions has been texture seg-mentation. In this approach, images are parti-tioned into regions of supposedly uniform tex-ture. The insistence on partitioning, that is onthe requirement to label each pixel in the im-age with exactly one label, although apparentlya desirable outcome, conceals in reality severaldi�culties.

First, many parts of many images do not con-tain any \texture," under any de�nition of thisterm. For instance, texture segmentation wouldrequire to tag each eye in a person's portrait asa single texture, or to partition it into separatetextures. This is of course possible in principle,if anything by assigning each pixel to a separateregion, but it is hardly meaningful.

Second, the exact location of texture boundariesis hard to determine, and is often not requiredby the application. Some times, two texturesblend into each other gradually, in which casethe notion of a boundary is not even appropri-ate. However, even if a boundary does exist,whether a pixel belongs to one or the other tex-ture is often a futile if not impossible questionto answer.

Third, explicitly identifying the extent ofuniform-texture regions requires complex pro-cedures like region growing, splitting, or merg-ing. However, it is often su�cient to know, forevery patch in the image, whether that patch issimilar in some sense to its neighbors. Comput-ing regions, that is, the transitive closure of thissimilarity relation, is expensive and not alwaysuseful.

Fourth, and most fundamentally, the very no-tion of a \uniform" texture is of dubious va-lidity. Textures are often random, and no two

Page 2: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

patches are exactly equal to each other. Inaddition, uniform textures in the world be-come nonuniform in the image because of di�er-ences in distance, foreshortening, shading, andlighting. Insisting on determining in all caseswhether two patches belong to the same tex-ture requires restrictive texture models and ex-pensive computation, and leads to questionableresults.

In contrast with segmentation, we propose in-stead to let descriptors of similar and adjacenttexture patches coalesce into tight clusters. Inthis approach, raw texture descriptors are �rstcomputed for a �xed set of overlapping imagepatches. If two adjacent patches contain similartextures, a smoothing process brings the corre-sponding descriptors even closer to each other.If the textures in two adjacent patches are verydi�erent, however, then their descriptor vectorsare modi�ed so that their distance is left un-changed or is even increased.

In other words, we propose to do edge-preserving smoothing, but in the space of tex-ture vectors rather than for image intensities.This requires generalizing the de�nition of gra-dient to a vector function, which we do by in-troducing a measure of texture contrast. Thanksto this de�nition, we can run the equivalent ofanisotropic di�usion processes [Perona and Ma-lik, 1990] to achieve edge-preserving smoothing.Running these processes in a space with manydimensions brings also e�ciency considerationsto the forefront, and we propose an e�cient ap-proximation to anisotropic di�usion. This isparticularly important in an application like re-trieval from image databases, where large num-bers of images must be processed when they areentered into the database, and where query im-ages must be processed in a short time to ensureinteractive behavior.

The \hard" decisions of texture segmentationare eliminated by our approach. The exact lo-cation of boundaries is unimportant, and yetboundary regions can be identi�ed with su�-cient precision that they can be tagged as \in-signi�cant" for texture analysis. Even if onlycoarsely localized, strong texture edges can befound by our method, even when intensity edgesare weak. Also, clusters of texture descrip-tor vectors can later be replaced by single vec-tors not by precisely labeling regions, but sim-ply by standard vector quantization techniques[Gersho and Gray, 1992]. Now, whether vec-tor clusters correspond exactly to image regionsbecomes irrelevant: clustering is done by opti-mizing information compression, rather than bysatisfying some semantically loaded criterion ofuniformity.

The next section introduces our version of rawtexture descriptors. These are standard Gabor�lters, placed on a log-polar grid in the fre-quency domain. Section 3 de�nes texture con-trast. Section 4 discusses our e�cient, edge-preserving smoothing algorithm, and section 5concludes with some �nal remarks and a lookat future work.

2 Texture Descriptors

A common way to describe texture is to com-pute the projection of the image intensity func-tion onto a basis of functions. This is referredto as a spectral decomposition, because each ofthe di�erent basis functions is usually concen-trated in a di�erent area of the frequency do-main. In such a representation, a texture isrepresented by a vector of values, each valuecorresponding to the energy in a speci�ed scaleand orientation subband. Spectral decomposi-tion methods include using quadrature �lters[Knutsson and Granlund, 1983], Gabor �lters[Farrokhnia and Jain, 1991] [Big�un and du Buf,1994], steerable �lters [Freeman and Adelson,1991] [Simoncelli et al., 1992] [Perona, 1991][Greenspan et al., 1994] [Shy and Perona, 1994][Heeger and Bergen, 1995], and the cortex trans-form [Watson, 1987]. For our experiments, weimplemented the Gabor �lters because of theirsimplicity.

2.1 Gabor Filters

Gabor functions [Gabor, 1946] are Gaussiansmodulated by complex sinusoids. A useful prop-erty of these functions is that they are max-imally compact in both space and frequency.In order to give the same emphasis to di�er-ent frequency octaves, and because natural tex-tures often have a linearly decreasing log powerspectrum [Field, 1987], we use what we call log-Gabor �lters. The centers of these �lters in thefrequency domain are equally spaced in a log-polar representation of the spectrum of an im-age. More speci�cally, if !r and !' are variablesalong the logarithm of the magnitude and alongthe phase of the 2D frequency plane, we use the�lters derived by Big�un and du Buf (1994) :

Gij(!r; !') = G(!r � !0ri ; !' � !0'j ) (1)

where

G(!r; !') = exp

�!2r2�2ri

!exp

�!2'2�2'j

!

with 1 � i � M and 1 � j � N . Here, Mis the number of scales and N is the number

Page 3: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

of orientation bands. The N orientations aretaken to be equidistant:

�'j = �=2N!0'j = 2�'j(j � 1)

for 1 � j � N . The M scales are obtainedby dividing the frequency range !max � !min

into the desired number of octaves: 2� + 4� +: : : + 2M� = !max � !min which yields � =(!max � !min)=2(2

M � 1) and

�ri = 2i�1�!0ri = !min + f1 + 3(2i�1 � 1)g�

for 1 � i � M . Figure 1 shows the frequencyresponse of the M � N �lter bank for M = 4,N = 4.

The responses of the Gabor �lters on an im-age are computed in the frequency domain bymultiplying the Fourier-transformed image (af-ter removing the DC component) with the �lterresponses in equation (1), and transforming theresults back to the space domain. Working inthe frequency domain avoids the image bound-ary problems which occur when convolving inthe spatial domain. Also, we only retain themagnitudes of the responses since these encodethe energy content and are independent of po-sition within the texture region.

Figure 1: Frequency responses for Gabor �ltersfor M = 4 and N = 4.

This �ltering stage returns MN response im-ages. The values from corresponding positionsin these images form MN -dimensional vectors,which are our raw texture descriptors. To re-duce the dependence of the responses on light-ing and in order to emphasize the texture struc-ture information, we normalize the vectors tobe unit vectors. This makes our texture spacea MN -dimensional hypersphere.

Figure 2 shows a mosaic of four textures fromthe Brodatz album [Brodatz, 1966]. Figure 3

shows the result of applying the Gabor �ltersfrom �gure 1 to this image. We down-sampledeach response image by a factor of four to savecomputations. As expected, di�erent texturesgive di�erent responses in the various scale andorientation subbands.

Figure 2: Mosaic of four textures from the Bro-datz album.

Figure 3: The result of applying the 16 Gabor�lters in �gure 1 to the mosaic in �gure 2.

3 Texture Contrast

The raw texture vectors introduced in the pre-vious section describe the local appearance ofsmall image neighborhoods. The size of theseneighborhoods is equal to that of the supportsof the coarser resolution �lters employed. Onthe one hand, these supports are large enoughthat they can straddle boundaries between dif-ferent textures. In this case, they do not de-scribe \pure" textures, and they can convey in-formation that is hard to interpret and can be

Page 4: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

misleading. On the other hand, the supportsare often too small for them to \see" enough ofa texture to yield a reliable description. In fact,the basic period of repetition of the underlyingtexture may be comparable to the �lter supportsizes, so that adjacent �lters see somewhat dif-ferent parts of the texture, and the descriptionvectors di�er somewhat. Also, textures exhibitvariations in the world, as well as variationscaused by nonuniform viewing parameters.

Before using texture vectors for indexing or in-terpretation, it is therefore preferable to siftand summarize the information that they con-vey. Speci�cally, it is desirable to eliminate vec-tors that describe mixtures of textures, and toaverage away some of the variations betweenadjacent descriptors of similar image patches.This poses a problem that is typical in theperceptual organization of images: small dif-ferences between attributes of adjacent areasshould be reduced, if not obliterated, and large,extended di�erences should be enhanced to pre-vent contamination between dissimilar areas.The techniques of edge preserving smoothing[Saint-Marc et al., 1991] and anisotropic di�u-sion [Perona and Malik, 1990] have been pro-posed to address this problem for gray scale im-ages. For texture, a notion of texture contrastis needed to tell \small" from \large" changes,in order to apply analogous techniques. Also,because of the length of texture vectors, e�-cient processing is crucial, especially when sev-eral thousand images are processed as is the casein image database applications.

For texture contrast, we employ a notion of\generalized gradient" from di�erential geom-etry [Kreyszing, 1959] that has been used forcolor images in the past few years [Di Zenzo,1986; Cumani, 1991; Sapiro and Ringach, 1994].The rest of this section discusses this notion asit applies to texture.

Assume that we have a mapping � : S � <n !<m. Let �i denote the ith component of�. If �is a texture vector space, for example, then �i,1 � i � m are the spectral decomposition sub-bands, where m = MN . If � admits a Taylorexpansion we can write

�(x+�x) = �(x)+�0(x)�x+k�xke(x;�x) :where ke(x;�x)k ! 0 as �x! 0 and �0(x) isthe m� n Jacobian matrix of �:

�0(x) = J(x) =

0BB@

@�1@x1

� � � @�1@xn

......

@�m@x1

� � � @�m@xn

1CCA :

If one starts at point x and moves by a smallstep �x, the distance traveled in the attribute

domain is approximately

d �= k�0(x)�xk =p�xTJTJ�x :

The step direction which maximizes d is theeigenvector of JTJ corresponding to its largesteigenvalue. The square root of the largest eigen-value, or, equivalently, the largest singular valueof �0, corresponds to the gradient magnitude,and the corresponding eigenvector is the gradi-ent direction.

We can give a closed form solution for the eigen-values in the case that n = 2, as it is for images.In this case, the di�erential of � is

d� =2X

i=1

@�

@xidxi :

and so

kd�k2 =2X

i=1

2Xj=1

@�

@xi� @�@xj

dxidxj :

Using the notation from Riemannian geometry[Kreyszing, 1959], we have

kd�k2 =2X

i=1

2Xj=1

gijdxidxj

=

�dx1dx2

�T �g11 g12g21 g22

� �dx1dx2

�:

where

gij �mXk=1

@�k@xi

@�k@xj :

and g12 = g21. Let now

�� =1

2

�g11 + g22 �

q(g11� g22)2 + 4g212

�:

be the two eigenvalues in the matrix G =�g11 g12g21 g22

�. Since G is real and symmetric,

its eigenvalues are real.

We choose �+ as the generalized gradient mag-nitude since it corresponds to the direction ofmaximum change. We can verify that this mag-nitude reduces to the ordinary gradient norm inthe case of a gray scale image (m = 1):

�+ =1

2

��2x +�

2y +

q(�2

x ��2y)2 + 4�2

x�2y

�=

1

2

��2x +�

2y + (�2

x +�2y)�

= �2x +�

2y = kr�k2

where the subscripts x and y denote di�erenti-ation.

Page 5: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

While the application of this de�nition to coloris relatively straight-forward, matters are morecomplicated for texture. In fact, color is a pointproperty of an image, while texture must be de-�ned over neighborhoods. Because of the mul-tiscale nature of texture in general, and of ourraw textured descriptors in particular, di�erent�lters have in general di�erent supports. Com-puting derivatives at di�erent scales, that is, fordi�erent values of !r in equation (1), requiresoperators of appropriate magnitudes and spa-tial supports in order to yield properly scaledcomponents. For instance, if di�erentiation isperformed by convolution with the derivativesof a Gaussian, the standard deviation of eachGaussian must be proportional to scale, and themagnitudes must be properly normalized.

Figure 4(a) shows the texture contrast of thetexture mosaic from �gure 2. The textureboundaries are characterized by high contrast.However, even inside uniform texture regionssome residue contrast appears. This is discussedin the next section.

4 Edge-Preserving Smoothing

Figure 4(a) shows some areas with relativelyhigh texture contrast even inside regions thatappear to be of the same texture. The sourcesof these contrast areas have been discussed atthe beginning of the previous section, whereit was pointed out that small, low-contrast ar-eas should be smoothed away, while extendedridges of the contrast function should be main-tained. The measure of texture contrast in-troduced above allows extending anisotropicdi�usion [Perona and Malik, 1990] or edge-preserving smoothing [Saint-Marc et al., 1991]techniques to texture descriptors. In this sec-tion, we present an e�cient implementation ofan edge-preserving smoothing algorithm, whichis based on repeatedly convolving each of thespectral subbands of the image with a sepa-rable binomial �lter weighted by a measure ofthe texture contrast. After describing the algo-rithm, we explore its relation with the methoddescribed in [Perona and Malik, 1990].

First the weighting function g(C) is introduced,where C is the texture contrast de�ned is sec-tion 3. The function g(C) should be high wherethe texture is uniform and low where the textureis \edgy" and can be any nonnegative monoton-ically decreasing function with g(0) = 1. Wechose to use

g(C) = e��jCjk

�2(2)

where k controls the decay rate of g(�) and, as

we will see later, determines which of the edgesis preserved.

Smoothing is performed by convolving the spec-tral subbands of the image with the binomial�lter

B = BTB ; B = [ 1 2 1 ] :

after it is weighted by g(C). We chose this �l-ter because it is separable and can be appliede�ciently. When applied repeatedly, the bino-mial �lter quickly gives an approximation of aGaussian. For brevity, introduce the followingnotation:

a �x b =1X

i=�1

a(i)b(x+ i; y)

a �y b =1X

j=�1

a(j)b(x; y+ j)

c � b =1X

i=�1

1Xj=�1

c(i; j)b(x+ i; y+ j) :

A single iteration of the smoothing procedure iscomputed as follows at time t:

1. Compute the texture contrast C(t) as de-scribed in section 3.

2. G(t) = g(C(t)).

3. For 0 � m < M and 0 � n < N do:

(a) Let I(t) be the m-th scale and the n-thorientation subband.

(b) Compute:

K(t) = B �x G(t)

J(t) =B �x (G(t)I(t))

K(t)

L(t) = B �y K(t)

I(t+1) =B �y (K(t)J(t))

L(t):

(This computation can be simpli�ed. For clar-ity, we bring it in the above form). It is easy tosee that

I(t+1) =B � (G(t)I(t))

B �G(t):

Without the B(i; j) factor, this is similar to theadaptive smoothing proposed in [Saint-Marc etal., 1991], but implemented, in addition, ina separable fashion for greater e�ciency. In[Saint-Marc et al., 1991], this iterative processis proven to be stable and to preserve edges. Asimilar proof can be applied to our case with

Page 6: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

straight-forward modi�cations. That edges arepreserved is proved by showing that when thecontrast is large enough (> k), it increases asthe iterations progress, thereby sharpening theedge. When the contrast is small (< k), thecontrast decreases and the edge is smoothed.This implies that k is equivalent to a contrastthreshold.

In the following, we show that the edge-preserving smoothing iteration is equivalent toan iteration of the anisotropic di�usion pro-posed by Perona and Malik (1990) . De�ne

c(t) =G(t)

B �G(t):

Then we have

I(t+1) = B � (c(t)I(t))and by using the fact that

B � c(t) = 1

we can write

I(t+1) � I(t) = B � [c(t)(I(t) � I(t)�)] (3)

where �(x; y) = 1 if x = y = 0 and 0 oth-erwise. Equation (3) is a discretization of theanisotropic di�usion equation

It = r � (c(x; y; t)rI) :Instead of using a 4-neighbor discretization ofthe Laplacian as done by Perona and Malik, weuse a better, 8-neighbor discretization [J�ahne,1995]:

1

4

"1 2 12 �12 21 2 1

#:

Figure 4 (b) shows the texture contrast after20 iterations. To visualize the MN -dimensionalvectors we project them onto the plane that isspanned by the two most signi�cant principalcomponents of all texture vectors in the image.Figures 4(c), (d) show the projections of thetexture descriptors before and after the edge-preserving smoothing. Only descriptors fromsigni�cant regions are shown in the latter. Aregion is signi�cant if the contrast after smooth-ing is everywhere smaller than the parameter kin equation (2). We can see that the descriptorsof the four textures form clear distinct clusterseven in this two-dimensional projection. Noticethe sparse trail of points that connects the twoleftmost and the two up-most clusters in �gure4(d). These points come from a \leakage" in theboundary between two textures. This impliesthat we are limited in the amount of smoothingwe can do before di�erent textures start to mix.This is also a situation where most segmentation

(a) (b)

−0.02 −0.015 −0.01 −0.005 0 0.005 0.01 0.015−0.025

−0.02

−0.015

−0.01

−0.005

0

0.005

0.01

0.015

0.02

−0.02 −0.015 −0.01 −0.005 0 0.005 0.01 0.015−0.015

−0.01

−0.005

0

0.005

0.01

0.015

(c) (d)

Figure 4: (a) Texture contrast of �gure 2 beforesmoothing (dark means large) and (b) after 20iterations. (c) Projection of the texture descrip-tors onto the plane of their two most signi�cantprincipal components before smoothing and (d)after 20 iterations.

algorithms would have to make a classi�cationdecision, even if the two textures involved arein fact similar to each other.

In �gure 5 we see a real image and the resultof edge-preserving smoothing after only �ve it-erations. We used four scales and four orien-tation bands. Notice the strong texture edgeseven where intensity edges are poorly de�ned orweak.

The same processing was performed on the im-age in �gure 6. Figure 7 shows the contrastmeasure and the signi�cant regions. Notice thatthere are many fewer edges than an intensity-edge detector would �nd, and yet regions ofdi�erent textures are well delineated. For in-stance, the picket fence is essentially found tobe one region, while and edge detector wouldhave delineated each picket. Also the walls ofthe buildings are now nearly edge-free, in spiteof substantial intensity contrast and even in thepresence of relatively large variations in the at-tributes of the textures within each region.

Some of the texture boundaries in both �gures5 and 7 appear as double, parallel lines. Thisdoubling is caused by the response of �lters thatare insensitive to both textures along the edge.While these �lters give essentially no responseto the regions adjacent to these edges, the edge

Page 7: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

(a)

(b)

Figure 5: (a) A real image. (b) Texture edgesafter 5 iterations of the smoothing.

itself creates a peak, which is then di�erentiatedby the contrast detector, thereby creating dou-ble ridges. We are studying ways to eliminatethis e�ect.

5 Conclusion

In this paper, we have proposed an alterna-tive to texture segmentation for grouping localtexture descriptors. Rather than partitioningthe image into regions with \the same" tex-ture, a poorly de�ned notion, we coalesce simi-

Figure 6: A picture of a lighthouse.

lar and adjacent texture descriptors into tightclusters by a process analogous to the edge-preserving smoothing procedures that have tra-ditionally been applied to image intensities or tocolor. The main advantage of this alternativeapproach is that hard decisions about bound-aries and \sameness" of textures need not bemade. At the same time, texture descriptors aregrouped in a way that allows compressing themby standard techniques like vector quantization.We have introduced a simple and powerful no-tion of texture contrast, and we have also de-�ned signi�cant regions as those where texturecontrast is low. Boundary areas and compli-cated regions with no texture to speak of areeliminated as insigni�cant for texture analysis.

In our current and future research, we are con-sidering variations to Gabor �lters for the com-putation of raw texture descriptors. In particu-lar, we are experimenting with steerable �lters[Freeman and Adelson, 1991; Perona, 1991] be-cause they can yield many di�erent orientationswith less computation. When steerable �ltersare computed at di�erent scales, they can beimplemented e�ciently in a pyramid-like fash-ion. Greenspan et al. (1994) derive an e�cientsteerable pyramid which gives similar �lters tothe Gabor �lters we are using.

Acknowledgments

We thank Brian Rogo� and Guillermo Sapirofor useful discussions.

Page 8: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

(a)

(b)

Figure 7: (a) Texture edges after ten iterationsof the smoothing. (b) Nonigni�cant regions areblackened out.

References

[Big�un and du Buf, 1994] J. Big�un and J. M.du Buf. N-folded symmetries by complex mo-ments in gabor space and their applicationto unsupervised texture segmentation. IEEETransactions on Pattern Analysis and Ma-chine Intelligence, PAMI-16(1):80{87, Jan-uary 1994.

[Brodatz, 1966] P. Brodatz. Textures: A Pho-tographic Album for Artists and Designers.Dover, New York, NY, 1966.

[Cumani, 1991] A. Cumani. Edge detection

in multispectral images. CVGIP: GraphicalModels and Image Processing, 53(1):40{51,1991.

[Di Zenzo, 1986] S. Di Zenzo. A note on thegradient of a multi-image. Computer VisionGraphics and Image Processing, 33:116{125,1986.

[Farrokhnia and Jain, 1991] F. Farrokhnia andA. K. Jain. A multi-channel �ltering ap-proach to texture segmentation. In Proceed-ings of the IEEE Computer Society Confer-ence on Computer Vision and Pattern Recog-nition (CVPR), pages 364{370, June 1991.

[Field, 1987] D. J. Field. Relations betweenthe statistics of natural images and the re-sponse properties of cortical cells. Journal ofthe Optical Society of America, 4:2379{2394,1987.

[Freeman and Adelson, 1991] W. T. Freemanand E. H. Adelson. The design and useof steerable �lters. IEEE Transactions onPattern Analysis and Machine Intelligence,PAMI-13(9):891{906, September 1991.

[Gabor, 1946] D. Gabor. Theory of communi-cation. The Journal of the Institute of Electri-cal Engineers, Part III, 93(21):429{457, Jan-uary 1946.

[Gersho and Gray, 1992] A. Gersho and R. M.Gray. Vector quantization and signal com-pression. Kluwer Academic Publishers,Boston, MA, 1992.

[Greenspan et al., 1994] H. Greenspan, S. Be-longie, R. Goodman, P. Perona, S. Rakshit,and C. H. Anderson. Overcomplete steer-able pyramid �lters and rotation invariance.In Proceedings of the IEEE Conference onComputer Vision and Pattern Recognition(CVPR), pages 222{228, June 1994.

[Guibas et al., 1995] L. Guibas, B. Rogo�, andC. Tomasi. Fixed-window image descriptorsfor image retrieval. In Proceedings of theSPIE Conference on Storage and Retrievalfor Image and Video Databases, pages 2420{31, San Jose, CA, February 1995.

[Heeger and Bergen, 1995] D. J. Heeger andJ. R. Bergen. Pyramid-based texture anal-ysis/synthesis. In Computer Graphics, pages229{238. ACM SIGGRAPH, 1995.

[J�ahne, 1995] B. J�ahne. Digital Image Process-ing. Springer, 1995.

[Knutsson and Granlund, 1983] H. Knutssonand G. H. Granlund. Texture analysis us-ing two-dimensional quadrature �lters. IEEEComputer Society Workshop on ComputerArchitecture for Pattern Analysis and ImageDatabase Management, 1983.

Page 9: Coalescing T exture Descriptors - Duke Universitytomasi/papers/rubner/rubnerIuw96.pdfe prop ose an e cien t ap-pro ximation to anisotropic di usion. This is particularly imp ortan

[Kreyszing, 1959] E. Kreyszing. Di�erentialGeometry. University of Toronto Press,Toronto, 1959.

[Perona and Malik, 1990] P. Peronaand J. Malik. Scale-space and edge detectionusing anisotropic di�usion. IEEE Transac-tions on Pattern Analysis and Machine In-telligence, PAMI-12(7):629{939, July 1990.

[Perona, 1991] P. Perona. Deformable kernelsfor early vision. In Proceedings of the IEEEConference on Computer Vision and PatternRecognition (CVPR), pages 222{227, June1991.

[Saint-Marc et al., 1991] P. Saint-Marc, J. S.Chen, and G. Medioni. Adaptive smoothing:A general tool for early vision. IEEE Trans-actions on Pattern Analysis and Machine In-telligence, PAMI-13(6):514{529, June 1991.

[Sapiro and Ringach, 1994] G. Sapiro andD. Ringach. Anisotropic di�usion in color

space. submitted, 1994.

[Shy and Perona, 1994] D. Shy and P. Perona.X-y separable pyramid steerable scalable ker-nels. In Proceedings of the IEEE Conferenceon Computer Vision and Pattern Recognition(CVPR), pages 237{244, June 1994.

[Simoncelli et al., 1992] E. P. Si-moncelli, W. T. Freeman, E. H. Adelson, andD. J. Heeger. Shiftable multiscale transforms.IEEE Transactions on Information Theory,IT-38:587{607, 1992.

[Tomasi and Guibas, 1994] C. Tomasi andL. Guibas. Image descriptions for browsingand retrieval. In Proceedings of the ARPAImage Understanding Workshop, pages 165{168, November 1994.

[Watson, 1987] A. B. Watson. The cortextransform: rapid computation of simulatedneural images. Computer Vision, Graphics,and Image Processing, 39:311{327, 1987.