An iterative gradient edge detection algorithm

COMPUTER GRAPHICS AND IMAGE PROCESSING 5, 245-253 (1976)

An Iterative Gradient Edge Detection Algorithm *

ROBERT B. EBERLEIN

Computer Science Center, University of Maryland, College Park, MD 20742

Communicated by A. Rosenfeld

Received July 15, 1975

Local gradients yield valuable directional information, which is useful in edge detection in digital pictures. Parallel iterative algorithms can be devised to process the edge vectors derived from the gradient. The resulting output consists of thinned edges which may be tracked very quickly. Examples of pictures processed by this method are given, and suggestions for further modifications are made.

1. EDGE DETECTION USING LOCAL GRADIENTS

Determining the boundaries of objects in gray level pictures is one of the major first steps in extracting the information in a picture. Differing average gray levels between the regions on opposite sides of a boundary may be detected by using a local gradient operator. Holmes et al. [1] made one of the first successful uses of an approximate digital gradient to detect edges, and since then many local gradient operators have been devised and used. Parallel gradient operators detect edge elements; it is necessary to connect these elements together to produce an edge.

In many cases only the magnitude of the digital gradient has been used. For the purpose of edge tracking by the connection of edge elements, the directional information is very useful. The directional information contained in a gradient vector is most conveniently represented by an edge vector, which is a gradient vector rotated by 90°. Bowker [2] defines edge vectors in this way, by the formula E = u × grad I, where E is the edge vector, u is a unit vector normal to the image plane, and grad I is the gradient of the intensity. Edge vector reduction is the key problem; a plane full of edge vectors at all points must be reduced to a sufficiently small number of actual edges.
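As a minimal sketch of the formula E = u × grad I, the following assumes that u is the unit vector along the positive z axis and that the axes are right-handed; with the opposite convention the rotation goes the other way. The function name `edge_vector` is introduced here for illustration only.

```python
def edge_vector(gx, gy):
    """Rotate the gradient (gx, gy) by 90 degrees: E = u x grad I.

    Assumes u = (0, 0, 1), so (0, 0, 1) x (gx, gy, 0) = (-gy, gx, 0).
    """
    return (-gy, gx)


if __name__ == "__main__":
    # A gradient pointing along +x gives an edge vector along +y.
    print(edge_vector(3, 0))
```

The edge vector has the same magnitude as the gradient and is perpendicular to it, so it lies along the edge rather than across it.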

2. PROCESSING OF EDGE VECTORS

The main difference between this paper’s approach and the approach of Bowker is in the nature of the operations which are applied to the initial edge vectors.

* The support of the U. S. Air Force Office of Scientific Research, under Contract F44620-72C-0062, is gratefully acknowledged, as is Ms. Shelly Rowe for her help in preparing this paper. The author also wishes to thank David L. Milgram for helpful suggestions.


Copyright © 1976 by Academic Press, Inc. All rights of reproduction in any form reserved.

Two simple parallel algorithms are given. Iteration of these algorithms can solve two characteristic difficulties of gradient edge detection:

(1) Gradients are detected in a region near an edge. The size of that region depends on the size of the local gradient and the extent of the edge. Edges are blurred and need sharpening.

(2) Noise may be present in the picture. Edges may need to be smoothed to remove noise-induced irregularities.

These seemingly contradictory goals can be achieved easily by smoothing along the edge and thinning at right angles to the edge. Difficulty (1) means that the presence of an edge in the input image gives rise to a number of parallel edge vectors in the output. Bowker has termed the process whereby a number of parallel vectors are reduced to a single vector "association" of edge vectors. The thinning algorithm about to be presented performs this task. Because of its simplicity, the thinning algorithm was very sensitive to noise, and difficulty (2) arose.

Irregular broad edges often reduced to several parallel edges with gaps between them, unless precautions were taken. A sufficient amount of smoothing in advance of thinning was the cure. Smoothing caused the removal of many local maxima in the vicinity of an edge. The thinning algorithm works by removing nonmaxima, not by direct suppression or deletion, but by absorbing them into the peak. The final output consists of an edge vector field in which all of the association has been done. A tracking algorithm need only perform connection of edge vectors, to use Bowker's terminology.

3. DESCRIPTION OF THE THINNING AND SMOOTHING ALGORITHMS

3.1. Thinning Algorithm

The thinning algorithm is a parallel algorithm which proceeds as follows: Each edge vector has a principal direction, determined by which of its X or Y components has the largest value. There are four directions possible. The edge vector has two neighbors at right angles to its principal direction, and these neighbors are candidates for association. The central edge vector is compared with each neighbor which has the same principal direction. The result of a comparison is that the central vector absorbs a portion α of a neighbor which is smaller. Nothing happens if the neighbor is larger, except that since the algorithm is parallel, the central vector will lose to the neighbor when we are centered on the neighbor. Note also that nothing happens if the principal directions of the two vectors differ.

This algorithm is iterated, and the result is survival of the fittest. At each step, the principal directions of all points remain the same. The original edge vector input can be viewed as having been partitioned into regions based on the principal directions. Within each region, thinning proceeds along lines of cross section to the principal direction. The thinning algorithm is essentially a one-dimensional algorithm which has been transplanted with a minimum of modification to a two-dimensional environment.

The result of a number of iterations is that all points which were local maxima along a cross-sectional line of a region of the same principal direction survive and increase in value, while all others go to zero. Of course, points with no neighbors within the region, along a cross-sectional line, stay the same. They are also local maxima. The total area of a peak is absorbed into the location of the highest point. Without smoothing, a ragged peak will decompose into a set of spikes, but a smoothly descending peak will become just a single spike.
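One parallel thinning step, as described in this section, might be sketched as follows. This is an illustration under stated assumptions rather than the original program: the field is held as two lists of rows (`ex`, `ey`), vectors are compared by the magnitude of their principal component, both components are transferred together, a tie between the X and Y components is resolved toward X, and a zero vector has no principal direction. All names are hypothetical.

```python
def principal(ex, ey):
    """Return (axis, sign) of the larger component, or None for a zero vector."""
    if ex == 0 and ey == 0:
        return None
    if abs(ex) >= abs(ey):          # tie broken toward X: an assumption
        return ("x", 1 if ex > 0 else -1)
    return ("y", 1 if ey > 0 else -1)


def thin_step(ex, ey, alpha=0.5):
    """One parallel thinning step over an edge vector field (lists of rows)."""
    h, w = len(ex), len(ex[0])
    new_ex = [row[:] for row in ex]
    new_ey = [row[:] for row in ey]
    for r in range(h):
        for c in range(w):
            p = principal(ex[r][c], ey[r][c])
            if p is None:
                continue
            # The two candidate neighbors lie at right angles to the principal direction.
            steps = ((-1, 0), (1, 0)) if p[0] == "x" else ((0, -1), (0, 1))
            mine = max(abs(ex[r][c]), abs(ey[r][c]))
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < h and 0 <= cc < w):
                    continue
                if principal(ex[rr][cc], ey[rr][cc]) != p:
                    continue        # different principal direction: nothing happens
                theirs = max(abs(ex[rr][cc]), abs(ey[rr][cc]))
                if theirs < mine:   # absorb a portion alpha of the smaller neighbor
                    new_ex[r][c] += alpha * ex[rr][cc]
                    new_ey[r][c] += alpha * ey[rr][cc]
                elif theirs > mine: # lose a portion alpha of itself to the larger one
                    new_ex[r][c] -= alpha * ex[r][c]
                    new_ey[r][c] -= alpha * ey[r][c]
    return new_ex, new_ey
```

All reads come from the old field and all writes go to a copy, which is what makes the step parallel: whatever a point absorbs from a neighbor is exactly what that neighbor gives up when the loop is centered on it.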

3.2. Smoothing Algorithm

While the thinning algorithm is essentially a one-dimensional algorithm adapted to two dimensions, the smoothing algorithm is inherently two-dimensional. Spreading neighborhoods are taken at each edge vector element, where the orientation of the neighborhood is parallel to the edge direction. The edge direction is quantized to the nearest 45°. For smoothing to distance 1, four types of neighborhoods are possible, as shown:

    x x x        x x      x   x      x x
      p        x p x      x p x      x p x
    x x x      x x        x   x        x x

The smoothing idea is to combine the central point with the neighbors, and there are two different approaches:

(1) To each neighborhood vector is added a portion of the central vector (diffusion).

(2) The central vector is replaced by the sum of itself and a weighted average of the neighbors (infusion).

Because the neighbor relation is not symmetric in this case, the two approaches above are clearly not the same. In either case, the smoothing algorithm is to be viewed as an averaging operation, which is not isotropic but depends on the direction of the edge vector at the central point. The result of successive iterations of the smoothing algorithm is further smoothing of the picture, but we do not have the same invariants, such as principal direction, as we have in the thinning case.
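The two smoothing variants can be sketched side by side. The sketch below rests on several assumptions that are not fixed by the text above: each neighborhood is taken to be the six cells lying within 45° of the edge direction on either side of the central point, the y axis points up while rows grow downward, cells outside the picture contribute nothing, a zero vector has no neighborhood, and the weights are the ones quoted in Section 4.2 (the average of the neighbors for infusion, one-sixth of the central vector for diffusion). The names `NEIGHBORHOODS`, `neighbors`, `infuse`, and `diffuse` are hypothetical.

```python
import math

# Offsets (d_row, d_col) of the six neighbors for each quantized edge direction.
NEIGHBORHOODS = {
    0: [(-1, 1), (0, 1), (1, 1), (-1, -1), (0, -1), (1, -1)],   # 0 / 180 degrees
    1: [(-1, 0), (-1, 1), (0, 1), (1, 0), (1, -1), (0, -1)],    # 45 / 225 degrees
    2: [(-1, -1), (-1, 0), (-1, 1), (1, -1), (1, 0), (1, 1)],   # 90 / 270 degrees
    3: [(-1, 0), (-1, -1), (0, -1), (1, 0), (1, 1), (0, 1)],    # 135 / 315 degrees
}


def neighbors(ex, ey, r, c):
    """In-bounds neighborhood cells of (r, c), oriented along its edge vector."""
    x, y = ex[r][c], ey[r][c]
    if x == 0 and y == 0:
        return []                      # no direction, so no neighborhood (assumption)
    k = round(math.degrees(math.atan2(y, x)) / 45.0) % 4
    h, w = len(ex), len(ex[0])
    return [(r + dr, c + dc) for dr, dc in NEIGHBORHOODS[k]
            if 0 <= r + dr < h and 0 <= c + dc < w]


def infuse(ex, ey):
    """Infusion: each vector gains the average of its six neighbors."""
    out_x = [row[:] for row in ex]
    out_y = [row[:] for row in ey]
    for r in range(len(ex)):
        for c in range(len(ex[0])):
            for rr, cc in neighbors(ex, ey, r, c):
                out_x[r][c] += ex[rr][cc] / 6.0   # off-picture cells count as zero
                out_y[r][c] += ey[rr][cc] / 6.0
    return out_x, out_y


def diffuse(ex, ey):
    """Diffusion: each vector adds one-sixth of itself to each of its neighbors."""
    out_x = [row[:] for row in ex]
    out_y = [row[:] for row in ey]
    for r in range(len(ex)):
        for c in range(len(ex[0])):
            for rr, cc in neighbors(ex, ey, r, c):
                out_x[rr][cc] += ex[r][c] / 6.0
                out_y[rr][cc] += ey[r][c] / 6.0
    return out_x, out_y
```

The asymmetry shows up on a single isolated vector: diffusion spreads it into its six neighbors, whereas infusion leaves the field unchanged, because the central point finds nothing to take in and the surrounding zero vectors have no neighborhoods of their own.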

3.3. Processing Sequence

There are three steps to the processing of a picture using these algorithms:

(1) The gradient of the picture is taken and separate, signed, X and Y component pictures are obtained. The gradient is transformed into edge vectors by a 90° rotation.

(2) The smoothing algorithm is applied iteratively as much as is desired, perhaps not at all.

(3) The thinning algorithm is applied iteratively until the process has converged as closely as desired.

The gradient that was used in this study was the linear least squares gradient for 3 × 3 neighborhoods, which can be represented by the masks:

    -1  0  1         1  1  1
    -1  0  1         0  0  0
    -1  0  1        -1 -1 -1

for the two components. The numbers shown represent the weights by which a 3 × 3 neighborhood is summed to produce a gradient value at the central point [3]. This gradient is less noise sensitive, but blurs more, than the 2 × 2 gradient used by Bowker.
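Applying the two 3 × 3 masks at a single interior point can be sketched as below. The picture is assumed to be a list of rows of gray levels, the sums are left unnormalized, and border points are not handled; `MASK_X`, `MASK_Y`, and `gradient_at` are names chosen for this illustration.

```python
MASK_X = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]
MASK_Y = [[1, 1, 1],
          [0, 0, 0],
          [-1, -1, -1]]


def gradient_at(img, r, c):
    """Weighted 3 x 3 sums at interior pixel (r, c); img is a list of rows."""
    gx = sum(MASK_X[i][j] * img[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    gy = sum(MASK_Y[i][j] * img[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    return gx, gy


if __name__ == "__main__":
    # A vertical step edge: gray level 0 on the left, 4 on the right.
    step = [[0, 0, 4, 4]] * 3
    print(gradient_at(step, 1, 1))
```

With the top row of the Y mask weighted positively, a picture that is brighter toward the top gives a positive Y component, which corresponds to a y axis pointing up.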

4. DISCUSSION OF THE ALGORITHMS

4.1. Thinning

The two algorithms differ in the extent to which precise quantitative statements may be made about their behavior. As it is basically a one-dimensional algorithm, the thinning algorithm is the easiest to make claims about. It has the following properties:

(1) The edge vector value converges to a limit at each point, as the number of iterations increases.

(2) All points which have a neighbor (in the right place and with the same principal direction) larger than themselves, have the limit value of zero. They will be absorbed by their neighbor, even if the neighbor is also being absorbed.

(3) All points strictly greater than both of their neighbors have a nonzero limit.

Let α be the absorption factor; we require that 0 < α ≤ 0.5. This is because if a point is diminished simultaneously by both of its neighbors, we wish to avoid having it change sign. The value 0.5 is a good choice for α because a dying point will go to 0 immediately if both its neighbors are larger than it is. Two vectors pointing in opposite directions have different principal directions, so that all edge vectors which are candidates for association with their neighbors may be considered to be positive.
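The bound on the absorption factor follows from a one-line calculation. As a worked check, write v > 0 for the value at a point both of whose neighbors are larger, and v' for its value after one step (v and v' are notation introduced here, not symbols from the original):

```latex
v' \;=\; v - \alpha v - \alpha v \;=\; (1 - 2\alpha)\,v ,
\qquad
v' \ge 0 \iff \alpha \le \tfrac{1}{2},
\qquad
v' = 0 \iff \alpha = \tfrac{1}{2}.
```

A point with only one larger neighbor keeps the fraction (1 − α) of its value at each step, so it decays geometrically rather than vanishing at once.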

The only case in which the future behavior at a point is not describable in terms of only the original point and its neighbors occurs when the point is tied with one or both neighbors. If tied with only one neighbor, the point will still disappear if the other neighbor is larger. Otherwise, the final result depends on which of the tied pair becomes dominant. Consider the following pathological example, where each line is a successive step, and the first line is the original situation. The absorption factor is 0.5 in this example.

    1      2      2    2    2    2    2    0
    1/2    5/2    2    2    2    2    2    0
    1/4    15/4   1    2    2    2    2    0
    1/8    35/8   0    5/2  2    2    2    0
    1/16   71/16  0    7/2  1    2    2    0
    ...

giving in the limit

    0      9/2    0    4    0    9/2  0    0

What was a very broad edge has been decomposed into parallel microedges.
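The pathological example can be replayed with exact rational arithmetic. The sketch below is a one-dimensional reading of the thinning rule, with every point treated as having the same principal direction; the function name `thin_1d` is an assumption of this illustration.

```python
from fractions import Fraction


def thin_1d(values, alpha=Fraction(1, 2)):
    """One parallel thinning step along a single cross-sectional line."""
    out = list(values)
    for i, v in enumerate(values):
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                if values[j] < v:
                    out[i] += alpha * values[j]   # absorb part of a smaller neighbor
                elif values[j] > v:
                    out[i] -= alpha * v           # lose part of itself to a larger one
    return out


row = [Fraction(n) for n in (1, 2, 2, 2, 2, 2, 2, 0)]
for _ in range(4):
    row = thin_1d(row)
    print(" ".join(str(v) for v in row))
```

Tied neighbors exchange nothing, so the plateau of 2s is eaten away from its ends two points at a time, which is what leaves several separate spikes instead of one. The total of the row (13 here) is unchanged by every step.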


The likelihood of a tie can be greatly reduced by a few smoothing steps. Without any smoothing, it will depend on the number of points which determine the gradient. The worst case would be a gradient based on single point differences of a coarsely quantized picture. In the pictures given in the examples below, the gradient was based on a nine-point neighborhood of a picture with 64 gray levels, so the percentage of ties was fairly small.

4.2. Smoothing

The analysis of the two smoothing algorithms is less exact, because the effects are more complicated. Even so, significant differences between the behavior of diffusion and infusion have been noted in both theory and practice. A first theoretical difference is the number of points which contribute to the new edge value at a point in each cycle. In the case of infusion, it is always true that the central point and precisely six neighbors are needed. In the diffusion case, however, only on the average do six neighboring points contribute; more or fewer may include the central point in their direction neighborhoods. Diffusion results could be expected to be more variable than infusion results for this reason.

Both forms of smoothing also lack any convergence behavior: the values do not converge, but grow continually. As the examples show, when different cycles are scaled for comparison, it can be seen that edges become progressively more blurred. Diffusion has the property that straight lines tend to be extended, which can be useful, or an annoyance, depending on the class of pictures. The two methods were compared under equivalent conditions of weights for the neighborhood points and the central point. In the infusion case, the central point had the average of its neighbors added to it at each stage, so that 50% of the new value was due to the neighbors. This meant that one-sixth of the central point should be added to the neighborhood points in the diffusion case. The results of applying the two forms of smoothing are compared in Section 5.

4.3. Nonmaxima Absorption

The thinning algorithm is based on the idea of nonmaxima absorption, rather than nonmaxima suppression. This choice reflects the origin of the edge vector values from the gradient, and a desire to retain some of the properties of the gradient. A gradient component is the rate of change of intensity in the given direction. A line integral of the gradient from point A to point B will give the total gray level difference between A and B, regardless of the path. Suppose that A is in the interior of a region of constant gray level, and B is in the interior of another such region which is adjacent to A's region. A boundary exists between the region of A and the region of B, and at the boundary the gradient is nonzero. Integrating the line integral of the gradient from point A to point B is the same as integrating the gradient across the boundary. Edges sharpened by nonmaxima absorption retain the line integral property. Nonmaxima suppression will not distinguish between sharp edges between regions of close gray levels, and blurred edges between regions of sizable intensity differences. A one-dimensional example follows. Suppose we have the two edges:

The corresponding 1-point gradient differences are

    0 0 0 0 1 2 1 0 0 0 0
    0 0 0 0 0 4 0 0 0 0 0

Both nonmaxima suppression and nonmaxima absorption will localize the edge at the same place in both examples. Absorption will give the same answer in both cases, whereas suppression will detect the second edge twice as strongly as the first edge. Discarding the nonmaxima represents loss of useful information.
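The comparison can be checked numerically on the two gradient profiles. In this sketch `suppress` keeps strict local maxima and zeroes everything else, and `absorb` iterates the one-dimensional thinning rule with an absorption factor of 0.5; both function names, the treatment of the ends as zero, and the fixed iteration count are assumptions made for the illustration.

```python
def suppress(g):
    """Nonmaxima suppression: keep strict local maxima, zero everything else."""
    out = []
    for i, v in enumerate(g):
        left = g[i - 1] if i > 0 else 0
        right = g[i + 1] if i + 1 < len(g) else 0
        out.append(v if v > left and v > right else 0)
    return out


def absorb(g, alpha=0.5, steps=50):
    """Nonmaxima absorption: iterate the 1-D thinning step a fixed number of times."""
    g = list(g)
    for _ in range(steps):
        nxt = list(g)
        for i, v in enumerate(g):
            for j in (i - 1, i + 1):
                if 0 <= j < len(g):
                    if g[j] < v:
                        nxt[i] += alpha * g[j]   # take in part of a smaller neighbor
                    elif g[j] > v:
                        nxt[i] -= alpha * v      # give up part of itself
        g = nxt
    return g


blurred = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0]
sharp = [0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0]

print(max(suppress(blurred)), max(suppress(sharp)))
print(round(max(absorb(blurred)), 6), round(max(absorb(sharp)), 6))
```

Suppression reports peak strengths of 2 and 4, while absorption brings both peaks to 4 (the blurred one in the limit), which is the total gray level difference across each edge.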

An interesting question can now be posed: Is it possible to iteratively perform thinning on a suitably chosen gradient of an input picture, sharpening the edges in the process, and then reconstruct a picture from the sharpened edges by means of integration, which corresponds to the original picture? If the answer is yes, then perhaps a useful method of image enhancement and deblurring has been found. A good gradient to use would be the simplest possible one, the difference between two neighboring points. If we leave this gradient field untouched, we can reconstruct the original picture (to within a constant) by integrating along any path between a point whose value has already been assigned, and a target point, filling in the picture as we go. Now it is only necessary to be able to find one path of integration to each point in the picture, after thinning has been performed. In my opinion this is possible, but the idea has not been pursued any further.
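In one dimension the reconstruction idea reduces to a running sum. The sketch below is only an illustration of that special case: `differences` and `integrate` are hypothetical names, the sample row is made up, and the thinned profile is written out by hand as the limit that absorption would reach.

```python
from itertools import accumulate


def differences(row):
    """1-point gradient: difference between neighboring points."""
    return [b - a for a, b in zip(row, row[1:])]


def integrate(diffs, start=0):
    """Rebuild a row by summing differences from an assigned starting value."""
    return list(accumulate(diffs, initial=start))


row = [3, 3, 4, 6, 7, 7]                  # a blurred edge rising from 3 to 7
grad = differences(row)                    # [0, 1, 2, 1, 0]
print(integrate(grad, start=row[0]))       # the untouched gradient restores the row

thinned = [0, 0, 4, 0, 0]                  # the peak after absorption, in the limit
print(integrate(thinned, start=row[0]))    # same total rise, now a sharp step
```

The starting value plays the role of the unknown constant. In two dimensions a path of integration to every point would also have to be found, which is the open part of the question.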

5. EXAMPLES

Figure 1 shows various steps in the processing of a picture of a finger. Except for the original picture, the output pictures show the absolute value of the largest component at each point, scaled so that the largest in the entire picture is 63 on a gray scale of 0-63. Figure 1a is the original picture, 1b is the gradient, 1c is after one step of smoothing, 1d after two steps, and 1e after three steps. Figure 1f is 1e after one thinning step, 1g after two steps, etc. Figure 1i is the result after a total of three smoothing steps followed by four thinning steps. The outline of the finger is quite clear, and a simple edge tracking program readily extracts a chain code of the boundary.

FIG. 1. Finger example. (a) Original picture. (b) Gradient of (a). (c-e) Results of smoothing (b) (three steps). (f-i) Results of thinning (e) (four steps).

FIG. 2. Chromosome example, without smoothing. (a) Original picture. (b-e) Results of thinning gradient of (a) (four steps).
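The display convention used for the output pictures (largest absolute component at each point, rescaled so that the picture maximum is 63) can be sketched as follows. Rounding to the nearest integer and the handling of an all-zero field are assumptions, and `to_display` is a hypothetical name.

```python
def to_display(ex, ey, top=63):
    """Map each vector to |largest component|, scaled so the picture maximum is `top`."""
    mag = [[max(abs(x), abs(y)) for x, y in zip(rx, ry)]
           for rx, ry in zip(ex, ey)]
    peak = max(max(row) for row in mag)
    if peak == 0:
        return [[0 for _ in row] for row in mag]   # nothing to scale
    return [[round(top * v / peak) for v in row] for row in mag]


if __name__ == "__main__":
    ex = [[0, -1], [2, 1]]
    ey = [[0, 0], [-1, 3]]
    print(to_display(ex, ey))
```

Because each picture is scaled by its own maximum, gray levels in different panels are comparable only in relative terms.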

Figures 2 and 3 illustrate the utility of smoothing prior to thinning. Figure 2a is a chromosome with a very blurred edge. Thinning applied directly to the gradient yields a very ragged outline of parallel microedges. Figures 2b-2e are the outputs after successive thinning steps. Figures 3a-3c show the result of applying three smoothing steps to the gradient obtained from Fig. 2a. Figures 3d-3g show the result of four thinning steps applied to Fig. 3c. Note how only a single contour is apparent around the chromosome.

Figures 4 and 5 compare the effects of the two alternative forms of smoothing. In Fig. 4, the first three steps (4b-4d) are smoothing of the gradient of Fig. 4a using infusion, which is the method also used in Figs. 1-3. Figure 4h is the final result after four thinning steps are applied to 4d. Figure 5 is the same as Fig. 4 in that three smoothing steps are followed by four thinning steps, but the smoothing is of the diffusion type. The pertinent comparison is between 4h and 5h. Compared to Fig. 4h, the edges produced by diffusion smoothing in Fig. 5h are much more ragged and full of sawteeth.

FIG. 3. Chromosome example, with smoothing. (a-c) Results of smoothing gradient of Fig. 2a (three steps). (d-g) Results of thinning (c) (four steps).

FIG. 4. Orchard example, smoothing by infusion. (a) Original picture. (b-d) Results of smoothing gradient of (a) by infusion (three steps). (e-h) Results of thinning (d) (four steps).

6. SUMMARY AND CONCLUSIONS

6.1. The Algorithms

The parallel algorithms described above are designed to fully utilize the directional information obtainable from the gradient of a digitized picture. Edge vectors which express this directional information are the quantities which are operated on throughout. The goal of the processing is to produce thinned edges which can be quickly tracked by an edge following routine. Although the algorithms are iterative and therefore somewhat inefficient, they are simple and parallel. In any event, they are useful preprocessing steps which can be applied to a wide range of pictures. Varying the amount of smoothing gives the ability to deal with differing resolution requirements.

6.2. Usage of the Output

A simple tracking program has already been applied to some of the outputs of Figs. 1-5. Connection of major edge vectors was rapid and accurate, and chain codes of moderate length were obtained. Boundary locations are thus easily derived from such output. Vertices are a different matter. The gradient tends to die out at a vertex, and hence the edge vectors do also. The detection of the close proximity of the ends of chains to other chains and chain ends is a reasonable way to detect a vertex.

FIG. 5. Orchard example, smoothing by diffusion. (a) Original picture. (b-d) Results of smoothing gradient of (a) by diffusion (three steps). (e-h) Results of thinning (d) (four steps).

6.3. Possible Refinements

Two different types of modifications to these algorithms are possible:

(1) modifications designed to reduce the computation time;

(2) modifications designed to produce better results.

In the first category would be a modification to avoid using floating point arithmetic. If these algorithms are to be implemented in hardware or on minicomputers, such a change would be useful. Use of 0.5 for α could replace a multiplication by a binary shift, for example.
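An integer-only thinning step with an absorption factor of 0.5 might look like the sketch below, where the multiplication is replaced by a right shift. It assumes non-negative integer values along a single cross-sectional line, and the name `thin_1d_int` is hypothetical.

```python
def thin_1d_int(values):
    """One thinning step with alpha = 1/2 on non-negative integers, using shifts."""
    out = list(values)
    for i, v in enumerate(values):
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                if values[j] < v:
                    out[i] += values[j] >> 1   # half of the smaller neighbor, truncated
                elif values[j] > v:
                    out[i] -= v >> 1           # the same truncated half leaves the loser
    return out


if __name__ == "__main__":
    print(thin_1d_int([16, 32, 24]))
```

The shift truncates odd values, so the amount given up by the loser and the amount gained by the winner still match and the total is conserved, but a residue of 1 never drains away. Scaling the input up by a power of two beforehand is one possible way to push that residue below the level of interest.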

Worth mentioning in category (2) are:

(a) Use of a hierarchy of gradients of different sizes. Bowker [2] does this in his approach.

(b) Modifying thinning to use the same kind of neighborhood as smoothing does.

(c) Special purpose changes designed to give better results for a particular class of pictures.

(d) Modifications with the idea of picture reconstruction in mind. At present, smoothing destroys analytical reconstructability.

As further experience is gained with the usage of these algorithms, it is expected that some useful refinements will be developed.

REFERENCES

1. W. S. Holmes et al., Design of a photo interpretation automaton, in Proceedings, Fall Joint Computer Conference, 1962, pp. 27-35.

2. J. K. Bowker, Edge vector image analysis, in Proceedings, Second International Joint Conference on Pattern Recognition, 1974, pp. 520-524.

3. J. M. S. Prewitt, Object enhancement and extraction, in Picture Processing and Psychopictorics (B. S. Lipkin and A. Rosenfeld, Eds.), pp. 75-149, Academic Press, New York, 1970.
