
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. PAMI-7, NO. 5, SEPTEMBER 1985

Correspondence

Relaxation Matching Techniques-A Comparison

KEITH E. PRICE

Abstract-Many different relaxation schemes have been proposed for image analysis tasks. We have developed a general matching procedure for comparing semantic network descriptions of images, and we have implemented a variety of relaxation techniques. An automatic segmentation and description system is used to produce the image representations, so that the matching procedures must cope with variations in feature values, missing objects, and possible multiple matches. This environment is used to test different relaxation matching schemes under a variety of conditions. The best performance (of those we compared), in terms of the number of iterations and the number of errors, is for the gradient-based optimization approach of Faugeras and Price. The related optimization approach of Hummel and Zucker performed almost as well, with differences primarily in difficult matches (i.e., where much of the evidence is against the match, for instance, poor segmentations). The product combination rule proposed by Peleg was extremely fast, indeed, too fast to work when global context is needed. The classical Rosenfeld, Hummel, and Zucker method is included for historical comparisons and performed only adequately, producing fewer correct matches and taking more iterations.

Index Terms-Image analysis, relaxation matching.

I. INTRODUCTION

Matching of images and descriptions has many different uses and can be performed at several different levels. Some matching tasks require that very precise corresponding locations be computed (e.g., stereo depth computation, pixel-level change detection). But for many tasks, matching at a higher level (i.e., finding regions of correspondence between two scenes) is best. This correspondence discusses results of using a variety of relaxation techniques in a general, symbolic-level image matching system. We apply this system to the problem of matching an image with an a priori description of the scene (a model), and to the problem of matching two different images to find the location of an object in two different views. Thus, we use this program to find correspondence between areas of the images (or object) rather than to find a pixel-level mapping between them.

Manuscript received February 7, 1984; revised March 28, 1985. Recommended for acceptance by R. Bajcsy. This work was supported by the Defense Advanced Research Projects Agency and monitored by the Air Force Wright Aeronautical Laboratories under Contract F33615-82-K-1786, DARPA Order 3119.

The author is with the Intelligent Systems Group, University of Southern California, Los Angeles, CA 90089.

II. BACKGROUND

The work reported here represents an extension of earlier relaxation-based symbolic matching efforts [1]. A variety of other image-matching techniques has been developed for different tasks. Moravec [2] has developed a system which locates feature points in one image (essentially corners) and uses a correlation-based matching procedure at multiple resolutions to efficiently find a set of corresponding points in the two images. This system is intended for land-based robot navigation, which uses the three-dimensional information from these feature points for navigation. A stereo system developed by Baker [3] generates a complete disparity map starting from edge correspondences. The disparities can be used for depth computations if the camera positions are known. These two (and many other similar efforts) concentrate on precise matching of image data to obtain three-dimensional descriptions.

Several systems which work on a variety of symbolic representations have also been developed. Barnard and Thompson [4] have developed a relaxation-based motion analysis program which finds corresponding feature points in two images. The feature points are similar to those of Moravec [2], but they are located in both images. Wang et al. [5] also use a relaxation procedure to match corners which are detected in pairs of images. This system allows arbitrary translations and rotations of the camera. Clark et al. [6] have developed a system to match linelike structures (generally either edges or region boundaries). The program uses three initial matching line pairs to get a mapping between the two images. The quality of the match depends on how well all the other lines match, and the best match is determined by trying all possible triples of matching lines. The number of possible triples is limited by the allowable transformation, i.e., given one match, the possible matches for the other two are very restricted. Gennery [7] extracts simple descriptions of objects and uses a tree-searching procedure to find the best match.

The primary relaxation procedure in this correspondence is developed more fully in [1] and [8], and differs from other methods in its gradient optimization approach. The alternative relaxation updating schemes used in the comparison are: the basic method of Rosenfeld et al. [9]; the product combination rule of Peleg [10]; and the optimization method of Hummel and Zucker [11].

III. SYMBOLIC DESCRIPTION

This matching system uses feature-based symbolic descriptions for its input. The description of an idealized version of the scene (a model) is developed by the user through an interactive procedure. The image descriptions are derived automatically from the input images. The underlying descriptive mechanism is a semantic network. The nodes of the network are the basic objects with associated feature values, and the links indicate the relations between objects.

The basic objects used in the image description are regions or linear features extracted by automatic segmentation procedures [12], [13]. These procedures produce a set of objects composed of connected regions, which are homogeneous with respect to some feature in the input image [12], and long, narrow objects which differ from the background on both sides and can be represented as a sequence of straight line segments [13].

Only the important objects are described in the model. The automatic image segmentation produces many objects not included in the model (as many as 100-300 elements). The model description determines the outcome of the matching procedure and can also be used to guide the segmentation procedure [14].





The description is completed by extracting features of the regions and linear objects. The features are those which can be easily computed from the data and which are reasonably consistent. These features include average values of the image parameters (intensity, colors, etc.), size, location, texture, and simple shape measures (length-to-width ratio, fraction of minimum bounding rectangle filled by the object, perimeter²/area, etc.). Relations included in the description are also those which are easily computed, such as adjacency, relative position (north of, east of, etc.), near by, and an explicit indication of not near by.
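The following sketch, in C, shows one way such a network element could be represented; the struct layout, field names, and the particular feature slots are illustrative assumptions rather than the system's actual data structures.

```c
/* Minimal sketch of a semantic-network element (a region or linear
 * feature) and its relation links.  Layout and names are illustrative
 * assumptions, not the system's actual representation. */

struct element;                          /* forward declaration */

enum relation_type {
    ADJACENT, NORTH_OF, EAST_OF, NEAR_BY, NOT_NEAR_BY
};

struct relation {
    enum relation_type type;
    struct element    *other;            /* the related element          */
    struct relation   *next;             /* next link in the list        */
};

struct element {
    int     id;
    double  intensity, size, row, col;   /* average image parameters, size, location */
    double  length_to_width;             /* simple shape measures:                   */
    double  mbr_fill;                    /*   fraction of bounding rectangle filled  */
    double  compactness;                 /*   perimeter^2 / area                     */
    struct relation *relations;          /* links of the semantic network            */
};
```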

IV. MATCHING

The basic goal for the matching procedure is to determine which elements in the image correspond to the given objects in the model. Most of the objects cannot be recognized by their feature values alone; they require contextual information to be correctly located. An important idea used by the matching system is to locate a small set of corresponding objects using feature values and available contextual information. These initial islands of confidence provide the context needed for finding correspondences for the less well-defined objects. Finally, when most objects are assigned, the matching can be done solely on the basis of context; thus, radical differences in a few objects do not cause the matching program to fail.

The basic operation of the matching system is outlined in Fig. 1. In the large outer loop, a set of possible matching regions is determined for every element in the model. Each of these possible assignments has a rating (probability) based on how well the model and image elements correspond. These ratings are refined by the relaxation procedure in the inner loop until one or more model elements have one highly likely assignment (usually a probability threshold of about 0.75 or 0.8). At this point, a firm assignment is made and all likely assignments are recomputed using these assigned elements in order to give the context for the match. The inner relaxation procedure updates the probabilities of the assignment based on how compatible the assignment is with the assignments of its neighbors in the graph (i.e., objects linked by relations). We use a variety of relaxation schemes [1], [8]-[11], [15] in this loop, with the criteria-optimizing method in [1] and [8] giving the best results.

The importance of this two-level procedure is clear when an analysis of relaxation updating is made. Relaxation can be viewed as moving around in a multidimensional space, searching for the global maximum of some function (such as overall match quality). But the search is constrained to find a local maximum near the initial assignment [11]. The reinitialization step moves the search from the vicinity of one local maximum to another, which should be as high or higher.
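A control-flow sketch of this two-level procedure is given below. The helper functions are empty stand-ins and the names and threshold constant are assumptions; the real system computes the likelihoods from feature and relation ratings and refines them with one of the relaxation schemes cited above.

```c
#include <stdbool.h>

/* Control-flow sketch of the two-level matching loop of Fig. 1.  The
 * helpers are empty stand-ins (assumptions), present only to make the
 * skeleton complete. */

#define THRESHOLD 0.8                       /* firm-assignment threshold (about 0.75-0.8) */

static void compute_likelihoods(void)       { /* rate all candidate assignments */ }
static void relaxation_iteration(void)      { /* one inner relaxation update    */ }
static bool assignment_over(double t)       { (void)t; return true; }
static void make_firm_assignments(double t) { (void)t; }
static bool all_assigned(void)              { return true; }

void match(void)
{
    compute_likelihoods();
    while (!all_assigned()) {
        /* inner loop: refine assignment probabilities by relaxation */
        while (!assignment_over(THRESHOLD))
            relaxation_iteration();
        /* outer loop: fix the confident matches, then recompute the
         * remaining likelihoods using them as context */
        make_firm_assignments(THRESHOLD);
        compute_likelihoods();
    }
}
```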

A. Matching Details

The quality of match between two elements (one each from the model and the image, or two from different images) is given by the weighted sum of the magnitude of the feature value differences

R(u, n) = \sum_{k=1}^{m} |V_{uk} - V_{nk}| \, W_k S_k        (1)

where u is an element from the model and n from the image, m is the number of features being considered, and V_{uk} (V_{nk}) is the value of the kth feature of element u (n). W_k is a normalization weight (the same for all tasks) to equalize the impact from all features. S_k is the task-dependent strength of a given feature. These strength values distinguish among important, average, and unimportant features. The ratio of the strength values is 5:1, and there is a fourth strength, zero, which indicates a feature is not used.

Fig. 1. Overview of the symbolic matching system. (Flowchart: compute initial likelihoods; update assignments using a relaxation scheme; if any assignment is over threshold, make firm assignments and recompute, otherwise continue updating.)

This rating function is converted to the range (0, 1) by

f(u, n) = \frac{a}{R(u, n) + a}        (2)

where a is a constant which controls how steep the difference function is. A value of 1 (a sharply declining function) produces the best results with the optimization updating approach. A larger value such as 10 should be used when using the product combination method [10]. Relations are treated like features in computing their contribution to the match rating: V_{uk} is the number of relations of type k specified in the model, and V_{nk} is the number that actually occur in the image. Fig. 2 illustrates how these values are computed for a given u_i. For each possible corresponding region n_k, check all u_j (in the model) related to u_i to see if the given correspondence n_l for u_j is properly related to n_k. When computing the initial probabilities of a match, only those u_j that have been previously assigned can be considered.

The relaxation procedures require a function that measures the compatibility of a particular assignment n_k for u_i with the current possible assignments at all neighboring (related) units. This is defined by

Q_i(n_k) = \sum_{u_j \in N_i} \frac{1}{|N_i|} \sum_{n_l \in W_j} c(u_i, n_k, u_j, n_l) \, p_j(n_l) + \alpha f(u_i, n_k) \, p_i(n_k)        (3)

where N_i is the set of objects related to u_i; |N_i| is the number of neighbors; \alpha is a factor between 0 and 1 that adjusts the relative importance of features versus relations (0.1 to 0.25 is the usual range); p_i(n_k) is the current probability for assigning name n_k to unit u_i; and W_j is the set of likely assignments of u_j (for efficiency and improved results we generally use only the most likely assignment here). c(u_i, n_k, u_j, n_l) is the same as f(u_i, n_k) except that only relations between u_i and u_j are considered. The vector Q_i is used directly in the updating step without normalization, which simplifies the computation. (This is a change from our original paper [1].)
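The fragment below sketches one way to evaluate (3) when, as in the system, each neighbor contributes only its single most likely assignment; the argument names and array layout are assumptions made for illustration.

```c
/* Sketch of the compatibility measure Q_i(n_k) of (3), using only the
 * most likely assignment of each neighbor.  Arguments (illustrative
 * assumptions, C99 variable-length array parameters):
 *   ncand      number of candidate assignments n_k for u_i
 *   nneigh     |N_i|, number of related model elements
 *   c[j][k]    c(u_i, n_k, u_j, n_l*): relation compatibility with the
 *              most likely assignment n_l* of neighbor u_j
 *   pbest[j]   p_j(n_l*), probability of that assignment
 *   f[k]       f(u_i, n_k), feature-based match rating
 *   p[k]       p_i(n_k), current assignment probabilities
 *   alpha      feature/relation weight (0.1 to 0.25 in practice)       */
void compatibility(int ncand, int nneigh, double q[ncand],
                   const double c[nneigh][ncand], const double pbest[nneigh],
                   const double f[ncand], const double p[ncand], double alpha)
{
    for (int k = 0; k < ncand; k++) {
        double rel = 0.0;
        for (int j = 0; j < nneigh; j++)
            rel += c[j][k] * pbest[j];
        if (nneigh > 0)
            rel /= nneigh;                   /* 1/|N_i| normalization */
        q[k] = rel + alpha * f[k] * p[k];
    }
}
```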

The iterative updating is given by

p_i^{(n+1)} = P_i \left[ p_i^{(n)} + \rho_n g_i^{(n)} \right]        (4)

where \rho_n is a positive step size to control the convergence speed, P_i is a linear projection operator to maintain the constraint on p_i^{(n+1)} that it is a probability vector, and g_i^{(n)} is an explicit gradient function determined by the optimization criteria [1], [8]:

g_i(n_k) = Q_i(n_k) + p_i(n_k) \, \alpha f(u_i, n_k) + \sum_{u_j : u_i \in N_j} \frac{1}{|N_j|} \sum_{n_l \in W_j} c(u_j, n_l, u_i, n_k) \, p_j(n_l)        (5)

In practice, this gradient is not computed for all possible assignments of u_i; only the current most likely assignment is computed. This is done to maintain symmetry with the computation of Q, where only the most likely assignment of the neighbor is used.
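A sketch of the updating step (4) follows. The clip-and-renormalize code is a simplified stand-in for the linear projection operator P_i, which is not spelled out in this correspondence, so the block should be read as an approximation rather than the method of [1], [8].

```c
/* Sketch of one gradient-style updating step, equation (4):
 *     p_i <- P_i( p_i + rho * g_i )
 * Clipping negatives and renormalizing is a simplified stand-in for the
 * paper's linear projection operator P_i onto probability vectors. */
void update_step(int ncand, double p[ncand], const double g[ncand], double rho)
{
    double sum = 0.0;
    for (int k = 0; k < ncand; k++) {
        p[k] += rho * g[k];                 /* take a step along the gradient   */
        if (p[k] < 0.0)
            p[k] = 0.0;                     /* keep the components nonnegative  */
        sum += p[k];
    }
    if (sum > 0.0)
        for (int k = 0; k < ncand; k++)
            p[k] /= sum;                    /* keep the components summing to 1 */
}
```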





Fig. 2. The use of relations in the computation of the compatibility (matching) function. (Diagram: a model element and its related model elements, with the possible corresponding image regions.)

Briefly, the gradient gives the direction of greatest change in the criteria, and the updating function takes a step in this direction. An alternative formulation is the one by Hummel and Zucker [11]. They optimize a different function and thus do not compute the gradient, but under the assumptions which apply here, the final updating step is the same except that Q_i is used rather than g_i.

If the gradient computation (5) were applied for all possible assignments of u_i, then the two methods would be essentially the same except for possible step-size differences. This observation (due to one of the reviewers of the paper) explains why early experiments, where the gradient was applied to all possible assignments, tended to converge slowly. The general description of the projection operator in [11] is not the same as used here, but when the same assumptions are made (regarding which values are 0, etc.) the result is the same. The original method of Rosenfeld et al. [9] uses the same computation method for the compatibility measure but has a different updating function:

p_i^{(n+1)}(n_k) = \frac{p_i^{(n)}(n_k) \, Q_i^{(n)}(n_k)}{\sum_{n_l} p_i^{(n)}(n_l) \, Q_i^{(n)}(n_l)}        (6)

The product method of Peleg [10] uses the updating function given in (6) but combines the compatibility values using a product rather than a sum; this changes the outer summation in (3) to a product. If this product-combining rule is used, then the constant a in (2) must be much larger (10 or 100), so that the match ratings are not all almost zero.
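For comparison, the sketch below shows the classical update (6) and a product-style combination of the neighbor contributions as described above; both are illustrative approximations rather than the exact formulations of [9] and [10], and they reuse the argument layout of the earlier sketches.

```c
/* Sketch of the updating rule (6) of Rosenfeld, Hummel, and Zucker:
 * weight each probability by its compatibility and renormalize. */
void rhz_update(int ncand, double p[ncand], const double q[ncand])
{
    double denom = 0.0;
    for (int k = 0; k < ncand; k++)
        denom += p[k] * q[k];
    if (denom <= 0.0)
        return;                              /* degenerate case: leave p unchanged */
    for (int k = 0; k < ncand; k++)
        p[k] = p[k] * q[k] / denom;
}

/* Peleg-style variant as described above: combine the neighbor
 * contributions by a product instead of the sum in (3). */
void product_compatibility(int ncand, int nneigh, double q[ncand],
                           const double c[nneigh][ncand],
                           const double pbest[nneigh])
{
    for (int k = 0; k < ncand; k++) {
        double prod = 1.0;
        for (int j = 0; j < nneigh; j++)
            prod *= c[j][k] * pbest[j];
        q[k] = prod;
    }
}
```

With the product combination, a single small neighbor contribution drives q[k] toward zero, which is exactly the fast but context-poor behavior reported in Section V.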

V. RESULTS

We have applied this system to a variety of images (generally, two views of each scene; see Figs. 3 and 4). For different views of the same scene, we use the same model. The results are presented as overlays for the original images, showing the border of regions or center lines of linear features. The labels are taken from the name given in the model, either the user-derived model or the image which serves as a model. Table I summarizes the results. The results given here reflect the use of a postmatch error-elimination heuristic which eliminates errors based on ambiguous matches.

Fig. 5 shows the results of matching the model to the two images of the San Francisco, CA, area (Fig. 3). The errors in the second view [Fig. 5(b)] are caused by the segmentation errors. The two sections of the Bay Bridge are missed by the linear feature-extraction program and thus cannot be matched correctly. In addition, the island adjacent to the bridges and both portions of the bay is mismatched. (Note that the two sections of the Bay were intended in the model description to be split by the bridges.) See Table I for a summary of the results.

Fig. 3. The high-altitude views of the San Francisco, CA, area with the results of the segmentations indicated on the image.

Fig. 4. The two low-altitude views of the Ft. Belvoir, VA, area.

Fig. 6 gives the results for a subwindow of the low-altitude aerial images (Fig. 4). The regions were extracted using a model-based segmentation technique [14]. Different objects are segmented poorly in the two views, but the matching still works well for both. An alternative segmentation produced similar matching results (using the same model), even with differences in some of the extracted regions.





TABLE I
MATCHING RESULTS SUMMARY

Each row gives the view used as labels and the view to be labelled; each method entry lists right-wrong matches, iterations (macro-micro), and time.

Model -> Fig. 4a (segmentation #1): Gradient 35-1, 14-21, 37:10; Hummel-Zucker 36-0, 20-35, 58:05; Product 31-2, 4-10, 3:27; Original 34-3, 7-13, 7:39
Model -> Fig. 4b (segmentation #1): Gradient 34-2, 16-37, 33:34; Hummel-Zucker 35-1, 14-34, 39:03; Product 33-0, 4-10, 3:10; Original 35-2, 6-12, 3:57
Model -> Fig. 4a (segmentation #2): Gradient 35-1, 11-18, 25:59; Hummel-Zucker 35-0, 18-28, 47:25; Product 34-1, 6-12, 5:03; Original 20-13, 7-15, 9:27
Model -> Fig. 4b (segmentation #2): Gradient 35-1, 12-22, 28:09; Hummel-Zucker 34-3, 14-35, 46:28; Product 32-0, 4-10, 2:58; Original 32-6, 7-13, 6:01
Model -> Fig. 4a (upper group of 14): Gradient 14-0, 9-27, 3:22; Hummel-Zucker 14-0, 10-29, 2:50; Product 13-0, 5-20, 1:31; Original 13-2, 6-11, 1:19
Model -> Fig. 3a: Gradient 15-0, 8-20, 17:26; Hummel-Zucker 10-0, 9-40, 25:58; Product 11-0, 5-19, 4:52; Original 10-0, 6-21, 7:11
Model -> Fig. 3b: Gradient 12-1, 8-19, 18:40; Hummel-Zucker 6-0, 10-44, 30:36; Product 7-4, 10-25, 9:52; Original 6-1, 4-18, 4:44
Fig. 4a -> Fig. 4b (both segmentation #1): Gradient 35-5, 19-56, 24:09; Hummel-Zucker 33-1, 30-128, 49:40; Product 23-0, 10-32, 10:38; Original 38-5, 12-33, 10:25
Fig. 4b -> Fig. 4a (both segmentation #1): Gradient 37-4, 13-31, 16:25; Hummel-Zucker 36-3, 30-139, 48:33; Product 1-0, 2-12, 3:06; Original 34-9, 12-29, 10:58
Combined (segmentation #1), right-wrong only: Gradient 34-0; Hummel-Zucker 31-0; Product 1-0; Original 33-1
Fig. 4a -> Fig. 4b (both segmentation #2): Gradient 38-7, 25-66, 37:52; Hummel-Zucker 31-3, 28-113, 1:08:21; Product 0-0, 1-6, ---; Original 38-4, 19-57, 27:28
Fig. 4b -> Fig. 4a (both segmentation #2): Gradient 40-7, 20-54, 32:39; Hummel-Zucker 21-10, 26-127, 1:16:49; Product ---; Original 35-9, 24-54, 29:50
Combined (segmentation #2), right-wrong only: Gradient 36-1; Hummel-Zucker 21-0; Product ---; Original 34-0

The two methods based on an optimization approach, [1] and [11], give more correct matches with fewer errors. The product combination rule [10] converges very rapidly (as expected), but does not make the "difficult" assignments, which require more iterations to incorporate the global context. The original method [9] was more uneven in its performance (some good, some poor). The product rule is the fastest: because of the product in the combinations, a few low match ratings quickly cause the overall rating of an assignment to approach zero. Because its number of iterations is greater, the original method takes longer than the product method. The two optimization approaches take even longer because of the increased number of iterations and the increased complexity of each step.




Fig. 5. The results of matching the same model to the two views shown in Fig. 3.


Fig. 6. The results of matching the same model to the two views of Fig. 4.

Interestingly, the simpler method of Hummel and Zucker requires more time and, usually, a greater number of iterations, because fewer potential matches are eliminated (forced to a probability of zero) on each iteration. This effort is greater in the early iterations, which require more computation. As more assignments are found, the differences between the two methods vanish.

Figs. 7-9 present the image-to-image matching process. In Fig. 7 the first view is used as the model, and in Fig. 8 the second is used. The image used as the model is the one on the left. Fig. 9 shows those pairs which occur in both cases. Table II gives the computed disparities for each of these 37 matched objects. The same general comments apply here as in the preceding paragraph. Note (see the summary in Table I) that the product-rule combination forces probabilities quickly to one or zero, so that global context may be missed.

VI. SUMMARY AND CONCLUSIONS

This correspondence compared the use of four different relaxation updating schemes in the same general matching system. The most consistent performance was the gradient-based optimization approach detailed in [1]. There is a higher cost in terms of the complexity of the necessary operations, as reflected in the computation time. The simpler method [11] performed almost as well, but the increased number of necessary iterations balanced the reduction in complexity. These results indicate that rapid convergence is not always an advantage, as when the relaxation converges before enough global context has been included.





Fig. 7. The results of matching the view in Fig. 4 (left) with Fig. 4 (right).


Fig. 9. The combination of the results in Fig. 7 and Fig. 8, keeping only those matches equivalent in both.

TABLE II
DISPARITIES FOR IMAGE-TO-IMAGE MATCHING

Fig. 8. Results of matching Fig. 4 (right) with Fig. 4 (left).

REFERENCES

[1] O. Faugeras and K. Price, "Semantic description of aerial images using stochastic labeling," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-3, pp. 638-642, Nov. 1981.

[2] H. Moravec, "Rover visual obstacle avoidance," in Proc. 7th Int. Joint Conf. Artif. Intell., Vancouver, B.C., Canada, Aug. 1981, pp. 785-790.

[3] H. Baker and T. Binford, "Depth from edge and intensity-based stereo," in Proc. 7th Int. Joint Conf. Artif. Intell., Vancouver, B.C., Canada, Aug. 1981, pp. 631-636.

[4] S. Barnard and W. Thompson, "Disparity analysis of images," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-2, pp. 333-340, July 1980.

(Table II lists, for each of the 37 matched regions, the cluster it belongs to (Top, Middle, Left, or Lower), its region number, and its row and column disparities.)

[5] C. Wang, H. Sun, S. Yada, and A. Rosenfeld, "Some experiments in relaxation image matching using corner features," Pattern Recogn., vol. 16, pp. 167-182, 1983.

[6] C. Clark et al., "Matching of natural terrain scenes," in Proc. 5th Int. Conf. Pattern Recogn., Miami, FL, Dec. 1980, pp. 217-222.




[7] D. Gennery, "A feature-based scene matcher," in Proc. 7th Int. Joint Conf. Artif. Intell., Vancouver, B.C., Canada, Aug. 1981, pp. 667-673.

[8] O. Faugeras and M. Berthod, "Improving consistency and reducing ambiguity in stochastic labeling: An optimization approach," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-3, pp. 412-424, July 1981.

[9] A. Rosenfeld, R. Hummel, and S. Zucker, "Scene labeling by relaxation operations," IEEE Trans. Syst. Man Cybern., vol. SMC-6, pp. 420-433, June 1976.

[10] S. Peleg, "A new probabilistic relaxation scheme," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-2, pp. 362-369, July 1980.

[11] R. Hummel and S. Zucker, "On the foundations of relaxation labeling processes," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-5, pp. 267-287, May 1983.

[12] R. Ohlander, K. Price, and R. Reddy, "Picture segmentation using a recursive region splitting method," Comput. Graph. Image Processing, vol. 8, pp. 313-333, 1978.

[13] R. Nevatia and K. Babu, "Linear feature extraction and description," Comput. Graph. Image Processing, vol. 13, pp. 257-269, 1980.

[14] K. Price and G. Medioni, "Segmentation using scene models," to be published.

[15] L. Kitchen, "Relaxation applied to matching quantitative relational structures," IEEE Trans. Syst. Man Cybern., vol. SMC-10, pp. 96-101, Feb. 1980.

[16] A. Hanson and E. Riseman, "VISIONS: A computer system for interpreting scenes," in Computer Vision Systems, A. Hanson and E. Riseman, Eds. New York: Academic, 1978, pp. 303-333.

Distributed Computing for Vision: Architecture and a Benchmark Test

PETER G. SELFRIDGE AND SCOTT MAHAKIAN

Abstract-Computer vision algorithms are notorious for their computational expense. Distributed vision, the use of more than one processor, can decrease computation costs and speed up algorithms. There are various ways to do this, ranging from parallelism at the sensor level to true multiprocessor systems. This correspondence first describes a system of the latter type: a system of microprocessors on a high-speed bus. A canonical vision task, locating a number of objects and measuring certain two-dimensional features of those objects, serves as a benchmark test for the system. An algorithm for this task is presented. Performance measures are compared from implementations on the distributed system, a Vax 11/750, and a Vax 11/780. Results indicate that three microprocessors outperform a Vax 11/780 at this task. Finally, other more interesting distributed algorithms are briefly discussed.

Index Terms-Computer vision, distributed systems, distributed vision, hierarchical algorithms, multiprocessor systems.

I. INTRODUCTION

Computer vision is computationally intensive for two reasons: the vast amount of image data involved, and the difficulty of the task itself. A 256 x 256 gray level image, using 8 bits per picture element or pixel, is over 65K bytes of image data, for example. In a real-time situation, one may need to examine several or many such frames per second. Similarly, it is very difficult to identify a variety of objects on a moving conveyor belt with variable illumination.

Manuscript received June 11, 1984; revised March 8, 1985. Recommended for acceptance by S. Tanimoto.

P. G. Selfridge is with the Department of Robotics Systems Research, AT&T Bell Laboratories, Holmdel, NJ 07733.

S. Mahakian is with View Engineering, Simi Valley, CA 93063.

Distributed systems offer some relief in data reduction and computation speed. Parallel architectures can be designed at the sensor level, or for specific kinds of low-level computation such as convolution. A two-dimensional array of simple processors can, for example, be configured so that each looks at part of the scene, usually one pixel each, and is connected to its immediate neighbors. Architectures of this kind allow various kinds of neighborhood operations, such as region growing, averaging, and thinning [1]. Pipelined processors can also perform low-level computations very quickly [2], and special purpose parallel hardware can perform computations, such as convolution, quickly [3].

Such low-level schemes may be considered more parallel than distributed in that they are single instruction, multiple data (SIMD) systems. At the other extreme are true multiprocessor systems, multiple instruction, multiple data (MIMD) systems. Here, a larger variety of parallel and distributed algorithms can be implemented. In this kind of scheme, several or many microprocessors are connected, often by a high-speed bus. Each microcomputer can either operate completely independently of all other computers, or function cooperatively by communicating over the bus. The only important constraint is that the computation to be done in parallel is enough to counteract the communications overhead. Various researchers have constructed or designed systems of this kind [4]-[7].

We are following the latter approach by using a multiprocessor system designed at AT&T Bell Labs [8] for a variety of research problems in robotics. This correspondence briefly describes this system and presents an algorithm for visually locating objects using a low-resolution image and a high-resolution image. We implemented and ran this algorithm on two VAX machines, implemented a parallel version on our distributed system, and measured and compared the performance. Finally, further work is described, including more intelligent distributed algorithms that will significantly increase our system's speed and utility.

II. A DISTRIBUTED VISION ARCHITECTURE

Our system (see Fig. 1) is built around a series of single-board computers that use the Motorola 68000 microprocessor. Each processor board has a 10 MHz clock rate and sits in a multibus chassis. This allows us to add other peripherals. Each processor is augmented with extra memory and a floating point board, built by SKY Inc. [9]. One multibus has a frame buffer system attached to it, an Imaging Technologies FB-512 [10], which includes a frame buffer memory, and a digitizing board connected to a standard vidicon camera.

The processors are connected with the S/Net, a high-speed bus developed as part of a research effort on advanced multiprocessors [8]. The S/Net provides low latency, bounded contention time, and high throughput, and further provides hardware support for low-level flow control and signaling. The current implementation supports up to 32 computers, with a switch throughput of 80 million bits per second and a kernel-to-kernel latency of 100 microseconds. A Vax serves as an overall controlling machine and file store.

trolling machine and file store.The available system software includes a cross compiler for

the C language, a downloader, and a distributed operating sys-tem. The operating system provides the ability to configure a

distributed system on several processors and provides the fol-lowing communication primitives. SendMessage (proc_num,array, length) sends an array of values to the specified pro-cessor. Poll (proc-num) returns "true" if the specified pro-cessor has sent a message. ReceiveMessage (proc_num, array,length) reads a message sent by the specified processor. The

