
Reconstructing 3D motion trajectories of particle swarms by global correspondence selection

Danping Zou 1, Qi Zhao 2, Hai Shan Wu 1, and Yan Qiu Chen 1

1 School of Computer Science
2 School of Information Science and Engineering
Fudan University, Shanghai, China

    {dpzou|zhaoqi|hswu|chenyq }@fudan.edu.cn

    Abstract

This paper addresses the problem of reconstructing the 3D motion trajectories of particle swarms using two temporally synchronized and geometrically calibrated cameras. The 3D trajectory reconstruction problem involves two challenging tasks - stereo matching and temporal tracking. Existing methods separate the two and process them one at a time sequentially, and suffer from frequent irresolvable ambiguities in stereo matching and in tracking.

We unify the two tasks, and propose a Global Correspondence Selection scheme to solve stereo matching and temporal tracking simultaneously. It treats the 3D trajectory acquisition problem as selecting appropriate stereo correspondences among all possible ones for each target by minimizing a cost function. Experimental results show that the proposed method has a significant performance advantage over existing approaches.

    1. Introduction

Phenomena that can be abstracted as particle swarms, such as insect swarms, bird flocks, and fish schools, occur prevalently in our environment. They have attracted significant attention from scientists in many disciplines [12, 10, 3]. The availability of the 3D motion trajectory of each individual in the swarm can greatly facilitate the study of their collective behavior. There has, however, been little research towards developing effective methods to achieve this purpose.

A feasible way to measure the 3D motion trajectories is to use multiple video cameras. The cameras capture these dynamic targets from different viewing directions, and record their positions projected onto the 2D image planes at each time step. The problem we need to solve is to retrieve the time-varying 3D locations of these targets from the video sequences. It involves two challenging

tasks, namely, stereo matching - establishing the stereo correspondences across views - and tracking - finding motion correspondences for the particles. Several approaches have been proposed to accomplish these tasks. They can be classified into two main groups.

In the first category, stereo correspondences are first established at each frame to reconstruct the 3D locations of the targets. The locations corresponding to the same target at different frames are then temporally associated to yield the final 3D trajectory [8, 11, 13]. However, stereo matching ambiguities frequently arise, especially when a binocular system is employed, as shown in Figure 1(a). Consequently, the final result would be greatly compromised by incorrect 3D locations caused by wrong matches.

Figure 1. Ambiguities in stereo matching and 2D tracking. (a) The point on the left image has multiple candidates on the right image satisfying the epipolar constraint. (b) During $t_3$ to $t_6$, the projections of two targets on the image plane become very close, making tracking difficult.

In the second category [4, 7, 5], targets are first tracked throughout the image sequence captured by each camera. The resultant 2D tracks are then matched using camera geometry to generate 3D trajectories. For such methods, the motion cue is used to disambiguate stereo matching. However, tracking of a large quantity of targets in an image sequence is frequently interfered with by occlusion (which occurs when image projections of several targets overlap) and interaction (which occurs when multiple targets move closely), as shown in Figure 1(b).

    In both strategies mentioned above, tracking and stereo


matching are artificially regarded as two independent successive stages. In fact, these two stages can facilitate each other. Motivated by this observation, we propose a unified method to disambiguate tracking and stereo matching simultaneously. We treat the reconstruction of 3D trajectories as selecting a sequence of stereo correspondences between the image projections of the targets throughout the entire time span. All the frames are taken into account, and the optimal trajectories are obtained by minimizing a cost function incorporating three cues, namely the epipolar constraint, motion coherence, and configuration observation match.

Experimental results on synthetic data demonstrate that the proposed method exhibits a significant advantage over existing approaches in the presence of occlusion and severe interaction. We also test the proposed method on a challenging real-world case - a few hundred fruit flies flying freely in an acrylic box - and successfully obtain their 3D trajectories, as the supplementary video shows.

    The major contribution of this paper is twofold:

• We present a global optimization framework that incorporates all available cues to solve tracking and stereo matching simultaneously for large swarms of indistinguishable particle-like objects.

• We are the first to obtain complete 3D motion trajectories of a swarm of hundreds of flying fruit flies, enabling biologists to conduct a thorough study of the collective behavior of fruit flies.

    2. Related work

Numerous approaches have been developed for multi-target tracking in image sequences, including MHT [14], JPDAF [1], Particle Filter [9], Greedy Assignment [17], and Track Graph [15]. Among them, Greedy Assignment (GOA) is regarded as the most efficient for dealing with a large number of targets, and it has been used to track a large swarm of flying bats [2]. In [18], GOA was adopted to extract 2D tracks of flying bees from image sequences. By virtue of the narrow baseline between cameras, the curvature information of these tracks was utilized to enhance the performance of stereo matching. However, all the above methods suffer from the inherent ambiguity of 2D tracking. Likewise, approaches through stereo matching followed by tracking would also fail in the presence of stereo matching ambiguities. A few methods incorporate tracking and stereo matching to generate more reliable results. Du et al. [4] segmented 2D tracks when interaction occurs, and then established correspondences among these segments. However, the common time span between segments could be too short to resolve stereo matching ambiguity. In addition, the resultant 3D trajectories are undesirably broken into many segments. Willneff et al. [19] proposed a spatio-temporal matching

algorithm using motion prediction to eliminate stereo matching ambiguities. Unfortunately, the algorithm would fail in the presence of stereo matching ambiguities, because generating reliable predictions highly depends on correct 3D locations in the previous frame. In addition, only several consecutive frames are taken into account, whereas our method processes the entire time span in a global manner.

    3. 3D trajectory reconstruction

Consider a 3D swarm of flying targets of similar appearance and small size. They are recorded by two geometrically calibrated and temporally synchronized cameras from different viewing directions. At regular time steps t = 1, 2, ..., T, the cameras capture these targets, producing image blobs with centers $M_t = \{m^{v,i}_t\}$, where $v \in \{1, 2\}$ indicates the v-th view and $i \in \{1, 2, \ldots, N^v_t\}$ labels the i-th blob in the v-th view.

To recover the 3D trajectories of the targets, we need to establish stereo correspondences between blobs of different views over time. We regard each pairing of blobs as a potential stereo correspondence. The set of all possible pairings at time step t is given by $H_t = \{m^{1,i}_t\} \times \{m^{2,j}_t\}$. A pairing could be either a true or a false correspondence. Our goal is to correctly assign the true pairings to each target. Let $S_t = (s^1_t, s^2_t, \ldots, s^N_t)$ be a configuration of pairings assigned to the targets at time step t, where $s^n_t$ represents the pairing chosen for target n. Our mission is then to find a sequence of configurations $S_{1:T} = (S_1, \ldots, S_T)$ that best explains the image blobs recorded during the capturing process. Thereafter, the 3D motion trajectory of each target can be reconstructed from its respective sequence of stereo correspondences $s^n_{1:T} = (s^n_1, \ldots, s^n_T)$ through triangulation.
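Each selected pairing is converted into a 3D point by triangulation. The paper does not specify the triangulation routine; the following sketch uses the standard linear (DLT) method and assumes the two 3x4 projection matrices of the calibrated cameras are available (the names P1, P2 and the blob centers are illustrative).

```python
import numpy as np

def triangulate(P1, P2, m1, m2):
    """Linear (DLT) triangulation of one pairing.

    P1, P2 : 3x4 camera projection matrices of the two views.
    m1, m2 : (x, y) blob centers of the pairing in view 1 and view 2.
    Returns the triangulated 3D point as a length-3 array.
    """
    A = np.vstack([
        m1[0] * P1[2] - P1[0],
        m1[1] * P1[2] - P1[1],
        m2[0] * P2[2] - P2[0],
        m2[1] * P2[2] - P2[1],
    ])
    # Homogeneous solution = right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```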

    4. Global correspondence selection

We propose a Global Correspondence Selection (GCS) scheme to solve the problem via finding an optimum configuration sequence $S^*_{1:T}$ that minimizes a cost function $f(S_{1:T})$, that is:

$$S^*_{1:T} = \arg\min_{S_{1:T}} f(S_{1:T}). \quad (1)$$

The cost function $f(\cdot)$ is designed to incorporate three cues - epipolar constraint, kinetic coherency, and configuration observation match. The details are discussed in the following sections.

    4.1. Epipolar constraint

If the blobs in different views correspond to the same 3D target, they should satisfy the epipolar constraint. This is an important cue for judging how likely a pairing of blobs is to be a true stereo correspondence. Given a pairing, the cost of


Figure 2. The average distance is defined as $\epsilon_e(s^n_t) = (d_1 + d_2)/2$.

its blobs corresponding to the same target is defined as

$$f_e(s^n_t) = \epsilon_e(s^n_t), \quad (2)$$

where $\epsilon_e(\cdot)$ represents the average distance between the blob centroids and their respective epipolar lines, as shown in Figure 2. To rule out apparently false pairings whose blobs are far from the respective epipolar lines, we set $f_e(s^n_t) = \infty$ if $\epsilon_e(s^n_t)$ is greater than a threshold $\tau_e$. For all pairings selected at time t, the total cost of their being true correspondences is obtained by

$$f_E(S_t) = \sum_{n=1}^{N} f_e(s^n_t). \quad (3)$$
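As an illustration of the cost in Eqs. (2)-(3), a minimal sketch follows, assuming the fundamental matrix F between the two views is known; the function name and argument layout are assumptions, not the paper's implementation.

```python
import numpy as np

def epipolar_cost(F, m1, m2, tau_e):
    """Average point-to-epipolar-line distance for one pairing (Eq. 2).

    F      : 3x3 fundamental matrix mapping view-1 points to view-2 epipolar lines.
    m1, m2 : (x, y) blob centers in view 1 and view 2.
    tau_e  : threshold above which the pairing is ruled out (cost = infinity).
    """
    p1 = np.array([m1[0], m1[1], 1.0])
    p2 = np.array([m2[0], m2[1], 1.0])
    l2 = F @ p1           # epipolar line of m1 in view 2: ax + by + c = 0
    l1 = F.T @ p2         # epipolar line of m2 in view 1
    d2 = abs(p2 @ l2) / np.hypot(l2[0], l2[1])
    d1 = abs(p1 @ l1) / np.hypot(l1[0], l1[1])
    cost = 0.5 * (d1 + d2)
    return cost if cost <= tau_e else np.inf
```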

    4.2. Kinetic coherency

Each pairing gives rise to a 3D location through triangulation. A genuine pairing sequence should generate 3D locations that are feasible as a trajectory traveled by a physical object. To evaluate the likelihood of being a trajectory, a kinetic model is used. At each time instance, the kinetic model gives a prediction of the motion state of the target based on previous states. A trajectory is considered more likely to be traveled by a target if the overall prediction error is smaller.

This is an important cue for selecting an appropriate pairing sequence for a target. Consider the sequence of pairings $s^n_{1:T} = (s^n_1, \ldots, s^n_T)$ assigned to target n and their corresponding 3D locations denoted by $x_1, \ldots, x_T$. The prediction inaccuracy at time t is defined as

$$f_k(s^n_{t-K}, \ldots, s^n_t) = \epsilon_k(\hat{x}_t, x_t). \quad (4)$$

The predicted 3D location $\hat{x}_t$ is computed from the K previous states using the kinetic model, namely $\hat{x}_t = P(x_{t-K}, \ldots, x_{t-1})$. The function $\epsilon_k(\cdot, \cdot)$ is the Euclidean distance between two 3D locations.

Selection of the kinetic model relies on prior knowledge of the motion of the subjects. We adopt here a simple and yet powerful kinetic model - the nearest-neighbor model [17], i.e., K = 1 and $P(x_{t-1}) = x_{t-1}$. Although the nearest-neighbor model does not accurately describe the motion in most cases, it has proved to be an effective model adopted in many tracking applications, particularly when the motion is complex and hard to formulate, such as wandering people and drifting insects. To reduce the number of candidate trajectories, we assign a threshold $\tau_k$ to remove impossible candidates by setting $f_k(s^n_{t-1}, s^n_t) = \infty$ when $\epsilon_k(x_{t-1}, x_t) > \tau_k$. By summing up the prediction errors of all targets, we get the total kinetic cost of successive pairings:

$$f_K(S_{t-1}, S_t) = \sum_{n=1}^{N} f_k(s^n_{t-1}, s^n_t). \quad (5)$$
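A minimal sketch of the nearest-neighbor kinetic cost in Eqs. (4)-(5) is given below; the triangulated 3D locations and the threshold value are assumed inputs, and the helper names are illustrative.

```python
import numpy as np

def kinetic_cost(x_prev, x_curr, tau_k):
    """Nearest-neighbor kinetic coherency cost for two successive pairings.

    With K = 1 the prediction is simply the previous 3D location, so the
    cost is the Euclidean prediction error, set to infinity beyond tau_k.
    """
    err = np.linalg.norm(np.asarray(x_curr) - np.asarray(x_prev))
    return err if err <= tau_k else np.inf

def total_kinetic_cost(X_prev, X_curr, tau_k):
    """Sum of kinetic costs over all targets (Eq. 5)."""
    return sum(kinetic_cost(xp, xc, tau_k) for xp, xc in zip(X_prev, X_curr))
```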

4.3. Configuration observation match

Another cue comes from measuring the level of match between configuration and observation. We assume that: 1) the cameras are well placed so that they can capture most targets simultaneously; 2) at each time step, the proportion of overlapping image blobs on the 2D image plane is relatively small. The first assumption holds in most cases; the second one is also reasonable when the number of flying targets is moderate - hundreds to thousands. The two assumptions statistically imply that, in each view, one blob corresponds to only one particular target and vice versa.

Having noticed the tendency of a one-to-one mapping between blobs and targets, we propose a cost to evaluate how well a given configuration $S_t$ accounts for the observed blobs $M_t$:

$$f_C(S_t, M_t) = \frac{1}{N^1_t}\sum_{i=1}^{N^1_t} |n_c(m^{1,i}_t, S_t) - 1| + \frac{1}{N^2_t}\sum_{i=1}^{N^2_t} |n_c(m^{2,i}_t, S_t) - 1|, \quad (6)$$

where $m^{v,i}_t$, $N^v_t$ denote the blob centers and their total number in the v-th view, and $n_c(\cdot, \cdot)$ represents the number of targets to which the blob is mapped with regard to the current configuration of assignments. We use a threshold $\tau_c$ to prevent the configurations from mapping a blob to too many targets. That is, if $n_c(\cdot, \cdot) > \tau_c$, the match cost will be set to infinity, $f_C(S_t, M_t) = \infty$. Figure 3 shows the costs of different configurations.
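The following sketch illustrates Eq. (6) for a configuration represented as a list of (view-1 blob index, view-2 blob index) pairings; this data layout is an assumption for illustration, not the paper's implementation.

```python
from collections import Counter
import math

def config_match_cost(config, n_blobs_view1, n_blobs_view2, tau_c):
    """Configuration-observation match cost of Eq. (6).

    config : list of pairings (i, j), one per target, where i / j are blob
             indices in view 1 / view 2 (None marks an absent target).
    """
    count1 = Counter(i for i, _ in config if i is not None)
    count2 = Counter(j for _, j in config if j is not None)
    # A blob mapped to more than tau_c targets makes the cost infinite.
    if any(c > tau_c for c in count1.values()) or any(c > tau_c for c in count2.values()):
        return math.inf
    cost1 = sum(abs(count1.get(i, 0) - 1) for i in range(n_blobs_view1)) / n_blobs_view1
    cost2 = sum(abs(count2.get(j, 0) - 1) for j in range(n_blobs_view2)) / n_blobs_view2
    return cost1 + cost2
```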

    5. Cost function and optimization method

Combining the above-discussed three terms additively, we obtain the overall cost function:

$$f(S_{1:T}) = \alpha\sum_{t=1}^{T} f_E(S_t) + \beta\sum_{t=1}^{T} f_C(S_t, M_t) + \gamma\sum_{t=2}^{T} f_K(S_{t-1}, S_t), \quad (7)$$


Figure 3. $h^1_t$, $h^2_t$, $h^3_t$, and $h^4_t$ are four possible pairings. The configuration in (b), $S_t = (h^1_t, h^2_t)$, is more desirable than that in (a), $S_t = (h^1_t, h^3_t)$. In (a), blob a relates to two targets while blob b has no related target; in (b), a one-to-one mapping between blobs and targets is established in each view.

where $\alpha$, $\beta$, $\gamma$ are the weights of the terms. The cost function can be recursively decomposed into

$$f(S_{1:t}) = f(S_{1:t-1}) + \Delta f(S_t), \quad (8)$$

where the increment of cost $\Delta f(S_t)$ is

$$\Delta f(S_t) = \alpha f_E(S_t) + \beta f_C(S_t, M_t) + \gamma f_K(S_{t-1}, S_t). \quad (9)$$

The cost function can be optimized by dynamic programming. It is accomplished in two stages: 1) proposing possible configurations for the next frame, 2) propagating the cumulative minimum cost from previous configurations to each newly proposed one.

Since the number of possible configurations increases exponentially with the number of targets, we use Gibbs sampling [6] to obtain only those configurations with low costs.

The initial configurations at the first frame can be sampled from¹

$$S_1 \sim P[f(S_1)], \quad (10)$$

where $P(\cdot)$ denotes a probability function that is decreasing on the positive interval, such that configurations with low costs are sampled with high probabilities. Here, we let $P(u) \propto \exp(-u)$. For each sample at the t-th frame, $S^{(i)}_t$, the configurations of the next frame can be proposed by sampling

$$S_{t+1} \sim P[\Delta f(S_{t+1})] \quad (S_t = S^{(i)}_t). \quad (11)$$

The cumulative minimum cost of a newly sampled configuration $S^{(j)}_{t+1}$, denoted by $f(S^{(j)}_{t+1})$, can be computed from

$$f(S^{(j)}_{t+1}) = \min_i \left\{ f(S^{(i)}_t) + \Delta f(S^{(j)}_{t+1}) \right\}. \quad (12)$$

The number of samples at each frame is prevented from becoming too large by removing those samples with large costs.

¹At this initial stage, samples with the same assignments but in different orders, e.g. $S^{(1)}_1 = (h_1, h_2)$ and $S^{(2)}_1 = (h_2, h_1)$, should be treated as the same sample.

Otherwise it would increase exponentially with the number of frames processed. Configuration sampling and cost propagation are performed frame by frame and finally produce the optimum result, as shown in Figure 4. We can see that through optimization, ambiguities in both stereo matching and tracking can be resolved, as shown in Figure 5.

Figure 4. Optimization is done by dynamic programming - sampling new configurations at the next frame and propagating cumulative minimum costs to them until reaching the last frame. Finally the optimum configuration sequence $S^*_{1:T}$ is obtained by tracing back.
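To make the propagation of Eq. (12) concrete, here is a minimal dynamic-programming sketch; the routines that produce the per-frame configuration samples (e.g. by Gibbs sampling) and the cost increment of Eq. (9) are assumed to be supplied by the caller, and all names are illustrative.

```python
def propagate(samples_per_frame, delta_cost):
    """Frame-by-frame propagation of cumulative minimum costs (Eq. 12).

    samples_per_frame : list over frames; each entry is a list of sampled
                        configurations S_t.
    delta_cost(S_prev, S_curr) : cost increment of Eq. (9); S_prev is None at
                                 the first frame, where only the epipolar and
                                 configuration-match terms apply.
    Returns the minimum-cost configuration sequence, obtained by tracing back.
    """
    costs = [[delta_cost(None, s) for s in samples_per_frame[0]]]
    back = [[None] * len(samples_per_frame[0])]
    for t in range(1, len(samples_per_frame)):
        prev, curr = samples_per_frame[t - 1], samples_per_frame[t]
        frame_costs, frame_back = [], []
        for s in curr:
            # Cheapest predecessor for this newly sampled configuration.
            best = min(range(len(prev)),
                       key=lambda i: costs[-1][i] + delta_cost(prev[i], s))
            frame_costs.append(costs[-1][best] + delta_cost(prev[best], s))
            frame_back.append(best)
        costs.append(frame_costs)
        back.append(frame_back)
    # Trace back from the cheapest configuration at the last frame.
    j = min(range(len(costs[-1])), key=lambda i: costs[-1][i])
    path = [samples_per_frame[-1][j]]
    for t in range(len(samples_per_frame) - 1, 0, -1):
        j = back[t][j]
        path.append(samples_per_frame[t - 1][j])
    return list(reversed(path))
```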

    6. Variable number of targets

We have so far considered the case where the number of targets is constant. In practice, however, the number of targets is variable, since targets might enter and leave the scene. We introduce an additional variable O to represent the absent state of a target. By setting the dimension of the configuration large enough, handling a variable number of targets is similar to handling a fixed one. The difference is that, at each frame, a dummy pairing O can be assigned to a target, indicating that this target is not in the scene.

By introducing the additional pairing O, the two costs of epipolar constraint and kinetic coherency need to be modified. We redefine the two costs based on two rules: 1) the targets are encouraged to remain absent from the scene with zero cost; 2) a target tends to switch its visibility state only when it is impossible to find two pairings at adjacent video frames that can be assigned to this target with a finite cost of kinetic coherency. The first rule indicates $f_e(O) = 0$ and $f_k(O, O) = 0$, which ensures that whether a target is absent or not mainly relies on the configuration observation match.

The second rule suggests that the cost of visibility switching, $f_k(h, O)$ (or $f_k(O, h)$), denoted by $\eta$, should be properly set so that if there is a pairing $h'$ with $f_k(h, h') < \tau_k$, we have $\gamma f_k(h, h') + \alpha f_e(h') < \gamma f_k(h, O) + \alpha f_e(O)$. This yields $\eta > (\alpha/\gamma)\tau_e + \tau_k$.

7. Efficient implementation

The above-discussed optimization problem can be better understood by using a graph representation. We construct a directed graph, as shown in Figure 5(c), where a node


Figure 5. Through optimization of the cost function, ambiguities in both stereo matching and tracking are eliminated. (a) Imaged targets (small circles) are associated with possible corresponding targets at the next frame by directed links; ambiguities exist in both tracking and stereo matching, e.g., target a has two temporal correspondences at the next frame and three possible stereo matches in the right view. (b) The ambiguities are removed after optimization. (c) Graph representation of pairings: two pairings are connected by an edge weighted by the cost of kinetic coherency in (4). (d) Pairing paths are extracted for each of the three targets (marked by red, green and blue respectively) through optimization.

is a pairing satisfying the epipolar constraint; an edge connects two nodes and carries the cost of kinetic coherency at successive frames. Acquiring 3D trajectories is equivalent to finding N optimal paths in this graph. The number of targets, N, can be set with regard to the maximum number of detected blobs in the video streams. If N is small, the optimal paths can be obtained by direct optimization. But when the target number is large, direct optimization is infeasible, because the computational complexity is still too high in spite of employing sampling techniques. We therefore propose a two-phase strategy to significantly reduce the computational costs.

7.1. Graph simplification

The computational costs could be significantly reduced if many of the false pairings included in the graph are identified and removed. A hypothetical target computed from false pairings would disappear abruptly when the paired blobs depart from their respective epipolar lines at some

Figure 6. A false pairing h is detected when both blobs leave the respective epipolar lines of each other at t+1 without tracking ambiguity - no other blob moves into the adjacent regions (indicated by gray disks) from time t-1 to t+1.

Figure 7. If a pairing is temporally connected to more than one pairing at adjacent frames - e.g., $h_2$, $h_3$, $h_4$ - or shares blobs with other pairings - e.g., $h_1$ and $h_2$ share the blob a - it is an ambiguous pairing. The ambiguous pairings (yellow dots) are grouped into a cluster (indicated by the gray shadow) if they are temporally connected or share blobs.

time. This indicates that nodes with no outgoing link in the graph are likely to be false pairings. To determine whether a node with no outgoing link is a false pairing, we investigate the motions of the related blobs on the image planes. As shown in Figure 6, if both blobs move from the previous frame to the next frame without tracking ambiguity, the pairing is identified as a false stereo correspondence. In the same manner, false pairings which appear suddenly can also be recognized. The recognized false pairings are removed from the graph, and the process is repeated until no more false pairings can be found.

    7.2. Optimization in clusters

If no stereo-matching ambiguity exists in a pairing - a blob has only one corresponding blob on the related epipolar line in the other view - the two blobs can be established as a stereo correspondence. Through graph simplification, pairings of this kind emerge in large numbers. Many of them connect to each other one-by-one and form a pairing path without branches. These paths in fact produce most of the result - the 3D trajectory segments of targets. Therefore it is unnecessary to take them into account in the optimization. We only focus on the remaining ambiguous pairings, which have either stereo-matching ambiguities or tracking ambiguities.

These ambiguous pairings are grouped into different clusters by introducing an equivalence relation, defined such that two ambiguous pairings that are temporally connected or share blobs are classified into the same cluster, as shown in Figure 7.
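A simple way to realize this grouping is a union-find over the ambiguous pairings, with the equivalence relation supplied as a predicate; this sketch is an illustration under that assumption, not the authors' code.

```python
def group_ambiguous_pairings(pairings, related):
    """Group ambiguous pairings into clusters using a union-find.

    pairings : list of pairing identifiers (e.g. (view-1 blob, view-2 blob, t)).
    related(p, q) : True if p and q are temporally connected or share a blob,
                    i.e. the equivalence relation described above.
    Returns a list of clusters, each a list of pairings.
    """
    parent = list(range(len(pairings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(len(pairings)):
        for j in range(i + 1, len(pairings)):
            if related(pairings[i], pairings[j]):
                parent[find(i)] = find(j)   # merge the two clusters

    clusters = {}
    for i, p in enumerate(pairings):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())
```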


Figure 8. The ambiguous pairings (colored dots) are classified into clusters (marked by different colors). Optimization is done in each cluster and finally the optimum paths are acquired (colored curves).

Optimization is done separately in each cluster. Each cluster is first extended by adding its adjacent pairings. Let $M_{A_i}$ be the related blobs of the pairings in the cluster $A_i$. We minimize the cost $f(S_{A_i})$ to yield the optimum paths $S^*_{A_i}$ for each cluster. The optimum paths of all clusters and the already obtained path segments are fused together, and finally produce the global optimum paths throughout all frames, as shown in Figure 8. This technique significantly reduces the computational cost, enabling our method to handle massive numbers of targets more efficiently.

    8. Experiments

We test the proposed method on simulated particle swarms of various challenge levels. We also apply the proposed method to a real-world case of reconstructing the 3D trajectories of fruit fly swarms.

    8.1. Simulated particle swarm

Simulated particles, confined in a cube of side length 2, are initialized with random locations and velocities. In each step, the particle velocity is updated using

$$v_t = \lambda v_{t-1} + n_t, \quad (13)$$

where $\lambda \in [0, 1]$ is a parameter used to control the smoothness of the velocity and $n_t \sim N(0, 0.05 I_d)$ ($I_d$ is a 3x3 identity matrix) is Gaussian noise perturbing the velocity vector. The particle location is then computed from the previous location by $x_t = x_{t-1} + v_t$. At each time step, the particles are projected onto two image planes of 800x800 resolution from different views, simulating two cameras. Each particle is rendered as a sphere of the same color using OpenGL. The radius of the sphere is set to 0.02 to generate projected balls with a diameter of 5 pixels in the image sequences. Random noise is added to simulate the real imaging process.
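A minimal sketch of this generative model (before projection and rendering) follows; the handling of the cube boundary by clipping and the parameter names are assumptions made for illustration.

```python
import numpy as np

def simulate_swarm(n_particles, n_steps, lam=0.9, noise_std=0.05, box=2.0, seed=0):
    """Generate ground-truth 3D trajectories of a simulated particle swarm.

    Velocities follow v_t = lam * v_{t-1} + n_t with Gaussian noise n_t,
    and locations follow x_t = x_{t-1} + v_t (Eq. 13 and the text above).
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, box, size=(n_particles, 3))        # random initial locations
    v = rng.normal(0.0, noise_std, size=(n_particles, 3))   # random initial velocities
    trajectory = [x.copy()]
    for _ in range(n_steps - 1):
        v = lam * v + rng.normal(0.0, noise_std, size=(n_particles, 3))
        x = np.clip(x + v, 0.0, box)   # assumption: confinement handled by clipping
        trajectory.append(x.copy())
    return np.stack(trajectory)        # shape: (n_steps, n_particles, 3)
```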

Two cases are simulated to test the performance of the proposed method. In the first case, the average velocity is set to around 5 pixels per frame, and the number of targets varies from 10 to 100. In the second one, we change the average velocity from 6 pixels to 24 pixels per frame while the number of targets is set to 40. In both cases, $\lambda$ is randomly chosen for each target around 0.9 to simulate the motion of flying insects. Evaluation metrics and results are presented in the next sections.

    8.2. Evaluation metrics

To evaluate the results, we propose a metric named Reconstruction Association Error (RAE) which measures errors in both reconstruction and temporal association. It is defined as

$$RAE = \frac{N_{mc} + N_{fc} + N_{ma} + N_{fa}}{2NT}, \quad (14)$$

where N and T are the numbers of targets and frames respectively. $N_{mc}$ denotes the number of missing genuine correspondences. $N_{fc}$ is the number of falsely selected correspondences. $N_{ma}$ and $N_{fa}$ are the numbers of missing associations and false associations between successive frames respectively.
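For clarity, Eq. (14) can be computed directly from the four error counts, as in the following sketch (argument names are illustrative).

```python
def reconstruction_association_error(n_mc, n_fc, n_ma, n_fa, n_targets, n_frames):
    """Reconstruction Association Error (RAE) of Eq. (14).

    n_mc / n_fc : missing / falsely selected stereo correspondences.
    n_ma / n_fa : missing / false temporal associations between frames.
    The sum is normalized by 2 * N * T.
    """
    return (n_mc + n_fc + n_ma + n_fa) / (2.0 * n_targets * n_frames)
```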

    8.3. Results on simulated particle swarms

We compare our method with the recent Relative Epipolar Motion (REM) method [4] that uses a 2D trajectory-matching strategy, and with an extension of REM adopting the GOA tracker [17] instead of the nearest neighbor tracker. The reconstruction errors and resulting 3D trajectories are shown in Figure 9 and Figure 10 respectively. In the first case, the increasing number of targets results in more stereo matching ambiguities, producing more 2D tracking errors. Since GOA-REM utilizes a motion prior to facilitate the 2D tracking, it outperforms REM. In the second case, due to the increasing tracking ambiguities, both methods fail to obtain 2D tracks efficiently. Unlike the above two methods, our approach achieves superior results in both cases, because it unifies tracking and stereo matching by global optimization in both the spatial and temporal domains. As shown in Figure 11, the proposed method is resilient to tracking ambiguity on the 2D image plane.

8.4. Results on fruit fly swarm

The collective behavior of fruit flies has attracted significant attention from biologists [16, 10]. One important way to study it is analyzing their 3D motion trajectories. We applied our approach to acquiring the 3D trajectories of a large number of freely flying fruit flies.

The fruit flies flew in an acrylic glass box of size 35 cm x 35 cm x 25 cm, where the background was illuminated by white plane lights. Two synchronized and calibrated Sony HVR-V1C video cameras working in a high-speed mode of 200 fps were used to capture the scene from different views.


(a) Ground truth (b) REM (RAE: 0.664) (c) GOA-REM (RAE: 0.345) (d) The proposed GCS (RAE: 0.008)

Figure 10. Results for 60 targets using different methods. The black lines denote the missing correspondences/associations, and the red lines denote the false correspondences/associations. Our method yields a remarkable result, as in (d).

Figure 11. GCS yields correct tracking results, in spite of tracking ambiguity on the 2D image planes.

Figure 9. RAE versus the number of targets (left) and versus the average speed on the image plane in pixels (right), for REM, GOA-REM, and the proposed GCS. The proposed GCS method yields remarkable results in spite of high densities and fast speeds, while results from track-matching methods are poor, with errors reaching up to 0.9.

The image resolution is 960 x 540. We first detected the fruit flies by background subtraction. We then applied our method to reconstruct the 3D trajectories of a sequence with 200 frames. We finally obtained 438 trajectories in total. The average length of these trajectories is 60.7 frames. Among all of them, 86 trajectories are longer than 100 frames. Figure 12 demonstrates the acquired 3D trajectories. At each frame, the blobs that correspond to reconstructed targets on average account for 77.6% of the detected blobs in each view. The result is promising, since some targets are not simultaneously captured by both cameras, particularly the targets near the image boundaries, as shown in Figure 12.

    9. Conclusion

We have proposed a novel approach for acquiring the 3D trajectories of a swarm of flying targets. The proposed method simultaneously solves tracking and stereo matching in a global manner, which significantly reduces ambiguities. The framework can be further extended in many ways: it can accommodate additional information such as texture and color if available; the pairing can be replaced by grouping across multiple cameras; a higher-order kinetic coherency can be adopted to achieve better results.

    9.1. Acknowledgements

The research work presented in this paper is supported by the National Natural Science Foundation of China, Grant No. 60875024. We would like to thank Linguo Li and Wei Li at the School of Life Sciences, Fudan University, for providing fruit flies for the experiments.

    References

[1] Y. Bar-Shalom, T. Fortmann, and M. Scheffe. Joint probabilistic data association for multiple targets in clutter. In Proc. Conf. on Information Sciences and Systems, pages 404-409, 1980.

[2] M. Betke, D. Hirsh, A. Bagchi, N. Hristov, N. Makris, and T. Kunz. Tracking large variable numbers of objects in clutter. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1-8, 2007.

[3] M. Betke, D. Hirsh, N. Makris, G. McCracken, M. Procopio, N. Hristov, S. Tang, A. Bagchi, J. Reichard, J. Horn, et al. Thermal imaging reveals significantly smaller Brazilian free-tailed bat colonies than previously estimated. Journal of Mammalogy, 89(1):18-24, 2008.


(a) Left view (b) Right view (c) Reconstructed 3D trajectories

Figure 12. Results of the experiment on the fruit fly swarm. See more details in the supplemental video.

[4] H. Du, D. Zou, and Y. Chen. Relative epipolar motion of tracked features for correspondence in binocular stereo. In IEEE 11th International Conference on Computer Vision, pages 1-8, 2007.

[5] D. Engelmann, C. Garbe, and M. Stöhr. Stereo particle tracking. In Proc. of 8th Int. Symp. on Flow Visualization (CD-ROM), pages 2401, 1998.
[6] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6):721-741, 1984.

[7] Y. Guezennec, R. Brodkey, N. Trigui, and J. Kent. Algorithms for fully automated three-dimensional particle tracking velocimetry. Experiments in Fluids, 17(4):209-219, 1994.

[8] N. Kasagi and K. Nishino. Probing turbulence with three-dimensional particle-tracking velocimetry. Experimental Thermal and Fluid Science, 4(5):601-612, 1991.

[9] J. MacCormick and A. Blake. A probabilistic exclusion principle for tracking multiple objects. International Journal of Computer Vision, 39(1):57-71, 2000.

[10] G. Maimon, A. D. Straw, and M. H. Dickinson. A simple vision-based algorithm for decision making in flying Drosophila. Current Biology, 18(6):464-470, March 2008.

[11] N. Malik, T. Dracos, and D. Papantoniou. Particle tracking velocimetry in three-dimensional flows - Part II: Particle tracking. Experiments in Fluids, 15(4):279-294, 1993.

[12] K. Norris and C. Schilt. Cooperative societies in three-dimensional space: on the origins of aggregations, flocks, and schools, with special reference to dolphins and fish. Ethology and Sociobiology, 9(2-4):149-179, 1988.

[13] F. Pereira, H. Stuer, E. Graff, and M. Gharib. Two-frame 3D particle tracking. Measurement Science and Technology, 17(7):1680-1692, 2006.

[14] D. Reid. An algorithm for tracking multiple targets. IEEE Transactions on Automatic Control, 24(6):843-854, 1979.
[15] J. Sullivan, S. Carlsson, and E. Hayman. Tracking and labelling of interacting multiple targets. Pages 619-632, 2006.
[16] L. Tammero and M. Dickinson. The influence of visual landscape on the free flight behavior of the fruit fly Drosophila melanogaster. Journal of Experimental Biology, 205(3):327-343, 2002.

[17] C. Veenman, M. Reinders, and E. Backer. Resolving motion correspondence for densely moving points. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 54-72, 2001.

[18] A. Veeraraghavan, M. Srinivasan, R. Chellappa, E. Baird, and R. Lamont. Motion based correspondence for 3D tracking of multiple dim objects. In IEEE Conference on Acoustics, Speech and Signal Processing, volume 2, 2006.

[19] J. Willneff and A. Gruen. A new spatio-temporal matching algorithm for 3D particle tracking velocimetry. In The 9th International Symposium on Transport Phenomena and Dynamics of Rotating Machinery, Honolulu, Hawaii, USA, February, pages 10-14, 2002.