A General Framework for Tracking Multiple People from a Moving Camera


A General Framework for Tracking Multiple People from a Moving Camera. Wongun Choi, Caroline Pantofaru, Silvio Savarese. IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2013.


1

A General Framework for Tracking Multiple People from a Moving Camera

Wongun Choi, Caroline Pantofaru, Silvio Savarese

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, July 2013

2

Overview

• Motivation
• Related Work
• Introduction
• Proposed Method
• Experiment Result
• Conclusion

3

Motivation

1. The final goal is to track multiple people from a moving camera in both outdoor and indoor video scenes.
2. Several challenges must be solved:
   1) People appear in a variety of poses
   2) Multiple people in the same scene produce complex motion patterns
   3) The scene and the illumination change

4

Related work

1. Tracking by online learning:
   Learning an appearance model [10], [5], [34], [7], [26]
   Color histogram and mean shift [10]
2. Tracking with a moving camera:
   Probabilistic framework with multiple detectors [42], [43]
   Stereo and graphical model [12], [13]

[5] S. Avidan. Ensemble tracking. PAMI, 2007.
[7] C. Bibby and I. Reid. Robust real-time visual tracking using pixelwise posteriors. ECCV, 2008.
[10] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. PAMI, 2002.
[12] A. Ess, B. Leibe, K. Schindler, and L. van Gool. A mobile vision system for robust multi-person tracking. CVPR, 2008.
[13] A. Ess, B. Leibe, K. Schindler, and L. van Gool. Robust multi person tracking from a mobile platform. PAMI, 2009.
[26] S. Kwak, W. Nam, B. Han, and J. Han. Learning occlusion with likelihoods for visual tracking. ICCV, 2011.
[34] D. Ramanan, D. Forsyth, and A. Zisserman. Tracking people by learning their appearance. PAMI, Jan. 2007.
[42] C. Wojek, S. Walk, S. Roth, and B. Schiele. Monocular 3d scene understanding with explicit occlusion reasoning. CVPR, 2011.
[43] C. Wojek, S. Walk, and B. Schiele. Multi-cue onboard pedestrian detection. CVPR, 2009.

5

Introduction (1)

The proposed method addresses these issues as follows:

1) People appear in a variety of poses: fuse multiple person-detection methods and other observations

2) Complex motion patterns of multiple people in the same scene: build a motion model that captures the interactions between targets

3) Changing scene and illumination: propose a novel 3D model that explains the process of video generation

6

Introduction (2)

Observation cues:

7

Introduction (3)

Build the 3D model:

8

Introduction (4)

Particle filter:

1. Definition: a posterior density estimation algorithm that estimates the posterior density of the state space by directly implementing the Bayesian recursion equations

2. Sampling generates the state distribution of the posterior, and resampling reconstructs the new distribution
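The predict / weight / resample cycle described above can be sketched on a toy problem. The example below is a minimal bootstrap particle filter for a single scalar state, not the tracker from the paper: the random-walk motion model, the Gaussian observation model, and all names and parameter values (`particle_filter_step`, `run_filter`, `motion_std`, `meas_std`) are assumptions chosen for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, measurement, motion_std, meas_std, rng):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: move each particle with a random-walk motion model (assumed).
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: likelihood of the measurement given each predicted particle.
    weights = [gaussian_pdf(measurement, p, meas_std) for p in predicted]
    if sum(weights) == 0.0:
        # Degenerate case: every likelihood underflowed, so weight uniformly.
        weights = [1.0] * len(predicted)
    # Resample: draw a new, equally weighted particle set.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def run_filter(measurements, n_particles=500, seed=0):
    """Return the posterior-mean estimate after each measurement."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        particles = particle_filter_step(particles, z, 0.5, 1.0, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Calling `run_filter([2.0] * 20)` feeds the same measurement twenty times; the estimates should settle near 2.0 as resampling concentrates the particles around states that explain the measurements.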

9

Introduction (5)

Reversible-Jump Markov Chain Monte Carlo (RJMCMC): a class of algorithms for sampling from probability distributions by constructing a Markov chain that allows the dimensionality of the state to change
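To make "changes of the dimensionality of the state" concrete, here is a toy birth/death Metropolis-Hastings sampler, one of the simplest trans-dimensional samplers in the reversible-jump family. It is not the paper's tracker: the target distribution (a Poisson point process on the unit interval) and the names `birth_death_sampler` and `rate` are assumptions for illustration. A birth move adds a point and a death move removes one, loosely analogous to a person entering or leaving the scene.

```python
import random


def birth_death_sampler(rate, n_iter, seed=0):
    """Toy trans-dimensional sampler: points on [0, 1] whose count varies.

    Assumed target: a Poisson point process with intensity `rate` on the
    unit interval, so the number of points follows Poisson(rate).
    """
    rng = random.Random(seed)
    points = []   # current state; its length is the current dimension
    counts = []   # dimension recorded after each iteration
    for _ in range(n_iter):
        n = len(points)
        if rng.random() < 0.5:
            # Birth move: propose adding a uniformly placed point.
            accept = min(1.0, rate / (n + 1))
            if rng.random() < accept:
                points.append(rng.random())
        elif n > 0:
            # Death move: propose removing one point chosen at random.
            accept = min(1.0, n / rate)
            if rng.random() < accept:
                points.pop(rng.randrange(n))
        counts.append(len(points))
    return counts
```

With `rate=3.0`, the average of the returned counts (after discarding an initial burn-in) should be close to 3, because the acceptance ratios satisfy detailed balance with respect to the Poisson(3) distribution over the number of points.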

10

Proposed Method

System overview:

1. Use observation cues to generate detection hypotheses and an observation model
2. Build a motion model that accounts for both people's unexpected motions and the interactions between people
3. Run the sampling procedure of the RJ-MCMC tracker, which includes evaluation (resampling)

11

Proposed Method

Model representation:

12

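Proposed Method

The states of the camera, the targets, and the geometric features are treated as random variables, and their relationship is modeled by a joint posterior probability.

The tracking problem can be formulated as finding the maximum a posteriori (MAP) solution. The posterior combines three terms:

(a) Observation likelihood
(b) Motion model (transition model)
(c) Posterior at time t-1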
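The three terms (a), (b), and (c) match the standard Bayesian filtering recursion, which can be written as follows. The symbols are assumptions, since the slide's own formula did not survive extraction: \Omega_t stands for the joint state at time t, I_t for the observation at time t, and I_{1:t} for all observations up to time t.

```latex
P(\Omega_t \mid I_{1:t}) \;\propto\;
  \underbrace{P(I_t \mid \Omega_t)}_{\text{(a) observation likelihood}}
  \int
  \underbrace{P(\Omega_t \mid \Omega_{t-1})}_{\text{(b) motion model}}\,
  \underbrace{P(\Omega_{t-1} \mid I_{1:t-1})}_{\text{(c) posterior at } t-1}
  \, d\Omega_{t-1},
\qquad
\hat{\Omega}_t = \arg\max_{\Omega_t} P(\Omega_t \mid I_{1:t})
```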

13

Proposed Method

(a) Observation likelihood:

Camera projection function:
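The slide's projection formula is not recoverable from the text, so the sketch below only shows what a camera projection function does in general: it maps a 3D point to pixel coordinates. It uses a plain pinhole model, which is an assumption; the paper's camera parametrization may include further parameters. The name `project_point` and the axis convention (x right, y down, z forward) are hypothetical.

```python
def project_point(point_cam, focal_length, principal_point):
    """Project a 3D point in camera coordinates onto the image plane.

    Assumed convention: x points right, y points down, z points forward,
    and focal_length is expressed in pixels.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u0, v0 = principal_point
    u = focal_length * x / z + u0
    v = focal_length * y / z + v0
    return (u, v)
```

For example, `project_point((1.0, 0.5, 2.0), 500.0, (320.0, 240.0))` returns `(570.0, 365.0)`: the point lies to the right of and below the optical axis, so it lands to the right of and below the image center.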

14

Proposed Method

Target observation likelihood:

j: detector index
w_j: weight for detector j

15

Proposed Method

Target observation likelihood:

1) Pedestrian detector
2) Upper-body detector
3) Target-specific detector based on an appearance model
4) Detector based on upper-body shape from depth
5) Face detector
6) Skin detector
7) Motion detector
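The slides describe a weight w_j for each detector j, which suggests that the per-detector responses are combined by a weighted sum. The sketch below shows that combination under this assumption; the exact functional form in the paper is not recoverable from the slide text, and the function name, detector names, and scores are hypothetical. Detectors that are unavailable for a given sensor setup are simply left out of the score dictionary.

```python
def target_log_likelihood(detector_scores, weights):
    """Weighted sum of per-detector scores for one target hypothesis.

    detector_scores: dict mapping detector name -> score for this hypothesis
    weights:         dict mapping detector name -> weight w_j
    Only detectors present in detector_scores contribute.
    """
    missing = set(detector_scores) - set(weights)
    if missing:
        raise KeyError(f"no weight for detectors: {sorted(missing)}")
    return sum(weights[name] * score for name, score in detector_scores.items())
```

With weights `{"pedestrian": 0.5, "face": 2.0, "skin": 1.0}` and scores `{"pedestrian": 2.0, "face": 1.0}`, the result is `0.5 * 2.0 + 2.0 * 1.0 = 3.0`; the skin detector has a weight but contributes nothing because it produced no score.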

16

Proposed Method

Pedestrian and upper-body detectors using HOG:

17

Proposed Method

Face detector using the OpenCV Viola-Jones face detector:

18

Proposed Method

Skin color detector using thresholds in HSV color space:
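A thresholding skin detector of this kind can be sketched with the standard-library `colorsys` module. The threshold values below are placeholders chosen for illustration, not the values used in the paper, and `is_skin` and `skin_mask` are hypothetical names.

```python
import colorsys


def is_skin(rgb, h_max=0.14, s_range=(0.15, 0.75), v_min=0.35):
    """Classify one 8-bit RGB pixel as skin by thresholding its HSV values.

    All thresholds are illustrative assumptions. colorsys returns hue,
    saturation, and value in the range [0, 1].
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h <= h_max and s_range[0] <= s <= s_range[1] and v >= v_min


def skin_mask(image):
    """Apply is_skin to every pixel of an image given as rows of RGB tuples."""
    return [[is_skin(px) for px in row] for row in image]
```

Under these placeholder thresholds, a warm tone such as `(224, 172, 105)` passes, while a saturated blue such as `(30, 60, 200)` and pure white are rejected. A real detector would tune the thresholds on labeled data.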

19

Proposed Method

Depth shape detector using the world coordinate system:

20

Proposed Method

Motion detector: project the motion points into the image plane and apply a threshold:

21

Proposed Method

Geometric feature likelihood from an interest point detector:

is the uniform distribution

22

Proposed Method

Motion prior:

23

Proposed Method

Camera motion prior:

24

Proposed Method

Target motion prior:

25

Proposed Method

Existence prior:

26

Proposed Method

Motion prior:

• Independent
• Interacting

27

Proposed Method

Independent motion prior:

update

28

Proposed Method

Interacting motion prior:

Mode variable

29

Proposed Method

Repulsion:

Group motion:

Repulsion force

30

Proposed Method

Tracking by Reversible-Jump Markov Chain Monte Carlo (RJ-MCMC) particle filtering:

Sampling:

Convert posterior problem:
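In MCMC-based particle filtering, the posterior at time t-1 is typically represented by a set of N unweighted samples, which turns the integral in the Bayesian recursion into a sum. The following is a sketch of that standard conversion under assumed notation (\Omega_t for the joint state, I_t for the observation at time t, I_{1:t} for all observations up to t, and \Omega_{t-1}^{(r)} for the r-th sample); the slide's own formula did not survive extraction.

```latex
P(\Omega_{t-1} \mid I_{1:t-1}) \;\approx\;
  \frac{1}{N} \sum_{r=1}^{N} \delta\!\left(\Omega_{t-1} - \Omega_{t-1}^{(r)}\right)
\quad\Longrightarrow\quad
P(\Omega_t \mid I_{1:t}) \;\propto\;
  P(I_t \mid \Omega_t) \sum_{r=1}^{N} P\!\left(\Omega_t \mid \Omega_{t-1}^{(r)}\right)
```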

31

Experimental result

Using the ETH dataset [12]

Video frame rate: ~14 Hz

Resolution: 640 × 480 pixels

32

Experimental result

Single-frame detection accuracy is measured by the overlap ratio between the ground-truth bounding box and the tracked bounding box.
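The overlap ratio between two bounding boxes is commonly computed as intersection over union (IoU). The sketch below assumes that definition and an `(x_min, y_min, x_max, y_max)` box format; both are assumptions, since the slide does not spell out the exact criterion.

```python
def overlap_ratio(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0
```

Two 2 × 2 boxes offset by one pixel in each direction, `(0, 0, 2, 2)` and `(1, 1, 3, 3)`, share an area of 1 out of a union of 7, giving a ratio of 1/7. A tracked box is then counted as correct when the ratio exceeds a chosen threshold.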

33

Experimental result

34

Conclusion

• Combine a probabilistic model with joint variables: the relationship between the camera, the targets, and the geometric features
• Combine multiple cues: adaptable to different sensor configurations and different environments
• Allow people to interact
• Automatically detect people
