Kinematic Jump Processes for Monocular 3D Human Tracking

Cristian Sminchisescu (University of Toronto)

Bill Triggs (INRIA Rhone-Alpes)


Goal: track human body motion in monocular video and estimate 3D joint motion

Why Monocular?

• Movies, archival footage
• Resynthesis, e.g. change point of view or actor
• Tracking / interpretation of actions & gestures (HCI)
• How do humans do this so well?

Overall Modeling Approach

1. Generative Human Model
– Kinematics, geometry, photometry
– Predicts images or descriptors
– Priors and anatomical constraints

2. Model-image matching cost function
– Robust, probabilistically motivated
– Contour and intensity based

3. Tracking by search / optimization
– Discovers well-supported configurations of the matching cost

Why is 3D-from-monocular hard?

Image matching ambiguities

Depth ambiguities

Violations of physical constraints

How many local minima are there?

Thousands! Even without image matching ambiguities …

Examples of Kinematic Ambiguities

• Minima are separated by large distances in parameter space

Monocular 3D Tracking Methods

• CONDENSATION (discrete, motion models)
– Deutscher et al. ’00: annealing, walking
– Sidenbladh et al. ’00, ’02: importance sampling (walking + snippets)

• CSS, ET/HS/Hyperdynamics (continuous, cost-sensitive)
– Sminchisescu & Triggs ’01, ’02

Covariance Scaled Sampling (CSS)

Hyperdynamics

Hypersurface Sweeping (HS)

Search Globality and Adaption

• Cost-sensitive continuous search methods are
– Efficient: avoid the large wastage factors of random sampling
– Generic: no assumptions on known motions

• Focus on locating transition states and nearby minima

• But
– Still local (i.e. sometimes myopic)
• Minima are typically far apart in parameter space
– No knowledge of global long-range minimum structure

• Want to search quasi-globally, yet preserve generality
– Can we find other minima more efficiently by exploiting intrinsic problem structure?

Kinematic Jump Sampling

• For any given model configuration, we can explicitly build the interpretation tree of alternative kinematic solutions with identical joint projections
– Work outwards from the root of the kinematic tree, recursively evaluating forward/backward ‘flips’ for each body part
• Alternatively, sample by generating flips randomly
• … or, for tracking, sample shallowly and treat each limb quasi-independently
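The flip enumeration described above can be sketched as follows. This is only an illustration under a simplifying assumption that is not stated on the slide: an orthographic camera looking along the z axis, so that mirroring a link's depth offset leaves its image projection and its length unchanged. The function name `enumerate_flips` and the joint representation are hypothetical.

```python
from itertools import product


def enumerate_flips(joints):
    """Return all 3D chains with the same (x, y) projections and link lengths.

    joints: list of (x, y, z) points ordered from the chain root outwards.
    Assumption: orthographic camera along z, so negating a link's depth
    offset does not change where its endpoints project in the image.
    """
    root = joints[0]
    # Offset of each link, from one joint to the next one along the chain.
    offsets = [
        (b[0] - a[0], b[1] - a[1], b[2] - a[2])
        for a, b in zip(joints, joints[1:])
    ]
    solutions = []
    # One sign per link: +1 keeps the link as is, -1 flips it in depth.
    for signs in product((1, -1), repeat=len(offsets)):
        chain = [root]
        # Work outwards from the root, placing each joint from its parent.
        for (dx, dy, dz), sign in zip(offsets, signs):
            px, py, pz = chain[-1]
            chain.append((px + dx, py + dy, pz + sign * dz))
        solutions.append(chain)
    return solutions
```

A chain with n links yields 2^n alternatives (fewer distinct ones if some link has zero depth offset), which is why short chains and shallow sampling keep the tree manageable during tracking.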

Efficient Inverse Kinematics

• The inverse kinematics is simple, efficient to solve
– Constrained by many observations (3D articulation centers)
– The quasi-spherical articulation of the body
– Mostly in closed form

• The iterative solution is also very competitive
• Optimize over model-hypothesized 3D joint assignments
• Cost: 1 local optimization's work per new minimum found
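As a small illustration of why known 3D articulation centers make the inverse kinematics easy, the bend angle at a joint follows in closed form from three joint positions. This is a minimal sketch, not the solver used in the talk; the function name `joint_angle` and the choice of a single bend angle are assumptions made for the example.

```python
import math


def joint_angle(parent, joint, child):
    """Angle in radians at `joint` between the links to `parent` and `child`.

    Each argument is an (x, y, z) articulation center. A fully extended
    limb gives pi, a right-angle bend gives pi / 2.
    """
    u = [p - j for p, j in zip(parent, joint)]
    v = [c - j for c, j in zip(child, joint)]
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("links must have non-zero length")
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))
```

For example, `joint_angle((0, 0, 0), (1, 0, 0), (1, 1, 0))` describes an elbow bent at a right angle.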

An adaptive diffusion method (CSS) is necessary for correspondence ambiguities
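The idea behind covariance scaled sampling can be sketched as drawing samples from a Gaussian whose covariance has been inflated around a mode. The sketch below assumes each mode is a Gaussian given by a mean and a covariance matrix, and that the scaling factor multiplies the covariance itself; the slides do not say whether the factor applies to the covariance or to the standard deviations, so treat that as an assumption. All function names are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite a."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - s)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def covariance_scaled_samples(mean, cov, scale, count, rng):
    """Draw `count` samples from N(mean, scale * cov).

    Assumption: `scale` multiplies the covariance matrix directly.
    rng is a random.Random instance, passed in for reproducibility.
    """
    factor = cholesky([[scale * c for c in row] for row in cov])
    n = len(mean)
    samples = []
    for _ in range(count):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # x = mean + L z has covariance L L^T = scale * cov.
        samples.append(
            [mean[i] + sum(factor[i][k] * z[k] for k in range(i + 1))
             for i in range(n)]
        )
    return samples
```

Because the samples follow the shape of the covariance, the spread is larger along uncertain directions of the cost surface than along well-constrained ones.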

The KJS Algorithm

Input: p_{t-1}, with modes m_i = (μ_i, Σ_i)

For each mode m_i:

1. C = SelectSamplingChain(m_i)
– Candidate Sampling Chains C_1, …, C_M; C is selected by a vote, vote(v)
2. s = CovarianceScaledSampling(m_i)
3. S = BuildInterpretationTree(s, C)
4. E = InverseKinematics(S)
5. Prune and locally optimize E

Output: p_t
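The data flow between the steps of the algorithm can be sketched as a loop over modes. This is only a skeleton: the five stages are passed in as callables, and their signatures are assumptions made for illustration rather than the authors' interfaces.

```python
def kinematic_jump_sampling(modes, select_chain, css_sample, build_tree,
                            inverse_kinematics, prune_and_optimize):
    """Run one time step of the sampling loop and return new hypotheses.

    All five stage functions are hypothetical stand-ins supplied by the
    caller; only the order of the calls mirrors the slide.
    """
    hypotheses = []
    for mode in modes:
        chain = select_chain(mode)              # C = SelectSamplingChain(m_i)
        sample = css_sample(mode)               # s = CovarianceScaledSampling(m_i)
        tree = build_tree(sample, chain)        # S = BuildInterpretationTree(s, C)
        candidates = inverse_kinematics(tree)   # E = InverseKinematics(S)
        hypotheses.extend(prune_and_optimize(candidates))  # prune, optimize E
    return hypotheses
```

Passing the stages as parameters keeps the sketch independent of any particular body model or cost function.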

Tracking Experiments

• 4s agile dancing sequence, 25 frames per second

• Cluttered background, self-occlusion, motion in depth

• Automatically select kinematic jump samples (KJS) from short 3-link chains (rooted at hips, shoulders, neck)

• 8 modes, CSS diffusion with scaling 4

Jump Sampling in Action

Quantitative Search Statistics

• Initialize in one minimum, different sampling regimes
• Improved minima localization by KJS
– Local optimization often not necessary

Summary

• Kinematic Jump Sampling Algorithm
– Construct interpretation trees of 3D joint positions corresponding to monocular kinematic ambiguities
– Solve efficiently using closed-form inverse kinematics

• Highly accurate hypothesis generator for long-range search
– Local optimization polishing often unnecessary

• Address both depth and image matching ambiguities
– Explicit kinematic jumps + cost-sensitive sampling

• Future work
– Scene constraints (ground plane, equilibrium)
– Jump strategies for image matching
– Prior knowledge (Sminchisescu & Jepson ’03, upcoming)

The End
