Lecture 11: Tracking - Davide Scaramuzza (rpg.ifi.uzh.ch/docs/teaching/2020/11_tracking.pdf)


Vision Algorithms for Mobile Robotics

Lecture 11: Tracking

Davide Scaramuzza, http://rpg.ifi.uzh.ch


Lab Exercise - This afternoon

Implement the Kanade-Lucas-Tomasi (KLT) tracker


Outline

• Point tracking
• Template tracking
• Tracking by detection of local image features


Point Tracking

• Problem: given two images, estimate the motion of a pixel point from image I_0 to image I_1

[Figure: image I_0(x, y)]


๐ผ๐ผ1(๐‘ฅ๐‘ฅ,๐‘ฆ๐‘ฆ)


• Two approaches exist, depending on the amount of motion between the frames:
  • Block-based methods
  • Differential methods

[Figure: tracked point in I_0(x, y) with its optical flow vector (u, v)]

• Consider the motion of the following corner:

[Figure: corner moving between the two frames]

Point Tracking with Block Matching

• Search for the corresponding patch in a D × D region around the point to track
• Use SSD, SAD, or NCC as the similarity measure (a minimal SSD search is sketched below)

[Figure: patch to track and the D × D search region in the next frame]
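As a concrete illustration of this search, here is a minimal block-matching sketch (not from the lecture): it assumes grayscale floating-point numpy images, a point far enough from the image border, and SSD as the similarity measure; the function name and default parameters are illustrative.

import numpy as np

def track_patch_ssd(I0, I1, x, y, patch_radius=7, D=21):
    """Exhaustive SSD search for the patch around (x, y) inside a D x D region of I1."""
    r = patch_radius
    template = I0[y - r:y + r + 1, x - r:x + r + 1]
    best_ssd, best_uv = np.inf, (0, 0)
    for v in range(-(D // 2), D // 2 + 1):          # vertical displacement
        for u in range(-(D // 2), D // 2 + 1):      # horizontal displacement
            candidate = I1[y + v - r:y + v + r + 1, x + u - r:x + u + r + 1]
            ssd = np.sum((template - candidate) ** 2)
            if ssd < best_ssd:
                best_ssd, best_uv = ssd, (u, v)
    return best_uv                                   # estimated motion (u, v)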

Pros and Cons of Block Matching

• Pros:
  • Works well even when the motion is large
• Cons:
  • Can become computationally demanding if the motion is large (the search region grows with it)
• Can the "search" be implemented in a smarter way when the motion is "small"?
  • Yes: use differential methods


Point Tracking with Differential Methods

These methods look at the local brightness changes at the same location: no patch shift is performed!

[Figure: image I_0(x, y)]


๐ผ๐ผ1(๐‘ฅ๐‘ฅ,๐‘ฆ๐‘ฆ)


Assumptions:
• Brightness constancy: the intensity of the pixels around the point to track does not change much between the two frames
• Temporal consistency: the motion displacement is small (1-2 pixels); larger motions can be handled with multi-scale implementations (see later)
• Spatial coherency: neighboring pixels undergo similar motion (i.e., they all lie on the same 3D surface, with no depth discontinuity)


The Kanade-Lucas-Tomasi (KLT) tracker

Consider the reference patch centered at (x, y) in image I_0 and the shifted patch centered at (x + u, y + v) in image I_1. The patch has size Ω. We want to find the motion vector (u, v) that minimizes the Sum of Squared Differences (SSD):

This is a simple quadratic function in two variables (u, v).

Lucas, Kanade, An Iterative Image Registration Technique with an Application to Stereo Vision, Proceedings of the Imaging Understanding Workshop, 1981.

Tomasi, Kanade, Detection and Tracking of Point Features, Carnegie Mellon University Technical Report CMU-CS-91-132, 1991.

๐‘†๐‘†๐‘†๐‘†๐‘†๐‘† ๐‘ข๐‘ข, ๐‘ฃ๐‘ฃ = ๏ฟฝ๐‘ฅ๐‘ฅ,๐‘ฆ๐‘ฆโˆˆฮฉ

(๐ผ๐ผ0 ๐‘ฅ๐‘ฅ,๐‘ฆ๐‘ฆ โˆ’ ๐ผ๐ผ1 ๐‘ฅ๐‘ฅ + ๐‘ข๐‘ข,๐‘ฆ๐‘ฆ + ๐‘ฃ๐‘ฃ )2

โ‰…๏ฟฝ(๐ผ๐ผ0 ๐‘ฅ๐‘ฅ,๐‘ฆ๐‘ฆ โˆ’ ๐ผ๐ผ1 ๐‘ฅ๐‘ฅ,๐‘ฆ๐‘ฆ โˆ’ ๐ผ๐ผ๐‘ฅ๐‘ฅ๐‘ข๐‘ข โˆ’ ๐ผ๐ผ๐‘ฆ๐‘ฆ๐‘ฃ๐‘ฃ)2

โ‡’ ๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†(๐‘ข๐‘ข, ๐‘ฃ๐‘ฃ) = ๏ฟฝ(โˆ†๐ผ๐ผ โˆ’ ๐ผ๐ผ๐‘ฅ๐‘ฅ๐‘ข๐‘ข โˆ’ ๐ผ๐ผ๐‘ฆ๐‘ฆ๐‘ฃ๐‘ฃ)2

๐ผ๐ผ0

๐ผ๐ผ1

(๐‘ฅ๐‘ฅ, ๐‘ฆ๐‘ฆ)

(๐‘ฅ๐‘ฅ, ๐‘ฆ๐‘ฆ)(๐‘ฅ๐‘ฅ + ๐‘ข๐‘ข, ๐‘ฆ๐‘ฆ + ๐‘ฃ๐‘ฃ)


To minimize it, we differentiate it with respect to (u, v) and equate it to zero:

๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†(๐‘ข๐‘ข, ๐‘ฃ๐‘ฃ) = ๏ฟฝ(โˆ†๐ผ๐ผ โˆ’ ๐ผ๐ผ๐‘ฅ๐‘ฅ๐‘ข๐‘ข โˆ’ ๐ผ๐ผ๐‘ฆ๐‘ฆ๐‘ฃ๐‘ฃ)2

๐œ•๐œ•๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐œ•๐œ•๐œ•๐œ•

= 0 , ๐œ•๐œ•๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐œ•๐œ•๐œ•๐œ•

= 0

โˆ’2๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ(โˆ†๐ผ๐ผ โˆ’ ๐ผ๐ผ๐‘ฅ๐‘ฅ๐‘ข๐‘ข โˆ’ ๐ผ๐ผ๐‘ฆ๐‘ฆ๐‘ฃ๐‘ฃ) = 0

โˆ’2๏ฟฝ๐ผ๐ผ๐‘ฆ๐‘ฆ โˆ†๐ผ๐ผ โˆ’ ๐ผ๐ผ๐‘ฅ๐‘ฅ๐‘ข๐‘ข โˆ’ ๐ผ๐ผ๐‘ฆ๐‘ฆ๐‘ฃ๐‘ฃ = 0

๐œ•๐œ•๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐œ•๐œ•๐‘ข๐‘ข

= 0 โ‡’

๐œ•๐œ•๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐œ•๐œ•๐‘ฃ๐‘ฃ

= 0 โ‡’


• Linear system of two equations in two unknowns
• We can write them in matrix form:

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ(โˆ†๐ผ๐ผ โˆ’ ๐ผ๐ผ๐‘ฅ๐‘ฅ๐‘ข๐‘ข โˆ’ ๐ผ๐ผ๐‘ฆ๐‘ฆ๐‘ฃ๐‘ฃ) = 0

๏ฟฝ๐ผ๐ผ๐‘ฆ๐‘ฆ โˆ†๐ผ๐ผ โˆ’ ๐ผ๐ผ๐‘ฅ๐‘ฅ๐‘ข๐‘ข โˆ’ ๐ผ๐ผ๐‘ฆ๐‘ฆ๐‘ฃ๐‘ฃ = 0

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฅ๐‘ฅ ๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฆ๐‘ฆ

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฆ๐‘ฆ ๏ฟฝ๐ผ๐ผ๐‘ฆ๐‘ฆ๐ผ๐ผ๐‘ฆ๐‘ฆ

๐‘ข๐‘ข๐‘ฃ๐‘ฃ =

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅโˆ†๐ผ๐ผ

๏ฟฝ๐ผ๐ผ๐‘ฆ๐‘ฆโˆ†๐ผ๐ผโ‡’ ๐‘ข๐‘ข

๐‘ฃ๐‘ฃ =

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฅ๐‘ฅ ๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฆ๐‘ฆ

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฆ๐‘ฆ ๏ฟฝ๐ผ๐ผ๐‘ฆ๐‘ฆ๐ผ๐ผ๐‘ฆ๐‘ฆ

โˆ’1

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅโˆ†๐ผ๐ผ

๏ฟฝ๐ผ๐ผ๐‘ฆ๐‘ฆโˆ†๐ผ๐ผ

Haven't we seen this matrix already? Recall the Harris detector!

Notice that these are NOT matrix products but pixel-wise products!


In practice, det(M) should be non-zero, which means that its eigenvalues should both be large (i.e., not a flat region, not an edge): the patch should be a corner or, more generally, contain any textured region! (A minimal single-patch implementation is sketched below.)


• Edge → det(M) is low
• Flat region → det(M) is low
• Textured region → det(M) is high

๐‘€๐‘€ =

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฅ๐‘ฅ ๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฆ๐‘ฆ

๏ฟฝ๐ผ๐ผ๐‘ฅ๐‘ฅ๐ผ๐ผ๐‘ฆ๐‘ฆ ๏ฟฝ๐ผ๐ผ๐‘ฆ๐‘ฆ๐ผ๐ผ๐‘ฆ๐‘ฆRR

= โˆ’

2

11

00ฮป

ฮป
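A minimal single-patch sketch of this differential solution (not from the slides; the function name, patch size, and eigenvalue threshold are illustrative assumptions): it builds M and the right-hand side from the patch gradients, rejects patches whose smallest eigenvalue is too small (flat regions or edges), and solves the 2x2 system for (u, v). It assumes grayscale floating-point numpy images and small motion.

import numpy as np

def klt_translation(I0, I1, x, y, r=7, min_eig=1e-2):
    """Estimate (u, v) for the patch centered at (x, y); returns None if M is ill-conditioned."""
    ys, xs = slice(y - r, y + r + 1), slice(x - r, x + r + 1)
    Iy, Ix = np.gradient(I1[ys, xs])                 # spatial gradients over the patch
    dI = I0[ys, xs] - I1[ys, xs]                     # temporal difference Delta I
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = np.array([np.sum(Ix * dI), np.sum(Iy * dI)])
    if np.min(np.linalg.eigvalsh(M)) < min_eig:      # flat region or edge: not trackable
        return None
    return np.linalg.solve(M, b)                     # motion vector (u, v)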

Application to Corner Tracking

[Figure: corner tracking result; color encodes motion direction]

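For reference, this kind of corner tracking can be reproduced with OpenCV's pyramidal Lucas-Kanade tracker; this is only one possible setup, and the file names and parameter values below are illustrative.

import cv2

prev_gray = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
next_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Shi-Tomasi corners (large minimum eigenvalue of M) are good points to track
p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=7)
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None,
                                           winSize=(21, 21), maxLevel=3)
flow_vectors = (p1 - p0)[status.ravel() == 1]                 # per-corner motion (u, v)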

Application to Optical Flow

What if you track every single pixel in the image?

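Tracking every pixel yields a dense optical flow field. As a hedged example, OpenCV's Farneback method computes such a field (it is a different dense algorithm than KLT, shown only to illustrate the kind of output; file names and parameters are illustrative).

import cv2

prev_gray = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
next_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
# flow[y, x] = (u, v): displacement of every pixel; direction is usually color-coded for display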


Optical Flow example


Aperture Problem

• Consider the motion of the following corner


• Now, look at the local brightness changes through a small aperture


• We cannot always determine the motion direction → infinitely many motion solutions may exist!
• Solution?


• Increase the aperture size!


Block-based vs Differential methods

• Block-based methods:
  • Robust to large motions
  • Can be computationally expensive (D × D candidate positions must be evaluated for a single point to track)
• Differential methods:
  • Work only for small motions (e.g., high frame rate); for larger motions, multi-scale implementations are used but are more expensive (see later)
  • Much more efficient than block-based methods, so they can be used to track the motion of every pixel in the image (i.e., optical flow). They avoid searching in the neighborhood of the point by analyzing the local intensity changes (i.e., differences) of an image patch at a fixed location (i.e., no search is performed)


Outline

• Point tracking
• Template tracking
• Tracking by detection of local image features

Template tracking

Goal: follow a template image in a video sequence by estimating the warp


Template Warping

• Given the template image T(x)
• Take all pixels from the template image T(x) and warp them using the function W(x, p), parameterized in terms of the parameters p

[Figure: template image T(x) is warped by W(x, p) into the current image I(W(x, p))]

Common 2D Transformations

• Translation:
  x' = x + a1
  y' = y + a2
• Euclidean:
  x' = x·cos(a3) − y·sin(a3) + a1
  y' = x·sin(a3) + y·cos(a3) + a2
• Affine:
  x' = a1·x + a3·y + a5
  y' = a2·x + a4·y + a6
• Projective (homography):
  x' = (a1·x + a2·y + a3) / (a7·x + a8·y + 1)
  y' = (a4·x + a5·y + a6) / (a7·x + a8·y + 1)

Common 2D Transformations in Matrix form

We denote the transformation by W(x, p), where p = (a1, a2, …, an) is the set of parameters.

• Translation:
$$W(\mathbf{x}, \mathbf{p}) = \begin{bmatrix} x + a_1 \\ y + a_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & a_1 \\ 0 & 1 & a_2 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• Euclidean:
$$W(\mathbf{x}, \mathbf{p}) = \begin{bmatrix} x\cos a_3 - y\sin a_3 + a_1 \\ x\sin a_3 + y\cos a_3 + a_2 \end{bmatrix} = \begin{bmatrix} \cos a_3 & -\sin a_3 & a_1 \\ \sin a_3 & \cos a_3 & a_2 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• Affine:
$$W(\mathbf{x}, \mathbf{p}) = \begin{bmatrix} a_1 x + a_3 y + a_5 \\ a_2 x + a_4 y + a_6 \end{bmatrix} = \begin{bmatrix} a_1 & a_3 & a_5 \\ a_2 & a_4 & a_6 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• Projective (homography), in homogeneous coordinates:
$$W(\tilde{\mathbf{x}}, \mathbf{p}) = \begin{bmatrix} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6 \\ a_7 & a_8 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$


๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ = ๐‘Ž๐‘Ž4cos(๐‘Ž๐‘Ž3) โˆ’sin(๐‘Ž๐‘Ž3) ๐‘Ž๐‘Ž1๐‘ฅ๐‘ฅ๐‘ฆ๐‘ฆ๐‘ฆ๐‘ฆ(๐‘Ž๐‘Ž3) cos(๐‘Ž๐‘Ž3) ๐‘Ž๐‘Ž2

๐‘ฅ๐‘ฅ๐‘ฆ๐‘ฆ1

Derivative and gradient

• Function: f(x); derivative: f'(x) = df/dx, where x is a scalar
• Function: f(x_1, x_2, …, x_n); gradient: ∇f(x_1, x_2, …, x_n) = (∂f/∂x_1, ∂f/∂x_2, …, ∂f/∂x_n)

Jacobian

โ€ข ๐น๐น(๐‘ฅ๐‘ฅ1, ๐‘ฅ๐‘ฅ2, โ€ฆ , ๐‘ฅ๐‘ฅ๐‘›๐‘› ) =๐‘“๐‘“1(๐‘ฅ๐‘ฅ1, ๐‘ฅ๐‘ฅ2, โ€ฆ , ๐‘ฅ๐‘ฅ๐‘›๐‘› )

โ‹ฎ๐‘“๐‘“๐‘š๐‘š(๐‘ฅ๐‘ฅ1, ๐‘ฅ๐‘ฅ2, โ€ฆ , ๐‘ฅ๐‘ฅ๐‘›๐‘› )

is a vector-valued function

โ€ข The derivative in this case is called Jacobian ๐œ•๐œ•๐น๐น๐œ•๐œ•๐ฑ๐ฑ

:

39

Carl Gustav Jacob (1804-1851)

๐œ•๐œ•๐น๐น๐œ•๐œ•๐ฑ๐ฑ

=

๐œ•๐œ•๐‘“๐‘“1๐œ•๐œ•๐‘ฅ๐‘ฅ1

, โ€ฆ ,๐œ•๐œ•๐‘“๐‘“1๐œ•๐œ•๐‘ฅ๐‘ฅ๐‘›๐‘›

โ‹ฎ๐œ•๐œ•๐‘“๐‘“๐‘š๐‘š๐œ•๐œ•๐‘ฅ๐‘ฅ1

, โ€ฆ ,๐œ•๐œ•๐‘“๐‘“๐‘š๐‘š๐œ•๐œ•๐‘ฅ๐‘ฅ๐‘›๐‘›

Displacement-model Jacobians ∂W/∂p

p = (a_1, a_2, …, a_n)

• Translation:
$$W(\mathbf{x}, \mathbf{p}) = \begin{bmatrix} x + a_1 \\ y + a_2 \end{bmatrix}, \qquad \frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} \frac{\partial W_1}{\partial a_1} & \frac{\partial W_1}{\partial a_2} \\ \frac{\partial W_2}{\partial a_1} & \frac{\partial W_2}{\partial a_2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
• Euclidean:
$$W(\mathbf{x}, \mathbf{p}) = \begin{bmatrix} x\cos a_3 - y\sin a_3 + a_1 \\ x\sin a_3 + y\cos a_3 + a_2 \end{bmatrix}, \qquad \frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} 1 & 0 & -x\sin a_3 - y\cos a_3 \\ 0 & 1 & x\cos a_3 - y\sin a_3 \end{bmatrix}$$
• Affine:
$$W(\mathbf{x}, \mathbf{p}) = \begin{bmatrix} a_1 x + a_3 y + a_5 \\ a_2 x + a_4 y + a_6 \end{bmatrix}, \qquad \frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} x & 0 & y & 0 & 1 & 0 \\ 0 & x & 0 & y & 0 & 1 \end{bmatrix}$$
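For the exercise, these Jacobians can be written directly as small functions of the pixel coordinates; a minimal sketch (names are illustrative) for the translation and affine cases:

import numpy as np

def jacobian_translation(x, y):
    # dW/dp for W(x, p) = (x + a1, y + a2): independent of the pixel position
    return np.array([[1.0, 0.0],
                     [0.0, 1.0]])                              # 2 x 2

def jacobian_affine(x, y):
    # dW/dp for x' = a1*x + a3*y + a5, y' = a2*x + a4*y + a6
    return np.array([[x, 0.0, y, 0.0, 1.0, 0.0],
                     [0.0, x, 0.0, y, 0.0, 1.0]])              # 2 x 6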


Template Tracking: Problem Formulation

• The goal of template-based tracking is to find the set of warp parameters p such that:
$$I\big(W(\mathbf{x}, \mathbf{p})\big) = T(\mathbf{x})$$
• This is solved by determining the p that minimizes the Sum of Squared Differences:
$$SSD = \sum_{\mathbf{x}\in T} \big(I(W(\mathbf{x}, \mathbf{p})) - T(\mathbf{x})\big)^2$$

Assumptions

• No errors in the template image boundaries: only the object to track appears in the template image
• No occlusion: the entire template is visible in the input image
• Brightness constancy
• Temporal consistency
• Spatial coherency


KLT tracker applied to template tracking

• Uses the Gauss-Newton method for minimization, that is:
  • Applies a first-order approximation of the warp
  • Attempts to minimize the SSD iteratively


Derivation of the KLT algorithm

• Assume that an initial estimate of p is known. Then, we want to find the increment Δp that minimizes:
• The first-order Taylor approximation of I(W(x, p + Δp)) yields:


๐‘†๐‘†๐‘†๐‘†๐‘†๐‘† = ๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ โˆ’ ๐‘‡๐‘‡(๐ฑ๐ฑ)๐Ÿ๐Ÿ

๐‘†๐‘†๐‘†๐‘†๐‘†๐‘† = ๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ + โˆ†๐ฉ๐ฉ โˆ’ ๐‘‡๐‘‡(๐ฑ๐ฑ)๐Ÿ๐Ÿ

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ + โˆ†๐ฉ๐ฉ โ‰… ๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ +๐›ป๐›ป๐ผ๐ผ ๐œ•๐œ•๐œ•๐œ•๐œ•๐œ•๐ฉ๐ฉโˆ†๐ฉ๐ฉ

๐›ป๐›ป๐ผ๐ผ = ๐ผ๐ผ๐‘ฅ๐‘ฅ , ๐ผ๐ผ๐‘ฆ๐‘ฆ = Image gradient evaluated at ๐‘Š๐‘Š(๐ฑ๐ฑ,๐ฉ๐ฉ) Jacobian of the warp ๐‘Š๐‘Š(๐ฑ๐ฑ,๐ฉ๐ฉ)


• By replacing I(W(x, p + Δp)) with its first-order approximation, we get:
• How do we minimize it?
  • We differentiate the SSD with respect to Δp and equate it to zero, i.e.,


๐‘†๐‘†๐‘†๐‘†๐‘†๐‘† = ๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ + โˆ†๐ฉ๐ฉ โˆ’ ๐‘‡๐‘‡(๐ฑ๐ฑ)๐Ÿ๐Ÿ

๐‘†๐‘†๐‘†๐‘†๐‘†๐‘† = ๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ +๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

โˆ†๐ฉ๐ฉ โˆ’ ๐‘‡๐‘‡(๐ฑ๐ฑ)๐Ÿ๐Ÿ

๐œ•๐œ•๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐œ•๐œ•โˆ†๐ฉ๐ฉ

= 0


๐œ•๐œ•๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐œ•๐œ•โˆ†๐ฉ๐ฉ

= 2๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

T

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ +๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

โˆ†๐ฉ๐ฉ โˆ’ ๐‘‡๐‘‡(๐ฑ๐ฑ)

๐œ•๐œ•๐‘†๐‘†๐‘†๐‘†๐‘†๐‘†๐œ•๐œ•โˆ†๐ฉ๐ฉ

= 0

2๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

T

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ +๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

โˆ†๐ฉ๐ฉ โˆ’ ๐‘‡๐‘‡(๐ฑ๐ฑ) = 0 โ‡’

๐‘†๐‘†๐‘†๐‘†๐‘†๐‘† = ๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ +๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

โˆ†๐ฉ๐ฉ โˆ’ ๐‘‡๐‘‡(๐ฑ๐ฑ)๐Ÿ๐Ÿ


โ‡’ โˆ†๐ฉ๐ฉ = ๐ป๐ปโˆ’1๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

T

๐‘‡๐‘‡ ๐ฑ๐ฑ โˆ’ ๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ =

๐ป๐ป = ๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

T

๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

Second moment matrix (Hessian) of the warped image

What does H look like when the warp is a pure translation?

Notice that these are NOT matrix products but pixel-wise products!

KLT algorithm

1. Warp I(x) with W(x, p) to obtain I(W(x, p))
2. Compute the error: subtract I(W(x, p)) from T(x)
3. Compute the warped gradients ∇I = (I_x, I_y), evaluated at W(x, p)
4. Evaluate the Jacobian of the warp ∂W/∂p
5. Compute the steepest-descent images ∇I · ∂W/∂p
6. Compute the inverse Hessian: H^{-1} = ( Σ_{x∈T} [∇I ∂W/∂p]^T [∇I ∂W/∂p] )^{-1}
7. Multiply the steepest-descent images with the error: Σ_{x∈T} [∇I ∂W/∂p]^T (T(x) − I(W(x, p)))
8. Compute Δp
9. Update the parameters: p ← p + Δp
10. Repeat until ‖Δp‖ < ε


โ‡’ โˆ†๐ฉ๐ฉ = ๐ป๐ปโˆ’1๏ฟฝ๐ฑ๐ฑโˆˆ๐“๐“

๐›ป๐›ป๐ผ๐ผ๐œ•๐œ•๐‘Š๐‘Š๐œ•๐œ•๐ฉ๐ฉ

T

๐‘‡๐‘‡ ๐ฑ๐ฑ โˆ’ ๐ผ๐ผ ๐‘Š๐‘Š ๐ฑ๐ฑ,๐ฉ๐ฉ

KLT algorithm: computing Δp

$$\Delta\mathbf{p} = H^{-1}\sum_{\mathbf{x}\in T} \Big[\nabla I\,\frac{\partial W}{\partial \mathbf{p}}\Big]^T \big(T(\mathbf{x}) - I(W(\mathbf{x}, \mathbf{p}))\big)$$

Sizes of the terms for an affine warp (6 parameters): Δp is 6×1, H is 6×6, ∇I is 1×2, ∂W/∂p is 2×6, so the steepest-descent term ∇I ∂W/∂p is 1×6, and the error T(x) − I(W(x, p)) is a scalar. The Jacobian has this shape because, for the affine warp,

$$\frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} x & 0 & y & 0 & 1 & 0 \\ 0 & x & 0 & y & 0 & 1 \end{bmatrix}$$

and H follows from the derivation of Δp above.


KLT algorithm: Discussion

Lucas-Kanade follows a predict-correct cycle

• A prediction I(W(x, p)) of the warped image is computed from an initial estimate of p
• The correction parameter Δp is computed as a function of the error T(x) − I(W(x, p)) between the prediction and the template
• The larger this error, the larger the correction applied

[Figure: predict-correct loop]



• How do we get the initial estimate of p?
• When does Lucas-Kanade fail?
  • If the initial estimate is too far off, the linear approximation no longer holds → solution?
    • Pyramidal implementations (see next slide)
• Other problems:
  • Deviations from the mathematical model: object deformations, illumination changes, etc.
  • Occlusions
  • Due to these reasons, tracking may drift → solution?
    • Update the template with the last image (a minimal update policy is sketched below)

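One possible (illustrative) update policy along these lines: only overwrite the template when the alignment residual is small, so that occlusions or bad alignments are not absorbed into it. The function name and threshold are assumptions.

import numpy as np

def maybe_update_template(T, I_w, max_residual=25.0):
    """Replace the template with the aligned patch I(W(x, p)) only if the fit is good."""
    if np.mean((T - I_w) ** 2) < max_residual:
        return I_w.copy()
    return T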

Coarse-to-fine estimation

[Figure: coarse-to-fine estimation. Image I and template T are represented as image pyramids; the warp is estimated at the coarsest level and refined (p ← p + Δp) level by level. A displacement of u = 10 pixels at full resolution becomes 5, 2.5, and 1.25 pixels at the successive pyramid levels, small enough for the linearization to hold.]

[Figure: pyramidal KLT. Pyramids are built for both I and T; starting from the initial parameters p_in at the coarsest level, each level warps I, refines p by Δp, and passes the result to the next finer level until the final estimate p_out is obtained.]
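A coarse-to-fine wrapper can be sketched as follows (illustrative; it reuses klt_translation_template, the hypothetical single-level routine from the earlier sketch, and assumes grayscale float32 images):

import numpy as np
import cv2

def coarse_to_fine_klt(I, T, p0, num_levels=3):
    """Run single-level KLT from the coarsest pyramid level to the finest."""
    pyr_I, pyr_T = [I], [T]
    for _ in range(num_levels - 1):
        pyr_I.append(cv2.pyrDown(pyr_I[-1]))
        pyr_T.append(cv2.pyrDown(pyr_T[-1]))
    p = np.asarray(p0, dtype=np.float64) / (2 ** (num_levels - 1))  # motion shrinks with the image
    for level in reversed(range(num_levels)):
        p = klt_translation_template(pyr_I[level], pyr_T[level], p)
        if level > 0:
            p = 2.0 * p                                             # propagate to the next finer level
    return p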

Generalization of KLT

• The same concept (predict/correct) can be applied to the tracking of a 3D object (in this case, what is the transformation to estimate? What is the template?)


• In order to deal with wrong predictions, it can be implemented in a particle-filter fashion (using multiple hypotheses that need to be validated)


Outline

• Point tracking
• Template tracking
• Tracking by detection of local image features

Tracking by detection of local image features

• Step 1: Keypoint detection and matching
  • invariant to scale, rotation, or perspective

[Figure: template image with the object to detect and current test image]


• Step 2: Geometric verification (RANSAC) (e.g., 4-point RANSAC for planar objects, or 5- or 8-point RANSAC for 3D objects)
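A hedged end-to-end sketch of both steps using OpenCV (ORB keypoints, brute-force matching, and a 4-point RANSAC homography for a planar object; file names and parameters are illustrative):

import numpy as np
import cv2

template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# Step 1: keypoint detection, description, and matching
orb = cv2.ORB_create(nfeatures=1000)
kp_t, des_t = orb.detectAndCompute(template, None)
kp_f, des_f = orb.detectAndCompute(frame, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des_t, des_f)

# Step 2: geometric verification with RANSAC (homography, since the object is planar)
pts_t = np.float32([kp_t[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
pts_f = np.float32([kp_f[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inlier_mask = cv2.findHomography(pts_t, pts_f, cv2.RANSAC, 3.0)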


Tracking issues

• How to segment the object to track from the background?
• How to initialize the warping?
• How to handle occlusions?
• How to handle illumination changes and non-modeled effects?


Readings

• Chapter 8 of Szeliski's book, 1st edition


Understanding Check

Are you able to answer the following questions?
• Are you able to illustrate tracking with block matching?
• Are you able to explain the underlying assumptions behind differential methods, derive their mathematical expression, and the meaning of the M matrix?
• When is this matrix invertible, and when not?
• What is the aperture problem, and how can we overcome it?
• What is optical flow?
• Can you list pros and cons of block-based vs. differential methods for tracking?
• Are you able to describe the working principle of KLT?
• What functional does KLT minimize?
• What is the Hessian matrix, and for which warping function does it coincide with that used for point tracking?
• Can you list Lucas-Kanade failure cases and how to overcome them?
• How do we get the initial guess?
• Can you illustrate the coarse-to-fine Lucas-Kanade implementation?
• Can you illustrate alternative tracking procedures using point features?

