
Advanced Computer Vision
CH6. Feature-based Alignment

Professor: Prof. Fuh
Presenter: Nick Chu (祝成豪)

Teaching Approach

• Cover the main parts of textbook chapter 6.
• Extra resources from the EGGN 512 course videos, Electrical Engineering & Computer Science, Colorado School of Mines (CSM/Mines).

Main Content

• Alignment Algorithm: 2D alignment using the least-squares error method, RANdom SAmple Consensus (RANSAC)
• Pose Estimation
• Geometric Intrinsic Calibration

Feature-based Alignment Algorithm

• Feature-based methods: only use feature points to estimate parameters
• Features: SIFT, SURF, Harris corner detection, etc. (Recall Ch. 4.)

Application sneak peek

1. Image stitching (Panography)
2. Calibration
3. Augmented Reality (AR)

Distortion review

1. Barrel 2. Pincushion 3. Mustache

6.1 2D and 3D feature-based alignment

1. Estimating motion between two or more sets of matched 2D or 3D points.

2. Applications to non-rigid or elastic deformations will not be covered in this chapter.

3. Restrict ourselves to global parametric transformations (Fig 6.2) or higher order (in space) transformation for curved surfaces.

2D planar transformation


6.1.1 2D Alignment using least squares

• Given a set of matched feature points $\{(x_i, x'_i)\}$ and a planar parametric transformation $x' = f(x; p)$
• What's the best estimate of p?
• We can use least squares, i.e., minimize the sum of squared residuals:

$$E_{LS} = \sum_i \|r_i\|^2 = \sum_i \|f(x_i; p) - x'_i\|^2$$

Determine pairwise alignment

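• p' = Mp, where M is a transformation matrix, and p and p' are feature matches
• It is possible to use more complicated models such as affine or perspective
• For example, assume M is a 2x2 matrix:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

• Find M with the least square error:

$$\sum_{i=1}^{n} \|M p_i - p'_i\|^2$$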

Determine pairwise alignment

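• Overdetermined system

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Each match contributes two equations:

$$x_1 m_{11} + y_1 m_{12} = x'_1$$
$$x_1 m_{21} + y_1 m_{22} = y'_1$$

Stacking all n matches:

$$\begin{bmatrix} x_1 & y_1 & 0 & 0 \\ 0 & 0 & x_1 & y_1 \\ x_2 & y_2 & 0 & 0 \\ 0 & 0 & x_2 & y_2 \\ \vdots & \vdots & \vdots & \vdots \\ x_n & y_n & 0 & 0 \\ 0 & 0 & x_n & y_n \end{bmatrix} \begin{bmatrix} m_{11} \\ m_{12} \\ m_{21} \\ m_{22} \end{bmatrix} = \begin{bmatrix} x'_1 \\ y'_1 \\ x'_2 \\ y'_2 \\ \vdots \\ x'_n \\ y'_n \end{bmatrix}$$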
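As a rough illustration of the stacked system, the following Python sketch builds the 2n x 4 matrix and solves it in the least-squares sense. It assumes NumPy is available; the function name `fit_linear_2x2` and the sample points are made up for the example and are not from the slides.

```python
import numpy as np

def fit_linear_2x2(p, p_prime):
    """Least-squares 2x2 matrix M with p_prime ~ M @ p; inputs are (n, 2) arrays."""
    p = np.asarray(p, dtype=float)
    p_prime = np.asarray(p_prime, dtype=float)
    n = p.shape[0]
    A = np.zeros((2 * n, 4))
    A[0::2, 0:2] = p          # rows for x': [x  y  0  0]
    A[1::2, 2:4] = p          # rows for y': [0  0  x  y]
    b = p_prime.reshape(-1)   # [x1', y1', x2', y2', ...]
    m, *_ = np.linalg.lstsq(A, b, rcond=None)
    return m.reshape(2, 2)    # [[m11, m12], [m21, m22]]

# Hypothetical noise-free matches generated from a known matrix.
M_true = np.array([[0.9, -0.2], [0.3, 1.1]])
pts = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 3.0], [-1.0, 2.0], [4.0, -2.0]])
M_est = fit_linear_2x2(pts, pts @ M_true.T)
```

With noise-free matches the estimate should reproduce `M_true` up to floating-point error; with noisy matches it is the minimizer of the squared error above.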

Normal Equation

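• Given an overdetermined system

$$A\mathbf{x} = \mathbf{b}$$

• the normal equation is the one that minimizes the sum of the squared differences between the left and right sides

$$A^T A\,\mathbf{x} = A^T \mathbf{b}$$

• By minimizing

$$E(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|^2$$

• We'll get

$$(A^T A)\,\mathbf{x} = A^T \mathbf{b}$$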

Normal Equation

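$$E(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|^2$$

$$\begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$$

A is n x m: n equations, m variables.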

Normal Equation

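$$A\mathbf{x} - \mathbf{b} = \begin{bmatrix} \sum_{j=1}^{m} a_{1j} x_j \\ \vdots \\ \sum_{j=1}^{m} a_{ij} x_j \\ \vdots \\ \sum_{j=1}^{m} a_{nj} x_j \end{bmatrix} - \begin{bmatrix} b_1 \\ \vdots \\ b_i \\ \vdots \\ b_n \end{bmatrix} = \begin{bmatrix} \sum_{j=1}^{m} a_{1j} x_j - b_1 \\ \vdots \\ \sum_{j=1}^{m} a_{ij} x_j - b_i \\ \vdots \\ \sum_{j=1}^{m} a_{nj} x_j - b_n \end{bmatrix}$$

$$E(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|^2 = \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} a_{ij} x_j - b_i \Big)^2$$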

Normal Equation

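$$E(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|^2 = \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} a_{ij} x_j - b_i \Big)^2$$

$$0 = \frac{\partial E}{\partial x_1} = \sum_{i=1}^{n} 2 \Big( \sum_{j=1}^{m} a_{ij} x_j - b_i \Big) a_{i1} = 2 \sum_{i=1}^{n} a_{i1} \sum_{j=1}^{m} a_{ij} x_j - 2 \sum_{i=1}^{n} a_{i1} b_i$$

$$0 = \frac{\partial E}{\partial \mathbf{x}} = 2 (A^T A \mathbf{x} - A^T \mathbf{b}) \;\rightarrow\; A^T A \mathbf{x} = A^T \mathbf{b}$$

where

$$A = \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix}, \quad A^T = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{bmatrix}$$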

Normal Equation

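• Original over-determined system:

$$A\mathbf{x} = \mathbf{b}$$

• Normal equation:

$$(A^T A)\,\mathbf{x} = A^T \mathbf{b}$$

• Pseudo inverse:

$$\mathbf{x} = (A^T A)^{-1} A^T \mathbf{b}, \qquad A^{+} = (A^T A)^{-1} A^T$$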
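The three forms above can be compared numerically. The sketch below assumes NumPy; the small line-fitting system is invented for illustration. It solves the same overdetermined system through the normal equation, through the pseudo inverse, and through a library least-squares routine.

```python
import numpy as np

# Hypothetical overdetermined system: fit b ~ x0 + x1 * t at t = 0, 1, 2, 3.
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([1.0, 2.9, 5.1, 7.0])

# Normal equation: (A^T A) x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Pseudo inverse: x = A^+ b
x_pinv = np.linalg.pinv(A) @ b

# Library least-squares solver (generally the numerically safer choice)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# At the minimum, the gradient 2 (A^T A x - A^T b) vanishes.
gradient = 2.0 * (A.T @ A @ x_normal - A.T @ b)
```

Forming $A^T A$ explicitly squares the condition number of the problem, so for poorly conditioned systems a QR- or SVD-based solver such as `np.linalg.lstsq` is usually preferable.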


Uncertainty weighting

• The above least squares formulation assumes that all feature points are matched with the same accuracy.
• This is often not the case, since certain points may fall into more textured regions than others.
• If we associate a scalar variance estimate $\sigma_i^2$ with each correspondence, we can minimize the weighted least squares problem instead:

$$E_{WLS} = \sum_i \sigma_i^{-2} \|r_i\|^2 = \sum_i \sigma_i^{-2} \|f(x_i; p) - x'_i\|^2$$
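For a linear model, one way to minimize the weighted error is to scale each equation by $1/\sigma_i$ and then solve an ordinary least-squares problem. The sketch below assumes NumPy; `weighted_lstsq` and the toy data are illustrative names and values, not part of the slides.

```python
import numpy as np

def weighted_lstsq(A, b, sigma):
    """Minimize sum_i sigma_i^-2 * (A[i] @ x - b[i])^2 by row scaling."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float)   # one weight per equation row
    x, *_ = np.linalg.lstsq(A * w[:, None], b * w, rcond=None)
    return x

# Toy problem: estimate a constant from four measurements,
# the last of which is unreliable (large sigma).
A = np.ones((4, 1))
b = np.array([1.0, 1.0, 1.0, 10.0])
x_weighted = weighted_lstsq(A, b, sigma=[1.0, 1.0, 1.0, 100.0])
x_plain = weighted_lstsq(A, b, sigma=[1.0, 1.0, 1.0, 1.0])
```

In the 2D alignment setting each correspondence contributes two rows (x and y), so its $\sigma_i$ would be repeated for both rows.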

EGGN Example

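Corresponding points in images A and B (one column per point; first row X, second row Y):

$$p_A = \begin{bmatrix} 213 & 398 & 401 & 223 \\ 29 & 20 & 293 & 297 \end{bmatrix}, \qquad p_B = \begin{bmatrix} 207 & 391 & 339 & 164 \\ 7 & 34 & 302 & 270 \end{bmatrix}$$

Example:

$$\begin{bmatrix} x_B \\ y_B \\ 1 \end{bmatrix} = \begin{bmatrix} c & -s & t_x \\ s & c & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_A \\ y_A \\ 1 \end{bmatrix}$$

Written as $A\mathbf{x} = \mathbf{b}$:

$$A = \begin{bmatrix} x_A^{(1)} & -y_A^{(1)} & 1 & 0 \\ y_A^{(1)} & x_A^{(1)} & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_A^{(N)} & -y_A^{(N)} & 1 & 0 \\ y_A^{(N)} & x_A^{(N)} & 0 & 1 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} c \\ s \\ t_x \\ t_y \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} x_B^{(1)} \\ y_B^{(1)} \\ \vdots \\ x_B^{(N)} \\ y_B^{(N)} \end{bmatrix}$$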

Matlab code

Pseudo inverse

Result

• Apply parameters back to image A.
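The original slides show Matlab code as an image, which did not survive extraction. As a stand-in, here is a minimal Python sketch (assuming NumPy) of the same linear solve with the pseudo inverse. The point coordinates are transcribed from the garbled slide, so their order should be treated as an assumption.

```python
import numpy as np

# Point coordinates as read from the slide (assumed: row 0 = X, row 1 = Y,
# one column per point, columns of pA and pB in corresponding order).
pA = np.array([[213, 398, 401, 223],
               [29, 20, 293, 297]], dtype=float)
pB = np.array([[207, 391, 339, 164],
               [7, 34, 302, 270]], dtype=float)

N = pA.shape[1]
A = np.zeros((2 * N, 4))
b = np.zeros(2 * N)
for i in range(N):
    xa, ya = pA[:, i]
    A[2 * i] = [xa, -ya, 1, 0]       # x_B = c*x_A - s*y_A + t_x
    A[2 * i + 1] = [ya, xa, 0, 1]    # y_B = s*x_A + c*y_A + t_y
    b[2 * i], b[2 * i + 1] = pB[:, i]

# Pseudo-inverse solution x = (A^T A)^-1 A^T b, with x = [c, s, t_x, t_y]
c, s, tx, ty = np.linalg.pinv(A) @ b
theta_deg = np.degrees(np.arctan2(s, c))
residual = A @ np.array([c, s, tx, ty]) - b
```

Because c and s are solved independently, $c^2 + s^2$ is only approximately 1; the deviation can be read as a small scale change or as the effect of measurement noise.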

6.1.2 Panography

• Stitching = alignment (geometrical registration) + blending (photometric registration)

Image Stitching to Panography

• Panorama – the whole picture
• Compact Camera FOV = 50 x 35°
• Human FOV = 200 x 135°
• Panoramic Mosaic = 360 x 180°

Image Stitching - Cylindrical panoramas

• Example: http://graphics.stanford.edu/courses/cs178/applets/projection.html

Application: Face Alignment

• Face alignment is the key preprocessing step for face recognition

Application: Face Alignment

• Examples from the LFW (Labeled Faces in the Wild) dataset:


6.1.3 Iterative algorithms

• Linear least squares is the simplest method for estimating parameters.
• Most problems do not have a simple linear relationship between the measurements and the unknowns.
• These are non-linear least squares, or non-linear regression, problems.

Non-Linear System

• Since we treated $c = \cos(\theta)$ and $s = \sin(\theta)$ as independent variables, we got a system of linear equations:

$$x_B = c\,x_A - s\,y_A + t_x$$
$$y_B = s\,x_A + c\,y_A + t_y$$

• But c and s are not independent; we should really just solve for 3 variables $(t_x, t_y, \theta)$, not 4 variables.
• This gives us a system of non-linear equations:

$$x_B = \cos(\theta)\,x_A - \sin(\theta)\,y_A + t_x$$
$$y_B = \sin(\theta)\,x_A + \cos(\theta)\,y_A + t_y$$

• Solvable, but needs an iterative method.

Non-Linear Least Squares

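• Example for a scalar function
– Given
- A known function y = f(x)
- A value of y, call it $y_1$
– Find $x_1$ such that $y_1 = f(x_1)$
- We need a starting guess for x, call it $x_0$
• Find the $\Delta x$ to take us closer to $x_1$

$$y_0 = f(x_0), \qquad \Delta y = y_1 - y_0$$

$$dy = \frac{df}{dx}\,dx \;\Rightarrow\; \Delta y \approx \frac{df}{dx}\,\Delta x$$

$$\Delta x = \frac{\Delta y}{df/dx}, \qquad x = x_0 + \Delta x$$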


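• We have a nonlinear function y = f(x)
– x is a vector of our unknowns, $\mathbf{x} = (x_1, x_2, \ldots, x_M)^T$
– y is a vector of our observations, $\mathbf{y} = (y_1, y_2, \ldots, y_N)^T$
• We start with a guess for x, call it $\mathbf{x}_0$
• We linearize (take the Taylor series expansion) about that point:

$$f(\mathbf{x}) \approx f(\mathbf{x}_0) + \left[ \frac{df}{d\mathbf{x}} \right]_{\mathbf{x}_0} (\mathbf{x} - \mathbf{x}_0)$$

$$d\mathbf{y} = J\,d\mathbf{x}, \qquad J = \left[ \frac{\partial f_i}{\partial x_j} \right]_{\mathbf{x}_0}$$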

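• The matrix of partial derivatives of f with respect to x is called the Jacobian matrix.

$$J = \left[ \frac{\partial f_i}{\partial x_j} \right] = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_M} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_M} \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_N}{\partial x_1} & \frac{\partial f_N}{\partial x_2} & \cdots & \frac{\partial f_N}{\partial x_M} \end{bmatrix}$$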

Non-Linear Iterative Least Squares

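1. Initialize x to $\mathbf{x}_0$
2. Compute $\mathbf{y}_0 = f(\mathbf{x}_0)$. The residual error is $d\mathbf{y} = \mathbf{y}_1 - \mathbf{y}_0$ (observed minus predicted)
3. Calculate the Jacobian of f and evaluate it at $\mathbf{x}_0$. We now have $d\mathbf{y} = J\,d\mathbf{x}$
4. Solve for dx using the pseudo inverse: $d\mathbf{x} = (J^T J)^{-1} J^T\,d\mathbf{y}$
5. Set x to x + dx
6. Repeat steps 2-5 until convergence (no more change in x)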
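The six steps can be sketched in Python for the three-parameter rigid alignment $(t_x, t_y, \theta)$ introduced earlier. This is a minimal sketch assuming NumPy and noise-free synthetic data; the function names, starting guess, and stopping tolerance are choices made for the example.

```python
import numpy as np

def predict(params, pts):
    """Apply rotation theta and translation (tx, ty) to (n, 2) points; returns a flat vector."""
    tx, ty, theta = params
    c, s = np.cos(theta), np.sin(theta)
    x, y = pts[:, 0], pts[:, 1]
    return np.column_stack([c * x - s * y + tx, s * x + c * y + ty]).reshape(-1)

def jacobian(params, pts):
    """Partial derivatives of predict() with respect to (tx, ty, theta)."""
    theta = params[2]
    c, s = np.cos(theta), np.sin(theta)
    x, y = pts[:, 0], pts[:, 1]
    J = np.zeros((2 * len(pts), 3))
    J[0::2, 0] = 1.0                # d x_B / d t_x
    J[1::2, 1] = 1.0                # d y_B / d t_y
    J[0::2, 2] = -s * x - c * y     # d x_B / d theta
    J[1::2, 2] = c * x - s * y      # d y_B / d theta
    return J

def gauss_newton(pts_a, pts_b, x0, max_iter=50, tol=1e-10):
    x = np.asarray(x0, dtype=float)           # step 1: initial guess
    y = pts_b.reshape(-1)
    for _ in range(max_iter):
        dy = y - predict(x, pts_a)            # step 2: residual
        J = jacobian(x, pts_a)                # step 3: Jacobian at current x
        dx = np.linalg.solve(J.T @ J, J.T @ dy)  # step 4: (J^T J)^-1 J^T dy
        x = x + dx                            # step 5: update
        if np.linalg.norm(dx) < tol:          # step 6: stop when x stops changing
            break
    return x

pts_a = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
true_params = np.array([2.0, -1.0, 0.3])
pts_b = predict(true_params, pts_a).reshape(-1, 2)
est = gauss_newton(pts_a, pts_b, x0=[0.0, 0.0, 0.0])
```

As with any local method, convergence depends on the starting guess; a common choice is to initialize from the linear (4-variable) solution.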

Problems

• What if some of the matches are false? We need to avoid the impact of outliers → RANSAC

Example


6.1.4 RANSAC Algorithm

• RANSAC = RANdom SAmple Consensus
• An iterative method to estimate parameters of a mathematical model from a set of observed data which contains outliers [Wiki]
• Compare to robust statistics
• Given N data points $x_i$, assume that the majority of them are generated from a model with parameters $\Theta$; try to recover $\Theta$.

RANSAC Algorithm

Run k times:
(1) draw n samples randomly
(2) fit parameters $\Theta$ with these n samples
(3) for each of the other N-n points, calculate its distance to the fitted model and count the number of inlier points, c
Output $\Theta$ with the largest c

• k: How many times?
• n: How big? Smaller is better.
• Inlier distance: How to define? Depends on the problem.
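The loop above can be sketched for 2D line fitting (n = 2) in plain Python. The function names, the distance threshold, the fixed random seed, and the synthetic points are all assumptions made for this illustration.

```python
import math
import random

def fit_line(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None                       # identical points define no line
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, k=100, threshold=0.1, seed=0):
    rng = random.Random(seed)
    best_line, best_count = None, -1
    for _ in range(k):
        i, j = rng.sample(range(len(points)), 2)    # (1) draw n = 2 samples
        line = fit_line(points[i], points[j])        # (2) fit the model
        if line is None:
            continue
        a, b, c = line
        count = sum(                                  # (3) count inliers among the other N-n points
            1 for idx, (x, y) in enumerate(points)
            if idx not in (i, j) and abs(a * x + b * y + c) <= threshold
        )
        if count > best_count:
            best_line, best_count = line, count
    return best_line, best_count

# Synthetic data: 15 points on y = 2x + 1 plus 5 gross outliers.
inliers = [(x * 0.5, 2.0 * (x * 0.5) + 1.0) for x in range(15)]
outliers = [(1.0, 9.0), (3.0, -4.0), (5.0, 0.0), (6.0, 20.0), (2.0, 12.0)]
line, count = ransac_line(inliers + outliers)
```

In practice the winning model is usually re-fitted with least squares on all of its inliers, since a fit to only n samples is noisy.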

RANSAC Algorithm (Determine K)

$$P = 1 - (1 - p^n)^k$$

• $p^n$: the n samples are all inliers
• $1 - p^n$: a failure
• $(1 - p^n)^k$: failure after k trials

$$k = \frac{\log(1 - P)}{\log(1 - p^n)}$$

For P = 0.99:

n    p     k
3    0.5   35
6    0.6   97
6    0.5   293

• p: probability of real inliers
• P: probability of success after k trials
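The table can be reproduced directly from the formula for k. A small sketch in Python follows; the function name `ransac_trials` is made up, and rounding up to the next integer is assumed.

```python
import math

def ransac_trials(n, p, P=0.99):
    """Smallest integer k with 1 - (1 - p**n)**k >= P."""
    return math.ceil(math.log(1.0 - P) / math.log(1.0 - p ** n))

k_values = [ransac_trials(3, 0.5), ransac_trials(6, 0.6), ransac_trials(6, 0.5)]
# k_values == [35, 97, 293], matching the table for P = 0.99
```

The required number of trials grows quickly with the sample size n, which is why a smaller n is preferred.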

RANSAC Algorithm (How to determine K)

RANSAC Algorithm (Example: line fitting)


• n = 2 (two points define a line)
• Inlier counts shown for successive trials: c = 3, c = 3, c = 15

RANSAC Algorithm (Example: Panorama)

Features from SIFT


RANSAC inliers



6.2 Pose Estimation

• A particular instance of feature-based alignment is estimating an object's 3D pose from a set of 2D point projections.
• Also known as extrinsic calibration, as opposed to the intrinsic calibration of internal camera parameters such as focal length, which we discuss in Section 6.3.

6.2.1 Linear Algorithms

• Direct Linear Transform (DLT): directly solve for the elements of the camera projection matrix.

DLT

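• The projection of a 3D point ${}^W P$ in the world to a point $(x_{im}, y_{im})$ in the pixel image:

$$\tilde{p} = K\,M_{ext}\,{}^W P$$

$$\tilde{p} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad {}^W P = \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

$$x_{im} = x_1 / x_3, \qquad y_{im} = x_2 / x_3$$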

DLT

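• Extrinsic camera matrix:

$$M_{ext} = \begin{bmatrix} {}^C_W R & {}^C t_{Worg} \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_X \\ r_{21} & r_{22} & r_{23} & t_Y \\ r_{31} & r_{32} & r_{33} & t_Z \end{bmatrix}$$

• Intrinsic camera matrix:

$$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

• We will solve for the 12 elements of the extrinsic matrix by treating them as independent (even though they are not!).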

Normalized Image Coordinates

• If we know the intrinsic camera matrix, we can convert the image points to "normalized" image coordinates.
- Origin is at the center of the image
- Effective focal length = 1
- $x_{normalized} = X/Z$, $y_{normalized} = Y/Z$
• Then

$$p_{unnormalized} = K\,p_{normalized}, \qquad K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

- Multiply by K inverse to normalize the focal length and the center of the image coordinate system:

$$p_{normalized} = K^{-1}\,p_{unnormalized}$$
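For the K shown above (no skew), multiplying by $K^{-1}$ reduces to subtracting the image center and dividing by the focal length. A minimal Python sketch follows; the function names and the numeric intrinsics are hypothetical.

```python
def normalize_point(x_im, y_im, fx, fy, cx, cy):
    """Pixel coordinates -> normalized image coordinates (p_n = K^-1 p)."""
    return (x_im - cx) / fx, (y_im - cy) / fy

def unnormalize_point(xn, yn, fx, fy, cx, cy):
    """Normalized image coordinates -> pixel coordinates (p = K p_n)."""
    return fx * xn + cx, fy * yn + cy

# Hypothetical intrinsics: f_x = f_y = 800 pixels, image center (320, 240).
intrinsics = dict(fx=800.0, fy=800.0, cx=320.0, cy=240.0)
center = normalize_point(320.0, 240.0, **intrinsics)     # the image center maps to (0, 0)
xn, yn = normalize_point(720.0, 40.0, **intrinsics)
back = unnormalize_point(xn, yn, **intrinsics)
```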

DLT

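• The projection of a 3D point P in the world to a normalized image point:

$$\tilde{p}_n = M_{ext} P = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

or

$$x_{normalized} = \frac{r_{11} X + r_{12} Y + r_{13} Z + t_x}{r_{31} X + r_{32} Y + r_{33} Z + t_z}, \qquad y_{normalized} = \frac{r_{21} X + r_{22} Y + r_{23} Z + t_y}{r_{31} X + r_{32} Y + r_{33} Z + t_z}$$

• Linear system (writing x, y for the normalized image coordinates):

$$r_{11} X + r_{12} Y + r_{13} Z + t_x - x\,(r_{31} X + r_{32} Y + r_{33} Z + t_z) = 0$$
$$r_{21} X + r_{22} Y + r_{23} Z + t_y - y\,(r_{31} X + r_{32} Y + r_{33} Z + t_z) = 0$$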

DLT

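• Put into the form Ax = 0 (two rows per point):

$$A\mathbf{x} = \begin{bmatrix} X & Y & Z & 0 & 0 & 0 & -xX & -xY & -xZ & 1 & 0 & -x \\ 0 & 0 & 0 & X & Y & Z & -yX & -yY & -yZ & 0 & 1 & -y \end{bmatrix} \begin{bmatrix} r_{11} \\ r_{12} \\ r_{13} \\ r_{21} \\ r_{22} \\ r_{23} \\ r_{31} \\ r_{32} \\ r_{33} \\ t_x \\ t_y \\ t_z \end{bmatrix} = \mathbf{0}$$

• Solve using Singular Value Decomposition (SVD)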

Singular Value Decomposition

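$$A = U D V^T$$

• where $U^T U = I$ and $V^T V = I$
• D is a diagonal matrix consisting of the singular values:

$$D = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_N \end{bmatrix}, \quad \text{where } \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_N \ge 0$$

• To solve Ax = 0, we can take the SVD of A
- x is the column of V corresponding to the zero singular value of A (with noisy data, the smallest singular value)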
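Putting the Ax = 0 system and the SVD together, a linear pose estimate might look like the Python sketch below. It assumes NumPy, normalized image coordinates, at least six non-coplanar points, and noise-free synthetic data; `dlt_pose` and the test pose are illustrative, and the scale and sign fix-up is one common convention rather than the slides' own.

```python
import numpy as np

def dlt_pose(world_pts, norm_pts):
    """Estimate the 3x4 matrix [R | t] from world points and normalized image points."""
    rows = []
    for (X, Y, Z), (x, y) in zip(world_pts, norm_pts):
        rows.append([X, Y, Z, 0, 0, 0, -x * X, -x * Y, -x * Z, 1, 0, -x])
        rows.append([0, 0, 0, X, Y, Z, -y * X, -y * Y, -y * Z, 0, 1, -y])
    A = np.array(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    v = Vt[-1]                          # right singular vector of the smallest singular value
    R, t = v[:9].reshape(3, 3), v[9:]
    scale = np.linalg.norm(R[2])        # rows of a true rotation have unit length
    R, t = R / scale, t / scale
    X0 = np.asarray(world_pts[0], dtype=float)
    if R[2] @ X0 + t[2] < 0:            # choose the sign that puts the point in front of the camera
        R, t = -R, -t
    return np.column_stack([R, t])

# Synthetic test: known pose, points projected to normalized coordinates.
rng = np.random.default_rng(0)
world = rng.uniform(-1.0, 1.0, size=(10, 3))
angle = 0.4
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([0.2, -0.1, 5.0])
cam = world @ R_true.T + t_true
norm_pts = cam[:, :2] / cam[:, 2:3]
M_est = dlt_pose(world, norm_pts)
```

With noisy measurements the recovered 3x3 block is generally not an exact rotation, because the 12 elements were treated as independent; it is typically projected onto the nearest rotation or refined with a non-linear method afterwards.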

Application: Augmented Reality

• The human brain performs pose estimation every single second!
• Virtual 3D images or annotations are superimposed on top of a live video feed.
• Either through the use of see-through glasses (a head-mounted display) or on a regular computer or mobile device screen.
• In some applications, a special pattern printed on cards or in a book is tracked to perform the augmentation.

Microsoft Hololens

6.3 Geometric Intrinsic Calibration

• The computation of the intrinsic (internal) camera-calibration parameters and the estimation of the extrinsic (external) pose of the camera can occur simultaneously with respect to a known calibration set.
• In this section, we look at alternative formulations, the use of alternative calibration targets, and the estimation of the non-linear part of camera optics such as radial distortion.

6.3.1 Calibration patterns

• One of the more reliable ways to estimate camera intrinsic parameters. Target has easy-to-extract features.

Alternative method

• Place the camera on a large flat piece of cardboard and use a long metal ruler to draw lines on the cardboard that appear vertical in the image.

Nodal point

• Nodal points N1 and N2.

Planar Calibration Patterns

• N-planes calibration approach: with a finite workspace and accurate machining and motion control platforms, move a planar calibration target in a controlled fashion.
• A less accurate calibration can be obtained by waving a calibration pattern in front of a camera. In this case, the pattern's pose has (in principle) to be recovered in conjunction with the intrinsics.

DLT Recall

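• The projection of a 3D point P in the world to a normalized image point:

$$\tilde{p}_n = M_{ext} P = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

or

$$x_{normalized} = \frac{r_{11} X + r_{12} Y + r_{13} Z + t_x}{r_{31} X + r_{32} Y + r_{33} Z + t_z}, \qquad y_{normalized} = \frac{r_{21} X + r_{22} Y + r_{23} Z + t_y}{r_{31} X + r_{32} Y + r_{33} Z + t_z}$$

• Linear system (writing x, y for the normalized image coordinates):

$$r_{11} X + r_{12} Y + r_{13} Z + t_x - x\,(r_{31} X + r_{32} Y + r_{33} Z + t_z) = 0$$
$$r_{21} X + r_{22} Y + r_{23} Z + t_y - y\,(r_{31} X + r_{32} Y + r_{33} Z + t_z) = 0$$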


6.3.2 Vanishing Points

• A common case for calibration.
• The camera is looking at a man-made scene with strong extended rectahedral objects such as boxes or room walls.
• Intersect the 2D lines corresponding to 3D parallel lines to compute their vanishing points (as in Section 4.3.3).
• Use these to determine the intrinsic and extrinsic calibration parameters.

Vanishing Points

Vanishing Points Calibration

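• Assume a simplified form for the calibration matrix K where only the focal length is unknown.
• For any vanishing point $\hat{x}_i$:

$$\hat{x}_i = \begin{bmatrix} x_i - c_x \\ y_i - c_y \\ f \end{bmatrix} \sim R\,p_i = r_i$$

where $p_i$ is $(1,0,0)$ or $(0,1,0)$ or $(0,0,1)$.

• Orthogonality between two rotation columns $r_i$ and $r_j$ ($i \ne j$) gives:

$$r_i \cdot r_j \sim (x_i - c_x)(x_j - c_x) + (y_i - c_y)(y_j - c_y) + f^2 = 0$$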
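Solving the orthogonality constraint for f gives $f^2 = -\big[(x_i - c_x)(x_j - c_x) + (y_i - c_y)(y_j - c_y)\big]$. The Python sketch below applies this to one pair of vanishing points; the function name and the numeric values are made up for illustration, and the image center is assumed known.

```python
import math

def focal_from_vanishing_points(vp_i, vp_j, center):
    """Focal length from two vanishing points of orthogonal 3D directions."""
    (xi, yi), (xj, yj), (cx, cy) = vp_i, vp_j, center
    f_squared = -((xi - cx) * (xj - cx) + (yi - cy) * (yj - cy))
    if f_squared <= 0:
        raise ValueError("vanishing points are not consistent with orthogonal directions")
    return math.sqrt(f_squared)

# Synthetic check: directions (2, 1, 2)/3 and (-2, 2, 1)/3 are orthogonal;
# with f = 800 and center (320, 240) they project to these vanishing points.
f = focal_from_vanishing_points((1120.0, 640.0), (-1280.0, 1840.0), (320.0, 240.0))
```

With three mutually orthogonal vanishing points there are three such pairs, so the estimates can be averaged or the image center can be estimated as well.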

Vanishing Points Calibration

• The accuracy of this estimate increases as the vanishing points move closer to the center of the image.
• It is best to tilt the calibration pattern a decent amount around the 45° axis.

Application: Single View Metrology

• Allows people to interactively measure heights and other dimensions as well as to build piecewise planar 3D models.
• Criminisi, Reid, and Zisserman (2000).
• First step: identify two orthogonal vanishing points on the ground plane and the vanishing point for the vertical direction.
• The user then marks a few dimensions in the image, such as the height of a reference object.
• The system can automatically compute the height of another object. Walls and other planar impostors (geometry) can also be sketched and reconstructed.

6.3.4 Rotational Motion

• When no calibration targets or known structures are available.
• But you can rotate the camera around its front nodal point (or, equivalently, work in a large open environment where all objects are distant).
• The camera can be calibrated from a set of overlapping images by assuming that it is undergoing pure rotational motion.
• The accuracy of this estimate is proportional to the total number of pixels in the resulting cylindrical panorama.

6.3.5 Radial Distortion

• When images are taken with wide-angle lenses, it is often necessary to model lens distortions such as radial distortion.

Radial distortion

• The simplest radial distortion models use low-order polynomials:

$$\hat{x} = x\,(1 + \kappa_1 r^2 + \kappa_2 r^4)$$
$$\hat{y} = y\,(1 + \kappa_1 r^2 + \kappa_2 r^4)$$

• Where $r^2 = x^2 + y^2$ and $\kappa_1, \kappa_2$ are the distortion parameters.
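The polynomial model translates directly into code. The sketch below is plain Python; the function name `distort` and the parameter values are illustrative, and x, y are assumed to be coordinates measured relative to the distortion center.

```python
def distort(x, y, k1, k2):
    """Apply the low-order radial distortion model to a point (x, y)."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2   # 1 + k1*r^2 + k2*r^4
    return x * factor, y * factor

# With k1 < 0 points move toward the center; the center itself is unchanged.
x_hat, y_hat = distort(0.5, 0.0, k1=-0.2, k2=0.0)
origin = distort(0.0, 0.0, k1=-0.2, k2=0.05)
```

Undistorting a point means inverting this mapping, which has no closed form in general and is usually done iteratively.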

Plumb Line Method

• One of the simplest and most useful methods.
• Take an image of a scene with a lot of straight lines, especially lines aligned with and near the edges of the image.
• The radial distortion parameters can then be adjusted until all of the lines in the image are straight.

Other Distortion

• More general models of lens distortion, such as fisheye and non-central projection, may sometimes be required, although the parameterization of such lenses may be more complicated.
• The general approaches of using calibration rigs with known 3D positions or self-calibration through multiple overlapping images of a scene can both still be used.


Sources

• EGGN 512 – Computer Vision course by Prof. William Hoff
• http://inside.mines.edu/~whoff/courses/EENG512/
• Playlist on YouTube: http://www.youtube.com/playlist?list=PL4B3F8D4A5CAD8DA3
• Digital Visual Effects course by Prof. Yung-Yu Chuang
• http://www.csie.ntu.edu.tw/~cyy/courses/vfx/15spring/news/
• Both are highly recommended

3D-2D Transforms

• EGGN 512 – Lecture 5-1 3D-2D Transforms
• http://www.youtube.com/watch?v=DFNjOUMuecU&index=11&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 5-2 3D-2D Transforms
• http://www.youtube.com/watch?v=5gesrLgNuQo&list=PL4B3F8D4A5CAD8DA3

Alignment Algorithm

• EGGN 512 – Lecture 14-1 Alignment
• http://www.youtube.com/watch?v=UcU4814hvR8&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 14-2 Alignment
• http://www.youtube.com/watch?v=XxEKMecNZk0&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 14-3 Alignment
• http://www.youtube.com/watch?v=lSXfv4baMwk&list=PL4B3F8D4A5CAD8DA3
• Slides
• http://inside.mines.edu/~whoff/courses/EENG512/lectures/15-AlignmentNonlinear.pdf

RANSAC Algorithm

• EGGN 512 – Lecture 27-1 RANSAC
• http://www.youtube.com/watch?v=NKxXGsZdDp8&list=PL4B3F8D4A5CAD8DA3&index=54

Pose Estimation

• EGGN 512 – Lecture 16-1 Pose Estimation
• http://www.youtube.com/watch?v=kq3c6QpcAGc&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 17-1 Pose from Lines
• http://www.youtube.com/watch?v=D_4eUoqgWdc&list=PL4B3F8D4A5CAD8DA3

Pose Estimation

• EGGN 512 – Lecture 18-1 SVD
• http://www.youtube.com/watch?v=C852P1JrHXI&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 18-2 SVD
• http://www.youtube.com/watch?v=aIkzK4CdYes&list=PL4B3F8D4A5CAD8DA3

Pose Estimation

• EGGN 512 – Lecture 19-1 Linear Pose Estimation
• http://www.youtube.com/watch?v=HojSSrxsB4Q&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 19-2 Linear Pose Estimation
• http://www.youtube.com/watch?v=ik8dFybnyPY&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 19-3 Linear Pose Estimation
• http://www.youtube.com/watch?v=OS5b-3Xfn1M&list=PL4B3F8D4A5CAD8DA3