Advanced Computer Vision
CH6. Feature-based Alignment
Professor: Prof. Fuh
Presenter: Nick Chu (祝成豪)
Teaching Approach
• Cover the main parts of textbook chapter 6.
• Extra resources: EGGN 512 course videos from Electrical Engineering & Computer Science, Colorado School of Mines (CSM/Mines).
Main Content
• Alignment Algorithm: 2D alignment using the Least Square Error method, RANdom SAmple Consensus (RANSAC)
• Pose Estimation
• Geometric Intrinsic Calibration
Feature-based Alignment Algorithm
• Feature-based methods: only use feature points to estimate parameters
• Features: SIFT, SURF, Harris corner detection, etc. (Recall Ch4.)
Application sneak peek
1. Image stitching (Panography)
2. Calibration
3. Augmented Reality (AR)
Distortion review
1. Barrel
2. Pincushion
3. Mustache
6.1 2D and 3D feature-based alignment
1. Estimating motion between two or more sets of matched 2D or 3D points.
2. Applications to non-rigid or elastic deformations will not be covered in this chapter.
3. Restrict ourselves to global parametric transformations (Fig 6.2) or higher order (in space) transformation for curved surfaces.
2D planar transformation
6.1.1 2D Alignment using least squares
• Given a set of matched feature points $\{(x_i, x'_i)\}$ and a planar parametric transformation $x' = f(x; p)$
• What's the best estimate of p?
• We can use least squares, i.e., minimize the sum of squared residuals:
$E_{LS} = \sum_i \|r_i\|^2 = \sum_i \|f(x_i; p) - x'_i\|^2$
Determine pairwise alignment
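• p' = Mp, where M is a transformation matrix and p and p' are matched feature points
• It is possible to use more complicated models such as affine or perspective
• For example, assume M is a 2x2 matrix:
$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$
• Find M with the least square error:
$\sum_{i=1}^{n} \|M p_i - p'_i\|^2$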
Determine pairwise alignment
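• Overdetermined system
$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$
Each match $(x_i, y_i) \rightarrow (x'_i, y'_i)$ contributes two equations:
$x_i m_{11} + y_i m_{12} = x'_i$
$x_i m_{21} + y_i m_{22} = y'_i$
Stacking all n matches:
$\begin{bmatrix} x_1 & y_1 & 0 & 0 \\ 0 & 0 & x_1 & y_1 \\ x_2 & y_2 & 0 & 0 \\ 0 & 0 & x_2 & y_2 \\ \vdots & \vdots & \vdots & \vdots \\ x_n & y_n & 0 & 0 \\ 0 & 0 & x_n & y_n \end{bmatrix} \begin{bmatrix} m_{11} \\ m_{12} \\ m_{21} \\ m_{22} \end{bmatrix} = \begin{bmatrix} x'_1 \\ y'_1 \\ x'_2 \\ y'_2 \\ \vdots \\ x'_n \\ y'_n \end{bmatrix}$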
Normal Equation
• Given an overdetermined system $Ax = b$
• the normal equation is the one whose solution minimizes the sum of the squared differences between the left and right sides
• By minimizing $E(x) = \|Ax - b\|^2$
• We'll get $(A^T A)x = A^T b$
Normal Equation
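$E(x) = \|Ax - b\|^2$
$\begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$
A is n x m: n equations, m variables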
Normal Equation
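For the n x m system $Ax = b$:
$Ax - b = \begin{bmatrix} \sum_{j=1}^{m} a_{1j} x_j \\ \vdots \\ \sum_{j=1}^{m} a_{ij} x_j \\ \vdots \\ \sum_{j=1}^{m} a_{nj} x_j \end{bmatrix} - \begin{bmatrix} b_1 \\ \vdots \\ b_i \\ \vdots \\ b_n \end{bmatrix} = \begin{bmatrix} \left(\sum_{j=1}^{m} a_{1j} x_j\right) - b_1 \\ \vdots \\ \left(\sum_{j=1}^{m} a_{ij} x_j\right) - b_i \\ \vdots \\ \left(\sum_{j=1}^{m} a_{nj} x_j\right) - b_n \end{bmatrix}$
$E(x) = \|Ax - b\|^2 = \sum_{i=1}^{n} \left[ \left(\sum_{j=1}^{m} a_{ij} x_j\right) - b_i \right]^2$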
Normal Equation
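$E(x) = \|Ax - b\|^2 = \sum_{i=1}^{n} \left[ \left(\sum_{j=1}^{m} a_{ij} x_j\right) - b_i \right]^2$
Setting the derivative with respect to $x_1$ to zero:
$0 = \frac{\partial E}{\partial x_1} = \sum_{i=1}^{n} 2 \left[ \left(\sum_{j=1}^{m} a_{ij} x_j\right) - b_i \right] a_{i1} = 2 \sum_{i=1}^{n} a_{i1} \sum_{j=1}^{m} a_{ij} x_j - 2 \sum_{i=1}^{n} a_{i1} b_i$
Collecting the equations for all m variables:
$0 = \frac{\partial E}{\partial x} = 2(A^T A x - A^T b) \rightarrow A^T A x = A^T b$
where
$A^T = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{bmatrix}$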
Normal Equation
• Original over-determined system: $Ax = b$
• Normal Equation: $(A^T A)x = A^T b$
• Pseudo Inverse: $x = (A^T A)^{-1} A^T b$, with $A^{+} = (A^T A)^{-1} A^T$
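The normal equations for the 2x2 example can be sketched in a few lines of Python. This is only an illustration: the function name `fit_2x2` and the sample points are assumptions made here, not part of the slides. For this particular model the rows of M decouple, so $A^T A$ reduces to the 2x2 matrix $\sum_i p_i p_i^T$, which the sketch inverts in closed form.

```python
def fit_2x2(src, dst):
    """Least-squares fit of a 2x2 matrix M such that dst_i ~ M @ src_i.

    Solves the normal equations (A^T A) m = A^T b. For this model the
    system decouples per row of M, and A^T A reduces to S = sum p p^T.
    """
    sxx = sum(x * x for x, y in src)
    sxy = sum(x * y for x, y in src)
    syy = sum(y * y for x, y in src)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("degenerate point configuration")
    # Right-hand sides A^T b for the x' equations and the y' equations.
    bx = (sum(x * u for (x, y), (u, v) in zip(src, dst)),
          sum(y * u for (x, y), (u, v) in zip(src, dst)))
    by = (sum(x * v for (x, y), (u, v) in zip(src, dst)),
          sum(y * v for (x, y), (u, v) in zip(src, dst)))
    # Apply the closed-form inverse of the symmetric 2x2 matrix S.
    m11 = (syy * bx[0] - sxy * bx[1]) / det
    m12 = (-sxy * bx[0] + sxx * bx[1]) / det
    m21 = (syy * by[0] - sxy * by[1]) / det
    m22 = (-sxy * by[0] + sxx * by[1]) / det
    return [[m11, m12], [m21, m22]]


# Hypothetical matches generated from M = [[2, 1], [-1, 3]] (illustrative data).
src = [(1, 0), (0, 1), (1, 1), (2, -1)]
dst = [(2, -1), (1, 3), (3, 2), (3, -5)]
M = fit_2x2(src, dst)
```

With noise-free matches like these, the fit should recover the generating matrix up to rounding; with noisy matches it returns the least-squares compromise.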
Uncertainty weighting
• The above least squares formulation assumes that all feature points are matched with the same accuracy.
• This is often not the case, since certain points may fall into more textured regions than others.
• If we associate a scalar variance estimate $\sigma_i^2$ with each correspondence, we can minimize the weighted least squares problem instead:
$E_{WLS} = \sum_i \sigma_i^{-2} \|r_i\|^2 = \sum_i \sigma_i^{-2} \|f(x_i; p) - x'_i\|^2$
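For a model that is linear in its parameters, the weighting carries straight through to the normal equations. The short derivation below is a sketch under that assumption; the symbol $W$ (a diagonal matrix holding $\sigma_i^{-2}$ for every row that belongs to correspondence i) is introduced here for illustration and does not appear in the slides.

```latex
E_{WLS}(x) = (Ax - b)^T W (Ax - b), \qquad W = \mathrm{diag}(\sigma_1^{-2}, \ldots, \sigma_n^{-2})

\frac{\partial E_{WLS}}{\partial x} = 2 A^T W (Ax - b) = 0
\quad\Longrightarrow\quad
(A^T W A)\, x = A^T W b
```

Setting every $\sigma_i$ to the same value makes $W$ a multiple of the identity, and the ordinary normal equation $(A^T A)x = A^T b$ is recovered.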
EGGN Example
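Matched points:
pA: X = 213, 398, 401, 223; Y = 29, 20, 293, 297
pB: X = 207, 391, 339, 164; Y = 7, 34, 302, 270
Example:
$\begin{bmatrix} x_B \\ y_B \\ 1 \end{bmatrix} = \begin{bmatrix} c & -s & t_x \\ s & c & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_A \\ y_A \\ 1 \end{bmatrix}$
Rewritten as $Ax = b$:
$A = \begin{bmatrix} x_A^{(1)} & -y_A^{(1)} & 1 & 0 \\ y_A^{(1)} & x_A^{(1)} & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_A^{(N)} & -y_A^{(N)} & 1 & 0 \\ y_A^{(N)} & x_A^{(N)} & 0 & 1 \end{bmatrix}, \quad x = \begin{bmatrix} c \\ s \\ t_x \\ t_y \end{bmatrix}, \quad b = \begin{bmatrix} x_B^{(1)} \\ y_B^{(1)} \\ \vdots \\ x_B^{(N)} \\ y_B^{(N)} \end{bmatrix}$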
Matlab code
Pseudo inverse
Result
• Apply parameters back to image A.
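The slides solve this example in Matlab with the pseudo inverse. A rough Python equivalent, using only the standard library, might look like the sketch below. The helper names `solve_linear` and `fit_similarity` are assumptions made for illustration, and the point coordinates are the pA and pB values as read from the example table.

```python
def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def fit_similarity(pts_a, pts_b):
    """Estimate (c, s, tx, ty) for xB = c*xA - s*yA + tx, yB = s*xA + c*yA + ty."""
    A, b = [], []
    for (xa, ya), (xb, yb) in zip(pts_a, pts_b):
        A.append([xa, -ya, 1.0, 0.0])
        b.append(xb)
        A.append([ya, xa, 0.0, 1.0])
        b.append(yb)
    # Normal equations: (A^T A) x = A^T b.
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(4)] for i in range(4)]
    Atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(4)]
    return solve_linear(AtA, Atb)


# Point coordinates as read from the example table (pA and pB).
pA = [(213, 29), (398, 20), (401, 293), (223, 297)]
pB = [(207, 7), (391, 34), (339, 302), (164, 270)]
c, s, tx, ty = fit_similarity(pA, pB)
```

For these points the estimate comes out near c ≈ 0.98 and s ≈ 0.20, i.e. a rotation of roughly 11 degrees. Note that $c^2 + s^2$ is not forced to equal 1 here, because c and s are treated as independent unknowns.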
6.1.2 Panography
• Stitching = alignment + blending
– alignment: geometrical registration
– blending: photometric registration
Image Stitching to Panography
• Panorama – the whole picture
• Compact Camera FOV = 50 x 35°
• Human FOV = 200 x 135°
• Panoramic Mosaic = 360 x 180°
Image Stitching - Cylindrical panoramas
• Example:
• http://graphics.stanford.edu/courses/cs178/applets/projection.html
Application: Face Alignment
• Face alignment is the key preprocessing step for face recognition
• Examples from the LFW (Labeled Faces in the Wild) dataset:
6.1.3 Iterative algorithms
• Linear least squares is the simplest method for estimating parameters.
• Most problems do not have a simple linear relationship between the measurements and the unknowns.
• These lead to non-linear least squares or non-linear regression problems.
Non-Linear System
• Since we treated $c = \cos(\theta)$ and $s = \sin(\theta)$ as independent variables, we got a system of linear equations:
$x_B = c\,x_A - s\,y_A + t_x$
$y_B = s\,x_A + c\,y_A + t_y$
• But c and s are not independent; we should really solve for just 3 variables $(t_x, t_y, \theta)$, not 4 variables.
• This gives us a system of non-linear equations:
$x_B = x_A \cos(\theta) - y_A \sin(\theta) + t_x$
$y_B = x_A \sin(\theta) + y_A \cos(\theta) + t_y$
• Solvable, but needs an iterative method.
Non-Linear Least Squares
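• Example for a scalar function
– Given
  - A known function y = f(x)
  - A value of y, call it $y_1$
– Find $x_1$ such that $y_1 = f(x_1)$
  - We need a starting guess for x, call it $x_0$, with $y_0 = f(x_0)$
• Find the $\Delta x$ that takes us closer to $x_1$:
$dy = \frac{df}{dx}\,dx$
$\Delta y = y_1 - y_0$
$\Delta y \approx \frac{df}{dx}\,\Delta x$
$\Delta x = \Delta y \Big/ \frac{df}{dx}$
$x \leftarrow x_0 + \Delta x$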
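The scalar update above can be tried out directly. The sketch below is illustrative only: the function name `solve_scalar`, the tolerance, and the choice of f(x) = x² with target y₁ = 2 are assumptions made here, not taken from the slides.

```python
def solve_scalar(f, dfdx, y1, x0, tol=1e-10, max_iter=50):
    """Find x such that f(x) ~ y1 by repeated linearization about the current x."""
    x = x0
    for _ in range(max_iter):
        dy = y1 - f(x)       # residual: delta_y = y1 - y0
        if abs(dy) < tol:
            break
        x += dy / dfdx(x)    # delta_x = delta_y / (df/dx)
    return x


# Illustrative use: solve x^2 = 2 starting from the guess x0 = 1.
root = solve_scalar(lambda x: x * x, lambda x: 2 * x, y1=2.0, x0=1.0)
```

Each pass repeats the linearization at the new x, so the starting guess matters: a poor one can send the iteration to a different solution or make it diverge.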
Non-Linear Least Squares
• We have a nonlinear function y = f(x)
– x is a vector of our unknowns, $x = [x_1, x_2, \ldots, x_M]^T$
– y is a vector of our observations, $y = [y_1, y_2, \ldots, y_N]^T$
• We start with a guess for x, call it $x_0$
• We linearize (take the Taylor series expansion) about that point:
$f(x) = f(x_0) + \left[\frac{df}{dx}\right]_{x_0} (x - x_0) + \ldots$
$dy = \left[\frac{\partial f_i}{\partial x_j}\right]_{x_0} dx = J\,dx$
Non-Linear Least Squares
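• The matrix of partial derivatives of f with respect to x is called the Jacobian matrix.
$J = \left[\frac{\partial f_i}{\partial x_j}\right] = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_M} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_M} \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_N}{\partial x_1} & \frac{\partial f_N}{\partial x_2} & \cdots & \frac{\partial f_N}{\partial x_M} \end{bmatrix}$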
Non-Linear Iterative Least Squares
1. Initialize x to $x_0$
2. Compute $y_0 = f(x_0)$. Residual error is $dy = y_1 - y_0$
3. Calculate the Jacobian J of f and evaluate it at $x_0$. We now have $dy = J\,dx$
4. Solve for dx using the pseudo inverse: $dx = (J^T J)^{-1} J^T\,dy$
5. Set x to x + dx
6. Repeat steps 2-5 until convergence (no more change in x)
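The six steps can be sketched for the three-parameter rigid alignment $(\theta, t_x, t_y)$ from the Non-Linear System slide. This is a minimal sketch under stated assumptions: the function names, the synthetic test points, and the stopping tolerance are all illustrative, and the Jacobian is derived by hand from the two model equations.

```python
import math


def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def gauss_newton_rigid(pts_a, pts_b, x0=(0.0, 0.0, 0.0), iters=20, tol=1e-10):
    """Iteratively estimate (theta, tx, ty) mapping pts_a onto pts_b."""
    theta, tx, ty = x0                                   # step 1: initial guess
    for _ in range(iters):
        c, s = math.cos(theta), math.sin(theta)
        J, dy = [], []
        for (xa, ya), (xb, yb) in zip(pts_a, pts_b):
            # Step 2: residual = observed minus predicted.
            dy.append(xb - (c * xa - s * ya + tx))
            dy.append(yb - (s * xa + c * ya + ty))
            # Step 3: Jacobian rows with respect to (theta, tx, ty).
            J.append([-s * xa - c * ya, 1.0, 0.0])
            J.append([c * xa - s * ya, 0.0, 1.0])
        # Step 4: dx = (J^T J)^-1 J^T dy, solved without forming the inverse.
        JtJ = [[sum(r[i] * r[j] for r in J) for j in range(3)] for i in range(3)]
        Jtdy = [sum(r[i] * d for r, d in zip(J, dy)) for i in range(3)]
        dtheta, dtx, dty = solve_linear(JtJ, Jtdy)
        theta, tx, ty = theta + dtheta, tx + dtx, ty + dty   # step 5
        if max(abs(dtheta), abs(dtx), abs(dty)) < tol:       # step 6
            break
    return theta, tx, ty


# Synthetic check (illustrative values): rotate a square by 0.3 rad and shift it.
true_theta, true_tx, true_ty = 0.3, 5.0, -2.0
pts_a = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
pts_b = [(math.cos(true_theta) * x - math.sin(true_theta) * y + true_tx,
          math.sin(true_theta) * x + math.cos(true_theta) * y + true_ty)
         for x, y in pts_a]
theta, tx, ty = gauss_newton_rigid(pts_a, pts_b)
```

Solving the 3x3 system in step 4 directly, rather than forming $(J^T J)^{-1}$ explicitly, is a design choice made in this sketch; the result is the same pseudo-inverse update.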
Problems
• What if some of the matches are false? We need to avoid the impact of outliers, which motivates RANSAC.
Example
6.1.4 RANSAC Algorithm
• RANSAC = RANdom SAmple Consensus
• An iterative method to estimate parameters of a mathematical model from a set of observed data which contains outliers [Wiki]
• Compare to robust statistics
• Given N data points $x_i$, assume that the majority of them are generated from a model with parameters $\Theta$, and try to recover $\Theta$.
RANSAC Algorithm
Run k times: (How many times?)
(1) draw n samples randomly (How big? Smaller is better.)
(2) fit parameters $\Theta$ with these n samples
(3) for each of the other N-n points, calculate its distance to the fitted model and count the number of inlier points, c (How to define an inlier? Depends on the problem.)
Output $\Theta$ with the largest c
RANSAC Algorithm (Determine K)
$P = 1 - (1 - p^n)^k$
• $p^n$: the n samples are all inliers
• $1 - p^n$: a failure
• $(1 - p^n)^k$: failure after k trials
$k = \frac{\log(1 - P)}{\log(1 - p^n)}$
n    p     k
3    0.5   35
6    0.6   97
6    0.5   293
(for P = 0.99)
p: probability of real inliers
P: probability of success after k trials
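The table can be reproduced from the formula for k. The sketch below rounds up to the next whole trial; the function name `ransac_trials` is an assumption made for illustration.

```python
import math


def ransac_trials(n, p, P=0.99):
    """Smallest integer k with 1 - (1 - p**n)**k >= P."""
    return math.ceil(math.log(1.0 - P) / math.log(1.0 - p ** n))


# The three (n, p) settings from the table, for P = 0.99.
table = [(n, p, ransac_trials(n, p)) for n, p in [(3, 0.5), (6, 0.6), (6, 0.5)]]
```

The required number of trials grows quickly with the sample size n, which is why a smaller n is preferred.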
RANSAC Algorithm (How to determine K)
RANSAC Algorithm (Example: line fitting)
[Figure sequence: n = 2 points are sampled per trial; the candidate lines shown collect c = 3, c = 3, and c = 15 inliers]
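The line-fitting example can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the function name `ransac_line`, the distance threshold, the fixed random seed, and the synthetic points are all assumptions. For simplicity the inlier count includes the two sampled points themselves.

```python
import random


def ransac_line(points, k=100, threshold=1.0, seed=0):
    """Fit a line a*x + b*y + c = 0 (with a^2 + b^2 = 1) by RANSAC."""
    rng = random.Random(seed)
    best_line, best_count = None, -1
    for _ in range(k):
        # (1) draw n = 2 samples randomly
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # (2) fit the line through the two samples
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0.0:
            continue  # identical points define no line
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # (3) count points whose distance to the line is within the threshold
        count = sum(abs(a * x + b * y + c) <= threshold for x, y in points)
        if count > best_count:
            best_line, best_count = (a, b, c), count
    return best_line, best_count


# Illustrative data: 15 points on y = 2x + 1 plus 5 gross outliers.
inliers = [(x, 2 * x + 1) for x in range(15)]
outliers = [(3, 40), (7, -20), (10, 60), (1, 30), (12, -5)]
line, count = ransac_line(inliers + outliers)
```

A common follow-up, not shown here, is to refit the line to all inliers of the winning trial with ordinary least squares.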
RANSAC Algorithm (Example: Panorama)
[Figures: features from SIFT; RANSAC inliers]
6.2 Pose Estimation
• A particular instance of feature-based alignment is estimating an object's 3D pose from a set of 2D point projections.
• Also known as extrinsic calibration, as opposed to the intrinsic calibration of internal camera parameters such as focal length, which we discuss in Section 6.3.
6.2.1 Linear Algorithms
• Direct Linear Transform (DLT): directly solve for the elements of the camera projection matrix.
DLT
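• The projection of a 3D point ${}^W P$ in the world to a point $(x_{im}, y_{im})$ in the pixel image:
$\tilde{p} = K\,M_{ext}\,{}^W\tilde{P}$
where ${}^W\tilde{P} = [X, Y, Z, 1]^T$, $\tilde{p} = [x_1, x_2, x_3]^T$, $x_{im} = x_1 / x_3$, and $y_{im} = x_2 / x_3$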
DLT
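• Extrinsic camera matrix:
$M_{ext} = \left( {}^C_W R \;\middle|\; {}^C t_{Worg} \right) = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix}$
• Intrinsic camera matrix:
$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$
• We will solve for the 12 elements of the extrinsic matrix by treating them as independent (even though they are not!).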
Normalized Image Coordinates
• If we know the intrinsic camera matrix, we can convert the image points to “normalized” image coordinates.
- Origin is in the center of the image
- Effective focal length = 1
- $x_{normalized} = X/Z$, $y_{normalized} = Y/Z$
• Then $p_{unnormalized} = K\,p_{normalized}$, where
$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$
- Multiply by K inverse to normalize the focal length and the center of the image coordinate system: $p_{normalized} = K^{-1}\,p_{unnormalized}$
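Because K is upper triangular with no skew term, applying $K^{-1}$ has a simple closed form. The sketch below shows the round trip between pixel and normalized coordinates; the function names and the numeric camera parameters are illustrative assumptions.

```python
def project(X, Y, Z, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    return fx * X / Z + cx, fy * Y / Z + cy


def normalize_point(u, v, fx, fy, cx, cy):
    """Apply K^-1 to pixel (u, v) for K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    return (u - cx) / fx, (v - cy) / fy


# Illustrative camera: fx = fy = 800 pixels, principal point at (320, 240).
u, v = project(1.0, 2.0, 4.0, 800.0, 800.0, 320.0, 240.0)
xn, yn = normalize_point(u, v, 800.0, 800.0, 320.0, 240.0)
# xn and yn equal X/Z and Y/Z of the original 3D point.
```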
DLT
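• The projection of a 3D point P in the world to a normalized image point:
$\tilde{p}_n = M_{ext}\,P: \quad \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$
or
$x = \frac{\tilde{x}}{\tilde{z}} = \frac{r_{11}X + r_{12}Y + r_{13}Z + t_x}{r_{31}X + r_{32}Y + r_{33}Z + t_z}, \quad y = \frac{\tilde{y}}{\tilde{z}} = \frac{r_{21}X + r_{22}Y + r_{23}Z + t_y}{r_{31}X + r_{32}Y + r_{33}Z + t_z}$
where (x, y) are the normalized image coordinates.
• Linear system:
$r_{11}X + r_{12}Y + r_{13}Z + t_x - x\,(r_{31}X + r_{32}Y + r_{33}Z + t_z) = 0$
$r_{21}X + r_{22}Y + r_{23}Z + t_y - y\,(r_{31}X + r_{32}Y + r_{33}Z + t_z) = 0$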
DLT
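• Put into the form Ax = 0 (two rows per point correspondence):
$Ax = \begin{bmatrix} X & Y & Z & 0 & 0 & 0 & -xX & -xY & -xZ & 1 & 0 & -x \\ 0 & 0 & 0 & X & Y & Z & -yX & -yY & -yZ & 0 & 1 & -y \end{bmatrix} \begin{bmatrix} r_{11} \\ r_{12} \\ r_{13} \\ r_{21} \\ r_{22} \\ r_{23} \\ r_{31} \\ r_{32} \\ r_{33} \\ t_x \\ t_y \\ t_z \end{bmatrix} = 0$
• Solve using Singular Value Decomposition (SVD)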
Singular Value Decomposition
• $A = U D V^T$, where $U^T U = I$ and $V^T V = I$
• D is a diagonal matrix consisting of the singular values:
$D = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_N)$ where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_N \ge 0$
• To solve Ax = 0, we can take the SVD of A
- x is the column of V corresponding to the zero singular value of A
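With noisy measurements no singular value is exactly zero, so in practice the column of V for the smallest singular value is taken instead. The short argument below sketches why that choice minimizes $\|Ax\|$ among unit vectors; it assumes A has at least as many rows as columns, and writes $v_i$ for the i-th column of V (notation introduced here).

```latex
\|Ax\|^2 = x^T V D^T U^T U D V^T x = \sum_{i=1}^{N} \sigma_i^2 \,(v_i^T x)^2

\text{subject to } \|x\|^2 = \sum_{i=1}^{N} (v_i^T x)^2 = 1:
\qquad \|Ax\|^2 \ge \sigma_N^2, \quad \text{with equality at } x = \pm v_N
```

The unit-norm constraint rules out the trivial solution x = 0. Since the 12 unknowns are only defined up to scale, the result is usually rescaled afterwards, for example so that the rotation rows have unit length.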
Application: Augmented Reality
• The human brain performs pose estimation every single second!
• Virtual 3D images or annotations are superimposed on top of a live video feed.
• This is done either through see-through glasses (a head-mounted display) or on a regular computer or mobile device screen.
• In some applications, a special pattern printed on cards or in a book is tracked to perform the augmentation.
Microsoft Hololens
6.3 Geometric Intrinsic Calibration
• The computation of the intrinsic (internal) camera-calibration parameters and the estimation of the extrinsic (external) pose of the camera can occur simultaneously with respect to a known calibration set.
• In this section, we look at alternative formulations, the use of alternative calibration targets, and the estimation of the non-linear part of camera optics such as radial distortion.
6.3.1 Calibration patterns
• Using a calibration pattern is one of the more reliable ways to estimate a camera's intrinsic parameters.
• The target has easy-to-extract features.
Alternative method
• Place the camera on a large flat piece of cardboard and use a long metal ruler to draw lines on the cardboard that appear vertical in the image.
Nodal point
• Nodal points N1 and N2.
Planar Calibration Patterns
• N-planes calibration approach: use a finite workspace and accurate machining and motion control platforms to move a planar calibration target in a controlled fashion.
• A less accurate calibration can be obtained by waving a calibration pattern in front of the camera. In this case, the pattern's pose has (in principle) to be recovered in conjunction with the intrinsics.
DLT Recall
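• The projection of a 3D point P in the world to a normalized image point:
$\tilde{p}_n = M_{ext}\,P: \quad \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$
or
$x = \frac{r_{11}X + r_{12}Y + r_{13}Z + t_x}{r_{31}X + r_{32}Y + r_{33}Z + t_z}, \quad y = \frac{r_{21}X + r_{22}Y + r_{23}Z + t_y}{r_{31}X + r_{32}Y + r_{33}Z + t_z}$
• Linear system:
$r_{11}X + r_{12}Y + r_{13}Z + t_x - x\,(r_{31}X + r_{32}Y + r_{33}Z + t_z) = 0$
$r_{21}X + r_{22}Y + r_{23}Z + t_y - y\,(r_{31}X + r_{32}Y + r_{33}Z + t_z) = 0$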
6.3.2 Vanishing Points
• A common case for calibration.
• The camera is looking at a man-made scene with strong extended rectahedral objects such as boxes or room walls.
• Intersect the 2D lines corresponding to 3D parallel lines to compute their vanishing points (as in Section 4.3.3).
• Use these to determine the intrinsic and extrinsic calibration parameters.
Vanishing Points
Vanishing Points Calibration
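• Assume a simplified form for the calibration matrix K where only the focal length is unknown.
• For any vanishing point $\hat{x}_i$:
$\hat{x}_i = \begin{bmatrix} x_i - c_x \\ y_i - c_y \\ f \end{bmatrix} \sim R\,p_i = r_i$
where $p_i$ = (1,0,0), (0,1,0), or (0,0,1), and $r_i$ is the i-th column of R.
• Orthogonality between the columns (for $i \ne j$) gives:
$r_i \cdot r_j \sim (x_i - c_x)(x_j - c_x) + (y_i - c_y)(y_j - c_y) + f^2 = 0$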
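Solving the orthogonality constraint for f gives a direct estimate from any two vanishing points of orthogonal directions. The sketch below assumes the optical center $(c_x, c_y)$ is known (for example, taken as the image center); the function name and the numeric values are illustrative.

```python
import math


def focal_from_vanishing_points(vp_i, vp_j, center):
    """Solve (xi-cx)(xj-cx) + (yi-cy)(yj-cy) + f^2 = 0 for f."""
    (xi, yi), (xj, yj), (cx, cy) = vp_i, vp_j, center
    f_squared = -((xi - cx) * (xj - cx) + (yi - cy) * (yj - cy))
    if f_squared <= 0:
        raise ValueError("vanishing points are inconsistent with orthogonal directions")
    return math.sqrt(f_squared)


# Illustrative check: directions (1, 0, 1) and (-1, 0, 1) are orthogonal; with
# f = 500 and center (320, 240) they project to the vanishing points below.
f = focal_from_vanishing_points((820.0, 240.0), (-180.0, 240.0), (320.0, 240.0))
```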
Vanishing Points Calibration
• The accuracy of this estimate increases as the vanishing points move closer to the center of the image.
• It is best to tilt the calibration pattern a decent amount around the 45° axis.
Application: Single View Metrology
• Allows people to interactively measure heights and other dimensions as well as to build piecewise planar 3D models.
• Criminisi, Reid, and Zisserman (2000).
• First step: identify two orthogonal vanishing points on the ground plane and the vanishing point for the vertical direction.
• The user then marks a few dimensions in the image, such as the height of a reference object.
• The system can automatically compute the height of another object. Walls and other planar impostors (geometry) can also be sketched and reconstructed.
6.3.4 Rotational Motion
• Used when no calibration targets or known structures are available,
• but you can rotate the camera around its front nodal point (or, equivalently, work in a large open environment where all objects are distant).
• The camera can be calibrated from a set of overlapping images by assuming that it is undergoing pure rotational motion.
• The accuracy of this estimate is proportional to the total number of pixels in the resulting cylindrical panorama.
6.3.5 Radial Distortion
• When images are taken with wide-angle lenses, it is often necessary to model lens distortions such as radial distortion.
Radial distortion
• The simplest radial distortion models use low-order polynomials:
$\hat{x} = x\,(1 + \kappa_1 r^2 + \kappa_2 r^4)$
$\hat{y} = y\,(1 + \kappa_1 r^2 + \kappa_2 r^4)$
• where $r^2 = x^2 + y^2$, and $\kappa_1$ and $\kappa_2$ are the distortion parameters.
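The polynomial model is easy to apply in the forward direction. The sketch below also inverts it by fixed-point iteration, which is an illustrative choice made here rather than a method from the slides; it is only expected to converge for mild distortion. The function names and coefficient values are assumptions, and x, y are normalized image coordinates.

```python
def distort(x, y, k1, k2):
    """Apply x_hat = x * (1 + k1*r^2 + k2*r^4), and likewise for y."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def undistort(xd, yd, k1, k2, iters=20):
    """Invert the model by fixed-point iteration (assumes mild distortion)."""
    x, y = xd, yd
    for _ in range(iters):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x, y


# Illustrative coefficients: a negative k1 pulls points toward the center.
xd, yd = distort(0.3, 0.4, -0.2, 0.05)
xu, yu = undistort(xd, yd, -0.2, 0.05)
```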
Plumb Line Method
• One of the simplest and most useful methods.
• Take an image of a scene with a lot of straight lines, especially lines aligned with and near the edges of the image.
• The radial distortion parameters can then be adjusted until all of the lines in the image are straight.
Other Distortion
• More general models of lens distortion, such as fisheye and non-central projection, may sometimes be required.
• While the parameterization of such lenses may be more complicated, the general approach of either using calibration rigs with known 3D positions or self-calibration through the use of multiple overlapping images of a scene can still be used.
Sources
• EGGN 512 – Computer Vision course by Prof. William Hoff
• http://inside.mines.edu/~whoff/courses/EENG512/
• Playlist@Youtube:
• http://www.youtube.com/playlist?list=PL4B3F8D4A5CAD8DA3
• Digital Visual Effects course by Prof. Yung-Yu Chuang
• http://www.csie.ntu.edu.tw/~cyy/courses/vfx/15spring/news/
• Both are highly recommended
3D-2D Transforms
• EGGN 512 – Lecture 5-1 3D-2D Transforms
• http://www.youtube.com/watch?v=DFNjOUMuecU&index=11&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 5-2 3D-2D Transforms
• http://www.youtube.com/watch?v=5gesrLgNuQo&list=PL4B3F8D4A5CAD8DA3
Alignment Algorithm
• EGGN 512 – Lecture 14-1 Alignment
• http://www.youtube.com/watch?v=UcU4814hvR8&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 14-2 Alignment
• http://www.youtube.com/watch?v=XxEKMecNZk0&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 14-3 Alignment
• http://www.youtube.com/watch?v=lSXfv4baMwk&list=PL4B3F8D4A5CAD8DA3
• Slides
• http://inside.mines.edu/~whoff/courses/EENG512/lectures/15-AlignmentNonlinear.pdf
RANSAC Algorithm
• EGGN 512 – Lecture 27-1 RANSAC
• http://www.youtube.com/watch?v=NKxXGsZdDp8&list=PL4B3F8D4A5CAD8DA3&index=54
Pose Estimation
• EGGN 512 – Lecture 16-1 Pose Estimation
• http://www.youtube.com/watch?v=kq3c6QpcAGc&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 17-1 Pose from Lines
• http://www.youtube.com/watch?v=D_4eUoqgWdc&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 18-1 SVD
• http://www.youtube.com/watch?v=C852P1JrHXI&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 18-2 SVD
• http://www.youtube.com/watch?v=aIkzK4CdYes&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 19-1 Linear Pose Estimation
• http://www.youtube.com/watch?v=HojSSrxsB4Q&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 19-2 Linear Pose Estimation
• http://www.youtube.com/watch?v=ik8dFybnyPY&list=PL4B3F8D4A5CAD8DA3
• EGGN 512 – Lecture 19-3 Linear Pose Estimation
• http://www.youtube.com/watch?v=OS5b-3Xfn1M&list=PL4B3F8D4A5CAD8DA3