Alignment and Object Instance Recognition Computer Vision Jia-Bin Huang, Virginia Tech Many slides from S. Lazebnik and D. Hoiem

Source: jbhuang/teaching/ece5554-4554/fa17/lectures/Lecture_10...


Alignment and Object Instance Recognition

Computer Vision

Jia-Bin Huang, Virginia Tech

Many slides from S. Lazebnik and D. Hoiem

Administrative Stuff

• HW 2 due 11:59 PM Oct 9

Anonymous feedback

• Lectures
  • Microphone on your shirt.
  • A little quick to follow
  • Make the topics easier to understand

Thanks for the feedback. I will adjust the contents.

• Homeworks
  • Minor mistakes in HW1
  • Too many assignments.
  • You are expecting too much.

Will provide explicit starter code structure and hints.

• Other aspects
  • Real-world demo first, then explain the foundation

Will do.


Today’s class

• Review fitting

• Alignment

• Object instance recognition

• Going over HW 2

Previous class

• Global optimization / Search for parameters
  • Least squares fit
  • Robust least squares
  • Iterative closest point (ICP)

• Hypothesize and test
  • Generalized Hough transform
  • RANSAC

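Least squares line fitting

• Data: (x1, y1), …, (xn, yn)

• Line equation: yi = m xi + b

• Find (m, b) to minimize

$$E = \sum_{i=1}^{n} (y_i - m x_i - b)^2$$

[Figure: data points (xi, yi) and the fitted line y = mx + b]

$$E = \sum_{i=1}^{n} \left( \begin{bmatrix} x_i & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} - y_i \right)^2 = \left\| \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} - \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\|^2 = \| A p - y \|^2$$

$$E = y^T y - 2 (A p)^T y + (A p)^T (A p)$$

$$\frac{dE}{dp} = 2 A^T A p - 2 A^T y = 0$$

$$A^T A p = A^T y \;\Rightarrow\; p = (A^T A)^{-1} A^T y$$

Matlab: p = A \ y;

Modified from S. Lazebnik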

Least squares line fitting

function [m, b] = lsqfit(x, y)
% y = mx + b
% find line that best predicts y given x
% minimize sum_i (m*x_i + b - y_i).^2
A = [x(:) ones(numel(x), 1)]; % matrix A: one row [x_i 1] per point
yv = y(:);                    % vector y as a column
p = A\yv;                     % parameters p = [m; b]
m = p(1);
b = p(2);
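As a quick sanity check on the derivation, the following sketch compares the backslash solve with the normal equations on a small noise-free data set. The data values and the names `p_backslash` and `p_normal` are illustrative assumptions, not part of the lecture code.

```matlab
% Illustrative, noise-free points on the line y = 2x + 1
x = [0; 1; 2; 3];
y = 2*x + 1;

A = [x ones(numel(x), 1)];      % one row [x_i 1] per point

p_backslash = A \ y;            % least squares solve, as on the slide
p_normal = (A'*A) \ (A'*y);     % normal equations: A'A p = A'y

m = p_backslash(1);             % slope
b = p_backslash(2);             % intercept
```

Both solves should agree up to rounding. In practice the backslash form is usually preferred because it avoids forming A'A explicitly.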

Total least squares

Find (a, b, c) to minimize the sum of squared perpendicular distances

$$E = \sum_{i=1}^{n} (a x_i + b y_i + c)^2$$

[Figure: point (xi, yi) and the line ax+by+c=0 with unit normal N=(a, b)]

$$\frac{\partial E}{\partial c} = \sum_{i=1}^{n} 2 (a x_i + b y_i + c) = 0 \;\Rightarrow\; c = -\frac{a}{n} \sum_{i=1}^{n} x_i - \frac{b}{n} \sum_{i=1}^{n} y_i = -a \bar{x} - b \bar{y}$$

$$E = \sum_{i=1}^{n} \big( a (x_i - \bar{x}) + b (y_i - \bar{y}) \big)^2 = \left\| \begin{bmatrix} x_1 - \bar{x} & y_1 - \bar{y} \\ \vdots & \vdots \\ x_n - \bar{x} & y_n - \bar{y} \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} \right\|^2 = p^T A^T A p$$

$$\text{minimize } p^T A^T A p \;\text{ s.t. }\; p^T p = 1 \;\Rightarrow\; \text{minimize } \frac{p^T A^T A p}{p^T p}$$

Solution is eigenvector corresponding to smallest eigenvalue of A^T A

See details on Rayleigh Quotient: http://en.wikipedia.org/wiki/Rayleigh_quotient

Slide modified from S. Lazebnik

Total least squares

function [m, b, err] = total_lsqfit(x, y)
% ax + by + c = 0
% distance to line for (a^2+b^2=1): dist_sq = (ax + by + c).^2
A = [x(:)-mean(x) y(:)-mean(y)];
[v, d] = eig(A'*A);
[~, k] = min(diag(d)); % do not rely on the ordering returned by eig
p = v(:, k); % eigenvector corr. to smaller eigenvalue
% get a, b, c parameters
a = p(1);
b = p(2);
c = -(a*mean(x)+b*mean(y));
err = (a*x+b*y+c).^2;
% convert to slope-intercept (m, b); not defined for a vertical line (b = 0)
m = -a/b;
b = -c/b; % note: this b is for slope-intercept now

Robust Estimator

1. Initialize: e.g., choose 𝜃 by least squares fit and $\sigma = 1.5 \cdot \mathrm{median}(\mathrm{error})$

2. Choose params to minimize:

$$\sum_i \frac{\mathrm{error}(\theta, \mathrm{data}_i)^2}{\sigma^2 + \mathrm{error}(\theta, \mathrm{data}_i)^2}$$

   • E.g., numerical optimization

3. Compute new $\sigma = 1.5 \cdot \mathrm{median}(\mathrm{error})$

4. Repeat (2) and (3) until convergence

function [m, b] = robust_lsqfit(x, y)
% iterative robust fit y = mx + b
% find line that best predicts y given x
% minimize sum_i err_i^2 / (sigma^2 + err_i^2), with err_i = m*x_i + b - y_i
% geterr is not defined on the slides; it is assumed to evaluate this robust cost
[m, b] = lsqfit(x, y);
p = [m ; b];
err = abs(y-p(1)*x-p(2));
sigma = median(err)*1.5;
for k = 1:7
    p = fminunc(@(p)geterr(p,x,y,sigma), p);
    err = abs(y-p(1)*x-p(2));
    sigma = median(err)*1.5;
end
m = p(1);
b = p(2);
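The slides do not show `geterr`. A minimal sketch of what it might look like, assuming it evaluates the robust cost from the "Robust Estimator" slide, is given here as an anonymous function. The data are made up, and `fminsearch` (base MATLAB) stands in for `fminunc` so the sketch does not need the Optimization Toolbox; both choices are assumptions, not the lecture's implementation.

```matlab
% Hypothetical robust cost: sum_i err_i^2 / (sigma^2 + err_i^2),
% with err_i = y_i - m*x_i - b and p = [m; b]
geterr = @(p, x, y, sigma) ...
    sum((y - p(1)*x - p(2)).^2 ./ (sigma^2 + (y - p(1)*x - p(2)).^2));

% Illustrative data: roughly y = 2x + 1, with one gross outlier
x = (0:9)';
y = 2*x + 1 + 0.1*sin(3*x);
y(10) = -30;                           % outlier

p = [x ones(numel(x), 1)] \ y;         % least squares initialization
for k = 1:7
    err = abs(y - p(1)*x - p(2));
    sigma = max(1.5*median(err), eps); % robust scale, guarded against zero
    % fminsearch is swapped in for fminunc (an assumption of this sketch)
    p = fminsearch(@(q) geterr(q, x, y, sigma), p);
end
m = p(1);
b = p(2);
```

Because each residual contributes at most 1 to the cost, a single gross outlier cannot dominate the fit the way it does in ordinary least squares.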

Hough transform

P.V.C. Hough, Machine Analysis of Bubble Chamber Pictures, Proc. Int. Conf. High Energy Accelerators and Instrumentation, 1959

[Figure: a line in image space (x, y) and the corresponding Hough space]

Use a polar representation for the parameter space

$$x \cos\theta + y \sin\theta = \rho$$

Slide from S. Savarese

function [m, b] = houghfit(x, y)
% y = mx + b
% x*cos(theta) + y*sin(theta) = r
% find the (theta, r) bin that receives the most votes
thetas = (-pi+pi/50):(pi/100):pi;
costhetas = cos(thetas);
sinthetas = sin(thetas);
minr = 0; stepr = 0.005; maxr = 1;
% count hough votes (round so the bin count is an exact integer)
counts = zeros(numel(thetas), round((maxr-minr)/stepr)+1);
for k = 1:numel(x)
    r = x(k)*costhetas + y(k)*sinthetas;
    % only count parameters within the range of r
    inrange = find(r >= minr & r <= maxr);
    rnum = round((r(inrange)-minr)/stepr)+1;
    ind = sub2ind(size(counts), inrange, rnum);
    counts(ind) = counts(ind) + 1;
end
% smooth the bin counts
counts = imfilter(counts, fspecial('gaussian', 5, 0.75));
% get best theta, rho
[maxval, maxind] = max(counts(:));
[thetaind, rind] = ind2sub(size(counts), maxind);
theta = thetas(thetaind);
r = minr + stepr*(rind-1);
% convert to slope-intercept
b = r/sin(theta);
m = -cos(theta)/sin(theta);

RANSAC

[Figure: line hypothesis with N_I = 14 inliers]

Algorithm:

1. Sample (randomly) the number of points required to fit the model (#=2)
2. Solve for model parameters using samples
3. Score by the fraction of inliers within a preset threshold of the model

Repeat 1-3 until the best model is found with high confidence
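"High confidence" can be made concrete with the standard rule for the number of RANSAC trials, which is not stated on the slide: to draw at least one all-inlier sample with probability p, given inlier fraction w and sample size s, use N ≥ log(1 − p) / log(1 − w^s). The values here are illustrative assumptions.

```matlab
p = 0.99;   % desired probability of drawing at least one all-inlier sample
w = 0.5;    % assumed fraction of inliers in the data
s = 2;      % points per sample (2 for a line)

N = ceil(log(1 - p) / log(1 - w^s));   % number of trials, here 17
```

The required number of trials grows quickly with the sample size s, which is one reason RANSAC is usually applied to models with a moderate number of parameters.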

function [m, b] = ransacfit(x, y)
% y = mx + b
N = 200; % number of random trials
thresh = 0.03; % inlier threshold on the vertical residual
bestcount = 0;
for k = 1:N
    % sample two points and fit a line through them
    rp = randperm(numel(x));
    tx = x(rp(1:2));
    ty = y(rp(1:2));
    m = (ty(2)-ty(1)) ./ (tx(2)-tx(1));
    b = ty(2)-m*tx(2);
    % score by the number of inliers
    nin = sum(abs(y-m*x-b)<thresh);
    if nin > bestcount
        bestcount = nin;
        inliers = (abs(y - m*x - b) < thresh);
    end
end
% total least squares fitting on inliers
[m, b] = total_lsqfit(x(inliers), y(inliers));

Line fitting demo

demo_linefit(npts, outliers, noise, method)

• npts: number of points

• outliers: number of outliers

• noise: noise level

• method
  • lsq: least squares
  • tlsq: total least squares
  • rlsq: robust least squares
  • hough: hough transform
  • ransac: RANSAC

Which algorithm should I use?

• If we know which points belong to the line, how do we find the “optimal” line parameters?
  • Least squares

• What if there are outliers?
  • Robust fitting, RANSAC

• What if there are many lines?
  • Voting methods: RANSAC, Hough transform

Slide credit: S. Lazebnik

Alignment as fitting

• Previous lectures: fitting a model to features in one image

Find model M that minimizes

$$\sum_i \mathrm{residual}(x_i, M)$$

• Alignment: fitting a model to a transformation between pairs of features (matches) in two images

Find transformation T that minimizes

$$\sum_i \mathrm{residual}(T(x_i), x_i')$$

[Figure: model M fitted to points x_i; transformation T mapping points x_i to x_i']


What if you want to align but have no prior matched pairs?

• Hough transform and RANSAC not applicable

• Important applications

Medical imaging: match brain scans or contours

Robotics: match point clouds


Iterative Closest Points (ICP) Algorithm

Goal: estimate transform between two dense sets of points

1. Initialize transformation (e.g., compute difference in means and scale)

2. Assign each point in {Set 1} to its nearest neighbor in {Set 2}

3. Estimate transformation parameters
   • e.g., least squares or robust least squares

4. Transform the points in {Set 1} using estimated parameters

5. Repeat steps 2-4 until change is very small

Example: solving for translation

[Figure: matched points A1, A2, A3 and B1, B2, B3]

Given matched points in {A} and {B}, estimate the translation of the object

$$\begin{bmatrix} x_i^B \\ y_i^B \end{bmatrix} = \begin{bmatrix} x_i^A \\ y_i^A \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

Example: solving for translation

[Figure: matched points A1, A2, A3 and B1, B2, B3, related by the translation (tx, ty)]

Least squares solution

1. Write down objective function
2. Derived solution
   a) Compute derivative
   b) Compute solution
3. Computational solution
   a) Write in form Ax=b
   b) Solve using pseudo-inverse or eigenvalue decomposition

$$\begin{bmatrix} x_i^B \\ y_i^B \end{bmatrix} = \begin{bmatrix} x_i^A \\ y_i^A \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} t_x \\ t_y \end{bmatrix} = \begin{bmatrix} x_1^B - x_1^A \\ y_1^B - y_1^A \\ \vdots \\ x_n^B - x_n^A \\ y_n^B - y_n^A \end{bmatrix}$$
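The least squares steps for translation can be sketched as follows, assuming matched points stored one per row in the hypothetical arrays `A` and `B`. The point values are made up for illustration.

```matlab
% Illustrative matched points: row i of A matches row i of B
A = [0 0; 1 0; 0 2; 3 1];
t_true = [2 -1];
B = bsxfun(@plus, A, t_true);

n = size(A, 1);
M = repmat(eye(2), n, 1);        % stacked [1 0; 0 1] blocks, one per match
d = reshape((B - A)', [], 1);    % [xB1-xA1; yB1-yA1; ...; xBn-xAn; yBn-yAn]
t = M \ d;                       % least squares translation [tx; ty]

t_mean = mean(B - A, 1)';        % closed form: the mean displacement
```

For pure translation the linear system reduces to averaging the displacements, so `t` and `t_mean` should coincide.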

Example: solving for translation

[Figure: matched points A1 to A5 and B1 to B5, related by the translation (tx, ty)]

Problem: outliers

RANSAC solution

1. Sample a set of matching points (1 pair)
2. Solve for transformation parameters
3. Score parameters with number of inliers
4. Repeat steps 1-3 N times

$$\begin{bmatrix} x_i^B \\ y_i^B \end{bmatrix} = \begin{bmatrix} x_i^A \\ y_i^A \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$
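A minimal sketch of the RANSAC steps for translation, under stated assumptions: the matches, the threshold `thresh`, and the trial count `N` are all illustrative values, and a final mean over the inliers is added as a least squares refit.

```matlab
% Illustrative matches: 5 inliers translated by (2, -1) plus 2 wrong matches
A = [0 0; 1 0; 0 2; 3 1; 2 2; 1 1; 4 0];
B = bsxfun(@plus, A, [2 -1]);
B(6, :) = [9 9];                     % outlier match
B(7, :) = [-5 3];                    % outlier match

D = B - A;                           % displacement of every match
thresh = 0.1;                        % inlier distance threshold (assumed)
N = 50;                              % number of trials (assumed)
bestcount = 0;
for k = 1:N
    idx = randi(size(A, 1));         % 1. sample one matched pair
    t = D(idx, :);                   % 2. translation implied by that pair
    resid = sqrt(sum(bsxfun(@minus, D, t).^2, 2));
    inl = resid < thresh;            % 3. score by number of inliers
    if sum(inl) > bestcount
        bestcount = sum(inl);
        inliers = inl;
    end
end
t = mean(D(inliers, :), 1);          % refit on the inliers
```

A single pair is enough to hypothesize a translation, so each trial is cheap and only a few trials are needed even with many outliers.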

Example: solving for translation

[Figure: matched points A1 to A6 and B1 to B6, related by the translation (tx, ty)]

Problem: outliers, multiple objects, and/or many-to-one matches

Hough transform solution

1. Initialize a grid of parameter values
2. Each matched pair casts a vote for consistent values
3. Find the parameters with the most votes
4. Solve using least squares with inliers

$$\begin{bmatrix} x_i^B \\ y_i^B \end{bmatrix} = \begin{bmatrix} x_i^A \\ y_i^A \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$
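The voting steps above can be sketched for translation as follows. The grid range `tmin`/`tmax`, the bin width `step`, and the matches are assumptions chosen for illustration.

```matlab
% Illustrative matches: 5 inliers translated by (2, -1) plus 2 wrong matches
A = [0 0; 1 0; 0 2; 3 1; 2 2; 1 1; 4 0];
B = bsxfun(@plus, A, [2 -1]);
B(6, :) = [9 9];
B(7, :) = [-5 3];
D = B - A;                                   % displacement of every match

% 1. grid over (tx, ty)
tmin = -10; step = 0.5; tmax = 10;
nb = round((tmax - tmin)/step) + 1;
counts = zeros(nb, nb);

% 2. each match votes for the bin of its displacement
ix = round((D(:, 1) - tmin)/step) + 1;
iy = round((D(:, 2) - tmin)/step) + 1;
for k = 1:size(D, 1)
    if ix(k) >= 1 && ix(k) <= nb && iy(k) >= 1 && iy(k) <= nb
        counts(ix(k), iy(k)) = counts(ix(k), iy(k)) + 1;
    end
end

% 3. bin with the most votes
[~, maxind] = max(counts(:));
[bx, by] = ind2sub(size(counts), maxind);

% 4. least squares (mean displacement) over the matches in that bin
inliers = (ix == bx) & (iy == by);
t = mean(D(inliers, :), 1);
```

Unlike RANSAC, other well-supported bins in `counts` remain available, which is how voting can handle multiple objects.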

Example: solving for translation

[Figure: point sets A and B related by the translation (tx, ty)]

Problem: no initial guesses for correspondence

ICP solution

1. Find nearest neighbors for each point
2. Compute transform using matches
3. Move points using transform
4. Repeat steps 1-3 until convergence

$$\begin{bmatrix} x_i^B \\ y_i^B \end{bmatrix} = \begin{bmatrix} x_i^A \\ y_i^A \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$
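A translation-only ICP loop might look like the following sketch. The point sets `P` and `Q`, the brute-force nearest-neighbor search, and the stopping tolerance are illustrative assumptions; a real implementation would typically use a spatial index and a richer transformation model.

```matlab
% Illustrative point sets: Q is P shifted by a small translation,
% and no correspondences are given
P = [0 0; 4 0; 0 3; 5 5; 2 7];
Q = bsxfun(@plus, P, [0.4 -0.3]);

X = P;                                  % current position of set 1
t_total = [0 0];                        % accumulated translation
for iter = 1:20
    % 1. nearest neighbor in Q for each point of X (brute force)
    nn = zeros(size(X, 1), 1);
    for i = 1:size(X, 1)
        d2 = sum(bsxfun(@minus, Q, X(i, :)).^2, 2);
        [~, nn(i)] = min(d2);
    end
    % 2. translation from the current matches (mean displacement)
    t = mean(Q(nn, :) - X, 1);
    % 3. move the points
    X = bsxfun(@plus, X, t);
    t_total = t_total + t;
    % 4. stop once the update is negligible
    if norm(t) < 1e-9
        break;
    end
end
```

The example works because the shift is small relative to the point spacing, so the first nearest-neighbor assignment is already correct. With a large initial offset ICP can converge to a wrong local solution, which is why it is described as a local alignment method.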

Algorithm Summary

• Least Squares Fit
  • closed form solution
  • robust to noise
  • not robust to outliers

• Robust Least Squares
  • improves robustness to noise
  • requires iterative optimization

• Hough transform
  • robust to noise and outliers
  • can fit multiple models
  • only works for a few parameters (1-4 typically)

• RANSAC
  • robust to noise and outliers
  • works with a moderate number of parameters (e.g., 1-8)

• Iterative Closest Point (ICP)
  • For local alignment only: does not require initial correspondences

Alignment

• Alignment: find parameters of model that maps one set of points to another

• Typically want to solve for a global transformation that accounts for most true correspondences

• Difficulties
  • Noise (typically 1-3 pixels)
  • Outliers (often 30-50%)
  • Many-to-one matches or multiple objects

Parametric (global) warping

[Figure: transformation T maps p = (x,y) to p’ = (x’,y’)]

Transformation T is a coordinate-changing machine:

p’ = T(p)

What does it mean that T is global?
• Is the same for any point p
• can be described by just a few numbers (parameters)

For linear transformations, we can represent T as a matrix

p’ = Tp

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = T \begin{bmatrix} x \\ y \end{bmatrix}$$

Common transformations

[Figure: an original image and its transformed versions: translation, rotation, aspect, affine, perspective]

Slide credit (next few slides): A. Efros and/or S. Seitz

Scaling

• Scaling a coordinate means multiplying each of its components by a scalar

• Uniform scaling means this scalar is the same for all components:

[Figure: image scaled uniformly by a factor of 2]

• Non-uniform scaling: different scalars per component:

[Figure: image scaled by 2 in X and by 0.5 in Y]

Scaling

• Scaling operation:

$$x' = a x, \qquad y' = b y$$

• Or, in matrix form:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

The 2 × 2 matrix is the scaling matrix S.

2-D Rotation

[Figure: point (x, y) rotated by the angle θ to (x’, y’)]

x’ = x cos(θ) - y sin(θ)
y’ = x sin(θ) + y cos(θ)

2-D Rotation

[Figure: point (x, y) at radius r and angle φ, rotated by θ to (x’, y’)]

Polar coordinates…
x = r cos(φ)
y = r sin(φ)
x’ = r cos(φ + θ)
y’ = r sin(φ + θ)

Trig Identity…
x’ = r cos(φ) cos(θ) – r sin(φ) sin(θ)
y’ = r sin(φ) cos(θ) + r cos(φ) sin(θ)

Substitute…
x’ = x cos(θ) - y sin(θ)
y’ = x sin(θ) + y cos(θ)

2-D Rotation

This is easy to capture in matrix form:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \underbrace{\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}}_{R} \begin{bmatrix} x \\ y \end{bmatrix}$$

Even though sin(θ) and cos(θ) are nonlinear functions of θ,
• x’ is a linear combination of x and y
• y’ is a linear combination of x and y

What is the inverse transformation?
• Rotation by –θ
• For rotation matrices $R^{-1} = R^T$
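The inverse property of rotation matrices is easy to check numerically. The angle and test point in this sketch are arbitrary illustrative choices.

```matlab
theta = pi/6;                                   % illustrative angle
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];

p = [1; 0];
p_rot = R * p;                                  % rotate by theta
p_back = R' * p_rot;                            % the transpose undoes the rotation

% Rotation by -theta, built from the same formula
Rneg = [cos(-theta) -sin(-theta); sin(-theta) cos(-theta)];
```

`Rneg` and `R'` should agree up to rounding, and `R'*R` should be the identity.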

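Basic 2D transformations

Scale:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Shear:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & \alpha_x \\ \alpha_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Rotate:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$

Translate:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Affine:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Affine is any combination of translation, scale, rotation, shear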

Affine Transformations

Affine transformations are combinations of

• Linear transformations, and

• Translations

Properties of affine transformations:

• Lines map to lines

• Parallel lines remain parallel

• Ratios are preserved

• Closed under composition

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

or

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
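Closure under composition is convenient in the 3 × 3 homogeneous form, because chaining transformations becomes a matrix product. The following sketch composes an illustrative scale, rotation, and translation; the parameter values are assumptions.

```matlab
theta = pi/2;
S = [2 0 0; 0 0.5 0; 0 0 1];                                      % scale
R = [cos(theta) -sin(theta) 0; sin(theta) cos(theta) 0; 0 0 1];  % rotation
T = [1 0 3; 0 1 -1; 0 0 1];                                       % translation

M = T * R * S;        % scale first, then rotate, then translate
p = [1; 2; 1];        % the point (1, 2) in homogeneous coordinates
q = M * p;            % (1,2) -> scale (2,1) -> rotate (-1,2) -> translate (2,1)
```

The last row of `M` stays [0 0 1], so the product is again an affine transformation.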

Projective Transformations

$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$

Projective transformations are combos of

• Affine transformations, and

• Projective warps

Properties of projective transformations:

• Lines map to lines

• Parallel lines do not necessarily remain parallel

• Ratios are not preserved

• Closed under composition

• Models change of basis

• Projective matrix is defined up to a scale (8 DOF)
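To see why the matrix is only defined up to scale, the following sketch applies an illustrative homography `H` and a scaled copy `3*H` to the same point. The matrix entries are made-up values.

```matlab
H = [1 0.2 5; 0.1 1 -3; 0.001 0.002 1];   % illustrative projective matrix
p = [10; 20; 1];                           % point (10, 20), with w = 1

q = H * p;
xy = q(1:2) / q(3);                        % divide by w' to get image coordinates

q2 = (3*H) * p;                            % same matrix, scaled by 3
xy2 = q2(1:2) / q2(3);                     % the scale cancels in the division
```

Since any nonzero multiple of `H` gives the same mapping, only 8 of the 9 entries are free.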


Projective Transformations (homography)

• The transformation between two views of a planar surface

• The transformation between images from two cameras that share the same center


Application: Panorama stitching

Source: Hartley & Zisserman


Application: document scanning


2D image transformations (reference table)

Object Instance Recognition

1. Match keypoints to object model

2. Solve for affine transformation parameters

3. Score by inliers and choose solutions with score above threshold

[Figure: matched keypoints A1, A2, A3, then affine parameters, then # inliers; choose hypothesis with max score above threshold. Marked as the topic of this class.]

Overview of Keypoint Matching

1. Find a set of distinctive key-points

2. Define a region around each keypoint

3. Extract and normalize the region content

4. Compute a local descriptor from the normalized region

5. Match local descriptors

$$d(f_A, f_B) < T$$

[Figure: regions around keypoints A1, A2, A3, normalized to N × N pixels, with descriptors f_A and f_B (e.g. color)]

K. Grauman, B. Leibe


Finding the objects (overview)

1. Match interest points from input image to database image

2. Matched points vote for rough position/orientation/scale of object

3. Find position/orientation/scales that have at least three votes

4. Compute affine registration and matches using iterative least squares with outlier check

5. Report object if there are at least T matched points

[Figure: input image and stored image]

Matching Keypoints

• Want to match keypoints between:
  1. Query image
  2. Stored image containing the object

• Given descriptor x0, find two nearest neighbors x1, x2 with distances d1, d2

• x1 matches x0 if d1/d2 < 0.8
  • This gets rid of 90% of false matches and 5% of true matches in Lowe’s study
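The ratio test can be sketched in a few lines. The 2-D descriptors in `D1` and `D2` are made-up stand-ins (one descriptor per column); real SIFT descriptors are 128-dimensional, and the variable names are assumptions.

```matlab
% Illustrative descriptors, one per column
D1 = [0 5 6.5; 0 5 6.5];             % descriptors in image 1
D2 = [0.1 5.2 8 2; 0 4.9 8 9];       % descriptors in image 2

matches = zeros(2, 0);               % each column: [index in D1; index in D2]
for i = 1:size(D1, 2)
    % Euclidean distance from descriptor i to every descriptor in D2
    d = sqrt(sum(bsxfun(@minus, D2, D1(:, i)).^2, 1));
    [ds, order] = sort(d);
    if ds(1) / ds(2) < 0.8           % ratio test with the slide's threshold
        matches(:, end+1) = [i; order(1)];
    end
end
```

In this example the third descriptor is nearly equidistant from two candidates, so its ratio is close to 1 and the ambiguous match is discarded.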

Affine Object Model

• Accounts for 3D rotation of a surface under orthographic projection

Fitting an affine transformation

• Assume we know the correspondences, how do we get the transformation?

[Figure: matched points (x_i, y_i) and (x_i', y_i')]

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix}$$

$$\mathbf{x}_i' = \mathbf{M} \mathbf{x}_i + \mathbf{t}$$

Want to find M, t to minimize

$$\sum_{i=1}^{n} \| \mathbf{x}_i' - \mathbf{M} \mathbf{x}_i - \mathbf{t} \|^2$$

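Fitting an affine transformation

• Assume we know the correspondences, how do we get the transformation?

[Figure: matched points (x_i, y_i) and (x_i', y_i')]

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix}$$

$$\begin{bmatrix} & & \cdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \cdots & & & \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} \cdots \\ x_i' \\ y_i' \\ \cdots \end{bmatrix}$$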

Fitting an affine transformation

$$\begin{bmatrix} & & \cdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \cdots & & & \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} \cdots \\ x_i' \\ y_i' \\ \cdots \end{bmatrix}$$

• Linear system with six unknowns

• Each match gives us two linearly independent equations: need at least three to solve for the transformation parameters
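The six-unknown linear system can be assembled and solved directly. In this sketch the correspondences are generated from a known affine map so the result can be checked; `M_true`, `t_true`, and the point values are illustrative assumptions.

```matlab
% Illustrative correspondences generated from a known affine map
M_true = [1.2 -0.3; 0.4 0.9];
t_true = [5; -2];
X = [0 0; 1 0; 0 1; 2 3]';                  % source points, one per column
Xp = bsxfun(@plus, M_true * X, t_true);     % target points

n = size(X, 2);
A = zeros(2*n, 6);
b = zeros(2*n, 1);
for i = 1:n
    A(2*i-1, :) = [X(1,i) X(2,i) 0 0 1 0];  % equation for x_i'
    A(2*i,   :) = [0 0 X(1,i) X(2,i) 0 1];  % equation for y_i'
    b(2*i-1:2*i) = Xp(:, i);
end

p = A \ b;                                  % [m1; m2; m3; m4; t1; t2]
M = [p(1) p(2); p(3) p(4)];
t = [p(5); p(6)];
```

With exactly three non-collinear matches the system is square; with more matches the backslash operator returns the least squares solution.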

Finding the objects (in detail)

1. Match interest points from input image to database image

2. Get location/scale/orientation using Hough voting
   • In training, each point has known position/scale/orientation wrt whole object
   • Matched points vote for the position, scale, and orientation of the entire object
   • Bins for x, y, scale, orientation
     • Wide bins (0.25 object length in position, 2x scale, 30 degrees orientation)
     • Vote for two closest bin centers in each direction (16 votes total)

3. Geometric verification
   • For each bin with at least 3 keypoints
   • Iterate between least squares fit and checking for inliers and outliers

4. Report object if > T inliers (T is typically 3, can be computed to match some probabilistic threshold)


Examples of recognized objects

View interpolation

• Training
  – Given images of different viewpoints
  – Cluster similar viewpoints using feature matches
  – Link features in adjacent views

• Recognition
  – Feature matches may be spread over several training viewpoints
  – Use the known links to “transfer votes” to other viewpoints

Slide credit: David Lowe

[Lowe01]

Applications

• Sony Aibo (Evolution Robotics)

• SIFT usage
  – Recognize docking station
  – Communicate with visual cards

• Other uses
  – Place recognition
  – Loop closure in SLAM

K. Grauman, B. Leibe

Slide credit: David Lowe


Location Recognition

Slide credit: David Lowe

Training

[Lowe04]

Another application: category recognition

• Goal: identify what type of object is in the image

• Approach: align to known objects and choose category with best match

“Shape matching and object recognition using low distortion correspondence”, Berg et al., CVPR 2005: http://www.cnbc.cmu.edu/cns/papers/berg-cvpr05.pdf

Examples of Matches

Other ideas worth being aware of

• Thin-plate splines: combines global affine warp with smooth local deformation

• Robust non-rigid point matching: A new point matching algorithm for non-rigid registration, CVIU 2003 (includes code, demo, paper)

HW 2: Feature Tracking

function featureTracking
% Main function for feature tracking
folder = '.\images';
im = readImages(folder, 0:50);
tau = 0.06; % Threshold for harris corner detection
[pt_x, pt_y] = getKeypoints(im{1}, tau); % Prob 1.1: keypoint detection
ws = 7; % Tracking ws x ws patches
[track_x, track_y] = ... % Prob 1.2 Keypoint tracking
    trackPoints(pt_x, pt_y, im, ws);
% Visualizing the feature tracks on the first and the last frame
figure(2), imagesc(im{1}), hold off, axis image, colormap gray
hold on, plot(track_x', track_y', 'r');

HW 2 – Feature/Keypoint detection

• Compute second moment matrix

$$\mu(\sigma_I, \sigma_D) = g(\sigma_I) * \begin{bmatrix} I_x^2(\sigma_D) & I_x I_y(\sigma_D) \\ I_x I_y(\sigma_D) & I_y^2(\sigma_D) \end{bmatrix}$$

• Harris corner criterion

$$har = \det[\mu(\sigma_I, \sigma_D)] - \alpha \, [\operatorname{trace}(\mu(\sigma_I, \sigma_D))]^2 = g(I_x^2) \, g(I_y^2) - [g(I_x I_y)]^2 - \alpha \, [g(I_x^2) + g(I_y^2)]^2$$

$$\alpha = 0.04$$

• Threshold

• Non-maximum suppression

function [keyXs, keyYs] = getKeypoints(im, tau)
% im: input image; tau: threshold
% keyXs, keyYs: detected keypoints, with dimension [N] x [1]

% 0. Smooth image (optional)
% 1. Compute image gradients. Hint: can use “gradient”
% 2. Compute Ix*Ix, Iy*Iy, Ix*Iy
% 3. Smooth Ix*Ix, Iy*Iy, Ix*Iy (locally weighted average)
% 4. Compute Harris corner score. Normalize the max to 1
% 5. Non-maximum suppression. Hint: use “ordfilt2”
% 6. Find positions whose values are larger than tau. Hint: use “find”
end

function [track_x, track_y] = trackPoints(pt_x, pt_y, im, ws)
% Tracking initial points (pt_x, pt_y) across the image sequence
% track_x: [Number of keypoints] x [nim]
% track_y: [Number of keypoints] x [nim]

% Initialization
N = numel(pt_x); % Number of keypoints
nim = numel(im); % Number of images
track_x = zeros(N, nim);
track_y = zeros(N, nim);
track_x(:, 1) = pt_x(:);
track_y(:, 1) = pt_y(:);

for t = 1:nim-1 % Tracking points from t to t+1
    [track_x(:, t+1), track_y(:, t+1)] = ...
        getNextPoints(track_x(:, t), track_y(:, t), im{t}, im{t+1}, ws);
end

Iterative L-K algorithm

1. Initialize (x’,y’) = (x,y)

2. Compute (u,v) by

$$\begin{bmatrix} \sum I_x I_x & \sum I_x I_y \\ \sum I_x I_y & \sum I_y I_y \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = - \begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix}$$

   • The matrix on the left is the 2nd moment matrix for the feature patch in the first image, at the original (x,y) position
   • (u, v) is the displacement
   • It = I(x’, y’, t+1) - I(x, y, t)

3. Shift window by (u, v): x’=x’+u; y’=y’+v;

4. Recalculate It

5. Repeat steps 2-4 until small change
   • Use interpolation for subpixel values

function [x2, y2] = getNextPoints(x, y, im1, im2, ws)
% Iterative Lucas-Kanade feature tracking
% (x, y) : initialized keypoint position in im1; (x2, y2): tracked keypoint positions in im2
% ws: patch window size

% 1. Compute gradients from im1 (get Ix and Iy)
% 2. Grab patches of Ix, Iy, and im1.
%    Hint 1: use “[X, Y] = meshgrid(-hw:hw,-hw:hw);” to get patch index, where hw = floor(ws/2);
%    Hint 2: use “interp2” to sample non-integer positions.

for iter = 1:numIter % 5 iterations should be sufficient
    % Check if tracked patches are outside the image. Only track valid patches.
    % For each keypoint
    % - grab patch1 (centered at x1, y1), grab patch2 (centered at x2,y2)
    % - compute It = patch2 – patch1
    % - grab Ix, Iy (centered at x1, y1)
    % - Set up matrix A and vector b
    % - Solve linear system d = A\b.
    % - x2(p)=x2(p)+d(1); y2(p)=y2(p)+d(2); -> Update the increment
end


HW 2 – Shape Alignment

• Global transformation (similarity, affine, perspective)

• Iterative closest point algorithm

function Tfm = align_shape(im1, im2)
% im1: input edge image 1
% im2: input edge image 2
% Output: transformation Tfm [3] x [3]

% 1. Find edge points in im1 and im2. Hint: use “find”
% 2. Compute initial transformation (e.g., compute translation and scaling by center of mass, variance within each image)

for i = 1:50
    % 3. Get nearest neighbors: for each point p_i find the corresponding
    %    match(i) = argmin_j dist(p_i, q_j)
    % 4. Compute transformation T based on matches
    % 5. Warp points p according to T
end


HW 2 – Local Feature Matching

• Implement distance ratio test feature matching algorithm

function featureMatching
im1 = imread('stop1.jpg');
im2 = imread('stop2.jpg');

% Load pre-computed SIFT features
load('SIFT_features.mat');
% Descriptor1, Descriptor2: SIFT features
% Frame1, Frame2: position, scale, rotation

% For every descriptor in im1, find the 1st nearest neighbor
% and the 2nd nearest neighbor in im2.
% Compute distance ratio score.
% Threshold and get the matches: a 2 x N array of indices
% that indicates which keypoints from image1 match which
% points in image 2

figure(1), hold off, clf
plotmatches(im2double(im1),im2double(im2),Frame1,Frame2,matches); % Display the matched keypoints

Things to remember

• Alignment
  • Hough transform
  • RANSAC
  • ICP

• Object instance recognition
  • Find keypoints, compute descriptors
  • Match descriptors
  • Vote for / fit affine parameters
  • Return object if # inliers > T

What have we learned?

• Interest points
  • Find distinct and repeatable points in images
  • Harris -> corners, DoG -> blobs
  • SIFT -> feature descriptor

• Feature tracking and optical flow
  • Find motion of a keypoint/pixel over time
  • Lucas-Kanade:
    • brightness consistency, small motion, spatial coherence
  • Handle large motion:
    • iterative update + pyramid search

• Fitting and alignment
  • find the transformation parameters that best align matched points

• Object instance recognition
  • Keypoint-based object instance recognition and search

Next week – Perspective and 3D Geometry