
Page 1: CVIU Lecture 3

ENGN8530: Computer Vision and Image Understanding:

Theories and Research

Topic 3: Image Matching and Registration

Dr Chunhua Shen and Dr Roland Goecke, VISTA / NICTA & RSISE, ANU

Acknowledgement: Some slides from Dr Antonio Robles-Kelly, Dr Tiberio Caetano, Dr Rob Mahony, Dr Cordelia Schmid, and Dr David Lowe

Page 2: CVIU Lecture 3

ENGN8530: CVIU 2

Some Terms

Image Matching:
- Align images of the same modality so as to provide a continuous 'larger' image
- Images often taken from different viewpoints

Image Registration:
- Align images of the same or different modalities so that the object of interest shows up in the same way in all the images
- Usually taken from more or less the same viewpoint
- Very important in medical imaging!

However, these terms are often used interchangeably!

Page 3: CVIU Lecture 3

ENGN8530: CVIU 3

How to Build a Panorama?

Reference: M. Brown and D.G. Lowe, “Recognising Panoramas”, ICCV 2003.

Page 4: CVIU Lecture 3

ENGN8530: CVIU 4

Template Matching

Useful for locating objects with known shape and appearance in an image.
An n×m template is compared with n×m regions of the image, with the aim of finding the point in the image to which it is most similar.

Disadvantages:
- Cannot handle pose changes well
- Cannot handle scale changes well

Page 5: CVIU Lecture 3

ENGN8530: CVIU 5

Template Matching (2)

Given two images I1 and I2 (of the same size), measure the correlation between them.
E.g. how similar are these two 3×3 images? We need a similarity measure!

I1 = [-1 -1 0; -1 0 1; 0 1 1]        I2 = [-10 -10 0; -10 0 10; 0 10 10]

Page 6: CVIU Lecture 3

ENGN8530: CVIU 6

Similarity Measures

Sum of Absolute Differences (SAD):

Let I1 and I2 be images of the same size, with I1(p_i) = a_i and I2(p_i) = b_i. Then

SAD(I1, I2) = Σ_i |a_i − b_i|

Sum of Squared Differences (SSD) is very similar, except that the squared differences are summed.
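As a minimal sketch (not part of the original slides; the function names sad/ssd are ours), both measures can be written directly in NumPy:

```python
import numpy as np

def sad(img1, img2):
    """Sum of Absolute Differences between two equally sized images."""
    return float(np.sum(np.abs(img1.astype(float) - img2.astype(float))))

def ssd(img1, img2):
    """Sum of Squared Differences between two equally sized images."""
    d = img1.astype(float) - img2.astype(float)
    return float(np.sum(d * d))
```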

Page 7: CVIU Lecture 3

ENGN8530: CVIU 7

SAD

Example: Find the eyes in a face

Original image showing template and location of optimum

Sum of absolute differences, note the peak over the left eye

Template

Page 8: CVIU Lecture 3

ENGN8530: CVIU 8

SAD (2)

Advantages:
- Intuitive
- Degrades gracefully

Disadvantage:
- Not robust to changes in illumination

Page 9: CVIU Lecture 3

ENGN8530: CVIU 9

SAD (3)

Original image I vs. darkened image Idark = I/2 + 0.2

SAD FAILS: it is not robust to a uniform change in illumination.

Page 10: CVIU Lecture 3

ENGN8530: CVIU 10

Correlation

Correlation measures how changes in one variable correlate with changes in another variable.
We will use the normalised cross-correlation (NCC) to determine the correlation between I1 and I2. Without normalisation, correlation is also sensitive to illumination changes.
NCC measures the similarity in the way in which I1 and I2 deviate from their mean values, i.e. the similarity of the 'patterns' of the two images.

⇒ Robust to changes in illumination.

Page 11: CVIU Lecture 3

ENGN8530: CVIU 11

Normalised Cross Correlation

Let I1 and I2 be images of the same size, with I1(p_i) = a_i and I2(p_i) = b_i, and let ā and b̄ be the mean values of the two images. Then

NCC(I1, I2) = Σ_i (a_i − ā)(b_i − b̄) / sqrt( Σ_i (a_i − ā)² · Σ_i (b_i − b̄)² )
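A direct NumPy transcription of this formula (a sketch; the helper name ncc is ours):

```python
import numpy as np

def ncc(img1, img2):
    """Normalised cross-correlation of two equally sized images.

    Returns a value in [-1, 1]; it is undefined (division by zero)
    for a perfectly flat, featureless image."""
    a = img1.astype(float).ravel()
    b = img2.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))
```

For example, ncc(I1, 10 * I1 + 5) evaluates to 1.0 for any non-flat I1, which is the affine intensity invariance illustrated on the next slide.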

Page 12: CVIU Lecture 3

ENGN8530: CVIU 12

NCC (2)

I1                              I2                      NCC(I1, I2)
[-1 -1 0; -1 0 1; 0 1 1]        k·I1 + l  (k > 0)        1
[-1 -1 0; -1 0 1; 0 1 1]       −k·I1 + l  (k > 0)       −1

Here k and l are constants; '+ l' means adding l to all matrix elements.

Page 13: CVIU Lecture 3

ENGN8530: CVIU 13

NCC (3)

NCC(I1, I2) ∈ [-1, 1], i.e. the closed interval on the real line bounded by −1 and 1, containing both its boundary points.
NCC measures the similarity of the 'patterns' of two images.
NCC is undefined for a flat, featureless image (all pixels equal to some constant k), since the denominator is then zero. This virtually never occurs in practice.

Page 14: CVIU Lecture 3

ENGN8530: CVIU 14

NCC (4)

(Figure: a 3×3 template, [-1 -1 0; -1 0 1; 0 1 1], is slid over a 4×4 image.)

Apply NCC in a similar manner to convolution, but only calculate it in the 'valid' region, i.e. where the template lies entirely inside the image.
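A sliding-window sketch of this procedure (ours, not the lecture's code); the score map is computed only at the 'valid' positions, and it assumes non-flat patches (NCC is undefined for a flat patch):

```python
import numpy as np

def match_template_ncc(image, template):
    """Slide a template over an image, computing NCC at every 'valid'
    position (template fully inside the image). The maximum of the
    returned score map marks the best match."""
    I = image.astype(float)
    t = template.astype(float)
    t = t - t.mean()
    t_norm = np.sqrt(np.sum(t * t))
    H, W = I.shape
    h, w = t.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            patch = I[r:r + h, c:c + w]
            p = patch - patch.mean()
            scores[r, c] = np.sum(p * t) / (np.sqrt(np.sum(p * p)) * t_norm)
    return scores
```

The best-match location is then np.unravel_index(np.argmax(scores), scores.shape).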

Page 15: CVIU Lecture 3

ENGN8530: CVIU 15

NCC (5)

(Figure: the template at the first valid position of the 4×4 image; the NCC value there is 1.)

Apply NCC in a similar manner to convolution, but only calculate it in the 'valid' region.

Page 16: CVIU Lecture 3

ENGN8530: CVIU 16

NCC (6)

(Figure: the template at the second valid position; NCC values so far: 1, 0.846.)

Apply NCC in a similar manner to convolution, but only calculate it in the 'valid' region.

Page 17: CVIU Lecture 3

ENGN8530: CVIU 17

NCC (7)

(Figure: the template at the third valid position; NCC values so far: 1, 0.846, 0.833.)

Apply NCC in a similar manner to convolution, but only calculate it in the 'valid' region.

Page 18: CVIU Lecture 3

ENGN8530: CVIU 18

NCC (8)

(Figure: the completed 2×2 map of NCC values over the valid region: [1 0.846; 0.833 0.258]. The maximum, 1, marks the best match.)

Apply NCC in a similar manner to convolution, but only calculate it in the 'valid' region.

Page 19: CVIU Lecture 3

ENGN8530: CVIU 19

Template Matching using NCC

The template of the right eye is flipped and used to locate the left eye.

Original image showing template, and location of maximum in normalised cross-correlation

Normalised cross-correlation, note the peak over the left eye

Page 20: CVIU Lecture 3

ENGN8530: CVIU 20

NCC v. SAD

Unaltered image I vs. darkened image Idark = I/2 + 0.2:
- NCC: robust to a uniform change in illumination
- SAD: not robust to a uniform change in illumination (FAILS!)

Page 21: CVIU Lecture 3

ENGN8530: CVIU 21

Issues

These similarity measures are not very good at handling:
- Scale changes
- Pose changes
- Arbitrary rotations of the object or camera
- Illumination changes, in particular non-global ones

How can we improve image matching / registration?
Solution:
- Use local invariant image features
- Then use these features to do the matching / registration

Page 22: CVIU Lecture 3

ENGN8530: CVIU 22

How to Build a Panorama? (2)

Need to align (match) images

Page 23: CVIU Lecture 3

ENGN8530: CVIU 23

How to Build a Panorama? (3)

Detect feature points in both images

Page 24: CVIU Lecture 3

ENGN8530: CVIU 24

How to Build a Panorama? (4)

Detect feature points in both images
Find corresponding pairs

Page 25: CVIU Lecture 3

ENGN8530: CVIU 25

How to Build a Panorama? (5)

Detect feature points in both images
Find corresponding pairs
Use these pairs to align images

Page 26: CVIU Lecture 3

ENGN8530: CVIU 26

Local Image Features

Local invariant photometric descriptors:
- Local: robust to occlusion/clutter, no segmentation required
- Photometric: distinctive
- Invariant: to image transformations and illumination changes

Page 27: CVIU Lecture 3

ENGN8530: CVIU 27

Invariant Features

Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters.

SIFT Features

Page 28: CVIU Lecture 3

ENGN8530: CVIU 28

Invariant Features (2)

Advantages:
- Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
- Distinctiveness: individual features can be matched to a large database of objects
- Quantity: many features can be generated for even small objects
- Efficiency: close to real-time performance
- Extensibility: can easily be extended to a wide range of differing feature types, with each adding robustness

Page 29: CVIU Lecture 3

ENGN8530: CVIU 29

Matching with Features

Problem 1:

Detect the same point independently in both images

no chance to match!

We need a repeatable detector

Page 30: CVIU Lecture 3

ENGN8530: CVIU 30

Matching with Features (2)

Problem 2:

For each point, correctly recognize the corresponding one.

We need a reliable and distinctive descriptor

Page 31: CVIU Lecture 3

ENGN8530: CVIU 31

Matching with Features (3)

Determining correspondences:

Vector comparison using the Mahalanobis distance

dist_M(p, q) = (p − q)^T Λ^(−1) (p − q)
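A small sketch of this comparison (ours, not from the slides); Λ is the descriptor covariance matrix, estimated from data:

```python
import numpy as np

def mahalanobis_dist(p, q, cov):
    """Mahalanobis distance between descriptor vectors p and q,
    given the descriptor covariance matrix cov (Lambda)."""
    d = np.asarray(p, float) - np.asarray(q, float)
    return float(d @ np.linalg.inv(cov) @ d)
```

In practice the inverse (or a Cholesky factor) of Λ would be precomputed once and reused for all comparisons.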

Page 32: CVIU Lecture 3

ENGN8530: CVIU 32

A Little Bit of History…

Zhang, Deriche, Faugeras, Luong (Artificial Intelligence, 1995):
- Apply Harris corner detector
- Match points by correlating only at corner points
- Derive epipolar alignment using robust least-squares

Page 33: CVIU Lecture 3

ENGN8530: CVIU 33

A Little Bit of History… (2)

Schmid & Mohr (1997):
- Apply Harris corner detector
- Use rotational invariants at corner points

However, not scale invariant. Sensitive to viewpoint and illumination change.

Page 34: CVIU Lecture 3

ENGN8530: CVIU 34

Interest Point Detectors

Contour-based methods:
- Junctions, ends, etc.

Intensity-based methods:
- Auto-correlation matrix

Parametric-model based methods:
- L-corner

Page 35: CVIU Lecture 3

ENGN8530: CVIU 35

Harris Corner Detector

Basic idea:
- We should easily recognize the point by looking through a small window
- Shifting the window in any direction should give a large change in intensity

Reference: C. Harris and M. Stephens, “A combined corner and edge detector”, Proceedings of the 4th Alvey Vision Conference, 1988, pp. 147--151.

Page 36: CVIU Lecture 3

ENGN8530: CVIU 36

Harris Corner Detector (2)

Based on the idea of auto-correlation:

“flat” region:no change in all directions

“edge”:no change along the edge direction

“corner”:significant change in all directions

Page 37: CVIU Lecture 3

ENGN8530: CVIU 37

Harris Corner Detector (3)

Change of intensity for the shift [u, v]:

E(u, v) = Σ_{x,y} w(x, y) [ I(x + u, y + v) − I(x, y) ]²

where I(x, y) is the intensity, I(x + u, y + v) is the shifted intensity, and w(x, y) is the window function: either 1 inside the window and 0 outside, or a Gaussian.
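A literal (and deliberately naive) sketch of this sum for a single integer shift, with the window supplied as a weight mask; the names and the border handling are ours:

```python
import numpy as np

def intensity_change(image, u, v, window):
    """E(u, v): windowed sum of squared differences between the image
    and its copy shifted by (u, v). 'window' is a weight mask of the
    same shape as the image (1 inside / 0 outside, or a Gaussian)."""
    I = image.astype(float)
    # I(x + u, y + v); np.roll wraps around at the borders, which is
    # acceptable for this illustration only.
    shifted = np.roll(I, shift=(-v, -u), axis=(0, 1))
    return float(np.sum(window * (shifted - I) ** 2))
```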

Page 38: CVIU Lecture 3

ENGN8530: CVIU 38

Harris Corner Detector (4)

For small shifts [u, v] we have a bilinear approximation:

E(u, v) ≅ [u v] M [u v]^T

where M is a 2×2 matrix computed from image derivatives (the auto-correlation matrix):

M = Σ_{x,y} w(x, y) [ Ix²   IxIy ]
                    [ IxIy  Iy²  ]

Page 39: CVIU Lecture 3

ENGN8530: CVIU 39

Harris Corner Detector (5)

Intensity change in the shifting window: eigenvalue analysis

E(u, v) ≅ [u v] M [u v]^T,   with λ1, λ2 the eigenvalues of M

(Figure: the ellipse E(u, v) = const has axis lengths (λmax)^(-1/2) and (λmin)^(-1/2), aligned with the directions of the fastest and slowest intensity change.)

Page 40: CVIU Lecture 3

ENGN8530: CVIU 40

Harris Corner Detector (6)

Auto-correlation matrix:
- captures the structure of the local neighborhood
- measure based on the eigenvalues of this matrix:
  - 2 strong eigenvalues => interest point
  - 1 strong eigenvalue => contour
  - 0 strong eigenvalues => uniform region

Interest point detection:
- threshold on the eigenvalues
- local maximum for localization

Page 41: CVIU Lecture 3

ENGN8530: CVIU 41

Harris Corner Detector (7)

Classification of image points using the eigenvalues of M (λ1, λ2):
- "Corner": λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions
- "Edge": λ1 >> λ2 (or λ2 >> λ1)
- "Flat" region: λ1 and λ2 are small; E is almost constant in all directions

Page 42: CVIU Lecture 3

ENGN8530: CVIU 42

Harris Corner Detector (8)

Measure of corner response:

R = det M − k (trace M)²

det M = λ1 λ2
trace M = λ1 + λ2

(k – empirical constant, k = 0.04–0.06)

Page 43: CVIU Lecture 3

ENGN8530: CVIU 43

Harris Corner Detector (9)

(Figure: the λ1–λ2 plane divided into "Corner" (R > 0), "Edge" (R < 0) and "Flat" (|R| small) regions.)

- R depends only on the eigenvalues of M
- R is large for a corner
- R is negative with large magnitude for an edge
- |R| is small for a flat region

Page 44: CVIU Lecture 3

ENGN8530: CVIU 44

Harris Corner Detector (10)

The Algorithm:
- Compute the corner response R (see the workflow on the following slides)
- Find points with a large corner response function R (R > threshold)
- Take the points that are local maxima of R

A complete sketch of this algorithm follows.
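Putting the previous slides together, a compact NumPy/SciPy sketch of the whole detector might look as follows (our code; the parameter defaults are illustrative, not the lecture's):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def harris_corners(image, sigma=1.0, k=0.05, rel_threshold=0.01):
    """Harris corner detector sketch.

    sigma         : std. dev. of the Gaussian window w(x, y)
    k             : empirical constant (typically 0.04-0.06)
    rel_threshold : threshold on R, relative to its maximum
    Returns the (row, col) coordinates of detected corners."""
    I = image.astype(float)

    # Image derivatives Ix, Iy (np.gradient returns d/drow, d/dcol)
    Iy, Ix = np.gradient(I)

    # Entries of the auto-correlation matrix M, smoothed by the window
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)

    # Corner response R = det(M) - k * trace(M)^2 at every pixel
    R = (Sxx * Syy - Sxy * Sxy) - k * (Sxx + Syy) ** 2

    # Keep points with a large response that are also local maxima of R
    corners = (R > rel_threshold * R.max()) & (R == maximum_filter(R, size=3))
    return np.argwhere(corners)
```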

Page 45: CVIU Lecture 3

ENGN8530: CVIU 45

Harris: Workflow

Page 46: CVIU Lecture 3

ENGN8530: CVIU 46

Harris: Workflow (2)

Compute corner response R

Page 47: CVIU Lecture 3

ENGN8530: CVIU 47

Harris: Workflow (3)

Find points with large corner response: R > threshold

Page 48: CVIU Lecture 3

ENGN8530: CVIU 48

Harris: Workflow (4)

Take only the points of local maxima of R

Page 49: CVIU Lecture 3

ENGN8530: CVIU 49

Harris: Workflow (5)

Page 50: CVIU Lecture 3

ENGN8530: CVIU 50

Harris: Properties

Rotation invariance:

Ellipse rotates but its shape (i.e. eigenvalues) remains the same

Corner response R is invariant to image rotation

Page 51: CVIU Lecture 3

ENGN8530: CVIU 51

Harris: Properties (2)

Partial invariance to affine intensity change:
- Only derivatives are used => invariance to an intensity shift I → I + b
- Intensity scale I → a·I scales the response R, so a fixed threshold may select different points

(Figure: corner response R vs. image coordinate x, with a fixed threshold, before and after intensity scaling.)

Page 52: CVIU Lecture 3

ENGN8530: CVIU 52

Harris: Properties (3)

But: not invariant to image scale!

(Figure: the same structure at two scales; at the fine scale all window positions are classified as edges, while at the coarse scale it is detected as a corner.)

Page 53: CVIU Lecture 3

ENGN8530: CVIU 53

Scale Invariant Features

Consider regions (e.g. circles) of different sizes around a point.
Regions of corresponding sizes will look the same in both images.

Page 54: CVIU Lecture 3

ENGN8530: CVIU 54

Scale Invariant Features (2)

The problem: how do we choose corresponding circles independently in each image?

Page 55: CVIU Lecture 3

ENGN8530: CVIU 55

Scale Invariant Features (3)

Solution:
- Design a function on the region (circle) which is "scale invariant", i.e. the same for corresponding regions, even if they are at different scales
- Example: average intensity. For corresponding regions (even of different sizes) it will be the same.
- For a point in one image, we can consider this function as a function of region size (circle radius)

(Figure: f vs. region size for the same point in Image 1 and in Image 2, where Image 2 is at scale 1/2.)

Page 56: CVIU Lecture 3

ENGN8530: CVIU 56

Scale Invariant Features (4)

Common approach:
- Take a local maximum of this function
- Observation: the region size at which the maximum is achieved should be invariant to image scale

(Figure: f vs. region size for Image 1 and Image 2 (scale 1/2); the maxima occur at region sizes s1 and s2 respectively.)

Important: this scale invariant region size is found in each image independently!

Page 57: CVIU Lecture 3

ENGN8530: CVIU 57

Scale Invariant Features (5)

A "good" function for scale detection has one stable, sharp peak.

(Figure: f vs. region size for three cases; functions with multiple or flat peaks are bad, a single sharp peak is good.)

For usual images, a good function is one that responds to contrast (sharp local intensity change).

Page 58: CVIU Lecture 3

ENGN8530: CVIU 58

Scale Invariant Features (6)

Functions for determining scale: f = Kernel ∗ Image

Kernels:

L = σ² (Gxx(x, y, σ) + Gyy(x, y, σ))          (Laplacian of Gaussian)

DoG = G(x, y, kσ) − G(x, y, σ)                (Difference of Gaussians)

where G(x, y, σ) = 1/(2πσ²) · exp(−(x² + y²) / (2σ²)) is the Gaussian.

Note: both kernels are invariant to scale and rotation.
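Using SciPy's Gaussian filters, the two responses can be sketched as below (our helper names; k = 1.6 is just a commonly quoted choice, not the lecture's):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

def log_response(image, sigma):
    """Scale-normalised Laplacian-of-Gaussian response:
    sigma^2 times the LoG kernel convolved with the image."""
    return (sigma ** 2) * gaussian_laplace(image.astype(float), sigma)

def dog_response(image, sigma, k=1.6):
    """Difference-of-Gaussians response: G(k*sigma) - G(sigma) applied
    to the image, a cheap approximation of the scale-normalised LoG."""
    I = image.astype(float)
    return gaussian_filter(I, k * sigma) - gaussian_filter(I, sigma)
```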

Page 59: CVIU Lecture 3

ENGN8530: CVIU 59

Scale Invariant Features (7)

(Figure: the scale-space cube with x, y and scale axes; Harris is applied in the spatial directions, the Laplacian in the scale direction.)

Harris-Laplacian: find the local maximum of
- the Harris corner detector in space (image coordinates)
- the Laplacian in scale

Reference: K. Mikolajczyk and C. Schmid, “Indexing Based on Scale Invariant Interest Points”, ICCV 2001.

Page 60: CVIU Lecture 3

ENGN8530: CVIU 60

Scale invariant Harris points

- Multi-scale extraction of Harris interest points
- Selection of points at the characteristic scale in scale space

Characteristic scale:
- Maximum of the Laplacian in scale space
- Scale invariant

Page 61: CVIU Lecture 3

ENGN8530: CVIU 61

Scale invariant Harris points (2)

Multi-scale Harris points

Selection of points at the characteristic scale with the Laplacian

invariant points + associated regions

Page 62: CVIU Lecture 3

ENGN8530: CVIU 62

Viewpoint Changes

Viewpoint changes are locally approximated by an affine transformation A.

(Figure: a detected scale invariant region in one view and the corresponding projected region in the other view, related by A.)

Affine transformation: Linear transformation followed by a translation

Page 63: CVIU Lecture 3

ENGN8530: CVIU 63

Affine Invariant Features

So far we considered: similarity transforms (rotation + uniform scale)
Now we go on to: affine transforms (rotation + non-uniform scale)

Page 64: CVIU Lecture 3

ENGN8530: CVIU 64

Affine Invariant Features (2)

- Take a local intensity extremum as the initial point
- Go along every ray starting from this point and stop when an extremum of the function f is reached:

f(t) = |I(t) − I_0| / ( (1/t) ∫_0^t |I(τ) − I_0| dτ )

where I_0 is the intensity at the initial point and t is the position along the ray.

Reference: T. Tuytelaars and L. Van Gool, “Wide Baseline Stereo Matching Based on Local, Affinely Invariant Regions”, BMVC 2000.
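A discrete sketch of evaluating f along one ray (ours; intensities are assumed to be sampled at unit steps from the extremum outwards):

```python
import numpy as np

def ray_function(intensities):
    """f(t) along one ray: |I(t) - I0| divided by the running mean of
    |I - I0| from the starting extremum (I0 = intensities[0]) up to t."""
    I = np.asarray(intensities, float)
    dev = np.abs(I - I[0])                # |I(t) - I_0|
    t = np.arange(1, len(I))
    mean_dev = np.cumsum(dev)[1:] / t     # (1/t) * integral of |I - I_0| dt
    return dev[1:] / np.maximum(mean_dev, 1e-9)   # guard against division by zero
```

The stopping point along the ray is then the position of an extremum of this array.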

Page 65: CVIU Lecture 3

ENGN8530: CVIU 65

Affine Invariant Features (3)

We obtain approximately corresponding regions:
- The regions found may not exactly correspond, so we approximate them with ellipses
- Geometric moments of orders up to 2 allow the region to be approximated by an ellipse

Remark: Search for scale in every direction

Page 66: CVIU Lecture 3

ENGN8530: CVIU 66

Affine Invariant Features (4)

Covariance matrix of region points defines an ellipse:

Σ1 = ⟨p p^T⟩ over region 1,   ellipse: p^T Σ1^(−1) p = 1
Σ2 = ⟨q q^T⟩ over region 2,   ellipse: q^T Σ2^(−1) q = 1

(p = [x, y]^T is relative to the centre of mass)

If the regions are related by q = A p, then Σ2 = A Σ1 A^T.

Ellipses computed for corresponding regions also correspond!
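A short sketch of fitting such an ellipse from a region's pixel coordinates (ours):

```python
import numpy as np

def region_ellipse(points):
    """Second-order moments of a region's points (relative to the
    centre of mass) give the covariance matrix Sigma; the fitted
    ellipse is { x : x^T Sigma^{-1} x = 1 }."""
    P = np.asarray(points, float)          # N x 2 array of (x, y) points
    centred = P - P.mean(axis=0)
    sigma = centred.T @ centred / len(P)   # 2x2 covariance matrix
    return sigma
```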

Page 67: CVIU Lecture 3

ENGN8530: CVIU 67

Affine Invariant Harris

Initialisation with multi-scale interest points

Iterative modification of location, scale and neighbourhood

Page 68: CVIU Lecture 3

ENGN8530: CVIU 68

MSER

Maximally Stable Extremal Regions:
- Threshold image intensities: I > I0
- Extract connected components ("Extremal Regions")
- Find the threshold at which an extremal region is "Maximally Stable", i.e. there is a local minimum of the relative growth of its area
- Approximate each such region with an ellipse

Reference: J. Matas, O. Chum, M. Urban, T. Pajdla, “Robust Wide Baseline Stereo from Maximally Stable Extremal Regions”, BMVC 2002, pp. 384-393.
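In practice one would typically use an existing implementation; a sketch with OpenCV's MSER (the API names are given as we recall them, so treat them as an assumption):

```python
import numpy as np
import cv2  # OpenCV, assumed to be installed

def detect_mser(gray):
    """Detect MSER regions on a uint8 grayscale image and approximate
    each region with an ellipse (centre, axes, angle)."""
    mser = cv2.MSER_create()
    regions, _bboxes = mser.detectRegions(gray)
    ellipses = [cv2.fitEllipse(r.reshape(-1, 1, 2).astype(np.float32))
                for r in regions if len(r) >= 5]  # fitEllipse needs >= 5 points
    return regions, ellipses
```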

Page 69: CVIU Lecture 3

ENGN8530: CVIU 69

SIFT

Scale-Invariant Feature Transform. Basically, SIFT is a 4-step process:
1. Scale-space extrema detection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor

Reference: D. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
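For completeness, a sketch of detecting and matching SIFT features with OpenCV (our code; cv2.SIFT_create requires a reasonably recent OpenCV build, and the 0.75 ratio is just a commonly used value for Lowe's ratio test):

```python
import cv2  # OpenCV, assumed to be installed

def sift_match(img1, img2, ratio=0.75):
    """Detect SIFT keypoints/descriptors in two grayscale images and
    keep the matches that pass Lowe's ratio test."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des1, des2, k=2)
    # Keep a match only if it is clearly better than the second-best candidate
    good = [m for m, n in knn if m.distance < ratio * n.distance]
    return kp1, kp2, good
```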

Page 70: CVIU Lecture 3

ENGN8530: CVIU 70

SIFT (2)

Build Scale-Space Pyramid:
- All scales must be examined to identify scale-invariant features
- An efficient approach is to compute the Difference of Gaussian (DoG) pyramid (Burt & Adelson, 1983)

(Figure: at each octave, the image is repeatedly blurred, adjacent blurred images are subtracted to form the DoG levels, and the image is then resampled for the next octave.)
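A minimal blur / subtract / resample sketch of such a pyramid (our parameter choices; σ0 = 1.6 and the scale step 2^(1/(levels−1)) are illustrative defaults in the spirit of Lowe's paper):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(image, n_octaves=4, levels=5, sigma0=1.6):
    """Build a Difference-of-Gaussian pyramid.

    Per octave: blur the image at 'levels' increasing scales, subtract
    adjacent blurred images (DoG), then downsample by 2 for the next octave."""
    I = image.astype(float)
    k = 2.0 ** (1.0 / (levels - 1))   # scale step within an octave
    pyramid = []
    for _ in range(n_octaves):
        blurred = [gaussian_filter(I, sigma0 * k ** i) for i in range(levels)]
        pyramid.append([b2 - b1 for b1, b2 in zip(blurred[:-1], blurred[1:])])
        I = blurred[-1][::2, ::2]     # resample for the next octave
    return pyramid
```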

Page 71: CVIU Lecture 3

ENGN8530: CVIU 71

SIFT (3)

Scale space is processed one octave at a time.

Page 72: CVIU Lecture 3

ENGN8530: CVIU 72

SIFT (4)

Keypoint localisation:
- Detect maxima and minima of the Difference-of-Gaussian (DoG) in scale space
- Could also use the Laplacian of Gaussian (LoG)

Page 73: CVIU Lecture 3

ENGN8530: CVIU 73

SIFT (5)

Select canonical orientation:
- Create a histogram of local gradient directions (0 to 2π) computed at the selected scale
- Assign the canonical orientation at the peak of the smoothed histogram
- Each keypoint then specifies stable 2D coordinates (x, y, scale, orientation)

Page 74: CVIU Lecture 3

ENGN8530: CVIU 74

SIFT (6)

SIFT vector formation (Keypoint Descriptor):
- Thresholded image gradients are sampled over a 16×16 array of locations in scale space
- Create an array of orientation histograms
- 8 orientations × 4×4 histogram array = 128 dimensions

Page 75: CVIU Lecture 3

ENGN8530: CVIU 75

SIFT Example

Laplacian of Gaussian

Page 76: CVIU Lecture 3

ENGN8530: CVIU 76

SIFT Example (2)

SIFT keypoints

Page 77: CVIU Lecture 3

ENGN8530: CVIU 77

SIFT Example (3)

Query

Result

Task:

Find query image parts in the image

Page 78: CVIU Lecture 3

ENGN8530: CVIU 78

Summary

- SIFT is arguably the best affine invariant local image feature, but…
- SIFT is relatively expensive (computationally)
- MSER doesn't work well on images with any motion blur, e.g. from a moving camera

Interesting alternatives:
- GLOH (Gradient Location and Orientation Histogram)
- SURF (Speeded Up Robust Features)
- Histogram of Oriented Gradients
- Kadir-Brady Saliency Detector