Robust Learning-based Image Matching IMW CVPR 2019 Dawei Sun Zixin Luo Jiahui Zhang



00 Outline

An image matching pipeline: 1) local keypoint detection, 2) local keypoint description, 3) sparse matching.

Part 1

ContextDesc: a learning-based local descriptor

ContextDesc: Local Descriptor Augmentation with Cross-Modality Context, CVPR’19

Part 2

A learning-based inlier classification and fundamental matrix estimation method

In submission

01 Motivation

The results are becoming saturated on standard benchmarks.

[Figure: reference patches (Ref.), nearest-neighbor matches (NN), and ground-truth matches (GT).]

*Samples are from the HPatches dataset.

01 Motivation

[Figure: reference patches (Ref.), nearest-neighbor matches (NN), and ground-truth matches (GT). The nearest-neighbor matches are similar but incorrect.]

*GeoDesc is used for feature description.

01 Motivation

Locally distinguishable by visual appearance in the presence of, e.g., repetitive patterns?

[Figure: reference patches (Ref.), nearest-neighbor matches (NN), and ground-truth matches (GT). The patches are indistinguishable.]

01 Motivation

More visual context!

• The representation of both local details and richer context – how to construct the network?

• Construct feature pyramids – too costly for this low-level task?

01 Motivation

Keypoint distribution reveals meaningful scene structure

*Keypoints are derived from SIFT.

01 Motivation

Keypoint distribution reveals meaningful scene structure

Keypoints are designed to be repeatable in the same underlying scene

Coarse matches can be established, even without color information

01 Motivation

Encoding geometric context from the keypoint distribution of an individual image

• Keypoints are irregular and unordered – how to construct a proper encoder?

• Keypoints are not perfectly repeatable – how to acquire strong invariance to different image variations?

01 Motivation

Visual context

• Incorporate high-level visual information

• Resort to regional representation often used in image retrieval (one forward pass of the entire image)

Geometric context

• Geometric cues from keypoint distribution

• Resort to PointNet-like architecture to process 2D point sets

01 Motivation

A unified framework: cross-modality local descriptor augmentation

Based on off-the-shelf descriptors…

02 Methods

[Figure: overview of the framework in three stages: preparation, augmentation, and aggregation. Modules shown: patch sampler, local feature extractor, regional feature extractor (ResNet-50), geometric context encoder, matchability predictor, visual context encoder (sampling grid, interpolation of regional features), residual units 0 to 3 with context normalization, perceptron, and MLPs. Inputs: an image sample (H × W × 3) and its keypoints. Tensor shapes shown include K × 4, K × 2, K × 32 × 32, K × 128, H/32 × W/32 × 2048, K × 1, K × 3, K × 512, K × 640, and K × 2048. Losses shown: quad loss and N-pair loss. Legend: C = channel-wise concatenation; MLP = multi-layer perceptron; CN = context normalization; further symbols mark element-wise summation, tanh, and l2-normalization.]

02 Methods

Preparation

Keypoint locations: e.g., SIFT

[Figure: preparation stage with patch sampler, local feature extractor, and regional feature extractor (ResNet-50).]

02 Methods

Preparation

Raw local features: e.g., GeoDesc

[Figure: preparation stage with patch sampler, local feature extractor, and regional feature extractor (ResNet-50).]

02 Methods

Preparation

Regional features: an off-the-shelf image retrieval model

[Figure: preparation stage with patch sampler, local feature extractor, and regional feature extractor (ResNet-50).]

02 Methods

Augmentation: geometric context encoder

A variant of PointNet

• Keypoint locations: K × 2

• Geometric context: K × 128

02 Methods

Augmentation: matchability predictor

• Keypoint locations: K × 2

• Raw local features: K × 128

• Matchability predictor output: K × 1 (scale from -1.0 to +1.0)

• Concat: K × 3
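The shapes above can be traced with a small NumPy sketch. It is only an illustration under stated assumptions: the matchability score is taken to be a single perceptron with tanh on the raw local features, the concatenation is taken to be keypoint locations plus that score, and the encoder is reduced to two shared per-point perceptrons with context normalization (the residual connections of the slide are omitted). All function names and the random weights are hypothetical.

```python
import numpy as np

def context_norm(x, eps=1e-6):
    """Normalize each channel across the K points of one image; x is (K, C)."""
    return (x - x.mean(axis=0, keepdims=True)) / (x.std(axis=0, keepdims=True) + eps)

def predict_matchability(raw_feats, w, b):
    """(K, 128) raw local features -> (K, 1) scores squashed by tanh.
    A single perceptron is an assumption made for this sketch."""
    return np.tanh(raw_feats @ w + b)

def encode_geometric_context(points, layers):
    """(K, 3) points -> (K, C) features with shared per-point perceptrons.
    Every layer applies the same weights to all points, so permuting the
    keypoints only permutes the rows of the output."""
    x = points
    for w, b in layers:
        x = np.maximum(context_norm(x @ w + b), 0.0)  # perceptron, CN, ReLU
    return x

rng = np.random.default_rng(0)
K = 500
kpt_xy = rng.uniform(-1.0, 1.0, size=(K, 2))   # normalized keypoint locations
raw_feats = rng.standard_normal((K, 128))      # stand-in for raw local features

m = predict_matchability(raw_feats, rng.standard_normal((128, 1)) * 0.1, 0.0)
points = np.concatenate([kpt_xy, m], axis=1)   # concat: K x 3

layers = [(rng.standard_normal((3, 32)), np.zeros(32)),
          (rng.standard_normal((32, 128)), np.zeros(128))]
geo_ctx = encode_geometric_context(points, layers)  # K x 128
```

Because the weights are shared and the normalization statistics are computed over the whole set, the sketch handles unordered keypoints while still letting every point see set-level information.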

02 Methods

Augmentation: visual context encoder

• Regional features: H/32 × W/32 × 2048

• Keypoint locations: K × 2

• Interpolation

• Visual context: K × 128
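The interpolation step can be illustrated with plain bilinear sampling of the regional feature map at the keypoint locations. This is a minimal sketch, not the authors' sampling-grid implementation: the function name, the corner-aligned mapping from pixel to feature-map coordinates, and the zero-filled feature map are assumptions. The projection from 2048 to 128 channels is not shown.

```python
import numpy as np

def interpolate_regional_features(feat_map, kpts_xy, image_hw):
    """Bilinearly sample a (h, w, C) feature map at K keypoints.

    kpts_xy holds (x, y) pixel coordinates in an image of size image_hw = (H, W).
    Requires h >= 2 and w >= 2. Returns a (K, C) array.
    """
    h, w, _ = feat_map.shape
    H, W = image_hw
    # Assumed mapping: image corners land on feature-map corners.
    fx = kpts_xy[:, 0] / (W - 1) * (w - 1)
    fy = kpts_xy[:, 1] / (H - 1) * (h - 1)
    # Top-left cell index, clipped so that x0 + 1 and y0 + 1 stay in range.
    x0 = np.clip(np.floor(fx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(fy).astype(int), 0, h - 2)
    ax = (fx - x0)[:, None]
    ay = (fy - y0)[:, None]
    top = (1 - ax) * feat_map[y0, x0] + ax * feat_map[y0, x0 + 1]
    bottom = (1 - ax) * feat_map[y0 + 1, x0] + ax * feat_map[y0 + 1, x0 + 1]
    return (1 - ay) * top + ay * bottom

# Usage: a 480 x 640 image gives a 15 x 20 x 2048 map at stride 32.
regional = np.zeros((15, 20, 2048))
kpts = np.array([[10.0, 20.0], [600.0, 400.0]])
per_keypoint = interpolate_regional_features(regional, kpts, (480, 640))  # (2, 2048)
```

Only one forward pass of the whole image is needed; every keypoint then reads its regional feature from the same map.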

02 Methods

Aggregation

The raw local feature, the geometric context, and the visual context are combined by element-wise summation.

02 Methods

Aggregation

The result is a 128-d feature.
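The aggregation step is small enough to write out. The sketch below sums three K × 128 arrays element-wise and then l2-normalizes each row; applying the l2-normalization after the summation is an assumption based on the figure legend, and the function name and random inputs are placeholders.

```python
import numpy as np

def aggregate(raw_local, geo_ctx, vis_ctx, eps=1e-12):
    """Fuse three (K, 128) arrays into one (K, 128) descriptor array."""
    fused = raw_local + geo_ctx + vis_ctx               # element-wise summation
    norm = np.linalg.norm(fused, axis=1, keepdims=True)
    return fused / np.maximum(norm, eps)                # unit-length rows (assumed)

rng = np.random.default_rng(0)
K = 4
desc = aggregate(rng.standard_normal((K, 128)),
                 rng.standard_normal((K, 128)),
                 rng.standard_normal((K, 128)))
```

Summation keeps the output at 128 dimensions, so the augmented feature can replace the raw descriptor without changing the matching code.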

03 Ablations

Improvements from visual context

03 Ablations

Improvements from geometric context

03 Ablations

Improvements from cross-modality context

04 Evaluations

SUN3D: indoor scenes

04 Evaluations

YFCC: outdoor scenes

04 Evaluations

Oxford dataset: different image variations

04 Evaluations

3D reconstruction benchmark


04 Evaluations

[Figure: comparison of SIFT, GeoDesc, and Ours on four cases: illumination change, scale or rotation change, perspective change, and indoor scene.]

04 Evaluations

With pose accuracy as the evaluation metric: consistent improvement, but less significant.

After obtaining sufficient matches, what is the next bottleneck for improving image matching?


05 Sparse matching

1) Establish putative matches (nearest-neighbor search/FLANN)

2) Outlier rejection (ratio test/mutual check/GMS)

3) Geometry computation (5-point/8-point algorithm with RANSAC)

4) Non-linear optimization for refinement

Learning-based*

*Yi et al.: Learning to find good correspondences, CVPR’18.
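Steps 1) and 2) of the classical pipeline can be sketched in a few lines of NumPy. This is a brute-force illustration, not FLANN and not the learning-based method: the function name, the ratio threshold of 0.8, and the toy descriptors are assumptions chosen for the example.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8, mutual=True):
    """Nearest-neighbor matching with ratio test and optional mutual check.

    desc_a is (Ka, D), desc_b is (Kb, D) with Kb >= 2.
    Returns an (M, 2) integer array of (index_in_a, index_in_b) pairs.
    """
    # Pairwise Euclidean distances, shape (Ka, Kb).
    d2 = (np.sum(desc_a**2, axis=1)[:, None]
          + np.sum(desc_b**2, axis=1)[None, :]
          - 2.0 * desc_a @ desc_b.T)
    dist = np.sqrt(np.maximum(d2, 0.0))

    # 1) Putative matches: nearest neighbor in b for every descriptor in a.
    order = np.argsort(dist, axis=1)
    rows = np.arange(len(desc_a))
    nn_ab = order[:, 0]

    # 2) Outlier rejection: ratio test against the second-nearest neighbor.
    keep = dist[rows, nn_ab] < ratio * dist[rows, order[:, 1]]

    # 2) Outlier rejection: mutual check (a must also be the nearest neighbor of b).
    if mutual:
        nn_ba = np.argmin(dist, axis=0)
        keep &= nn_ba[nn_ab] == rows

    return np.stack([rows[keep], nn_ab[keep]], axis=1)

rng = np.random.default_rng(0)
desc_b = rng.standard_normal((5, 8))
desc_a = desc_b[[2, 0, 4]] + 1e-3 * rng.standard_normal((3, 8))
matches = match_descriptors(desc_a, desc_b)
```

The surviving index pairs are what step 3) would pass to a RANSAC-based solver.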


05 Sparse matching

Input: putative matches of shape N × 4, where each row vector denotes a correspondence (x, y, x′, y′) of an image pair.

The network predicts a probability vector of shape N × 1 that indicates whether each correspondence is an inlier.

Only inlier matches (and their confidence) are used for solving the two-view geometry.
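One common way to use per-match confidences when solving the two-view geometry is a weighted eight-point algorithm. The sketch below is an illustration under assumptions, not necessarily the method in submission: the function name is hypothetical, the coordinates are assumed to be already normalized, and the weights stand in for the predicted inlier probabilities.

```python
import numpy as np

def weighted_eight_point(matches, weights):
    """Estimate F with [x', y', 1] F [x, y, 1]^T = 0 from weighted matches.

    matches is (N, 4) with rows (x, y, x', y'); weights is (N,), non-negative.
    Returns a 3 x 3 matrix of rank 2 with unit Frobenius norm.
    """
    x, y, xp, yp = matches.T
    ones = np.ones_like(x)
    # One linear constraint per correspondence on the 9 entries of F (row-major).
    A = np.stack([xp * x, xp * y, xp, yp * x, yp * y, yp, x, y, ones], axis=1)
    # Weighted least squares: scale each row by the square root of its weight.
    A = A * np.sqrt(weights)[:, None]
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    u, s, vt = np.linalg.svd(F)
    s[2] = 0.0
    F = u @ np.diag(s) @ vt
    return F / np.linalg.norm(F)
```

A match with weight 0 contributes a zero row and is ignored, which mirrors using only inlier matches; soft weights let confident matches dominate the solution.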

05 Sparse matching

Why is it important?

[Figure: comparison of no outlier rejection, mutual check, and the proposed learning-based method.]

05 Sparse matching

Previous method*

• Adopt a PointNet-like architecture.

• Apply context normalization (instance normalization) on the entire point set to capture global context.

Local context, e.g., piece-wise smoothness (GMS matcher**).

*Yi et al.: Learning to find good correspondences, CVPR’18.

**Bian et al.: GMS: Grid-based Motion Statistics for Fast, Ultra-robust Feature Correspondence, CVPR’17.

05 Sparse matching

Previous method

• Adopt a PointNet-like architecture.

• Apply context normalization (instance normalization) on the entire point set to capture global context.

Proposed

• Learn to establish neighboring relations on unordered, non-Euclidean correspondence sets.

• Build a hierarchical architecture to capture both global and local context.
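Context normalization as used by the previous method can be sketched as instance normalization over the correspondences of one image pair. The function name, the epsilon value, and the random features below are assumptions for illustration; the proposed hierarchical architecture is not reproduced here.

```python
import numpy as np

def context_normalization(features, eps=1e-6):
    """Normalize each channel over all N correspondences; features is (N, C)."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)

rng = np.random.default_rng(0)
feats = rng.standard_normal((2000, 128)) * 3.0 + 5.0
normed = context_normalization(feats)

# Changing a single correspondence shifts the set statistics, so every other
# row of the output changes as well: this is how global context is shared.
feats_perturbed = feats.copy()
feats_perturbed[0] += 100.0
normed_perturbed = context_normalization(feats_perturbed)
```

The statistics are computed over the entire set, so the operation carries global context only; it has no notion of which correspondences are neighbors, which is the gap the proposed method targets.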

05 Sparse matching

[Figure: comparison of the previous method and the proposed method.]

06 Future work

Evaluation metric

• HPatches: patch verification/matching/retrieval – Reflect the performance in real applications? [1]

• Two-view image matching on pose recovery accuracy – Involve other variables such as RANSAC-based algorithms? [2]

• 3D reconstruction metrics – Involve more variables such as image retrieval, SfM or bundle adjustment? [3]

• The ground truth is often obtained from SfM with a traditional matching pipeline.

Keypoint detection

• Challenging for learning-based methods – clear definition as supervision?

• Pixel-wise, even sub-pixel-wise accuracy – preserve low-level details after multiple convolutions?

• Combine the advantages of both human priors and learned priors.

[1] Balntas et al.: HPatches: A benchmark and evaluation of handcrafted and learned local descriptors, CVPR’17.

[2] Yi et al.: Learning to find good correspondences, CVPR’18.

[3] Schönberger et al.: Comparative Evaluation of Hand-Crafted and Learned Local Features, CVPR’17.


Thanks!

Code available at: https://github.com/lzx551402/contextdesc