36
CS664 Computer Vision 10. Stereo Dan Huttenlocher

CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

CS664 Computer Vision

10. Stereo

Dan Huttenlocher

Page 2: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

2

Stereo Matching

Given two or more images of the same scene or object, compute a representation of its shape

Some applications

Page 3: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

3

Face modeling

From one stereo pair to a 3D head model

[Frederic Deverney, INRIA]

Page 4: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

4

Z-keying: Mix Live and Synthetic

Takeo Kanade, CMU (Stereo Machine)

Page 5: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

5

View Interpolation

Spline-based depth map

input depth image novel view

[Szeliski & Kang ‘95]

Page 6: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

6

Stereo Matching

Given two or more images of the same scene or object, compute a representation of its shape

Some possible representations– Depth maps

– Volumetric models

– 3D surface models

– Planar (or offset) layers

Page 7: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

7

Stereo Matching

Possible algorithms– Match “interest points” and interpolate– Match edges and interpolate– Match all pixels with windows (coarse-fine)– Optimization:

• Iterative updating•Dynamic programming•Energy minimization (regularization, stochastic)

•Graph algorithms

Page 8: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

8

Outline

Image rectificationMatching criteriaLocal algorithms (aggregation)– Iterative updating

Optimization algorithms:– Energy (cost) formulation & Markov Random

Fields– Mean-field, stochastic, and graph algorithms

Page 9: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

9

Stereo: epipolar geometry

Match features along epipolar lines

viewing rayviewing rayepipolar planeepipolar plane

epipolar lineepipolar line

Page 10: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

10

Stereo: Recall Epipolar geometry

For two images (or images with collinear camera centers), can find epipolar linesEpipolar lines are the projection of the pencil of planes passing through the centers

Rectification: warping the input images (perspective transformation) so that epipolar lines are horizontal

Page 11: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

11

Rectification

Project each image onto same plane, which is parallel to the epipoleResample lines (and shear/stretch) to place lines in correspondence, and minimize distortion

Page 12: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

12

Rectification

BAD!

Page 13: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

13

Rectification

GOOD!

Page 14: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

14

Choosing the Baseline

What’s the optimal baseline?– Too small: large depth error– Too large: difficult search problem

Large BaselineLarge Baseline Small BaselineSmall Baseline

Page 15: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

15

Matching Criteria

Raw pixel values (correlation)Band-pass filtered images [Jones & Malik 92]“Corner” like features [Zhang, …]Edges [Many 1980’s methods…]Gradients [Seitz 89; Scharstein 94]Rank statistics [Zabih & Woodfill 94]Slanted surfaces [Birchfield & Tomasi 99]

Page 16: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

16

Finding Correspondences

Apply feature matching criterion (e.g., correlation) at all pixels simultaneouslySearch only over epipolar lines (many fewer candidate positions)

Page 17: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

17

Block Based Matching

How to determine correspondences?

– Block matching or SSD (sum squared differences)

d is the disparity (horizontal motion)

How big should neighborhood be?

Page 18: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

18

Effects of Block Size

Smaller neighborhood: more detailsLarger neighborhood: fewer isolated mistakes

w = 3 w = 20

Page 19: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

19

Plane Sweep Stereo

Sweep family of planes through volume

– each plane defines an image ⇒ composite homography

virtual cameravirtual camera

compositecompositeinput imageinput image

← projectiveprojective rere--sampling of (sampling of (X,Y,ZX,Y,Z))

Page 20: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

20

Plane Sweep Stereo

For each depth plane– Compute composite (mosaic) image — mean

– Compute error image — variance– Convert to confidence and aggregate spatially

Select winning depth at each pixel

Page 21: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

21

Plane Sweep Stereo

Re-order (pixel / disparity) evaluation loops

for every pixel, for every disparityfor every disparity for every pixel

compute cost compute cost

Page 22: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

22

Stereo Matching Framework

For every disparity, compute raw matching costs

Robust cost functions– Occlusions, other outliers

Combine with spatial coherence or consistency

Page 23: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

23

Stereo Matching Framework

Aggregate costs spatially

Can use box filter(efficient moving averageimplementation)Can also use weighted average,[non-linear] diffusion…

Page 24: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

24

Stereo Matching Framework

Choose winning disparity at each pixel

Interpolate to sub-pixel accuracy

d

E(d)

d*

Page 25: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

25

Traditional Stereo Matching

Advantages:– Detailed surface estimates– Fast algorithms using moving averages– Sub-pixel disparity estimates and confidence

Limitations:– Narrow baseline ⇒ noisy estimates

– Fails in textureless areas– Gets confused near occlusion boundaries

Page 26: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

26

Stereo with Non-Linear Diffusion

Problem with traditional approach:– Gets confused near discontinuities

Another approach:– Use iterative (non-linear) aggregation to

obtain better estimate– Turns out to be provably equivalent to mean-

field estimate of Markov Random Field

Page 27: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

27

Linear Diffusion

Average energy with neighbors + starting value

window diffusion

Page 28: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

28

Feature-Based Stereo

Match “corner” (interest) points

Interpolate complete solution

Page 29: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

29

Data Interpolation

Given a sparse set of 3D points, how do we interpolate to a full 3D surface?Scattered data interpolation [Nielson93]TriangulatePut onto a grid and fill (use pyramid?)Place a kernel function over each data pointMinimize an energy function

Page 30: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

30

Dynamic Programming

1-D cost function

Page 31: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

31

Dynamic Programming

Disparity space image and min. cost path

Page 32: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

32

Dynamic Programming

Sample result(note horizontalstreaks)

[Intille & Bobick]

Page 33: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

33

Dynamic Programming

Can we apply this trick in 2D as well?

dx,ydx-1,y

dx,y-1dx-1,y-1

No: dx,y-1 and dx-1,y may depend on different values of dx-1,y-1

Page 34: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

34

Graph Cuts

Solution technique for general 2D problem

Page 35: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

35

Formulate as statistical inference problemPrior model pP(d)Measurement model pM(IL, IR| d)Posterior model

pM(d | IL, IR) ∝ pP(d) pM(IL, IR| d)Maximum a Posteriori (MAP estimate):

maximize pM(d | IL, IR)

Bayesian Inference

Page 36: CS664 Computer Vision 10. Stereo - Cornell University · 2008-02-28 · Stereo Matching Framework Choose winning disparity at each pixel Interpolate to sub-pixel accuracy d E(d) d*

36

Probability distribution on disparity field d(x,y)

Enforces smoothness or coherence on field

Markov Random Field