41
CS4501: Introduction to Computer Vision Dense Stereo and Epipolar Geometry Various slides from previous courses by: D.A. Forsyth (Berkeley / UIUC), I. Kokkinos (Ecole Centrale / UCL). S. Lazebnik (UNC / UIUC), S. Seitz (MSR / Facebook), J. Hays (Brown / Georgia Tech), A. Berg (Stony Brook / UNC), D. Samaras (Stony Brook) . J. M. Frahm (UNC), V. Ordonez (UVA), Steve Seitz (UW).

CS4501: Introduction to Computer Vision Dense Stereo and

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CS4501: Introduction to Computer Vision Dense Stereo and

CS4501: Introduction to Computer VisionDense Stereo and

Epipolar Geometry

Various slides from previous courses by: D.A. Forsyth (Berkeley / UIUC), I. Kokkinos (Ecole Centrale / UCL). S. Lazebnik (UNC / UIUC), S. Seitz (MSR / Facebook), J. Hays (Brown / Georgia Tech), A. Berg (Stony Brook / UNC), D. Samaras (Stony Brook) . J. M. Frahm (UNC), V. Ordonez (UVA), Steve Seitz (UW).

Page 2: CS4501: Introduction to Computer Vision Dense Stereo and

• Camera Calibration• Stereo Vision

Last Class

Page 3: CS4501: Introduction to Computer Vision Dense Stereo and

• Stereo Vision – Dense Stereo• More on Epipolar Geometry

Today’s Class

Page 4: CS4501: Introduction to Computer Vision Dense Stereo and

Camera Calibration

• What does it mean?

Page 5: CS4501: Introduction to Computer Vision Dense Stereo and

Slide Credit: Silvio Saverese

Recall the Projection matrix

[ ]XtRKx = x: Image Coordinates: (u,v,1)K: Intrinsic Matrix (3x3)R: Rotation (3x3) t: Translation (3x1)X: World Coordinates: (X,Y,Z,1)

Ow

iw

kw

jwR,T

Page 6: CS4501: Introduction to Computer Vision Dense Stereo and

Recall the Projection matrix

[ ]XtRKx =

! =

#[%'] =

Page 7: CS4501: Introduction to Computer Vision Dense Stereo and

Recall the Projection matrix

[ ]XtRKx =

! =

#[%'] = Goal: Find )

Page 8: CS4501: Introduction to Computer Vision Dense Stereo and

Camera Calibration

[ ]XtRKx =

! =

# =

Page 9: CS4501: Introduction to Computer Vision Dense Stereo and

Camera Calibration

[ ]XtRKx =

! =

# =$[&(]Goal: Find

Page 10: CS4501: Introduction to Computer Vision Dense Stereo and

Calibrating the CameraUse an scene with known geometry

• Correspond image points to 3d points• Get least squares solution (or non-linear solution)

úúúú

û

ù

êêêê

ë

é

úúú

û

ù

êêê

ë

é=

úúú

û

ù

êêê

ë

é

134333231

24232221

14131211

ZYX

mmmmmmmmmmmm

ssvsu

Known 3d locations

Known 2d image coords

Unknown Camera Parameters

Slide Credit: James Hays

Page 11: CS4501: Introduction to Computer Vision Dense Stereo and

How do we calibrate a camera?

312.747 309.140 30.086305.796 311.649 30.356307.694 312.358 30.418310.149 307.186 29.298311.937 310.105 29.216311.202 307.572 30.682307.106 306.876 28.660309.317 312.490 30.230307.435 310.151 29.318308.253 306.300 28.881306.650 309.301 28.905308.069 306.831 29.189309.671 308.834 29.029308.255 309.955 29.267307.546 308.613 28.963311.036 309.206 28.913307.518 308.175 29.069309.950 311.262 29.990312.160 310.772 29.080311.988 312.709 30.514

880 21443 203

270 197886 347745 302943 128476 590419 214317 335783 521235 427665 429655 362427 333412 415746 351434 415525 234716 308602 187

Known 3d locations

Known 2d image coords

Slide Credit: James Hays

Page 12: CS4501: Introduction to Computer Vision Dense Stereo and

úúúú

û

ù

êêêê

ë

é

úúú

û

ù

êêê

ë

é=

úúú

û

ù

êêê

ë

é

134333231

24232221

14131211

ZYX

mmmmmmmmmmmm

ssvsu

14131211 mZmYmXmsu +++=

24232221 mZmYmXmsv +++=

34333231 mZmYmXms +++=

Known 3d locations

Known 2d image coords

Unknown Camera Parameters

34333231

14131211

mZmYmXmmZmYmXmu

++++++

=

34333231

24232221

mZmYmXmmZmYmXmv

++++++

=

Slide Credit: James Hays

Page 13: CS4501: Introduction to Computer Vision Dense Stereo and

úúúú

û

ù

êêêê

ë

é

úúú

û

ù

êêê

ë

é=

úúú

û

ù

êêê

ë

é

134333231

24232221

14131211

ZYX

mmmmmmmmmmmm

ssvsu

Known 3d locations

Known 2d image coords

Unknown Camera Parameters

34333231

14131211

mZmYmXmmZmYmXmu

++++++

=

34333231

24232221

mZmYmXmmZmYmXmv

++++++

=

1413121134333231 )( mZmYmXmumZmYmXm +++=+++

2423222134333231 )( mZmYmXmvmZmYmXm +++=+++

1413121134333231 mZmYmXmumuZmuYmuXm +++=+++

2423222134333231 mZmYmXmvmvZmvYmvXm +++=+++

Slide Credit: James Hays

Page 14: CS4501: Introduction to Computer Vision Dense Stereo and

úúúú

û

ù

êêêê

ë

é

úúú

û

ù

êêê

ë

é=

úúú

û

ù

êêê

ë

é

134333231

24232221

14131211

ZYX

mmmmmmmmmmmm

ssvsu

Known 3d locations

Known 2d image coords

Unknown Camera Parameters

1413121134333231 mZmYmXmumuZmuYmuXm +++=+++

2423222134333231 mZmYmXmvmvZmvYmvXm +++=+++

umuZmuYmuXmmZmYmXm 34333231141312110 ----+++=vmvZmvYmvXmmZmYmXm 34333231242322210 ----+++=

Slide Credit: James Hays

Page 15: CS4501: Introduction to Computer Vision Dense Stereo and

úúúú

û

ù

êêêê

ë

é

úúú

û

ù

êêê

ë

é=

úúú

û

ù

êêê

ë

é

134333231

24232221

14131211

ZYX

mmmmmmmmmmmm

ssvsu

Known 3d locations

Known 2d image coords

Unknown Camera Parameters

umuZmuYmuXmmZmYmXm 34333231141312110 ----+++=vmvZmvYmvXmmZmYmXm 34333231242322210 ----+++=

• Method 1 – homogeneous linear system. Solve for m’s entries using linear least squares

úúúúúú

û

ù

êêêêêê

ë

é

=

úúúúúúúúúúúúúúúúú

û

ù

êêêêêêêêêêêêêêêêê

ë

é

úúúúúú

û

ù

êêêêêê

ë

é

--------

--------

00

00

1000000001

1000000001

34

33

32

31

24

23

22

21

14

13

12

11

1111111111

1111111111

!!

mmmmmmmmmmmm

vZvYvXvZYXuZuYuXuZYX

vZvYvXvZYXuZuYuXuZYX

nnnnnnnnnn

nnnnnnnnnn

[U, S, V] = svd(A);M = V(:,end);M = reshape(M,[],3)';

Slide Credit: James Hays

Page 16: CS4501: Introduction to Computer Vision Dense Stereo and

• Method 2 – nonhomogeneous linear system. Solve for m’s entries using linear least squares

úúúúúú

û

ù

êêêêêê

ë

é

=

úúúúúúúúúúúúúúú

û

ù

êêêêêêêêêêêêêêê

ë

é

úúúúúú

û

ù

êêêêêê

ë

é

------

------

n

n

nnnnnnnnn

nnnnnnnnn

vu

vu

mmmmmmmmmmm

ZvYvXvZYXZuYuXuZYX

ZvYvXvZYXZuYuXuZYX

!!1

1

33

32

31

24

23

22

21

14

13

12

11

111111111

111111111

1000000001

1000000001

Ax=b form

M = A\Y;M = [M;1];M = reshape(M,[],3)';

úúúú

û

ù

êêêê

ë

é

úúú

û

ù

êêê

ë

é=

úúú

û

ù

êêê

ë

é

134333231

24232221

14131211

ZYX

mmmmmmmmmmmm

ssvsu

Known 3d locations

Known 2d image coords

Unknown Camera Parameters Slide Credit: James Hays

Page 17: CS4501: Introduction to Computer Vision Dense Stereo and

Can we factorize M back to K [R | T]?

• Yes!• You can use RQ factorization (note – not the more familiar QR

factorization). R (right diagonal) is K, and Q (orthogonal basis) is R. T, the last column of [R | T], is inv(K) * last column of M.

• But you need to do a bit of post-processing to make sure that the matrices are valid. See http://ksimek.github.io/2012/08/14/decompose/

Credit: James Hays

= " #%

Page 18: CS4501: Introduction to Computer Vision Dense Stereo and

Vicente OrdonezUniversity of Virginia

Stereo:Epipolar geometry

Slides by Kristen Grauman

Page 19: CS4501: Introduction to Computer Vision Dense Stereo and

Why multiple views?• Structure and depth are inherently ambiguous from single views.

Optical center

P1P2

P1’=P2’

[ ]XtRKx =

Page 20: CS4501: Introduction to Computer Vision Dense Stereo and

Estimating depth with stereo

• Stereo: shape from “motion” between two views• We’ll need to consider:• Info on camera pose (“calibration”)• Image point correspondences

scene point

optical center

image plane

Page 21: CS4501: Introduction to Computer Vision Dense Stereo and

Potential matches for x have to lie on the corresponding line l’.

Potential matches for x’ have to lie on the corresponding line l.

Key idea: Epipolar constraint

x x’

X

x’

X

x’

X

Page 22: CS4501: Introduction to Computer Vision Dense Stereo and

• Epipolar Plane – plane containing baseline (1D family)

• Epipoles= intersections of baseline with image planes = projections of the other camera center

• Baseline – line connecting the two camera centers

Epipolar geometry: notationX

x x’

Page 23: CS4501: Introduction to Computer Vision Dense Stereo and

• Epipolar Lines - intersections of epipolar plane with imageplanes (always come in corresponding pairs)

Epipolar geometry: notationX

x x’

• Epipolar Plane – plane containing baseline (1D family)

• Epipoles= intersections of baseline with image planes = projections of the other camera center

• Baseline – line connecting the two camera centers

Page 24: CS4501: Introduction to Computer Vision Dense Stereo and

Example: Converging cameras

Page 25: CS4501: Introduction to Computer Vision Dense Stereo and

Geometry for a simple stereo system

• First, assuming parallel optical axes, known camera parameters (i.e., calibrated cameras):

Page 26: CS4501: Introduction to Computer Vision Dense Stereo and

Simplest Case: Parallel images

• Image planes of cameras are parallel to each other and to the baseline

• Camera centers are at same height• Focal lengths are the same• Then epipolar lines fall along the

horizontal scan lines of the images

Page 27: CS4501: Introduction to Computer Vision Dense Stereo and
Page 28: CS4501: Introduction to Computer Vision Dense Stereo and

baseline

optical center (left)

optical center (right)

Focal length

World point

image point (left)

image point (right)

Depth of p

Page 29: CS4501: Introduction to Computer Vision Dense Stereo and

• Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is expression for Z?

Similar triangles (pl, P, pr) and (Ol, P, Or):

Geometry for a simple stereo system

ZT

fZxxT rl =

--+

lr xxT

fZ-

=disparity

Page 30: CS4501: Introduction to Computer Vision Dense Stereo and

Depth from disparity

image I(x,y) image I´(x´,y´)Disparity map D(x,y)

(x´,y´)=(x+D(x,y), y)

So if we could find the corresponding points in two images, we could estimate relative depth…

Page 31: CS4501: Introduction to Computer Vision Dense Stereo and

Matching cost

disparity

Left Right

scanline

Correspondence search

• Slide a window along the right scanline and compare contents of that window with the reference window in the left image

• Matching cost: SSD or normalized correlation

Page 32: CS4501: Introduction to Computer Vision Dense Stereo and

Left Right

scanline

Correspondence search

SSD

Page 33: CS4501: Introduction to Computer Vision Dense Stereo and

Left Right

scanline

Correspondence search

Norm. corr

Page 34: CS4501: Introduction to Computer Vision Dense Stereo and

Basic stereo matching algorithm

• If necessary, rectify the two stereo images to transform epipolar lines into scanlines

• For each pixel x in the first image• Find corresponding epipolar scanline in the right image• Examine all pixels on the scanline and pick the best match x’• Compute disparity x–x’ and set depth(x) = B*f/(x–x’)

Page 35: CS4501: Introduction to Computer Vision Dense Stereo and

Failures of correspondence search

Textureless surfaces Occlusions, repetition

Non-Lambertian surfaces, specularities

Page 36: CS4501: Introduction to Computer Vision Dense Stereo and

Effect of window size

• Smaller window+ More detail• More noise

• Larger window+ Smoother disparity maps• Less detail

W = 3 W = 20

Page 37: CS4501: Introduction to Computer Vision Dense Stereo and

Results with window search

Window-based matching Ground truth

Data

Page 38: CS4501: Introduction to Computer Vision Dense Stereo and

Better methods exist...

Graph cuts Ground truth

For the latest and greatest: http://www.middlebury.edu/stereo/

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001

Page 39: CS4501: Introduction to Computer Vision Dense Stereo and

When cameras are not aligned:Stereo image rectification

• Reproject image planes onto a common• plane parallel to the line between optical centers• Pixel motion is horizontal after this transformation• Two homographies (3x3 transform), one for each input image

reprojection

•C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.

Page 40: CS4501: Introduction to Computer Vision Dense Stereo and

Rectification example

Page 41: CS4501: Introduction to Computer Vision Dense Stereo and

Questions?

41