Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Structure from Motion
Read Chapter 7 in Szeliski’s book
Video → 3D model
3D fro
2
Recorded at archaeological site of Sagalassos in Turkey
accuracy ~1/500 from DV video (i.e. 140kb jpegs 576x720
Schedule (tentative)
3
# date topic 1 Sep.19 Introduction and geometry 2 Sep.26 Camera models and calibration 3 Oct.3 Multiple-view geometry (Richard Hartley) 4 Oct.10 Invariant features 5 Oct.7 Model fitting (RANSAC, EM, …) 6 Oct.24 Segmentation (Jianbo Shi)
7 Oct.31 Stereo Matching 8 Nov.7 Shape from Silhouettes 9 Nov.14 Structure from motion 10 Nov.21 Optical Flow 11 Nov.28 Tracking (Kalman, particle filter) 12 Dec.5 Object recognition 13 Dec.12 Object category recognition 14 Dec.19 Applications and demos
Today’s class
Structure from motion
– factorization – sequential
– bundle adjustment
Factorization
Factorise observations in structure of the scene and motion/calibration of the camera
Use all points in all images at the same time
ü Affine factorisation ü Projective factorisation
Affine camera The affine projection equations are
1
⎥⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢
⎣
⎡
⎥⎦
⎤⎢⎣
⎡=⎥
⎦
⎤⎢⎣
⎡
j
j
j
yi
xi
ij
ij
ZYX
PP
yx
10001
1 ⎥
⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢
⎣
⎡
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
=
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
j
j
j
yi
xi
ij
ij
ZYX
PP
yx
~~
4
4
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
⎥⎦
⎤⎢⎣
⎡=⎥
⎦
⎤⎢⎣
⎡=
⎥⎥⎦
⎤
⎢⎢⎣
⎡
−
−
j
j
j
yi
xi
ij
ijyiij
xiij
ZYX
PP
yx
PyPx
how to find the origin? or for that matter a 3D reference point?
affine projection preserves center of gravity
∑−=i
ijijij xxx~ ∑−=i
ijijij yyy~
Orthographic factorization The orthographic projection equations are
where njmijiij ,...,1,,...,1,Mm === P
All equations can be collected for all i and j
where
[ ]n
mmnmm
n
n
M,...,M,M,,
mmm
mmmmmm
212
1
21
22221
11211
=
⎥⎥⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢⎢
⎣
⎡
=
⎥⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢
⎣
⎡
= M
P
PP
Pm
MPm =
M ~~
m⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
=⎥⎦
⎤⎢⎣
⎡=⎥
⎦
⎤⎢⎣
⎡=
j
j
j
jyi
xi
iij
ijij
ZYX
,PP,
yx
P
Note that P and M are resp. 2mx3 and 3xn matrices and therefore the rank of m is at most 3
(Tomasi Kanade’92)
Orthographic factorization Factorize m through singular value decomposition
An affine reconstruction is obtained as follows
TVUm Σ=
TVMUP Σ==~,~
(Tomasi Kanade’92)
[ ]n
mmnmm
n
n
M,...,M,M
mmm
mmmmmm
min 212
1
21
22221
11211
⎥⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢
⎣
⎡
−⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
P
PP
Closest rank-3 approximation yields MLE!
A metric reconstruction is obtained as follows
Where A is computed from
Orthographic factorization Factorize m through singular value decomposition
An affine reconstruction is obtained as follows
TVUm Σ=
TVMUP Σ==~,~
MAMAPP ~,~ 1 == −
0
1
1
=
=
=
T
T
T
yi
xi
yi
yi
xi
xi
PP
PP
PP
(Tomasi Kanade’92)
0~~1~~1~~
1
1
1
=
=
=
−−
−−
−−
TT
TT
TT
yi
xi
yi
yi
xi
xi
PP
PP
PP
AA
AA
AA
A metric reconstruction is obtained as follows
Where A is computed from
Orthographic factorization Factorize m through singular value decomposition
An affine reconstruction is obtained as follows
TVUm Σ=
TVMUP Σ==~,~
MAMAPP ~,~ 1 == −
(Tomasi Kanade’92)
0~~1~~1~~
=
=
=
T
T
T
yi
xi
yi
yi
xi
xi
PP
PP
PP
C
C
C
A metric reconstruction is obtained as follows
Where A is computed from
Orthographic factorization Factorize m through singular value decomposition
An affine reconstruction is obtained as follows
TVUm Σ=
TVMUP Σ==~,~
MAMAPP ~,~ 1 == −
3 linear equations per view on symmetric matrix C (6DOF)
A can be obtained from C through Cholesky factorisation and inversion
(Tomasi Kanade’92)
Examples
Tomasi Kanade’92, Poelman & Kanade’94
Examples
Tomasi Kanade’92, Poelman & Kanade’94
Examples
Tomasi Kanade’92, Poelman & Kanade’94
Examples
Tomasi Kanade’92, Poelman & Kanade’94
Perspective factorization
The camera equations
for a fixed image i can be written in matrix form as
where
mjmijiijij ,...,1,,...,1,Mmλ === P
MPm iii =Λ
[ ] [ ]( )imiii
mimiii
λ,...,λ,λdiagM,...,M,M , m,...,m,m
21
2121
=Λ== Mm
Perspective factorization
All equations can be collected for all i as
where PMm =
⎥⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢
⎣
⎡
=
⎥⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢
⎣
⎡
Λ
Λ
Λ
=
mnn P
PP
P
m
mm
m...
,...
2
1
22
11
In these formulas m are known, but Λi,P and M are unknown
Observe that PM is a product of a 3mx4 matrix and a 4xn matrix, i.e. it is a rank-4 matrix
Perspective factorization algorithm
Assume that Λi are known, then PM is known. Use the singular value decomposition
PM=UΣ VT In the noise-free case
Σ=diag(σ1,σ2,σ3,σ4,0, … ,0) and a reconstruction can be obtained by setting:
P=the first four columns of UΣ.
M=the first four rows of V.
Iterative perspective factorization
When Λi are unknown the following algorithm can be used:
1. Set λij=1 (affine approximation).
2. Factorize PM and obtain an estimate of P and M. If σ5 is sufficiently small then STOP.
3. Use m, P and M to estimate Λi from the camera equations (linearly) mi Λi=PiM
4. Goto 2.
In general the algorithm minimizes the proximity measure P(Λ,P,M)=σ5
Note that structure and motion recovered up to an arbitrary projective transformation
Further Factorization work
Factorization with uncertainty Factorization for dynamic scenes
(Irani & Anandan, IJCV’’02)
(Costeira and Kanade ‘‘94) (Bregler et al. ‘‘00, Brand ‘‘01)
(Yan and Pollefeys, ‘‘05/’’06)
practical structure and motion recovery from images
Obtain reliable matches using matching or tracking and 2/3-view relations Compute initial structure and motion Refine structure and motion Auto-calibrate Refine metric structure and motion
Initialize Motion (P1,P2 compatibel with F)
Sequential Structure and Motion Computation
Initialize Structure (minimize reprojection error)
Extend motion (compute pose through matches seen in 2 or more previous views)
Extend structure (Initialize new structure, refine existing structure)
Computation of initial structure and motion
according to Hartley and Zisserman “this area is still to some extend a
black-art”
All features not visible in all images ⇒ No direct method (factorization not applicable) ⇒ Build partial reconstructions and assemble (more views is more stable, but less corresp.)
1) Sequential structure and motion recovery
2) Hierarchical structure and motion recovery
Sequential structure and motion recovery
Initialize structure and motion from two views For each additional view
– Determine pose – Refine and extend structure
Determine correspondences robustly by jointly estimating matches and epipolar geometry
Initial structure and motion
[ ][ ][ ]eeaFeP0IP
Tx +=
=
2
1
Epipolar geometry ↔ Projective calibration
012 =FmmT
compatible with F
Yields correct projective camera setup (Faugeras´92,Hartley´92)
Obtain structure through triangulation Use reprojection error for minimization Avoid measurements in projective space
Compute Pi+1 using robust approach (6-point RANSAC) Extend and refine reconstruction
)x,...,X(xPx 11 −= iii
2D-2D
2D-3D 2D-3D
mi mi+1
M
new view
Determine pose towards existing structure
Compute P with 6-point RANSAC
Generate hypothesis using 6 points
Count inliers Projection error
( )( ) ?x,x...,,xXP 11 td iii <−
Back-projection error ( ) ijtd jiij <∀< ?,x,xF Re-projection error ( )( ) td iiii <− x,x,x...,,xXP 11
3D error ( )( ) ?X,xP 3-1
Dii td <Λ
Projection error with covariance ( )( ) td iii <−Λ x,x...,,xXP 11
Expensive testing? Abort early if not promising Verify at random, abort if e.g. P(wrong)>0.95
(Chum and Matas, BMVC’02)
Calibrated structure from motion
Equations more complicated, but less degeneracies For calibrated cameras:
– 5-point relative motion (5DOF) Nister CVPR03
– 3-point pose estimation (6DOF) Haralick et al. IJCV94
D. Nistér, An efficient solution to the five-point relative pose problem, In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), Volume 2, pages. 195-202, 2003.
R. Haralick, C. Lee, K. Ottenberg, M. Nolle. Review and Analysis of Solutions of the Three Point Perspective Pose Estimation Problem. Int’l Journal of Computer Vision, 13, 3, 331-356, 1994.
Linear equations for 5 points
Linear solution space
Non-linear constraints
5-point relative motion
E = xX + yY + zZ + wW
detE = 0 10 cubic polynomials
w = 1scale does not matter, choose
(Nister, CVPR03)
!x1x1 !x1y1 !x11 !y1x1 !y1y1 !y1 x1 y1 1!x2x2 !x2y2 !x21 !y2x2 !y2y2 !y2 x2 y2 1!x3x3 !x3y3 !x31 !y3x3 !y3y3 !y3 x3 y3 1!x4x4 !x4y4 !x41 !y4x4 !y4y4 !y4 x4 y4 1!x5x5 !x5y5 !x51 !y5x5 !y5y5 !y5 x5 y5 1
"
#
$$$$$$$
%
&
'''''''
E11E12E13E21E22E23E31E32E33
"
#
$$$$$$$$$$$$$
%
&
'''''''''''''
= 0
5-point relative motion
Perform Gauss-Jordan elimination on polynomials
-z
-z
-z
[n]represents polynomial of degree n in z
(Nister, CVPR03)
Three points perspective pose – p3p
(Haralick et al., IJCV94)
All techniques yield 4th order polynomial Haralick et al. recommends using Finsterwalder’s technique as it yields the best results numerically
1903 1841
Minimal solvers Lot’s of recent activity using Groebner bases: Henrik Stewénius, David Nistér, Fredrik Kahl, Frederik
Schaffalitzky: A Minimal Solution for Relative Pose with Unknown Focal Length, CVPR 2005.
H. Stewénius, D. Nistér, M. Oskarsson, and K. Åström. Solutions to minimal generalized relative pose problems. Omnivis 2005.
D. Nistér, A Minimal solution to the generalised 3-point pose problem, CVPR 2004
Martin Bujnak, Zuzana Kukelova, Tomás Pajdla: A general solution to the P4P problem for camera with unknown focal length. CVPR 2008.
Brian Clipp, Christopher Zach, Jan-Michael Frahm and Marc Pollefeys, A New Minimal Solution to the Relative Pose of a Calibrated Stereo Camera with Small Field of View Overlap, ICCV 2009.
Zuzana Kukelova, Martin Bujnak, Tomás Pajdla: Automatic Generator of Minimal Problem Solvers. ECCV 2008.
F. Fraundorfer, P. Tanskanen and M. Pollefeys. A Minimal Case Solution to the Calibrated Relative Pose Problem for the Case of Two Known Orientation Angles. ECCV 2010.
…
Minimal relative pose with know vertical
39
-g
Fraundorfer, Tanskanen and Pollefeys, ECCV2010
5 linear unknowns → linear 5 point algorithm 3 unknowns → quartic 3 point algorithm
Vertical direction can often be estimated inertial sensor vanishing point
Incremental Structure from Motion
Photo Tourism
Noah Snavely, Steve Seitz, Rick Szeliski University of Washington
Richard Szeliski
Incremental structure from motion
• Automa�cally select an ini�al pair of images
Incremental structure from motion
Incremental structure from motion
Hierarchical structure and motion recovery
Compute 2-view Compute 3-view Stitch 3-view reconstructions Merge and refine reconstruction
F T
H
PM
Hierarchical structure from motion
Farenzena, Fusiello, and Gherardi, Structure-and-motion Pipeline on a Hierarchical Cluster Tree, 3DIM 2009, October 2009
Richar
ICVSS 2010 45
Hierarchical structure from motion
Richar
ICVSS 2010 46
Hierarchical structure from motion
Richar
ICVSS 2010 47
Stitching 3-view reconstructions
Different possibilities 1. Align (P2,P3)with(P’1,P’2) ( ) ( )-123
-112
HHP',PHP',Pminarg AA dd +
2. Align X,X’ (and C’C’) ( )∑j
jjAd HX',XminargH
3. Minimize reproj. error ( )( )∑
∑+
jjj
jjj
d
d
x',HXP'
x,X'PHminarg 1-
H
4. MLE (merge) ( )∑j
jjd x,PXminargXP,
More issues
Repetition ambiguities
Large scale reconstructions
Marc Pollefeys ICVSS 2010 49
Disambiguating visual relations using loop constraints
50
(Zach et al CVPR‘10)
Exploiting symmetries in SfM
51
Detected symmetry matches
(Cohen et al 2012?)
57
Projective ambiguity
reconstruction from uncalibrated images only ⇒ projective ambiguity on reconstruction
´M´M))((Mm 1 PTPTP === − Similarity (7DOF)
Affine (12DOF)
Projective (15DOF)
58
Euclidean projection matrix
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
=
1yy
xx
cfcsf
K
Factorization of Euclidean projection matrix
Intrinsics:
Extrinsics: ( )t,RNote: every projection matrix can be factorized, but only meaningful for euclidean projection matrices
(camera geometry)
(camera motion)
[ ]tRRKP TT −=
59
The Absolute Quadric
Eliminate extrinsics from equation
TT KKPP =Ω
[ ]tRRK TT − TKR TRK TKK
)1110(diag* =Ω
Equivalent to projection of quadric
))(Ω)((Ω *1* TTTTT PTTTPTPPKK −−==
Absolute Quadric also exists in projective world
Absolute Quadric
T´Ω´´ *PP=Transforming world so that reduces ambiguity to metric
** ΩΩ´ →
2 2 2 0A B C+ + =
0AX BY CZ D+ + + =
60
ω*
Ω*
projection
constraints
Absolute Quadric = calibration object which is always present but can only be observed through constraints on the intrinsics
Tii
Tiii Ωω KKPP == ∗∗
Absolute quadric and self-calibration
Projection equation:
Translate constraints on K through projection equation to constraints on Ω*
Don’t treat all constraints equal
61
Practical linear self-calibration
( ) ( )( )( )( ) 0Ω
0Ω0Ω
0ΩΩ
23T13
T12
T22
T11
T
=
=
=
=−
∗
∗
∗
∗∗
PPPPPP
PPPP(Pollefeys et al., ECCV‘02)
2
* 2
ˆ 0 0ˆ0 0
0 0 1
f
f
⎡ ⎤⎢ ⎥
= Ω ≈ ⎢ ⎥⎢ ⎥⎢ ⎥⎣ ⎦
P PT TKK
1ˆ ≈f( ) ( )( ) ( ) 0ΩΩ
0ΩΩ
33T
22T
33T
11T
=−
=−∗∗
∗∗
PPPPPPPP
(only rough aproximation, but still usefull to avoid degenerate configurations)
(relatively accurate for most cameras)
91
91
1.011.0101.01
2.01
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
=
1yy
xx
cfcsf
K
after normalization!
when fixating point at image-center not only absolute quadric diag(1,1,1,0) satisfies ICCV98 eqs., but also diag(1,1,1,a), i.e. real or imaginary spheres! Recent related work (Gurdjos et al. ICCV‘09)
62
Upgrade from projective to metric
Locate ΩΩ* in projective reconstruction (self-calibration) Problem: not always unique (Critical Motion Sequences)
Transform projective reconstruction to bring ΩΩ* in canonical position
Refining structure and motion
Minimize reprojection error
– Maximum Likelyhood Estimation (if error zero-mean Gaussian noise)
– Huge problem but can be solved efficiently (Bundle adjustment)
( )∑∑= =
m
k
n
iikD
ik 1 1
2
kiM̂,P̂
M̂P̂,mmin
Non-linear least-squares
Newton iteration Levenberg-Marquardt Sparse Levenberg-Marquardt
(P)X f= (P)X argminP
f−
Newton iteration
Taylor approximation
f (P0 +Δ) ≈ f (P0 )+ JΔ J = ∂X∂P
Jacobian
X− f (P1) X− f (P1) ≈ X− f (P0 )− JΔ = e0 − JΔ
⇒ JTJΔ = JTe0 ⇒Δ = JTJ( )-1JTe0
Pi+1 = Pi +Δ Δ = JTJ( )-1JTe0
Δ = JTΣ-1J( )-1JTΣ-1e0
normal eq.
Levenberg-Marquardt
0TT JNJJ e=Δ=Δ
0TJN' e=Δ
Augmented normal equations
Normal equations
J)λdiag(JJJN' TT +=
30 10λ −=
10/λλ :success 1 ii =+
ii λ10λ :failure = solve again
accept
λ small ~ Newton (quadratic convergence)
λ large ~ descent (guaranteed decrease)
Levenberg-Marquardt
Requirements for minimization Function to compute f Start value P0 Optionally, function to compute J (but numerical ok, too)
Richard Szeliski
Bundle Adjustment
What makes this non-linear minimization hard? – many more parameters: potentially slow – poorer conditioning (high correlation) – potentially lots of outliers – gauge (coordinate) freedom
Chained transformations
Think of the optimization as a set of transformations along with their derivatives
Richard Szeliski
Richard Szeliski
Levenberg-Marquardt
Iterative non-linear least squares [Press’92] – Linearize measurement equations
– Substitute into log-likelihood equation: quadratic cost function in ΔΔm
Richard Szeliski
Robust error models
Outlier rejection – use robust penalty applied
to each set of joint measurements
– for extremely bad data, use random sampling [RANSAC, Fischler & Bolles, CACM’81]
Richard Szeliski
Iterative non-linear least squares [Press’92] – Solve for minimum
Hessian: error:
Levenberg-Marquardt
you may have to delete the image and then insert it again.
Chained transformations
Only some derivatives are non-zero: j th camera, i th point
Richard Szeliski
Multi-camera rig
Easy extension of generalized bundler:
Richard Szeliski
Richard Szeliski
Lots of parameters: sparsity
Only a few entries in Jacobian are non-zero
Exploiting sparsity
Schur complement to get reduced camera matrix
Richard Szeliski
Sparse Levenberg-Marquardt
complexity for solving – prohibitive for large problems (100 views 10,000 points ~30,000 unknowns)
Partition parameters – partition A – partition B (only dependent on A and itself)
0T-1 JN' e=Δ3N
Sparse bundle adjustment
residuals: normal equations: with note: tie points should be in partition A
The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been
then insert it again.
The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been
insert it again.
Sparse bundle adjustment
normal equations:
modified normal equations:
solve in two parts:
Sparse bundle adjustment
U1
U2
U3
WT
W
V
P1 P2 P3 M
Jacobian of has sparse block structure
=J == JJN T
12xm 3xn (in general
much larger)
im.pts. view 1
( )( )∑∑= =
m
k
n
iikD
1 1
2ki M̂P̂,m
Needed for non-linear minimization
Sparse bundle adjustment
Eliminate dependence of camera/motion parameters on structure parameters Note in general 3n >> 11m
WT V
U-WV-1WT =×⎥⎦
⎤⎢⎣⎡ − −
NI0WVI 1
11xm 3xn
Allows much more efficient computations
e.g. 100 views,10000 points, solve ±1000x1000, not ±30000x30000
Often still band diagonal use sparse linear algebra algorithms
Sparse bundle adjustment
normal equations:
modified normal equations:
solve in two parts:
Sparse bundle adjustment
Covariance estimation
-1WVY =( )
1a
Tb
1-a
VYYWWVU
−
+
+∑=∑
−=∑
Yaab ∑=∑ -
Direct vs. iterative solvers
How do they work? When should you use them?
Richard Szeliski
Large reduced camera matrices
Use iterative preconditioned conjugate gradient – [Jeong, Nister, Steedly, Szeliski, Kweon, CVPR 2009] – [Agarwal, Snavely, Seitz, and Szeliski, ECCV 2009] – Block diagonal works well – Sometimes partial Cholesky does too – Direct solvers for small problems
Richard Szeliski
Large-scale structure from motion
Richard Szeliski
Large-scale reconstruction
Most of the models shown so far have had ~500 images
How do we scale from 100s to 10,000s of images?
Observation: Internet collections represent very non-uniform samplings of viewpoint
Richard Szeliski
Skeletal graphs for efficient structure from motion
Noah Snavely, Steve Seitz, Richard Szeliski
CVPR’2008
The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted.
appears, you may have to delete the image and then insert it again.
Skeletal set
Goal: select a smaller set of images to reconstruct, without sacrificing quality of reconstruction
Two problems: – How do we measure quality? – How do we find a subset of images with
bounded quality loss?
Skeletal set
Goal: given an image graph , select a small set S of important images to reconstruct, bounding the loss in quality of the reconstruction
Reconstruct the skeletal set S Estimate the remaining images with much faster pose estimation steps
Properties of the skeletal set
Should touch all parts of Dominating set
Should form a single reconstruction Connected dominating set
Should result in an accurate reconstruction ?
Example
Full graph Skeletal graph
Representing information in a graph What kind of information?
– No absolute information about camera positions
– Each edge provides information about the relative positions of two images
– … but not all edges are equally informative We model information with the uncertainty (covariance) in pairwise camera positions
Estimating certainty
Usually measured with covariance matrix
SfM has thousands or millions of parameters We only measure covariance in camera positions (3x3 matrix for every camera)
Pantheon
Full graph Skeletal graph (t=16)
Skeletal reconstruc�on 101 images
A�er adding leaves 579 images
A�er final op�miza�on 579 images
Pantheon
Trafalgar Square
2973 images registered (277 in skeletal set)
Trafalgar Square
Running time
0
20
40
60
80
100
120
140
160
180
Stonehenge St. Peters Pantheon Pisa Trafalgar
Full reconstruction
Skeletal reconstruction
Skeletal reconstruction + BA
(~10 days) (~30 days)
hours
Building Rome in a Day
Sameer Agarwal, Noah Snavely, Ian Simon, Steven M. Seitz,
Richard Szeliski ICCV’2009
The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart your computer, and then
insert it again.
The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart your computer, and then
then insert it again.
Distributed matching pipeline
Query expansion
Results: Dubrovnik
Results: Rome
Changchang Wu’’s SfM code for iconic graph uses 5-point+RANSAC for 2-view initialization uses 3-point+RANSAC for adding views performs bundle adjustment For additional images use 3-point+RANSAC pose estimation
Rome on a cloudless day
(Frahm et al. ECCV 2010) GIST & clustering (1h35)
SIFT & Geometric verification (11h36)
SfM & Bundle (8h35)
Dense Reconstruction (1h58)
Some numbers 1PC 2.88M images 100k clusters 22k SfM with 307k images 63k 3D models Largest model 5700 images Total time 23h53
Structure from motion: limitations
Very difficult to reliably estimate metric structure and motion unless: – large (x or y) rotation or – large field of view and depth variation
Camera calibration important for Euclidean reconstructions Need good feature tracker
Related problems
On-line structure from motion and SLaM (Simultaneous Localization and Mapping) – Kalman filter (linear) – Particle filters (non-linear)
Visual SLAM
Real-time Stereo Visual SLAM
113
(Clipp et al., IROS2010)
Real-Time Stereo Visual SLAM
Marc Pollefeys
(Lim et al., CVPR201
Collaboration with
Open challenges
Large scale structure from motion – Complete building – Complete city
Open source vode available, e.g. Christopher Zach (SfM, Bundle adjustment) http://www.inf.ethz.ch/personal/chzach/opensource.html
Photo Tourism Bundler: http://phototour.cs.washington.edu/bundler/
117
Next week: Optical flow