Visual odometry within the STEPS project
Aldo Cumani
Istituto Nazionale di Ricerca Metrologica
Panoramica INRiM
15 May 2013
A. Cumani (INRiM), Visual odometry, Panoramica INRiM 2013
Outline
1 Introduction
STEPS
Odometry
2 INRiM algorithm
Generic algorithm
Features (visual landmarks)
Motion estimation
3 Results
4 Thank you for your attention
STEPS Regional Project
Sistemi e Tecnologie per l'EsPlorazione Spaziale (Systems and Technologies for Space Exploration)
R&D on space exploration technologies, with the aim of promoting internationally the technological excellence present in the Piedmont region.
Participants
Thales Alenia Space (lead partner), PoliTo, Università di Torino, Università del Piemonte Orientale, ALTEC, INRiM, 24 Piedmontese SMEs
Results (STEPS 1, 2009-2012, 20M)
Enabling technologies and demonstrators (virtual and physical), aimed in particular at developing a system for soft landing (lander) and surface mobility (rover), applicable to missions to the Moon and Mars.
INRiM contribution to STEPS
INRiM took part in Work Package 1B of STEPS, with an overall contribution of about 93 k€ and the following tasks:
Study, development, implementation and testing of a Visual Odometry
method based upon the processing of images taken by a stereo rig
onboard the rover.
Integration of the Visual Odometry algorithm into the STEPS
demonstrator software
Help to other partners in WP1B for the development of
stereovision-based algorithms for 3D reconstruction (DEM building)
Visual Odometry
Odometry
From the Greek ὁδός = road, μέτρον = measure:
computing the length of a vehicle's path
from the number of wheel revolutions.
Visual odometry
Quantitative reconstruction, at least in 2D but
preferably in 3D, of a vehicle's path from
image sequences taken on board the
vehicle itself.
Leonardo's odometer (Codex Atlanticus, 1478-1518)
Visual odometry belongs to the field of so-called
autonomous navigation, i.e. the set of techniques that allow a
mobile robotic platform to move, without intervention by human
operators, in an unstructured environment (for example: automatic
exploration of a planet's surface).
"A fully autonomous robot has the ability to
Gain information about the environment (Rule #1)
Work for an extended period without human intervention (Rule #2)
Move either all or part of itself throughout its operating environment without human assistance (Rule #3)
Avoid situations that are harmful to people, property, or itself unless those are part of its design specifications (Rule #4)"
(Wikipedia, Autonomous robot)
Why VO?
Wheel odometry is not accurate (slippage...)
In any case it cannot give a 3D estimate of the trajectory, unlike
vision-based methods
How is it done?
To estimate egomotion with vision methods,
the scene must contain (nearby) reference points
that are visible both before and after the displacement
Consequently, all visual odometry algorithms
are incremental: the robot trajectory results
from the sum of many elementary displacements,
each estimated in some way from the images taken
before and after the displacement
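The incremental principle can be illustrated with a minimal planar sketch. It is not the INRiM implementation: the function names `compose` and `integrate` and the 2D pose format (x, y, heading) are assumptions made for illustration, and a real system would compose 3D rototranslations estimated from images.

```python
import math

def compose(pose, step):
    """Apply a step (dx, dy, dtheta), expressed in the robot frame,
    to a global planar pose (x, y, theta)."""
    x, y, th = pose
    dx, dy, dth = step
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

def integrate(steps, start=(0.0, 0.0, 0.0)):
    """Accumulate elementary displacements into a list of global poses."""
    poses = [start]
    for step in steps:
        poses.append(compose(poses[-1], step))
    return poses

# Hypothetical data: four 1 m forward moves, each followed by a 90 degree
# left turn, so the path is a closed square.
square = [(1.0, 0.0, math.pi / 2)] * 4
trajectory = integrate(square)
```

Because every pose is built on the previous one, an error in any single step estimate propagates to all later poses.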
How many cameras?
one (monocular vision): feasible, but
monocular vision does not provide the
scale factor ⇒ it must be obtained
by other means (e.g. by observing landmarks
of known size) or from other
sensors (e.g. wheel odometry)
two or more (stereo or multi-camera): OK,
two (calibrated) cameras allow
estimating both the 3D structure
of the environment and the motion of the rover
within it, including the scale
factor
(Almost) generic Stereo Visual Odometry Algorithm
Feature extraction and tracking: At suitably
spaced keyframes, visual landmarks are
extracted and matched (left-right and to the
previous keyframe).
Motion estimation: The relative motion of the
rover between the current keyframe and the
previous one is estimated from the matched
features.
Possible loop closure correction: Features and
pose estimates are periodically saved. When the
rover believes it is near a saved position,
the observed features are compared to the stored
ones, and a pose correction is computed if possible.
Features (visual landmarks)
What kind of features?
Point (2 linear image coordinates)
Line (1 linear, 1 angular coord)
Line segment (4 linear... but unreliable!)
Other...
In an unstructured environment like the surface of Mars or the Moon, the only
reasonable choice is point features.
Obviously, two coordinates (x, y) are not enough to identify a
feature: we need a descriptor of the image behaviour around the given
(x, y) in order to compare points in different images.
Point features
Detection
Localisation (x, y) of visually salient points in the image, and determination
of the apparent size (scale σ) of the visual feature. Detectors typically
search image space (x, y) and scale space (σ) for extrema of some local
operator (Harris, Hessian, Laplacian etc.)
Description
Compact representation of the image behaviour around the detected point:
N values → point in E^N → similarity from Euclidean distance. Descriptors
encode the behaviour of the luminance in a region around (x, y) of size
∝ σ into N parameters from some transformation (e.g. wavelets)
Feature matching
Matching relies essentially on a similarity measure based on the
distance between descriptor vectors in feature space. A nearest-neighbor-ratio
approach is used, with a bidirectional matching check.
for stereo matching, positive disparity and epipolar constraints can be
used to restrict the search for matches
for tracking, blind matching is used, although previous knowledge
about 3D structure and predicted rover motion could be used for
restricting the search areas
blind matching is also needed for cyclic corrections (in this case,
relative motion is quite unreliable)
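A nearest-neighbor-ratio test with a bidirectional check can be sketched as follows. This is a brute-force illustration under stated assumptions: the function names, the toy 2D descriptors and the ratio threshold of 0.7 are hypothetical, and the stereo constraints mentioned above (positive disparity, epipolar geometry) are not applied.

```python
import math

def dist(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def ratio_match(src, dst, ratio=0.7):
    """For each descriptor in src, return the index of its nearest neighbour
    in dst, or None when the nearest is not clearly better than the second."""
    if not dst:
        return [None] * len(src)
    matches = []
    for d in src:
        order = sorted(range(len(dst)), key=lambda j: dist(d, dst[j]))
        best = order[0]
        ambiguous = (len(order) > 1 and
                     dist(d, dst[best]) > ratio * dist(d, dst[order[1]]))
        matches.append(None if ambiguous else best)
    return matches

def bidirectional_matches(a, b, ratio=0.7):
    """Keep only the pairs (i, j) that are matched in both directions."""
    ab = ratio_match(a, b, ratio)
    ba = ratio_match(b, a, ratio)
    return [(i, j) for i, j in enumerate(ab) if j is not None and ba[j] == i]

# Toy descriptors: the third left descriptor has two equally close candidates
# on the right, so the ratio test rejects it.
left = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
right = [(0.1, 0.0), (10.0, 0.2), (5.0, 4.9), (5.0, 5.1)]
pairs = bidirectional_matches(left, right)
```

Here `pairs` keeps only the two unambiguous correspondences. A practical implementation would replace the exhaustive sort with a faster nearest-neighbour search.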
Choosing the right features
Detectors
Harris-Laplace detector (harlap)
Hessian-Laplace detector (heslap)
Harris-Affine detector (haraff)
Hessian-Affine detector (hesaff)
Harris-Hessian-Laplace detector (harhes)
Edge-Laplace detector (sedgelap)
Descriptors
Freeman's steerable filters (jla)
Lowe's Scale Invariant Feature Transform (sift)
Gradient Location-Orientation Histogram (extended SIFT) (gloh)
Van Gool's moment invariants (mom)
Spin image (spin)
Cross-correlation of image patches (cc)
Speeded Up Robust Features (SURF) (Bay, Tuytelaars, Van Gool, 9th ECCV, 2006)
Fast CVL features
CVL Features
Simplified Speeded Up Robust Features (SURF) (Bay, Tuytelaars, Van Gool, 9th ECCV, 2006):
detector: pixel-space and scale-space maxima of the normalized Hessian
\[ H(\sigma) = \sigma^2 \left( I_{xx}(\sigma)\, I_{yy}(\sigma) - D(\sigma)\, I_{xy}^2(\sigma) \right) \]
descriptor: normalized 8 × 8 resampled luminance in a 10σ area around the maximum
Fast detection (as in SURF) using integral image and box filter
approximations for derivatives
[Figure: Discretised and cropped Gaussian (σ = 1.2) second derivative filter masks in yy and xy (left), and their box approximations (right) (from Bay et al. 2006)]
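The reason box filters are cheap on an integral image can be shown with a small sketch. It is a generic illustration, not the CVL code: the function names and the 3 × 3 test image are assumptions. Once the integral image is built, the sum over any axis-aligned box takes four lookups, whatever the box size.

```python
def integral_image(img):
    """Return S with one extra zero row and column, where S[r][c] is the
    sum of all pixels img[i][j] with i < r and j < c."""
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            S[r + 1][c + 1] = S[r][c + 1] + row_sum
    return S

def box_sum(S, r0, c0, r1, c1):
    """Sum of the pixels in rows r0..r1-1 and columns c0..c1-1."""
    return S[r1][c1] - S[r0][c1] - S[r1][c0] + S[r0][c0]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
S = integral_image(img)
whole = box_sum(S, 0, 0, 3, 3)         # all nine pixels
lower_right = box_sum(S, 1, 1, 3, 3)   # pixels 5, 6, 8, 9
```

A box approximation of a second-derivative mask is then a weighted combination of a few such box sums.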
CVL Features 2
Fast computation of descriptors, again using the integral image:
Drawbacks: invariant to translation, scale and affine illumination
changes, but NOT invariant to rotation and uneven scaling
But much faster:
             points   time (ms)   pts/frame   good pts
SURF           1474        1605        2451        155
U-SURF         1474         808        2451        204
New method     1448         391        2401        151
Performance comparison of new method and SURF
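Invariance to affine illumination changes can be obtained by normalizing the sampled luminance. The slides do not state which normalization CVL uses, so the sketch below assumes a zero-mean, unit-norm scheme; `normalize` and the four-sample patch are hypothetical.

```python
import math

def normalize(patch):
    """Return a zero-mean, unit-norm copy of a flattened luminance patch."""
    mean = sum(patch) / len(patch)
    centered = [v - mean for v in patch]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0.0:
        return centered  # flat patch: nothing to scale
    return [v / norm for v in centered]

patch = [10.0, 20.0, 35.0, 50.0]
# Same patch under an affine illumination change (gain 1.5, offset 12).
brighter = [1.5 * v + 12.0 for v in patch]

d1 = normalize(patch)
d2 = normalize(brighter)
max_diff = max(abs(a - b) for a, b in zip(d1, d2))
```

Subtracting the mean cancels the offset and dividing by the norm cancels any positive gain, so `d1` and `d2` agree up to rounding.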
Synthetic world
Mars-like environment simulated
by POV-Ray
advantage: ground truth rover
pose and link between rover and
world coordinates are known
with absolute accuracy
[Figure: the two simulated rover paths, labelled "circular path" and "waving path"]
Synthetic world results I (circular path, various detectors)
[Figure: two plots with curves for cvl, surf, heslap/sift, haraff/sift, harlap/sift, hesaff/sift, harhes/sift and sedgelap/sift]
Simulated circular path. Left: position error (m), right: rotation error (rad), vs. path length (m) for various detectors.
Synthetic world results - some statistics
[Figure: bar charts for the circular path and the waving path, showing step (0..20), total (0..5000), 4-match (0..500) and inliers (0..250) for haraff/cc, haraff/gloh, haraff/jla, haraff/mom, haraff/sift, haraff/spin, harhes/sift, harlap/sift, hesaff/sift, heslap/sift, sedgelap/sift, surf and cvl]
step: average inter-keyframe step
total: average total number of features detected in each image
4-match: average number of features matched over all 4 images of a keyframe pair
inliers: average number of usable 4-matched features in a keyframe pair
Motion estimation I
Registration in 3D
First estimate the 3D positions of the observed points in the reference
frames of the stereo head before and after the motion
Then estimate the rototranslation by trying to align the two point
clouds
Used on the first Mars Exploration Rovers (Maimone et al., Journal of Field Robotics 24/3, 2007)
Drawbacks:
The fitting tolerance depends on the point distance and direction
It is difficult to handle outliers
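Aligning two point clouds by a rigid motion has a closed-form least-squares solution. The sketch below shows the planar (2D) analogue only, since it needs nothing beyond the standard library; the 3D case is usually solved with an SVD or with quaternions. The function name `align_2d` and the test points are assumptions, and no outlier handling is included.

```python
import math

def align_2d(src, dst):
    """Least-squares rotation angle and translation such that
    dst[i] is approximately R(theta) * src[i] + t."""
    n = len(src)
    cs = (sum(p[0] for p in src) / n, sum(p[1] for p in src) / n)
    cd = (sum(p[0] for p in dst) / n, sum(p[1] for p in dst) / n)
    s_dot = s_cross = 0.0
    for (x, y), (X, Y) in zip(src, dst):
        ax, ay = x - cs[0], y - cs[1]   # centred source point
        bx, by = X - cd[0], Y - cd[1]   # centred destination point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    t = (cd[0] - (c * cs[0] - s * cs[1]),
         cd[1] - (s * cs[0] + c * cs[1]))
    return theta, t

# Hypothetical check: rotate and shift three points, then recover the motion.
true_theta, true_t = 0.3, (1.5, -0.5)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
dst = [(math.cos(true_theta) * x - math.sin(true_theta) * y + true_t[0],
        math.sin(true_theta) * x + math.cos(true_theta) * y + true_t[1])
       for x, y in src]
theta, t = align_2d(src, dst)
```

Every point contributes with the same weight in this fit, although distant stereo points are much less accurate than near ones, which is the first drawback listed above.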
Motion estimation II
Registration on the image plane
Begin by estimating the 3D positions of the observed points and the
rototranslation as before, but only as a starting approximation
Then optimize the estimate by minimizing the image plane
error, i.e. the difference between the backprojected points and the actually
observed ones
Advantages:
The fitting tolerance does not depend on point distance or direction, and this greatly eases the handling of outliers
This technique has been well known to photogrammetrists since the 1950s
as bundle adjustment. Indeed, it consists in adjusting the bundles of
rays from each camera so that the image plane error is minimized.
Motion estimation by bundle adjustment
Given a pair of keyframes:
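\[
\begin{aligned}
\mathbf{x}_{i,1L} &= P_L \mathbf{X}_i \\
\mathbf{x}_{i,1R} &= P_R M_S \mathbf{X}_i \\
\mathbf{x}_{i,2L} &= P_L M_{12} \mathbf{X}_i \\
\mathbf{x}_{i,2R} &= P_R M_S M_{12} \mathbf{X}_i
\end{aligned}
\]
\[
M = \begin{bmatrix} e^{[\mathbf{r}]_\times} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix}
\qquad
\mathbf{X} = \begin{bmatrix} x \\ y \\ 1 \\ t \end{bmatrix}
\qquad
\mathbf{x} = \begin{bmatrix} u \\ v \\ w \end{bmatrix}
\]
Cost function:
\[
J(\mathbf{p}) = \sum_i \sum_q f\left(\|\mathbf{e}_{iq}\|^2\right)
             = \sum_i \sum_q f\left(\|\mathbf{u}_{iq} - \mathbf{u}^*_{iq}\|^2\right)
\]
\[
\mathbf{u} = \begin{bmatrix} u/w \\ v/w \end{bmatrix}
\qquad
\mathbf{p} = [\mathbf{r}_{12}, \mathbf{t}_{12}, x_1, y_1, t_1, \ldots, x_N, y_N, t_N]^\top
\quad (6 + 3N \text{ unknowns})
\]
\[
f(e^2) = \log\left(1 + e^2/\sigma^2\right) \quad \text{(robust Lorentzian cost!)}
\]
Optimization by efficient sparse Levenberg-Marquardt. Two-pass
optimisation (first with Lorentzian cost, second with standard
sum-of-squares cost on inliers only)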
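The effect of the two-pass scheme can be seen on a one-parameter toy problem. This is only a sketch: it estimates a single scalar instead of 6 + 3N unknowns, it uses iteratively reweighted averaging rather than Levenberg-Marquardt, and the inlier threshold of 3σ and all names and data are assumptions.

```python
import math

def lorentzian(e2, sigma):
    """Lorentzian cost f(e^2) = log(1 + e^2 / sigma^2)."""
    return math.log(1.0 + e2 / sigma ** 2)

def robust_mean(values, sigma, iters=50):
    """Minimise sum_i f((v_i - m)^2) over m by iteratively reweighted
    averaging, with weights 1 / (sigma^2 + residual^2)."""
    m = sorted(values)[len(values) // 2]  # start from the median
    for _ in range(iters):
        w = [1.0 / (sigma ** 2 + (v - m) ** 2) for v in values]
        m = sum(wi * v for wi, v in zip(w, values)) / sum(w)
    return m

def two_pass(values, sigma, k=3.0):
    """Pass 1: robust estimate. Pass 2: plain least squares on inliers only."""
    m1 = robust_mean(values, sigma)
    inliers = [v for v in values if abs(v - m1) <= k * sigma]
    m2 = sum(inliers) / len(inliers)
    return m1, m2, inliers

# Five consistent measurements near 2.0 and two gross outliers.
data = [1.9, 2.0, 2.1, 2.05, 1.95, 9.0, -6.0]
m1, m2, inliers = two_pass(data, sigma=0.2)
plain_mean = sum(data) / len(data)
robust_cost = sum(lorentzian((v - m1) ** 2, 0.2) for v in data)
```

The plain mean is pulled away from 2.0 by the outliers, whereas the Lorentzian pass stays close to it and the second pass refines the estimate on the five inliers alone.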
INRiM data I
[Figure: INRIM campus data I]
INRiM data I (cont.)
INRIM campus data I, grassy plain path
[Figure: bar chart of step (0..20), total (0..5000), 4-match (0..500) and inliers (0..250) for cvl (0.7), cvl (0.9) and haraff/mom]
[Figure: plot with curves for cvl (raw), cvl (dejavu) and haraff/mom (raw)]
INRiM data II
[Figure: INRIM campus data II, two plots of y [m] and z [m] versus x [m]]
INRiM data II (cont.)
INRIM campus data II, paved road path
[Figure: bar chart of step (0..20), total (0..5000), 4-match (0..500) and inliers (0..250) for cvl (0.7) and haraff/mom]
[Figure: return position error [m] versus estimated path length [m] for cvl 0.7 and haraff/mom 0.9]
Results (CVL) on Oxford data
New College Dataset: 52478 stereo pairs over a total path length of
about 2840 m
[Figure: two plots of the CVL results on the New College Dataset]
Results (CVL) on Karlsruhe data
[Figure: three plots, one for each of the sequences 20090908drive21, 20100304drive21 and 20090908drive19]
Thank you for your attention
Odometer
Leonardo's odometer (Codex Atlanticus, 1478-1518)
Bundle adjustment
from:
http://www.geodetic.com/Whatis.htm
given the observations of N unknown 3D points in M images, express the image plane errors, i.e. the distances of the 2D projected points from the observed ones, as a function of the unknown parameters: the 3D point coordinates, plus possibly imaging geometry parameters (e.g. the poses of the M cameras)
define a cumulative image plane error J as a suitable function (e.g. sum of squares) of the above errors, and estimate the unknown parameters (3D structure and motion) by seeking a minimum of J
for J = sum of squares and Gaussian disturbances, the estimate is optimal (ML)
the least squares solution is highly sensitive to outliers (e.g. due to wrong matches), so it is generally safer to use a more robust cost function, or to perform accurate outlier detection (or both)
Single View Geometry (Pinhole Camera)
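[Figure: pinhole camera geometry, showing the camera center C, the camera axes x, y and z (principal axis), the image plane with axes u and v, a 3D point X and its image point x]
\[
\mathbf{X} = \begin{bmatrix} x \\ y \\ z \\ t \end{bmatrix}
\qquad
\mathbf{x} = \begin{bmatrix} u \\ v \\ w \end{bmatrix}
\qquad
\mathbf{x} = P \mathbf{X}
\]
\[
P = \begin{bmatrix} M \mid \mathbf{p}_4 \end{bmatrix} = K \begin{bmatrix} R \mid \mathbf{t} \end{bmatrix}
\qquad
K = \begin{bmatrix} f_u & s & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\]
in the general case, P is a rank-3 3×4 matrix (projective camera). If M is nonsingular, it is a finite camera. P is defined up to scale, so it has 11 DOF.
the matrix K is the intrinsic calibration matrix of the camera: f_u and f_v are the focal lengths, s the skew, u_0 and v_0 the image center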
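As a minimal numerical sketch of x = K [R | t] X, the code below projects one 3D point into pixel coordinates. The function name `project` and the intrinsic values (focal lengths of 500 pixels, image center at (320, 240), zero skew) are assumptions chosen for illustration.

```python
def project(K, R, t, X):
    """Project a 3D point X = (x, y, z) with P = K [R | t] and return
    the inhomogeneous pixel coordinates (u/w, v/w)."""
    # Point in camera coordinates: R X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Homogeneous image point: K (R X + t)
    u, v, w = (sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3))
    return (u / w, v / w)

K = [[500.0, 0.0, 320.0],
     [0.0, 500.0, 240.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

pixel = project(K, R, t, [0.2, -0.1, 2.0])
```

With the camera at the origin, the point (0.2, -0.1, 2.0) maps to u = 500 · 0.2 / 2 + 320 = 370 and v = 500 · (-0.1) / 2 + 240 = 215. The division by w fails for points with zero depth in the camera frame, which a real implementation would have to reject.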
Parallel Axes Stereo
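\[
P_L = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix}
      \begin{bmatrix} I_3 \mid \mathbf{0}_3 \end{bmatrix}
\qquad
P_R = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix}
      \left[\, I_3 \;\middle|\; \begin{matrix} -b \\ 0 \\ 0 \end{matrix} \,\right]
\]
\[
u_L = f x / z \qquad u_R = f (x - b) / z
\]
\[
v_L = v_R = f y / z
\]
\[
d = u_L - u_R = f b / z
\]
d = u_L − u_R is the stereo disparity, which directly determines the depth z
of the observed point
z = fb/d ⇒ δz = −(z²/(fb)) δd, i.e. for a given image plane error the error in depth increases quadratically with z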
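The quadratic growth of the depth error can be checked numerically. In this sketch the focal length (500 pixels), the baseline (0.2 m) and the disparity error (0.5 pixel) are assumed values, and `depth_error` returns the magnitude of δz, dropping the sign.

```python
def depth_from_disparity(f, b, d):
    """Depth z = f b / d for focal length f (pixels), baseline b (m)
    and disparity d (pixels)."""
    return f * b / d

def depth_error(f, b, z, delta_d):
    """Magnitude of the first-order depth error: z^2 / (f b) * |delta_d|."""
    return z * z / (f * b) * abs(delta_d)

f, b = 500.0, 0.2   # assumed stereo rig parameters
delta_d = 0.5       # assumed disparity error in pixels

z_near = depth_from_disparity(f, b, 50.0)   # disparity of 50 pixels
errors = {z: depth_error(f, b, z, delta_d) for z in (2.0, 4.0, 8.0)}
```

Under these assumptions the same half-pixel disparity error gives about 2 cm of depth error at 2 m, 8 cm at 4 m and 32 cm at 8 m: doubling the distance multiplies the error by four.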
Parallel Axes Stereo (2)
Let's do some rotations
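\[
\mathbf{X} = \begin{bmatrix} R_1 & \mathbf{0} \\ \mathbf{0}^\top & 1 \end{bmatrix} \mathbf{X}'
\qquad
\mathbf{x}_L = R_1 \mathbf{x}'_L
\qquad
\mathbf{x}_R = R_2 \mathbf{x}'_R
\]
then
\[
\mathbf{x}'_L = \begin{bmatrix} I \mid \mathbf{0} \end{bmatrix} \mathbf{X}'
\qquad
\mathbf{x}'_R = \begin{bmatrix} R_2^\top R R_1 \mid R_2^\top \mathbf{t} \end{bmatrix} \mathbf{X}'
\]
it is always possible to determine R_1 and R_2 so that R_2^⊤ R R_1 = I and R_2^⊤ t = [−b, 0, 0]^⊤, i.e. the parallel axes case.
the transformations R_1 and R_2 can be used to warp the image data, obtaining a so-called rectified stereo pair (note that warping may include corrections for lens distortion). This is mostly useful for stereo algorithms seeking dense disparity maps by direct correlation of luminance data.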