Introduction: Robot Vision (Philippe Martinet)
Unifying Vision and Control (Selim Benhimane)
Efficient Keypoint Recognition (Vincent Lepetit)
Multi-camera and Model-based Robot Vision (Andrew Comport)
Visual SLAM for Spatially Aware Robots (Walterio Mayol-Cuevas)
Outdoor Visual SLAM for Robotics (Kurt Konolige)
Advanced Vision in Deformable Environments (Adrien Bartoli)
Tutorial organized by Andrew Comport and Adrien Bartoli. Nice, September 22.
Visual SLAM and Spatial Awareness
SLAM = Simultaneous Localisation and Mapping
An overview of some methods currently used for SLAM using computer vision.
Recent work on enabling more stable and/or robust mapping in real-time.
Work aiming to provide better scene understanding in the context of SLAM: Spatial Awareness.
Here we concentrate on “Small” working areas where GPS, odometry and other traditional sensors are not operational or available.
Spatial Awareness
SA: A key cognitive competence that permits efficient motion and task planning.
Even from an early age we use spatial awareness: the toy has not vanished, it is behind the sofa.
I can point to where the entrance to the building is, but can't tell how many doors there are from here to there.
SLAM offers a rigorous way to implement and manage SA.
Wearable personal assistants
Mayol, Davison and Murray 2003
Video at http://www.robots.ox.ac.uk/ActiveVision/Projects/Vslam/vslam.02/Videos/wearableslam2.mpg
SLAM: key historical reference
Smith, R.C. and Cheeseman, P., "On the Representation and Estimation of Spatial Uncertainty", The International Journal of Robotics Research, 5(4):56-68, 1986.
Proposed a stochastic framework to maintain the relationship (uncertainties) between features in the map.
“Our knowledge of the spatial relationships among objects is inherently uncertain. A manmade object does not match its geometric model exactly because of manufacturing tolerances. Even if it did, a sensor could not measure the geometric features, and thus locate the object exactly, because of measurement errors. And even if it could, a robot using the sensor cannot manipulate the object exactly as intended, because of hand positioning errors…” [Smith, Self and Cheeseman 1986]
SLAM
A problem identified many years ago, central to mobile robot navigation and now branching into other fields such as wearable computing and augmented reality.
SLAM – Simultaneous Localisation And Mapping
[Diagram: a moving camera observes 3D point features under perspective projection; feature locations are predicted, then the feature positions and the camera location are updated.]

Aim to:
• Localise the camera (6 DOF: rotation and translation from a reference view).
• Simultaneously estimate a 3D map of features (e.g. 3D points).

Implemented using: Extended Kalman Filter, particle filters, SIFT, edgelets, etc.
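To make the filtering loop concrete, here is a minimal, generic EKF predict/update sketch (our own illustration, not MonoSLAM's actual code); the motion model f, measurement model h, their Jacobians F and H, and the noise covariances Q and R are assumed placeholders.

```python
# Minimal EKF skeleton for point-based visual SLAM (illustrative sketch only;
# real systems use e.g. a constant-velocity model and quaternion rotations).
import numpy as np

def ekf_predict(x, P, f, F, Q):
    """Propagate state x and covariance P through motion model f (Jacobian F)."""
    x_pred = f(x)
    P_pred = F @ P @ F.T + Q          # process noise Q inflates uncertainty
    return x_pred, P_pred

def ekf_update(x, P, z, h, H, R):
    """Fuse a measurement z (e.g. a feature's image position) with prediction h(x)."""
    y = z - h(x)                      # innovation
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```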
Challenges for visual SLAM
On the computer vision side, improving data association: ensuring a match is a true positive.
Representations and parameterizations that enhance mapping while staying within real-time constraints.
Alternative frameworks for mapping: can we extend the area of operation? Better scene understanding.
Data association: the earlier approach
Small (e.g. 11x11) image patches around salient points to represent features.
Normalized Cross Correlation (NCC) to match features within search regions.
Small patches + accurate search regions lead to fast camera pose estimation.
Depth is estimated by projecting the feature hypothesis at different depths.
See: A. Davison, Real-Time Simultaneous Localisation and Mapping with a Single Camera, ICCV 2003.
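A sketch of how NCC patch matching inside a predicted search region might look; the function names and the exhaustive scan are ours, not Davison's implementation.

```python
# Normalised cross-correlation (NCC) template matching in a search region.
import numpy as np

def ncc(a, b):
    """NCC score in [-1, 1] between two equal-sized grayscale patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def match_in_region(image, template, region, threshold=0.8):
    """Scan an (x0, y0, x1, y1) region; return best match location above threshold."""
    h, w = template.shape
    x0, y0, x1, y1 = region
    best_score, best_xy = threshold, None
    for y in range(y0, y1 - h):
        for x in range(x0, x1 - w):
            score = ncc(template, image[y:y + h, x:x + w])
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy, best_score
```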
However
Simple patches are insufficient under large viewpoint or scale variations.
Small patches help speed but are prone to mismatches.
Search regions can't always be trusted (camera occlusion, motion blur).
Possible solutions: use better feature descriptions, or other types of features, e.g. edge information.
SIFT [D. Lowe, IJCV 2004]
Find maxima in scale space to locate keypoint.
…
Around the keypoint, build an invariant local descriptor (a 128-element vector) using gradient histograms.
If used for tracking, this may be wasteful!
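To make the descriptor construction concrete, a much-simplified gradient-histogram sketch; unlike real SIFT, it omits scale/rotation normalisation, Gaussian weighting and trilinear interpolation.

```python
# Simplified SIFT-like descriptor: a 4x4 grid of 8-bin gradient-orientation
# histograms over a 16x16 patch -> 128-element vector.
import numpy as np

def sift_like_descriptor(patch):
    """patch: 16x16 grayscale window centred on the keypoint."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx)                            # range -pi..pi
    bins = ((ori + np.pi) / (2 * np.pi) * 8).astype(int) % 8
    desc = np.zeros((4, 4, 8))
    for cy in range(4):
        for cx in range(4):
            cell = (slice(cy * 4, cy * 4 + 4), slice(cx * 4, cx * 4 + 4))
            for b in range(8):                          # histogram per cell
                desc[cy, cx, b] = mag[cell][bins[cell] == b].sum()
    v = desc.ravel()                                    # 128 elements
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```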
• Uses SIFT-like descriptors (histograms of gradients) around Harris corners.
• Gets scale from SLAM = “predictive SIFT”.
[Chekhlov, Pupilli, Mayol and Calway, ISVC06/CVPR07]
Video at http://www.cs.bris.ac.uk/Publications/attachment-delivery.jsp?id=9
[Eade and Drummond, BMVC2006]
Edgelets:
• Locally straight sections of the gradient image.
• Parameterized as a 3D point + direction.
• Avoid regions of conflict (e.g. close parallel edges).
• Deal with multiple matches through robust estimation.
Video at http://mi.eng.cam.ac.uk/~ee231/bmvcmovie.avi
RANSAC [Fischler and Bolles 1981]
RANdom SAmple Consensus
[Figure: 2D point set with gross “outliers”; the least squares fit is skewed by them, the RANSAC fit is not.]
• Select a random sample of points.
• Propose a model (hypothesis) based on the sample.
• Assess the fitness of the hypothesis to the rest of the data.
• Repeat until a maximum number of iterations or a fitness threshold is reached.
• Keep the best hypothesis and potentially refine it with all inliers (see the sketch below).
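A toy RANSAC line-fitting sketch following the steps above (our own example, not Fischler and Bolles' original formulation):

```python
# RANSAC fit of a 2D line ax + by + c = 0 to noisy points with gross outliers.
import numpy as np

def ransac_line(points, iters=100, inlier_tol=1.0):
    """points: (N, 2) array. Returns (a, b, c) of the best-supported line."""
    rng = np.random.default_rng(0)
    best_line, best_inliers = None, 0
    for _ in range(iters):
        p1, p2 = points[rng.choice(len(points), 2, replace=False)]
        d = p2 - p1
        n = np.array([-d[1], d[0]])            # normal to the sampled line
        norm = np.linalg.norm(n)
        if norm == 0:
            continue                            # degenerate sample, resample
        n = n / norm
        c = -n @ p1
        dist = np.abs(points @ n + c)           # point-to-line distances
        inliers = int((dist < inlier_tol).sum())
        if inliers > best_inliers:              # keep best-supported hypothesis
            best_inliers, best_line = inliers, (n[0], n[1], c)
    return best_line                            # optionally refit on all inliers
```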
OK but…
Having rich descriptors or even multiple kinds of features may still lead to wrong data associations (mismatches).
If we pass every measurement we think is good to the SLAM system, a single mismatch can be catastrophic.
Better to be able to recover from failure than to think it won’t fail!
[Williams, Smith and Reid ICRA2007]
• Camera relocalization using small 2D patches + RANSAC to compute pose.
• Adds a “supervisor” between the visual measurements and the SLAM system.
• Uses the 3-point algorithm → up to 4 possible poses; verify using Matas' T(d,d) test.
Also see recent work [Williams, Klein and Reid ICCV2007] using randomised trees rather than simple 2D patches.
In brief, while within the real-time limit do: if the tracker is lost, select 3 matches, compute a candidate pose, and test it for consistency; if a consistent pose is found, carry on tracking, otherwise repeat (a code sketch follows the video link below).
[Williams, Smith and Reid ICRA2007]
Video at http://www.robots.ox.ac.uk/ActiveVision/Projects/Vslam/vslam.04/Videos/relocalisation_icra_07.mpg
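A hedged sketch of that supervisor loop; find_matches, pose_from_3_points and count_consistent are injected stand-ins for the paper's components (patch matching, the 3-point algorithm, and the T(d,d)-style consistency test), not real APIs.

```python
# Hypothesise-and-verify relocalisation within a per-frame time budget.
import time

def relocalise(frame, map_features, find_matches, pose_from_3_points,
               count_consistent, budget_s=0.03, min_consistent=10):
    """Try to recover the camera pose; return None if still lost this frame."""
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        matches = find_matches(frame, map_features, n=3)   # select 3 matches
        for pose in pose_from_3_points(matches):           # up to 4 candidate poses
            if count_consistent(pose, frame, map_features) >= min_consistent:
                return pose            # consistent -> resume normal tracking
    return None                        # still lost; try again on the next frame
```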
Relocalisation based on appearance hashing
Use a hash function to index similar descriptors (Brown et al 2005).
Fast and memory efficient (only an index needs to be saved per descriptor).
Chekhlov et al 2008
Quantize the responses of Haar masks to form the hash index.
Video at: http://www.cs.bris.ac.uk/Publications/pub_master.jsp?id=2000939
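An illustrative sketch of the idea (the actual masks and quantisation in Chekhlov et al. 2008 differ; all names here are ours): quantise a few Haar-like responses into an integer key, and store only that index per descriptor.

```python
# Appearance hashing: quantised Haar responses -> integer key -> feature ids.
import numpy as np

def haar_hash(patch, bits_per_response=2):
    """Hash an 8-bit grayscale patch from two simple Haar-like responses."""
    h, w = patch.shape
    left, right = patch[:, : w // 2].mean(), patch[:, w // 2:].mean()
    top, bottom = patch[: h // 2, :].mean(), patch[h // 2:, :].mean()
    responses = np.array([right - left, bottom - top])   # Haar-like responses
    levels = 2 ** bits_per_response
    # Assumes responses lie in [-128, 128) for 8-bit intensities.
    q = np.clip(((responses + 128) / 256 * levels).astype(int), 0, levels - 1)
    key = 0
    for v in q:                        # pack quantised responses into one int
        key = key * levels + int(v)
    return key

# Index: hash key -> list of map feature ids with that appearance signature.
index = {}
index.setdefault(haar_hash(np.full((8, 8), 100.0)), []).append(42)
```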
Parallel Tracking and Mapping
[Klein and Murray, Parallel Tracking and Mapping for Small AR Workspaces Proc. International Symposium on Mixed and Augmented Reality. 2007]
Decouple mapping from tracking; run them in separate threads on a multi-core CPU.
Mapping is based on key-frames, processed using batch bundle adjustment.
The map is initialised from a stereo pair (using the 5-point algorithm).
New points are initialised with an epipolar search.
Large numbers (thousands) of points can be mapped in a small workspace.
[Klein and Murray, 2007]
Parallel Tracking and Mapping
[Diagram: CPU1 repeatedly runs the per-frame tracking loop (detect features, compute camera pose, draw graphics); CPU2 runs the map update in parallel.]
Video at http://www.robots.ox.ac.uk/ActiveVision/Videos/index.html
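A minimal sketch of that thread split, assuming injected stand-ins (track, should_add_keyframe, bundle_adjust) rather than PTAM's real components:

```python
# Tracking runs per-frame and time-critical; mapping consumes keyframes
# asynchronously, so slow bundle adjustment never blocks the frame loop.
import queue
import threading

def run_ptam_style(frames, track, should_add_keyframe, bundle_adjust):
    keyframes = queue.Queue()

    def mapping():
        while True:
            kf = keyframes.get()        # blocks until tracking hands over work
            if kf is None:
                return                  # sentinel: shut down the mapper
            bundle_adjust(kf)           # slow batch optimisation, off-loop

    mapper = threading.Thread(target=mapping, daemon=True)
    mapper.start()
    for frame in frames:                # tracking: per-frame loop
        pose = track(frame)             # detect features + compute camera pose
        if should_add_keyframe(pose):
            keyframes.put(frame)        # occasionally feed the mapper
    keyframes.put(None)
    mapper.join()
```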
So far we have mentioned that
• Maps are sparse collections of low-level features: points (Davison et al., Chekhlov et al.), edgelets (Eade and Drummond), lines (Smith et al., Gee and Mayol-Cuevas).
• Full correlation between features and camera: maintain the full covariance matrix; on loop closure, the effects of measurements propagate to all features in the map.
• The increase in state size limits the number of features.
• The emphasis is on localization and less on the mapping output.
SLAM should avoid making “beautiful” maps (there are other, better methods for that!).
Very few examples exist of improving the awareness element, e.g. Castle and Murray, BMVC 2007, on recognition of known objects within SLAM.
Commonly missing in visual SLAM: better spatial awareness through higher-level structural inference.
Types of Structure
• Coplanar points → planes
• Collinear edgelets → lines
• Intersecting lines → junctions
Our Contribution
• A method for augmenting the SLAM map with planar and line structures.
• Evaluation of the method in a simulated scene: we discover a trade-off between efficiency and accuracy.
Plane Representation

Plane parameters: m = (x, y, z, θ1, φ1, θ2, φ2)^T, i.e. a plane origin (x, y, z) plus two basis vectors c(θ1, φ1) and c(θ2, φ2) spanning the plane.

Basis vectors: c(θi, φi) = (cos θi cos φi, sin φi, sin θi cos φi)^T

[Figure: camera and plane in the world frame O; the plane has origin (x, y, z), in-plane basis vectors c(θ1, φ1) and c(θ2, φ2), and a normal vector.]
Gee et al 2007
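A small sketch of this parameterisation in code; the exact trigonometric ordering of c(θ, φ) is our reconstruction of the garbled slide equation, so treat it as an assumption.

```python
# Plane as origin + two angle-parameterised unit basis vectors.
import numpy as np

def basis(theta, phi):
    """Unit direction c(theta, phi), per the representation above (assumed order)."""
    return np.array([np.cos(theta) * np.cos(phi),
                     np.sin(phi),
                     np.sin(theta) * np.cos(phi)])

def plane_point(m, u, v):
    """Point on plane m = (x, y, z, th1, ph1, th2, ph2) at in-plane coords (u, v)."""
    origin, (t1, p1, t2, p2) = m[:3], m[3:]
    return origin + u * basis(t1, p1) + v * basis(t2, p2)

def plane_normal(m):
    n = np.cross(basis(m[3], m[4]), basis(m[5], m[6]))
    return n / np.linalg.norm(n)
```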
Plane Initialisation

1. Discover planes using RANSAC over a thresholded subset of the map.
2. Initialise the plane in the state using the best-fit plane parameters found from an SVD of the inliers.
3. Augment the state covariance P with the new plane:

P_new = J [P 0; 0 R0] J^T

Appending the measurement covariance R0 to the covariance matrix and multiplying with the Jacobian J populates the cross-covariance terms: J is identity over the existing state v = (s, m1, …, mz), and its bottom rows hold the Jacobian of the new plane parameters. The state size increases by 7 after adding a plane.
Gee et al 2007
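A numpy sketch of steps 2 and 3, under the P_new = J [P 0; 0 R0] J^T structure reconstructed above; the name G for the plane-parameter Jacobian and the SVD helper are ours, not Gee et al.'s code.

```python
# SVD plane fit over inliers, and covariance augmentation for the new plane.
import numpy as np

def best_fit_plane(points):
    """Best-fit plane through (N, 3) inlier points: centroid + unit normal."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[2]            # smallest singular vector = plane normal

def augment_covariance(P, R0, G):
    """
    P:  n x n state covariance before adding the plane.
    R0: 7 x 7 measurement covariance of the new plane parameters.
    G:  7 x n Jacobian of the plane parameters w.r.t. the existing state.
    """
    n, k = P.shape[0], R0.shape[0]
    J = np.zeros((n + k, n + k))
    J[:n, :n] = np.eye(n)             # existing state passes through unchanged
    J[n:, :n] = G                     # new plane depends on existing features
    J[n:, n:] = np.eye(k)             # ...plus its own measurement noise
    big = np.zeros((n + k, n + k))
    big[:n, :n] = P
    big[n:, n:] = R0
    return J @ big @ J.T              # J fills in the cross-covariance terms
```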
Adding Points to the Plane

1. Decide whether the point lies on the plane (e.g. within σmax of it).
2. Add the point by projecting it onto the plane and transforming the state and covariance: P_new = J P J^T, where J is identity except in the rows that re-express the projected point within the state v = (s, m1, …, mi, …, mn).
3. Decide whether to fix the point on the plane.

The state size decreases by 1 after adding a point to the plane, so the state becomes smaller than the original if more than 7 points are added to the plane (offsetting the 7 parameters the plane added). Fixing points in the plane reduces the state size by a further 2 per fixed point.
Gee et al 2007
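A small sketch of step 2's projection: re-expressing a 3D point (3 DOF) by its 2 in-plane coordinates is what shrinks the state by 1. This is our illustration of the geometry, not the paper's transformation code.

```python
# Project a 3D point onto a plane and convert between 3D and in-plane coords.
import numpy as np

def point_to_plane_coords(p, origin, b1, b2):
    """p: 3D point; origin, b1, b2: plane origin and (unit) basis vectors."""
    B = np.stack([b1, b2], axis=1)                        # 3x2 basis matrix
    uv, *_ = np.linalg.lstsq(B, p - origin, rcond=None)   # least-squares (u, v)
    return uv

def plane_coords_to_point(uv, origin, b1, b2):
    """Inverse mapping: in-plane (u, v) back to a 3D world point."""
    return origin + uv[0] * b1 + uv[1] * b2
```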
Plane Observation

1. A direct observation of the plane cannot be made.
2. Transform the plane's points to 3D world space.
3. Project the points into the image and match them with the predicted observations.
4. The covariance matrix embodies the constraints between plane, camera and points.
Gee et al 2007
An example application
Chekhlov et al. 2007
Video at http://www.cs.bris.ac.uk/Publications/pub_master.jsp?id=2000745
Other interesting recent work
Active search and matching (knowing what to measure): Davison, ICCV 2005; Chli and Davison, ECCV 2008.
Submapping (better managing the scalability problem): Clemente et al., RSS 2007; Eade and Drummond, BMVC 2008.
And the work presented in this tutorial: randomised trees (Vincent Lepetit); SFM (Andrew Comport).
Software tools:
http://www.doc.ic.ac.uk/~ajd/Scene/index.html
<MonoSLAM code for Linux, works out of the box>
http://www.robots.ox.ac.uk/~gk/PTAM/
<Parallel tracking and mapping>
http://www.openslam.org/
<for SLAM algorithms mainly from robotics community>
http://www.robots.ox.ac.uk/~SSS06/
<SLAM literature and some software in Matlab>
Recommended intro reading:
Yaakov Bar-Shalom, X. Rong Li and Thiagalingam Kirubarajan, Estimation with Applications to Tracking and Navigation, Wiley-Interscience, 2001.
Hugh Durrant-Whyte and Tim Bailey, Simultaneous Localisation and Mapping (SLAM): Part I The Essential Algorithms, Robotics and Automation Magazine, June 2006.
Tim Bailey and Hugh Durrant-Whyte, Simultaneous Localisation and Mapping (SLAM): Part II State of the Art, Robotics and Automation Magazine, September 2006.
Andrew Davison, Ian Reid, Nicholas Molton and Olivier Stasse, MonoSLAM: Real-Time Single Camera SLAM, IEEE Trans. PAMI, 2007.
Andrew Calway, Andrew Davison and Walterio Mayol-Cuevas, Slides of Tutorial on Visual SLAM, BMVC 2007, available at:
http://www.cs.bris.ac.uk/Research/Vision/Realtime/bmvctutorial/