Robotics and Autonomous Systems 40 (2002) 111–119
Self-localization using sporadic features
Gerhard K. Kraetzschmar, Stefan Enderle
University of Ulm, Neuroinformatics, D-89069 Ulm, Germany
E-mail addresses: [email protected] (G.K. Kraetzschmar), [email protected] (S. Enderle)
Abstract
Knowing its position in an environment is an essential capability for any useful mobile robot. Monte Carlo Localization (MCL) has become a popular framework for solving the self-localization problem in mobile robots. The known methods
exploit sensor data obtained from laser range finders or sonar rings to estimate robot positions and are quite reliable and robust
against noise.
An open question is whether comparable localization performance can be achieved using only camera images, especially
if the camera images are used both for localization and object recognition. In such a situation, it is both harder to obtain
suitable models for predicting sensor readings and to correlate actual with predicted sensor data. Both problems can be easily
solved if localization is based on features obtained by preprocessing images. In our image-based MCL approach, we combine
visual distance features and visual landmark features, which have different characteristics. Distance features can always be
computed, but have value-dependent and dramatically increasing margins for noise. Landmark features give good directional
information, but are detected only sporadically. In our paper, we discuss the problems arising from these characteristics and
show experimentally that MCL nevertheless works very well under these conditions.
© 2002 Published by Elsevier Science B.V.
Keywords: Self-localization; Visual features; Dynamic environments
1. Introduction
A mobile robot that is expected to navigate a known
environment in a goal-directed and efficient manner
must have the ability to self-localize, i.e. to determine
its position and orientation in the environment [6,9]. In
the past few years, Monte Carlo Localization (MCL)
has become a very popular framework for solving the self-localization problem in mobile robots [4,5]. This
method is very reliable and robust against noise, espe-
cially if the robots are equipped with laser range find-
ers or sonar sensors. In some environments, however,
e.g. in the popular RoboCup domain [7], using a laser
scanner on each robot may be difficult, or impossible,
or too costly. Also, sonar data is extremely noisy due to
the highly dynamic environment. Thus, enhancing the
existing localization methods such that they can use
other sensory channels, like uni- or omni-directional
vision systems, is an interesting open problem. In this
work, we present a vision-based MCL approach using visual features which are extracted from the robot's single camera and matched to a known model of the
environment.
2. Markov localization
In Markov localization [4], the position of a robot
is estimated by computing the probability distribu-
tion over all possible positions in the environment.
Let l = ⟨x, y, θ⟩ denote a position in the state space.
Then, Bel(l) expresses the robot's belief in being at
location l.
The algorithm starts with an initial distribution Bel(l) depending on the initial knowledge of the robot's position. If the correct position l₀ of the robot is known, Bel(l) = 1 for l = l₀ and Bel(l) = 0 for l ≠ l₀. If the
position is completely unknown, Bel(l) is uniformly
distributed over the environment to reflect this un-
certainty. As the robot operates, two kinds of update
steps are iteratively applied to incrementally refine
Bel(l).
2.1. Belief projection across robot motion
This step handles robot motion and projects the robot's belief about its position across the executed
actions. A motion model P(l | l′, m) is used to predict the likelihood of the robot being in position l assuming that it executed a motion command m and was previously in position l′. Here, it is assumed that the
new position depends only on the previous position
and the movement (Markov property). Using the mo-
tion model, the distribution of the robot's position belief Bel(l) can be updated according to the commonly
used formula for Markov chains [1]:
Bel(l) = ∫ P(l | l′, m) Bel(l′) dl′.  (1)
2.2. Integration of sensor input
The data obtained from the robots sensors are used
to update the belief Bel(l). For this step, an observation model P(o|l) must be provided which models the
likelihood of an observation o given the robot is at
position l. Bel(l) is then updated by applying Bayes
rule as follows:
Bel(l) = α P(o|l) Bel(l),  (2)
where α is a normalization factor ensuring that Bel(l)
integrates to 1.
The Markov localization method provides a math-
ematical framework for solving the localization prob-
lem. Unlike methods based on Kalman filtering [8], it
is easy to use multimodal distributions. However, im-
plementing Markov localization on a mobile robot in
a tractable and efficient way is a non-trivial task.
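To make the two update steps concrete, here is a minimal sketch of Eqs. (1) and (2) for a discretized, one-dimensional world (our own illustration, not from the paper; the motion model and observation likelihood are made-up example values):

```python
import numpy as np

def motion_update(bel, motion_model):
    # Eq. (1): Bel(l) = sum over l' of P(l | l', m) * Bel(l');
    # motion_model[l_prev, l] holds P(l | l_prev, m) for a fixed command m.
    return motion_model.T @ bel

def sensor_update(bel, likelihood):
    # Eq. (2): Bel(l) = alpha * P(o | l) * Bel(l); alpha renormalizes.
    bel = likelihood * bel
    return bel / bel.sum()

# Example: five grid cells, command "move one cell to the right".
bel = np.full(5, 0.2)                                  # uniform initial belief
motion_model = 0.8 * np.eye(5, k=1) + 0.2 * np.eye(5)  # 80% success, 20% stay
motion_model[4, 4] = 1.0                               # wall at the right end
likelihood = np.array([0.1, 0.1, 0.7, 0.05, 0.05])     # P(o | l) for some o

bel = sensor_update(motion_update(bel, motion_model), likelihood)
print(bel)
```

In a continuous pose space l = ⟨x, y, θ⟩ the integral in Eq. (1) admits no such finite representation, which is exactly the implementation problem MCL addresses.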
3. Monte Carlo localization
The MCL approach [5] solves the implementation
problem by representing the infinite probability dis-
tribution Bel(l) by a set of N samples S = {sᵢ | i = 1, …, N}. Each sample sᵢ = ⟨lᵢ, pᵢ⟩ consists of a robot location lᵢ and a weight pᵢ. The weight corresponds to the likelihood of lᵢ being the robot's correct position, i.e. pᵢ ∝ Bel(lᵢ) [10]. Furthermore, as the weights are interpreted as probabilities, we assume that they sum to 1: Σᵢ pᵢ = 1.
The algorithm for MCL is adopted from the gen-
eral Markov localization framework described above.
Initially, a set of samples reflecting initial knowledge
about the robot's position is generated. During robot operation, the following two kinds of update steps are iteratively executed.
3.1. Sample projection across robot motion
As in the general Markov algorithm, a motion model P(l | l′, m) is used to update the probability distribution Bel(l). In MCL, a new sample set S is generated from a previous set S′ by applying the motion model as follows: for each sample ⟨l′, p⟩ ∈ S′ a new sample ⟨l, p⟩ is added to S, where l is randomly drawn from the density P(l | l′, m). The motion model takes into account robot properties like drift and translational and rotational errors.
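A minimal sketch of this projection step (our own illustration; the Gaussian noise parameters merely stand in for the Sparrow-99 error model, which the paper does not specify):

```python
import math
import random

def project_sample(pose, motion, trans_sigma=0.05, rot_sigma=0.05,
                   drift_sigma=0.02):
    # Draw a new pose l from the density P(l | l', m) for a single sample.
    # pose = (x, y, theta); motion m = (forward distance d, rotation dtheta).
    x, y, theta = pose
    d, dtheta = motion
    d += random.gauss(0.0, trans_sigma * abs(d))          # translational error
    dtheta += random.gauss(0.0, rot_sigma * abs(dtheta))  # rotational error
    theta += dtheta + random.gauss(0.0, drift_sigma)      # drift
    return (x + d * math.cos(theta), y + d * math.sin(theta), theta)

# The weights are carried over unchanged; only the poses are re-drawn.
S_prev = [((0.0, 0.0, 0.0), 1.0 / 100)] * 100
S = [(project_sample(pose, (0.5, 0.1)), p) for pose, p in S_prev]
```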
3.2. Belief update and weighted resampling
Sensor inputs are used to update the robot's beliefs
about its position. According to Eq. (2), all samples
are re-weighted by incorporating the sensor data o and
applying the observation model P (o|l). Most com-
monly, sensors such as laser range finders or sonars,
which yield distance data, are used. In this case, ideal
sensor readings can be computed a priori, if a map of the environment is given. An observation model is
then obtained by noisifying the ideal sensor readings,
often simply using Gaussian noise distributions. Given
a sample ⟨l, p⟩, the new weight p′ for this sample is
given by:
p′ = α P(o|l) p,  (3)
where α is again a normalization factor which ensures
that all beliefs sum up to 1. These new weights for the
samples in S provide a probability distribution over S, which is then used to construct a new sample set S′. This is done by randomly drawing samples from S using the distribution given by the weights, i.e. for any sample sᵢ = ⟨lᵢ, pᵢ⟩ ∈ S:
Prob(sᵢ ∈ S′) ∝ pᵢ.  (4)
The relationship is approximate only, because after
each update step we add a small number of uniformly
distributed samples, which ensure that the robot can
re-localize in cases where it lost track of its position,
i.e. where the sample set contains no sample close to
the correct robot position.
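Taken together, Eqs. (3) and (4) plus the uniform-sample injection amount to the following sketch (our own illustration; `sensor_likelihood`, `random_pose`, and the 1% injection fraction are assumptions, not values from the paper):

```python
import random

def reweight_and_resample(samples, sensor_likelihood, random_pose,
                          uniform_fraction=0.01):
    # Eq. (3): p' = alpha * P(o | l) * p for every sample <l, p>.
    weights = [sensor_likelihood(pose) * p for pose, p in samples]
    total = sum(weights)
    weights = [w / total for w in weights]      # alpha: weights sum to 1

    # Eq. (4): draw the new set S' with Prob(s_i in S') proportional to p_i,
    # reserving a small slot for uniformly distributed samples so the robot
    # can re-localize after losing track of its position.
    n = len(samples)
    n_uniform = int(uniform_fraction * n)
    poses = random.choices([pose for pose, _ in samples],
                           weights=weights, k=n - n_uniform)
    poses += [random_pose() for _ in range(n_uniform)]
    return [(pose, 1.0 / n) for pose in poses]
```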
The MCL approach has several interesting proper-
ties: for instance, it can be implemented as an anytime algorithm [2], and it is possible to dynamically adapt
the sample size (see [4] for a detailed discussion of
MCL properties).
4. Vision-based localization
There are mainly two cases where MCL based on
distance sensor readings cannot be applied: (i) If dis-
tance sensors like laser range finders or sonars are not
available. (ii) If the readings obtained by these sen-sors are too unreliable, e.g. in a highly dynamic en-
vironment. In these cases, other sensory information
must be used for localization. A natural candidate is
information from the visual channel, because many
robots include cameras as standard equipment. An ex-
ample for using visual information for MCL has been
provided by Dellaert et al. [3]. On their indoor robot
Minerva, they successfully used the distribution of
light intensity in images obtained from a camera fac-
ing the ceiling. The advantage of this setup is that there
are few disturbances caused by people in the environ-
ment, and that one can always obtain a comparatively
large number of sensor readings, even if occasionally
they may be very noisy (due to use of flashlights etc.).
An interesting and open question is whether the
MCL approach still works when the number of obser-
vations is significantly reduced and when particular
observations can be made only intermittently. In the
following, we describe a situation where these prob-
lems arise and show how to adopt the MCL approach
accordingly.
4.1. The RoboCup environment
In RoboCup, two teams of robots play soccer
against each other, where each team has three field
players and one goalie. The field is about 9 m long
and 5 m wide and has two goals which are marked
blue and yellow for visual recognition. The field has
a green floor and is surrounded by a white wall. It has four corners, each of which is marked with two green vertical bars of 5 cm width.
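For later reference, this landmark set can be written down as a small world model. A sketch (the 9 m × 5 m field size is from the text; the origin at the field center and the 2 m goal width are our assumptions):

```python
# Hypothetical world model for the field described above (units: meters,
# origin at the field center). The goal width of 2.0 m is an assumed value.
FIELD_L, FIELD_W, GOAL_W = 9.0, 5.0, 2.0

corners = [(+FIELD_L / 2, +FIELD_W / 2), (+FIELD_L / 2, -FIELD_W / 2),
           (-FIELD_L / 2, +FIELD_W / 2), (-FIELD_L / 2, -FIELD_W / 2)]

goal_posts = {                       # unique landmarks, color-coded by goal
    "blue_left":    (-FIELD_L / 2, +GOAL_W / 2),
    "blue_right":   (-FIELD_L / 2, -GOAL_W / 2),
    "yellow_left":  (+FIELD_L / 2, +GOAL_W / 2),
    "yellow_right": (+FIELD_L / 2, -GOAL_W / 2),
}
```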
4.2. Feature-based modeling
As described in Eq. (3), the sensor update mech-
anism needs a sensor model P(o|l) which describes how probable a sensor reading o is at a given robot location l. This probability is often computed by estimating the sensor reading ô at location l and determining a similarity measure between the given measurement o and the estimate ô. If the sensor readings oᵢ are camera images, two problems arise: (i) Estimating complete images ôᵢ for each sample's location is computationally very expensive. (ii) Finding and computing a
similarity measure between images is quite difficult.
A better idea is to lift the similarity test to work on
processed, more aggregated, and lower-dimensional
data, such as feature vectors. The selection of the visual features is guided by several criteria: (i) uniqueness
of the feature in the environment; (ii) computational
effort needed to detect the feature in images; (iii)
reliability of the feature detection mechanism.
In the RoboCup domain, only a few environment
features can be detected while meeting these criteria:
goal posts and field corners as landmark features,
and distances to walls as distance features. The cor-
ner detector yields ambiguous landmarks, because
corners are indistinguishable from each other, while
the four goal-post features are unique landmarks. At
most eight landmark features can be detected over-
all. Only under very special circumstances (use of an omni-directional camera, all features within the visual field, and no occlusion of a feature by another robot) are all eight features detectable at the same
time. In addition to landmark features, we compute
wall distance features at just four columns in the im-
age. On our Sparrow robot, we use a uni-directional camera with a visual field of roughly 64° (see Fig. 1).
Consequently, even in the best case our robot can
see no more than four landmark features and four distance features at a time.

Fig. 1. The Sparrow-99 robot (left), and the field used in the RoboCup middle-size robot league and the position of the visually detectable features.

Thus, features available
for localization are sparse, at least compared to the standard approach, where a laser scanner provides hundreds of distance estimates per second. In many situations, e.g. when located close to the middle line, facing either sideline, and looking down it, the robot detects no features at all, because no feature is within the visual
field. Thus, the fact that the features can be detected
only sporadically is another significant difference to
previous applications of MCL.
We also performed some experiments in office en-
vironments, where we just applied a simple distance
estimator technique. Vertical edges would be good candidates for landmarks. However, aside from selecting features that are sufficiently easy and reliable to detect, the problem of online generation of feature vector estimates based on given positions must be solved. Standard world representations used for robot mapping permit computing distance estimates sufficiently fast, but more advanced features, like corners or edges of posters hanging on walls, require an extension of the representation and additional computational effort.
In order to incorporate sensor information on a
feature-based level, two steps are needed: (i) the features have to be extracted from the current camera image; (ii) the samples' weights have to be adapted according to some model P(o|l). This is described in the following paragraphs.

Fig. 2. Feature detection in RoboCup (from left to right): original image, segmented image, detection of goal post, corner, and walls.
4.3. Feature detection
The camera image is color-segmented in order to
simplify the feature detection process. On the seg-
mented image we apply filters detecting color discon-
tinuities to yield the respective feature detectors (see
Fig. 2). The goal post detector detects a horizontal
white–blue or a white–yellow transition for the blue or yellow goal post, respectively. The corner detector searches for a horizontal green–white–green transition in the image. The distance estimator estimates the distance to the walls based on detected vertical green–white transitions (green–blue or green–yellow in goal areas) in the image. At the moment, we select
four specific columns in the image for detecting the
walls.
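As an illustration of these detectors, a column-scan sketch for the wall-distance feature follows (our own simplification; the color labels, the chosen columns, and all camera parameters are assumptions):

```python
GREEN, WHITE = 0, 1       # labels produced by the color segmentation step

def wall_row(segmented, column):
    # Scan one image column bottom-up and return the row of the first
    # vertical green->white transition (floor meets wall), or None.
    for r in range(len(segmented) - 1, 0, -1):
        if segmented[r][column] == GREEN and segmented[r - 1][column] == WHITE:
            return r
    return None

def wall_distance(row, cam_height=0.3, focal_px=300.0, horizon_row=120):
    # Project the transition row to a floor distance with a pinhole model
    # (hypothetical camera parameters): d = h * f / (row - horizon).
    if row is None or row <= horizon_row:
        return None
    return cam_height * focal_px / (row - horizon_row)
```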
For office environments, we use only distance fea-
tures. Fig. 3 shows the image of a corridor inside our
office building, and the edges detected from it. The
lines starting from the bottom of the image visualize the estimated distances to the walls. Note that there are several errors, mainly in the door areas.
Fig. 3. Feature detection in office environments and their projection into maps.
The right-hand side of Fig. 3 shows projections of the es-
timated distances to maps of laser scans. The brighter
dots show the estimated distances by the visual feature
detector, while the smaller darker dots show the laser
readings in the same situation. Note that the left image shows a situation where a lot of visual distances can be estimated, while in the right image only four distances are estimated at all. Due to the simplicity of the applied feature detector, often only a few distance features are detected, and therefore the situation
is comparable to the RoboCup environment.
4.4. Weight update
As described in Eq. (3), the sample set must be
re-weighted using a sensor model P(o|l). Let the sensor data o be a vector of n features f₁, …, fₙ. If we assume that the detection of features depends solely on the robot's position and does not depend on the detectability of other features, then the features are independent and we can conclude:
P(o|l) = P(f₁, …, fₙ | l) = P(f₁|l) · … · P(fₙ|l).  (5)
Note that n may vary, i.e. it is possible to deal with
varying numbers of features.
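Eq. (5) maps directly onto code: the observation likelihood of a sample is simply the product of the per-feature likelihoods, which handles any number of detected features, including none (a sketch assuming some per-feature model `feature_likelihood` is available):

```python
def observation_likelihood(features, pose, feature_likelihood):
    # Eq. (5): P(o | l) = P(f1 | l) * ... * P(fn | l), where n may vary
    # from image to image. With no detected features the product is 1,
    # so an empty observation leaves the sample weights unchanged.
    p = 1.0
    for f in features:
        p *= feature_likelihood(f, pose)
    return p
```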
The sensor model P(fᵢ|l) describes how likely it is to detect a particular feature fᵢ given a robot lo-
cation l. In our implementation, this sensor model is
computed by comparing the horizontal position of the
detected feature with an estimated position as deter-
mined by a geometrical world model. The similarity
between feature positions is mapped to a probability
estimate by applying a heuristic function as illustrated
in Fig. 4. The application of these heuristics to the
examples used in Fig. 2 is illustrated in Fig. 5, which
also demonstrates that probabilistic combination of
evidence for several features significantly improves
the results.
The figure illustrates various properties of our ap-
proach. First, the heuristic function for each of the four unique goal-post features has only a single peak.
The shape of the function causes all samples with
comparatively high angular error to be drastically
down-valued and successively sorted out. Thus,
detecting a single goal post will reshape the distribu-
tion of the sample set such that mostly locations that
make it likely to see a goal post in a certain direction
will survive in the sample set. Furthermore, if two
goal posts are detected at the same time, the reshap-
ing effect will be overlaid and will drastically focus the
sample set. Secondly, there is only a single heuristic
function that captures all of the ambiguous corner
features. Each actually detected corner is successively
given a probability estimate.

Fig. 4. Heuristics for estimating probabilities P(fᵢ|l) given a current position.

Fig. 5. Probability distributions P(fᵢ|l) for all field positions l (with fixed orientation) given the detection of a particular feature fᵢ or combinations of them (from left to right): goal posts, field corners, wall distances, corner and goal features combined, and all features combined.

If the corner detector misses corners that we expected to see, this does not
do any harm. If the detector returns more corners
than actually expected, the behavior depends on the
particular situation: detecting an extra corner close to
where we actually expected one has a small negative influence on the weight of this sample, while detecting a corner where none was expected at all has a much stronger negative effect. Note that the robustness of
the corner detector can be modeled by the heuristic
function: having a robust detector means assigning
very low probabilities to situations where the similar-
ity between estimated and detected feature position is
small, while for a noisy detector the heuristic function
would assign higher probabilities in such situations.
The effect of detecting too many features or detecting
them at wrong positions is accordingly different.
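The heuristic of Fig. 4 is defined only graphically in the paper, so the following stand-in with the same qualitative shape is our assumption: a peak where the detected and the predicted horizontal feature positions agree, plus a small floor so that unexpected detections are penalized but never impossible, mirroring the robust-versus-noisy trade-off just described.

```python
import math

def expected_bearing(pose, landmark):
    # Predicted horizontal position of a landmark, i.e. its bearing
    # relative to the robot heading, from the geometrical world model.
    x, y, theta = pose
    lx, ly = landmark
    return math.atan2(ly - y, lx - x) - theta

def feature_likelihood(detected, expected, sigma=0.15, floor=0.05):
    # Map angular disagreement to a probability estimate P(f_i | l).
    # sigma and floor are assumed values: a robust detector would use a
    # low floor (wrong detections heavily penalized), a noisy detector
    # a higher one.
    err = (detected - expected + math.pi) % (2 * math.pi) - math.pi
    return floor + (1.0 - floor) * math.exp(-0.5 * (err / sigma) ** 2)
```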
5. Experiments
5.1. Number of features
A Sparrow-99 robot was placed in one corner of
the field. In order to have an accurate reference path,
we moved the robot by hand along a rectangular tra-
jectory indicated by the dots. This simple experiment
demonstrates the practical usability of the approach. It
is shown that the robot can localize robustly and that the accuracy can be improved by adding more features.
Fig. 6. Feasibility experiments.
The first image in Fig. 6 shows the trajectory given
by the odometry when executing a single round. Note
the displacement at the end of the tour. The second
image displays the corrected path found by the lo-
calization algorithm using only the goal posts as fea-
tures. In the third image we can see a highly accurate path found by the localization algorithm which
used all possible features. The fourth image shows the
drift that occurs when moving multiple rounds without
self-localization. The fifth image displays the trajec-
tory as determined with self-localization, which does
not drift away.
5.2. Number of samples
When implementing sample-based localization, it is important to have an idea of how many samples are needed. Obviously, a small number of samples is pre-
ferred, since the computational effort increases with
the number of samples. On the other hand, an appro-
priate number of samples is needed in order to achieve
the desired accuracy. In this experiment, we moved
four rounds exactly as in the previous experiment and
used five different numbers of samples: 50, 100, 150,
1000, 5000.
Fig. 7 illustrates the average localization errors (left
image) and both average error and maximum error
(right) for different numbers of samples during the experiment. The result is that a few hundred samples
are usually sufficient to keep the average localization error sufficiently low.

Fig. 7. Average localization errors of the different sample numbers (left); average error and maximum error of the same five runs (50, 100, 150, 1000 and 5000 samples used) (right).
5.3. Stability and robustness
In order to evaluate the stability of our approach in
the presence of sparse and sporadic data, we logged
the number of landmark and distance features detected at every cycle (see Fig. 8, top and middle) and compared this to the development of the standard deviation over all samples with respect to the true robot position (bottom).

Fig. 8. Demonstrating the robustness of the approach.
The graphics illustrate that: (i) distance feature de-
tection is more robust than detection of landmark fea-
tures; (ii) the method is quite stable with respect to
sparsity and sporadicity of detected features; (iii) only
after neither kind of feature can be detected for some time does the accuracy of the robot position degrade.
Fig. 9. Performance of sporadic visual localization during a sample
run in an office environment.
5.4. Feasibility for office environments
It was interesting to see whether our approach
chosen for the RoboCup environment would also be
feasible for more complex environments, like regular corridors and rooms. Our B21 robot Stanislav was positioned at one end (left side of Fig. 9) of a corridor
of about 20 m length and 2.2 m width. The camera,
mounted at about 1.5 m height, was directed straight
to the front. Only distance features were computed.
No use was made of the pan-tilt unit to reposition
the camera during the experiment. We used 1000
samples which were initialized around the approximate robot position with a diameter of 1 m and an angular uncertainty of 20°.
The robot was able to track the path along the cor-
ridor up to the right side and the way back to the starting point. Then, while rotating 180°, the camera could
not see the floor any more. As a result, many wrong
distance estimates were generated due to visual arti-
facts, and the samples drifted away. Potential remedies
include active use of the pan-tilt unit to ensure the floor can
be seen at all times or exploiting additional landmark
features such as corners, doors, or posters on walls.
6. Conclusions
In this paper, we used the Monte Carlo approach
for vision-based localization of a soccer robot on the
RoboCup soccer field. Unlike many previous applica-
tions, the robot could not use distance sensors like laser
scanners or sonars. Also, special camera setups were
not available. Instead, the onboard camera was used
for localization purposes in addition to object recog-
nition tasks. As a consequence, sensor input to update
the robot's belief about its position was sparse and sporadic. Nevertheless, the experimental evaluation
demonstrated that MCL works quite well even under
these restrictive conditions. Even with very small num-
bers of detected features, the localization results can
be quite usable, but naturally, increasing the number
of visual features available dramatically enhances ac-
curacy. Finally, first experimental results indicate that
the vision-based localization approach promises good
results also in office environments. Considering that
nowadays cameras are much cheaper than laser scan-
ners and sonar rings, vision-based robot localization
could provide a cost-effective alternative for commer-
cial robot applications.
Acknowledgements
The authors would like to thank Marcus Ritter and
Heiko Folkerts for implementing most of the algo-
rithms and performing many experiments, and Stefan
Sablatnög, Hans Utz, and Günther Palm for helpful
discussions and their continuing support.
References

[1] K. Chung, Markov Chains with Stationary Transition Probabilities, Springer, Berlin, 1960.
[2] T. Dean, M. Boddy, An analysis of time-dependent planning, in: Proceedings of the AAAI-88, St. Paul, MN, 1988, pp. 49–54.
[3] F. Dellaert, W. Burgard, D. Fox, S. Thrun, Using the condensation algorithm for robust, vision-based mobile robot localization, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999, pp. 588–594.
[4] D. Fox, Markov localization: A probabilistic framework for mobile robot localization and navigation, Ph.D. Thesis, University of Bonn, Bonn, Germany, December 1998.
[5] D. Fox, W. Burgard, F. Dellaert, S. Thrun, Monte Carlo localization: Efficient position estimation for mobile robots, in: Proceedings of the AAAI-99, Orlando, FL, 1999, pp. 343–349.
[6] J.-S. Gutmann, W. Burgard, D. Fox, K. Konolige, An experimental comparison of localization methods, in: Proceedings of the International Conference on Intelligent Robots and Systems (IROS'98), Victoria, BC, October 1998.
[7] H. Kitano, M. Asada, Y. Kuniyoshi, I. Noda, E. Osawa, H. Matsubara, RoboCup: A challenge problem for AI, AI Magazine 18 (1) (1997) 73–85.
[8] P.S. Maybeck, The Kalman Filter: An Introduction to Concepts, Springer, Berlin, 1990, pp. 194–204.
[9] B. Schiele, J.L. Crowley, A comparison of position estimation techniques using occupancy grids, in: Proceedings of the 1994 IEEE International Conference on Robotics and Automation, San Diego, CA, May 1994, pp. 1628–1634.
[10] A.F.M. Smith, A.E. Gelfand, Bayesian statistics without tears: A sampling–resampling perspective, American Statistician 46 (2) (1992) 84–88.
Gerhard K. Kraetzschmar is a Senior
Research Scientist and Assistant Profes-
sor at the Neural Information Process-
ing Department at University of Ulm.
His research interests include autonomous
mobile robots, neurosymbolic integration,
learning and memory, and robot control
and agent architectures. He is a member
of IEEE, AAAI, ACM, IAS, and GI.
Stefan Enderle is a Ph.D. candidate at
the Neural Information Processing Depart-
ment at University of Ulm. His research
interests include sensor interpretation and
fusion, map building and self-localization in robotics, and software development for
autonomous mobile robots.