Probabilistic Robotics

  • Upload
    asif

  • View
    84

  • Download
    1

Embed Size (px)

DESCRIPTION

Probablistic robotics by sebastian thrun

Citation preview

  • PROBABILISTICROBOTICS

    Sebastian THRUNStanfo rd Universit y

    Stanfo rd, CA

    Wolfram BURGARDUniversit y of Freiburg

    Freiburg, Germany

    Dieter FOXUniversit y of W ashington

    Seattle, WA

    EARLY DRAFTNO T FORDISTRIBUTIONc SebastianThrun,DieterFox, WolframBurgard,1999-2000

  • CONTENTS

    1 INTRODUCTION 11.1 Uncertaintyin Robotics 11.2 ProbabilisticRobotics 31.3 Implications 51.4 RoadMap 61.5 BibliographicalRemarks 7

    2 RECURSIVE STATE ESTIMATION 92.1 Introduction 92.2 BasicConceptsin Probability 102.3 RobotEnvironmentInteraction 16

    2.3.1 State 162.3.2 EnvironmentInteraction 182.3.3 ProbabilisticGenerative Laws 202.3.4 Belief Distributions 22

    2.4 BayesFilters 232.4.1 TheBayesFilter Algorithm 232.4.2 Example 242.4.3 MathematicalDerivationof theBayesFilter 282.4.4 TheMarkov Assumption 30

    2.5 RepresentationandComputation 302.6 Summary 312.7 BibliographicalRemarks 32

    3 GAUSSIAN FILTERS 333.1 Introduction 333.2 TheKalmanFilter 34

    v

  • vi PROBABILISTIC ROBOTICS

    3.2.1 LinearGaussianSystems 343.2.2 TheKalmanFilter Algorithm 363.2.3 Illustration 373.2.4 MathematicalDerivationof theKF 39

    3.3 TheExtendedKalmanFilter 483.3.1 LinearizationVia TaylorExpansion 493.3.2 TheEKF Algorithm 503.3.3 MathematicalDerivationof theEKF 513.3.4 PracticalConsiderations 53

    3.4 TheInformationFilter 553.4.1 CanonicalRepresentation 553.4.2 TheInformationFilter Algorithm 573.4.3 MathematicalDerivationof theInformationFilter 583.4.4 TheExtendedInformationFilter Algorithm 603.4.5 MathematicalDerivation of the ExtendedInformation

    Filter 613.4.6 PracticalConsiderations 62

    3.5 Summary 643.6 BibliographicalRemarks 65

    4 NONPARAMETRIC FILTERS 674.1 TheHistogramFilter 68

    4.1.1 TheDiscreteBayesFilter Algorithm 694.1.2 ContinuousState 694.1.3 DecompositionTechniques 734.1.4 BinaryBayesFiltersWith StaticState 74

    4.2 TheParticleFilter 774.2.1 BasicAlgorithm 774.2.2 ImportanceSampling 804.2.3 MathematicalDerivationof thePF 824.2.4 Propertiesof theParticleFilter 84

    4.3 Summary 894.4 BibliographicalRemarks 90

    5 ROBOT MOTION 915.1 Introduction 91

  • Contents vii

    5.2 Preliminaries 925.2.1 KinematicConguration 925.2.2 ProbabilisticKinematics 93

    5.3 VelocityMotion Model 955.3.1 ClosedFormCalculation 955.3.2 SamplingAlgorithm 965.3.3 MathematicalDerivation 99

    5.4 OdometryMotion Model 1075.4.1 ClosedFormCalculation 1085.4.2 SamplingAlgorithm 1115.4.3 MathematicalDerivation 113

    5.5 Motion andMaps 1145.6 Summary 1185.7 BibliographicalRemarks 119

    6 MEASUREMENTS 1216.1 Introduction 1216.2 Maps 1236.3 BeamModelsof RangeFinders 124

    6.3.1 TheBasicMeasurementAlgorithm 1246.3.2 AdjustingtheIntrinsicModelParameters 1296.3.3 MathematicalDerivation 1346.3.4 PracticalConsiderations 138

    6.4 LikelihoodFieldsfor RangeFinders 1396.4.1 BasicAlgorithm 1396.4.2 Extensions 143

    6.5 Correlation-BasedSensorModels 1456.6 Feature-BasedSensorModels 147

    6.6.1 FeatureExtraction 1476.6.2 LandmarkMeasurements 1486.6.3 SensorModelWith Known Correspondence 1496.6.4 SamplingPoses 1506.6.5 FurtherConsiderations 152

    6.7 PracticalConsiderations 1536.8 Summary 154

  • viii PROBABILISTIC ROBOTICS

    7 MOBILE ROBOT LOCALIZA TION 1577.1 Introduction 1577.2 A Taxonomyof LocalizationProblems 1587.3 Markov Localization 1627.4 Illustrationof Markov Localization 1647.5 EKF Localization 166

    7.5.1 Illustration 1677.5.2 TheEKF LocalizationAlgorithm 1687.5.3 MathematicalDerivation 170

    7.6 EstimatingCorrespondences 1747.6.1 EKF Localizationwith Unknown Correspondences 1747.6.2 MathematicalDerivation 176

    7.7 Multi-HypothesisTracking 1797.8 PracticalConsiderations 1817.9 Summary 184

    8 GRID AND MONTE CARLO LOCALIZA TION 1878.1 Introduction 1878.2 Grid Localization 188

    8.2.1 BasicAlgorithm 1888.2.2 Grid Resolutions 1898.2.3 ComputationalConsiderations 1938.2.4 Illustration 195

    8.3 MonteCarloLocalization 2008.3.1 TheMCL Algorithm 2008.3.2 Propertiesof MCL 2018.3.3 RandomParticleMCL: Recovery from Failures 2048.3.4 Modifying theProposalDistribution 209

    8.4 Localizationin DynamicEnvironments 2118.5 PracticalConsiderations 2168.6 Summary 2188.7 Exercises 219

    9 OCCUPANCY GRID MAPPING 2219.1 Introduction 2219.2 TheOccupancy Grid MappingAlgorithm 224

  • Contents ix

    9.2.1 Multi-SensorFusion 2309.3 LearningInverseMeasurementModels 232

    9.3.1 InvertingtheMeasurementModel 2329.3.2 Samplingfrom theForwardModel 2339.3.3 TheErrorFunction 2349.3.4 FurtherConsiderations 236

    9.4 MaximumA PosteriorOccupancy Mapping 2389.4.1 TheCasefor MaintainingDependencies 2389.4.2 Occupancy Grid Mappingwith ForwardModels 240

    9.5 Summary 242

    10 SIMULTANEOUS LOCALIZA TION ANDMAPPING 24510.1 Introduction 24510.2 SLAM with ExtendedKalmanFilters 248

    10.2.1SetupandAssumptions 24810.2.2SLAM with Known Correspondence 24810.2.3MathematicalDerivation 252

    10.3 EKF SLAM with Unknown Correspondences 25610.3.1TheGeneralEKF SLAM Algorithm 25610.3.2Examples 26010.3.3FeatureSelectionandMapManagement 262

    10.4 Summary 26410.5 BibliographicalRemarks 26510.6 Projects 265

    11 THE EXTENDED INFORMA TION FORMALGORITHM 26711.1 Introduction 26711.2 IntuitiveDescription 26811.3 TheEIF SLAM Algorithm 27111.4 MathematicalDerivation 276

    11.4.1TheFull SLAM Posterior 27711.4.2TaylorExpansion 27811.4.3ConstructingtheInformationForm 28011.4.4ReducingtheInformationForm 283

  • x PROBABILISTIC ROBOTICS

    11.4.5RecoveringthePathandtheMap 28511.5 DataAssociationin theEIF 286

    11.5.1The EIF SLAM Algorithm With Unknown Correspon-dence 287

    11.5.2MathematicalDerivation 29011.6 Efciency Consideration 29211.7 EmpiricalImplementation 29411.8 Summary 300

    12 THE SPARSE EXTENDED INFORMA TIONFILTER 30312.1 Introduction 30312.2 IntuitiveDescription 30512.3 TheSEIFSLAM Algorithm 30812.4 MathematicalDerivation 312

    12.4.1Motion Update 31212.4.2MeasurementUpdates 316

    12.5 Sparsication 31612.5.1GeneralIdea 31612.5.2Sparsicationsin SEIFs 31812.5.3MathematicalDerivation 319

    12.6 AmortizedApproximateMapRecovery 32012.7 How SparseShouldSEIFsBe? 32312.8 IncrementalDataAssociation 328

    12.8.1ComputingDataAssociationProbabilities 32812.8.2PracticalConsiderations 330

    12.9 Tree-BasedDataAssociation 33512.9.1CalculatingDataAssociationProbaiblities 33612.9.2TreeSearch 33912.9.3Equivalency Constraints 34012.9.4PracticalConsiderations 341

    12.10Multi-VehicleSLAM 34412.10.1FusingMapsAcquiredby Multiple Robots 34412.10.2EstablishingCorrespondence 347

    12.11Discussion 349

  • Contents xi

    13 MAPPING WITH UNKNOWN DATAASSOCIATION 35313.1 LatestDerivation 35313.2 Motivation 35613.3 Mappingwith EM: TheBasicIdea 35813.4 Mappingwith theEM Algorithm 365

    13.4.1TheEM MappingAlgorithm 36513.4.2TheMapLikelihoodFunction 36713.4.3Efcient MaximumLikelihoodEstimation 37013.4.4TheE-step 37113.4.5TheM-step 37613.4.6Examples 379

    13.5 Grid-BasedImplementation 37913.6 LayeredEM Mapping 381

    13.6.1LayeredMapRepresentations 38213.6.2LocalMaps 38313.6.3ThePerceptualModelFor LayeredMaps 38413.6.4EM with LayeredMaps 38613.6.5TheLayeredEM MappingAlgorithm 38913.6.6Examples 389

    13.7 Summary 39013.8 BibliographicalRemarks 391

    14 FAST INCREMENT AL MAPPING ALGORITHMS 39314.1 Motivation 39314.2 IncrementalLikelihoodMaximization 39514.3 MaximumLikelihoodasGradientDescent 398

    14.3.1Searchin PoseSpace 39814.3.2GradientCalculation 40014.3.3Suggestionsfor theImplementation 40314.3.4Examples 40414.3.5Limitations 406

    14.4 IncrementalMappingwith PosteriorEstimation 40714.4.1DetectingCycles 40714.4.2CorrectingPosesBackwardsin Time 40814.4.3Illustrations 410

  • xii PROBABILISTIC ROBOTICS

    14.5 Multi-RobotMapping 41214.6 Mappingin 3D 41414.7 Summary 41814.8 BibligraphicalRemarks 41914.9 Projects 419

    15 MARK OV DEVISION PROCESSES 42115.1 Motivation 42115.2 Uncertaintyin Action Selection 42415.3 ValueIteration 427

    15.3.1GoalsandPayoff 42715.3.2FindingControlPoliciesin Fully ObservableDomains 43115.3.3ValueIteration 43315.3.4Illustration 435

    16 PARTIALL Y OBSERVABLE MARK OV DECISIONPROCESSES 43716.1 Motivation 43716.2 FiniteEnvironments 439

    16.2.1An IllustrativeExample 43916.2.2ValueIterationin Belief Space 44816.2.3CalculatingtheValueFunction 45016.2.4LinearProgrammingSolution 455

    16.3 GeneralPOMDPs 45816.3.1TheGeneralPOMDPAlgorithm 461

    16.4 A MonteCarloApproximation 46216.4.1MonteCarloBackups 462

    16.4.1.1LearningValueFunctions 46516.4.1.2NearestNeighbor 465

    16.4.2ExperimentalResults 46616.5 AugmentedMarkov DecisionProcesses 468

    16.5.1TheAugmentedStateSpace 46916.5.2ValueIterationin AMDPs 47016.5.3Illustration 472

    16.6 Summary 47516.7 BibliographicalRemarks 475

  • Contents xiii

    16.8 Projects 475

    REFERENCES 477

  • 1INTR ODUCTION

    1.1 UNCER TAINTY IN ROBOTICS

    Roboticsis the scienceof perceiving and manipulatingthe physical world throughcomputer-controlledmechanicaldevices.Examplesof successfulroboticsystemsin-cludemobileplatformsfor planetaryexploration[], roboticsarmsin assemblylines[],carsthat travel autonomouslyon highways [], actuatedarmsthat assistsurgeons[].Roboticssystemshave in commonthatthey arearesituatedin thephysicalworld, per-ceive their environmentsthroughsensors,andmanipulatetheir environmentthroughthingsthatmove.

    While muchof roboticsis still in its infancy, the ideaof intelligent manipulatingdeviceshasan enormouspotentialto changesociety. Wouldnt it be greatif all ourcarswereableto safelysteerthemselves,makingcaraccidentsa notionof thepast?Wouldnt it begreatif robots,andnot people,would cleanup nucleardisasterssiteslike Chernobyl?Wouldnt it begreatif our homeswerepopulatedby intelligentser-vice robots that would carry out such tedioustasksas loading the dishwasher, andvacuumingthe carpet,or walking our dogs? And lastly, a betterunderstandingofroboticswill ultimately leadto abetterunderstandingof animalsandpeople.

    Tomorrows applicationdomainsdiffer from yesterdays,suchasmanipulatorsin as-semblylines thatcarryout the identicaltaskday-inday-out.Themoststriking char-acteristicof the new robot systemsis that they operatein increasinglyunstructuredenvironments,environmentsthatareinherentlyunpredictable.An assemblyline is or-dersof magnitudemorepredictableandcontrollablethana privatehome.As a result,roboticsis moving into areaswheresensorinputbecomesincreasinglyimportant,andwhererobotsoftwarehasto berobustenoughto copewith arangeof situationsoftentoo many to anticipatethemall. Robotics,thus,is increasinglybecominga software

    1

  • 2 Chapter 1

    science,wherethegoal is to developrobustsoftwarethatenablesrobotsto withstandthenumerouschallengesarisingin unstructuredanddynamicenvironments.

    Thisbookfocusesonakey elementof robotics:Uncertainty. Uncertaintyarisesif therobot lackscritical information for carryingout its task. It arisesfrom ve differentfactors:

    1. Envir onments. Physicalworldsareinherentlyunpredictable.While thedegreeof uncertaintyin well-structuredenvironmentssuchassemblylines is small,en-vironmentssuchashighwaysandprivatehomesarehighly dynamicandunpre-dictable.

    2. Sensors.Sensorsare inherentlylimited in what they canperceive. Limitationsarisefrom two primary factors. First, rangeand resolutionof a sensoris sub-ject to physical laws. For example,Camerascant seethroughwalls, andevenwithin the perceptualrangethe spatialresolutionof cameraimagesis limited.Second,sensorsaresubjectto noise,whichperturbssensormeasurementsin un-predictableways and hencelimits the information that can be extractedfromsensormeasurements.

    3. Robots.Robotactuationinvolvesmotorsthatare,at leastto someextent,unpre-dictable,dueeffects like controlnoiseandwear-and-tear. Someactuators,suchas heavy-duty industrial robot arms,are quite accurate. Others,like low-costmobilerobots,canbeextremelyinaccurate.

    4. Models. Modelsare inherentlyinaccurate.Modelsareabstractionsof the realworld. As such,they only partially modeltheunderlyingphysicalprocessesofthe robotandits environment.Model errorsarea sourceof uncertaintythathaslargely beenignoredin robotics,despitethe fact thatmostroboticmodelsusedin state-or-the-artroboticssystemsarerathercrude.

    5. Computation. Robotsarereal-timesystems,which limits the amountof com-putationthatcanbecarriedout. Many state-of-the-artalgorithms(suchasmostof the algorithmsdescribedin this book) areapproximate,achieving timely re-sponsethroughsacricing accuracy.

    All of thesefactorsgive riseto uncertainty. Traditionally, suchuncertaintyhasmostlybeenignoredin robotics.However, asrobotsaremoving away from factoryoors intoincreasinglyunstructuredenvironments,theability to copewith uncertaintyis criticalfor building successfulrobots.

  • Introduction 3

    1.2 PR OBABILISTIC ROBOTICS

    Thisbookprovidesacomprehensiveoverview of probabilisticalgorithmsfor robotics.Probabilisticroboticsis anew approachto roboticsthatpaystributeto theuncertaintyin robotperceptionandaction.They key ideaof probabilisticroboticsis to representuncertaintyexplicitly, usingthecalculusof probabilitytheory. Putdifferently, insteadof relying on a single best guessas to what might be the casein the world, prob-abilistic algorithmsrepresentinformation by probability distributions over a wholespaceof possiblehypotheses.By doingso, they canrepresentambiguityanddegreeof belief in a mathematicallysoundway, enablingthemto accommodateall sourcesof uncertaintylisted above. Moreover, by basingcontrol decisionson probabilisticinformation,thesealgorithmsdegradenicely in the faceof thevarioussourcesof un-certaintydescribedabove, leadingto new solutionsto hardroboticsproblems.

    Let us illustrate the probabilisticapproachwith a motivating example: mobile robotlocalization.Localizationis theproblemof estimatinga robots coordinatesin anex-ternalreferenceframefrom sensordata,usinga mapof theenvironment.Figure1.1illustratestheprobabilisticapproachto mobile robot localization.Thespecic local-izationproblemstudiedhereis known asglobal localization, wherea robot is placedsomewherein the environmentandhasto localize itself from scratch. In the proba-bilistic paradigm,the robots momentaryestimate(alsocalledbelief) is representedby a probability densityfunction over the spaceof all locations. This is illustratedin the rst diagramin Figure1.1,which shows a uniform distribution (theprior) thatcorrespondsto maximumuncertainty. Supposethe robot takes a rst sensormea-surementandobserves that it is next to a door. The resultingbelief, shown in theseconddiagramin Figure1.1,placeshigh probabilityat placesnext to doorsandlowprobabilityelsewhere.Notice that this distribution possessesthreepeaks,eachcorre-spondingto oneof the(indistinguishable)doorsin theenvironment.Furthermore,theresultingdistribution assignshigh probability to threedistinct locations,illustratingthattheprobabilisticframework canhandlemultiple,conicting hypothesesthatnatu-rally arisein ambiguoussituations.Finally, evennon-doorlocationspossessnon-zeroprobability. This is accountedby the uncertaintyinherentin sensing:With a small,non-zeroprobability, the robot might err and actually not be next to a door. Nowsupposethe robot moves. The third diagramin Figure1.1 shows the effect of robotmotionon its belief,assumingthat therobotmovedasindicated.Thebelief is shiftedin thedirectionof motion. It is alsosmoothed,to accountfor the inherentuncertaintyin robotmotion. Finally, the fourth and lastdiagramin Figure1.1 depictsthebeliefafter observinganotherdoor. This observation leadsour algorithm to placemostofthe probability masson a locationnearoneof the doors,and the robot is now quitecondent asto whereit is.

  • 4 Chapter 1

    x

    bel(x)

    x

    bel(x)

    x

    p(z|x)

    x

    bel(x)

    x

    bel(x)

    x

    p(z|x)

    x

    bel(x)

    Figure1.1 Thebasicideaof Markov localization:A mobilerobotduringgloballocaliza-tion.

  • Introduction 5

    This example illustratesthe probabilisticparadigmin the context of a specic per-ceptualproblem. Statedprobabilistically, the robotperceptionproblemis a statees-timation problem,andour localizationexampleusesan algorithmknown asBayeslter for posteriorestimationover thespaceof robotlocations.Similarly, whenselect-ing actions,probabilisticapproachesconsidersthe full uncertainty, not just themostlikely guess.By doingso,theprobabilisticapproachtradesoff informationgathering(exploration)andexploitation,andactoptimally relative to thestateof knowledge.

    1.3 IMPLICA TIONS

    Whataretheadvantagesof programmingrobotsprobabilistically, whencomparedtootherapproachesthatdo not representuncertaintyexplicitly? Our centralconjectureis nothinglessthanthefollowing:

    A robotthatcarriesa notionof its ownuncertaintyandthatactsaccordinglyis superiorto onethatdoesnot.

    In particular, probabilisticapproachesaretypically morerobust in the faceof sensorlimitations,sensornoise,environmentdynamics,andso on. They often scalemuchbetterto complex andunstructuredenvironments,wherethe ability to handleuncer-tainty is of evengreaterimportance.In fact,certainprobabilisticalgorithmsarecur-rently theonly known workingsolutionsto hardroboticestimationproblems,suchasthekidnappedrobotproblem, in which a mobilerobotmustrecover from localizationfailure; or the problemof building accuratemapsof very large environments,in theabsenceof a globalpositioningdevice suchasGPS.Additionally, probabilisticalgo-rithmsmakemuchweakerrequirementson theaccuracy of modelsthanmany classicalplanningalgorithmsdo, therebyrelieving the programmerfrom the unsurmountableburdento comeupwith accuratemodels.Viewedprobabilistically, therobot learningproblemis a long-termestimationproblem.Thus,probabilisticalgorithmsprovide asoundmethodologyfor many a vorsof robot learning. And nally , probabilistical-gorithmsarebroadlyapplicableto virtually every probleminvolving perceptionandactionin therealworld.

    However, theseadvantagescomeat a price. Traditionally, the two most frequentlycited limitations of probabilistic algorithmsare computationalinefciency, and aneedto approximate. Probabilisticalgorithmsare inherentlylessefcient thannon-probabilisticones,dueto the fact that they considerentireprobabilitydensities.Theneedto approximatearisesfrom thefactthatmostrobotworldsarecontinuous.Com-putingexactposteriordistributionsis typically infeasible,sincedistributionsover the

  • 6 Chapter 1

    continuumpossessinnitely many dimensions.Sometimes,one is fortunatein thattheuncertaintycanapproximatedtightly with a compactparametricmodel(e.g.,dis-cretedistributionsor Gaussians);in othercases,suchapproximationsare too crudeandmorecomplicatedrepresentationsmostbe employed. Recentresearchhassuc-cessfullyled to a rangeof arecomputationallyefcient probabilisticalgorithms,for arangeof hardroboticsproblemsmany of whicharedescribedin depthin thisbook.

    1.4 ROAD MAP

    This bookattemptsto provide a comprehensive andin-depthintroductioninto proba-bilistic robotics.Thechoiceof materialis somewhatbiasedtowardsresearchcarriedout at Carnegie Mellon University, theUniversityof Bonn,andafliated labs. How-ever, we have attemptedto include in-depthdescriptionsof other, importantproba-bilistic algorithms. The algorithmsdescribedherehave beendevelopedfor mobilerobots;however, many of themareequallyapplicableto othertypesof robots.Thus,thecoverageof thematerialis by nomeanscomplete;probabilisticideashave recentlybecomeextremelypopularin robotics,anda completedescriptionof the eld wouldsimply not t into a singlebook. However, we believe that thechoiceof materialisrepresentative for theexistingbodyof literature.

    The goal of the book is to provide a systematicintroduction into the probabilisticparadigm,from theunderlyingmathematicalframework to implementation.For eachmajoralgorithm,thisbookprovides

    acompletemathematicalderivation,

    pseudo-codein aC-like language,

    discussionsof implementationdetailsandpotentialpitfalls,and

    empiricalresultsobtainedin elded systems.

    We believe thatall four itemsareessentialfor obtaininga deepunderstandingof theprobabilisticparadigm. At the end of eachchapter, the book also provides biblio-graphicalnotesanda list of questionsandexercises.

    Thebookhasbeenwrittenwith researchers,graduatestudentsor advancedundergrad-uatestudentsin mind,specializingin roboticsor appliedstatistics.Wehaveattemptedto presentthematerialin a way that requiresa minimumof backgroundknowledge.

  • Introduction 7

    However, basicknowledgeof probability theorywill almostcertainlyhelp in under-standingthematerial.Thevariousmathematicalderivationscaneasilybeskippedatrst reading.However, we stronglyrecommendto take the time andstudythemathe-maticalderivations,asaprofoundmathematicalunderstandingis will almostcertainlyleadto deepand importantinsightsinto theworking of theprobabilisticapproachtorobotics.

    If usedin theclassroom,eachchaptershouldbecoveredin oneor two lectures;how-ever, we recommendthatthestudyof thebookbeaccompaniedby practical,hands-onexperimentationasdirectedby thequestionsandexercisesat theendof eachchapter.

    Thisbook is organizedin four majorparts.

    The rst part,Chapters2 through5, discussthe basicmathematicalframeworkthatunderliesall of thealgorithmsdescribedin this book. Chapters2 through4introducethebasicprobabilisticnotationanddescribesa collectionof lters forprobabilisticstateestimation.Chapter5 discussesspecic probabilisticmodelsthatcharacterizemobilerobotperceptionandmotion.

    The secondpart, which comprisedChapters7 to ??, describesa rangeof per-ceptualalgorithms,which mapsensormeasurementsinto internalrobotbeliefs.In particular, Chapter7 describesalgorihtmsfor mobile robot localization,fol-lowedby algorithmsfor mapacquisitiondescribedin Chapters??. Thispartalsocontainsachapteron learningmodels.

    Thethird part,in Chapters?? to ??, introducesprobabilisticplanningandactionselectionalgorihtms.

    Finally, Chapter?? describestwo robot systemsthat werecontrolledby proba-bilistic algorithms.Theserobotsweredeployed in museumsasinteractive tour-guiderobots,wherethey managedto navigatereliablywithout theneedto modifythemuseumsin any way.

    The book is bestread in order, from the beginning to the end. However, we haveattemptedto make eachindividualchapterself-explanatory.

    1.5 BIBLIOGRAPHICAL REMARKS

    The term robotwas inventedin 1921 by the Czechnovelist Karel Capek[42], to describea willing,intelligentandhuman-like machinesthatmake life pleasantby doing the typework we dont like to do. IntheFourties,Asimov coinedthetermroboticsandpostulatedthefamousthreelawsof robotics[1, 2],

  • 8 Chapter 1

    Robotics,asa scientific disciplinehasbeenan active field of researchfoer several decades.In the earlyyears,mostof theresearchfocusedon

    Major trendsin robotics:

    Seventies:classical(deliberate)approach,accuratemodels,no uncertainty, no sensing.Still: veryhard problem. Reif: planningproblemNP hard, but only doubly exponentialalgorithmsknown.Canny: first singleexponentialplanningalgorithm.Latombe:Many impressive randomizedplanningalgorithms. Importantdifference:randomizationusedfor search,not for representinguncertainty.Planningalgorithmfor specialtopologies:Schwartz,andSharir. Sometakesensordatainto account.Koditscheknavigationfunctionfor feedbackcontrol,generalizesKhatibspotentialfields, whichsuf-fer from localminima(navigationfunctionsdont).

    Mid-Eighties: reactive approach,rejectionof models,purerelianceon sensors,uncertainty(if any)handledby feedbackcontrol. Relieson sensordatacarryingsufficient informationfor actionselec-tion, thereforeuncertaintynot an issue.Typically confined to smallenvironments,whereeverythingof importancecanbeperceivedby thesensors.Nevertheless,someimpressive resultsfor controlofleggedrobots. Brooks,Mataric, Steels,Connell,andmany others. Biologically inspiredrobotics.Smithers.

    Mid-Nineties: Hybrid approaches,reactive at low level (fast decisioncycle), deliberateat higherlevels.Usessensingmostlyat low level,andmodelsathigh level. Gat,Arkins,Balch,Firby, Simmonsandmany others. harvestsbestof both worlds: robust throughreaction,but cando morepowerfultasksdueto deliberation.

    LateNinties:Probabilisticrobotics,differentway to integratemodelsandsensordata.

    Probabilisticroboticscanbe tracedbackto Sixtiesto adventof Kalmanfilters, which hasbeenusedexten-sively in robotics.Firstseriousadvanceis SmithandCheesemanin mid-80s,whopropposeanalgorithmforconcurrentmappingandlocalizationbasedonKalmanfilters thatis now in widespreaduse.Durrant-Whyte,Leonard,Castellanos,andmany others.

    Main activity in pastfi ve years. Main advances:moreflexible representations(beyond Kalmanfilters),moreefficient algorithms.Statisticalapproachesfor solvinghardcorrespondenceproblems.

    What is new in probabilisticrobotics,relative to approachesabove?

    seamlesslyblendsmodelsandperceptionin anovel way

    soundmathematicaltheory, clearassumptions,thereforeitseasierto predictfailuremodes

    appliesto all levels: lowestto highest,sinceuncertaintyariseseverywhere

    currentlythebestknown solutionsin a rangeof hardroboticsproblems

  • 2RECURSIVE STATE ESTIMA TION

    2.1 INTR ODUCTION

    At the coreof probabilisticroboticsis the ideaof estimatingstatefrom sensordata.Stateestimationaddressestheproblemof estimatingquantitiesfrom sensordatathatare not directly observable, but that can be inferred. In most robotic applications,determiningwhat to do is relatively easyif one only knew certain quantities. Forexample,moving a mobile robot is relatively easyif the exact locationof the robotandall nearbyobstaclesareknown. Unfortunately, thesevariablesarenot directlymeasurable.Instead,a robot has to rely on its sensorsto gather this information.Sensorscarryonly partial informationaboutthosequantities,andtheirmeasurementsarecorruptedby noise.Stateestimationseeksto recoverstatevariablesfrom thedata.Probabilisticstateestimationalgorithmscomputebelief distributions over possibleworld states.An exampleof probabilisticstateestimationwasalreadyencounteredinthe introductionto thisbook: mobilerobot localization.

    The goalof this chapteris to introducethe basicvocabulary andmathematicaltoolsfor estimatingstatefrom sensordata.

    Section2.2 introducesbasicprobabilisticconceptsandnotationsusedthroughoutthebook.

    Section2.3describesour formal modelof robotenvironmentinteraction,settingforth someof thekey terminologyusedthroughoutthebook.

    Section2.4 introducesBayeslter s, the recursive algorithmfor stateestimationthatformsthebasisof virtually every techniquepresentedin thisbook.

    9

  • 10 Chapter 2

    Section2.5 discussesrepresentationalandcomputationalissuesthatarisewhenimplementingBayeslters.

    2.2 BASIC CONCEPTS IN PR OBABILITY

    This sectionfamiliarizesthereaderwith thebasicnotationandprobabilisticfactsandnotationusedthroughoutthebook. In probabilisticrobotics,quantitiessuchassensormeasurements,controls,andthestatesa robotandits environmentmight assumeareall modeledasrandomvariables.Randomvariablescantake on multiple values,andthey do soaccordingto specic probabilisticlaws. Probabilisticinferenceis thepro-cessof calculatingtheselaws for randomvariablesthatarederivedfrom otherrandomvariables,suchasthosemodelingsensordata.

    Let X denotea randomvariableandx denotea specic event thatX might take on.A standardexampleof a randomvariableis thatof a coin ip, whereX cantake onthevaluesheador tail. If thespaceof all valuesthatX cantake on is discrete,as isthecaseif X is theoutcomeof acoin ip, wewrite

    p(X = x) (2.1)

    to denotetheprobability that the randomvariableX hasvaluex. For example,a faircoin is characterizedby p(X = head) = p(X = tail ) = 12 . Discreteprobabilitiessumto one,that is,

    X

    x

    p(X = x) = 1 ; (2.2)

    and of course,probabilitiesare always non-negative, that is, p(X = x) 0. Tosimplify the notation,we will usuallyomit explicit mentionof the randomvariablewhenever possible,andinsteadusethecommonabbreviation p(x) insteadof writingp(X = x).

    Most techniquesin this book addressestimationanddecisionmaking in continuousspaces.Continuousspacesarecharacterizedby randomvariablesthat can take on acontinuumof values. Throughoutthis book, we assumethat all continuousrandomvariablespossessprobability densityfunctions(PDFs). A commondensityfunctionis thatof theone-dimensionalnormaldistributionwith mean andvariance 2. This

  • RecursiveStateEstimation 11

    distribution is givenby thefollowing Gaussianfunction:

    p(x) =2 2

    12 exp

    12(x )2

    2

    (2.3)

    Normal distributionsplay a major role in this book. We will frequentlyabbreviatethemasN (x; ; 2), whichspeciestherandomvariable,its mean,andits variance.

    The Normal distribution (2.3) assumesthat x is a scalarvalue. Often, x will be amulti-dimensionalvector. Normal distributionsover vectorsarecalledmultivariate.Multivariatenormaldistributionsarecharacterizedby densityfunctionsof thefollow-ing form:

    p(x) = det (2 ) 12 exp

    12 (x )

    T 1(x )

    (2.4)

    Here is the meanvectorand a (positive semidenite) symmetricmatrix calledcovariancematrix. The superscriptT marksthe transposeof a vector. The readershouldtake a momentto realizethatEquation(2.4) is a strict generalizationof Equa-tion (2.3); both denitions areequivalent if x is a scalarvalue. The PDFsof a one-anda two-dimensionalnormaldistributionaregraphicallydepictedin Figure5.6.

    Equations(2.3)and(2.4)areexamplesof PDFs.Justasdiscreteprobabilitydistribu-tionsalwayssumsup to one,aPDFalwaysintegratesto 1:

    Zp(x) dx = 1 : (2.5)

    However, unlike a discreteprobability, thevalueof a PDF is not boundedabove by 1.Throughoutthisbook,wewill usethetermsprobability, probabilitydensityandprob-ability densityfunction interchangeably. We will silently assumethat all continuousrandomvariablesaremeasurable,andwealsoassumethatall continuousdistributionsactuallypossessdensities.

    The joint distributionof two randomvariablesX andY is givenby

    p(x; y) = p(X = x andY = y) : (2.6)

  • 12 Chapter 2

    Thisexpressiondescribestheprobabilityof theeventthattherandomvariableX takeson thevaluex andthatY takeson thevaluey. If X andY areindependent, wehave

    p(x; y) = p(x) p(y) : (2.7)

    Often,randomvariablescarry informationaboutotherrandomvariables.Supposewealreadyknow thatY s valueis y, andwe would like to know theprobability thatX svalueis x conditionedon thatfact.Suchaprobabilitywill bedenoted

    p(x j y) = p(X = x j Y = y) (2.8)

    andis calledconditionalprobability. If p(y) > 0, thentheconditionalprobability isdened as

    p(x j y) =p(x; y)p(y)

    : (2.9)

    If X andY areindependent,wehave

    p(x j y) =p(x) p(y)

    p(y)= p(x) : (2.10)

    In otherwords,if X andY areindependent,Y tells usnothingaboutthevalueof X .Thereis no advantageof knowing Y if our interestpertainsto knowing X . Indepen-dence,and its generalizationknown asconditionalindependence,playsa major rolethroughoutthisbook.

    An interestingfact,which follows from the denition of conditionalprobabilityandtheaxiomsof probabilitymeasures,is oftenreferredto astheoremof total probability:

    p(x) =X

    y

    p(x j y) p(y) (discretecase) (2.11)

    p(x) =Z

    p(x j y) p(y) dy (continuouscase) (2.12)

    If p(x j y) or p(y) arezero,we dene theproductp(x j y) p(y) to bezero,regardlessof thevalueof theremainingfactor.

  • RecursiveStateEstimation 13

    Equally importantis Bayesrule, which relatesconditionalsof the type p(x j y) totheir in verse, p(y j x). Therule,asstatedhere,requiresp(y) > 0:

    p(x j y) =p(y j x) p(x)

    p(y)=

    p(y j x) p(x)P

    x 0 p(y j x0) p(x0)

    (discrete) (2.13)

    p(x j y) =p(y j x) p(x)

    p(y)=

    p(y j x) p(x)Rp(y j x0) p(x0) dx0

    (continuous) (2.14)

    Bayesruleplaysapredominantrole in probabilisticrobotics.If x is aquantitythatwewould like to infer from y, theprobabilityp(x) will bereferredto asprior probabilitydistribution, andy is called the data (e.g.,a sensormeasurement).The distributionp(x) summarizestheknowledgewe have regardingX prior to incorporatingthedatay. Theprobabilityp(x j y) is calledtheposteriorprobability distribution over X . As(2.14)suggests,Bayesruleprovidesaconvenientway to computeaposteriorp(x j y)using the in verse conditionalprobability p(y j x) alongwith the prior probabilityp(x). In otherwords, if we are interestedin inferring a quantityx from sensordatay, Bayesrule allows us to do so throughthe inverseprobability, which species theprobabilityof datay assumingthatx wasthecase.In robotics,this inverseprobabilityis often coinedgenerative model, since it describes,at somelevel of abstraction,how statevariablesX causesensormeasurementsY .

    An importantobservationis thatthedenominatorof Bayesrule,p(y), doesnotdependon x. Thus, the factorp(y) 1 in Equations(2.13) and (2.14) will be the sameforany valuex in the posteriorp(x j y). For this reason,p(y) 1 is often written asanormalizervariable,andgenericallydenoted :

    p(x j y) = p(y j x) p(x) : (2.15)

    If X is discrete,equationsof this typecanbecomputedasfollows:

    8x : auxx jy = p(y j x) p(x) (2.16)

    auxy =X

    x

    auxx jy (2.17)

    8x : p(x j y) =auxx jyauxy

    ; (2.18)

    whereauxx jy andauxy areauxiliaryvariables.Theseinstructionseffectively calculatep(x j y), but insteadof explicitly computingp(y), they insteadjust normalizethe

  • 14 Chapter 2

    result.Theadvantageof thenotationin (2.15) lies in its brevity. Insteadof explicitlyproviding theexactformulafor anormalizationconstantwhich cangrow largeveryquickly in someof the mathematicalderivationsto followwe simply will usethenormalizer to indicatethat the nal resulthasto benormalizedto 1. Throughoutthis book,normalizersof this typewill bedenoted (or 0, 00, . . . ). We will freelyusethesame in differentequationsto denotenormalizers,even if theiractualvaluesaredifferent.

    Theexpectationof a randomvariableX is givenby

    E [X ] =X

    x

    x p(x) ;

    E [X ] =Z

    x p(x) dx : (2.19)

    Not all randomvariablespossessnite expectations;however, thosethatdonotareofnorelevanceto thematerialpresentedin thisbook.Theexpectationis a linearfunctionof a randomvariable.In particular, wehave

    E [aX + b] = aE [X ] + b (2.20)

    for arbitrarynumericalvaluesa andb. Thecovarianceof X is obtainedasfollows

    Cov[X ] = E [X E [X ]]2 = E [X 2] E [X ]2 (2.21)

    The covariancemeasuresthe squaredexpecteddeviation from the mean. As statedabove, themeanof a multivariatenormaldistribution N (x; ; ) is , andits covari-anceis .

    Anotherimportantcharacteristicof a randomvariableis its entropy. For discreteran-domvariables,theentropy is givenby thefollowing expression:

    H (P ) = E [ log2 p(x)] = X

    x

    p(x) log2 p(x) : (2.22)

    Theconceptof entropy originatesin informationtheory. Theentropy is theexpectedinformation that the value of x carries: log2 p(x) is the numberof bits required

  • RecursiveStateEstimation 15

    to encodex usingan optimal encoding,andp(x) is the probability at which x willbe observed. In this book,entropy will be usedin robotic informationgathering,toexpressthe informationa robotmayreceive uponexecutingspecic actions.

    Finally, wenoticethat it is perfectlyne to conditionany of therulesdiscussedsofaronarbitraryotherrandomvariables,suchasthevariableZ . For example,conditioningBayesruleonZ = z givesus:

    p(x j y; z) =p(y j x; z) p(x j z)

    p(y j z)(2.23)

    Similarly, we canconditionthe rule for combiningprobabilitiesof independentran-domvariables(2.7)onothervariablesz:

    p(x; y j z) = p(x j z) p(y j z) : (2.24)

    Sucha relation is known asconditional independence. As the readereasilyveries,(2.24)is equivalentto

    p(x j z) = p(x j z; y)

    p(y j z) = p(y j z; x) (2.25)

    Conditionalindependenceplaysan importantrole in probabilisticrobotics.It applieswhenever a variabley carriesno informationabouta variablex if anothervariablesvaluez is known. Conditionalindependencedoesnot imply (absolute)independence,that is,

    p(x; y j z) = p(x j z) p(y j z) 6) p(x; y) = p(x) p(y) (2.26)

    Theconverseis alsoin generaluntrue:absoluteindependencedoesnot imply condi-tional independence:

    p(x; y) = p(x) p(y) 6) p(x; y j z) = p(x j z) p(y j z) (2.27)

    In specialcases,however, conditionalandabsoluteindependencemaycoincide.

  • 16 Chapter 2

    Control system

    Environment, state

    Perceptual/action data

    Actions

    World model, belief

    Figure2.1 RobotEnvironmentInteraction.

    2.3 ROBOT ENVIR ONMENTINTERA CTION

    Figure 2.1 illustratesthe interactionof a robot with its environment. The environ-ment, or world, of a robot is a dynamicalsystemthat possessesinternalstate. Therobotcanacquireinformationaboutits environmentusingits sensors.However, sen-sorsarenoisy, and thereareusuallymany things that cannotbe senseddirectly. Asa consequence,the robot maintainsan internalbelief with regardsto the stateof itsenvironment,depictedon the left in this gure. The robot canalso inuence its en-vironmentthroughits actuators.However, the effect of doing so is often somewhatunpredictable.This interactionwill now bedescribedmoreformally.

    2.3.1 State

    Environmentsarecharacterizedby state. For the materialpresentedin this book, itwill be convenientto think of stateas the collectionof all aspectsof the robot andits environmentthat can impact the future. Statemay changeover time, suchas thelocationof people;or it may remainstatic throughoutthe robots operation,suchasthe locationof walls in (most)buildings. Statethat changeswill be calleddynamicstate, whichdistinguishesit from static, or non-changingstate.Thestatealsoincludesvariablesregardingtherobotitself,suchasits pose,velocity, whetheror not its sensorsare functioningcorrectly, andso on. Throughoutthis book,statewill be denotedx;althoughthespecic variablesincludedin x will dependon thecontext. Thestateattime t will bedenotedx t . Typicalstatevariablesusedthroughoutthisbookare:

  • RecursiveStateEstimation 17

    The robotpose,which comprisesits locationandorientationrelative to a globalcoordinateframe.Rigid mobilerobotspossesssix suchstatevariables,threefortheir Cartesiancoordinates,and threefor their angularorientation,also calledEuler angles(pitch, roll, andyaw). For rigid mobile robotsconned to planarenvironments,the poseis usuallygiven by threevariables,its two locationco-ordinatesin the planeand its headingdirection(yaw). The robot poseis oftenreferredto askinematicstate.

    Thecongurationof therobotsactuators,suchasthejointsof roboticmanipula-tors.Eachdegreeof freedomin arobotarmis characterizedbyaone-dimensionalcongurationatany point in time,which is partof thekinematicstateof therobot.

    The robotvelocity andthevelocitiesof its joints. A rigid robotmoving throughspaceis characterizedby up to six velocityvariables,onefor eachposevariables.Velocitiesarecommonlyreferredto asdynamicstate. Dynamicstatewill playonly aminor role in thisbook.

    The locationandfeaturesof surroundingobjectsin theenvironment.An objectmaybea tree,a wall, or a pixel within a largersurface.Featuresof suchobjectsmaybe their visualappearance(color, texture). Dependingon thegranularityofthestatethat is beingmodeled,robotenvironmentspossessbetweena few dozenandup to hundredsof billions of statevariables(andmore). Justimaginehowmany bits it will take to accuratelydescribeyourphysicalenvironment!For manyof theproblemsstudiedin this book, the locationof objectsin theenvironmentwill be static. In someproblems,objectswill assumethe form of landmarks,which aredistinct,stationaryfeaturesof theenvironmentthatcanberecognizedreliably.

    The locationandvelocitiesof moving objectsandpeople.Often,therobot is notthe only moving actor in its environment. Othermoving entitiespossesstheirown kinematicanddynamicstate.

    Therecanbe a hugenumberof otherstatevariables.For example,whetherornot a sensoris broken is a statevariable,as is the level of batterycharge for abattery-poweredrobot.

    A statex t will becalledcompleteif it is thebestpredictorof the future. Put differ-ently, completenessentailsthat knowledgeof paststates,measurements,or controlscarry no additionalinformation that would help us to predict the future moreaccu-rately. It it importantto noticethatourdenition of completenessdoesnot requirethefuture to be a deterministicfunction of the state. The future may be stochastic,butno variablesprior to x t may inuence thestochasticevolution of futurestates,unless

  • 18 Chapter 2

    this dependenceis mediatedthroughthestatex t . Temporalprocessesthatmeettheseconditionsarecommonlyknown asMarkov chains.

    The notion of statecompletenessis mostly of theoreticalimportance.In practice,itis impossibleto specifya completestatefor any realisticrobot system.A completestateincludesnot just all aspectsof theenvironmentthatmayhave an impacton thefuture,but alsotherobotitself, thecontentof its computermemory, thebraindumpsofsurroundingpeople,etc.Thosearehardto obtain.Practicalimplementationsthereforesingleout a smallsubsetof all statevariables,suchastheoneslistedabove. Suchastateis calledincompletestate.

    In mostroboticsapplications,thestateis continuous,meaningthatx t is denedoveracontinuum.A goodexampleof acontinuousstatespaceis thatof a robotpose,thatis,its locationandorientationrelative to anexternalcoordinatesystem.Sometimes,thestateis discrete.An exampleof adiscretestatespaceis the(binary)statevariablethatmodelswhetheror not a sensoris broken. Statespacesthatcontainbothcontinuousanddiscretevariablesarecalledhybridstatespaces.

    In mostcasesof interestingroboticsproblems,statechangesover time. Time,through-out this book,will bediscrete,that is, all interestingeventswill take placeat discretetimesteps

    0; 1; 2; : : : : (2.28)

    If the robotstartsits operationat a distinctpoint in time, we will denotethis time ast = 0.

    2.3.2 Environmen t In teraction

    Therearetwo fundamentaltypesof interactionsbetweena robotandits environment:The robotcaninuence thestateof its environmentthroughits actuators.And it cangatherinformationaboutthestatethroughits sensors.Both typesof interactionsmayco-occur, but for didacticreasonswe will distinguishthemthroughoutthis book.Theinteractionis illustratedin Figure2.1:

    Sensormeasurements. Perceptionis the processby which the robot usesitssensorsto obtain informationaboutthe stateof its environment. For example,a robot might take a cameraimage,a rangescan,or query its tactile sensorsto receive informationaboutthe stateof the environment. The resultof sucha

  • RecursiveStateEstimation 19

    perceptualinteractionwill becalledameasurement, althoughwewill sometimesalsocall it observationor percept. Typically, sensormeasurementsarrive withsomedelay. Hencethey provide informationaboutthestatea few momentsago.

    Control actionschangethestateof theworld. They do soby actively assertingforceson the robots environment. Examplesof control actionsinclude robotmotionandthemanipulationof objects.Even if the robotdoesnot performanyactionitself,stateusuallychanges.Thus,for consistency, wewill assumethattherobotalwaysexecutesa controlaction,even if it choosesnot to move any of itsmotors. In practice,the robotcontinuouslyexecutescontrolsandmeasurementsaremadeconcurrently.

    Hypothetically, a robotmaykeepa recordof all pastsensormeasurementsandcontrolactions.We will refer to sucha thecollectionasthedata (regardlessof whethertheyarebeingmemorized).In accordancewith the two typesof environmentinteractions,therobothasaccessto two differentdatastreams.

    Measurement data provides informationabouta momentarystateof the envi-ronment. Examplesof measurementdataincludecameraimages,rangescans,andsoon. For mostparts,we will simply ignoresmall timing effects(e.g.,mostladar sensorsscanenvironmentssequentiallyat very high speeds,but we willsimply assumethe measurementcorrespondsto a specic point in time). Themeasurementdataat time t will bedenoted

    zt (2.29)

    Throughoutmostof thisbook,wesimplyassumethattherobottakesexactlyonemeasurementata time. Thisassumptionis mostlyfor notationalconvenience,asnearlyall algorithmsin thisbookcaneasilybeextendedto robotsthatcanacquirevariablesnumbersof measurementswithin asingletimestep.Thenotation

    zt 1 :t 2 = zt 1 ; zt 1 +1 ; zt 1 +2 ; : : : ; zt 2 (2.30)

    denotesthesetof all measurementsacquiredfrom time t1 to time t2, for t1 t2.

    Control data carry informationaboutthe change of state in the environment.In mobile robotics,a typical exampleof control datais the velocity of a robot.Settingthevelocity to 10cmpersecondfor thedurationof vesecondssuggeststhat the robots pose,afterexecutingthis motioncommand,is approximately50cm aheadof its posebeforecommandexecution. Thus, its main informationregardsthechangeof state.

  • 20 Chapter 2

    An alternative sourceof controldataareodometers. Odometersaresensorsthatmeasurethe revolution of a robots wheels. As such they convey informationaboutthechangeof thestate.Even thoughodometersaresensors,we will treatodometryascontrol data,sinceits main information regardsthe changeof therobots pose.

    Controldatawill be denotedut . The variableut will alwayscorrespondto thechangeof statein thetime interval (t 1; t]. As before,wewill denotesequencesof controldataby ut 1 :t 2 , for t1 t2:

    ut 1 :t 2 = ut 1 ; ut 1 +1 ; ut 1 +2 ; : : : ; ut 2 : (2.31)

    Sincethe environmentmay changeeven if a robot doesnot executea speciccontrolaction,thefactthattimepassedby constitutes,technicallyspeaking,con-trol information. Hence,we assumethat thereis exactly onecontrol dataitempertimestept.

    The distinctionbetweenmeasurementandcontrol is a crucial one,asboth typesofdataplay fundamentallydifferentroles in the materialyet to come. Perceptionpro-videsinformationabouttheenvironments state,henceit tendsto increasetherobotsknowledge.Motion, on theotherhand,tendsto inducea lossof knowledgedueto theinherentnoisein robotactuationandthestochasticityof robotenvironments;althoughsometimesacontrolmakestherobotmorecertainaboutthestate.By nomeansis ourdistinctionintendedto suggestthatactionsandperceptionsareseparatedin time, i.e.,that the robot doesnot move while taking sensormeasurements.Rather, perceptionandcontrol takesplaceconcurrently;many sensorsaffect the environment;and theseparationis strictly for convenience.

    2.3.3 Probabilistic Generativ e Laws

    The evolution of stateandmeasurementsis governedby probabilisticlaws. In gen-eral,thestateat timex t is generatedstochastically. Thus,it makessenseto specifytheprobabilitydistribution from which x t is generated.At rst glance,theemergenceofstatex t might beconditionedon all paststates,measurements,andcontrols.Hence,theprobabilisticlaw characterizingtheevolution of statemight begivenby a proba-bility distributionof thefollowing form:

    p(x t j x0:t 1; z1:t 1; u1:t ) (2.32)

    (Notice that throughno particularmotivationwe assumeherethat the robotexecutesa controlactionu1 rst, andthentakesa measurementz1.) However, if thestatex is

  • RecursiveStateEstimation 21

    completethen it is a sufcient summaryof all that happenedin previous time steps.In particular, x t 1 is a sufcient statisticof all previous controlsandmeasurementsup to this point, that is, u1:t 1 andz1:t 1. From all the variablesin the expressionabove, only thecontrolut mattersif we know thestatex t 1. In probabilisticterms,this insight is expressedby thefollowing equality:

    p(x t j x0:t 1; z1:t 1; u1:t ) = p(x t j x t 1; ut ) (2.33)

    Thepropertyexpressedby thisequalityis anexampleof conditionalindependence. Itstatesthatcertainvariablesareindependentof othersif oneknows thevaluesof a thirdgroupof variables,theconditioningvariables.Conditionalindependencewill beex-ploitedpervasively in this book,asit is themainsourceof tractabilityof probabilisticroboticsalgorithms.

    Similarly, one might want to model the processby which measurementsare beinggenerated.Again, if x t is complete,wehave an importantconditionalindependence:

    p(zt j x0:t ; z1:t 1; u1:t ) = p(zt j x t ) (2.34)

    In otherwords,thestatex t is sufcient to predictthe(potentiallynoisy)measurementzt . Knowledgeof any othervariable,suchaspastmeasurements,controlsor evenpaststates,is irrelevant if x t is complete.

    This discussionleavesopenasto what the two resultingconditionalprobabilitiesare:p(x t j x t 1; ut ) andp(zt j x t ). Theprobabilityp(x t j x t 1; ut ) is thestatetransitionprobability. It specieshow environmentalstateevolvesover time asa function ofrobotcontrolsut . Robotenvironmentsarestochastic,which is reectedby thefactthatp(x t j x t 1; ut ) is a probabilitydistribution,not a deterministicfunction. Sometimesthestatetransitiondistribution doesnot dependon the time index t, in which casewemaywrite it asp(x0 j u; x), wherex0 is thesuccessorandx thepredecessorstate.

    The probability p(zt j x t ) is called the measurementprobability. It also may notdependon the time index t, in which caseit shall be written asp(z j x). The mea-surementprobabilityspeciestheprobabilisticlaw accordingto whichmeasurementsz aregeneratedfrom theenvironmentstatex. Measurementsareusuallynoisyprojec-tionsof thestate.

    Thestatetransitionprobabilityandthemeasurementprobabilitytogetherdescribethedynamicalstochasticsystemof the robot and its environment. Figure?? illustratesthe evolution of statesandmeasurements,dened throughthoseprobabilities. The

  • 22 Chapter 2

    stateat time t is stochasticallydependenton the stateat time t 1 and the controlut . Themeasurementzt dependsstochasticallyon thestateat time t. Sucha temporalgenerative model is alsoknown ashiddenMarkov model(HMM) or dynamicBayesnetwork (DBN). To specifythemodelfully, we alsoneedan initial statedistributionp(x0).

    2.3.4 Belief Distributions

    Anotherkey conceptin probabilisticroboticsis thatof a belief. A belief reects therobots internalknowledgeaboutthestateof theenvironment.We alreadydiscussedthat statecannotbe measureddirectly. For example,a robots posemight be x =h14:12; 12:7; 0:755i in someglobalcoordinatesystem,but it usuallycannotknow itspose,sinceposesarenotmeasurabledirectly (notevenwith GPS!).Instead,therobotmustinfer its posefrom data.We thereforedistinguishthe truestatefrom its internalbelief, or stateof knowledge with regardsto thatstate.

    Probabilisticroboticsrepresentsbeliefsthroughconditionalprobabilitydistributions.A belief distribution assignsa probability (or densityvalue)to eachpossiblehypoth-esiswith regardsto the truestate.Belief distributionsareposteriorprobabilitiesoverstatevariablesconditionedon the availabledata. We will denotebelief over a statevariablex t by bel(x t ), which is anabbreviation for theposterior

    bel(x t ) = p(x t j z1:t ; u1:t ) : (2.35)

    Thisposterioris theprobabilitydistributionover thestatex t at time t, conditionedonall pastmeasurementsz1:t andall pastcontrolsu1:t .

    The readermay noticethat we silently assumethat thebelief is taken after incorpo-rating themeasurementzt . Occasionally, it will prove usefulto calculatea posteriorbefore incorporatingzt , just after executingthe control ut . Sucha posteriorwill bedenotedasfollows:

    bel(x t ) = p(x t j z1:t 1; u1:t ) (2.36)

    This probabilitydistribution is often referredto asprediction in thecontext of prob-abilistic ltering. This terminologyreects the fact that bel(x t ) predictsthe stateattime t basedon the previous stateposterior, before incorporatingthe measurementat time t. Calculatingbel(x t ) from bel(x t ) is calledcorrectionor the measurementupdate.

  • RecursiveStateEstimation 23

    2.4 BA YES FIL TERS

    2.4.1 The Bayes Filter Algorithm

    The mostgeneralalgorithm for calculatingbeliefs is given by the Bayeslter algo-rithm. This algorithm calculatesthe belief distribution bel from measurementandcontroldata.We will rst statethebasicalgorithmandelucidateit with a numericalexample.After that,we will derive it mathematicallyfrom theassumptionsmadesofar.

    Table2.1depictsthebasicBayeslter in pseudo-algorithmicform. TheBayeslter isrecursive, that is, thebeliefbel(x t ) at time t is calculatedfrom thebeliefbel(x t 1) attime t 1. Its input is thebeliefbelat time t 1, alongwith themostrecentcontrolutandthemostrecentmeasurementzt . Its outputis thebeliefbel(x t ) at time t. Table2.1only depictsa singlestepof theBayesFilter algorithm: theupdaterule. This updaterule is appliedrecursively, to calculatethe belief bel(x t ) from the belief bel(x t 1),calculatedpreviously.

    The Bayeslter algorithmpossessestwo essentialsteps. In Line 3, it processesthecontrol ut . It doesso by calculatinga belief over the statex t basedon the priorbeliefoverstatex t 1 andthecontrolut . In particular, thebeliefbel(x t ) thattherobotassignsto statex t is obtainedby theintegral (sum)of theproductof two distributions:theprior assignedto x t 1, andtheprobabilitythatcontrolut inducesa transitionfromx t 1 to x t . The readermay recognizethe similarity of this updatestepto Equation(2.12).As notedabove, thisupdatestepis calledthecontrolupdate,or prediction.

    Thesecondstepof theBayeslter is calledthemeasurementupdate.In Line 4, theBayeslter algorithmmultipliesthebeliefbel(x t ) by theprobabilitythatthemeasure-mentzt mayhave beenobserved. It doesso for eachhypotheticalposteriorstatex t .As will becomeapparentfurther below whenactuallyderiving the basiclter equa-tions, the resultingproductis generallynot a probability, that is, it maynot integrateto 1. Hence,the resultis normalized,by virtue of thenormalizationconstant . Thisleadsto the nal beliefbel(x t ), which is returnedin Line 6 of thealgorithm.

    To computethe posteriorbelief recursively, the algorithm requiresan initial beliefbel(x0) at time t = 0 as boundarycondition. If one knows the value of x0 withcertainty, bel(x0) shouldbe initialized with a point massdistribution thatcentersallprobability masson the correctvalueof x0, andassignszeroprobability anywhereelse. If one is entirely ignorantaboutthe initial valuex0, bel(x0) maybe initializedusinga uniform distribution over the domainof x0 (or relateddistribution from theDirichlet family of distributions). Partial knowledgeof the initial value x0 can be

  • 24 Chapter 2

    1: Algorithm Bayes lter( bel(x t 1); ut ; zt ):2: for all x t do3: bel(x t ) =

    Rp(x t j ut ; x t 1) bel(x t 1) dx

    4: bel(x t ) = p(zt j x t ) bel(x t )5: endfor6: returnbel(x t )

    Table2.1 Thegeneralalgorithmfor Bayesfiltering.

    expressedby non-uniformdistributions;however, thetwo casesof full knowledgeandfull ignorancearethemostcommononesin practice.

    ThealgorithmBayeslter canonly be implementedin the form statedherefor verysimpleestimationproblems.In particular, we eitherneedto beableto carryout theintegration in Line 3 and the multiplication in Line 4 in closedform, or we needtorestrictourselvesto nite statespaces,sothatthe integral in Line 3 becomesa (nite)sum.

    2.4.2 Example

    Our illustration of the Bayeslter algorithm is basedon the scenarioin Figure2.2,which shows a robot estimatingthe stateof a door using its camera.To make thisproblemsimple,let usassumethatthedoorcanbe in oneof two possiblestates,openor closed,andthatonly therobotcanchangethestateof thedoor. Let us furthermoreassumethat the robotdoesnot know thestateof thedoor initially. Instead,it assignsequalprior probabilityto thetwo possibledoorstates:

    bel(X 0 = open) = 0:5 (2.37)

    bel(X 0 = closed ) = 0:5 (2.38)

    Let usfurthermoreassumetherobotssensorsarenoisy. Thenoiseis characterizedbythefollowing conditionalprobabilities:

    p(Z t = sense open j X t = is open) = 0:6

    p(Z t = sense closed j X t = is open) = 0:4 (2.39)

  • RecursiveStateEstimation 25

    Figure2.2 A mobilerobotestimatingthestateof adoor.

    and

    p(Z t = sense open j X t = is closed ) = 0:2

    p(Z t = sense closed j X t = is closed ) = 0:8 (2.40)

    Theseprobabilitiessuggestthat the robots sensorsarerelatively reliablein detectinga closeddoor, in that theerrorprobability is 0.2. However, whenthedoor is open,ithasa0:4 probabilityof a falsemeasurement.

    Finally, let usassumetherobotusesits manipulatorto pushthedooropen.If thedooris alreadyopen,it will remainopen. It it is closed,the robothasa 0:8 chancethat itwill beopenafterwards:

    p(X t = is open j Ut = push ; X t 1 = is open) = 1

    p(X t = is closed j Ut = push ; X t 1 = is open) = 0 (2.41)

    p(X t = is open j Ut = push ; X t 1 = is closed ) = 0:8

    p(X t = is closed j Ut = push ; X t 1 = is closed ) = 0:2 (2.42)

    It canalsochoosenot to useits manipulator, in whichcasethestateof theworld doesnotchange.This is statedby thefollowing conditionalprobabilities:

    p(X t = is open j Ut = do nothing ; X t 1 = is open) = 1

    p(X t = is closed j Ut = do nothing ; X t 1 = is open) = 0 (2.43)

    p(X t = is open j Ut = do nothing ; X t 1 = is closed ) = 0

    p(X t = is closed j Ut = do nothing ; X t 1 = is closed ) = 1 (2.44)

  • 26 Chapter 2

    Supposeat time t, therobottakesnocontrolactionbut it sensesanopendoor. There-sultingposteriorbelief is calculatedby theBayeslter usingtheprior beliefbel(X 0),the control u1 = do nothing , and the measurementsense open as input. Sincethestatespaceis nite, the integral in Line 3 turnsinto a nite sum:

    bel(x1) =Z

    p(x1 j u1; x0) bel(x0) dx0

    =X

    x 0

    p(x1 j u1; x0) bel(x0)

    = p(x1 j U1 = do nothing ; X 0 = is open) bel(X 0 = is open)

    + p(x1 j U1 = do nothing ; X 0 = is closed ) bel(X 0 = is closed )

    (2.45)

    We can now substitutethe two possiblevaluesfor the statevariableX 1. For thehypothesisX 1 = is open, weobtain

    bel(X 1 = is open)

    = p(X 1 = is open j U1 = do nothing ; X 0 = is open) bel(X 0 = is open)

    + p(X 1 = is open j U1 = do nothing ; X 0 = is closed ) bel(X 0 = is closed )

    = 1 0:5 + 0 0:5 = 0:5 (2.46)

    Likewise,for X 1 = is closed weget

    bel(X 1 = is closed )

    = p(X 1 = is closed j U1 = do nothing ; X 0 = is open) bel(X 0 = is open)

    + p(X 1 = is closed j U1 = do nothing ; X 0 = is closed ) bel(X 0 = is closed )

    = 0 0:5 + 1 0:5 = 0:5 (2.47)

    The fact that thebelief bel(x1) equalsour prior belief bel(x0) shouldnot surprise,astheactiondo nothing doesnot affect thestateof theworld; neitherdoestheworldchangeover timeby itself in ourexample.

    Incorporatingthemeasurement,however, changesthebelief. Line 4 of theBayeslteralgorithmimplies

    bel(x1) = p(Z1 = sense open j x1) bel(x1) : (2.48)

  • RecursiveStateEstimation 27

    For thetwo possiblecases,X 1 = is open andX 1 = is closed , weget

    bel(X 1 = is open)

    = p(Z1 = sense open j X 1 = is open) bel(X 1 = is open)

    = 0:6 0:5 = 0:3 (2.49)

    and

    bel(X 1 = is closed )

    = p(Z1 = sense open j X 1 = is closed ) bel(X 1 = is closed )

    = 0:2 0:5 = 0:1 (2.50)

    Thenormalizer is now easilycalculated:

    = (0:3 + 0:1) 1 = 2:5 (2.51)

    Hence,wehave

    bel(X 1 = is open) = 0:75

    bel(X 1 = is closed ) = 0:25 (2.52)

    This calculationis now easily iteratedfor the next time step. As the readereasilyveries, for u2 = push andz2 = sense open weget

    bel(X 2 = is open) = 1 0:75+ 0:8 0:25 = 0:95

    bel(X 2 = is closed ) = 0 0:75+ 0:2 0:25 = 0:05 ; (2.53)

    and

    bel(X 2 = is open) = 0:6 0:95 0:983

    bel(X 2 = is closed ) = 0:2 0:05 0:017: (2.54)

    At this point, the robot believes that with 0:983 probability the door is open,henceboth its measurementswerecorrect.At rst glance,this probabilitymayappearto be

  • 28 Chapter 2

    sufciently high to simply acceptthis hypothesisas the world stateandact accord-ingly. However, suchanapproachmayresultin unnecessarilyhighcosts.If mistakinga closeddoor for anopenone incurscosts(e.g.,the robotcrashesinto a door),con-sideringbothhypothesesin thedecisionmakingprocesswill beessential,asunlikelyasoneof themmaybe. Justimagineying anaircrafton autopilot with a perceivedchanceof 0:983for notcrashing!

    2.4.3 Mathematical Deriv ation of the BayesFilter

    Thecorrectnessof theBayeslter algorithmis shown by induction.To doso,weneedto show that it correctlycalculatestheposteriordistributionp(x t j z1:t ; u1:t ) from thecorrespondingposterioronetime stepearlier, p(x t 1 j z1:t 1; u1:t 1). Thecorrect-nessfollows thenby inductionundertheassumptionthatwe correctlyinitialized theprior beliefbel(x0) at time t = 0.

    Our derivation requiresthat thestatex t is complete,asdened in Section2.3.1,andit requiresthatcontrolsarechosenat random.Therst stepof ourderivation involvestheapplicationof Bayesrule (2.23)to thetargetposterior:

    p(x t j z1:t ; u1:t ) =p(zt j x t ; z1:t 1; u1:t ) p(x t j z1:t 1; u1:t )

    p(zt j z1:t 1; u1:t )= p(zt j x t ; z1:t 1; u1:t ) p(x t j z1:t 1; u1:t ) (2.55)

    Wenow exploit theassumptionthatourstateis complete.In Section2.3.1,wedeneda statex t to becompleteif no variablesprior to x t may inuence thestochasticevo-lution of futurestates.In particular, if we (hypothetically)knew thestatex t andwereinterestedin predictingthe measurementzt , no pastmeasurementor control wouldprovide us additional information. In mathematicalterms, this is expressedby thefollowing conditionalindependence:

    p(zt j x t ; z1:t 1; u1:t ) = p(zt j x t ) : (2.56)

    Sucha statementis anotherexampleof conditional independence. It allows us tosimplify (2.55)asfollows:

    p(x t j z1:t ; u1:t ) = p(zt j x t ) p(x t j z1:t 1; u1:t ) (2.57)

  • RecursiveStateEstimation 29

    andhence

    bel(x t ) = p(zt j x t ) bel(x t ) (2.58)

    Thisequationis implementedin Line 4 of theBayeslter algorithmin Table2.1.

    Next, weexpandthetermbel(x t ), using(2.12):

    bel(x t ) = p(x t j z1:t 1; u1:t )

    =Z

    p(x t j x t 1; z1:t 1; u1:t ) p(x t 1 j z1:t 1; u1:t ) dxt 1 (2.59)

    Onceagain, we exploit theassumptionthatour stateis complete.This implies if weknow x t 1, pastmeasurementsandcontrolsconvey no informationregardingthestatex t . Thisgivesus

    p(x t j x t 1; z1:t 1; u1:t ) = p(x t j x t 1; ut ) (2.60)

    Herewe retainthecontrolvariableut , sinceit doesnotpredatethestatex t 1. Finally,wenotethatthecontrolut cansafelybeomittedfrom thesetof conditioningvariablesin p(x t 1 j z1:t 1; u1:t ) for randomlychosencontrols. This givesus the recursiveupdateequation

    bel(x t ) =Z

    p(x t j x t 1; ut ) p(x t 1 j z1:t 1; u1:t 1) dxt 1 (2.61)

    As the readereasily veries, this equationis implementedby Line 3 of the Bayeslter algorithmin Table2.1. To summarize,theBayeslter algorithmcalculatestheposteriorover thestatex t conditionedon themeasurementandcontroldataup to timet. Thederivationassumesthattheworld is Markov, that is, thestateis complete.

    Any concreteimplementationof thisalgorithmrequiresthreeprobabilitydistributions:Theinitial beliefp(x0), themeasurementprobabilityp(zt j x t ), andthestatetransitionprobabilityp(x t j ut ; x t 1). We have not yet specied thesedensities,but will do soin laterchapters(Chapters5 and??). Additionally, we alsoneeda representationforthebeliefbel(x t ), whichwill alsobediscussedfurtherbelow.

  • 30 Chapter 2

    2.4.4 The Mark ov Assumption

    A word is in orderon theMarkov assumption,or thecompletestateassumption,sinceit playssucha fundamentalrole in thematerialpresentedin this book. TheMarkovassumptionpostulatesthatpastandfuturedataareindependentif oneknows thecur-rent statex t . To seehow severean assumptionthis is, let us considerour exampleof mobile robot localization. In mobile robot localization,x t is the robots pose,andBayeslters areappliedto estimatethe poserelative to a x ed map. The followingfactorsmayhave a systematiceffect on sensorreadings.Thus,they induceviolationsof theMarkov assumption:

    Unmodeleddynamicsin theenvironmentnot includedin x t (e.g.,moving peopleandtheireffectsonsensormeasurementsin our localizationexample),

    inaccuraciesin theprobabilisticmodelsp(zt j x t ) andp(x t j ut ; x t 1),

    approximationerrorswhenusingapproximaterepresentationsof belief functions(e.g.,gridsor Gaussians,whichwill bediscussedbelow), and

    softwarevariablesin therobotcontrolsoftwarethatinuencemultiplecontrolse-lection(e.g.,thevariabletargetlocation typically inuencesanentiresequenceof controlcommands).

    In principle,many of thesevariablescanbe includedin staterepresentations.How-ever, incompletestaterepresentationsareoften preferableto morecompleteonestoreducethecomputationalcomplexity of theBayeslter algorithm. In practiceBayeslters have beenfound to besurprisinglyrobust to suchviolations.As a generalruleof thumb,onehasto exercisecarewhendening the statex t , so that the effect ofunmodeledstatevariableshasclose-to-randomeffects.

    2.5 REPRESENT ATION ANDCOMPUT ATION

    In probabilisticrobotics,Bayeslters areimplementedin severaldifferentways. Aswe will seein the next two chapters,thereexist quite a variety of techniquesandalgorithmsthatareall derivedfrom theBayeslter . Eachsuchtechniquereliesondif-ferentassumptionsregarding the measurementandstatetransitionprobabilitiesandthe initial belief. Thoseassumptionsthengive rise to differenttypesof posteriordis-tributions,andthealgorithmsfor computing(or approximating)thosehave different

  • RecursiveStateEstimation 31

    computationalcharacteristics.As a generalrule of thumb,exact techniquesfor cal-culatingbeliefsexist only for highly specializedcases;in generalroboticsproblems,beliefshave to beapproximated.Thenatureof theapproximationhasimportantram-ications on the complexity of the algorithm. Finding a suitableapproximationisusuallyachallengingproblem,with nouniquebestanswerfor all roboticsproblems.

    Whenchoosinganapproximation,onehasto tradeoff a rangeof properties:

    1. Computational efciency. Someapproximations,suchas linear Gaussianap-proximationsthat will be discussedfurther below, make it possibleto calculatebeliefs in time polynomial in the dimensionof the statespace.Othersmay re-quireexponentialtime. Particlebasedtechniques,discussedfurtherbelow, havean any-timecharacteristic,enablingthem to tradeoff accuracy with computa-tionalefciency.

    2. Accuracy of the approximation. Someapproximationscan approximateawider rangeof distributionsmoretightly thanothers.For example,linearGaus-sian approximationsare limited to unimodaldistributions, whereashistogramrepresentationscan approximatemulti-modal distributions, albeit with limitedaccuracy. Particlerepresentationscanapproximatea wide arrayof distributions,but thenumberof particlesneededto attainadesiredaccuracy canbe large.

    3. Easeof implementation. The difculty of implementingprobabilisticalgo-rithms dependson a variety of factors,suchas the form of the measurementprobabilityp(zt j x t ) andthestatetransitionprobabilityp(x t j ut ; x t 1). Parti-cle representationsoftenyield surprisinglysimpleimplementationsfor complexnonlinearsystemsone of thereasonsfor their recentpopularity.

    Thenext two chapterswill introduceconcreteimplementablealgorithms,which farequitedifferentlyrelative to thecriteriadescribedabove.

    2.6 SUMMAR Y

    In this section,we introducedthebasicideaof Bayeslters in robotics,asa meanstoestimatethestateof anenvironment(whichmay includethestateof therobot itself).

    Theinteractionof a robotandits environmentis modeledasacoupleddynamicalsystem,in which therobotcanmanipulateits environmentby choosingcontrols,andin which it canperceive its environmentthroughsensormeasurements.

  • 32 Chapter 2

    In probabilisticrobotics,thedynamicsof therobotandits environmentarechar-acterizedin the form of two probabilisticlaws: the statetransitiondistribution,andthemeasurementdistribution. Thestatetransitiondistribution characterizeshow statechangesover time, possibleastheeffect of a robotcontrol. Themea-surementdistribution characterizeshow measurementsare governedby states.Both laws areprobabilistic,accountingfor the inherentuncertaintyin stateevo-lution andsensing.

    The belief of a robot is the posteriordistribution over the stateof the environ-ment(including therobotstate),givenall pastsensormeasurementsandall pastcontrols. TheBayeslter is theprincipalalgorithmfor calculatingthebelief inrobotics.TheBayeslter is recursive; thebelief at time t is calculatedfrom thebeliefat time t 1.

    The Bayeslter makes a Markov assumptionthat species that the stateis acompletesummaryof the past. This assumptionimplies the belief is sufcientto representthe pasthistory of the robot. In robotics,the Markov assumptionis usually only an approximation. We identied conditionsunderwhich it isviolated.

    Since the Bayeslter is not a practicalalgorithm, in that it cannotbe imple-mentedon a digital computer, probabilisticalgorithmsusetractableapproxima-tions. Suchapproximationsmaybeevaluatedaccordingto differentcriteria,re-lating to theiraccuracy, efciency, andeaseof implementation.

    Thenext two chaptersdiscusstwo popularfamiliesof recursive stateestimationtech-niquesthatarebothderivedfrom theBayeslter .

    2.7 BIBLIOGRAPHICAL REMARKS

  • 3GA USSIAN FIL TERS

    3.1 INTR ODUCTION

    This chapterdescribesan importantfamily of recursive stateestimators,collectivelycalledGaussianFilters. Historically, Gaussianlters constitutetheearliesttractableimplementationsof the Bayeslter for continuousspaces.They arealsoby far themostpopularfamily of techniquesto datedespite anumberof shortcomings.

    Gaussiantechniquesall sharethebasicideathatbeliefsarerepresentedby multivariatenormaldistributions. We alreadyencountereda denition of themultivariatenormaldistribution in Equation(2.4),which is restatedhere:

    p(x) = det (2 ) 12 exp

    12 (x )

    T 1(x )

    (3.1)

    Thisdensityoverthevariablex ischaracterizedby twosetsof parameters:Themeanandthecovariance . Themean is a vectorthatpossessesthesamedimensionalityas the statex. The covarianceis a quadraticmatrix that is symmetricandpositive-semidenite. Its dimensionis the dimensionalityof the statex squared.Thus, thenumberof elementsin thecovariancematrix dependsquadraticallyon thenumberofelementsin thestatevector.

    Thecommitmentto representtheposteriorby aGaussianhasimportantramications.Most importantly, Gaussiansare unimodal, that is, they possesa single maximum.Sucha posterioris characteristicof many trackingproblemsin robotics,in which theposterioris focusedaroundthe true statewith a small margin of uncertainty. Gaus-sianposteriorsarea poormatchfor many globalestimationproblemsin which manydistincthypothesesexist, eachof which forming its own modein theposterior.

    33

  • 34 Chapter 3

    The representationof a Gaussianby its meanandcovarianceis called the momentsrepresentation.This is becausethemeanandcovariancearethe rst andsecondmo-mentsof aprobabilitydistribution;all othermomentsarezerofor normaldistributions.In thischapter, wewill alsodiscussanalternative representation,calledcanonicalrep-resentation, or sometimesnatural representation. Both representations,themomentsandthecanonicalrepresentations,arefunctionallyequivalentin thata bijective map-ping exists that transformsoneinto theother(andback).However, they leadto lteralgorithmswith orthogonalcomputationalcharacteristics.

    Thischapterintroducesthetwo basicGaussianlter algorithms.

    Section3.2describestheKalmanlter , which implementstheBayeslter usingthemomentsrepresentationfor a restrictedclassof problemswith lineardynam-icsandmeasurementfunctions.

    The Kalman lter is extendedto nonlinearproblemsin Section3.3, which de-scribestheextendedKalmanlter .

    Section3.4describestheinformationlter , which is thedualof theKalmanlterusingthecanonicalrepresentationof Gaussians.

    3.2 THE KALMAN FIL TER

    3.2.1 Linear Gaussian Systems

    Probablythebeststudiedtechniquefor implementingBayeslters is theKalmanlter(KF). The Kalman lter was inventedin the 1950sby RudolphEmil Kalman,asatechniquefor ltering andpredictionin linearsystems.TheKalmanlter implementsbeliefcomputationfor continuousstates.It is notapplicableto discreteor hybrid statespaces.

    The Kalman lter representsbeliefsby the momentsrepresentation:At time t, thebelief is representedby thethemean t andthecovariance t . PosteriorsareGaussianif the following threepropertieshold, in addition to the Markov assumptionsof theBayeslter .

    1. Thenext stateprobabilityp(x t j ut ; x t 1) mustbea linear function in its argu-mentswith addedGaussiannoise.This is expressedby thefollowing equation:

    x t = A t x t 1 + B t ut + " t : (3.2)

  • GaussianFilters 35

    Herex t andx t 1 arestatevectors,andut is thecontrolvectorat time t. In ournotation,bothof thesevectorsareverticalvectors,that is, they areof theform

    x t =

    0

    BBB@

    x1;tx2;t

    ...xn;t

    1

    CCCA

    and ut =

    0

    BBB@

    u1;tu2;t

    ...um;t

    1

    CCCA

    : (3.3)

    A t andB t arematrices. A t is a squarematrix of sizen n, wheren is thedimensionof thestatevectorx t . B t is of sizen m, with m beingthedimensionof the control vectorut . By multiplying the stateandcontrol vectorwith thematricesA t andB t , respectively, thestatetransitionfunctionbecomeslinear inits arguments.Thus,Kalmanlters assumelinearsystemdynamics.

    Therandomvariable" t in (3.2) is aGaussianrandomvectorthatmodelstheran-domnessin thestatetransition.It is of thesamedimensionasthestatevector. Itsmeanis zeroandits covariancewill bedenotedRt . A statetransitionprobabilityof the form (3.2) is calleda linear Gaussian, to reect the fact that it is linear inits argumentswith additive Gaussiannoise.

    Equation(3.2)denesthestatetransitionprobabilityp(x t j ut ; x t 1). Thisprob-ability is obtainedby pluggingEquation(3.2) into the denition of the multi-variatenormal distribution (3.1). The meanof the posteriorstateis given byA t x t 1 + B t ut andthecovarianceby Rt :

    p(x t j ut ; x t 1) (3.4)

    = det (2 Rt ) 12 exp

    12 (x t A t x t 1 B t ut )

    T R 1t (x t A t x t 1 B t ut )

    2. Themeasurementprobabilityp(zt j x t ) mustalsobelinear in its arguments,withaddedGaussiannoise:

    zt = Ct x t + t : (3.5)

    HereCt is a matrix of sizek n, wherek is thedimensionof themeasurementvectorzt . Thevector t describesthemeasurementnoise.Thedistribution of tis a multivariateGaussianwith zeromeanandcovarianceQt . Themeasurementprobability is thusgivenby thefollowing multivariatenormaldistribution:

    p(zt j x t ) = det (2 Qt ) 12 exp

    12 (zt Ct x t )

    T Q 1t (zt Ct x t )

    (3.6)

  • 36 Chapter 3

    1: Algorithm Kalman lter( t 1; t 1; ut ; zt ):2: t = A t t 1 + B t ut3: t = A t t 1 ATt + Rt4: K t = t CTt (Ct t C

    Tt + Qt )

    1

    5: t = t + K t (zt Ct t )6: t = (I K t Ct ) t7: return t ; t

    Table 3.1 TheKalmanfilter algorithmfor linearGaussianstatetransitionsandmeasure-ments.

    3. Finally, the initial belief bel(x0) mustbenormaldistributed.We will denotethemeanof thisbeliefby 0 andthecovarianceby 0:

    bel(x0) = p(x0) = det (2 0) 12 exp

    12 (x0 0)

    T 10 (x0 0)

    Thesethreeassumptionsaresufcient to ensurethat the posteriorbel(x t ) is alwaysa Gaussian,for any point in time t. Theproof of this non-trivial resultcanbe foundbelow, in themathematicalderivationof theKalmanlter (Section3.2.4).

    3.2.2 The Kalman Filter Algorithm

    The Kalman lter algorithm is depictedin Table 3.1. Kalman lters representthebeliefbel(x t ) at time t by themean t andthecovariance t . Theinputof theKalmanlter is the belief at time t 1, representedby t 1 and t 1. To updatetheseparameters,Kalmanlters requirethecontrolut andthemeasurementzt . Theoutputis thebeliefat time t, representedby t and t .

    In Lines 2 and3, the predictedbelief and is calculatedrepresentingthe beliefbel(x t ) onetimesteplater, but beforeincorporatingthemeasurementzt . Thisbelief isobtainedby incorporatingthecontrolut . Themeanis updatedusingthedeterministicversionof the statetransitionfunction (3.2), with the mean t 1 substitutedfor thestatex t 1. The updateof the covarianceconsidersthe fact that statesdependonpreviousstatesthroughthe linearmatrix A t . This matrix is multiplied twice into thecovariance,sincethecovarianceis aquadraticmatrix.

  • GaussianFilters 37

    Thebeliefbel(x t ) is subsequentlytransformedinto thedesiredbeliefbel(x t ) in Lines4 through6, by incorporatingthe measurementzt . The variableK t , computedinLine 4 is calledKalmangain. It species the degreeto which the measurementisincorporatedinto thenew stateestimate.Line 5 manipulatesthemean,by adjustingit in proportionto theKalmangain K t andthedeviation of theactualmeasurement,zt , and the measurementpredictedaccordingto the measurementprobability (3.5).Finally, thenew covarianceof theposteriorbelief is calculatedin Line 6, adjustingforthe informationgain resultingfrom themeasurement.

    TheKalmanlter is computationallyquiteefcient. For todays bestalgorithms,thecomplexity of matrix inversionis approximatelyO(d2:8) for a matrix of sized d.Each iteration of the Kalman lter algorithm, as statedhere, is lower boundedby(approximately)O(k2:8), wherek is thedimensionof themeasurementvectorzt . This(approximate)cubiccomplexity stemsfrom thematrix inversionin Line 4. It is alsoatleastin O(n2), wheren is thedimensionof thestatespace,dueto themultiplicationin Line 6 (thematrix K t Ct maybesparse).In many applicationssuch asthe robotmappingapplicationsdiscussedin later chapters-the measurementspaceis muchlower dimensionalthan the statespace,and the updateis dominatedby the O(n2)operations.

    3.2.3 Illustration

    Figure3.2 illustratestheKalmanlter algorithmfor a simplisticone-dimensionallo-calizationscenario.Supposetherobotmovesalongthehorizontalaxisin eachdiagramin Figure3.2.Let theprior over therobot locationbegivenby thenormaldistributionshown in Figure3.2a. The robotqueriesits sensorson its location(e.g.,a GPSsys-tem),andthosereturnameasurementthatis centeredat thepeakof theboldGaussianin Figure3.2b. This bold Gaussianillustratesthis measurement:Its peakis thevaluepredictedby thesensors,andits width (variance)correspondsto theuncertaintyin themeasurement.Combiningthe prior with the measurement,via Lines 4 through6 oftheKalmanlter algorithmin Table3.1,yieldstheboldGaussianin Figure3.2c.Thisbeliefsmeanliesbetweenthetwo originalmeans,andits uncertaintyradiusis smallerthanbothcontributingGaussians.Thefactthattheresidualuncertaintyis smallerthanthecontributing Gaussiansmayappearcounter-intuitive,but it is a generalcharacter-istic of informationintegrationin Kalmanlters.

    Next, assumetherobotmovestowardstheright. Its uncertaintygrows dueto the factthat thenext statetransitionis stochastic.Lines2 and3 of theKalmanlter providesus with the Gaussianshown in bold in Figure3.2d. This Gaussianis shiftedby theamounttherobotmoved,andit is alsowider for thereasonsjust explained.Next, the

  • 38 Chapter 3

    0

    0.05

    0.1

    0.15

    0.2

    0.25

    0 5 10 15 20 25 30 (a)0

    0.05

    0.1

    0.15

    0.2

    0.25

    0 5 10 15 20 25 30 (b)

    0

    0.05

    0.1

    0.15

    0.2

    0.25

    0 5 10 15 20 25 30 (c)0

    0.05

    0.1

    0.15

    0.2

    0.25

    0 5 10 15 20 25 30 (d)

    0

    0.05

    0.1

    0.15

    0.2

    0.25

    0 5 10 15 20 25 30 (e)0

    0.05

    0.1

    0.15

    0.2

    0.25

    0 5 10 15 20 25 30 (f)

    Figure 3.2 Illustration of Kalmanfilters: (a) initial belief, (b) a measurement(in bold)with theassociateduncertainty, (c) belief after integratingthemeasurementinto thebeliefusing the Kalman filter algorithm, (d) belief after motion to the right (which introducesuncertainty),(e)anew measurementwith associateduncertainty, and(f) theresultingbelief.

  • GaussianFilters 39

    robot receivesa secondmeasurementillustratedby thebold Gaussianin Figure3.2e,which leadsto theposteriorshown in bold in Figure3.2f.

    As this example illustrates,the Kalman lter alternatesa measurementupdatestep(Lines5-7), in whichsensordatais integratedinto thepresentbelief,with apredictionstep(or control updatestep),which modies the belief in accordanceto an action.Theupdatestepdecreasesandthepredictionstepincreasesuncertaintyin the robotsbelief.

    3.2.4 Mathematical Deriv ation of the KF

    This sectionderivestheKalmanlter algorithmin Table3.1. Thesectioncansafelybeskippedat rst reading.

    Part 1: Prediction. Our derivation begins with Lines 2 and3 of the algorithm, inwhich thebelief bel(x t ) is calculatedfrom thebelief onetime stepearlier, bel(x t 1).Lines2 and3 implementthe updatestepdescribedin Equation(2.61),restatedherefor thereaders convenience:

    bel(x t ) =Z

    p(x t j x t 1; ut )| {z }N (x t ;A t x t 1 + B t u t ;R t )

    bel(x t 1)| {z }N (x t 1 ; t 1 ; t 1 )

    dxt 1 (3.7)

    Theprior beliefbel(x t 1) is representedby themean t 1 andthecovariance t 1.Thestatetransitionprobabilityp(x t j x t 1; ut ) wasgiven in (3.4)asa normaldistri-bution over x t with meanA t x t 1 + B t ut andcovarianceRt . As we shallshow now,theoutcomeof (3.7) is again a Gaussianwith mean t andcovariance t asstatedinTable3.1.

    Webegin by writing (3.7) in its Gaussianform:

    bel(x t ) = Z

    exp

    12 (x t A t x t 1 B t ut )T R 1t (x t A t x t 1 B t ut )

    exp

    12 (x t 1 t 1)T 1t 1(x t 1 t 1)

    dxt 1 : (3.8)

    In short,wehave

    bel(x t ) = Z

    expf L t g dxt 1 (3.9)

  • 40 Chapter 3

    with

    L t = 12 (x t A t x t 1 B t ut )T R 1t (x t A t x t 1 B t ut )

    + 12 (x t 1 t 1)T 1t 1 (x t 1 t 1): (3.10)

    NoticethatL t is quadraticin x t 1; it is alsoquadraticin x t .

    Expression(3.9) containsan integral. To solve this integral in closedform, we willnow decomposeL t into two functions,L t (x t 1; x t ) andL t (x t ):

    L t = L t (x t 1; x t ) + L t (x t ) (3.11)

    so that all termscontainingx t 1 arecollectedin L t (x t 1; x t ). This decompositionwill allow us to move L t (x t ) outsidethe integration,sinceits valuedoesnot dependon the integrationvariablex t 1:

    bel(x t ) = Z

    expf L t g dxt 1

    = Z

    expf L t (x t 1; x t ) L t (x t )g dxt 1

    = expf L t (x t )gZ

    expf L t (x t 1; x t )g dxt 1 (3.12)

    Furthermore,we will chooseL t (x t 1; x t ) suchthat thevalueof the integral in (3.12)doesnotdependonx t . Thus,theintegralwill simplybecomeaconstantrelative to theproblemof estimatingthebelief distribution over x t . The resultingdistribution overx t will beentirelydened throughL t (x t ).

    bel(x t ) = expf L t (x t )g (3.13)

    Let us now perform this decomposition. We are seekinga function L t (x t 1; x t )quadraticin x t 1. (This functionwill alsodependonx t , but thatshallnotconcernusat thispoint.) To determinethecoefcients of thisquadratic,wecalculatetherst twoderivativesof L t :

    @L t@x t 1

    = ATt R 1t (x t A t x t 1 B t ut ) +

    1t 1 (x t 1 t 1) (3.14)

  • GaussianFilters 41

    @2L t@x2t 1

    = ATt R 1t A t +

    1t 1 =:

    1t (3.15)

    t denes thecurvatureof L t (x t 1; x t ). Settingthe rst derivative of L t to 0 givesusthemean:

    ATt R 1t (x t A t x t 1 B t ut ) =

    1t 1 (x t 1 t 1) (3.16)

    Thisexpressionis now solvedfor x t 1

    ( ) ATt R 1t (x t B t ut ) A

    Tt R

    1t A t x t 1 =

    1t 1 x t 1

    1t 1 t 1

    ( ) ATt R 1t A t x t 1 +

    1t 1 x t 1 = A

    Tt R

    1t (x t B t ut ) +

    1t 1 t 1

    ( ) (ATt R 1t A t +

    1t 1) x t 1 = A

    Tt R

    1t (x t B t ut ) +

    1t 1 t 1

    ( ) 1t x t 1 = ATt R

    1t (x t B t ut ) +

    1t 1 t 1

    ( ) x t 1 = t [ATt R 1t (x t B t ut ) +

    1t 1 t 1] (3.17)

    Thus,wenow have aquadraticfunctionL t (x t 1; x t ), dened asfollows:

    L t (x t 1; x t ) = 12 (x t 1 t [ATt R

    1t (x t B t ut ) +

    1t 1 t 1])

    T 1

    (x t 1 t [ATt R 1t (x t B t ut ) +

    1t 1 t 1]) (3.18)

    Clearly, this is not theonly quadraticfunctionsatisfyingour decompositionin (3.11).However, L t (x t 1; x t ) is of thecommonquadraticform of thenegativeexponentof anormaldistribution. In factthefunction

    det(2 ) 12 expf L t (x t 1; x t )g (3.19)

    is avalid probabilitydensityfunction(PDF)for thevariablex t 1. As thereadereasilyveries, this function is of the form dened in (3.1). We know from (2.5) thatPDFsintegrateto 1. Thus,wehave

    Zdet(2 )

    12 expf L t (x t 1; x t )g dxt 1 = 1 (3.20)

  • 42 Chapter 3

    Fromthis it follows that

    Zexpf L t (x t 1; x t )g dxt 1 = det(2 )

    12 : (3.21)

    Theimportantthing to noticeis thatthevalueof this integral is independentof x t , ourtargetvariable.Thus,for our problemof calculatinga distribution over x t , this inte-gral is constant.Subsumingthis constantinto thenormalizer , we get the followingexpressionfor Equation(3.12):

    bel(x t ) = expf L t (x t )gZ

    expf L t (x t 1; x t )g dxt 1

    = expf L t (x t )g (3.22)

    Notice that the normalizers left andright of the equalsign arenot the same.Thisdecompositionestablishesthecorrectnessof (3.13).

    It remainsto determinethe functionL t (x t ), which is thedifferenceof L t , dened in(3.10),andL t (x t 1; x t ), dened in (3.18):

    L t (x t ) = L t L t (x t 1; x t )

    = 12 (x t A t x t 1 B t ut )T R 1t (x t A t x t 1 B t ut )

    + 12 (x t 1 t 1)T 1t 1 (x t 1 t 1)

    12 (x t 1 t [ATt R

    1t (x t B t ut ) +

    1t 1 t 1])

    T 1

    (x t 1 t [ATt R 1t (x t B t ut ) +

    1t 1 t 1]) (3.23)

    Let us quickly verify that L t (x t ) indeeddoesnot dependon x t 1. To do so, wesubstituteback t = (ATt R

    1t A t +

    1t 1)

    1, andmultiply out thetermsabove. Forthe readers convenience,termsthat containx t 1 areunderlined(doubly if they arequadraticin x t 1).

    L t (x t ) = 12 xTt 1A

    Tt R

    1t A t x t 1 x

    Tt 1A

    Tt R

    1t (x t B t ut )

    + 12 (x t B t ut )T R 1t (x t B t ut )

    + 12 xTt 1

    1t 1 x t 1 x

    Tt 1

    1t 1 t 1 +

    12

    Tt 1

    1t 1 t 1

    12 xTt 1 (A

    Tt R

    1t A t +

    1t 1) x t 1

  • GaussianFilters 43

    + xTt 1 [ATt R

    1t (x t B t ut ) +

    1t 1 t 1]

    12 [ATt R

    1t (x t B t ut ) +

    1t 1 t 1]

    T (ATt R 1t A t +

    1t 1)

    1

    [ATt R 1t (x t B t ut ) +

    1t 1 t 1] (3.24)

    It is now easilyseenthatall termsthatcontainx t 1 cancelout. This shouldcomeatnosurprise,sinceit is aconsequenceof ourconstructionof L t (x t 1; x t ).

    L t (x t ) = + 12 (x t B t ut )T R 1t (x t B t ut ) +

    12

    Tt 1

    1t 1 t 1

    12 [ATt R

    1t (x t B t ut ) +

    1t 1 t 1]

    T (ATt R 1t A t +

    1t 1)

    1

    [ATt R 1t (x t B t ut ) +

    1t 1 t 1] (3.25)

    Furthermore,L t (x t ) is quadraticin x t . This observationmeansthatbel(x t ) is indeednormaldistributed. The meanandcovarianceof this distribution areof coursetheminimumandcurvatureof L t (x t ), which we now easilyobtainby by computingtherst andsecondderivativesof L t (x t ) with respectto x t :

    @L t (x t )@x t

    = R 1t (x t B t ut ) R 1t A t (A

    Tt R

    1t A t +

    1t 1)

    1

    [ATt R 1t (x t B t ut ) +

    1t 1 t 1]

    = [R 1t R 1t A t (A

    Tt R

    1t A t +

    1t 1)

    1ATt R 1t ] (x t B t ut )

    R 1t A t (ATt R

    1t A t +

    1t 1)

    1 1t 1 t 1 (3.26)

    The inversion lemmastated(andproved) in Table3.2 allows us to expressthe rstfactorasfollows:

    R 1t R 1t A t (A

    Tt R

    1t A t +

    1t 1)

    1ATt R 1t = (Rt + A t t 1 A

    Tt )

    1

    (3.27)

    Hencethedesiredderivative is givenby thefollowing expression:

    @L t (x t )@x t

    = (Rt + A t t 1 ATt ) 1 (x t B t ut )

    R 1t A t (ATt R

    1t A t +

    1t 1)

    1 1t 1 t 1 (3.28)

  • 44 Chapter 3

    Inversion Lemma. For any invertiblequadraticmatricesR andQ andany matrix Pwith appropriatedimension,thefollowing holdstrue

    (R + P Q PT ) 1 = R 1 R 1 P (Q 1 + PT R 1 P ) 1 PT R 1

    assumingthatall above matricescanbe invertedasstated.

    Proof. It sufces to show that

    (R 1 R 1 P (Q 1 + PT R 1 P ) 1 PT R 1) (R + P Q PT ) = I

    This is shown throughaseriesof transformations:

    = R 1 R| {z }= I

    + R 1 P Q PT R 1 P (Q 1 + PT R 1 P ) 1 PT R 1 R| {z }= I

    R 1 P (Q 1 + PT R 1 P ) 1 PT R 1 P Q PT

    = I