Master 2 MOSIG
Knowledge Representation and Reasoning
HMM and Bayesian Filtering
Elise Arnaud
Université Joseph Fourier / INRIA Rhône-Alpes
Overview
1. Introduction on HMM
2. Filtering : Problem Statement
3. Overview of existing solutions
4. Forward algorithm
5. Kalman filter
6. Particle filter
7. Applications
8. Conclusion
References
– A. Doucet, S.J. Godsill, C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10, 197-208, 2000.
– S.M. Arulampalam, S. Maskell, N.J. Gordon, T.C. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2), 174-188, February 2002.
– A. Doucet, N. de Freitas, N. Gordon (eds). Sequential Monte Carlo Methods in Practice. Springer-Verlag, 2001.
Reminder on Markov chain
– we observe a system at discrete times 0, 1, 2, ..., t
– the system can be in one state of a collection of possible states
– the observation of the system is considered as an experiment whose (random) result is the system's state → stochastic process
examples :
– state of an engine (working, not working)
– weather (rain, cloud, snow, sun)
– robot’s position on a grid
The system is evolving in time
Markov property : the state of a system at time t only depends on the state at
time t − 1
Knowing the present, we can forget the past to predict the future
Let X be a Markov chain :
X = {X0, X1, X2, . . . , Xk, . . .} = {Xk ; k ≥ 0}
Xk takes its value in a finite set of possible values : the state space X
p(xk+1|x0, x1, . . . , xk) = p(xk+1|xk)
To define a Markov chain X = {Xk ; k ≥ 0}, one needs :
– the state space X (the m possible values if the state space is discrete)
– the initial distribution p(X0)
– the transition matrix Q that describes the probabilities to go from one state to
another p(xk|xk−1)
To do inference calculations, we will use the joint law :
p(x0:t) = p(x0) ∏_{k=1:t} p(xk|xk−1)
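As a minimal sketch, the joint law above can be turned into code: simulate the chain step by step and accumulate the product p(x0) ∏_{k=1:t} p(xk|xk−1). The engine states come from the example above; the probabilities are illustrative assumptions, not values from the course.

```python
import random

# Hypothetical two-state engine example (the states are from the slides,
# the numbers are illustrative assumptions): "working" or "broken".
states = ["working", "broken"]
p0 = {"working": 0.9, "broken": 0.1}                  # initial distribution p(X0)
Q = {"working": {"working": 0.95, "broken": 0.05},    # transition law p(xk|xk-1)
     "broken":  {"working": 0.30, "broken": 0.70}}

def sample_chain(t, seed=0):
    """Draw a trajectory x_0, ..., x_t from the Markov chain."""
    rng = random.Random(seed)
    x = rng.choices(states, weights=[p0[s] for s in states])[0]
    traj = [x]
    for _ in range(t):
        x = rng.choices(states, weights=[Q[x][s] for s in states])[0]
        traj.append(x)
    return traj

def joint_probability(traj):
    """Joint law p(x_0:t) = p(x0) * prod_{k=1:t} p(xk|xk-1)."""
    p = p0[traj[0]]
    for prev, cur in zip(traj, traj[1:]):
        p *= Q[prev][cur]
    return p
```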
Markov chain ... Hidden Markov chain
Let X = {Xk ; k ≥ 0} be a Markov chain
What we are interested in :
Knowing the state of the chain at instant k, xk ∈ X
– We would like to know the weather (rain, cloud, sun, snow)
– We would like to know the position of a robot on a grid
Problem : the state of the system is indirectly / partially observed
– We would like to know the weather ... but we measure the temperature only
– We would like to know the position of a robot on a grid ... but we gather data from a gyroscope on top of the robot
Such a system is described by a hidden Markov chain
Hidden Markov chain
Hidden Markov Model HMM = {Xk, Zk}k≥0
{Xk}k≥0 : state process
– state space X
– Markovian process
– transition law p(xk|xk−1) (transition matrix Q if X discrete and finite)
{Zk}k≥0 : observation (measurement) process
– observation space Z
– the measurement at time k only depends on the state at time k
– likelihood p(zk|xk) (likelihood matrix B if Z discrete and finite)
Hidden Markov chain
Hidden Markov Chain : model of a dynamic system
described by
1. state space X and observation space Z
2. initial distribution p(X0)
3. transition law p(xk|x0:k−1, z1:k−1) = p(xk|xk−1)
4. likelihood p(zk|x0:k−1, z1:k−1) = p(zk|xk)
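To make the four ingredients concrete, here is a minimal sketch that samples a trajectory from a hypothetical weather HMM: hidden state = weather, observation = a coarse temperature reading. The state space, observation space and all probabilities are illustrative assumptions, not the course's numbers.

```python
import random

# Hypothetical weather HMM (illustrative numbers).
states = ["rain", "sun"]
obs_space = ["cold", "warm"]
p0 = {"rain": 0.5, "sun": 0.5}                 # initial distribution p(X0)
Q = {"rain": {"rain": 0.7, "sun": 0.3},        # transition law p(xk|xk-1)
     "sun":  {"rain": 0.2, "sun": 0.8}}
B = {"rain": {"cold": 0.9, "warm": 0.1},       # likelihood p(zk|xk)
     "sun":  {"cold": 0.3, "warm": 0.7}}

def sample_hmm(t, seed=0):
    """Draw (x_0:t, z_1:t): the hidden chain, then one observation per step."""
    rng = random.Random(seed)
    x = rng.choices(states, weights=[p0[s] for s in states])[0]
    xs, zs = [x], []
    for _ in range(t):
        x = rng.choices(states, weights=[Q[x][s] for s in states])[0]
        xs.append(x)
        zs.append(rng.choices(obs_space, weights=[B[x][o] for o in obs_space])[0])
    return xs, zs
```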
Goal
calculate the state of a system from a set of observations
– estimate the weather today from a set of temperatures measured till today
– estimate the position of the robot from the gyroscope data
From a sequence of observations z1:k = {z1, . . . , zk}, the goal is to find the state xk for which the probability p(xk|z1:k) is maximal.
filtering problem
Goal
other related problems
From a sequence of observations z1:k = {z1, . . . , zk}, and the model, estimate
– the sequence of states x0:k = {x0, . . . , xk} for which the probability p(x0:k|z1:k) is maximal : trajectography
– the most probable (previous) state at time t, with t < k : smoothing
– the most probable (future) state at time t, with t > k : prediction
– the probability of occurrence of the sequence of observations (to study rare events)
From a sequence of observations z1:k, and a sequence of states x1:k, estimate the model parameters : learning
Applications
positioning, navigation and tracking
– target tracking
– computer vision
– mobile robotics
– ambient intelligence
– sensor networks, etc.
Applications
among others ...
data assimilation
environmental sciences
(oceanography, meteorology, atmospheric pollution)
information theory
bioinformatics
speech recognition
handwriting recognition
finance
...
Problem Statement
Dynamic system modeled as a Hidden Markov Chain
described by
1. state space X ; measurement space Z
2. the initial distribution p(X0)
3. an evolution model (transition law) p(xk|x0:k−1, z1:k−1) = p(xk|xk−1)
4. an observation model (likelihood) p(zk|x0:k−1, z1:k−1) = p(zk|xk)
Problem Statement
Dynamic system modeled as a Hidden Markov Chain
We have :
p(x0:t, z1:t) = p(x0) ∏_{k=1:t} p(xk|xk−1) p(zk|xk)
p(x0:t, z1:t) = p(xt|xt−1) p(zt|xt) p(x0:t−1, z1:t−1)
Problem Statement
Filtering - tracking :
estimation of the state given the past and present measurements
p(xk|z1:k)
Trajectography :
estimation of the state trajectory given the past and present measurements
p(x0:k|z1:k)
Smoothing :
estimation of the state given the past and some future measurements
p(xk|z1:t) t > k
Prediction :
estimation of a future state given the measurements up to a past time
p(xk|z1:t) t < k
Problem Statement
Filtering
– estimation of the state given the past and present measurements
filtering distribution : p(xk|z1:k)
– This estimation has to be sequential, i.e. :
p(xk−1|z1:k−1) → Algorithm → p(xk|z1:k)
Problem Statement
Toy example : the white car tracking
– state xk : position + velocity
– evolution model : the car evolves at constant velocity
– observation zk : detected white cars
– observation model : the tracked car should be one of
the detected cars
p(xk|z1:k) current position of the white car
knowing all previous and current detected white cars
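A constant-velocity evolution model like the one above can be sketched in a few lines; the 1-D state layout, the time step and the noise level are illustrative assumptions, not values from the course.

```python
import random

# Sketch of a constant-velocity evolution model for the car example:
# state x = (position, velocity); numbers below are assumptions.
def evolve(pos, vel, rng, dt=1.0, noise_std=0.5):
    """One step of p(xk|xk-1): the car keeps its velocity, up to noise."""
    new_pos = pos + vel * dt + rng.gauss(0.0, noise_std)
    new_vel = vel + rng.gauss(0.0, noise_std)
    return new_pos, new_vel

rng = random.Random(0)
trajectory = [(0.0, 10.0)]        # start at position 0 with velocity 10
for _ in range(5):
    trajectory.append(evolve(*trajectory[-1], rng))
```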
Problem Statement
Objective : Sequential estimation of the filtering distribution p(xk|z1:k)
p(xk|z1:k) = C p(xk, z1:k)
= C ∫ p(xk, Xk−1, z1:k−1, zk) dXk−1
= C ∫ p(zk|xk, Xk−1, z1:k−1) p(xk, Xk−1, z1:k−1) dXk−1
= C p(zk|xk) ∫ p(xk|Xk−1, z1:k−1) p(Xk−1, z1:k−1) dXk−1
= C p(zk|xk) ∫ p(xk|Xk−1) p(Xk−1, z1:k−1) dXk−1
= C p(zk|xk) ∫ p(xk|Xk−1) p(Xk−1|z1:k−1) p(z1:k−1) dXk−1
= C′ p(zk|xk) ∫ p(xk|Xk−1) p(Xk−1|z1:k−1) dXk−1
where C′ = C p(z1:k−1)
Problem Statement
Objective : Sequential estimation of the filtering distribution p(xk|z1:k)
i.e. estimation of p(xk|z1:k) knowing p(xk−1|z1:k−1)
p(xk|z1:k) = C p(xk, z1:k) = C′ p(zk|xk) ∫ p(xk|Xk−1) p(Xk−1|z1:k−1) dXk−1
with
C = 1 / p(z1:k)
then
C′ = p(z1:k−1) / p(z1:k)
= p(z1:k−1) / p(z1:k−1, zk)
= p(z1:k−1) / [ p(zk|z1:k−1) p(z1:k−1) ]
= 1 / p(zk|z1:k−1)
= 1 / ∫ p(zk, Xk|z1:k−1) dXk
= 1 / ∫ p(zk|Xk) p(Xk|z1:k−1) dXk
Problem Statement
Objective : Sequential estimation of the filtering distribution p(xk|z1:k)
i.e. estimation of p(xk|z1:k) knowing p(xk−1|z1:k−1)
Optimal Bayesian Filter
1. prediction :
p(xk|z1:k−1) = ∫ p(xk|Xk−1) p(Xk−1|z1:k−1) dXk−1
2. update :
p(xk|z1:k) = p(zk|xk) p(xk|z1:k−1) / ∫ p(zk|Xk) p(Xk|z1:k−1) dXk
... but how to compute these two integrals ?
Problem Statement
So far ...
p(xk−1|z1:k−1) → Algorithm → p(xk|z1:k)
– exact solution : Optimal Bayesian Filter
– but this solution implies the calculation of two huge integrals ...
– various algorithms can be proposed, depending on the model :
p(xk|xk−1) and p(zk|xk)
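When the state space is finite, the two integrals of the optimal Bayesian filter reduce to sums and the recursion can be sketched directly. The weather model below is a hypothetical illustration (states, observations and numbers are assumptions); Q is the transition law and B the likelihood, as in the HMM definition.

```python
# Minimal sketch of the optimal Bayesian filter over a finite state space.
states = ["rain", "sun"]
p0 = {"rain": 0.5, "sun": 0.5}
Q = {"rain": {"rain": 0.7, "sun": 0.3},
     "sun":  {"rain": 0.2, "sun": 0.8}}
B = {"rain": {"cold": 0.9, "warm": 0.1},
     "sun":  {"cold": 0.3, "warm": 0.7}}

def bayes_filter_step(belief, z):
    """p(x_{k-1}|z_1:k-1) -> Algorithm -> p(x_k|z_1:k)."""
    # 1. prediction: p(xk|z_1:k-1) = sum_x' p(xk|x') p(x'|z_1:k-1)
    predicted = {x: sum(Q[xp][x] * belief[xp] for xp in states) for x in states}
    # 2. update: multiply by the likelihood p(zk|xk) and normalize
    unnorm = {x: B[x][z] * predicted[x] for x in states}
    c = sum(unnorm.values())      # = p(zk|z_1:k-1), the normalization constant
    return {x: unnorm[x] / c for x in states}

belief = dict(p0)                 # start from p(X0)
for z in ["cold", "cold", "warm"]:
    belief = bayes_filter_step(belief, z)
```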
Overview of existing solutions
– Both X and Z are discrete and finite → Forward algorithm
– Otherwise
– Linear Gaussian model → Kalman filter
– weakly nonlinear, Gaussian → extensions of the Kalman filter
– nonlinear, non-Gaussian → Particle filter (Sequential Monte Carlo methods)
Forward algorithm
Let us suppose that the HMM is characterized by the transition matrix Q defined by :
qij = p(Xk+1 = j | Xk = i)
and the observation matrix B defined by :
bi(j) = p(Zk = j | Xk = i)
then we can use the forward algorithm to calculate p(xk|z1:k)
Forward algorithm
Example of a 2-state HMM, with states marche ("on", M) and arrêt ("off", A)
[State diagram: M stays in M with probability 0.9 and moves to A with 0.1 ; A moves to M with 0.7 and stays in A with 0.3]
Q = | qMM qMA | = | 0.9 0.1 |
    | qAM qAA |   | 0.7 0.3 |
Forward algorithm
Example of the same 2-state HMM, with observations R and V
[Diagram: state M emits R with probability 0.2 and V with 0.8 ; state A emits R with 0.95 and V with 0.05]
B = | bM(R) bM(V) | = | 0.2  0.8  |
    | bA(R) bA(V) |   | 0.95 0.05 |
Forward algorithm
Example of a 2-state HMM - representation on a lattice
[Lattice figure: the states M and A unrolled over times t = 0, 1, 2, ..., k]
p(X0 = M) = µ(M) = 0.9 ; p(X0 = A) = µ(A) = 0.1
p(X3 = M | Z1 = R, Z2 = R, Z3 = V) ?
p(X3 = A | Z1 = R, Z2 = R, Z3 = V) ?
Forward algorithm
Example of a 2-state HMM - representation on a lattice
[Lattice figure: the states M and A unrolled over time]
p(X1 = M |Z1 = R)
∝ p(X1 = M,Z1 = R)
= [µ(M) ∗ p(X1 = M |X0 = M) + µ(A) ∗ p(X1 = M |X0 = A)] p(Z1 = R|X1 = M)
= [µ(M) qMM + µ(A) qAM ] bM (R)
= α1(M)
in a similar manner :
p(X1 = A|Z1 = R) ∝ p(X1 = A,Z1 = R)
= [µ(M) qMA + µ(A) qAA] bA(R) = α1(A)
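The same computation can be checked numerically with the example's numbers (Q, B and µ are from the slides; the dictionary layout is an implementation choice):

```python
# Numeric check of the alpha_1 computation above
# (M = marche, A = arrêt; observations R and V).
mu = {"M": 0.9, "A": 0.1}
Q = {"M": {"M": 0.9, "A": 0.1}, "A": {"M": 0.7, "A": 0.3}}
B = {"M": {"R": 0.2, "V": 0.8}, "A": {"R": 0.95, "V": 0.05}}

a1 = {i: sum(mu[j] * Q[j][i] for j in "MA") * B[i]["R"] for i in "MA"}
# a1["M"] = (0.9*0.9 + 0.1*0.7) * 0.2  = 0.176
# a1["A"] = (0.9*0.1 + 0.1*0.3) * 0.95 = 0.114
```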
Forward algorithm
Example of a 2-state HMM - representation on a lattice
[Lattice figure: the states M and A unrolled over time]
p(X2 = M |Z1 = R,Z2 = R)
∝ p(X2 = M,Z1 = R,Z2 = R)
= α1(M) ∗ qMM ∗ bM (R) + α1(A) ∗ qAM ∗ bM (R) = α2(M)
in a similar manner :
p(X2 = A|Z1 = R,Z2 = R)
∝ p(X2 = A,Z1 = R,Z2 = R)
= α1(M) ∗ qMA ∗ bA(R) + α1(A) ∗ qAA ∗ bA(R) = α2(A)
... we can do the same for p(X3|Z1 = R,Z2 = R,Z3 = V )
Forward algorithm
Generalization
We define αk(i) = p(z1:k, Xk = i), the probability to observe z1 . . . zk with a sequence of states that ends in state i.
We have p(Xk = i|z1:k) ∝ αk(i)
The forward algorithm gives us an efficient way to calculate αk+1(i) knowing αk(j), ∀ i, j ∈ X
Forward algorithm
Generalization
Initialization
α0(i) = µ(i)
Induction
αk+1(i) = [ ∑_{j=1:N} αk(j) qji ] bi(zk+1)
We can also calculate
p(z1:k) = ∑_{j=1:N} αk(j)
T observations, N states ⇒ N²T operations
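Putting the initialization and induction together, a sketch of the forward algorithm on the 2-state example (µ, Q and B from the slides) answers the question p(X3 = · | Z1 = R, Z2 = R, Z3 = V):

```python
# Forward algorithm on the 2-state example (M = marche, A = arrêt).
mu = {"M": 0.9, "A": 0.1}
Q = {"M": {"M": 0.9, "A": 0.1}, "A": {"M": 0.7, "A": 0.3}}
B = {"M": {"R": 0.2, "V": 0.8}, "A": {"R": 0.95, "V": 0.05}}
states = ["M", "A"]

def forward(observations):
    """Return alpha_k(i) = p(z_1:k, X_k = i) after the last observation."""
    alpha = dict(mu)                       # initialization: alpha_0(i) = mu(i)
    for z in observations:                 # induction: N^2 work per observation
        alpha = {i: sum(alpha[j] * Q[j][i] for j in states) * B[i][z]
                 for i in states}
    return alpha

alpha3 = forward(["R", "R", "V"])
p_z = sum(alpha3.values())                 # p(z_1:3)
posterior = {i: alpha3[i] / p_z for i in states}   # posterior["M"] ≈ 0.984
```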
Forward algorithm
[Figure: states x1, x2, . . . , xN at time k all feed state xi at time k+1 with weights q1i, . . . , qNi ; the observation z is linked to xi with weight bi(z)]
Other problems
– trajectography : Viterbi algorithm
– smoothing : Forward-Backward algorithm
– learning : Baum-Welch algorithm