Recent developments in imprecise probabilities and probabilistic graphical models Gert de Cooman Ghent University, SYSTeMS [email protected] http://users.UGent.be/˜gdcooma gertekoo.wordpress.com ECAI 2012 31 August 2012


Recent developments in imprecise probabilities and probabilistic graphical models

Gert de Cooman

Ghent University, SYSTeMS

[email protected]
http://users.UGent.be/˜gdcooma

gertekoo.wordpress.com

ECAI 2012
31 August 2012


What would I like to achieve and convey?

IMPRECISE PROBABILITIES

PROBABILISTIC GRAPHICAL MODELS


IMPRECISE PROBABILITY MODELS


Credal sets


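Mass functions and expectations

Assume we are uncertain about:
– the value of a variable $X$
– in a finite set of possible values $\mathcal{X}$.

This is usually modelled by a probability mass function $p$ on $\mathcal{X}$:

\[ p(x) \geq 0 \text{ for all } x \in \mathcal{X} \quad\text{and}\quad \sum_{x \in \mathcal{X}} p(x) = 1. \]

With $p$ we can associate a prevision/expectation operator $P_p$:

\[ P_p(f) := \sum_{x \in \mathcal{X}} p(x) f(x) \quad\text{where } f\colon \mathcal{X} \to \mathbb{R}. \]

If $A \subseteq \mathcal{X}$ is an event, then its probability is given by

\[ P_p(A) = \sum_{x \in A} p(x) = P_p(I_A), \]

where $I_A$ is the indicator of $A$.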
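As a small illustration of the expectation operator and of the identity between the probability of an event and the expectation of its indicator, here is a minimal Python sketch. The three-element space and the numbers in `p` are made up for the example, and the helper names `prevision` and `indicator` are hypothetical.

```python
from fractions import Fraction

def prevision(p, f):
    """Expectation P_p(f) = sum over x of p(x) * f(x).

    p is a dict mapping each possible value to its probability mass,
    f is a function from possible values to real numbers.
    """
    return sum(mass * f(x) for x, mass in p.items())

def indicator(event):
    """Indicator I_A of an event A: 1 on A and 0 elsewhere."""
    return lambda x: 1 if x in event else 0

# Hypothetical mass function on the space {a, b, c}.
p = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

# A mass function is non-negative and sums to one.
is_mass_function = all(mass >= 0 for mass in p.values()) and sum(p.values()) == 1

# Probability of the event A = {a, c}, computed as the expectation of I_A.
prob_A = prevision(p, indicator({"a", "c"}))
```

Exact fractions are used only to keep the arithmetic free of rounding; floats would work just as well.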


The simplex of all probability mass functions

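Consider the simplex $\Sigma_{\mathcal{X}}$ of all mass functions on $\mathcal{X}$:

\[ \Sigma_{\mathcal{X}} := \Bigl\{ p \in \mathbb{R}^{\mathcal{X}}_{+} : \sum_{x \in \mathcal{X}} p(x) = 1 \Bigr\}. \]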

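[Figure: the simplex $\Sigma_{\mathcal{X}}$ for $\mathcal{X} = \{a, b, c\}$, drawn with vertices a = (1,0,0), b = (0,1,0) and c = (0,0,1); a second panel marks the mass function $p_u$.]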


Credal sets

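Definition: A credal set $\mathcal{M}$ is a convex closed subset of $\Sigma_{\mathcal{X}}$.

[Figure: four examples of a credal set $\mathcal{M}$, each drawn inside the simplex with vertices a, b, c.]

It is completely characterised by its set of extreme points $\mathrm{ext}(\mathcal{M})$.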


Conditioning and credal sets

Suppose we have two variables $X_1$ in $\mathcal{X}_1$ and $X_2$ in $\mathcal{X}_2$.

A credal set for $(X_1, X_2)$ jointly is a convex closed set of joint mass functions $p(x_1, x_2)$:

\[ \mathcal{M} \subseteq \Sigma_{\mathcal{X}_1 \times \mathcal{X}_2}. \]

This gives rise to a conditional model by applying Bayes's Rule to each mass function:

\[ \mathcal{M}|x_2 := \{ p(\cdot|x_2) : p \in \mathcal{M} \}. \]

Working with extreme points does the job too.
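The element-wise conditioning described on this slide can be sketched in a few lines of Python. The sketch applies Bayes's Rule to each mass function in a finite list, which is how one would work with extreme points. The joint mass functions in `ext_M` are hypothetical, and the code assumes every one of them gives positive probability to the conditioning value.

```python
from fractions import Fraction as F

def condition(p, x2):
    """Bayes's Rule: turn a joint mass function p(x1, x2) into p(. | x2).

    p is a dict keyed by pairs (x1, x2). Raises ValueError when the
    conditioning value has zero marginal probability.
    """
    marginal = sum(mass for (_, b), mass in p.items() if b == x2)
    if marginal == 0:
        raise ValueError("cannot condition on a value with probability zero")
    return {a: mass / marginal for (a, b), mass in p.items() if b == x2}

# Hypothetical extreme points of a joint credal set on {h, t} x {u, v}.
ext_M = [
    {("h", "u"): F(1, 4), ("h", "v"): F(1, 4), ("t", "u"): F(1, 4), ("t", "v"): F(1, 4)},
    {("h", "u"): F(1, 2), ("h", "v"): F(1, 6), ("t", "u"): F(1, 6), ("t", "v"): F(1, 6)},
]

# Conditional mass functions for X1 given X2 = u, one per joint mass function.
conditional = [condition(p, "u") for p in ext_M]
```

Raising an error on a zero marginal is a deliberate choice for this sketch: conditioning on events of probability zero needs separate treatment and is not covered here.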


Independence and credal sets

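Suppose we have two variables $X_1$ in $\mathcal{X}_1$ and $X_2$ in $\mathcal{X}_2$.

Marginal models are credal sets for $X_1$ and $X_2$ separately:

\[ \mathcal{M}_1 \subseteq \Sigma_{\mathcal{X}_1} \quad\text{and}\quad \mathcal{M}_2 \subseteq \Sigma_{\mathcal{X}_2}. \]

Their strong product is the joint credal set

\[ \mathcal{M}_1 \boxtimes \mathcal{M}_2 := \mathrm{CCH}\bigl(\{ p_1 \cdot p_2 : p_1 \in \mathcal{M}_1 \text{ and } p_2 \in \mathcal{M}_2 \}\bigr), \]

where $\mathrm{CCH}$ denotes the closed convex hull.

This leads to a notion of strong independence.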
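To make the strong product concrete, the following Python sketch builds the product mass functions from finite lists of marginal mass functions. The convex hull itself is not computed: the returned list only generates the strong product. The marginal mass functions and the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def product_mass(p1, p2):
    """Product mass function: (p1 . p2)(x1, x2) = p1(x1) * p2(x2)."""
    return {(x1, x2): p1[x1] * p2[x2] for x1 in p1 for x2 in p2}

def strong_product_generators(ext1, ext2):
    """All products of one mass function from ext1 with one from ext2.

    The strong product is the closed convex hull of these products.
    """
    return [product_mass(p1, p2) for p1, p2 in product(ext1, ext2)]

# Hypothetical extreme points of two marginal credal sets.
ext_M1 = [{"h": F(1, 2), "t": F(1, 2)}, {"h": F(3, 4), "t": F(1, 4)}]
ext_M2 = [{"u": F(1, 3), "v": F(2, 3)}, {"u": F(1, 2), "v": F(1, 2)}]

generators = strong_product_generators(ext_M1, ext_M2)
```

With two mass functions per marginal, the list holds four joint mass functions, each of which factorises.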


Lower previsions


Lower and upper previsions

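[Figure: the simplex $\Sigma_{\mathcal{X}}$ with vertices a, b, c, annotated with the bounds $\underline{P}(I_{\{c\}}) = 1/4$ and $\overline{P}(I_{\{c\}}) = 4/7$.]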

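Equivalent model: Consider the set $\mathcal{L}(\mathcal{X}) = \mathbb{R}^{\mathcal{X}}$ of all real-valued maps on $\mathcal{X}$. We define two real functionals on $\mathcal{L}(\mathcal{X})$: for all $f\colon \mathcal{X} \to \mathbb{R}$,

\[ \underline{P}_{\mathcal{M}}(f) = \min\{ P_p(f) : p \in \mathcal{M} \} \quad\text{(lower prevision/expectation)}, \]
\[ \overline{P}_{\mathcal{M}}(f) = \max\{ P_p(f) : p \in \mathcal{M} \} \quad\text{(upper prevision/expectation)}. \]

Observe that

\[ \overline{P}_{\mathcal{M}}(f) = -\underline{P}_{\mathcal{M}}(-f). \]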
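The lower and upper envelopes can be computed directly when the credal set is generated by finitely many mass functions, because an expectation is linear in the mass function and so attains its minimum and maximum at extreme points. The Python sketch below does this. The extreme points in `ext_M` are hypothetical, chosen only so that the bounds for the indicator of {c} come out as the 1/4 and 4/7 quoted on the slide.

```python
from fractions import Fraction as F

def prevision(p, f):
    """Expectation of the gamble f (a dict) under the mass function p (a dict)."""
    return sum(p[x] * f[x] for x in p)

def lower_prevision(ext_points, f):
    """Lower envelope: the minimum expectation over the given mass functions."""
    return min(prevision(p, f) for p in ext_points)

def upper_prevision(ext_points, f):
    """Upper envelope: the maximum expectation over the given mass functions."""
    return max(prevision(p, f) for p in ext_points)

# Hypothetical extreme points of a credal set on {a, b, c}.
ext_M = [
    {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)},
    {"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
    {"a": F(1, 7), "b": F(2, 7), "c": F(4, 7)},
]

I_c = {"a": 0, "b": 0, "c": 1}                 # indicator of the event {c}
minus_I_c = {x: -value for x, value in I_c.items()}

low = lower_prevision(ext_M, I_c)
up = upper_prevision(ext_M, I_c)
conjugate = -lower_prevision(ext_M, minus_I_c)  # should coincide with `up`
```

The last line checks the conjugacy relation numerically for this one gamble: the upper prevision of a gamble equals minus the lower prevision of its negation.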


Basic properties of lower previsions

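Definition: We call a real functional $\underline{P}$ on $\mathcal{L}(\mathcal{X})$ a lower prevision if it satisfies the following properties: for all $f$ and $g$ in $\mathcal{L}(\mathcal{X})$ and all real $\lambda \geq 0$:

1. $\underline{P}(f) \geq \min f$ [boundedness];
2. $\underline{P}(f + g) \geq \underline{P}(f) + \underline{P}(g)$ [super-additivity];
3. $\underline{P}(\lambda f) = \lambda \underline{P}(f)$ [non-negative homogeneity].

Theorem: A real functional $\underline{P}$ is a lower prevision if and only if it is the lower envelope of some credal set $\mathcal{M}$.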
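The easy direction of the theorem, that a lower envelope satisfies the three properties, can be verified in a few lines. The following is a sketch of that check only, writing $\underline{P}_{\mathcal{M}}(f) = \min_{p \in \mathcal{M}} P_p(f)$ for the lower envelope of a credal set $\mathcal{M}$; the converse direction needs a separation argument and is not reproduced here.

```latex
\begin{align*}
\underline{P}_{\mathcal{M}}(f)
  &= \min_{p \in \mathcal{M}} \sum_{x \in \mathcal{X}} p(x) f(x)
   \geq \min_{p \in \mathcal{M}} \sum_{x \in \mathcal{X}} p(x) \min f
   = \min f, \\
\underline{P}_{\mathcal{M}}(f + g)
  &= \min_{p \in \mathcal{M}} \bigl[ P_p(f) + P_p(g) \bigr]
   \geq \min_{p \in \mathcal{M}} P_p(f) + \min_{p \in \mathcal{M}} P_p(g)
   = \underline{P}_{\mathcal{M}}(f) + \underline{P}_{\mathcal{M}}(g), \\
\underline{P}_{\mathcal{M}}(\lambda f)
  &= \min_{p \in \mathcal{M}} \lambda P_p(f)
   = \lambda \min_{p \in \mathcal{M}} P_p(f)
   = \lambda \underline{P}_{\mathcal{M}}(f)
   \qquad \text{for } \lambda \geq 0.
\end{align*}
```

The inequality in the second line is strict whenever the minima for $f$ and $g$ are attained at different mass functions, which is why a lower prevision is only super-additive rather than additive.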


Conditioning and lower previsions

Suppose we have two variables $X_1$ in $\mathcal{X}_1$ and $X_2$ in $\mathcal{X}_2$.

Consider for instance:
– a joint lower prevision $\underline{P}_{1,2}$ for $(X_1, X_2)$ defined on $\mathcal{L}(\mathcal{X}_1 \times \mathcal{X}_2)$;
– a conditional lower prevision $\underline{P}_2(\cdot|x_1)$ for $X_2$ conditional on $X_1 = x_1$, defined on $\mathcal{L}(\mathcal{X}_2)$, for all values $x_1 \in \mathcal{X}_1$.

Coherence: These lower previsions $\underline{P}_{1,2}$ and $\underline{P}_2(\cdot|X_1)$ must satisfy certain (joint) coherence criteria: compare with Bayes's Rule and de Finetti's coherence criteria for precise previsions.

See the web site of SIPTA (www.sipta.org) for pointers to more details.


Independence and lower previsions

Suppose we have two variables $X_1$ in $\mathcal{X}_1$ and $X_2$ in $\mathcal{X}_2$.

Definition (Epistemic irrelevance): $X_1$ is epistemically irrelevant to $X_2$ when learning the value of $X_1$ does not change our beliefs about $X_2$:

\[ \underline{P}_{1,2}(f(X_2)) = \underline{P}_2(f(X_2)|x_1) \quad\text{for all } f \in \mathcal{L}(\mathcal{X}_2) \text{ and all } x_1 \in \mathcal{X}_1. \]

Important: Epistemic irrelevance is not a symmetrical notion! It is weaker than strong independence.

Epistemic independence (also weaker) is the symmetrised version.


Sets of desirable gambles


First steps: Peter Walley (2000)

@ARTICLE{walley2000,
  author = {Walley, Peter},
  title = {Towards a unified theory of imprecise probability},
  journal = {International Journal of Approximate Reasoning},
  year = 2000,
  volume = 24,
  pages = {125--148}
}


First steps: Peter Williams (1977)

@ARTICLE{williams2007,
  author = {Williams, Peter M.},
  title = {Notes on conditional previsions},
  journal = {International Journal of Approximate Reasoning},
  year = 2007,
  volume = 44,
  pages = {366--383}
}


Set of desirable gambles as a belief model

Gambles: A gamble $f\colon \mathcal{X} \to \mathbb{R}$ is an uncertain reward whose value is $f(X)$.

Set of desirable gambles: $\mathcal{D} \subseteq \mathcal{L}(\mathcal{X})$ is a set of gambles that a subject strictly prefers to zero.


Why work with sets of desirable gambles?

Working with sets of desirable gambles $\mathcal{D}$:
– is simpler, more intuitive and more elegant
– is more general and expressive than (conditional) lower previsions
– gives a geometrical flavour to probabilistic inference
– includes precise probability as one special case
– includes classical propositional logic as another special case
– shows that probabilistic inference and Bayes' Rule are 'logical' inference
– avoids problems with conditioning on sets of probability zero


Most comprehensive approach so far: note on arXiv


Introduction to Imprecise Probabilities

@BOOK{troffaes2012,
  title = {Introduction to Imprecise Probabilities},
  publisher = {Wiley},
  editor = {Augustin, Thomas and Coolen, Frank P. A. and De Cooman, Gert and Troffaes, Matthias C. M.},
  note = {Due end 2012},
}


IMPRECISE-PROBABILISTIC GRAPHICAL MODELS


Credal sets


Credal networks: the special case of a tree

Basic concept: Consider a directed tree $T$, with a variable $X_t$ attached to each node $t \in T$.

[Figure: a directed tree with nodes $X_1, X_2, \dots, X_{11}$.]

Each variable $X_t$ assumes values in a set $\mathcal{X}_t$.


Credal trees: local uncertainty models

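Local uncertainty model associated with each node $t$: For each possible value $x_{m(t)} \in \mathcal{X}_{m(t)}$ of the mother variable $X_{m(t)}$, we have a local conditional credal set

\[ \mathcal{M}_{t|X_{m(t)}} \]

which is a collection of credal sets

\[ \mathcal{M}_{t|x_{m(t)}} \subseteq \Sigma_{\mathcal{X}_t} \quad\text{for each } x_{m(t)} \in \mathcal{X}_{m(t)}. \]

[Figure: the mother node $X_{m(t)}$ with its children $X_s, \dots, X_t, \dots, X_{s'}$.]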


Interpretation of the graphical structure

The graphical structure is interpreted as follows: Conditional on the mother variable, the non-parent non-descendants of each node variable are strongly independent of it and its descendants.

[Figure: the directed tree with nodes $X_1, X_2, \dots, X_{11}$.]


Lower previsions


Credal trees: local uncertainty models

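Local uncertainty model associated with each node $t$: For each possible value $x_{m(t)} \in \mathcal{X}_{m(t)}$ of the mother variable $X_{m(t)}$, we have a conditional lower prevision/expectation

\[ \underline{Q}_t(\cdot|x_{m(t)})\colon \mathcal{L}(\mathcal{X}_t) \to \mathbb{R} \]

where

\[ \underline{Q}_t(f|x_{m(t)}) = \text{lower prevision of } f(X_t), \text{ given that } X_{m(t)} = x_{m(t)}. \]

The local model $\underline{Q}_t(\cdot|X_{m(t)})$ is a conditional lower prevision operator.

[Figure: the mother node $X_{m(t)}$ with its children $X_s, \dots, X_t, \dots, X_{s'}$.]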


Interpretation of the graphical structure

The graphical structure is interpreted as follows: Conditional on the mother variable, the non-parent non-descendants of each node variable are epistemically irrelevant to it and its descendants.

[Figure: the directed tree with nodes $X_1, X_2, \dots, X_{11}$.]


@ARTICLE{cooman2010,
  author = {{d}e Cooman, Gert and Hermans, Filip and Antonucci, Alessandro and Zaffalon, Marco},
  title = {Epistemic irrelevance in credal nets: the case of imprecise {M}arkov trees},
  journal = {International Journal of Approximate Reasoning},
  year = 2010,
  volume = 51,
  pages = {1029--1052},
  doi = {10.1016/j.ijar.2010.08.011}
}


MePICTIr for updating a credal tree

For a credal tree we can find the joint model from the local models recursively, from leaves to root.

Exact message passing algorithm
– credal tree treated as an expert system
– linear complexity in the number of nodes

Python code
– written by Filip Hermans
– testing and connection with strong independence in cooperation with Marco Zaffalon and Alessandro Antonucci

Current (toy) applications in HMMs: character recognition, air traffic trajectory tracking and identification, earthquake rate prediction
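The idea of building the joint model from the local models, working from the leaves towards the root, can be illustrated on the smallest possible tree: a root $X_1$ with a single child $X_2$. The Python sketch below is not the MePICTIr code and makes its own assumptions: local models are given as finite lists of hypothetical extreme points, and the joint lower prevision is obtained by composing the local lower previsions, child first and root last.

```python
from fractions import Fraction as F

def lower(ext_points, gamble):
    """Lower prevision: minimum expectation over a finite list of mass functions."""
    return min(sum(p[x] * gamble[x] for x in p) for p in ext_points)

# Hypothetical local models, each given by extreme points of a credal set.
root_model = [{0: F(1, 2), 1: F(1, 2)}, {0: F(1, 4), 1: F(3, 4)}]
child_model = {  # one credal set for X2 per value x1 of the mother X1
    0: [{0: F(9, 10), 1: F(1, 10)}, {0: F(4, 5), 1: F(1, 5)}],
    1: [{0: F(1, 5), 1: F(4, 5)}, {0: F(3, 10), 1: F(7, 10)}],
}

def joint_lower(gamble):
    """Joint lower prevision of a gamble on (X1, X2), computed leaf first."""
    # Leaf step: for each mother value x1, reduce the child to a single number.
    message = {
        x1: lower(ext, {x2: gamble[(x1, x2)] for x2 in ext[0]})
        for x1, ext in child_model.items()
    }
    # Root step: apply the root model to the message passed up by the leaf.
    return lower(root_model, message)

# Lower probability of the event X2 = 1 under the joint model.
indicator_x2_is_1 = {(x1, x2): int(x2 == 1) for x1 in (0, 1) for x2 in (0, 1)}
result = joint_lower(indicator_x2_is_1)
```

Each node is visited once and only passes a function of its mother's value upwards, which is consistent with a cost that grows linearly in the number of nodes. Trees with branching need an extra step to combine the messages of several children, which this sketch leaves out.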


@INPROCEEDINGS{cooman2011,
  author = {De Bock, Jasper and {d}e Cooman, Gert},
  title = {Imprecise probability trees: Bridging two theories of imprecise probability},
  booktitle = {ISIPTA ’09 -- Proceedings of the 6th International Symposium on Imprecise Probability: Theories and Applications},
  year = 2009,
  editor = {Coolen, Frank P. A. and {d}e Cooman, Gert and Fetz, Thomas and Oberguggenberger, Michael},
  address = {Innsbruck, Austria},
  publisher = {SIPTA}
}


A HMM is a special credal tree

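[Figure: an HMM drawn as a credal tree.
State sequence: $X_1, X_2, \dots, X_k, \dots, X_n$, with local models $\underline{Q}_1(\cdot)$, $\underline{Q}_2(\cdot|X_1)$, ..., $\underline{Q}_k(\cdot|X_{k-1})$, ..., $\underline{Q}_n(\cdot|X_{n-1})$.
Output sequence: $O_1, O_2, \dots, O_k, \dots, O_n$, with local models $\underline{S}_1(\cdot|X_1)$, $\underline{S}_2(\cdot|X_2)$, ..., $\underline{S}_k(\cdot|X_k)$, ..., $\underline{S}_n(\cdot|X_n)$.]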


Maximal state sequences

Classically (Viterbi): Find the state sequence $\hat{x}_{1:n}$ that maximises the posterior probability $p(x_{1:n}|o_{1:n})$ corresponding to a given observation sequence $o_{1:n}$.

Maximality (under robust ordering): Define a partial order $\succ$ on state sequences:

\[ \hat{x}_{1:n} \succ x_{1:n} \quad\text{iff}\quad p(\hat{x}_{1:n}|o_{1:n}) > p(x_{1:n}|o_{1:n}) \text{ for all compatible } p(\cdot|o_{1:n}). \]

Find the state sequences $\hat{x}_{1:n}$ that are maximal: undominated by any other state sequence.
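The maximality criterion can be illustrated by brute force on a toy example. The Python sketch below assumes that the compatible posteriors are available as a finite list of mass functions over complete state sequences, which is a simplification made for this example. It enumerates all pairs of sequences and is therefore not the EstiHMM algorithm. The posteriors and function names are hypothetical.

```python
from fractions import Fraction as F

def dominates(s, t, posteriors):
    """True if sequence s has strictly higher posterior probability than t
    under every posterior mass function in the list."""
    return all(p[s] > p[t] for p in posteriors)

def maximal_sequences(sequences, posteriors):
    """All sequences that are not dominated by any other sequence."""
    return [
        t for t in sequences
        if not any(dominates(s, t, posteriors) for s in sequences if s != t)
    ]

# Hypothetical posteriors over the state sequences of length two on {a, b}.
posteriors = [
    {"aa": F(1, 2), "ab": F(3, 10), "ba": F(1, 10), "bb": F(1, 10)},
    {"aa": F(3, 10), "ab": F(2, 5), "ba": F(1, 5), "bb": F(1, 10)},
]
sequences = ["aa", "ab", "ba", "bb"]

maximal = maximal_sequences(sequences, posteriors)
```

In this example the two posteriors disagree on whether "aa" or "ab" is more probable, so neither dominates the other and both are maximal, while "ba" and "bb" are dominated by "aa". With a single posterior the ordering becomes total up to ties and the result reduces to the most probable sequence or sequences, as in the Viterbi setting.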


ESTIHMM for finding all maximal state sequences

Exact backward-forward algorithm

– developed by Jasper De Bock
– finds all maximal state sequences that correspond to a given observation sequence
– quadratic complexity in the number of nodes [linear]
– cubic complexity in the number of states [quadratic]
– linear complexity in the number of maximal sequences [linear]

Python code
– written by Jasper De Bock

Current (toy) applications in HMMs: character recognition, finding gene islands


Sets of desirable gambles


@ARTICLE{moral2005,
  author = {Moral, Serafín},
  title = {Epistemic irrelevance on sets of desirable gambles},
  journal = {Annals of Mathematics and Artificial Intelligence},
  year = 2005,
  volume = 45,
  pages = {197--214},
  doi = {10.1007/s10472-005-9011-0}
}


Most comprehensive approach so far: note on arXiv