

Discrete Optimization in Computer Vision

Nikos Komodakis, Ecole des Ponts ParisTech, LIGM

Traitement de l’information et vision artificielle

Message passing algorithms for energy minimization

Message-passing algorithms

Central concept: messages

These methods work by propagating messages across the MRF graph

Widely used algorithms in many areas

Message-passing algorithms

But how do messages relate to optimizing the energy?

Let’s look at a simple example first: we will examine the case where the MRF graph is a chain

Message-passing on chains

[Figure: the MRF graph (a chain) and the corresponding lattice, or trellis]

Global minimum in linear time

Optimization proceeds in two passes:

Forward pass (dynamic programming)

Backward pass

Message-passing on chains

(example on board)

(algebraic derivation of messages)

[Figure: chain MRF with nodes p, q, r, s]

Forward pass (dynamic programming)

Terms of the energy involving node p (unary potential $\theta_p$, pairwise potential $\theta_{pq}$):

$$\theta_p(x_p) + \theta_{pq}(x_p, x_q)$$

Message from p to q:

$$M_{pq}(j) = \min_i \left[ \theta_p(i) + \theta_{pq}(i, j) \right]$$

[Figure: trellis showing the values of message $M_{pq}$ at each label of node q]
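As a minimal sketch of the first forward message (for each label j of q, minimize over the labels i of p the unary cost of p plus the pairwise cost of the pair): the function name `leaf_message` and the numeric potentials below are hypothetical, chosen only for illustration, and are not the values on the slide.

```python
def leaf_message(unary_p, pairwise_pq):
    """Min-sum message from p to q: M_pq[j] = min_i (unary_p[i] + pairwise_pq[i][j])."""
    num_labels_q = len(pairwise_pq[0])
    return [
        min(unary_p[i] + pairwise_pq[i][j] for i in range(len(unary_p)))
        for j in range(num_labels_q)
    ]

# Illustrative potentials (hypothetical values).
unary_p = [1.0, 3.0]            # cost of each label at node p
pairwise_pq = [[0.0, 2.0],      # pairwise_pq[i][j]: cost of (x_p = i, x_q = j)
               [2.0, 0.0]]
M_pq = leaf_message(unary_p, pairwise_pq)
print(M_pq)  # [1.0, 3.0]
```

Each entry of the message costs one minimization over the labels of p, so a message over L labels takes O(L^2) operations.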

Forward pass (dynamic programming)

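[Figure: chain p, q, r, s]

Message from q to r, reusing the message already computed from p to q ($\theta_q$, $\theta_{qr}$: unary and pairwise potentials):

$$M_{qr}(k) = \min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, k) \right]$$

[Figure: trellis showing the values of message $M_{qr}$ at each label of node r]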
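The recursion can be sketched as a loop over the chain, where each message reuses the previous one. This is an illustrative sketch only: `forward_messages`, the list-based data layout and the example potentials are assumptions, not taken from the slides.

```python
def forward_messages(unaries, pairwise):
    """Forward (left-to-right) min-sum messages on a chain.

    unaries[t][i]     : cost of label i at node t
    pairwise[t][i][j] : cost of (x_t = i, x_{t+1} = j)
    Returns messages[t][j], the message sent from node t to node t + 1.
    """
    messages = []
    incoming = [0.0] * len(unaries[0])  # the first node receives no message
    for t in range(len(unaries) - 1):
        outgoing = [
            min(unaries[t][i] + incoming[i] + pairwise[t][i][j]
                for i in range(len(unaries[t])))
            for j in range(len(unaries[t + 1]))
        ]
        messages.append(outgoing)
        incoming = outgoing  # reuse of computations: this is dynamic programming
    return messages

# Hypothetical 3-node, 2-label chain with a Potts pairwise cost.
unaries = [[0.0, 2.0], [1.0, 0.0], [3.0, 0.0]]
potts = [[0.0, 1.0], [1.0, 0.0]]
pairwise = [potts, potts]

messages = forward_messages(unaries, pairwise)
# Minimum energy: best label at the last node, given the last message.
min_energy = min(u + m for u, m in zip(unaries[-1], messages[-1]))
print(messages, min_energy)  # [[0.0, 1.0], [1.0, 1.0]] 1.0
```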

Forward pass (dynamic programming)

[Figure: trellis for chain p, q, r, s showing the values of messages $M_{qr}$ and $M_{rs}$; the forward pass ends at the last node s]

Min-marginal for node s and label j:

$$\min_{\mathbf{x} \,:\, x_s = j} E(\mathbf{x})$$

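[Figure: chain p, q, r, s, with labels recovered from the last node back to the first: $x_s$, $x_r$, $x_q$, $x_p$]

Backward pass

Once the label $x_s$ of the last node is fixed, each earlier label is the minimizer that produced the corresponding forward message:

$$M_{rs}(x_s) = \min_j \left[ \theta_r(j) + M_{qr}(j) + \theta_{rs}(j, x_s) \right]$$

$$x_r = \arg\min_j \left[ \theta_r(j) + M_{qr}(j) + \theta_{rs}(j, x_s) \right]$$

$$M_{qr}(x_r) = \min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, x_r) \right]$$

$$x_q = \arg\min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, x_r) \right]$$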
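Putting the two passes together, a hedged sketch of the whole procedure on a chain could look as follows. The function name `chain_map`, the data layout and the example potentials are hypothetical; the last node's label is chosen by minimizing its unary cost plus its incoming message.

```python
def chain_map(unaries, pairwise):
    """Exact minimizer of a chain energy: forward pass, then backward pass.

    unaries[t][i]     : cost of label i at node t
    pairwise[t][i][j] : cost of (x_t = i, x_{t+1} = j)
    """
    n = len(unaries)
    # Forward pass: incoming[t][j] is the message arriving at node t from the left.
    incoming = [[0.0] * len(unaries[0])]
    for t in range(n - 1):
        incoming.append([
            min(unaries[t][i] + incoming[t][i] + pairwise[t][i][j]
                for i in range(len(unaries[t])))
            for j in range(len(unaries[t + 1]))
        ])
    # Backward pass: fix the last label, then trace the minimizers back.
    labels = [0] * n
    labels[-1] = min(range(len(unaries[-1])),
                     key=lambda j: unaries[-1][j] + incoming[-1][j])
    energy = unaries[-1][labels[-1]] + incoming[-1][labels[-1]]
    for t in range(n - 2, -1, -1):
        nxt = labels[t + 1]
        labels[t] = min(range(len(unaries[t])),
                        key=lambda j: unaries[t][j] + incoming[t][j] + pairwise[t][j][nxt])
    return labels, energy

# Hypothetical 3-node, 2-label chain with a Potts pairwise cost.
unaries = [[0.0, 2.0], [1.0, 0.0], [3.0, 0.0]]
potts = [[0.0, 1.0], [1.0, 0.0]]
labels, energy = chain_map(unaries, [potts, potts])
print(labels, energy)  # [0, 1, 1] 1.0
```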

Message-passing on chains

How can I compute min-marginals for any node in the chain?

How to compute min-marginals for all nodes efficiently?

What is the running time of message-passing on chains?
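One possible answer to the questions above, as a sketch: send messages in both directions along the chain, and add the unary potential of a node to the two messages it receives. Each message costs O(L^2) for L labels, so two sweeps over n nodes cost O(n L^2) in total. The name `chain_min_marginals` and the example potentials are hypothetical.

```python
def chain_min_marginals(unaries, pairwise):
    """Min-marginals of every node and label on a chain, via two sweeps.

    unaries[t][i]     : cost of label i at node t
    pairwise[t][i][j] : cost of (x_t = i, x_{t+1} = j)
    """
    n = len(unaries)
    fwd = [[0.0] * len(u) for u in unaries]  # message arriving from the left
    bwd = [[0.0] * len(u) for u in unaries]  # message arriving from the right
    for t in range(1, n):
        for j in range(len(unaries[t])):
            fwd[t][j] = min(unaries[t - 1][i] + fwd[t - 1][i] + pairwise[t - 1][i][j]
                            for i in range(len(unaries[t - 1])))
    for t in range(n - 2, -1, -1):
        for i in range(len(unaries[t])):
            bwd[t][i] = min(unaries[t + 1][j] + bwd[t + 1][j] + pairwise[t][i][j]
                            for j in range(len(unaries[t + 1])))
    # Min-marginal = unary potential + all incoming messages.
    return [[unaries[t][k] + fwd[t][k] + bwd[t][k] for k in range(len(unaries[t]))]
            for t in range(n)]

# Hypothetical 3-node, 2-label chain with a Potts pairwise cost.
unaries = [[0.0, 2.0], [1.0, 0.0], [3.0, 0.0]]
potts = [[0.0, 1.0], [1.0, 0.0]]
min_marginals = chain_min_marginals(unaries, [potts, potts])
print(min_marginals)  # [[1.0, 2.0], [2.0, 1.0], [4.0, 1.0]]
```

The smallest entry in every row is the same value: the global minimum of the energy.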

Message-passing on trees

We can apply the same idea to tree-structured graphs

Slight generalization from chains

The resulting algorithm is called belief propagation (it also goes by many other names, e.g., max-product, min-sum; for chains, it is often called the Viterbi algorithm)

Belief propagation (BP)

Dynamic programming: global minimum in linear time

BP proceeds in two passes:

Inward pass (dynamic programming)

Outward pass

Gives min-marginals

BP on a tree [Pearl’88]

[Figure: tree containing nodes p, q, r, with the root and the leaves marked]

Inward pass (dynamic programming)

Terms of the energy involving node p (unary potential $\theta_p$, pairwise potential $\theta_{pq}$):

$$\theta_p(x_p) + \theta_{pq}(x_p, x_q)$$

Message from p to q:

$$M_{pq}(j) = \min_i \left[ \theta_p(i) + \theta_{pq}(i, j) \right]$$

[Figure: values of message $M_{pq}$ at each label of node q]

Inward pass (dynamic programming)

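Message from q to r, reusing the message from p to q:

$$M_{qr}(k) = \min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, k) \right]$$

[Figure: inward messages propagating from the leaves toward the root]

Outward pass

[Figure: outward messages propagating from the root back toward the leaves]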

BP on a tree: min-marginals

Min-marginal for node q and label j:

$$\min_{\mathbf{x} \,:\, x_q = j} E(\mathbf{x}) = \theta_q(j) + M_{pq}(j) + M_{rq}(j)$$
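A small sketch of min-marginals on a tree, under stated assumptions: the tree is given as an edge list, potentials are stored in dictionaries, and messages are computed recursively with caching rather than by an explicit inward and outward schedule (the messages obtained are the same). All names (`tree_min_marginals`, the node names, the potentials) are hypothetical.

```python
from functools import lru_cache

def tree_min_marginals(unaries, edges, pairwise):
    """Min-marginals on a tree-structured MRF by min-sum belief propagation.

    unaries[p][i]        : cost of label i at node p
    pairwise[(p, q)][i][j]: cost of (x_p = i, x_q = j), one entry per edge
    """
    neighbors = {p: [] for p in unaries}
    for p, q in edges:
        neighbors[p].append(q)
        neighbors[q].append(p)

    def edge_cost(p, q, i, j):
        if (p, q) in pairwise:
            return pairwise[(p, q)][i][j]
        return pairwise[(q, p)][j][i]

    @lru_cache(maxsize=None)
    def message(p, q):
        # Combine everything p has received from its other neighbors.
        incoming = [message(r, p) for r in neighbors[p] if r != q]
        return tuple(
            min(unaries[p][i] + sum(m[i] for m in incoming) + edge_cost(p, q, i, j)
                for i in range(len(unaries[p])))
            for j in range(len(unaries[q]))
        )

    # Min-marginal = unary potential + sum of all incoming messages.
    return {q: [unaries[q][j] + sum(message(p, q)[j] for p in neighbors[q])
                for j in range(len(unaries[q]))]
            for q in unaries}

# Hypothetical star-shaped tree: q is connected to p, r and s.
unaries = {"p": [0.0, 2.0], "q": [1.0, 0.0], "r": [3.0, 0.0], "s": [0.0, 1.0]}
edges = [("p", "q"), ("r", "q"), ("s", "q")]
potts = [[0.0, 1.0], [1.0, 0.0]]
pairwise = {edge: potts for edge in edges}
min_marginals = tree_min_marginals(unaries, edges, pairwise)
print(min_marginals["q"])  # [2.0, 2.0]
```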

Belief propagation: message-passing on trees

min-marginals = sum of all incoming messages + unary potential

What is the running time of message-passing for trees?

Message-passing on chains

Essentially, message passing on chains is dynamic programming

Dynamic programming means reuse of computations

Generalizing belief propagation

Key property: min(a+b, a+c) = a + min(b, c)

BP can be generalized to any pair of operators satisfying the above property

E.g., instead of (min, +), we could have:

(max, *): the resulting algorithm is called max-product. What does it compute?

(+, *): the resulting algorithm is called sum-product. What does it compute?
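To make the generalization concrete, here is a hedged sketch of one chain routine parameterized by the two operators. With (min, +) it returns the minimum energy; with (max, *) applied to factors exp(-cost) it returns the largest product of factors, attained by the same labeling; with (+, *) it returns the sum over all labelings of the product of factors (the partition function). The names `chain_bp` and `brute_force` and the example values are hypothetical.

```python
import itertools
import math
import operator
from functools import reduce

def chain_bp(unaries, pairwise, combine, aggregate):
    """Aggregate, over all labelings of a chain, the combination of all potentials.

    combine plays the role of '+', aggregate the role of 'min' in min-sum.
    """
    belief = list(unaries[0])
    for t in range(1, len(unaries)):
        msg = [reduce(aggregate, [combine(belief[i], pairwise[t - 1][i][j])
                                  for i in range(len(belief))])
               for j in range(len(unaries[t]))]
        belief = [combine(unaries[t][j], msg[j]) for j in range(len(msg))]
    return reduce(aggregate, belief)

def brute_force(unaries, pairwise, combine, aggregate):
    """Same quantity by explicit enumeration (exponential cost, for checking only)."""
    values = []
    for x in itertools.product(*[range(len(u)) for u in unaries]):
        terms = [unaries[t][x[t]] for t in range(len(x))]
        terms += [pairwise[t][x[t]][x[t + 1]] for t in range(len(x) - 1)]
        values.append(reduce(combine, terms))
    return reduce(aggregate, values)

# Hypothetical 3-node, 2-label chain; factors are exp(-cost).
costs_u = [[0.0, 2.0], [1.0, 0.0], [3.0, 0.0]]
potts = [[0.0, 1.0], [1.0, 0.0]]
costs_p = [potts, potts]
factors_u = [[math.exp(-c) for c in row] for row in costs_u]
factors_p = [[[math.exp(-c) for c in row] for row in m] for m in costs_p]

min_energy = chain_bp(costs_u, costs_p, operator.add, min)               # min-sum
best_score = chain_bp(factors_u, factors_p, operator.mul, max)           # max-product
partition = chain_bp(factors_u, factors_p, operator.mul, operator.add)   # sum-product
print(min_energy, best_score, partition)
```

The routine never inspects the operators themselves; it only relies on `combine` distributing over `aggregate`, which is the key property stated above.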

Belief propagation as a distributive algorithm

BP works distributively

(as a result, it can be parallelized)

Essentially BP is a decentralized algorithm

Global results through local exchange of information

Simple example to illustrate this: counting soldiers

Counting soldiers in a line

Can you think of a distributive algorithm for the commander to count its soldiers?

(From David MacKay’s book “Information Theory, Inference, and Learning Algorithms”)

Counting soldiers in a line
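A sketch of the usual message-passing answer, assuming each soldier can only talk to the neighbors directly in front and behind: every soldier adds one to the number received from one side and passes it on to the other side. The function name `count_soldiers_in_line` is hypothetical.

```python
def count_soldiers_in_line(n):
    """Each soldier learns the total using only messages from direct neighbors."""
    from_front = [0] * n  # from_front[k]: number of soldiers in front of soldier k
    for k in range(1, n):
        from_front[k] = from_front[k - 1] + 1   # "add one and pass it back"
    from_back = [0] * n   # from_back[k]: number of soldiers behind soldier k
    for k in range(n - 2, -1, -1):
        from_back[k] = from_back[k + 1] + 1     # "add one and pass it forward"
    # Total seen by soldier k: those in front + those behind + himself.
    return [from_front[k] + from_back[k] + 1 for k in range(n)]

totals = count_soldiers_in_line(5)
print(totals)  # [5, 5, 5, 5, 5]
```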

Counting soldiers in a tree

Can we do the same for this case?

Counting soldiers in a tree
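The same rule extends to a tree, as the following sketch suggests: the message a soldier sends to one neighbor is one plus the sum of the messages received from all other neighbors. The function name `count_soldiers_in_tree` and the example tree are hypothetical.

```python
def count_soldiers_in_tree(edges):
    """Every soldier computes the total from the messages of direct neighbors."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    cache = {}

    def message(p, q):
        # Number of soldiers on p's side of the edge (p, q), including p.
        if (p, q) not in cache:
            cache[(p, q)] = 1 + sum(message(r, p) for r in neighbors[p] if r != q)
        return cache[(p, q)]

    return {q: 1 + sum(message(p, q) for p in neighbors[q]) for q in neighbors}

# Hypothetical tree with 6 soldiers.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e"), ("d", "f")]
totals = count_soldiers_in_tree(edges)
print(totals)
```

Because a tree has no cycles, the two sides of every edge are disjoint, so no soldier is counted twice.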

Counting soldiers

Simple example to illustrate BP

The same idea can be used in cases which are seemingly more complex:

counting paths through a point in a grid

probability of passing through a node in the grid
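A hedged sketch of the grid example, under assumptions that are not stated on the slide: paths go from the top-left to the bottom-right node, move only right or down, and are all equally likely. Forward counts and backward counts play the role of the two messages, and their product gives the number of paths through a node. The function name `paths_through_each_node` is hypothetical.

```python
def paths_through_each_node(rows, cols):
    """Count monotone (right/down) paths through every node of a rows x cols grid."""
    fwd = [[0] * cols for _ in range(rows)]  # paths from the start to (r, c)
    bwd = [[0] * cols for _ in range(rows)]  # paths from (r, c) to the end
    for r in range(rows):
        for c in range(cols):
            if r == 0 and c == 0:
                fwd[r][c] = 1
            else:
                fwd[r][c] = (fwd[r - 1][c] if r > 0 else 0) + (fwd[r][c - 1] if c > 0 else 0)
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if r == rows - 1 and c == cols - 1:
                bwd[r][c] = 1
            else:
                bwd[r][c] = ((bwd[r + 1][c] if r < rows - 1 else 0)
                             + (bwd[r][c + 1] if c < cols - 1 else 0))
    total = fwd[rows - 1][cols - 1]
    through = [[fwd[r][c] * bwd[r][c] for c in range(cols)] for r in range(rows)]
    return through, total

through, total = paths_through_each_node(3, 3)
probability_center = through[1][1] / total  # chance that a uniformly random path visits the center
print(through[1][1], total)  # 4 6
```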

In general, we have used the same idea for minimizing MRFs (a much more general problem)

Graphs with loops

How about counting these soldiers?

Hmmm…overcounting?
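To see the overcounting, one can run the same counting rule iteratively, with all messages updated in parallel. This is only an illustrative sketch: the parallel schedule, the name `iterated_soldier_counts` and the two tiny graphs are assumptions. On a tree the counts settle at the true total; on a cycle the messages keep feeding back into themselves and the counts grow with every iteration.

```python
def iterated_soldier_counts(edges, iterations):
    """Apply the 'one plus what my other neighbors told me' rule in parallel rounds."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    msg = {(p, q): 0 for p in neighbors for q in neighbors[p]}
    for _ in range(iterations):
        msg = {(p, q): 1 + sum(msg[(r, p)] for r in neighbors[p] if r != q)
               for (p, q) in msg}
    return {q: 1 + sum(msg[(p, q)] for p in neighbors[q]) for q in neighbors}

line = [("a", "b"), ("b", "c")]                # a tree: 3 soldiers in a line
cycle = [("a", "b"), ("b", "c"), ("c", "a")]   # a loop: 3 soldiers in a ring

line_counts = iterated_soldier_counts(line, 5)
cycle_counts = iterated_soldier_counts(cycle, 5)
print(line_counts)   # {'a': 3, 'b': 3, 'c': 3}
print(cycle_counts)  # {'a': 11, 'b': 11, 'c': 11}
```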
