Discrete Optimization in Computer Vision
Nikos Komodakis, Ecole des Ponts ParisTech, LIGM
Information Processing and Computer Vision
Message-passing algorithms
Central concept: messages
These methods work by propagating messages across the MRF graph
Widely used algorithms in many areas
Message-passing algorithms
But how do messages relate to optimizing the energy?
Let’s look at a simple example first: the case where the MRF graph is a chain
Message-passing on chains
Global minimum in linear time
Optimization proceeds in two passes:
- Forward pass (dynamic programming)
- Backward pass
[Figure: chain p - q - r - s]
Node p sends a message to its neighbor q:
$M_{pq}(j) = \min_i \left[ \theta_p(i) + \theta_{pq}(i, j) \right]$
i.e., for each label j of q, the minimum over labels i of p of the unary potential $\theta_p(i)$ plus the pairwise potential $\theta_{pq}(i, j)$.
[Figure: example numeric values of the message $M_{pq}$]
Forward pass (dynamic programming)
[Figure: messages propagated left to right along the chain p - q - r - s, ending with the message $M_{rs}$ arriving at node s]
Min-marginal for node s and label j:
$\min_{x:\, x_s = j} E(x)$

Backward pass
[Figure: chain p - q - r - s]
The forward messages satisfy
$M_{rs}(x_s) = \min_j \left[ \theta_r(j) + M_{qr}(j) + \theta_{rs}(j, x_s) \right]$
$M_{qr}(x_r) = \min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, x_r) \right]$
so, after choosing $x_s$ to minimize $\theta_s(x_s) + M_{rs}(x_s)$, the remaining optimal labels are recovered by backtracking:
$x_r = \arg\min_j \left[ \theta_r(j) + M_{qr}(j) + \theta_{rs}(j, x_s) \right]$
$x_q = \arg\min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, x_r) \right]$
and similarly for $x_p$.
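As a concrete illustration, here is a minimal sketch of the two passes in Python (the function name `chain_min_sum`, the NumPy-based layout of the potentials, and the 0-indexed chain are my own choices, not from the slides):

```python
import numpy as np

def chain_min_sum(unary, pairwise):
    """Exact MAP inference on a chain MRF by min-sum message passing.

    unary:    list of n vectors, unary[p][i] = theta_p(i)
    pairwise: list of n-1 matrices, pairwise[p][i, j] = theta_{p,p+1}(i, j)
    Returns the optimal labeling and the minimum energy.
    """
    n = len(unary)
    argmins = [None] * n                     # back-pointers for backtracking
    incoming = np.zeros_like(unary[0])       # message arriving at current node
    # Forward pass: propagate messages left to right
    for p in range(n - 1):
        # cost[i, j] = theta_p(i) + incoming(i) + theta_pq(i, j)
        cost = (unary[p] + incoming)[:, None] + pairwise[p]
        argmins[p + 1] = cost.argmin(axis=0)
        incoming = cost.min(axis=0)
    # Backward pass: pick the best last label, then backtrack
    labels = [0] * n
    labels[-1] = int((unary[-1] + incoming).argmin())
    energy = float((unary[-1] + incoming).min())
    for p in range(n - 2, -1, -1):
        labels[p] = int(argmins[p + 1][labels[p + 1]])
    return labels, energy
```

The forward loop stores, for each node, the minimizing predecessor label (a back-pointer); the backward loop then reads the global minimizer off these pointers, so the total cost is linear in the chain length.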
Message-passing on chains
How can I compute min-marginals for any node in the chain?
How to compute min-marginals for all nodes efficiently?
What is the running time of message-passing on chains?
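One answer to the questions above, sketched under the same assumptions (the helper name `chain_min_marginals` and the NumPy potential layout are my own): send messages along the chain in both directions; each node's min-marginal is then its unary potential plus the two incoming messages, and the whole computation takes two linear sweeps.

```python
import numpy as np

def chain_min_marginals(unary, pairwise):
    """Min-marginals for every node of a chain MRF.

    One left-to-right sweep and one right-to-left sweep of min-sum
    messages; node q's min-marginal is its unary potential plus the
    two incoming messages.  Each sweep costs O(n k^2) for n nodes and
    k labels, so all n min-marginals cost O(n k^2) in total.
    """
    n, k = len(unary), len(unary[0])
    left = [np.zeros(k) for _ in range(n)]   # messages flowing rightward
    right = [np.zeros(k) for _ in range(n)]  # messages flowing leftward
    for p in range(n - 1):
        cost = (unary[p] + left[p])[:, None] + pairwise[p]
        left[p + 1] = cost.min(axis=0)
    for p in range(n - 1, 0, -1):
        cost = pairwise[p - 1] + (unary[p] + right[p])[None, :]
        right[p - 1] = cost.min(axis=1)
    return [unary[q] + left[q] + right[q] for q in range(n)]
```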
Message-passing on trees
We can apply the same idea to tree-structured graphs
Slight generalization from chains
The resulting algorithm is called belief propagation (it also goes under many other names, e.g., max-product, min-sum, etc.; for chains it is often called the Viterbi algorithm)
Dynamic programming: global minimum in linear time
BP proceeds in two passes:
- Inward pass (dynamic programming)
- Outward pass
Gives min-marginals
BP on a tree [Pearl’88]
[Figure: tree-structured graph; messages are sent from the leaves toward the root]
Inward pass (dynamic programming)
A leaf p sends a message to its neighbor q:
$M_{pq}(j) = \min_i \left[ \theta_p(i) + \theta_{pq}(i, j) \right]$
[Figure: example numeric values of the message $M_{pq}$]
BP on a tree: min-marginals
Min-marginal for node q and label j:
$\min_{x:\, x_q = j} E(x) = \theta_q(j) + M_{pq}(j) + M_{rq}(j)$
(here p and r are the neighbors of q)
Belief propagation: message-passing on trees
min-marginals = sum of all incoming messages + unary potential
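The two passes on a tree can be sketched as follows (a minimal illustration; the helper name `tree_min_marginals`, the dict-based graph encoding, and the recursive traversal are my own assumptions, not from the slides):

```python
import numpy as np
from collections import defaultdict

def tree_min_marginals(nodes, edges, unary, root=0):
    """Min-marginals on a tree MRF via two-pass belief propagation.

    nodes: list of node ids
    edges: dict mapping (p, q) with p < q to a matrix theta_pq[i, j]
    unary: dict mapping node id to its unary vector theta_p
    """
    adj = defaultdict(list)
    for (p, q) in edges:
        adj[p].append(q)
        adj[q].append(p)

    def theta(p, q):  # pairwise matrix indexed [label of p, label of q]
        return edges[(p, q)] if (p, q) in edges else edges[(q, p)].T

    msg = {}  # msg[(p, q)] = message sent from p to q

    def send(p, q):
        # Fold in p's unary and every message p received from
        # neighbors other than q, then minimize over p's label.
        total = unary[p].copy()
        for r in adj[p]:
            if r != q:
                total = total + msg[(r, p)]
        msg[(p, q)] = (total[:, None] + theta(p, q)).min(axis=0)

    # Inward pass: post-order traversal, messages flow leaf -> root.
    order = []
    def dfs(p, parent):
        for q in adj[p]:
            if q != parent:
                dfs(q, p)
                send(q, p)
        order.append((p, parent))
    dfs(root, None)

    # Outward pass: reverse order, messages flow root -> leaf.
    for p, parent in reversed(order):
        if parent is not None:
            send(parent, p)

    # Min-marginal of p = unary + sum of all incoming messages.
    return {p: unary[p] + sum(msg[(r, p)] for r in adj[p]) for p in nodes}
```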
Message-passing on chains
Essentially, message passing on chains is dynamic programming
Dynamic programming means reuse of computations
Generalizing belief propagation
Key property (distributivity): min(a+b, a+c) = a + min(b, c)
BP can be generalized to any pair of operators satisfying this property. E.g., instead of (min, +), we could have:
- (max, *): the resulting algorithm is called max-product. What does it compute?
- (+, *): the resulting algorithm is called sum-product. What does it compute?
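This generalization can be illustrated by parameterizing the forward pass with the two operators (a sketch; `chain_forward` and its calling convention are hypothetical, not from the slides):

```python
import numpy as np

def chain_forward(unary, pairwise, combine, reduce_):
    """Forward message passing on a chain (length >= 2), parameterized
    by the two semiring operators.

    With combine=np.add and reduce_=np.min this is min-sum and returns
    the minimum energy; with combine=np.multiply and reduce_=np.sum on
    exponentiated potentials it is sum-product and returns the
    partition function Z.
    """
    incoming = None
    for p in range(len(unary) - 1):
        node = unary[p] if incoming is None else combine(unary[p], incoming)
        # Combine the node term with the pairwise table, then reduce
        # over the label of node p; the distributive law makes pushing
        # the reduction inside the combination exact.
        table = combine(node[:, None], pairwise[p])
        incoming = reduce_(table, axis=0)
    last = combine(unary[-1], incoming)
    return reduce_(last, axis=0)
```

For example, `chain_forward(u, pw, np.add, np.min)` returns the minimum energy, while `chain_forward([np.exp(-a) for a in u], [np.exp(-a) for a in pw], np.multiply, np.sum)` returns the partition function $Z = \sum_x \exp(-E(x))$, which is what sum-product computes on a chain.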
Belief propagation as a distributive algorithm
BP works distributively (as a result, it can be parallelized)
Essentially BP is a decentralized algorithm
Global results through local exchange of information
Simple example to illustrate this: counting soldiers
Counting soldiers in a line
Can you think of a distributive algorithm that lets the commander count the soldiers?
(From David MacKay’s book “Information Theory, Inference, and Learning Algorithms”)
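One such distributive algorithm, sketched in Python (the function `count_soldiers` and its interface are my own): each soldier adds one to the count received from behind and passes it on; any soldier can then total the two incoming counts plus himself.

```python
def count_soldiers(n, commander):
    """Distributive counting in a line of n soldiers.

    Each soldier passes forward the number received from behind plus
    one.  Any soldier (e.g., the commander at position `commander`)
    can then count the whole line locally:
    total = count from the front + count from the back + himself.
    """
    # message arriving from the left at each position
    from_left = [0] * n
    for i in range(1, n):
        from_left[i] = from_left[i - 1] + 1
    # message arriving from the right at each position
    from_right = [0] * n
    for i in range(n - 2, -1, -1):
        from_right[i] = from_right[i + 1] + 1
    return from_left[commander] + from_right[commander] + 1
```

Note the analogy with BP: the two sweeps are exactly the forward and backward message passes, and the local total plays the role of a min-marginal.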
Counting soldiers: a simple example to illustrate BP
The same idea applies in cases that seem more complex:
- counting the paths through a point in a grid
- computing the probability of passing through a node in the grid
In general, we have used the same idea for minimizing MRFs (a much more general problem)