Bayesian Belief Propagation Reading Group

Overview

Problem Background
  Bayesian Modelling
  Markov Random Fields

Examine use of Bayesian Belief Propagation (BBP) in three low level vision applications.
  Contour Motion Estimation
  Dense Depth Estimation
  Unwrapping Phase Images

Convergence Issues

Conclusions

Problem Background

A problem of probabilistic inference

Estimate unknown variables given observed data.

For low level vision: estimate unknown scene properties (e.g. depth) from image properties (e.g. intensity gradients).

Bayesian models in low level vision

A statistical description of an estimation problem. Given data d, we want to estimate unknown parameters u.

Two components:
  Prior Model p(u): captures known information about the unknown data and is independent of the observed data. Distribution of probable solutions.
  Sensor Model p(d|u): describes the relationship between sensed measurements d and unknown hidden data u.

Combine using Bayes' Rule to give the posterior:

$p(\mathbf{u} \mid \mathbf{d}) \propto p(\mathbf{d} \mid \mathbf{u})\, p(\mathbf{u})$
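As a minimal sketch of how the prior and sensor model combine, the snippet below applies Bayes' Rule to a discrete unknown. The function name `posterior` and the three-candidate example values are hypothetical, chosen only for illustration; they do not come from the papers discussed.

```python
def posterior(prior, likelihood):
    """Combine a discrete prior p(u) and likelihood p(d|u) into p(u|d)."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalised)  # p(d), the normalising constant
    return [v / evidence for v in unnormalised]


# Hypothetical example: three candidate scene values, equally likely a priori.
prior = [1 / 3, 1 / 3, 1 / 3]
# Hypothetical sensor model: how well each candidate explains the observation.
likelihood = [0.1, 0.6, 0.3]
post = posterior(prior, likelihood)
```

With a flat prior the posterior simply follows the sensor model; a peaked prior would pull the estimate towards the solutions it favours.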

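Markov Random Fields

Pairwise Markov Random Field: model commonly used to represent images.

[Figure: hidden scene nodes (u) linked to their neighbours by the prior model, each connected to an image data node (d) by the sensor model; node u_i is shown with its neighbourhood N_i.]

$p(u_i \mid \mathbf{u}) = p(u_i \mid \{u_k\}), \quad k \in N_i$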

Contour Motion Estimation

Yair Weiss

Estimate the motion of a contour using only local information.
Less computationally intensive method than optical flow.
Application example: object tracking.
Difficult due to the aperture problem.


Aperture Problem

[Figure: contour motion, with panels "Ideal" and "Actual".]

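Prior Model: $\mathbf{u}_{i+1} = \mathbf{u}_i + \eta$, where $\eta \sim N(0, \sigma_p)$

[Figure: chain of contour points i-2, ..., i+2, each with an observed data node d_i and a hidden node u_i linked to its neighbours u_{i-1} and u_{i+1}.]

$p(\mathbf{u}_{i+1} \mid \mathbf{u}_i) \propto \exp\left(-\frac{\lVert \mathbf{u}_{i+1} - \mathbf{u}_i \rVert^2}{2\sigma_p^2}\right)$

$\mathbf{u}_i = (\dot{x}_i, \dot{y}_i, 1)^T$

Brightness Constant Constraint Equation:

$\frac{dI}{dt} = \frac{\partial I}{\partial x}\dot{x} + \frac{\partial I}{\partial y}\dot{y} + \frac{\partial I}{\partial t} = 0$

$\mathbf{d}_i = \left(\frac{\partial I_i}{\partial x}, \frac{\partial I_i}{\partial y}, \frac{\partial I_i}{\partial t}\right)^T$, where $I_i = I(x_i, y_i, t)$

$\mathbf{d}_i^T \mathbf{u}_i = 0$

$p(\mathbf{d}_i \mid \mathbf{u}_i) \propto \exp\left(-\frac{(\mathbf{d}_i^T \mathbf{u}_i)^2}{2\sigma_i^2}\right)$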
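The aperture problem can be read directly off the sensor model: a single measurement d_i only constrains the velocity component along the image gradient. The sketch below evaluates the log of the sensor model, up to a constant, for several velocities. The function name, the gradient values and the noise level are hypothetical and chosen only to make the point visible.

```python
def sensor_log_likelihood(d, u, sigma):
    """log p(d_i | u_i) up to a constant: -(d^T u)^2 / (2 sigma^2)."""
    dot = sum(dk * uk for dk, uk in zip(d, u))
    return -(dot ** 2) / (2.0 * sigma ** 2)


# Hypothetical measurement: a vertical edge moving right at 1 pixel per frame.
# Gradients (I_x, I_y, I_t) = (1, 0, -1) satisfy I_x * 1 + I_y * vy + I_t = 0.
d = (1.0, 0.0, -1.0)
sigma = 0.5  # assumed noise level

# Velocities written as u = (vx, vy, 1)^T. These differ only along the edge.
along_edge = [(1.0, vy, 1.0) for vy in (-2.0, 0.0, 3.0)]
scores = [sensor_log_likelihood(d, u, sigma) for u in along_edge]

# A velocity with a different component across the edge is penalised.
across_edge = sensor_log_likelihood(d, (2.0, 0.0, 1.0), sigma)
```

All three velocities in `along_edge` score equally, which is why local data alone cannot settle the motion and the prior has to propagate information along the contour.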

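1D Belief Propagation

$p(u_i \mid \mathbf{d}) \propto p(d_i \mid u_i)\, p(u_i \mid u_{i-1})\, p(u_{i+1} \mid u_i)$

[Figure: chain of hidden nodes u_{i-2}, ..., u_{i+2}, each attached to its data node d_{i-2}, ..., d_{i+2}.]

Belief at node i, combining the local data with the two incoming Gaussian messages, which have parameters $(\mu_i^{\alpha}, \sigma_i^{\alpha})$ and $(\mu_i^{\beta}, \sigma_i^{\beta})$:

$p(u_i \mid \mathbf{d}) = b(u_i) \propto \exp\left(-\frac{(u_i - d_i)^2}{2\sigma_i^2} - \frac{(u_i - \mu_i^{\alpha})^2}{2(\sigma_i^{\alpha})^2} - \frac{(u_i - \mu_i^{\beta})^2}{2(\sigma_i^{\beta})^2}\right)$

Estimate at node i (mean of the belief):

$u_i = \frac{\frac{1}{\sigma_i^2} d_i + \frac{1}{(\sigma_i^{\alpha})^2} \mu_i^{\alpha} + \frac{1}{(\sigma_i^{\beta})^2} \mu_i^{\beta}}{\frac{1}{\sigma_i^2} + \frac{1}{(\sigma_i^{\alpha})^2} + \frac{1}{(\sigma_i^{\beta})^2}}$

Message update (shown for α, passed from node i-1 to node i; the β update is the mirror image, passed from node i+1):

$\mu_i^{\alpha} = \frac{\frac{1}{\sigma_{i-1}^2} d_{i-1} + \frac{1}{(\sigma_{i-1}^{\alpha})^2} \mu_{i-1}^{\alpha}}{\frac{1}{\sigma_{i-1}^2} + \frac{1}{(\sigma_{i-1}^{\alpha})^2}}$

$(\sigma_i^{\alpha})^2 = \sigma_p^2 + \left(\frac{1}{\sigma_{i-1}^2} + \frac{1}{(\sigma_{i-1}^{\alpha})^2}\right)^{-1}$

Iterate until message values converge.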
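The message updates can be sketched for a scalar chain as follows. This is an illustrative reading of the 1D case, not Weiss's implementation: it assumes scalar u_i and d_i, represents "no information" by an infinite-variance message, and uses one forward and one backward sweep, which is enough for the messages on a chain to reach their fixed point. All names are hypothetical.

```python
def chain_belief_propagation(d, sigma, sigma_p):
    """Gaussian belief propagation on a 1D chain; returns posterior means."""
    n = len(d)
    inf = float("inf")
    # alpha[i] = (mean, variance) of the message reaching node i from the left,
    # beta[i] likewise from the right. Infinite variance means no information.
    alpha = [(0.0, inf)] * n
    beta = [(0.0, inf)] * n

    def combine(terms):
        """Product of Gaussians: precision-weighted mean, combined variance."""
        precision = sum(1.0 / var for _, var in terms)
        mean = sum(mu / var for mu, var in terms) / precision
        return mean, 1.0 / precision

    for i in range(1, n):  # forward sweep, left to right
        mean, var = combine([(d[i - 1], sigma[i - 1] ** 2), alpha[i - 1]])
        alpha[i] = (mean, var + sigma_p ** 2)  # the prior adds its variance
    for i in range(n - 2, -1, -1):  # backward sweep, right to left
        mean, var = combine([(d[i + 1], sigma[i + 1] ** 2), beta[i + 1]])
        beta[i] = (mean, var + sigma_p ** 2)

    # Belief mean at each node: local data combined with both messages.
    return [combine([(d[i], sigma[i] ** 2), alpha[i], beta[i]])[0]
            for i in range(n)]


# Hypothetical measurements and noise levels.
d = [1.0, 2.0, 0.0, 4.0]
sigma = [1.0, 0.5, 2.0, 1.0]
sigma_p = 0.8
means = chain_belief_propagation(d, sigma, sigma_p)
```

Nodes with a large sigma (unreliable local data) end up dominated by their neighbours' messages, which is the behaviour the contour example relies on.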

Results

Contour motion estimation [Weiss]
  Faster and more accurate solutions than pre-existing methods such as relaxation.
  Results after iteration n are optimal given all data within a distance of n nodes.
  Due to the nature of the problem, all velocity components should and do converge to the same value.

Interesting to try the algorithm on problems where this is not the case:
  Multiple motions within the same contour
  Rotating contours (requires a new prior model)

Only one-dimensional problems are tackled, but extensions to 2D are discussed.
Also use of the algorithm to solve the Direction Of Figure (DOF) problem using convexity (not discussed).

Dense Depth Estimation

Richard Szeliski

Depth Estimation

Depth at point i: $Z_i$

Disparity: $u_i = 1 / Z_i$

Assume smooth variation in disparity.

Define prior using Gibbs Distribution:

$p(\mathbf{u}) \propto \exp\left(-\frac{E_p(\mathbf{u})}{T}\right)$

$E_p(\mathbf{u})$ is an energy functional:

$E_p(\mathbf{u}) = \frac{1}{2} \sum_{(x,y)} \left[ \big(u(x+1,y) - u(x,y)\big)^2 + \big(u(x,y+1) - u(x,y)\big)^2 \right] = \frac{1}{2} \mathbf{u}^T \mathbf{A}_p \mathbf{u}$


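[Figure: image sequence with frames at T=0, T=1, ..., T=t, T=t+1, T=t+2, T=t+3.]

Measured disparity at point i: $d_i = 1 / Z_i$, with variance $\sigma_i^2$ related to the correlation metric.

$\mathbf{d} = (d_1, d_2, d_3, \ldots, d_n)^T, \quad \mathbf{u} = (u_1, u_2, u_3, \ldots, u_n)^T$

$\mathbf{d} = \mathbf{H}\mathbf{u} + \mathbf{R}$, where $\mathbf{R} \sim N(\mathbf{0}, \mathbf{\Sigma})$

where H is a measurement matrix and

$\mathbf{\Sigma} = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$

$p(\mathbf{d} \mid \mathbf{u}) \propto \exp\big(-E_s(\mathbf{u})\big)$

$E_s(\mathbf{u})$ is an energy functional:

$E_s(\mathbf{u}) = \frac{1}{2} (\mathbf{H}\mathbf{u} - \mathbf{d})^T \mathbf{\Sigma}^{-1} (\mathbf{H}\mathbf{u} - \mathbf{d})$

$E(\mathbf{u})$ is the overall energy: $E(\mathbf{u}) = E_p(\mathbf{u}) / T + E_s(\mathbf{u})$

$E(\mathbf{u}) = \frac{1}{2} \mathbf{u}^T \mathbf{A} \mathbf{u} - \mathbf{u}^T \mathbf{b} + c$, where $\mathbf{A} = \mathbf{A}_p / T + \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{H}$ and $\mathbf{b} = \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{d}$

Energy function $E(\mathbf{u})$ is minimized when $\mathbf{u} = \mathbf{A}^{-1}\mathbf{b}$.

Posterior: $p(\mathbf{u} \mid \mathbf{d}) \propto p(\mathbf{d} \mid \mathbf{u})\, p(\mathbf{u}) \propto \exp\big(-E(\mathbf{u})\big)$

Matrix $\mathbf{A}^{-1}$ is large and expensive to compute.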

Gauss-Seidel Relaxation

Minimize energy locally for each node $u_i$, keeping all other nodes fixed.

Leads to update rule:

$u_i = a_{ii}^{-1}\left(b_i - \sum_{j \in N_i} a_{ij} u_j\right)$

This is also the estimated mean of the marginal probability distribution $p(u_i \mid \mathbf{d})$ given by Gibbs Sampling.

For the 1-D example given by Weiss:

$u_i = \frac{\frac{1}{\sigma_i^2} d_i + \frac{1}{\sigma_p^2} u_{i-1} + \frac{1}{\sigma_p^2} u_{i+1}}{\frac{1}{\sigma_i^2} + \frac{2}{\sigma_p^2}}$

Belief propagation estimate, for comparison:

$u_i = \frac{\frac{1}{\sigma_i^2} d_i + \frac{1}{(\sigma_i^{\alpha})^2} \mu_i^{\alpha} + \frac{1}{(\sigma_i^{\beta})^2} \mu_i^{\beta}}{\frac{1}{\sigma_i^2} + \frac{1}{(\sigma_i^{\alpha})^2} + \frac{1}{(\sigma_i^{\beta})^2}}$
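A minimal sketch of Gauss-Seidel relaxation for the 1-D chain, assuming scalar measurements and the Gaussian prior and sensor models used in the contour example. The function name, the fixed number of sweeps and the example values are hypothetical. End nodes have a single neighbour, so their denominator carries one smoothness weight rather than two.

```python
def gauss_seidel_chain(d, sigma, sigma_p, sweeps=200):
    """Repeatedly set each u_i to its local energy minimiser, others fixed."""
    n = len(d)
    u = list(d)  # start from the raw measurements
    wp = 1.0 / sigma_p ** 2  # weight of each smoothness term
    for _ in range(sweeps):
        for i in range(n):
            wi = 1.0 / sigma[i] ** 2
            numerator, denominator = wi * d[i], wi
            for j in (i - 1, i + 1):  # neighbourhood N_i
                if 0 <= j < n:  # end nodes have only one neighbour
                    numerator += wp * u[j]
                    denominator += wp
            u[i] = numerator / denominator  # new value is reused immediately
    return u


# Hypothetical measurements and noise levels.
d = [1.0, 2.0, 0.0, 4.0]
sigma = [1.0, 0.5, 2.0, 1.0]
sigma_p = 0.8
u = gauss_seidel_chain(d, sigma, sigma_p)
```

Each sweep only moves information one node further along the chain, so a long chain needs many sweeps; message passing carries the same information in a forward and a backward pass.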

Results

Dense depth estimation [Szeliski]
  Dense (per pixel) depth estimation from a sequence of images with known camera motion.
  Adapted Kalman Filter: estimates of depth from time t-1 are used to improve estimates at time t.
  Uses multi-resolution technique (image pyramid) to improve convergence times.
  Uses Gibbs Sampling to sample the posterior.
    Stochastic Gauss-Seidel relaxation

Not guaranteed to converge.
Problem can be reformulated to use message passing.
Does not account for loops in the network; only recently has belief propagation in networks with loops been fully understood [Yedidia et al].

Unwrapping Phase Images

Brendan Frey et al

Wrapped phase images are produced by devices such as MRI and radar.
Unwrapping involves finding shift values between each point.
Unwrapping is simple in one dimension:
  One path through data
  Use local gradient to estimate shift.

For 2D images, the problem is more difficult (NP-hard):
  Many paths through the data
  Shifts along all paths must be consistent
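The one-dimensional case can be sketched in a few lines. This assumes the signal is wrapped into [0, 1) (a wavelength of 1) and that neighbouring true values differ by less than half a wavelength; the function name and the test ramp are hypothetical.

```python
def unwrap_1d(wrapped):
    """Unwrap a 1D signal wrapped into [0, 1), assuming neighbouring true
    values differ by less than half a wavelength."""
    unwrapped = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        gradient = cur - prev
        # A jump larger than 0.5 is taken to be a wrap: shift is -1, 0 or +1.
        shift = -round(gradient)
        unwrapped.append(unwrapped[-1] + gradient + shift)
    return unwrapped


true_phase = [0.3 * k for k in range(10)]  # hypothetical smooth ramp
wrapped = [p % 1.0 for p in true_phase]
recovered = unwrap_1d(wrapped)
```

There is only one path through the data here, so each shift can be chosen independently from the local gradient. In 2D the same shifts are reachable along many paths, which is what makes consistency the hard part.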

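Zero-Curl Constraint

[Figure: loop of four data points (x,y), (x+1,y), (x+1,y+1), (x,y+1), with shift nodes a(x,y), b(x+1,y), a(x,y+1), b(x,y) on its edges and a constraint node inside the loop. Legend: data point, shift node, constraint node.]

$a(x,y) + b(x+1,y) - a(x,y+1) - b(x,y) = 0$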
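A small check of the zero-curl constraint, as a sketch: shifts derived from a known unwrapped image sum to zero around every loop of four neighbouring points, and corrupting a single shift breaks the loops that contain it. The helper names, the sign convention for the shifts and the test image are assumptions made for illustration.

```python
def shifts_from_phase(true_phase):
    """Integer shifts a (x-direction) and b (y-direction) implied by a known
    unwrapped image, with wrapped value = true value mod 1."""
    h, w = len(true_phase), len(true_phase[0])
    k = [[int(v // 1) for v in row] for row in true_phase]  # wrap counts
    a = {(x, y): k[y][x + 1] - k[y][x]
         for y in range(h) for x in range(w - 1)}
    b = {(x, y): k[y + 1][x] - k[y][x]
         for y in range(h - 1) for x in range(w)}
    return a, b


def curl(a, b, x, y):
    """Left-hand side of the zero-curl constraint around one loop."""
    return a[(x, y)] + b[(x + 1, y)] - a[(x, y + 1)] - b[(x, y)]


# Hypothetical smooth image: a tilted plane.
true_phase = [[0.4 * x + 0.3 * y for x in range(4)] for y in range(4)]
a, b = shifts_from_phase(true_phase)
curls = [curl(a, b, x, y) for y in range(3) for x in range(3)]

# Corrupt one shift: the loop containing it no longer sums to zero.
a[(0, 0)] += 1
broken = curl(a, b, 0, 0)
```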

Sensor Data

Estimating relative shift values (variables a and b), each -1, 0 or 1, between each pair of neighbouring data points.
Use local image gradient as sensor input.

Sensor nodes:

$\mathbf{d} = \big(I(1,0) - I(0,0),\; I(0,1) - I(0,0),\; \ldots,\; I(x+1,y) - I(x,y),\; I(x,y+1) - I(x,y),\; \ldots,\; I(n,n+1) - I(n,n)\big)^T$

Hidden shift nodes:

$\mathbf{u} = \big(a(0,0),\; b(0,0),\; \ldots,\; a(x,y),\; b(x,y),\; \ldots,\; b(n,n)\big)^T$

Gaussian sensor model (estimate from wrapped image):

$p(\mathbf{d} \mid \mathbf{u}) \propto \exp\left(-\frac{1}{2} (\mathbf{u} - \mathbf{d})^T \mathbf{\Sigma}^{-1} (\mathbf{u} - \mathbf{d})\right)$

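Belief Propagation

[Figure: messages m1, m2 and m3 entering a constraint node, which sends m4 to a shift node; the shift node sends m5 onwards. Legend: data point, shift node, constraint node.]

Constraint node to shift node:

$m_{4i} = \sum_{j=-1}^{1} \sum_{k=-1}^{1} \sum_{l=-1}^{1} \delta(k + l - i - j)\, m_{1j}\, m_{2k}\, m_{3l}$

Shift node to constraint node:

$m_{5i} = m_{4i} \exp\left(-\frac{(d - i)^2}{2\sigma^2}\right)$

Belief:

$b\big(a(x,y) = i\big) = p\big(a(x,y) = i \mid \mathbf{d}\big) = \frac{m_{4i}\, m_{5i}}{\mathbf{m}_4^T \mathbf{m}_5}$

[Figure: belief over the shift a(x,y) for values -1, 0 and 1, on a scale up to 1.0.]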
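The constraint-to-shift message can be sketched as a triple loop over the three incoming messages. This is an illustration under assumptions, not the code of Frey et al.: messages are stored as dictionaries keyed by the shift value, the result is normalised to sum to 1, and all names are hypothetical.

```python
SHIFTS = (-1, 0, 1)


def constraint_to_shift(m1, m2, m3):
    """Message m4 over i in {-1, 0, 1}: sum of m1[j] * m2[k] * m3[l] over all
    (j, k, l) with k + l - i - j = 0, then normalised to sum to 1."""
    m4 = {i: 0.0 for i in SHIFTS}
    for j in SHIFTS:
        for k in SHIFTS:
            for l in SHIFTS:
                i = k + l - j  # the only i allowed by the delta function
                if i in m4:  # values outside {-1, 0, 1} are dropped
                    m4[i] += m1[j] * m2[k] * m3[l]
    total = sum(m4.values())
    if total == 0.0:
        raise ValueError("incoming messages leave no consistent shift")
    return {i: v / total for i, v in m4.items()}


# Incoming messages that are certain of j = 0, k = 1, l = 0 force i = 1.
certain = constraint_to_shift({-1: 0.0, 0: 1.0, 1: 0.0},
                              {-1: 0.0, 0: 0.0, 1: 1.0},
                              {-1: 0.0, 0: 1.0, 1: 0.0})

# Uniform incoming messages, as used to initialise the algorithm.
uniform = {i: 1.0 / 3.0 for i in SHIFTS}
from_uniform = constraint_to_shift(uniform, uniform, uniform)
```

Even with uniform inputs the outgoing message is not uniform: more combinations of the other three shifts are consistent with a shift of 0 than with a shift of -1 or +1.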

Results

Unwrapping phase images [Frey et al.]
  Initialize messages to a uniform distribution and iterate to convergence.
  Estimates a solution to an NP-hard problem in O(n) time, where n is the number of nodes.
  Reduction in reconstruction error over relaxation methods.

Does not account for loops in the network; messages could cycle, leading to incorrect belief estimates.
Not guaranteed to converge.

Convergence

Convergence is only guaranteed when the network is a tree structure and all data is available.

In networks with loops, messages can cycle, resulting in incorrect belief estimates.

Multi-resolution methods such as image pyramids can be used to speed up convergence times (and improve results).

Conclusion

BBP is used to infer the marginal posterior distribution of hidden information from observable data.
Efficient message passing system is linear in the number of nodes as opposed to exponential.
Propagates local information globally to achieve more reliable estimates.
Useful for low level vision applications:
  Contour Motion Estimation [Weiss]
  Dense Depth Estimation [Szeliski]
  Unwrapping Phase Images [Frey et al]

Improved results over standard relaxation algorithms.
Can be used in conjunction with a multi-resolution framework to improve convergence times.
Need to account for loops to prevent cycling of messages [Yedidia et al].
