Upload
henry-mathews
View
226
Download
2
Embed Size (px)
Citation preview
Efficient Inference for Fully-Connected CRFs with Stationarity
Yimeng Zhang, Tsuhan ChenCVPR 2012
Summary
• Explore object-class segmentation with fully-connected CRF models
• Only restriction on pairwise terms is `spatial stationarity’ (i.e. depend on relative locations)
• Show how efficient inference can be achieved by– Using a QP formulation– Using FFT to calculate gradients in complexity
(linear in) O(NlogN)
Fully-connected CRF model
• General pairwise CRF model:
• Image I• Class labeling, X:• Label set, L: • V = set of pixels, N_i = neighbourhood of pixel i,
Z(I) = partition function, psi = potential functions
Fully-connected CRF model
• General pairwise CRF model:
• In fully-connected CRF, for all i, N_i = V
Unary Potential
• Unary potential generates a score for each object class per pixel (TextonBoost)
Pairwise Potential
• Pairwise potential measures compatibility of the labels at each pair of pixels
• Combines spatial and colour contrast factors
Pairwise Potential
• Colour contrast:
• Spatial term:
Pairwise Potential
• Learning the spatial term
MAP inference using QP relaxation• Introduce a binary indicator variable for each
pixel and label
• MAP inference expressed as a quadratic integer program, and relaxed to give the QP
MAP inference using QP relaxation
• QP relaxation has been proved to be tight in all cases (Ravikumar ICML 2006 [24])
• Moreover, it is convex whenever matrix of edge-weights is negative-definite
• Additive bound for non-convex case• QP requires O(KN) variables, LP requires (K^2E)
MAP inference using QP relaxation
• Gradient
• Derive fixed-point update by forming Lagrangian and setting its derivative to 0
Illustration of QP updates
Efficiently evaluating the gradient
• Required summation
• Would be a convolution without the color term• With color term is requires 5D-filtering• Can be approximated by clustering into C color
clusters, => C convolutions across
Efficiently evaluating the gradient
• Hence, for the case x_i = x_j, we need to evaluate
• Instead, evaluate for C clusters (C = 10 to 15)
• where
• Finally, interpolate
Update complexity• FFTs of each spatial filters can be calculated in
advance (K^2 filters)• At each update, we require C FFTs calculating,
O(CNlogN)• K^2 convolutions are needed, each requiring a
multiplication, O(K^2CN)• Terms can be added in Fourier domain, => only KC inverse FFTs needed, O(KCNlogN)
• Run-time per iteration < 0.1s for 213x320 pixels (+ downsampling by factor of 5)
MSRC synthetic experiment
• Unary terms randomized• Spatial distributions set to ground-truth
MSRC synthetic experiment
• Running times
Sowerby synthetic experiment
MSRC full experiment
• Use TextonBoost unary potentials• Compare with several other CRFs with same
unaries– Grid only– Grid + P^N (Kohli, CVPR 2008)– Grid + P^N + Cooccurrence (Ladickỳ, ECCV 2010)– Fully-connected + Gaussian spatial (Krähenbühl,
NIPS 2011)
MSRC full experiment
• Qualitative comparison
MSRC full experiment
• Quantitative comparison– Overall
– Per-class
– Timing: 2-8s per image