Variational Algorithms for Marginal MAP
Qiang Liu, Alexander Ihler
Department of Computer Science, University of California, Irvine
Abstract
Marginal MAP tasks seek an optimal configuration of the marginal distribution over a subset of variables. Marginal MAP can be computationally much harder than more common inference tasks.
We show
• a general variational framework for marginal MAP problems
• analogues to Bethe, tree-reweighted, & mean-field approximations
• novel upper bounds via the tree-reweighted free energy
• “mixed” message passing and CCCP-based solvers
• conditions for global or local optimality of the solutions
• close connections to EM and variational EM approaches
Variational Form
Graphical Models
Graphical models:
• Factors & exponential family form
• Factors are associated with cliques of a graph G=(V,E)
Tasks [figure: graph with max nodes (B) and sum nodes (A); arrow labeled “Harder”]
Mixed inference problems can be hard even in trees, since the max and sum operations do not commute.
• A-B trees extend notion of efficient structure to mixed inference
• Ensure graph structure remains a tree during inference
• Two example sub-types:
[Figure: example graphs with sum and max nodes]
Example from D. Koller and N. Friedman (2009)
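A small worked example illustrates why the order of max and sum matters. The 2×2 table f(a,b) below is a hypothetical choice made for illustration and does not appear on the poster:

```latex
\[
f(a,b) = \begin{cases} 1 & a = b \\ 0 & a \neq b \end{cases},
\qquad a, b \in \{0, 1\}
\]
\[
\max_{b} \sum_{a} f(a,b) = \max(1, 1) = 1,
\qquad
\sum_{a} \max_{b} f(a,b) = 1 + 1 = 2 .
\]
```

In general, summing after maximizing can only give a larger value than maximizing after summing, so the two orders cannot be exchanged freely during elimination.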
Mixed-Inference (marginal MAP, MAP)
Sum-Inference (partition function, probability of evidence)
Max-Inference (MAP, MPE)
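The three tasks can be compared by brute force on a tiny model. The sketch below is only an illustration: the three-variable chain, the log-potential `theta`, and the split into sum variables `A` and max variables `B` are all hypothetical choices, not taken from the poster, and enumeration is only feasible for very small models.

```python
import itertools
import math

# Hypothetical toy model: three binary variables on a chain x0 - x1 - x2.
A = [0]        # indices of sum (marginalized) variables
B = [1, 2]     # indices of max variables
STATES = [0, 1]


def theta(x):
    """Hypothetical log-potential theta(x); not from the poster."""
    return 1.0 * (x[0] == x[1]) + 0.5 * (x[1] == x[2]) + 0.3 * x[0]


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def assemble(xa, xb):
    """Combine assignments to A and B into one full configuration."""
    x = [0] * (len(A) + len(B))
    for i, v in zip(A, xa):
        x[i] = v
    for i, v in zip(B, xb):
        x[i] = v
    return tuple(x)


def marginal_score(xb):
    """log sum_{x_A} exp(theta(x_A, x_B)) for a fixed x_B."""
    return logsumexp([theta(assemble(xa, xb))
                      for xa in itertools.product(STATES, repeat=len(A))])


all_x = list(itertools.product(STATES, repeat=len(A) + len(B)))

# Sum-inference: log partition function.
log_partition = logsumexp([theta(x) for x in all_x])
# Max-inference: value of the MAP configuration.
map_value = max(theta(x) for x in all_x)
# Mixed-inference: maximize over B after summing out A.
xb_star = max(itertools.product(STATES, repeat=len(B)), key=marginal_score)
marginal_map_value = marginal_score(xb_star)
```

For any model the three values are ordered as `map_value <= marginal_map_value <= log_partition`, which gives a quick sanity check on an implementation.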
Variational Algorithms
Messages between max nodes (B) and sum nodes (A):
• Sum-product: A → A ∪ B
• Max-product: B → B
• Match max and sum: B → A
Mixed-product message passing
• Start with “standard” weighted message passing
• Generalize zero-temperature limit results of Weiss et al. (2007)
• Apply limit directly to messages (… for Bethe, … for TRW)
• Match updates interpretable as a “local” marginal MAP problem
• Mixed marginals satisfy a reparameterization property
• Fixed points are locally optimal (similar to max-product results)
• Convergence can be an issue
Double-loop algorithms
• Decompose H into two parts, H = H⁺ − H⁻, & iteratively linearize H⁻
• CCCP algorithm: take H⁺, H⁻ to be convex
• Can also take H⁺ to be the Bethe approximation (non-convex)
• Iteratively solve sum-product and apply truncation correction
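The "decompose and linearize" idea can be seen on a scalar toy problem. The sketch below is not the poster's algorithm: it applies the generic convex-concave procedure to a hypothetical objective f(x) = u(x) − v(x) with convex u(x) = x⁴ and v(x) = 2x², chosen only because each inner step has a closed form. In the poster's setting the inner step would instead be a sum-product style problem.

```python
import math


def f(x):
    """Toy objective f(x) = u(x) - v(x), with u(x) = x**4 and v(x) = 2*x**2."""
    return x ** 4 - 2.0 * x ** 2


def v_grad(x):
    """Gradient of the convex part v(x) = 2*x**2 that gets linearized."""
    return 4.0 * x


def cccp_step(x_t):
    """Minimize u(x) - v_grad(x_t) * x over x in closed form.

    Setting the derivative 4*x**3 - v_grad(x_t) to zero gives a cube root.
    """
    g = v_grad(x_t)
    return math.copysign(abs(g / 4.0) ** (1.0 / 3.0), g)


def cccp(x0, iters=50):
    """Run the outer loop and return the whole sequence of iterates."""
    xs = [x0]
    for _ in range(iters):
        xs.append(cccp_step(xs[-1]))
    return xs


xs = cccp(0.1)
```

Each outer iteration replaces v by its tangent at the current point, so the surrogate upper-bounds f and the objective cannot increase from one iterate to the next. Starting from 0.1, the iterates approach the local minimum at x = 1.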
[Figure: “Type 1” and “Type 2” A-B tree sub-types]
Connections to EM
• Restrict to the mean-field like product subspace
• Coordinate-wise updates = in the primal:
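A minimal sketch of EM-style coordinate updates for marginal MAP, assuming one sum variable a and one max variable b. The table `theta` is hypothetical and the function names are illustrative; none of them come from the poster. The E-step sets q(a) = p(a | b) and the M-step picks the b that maximizes the expected log-potential under q.

```python
import math

# Hypothetical log-potential table theta[a][b]: 2 states of a, 3 states of b.
theta = [
    [2.0, 0.0, 1.0],
    [0.0, 1.5, 0.8],
]


def marginal_objective(b):
    """log sum_a exp(theta[a][b]): the marginal MAP objective at a fixed b."""
    return math.log(sum(math.exp(row[b]) for row in theta))


def e_step(b):
    """q(a) = p(a | b), the exact conditional of the sum variable."""
    weights = [math.exp(row[b]) for row in theta]
    z = sum(weights)
    return [w / z for w in weights]


def m_step(q):
    """Return argmax_b of E_q[theta(a, b)]."""
    return max(range(len(theta[0])),
               key=lambda b: sum(q[a] * theta[a][b] for a in range(len(theta))))


def em(b, iters=10):
    """Alternate E- and M-steps from an initial b; return (final b, trace)."""
    trace = [b]
    for _ in range(iters):
        b_new = m_step(e_step(b))
        if b_new == b:
            break
        b = b_new
        trace.append(b)
    return b, trace
```

With this table, `em(2)` moves to the global optimum b = 0, while `em(1)` never leaves b = 1. The objective never decreases along a trace, but the updates can stop at a local optimum after very few iterations.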
• Reformulate inference as a distributional optimization problem
• Define … and …
Sum-Inference
Max-Inference
Mixed-Inference
Sum-inference: (with equality when q = p)
Mixed-inference: (with equality when q = p(A|B) 1(B = B*) or similar)
This work
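The equations for the three variational forms did not survive in this text. A standard way to write them is sketched below. The notation is an assumption and may differ from the poster's symbols: θ(x) is the log-potential, q ranges over all distributions on x, and H denotes (conditional) entropy under q.

```latex
\begin{align*}
\text{Sum:}   \quad & \log \sum_{x} \exp\theta(x)
  = \max_{q} \big\{ \mathbb{E}_q[\theta(x)] + H(x; q) \big\} \\
\text{Max:}   \quad & \max_{x} \theta(x)
  = \max_{q} \; \mathbb{E}_q[\theta(x)] \\
\text{Mixed:} \quad & \max_{x_B} \log \sum_{x_A} \exp\theta(x)
  = \max_{q} \big\{ \mathbb{E}_q[\theta(x)] + H(x_A \mid x_B; q) \big\}
\end{align*}
```

Read this way, the mixed form differs from the sum form only in replacing the joint entropy by the conditional entropy of the sum variables given the max variables. Any fixed q gives a lower bound on the left-hand side.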
Variational Approximations
Bethe approximation (exact on A-B tree)
• “Truncated” free energy
Tree-reweighted approximation (convex combination of A-B trees)
• Dual in terms of edge appearances
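One plausible reading of “truncated” is sketched below, assuming the usual Bethe notation, which is not shown in this text: H_i is the entropy of node i, I_{ij} is the mutual information on edge (ij), E_A is the set of edges within A, and ∂_{AB} is the set of edges between A and B. Entropy and mutual information terms that involve only max nodes are dropped from the ordinary Bethe entropy:

```latex
\[
H(x_A \mid x_B) \;\approx\; \sum_{i \in A} H_i \;-\; \sum_{(ij) \in E_A \cup \partial_{AB}} I_{ij}
\]
```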
Experiments
Chain graphs
• G_A is a tree
• TRW1: type-1 only
• TRW2: ½ type-1, ½ type-2
• Bethe: most accurate
• EM: stuck quickly (2-3 iter.)
Grid graphs
• Attractive or mixed potentials
• G_A has cycles
• Similar trends
[Figure: results for attractive and mixed potentials; panels show % correct solutions and energy relative error]