Variational Bayes and Variational Message Passing

Mohammad Emtiyaz Khan

CS, UBC


Variational Inference

Find a tractable distribution Q(H) that closely approximates the true posterior distribution P(H|V).

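\[
\begin{aligned}
\log P(V) &= \sum_H Q(H) \log P(V) \\
&= \sum_H Q(H) \log \frac{P(H,V)}{P(H \mid V)} \\
&= \sum_H Q(H) \log \left[ \frac{P(H,V)}{Q(H)} \, \frac{Q(H)}{P(H \mid V)} \right] \\
&= \underbrace{\sum_H Q(H) \log \frac{P(H,V)}{Q(H)}}_{L(Q)} + \underbrace{\sum_H -Q(H) \log \frac{P(H \mid V)}{Q(H)}}_{KL(Q\|P)}
\end{aligned}
\]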


Variational Inference

\[ \log P(V) = L(Q) + KL(Q\|P) \tag{1} \]

\[ L(Q) = \sum_H Q(H) \log \frac{P(H,V)}{Q(H)} \tag{2} \]

\[ KL(Q\|P) = -\sum_H Q(H) \log \frac{P(H \mid V)}{Q(H)} \tag{3} \]

Find the Q(H) that maximizes the lower bound L(Q); since log P(V) does not depend on Q, this also minimizes the KL divergence.

For Q(H) = P(H|V) the KL divergence vanishes, but P(H|V) is intractable, which is why a variational approach is needed.

Trick: consider a restricted class of distributions Q(H), then find the member of that class that minimizes the KL divergence.
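The decomposition in equation (1) can be checked numerically. The following is a minimal sketch, assuming a single binary hidden variable and a made-up joint table; the names `joint`, `q`, `lower_bound` and `kl_divergence` are illustrative and do not come from the slides.

```python
import math

# Hypothetical joint P(H, V = v_obs) over a binary hidden variable H,
# already evaluated at the observed value of V (numbers are made up).
joint = {0: 0.10, 1: 0.30}

# Evidence P(V) and exact posterior P(H | V).
evidence = sum(joint.values())
posterior = {h: p / evidence for h, p in joint.items()}

# An arbitrary approximating distribution Q(H).
q = {0: 0.5, 1: 0.5}

def lower_bound(q, joint):
    """L(Q) = sum_H Q(H) log [P(H, V) / Q(H)], equation (2)."""
    return sum(q[h] * math.log(joint[h] / q[h]) for h in q)

def kl_divergence(q, posterior):
    """KL(Q||P) = -sum_H Q(H) log [P(H|V) / Q(H)], equation (3)."""
    return -sum(q[h] * math.log(posterior[h] / q[h]) for h in q)

bound = lower_bound(q, joint)
kl = kl_divergence(q, posterior)
log_evidence = math.log(evidence)
```

For any choice of `q`, `bound + kl` should match `log_evidence`, and `kl` should reach zero only when `q` equals the exact posterior.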


Factorized Distributions

\[ Q(H) = \prod_i Q_i(H_i) \tag{4} \]

Substituting this in the expression for lower bound,

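\[
\begin{aligned}
L(Q) &= \sum_H \prod_i Q_i(H_i) \log \frac{P(H,V)}{\prod_i Q_i(H_i)} \qquad \text{(Outline)} \\
&= \sum_H \prod_i Q_i(H_i) \log P(H,V) - \sum_H \prod_i Q_i(H_i) \sum_i \log Q_i(H_i) \\
&= \sum_H \prod_i Q_i(H_i) \log P(H,V) - \sum_i \sum_{H_i} Q_i(H_i) \log Q_i(H_i) \\
&= \sum_H \prod_i Q_i(H_i) \log P(H,V) + \sum_i H(Q_i)
\end{aligned}
\]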


Factorized Distributions

Now separate out all the terms in one factor Qj.

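\[
\begin{aligned}
L(Q) &= \sum_{H_j} Q_j(H_j) \underbrace{\langle \log P(H,V) \rangle_{\sim Q_j(H_j)}}_{\log Q_j^*(H_j)} + H(Q_j) + \sum_{i \neq j} H(Q_i) \\
&= -KL(Q_j \| Q_j^*) + \text{terms not in } Q_j
\end{aligned}
\tag{5}
\]

Here $\langle \cdot \rangle_{\sim Q_j(H_j)}$ denotes an expectation with respect to all factors except $Q_j(H_j)$.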

This bound is maximized wrt Qj when

\[ \log Q_j(H_j) = \log Q_j^*(H_j) = \langle \log P(H,V) \rangle_{\sim Q_j(H_j)} + c \tag{6} \]

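Now iterate over the factors, updating each Q_j in turn. No update can decrease L(Q), and L(Q) is bounded above by log P(V), so the bound is guaranteed to converge.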
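The iteration can be illustrated on a toy problem. This is only a sketch under stated assumptions: two binary hidden variables, a made-up joint table `joint[h1][h2]` already evaluated at the observed data, and a fully factorized Q(H) = Q1(H1) Q2(H2). All function names are hypothetical.

```python
import math

# Hypothetical joint P(H1, H2, V = v_obs) for two binary hidden variables,
# indexed as joint[h1][h2]; the numbers are made up for illustration.
joint = [[0.30, 0.05],
         [0.10, 0.15]]

def normalize(log_weights):
    """Exponentiate and normalize; this fixes the constant c in equation (6)."""
    peak = max(log_weights)
    weights = [math.exp(w - peak) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

def update_q1(q2):
    """log Q1*(h1) = <log P(H, V)> under Q2, plus a constant."""
    return normalize([sum(q2[b] * math.log(joint[a][b]) for b in range(2))
                      for a in range(2)])

def update_q2(q1):
    """log Q2*(h2) = <log P(H, V)> under Q1, plus a constant."""
    return normalize([sum(q1[a] * math.log(joint[a][b]) for a in range(2))
                      for b in range(2)])

def lower_bound(q1, q2):
    """L(Q) = <log P(H, V)>_Q + H(Q1) + H(Q2) for the factorized Q."""
    expected_log_joint = sum(q1[a] * q2[b] * math.log(joint[a][b])
                             for a in range(2) for b in range(2))
    entropy = -sum(p * math.log(p) for q in (q1, q2) for p in q)
    return expected_log_joint + entropy

# Start from uniform factors and record the bound after every update.
q1, q2 = [0.5, 0.5], [0.5, 0.5]
bounds = [lower_bound(q1, q2)]
for _ in range(20):
    q1 = update_q1(q2)
    bounds.append(lower_bound(q1, q2))
    q2 = update_q2(q1)
    bounds.append(lower_bound(q1, q2))

log_evidence = math.log(sum(sum(row) for row in joint))
```

The recorded `bounds` should never decrease and should stay below `log_evidence`; the remaining gap is the KL divergence that the factorized family cannot remove.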


Variational Bayes for Bayesian Networks

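\[
\begin{aligned}
\log Q_j^*(H_j) &= \langle \log P(H,V) \rangle_{\sim Q_j(H_j)} + c \\
&= \sum_i \langle \log P(X_i \mid pa_i) \rangle_{\sim Q_j(H_j)} + c \\
&= \langle \log P(H_j \mid pa_j) \rangle_{\sim Q_j(H_j)} + \sum_{k \in ch_j} \langle \log P(X_k \mid pa_k) \rangle_{\sim Q_j(H_j)} + c
\end{aligned}
\]

In the last line, every factor that does not involve H_j is absorbed into the constant, leaving only the conditional of H_j itself and the conditionals of its children.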
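To make the locality of this update concrete, the sketch below lists which conditionals survive for a given node. It borrows the network θ → Y → X ← β from the Gaussian example in these slides; the dictionary layout and the helper names `children`, `relevant_factors` and `co_parents` are assumptions made for illustration.

```python
# Parents of each node in the example network theta -> Y -> X <- beta.
parents = {"theta": [], "beta": [], "Y": ["theta"], "X": ["Y", "beta"]}

def children(node):
    """Nodes that list `node` among their parents."""
    return [n for n, pa in parents.items() if node in pa]

def relevant_factors(node):
    """Factors P(X_i | pa_i) that mention `node`: its own conditional
    plus the conditional of each of its children."""
    own = (node, tuple(parents[node]))
    from_children = [(k, tuple(parents[k])) for k in children(node)]
    return [own] + from_children

def co_parents(node, child):
    """Other parents of `child`, whose expectations the update also needs."""
    return [p for p in parents[child] if p != node]
```

For example, `relevant_factors("Y")` returns the pairs for P(Y | θ) and P(X | Y, β), and `co_parents("Y", "X")` returns `["beta"]`, so the update for Y only touches its Markov blanket.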


Exponential-Conjugate Models

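\[ P(Y \mid \theta) = \exp\left[ \phi_Y^T(\theta)\, u(Y) + f(Y) + g(\theta) \right] \tag{7} \]

\[ u(Y) = \text{natural statistics} \tag{8} \]

\[ \phi_Y(\theta) = \text{natural parameter vector} \tag{9} \]

\[ g(\theta) = \text{constant of integration} \tag{10} \]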

Example I: Bernoulli Distribution

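\[ p(x \mid \mu) = \mu^{x} (1-\mu)^{1-x} \tag{11} \]

\begin{align}
\log p(x \mid \mu) &= x \log \mu + (1-x) \log(1-\mu) \tag{12} \\
&= \underbrace{\log \frac{\mu}{1-\mu}}_{\phi(\mu)} \, \underbrace{x}_{u(x)} + \underbrace{\log(1-\mu)}_{g(\mu)} \tag{13}
\end{align}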


Exponential-Conjugate Models

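\[ P(Y \mid \theta) = \exp\left[ \phi_Y^T(\theta)\, u(Y) + f(Y) + g(\theta) \right] \tag{14} \]

Re-parametrization:

\[ P(Y \mid \phi) = \exp\left[ \phi^T u(Y) + f(Y) + \tilde{g}(\phi) \right] \]

Property I:

\[ \langle u(Y) \rangle_{P(Y \mid \theta)} = -\frac{d\tilde{g}(\phi)}{d\phi} \]

\[ \log p(x \mid \mu) = \underbrace{\log \frac{\mu}{1-\mu}}_{\phi(\mu)} \, \underbrace{x}_{u(x)} + \underbrace{\log(1-\mu)}_{g(\mu)} \tag{15} \]

\[ \phi = \log \frac{\mu}{1-\mu} \;\Rightarrow\; \mu = \frac{e^{\phi}}{1 + e^{\phi}} \tag{16} \]

\[ g(\mu) = \log(1-\mu) = -\log(1 + e^{\phi}) = \tilde{g}(\phi) \tag{17} \]

\[ E(x) = \langle u(Y) \rangle = e^{\phi} (1 + e^{\phi})^{-1} = \mu \tag{18} \]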
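Property I can be verified numerically for the Bernoulli example. The sketch below approximates the derivative of the re-parametrized normalizer with a central difference; the step size and the value `mu = 0.3` are arbitrary choices, and the function names are illustrative.

```python
import math

def g_tilde(phi):
    """Bernoulli normalizer in the natural parameter: -log(1 + e^phi)."""
    return -math.log(1.0 + math.exp(phi))

def expected_statistic(phi, step=1e-6):
    """Property I: <u(x)> = -d g_tilde / d phi, by central differences."""
    return -(g_tilde(phi + step) - g_tilde(phi - step)) / (2.0 * step)

def log_prob(x, mu):
    """Exponential-family form (13): phi(mu) * u(x) + g(mu)."""
    phi = math.log(mu / (1.0 - mu))
    return phi * x + math.log(1.0 - mu)

mu = 0.3
phi = math.log(mu / (1.0 - mu))
mean_from_property = expected_statistic(phi)
```

Up to finite-difference error, `mean_from_property` should recover `mu`, and `log_prob` should agree with log µ for x = 1 and log(1 − µ) for x = 0.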


Exponential-Conjugate Models

\[ P(Y \mid \theta) = \exp\left[ \phi_Y^T(\theta)\, u(Y) + f(Y) + g(\theta) \right] \tag{19} \]

Example II: Gaussian Distribution θ → Y → X ← β

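\[ p(Y \mid \theta) = (2\pi)^{-1/2} \exp\left[ -\tfrac{1}{2} (Y - \theta)^2 \right] \]

\[ \log p(Y \mid \theta) = \underbrace{[\theta,\; -1/2]}_{\phi_Y(\theta)} \underbrace{\begin{bmatrix} Y \\ Y^2 \end{bmatrix}}_{u_Y(Y)} \underbrace{-\, \tfrac{1}{2} \theta^2}_{g_Y(\theta)} \underbrace{-\, \tfrac{1}{2} \log(2\pi)}_{f_Y(Y)} \]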

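\[ p(X \mid Y, \beta) = (2\pi)^{-1/2} \beta^{1/2} \exp\left[ -\tfrac{\beta}{2} (X - Y)^2 \right] \]

\[ \log p(X \mid Y, \beta) = \underbrace{[\beta Y,\; -\beta/2]}_{\phi_X(Y,\beta)} \underbrace{\begin{bmatrix} X \\ X^2 \end{bmatrix}}_{u_X(X)} \underbrace{-\, \tfrac{1}{2} (\beta Y^2 - \log \beta)}_{g_X(Y,\beta)} \underbrace{-\, \tfrac{1}{2} \log(2\pi)}_{f_X(X)} \]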


Exponential-Conjugate Models

Property II: Multi-linearity θ → Y → X ← β

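\[
\begin{aligned}
\log p(X \mid Y, \beta) &= \underbrace{[\beta Y,\; -\beta/2]}_{\phi_X(Y,\beta)} \underbrace{\begin{bmatrix} X \\ X^2 \end{bmatrix}}_{u_X(X)} \underbrace{-\, \tfrac{1}{2} (\beta Y^2 - \log \beta)}_{g_X(Y,\beta)} \underbrace{-\, \tfrac{1}{2} \log(2\pi)}_{f_X(X)} \\
&= \underbrace{[\beta X,\; -\beta/2]}_{\phi_{XY}(X,\beta)} \underbrace{\begin{bmatrix} Y \\ Y^2 \end{bmatrix}}_{u_Y(Y)} \underbrace{-\, \tfrac{1}{2} (\beta X^2 - \log \beta)}_{g_{XY}(X,\beta)} \underbrace{-\, \tfrac{1}{2} \log(2\pi)}_{f_Y(Y)}
\end{aligned}
\]

\[ \log p(Y \mid \theta) = \underbrace{[\theta,\; -1/2]}_{\phi_Y(\theta)} \underbrace{\begin{bmatrix} Y \\ Y^2 \end{bmatrix}}_{u_Y(Y)} \underbrace{-\, \tfrac{1}{2} \theta^2}_{g_Y(\theta)} \underbrace{-\, \tfrac{1}{2} \log(2\pi)}_{f_Y(Y)} \]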
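The two ways of writing log p(X | Y, β) can be compared against a direct evaluation of the Gaussian log density. The sketch below is illustrative only; the test point and the helper names are arbitrary. Note that matching the density requires the log β term to enter with a positive sign, that is, g_X(Y, β) = −½(βY² − log β) and g_XY(X, β) = −½(βX² − log β).

```python
import math

def log_gaussian(x, y, beta):
    """Direct evaluation of log p(X | Y, beta): mean Y, precision beta."""
    return 0.5 * math.log(beta / (2.0 * math.pi)) - 0.5 * beta * (x - y) ** 2

def dot(a, b):
    """Inner product of two short vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def as_function_of_x(x, y, beta):
    """phi_X(Y, beta) . u_X(X) + g_X(Y, beta) + f_X(X)."""
    phi_x = [beta * y, -0.5 * beta]
    u_x = [x, x * x]
    g_x = -0.5 * (beta * y * y - math.log(beta))
    return dot(phi_x, u_x) + g_x - 0.5 * math.log(2.0 * math.pi)

def as_function_of_y(x, y, beta):
    """phi_XY(X, beta) . u_Y(Y) + g_XY(X, beta) + the same constant."""
    phi_xy = [beta * x, -0.5 * beta]
    u_y = [y, y * y]
    g_xy = -0.5 * (beta * x * x - math.log(beta))
    return dot(phi_xy, u_y) + g_xy - 0.5 * math.log(2.0 * math.pi)

x, y, beta = 1.3, -0.4, 2.5
direct = log_gaussian(x, y, beta)
```

Both rewritings should agree with `direct`, which is what lets the same conditional be read either as a distribution over X or as a function of the natural statistics of its parent Y.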


Exponential-Conjugate Models

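Consider node Y and its children in θ → Y → X ← β:

\[ \log P(Y \mid \theta) = \phi_Y^T(\theta)\, u_Y(Y) + f_Y(Y) + g_Y(\theta) \]

\[
\begin{aligned}
\log P(X \mid Y, \beta) &= \phi_X^T(Y, \beta)\, u_X(X) + f_X(X) + g_X(Y, \beta) \\
&= \phi_{XY}^T(X, \beta)\, u_Y(Y) + g_{XY}(X, \beta)
\end{aligned}
\]

(The additive constant $-\tfrac{1}{2}\log(2\pi)$ is omitted in the last line.)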

Recall that,

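\[
\begin{aligned}
\log Q_Y^*(Y) &= \langle \log P(Y \mid \theta) \rangle_{\sim Q_Y(Y)} + \langle \log P(X \mid Y, \beta) \rangle_{\sim Q_Y(Y)} + c \\
&= \langle \phi_Y^T(\theta)\, u_Y(Y) + f_Y(Y) + g_Y(\theta) \rangle_{\sim Q_Y(Y)} \\
&\quad + \langle \phi_{XY}^T(X, \beta)\, u_Y(Y) + g_{XY}(X, \beta) \rangle_{\sim Q_Y(Y)} + c \\
&= \langle \phi_Y^T(\theta) + \phi_{XY}^T(X, \beta) \rangle_{\sim Q_Y(Y)}\, u_Y(Y) + f_Y(Y) + c_1
\end{aligned}
\]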


Exponential-Conjugate Models

\[ \log Q_Y^*(Y) = \langle \phi_Y^T(\theta) + \phi_{XY}^T(X, \beta) \rangle_{\sim Q_Y(Y)}\, u_Y(Y) + f_Y(Y) + c_1 \]

Finally,

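\[ \langle \phi_Y^T(\theta) \rangle = [\theta,\; -1/2] \]

\[ \langle \phi_{XY}^T(X, \beta) \rangle = \langle [\beta X,\; -\beta/2] \rangle \]

The latter is found using Property I: φ_XY is multi-linear in the natural statistics of X and β, so its expectation is obtained by substituting their expected natural statistics, which Property I gives as derivatives of the normalizer.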
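As a worked instance of this update, the sketch below adds the two natural parameter vectors for the Gaussian example and converts the result back to a mean and variance. It assumes the simplest possible setting, which the slides do not spell out: θ and β are known constants and X is observed, so every expectation is trivial and Q*_Y coincides with the exact posterior. The numbers are made up, and the grid integration is only a brute-force check.

```python
import math

# Assumed setting: theta and beta are known constants and X is observed,
# so the expectations in the natural-parameter sum are trivial.
theta, beta, x_obs = 0.5, 4.0, 2.0

# Natural parameters of Q*_Y: phi_Y(theta) + phi_XY(X, beta).
phi_prior = [theta, -0.5]
phi_child = [beta * x_obs, -0.5 * beta]
phi_star = [a + b for a, b in zip(phi_prior, phi_child)]

# A Gaussian with natural parameters [eta1, eta2] on [Y, Y^2] has
# variance -1 / (2 * eta2) and mean eta1 * variance.
variance = -1.0 / (2.0 * phi_star[1])
mean = phi_star[0] * variance

def log_joint(y):
    """log p(Y | theta) + log p(X | Y, beta), up to an additive constant."""
    return -0.5 * (y - theta) ** 2 - 0.5 * beta * (x_obs - y) ** 2

# Brute-force check: posterior moments of Y on a dense grid.
grid = [-10.0 + 0.001 * i for i in range(20001)]
weights = [math.exp(log_joint(y)) for y in grid]
total = sum(weights)
grid_mean = sum(w * y for w, y in zip(weights, grid)) / total
grid_var = sum(w * (y - grid_mean) ** 2 for w, y in zip(weights, grid)) / total
```

The closed-form `mean` and `variance` should agree with the grid estimates, showing that adding natural parameters is all the update requires in a conjugate model.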


Back to Bayesian Networks

Take each node, and write the expression as a function of the natural statistics of that node.

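\[
\begin{aligned}
\log Q_Y^*(Y) &= \langle \log P(Y \mid pa_Y) \rangle_{\sim Q_Y(Y)} + \sum_{k \in ch_Y} \langle \log P(X_k \mid pa_k) \rangle_{\sim Q_Y(Y)} + c \\
&= \left\langle \phi_Y^T(\theta) + \sum_{k \in ch_Y} \phi_{X_k Y}^T(X_k, \beta) \right\rangle_{\sim Q_Y(Y)} u_Y(Y) + f_Y(Y) + c_1
\end{aligned}
\]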

Then compute the expectation of the natural statistics of each child node, and use it to find the quantity in brackets.


Variational Message Passing

Message from a parent node Y to a child node X:

\[ m_{Y \to X} = \langle u_Y \rangle \tag{20} \]

Message from a child node X to a parent node Y:

\[ m_{X \to Y} = \tilde{\phi}_{XY}\left( \langle u_X \rangle, \{ m_{i \to X} \}_{i \in cp_Y} \right) \tag{21} \]

Node Y updates its posterior $Q_Y^*$:

\[ \phi_Y^* = \tilde{\phi}_Y\left( \{ m_{i \to Y} \}_{i \in pa_Y} \right) + \sum_{j \in ch_Y} m_{j \to Y} \tag{22} \]
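The three rules can be sketched on a small conjugate model. The model here is an assumption chosen for illustration, not one taken from the slides: observations x_n ~ N(µ, 1/τ) with priors µ ~ N(m0, 1/p0) and τ ~ Gamma(a0, b0) in the shape and rate convention. The natural statistics are taken as [µ, µ²] and [τ, log τ]; only ⟨τ⟩ is needed by these updates, so ⟨log τ⟩ is left out to avoid a digamma function. All names and numbers are hypothetical.

```python
# Assumed model: x_n ~ N(mu, 1/tau), mu ~ N(m0, 1/p0), tau ~ Gamma(a0, b0).
data = [1.0, 2.0, 3.0, 4.0, 5.0]
m0, p0 = 0.0, 1e-3   # weak Gaussian prior on mu (mean, precision)
a0, b0 = 1e-3, 1e-3  # weak Gamma prior on tau (shape, rate)

def add(u, v):
    """Element-wise sum of two natural parameter vectors."""
    return [a + b for a, b in zip(u, v)]

def message_mu_to_x(mean, precision):
    """Parent-to-child message (20): <u_mu> = [<mu>, <mu^2>]."""
    return [mean, mean * mean + 1.0 / precision]

def message_x_to_mu(x, expected_tau):
    """Child-to-parent message (21), using the co-parent's <tau>."""
    return [expected_tau * x, -0.5 * expected_tau]

def message_x_to_tau(x, u_mu):
    """Child-to-parent message (21), using the co-parent's <u_mu>."""
    return [-0.5 * (x * x - 2.0 * x * u_mu[0] + u_mu[1]), 0.5]

def update_mu(expected_tau):
    """Update (22) for mu; returns the mean and precision of Q*(mu)."""
    phi = [p0 * m0, -0.5 * p0]  # prior natural parameters
    for x in data:
        phi = add(phi, message_x_to_mu(x, expected_tau))
    precision = -2.0 * phi[1]
    return phi[0] / precision, precision

def update_tau(u_mu):
    """Update (22) for tau; returns the shape and rate of Q*(tau)."""
    phi = [-b0, a0 - 1.0]  # prior natural parameters on [tau, log tau]
    for x in data:
        phi = add(phi, message_x_to_tau(x, u_mu))
    return phi[1] + 1.0, -phi[0]

# Alternate the two node updates; <tau> = shape / rate for a Gamma.
shape, rate = a0, b0
for _ in range(50):
    mean, precision = update_mu(shape / rate)
    shape, rate = update_tau(message_mu_to_x(mean, precision))
```

With these weak priors the posterior mean of µ should land close to the sample mean of `data`, and the fixed point should satisfy `precision == p0 + len(data) * shape / rate` up to rounding. The schedule used here (µ first, then τ) is one arbitrary choice among several valid ones.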




Discussion

Initialization and message passing schedule.

Calculation of Lower Bound

Allowable Model

VIBES
