A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation


Tomonari MASADA (正田備也), Nagasaki University (長崎大学)

masada@nagasaki-u.ac.jp

Aim
• Obtain an informative summary of a large set of documents
• by extracting word lists, each relating to a specific topic
Topic modeling


Contribution
• We propose a new posterior estimation method for latent Dirichlet allocation (LDA) [Blei+ 03]
• by applying stochastic gradient variational Bayes (SGVB) [Kingma+ 14] to LDA.


LDA [Blei+ 03]
• Achieves a clustering of word tokens by assigning each word token to one of the topics (a small generative sketch is given below).
• $z_{d,n}$: the topic to which the $n$-th word token in document $d$ is assigned (discrete variables).
• $\theta_{d,k}$: how often topic $k$ is talked about in document $d$.
  • Topic probability distribution in each document (continuous variables)
• $\phi_{k,w}$: how often word $w$ is used to talk about topic $k$.
  • Word probability distribution for each topic (continuous variables)
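To make the roles of $z$, $\theta$, and $\phi$ concrete, here is a minimal NumPy sketch of LDA's generative process. It is an illustrative sketch only: the topic count, vocabulary size, and Dirichlet hyperparameters (K, V, alpha, beta) are assumed values, not settings from this talk.

    import numpy as np

    rng = np.random.default_rng(0)

    K, V = 5, 1000            # number of topics, vocabulary size (assumed)
    alpha, beta = 0.1, 0.01   # symmetric Dirichlet hyperparameters (assumed)

    # phi[k]: word probability distribution for topic k
    phi = rng.dirichlet(np.full(V, beta), size=K)

    def generate_document(doc_len):
        # theta: topic probability distribution for this document
        theta = rng.dirichlet(np.full(K, alpha))
        # z[n]: topic assigned to the n-th word token
        z = rng.choice(K, size=doc_len, p=theta)
        # draw each word token from the topic it was assigned to
        words = np.array([rng.choice(V, p=phi[k]) for k in z])
        return theta, z, words

    theta, z, words = generate_document(100)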


Variational Bayesian (VB) inference = maximization of the evidence lower bound (ELBO)
• VB tries to approximate the true posterior.
• An approximate posterior $q$ is introduced when the ELBO is obtained by applying Jensen's inequality (the step is sketched below):
  $\log p(X) \ge \mathbb{E}_{q(z, \theta, \phi)}\!\left[ \log p(X, z, \theta, \phi) - \log q(z, \theta, \phi) \right]$
• $z$: discrete hidden variables (topic assignments)
• $\theta, \phi$: continuous hidden variables (multinomial parameters)
• $\log p(X)$ is the evidence; $q$ is the approximate posterior.
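Since the slide's equation did not survive extraction, here is the standard Jensen step written in the notation above ($X$: observed words; the resulting lower bound is the ELBO $\mathcal{L}(q)$). This is the textbook derivation, not a claim about any additional terms the original slide may have shown.

    \log p(X)
      = \log \int \sum_{z} q(z, \theta, \phi)\,
          \frac{p(X, z, \theta, \phi)}{q(z, \theta, \phi)}\, d\theta\, d\phi
      \;\ge\; \mathbb{E}_{q(z, \theta, \phi)}\!\left[
          \log p(X, z, \theta, \phi) - \log q(z, \theta, \phi) \right]
      \;=\; \mathcal{L}(q)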


Factorization assumption
• We assume the approximate posterior factorizes as
  $q(z, \theta, \phi) = q(z)\, q(\theta)\, q(\phi)$
  to make the inference tractable.
• Then the ELBO can be written as a sum of expectations under these factors (one standard expansion is given below).
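One standard expansion of the ELBO under this mean-field factorization; the exact grouping of terms on the original slide may differ:

    \mathcal{L}(q)
      = \mathbb{E}_{q}\!\left[\log p(X \mid z, \phi)\right]
      + \mathbb{E}_{q}\!\left[\log p(z \mid \theta)\right]
      + \mathbb{E}_{q}\!\left[\log p(\theta)\right]
      + \mathbb{E}_{q}\!\left[\log p(\phi)\right]
      - \mathbb{E}_{q}\!\left[\log q(z)\right]
      - \mathbb{E}_{q}\!\left[\log q(\theta)\right]
      - \mathbb{E}_{q}\!\left[\log q(\phi)\right]

The expectations over the continuous factors $q(\theta)$ and $q(\phi)$ are the ones SGVB estimates with Monte Carlo samples.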


Stochastic gradient variational Bayes (SGVB) [Kingma+ 14]
• A general framework for estimating the evidence lower bound (ELBO) in variational Bayes (VB)
• Only applicable to continuous distributions

(SGVB) Monte Carlo integration
• By using Monte Carlo integration, the ELBO can be estimated with random samples as
  $\mathbb{E}_{q(\theta, \phi)}\!\left[ f(\theta, \phi) \right] \approx \frac{1}{L} \sum_{l=1}^{L} f(\theta^{(l)}, \phi^{(l)}), \quad (\theta^{(l)}, \phi^{(l)}) \sim q(\theta, \phi)$
  (a small code sketch follows).
• The discrete part is estimated in a similar manner to the original VB for LDA [Blei+ 03].
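A minimal NumPy illustration of the Monte Carlo estimate used here: an expectation under the continuous part of $q$ is replaced by an average over samples drawn from $q$. The Gaussian $q$ and the toy integrand below are placeholders, not the actual LDA terms.

    import numpy as np

    rng = np.random.default_rng(0)

    def elbo_term(theta):
        # placeholder integrand; in the talk this would be a log-joint term of LDA
        return np.log1p(theta ** 2)

    # toy continuous approximate posterior q(theta) = N(mu, sigma^2)
    mu, sigma = 0.5, 0.1
    L = 64                                     # number of Monte Carlo samples
    samples = rng.normal(mu, sigma, size=L)

    # E_q[f(theta)] is approximated by (1/L) * sum_l f(theta_l)
    mc_estimate = elbo_term(samples).mean()
    print(mc_estimate)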

(SGVB) Reparameterization
• SGVB can be applied "under certain mild conditions."
• We use logistic normal distributions to approximate the true posterior of
  • $\theta$: the per-document topic probability distributions, and
  • $\phi$: the per-topic word probability distributions.
• We can efficiently sample from the logistic normal with reparameterization (see the sketch below).
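A minimal sketch of reparameterized sampling from a logistic normal, as used for the per-document topic distribution: draw standard normal noise, shift and scale it by the variational mean and standard deviation, then apply a softmax. The variable names (mu, log_sigma) and the dimensions are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    K = 5                                # number of topics (assumed)
    mu = rng.normal(size=K)              # variational mean for one document
    log_sigma = np.full(K, -1.0)         # variational log-std for one document

    # Reparameterization: theta = softmax(mu + sigma * eps), eps ~ N(0, I).
    # All randomness lives in eps, so gradients with respect to mu and
    # log_sigma can flow through the sample.
    eps = rng.normal(size=K)
    theta = softmax(mu + np.exp(log_sigma) * eps)   # a logistic normal sample
    print(theta, theta.sum())                       # sums to 1

The per-topic word distributions $\phi_k$ can be sampled in the same way over the vocabulary dimension.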


Maximize ELBO using gradient ascent
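The body of this slide did not survive extraction, so here is a self-contained toy sketch of gradient ascent on a Monte Carlo ELBO with reparameterized samples. The model (a one-dimensional Gaussian target with a Gaussian $q$), learning rate, and sample count are assumptions for illustration, not the talk's LDA objective.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy model: log p(x, theta) = -(theta - 2)^2 / 2 up to a constant,
    # approximate posterior q(theta) = N(mu, sigma^2).
    # The exact ELBO optimum is mu = 2, sigma = 1.
    mu, log_sigma = 0.0, -2.0
    lr, L = 0.05, 16

    for step in range(2000):
        sigma = np.exp(log_sigma)
        eps = rng.normal(size=L)
        theta = mu + sigma * eps               # reparameterized samples from q

        dlogp = -(theta - 2.0)                 # d/dtheta log p(x, theta)
        grad_mu = dlogp.mean()                 # chain rule: dtheta/dmu = 1
        grad_log_sigma = (dlogp * sigma * eps).mean() + 1.0  # +1 from the entropy of q

        mu += lr * grad_mu                     # gradient *ascent* on the ELBO
        log_sigma += lr * grad_log_sigma

    print(mu, np.exp(log_sigma))               # should end up close to 2.0 and 1.0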



"Stochastic" gradient VB•The expectation integrations in ELBO are estimated

by Monte Carlo method.

•The derivatives of ELBO depend on random

samples.

•Randomness is incorporated into maximization.• SGVB = VB where gradients are stochastic.

• (Observation) It seems easier to avoid poor local minima.

14

Without randomness = with zero standard deviation
• A special case of the proposed method is quite similar to CVB0 [Asuncion+ 09].
• Our method thus fits into the context of existing inference methods for LDA.
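In terms of the reparameterization sketched earlier, the zero standard deviation case can be written as a one-line observation (using this writeup's softmax/$\mu$/$\sigma$ labeling rather than the original slide's):

    \theta = \mathrm{softmax}(\mu + \sigma \odot \epsilon)
      \;\xrightarrow{\;\sigma \to 0\;}\; \mathrm{softmax}(\mu)

so every Monte Carlo sample collapses to the mean and the updates become deterministic.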

Data sets for evaluation

Data set    # docs     # vocabulary words
NYT          99,932    46,263
MOVIE        27,859    62,408
NSF         128,818    21,471
MED         125,490    42,830


Not that efficient in time…
• 500 iterations for the NYT data set when …
  • LNV: 43 hours
  • CGS: 14 hours
  • VB: 23 hours
• However, parallelization with a GPU works.
  • (preparing an implementation with TensorFlow)

Conclusion
• We incorporate randomness into variational inference for LDA by applying SGVB to LDA.
• The proposed method gives perplexities comparable to those of existing inference methods for LDA.

Future work
• SGVB is a general framework for devising posterior inference methods for probabilistic models.
• We have already applied SGVB to CTM [Blei+ 05].
  • This will be presented as a poster at APWeb'16.
• SGVB is also applicable to other document models.
  • NVDM [Miao+ 16]: document modeling with an MLP

