Stick-breaking Construction for the Indian Buffet Process. Duke University Machine Learning Group. Presented by Kai Ni, July 27, 2007. Yee Whye Teh, Dilan Görür, and Zoubin Ghahramani, AISTATS 2007.


Page 1: Stick-breaking Construction for the Indian Buffet Process

Stick-breaking Construction for the Indian Buffet Process

Duke University Machine Learning Group

Presented by Kai Ni

July 27, 2007

Yee Whye Teh, Dilan Görür, and Zoubin Ghahramani, AISTATS 2007

Page 2: Stick-breaking Construction for the Indian Buffet Process

Outline

Introduction

Indian Buffet Process (IBP)

Stick-breaking construction for IBP

Slice samplers

Results

Page 3: Stick-breaking Construction for the Indian Buffet Process

Introduction

Indian Buffet Process (IBP)

A distribution over binary matrices consisting of N rows (objects) and an unbounded number of columns (features);

1/0 in entry (i,k) indicates feature k present/absent from object i.

An example

Objects are movies: “Terminator 2”, “Shrek” and “Shanghai Knights”;

Features are: “action”, “comedy”, “stars Jackie Chan”;

The matrix can be [101; 010; 110].

Page 4: Stick-breaking Construction for the Indian Buffet Process

Relationship to CRP

The IBP and the Chinese restaurant process (CRP) are both tools for defining nonparametric Bayesian models with latent variables.

CRP: each object belongs to only one of infinitely many latent classes.

IBP: each object can possess potentially any combination of infinitely many latent features.

The previous Gibbs sampler for the IBP is based on the CRP. In this paper the authors derive a stick-breaking representation for the IBP and develop efficient slice samplers.

Page 5: Stick-breaking Construction for the Indian Buffet Process

Indian Buffet Process

Let Z be a random binary N x K matrix, and denote entry (i,k) in Z by z_ik. For each feature k, let u_k (written μ_k in the paper) be the prior probability that feature k is present in an object.

Let α be the strength parameter of the IBP. The full model is:

$$u_k \sim \mathrm{Beta}(\alpha/K,\, 1), \qquad z_{ik} \mid u_k \sim \mathrm{Bernoulli}(u_k).$$

If we integrate out the u_k and take the limit K -> infinity, we obtain the IBP, in the same way that the CRP arises as the infinite limit of a finite mixture model.
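The finite model above can be simulated directly. The following is a minimal sketch, assuming the Beta(α/K, 1) prior on each feature probability and a large but finite K; the function and variable names are illustrative and do not come from the slides.

```python
import numpy as np

def sample_finite_feature_matrix(n_objects, n_features, alpha, rng):
    """Draw Z from the finite model: u_k ~ Beta(alpha/K, 1), z_ik ~ Bernoulli(u_k)."""
    # One prior probability per feature (column).
    u = rng.beta(alpha / n_features, 1.0, size=n_features)
    # Each entry z_ik is 1 with probability u_k; broadcasting compares per column.
    z = (rng.random((n_objects, n_features)) < u).astype(int)
    return z, u

rng = np.random.default_rng(0)
Z, u = sample_finite_feature_matrix(n_objects=10, n_features=1000, alpha=2.0, rng=rng)

# Most columns are empty when K is large; only a few features are active.
active = Z.any(axis=0)
print(Z.shape, int(active.sum()))
```

As K grows, almost all columns stay empty while the number of active columns remains small, which is the behaviour the infinite limit formalizes.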

Page 6: Stick-breaking Construction for the Indian Buffet Process
Page 7: Stick-breaking Construction for the Indian Buffet Process

Gibbs sampler for IBP

For new features:

Page 8: Stick-breaking Construction for the Indian Buffet Process

Stick-breaking construction for IBP

Page 9: Stick-breaking Construction for the Indian Buffet Process

Derivation

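With u_k ~ Beta(α/K, 1), the pdf and cdf for each u_k on [0, 1] are:

$$p(u_k) = \frac{\alpha}{K}\, u_k^{\alpha/K - 1}, \qquad F(u_k) = u_k^{\alpha/K}.$$

The cdf for the largest value u(1) = max_k u_k is the product of the K individual cdfs:

$$F(u_{(1)}) = \left(u_{(1)}^{\alpha/K}\right)^{K} = u_{(1)}^{\alpha}.$$

Differentiating gives the pdf for u(1):

$$p(u_{(1)}) = \alpha\, u_{(1)}^{\alpha - 1},$$

so u(1) ~ Beta(α, 1).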
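The claim that the largest of K draws from Beta(α/K, 1) follows Beta(α, 1) can be checked numerically. The sketch below is an illustration only: the values of α, K, and the number of draws are arbitrary choices, and the draws use inverse-cdf sampling based on the cdf u^(α/K).

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, K, n_draws = 2.0, 50, 20000

# Inverse-cdf sampling: F(u) = u^(alpha/K)  =>  u = U^(K/alpha) with U uniform.
u = rng.random((n_draws, K)) ** (K / alpha)

# Largest feature probability in each simulated set of K features.
u_max = u.max(axis=1)

empirical_mean = u_max.mean()
beta_mean = alpha / (alpha + 1.0)  # mean of a Beta(alpha, 1) variable
print(empirical_mean, beta_mean)
```

The two printed values should agree up to Monte Carlo error, for any choice of K.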

Page 10: Stick-breaking Construction for the Indian Buffet Process

Relation to DP
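One way to see the relation, following the construction in the Teh et al. paper: both processes break a unit stick at relative points ν_k ~ Beta(α, 1). The IBP records the length of the stick that remains after each break, while the Dirichlet process (DP) records the length of the piece that is discarded. The sketch below illustrates this under those assumptions; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, K = 3.0, 200

# Relative break points, one per iteration.
nu = rng.beta(alpha, 1.0, size=K)

# IBP: length of the stick remaining after each break (decreasing in k).
u_sorted = np.cumprod(nu)

# DP: length of the piece discarded at each break,
# i.e. remaining length before the break minus remaining length after it.
remaining_before = np.concatenate(([1.0], u_sorted[:-1]))
pi = remaining_before - u_sorted

print(u_sorted[:3], pi[:3], pi.sum())
```

The discarded pieces are non-negative and sum to one minus the final remaining length, so they behave as mixture weights, whereas the IBP stick lengths are decreasing probabilities that need not sum to one.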

Page 11: Stick-breaking Construction for the Indian Buffet Process

Stick-breaking for IBP (2)

In truncated stick-breaking for IBP, let K* be the truncation level. We set u(k)=0 for k>K*, and zik=0 for k>K*.
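A truncated draw can be sketched as follows. This assumes the recursion given in the paper, ν_k ~ Beta(α, 1) and u(k) = ν_k · u(k-1) with u(0) = 1, and simply stops at K*; the function name and argument names are illustrative.

```python
import numpy as np

def sample_truncated_stick_breaking_ibp(n_objects, truncation, alpha, rng):
    """Draw decreasing stick lengths u_(1..K*) and a binary matrix Z with K* columns."""
    nu = rng.beta(alpha, 1.0, size=truncation)
    # u_(k) is the product of the first k break points, so it decreases with k.
    u_sorted = np.cumprod(nu)
    # z_ik = 1 with probability u_(k); columns beyond K* are implicitly all zero.
    z = (rng.random((n_objects, truncation)) < u_sorted).astype(int)
    return z, u_sorted

rng = np.random.default_rng(3)
Z, u_sorted = sample_truncated_stick_breaking_ibp(n_objects=8, truncation=30, alpha=2.0, rng=rng)
print(Z.sum(axis=0))
```

Because the stick lengths shrink geometrically on average, later columns are rarely active, which is why a moderate truncation level already captures most of the mass.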

Page 12: Stick-breaking Construction for the Indian Buffet Process

Slice Sampler

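Adaptive rejection sampling (ARS) is used to deal with the truncation level. Introduce an auxiliary slice variable s with

$$s \mid Z, u_{(1:\infty)} \sim \mathrm{Uniform}[0,\, u^*], \qquad u^* = \min\Big\{1,\ \min_{k:\, \exists i,\, z_{ik}=1} u_{(k)}\Big\}.$$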

Page 13: Stick-breaking Construction for the Indian Buffet Process

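1. Update s: if the new s makes K* larger, we iteratively draw u(k) for k > K* until a stick falls below s; the new truncation level K*' is the largest index with u(K*') > s.

2. Update Z: given s, we only need to update zik for each i and k <= K*.

3. Update the feature parameters θk for k = 1, …, K*.

4. Update u(k) for k = 1, …, K*.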
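The truncation logic of step 1 can be sketched in isolation. This is a deliberately simplified illustration, not the sampler from the paper: new sticks are drawn from the prior recursion u(k) = ν_k · u(k-1) with ν_k ~ Beta(α, 1), whereas the paper draws them from a conditional distribution using ARS. The helper name and the example matrix are hypothetical.

```python
import numpy as np

def extend_sticks_below_slice(u_sorted, s, alpha, rng):
    """Append sticks until the last one falls below s; return all sticks and K*."""
    sticks = list(u_sorted)
    while sticks[-1] >= s:
        # Simplification: prior recursion instead of the ARS conditional draw.
        sticks.append(sticks[-1] * rng.beta(alpha, 1.0))
    sticks = np.array(sticks)
    # Sticks are decreasing, so the count above s is the largest index above s.
    k_star = int(np.sum(sticks > s))
    return sticks, k_star

rng = np.random.default_rng(4)
alpha = 2.0
u_sorted = np.cumprod(rng.beta(alpha, 1.0, size=3))
Z = np.array([[1, 0, 0],
              [1, 1, 0]])                      # hypothetical current feature matrix
active = Z.any(axis=0)
u_star = min(1.0, u_sorted[active].min())      # smallest stick among active features
s = u_star * (1.0 - rng.random())              # uniform on (0, u_star], avoids s == 0
sticks, k_star = extend_sticks_below_slice(u_sorted, s, alpha, rng)
print(k_star, len(sticks))
```

Only the first k_star columns need to be considered when updating Z, since every later feature has a stick length below s.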

Sampling

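[Figure: stick lengths u(k) plotted against k = 1, 2, …, K*, K*', decreasing in k. The old and new values of s are marked; the range of the uniform distribution for s starts at 0.]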

Page 14: Stick-breaking Construction for the Indian Buffet Process

Change of Representations

IBP: ignores the ordering on features.

Stick-breaking IBP: enforces an ordering with decreasing weights.

Stick-breaking -> IBP: Drop the stick lengths and the inactive features, leaving only the K+ active feature columns along with the corresponding parameters.

IBP -> stick-breaking: Draw the stick lengths and order the features by decreasing stick length, introducing Ko inactive features until
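The stick-breaking -> IBP direction described above amounts to discarding the sticks and filtering out empty columns. A minimal sketch, with a hypothetical function name and made-up per-feature parameters:

```python
import numpy as np

def to_ibp_representation(z, theta):
    """Keep only active feature columns and their parameters; stick lengths are dropped."""
    active = z.any(axis=0)          # a feature is active if any object has it
    return z[:, active], theta[active]

Z = np.array([[1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 0, 1]])
theta = np.array([0.5, -1.2, 0.3, 2.0])   # hypothetical per-feature parameters
Z_plus, theta_plus = to_ibp_representation(Z, theta)
print(Z_plus.shape, theta_plus)
```

The result has K+ columns, one per active feature, and no longer carries any ordering information from the stick lengths.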

Page 15: Stick-breaking Construction for the Indian Buffet Process

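Semi-ordered Stick-breaking

The stick lengths u+k on the active features are unordered and drawn from a CRP-like distribution:

$$u^+_k \mid z_{:,k} \sim \mathrm{Beta}(m_k,\; N - m_k + 1), \qquad m_k = \sum_{i=1}^{N} z_{ik}.$$

The stick lengths on the inactive features are similar to those of the stick-breaking IBP.

The auxiliary variable s determines how many inactive features need to be added.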
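The active-feature part of the semi-ordered representation can be sketched as follows. It assumes the conditional Beta(m_k, N − m_k + 1), which is the K -> infinity limit of the conjugate posterior under a Beta(α/K, 1) prior, and a slice variable drawn uniformly below the smallest active stick. The function name and the example matrix are illustrative.

```python
import numpy as np

def sample_active_sticks_and_slice(z_plus, rng):
    """Draw unordered stick lengths for active features, then the slice variable s."""
    n_objects = z_plus.shape[0]
    m = z_plus.sum(axis=0)                      # number of objects using each feature
    u_plus = rng.beta(m, n_objects - m + 1)     # one stick length per active feature
    u_star = min(1.0, u_plus.min())
    s = u_star * (1.0 - rng.random())           # uniform on (0, u_star]
    return u_plus, s

rng = np.random.default_rng(5)
Z_plus = np.array([[1, 1, 0],
                   [1, 0, 1],
                   [1, 0, 0],
                   [0, 1, 0]])                  # every column has at least one 1
u_plus, s = sample_active_sticks_and_slice(Z_plus, rng)
print(u_plus, s)
```

Inactive features would then be added by continuing the stick-breaking below the smallest active stick until a stick falls below s; that part is omitted here.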

[Figure: the K+ active features (unordered, 1 to K+) followed by Ko inactive features. The range of the uniform distribution for s runs from 0 to min(u(k)).]

Page 16: Stick-breaking Construction for the Indian Buffet Process

Results

The conjugate linear-Gaussian binary latent feature model was used to compare the performance of the different samplers. Each data point is modeled using a spherical Gaussian with mean z_{i,:}A and variance σ_X^2.
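The linear-Gaussian likelihood can be written down in a few lines. The sketch below is illustrative only: the dimensions, the noise scale, the Gaussian draw for the weight matrix A, and the random stand-in for Z are all assumptions made for the example, not settings from the experiments.

```python
import numpy as np

rng = np.random.default_rng(6)
n_objects, n_features, n_dims = 100, 4, 16
sigma_x, sigma_a = 0.5, 1.0   # noise and weight scales (assumed values)

Z = rng.integers(0, 2, size=(n_objects, n_features))       # stand-in binary features
A = rng.normal(0.0, sigma_a, size=(n_features, n_dims))    # feature weights
X = Z @ A + rng.normal(0.0, sigma_x, size=(n_objects, n_dims))

def log_likelihood(X, Z, A, sigma_x):
    """Log density of X under independent N(z_i A, sigma_x^2 I) rows."""
    resid = X - Z @ A
    n, d = X.shape
    return (-0.5 * n * d * np.log(2.0 * np.pi * sigma_x**2)
            - 0.5 * np.sum(resid**2) / sigma_x**2)

ll_true = log_likelihood(X, Z, A, sigma_x)
ll_zero = log_likelihood(X, Z, np.zeros_like(A), sigma_x)
print(ll_true, ll_zero)
```

Any of the samplers would evaluate a likelihood of this form when updating entries of Z; the comparison against an all-zero weight matrix is only a sanity check.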

Page 17: Stick-breaking Construction for the Indian Buffet Process

Demonstration

Apply semi-ordered slice sampler to 1000 examples of handwritten images of 3’s in the MNIST dataset.

Page 18: Stick-breaking Construction for the Indian Buffet Process
Page 19: Stick-breaking Construction for the Indian Buffet Process

Conclusion

The authors derived novel stick-breaking representations of the Indian buffet process.

Based on these representations, new MCMC samplers are proposed that are easy to implement and work on more general models than Gibbs sampling.