
Discussion of Faming Liang's talk



Discussion at Adap'skiii, Park City, Utah, Jan. 3-4 2011


Page 1: Discussion of Faming Liang's talk

Adaptive MCMH for Bayesian inference of spatial autologistic models

discussion by Pierre E. Jacob & Christian P. Robert

Université Paris-Dauphine and CREST
http://statisfaction.wordpress.com

Adap’skiii, Park City, Utah, Jan. 03, 2011

discussion by Pierre E. Jacob & Christian P. Robert, Adaptive MCMH for Bayesian inference of spatial autologistic models, 1/7

Page 2: Discussion of Faming Liang's talk

Context: intractable likelihoods

Cases when the likelihood function f(y|θ) is unavailable and when the completion step

f(y|θ) = ∫_Z f(y, z|θ) dz

is impossible or too costly because of the dimension of z:
MCMC cannot be implemented!

⇒ adaptive MCMH



Page 5: Discussion of Faming Liang's talk

Illustrations

Potts model: if y takes values on a grid Y of size k^n and

f(y|θ) ∝ exp{ θ ∑_{l∼i} I(y_l = y_i) }

where l∼i denotes a neighbourhood relation, a moderately large n prohibits the computation of the normalising constant
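To make the intractability concrete, here is a minimal sketch (ours, not from the slides) of the binary Potts/autologistic unnormalised likelihood on a tiny grid, with the normalising constant computed by brute-force enumeration. The grid side `K` and interaction value `THETA` are hypothetical choices; the point is that the sum defining Z(θ) has k^n terms, i.e. 2^9 = 512 here but already about 2^4096 for a modest 64×64 binary grid.

```python
import itertools
import math

K = 3        # grid side; n = K*K sites (toy value)
THETA = 0.4  # interaction parameter (hypothetical value)

def neighbours(n_side):
    """Horizontal and vertical neighbour pairs l ~ i on the grid."""
    pairs = []
    for r in range(n_side):
        for c in range(n_side):
            i = r * n_side + c
            if c + 1 < n_side:
                pairs.append((i, i + 1))      # right neighbour
            if r + 1 < n_side:
                pairs.append((i, i + n_side))  # bottom neighbour
    return pairs

PAIRS = neighbours(K)

def unnorm_f(y, theta):
    """g(y|theta) = exp{ theta * sum_{l~i} I(y_l = y_i) }, no normalisation."""
    s = sum(1 for (l, i) in PAIRS if y[l] == y[i])
    return math.exp(theta * s)

# Z(theta) requires summing over every configuration: feasible only for toys.
configs = list(itertools.product([0, 1], repeat=K * K))
Z = sum(unnorm_f(y, THETA) for y in configs)
```

Every term of the sum is at least 1 (and larger whenever neighbours agree), so even evaluating Z to fixed precision cannot avoid visiting the whole configuration space.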


Page 6: Discussion of Faming Liang's talk

Compare ABC with adaptive MCMH?

ABC algorithm

For an observation y ∼ f(y|θ), under the prior π(θ), keep jointly simulating

θ′ ∼ π(θ), z ∼ f(z|θ′),

until the auxiliary variable z is equal to the observed value, z = y.

[Tavaré et al., 1997]

ABC can also be applied when the normalizing constant is unknown, hence it could be compared to adaptive MCMH, e.g. for a given number of y's drawn from f, or in terms of execution time.
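The exact-match ABC step above can be sketched in a few lines. The model is a hypothetical stand-in of our own (y|θ ∼ Binomial(20, θ) with a Uniform(0, 1) prior), chosen because the exact-match event z = y has non-negligible probability and the true posterior, Beta(13, 9), is known for checking.

```python
import random

random.seed(1)

N_TRIALS = 20  # binomial sample size (hypothetical toy model)
y_obs = 12     # observed count

def abc_sample():
    """Simulate (theta', z) jointly until z equals the observed y."""
    while True:
        theta = random.random()  # theta' ~ pi(theta) = Uniform(0, 1)
        # z ~ f(z|theta'): count of successes in N_TRIALS Bernoulli draws
        z = sum(random.random() < theta for _ in range(N_TRIALS))
        if z == y_obs:           # accept only on exact match z = y
            return theta

posterior_draws = [abc_sample() for _ in range(500)]
post_mean = sum(posterior_draws) / len(posterior_draws)
```

With this prior and likelihood the accepted θ's are exact posterior draws; the empirical mean should sit near the Beta(13, 9) mean 13/22 ≈ 0.59.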


Page 7: Discussion of Faming Liang's talk

Some questions on the estimator R(θt, ϑ)

R(\theta_t, \vartheta) =
\frac{1}{m_0 + m_0 \sum_{\theta_i \in S_{tr}\{\theta_t\}} I(\|\theta_i - \vartheta\| \le \eta)}
\times \left[ \sum_{\theta_i \in S_{tr}\{\theta_t\}} I(\|\theta_i - \vartheta\| \le \eta)
\sum_{j=1}^{m_0} \frac{g(z^{(i)}_j, \theta_t)}{g(z^{(i)}_j, \vartheta)}
+ \sum_{j=1}^{m_0} \frac{g(z^{(t)}_j, \theta_t)}{g(z^{(t)}_j, \vartheta)} \right]

Why does it bypass the number of repetitions of the θ_i's?

Why resample the y_j^{(i)}'s only to inverse-weight them later?

Why not stick to the original sample and weight the y_j^{(i)}'s directly? This would avoid the necessity to choose m_0.
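The ratios g(z, θ_t)/g(z, ϑ) averaged in the estimator rest on a standard importance-sampling identity: if z ∼ g(·, θ)/κ(θ), then E[g(z, ϑ′)/g(z, θ)] = κ(ϑ′)/κ(θ), a ratio of normalising constants. The sketch below (our own illustration, not Liang's code) checks this identity on a hypothetical unnormalised Gaussian family g(z, θ) = exp(−(z − θ)²/2), where κ(θ) = √(2π) for every θ, so the true ratio is exactly 1.

```python
import math
import random

random.seed(0)

theta_t, vartheta = 0.0, 0.5  # two parameter values (hypothetical)

def g(z, theta):
    """Unnormalised density: exp(-(z - theta)^2 / 2)."""
    return math.exp(-0.5 * (z - theta) ** 2)

m = 100_000
# z_j ~ g(.|theta_t)/kappa(theta_t), i.e. N(theta_t, 1)
draws = [random.gauss(theta_t, 1.0) for _ in range(m)]
# Monte Carlo average of the importance ratios
ratio_hat = sum(g(z, vartheta) / g(z, theta_t) for z in draws) / m
```

Note the sketch uses the ratio g(z, ϑ)/g(z, θ_t) under samples at θ_t purely to exhibit the identity; how the talk's estimator combines such ratios across the stored samples S_tr{θ_t} is exactly what the questions above are probing.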



Page 9: Discussion of Faming Liang's talk

Combine adaptations?

Adaptive MCMH looks great, but it adds new tuning parameters compared to standard MCMC!

It would be nice to be able to adapt the proposal Q(θt, ϑ), e.g. to control the acceptance rate, as in other adaptive MCMC algorithms.

Could η then be adaptive as well, since it also depends on the (unknown) scale of the posterior distribution of θ?

How would the current theoretical framework cope with these additional adaptations?

In the same spirit, could m (and m0) be considered as algorithmic parameters and be adapted as well? What criterion would be used to adapt them?
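The kind of proposal adaptation mentioned above is standard in adaptive MCMC: tune the random-walk scale towards a target acceptance rate with a diminishing Robbins-Monro step. A minimal sketch (ours, on a plain N(0, 1) target rather than the talk's model; the 0.44 target rate and step size t^{-0.6} are conventional choices, not prescriptions):

```python
import math
import random

random.seed(2)

TARGET_ACC = 0.44           # common target rate for 1-d random-walk MH
log_scale = math.log(10.0)  # deliberately bad initial proposal scale
x = 0.0
accepts = []

def log_target(x):
    return -0.5 * x * x  # standard normal target, up to a constant

for t in range(1, 5001):
    # random-walk proposal with the current adapted scale
    prop = x + math.exp(log_scale) * random.gauss(0.0, 1.0)
    accept = math.log(random.random()) < log_target(prop) - log_target(x)
    if accept:
        x = prop
    accepts.append(1.0 if accept else 0.0)
    # Robbins-Monro update; gamma_t = t^{-0.6} -> 0 (diminishing adaptation)
    log_scale += t ** -0.6 * ((1.0 if accept else 0.0) - TARGET_ACC)

final_rate = sum(accepts[-1000:]) / 1000.0  # recent acceptance rate
```

Because the adaptation step vanishes, standard diminishing-adaptation arguments apply; the open question in the slide is whether the MCMH framework can absorb the same trick for Q, η, m and m0 simultaneously.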



Page 13: Discussion of Faming Liang's talk

Sample size and acceptance rate

Another MH algorithm where the “true” acceptance ratio is replaced by a ratio involving an unbiased estimator is the Particle MCMC (PMCMC) algorithm.

In PMCMC, the acceptance rate grows with the number of particles used to estimate the likelihood, i.e. when the estimation is more precise, the acceptance rate increases and converges towards the acceptance rate of an idealized standard MH algorithm.

In adaptive MCMH, is there such a link between m, m0 and the number of iterations on one side, and the acceptance rate of the algorithm on the other? Should the rate grow as the estimation becomes more and more precise?
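The mechanism behind the PMCMC point is that averaging more particles shrinks the noise in the log-likelihood estimate, and it is this noise that depresses the pseudo-marginal acceptance rate below the ideal one. A toy check (our own latent-Gaussian example, not Liang's model): p(y|θ) = E_{z∼N(θ,1)}[N(y; z, 1)] is estimated unbiasedly by averaging particle weights, and the spread of the log-estimate drops as the particle count grows.

```python
import math
import random

random.seed(3)

theta, y = 0.0, 1.0  # parameter and observation (hypothetical values)

def normal_pdf(u, mean, sd):
    return math.exp(-0.5 * ((u - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def loglik_hat(n_particles):
    """Unbiased likelihood estimate: average weight over particles z_i ~ N(theta, 1)."""
    z = [random.gauss(theta, 1.0) for _ in range(n_particles)]
    return math.log(sum(normal_pdf(y, zi, 1.0) for zi in z) / n_particles)

def spread(n_particles, reps=300):
    """Empirical standard deviation of the log-likelihood estimator."""
    vals = [loglik_hat(n_particles) for _ in range(reps)]
    m = sum(vals) / reps
    return math.sqrt(sum((v - m) ** 2 for v in vals) / reps)

sd_small, sd_large = spread(10), spread(1000)  # noise at 10 vs 1000 particles
```

The analogous question for MCMH is whether increasing m, m0 (or simply running longer, so the stored samples accumulate) plays the same role as adding particles here.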

