22
Gibbs sampler - simple Gibbs sampler - simple properties properties It’s not hard to show that this MC chain is aperiodic. Often is reversible distribution. If in addition the chain is irreducible (which depends on which elements of have non-zero probability), then this MC is a correct MCMC algorithm for simulating random variables with distribution .

Gibbs sampler - simple properties It’s not hard to show that this MC chain is aperiodic. Often is reversible distribution. If in addition the chain is

Embed Size (px)

Citation preview

Gibbs sampler - simple Gibbs sampler - simple propertiespropertiesIt’s not hard to show that this MC

chain is aperiodic.Often is reversible distribution.If in addition the chain is

irreducible (which depends on which elements of have non-zero probability), then this MC is a correct MCMC algorithm for simulating random variables with distribution .

Systematic Gibbs samplerSystematic Gibbs samplerA commonly used variant of the

Gibbs sampler is the MC obtained by systematic cycle trough the vertex set.

For instance, for we may decide to update vertex:

Back to the Boolean cubeBack to the Boolean cubeIn this case: , .In each step we pick and flip

the appropriate coordinate of with probability .

Stationary distribution for the Stationary distribution for the Boolean CubeBoolean Cube

We will show that the uniform distribution is reversible (and therefore stationary).

Denote by the transition probability from state to .

We need to show that: for any two states.

Denote by the number of indexes in which and differ:◦Case no.1: ◦Case no.2: ◦Case no.3:

Reminder - Coupling Reminder - Coupling For a given MC with state

space and transitions matrix , a coupling for the MC is a MC on the state space with transition probabilities defined by:

Reminder (cont.) - The coupling Reminder (cont.) - The coupling lemmalemma

For a coupling , based on a ground MC on . Suppose is a function satisfying the condition: for all     and all :

then the mixing time for is bounded by .

Fast convergence of the Fast convergence of the Gibbs sampler for the Gibbs sampler for the Boolean cube Boolean cube Coupling: we run two Markov chains and simultaneously.

We go threw the coordinates systematically.

If at some integer time it holds that:

We flip both of them with probability Otherwise we flip each one separately

with same probability.

Fast convergence of the Fast convergence of the Gibbs sampler for the Gibbs sampler for the Boolean cube Boolean cube For any coordinate : after one

cycle threw the coordinates: After cycles threw the chain:

The event implies that for at least one coordinate : meaning that:

Fast convergence of the Fast convergence of the Gibbs sampler for the Gibbs sampler for the Boolean cube Boolean cube Setting and solving for

gives:  Now, using the coupling lemma,

it means that the mixing time is bounded by:

Gibbs sampler for q-Gibbs sampler for q-coloringscoloringsFor a vertex and an assignment of

colors to the vertices other than , the conditional distribution of the color at is uniform over the set of all colors that are not attained in at some neighbor of .

A Gibbs Sampler for random q-colorings is therefore an -valued MC where at each time , transitions take place as follows: ◦Pick a vertex uniformly at random.◦Pick according to the uniform

distribution over the set of colors that are not attained at any neighbor of .

◦Leave the color unchanged at all other vertices, i.e. let for all vertices except .

Gibbs sampler for q-Gibbs sampler for q-coloringscoloringsIt has as a stationary

distribution, the proof is similar to the hard-core model case, based on the fact that it is reversible.

Whether or not the chain is irreducible depends on and , and it is a nontrivial question to generally determine this. But for big enough ( is always enough) it is.

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloringTheorem [HAG 8.1]: Let be a

graph. Let be the number of vertices in , and suppose any vertex has at most neighbors. Suppose furthermore that Then, for any fixed , the number of iterations needed for the systematic sweep Gibbs sampler described above (starting from any fixed -coloring ) to come within total variation distance of the target distribution is at most:

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloring

Comments:◦Can be proved for (rather than

).◦The important bound is:

for some .

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloringCoupling: we run two -valued

Markov chains and simultaneously.

When a vertex is chosen to be updated, we pick a new color for it uniformly at random among the set of colors that are not attained by any neighbor of . Concretely:

Let be an i.i.d sequence of random permutations of , each of them uniformly distributed on the set the of such permutations.

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloring (cont.)sampler for q-coloring (cont.)At each time , the updates of the

2 chains use the permutation and the vertex to be updated is assigned the new value where:

in the first chain. In the second chain, we set

where:

Successful and not-successful Successful and not-successful updatesupdatesIf at some (random, hopefully

not too large) time , we will also have for all

First consider the probability that: for a given vertex .

We call an update of the 2 chains at time successful if it results in having:

Otherwise – we say that the update failed.

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloringNotations:

The event of having successful update has probability:

Respectively: Using that and that

we get: (we only used and some algebra).

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloringHence, we have, after steps of the

MC-s, that for each vertex : Now, consider updates during the 2nd

sweep of the Gibbs sampler, i.e. between times and .

For an update at time during the 2nd sweep to fail the configurations and need to differ in at least one neighbor of .

Each neighbor has with probability at most

By summing over at most neighbors we get that:

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloring

“discrepancy” means that there exists a neighbor of with .

By Repeating the arguments above we get:

Hence, after steps of the Markov chains, each vertex has probability at most:

By arguing the same for the third sweep:

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloringand for any :

The event implies thatfor at least one vertex , meaning that:

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloringSummary: By setting and solving

for , we get by using the coupling lemma that for:

It holds that the mixing time after scans is bounded by .

Fast convergence of Gibbs Fast convergence of Gibbs sampler for q-coloringsampler for q-coloring

To go from the number of scans to the number of steps of the MC, we have to multiply by , giving that: should be enough.

In order to make sure that gets an integer value we should take to be: