Replica exchange MCMC with Stan and R Kentaro Matsuura June 4, 2016 Tokyo.Stan

Page 1: Replica exchange MCMC

Replica exchange MCMC

with Stan and R

Kentaro Matsuura

June 4, 2016

Tokyo.Stan

Page 2: Replica exchange MCMC

Example data

• N = 50 # number of data points


Page 3: Replica exchange MCMC

Model (Assumptions)

• $Y[n] \sim \mathrm{Normal}(\sin(\beta X[n]), \sigma_Y)$, $n = 1, \ldots, N$

• angular rate: $\beta \sim \mathrm{Normal}^{+}(0, 50)$

• noise scale: $\sigma_Y \sim \mathrm{Student\_t}^{+}(4, 0, 5)$
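A minimal Python sketch of the energy for this model, assuming E is the negative log-posterior up to an additive constant. This is not the Stan code from the slides. The name `energy` and the toy arrays `x`, `y` are hypothetical and are not the example data.

```python
import math

def energy(b, s_y, x, y):
    """E = -(log-likelihood + log-priors), dropping additive constants."""
    if b <= 0.0 or s_y <= 0.0:
        return math.inf  # both priors are restricted to positive values
    # Y[n] ~ Normal(sin(b * X[n]), s_y)
    log_lik = sum(
        -math.log(s_y) - 0.5 * ((yn - math.sin(b * xn)) / s_y) ** 2
        for xn, yn in zip(x, y)
    )
    # b ~ Normal+(0, 50)
    log_prior_b = -0.5 * (b / 50.0) ** 2
    # s_y ~ Student_t+(4, 0, 5)
    nu, scale = 4.0, 5.0
    log_prior_s = -0.5 * (nu + 1.0) * math.log1p((s_y / scale) ** 2 / nu)
    return -(log_lik + log_prior_b + log_prior_s)

# Hypothetical noise-free toy data (NOT the data from the slides).
x = [0.2 * i for i in range(50)]
y = [math.sin(0.5 * xi) for xi in x]
```

With this toy data, `energy(0.5, 0.3, x, y)` is lower than `energy(10.0, 0.3, x, y)`, since b = 0.5 reproduces `y` exactly.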

Page 4: Replica exchange MCMC

Stan Code

Page 5: Replica exchange MCMC

R Code

Page 6: Replica exchange MCMC

Result | trace plot

This chain seems good.

Should I set initial values around (b, s_y) = (24, 0.3)?

Page 7: Replica exchange MCMC

By the way

A contour map of the log-posterior can be visualized, because this model has just two parameters.

For later explanation, I visualized $E/T$ ($\approx -$log-posterior).

$E$: Energy, $T$: Temperature

A higher log-posterior means a lower $E/T$.

The previous model corresponds to $T = 1$.

Page 8: Replica exchange MCMC

Contour map of $E/T$ (at $T = 1$)

Page 9: Replica exchange MCMC

Contour map of $E/T$ (at $T = 1$)

around (24, 0.3): $E \approx 33$

around (0.5, 0.3): $E \approx 28$
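As a rough worked example using only the two energies quoted above, the Python sketch below compares the unnormalized weight exp(−E/T) at the two points for different temperatures. The name `weight_ratio` is hypothetical. The sketch ignores the width of each basin and the barrier between them, so it only hints at why a higher temperature flattens the landscape.

```python
import math

E_AT_24 = 33.0   # E around (24, 0.3), as read from the contour map
E_AT_05 = 28.0   # E around (0.5, 0.3), as read from the contour map

def weight_ratio(T):
    """exp(-E/T) at (24, 0.3) divided by exp(-E/T) at (0.5, 0.3)."""
    return math.exp(-(E_AT_24 - E_AT_05) / T)

for T in (1.0, 50.0):
    print(f"T = {T:5.1f}: weight ratio = {weight_ratio(T):.4f}")
```

At T = 1 the ratio is exp(−5), so the point near (24, 0.3) carries far less weight. At T = 50 the ratio is exp(−0.1), so the two points are weighted almost equally.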

Page 10: Replica exchange MCMC


Replica exchange MCMC (also known as parallel tempering)

Setting initial values is generally difficult.

Page 11: Replica exchange MCMC

References

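• Y. Iba, "Extended Ensemble Monte Carlo," Int. J. Mod. Phys. C, 12, pp. 623-652, 2001.

• Movie by Y. Iba (in Japanese):

レプリカ交換MCMC講義 (伊庭幸人) 難易度★★ [Replica exchange MCMC lecture (Yukito Iba), difficulty ★★] https://www.youtube.com/watch?v=1c7mQIhEqmQ

• Books (in Japanese)

– 福島孝治 (2006) サイコロふって積分する方法, 確率的情報処理と統計力学 (SGCライブラリ 50), pp. 60-66. [K. Hukushima, "How to integrate by rolling dice," in Probabilistic Information Processing and Statistical Mechanics (SGC Library 50)]

– 伊庭ほか (2005) 計算統計II (統計科学のフロンティア 12), pp. 74-78. [Iba et al., Computational Statistics II (Frontiers of Statistical Science 12)]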

Page 12: Replica exchange MCMC

Overview

[Figure: $N_{replica} = 5$ replicas at temperatures $T_5$, $T_4$, $T_3$, $T_2$, $T_1 = 1$, drawn against the Monte Carlo step]

Page 13: Replica exchange MCMC

Annealing and Heating

[Figure: replicas at $T_{high}$, $T_{middle}$ and $T_1 = 1$]

Page 14: Replica exchange MCMC

Sampling from joint distribution

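• Theoretically, replica exchange MCMC samples from the following joint distribution:

$$p(\boldsymbol{\theta}_1, \ldots, \boldsymbol{\theta}_{N_{replica}}) = \prod_{r=1}^{N_{replica}} p(\boldsymbol{\theta}_r \mid T = T_r)$$

where $p(\boldsymbol{\theta}_r \mid T = T_r) \propto \exp\left(-\frac{E(\boldsymbol{\theta}_r)}{T_r}\right)$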

Page 15: Replica exchange MCMC

Exchange probability of replicas

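$$p = \min\left(1, \frac{\exp\left(-\frac{E_{r'}}{T_r} - \frac{E_r}{T_{r'}}\right)}{\exp\left(-\frac{E_r}{T_r} - \frac{E_{r'}}{T_{r'}}\right)}\right) = \min\left(1, \exp\left(\left(E_r - E_{r'}\right) \times \left(\frac{1}{T_r} - \frac{1}{T_{r'}}\right)\right)\right)$$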
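The exchange probability can be sketched as a small Python function, computed on the log scale so that large energy differences do not overflow. The name `exchange_probability` and its argument names are hypothetical. The energies and temperatures in the usage lines reuse values that appear elsewhere in these slides, purely as an illustration.

```python
import math

def exchange_probability(E_r, E_r2, T_r, T_r2):
    """p = min(1, exp((E_r - E_r') * (1/T_r - 1/T_r')))."""
    log_ratio = (E_r - E_r2) * (1.0 / T_r - 1.0 / T_r2)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)

# The colder replica (T = 1) has the higher energy: the swap is always accepted.
p_always = exchange_probability(33.0, 28.0, 1.0, 1.54)

# The colder replica has the lower energy: the swap is accepted only sometimes.
p_sometimes = exchange_probability(28.0, 33.0, 1.0, 1.54)
```

A swap that moves the lower-energy state to the lower temperature is always accepted. The reverse swap is accepted with probability below 1, which keeps the joint distribution invariant.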

Page 16: Replica exchange MCMC

Pseudo Code

set T of replicas: T = (1, T2, T3, ..., TN_replica)
set initial values: x = (x0, x0, ..., x0)

for e in 1, ..., N_exchange
  for r in 1, ..., N_replica
    short sampling started from x[r] at each T[r]
  end
  save MCMC samples at T = 1

  exchange replicas if unif(0,1) < p

  for r in 1, ..., N_replica
    update x[r]
  end
end
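The loop above can be sketched in plain Python under some loudly stated assumptions: the short sampling is a random-walk Metropolis sampler rather than Stan, the one-dimensional double-well `energy` is a hypothetical toy and not the model from these slides, and swaps are attempted between adjacent temperatures on alternating pairs, a choice the pseudo code does not specify.

```python
import math
import random

def energy(x):
    """Hypothetical double-well energy with a local minimum near x = 2."""
    return (x * x - 4.0) ** 2 / 4.0 + 0.5 * x

def short_sampling(x, T, n_steps, rng, step=0.5):
    """Random-walk Metropolis targeting exp(-E(x)/T); returns visited states."""
    states = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step * math.sqrt(T))
        log_accept = -(energy(proposal) - energy(x)) / T
        if math.log(1.0 - rng.random()) < log_accept:
            x = proposal
        states.append(x)
    return states

def replica_exchange(T, x0, n_exchange, n_steps, seed=1):
    rng = random.Random(seed)
    x = [x0] * len(T)                      # every replica starts from x0
    samples, n_swaps, n_tries = [], 0, 0
    for e in range(n_exchange):
        for r in range(len(T)):
            states = short_sampling(x[r], T[r], n_steps, rng)
            x[r] = states[-1]              # update x[r] with the last state
            if r == 0:
                samples.extend(states)     # save samples at T = 1 only
        # Assumption: try adjacent pairs, alternating even and odd pairs.
        for r in range(e % 2, len(T) - 1, 2):
            log_ratio = (energy(x[r]) - energy(x[r + 1])) * (
                1.0 / T[r] - 1.0 / T[r + 1]
            )
            n_tries += 1
            if math.log(1.0 - rng.random()) < log_ratio:  # unif(0,1) < p
                x[r], x[r + 1] = x[r + 1], x[r]
                n_swaps += 1
    return samples, n_swaps, n_tries

samples, n_swaps, n_tries = replica_exchange([1.0, 2.0, 4.0, 8.0], 2.0, 200, 20)
```

Here `samples` holds only the draws made at T = 1, and `n_swaps / n_tries` gives a rough exchange rate for checking whether neighbouring temperatures are close enough.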

Page 17: Replica exchange MCMC

Stan Code

Page 18: Replica exchange MCMC

R Code and Settings

• R Code: https://gist.github.com/MatsuuraKentaro/bccf13af3ba52c9d6c379c0032725b91

• $N_{replica} = 10$ ($N_{replica}$ needs ≈ $N$)

• $N_{exchange} = 100$

• $T = 1, 1.54, \ldots, 32.37, 50$ (geometric progression)

• $x_0 = (b, \sigma_Y)_0 = (24.2, 0.4)$ (deep local minimum ☠)

• iter=70 and warmup=50 for each short sampling (i.e. 20 MCMC samples)
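A geometric progression of temperatures between 1 and 50 with 10 replicas can be computed as in the Python sketch below. The function name `geometric_temperatures` is hypothetical. The code is a reconstruction from the listed values, not the R code from the gist.

```python
def geometric_temperatures(t_max, n_replica):
    """Temperatures 1 = T_1 < ... < T_n = t_max with a constant ratio."""
    ratio = t_max ** (1.0 / (n_replica - 1))
    return [ratio ** r for r in range(n_replica)]

T = geometric_temperatures(50.0, 10)
print([round(t, 2) for t in T])
```

The common ratio is 50^(1/9), roughly 1.54, so the second and ninth temperatures round to the 1.54 and 32.37 listed in the settings.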

Page 19: Replica exchange MCMC

Result | Exchanges of replicas

Page 20: Replica exchange MCMC

Result | $E$ (Energy)

Page 21: Replica exchange MCMC

Result | $E$ (Energy)

Page 22: Replica exchange MCMC

Result | trace plot (at $T = 1$)

Page 23: Replica exchange MCMC

Discussion

• Can I re-use the warmup result in Stan?

i.e. the step size and the diagonal elements of the inverse mass matrix

• How should iter and warmup be set for each short sampling?

warmup=50 is usually too short for a whole sampling run.

But the total warmup is 50*100 in this case.

Does it make sense?