Replica exchange MCMC
with Stan and R
Kentaro Matsuura
June 4, 2016
Tokyo.Stan
Example data
• N = 50 # number of data points
Model (Assumptions)
• $Y[n] \sim \mathrm{Normal}(\sin(\beta X[n]),\ \sigma_Y)$, $n = 1, \dots, N$
• angular rate: $\beta \sim \mathrm{Normal}^{+}(0, 50)$
• noise scale: $\sigma_Y \sim \mathrm{Student\_t}^{+}(4, 0, 5)$
Stan Code
R Code
Result | trace plot
This chain seems good.
Should I set initial values around (b, s_y) = (24, 0.3)?
By the way
Because this model has just two parameters, a contour map of the log-posterior can be visualized.
For later explanation, I visualized $E/T$ ($\approx -\text{log-posterior}$).
$E$: energy, $T$: temperature
higher log-posterior ⇔ lower $E$
The previous model corresponds to $T = 1$.
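The energy can be written out directly for this two-parameter model. The following is a minimal Python sketch, assuming that E is the negative log-posterior up to an additive constant; the function name `energy`, the simulated data, and the range of X are hypothetical and are not the data used in the talk.

```python
import math
import random

def energy(b, s_y, x, y):
    """Negative log-posterior of the sine model, up to an additive constant."""
    if b < 0 or s_y <= 0:
        return math.inf  # outside the support of the half-distributions
    # Likelihood: Y[n] ~ Normal(sin(b * X[n]), s_y)
    log_lik = sum(
        -math.log(s_y) - 0.5 * ((yn - math.sin(b * xn)) / s_y) ** 2
        for xn, yn in zip(x, y)
    )
    # Prior: b ~ Normal+(0, 50)
    log_prior_b = -0.5 * (b / 50.0) ** 2
    # Prior: s_y ~ Student_t+(4, 0, 5)
    nu, scale = 4.0, 5.0
    log_prior_s = -0.5 * (nu + 1.0) * math.log1p((s_y / scale) ** 2 / nu)
    return -(log_lik + log_prior_b + log_prior_s)

# Hypothetical data (assumption): N = 50 points simulated with b = 0.5, s_y = 0.3
rng = random.Random(1)
N = 50
x = [rng.uniform(0.0, 10.0) for _ in range(N)]
y = [math.sin(0.5 * xn) + rng.gauss(0.0, 0.3) for xn in x]

e_true = energy(0.5, 0.3, x, y)   # energy at the generating parameters
e_far = energy(24.0, 0.3, x, y)   # energy far from the generating parameters
print(e_true, e_far)
```

Dividing the returned value by T gives the tempered quantity E/T; T = 1 recovers the ordinary posterior.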
Contour map of $E/T$ (at $T = 1$)
• around (b, s_y) = (24, 0.3): $E \approx 33$
• around (b, s_y) = (0.5, 0.3): $E \approx 28$
Replica exchange MCMC (also known as parallel tempering)
Setting initial values is generally difficult.
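References
• Y. Iba, "Extended Ensemble Monte Carlo," Int. J. Mod. Phys. C, 12, pp. 623-652, 2001.
• Movie by Y. Iba (in Japanese): レプリカ交換MCMC講義 (伊庭幸人) 難易度★★ [Lecture on replica exchange MCMC (Yukito Iba), difficulty ★★] https://www.youtube.com/watch?v=1c7mQIhEqmQ
• Books (in Japanese)
– 福島孝治 (2006) サイコロふって積分する方法, 確率的情報処理と統計力学 (SGCライブラリ 50), pp. 60-66. [K. Hukushima, "How to integrate by rolling dice," in Probabilistic Information Processing and Statistical Mechanics (SGC Library 50)]
– 伊庭ほか (2005) 計算統計II (統計科学のフロンティア 12), pp. 74-78. [Iba et al., Computational Statistics II (Frontiers of Statistical Science 12)]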
Overview
[Figure: $N_{replica} = 5$ replicas at temperatures $T_1 = 1, T_2, \dots, T_5$, drawn against the Monte Carlo step]
Annealing and Heating
[Figure: temperature levels $T_{high}$, $T_{middle}$ and $T_1 = 1$]
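Sampling from joint distribution
• Theoretically, replica exchange MCMC samples from the following joint distribution:
$$p(\boldsymbol{\theta}_1, \dots, \boldsymbol{\theta}_{N_{replica}}) = \prod_{r=1}^{N_{replica}} p(\boldsymbol{\theta}_r \mid T = T_r)$$
where $p(\boldsymbol{\theta}_r \mid T = T_r) \propto \exp\left(-\frac{E(\boldsymbol{\theta}_r)}{T_r}\right)$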
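To see what the factor exp(−E/T) does at different temperatures, the sketch below compares the density heights at the two regions marked on the contour map, using the approximate energies quoted there (E ≈ 33 and E ≈ 28). The function name `relative_weight` is hypothetical. It compares density values at two points only and says nothing about the energy barrier between them.

```python
import math

def relative_weight(e_a, e_b, temperature):
    """Ratio p(a) / p(b) of tempered densities p ∝ exp(-E / T) at one temperature."""
    return math.exp(-(e_a - e_b) / temperature)

# Approximate energies taken from the contour map annotations
e_local, e_global = 33.0, 28.0

for t in (1.0, 50.0):
    # At T = 1 the higher-energy region is strongly down-weighted;
    # at T = 50 the two regions have nearly equal density.
    print(t, relative_weight(e_local, e_global, t))
```

At high temperature the tempered posterior is nearly flat, which is what allows hot replicas to move between regions that are effectively disconnected at T = 1.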
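Exchange probability of replicas
$$p = \min\left(1,\ \frac{\exp\left(-\frac{E_{r'}}{T_r} - \frac{E_r}{T_{r'}}\right)}{\exp\left(-\frac{E_r}{T_r} - \frac{E_{r'}}{T_{r'}}\right)}\right) = \min\left(1,\ \exp\left(\left(E_r - E_{r'}\right) \times \left(\frac{1}{T_r} - \frac{1}{T_{r'}}\right)\right)\right)$$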
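The exchange probability is straightforward to compute on the log scale, which avoids overflow in the exponentials. The following is a small sketch; the function name `exchange_probability` and the example energies and temperatures are chosen for illustration only.

```python
import math

def exchange_probability(e_r, e_rp, t_r, t_rp):
    """p = min(1, exp((E_r - E_r') * (1/T_r - 1/T_r')))."""
    log_ratio = (e_r - e_rp) * (1.0 / t_r - 1.0 / t_rp)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)

# Cold replica (T = 1) has the higher energy: the swap is always accepted.
p_down = exchange_probability(e_r=33.0, e_rp=28.0, t_r=1.0, t_rp=1.54)
# Cold replica already has the lower energy: the swap is accepted only sometimes.
p_up = exchange_probability(e_r=28.0, e_rp=33.0, t_r=1.0, t_rp=1.54)
print(p_down, p_up)
```

A swap that moves the lower-energy state to the lower temperature is always accepted, so good states found by hot replicas tend to drift toward T = 1.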
Pseudo Code
set T of replicas: T = (1, T2, T3, ..., TN_replica)
set initial values: x = (x0, x0, ..., x0)
for e in 1, ..., N_exchange
  for r in 1, ..., N_replica
    short sampling started from x[r] at each T[r]
  end
  save MCMC samples at T = 1
  exchange replicas if unif(0,1) < p
  for r in 1, ..., N_replica
    update x[r]
  end
end
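The pseudo code can be turned into a small runnable sketch. This is not the Stan/R implementation from the talk: the short sampling step is replaced by a random-walk Metropolis sampler, the energy is a hypothetical 1-D double well (local minimum near x = 2, global minimum near x = −2), and swaps are proposed between neighbouring temperatures in alternating pairs, which is one common choice and an assumption here.

```python
import math
import random

def energy(x):
    # Hypothetical double-well energy: E(2) = 2, E(-2) = -2, barrier E(0) = 16
    return (x * x - 4.0) ** 2 + x

def short_sampling(x, temperature, n_steps, rng, step=0.5):
    """Random-walk Metropolis targeting exp(-E/T); stands in for a short Stan run."""
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        log_accept = -(energy(proposal) - energy(x)) / temperature
        if math.log(1.0 - rng.random()) < log_accept:
            x = proposal
        samples.append(x)
    return x, samples

def replica_exchange(x0, temperatures, n_exchange, n_steps, seed=0):
    rng = random.Random(seed)
    x = [x0] * len(temperatures)          # every replica starts from x0
    cold_samples = []
    for e in range(n_exchange):
        for r, t in enumerate(temperatures):
            x[r], samples = short_sampling(x[r], t, n_steps, rng)
            if r == 0:
                cold_samples.extend(samples)  # save MCMC samples at T = 1
        # Propose swaps between neighbouring replicas, alternating even/odd pairs
        for r in range(e % 2, len(temperatures) - 1, 2):
            log_p = (energy(x[r]) - energy(x[r + 1])) * (
                1.0 / temperatures[r] - 1.0 / temperatures[r + 1]
            )
            if math.log(1.0 - rng.random()) < log_p:  # i.e. unif(0,1) < p
                x[r], x[r + 1] = x[r + 1], x[r]
    return cold_samples

temperatures = [50.0 ** (r / 9) for r in range(10)]  # 1, ..., 50
cold = replica_exchange(x0=2.0, temperatures=temperatures, n_exchange=100, n_steps=20)
print(len(cold), min(cold), max(cold))
```

Starting every replica in the local minimum mirrors the setting in the talk. A single chain at T = 1 would rarely cross the barrier, whereas with exchanges the cold chain may receive states from the other well via the hotter replicas.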
Stan Code
R Code and Settings
• R Code: https://gist.github.com/MatsuuraKentaro/bccf13af3ba52c9d6c379c0032725b91
• $N_{replica} = 10$ ($N_{replica}$ needs $\approx N$)
• $N_{exchange} = 100$
• $T = 1, 1.54, \dots, 32.37, 50$ (geometric progression)
• $x_0 = (b, \sigma_y)_0 = (24.2, 0.4)$ (deep local minimum ☠)
• iter=70 and warmup=50 for each short sampling (i.e. 20 MCMC samples)
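The temperature ladder in the settings is a geometric progression from 1 to 50 over 10 replicas, so the common ratio is 50^(1/9) ≈ 1.54. A short sketch that reproduces it follows; the function name `geometric_temperatures` is hypothetical and is not taken from the linked gist.

```python
def geometric_temperatures(t_max, n_replica):
    """Temperatures 1 = T_1 < ... < T_n = t_max with a constant ratio T_{r+1} / T_r."""
    ratio = t_max ** (1.0 / (n_replica - 1))
    return [ratio ** r for r in range(n_replica)]

temperatures = geometric_temperatures(50.0, 10)
print([round(t, 2) for t in temperatures])
```

The second and ninth entries round to 1.54 and 32.37, matching the values listed in the settings.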
Result | Exchanges of replicas
Result | $E$ (Energy)
Result | trace plot (at $T = 1$)
Discussion
• Can I re-use the warmup result in Stan, i.e. the step size and the diagonal elements of the inverse mass matrix?
• How should iter and warmup be set for each short sampling?
warmup=50 is usually too short for a whole sampling run, but the total warmup is 50*100 in this case.
Does it make sense?