Outline
Gibbs Sampling Advances in Gibbs sampling
Blocking Cutset sampling (Rao-Blackwellisation)
Importance Sampling Advances in Importance Sampling Particle Filtering
Importance Sampling Theory
Let Z = X \ E (to simplify notation). Then

P(E=e) = Σ_{z ∈ Z} P(Z=z, E=e) = Σ_{z ∈ Z} Π_i P(x_i | pa(X_i)), evaluated at Z=z, E=e
Importance Sampling Theory
Given a proposal distribution Q (such that P(Z=z, E=e) > 0 ⇒ Q(Z=z) > 0):

P(E=e) = Σ_{z ∈ Z} P(Z=z, E=e)
       = Σ_{z ∈ Z} [P(Z=z, E=e) / Q(Z=z)] Q(Z=z)
       = E_Q[P(Z, E=e) / Q(Z)]     (by definition of expected value: E_Q[f(Z)] = Σ_z f(z) Q(Z=z))
       = E_Q[w(Z)]

w(Z=z) = P(Z=z, E=e) / Q(Z=z) is called the importance weight.
Importance Sampling Theory
P(E=e) = E_Q[w(Z)], where w(Z=z) = P(Z=z, E=e) / Q(Z=z)

Given a set of N samples (z^1, ..., z^N) drawn from Q:

P̂(E=e) = (1/N) Σ_{i=1}^{N} P(Z=z^i, E=e) / Q(Z=z^i) = (1/N) Σ_{i=1}^{N} w(Z=z^i)

As N → ∞, P̂(E=e) → P(E=e)
Underlying principle: approximate the average over a set of numbers by the average over a set of sampled numbers.
Importance Sampling (Informally)
Express the problem as computing the average over a set of real numbers
Sample a subset of the real numbers
Approximate the true average by the sample average
True Average: average of (0.11, 0.24, 0.55, 0.77, 0.88, 0.99) = 0.59
Sample Average over 2 samples: average of (0.24, 0.77) = 0.505
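The toy calculation above can be checked in a few lines (a minimal sketch; `random.sample` picks the subset for us):

```python
import random

numbers = [0.11, 0.24, 0.55, 0.77, 0.88, 0.99]

# True average over all six numbers.
true_avg = sum(numbers) / len(numbers)      # 0.59

# Sample average over a random subset of 2 numbers.
random.seed(0)
subset = random.sample(numbers, 2)
sample_avg = sum(subset) / len(subset)      # approximates true_avg
```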
How to generate samples from Q
Express Q in product form: Q(Z)=Q(Z1)Q(Z2|Z1)….Q(Zn|Z1,..Zn-1)
Sample along the order Z1,..Zn
Example: Q(Z1)=(0.2,0.8) Q(Z2|Z1)=(0.2,0.8,0.1,0.9) Q(Z3|Z1,Z2)=Q(Z3|Z1)=(0.5,0.5,0.3,0.7)
P̂(E=e) = (1/N) Σ_{i=1}^{N} P(Z=z^i, E=e) / Q(Z=z^i)
How to sample from Q
Generate a random number r uniformly between 0 and 1.
Q(Z1)=(0.2,0.8), Q(Z2|Z1)=(0.2,0.8,0.1,0.9), Q(Z3|Z1,Z2)=Q(Z3|Z1)=(0.5,0.5,0.3,0.7)
The domain of each variable is {0,1}.
Which value to select for Z1? Split the interval [0,1] at 0.2: if r < 0.2, select Z1=0; otherwise select Z1=1.
How to sample from Q?
Each sample Z=z:
    Sample Z1=z1 from Q(Z1)
    Sample Z2=z2 from Q(Z2|Z1=z1)
    Sample Z3=z3 from Q(Z3|Z1=z1)
Generate N such samples.
Given samples (z^1, ..., z^N) drawn from Q:

P̂(E=e) = (1/N) Σ_{i=1}^{N} P(Z=z^i, E=e) / Q(Z=z^i) = (1/N) Σ_{i=1}^{N} w(Z=z^i)
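As a sketch of this procedure, the following draws samples from the example proposal Q(Z1)=(0.2,0.8), Q(Z2|Z1), Q(Z3|Z1)=Q(Z3|Z1,Z2) given earlier, using one uniform random number per variable (the table layout, rows indexed by the parent value, is our reading of the slide):

```python
import random

# Proposal CPTs from the slides; row index = parent value, column = child value.
Q_Z1 = [0.2, 0.8]                          # Q(Z1=0), Q(Z1=1)
Q_Z2_given_Z1 = [[0.2, 0.8], [0.1, 0.9]]   # Q(Z2|Z1=0), Q(Z2|Z1=1)
Q_Z3_given_Z1 = [[0.5, 0.5], [0.3, 0.7]]   # Q(Z3|Z1,Z2) = Q(Z3|Z1)

def draw(dist):
    """Sample an index from a discrete distribution with one uniform number:
    split [0,1] into segments of length dist[0], dist[1], ... and see where
    the random number lands."""
    r, acc = random.random(), 0.0
    for value, p in enumerate(dist):
        acc += p
        if r < acc:
            return value
    return len(dist) - 1   # guard against floating-point round-off

def sample_from_Q():
    """Sample (z1, z2, z3) along the order Z1, Z2, Z3."""
    z1 = draw(Q_Z1)
    z2 = draw(Q_Z2_given_Z1[z1])
    z3 = draw(Q_Z3_given_Z1[z1])
    return (z1, z2, z3)
```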
Likelihood weighting example
[Figure: Bayesian network — Smoking (S) with CPT P(S); lung Cancer (C) with P(C|S); Bronchitis (B) with P(B|S); X-ray (X) with P(X|C,S); Dyspnoea (D) with P(D|C,B)]
P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)
Query: P(X=1, B=0)? (where 1 = true and 0 = false)
P(X=1, B=0) = Σ_{S,C,D} P(S) P(C|S) P(B=0|S) P(X=1|C,S) P(D|C,B=0)
Likelihood weighting example
Q=Prior
Q(S,C,D)=Q(S)*Q(C|S)*Q(D|C,B=0)
=P(S)P(C|S)P(D|C,B=0)
Sample S=s from P(S)
Sample C=c from P(C|S=s)
Sample D=d from P(D|C=c,B=0)
P̂(E=e) = (1/N) Σ_{i=1}^{N} P(Z=z^i, E=e) / Q(Z=z^i)

w(Z=z) = P(Z=z, E=e) / Q(Z=z)
       = P(S=s, C=c, B=0, X=1, D=d) / [P(S=s) P(C=c|S=s) P(D=d|C=c, B=0)]
       = [P(S=s) P(C=c|S=s) P(B=0|S=s) P(X=1|C=c, S=s) P(D=d|C=c, B=0)] / [P(S=s) P(C=c|S=s) P(D=d|C=c, B=0)]
       = P(B=0|S=s) P(X=1|C=c, S=s)
The Algorithm
P̂(e) = 0
For k = 1 to N do
    w_k = 1
    For each X_i in topological order o = (X_1, ..., X_n) do
        If X_i ∈ E then
            w_k = w_k * P(e_i | pa_i)
        Else
            Sample x_i from P(X_i | pa_i)
            Assign X_i = x_i
    P̂(e) = P̂(e) + w_k
Return P̂(e) = P̂(e) / N
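A minimal Python sketch of likelihood weighting for the smoking/cancer network above, estimating P(X=1, B=0); the CPT numbers are illustrative assumptions, not values from the lecture:

```python
import random

# Hypothetical CPTs for the smoking/cancer network (illustrative numbers).
P_S = 0.3                                                  # P(S=1)
P_C_given_S = {1: 0.1, 0: 0.01}                            # P(C=1|S)
P_B_given_S = {1: 0.4, 0: 0.1}                             # P(B=1|S)
P_X_given_CS = {(1, 1): 0.9, (1, 0): 0.8,
                (0, 1): 0.2, (0, 0): 0.1}                  # P(X=1|C,S), key (c, s)
P_D_given_CB = {(1, 1): 0.9, (1, 0): 0.7,
                (0, 1): 0.6, (0, 0): 0.1}                  # P(D=1|C,B), key (c, b)

def bernoulli(p_one):
    """Sample a {0,1} value with P(1) = p_one."""
    return 1 if random.random() < p_one else 0

def likelihood_weighting(N):
    """Estimate P(X=1, B=0): sample non-evidence variables (S, C, D) from
    their CPTs in topological order; for the evidence variables (B=0, X=1)
    multiply the weight by the CPT entry instead of sampling."""
    total = 0.0
    for _ in range(N):
        s = bernoulli(P_S)
        c = bernoulli(P_C_given_S[s])
        w = 1 - P_B_given_S[s]                # evidence B=0: factor P(B=0|s)
        w *= P_X_given_CS[(c, s)]             # evidence X=1: factor P(X=1|c,s)
        d = bernoulli(P_D_given_CB[(c, 0)])   # D completes the sample (B=0);
        total += w                            # it does not affect the weight
    return total / N
```

With these CPTs the exact answer, by summing out S, C, D, is 0.11601, so the estimate should land close to that.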
How to solve belief updating?
P(X_i=x_i | E=e) = P(X_i=x_i, E=e) / P(E=e)

Estimate the numerator and denominator by importance sampling:
Numerator: evidence is X_i=x_i, E=e
Denominator: evidence is E=e

P̂(X_i=x_i | E=e) = [Σ_{j=1}^{N} x_i(z^j) w(z^j)] / [Σ_{j=1}^{N} w(z^j)]

where x_i(z^j) = 1 iff sample z^j contains X_i=x_i, and 0 otherwise.
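The ratio estimator can be sketched directly from a list of weighted samples; the sample values and weights below are made up for illustration:

```python
# Each entry is (value of X_i in the sample z^j, importance weight w(z^j)),
# as produced by some hypothetical importance-sampling run.
weighted_samples = [(1, 0.50), (0, 0.20), (1, 0.10), (1, 0.40), (0, 0.30)]

# Ratio estimator: P̂(Xi=1 | e) = sum of weights of samples with Xi=1,
# divided by the sum of all weights.
num = sum(w for x, w in weighted_samples if x == 1)   # 1.0
den = sum(w for x, w in weighted_samples)             # 1.5
posterior = num / den                                 # 2/3
```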
Difference between estimating P(E=e) and P(Xi=xi|E=e)
P̂(E=e) = (1/N) Σ_{i=1}^{N} w(z^i)

P̂(X_i=x_i | E=e) = [Σ_{j=1}^{N} x_i(z^j) w(z^j)] / [Σ_{j=1}^{N} w(z^j)]

P̂(E=e) is unbiased: E_Q[P̂(E=e)] = P(E=e)
P̂(X_i=x_i | E=e) is only asymptotically unbiased: lim_{N→∞} E_Q[P̂(X_i=x_i | E=e)] = P(X_i=x_i | E=e)
Proposal Distribution: Which is better?
The probability that |P̂(E=e) − P(E=e)| ≥ ε is bounded by Variance / (N ε²) (Chebyshev's inequality), where

Variance = Var_Q[w(Z)] = Σ_{z ∈ Z} w(z)² Q(z) − P(E=e)²

If the variance is 0, then P̂(E=e) = P(E=e), and only one sample is sufficient to compute P(E=e).
So one should prefer a low-variance proposal distribution.
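The variance formula can be checked on a two-value toy problem (the joint values 0.1 and 0.3 are assumed for illustration); the proposal proportional to P(z, e) achieves zero variance:

```python
# Toy example: P(Z=z, E=e) for z in {0, 1}; hence P(e) = 0.4.
P_joint = [0.1, 0.3]
P_e = sum(P_joint)

def weight_variance(Q):
    """Var_Q[w(Z)] = sum_z Q(z) w(z)^2 - P(e)^2, with w(z) = P(z,e)/Q(z)."""
    second_moment = sum(q * (p / q) ** 2 for p, q in zip(P_joint, Q))
    return second_moment - P_e ** 2

var_uniform = weight_variance([0.5, 0.5])    # uniform proposal: 0.04
var_optimal = weight_variance([0.25, 0.75])  # Q(z) ∝ P(z,e): variance 0
```

With zero variance every sample's weight equals P(e) exactly, which is why a single sample would suffice.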
Outline
Gibbs Sampling Advances in Gibbs sampling
Blocking Cutset sampling (Rao-Blackwellisation)
Importance Sampling Advances in Importance Sampling Particle Filtering
Research Issues in Importance Sampling
Better Proposal Distribution
- Likelihood weighting (Fung and Chang, 1990; Shachter and Peot, 1990)
- AIS-BN (Cheng and Druzdzel, 2000)
- Iterative Belief Propagation (Changhe and Druzdzel, 2003)
- Iterative Join Graph Propagation and variable ordering (Gogate and Dechter, 2005)
Research Issues in Importance Sampling (Cheng and Druzdzel 2000)
Adaptive Importance Sampling

Initial Proposal: Q^0(Z) = Q(Z_1) * Q(Z_2|pa(Z_2)) * ... * Q(Z_n|pa(Z_n))
P̂(E=e) = 0
For i = 1 to k do
    Generate samples z^1, ..., z^N from Q^{i-1}
    P̂(E=e) = P̂(E=e) + (1/N) Σ_j w(z^j)
    Update Q^{i-1} to Q^i
End For
Return P̂(E=e) = P̂(E=e) / k
Adaptive Importance Sampling
General case: given k proposal distributions, take N samples from each distribution, and approximate P(e) by

P̂(e) = (1/k) Σ_{j=1}^{k} (average weight of the j-th proposal)
Estimating Q'(z)
Q'(Z) = Q'(Z_1) * Q'(Z_2|pa(Z_2)) * ... * Q'(Z_n|pa(Z_n))

where each Q'(Z_i | Z_1, .., Z_{i-1}) is estimated by importance sampling.
Cutset importance sampling
Divide the set of variables into two parts: the cutset (C) and the remaining variables (R).

P̂(E=e) = (1/N) Σ_{j=1}^{N} [P(C=c^j) P(E=e | C=c^j)] / Q(C=c^j)

where P(E=e | C=c^j), the exact inference over the remaining variables R, is computed using elim-bel, for instance.
(Gogate and Dechter, 2005) and (Bidyuk and Dechter 2006)
Outline
Gibbs Sampling Advances in Gibbs sampling
Blocking Cutset sampling (Rao-Blackwellisation)
Importance Sampling Advances in Importance Sampling Particle Filtering
Dynamic Belief Networks (DBNs)
[Figure: two-slice DBN — a Bayesian network at time t and one at time t+1, connected by transition arcs X_t → X_{t+1}; each state X_t has an observation Y_t]
[Figure: unrolled DBN for t=0 to t=10 — chain X_0 → X_1 → ... → X_10, with each X_t the parent of Y_t]
Query
Compute P(X_{0:t} | Y_{0:t}) or P(X_t | Y_{0:t}); for example, P(X_{0:10} | Y_{0:10}) or P(X_10 | Y_{0:10})
Exact inference is hard over a long time period. Approximate! Sample!
Particle Filtering (PF)
PF = “condensation” = “sequential Monte Carlo” = “survival of the fittest”
PF can handle any type of probability distribution, non-linearity, and non-stationarity.
PFs are powerful sampling-based inference/learning algorithms for DBNs.
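A minimal bootstrap particle filter sketch for a toy binary-state DBN; the transition and observation probabilities are assumed for illustration:

```python
import random

# Toy binary-state DBN (illustrative numbers, not from the lecture).
trans = {0: 0.9, 1: 0.2}   # P(X_{t+1}=0 | X_t = key)
obs   = {0: 0.8, 1: 0.3}   # P(Y_t=0 | X_t = key)

def particle_filter(observations, N=1000):
    """Bootstrap PF: propagate particles through the transition model,
    weight them by the observation likelihood, then resample."""
    particles = [random.randint(0, 1) for _ in range(N)]   # uniform prior over X_0
    for t, y in enumerate(observations):
        if t > 0:
            # Propagate: sample X_t from P(X_t | X_{t-1}) for each particle.
            particles = [0 if random.random() < trans[x] else 1 for x in particles]
        # Weight each particle by the likelihood P(y_t | x_t).
        weights = [obs[x] if y == 0 else 1 - obs[x] for x in particles]
        # Resample in proportion to the weights ("survival of the fittest").
        particles = random.choices(particles, weights=weights, k=N)
    # Fraction of particles in state 1 estimates P(X_t=1 | Y_{0:t}).
    return sum(particles) / N
```

With observations that all favor state 1 (here Y=1, since P(Y=1|X=1)=0.7 vs. P(Y=1|X=0)=0.2), the filtered estimate of P(X_t=1) should rise well above 0.5.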