Xiao-Li Meng's slides for his talks at Columbia, Sept. 2011, and ICERM, Nov. 2012.

Let's Practice What We Preach: Likelihood Methods for Monte Carlo Data

Xiao-Li Meng
Department of Statistics, Harvard University
September 24, 2011

Based on:
Kong, McCullagh, Meng, Nicolae, and Tan (2003, JRSS-B, with discussions); Kong, McCullagh, Meng, and Nicolae (2006, Doksum Festschrift); Tan (2004, JASA); ...; Meng and Tan (201X)

Importance sampling (IS)

Estimand:
c_1 = ∫_Γ q_1(x) µ(dx) = ∫_Γ [q_1(x)/p_2(x)] p_2(x) µ(dx).

Data: {X_i2, i = 1, ..., n_2} ~ p_2 = q_2/c_2

Estimating Equation (EE):
r ≡ c_1/c_2 = E_2[ q_1(X)/q_2(X) ].

The EE estimator:
r̂ = (1/n_2) ∑_{i=1}^{n_2} q_1(X_i2)/q_2(X_i2)

Standard IS estimator for c_1 when c_2 = 1.
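
As a small illustration (not on the original slides), here is a minimal Python sketch of the EE/IS estimator above. It assumes q_1 and q_2 can be evaluated pointwise and that the X_i2 are i.i.d. draws from p_2 = q_2/c_2; the function name and the example densities are illustrative choices only.

```python
import numpy as np

def is_ratio_estimate(q1, q2, x2):
    """EE estimator: r_hat = (1/n2) * sum_i q1(X_i2) / q2(X_i2)."""
    return np.mean(q1(x2) / q2(x2))

# Hypothetical example: q2 is the standard normal density (so c2 = 1) and
# q1 is an unnormalized N(1, 1) density, so the true c1 is sqrt(2*pi).
rng = np.random.default_rng(0)
x2 = rng.standard_normal(100_000)                       # draws from p2 = q2
q2 = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
q1 = lambda x: np.exp(-0.5 * (x - 1.0)**2)              # unnormalized
print(is_ratio_estimate(q1, q2, x2))                    # approx sqrt(2*pi) = 2.5066
```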

What about MLE?

The "likelihood" is:
f(X_12, ..., X_{n_2 2}) = ∏_{i=1}^{n_2} p_2(X_i2) — free of the estimand c_1!

So why are {X_i2, i = 1, ..., n_2} even relevant? Violation of the likelihood principle?

What are we "inferring"? What is the "unknown" model parameter?

Bridge sampling (BS)

Data: {X_ij, i = 1, ..., n_j} ~ p_j = q_j/c_j, j = 1, 2

Estimating Equation (Meng and Wong, 1996):
r ≡ c_1/c_2 = E_2[α(X) q_1(X)] / E_1[α(X) q_2(X)],  for all α with 0 < |∫ α q_1 q_2 dµ| < ∞

Optimal choice: α_O(x) ∝ [n_1 q_1(x) + n_2 r q_2(x)]^{-1}

Optimal estimator r̂_O, the limit of

r̂_O^{(t+1)} = [ (1/n_2) ∑_{i=1}^{n_2} q_1(X_i2) / (s_1 q_1(X_i2) + s_2 r̂_O^{(t)} q_2(X_i2)) ]
            / [ (1/n_1) ∑_{i=1}^{n_1} q_2(X_i1) / (s_1 q_1(X_i1) + s_2 r̂_O^{(t)} q_2(X_i1)) ],

where s_j = n_j/(n_1 + n_2).
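
The iteration above can be coded directly. Below is a hedged Python sketch, assuming s_j = n_j/(n_1 + n_2) as in Meng and Wong (1996) and pointwise-evaluable q_1, q_2; the function name and the toy example are mine.

```python
import numpy as np

def optimal_bridge(q1, q2, x1, x2, n_iter=50, r0=1.0):
    """Iterate the optimal bridge sampling estimator of r = c1/c2.

    x1: draws from p1 = q1/c1; x2: draws from p2 = q2/c2; s_j = n_j/(n1+n2).
    """
    n1, n2 = len(x1), len(x2)
    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
    q1_x1, q2_x1 = q1(x1), q2(x1)
    q1_x2, q2_x2 = q1(x2), q2(x2)
    r = r0
    for _ in range(n_iter):
        num = np.mean(q1_x2 / (s1 * q1_x2 + s2 * r * q2_x2))
        den = np.mean(q2_x1 / (s1 * q1_x1 + s2 * r * q2_x1))
        r = num / den
    return r

# Toy check: q1 unnormalized N(0,1) (c1 = sqrt(2*pi)), q2 normalized N(0,1) (c2 = 1).
rng = np.random.default_rng(1)
x1, x2 = rng.standard_normal(5_000), rng.standard_normal(5_000)
q1 = lambda x: np.exp(-0.5 * x**2)
q2 = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
print(optimal_bridge(q1, q2, x1, x2))                   # approx 2.5066
```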

What about MLE?

The "likelihood" is:
∏_{j=1}^{2} ∏_{i=1}^{n_j} q_j(X_ij)/c_j ∝ c_1^{-n_1} c_2^{-n_2} — free of data!

What went wrong: c_j is not a "free parameter", because c_j = ∫_Γ q_j(x) µ(dx) and q_j is known.

So what is the "unknown" model parameter?

Turns out r̂_O is the same as Bennett's (1976) optimal acceptance-ratio estimator, as well as Geyer's (1994) reversed logistic regression estimator.

So why is that? Can it be improved upon without any "sleight of hand"?

Pretending the measure is unknown!

Because
c = ∫_Γ q(x) µ(dx),
and q is known in the sense that we can evaluate it at any sample value, the only way to make c "unknown" is to assume the underlying measure µ is "unknown".

This is natural because Monte Carlo simulation means we use samples to represent, and thus estimate/infer, the underlying population q(x) µ(dx), and hence to estimate/infer µ, since q is known.

Monte Carlo integration is about finding a tractable discrete µ̂ to approximate the intractable µ.

Importance Sampling Likelihood

Estimand: c_1 = ∫_Γ q_1(x) µ(dx)

Data: {X_i2, i = 1, ..., n_2} ~ i.i.d. c_2^{-1} q_2(x) µ(dx)

Likelihood for µ:
L(µ) = ∏_{i=1}^{n_2} c_2^{-1} q_2(X_i2) µ(X_i2)

Note that c_2 is a functional of µ.

The nonparametric MLE of µ is
µ̂(dx) = P̂(dx)/q_2(x),   where P̂ is the empirical measure.

Importance Sampling Likelihood

Thus the MLE for r ≡ c_1/c_2 is
r̂ = ∫ q_1(x) µ̂(dx) = (1/n_2) ∑_{i=1}^{n_2} q_1(X_i2)/q_2(X_i2)

When c_2 = 1, q_2 = p_2, the standard IS estimator for c_1 is obtained.

{X_(i2), i = 1, ..., n_2} is (minimal) sufficient for µ on x ∈ S_2 = {x : q_2(x) > 0}, and hence ĉ_1 is guaranteed to be consistent only when S_1 ⊂ S_2.

Bridge Sampling Likelihood

Estimand: ∝ c_j = ∫_Γ q_j(x) µ(dx), j = 1, ..., J.

Data: {X_ij, 1 ≤ i ≤ n_j} ~ c_j^{-1} q_j(x) µ(dx), 1 ≤ j ≤ J

Likelihood for µ: L(µ) = ∏_{j=1}^{J} ∏_{i=1}^{n_j} c_j^{-1} q_j(X_ij) µ(X_ij)

Writing θ(x) = log µ(x), then

log L(µ) = n ∫_Γ θ(x) dP̂ − ∑_{j=1}^{J} n_j log c_j(θ),

where P̂ is the empirical measure on {X_ij, 1 ≤ i ≤ n_j, 1 ≤ j ≤ J}.

Bridge Sampling Likelihood

The MLE for µ is given by equating the canonical sufficient statistic P̂ to its expectation:

n P̂(dx) = ∑_{j=1}^{J} n_j ĉ_j^{-1} q_j(x) µ̂(dx),   i.e.   µ̂(dx) = n P̂(dx) / ∑_{j=1}^{J} n_j ĉ_j^{-1} q_j(x).   (A)

Consequently, the MLE for {c_1, ..., c_J} must satisfy

ĉ_r = ∫_Γ q_r(x) dµ̂ = ∑_{j=1}^{J} ∑_{i=1}^{n_j} q_r(x_ij) / ∑_{s=1}^{J} n_s ĉ_s^{-1} q_s(x_ij).   (B)

(B) is the "dual" equation of (A), and is also the same as the equation for the optimal multiple bridge sampling estimator (Tan 2004).
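
As an illustrative sketch (not from the slides), equation (B) can be solved by a simple fixed-point iteration over the pooled draws. The code below assumes unnormalized densities q_1, ..., q_J that can be evaluated at every draw; because (B) only determines the c_r up to a common scale, the sketch reports ratios relative to c_1 (anchor the scale by making one q_j a normalized density if absolute values are needed). Function and variable names are mine.

```python
import numpy as np

def mle_normalizing_constants(q_list, x_list, n_iter=200):
    """Fixed-point iteration of (B): c_r <- sum_ij q_r(x_ij) / sum_s n_s c_s^{-1} q_s(x_ij).

    q_list: unnormalized densities q_1..q_J; x_list[j]: draws from p_j = q_j / c_j.
    Returns the c_r normalized so that c_1 = 1 (only ratios are identified).
    """
    n = np.array([len(x) for x in x_list], dtype=float)
    x_all = np.concatenate(x_list)                          # pool all the draws
    Q = np.stack([q(x_all) for q in q_list])                # Q[r, i] = q_r(x_i)
    c = np.ones(len(q_list))
    for _ in range(n_iter):
        denom = (n[:, None] / c[:, None] * Q).sum(axis=0)   # sum_s n_s c_s^{-1} q_s(x_i)
        c = (Q / denom).sum(axis=1)                          # right-hand side of (B)
        c /= c[0]                                            # fix the scale at c_1 = 1
    return c
```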

But We Can Ignore Less ...

To restrict the parameter space for µ by using some knowledge of the known µ, that is, to set up a sub-model.

The new MLE has a smaller asymptotic variance under the sub-model than under the full model.

Examples:
Group-invariance submodel
Linear submodel
Log-linear submodel

A Universally Improved IS

Estimand: r = c_1/c_2; c_j = ∫_{R^d} q_j(x) µ(dx)

Data: {X_i2, i = 1, ..., n_2} i.i.d. ~ c_2^{-1} q_2 µ(dx)

Taking G = {Id, −Id} leads to

r̂_G = (1/n_2) ∑_{i=1}^{n_2} [q_1(X_i2) + q_1(−X_i2)] / [q_2(X_i2) + q_2(−X_i2)].

Because of the Rao-Blackwellization, V(r̂_G) ≤ V(r̂).

We need twice as many function evaluations, but typically this is a small insurance premium.

Consider S_1 = R and S_2 = R_+. Then r̂_G is consistent for r:

r̂_G = (1/n_2) ∑_{i=1}^{n_2} q_1(X_i2)/q_2(X_i2) + (1/n_2) ∑_{i=1}^{n_2} q_1(−X_i2)/q_2(X_i2).

But the standard IS estimator r̂ only estimates ∫_0^∞ q_1(x) µ(dx)/c_2.
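
A minimal Python sketch of r̂_G for the reflection group G = {Id, −Id}, illustrating the S_1 = R, S_2 = R_+ point above with a half-normal q_2 and an unnormalized normal q_1; these example densities are my own choices, not from the talk.

```python
import numpy as np

def is_ratio_reflected(q1, q2, x2):
    """Rao-Blackwellized IS over G = {Id, -Id}:
    r_G = (1/n2) * sum_i [q1(x) + q1(-x)] / [q2(x) + q2(-x)]."""
    return np.mean((q1(x2) + q1(-x2)) / (q2(x2) + q2(-x2)))

rng = np.random.default_rng(2)
x2 = np.abs(rng.standard_normal(100_000))          # draws from half-normal p2 (support R+)
q2 = lambda x: np.where(x > 0, 2.0 / np.sqrt(2 * np.pi) * np.exp(-0.5 * x**2), 0.0)
q1 = lambda x: np.exp(-0.5 * x**2)                 # support R, c1 = sqrt(2*pi); c2 = 1
print(is_ratio_reflected(q1, q2, x2))              # consistent for r = sqrt(2*pi) ~ 2.507
# The plain IS average of q1(x)/q2(x) over these draws would instead converge to
# the integral of q1 over (0, inf) only, i.e. about 1.253.
```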

There are many more improvements ...

Define a sub-model by requiring µ to be G-invariant, where G is a finite group acting on Γ.

The new MLE of µ is

µ̂_G(dx) = n P̂_G(dx) / ∑_{j=1}^{J} n_j ĉ_j^{-1} q_j^G(x),

where P̂_G(A) = ave_{g∈G} P̂(gA) and q_j^G(x) = ave_{g∈G} q_j(gx).

When the draws are i.i.d. within each sampler p_s dµ,

µ̂_G = E[ µ̂ | G_X ],

i.e., the Rao-Blackwellization of µ̂ given the orbit.

Consequently,

ĉ_j^G = ∫_Γ q_j(x) µ̂_G(dx) = E[ ĉ_j | G_X ].

Using Groups to model trade-off

If G_1 ⊇ G_2, then
Var( c̃^{G_1} ) ≤ Var( c̃^{G_2} ).

The statistical efficiency increases with the size of G_i, but so does the computational cost needed for function evaluation (but not for sampling, because no additional samples are involved).

Linear submodel: stratified sampling (Tan 2004)

Data: {X_ij, 1 ≤ i ≤ n_j} i.i.d. ~ p_j(x) µ(dx), 1 ≤ j ≤ J.

The sub-model has parameter space
{ µ : ∫_Γ p_j(x) µ(dx), 1 ≤ j ≤ J, are equal (to 1) }.

Likelihood for µ: L(µ) = ∏_{j=1}^{J} ∏_{i=1}^{n_j} p_j(X_ij) µ(X_ij)

The MLE is

µ̂_lin(dx) = P̂(dx) / ∑_{j=1}^{J} π̂_j p_j(x),

where the π̂_j are MLEs from a mixture model: the data are i.i.d. ~ ∑_{j=1}^{J} π_j p_j(·) with the π_j unknown.
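
The π̂_j can be computed with a short EM iteration on the pooled draws, deliberately ignoring which sampler each draw came from. A minimal sketch (function names mine), assuming the p_j are normalized densities that can be evaluated pointwise:

```python
import numpy as np

def mixture_weights_mle(p_list, x_all, n_iter=500):
    """EM for the weights of sum_j pi_j p_j(x): the p_j are known and only
    the pi_j are unknown; the draws' sampler labels are ignored."""
    P = np.stack([p(x_all) for p in p_list])        # P[j, i] = p_j(x_i)
    pi = np.full(len(p_list), 1.0 / len(p_list))
    for _ in range(n_iter):
        w = pi[:, None] * P                          # unnormalized responsibilities
        w /= w.sum(axis=0, keepdims=True)            # E-step
        pi = w.mean(axis=1)                          # M-step
    return pi
```

The sub-model MLE of c (the "Lik" estimator compared on the following slides) then averages q(x_i) / ∑_j π̂_j p_j(x_i) over the pooled draws.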

So why MLE?

Goal: to estimate c = ∫_Γ q(x) µ(dx).

For an arbitrary vector b, consider the control-variate estimator (Owen and Zhou 2000)

ĉ_b ≡ ∑_{j=1}^{J} ∑_{i=1}^{n_j} [q(x_ji) − bᵀ g(x_ji)] / ∑_{s=1}^{J} n_s p_s(x_ji),

where g = (p_2 − p_1, ..., p_J − p_1)ᵀ.

A more general class: for ∑_{j=1}^{J} λ_j(x) ≡ 1 and ∑_{j=1}^{J} λ_j(x) b_j(x) ≡ b, consider (Veach and Guibas 1995 for b_j ≡ 0; Tan 2004)

ĉ_{λ,B} = ∑_{j=1}^{J} (1/n_j) ∑_{i=1}^{n_j} [λ_j(x_ji) q(x_ji) − b_jᵀ(x_ji) g(x_ji)] / p_j(x_ji)

Should ĉ_{λ,B} be more efficient than ĉ_b? Could there be something even more efficient?

Three estimators for c = ∫_Γ q(x) µ(dx):

IS:   (1/n) ∑_{i=1}^{n} q(x_i) / ∑_{j=1}^{J} π_j p_j(x_i),
where π_j = n_j/n are the true proportions.

Reg:  (1/n) ∑_{i=1}^{n} [q(x_i) − β̂ᵀ g(x_i)] / ∑_{j=1}^{J} π_j p_j(x_i),
where β̂ is the estimated regression coefficient, ignoring stratification.

Lik:  (1/n) ∑_{i=1}^{n} q(x_i) / ∑_{j=1}^{J} π̂_j p_j(x_i),
where the π̂_j are the estimated proportions, ignoring stratification.

Which one is most efficient? Least efficient?
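
A sketch of all three estimators on pooled stratified draws (not from the slides). The β̂ here is obtained by ordinary least squares of q/m on g/m with an intercept, which is one standard way to estimate the control-variate coefficient; that fitting choice, and all names, are assumptions of this sketch.

```python
import numpy as np

def three_estimators(q, p_list, n_list, x_all):
    """IS, Reg, and Lik estimators of c = integral of q d(mu), from pooled draws
    x_all in which n_list[j] draws came from the normalized density p_list[j]."""
    pi_true = np.array(n_list) / float(len(x_all))
    P = np.stack([p(x_all) for p in p_list])               # P[j, i] = p_j(x_i)
    m_true = (pi_true[:, None] * P).sum(axis=0)            # mixture density, true weights
    y = q(x_all) / m_true

    c_is = y.mean()                                        # IS: plain mixture IS

    # Reg: control variates g = (p_2 - p_1, ..., p_J - p_1), each integrating to 0.
    z = ((P[1:] - P[0]) / m_true).T                        # shape (n, J-1)
    X = np.column_stack([np.ones(len(y)), z])
    beta = np.linalg.lstsq(X, y, rcond=None)[0][1:]        # slope part of the OLS fit
    c_reg = np.mean(y - z @ beta)

    # Lik: same form as IS, but with the weights re-estimated by MLE via a short
    # EM (as in the linear-submodel sketch), ignoring the stratification labels.
    pi_hat = np.full(len(p_list), 1.0 / len(p_list))
    for _ in range(500):
        w = pi_hat[:, None] * P
        w /= w.sum(axis=0, keepdims=True)
        pi_hat = w.mean(axis=1)
    m_hat = (pi_hat[:, None] * P).sum(axis=0)
    c_lik = np.mean(q(x_all) / m_hat)

    return c_is, c_reg, c_lik
```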

Let's find it out ...

Γ = R^10 and µ is Lebesgue measure.

The integrand is

q(x) = 0.8 ∏_{j=1}^{10} φ(x_j) + 0.2 ∏_{j=1}^{10} ψ(x_j; 4),

where φ(·) is the standard normal density and ψ(·; 4) is the t_4 density.

Two sampling designs:
(i) q_2(x) with n draws, or
(ii) q_1(x) and q_2(x) each with n/2 draws,

where

q_1(x) = ∏_{j=1}^{10} φ(x_j),   q_2(x) = ∏_{j=1}^{10} ψ(x_j; 1).
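
A sketch of this test problem in Python (SciPy densities), run here only for the two-sampler design (ii) and fed to the three_estimators sketch above; the true value is c = 0.8 + 0.2 = 1 because both products are normalized densities. The seed, sample size handling, and the decision to skip design (i) are my own simplifications.

```python
import numpy as np
from scipy import stats

d, n = 10, 500
phi = lambda x: stats.norm.pdf(x)                             # standard normal density
psi = lambda x, df: stats.t.pdf(x, df)                        # t_df density

q  = lambda X: 0.8 * np.prod(phi(X), axis=1) + 0.2 * np.prod(psi(X, 4), axis=1)
q1 = lambda X: np.prod(phi(X), axis=1)                        # N(0, I_10) density
q2 = lambda X: np.prod(psi(X, 1), axis=1)                     # product-Cauchy density

rng = np.random.default_rng(3)
# Design (ii): n/2 draws from q1 and n/2 draws from q2.
x_ii = np.vstack([rng.standard_normal((n // 2, d)),
                  rng.standard_cauchy((n // 2, d))])
print(three_estimators(q, [q1, q2], [n // 2, n // 2], x_ii))  # each should be near c = 1
```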

A little surprise?

Table: Comparison of design and estimator

              one sampler                 two samplers
           IS       Reg      Lik        IS       Reg      Lik
Sqrt MSE   .162     .00942   .00931     .0175    .00881   .00881
Std Err    .162     .00919   .00920     .0174    .00885   .00884

Note: Sqrt MSE is the square root of the mean squared error of the point estimates, and Std Err is the square root of the mean of the variance estimates, from 10000 repeated simulations of size n = 500.

Comparison of efficiency:

Statistical efficiency: IS < Reg ≈ Lik

IS is a stratified estimator, which uses only the labels.

Reg is the conventional method of control variates.

Lik is the constrained MLE, which uses the p_j's but ignores the labels; it is exact if q = p_j for any particular j.

Building intuition ...

Suppose we make n = 2 draws, one from N(0, 1) and one from Cauchy(0, 1), hence π_1 = π_2 = 50%.

Suppose the draws are {1, 1}: what would be the MLE (π̂_1, π̂_2)?

Suppose the draws are {1, 3}: what would be the MLE (π̂_1, π̂_2)?

Suppose the draws are {3, 3}: what would be the MLE (π̂_1, π̂_2)?
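
These three questions can be answered numerically. Below is a small sketch that maximizes the two-component likelihood in π_1 by grid search (the component densities are known, the labels are ignored); the function name and grid resolution are mine. With these densities the search puts all the weight on N(0, 1) for the draws {1, 1} and all the weight on Cauchy(0, 1) for {1, 3} and {3, 3}: a single far-out draw pulls the weight entirely onto the heavy-tailed component.

```python
import numpy as np
from scipy import stats

def weight_mle(xs, grid=np.linspace(0.0, 1.0, 100_001)):
    """MLE of (pi1, pi2) in pi1*N(0,1) + pi2*Cauchy(0,1), by grid search over pi1."""
    f1, f2 = stats.norm.pdf(xs), stats.cauchy.pdf(xs)
    loglik = np.log(grid[:, None] * f1 + (1 - grid[:, None]) * f2).sum(axis=1)
    pi1 = grid[np.argmax(loglik)]
    return pi1, 1.0 - pi1

for xs in (np.array([1.0, 1.0]), np.array([1.0, 3.0]), np.array([3.0, 3.0])):
    print(xs, weight_mle(xs))
```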

What Did I Learn?

Model what we ignore, not what we know!

Model comparison/selection is not about which model is true (as all of them are "true"), but about which model represents a better compromise among human, computational, and statistical efficiency.

There is a cure for our "schizophrenia": we now can analyze Monte Carlo data using the same sound statistical principles and methods we use for analyzing real data.

If you are looking for theoretical research topics ...

RE-EXAMINE OLD ONES AND DERIVE NEW ONES!

Prove it is the MLE, or a good approximation to the MLE.
Or derive the MLE, or a cost-effective approximation to it.

Markov chain Monte Carlo (Tan 2006, 2008)

More ......
