MARKOV CHAIN MONTE CARLO METHODS

MARKO LAINE, FMI

INVERSE PROBLEMS SUMMER SCHOOL, HELSINKI 2019

TABLE OF CONTENTS

Uncertainties in modelling
Parameter estimation
Markov chain Monte Carlo – MCMC
MCMC in practice
Some MCMC theory
Adaptive MCMC methods
Other MCMC variants and implementations
Example: dynamical state space models and MCMC
Exercises

UNCERTAINTIES IN MODELLING

First we introduce some basic concepts related to different sources of uncertainty in modelling and tools to quantify uncertainty. We start with a linear model, whose properties are known.

INTRODUCTION

Consider a simple regression problem, where we are interested in modelling the systematic part behind the noisy observations. In addition to the best-fitting model, we need information about the uncertainty in our estimates. For linear models, classical statistical theory gives formulas for the uncertainties, depending on the assumptions made about the nature of the noise.
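As an illustration of the classical formulas for a linear model, the following minimal sketch fits a straight line to synthetic data and computes the usual standard errors. The data, the "true" parameters and the noise level are made-up values chosen only for the example.

```matlab
% Illustrative linear model y = X*beta + noise (synthetic data, assumed values).
rng(1);                                % fixed seed for reproducibility
n = 50;
x = linspace(0, 1, n)';
X = [ones(n,1), x];                    % design matrix: intercept and slope
beta_true = [1; 2];                    % hypothetical "true" parameters
sigma = 0.1;                           % hypothetical noise standard deviation
y = X*beta_true + sigma*randn(n,1);

beta_hat = X \ y;                      % least-squares estimate
res = y - X*beta_hat;                  % residuals
s2 = sum(res.^2) / (n - size(X,2));    % unbiased estimate of the noise variance
covb = s2 * inv(X'*X);                 % classical covariance of the estimate
se = sqrt(diag(covb));                 % standard errors of intercept and slope
disp([beta_hat, se])
```

The covariance formula relies on independent, equal-variance errors; other noise assumptions lead to other formulas.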

NON-LINEAR MODELS

For non-linear models, or high-dimensional linear models, the situation is harder. Simulation-based analysis, such as Markov chain Monte Carlo, provides remedies. If we are able to sample realizations from our model while perturbing the input, we can assess the sensitivity of the model output to the input. The Bayesian statistical paradigm allows all uncertainties to be handled within a unified framework.

STATISTICAL ANALYSIS BY SIMULATION

The uncertainty distribution of the model parameter vector $\theta$ given the observations $y$ and the model $M$: $p(\theta|y, M)$. This distribution is typically analytically intractable.

But we can simulate observations from $p(y|\theta, M)$.

Statistical analysis is used to define what is a good fit. Parameters that are consistent with the data and the modelling uncertainty are accepted.

MARKOV CHAIN MONTE CARLO – MCMC

Simulate the model while sampling the parameters from a proposal distribution. Accept (or weight) the parameters according to a suitable goodness-of-fit criterion that depends on the prior information and on the error statistics defining the likelihood function. The resulting chain is a sample from the Bayesian posterior distribution of parameter uncertainty.

POSTERIOR DISTRIBUTIONS

While sampling the model using MCMC, we get:

Posterior distribution of model parameters.
Posterior distribution of model predictions.
Posterior distribution for model comparison.

In many inverse problems, model parameters are of secondary interest. We are mostly interested in model-based predictions of the state. Usually it is even enough to be able to simulate an ensemble of possible realizations that correspond to the prediction uncertainty.

EXAMPLE: UNCERTAINTY IN NUMERICAL WEATHER PREDICTIONS

The European Centre for Medium-Range Weather Forecasts (ECMWF) runs an ensemble of 50 forecasts with perturbed initial values. This is done twice a day to get a posterior distribution of the forecast uncertainty.

https://en.ilmatieteenlaitos.fi/weather/helsinki?forecast=long

GENERAL MODEL

Here we will mostly consider a modelling problem in a very general form

$$y = f(x, \theta) + \epsilon,$$

observations = model + error.

If we assume independent Gaussian errors $\epsilon \sim N(0, \sigma^2 I)$, the likelihood function corresponds to a simple quadratic cost function,

$$p(y|\theta) \propto \exp\left\{-\frac{1}{2}\frac{\sum_{i=1}^{n}(y_i - f(x_i|\theta))^2}{\sigma^2}\right\} = \exp\left\{-\frac{1}{2}\frac{SS(\theta)}{\sigma^2}\right\}.$$

We can directly extend this to non-Gaussian likelihoods by defining $SS(\theta) = -2\log(p(y|\theta))$, the log-likelihood in "sum-of-squares" format.

For calculating the posterior, we also need to account for $SS_{pri}(\theta) = -2\log(p(\theta))$, the prior "sum-of-squares".
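The sum-of-squares formulation can be sketched in a few lines of Matlab. The exponential decay model, the parameter values and the names `modelfun`, `ssfun` and `loglik` are assumptions made for this illustration and do not come from the slides.

```matlab
% Hypothetical model: exponential decay f(x,theta) = theta(1)*exp(-theta(2)*x).
modelfun = @(x, theta) theta(1) * exp(-theta(2) * x);

% Sum-of-squares SS(theta) = sum_i (y_i - f(x_i,theta))^2.
ssfun = @(theta, x, y) sum((y - modelfun(x, theta)).^2);

% Gaussian log-likelihood up to an additive constant: -SS(theta)/(2*sigma^2).
loglik = @(theta, x, y, sigma2) -0.5 * ssfun(theta, x, y) / sigma2;

% Synthetic, noise-free data generated at theta = [2, 0.5] (assumed values).
x = (0:5)';
y = modelfun(x, [2, 0.5]);

ss_true  = ssfun([2, 0.5], x, y);   % zero at the generating parameters
ss_other = ssfun([1, 0.5], x, y);   % positive elsewhere
```

Working with $SS(\theta)$ instead of the likelihood itself keeps the computation on the log scale, which avoids numerical underflow for large $n$.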

SOURCES OF UNCERTAINTIES IN MODELLING

uncertainty         source                          methods
Observation         instrument noise,               sampling design,
                    sampling, representation,       retrieval method
                    retrieval
Parameter           estimation,                     optimal estimation,
                    calibration, tuning             MCMC
Model formulation   approximate physics,            model diagnostics,
                    numerics,                       model selection,
                    resolution, sub-grid scale      averaging,
                    processes                       Gaussian processes
Initial value       state space models              Kalman filter,
                                                    assimilation

PARAMETER ESTIMATION

Some remarks on different estimation paradigms, before we go fully Bayesian.

PARAMETER ESTIMATION

When considering the problem of parameter estimation we basically have two alternative methodologies: classical least squares estimation and the Bayesian approach. When the model is non-linear or the error distribution is non-Gaussian, we need simulation-based or iterative numerical methods for estimation. With MCMC we apply Bayesian reasoning and get as a result a sample from the distribution that describes the uncertainty in the parameters. Uncertainty in the estimates together with error in the observations causes uncertainty in the model predictions. Monte Carlo methods provide ways to simulate model predictions while taking into account the uncertainty in the parameters and other input variables.

A NOTE ON NOTATION

The symbol $\theta$ stands for the unknown parameter to be estimated in the basic model equation $y = f(x; \theta) + \epsilon$. This is common in the statistical literature. However, the statistical inverse problems literature is usually concerned with estimating the unknown state of the system, which is typically denoted by $x$. In statistical terms, we are dealing with the same problem. However, in the state estimation problem there are usually specific things to take care of, such as the discretization of the model. Especially when we are following the Bayesian paradigm, all uncertainties, whether they concern the state of the system, the parameters of the model or the uncertainty in the prior knowledge, are treated in a unified way.

EXAMPLE $f(x, \theta) + \epsilon$

Consider a chemical reaction $A \to B \to C$, modelled as an ODE system

$$\begin{aligned}
\frac{dA}{dt} &= -k_1 A \\
\frac{dB}{dt} &= k_1 A - k_2 B \\
\frac{dC}{dt} &= k_2 B
\end{aligned}$$

The data $y$ consists of measurements of the components $A, B, C$ at some sampling instants, and the $x$'s are the time instances $t_i$, $i = 1, 2, \ldots, n$, but $x$ could include other conditions, such as temperature. The unknowns to be estimated are the rate constants, $\theta = (k_1, k_2)$, and perhaps some initial conditions. The model function $f(x, \theta)$ returns the solution of the above equations, perhaps using some numerical ODE solver.
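One possible way to write the model function for the reaction $A \to B \to C$ is sketched below with Matlab's `ode45` and `deval`. The rate constants, initial state and sampling times are hypothetical, and passing the initial state as a separate argument is a choice made only for this example.

```matlab
% Right-hand side of the A -> B -> C system; s = [A; B; C], k = [k1, k2].
rhs = @(t, s, k) [-k(1)*s(1); k(1)*s(1) - k(2)*s(2); k(2)*s(2)];

% Model function f(x, theta): states at the sampling times t, one row per time.
% The initial state s0 is supplied separately here; it could be part of theta.
modelfun = @(t, theta, s0) ...
    deval(ode45(@(tt, s) rhs(tt, s, theta), [t(1), t(end)], s0), t)';

theta = [1.0, 0.5];       % hypothetical rate constants k1, k2
s0 = [1; 0; 0];           % hypothetical initial state: only A present
tobs = 0:5;               % hypothetical sampling instants
ymod = modelfun(tobs, theta, s0);    % 6-by-3 matrix with columns A, B, C
```

A sum-of-squares function for this model would compare `ymod` with the measured concentrations at the same time instants.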

ESTIMATION PARADIGMS

An estimate of a parameter is a value calculated from the data that tries to be as good an approximation of the unknown true value as possible. For example, the sample mean is an estimate of the mean of the distribution that is generating the numbers. There are several ways of defining optimal estimators: least-squares, maximum likelihood, minimum loss, Bayes estimators, etc. An estimator must always be accompanied by an estimate of its uncertainty. Basically, there are two ways of considering the uncertainties: frequentist (sampling theory based) and Bayesian.

ESTIMATION PARADIGMS

Frequentist, sampling theory based uncertainty considers the sampling distribution of the estimator when we imagine independent replications of the same observation-generating procedure under identical conditions. In Bayesian analysis the information on the uncertainty about the parameters is contained in the posterior distribution calculated according to the Bayes rule.

EXAMPLE

If we have independent observations $y_i \sim N(\theta, \sigma^2)$, $i = 1, \ldots, n$, from a normal distribution (assume known $\sigma^2$), we know that the sample mean $\bar{y}$ is a minimum variance unbiased estimator for $\theta$ and it has the sampling distribution

$$\bar{y} \sim N(\theta, \sigma^2/n).$$

This can be used to construct the usual confidence intervals for $\theta$.

EXAMPLE (CONT.)

Bayesian inference assumes that we can directly talk about the distribution of the parameter (not just the distribution of the estimator) and use the Bayes formula to make inference about it. The 'drawback' is the necessary introduction of the prior distribution.

If our prior information on $\theta$ is very vague, $p(\theta) = 1$, then after observing the data $y$ we have

$$\theta \sim N(\bar{y}, \sigma^2/n)$$

and this distribution contains all the information about $\theta$ available to us.
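For this Gaussian example with known $\sigma^2$ and a flat prior, the 95% posterior interval for $\theta$ and the classical 95% confidence interval are numerically the same, $\bar{y} \pm 1.96\,\sigma/\sqrt{n}$. A small sketch with made-up values:

```matlab
rng(2);
theta_true = 1.5; sigma = 0.5; n = 25;   % assumed values for the illustration
y = theta_true + sigma*randn(n,1);       % synthetic observations
ybar = mean(y);
halfwidth = 1.96 * sigma / sqrt(n);      % 1.96 is approx. the 97.5% quantile of N(0,1)
interval = [ybar - halfwidth, ybar + halfwidth];
```

The numbers agree, but the interpretation differs: the frequentist statement is about repeated sampling of $\bar{y}$, the Bayesian one about the distribution of $\theta$ given the data.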

MARKOV CHAIN MONTE CARLO – MCMC

Next we look in more detail at some specific MCMC algorithms and their uses.

MCMC METHODS

In principle, the Bayes formula solves the estimation problem in a fully probabilistic sense. The posterior distribution can be used for probability statements about the model unknowns. We can use the MAP or posterior mean as a point estimate, calculate probability limits (confidence regions), etc. However, some problems remain.

How to define the prior distribution.
How to calculate the integral of the normalizing constant.

The integration of the normalizing constant is a difficult task in high-dimensional, non-linear cases (dimension higher than 2 or 3). The Bayesian approach has become truly accessible due to various Monte Carlo methods.

MCMC

Markov chain Monte Carlo (MCMC) algorithms generate a sequence of parameter values $\theta_1, \theta_2, \ldots$ whose empirical distribution approaches the posterior distribution. The generation of the vectors in the chain $\theta_n$, $n = 1, 2, \ldots$ is done by random numbers (Monte Carlo) in such a way that each new point $\theta_{n+1}$ may only depend on the previous point $\theta_n$ (Markov chain). Intuitively, a correct distribution of points $\theta$ is generated by favoring points with high values of the posterior $p(\theta|y)$. The chain may be used as if it were a sample from the posterior, and various conclusions concerning the model predictions may be based on mean values, variances etc. computed from the chain. A simple but typical example of a Markov chain is a random walk $X_{n+1} = X_n + \epsilon_n$, where the random increment $\epsilon_n$ does not depend on $X_n, X_{n-1}, \ldots$.

HIGH DIMENSIONAL SPACES ARE VERY EMPTY

Things are more complicated in high dimensions and our intuition is easily fooled.

The plot shows the volume of a hypersphere

$$\frac{2\pi^{d/2} r^d}{d\,\Gamma(d/2)}$$

divided by the volume of a hypercube $(2r)^d$.

Random walk type methods are needed to explore the region of statistically significant probability. Otherwise we will always be lost in some distant corners.
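The ratio of the two volumes can be evaluated directly, which shows how quickly the sphere becomes negligible inside the cube. The range of dimensions is an arbitrary choice for this sketch.

```matlab
d = 1:20;                                               % dimensions to evaluate
% Volume of the d-sphere divided by the volume of the enclosing cube;
% the radius r cancels out of the ratio.
ratio = (2 * pi.^(d/2)) ./ (d .* gamma(d/2)) ./ 2.^d;
semilogy(d, ratio, 'o-')
xlabel('dimension d'), ylabel('sphere volume / cube volume')
```

For $d = 2$ the ratio is $\pi/4$, for $d = 3$ it is $\pi/6$, and by $d = 20$ it is below $10^{-7}$.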

THE METROPOLIS-HASTINGS ALGORITHM

1. Choose initial values $\theta_0$ and proposal density $q$.
2. Using the current value of the chain $\theta_i$, propose a new value $\theta'$ using the proposal distribution $q(\theta_i, \cdot)$.
3. Generate a random number $u$ uniform on $[0, 1]$ and accept the new value if

$$u \le \min\left\{1, \frac{\pi(\theta')\,q(\theta', \theta_i)}{\pi(\theta_i)\,q(\theta_i, \theta')}\right\}.$$

4. If accepted set $\theta_{i+1} = \theta'$, if not $\theta_{i+1} = \theta_i$.
5. Go to step 2 until enough values have been sampled.
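The steps above can be sketched for a case where the proposal is not symmetric, so that the ratio of proposal densities does not cancel. The target (an unnormalized Gamma(3,1) density), the multiplicative log-normal proposal and the tuning value `s` are assumptions made for this illustration only.

```matlab
rng(3);
logtarget = @(th) 2*log(th) - th;      % unnormalized Gamma(3,1) log-density, th > 0
s = 0.8;                               % proposal scale (assumed tuning value)
nsimu = 20000;
chain = zeros(nsimu,1);
theta = 1;                             % step 1: initial value
for i = 1:nsimu
    prop = theta * exp(s*randn);       % step 2: multiplicative (log-normal) proposal
    % log of pi(prop) q(prop,theta) / (pi(theta) q(theta,prop));
    % for this proposal q(prop,theta)/q(theta,prop) = prop/theta
    logratio = logtarget(prop) - logtarget(theta) + log(prop) - log(theta);
    if log(rand) <= min(0, logratio)   % step 3: accept/reject on the log scale
        theta = prop;                  % step 4: accepted
    end
    chain(i) = theta;                  % a rejected move repeats the old value
end
```

The chain mean should settle near 3, the mean of the Gamma(3,1) distribution. Only the ratio of target values is needed, so the normalizing constant never has to be computed.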

RANDOM WALK METROPOLIS

Number 1 in the list of Top 10 Algorithms of the 20th Century.

    % oldpar, R (proposal covariance factor), nsimu, npar, sigma2, ssfun,
    % xdata and ydata are assumed to be defined before this loop
    oldss = ssfun(oldpar,xdata,ydata);   % sum-of-squares at the starting point
    chain = zeros(nsimu,npar);           % preallocate the chain
    chain(1,:) = oldpar;
    for i=2:nsimu % simulation loop
      newpar = oldpar + randn(1,npar)*R;
      newss = ssfun(newpar,xdata,ydata);
      if (newss<oldss || rand < exp(-0.5*(newss-oldss)/sigma2))
        chain(i,:) = newpar; % accept
        oldpar = newpar;
        oldss = newss;
      else
        chain(i,:) = oldpar; % reject
      end
    end

MCMC IN PRACTICE

How to do MCMC in practice.

NON-LINEAR MODEL FITTING

Consider a non-linear model describing observations $y$ by control variables $x$ and parameter vector $\theta$:

$$y = f(x, \theta) + \epsilon, \quad \epsilon \sim N(0, \sigma^2 I).$$

In 'classical' theory we find the optimal $\theta$ by minimizing the sum of squares

$$SS(\theta) = \sum_{i=1}^{n} (y_i - f(x_i, \theta))^2,$$

which leads to a non-linear minimization problem. Confidence regions for $\theta$ are usually obtained by linearizing the likelihood function and by asymptotic arguments.

In the Bayesian approach we can get a similar fit by using a likelihood defined by $SS(\theta)$, augmented by suitable prior information, and the inference is done with the posterior distribution of $\theta$ obtained by MCMC sampling.

METROPOLIS-HASTINGS ALGORITHM

Random walk Metropolis-Hastings algorithm with Gaussian proposal distribution (and Gaussian likelihood).

Propose a new parameter value $\theta_{prop} = \theta_{curr} + \xi$, where $\xi \sim N(0, \Sigma_{prop})$ is drawn from the proposal distribution.

Accept $\theta_{prop}$ with probability

$$\alpha(\theta_{curr}, \theta_{prop}) = 1 \wedge \exp\left\{-\frac{1}{2}\left(\frac{SS(\theta_{prop}) - SS(\theta_{curr})}{\sigma^2}\right) - \frac{1}{2}\left(SS_{pri}(\theta_{prop}) - SS_{pri}(\theta_{curr})\right)\right\}.$$

For Gaussian likelihoods with scalar unknown $\sigma^2$ we can update it with a Gibbs draw from

$$\sigma^{-2} \sim \Gamma\left(\frac{n_0 + n}{2}, \frac{n_0 S_0^2 + SS(\theta_{curr})}{2}\right),$$

where the second argument of $\Gamma$ is a rate parameter.
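Written as code, the acceptance probability is a one-line function of the four sum-of-squares values and the error variance. The function name `alpha` and its argument names are chosen for this sketch only.

```matlab
% Acceptance probability from likelihood and prior sums-of-squares.
% ssprop, sscurr   : SS at the proposed and current parameter values
% priprop, pricurr : prior sums-of-squares at the same points
alpha = @(ssprop, sscurr, priprop, pricurr, sigma2) ...
    min(1, exp(-0.5*(ssprop - sscurr)/sigma2 - 0.5*(priprop - pricurr)));

a_better = alpha(1, 2, 0, 0, 1);   % smaller SS at the proposal: accepted for sure
a_worse  = alpha(2, 1, 0, 0, 1);   % larger SS: accepted with probability exp(-0.5)
```

A proposal that lowers the combined sum-of-squares is always accepted; one that raises it is accepted with a probability that decays exponentially in the increase.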

PRIOR INFORMATION

In the previous algorithm, we assumed that the prior distribution was given as $SS_{pri}(\theta)$. In general $SS_{pri}(\theta) = -2\log(p(\theta))$, which for standard distributions and independent components is easy to formulate.

If the prior is independent Gaussian $\theta_i \sim N(\nu_i, \eta_i^2)$, $i = 1, \ldots, p$, then we have

$$SS_{pri}(\theta) = \sum_{i=1}^{p} \left(\frac{\theta_i - \nu_i}{\eta_i}\right)^2.$$

For a log-normal density $\theta_i \sim \log N(\log(\nu_i), \eta_i^2)$ we have

$$SS_{pri}(\theta) = \sum_{i=1}^{p} \left(\frac{\log(\theta_i/\nu_i)}{\eta_i}\right)^2.$$
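The two prior sums-of-squares translate directly into vectorized Matlab functions. The code follows the expressions on the slide as written; the function names and the prior parameter values are hypothetical.

```matlab
% Independent Gaussian prior theta_i ~ N(nu_i, eta_i^2).
priorss_gauss = @(theta, nu, eta) sum(((theta - nu) ./ eta).^2);
% Independent log-normal prior theta_i ~ logN(log(nu_i), eta_i^2).
priorss_logn  = @(theta, nu, eta) sum((log(theta ./ nu) ./ eta).^2);

nu = [1, 2]; eta = [0.5, 0.5];                 % hypothetical prior parameters
ss0 = priorss_gauss(nu, nu, eta);              % 0 at the prior centre
ss1 = priorss_gauss(nu + eta, nu, eta);        % one sd off in each component: 2
ss2 = priorss_logn(nu * exp(1), nu, [1, 1]);   % log(theta/nu) = 1 in each component: 2
```

Either function can be passed to a sampler as the prior term of the acceptance probability.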

OBSERVATION ERROR

In the previous example, we assumed a so-called conjugate prior for the error variance $\sigma^2$ with

$$p(\sigma^{-2}) = \Gamma\left(\frac{n_0}{2}, \frac{2}{n_0 S_0^2}\right).$$

This is useful as the conditional distribution $p(\sigma^{-2}|y, \theta)$ is known:

$$p(\sigma^{-2}|y, \theta) = \Gamma\left(\frac{n_0 + n}{2}, \frac{2}{n_0 S_0^2 + SS_\theta}\right).$$

Here the second argument of $\Gamma$ is a scale parameter, matching the Matlab call below. This allows us to sample a new value for $\sigma^2$ after every MH step by Gibbs sampling. Matlab step required:

    sigma2 = 1/gammar(1,1,(n0+n)./2,2./(n0*s20+oldss));

This prior distribution is also known as the inverse $\chi^2$ prior for $\sigma^2$.
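The same Gibbs draw can be sketched without the toolbox function `gammar`, using the fact that $(n_0 S_0^2 + SS)\,\sigma^{-2}$ has a $\chi^2$ distribution with $n_0 + n$ degrees of freedom. This shortcut assumes that $n_0 + n$ is an integer; the numerical values are assumptions for the illustration.

```matlab
rng(4);
n0 = 1; s20 = 5;        % prior parameters (assumed values)
n = 40; ss = 10*n;      % number of observations and current sum-of-squares
ndraw = 5000;
sigma2 = zeros(ndraw,1);
for j = 1:ndraw
    chi2 = sum(randn(n0+n,1).^2);        % chi-squared draw with n0+n degrees of freedom
    sigma2(j) = (n0*s20 + ss) / chi2;    % scaled inverse chi-squared draw for sigma^2
end
```

Inside a sampler only one such draw is made per iteration, with `ss` replaced by the sum-of-squares at the current parameter value. The loop over `ndraw` is there only to check that the sample mean is close to $(n_0 S_0^2 + SS)/(n_0 + n - 2)$.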

GIBBS SAMPLING

In Gibbs sampling the target is updated componentwise (or in blocks), so that the proposal distribution is the conditional target distribution with respect to the other components,

$$q_i(\theta_i|\theta_{-i}) = \pi(\theta_i|\theta_1, \theta_2, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_d).$$

We see that the acceptance probability is then always 1. The problem is the construction of the conditional distributions. If the 1-dimensional conditional distribution is not known, it must be constructed approximately. This usually requires several evaluations of the likelihood to build an empirical version of the distribution. In some applications there are easy ways to construct these distributions, hierarchical linear models with conjugate priors being one typical example.
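A case where the conditional distributions are known exactly is a bivariate Gaussian target with unit variances and correlation $\rho$: each component given the other is $N(\rho\,\theta_{other}, 1 - \rho^2)$. The sketch below uses this toy target, which is an assumption for illustration and not an example from the slides.

```matlab
rng(5);
rho = 0.8;                       % assumed target correlation
nsimu = 20000;
chain = zeros(nsimu,2);
th = [0, 0];                     % starting point
csd = sqrt(1 - rho^2);           % conditional standard deviation
for i = 1:nsimu
    th(1) = rho*th(2) + csd*randn;   % draw theta_1 | theta_2
    th(2) = rho*th(1) + csd*randn;   % draw theta_2 | theta_1
    chain(i,:) = th;                 % every draw is accepted
end
c = corrcoef(chain);             % c(1,2) should be close to rho
```

No accept/reject step appears in the loop. The price is that strongly correlated components make the componentwise moves small, so the chain mixes slowly as $\rho$ approaches 1.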

CONJUGACY

There was a trick in using the Gamma distribution in the sampling of the error variance before. When calculating the posterior we need to compute the product of likelihood and prior $p(y|\theta)p(\theta)$ and consider it as a distribution for the parameter $\theta$. The likelihood itself is not generally a density with respect to $\theta$. However, it can have a form that can be scaled to be a density, and with a suitable prior the product will also retain this form.

For example, the Gaussian distribution considered as a function of $\sigma^2$ is similar to the Gamma distribution for $\tau = 1/\sigma^2$:

$$\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \propto \tau^{\alpha-1} e^{-\beta\tau}.$$
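The argument can be made explicit for $n$ independent Gaussian observations with known mean $\mu$ and a Gamma prior on the precision $\tau$. The symbols $a$ and $b$ (prior shape and rate) are introduced here only for this worked step.

```latex
\begin{aligned}
p(y|\tau) &\propto \tau^{n/2} \exp\left\{-\tau\,\frac{SS}{2}\right\},
  \qquad SS = \sum_{i=1}^{n} (y_i - \mu)^2, \\
p(\tau) &\propto \tau^{a-1} \exp\{-b\,\tau\}, \\
p(\tau|y) &\propto p(y|\tau)\,p(\tau)
  \propto \tau^{a + n/2 - 1} \exp\left\{-\left(b + \frac{SS}{2}\right)\tau\right\}.
\end{aligned}
```

The posterior is again a Gamma density, now with shape $a + n/2$ and rate $b + SS/2$. Setting $a = n_0/2$ and $b = n_0 S_0^2/2$ gives shape $(n_0 + n)/2$ and rate $(n_0 S_0^2 + SS)/2$, the conditional distribution used for the error variance update.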

WHY CONJUGACY

Conjugacy is many times just a computational aid that is used because one does not want to code more realistic alternatives. However, it sometimes comes from the idea that parameters and data arise from the same kind of reasoning. If the prior and posterior have the same form, the prior can be thought of as arising from (imaginary) data by directly observing the parameter. Also, conjugacy is the basis for simple textbook demos of Bayesian analysis (examples in nornor.m, sigmaprior.m, binbeta).

EXAMPLE: NORMAL LIKELIHOOD, NORMAL PRIOR

Gaussian likelihood with Gaussian prior.

    nobs = 10; xmean = 1.0; obssigma2 = 0.5^2;
    priormu = 0; priorsigma2 = 0.2^2;
    x = linspace(-2,2,200);
    prior = norpf(x,priormu,priorsigma2);
    postmu = (priormu/priorsigma2+nobs*xmean/obssigma2) /...
             (1/priorsigma2 + nobs/obssigma2);
    postsigma2 = 1/(1/priorsigma2+nobs/obssigma2);
    posterior = norpf(x,postmu,postsigma2);
    postuninf = norpf(x,xmean,obssigma2/nobs);
    plot(x,prior,x,posterior,x,postuninf)
    hline([],xmean);
    legend({'prior','posterior','posterior (noninf)','obs'},'Location','best')
    title('Gaussian likelihood with Gaussian prior for \mu')

EXAMPLE: INV-$\chi^2$

Inverse $\chi^2$ as prior for $\sigma^2$.

    % prior parameters
    n0 = 1; s20 = 5;
    % observed sum-of-squares
    n = 40; ss = 10*n;
    x = linspace(0.01,20,200);
    plot(x,invchipf(x,n0,s20),...
         x,invchipf(x,n+n0,(n0*s20+ss)/(n0+n)))
    hline([],ss/n);
    legend({'prior','posterior','obs'})
    title('Inverse \chi^2 prior for \sigma^2')

EXAMPLE: BETA-BINOMIAL

Binomial likelihood with Beta prior.

    n = 10;                       % number of tries
    y = 1;                        % number of successes
    p = linspace(0,1);
    a = 3; b = 3;                 % prior parameters
    prior = betapf(p,a,b);        % prior
    ptas = betapf(p, y+a, n-y+b); % posterior
    plot(p,ptas,p,prior)
    hline([],y/n);
    legend({'posterior','prior','obs'},'Location','best')
    title('Binomial likelihood with Beta prior for p')

CHAIN CONVERGENCE

Monte Carlo methods have an error in the estimates that behaves like $1/\sqrt{n}$, where $n$ is the number of simulations, or in general, available resources or CPU cycles. Non-stochastic numerical methods have much better convergence results, but generally there are no usable direct numerical methods for high dimensions (> 3–4). In addition to slow convergence, the MCMC chain has correlation between simulated values, which affects the Monte Carlo error of the estimates calculated from the chain.

SERIAL AUTOCORRELATION

INTEGRATED AUTOCORRELATION TIME $\tau$

The second term

$$\tau = 1 + 2\sum_{k=1}^{\infty} \rho(k)$$

is called the integrated autocorrelation time. It tells the increase of variance of sample-based estimates due to autocorrelation. It is the price to pay for not having an i.i.d. sample.

Function iact.m in the MCMC toolbox estimates $\tau$ using a method due to Sokal. An alternative method for estimating the Monte Carlo error is by batch mean standard error, functions bmstd.m and chainstats.m.
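A simplified way to estimate $\tau$ from a chain is to sum the empirical autocorrelations up to a window that grows with the running estimate. The sketch below uses this kind of automatic windowing with an assumed window factor of 5; it is not the toolbox's `iact.m`. An AR(1) series stands in for an MCMC chain because its true value $\tau = (1+\phi)/(1-\phi)$ is known.

```matlab
rng(6);
phi = 0.5; n = 20000;                  % assumed AR(1) coefficient and chain length
x = filter(1, [1, -phi], randn(n,1));  % AR(1) chain: x(t) = phi*x(t-1) + e(t)

xc = x - mean(x);
c0 = sum(xc.^2) / n;                   % lag-0 autocovariance
tau = 1;
for k = 1:n-1
    rho_k = sum(xc(1:n-k) .* xc(k+1:n)) / (n * c0);   % lag-k autocorrelation
    tau = tau + 2*rho_k;
    if k >= 5*tau                      % stop once the window is a few tau wide
        break
    end
end
tau_true = (1 + phi) / (1 - phi);      % known value for AR(1): here 3
```

The effective sample size is roughly $n/\tau$, so a chain of 20000 values with $\tau = 3$ carries about as much information as 6700 independent draws.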

CHAINS THAT HAVE NOT MIXED YET

    >> chainstats(chain,names)
    MCMC statistics, nsimu = 1000

                     mean         std      MC_err         tau      geweke
    ---------------------------------------------------------------------
    theta_1      0.037146     0.29164    0.054374      67.086    0.017371
    theta_2      -0.98318     0.39323    0.082073      112.74     0.43098
    theta_3     0.0077533     0.37183    0.076218       101.3    0.013535
    theta_4      0.058239     0.34665    0.069839      101.88    0.014742
    ---------------------------------------------------------------------

CHAINS THAT HAVE BETTER MIXING

    >> chainstats(chain,names)
    MCMC statistics, nsimu = 1000

                     mean         std      MC_err         tau      geweke
    ---------------------------------------------------------------------
    theta_1      0.047117     0.42071    0.068536      43.274    0.027867
    theta_2       -1.1139     0.49858    0.088196      47.542     0.60531
    theta_3      0.050454     0.43527    0.069026      41.525    0.023323
    theta_4      0.033922     0.42226    0.067013      43.516    0.038749
    ---------------------------------------------------------------------

ERGODICITY

The theoretical correctness of MCMC methods may be expressed by the following ergodicity theorem. Let $\theta_1, \ldots, \theta_n$ be the samples produced by an MCMC algorithm. The following should be valid, and indeed is, for example for the random walk Metropolis algorithm with a Gaussian proposal:

Theorem. Let $\pi$ be the density function of a target distribution in the Euclidean space $\mathbb{R}^d$. Then the MCMC algorithm simulates the distribution $\pi$ correctly: for an arbitrary bounded and measurable function $f: \mathbb{R}^d \to \mathbb{R}$ it ('almost surely') holds that

$$\lim_{n\to\infty} \frac{1}{n}\left(f(\theta_1) + \ldots + f(\theta_n)\right) = \int_{\mathbb{R}^d} f(\theta)\,\pi(d\theta).$$

ERGODICITY AND PREDICTIVE INFERENCE

The ergodicity theorem simply states that the sampled values asymptotically approach the theoretically correct ones, for each realization of the chain. Note the role of the function $f$. If $f$ is the characteristic function of a set $A$, then the right-hand side of the equation gives the probability measure of $A$, while the left-hand side gives the frequency of 'hits' to $A$ by the sampling. But $f$ might also be our model and then $f(\theta)$ is the model output assuming the parameter value $\theta$. The theorem states that the values calculated at the sampled parameters correctly give the distribution of the model predictions, the so-called predictive distribution.

PREDICTIVE INFERENCE

Predictive inference means studying the posterior distribution of the model predictions. In many applications these predictions are more interesting than the actual values of the model parameters. The nice feature of MCMC analyses is the ease with which you can make probability statements and inference on the values that can be calculated from the model. We simply calculate the model prediction for each row of the chain (or a random subset of it) and we have a sample from the posterior predictive distribution.
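The recipe of evaluating the model for each sampled row can be sketched as follows. The chain here is a made-up stand-in for real MCMC output, and the model function, prediction grid and subset size are likewise assumptions for the illustration.

```matlab
rng(7);
% Stand-in chain: in practice this comes from an MCMC run (values are made up).
nsimu = 2000;
chain = [2 + 0.1*randn(nsimu,1), 0.5 + 0.05*randn(nsimu,1)];
modelfun = @(x, theta) theta(1) * exp(-theta(2) * x);   % hypothetical model

xpred = linspace(0, 5, 20);                 % prediction points
isub = randperm(nsimu, 500);                % random subset of chain rows
ypred = zeros(numel(isub), numel(xpred));
for j = 1:numel(isub)
    ypred(j,:) = modelfun(xpred, chain(isub(j),:));   % one prediction per sampled row
end
ysort = sort(ypred, 1);                     % sort each column separately
lims = ysort(round([0.025, 0.5, 0.975] * numel(isub)), :);  % approx. 2.5, 50, 97.5% limits
```

The rows of `lims` give a pointwise 95% envelope and median for the model response. To predict new observations rather than the model response, simulated observation noise would be added to each row of `ypred`.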

PARAMETER UNCERTAINTY

MODEL PREDICTION UNCERTAINTY

MCMC TOOLBOX FOR MATLAB

Matlab toolbox for adaptive MCMC

    model.ssfun = @mycostfun
    data = load('datafile.dat');
    parameters = {
        {'par1', 2.3 }
        {'par2', 1.2 }
    };
    options.nsimu = 5000;
    options.method = 'am';
    [results,chain] = mcmcrun(model,data,parameters,options);
    mcmcplot(chain,[],results)

https://www.github.com/mjlaine/mcmcstat/

SOME MCMC THEORY

Let's review some theory of Markov chains related to MCMC.

SOME MCMC THEORY

Let $\theta$ be a parameter vector having values in a parameter space $\Theta$, indexing a family of possible probability distributions $p(y|\theta)$ describing our observations $y$.

The Bayes formula gives the posterior $\pi$ in terms of the likelihood and the prior $p(\theta)$

$$\pi(\theta) := p(\theta|y) = \frac{p(y|\theta)p(\theta)}{\int p(y|\theta)p(\theta)\,d\theta}.$$

Markov chain Monte Carlo methods construct a Markov chain which has the space $\Theta$ as its state space and $\pi$ as its limiting stationary distribution. That means we have a way of sampling values from the posterior distribution $\pi$ and can therefore make Monte Carlo inference about $\theta$ in the form of sample averages and density estimates.

MARKOV CHAIN MONTE CARLO – MCMC

A Markov chain is described by a transition kernel $P(\theta, d\theta')$ that gives for each state $\theta$ the probability distribution for the chain to move to state $d\theta'$ in the next step. Below we assume that there exists a corresponding transition density $p(\theta, \theta')$.

MCMC methods produce chains that are aperiodic, irreducible and fulfil a reversibility condition called the detailed balance equation:

$$\pi(\theta)p(\theta, \theta') = \pi(\theta')p(\theta', \theta), \quad \theta, \theta' \in \Theta.$$

If $\pi$ is the initial distribution of the starting state, then the intensity of going from state $\theta$ to state $\theta'$ is the same as that of going from $\theta'$ to $\theta$.

A direct consequence of the reversibility is

$$\int \pi(\theta)p(\theta, \theta')\,d\theta = \pi(\theta'), \quad \text{for all } \theta' \in \Theta,$$

which means that $\pi$ is the stationary distribution of the chain and we can use a sample from the chain as a random sample from $\pi$.
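The step from detailed balance to stationarity is short enough to write out: integrate both sides of the detailed balance equation over $\theta$ and use the fact that a transition density integrates to one over its second argument.

```latex
\begin{aligned}
\int \pi(\theta)\,p(\theta, \theta')\,d\theta
  &= \int \pi(\theta')\,p(\theta', \theta)\,d\theta
     && \text{(detailed balance)} \\
  &= \pi(\theta') \int p(\theta', \theta)\,d\theta \\
  &= \pi(\theta')
     && \text{(the chain moves somewhere with probability 1).}
\end{aligned}
```

In words: if the current state is distributed according to $\pi$, then so is the next state.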

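THE METROPOLIS-HASTINGS ALGORITHM

In the Metropolis-Hastings algorithm we generate a Markov chain with a transition density

$$p(\theta, \theta') = q(\theta, \theta')\alpha(\theta, \theta'), \quad \theta \neq \theta',$$

$$p(\theta, \theta) = 1 - \int q(\theta, \theta')\alpha(\theta, \theta')\,d\theta',$$

for some proposal density $q$ and acceptance probability $\alpha$.

The chain is reversible if and only if

$$\pi(\theta)q(\theta, \theta')\alpha(\theta, \theta') = \pi(\theta')q(\theta', \theta)\alpha(\theta', \theta),$$

which leads us to choose $\alpha$ as

$$\alpha(\theta, \theta') = \min\left\{1, \frac{\pi(\theta')q(\theta', \theta)}{\pi(\theta)q(\theta, \theta')}\right\}.$$

Usually $\Theta \subset \mathbb{R}^d$, but the reversibility condition can be formulated in a more general state space.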
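A minimal sketch of the algorithm, assuming a symmetric Gaussian random walk proposal so that $q(\theta', \theta)/q(\theta, \theta')$ cancels from $\alpha$. The function name `metropolis_hastings`, the step size and the standard normal test target are illustrative assumptions, not part of the lecture material.

```python
import numpy as np

def metropolis_hastings(log_target, theta0, n_steps, step=1.0, seed=0):
    """Random walk Metropolis-Hastings with proposal N(theta, step^2 I)."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    logp = log_target(theta)
    chain = np.empty((n_steps, theta.size))
    n_accept = 0
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.size)
        logp_prop = log_target(proposal)
        # Symmetric q: accept with probability min(1, pi(theta') / pi(theta)),
        # evaluated on the log scale.
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
            n_accept += 1
        chain[i] = theta  # on rejection the current point is repeated
    return chain, n_accept / n_steps

def log_std_normal(theta):
    """Unnormalized log density of a standard normal target."""
    return -0.5 * float(np.sum(theta ** 2))

# Usage: sample a one-dimensional standard normal.
chain, acc_rate = metropolis_hastings(log_std_normal, [0.0], n_steps=20000, step=2.4)
```

Only the unnormalized log target is needed. For an asymmetric proposal the term $\log q(\theta', \theta) - \log q(\theta, \theta')$ would have to be added to the log ratio.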


NOTES

In the MH algorithm we need to calculate the posterior ratio $\pi(\theta')/\pi(\theta)$ in the formula for $\alpha$, but the Bayes formula gives this in terms of the likelihood and prior as

$$\frac{p(y|\theta')p(\theta')}{p(y|\theta)p(\theta)},$$

and the constant of proportionality disappears.

We need some theory of Markov chains, but with important simplifications: we know by construction that the stationary distribution $\pi$ exists. Also, we are able to choose the initial distribution as we like. That gives us simple ways to prove important ergodic properties of the MH chain: the law of large numbers, which gives us permission to use sample averages as estimates, and the central limit theorem, which gives us the convergence rate for the algorithms.
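In implementations the ratio is usually evaluated on the log scale, which avoids numerical underflow when the likelihood is a product of many small terms. A worked form, assuming a general proposal $q$ and a uniform random number $u \sim U(0,1)$ for the accept/reject decision:

```latex
\log \alpha(\theta, \theta') = \min\Bigl\{0,\;
    \log p(y|\theta') + \log p(\theta') - \log p(y|\theta) - \log p(\theta)
    + \log q(\theta', \theta) - \log q(\theta, \theta') \Bigr\},
\qquad \text{accept } \theta' \text{ if } \log u < \log \alpha(\theta, \theta').
```

For a symmetric proposal the two $\log q$ terms cancel.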


REVERSIBLE JUMP METROPOLIS-HASTINGS ALGORITHM

The detailed balance equation can also be formulated in very general state spaces. For the Metropolis-Hastings algorithm to work, it must only accept states from which there is a positive probability of making a reversible move back to the original state.

For the reversible jump Metropolis-Hastings algorithm the state space is written as

$$E = \left\{(k, \theta^{(k)}),\ k \in \mathcal{K},\ \theta^{(k)} \in \Theta_k\right\},$$

where $\mathcal{K}$ is an enumerable model space and $\Theta_k$, with $\theta^{(k)} \in \Theta_k$, is the parameter space of model $k$. The dimension of $\theta^{(k)}$ can vary with $k$.

The posterior distribution can be factorized as

$$\pi(\theta^{(k)}, k) = \pi(\theta^{(k)}|k)\pi(k),$$

and we might be interested in the posterior probabilities of different models, $\pi(k)$, and draw conditional or marginal conclusions about different models in terms of $\pi(\theta^{(k)}|k)$ or $\pi(\theta^{(k)})$.


IMPLEMENTING RJMCMC


STEP OF THE ALGORITHM:

When in model $k_i$ with parameter vector $\theta_i^{(k_i)}$:

1. Choose a new model $j$ by drawing it from distribution $p(i, \cdot)$. Propose a value for the parameter $\theta^{(j)}$ by generating $u$ from distribution $q_{j k_i}(\theta_i^{(k_i)}, u)$.

2. Accept the move with probability (**): $k_{i+1} = j$ and $\theta_{i+1}^{(k_{i+1})} = \theta^{(j)}$.

3. If the move is not accepted, stay in the current model: $k_{i+1} = k_i$ and $\theta_{i+1}^{(k_{i+1})} = \theta_i^{(k_i)}$.

In step (ii) it is also possible to choose to stay in the current model and do a standard Metropolis-Hastings step.


ADAPTIVE MCMC METHODS

A short description of adaptive MCMC methods, which were developed to solve a large number of estimation problems without hand tuning of the algorithm.

MCMC adaptation raises many interesting theoretical questions about the convergence of the methods.


ADAPTIVE METHODS, AM

The bottleneck in MCMC (Metropolis) calculations is often to find a proposal that matches the target distribution, so that the sampling is efficient. This may lead to time-consuming trial-and-error tuning of the proposal. Various adaptive methods have been developed in order to improve the proposal during the run. One relatively simple way is to compute the covariance matrix of the chain and use it as the proposal covariance. This is called AM, the Adaptive Metropolis algorithm. In AM the new point depends not just on the previous point, but on the earlier history of the chain, so the algorithm is no longer Markovian. However, if the adaptation is based on an increasing part of the chain, one can prove that the algorithm produces a correct ergodic result.


ADAPTIVE METHODS, AM

Adaptive Metropolis is a random walk algorithm that uses a Gaussian proposal with a covariance $C_n$ that depends on the chain generated so far,

$$C_n = \begin{cases} C_0, & n \le n_0, \\ s_d \operatorname{cov}(\theta_1, \dots, \theta_n) + s_d \varepsilon I_d, & n > n_0, \end{cases}$$

where $s_d$ is a parameter that depends only on the dimension $d$ of the sampling space, $\varepsilon > 0$ is a constant that we may choose very small, and $n_0 > 0$ defines the length of the initial non-adaptation period.


ADAPTIVE METHODS, AM

This form of adaptation can be proven to be ergodic. Note that the same adaptation, but with a fixed update history length, is not ergodic. The choice of the length of the initial non-adaptive portion of the simulation, $n_0$, is free. The adaptation need not be done at each time step, only at given intervals. This form of adaptation improves the mixing properties of the algorithm, especially for high dimensions. The role of the parameter $\varepsilon$ is to ensure that $C_n$ will not become singular. In most practical cases $\varepsilon$ can be safely set to zero. The scaling parameter is usually taken as $s_d = 2.4^2/d$.
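The adaptation rule can be sketched in a few lines. This is a simplified illustration, not the reference implementation: the function name `adaptive_metropolis`, the default values of `n0`, `adapt_interval` and `eps`, and the correlated Gaussian test target are assumptions, and the covariance is recomputed from the whole chain at each adaptation instead of being updated recursively.

```python
import numpy as np

def adaptive_metropolis(log_target, theta0, n_steps, C0,
                        n0=100, adapt_interval=20, eps=1e-8, seed=0):
    """Random walk Metropolis whose Gaussian proposal covariance is adapted
    to s_d * cov(chain so far) + s_d * eps * I after the first n0 steps."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    d = theta.size
    sd = 2.4 ** 2 / d                      # usual scaling s_d = 2.4^2 / d
    chol = np.linalg.cholesky(np.atleast_2d(C0))
    logp = log_target(theta)
    chain = np.empty((n_steps, d))
    for n in range(n_steps):
        proposal = theta + chol @ rng.standard_normal(d)
        logp_prop = log_target(proposal)
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
        chain[n] = theta
        # Adapt only at given intervals, using the full history of the chain.
        if n + 1 > n0 and (n + 1) % adapt_interval == 0:
            emp_cov = np.atleast_2d(np.cov(chain[: n + 1], rowvar=False))
            chol = np.linalg.cholesky(sd * emp_cov + sd * eps * np.eye(d))
    return chain

# Usage: a correlated 2-d Gaussian target, started from a small proposal.
target_cov = np.array([[1.0, 0.9], [0.9, 1.0]])
target_prec = np.linalg.inv(target_cov)
chain = adaptive_metropolis(lambda th: -0.5 * th @ target_prec @ th,
                            np.zeros(2), 20000, C0=0.1 * np.eye(2))
```

The small `eps` term is kept here so that the Cholesky factorization cannot fail if the early chain is nearly degenerate.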


DR, THE DELAYED REJECTION ALGORITHM

Suppose the current position of a sampled chain is $\theta_n = \theta$. As in regular MH, a candidate move $\theta'_1$ is generated from a proposal $q_1(\theta, \cdot)$ and accepted with the usual probability

$$\alpha_1(\theta, \theta'_1) = 1 \wedge \frac{\pi(\theta'_1)q_1(\theta'_1, \theta)}{\pi(\theta)q_1(\theta, \theta'_1)}.$$

Upon rejection, instead of retaining the same position, $\theta_{n+1} = \theta$, a second stage move, $\theta'_2$, is proposed. The second stage proposal is allowed to depend not only on the current position of the chain but also on what we have just proposed and rejected: $q_2(\theta, \theta'_1, \cdot)$.


DR, THE DELAYED REJECTION ALGORITHM

It can be shown that an ergodic chain is created when the second stage proposal is accepted with probability

$$\alpha_2(\theta, \theta'_1, \theta'_2) = 1 \wedge \frac{\pi(\theta'_2)\,q_1(\theta'_2, \theta'_1)\,q_2(\theta'_2, \theta'_1, \theta)\,[1 - \alpha_1(\theta'_2, \theta'_1)]}{\pi(\theta)\,q_1(\theta, \theta'_1)\,q_2(\theta, \theta'_1, \theta'_2)\,[1 - \alpha_1(\theta, \theta'_1)]}.$$

This process of delaying rejection can be iterated to try sampling from further proposals in case of rejection by the present one. However, in many cases the essential benefit (more accepted points in a situation where a single, e.g. Gaussian, proposal does not seem to work properly) is already reached by the above 2-stage DR algorithm.
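A sketch of the 2-stage scheme under simplifying assumptions: both proposals are isotropic Gaussians centred at the current point, with standard deviations `s1` and `s2`, and the second stage proposal does not use the rejected point. Then the $q_2$ terms cancel from $\alpha_2$ and $\alpha_1$ reduces to a ratio of target densities, while the $q_1$ terms remain. The names `dr_chain`, `s1`, `s2` and the test target are illustrative.

```python
import numpy as np

def log_q1(x_from, x_to, s1):
    """Log density of N(x_from, s1^2 I) at x_to, up to a constant that cancels."""
    return -0.5 * np.sum((x_to - x_from) ** 2) / s1 ** 2

def dr_chain(log_target, theta0, n_steps, s1, s2, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    logp = log_target(theta)
    chain = np.empty((n_steps, theta.size))
    for n in range(n_steps):
        # Stage 1: ordinary Metropolis step with the wide proposal.
        prop1 = theta + s1 * rng.standard_normal(theta.size)
        logp1 = log_target(prop1)
        alpha1 = np.exp(min(0.0, logp1 - logp))
        if rng.uniform() < alpha1:
            theta, logp = prop1, logp1
        else:
            # Stage 2: try a narrower proposal around the current point.
            prop2 = theta + s2 * rng.standard_normal(theta.size)
            logp2 = log_target(prop2)
            # alpha_1 of the reverse path, from prop2 to the rejected prop1.
            alpha1_rev = np.exp(min(0.0, logp1 - logp2))
            if alpha1_rev < 1.0:
                log_num = logp2 + log_q1(prop2, prop1, s1) + np.log1p(-alpha1_rev)
                log_den = logp + log_q1(theta, prop1, s1) + np.log1p(-alpha1)
                if np.log(rng.uniform()) < log_num - log_den:
                    theta, logp = prop2, logp2
        chain[n] = theta
    return chain

# Usage: standard normal target with a first stage proposal that is too wide.
chain = dr_chain(lambda th: -0.5 * float(np.sum(th ** 2)),
                 np.zeros(1), 20000, s1=5.0, s2=0.5, seed=1)
```

When the reverse-path $\alpha_1$ equals one, the numerator of $\alpha_2$ vanishes and the second stage move is rejected outright, which the code handles by skipping the comparison.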


DR WITH ADAPTATION: DRAM

It is possible to combine the ideas of adaptation and delayed rejection. To avoid complications, a direct way of implementing AM adaptation with an $m$-stage DR algorithm is suggested. The proposal at the first stage of DR is adapted just as in AM: the covariance $C_n^1$ for the Gaussian proposal is computed from the points of the sampled chain, no matter at which stage of DR these points have been accepted in the sample path. The covariance $C_n^i$ of the proposal for the $i$'th stage ($i = 2, \dots, m$) is computed as a scaled version of the first proposal, $C_n^i = \gamma_i C_n^1$. The scale factors $\gamma_i$ can be freely chosen: the proposals of the higher stages can have a smaller or larger variance than the proposal at earlier stages. We have seen that AM alone typically recovers from an initial proposal that is too small, while the adaptation has difficulties if no or only a few accepted points are created at the start. So a good default is to use just a 2-stage version where the second proposal is scaled down from the (adapted) proposal of the first stage.


DRAM - DELAYED REJECTION ADAPTIVE METROPOLIS

DRAM version of the MH algorithm that proposes from a Gaussian distribution. The second stage proposal covariance is half the size of the first stage. Adaptation is done after every 20 iterations. The animation shows the first 100 DRAM steps.


SHORT CHAINS AND ADAPTATION

It is important to make short chains as efficient as possible. Efficient: produce estimates with small Monte Carlo error.

Short MCMC chain repeated 1000 times with different algorithms, Gaussian 10-dimensional target and too large initial covariance.


SHORT CHAINS AND ADAPTATION

But adaptation might slow the convergence.

Same as in the previous slide, but now with a closer to optimal initial proposal: Gaussian 10-dimensional target, near optimal initial covariance.


FASTER MCMC: PARALLEL CHAINS

Random walk MCMC is by nature sequential, and it is generally more efficient to run one long chain than many short independent chains. One could run several chains in parallel, without communication between the chains. In parallel adaptive MCMC, the adaptation is done over the points in all chains and they share one common adapted proposal covariance. Communication between the chains can be asynchronous.


EARLY REJECTION

In many cases $SS(\theta)$ is a monotonically increasing function with respect to adding new observations or simulating the model further in time. With the acceptance probability $\alpha(\theta_{curr}, \theta_{prop})$ defined as before, we draw $u \sim U(0, 1)$, and reject if

$$-2\log(u) < \frac{SS(\theta_{prop}) - SS(\theta_{curr})}{\sigma^2} + S_{pri}(\theta_{prop}) - S_{pri}(\theta_{curr}).$$

If we write this as

$$SS_{crit} = -2\log(u) + SS(\theta_{curr})/\sigma^2 + S_{pri}(\theta_{curr}) < SS(\theta_{prop})/\sigma^2 + S_{pri}(\theta_{prop}),$$

then we can stop evaluating the model, and reject the proposal, as soon as $SS(\theta_{prop}) \ge (SS_{crit} - S_{pri}(\theta_{prop}))\,\sigma^2$.
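The stopping rule can be sketched as follows, assuming the residuals of the proposed point can be produced one at a time (for example while the model is simulated forward in time). The function names and the numbers in the usage example are hypothetical. Note that $u$ is drawn before the model is evaluated, since the limit depends on it.

```python
import numpy as np

def ss_with_early_rejection(residuals, ss_limit):
    """Accumulate squared residuals and stop as soon as the running sum
    reaches ss_limit. Returns (accepted, number of residuals evaluated)."""
    ss = 0.0
    n_used = 0
    for r in residuals:
        n_used += 1
        ss += r * r
        if ss >= ss_limit:
            return False, n_used      # rejected without finishing the model run
    return True, n_used

def early_rejection_step(residuals_prop, ss_curr, spri_curr, spri_prop, sigma2, u):
    """Decide acceptance of the proposal for a given u ~ U(0, 1)."""
    ss_crit = -2.0 * np.log(u) + ss_curr / sigma2 + spri_curr
    ss_limit = (ss_crit - spri_prop) * sigma2
    return ss_with_early_rejection(residuals_prop, ss_limit)

# Usage with made-up residuals: the current point fits well (SS about 0.1),
# the proposal fits worse (each residual is 0.5), and the prior terms are zero.
res_curr = np.full(10, 0.1)
res_prop = iter(np.full(10, 0.5))     # an iterator stands in for a lazy model run
accepted, n_used = early_rejection_step(res_prop, float(np.sum(res_curr ** 2)),
                                        0.0, 0.0, sigma2=1.0, u=0.5)
```

In this example the running sum of squares passes the limit at the sixth residual, so the last four model outputs would never need to be computed.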


OTHER MCMC VARIANTS AND IMPLEMENTATIONS


GRADIENT BASED METHODS

Langevin diffusion.
Hamiltonian methods.

Both need (automatic) differentiation of the likelihood/forward model.

Other toolboxes

STAN: https://mc-stan.org
PYMC3: https://docs.pymc.io
FME: https://cran.r-project.org/package=FME

Nice animations of different MCMC methods by Chi Feng: https://chi-feng.github.io/mcmc-demo/

See also the blog post by Colin Carroll, "Hamiltonian Monte Carlo from scratch".


EXAMPLE: DYNAMICAL STATE SPACE MODELS AND MCMC

As a more advanced example of modelling with MCMC methods, we consider a general state space model as a framework for multivariate time series analysis. This is closely connected with Janne's lectures about the Kalman filter.


DLM – DYNAMIC LINEAR MODEL

State space form of the general model

$$x_t = M_t x_{t-1} + E_t, \quad E_t \sim N_m(0, Q_t),$$

$$y_t = H_t x_t + \epsilon_t, \quad \epsilon_t \sim N_p(0, R_t).$$

Model operator $M_t$, observation operator $H_t$, model error covariance $Q_t$, observation error covariance $R_t$, with time index $t = 0, 1, \dots, n$.

Bayesian hierarchical model

Observation model: $p(y_t|x_t, \theta)$.
Process model: $p(x_{t+1}|x_t, \theta)$.
Parameter model: $p(\theta)$.

Bayes formula

$$p(x_{1:n}, \theta|y_{1:n}) \propto \left[\prod_{t=1}^{n} p(y_t|x_t, \theta)\,p(x_t|x_{t-1}, \theta)\right] p(\theta).$$


ESTIMATING LINEAR STATE SPACE MODEL

For dynamic linear models we have computational tools for all relevant statistical distributions.

$p(x_{t+1}|x_t, y_{1:t}, \theta)$: one step predictions by Kalman filter.
$p(x_t|y_{1:t}, \theta)$: filtering distribution by Kalman filter.
$p(x_t|y_{1:n}, \theta)$: smoothing distribution by Kalman smoother.
$p(x_{1:n}|y_{1:n}, \theta)$: joint state by simulation smoother.
$p(y_{1:n}|\theta)$: parameter likelihood by Kalman filter likelihood.
$p(x_{1:n}, \theta|y_{1:n})$: joint state and parameter by MCMC.

The parameter $\theta$ contains the auxiliary model parameters related to the model structure and to the observation and model error covariances. The parameters can be fixed by prior knowledge, estimated by maximum likelihood, or estimated and marginalized over by MCMC.


TREND ANALYSIS BY SIMULATION

Kalman formulas give marginal distributions $p(x_t|y_{1:n}, \theta)$. We can simulate DLM states from $p(x_{1:n}|y_{1:n}, \theta)$.

We need MCMC to integrate out the uncertainty about $\theta$ and simulate from

$$p(x_{1:n}|y_{1:n}) = \int p(x_{1:n}|y_{1:n}, \theta)\,p(\theta|y_{1:n})\,d\theta.$$

This is needed, for example, to get uncertainty estimates of trend related statistics in time series analysis.


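ESTIMATING PARAMETERS

How to select the model matrix $M$?

By considering all the relevant processes.
Diagnosing the residuals.

How to choose the initial state distribution $N(x_0, C_0)$? By assuming diffuse priors.

How to estimate error covariances $Q_t$ and $R_t$?

By Kalman filter likelihood:

$$-2\log(p(y_{1:n}|\theta)) = \text{const} + \sum_{t=1}^{n} \left[(y_t - H_t\hat{x}_t)^T C_{y_t}^{-1}(y_t - H_t\hat{x}_t) + \log(|C_{y_t}|)\right],$$

where $\hat{x}_t$ is the one step ahead predicted state and $C_{y_t}$ the covariance of the predicted observation, both given by the Kalman filter.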
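A sketch of how the Kalman filter likelihood can be accumulated, assuming time-invariant matrices $M$, $H$, $Q$, $R$ and an initial state $N(x_0, C_0)$. The function name `kalman_neg2loglik` and the local level example values are illustrative assumptions, and the additive constant is dropped.

```python
import numpy as np

def kalman_neg2loglik(y, M, H, Q, R, x0, C0):
    """-2 log p(y_{1:n} | theta), up to an additive constant, for the model
    x_t = M x_{t-1} + E_t, y_t = H x_t + eps_t with fixed M, H, Q, R."""
    x, C = x0, C0
    total = 0.0
    for yt in y:
        # One step ahead prediction of the state.
        x = M @ x
        C = M @ C @ M.T + Q
        # Prediction residual and its covariance C_y.
        v = yt - H @ x
        Cy = H @ C @ H.T + R
        total += v @ np.linalg.solve(Cy, v) + np.linalg.slogdet(Cy)[1]
        # Measurement update (filtering step).
        K = C @ H.T @ np.linalg.inv(Cy)
        x = x + K @ v
        C = C - K @ H @ C
    return total

# Usage: local level model with scalar state and observation (made-up numbers).
y = np.array([[1.1], [0.9], [1.4], [1.2], [0.8]])
M = np.eye(1)
H = np.eye(1)
Q = np.array([[0.1]])
R = np.array([[0.5]])
x0 = np.array([1.0])
C0 = np.array([[2.0]])
value = kalman_neg2loglik(y, M, H, Q, R, x0, C0)
```

In an MCMC run over $\theta$, a function like this would be called once per proposed $\theta$, with $Q$ and $R$ built from the proposed variance parameters.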


DLM TOOLBOX

Matlab toolbox for DLM: https://www.github.com/mjlaine/dlm/

Ozone time series example: https://mjlaine.github.io/dlm/ex/ozonedemo.html

More info on DLM here: https://arxiv.org/abs/1903.11309


THAT'S ALL!

H. Haario, E. Saksman, J. Tamminen: An adaptive Metropolis algorithm, Bernoulli, 7(2), 2001. http://dx.doi.org/10.2307/3318737

H. Haario, E. Saksman, J. Tamminen: Componentwise adaptation for high dimensional MCMC, Computational Statistics, 20(2), 2005. http://dx.doi.org/10.1007/BF02789703

H. Haario, M. Laine, A. Mira, E. Saksman: DRAM: Efficient adaptive MCMC, Statistics and Computing, 16(3), 2006. http://dx.doi.org/10.1007/s11222-006-9438-0

H. Haario, M. Laine, M. Lehtinen, E. Saksman, J. Tamminen: MCMC methods for high dimensional inversion in remote sensing, Journal of the Royal Statistical Society, Series B, 66(3), 2004. http://dx.doi.org/10.1111/j.1467-9868.2004.02053.x

M. Laine, J. Tamminen: Aerosol model selection and uncertainty modelling by adaptive MCMC technique, Atmospheric Chemistry and Physics, 8(24), 2008. http://dx.doi.org/10.5194/acp-8-7697-2008

A. Solonen, P. Ollinaho, M. Laine, H. Haario, J. Tamminen, H. Järvinen: Efficient MCMC for Climate Model Parameter Estimation: Parallel Adaptive Chains and Early Rejection, Bayesian Analysis, 7(3), 2012. http://dx.doi.org/10.1214/12-BA724

M. Laine: Introduction to Dynamic Linear Models for Time Series Analysis, in Geodetic Time Series Analysis and Applications, Springer 2019. https://arxiv.org/abs/1903.11309


EXERCISES


NON-LINEAR MODEL FITTING

Take a simple non-linear model. For example exponential decay:

$$y = \theta_1 + (A_0 - \theta_1)\exp(-\theta_2 t)$$

with data

data.tdata = [1,2,3,4,5,6,7,8,9,10]';
data.ydata = [0.487 0.572 0.369 0.179 0.119 0.0809 0.104 0.091 0.047 0.051]';

Here, you can assume first that $A_0 = 1$.

Familiarize yourself with some MCMC package (Matlab, Python, R) or even write your own code. Write the code needed for the basic model $y = f(x, \theta) + \epsilon$, with i.i.d. Gaussian errors, $\epsilon \sim N(0, \sigma^2 I)$.
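As one possible starting point for the exercise, the model and the sum-of-squares function might be written as below. The function names and the flat prior used in `log_posterior` are assumptions made here for illustration; the data values are those given above.

```python
import numpy as np

# Data from the exercise.
tdata = np.arange(1, 11, dtype=float)
ydata = np.array([0.487, 0.572, 0.369, 0.179, 0.119,
                  0.0809, 0.104, 0.091, 0.047, 0.051])

def model(t, theta, A0=1.0):
    """Exponential decay y = theta1 + (A0 - theta1) * exp(-theta2 * t)."""
    return theta[0] + (A0 - theta[0]) * np.exp(-theta[1] * t)

def sum_of_squares(theta):
    """Residual sum of squares SS(theta) for the data above."""
    return float(np.sum((ydata - model(tdata, theta)) ** 2))

def log_posterior(theta, sigma2):
    """Unnormalized log posterior for i.i.d. Gaussian errors with known
    variance sigma2, assuming a flat prior on theta."""
    return -0.5 * sum_of_squares(theta) / sigma2

# Example evaluation at an arbitrary parameter value.
ss_example = sum_of_squares(np.array([0.0, 0.3]))
```

A function like `log_posterior` is what a Metropolis-type sampler needs as its target; the choice of `sigma2` and of the prior is part of the exercise.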


NON-LINEAR MODEL FITTING (CONT.)

With your chosen MCMC package.

Estimate posterior $p(\theta|y)$ with MCMC.
Study the chain convergence. How long chains do you need?
Write down parameter estimates and the Monte Carlo errors of the estimates of posterior mean and posterior covariance.
Consider different strategies to handle uncertainty in observation, $\sigma$.
Generate suitable predictive envelopes around the best fit model.
How would you handle the fact that observations are constrained to be positive?
Add $A_0$ as an extra parameter.


EFFICIENCY OF MCMC METHODS

How would you judge correctness, convergence and the efficiency of an MCMC sampler?

Compare your chosen MCMC sampler to a known Gaussian target (or even to the "banana" target).
Do sampling multiple times for the same target and see how the mean estimates behave. Could you infer the same from a single run?
Estimate Monte Carlo error by integrated autocorrelation and by batch means. Do you get similar results?
What are the benefits of different adaptation schemes?
When do you need burn-in?
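For the Monte Carlo error question, a batch means estimate could be sketched as below. The function name `batch_means_se` and the number of batches are assumptions; only the batch means method is shown, not the integrated autocorrelation estimate. The usage line applies it to an independent sample, for which the result should be close to the standard deviation divided by $\sqrt{N}$.

```python
import numpy as np

def batch_means_se(x, n_batches=20):
    """Monte Carlo standard error of the mean of a (possibly correlated)
    sequence x, estimated from the spread of non-overlapping batch means."""
    x = np.asarray(x, dtype=float)
    batch_len = len(x) // n_batches
    # Drop any leftover points so the sequence splits evenly into batches.
    batches = x[: batch_len * n_batches].reshape(n_batches, batch_len)
    means = batches.mean(axis=1)
    return float(means.std(ddof=1) / np.sqrt(n_batches))

# Sanity check on an independent N(0, 1) sample of length 10000,
# where the true standard error of the mean is 0.01.
rng = np.random.default_rng(0)
se_iid = batch_means_se(rng.standard_normal(10000), n_batches=100)
```

For a real chain, the batches must be long compared with the autocorrelation time, otherwise the batch means are still correlated and the error is underestimated.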


LOGISTIC MODEL WITH POISSON LIKELIHOOD

Consider a model for discrete events from Poisson distributions $N_{obs} \sim \text{Poisson}(\mu(t))$, where the mean parameter $\mu(t) = E(N)$ is modelled further with a logistic function

$$\mu' = k\mu(\mu_{max} - \mu),$$

and assuming $N(0) = 1$.

Fit the model parameters $k$ and $\mu_{max}$ using MCMC and study the predictive behaviour of the fit, when we have the following observations (next slide).


LOGISTIC MODEL WITH POISSON LIKELIHOOD (CONT.)

time $t$    $N_{obs}$
0           1
2           1
4           3
6           5
8           3
10          4
12          3
14          0
16          1
18          0
20          1
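One way to set up the likelihood for this exercise is sketched below. It uses the closed-form solution of the logistic equation, $\mu(t) = \mu_{max} / (1 + (\mu_{max}/\mu_0 - 1)\exp(-k\mu_{max}t))$, and reads the condition $N(0) = 1$ as the initial mean $\mu_0 = \mu(0) = 1$, which is an interpretation made here. The function names and the parameter values in the example call are illustrative.

```python
import math
import numpy as np

# Observations from the table above.
t_obs = np.arange(0, 22, 2, dtype=float)
n_obs = np.array([1, 1, 3, 5, 3, 4, 3, 0, 1, 0, 1])

def mu(t, k, mu_max, mu0=1.0):
    """Solution of mu' = k * mu * (mu_max - mu) with mu(0) = mu0."""
    return mu_max / (1.0 + (mu_max / mu0 - 1.0) * np.exp(-k * mu_max * t))

def poisson_loglik(k, mu_max):
    """Log likelihood of the counts, N_obs(t) ~ Poisson(mu(t))."""
    m = mu(t_obs, k, mu_max)
    log_factorials = sum(math.lgamma(n + 1) for n in n_obs)
    return float(np.sum(n_obs * np.log(m) - m) - log_factorials)

# Example evaluation at an arbitrary parameter value.
loglik_example = poisson_loglik(0.1, 4.0)
```

Adding log prior densities for $k$ and $\mu_{max}$ to `poisson_loglik` gives an unnormalized log posterior that can be passed to a Metropolis-type sampler.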
