
  • Sampling from log-concave density

    Alain Durmus, Éric Moulines, Marcelo Pereyra

    Télécom ParisTech, École Polytechnique, Bristol University

    Séminaire des jeunes probabilistes et statisticiens, 2016

    Slides: jps.math.cnrs.fr/slides/Durmus.pdf

  • Outline

    1 Motivation

    2 Framework

    3 Sampling from strongly log-concave density

    4 Sampling from log-concave density

    5 Non-smooth potentials

    6 Numerical illustrations

    7 Conclusion

  • Introduction

    Sampling distributions over high-dimensional state spaces has recently attracted a lot of research effort in the computational statistics and machine learning communities...

    Applications (non-exhaustive)

    1 Bayesian inference for high-dimensional models and Bayesian nonparametrics

    2 Bayesian linear inverse problems (typically function-space problems converted to high-dimensional problems by a Galerkin method)

    3 Aggregation of estimators and experts

    Most of the sampling techniques known so far do not scale to high dimension... Challenges are numerous in this area...

  • Bayesian setting (I)

    - In a Bayesian setting, a parameter $\theta \in \mathbb{R}^d$ is endowed with a prior distribution $\pi$, and the observations are given by a probabilistic model:

    $Y \sim \ell(\cdot \mid \theta)$.

    The inference is then based on the posterior distribution:

    $\pi(d\theta \mid Y) = \dfrac{\pi(d\theta)\,\ell(Y \mid \theta)}{\int_{\mathbb{R}^d} \ell(Y \mid u)\,\pi(du)}$.

    In most cases the normalizing constant is not tractable:

    $\pi(d\theta \mid Y) \propto \pi(d\theta)\,\ell(Y \mid \theta)$.

  • Bayesian setting (II)

    Bayesian decision theory relies on computing expectations: $\int_{\mathbb{R}^d} f(\theta)\,\ell(Y \mid \theta)\,\pi(d\theta)$.

    Generic problem: estimation of an expectation $\mathbb{E}_\pi[f]$, where

    - $\pi$ is known only up to a multiplicative factor;

    - we do not know how to sample from $\pi$ (no basic Monte Carlo estimator).

  • Examples: Logistic and probit regression

    Likelihood: binary regression set-up in which the binary observations (responses) $(Y_1, \ldots, Y_n)$ are conditionally independent Bernoulli random variables with success probability $F(\beta^T X_i)$, where

    1 $X_i$ is a $d$-dimensional vector of known covariates,

    2 $\beta$ is a $d$-dimensional vector of unknown regression coefficients,

    3 $F$ is a distribution function.

    Two important special cases:

    1 probit regression: $F$ is the standard normal distribution function,

    2 logistic regression: $F$ is the standard logistic distribution function, $F(t) = e^t/(1 + e^t)$.

  • Examples: Logistic and probit regression

    The posterior density of $\beta$ is given by Bayes' rule, up to a proportionality constant, by $\pi(\beta \mid (Y, X)) \propto \exp(-U(\beta))$, where the potential $U(\beta)$ is given by

    $U(\beta) = -\sum_{i=1}^{n} \left\{ Y_i \log F(\beta^T X_i) + (1 - Y_i) \log\left(1 - F(\beta^T X_i)\right) \right\} + g(\beta)$,

    where $g$ is the negative log-density of the prior distribution.

    Two important cases:

    Gaussian prior: $g(\beta) = (1/2)\,\beta^T \beta$: ridge regression.

    Laplace prior: $g(\beta) = \sum_{k=1}^{d} |\beta_k|$: lasso regression.
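    To make the objects concrete, here is a minimal sketch of this potential and its gradient for the logistic link with a Gaussian (ridge) prior; the function names and the data arrays X, Y are illustrative assumptions, not taken from the slides.

        import numpy as np

        def potential_U(beta, X, Y):
            # U(beta) = -sum_i { Y_i log F(t_i) + (1 - Y_i) log(1 - F(t_i)) } + ||beta||^2 / 2,
            # with t = X beta and logistic link F(t) = e^t / (1 + e^t).
            t = X @ beta
            # log F(t) = -log(1 + e^{-t}) and log(1 - F(t)) = -log(1 + e^{t}); logaddexp is stable.
            loglik = -(Y * np.logaddexp(0.0, -t) + (1.0 - Y) * np.logaddexp(0.0, t)).sum()
            return -loglik + 0.5 * beta @ beta

        def grad_U(beta, X, Y):
            # Gradient of the potential: X^T (F(X beta) - Y) + beta.
            p = 1.0 / (1.0 + np.exp(-(X @ beta)))
            return X.T @ (p - Y) + beta

    With the Gaussian prior this $U$ is smooth and strongly convex; with the Laplace prior the $g$ term is non-smooth, which is the case treated later.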

  • New challenges

    Problem: the number of predictor variables $d$ is large ($10^4$ and up). Examples:

    - text categorization,

    - genomics and proteomics (gene expression analysis),

    - other data mining tasks (recommendations, longitudinal clinical trials, ...).

  • Data Augmentation

    The most popular algorithms for Bayesian inference in ridge binary regression models are based on data augmentation:

    1 probit link: Albert and Chib (1993).

    2 logistic link: Polya-Gamma sampler, Polson and Scott (2012)... !

    Bayesian lexicon:

    - Data Augmentation: instead of sampling $\pi(\beta \mid (Y, X))$, sample $\pi(\beta, W \mid (Y, X))$ and marginalize $W$.

    - Typical application of the Gibbs sampler: sample in turn $\pi(\beta \mid W, Y, X)$ and $\pi(W \mid \beta, X, Y)$; see the sketch below.

    - The choice of the DA should make these two steps reasonably easy...
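    As an illustration of these two Gibbs steps, here is a minimal sketch of the Albert and Chib (1993) data-augmentation sampler for probit regression; the standard Gaussian prior on $\beta$ and all variable names are illustrative assumptions.

        import numpy as np
        from scipy.stats import truncnorm

        def probit_da_gibbs(X, Y, n_iter, rng=np.random.default_rng(0)):
            """Alternate W | beta, Y and beta | W (prior: beta ~ N(0, I))."""
            n, d = X.shape
            B = np.linalg.inv(X.T @ X + np.eye(d))  # posterior covariance of beta | W
            beta = np.zeros(d)
            draws = np.empty((n_iter, d))
            for it in range(n_iter):
                m = X @ beta
                # W_i | beta, Y_i ~ N(m_i, 1) truncated to (0, inf) if Y_i = 1,
                # to (-inf, 0) if Y_i = 0; truncnorm takes standardized bounds.
                lo = np.where(Y == 1, -m, -np.inf)
                hi = np.where(Y == 1, np.inf, -m)
                W = m + truncnorm.rvs(lo, hi, size=n, random_state=rng)
                # beta | W ~ N(B X^T W, B)
                beta = rng.multivariate_normal(B @ (X.T @ W), B)
                draws[it] = beta
            return draws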

  • Data Augmentation algorithms

    These two algorithms have been shown to be uniformly geometrically ergodic, BUT the constants depend heavily on the dimension.

    The algorithms are very demanding in terms of computational resources...

    - applicable only when $d$ is small (10) to moderate (100), but certainly not when $d$ is large ($10^4$ or more);

    - convergence time prohibitive as soon as $d \approx 10^2$.

  • A daunting problem?

    In the case of ridge regression, the potential $\beta \mapsto U(\beta)$ is smooth and strongly convex.

    In the case of lasso regression, the potential $\beta \mapsto U(\beta)$ is non-smooth but still convex...

    A wealth of reasonably fast optimisation algorithms is available to solve this problem in high dimension...

  • Part 2: Framework

  • Framework

    Denote by $\pi$ a target density w.r.t. the Lebesgue measure on $\mathbb{R}^d$, known up to a normalisation factor:

    $x \mapsto e^{-U(x)} \Big/ \int_{\mathbb{R}^d} e^{-U(y)}\,dy$.

    Implicitly, $d \gg 1$. Assumption: $U$ is $L$-smooth: continuously differentiable, and there exists a constant $L$ such that for all $x, y \in \mathbb{R}^d$,

    $\|\nabla U(x) - \nabla U(y)\| \le L\,\|x - y\|$.

  • Langevin diffusion

    Langevin SDE:

    $dY_t = -\nabla U(Y_t)\,dt + \sqrt{2}\,dB_t$,

    where $(B_t)_{t \ge 0}$ is a $d$-dimensional Brownian motion.

    Denote by $(P_t)_{t \ge 0}$ the semigroup of the diffusion, $P_t(x, A) = \mathbb{P}_x[Y_t \in A]$. $(P_t)_{t \ge 0}$ is

    - aperiodic, strong Feller (all compact sets are small);

    - reversible w.r.t. $\pi$ (admits $\pi$ as its unique invariant distribution).

    $\pi \propto e^{-U}$ is reversible, hence the unique invariant probability measure. For all $x \in \mathbb{R}^d$ and all measurable and bounded functions $f : \mathbb{R}^d \to \mathbb{R}$,

    $\lim_{t \to +\infty} P_t f(x) = \lim_{t \to +\infty} \mathbb{E}_x[f(Y_t)] = \int_{\mathbb{R}^d} f(y)\,\pi(dy)$.

  • Discretized Langevin diffusion

    Idea: sample the diffusion paths, using for example the Euler-Maruyama (EM) scheme:

    $X_{k+1} = X_k - \gamma_{k+1} \nabla U(X_k) + \sqrt{2\gamma_{k+1}}\,Z_{k+1}$,

    where

    - $(Z_k)_{k \ge 1}$ is i.i.d. $\mathcal{N}(0, \mathrm{I}_d)$;

    - $(\gamma_k)_{k \ge 1}$ is a sequence of stepsizes, which can either be held constant or be chosen to decrease to 0 at a certain rate.

    Closely related to the gradient algorithm.
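    A minimal sketch of this discretization, the unadjusted Langevin algorithm (ULA); the Gaussian target and the stepsize value are illustrative assumptions, not taken from the slides.

        import numpy as np

        def ula(grad_U, x0, stepsizes, rng=np.random.default_rng(0)):
            """Euler-Maruyama discretization of the Langevin diffusion."""
            x = np.asarray(x0, dtype=float)
            path = [x.copy()]
            for gamma in stepsizes:
                z = rng.standard_normal(x.shape)
                x = x - gamma * grad_U(x) + np.sqrt(2.0 * gamma) * z
                path.append(x.copy())
            return np.array(path)

        # Example: standard Gaussian target, U(x) = ||x||^2 / 2, so grad U(x) = x.
        samples = ula(grad_U=lambda x: x, x0=np.zeros(2),
                      stepsizes=np.full(10_000, 1e-2))

    Dropping the noise term recovers exactly the gradient descent iteration on $U$.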

  • Discretized Langevin diffusion: constant stepsize

    When $\gamma_k = \gamma$, $(X_k)_{k \ge 1}$ is a homogeneous Markov chain with Markov kernel $R_\gamma$.

    Under some appropriate conditions, this Markov chain is irreducible and positive recurrent, with a unique invariant distribution $\pi_\gamma$.

    Problem: $\pi_\gamma \neq \pi$.

  • When $(\gamma_k)_{k \ge 1}$ is nonincreasing

    When $(\gamma_k)_{k \ge 1}$ is nonincreasing and non-constant, $(X_k)_{k \ge 1}$ is an inhomogeneous Markov chain associated with the sequence of Markov kernels $(R_{\gamma_k})_{k \ge 1}$.

    Denote by $\delta_x Q^p$ the law of $X_p$ started at $x$.

    Reminder: the diffusion converges to the target distribution $\pi$.

    Question: since the EM discretization approximates the diffusion, can it be used to sample from $\pi$?

    Is $\delta_x Q^p$ close to $\pi$, and in which sense?

    Can we have some theoretical guarantees? In particular, what is the dependence on the dimension $d$?

  • Metric on probability spaces

    Definition

    For $\mu, \nu$ two probability measures on $\mathbb{R}^d$, define

    $\|\mu - \nu\|_{TV} = \sup_{|f| \le 1} \left| \mathbb{E}_\mu[f] - \mathbb{E}_\nu[f] \right|$,

    $W_1(\mu, \nu) = \sup_{\|f\|_{Lip} \le 1} \left| \mathbb{E}_\mu[f] - \mathbb{E}_\nu[f] \right|$.
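    For scalar distributions, the $W_1$ distance between two empirical samples can be computed directly; a quick numerical sanity check with SciPy (an illustrative aside, not from the slides):

        import numpy as np
        from scipy.stats import wasserstein_distance

        rng = np.random.default_rng(0)
        x = rng.normal(0.0, 1.0, size=100_000)  # samples from N(0, 1)
        y = rng.normal(0.5, 1.0, size=100_000)  # samples from N(0.5, 1)
        print(wasserstein_distance(x, y))       # close to the true W1 = 0.5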

  • Part 3: Sampling from strongly log-concave density

  • Wasserstein distance convergence

    We assume in this part that $U$ is strongly convex: there exists $m > 0$ such that for all $x, y \in \mathbb{R}^d$,

    $\langle \nabla U(x) - \nabla U(y), x - y \rangle \ge m\,\|x - y\|^2$.

    For all $x \in \mathbb{R}^d$,

    $W_1^2(\delta_x P_t, \pi) \le e^{-mt} \int_{\mathbb{R}^d} \|y - x\|^2\,\pi(dy)$.

  • Assume $U$ is $L$-smooth and strongly convex. Let $(\gamma_k)_{k \ge 1}$ be a nonincreasing sequence with $\gamma_1 \le 1/(m + L)$. Then for all $x \in \mathbb{R}^d$ and $n \ge 1$,

    $W_1^2(\delta_x Q^n, \pi) \le u_n^{(1)}(\gamma) \int_{\mathbb{R}^d} \|y - x\|^2\,\pi(dy) + u_n^{(2)}(\gamma)$,

    where $(u_n^{(1)}(\gamma), u_n^{(2)}(\gamma))_{n \ge 1}$ are explicit.

  • If $\lim_{k \to +\infty} \gamma_k = 0$ and $\lim_{n \to +\infty} \Gamma_n = +\infty$, where $\Gamma_n = \sum_{k=1}^{n} \gamma_k$, then

    $\lim_{n \to +\infty} W_1^2(\delta_x Q^n, \pi) = 0$,

    with explicit convergence bounds.

    Order of convergence for decreasing stepsizes:

        $\alpha \in (0, 1)$: $O(n^{-\alpha})$        $\alpha = 1$: $O(n^{-1})$

    Table: order of convergence of $W_1^2(\delta_x Q^n, \pi)$ for $\gamma_k = \gamma_1 k^{-\alpha}$.

  • When $(\gamma_k)_{k \ge 1}$ is constant, equal to $\gamma$:

    We optimize $\gamma$ and $n$ to get $W_1(\delta_x Q_\gamma^n, \pi) \le \epsilon$. In particular, we need

    $n = O(d\,\epsilon^{-2})$.

    If the number of iterations $n$ is fixed, we can optimize $\gamma$, and we find a bound $W_1(\delta_x Q_\gamma^n, \pi) = O(n^{-1/2})$.

  • To improve the bound, we make a regularity assumption on $U$: the potential $U$ is three times continuously differentiable, and there exists $\tilde{L}$ such that for all $x, y \in \mathbb{R}^d$:

    $\|\nabla^2 U(x) - \nabla^2 U(y)\| \le \tilde{L}\,\|x - y\|$.

    Assume $U$ is strongly convex, $L$-smooth, and satisfies the condition above. Let $(\gamma_k)_{k \ge 1}$ be a nonincreasing sequence with $\gamma_1 \le 1/(m + L)$. Then for all $x \in \mathbb{R}^d$ and $n \ge 1$,

    $W_1^2(\delta_x Q^n, \pi) \le u_n^{(3)}(\gamma) \int_{\mathbb{R}^d} \|y - x\|^2\,\pi(dy) + u_n^{(4)}(\gamma)$,

    where $(u_n^{(3)}(\gamma), u_n^{(4)}(\gamma))_{n \ge 1}$ are explicit.

  • Order of convergence for decreasing stepsizes:

        $\alpha \in (0, 1)$: $O(n^{-2\alpha})$        $\alpha = 1$: $O(n^{-2})$

    Table: order of convergence of $W_1^2(\delta_x Q^n, \pi)$ for $\gamma_k = \gamma_1 k^{-\alpha}$.

  • When $(\gamma_k)_{k \ge 1}$ is constant, equal to $\gamma$:

    We optimize $\gamma$ and $n$ to get $W_1(\delta_x Q_\gamma^n, \pi) \le \epsilon$. In particular, we need

    $n = O(d\,\epsilon^{-1})$.

    If the number of iterations $n$ is fixed, we can optimize $\gamma$, and we find a bound $W_1(\delta_x Q_\gamma^n, \pi) = O(n^{-1})$.

  • Part 4: Sampling from log-concave density

  • Convergence of the Euler discretization

    We now assume only that $U$ is convex and $L$-smooth.

    Explicit bound for $\|\delta_x Q^p - \pi\|_{TV}$. If $\lim_{k \to +\infty} \gamma_k = 0$ and $\sum_k \gamma_k = +\infty$, then

    $\lim_{p \to +\infty} \|\delta_x Q^p - \pi\|_{TV} = 0$.

    Computable bounds for the convergence.

  • Target precision $\epsilon$: the convex case

    For constant stepsizes, we can optimize $\gamma$ and $p$ to get

    $\|\delta_x Q_\gamma^p - \pi\|_{TV} \le \epsilon$.

                  $d$            $\epsilon$                                 $L$
    $\gamma$      $O(d^{-4})$    $O(\epsilon^2 / \log(\epsilon^{-1}))$      $O(L^{-2})$
    $p$           $O(d^{7})$     $O(\epsilon^{-2} \log^2(\epsilon^{-1}))$   $O(L^{2})$

    We can also, at a fixed number of iterations, optimize the stepsize $\gamma$.

    Dependence on the dimension: it comes from the fact that the convergence of the diffusion itself, in the convex case, also depends on the dimension.

  • Part 5: Non-smooth potentials

  • Non-smooth potentials

    The target distribution has a density with respect to the Lebesgue measure on $\mathbb{R}^d$ of the form $x \mapsto e^{-U(x)} \big/ \int_{\mathbb{R}^d} e^{-U(y)}\,dy$, where $U = f + g$, with $f : \mathbb{R}^d \to \mathbb{R}$ and $g : \mathbb{R}^d \to (-\infty, +\infty]$ two lower-bounded, convex functions satisfying:

    1 $f$ is continuously differentiable and gradient Lipschitz with Lipschitz constant $L_f$, i.e. for all $x, y \in \mathbb{R}^d$,

    $\|\nabla f(x) - \nabla f(y)\| \le L_f\,\|x - y\|$.

    2 $g$ is lower semi-continuous and $\int_{\mathbb{R}^d} e^{-g(y)}\,dy \in (0, +\infty)$.

  • Moreau-Yosida regularization

    Let $h : \mathbb{R}^d \to (-\infty, +\infty]$ be an l.s.c. convex function and $\lambda > 0$. The $\lambda$-Moreau-Yosida envelope $h^\lambda : \mathbb{R}^d \to \mathbb{R}$ and the proximal operator $\operatorname{prox}_h^\lambda : \mathbb{R}^d \to \mathbb{R}^d$ associated with $h$ are defined for all $x \in \mathbb{R}^d$ by

    $h^\lambda(x) = \inf_{y \in \mathbb{R}^d} \left\{ h(y) + (2\lambda)^{-1} \|x - y\|^2 \right\} \le h(x)$.

    For every $x \in \mathbb{R}^d$, the infimum is achieved at a unique point, $\operatorname{prox}_h^\lambda(x)$, which is characterized by the inclusion

    $x - \operatorname{prox}_h^\lambda(x) \in \lambda\,\partial h(\operatorname{prox}_h^\lambda(x))$.

    The Moreau-Yosida envelope is a regularized version of $h$, which approximates $h$ from below.

  • Properties of proximal operators

    As $\lambda \to 0$, $h^\lambda$ converges pointwise to $h$: for all $x \in \mathbb{R}^d$,

    $h^\lambda(x) \to h(x)$ as $\lambda \to 0$.

    The function $h^\lambda$ is convex and continuously differentiable, with

    $\nabla h^\lambda(x) = \lambda^{-1}\left(x - \operatorname{prox}_h^\lambda(x)\right)$.

    The proximal operator is a monotone operator: for all $x, y \in \mathbb{R}^d$,

    $\langle \operatorname{prox}_h^\lambda(x) - \operatorname{prox}_h^\lambda(y), x - y \rangle \ge 0$,

    which implies that the Moreau-Yosida envelope is $\lambda^{-1}$-smooth: $\|\nabla h^\lambda(x) - \nabla h^\lambda(y)\| \le \lambda^{-1}\,\|x - y\|$ for all $x, y \in \mathbb{R}^d$.
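    A concrete instance of these objects for $h(x) = |x|$: its proximal operator is soft-thresholding, and its Moreau-Yosida envelope is the Huber function. This is a standard fact; the code below is an illustrative sketch.

        import numpy as np

        def prox_abs(x, lam):
            # prox of h(x) = |x|: soft-thresholding at level lam.
            return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

        def envelope_abs(x, lam):
            # lam-Moreau-Yosida envelope of |x|: quadratic near 0, linear in the tails.
            return np.where(np.abs(x) <= lam, x**2 / (2 * lam), np.abs(x) - lam / 2)

        x = np.linspace(-2.0, 2.0, 5)
        lam = 0.5
        p = prox_abs(x, lam)
        # Check the definition h(prox(x)) + ||x - prox(x)||^2 / (2 lam) against the
        # envelope, and the gradient formula from the slide.
        env = np.abs(p) + (x - p) ** 2 / (2 * lam)   # equals envelope_abs(x, lam)
        grad = (x - p) / lam                         # equals sign(x) * min(|x| / lam, 1)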

  • MY regularized potential

    If $g$ is not differentiable but the proximal operator associated with $g$ is available, its $\lambda$-Moreau-Yosida envelope $g^\lambda$ can be considered.

    This leads to the approximation $U^\lambda : \mathbb{R}^d \to \mathbb{R}$ of the potential, defined for all $x \in \mathbb{R}^d$ by

    $U^\lambda(x) = f(x) + g^\lambda(x)$.

    Theorem

    Under (H), for all $\lambda > 0$:

    1 $\pi^\lambda \propto e^{-U^\lambda}$ defines a proper probability density, and $\lim_{\lambda \to 0} \|\pi^\lambda - \pi\|_{TV} = 0$.

    2 If $g$ is Lipschitz, then

    $\|\pi^\lambda - \pi\|_{TV} \le \lambda\,\|g\|_{Lip}^2$.

    3 If $g = \iota_K$, where $K$ is a convex body of $\mathbb{R}^d$, then for all $\lambda > 0$ we have

    $\|\pi^\lambda - \pi\|_{TV} \le 2\left(1 + D(K, \lambda)\right)^{-1}$,

    where $D(K, \lambda)$ is explicit in the proof, and is of order $O(\lambda^{-1})$ as $\lambda$ goes to 0.

  • The MYULA algorithm-I

    Given a regularization parameter $\lambda > 0$ and a sequence of stepsizes $\{\gamma_k, k \in \mathbb{N}\}$, the algorithm produces the Markov chain $\{X_k^M, k \in \mathbb{N}\}$: for all $k \ge 0$,

    $X_{k+1}^M = X_k^M - \gamma_{k+1} \left\{ \nabla f(X_k^M) + \lambda^{-1}\left(X_k^M - \operatorname{prox}_g^\lambda(X_k^M)\right) \right\} + \sqrt{2\gamma_{k+1}}\,Z_{k+1}$,

    where $\{Z_k, k \in \mathbb{N}\}$ is a sequence of i.i.d. $d$-dimensional standard Gaussian random variables.
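    A minimal sketch of this recursion with constant stepsize; the lasso-type target ($f$ quadratic, $g = \|\cdot\|_1$), the parameter values, and all names are illustrative assumptions.

        import numpy as np

        def myula(grad_f, prox_g, lam, x0, n_iter, gamma, rng=np.random.default_rng(0)):
            """Moreau-Yosida regularized ULA: ULA applied to U^lam = f + g^lam."""
            x = np.asarray(x0, dtype=float)
            chain = np.empty((n_iter + 1, x.size))
            chain[0] = x
            for k in range(n_iter):
                drift = grad_f(x) + (x - prox_g(x, lam)) / lam  # gradient of U^lam
                x = x - gamma * drift + np.sqrt(2.0 * gamma) * rng.standard_normal(x.size)
                chain[k + 1] = x
            return chain

        # g(x) = ||x||_1, whose proximal operator is soft-thresholding.
        soft = lambda x, lam: np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
        chain = myula(grad_f=lambda x: x,  # f(x) = ||x||^2 / 2
                      prox_g=soft, lam=0.1,
                      x0=np.zeros(3), n_iter=10_000, gamma=1e-2)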

  • The MYULA algorithm-II

    MYULA targets the smoothed distribution $\pi^\lambda$.

    To compute the expectation of a function $h : \mathbb{R}^d \to \mathbb{R}$ under $\pi$ from $\{X_k^M;\ 0 \le k \le n\}$, an importance sampling step is used to correct the regularization.

    This step amounts to approximating $\int_{\mathbb{R}^d} h(x)\,\pi(x)\,dx$ by the weighted sum

    $S_n^h = \sum_{k=0}^{n} \omega_{k,n}\,h(X_k^M)$, with $\omega_{k,n} = \left\{ \sum_{j=0}^{n} \gamma_j\,e^{-\bar{g}^\lambda(X_j^M)} \right\}^{-1} \gamma_k\,e^{-\bar{g}^\lambda(X_k^M)}$,

    where for all $x \in \mathbb{R}^d$

    $\bar{g}^\lambda(x) = g(x) - g^\lambda(x) = g(x) - g(\operatorname{prox}_g^\lambda(x)) - (2\lambda)^{-1}\,\|x - \operatorname{prox}_g^\lambda(x)\|^2$.
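    A minimal sketch of this correction for the $\ell_1$ example above (illustrative; reuses the chain and the soft-thresholding prox from the previous sketch):

        import numpy as np

        def myula_is_weights(chain, gammas, prox_g, g, lam):
            # g_bar(x) = g(x) - g(prox(x)) - ||x - prox(x)||^2 / (2 lam), so the
            # weights are proportional to gamma_k * exp(-g_bar(X_k)); g_bar >= 0.
            P = np.array([prox_g(x, lam) for x in chain])
            g_bar = (np.array([g(x) for x in chain])
                     - np.array([g(p) for p in P])
                     - ((chain - P) ** 2).sum(axis=1) / (2 * lam))
            w = gammas * np.exp(-g_bar)
            return w / w.sum()

        soft = lambda x, lam: np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
        g_l1 = lambda x: np.abs(x).sum()
        omega = myula_is_weights(chain, np.full(len(chain), 1e-2), soft, g_l1, lam=0.1)
        posterior_mean = (omega[:, None] * chain).sum(axis=0)  # S_n for h = identity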

  • Part 6: Numerical illustrations

  • Image deconvolution

    Objective: recover an original image $x \in \mathbb{R}^n$ from a blurred and noisy observed image $y \in \mathbb{R}^n$, related to $x$ by the linear observation model $y = Hx + w$, where $H$ is a linear operator representing the blur point spread function and $w$ is a Gaussian vector with zero mean and covariance matrix $\sigma^2 \mathrm{I}_n$.

    This inverse problem is usually ill-posed or ill-conditioned: one exploits prior knowledge about $x$.

    One of the most widely used image priors for deconvolution problems is the improper total-variation prior, $\pi(x) \propto \exp(-\alpha \|\nabla_d x\|_1)$, where $\nabla_d$ denotes the discrete gradient operator that computes the vertical and horizontal differences between neighbouring pixels.

    $\pi(x \mid y) \propto \exp\left[ -\|y - Hx\|^2 / (2\sigma^2) - \alpha\,\|\nabla_d x\|_1 \right]$.

  • Figure: (a) original Boat image (256 × 256 pixels), (b) blurred image, (c) MAP estimate.

  • Credibility intervals

    Figure: (a) pixel-wise 90% credibility intervals computed with proximal MALA (computing time 35 hours), (b) approximate intervals estimated with MYULA using $\lambda = 0.01$ (computing time 3.5 hours), (c) approximate intervals estimated with MYULA using $\lambda = 0.1$ (computing time 20 minutes).

  • Part 7: Conclusion

  • What's next?

    Extensions of this work:

    - Richardson-Romberg extrapolation: debiasing for smooth functionals, with non-asymptotic bounds on the MSE.

    - Langevin meets Gibbs: ULA within Gibbs.

    - Detailed comparison with MALA.

    Thank you for your attention.
