32
Large deviation and variational approaches to generalised gradient flows Hong Duong Imperial College London Universit´ e Jean Monnet, 23/01/2018 Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows 23/01/2018 1 / 30

Large deviation and variational approaches to generalised ...tugaut.perso.math.cnrs.fr/pdf/APSSE/2018.01.23.pdf2018/01/23  · Prob H n n = 1 = 2 n = exp( nlog 2) S. Varadhan Abel

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

  • Large deviation and variational approaches togeneralised gradient flows

    Hong Duong

    Imperial College London

    Université Jean Monnet, 23/01/2018

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 1 / 30

  • Introduction

    1 Wasserstein gradient flows

    2 From Large-deviation principle to Wasserstein gradient flows

    3 Generalised Wasserstein gradient flows

    4 Multiscale analysis/Coarse-graining

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 2 / 30

  • Wasserstein gradient flows

    Jordan-Kinderlehrer-Otto, 1998: the Fokker-Planck equation

    ∂tρ = div(∇V ρ) + ∆ρ, ρ(0) = ρ0,

    is a gradient flow of the free energy

    F(ρ) =ˆρ log ρ+ V ρ

    with respect to the Wasserstein distance on the probability measures withfinite second moments

    W 22 (µ, ν) = infγ∈Γ(µ,ν)

    ˆRd×Rd

    |x − y |2 γ(dxdy)

    (Γ(µ, ν) is the set of couplings between µ and ν)

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 3 / 30

  • JKO-scheme

    JKO-scheme: Given a time-step h > 0;

    ρh0 := ρ0,

    for n ≥ 1, ρhn is defined to be the unique minimizer of

    1

    2hW 22 (ρ, ρ

    hn−1) + F(ρ).

    Then the sequence {ρhn}n≥0, after an appropriate interpolation, convergesto the solution to the Fokker-Planck equation as h→ 0.

    Jh(·|ρ0) :=1

    2

    [ 12h

    W 22 (·, ρ0) + F(·)−F(ρ0)]

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 4 / 30

  • The theory of Wasserstein gradient flows

    The theory of Wasserstein gradient flow has developed tremendously overthe last 20 years

    Many PDEs are Wasserstein gradient flows: porous medium equation,McKean-Vlasov equations, finite Markov chain, etc;

    Link different areas of Mathematics together (optimal transport,measure geometry & probability)

    see e.g., monographs

    Ambrosio L., Gigli N., Savaré G (2008),

    Villani C. (2003 & 2009).

    or the most recent survey by Santambrogio (2017).

    BUT

    Only for gradient flows, i.e., dissipative systems;

    No microscopic interpretation.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 5 / 30

  • The theory of Wasserstein gradient flows

    The theory of Wasserstein gradient flow has developed tremendously overthe last 20 years

    Many PDEs are Wasserstein gradient flows: porous medium equation,McKean-Vlasov equations, finite Markov chain, etc;

    Link different areas of Mathematics together (optimal transport,measure geometry & probability)

    see e.g., monographs

    Ambrosio L., Gigli N., Savaré G (2008),

    Villani C. (2003 & 2009).

    or the most recent survey by Santambrogio (2017).

    BUT

    Only for gradient flows, i.e., dissipative systems;

    No microscopic interpretation.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 5 / 30

  • Questions discussed today

    Why the JKO scheme?

    Can one establish JKO-type scheme for non-dissipative system, e.g.,the kinetic Fokker-Planck equation?

    Can one exploit variational formulation for multi-scale analysis?

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 6 / 30

  • From Large-deviation principle to Wasserstein gradient flows

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 7 / 30

  • Large-deviation principles

    Toss a fair coin n times, Hn = number ofheads

    What is the chance of getting all heads?

    Prob

    (Hnn

    = 1

    )= 2−n = exp(−n log 2)

    S. VaradhanAbel prize 2007

    In general,

    Prob

    (Hnn≈ x

    )n→∞∼ exp(−n I (x)), for 0 ≤ x ≤ 1

    I : rate functional

    bb

    bc bc

    0 112

    ∞ ∞

    I(x)

    x

    log(2)

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 8 / 30

  • Large-deviation principles

    Toss a fair coin n times, Hn = number ofheads

    What is the chance of getting all heads?

    Prob

    (Hnn

    = 1

    )= 2−n = exp(−n log 2)

    S. VaradhanAbel prize 2007

    In general,

    Prob

    (Hnn≈ x

    )n→∞∼ exp(−n I (x)), for 0 ≤ x ≤ 1

    I : rate functional

    bb

    bc bc

    0 112

    ∞ ∞

    I(x)

    x

    log(2)

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 8 / 30

  • Microscopic interpretation of the JKO-scheme

    Stochastic particle systems

    dX it = −∇V (X it ) dt +√

    2dW it , i = 1, . . . , n;

    and its empirical measure

    ρnt :=1

    n

    n∑i=1

    δX it .

    Large deviation principle:

    Prob(ρnh ≈ ρ|ρn0 ≈ ρ0

    )∼ exp

    (− nIh(ρ|ρ0)

    )as n→∞.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 9 / 30

  • From large-deviation principle to Wasserstein gradient flow

    Theorem (Adams-Peletier-Dirr-Zimmer 2011, D.-Laschos-Renger2013, Erbar-Maas-Renger 2015)

    Ih(·|ρ0)−1

    4hW 22 (·, ρ0)

    h→0−−−−−→Γ

    1

    2F(·)− 1

    2F(ρ0).

    i.e.,Ih(·|ρ0) ≈ Jh(·|ρ0) as h→ 0.

    (Γ: Gamma convergence)

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 10 / 30

  • Idea of proof

    H−1(ρ)-norm: D = C∞c (Rd), s ∈ D′:

    ‖s‖2−1,ρ := supf ∈D

    {2〈s, f 〉 −

    ˆ|∇f |2 dρ

    }Benamou-Brenier formulation

    W 22 (ρ0, ρ1) = inf{ˆ 1

    0‖∂tρt‖2−1,ρt dt : (ρt)t ∈ AC 2(ρ0, ρ1)};

    Representation of the rate-functional

    Ih(ρ1|ρ0) =1

    4h

    ˆ 10

    ∥∥∥∂tρt − h(div(∇V ρ) + ∆ρ)∥∥∥2−1,ρt

    dt.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 11 / 30

  • Idea of proof (cont.)

    Proof of Γ-convergence

    Liminf-inequality:

    Ih(ρ1|ρ0) =1

    2

    (F(·)−F(ρ0)

    )+

    1

    4h

    ˆ 10‖∂tρt‖2−1,ρt dt

    +h

    4

    ˆ 10‖ div(∇V ρ)∆ρ‖2−1,ρt dt

    ≥ 12

    (F(·)−F(ρ0)

    )+

    1

    4h

    ˆ 10‖∂tρt‖2−1,ρt dt

    ≥ 12

    (F(·)−F(ρ0)

    )+

    1

    4hW 22 (ρ0, ρ1).

    Limsup-inequality: construction of a curve (ρt)t such that the lastterm in Ih vanishes.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 12 / 30

  • Variational formulation for non-dissipative systems

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 13 / 30

  • Langevin dynamics and the kinetic Fokker-Planck equation

    The kinetic Fokker-Planck equation

    ∂tρ = −divq( pmρ)

    + divp(∇V ρ

    )+ γ divp

    ( pmρ)

    + γβ−1∆pρ.

    The Langevin dynamics

    dQ =P

    mdt,

    dP = −∇V (Q) dt − γ Pm

    dt +√

    2γβ−1 dWt .

    Neither a gradient flow nor a Hamiltonian flow, but a combination of them.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 14 / 30

  • Variational formulation for the kinetic Fokker-Planckequation

    New optimal transportation cost functional

    Ch(q, p; q′, p′) := h inf

    {ˆ h0

    ∣∣mξ̈(t) +∇V (ξ(t))∣∣2 dt : ξ ∈ C 1([0, h],Rd)such that

    (ξ,mξ̇)(0) = (q, p), (ξ,mξ̇)(h) = (q′, p′)

    }.

    Given µ(dqdp), ν(dq′dp′), define

    W̃h(µ, ν) := infγ∈Γ(µ,ν)

    ˆR2d×R2d

    Ch(q, p; q′, p′)γ(dqdpdq′dp′)

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 15 / 30

  • Explanation of the cost

    Wasserstein cost

    C (x , y) = |x − y |2 = h inf{ˆ h

    0

    ∣∣ξ̇(t)∣∣2 dt : ξ ∈ C 1([0, h],Rd)such that ξ(0) = x , ξ(h) = y

    }.

    Freidlin-Wentzell small-noise large deviation

    dX εt =√

    2εdWt .

    then {X ε : X ε(0) = x ,X ε(h) = y} satisfies a LDP with rate functionC (x , y).Our cost: small-noise perturbation of the Hamiltonian system

    dQ =P

    mdt

    dP = −∇V (Q) dt +√

    2εdWt

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 16 / 30

  • Variational formulation for the kinetic Fokker-Planckequation

    JKO-type scheme for the kinetic Fokker-Planck equation: ρh0 := ρ0, forn ≥ 1, ρhn as the solution of the minimization problem

    minρ

    1

    2h

    1

    γW̃h(ρ

    hk−1, ρ) + F(ρ),

    Theorem (D.-Peletier-Zimmer 2014)

    Under the piece-wise constant interpolation, the sequence {ρhn}nconverges, as h→ 0, to the solution of the kinetic Fokker-Planck equation.

    Extensions

    D.-Tran 2017: a degenerate Kolmogorov diffusion.

    D.-Pavliotis, in prep.: hypoelliptic diffusions.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 17 / 30

  • Multiscale analysis and coarse-graining

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 18 / 30

  • Langevin and overdamped Langevin dynamics

    Langevin dynamics

    dQ =P

    mdt,

    dP = −∇V (Q) dt − γ Pm

    dt +√

    2γβ−1 dWt .

    overdamped Langevin dynamics

    dX = −∇V (X ) dt +√

    2β−1dWt

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 19 / 30

  • Singular limits of Langevin dynamics

    Overdamped limit (γ →∞): Kramers 1940 (formal computations),Nelson 1967 (rigorous, SDEs).

    Small noise limit (ε := β−1 → 0): Freidlin-Wentzell theory (RandomPerturbations of Dynamical Systems) 1994.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 20 / 30

  • Coarse-graining

    Challenge: the curse of dimensionality (d extremely large).

    Question: Reduced description of the system, that still includes importantdynamical information?

    Coarse-graining/model reduction:

    F : Rd → Rk with k < d .

    Quantity of interest: path t 7→ F (X (t)). [X = (Q,P) or X = Q]

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 21 / 30

  • Previous work

    Gyöngy 1986 introduced an exact dynamics

    dY1(t) = b1(t,Y1(t)) dt +√

    2β−1A1(t,Y1(t))dW (t), Y1(t) ∈ Rk

    Law(Y1) = Law(F (X )) but it is extremely difficult/expensive tocompute b1 and A1.

    Legoll et al. 2010, 2016 proposed an effective dynamics

    dY2(t) = b2(Y2(t)) dt +√

    2β−1A2(Y2(t)) dW (t), Y2(t) ∈ Rk

    It is much easier/cheaper to compute b2 and A2. The authorsestimate the error between ρ1 := Law(Y1) and ρ2 := Law(Y2).

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 22 / 30

  • Limitations of Legoll et al.

    only overdamped Langevin,

    scalar coarse-graining maps (k = 1),

    errors measured only in the relative entropy sense.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 23 / 30

  • Our work: a unifying technique

    Based on the connection to large-deviation principle, we

    provide a unifying technique for both singular limits andcoarse-graining,

    generalise Legoll et al. to the full Langevin dynamics for vectorialmaps and more error-measurement methods.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 24 / 30

  • Theorem (D.-Lamacz-Peletier-Sharma 2017,D.-Lamacz-Peletier-Schlichting-Sharma sub. 2017)

    Under certain regularity and growth assumptions on V and F ,

    (1) (qualitative) We recover well-known results for the overdamped andsmall-noise limits of Langevin dynamics.

    (2) (quantitative) It holds that

    supt∈[0,T ]

    D(ρ1(t), ρ2(t))︸ ︷︷ ︸Error measurement

    (D=W 22 or H)

    ≤D(ρ1(0), ρ2(0))︸ ︷︷ ︸Initial data

    +C (V ,F )︸ ︷︷ ︸Data

    H(Law(X0)||µ).︸ ︷︷ ︸Reference dynamics

    If V (x) = 1εV0(x0, . . . , xd−k) + V1(x), then

    C (V ,F ) ≤ Cε2.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 25 / 30

  • Idea of proof: connections to large-deviation principle

    Both X and Y2 can be written as

    dZ (t) = b(Z (t)) dt +√

    2σ(Z (t)) dW (t), Z (t) ∈ RN .

    Consider n-copies of Z

    dZi (t) = b(Zi (t)) dt +√

    2σ(Zi (t)) dWi (t), Zi (t) ∈ RN , (i = 1, . . . , n)

    The empirical measure

    ρn(t) :=1

    n

    n∑i=1

    δZi (t).

    [Dawson-Gartner 1987]: ρn satisfies a large-deviation principle with a ratefunctional I

    Prob(ρn∣∣[0,T ]

    ≈ ρ∣∣[0,T ]

    )n→∞∼ exp(−nI(ρ))

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 26 / 30

  • Properties of the rate-functional

    (i) I ≥ 0(ii) I(ρ) = 0 ⇐⇒ ρ = Law(Z)(iii) I has a dual formulation

    I(ρ) = supf

    G (ρ, f )

    G (ρ, f ) =

    ˆRN

    [fTρT − f0ρ0]−ˆ T

    0

    ˆRN

    (∂t f + Lf + Γ(f , f )

    )dρt dt.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 27 / 30

  • Key inequalities

    (HIR inequality) Let ρ = Law(Z ) then

    H(η(t)||ρ(t)) +ˆ T

    0R(η(s)||ρ(s)) ds ≤ H(η(0)||ρ(0)) + I(η),

    for any η ∈ C ([0,T ],P(RN)).(Estimate of the rate functional)

    I(η) ≤ C (V ,F )H(Law(X0)||µ)

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 28 / 30

  • Conclusions

    The central role of the large-deviation rate functional

    (1) microscopic interpretations of the Wasserstein gradient flowformulation,

    (2) construction of new approximation schemes,

    (3) Singular limits and coarse-graining of Langevin and overdampedLangevin dynamics.

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 29 / 30

  • Thank you for your attention!

    Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 30 / 30

    Wasserstein gradient flowsFrom Large-deviation principle to Wasserstein gradient flowsGeneralised Wasserstein gradient flowsMultiscale analysis/Coarse-graining