Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Large deviation and variational approaches togeneralised gradient flows
Hong Duong
Imperial College London
Université Jean Monnet, 23/01/2018
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 1 / 30
Introduction
1 Wasserstein gradient flows
2 From Large-deviation principle to Wasserstein gradient flows
3 Generalised Wasserstein gradient flows
4 Multiscale analysis/Coarse-graining
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 2 / 30
Wasserstein gradient flows
Jordan-Kinderlehrer-Otto, 1998: the Fokker-Planck equation
∂tρ = div(∇V ρ) + ∆ρ, ρ(0) = ρ0,
is a gradient flow of the free energy
F(ρ) =ˆρ log ρ+ V ρ
with respect to the Wasserstein distance on the probability measures withfinite second moments
W 22 (µ, ν) = infγ∈Γ(µ,ν)
ˆRd×Rd
|x − y |2 γ(dxdy)
(Γ(µ, ν) is the set of couplings between µ and ν)
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 3 / 30
JKO-scheme
JKO-scheme: Given a time-step h > 0;
ρh0 := ρ0,
for n ≥ 1, ρhn is defined to be the unique minimizer of
1
2hW 22 (ρ, ρ
hn−1) + F(ρ).
Then the sequence {ρhn}n≥0, after an appropriate interpolation, convergesto the solution to the Fokker-Planck equation as h→ 0.
Jh(·|ρ0) :=1
2
[ 12h
W 22 (·, ρ0) + F(·)−F(ρ0)]
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 4 / 30
The theory of Wasserstein gradient flows
The theory of Wasserstein gradient flow has developed tremendously overthe last 20 years
Many PDEs are Wasserstein gradient flows: porous medium equation,McKean-Vlasov equations, finite Markov chain, etc;
Link different areas of Mathematics together (optimal transport,measure geometry & probability)
see e.g., monographs
Ambrosio L., Gigli N., Savaré G (2008),
Villani C. (2003 & 2009).
or the most recent survey by Santambrogio (2017).
BUT
Only for gradient flows, i.e., dissipative systems;
No microscopic interpretation.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 5 / 30
The theory of Wasserstein gradient flows
The theory of Wasserstein gradient flow has developed tremendously overthe last 20 years
Many PDEs are Wasserstein gradient flows: porous medium equation,McKean-Vlasov equations, finite Markov chain, etc;
Link different areas of Mathematics together (optimal transport,measure geometry & probability)
see e.g., monographs
Ambrosio L., Gigli N., Savaré G (2008),
Villani C. (2003 & 2009).
or the most recent survey by Santambrogio (2017).
BUT
Only for gradient flows, i.e., dissipative systems;
No microscopic interpretation.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 5 / 30
Questions discussed today
Why the JKO scheme?
Can one establish JKO-type scheme for non-dissipative system, e.g.,the kinetic Fokker-Planck equation?
Can one exploit variational formulation for multi-scale analysis?
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 6 / 30
From Large-deviation principle to Wasserstein gradient flows
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 7 / 30
Large-deviation principles
Toss a fair coin n times, Hn = number ofheads
What is the chance of getting all heads?
Prob
(Hnn
= 1
)= 2−n = exp(−n log 2)
S. VaradhanAbel prize 2007
In general,
Prob
(Hnn≈ x
)n→∞∼ exp(−n I (x)), for 0 ≤ x ≤ 1
I : rate functional
bb
bc bc
0 112
∞ ∞
I(x)
x
log(2)
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 8 / 30
Large-deviation principles
Toss a fair coin n times, Hn = number ofheads
What is the chance of getting all heads?
Prob
(Hnn
= 1
)= 2−n = exp(−n log 2)
S. VaradhanAbel prize 2007
In general,
Prob
(Hnn≈ x
)n→∞∼ exp(−n I (x)), for 0 ≤ x ≤ 1
I : rate functional
bb
bc bc
0 112
∞ ∞
I(x)
x
log(2)
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 8 / 30
Microscopic interpretation of the JKO-scheme
Stochastic particle systems
dX it = −∇V (X it ) dt +√
2dW it , i = 1, . . . , n;
and its empirical measure
ρnt :=1
n
n∑i=1
δX it .
Large deviation principle:
Prob(ρnh ≈ ρ|ρn0 ≈ ρ0
)∼ exp
(− nIh(ρ|ρ0)
)as n→∞.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 9 / 30
From large-deviation principle to Wasserstein gradient flow
Theorem (Adams-Peletier-Dirr-Zimmer 2011, D.-Laschos-Renger2013, Erbar-Maas-Renger 2015)
Ih(·|ρ0)−1
4hW 22 (·, ρ0)
h→0−−−−−→Γ
1
2F(·)− 1
2F(ρ0).
i.e.,Ih(·|ρ0) ≈ Jh(·|ρ0) as h→ 0.
(Γ: Gamma convergence)
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 10 / 30
Idea of proof
H−1(ρ)-norm: D = C∞c (Rd), s ∈ D′:
‖s‖2−1,ρ := supf ∈D
{2〈s, f 〉 −
ˆ|∇f |2 dρ
}Benamou-Brenier formulation
W 22 (ρ0, ρ1) = inf{ˆ 1
0‖∂tρt‖2−1,ρt dt : (ρt)t ∈ AC 2(ρ0, ρ1)};
Representation of the rate-functional
Ih(ρ1|ρ0) =1
4h
ˆ 10
∥∥∥∂tρt − h(div(∇V ρ) + ∆ρ)∥∥∥2−1,ρt
dt.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 11 / 30
Idea of proof (cont.)
Proof of Γ-convergence
Liminf-inequality:
Ih(ρ1|ρ0) =1
2
(F(·)−F(ρ0)
)+
1
4h
ˆ 10‖∂tρt‖2−1,ρt dt
+h
4
ˆ 10‖ div(∇V ρ)∆ρ‖2−1,ρt dt
≥ 12
(F(·)−F(ρ0)
)+
1
4h
ˆ 10‖∂tρt‖2−1,ρt dt
≥ 12
(F(·)−F(ρ0)
)+
1
4hW 22 (ρ0, ρ1).
Limsup-inequality: construction of a curve (ρt)t such that the lastterm in Ih vanishes.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 12 / 30
Variational formulation for non-dissipative systems
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 13 / 30
Langevin dynamics and the kinetic Fokker-Planck equation
The kinetic Fokker-Planck equation
∂tρ = −divq( pmρ)
+ divp(∇V ρ
)+ γ divp
( pmρ)
+ γβ−1∆pρ.
The Langevin dynamics
dQ =P
mdt,
dP = −∇V (Q) dt − γ Pm
dt +√
2γβ−1 dWt .
Neither a gradient flow nor a Hamiltonian flow, but a combination of them.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 14 / 30
Variational formulation for the kinetic Fokker-Planckequation
New optimal transportation cost functional
Ch(q, p; q′, p′) := h inf
{ˆ h0
∣∣mξ̈(t) +∇V (ξ(t))∣∣2 dt : ξ ∈ C 1([0, h],Rd)such that
(ξ,mξ̇)(0) = (q, p), (ξ,mξ̇)(h) = (q′, p′)
}.
Given µ(dqdp), ν(dq′dp′), define
W̃h(µ, ν) := infγ∈Γ(µ,ν)
ˆR2d×R2d
Ch(q, p; q′, p′)γ(dqdpdq′dp′)
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 15 / 30
Explanation of the cost
Wasserstein cost
C (x , y) = |x − y |2 = h inf{ˆ h
0
∣∣ξ̇(t)∣∣2 dt : ξ ∈ C 1([0, h],Rd)such that ξ(0) = x , ξ(h) = y
}.
Freidlin-Wentzell small-noise large deviation
dX εt =√
2εdWt .
then {X ε : X ε(0) = x ,X ε(h) = y} satisfies a LDP with rate functionC (x , y).Our cost: small-noise perturbation of the Hamiltonian system
dQ =P
mdt
dP = −∇V (Q) dt +√
2εdWt
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 16 / 30
Variational formulation for the kinetic Fokker-Planckequation
JKO-type scheme for the kinetic Fokker-Planck equation: ρh0 := ρ0, forn ≥ 1, ρhn as the solution of the minimization problem
minρ
1
2h
1
γW̃h(ρ
hk−1, ρ) + F(ρ),
Theorem (D.-Peletier-Zimmer 2014)
Under the piece-wise constant interpolation, the sequence {ρhn}nconverges, as h→ 0, to the solution of the kinetic Fokker-Planck equation.
Extensions
D.-Tran 2017: a degenerate Kolmogorov diffusion.
D.-Pavliotis, in prep.: hypoelliptic diffusions.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 17 / 30
Multiscale analysis and coarse-graining
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 18 / 30
Langevin and overdamped Langevin dynamics
Langevin dynamics
dQ =P
mdt,
dP = −∇V (Q) dt − γ Pm
dt +√
2γβ−1 dWt .
overdamped Langevin dynamics
dX = −∇V (X ) dt +√
2β−1dWt
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 19 / 30
Singular limits of Langevin dynamics
Overdamped limit (γ →∞): Kramers 1940 (formal computations),Nelson 1967 (rigorous, SDEs).
Small noise limit (ε := β−1 → 0): Freidlin-Wentzell theory (RandomPerturbations of Dynamical Systems) 1994.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 20 / 30
Coarse-graining
Challenge: the curse of dimensionality (d extremely large).
Question: Reduced description of the system, that still includes importantdynamical information?
Coarse-graining/model reduction:
F : Rd → Rk with k < d .
Quantity of interest: path t 7→ F (X (t)). [X = (Q,P) or X = Q]
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 21 / 30
Previous work
Gyöngy 1986 introduced an exact dynamics
dY1(t) = b1(t,Y1(t)) dt +√
2β−1A1(t,Y1(t))dW (t), Y1(t) ∈ Rk
Law(Y1) = Law(F (X )) but it is extremely difficult/expensive tocompute b1 and A1.
Legoll et al. 2010, 2016 proposed an effective dynamics
dY2(t) = b2(Y2(t)) dt +√
2β−1A2(Y2(t)) dW (t), Y2(t) ∈ Rk
It is much easier/cheaper to compute b2 and A2. The authorsestimate the error between ρ1 := Law(Y1) and ρ2 := Law(Y2).
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 22 / 30
Limitations of Legoll et al.
only overdamped Langevin,
scalar coarse-graining maps (k = 1),
errors measured only in the relative entropy sense.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 23 / 30
Our work: a unifying technique
Based on the connection to large-deviation principle, we
provide a unifying technique for both singular limits andcoarse-graining,
generalise Legoll et al. to the full Langevin dynamics for vectorialmaps and more error-measurement methods.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 24 / 30
Theorem (D.-Lamacz-Peletier-Sharma 2017,D.-Lamacz-Peletier-Schlichting-Sharma sub. 2017)
Under certain regularity and growth assumptions on V and F ,
(1) (qualitative) We recover well-known results for the overdamped andsmall-noise limits of Langevin dynamics.
(2) (quantitative) It holds that
supt∈[0,T ]
D(ρ1(t), ρ2(t))︸ ︷︷ ︸Error measurement
(D=W 22 or H)
≤D(ρ1(0), ρ2(0))︸ ︷︷ ︸Initial data
+C (V ,F )︸ ︷︷ ︸Data
H(Law(X0)||µ).︸ ︷︷ ︸Reference dynamics
If V (x) = 1εV0(x0, . . . , xd−k) + V1(x), then
C (V ,F ) ≤ Cε2.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 25 / 30
Idea of proof: connections to large-deviation principle
Both X and Y2 can be written as
dZ (t) = b(Z (t)) dt +√
2σ(Z (t)) dW (t), Z (t) ∈ RN .
Consider n-copies of Z
dZi (t) = b(Zi (t)) dt +√
2σ(Zi (t)) dWi (t), Zi (t) ∈ RN , (i = 1, . . . , n)
The empirical measure
ρn(t) :=1
n
n∑i=1
δZi (t).
[Dawson-Gartner 1987]: ρn satisfies a large-deviation principle with a ratefunctional I
Prob(ρn∣∣[0,T ]
≈ ρ∣∣[0,T ]
)n→∞∼ exp(−nI(ρ))
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 26 / 30
Properties of the rate-functional
(i) I ≥ 0(ii) I(ρ) = 0 ⇐⇒ ρ = Law(Z)(iii) I has a dual formulation
I(ρ) = supf
G (ρ, f )
G (ρ, f ) =
ˆRN
[fTρT − f0ρ0]−ˆ T
0
ˆRN
(∂t f + Lf + Γ(f , f )
)dρt dt.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 27 / 30
Key inequalities
(HIR inequality) Let ρ = Law(Z ) then
H(η(t)||ρ(t)) +ˆ T
0R(η(s)||ρ(s)) ds ≤ H(η(0)||ρ(0)) + I(η),
for any η ∈ C ([0,T ],P(RN)).(Estimate of the rate functional)
I(η) ≤ C (V ,F )H(Law(X0)||µ)
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 28 / 30
Conclusions
The central role of the large-deviation rate functional
(1) microscopic interpretations of the Wasserstein gradient flowformulation,
(2) construction of new approximation schemes,
(3) Singular limits and coarse-graining of Langevin and overdampedLangevin dynamics.
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 29 / 30
Thank you for your attention!
Hong Duong (Imperial College London) Large deviation and variational approaches to generalised gradient flows23/01/2018 30 / 30
Wasserstein gradient flowsFrom Large-deviation principle to Wasserstein gradient flowsGeneralised Wasserstein gradient flowsMultiscale analysis/Coarse-graining