Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
PROBABILISTIC ANALYSIS OF MFGSIII. MASTER EQUATIONS, GAMES WITH
COMMON NOISE, AND WITH MAJOR ANDMINOR PLAYERS
René Carmona
Department of Operations Research & Financial EngineeringPACM
Princeton University
Minerva Lectures, Columbia U October 27, 2016
RECALL THE JOINT CHAIN RUILE
I If u is smoothI If dξt = ηt dt + γt dWt
I If dXt = bt dt + σt dWt and µt = PXt
u(t , ξt , µt ) = u(0, ξ0, µ0) +
∫ t
0∂x u(s, ξs, µs) ·
(γsdWs
)+
∫ t
0
(∂t u(s, ξs, µs) + ∂x u(s, ξs, µs) · ηs +
12
trace[∂2
xx u(s, ξs, µs)γsγ†s])
ds
+
∫ t
0E[∂µu(s, ξs, µs)(Xs) · bs
]ds +
12
∫ t
0E[trace
(∂v[∂µu(s, ξs, µs)
](Xs)σsσ
†s)]
ds
where the process (Xt , bt , σt )0≤t≤T is an independent copy of the process(Xt , bt , σt )0≤t≤T , on a different probability space (Ω, F , P)
DERIVING THE MASTER EQUATION
I If we assume a form of Dynamic Programing PrincipleI If (t , x , µ) → U(t , x , µ) = Vµ(t , x) is the master fieldI We expect(
U(t ,Xt , µt )−∫ t
0f(s,Xs, µs, α(s,Xs, µs,Ys)
)ds)
0≤t≤T
to be a martingale whenever (Xt ,Yt ,Zt )0≤t≤T is the solution of theFBSDE characterizing the optimal path under (µt )0≤t≤T .
I So compute its Itô differential and set the drift to 0
AN EXAMPLE OF DERIVATION
dXt = b(t ,Xt , µt , αt )dt + dWt
H(t , x , µ, y , α) = b(t , x , µ, α) · y + f (t , x , µ, α)
α(t , x , µ, y) = arg infα
H(t , x , µ, y , α)
Itô’s Formula with µt = PXt
(set αt = α(t ,Xt , µt , ∂U(t ,Xt , µt )) and bt = b(t ,Xt , µt , αt ))
dU(t ,Xt , µt ) =(∂tU(t ,Xt , µt ) + bt · ∂xU(t ,Xt , µt ) +
12
trace[∂2xxU(t ,Xt , µt )] + f
(t , x , µ, αt
))dt
+ E[bt · ∂µU(t ,Xt , µt )(Xt ) +
12∂v∂µU(t ,Xt , µt )
](Xt )]
]dt + ∂xU(t ,Xt , µt )dWt
THE ACTUAL MASTER EQUATION
∂tU(t , x , µ) + b(t , x , µ, α(t , x , µ, ∂U(t , x , µ))
)· ∂xU(t , x , µ)
+12
trace[∂2
xxU(t , x , µ)]
+ f(t , x , µ, α(t , x , µ, ∂U(t , x , µ))
)+
∫Rd
[b(t , x ′, µ, α(t , x , µ, ∂U(t , x , µ))
)· ∂µU(t , x , µ)(x ′)
+12
trace(∂v∂µU(t , x , µ)(x ′)
)]dµ(x ′) = 0,
for (t , x , µ) ∈ [0,T ]× Rd × P2(Rd ), with the terminal conditionV (T , x , µ) = g(x , µ).
MEAN FIELD GAMES WITH A COMMON NOISE
Starting with a finite player game, i.e.
Simultaneous Minimization of
J i (α) = E
∫ T
0f (t ,X i
t , µNt , α
it )dt + g(XT , µ
NT )
, i = 1, · · · ,N
under constraints (dynamics of players private states)
dX it = b(t ,X i
t , µNt , α
it )dt + σ(t ,X i
t , µNt , α
it )dW i
t + σ0(t ,X it , µ
Nt , α
it )dW 0
t
for i.i.d. Wiener processes W kt for k = 0,1, · · · ,N.
LARGE GAME ASYMPTOTICS (CONT.)
Conditional Law of Large NumbersI If we consider exchangeable equilibriums,(α1
t , · · · , αNt ), then
I By de Finetti LLNlim
N→∞µN
t = PX1t |F
0t
I Dynamics of player 1 (or any other player) becomes
dX 1t = b(t ,X 1
t , µt , α1t )dt +σ(t ,X 1
t , µt , α1t )dWt +σ0(t ,X 1
t , µt , α1t )dW 0
t ;
with µt = PX1t |F
0t.
I Cost to player 1 (or any other player) becomes
E∫ T
0f (t ,Xt , µt , α
1t )dt + g(XT , µT )
MFG WITH COMMON NOISE PARADIGM
1. For each Fixed measure valued (F0t )-adapted process µ = (µt ) in P(R), solve
the standard stochastic control problem
α = arg infα
E
∫ T
0f (t ,Xt , µt , αt )dt + g(XT , µT )
subject to
dXt = b(t ,Xt , µt , αt )dt + σ(t ,Xt , µt , αt )dWt + σ0(t ,Xt , µt , αt )dW 0t ;
2. Fixed Point Problem: determine µ = (µt ) so that
∀t ∈ [0,T ], PXt |F0t
= µt a.s.
Once this is done one expects that, if αt = φ(t ,Xt ), for N player game,
αj∗t = φ∗(t ,X j
t ), j = 1, · · · ,N
form an approximate Nash equilibrium for the game with N players.
PONTRYAGIN STOCHASTIC MAXIMUM PRINCIPLE
Freeze µ = (µt )0≤t≤T , write (reduced) Hamiltonian
H(t , x , µ, y , α) = b(t , x , µ, α) · y + f (t , x , µ, α)
Standard definition
Given an admissible control α = (αt )0≤t≤T and the corresponding controlledstate process Xα = (Xα
t )0≤t≤T , any couple (Yt ,Zt )0≤t≤T satisfying:dYt = −∂x H(t ,Xα
t , µt ,Yt , αt )dt + ZtdWt + Z 0t dW 0
t
YT = ∂x g(XαT , µT )
is called a set of adjoint processes
STOCHASTIC CONTROL STEP SOLUTION
Determineα(t , x , µ, y) = arg inf
αH(t , x , µ, y , α)
Inject in FORWARD and BACKWARD dynamics and SOLVEdXt = b(t ,Xt , µt , α(t ,Xt , µt ,Yt ))dt + σ(t ,Xt )dWt + σ0(t ,Xt )dW 0
t ,
dYt = −∂xHµt (t ,X ,Yt , α(t ,Xt , µt ,Yt ))dt + ZtdWt + Z 0t dW 0
t
with X0 = x0 and YT = ∂xg(XT , µT )
Standard FBSDE (for each fixed t → µt )
FIXED POINT STEP
Solve the fixed point problem
(µt )0≤t≤T −→ (Xt )0≤t≤T −→ (PXt |F0t)0≤t≤T
Note: if we enforce µt = PXt |F0t
for all 0 ≤ t ≤ T in FBSDE we havedXt = b(t ,Xt ,PXt |F0
t, α
PXt |F0t (t ,Xt ,Yt ))dt + σ(t ,Xt )dWt + σ(t ,Xt ) dW 0
t ,
dYt = −∂xHPXt |F
0t (t ,Xα
t ,Yt , αPXt |F
0t (t ,Xt ,Yt ))dt + ZtdWt + Z 0
t dW 0t ,
withX0 = x0 and YT = ∂xg(XT ,PXT |F0
T)
FBSDE of Conditional McKean-Vlasov type !!!
Very difficult
SEVERAL APPROCHES
I Relaxed Controls (R.C. - Delarue - Lacker)
I FBSDEs of Conditional McKean-Vlasov Type (RC - Delarue)I Serious measurability difficulties
I e.g. may need extra martingale terms (and jumps) in BSDE iffiltrations not Brownian
I Develop TheoryI SDEs of Conditional McKean-Vlasov Type (baby steps in RC - Zhu)I Conditional Propagation of Chaos (baby steps in RC - Zhu)
I Introduce notions of strong and weak solutions to MFG problemI Extend Yamada-Watanabe theory to MFG set-upI Existence for a finite common noise (Schauder Theorem)I Weak Solutions by Limiting argumentsI Uniqueness via Monotonicity or Strong ConvexityI Strong Solutions via extension of Yamada-Watanabe
MFGs with Major andMinor Players
R.C. - G. Zhu, R.C. - P. Wang
MFG WITH MAJOR AND MINOR PLAYERS. I
State equationsdX 0
t = b0(t ,X 0t , µt , α
0t )dt + σ0(t ,X 0
t , µt , α0t )dW 0
t
dXt = b(t ,Xt , µt ,X 0t , αt , α
0t )dt + σ(t ,Xt , µt ,X 0
t , αt , α0t dWt ,
Costs J0(α0,α) = E
[∫ T0 f0(t ,X 0
t , µt , α0t )dt + g0(X 0
T , µT )]
J(α0,α) = E[∫ T
0 f (t ,Xt , µNt ,X
0t , αt , α
0t )dt + g(XT , µT )
],
OPEN LOOP VERSION OF THE MFG PROBLEM
The controls used by the major player and the representative minor playerare of the form:
α0t = φ0(t ,W 0
[0,T ]), and αt = φ(t ,W 0[0,T ],W[0,T ]), (1)
for deterministic progressively measurable functions
φ0 : [0,T ]× C([0,T ];Rd0 ) 7→ A0
andφ : [0,T ]× C([0,T ];Rd )× C([0,T ];Rd ) 7→ A
THE MAJOR PLAYER BEST RESPONSE
Assume representative minor player uses the open loop control given byφ : (t ,w0,w) 7→ φ(t ,w0,w),
Major player minimizes
Jφ,0(α0) = E[∫ T
0f0(t ,X 0
t , µt , α0t )dt + g0(X 0
T , µT )]
under the dynamical constraints:dX 0
t = b0(t ,X 0t , µt , α
0t )dt + σ0(t ,X 0
t , µt , α0t )dW 0
t
dXt = b(t ,Xt , µt ,X 0t , φ(t ,W 0
[0,T ],W[0,T ]), α0t )dt
+σ(t ,Xt , µt ,X 0t , φ(t ,W 0
[0,T ],W[0,T ]), α0t )dWt ,
µt = L(Xt |W 0[0,t]) conditional distribution of Xt given W 0
[0,t].
Major player problem as the search for:
φ0,∗(φ) = arg infα0
t =φ0(t,W 0[0,T ]
)Jφ,0(α0) (2)
Optimal control of the conditional McKean-Vlasov type!
THE REP. MINOR PLAYER BEST RESPONSE
System against which best response is sought comprises
I a major playerI a field of minor players different from the representative minor playerI Major player uses strategy α0
t = φ0(t ,W 0[0,T ])
I Representative of the field of minor players uses strategyαt = φ(t ,W 0
[0,T ],W[0,T ]).
State dynamicsdX 0
t = b0(t ,X 0t , µt , φ
0(t ,W 0[0,T ]))dt + σ0(t ,X 0
t , µt , φ0(t ,W 0
[0,T ]))dW 0t
dXt = b(t ,Xt , µt ,X 0t , φ(t ,W 0
[0,T ],W[0,T ]), φ0(t ,W 0
[0,T ]))dt+σ(t ,Xt , µt ,X 0
t , φ(t ,W 0[0,T ],W[0,T ]), φ
0(t ,W 0[0,T ]))dWt ,
where µt = L(Xt |W 0[0,t]) is the conditional distribution of Xt given W 0
[0,t].
Given φ0 and φ, SDE of (conditional) McKean-Vlasov type
THE REP. MINOR PLAYER BEST RESPONSE (CONT.)
Representative minor player chooses a strategy αt = φ(t ,W 0[0,T ],W[0,T ]) to
minimize
Jφ0,φ(α) = E
[∫ T
0f (t ,X t ,X 0
t , µt , αt , φ0(t ,W 0
[0,T ]))dt + g(X T , µt )],
where the dynamics of the virtual state X t are given by:
dX t = b(t ,X t , µt ,X 0t , φ(t ,W 0
[0,T ],W[0,T ]), φ0(t ,W 0
[0,T ]))dt
+ σ(t ,X t , µt ,X 0t , φ(t ,W 0
[0,T ],W[0,T ]), φ0(t ,W 0
[0,T ]))dW t ,
for a Wiener process W = (W t )0≤t≤T independent of the other Wienerprocesses.
I Optimization problem NOT of McKean-Vlasov type.I Classical optimal control problem with random coefficients
φ∗(φ0, φ) = arg inf
αt =φ(t,W 0[0,T ]
,W[0,T ])Jφ
0,φ(α)
NASH EQUILIBRIUM
Search for Best Response Map Fixed Point
(φ0, φ) =(φ0,∗(φ), φ∗(φ0, φ)
).
CLOSED LOOP VERSIONS OF THE MFG PROBLEM
I Closed Loop VersionControls of the major player and the representative minor player are ofthe form:
α0t = φ0(t ,X 0
[0,T ], µt ), and αt = φ(t ,X[0,T ], µt ,X 0[0,T ]),
for deterministic progressively measurable functionsφ0 : [0,T ]× C([0,T ];Rd0 ) 7→ A0 andφ : [0,T ]× C([0,T ];Rd )× C([0,T ];Rd ) 7→ A.
I Markovian VersionControls of the major player and the representative minor player are ofthe form:
α0t = φ0(t ,X 0
t , µt ), and αt = φ(t ,Xt , µt ,X 0t ),
for deterministic feedback functions φ0 : [0,T ]× Rd0 ×P2(Rd ) 7→ A0 andφ : [0,T ]× Rd × P2(Rd )× Rd0 7→ A.
LINEAR QUADRATIC MODELS
State dynamicsdX 0
t = (L0X 0t + B0α
0t + F0Xt )dt + D0dW 0
t
dXt = (LXt + Bαt + FXt + GX 0t )dt + DdWt
where Xt = E[Xt |F0t ], (F0
t )t≥0 filtration generated by W0
Costs
J0(α0,α) = E[∫ T
0[(X 0
t − H0Xt − η0)†Q0(X 0t − H0Xt − η0) + α0†
t R0α0t ]dt
]J(α0,α) = E
[∫ T
0[(Xt − HX 0
t − H1Xt − η)†Q(Xt − HX 0t − H1Xt − η) + α†t Rαt ]dt
]in which Q, Q0, R, R0 are symmetric matrices, and R, R0 are assumed to bepositive definite.
EQUILIBRIA
I Open Loop VersionI Optimization problems + fixed point =⇒ large FBSDEI affine FBSDE solved by a large matrix Riccati equation
I Closed Loop VersionI Fixed point step more difficultI Search limited to controls of the form
α0t = φ0(t ,X 0
t , Xt ) = φ00(t) + φ0
1(t)X 0t + φ0
2(t)Xt
αt = φ(t ,Xt ,X 0t , Xt ) = φ0(t) + φ1(t)Xt + φ2(t)X 0
t + φ3(t)Xt
I Optimization problems + fixed point =⇒ large FBSDEI affine FBSDE solved by a large matrix Riccati equation
Solutions are not the same !!!!
APPLICATION TO FLOCKINGInspired by Nourian-Caines-Malhame generalization of the basic Cucker-Smalemodel.
I β = 0, so state reduces to velocity as cost depends only on velocity
I V 0,Nt velocity of the (major player) leader at time t
I V i,Nt the velocity of the i-th follower, i = 1, · · · ,N at time t
I Linear dynamics dV 0,N
t = α0t dt + Σ0dW 0
tdV i,N
t = αit dt + ΣdW i
t
I Minimization of Quadratic costs
J0 = E[∫ T
0
(λ0‖V 0,N
t − νt‖2 + λ1‖V 0,Nt − V N
t ‖2 + (1− λ0 − λ1)‖α0
t ‖2)dt
]
I V Nt := 1
N
∑Ni=1 V i,N
t the average velocity of the followers,I deterministic function [0,T ] 3 t → νt ∈ Rd (leader’s free will)I λ0 and λ1 are positive real numbers satisfying λ0 + λ1 ≤ 1
J i = E[∫ T
0
(l0‖V i,N
t − V 0,Nt ‖2 + l1‖V i,N
t − V Nt ‖
2 + (1− l0 − l1)‖αit‖
2)dt]
l0 ≥ 0 and l1 ≥ 0, l0 + l1 ≤ 1.
SAMPLE TRAJECTORIES IN EQUILIBRIUM
ν(t) := [−2π sin(2πt),2π cos(2πt)]
0.0
0.5
1.0
−1.0 −0.5 0.0 0.5
x
y
k0 = 0.80 k1 = 0.19 l0 = 0.19 l1 = 0.80
−0.5
0.0
0.5
1.0
−1.0 −0.5 0.0 0.5
x
y
k0 = 0.80 k1 = 0.19 l0 = 0.80 l1 = 0.19
FIGURE: Optimal velocity and trajectory of follower and leaders
SAMPLE TRAJECTORIES IN EQUILIBRIUM
ν(t) := [−2π sin(2πt),2π cos(2πt)]
0.0
0.5
1.0
1.5
2.0
0.0 0.5 1.0 1.5 2.0
x
y
k0 = 0.19 k1 = 0.80 l0 = 0.19 l1 = 0.80
0.0
0.5
1.0
1.5
2.0
−0.5 0.0 0.5 1.0 1.5
x
y
k0 = 0.19 k1 = 0.80 l0 = 0.80 l1 = 0.19
FIGURE: Optimal velocity and trajectory of follower and leaders
CONDITIONAL PROPAGATION OF CHAOS
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
N = 5
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
N = 10
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
N = 20
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
N = 50
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
N = 100
FIGURE: Conditional correlation of 5 followers’ velocities