Probabilistic Analysis of MFGs III. Master Equations, Games with … · 2016-10-28 · PROBABILISTIC ANALYSIS OF MFGS III. MASTER EQUATIONS, GAMES WITH COMMON NOISE, AND WITH MAJOR

PROBABILISTIC ANALYSIS OF MFGSIII. MASTER EQUATIONS, GAMES WITH

COMMON NOISE, AND WITH MAJOR ANDMINOR PLAYERS

René Carmona

Department of Operations Research & Financial EngineeringPACM

Princeton University

Minerva Lectures, Columbia U October 27, 2016

RECALL THE JOINT CHAIN RUILE

I If u is smoothI If dξt = ηt dt + γt dWt

I If dXt = bt dt + σt dWt and µt = PXt

u(t , ξt , µt ) = u(0, ξ0, µ0) +

∫ t

0∂x u(s, ξs, µs) ·

(γsdWs

)+

∫ t

0

(∂t u(s, ξs, µs) + ∂x u(s, ξs, µs) · ηs +

12

trace[∂2

xx u(s, ξs, µs)γsγ†s])

ds

+

∫ t

0E[∂µu(s, ξs, µs)(Xs) · bs

]ds +

12

∫ t

0E[trace

(∂v[∂µu(s, ξs, µs)

](Xs)σsσ

†s)]

ds

where the process (Xt , bt , σt )0≤t≤T is an independent copy of the process(Xt , bt , σt )0≤t≤T , on a different probability space (Ω, F , P)

DERIVING THE MASTER EQUATION

I If we assume a form of Dynamic Programing PrincipleI If (t , x , µ) → U(t , x , µ) = Vµ(t , x) is the master fieldI We expect(

U(t ,Xt , µt )−∫ t

0f(s,Xs, µs, α(s,Xs, µs,Ys)

)ds)

0≤t≤T

to be a martingale whenever (Xt ,Yt ,Zt )0≤t≤T is the solution of theFBSDE characterizing the optimal path under (µt )0≤t≤T .

I So compute its Itô differential and set the drift to 0

AN EXAMPLE OF DERIVATION

dXt = b(t ,Xt , µt , αt )dt + dWt

H(t , x , µ, y , α) = b(t , x , µ, α) · y + f (t , x , µ, α)

α(t , x , µ, y) = arg infα

H(t , x , µ, y , α)

Itô’s Formula with µt = PXt

(set αt = α(t ,Xt , µt , ∂U(t ,Xt , µt )) and bt = b(t ,Xt , µt , αt ))

dU(t ,Xt , µt ) =(∂tU(t ,Xt , µt ) + bt · ∂xU(t ,Xt , µt ) +

12

trace[∂2xxU(t ,Xt , µt )] + f

(t , x , µ, αt

))dt

+ E[bt · ∂µU(t ,Xt , µt )(Xt ) +

12∂v∂µU(t ,Xt , µt )

](Xt )]

]dt + ∂xU(t ,Xt , µt )dWt

THE ACTUAL MASTER EQUATION

∂tU(t , x , µ) + b(t , x , µ, α(t , x , µ, ∂U(t , x , µ))

)· ∂xU(t , x , µ)

+12

trace[∂2

xxU(t , x , µ)]

+ f(t , x , µ, α(t , x , µ, ∂U(t , x , µ))

)+

∫Rd

[b(t , x ′, µ, α(t , x , µ, ∂U(t , x , µ))

)· ∂µU(t , x , µ)(x ′)

+12

trace(∂v∂µU(t , x , µ)(x ′)

)]dµ(x ′) = 0,

for (t , x , µ) ∈ [0,T ]× Rd × P2(Rd ), with the terminal conditionV (T , x , µ) = g(x , µ).

MEAN FIELD GAMES WITH A COMMON NOISE

Starting with a finite player game, i.e.

Simultaneous Minimization of

J i (α) = E

∫ T

0f (t ,X i

t , µNt , α

it )dt + g(XT , µ

NT )

, i = 1, · · · ,N

under constraints (dynamics of players private states)

dX it = b(t ,X i

t , µNt , α

it )dt + σ(t ,X i

t , µNt , α

it )dW i

t + σ0(t ,X it , µ

Nt , α

it )dW 0

t

for i.i.d. Wiener processes W kt for k = 0,1, · · · ,N.

LARGE GAME ASYMPTOTICS (CONT.)

Conditional Law of Large NumbersI If we consider exchangeable equilibriums,(α1

t , · · · , αNt ), then

I By de Finetti LLNlim

N→∞µN

t = PX1t |F

0t

I Dynamics of player 1 (or any other player) becomes

dX 1t = b(t ,X 1

t , µt , α1t )dt +σ(t ,X 1

t , µt , α1t )dWt +σ0(t ,X 1

t , µt , α1t )dW 0

t ;

with µt = PX1t |F

0t.

I Cost to player 1 (or any other player) becomes

E∫ T

0f (t ,Xt , µt , α

1t )dt + g(XT , µT )

MFG WITH COMMON NOISE PARADIGM

1. For each Fixed measure valued (F0t )-adapted process µ = (µt ) in P(R), solve

the standard stochastic control problem

α = arg infα

E

∫ T

0f (t ,Xt , µt , αt )dt + g(XT , µT )

subject to

dXt = b(t ,Xt , µt , αt )dt + σ(t ,Xt , µt , αt )dWt + σ0(t ,Xt , µt , αt )dW 0t ;

2. Fixed Point Problem: determine µ = (µt ) so that

∀t ∈ [0,T ], PXt |F0t

= µt a.s.

Once this is done one expects that, if αt = φ(t ,Xt ), for N player game,

αj∗t = φ∗(t ,X j

t ), j = 1, · · · ,N

form an approximate Nash equilibrium for the game with N players.

PONTRYAGIN STOCHASTIC MAXIMUM PRINCIPLE

Freeze µ = (µt )0≤t≤T , write (reduced) Hamiltonian

H(t , x , µ, y , α) = b(t , x , µ, α) · y + f (t , x , µ, α)

Standard definition

Given an admissible control α = (αt )0≤t≤T and the corresponding controlledstate process Xα = (Xα

t )0≤t≤T , any couple (Yt ,Zt )0≤t≤T satisfying:dYt = −∂x H(t ,Xα

t , µt ,Yt , αt )dt + ZtdWt + Z 0t dW 0

t

YT = ∂x g(XαT , µT )

is called a set of adjoint processes

STOCHASTIC CONTROL STEP SOLUTION

Determineα(t , x , µ, y) = arg inf

αH(t , x , µ, y , α)

Inject in FORWARD and BACKWARD dynamics and SOLVEdXt = b(t ,Xt , µt , α(t ,Xt , µt ,Yt ))dt + σ(t ,Xt )dWt + σ0(t ,Xt )dW 0

t ,

dYt = −∂xHµt (t ,X ,Yt , α(t ,Xt , µt ,Yt ))dt + ZtdWt + Z 0t dW 0

t

with X0 = x0 and YT = ∂xg(XT , µT )

Standard FBSDE (for each fixed t → µt )

FIXED POINT STEP

Solve the fixed point problem

(µt )0≤t≤T −→ (Xt )0≤t≤T −→ (PXt |F0t)0≤t≤T

Note: if we enforce µt = PXt |F0t

for all 0 ≤ t ≤ T in FBSDE we havedXt = b(t ,Xt ,PXt |F0

t, α

PXt |F0t (t ,Xt ,Yt ))dt + σ(t ,Xt )dWt + σ(t ,Xt ) dW 0

t ,

dYt = −∂xHPXt |F

0t (t ,Xα

t ,Yt , αPXt |F

0t (t ,Xt ,Yt ))dt + ZtdWt + Z 0

t dW 0t ,

withX0 = x0 and YT = ∂xg(XT ,PXT |F0

T)

FBSDE of Conditional McKean-Vlasov type !!!

Very difficult

SEVERAL APPROCHES

I Relaxed Controls (R.C. - Delarue - Lacker)

I FBSDEs of Conditional McKean-Vlasov Type (RC - Delarue)I Serious measurability difficulties

I e.g. may need extra martingale terms (and jumps) in BSDE iffiltrations not Brownian

I Develop TheoryI SDEs of Conditional McKean-Vlasov Type (baby steps in RC - Zhu)I Conditional Propagation of Chaos (baby steps in RC - Zhu)

I Introduce notions of strong and weak solutions to MFG problemI Extend Yamada-Watanabe theory to MFG set-upI Existence for a finite common noise (Schauder Theorem)I Weak Solutions by Limiting argumentsI Uniqueness via Monotonicity or Strong ConvexityI Strong Solutions via extension of Yamada-Watanabe

MFGs with Major andMinor Players

R.C. - G. Zhu, R.C. - P. Wang

MFG WITH MAJOR AND MINOR PLAYERS. I

State equationsdX 0

t = b0(t ,X 0t , µt , α

0t )dt + σ0(t ,X 0

t , µt , α0t )dW 0

t

dXt = b(t ,Xt , µt ,X 0t , αt , α

0t )dt + σ(t ,Xt , µt ,X 0

t , αt , α0t dWt ,

Costs J0(α0,α) = E

[∫ T0 f0(t ,X 0

t , µt , α0t )dt + g0(X 0

T , µT )]

J(α0,α) = E[∫ T

0 f (t ,Xt , µNt ,X

0t , αt , α

0t )dt + g(XT , µT )

],

OPEN LOOP VERSION OF THE MFG PROBLEM

The controls used by the major player and the representative minor playerare of the form:

α0t = φ0(t ,W 0

[0,T ]), and αt = φ(t ,W 0[0,T ],W[0,T ]), (1)

for deterministic progressively measurable functions

φ0 : [0,T ]× C([0,T ];Rd0 ) 7→ A0

andφ : [0,T ]× C([0,T ];Rd )× C([0,T ];Rd ) 7→ A

THE MAJOR PLAYER BEST RESPONSE

Assume representative minor player uses the open loop control given byφ : (t ,w0,w) 7→ φ(t ,w0,w),

Major player minimizes

Jφ,0(α0) = E[∫ T

0f0(t ,X 0

t , µt , α0t )dt + g0(X 0

T , µT )]

under the dynamical constraints:dX 0

t = b0(t ,X 0t , µt , α

0t )dt + σ0(t ,X 0

t , µt , α0t )dW 0

t

dXt = b(t ,Xt , µt ,X 0t , φ(t ,W 0

[0,T ],W[0,T ]), α0t )dt

+σ(t ,Xt , µt ,X 0t , φ(t ,W 0

[0,T ],W[0,T ]), α0t )dWt ,

µt = L(Xt |W 0[0,t]) conditional distribution of Xt given W 0

[0,t].

Major player problem as the search for:

φ0,∗(φ) = arg infα0

t =φ0(t,W 0[0,T ]

)Jφ,0(α0) (2)

Optimal control of the conditional McKean-Vlasov type!

THE REP. MINOR PLAYER BEST RESPONSE

System against which best response is sought comprises

I a major playerI a field of minor players different from the representative minor playerI Major player uses strategy α0

t = φ0(t ,W 0[0,T ])

I Representative of the field of minor players uses strategyαt = φ(t ,W 0

[0,T ],W[0,T ]).

State dynamicsdX 0

t = b0(t ,X 0t , µt , φ

0(t ,W 0[0,T ]))dt + σ0(t ,X 0

t , µt , φ0(t ,W 0

[0,T ]))dW 0t

dXt = b(t ,Xt , µt ,X 0t , φ(t ,W 0

[0,T ],W[0,T ]), φ0(t ,W 0

[0,T ]))dt+σ(t ,Xt , µt ,X 0

t , φ(t ,W 0[0,T ],W[0,T ]), φ

0(t ,W 0[0,T ]))dWt ,

where µt = L(Xt |W 0[0,t]) is the conditional distribution of Xt given W 0

[0,t].

Given φ0 and φ, SDE of (conditional) McKean-Vlasov type

THE REP. MINOR PLAYER BEST RESPONSE (CONT.)

Representative minor player chooses a strategy αt = φ(t ,W 0[0,T ],W[0,T ]) to

minimize

Jφ0,φ(α) = E

[∫ T

0f (t ,X t ,X 0

t , µt , αt , φ0(t ,W 0

[0,T ]))dt + g(X T , µt )],

where the dynamics of the virtual state X t are given by:

dX t = b(t ,X t , µt ,X 0t , φ(t ,W 0

[0,T ],W[0,T ]), φ0(t ,W 0

[0,T ]))dt

+ σ(t ,X t , µt ,X 0t , φ(t ,W 0

[0,T ],W[0,T ]), φ0(t ,W 0

[0,T ]))dW t ,

for a Wiener process W = (W t )0≤t≤T independent of the other Wienerprocesses.

I Optimization problem NOT of McKean-Vlasov type.I Classical optimal control problem with random coefficients

φ∗(φ0, φ) = arg inf

αt =φ(t,W 0[0,T ]

,W[0,T ])Jφ

0,φ(α)

NASH EQUILIBRIUM

Search for Best Response Map Fixed Point

(φ0, φ) =(φ0,∗(φ), φ∗(φ0, φ)

).

CLOSED LOOP VERSIONS OF THE MFG PROBLEM

I Closed Loop VersionControls of the major player and the representative minor player are ofthe form:

α0t = φ0(t ,X 0

[0,T ], µt ), and αt = φ(t ,X[0,T ], µt ,X 0[0,T ]),

for deterministic progressively measurable functionsφ0 : [0,T ]× C([0,T ];Rd0 ) 7→ A0 andφ : [0,T ]× C([0,T ];Rd )× C([0,T ];Rd ) 7→ A.

I Markovian VersionControls of the major player and the representative minor player are ofthe form:

α0t = φ0(t ,X 0

t , µt ), and αt = φ(t ,Xt , µt ,X 0t ),

for deterministic feedback functions φ0 : [0,T ]× Rd0 ×P2(Rd ) 7→ A0 andφ : [0,T ]× Rd × P2(Rd )× Rd0 7→ A.

LINEAR QUADRATIC MODELS

State dynamicsdX 0

t = (L0X 0t + B0α

0t + F0Xt )dt + D0dW 0

t

dXt = (LXt + Bαt + FXt + GX 0t )dt + DdWt

where Xt = E[Xt |F0t ], (F0

t )t≥0 filtration generated by W0

Costs

J0(α0,α) = E[∫ T

0[(X 0

t − H0Xt − η0)†Q0(X 0t − H0Xt − η0) + α0†

t R0α0t ]dt

]J(α0,α) = E

[∫ T

0[(Xt − HX 0

t − H1Xt − η)†Q(Xt − HX 0t − H1Xt − η) + α†t Rαt ]dt

]in which Q, Q0, R, R0 are symmetric matrices, and R, R0 are assumed to bepositive definite.

EQUILIBRIA

I Open Loop VersionI Optimization problems + fixed point =⇒ large FBSDEI affine FBSDE solved by a large matrix Riccati equation

I Closed Loop VersionI Fixed point step more difficultI Search limited to controls of the form

α0t = φ0(t ,X 0

t , Xt ) = φ00(t) + φ0

1(t)X 0t + φ0

2(t)Xt

αt = φ(t ,Xt ,X 0t , Xt ) = φ0(t) + φ1(t)Xt + φ2(t)X 0

t + φ3(t)Xt

I Optimization problems + fixed point =⇒ large FBSDEI affine FBSDE solved by a large matrix Riccati equation

Solutions are not the same !!!!

APPLICATION TO FLOCKINGInspired by Nourian-Caines-Malhame generalization of the basic Cucker-Smalemodel.

I β = 0, so state reduces to velocity as cost depends only on velocity

I V 0,Nt velocity of the (major player) leader at time t

I V i,Nt the velocity of the i-th follower, i = 1, · · · ,N at time t

I Linear dynamics dV 0,N

t = α0t dt + Σ0dW 0

tdV i,N

t = αit dt + ΣdW i

t

I Minimization of Quadratic costs

J0 = E[∫ T

0

(λ0‖V 0,N

t − νt‖2 + λ1‖V 0,Nt − V N

t ‖2 + (1− λ0 − λ1)‖α0

t ‖2)dt

]

I V Nt := 1

N

∑Ni=1 V i,N

t the average velocity of the followers,I deterministic function [0,T ] 3 t → νt ∈ Rd (leader’s free will)I λ0 and λ1 are positive real numbers satisfying λ0 + λ1 ≤ 1

J i = E[∫ T

0

(l0‖V i,N

t − V 0,Nt ‖2 + l1‖V i,N

t − V Nt ‖

2 + (1− l0 − l1)‖αit‖

2)dt]

l0 ≥ 0 and l1 ≥ 0, l0 + l1 ≤ 1.

SAMPLE TRAJECTORIES IN EQUILIBRIUM

ν(t) := [−2π sin(2πt),2π cos(2πt)]

0.0

0.5

1.0

−1.0 −0.5 0.0 0.5

x

y

k0 = 0.80 k1 = 0.19 l0 = 0.19 l1 = 0.80

−0.5

0.0

0.5

1.0

−1.0 −0.5 0.0 0.5

x

y

k0 = 0.80 k1 = 0.19 l0 = 0.80 l1 = 0.19

FIGURE: Optimal velocity and trajectory of follower and leaders

SAMPLE TRAJECTORIES IN EQUILIBRIUM

ν(t) := [−2π sin(2πt),2π cos(2πt)]

0.0

0.5

1.0

1.5

2.0

0.0 0.5 1.0 1.5 2.0

x

y

k0 = 0.19 k1 = 0.80 l0 = 0.19 l1 = 0.80

0.0

0.5

1.0

1.5

2.0

−0.5 0.0 0.5 1.0 1.5

x

y

k0 = 0.19 k1 = 0.80 l0 = 0.80 l1 = 0.19

FIGURE: Optimal velocity and trajectory of follower and leaders

CONDITIONAL PROPAGATION OF CHAOS

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

N = 5

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

N = 10

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

N = 20

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

N = 50

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

N = 100

FIGURE: Conditional correlation of 5 followers’ velocities

Documents

Probabilistic Analysis of MFGs III. Master Equations, Games with … · 2016-10-28 · PROBABILISTIC ANALYSIS OF MFGS III. MASTER EQUATIONS, GAMES WITH COMMON NOISE, AND WITH MAJOR