On Stopping Times and Impulse Control with Constraint
Jose Luis Menaldi
Based on joint papers with M. Robin (2016, 2017)
Department of Mathematics, Wayne State University, Detroit, Michigan 48202, USA (e-mail: [email protected])
IMA, Minneapolis, MN, May 7–11, 2018
Wed May 09, 2018, 10:00-10:50
JLM (WSU) Control Problems with Constraint 1 / 24
1.- Introduction Table of Content
Content
1.- Introduction: Stopping, Impulse, and Signal
2.- Stopping with Constraint
3.- Impulse Control with Constraint - Setting, Assumptions
4.- . . . - Formal Analysis, Dynamic Programming
5.- . . . - Solving HJB Equation
6.- . . . - Extensions
7.- Hybrid Control, Stopping, Impulse, Switching
1.- Introduction Stopping & Impulse
Optimal Stopping Time & Impulse Control Problem
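Control $\theta$ = stop the evolution of a Markov-Feller process $x_t$ on a Polish space.
Optimal problem = maximize/minimize the reward/cost functional
\[
J_x(\theta,\psi) = E_x\Big\{ \int_0^{\theta} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\theta}\psi(x_\theta) \Big\}, \tag{1}
\]
where $\theta$ is a stopping time, $f, \psi \ge 0$ (running & terminal costs), and $\alpha > 0$ (discount).
Optimal cost:
\[
v(x) = \inf\big\{ J_x(\theta,\psi) : \theta \text{ stopping time} \big\}.
\]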
Cost functional (with intervention)
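\[
J_x(\nu) = E_x\Big\{ \int_0^{\infty} e^{-\alpha t} f(x^{\nu}_t)\,dt + \sum_{i=0}^{\infty} e^{-\alpha\vartheta_i} c(x^{\nu}_{\vartheta_i}, \xi_i) \Big\}, \tag{2}
\]
for any impulse/switching control $\nu$, and the optimal cost function
\[
v(x) = \inf_{\nu} J_x(\nu).
\]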
1.- Introduction Stopping & Impulse - Constraint
VI & QVI - Stopping & Impulse - Constraint
The dynamic programming gives an HJB equation (VI or QVI) of the form
\[
\min\{ Av - \alpha v + f,\ \psi - v \} = 0 \quad\text{or}\quad \min\{ Av - \alpha v + f,\ Mv - v \} = 0,
\]
with $A$ = infinitesimal generator of the Markov-Feller process $\{x_t : t \in [0,\infty[\}$ (with states in $E$, a Polish space), and $Mu(x) = \inf_{\xi}\{c(x,\xi) + u(\xi)\}$.
This QVI can be solved by using the sequence of variational inequalities
\[
\min\{ Av^{n} - \alpha v^{n} + f,\ Mv^{n-1} - v^{n} \} = 0,
\]
where $v^{n}$ is the optimal cost for the problem with stopping cost $Mv^{n-1}$.
Bensoussan and Lions (book 1978), Robin (thesis 1978), Menaldi (thesis 1980), etc.; Peskir and Shiryaev (book 2006), etc. (an extremely long list, with variations).
Constraint: ‘Wait for a signal’. For instance, the system evolves according to a Wiener process $w_t$ (with drift $b$ and diffusion $\sigma$) and the signal is given by the jumps of a Poisson process $N_t$, independent of the Wiener process. Dupuis and Wang (2002) [also Lempa (2012), Liang and Wei (2014)] studied this case for a geometric Brownian process.
1.- Introduction Signal
Constraint: Wait for a Signal
The dynamic programming equation (or HJB equation) takes the form
\[
Au - \alpha u = \lambda [u - \psi]^{+}, \quad\text{with } f = 0,
\]
where the verification theorem is based on an explicit solution of the HJB equation and on the solution of a discrete time problem.
Objective: To generalize the above problem in two directions:
- to replace the 1-d Wiener process with a Markov-Feller process,
- to replace the Poisson process by a more general counting process.
The simplest model: $\{\tau_n : n \ge 1\}$ are the jumps of a Poisson process, i.e., the times between two consecutive jumps $T_1 = \tau_1$, $T_2 = \tau_2 - \tau_1$, $T_3 = \tau_3 - \tau_2$, ... necessarily form an IID sequence, exponentially distributed. However, we focus first on $\{T_n : n \ge 1\}$ being a sequence of IID random variables with common law $\pi_0$ (not exponential in general, i.e., not memoryless).
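The two signal models just described can be compared by simulation. The following is a minimal sketch, assuming the waiting times $T_n$ are drawn by a user-supplied sampler; the function name `signal_times`, the rate 2.0 and the uniform law on [0.2, 0.8] are illustrative choices, not taken from the talk.

```python
import random


def signal_times(sample_wait, horizon, rng):
    """Return the signal instants tau_1 < tau_2 < ... that fall in (0, horizon].

    sample_wait(rng) draws one waiting time T_n from the common law pi_0
    (hypothetical interface chosen for this sketch).
    """
    times, t = [], 0.0
    while True:
        t += sample_wait(rng)          # tau_n = tau_{n-1} + T_n
        if t > horizon:
            return times
        times.append(t)


rng = random.Random(0)
# Poisson signal: exponential (memoryless) waiting times.
poisson = signal_times(lambda r: r.expovariate(2.0), 50.0, rng)
# A non-exponential law pi_0: waiting times uniform on [0.2, 0.8].
uniform = signal_times(lambda r: r.uniform(0.2, 0.8), 50.0, rng)
```

In the second case the time already elapsed since the last signal carries information about the next one, which is why the extra state variable $y_t$ is needed.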
1.- Introduction Time Elapsed Since Last Signal
Signal as a Process
Let us introduce a homogeneous Markov process $\{y_t : t \ge 0\}$, $y_t \in [0,\infty[$, independent of $\{x_t : t \ge 0\}$: the ‘time elapsed since the last signal’. Almost surely, the cad-lag paths $t \mapsto y_t$ are piecewise differentiable with $\dot y_t = 1$, with jumps only back to zero, and the expected infinitesimal generator is
\[
A_1\varphi(y) = \partial_y\varphi(y) + \lambda(y)[\varphi(0) - \varphi(y)], \quad \forall y \ge 0, \tag{3}
\]
where the intensity $\lambda(y) \ge 0$ is a Borel measurable function. Thus, the signals are defined as functionals on $\{y_t : t \ge 0\}$, namely,
\[
\tau_0 = 0, \quad \tau_n = \inf\{ t > \tau_{n-1} : y_t = 0 \}, \quad n \ge 1. \tag{4}
\]
The couple $\{(x_t, y_t) : t \ge 0\}$ is a homogeneous Markov process in continuous time, and a signal arrives at a random instant $\tau$ if and only if $y_\tau = 0$ (and $\tau > 0$).
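To make the process $y_t$ concrete, here is a small sketch that evaluates the ‘time elapsed since the last signal’ along one path, given the signal instants. The function name `elapsed` and the sample instants are hypothetical; the sketch only illustrates that $y$ grows at unit speed and is reset to zero exactly at the times $\tau_n$ of (4).

```python
from bisect import bisect_right


def elapsed(t, taus, y0=0.0):
    """y_t for a path with signal instants `taus` (sorted) and initial value y_0 = y0."""
    k = bisect_right(taus, t)          # number of signals in [0, t]
    if k == 0:
        return y0 + t                  # no signal yet: y grows at unit speed from y0
    return t - taus[k - 1]             # reset to zero at the last signal, then unit speed


taus = [1.0, 2.5, 4.0]                 # illustrative signal instants
values = [elapsed(t, taus, y0=0.3) for t in (0.5, 2.5, 3.0)]
```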
Stopping with Constraint
2.- Stopping Problem Setting Stopping
Two Optimal Costs - Stopping Problem
• $C_b = C_b(E \times [0,\infty[)$: continuous & bounded functions on $E \times [0,\infty[$,
• $\{(x_t, y_t) : t \ge 0\}$ Markov-Feller process, i.e., for any $\varphi(x,y)$ in $C_b$,
\[
(x, y, t) \mapsto E_{x,y}\{\varphi(x_t, y_t)\} \ \text{ is continuous, and} \tag{5}
\]
\[
\alpha > 0 \ \text{ and } \ f, \psi \ge 0, \quad f, \psi \in C_b(E \times [0,\infty[), \tag{6}
\]
• The cost function is
\[
J_{x,y}(\theta,\psi) = E_{x,y}\Big\{ \int_0^{\theta} e^{-\alpha t} f(x_t, y_t)\,dt + e^{-\alpha\theta}\psi(x_\theta, y_\theta) \Big\},
\]
which yields an optimal cost
\[
v(x,y) = \inf\big\{ J_{x,y}(\theta,\psi) : \theta > 0,\ y_\theta = 0 \big\}, \tag{7}
\]
i.e., $\theta$ is any admissible stopping time, and an auxiliary optimal cost is defined as
\[
v_0(x,y) = \inf\big\{ J_{x,y}(\theta,\psi) : y_\theta = 0 \big\}, \tag{8}
\]
which is a homogeneous Markov model.
2.- Stopping Problem HJB Stopping
Dynamic Programming - HJB Equations - Optimal Stopping Cost
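If $\tau = \inf\{ t > 0 : y_t = 0 \}$, then
\[
u_0(x,0) = \min\Big\{ \psi,\ E_{x,y}\Big\{ \int_0^{\tau} e^{-\alpha t} f(x_t, y_t)\,dt + e^{-\alpha\tau} u_0(x_\tau, y_\tau) \Big\} \Big\}, \tag{9}
\]
and
\[
u(x,y) = E_{x,y}\Big\{ \int_0^{\tau} e^{-\alpha t} f(x_t, y_t)\,dt + e^{-\alpha\tau} \min\{\psi, u\}(x_\tau, y_\tau) \Big\}, \tag{10}
\]
are the corresponding HJB equations, and both problems are related by the equation
\[
u(x,y) = E_{x,y}\Big\{ \int_0^{\tau} e^{-\alpha t} f(x_t, y_t)\,dt + e^{-\alpha\tau} u_0(x_\tau, y_\tau) \Big\}. \tag{11}
\]
Note that $y_\tau = 0$ and $u_0(x,y) = u(x,y)$ for any $y > 0$.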
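As a formal check of (10), consider the simplest situation, which is an assumption of this remark and not the general setting: the intensity is a constant $\lambda > 0$ (Poisson signal), so that $\tau$ is exponentially distributed and independent of $\{x_t\}$, and $f$, $\psi$ (hence $u$) do not depend on $y$. Using $P(\tau > t) = e^{-\lambda t}$ and the density $\lambda e^{-\lambda t}$ of $\tau$,
\[
\begin{aligned}
u(x) &= E_x\Big\{ \int_0^{\tau} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\tau}\min\{\psi, u\}(x_\tau) \Big\} \\
     &= E_x\Big\{ \int_0^{\infty} e^{-(\alpha+\lambda)t}\big[ f(x_t) + \lambda \min\{\psi, u\}(x_t) \big]\,dt \Big\},
\end{aligned}
\]
i.e., $u$ is the $(\alpha+\lambda)$-resolvent of $f + \lambda\min\{\psi, u\}$, so that formally $-Au + (\alpha+\lambda)u = f + \lambda\min\{\psi, u\}$. Since $u - \min\{\psi, u\} = [u - \psi]^{+}$, this reads
\[
-Au + \alpha u + \lambda [u - \psi]^{+} = f,
\]
which for $f = 0$ is the equation $Au - \alpha u = \lambda[u - \psi]^{+}$ of the ‘wait for a signal’ problem.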
2.- Stopping Problem Solving Stopping
Solving Optimal Stopping Problems
Theorem
Assume (5) & (6). Then the VI (9) & (10) each have a unique solution in $C_b(E \times [0,\infty[)$, and these solutions are the optimal costs (7) & (8). Moreover, the first exit times from the continuation region $[u < \psi]$ (discrete stopping times)
\[
\begin{aligned}
\theta &= \inf\{ t > 0 : u(x_t, y_t) \ge \psi(x_t, y_t),\ y_t = 0 \}, \\
\theta_0 &= \inf\{ t \ge 0 : u_0(x_t, y_t) = \psi(x_t, y_t),\ y_t = 0 \}
\end{aligned} \tag{12}
\]
are optimal, namely, $u(x,y) = J_{x,y}(\theta,\psi)$ and $u_0(x,y) = J_{x,y}(\theta_0,\psi)$. Furthermore, the relation (11) holds.
Also $u_0 = \min\{u, \psi\}$, which may not belong to $D(A_{x,y}) \subset C_b(E \times [0,\infty[)$. However, the optimal cost $u$ given by (7) belongs to $D(A_{x,y})$ and for any $(x,y)$ in $E \times [0,\infty[$,
\[
\text{(HJB)} \quad -A_{x,y}u(x,y) + \alpha u(x,y) + \lambda(x,y)[u(x,0) - \psi(x,y)]^{+} = f(x,y). \tag{13}
\]
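Equation (13) can be solved numerically in a toy case. The sketch below is not the method of the talk: it assumes a finite state space whose generator is a rate matrix `Q` (standing in for $A$), a constant intensity `lam`, and data independent of $y$, so that (13) becomes $-Qu + \alpha u + \lambda[u - \psi]^{+} = f$. All names and numerical values are illustrative. The iteration uses $[u-\psi]^{+} = u - \min\{u,\psi\}$ and is a sup-norm contraction because $\alpha > 0$.

```python
def solve_constrained_stopping(Q, f, psi, alpha, lam, tol=1e-12, max_iter=100_000):
    """Jacobi fixed-point iteration for  -Qu + alpha*u + lam*[u - psi]^+ = f."""
    n = len(f)
    u = [0.0] * n
    for _ in range(max_iter):
        new = []
        for x in range(n):
            out_rate = -Q[x][x]
            inflow = sum(Q[x][z] * u[z] for z in range(n) if z != x)
            # Rearranged equation:
            # (alpha + lam + out_rate) u(x) = f(x) + lam*min(u, psi)(x) + inflow
            new.append((f[x] + lam * min(u[x], psi[x]) + inflow)
                       / (alpha + lam + out_rate))
        if max(abs(a - b) for a, b in zip(new, u)) < tol:
            return new
        u = new
    raise RuntimeError("fixed-point iteration did not converge")


def residual(Q, f, psi, alpha, lam, u):
    """Sup-norm residual of  -Qu + alpha*u + lam*[u - psi]^+ - f."""
    n = len(f)
    return max(
        abs(-sum(Q[x][z] * u[z] for z in range(n)) + alpha * u[x]
            + lam * max(u[x] - psi[x], 0.0) - f[x])
        for x in range(n)
    )


Q = [[-1.0, 1.0, 0.0], [0.5, -1.0, 0.5], [0.0, 1.0, -1.0]]   # 3-state chain
f = [1.0, 4.0, 9.0]                                           # running cost
psi = [2.0, 2.0, 2.0]                                         # stopping cost
u = solve_constrained_stopping(Q, f, psi, alpha=0.5, lam=2.0)
# An infinite stopping cost means "never stop": an upper bound for u.
u_never_stop = solve_constrained_stopping(Q, f, [float("inf")] * 3, alpha=0.5, lam=2.0)
```

In this toy setting, letting `lam` grow makes signals arrive almost immediately, so the constraint becomes less and less restrictive.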
IC w/Constraint: Setting, Assumptions
3.- Impulse Control Assumptions
Impulse Control Setting - Evolution
If $x_t$ represents the state of the ‘uncontrolled’ system, then a sequence $\nu = \{(\vartheta_i, \xi_i) : i \ge 1\}$ of stopping times $\vartheta_1 \le \vartheta_2 \le \cdots \to \infty$ and $E$-valued random variables $\xi_1, \xi_2, \dots$ is called an impulse control.
The controlled system $\{x^{\nu}_t : t \ge 0\}$ behaves like $x_t$ for $t$ within $[\vartheta_{i-1}, \vartheta_i[$, $i \ge 1$, and $x^{\nu}_{\vartheta_i} = \xi_i$; in short, it instantaneously moves from $x^{\nu}_{\vartheta_i-}$ to $\xi_i$.
A cost-per-impulse
\[
c : E \times \Gamma \to [c_0, \infty[, \quad c \text{ is continuous and } c_0 > 0, \tag{14}
\]
with a closed subset $\Gamma$ of $E$ (= set where to jump / admissible jumps).
• $\{(x_t, y_t) : t \ge 0\}$ is a time-homogeneous Markov-Feller process, with infinitesimal generator, for $x \in E$, $y \ge 0$,
\[
A_{x,y}\varphi(x,y) = A_x\varphi(x,y) + \partial_y\varphi(x,y) + \lambda(x,y)[\varphi(x,0) - \varphi(x,y)], \tag{15}
\]
i.e., now the intensity $\lambda$ depends also on $x$.
3.- Impulse Control Setting
Cost per Interventions / Impulse costs
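• First: an impulse control $\nu = \{(\vartheta_i, \xi_i) : i \ge 1\}$ is ‘admissible’ if $y_{\vartheta_i} = 0$, $\forall i \ge 1$, and $\vartheta_1 > 0$. If $\vartheta_1 = 0$ is allowed then $\nu$ is ‘zero-admissible’.
• 1st decision: choose an $F$-stopping time $\vartheta_1$ and an $F_{\vartheta_1}$-measurable random variable $\xi_1 : \Omega \to \Gamma \subset E$; cost up to $\vartheta_1$:
\[
J_{x,y}(\nu|\vartheta_1) = E^{\nu|\vartheta_0}_{x,y}\Big\{ \int_{\vartheta_0}^{\vartheta_1} f(x_t, y_t)e^{-\alpha t}\,dt + e^{-\alpha\vartheta_1} c(x_{\vartheta_1}, \xi_1) \Big\}, \quad \vartheta_0 = 0,
\]
• 2nd decision: choose an $F$-stopping time $\vartheta_2$ and an $F_{\vartheta_2}$-measurable random variable $\xi_2 : \Omega \to \Gamma \subset E$; cost up to $\vartheta_2$:
\[
J_{x,y}(\nu|\vartheta_2) = J_{x,y}(\nu|\vartheta_1) + E^{\nu|\vartheta_1}_{x,y}\Big\{ \int_{\vartheta_1}^{\vartheta_2} f(x_t, y_t)e^{-\alpha t}\,dt + e^{-\alpha\vartheta_2} c(x_{\vartheta_2}, \xi_2) \Big\}
\]
• Iterating, cost up to $\vartheta_{k+1}$ ($\vartheta_{k+1} < \infty$ means ‘intervention’):
\[
J_{x,y}(\nu|\vartheta_{k+1}) = J_{x,y}(\nu|\vartheta_k) + E^{\nu|\vartheta_k}_{x,y}\Big\{ \int_{\vartheta_k}^{\vartheta_{k+1}} f(x_t, y_t)e^{-\alpha t}\,dt + e^{-\alpha\vartheta_{k+1}} c(x_{\vartheta_{k+1}}, \xi_{k+1}) \Big\}. \tag{16}
\]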
3.- Impulse Control Setting (cont. 1)
Optimal Impulse Costs
• Total cost: limit $J_{x,y}(\nu) = \lim_k J_{x,y}(\nu|\vartheta_{k+1})$. However, it is convenient to construct (in an $\infty\times$ copy-space) a probability $P^{\nu}_{x,y}$ such that
\[
J_{x,y}(\nu) = E^{\nu}_{x,y}\Big\{ \int_0^{\infty} e^{-\alpha t} f(x^{\nu}_t, y^{\nu}_t)\,dt + \sum_{i=1}^{\infty} e^{-\alpha\vartheta_i} c(x^{i-1,\nu}_{\vartheta_i}, \xi_i) \Big\}, \tag{17}
\]
• $V$ (or $V_0$) = admissible (or zero-admissible) impulse controls, all relative to the initial condition $(x_0, y_0) = (x, y)$. Optimal cost:
\[
v(x,y) = \inf\big\{ J_{x,y}(\nu) : \nu \in V \big\}, \quad \forall (x,y) \in E \times [0,\infty[, \tag{18}
\]
and also, the ‘time-homogeneous’ impulse control, with an optimal cost:
\[
v_0(x,y) = \inf\big\{ J_{x,y}(\nu) : \nu \in V_0 \big\}, \quad \forall (x,y) \in E \times [0,\infty[, \tag{19}
\]
which is actually needed only for $y = 0$.
Formal Analysis, Dynamic Programming
4.- Formal Analysis Dynamic Programming - IC
Dynamic Programming - State Model - Time Homogeneous
Two models for the constraint ‘wait to intervene until a signal arrives’:
(a) v(x , y) translated as ‘intervene only when y = 0 and t > 0’;
(b) (Markov/stationary) v0(x , y) translated as ‘intervene only when y = 0’.
(1) Time homogeneous? Replace $v(x,y)$ by $v(x,y,s)$, with $v(x,y,0) = v(x,y)$, and for $s > 0$,
\[
J_{x,y,s}(\nu) = E^{\nu}_{x,y}\Big\{ \int_0^{\infty} e^{-\alpha(t-s)} f(x^{\nu}_{s+t})\,dt + \sum_i e^{-\alpha(\vartheta_i - s)} c(x^{i-1,\nu}_{s+\vartheta_i}, \xi_i) \Big\},
\]
\[
v(x,y,s) = \inf\big\{ J_{x,y,s}(\nu) : \nu \text{ any admissible impulse control} \big\},
\]
where now ‘admissible’ means ‘intervene only when $y = 0$ and $s \ne 0$’.
(2) Multiple impulses? They are excluded from the optimal decision if
\[
c(x, \xi_1) + c(\xi_1, \xi_2) \ge c(x, \xi_2), \quad \forall x \in E,\ \xi_1, \xi_2 \in \Gamma, \quad (>?) \tag{20}
\]
and $v(x,y) \ge v_0(x,y)$, $\forall x \in E$, $y \ge 0$, with equality if $y > 0$, and
\[
v_0(x,0) = \min\Big\{ v(x,0),\ \inf_{\xi \in \Gamma}\{ v(\xi,0) + c(x,\xi) \} \Big\}, \quad \forall x \in E, \tag{21}
\]
hold true.
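Condition (20) is easy to test when $E$ and $\Gamma$ are finite. The following sketch (function name and the two sample cost functions are hypothetical) checks the inequality over all triples; it illustrates that a cost growing linearly with the jump size satisfies (20), while a quadratic one can make two small jumps cheaper than one large jump.

```python
from itertools import product


def excludes_multiple_impulses(c, E, Gamma):
    """Check c(x, a) + c(a, b) >= c(x, b) for all x in E and a, b in Gamma."""
    return all(c(x, a) + c(a, b) >= c(x, b) for x, a, b in product(E, Gamma, Gamma))


E = range(5)
Gamma = [0, 2, 4]
linear_ok = excludes_multiple_impulses(lambda x, xi: 1 + abs(x - xi), E, Gamma)
quadratic_ok = excludes_multiple_impulses(lambda x, xi: 1 + (x - xi) ** 2, E, Gamma)
```

For the quadratic cost, the triple $x = 0$, $\xi_1 = 2$, $\xi_2 = 4$ gives $5 + 5 < 17$, so (20) fails.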
4.- Formal Analysis Dynamic Programming - Weak DP Equation
Dynamic Programming - Weak DP/HJB Equation
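The relation
\[
u(x,y) = E_{x,y}\Big\{ \int_0^{\tau_1} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\tau_1} u_0(x_{\tau_1}, 0) \Big\}. \tag{22}
\]
An equation with $u$ alone
\[
u(x,y) = E_{x,y}\Big\{ \int_0^{\tau_1} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\tau_1} \min\Big\{ u(x_{\tau_1}, 0),\ \inf_{\xi_1 \in \Gamma}\{ u(\xi_1, 0) + c(x_{\tau_1}, \xi_1) \} \Big\} \Big\},
\]
or equivalently
\[
u(x,y) = \big( R(\min\{ u(\cdot,0),\ Mu(\cdot,0) \}) \big)(x,y), \quad \forall x, y, \tag{23}
\]
Also
\[
u_0(x,0) = \min\{ u(x,0),\ (Mu(\cdot,0))(x) \} = \min\Big\{ (Mu_0(\cdot,0))(x),\ E_{x,0}\Big\{ \int_0^{\tau_1} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\tau_1} u_0(x_{\tau_1}, 0) \Big\} \Big\}.
\]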
Solving HJB Equation
5.- Solving HJB Equation Existence and Uniqueness
HJB Equation - Existence and Uniqueness
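Either $-Au^0 + \alpha u^0 = f$ or
\[
u^0(x) = E_x\Big\{ \int_0^{\infty} e^{-\alpha t} f(x_t)\,dt \Big\}, \quad \forall x \in E, \tag{24}
\]
\[
v, v_0 \in C(u^0) = \{ \varphi \in C(E) : 0 \le \varphi(x) \le u^0(x),\ \forall x \in E \} \tag{25}
\]
HJB equation with $u_0(x) = u_0(x,0)$, either $u_0 = \min\{Mu_0, Ru_0\}$ or
\[
u_0(x) = \min\Big\{ \inf_{\xi \in \Gamma}\{ u_0(\xi) + c(x,\xi) \},\ E_{x,0}\Big\{ \int_0^{\tau_1} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\tau_1} u_0(x_{\tau_1}) \Big\} \Big\} \tag{26}
\]
Consider the scheme $u^n_0 = \min\{Mu^{n-1}_0, Ru^n_0\}$, with $u^0_0 = u^0$, $n \ge 1$. This equation has a unique solution in $C(E)$, and $u^n_0 = v^n_0(\cdot, 0)$ with
\[
v^n_0(x,y) = \inf_{\theta} E_{x,y}\Big\{ \int_0^{\theta} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\theta} Mv^{n-1}_0(x_\theta, 0) \Big\}, \tag{27}
\]
where the minimization is over all zero-admissible stopping times $\theta$, i.e., $\theta = \tau_\eta$ for some discrete stopping time $\eta \ge 0$.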
5.- Solving HJB Equation Existence and Uniqueness (Thm 1)
Existence and Uniqueness (Thm 1: u0)
Theorem
Under the previous assumptions, the decreasing sequence $\{u^n_0 : n \ge 1\}$ converges to $u_0$, and there are two constants $C > 0$ and $\rho$ in $]0,1[$ such that
\[
0 \le u^n_0(x) - u_0(x) \le C\rho^n, \quad \forall x \in E,\ n \ge 1. \tag{28}
\]
Moreover, the HJB equation (26) has one and only one solution $u_0$ within the interval $C(u^0)$, and the representation
\[
u_0(x) = \inf_{\theta} E_{x,0}\Big\{ \int_0^{\theta} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\theta} Mu_0(x_\theta) \Big\}, \tag{29}
\]
holds true, where the minimization is over all zero-admissible stopping times $\theta$, i.e., $\theta = \tau_\eta$ for some discrete stopping time $\eta \ge 0$.
5.- Solving HJB Equation Existence and Uniqueness (Thm 2)
Existence and Uniqueness (Thm 2: u)
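Theorem
Under the previous assumptions, $u^n \to u$, and there exist $C > 0$ and $\rho$ in $]0,1[$ such that
\[
0 \le u^n(x) - u(x) \le C\rho^n, \quad \forall x \in E,\ n \ge 1. \tag{30}
\]
Moreover, the HJB equation (23) has one and only one solution $u$ within the interval $C(u^0)$, and
\[
u(x) = \inf_{\theta} E_{x,0}\Big\{ \int_0^{\theta} e^{-\alpha t} f(x_t)\,dt + e^{-\alpha\theta} Mu(x_\theta) \Big\}, \tag{31}
\]
where $\theta$ is admissible, i.e., $\theta = \tau_\eta$ for some discrete stopping time $\eta \ge 1$. Furthermore, $u^n$ and $u$ belong to $D(A_{x,y}) \subset C_b(E \times [0,\infty[)$ and
\[
-A_{x,y}u(x,y) + \alpha u(x,y) + \lambda(x,y)\big[ u(x,0) - Mu(\cdot,0)(x) \big]^{+} = f(x) \tag{32}
\]
\[
-A_{x,y}u^n(x,y) + \alpha u^n(x,y) + \lambda(x,y)\big[ u^n(x,0) - Mu^{n-1}(\cdot,0)(x) \big]^{+} = f(x) \tag{33}
\]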
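The scheme (33) and its limit (32) can be tried out on a toy problem. The sketch below is an illustration under strong assumptions that are not part of the talk: a finite state space with rate matrix `Q` in place of $A$, a constant intensity `lam`, and no dependence on $y$, so that (32) reads $-Qu + \alpha u + \lambda[u - Mu]^{+} = f$. The names `solve_obstacle`, `make_M` and all numbers are hypothetical. The iteration starts from the cost without interventions and freezes the obstacle $Mu^{n-1}$ at each step.

```python
INF = float("inf")


def solve_obstacle(Q, f, obstacle, alpha, lam, tol=1e-12, max_iter=100_000):
    """Jacobi iteration for  -Qu + alpha*u + lam*[u - obstacle(u)]^+ = f."""
    n = len(f)
    u = [0.0] * n
    for _ in range(max_iter):
        obs = obstacle(u)
        new = [
            (f[x] + lam * min(u[x], obs[x])
             + sum(Q[x][z] * u[z] for z in range(n) if z != x))
            / (alpha + lam - Q[x][x])
            for x in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, u)) < tol:
            return new
        u = new
    raise RuntimeError("fixed-point iteration did not converge")


def make_M(c, Gamma):
    """Intervention operator (Mu)(x) = min over xi in Gamma of c(x, xi) + u(xi)."""
    return lambda u: [min(c(x, xi) + u[xi] for xi in Gamma) for x in range(len(u))]


Q = [[-1.0, 1.0, 0.0], [0.5, -1.0, 0.5], [0.0, 1.0, -1.0]]
f = [0.0, 3.0, 8.0]
alpha, lam = 0.5, 2.0
M = make_M(lambda x, xi: 1.0 + 0.5 * abs(x - xi), Gamma=[0])

# Direct solve of the analogue of (32): the obstacle Mu depends on u itself.
u = solve_obstacle(Q, f, M, alpha, lam)

# Analogue of (33): first term has no interventions, then obstacle M u^{n-1} is frozen.
seq = [solve_obstacle(Q, f, lambda w: [INF] * len(w), alpha, lam)]
for _ in range(60):
    frozen = M(seq[-1])
    seq.append(solve_obstacle(Q, f, lambda w, frozen=frozen: frozen, alpha, lam))
```

In this toy setting the iterates decrease toward the direct solution, which is the qualitative behaviour stated in (30).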
5.- Solving HJB Equation Existence and Uniqueness (Thm 3 & 4)
Solution: Optimal Cost and Feedback (Thm 3 & 4)
Theorem
Under the previous assumptions, the unique solution of the HJB equation (23) is the optimal cost (18), i.e.,
\[
u(x,y) = \inf\big\{ J_{x,y}(\nu) : \nu \text{ any admissible impulse control} \big\}, \tag{34}
\]
for every $(x,y)$ in $E \times [0,\infty[$.
Theorem
Under the previous assumptions, the first exit time of the continuation region $[u < Mu]$ provides an optimal admissible impulse control.
Extensions
6.- Extensions Other Impulse Control Problems
Extensions - Other Impulse Control Problems
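• Instead of $\Gamma$, one may use a ‘variable’ $\Gamma(x)$,
\[
\Gamma : E \to 2^{E}, \quad\text{with } \Gamma(x) \text{ closed } \forall x \in E.
\]
The impulse intervention operator
\[
M\varphi(x) = \inf\{ \varphi(\xi) + c(x,\xi) : \xi \in \Gamma(x) \}, \quad \forall x \in E,
\]
should have the properties
\[
\begin{aligned}
&\text{(a) } M \text{ maps } C_b(E) \text{ continuously into itself,} \\
&\text{(b) there exists a Borel measurable minimizer for } M,
\end{aligned} \tag{35}
\]
where a minimizer means a Borel function $\xi : E \times C_b(E) \to E$ satisfying
\[
\xi(x,\varphi) \in \Gamma(x), \quad M\varphi(x) = \varphi(\xi(x,\varphi)) + c(x, \xi(x,\varphi)), \quad \forall x, \varphi.
\]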
Sometimes, the space Cb(E ) is replaced by another suitable space.
6.- Extensions Unbounded & Other Signals
Unbounded & Other Signals
• Initially $E$ compact, but $E = \mathbb{R}^d$, or in general, $E$ locally compact
• $f$ with polynomial growth, in $C_p(\mathbb{R}^d)$, $C_p(H)$, $H$ = Hilbert
• Signal with a sequence $\{T_1, T_2, \dots\}$ of independent identically distributed variables (IID conditionally to $\{x_t : t \ge 0\}$), or pure jump Markov processes, semi-Markov processes, piecewise-deterministic Markov processes and diffusion processes with jumps. Essentially, it suffices to add a new variable to the initial process $\{x_t : t \ge 0\}$ to be reduced to the current situation.
• Instead of assuming $y_t$ in $\mathbb{R}^{+}$, we can consider the case $y_t$ in $[0, y^{\star}[$, for some finite $y^{\star} > 0$, and therefore $\pi([0, y^{\star}[) = 1$
• In the case of IID variables independent of $\{x_t : t \ge 0\}$, we can consider a ‘general’ distribution $\pi_0$, without a density.
Hybrid Control, Stopping, Impulse, Switching
7. Hybrid Control Abstract Model
Hybrid Abstract Model
• a lot of references, many models, ...
• a book in preparation with Hector (Jasso-Fuentes) & Maurice (Robin)
• time $t$ in $[0,\infty[$, state variable $(x,n)$ in $S \subset \mathbb{R}^d \times \mathbb{R}^m$
• trajectories piecewise continuous in $x$, piecewise constant in $n$
• the continuous-type component $x$ may include some discrete evolution (i.e., taking discrete values) needed to trigger a discrete transition (or event), produced when $x$ hits the set-interface $D_n = \{ x : (x,n) \in D \}$
• $n$ is sometimes called ‘mode/regime’; example of evolution equation:
\[
\begin{aligned}
&(\dot x(t), \dot n(t)) = \big( g(x(t), n(t_i), v(t)),\ 0 \big), \quad\text{for } t \in [t_i, t_{i+1}[, \\
&(x(t_i), n(t_i)) = \big( X(x(t_i), n(t_i), k_i),\ N(x(t_i), n(t_i), k_i) \big), \quad i = 0, 1, \dots \\
&t_{i+1} \equiv T(x(\cdot), n(t_i), t_i, D) := \inf\{ t \ge t_i : (x(t), n(t_i)) \in D \}, \quad\text{if } t_i < \infty,
\end{aligned}
\]
with initial condition $(x(0), n(0)) = (x_0, n_0) \in S$, $t_0 = 0$.
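The evolution rule above can be illustrated with a tiny uncontrolled example. The sketch below is only an explicit Euler approximation under assumptions made for illustration: no controls $v(t)$ or $k_i$, a scalar $x$, two modes, and a thermostat-like interface; the names `simulate`, `g`, `in_D` and `jump` and all numbers are hypothetical. The state flows with $\dot x = g(x,n)$ while $(x,n)$ is outside $D$ and the transition $(X, N)$ is applied as soon as $(x,n)$ is found in $D$.

```python
def simulate(x0, n0, g, in_D, jump, dt, horizon):
    """Euler sketch of a hybrid trajectory; returns the path and the switching times."""
    t, x, n = 0.0, x0, n0
    path, switch_times = [(t, x, n)], []
    for _ in range(int(round(horizon / dt))):
        if in_D(x, n):
            x, n = jump(x, n)          # discrete transition (X, N) on the interface
            switch_times.append(t)
        x += dt * g(x, n)              # continuous evolution, mode n kept constant
        t += dt
        path.append((t, x, n))
    return path, switch_times


def g(x, n):
    """Mode 1 heats toward 30, mode 0 cools toward 10."""
    return (30.0 if n == 1 else 10.0) - x


def in_D(x, n):
    """Set-interface: too hot while heating, or too cold while cooling."""
    return (n == 1 and x >= 22.0) or (n == 0 and x <= 18.0)


def jump(x, n):
    """Keep x, flip the mode; the image lies outside D, as the theory requires."""
    return x, 1 - n


path, switch_times = simulate(20.0, 1, g, in_D, jump, dt=0.01, horizon=5.0)
```

Because the jump maps $D$ into its complement, two switchings are always separated by a positive time, which is the role of the separation assumption in the automaton case.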
7. Hybrid Control Automaton Case
Automaton Case
Theorem
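Under the assumptions
\[
\begin{aligned}
&(X, N) : D \times K \to S \setminus D, \ \text{uniformly continuous}, \\
&g : S \times V \to \mathbb{R}^d, \ \text{continuous}, \\
&|g(x,n,v)| \le C(1 + |x|), \quad \forall x, n, v, \\
&|g(x,n,v) - g(x',n,v)| \le M|x - x'|, \quad \forall x, x', n, v, \\
&\inf_{k \in K}\ \inf_{(x,n),(\xi,\eta) \in D} \big\{ |\xi - X(x,n,k)| + |\eta - N(x,n,k)| \big\} \ge c > 0,
\end{aligned}
\]
the trajectories are well defined.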
• Stochastic Dynamics, SDEs, SPDEs, general Markov-Feller processes,. . .
7. Hybrid Control General Case
General Case
The data is as follows:
state-space $S \subset \mathbb{R}^d \times \mathbb{R}^m$, open or closed,
minimal set-interface $D_{\wedge} \subset S$, closed,
maximal set-interface $D_{\vee} \subset S$, closed, $D_{\wedge} \subset D_{\vee}$,
control-spaces $V \times K \subset \mathbb{R}^p \times \mathbb{R}^q$, compact,
set-constraints $R_V \subset S \times V$ and $R_K \subset S \times K$, closed.
The dynamics follows the rule:
(1) a mandatory jump or switching is applied on $D_{\wedge}$,
(2) a continuous evolution takes place on $S \setminus D_{\vee}$,
(3) the controller chooses either (1) or (2) on $D_{\vee} \setminus D_{\wedge}$.
7. Hybrid Control Stopping/Impulse/Switching
Stopping/Impulse/Switching with Constraint
Particular cases: $(x,y) \in S = E \times [0,\infty[$, $D_{\wedge} = \emptyset$, $D_{\vee} = E \times \{0\}$, i.e., the control is allowed only when $y = 0$ (the ‘signal’ has arrived).
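The three-way rule of the general case, specialized to this signal constraint, can be written as a short classification. This is a sketch with hypothetical names (`allowed_moves`, `in_D_min`, `in_D_max`), where membership in the minimal and maximal set-interfaces is supplied as predicates.

```python
def allowed_moves(state, in_D_min, in_D_max):
    """Moves available at `state`: 'jump', 'flow', or both."""
    if in_D_min(state):
        return {"jump"}                # mandatory jump or switching on the minimal interface
    if not in_D_max(state):
        return {"flow"}                # continuous evolution outside the maximal interface
    return {"flow", "jump"}            # controller's choice in between


# Signal constraint: state = (x, y), minimal interface empty, maximal interface E x {0}.
def in_D_min(state):
    return False


def in_D_max(state):
    return state[1] == 0.0


at_signal = allowed_moves(("x", 0.0), in_D_min, in_D_max)
between_signals = allowed_moves(("x", 0.7), in_D_min, in_D_max)
```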
In our Optimal Stopping Time Problem (with constraint) we have twocosts, u and u0 related by
\[
u(x,y) = E_{x,y}\big\{ e^{-\alpha\tau_1} u_0(x_{\tau_1}, 0) \big\}, \quad \forall (x,y) \in E \times [0,\infty[. \tag{36}
\]
The cost u0 corresponds to the hybrid problem ‘control only when y = 0’,but the cost that we are interested in is u.
Current Works (besides with M. Robin): in hybrid control
• (with Hector Jasso-Fuentes) Relaxation and linear programs in hybrid control problems, ...
• (with Hector Jasso-Fuentes and Tomas Prieto-Rumeau) Discrete-Time Hybrid Control in Borel Spaces, ...
THANKS!