On Stopping Times and Impulse Control with Constraint

Jose Luis MenaldiBased on joint papers with M. Robin (2016, 2017)

Department of MathematicsWayne State UniversityDetroit, Michigan 48202, USA(e-mail: menaldi@wayne.edu)

IMA, Minneapolis, MNMay 7–11, 2018

Wed May 09, 2018, 10:00-10:50

JLM (WSU) Control Problems with Constraint 1 / 24

1.- Introduction Table of Content

Content

1.- Introduction: Stopping, Impulse, and Signal

2.- Stopping with Constraint

3.- Impulse Control with Constraint - Setting, Assumptions

4.- . . . - Formal Analysis, Dynamic Programming

5.- . . . - Solving HJB Equation

6.- . . . - Extensions

7.- Hybrid Control, Stopping, Impulse, Switching

1.- Introduction Stopping & Impulse

Optimal Stopping Time & Impulse Control Problem

Control θ = stop evolution of a Markov-Feller xt on a Polish spaceOptimal Problem = maximize/minimize the reward/cost functional

Jx(θ, ψ) = Ex

∫ θ

0e−αt f (xt)dt + e−αθψ(xθ)

θ stopping time, f , ψ ≥ 0 (running & terminal), α > 0 (discount).Optimal cost:

v(x) = infJx(θ, ψ) : θ stopping time

Cost functional (with intervention)

Jx(ν) = Ex

∫ ∞0

e−αt f (xνt )dt +∞∑i=0

e−αϑi c(xνϑi , ξi ), (2)

for any ν impulse/switching control, and the optimal cost function

v(x) = infνJx(ν).

1.- Introduction Stopping & Impulse - Constraint

VI & QVI - Stopping & Impulse - Constraint

The dynamic programming gives a HJB equation (VI or QVI) of the form

minAv − αv + f , ψ − v

= 0 or min

Av − αv + f ,Mv − v

with A = infinitesimal generator of Markov-Feller process xt , t ∈ [0,∞[(with states in E , Polish space), and Mu(x) = infξc(x , ξ) + u(ξ).This QVI can be solved by using the sequence of variational inequalities

minAvn − αvn + f ,Mvn−1 − vn

where vn is the optimal cost for the problem with stopping cost Mvn−1.

Bensoussan and Lions (book 1978), Robin (thesis 1978), Menaldi (thesis 1980), etc, etc,Peskir and Shiryaev (book 2006), etc, etc, etc. (extremely long list, with variations)Constraint: ‘Wait for a signal’. For instance, the system evolves according to a Wienerprocess wt (with drift b and diffusion σ) and the signal is the jumps of a Poisson processNt , independent of the Wiener process. Dupuis and Wang (2002) [also, Lempa (2012),Liang and Wei (2014)] studied this case for a geometric Brownian process.

1.- Introduction Signal

Constraint: Wait for a Signal

The dynamic programming equation (or HJB equation) takes the form

Au − αu = λ[u − ψ]+, with f = 0,

where the verification theorem is based on an explicit solution of the HJBequation and on the solution of a discrete time problem.

Objective: To generalize the above problem in two directions:

- to replace the 1-d Wiener process with a Markov-Feller process,

- to replace the Poisson process by a more general counting process.

The simplest model τn : n ≥ 1 are the jumps of a Poisson process, i.e., the timebetween two consecutive jumps T1 = τ1, T2 = τ2 − τ1, T3 = τ3 − τ2, . . . isnecessarily an IID sequence, exponentially distributed. However, we focus first onTn : n ≥ 1 being a sequence of IID random variables with common law π0 (notexponential in general, i.e., not memoryless).

1.- Introduction Time Elapsed Since Last Signal

Signal as a Process

Let us introduce an homogeneous Markov process yt : t ≥ 0,yt ∈ [0,∞[, independent of xt : t ≥ 0 ‘time elapsed since the last signal’.So that almost surely, the cad-lag paths t 7→ yt are piecewise differentiableyt = 1, with jumps only back to zero, and expected infinitesimal generator

A1ϕ(y) = ∂yϕ(t) + λ(y)[ϕ(0)− ϕ(y)], ∀y ≥ 0, (3)

where the intensity λ(y) ≥ 0 is a Borel measurable function.Thus, the signals are defined as functionals on yt : t ≥ 0, namely,

τ0 = 0, τn = inft > τn−1 : yt = 0

, n ≥ 1. (4)

The couple (xt , yt) : t ≥ 0 is an homogeneous Markov process incontinuous time, and a signal arrives at a random instant τ if and only ifyτ = 0 (and τ > 0).

Stopping with Constraint

2.- Stopping Problem Setting Stopping

Two Optimal Costs - Stopping Problem

• Cb = Cb(E × [0,∞[) continuous & bounded functions on E × [0,∞[,• (xt , yt) : t ≥ 0 Markov-Feller process, i.e., for any ϕ(x , y) in Cb,

(x , y , t) 7→ Ex ,y

ϕ(xt , yt)

is continuous, and (5)

α > 0 and f , ψ ≥ 0,∈ Cb(E × [0,∞[), (6)

• The cost function is

Jx ,y (θ, ψ) = Ex ,y

∫ θ

0e−αt f (xt , yt)dt + e−αθψ(xθ, yθ)

which yields an optimal cost

v(x , y) = infJx ,y (θ, ψ) : θ > 0, yθ = 0

i.e., θ is any admissible stopping time, and an auxiliary optimal cost isdefined as

v0(x , y) = infJx ,y (θ, ψ) : yθ = 0

which is an homogeneous Markov model.JLM (WSU) Control Problems with Constraint 7 / 24

2.- Stopping Problem HJB Stopping

Dynamic Programming - HJB Equations - Optimal Stopping Cost

If τ = inft > 0 : yt = 0

then HJB equations

u0(x , 0) = minψ,Ex ,y

∫ τ

0e−αt f (xt , yt)dt + e−ατu0(xτ , yτ )

u(x , y) = Ex ,y

∫ τ

0e−αt f (xt , yt)dt + e−ατ minψ, u(xτ , yτ )

, (10)

are the corresponding HJB equations, and both problems are related bythe equation

u(x , y) = Ex ,y

∫ τ

0e−αt f (xt , yt)dt + e−ατu0(xτ , yτ )

. (11)

Note that yτ = 0 and u0(x , y) = u(x , y) for any y > 0.

2.- Stopping Problem Solving Stopping

Solving Optimal Stopping Problems

Theorem

Assume (5) & (6). Then VI (9) & (10) have each a unique solution inCb(E × [0,∞[), which are the optimal costs (7) & (8). Moreover, the firstexit time from the continuation region [u < ψ] is optimal (discrete s.t.)

θ = inft > 0 : u(xt , yt) ≥ ψ(xt , yt), yt = 0,

θ0 = inft ≥ 0 : u0(xt , yt) = ψ(xt , yt), yt = 0

are optimal, namely, u(x , y) = Jx ,y (θ, ψ) and u0(x , y) = Jx ,y (θ0, ψ).Furthermore, the relation (11) holds.

Also u0 = minu, ψ, which may 6∈ D(Ax,y ) ⊂ Cb(E × [0,∞[). However, theoptimal cost u given by (7) belongs to D(Ax,y ) and for any (x , y) in E × [0,∞[,

(HJB) − Ax,yu(x , y) + αu(x , y) + λ(x , y)[u(x , 0)− ψ(x , y)]+ = f (x , y). (13)

IC w/Constraint: Setting, Assumptions

3.- Impulse Control Assumptions

Impulse Control Setting - Evolution

If xt represents the state of the ‘uncontrolled’ system then a sequenceν = (ϑi , ξi ) : i ≥ 1 of stopping times ϑ1 ≤ ϑ2 ≤ · · · −→ ∞ andE -valued random variables ξ1, ξ2, . . . is called an impulse control.The controlled system xνt : t ≥ 0 behaves likes xt for t within [ϑi−1, ϑi [,i ≥ 1, and xνϑi = ξi , in short, instantaneously moves from xν−ϑi (xνϑi−) to ξi .

A cost-per-impulse

c : E × Γ −→ [c0,∞[, c is continuous and c0 > 0, (14)

with a closed subset Γ of E (= set where to jump / admissible jumps).• (xt , yt) : t ≥ 0 is a time-homogeneous Markov-Feller process, withinfinitesimal generator, x ∈ E , y ≥ 0,

Ax ,yϕ(x , y) = Axϕ(x , y) + ∂yϕ(x , y) + λ(x , y)[ϕ(x , 0)− ϕ(x , y)], (15)

i.e., now the intensity λ depends also on x .

3.- Impulse Control Setting

Cost per Interventions / Impulse costs

• First: ‘admissible’ impulse control ν = (ϑi , ξi ) : i ≥ 1 if yϑi = 0,∀i ≥ 1, and ϑ1 > 0. If ϑ1 = 0 is allowed then ν is ‘zero-admissible’.• 1st decision: choose F-st ϑ1, a Fϑ1-meas. rv ξ1 : Ω→ Γ ⊂ E , cost up toϑ1

Jx ,y (ν|ϑ1) = Eν|ϑ0x ,y

∫ ϑ1

f (xt , yt)e−αtdt + e−αϑ1c(xϑ1 , ξ1)

, ϑ0 = 0,

• 2nd decision: choose a F-st ϑ2, a Fϑ2-meas. rv ξ2 : Ω→ Γ ⊂ E , cost upto ϑ2

Jx ,y (ν|ϑ2) = Jx ,y (ν|ϑ1) + Eν|ϑ1x ,y

∫ ϑ2

f (xt , yt)e−αtdt + e−αϑ2c(xϑ2 , ξ2)

• Iterating, cost upto ϑk+1 (ϑk+1 <∞ means ‘intervention’)

Jx ,y (ν|ϑk+1) = Jx ,y (ν|ϑk) + Eν|ϑkx ,y

∫ ϑk+1

f (xt , yt)e−αtdt +

+ e−αϑk+1c(xϑk+1, ξk+1)

. (16)

3.- Impulse Control Setting (cont. 1)

Optimal Impulse Costs

• Total cost: limit Jx ,y (ν) = limk Jx ,y (ν|ϑk+1). However, it is convenientto construct (in an ∞× copy-space) a probability Pνx ,y such that

Jx ,y (ν) = Eνx ,y∫ ∞

0e−αt f (xνt , y

νt )dt +

∞∑i=1

e−αϑi c(x i−1,νϑi

, ξi ), (17)

• V (or V0) = admissible (or zero-admissible) impulse controls, all relativeto the initial condition (x0, y0) = (x , y). Optimal cost:

v(x , y) = infJx ,y (ν) : ν ∈ V

, ∀(x , y) ∈ E × [0,∞[, (18)

and also, the ‘time-homogeneous’ impulse control, with an optimal cost:

v0(x , y) = infJx ,y (ν) : ν ∈ V0

, ∀(x , y) ∈ E × [0,∞[, (19)

which is actually needed only for y = 0.Formal Analysis, Dynamic Programming

4.- Formal Analysis Dynamic Programming - IC

Dynamic Programming - State Model - Time Homogeneous

Two models: constraint ‘wait to intervene until a signal arrive’:

(a) v(x , y) translated as ‘intervene only when y = 0 and t > 0’;

(b) (Markov/stationary) v0(x , y) translated as ‘intervene only when y = 0’.

(1) Time homogeneous? v(x , y) v(x , y , s), v(x , y , 0) = v(x , y) and fors > 0,

Jx,y ,s(ν) = Eνx,y∫ ∞

e−α(t−s)f (xνs+t)dt +∑i

e−α(ϑi−s)c(x i−1,νs+ϑi

, ξi ),

v(x , y , s) = infJx,y ,s(ν) : ν any admissible impulse control

where now ‘admissible’ means ‘intervene only when y = 0 and s 6= 0’.

(2) Multiple impulses? Are excluded from the optimal decision if

c(x , ξ1) + c(ξ1, ξ2) ≥ c(x , ξ2), ∀x ∈ E , ξ1, ξ2 ∈ Γ. (>?) (20)

and v(x , y) ≥ v0(x , y), ∀x ∈ E , y ≥ 0, with = if y > 0,

v0(x , 0) = minv(x , 0), inf

ξ∈Γv(ξ, 0) + c(x , ξ)

, ∀x ∈ E .

hold true.JLM (WSU) Control Problems with Constraint 13 / 24

4.- Formal Analysis Dynamic Programming - Weak DP Equation

Dynamic Programming - Weak DP/HJB Equation

The relations

u(x , y) = Ex,y

∫ τ1

e−αt f (xt)dt + e−ατ1u0(xτ1 , 0). (22)

An equation with u alone

u(x , y) = Ex,y

∫ τ1

e−αt f (xt)dt +

+ e−ατ1 minu(xτ1 , 0), inf

ξ1∈Γu(ξ1, 0) + c(xτ1 , ξ1)

or equivalently

u(x , y) =(R(minu(·, 0), (Mu(·, 0))

)(x , y), ∀x , y , (23)

u0(x , 0) = minu(x , 0), (Mu(·, 0))(x)

(Mu0(·, 0))(x), Ex,0

∫ τ1

e−αt f (xt)dt + e−ατ1u0(xτ1 , 0)

Solving HJB Equation

6.- Solving HJB Equation Existence and Uniqueness

HJB Equation - Existence and Uniqueness

Either −Au0 + αu0 = f or

u0(x) = Ex

∫ ∞0

e−αt f (xt)dt, ∀x ∈ E , (24)

v , v0 ∈ C (u0) = ϕ ∈ C (E ) : 0 ≤ ϕ(x) ≤ u0(x), ∀x ∈ E (25)

HJB equation with u0(x) = u0(x , 0), either u0 = minMu0,Ru0 or

u0(x) = min

infξ∈Γu0(ξ) + c(x , ξ),

, Ex,0

∫ τ1

e−αt f (xt)dt + e−ατ1u0(xτ1 )

Consider the scheme un0 = minMun−10 ,Run0, with u0

0 = u0, n ≥ 1. This equationhas a unique solution in C (E ), and un0 = vn

0 (·, 0) with

vn0 (x , y) = inf

θEx,y

∫ θ

e−αt f (xt)dt + e−αθMvn−10 (xθ, 0)

, (27)

where the minimization is over all zero-admissible stopping times θ, i.e., θ = τηfor some discrete stopping time η ≥ 0.

6.- Solving HJB Equation Existence and Uniqueness (Thm 1)

Existence and Uniqueness (Thm 1: u0)

Theorem

Under the previous assumptions, the decreasing sequence un0 : n ≥ 1converges to u0, and there are two constants C > 0 and ρ in ]0, 1[ suchthat

0 ≤ un0 (x)− u0(x) ≤ Cρn, ∀x ∈ E , n ≥ 1. (28)

Moreover, the HJB equation (26) has one and only one solution u0 withinthe interval C (u0), and the representation

u0(x) = infθEx ,0

∫ θ

0e−αt f (xt)dt + e−αθMu0(xθ)

, (29)

holds true, where the minimization is over all zero-admissible stoppingtimes θ, i.e., θ = τη for some discrete stopping time η ≥ 0.

6.- Solving HJB Equation Existence and Uniqueness (Thm 2)

Existence and Uniqueness (Thm 2: u)

TheoremUnder previous assumptions, un → u, and ∃ C > 0 and ρ in ]0, 1[ such that

0 ≤ un(x)− u(x) ≤ Cρn, ∀x ∈ E , n ≥ 1. (30)

Moreover, the HJB equation (23) has one and only one solution u withinthe interval C (u0), and

u(x) = infθEx ,0

∫ θ

0e−αt f (xt)dt + e−αθMu(xθ)

, (31)

θ admissible, i.e., θ = τη for some discrete stopping time η ≥ 1.Furthermore, un and u belong to D(Ax ,y ) ⊂ Cb(E × [0,∞[) and

− Ax ,yu(x , y) + αu(x , y) + λ(x , y)[u(x , 0)−Mu(·, 0)(x)

]+= f (x) (32)

− Ax ,yun(x , y) + αu(x , y) + λ

[un(x , 0)−Mun−1(·, 0)(x)

]+= f (x) (33)

6.- Solving HJB Equation Existence and Uniqueness (Thm 3 & 4)

Solution: Optimal Cost and Feedback (Thm 3 & 4)

Theorem

Under the previous assumptions, the unique solution of the HJB equation(23) is the optimal cost (18), i.e.,

u(x , y) = infJx ,y (ν) : ν any admissible impulse control

, (34)

for every (x , y) in E × [0,∞[.

Theorem

Under previous assumptions, the first exit time of the continuation region[u < Mu] provides an optimal admissible impulse control.

Extensions

6.- Extensions Other Impulse Control Problems

Extensions - Other Impulse Control Problems

• Instead of Γ, may use ‘variable’ Γ(x),

Γ : E 7→ 2E , with Γ(x) closed ∀x ∈ E .

Impulse intervention operator

Mϕ(x) = infϕ(x) + c(x , ξ) : ξ ∈ Γ(x)

, ∀x ∈ E ,

should have the properties(a) M maps continuously Cb(E ) into itself,

(b) there exists a Borel measurable minimizer for M,(35)

where a minimizer means a Borel function ξ : E × Cb(E )→ E satisfying

ξ(x , ϕ) ∈ Γ(x), Mϕ(x) = ϕ(x) + c(x , ξ(x , ϕ)), ∀x , ϕ.

Sometimes, the space Cb(E ) is replaced by another suitable space.

6.- Extensions Unbounded & Other Signals

Unbounded & Other Signals

• Initially E compact, but E = Rd , or in general, E locally compact• f with polynomial growth, in Cp(Rd), Cp(H), H = Hilbert• Signal with a sequence T1,T2, . . . of independent identicallydistributed (IID conditionally to xt : t ≥ 0), or pure jump Markovprocesses, semi Markov processes, piecewise-deterministic Markovprocesses and diffusion processes with jumps. Essentially, it suffices to adda new variable to the initial process xt : t ≥ 0 to be reduced to thecurrent situation.• Instead of assuming yt in R+, we can consider the case yt in [0, y?[, forsome finite y? > 0, and therefore π([0, y?[) = 1• In the case of IID variables independent of xt : t ≥ 0, we can considera ‘general’ distribution π0, without a density.

Hybrid Control, Stopping, Impulse, Switching

7. Hybrid Control Abstract Model

Hybrid Abstract Model

• a lot of references, many models, . . .• a book in preparation with Hector (Jasso-Fuentes) & Maurice (Robin)• time t in [0,∞[, state variable (x , n) in S ⊂ Rd × Rm

• trajectories piecewise continuous in x , piecewise constant in n• the continuous-type component x may include some discrete evolution(i.e., taking discrete values) needed to trigger a discrete transition (orevent), produced when x hits the set-interface Dn = x : (x , n) ∈ D• n is sometime called ‘mode/regime’, example of evolution equation:(x(t), n(t)

)=(g(x(t), n(ti ), v(t)), 0

), for t ∈ [ti , ti+1[,

(x(ti ), n(ti )) =(X (x(ti ), n(ti ), ki ),N(x(ti ), n(ti ), ki )

), i = 0, 1, · · ·

ti+1 ≡ T(x(·), n(ti ), ti ,D) := inft ≥ ti : (x(t), n(ti )) ∈ D

, if ti <∞.

with initial condition (x(0), n(0)) = (x0, n0) ∈ S , t0 = 0

7. Hybrid Control Automaton Case

Automaton Case

Theorem

Under the assumptions

(X ,N) : D × K → S r D, uniformly continuous.g : S × V → Rd , continuous

|g(x , n, v)| ≤ C (1 + |x |), ∀x , n, v|g(x , n, v)− g(x ′, n, v)| ≤ M|x − x ′|, ∀x , x ′, n, v

infk∈K

inf(x ,n),(ξ,η)∈D

|ξ − X (x , n, k)|+ |η − N(x , n, k)| ≥ c > 0

the trajectories are well defined.

• Stochastic Dynamics, SDEs, SPDEs, general Markov-Feller processes,. . .

7. Hybrid Control General Case

General Case

The data is as follows:

state-space S ⊂ Rd × Rm, open or closed,

minimal set-interface D∧ ⊂ S , closed,

maximal set-interface D∨ ⊂ S , closed, D∧ ⊂ D∨,

control-spaces V × K ⊂ Rp × Rq, compact,

set-constraints RV⊂ S × V and R

K⊂ S × K , closed.

The dynamics follows the rule:(1) a mandatory jump or switching is applied on D∧ ,

(2) a continuous evolution takes place on S r D∨ ,

(3) the controller chooses either (1) or (2) on D∨ r D∧.

7. Hybrid Control Stopping/Impulse/Switching

Stopping/Impulse/Switching with Constraint

Particular cases: (x , y) ∈ S = E × [0,∞[, D∧ = ∅, D∨ = E × 0, i.e.,the control is allowed only when y = 0 (the ‘signal’ has arrived).

In our Optimal Stopping Time Problem (with constraint) we have twocosts, u and u0 related by

u(x , y) = Ex ,y

e−ατ1u0(xτ1 , 0)

, ∀(x , y) ∈ E × [0,∞[. (36)

The cost u0 corresponds to the hybrid problem ‘control only when y = 0’,but the cost that we are interested in is u.

Current Works (besides with M. Robin): in hybrid control• (with Hector Jasso-Fuentes) Relaxation and linear programs in hybridcontrol problems, . . .• (with Hector Jasso-Fuentes and Tomas Prieto-Rumeau) Discrete-TimeHybrid Control in Borel Spaces, . . .

T H A N K S !JLM (WSU) Control Problems with Constraint 24 / 24

On Stopping Times and Impulse Control with Constraint · 2018. 5. 9. · On Stopping Times and...

Documents

Impulse® Vanes are Bohning vanes made from a IMPULSE® …archery-shop.jp/manual/ICE_VANES_DATA.pdf · IMPULSE® VANES Impulse® Impulse® First Flight Impulse® Deux COLORS AND

Shirinkage Stopping

Calvin on Predestination, Providence, and the Church on... · freedom to sin that fallen mankind possesses. He states that men do not sin “from any outward impulse or constraint,

DeviceHubInstallationGuide€¦ · Skip essential environment installing! .21) SAMS stops successfully Stopping nginx! [ Dk ] Stopping mysqld (via systemctl) Stopping redis—server!

IMPULSE | IMPULSE · IMPULSE | IMPULSE 2013 goItasca.com On The Cover: Impulse 26Q Pewter Pearl Deluxe Graphics Impulse Silver 31WP Sandstone Full-Body Paint 26Q Silver Song with

Stopping the Biological Clockkbuckles/natality.pdfface is the biological time constraint on bearing children, or the “biological clock.” Menken, 2. Trussell, and Larsen [1986]

Stopping Time

Constraint Optimisation Problemsprofs.sci.univr.it/~farinelli/courses/ar/slides/constraintOptimisation.pdf · Constraint Optimisation Problems Constraint Optimisation Cost Netwrkso

1 - Momentum and Impulse - Mr. Alderman's Classesalderma4.weebly.com/uploads/1/8/2/4/18241089/1... · • Stopping times and distances depend on the impulse-momentum theorem. •

Stopping the Gremlins From Stopping You

Stopping A Car This section is about stopping a car safely

Martin Vinsome's Presentation from the event "Stopping them from Stopping you"

Stopping Cybercrime: A Student Workbook - Kentuckykfi.ky.gov/public/Teacher Resources/Stopping Cybercrime Student... · Stopping Cybercrime: A Student Workbook ... Place the tiles

Constraint Propagation: The Heart of Constraint Programming

Net Impulse and Net Impulse Characteristics in Vertical

XML DATA CONSTRAINT AND XINCAML - scitepress.org · Integrity constraint (identity constraint): This kind of constraint describes the reference relationship between elements or attributes

Stopping Power

TECHNICAL MANUAL - SMD · Subconn / Impulse Subconn / Impulse Subconn / Impulse Subconn / Impulse Subconn / Impulse Subconn / Impulse Electrical Output 0-10V 4-20mA 0-10V 4-20mA 0-10V

Momentum, Impulse, and Collisions. Momentum and Impulse

Section C Stopping distances · 2018-09-10 · Stopping distance = reaction distance + braking distance 11. In the column ‘Stopping Distance 1’, calculate the stopping distances