Stochastic Optimal Control Problems, Part I: Deterministic Case
Hasnaa Zidani
ENSTA-Paris, University Paris-Saclay
IMPA, June 20-24, 2016
H. Zidani (ENSTA ParisTech) Stochastic Optimal Control Problems SVAN’2016 1 / 30
Outline
1 Controlled differential systems
2 A Direct Numerical approach
3 Optimality conditions: Pontryagin principle
Outline
1 Controlled differential systems
   Introduction and Examples
   State equation
   Existence of optimal solutions
2 A Direct Numerical approach
   Discrete Optimal Control Problem
   Example
   State of the art
3 Optimality conditions: Pontryagin principle
y: state of the system
u: control input
Find a control law and its corresponding trajectory that optimize some performance measure of the system while complying with prescribed constraints (physical or economical constraints on the control and/or the state).
Consider the problem of minimizing the cost function
\[ \int_0^T \ell(y_t, u_t)\,dt + \phi(y_0, y_T) \]
subject to: $\dot y_t = f(y_t, u_t)$, $t \in (0,T)$,
and the constraints:
Control constraints: $c(u_t) \le 0$, $t \in (0,T)$,
State constraints: $g(y_t) \le 0$, $t \in (0,T)$,
Mixed state and control constraints: $c(u_t, y_t) \le 0$, $t \in (0,T)$,
Initial-final equality and inequality constraints:
$\Phi_i(y_0, y_T) = 0$, $i = 1, \dots, r_1$; $\Psi_i(y_0, y_T) \le 0$, $i = r_1 + 1, \dots, r$.
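Function spaces: Control and state spaces
\[ \mathcal{U} := L^\infty(0,T;\mathbb{R}^m); \qquad \mathcal{Y} := W^{1,\infty}(0,T;\mathbb{R}^d). \]
Their extension to Hilbert spaces:
\[ \mathcal{U}_2 := L^2(0,T;\mathbb{R}^m); \qquad \mathcal{Y}_2 := H^1(0,T;\mathbb{R}^d). \]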
The space race: Goddard problem
Example (Goddard)
\[
\begin{aligned}
\dot h(t) &= v(t), & h(0) &= 0,\\
\dot v(t) &= \frac{u(t)}{m(t)} - g, & v(0) &= 0,\\
\dot m(t) &= -b\,u(t), & m(0) &= m_0,
\end{aligned}
\]
where $h(t)$ is the altitude, $v(t)$ the velocity, $m(t)$ the mass and $u(t)$ the thrust.
• The thrust $u(t)$ is subject to: $0 \le u(t) \le u_{\max}$.
• The rocket's mass satisfies the constraint: $m_1 \le m(t) \le m_2(t)$.
The optimal control problem is the following:
Maximize $h(T)$ subject to $u(t) \in [0, u_{\max}]$, $(h, v, m)$ satisfies the ODE, and $m_1 \le m(t) \le m_2(t)$ for $t \ge 0$.
Launcher's problem: Ariane 5
• Steer the launcher from Kourou to the GEO
• State variables $(r, v) \in \mathbb{R}^3 \times \mathbb{R}^3$:
\[
\begin{aligned}
\dot r &= v,\\
\dot v &= \vec{P} + \vec{F}_T(r, v, u) - \vec{F}_D(r, v, u);
\end{aligned}
\]
$u \in \mathbb{R}^3$ is the thrust force (control input).
• State constraints: heat flux, limited propellant capacity, target constraint (GEO)
Objective function: maximization of the payload.
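Standing assumptions
Assume the set of admissible control inputs is:
\[ \mathcal{U}_{ad} := \{ u \in \mathcal{U} \; ; \; u_t \in U \text{ on } (0,T) \}. \]
(A0) $U$ is a closed set in $\mathbb{R}^m$.
(A1) $f : \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}^d$ is locally Lipschitz continuous.
(A2) For every $x \in \mathbb{R}^d$, $f(x, U)$ is a convex set of $\mathbb{R}^d$.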
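Proposition
Assume (A0)-(A1). Let $x \in \mathbb{R}^d$.
i) For every $u \in \mathcal{U}_{ad}$, there exists $y^u \in H^1([0,T];\mathbb{R}^d)$ solution of the equation: $\dot y^u_t = f(y^u_t, u_t)$, $y^u_0 = x$.
ii) Moreover, the mapping defined by
\[ T(\cdot) : L^2(0,T;\mathbb{R}^m) \to H^1(0,T;\mathbb{R}^d), \qquad u \mapsto T(u) := y^u \]
is continuous.
The set of trajectories starting from $x$ is denoted
\[ S_{[0,T]}(x) := \{ y \mid \exists u \in \mathcal{U}_{ad},\ \dot y_t = f(y_t, u_t),\ y_0 = x \}. \]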
Under (A0)-(A2) and if $U$ is a compact set,
• $S_{[0,T]}(x)$ is a compact set in $W^{1,1}$ endowed with the $C^0$-topology. This result is a consequence of Filippov's theorem, see the books of Vinter (2010) or Aubin-Cellina (1984).
• the set-valued function $x \rightsquigarrow S_{[0,T]}(x)$ is Lipschitz continuous:
\[ \exists L > 0,\quad S_{[0,T]}(x) \subset S_{[0,T]}(z) + L\,|x - z|\,B_{W^{1,1}} \quad \forall x, z \in \mathbb{R}^d. \]
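Example (1)
Min $\int_0^1 y^2(t)\,dt$ subject to $\dot y(t) = u(t)$, $y(0) = 0$, $u(t) \in \{-1, 1\}$.
Consider, for $k = 0, \dots, n-1$, the controls and the corresponding states
\[
u_n(t) = \begin{cases} 1 & \text{on } \left(\frac{2k}{2n}, \frac{2k+1}{2n}\right),\\[2pt] -1 & \text{on } \left(\frac{2k+1}{2n}, \frac{2k+2}{2n}\right), \end{cases}
\qquad
y_n(t) = \begin{cases} t - \frac{k}{n} & \text{on } \left(\frac{2k}{2n}, \frac{2k+1}{2n}\right),\\[2pt] -t + \frac{k+1}{n} & \text{on } \left(\frac{2k+1}{2n}, \frac{2k+2}{2n}\right). \end{cases}
\]
This simple problem doesn't admit a solution:
$y_n \to 0$, but $y \equiv 0$ is not admissible!! Indeed $\|u_n\|_{L^\infty} = \|u_n\|_{L^2} = 1 \not\to 0$.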
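The behaviour of the chattering sequence can be checked numerically. The sketch below is an illustration added here, not part of the original slides: it approximates the cost $\int_0^1 y_n^2\,dt$ of the sawtooth state with a midpoint rule and compares it with the value $1/(12 n^2)$ obtained by integrating one tooth by hand. The function name `chattering_cost` and the parameter `samples_per_piece` are hypothetical.

```python
def chattering_cost(n: int, samples_per_piece: int = 200) -> float:
    """Approximate J(u_n) = int_0^1 y_n(t)^2 dt with the midpoint rule.

    y_n is the sawtooth state of the example: slope +1 on the first half
    of each period of length 1/n, slope -1 on the second half.
    """
    pieces = 2 * n                            # constant-control pieces on (0, 1)
    cells = pieces * samples_per_piece        # quadrature cells
    dt = 1.0 / cells
    total = 0.0
    for j in range(cells):
        t = (j + 0.5) * dt                    # midpoint of the j-th cell
        k = int(t * n)                        # index of the period containing t
        s = t - k / n                         # time elapsed inside that period
        y = s if s < 0.5 / n else 1.0 / n - s # rising, then falling branch
        total += y * y * dt
    return total


costs = {n: chattering_cost(n) for n in (1, 2, 4, 8, 16)}
for n, cost in costs.items():
    print(f"n = {n:2d}   J(u_n) = {cost:.6e}   1/(12 n^2) = {1 / (12 * n * n):.6e}")
```

The cost decreases towards 0 as $n$ grows, while every $u_n$ keeps values in $\{-1, 1\}$, which is why the infimum is not attained.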
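Example (1')
Min $\int_0^1 y^2(t)\,dt$ subject to $\dot y(t) = u(t)$, $y(0) = 0$, $u(t) \in [-1, 1]$.
Consider again, for $k = 0, \dots, n-1$,
\[
u_n(t) = \begin{cases} 1 & \text{on } \left(\frac{2k}{2n}, \frac{2k+1}{2n}\right),\\[2pt] -1 & \text{on } \left(\frac{2k+1}{2n}, \frac{2k+2}{2n}\right), \end{cases}
\qquad
y_n(t) = \begin{cases} t - \frac{k}{n} & \text{on } \left(\frac{2k}{2n}, \frac{2k+1}{2n}\right),\\[2pt] -t + \frac{k+1}{n} & \text{on } \left(\frac{2k+1}{2n}, \frac{2k+2}{2n}\right). \end{cases}
\]
The relaxed control problem admits a solution!
$y_n \to 0$, and $y \equiv 0$ is admissible (with $u \equiv 0$), although $\|u_n\|_{L^\infty} = \|u_n\|_{L^2} = 1 \not\to 0$.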
"First discretize and then optimize"
Consider a general control problem
Min $\phi(y_T) + \int_0^T \ell(y_t, u_t)\,dt$
subject to: $\dot y_t = f(y_t, u_t)$, $t \in (0,T)$, $y_0 = x$,
$c(u_t) \le 0$, $t \in (0,T)$,
$g(y_t) \le 0$, $t \in (0,T)$,
$c(u_t, y_t) \le 0$, $t \in (0,T)$,
$\Phi_i(y_0, y_T) = 0$, $i = 1, \dots, r_1$,
$\Psi_i(y_0, y_T) \le 0$, $i = r_1 + 1, \dots, r$.
The Euler discretization
• $N$: number of time steps, $h_k > 0$ duration of the $k$-th time step
• Steps begin at time $t_0 = 0$, and for $k = 1$ to $N$, $t_k = \sum_{j=0}^{k-1} h_j$
• State equation: $y_{k+1} = y_k + h_k f(y_k, u_k)$, $k = 0, \dots, N-1$.
• Cost function: $\phi(y_N) + \sum_{k=0}^{N-1} h_k\,\ell(y_k, u_k)$
• Running constraints:
$c(u_k) \le 0$; $g(y_k) \le 0$; $c(u_k, y_k) \le 0$, $k = 1, \dots, N-1$.
• Final equality and inequality constraints:
$\Phi_i(y_0, y_N) = 0$, $i = 1, \dots, r_1$; $\Psi_i(y_0, y_N) \le 0$, $i = r_1 + 1, \dots, r$.
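A minimal sketch of the Euler scheme above, ignoring all constraints, may help fix ideas. It is an illustration under assumptions and not the author's implementation: the name `euler_rollout`, the restriction to a scalar control, and the rectangle-rule sum $\sum_k h_k \ell(y_k, u_k)$ used for the integral cost are choices made here, as are the toy dynamics at the bottom.

```python
from typing import Callable, List, Sequence, Tuple

State = List[float]


def euler_rollout(
    f: Callable[[State, float], State],
    ell: Callable[[State, float], float],
    phi: Callable[[State], float],
    y0: State,
    controls: Sequence[float],
    steps: Sequence[float],
) -> Tuple[List[State], float]:
    """Return the Euler states (y_0, ..., y_N) and the discrete cost."""
    if len(controls) != len(steps):
        raise ValueError("need one control value per time step")
    trajectory = [list(y0)]
    running = 0.0
    for u_k, h_k in zip(controls, steps):
        y_k = trajectory[-1]
        running += h_k * ell(y_k, u_k)          # rectangle rule (assumed here)
        dy = f(y_k, u_k)
        # State equation: y_{k+1} = y_k + h_k f(y_k, u_k)
        trajectory.append([y + h_k * d for y, d in zip(y_k, dy)])
    return trajectory, phi(trajectory[-1]) + running


# Toy data (assumed for illustration): y' = u, running cost y^2, no final cost.
N = 100
traj, cost = euler_rollout(
    f=lambda y, u: [u],
    ell=lambda y, u: y[0] ** 2,
    phi=lambda y: 0.0,
    y0=[0.0],
    controls=[1.0] * N,
    steps=[1.0 / N] * N,
)
print(traj[-1], cost)
```

In a direct method, a nonlinear programming solver would treat the vector `controls` (and possibly the states) as unknowns and call such a routine to evaluate the cost and the constraints.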
• Some control problems are "naturally" described by controlled discrete dynamics.
• Indeed, in some cases the controller can act on the control variable only at very specific dates (daily, monthly, ...).
• In this case, the time schedule is fixed and the control problem is already in the form of a complex finite-dimensional control problem.
Example: A production problem
$y_t$: amount of steel produced at time $t$.
$0 \le u_t \le 1$ is the fraction of steel produced at time $t$ and allocated to investment.
The part of $y_t$ allocated to investment is used to increase the production capacity according to the equation:
\[ \frac{dy}{dt}(t) = k\,u_t\,y_t, \]
where $y_0 = A$ is the initial production and $k$ is the coefficient of increase in production.
The optimal control problem consists in choosing $u$ in an optimal way to maximize the production allocated to consumption during a fixed time horizon $T$.
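The trade-off in the production problem can be explored with a small simulation. The sketch below is illustrative only and rests on assumptions made here: the consumed production is read as $\int_0^T (1 - u_t)\,y_t\,dt$, the control invests everything up to a switching time and nothing afterwards, and the values $k = 1$, $A = 1$, $T = 2$ are arbitrary. The name `consumed_output` is hypothetical.

```python
import math


def consumed_output(switch_time: float, k: float = 1.0, A: float = 1.0,
                    T: float = 2.0, N: int = 2000) -> float:
    """Euler estimate of int_0^T (1 - u_t) y_t dt for a one-switch control.

    The control is u = 1 (invest everything) before switch_time and
    u = 0 (consume everything) afterwards.
    """
    h = T / N
    y, consumed = A, 0.0
    for i in range(N):
        u = 1.0 if i * h < switch_time else 0.0
        consumed += h * (1.0 - u) * y   # production allocated to consumption
        y += h * k * u * y              # Euler step for dy/dt = k u y
    return consumed


for ts in (0.0, 0.5, 1.0, 1.5):
    print(f"switch at t = {ts:.1f}: consumed output = {consumed_output(ts):.4f}")
print("e =", math.e)
```

With these sample values, investing first and consuming later yields more than consuming from the start; the switch at $t = T - 1/k = 1$ gives a value close to $e$.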
Questions
In the case of a continuous control problem:
How is the discretized version related to the original continuous control problem?
Given a nominal local solution $(\bar u, \bar y)$ of the original problem:
Does the discretized problem have a solution $(u^h, y^h)$ near $(\bar u, \bar y)$?
Can we expect an error estimate such as $\|u^h - \bar u\| + \|y^h - \bar y\| = O(h)$, where $h := \max_k h_k$?
Is it reasonable to assume that the solution is (piecewise) smooth?
How do we solve the discretized problem?
Example: double integrator (I)
Consider the very simple example with constraints on the control:
• Dynamics: $\ddot y_t = u_t \in [-1, 1]$
• Optimization problem: reach the zero state in minimal time
[Figure: plot with both axes ranging from -2.0 to 2.0; image not recovered]
Example: double integrator (I)
• Solution: Bang-bang optimal control, at most one switching time
• Discretized solution of same nature (costate affine function of time)
• Error only due to the switching time step
• Expected error: at most O(h)
Ref. Alt, Baier, Gerdts, Lempio, Error bounds for Euler approximation of linear-quadratic control problems with bang-bang solutions. 2012.
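The O(h) behaviour can be observed on a small experiment. The sketch below is an illustration added here and not the analysis of the cited reference: it writes the double integrator as $\dot x_1 = x_2$, $\dot x_2 = u$, takes the classical closed-form switching and final times for a start above the switching curve $x_1 = -x_2 |x_2| / 2$, samples that bang-bang control on a uniform grid, and measures how far the Euler trajectory ends from the origin. The name `terminal_miss` and the initial state $(0, 1)$ are assumptions.

```python
import math


def terminal_miss(N: int, x0=(0.0, 1.0)):
    """Euler-integrate x1' = x2, x2' = u under the sampled bang-bang control.

    Assumes x0 lies above the switching curve (x1 + x2|x2|/2 > 0), so the
    time-optimal control is u = -1 up to t_switch and u = +1 afterwards.
    Returns the step size h and the sup-norm distance to the origin at T.
    """
    x1, x2 = x0
    root = math.sqrt(x1 + 0.5 * x2 * x2)
    t_switch = x2 + root            # time at which the control switches
    t_final = x2 + 2.0 * root       # minimal time to reach the origin
    h = t_final / N
    for k in range(N):
        u = -1.0 if k * h < t_switch else 1.0   # control frozen on each step
        x1, x2 = x1 + h * x2, x2 + h * u        # explicit Euler step
    return h, max(abs(x1), abs(x2))


for N in (50, 100, 200, 400, 800):
    h, miss = terminal_miss(N)
    print(f"N = {N:4d}   h = {h:.5f}   terminal miss = {miss:.5f}   miss / h = {miss / h:.3f}")
```

The ratio miss / h stays bounded as the grid is refined, which is the first-order behaviour expected when the switching time falls inside a time step.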
Example: double integrator (II)
Fuller's problem I (work with J. Laurent-Varin)
Same dynamics: $\ddot x_t = u_t \in [-1, 1]$; integral cost $\int_0^T x_t^2\,dt$.
[Figure 2: Fuller problem: optimal control, logarithmic penalty. Horizontal axis from 0 to 8, vertical axis from -1.0 to 1.0.]
Ref. PhD work of J. Laurent-Varin, 2005.
PROS
This method can integrate all types of constraints (state constraints, mixed constraints, etc.)
The discrete problem is a finite-dimensional optimisation problem
CONS
Local approach
Huge number of variables
Stability and convergence results: in some cases, the discretized control problem doesn't have any feasible solution while the original control problem does have a solution!
The discretization of the control problem should take into account the structure of the optimal trajectory
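With a final state constraint.
Min $\phi(y_T)$
subject to: $\dot y_t = f(y_t, u_t)$, $t \in (0,T)$, $y_0 = x_0$,
$\Psi(y_T) = 0$.
The mapping $T : u \mapsto y^u$ is single-valued.
The OCP (P) can be re-written as:
Min $F(u) := J(u, y^u)$,
$u \in \mathcal{U}_{ad}$; $\Psi(T(u)(T)) = 0$.
Reminder (a known result in optimization theory)
$\bar u \in \mathcal{U}_{ad}$ is a minimum of (P) $\Longrightarrow$ $\exists (\lambda_0, \lambda) \neq 0$ such that
\[ \big[ \lambda_0 F'(\bar u) + [\Psi'(T(\bar u)(T)) \cdot T'(\bar u)(T)]^{\top} \lambda \big] \cdot (u - \bar u) \ge 0 \quad \forall u \in \mathcal{U}_{ad}. \]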
Differentiability of F
(A1') Assume $f$ is of class $C^1$.
Theorem
Assume (A0)-(A1) and (A1'), then $T$ is differentiable on $L^2(0,T;\mathbb{R}^m)$. Moreover, we have:
\[ T'(u) \cdot v = z^{u,v} \quad \forall u, v \in L^2(0,T;\mathbb{R}^m); \]
where $z^{u,v}$ is the linearized state, solution of:
\[
\begin{cases}
\dot z_t = f'_y(y^u_t, u_t)\, z_t + f'_u(y^u_t, u_t)\, v_t & \text{on } (0,T),\\
z_0 = 0,
\end{cases}
\tag{1}
\]
where $y^u := T(u)$ stands for the state associated to $u$.
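The linearized equation (1) can be checked numerically on a scalar example. The following sketch is an illustration under assumptions, not part of the slides: the dynamics $f(y, u) = u \sin y$, the control $u$, the direction $v$ and the helper names are all chosen here. It compares the Euler solution of the linearized equation with a central finite difference of the Euler state in the direction $v$.

```python
import math


def f(y, u):   return u * math.sin(y)   # sample dynamics (assumed for illustration)
def f_y(y, u): return u * math.cos(y)   # partial derivative of f with respect to y
def f_u(y, u): return math.sin(y)       # partial derivative of f with respect to u


def euler_state(u, x, h):
    """Euler states y_0, ..., y_N for the control sequence u."""
    ys = [x]
    for u_k in u:
        ys.append(ys[-1] + h * f(ys[-1], u_k))
    return ys


def euler_linearized_state(u, v, x, h):
    """Euler solution of z' = f_y z + f_u v, z_0 = 0, along the state of u."""
    ys = euler_state(u, x, h)
    z = [0.0]
    for y_k, u_k, v_k in zip(ys, u, v):
        z.append(z[-1] + h * (f_y(y_k, u_k) * z[-1] + f_u(y_k, u_k) * v_k))
    return z


N, x = 200, 1.0
h = 1.0 / N
u = [math.cos(k * h) for k in range(N)]
v = [1.0] * N
eps = 1e-5

y_plus = euler_state([a + eps * b for a, b in zip(u, v)], x, h)
y_minus = euler_state([a - eps * b for a, b in zip(u, v)], x, h)
fd = (y_plus[-1] - y_minus[-1]) / (2 * eps)   # directional derivative at time T
z = euler_linearized_state(u, v, x, h)
print("finite difference:", fd, "  linearized state z_N:", z[-1])
```

For the Euler scheme the two numbers agree up to finite-difference error, since the discrete linearized recursion is the derivative of the discrete state map.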
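Theorem
We have:
\[ \lambda_0 F'(u) \cdot v + \big[ [T'(u)(T)]^{\top} \lambda \big] \cdot v = \int_0^T \langle p(t), f_u(y^u_t, u_t) \cdot v_t \rangle \, dt \]
where $y^u = T(u)$, and $p$ is the adjoint state associated to $u$, solution of:
\[
\begin{aligned}
-\dot p(t) &= [f_y(y^u_t, u_t)]^{\top} p(t),\\
p(T) &= \lambda_0\, \phi'(y^u_T) + \lambda.
\end{aligned}
\]
Introduce the Hamiltonian $H : \mathbb{R}^d \times \mathbb{R}^m \times \mathbb{R}^d \to \mathbb{R}$, defined by:
\[ H(x, v, q) = q \cdot f(x, v). \]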
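A discrete analogue of the adjoint formula can be sketched as follows. This is an illustration under assumptions made here, not the author's algorithm: the problem is scalar and unconstrained (so $\lambda_0 = 1$ and $\lambda = 0$), the dynamics $f(y, u) = u \sin y$ and the final cost $\phi(y) = y^2/2$ are sample choices, and the state equation is discretized with the Euler scheme. The backward recursion $p_k = p_{k+1} + h f_y(y_k, u_k)\,p_{k+1}$ plays the role of the adjoint equation, and $h f_u(y_k, u_k)\,p_{k+1}$ is the gradient component with respect to $u_k$.

```python
import math


def f(y, u):   return u * math.sin(y)   # sample dynamics (an assumption)
def f_y(y, u): return u * math.cos(y)
def f_u(y, u): return math.sin(y)
def phi(y):    return 0.5 * y * y       # sample final cost (an assumption)
def phi_y(y):  return y


def final_cost(u, x, h):
    """Discrete cost phi(y_N) for the Euler scheme."""
    y = x
    for u_k in u:
        y += h * f(y, u_k)
    return phi(y)


def adjoint_gradient(u, x, h):
    """Gradient of the discrete cost with respect to (u_0, ..., u_{N-1})."""
    ys = [x]
    for u_k in u:                               # forward sweep: Euler states
        ys.append(ys[-1] + h * f(ys[-1], u_k))
    p = phi_y(ys[-1])                           # p_N = phi'(y_N)
    grad = [0.0] * len(u)
    for k in reversed(range(len(u))):           # backward sweep
        grad[k] = h * f_u(ys[k], u[k]) * p      # here p holds p_{k+1}
        p += h * f_y(ys[k], u[k]) * p           # p_k = p_{k+1} + h f_y p_{k+1}
    return grad


N, x = 50, 1.0
h = 1.0 / N
u = [math.cos(k * h) for k in range(N)]
grad = adjoint_gradient(u, x, h)

# Central finite difference on one component as a sanity check.
k, eps = 10, 1e-6
u_plus = u[:k] + [u[k] + eps] + u[k + 1:]
u_minus = u[:k] + [u[k] - eps] + u[k + 1:]
fd = (final_cost(u_plus, x, h) - final_cost(u_minus, x, h)) / (2 * eps)
print("adjoint:", grad[k], "  finite difference:", fd)
```

One forward and one backward sweep give the whole gradient, whereas finite differences need two cost evaluations per control component.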
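Theorem (Under (A1)-(A3) and (A1'))
Let $\bar u \in \mathcal{U}_{ad}$ be a minimum of (P); then the triplet $(\bar u, \bar y, \bar p)$ satisfies:
\[
\begin{aligned}
&\dot{\bar y}(t) = f(\bar y(t), \bar u(t)), \qquad \bar y(0) = x_0,\\
&-\dot{\bar p}(t) = [f_y(\bar y(t), \bar u(t))]^{\top}\, \bar p(t),\\
&\partial_u H(\bar y(t), \bar u(t), \bar p(t)) \cdot (u - \bar u(t)) \ge 0, \quad \forall u \in U.
\end{aligned}
\]
The triplet $(\bar u, \bar y, \bar p)$ is called a Pontryagin extremal.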
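Theorem (Under (A1)-(A3) and (A1'))
Let $\bar u \in \mathcal{U}_{ad}$ be a minimum of (P); then the triplet $(\bar u, \bar y, \bar p)$ satisfies:
\[
\begin{aligned}
&\dot{\bar y}(t) = f(\bar y(t), \bar u(t)), \qquad \bar y(0) = x_0,\\
&-\dot{\bar p}(t) = [f_y(\bar y(t), \bar u(t))]^{\top}\, \bar p(t),\\
&H(\bar y(t), \bar u(t), \bar p(t)) = \min_{u \in U} H(\bar y(t), u, \bar p(t)).
\end{aligned}
\]
The triplet $(\bar u, \bar y, \bar p)$ is called a Pontryagin extremal.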
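As a hedged illustration (not taken from the slides), the minimum condition can be evaluated by hand for the double-integrator dynamics of the earlier example, written in first-order form as $\dot x_1 = x_2$, $\dot x_2 = u \in [-1, 1]$. The first-order rewriting, the ordering of the Hamiltonian arguments as (state, control, costate), and the constants $c_1, c_2$ are notation assumed here:

\[
\begin{aligned}
H(x, v, q) &= q_1 x_2 + q_2 v,\\
-\dot p_1(t) &= \partial_{x_1} H = 0, \qquad -\dot p_2(t) = \partial_{x_2} H = p_1(t)
\;\Longrightarrow\; p_1 \equiv c_1, \quad p_2(t) = c_2 - c_1 t,\\
\bar u(t) &\in \arg\min_{v \in [-1, 1]} p_2(t)\, v
\;\Longrightarrow\; \bar u(t) = -\operatorname{sign} p_2(t) \quad \text{whenever } p_2(t) \neq 0.
\end{aligned}
\]

Since $p_2$ is affine in $t$, it changes sign at most once unless it vanishes identically, which is consistent with the bang-bang structure with at most one switching time stated for that example.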
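More generally ...
Min $\phi(y_T) + \int_0^T \ell(y(t), u(t))\,dt$
subject to: $\dot y_t = f(y_t, u_t)$, $t \in (0,T)$, $y_0 = x_0$,
$\Psi(y_T) = 0$.
Theorem (Under (A1)-(A3) and (A1'))
Let $\bar u \in \mathcal{U}_{ad}$ be a minimum of (P); then there exists $(\lambda_0, \lambda) \in \{0, 1\} \times \mathbb{R}^d$ such that
\[
\begin{aligned}
&\dot{\bar y}(t) = \partial_p H(\bar y(t), \bar u(t), \bar p(t), \lambda_0), \qquad \bar y(0) = x_0,\\
&-\dot{\bar p}(t) = \partial_y H(\bar y(t), \bar u(t), \bar p(t), \lambda_0),\\
&\partial_u H(\bar y(t), \bar u(t), \bar p(t), \lambda_0) \cdot (u - \bar u(t)) \ge 0, \quad \forall u \in U,
\end{aligned}
\]
where $H(x, v, q, \mu) := \langle q, f(x, v) \rangle + \mu\, \ell(x, v)$ for $x \in \mathbb{R}^d$, $v \in U$, $q \in \mathbb{R}^d$ and $\mu \in \{0, 1\}$.
Moreover, $\lambda_0 = 1$ if the problem is free of state constraints.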