Funkcialaj Ekvacioj, 11 (1968), 39-50
Stabilizability and Optimal Control
By Dahlard L. LUKES*
(University of Wisconsin)
Abstract. A stability-theoretic proof of the stabilizability of a completely controllable process is given. The stabilizability hypothesis is used to develop the optimal regulator theory of linear systems and is shown to be equivalent to the solvability of Kalman’s quadratic generalization of Lyapunov’s linear equation.
This paper carefully redevelops the optimal regulator theory of linear systems as a background for extending the theory to nonlinear systems. Paralleling the procedure used by Kalman [1], the existence, uniqueness and synthesis of the optimal regulator are analyzed as the limiting case of the problem on a finite time interval. Utilizing the maximum principle, the technique adopted reduces the problem to an analysis of the Riccati matrix differential equation which arises upon separation of variables in the Hamilton-Jacobi partial differential equation. Kalman’s assumption of the complete controllability of the process is relaxed to its stabilizability, which is the essential hypothesis needed for extending the theory to an infinite time interval for both linear and nonlinear systems. We give a stability-theoretic proof, using LaSalle’s theorem, that every completely controllable process is stabilizable (but of course not conversely). We also prove that the stabilizability of the process is equivalent to the solvability of the quadratic Kalman matrix equation whose solution provides the optimal closed-loop synthesis. Kalman’s quadratic equation is an extension of Lyapunov’s linear equation, whose solvability provides a criterion for only the stability of the uncontrolled process.
Notations: We study ordinary differential equations in finite $k$-dimensional real number spaces $R^{k}$, using the inner product and norm notations $x\cdot y=x_{1}y_{1}+x_{2}y_{2}+\cdots+x_{k}y_{k}$ and $|x|=\sqrt{x\cdot x}$ for $x=(x_{1}, x_{2}, \cdots, x_{k})$ and $y=(y_{1}, y_{2}, \cdots, y_{k})$ in $R^{k}$. The ordinary transpose of a real matrix $M$ is denoted by $M^{*}$ and we use the matrix norm $||M||=\sup_{|x|=1}|Mx|$. The notation $M>0$ $(M\geqq 0)$ denotes that $M$ is a symmetric positive definite (semi-definite) real matrix. If $P(t)$ is a matrix valued function on a subinterval of $R^{1}$ we write $P(t)\uparrow P_{0}$ as $t\downarrow t_{0}$ to describe the situation in which $||P(t)-P_{0}||$ converges to zero as $t$ monotonically decreases to $t_{0}$ and $t_{1}\leqq t_{2}$ implies $P(t_{1})-P(t_{2})\geqq 0$.
* This work is the introductory part of the author’s Ph. D. thesis written under Lawrence Markus at the University of Minnesota. Formerly with Honeywell Inc., the author is presently a visiting associate professor at the Mathematics Research Center, U. S. Army, University of Wisconsin.
Sponsored by the Mathematics Research Center, United States Army, Madison, Wisconsin, under Contract No.: DA-31-124-ARO-D-462.
1. Optimal open-loop control on a finite time interval.
Consider a control process in $R^{n}$
(1. 1) $\dot{x}=A(t)x+B(t)u$
on the finite interval $t_{0}\leqq t\leqq t_{1}$ with $x(t_{0})=x_{0}$ where $A(t)$ and $B(t)$ are continuous real matrix valued functions of size $n\times n$ and $n\times r$ respectively. The space of open-loop controls $L_{2}(t_{0}, t_{1})$ denotes the familiar space of equivalence classes of Borel measurable functions on $[t_{0}, t_{1}]$ into $R^{r}$ satisfying
$\int_{t_{0}}^{t_{1}}|u(t)|^{2}dt<\infty$.
A cost functional is prescribed on $L_{2}(t_{0}, t_{1})$ by the formula
$C(u)=\int_{t_{0}}^{t_{1}}G(t, x, u)dt+x(t_{1})\cdot \mathfrak{G}x(t_{1})$
where the integration is along the trajectory of (1. 1) with $u=u(t)$ in $L_{2}(t_{0}, t_{1})$. The integrand is a quadratic form
$G(t, x, u)=x\cdot \mathfrak{A}(t)x+2x\cdot \mathfrak{C}(t)u+u\cdot \mathfrak{B}(t)u$
defined on $[t_{0}, t_{1}]\times R^{n}\times R^{r}$ with the assumptions on the matrices $\mathfrak{A}(t)$, $\mathfrak{B}(t)$, $\mathfrak{C}(t)$, $\mathfrak{G}$:
$\left(\begin{array}{ll}\mathfrak{A}(t) & \mathfrak{C}(t)\\ \mathfrak{C}^{*}(t) & \mathfrak{B}(t)\end{array}\right)\geqq 0$, $\mathfrak{B}(t)>0$
continuous and real on $[t_{0}, t_{1}]$ and $\mathfrak{G}\geqq 0$. We remark that $C(u)$ is well-defined and real valued on $L_{2}(t_{0}, t_{1})$. An open-loop control element $u_{*}$ in $L_{2}(t_{0}, t_{1})$ is called optimal if it minimizes $C(u)$.
We find it useful to study the symmetric nonlinear Kalman-Riccati differential equation
(1. 2) $-\dot{P}=(\mathfrak{A}-\mathfrak{C}\mathfrak{B}^{-1}\mathfrak{C}^{*})+P(A-B\mathfrak{B}^{-1}\mathfrak{C}^{*})+(A-B\mathfrak{B}^{-1}\mathfrak{C}^{*})^{*}P-P(B\mathfrak{B}^{-1}B^{*})P$
in the linear manifold of symmetric $n\times n$ real matrices, $S_{n}$.
Lemma 1. 1 Let $P(t)$ be a symmetric solution to (1. 2) on $[a, b]\subseteq[t_{0}, t_{1}]$ and define $c(t, x)=x\cdot P(t)x$. Then the inequality
(1. 3) $c_{t}(t, x)+[A(t)x+B(t)u]\cdot c_{x}(t, x)+G(t, x, u)\geqq 0$
holds for all $(t, x, u)$ in $[a, b]\times R^{n}\times R^{r}$ and equality holds when and only when $u=-\mathfrak{B}^{-1}(t)[\mathfrak{C}^{*}(t)+B^{*}(t)P(t)]x$.
Elementary but long calculations show that for each fixed $(t, x)$ in $[a, b]\times R^{n}$ the left-hand side of (1. 3) as a function of $u$ on $R^{r}$ and its gradient are zero when $u=-\mathfrak{B}^{-1}(t)[\mathfrak{C}^{*}(t)+B^{*}(t)P(t)]x$ and the Hessian is $2\mathfrak{B}>0$. This proves the lemma.
Lemma 1. 2 For every $\mathfrak{G}\geqq 0$ there exists a solution $P(t)\geqq 0$ to the Kalman-Riccati equation (1. 2) on $[t_{0}, t_{1}]$ satisfying the final condition $P(t_{1})=\mathfrak{G}$.
By the well-known existence theory of differential equations [2] there exists a solution $P(t)$ of (1. 2) satisfying the boundary condition $P(t_{1})=\mathfrak{G}$ which is either defined on $[t_{0}, t_{1}]$ or else on some maximal subinterval $I=(\delta, t_{1}]$, $t_{0}\leqq\delta<t_{1}$. We now assume the latter case and show it leads to a contradiction.
Let $\hat{t}_{0}$ be arbitrary in $I$ and $u(t)$ be arbitrary in $L_{2}(t_{0}, t_{1})$. There exists a corresponding unique absolutely continuous solution to (1. 1) on $[\hat{t}_{0}, t_{1}]$ with $x(\hat{t}_{0})=x_{0}$ which according to Lemma 1. 1 satisfies
$\frac{d}{dt}[x(t)\cdot P(t)x(t)]+G(t, x(t), u(t))\geqq 0$
for a.e. $t\in[\hat{t}_{0}, t_{1}]$ and all $x_{0}\in R^{n}$. By integration,
(1. 4) $x_{0}\cdot P(\hat{t}_{0})x_{0}\leqq\int_{\hat{t}_{0}}^{t_{1}}G(t, x(t), u(t))dt+x(t_{1})\cdot \mathfrak{G}x(t_{1})$.
By the continuity of $P(t)$ there is also a unique absolutely continuous solution $\hat{x}(t)$ to the system in $R^{n}$
(1. 5) $\dot{\hat{x}}=\hat{A}(t)\hat{x}$
on $[\hat{t}_{0}, t_{1}]$ with $\hat{x}(\hat{t}_{0})=x_{0}$ where we define $\hat{A}(t)=A(t)-B(t)\mathfrak{B}^{-1}(t)[\mathfrak{C}^{*}(t)+B^{*}(t)P(t)]$. From Lemma 1. 1 we see
$\frac{d}{dt}[\hat{x}(t)\cdot P(t)\hat{x}(t)]+G(t, \hat{x}(t), \hat{u}(t))=0$
for all $t\in[\hat{t}_{0}, t_{1}]$ and all $x_{0}\in R^{n}$ where we define $\hat{u}(t)=-\mathfrak{B}^{-1}(t)[\mathfrak{C}^{*}(t)+B^{*}(t)P(t)]\hat{x}(t)$ and by integration have
(1. 6) $x_{0}\cdot P(\hat{t}_{0})x_{0}=\int_{\hat{t}_{0}}^{t_{1}}G(t, \hat{x}(t), \hat{u}(t))dt+\hat{x}(t_{1})\cdot \mathfrak{G}\hat{x}(t_{1})$.
Since $G\geqq 0$ and $\mathfrak{G}\geqq 0$, (1. 6) implies
$0\leqq x_{0}\cdot P(\hat{t}_{0})x_{0}$.
Write the solution of (1. 1) with $u(t)\equiv 0$ in terms of the fundamental matrix $X(\cdot, \cdot)$ for which $X(\hat{t}_{0}, \hat{t}_{0})=I_{n}$,
(1. 7) $x=X(\hat{t}_{0}, t)x_{0}$
and note the continuity of $X$ to be used below. Using this solution to set $u(t)\equiv 0$ in (1. 4) we obtain the inequality
(1. 8) $x_{0}\cdot P(\hat{t}_{0})x_{0}\leqq x_{0}\cdot\hat{P}(\hat{t}_{0})x_{0}$
where $\hat{P}$ is symmetric and $||\hat{P}(t)||$ is continuous and hence bounded on $[t_{0}, t_{1}]$. From (1. 7) and (1. 8) it follows that $||P(\hat{t}_{0})||\leqq||\hat{P}(\hat{t}_{0})||$ and since $\hat{t}_{0}$ was arbitrary in $I$ we conclude that $||P(t)||$ is bounded on $I$. But this contradicts the known unbounded behavior of solutions on finite maximal intervals [3].
Hence we have shown that the required solution $P(t)$ of (1. 2) exists on $[t_{0}, t_{1}]$ and we can replace $\hat{t}_{0}$ by $t_{0}$ in all the above calculations.
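The calculation behind Lemma 1. 1, which the proof above uses twice, can be condensed by completing the square. What follows is a sketch and not necessarily the author’s own computation; the abbreviation $w$ is introduced here for convenience and does not appear in the original, and the argument $t$ is suppressed.

```latex
\begin{align*}
c_t + [Ax+Bu]\cdot c_x + G
  &= x\cdot\dot{P}x + 2[Ax+Bu]\cdot Px
     + x\cdot\mathfrak{A}x + 2x\cdot\mathfrak{C}u + u\cdot\mathfrak{B}u \\
  &= w\cdot\mathfrak{B}^{-1}w + 2u\cdot w + u\cdot\mathfrak{B}u
     \qquad\text{(substituting (1. 2) for } \dot{P}\text{)} \\
  &= (u+\mathfrak{B}^{-1}w)\cdot\mathfrak{B}\,(u+\mathfrak{B}^{-1}w) \;\geqq\; 0,
     \qquad w = [\mathfrak{C}^{*}+B^{*}P]\,x .
\end{align*}
```

Since $\mathfrak{B}>0$, equality holds exactly when $u=-\mathfrak{B}^{-1}w$, which is the feedback named in the lemma.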
Theorem 1. 3 The control process in $R^{n}$
$\mathfrak{L})$ $\dot{x}=A(t)x+B(t)u$
operating on a finite interval $[t_{0}, t_{1}]$ with $x(t_{0})=x_{0}$ and cost functional
$C(u)=\int_{t_{0}}^{t_{1}}G(t, x, u)dt+x(t_{1})\cdot \mathfrak{G}x(t_{1})$
has a unique optimal control $u_{*}\in L_{2}(t_{0}, t_{1})$ of class $C^{0}$. The optimal cost is given by $C(u_{*})=x_{0}\cdot P(t_{0})x_{0}$ where $P(t)$ is the solution of (1. 2) for which $P(t_{1})=\mathfrak{G}$. Finally, $u_{*}$ can be synthesized by the closed-loop control $D_{*}(t)x$ where $D_{*}(t)=-\mathfrak{B}^{-1}(t)[\mathfrak{C}^{*}(t)+B^{*}(t)P(t)]$. That is, $u_{*}(t)=D_{*}(t)x(t)$ where $x(t)$ solves $\mathfrak{L}$) with $u=D_{*}(t)x$.
The theorem, except for the uniqueness, is a direct consequence of Lemma 1. 2 which permits us to replace $\hat{t}_{0}$ by $t_{0}$ in (1. 4) and (1. 6). To prove uniqueness suppose that for some initial condition $x_{0}\in R^{n}$ there exists another optimal control $\tilde{u}_{*}$ in $L_{2}(t_{0}, t_{1})$. Therefore the set $[t\,|\,u_{*}(t)\neq\tilde{u}_{*}(t)]$ has positive measure. Denote the trajectory of $\mathfrak{L}$) corresponding to $\tilde{u}_{*}$ by $\tilde{x}_{*}(t)$. By Lemma 1. 1,
$0\leqq\frac{d}{dt}[\tilde{x}_{*}(t)\cdot P(t)\tilde{x}_{*}(t)]+G(t, \tilde{x}_{*}(t), \tilde{u}_{*}(t))$
where strict inequality holds on the set $[t\,|\,\tilde{u}_{*}(t)\neq D_{*}(t)\tilde{x}_{*}(t)]$. But suppose this set had zero measure. This would imply $\tilde{u}_{*}(t)=D_{*}(t)\tilde{x}_{*}(t)$ a.e. so $\tilde{x}_{*}(t)$ would satisfy
$\dot{\tilde{x}}_{*}=A(t)\tilde{x}_{*}+B(t)\tilde{u}_{*}(t)\stackrel{\mathrm{a.e.}}{=}[A(t)+B(t)D_{*}(t)]\tilde{x}_{*}$
$\tilde{x}_{*}(t_{0})=x_{0}$
on $[t_{0}, t_{1}]$. But by the uniqueness of the solution to this equation we conclude $\tilde{x}_{*}(t)=x_{*}(t)$ on $[t_{0}, t_{1}]$. Hence
$\tilde{u}_{*}(t)=D_{*}(t)\tilde{x}_{*}(t)=D_{*}(t)x_{*}(t)=u_{*}(t)$
a.e. on $[t_{0}, t_{1}]$ which is a contradiction. Therefore
$0\leqq\frac{d}{dt}[\tilde{x}_{*}(t)\cdot P(t)\tilde{x}_{*}(t)]+G(t, \tilde{x}_{*}(t), \tilde{u}_{*}(t))$
on $[t_{0}, t_{1}]$ and strict inequality holds on a subset of positive measure. Integration provides the inequality $0<C(\tilde{u}_{*})-C(u_{*})$ which contradicts the optimality of $\tilde{u}_{*}$.
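Theorem 1. 3 lends itself to a small numerical check. The sketch below is illustrative only and assumes constant coefficients; the names `Q`, `C`, `R`, `G` stand in for $\mathfrak{A}$, $\mathfrak{C}$, $\mathfrak{B}$, $\mathfrak{G}$, and the helper functions and step counts are choices made here, not part of the paper. It integrates (1. 2) backward from $P(t_{1})=\mathfrak{G}$ with a classical Runge-Kutta scheme, simulates the closed loop $u=D_{*}(t)x$, and compares the accumulated cost with $x_{0}\cdot P(t_{0})x_{0}$. The scalar test case $A=0$, $B=\mathfrak{A}=\mathfrak{B}=1$, $\mathfrak{C}=\mathfrak{G}=0$ has the closed-form solution $P(t)=\tanh(t_{1}-t)$.

```python
import numpy as np

def riccati_rhs(P, A, B, Q, C, R):
    """dP/dt according to (1. 2); Q, C, R play the roles of fraktur A, C, B."""
    Rinv = np.linalg.inv(R)
    Ac = A - B @ Rinv @ C.T
    Qc = Q - C @ Rinv @ C.T
    return -(Qc + P @ Ac + Ac.T @ P - P @ B @ Rinv @ B.T @ P)

def solve_riccati_backward(A, B, Q, C, R, G, t0, t1, steps=2000):
    """RK4 integration of (1. 2) from P(t1) = G down to t0 (constant coefficients)."""
    h = (t1 - t0) / steps
    P = np.array(G, dtype=float)
    out = [P]
    f = lambda M: riccati_rhs(M, A, B, Q, C, R)
    for _ in range(steps):
        k1 = f(P)
        k2 = f(P - 0.5 * h * k1)
        k3 = f(P - 0.5 * h * k2)
        k4 = f(P - h * k3)
        P = P - (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        out.append(P)
    return out[::-1]  # out[0] is P(t0), out[-1] is P(t1)

def closed_loop_cost(A, B, Q, C, R, G, Ps, x0, t0, t1):
    """Cost of u = D_*(t) x using Heun steps for x and the trapezoid rule for G."""
    steps = len(Ps) - 1
    h = (t1 - t0) / steps
    Rinv = np.linalg.inv(R)
    gain = lambda P: -Rinv @ (C.T + B.T @ P)
    running = lambda x, u: x @ Q @ x + 2 * x @ C @ u + u @ R @ u
    x = np.array(x0, dtype=float)
    cost = 0.0
    for n in range(steps):
        D0, D1 = gain(Ps[n]), gain(Ps[n + 1])
        f0 = (A + B @ D0) @ x
        x_pred = x + h * f0
        x_next = x + 0.5 * h * (f0 + (A + B @ D1) @ x_pred)
        cost += 0.5 * h * (running(x, D0 @ x) + running(x_next, D1 @ x_next))
        x = x_next
    return cost + x @ G @ x

# Scalar test case (an assumption of this sketch): P(t) = tanh(t1 - t).
A = np.array([[0.0]]); B = np.array([[1.0]])
Q = np.array([[1.0]]); C = np.array([[0.0]]); R = np.array([[1.0]])
G = np.array([[0.0]])
x0 = np.array([1.0])
Ps = solve_riccati_backward(A, B, Q, C, R, G, 0.0, 1.0)
cost = closed_loop_cost(A, B, Q, C, R, G, Ps, x0, 0.0, 1.0)
print(Ps[0][0, 0], cost)
```

Both printed numbers should agree with $\tanh(1)$ up to discretization error, which is the statement $C(u_{*})=x_{0}\cdot P(t_{0})x_{0}$ for this example.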
2. Stabilizability and solvability of the Kalman matrix equation.
The classical stability theory of differential equations calls a real $n\times n$ matrix a stability matrix if all its eigenvalues have negative real parts. The name comes from the well-known theorem which states that the origin is an asymptotically stable solution to the linear differential equation in $R^{n}$
$\dot{x}=Ax$
if and only if $A$ is a stability matrix. A theorem of the classical Lyapunov theory says $A$ is a stability matrix if and only if for $Q>0$ there is a corresponding $P>0$ which satisfies Lyapunov’s linear matrix equation
$Q+A^{*}P+PA=0$.
Control theory introduces a more general stability concept and covers the classical theorem as a special case.
If for the control process
(2. 1) $\dot{x}=Ax+Bu$
in which $A$ and $B$ are constant matrices there exists a constant matrix $D$ for which $A+BD$ is a stability matrix, then the process is called stabilizable. In other words the linear control function $u=Dx$ stabilizes (2. 1). We now show this property is characterized by the Kalman matrix equation
(2. 2) $(\mathfrak{A}-\mathfrak{C}\mathfrak{B}^{-1}\mathfrak{C}^{*})+(A-B\mathfrak{B}^{-1}\mathfrak{C}^{*})^{*}P+P(A-B\mathfrak{B}^{-1}\mathfrak{C}^{*})-P(B\mathfrak{B}^{-1}B^{*})P=0$
obtained by equating the right-hand side of the differential equation (1. 2) to zero. This is clearly a generalization of Lyapunov’s equation. We assume that all the matrices which appear in the coefficients are constant and satisfy
$\left(\begin{array}{ll}\mathfrak{A} & \mathfrak{C}\\ \mathfrak{C}^{*} & \mathfrak{B}\end{array}\right)\geqq 0$ and $\mathfrak{B}>0$.
Theorem 2. 1 If the linear process in $R^{n}$
$\mathfrak{L})$ $\dot{x}=Ax+Bu$
is stabilizable then there exists a matrix solution $P_{\infty}\geqq 0$ to the Kalman matrix equation (2. 2). $P(t)\uparrow P_{\infty}$ as $t\downarrow-\infty$ where $P(t)$ is the solution of the Kalman-Riccati differential equation (1. 2) for which $P(0)=0$. If we assume $\left(\begin{array}{ll}\mathfrak{A} & \mathfrak{C}\\ \mathfrak{C}^{*} & \mathfrak{B}\end{array}\right)>0$ then $P_{\infty}>0$ and $P_{\infty}$ is the unique positive definite solution. Conversely, if $\left(\begin{array}{ll}\mathfrak{A} & \mathfrak{C}\\ \mathfrak{C}^{*} & \mathfrak{B}\end{array}\right)>0$ and (2. 2) has a solution $P_{\infty}>0$ then system $\mathfrak{L}$) is stabilizable.
Assume $\mathfrak{L}$) is stabilized by the matrix $D_{s}$ and consider the control $u_{s}(t)=D_{s}x_{s}(t)$ where $x_{s}(t)$ is the solution in $R^{n}$ to the system
$\dot{x}_{s}=[A+BD_{s}]x_{s}$
on $(-\infty, \infty)$ with $x_{s}(t_{0})=x_{0}$. By Theorem 1. 3 in which we set $\mathfrak{G}=0$ and $t_{1}=0$ we have
$0\leqq x_{0}\cdot P(t_{0})x_{0}\leqq\int_{t_{0}}^{0}G(x_{s}, u_{s})dt\leqq\int_{t_{0}}^{\infty}G(x_{s}, u_{s})dt=x_{0}\cdot P_{s}x_{0}$
where $P_{s}$ can be computed as the convergent integral
$P_{s}=\int_{0}^{\infty}e^{A_{s}^{*}t}\mathfrak{A}_{s}e^{A_{s}t}dt$
where $A_{s}=A+BD_{s}$
$\mathfrak{A}_{s}=\mathfrak{A}+\mathfrak{C}D_{s}+D_{s}^{*}\mathfrak{C}^{*}+D_{s}^{*}\mathfrak{B}D_{s}$
and we note $P_{s}$ is independent of $t_{0}$. By the stationarity of $A$, $B$, $\mathfrak{A}$, $\mathfrak{B}$ and $\mathfrak{C}$ we have for every $t_{0}\leqq 0$, $\delta\geqq 0$ and $x_{0}\in R^{n}$
$0\leqq x_{0}\cdot P(t_{0})x_{0}=\min_{L_{2}(t_{0}, 0)}\int_{t_{0}}^{0}Gdt$
$=\min_{L_{2}(t_{0}, \delta)}\int_{t_{0}}^{0}Gdt\leqq\min_{L_{2}(t_{0}, \delta)}\int_{t_{0}}^{\delta}Gdt$
$=x_{0}\cdot P(t_{0}-\delta)x_{0}$.
Hence we have shown that $0\leqq P(t)\leqq P_{s}$ on $(-\infty, 0]$ and that $P(t)$ is monotone increasing with decreasing $t$. It follows that $P_{\infty}=\lim_{t\rightarrow-\infty}P(t)$ exists [3] and $P_{\infty}\geqq 0$. If $\left(\begin{array}{ll}\mathfrak{A} & \mathfrak{C}\\ \mathfrak{C}^{*} & \mathfrak{B}\end{array}\right)>0$ the same argument shows $P_{\infty}>0$. Let $P(t, P_{0})$ be the solution of (1. 2) for which $P(0, P_{0})=P_{0}$ defined in an open neighborhood of $(0, P_{\infty})$ in $R^{1}\times S_{n}$ taken small enough so that the solution is continuous there. By the continuity and the uniqueness of the solutions
$P(t, P_{\infty})=\lim_{\tau\rightarrow-\infty}P(t, P(\tau))$
$=\lim_{\tau\rightarrow-\infty}P(t+\tau)=P_{\infty}$
for all $t$ in a neighborhood of zero. This implies that $P_{\infty}$ solves (1. 2) and (2. 2).
Suppose (2. 2) has a solution $P_{\infty}>0$. Then (2. 2) may be rewritten as
(2. 3) $A_{s}^{*}P_{\infty}+P_{\infty}A_{s}=-[(\mathfrak{A}-\mathfrak{C}\mathfrak{B}^{-1}\mathfrak{C}^{*})+P_{\infty}(B\mathfrak{B}^{-1}B^{*})P_{\infty}]<0$
where $A_{s}=A+B[-\mathfrak{B}^{-1}(\mathfrak{C}^{*}+B^{*}P_{\infty})]$
and the inequality holds if we assume $\left(\begin{array}{ll}\mathfrak{A} & \mathfrak{C}\\ \mathfrak{C}^{*} & \mathfrak{B}\end{array}\right)>0$. But (2. 3) is Lyapunov’s linear stability equation which can be shown to have only stability matrix solutions. Hence the matrix $D=-\mathfrak{B}^{-1}(\mathfrak{C}^{*}+B^{*}P_{\infty})$ stabilizes $\mathfrak{L}$). The proof of the uniqueness is postponed to a remark following Theorem 4. 1.
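The construction in Theorem 2. 1 can be tried numerically. The following sketch is an illustration under stated assumptions, not the author’s procedure: it takes the double integrator with $\mathfrak{A}=I$, $\mathfrak{C}=0$, $\mathfrak{B}=1$ (an example chosen here), follows (1. 2) backward from $P(0)=0$ over a long but finite horizon, and checks that the limit satisfies (2. 2) and that $A+BD$ is a stability matrix. The names `Q`, `C`, `R` again stand for $\mathfrak{A}$, $\mathfrak{C}$, $\mathfrak{B}$, and `horizon` and `steps` are arbitrary tuning choices.

```python
import numpy as np

def kalman_residual(P, A, B, Q, C, R):
    """Left-hand side of the Kalman matrix equation (2. 2)."""
    Rinv = np.linalg.inv(R)
    Ac = A - B @ Rinv @ C.T
    return (Q - C @ Rinv @ C.T) + Ac.T @ P + P @ Ac - P @ B @ Rinv @ B.T @ P

def riccati_limit(A, B, Q, C, R, horizon=20.0, steps=4000):
    """Follow (1. 2) from P(0) = 0 back to t = -horizon with RK4.

    In reversed time s = -t the equation reads dP/ds = kalman_residual(P).
    """
    h = horizon / steps
    P = np.zeros_like(A, dtype=float)
    f = lambda M: kalman_residual(M, A, B, Q, C, R)
    for _ in range(steps):
        k1 = f(P)
        k2 = f(P + 0.5 * h * k1)
        k3 = f(P + 0.5 * h * k2)
        k4 = f(P + h * k3)
        P = P + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return P

# Double integrator (example data assumed for this sketch).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
C = np.zeros((2, 1))
R = np.array([[1.0]])

P_inf = riccati_limit(A, B, Q, C, R)
D = -np.linalg.inv(R) @ (C.T + B.T @ P_inf)      # stabilizing feedback matrix
residual = np.linalg.norm(kalman_residual(P_inf, A, B, Q, C, R))
closed_loop_eigs = np.linalg.eigvals(A + B @ D)
print(P_inf, residual, closed_loop_eigs)
```

For this example (2. 2) can be solved by hand, giving $P_{\infty}$ with diagonal entries $\sqrt{3}$ and off-diagonal entries $1$, so the numerical limit can be compared against that matrix.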
3. Stabilizability and controllability.
A linear control process in $R^{n}$ is called completely controllable on $[t_{0}, t_{1}]$ if for each pair of points $x_{0}$, $x_{1}$ in $R^{n}$ there exists a control element $u$ in $L_{2}(t_{0}, t_{1})$ steering $x(t_{0})=x_{0}$ to $x(t_{1})=x_{1}$. Kalman [1] and others have investigated this concept, which has drawn considerable interest in the literature. It is well known that for a stationary process in $R^{n}$
$\mathfrak{L})$ $\dot{x}=Ax+Bu$
complete controllability on one interval implies complete controllability on every finite interval and is equivalent to each of the conditions:
(1) rank $[B, AB, A^{2}B, \cdots, A^{n-1}B]=n$
(2) $\int_{0}^{\epsilon}e^{-At}BB^{*}e^{-A^{*}t}dt>0$ for some $\epsilon>0$.
It is also well known that complete controllability of $\mathfrak{L}$) implies its stabilizability but not conversely (take $A=-I$ and $B=0$). We now give a stability-theoretic proof.
Theorem 3. 1 If the linear stationary process in $R^{n}$
$\mathfrak{L})$ $\dot{x}=Ax+Bu$
is completely controllable then it is stabilized by each of the following closed-loop controls
$u_{\epsilon}(x)=-B^{*}(\int_{0}^{\epsilon}e^{-At}BB^{*}e^{-A^{*}t}dt)^{-1}x$
for each $\epsilon>0$ and each corresponding stabilized system has the Lyapunov function
$v_{\epsilon}(x)=x\cdot(\int_{0}^{\epsilon}e^{-At}BB^{*}e^{-A^{*}t}dt)^{-1}x$.
We assume complete controllability and note that since
$\int_{0}^{\epsilon}e^{-At}BB^{*}e^{-A^{*}t}dt>0$
for some $\epsilon>0$ in view of condition (2) the inequality holds for all $\epsilon>0$. Define
$A_{\epsilon}=A-BB^{*}(\int_{0}^{\epsilon}e^{-At}BB^{*}e^{-A^{*}t}dt)^{-1}$
and use Lyapunov’s linear stability equation to note that verifying $v_{\epsilon}(x)$ is a Lyapunov function for $\mathfrak{L}$) is equivalent to verifying
$v_{\epsilon}^{*}(x)=x\cdot(\int_{0}^{\epsilon}e^{-At}BB^{*}e^{-A^{*}t}dt)x$
is a Lyapunov function for the system
$\mathfrak{L}^{*})$ $\dot{x}=A_{\epsilon}^{*}x$.
Differentiating along the trajectories of $\mathfrak{L}^{*}$) we can verify
$\frac{dv_{\epsilon}^{*}(x)}{dt}=-[|B^{*}x|^{2}+|B^{*}e^{-\epsilon A^{*}}x|^{2}]$.
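The differentiation can be spelled out in a few lines. In this sketch $W_{\epsilon}$ abbreviates the integral $\int_{0}^{\epsilon}e^{-At}BB^{*}e^{-A^{*}t}dt$; the symbol is introduced here and is not part of the original notation.

```latex
\begin{align*}
\frac{dv_{\epsilon}^{*}}{dt}
  &= x\cdot(A_{\epsilon}W_{\epsilon}+W_{\epsilon}A_{\epsilon}^{*})x
   = x\cdot(AW_{\epsilon}+W_{\epsilon}A^{*}-2BB^{*})x, \\
AW_{\epsilon}+W_{\epsilon}A^{*}
  &= -\int_{0}^{\epsilon}\frac{d}{dt}\left[e^{-At}BB^{*}e^{-A^{*}t}\right]dt
   = BB^{*}-e^{-\epsilon A}BB^{*}e^{-\epsilon A^{*}}, \\
\frac{dv_{\epsilon}^{*}}{dt}
  &= -x\cdot BB^{*}x - x\cdot e^{-\epsilon A}BB^{*}e^{-\epsilon A^{*}}x
   = -\left[|B^{*}x|^{2}+|B^{*}e^{-\epsilon A^{*}}x|^{2}\right].
\end{align*}
```

The first line uses $A_{\epsilon}W_{\epsilon}=AW_{\epsilon}-BB^{*}$ and the symmetry of $W_{\epsilon}$; the second is the fundamental theorem of calculus applied to the integrand.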
Let $\delta>0$ be fixed and consider the set in $R^{n}$
$E=[x:v_{\epsilon}^{*}(x)\leqq\delta, \frac{dv_{\epsilon}^{*}(x)}{dt}=0]$.
Note $E$ is a compact set containing the origin. Let $E_{0}$ denote the maximal positive invariant set in $E$. Note that $E_{0}$ contains the origin and let $x_{0}$ be in $E_{0}$. Thus
$e^{tA_{\epsilon}^{*}}x_{0}\in E_{0}\subseteq E$ for all $t\geqq 0$.
Therefore $B^{*}e^{tA_{\epsilon}^{*}}x_{0}=0$ for $t\geqq 0$. Differentiating and setting $t=0$ we get $B^{*}x_{0}=0$, $B^{*}A_{\epsilon}^{*}x_{0}=0$, $\cdots$, $B^{*}(A_{\epsilon}^{*})^{n-1}x_{0}=0$. That is, $x_{0}$ is orthogonal to the columns of $[B, A_{\epsilon}B, \cdots, A_{\epsilon}^{n-1}B]$. Note that since $x_{0}$ is orthogonal to the columns of $B$ it is orthogonal to the columns of every matrix of the form $BM$ where $M$ is any matrix. Expanding $A_{\epsilon}^{k}B$ we have $A_{\epsilon}=A-BM_{0}$, hence
$A_{\epsilon}B=AB-BM_{1}$
$A_{\epsilon}^{2}B=(A-BM_{0})(A_{\epsilon}B)=A^{2}B-(AB)M_{1}-BM_{2}$
$A_{\epsilon}^{3}B=(A-BM_{0})(A_{\epsilon}^{2}B)=A^{3}B-(A^{2}B)M_{1}-(AB)M_{2}-BM_{3}$
$\cdots$
$A_{\epsilon}^{n-1}B=(A-BM_{0})(A_{\epsilon}^{n-2}B)=A^{n-1}B-(A^{n-2}B)M_{1}-\cdots-(AB)M_{n-2}-BM_{n-1}$
where the form of the matrices $M_{0}$, $M_{1}$, $\cdots$, $M_{n-1}$ is apparent.
Since $x_{0}$ is orthogonal to the columns of $A_{\epsilon}B$ and $BM_{1}$ it is orthogonal to the columns of $AB$. By induction we conclude $x_{0}$ is orthogonal to the columns of $[B, AB, \cdots, A^{n-1}B]$ and in view of our complete controllability assumption and condition (1) we conclude $x_{0}=0$. But $x_{0}$ was arbitrary in $E_{0}$, hence $E_{0}=\{0\}$. By LaSalle’s theorem [4] all the trajectories in $E$ converge to $E_{0}$ as $t\rightarrow\infty$ which concludes the proof.
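The feedback of Theorem 3. 1 is easy to compute for a concrete system. The sketch below is a rough illustration, not a production routine: `expm_taylor`, `gramian` and `controllability_matrix` are hypothetical helpers written for this example, the integral in condition (2) is approximated by the trapezoid rule, and the double integrator with $\epsilon=1$ is an assumed test case. For that case the integral equals the matrix with rows $(1/3, -1/2)$ and $(-1/2, 1)$, and $A_{\epsilon}$ has rows $(0, 1)$ and $(-6, -4)$, with eigenvalues $-2\pm i\sqrt{2}$.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    X = M / (2.0 ** squarings)
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ X / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def gramian(A, B, eps, nodes=2001):
    """Trapezoid approximation of the integral of e^{-At} B B* e^{-A*t} over [0, eps]."""
    ts = np.linspace(0.0, eps, nodes)
    h = ts[1] - ts[0]
    W = np.zeros((A.shape[0], A.shape[0]))
    for i, t in enumerate(ts):
        V = expm_taylor(-A * t) @ B
        weight = 0.5 if i in (0, nodes - 1) else 1.0
        W += weight * h * (V @ V.T)
    return W

def controllability_matrix(A, B):
    """Block matrix [B, AB, ..., A^{n-1}B] from condition (1)."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

# Double integrator (example data assumed for this sketch).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
eps = 1.0

W = gramian(A, B, eps)
A_eps = A - B @ B.T @ np.linalg.inv(W)           # closed-loop matrix of Theorem 3. 1
rank = np.linalg.matrix_rank(controllability_matrix(A, B))
eigs = np.linalg.eigvals(A_eps)
print(rank, W, A_eps, eigs)
```

Varying `eps` in this sketch gives a family of stabilizing feedbacks, one for each $\epsilon>0$, as the theorem asserts.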
4. Optimal open-loop control on a semi-infinite interval.
In order to state the analogue of Theorem 1. 3 on $[0, \infty)$ we make the following simplifying assumptions and definitions. The control process in $R^{n}$
(4. 1) $\dot{x}=Ax+Bu$
with $x(0)=x_{0}$ is assumed to be stationary and stabilizable. For the system operating on $[0, t_{1}]$ we consider the cost functional
$C^{t_{1}}(u)=\int_{0}^{t_{1}}\left(\begin{array}{l}x\\ u\end{array}\right)\cdot\left(\begin{array}{ll}\mathfrak{A} & \mathfrak{C}\\ \mathfrak{C}^{*} & \mathfrak{B}\end{array}\right)\left(\begin{array}{l}x\\ u\end{array}\right)dt$
for $u\in L_{2}(0, t_{1})$ where we allow $t_{1}\leqq\infty$ but require $\left(\begin{array}{ll}\mathfrak{A} & \mathfrak{C}\\ \mathfrak{C}^{*} & \mathfrak{B}\end{array}\right)>0$ and stationary. For $t_{1}<\infty$ we let $u_{*}^{t_{1}}(t)$ and $x_{*}^{t_{1}}(t)$ denote the optimal control and corresponding trajectory of (4. 1) described by Theorem 1. 3. In terms of the matrix $P_{\infty}$ of Theorem 2. 1 we define matrices $D_{*}^{\infty}=-\mathfrak{B}^{-1}[\mathfrak{C}^{*}+B^{*}P_{\infty}]$ and $A_{*}^{\infty}=A+BD_{*}^{\infty}$.
Theorem 4. 1 For a stationary, stabilizable system in $R^{n}$
$\mathfrak{L})$ $\dot{x}=Ax+Bu$
on $[0, \infty)$ with $x(0)=x_{0}$:
(1) There exists a unique optimal open-loop control $u_{*}^{\infty}(t)$ in $L_{2}(0, \infty)$ of class $C^{\omega}$ given by the formula
$u_{*}^{\infty}(t)=D_{*}^{\infty}e^{tA_{*}^{\infty}}x_{0}$,
(2) $u_{*}^{\infty}(t)$ can be synthesized by the closed-loop control $D_{*}^{\infty}x$, that is, $u_{*}^{\infty}(t)=D_{*}^{\infty}x_{*}^{\infty}(t)$ where $x_{*}^{\infty}(t)$ is the solution of $\mathfrak{L}$) with $u=D_{*}^{\infty}x$,
(3) $A_{*}^{\infty}$ is a stability matrix and the optimal value of the cost functional is given by the positive definite quadratic form $C^{\infty}(u_{*}^{\infty})=x_{0}\cdot P_{\infty}x_{0}$ which provides a corresponding Lyapunov function, and
(4) on every finite interval $[0, T]$, $|u_{*}^{\infty}(t)-u_{*}^{t_{1}}(t)|\rightarrow 0$ and $|x_{*}^{\infty}(t)-x_{*}^{t_{1}}(t)|\rightarrow 0$ both uniformly as $t_{1}\rightarrow\infty$ and $C^{t_{1}}(u_{*}^{t_{1}})\uparrow C^{\infty}(u_{*}^{\infty})$.
We pointed out in the proof of Theorem 2. 1 that $A_{*}^{\infty}$ is a stability matrix. It follows that the control function $u_{*}^{\infty}(t)$ defined in (1) is in $L_{2}(0, \infty)$. To prove $|x_{*}^{\infty}(t)-x_{*}^{t_{1}}(t)|\rightarrow 0$ uniformly on a finite interval $[0, T]$ as $t_{1}\rightarrow\infty$ we recall the differential equations defining $x_{*}^{\infty}(t)$ and $x_{*}^{t_{1}}(t)$,
$\dot{x}_{*}^{\infty}(t)=A_{*}^{\infty}x_{*}^{\infty}(t)$
$\dot{x}_{*}^{t_{1}}(t)=A_{*}^{t_{1}}(t)x_{*}^{t_{1}}(t)$
where $A_{*}^{\infty}=A-B\mathfrak{B}^{-1}[\mathfrak{C}^{*}+B^{*}P_{\infty}]$
$A_{*}^{t_{1}}(t)=A-B\mathfrak{B}^{-1}[\mathfrak{C}^{*}+B^{*}P^{t_{1}}(t)]$
and $P^{t_{1}}(t)$ solves (1. 2) with $P^{t_{1}}(t_{1})=0$. Note that $||A_{*}^{\infty}-A_{*}^{t_{1}}(t)||\leqq||B\mathfrak{B}^{-1}B^{*}||\,||P_{\infty}-P^{t_{1}}(t)||$. From Theorem 2. 1, $P^{t_{1}}(t)\uparrow P_{\infty}$ as $t\downarrow-\infty$ and by the autonomous nature of (1. 2) it follows that $P^{t_{1}}(t)\uparrow P_{\infty}$ as $t_{1}\uparrow\infty$ and hence $||P_{\infty}-P^{t_{1}}(t)||\downarrow 0$ on $[0, T]$ as $t_{1}\uparrow\infty$. By Dini’s theorem [5] we conclude $||P_{\infty}-P^{t_{1}}(t)||$ and hence $||A_{*}^{\infty}-A_{*}^{t_{1}}(t)||$ converge to zero both uniformly on $[0, T]$ as $t_{1}\rightarrow\infty$. Using the fundamental inequality [2] to estimate $|x_{*}^{\infty}(t)-x_{*}^{t_{1}}(t)|$ we can verify that $|x_{*}^{\infty}(t)-x_{*}^{t_{1}}(t)|\rightarrow 0$ uniformly on $[0, T]$ as $t_{1}\rightarrow\infty$. Then the easily attained estimate
$|u_{*}^{\infty}(t)-u_{*}^{t_{1}}(t)|\leqq||D_{*}^{\infty}||\,|x_{*}^{\infty}(t)-x_{*}^{t_{1}}(t)|+||\mathfrak{B}^{-1}B^{*}||\,||P_{\infty}-P^{t_{1}}(t)||\,|x_{*}^{t_{1}}(t)|$
shows that $|u_{*}^{\infty}(t)-u_{*}^{t_{1}}(t)|\rightarrow 0$ uniformly on $[0, T]$ as $t_{1}\rightarrow\infty$.
To prove that $C^{t_{1}}(u_{*}^{t_{1}})\uparrow C^{\infty}(u_{*}^{\infty})$ as $t_{1}\uparrow\infty$ and that $u_{*}^{\infty}(t)$ is optimal, let $u\in L_{2}(0, \infty)$ be arbitrary except for the restriction that $C^{\infty}(u)<\infty$. Denote the corresponding solution of $\mathfrak{L}$) by $x(t)$. Then $\int_{0}^{t_{1}}|u(\tau)|^{2}d\tau<\infty$ for every $t_{1}>0$ and by Theorem 1. 3
$0\leqq x_{0}\cdot P^{t_{1}}(0)x_{0}=\int_{0}^{t_{1}}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt\leqq\int_{0}^{t_{1}}G(x, u)dt$.
Since the limit of each term exists
$0\leqq x_{0}\cdot P_{\infty}x_{0}=\lim_{t_{1}\rightarrow\infty}\int_{0}^{t_{1}}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt\leqq\int_{0}^{\infty}G(x, u)dt=C^{\infty}(u)<\infty$.
Thus to see $u_{*}^{\infty}(t)$ is optimal and that $x_{0}\cdot P_{\infty}x_{0}=C^{\infty}(u_{*}^{\infty})=\lim_{t_{1}\rightarrow\infty}C^{t_{1}}(u_{*}^{t_{1}})$ all we need do is show that
$\lim_{t_{1}\rightarrow\infty}\int_{0}^{t_{1}}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt=\int_{0}^{\infty}G(x_{*}^{\infty}, u_{*}^{\infty})dt$.
By the optimality of $u_{*}^{t_{1}}(t)$ on $[0, t_{1}]$
$\int_{0}^{t_{1}}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt\leqq\int_{0}^{t_{1}}G(x_{*}^{\infty}, u_{*}^{\infty})dt$.
Hence
$\lim_{t_{1}\rightarrow\infty}\int_{0}^{t_{1}}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt\leqq\int_{0}^{\infty}G(x_{*}^{\infty}, u_{*}^{\infty})dt$.
To verify the reverse inequality consider $0\leqq T\leqq t_{1}$. Then
$\int_{0}^{t_{1}}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt\geqq\int_{0}^{T}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt$
and since $G(x_{*}^{t_{1}}, u_{*}^{t_{1}})$ is uniformly convergent on $[0, T]$
$\int_{0}^{T}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt\rightarrow\int_{0}^{T}G(x_{*}^{\infty}, u_{*}^{\infty})dt$
as $t_{1}\rightarrow\infty$ with $T$ fixed. Next letting $T\rightarrow\infty$ we have
$\lim_{t_{1}\rightarrow\infty}\int_{0}^{t_{1}}G(x_{*}^{t_{1}}, u_{*}^{t_{1}})dt\geqq\int_{0}^{\infty}G(x_{*}^{\infty}, u_{*}^{\infty})dt$.
This completes the proof of the optimality of $u_{*}^{\infty}$ and the equations
$x_{0}\cdot P_{\infty}x_{0}=C^{\infty}(u_{*}^{\infty})=\lim_{t_{1}\rightarrow\infty}C^{t_{1}}(u_{*}^{t_{1}})$.
The monotone convergence follows from the formula
$C^{t_{1}}(u_{*}^{t_{1}})=x_{0}\cdot P^{t_{1}}(0)x_{0}$
and the fact $P^{t_{1}}(0)\uparrow P_{\infty}$ as $t_{1}\uparrow\infty$ discussed above. The formula in (1) can easily be verified using the equations of (2).
We now establish the uniqueness. By Lemma 1. 1
$0\leqq[Ax+Bu]\cdot\frac{\partial}{\partial x}[x\cdot P_{\infty}x]+G(x, u)$
for all $(x, u)\in R^{n}\times R^{r}$ and strict inequality holds precisely on the subset of $R^{n}\times R^{r}$, $[(x, u):u\neq D_{*}^{\infty}x]$. Suppose for some initial $x_{0}$ there exists another optimal control $\tilde{u}_{*}^{\infty}$ in $L_{2}(0, \infty)$ and then the subset in $[0, \infty)$, $[t:u_{*}^{\infty}(t)-\tilde{u}_{*}^{\infty}(t)\neq 0]$ has positive measure. We denote the corresponding trajectory by $\tilde{x}_{*}^{\infty}(t)$. In particular
$0\leqq[A\tilde{x}_{*}^{\infty}+B\tilde{u}_{*}^{\infty}]\cdot\frac{\partial}{\partial\tilde{x}_{*}^{\infty}}[\tilde{x}_{*}^{\infty}\cdot P_{\infty}\tilde{x}_{*}^{\infty}]+G(\tilde{x}_{*}^{\infty}, \tilde{u}_{*}^{\infty})$
where strict inequality holds precisely on the subset of $[0, \infty)$, $[t:\tilde{u}_{*}^{\infty}(t)\neq D_{*}^{\infty}\tilde{x}_{*}^{\infty}(t)]$. By the same argument used in the uniqueness proof of Theorem 1. 3 this set can be shown to have positive measure. Therefore integration of the inequality shows
$0<\int_{0}^{\infty}\{\frac{d}{dt}[\tilde{x}_{*}^{\infty}\cdot P_{\infty}\tilde{x}_{*}^{\infty}]+G(\tilde{x}_{*}^{\infty}, \tilde{u}_{*}^{\infty})\}dt$.
But $0\leqq\int_{0}^{\infty}G(\tilde{x}_{*}^{\infty}, \tilde{u}_{*}^{\infty})dt<\infty$ since $\tilde{u}_{*}^{\infty}$ is optimal and $C^{\infty}(u_{*}^{\infty})<\infty$. Therefore
$\int_{0}^{\infty}\frac{d}{dt}[\tilde{x}_{*}^{\infty}\cdot P_{\infty}\tilde{x}_{*}^{\infty}]dt$
exists and is finite, hence,
$0<\lim_{t_{1}\rightarrow\infty}\tilde{x}_{*}^{\infty}(t_{1})\cdot P_{\infty}\tilde{x}_{*}^{\infty}(t_{1})-x_{0}\cdot P_{\infty}x_{0}+\int_{0}^{\infty}G(\tilde{x}_{*}^{\infty}, \tilde{u}_{*}^{\infty})dt$
which says $C^{\infty}(u_{*}^{\infty})<\lim_{t_{1}\rightarrow\infty}\tilde{x}_{*}^{\infty}(t_{1})\cdot P_{\infty}\tilde{x}_{*}^{\infty}(t_{1})+C^{\infty}(\tilde{u}_{*}^{\infty})$. Since $\tilde{u}_{*}^{\infty}$ and $u_{*}^{\infty}$ are both optimal $C^{\infty}(u_{*}^{\infty})=C^{\infty}(\tilde{u}_{*}^{\infty})$ and since $P_{\infty}>0$, $\lim_{t_{1}\rightarrow\infty}\tilde{x}_{*}^{\infty}(t_{1})\neq 0$. But this implies $G(\tilde{x}_{*}^{\infty}(t), \tilde{u}_{*}^{\infty}(t))$ does not converge to zero as $t\rightarrow\infty$ but it does converge. We conclude that $C^{\infty}(\tilde{u}_{*}^{\infty})=\infty$ which is a contradiction and the uniqueness is established.
Remark. We can now prove the uniqueness part of Theorem 2. 1. Suppose there is another symmetric positive definite solution $\tilde{P}_{\infty}$ to the Kalman matrix equation. Define
$\tilde{A}_{*}^{\infty}=A-B\mathfrak{B}^{-1}[\mathfrak{C}^{*}+B^{*}\tilde{P}_{\infty}]$
$\tilde{D}_{*}^{\infty}=-\mathfrak{B}^{-1}[\mathfrak{C}^{*}+B^{*}\tilde{P}_{\infty}]$.
For each $x_{0}\in R^{n}$ let $\tilde{u}_{*}^{\infty}(t)=\tilde{D}_{*}^{\infty}\tilde{x}^{\infty}(t)$ where $\tilde{x}^{\infty}(t)$ solves
$\dot{\tilde{x}}^{\infty}=\tilde{A}_{*}^{\infty}\tilde{x}^{\infty}=A\tilde{x}^{\infty}+B(\tilde{D}_{*}^{\infty}\tilde{x}^{\infty})$
$\tilde{x}^{\infty}(0)=x_{0}$, $0\leqq t<\infty$.
Using the same arguments used in proving Theorem 4. 1 we can prove that $\tilde{u}_{*}^{\infty}(t)$ is optimal and that $x_{0}\cdot\tilde{P}_{\infty}x_{0}=C^{\infty}(\tilde{u}_{*}^{\infty})$. Therefore $x_{0}\cdot\tilde{P}_{\infty}x_{0}=x_{0}\cdot P_{\infty}x_{0}$ for all $x_{0}\in R^{n}$ which implies $\tilde{P}_{\infty}=P_{\infty}$.
References
[1] Kalman, R. E., “Contributions to the theory of optimal control”, Bol. Soc. Mat. Mexicana, 1960, pp. 102-119.
[2] Coddington, E. A. and Levinson, N., Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.
[3] Hartman, P., Ordinary Differential Equations, Wiley, New York, 1964.
[4] LaSalle, J. and Lefschetz, S., Stability by Liapunov’s Direct Method, Academic Press, New York, 1961.
[5] Apostol, T. M., Mathematical Analysis, Addison-Wesley, Reading, Massachusetts, 1957.
(Received February 15, 1968)