

To appear in Mathematics Unlimited - 2001 and Beyond, Springer

New Issues in the Mathematics of Control

Roger Brockett

Abstract

This is an essay on mathematical control theory. We begin with a detailed, but fast-paced, review of some past work in control emphasizing an operator point of view. After a discussion of recent developments relating to topics ranging from control of chaos to adaptation and learning, we go on to argue that many of the especially interesting and challenging new problems on the horizon are coming from three areas: the development of intelligent machines, the physics of measurement processes, and the development of system theoretic interpretations of processes found in the life sciences. While we have tried to keep the discussion both mathematical and forward looking, the reader will not find here a list of crisp, predigested mathematical problems. Instead, we offer suggestions as to where the hunting might be especially rewarding.

1 Introduction

The field of automatic control has emerged over the last century as an indispensable part of technology and an important element in the conceptualization of scientific ideas in a variety of fields. Because of its wide applicability, many of its mathematical aspects have been explored broadly and refined to a high degree. Other avenues are just now beginning to be explored. As we enter the 21st century it seems appropriate to assess the progress that has been made and discuss what may lie in store. In this paper we review the applicability of existing theory in areas such as robotics, physics and biology and discuss the kinds of advances that would help make it possible for the subject to reach its full potential as an aid to science and technology.

For non-specialists, it may be useful to have in mind a few 20th century problems to use as points of departure. We attach a key to each so as to make it easy to refer back to an item as needed. There is no all-encompassing reference for the topics touched on here, but the general references Åström and Wittenmark [1], Polderman and Willems [2] and Sontag [3] cover considerable ground.

A. Regulator Problems: Consider a variable, or set of variables, associated with a dynamical system. They are to be maintained at some desired values in the face of changing circumstances. There exists a second set of parameters that can be adjusted so as to achieve the desired regulation. The effecting variables are usually called inputs and the affected variables are called outputs. Specific examples include the regulation of the thrust of a jet engine by controlling the flow of fuel, the regulation of the oxygen content of the blood using the respiratory rate, and the control of a beam of particles in a particle accelerator by modulating magnetic fields.

B. End Point Control Problems: There are inputs, outputs and trajectories, as above. In this case the shape of the trajectory is not of great concern; rather it is the end point that is of primary importance. Standard examples include rendezvous problems such as one has in space exploration, batch processing in chemical engineering in which reactants are introduced and the process controlled in such a way as to make a desired product with a specified purity, control of nuclear spins in nuclear magnetic resonance through the application of magnetic fields and radio frequency pulses, and the control of an electron beam in such a way as to have it hit the inside of a television tube at a location specified by the signal received by the antenna.

C. Servomechanism Problems: There are inputs, outputs and trajectories, as above, and an associated dynamical system. In this case, however, it is desired to cause the outputs to follow a trajectory specified by the input. Examples include the control of a milling machine so that it will remove metal along the path specified by the blueprints, the control of an airplane so that it will travel along the flight path specified by the flight controller, and the control of a telescope so that it will follow the apparent motion of a star as seen from earth.


D. Repetitive Mode Problems: Again one has some variable, or set of variables, associated with a dynamical system and some inputs which influence its evolution. The task has elements which are repetitive and are to be done efficiently. A common engineering example is the control of the intake-compression-combustion-exhaust cycle of an internal combustion engine. Examples from biology include the control of respiratory processes, control of the pumping action of the heart, and control of successive trials in practicing an athletic event.

Of course this taxonomy would need to be extended and refined to bring out a full picture of the field, but this starting point provides enough detail for our present purposes.

2 Systems as Operators

For us, a system is something that has inputs and outputs, coupled by dynamics. It is a common engineering idea which facilitates the design of television sets, jet engines, space stations, etc. The study of systems differs from the study of fields such as celestial mechanics in that autonomous behavior is only part of what is of interest; of equal or more importance is the way the system maps exogenous inputs to the effects, or outputs as they are usually called. Mathematically speaking, this lies in a domain where the theory of differential equations of the evolutionary type meets functional analysis.

Because systems have inputs and outputs there is the possibility of thinking of them as operators and making use, where possible, of composition and inversion as suggested by work on various operator algebras. Indeed, the composition of systems, inversion of systems, and the solvability of operator equations of this type immediately arise when one analyzes feedback systems (see below). This point of view has played an important role in many aspects of control theory and is basic to the systems point of view. Control theory is most interesting in those cases where the influence of the inputs on the system is too significant to be ignored but amounts to something less than complete domination!

A very important class of models for systems, ubiquitous in engineering practice, is the finite dimensional stationary linear model. It will be discussed first because such systems frequently serve as a baseline against which other systems are measured. We follow a common notational convention and let u denote the vector of inputs, y the vector of outputs, and assume that they can be related through an intermediary state variable x according to

ẋ = Ax + Bu ;  y = Cx

We refer to this as the deterministic stationary linear model. The stationary linear system defines a variety of operators. In particular, there is an operator corresponding to each of the basic problems defined in the introduction.

A' The Steady State Operator: First of all, there is the steady state operator, of particular relevance for the regulator problem. It is

y∞ = −CA^{-1}Bu∞

which describes the map from constant values of u to the equilibrium value of y. It is defined whenever A is invertible, but the steady state value will only be achieved by a real system if, in addition, the eigenvalues of A have negative real parts. Only when the rank of CA^{-1}B equals the dimension of y can we steer y to an arbitrary steady state value and hold it there with a constant u. (A nonlinear version of this problem plays a central role in robotics, where it is called the "inverse kinematics problem". See, for example, Murray et al. [4].)
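As a concrete numerical illustration, here is a minimal sketch of the steady state operator; the matrices A, B, C and the input level are hypothetical, chosen only for the example.

```python
import numpy as np

# Steady-state operator y_inf = -C A^{-1} B u_inf for a hypothetical system.
A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])        # eigenvalues -1, -2: negative real parts
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# The steady state is actually reached only if A is a stability matrix.
assert np.all(np.linalg.eigvals(A).real < 0)

u_inf = np.array([3.0])            # constant input
y_inf = -C @ np.linalg.solve(A, B @ u_inf)
print(y_inf)                       # equilibrium value of the output
```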

B’ The End-Point Adjustment Operator: The second operator, of relevance for the endpoint control problem, is the operator

x(T ) =∫ T

0

eA(T−σ)Bu(σ)dσ

2

Page 3: New Issues in the Mathematics of Control

If we consider this to define a map from the m-dimensional L2 space Lm2 [0, T ] (where u takes onits values) into Rm then, if it is an onto map, it has a Moore-Penrose (least squares) inverse

u(σ) = BT eAT (T−σ)(W [0, T ])−1

(x(T )− eATx(0)

)with the symmetric positive definite matrix W , the controllability Gramian, being given by

W [0, T ] =∫ T

0

eA(T−σ)BBT eAT (T−σ)dσ
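Numerically, the Gramian and the least squares inverse are straightforward to realize. The sketch below computes W[0, T] by quadrature for an illustrative double integrator and evaluates the minimum-energy input steering x(0) = 0 to a hypothetical target x(T).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (illustrative)
B = np.array([[0.0], [1.0]])
T = 1.0
x0 = np.zeros(2)
xT = np.array([1.0, 0.0])                # hypothetical endpoint target

def integrand(s):
    M = expm(A * (T - s)) @ B
    return M @ M.T

W, _ = quad_vec(integrand, 0.0, T)       # controllability Gramian W[0, T]
eta = np.linalg.solve(W, xT - expm(A * T) @ x0)

def u(s):
    # u(s) = B^T e^{A^T (T - s)} W^{-1} (x(T) - e^{AT} x(0))
    return (B.T @ expm(A.T * (T - s)) @ eta).item()

print(u(0.0), u(T / 2), u(T))            # the minimum-energy input profile
```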

C’ The Servomechanism Operator: The third operator of interest is operator relevantfor the servomechanism problem. Because we have assumed that A, B, and C are constant

y(t) = CeAtx(0) +∫ t

0

CeA(t−τ)Bu(τ)dτ

and of course the Laplace transform can be used to convert convolution to multiplication. Thisbrings out the significance of the Laplace transform pair

CeAtB ⇐⇒ C(Is−A)−1B

as a means of characterizing the input-output map of a linear model with constant coefficients.D’ The Repetitive Mode Operator: This operator is similar to the servomechanism

operator however the constraint that u and x are periodic means that the relevant diagonalizationis provided by Fourier series, rather than the Laplace transform. Thus, in the Fourier domain,we are interested in a set of complex matrices

G(iωi) = C(Iiωi −A)−1B ; ωi = 0, ω0, 2ω0, ...

More general, but still deterministic, models of the input-state-output relation are affordedby the nonlinear affine model

x(t) = f(x(t)) +G(x(t))u(t) ; y(t) = h(x(t))

and the still more general fully nonlinear model

x(t) = f(x(t), u(t)) ; y(t) = h(x(t))
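For the repetitive mode operator, the matrices G(iωi) can be evaluated directly. The sketch below reuses the hypothetical matrices of the earlier sketch and an illustrative fundamental frequency ω0.

```python
import numpy as np

A = np.array([[-1.0, 1.0], [0.0, -2.0]])   # illustrative matrices
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def G(s):
    # Evaluate the transfer function C (Is - A)^{-1} B at a complex point s.
    return C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)

omega0 = 2 * np.pi                          # fundamental frequency of the input
for k in range(4):
    print(k, G(1j * k * omega0))            # G(0), G(i w0), G(2i w0), ...
```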

2.1 Feedback Changes the Operator

No idea is more central to automatic control than the idea of feedback. When an input is altered on the basis of the difference between the actual output of the system and the desired output, the system is said to involve feedback. Man made systems are often constructed by starting with a basic element such as a motor, a burner, a grinder, etc. and then adding sensors and the hardware necessary to use the measurements generated by the sensors to regulate the performance of the basic element. This is the essence of feedback control. Feedback is often contrasted with open loop systems in which the inputs to the basic element are determined without reference to any measurement of the trajectories. When the word feedback is used to describe naturally occurring systems, it is usually implicit that the behavior of the system can best be explained by pretending that it was designed as one sees man made systems being designed.

Feedback control was used by Airy to control the movement of telescopes in 1840, and was the subject of an analysis of a regulator using a fly-ball governor by J. C. Maxwell published in 1868. In the context of linear systems, the effect of feedback is easily described. If we start with

ẋ = Ax + Bu ;  y = Cx

with u being the controls and y being the measured quantities, then the effect of feedback is to replace u by u − Ky with K being a matrix of feedback gains. The closed-loop equations are then

ẋ = (A − BKC)x + Bu ;  y = Cx


Expressed in terms of the Laplace transform pairs introduced previously, feedback effects the transformation

( C e^{At} B ; C(Is − A)^{-1} B ) ↦ ( C e^{(A−BKC)t} B ; C(Is − A + BKC)^{-1} B )

Using such a transformation, it is possible to alter the dynamics of a system in a significant way. The modifications one can effect by feedback include influencing the location of the eigenvalues and consequently the stability of the system. In fact, if K is m by p and if we wish to select a gain matrix K so that A − BKC has eigenvalues λ1, λ2, ..., λn, it is necessary to insure that

det [ C(Iλi − A)^{-1}B   −I
      I                  K  ] = 0 ;  i = 1, 2, ..., n

Such families of determinantal equations lie at the heart of the Schubert calculus. In the special case in which the dimension of A matches the number of entries in K (n = mp), this problem is exactly the problem of finding an m-plane in (m + p)-space that intersects n given p-planes. More than one hundred years ago Schubert reasoned that, if one allows complex solutions, there are generically

δ = ( 1! 2! ⋯ (p−1)! (mp)! ) / ( m! (m+1)! ⋯ (m+p−1)! )

solutions to this problem. (See [5] for more details.)
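The determinantal condition is easy to verify numerically. The following sketch, with hypothetical matrices and a hypothetical gain K, checks that each eigenvalue of A − BKC makes the block determinant vanish (up to roundoff).

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # illustrative data (m = p = 1)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[4.0]])

for lam in np.linalg.eigvals(A - B @ K @ C):
    Glam = C @ np.linalg.solve(lam * np.eye(2) - A, B)
    M = np.block([[Glam, -np.eye(1)],
                  [np.eye(1), K]])
    print(lam, np.linalg.det(M))           # determinant is numerically zero
```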

2.2 Inversion Solves Some Problems

If we wish to force the output of a system to follow a desired path, the obvious thing to do is to try to invert the operator and use the inverse to find the input that gives rise to the desired output. This philosophy is especially prevalent in the control of industrial robots. Although practical difficulties, to be discussed below, often frustrate this direct approach, it is a logical starting point.

Linear Systems: In the case of the stationary linear model introduced above, if CB is invertible then we can use the relationship Cẋ = CAx + CBu together with y = Cx to write ẏ = CAx + CBu. This lets us solve for u and recast the system as

ẋ = (A − B(CB)^{-1}CA)x + B(CB)^{-1}ẏ ;  u = (CB)^{-1}ẏ − (CB)^{-1}CAx

Here we have a set of equations in which the roles of u and y are reversed. They show how a choice of y determines x and how x determines u. A satisfactory mathematical description requires that we introduce a little more notation. Let W^{(k)}(R^m)[0, T] denote the set of R^m-valued functions that are square integrable along with their first k derivatives. A linear system defines a bounded map with a bounded inverse only if we are careful about the way in which the input and output spaces are normed.

Taking a particularly easy case, we may say that if dim u = dim y = m and if CB is nonsingular, then for each x(0), the system defines a map from the space C^{(0)}(R^m)[0, T] to C^{(1)}(R^m)[0, T]. The inverse defines a map in the opposite direction. The forward map can be extended in a unique way so as to define a map from the Sobolev space W^{(0)}(R^m)[0, T] to the Sobolev space W^{(1)}(R^m)[0, T] and the inverse can be extended to a map in the opposite direction. Both the forward map and the inverse map are bounded and onto maps.
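A short sketch of the construction follows; the system matrices and the desired output y(t) = sin t are illustrative, chosen so that CB is invertible and x(0) is consistent with y(0).

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-1.0, -1.0]])   # illustrative system with CB = 1
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])
CBinv = np.linalg.inv(C @ B)

ydot = lambda t: np.cos(t)                 # derivative of the desired y = sin t

def inverse_system(t, x):
    # xdot = (A - B (CB)^{-1} C A) x + B (CB)^{-1} ydot
    return (A - B @ CBinv @ C @ A) @ x + B @ CBinv @ np.atleast_1d(ydot(t))

sol = solve_ivp(inverse_system, (0.0, 5.0), np.zeros(2), dense_output=True)
x5 = sol.sol(5.0)
u5 = CBinv @ (np.atleast_1d(ydot(5.0)) - C @ A @ x5)
print(u5)   # the input that makes the output follow y = sin t at t = 5
```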

Because each of the three problems discussed in the introduction is concerned with forcing y to behave in a certain way, it would seem that this set of steps, telling us how we should choose u so as to make y follow a certain path, would play a central role in control. However, this is only partly true. In the first place, CB is typically not invertible; even when it is, y may not be differentiable; and even if both these conditions are met, u may be constrained in amplitude, rate of change of amplitude, etc., making it impossible to use inverses in a naive way.


2.3 Approximate Inversion Using Pattern Generation

The results of the previous section can be extended to some classes of nonlinear systems without much qualitative change. However, there are classes of nonlinear systems that exhibit very different behavior, and some of the most interesting and inventive applications of inversion involve situations that are of this type. Consider the system with two inputs and three outputs

ẋ1 = u1 ;  ẋ2 = u2 ;  ẋ3 = x1u2 − x2u1

The dynamics are such that we can express the relationships between the inputs and outputs in integral form

x1(t) − x1(0) = ∫_0^t u1(σ) dσ ;  x2(t) − x2(0) = ∫_0^t u2(σ) dσ ;

x3(t) − x3(0) = ∫_0^t ( x1(σ)u2(σ) − x2(σ)u1(σ) ) dσ

Because this system will play a continuing role in the exposition we reinforce these equations with a block diagram.

[Figure 1 shows a block diagram in which the inputs u and v feed three integrators (1/s) producing the states x, y and z, the third integrator being driven by the difference of the cross products of the states and inputs.]

Figure 1. The input-output relationships for the example.

Thought of as a system with two inputs and three outputs, one can ask which trajectories are achievable through choices of u. Of course, x1 and x2 are related to u1 and u2 in a completely straightforward way; x3 is best thought of in terms of the geometry of the curve in the (x1, x2)-plane. In view of the standard relationship for integration around closed curves,

2 · area = ∫_γ x dy − y dx

we may say that the change in x3(t) is twice the (signed) area in the (x1, x2)-plane defined by the curve (x1(·), x2(·)), closed by the chord joining (x1(t), x2(t)) to (x1(0), x2(0)). In this sense x3 is sensitive to the area, whereas x1 and x2 are sensitive only to the integrals of the individual inputs. With a view toward constructing a servomechanism type inverse for this system we consider letting u1 and u2 take the form

u1(t) = ȧ(t) + f(t) sin ωt ;  u2(t) = ḃ(t) + g(t) cos ωt

In this case we have (taking a(0) = b(0) = 0)

x1(t) = x1(0) + a(t) + ∫_0^t f(τ) sin ωτ dτ

x2(t) = x2(0) + b(t) + ∫_0^t g(τ) cos ωτ dτ

x3(t) = x3(0) + ∫_0^t ( x1(τ)u2(τ) − x2(τ)u1(τ) ) dτ

If we regard f and g as being fixed, the Riemann-Lebesgue lemma implies that, as ω becomes large, we have

x1(t) ≈ x1(0) + a(t)

x2(t) ≈ x2(0) + b(t)

x3(t) ≈ x3(0) + ∫_0^t ( a(τ)ḃ(τ) − b(τ)ȧ(τ) ) dτ − (1/ω) ∫_0^t f(τ)g(τ) dτ

Thus, if we choose the product f(t)g(t) so that −f(t)g(t)/ω = ċ(t) − ( a(t)ḃ(t) − b(t)ȧ(t) ), we will succeed in forcing x1, x2, x3 to follow the path defined by the functions a, b, c with an error that can be made vanishingly small by letting ω be large. More precisely, we can approximate any differentiable path in R³ in the C⁰[0, T] topology through a choice of inputs of the above form. This system is a prototype for a large class of systems with similar properties as shown by Liu and Sussmann [6].
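A simulation makes the area mechanism visible. In the sketch below (with illustrative amplitudes and frequency) the slowly varying parts a and b are set to zero, so any drift in x3 comes entirely from the sinusoidal pattern.

```python
import numpy as np
from scipy.integrate import solve_ivp

w, f, g = 50.0, 5.0, 5.0          # illustrative frequency and amplitudes

def rhs(t, x):
    u1 = f * np.sin(w * t)        # a = 0: purely oscillatory input
    u2 = g * np.cos(w * t)        # b = 0
    return [u1, u2, x[0] * u2 - x[1] * u1]

sol = solve_ivp(rhs, (0.0, 2.0), [0.0, 0.0, 0.0], max_step=1e-3)
# x1, x2 remain O(1/w), while x3 drifts at the rate -f*g/w (about -0.5 here).
print(sol.y[:, -1])
```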

The approximate inverse has a block diagram representation as shown below. The reason for reinterpreting the equations in this schematic form is to make closer contact with the way in which pattern generation is thought of in a biological context. (See Brockett [7].)

[Figure 2 shows a block diagram in which the commanded signals a, b, c, differentiators (s), a gain ω, and modulating sinusoids sin(ωt+φ) and sin(ωt−φ) are combined to produce the inputs u and v.]

Figure 2. Illustrating the role of sinusoidal patterns in generating an approximate inverse.

The most important aspect of the approximate inverse is that it makes use of a temporal pattern, the sinusoidal variation appearing in the diagram, to accomplish its task. The phase of this variation is arbitrary. A selection of a particular phase can be said to break the symmetry of the time axis. Both in naturally occurring systems and in man made systems there are many examples of cyclic mechanisms that orchestrate the flow of material, transform energy, etc. At the present time we have no comprehensive theory that deduces the need for, or relative advantage of, such cyclic processes from first principles. We see pattern generation as a method of reducing the higher level attention needed to execute tasks. This applies to walking, for example, but more impressively to the locomotion of snakes, as in the paper by Krishnaprasad and Tsakiris [8].

3 Trajectory Optimization

Trajectory optimization problems can be thought of as end point control problems, B-type problems in the taxonomy we have adopted here. These are usually thought of as open-loop problems; feedback plays, at best, a secondary role unless the uncertainties are large and "mid-course corrections" are required. Ordinarily one thinks of these tasks as coming to completion after some finite time. However, infinite time versions are also of practical interest and we will discuss them in subsection 3.3.

3.1 The Spacecraft Paradigm

The search for minimum fuel trajectories for spacecraft received considerable attention in connection with putting spacecraft in earth orbit and in the planning stages of the Apollo project. Quite remarkable fuel efficient trajectories which utilize the gravitational fields of several planets to reach the outer planets are now under development. These are problems in the calculus of variations, usually reducible to the solution of a system of ordinary differential equations with split boundary conditions. When important elements of reality are incorporated, however, they take on a nonstandard character because of the necessity to impose inequality constraints on one or more of the variables. Trajectory optimization problems also arise in the optimization of flight paths of airplanes and in the mixing and heating of batch process chemical reactions. As control problems these are open loop problems; they are easy to conceptualize but are often quite difficult to solve. The classical calculus of variations is usually thought of as centering around the Euler-Lagrange equations and the determination of suitable boundary conditions for these equations. When inequalities are important, the classical calculus of variations does not suffice and it becomes necessary to invoke some suitable modification. The best known, but not the most general, of these is the maximum principle developed by L. S. Pontryagin and his coworkers in the late 1950's. The search for an ultimate maximum principle goes on; see the paper by Sussmann [9].

3.2 Exploiting Sensitivity, Instability and Chaos

One important way in which the work of E. N. Lorenz on his famous attractor affected the scientific community was to focus attention on systems of differential equations whose trajectories could be both bounded and exponentially unstable. This can be thought of as being responsible for the outpouring of papers on the "butterfly effect" and its various implications. Control theory has a modest contribution to make to the subject if we consider trajectory optimization in this context. If the trajectory under no control is unstable but bounded, then the effort required to make large changes in the trajectory can be quite small. Indeed, one can think of the new trajectories for lunar exploration developed by Belbruno [10] as being of this type. It is obvious that the high sensitivity to initial conditions characteristic of chaotic systems implies a high sensitivity to control as well. The discussion of the details has received some attention [11].

Consider a solution φ(t, x0) of a nonlinear differential equation

ẋ = f_d(x) ;  x(0) = x0

Let the linearization about the given solution be

δ̇(t) = (∂f_d/∂x)|_{φ(t)} δ(t) ≝ A(t)δ(t)

Suppose now that f depends on a control, f = f(x, u), with f(x, 0) = f_d(x). Then the linearization for small u is

δ̇(t) = A(t)δ(t) + (∂f/∂u)|_{φ(t)} u(t) = A(t)δ(t) + B(t)u(t)

Let Φ(t, τ) be the fundamental solution of δ̇ = Aδ. That is,

(d/dt) Φ(t, τ) = A(t)Φ(t, τ) ;  Φ(τ, τ) = I

Using this notation, the endpoint operator (B' of the introduction) takes the form

δ(T) = ∫_{t0}^T Φ(T, t)B(t)u(t) dt

and its Moore-Penrose inverse is just

u(t) = B^T(t)Φ^T(T, t) W^{-1}[t0, T] ( δ(T) − Φ(T, t0)δ(t0) )

The controllability Gramian in this time varying context is

W[t0, T] = ∫_{t0}^T Φ(T, τ)B(τ)B^T(τ)Φ^T(T, τ) dτ

If the system is unstable, this operator will have a norm that grows without bound as T increases, and so small changes in u can have very large effects. Our main point here is that a fairly complete understanding of these problems requires little more than a linearization about the unforced trajectory and an analysis of the Moore-Penrose inverse of the end point map. In Figure 3 we illustrate the idea of carrying along with a trajectory the set of points that are reachable from the starting condition, consistent with the constraint that the integral of u^T u is less than or equal to some small constant.

Figure 3. The evolution of the shape of the reachable set for a given limit on the integral of u^T u.

An aspect of the input-output behavior of chaotic dynamics that is perhaps of more significance relates to the ability of systems with unstable but bounded trajectories to greatly amplify signals without generating outputs that are so large that they cannot be processed. One engineering application of this idea has already found wide use in the one-bit digital-to-analog converter [12]. The so-called sigma-delta converters are designed so as to have unstable but bounded trajectories with no tendency to periodicity. More than a decade ago the neurobiologist Walter Freeman made an interesting suggestion about a possible application of systems with highly sensitive trajectories in biology. The core of the idea is that the amplification of olfactory stimuli might be explained as resulting from the high sensitivity associated with the instability of a chaotic system which is periodically reset to a standard initial condition by the breathing process.

3.3 Infinite Time Problems

As stated above, trajectory optimization leads to two-point boundary value problems, which are usually treated as open-loop problems. There is an important case in which it is possible to use a simple transformation to reduce the two-point boundary value problem to two initial value problems. This is possible when the underlying dynamics are linear and the performance measure is quadratic,

η = ∫_0^T ( x^T(t)L(t)x(t) + u^T(t)u(t) ) dt

In this case the problem can be solved by solving an initial value problem and, if there exists a purely time-invariant feedback control which drives the system along the optimal trajectory, its form will fall out of the trajectory optimization problem. The details are well known. If we can find a solution of the quadratic matrix equation

K̇ = −A^T K − KA − L + KBB^T K ;  K(T) = 0

then the feedback control law u = −B^T Kx results in a trajectory that minimizes η. The optimal value function v is

v(x(0)) = η_min = x^T(0)K(0)x(0)

In the limit as T goes to infinity, the optimal control law is time invariant.
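Numerically, the Riccati equation can be integrated backward from K(T) = 0 and compared with its algebraic steady state; in this sketch the matrices are illustrative and the horizon is long enough for the two to nearly agree.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # illustrative double integrator
B = np.array([[0.0], [1.0]])
L = np.eye(2)
T = 10.0

def riccati(t, k):
    K = k.reshape(2, 2)
    dK = -A.T @ K - K @ A - L + K @ B @ B.T @ K
    return dK.ravel()

sol = solve_ivp(riccati, (T, 0.0), np.zeros(4))   # integrate backward in t
K0 = sol.y[:, -1].reshape(2, 2)                   # K(0) for horizon T

K_inf = solve_continuous_are(A, B, L, np.eye(1))  # T -> infinity limit
print(K0)
print(K_inf)   # nearly identical here; u = -B^T K x either way
```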

4 Systems with Uncertainty

It is usually impossible to isolate a system from unwanted or unpredictable effects. Airplanes in flight are subjected to unexpected wind gusts, chemical plants must accommodate unanticipated changes in raw materials, and biological systems must adjust to changes in weather and other environmental conditions. If there were no uncertainty in the system, the control, or the environment, feedback control would be largely unnecessary. Of course there is always uncertainty, and hence models with uncertainty play an important role. These take many forms; we consider several common possibilities.


4.1 Models of Uncertainty

To fix ideas we consider several different, and rather general, classes of models for uncertainty.

Models with Random Parameters: Consider an equation of the form

ẋ(t) = f(x(t), u(t), α)

with α drawn at random from a known probability distribution.

Models with Random Inputs: Consider a stochastic differential equation of the diffusion type with control (stochastic differential equations here are to be interpreted in the Ito sense)

dx(t) = f(x(t))dt + g(x(t))dw + b(x(t))u dt

On the other hand, if the model is to capture the behavior of a queuing system or some other discrete phenomenon, the uncertainty may be thought of as originating in a Poisson counting process, e.g.

dx(t) = f(x(t), u(t))dt + g(x(t))dN(t)

In the literature there is often an explicit or implicit distinction made between uncertainty that changes slowly relative to the possible corrective action and uncertainty that changes as fast as, or faster than, the corrective action. In the former case words like adaptation or adaptive control are often used, whereas in the latter case one usually speaks of stochastic control. Likewise, authors often insist on making a distinction between cases in which the stochastic terms are purely additive, as suggested by

dx(t) = f(x(t))dt + g dw + b(x(t))u dt

(g constant), and cases such as

dx(t) = ( A0 x(t) + z(t)A1 x(t) + Bu(t) ) dt

dz(t) = −Fz(t)dt + G dw

where the uncertain term enters the dynamics multiplicatively.

In a stochastic setting one has a dynamical equation for the sample paths, but also an evolution equation for the probability law. In the case of a diffusion process, this equation is a partial differential equation of the form

∂ρ(t, x)/∂t = L(x, u)ρ(t, x)

with L being a second order linear partial differential operator. Typically this operator is subelliptic, with coefficients that depend on both x and u. In the case of queuing models the equation generally takes the form of a differential-difference equation whose solutions depend strongly on boundary conditions.

Models with Random Observations: In addition to the uncertainty associated with the dynamical system and the exogenous inputs, the measurement processes available for observing the system are seldom perfect. This leads to problems in estimation theory and brings in the theory of linear and nonlinear filtering. If we wish to use feedback to control the system

dx = f(x)dt + g(x)dw + b(x)u dt ;  dy = h(x)dt + dν

then we must determine exactly what probabilistic information the past values of y provide about x(t) if we are to minimize a criterion such as

η = E φ(x(T))

where E denotes expectation.


Game Theoretic Models: Here one imagines that there are controls and unknown disturbances. The model is

ẋ(t) = f(x(t), u(t), d(t)) ;  (u(t), d(t)) ∈ U × V

It is assumed that u has an objective and that nature chooses d so as to make it as difficult as possible for u to achieve that objective. For example, a possible formulation might involve evaluating

η = min_u max_d ∫_0^∞ ( y^T y + u^T u − d^T d ) dt

This formulation has the effect of limiting the L2 norm of d. One of the major ideas to emerge in the 1980's was the idea that this kind of model, in which the system is designed to play against an adversarial world, can form the basis for an effective design tool. In the linear case we can ask for a control law u = −B^T Kx which provides the feedback solution to the minmax problem, and in this way obtain a degree of assurance that the unknown disturbance cannot create too much trouble. The mathematical questions that arise in this area often come down to the question of existence of real solutions to an equation of the form

A^T K + KA − KBB^T K + KGG^T K + C^T C = 0

See [13] for more details. The theory of this type of matrix equation has been explored in the control theory literature since the early 1960's, where it first arose in connection with least squares optimal control and stability questions. Later it was discovered to have connections with robust control.

4.2 Feedback Can Reduce the Effects of Uncertainty

A recurring idea in engineering is that of building reliable machines using unreliable components. Put slightly differently, it is the problem of building precise machines with reproducible behavior using elements that are, for the most part, only imprecisely known and/or of variable characteristics. This is of long-standing interest in biology and is a key element in the success of computer hardware. Feedback control is often an important part of such systems. In some cases, feedback is used as part of a plan to use relatively few highly accurate parts to precisely regulate the performance of a large number of less precisely known parts. In other cases, feedback is used to allow a system to gain precision from a measurement of the difference between the actual performance and the desired performance. It is a pervasive theme in biology that one observes systems that perform in a precise way even though the components that make up the system are largely imprecise. This has given rise to a literature on the problem of building precise machines with imprecise components.

Example 1: Suppose that g represents the voltage gain of an amplifier. The amplifier may be capable of high power levels but the value of g may vary considerably over the life of the system. This was a major problem which had to be solved to make long distance telephony practical. On the other hand, we may have an element k with complementary properties. The value of k is known quite accurately but the gain associated with k is very low. In the study of feedback amplifier design, a crucial point is that the derivative of g/(1 + kg) with respect to g is 1/(1 + kg)². If g = 10^5 and k = 10^{-3}, the gain g/(1 + kg) is about 10^3 whereas 1/(1 + kg)² is about 10^{-4}. The interpretation is that by reducing the forward gain from g to g/(1 + kg), i.e. by a factor of about 10², one can reduce the sensitivity to variations in g by a factor of about 10^4. This was the idea behind the feedback amplifier of Black [14] and set the stage for the work of Nyquist on feedback stability. This illustrates the use of feedback, together with the availability of one precisely characterized element, to greatly reduce the effect of a random variable which enters multiplicatively.
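The arithmetic is easy to check; the short sketch below uses the numbers quoted above.

```python
# Closed-loop gain g/(1+kg) and its (absolute) sensitivity to changes in g.
g, k = 1e5, 1e-3
T = lambda gg: gg / (1 + k * gg)

print(T(g))                  # closed-loop gain: about 10^3
print(T(g + 1.0) - T(g))     # gain change per unit change in g: about 10^-4,
                             # versus exactly 1 for the amplifier used open loop
```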

Example 2: Here is an example illustrating the effectiveness of feedback in reducing the effect of additive stochastic disturbances. Suppose that x is a scalar that is to be regulated to zero. However, there are disturbances. The evolution equation for x is a stochastic equation of the Ito type,

dx = −x dt + dw + u dt

Suppose that u is an expensive resource and we wish to limit its use. We seek to minimize the steady state value of

η = E( u²(t) + x²(t) )

If we seek a minimizing u in the class of maps from [0, ∞) → R (open loop controls) then the best value of u is zero, and the minimizing value of η is η* = 1/2. If perfect measurements are available, so that we may allow u to depend on x, the optimal control is simply u(t) = −(√2 − 1)x(t) and the minimizing value of η in steady state is √2 − 1.
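A Monte Carlo check of these two values is straightforward; the Euler-Maruyama step size and path length below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n_steps = 1e-3, 200_000
k = np.sqrt(2.0) - 1.0

def steady_state_cost(feedback):
    x, cost, count = 0.0, 0.0, 0
    for i in range(n_steps):
        u = -k * x if feedback else 0.0
        x += (-x + u) * dt + np.sqrt(dt) * rng.standard_normal()
        if i > n_steps // 2:              # discard the transient
            cost += u * u + x * x
            count += 1
    return cost / count

print(steady_state_cost(False))   # approx 1/2, the best open loop value
print(steady_state_cost(True))    # approx sqrt(2) - 1, the feedback value
```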

4.3 Why Is Optimal Stochastic Control Difficult?

There are three distinct levels of difficulty one encounters in stochastic control. The first set of problems postulates the ability to make perfect observations. The second postulates the ability to make noisy observations, and the third denies the possibility of making any observations at all.

In the first case we are controlling the Fokker-Planck equation. Starting with a model such as

dx = f(x, u)dt + G(x, u)dw

and a performance measure of the form

η = E φ(x(T))

we can recast the problem as a deterministic problem of controlling the Fokker-Planck equation

∂ρ/∂t = L(x, u)ρ

In this case the optimization is over the choice of functions u depending on x and t. Of course it is necessary to assume some regularity and perhaps some limitations on the range of values u can take on. What this shows is that even the simplest finite dimensional system leads immediately to a problem in the realm of partial differential equations.

If the observations are noisy the difficulty increases tremendously. The situation has an intuitive explanation. If the observations are noisy then it is to be expected that by choice of control we can either try to improve the quality of the estimate of the state or else try to directly reduce the cost, but in general these goals will be in conflict. From a mathematical point of view, the noisy observation case raises a large number of technical questions. Because one cannot observe x directly, one must propagate the conditional density of x. For the model

dx = f(x, u)dt + g(x)dw ;  dy = h(x)dt + dν

and for a fixed choice of u, the conditional density equation for x will be a partial differential equation driven by the observation process. Perhaps the nicest form of this is the unnormalized version, often referred to (at least when the control is absent) as the Duncan-Mortensen-Zakai equation. (See the edited volume [15] for information and references to original papers.) Because the observations are noisy it will be an infinite dimensional stochastic equation which depends on u,

∂ρ(t, x)/∂t = ( L(x, u) + (dy/dt) h(x) ) ρ(t, x)

It is a theorem, then, that the optimal control for the original system is an instantaneous function of the conditional density. That is, u = u(ρ(t, ·)). This puts us in the situation of controlling an infinite dimensional system with perfect observations, but now we must control a Fokker-Planck equation for the probability density of an infinite number of variables. There is only one moderately general class of such problems that has been solved and that is the linear quadratic Gaussian class.

The final class of problems, characterized by the need to do control with no observations, will be considered in a later section.


5 Adaptation and Learning

A system interacts with the world. Over some periods of time the relevant parts of the system and the world do not change, and over other periods they do. It is reasonable to say that a controller for the system adapts if, over a period of time in which neither the system nor the world changes, there is some function of the state of the controller and system that is monotone increasing, regardless of what the (fixed) state of the world might be. At the present time we have a number of mathematically sound theories of adaptation and learning which have been applied to well defined problems and shown to have many of the properties one normally associates with these words.

Shortly after the pioneering work of Robbins and Monro [16] on stochastic approximation, Kiefer and Wolfowitz [17] showed that it is possible to use repeated, noisy measurements of the gradient of a function to descend to a local minimum. A prototype problem is that of minimizing (x − x0)^T Q(x − x0), with Q positive definite, given noisy measurements of the gradient, Q(x − x0). More specifically, if one is given a sequence of measurements of the form y = Q(x − x0) + n, with n being a Gaussian random variable, it is possible to design a descent algorithm that will converge to x0. In fact, the solutions of

x(k + 1) = x(k) − a(k) ( Q(x(k) − x0) + n(k) )

will converge to x0 if a(k) is a suitably chosen decreasing sequence converging to zero at a suitable rate. For example, a(k) = 1/k does nicely.
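In code the scheme amounts to a few lines; the matrix Q, the target x0 and the noise level below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
Q = np.diag([2.0, 1.0])             # positive definite, illustrative
x_star = np.array([1.0, -1.0])      # the unknown minimizer x0

x = np.zeros(2)
for k in range(1, 100_000):
    y = Q @ (x - x_star) + 0.5 * rng.standard_normal(2)   # noisy gradient
    x = x - (1.0 / k) * y                                  # a(k) = 1/k
print(x)   # converges to x0 despite the measurement noise
```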

It will be helpful to recall a particular property of linear stochastic differential equations. Consider the equation

dx = a(t)x dt + b(t)dw

The expected value of x is just

E x(t) = e^{∫_0^t a(τ)dτ} E x(0)

The expected value of x²(t), on the other hand, satisfies

(d/dt) E x²(t) = 2a(t) E x²(t) + b²(t)

Now, the solution of the time varying differential equation

ȧ(t) = α(t)a(t) + β(t)

goes to zero as t goes to infinity if the integral ∫^t α(τ)dτ diverges to −∞ and

γ(t) = ∫_0^t e^{∫_τ^t α(µ)dµ} β(τ) dτ

goes to zero. This is the case, for example, if α(t) = −1/(1 + t) and β(t) = 1/(1 + t)². A slight modification of this analysis shows that the solution of

dx = −a(t)(x − f)dt + a(t)dw

approaches f as t goes to infinity and that E(x(t) − f)² approaches zero. This has the following interpretation. If we want to minimize the function (x − f)² and only have available a noisy version of the gradient, 2(x − f) + n, then it is possible to find a descent procedure which leads x to f in a very strong sense. This is the basic idea behind stochastic approximation.

5.1 Descent with Limited Information

The mathematical expression of the idea of always making changes that will improve matters is the idea of a gradient flow. However, one often wants to use this when a full, noiseless gradient is not available. It is often possible to obtain some gradient-like information through experimentation even when a complete determination of the gradient is impossible. Examples of special interest include:


1. Descent based on limited knowledge of the gradient:

ẋ = −B(x, t)∇φ(x)

with B ≥ 0 but B singular. (A central problem in adaptive control corresponds to the case rank B = 1. See, for example, Sastry and Bodson [18].)

2. Descent based on a noisy description of the gradient and scheduled strength of update:

ẋ = −a(t) ( ∇φ + n(t) )

(The Kiefer-Wolfowitz algorithm.)

3. Descent with scheduled randomization:

ẋ = −∇φ + a(t)n(t)

(Simulated annealing. See, for example, Geman and Hwang [19].)

What distinguishes these problems from the usual continuous descent problems of the form

ẋ = −∇φ(x)

is that in these cases the gradient is either only partially known or else it is corrupted by noise. For example, in the adaptive control literature one finds equations of the form

ẋ = −γ(t)γ^T(t)∇φ(x)

with γ(t) being a vector. In this case only the projection of ∇φ onto a one-dimensional subspace is known. Even so, it can be shown that if γ(t) varies in a suitably generic way, x will approach a local minimum of φ. In learning control, one encounters noisy descent equations of the form

ẋ = −α(t) ( ∇φ(x) + n(t) )

with n a random process. This formulation is immediately suggestive of the Robbins-Monro stochastic approximation setup. Finally, in the context of machine learning there is usually absolutely no reason for thinking that the optimization problem is convex. In that case some kind of simulated annealing algorithm is often combined with a dynamic programming step to solve the problem.

5.2 Function Identification

The various limited information descent algorithms also work in infinite dimensional settings. In particular, we will see below that there are important applications in which one needs to find the equilibrium solution of a Hamilton-Jacobi equation. A prototype of this is the problem of identifying a smooth function on the basis of sampled values. Suppose that there is a smooth function φ : R^n → R which is to be identified. Assume that it is possible to sample the function one point at a time so as to make available noisy values φ(y_i) + n_i. Is it possible to determine φ? This problem is solved in a quite elegant way in the recent thesis of Baker [20]. The algorithm can be explained as follows. Construct a series of approximations to the unknown function using an iterative scheme

φ_{i+1}(x) = φ_i(x) + α_i w_i(x − y_i) ( φ(y_i) + n_i − φ_i(y_i) )

To fix ideas, suppose that X is all of R^n and that the weighting functions w_i are heat kernels centered at y_i and of the form

w_i(x − y_i) = (1/√((2π)^n det Σ_i)) e^{−(1/2)(x − y_i)^T Σ_i^{-1} (x − y_i)}

In this case the measurement at y_i affects the value of φ_{i+1} quite strongly in the vicinity of y_i and less so as the distance from y_i increases. If the size of the symmetric positive definite matrix Σ_i decreases with increasing i, then the influence of the measured value falls off more rapidly as the distance from y_i grows. It is intuitively reasonable, and can be shown rigorously, that if Σ_i decreases at a suitable rate as i goes to infinity and if α_i also decreases at a suitable rate as i goes to infinity, then φ_i → φ in several senses.
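A one-dimensional sketch of such a scheme follows; the target function, the noise level, and the decay schedules for Σ_i and α_i are illustrative choices, not those analyzed in [20].

```python
import numpy as np

rng = np.random.default_rng(2)
phi = lambda x: np.sin(3 * x)                 # the "unknown" function
grid = np.linspace(0.0, 2 * np.pi, 200)
phi_hat = np.zeros_like(grid)                 # current estimate on a grid

for i in range(1, 5001):
    y_i = rng.uniform(0.0, 2 * np.pi)         # sample location
    meas = phi(y_i) + 0.2 * rng.standard_normal()       # noisy sample
    sigma = i ** -0.25                        # shrinking kernel width
    alpha = i ** -0.75                        # decreasing gain
    w = np.exp(-0.5 * ((grid - y_i) / sigma) ** 2)      # Gaussian window
    phi_hat += alpha * w * (meas - np.interp(y_i, grid, phi_hat))

print(np.max(np.abs(phi_hat - phi(grid))))    # error shrinks as i grows
```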


5.3 Joint Identification and Optimization

Recently there have appeared proofs of theorems showing that for a reasonably wide class of problems it is possible to use a suitable form of randomization, together with a type of dynamic programming, to find optimal controls and even optimal strategies for two person games. Tesauro [21] provides a convincing example of how these procedures work. As an introduction to this type of problem consider the following optimization problem. Suppose that there is an underlying relationship between x(k + 1) and x(k) which takes the form

x(k + 1) = f(x(k), u(k))

but that f is unknown. Suppose further that it is desired to minimize an exponentially discounted future cost

η = Σ_{j=0}^∞ γ^j r(x(j), u(j))

Basic to the dynamic programming point of view is the value function V(x), which is the minimum achievable value of the cost, given that one is starting in state x. According to the principle of optimality, V satisfies the equation

V(x) = min_u ( r(x, u) + γV(f(x, u)) )

and the optimal policy can be determined from the fact that

u*(x) = arg min_u ( r(x, u) + γV(f(x, u)) )

It is natural to try to solve this equation by iteration in function space; e.g., by a scheme such as

V^{k+1}(x) = V^k(x) + α_k ( min_u ( r(x, u) + γV^k(f(x, u)) ) − V^k(x) )

This algorithm is well defined if r and f are known.

This idea has been modified in two important ways. First of all, if the process that generates x(k + 1) from the pair x(k), u(k) is stochastic rather than deterministic, then the most obvious modification of the problem formulation is to seek to minimize the expectation

η = E Σ_{j=0}^∞ γ^j r(x(j), u(j))

and to work with a value function V(x) whose value at x is simply η expressed as a function of the starting state,

V(x) = min_u E Σ_{j=0}^∞ γ^j r(x(j), u(j))

In this context dynamic programming asserts that the optimal V satisfies

V(x) = min_u E( r(x, u) + γV(y) )

where y is the successor of x when control u is applied. Of course this somewhat complicates the solution, but in principle this equation too can be solved by the iterative scheme

V^{k+1}(x) = V^k(x) + α_k ( min_u E( r(x, u) + γV^k(y) ) − V^k(x) )

The second departure is more radical. It considers solving the same problem but doing so without prior knowledge of the probability law that transforms (x, u) into the next state y. In other words, the problem is to learn the optimal control law without being provided with any information except for a knowledge of the sequence of states the system visits and the step by step value of the cost incurred. The iterative procedure is defined in terms of a so called "Q-function". If we knew the optimal value function we could make the definition

Q*(x, u) = E( r(x, u) + γ min_v Q*(y, v) )

in which case V(x) = min_u Q*(x, u). We solve this equation iteratively using

Q_{k+1}(x, u) = (1 − α_k)Q_k(x, u) + α_k ( r(x, u) + γ min_v Q_k(y, v) )

The iterative scheme can also be written as

Q_{k+1}(x, u) = Q_k(x, u) − α_k ( Q_k(x, u) − r(x, u) − γ min_v Q_k(y, v) )

A systematic account of this class of problems can be found in [22].
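The sketch below runs the Q iteration on a small hypothetical chain problem, stated (as is common in the learning literature) with rewards to be maximized rather than costs to be minimized; the transition law is only sampled, never given to the learner.

```python
import numpy as np

rng = np.random.default_rng(3)
n_states, gamma, alpha = 5, 0.9, 0.1
Q = np.zeros((n_states, 2))            # Q(x, u) for actions left/right

def step(x, u):                        # the environment, unknown to the learner
    y = min(x + 1, n_states - 1) if u == 1 else max(x - 1, 0)
    return y, (1.0 if y == n_states - 1 else 0.0)

x = 0
for k in range(50_000):
    u = int(rng.integers(2))           # explore at random; Q-learning is off-policy
    y, r = step(x, u)
    # Q_{k+1}(x,u) = (1 - a) Q_k(x,u) + a (r(x,u) + gamma max_v Q_k(y,v))
    Q[x, u] = (1 - alpha) * Q[x, u] + alpha * (r + gamma * np.max(Q[y]))
    x = 0 if y == n_states - 1 else y  # restart after reaching the goal

print(np.argmax(Q, axis=1))            # learned policy moves right toward the goal
```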

5.4 A Time Constant Hierarchy

In thinking about how adaptation and learning fit in with deterministic and stochastic control, it may be helpful to think in terms of the time scales used by these different methods. A hierarchy is suggested by Figure 4, in which T represents an approximate overall time constant for the dynamics of the system being controlled.

[Figure 4 shows the system in a feedback loop with four paths placed on a memory scale marked 0, T, 30T, 1000T: direct feedback, filter/observer, adaptation, and learning.]

Figure 4. The feedback memory hierarchy.

The figure illustrates the action of four different classes of control, classified according to typical values for the length of the memory required to implement the control strategy.

1. Instantaneous feedback. Any deterministic optimal control can be implemented this way.

2. Feedback through a filter whose time constants are comparable with those of the original system. The separation theorem of linear-quadratic-Gaussian theory asserts that this structure is optimal in that context.

3. Feedback through a filter whose time constants are on the scale of 10 to 30 times the time constant of the system being controlled. This structure captures most adaptive control schemes.

4. Feedback through a structure whose time constant extends beyond 30 times the time constant of the system. Most learning systems fall into this category.

6 Quality of Performance vs. Cost of Implementation

When engineers set out to design a control system it is often assumed that it is possible to separate the question of what control strategy to use from the question of how to realize the strategy with sensors, actuators, communication links, computers, etc. In some cases it is decided in advance that the control laws will be linear and consist of proportional plus integral plus derivative control, that the sensors will be of a certain type, and that the communication will be accomplished using certain hardware. This is in contrast to the way in which evolution shapes the control systems found in biology; there the algorithmic and realization issues are completely commingled. Through evolution, living systems are constantly optimizing the joint problem of improving performance while providing an efficient realization of the control strategy. This has the effect of confounding the issues of optimizing the implementation and optimizing the short term performance.

Over the years a number of successful theories of implementation have been developed. Outstanding in this regard are the various methodologies for electrical circuit design in both the analog and digital domains. However, because the technologies available for implementation change rapidly, few have stood the test of time. For example, the development of VLSI and MEMS technologies has radically altered the economics of implementing control systems. On the other hand, in biology the "technologies" available for implementation change slowly if at all. Instead, the implementation issues are thoroughly mixed with performance issues. The result is that the relationship between structure, as revealed by a study of neuroanatomy, and function is difficult to resolve. One can argue that the reason control theories have not been more successful in biology is that a biologist who wishes to give a control theoretic explanation of some neural circuit must decide which parts of the observable structure are the result of a quest for performance optimization and which are the result of an attempt to find an economical implementation. This makes it difficult to isolate the strategy that is intended. On the other hand, it is often possible to trace the evolutionary path associated with a system and in this way deduce something about the way evolution has attempted to make improvements.

6.1 Attention as an Indicator of Implementation Cost

In an engineering setting, the implementation of a control strategy involves the selection of the accuracy and effectiveness of various subsystems. Generally speaking, less refined systems will cost less and may even be more reliable. Here is a partial list.

1. Sensors (range, accuracy, and sampling rate)

2. Communication channels (capacity and latency)

3. Computational resources (speed and accuracy)

4. Actuators (strength, speed, and controllability)

If there is to be adaptation or learning then it is necessary to add to this list

5. Computational support for statistical analysis

6. Memory and database search engine

From the list given above it might seem that any attempt to limit the cost of an implementation will necessarily be highly dependent on technological details. We argue that this is not necessarily the case. Observe that most of the items listed above relate directly to the rate of change of the control action. This will lead us to suggest a relatively simple mathematically defined indicator of implementation complexity. The ideas follow mostly from [22].

A variety of problems, ranging from the control of a remote vehicle by means of a radio link, to the control of a stock portfolio by means of transaction mechanisms for which a fee is charged, demonstrate the need for an approach to control theory that includes as a design goal the development of controls that require as little updating as possible. A first principles approach to the question of implementation suggests that the easiest control law to implement is one that requires no sensing, no computation, and no communication. This would be a control system which produces a constant control. Anything else requires change. The amount of change required can be measured by

η = ∫_0^T ∫_X L( ∂u/∂t, ∂u/∂x ) dx dt


If the system to be controlled is described by ẋ = f(x, u) ; y = h(x), and if the initial condition is drawn from a density ρ0(x), then a complete formulation of an optimization problem including a cost term from the attention functional would be

∂ρ(t, x)/∂t = −Σ_i ∂/∂x_i ( f_i(x, u) ρ(t, x) ) ;  ρ(0, x) = ρ0(x)

η = ∫_0^T ∫_X [ L(x, u)ρ(t, x) + α (∂u/∂t)² + β (∂u/∂x)² ] dx dt

This provides a method for trading off the cost of implementation against the quality of the performance. Of course it glosses over a number of points, but it does introduce additional realism into the process.

Observe that terms in the attention functional involving ∂u/∂t can be thought of as measuring the amount of open loop control that is being used, whereas terms of the form ∂u/∂x measure the amount of closed loop control. Thus, the significance of the coefficients α and β in the expression for η can be related to the extent to which the control should be open loop vs. the extent to which it should use feedback. A frequently made observation about human performance is that the relative amount of open loop vs. feedback control depends on how much the task has been practiced; the act of practicing often converts what was done with feedback into an open loop form. In terms of the attention functional, we might say that increasing β steers the optimum away from feedback and toward open loop implementations.
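Given a candidate law u(t, x) on a grid, the attention functional can be estimated with finite differences and a Riemann sum; the law u(t, x) = e^{−t}x and the weights below are hypothetical.

```python
import numpy as np

alpha, beta = 1.0, 1.0
t = np.linspace(0.0, 1.0, 101)
x = np.linspace(-1.0, 1.0, 101)
TT, XX = np.meshgrid(t, x, indexing="ij")
U = np.exp(-TT) * XX                  # hypothetical control law u(t, x)

du_dt = np.gradient(U, t, axis=0)     # finite-difference partials
du_dx = np.gradient(U, x, axis=1)
cost = np.sum(alpha * du_dt**2 + beta * du_dx**2) * (t[1] - t[0]) * (x[1] - x[0])
print(cost)   # larger values indicate a law that demands more attention
```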

It seems that there are no known cases for which the minimum attention control problem can be solved in closed form. On the other hand, by restricting the underlying dynamics to a compact manifold and assuming a strong form of controllability it is possible to show that the functional can be made finite. The standard device used to establish existence of harmonic maps, i.e., setting up a gradient flow for u, is less obvious here because of the side constraint arising from the auxiliary equation for ρ.

6.2 Possible Applications

Portfolio Management: Suppose that we wish to create a system for managing an investment portfolio. The state variables are the values of the various investments held. The differential equations model the daily movement of prices and the return on investments. The control actions are the possibilities of selling what is now held and buying something else. There is a cost associated with each such transaction. Moreover, there is a cost associated with the monitoring processes, both the monitoring of present holdings and the cost associated with researching anything that might be purchased. Thus, in addition to determining the optimal decision rules, we want to include the costs of all transactions; that is to say, we want to minimize the attention costs necessary to implement the control law.

Understanding Plasticity: It has been known for decades that there is a topographic representation of the various body surfaces in an area of the brain called the sensory motor strip. It was once thought that there must be a “wiring diagram” describing the way the system is hooked up, and that through the study of this diagram one might hope to gain a better understanding of motor control processes. However, a little more than a decade ago, experiments done on primates showed that instead of being ruled by a fixed wiring diagram, there is considerable plasticity. The area in the motor sensory strip that is devoted to a particular portion of the body depends on how much use that portion of the body has had, growing and shrinking over a time scale of days to weeks. Some details and references are to be found in the well known textbook by Kandel, Schwartz and Jessell [24]. Modeling this plasticity mathematically would seem to require some measure of attention in the sense we have used the term.

Centralized Control vs. Distributed Control: Consider the problem of controlling a large system such as the inventory of a large chain of stores or the routing rules of a communications network. If too much communication is required the expense will erode profits. If there is too little communication then excess supply in one location cannot be used to fill excess demand in another location. The use of a mathematical definition of attention could clarify the discussion of these issues.

7 Challenges from Engineering

The prospects for intelligent machines continue to grow, keeping pace with advances in computing power and the development of better sensors, actuators and software. In particular, the field of robotics, which began to attract great attention in the 1970s and 1980s, has continued to mature in terms of applications and theory. Today there is considerable emphasis on more general questions in motion control involving sensory rich environments. Interesting examples include mobile robots and the associated questions about kinematics, dynamics, nonholonomic path planning, sensor fusion, etc.

What has control theory contributed to the fields of robotics and vision guided control? One of the spectacular successes has come out of the work by Dickmanns and his group [25]. They were able to design a vision system working inside a feedback control loop having the speed and accuracy necessary to guide a passenger car on an autobahn at 130 km/h without human intervention. In general, the robotics groups that have had the most impact have been those that have made the hardware work with the software. New problems have emerged, such as map making and compliance control. New algorithms, such as those needed for path planning and for data fusion in cases where there are a multitude of sensors, have been developed.

7.1 Hybrid Systems: Making Use of Higher Levels of Abstraction

Greater functionality usually means greater complexity, and the standard way to deal with complexity is to introduce levels of abstraction. An important device used by humans to deal with this is the introduction of tokens which encapsulate a number of more detailed descriptions. When we substitute the phrase “Get the milk from the refrigerator” for a detailed description of the motion of the limbs required to get the milk from the refrigerator, we achieve an enormous savings in complexity. If machines are to make use of higher levels of abstraction, we must design and analyze machines that can process symbolic strings as well as the more familiar analog signals. Figure 5 below suggests the type of input-output modeling that is needed. In this figure there are both analog and digital inputs on the left and both analog and digital responses on the right. Such systems sit at the interface between a communications network and physical processes.

[Figure 5. A conceptualization of a language driven system: a block labeled “Hybrid System” with analog input u and symbolic input v entering on the left, and analog output y and symbolic output w leaving on the right.]

Developments in electronics have made it inexpensive to add a wide variety of sensing and computing capabilities to systems which were previously designed as simple feedback loops. A very simple example is the timer added to the thermostat control of a home heating system that resets the desired temperature according to the day of the week, time of the day, etc. This additional capability often takes the system from a domain in which everything can be modeled using differential equations to a domain in which the model includes both differential equations and finite automata. By a hybrid system we understand a model of the form

$$\dot{x}(t) = a(u(t), x(t), z_{\lfloor p \rfloor})\ ;\qquad y(t) = c(x(t), z_{\lfloor p \rfloor})$$
$$\dot{p}(t) = r(u(t), y(t), z_{\lfloor p \rfloor})$$
$$z_{\lceil p \rceil} = f(z_{\lfloor p \rfloor}, y(t_p), v_{\lfloor p \rfloor})\ ;\qquad w_{\lceil p \rceil} = h(y(t_p), z_{\lfloor p \rfloor})$$

with r being nonnegative. The meanings of the variables are, with minor differences, as above. The first equation, specifying the evolution of x, describes those aspects of the system for which differential equations are the appropriate basis for modeling. The variable p is to be thought of as modeling the pace of interaction between the real-time dynamics represented by x and the flow of information represented by changes in z. See [26] for more details. The last equation, specifying the way in which z changes, describes the part of the system whose evolution is triggered by events, i.e., p advancing through integer values, and represents the symbolic processing done by the system.
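In simulation, a model of this form reduces to a continuous-time integrator interleaved with an event-triggered automaton. The sketch below uses hypothetical stand-ins for a, r and f merely to show that structure (the outputs y and w are omitted for brevity):

import math

# Hypothetical sketch: x and p evolve by differential equations;
# z is updated each time the pace variable p crosses an integer.

def step(x, p, z, u, v, dt):
    x_dot = -x + u + float(z)        # stand-in for a(u, x, z)
    p_dot = 1.0 + abs(u)             # stand-in for r(u, y, z); note r >= 0
    x_new, p_new = x + dt * x_dot, p + dt * p_dot
    if math.floor(p_new) > math.floor(p):      # event: p passed an integer
        z = (z + (1 if x_new > 0 else 0)) % 3  # stand-in for f(z, y, v)
    return x_new, p_new, z

x, p, z = 1.0, 0.0, 0
for k in range(1000):
    x, p, z = step(x, p, z, u=math.sin(0.01 * k), v=None, dt=0.01)
print(x, p, z)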

An Autonomous Example: Let K be a simplicial complex in $R^n$. Let X be a connected differentiable manifold of dimension n and let $f : K \to X$ be a triangulation of X. Let g be a function which is constant on the interior of the cells of the triangulation that contain open sets of X. Let Z be a finite set, $\{z_1, z_2, \ldots, z_k\}$. Assume that for each value of z in this set $F_{z_i}$ is a vector field on X. Suppose that in some system of local coordinates for X the vector field $F_{z_i}$ corresponds to the differential equation

$$\dot{x} = f(x, z)$$

in the sense that $F_{z_i}$ corresponds to $f(\cdot, z_i)$. Now add to the equation for x an equation for z taking the form

$$dz = a_0(x, z)\,dg + a_1(x, z)\,|dg|$$
by which we understand an evolution law for a left-continuous function according to which z is constant when g is constant and changes when g changes according to the rule

$$z(t_i) \mapsto z(t_i) + a_0(x(t_i), z(t_i))\,\Delta g + a_1(x(t_i), z(t_i))\,|\Delta g|$$

This is unambiguous when x crosses transversely from one n-cell of the triangulation to another; we leave for another place the rule for trajectories that fail to satisfy this simplifying assumption. Finally, we restrict the choice of the a's in such a way that the jumps in z are such that z evolves in the finite set Z specified above. Of course this strongly limits the way in which the a's can depend on x, but it does not require that the a's be independent of x. This then gives us a model analogous to an ordinary differential equation but with a discrete part acting like a finite automaton. It may be remarked that the $a_1|dg|$ term in the z equation is of critical importance in that its presence gives rise to a much richer structure for the mapping of x-paths into z than one gets without it.
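A toy instance of this construction, with all particular choices ours rather than the text's: take X to be the punctured plane with a circular flow, let the cells be the four open quadrants, let g be the quadrant index, and use the $a_1|dg|$ term to make z count boundary crossings regardless of their direction.

import math

# Hypothetical instance: X = R^2 \ {0}, g = quadrant index (constant on
# the open quadrants), Z = {0, 1, 2, 3}. All crossings are transverse.

def g(x):
    qx, qy = x[0] >= 0, x[1] >= 0
    return {(True, True): 0, (False, True): 1,
            (False, False): 2, (True, False): 3}[(qx, qy)]

x, z, dt = [1.0, 0.0], 0, 0.001
for _ in range(20000):
    x_new = [x[0] - dt * x[1], x[1] + dt * x[0]]   # f(x, z): rotation field
    dg = g(x_new) - g(x)                           # nonzero only at crossings
    if dg != 0:
        z = (z + 1) % 4    # jump rule with a0 = 0, a1 = 1: count crossings
    x = x_new
print(z)   # number of quadrant boundaries crossed, mod 4

Here only the $|dg|$ term acts; a rule built from dg alone would cancel on back-and-forth crossings, which is one way to see the extra richness that term provides.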

The more difficult tasks of planning and reasoning about motions seem to require a level of abstract reasoning that is best thought of in terms of tokens. Models capable of describing the time evolution of these tokens will be required to capture their behavior. For example, if we want models that will allow one to estimate the length of time required to collect and process the string data, then this additional complexity is necessary, because that estimate is an important design parameter for determining the channel capacity and buffer sizes for the symbolic input. The more elaborate models also allow one to predict the transient response of a feedback system with an automaton in the feedback loop. We may refer to such systems as hybrid systems and, although as yet they lack the beautiful self-contained structure of a system of differential equations, the fact is that they are now the most common form of implementation for everything from automobile engine control to voice recognition systems.

7.2 Learning Systems in Engineering

It seems that the control of systems that are “combinatorially complex”, that is, systems whose complexity is such that human conceptualization of them is likely to be possible only at a level well above the level that would be useful for control, represents a target of opportunity for the implementation of learning control. There are already examples of such systems:

Adaptive Arrays and Principal Components: A widely used idea which can be thought of as a learning system involves the identification and use of principal components. Consider an m-vector of signals, $u(\cdot)$, which we suppose to be zero mean stochastic processes. If we compute a sample covariance matrix using an accumulator equation

$$\dot{W}(t) = u(t)u^T(t)$$

then we can consider a related flow on the space of orthogonal matrices defined by asking that $\Theta(t)$ be orthogonal and that it should make $\Theta^T(t)W(t)\Theta(t)$ diagonal, with the diagonal elements appearing in descending order from top left to bottom right. It has been shown that for a suitable choice of diagonal matrix N the flow defined by

$$\dot{\Theta}(t) = [\Theta^T(t)W(t)\Theta(t), N]\,\Theta(t)$$

generates an arbitrarily good approximation to such a matrix. In this case the columns of $\Theta$ generate the principal components of the process u, and the various adaptive subspace filters can be implemented in this way. In fact, we can say that this pair of equations learns subspaces and is a linear form of a neural network, capturing situations in which the signal is projected onto one or another subspace which is, itself, determined by the statistical properties of the signals recently received. This idea goes back at least to the early work of Widrow and is discussed in his textbook with Stearns [27]. Continuous evolutions built on recent results on various steepest descent equations make closer contact with neural networks; Bloch's paper [28] and the book by Helmke and Moore [29] are possible starting points.
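The sketch below integrates a version of this pair of equations numerically. The discretization choices are ours: a running-average form of the accumulator, the right-multiplied variant $\dot{\Theta} = \Theta[\Theta^T W \Theta, N]$ of the flow (which pairs naturally with $H = \Theta^T W \Theta$; conventions differ on which side the bracket acts), and a QR re-orthonormalization to hold $\Theta$ on the orthogonal group under crude Euler stepping.

import numpy as np

rng = np.random.default_rng(1)

m, dt = 4, 2e-3
N = np.diag([4.0, 3.0, 2.0, 1.0])   # distinct entries enforce the ordering
Theta = np.linalg.qr(rng.standard_normal((m, m)))[0]
W = np.zeros((m, m))
A = np.diag([5.0, 2.0, 1.0, 0.5])   # hypothetical mixing; cov(u) = A A^T

for k in range(1, 50001):
    u = A @ rng.standard_normal(m)
    W += (np.outer(u, u) - W) / k          # running-average accumulator
    H = Theta.T @ W @ Theta
    Q, R = np.linalg.qr(Theta + dt * Theta @ (H @ N - N @ H))
    Theta = Q * np.sign(np.diag(R))        # re-orthonormalize, keep orientation

print(np.round(Theta.T @ W @ Theta, 2))    # ~ diagonal, descending entries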

Hidden Markov Models: (See Rabiner and Juang [30].) Let x be a stochastic process taking on values in a finite set X. Let $p_i(t)$ be the probability that $x(t) = x_i$. Suppose that x is a continuous time jump process and that $\dot{p} = Ap$. If A is known and if we make noisy observations on x by observing a related random variable y, then we can propagate the conditional probability associated with x given y. A well known example of this occurs when $y = x + n$ with n Gaussian. However, if A is not known then the matter becomes more interesting. In this case one wants to estimate A at the same time one is estimating x. One important example of this is the situation found in many speech recognition algorithms. In those applications the states are taken to be certain combinations of phonemes (typically the number of states is well over 110) and the transition probabilities $a_{ij}$ are initially rather poorly known. The Baum-Welch algorithm proceeds by updating the A matrix by counting the number of transitions from state i to state j and updating $a_{ij}$ in accordance with an empirical view of probability. This cannot be done exactly, of course, because the state is only known probabilistically. To reduce this effect one uses smoothed data and relies on long runs.
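A discrete-time caricature of the re-estimation step (the two-state chain and known emission matrix are hypothetical simplifications): the forward-backward recursions turn the observations into expected transition counts, which are then renormalized to give a new A.

import numpy as np

rng = np.random.default_rng(2)

def baum_welch_step(A, B, y):
    """One re-estimation of A. A[i, j] = P(state j at t+1 | state i);
    B[i, k] = P(observation k | state i); y is the observation sequence."""
    n, T = A.shape[0], len(y)
    alpha = np.zeros((T, n)); beta = np.zeros((T, n))
    alpha[0] = B[:, y[0]] / n
    for t in range(1, T):                        # forward pass, rescaled
        alpha[t] = (alpha[t-1] @ A) * B[:, y[t]]
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):               # backward pass, rescaled
        beta[t] = A @ (B[:, y[t+1]] * beta[t+1])
        beta[t] /= beta[t].sum()
    xi = np.zeros((n, n))                        # expected transition counts
    for t in range(T - 1):
        m = np.outer(alpha[t], B[:, y[t+1]] * beta[t+1]) * A
        xi += m / m.sum()
    return xi / xi.sum(axis=1, keepdims=True)    # renormalize the rows

A_true = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.2], [0.3, 0.7]])
s, y = 0, []
for _ in range(5000):                            # simulate the hidden chain
    y.append(rng.choice(2, p=B[s]))
    s = rng.choice(2, p=A_true[s])

A = np.full((2, 2), 0.5)                         # poor initial guess
for _ in range(20):
    A = baum_welch_step(A, B, y)
print(np.round(A, 2))                            # roughly recovers A_true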

The stochastic realization problem most frequently discussed in the system theory literature begins with an autocorrelation function and looks for a dynamical system whose output will have the given autocorrelation function when the input is white noise. The spectacular success that hidden Markov models have had, not only in speech recognition but elsewhere as well, suggests that one should give more emphasis to the Baum-Welch definition of the problem. A nice test would be to see if hidden Markov models could be used to optimize the computer code associated with some motion control problem.

8 Challenges from Physics

There are a number of original experiments in physics demonstrating new ways to manipulate molecules, atoms and elementary particles which paved the way to new measurement techniques of great value to medicine and other sciences. In many cases these experiments do not concern the manipulation of individual entities but rather they demonstrate the control of average properties of ensembles of identical (or very nearly identical) entities consisting of anywhere from $10^6$ to $10^{12}$ or more individuals. For example, the 1984 Nobel prize in physics was given for work done at CERN involving experiments with a particle accelerator. It was shared by the particle physicist Carlo Rubbia and Simon van der Meer, who received training at Delft in control engineering. The Nobel committee cited both the experimental result and the control techniques for manipulating bunches of particles, on the order of $10^6$ in number, whose position can be measured and controlled in an accelerator by modulating magnetic fields in response to measurements [31]. We will consider other examples, many of commercial value, in which it is an ensemble which must be controlled.

8.1 Ensemble Control

A typical problem in this area involves not a single system but rather an ensemble of systems

$$\dot{x}_i(t) = f(x_i(t), u(t), t)\ ;\qquad i = 1, 2, \ldots, k$$

consisting of k identical systems subjected to the same control. In important cases, it is possible to measure an output which is a symmetric function of the x's, say

$$y(t) = \sum_{i=1}^{k} \phi(x_i)$$

Such models describe problems in thermodynamics and quantum statistical physics, as well as the particle accelerator situation just mentioned. Only recently have such problems been formalized as control problems.
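In simulation terms, such an ensemble is simply k copies of one dynamical system driven by a single shared control, with only an aggregate observed. A minimal sketch, with dynamics and output chosen purely for illustration:

import numpy as np

rng = np.random.default_rng(3)

# k identical scalar systems x_i' = -x_i + u + noise, one common control u,
# and a symmetric (permutation-invariant) output y = sum of phi(x_i).
k, dt = 1000, 0.01
x = rng.standard_normal(k)

def phi(x):
    return x**2            # e.g. total "energy" of the ensemble

for step in range(2000):
    u = np.sin(0.01 * step)                    # the same input to every member
    x += dt * (-x + u) + np.sqrt(dt) * 0.1 * rng.standard_normal(k)
    y = phi(x).sum()                           # only this aggregate is observed
print(y / k)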

Consider the problem of manipulating a quantity of gas in such a way as to extract as much mechanical work as possible from it, given that it can take only a certain amount of heat from a heat source of temperature $T_h$ and given that it has access to a heat sink of temperature $T_l < T_h$. This problem formulation goes back to Carnot and over the last 150 years has been reexamined and resolved in many ways. Here we consider a collection of gas particles moving in three dimensional space. Their velocities are considered to satisfy Ito equations and, consistent with standard statistical mechanics, the temperature is interpreted as being proportional to the average kinetic energy of the various modes. The controls enter to model the effect of changing the volume occupied by the gas and to model the temperature/thermal conductivity associated with the heat source. The particle model is such that the motion in each of the coordinate directions is independent and is Brownian motion in the velocities. After a suitable normalization,

$$dv_i = \sqrt{2T}\, dw_i$$

with T being the temperature of the heat source the particles are in contact with. Passing now to average quantities, introduce $(x_1, x_2, x_3)$ with $x_1$ being the temperature of the gas, $x_2$ being the volume occupied by the gas and $x_3$ being the work done by the gas. We now formulate a variational problem that will lead to the Carnot cycle. Consider

$$\frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -2x_1 \\ 1 \\ 2x_1 \end{bmatrix} u_1 + \begin{bmatrix} -2x_1 \\ 0 \\ 0 \end{bmatrix} u_2 + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} u_3$$

Here $u_1$ is the time rate of change of the volume of the gas, $u_2$ serves to select the type of contact the gas has with the temperature sources, while $u_3$ is the actual temperature of the source. The physical situation imposes certain inequalities such as $T_l \le x_1 \le T_h$, $u_2 \ge 0$, etc. (See [32] for further details.)
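One can integrate this model through a Carnot-like schedule of piecewise-constant controls. In the sketch below the temperatures, coupling strength and phase durations are arbitrary illustrative values, and no attempt is made to close the cycle exactly:

def run(phases, x, dt=1e-4):
    """Euler-integrate x1' = -2*x1*(u1+u2) + u3, x2' = u1, x3' = 2*x1*u1."""
    for (u1, u2, u3, duration) in phases:
        for _ in range(int(round(duration / dt))):
            x1, x2, x3 = x
            x = [x1 + dt * (-2*x1*(u1 + u2) + u3),
                 x2 + dt * u1,
                 x3 + dt * 2*x1*u1]
    return x

Th, Tl, K, a = 2.0, 1.0, 50.0, 0.5    # hot/cold temperatures, coupling, rate
phases = [                            # (u1, u2, u3, duration)
    (+a, K, 2*K*Th, 1.0),   # expansion in contact with the hot source
    (+a, 0.0, 0.0, 0.3),    # adiabatic expansion: temperature falls
    (-a, K, 2*K*Tl, 1.0),   # compression in contact with the cold sink
    (-a, 0.0, 0.0, 0.3),    # adiabatic compression: temperature rises
]
x1, x2, x3 = run(phases, [Th, 1.0, 0.0])
print("temperature, volume, net work:", round(x1, 3), round(x2, 3), round(x3, 3))

With these numbers the net work $x_3$ at the end of the schedule is positive: heat taken in at the higher temperature has been partially converted to work, as in the classical cycle.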

8.2 Controlling Spin Systems

Although the idea that protons and neutrons can carry quantized spin was advanced by Heisenberg very shortly after the discovery of the neutron in 1932, the first clear experiments were reported in 1946, coming out of groups led by Felix Bloch and Edward Purcell, respectively.


Far from being just a curiosity, the control and subsequent exploitation of nuclear spin is now a widely used measurement technique whose importance has been recognized by a number of Nobel prizes. From the point of view of control, the relevant features are as follows. One has, to begin with, a large number of nearly identical, very weakly interacting objects which carry weak circulating currents. These objects are in thermal equilibrium, subject to random motions which make the magnetic effects of the currents cancel out. By putting the system in a strong magnetic field, it is possible to slightly bias the distribution. In this biased state, an externally applied electromagnetic pulse can alter the direction of the bias, resulting in a coherent precession of part of the population. The precessing spins radiate, generating an electromagnetic disturbance whose frequency (frequencies) is characteristic of the composition of the sample. Because the signals generated in this way are weak, and because of the ever present noise, there is a continuing interest in finding more effective ways to excite the biased sample.

The mathematical model can, in this case, be written as

$$\dot{\rho} = -i[H, \rho]$$

with $\rho$, the so called density matrix, being a Hermitian operator, and H being the Hamiltonian of the system. In fact, H is of the form $H = H_0 + \sum H_i u_i$, where $u_i$ represents the effect of the externally applied fields. These matters are discussed from a control theory point of view in [33].
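A two-level caricature of this model (the Hamiltonians and the constant pulse amplitude are illustrative choices of ours, not taken from [33]) can be stepped forward with matrix exponentials:

import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli matrices; hbar = 1
sz = np.array([[1, 0], [0, -1]], dtype=complex)

H0 = 0.5 * sz                      # drift: the level splitting
H1 = 0.5 * sx                      # control: an applied transverse field
rho = np.array([[1, 0], [0, 0]], dtype=complex)  # all population in level 0

dt, u = 0.01, 1.0                  # constant pulse amplitude
U = expm(-1j * (H0 + u * H1) * dt) # exact propagator for one small step
pop = []
for _ in range(1000):              # ten time units of evolution
    rho = U @ rho @ U.conj().T     # one step of rho' = -i[H, rho]
    pop.append(rho[0, 0].real)
print(min(pop), max(pop))          # partial population transfer: the constant
                                   # drive is detuned from the splitting

Even this crude example shows the basic phenomenon: the applied field coherently drives population between the levels, and the size of the swing depends on how the control relates to the drift.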

9 Challenges from Biology

Neurobiology [24] and cell biology [34] are two of the most active fields in science. Although different in many ways, they share important common features. In particular, they are:

1. experiencing a rapid transformation from a qualitative science to a science based on physics and chemistry;

2. plagued by overwhelming complexity because of the large numbers of highly interacting parts;

3. fields in which the flow and transformation of information appears to be of greater importance than the flow and transformation of matter.

The first of these has made it possible to reduce a number of complex observations to sequences of steps, each explainable in terms of known scientific principles. The second and third, however, suggest that the kind of reductionism that helped to bring science to its present state will not necessarily be enough to enable human understanding of these fields. In the context of cell biology, Hartwell et al. [35] have argued that progress will be more rapid and understanding more useful if researchers think in terms of functional units with inputs and outputs, each very much less complex than an entire cell but each more encompassing than a few related biochemical processes. This point of view involving “modules” or, as one would say in the present context, “subsystems”, is at the heart of systems engineering. In neurobiology the subsystems defined by anatomy are well recognized, but it seems that most of these are far too complex to play the role of subsystems in the sense that engineers would use the term. The prospects for the use of control theoretic ideas here are exciting.

9.1 Regulation Involving Gene Expression

Because of variations in the environment in which living things grow, resources must be allocated to various regulatory mechanisms capable of preserving suitable conditions for metabolism and growth. As researchers discover more details about the various mechanisms for regulation, new ideas about control emerge. The more novel of these are mechanisms involving discrete as well as continuous processes. Over the first half of the 20th century there gradually emerged an awareness that the regulator point of view, already having gained acceptance in engineering, could be used to explain various aspects of physiology. It proved to be relatively easy to identify structures involving sensors and actuators which act to regulate body temperature, blood pressure, respiratory rate, etc. Ross Ashby, in his pioneering book An Introduction to Cybernetics [36], presents a tightly reasoned and broad ranging account of this line of work. One measure of its success is that it now appears in some form in many high school biology texts.

At the level of the individual cell, things are rather different. When a eukaryotic cell synthesizes one of the proteins needed for its existence or replication, one essential step is transcription, a process by which the information coded in the DNA is read and converted into a “working plan” to be used in assembling the protein. This transcription is a discrete process, in some ways comparable to a computer program reading a subroutine from a read-only disk. The decision to initiate the process, i.e., the decision to “express” the gene, is thought to be dependent on the chemical environment as characterized by the concentrations of various proteins and nutrients. Thus the evolution over time of cellular processes ordinarily involves both analog and digital elements, in something of the same way that hybrid systems of the type we discussed above do. A very detailed model of one such process has been published by McAdams and Shapiro [37].

One specific problem involving elements of discrete and continuous aspects of regulation that has been discussed and experimented with is the problem of building a biological clock that is simple enough to exist within a single cell. Of course such a clock is simply an aid to the main purpose of the cell and hence must not use extensive resources. In [38] it is hypothesized that a robust time keeping mechanism involving transcription as part of the cycle can be built using a relatively small number of molecules. Reference [39] contains a detailed study of the molecular basis for the circadian clock in mice; feedback plays a critical role in the analysis. From an engineering point of view this work is tantalizing because circuit designers generally need to incorporate off chip crystal oscillators to get accurate clocks. This raises the possibility that some as yet undiscovered hybrid structures may achieve a degree of accuracy and robustness not achievable with analog or digital components alone.
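For a sense of how transcriptional feedback alone can keep time, here is a protein-only caricature in the spirit of the oscillator of [38], a ring of three mutually repressing genes. The parameter values are illustrative and the mRNA stage of the real circuit is omitted:

import numpy as np

alpha, n, dt = 10.0, 3.0, 0.01     # production rate, Hill exponent, time step
p = np.array([1.0, 1.5, 2.0])      # three repressor protein concentrations
trace = []
for _ in range(20000):
    repressor = p[[2, 0, 1]]       # gene i is repressed by protein i-1
    p = p + dt * (alpha / (1.0 + repressor**n) - p)  # production minus decay
    trace.append(p[0])
h = np.array(trace[10000:])        # discard the transient
print(h.min(), h.max())            # a sustained swing indicates oscillation

With these values the symmetric equilibrium is unstable and the concentrations settle into a limit cycle; with a weaker nonlinearity (smaller Hill exponent) the same circuit relaxes to a steady state, which is one reason robustness of such clocks is a serious design question.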

9.2 Motion Control

Questions centering around how the mind controls the body are central to the study of psychology. Specific systems, such as the vestibulo-ocular system for controlling eye movement, have been the subject of years of study while still retaining many secrets; see, e.g., the work of Ito [40]. Central to the difficulty is the fact that interesting systems often display an important dependence on the higher level tasks that are being performed. Psychologists often try to describe this dependence using ideas relating to attention. In section 6 we introduced a quantitative version of attention as it might apply to control. However, there is a second aspect of attention, perceptual attention, which has to do with the allocation of the resources which can be used to interpret the sensory apparatus and the signals produced by it. There has already been work which formulates precise questions along these lines having to do with the processing of data. For example, models involving recursive estimation theory already appear in the literature [41]. However, in an environment that is sufficiently rich with sensory data, it will always be necessary to make choices. It may be best to think of perceptual attention as a vector characterizing the direction in which the most relevant data lies. Adopting this point of view, certain problems in what biologists call motor control would split into a choice of perceptual attention vector and a choice of attention functional which would together determine the quality of the control.

Indeed, the latter seems to involve a variation of the servomechanism problem in which the desired output is described in an abstract way using tokens, i.e., by using elements of a formal language in the sense that term is used in computer science.

Acknowledgements: This work was funded by the Army grant DAAG 55 97 1 0114 (Center for Dynamics and Control of Smart Structures), Brown Univ. Army DAAH 04-96-1-0445 (Metrics for ATR), Yale University NSF CCR9980058 and MIT Army DAAL03-92-G-0115 (Center for Intelligent Control).

10 References

1. K. J. Åström and B. Wittenmark, Adaptive Control, Addison-Wesley, Reading, MA, 1989.


2. J. W. Polderman and J. C. Willems, Introduction to Mathematical Systems Theory, Springer, New York, 1998.

3. E. Sontag, Mathematical Control Theory, Springer, New York, 1998.

4. R. M. Murray, Z. Li and S. S. Sastry, A Mathematical Introduction to Robotic Manipulation, CRC Press, Boca Raton, FL, 1994.

5. R. W. Brockett and C. I. Byrnes, “Multivariable Nyquist Criteria, Root Locus, and Pole Placement by Output Feedback,” IEEE Trans. on Automatic Control, Vol. AC-26, pp. 271-284 (1981).

6. W. Liu and H. J. Sussmann, “Limits of highly oscillatory controls and approximation of general paths by admissible trajectories,” Proceedings of the 30th IEEE CDC, pp. 437-442 (1991).

7. R. W. Brockett, “Characteristic Phenomena and Model Problems in Nonlinear Control,” Proceedings of the 1996 IFAC Congress, Vol. G, pp. 135-140 (1996).

8. P. S. Krishnaprasad and D. Tsakiris, “G-Snakes: Nonholonomic Kinematic Chains on Lie Groups,” Proc. 33rd IEEE Conf. on Decision and Control, pp. 2955-2960, 1994.

9. H. J. Sussmann, “Geometry and Optimal Control,” in Mathematical Control Theory, (J. Baillieul and J. C. Willems, Eds.), Springer Verlag, New York, 1998.

10. E. Belbruno, “Lunar Capture Orbits, a Method of Constructing Earth-Moon Trajectories and the Lunar GAS Mission,” in Proc. AIAA/DGLR/JSASS International Propulsion Conf. (1987).

11. T. Shinbrot, C. Grebogi, E. Ott and J. A. Yorke, “Using Small Perturbations to Control Chaos,” Nature, vol. 363, pp. 411-417 (1993).

12. S. Hein and A. Zakhor, Sigma Delta Modulators: Nonlinear Decoding Algorithms and Stability Analysis, Kluwer Academic Publishers, 1993.

13. T. Basar and P. Bernhard, H∞-Optimal Control and Related Minimax Design Problems, Birkhauser, Boston, 1995.

14. H. S. Black, “Stabilized Feed-Back Amplifiers,” Bell System Technical Journal, vol. 13, 1934.

15. M. Hazewinkel and J. C. Willems, Stochastic Systems: The Mathematics of Filtering and Identification and Applications, Reidel, Dordrecht, The Netherlands, 1980.

16. H. Robbins and S. Monro, “A stochastic approximation method,” Ann. Math. Stat., vol. 22, pp. 400-407 (1951).

17. J. Kiefer and J. Wolfowitz, “Stochastic Estimation of the Maximum of a Regression Function,” Ann. Math. Stat., vol. 23, pp. 462-466 (1952).

18. S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and Robustness, Prentice Hall, Englewood Cliffs, NJ, 1989.

19. S. Geman and C. R. Hwang, “Diffusions for Global Optimization,” SIAM J. on Control and Optimization, vol. 24, pp. 1131-1143, 1986.

20. W. Baker, Learning via Stochastic Approximation in Function Space, PhD Thesis, Harvard University, 1997.

21. G. Tesauro, “TD-Gammon, a self-teaching backgammon program, achieves master-level play,” Neural Computation, vol. 6, pp. 215-219 (1994).


22. D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, MA, 1996.

23. R. W. Brockett, “Minimum Attention Control,” Proceedings of the 1997 CDC, pp. 2628-2632, 1997.

24. E. R. Kandel, J. H. Schwartz and T. M. Jessell, Essentials of Neural Science and Behavior, Appleton and Lange, Stamford, Connecticut, 1995.

25. E. D. Dickmanns, “An Integrated Spatio-Temporal Approach to Automatic Visual Guidance of Autonomous Vehicles,” IEEE Transactions on Systems, Man, and Cybernetics, Vol. 20, No. 6, 1990.

26. R. W. Brockett, “Hybrid Models for Motion Control Systems,” in Perspectives in Control,(H. Trentelman and J.C. Willems, eds.), Boston: Birkhauser, 1993, pp. 29-54.

27. B. Widrow and S. D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1985.

28. A. M. Bloch, “Steepest Descent, Linear Programming and Hamiltonian Flows,” Contemp. Math., AMS, vol. 114, pp. 77-88, 1990.

29. U. Helmke and J. Moore, Optimization and Dynamical Systems, Springer, London, 1994.

30. L. Rabiner and B-H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, NJ, 1993.

31. S. van der Meer, “Stochastic Cooling and the Accumulation of Antiprotons,” in Nobel Lectures in Physics, (Tore Frangsmyr, Ed.), World Scientific, Singapore, 1993.

32. R. W. Brockett, “Control of Stochastic Ensembles,” in The Astrom Symposium on Control, (B. Wittenmark and A. Rantzer, Eds.), Studentlitteratur, Lund, Sweden, 1999.

33. R. W. Brockett and N. Khaneja, “On the Control of Quantum Ensembles,” in System Theory: Modeling, Analysis and Control, (Theodore Djaferis and Irvin Schick, Eds.), Kluwer Academic Publishers, 1999.

34. G. M. Cooper, The Cell, ASM Press, Washington, D.C. 1997.

35. L. Hartwell, J. Hopfield, S. Leibler and A. Murray, “From Molecular to Modular Cell Biology,” Nature, vol. 402 (Supplement), Dec. 1999.

36. W. Ross Ashby, An Introduction to Cybernetics, Chapman and Hall, London, 1956.

37. H. H. McAdams and L. Shapiro, “Circuit Simulation of Genetic Networks,” Science, vol. 269, pp. 650-656 (1995).

38. M. Elowitz and S. Leibler, “A synthetic oscillatory network of transcriptional regulators,”Nature, vol 403, pp. 335-338. (2000).

39. L. P. Shearman et al. “Interacting Molecular Loops in the Mammalian Circadian Clock,”Science, vol. 288, pp. 1013-1019, (2000).

40. M. Ito, The Cerebellum and Neural Control, Raven Press, New York, 1984.

41. M. Athans, “On the determination of optimal cost measurement strategies for linear stochastic systems,” Abstracts of the IFAC Fifth World Congress, p. 127, 1972.
