
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 54, NO. 8, AUGUST 2009 1739

Model Predictive Control for Stochastic Resource Allocation

David A. Castañón, Senior Member, IEEE, and Jerry M. Wohletz, Member, IEEE

Abstract—In this paper, we consider a class of stochastic resource allocation problems where resources assigned to a task may fail probabilistically to complete assigned tasks. Failures to complete a task are observed before new resource allocations are selected. The resulting temporal resource allocation problem is a stochastic control problem, with a discrete state space and control space that grow in cardinality exponentially with the number of tasks. We modify this optimal control problem by expanding the admissible control space, and show that the resulting control problem can be solved exactly by efficient algorithms in time that grows nearly linearly with the number of tasks. The approximate control problem also provides a bound on the achievable performance for the original control problem. The approximation is used as part of a model predictive control (MPC) algorithm to generate resource allocations over time in response to information on task completion status. We show in computational experiments that, for single resource class problems, the resulting MPC algorithm achieves nearly the same performance as the optimal dynamic programming algorithm while reducing computation time by over four orders of magnitude. In multiple resource class experiments involving 1000 tasks, the model predictive control performance is within 4% of the performance bound obtained by the solution of the expanded control space problem.

Index Terms—Optimization, resource allocation, stochastic control.

I. INTRODUCTION

RESOURCE allocation problems such as multiprocessor scheduling or job shop scheduling assume that, once a resource works on a task, the task will be completed successfully. However, there are many resource allocation problems where resources can fail to complete the task, and subsequent resource assignments are required for that task. We refer to this class of problems as unreliable resource allocation problems. Examples of these problems include many gambling paradigms, assignment of search activity (e.g., sonobuoys) to sectors [23], assignment of ground-air missiles to aircraft, or more general assignment of weapons to targets [7], [8], [13] in diverse military applications. From an academic perspective, the process of submission of proposals to funding agencies can also be viewed as unreliable resource allocation.

In this paper, we are interested in the problem of dynamic resource allocation where resources are non-renewable and unreliable, and where the success of past resource allocations can be observed before new allocation decisions are made. These observations provide the opportunity for feedback in allocation decisions. These problems require selecting which tasks to process first, and what resources to hold in reserve for allocation after observing the success of early allocations. The optimal solution of these problems will result in closed-loop strategies which adapt to observed failures in task completion.

Manuscript received February 09, 2007; revised January 01, 2008. Current version published August 05, 2009. This work was supported by AFOSR under Grants FA9550-04-1-0133, FA9550-07-1-0361, and MURI Grant FA9550-07-1-0528. Recommended by Associate Editor S. Dey.

D. A. Castañón is with the Department of Electrical and Computer Engineering, Boston University, Boston, MA 02215 USA (e-mail: [email protected]).

J. M. Wohletz is with BAE Systems, Burlington, MA 01803 USA.

Digital Object Identifier 10.1109/TAC.2009.2024562

There is an extensive literature on weapon target assignment problems that are formulated as unreliable resource allocation problems. Most of these variations consist of static problems [7], [8], [12], [13], [19] where no information on allocation outcomes is observed. Dynamic variations where allocation outcomes are observed were studied in [11] and [2]. Recently, Murphey [15], [16] has addressed stochastic dynamic weapon assignment problems where new tasks arrive over time, and weapon assignments are unreliable. Murphey allows for observation of the new task arrivals, but no observation of the past allocation outcomes. The resulting problem formulation is a stochastic program [17], which requires feedback strategies that hedge against future task arrivals. In a recent paper, Glazebrook and Washburn [10] study several variations of dynamic weapon assignment problems that have simple index policies, including problems with utilization costs per weapon, partially observed outcomes, and multiarmed bandit versions of weapon assignment problems [20]–[22].

The search theory literature includes several results on dynamic search problems; a good overview of these results is available in [23]–[25]. However, most of the available results on dynamic search focus on sequential search for a single object, where search resources are allocated to a single site at a time, and the search stops when the object is found (so the task is completed). An exception to this is the recent work on sequential hypothesis testing [3], where the use of imperfect search sensors with false alarms and missed detections requires continued search.

In this paper, we focus on the allocation of unreliable resources to multiple tasks over two stages, where the outcomes of all the resources assigned in the first stage are observed before the second stage allocations are selected. We cast the problem as a stochastic control problem, where the dynamic state consists of the subset of tasks that remain incomplete, and the transition probability of the state is governed by the first stage allocations. Although this problem can be solved using stochastic dynamic programming (SDP) [1], the number of states grows exponentially with the number of tasks, and the cardinality of the control space also grows exponentially with the number of resources.

We propose an alternative algorithm, based on the solution of an approximate SDP formulation. We replace the original Markov Decision Problem formulation by expanding the class of admissible strategies through relaxation of constraints on admissible decisions, in a manner similar to our earlier work [2], [4] and the work of Yost and Washburn [26], [27]. The solution of this relaxed problem formulation provides a bound to the achievable performance in the original problem. The relaxed formulation is a class of stochastic control problems with expected value constraints, similar to those studied in [6]. Using a primal-dual stochastic control approach, we develop a new class of algorithms for solving this problem with computational complexity that grows nearly linearly with the number of tasks and resources for the case of a single resource class. Furthermore, we use the solution to this relaxed formulation to develop a model-predictive controller [5], [14] which generates the first stage allocations using the relaxed formulation, then re-solves the problem based on the observed outcome information to guarantee feasibility of the resulting decisions. We further extend the relaxed formulation and model predictive control approach to problems with non-identical resources. We compare the performance of the model-predictive controller with that of the optimal SDP algorithm, a faster suboptimal SDP algorithm, and an intelligent heuristic using randomly generated resource allocation problems. Our results show that, for problems with identical resources, the model predictive algorithm achieves on average performance that is within 2% of an average task value when compared with the performance of the optimal SDP algorithm, while reducing computation time by several orders of magnitude. The average performance of the intelligent heuristic is nearly an order of magnitude worse. For non-identical resources, the performance of the model predictive algorithm is over 98% of an upper bound to the optimal performance of the SDP algorithm, with similar reductions in computation time.

The rest of this paper is organized as follows: Section II describes the mathematical formulation of the two-stage unreliable resource allocation problem with a single class of resources. This formulation illustrates the underlying stochastic control problem using simple notation. Section III develops a solution to this problem using Stochastic Dynamic Programming, and describes an approximate greedy algorithm for the problem. These algorithms are computationally intensive, requiring evaluation of large numbers of outcomes. Section IV describes the approximate SDP formulation and its solution, and the resulting algorithm used in the model predictive control approach. Section V contains the extensions to multiple resource types. Section VI discusses the numerical experiments. Section VII presents our conclusions and suggestions for further research.

II. PROBLEM FORMULATION

Consider a two-stage unreliable dynamic resource allocation problem with a single resource class. Assume that there are $N$ tasks, indexed by $i = 1, \ldots, N$, and that there are a total of $M$ non-renewable homogeneous resources which can be assigned to the tasks over two possible stages. Each task has a numerical value $V_i$ which is obtained by completing the task. Each resource has a cost $c$ of assigning the resource. When a resource is assigned to task $i$ in stage $k$, the event that the resource completes the task successfully has probability $p_{ik}$, and this event is independent of any other events generated by other resource assignments.

Let $u_{i1}$ denote the number of resources assigned to task $i$ at stage 1. The probability that task $i$ is not completed by these resource assignments is obtained from the above independence assumptions as:

$$\Pr\{\text{task } i \text{ not completed at stage 1}\} = (1 - p_{i1})^{u_{i1}}. \qquad (1)$$

At the end of stage 1, the set of remaining tasks will be observed. This set will denote the task state vector, a Boolean vector $x \in \{0,1\}^N$; $x_i = 0$ denotes that task $i$ was completed in stage 1, and $x_i = 1$ denotes the complementary event for task $i$. Given a vector of stage 1 resource allocations $u_1 = (u_{11}, \ldots, u_{N1})$, (1) induces a probability distribution $P(x \mid u_1)$ on the $2^N$ possible outcomes. The stage 2 allocations are strategies which depend on the specific outcome $x$ which is observed, as $u_{i2}(x)$. We refer to these strategies as recourse strategies.

The evolution of the task state across stages can be described by a Markov process with states in $\{0,1\}^N$, as follows. Initially, $x(0)$ is a vector of all ones, indicating that no tasks are complete. Given resource allocations $u_1$ and the independence assumptions, the state transitions to $x(1)$, where the probability that $x_i(1) = 1$ is given by $(1 - p_{i1})^{u_{i1}}$. Throughout the rest of this paper, we will use $x$ without a stage index to refer to $x(1)$, the state resulting from the actions at stage 1. Given recourse strategies $u_2(x)$, the state transitions to $x(2)$. The probability that task $i$ is not completed either in stage 1 or in stage 2, so $x_i(2) = 1$, is given by

$$\Pr\{x_i(2) = 1\} = \sum_{x \in \{0,1\}^N} P(x \mid u_1)\, \mathbf{1}[x_i = 1]\, (1 - p_{i2})^{u_{i2}(x)} \qquad (2)$$

where $\mathbf{1}[\cdot]$ is the indicator function, and

$$P(x \mid u_1) = \prod_{i=1}^{N} \left[(1 - p_{i1})^{u_{i1}}\right]^{x_i} \left[1 - (1 - p_{i1})^{u_{i1}}\right]^{1 - x_i}. \qquad (3)$$

The optimal control problem is to select resource allocations $u_{i1}$ and recourse strategies $u_{i2}(\cdot)$ to minimize the expected incomplete task value plus the expected cost of using resources:

$$\min_{u_1,\, u_2(\cdot)}\; E\left\{\sum_{i=1}^{N} V_i\, x_i(2) + c \sum_{i=1}^{N} \left[u_{i1} + u_{i2}(x)\right]\right\} \qquad (4)$$

subject to the constraints

$$\sum_{i=1}^{N} \left[u_{i1} + u_{i2}(x)\right] \le M \qquad (5)$$

$$u_{i1},\; u_{i2}(x) \in \{0, 1, 2, \ldots\} \qquad (6)$$

for all $x \in \{0,1\}^N$.

The above problem is a two-stage stochastic control problem with a discrete state space that grows exponentially in the number of tasks $N$, and a discrete decision space which also grows exponentially in the number of tasks. In the next section, we discuss the solution of this problem using stochastic dynamic programming.
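To make the formulation concrete, the following sketch estimates the objective (4) by Monte Carlo for a small instance under a fixed recourse rule. This is a minimal sketch: the instance data, the even-split recourse rule, and all names are illustrative assumptions, not part of the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative instance: N = 3 tasks, M = 4 identical resources.
values = np.array([10.0, 6.0, 8.0])   # task values V_i
p1 = np.array([0.7, 0.8, 0.75])       # stage-1 success probability p_{i1}
p2 = np.array([0.7, 0.8, 0.75])       # stage-2 success probability p_{i2}
cost, M = 0.5, 4

def simulate(u1, recourse, trials=100_000):
    """Monte Carlo estimate of (4): expected incomplete task value
    plus expected resource cost, for stage-1 allocation u1 and a
    recourse rule u2 = recourse(x, u1)."""
    total = 0.0
    for _ in range(trials):
        # x_i = 1 with probability (1 - p_{i1})**u_{i1}, per (1).
        x = (rng.random(len(values)) < (1 - p1) ** u1).astype(int)
        u2 = recourse(x, u1)
        # Task stays incomplete if it also survives all stage-2 resources.
        incomplete = x * (rng.random(len(values)) < (1 - p2) ** u2)
        total += values @ incomplete + cost * (u1.sum() + u2.sum())
    return total / trials

def even_recourse(x, u1):
    """Naive recourse: hand the leftover resources one at a time to
    the incomplete tasks, round-robin (feasible for every outcome x)."""
    u2 = np.zeros_like(u1)
    idx = np.flatnonzero(x)
    for r in range(M - u1.sum()):
        if idx.size:
            u2[idx[r % idx.size]] += 1
    return u2

print(simulate(np.array([1, 1, 1]), even_recourse))
```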

III. STOCHASTIC DYNAMIC PROGRAMMING SOLUTION

Consider first the problem at the second stage, after the state has been observed. Without loss of generality, assume that there are $n$ incomplete tasks in $x$, renumbered from $1, \ldots, n$, and that there are $m$ resources remaining. The second stage problem can be expressed as follows:

$$J_2(x, m) = \min_{u_{12}, \ldots, u_{n2}} \sum_{i=1}^{n} \left[V_i (1 - p_{i2})^{u_{i2}} + c\, u_{i2}\right] \qquad (7)$$

subject to

$$\sum_{i=1}^{n} u_{i2} \le m, \qquad u_{i2} \in \{0, 1, 2, \ldots\}. \qquad (8)$$

Define real-valued functions over nonnegative integer allocation variables as

$$f_i(u) = V_i (1 - p_{i2})^{u} + c\, u, \qquad u \in \{0, 1, 2, \ldots\} \qquad (9)$$

and define $\bar f_i(u)$ as the linear interpolation of the function $f_i(u)$. Note that $\bar f_i(u)$ is convex in $u$. Relaxing the second stage optimization problem to allow real-valued nonnegative allocations $u_{i2} \in [0, \infty)$ results in a monotropic optimization problem [18] of the form

$$\min_{u_{12}, \ldots, u_{n2}} \sum_{i=1}^{n} \bar f_i(u_{i2}) \qquad (10)$$

subject to

$$\sum_{i=1}^{n} u_{i2} \le m, \qquad u_{i2} \ge 0. \qquad (11)$$

The separable convex nature of the objective function in (10) and the single additive constraint leads to simple computations of subgradients, and guarantees the existence of a scalar Lagrange multiplier which satisfies the Karush-Kuhn-Tucker conditions [18]. Furthermore, the facts that the $\bar f_i$ are piecewise linear and continuous, $m$ is an integer, and the nondifferentiable points of $\bar f_i$ occur only at integer values of $u$ guarantee that an optimal solution to (10), (11) can be found that has $u_{i2}$ with integer values [18]. The result is a fast algorithm for determining the optimal recourse allocations $u_{i2}^*$ and the corresponding optimal value $J_2(x, m)$, which is equivalent to an incremental line search for the optimal Lagrange multiplier value. The key structural result is stated below:

Lemma 1: Consider the second stage allocation problem defined by (10)–(11). There exists a nonnegative value $\lambda^*$ such that the optimal resource allocation is given by

$$u_{i2}^* \in \arg\min_{u \ge 0}\left[\bar f_i(u) + \lambda^* u\right], \qquad i = 1, \ldots, n. \qquad (12)$$

Furthermore, $\lambda^*$ can be chosen to be equal to the negative of one of the slopes of the piecewise linear functions $\bar f_i$.

Proof: The existence of $\lambda^*$ is guaranteed by the convexity of the objective, the single linear constraint and strong duality. Note that the solution to (12) will only change at discrete values of $\lambda^*$, corresponding to the negative values of the slopes of $\bar f_1, \ldots, \bar f_n$, so the values of $\lambda^*$ can be restricted to this discrete set.
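As an illustration of the incremental search that Lemma 1 licenses, the following sketch solves (10)–(11) greedily: since each $\bar f_i$ is integer-convex, per-task marginal gains are nonincreasing, and handing each successive resource to the task with the largest positive gain is optimal. The function names and the instance are illustrative assumptions.

```python
import heapq

def second_stage(values, p2, cost, m):
    """Incremental solution of (10)-(11) with f_i(u) = V_i*(1-p2_i)**u
    + cost*u: each f_i is integer-convex, so marginal improvements are
    nonincreasing, and greedily assigning the m resources one at a time
    to the task with the largest positive gain is optimal."""
    u = [0] * len(values)

    def gain(i):
        # f_i(u_i) - f_i(u_i + 1): value saved by one more resource.
        q = 1 - p2[i]
        return values[i] * q ** u[i] * (1 - q) - cost

    heap = [(-gain(i), i) for i in range(len(values))]
    heapq.heapify(heap)
    for _ in range(m):
        g, i = heapq.heappop(heap)
        if -g <= 0:              # no increment lowers the objective: stop
            break
        u[i] += 1
        heapq.heappush(heap, (-gain(i), i))
    return u

print(second_stage([10.0, 6.0, 8.0], [0.7, 0.8, 0.75], 0.5, 4))
```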

Note that, since the breakpoints of $\bar f_i$ occur at integer values and the constraint in (11) has an integer right hand side, we can always find an optimal solution to (12) with integer values. Furthermore, the optimal solution has the following incremental optimality property: the optimal allocation vector $u_2^*(m) = (u_{12}^*(m), \ldots, u_{n2}^*(m))$ is monotone nondecreasing in each coordinate as a function of $m$. This leads to efficient algorithms for solving the second stage allocation problem [8]. These efficient algorithms are used for each possible first stage outcome $x$ and remaining resource level $m$ to compute the optimal cost-to-go $J_2(x, m)$. The optimal cost-to-go has the following properties:

Lemma 2: The optimal cost-to-go function $J_2(x, m)$ has the following properties:

1) $J_2(x, m)$ is a convex, piecewise linear, nonincreasing function of $m$ with breakpoints only at integer values of $m$.
2) Consider two distinct outcomes $x, x'$. If $x_i \le x'_i$ for $i = 1, \ldots, N$, then $J_2(x, m) \le J_2(x', m)$.

Proof: The nonincreasing property follows from expanding the admissible solutions in the minimization as $m$ increases. To show the convexity property, note that the algorithm of Lemma 1 indicates that, as a function of $m$, $J_2(x, m)$ is a piecewise linear function with breakpoints at every integer, and with non-decreasing slopes, which establishes convexity.

To establish the second property, note that, for a given value of $m$, every set of feasible decisions for $x$ is also a set of feasible decisions for $x'$, and $x'$ has at least one additional incomplete task. This immediately establishes the desired inequality.

Consider now the first stage problem. The solution of (4)–(6) satisfies the stochastic dynamic programming equation

$$J^* = \min_{u_1}\left\{c \sum_{i=1}^{N} u_{i1} + \sum_{x \in \{0,1\}^N} P(x \mid u_1)\, J_2\!\left(x,\; M - \sum_{i=1}^{N} u_{i1}\right)\right\} \qquad (13)$$

subject to the constraint

$$\sum_{i=1}^{N} u_{i1} \le M, \qquad u_{i1} \in \{0, 1, 2, \ldots\} \qquad (14)$$

where (3) defines $P(x \mid u_1)$.

Unfortunately, the above optimization problem is a non-separable integer programming problem, and the objective function has $2^N$ terms in the summation.


Exact solution of this problem is a difficult combinatorial problem. However, the presence of the single constraint (14) suggests the use of an approximate incremental optimization approach (a greedy algorithm) similar to that used for the second stage problem, as follows. Let $J(u_1)$ denote the objective in (13), and define the notation $u_1 + e_i$ to denote the vector $(u_{11}, \ldots, u_{i1} + 1, \ldots, u_{N1})$. Let $\Delta_i(u_1)$ be defined as

$$\Delta_i(u_1) = J(u_1) - J(u_1 + e_i). \qquad (15)$$

The greedy algorithm can be described as follows (a code sketch appears at the end of this section):

1) Initialize $u_1 = 0$.
2) For each $i$, compute $\Delta_i(u_1)$.
3) Select $i^*$ for which $\Delta_{i^*}(u_1) \ge \Delta_i(u_1)$ for all $i$.
4) If $\Delta_{i^*}(u_1) > 0$ and $\sum_i u_{i1} < M$, set $u_1 := u_1 + e_{i^*}$; otherwise, stop.
5) Repeat steps 2–4 until the algorithm stops.

Note that the solution to (13), (14) is not guaranteed to have the incremental optimality property. Thus, the above algorithm is only an approximate algorithm, although our experimental results later indicate its performance is very similar to that of an enumerative search. Note also that computation of $\Delta_i(u_1)$ still requires summation over $2^N$ terms, an exponential complexity in the number of tasks. In the next section, we describe an alternative suboptimal approach, based on using Model Predictive Control (MPC) [14] with an approximate optimization model, which can generate solutions in time that grows nearly linearly with the number of tasks and resources.

IV. MODEL PREDICTIVE CONTROL

To avoid the exponential growth in complexity as the number of tasks and resources grows, we propose an alternative control approach based on MPC principles. The idea is to solve an approximate model of the optimization problem in (4)–(6), which expands the admissible strategy space. The structure of the approximate problem can be exploited to yield fast algorithms that scale nearly linearly with the number of tasks and resources. However, the resulting strategies may not satisfy the constraints in (5). Using MPC principles, we propose a control approach which implements the first period strategies obtained from solving the approximate problem, and then re-solves the problem after the first period once the new state information is available, in order to determine optimal second period allocations that are feasible given the observed outcomes.

A. Approximate Problem

Consider the optimization problem represented by (4)–(6). Note that (5) represents $2^N$ constraints, one per sample outcome. The idea behind the approximate problem is to replace the constraints in (5) by one average resource utilization constraint, similar to the approach used in [2], [4], [26], [27] for other dynamic resource allocation problems.

The new constraint is:

$$E\left\{\sum_{i=1}^{N}\left[u_{i1} + u_{i2}(x)\right]\right\} \le M. \qquad (16)$$

The approximate problem is to minimize (4) subject to the constraints in (6), (16). This is a stochastic control problem subject to an expected value constraint, of the type studied in [6]. The results of [6] provide a dynamic programming algorithm for this problem that recurs both on the constraint level and the state $x$, resulting in a complex recursion that is not separable across tasks. In this section, we treat the constraint level as known, and use dual optimization techniques to develop a fast algorithm for computing the optimal solution.

In the proposed MPC approach, minimization of (4) subject to the constraints in (6), (16) determines $u_{i1}$, the first stage resource allocations, as well as strategies for future allocations. Once the first stage allocations are implemented, the outcome $x$ is observed. Based on this outcome, the second stage allocations are determined by a subsequent optimization problem using the optimal algorithm described in Lemma 1. The key to the MPC approach is the efficient solution, over closed-loop strategies, of the problem in (4), (6), (16), which is described next.

Let $J^*$ denote the optimal value of the original stochastic control problem in (4)–(6), and let $J_R$ denote the optimal value of the relaxed stochastic control problem in (4), (6), (16). Since every strategy that satisfies the constraints in (5) also satisfies the constraint (16), we have the result:

Lemma 3: $J_R \le J^*$.

This implies that the solution of the relaxed problem overestimates the expected performance that can be achieved with the available resources in the second stage; thus, this solution will have a bias to commit more resources in the first stage than the original problem.

The important question is whether one can solve the approximate problem (4), (6), (16) efficiently, because the optimization is over integer allocations. Note that the set of outcomes is finite but grows exponentially with the number of tasks $N$; thus, the set of possible feedback strategies satisfying (6) is very large.

To address this issue, we expand the space of strategies to allow mixed strategies. Let $k$ index the set of strategies satisfying (6). Define a discrete random variable $z$, independent of other random variables, with probability mass function $w_k = \Pr\{z = k\}$. A mixed strategy is a mixture probability over the set of feedback strategies satisfying (6), such that strategy $\gamma^k$ is used with probability $w_k$. Thus, mixed strategies are denoted as $(w, \{\gamma^k\})$ such that, for each $k$, $\gamma^k$ is a feedback strategy satisfying (6).

The performance of a mixed strategy can be characterized in terms of the expected performance of each of the pure strategies satisfying (6). For a given pure strategy $\gamma$, define the expected resource use and expected performance as

$$R(\gamma) = E\left\{\sum_{i=1}^{N}\left[u_{i1}^{\gamma} + u_{i2}^{\gamma}(x)\right]\right\} \qquad (17)$$

$$P_i(\gamma) = \Pr\{x_i(2) = 1 \mid \gamma\} \qquad (18)$$

$$J(\gamma) = \sum_{i=1}^{N} V_i\, P_i(\gamma) + c\, R(\gamma). \qquad (19)$$

By using mixed strategies with mixture probability $w$, the expected resource use and performance of the mixed strategy are given by

$$R(w) = \sum_k w_k\, R(\gamma^k), \qquad J(w) = \sum_k w_k\, J(\gamma^k).$$

Thus, one can achieve any performance and resource utilization in the convex hull of $\{(J(\gamma^k), R(\gamma^k))\}$.

Denote by $J_M$ the optimal value of (4), (6), (16) when mixed strategies are allowed. Since the space of mixed strategies includes the set of all pure strategies, we have:

Lemma 4: $J_M \le J_R \le J^*$.

B. Solution of Approximate Problem

We want to compute the solution $J_M$ as the optimal value of (4), (6), (16) over mixed strategies. To do this, we exploit a special characterization of optimal strategies for (4), (6), (16), as follows. Define a restricted class of feedback strategies, denoted as local strategies, as:

Definition 1: A local strategy consists of a pure strategy with the property that $u_{i2}(x) = u_{i2}(x_i)$ for each task $i$.

Local strategies generate recourse allocations for individual tasks based on the observed state of that task only. In contrast, the recourse allocations in general feedback strategies depend on the combined states of all tasks $x$. Let $\Gamma_L$ denote the set of all pure local strategies. The characterization of optimal strategies is provided in the following theorem.

Theorem 1: Consider the optimization problem in (4), (6), (16). Given any pure strategy $\gamma$, there is a mixed strategy $(w, \{\gamma^k\})$, based only on local strategies in $\Gamma_L$, that achieves the same expected performance and the same expected resource use as the pure strategy.

This result is a special case of the more general result in Theorem 2 that is proved in the Appendix for the case with multiple resource types. Basically, the proof exploits the property that the objectives and the averaged resource use can be decomposed additively over tasks in generating an explicit construction of the mixed local strategies which have equivalent expected performance and expected resource use as a given pure strategy. Hence, Theorem 1 implies that one can restrict the optimization problem for computing $J_M$ to mixtures of local strategies.

Let $k$ denote an index over all local strategies and let $(J_k, R_k)$ denote the expected performance and resource utilization of strategy $k$. Note that the set of all local strategies is finite due to the finite number of outcomes and possible allocations per outcome. The optimal mixed local strategy is the solution of the linear program

$$\min_{w} \sum_{k} w_k\, J_k \qquad (20)$$

subject to

$$\sum_{k} w_k\, R_k \le M, \qquad \sum_k w_k = 1, \qquad w_k \ge 0. \qquad (21)$$

This linear program optimizes over mixtures of all local strategies, which is a large number. The next results provide a better characterization of the optimal strategy.

Lemma 5: There is an optimal mixed strategy which is a mixture of at most two pure local strategies.

Proof: The linear program has two constraints, and hence has a basic feasible solution with at most two nonzero elements $w_k$.

The solution of the above linear program and its dual yields an optimal dual variable $\lambda^* \ge 0$ for the average resource utilization constraint (16). Let $\Gamma_M$ denote the set of mixed local strategies. The Lagrangian of (4), (6), (16) when optimizing over mixed local strategies is defined as:

$$L(\gamma, \lambda) = J(\gamma) + \lambda\left[R(\gamma) - M\right] \qquad (22)$$

subject to the integer constraints in (6). The convexity of the optimization problem (20), (21) implies that strong duality holds, so that

$$J_M = \max_{\lambda \ge 0}\; \min_{\gamma \in \Gamma_M} L(\gamma, \lambda). \qquad (23)$$

Note that $L(\gamma, \lambda)$ in (22) is separable across tasks because the optimization is over mixtures of local feedback strategies. That is,

$$\min_{\gamma \in \Gamma_M} L(\gamma, \lambda) = \sum_{i=1}^{N} \min_{\gamma_i} L_i(\gamma_i, \lambda) - \lambda M \qquad (24)$$

where $L_i(\gamma_i, \lambda)$ denotes the contribution of task $i$ to the Lagrangian under local strategy $\gamma_i$. A Lagrangian optimization approach can thus be used to compute optimal mixtures of local strategies. For each value of $\lambda$, we solve the individual minimization problems in (24). Since these minimization problems are unconstrained, the minimum is obtained by a pure strategy. Thus, this minimum is obtained for each task as

$$\gamma_i^*(\lambda) \in \arg\min_{\gamma_i} L_i(\gamma_i, \lambda). \qquad (25)$$

The solution of these problems can be used to update the value of $\lambda$ using a subgradient approach for (24).
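For concreteness, a sketch of this Lagrangian decomposition with a subgradient update on $\lambda$, assuming the local-strategy parameterization made explicit in the next subsection (a pair $(u_{i1}, u_{i2})$ per task, with the stage-2 amount spent only if the task survives stage 1). The enumeration bound and step-size rule are illustrative assumptions.

```python
import itertools

def best_local(V, q1, q2, cost, lam, u_max=10):
    """Per-task minimization (25): enumerate pure local strategies
    (u1, u2) and minimize expected surviving value plus (cost + lam)
    times expected resource use u1 + q1**u1 * u2."""
    best = None
    for u1, u2 in itertools.product(range(u_max + 1), repeat=2):
        use = u1 + q1 ** u1 * u2           # expected resources spent
        obj = V * q1 ** u1 * q2 ** u2 + (cost + lam) * use
        if best is None or obj < best[0]:
            best = (obj, u1, u2, use)
    return best

def dual_subgradient(values, p1, p2, cost, M, iters=500):
    """Adjust lam by a diminishing-step subgradient of the dualized
    average resource constraint (16): E{resource use} - M."""
    lam = 0.0
    for t in range(1, iters + 1):
        sols = [best_local(V, 1 - a, 1 - b, cost, lam)
                for V, a, b in zip(values, p1, p2)]
        g = sum(s[3] for s in sols) - M    # subgradient of the dual
        lam = max(0.0, lam + g / t)        # projected ascent step
    return lam, [(s[1], s[2]) for s in sols]

lam, alloc = dual_subgradient([10.0, 6.0, 8.0], [0.7, 0.8, 0.75],
                              [0.7, 0.8, 0.75], 0.5, M=4)
print(lam, alloc)
```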


C. Fast Algorithms for Approximate Problem

The structure of the optimization problem in (24) can be exploited to search for an optimal value of $\lambda$ in a finite number of steps, as described in this subsection. Without loss of generality, we restrict the admissible local strategies to those where $u_{i2}(0) = 0$, because allocating resources at stage 2 to any task that was successfully completed in stage 1 is demonstrably suboptimal by a simple interchange argument. In this case, each local strategy can be characterized by a set of allocation integer vectors $(u_{i1}, u_{i2})$, where $u_{i1}$ is the stage 1 allocation and $u_{i2} = u_{i2}(1)$ is the stage 2 allocation used when task $i$ remains incomplete. The local performance for task $i$ given an allocation resource vector $(u_{i1}, u_{i2})$ is readily computed using the independence assumptions as

$$G_i(u_{i1}, u_{i2}) = V_i (1 - p_{i1})^{u_{i1}} (1 - p_{i2})^{u_{i2}} \qquad (26)$$

and the local expected resource use for task $i$ as

$$r_i(u_{i1}, u_{i2}) = u_{i1} + (1 - p_{i1})^{u_{i1}}\, u_{i2}. \qquad (27)$$

Using this notation, (25) becomes a deterministic integer optimization

$$\Phi_i(\lambda) = \min_{u_{i1},\, u_{i2} \ge 0}\left[G_i(u_{i1}, u_{i2}) + (c + \lambda)\, r_i(u_{i1}, u_{i2})\right]. \qquad (28)$$

The function $\Phi_i(\lambda)$ is a concave, piecewise linear function of $\lambda$, with nondifferentiable corners. Let $(u_{i1}^*(\lambda), u_{i2}^*(\lambda))$ denote the minimizing arguments in (28). The corners of $\Phi_i(\lambda)$ correspond to values of $\lambda$ for which two different local strategies achieve the same minimum. The following lemma provides an additional characterization of the optimal local strategies:

Lemma 6: The optimizing solutions $(u_{i1}^*(\lambda), u_{i2}^*(\lambda))$ of (28) are monotone nonincreasing in $\lambda$.

The proof is given in the Appendix. There is another representation of the optimization problem (4), (6), (16) that can be related to the single task problems in (25). Consider the single task optimization problem

$$F_i(m_i) = \min_{\{w_\ell\}} \sum_\ell w_\ell\left[G_i(u_{i1}^\ell, u_{i2}^\ell) + c\, r_i(u_{i1}^\ell, u_{i2}^\ell)\right] \qquad (29)$$

subject to

$$\sum_\ell w_\ell\, r_i(u_{i1}^\ell, u_{i2}^\ell) \le m_i \qquad (30)$$

where the minimization is over mixtures $\{w_\ell\}$ of the pure local strategies $(u_{i1}^\ell, u_{i2}^\ell)$ for task $i$. This problem represents the best performance for the single task subproblem, subject to using expected resources less than or equal to $m_i$ on task $i$.

With the above notation, problem (4), (6), (16) becomes the following hierarchical optimization problem

$$\min_{m_1, \ldots, m_N} \sum_{i=1}^{N} F_i(m_i) \qquad (31)$$

subject to

$$\sum_{i=1}^{N} m_i \le M, \qquad m_i \ge 0 \qquad (32)$$

where the $F_i(m_i)$ are given by the minimization in (29), (30). These functions have the following important properties:

Lemma 7: For each $i$, $F_i(m_i)$ is a piecewise linear, convex, non-increasing function of $m_i$.

Proof: The function is the minimum over a convex polytope defined by the pure local strategies for task $i$, therefore it is convex. The non-increasing nature of the function follows because of the increasing domain of optimization as $m_i$ increases.

Lemma 7 implies that the optimization problem (31), (32) is a monotropic programming problem, of the type discussed earlier in the second stage of dynamic programming. The only difference is that the corner points of $F_i(m_i)$ do not occur at integer values of $m_i$. The optimal solution can be obtained by a similar algorithm: the negatives of the slopes of segments of $F_i$ are possible values of the Lagrange multiplier associated with transitions in resource allocations.

We now describe an algorithm for determining the pure strategies which constitute the corners of the function $F_i$, based on Lemmas 6 and 7. Start with a high value of $\lambda$, which yields the optimal solution $(u_{i1}, u_{i2}) = (0, 0)$ in (28). The algorithm reduces the value of $\lambda$ by discrete amounts until the maximal value of either $u_{i1}$ or $u_{i2}$ increases. In particular, Lemma 6 requires that the values of $\lambda$ for which the optimal solution of (28) changes must increase either $u_{i1}$ or $u_{i2}$ by one. Recall that the expected resource use and performance associated with task $i$ are given in (26), (27). Define the notation

$$\Delta_1 G_i = G_i(u_{i1}, u_{i2}) - G_i(u_{i1} + 1, u_{i2}), \qquad \Delta_1 r_i = r_i(u_{i1} + 1, u_{i2}) - r_i(u_{i1}, u_{i2}) \qquad (33)$$

$$\Delta_2 G_i = G_i(u_{i1}, u_{i2}) - G_i(u_{i1}, u_{i2} + 1), \qquad \Delta_2 r_i = r_i(u_{i1}, u_{i2} + 1) - r_i(u_{i1}, u_{i2}). \qquad (34)$$

The corners of the function $F_i$ are associated with a sequence of resource allocations $(u_{i1}^\ell, u_{i2}^\ell)$ which have the property that the expected resource use $r_i(u_{i1}^\ell, u_{i2}^\ell)$ is increasing in $\ell$. The algorithm proceeds as follows:

1) Initialize $\ell = 0$, $(u_{i1}^0, u_{i2}^0) = (0, 0)$.
2) To compute the $\ell$-th corner point, define the following two slopes:

$$s_1 = \begin{cases} \Delta_1 G_i / \Delta_1 r_i & \text{if } \Delta_1 r_i > 0 \\ -\infty & \text{otherwise} \end{cases} \qquad s_2 = \begin{cases} \Delta_2 G_i / \Delta_2 r_i & \text{if } \Delta_2 r_i > 0 \\ -\infty & \text{otherwise.} \end{cases}$$

3) If $\max(s_1, s_2) \le 0$, stop. There are no further corners in $F_i$. If the maximum is positive, and $s_1 \ge s_2$, then set $(u_{i1}^{\ell+1}, u_{i2}^{\ell+1}) = (u_{i1}^\ell + 1, u_{i2}^\ell)$; otherwise, set $(u_{i1}^{\ell+1}, u_{i2}^{\ell+1}) = (u_{i1}^\ell, u_{i2}^\ell + 1)$.
4) Repeat the above two steps until no further corners are found or $r_i(u_{i1}^\ell, u_{i2}^\ell) \ge M$.

The above algorithm steps over the possible values of $\lambda$ which lead to changes in the optimal solution of (28), in decreasing order. The ratios computed in step 2 are possible values of $\lambda$ where two resource allocations that differ by one have the same value. Choosing the largest value of $\lambda$ defines which of the two variables is incremented first along the convex function $F_i$. The resulting algorithm has complexity linear in the number of allocation increments considered.

The number of possible transition values of $\lambda$ is finite for each task, and hence finite in total over all tasks. Performing a line search over these values results in a polynomial time algorithm for exact solution. Furthermore, the solution will have the property that only one task (corresponding to a negative slope equal to the final value of $\lambda$) will use a local mixed strategy (cf. Lemma 5). A faster algorithm that computes the slopes incrementally for each $\lambda$, and keeps track only of the next slope for each task, can be shown to solve the problem in time that grows nearly linearly with the number of tasks and resources.
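A sketch of the per-task corner enumeration just described, under the parameterization of (26)–(27) and with the resource cost folded into the objective; the increment rule follows Lemma 6 (take whichever of $u_{i1}$, $u_{i2}$ offers the larger ratio of objective decrease to expected-resource increase). The cap on total increments and the guard for non-positive ratios are illustrative assumptions.

```python
def corners(V, q1, q2, cost, max_steps=50):
    """Enumerate corner points (expected use, best value) of the
    per-task function F_i in (29)-(30), walking the allocations
    (u1, u2) in order of decreasing candidate multiplier lambda."""
    g = lambda u1, u2: V * q1**u1 * q2**u2 + cost * (u1 + q1**u1 * u2)
    r = lambda u1, u2: u1 + q1**u1 * u2    # expected resource use (27)
    u1 = u2 = 0
    pts = [(r(0, 0), g(0, 0))]
    for _ in range(max_steps):
        cand = []
        for du1, du2 in ((1, 0), (0, 1)):   # try incrementing u1 or u2
            dr = r(u1 + du1, u2 + du2) - r(u1, u2)
            dg = g(u1, u2) - g(u1 + du1, u2 + du2)
            # Ratio is a candidate lambda; valid only if it trades a
            # strict objective decrease for extra expected resources.
            cand.append(dg / dr if dr > 0 and dg > 0 else float('-inf'))
        if max(cand) == float('-inf'):      # no profitable increment left
            break
        if cand[0] >= cand[1]:
            u1 += 1
        else:
            u2 += 1
        pts.append((r(u1, u2), g(u1, u2)))
    return pts

print(corners(10.0, 0.3, 0.3, 0.5))
```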

D. Model Predictive Control Algorithm

The solution of the approximate problem (4), (6), (16) is not guaranteed to satisfy the constraints of (5). Furthermore, the solution is a mixture of two pure strategies. However, the first stage allocations of each task are feasible for the constraint in (5). In the MPC approach, the approximate problem solution determines the first stage allocations for each task as follows. As discussed in the previous subsection, the optimal mixed strategy is a mixture of two strategies for one task, and a pure local strategy for all other tasks. For the task with a local mixed strategy, we allocate the smaller of the two first stage allocations to that task.

Once the outcomes of the first stage allocations are observed, the remaining resources can be assigned using a second optimization problem, based on our earlier discussion of the second stage solution (10)–(11). This solution is guaranteed to be feasible with respect to (5) for the specific realization $x$ that was observed.
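Schematically, the resulting MPC loop looks as follows; solve_relaxed, second_stage, observe, and the instance fields are assumed interfaces standing in for the algorithms of Sections IV-C and III, not code from the paper.

```python
def mpc_episode(instance, solve_relaxed, second_stage, observe):
    """One pass of the MPC scheme: commit the stage-1 allocations of
    the relaxed solution (taking the smaller allocation for the one
    task with a mixed strategy), observe the realized task states,
    then re-solve the second-stage problem, which is feasible for
    the observed outcome by construction."""
    u1 = solve_relaxed(instance)            # stage-1 allocations, one per task
    x = observe(u1)                         # realized outcome in {0,1}^N
    remaining = instance["M"] - sum(u1)     # resources held in reserve
    u2 = second_stage(instance, x, remaining)
    return u1, x, u2
```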

V. EXTENSION TO MULTIPLE RESOURCE TYPES

The above results can be extended to problems with multiple resource types. Assume that there are resources of $K$ different types, indexed by $j = 1, \ldots, K$, each of which has probability $p_{ijk}$ of successfully completing task $i$ in stage $k$, and this event is independent of any other events generated by all other resource assignments. We extend the previous notation to let $u_{ij1}$ denote the number of resources of type $j$ assigned to task $i$ at stage 1. The probability that task $i$ is not completed by this resource assignment is obtained from the above independence assumptions as:

$$\Pr\{x_i = 1 \mid u_{i1}\} = \prod_{j=1}^{K}(1 - p_{ij1})^{u_{ij1}}. \qquad (35)$$
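As a quick check of (35), the stage-1 survival probability for a task is just a product over resource types; the names here are illustrative.

```python
import numpy as np

def survive_stage1(q, u):
    """Probability that a task is not completed in stage 1 when u[j]
    resources of type j are assigned, each failing independently with
    probability q[j] = 1 - p[j]:  prod_j q[j] ** u[j], as in (35)."""
    return float(np.prod(np.asarray(q, float) ** np.asarray(u)))

print(survive_stage1([0.3, 0.2], [2, 1]))   # 0.3**2 * 0.2**1 = 0.018
```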

At the completion of stage 1, the set of completed tasks will be observed, and coded in the state $x$, as before. The recourse allocations are now composed of components $u_{ij2}(x)$ which include multiple resource types. Let $u_{i2}(x)$ denote the vector of allocations $(u_{i12}(x), \ldots, u_{iK2}(x))$. The probability that task $i$ is not completed either in stage 1 or in stage 2 is given by

$$\Pr\{x_i(2) = 1\} = \sum_{x \in \{0,1\}^N} P(x \mid u_1)\, \mathbf{1}[x_i = 1] \prod_{j=1}^{K}(1 - p_{ij2})^{u_{ij2}(x)} \qquad (36)$$

where

$$P(x \mid u_1) = \prod_{i=1}^{N}\left[\prod_{j=1}^{K}(1 - p_{ij1})^{u_{ij1}}\right]^{x_i}\left[1 - \prod_{j=1}^{K}(1 - p_{ij1})^{u_{ij1}}\right]^{1 - x_i}. \qquad (37)$$

We assume that there are resource use costs $c_j$ and resource availability $M_j$ for each resource type $j$. The optimal control problem is to minimize the objective:

$$E\left\{\sum_{i=1}^{N} V_i\, x_i(2) + \sum_{i=1}^{N}\sum_{j=1}^{K} c_j\left[u_{ij1} + u_{ij2}(x)\right]\right\} \qquad (38)$$

subject to the constraints

$$\sum_{i=1}^{N}\left[u_{ij1} + u_{ij2}(x)\right] \le M_j, \qquad j = 1, \ldots, K \qquad (39)$$

$$u_{ij1},\; u_{ij2}(x) \in \{0, 1, 2, \ldots\} \qquad (40)$$

for all $x \in \{0,1\}^N$.

An important difference in this problem is that the single stage problem is NP-hard, as shown by Lloyd and Witsenhausen in [12]. Hence, the complexity of the initial step in a stochastic dynamic programming algorithm grows exponentially with the number of tasks.

As in Section IV, we expand the admissible set to allow for randomized strategies, and we replace the sample path constraints in (39) by expected resource use constraints of the form

$$E\left\{\sum_{i=1}^{N}\left[u_{ij1} + u_{ij2}(x)\right]\right\} \le M_j, \qquad j = 1, \ldots, K. \qquad (41)$$

Thus, the new optimization problem is to minimize (38) subject to the constraints in (40), (41). As in Section IV, the new admissible set of strategies includes all feasible strategies for the original stochastic control problem, so the solution of the relaxed problem is a lower bound to the performance of the original problem. As before, we expand the set of admissible strategies to include mixed strategies of the form $(w, \{\gamma^k\})$, in order to permit full utilization of the available resources in (41). Let $(J_k, R_k)$ denote the expected performance and multidimensional resource utilization of pure strategy $\gamma^k$; mixed strategies allow us to achieve any performance and resource utilization $(J, R)$ in the convex hull of these expected performance-resource vectors. We extend the definition of local strategies to multiple resources as follows:

Definition 2: A local strategy consists of a pure strategy with the property that $u_{ij2}(x) = u_{ij2}(x_i)$ for each task $i$ and each resource type $j$.

The key result for the solution of the approximate stochastic control problem with multiple resource types is given below.

Theorem 2: Consider the optimization problem in (38), (40), (41). Given any pure strategy, there is a mixed strategy using only local strategies that achieves the same expected performance and the same expected resource use.


The proof of this result is included in the Appendix. The implications are that one can restrict the choice of controls to mixtures of local strategies. Let $k$ denote an index over all local strategies, and let $(J_k, R_k)$ denote the expected performance and resource utilization vector of local strategy $k$. The optimal mixed strategy is the solution of the linear program

$$\min_{w} \sum_{k} w_k\, J_k \qquad (42)$$

subject to the constraints

$$\sum_{k} w_k\, R_{kj} \le M_j, \quad j = 1, \ldots, K, \qquad \sum_k w_k = 1, \qquad w_k \ge 0. \qquad (43)$$

This linear program is over mixtures of all local strategies, which is a large number. The linear programming approach was used earlier in [26]–[28], and provides the basis for the results presented below.

The next result provides a better characterization of the optimal strategy.

Lemma 8: There is an optimal mixed strategy which is a mixture of at most $K + 1$ local strategies.

Proof: The linear program has $K$ inequality constraints plus the constraint that the sum of the mixture probabilities adds to one. Hence, it has a basic feasible solution with at most $K + 1$ nonzero elements $w_k$.

The main result of [26]–[28] is an efficient constraint generation [9] algorithm which solves the linear program in (42)–(43) while considering only mixtures of a small number of local strategies. We summarize their algorithm below.

The algorithm starts with an initial set of pure local strategies indexed by $k = 1, \ldots, n_0$. The first step in the algorithm is to solve the linear program in (42), (43) restricted to mixtures of the initial strategies. Since the admissible strategies are restricted, the solution provides an upper bound $\bar J$ to the optimal cost. Denote by $\lambda_j^*, j = 1, \ldots, K$ the optimal dual prices of the resource constraints in this solution. The constraint generation algorithm uses these optimal dual prices to define a relaxed primal problem which is decoupled for each task, stated as follows:

$$Z_i = \min_{u_{i1},\, u_{i2}}\left[G_i(u_{i1}, u_{i2}) + \sum_{j=1}^{K}(c_j + \lambda_j^*)\, r_{ij}(u_{i1}, u_{i2})\right] \qquad (44)$$

subject to the constraints

$$u_{ij1},\; u_{ij2} \in \{0, 1, 2, \ldots\}, \qquad j = 1, \ldots, K \qquad (45)$$

where $G_i$ and $r_{ij}$ are the multiple-resource analogs of the local performance and expected resource use in (26), (27), and where we have assumed without loss of generality that $u_{i2}(0) = 0$, so that no resources are assigned to tasks that are already completed. The above problems are decoupled minimization problems over tasks with a finite number of alternatives, and can be solved directly by enumeration. Denote by $Z_i$ the optimal value of the solution for task $i$. By combining the results over $i = 1, \ldots, N$, one obtains a pure local strategy associated with the dual prices $\lambda^*$ which provides a lower bound to the optimal cost:

$$\underline{J} = \sum_{i=1}^{N} Z_i - \sum_{j=1}^{K} \lambda_j^*\, M_j \le J_M.$$

The key result in the constraint generation algorithm is stated as follows:

Lemma 9: Consider the pure local strategy generated by the solution of (44)–(45). If $\underline{J} \ge \bar J$, the optimal solution over all mixtures of local strategies is a mixture of the local strategies indexed by $k = 1, \ldots, n_0$. Otherwise, the pure local strategy can be used as part of a mixed strategy which provides a cost lower than $\bar J$.

The proof of this result is given by Gilmore and Gomory [9]. It is based on the fact that solving the relaxed problem is equivalent to finding the strategy which has the greatest impact in reducing the cost of the current best solution. This leads to a dynamic column generation algorithm, as follows: if $\underline{J} < \bar J$, increase the number of local strategies by adding the new pure local strategy, and re-solve the primal problem in (42)–(43) to get dual variables, and then the relaxed problem in (44)–(45). At each iteration, we improve the achievable performance, until the lower bound and upper bound estimates are close enough. By the lemma above, the optimal solution will be obtained without enumerating all of the pure local strategies.

Unlike the results for the single resource case, there are no efficient algorithms for exact solution of the relaxed problems for each task, or for determining the optimal values of the dual variables. When there are few resource classes, the feasible pure strategies can be enumerated for each task in order to obtain the solution, and the complexity of a single iteration is proportional to the number of tasks $N$. While there is no simple bound on the number of iterations required, each iteration corresponds to a pivot operation in a simplex algorithm for linear programming. Empirically, we have found that the number of iterations is small for our sample problems.
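A sketch of this column generation loop using scipy.optimize.linprog for the restricted master LP (42)–(43), with the pricing step (44)–(45) done by enumeration per task. The data layout, enumeration cap, and tolerance are illustrative assumptions; the dual-price bookkeeping follows SciPy's HiGHS sign conventions.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def column_generation(values, q1, q2, costs, caps, u_max=3, tol=1e-8):
    """values[i]: task value; q1[i], q2[i]: length-K arrays of per-resource
    failure probabilities by type and stage; costs, caps: length-K resource
    costs and availabilities.  Columns are pure local strategies."""
    N, K = len(values), len(caps)
    grid = [np.array(u) for u in itertools.product(range(u_max + 1), repeat=K)]

    def column(strats):
        """Expected cost J and type-wise expected use R of a pure local
        strategy strats[i] = (u1_i, u2_i); stage-2 resources are spent
        only if the task survives stage 1."""
        J, R = 0.0, np.zeros(K)
        for i, (u1, u2) in enumerate(strats):
            s1 = np.prod(np.asarray(q1[i]) ** u1)
            use = u1 + s1 * u2
            J += values[i] * s1 * np.prod(np.asarray(q2[i]) ** u2) \
                 + np.dot(costs, use)
            R += use
        return J, R

    zero = np.zeros(K, dtype=int)
    cols = [column([(zero, zero)] * N)]          # initial "do nothing" column
    while True:
        Js = np.array([c[0] for c in cols])
        Rs = np.array([c[1] for c in cols]).T    # K x (number of columns)
        res = linprog(Js, A_ub=Rs, b_ub=caps,
                      A_eq=np.ones((1, len(cols))), b_eq=[1.0],
                      bounds=(0, None), method="highs")
        lam = -np.asarray(res.ineqlin.marginals)   # dual prices, lam >= 0
        nu = res.eqlin.marginals[0]                # dual of sum(w) = 1
        # Pricing (44)-(45): best local strategy per task at prices lam.
        price, strats, total = np.asarray(costs) + lam, [], 0.0
        for i in range(N):
            best = None
            for u1 in grid:
                s1 = np.prod(np.asarray(q1[i]) ** u1)
                for u2 in grid:
                    obj = values[i] * s1 * np.prod(np.asarray(q2[i]) ** u2) \
                          + np.dot(price, u1 + s1 * u2)
                    if best is None or obj < best[0]:
                        best = (obj, (u1, u2))
            total += best[0]
            strats.append(best[1])
        if total >= nu - tol:        # no column with negative reduced cost
            return res.fun, cols
        cols.append(column(strats))
```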

To implement a model predictive control algorithm, we obtain the solution of the approximate problem to determine a mixture of local strategies at the first period. We either sample these strategies according to the mixture probabilities, or instead select the allocation corresponding to the largest mixture probability.

VI. EXPERIMENTAL RESULTS

In order to evaluate the effectiveness of the proposed MPC approach, we conducted several experiments with identical resources comparing the following algorithms:

1) The exact SDP solution, obtained by enumerating the possible first stage allocations and finding a global minimum.
2) The incremental DP (IA) algorithm discussed at the end of Section III.
3) The MPC algorithm described in Section IV.
4) A heuristic algorithm (HA) that assigns part of the resources optimally in the first stage, then assigns the remaining resources optimally to the incomplete tasks in stage 2.


TABLE I
PERFORMANCE OF MPC, IA AND HA ALGORITHMS AS DECREASE IN VALUE COMPLETED BY SDP AS PERCENTAGE OF AVERAGE TASK VALUE

The first set of experiments consisted of random problems with 7 to 11 tasks, with task values selected randomly in the range of 1–10, and task success probabilities selected randomly in a fixed interval. The number of resources for each number of tasks varied from 7 to 11 resources. For each number of tasks, we generated 100 random problems, and obtained the solutions (in terms of expected value of tasks left unfinished) given by the SDP, IA, MPC and HA algorithms. The metric reported is the decrease in value of the completed task value when compared with the optimal SDP algorithm, expressed as a percent of the average task value, averaged over 100 problems. This allows for comparison of the different experiments using a common metric. We also computed the worst case percentage difference in performance between the MPC and HA algorithms and the SDP algorithm. The results are summarized in Table I.

The results in Table I indicate that the performance of the IA algorithm was optimal for all random problems generated. This leads to an unproven conjecture that the algorithm is optimal in general. The results also establish that the MPC algorithm yields near-optimal performance: the worst case performance across 900 problems tested was within 1/6 of an average task value when compared to the optimal SDP performance, and the average performance was within 1.25% of a task value. In contrast, the worst case performance of the HA algorithm was nearly 80% of an average task value higher than the optimal SDP performance.

The second set of experiments used problems with 16 and 20 tasks, and with a varying number of resources from 12 to 20. For these problems, computing the exact dynamic programming solution using enumerative techniques was prohibitively long. As a reference point, it required 3 days on a Linux Pentium 1.7 GHz workstation to solve 100 instances of the 11 task problem. Given the optimal performance achieved by the IA algorithm on previous problems, we compare results only for the IA, HA and MPC algorithms. The statistics reported are the increase in surviving task value over the value achieved by the IA algorithm, expressed as a percentage of the average task value in the problem instance. As before, we report both the average percentage across 100 problems and the worst-case percentage over the 100 problems for each experiment. The results are summarized in Table II.

TABLE II
PERFORMANCE OF MPC AND HA ALGORITHMS AS DECREASE IN VALUE COMPLETED BY IA AS PERCENTAGE OF AVERAGE TASK VALUE

The results in Table II confirm the near optimal behavior of the Model Predictive Control algorithm. The average performance difference between the computationally intensive IA algorithm and the MPC algorithm is approximately 2% of the value of an average task. In contrast, the average performance of the HA algorithm ranges between 5% and 20% of the value of an average task. Similar differences in performance are evident in the worst case performance. The experiments confirm that the MPC algorithm's bias to commit more resources in the first stage has a nearly negligible impact on overall task performance.

The IA algorithm required over 13 minutes to solve a single instance of a 20 task, 20 resource problem on a Pentium 1.4 GHz workstation running Linux. In contrast, the MPC algorithm solved 100 instances of 1000 task, 1000 resource problems in a total of 3.5 seconds. This suggests that the MPC algorithm is well suited to applications where information about available tasks and values becomes available in real time, and must be converted into resource allocation decisions quickly.

As a final set of experiments, we studied problems with two different resource types. As before, we generated random task values in the range of 1 to 10, and random values for the success probabilities $p_{ij1}$ in the range of 0.7 to 0.9. We assumed that $p_{ij2} = p_{ij1}$, making the probabilities of task completion equal across stages for the same resource assignment. We implemented the Model Predictive Control algorithm by computing the average of the mixture assignments for each task, and rounding the first stage assignments to the closest integer. We simulated the effects of these first stage assignments to generate sample sets of incomplete tasks for the second stage. In the second stage, we used a suboptimal greedy heuristic to assign the remaining resources [2], [8] to the incomplete tasks. We compared the resulting performance in terms of expected completed task value after averaging over 10,000 Monte Carlo trials to the optimistic bound obtained by the solution of the approximate model used by the MPC controller, for 100 random instances each of 10 task, 50 task and 100 task problems. The number of available resources in each problem was set so that the total number of resources equaled the number of tasks, and there were equal numbers of the two types of resources. The statistic computed was the percentage of the optimistic performance achieved by the MPC controller. As before, we report both the average and the worst-case performance over 100 sample problems.

TABLE III
PERFORMANCE OF MPC ALGORITHM AS PERCENTAGE OF TASK VALUE COMPLETED BY MIXED STRATEGY UPPER BOUND FOR TWO RESOURCE TYPES

The results of this experiment are summarized in Table III. The results show that, even when the MPC controller uses a rounding approximation to determine the first stage allocations and uses an approximate greedy algorithm to select the second stage allocations, the expected performance is over 95% of an upper bound to the optimal performance in every problem instance tested. On average, the expected performance is 98.5% or higher for the different problem sizes. Note that the average performance improves with increasing problem size, as statistical mixing across tasks increases the accuracy of the approximate optimization problem used by the MPC algorithm. The results show that the MPC algorithm achieves near-optimal performance in terms of expected task value completed.

VII. CONCLUSION

The problem of allocation of unreliable resources to tasks over multiple stages arises in many important applications. In this paper, we have developed a stochastic dynamic programming formulation for this problem, which captures the opportunity for observing task completion events and using recourse strategies. However, exact solution of this problem using Stochastic Dynamic Programming is computationally intensive because both the state space and the admissible action space grow exponentially with the number of tasks. As an efficient alternative, we developed a Model Predictive Control algorithm that is based on solving a relaxed stochastic dynamic programming problem. We established that the relaxed problem can be solved very fast, in time nearly linear in the number of tasks. Furthermore, the resulting algorithm exhibits near-optimal performance across a range of random test problems.

There are several important directions for extension of this work. The first of these is extension of the results to more than two stages. The main theorem in this paper, the representation of the optimal relaxed strategies in terms of local strategies, extends in a straightforward manner to multiple stages. Using this extension, it is also straightforward to extend the MPC algorithms in the case of identical resources to solve the problem efficiently for a fixed number of stages.

Another interesting extension is to consider tasks that require multiple assignment of simultaneous resources to complete. Extensions of our techniques to this problem, and problems where tasks have precedence constraints, are currently under investigation.

APPENDIX

Proof of Lemma 6:

Proof: Rewrite the minimization problem of (28) as

$$\min_{u_{i1} \ge 0}\; \min_{u_{i2} \ge 0}\left[G_i(u_{i1}, u_{i2}) + (c + \lambda)\, r_i(u_{i1}, u_{i2})\right].$$

The inner minimization establishes that $u_{i2}^*(u_{i1}, \lambda)$ is monotone nonincreasing in $\lambda$, because $G_i(u_{i1}, u_{i2})$ is integer convex and decreasing in $u_{i2}$. Define

$$H_i(u_{i1}, \lambda) = \min_{u_{i2} \ge 0}\left[G_i(u_{i1}, u_{i2}) + (c + \lambda)\, r_i(u_{i1}, u_{i2})\right].$$

Then, $H_i(u_{i1}, \lambda)$ is continuous and nondecreasing as $\lambda$ increases, since, if $\lambda' > \lambda$, then the optimal $u_{i2}^*(u_{i1}, \lambda')$ is feasible for the optimization with $\lambda$ and yields a cost no greater than the optimal cost $H_i(u_{i1}, \lambda')$.

Consider the outer optimization, rewritten as

$$u_{i1}^*(\lambda) \in \arg\min_{u_{i1} \ge 0} H_i(u_{i1}, \lambda).$$

Assume that $u_{i1}^*$ increases for some decreasing $\lambda$. That is, there is a value $\lambda_0$ so that $u_{i1}^*(\lambda_0 - \epsilon) > u_{i1}^*(\lambda_0)$ for any $\epsilon$ in an interval $(0, \bar\epsilon)$. Since $u_{i1}^*(\lambda_0)$ is optimal, then

$$H_i\left(u_{i1}^*(\lambda_0), \lambda_0\right) \le H_i\left(u_{i1}^*(\lambda_0 - \epsilon), \lambda_0\right)$$

which implies

$$\lim_{\epsilon \downarrow 0} H_i\left(u_{i1}^*(\lambda_0), \lambda_0 - \epsilon\right) \le \lim_{\epsilon \downarrow 0} H_i\left(u_{i1}^*(\lambda_0 - \epsilon), \lambda_0 - \epsilon\right).$$

Similarly, optimal $u_{i1}^*(\lambda_0 - \epsilon)$ implies

$$H_i\left(u_{i1}^*(\lambda_0 - \epsilon), \lambda_0 - \epsilon\right) \le H_i\left(u_{i1}^*(\lambda_0), \lambda_0 - \epsilon\right).$$

Thus, for small $\epsilon$, this implies

$$H_i\left(u_{i1}^*(\lambda_0 - \epsilon), \lambda_0 - \epsilon\right) = H_i\left(u_{i1}^*(\lambda_0), \lambda_0 - \epsilon\right).$$

Since $u_{i1}^*(\lambda_0 - \epsilon) \ne u_{i1}^*(\lambda_0)$ and $H_i$ is continuous in $\lambda$, the above equality cannot be satisfied for arbitrarily small $\epsilon$, contradicting the assumption that $u_{i1}^*$ increases as $\lambda$ decreases for some value of $\lambda$.

Proof of Theorem 2:

Proof: Given any pure strategy $\gamma = (u_1, u_2(\cdot))$, its performance is given by

$$J(\gamma) = \sum_{i=1}^{N} E\left\{V_i\, x_i(2) + \sum_{j=1}^{K} c_j\left[u_{ij1} + u_{ij2}(x)\right]\right\}. \qquad (46)$$

Define the notation $x = (x_i, x_{-i})$, where $x_{-i}$ denotes the completion state of all tasks other than task $i$. Let $u_{i1} = (u_{i11}, \ldots, u_{iK1})$ denote the vector of resource allocations to task $i$ at stage 1, and let $v$ denote a realized recourse allocation vector for task $i$. The contribution of task $i$ in (46) can be expressed as

$$J_i(\gamma) = \sum_{x_i \in \{0,1\}} \Pr\{x_i \mid u_{i1}\} \sum_{x_{-i}} \Pr\{x_{-i} \mid u_1\}\left[V_i\, x_i \prod_{j=1}^{K}(1 - p_{ij2})^{u_{ij2}(x_i, x_{-i})} + \sum_{j=1}^{K} c_j\left(u_{ij1} + u_{ij2}(x_i, x_{-i})\right)\right]$$

where the equality follows from the independence of the task completion events. Define now the probability distribution on the recourse actions for task $i$ as

$$\mu_i(v \mid x_i) = \sum_{x_{-i}} \Pr\{x_{-i} \mid u_1\}\, \mathbf{1}\left[u_{i2}(x_i, x_{-i}) = v\right].$$

Then, we can rewrite $J_i(\gamma)$ as

$$J_i(\gamma) = \sum_{x_i \in \{0,1\}} \Pr\{x_i \mid u_{i1}\} \sum_{v} \mu_i(v \mid x_i)\left[V_i\, x_i \prod_{j=1}^{K}(1 - p_{ij2})^{v_j} + \sum_{j=1}^{K} c_j\left(u_{ij1} + v_j\right)\right]$$

with corresponding expected resource use

$$R_{ij}(\gamma) = u_{ij1} + \sum_{x_i \in \{0,1\}} \Pr\{x_i \mid u_{i1}\} \sum_{v} \mu_i(v \mid x_i)\, v_j.$$

Consider the following mixture of local strategies for task $i$: assign $u_{i1}$ resources, then assign to task $i$ the amounts determined by a local strategy using recourse strategies

$$\tilde u_{i2}(x_i) = v \quad \text{with probability} \quad \mu_i(v \mid x_i)$$

where we use the variable $\tilde u$ to distinguish the local recourse strategies from the original recourse strategy $u_{i2}(x)$.

By using an independent mixture for each task $i$, we construct a product mixture of local strategies, each of which is selected with probability

$$w\left(\tilde u_{12}, \ldots, \tilde u_{N2}\right) = \prod_{i=1}^{N} \mu_i\left(\tilde u_{i2}(0) \mid 0\right)\, \mu_i\left(\tilde u_{i2}(1) \mid 1\right).$$

To simplify the resulting equations, define the notation $q_{i1} = \prod_j (1 - p_{ij1})^{u_{ij1}}$ and $q_{i2}(v) = \prod_j (1 - p_{ij2})^{v_j}$, and the complementary probabilities $\bar q_{i1} = 1 - q_{i1}$ and $\bar q_{i2}(v) = 1 - q_{i2}(v)$. The expected performance of this mixed strategy on task $i$ is given by

$$\tilde J_i = \sum_{x_i \in \{0,1\}} \Pr\{x_i \mid u_{i1}\} \sum_{v} \mu_i(v \mid x_i)\left[V_i\, x_i\, q_{i2}(v) + \sum_{j=1}^{K} c_j\left(u_{ij1} + v_j\right)\right]$$

where the equality follows from the definition of the mixture density and the fact that each term depends only on either $x_i$ or $x_{-i}$, and the functions $\mu_i(\cdot \mid x_i)$ are probability distributions which sum to 1. Using the definition of $\mu_i$, we obtain

$$\tilde J_i = J_i(\gamma)$$

which establishes that the performance achieved by the mixed local strategy on task $i$ is the same as the performance on task $i$ of the original pure strategy $\gamma = (u_1, u_2(\cdot))$. A similar argument establishes that the expected resource use by the mixed local strategies on task $i$ is also identical to that of the pure strategy, which completes the proof.

REFERENCES

[1] D. P. Bertsekas, Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific, 2000.

[2] D. A. Castañón, Advanced Weapon Target Assignment Algorithms, ALPHATECH Report TR 428, Burlington, MA, 1989.

[3] D. A. Castañón, "Optimal search strategies in dynamic hypothesis testing," IEEE Trans. Syst., Man, Cybern., vol. 25, no. 7, pp. 1130–1138, Jul. 1995.

[4] D. A. Castañón, "Approximate dynamic programming for sensor management," in Proc. 36th IEEE Conf. Decision Control, San Diego, CA, Dec. 1997, pp. 1202–1207.

[5] D. A. Castañón and J. Wohletz, "Model predictive control for dynamic unreliable resource allocation," in Proc. 41st IEEE Conf. Decision Control, Las Vegas, NV, Dec. 2002, pp. 3754–3759.

[6] R. Chen and G. L. Blankenship, "Dynamic programming equations for discounted constrained stochastic control," IEEE Trans. Automat. Control, vol. 49, no. 5, pp. 699–709, May 2004.

[7] G. G. den Broeder, R. E. Ellison, and L. Emerling, "On optimum target assignment," Oper. Res., vol. 7, pp. 322–326, 1959.

[8] A. R. Eckler and S. A. Burr, Mathematical Models of Target Coverage and Missile Allocation. Alexandria, VA: Military Operations Research Society, 1972.

[9] P. C. Gilmore and R. E. Gomory, "A linear programming approach to the cutting stock problem," Oper. Res., vol. 9, pp. 849–859, 1961.

[10] K. Glazebrook and A. Washburn, "Shoot-look-shoot: A review and extension," Oper. Res., vol. 52, no. 3, pp. 454–463, May–Jun. 2004.

[11] P. Hosein, "A Class of Nonlinear Resource Allocation Problems," Ph.D. dissertation, Electrical and Computer Engineering, MIT, Cambridge, MA, 1989.

[12] S. P. Lloyd and H. S. Witsenhausen, "Weapons allocation is NP-complete," in Proc. Summer Conf. Simul., Reno, NV, 1986, pp. 1054–1058.

[13] S. Matlin, "A review of the literature on the missile allocation problem," Oper. Res., vol. 18, no. 2, pp. 334–373, Mar./Apr. 1970.

[14] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert, "Constrained model predictive control: Stability and optimality," Automatica, vol. 36, pp. 789–814, 2000.

[15] R. A. Murphey, "An approximate algorithm for a weapon target assignment program," in Approximation and Complexity in Numerical Optimization: Continuous and Discrete Problems, P. Pardalos, Ed. Dordrecht, The Netherlands: Kluwer Academic, 1999.

[16] R. A. Murphey, "Target-based weapon target assignment problems," in Nonlinear Assignment Problems: Algorithms and Applications, P. M. Pardalos and L. S. Pitsoulis, Eds. Dordrecht, The Netherlands: Kluwer Academic, 2000.

[17] R. T. Rockafellar and R. J.-B. Wets, "Stochastic convex programming: Kuhn-Tucker conditions," J. Math. Econom., vol. 2, pp. 349–370, 1975.

[18] R. T. Rockafellar, Network Flows and Monotropic Programming. New York: Wiley, 1984.

[19] S. C. Chang, R. M. James, and J. J. Shaw, "Assignment algorithms for kinetic energy weapons in boost phase defense," in Proc. 26th IEEE Conf. Decision Control, Los Angeles, CA, 1987, pp. 1678–1683.

[20] F. A. Miercort and R. M. Soland, "Optimal allocation of missiles against area and point defenses," Oper. Res., vol. 19, pp. 605–617, 1971.

[21] R. M. Soland, "Optimal terminal defense tactics when several sequential engagements are possible," Oper. Res., pp. 537–542, 1988.

[22] N. T. O'Meara and R. M. Soland, "Optimal strategies for problems of simultaneous attack against an area defense with impact point prediction," Naval Res. Logistics, vol. 39, pp. 1–28, 1992.

[23] L. D. Stone, Theory of Optimal Search. New York: Academic Press, 1975.

[24] L. Stone and A. Washburn, Eds., "Special issue on search theory," Naval Res. Logistics, vol. 38, 1991.

[25] A. Washburn and L. Thomas, "Dynamic search games," Oper. Res., vol. 39, pp. 422–425, 1991.

[26] K. A. Yost, "Solution of Large-Scale Allocation Problems with Partially Observed Outcomes," Ph.D. dissertation, Naval Postgraduate School, Monterey, CA, 1998.

[27] K. A. Yost and A. R. Washburn, "The LP/POMDP marriage: Optimization with imperfect information," Naval Res. Logistics, vol. 47, no. 8, pp. 607–619, 2000.

[28] K. A. Yost and A. R. Washburn, "Optimizing assignments of air-to-ground assets and BDA sensors," Military Oper. Res., vol. 5, no. 2, pp. 77–91, 2000.

David A. Castañón (S'68–M'79–SM'98) received the Ph.D. degree in applied mathematics from the Massachusetts Institute of Technology (MIT), Cambridge, in 1976.

From 1976 to 1981, he was a Research Scientist with the Laboratory for Information and Decision Systems, MIT. From 1982 to 1990, he was Senior Scientist and Chief Scientist at Alphatech, Inc., Burlington, MA. He joined Boston University, Boston, MA, in 1990, where he is currently Professor of Electrical and Computer Engineering. He is co-Director of Boston University's Center for Information and Systems Engineering, and Associate Director of the National Science Foundation Engineering Research Center on Subsurface Sensing and Imaging Systems. His research interests include stochastic control, estimation, optimization, game theory, image formation and understanding.

Dr. Castañón received the Control Systems Society Distinguished Member Award. He currently serves on the U.S. Air Force Scientific Advisory Board. He has served as Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, and has held numerous positions in the IEEE Control Systems Society, including general chair of the 2007 IEEE Conference on Decision and Control and President of the Society.

Jerry M. Wohletz (S'93–M'00) received the B.S. degree in aerospace engineering (with highest distinction) from the University of Kansas, Lawrence, in 1994, and the M.Eng. and Ph.D. degrees in aeronautics and astronautics from the Massachusetts Institute of Technology, Cambridge, in 1997 and 2000, respectively.

He has been with BAE Systems, Burlington, MA, since 2000, and is currently the Director of Strategy for Electronic Combat Solutions Advanced Programs. His research interests include adaptive control systems for large-scale uncertain systems, optimization and scheduling with applications to control of teams of unmanned air vehicles.