Numerical Solution of Orbital Combat Games Involving Missiles and Spacecraft


Dyn Games Appl (2011) 1:534–557

DOI 10.1007/s13235-011-0024-5


Mauro Pontani

Published online: 14 July 2011

© Springer Science+Business Media, LLC 2011

Abstract This research addresses the problem of the optimal interception of an optimally evasive orbital target by a pursuing spacecraft or missile. The time for interception is to be minimized by the pursuing space vehicle and maximized by the evading target. This problem is modeled as a two-sided optimization problem, i.e. as a two-player zero-sum differential game. The work incorporates a recently developed method, termed semi-direct collocation with nonlinear programming, for the numerical solution of dynamic games. The method is based on the formal conversion of the two-sided optimization problem into a single-objective one, by employing the analytical necessary conditions for optimality related to one of the two players. An approximate, first-attempt solution for the method is provided through the use of a genetic algorithm in a preprocessing phase. Three qualitatively different cases are considered. In the first example the pursuer and the evader are represented by two spacecraft orbiting the Earth in two distinct orbits. The second and the third case involve two missiles, and a missile that pursues an orbiting spacecraft, respectively. The numerical results achieved in this work testify to the robustness and effectiveness of the method also in solving large, complex, three-dimensional problems.

Keywords Orbital dynamic games · Pursuit-evasion games · Two-sided optimization

1 Introduction

The problem of the three-dimensional optimal interception of an optimally evasive orbital target by a pursuing spacecraft (or missile) involves two competing players with contrasting objectives. The pursuing space vehicle tries to reach the evading target as quickly as possible, whereas the latter tries to delay capture indefinitely. Since the time for interception is to be minimized by the pursuing spacecraft and maximized by the evading spacecraft, this problem is best modeled as a two-sided optimization problem, i.e. it becomes a two-player zero-sum differential game.

M. Pontani (✉)

Scuola di Ingegneria Aerospaziale, University of Rome "La Sapienza", 00138 Rome, Italy

e-mail: [email protected]

Zero-sum games were first introduced by Isaacs [1] and are also referred to as pursuit-evasion games. In the context of zero-sum games the optimal trajectories of the two spacecraft correspond to a so-called saddle-point equilibrium solution of the game. The necessary conditions for an open-loop saddle-point equilibrium solution are relatively straightforward to derive and can be viewed as an extension of the necessary conditions for optimality that hold in optimal control theory [2, 3]. Their meaningfulness in relation to closed-loop saddle-point equilibrium solutions is closely related to the intrinsic characteristics of the game of interest. Only a few problems with simplified dynamics are amenable to an analytical solution [4, 5]. For problems with realistic dynamics the only choice is numerical solution. Hillberg and Järmark [6] solved an air combat maneuvering problem in the horizontal plane with steady turn and realistic drag and thrust data. Järmark, Merz, and Breakwell [7] solved a qualitatively similar air combat problem employing differential dynamic programming, and considered only coplanar situations. A pursuit-evasion problem between missile and aircraft has been solved using an indirect, multiple shooting method by Breitner, Grimm and Pesch [8, 9]. Raivio and Ehtamo [10] solved a pursuit-evasion problem for a visual identification of the target by iterating a direct method. With regard to orbital pursuit-evasion games, past studies are often based on simplified dynamical models. Anderson and Grazier [11] described the construction of a closed-form solution for the barrier in a planar pursuit-evasion game between two spacecraft, by linearizing the problem about a reference circular orbit. Kelley et al. [12] derived the impulsive maneuvers for two spacecraft involved in an orbital combat. They argued that optimal evasion only consists of in-plane maneuvers.

The work that follows presents a recently developed method [13–16], termed semi-direct collocation with nonlinear programming (semi-DCNLP), devoted to the numerical solution of zero-sum dynamic games with separable dynamics of the two players. This method is based on the formal conversion of the two-sided optimization problem into a single-objective one, by employing the analytical necessary conditions for optimality related to one of the two players. This fact implies that the adjoint variables of one of the two spacecraft are directly involved in the optimization process, which needs a reasonable guess to yield an accurate saddle-point equilibrium solution. The trial-and-error selection of first-attempt values for the (non-intuitive) adjoint variables is very challenging for the problem at hand. In this work an approximate, first-attempt solution is provided through the use of a genetic algorithm in a preprocessing phase. Three qualitatively different cases are considered. In the first example the pursuer and the evader are represented by two spacecraft orbiting Earth in two distinct orbits. The second and the third case involve two missiles, and a missile that pursues an orbiting spacecraft, respectively.

The objective of this work is to: (i) formulate the three-dimensional orbital combat as a dynamic game, (ii) describe and derive the analytical necessary conditions that must be satisfied by an open-loop equilibrium solution (while discussing their validity in relation to closed-loop equilibrium solutions), and (iii) obtain the saddle-point equilibrium trajectories through the joint use of a genetic algorithm and of the semi-DCNLP.

2 Problem Definition

The problem of the optimal interception of an optimally evasive orbital target consists in the determination of the saddle-point equilibrium trajectories of the two space vehicles involved in the combat scenario. Termination of the game occurs when the pursuing vehicle (henceforth denoted with P) reaches the instantaneous position of the evading target (denoted with E henceforward). A plausible sufficient condition ensuring that capture ends the game is presented in the next subsection. The objective function, to be minimized by P and maximized by E, is represented by the time for interception. In this work each player is assumed to possess complete and instantaneous information on the state of the opponent player.

Fig. 1 Local horizontal plane and related angles (a); instantaneous plane of motion and related thrust angles (b)

2.1 Spacecraft Dynamics

This study employs a point-mass model to describe the three-dimensional motion of the two

space vehicles involved in the orbital game. The problem is investigated under the following

assumptions:

(a) aerodynamic forces are neglected, due to the altitudes involved in the cases that are

being considered;

(b) both spacecraft employ their maximum thrust for the entire time of flight;

(c) the two space vehicles are given modest propulsive capabilities;

(d) at the initial time t0, which is set to 0, the dynamical state of the two spacecraft is

specified.

Hypotheses (b) and (c) allow assuming constant thrust-to-mass ratios for both spacecraft,

denoted with (TP/mP) and (TE /mE ) for P and E, respectively. This circumstance implies

also that the control is performed through the thrust direction only.

Six scalar variables describe the dynamical state of each spacecraft in an inertial Earth-centered reference frame: radius r_i (i = P or E), absolute longitude ξ_i (measured from the vernal axis, the axis joining the equinoxes in the ecliptic plane), latitude φ_i, flight path angle γ_i, velocity v_i, and heading (or coazimuth) angle ζ_i (defined in Fig. 1(a)). The control is performed with the thrust direction, identified through the two angles α_i and β_i illustrated in Fig. 1(b); by definition, −π/2 ≤ β_i ≤ π/2. If μ_E denotes the Earth gravitational parameter, the equations of motion are

\[ \dot{r}_i = v_i \sin\gamma_i, \tag{1} \]

\[ \dot{\xi}_i = \frac{v_i \cos\gamma_i \cos\zeta_i}{r_i \cos\phi_i}, \tag{2} \]



A condition that ensures (at least for the cases considered in this paper) that interception concludes the game is

\[ \frac{T_P}{m_P} > \frac{T_E}{m_E}, \tag{17} \]

i.e., P has superior propulsive capabilities with respect to E. As unbounded controls are assumed for both spacecraft, it is conjectured that in general this condition implies that interception can occur in a finite time, regardless of the initial conditions of the two players. This circumstance implies also that no barrier that emanates from the target set can exist for the game at hand.

In general, for zero-sum games two feedback (or, equivalently, closed-loop) strategies, σ_P and σ_E, can be introduced for the two players. If a closed-loop saddle-point equilibrium exists, the strategies σ*_P and σ*_E are in saddle-point equilibrium when

\[ J(\sigma_P^*, \sigma_E) \le J(\sigma_P^*, \sigma_E^*) \le J(\sigma_P, \sigma_E^*), \qquad \forall \sigma_P \in \Sigma_P,\ \forall \sigma_E \in \Sigma_E, \tag{18} \]

where Σ_P and Σ_E are the sets of the admissible strategies (in the neighborhoods of σ*_P and σ*_E). At a given time t, the value V of the game is defined as the outcome of the objective function when both players employ their optimal strategies along the optimal path in the time interval [t, t_f]:

\[ V = \min_{\sigma_P} \max_{\sigma_E} J = \max_{\sigma_E} \min_{\sigma_P} J \tag{19} \]

provided that the operators max and min commute.
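As an illustration of the commutation requirement in (19), the following minimal sketch (not part of the original method) evaluates min-max and max-min for a finite payoff table, a discrete stand-in for J in which each row is a choice of the minimizing player P and each column a choice of the maximizing player E. The function name and the two payoff tables are hypothetical. When the two values coincide a saddle point exists; otherwise only the inequality max-min ≤ min-max holds.

```python
def upper_and_lower_values(payoff):
    """Return (min over rows of row maxima, max over columns of column minima)."""
    # P picks a row to minimize the worst case produced by E's column choice.
    upper = min(max(row) for row in payoff)
    # E picks a column to maximize the worst case produced by P's row choice.
    lower = max(min(column) for column in zip(*payoff))
    return upper, lower


# Hypothetical payoff tables (values of J for each pair of choices).
WITH_SADDLE = [[4.0, 5.0],
               [6.0, 7.0]]
WITHOUT_SADDLE = [[1.0, 3.0],
                  [4.0, 2.0]]

if __name__ == "__main__":
    for name, table in (("with saddle", WITH_SADDLE),
                        ("without saddle", WITHOUT_SADDLE)):
        upper, lower = upper_and_lower_values(table)
        print(name, "min-max =", upper, "max-min =", lower,
              "commute:", upper == lower)
```

In the first table both operators return the same number, which plays the role of the value V; in the second table they differ, so no pure saddle point exists.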

A common assumption [1, 3] is that the state space can be divided into a number of mutually disjoint regions, separated by singular surfaces. These surfaces, according to the definition given by Basar [3], are the loci where (i) the equilibrium strategies are not uniquely determined by the necessary conditions, or (ii) the value function is not continuously differentiable, or (iii) the value function is discontinuous. In the scientific literature, some special, structural characteristics of zero-sum games are responsible for a number of singular surfaces. For instance, state constraints can yield afferent and universal surfaces [8]. Non-smooth data (e.g., a discontinuous thrust) can be responsible for discontinuities in the right-hand side of the state equations and transition surfaces can arise [8]. Furthermore, control variables that appear linearly in the dynamics equations usually yield singular surfaces of several kinds [1]. Other more complex analytical conditions [17, 18] can generate singular surfaces. For the problem at hand none of the previously mentioned circumstances is encountered, and the non-existence of singular surfaces is conjectured. Thus, the value V is plausibly assumed to be continuously differentiable over the entire state space (i.e. V is assumed to be of class C^1 over the entire state space). With this assumption (which still eludes any rigorous mathematical proof), the optimal open-loop representations (u*_P, u*_E) of the closed-loop strategies σ*_P and σ*_E are introduced as

\[ u_P^*(t) = \sigma_P^*(x_P, x_E, t) \quad \text{and} \quad u_E^*(t) = \sigma_E^*(x_P, x_E, t). \tag{20} \]

For each player an open-loop representation of an optimal feedback strategy is the strategy along the saddle-point equilibrium trajectory as a function of t and of the initial state only, under the assumption that V is of class C^1 in the region of the state space under consideration. In other words, if the state is contained in a region where the value function is of class C^1, then the open-loop strategies become open-loop representations of feedback strategies. These representations are relevant because two properties relate them to the feedback strategies:



(a) if one of the two players deviates from his optimal open-loop strategy, his outcome

worsens,

(b) if both players employ their own optimal open-loop strategies, then the time histories of

the optimal open-loop and of the optimal feedback strategies are identical.

It is worth remarking that the determination of open-loop representations is relevant for zero-sum games, because it represents an essential premise for the subsequent synthesis of feedback strategies. Pesch et al. [19], Breitner and Pesch [20], and Lachner et al. [21] computed a relevant number of open-loop solutions and employed them for the synthesis of feedback strategies by means of special techniques (e.g., with the use of neural networks).

In the regions where the value function exists and is continuously differentiable in t and x, V satisfies the following partial differential equation, referred to as Isaacs equation:

\[ \frac{\partial V}{\partial t} + \max_{\sigma_E} \min_{\sigma_P} \left[ \frac{\partial V}{\partial x_P} f_P + \frac{\partial V}{\partial x_E} f_E \right] = 0. \tag{21} \]

Isaacs equation is written with reference to the special (separable) form (11) of the state equations.

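With regard to the dynamic game at hand, in this context the variables (α_P, β_P) and (α_E, β_E) represent feedback strategies for P and E, respectively (σ_P = [α_P β_P]^T and σ_E = [α_E β_E]^T). Isaacs equation becomes

\[
\begin{aligned}
\frac{\partial V}{\partial t}
&+ \max_{\sigma_E} \min_{\sigma_P} \Bigg\{ V_1 v_P s_{\gamma_P} + V_2 \frac{v_P c_{\gamma_P} c_{\zeta_P}}{r_P c_{\phi_P}} + V_3 \frac{v_P c_{\gamma_P} s_{\zeta_P}}{r_P} \\
&\qquad + V_4 \left[ \frac{v_P c_{\gamma_P}}{r_P} + \frac{T_P}{m_P} \frac{s_{\alpha_P} c_{\beta_P}}{v_P} - \frac{\mu_E c_{\gamma_P}}{r_P^2 v_P} \right]
 + V_5 \left[ \frac{T_P}{m_P} c_{\alpha_P} c_{\beta_P} - \frac{\mu_E s_{\gamma_P}}{r_P^2} \right] \\
&\qquad + V_6 \left[ \frac{T_P}{m_P} \frac{s_{\beta_P}}{v_P c_{\gamma_P}} - \frac{v_P c_{\gamma_P} s_{\phi_P} c_{\zeta_P}}{r_P c_{\phi_P}} \right] \Bigg\} \\
&+ \max_{\sigma_E} \min_{\sigma_P} \Bigg\{ V_7 v_E s_{\gamma_E} + V_8 \frac{v_E c_{\gamma_E} c_{\zeta_E}}{r_E c_{\phi_E}} + V_9 \frac{v_E c_{\gamma_E} s_{\zeta_E}}{r_E} \\
&\qquad + V_{10} \left[ \frac{v_E c_{\gamma_E}}{r_E} + \frac{T_E}{m_E} \frac{s_{\alpha_E} c_{\beta_E}}{v_E} - \frac{\mu_E c_{\gamma_E}}{r_E^2 v_E} \right]
 + V_{11} \left[ \frac{T_E}{m_E} c_{\alpha_E} c_{\beta_E} - \frac{\mu_E s_{\gamma_E}}{r_E^2} \right] \\
&\qquad + V_{12} \left[ \frac{T_E}{m_E} \frac{s_{\beta_E}}{v_E c_{\gamma_E}} - \frac{v_E c_{\gamma_E} s_{\phi_E} c_{\zeta_E}}{r_E c_{\phi_E}} \right] \Bigg\} = 0
\end{aligned}
\tag{22}
\]

and holds ∀(x, t), since V is assumed to be of class C^1 over the entire state space. The symbol V_j denotes the derivative of the value function V with respect to the state component x_j (j = 1, ..., 12), whereas s_[·] = sin[·] and c_[·] = cos[·]. It is worth noticing that for the problem of interest the two operators max and min are interchangeable due to separability of the dynamical system. For the same reason (22) reduces to: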

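\[
\begin{aligned}
\frac{\partial V}{\partial t}
&+ \frac{T_P}{m_P} \min_{\sigma_P} \left[ V_4 \frac{\sin\alpha_P \cos\beta_P}{v_P} + V_5 \cos\alpha_P \cos\beta_P + V_6 \frac{\sin\beta_P}{v_P \cos\gamma_P} \right] \\
&+ \frac{T_E}{m_E} \max_{\sigma_E} \left[ V_{10} \frac{\sin\alpha_E \cos\beta_E}{v_E} + V_{11} \cos\alpha_E \cos\beta_E + V_{12} \frac{\sin\beta_E}{v_E \cos\gamma_E} \right] + \text{r.t.} = 0,
\end{aligned}
\tag{23}
\]

where r.t. represents the remaining terms, all of which are independent of the (feedback) control variables σ_P and σ_E. Introducing the unit vector η̂_P by

\[
\hat{\eta}_P = \left[ V_5^2 + \left( \frac{V_4}{v_P} \right)^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-\frac{1}{2}}
\begin{bmatrix} V_5 & \dfrac{V_4}{v_P} & \dfrac{V_6}{v_P \cos\gamma_P} \end{bmatrix}^T
\tag{24}
\]

it is then relatively straightforward to find the control σ*_P that minimizes the second term of (23). In fact, if the thrust direction of P is denoted with T̂_P, then T̂_P = [cos α_P cos β_P   sin α_P cos β_P   sin β_P]^T and the second term in (23) can be rewritten as

\[
\frac{T_P}{m_P} \sqrt{ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 } \; \min_{u_P} \left( \hat{T}_P^T \hat{\eta}_P \right).
\tag{25}
\]

The dot product T̂_P^T η̂_P is minimized to −1 if

\[
\cos\alpha_P^* \cos\beta_P^* = -V_5 \left[ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-\frac{1}{2}},
\tag{26}
\]

\[
\sin\alpha_P^* \cos\beta_P^* = -\frac{V_4}{v_P} \left[ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-\frac{1}{2}},
\tag{27}
\]

\[
\sin\beta_P^* = -\frac{V_6}{v_P \cos\gamma_P} \left[ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-\frac{1}{2}}.
\tag{28}
\]

The three relations (26)–(28) lead to deriving β*_P (which is constrained to [−π/2, π/2]) and α*_P as functions of the state variable x_P and {V_j}_{j=4,5,6}. The same steps can be repeated for σ_E (taking into account that max replaces min in (23)) and lead to the following relationships: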

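\[
\cos\alpha_E^* \cos\beta_E^* = V_{11} \left[ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-\frac{1}{2}},
\tag{29}
\]

\[
\sin\alpha_E^* \cos\beta_E^* = \frac{V_{10}}{v_E} \left[ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-\frac{1}{2}},
\tag{30}
\]

\[
\sin\beta_E^* = \frac{V_{12}}{v_E \cos\gamma_E} \left[ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-\frac{1}{2}}
\tag{31}
\]

which allow obtaining β*_E (constrained to [−π/2, π/2]) and α*_E as functions of the state variable x_E and {V_j}_{j=10,11,12}. Due to (26)–(31), Isaacs equation becomes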

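\[
\begin{aligned}
\frac{\partial V}{\partial t}
&+ V_1 v_P s_{\gamma_P} + V_2 \frac{v_P c_{\gamma_P} c_{\zeta_P}}{r_P c_{\phi_P}} + V_3 \frac{v_P c_{\gamma_P} s_{\zeta_P}}{r_P}
 + V_4 \left[ \frac{v_P c_{\gamma_P}}{r_P} - \frac{\mu_E c_{\gamma_P}}{r_P^2 v_P} \right]
 - V_5 \frac{\mu_E s_{\gamma_P}}{r_P^2} \\
&- V_6 \frac{v_P c_{\gamma_P} s_{\phi_P} c_{\zeta_P}}{r_P c_{\phi_P}}
 - \frac{T_P}{m_P} \sqrt{ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P c_{\gamma_P}} \right)^2 } \\
&+ V_7 v_E s_{\gamma_E} + V_8 \frac{v_E c_{\gamma_E} c_{\zeta_E}}{r_E c_{\phi_E}} + V_9 \frac{v_E c_{\gamma_E} s_{\zeta_E}}{r_E}
 + V_{10} \left[ \frac{v_E c_{\gamma_E}}{r_E} - \frac{\mu_E c_{\gamma_E}}{r_E^2 v_E} \right]
 - V_{11} \frac{\mu_E s_{\gamma_E}}{r_E^2} \\
&- V_{12} \frac{v_E c_{\gamma_E} s_{\phi_E} c_{\zeta_E}}{r_E c_{\phi_E}}
 + \frac{T_E}{m_E} \sqrt{ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E c_{\gamma_E}} \right)^2 } = 0.
\end{aligned}
\tag{32}
\]

As r_i, v_i, cos γ_i, and cos φ_i (i = P or E) never vanish and due to continuity of the partial derivatives {V_j}_{j=1,...,12}, for the game at hand Isaacs equation holds in the entire state space.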
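The coefficients that multiply V_1, ..., V_6 in (22) are the right-hand sides of the six state equations of one vehicle. The following minimal sketch collects them in a single function, which may help when checking the terms of (22) and (32) numerically. It is an illustration under stated assumptions, not the author's code: the function name, the argument order, and the Greek-letter naming of the state components (longitude xi, latitude phi, flight path angle gamma, heading zeta, thrust angles alpha and beta) are choices made here, and canonical units with gravitational parameter equal to 1 are assumed by default.

```python
import math

MU_E = 1.0  # Earth gravitational parameter in canonical units (assumed default)


def state_derivatives(state, alpha, beta, thrust_to_mass, mu=MU_E):
    """Time derivatives of (r, xi, phi, gamma, v, zeta) for one vehicle.

    The six expressions are the coefficients of V1..V6 in the Isaacs
    equation (22); alpha and beta are the two thrust angles.
    """
    r, xi, phi, gamma, v, zeta = state
    sg, cg = math.sin(gamma), math.cos(gamma)
    sp, cp = math.sin(phi), math.cos(phi)
    sz, cz = math.sin(zeta), math.cos(zeta)
    sa, ca = math.sin(alpha), math.cos(alpha)
    sb, cb = math.sin(beta), math.cos(beta)

    r_dot = v * sg
    xi_dot = v * cg * cz / (r * cp)
    phi_dot = v * cg * sz / r
    gamma_dot = thrust_to_mass * sa * cb / v + (v / r - mu / (r * r * v)) * cg
    v_dot = thrust_to_mass * ca * cb - mu * sg / (r * r)
    zeta_dot = thrust_to_mass * sb / (v * cg) - v * cg * sp * cz / (r * cp)
    return (r_dot, xi_dot, phi_dot, gamma_dot, v_dot, zeta_dot)


if __name__ == "__main__":
    # Unpowered circular equatorial orbit of unit radius: only the longitude varies.
    print(state_derivatives((1.0, 0.0, 0.0, 0.0, 1.0, 0.0), 0.0, 0.0, 0.0))
```

For the unpowered circular orbit used in the usage line, every derivative vanishes except the longitude rate, which equals one radian per time unit.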



The partial differential equation (32) cannot be directly solved in closed form and this circumstance prevents directly deriving the feedback control laws in the form (20). In differential game contexts, it is a common practice [3] to employ the necessary conditions for open-loop saddle-point strategies. If the value function is of class C^1 over the region of the state space under consideration, then the open-loop strategies become open-loop representations of feedback strategies. This research is aimed at determining open-loop strategies, which are conjecturally considered open-loop representations of feedback strategies, under the reasonable assumption that for the game of interest the value function is of class C^1 over the entire state space.

The necessary conditions for open-loop saddle-point solutions involve ordinary differential equations, and can be regarded as extensions of the necessary conditions for a local minimum that hold in optimal control theory. First, a Hamiltonian H and a function of terminal conditions Φ are introduced as

\[ H = \lambda_P^T f_P + \lambda_E^T f_E, \qquad \Phi = t_f + \nu^T \psi, \tag{33} \]

where λ_P, λ_E, and ν are the adjoint variables conjugate to the state equations (11), and to the boundary conditions (15), respectively. For the Lagrange multipliers λ_P and λ_E the following adjoint equations hold [3]:

\[ \dot{\lambda}_P = -\left[ \frac{\partial H}{\partial x_P} \right]^T = -\left[ \frac{\partial f_P}{\partial x_P} \right]^T \lambda_P, \tag{34} \]

\[ \dot{\lambda}_E = -\left[ \frac{\partial H}{\partial x_E} \right]^T = -\left[ \frac{\partial f_E}{\partial x_E} \right]^T \lambda_E \tag{35} \]

with the respective boundary conditions:

\[ \lambda_P(t_f) = \left[ \frac{\partial \Phi}{\partial x_{Pf}} \right]^T, \tag{36} \]

\[ \lambda_E(t_f) = \left[ \frac{\partial \Phi}{\partial x_{Ef}} \right]^T. \tag{37} \]

Open-loop control variables can be determined through the following pair of relations,

\[ u_P^* = \arg\min_{u_P} H, \tag{38} \]

\[ u_E^* = \arg\max_{u_E} H \tag{39} \]

that can be regarded as the extension of the Pontryagin minimum principle to dynamic games. As the terminal time t_f is unspecified, the following transversality condition holds:

\[ H(t_f) + \frac{\partial \Phi}{\partial t_f} = 0. \tag{40} \]

Equations (11), (15), and (34)–(40) define the two-point boundary value problem associated with the zero-sum dynamic game. The unknowns are the state vectors x_P(t) and x_E(t), the control vectors u_P(t) and u_E(t), the Lagrange multipliers λ_P(t), λ_E(t), and ν, and the terminal time t_f.



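With regard to the orbital game at hand, (34)–(35) yield 12 scalar adjoint equations, which are not reported for the sake of brevity (cf. [16]). If the subscript f denotes the value of the corresponding variable at t_f, and using the terminal constraints (12)–(14), the boundary conditions for the adjoint variables λ_P(t) and λ_E(t) are

\[
\begin{aligned}
&\lambda_{1f} = \nu_1, \quad \lambda_{2f} = \nu_2, \quad \lambda_{3f} = \nu_3, \quad \lambda_{4f} = \lambda_{5f} = \lambda_{6f} = 0, \\
&\lambda_{7f} = -\nu_1, \quad \lambda_{8f} = -\nu_2, \quad \lambda_{9f} = -\nu_3, \quad \lambda_{10f} = \lambda_{11f} = \lambda_{12f} = 0
\end{aligned}
\tag{41}
\]

or equivalently

\[ \lambda_{4f} = \lambda_{5f} = \lambda_{6f} = 0, \tag{42} \]

\[ \lambda_{10f} = \lambda_{11f} = \lambda_{12f} = 0, \tag{43} \]

\[ \lambda_{1f} + \lambda_{7f} = 0, \tag{44} \]

\[ \lambda_{2f} + \lambda_{8f} = 0, \tag{45} \]

\[ \lambda_{3f} + \lambda_{9f} = 0. \tag{46} \]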

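Then, for the control variables the necessary conditions (38)–(39) yield

\[
\begin{bmatrix} \alpha_P^* & \beta_P^* \end{bmatrix}^T = \arg\min_{u_P} \left[ \lambda_4 \frac{\sin\alpha_P \cos\beta_P}{v_P} + \lambda_5 \cos\alpha_P \cos\beta_P + \lambda_6 \frac{\sin\beta_P}{v_P \cos\gamma_P} \right],
\tag{47}
\]

\[
\begin{bmatrix} \alpha_E^* & \beta_E^* \end{bmatrix}^T = \arg\max_{u_E} \left[ \lambda_{10} \frac{\sin\alpha_E \cos\beta_E}{v_E} + \lambda_{11} \cos\alpha_E \cos\beta_E + \lambda_{12} \frac{\sin\beta_E}{v_E \cos\gamma_E} \right].
\tag{48}
\]

These relations are formally identical to those used to determine the feedback strategies as functions of the partial derivatives of V, with the only difference that the adjoint variables {λ_j}_{j=4,5,6,10,11,12} replace {V_j}_{j=4,5,6,10,11,12}. Therefore, the optimal open-loop control laws are given by

\[
\beta_P^* = \arcsin\left\{ -\frac{\lambda_6}{v_P \cos\gamma_P} \left[ \left( \frac{\lambda_4}{v_P} \right)^2 + \lambda_5^2 + \left( \frac{\lambda_6}{v_P \cos\gamma_P} \right)^2 \right]^{-\frac{1}{2}} \right\},
\tag{49}
\]

\[
\sin\alpha_P^* = -\frac{\lambda_4}{v_P \cos\beta_P^*} \left[ \left( \frac{\lambda_4}{v_P} \right)^2 + \lambda_5^2 + \left( \frac{\lambda_6}{v_P \cos\gamma_P} \right)^2 \right]^{-\frac{1}{2}},
\tag{50}
\]

\[
\cos\alpha_P^* = -\frac{\lambda_5}{\cos\beta_P^*} \left[ \left( \frac{\lambda_4}{v_P} \right)^2 + \lambda_5^2 + \left( \frac{\lambda_6}{v_P \cos\gamma_P} \right)^2 \right]^{-\frac{1}{2}},
\tag{51}
\]

\[
\beta_E^* = \arcsin\left\{ \frac{\lambda_{12}}{v_E \cos\gamma_E} \left[ \left( \frac{\lambda_{10}}{v_E} \right)^2 + \lambda_{11}^2 + \left( \frac{\lambda_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-\frac{1}{2}} \right\},
\tag{52}
\]

\[
\sin\alpha_E^* = \frac{\lambda_{10}}{v_E \cos\beta_E^*} \left[ \left( \frac{\lambda_{10}}{v_E} \right)^2 + \lambda_{11}^2 + \left( \frac{\lambda_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-\frac{1}{2}},
\tag{53}
\]

\[
\cos\alpha_E^* = \frac{\lambda_{11}}{\cos\beta_E^*} \left[ \left( \frac{\lambda_{10}}{v_E} \right)^2 + \lambda_{11}^2 + \left( \frac{\lambda_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-\frac{1}{2}}.
\tag{54}
\]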


Lastly, the transversality condition (not reported for the sake of brevity) holds, because the

terminal time tf is unspecified.

3 Method of Solution

The semi-direct collocation with nonlinear programming (semi-DCNLP) algorithm converts the two-sided optimization problem into a single-objective one, by employing the analytical necessary conditions for optimality related to one of the two players [13]. Then the semi-DCNLP algorithm transforms the continuous optimization problem into a discrete problem, in which the system governing equations are translated into nonlinear algebraic (constraint) equations involving the discrete parameters. The problem thus becomes a nonlinear programming (NLP) problem. The numerical NLP solver must be initialized with a guess or approximate solution (of reasonably good quality) if it is to converge to an accurate open-loop saddle-point equilibrium solution. The guess solution affects the semi-DCNLP convergence. As the costate variables usually have a non-intuitive meaning, the selection of first-attempt values for them is very challenging, especially for large problems. In this research, as well as in other papers published in the literature (cf. [13–16]), a genetic (or evolutionary) algorithm is employed as a preprocessing technique to overcome this difficulty. The use of a genetic algorithm (GA) is intended to provide a first-attempt approximate solution to the problem. Then this guess is employed by the semi-DCNLP algorithm to generate an actual, accurate (open-loop) saddle-point equilibrium solution. This section describes both the evolutionary preprocessing and the semi-DCNLP algorithm.

3.1 Genetic Algorithm Preprocessing

Genetic algorithms represent a systematic approach to providing a starting guess for the semi-DCNLP algorithm because they do not require any a priori information about the solution. The unknown parameters involved in the problem form an individual. A population is composed of a large number of individuals. Each individual corresponds to a set of values of the unknown parameters and is evaluated with respect to a given objective (or fitness) function. The starting population is randomly generated and suitable reproduction mechanisms, such as crossover, elitism, and mutation (cf. [22, 23]), are employed to improve the population generation after generation. After a specified (large) number of generations, the GA is expected to produce the best individual, which contains the parameters associated with the optimal approximate solution to the problem. Genetic algorithms are characterized by a poor numerical accuracy, due to the representation of parameters through a finite number of digits. This can be ameliorated in part by using real genetic algorithms. Yet, this property is not a limitation when they are employed as preprocessing techniques, i.e. just to provide a reasonable guess for the subsequent use of the semi-DCNLP.

In this study, the GA preprocessing considers all the equations that form the two-point boundary value problem (TPBVP) associated with the zero-sum game. In particular:

• each individual is composed of all the unknown values of the costate variables at the initial time t0 (= 0), and includes also the (unknown) time of flight:

\[ \{ \lambda_i(0) \}_{i=1,\dots,12};\ t_f \tag{55} \]

• the control variables are expressed as functions of the state and costate variables through (49)–(54);

• the state equations (1)–(6) and the adjoint equations for λ_P(t) and λ_E(t) are integrated numerically for each individual;

• the 13 boundary conditions (12)–(14), (40), and (42)–(46) are assimilated to scalar constraints of the form c_l(x_Pf, x_Ef, λ_Pf, λ_Ef, t_f) = 0 (l = 1, ..., 13);

• the following functional, related to constraint violation, represents the objective function J for the GA, and is evaluated for each individual:

\[ J = \sum_{l=1}^{9} k_l c_l^2 \qquad (k_l > 0). \tag{56} \]

It is worth remarking that the number of unknown parameters is exactly equal to the number of constraints (i.e. 13). In this research the C package NSGA-II, developed by Deb [23], has been employed, with the following settings: a population composed of 500 individuals, and 100 generations to select the best individual.
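The preprocessing idea, i.e. minimizing a weighted sum of squared constraint residuals of the form (56) with an evolutionary search, can be sketched on a toy problem. The code below is a deliberately simple real-coded GA with elitism, arithmetic crossover, and Gaussian mutation; it is not NSGA-II and does not reproduce the settings used in the paper. The two toy residuals, the bounds, and all function names are hypothetical stand-ins for the boundary-condition residuals obtained after numerical integration.

```python
import random


def constraint_violation(residuals, weights):
    """Weighted sum of squared residuals, the same form as the functional (56)."""
    return sum(k * c * c for k, c in zip(weights, residuals))


def evolve(fitness, bounds, pop_size=40, generations=60,
           mutation_scale=0.1, seed=0):
    """Minimize `fitness` over box `bounds`; return (best individual, history)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        elite = population[: pop_size // 4]
        children = [list(individual) for individual in elite]  # elitism
        while len(children) < pop_size:
            parent1, parent2 = rng.sample(elite, 2)
            w = rng.random()
            # Arithmetic crossover followed by Gaussian mutation, clipped to bounds.
            child = [w * g1 + (1.0 - w) * g2 for g1, g2 in zip(parent1, parent2)]
            child = [min(hi, max(lo, g + rng.gauss(0.0, mutation_scale * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children
    population.sort(key=fitness)
    history.append(fitness(population[0]))
    return population[0], history


def toy_fitness(individual):
    """Toy residuals c1 = p + q - 3 and c2 = p - q - 1 (zero at p = 2, q = 1)."""
    p, q = individual
    return constraint_violation([p + q - 3.0, p - q - 1.0], [1.0, 1.0])


if __name__ == "__main__":
    best, history = evolve(toy_fitness, [(-5.0, 5.0), (-5.0, 5.0)])
    print("best individual:", best, "violation:", history[-1])
```

Because the elite individuals are copied unchanged into the next generation, the best violation never increases from one generation to the next; in the actual preprocessing each fitness evaluation would require integrating the state and adjoint equations for the individual under test.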

3.2 Semi-DCNLP Algorithm

The semi-direct collocation with nonlinear programming (semi-DCNLP) algorithm converts the two-sided optimization problem, formulated as a zero-sum game, into a single-objective optimization problem. It is based on the following points:

• the control of the evader is found from the necessary conditions (52)–(54), and can be expressed as u_E = u_E(x_E, λ_E);

• the control of the pursuer is found numerically;

• an extended state x̃ (n-dimensional vector) is defined with the inclusion of the adjoint variables of the evader

\[ \tilde{x} = \begin{bmatrix} x_P^T & x_E^T & \lambda_E^T \end{bmatrix}^T; \tag{57} \]

• a new control variable, including u_P only, is introduced: ũ = u_P (m-dimensional vector).

Hence, the extended state equations for x̃ can be formally written by taking into account the state equations (11) and the adjoint equations (35) for λ_E(t):

\[ \dot{\tilde{x}} = \begin{bmatrix} f_P^T & \tilde{f}_E^T & -\lambda_E^T \left[ \dfrac{\partial f_E}{\partial x_E} \right] \end{bmatrix}^T = \tilde{f}, \tag{58} \]

where f̃_E^T = f_E^T(x_E, u_E(x_E, λ_E, t), t) = f̃_E^T(x_E, λ_E, t).

The extended boundary conditions include the original boundary conditions of the problem (15) and the boundary conditions related to the adjoint variables of the evader, collected in ψ_EXT:

\[ \tilde{\psi} = \begin{bmatrix} \psi^T & \psi_{EXT}^T \end{bmatrix}^T = 0. \tag{59} \]

The additional term ψ_EXT consists of the boundary conditions related to λ_E only, after eliminating the components of ν from (36)–(37). For the problem of interest ψ_EXT includes the left-hand side of (43) and (40) (after introducing (42) and (44)–(46)), i.e. ψ_EXT has 4 components. As an immediate consequence, the q-dimensional vector ψ̃ has seven components (q = 7).

With these steps the zero-sum game has been converted into the following optimal control problem:

\[ \min_{\tilde{u}} J \quad \text{subject to (58) and (59).} \tag{60} \]


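The corresponding extended Hamiltonian is

\[ \tilde{H} = \tilde{\lambda}^T \tilde{f} = \lambda_{P(e)}^T f_P + \lambda_{E(e)}^T \tilde{f}_E - \kappa_{(e)}^T \left[ \frac{\partial f_E}{\partial x_E} \right]^T \lambda_E, \tag{61} \]

where λ̃ = [λ_{P(e)}^T   λ_{E(e)}^T   κ_{(e)}^T]^T. The extended terminal function now includes also ψ_EXT:

\[ \tilde{\Phi} = t_f + \tilde{\nu}^T \tilde{\psi} = t_f + \nu^T \psi + \nu_{EXT}^T \psi_{EXT}, \tag{62} \]

where ν̃ = [ν^T   ν_EXT^T]^T. The solution of the problem (60) also satisfies the necessary conditions for an open-loop representation of a saddle-point equilibrium solution if the following condition holds (cf. Appendix A):

\[ \nu_{EXT} = 0. \tag{63} \]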

The continuous problem (60) is then discretized in time through collocation [24, 25] and solved numerically. More specifically:

• the time interval [t0, t_f] is partitioned into N subintervals (N = 10 in this study);

• in each subinterval, the state and the control variables are discretized in time (i.e. only their values at discrete times are employed by the algorithm);

• equations (58) are translated into nonlinear algebraic equations by means of high-order quadrature rules (in this research the highly accurate Gauss–Lobatto fifth-order quadrature rules, cf. [25]).

The resulting nonlinear programming problem is solved by a numerical solver (in this work the Fortran package NPSOL [26]).

With the fifth-order Gauss–Lobatto quadrature rules, each state component is represented by the values at the initial, at the central, and at the terminal point of each subinterval. Therefore, the extended state x̃ is represented by (2nN + n) parameters (378 in this study). Each control component is represented by the respective values at the initial, at the central, and at the terminal point of each subinterval, and also by two additional values corresponding to two collocation points (cf. [25] for further details). Therefore, the control vector ũ is represented through (4mN + m) parameters (82 in this work). The fifth-order Gauss–Lobatto rules allow translating the continuous problem into 2nN nonlinear constraints (360 in this research). The NLP solver is expected to yield the optimal values of the parameters, and then the state components are interpolated through fifth-degree polynomials, which represent the continuous accurate approximations of their optimal time histories.
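The parameter and constraint counts quoted above follow directly from the three formulas in the text. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the values n = 18 (the 12 state components of the two vehicles plus the 6 adjoint variables of the evader) and m = 2 (the two thrust angles of the pursuer) are inferred here from the definitions of the extended state and control rather than stated explicitly in the text.

```python
def nlp_size(n_states, n_controls, n_subintervals):
    """Sizes of the discretized problem for fifth-order Gauss-Lobatto collocation.

    Uses the counts given in the text: (2nN + n) state parameters,
    (4mN + m) control parameters, and 2nN nonlinear constraints.
    """
    n, m, big_n = n_states, n_controls, n_subintervals
    return {
        "state_parameters": 2 * n * big_n + n,
        "control_parameters": 4 * m * big_n + m,
        "constraints": 2 * n * big_n,
    }


if __name__ == "__main__":
    # n = 18 and m = 2 are inferred from the extended state and control; N = 10.
    print(nlp_size(n_states=18, n_controls=2, n_subintervals=10))
```

With N = 10 the function returns 378 state parameters, 82 control parameters, and 360 constraints, in agreement with the figures reported in the paragraph above.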

4 Numerical Results

In this study canonical units have been employed; the Earth radius R_E is the distance unit (DU), whereas the time unit (TU) is such that the Earth gravitational parameter μ_E equals 1 DU^3/TU^2. Hence, 1 DU = 6378.165 km and 1 TU = 806.8 sec. In canonical units, 1 DU/TU^2 ≡ 1 g = 9.798 × 10^-3 km/sec^2.

Three problems have been considered and solved through the method described in Sect. 3:

(a) the optimal interception of an optimally evading spacecraft by a pursuing spacecraft;

(b) the optimal interception of an optimally evading missile by a pursuing missile;

(c) the optimal interception of an optimally evading spacecraft by a pursuing missile.
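The canonical units quoted at the beginning of this section can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the dimensional value of the Earth gravitational parameter (398600.4 km^3/sec^2) is an assumption, since the text gives the distance and time units but not this number.

```python
import math

DU_KM = 6378.165        # distance unit from the text (Earth radius), km
MU_KM3_S2 = 398600.4    # assumed Earth gravitational parameter, km^3/sec^2


def canonical_time_unit(du_km, mu_km3_s2):
    """Time unit (sec) for which the gravitational parameter equals 1 DU^3/TU^2."""
    return math.sqrt(du_km ** 3 / mu_km3_s2)


def canonical_acceleration(du_km, tu_s):
    """Value of 1 DU/TU^2 expressed in km/sec^2."""
    return du_km / tu_s ** 2


if __name__ == "__main__":
    tu = canonical_time_unit(DU_KM, MU_KM3_S2)
    g = canonical_acceleration(DU_KM, tu)
    print("1 TU = %.1f sec" % tu)
    print("1 DU/TU^2 = %.3e km/sec^2" % g)
    # Thrust-to-mass ratio of 0.1 g, expressed in km/sec^2.
    print("0.1 g = %.3e km/sec^2" % (0.1 * g))
```

Under the assumed gravitational parameter the computed time unit and acceleration unit agree with the values 806.8 sec and 9.798 × 10^-3 km/sec^2 given in the text to the digits shown there.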



Fig. 4 Osculating orbital elements of the pursuing spacecraft

Fig. 5 Osculating orbital elements of the evading spacecraft

Ω_P(t0) = 60 deg; ω_P(t0) = 127.5 deg;

M_P(t0) = 116.9 deg; (T_P/m_P) = 0.1 g;

a_E(t0) = 6678.165 km; e_E(t0) = 0; i_E(t0) = 56.5 deg;

Ω_E(t0) = 0 deg; θ_E(t0) = ω_E(t0) + M_E(t0) = 39.8 deg;

(T_E/m_E) = 0.05 g.



Fig. 6 Preprocessed (from the GA optimizer) and optimal (from the sDCNLP optimizer) control laws of the pursuer (a) and of the evader (b)

Fig. 7 Missile vs. missile: saddle-point trajectories leading to interception

Figure 10 illustrates the preprocessed and the optimal control laws of the two players. Figure 11 portrays the corresponding saddle-point trajectories, whereas Figs. 12 and 13 show the time histories of the osculating orbital elements. The initial altitude (at t0) of the pursuing missile is 100 km. Interception occurs in 12.7 minutes at an altitude of 317.2 km.

5 Concluding Remarks

Combat scenarios involving two competing space vehicles are best modeled as zero-sum dynamic games. Algorithms devoted to the numerical solution of optimal control problems cannot be employed directly to solve zero-sum games. This research describes an effective numerical method tailored to solving zero-sum games with separable dynamics: the semi-direct collocation with nonlinear programming algorithm (semi-DCNLP). More specifically, under the assumption that they exist, this work addresses the determination of open-loop representations of feedback saddle-point equilibrium solutions. These representations are sought by employing the necessary conditions that hold for open-loop strategies, under the plausible conjecture that for the game at hand the value function is continuously differentiable over the entire state space. The semi-DCNLP converts the zero-sum game into an optimal control problem, and then solves this converted problem employing collocation. The method under consideration has already been successfully applied to a variety of aerospace problems [13–16] and here is used to solve the problem of optimal interception of an optimally evasive target by a pursuing spacecraft or missile. The two space vehicles are assumed to start maneuvering simultaneously and each of them is supposed to possess complete and instantaneous information on the state of the opponent player. In real life the evader is unlikely to possess this information, which it needs to execute the optimal evasion. However, the solution from game theory for the optimal strategy of the evading target can provide, from a practical point of view, the worst-case scenario faced by the pursuer, which is very useful to know.

Fig. 8 Osculating orbital elements of the pursuing missile

Fig. 9 Osculating orbital elements of the evading missile

Fig. 10 Preprocessed (from the GA optimizer) and optimal (from the sDCNLP optimizer) control laws of the pursuer (a) and of the evader (b)

Fig. 11 Missile vs. spacecraft: saddle-point trajectories leading to interception

The semi-DCNLP requires a reasonable guess for the non-intuitive adjoint variables of one of the two players. This guess is provided through a genetic algorithm and, in the three examples that have been solved, it is occasionally found to have only a poor correspondence to the final, converged solution. This circumstance testifies to the effectiveness and robustness of the semi-DCNLP, which apparently needs only a feasible (approximate) solution as a guess, i.e. a solution that fulfills the conditions for termination with fair accuracy.

Only a small number of cases of optimal interception are solved here. Of course the

number of possible initial conditions and thrust capabilities for the vehicles is infinite, so

that even a large number of solved cases would not be much more useful. The solved cases



Fig. 12 Osculating orbital elements of the pursuing missile

Fig. 13 Osculating orbital elements of the evading spacecraft

represent three possible combat scenarios involving missiles and spacecraft and prove the

validity and usefulness of the analysis, as well as the effectiveness of the method of solution.



Appendix A: Formal Conversion of a Zero-Sum Game into an Optimal Control Problem

This section has the purpose of proving that the solution of the optimal control problem (60) yields an open-loop representation of a saddle-point equilibrium solution of the original zero-sum game if the condition (63) holds.

The following necessary conditions for optimality are associated with the optimal control

problem under consideration:

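\[
\dot{\boldsymbol{\lambda}}_P^{(e)} = -\left[\frac{\partial \tilde{H}}{\partial \mathbf{x}_P}\right]^T = -\left[\frac{\partial \mathbf{f}_P}{\partial \mathbf{x}_P}\right]^T \boldsymbol{\lambda}_P^{(e)}, \tag{64}
\]

\[
\dot{\boldsymbol{\lambda}}_E^{(e)} = -\left[\frac{\partial \tilde{H}}{\partial \mathbf{x}_E}\right]^T = -\left[\frac{\partial \mathbf{f}_E}{\partial \mathbf{x}_E}\right]^T \boldsymbol{\lambda}_E^{(e)} + \left\{\frac{\partial}{\partial \mathbf{x}_E}\left(\left[\frac{\partial \mathbf{f}_E}{\partial \mathbf{x}_E}\right]^T \boldsymbol{\lambda}_E\right)\right\}^T \boldsymbol{\mu}^{(e)}, \tag{65}
\]

\[
\dot{\boldsymbol{\mu}}^{(e)} = -\left[\frac{\partial \tilde{H}}{\partial \boldsymbol{\lambda}_E}\right]^T = \left[\frac{\partial \mathbf{f}_E}{\partial \mathbf{x}_E}\right]^T \boldsymbol{\mu}^{(e)} \tag{66}
\]

with boundary conditions given by (59) and including also

\[
\boldsymbol{\mu}_f^{(e)} = \left[\frac{\partial \tilde{\Phi}}{\partial \boldsymbol{\lambda}_{Ef}}\right]^T = \left[\frac{\partial \boldsymbol{\psi}_{EXT}}{\partial \boldsymbol{\lambda}_{Ef}}\right]^T \boldsymbol{\nu}_{EXT}. \tag{67}
\]

If $\boldsymbol{\nu}_{EXT} = \mathbf{0}$, then, due to homogeneity of (66), $\boldsymbol{\mu}^{(e)} = \mathbf{0}$ $\forall t$, and the Hamiltonian $\tilde{H}$ reduces to

\[
\tilde{H} = \tilde{\boldsymbol{\lambda}}^T \tilde{\mathbf{f}} = \boldsymbol{\lambda}_P^{(e)T} \mathbf{f}_P + \boldsymbol{\lambda}_E^{(e)T} \mathbf{f}_E \tag{68}
\]

whereas the function of terminal conditions simplifies to

\[
\tilde{\Phi} = t_f + \boldsymbol{\nu}^T \boldsymbol{\psi}. \tag{69}
\]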

These two expressions are formally identical to the corresponding expressions that hold in the definition of the necessary conditions for an open-loop saddle-point equilibrium solution (33). As the same differential equations and boundary conditions hold for $\boldsymbol{\lambda}_P$ and $\boldsymbol{\lambda}_P^{(e)}$ and for $\boldsymbol{\lambda}_E$ and $\boldsymbol{\lambda}_E^{(e)}$, these pairs of variables are identical, i.e. $\boldsymbol{\lambda}_P \equiv \boldsymbol{\lambda}_P^{(e)}$ and $\boldsymbol{\lambda}_E \equiv \boldsymbol{\lambda}_E^{(e)}$. This circumstance implies that solving the optimal control problem (60) is equivalent to identifying an open-loop representation of a saddle-point equilibrium solution for the original zero-sum game, provided that the condition (63) holds.

It is worth remarking that the same analytical developments can be derived if $\boldsymbol{\lambda}_P$ is considered for inclusion in $\tilde{\mathbf{x}}$. This means that the roles of P and E are interchangeable in this context.

Appendix B: Relations Between Orbital Elements and State Components

State components $(r_i, \xi_i, \phi_i, \gamma_i, v_i, \zeta_i)$ and orbital elements $(a_i, e_i, i_i, \Omega_i, \omega_i, M_i)$ each represent a set of six variables that describe the dynamic state of each spacecraft. Once the orbital elements are known, the state components are unequivocally determined and vice versa. This appendix deals with the formal derivation of all the relationships needed to calculate the state components from the orbital elements and vice versa.



Fig. 14 Set of rotation angles associated with $R_A$

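First of all, it is worth remarking that the ranges where angular variables are defined are the following:

\[
-\pi \le \xi_i < \pi, \quad -\frac{\pi}{2} \le \phi_i \le \frac{\pi}{2}, \quad -\pi \le \zeta_i < \pi, \quad -\frac{\pi}{2} \le \gamma_i \le \frac{\pi}{2}, \tag{70}
\]

\[
-\pi \le \Omega_i < \pi, \quad 0 \le i_i \le \pi, \quad -\pi \le \omega_i < \pi, \quad -\pi \le M_i < \pi. \tag{71}
\]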

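With reference to Fig. 14, the Earth-centered inertial frame is identified by $(\hat{c}_1, \hat{c}_2, \hat{c}_3)$: $\hat{c}_1$ is the vernal axis and $(\hat{c}_1, \hat{c}_2)$ belong to the Earth equatorial plane. This frame and the orbital frame $(\hat{r}_i, \hat{\theta}_i, \hat{h}_i)$ (where $\hat{h}_i$ denotes the unit vector aligned with the specific angular momentum $\mathbf{h}_i = \mathbf{r}_i \times \mathbf{v}_i$, cf. Fig. 14) are related through the rotation matrix $R_A$, defined by

\[
\begin{bmatrix} \hat{r}_i \\ \hat{\theta}_i \\ \hat{h}_i \end{bmatrix}
=
\underbrace{\begin{bmatrix}
c_{\phi_i} c_{\xi_i} & c_{\phi_i} s_{\xi_i} & s_{\phi_i} \\
-c_{\zeta_i} s_{\xi_i} - s_{\zeta_i} s_{\phi_i} c_{\xi_i} & c_{\zeta_i} c_{\xi_i} - s_{\zeta_i} s_{\phi_i} s_{\xi_i} & s_{\zeta_i} c_{\phi_i} \\
s_{\zeta_i} s_{\xi_i} - c_{\zeta_i} s_{\phi_i} c_{\xi_i} & -s_{\zeta_i} c_{\xi_i} - c_{\zeta_i} s_{\phi_i} s_{\xi_i} & c_{\zeta_i} c_{\phi_i}
\end{bmatrix}}_{R_A}
\begin{bmatrix} \hat{c}_1 \\ \hat{c}_2 \\ \hat{c}_3 \end{bmatrix}
\tag{72}
\]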

where $s_{[\cdot]} = \sin[\cdot]$ and $c_{[\cdot]} = \cos[\cdot]$. The rotation $R_A$ is written in terms of the angles $\xi_i$, $\phi_i$, and $\zeta_i$, and results from the composition of three elementary rotations: the first (counterclockwise by the angle $\xi_i$) about axis 3, the second (clockwise by the angle $\phi_i$) about axis 2, the third (counterclockwise by the angle $\zeta_i$) about axis 1.

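Similarly, with reference to Fig. 15, the orbital frame $(\hat{r}_i, \hat{\theta}_i, \hat{h}_i)$ can be obtained from the inertial frame $(\hat{c}_1, \hat{c}_2, \hat{c}_3)$ through an alternative rotation $R_B$, written in terms of the angles $\Omega_i$, $i_i$, and $\theta_i$, where $\theta_i$ ($= \omega_i + f_i$) is the argument of latitude ($f_i$ denotes the true anomaly):

\[
\begin{bmatrix} \hat{r}_i \\ \hat{\theta}_i \\ \hat{h}_i \end{bmatrix}
=
\underbrace{\begin{bmatrix}
c_{\Omega_i} c_{\theta_i} - s_{\Omega_i} c_{i_i} s_{\theta_i} & c_{\theta_i} s_{\Omega_i} + s_{\theta_i} c_{i_i} c_{\Omega_i} & s_{\theta_i} s_{i_i} \\
-s_{\theta_i} c_{\Omega_i} - c_{\theta_i} c_{i_i} s_{\Omega_i} & c_{\theta_i} c_{i_i} c_{\Omega_i} - s_{\theta_i} s_{\Omega_i} & c_{\theta_i} s_{i_i} \\
s_{i_i} s_{\Omega_i} & -s_{i_i} c_{\Omega_i} & c_{i_i}
\end{bmatrix}}_{R_B}
\begin{bmatrix} \hat{c}_1 \\ \hat{c}_2 \\ \hat{c}_3 \end{bmatrix}.
\tag{73}
\]

This rotation results from the composition of three elementary rotations: the first (counterclockwise by the angle $\Omega_i$) about axis 3, the second (counterclockwise by the angle $i_i$)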



Fig. 15 Set of rotation angles associated with $R_B$

about axis 1, the third (counterclockwise by the angle $\theta_i$) about axis 3. The two matrices $R_A$ and $R_B$ must coincide and this fact implies that the corresponding elements must be identical. As a result, one obtains

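\[
\cos\xi_i \cos\phi_i = \cos\Omega_i \cos\theta_i - \sin\Omega_i \cos i_i \sin\theta_i, \tag{74}
\]

\[
\cos\phi_i \sin\xi_i = \cos\theta_i \sin\Omega_i + \sin\theta_i \cos i_i \cos\Omega_i, \tag{75}
\]

\[
\sin\phi_i = \sin\theta_i \sin i_i, \tag{76}
\]

\[
\sin\zeta_i \cos\phi_i = \cos\theta_i \sin i_i, \tag{77}
\]

\[
\cos\zeta_i \cos\phi_i = \cos i_i, \tag{78}
\]

\[
-\cos\zeta_i \sin\phi_i \cos\xi_i + \sin\zeta_i \sin\xi_i = \sin i_i \sin\Omega_i, \tag{79}
\]

\[
-\cos\zeta_i \sin\phi_i \sin\xi_i - \sin\zeta_i \cos\xi_i = -\sin i_i \cos\Omega_i. \tag{80}
\]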
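The coincidence of the two rotation matrices can be checked numerically. The following is a minimal sketch, not code from the paper: it assumes the usual frame-rotation form of the elementary rotation matrices, and the names `rot1`, `rot2`, `rot3`, `rotation_a`, `rotation_b`, `angles_a_from_b`, `xi`, `phi`, `zeta`, `raan`, `inc` and `theta` are illustrative choices for the three angles of each rotation. The three angles of the first rotation are recovered from those of the second through the relationships (74)–(78), and the two matrices are then compared element by element.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot1(angle):
    """Elementary frame rotation (counterclockwise) about axis 1."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rot2(angle):
    """Elementary frame rotation (counterclockwise) about axis 2."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def rot3(angle):
    """Elementary frame rotation (counterclockwise) about axis 3."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_a(xi, phi, zeta):
    """R_A: axis 3 by xi, then axis 2 clockwise by phi, then axis 1 by zeta."""
    return matmul(rot1(zeta), matmul(rot2(-phi), rot3(xi)))

def rotation_b(raan, inc, theta):
    """R_B: axis 3 by raan, then axis 1 by inc, then axis 3 by theta."""
    return matmul(rot3(theta), matmul(rot1(inc), rot3(raan)))

def angles_a_from_b(raan, inc, theta):
    """Angles of R_A from those of R_B, using relationships (74)-(78)."""
    phi = math.asin(math.sin(theta) * math.sin(inc))            # (76)
    xi = math.atan2(
        math.cos(theta) * math.sin(raan)
        + math.sin(theta) * math.cos(inc) * math.cos(raan),     # (75)
        math.cos(raan) * math.cos(theta)
        - math.sin(raan) * math.cos(inc) * math.sin(theta))     # (74)
    zeta = math.atan2(math.cos(theta) * math.sin(inc),          # (77)
                      math.cos(inc))                            # (78)
    return xi, phi, zeta

def max_abs_difference(a, b):
    """Largest absolute element-wise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

if __name__ == "__main__":
    raan, inc, theta = 0.7, 1.1, 2.3   # arbitrary test angles in radians
    xi, phi, zeta = angles_a_from_b(raan, inc, theta)
    print(max_abs_difference(rotation_a(xi, phi, zeta),
                             rotation_b(raan, inc, theta)))
```

Using `atan2` on pairs of relationships, rather than a single inverse sine or cosine, resolves the quadrant of each angle; the sketch assumes the first rotation is away from the polar singularity where the cosine of the second angle vanishes.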

B.1

If the orbital elements $(a_i, e_i, i_i, \Omega_i, \omega_i, M_i)$ are specified, the state components $(r_i, v_i, \gamma_i, \xi_i, \phi_i, \zeta_i)$ can be deduced in the following fashion. First of all, the numerical solution of Kepler's equation produces the eccentric anomaly $E_i$:

\[
M_i = E_i - e_i \sin E_i \;\Rightarrow\; E_i \tag{81}
\]

which is directly related to the true anomaly $f_i$ through the well-known formulas:

\[
\sin f_i = \frac{\sin E_i \sqrt{1 - e_i^2}}{1 - e_i \cos E_i} \quad \text{and} \quad \cos f_i = \frac{\cos E_i - e_i}{1 - e_i \cos E_i}. \tag{82}
\]
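The numerical solution of Kepler's equation is not specified further in the text. A minimal sketch, assuming Newton's method with the mean anomaly as starting guess (a choice made here for illustration, adequate for moderate eccentricities), is given below; the function names `solve_kepler` and `true_anomaly` are hypothetical.

```python
import math

def solve_kepler(mean_anomaly, ecc, tol=1e-12, max_iter=50):
    """Return E such that mean_anomaly = E - ecc*sin(E), by Newton's method.

    Assumes an elliptic orbit (0 <= ecc < 1).
    """
    ecc_anomaly = mean_anomaly  # starting guess (an assumption of this sketch)
    for _ in range(max_iter):
        residual = ecc_anomaly - ecc * math.sin(ecc_anomaly) - mean_anomaly
        step = residual / (1.0 - ecc * math.cos(ecc_anomaly))
        ecc_anomaly -= step
        if abs(step) < tol:
            return ecc_anomaly
    raise RuntimeError("Newton iteration for Kepler's equation did not converge")

def true_anomaly(ecc_anomaly, ecc):
    """True anomaly from the eccentric anomaly through the formulas (82)."""
    denom = 1.0 - ecc * math.cos(ecc_anomaly)
    sin_f = math.sin(ecc_anomaly) * math.sqrt(1.0 - ecc**2) / denom
    cos_f = (math.cos(ecc_anomaly) - ecc) / denom
    return math.atan2(sin_f, cos_f)  # both formulas together fix the quadrant

if __name__ == "__main__":
    E = solve_kepler(1.0, 0.3)
    print(E, true_anomaly(E, 0.3))
```

Both formulas of (82) are used together so that the quadrant of the true anomaly is determined without ambiguity.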

The polar equation of elliptic orbits yields the radius $r_i$:

\[
r_i = \frac{a_i (1 - e_i^2)}{1 + e_i \cos f_i}. \tag{83}
\]

Then, from the vis viva equation [27] one obtains the velocity

\[
v_i = \sqrt{\frac{2\mu_E}{r_i} - \frac{\mu_E}{a_i}}. \tag{84}
\]



The flight path angle $\gamma_i$ can be deduced from the radial component of velocity, $v_{r_i}$:

\[
v_{r_i} = \sqrt{\frac{\mu_E}{a_i (1 - e_i^2)}}\, e_i \sin f_i = v_i \sin\gamma_i \;\Rightarrow\; \gamma_i. \tag{85}
\]

Finally, the three angles $(\xi_i, \phi_i, \zeta_i)$ can be calculated from $(\Omega_i, i_i, \theta_i)$ through the relationships (74)–(78).
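The steps of this subsection can be collected into a single routine. The sketch below is illustrative only: the function name `elements_to_state`, the argument names, the Newton iteration used for Kepler's equation, and the numerical value of the Earth gravitational parameter `MU_EARTH` (in km^3/s^2) are assumptions of this example and are not taken from the paper.

```python
import math

MU_EARTH = 398600.4418  # km^3/s^2; assumed value, not taken from the paper

def elements_to_state(a, ecc, inc, raan, argp, mean_anomaly, mu=MU_EARTH):
    """Map (a, e, i, RAAN, arg. of perigee, M) to (r, v, gamma, xi, phi, zeta).

    Assumes an elliptic orbit (0 <= ecc < 1); angles in radians.
    """
    # (81): Kepler's equation, solved here by Newton's method (an assumption)
    ecc_anomaly = mean_anomaly
    for _ in range(50):
        step = ((ecc_anomaly - ecc * math.sin(ecc_anomaly) - mean_anomaly)
                / (1.0 - ecc * math.cos(ecc_anomaly)))
        ecc_anomaly -= step
        if abs(step) < 1e-13:
            break
    # (82): true anomaly (the common positive denominator cancels in atan2)
    f = math.atan2(math.sin(ecc_anomaly) * math.sqrt(1.0 - ecc**2),
                   math.cos(ecc_anomaly) - ecc)
    p = a * (1.0 - ecc**2)                                   # semilatus rectum
    r = p / (1.0 + ecc * math.cos(f))                        # (83)
    v = math.sqrt(2.0 * mu / r - mu / a)                     # (84)
    v_r = math.sqrt(mu / p) * ecc * math.sin(f)              # (85)
    gamma = math.asin(max(-1.0, min(1.0, v_r / v)))          # clamp round-off
    theta = argp + f                                         # argument of latitude
    phi = math.asin(math.sin(theta) * math.sin(inc))         # (76)
    xi = math.atan2(
        math.cos(theta) * math.sin(raan)
        + math.sin(theta) * math.cos(inc) * math.cos(raan),  # (75)
        math.cos(raan) * math.cos(theta)
        - math.sin(raan) * math.cos(inc) * math.sin(theta))  # (74)
    zeta = math.atan2(math.cos(theta) * math.sin(inc),       # (77)
                      math.cos(inc))                         # (78)
    return r, v, gamma, xi, phi, zeta

if __name__ == "__main__":
    print(elements_to_state(8000.0, 0.2, 0.9, 0.4, 1.0, 0.6))
```

A quick consistency check on the output is that $r_i v_i \cos\gamma_i$ equals $\sqrt{\mu_E a_i (1 - e_i^2)}$, i.e. the specific angular momentum magnitude computed from the state matches the one computed from the elements.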

B.2

If the state components $(r_i, v_i, \gamma_i, \xi_i, \phi_i, \zeta_i)$ are specified, the instantaneous (osculating) orbital elements $(a_i, e_i, i_i, \Omega_i, \omega_i, M_i)$ can be deduced in the following fashion. First of all, from the vis viva equation one obtains the semimajor axis (SMA) $a_i$:

\[
a_i = \frac{\mu_E r_i}{2\mu_E - r_i v_i^2}. \tag{86}
\]

In terms of the state components, the specific angular momentum magnitude $h_i$ is written as $h_i = r_i v_i \cos\gamma_i$. However, this quantity is also $h_i = \sqrt{\mu_E a_i (1 - e_i^2)}$. Hence, the eccentricity is given by:

\[
e_i = \sqrt{1 - \frac{(r_i v_i \cos\gamma_i)^2}{\mu_E a_i}}. \tag{87}
\]

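The true anomaly $f_i$ can be found by considering the polar equation of the ellipse:

\[
r_i = \frac{a_i (1 - e_i^2)}{1 + e_i \cos f_i} \;\Rightarrow\; \cos f_i = \frac{a_i (1 - e_i^2) - r_i}{r_i e_i} \tag{88}
\]

in conjunction with the radial component of velocity:

\[
v_{r_i} = \sqrt{\frac{\mu_E}{a_i (1 - e_i^2)}}\, e_i \sin f_i = v_i \sin\gamma_i \;\Rightarrow\; \sin f_i = \frac{v_i \sin\gamma_i}{e_i} \sqrt{\frac{a_i (1 - e_i^2)}{\mu_E}}. \tag{89}
\]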

The counterparts of the relationships (82) yield the eccentric anomaly $E_i$:

\[
\sin E_i = \frac{\sin f_i \sqrt{1 - e_i^2}}{1 + e_i \cos f_i} \quad \text{and} \quad \cos E_i = \frac{\cos f_i + e_i}{1 + e_i \cos f_i}. \tag{90}
\]

Then, one can obtain the mean anomaly $M_i = E_i - e_i \sin E_i$. The three angles $(\Omega_i, i_i, \theta_i)$ can be calculated from $(\xi_i, \phi_i, \zeta_i)$ through the relationships (76)–(80). Finally, once $\theta_i$ and $f_i$ are known, the argument of perigee $\omega_i$ is simply $\omega_i = \theta_i - f_i$.
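The inverse conversion can be sketched in the same way. As before, this is an illustrative example rather than the author's implementation: the function names `state_to_elements` and `wrap_to_pi`, the argument names, and the value of `MU_EARTH` (km^3/s^2) are assumptions. The sketch also assumes a non-circular, non-equatorial elliptic orbit, because the expressions for the true anomaly divide by the eccentricity and the node is undefined at zero inclination.

```python
import math

MU_EARTH = 398600.4418  # km^3/s^2; assumed value, not taken from the paper

def wrap_to_pi(angle):
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def state_to_elements(r, v, gamma, xi, phi, zeta, mu=MU_EARTH):
    """Map (r, v, gamma, xi, phi, zeta) to (a, e, i, RAAN, arg. of perigee, M).

    Assumes 0 < e < 1 and 0 < i < pi; angles in radians.
    """
    a = mu * r / (2.0 * mu - r * v**2)                               # (86)
    ecc = math.sqrt(max(0.0, 1.0 - (r * v * math.cos(gamma))**2
                        / (mu * a)))                                 # (87)
    p = a * (1.0 - ecc**2)                                           # semilatus rectum
    cos_f = (p - r) / (r * ecc)                                      # (88)
    sin_f = v * math.sin(gamma) / ecc * math.sqrt(p / mu)            # (89)
    f = math.atan2(sin_f, cos_f)
    # (90): the common positive denominator cancels in atan2
    ecc_anomaly = math.atan2(math.sin(f) * math.sqrt(1.0 - ecc**2),
                             math.cos(f) + ecc)
    mean_anomaly = ecc_anomaly - ecc * math.sin(ecc_anomaly)
    inc = math.acos(math.cos(zeta) * math.cos(phi))                  # (78)
    theta = math.atan2(math.sin(phi),                                # (76)
                       math.sin(zeta) * math.cos(phi))               # (77)
    raan = math.atan2(
        -math.cos(zeta) * math.sin(phi) * math.cos(xi)
        + math.sin(zeta) * math.sin(xi),                             # (79)
        math.cos(zeta) * math.sin(phi) * math.sin(xi)
        + math.sin(zeta) * math.cos(xi))                             # (80)
    argp = wrap_to_pi(theta - f)
    return a, ecc, inc, raan, argp, mean_anomaly

if __name__ == "__main__":
    # State at perigee of an orbit with a = 8000 km and e = 0.2
    r0 = 6400.0
    v0 = math.sqrt(MU_EARTH * 1.2 / r0)
    print(state_to_elements(r0, v0, 0.0, 0.3, 0.0, 0.5))
```

In the usage example the vehicle is at perigee and on the equatorial plane, so the routine should return a semimajor axis of 8000 km, an eccentricity of 0.2, and zero argument of perigee and mean anomaly, up to round-off.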

References

1. Isaacs R (1965) Differential games. Wiley, New York

2. Bryson AE, Ho YC (1975) Applied optimal control. Hemisphere, New York

3. Basar T, Olsder GJ (1999) Dynamic noncooperative game theory. SIAM, Philadelphia

4. Breakwell JV, Merz AW (1977) Minimum required capture radius in a coplanar model of the aerial combat problem. AIAA J 15(8):1089–1094

5. Guelman M, Shinar J, Green A (1990) Qualitative study of a planar pursuit evasion game in the atmosphere. J Guid Control Dyn 13(6):1136–1142

6. Hillberg C, Järmark B (1983) Pursuit-evasion between two realistic aircraft. AIAA Atmospheric Flight Mechanics Conference, Gatlinburg, Paper AIAA-83-2119

7. Järmark B, Merz AW, Breakwell JV (1981) The variable speed tail-chase aerial combat problem. J Guid Control Dyn 4(3):323–328

8. Breitner MH, Pesch HJ, Grimm W (1993) Complex differential games of pursuit-evasion type with state constraints, Part 1: Necessary conditions for open-loop strategies. J Optim Theory Appl 78(3):419–441

9. Breitner MH, Pesch HJ, Grimm W (1993) Complex differential games of pursuit-evasion type with state constraints, Part 2: Numerical computation of open-loop strategies. J Optim Theory Appl 78(3):443–463

10. Raivio T, Ehtamo H (2000) Visual aircraft identification as a pursuit-evasion game. J Guid Control Dyn 23(4):701–708

11. Anderson GM, Grazier VW (1975) A closed-form solution for the barrier in pursuit-evasion problems between two low thrust orbital spacecraft and its application. In: Aerospace sciences meeting, Pasadena, CA, January 1975

12. Kelley HJ, Cliff EM, Lutze FH (1981) Pursuit-evasion in orbit. J Astronaut Sci 29:277–288

13. Horie K, Conway BA (2006) Optimal fighter pursuit-evasion maneuvers found via two-sided optimization. J Guid Control Dyn 29(1):105–112

14. Horie K, Conway BA (2004) Genetic algorithm pre-processing for numerical solution of differential games problems. J Guid Control Dyn 27(6):1075–1078

15. Pontani M, Conway BA (2008) Optimal interception of evasive missile warheads: numerical solution of the differential game. J Guid Control Dyn 31(4):1111–1122

16. Pontani M, Conway BA (2009) Numerical solution of the three-dimensional orbital pursuit-evasion game. J Guid Control Dyn 32(2):474–487

17. Cardaliaguet P, Quincampoix M, Saint-Pierre P (1995) Numerical methods for optimal control and differential games. Ceremade CNRS URQA 749, University of Paris–Dauphine, Paris, France

18. Cardaliaguet P, Quincampoix M, Saint-Pierre P (1999) Set-valued numerical analysis for optimal control and differential games. In: Bardi M, Raghavan TES, Parthasarathy T (eds) Stochastic and differential games: theory and numerical methods. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 177–247

19. Pesch HJ, Gabler I, Miesbach S (1995) Synthesis of optimal strategies for differential games by neural networks. In: Olsder GJ (ed) New trends in dynamic games and applications. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 111–141

20. Breitner MH, Pesch HJ (1994) Reentry trajectory optimization under atmospheric uncertainty as a differential game. In: Basar T, Haurie A (eds) Advances in dynamic games and applications. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 70–86

21. Lachner R, Breitner MH, Pesch HJ (2000) Differential game, numerical solution, and synthesis of strategies. In: Filar JR, Gaitsgory V, Mizukami K (eds) Advances in dynamic games and applications. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 115–135

22. Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Boston

23. Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley, Chichester

24. Hargraves CR, Paris SW (1987) Direct trajectory optimization using nonlinear programming and collocation. J Guid Control Dyn 10(4):338–342

25. Herman AL, Conway BA (1996) Direct optimization using collocation based on high-order Gauss–Lobatto quadrature rules. J Guid Control Dyn 19(3):592–599

26. Gill PE, Murray W, Saunders MA, Wright MH (1986) User's guide for NPSOL (Version 4.0): A Fortran package for nonlinear programming, SOL 86-2, Stanford University

27. Prussing JE, Conway BA (1993) Orbital mechanics. Oxford University Press, New York
