Numerical Solution of Orbital Combat Games Involving Missiles and Spacecraft


    Dyn Games Appl (2011) 1:534–557

    DOI 10.1007/s13235-011-0024-5

    Numerical Solution of Orbital Combat Games Involving

    Missiles and Spacecraft

    Mauro Pontani

    Published online: 14 July 2011

    © Springer Science+Business Media, LLC 2011

    Abstract This research addresses the problem of the optimal interception of an optimally

    evasive orbital target by a pursuing spacecraft or missile. The time for interception is to be

    minimized by the pursuing space vehicle and maximized by the evading target. This problem

    is modeled as a two-sided optimization problem, i.e. as a two-player zero-sum differential

    game. The work incorporates a recently developed method, termed semi-direct collocation

    with nonlinear programming, for the numerical solution of dynamic games. The method is

    based on the formal conversion of the two-sided optimization problem into a single-objective

    one, by employing the analytical necessary conditions for optimality related to one of the

    two players. An approximate, first attempt solution for the method is provided through the

    use of a genetic algorithm in a preprocessing phase. Three qualitatively different cases are

    considered. In the first example the pursuer and the evader are represented by two space-

    craft orbiting the Earth in two distinct orbits. The second and the third case involve two

    missiles, and a missile that pursues an orbiting spacecraft, respectively. The numerical re-

    sults achieved in this work testify to the robustness and effectiveness of the method also in

    solving large, complex, three-dimensional problems.

    Keywords Orbital dynamic games · Pursuit-evasion games · Two-sided optimization

    1 Introduction

    The problem of the three-dimensional optimal interception of an optimally evasive orbital

    target by a pursuing spacecraft (or missile) involves two competing players with contrasting

    objectives. The pursuing space vehicle tries to reach the evading target as quickly as possi-

    ble, whereas this latter tries to delay capture indefinitely. Since the time for interception is

    to be minimized by the pursuing spacecraft and maximized by the evading spacecraft, this

    problem is best modeled as a two-sided optimization problem, i.e. it becomes a two-player zero-sum differential game.

    M. Pontani (✉)

    Scuola di Ingegneria Aerospaziale, University of Rome La Sapienza, 00138 Rome, Italy

    e-mail: [email protected]


    Zero-sum games were first introduced by Isaacs [1] and are also referred to as pursuit-

    evasion games. In the context of zero-sum games the optimal trajectories of the two

    spacecraft correspond to a so-called saddle-point equilibrium solution of the game. The

    necessary conditions for an open-loop saddle-point equilibrium solution are relatively

    straightforward to derive and can be viewed as an extension of the necessary conditions for optimality that hold in optimal control theory [2, 3]. Their meaningfulness in relation with

    closed-loop saddle-point equilibrium solutions is closely related to the intrinsic character-

    istics of the game of interest. Only a few problems with simplified dynamics are amenable

    to an analytical solution [4, 5]. For problems with realistic dynamics the only choice is numerical

    solution. Hillberg and Järmark [6] solved an air combat maneuvering problem in

    the horizontal plane with steady turn and realistic drag and thrust data. Järmark, Merz, and

    Breakwell [7] solved a qualitatively similar air combat problem employing differential dy-

    namic programming, and considered only coplanar situations. A pursuit-evasion problem

    between missile and aircraft has been solved using an indirect, multiple shooting method by

    Breitner, Grimm and Pesch [8, 9]. Raivio and Ehtamo [10] solved a pursuit-evasion problem for a visual identification of the target by iterating a direct method. With regard to orbital

    pursuit-evasion games, past studies are often based on simplified dynamical models. Ander-

    son and Grazier [11] described the construction of a closed-form solution for the barrier in a

    planar pursuit-evasion game between two spacecraft, by linearizing the problem about a ref-

    erence circular orbit. Kelley et al. [12] derived the impulsive maneuvers for two spacecraft

    involved in an orbital combat. They argued that optimal evasion only consists of in-plane

    maneuvers.

    The work that follows presents a recently developed method [13–16], termed semi-direct

    collocation with nonlinear programming (semi-DCNLP), devoted to the numerical

    solution of zero-sum dynamic games with separable dynamics of the two players. This method is based on the formal conversion of the two-sided optimization problem into a

    single-objective one, by employing the analytical necessary conditions for optimality re-

    lated to one of the two players. This fact implies that the adjoint variables of one of the two

    spacecraft are directly involved in the optimization process, which needs a reasonable guess

    to yield an accurate saddle-point equilibrium solution. The trial-and-error selection of first

    attempt values for the (non-intuitive) adjoint variables is very challenging for the problem

    at hand. In this work an approximate, first attempt solution is provided through the use of

    a genetic algorithm in a preprocessing phase. Three qualitatively different cases are con-

    sidered. In the first example the pursuer and the evader are represented by two spacecraft

    orbiting Earth in two distinct orbits. The second and the third case involve two missiles, and

    a missile that pursues an orbiting spacecraft, respectively.

    The objective of this work is to: (i) formulate the three-dimensional orbital combat as a

    dynamic game, (ii) describe and derive the analytical necessary conditions that must be sat-

    isfied by an open-loop equilibrium solution (while discussing their validity in relation with

    closed-loop equilibrium solutions), and (iii) obtain the saddle-point equilibrium trajectories

    through the joint use of a genetic algorithm and of the semi-DCNLP.

    2 Problem Definition

    The problem of the optimal interception of an optimally evasive orbital target consists in

    the determination of the saddle-point equilibrium trajectories of the two space vehicles in-

    volved in the combat scenario. Termination of the game occurs when the pursuing vehicle

    (henceforth denoted with P) reaches the instantaneous position of the evading target (de-

    noted with E henceforward). A plausible sufficient condition ensuring that capture ends


    Fig. 1 Local horizontal plane and related angles (a); instantaneous plane of motion and related thrust angles (b)

    the game is presented in the next subsection. The objective function, to be minimized by P

    and maximized by E, is represented by the time for interception. In this work each player

    is assumed to possess complete and instantaneous information on the state of the opponent

    player.

    2.1 Spacecraft Dynamics

    This study employs a point-mass model to describe the three-dimensional motion of the two

    space vehicles involved in the orbital game. The problem is investigated under the following

    assumptions:

    (a) aerodynamic forces are neglected, due to the altitudes involved in the cases that are

    being considered;

    (b) both spacecraft employ their maximum thrust for the entire time of flight;

    (c) the two space vehicles are given modest propulsive capabilities;

    (d) at the initial time t0, which is set to 0, the dynamical state of the two spacecraft is

    specified.

    Hypotheses (b) and (c) allow assuming constant thrust-to-mass ratios for both spacecraft,

    denoted with (TP/mP) and (TE /mE ) for P and E, respectively. This circumstance implies

    also that the control is performed through the thrust direction only.

    Six scalar variables describe the dynamical state of each spacecraft in an inertial Earth-centered
    reference frame: radius $r_i$ ($i = P$ or $E$), absolute longitude $\xi_i$ (measured from the
    vernal axis, the axis joining the equinoxes in the ecliptic plane), latitude $\phi_i$, flight path
    angle $\gamma_i$, velocity $v_i$, heading (or coazimuth) angle $\zeta_i$ (defined in Fig. 1(a)). The control is
    performed with the thrust direction, identified through the two angles $\alpha_i$ and $\beta_i$ illustrated in
    Fig. 1(b); by definition, $-\pi/2 \le \beta_i \le \pi/2$. If $\mu_E$ denotes the Earth gravitational parameter,
    the equations of motion are

    \[ \dot{r}_i = v_i \sin\gamma_i, \tag{1} \]

    \[ \dot{\xi}_i = \frac{v_i \cos\gamma_i \cos\zeta_i}{r_i \cos\phi_i}, \tag{2} \]
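
    The point-mass dynamics can be sketched as a small Python function. Only (1) and (2) appear in this excerpt; the remaining four derivatives are read off from the coefficients of V3 to V6 in Isaacs equation (22), so they are an inferred reconstruction rather than a quotation. The function name, the argument layout, and the variable names (xi for absolute longitude, phi for latitude, gamma for flight path angle, zeta for heading, alpha and beta for the thrust angles) are illustrative assumptions; canonical units (mu = 1) are the default.

```python
import math


def equations_of_motion(state, alpha, beta, thrust_acc, mu=1.0):
    """Time derivatives of (r, xi, phi, gamma, v, zeta) for one vehicle.

    state      : (r, xi, phi, gamma, v, zeta), angles in radians
    alpha,beta : thrust angles (in-plane and out-of-plane), radians
    thrust_acc : constant thrust-to-mass ratio T/m
    mu         : gravitational parameter (1.0 in canonical units, assumed default)
    """
    r, xi, phi, gamma, v, zeta = state
    r_dot = v * math.sin(gamma)                                          # (1)
    xi_dot = v * math.cos(gamma) * math.cos(zeta) / (r * math.cos(phi))  # (2)
    # The four lines below are inferred from the V3..V6 terms of (22).
    phi_dot = v * math.cos(gamma) * math.sin(zeta) / r
    gamma_dot = (thrust_acc * math.sin(alpha) * math.cos(beta) / v
                 + (v / r - mu / (r ** 2 * v)) * math.cos(gamma))
    v_dot = (thrust_acc * math.cos(alpha) * math.cos(beta)
             - mu * math.sin(gamma) / r ** 2)
    zeta_dot = (thrust_acc * math.sin(beta) / (v * math.cos(gamma))
                - v * math.cos(gamma) * math.sin(phi) * math.cos(zeta)
                / (r * math.cos(phi)))
    return (r_dot, xi_dot, phi_dot, gamma_dot, v_dot, zeta_dot)


if __name__ == "__main__":
    # Unpowered circular equatorial orbit of unit radius in canonical units.
    print(equations_of_motion((1.0, 0.0, 0.0, 0.0, 1.0, 0.0), 0.0, 0.0, 0.0))
```

    On an unpowered circular orbit only the longitude rate is nonzero, which is a quick sanity check on the signs of the gravity and curvature terms.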


    A condition that ensures (at least for the cases considered in this paper) that interception

    concludes the game is

    \[ \frac{T_P}{m_P} > \frac{T_E}{m_E}, \tag{17} \]

    i.e., P has superior propulsive capabilities with respect to E. As unbounded controls are

    assumed for both spacecraft, it is conjectured that in general this condition implies that

    interception can occur in a finite time, regardless of the initial conditions of the two players.

    This circumstance implies also that no barrier that emanates from the target set can exist for

    the game at hand.

    In general, for zero-sum games two feedback (or, equivalently, closed-loop) strategies,
    $\sigma_P$ and $\sigma_E$, can be introduced for the two players. If a closed-loop saddle-point equilibrium
    exists, the strategies $\sigma_P^*$ and $\sigma_E^*$ are in saddle-point equilibrium when

    \[ J(\sigma_P^*, \sigma_E) \le J(\sigma_P^*, \sigma_E^*) \le J(\sigma_P, \sigma_E^*), \quad \forall \sigma_P \in \Sigma_P, \; \forall \sigma_E \in \Sigma_E, \tag{18} \]

    where $\Sigma_P$ and $\Sigma_E$ are the sets of the admissible strategies (in the neighborhoods of
    $\sigma_P^*$ and $\sigma_E^*$). At a given time $t$, the value $V$ of the game is defined as the outcome of the objective
    function when both players employ their optimal strategies along the optimal path in the
    time interval $[t, t_f]$:

    \[ V = \min_{\sigma_P} \max_{\sigma_E} J = \max_{\sigma_E} \min_{\sigma_P} J \tag{19} \]

    provided that the operators max and min commute.
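
    The role of the commutation requirement can be illustrated on a finite toy game. The sketch below is not taken from the paper: the payoff tables and function names are hypothetical. Rows are choices of the minimizing player (the pursuer), columns are choices of the maximizing player (the evader). When the payoff is a sum of a pursuer-only term and an evader-only term, the two orders of optimization agree; for a coupled payoff they need not.

```python
def min_max(payoff):
    """Minimizer commits first: min over rows of the max over columns."""
    return min(max(row) for row in payoff)


def max_min(payoff):
    """Maximizer commits first: max over columns of the min over rows."""
    n_cols = len(payoff[0])
    return max(min(row[j] for row in payoff) for j in range(n_cols))


# Hypothetical separable payoff: J = (pursuer term) + (evader term).
pursuer_term = [3.0, 1.0, 2.0]
evader_term = [0.5, 4.0]
separable = [[p + e for e in evader_term] for p in pursuer_term]

# Hypothetical coupled payoff (matching-pennies type), no pure saddle point.
coupled = [[1.0, -1.0], [-1.0, 1.0]]

if __name__ == "__main__":
    print("separable:", min_max(separable), max_min(separable))
    print("coupled:  ", min_max(coupled), max_min(coupled))
```

    The separable table has a saddle point, so a single value is defined; the coupled table has a gap between the two orders, and no value exists in pure strategies.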

    A common assumption [1, 3] is that the state space can be divided into a number of mutually
    disjoint regions, separated by singular surfaces. These surfaces, according to the definition
    given by Basar [3], are the loci where (i) the equilibrium strategies are not uniquely
    determined by the necessary conditions, or (ii) the value function is not continuously
    differentiable, or (iii) the value function is discontinuous. In the scientific literature, some
    special, structural characteristics of zero-sum games are responsible for a number of singular
    surfaces. For instance, state constraints can yield afferent and universal surfaces [8].
    Nonsmooth data (e.g., a discontinuous thrust) can be responsible for discontinuities in the right
    hand side of the state equations and transition surfaces can arise [8]. Furthermore, control

    variables that appear linearly in the dynamics equations usually yield singular surfaces of

    several kinds [1]. Other more complex analytical conditions [17, 18] can generate singu-

    lar surfaces. For the problem at hand none of the previously mentioned circumstances is

    encountered, and the non-existence of singular surfaces is conjectured. Thus, the value V

    is plausibly assumed to be continuously differentiable over the entire state space (i.e. V is

    assumed to be of class C1 over the entire state space). With this assumption (which still

    eludes any rigorous mathematical proof), the optimal open-loop representations (uP,uE ) of

    the closed-loop strategies are introduced as

    \[ u_P^*(t) = \sigma_P^*(x_P, x_E, t) \quad \text{and} \quad u_E^*(t) = \sigma_E^*(x_P, x_E, t). \tag{20} \]

    For each player an open-loop representation of an optimal feedback strategy is the strategy

    along the saddle-point equilibrium trajectory as a function of t and of the initial state only,

    under the assumption that V is of class C1 in the region of the state space under consid-

    eration. In other words, if the state is contained in a region where the value function is of

    class C1, then the open-loop strategies become open-loop representations of feedback strate-

    gies. These representations are relevant because two properties relate them to the feedback

    strategies:


    (a) if one of the two players deviates from his optimal open-loop strategy, his outcome

    worsens,

    (b) if both players employ their own optimal open-loop strategies, then the time histories of

    the optimal open-loop and of the optimal feedback strategies are identical.

    It is worth remarking that the determination of open-loop representations is relevant for zero-sum games, because it represents an essential premise for the successive synthesis of

    feedback strategies. Pesch et al. [19], Breitner and Pesch [20], and Lachner et al. [21] com-

    puted a relevant number of open-loop solutions and employed them for the synthesis of

    feedback strategies by means of special techniques (e.g., with the use of neural networks).

    In the regions where the value function exists and is continuously differentiable in t and

    x, V satisfies the following partial differential equation, referred to as Isaacs equation:

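    \[ \frac{\partial V}{\partial t} + \max_{\sigma_E} \min_{\sigma_P} \left[ \frac{\partial V}{\partial x_P} f_P + \frac{\partial V}{\partial x_E} f_E \right] = 0. \tag{21} \]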

    Isaacs equation is written with reference to the special (separable) form (11) of the state

    equations.

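    With regard to the dynamic game at hand, in this context the variables $(\alpha_P, \beta_P)$ and
    $(\alpha_E, \beta_E)$ represent feedback strategies for P and E, respectively ($\sigma_P = [\alpha_P \;\; \beta_P]^T$ and
    $\sigma_E = [\alpha_E \;\; \beta_E]^T$). Isaacs equation becomes

    \[
    \begin{aligned}
    \frac{\partial V}{\partial t}
    &+ \max_{\sigma_E} \min_{\sigma_P} \Bigg\{ V_1 v_P s_{\gamma_P}
       + V_2 \frac{v_P c_{\gamma_P} c_{\zeta_P}}{r_P c_{\phi_P}}
       + V_3 \frac{v_P c_{\gamma_P} s_{\zeta_P}}{r_P} \\
    &\qquad + V_4 \left[ \frac{v_P c_{\gamma_P}}{r_P} + \frac{T_P}{m_P} \frac{s_{\alpha_P} c_{\beta_P}}{v_P} - \frac{\mu_E c_{\gamma_P}}{r_P^2 v_P} \right]
       + V_5 \left[ \frac{T_P}{m_P} c_{\alpha_P} c_{\beta_P} - \frac{\mu_E s_{\gamma_P}}{r_P^2} \right] \\
    &\qquad + V_6 \left[ \frac{T_P}{m_P} \frac{s_{\beta_P}}{v_P c_{\gamma_P}} - \frac{v_P c_{\gamma_P} s_{\phi_P} c_{\zeta_P}}{r_P c_{\phi_P}} \right] \Bigg\} \\
    &+ \max_{\sigma_E} \min_{\sigma_P} \Bigg\{ V_7 v_E s_{\gamma_E}
       + V_8 \frac{v_E c_{\gamma_E} c_{\zeta_E}}{r_E c_{\phi_E}}
       + V_9 \frac{v_E c_{\gamma_E} s_{\zeta_E}}{r_E} \\
    &\qquad + V_{10} \left[ \frac{v_E c_{\gamma_E}}{r_E} + \frac{T_E}{m_E} \frac{s_{\alpha_E} c_{\beta_E}}{v_E} - \frac{\mu_E c_{\gamma_E}}{r_E^2 v_E} \right]
       + V_{11} \left[ \frac{T_E}{m_E} c_{\alpha_E} c_{\beta_E} - \frac{\mu_E s_{\gamma_E}}{r_E^2} \right] \\
    &\qquad + V_{12} \left[ \frac{T_E}{m_E} \frac{s_{\beta_E}}{v_E c_{\gamma_E}} - \frac{v_E c_{\gamma_E} s_{\phi_E} c_{\zeta_E}}{r_E c_{\phi_E}} \right] \Bigg\} = 0
    \end{aligned}
    \tag{22}
    \]

    and holds $\forall (x, t)$, since $V$ is assumed to be of class $C^1$ over the entire state space. The
    symbol $V_j$ denotes the derivative of the value function $V$ with respect to the state component
    $x_j$ ($j = 1, \ldots, 12$), whereas $s_{[\cdot]} = \sin[\cdot]$ and $c_{[\cdot]} = \cos[\cdot]$. It is worth noticing that for the
    problem of interest the two operators max and min are interchangeable due to separability
    of the dynamical system. For the same reason (22) reduces to: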

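    \[
    \begin{aligned}
    \frac{\partial V}{\partial t}
    &+ \frac{T_P}{m_P} \min_{\sigma_P} \left[ V_4 \frac{\sin\alpha_P \cos\beta_P}{v_P} + V_5 \cos\alpha_P \cos\beta_P + V_6 \frac{\sin\beta_P}{v_P \cos\gamma_P} \right] \\
    &+ \frac{T_E}{m_E} \max_{\sigma_E} \left[ V_{10} \frac{\sin\alpha_E \cos\beta_E}{v_E} + V_{11} \cos\alpha_E \cos\beta_E + V_{12} \frac{\sin\beta_E}{v_E \cos\gamma_E} \right] + \text{r.t.} = 0,
    \end{aligned}
    \tag{23}
    \]

    where r.t. represents the remaining terms, all of which are independent of the (feedback)
    control variables $\sigma_P$ and $\sigma_E$. Introducing the unit vector $\hat{V}_P$ by

    \[ \hat{V}_P = \left[ V_5^2 + \left( \frac{V_4}{v_P} \right)^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-1/2} \left[ V_5 \quad \frac{V_4}{v_P} \quad \frac{V_6}{v_P \cos\gamma_P} \right]^T \tag{24} \]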


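    it is then relatively straightforward to find the control $\sigma_P^*$ that minimizes the second
    term of (23). In fact, if the thrust direction of P is denoted with $\hat{T}_P$, then
    $\hat{T}_P = [\cos\alpha_P \cos\beta_P \;\; \sin\alpha_P \cos\beta_P \;\; \sin\beta_P]^T$ and the second term in (23) can be rewritten as

    \[ \frac{T_P}{m_P} \sqrt{ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 } \; \min_{u_P} \left( \hat{T}_P^T \hat{V}_P \right). \tag{25} \]

    The dot product $\hat{T}_P^T \hat{V}_P$ is minimized to $-1$ if

    \[ \cos\alpha_P \cos\beta_P = -V_5 \left[ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-1/2}, \tag{26} \]

    \[ \sin\alpha_P \cos\beta_P = -\frac{V_4}{v_P} \left[ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-1/2}, \tag{27} \]

    \[ \sin\beta_P = -\frac{V_6}{v_P \cos\gamma_P} \left[ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P \cos\gamma_P} \right)^2 \right]^{-1/2}. \tag{28} \]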

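    The three relations (26)–(28) lead to deriving $\beta_P$ (which is constrained to $[-\pi/2, \pi/2]$) and
    $\alpha_P$ as functions of the state variable $x_P$ and $\{V_j\}_{j=4,5,6}$. The same steps can be repeated for
    E (taking into account that max replaces min in (23)) and lead to the following relationships:

    \[ \cos\alpha_E \cos\beta_E = V_{11} \left[ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-1/2}, \tag{29} \]

    \[ \sin\alpha_E \cos\beta_E = \frac{V_{10}}{v_E} \left[ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-1/2}, \tag{30} \]

    \[ \sin\beta_E = \frac{V_{12}}{v_E \cos\gamma_E} \left[ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-1/2} \tag{31} \]

    which allow obtaining $\beta_E$ (constrained to $[-\pi/2, \pi/2]$) and $\alpha_E$ as functions of the state
    variable $x_E$ and $\{V_j\}_{j=10,11,12}$. Due to (26)–(31), Isaacs equation becomes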

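    \[
    \begin{aligned}
    \frac{\partial V}{\partial t}
    &+ V_1 v_P s_{\gamma_P} + V_2 \frac{v_P c_{\gamma_P} c_{\zeta_P}}{r_P c_{\phi_P}} + V_3 \frac{v_P c_{\gamma_P} s_{\zeta_P}}{r_P}
       + V_4 \left[ \frac{v_P c_{\gamma_P}}{r_P} - \frac{\mu_E c_{\gamma_P}}{r_P^2 v_P} \right]
       - V_5 \frac{\mu_E s_{\gamma_P}}{r_P^2} \\
    &- V_6 \frac{v_P c_{\gamma_P} s_{\phi_P} c_{\zeta_P}}{r_P c_{\phi_P}}
       - \frac{T_P}{m_P} \sqrt{ \left( \frac{V_4}{v_P} \right)^2 + V_5^2 + \left( \frac{V_6}{v_P c_{\gamma_P}} \right)^2 } \\
    &+ V_7 v_E s_{\gamma_E} + V_8 \frac{v_E c_{\gamma_E} c_{\zeta_E}}{r_E c_{\phi_E}} + V_9 \frac{v_E c_{\gamma_E} s_{\zeta_E}}{r_E}
       + V_{10} \left[ \frac{v_E c_{\gamma_E}}{r_E} - \frac{\mu_E c_{\gamma_E}}{r_E^2 v_E} \right]
       - V_{11} \frac{\mu_E s_{\gamma_E}}{r_E^2} \\
    &- V_{12} \frac{v_E c_{\gamma_E} s_{\phi_E} c_{\zeta_E}}{r_E c_{\phi_E}}
       + \frac{T_E}{m_E} \sqrt{ \left( \frac{V_{10}}{v_E} \right)^2 + V_{11}^2 + \left( \frac{V_{12}}{v_E c_{\gamma_E}} \right)^2 } = 0.
    \end{aligned}
    \tag{32}
    \]

    As $r_i$, $v_i$, $\cos\phi_i$, and $\cos\gamma_i$ ($i = P$ or $E$) never vanish and due to continuity of the partial
    derivatives $\{V_j\}_{j=1,\ldots,12}$, for the game at hand Isaacs equation holds in the entire state space.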


    The partial differential equation (32) cannot be directly solved in closed form and this

    circumstance prevents directly deriving the feedback control laws in the form (20). In dif-

    ferential game contexts, it is a common practice [3] to employ the necessary conditions for

    open-loop saddle-point strategies. Then, if the value function is of class C1 over the region of

    the state space under consideration, then the open-loop strategies become open-loop representations of feedback strategies. This research is aimed at determining open-loop strategies,

    which are conjecturally considered open-loop representations of feedback strategies, under

    the reasonable assumption that for the game of interest the value function has class C1 over

    the entire state space.

    The necessary conditions for open-loop saddle-point solutions involve ordinary differ-

    ential equations, and can be regarded as extensions of the necessary conditions for a local

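    minimum that hold in optimal control theory. First, a Hamiltonian $H$ and a function of
    terminal conditions $\Phi$ are introduced as

    \[ H = \lambda_P^T f_P + \lambda_E^T f_E, \qquad \Phi = t_f + \nu^T \psi, \tag{33} \]

    where $\lambda_P$, $\lambda_E$, and $\nu$ are the adjoint variables conjugate to the state equations (11), and
    to the boundary conditions (15), respectively. For the Lagrange multipliers $\lambda_P$ and $\lambda_E$ the
    following adjoint equations hold [3]:

    \[ \dot{\lambda}_P = -\left[ \frac{\partial H}{\partial x_P} \right]^T = -\left[ \frac{\partial f_P}{\partial x_P} \right]^T \lambda_P, \tag{34} \]

    \[ \dot{\lambda}_E = -\left[ \frac{\partial H}{\partial x_E} \right]^T = -\left[ \frac{\partial f_E}{\partial x_E} \right]^T \lambda_E \tag{35} \]

    with the respective boundary conditions:

    \[ \lambda_P(t_f) = \left[ \frac{\partial \Phi}{\partial x_{Pf}} \right]^T, \tag{36} \]

    \[ \lambda_E(t_f) = \left[ \frac{\partial \Phi}{\partial x_{Ef}} \right]^T. \tag{37} \]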

    Open-loop control variables can be determined through the following pair of relations,

    \[ u_P^* = \arg\min_{u_P} H, \tag{38} \]

    \[ u_E^* = \arg\max_{u_E} H \tag{39} \]

    that can be regarded as the extension of the Pontryagin minimum principle to dynamic
    games. As the terminal time $t_f$ is unspecified, the following transversality condition holds:

    \[ H(t_f) + \frac{\partial \Phi}{\partial t_f} = 0. \tag{40} \]

    Equations (11), (15), and (34)–(40) define the two-point boundary value problem associated
    with the zero-sum dynamic game. The unknowns are the state vectors $x_P(t)$ and $x_E(t)$,
    the control vectors $u_P(t)$ and $u_E(t)$, the Lagrange multipliers $\lambda_P(t)$, $\lambda_E(t)$, and $\nu$, and the
    terminal time $t_f$.


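    With regard to the orbital game at hand, (34)–(35) yield 12 scalar adjoint equations,
    which are not reported for the sake of brevity (cf. [16]). If the subscript $f$ denotes the
    value of the corresponding variable at $t_f$, and using the terminal constraints (12)–(14), the
    boundary conditions for the adjoint variables $\lambda_P(t)$ and $\lambda_E(t)$ are

    \[
    \begin{aligned}
    &\lambda_{1f} = \nu_1, \quad \lambda_{2f} = \nu_2, \quad \lambda_{3f} = \nu_3, \quad \lambda_{4f} = \lambda_{5f} = \lambda_{6f} = 0, \\
    &\lambda_{7f} = -\nu_1, \quad \lambda_{8f} = -\nu_2, \quad \lambda_{9f} = -\nu_3, \quad \lambda_{10f} = \lambda_{11f} = \lambda_{12f} = 0
    \end{aligned}
    \tag{41}
    \]

    or equivalently

    \[ \lambda_{4f} = \lambda_{5f} = \lambda_{6f} = 0, \tag{42} \]

    \[ \lambda_{10f} = \lambda_{11f} = \lambda_{12f} = 0, \tag{43} \]

    \[ \lambda_{1f} + \lambda_{7f} = 0, \tag{44} \]

    \[ \lambda_{2f} + \lambda_{8f} = 0, \tag{45} \]

    \[ \lambda_{3f} + \lambda_{9f} = 0. \tag{46} \]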

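    Then, for the control variables the necessary conditions (38)–(39) yield

    \[ \left[ \alpha_P^* \;\; \beta_P^* \right]^T = \arg\min_{u_P} \left[ \lambda_4 \frac{\sin\alpha_P \cos\beta_P}{v_P} + \lambda_5 \cos\alpha_P \cos\beta_P + \lambda_6 \frac{\sin\beta_P}{v_P \cos\gamma_P} \right], \tag{47} \]

    \[ \left[ \alpha_E^* \;\; \beta_E^* \right]^T = \arg\max_{u_E} \left[ \lambda_{10} \frac{\sin\alpha_E \cos\beta_E}{v_E} + \lambda_{11} \cos\alpha_E \cos\beta_E + \lambda_{12} \frac{\sin\beta_E}{v_E \cos\gamma_E} \right]. \tag{48} \]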

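    These relations are formally identical to those used to determine the feedback strategies as
    functions of the partial derivatives of $V$, with the only difference that the adjoint variables
    $\{\lambda_j\}_{j=4,5,6,10,11,12}$ replace $\{V_j\}_{j=4,5,6,10,11,12}$. Therefore, the optimal open-loop control laws
    are given by

    \[ \beta_P = \arcsin\left\{ -\frac{\lambda_6}{v_P \cos\gamma_P} \left[ \left( \frac{\lambda_4}{v_P} \right)^2 + \lambda_5^2 + \left( \frac{\lambda_6}{v_P \cos\gamma_P} \right)^2 \right]^{-1/2} \right\}, \tag{49} \]

    \[ \sin\alpha_P = -\frac{\lambda_4}{v_P \cos\beta_P} \left[ \left( \frac{\lambda_4}{v_P} \right)^2 + \lambda_5^2 + \left( \frac{\lambda_6}{v_P \cos\gamma_P} \right)^2 \right]^{-1/2}, \tag{50} \]

    \[ \cos\alpha_P = -\frac{\lambda_5}{\cos\beta_P} \left[ \left( \frac{\lambda_4}{v_P} \right)^2 + \lambda_5^2 + \left( \frac{\lambda_6}{v_P \cos\gamma_P} \right)^2 \right]^{-1/2}, \tag{51} \]

    \[ \beta_E = \arcsin\left\{ \frac{\lambda_{12}}{v_E \cos\gamma_E} \left[ \left( \frac{\lambda_{10}}{v_E} \right)^2 + \lambda_{11}^2 + \left( \frac{\lambda_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-1/2} \right\}, \tag{52} \]

    \[ \sin\alpha_E = \frac{\lambda_{10}}{v_E \cos\beta_E} \left[ \left( \frac{\lambda_{10}}{v_E} \right)^2 + \lambda_{11}^2 + \left( \frac{\lambda_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-1/2}, \tag{53} \]

    \[ \cos\alpha_E = \frac{\lambda_{11}}{\cos\beta_E} \left[ \left( \frac{\lambda_{10}}{v_E} \right)^2 + \lambda_{11}^2 + \left( \frac{\lambda_{12}}{v_E \cos\gamma_E} \right)^2 \right]^{-1/2}. \tag{54} \]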
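
    The control laws (49)–(54) amount to pointing the thrust opposite to (pursuer) or along (evader) a weighted costate vector. A minimal Python sketch follows, assuming the costates are passed in the order conjugate to flight path angle, velocity, and heading (components 4, 5, 6 for P and 10, 11, 12 for E); the function names and the `maximize` flag are hypothetical conveniences, not part of the paper.

```python
import math


def optimal_thrust_angles(lam_gamma, lam_v, lam_zeta, v, gamma, maximize=False):
    """Thrust angles (alpha, beta) that minimize (pursuer) or maximize (evader)
    lam_gamma*sin(a)cos(b)/v + lam_v*cos(a)cos(b) + lam_zeta*sin(b)/(v cos(gamma))."""
    # Weighted costate vector in the (cos a cos b, sin a cos b, sin b) basis.
    w = (lam_v, lam_gamma / v, lam_zeta / (v * math.cos(gamma)))
    norm = math.sqrt(sum(c * c for c in w))
    if norm == 0.0:
        raise ValueError("control is undetermined when all three costates vanish")
    sign = 1.0 if maximize else -1.0
    tx, ty, tz = (sign * c / norm for c in w)
    beta = math.asin(max(-1.0, min(1.0, tz)))   # beta in [-pi/2, pi/2]
    alpha = math.atan2(ty, tx)                  # cos(beta) >= 0 fixes the quadrant
    return alpha, beta


def thrust_direction(alpha, beta):
    """Unit thrust vector in the same basis used above."""
    return (math.cos(alpha) * math.cos(beta),
            math.sin(alpha) * math.cos(beta),
            math.sin(beta))


if __name__ == "__main__":
    a, b = optimal_thrust_angles(0.3, -0.5, 0.2, v=1.2, gamma=0.1)
    print(a, b, thrust_direction(a, b))
```

    Using `atan2` resolves the quadrant of the in-plane angle from its sine and cosine in one step, which is equivalent to evaluating (50) and (51) together.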


    Lastly, the transversality condition (not reported for the sake of brevity) holds, because the

    terminal time tf is unspecified.

    3 Method of Solution

    The semi-direct collocation with nonlinear programming (semi-DCNLP) algorithm converts

    the two-sided optimization problem into a single-objective one, by employing the analytical

    necessary conditions for optimality related to one of the two players [13]. Then the semi-

    DCNLP algorithm transforms the continuous optimization problem into a discrete problem,

    in which the system governing equations are translated into nonlinear algebraic (constraint)

    equations involving the discrete parameters. The problem thus becomes a nonlinear pro-

    gramming (NLP) problem. The numerical NLP solver must be initialized with a guess or

    approximate solution (of reasonably good quality) if it is to converge to an accurate open-loop

    saddle-point equilibrium solution. The guess solution affects the semi-DCNLP convergence. As the costate variables usually have a non-intuitive meaning, the selection of first

    attempt values for them is very challenging, especially for large problems. In this research,

    as well as in other papers published in the literature (cf. [13–16]), a genetic (or evolutionary)

    algorithm is employed as a preprocessing technique to overcome this difficulty. The use of

    a genetic algorithm (GA) is intended to provide a first attempt approximate solution to

    the problem. Then this guess is employed by the semi-DCNLP algorithm to generate an

    actual, accurate (open-loop) saddle-point equilibrium solution. This section describes both

    the evolutionary preprocessing and the semi-DCNLP algorithm.

    3.1 Genetic Algorithm Preprocessing

    Genetic algorithms represent a systematic approach to providing a starting guess for the

    semi-DCNLP algorithm because they do not require any a priori information about the

    solution. The unknown parameters involved in the problem form an individual. A popu-

    lation is composed of a large number of individuals. Each individual corresponds to a set

    of values of the unknown parameters and is evaluated with respect to a given objective

    (or fitness) function. The starting population is randomly generated and suitable reproduction

    mechanisms (such as crossover, elitism, and mutation, cf. [22, 23]) are employed to

    improve the population generation after generation. After a specified (large) number of generations, the GA is expected to produce the best individual, which contains the parameters

    associated with the optimal approximate solution to the problem. Genetic algorithms are

    characterized by a poor numerical accuracy, due to the representation of parameters through

    a finite number of digits. This can be ameliorated in part by using real genetic algorithms.

    Yet, this property is not a limitation when they are employed as preprocessing techniques,

    i.e. just to provide a reasonable guess for the subsequent use of the semi-DCNLP.

    In this study, the GA preprocessing considers all the equations that form the TPBVP

    associated with the zero-sum game. In particular:

    • each individual is composed of all the unknown values of the costate variables at the
      initial time $t_0$ (= 0), and includes also the (unknown) time of flight:

    \[ \{\lambda_i(0)\}_{i=1,\ldots,12}; \quad t_f \tag{55} \]

    • the control variables are expressed as functions of the state and costate variables through
      (49)–(54);


    • the state equations (1)–(6) and the adjoint equations for $\lambda_P(t)$ and $\lambda_E(t)$ are integrated
      numerically for each individual;

    • the 13 boundary conditions (12)–(14), (40), and (42)–(46) are assimilated to scalar
      constraints of the form $c_l(x_{Pf}, x_{Ef}, \lambda_{Pf}, \lambda_{Ef}, t_f) = 0$ ($l = 1, \ldots, 13$);

    • the following functional, related to constraint violation, represents the objective function
      $J$ for the GA, and is evaluated for each individual:

    \[ J = \sum_{l=1}^{9} k_l c_l^2 \quad (k_l > 0). \tag{56} \]

    It is worth remarking that the number of unknown parameters is exactly equal to the number

    of constraints (i.e. 13). In this research the C package NSGA-II, developed by Deb [23], has

    been employed, with the following settings: a population composed of 500 individuals, and

    100 generations to select the best individual.
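
    The preprocessing idea (minimize a weighted sum of squared constraint residuals with an evolutionary search) can be sketched in a few lines. This is not NSGA-II and does not integrate the state and adjoint equations: `toy_residuals` is a hypothetical two-unknown stand-in for the terminal residuals, and `run_ga` is a plain single-objective real-coded GA with elitism, arithmetic crossover, and Gaussian mutation, written only to show the structure.

```python
import random


def constraint_violation(residuals, weights=None):
    """Weighted sum of squared constraint residuals, in the spirit of (56)."""
    if weights is None:
        weights = [1.0] * len(residuals)
    if any(k <= 0.0 for k in weights):
        raise ValueError("all weights k_l must be positive")
    return sum(k * c * c for k, c in zip(weights, residuals))


def toy_residuals(individual):
    """Hypothetical stand-in for the boundary-condition residuals of the TPBVP."""
    a, b = individual
    return [a + b - 1.0, a - b]


def run_ga(fitness, n_genes, pop_size=40, generations=60, bounds=(-2.0, 2.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []                                   # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 4]               # elitism: best quarter survives
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            w = rng.random()
            child = [w * x + (1.0 - w) * y for x, y in zip(p1, p2)]  # crossover
            if rng.random() < 0.3:                                    # mutation
                i = rng.randrange(n_genes)
                child[i] = min(hi, max(lo, child[i] + rng.gauss(0.0, 0.1)))
            children.append(child)
        pop = elite + children
    return min(pop, key=fitness), history


if __name__ == "__main__":
    best, history = run_ga(lambda ind: constraint_violation(toy_residuals(ind)), 2)
    print(best, history[0], history[-1])
```

    Because the elite individuals are carried over unchanged, the best fitness cannot deteriorate from one generation to the next, which is the property that matters when the result is only used as a starting guess.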

    3.2 Semi-DCNLP Algorithm

    The semi-direct collocation with nonlinear programming (semi-DCNLP) algorithm converts

    the dual-sided optimization problem, formulated as zero-sum game, into a single-objective

    optimization problem. It is based on the following points:

    • the control of the evader is found from the necessary conditions (52)–(54), and can be
      expressed as $u_E = u_E(x_E, \lambda_E)$;

    • the control of the pursuer is found numerically;

    • an extended state $\tilde{x}$ ($n$-dimensional vector) is defined with the inclusion of the adjoint
      variables of the evader

    \[ \tilde{x} = \left[ x_P^T \;\; x_E^T \;\; \lambda_E^T \right]^T; \tag{57} \]

    • a new control variable, including $u_P$ only, is introduced: $\tilde{u} = u_P$ ($m$-dimensional vector).

    Hence, the extended state equations for $\tilde{x}$ can be formally written by taking into account the
    state equations (11) and the adjoint equations (35) for $\lambda_E(t)$:

    \[ \dot{\tilde{x}} = \left[ f_P^T \;\; \tilde{f}_E^T \;\; -\lambda_E^T \frac{\partial f_E}{\partial x_E} \right]^T = \tilde{f}, \tag{58} \]

    where $\tilde{f}_E^T = f_E^T(x_E, u_E(x_E, \lambda_E, t), t) = \tilde{f}_E^T(x_E, \lambda_E, t)$.

    The extended boundary conditions include the original boundary conditions of the problem
    (15) and the boundary conditions related to the adjoint variables of the evader, collected
    in $\psi_{EXT}$:

    \[ \tilde{\psi} = \left[ \psi^T \;\; \psi_{EXT}^T \right]^T = 0. \tag{59} \]

    The additional term $\psi_{EXT}$ consists of the boundary conditions related to $\lambda_E$ only, after
    eliminating the components of $\nu$ from (36)–(37). For the problem of interest $\psi_{EXT}$ includes the
    left hand side of (43) and (40) (after introducing (42) and (44)–(46)), i.e. $\psi_{EXT}$ has 4 components.
    As an immediate consequence, the $q$-dimensional vector $\tilde{\psi}$ has seven components
    ($q = 7$).

    With these steps the zero-sum game has been converted into the following optimal control
    problem:

    \[ \min_{\tilde{u}} J \quad \text{subject to (58) and (59)}. \tag{60} \]


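    The corresponding extended Hamiltonian is

    \[ \tilde{H} = \tilde{\lambda}^T \tilde{f} = \lambda_{P(e)}^T f_P + \lambda_{E(e)}^T \tilde{f}_E - \lambda_{\lambda(e)}^T \left[ \frac{\partial f_E}{\partial x_E} \right]^T \lambda_E, \tag{61} \]

    where $\tilde{\lambda} = [\lambda_{P(e)}^T \;\; \lambda_{E(e)}^T \;\; \lambda_{\lambda(e)}^T]^T$. The extended terminal function now includes also $\psi_{EXT}$:

    \[ \tilde{\Phi} = t_f + \tilde{\nu}^T \tilde{\psi} = t_f + \nu^T \psi + \nu_{EXT}^T \psi_{EXT}, \tag{62} \]

    where $\tilde{\nu} = [\nu^T \;\; \nu_{EXT}^T]^T$. The solution of the problem (60) also satisfies the necessary
    conditions for an open-loop representation of a saddle-point equilibrium solution if the following
    condition holds (cf. Appendix A):

    \[ \nu_{EXT} = 0. \tag{63} \]

    The continuous problem (60) is then discretized in time through collocation [24, 25] and
    solved numerically. More specifically: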

    • the time interval [t0, tf] is partitioned into N subintervals (N = 10 in this study);

    • in each subinterval, the state and the control variables are discretized in time (i.e. only
      their values at discrete times are employed by the algorithm);

    • equations (58) are translated into nonlinear algebraic equations by means of high-order
      quadrature rules (in this research the highly accurate Gauss–Lobatto fifth-order quadrature
      rules, cf. [25]).

    The resulting nonlinear programming problem is solved by a numerical solver (in this work

    the Fortran package NPSOL [26]).
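
    How a differential equation turns into an algebraic constraint can be seen on a single subinterval. The sketch below deliberately swaps in the simpler Hermite–Simpson rule (Simpson's rule is the three-point Gauss–Lobatto quadrature) instead of the fifth-order rule used in this work, whose coefficients are given in [25]. The function name and the scalar-state restriction are assumptions made for brevity.

```python
import math


def hermite_simpson_defect(f, t0, x0, t1, x1):
    """Collocation defect on [t0, t1] for the scalar ODE x' = f(t, x).

    The NLP solver drives this residual to zero on every subinterval,
    which enforces the dynamics without explicit time integration.
    """
    h = t1 - t0
    f0, f1 = f(t0, x0), f(t1, x1)
    # Midpoint state from the cubic Hermite interpolant through both endpoints.
    xc = 0.5 * (x0 + x1) + h / 8.0 * (f0 - f1)
    fc = f(t0 + 0.5 * h, xc)
    # Simpson quadrature of the dynamics compared with the state increment.
    return x1 - x0 - h / 6.0 * (f0 + 4.0 * fc + f1)


if __name__ == "__main__":
    # x(t) = t^3 solves x' = 3 t^2; the rule is exact for this polynomial case.
    print(hermite_simpson_defect(lambda t, x: 3.0 * t * t, 0.2, 0.2 ** 3, 0.7, 0.7 ** 3))
    # x(t) = exp(-t) solves x' = -x; the defect is small but not exactly zero.
    print(hermite_simpson_defect(lambda t, x: -x, 0.0, 1.0, 0.1, math.exp(-0.1)))
```

    In a full transcription one such defect is written per state component and per subinterval, and the node values of states and controls become the NLP unknowns.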

    With the fifth-order Gauss–Lobatto quadrature rules, each state component is represented
    by the values at the initial, at the central, and at the terminal point of each subinterval.
    Therefore, the extended state $\tilde{x}$ is represented by (2nN + n) parameters (378 in this study). Each
    control component is represented by the respective values at the initial, at the central, and
    at the terminal point of each subinterval, and also by two additional values corresponding to
    two collocation points (cf. [25] for further details). Therefore, the control vector $\tilde{u}$ is
    represented through (4mN + m) parameters (82 in this work). The fifth-order Gauss–Lobatto
    rules allow translating the continuous problem into 2nN nonlinear constraints (360 in this
    research). The NLP solver is expected to yield the optimal values of the parameters, and then
    the state components are interpolated through fifth-degree polynomials, which represent the
    continuous accurate approximations of their optimal time histories.
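
    The quoted problem sizes follow directly from the counting formulas. In the short check below, n = 18 (six states for each player plus the six evader adjoints) and m = 2 (the two pursuer thrust angles) are inferred from the quoted totals with N = 10; the function name is hypothetical.

```python
def nlp_size(n, m, n_intervals):
    """Parameter and constraint counts for the fifth-order Gauss-Lobatto
    transcription, using the formulas quoted in the text."""
    return {
        "state_parameters": 2 * n * n_intervals + n,
        "control_parameters": 4 * m * n_intervals + m,
        "defect_constraints": 2 * n * n_intervals,
    }


if __name__ == "__main__":
    # n = 6 + 6 + 6 extended states, m = 2 pursuer controls, N = 10 subintervals.
    print(nlp_size(n=18, m=2, n_intervals=10))
```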

    4 Numerical Results

    In this study canonical units have been employed; the Earth radius RE is the distance

    unit (DU), whereas the time unit is such that the Earth gravitational parameter $\mu_E$ equals

    1 DU³/TU². Hence, 1 DU = 6378.165 km and 1 TU = 806.8 sec. In canonical units,

    1 DU/TU² ≡ 1 g = 9.798 × 10⁻³ km/sec².

    Three problems have been considered and solved through the method described in

    Sect. 3:

    (a) the optimal interception of an optimally evading spacecraft by a pursuing spacecraft;

    (b) the optimal interception of an optimally evading missile by a pursuing missile;

    (c) the optimal interception of an optimally evading spacecraft by a pursuing missile.
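    The canonical units introduced at the start of this section can be checked numerically. The sketch below is an illustration only: the value of the Earth gravitational parameter in km^3/s^2 is an assumption (it is not given in the text), and the function name is hypothetical.

```python
import math

MU_EARTH_KM3_S2 = 398600.4  # assumed value of the Earth gravitational parameter
DU_KM = 6378.165            # distance unit quoted in the text (Earth radius)


def canonical_units(mu=MU_EARTH_KM3_S2, du=DU_KM):
    """Return (TU in seconds, 1 DU/TU^2 in km/s^2) such that mu = 1 DU^3/TU^2."""
    tu = math.sqrt(du**3 / mu)  # time unit that normalizes mu to unity
    accel = du / tu**2          # acceleration unit, equal to mu / du^2
    return tu, accel


tu_s, g_km_s2 = canonical_units()
print(f"1 TU = {tu_s:.1f} s, 1 DU/TU^2 = {g_km_s2:.3e} km/s^2")
```

    Under the assumed gravitational parameter, the result agrees with the figures quoted above (806.8 sec and 9.798 x 10^-3 km/sec^2) to the stated precision.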


    Fig. 4 Osculating orbital elements of the pursuing spacecraft

    Fig. 5 Osculating orbital elements of the evading spacecraft

    $\Omega_P(t_0) = 60$ deg; $\omega_P(t_0) = 127.5$ deg;
    $M_P(t_0) = 116.9$ deg; $(T_P/m_P) = 0.1\,g$;
    $a_E(t_0) = 6678.165$ km; $e_E(t_0) = 0$; $i_E(t_0) = 56.5$ deg;
    $\Omega_E(t_0) = 0$ deg; $\theta_E(t_0) = \omega_E(t_0) + M_E(t_0) = 39.8$ deg;
    $(T_E/m_E) = 0.05\,g$.


    Fig. 6 Preprocessed (from the GA optimizer) and optimal (from the sDCNLP optimizer) control laws of the pursuer (a) and of the evader (b)

    Fig. 7 Missile vs. missile: saddle-point trajectories leading to interception

    Figure 10 illustrates the preprocessed and the optimal control laws of the two players. Figure 11 portrays the corresponding saddle-point trajectories, whereas Figs. 12 and 13 show the time histories of the osculating orbital elements. The initial altitude (at $t_0$) of the pursuing missile is 100 km. Interception occurs in 12.7 minutes at an altitude of 317.2 km.

    5 Concluding Remarks

    Combat scenarios involving two competing space vehicles are best modeled as zero-sum dynamic games. Algorithms devoted to the numerical solution of optimal control problems cannot be employed directly to solve zero-sum games. This research describes an effective numerical method tailored to solving zero-sum games with separable dynamics: the semi-direct collocation with nonlinear programming algorithm (semi-DCNLP). More specifically, under the assumption that they exist, this work addresses the determination of open-loop representations of feedback saddle-point equilibrium solutions. These representations are sought by employing the necessary conditions that hold for open-loop strategies, under the plausible conjecture that for the game at hand the value function is continuously differentiable over the entire state space. The semi-DCNLP converts the zero-sum game into an optimal control problem, and then solves this converted problem employing collocation. The method under consideration has already been successfully applied to a variety of aerospace problems [13-16] and here is used to solve the problem of optimal interception of an optimally evasive target by a pursuing spacecraft or missile. The two space vehicles are assumed to start maneuvering simultaneously and each of them is supposed to possess complete and instantaneous information on the state of the opponent player. In real life the evader is unlikely to possess this information, which it needs to execute the optimal evasion. However, the solution from game theory for the optimal strategy of the evading target can provide, from a practical point of view, the worst-case scenario faced by the pursuer, which is very useful to know.

    Fig. 8 Osculating orbital elements of the pursuing missile

    Fig. 9 Osculating orbital elements of the evading missile

    Fig. 10 Preprocessed (from the GA optimizer) and optimal (from the sDCNLP optimizer) control laws of the pursuer (a) and of the evader (b)

    Fig. 11 Missile vs. spacecraft: saddle-point trajectories leading to interception

    The semi-DCNLP requires a reasonable guess for the non-intuitive adjoint variables of one of the two players. This guess is provided through a genetic algorithm and in the three examples that have been solved it is occasionally found to have only a poor correspondence to the final, converged solution. This circumstance testifies to the effectiveness and robustness of the semi-DCNLP, which apparently needs only a feasible (approximate) solution as a guess, i.e. a solution that fulfills the conditions for termination with a fair accuracy.

    Only a small number of cases of optimal interception are solved here. Of course the number of possible initial conditions and thrust capabilities for the vehicles is infinite, so that even a large number of solved cases would not be much more useful. The solved cases represent three possible combat scenarios involving missiles and spacecraft and prove the validity and usefulness of the analysis, as well as the effectiveness of the method of solution.

    Fig. 12 Osculating orbital elements of the pursuing missile

    Fig. 13 Osculating orbital elements of the evading spacecraft


    Appendix A: Formal Conversion of a Zero-Sum Game into an Optimal Control Problem

    This section proves that the solution of the optimal control problem (60) yields an open-loop representation of a saddle-point equilibrium solution of the original zero-sum game if the condition (63) holds.

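    The following necessary conditions for optimality are associated with the optimal control problem under consideration:

    $$\dot{\lambda}_P^{(e)} = -\left[\frac{\partial H}{\partial x_P}\right]^T = -\left[\frac{\partial f_P}{\partial x_P}\right]^T \lambda_P^{(e)}, \tag{64}$$

    $$\dot{\lambda}_E^{(e)} = -\left[\frac{\partial H}{\partial x_E}\right]^T = -\left[\frac{\partial f_E}{\partial x_E}\right]^T \lambda_E^{(e)} + \left\{\frac{\partial}{\partial x_E}\left(\left[\frac{\partial f_E}{\partial x_E}\right]^T \lambda_E\right)\right\}^T \mu^{(e)}, \tag{65}$$

    $$\dot{\mu}^{(e)} = -\left[\frac{\partial H}{\partial \lambda_E}\right]^T = \left[\frac{\partial f_E}{\partial x_E}\right] \mu^{(e)} \tag{66}$$

    with boundary conditions given by (59) and including also

    $$\mu_f^{(e)} = \left[\frac{\partial \Phi}{\partial \lambda_{Ef}}\right]^T = \left[\frac{\partial \psi_{EXT}}{\partial \lambda_{Ef}}\right]^T \nu_{EXT}. \tag{67}$$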

    If $\nu_{EXT} = 0$, then, due to homogeneity of (66), $\mu^{(e)} = 0 \;\; \forall t$, and the Hamiltonian H reduces to

    $$H = \lambda^T f = \lambda_P^{(e)T} f_P + \lambda_E^{(e)T} f_E \tag{68}$$

    whereas the function of terminal conditions simplifies to

    $$\Phi = t_f + \nu^T \psi. \tag{69}$$

    These two expressions are formally identical to the corresponding expressions that hold in the definition of the necessary conditions for an open-loop saddle-point equilibrium solution (33). As the same differential equations and boundary conditions hold for $\lambda_P$ and $\lambda_P^{(e)}$ and for $\lambda_E$ and $\lambda_E^{(e)}$, these pairs of variables are identical, i.e. $\lambda_P \equiv \lambda_P^{(e)}$ and $\lambda_E \equiv \lambda_E^{(e)}$.

    This circumstance implies that solving the optimal control problem (60) is equivalent to identifying an open-loop representation of a saddle-point equilibrium solution for the original zero-sum game, provided that the condition (63) holds.

    It is worth remarking that the same analytical developments can be derived if $\lambda_P$ is considered for inclusion in x. This means that the roles of P and E are interchangeable in this context.

    Appendix B: Relations Between Orbital Elements and State Components

    State components $(r_i, \xi_i, \phi_i, \gamma_i, v_i, \zeta_i)$ and orbital elements $(a_i, e_i, i_i, \Omega_i, \omega_i, M_i)$ each represent a set of six variables that describe the dynamic state of each spacecraft. Once the orbital elements are known, the state components are unequivocally determined and vice versa. This appendix deals with the formal derivation of all the relationships needed to calculate the state components from the orbital elements and vice versa.

    Fig. 14 Set of rotation angles associated with $R_A$

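    First of all, the angular variables are defined over the following ranges:

    $$-\pi \le \xi_i < \pi, \quad -\frac{\pi}{2} \le \phi_i \le \frac{\pi}{2}, \quad -\pi \le \zeta_i < \pi, \quad -\frac{\pi}{2} \le \gamma_i \le \frac{\pi}{2}, \tag{70}$$

    $$-\pi \le \Omega_i < \pi, \quad 0 \le i_i \le \pi, \quad -\pi \le \omega_i < \pi, \quad -\pi \le M_i < \pi. \tag{71}$$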

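    With reference to Fig. 14, the Earth-centered inertial frame is identified by $(\hat{c}_1, \hat{c}_2, \hat{c}_3)$: $\hat{c}_1$ is the vernal axis and $(\hat{c}_1, \hat{c}_2)$ lie in the Earth equatorial plane. This frame and the orbital frame $(\hat{r}_i, \hat{\theta}_i, \hat{h}_i)$ (where $\hat{h}_i$ denotes the unit vector aligned with the specific angular momentum $h_i = r_i \times v_i$, cf. Fig. 14) are related through the rotation matrix $R_A$, defined by

    $$\begin{bmatrix} \hat{r}_i \\ \hat{\theta}_i \\ \hat{h}_i \end{bmatrix} = \underbrace{\begin{bmatrix} c_{\phi_i} c_{\xi_i} & c_{\phi_i} s_{\xi_i} & s_{\phi_i} \\ -c_{\zeta_i} s_{\xi_i} - s_{\zeta_i} s_{\phi_i} c_{\xi_i} & c_{\zeta_i} c_{\xi_i} - s_{\zeta_i} s_{\phi_i} s_{\xi_i} & s_{\zeta_i} c_{\phi_i} \\ s_{\zeta_i} s_{\xi_i} - c_{\zeta_i} s_{\phi_i} c_{\xi_i} & -s_{\zeta_i} c_{\xi_i} - c_{\zeta_i} s_{\phi_i} s_{\xi_i} & c_{\zeta_i} c_{\phi_i} \end{bmatrix}}_{R_A} \begin{bmatrix} \hat{c}_1 \\ \hat{c}_2 \\ \hat{c}_3 \end{bmatrix} \tag{72}$$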

    where $s_{[\cdot]} = \sin[\cdot]$ and $c_{[\cdot]} = \cos[\cdot]$. The rotation $R_A$ is written in terms of the angles $\xi_i$, $\phi_i$, and $\zeta_i$, and results from the composition of three elementary rotations: the first (counterclockwise by the angle $\xi_i$) about axis 3, the second (clockwise by the angle $\phi_i$) about axis 2, the third (counterclockwise by the angle $\zeta_i$) about axis 1.

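    Similarly, with reference to Fig. 15, the orbital frame $(\hat{r}_i, \hat{\theta}_i, \hat{h}_i)$ can be obtained from the inertial frame $(\hat{c}_1, \hat{c}_2, \hat{c}_3)$ through an alternative rotation $R_B$, written in terms of the angles $\Omega_i$, $i_i$, and $\theta_i$, where $\theta_i$ $(= \omega_i + f_i)$ is the argument of latitude ($f_i$ denotes the true anomaly):

    $$\begin{bmatrix} \hat{r}_i \\ \hat{\theta}_i \\ \hat{h}_i \end{bmatrix} = \underbrace{\begin{bmatrix} c_{\theta_i} c_{\Omega_i} - s_{\theta_i} c_{i_i} s_{\Omega_i} & c_{\theta_i} s_{\Omega_i} + s_{\theta_i} c_{i_i} c_{\Omega_i} & s_{\theta_i} s_{i_i} \\ -s_{\theta_i} c_{\Omega_i} - c_{\theta_i} c_{i_i} s_{\Omega_i} & c_{\theta_i} c_{i_i} c_{\Omega_i} - s_{\theta_i} s_{\Omega_i} & c_{\theta_i} s_{i_i} \\ s_{i_i} s_{\Omega_i} & -s_{i_i} c_{\Omega_i} & c_{i_i} \end{bmatrix}}_{R_B} \begin{bmatrix} \hat{c}_1 \\ \hat{c}_2 \\ \hat{c}_3 \end{bmatrix}. \tag{73}$$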

    Fig. 15 Set of rotation angles associated with $R_B$

    This rotation results from the composition of three elementary rotations: the first (counterclockwise by the angle $\Omega_i$) about axis 3, the second (counterclockwise by the angle $i_i$) about axis 1, the third (counterclockwise by the angle $\theta_i$) about axis 3. The two matrices $R_A$ and $R_B$ must coincide, which implies that their corresponding elements must be identical. As a result, one obtains

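    $$\cos\phi_i \cos\xi_i = \cos\theta_i \cos\Omega_i - \sin\theta_i \cos i_i \sin\Omega_i, \tag{74}$$

    $$\cos\phi_i \sin\xi_i = \cos\theta_i \sin\Omega_i + \sin\theta_i \cos i_i \cos\Omega_i, \tag{75}$$

    $$\sin\phi_i = \sin\theta_i \sin i_i, \tag{76}$$

    $$\sin\zeta_i \cos\phi_i = \cos\theta_i \sin i_i, \tag{77}$$

    $$\cos\zeta_i \cos\phi_i = \cos i_i, \tag{78}$$

    $$-\cos\zeta_i \sin\phi_i \cos\xi_i + \sin\zeta_i \sin\xi_i = \sin i_i \sin\Omega_i, \tag{79}$$

    $$-\cos\zeta_i \sin\phi_i \sin\xi_i - \sin\zeta_i \cos\xi_i = -\sin i_i \cos\Omega_i. \tag{80}$$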
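    The equivalence of the two rotation sequences can be checked numerically. The sketch below is an illustration under stated assumptions: it builds both matrices from elementary rotations (axis 3, axis 2 clockwise, axis 1 for the first sequence; axis 3, axis 1, axis 3 for the second), recovers the three angles of the first sequence from the first row and the third column of the second matrix, and compares the results. All function and variable names (`xi`, `phi`, `zeta`, `raan`, `inc`, `theta`) are hypothetical labels chosen here, and the test angles are arbitrary.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot1(angle):
    """Counterclockwise elementary (frame) rotation about axis 1."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def rot2(angle):
    """Counterclockwise elementary (frame) rotation about axis 2."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


def rot3(angle):
    """Counterclockwise elementary (frame) rotation about axis 3."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def r_a(xi, phi, zeta):
    """Axis 3 by xi, then axis 2 clockwise by phi, then axis 1 by zeta."""
    return matmul(rot1(zeta), matmul(rot2(-phi), rot3(xi)))


def r_b(raan, inc, theta):
    """Axis 3 by raan, then axis 1 by inc, then axis 3 by theta."""
    return matmul(rot3(theta), matmul(rot1(inc), rot3(raan)))


def angles_from_elements(raan, inc, theta):
    """Recover (xi, phi, zeta) from the first row and third column of R_B."""
    rb = r_b(raan, inc, theta)
    phi = math.asin(rb[0][2])              # sine of phi is element (1,3)
    xi = math.atan2(rb[0][1], rb[0][0])    # elements (1,2) and (1,1)
    zeta = math.atan2(rb[1][2], rb[2][2])  # elements (2,3) and (3,3)
    return xi, phi, zeta


# Arbitrary test angles (radians); not taken from any case in the paper.
raan, inc, theta = math.radians(60.0), math.radians(56.5), math.radians(39.8)
xi, phi, zeta = angles_from_elements(raan, inc, theta)
ra, rb = r_a(xi, phi, zeta), r_b(raan, inc, theta)
max_diff = max(abs(ra[i][j] - rb[i][j]) for i in range(3) for j in range(3))
print(f"largest element-wise difference: {max_diff:.2e}")
```

    Using `atan2` rather than a single inverse sine or cosine keeps the recovered angles in the correct quadrant, provided the cosine of `phi` is not zero.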

    B.1

    If the orbital elements $(a_i, e_i, i_i, \Omega_i, \omega_i, M_i)$ are specified, the state components $(r_i, v_i, \gamma_i, \xi_i, \phi_i, \zeta_i)$ can be deduced in the following fashion. First of all, the numerical solution of Kepler's equation produces the eccentric anomaly $E_i$:

    $$M_i = E_i - e_i \sin E_i \;\Rightarrow\; E_i \tag{81}$$

    which is directly related to the true anomaly $f_i$ through the well-known formulas:

    $$\sin f_i = \frac{\sin E_i \sqrt{1 - e_i^2}}{1 - e_i \cos E_i} \quad \text{and} \quad \cos f_i = \frac{\cos E_i - e_i}{1 - e_i \cos E_i}. \tag{82}$$

    The polar equation of elliptic orbits yields the radius $r_i$:

    $$r_i = \frac{a_i (1 - e_i^2)}{1 + e_i \cos f_i}. \tag{83}$$

    Then, from the vis viva equation [27] one obtains the velocity

    $$v_i = \sqrt{\frac{2\mu_E}{r_i} - \frac{\mu_E}{a_i}}. \tag{84}$$

    The flight path angle $\gamma_i$ can be deduced from the radial component of velocity, $v_{ri}$:

    $$v_{ri} = \sqrt{\frac{\mu_E}{a_i (1 - e_i^2)}}\; e_i \sin f_i = v_i \sin\gamma_i \;\Rightarrow\; \gamma_i. \tag{85}$$

    Finally, the three angles $(\xi_i, \phi_i, \zeta_i)$ can be calculated from $(\Omega_i, i_i, \theta_i)$ through the relationships (74)-(78).
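    The in-plane part of this procedure (Kepler's equation, true anomaly, radius, velocity, flight path angle) can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the Newton iteration, its starting guess, tolerance and iteration cap are assumptions suited to moderate eccentricities, the function names are hypothetical, and canonical units with a unit gravitational parameter are assumed by default.

```python
import math


def solve_kepler(mean_anomaly, ecc, tol=1e-12, max_iter=50):
    """Solve M = E - e sin E for E by Newton iteration (assumed solver choice)."""
    ecc_anomaly = mean_anomaly  # simple starting guess, adequate for moderate e
    for _ in range(max_iter):
        residual = ecc_anomaly - ecc * math.sin(ecc_anomaly) - mean_anomaly
        delta = residual / (1.0 - ecc * math.cos(ecc_anomaly))
        ecc_anomaly -= delta
        if abs(delta) < tol:
            break
    return ecc_anomaly


def elements_to_planar_state(a, ecc, mean_anomaly, mu=1.0):
    """Return (r, v, gamma, f) from (a, e, M) for an elliptic orbit."""
    e_anom = solve_kepler(mean_anomaly, ecc)
    denom = 1.0 - ecc * math.cos(e_anom)
    # True anomaly from the eccentric anomaly.
    sin_f = math.sin(e_anom) * math.sqrt(1.0 - ecc**2) / denom
    cos_f = (math.cos(e_anom) - ecc) / denom
    f = math.atan2(sin_f, cos_f)
    # Polar equation of the ellipse.
    r = a * (1.0 - ecc**2) / (1.0 + ecc * cos_f)
    # Vis viva equation.
    v = math.sqrt(2.0 * mu / r - mu / a)
    # Radial velocity component, then flight path angle.
    v_r = math.sqrt(mu / (a * (1.0 - ecc**2))) * ecc * sin_f
    gamma = math.asin(v_r / v)
    return r, v, gamma, f


# Example in canonical units (values are arbitrary, for illustration only).
r, v, gamma, f = elements_to_planar_state(a=1.5, ecc=0.3, mean_anomaly=1.0)
print(f"r = {r:.6f} DU, v = {v:.6f} DU/TU, gamma = {math.degrees(gamma):.3f} deg")
```

    The remaining three angles would follow from the orientation relationships between the two rotation matrices, which are not repeated in this sketch.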

    B.2

    If the state components $(r_i, v_i, \gamma_i, \xi_i, \phi_i, \zeta_i)$ are specified, the instantaneous (osculating) orbital elements $(a_i, e_i, i_i, \Omega_i, \omega_i, M_i)$ can be deduced in the following fashion. First of all, from the vis viva equation one obtains the semimajor axis (SMA) $a_i$:

    $$a_i = \frac{\mu_E r_i}{2\mu_E - r_i v_i^2}. \tag{86}$$

    In terms of the state components, the specific angular momentum magnitude $h_i$ is written as $h_i = r_i v_i \cos\gamma_i$. However, this quantity is also $h_i = \sqrt{\mu_E a_i (1 - e_i^2)}$. Hence, the eccentricity is given by:

    $$e_i = \sqrt{1 - \frac{(r_i v_i \cos\gamma_i)^2}{\mu_E a_i}}. \tag{87}$$

    The true anomaly $f_i$ can be found by considering the polar equation of the ellipse:

    $$r_i = \frac{a_i (1 - e_i^2)}{1 + e_i \cos f_i} \;\Rightarrow\; \cos f_i = \frac{a_i (1 - e_i^2) - r_i}{r_i e_i} \tag{88}$$

    in conjunction with the radial component of velocity:

    $$v_{ri} = \sqrt{\frac{\mu_E}{a_i (1 - e_i^2)}}\; e_i \sin f_i = v_i \sin\gamma_i \;\Rightarrow\; \sin f_i = \frac{v_i \sin\gamma_i}{e_i} \sqrt{\frac{a_i (1 - e_i^2)}{\mu_E}}. \tag{89}$$

    The counterparts of the relationships (82) yield the eccentric anomaly $E_i$:

    $$\sin E_i = \frac{\sin f_i \sqrt{1 - e_i^2}}{1 + e_i \cos f_i} \quad \text{and} \quad \cos E_i = \frac{\cos f_i + e_i}{1 + e_i \cos f_i}. \tag{90}$$

    Then, one can obtain the mean anomaly $M_i = E_i - e_i \sin E_i$. The three angles $(\Omega_i, i_i, \theta_i)$ can be calculated from $(\xi_i, \phi_i, \zeta_i)$ through the relationships (76)-(80). Finally, once $\theta_i$ and $f_i$ are known, the argument of perigee $\omega_i$ is simply $\omega_i = \theta_i - f_i$.
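    The in-plane part of the inverse procedure can be sketched in the same way. The code below is an illustration only: the function name is hypothetical, canonical units with a unit gravitational parameter are assumed by default, and the handling of near-circular orbits (where the true anomaly is undefined) is a convention chosen here, not one taken from the paper.

```python
import math


def planar_state_to_elements(r, v, gamma, mu=1.0, circular_tol=1e-7):
    """Return (a, e, f, M) from radius, velocity and flight path angle."""
    # Semimajor axis from the vis viva equation.
    a = mu * r / (2.0 * mu - r * v**2)
    # Eccentricity from the specific angular momentum h = r v cos(gamma).
    ecc_sq = 1.0 - (r * v * math.cos(gamma))**2 / (mu * a)
    ecc = math.sqrt(max(0.0, ecc_sq))  # guard against tiny negative round-off
    if ecc < circular_tol:
        # Assumed convention: anomalies are set to zero for a circular orbit.
        return a, 0.0, 0.0, 0.0
    p = a * (1.0 - ecc**2)  # semilatus rectum
    # True anomaly from the polar equation and the radial velocity.
    cos_f = (p - r) / (r * ecc)
    sin_f = v * math.sin(gamma) / ecc * math.sqrt(p / mu)
    f = math.atan2(sin_f, cos_f)
    # Eccentric anomaly, then mean anomaly from Kepler's equation.
    denom = 1.0 + ecc * cos_f
    sin_e = sin_f * math.sqrt(1.0 - ecc**2) / denom
    cos_e = (cos_f + ecc) / denom
    e_anom = math.atan2(sin_e, cos_e)
    mean_anomaly = e_anom - ecc * math.sin(e_anom)
    return a, ecc, f, mean_anomaly


# Build a state from known elements (arbitrary values), then invert it.
a0, e0, f0 = 1.5, 0.3, 2.0
p0 = a0 * (1.0 - e0**2)
r0 = p0 / (1.0 + e0 * math.cos(f0))
v0 = math.sqrt(2.0 / r0 - 1.0 / a0)
gamma0 = math.asin(math.sqrt(1.0 / p0) * e0 * math.sin(f0) / v0)
a1, e1, f1, m1 = planar_state_to_elements(r0, v0, gamma0)
print(f"a = {a1:.6f}, e = {e1:.6f}, f = {f1:.6f} rad, M = {m1:.6f} rad")
```

    Combining sine and cosine through `atan2` resolves the quadrant of both anomalies, which neither the polar equation nor the radial velocity determines on its own.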

    References

    1. Isaacs R (1965) Differential games. Wiley, New York
    2. Bryson AE, Ho YC (1975) Applied optimal control. Hemisphere, New York
    3. Basar T, Olsder GJ (1999) Dynamic noncooperative game theory. SIAM, Philadelphia
    4. Breakwell JV, Merz AW (1977) Minimum required capture radius in a coplanar model of the aerial combat problem. AIAA J 15(8):1089-1094
    5. Guelman M, Shinar J, Green A (1990) Qualitative study of a planar pursuit evasion game in the atmosphere. J Guid Control Dyn 13(6):1136-1142
    6. Hillberg C, Järmark B (1983) Pursuit-evasion between two realistic aircraft. AIAA Atmospheric Flight Mechanics Conference, Gatlinburg, Paper AIAA-83-2119
    7. Järmark B, Merz AW, Breakwell JV (1981) The variable speed tail-chase aerial combat problem. J Guid Control Dyn 4(3):323-328
    8. Breitner MH, Pesch HJ, Grimm W (1993) Complex differential games of pursuit-evasion type with state constraints, Part 1: Necessary conditions for open-loop strategies. J Optim Theory Appl 78(3):419-441
    9. Breitner MH, Pesch HJ, Grimm W (1993) Complex differential games of pursuit-evasion type with state constraints, Part 2: Numerical computation of open-loop strategies. J Optim Theory Appl 78(3):443-463
    10. Raivio T, Ehtamo H (2000) Visual aircraft identification as a pursuit-evasion game. J Guid Control Dyn 23(4):701-708
    11. Anderson GM, Grazier VW (1975) A closed-form solution for the barrier in pursuit-evasion problems between two low thrust orbital spacecraft and its application. In: Aerospace sciences meeting, Pasadena, CA, January 1975
    12. Kelley HJ, Cliff EM, Lutze FH (1981) Pursuit-evasion in orbit. J Astronaut Sci 29:277-288
    13. Horie K, Conway BA (2006) Optimal fighter pursuit-evasion maneuvers found via two-sided optimization. J Guid Control Dyn 29(1):105-112
    14. Horie K, Conway BA (2004) Genetic algorithm pre-processing for numerical solution of differential games problems. J Guid Control Dyn 27(6):1075-1078
    15. Pontani M, Conway BA (2008) Optimal interception of evasive missile warheads: numerical solution of the differential game. J Guid Control Dyn 31(4):1111-1122
    16. Pontani M, Conway BA (2009) Numerical solution of the three-dimensional orbital pursuit-evasion game. J Guid Control Dyn 32(2):474-487
    17. Cardaliaguet P, Quincampoix M, Saint-Pierre P (1995) Numerical methods for optimal control and differential games. Ceremade CNRS URQA 749, University of Paris-Dauphine, Paris, France
    18. Cardaliaguet P, Quincampoix M, Saint-Pierre P (1999) Set-valued numerical analysis for optimal control and differential games. In: Bardi M, Raghavan TES, Parthasarathy T (eds) Stochastic and differential games: theory and numerical methods. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 177-247
    19. Pesch HJ, Gabler I, Miesbach S (1995) Synthesis of optimal strategies for differential games by neural networks. In: Olsder GJ (ed) New trends in dynamic games and applications. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 111-141
    20. Breitner MH, Pesch HJ (1994) Reentry trajectory optimization under atmospheric uncertainty as a differential game. In: Basar T, Haurie A (eds) Advances in dynamic games and applications. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 70-86
    21. Lachner R, Breitner MH, Pesch HJ (2000) Differential game, numerical solution, and synthesis of strategies. In: Filar JR, Gaitsgory V, Mizukami K (eds) Advances in dynamic games and applications. Annals of the international society of dynamic games. Birkhäuser, Boston, pp 115-135
    22. Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Boston
    23. Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley, Chichester
    24. Hargraves CR, Paris SW (1987) Direct trajectory optimization using nonlinear programming and collocation. J Guid Control Dyn 10(4):338-342
    25. Herman AL, Conway BA (1996) Direct optimization using collocation based on high-order Gauss-Lobatto quadrature rules. J Guid Control Dyn 19(3):592-599
    26. Gill PE, Murray W, Saunders MA, Wright MH (1986) User's guide for NPSOL (Version 4.0): A Fortran package for nonlinear programming, SOL 86-2, Stanford University
    27. Prussing JE, Conway BA (1993) Orbital mechanics. Oxford University Press, New York