
MULTIPLE ROBOT COORDINATION FOR Istv´an Harmatiukacc.group.shef.ac.uk/proceedings/control2006/papers/f... · 2007-03-22 · MULTIPLE ROBOT COORDINATION FOR TARGET TRACKING USING


MULTIPLE ROBOT COORDINATION FOR TARGET TRACKING USING SEMI-COOPERATIVE STACKELBERG EQUILIBRIUM

István Harmati ∗

∗ Department of Control Engineering and Information Technology, Budapest University of Technology and Economics, Magyar Tudósok krt. 2/B422, Budapest, Hungary H-2117

Abstract: This paper addresses the multiple-robot coordination problem for target tracking, where collision-free coordination is expected to keep the team members together in a prescribed formation shape. When a robot chooses a moving direction to improve its own position, that individual decision may put other robots in a worse position; this is a typical feature of noncooperative game theory and makes global coordination nontrivial. The contribution of the paper is a game-theoretic approach that improves the convergence of target tracking. The method uses an appropriately fitted Stackelberg equilibrium instead of a Nash equilibrium, together with a new formation component in the individual cost functions. The results are demonstrated in simulation on a target tracking example in which the target is followed by a team of three simple mobile robots.

Keywords: Target tracking, Robot coordination, Game theory

1. INTRODUCTION

Multiple robot coordination plays an important role when a team has to accomplish a global task in a common workspace. A typical task is target tracking, where the team members follow the target and, at the same time, try to reach favorable conditions with respect to the relative positions of the robots, the target and the obstacles. Each robot in the team is entitled to make its own decision, although any decision of one robot may spoil the conditions of the other team members.

Coordination can be regarded as successful if every team-mate strives to reach the individually best position while taking into account the set of possible interacting actions of the other team-mates. This scenario can be described effectively in the framework of noncooperative game theory (Owen, 1995), (Petrosjan, 1993). The solution depends on the hierarchical structure of the game, and different types of equilibria may exist for the coordination task. Another important aspect that influences the solution is the information available to the players (team-mates). The fundamentals of game theory can be found in (Basar and Olsder, 1999). If strong cooperation can be assumed between the robots, the coordination problem is described as a cooperative game (Bilbao, 2000).

This paper starts from the target tracking concept of (Skrzypczyk, 2004) and improves its results in several respects. One improvement forces the team-mates into a formation, which makes the coordination more organized and manageable. Formation control is an intensive research area nowadays; some methods are investigated, for example, in (Yamaguchi, 2002), (Yamaguchi, 1999), (Desai et al., 1998), (Tanner et al., 2004). As another contribution, an appropriately fitted Stackelberg equilibrium is proposed, which increases the robustness of target tracking.

The paper is organized as follows. Section 2 defines the target tracking problem. Section 3 presents the new method using the formation cost and the semi-cooperative Stackelberg equilibrium. Simulation results are illustrated on a target tracking example in Section 4.

2. THE GAME THEORETIC FRAMEWORK

The problem of team coordination for target tracking reveals a conflict situation between $N$ team-mates acting as individual decision makers in their movement. To solve the problem within the framework of game theory, the concept of (Skrzypczyk, 2004) is adopted with some modifications. Based on sensory information, the algorithm obtains the positions of the team-mates, the obstacles and the target. These positions allow the cost function components to be computed and a cost function to be built for each team-mate. The cost functions define a game model and a conflict situation to be solved by means of game theory. Different approaches can be applied to solve the problem, such as the min-max strategy, the Nash equilibrium and the Stackelberg equilibrium. In some cases more than one equilibrium point exists. The algorithm, distributed on each team-mate, selects an equilibrium point and forces the robots to choose their action (linear and angular velocity) according to this selected point. In our framework, the team-mates choose their action from the set

$$U_i = \left\{ (\omega_i, v_i) : \omega_i \in \left[ -\frac{\pi}{6},\ 0,\ \frac{\pi}{6} \right],\ v_i \right\} \tag{1}$$

which implies a new world scenario that is measured by the sensors, and the loop begins again. The target tracking problem consists of finding the appropriate controls

$$u_i = \left( \omega_i^{d_i(t_n)},\ v_i^{d_i(t_n)} \right), \qquad i = 1, \ldots, N \tag{2}$$

from (1) by decision $d_i$ for each robot in the team. Each element $d_i \in U_i$ realizes a distinct decision for a team-mate, and the target tracking algorithm intends to select the optimal one. The optimal decision depends on the hierarchical structure of the team and on the cost function. The computation of the linear velocity $v_i$ is adopted from (Skrzypczyk, 2004) and helps the robot approach the target. Hence, the game-theoretic model focuses on $\omega_i^{d_i}$.

The cost function of the $i$th team-mate is described in the form

$$I_i(d_1, \ldots, d_N) = K_i^1 I_i^1 + K_i^2 I_i^2 + K_i^3 I_i^3 + K_i^4 I_i^4$$

where $K_i^1$, $K_i^2$, $K_i^3$ and $K_i^4$ are the weighting factors of the cost components. As defined in (Skrzypczyk, 2004), the cost function component $I_i^1$ is responsible for collision avoidance and penalizes closeness to the target, the obstacles and the other team-mates. The component $I_i^2$ penalizes the decision $(d_1, \ldots, d_N)$ by the distance between the $i$th robot and the center of mass of the team. The third component, $I_i^3$, penalizes the team-mate by the distance between the center of mass of the team and the target.

In our method, an extra component $I_i^4$ is introduced that penalizes the deviation from a prescribed formation. It is assumed that the formation is given by a formation graph and that the actual positions of the target and the team-mates are $g(t_n)$ and $p_1(t_n), \ldots, p_N(t_n)$, respectively. Let the desired formation consist of a straight line of three team-mates, called the formation line, which is perpendicular to the line passing through the target position and the middle robot of the formation line. The latter line is called the center line. In the desired formation graph, the center line is directed by the estimated velocity $v_{g,est}$ of the target. The distances between neighboring team-mates on the formation line are equal, and the distance between the two terminal robots of the formation is equal to the distance between the target and the middle robot on the center line. The determination of $I_i^4$ follows three steps and is illustrated in figure 1.

Algorithm 1. (Computation of $I_i^4$)

[1.] Select the middle robot from the team. The middle robot is required to be an interior point of the cone with vertex $g(t_n)$ and with the other two team-mates as extreme points. In figure 1, $p_2(t_n)$ denotes the position of the middle robot.

[2.] If the direction of $v_{g,est}$ coincided with the center line defined by the positions of the target and the middle robot, the formation graph together with the positions of the target and the middle robot would determine the desired positions of the other team-mates. They are $p_1'(t_n)$ and $p_3'(t_n)$ in figure 1. Note that $p_2'(t_n) \equiv p_2(t_n)$.

[3.] Since $v_{g,est}$ does not coincide with the center line in general, the formation line $p_1'(t_n), \ldots, p_N'(t_n)$ is rotated by $\delta$, the angle between the vector from the target to the middle robot and $-v_{g,est}$. The rotation defines the points $p_1''(t_n), \ldots, p_N''(t_n)$, and $I_i^4$ has to drive the team-mates to these points.

Hence, the formation component of the $i$th team-mate is defined by $I_i^4(d_1, \ldots, d_N) = d_{p_i, p_i''}$ (the distance between $p_i$ and $p_i''$).
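The three steps of Algorithm 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the author's implementation: positions are planar `(x, y)` tuples, the rotation by $\delta$ is taken about the target $g(t_n)$ (the text does not name the center of rotation), and each terminal robot is assigned the desired point on its own side of the center line. The function names `middle_robot` and `formation_costs` are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def cross(a, b):
    return a[0] * b[1] - a[1] * b[0]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def middle_robot(g, p):
    """Step 1: index of the robot strictly inside the cone (vertex g) of the other two."""
    for m in range(3):
        a, b = [i for i in range(3) if i != m]
        u, w, x = sub(p[a], g), sub(p[b], g), sub(p[m], g)
        det = cross(u, w)
        if abs(det) < 1e-12:
            continue  # the other two robots are collinear with the target
        # Solve x = alpha*u + beta*w by Cramer's rule.
        alpha, beta = cross(x, w) / det, cross(u, x) / det
        if alpha > 0 and beta > 0:
            return m
    raise ValueError("no robot lies strictly inside the cone of the other two")

def formation_costs(g, v_est, p):
    """Return [I_1^4, I_2^4, I_3^4] for target g, velocity estimate v_est, positions p."""
    m = middle_robot(g, p)
    c = sub(p[m], g)                  # center-line vector, target -> middle robot
    dist = math.hypot(c[0], c[1])
    n = (-c[1] / dist, c[0] / dist)   # unit normal of the center line
    # Step 2: unrotated desired points p'; terminal robots sit dist/2 on either side.
    desired = [None, None, None]
    desired[m] = p[m]
    for i in range(3):
        if i != m:
            side = 1.0 if cross(c, sub(p[i], g)) >= 0 else -1.0
            desired[i] = (p[m][0] + side * 0.5 * dist * n[0],
                          p[m][1] + side * 0.5 * dist * n[1])
    # Step 3: signed angle delta from c to -v_est; rotation about the target (assumed).
    t = (-v_est[0], -v_est[1])
    delta = math.atan2(cross(c, t), dot(c, t))
    cs, sn = math.cos(delta), math.sin(delta)
    costs = []
    for i in range(3):
        r = sub(desired[i], g)
        rotated = (g[0] + cs * r[0] - sn * r[1], g[1] + sn * r[0] + cs * r[1])
        costs.append(math.hypot(p[i][0] - rotated[0], p[i][1] - rotated[1]))
    return costs
```

With the target at the origin moving along the positive $y$ axis and the robots at $(-1,-2)$, $(0,-2)$, $(1,-2)$, the team already sits in the desired formation, so all three components vanish under these assumptions.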

3. SEMI-COOPERATIVE STACKELBERG EQUILIBRIUM

This section improves the results of (Skrzypczyk, 2004) and proposes a more robust solution. Although the Stackelberg equilibrium point plays the central role, we first recall the definition of the commonly used noncooperative Nash equilibrium.

Fig. 1. The difference from the desired formation. [Figure: the target $g(t_n)$ with estimated velocity $v_{g,est}$; the actual positions $p_1(t_n)$, $p_2(t_n)$, $p_3(t_n)$; the unrotated desired positions $p_1'(t_n)$, $p_2'(t_n) = p_2(t_n)$, $p_3'(t_n)$; the rotated desired positions $p_1''(t_n)$, $p_2''(t_n)$, $p_3''(t_n)$; and the rotation angle $\delta$.]

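Definition 1. A decision $\left( d_1^{k_{10}}, d_2^{k_{20}}, \ldots, d_N^{k_{N0}} \right)$ realizes a Nash equilibrium point if it satisfies the inequalities

$$\begin{aligned}
I_1\left( d_1^{k_{10}}, d_2^{k_{20}}, \ldots, d_N^{k_{N0}} \right) &\le I_1\left( d_1^{k_1}, d_2^{k_{20}}, \ldots, d_N^{k_{N0}} \right) \\
&\ \ \vdots \\
I_N\left( d_1^{k_{10}}, d_2^{k_{20}}, \ldots, d_N^{k_{N0}} \right) &\le I_N\left( d_1^{k_{10}}, d_2^{k_{20}}, \ldots, d_N^{k_N} \right)
\end{aligned}$$

for every admissible decision $d_i^{k_i} \in U_i$, $i = 1, \ldots, N$. In the sequel, let $I_1^{NN} := I_1\left( d_1^{k_{10}}, d_2^{k_{20}}, \ldots, d_N^{k_{N0}} \right)$, $\ldots$, $I_N^{NN} := I_N\left( d_1^{k_{10}}, d_2^{k_{20}}, \ldots, d_N^{k_{N0}} \right)$. $\Diamond$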
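For a finite game, the inequalities of Definition 1 can be checked by brute force. The sketch below is a generic enumeration, not taken from the paper: `action_sets` holds one finite decision set per player, and `cost` is a hypothetical callable that maps a decision tuple to the tuple of costs $(I_1, \ldots, I_N)$.

```python
from itertools import product

def nash_points(action_sets, cost):
    """Return every pure decision from which no single player can lower its own cost."""
    points = []
    for d in product(*action_sets):
        base = cost(d)
        stable = True
        for i, U_i in enumerate(action_sets):
            for alt in U_i:
                deviation = d[:i] + (alt,) + d[i + 1:]
                if cost(deviation)[i] < base[i]:
                    stable = False  # player i gains by deviating alone
        if stable:
            points.append(d)
    return points

# Hypothetical two-player example: both players pay the mismatch of their turn choices.
turns = (-1, 0, 1)

def mismatch_cost(d):
    return (abs(d[0] - d[1]), abs(d[0] - d[1]))

equilibria = nash_points([turns, turns], mismatch_cost)
```

In this toy game every matching pair of turn choices is an equilibrium, which illustrates the remark in Section 2 that more than one equilibrium point may exist and that the team needs a rule to select one.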

The idea behind the application of the Stackelberg equilibrium is that it gives the decision of one team-mate (usually called the leader) priority over the decisions of the other team-mates (usually called followers). Such a preference is often useful if coordination of the team by one team-mate is highly desirable. In general, it is possible to define a hierarchy between team-mates. A team-mate on a higher hierarchy level announces its decision earlier than a team-mate on a lower level. Team-mates on the same level know nothing about the decisions of the other team-mates on that level. In our new approach, the conflict situation is solved by two distinct games. The first game is played by the followers and is referred to as the followers' game; the second game is played by the leader and a fictitious player called Union and is referred to as the coordination game. The set of decisions of Union consists of the solutions of the followers' game. The followers' game can be solved by using the Nash equilibrium, the Stackelberg equilibrium or the min-max strategy, depending on the further hierarchy of the followers. In the sequel, it is assumed that there is one leader in the team and that the other team-mates are followers of equal rank. Without loss of generality, it is also assumed in the rest of the paper that the leader is the robot $R_1$, i.e. the first player in the team. For the discussion, one needs the definition of the equilibrium strategy for the leader and for the followers' group.

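Definition 2. Decision $d_{1,NS}^{k_1}$ is a noncooperative Stackelberg equilibrium strategy for the leader if

$$\max_{R_F\left(d_{1,NS}^{k_1}\right)} I_1\left( d_{1,NS}^{k_1}, R_F\left(d_{1,NS}^{k_1}\right) \right) = \min_{d_1^{k_1} \in U_1}\ \max_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right) =: I_1^{NS} \tag{3}$$

where $R_F\left(d_{1,NS}^{k_1}\right)$ is the optimal response set of the followers' group (i.e. player Union) and is defined for each $d_1^{k_1} \in U_1$ by

$$\begin{aligned}
R_F\left(d_1^{k_1}\right) = \Big\{ \Xi := (\xi_2, \ldots, \xi_N) \in U_2 \times \cdots \times U_N :\quad
& I_2\left(d_1^{k_1}, \Xi\right) \le I_2\left(d_1^{k_1}, d_2^{k_2}, \xi_3, \ldots, \xi_N\right), \\
& I_3\left(d_1^{k_1}, \Xi\right) \le I_3\left(d_1^{k_1}, \xi_2, d_3^{k_3}, \xi_4, \ldots, \xi_N\right), \\
& \qquad \vdots \\
& I_N\left(d_1^{k_1}, \Xi\right) \le I_N\left(d_1^{k_1}, \xi_2, \xi_3, \ldots, \xi_{N-1}, d_N^{k_N}\right), \\
& \forall \left(d_2^{k_2}, \ldots, d_N^{k_N}\right) \Big\}
\end{aligned} \tag{4}$$

Any $\left(d_{2,NS}^{k_2}, \ldots, d_{N,NS}^{k_N}\right) \in R_F\left(d_{1,NS}^{k_1}\right)$ is an optimal decision for the followers' group (player Union). For brevity, we introduce $I_i^{NS} := I_i\left(d_1^{k_1}, \Xi\right)$, $i = 2, \ldots, N$. $\Diamond$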
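To make (3) and (4) concrete, the following sketch enumerates the response set $R_F(d_1)$ of a finite game and then applies the leader's min-max rule. It is a minimal illustration with hypothetical names (`response_set`, `noncooperative_stackelberg`, the tables `I1` and `I2`). One assumption goes beyond the text: leader decisions whose response set is empty are skipped, a case the definition does not discuss.

```python
from itertools import product

def response_set(d1, follower_sets, cost):
    """R_F(d1) as in (4): follower decisions that are Nash points of the followers' game."""
    result = []
    for xi in product(*follower_sets):
        base = cost((d1,) + xi)
        ok = all(
            base[j + 1] <= cost((d1,) + xi[:j] + (alt,) + xi[j + 1:])[j + 1]
            for j, U_j in enumerate(follower_sets)
            for alt in U_j
        )
        if ok:
            result.append(xi)
    return result

def noncooperative_stackelberg(U1, follower_sets, cost):
    """Leader decision minimising the worst leader cost over R_F(d1), as in (3)."""
    best = None
    for d1 in U1:
        responses = response_set(d1, follower_sets, cost)
        if not responses:
            continue  # assumption: ignore leader decisions without a pure response
        worst = max(cost((d1,) + xi)[0] for xi in responses)
        if best is None or worst < best[1]:
            best = (d1, worst, responses)
    return best

# Hypothetical 2x2 game: the follower is indifferent when the leader plays 0.
I1 = {(0, 0): 1, (0, 1): 5, (1, 0): 3, (1, 1): 2}
I2 = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 0}

def toy_cost(d):
    return (I1[d], I2[d])

ns_decision, ns_cost, ns_responses = noncooperative_stackelberg((0, 1), [(0, 1)], toy_cost)
```

In this toy game the leader avoids decision 0, because the indifferent follower might answer with the response that costs the leader 5, and settles for decision 1 with a guaranteed cost of 2.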

Definition 2 says that the followers choose their optimal decision in response to the leader's announced decision $d_1^{k_1}$. Based on this, the leader can select his optimal decision $d_1^{k_{10}}$, which results in the minimum cost for the leader even if the followers select, among their optimal decisions, the one that is worst for the leader. If the followers' game (as a subgame) has only one (Nash) equilibrium point, i.e. $R_F\left(d_1^{k_1}\right)$ contains only one element, then Proposition 3.16 in (Basar and Olsder, 1999) gives an important result for the two-person coordination game.

Theorem 3. For a given two-person finite game, let $I_1^{NS}$ denote the noncooperative Stackelberg cost of the leader and let $I_1^{NN}$ denote any Nash equilibrium cost of the leader. If the set $R_F\left(d_1^{k_1}\right)$ contains only one element for each $d_1^{k_1} \in U_1$, then $I_1^{NS} \le I_1^{NN}$.

The result of Theorem 3 is beneficial if a team-mate in target tracking has a bad position in comparison with the others. If this team-mate is assigned as the leader, its relative position in the team may improve, provided the condition of Theorem 3 holds. However, the followers often have more than one (Nash) optimal response. The leader then assumes the worst case, in which the followers choose the equilibrium strategy with the highest possible cost to the leader. To further improve the efficiency of coordination in target tracking, we introduce the notion of semi-cooperation, which plays a crucial role in the following.

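Definition 4. Decision $d_{1,SS}^{k_1}$ is a semi-cooperative Stackelberg equilibrium strategy for the leader if

$$\min_{R_F\left(d_{1,SS}^{k_1}\right)} I_1\left( d_{1,SS}^{k_1}, R_F\left(d_{1,SS}^{k_1}\right) \right) = \min_{d_1^{k_1} \in U_1}\ \min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right) =: I_1^{SS} \tag{5}$$

where $R_F\left(d_{1,SS}^{k_1}\right)$ is the optimal response set of the followers' group (i.e. player Union) and is defined for each $d_1^{k_1} \in U_1$ by

$$\begin{aligned}
R_F\left(d_1^{k_1}\right) = \Big\{ \Xi := (\xi_2, \ldots, \xi_N) \in U_2 \times \cdots \times U_N :\quad
& I_2\left(d_1^{k_1}, \Xi\right) \le I_2\left(d_1^{k_1}, d_2^{k_2}, \xi_3, \ldots, \xi_N\right), \\
& I_3\left(d_1^{k_1}, \Xi\right) \le I_3\left(d_1^{k_1}, \xi_2, d_3^{k_3}, \xi_4, \ldots, \xi_N\right), \\
& \qquad \vdots \\
& I_N\left(d_1^{k_1}, \Xi\right) \le I_N\left(d_1^{k_1}, \xi_2, \xi_3, \ldots, \xi_{N-1}, d_N^{k_N}\right), \\
& \forall \left(d_2^{k_2}, \ldots, d_N^{k_N}\right) \Big\}
\end{aligned} \tag{6}$$

Any $\left(d_2^{k_2}, \ldots, d_N^{k_N}\right) \in R_F\left(d_{1,SS}^{k_1}\right)$ is a (Nash) optimal decision in the followers' game; however, only the decision

$$\left(d_{2,SS}^{k_2}, \ldots, d_{N,SS}^{k_N}\right) = \arg\min_{R_F\left(d_{1,SS}^{k_1}\right)} I_1\left( d_{1,SS}^{k_1}, R_F\left(d_{1,SS}^{k_1}\right) \right) \tag{7}$$

is optimal in both the coordination game and the followers' game. If more than one decision satisfies (7), the team-mates select one decision by social agreement (e.g. lexicographical order). The decision $\left(d_{2,SS}^{k_2}, \ldots, d_{N,SS}^{k_N}\right)$ is called the semi-cooperative optimal response of the followers' group (i.e. player Union), and

$$\begin{aligned}
I_1^{SS} &= I_1\left( d_{1,SS}^{k_1}, d_{2,SS}^{k_2}, \ldots, d_{N,SS}^{k_N} \right) \\
I_2^{SS} &= I_2\left( d_{1,SS}^{k_1}, d_{2,SS}^{k_2}, \ldots, d_{N,SS}^{k_N} \right) \\
&\ \ \vdots \\
I_N^{SS} &= I_N\left( d_{1,SS}^{k_1}, d_{2,SS}^{k_2}, \ldots, d_{N,SS}^{k_N} \right)
\end{aligned} \tag{8}$$

are the costs of the team-mates. $\Diamond$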
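Definition 4 can be illustrated in the same brute-force style. The sketch below selects, for each leader decision, the follower response in $R_F(d_1)$ that is cheapest for the leader, breaking ties lexicographically as suggested for (7), and returns the costs of (8). All names and the cost tables are hypothetical, and skipping leader decisions with an empty response set is again an assumption rather than part of the definition.

```python
from itertools import product

def response_set(d1, follower_sets, cost):
    """R_F(d1) as in (6): Nash points of the followers' game for a fixed leader decision."""
    result = []
    for xi in product(*follower_sets):
        base = cost((d1,) + xi)
        if all(base[j + 1] <= cost((d1,) + xi[:j] + (alt,) + xi[j + 1:])[j + 1]
               for j, U_j in enumerate(follower_sets) for alt in U_j):
            result.append(xi)
    return result

def semi_cooperative_stackelberg(U1, follower_sets, cost):
    """Return the decision (d_1SS, ..., d_NSS) and the cost tuple (I_1^SS, ..., I_N^SS)."""
    best = None
    for d1 in U1:
        responses = response_set(d1, follower_sets, cost)
        if not responses:
            continue  # assumption: ignore leader decisions without a pure response
        # Response that is best for the leader; ties resolved in lexicographic order.
        xi = min(responses, key=lambda r: (cost((d1,) + r)[0], r))
        value = cost((d1,) + xi)[0]
        if best is None or value < best[1]:
            best = ((d1,) + xi, value)
    if best is None:
        raise ValueError("the followers' game has no pure equilibrium for any leader decision")
    decision = best[0]
    return decision, cost(decision)

# Hypothetical 2x2 game: the follower is indifferent when the leader plays 0.
I1 = {(0, 0): 1, (0, 1): 5, (1, 0): 3, (1, 1): 2}
I2 = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 0}

def toy_cost(d):
    return (I1[d], I2[d])

ss_decision, ss_costs = semi_cooperative_stackelberg((0, 1), [(0, 1)], toy_cost)
```

Here the leader can rely on the indifferent follower to pick the response that suits the leader, so it plays decision 0 and pays 1. Under the worst-case rule of Definition 2 the same game would give the leader a cost of 2, because for decision 0 the follower's other optimal response costs the leader 5.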

Definition 4 says that the followers choose their optimal decision in response to the leader's announced decision $d_1^{k_1}$. Based on this, the leader can select his optimal decision $d_{1,SS}^{k_1}$, a decision for which the followers optimize their decision both for themselves and for the leader. The semi-cooperative optimal response of the followers' group is important because the leader can assume cooperation from the followers to a certain extent if they have no unique optimal response. Another interpretation of the decision process is that the leader has the right to resolve the trade-off between the (Nash) equilibrium points of the followers' game in his own favor. Indeed, a team operation like target tracking can reasonably expect this kind of minimal cooperation, since in general the followers could not effectively single out a pure decision from $R_F\left(d_1^{k_1}\right)$. Strictly speaking, they would have to employ a mixed strategy, which makes little sense in the varying environment of the target tracking problem. We now show that the semi-cooperative Stackelberg strategy is at least as favorable to the leader as the noncooperative Stackelberg and Nash strategies in the target tracking problem.

Proposition 5. For a given $N$-person game of team-mates in the target tracking problem, let $I_1^{SS}$ denote the semi-cooperative Stackelberg cost of the leader and let $I_1^{NS}$ denote the noncooperative Stackelberg cost of the leader. Then, $I_1^{SS} \le I_1^{NS}$.

Proof: By Definition 4 and Definition 2, both the noncooperative and the semi-cooperative Stackelberg game of the team-mates incorporate two games: a game between the followers (i.e. the followers' game) and a game between the leader and the player Union (i.e. the coordination game).

If $d_{1,NS}^{k_1} = d_{1,SS}^{k_1}$, the sets of optimal responses in the two games are the same, and the optimal decision (3) of the leader in the noncooperative game and the optimal decision (5) of the leader in the semi-cooperative game can be compared directly:

$$\begin{aligned}
I_1^{SS} &= \min_{\left(d_2^{k_2}, \ldots, d_N^{k_N}\right) \in R_F\left(d_{1,SS}^{k_1}\right)} I_1\left( d_{1,SS}^{k_1}, d_2^{k_2}, \ldots, d_N^{k_N} \right) \\
&= \min_{\left(d_2^{k_2}, \ldots, d_N^{k_N}\right) \in R_F\left(d_{1,NS}^{k_1}\right)} I_1\left( d_{1,NS}^{k_1}, d_2^{k_2}, \ldots, d_N^{k_N} \right) \\
&\le \max_{\left(d_2^{k_2}, \ldots, d_N^{k_N}\right) \in R_F\left(d_{1,NS}^{k_1}\right)} I_1\left( d_{1,NS}^{k_1}, d_2^{k_2}, \ldots, d_N^{k_N} \right) = I_1^{NS}
\end{aligned} \tag{9}$$

i.e. $I_1^{SS} \le I_1^{NS}$ holds.

If $d_{1,NS}^{k_1} \ne d_{1,SS}^{k_1}$, the sets of optimal responses in the two games are $R_F\left(d_{1,NS}^{k_1}\right)$ and $R_F\left(d_{1,SS}^{k_1}\right)$, and they may be different. Since $d_{1,SS}^{k_1}$ is the semi-cooperative hierarchical strategy of the leader, it holds that

$$\begin{aligned}
I_1^{SS} &= \min_{R_F\left(d_{1,SS}^{k_1}\right)} I_1\left( d_{1,SS}^{k_1}, R_F\left(d_{1,SS}^{k_1}\right) \right) \\
&= \min_{d_1^{k_1} \in U_1}\ \min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right) \\
&\le \min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right)
\end{aligned} \tag{10}$$

for any $d_1^{k_1} \ne d_{1,SS}^{k_1}$. Let $d_1^{k_1} = d_{1,NS}^{k_1}$ be the noncooperative hierarchical strategy of the leader; then

$$\begin{aligned}
\min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right)
&= \min_{R_F\left(d_{1,NS}^{k_1}\right)} I_1\left( d_{1,NS}^{k_1}, R_F\left(d_{1,NS}^{k_1}\right) \right) \\
&\le \max_{R_F\left(d_{1,NS}^{k_1}\right)} I_1\left( d_{1,NS}^{k_1}, R_F\left(d_{1,NS}^{k_1}\right) \right) = I_1^{NS}
\end{aligned} \tag{11}$$

The RHS of inequality (10) and the LHS of inequality (11) are equal, hence

$$I_1^{SS} \le \min_{\left(d_2^{k_2}, \ldots, d_N^{k_N}\right) \in R_F\left(d_{1,NS}^{k_1}\right)} I_1\left( d_{1,NS}^{k_1}, d_2^{k_2}, \ldots, d_N^{k_N} \right) \le I_1^{NS} \tag{12}$$

which completes the proof. $\Box$

Proposition 6. For a given $N$-person game of team-mates in the target tracking problem, let $I_1^{SS}$ denote the semi-cooperative Stackelberg cost of the leader and let $I_1^{NN}$ denote any Nash equilibrium cost of the leader. Then, $I_1^{SS} \le I_1^{NN}$.

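Proof: Let the semi-cooperative Stackelberg strategy be denoted by $\left(d_{1,SS}^{k_1}, d_{2,SS}^{k_2}, \ldots, d_{N,SS}^{k_N}\right)$ and let a strategy of a Nash equilibrium point be denoted by $\left(d_{1,NN}^{k_1}, d_{2,NN}^{k_2}, \ldots, d_{N,NN}^{k_N}\right)$. By the definition of the semi-cooperative Stackelberg strategy,

$$\begin{aligned}
I_1^{SS} &= \min_{d_1^{k_1} \in U_1}\ \min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right) \\
&\le \min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right)
\end{aligned} \tag{13}$$

holds for every $\left(d_1^{k_1}, d_2^{k_2}, \ldots, d_N^{k_N}\right)$. Let $d_1^{k_1} = d_{1,NN}^{k_1}$. Then, (13) can be rewritten as

$$\begin{aligned}
I_1^{SS} &= \min_{d_1^{k_1} \in U_1}\ \min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right) \\
&\le \min_{R_F\left(d_{1,NN}^{k_1}\right)} I_1\left( d_{1,NN}^{k_1}, R_F\left(d_{1,NN}^{k_1}\right) \right)
\end{aligned} \tag{14}$$

Now, let $\left(d_2^{k_2}, \ldots, d_N^{k_N}\right) = \left(d_{2,NN}^{k_2}, \ldots, d_{N,NN}^{k_N}\right)$. By Definition 1, the followers' part of a Nash equilibrium point satisfies the inequalities in (6) for $d_1^{k_1} = d_{1,NN}^{k_1}$, so it belongs to $R_F\left(d_{1,NN}^{k_1}\right)$. Then (14) proceeds as

$$\begin{aligned}
I_1^{SS} &= \min_{d_1^{k_1} \in U_1}\ \min_{R_F\left(d_1^{k_1}\right)} I_1\left( d_1^{k_1}, R_F\left(d_1^{k_1}\right) \right) \\
&\le \min_{R_F\left(d_{1,NN}^{k_1}\right)} I_1\left( d_{1,NN}^{k_1}, R_F\left(d_{1,NN}^{k_1}\right) \right) \\
&\le I_1\left( d_{1,NN}^{k_1}, d_{2,NN}^{k_2}, \ldots, d_{N,NN}^{k_N} \right) = I_1^{NN}
\end{aligned}$$

and the proof is completed. $\Box$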
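The two propositions can also be sanity-checked numerically on small random games. The sketch below is not part of the paper: it draws hypothetical integer cost tables for three players with three decisions each, computes $I_1^{SS}$, $I_1^{NS}$ and all pure Nash costs of the leader by enumeration, and verifies both inequalities. Leader decisions with an empty response set are ignored, which is an assumption; such a check illustrates the statements but does not replace the proofs.

```python
import random
from itertools import product

def response_set(d1, follower_sets, cost):
    """Nash points of the followers' game for a fixed leader decision d1."""
    result = []
    for xi in product(*follower_sets):
        base = cost((d1,) + xi)
        if all(base[j + 1] <= cost((d1,) + xi[:j] + (alt,) + xi[j + 1:])[j + 1]
               for j, U_j in enumerate(follower_sets) for alt in U_j):
            result.append(xi)
    return result

def leader_costs(U1, follower_sets, cost):
    """Return (I_1^SS, I_1^NS), or None if no leader decision has a pure response."""
    best_case, worst_case = [], []
    for d1 in U1:
        values = [cost((d1,) + xi)[0] for xi in response_set(d1, follower_sets, cost)]
        if values:
            best_case.append(min(values))
            worst_case.append(max(values))
    return (min(best_case), min(worst_case)) if best_case else None

def nash_leader_costs(U1, follower_sets, cost):
    """Leader cost at every pure Nash point of the full game."""
    out = []
    for d1 in U1:
        for xi in response_set(d1, follower_sets, cost):
            here = cost((d1,) + xi)[0]
            if all(here <= cost((alt,) + xi)[0] for alt in U1):
                out.append(here)
    return out

def check_propositions(trials=200, seed=0):
    rng = random.Random(seed)
    actions = (0, 1, 2)
    for _ in range(trials):
        table = {d: tuple(rng.randint(0, 9) for _ in range(3))
                 for d in product(actions, repeat=3)}
        cost = table.__getitem__
        pair = leader_costs(actions, [actions, actions], cost)
        if pair is None:
            continue
        i_ss, i_ns = pair
        if i_ss > i_ns:
            return False  # would contradict Proposition 5
        if any(i_ss > i_nn for i_nn in nash_leader_costs(actions, [actions, actions], cost)):
            return False  # would contradict Proposition 6
    return True
```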

Fig. 2. Target tracking if the team uses the Nash equilibrium point. (Target starts from $[19, 16]^T$.) Cost function weights: $K_i^1 = 1$, $K_i^2 = 25$, $K_i^3 = 15$, $K_i^4 = 2$, $i = 1, 2, 3$.

Remark 7. Proposition 5 and Proposition 6 say that the semi-cooperative Stackelberg equilibrium point is at least as favorable to the leader as the noncooperative Stackelberg or Nash equilibrium points. The propositions do not state, however, that the followers are better off playing the semi-cooperative Stackelberg strategy than any other noncooperative Stackelberg or Nash strategy.

If the team-mate with the highest cost function is chosen as the leader at the actual decision time, then the leader position in the semi-cooperative strategy allows this team-mate to reduce its own handicap. This concept keeps the team-mates together in a group with roughly even cost functions and helps to prevent a team-mate from suffering a permanent handicap relative to the others. A permanent handicap hinders the team-mate from occupying a good location. This is not desirable because it may force the robot to stray from the group.
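A leader selection rule of this kind takes only a few lines. The sketch below is an assumption-laden illustration: it takes the costs $I_i$ evaluated for the current configuration (the text does not specify at which decision the costs are compared), picks the team-mate with the highest cost, breaks ties in favor of the lowest index, and reorders the players so that the leader becomes the first player, as assumed in Section 3. The function names are hypothetical.

```python
def select_leader(current_costs):
    """Index of the team-mate with the highest current cost (ties: lowest index)."""
    return max(range(len(current_costs)), key=lambda i: (current_costs[i], -i))

def leader_first(current_costs):
    """Player order with the selected leader moved to the front."""
    leader = select_leader(current_costs)
    return [leader] + [i for i in range(len(current_costs)) if i != leader]

# Hypothetical costs of three robots at one decision time.
order = leader_first([3.0, 7.5, 7.5])
```

Re-running the selection at every decision time lets the leader role move to whichever robot is currently worst off, which is the mechanism the paragraph above relies on to keep the costs roughly even.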

4. SIMULATION RESULTS

This section demonstrates the new algorithm on a simple target tracking problem. We show that a suitable leader selection strategy is able to exploit the results of Proposition 5 and Proposition 6. Figure 2 illustrates target tracking when the team-mates choose their decisions according to the Nash equilibrium point. One of the team-mates has a permanent handicap after some time and begins to turn aside from the group to improve its position. Depending on the value of $K_i^2$, the other two team-mates may remain close to the handicapped team-mate, while their unfavorable position relative to the handicapped team-mate forces it to diverge from the target. As a result, the team peels off from the target, the components $I_i^4$ increase and target tracking fails, as shown in figure 2. The semi-cooperative Stackelberg equilibrium offers more robust tracking owing to the leader selection strategy, which promotes the handicapped team-mate to leader. Figure 3 shows that target tracking becomes successful and that the formation cost components $I_i^4$ also remain balanced between the team-mates. The difference between the Stackelberg and Nash strategies is not always so significant (see figure 4). In fact, the decisions derived from the noncooperative Stackelberg and Nash equilibrium points are the same in 90% of the simulation, and the noncooperative and semi-cooperative Stackelberg equilibrium points coincide at more than 95% of the discrete time instants. The performance of target tracking depends strongly on the weights of the cost function components.

Fig. 3. Target tracking if the team uses the semi-cooperative Stackelberg equilibrium point. (Target starts from $[19, 16]^T$.) Cost function weights: $K_i^1 = 1$, $K_i^2 = 25$, $K_i^3 = 15$, $K_i^4 = 2$, $i = 1, 2, 3$.

Fig. 4. Target tracking methods are similar at cost function weights: $K_i^1 = 1$, $K_i^2 = 7$, $K_i^3 = 20$, $K_i^4 = 15$, $i = 1, 2, 3$.

5. CONCLUSIONS

This paper proposed an algorithm for the trajectory tracking problem. We introduced the notion of the semi-cooperative Stackelberg equilibrium point and proved that a team strategy based on this equilibrium point provides equal or better performance for the leader than the conventional noncooperative Nash or Stackelberg strategy. Although it does not guarantee that all the team-mates gain a lower cost, we demonstrated that an appropriate leader selection increases the robustness of the overall target tracking. Target tracking in a game-theoretic framework is very sensitive to the weights of the cost function components. In order to increase robustness, one may apply soft computing methods for weight tuning.

ACKNOWLEDGEMENTS

The research was supported by the Hungarian Science Research Fund under grant OTKA T 042634, the project of the Advanced Vehicles and Vehicle Control Knowledge Center of the National Office for Research and Technology, and the János Bolyai Research Scholarship of the Hungarian Academy of Sciences. The author also thanks Erno Horvath for his help in providing a part of the simulation data.

REFERENCES

Basar, T. and G. J. Olsder (1999). Dynamic noncooperative game theory. 2nd ed. SIAM.

Bilbao, J. M. (2000). Cooperative games on combinatorial structures. Vol. 26 of Theory and Decision Library, Series C: Game theory, Mathematical programming and Operations research. Kluwer Academic Publishers.

Desai, J., J. P. Ostrowski and V. Kumar (1998). Controlling formations of multiple mobile robots. In: Proceedings of the 1998 IEEE International Conference on Robotics and Automation. Leuven, Belgium. pp. 2864–2869.

Owen, G. (1995). Game theory. 3rd ed. Academic Press.

Petrosjan, L. A. (1993). Differential game of pursuit. Vol. 2 of Series on optimization. World Scientific.

Skrzypczyk, K. (2004). Game theory based target following by a team of robots. In: Proceedings of the Fourth International Workshop on Robot Motion and Control. pp. 91–96.

Tanner, H. G., G. J. Pappas and V. Kumar (2004). Leader-to-formation stability. IEEE Transactions on Robotics and Automation 20(3), 443–455.

Yamaguchi, H. (1999). A cooperative hunting behavior by mobile robot troops. The International Journal of Robotics Research 18(8), 931–940.

Yamaguchi, H. (2002). A distributed motion coordination strategy for multiple nonholonomic mobile robots in cooperative hunting operations. In: Proceedings of the 41st IEEE Conference on Decision and Control. Las Vegas, Nevada, USA. pp. 2984–2991.