
Theoretical Analysis on an Inversion Phenomenon of Convergence Velocity in a Real-Coded GA

Hiroshi Someya, Member, IEEE

Abstract— The aims of this paper are to analyze an inversion phenomenon theoretically and to discuss the appropriateness of the combination of a crossover operator and a selection model. In a previous study, the author designed a crossover operator that worked well on various kinds of objective functions. One feature of these objective functions is that the optimum exists near a boundary of the search space rather than near its center. On such objective functions, with the recommended selection model, the proposed crossover operator set with an appropriate parameter showed the fastest convergence speed. However, with another selection model, its convergence speed was the slowest. In order to understand this inversion phenomenon, a theoretical analysis quantified the selection pressures of the selection models and estimated the expected positions of the center of gravity of the population. The theoretical results corresponded to the empirical verifications and successfully explained the phenomenon. Finally, a guideline for designing RCGAs was obtained.

I. INTRODUCTION

In recent years, several researchers have studied Real-Coded GAs (RCGAs) for objective functions where the optimum is located near a boundary [1], [2], [3], [4]. They have reported that most RCGAs search the center of the search space much more than the other regions. An example of such bias is shown in Fig. 1. The left figure was obtained by an actual computer simulation. In this simulation, the individuals of the initial population were distributed randomly in the search space [−5, 5]. In each generation, randomly picked parents produced their children using Unimodal Normal Distribution Crossover (UNDX) [5]. The survival selection was also performed by random picking. Although all the selection procedures were completely random, the searched points were clearly biased toward the center of the search space. This bias works advantageously only in the case that the optimum happens to be located in the center of the search space. In the other cases, it works against the search. This paper claims that robustness should be given priority over performance in a specific case, because no one knows where the optimum exists before a trial in real-world applications. The right figure in Fig. 1 illustrates the theoretically obtained sampling bias of UNDX. Because the bias curve is inherent to the crossover operator, inventing an improved crossover operator is believed to be the most promising approach to this problem.

In the previous study, the author designed a new crossover operator, Asymmetrical Normal Distribution Crossover (ANDX) [6]. One of the advantageous features of this crossover operator is flexibility. The shape of the probability density function (p.d.f.) of this crossover operator can be controlled in a continuous fashion. In a certain parameter setting, the theoretical bias curve of ANDX is relatively flat.

Hiroshi Someya is with The Institute of Statistical Mathematics, Research Organization of Information and Systems, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106-8569, Japan (email: [email protected]).

In several experiments where the performance measure adopted was the number of runs that found the optimum, ANDX worked well on all of a variety of test functions, such as $F_{step}$, $F_{rastrigin}$, $F_{ikeda}$ and $F_{rosenbrock}$. Their fitness landscapes are characterized by several different structures: discreteness, big valley, punch bowl, ridge structure, epistasis structure, UV structure, and structures where the optimum exists near a boundary rather than in the center of the search space. Therefore, the author concluded that ANDX would be suitable for searching near a boundary of the search space. Some of the experimental results are shown in Table I. Each test function except $F_{rosenbrock}$ has its optimum at the origin. The experiments on them were performed with the initial population distributed in [−0.62, 9.62]. $F_{rosenbrock}$ has its optimum at (1, ..., 1), and its initial domain was set to [−2.048, 2.048]. $\zeta_o$ is a parameter of ANDX. MGG and X-MGG are the selection models adopted.

Then, the author examined the convergence curves of these experiments. The curves on $F_{rosenbrock}$ drew the two curious graphs shown in Fig. 2. ANDX ($\zeta_o = 0.4$) with X-MGG rapidly heads toward the optimum. The convergence speed ranking of the ANDX variants with X-MGG is, from fastest to slowest, $\zeta_o = 0.4$, $\zeta_o = 0.3$, $\zeta_o = 0.2$. However, with MGG, the ranking is the opposite: the convergence velocity of ANDX ($\zeta_o = 0.4$) with MGG is the slowest. Why?

The aims of this paper are to analyze this inversion phenomenon theoretically, and to discuss the appropriateness of the combination of a crossover operator and a selection model.

[Figure 1: two panels. (a) Experimental, both axes from −5 to 5. (b) Theoretical, probability density versus domain of definition.]

Fig. 1. The sampling bias of UNDX [4].

1-4244-1340-0/07$25.00 ©2007 IEEE

TABLE I
THE PERCENTAGES OF RUNS IN WHICH EACH METHOD SUCCEEDED IN FINDING THE GLOBAL OPTIMUM IN 300 RUNS (ROUNDED OFF TO THE FIRST DECIMAL PLACE).

Selection   Crossover        Fst    Fra    Fik    Fro
MGG         UNDX              34      0     36     62
MGG         ANDX (ζo=0.2)     47      1     47     63
MGG         ANDX (ζo=0.3)     91      3     46     62
MGG         ANDX (ζo=0.4)    100      6     58     65
X-MGG       ANDX (ζo=0.2)     61      2     44     72
X-MGG       ANDX (ζo=0.3)     99      9     49     74
X-MGG       ANDX (ζo=0.4)    100     97     67     96

Fst = Fstep, Fra = Frastrigin, Fik = Fikeda, Fro = Frosenbrock.

The theoretical analysis quantifies the selection pressures of the selection models and estimates the expected positions of the center of gravity of the population.

II. ANALYSIS OBJECTS

A. UNDX with MGG

This method is widely cited and adopted in the literature. It works well even when the landscape of the objective function is multi-modal, discrete and high-dimensional.

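UNDX produces two children in a single crossover operation around the middle point of the parents, Parent1 and Parent2. The children vectors, $\mathbf{c}^{(1)}$ and $\mathbf{c}^{(2)}$, are determined as follows:

\[ \mathbf{c}^{(1)} = \mathbf{g} + z_1 \mathbf{e}_1 + \sum_{k=2}^{n} z_k \mathbf{e}_k , \]

\[ \mathbf{c}^{(2)} = \mathbf{g} - z_1 \mathbf{e}_1 - \sum_{k=2}^{n} z_k \mathbf{e}_k . \]

The parent vectors $\mathbf{p}^{(1)}$ and $\mathbf{p}^{(2)}$ determine their middle point vector $\mathbf{g} = (\mathbf{p}^{(1)} + \mathbf{p}^{(2)})/2$ and the orthogonal unit vectors $\mathbf{e}_1 = (\mathbf{p}^{(2)} - \mathbf{p}^{(1)})/|\mathbf{p}^{(2)} - \mathbf{p}^{(1)}|$ and $\mathbf{e}_k$ ($k = 2, \ldots, n$), where $n$ is the dimension of the objective function. $z_1 \sim N(0, \sigma_1^2)$ and $z_k \sim N(0, \sigma_2^2)$ are normally distributed random numbers, where $\sigma_1 = \alpha d_1$ and $\sigma_2 = \beta d_2/\sqrt{n}$. $\alpha$ and $\beta$ are constants. $d_1$ is the distance between Parent1 and Parent2. $d_2$ is the distance of Parent3, the third parent, from the line connecting Parent1 and Parent2.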
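The UNDX definition above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `undx`, the default values of `alpha` and `beta`, and the use of plain lists are choices made here, not taken from this paper. Instead of building the basis $\mathbf{e}_2, \ldots, \mathbf{e}_n$ explicitly, the sketch draws an isotropic Gaussian vector and removes its component along $\mathbf{e}_1$, which has the same distribution as $\sum_{k=2}^{n} z_k \mathbf{e}_k$.

```python
import math
import random

def undx(p1, p2, p3, alpha=0.5, beta=0.35, rng=random):
    """Return two UNDX children of Parent1 (p1), Parent2 (p2), helper Parent3 (p3).

    alpha and beta are the constants of the text; the defaults are
    illustrative assumptions, not values stated in this paper.
    """
    n = len(p1)
    g = [(a + b) / 2.0 for a, b in zip(p1, p2)]        # middle point of the parents
    diff = [b - a for a, b in zip(p1, p2)]
    d1 = math.sqrt(sum(x * x for x in diff))            # distance Parent1 to Parent2
    if d1 == 0.0:
        raise ValueError("Parent1 and Parent2 must differ")
    e1 = [x / d1 for x in diff]                         # unit vector along the parents

    # d2: distance of Parent3 from the line through Parent1 and Parent2.
    v = [c - a for a, c in zip(p1, p3)]
    along = sum(x * y for x, y in zip(v, e1))
    d2 = math.sqrt(max(0.0, sum(x * x for x in v) - along * along))

    sigma1 = alpha * d1
    sigma2 = beta * d2 / math.sqrt(n)

    # sum_{k>=2} z_k e_k: isotropic Gaussian with its e1 component removed.
    w = [rng.gauss(0.0, sigma2) for _ in range(n)]
    w_along = sum(x * y for x, y in zip(w, e1))
    ortho = [x - w_along * y for x, y in zip(w, e1)]

    z1 = rng.gauss(0.0, sigma1)
    offset = [z1 * a + b for a, b in zip(e1, ortho)]
    c1 = [a + b for a, b in zip(g, offset)]             # c(1) = g + offset
    c2 = [a - b for a, b in zip(g, offset)]             # c(2) = g - offset
    return c1, c2
```

Because the two children use the same random offset with opposite signs, their midpoint is always $\mathbf{g}$, which is a convenient sanity check for any implementation.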

The Minimal Generation Gap (MGG) model [7] is described as follows:

step 1 Generate an initial population randomly.
step 2 Choose a pair of individuals as parents from the population randomly.
step 3 Make a certain number of children by a crossover.
step 4 Select the best individual out of the family, i.e., the parents and their children.
step 5 Choose an individual other than the best out of the family randomly, according to fitness-based or ranking-based roulette-wheel selection.
step 6 Replace the parents with the two selected individuals.
step 7 Iterate step 2 to step 6 until a certain condition is satisfied.

[Figure 2: two panels, MGG (upper) and X-MGG (lower). Horizontal axis: number of evaluations. Vertical axis: fitness value (logarithmic scale). Curves: ANDX ($\zeta_o$ = 0.4, 0.3, 0.2) in both panels and UNDX in the MGG panel.]

Fig. 2. The transition curves of the best fitness value on $F_{rosenbrock}$ (the average of 10 successful runs). The upper graph was obtained by the experiments with MGG, and the lower one with X-MGG.

In this model, evolutionary stagnation is avoided by selecting the best individual at step 4. Early convergence is prevented by the roulette-wheel selection at step 5.
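One generation of the MGG model (steps 2 to 6) might look like the following Python sketch. It assumes a minimization problem and ranking-based roulette odds proportional to 1/rank, as in Section III-B. The names `mgg_generation`, `rank_roulette` and `toy_crossover` are hypothetical, and `toy_crossover` is only a scalar stand-in for UNDX or ANDX.

```python
import random

def rank_roulette(ranked, rng=random):
    """Pick one item from a best-first list with odds proportional to 1/rank."""
    weights = [1.0 / r for r in range(1, len(ranked) + 1)]
    return rng.choices(ranked, weights=weights, k=1)[0]

def mgg_generation(population, fitness, crossover, n_children, rng=random):
    """Run steps 2 to 6 of MGG once, in place (smaller fitness is better)."""
    i, j = rng.sample(range(len(population)), 2)          # step 2: random parents
    parents = [population[i], population[j]]
    children = []
    while len(children) < n_children:                     # step 3: make children
        children.extend(crossover(parents[0], parents[1], rng))
    family = sorted(parents + children[:n_children], key=fitness)
    best = family[0]                                      # step 4: best of family
    other = rank_roulette(family[1:], rng)                # step 5: roulette on the rest
    population[i], population[j] = best, other            # step 6: replace the parents
    return best

def toy_crossover(a, b, rng):
    """Hypothetical stand-in for UNDX/ANDX on scalars: symmetric Gaussian pair."""
    mid, spread = (a + b) / 2.0, abs(b - a) / 2.0
    z = rng.gauss(0.0, spread)
    return [mid + z, mid - z]
```

Since the best family member always survives and untouched individuals stay in the population, the best fitness in the population never gets worse from one generation to the next.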

B. ANDX with X-MGG

ANDX has been designed for searching the following promising search regions: (i) near a promising solution candidate (a parent), (ii) the inner region of a parent distribution, (iii) independent regions of the coordinate systems, and (iv) a region satisfying “preservation of statistics”. The examples in Fig. 3 illustrate how ANDX searches these promising search regions.

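ANDX produces children near each parent. Let $\mathbf{g}$ be the center of mass of $m$ parents, $\mathbf{p}^{(1)}, \ldots, \mathbf{p}^{(m)}$, and let $\mathbf{d}^{(j)} = \mathbf{p}^{(j)} - \mathbf{g}$. A child $\mathbf{c}^{(s)}$ is produced near $\mathbf{p}^{(s)}$, chosen from the parents randomly, by the following equation:

\[ \mathbf{c}^{(s)} = \mathbf{p}^{(s)} - \sum_{j \neq s}^{m} \mathbf{d}^{(j)} \omega^{(j)} + \mathbf{q} . \tag{1} \]

The symbol $\sum_{j \neq s}^{m}$ denotes summation over the index $j = 1, \ldots, m$ except for $j = s$. Each $\omega$ is an asymmetrical normal random variable defined as:

\[ \omega \sim \begin{cases} 2\zeta_o N(0, \sigma_o^2) & (x \geq 0) \\ 2\zeta_i N(0, \sigma_i^2) & (x < 0) \end{cases} \tag{2} \]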

where $\zeta_o + \zeta_i = 1$, $\sigma_o = \zeta_o \sigma$, and $\sigma_i = \zeta_i \sigma$. $\zeta_o$ and $\sigma$ are the parameters of ANDX. The former is set to satisfy $\zeta_o < \zeta_i$ in order to prefer the promising search region (ii). When the equation

\[ \sigma = \frac{2\sqrt{2\pi}\,(2\zeta_o - 1)}{2(m - 2)(2\zeta_o - 1)^2 - \pi(m - 1)(3\zeta_o^2 - 3\zeta_o + 1)} \tag{3} \]

is satisfied, the children are placed in the promising search region (iv). Examples of p.d.f.s produced by such parameters are shown in Fig. 4. Let $W$ be the subspace spanned by $\mathbf{d}^{(1)}, \ldots, \mathbf{d}^{(m)}$, and let $W^{\perp}$ be its orthogonal complement. The $\mathbf{q}$ in $W^{\perp}$ can be defined, for example, in the same manner as in UNDX.

[Figure 3: four panels: $\mathbf{c}^{(1)}$, $\mathbf{c}^{(2)}$, $\mathbf{c}^{(3)}$, and merged.]

Fig. 3. Examples of the probability density distribution of ANDX on a 2-dimensional search space. The parameter values are $\zeta_o = 0.3$ and $\sigma \simeq 1.0$. The parent vectors are $\mathbf{p}^{(1)} = (4, 1)$, $\mathbf{p}^{(2)} = (-1, 7)$ and $\mathbf{p}^{(3)} = (-3, -8)$ [6].

2007 IEEE Congress on Evolutionary Computation (CEC 2007)
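Equations (2) and (3) can be checked with a short Python sketch. It reads Eq. (2) as a two-sided density: a half-normal with standard deviation $\sigma_o$ carrying probability mass $\zeta_o$ on the non-negative side, and a half-normal with standard deviation $\sigma_i$ carrying mass $\zeta_i$ on the negative side. That reading, and the function names `andx_sigma` and `sample_omega`, are assumptions made for illustration.

```python
import math
import random

def andx_sigma(m, zeta_o):
    """Sigma from Eq. (3) for m parents and outer weight zeta_o (below 0.5)."""
    a = 2.0 * zeta_o - 1.0
    numerator = 2.0 * math.sqrt(2.0 * math.pi) * a
    denominator = (2.0 * (m - 2) * a * a
                   - math.pi * (m - 1) * (3.0 * zeta_o ** 2 - 3.0 * zeta_o + 1.0))
    return numerator / denominator

def sample_omega(zeta_o, sigma, rng=random):
    """Draw one omega from the two-sided density of Eq. (2)."""
    zeta_i = 1.0 - zeta_o
    if rng.random() < zeta_o:
        return abs(rng.gauss(0.0, zeta_o * sigma))      # omega >= 0, std sigma_o
    return -abs(rng.gauss(0.0, zeta_i * sigma))         # omega < 0, std sigma_i
```

With $m = 3$ and $\zeta_o = 0.3$, `andx_sigma` gives a value very close to 1.0, in line with the $\sigma \simeq 1.0$ quoted for Fig. 3.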

The eXclusive-MGG (X-MGG) [6] model was introduced in order to equip MGG with an exclusive replacement strategy. To avoid premature convergence of the population in the early stages, the population should contain a variety of gene characteristics. For this purpose, only one individual should survive the competition within a family, a group that consists of parents and their children, because the individuals that belong to the same family have similar gene characteristics. In this selection model, only one individual is allowed to survive from each subfamily derived by dividing the family. Each subfamily consists of a parent and the children that are especially similar to that parent. The procedure of a generation of this model using ANDX is described as follows:

step 1 Choose $m + 1$ parents, $P = \{\mathbf{p}^{(1)}, \ldots, \mathbf{p}^{(m)}\}$ and $\mathbf{p}^{(m+1)}$, randomly from the population.
step 2 Produce children, $C = \{C^{(1)}, \ldots, C^{(m)}\}$, where $C^{(j)}$ is a group including $\mathbf{c}^{(j)}$ produced near $\mathbf{p}^{(j)}$, using ANDX. Let $S = \{S^{(1)}, \ldots, S^{(m)}\}$, where $S^{(j)}$ is a subfamily that consists of $\mathbf{p}^{(j)}$ and $C^{(j)}$.
step 3 Select the best individual $I_{best}$ in $S$. Let $S^{(best)}$ be the $S^{(j)}$ that includes $I_{best}$, and let $\mathbf{p}^{(best)}$ be the parent that belongs to $S^{(best)}$.
step 4 Select one individual $I_{roulette}$ by fitness-based or ranking-based roulette-wheel selection from the group $S^{(rest)}$ derived from $S$ by removing $S^{(best)}$. Let $\mathbf{p}^{(roulette)}$ be the parent that belongs to the $S^{(j)}$ including $I_{roulette}$.
step 5 Replace $\mathbf{p}^{(best)}$ with $I_{best}$.
step 6 Replace $\mathbf{p}^{(roulette)}$ with $I_{roulette}$.
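The selection part of the procedure above (steps 3 to 6) can be sketched as follows. The sketch assumes a minimization problem, at least two subfamilies, and ranking-based roulette odds proportional to 1/rank. The function name `xmgg_select` and the list-based data layout are hypothetical choices for illustration.

```python
import random

def xmgg_select(subfamilies, fitness, rng=random):
    """Steps 3 to 6 of X-MGG (smaller fitness is better).

    subfamilies[j] is a list whose first element is parent p(j) and whose
    remaining elements are the children produced near it.
    Returns one survivor per parent slot.
    """
    survivors = [s[0] for s in subfamilies]               # parents stay by default
    # step 3: best individual over all subfamilies, and the subfamily it is in
    best_j = min(range(len(subfamilies)),
                 key=lambda j: min(fitness(x) for x in subfamilies[j]))
    i_best = min(subfamilies[best_j], key=fitness)
    # step 4: ranking-based roulette over S(rest), i.e. all other subfamilies
    rest = [(x, j) for j, s in enumerate(subfamilies) if j != best_j for x in s]
    rest.sort(key=lambda pair: fitness(pair[0]))
    weights = [1.0 / r for r in range(1, len(rest) + 1)]
    i_roulette, roulette_j = rng.choices(rest, weights=weights, k=1)[0]
    survivors[best_j] = i_best                            # step 5
    survivors[roulette_j] = i_roulette                    # step 6
    return survivors
```

The exclusive replacement is visible in the data flow: the roulette candidate can never come from the subfamily that already supplied the best individual.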

III. THEORETICAL ANALYSIS

First, this section discusses the selection pressures of the MGG model and the X-MGG model. Second, the expected values of the selected children vectors are introduced. Finally, the enlargement and acceleration of a population are quantified. This paper adopts ranking-based selection as the roulette-wheel selection of MGG and X-MGG.

A. Terms

The best evaluated member in a family is called the best member, and a member selected by a roulette-wheel selection is called the roulette selection member.

B. The expected ranking in the ranking-based roulette-wheel selection

Let us consider a roulette-wheel selection that chooses out of a population consisting of $k$ individuals. The actual ranking to be selected is unknown because it is determined by a random number generator. Thus, we consider the expected ranking, $R(k)$.

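The survival odds of the first, the second, the third, ..., and the worst member are assigned as $\frac{c_k}{1}, \frac{c_k}{2}, \frac{c_k}{3}, \ldots, \frac{c_k}{k}$, where $c_k$ is a constant that satisfies $\sum_{r=1}^{k} \frac{c_k}{r} = 1$. Hence,

\[ c_k \cdot \left( \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{k-1} + \frac{1}{k} \right) = 1 \]

and

\[ c_{k-1} \cdot \left( \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{k-1} \right) = 1 \]

give

\[ c_k = \frac{k \cdot c_{k-1}}{k + c_{k-1}} . \tag{4} \]

An arbitrary $c_k$ can be derived from this recurrence formula and $c_1 = 1$. Hence, the expected ranking is estimated as

\[ R(k) = \sum_{r=1}^{k} r \cdot \frac{c_k}{r} = c_k \cdot k . \tag{5} \]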
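The recurrence (4) and the expected ranking (5) are easy to evaluate numerically. The following Python sketch (the function name `expected_rank` is chosen here for illustration) iterates Eq. (4) from $c_1 = 1$ and returns $R(k) = c_k \cdot k$.

```python
def expected_rank(k):
    """Expected rank R(k) = k * c_k, with c_k from the recurrence of Eq. (4)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    c = 1.0                                   # c_1 = 1
    for size in range(2, k + 1):
        c = size * c / (size + c)             # Eq. (4): c_k from c_{k-1}
    return k * c                              # Eq. (5)
```

For $k = 51$ and $k = 26$ this gives values close to the 11.29 and 6.75 that appear in the numerical example of Section III-G.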

[Figure 4: four panels: (a) ANDX ($\zeta_o = 0.2$), (b) ANDX ($\zeta_o = 0.3$), (c) ANDX ($\zeta_o = 0.4$), (d) UNDX. Horizontal axis from −4 to 4. Vertical axis: probability density, from 0.0 to 0.6.]

Fig. 4. Examples of p.d.f.s on a 1-dimensional search space. The parents are located at $p^{(1)} = -1$ and $p^{(2)} = 1$, respectively. The parameters are set to satisfy (3). The dark filled curves indicate the p.d.f.s of $c^{(2)}$ only.

C. Normalization

The ranking 10 is the worst in a 10-member group, but it is not bad in a 20-member group. This example shows that a raw ranking value is often hard to handle. This paper adopts a normalized ranking value to measure the quality of an individual. The value $Q(r, k)$ is defined as:

\[ Q(r, k) = \frac{r - 0.5}{k} , \tag{6} \]

where $k$ is the number of members within the focused group, and $r$ is the raw ranking value. For example, the members of a group that consists of four members are given the normalized rankings 0.125, 0.375, 0.625 and 0.875, respectively. The values correspond to uniformly distributed positions.

D. Selection Pressures

1) MGG model: The normalized ranking of the best member must be

\[ Q_{best} = Q(1, t) = \frac{0.5}{t} , \tag{7} \]

where $t$ is the family size. Because the roulette-wheel selection is performed among the remaining members, the raw ranking value of the roulette selection member must be $1 + R(t - 1)$. Thus, the corresponding normalized ranking is derived from (6) as

\[ Q^{MGG}_{roulette} = \frac{R(t - 1) + 0.5}{t} . \tag{8} \]

2) X-MGG model: The $Q_{best}$ of X-MGG is the same as that of MGG. The roulette-wheel selection is performed among the remaining subfamilies, $S^{(rest)}$. The number of eligible members is expressed as

\[ t^{(rest)} = \frac{t(m - 1)}{m} . \tag{9} \]

Thus, the normalized ranking in $S^{(rest)}$ is derived as:

\[ Q^{X-MGG}_{roulette} = \frac{R(t^{(rest)}) - 0.5}{t^{(rest)}} . \tag{10} \]
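Equations (6) to (10) combine into a few one-line functions. The Python sketch below uses the closed form $R(k) = k / (1 + 1/2 + \cdots + 1/k)$, which follows from the normalization condition on $c_k$, and assumes that $t(m - 1)/m$ is an integer. The function names are illustrative.

```python
def expected_rank(k):
    """R(k) = k * c_k, where c_k is the reciprocal of the k-th harmonic number."""
    return k / sum(1.0 / r for r in range(1, k + 1))

def normalized_rank(r, k):
    """Eq. (6): normalized ranking of raw rank r in a group of k members."""
    return (r - 0.5) / k

def q_best(t):
    """Eq. (7): the best member of a family of size t."""
    return normalized_rank(1, t)

def q_roulette_mgg(t):
    """Eq. (8): raw rank 1 + R(t - 1) within the whole family of size t."""
    return normalized_rank(1 + expected_rank(t - 1), t)

def q_roulette_xmgg(t, m):
    """Eq. (10), with t_rest from Eq. (9); assumes t * (m - 1) is divisible by m."""
    t_rest = t * (m - 1) // m
    return normalized_rank(expected_rank(t_rest), t_rest)
```

For a family size of 52 and $m = 2$, these functions return roughly 0.0096, 0.23 and 0.24, so the two selection models exert almost the same nominal pressure on the roulette selection member; the difference lies in which part of the family that pressure is applied to.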

E. Expected Position

For simplicity of discussion, the population size is set to two. The initial positions of the two individuals are −1 and 1. The objective function is defined by $F(x) = x$. This optimization is treated as a one-dimensional maximization problem. Consequently, the optimum $x = \infty$ exists far from the initial population. Let $G(x)$ be the integral of the normal distribution, $\int_{x}^{\infty} N(0, 1)$. The symbol $\mathrm{PDF}(\text{crossover name})$ expresses the p.d.f. of a crossover operator. $x_Q$ is the position $x$ that satisfies $\int_{x}^{\infty} \mathrm{PDF}(\text{crossover name}) = Q$.

1) UNDX with MGG: The $\mathrm{PDF}(\mathrm{UNDX})$ is just $N(0, 1)$. Therefore, $x_{Q_{best}}$ and $x_{Q^{MGG}_{roulette}}$ are directly obtainable from the table of the normal distribution.
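Instead of a printed table, the inverse of the normal distribution function can be used. The following Python sketch relies on `statistics.NormalDist` from the standard library; the helper name `upper_tail_point` and the choice of a family size of 52 (the value used later in the numerical example) are made here for illustration.

```python
from statistics import NormalDist

def upper_tail_point(q, dist=NormalDist(0.0, 1.0)):
    """Return x such that the area of dist above x equals q."""
    return dist.inv_cdf(1.0 - q)

# Selection pressures for a family of size t = 52, following Eqs. (5), (7), (8).
t = 52
expected_rank_51 = (t - 1) / sum(1.0 / r for r in range(1, t))
q_best = 0.5 / t
q_roulette = (expected_rank_51 + 0.5) / t

x_best = upper_tail_point(q_best)          # position of the best member
x_roulette = upper_tail_point(q_roulette)  # position of the roulette selection member
```

Under these assumptions the two positions should land close to the 2.34 and 0.75 listed for UNDX with MGG in Table II.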

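2) ANDX with MGG: Let $H_{all}(x)$ be the integral $\int_{x}^{\infty} \mathrm{PDF}(\mathrm{ANDX})$. This is divided into the following four parts:

the outside of the right part of the $\mathrm{PDF}(\mathrm{ANDX})$,

\[ G_{ro}(x) = \begin{cases} \zeta_o \int_{x}^{\infty} N(1, \sigma_o^2) = \zeta_o G(z_{ro}) & (1 \leq x) \\ \dfrac{\zeta_o}{2} & (x < 1) , \end{cases} \tag{11} \]

the inside of the right part of the $\mathrm{PDF}(\mathrm{ANDX})$,

\[ G_{ri}(x) = \begin{cases} 0 & (1 \leq x) \\ \zeta_i \int_{x}^{1} N(1, \sigma_i^2) = \zeta_i \int_{z_{ri}}^{0} N(0, 1) = \zeta_i (G(z_{ri}) - 0.5) & (x < 1) , \end{cases} \tag{12} \]

the inside of the left part of the $\mathrm{PDF}(\mathrm{ANDX})$,

\[ G_{li}(x) = \begin{cases} \zeta_i \int_{x}^{\infty} N(-1, \sigma_i^2) = \zeta_i G(z_{li}) & (-1 < x) \\ \dfrac{\zeta_i}{2} & (x \leq -1) , \end{cases} \tag{13} \]

the outside of the left part of the $\mathrm{PDF}(\mathrm{ANDX})$,

\[ G_{lo}(x) = \begin{cases} 0 & (-1 < x) \\ \zeta_o \int_{x}^{-1} N(-1, \sigma_o^2) = \zeta_o \int_{z_{lo}}^{0} N(0, 1) = \zeta_o (G(z_{lo}) - 0.5) & (x \leq -1) , \end{cases} \tag{14} \]

where

\[ z_{ro} = \frac{x - 1}{\sigma_o}, \quad z_{ri} = \frac{x - 1}{\sigma_i}, \quad z_{li} = \frac{x + 1}{\sigma_i}, \quad z_{lo} = \frac{x + 1}{\sigma_o} . \]

Thus, an arbitrary $x_Q$, including $x_{Q_{best}}$ or $x_{Q^{X-MGG}_{roulette}}$, that satisfies

\[ H_{all}(x) = G_{ro}(x) + G_{ri}(x) + G_{li}(x) + G_{lo}(x) = Q \tag{15} \]

is obtainable from the table of the normal distribution.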
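The four-part decomposition (11) to (15) can be evaluated and inverted numerically. The Python sketch below assumes two parents at −1 and 1 and takes $\sigma$ from Eq. (3) with $m = 2$, as in the setting of Fig. 4; whether Table II was computed with exactly this setting is an assumption. The function names and the bisection bounds are illustrative.

```python
import math

def upper_tail(z):
    """G(z): area of the standard normal distribution above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def andx_sigma_two_parents(zeta_o):
    """Eq. (3) with m = 2, where the first denominator term vanishes."""
    return (2.0 * math.sqrt(2.0 * math.pi) * (2.0 * zeta_o - 1.0)
            / (-math.pi * (3.0 * zeta_o ** 2 - 3.0 * zeta_o + 1.0)))

def h_all(x, zeta_o):
    """Eq. (15): area of PDF(ANDX) above x for parents at -1 and 1."""
    zeta_i = 1.0 - zeta_o
    sigma = andx_sigma_two_parents(zeta_o)
    s_o, s_i = zeta_o * sigma, zeta_i * sigma
    g_ro = zeta_o * upper_tail((x - 1.0) / s_o) if x >= 1.0 else zeta_o / 2.0         # Eq. (11)
    g_ri = 0.0 if x >= 1.0 else zeta_i * (upper_tail((x - 1.0) / s_i) - 0.5)          # Eq. (12)
    g_li = zeta_i * upper_tail((x + 1.0) / s_i) if x > -1.0 else zeta_i / 2.0         # Eq. (13)
    g_lo = 0.0 if x > -1.0 else zeta_o * (upper_tail((x + 1.0) / s_o) - 0.5)          # Eq. (14)
    return g_ro + g_ri + g_li + g_lo

def solve_x(q, zeta_o, lo=-50.0, hi=50.0):
    """Bisection for x with h_all(x) = q; h_all decreases from 1 to 0."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if h_all(mid, zeta_o) > q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With $Q_{best} = 0.5/52$, `solve_x` returns positions close to the $x_{Q_{best}}$ column of Table II for $\zeta_o$ = 0.2, 0.3 and 0.4, which supports the reconstruction of the equations above.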

3) ANDX with X-MGG: The $x_{Q_{best}}$ is obtained in the same manner as for ANDX with MGG. In the case that the best member is derived from the right asymmetrical normal distribution, the roulette-wheel selection is performed on the left one. In this case, $x_{Q^{X-MGG}_{roulette}}$ must satisfy

\[ H_{left}(x) = 2 \cdot (G_{li}(x) + G_{lo}(x)) . \tag{16} \]

In the other case, it must satisfy

\[ H_{right}(x) = 2 \cdot (G_{ro}(x) + G_{ri}(x)) . \tag{17} \]

Consequently, the expected value of $x_{Q^{X-MGG}_{roulette}}$ is obtained by the weighted average, where the weight is defined by the ratio of these cases, $\lambda = \frac{H_{right}(x_{Q_{best}})}{2 \cdot H_{all}(x_{Q_{best}})}$.

F. Convergence Speed

Let $g_k$ be the center of gravity of the parents after the $k$-th crossover operation. In cases of $k \geq 2$, the enlargement ratio of the distance between $g_k$ and $g_{k-1}$ per single crossover operation, $u_1$, must be

\[ u_1 = \frac{g_k - g_{k-1}}{g_{k-1} - g_{k-2}} = \frac{g_{k-1} - g_{k-2}}{g_{k-2} - g_{k-3}} = \cdots . \tag{18} \]

This equation and $g_0 = 0$ give

\[ g_k = \frac{g_1}{1 - u_1} (1 - u_1^{k}) . \tag{19} \]

When $k \to \infty$, $u_1^{k} \to 0$ if $u_1 < 1$. Hence, this result indicates that the population may converge to a certain point,

\[ g_\infty = \frac{g_1}{1 - u_1} , \tag{20} \]

before finding the optimum far from the initialized region. On the other hand, in the case of $1 < u_1$ and $0 < g_1$, the population would reach the optimum.
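The step from (18) to (19) is a geometric series. A short derivation, written out here for convenience and assuming $u_1 \neq 1$, is:

```latex
\begin{align*}
g_j - g_{j-1} &= u_1^{\,j-1}\,(g_1 - g_0) = u_1^{\,j-1}\, g_1 , \\
g_k &= \sum_{j=1}^{k} \left( g_j - g_{j-1} \right)
     = g_1 \sum_{j=0}^{k-1} u_1^{\,j}
     = \frac{g_1}{1 - u_1}\,\bigl( 1 - u_1^{\,k} \bigr) .
\end{align*}
```

The first line applies (18) repeatedly down to the first step, and the second line telescopes the differences using $g_0 = 0$.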

G. Numerical Example

An example case of $t = 52$ (adopted in [6]) is shown here. The selection pressures of MGG and X-MGG are

\[ Q_{best} = \frac{0.5}{52} \simeq 0.0096, \qquad Q^{MGG}_{roulette} \simeq \frac{11.29 + 0.5}{52} \simeq 0.23, \tag{21} \]

and

\[ t^{(rest)} = \frac{52}{2} = 26, \qquad Q^{X-MGG}_{roulette} = \frac{6.75 - 0.5}{26} \simeq 0.24 . \tag{22} \]

They give us Table II. Note that the values include a certain degree of computational error, such as roundoff errors and interpolation errors [8]. All computation was performed in the Java language, version "1.5.0_10". The real numbers were stored in Java's double type. The integral values were obtained by linear interpolation.

TABLE II
A NUMERICAL EXAMPLE CASE OF THEORETICAL ANALYSIS OF THE SEARCH PROCESS. THE FAMILY SIZE IS SET TO BE 52.

Selection   Crossover        xQbe    xQro    u1      g1      g∞
MGG         UNDX             2.34    0.75    0.79    1.54    7.53
MGG         ANDX (ζo=0.2)    2.33    0.82    0.75    1.57    6.40
MGG         ANDX (ζo=0.3)    2.07    0.86    0.61    1.46    3.72
MGG         ANDX (ζo=0.4)    1.90    0.93    0.49    1.41    2.76
X-MGG       ANDX (ζo=0.2)    2.33    0.91    0.71    1.62    5.57
X-MGG       ANDX (ζo=0.3)    2.07    0.54    0.77    1.30    5.58
X-MGG       ANDX (ζo=0.4)    1.90   -0.42    1.16    0.74    ∞

xQbe = x_{Qbest}, xQro = x_{Qroulette}.
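The last three columns of Table II appear to follow from the first two. With the initial parents at −1 and 1, the new center of gravity after the first crossover is the mean of the two survivors, and the tabulated $u_1$ matches half of their distance (the initial half-distance being 1). This reading of $u_1$ is an inference from the tabulated numbers rather than a formula stated in the text, and the function name `summarize` is illustrative.

```python
def summarize(x_best, x_roulette):
    """Derive (u1, g1, g_inf) from the two survivors of the first crossover."""
    g1 = (x_best + x_roulette) / 2.0       # center of gravity after one step (g0 = 0)
    u1 = (x_best - x_roulette) / 2.0       # new half-distance over initial half-distance 1
    # Eq. (20) applies only when u1 < 1; otherwise the center keeps moving outward.
    g_inf = g1 / (1.0 - u1) if u1 < 1.0 else float("inf")
    return u1, g1, g_inf
```

Small differences from the table in the second decimal place are expected, because the table was presumably computed from unrounded positions.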

IV. EMPIRICAL VERIFICATION

This section compares the purely mathematical search process with the actual optimization search process.

A. Conditions

Section III assumes several impractical conditions. In the experiments here, somewhat more realistic conditions are adopted. The objective function used is the Sphere function:

\[ F_{sphere} = \sum_{i=1}^{n} x_i^2 . \tag{23} \]

The optimum of this function is the origin. The individuals of the initial population were uniformly distributed in [9999, 10000]. Therefore, the population must travel a very long distance. The dimension $n$ was set to 100. In each generation, a set of two parents and one additional parent produces 50 children. The performance measures adopted were both the number of runs that found the optimum and the convergence speed. They were evaluated as the average of 30 trials.

B. Experimental Results and Discussions

The first experiments were performed under a condition similar to the theoretical condition; the population size is only three. This size is the smallest possible for UNDX and ANDX. In this case, only ANDX ($\zeta_o = 0.4$) with X-MGG reached the optimum and the others converged to a certain point. In the next case, where the population size is four, all methods except ANDX ($\zeta_o = 0.4$) with MGG reached the optimum. The transition curves in the case that the population size is five are shown in Fig. 5 (pop = 5). These graphs are similar to those in Fig. 2. This inversion phenomenon is successfully explained by the theoretical analysis. A larger $g_\infty$ is better for finding the outer optimum. In spite of the small number of generated children (a considerable sampling error must occur), this convergence speed order corresponds approximately to the ranking order based on the theoretical values.

[Figure 5: six panels, MGG (upper row) and X-MGG (lower row) for pop = 5, pop = 15 and pop = 60. Horizontal axis: number of evaluations. Vertical axis: fitness value (logarithmic scale, 1e+08 down to 1e-08). Curves: ANDX ($\zeta_o$ = 0.4, 0.3, 0.2) in all panels and UNDX in the MGG panels.]

Fig. 5. The transition curves of the best fitness value on $F_{sphere}$ (the average of 30 runs).

The primary difference between the cases of Fig. 2 and Fig. 5 (pop = 5) is the population size. The size 30 used on $F_{rosenbrock}$ is far larger. The author continued the experiments on $F_{sphere}$ with larger population sizes: 6, 9, 12, ..., 30. With some of these sizes, overlapping transition curves were observed. An example of such a case is shown in Fig. 5 (pop = 15). Why did the overlapping curves appear only on $F_{sphere}$? On $F_{rosenbrock}$, Sakuma and Kobayashi have demonstrated the typical search process of the population in detail [3]. In the early search stage, the population strongly tends to gather around the origin. Then the population marches along the curved sheer cliff toward the optimum. According to this demonstration, the main factor of that difference should be the extent of the population distribution. With a larger population size, the number of possible parent pairs for a crossover operation is much larger. Therefore, even if the distance between the surviving children is short, the distance between the parents for the next crossover operation may have enough range. This means that a larger population size weakens the influence of the distance between the surviving children. On $F_{sphere}$, the population distribution can easily enlarge if the crossover operator and the selection model allow it. On the other hand, on $F_{rosenbrock}$, the curved sheer cliff always prevents the population from marching rapidly.

When the population size is 60, the transition curve of ANDX ($\zeta_o = 0.4$) with X-MGG is the slowest, as shown in Fig. 5 (pop = 60). In such a case, the value of $g_1$ would be more important for enlarging the population. This experiment suggests that there is more than one important measure. The measures $u_1$, $g_1$ and $g_\infty$ would have their respective roles.

Finally, this paper qualitatively discusses the appropriateness of the combination of a crossover operator and a selection model. Through the above theoretical and empirical studies, we have learned an important guideline for designing an RCGA: “consider the total balance of the components.” The author had tried to discover only the relation between ANDX and MGG, or ANDX and X-MGG. However, we understand now that the appropriateness of the combination would be controlled by the population size parameter. Furthermore, the other parameter, the number of children produced per generation, also has an influence on the selection pressure, as shown in (6) to (10). A GA that consists of several highly independent elements should be a better designed GA. Otherwise, designing a GA would continue to be a kind of combinatorial optimization problem (hard work!).

V. CONCLUSIONS AND FUTURE WORKS

This paper theoretically analyzed an inversion phenomenon of the convergence velocities of ANDX with MGG and ANDX with X-MGG. The theoretical results corresponded to the empirical verification. The author has concluded that the main factor of the inversion phenomenon is the diversity change of the population. The diversity change is controlled not only by crossover operators and selection models but also by the population size parameter. The equations derived in the theoretical analysis indicated that the number of children produced per generation would be another factor that controls the selection pressure. Designing improved genetic operators that are highly independent is important future work. Comparison of ANDX with X-MGG against other RCGAs, such as those introduced in [9] and [10], is also future work.

ACKNOWLEDGMENT

This work was supported in part by the “Function and Induction research project” and by the Grant-in-Aid for Science Research (A), No. 17200020.

REFERENCES

[1] S. Tsutsui and D. E. Goldberg, “Search space boundary extension method in real-coded genetic algorithms,” Information Sciences, vol. 133, no. 3-4, pp. 229–247, 2001.

[2] I. Ono, H. Kita, and S. Kobayashi, “A robust real-coded genetic algorithm using unimodal normal distribution crossover augmented by uniform crossover: Effects of self-adaptation of crossover probabilities,” in Proceedings of the Genetic and Evolutionary Computation Conference 1999 (GECCO-99), July 1999, pp. 496–503.

[3] J. Sakuma and S. Kobayashi, “Extrapolation-directed crossover considering sampling bias in real-coded genetic algorithm (in Japanese),” Journal of Japanese Society for Artificial Intelligence, vol. 17, no. 6, pp. 699–707, 2002.

[4] H. Someya and M. Yamamura, “A robust real-coded evolutionary algorithm with toroidal search space conversion,” Soft Computing, vol. 9, no. 4, pp. 254–269, 2005.

[5] I. Ono and S. Kobayashi, “A real-coded genetic algorithm for function optimization using unimodal normal distribution crossover,” in Proceedings of the 7th International Conference on Genetic Algorithms (ICGA97), 1997, pp. 246–253.

[6] H. Someya, “Promising search regions of crossover operators for function optimization,” in Lecture Notes in Artificial Intelligence 4570 (Proceedings of The 20th International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems) (IEA/AIE 2007), June 2007, pp. 434–443.

[7] H. Satoh, M. Yamamura, and S. Kobayashi, “Minimal generation gap model for GAs considering both exploration and exploitation,” in Proceedings of IIZUKA’96, 1996, pp. 494–497.

[8] R. Mak, The Java Programmer’s Guide to Numerical Computing. Prentice Hall PTR, 2003.

[9] F. Herrera, M. Lozano, and J. L. Verdegay, “Tackling real-coded genetic algorithms: Operators and tools for behavioural analysis,” Artificial Intelligence Review, vol. 12, no. 4, pp. 265–319, 1998.

[10] F. Herrera, M. Lozano, and A. M. Sanchez, “A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study,” International Journal of Intelligent Systems, vol. 18, no. 3, pp. 309–338, 2003.
