
[IEEE 1998 IEEE International Conference on Evolutionary Computation Proceedings. IEEE World Congress on Computational Intelligence - Anchorage, AK, USA (4-9 May 1998)]



Building Block Filtering and Mixing Cees H.M. van Kemenade, CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands

Abstract- Problems involving high-order building blocks with unknown linkage are difficult to solve. Neither n-point crossover nor uniform crossover can mix high-order building blocks efficiently. A further complication arises when the optimal solution consists of many building blocks. In that case it is difficult to strike a balance between exploration and exploitation. We have developed a hybrid GA, the BBF-GA, to handle building blocks effectively and efficiently. This GA uses a three-stage approach. During the first stage a large number of rapidly converging GA's is used to explore the search-space. In the second stage the best individual of each GA is filtered to locate the (potential) building blocks present in this individual. The third stage consists of a GA that exploits these masked individuals by mixing them to obtain the global optimal solution. The BBF-GA performs well on a set of test-problems and is able to locate and mix more building blocks than the competitor GA's.

I. INTRODUCTION

A possible explanation for the working of the GA is given by the building block hypothesis as introduced by Goldberg [Gol89]:

“... so does a genetic algorithm seek near optimal performance through the juxtaposition of short, low-order, high-performance schemata, or building blocks.”

In this paper a set of optimization problems that adheres to this hypothesis is used as test-problems. For these problems the global optimal solution can be partitioned into a set of building blocks. These building blocks can be discovered independently and mixed afterwards.

Before applying a GA we have to decide on a representation for the individuals, where each individual describes a single solution to the problem we want to handle. An important issue is the linkage. Linkage is said to be tight if bits belonging to the same building block are next to each other on the chromosome, while loose linkage corresponds to a situation where bits belonging to the same building block are scattered over the chromosome.

Tightly linked building blocks can often be processed efficiently by a GA using a one-point or a two-point crossover operator. Problems having loose linkage are much more difficult to process using these crossover operators, as the probability that a loosely linked building block survives the crossover operation is relatively small. In some cases slightly better results can be obtained by using n-point crossover or uniform crossover, but neither of these two is very efficient in processing high-order building blocks. When a crossover operator is more efficient on tightly linked building blocks we say that this operator has a positional bias. A positional bias implies that the probability of two bits being taken from the same parent depends upon the (relative) position of these bits within the chromosome [ECS89]. Problems due to linkage have already been studied by Holland [Hol75], and the inversion

operator was proposed as a remedy. It has been shown that the inversion operator acts too slowly to be useful. Another approach to avoid linkage problems is taken in the (fast) Messy GA's [GDKH93], [GKD89], [Kar95], where a different representation and corresponding operators without positional bias are introduced.

The GA is often assumed to be an efficient method as it is able to find building blocks independently of one another and mix these building blocks afterwards. The efficiency of this mixing process depends upon the linkage, the efficiency of the crossover operator, and the life-time of individuals that contain these building blocks. Mixing of building blocks can be difficult for a GA [GDT93], [TG93]. One possible reason for these difficulties is that the efficiency of a GA decreases as the size of the building blocks increases. One approach to increase the capability of GA's to process such high-order building blocks is to measure correlations between actual bit-values and the fitness of the solution. The first GA that uses this approach is the GEMGA [Kar96]. Simultaneously, but independently, the building block filtering method [vK96] was developed. Both methods estimate fitness contributions of separate bits. The GEMGA basically uses this information to guide the evolution process. The building block filtering method uses the information to make an actual decomposition of a bit-string in order to extract the loci that belong to the same building block and their optimal values. In this paper we describe a hybrid GA that incorporates the building block filtering method.

In section II the building block filtering method will be described and extended. Section III describes the GA's used and section IV introduces the masked crossover. In section V the different components are combined in a hybrid GA, the BBF-GA. The test-problems are given in section VI, followed by the experimental results in section VII and the conclusions in section VIII.

II. FILTERING OF BUILDING BLOCKS

If we know which loci of an individual belong to a certain building block then we can process/mix this building block easily. If we do not have this information we would like to learn it. To do so we developed the building block filtering method. Given a specific individual, this filtering method locates a (small) set of bits that have a relatively large influence on the fitness of the individual. We assume that this set of bits corresponds to an optimal building block, or at least that a schema corresponding to these bits is a schema that mainly covers a single building block. We mark these bits and process them as a single unit when applying crossover. This way we can reduce the bias introduced by the choice of representation. The defining length of the schema that represents the optimal building block is

0-7803-4869-9/98 $10.00 © 1998 IEEE


Fig. 1. Example of one filtering step (measure dfitness → sort on dfitness → select significant bits → construct building block).

Fig. 2. Schematic representation of elitist recombination.

Fig. 3. Schematic representation of triple-competition selection.

not that important anymore and the schema is processed in one piece. More details about such a crossover operator will be given in section IV.

An example of the building block filtering method is shown in Figure 1. The method uses four steps. During the first step the fitness contribution, dfitness, of each of the bits is measured, resulting in a set of tuples. Next these tuples are ordered on dfitness. In the third step a truncation rule is used to select a subset of tuples, and in the final step a masked individual is created. The bit-string of this individual is exactly the same as the bit-string of the original individual, but a mask is added that indicates the most significant bits.

In the first step we measure the change of fitness, dfitness, when each of the separate bits in an individual is changed to the complementary value. So a 1-bit is changed to a 0-bit and a 0-bit to a 1-bit. We assume that the dfitness is an appropriate measure for the relative influence of the corresponding bit on the fitness of the complete individual. In the example in Figure 1 we see a bit-string of length 6. Let us assume that the main fitness contribution within this string is coming from a building block containing bits 1, 3 and 6, resulting in a fitness increase of +7 when the schema 0#1##1 is present. The dfitness of each bit is measured by flipping this bit and observing the change in fitness, and a set of tuples of type (position, dfitness) is created. Flipping bit 1, 3 or 6 will break the schema 0#1##1. As a result the positive fitness contribution +7 will be lost. In our example the dfitness-values of bits 1, 3, and 6 are respectively -8, -6, and -7.

During the third step we have to truncate the ordered sequence of tuples in order to select a set of most influential bits. The simplest approach is to mask a fixed number of tuples. A more complex approach would be to try to estimate the optimal number of bits to select. Given a filtered individual and a random test-individual, we transfer the bits from the filtered individual to the test-individual one by one, in the order determined by the second filtering step. So in case of the example shown in Figure 1, we transfer in sequence the bit-values of loci 1, 6, 3, 2, 5, and 4. We observe how the fitness of the test-individual changes during this process. The bit whose transfer increases the fitness of the test-individual most, and also results in a new fitness that is larger than the initial fitness of the test-individual, is used to estimate the truncation position. Assuming that the bit at ranked position r results in this largest increase and that the filtered individual and the test-individual have exactly the same bit-values from ranked position r + 1 to r + s, then the truncation point is chosen in the middle of this range at position r + s/2. We apply an upper bound on r as we would like to find building blocks of limited size only. Masking half of a bit-string would not make much sense. Application of this procedure with a single test-individual will only give a rough estimate



of the optimal truncation point. Therefore, we repeat this procedure a number of times using different random test-individuals, take the median of all obtained truncation points, and use this to select the bits that are masked.
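As a concrete illustration, the filtering step above can be sketched as follows. This is a minimal sketch, not the paper's implementation: the toy fitness function, the fixed mask size, and the tuple handling are assumptions added for the example.

```python
def dfitness(bits, fitness, i):
    """Change in fitness when bit i is flipped to its complementary value."""
    flipped = bits[:]
    flipped[i] ^= 1
    return fitness(flipped) - fitness(bits)

def filter_building_block(bits, fitness, mask_size=3):
    """One filtering step: measure dfitness per bit, sort the
    (position, dfitness) tuples, truncate at a fixed size, and return the
    original bit-string with a mask marking the most significant bits."""
    tuples = [(i, dfitness(bits, fitness, i)) for i in range(len(bits))]
    tuples.sort(key=lambda t: t[1])      # most harmful flips first
    significant = {i for i, _ in tuples[:mask_size]}
    return bits, [1 if i in significant else 0 for i in range(len(bits))]

# Toy fitness mimicking the paper's example: schema 0#1##1 contributes +7.
def example_fitness(b):
    return 7.0 if (b[0], b[2], b[5]) == (0, 1, 1) else 0.0

bits = [0, 1, 1, 0, 0, 1]                        # schema is present
_, mask = filter_building_block(bits, example_fitness)
print(mask)                                      # [1, 0, 1, 0, 0, 1]
```

Flipping any of the three schema bits destroys the +7 contribution, so exactly those loci receive the mask.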

III. ELITIST RECOMBINATION

As the basis of the hybrid GA described in the next section we use two other GA’s.

Elitist recombination [TG94] selects parents by doing a random pairing of individuals. As all parents have exactly the same probability of being selected, this corresponds to a uniform selection. Figure 2 shows how the next population Pt+1 is produced from the current population Pt. On the left we see the current population Pt, where each box represents a single individual. The values in the boxes denote the fitness of the corresponding individuals. An intermediate population Ps is generated by doing a random shuffle on Pt. Population Ps is partitioned into a set of adjacent pairs and for each pair the recombination operator is applied in order to obtain two offspring. Next, a competition is held amongst the two offspring and their two parents, and the two winners are transferred to the next population Pt+1. In the presented example one parent and one offspring are transferred to Pt+1. Elitism is used as parents can survive their own offspring. Elitist recombination has been shown to be more efficient than other GA's on a problem involving a high-order building block [vK97a].
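One generation of this scheme can be sketched as follows; the uniform crossover helper and the onemax usage are assumptions added for the example, not part of [TG94].

```python
import random

def uniform_crossover(p1, p2):
    """Each locus is swapped between the two offspring with probability 0.5."""
    c1, c2 = p1[:], p2[:]
    for i in range(len(p1)):
        if random.random() < 0.5:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

def elitist_recombination(pop, fitness):
    """One generation: random pairing (uniform selection), then a
    competition amongst each pair of parents and their two offspring;
    the two winners of each family enter the next population."""
    pop = pop[:]
    random.shuffle(pop)                     # intermediate population Ps
    next_pop = []
    for i in range(0, len(pop) - 1, 2):
        p1, p2 = pop[i], pop[i + 1]
        c1, c2 = uniform_crossover(p1, p2)
        family = sorted([p1, p2, c1, c2], key=fitness, reverse=True)
        next_pop.extend(family[:2])         # parents may survive their offspring
    return next_pop

# Usage on onemax: the best individual always wins its family competition,
# so the best fitness in the population can never decrease.
pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
for _ in range(40):
    pop = elitist_recombination(pop, sum)
print(max(sum(ind) for ind in pop))
```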

We also use the triple-competition selection scheme [vK97b] (where it was called rat-race selection). A schematic representation is given in Figure 3. When using this scheme the population is partitioned into a set of triples. Within each set of three individuals the two best performing individuals are allowed to create one offspring. This offspring replaces the third individual. This modified triple is then propagated to the next generation. So two-thirds of the individuals will be propagated unmodified to the next generation. The best individual will never get lost when using this selection scheme. It has been observed that triple-competition selection tends to result in relatively fast convergence of the population to a sub-optimal solution.
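Sketched in the same style (the one-point crossover helper is an assumption for the example):

```python
import random

def one_point_crossover(p1, p2):
    """Single random cut point; returns one offspring."""
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def triple_competition(pop, fitness):
    """One generation: partition the shuffled population into triples; the
    two best of each triple create one offspring, which replaces the third
    individual. Two-thirds of the population passes through unmodified."""
    pop = pop[:]
    random.shuffle(pop)
    next_pop = []
    for i in range(0, len(pop) - 2, 3):
        best2 = sorted(pop[i:i + 3], key=fitness, reverse=True)[:2]
        child = one_point_crossover(best2[0], best2[1])
        next_pop.extend(best2 + [child])
    return next_pop
```

The best individual always sits among the two winners of its triple, so it is never lost, while the constant replacement of the triple's loser drives fast convergence.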

IV. MIXING WITH MASKED UNIFORM CROSSOVER


Fig. 4. An example of the masked crossover

The filtering method described in section II produces a masked individual where the masked bits are considered to be a potential building block. Now we have to combine

different building blocks in order to find a globally optimal solution to the problem. If we know that the masks exactly locate the building blocks, then the search for the optimal solution is as simple as selecting an optimal set of non-overlapping building blocks. Possibly the masks will not be this accurate. A mask is likely to include bits that do not belong to the building block, or it might miss some of the bits that do belong to the building block. Therefore, we are going to use a genetic algorithm with a special crossover operator to perform this mixing task for us. First we are going to describe how to use the masks when performing crossover, and next we are going to discuss how the information present in the mask will be transferred to the offspring.

In case the parent individuals have disjoint masks the crossover is relatively simple. First we transfer the masked bits of the first parent to the offspring, next we transfer the masked bits of the second parent, and then we apply uniform crossover for the remaining bits. If there is an overlap between the masks of the two parents at a certain locus then we can discriminate between two cases. If both parents have the same bit-value at this locus then we can still use the same procedure. If the parents differ at this locus then we choose the parent that provides this bit at random.

The next step involves the creation of a mask for the offspring based on the two masks of the parents. A bit in the offspring is masked if it was masked by exactly one of the parents, or if it was masked by both of the parents and the bit-values of both parents at this location were equal. If both parents masked a bit but had a different value at the given location then the bit will not be masked in the offspring. As a result we get a kind of pruning of the masks.

Figure 4 shows an example of the application of the masked crossover. The first locus and the last locus are masked by exactly one parent, so the offspring will inherit the bit-value from the corresponding parent and the mask-bit of the offspring will be set for these loci. The third locus is masked by both parents, but the parents have non-matching bit-values, so an arbitrary parent is selected and the mask-bit is not propagated. The fourth locus is also masked by both parents, but this time the bit-values match, so it does not matter from which parent we take the value of this locus. In this case the corresponding mask-bit in the offspring is set.
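A single-offspring sketch of this operator; the function name and the two-list encoding of an individual plus its mask are illustrative assumptions.

```python
import random

def masked_crossover(b1, m1, b2, m2):
    """Masked uniform crossover: bits masked by exactly one parent are
    inherited from that parent; overlapping masks with equal values keep
    the mask, conflicting ones are resolved at random and pruned; all
    remaining loci undergo plain uniform crossover (mask unset)."""
    child, mask = [], []
    for x1, f1, x2, f2 in zip(b1, m1, b2, m2):
        if f1 and f2 and x1 == x2:
            child.append(x1); mask.append(1)                 # agreement: kept
        elif f1 and f2:
            child.append(random.choice((x1, x2))); mask.append(0)  # pruned
        elif f1:
            child.append(x1); mask.append(1)
        elif f2:
            child.append(x2); mask.append(1)
        else:
            child.append(random.choice((x1, x2))); mask.append(0)  # uniform
    return child, mask

# Loci: 0 masked by parent 1 only, 2 masked by both with a conflict,
# 3 masked by both with matching values.
child, mask = masked_crossover([1, 0, 0, 1], [1, 0, 1, 1],
                               [0, 1, 1, 1], [0, 0, 1, 1])
print(child[0], mask)          # 1 [1, 0, 0, 1]
```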

V. THE HYBRID BBF-GA

When developing GA's it is difficult to strike a balance

between exploration and exploitation. A GA with emphasis on exploration (searching the complete search-space) is likely to be slow; a GA with emphasis on exploitation (preservation and duplication of the currently best individuals) is likely to converge rapidly to sub-optimal solutions. The hybrid BBF-GA separates the exploration and exploitation tasks and handles building blocks effectively and efficiently.

The BBF-GA uses a three-stage approach. During the first stage a large number of rapidly converging GA’s is



used to explore the search-space. In the second stage the best individual of each GA is filtered in order to mask the building blocks present in this individual. The third stage consists of a GA that exploits these masked individuals by mixing them to obtain the global optimal solution.

During the first stage we explore the complete search space by running a large number of GA's with a small population and triple-competition selection. Such GA's usually converge fast to a local optimum containing only a few high-order building blocks. GA's with a small population are not reliable optimizers, and different GA's are likely to converge to many different local optima. This is exactly the behavior we like, as the first stage should explore the complete search space. The best individual of each GA is passed to the next stage.

During the second stage we try to learn something about the linkage of the bits within a specific individual. The filtering method is applied to each of the individuals and a population of masked individuals is created.

During the third stage we try to exploit all the information we gathered in the previous two stages. A single GA with the masked uniform crossover operator is run. The initial population consists of all the masked individuals produced during stage two. For this GA we use elitist recombination. This GA is not very fast, but it is effective at mixing building blocks.

VI. TEST-PROBLEMS

To study the effectiveness of the filtering and the mixing

we introduce two scalable test-problems. The number of building blocks m can be adjusted and the size of the optimal building blocks d is adjustable. A partition can be described by means of a set of non-overlapping schemata such as ***f**fff*f*, where * is the wild-card symbol and f represents a fixed allele (either 0 or 1). For both test-problems we are considering, the global optimum can be partitioned into a set of m order-d building blocks.

The first test-problem is based on the deceptive trap-functions [Gol89], [DG94]. We use the following formula to compute the fitness contribution of a single part b, where u(b) denotes the number of 1-bits in that part:

    f_o(b) = a               if u(b) = d
    f_o(b) = (d - u(b)) / d  otherwise

Here a > 1. The global optimum of the deceptive part contains only 1-bits, which results in a fitness contribution of a.

Based on this building block we can build a scalable test-problem by concatenating m of these order-d building blocks, whose fitness contribution is determined by f_o(b). The fitness of a complete individual is computed as the sum of the contributions of all its parts divided by m·a, so the fitness ranges from 0 to 1. A solution to this problem can be coded in a straightforward manner in a bit-string of length l = m × d. The actual linkage of the bits belonging to the same part is assumed to be unknown, so we have a problem with loose linkage.
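The scaled concatenated trap function can be written directly from the formulas above. The contiguous coding of parts here is only for illustration, since the paper assumes the linkage of the bits is unknown.

```python
def trap_part(part, a=1.5):
    """Fitness contribution f_o(b) of one order-d part; u is the
    number of 1-bits in the part."""
    d, u = len(part), sum(part)
    return a if u == d else (d - u) / d

def deceptive_fitness(bits, d, a=1.5):
    """Sum of the m part contributions, divided by m*a so that the
    fitness ranges from 0 to 1."""
    m = len(bits) // d
    return sum(trap_part(bits[i*d:(i+1)*d], a) for i in range(m)) / (m * a)

print(deceptive_fitness([1] * 12, d=4))   # global optimum -> 1.0
print(deceptive_fitness([0] * 12, d=4))   # deceptive attractor -> 2/3
```

Note the deception: within a part, every extra 1-bit lowers the contribution until the part is completely filled with 1-bits, at which point it jumps to a.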

Typical properties of this first test-problem are that the first-order statistics for each bit-position are deceptive and

that the different parts can be optimized completely independently of one another. Therefore a second test-problem, the bb-NK problem, is introduced that does not have either of these properties, but does have building blocks. As a basis for the problem we use NK-landscapes [Kau93]. These NK-landscapes have a single parameter k that takes a value in the range 0 to l − 1. The fitness contribution of each bit is dependent upon the values of all bits in a neighborhood of size k. The fitness of the complete string is computed as the average fitness-contribution of all bits. Two types of NK-landscape exist, based on the way the neighbors are selected. The first type uses nearest-neighbor interaction, so each bit is dependent on its k nearest other bits. Hence, the expected correlation between bits decreases with distance. The second type of NK-landscape uses a randomly selected neighborhood. In that case the expected correlation between bits is independent of their relative position on the bit-string, and bits that are far apart will usually be much more correlated than in case of the nearest-neighbor interaction.

In order to build a test-problem with m building blocks of size d we take an NK-landscape of length l = m × d with a random neighborhood, and we select a random partition of our bit-strings into m parts. The fitness contribution of a bit is 1.0 if the part that the bit belongs to is completely filled with 1-bits; otherwise the contribution of the bit is determined by the underlying NK-landscape. The fitness of a complete bit-string is computed as the average fitness-contribution of all the bits. The optimal solution is again a bit-string consisting of only 1-bits, having fitness 1.0. The bb-NK problem does have a set of independent building blocks of order d. The optimal building blocks are completely independent of one another, but if an optimal building block is not present within a certain part then we have non-linear interactions that cross the boundaries of the partitions, so this problem is not separable.
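A sketch of the bb-NK construction under stated assumptions: the fitness tables are built lazily (avoiding 2^(k+1) entries per bit up front) and the neighborhoods are drawn with `random.sample`; the exact table layout of [Kau93] may differ.

```python
import random

def make_bb_nk(m, d, k, seed=0):
    """Build a bb-NK fitness function: a random-neighborhood NK-landscape
    of length l = m*d, overlaid with a random partition into m parts of
    size d; a bit contributes 1.0 once its part is filled with 1-bits."""
    l = m * d
    rng = random.Random(seed)
    neighbors = [rng.sample([j for j in range(l) if j != i], k)
                 for i in range(l)]
    tables = [{} for _ in range(l)]   # lazy: (bit, neighborhood) -> [0, 1)
    loci = list(range(l))
    rng.shuffle(loci)                 # random, possibly loose, partition
    part_of = {}
    for p in range(m):
        part = loci[p * d:(p + 1) * d]
        for i in part:
            part_of[i] = part

    def fitness(bits):
        total = 0.0
        for i in range(l):
            if all(bits[j] == 1 for j in part_of[i]):
                total += 1.0          # optimal building block is present
            else:
                key = (bits[i],) + tuple(bits[j] for j in neighbors[i])
                total += tables[i].setdefault(key, rng.random())
        return total / l              # average contribution of all bits
    return fitness

f = make_bb_nk(m=3, d=4, k=3)
print(f([1] * 12))                    # all-ones optimum -> 1.0
```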

VII. EXPERIMENTS

For the BBF-GA we use a population size of 24 during

the first stage. The triple-competition selection scheme is run for 12 generations on this population and the best individual is passed to the second stage, the building block filtering. Two variants of building block filtering have been investigated. The first variant uses a fixed bound for the number of bits that are going to be selected in the mask. The number of selected bits is set to 8. The second variant uses a flexible number of bits that is determined by the procedure given at the end of section II. During the third stage elitist recombination with masked crossover is applied for 100 generations on a population of 100 masked individuals.

Elitist recombination using uniform crossover is also applied directly to the problem, with a population of 300 individuals that is allowed to converge for 100 generations. Furthermore, a generational GA with tournament selection is applied with tournament-size 2, population size 300, 100 generations, and crossover always applied.

For the deceptive problem we use a = 1.5 and for the bb-


Fig. 5. The average number of optimal building blocks in the best solution using different optimization methods for the deceptive problem (left) and the bb-NK problem (right).

Fig. 6. The average fitness of the best solution (left) and the average fraction of optimal building blocks as a function of the total number of building blocks (right) for the bb-NK problem.

NK problem we use a random neighborhood of size k = 10. All results shown are averaged over 30 independent runs. The BBF-GA with a fixed bound for the number of bits to mask uses 28,000 fitness evaluations per experiment, while the BBF-GA with flexible bounds uses approximately 10,000 additional evaluations to determine the bound during the filtering stage. The generational GA and elitist recombination use 30,000 fitness evaluations per experiment.

A set of 44 different experiments (experiment I) was performed for varying orders of the building blocks d and varying numbers of building blocks m. Each experiment was repeated 30 times. Figures 5 and 6 (left graph) show the average properties of the overall best individual. In case of the BBF-GA and elitist recombination this individual is present in the last population, but in case of the generational GA this individual might be lost, and we are looking at the off-line performance. For m we have taken the smallest value such that l = m × d ≥ 60, so for d = 9 we have m = 7, such that the size of an individual becomes 63 bits. Figure 5 shows the average fraction of building blocks

present in the best solution for the deceptive problem (left) and the bb-NK problem (right). On both problems the BBF-GA using a flexible bound performs best, followed by the BBF-GA with fixed bound, the elitist recombination, and the generational GA, in that order. The flexible bound selection for the BBF-GA performs well on the deceptive problem, while on the bb-NK problem the results are only slightly better than for the BBF-GA with a fixed bound. The left graph of Figure 6 shows the average fitness of the best solution as a function of the order of the optimal building block for the bb-NK problem. It is interesting to see that the generational GA outperforms elitist recombination for d > 12 even though it does not find any of the optimal building blocks, as can be seen in the right graph of Figure 5. Recall that these graphs show the properties of the overall best individual, so it might be the case that the individual is not present in the final population of the generational GA and that the generational GA mainly does a random search.

A second set of 32 experiments (experiment II) was performed where the size of the building blocks was fixed at d = 10 while the number of building blocks was varied. Again each experiment was repeated 30 times. The right graph of Figure 6 shows the results for the bb-NK problem. Again we see that the BBF-GA with flexible bound performs best while the generational GA performs worst. The performance of the BBF-GA's degrades less when increasing the number of building blocks than that of the elitist recombination. The fraction of building blocks in the best solution is low and nearly constant for the generational GA.

VIII. CONCLUSIONS

The building block filtering method is able to discover parts of high-order building blocks even in quasi-random landscapes. Using this filtering method we create masked individuals, which are processed with a special crossover operator. This crossover operator uses the mask to process a set of bits, assumed to be relatively important, in one piece, while the non-masked bits are processed by a uniform crossover. The masked crossover also computes a new mask for the offspring based on the masks and the bit-values of both parents.
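A minimal sketch of such a mask-respecting crossover on bit-string individuals may clarify the idea. Note that the donor-selection step and the offspring-mask rule below are our own illustrative assumptions; this section does not spell out the exact rules used by the BBF-GA:

```python
import random

def masked_crossover(p1, p2, m1, m2, rng=random):
    """Illustrative mask-respecting crossover (not the paper's exact operator).

    p1, p2: parent bit lists; m1, m2: boolean masks marking presumed
    building-block loci of the corresponding parent."""
    n = len(p1)
    # Pick one parent to donate its masked bits as a single unit.
    if rng.random() < 0.5:
        donor, dmask = p1, m1
    else:
        donor, dmask = p2, m2
    child = []
    for i in range(n):
        if dmask[i]:
            # Masked (presumed building-block) bits survive in one piece.
            child.append(donor[i])
        else:
            # Non-masked bits undergo ordinary uniform crossover.
            child.append(p1[i] if rng.random() < 0.5 else p2[i])
    # Hypothetical offspring-mask rule: a locus stays masked when it was
    # masked in either parent and both parents agree on its value.
    cmask = [(m1[i] or m2[i]) and p1[i] == p2[i] for i in range(n)]
    return child, cmask
```

The essential property is that a masked block is inherited intact from one parent, so a high-order building block is not broken up the way uniform crossover would break it.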

The BBF-GA is a hybrid three-stage GA. During the first stage an exploration of the search space is performed. The second stage locates a set of potential building blocks, and the third stage tries to exploit these building blocks by mixing them in order to generate the optimal solution. The BBF-GA outperformed its competitors and scales better when either the size or the number of building blocks in the problem is increased.

We also developed filtering methods based on comput- ing first-order statistics for each locus separately and on computing linkage disequilibria between pairs of loci. Both methods performed very well on the deceptive problem but performed much worse on the more irregular bb-NK prob- lem.

REFERENCES

[DG94] K. Deb and D.E. Goldberg. Sufficient conditions for deceptive and easy binary functions. Annals of Mathematics and Artificial Intelligence, 10:385-408, 1994.

[ECS89] L.J. Eshelman, R.A. Caruana, and J.D. Schaffer. Biases in the crossover landscape. In Proceedings of the Third International Conference on Genetic Algorithms, pages 10-19, 1989.

[GDKH93] D.E. Goldberg, K. Deb, H. Kargupta, and G. Harik. Rapid, accurate optimization of difficult problems using the fast messy genetic algorithms. In Proceedings of the Fifth International Conference on Genetic Algorithms, pages 56-64, 1993.

[GDT93] D.E. Goldberg, K. Deb, and D. Thierens. Towards a better understanding of mixing in genetic algorithms. Journal of the Society for Instrumentation and Control Engineers, SICE, 32(1):10-16, 1993.

[GKD89] D.E. Goldberg, B. Korb, and K. Deb. Messy genetic algorithms: Motivation, analysis, and first results. Complex Systems, 3(5):493-530, 1989.

[Gol89] D.E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, 1989.

[Hol75] J.H. Holland. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. The University of Michigan Press/Longman Canada, 1975.

[Kar95] H. Kargupta. SEARCH, polynomial complexity, and the fast messy genetic algorithm. Technical Report IlliGAL-95008, University of Illinois, October 1995.

[Kar96] H. Kargupta. Gene expression messy genetic algorithm. In Proceedings of the Third IEEE Conference on Evolutionary Computation, pages 814-819, 1996.

[Kau93] S.A. Kauffman. The Origins of Order. Oxford University Press, 1993.

[TG93] D. Thierens and D.E. Goldberg. Mixing in genetic algorithms. In S. Forrest, editor, Proceedings of the Fifth International Conference on Genetic Algorithms, pages 38-45. Morgan Kaufmann, 1993.

[TG94] D. Thierens and D.E. Goldberg. Elitist recombination: an integrated selection recombination GA. In Proceedings of the First IEEE Conference on Evolutionary Computation, pages 508-512. IEEE Press, 1994.

[vK96] C.H.M. van Kemenade. Explicit filtering of building blocks for genetic algorithms. In Proceedings of Parallel Problem Solving from Nature IV, pages 494-503, 1996.

[vK97a] C.H.M. van Kemenade. Cross-competition between building blocks, propagating information to subsequent generations. In Proceedings of the Seventh International Conference on Genetic Algorithms, pages 1-8, 1997.

[vK97b] C.H.M. van Kemenade. Modeling elitist genetic algorithms with a finite population. In Proceedings of the Third Nordic Workshop on Genetic Algorithms, pages 1-10, 1997.
