Let's get ready to rumble redux: Crossover versus mutation head to head on exponentially scaled...


DESCRIPTION

This paper analyzes the relative advantages of crossover and mutation on a class of deterministic and stochastic additively separable problems with substructures of non-uniform salience. The study assumes that the recombination and mutation operators know the building blocks (BBs) and effectively exchange or search among competing BBs. Facetwise models of convergence time and population sizing are used to determine the scalability of each algorithm. The analysis shows that for deterministic exponentially-scaled additively separable problems, BB-wise mutation is more efficient than crossover, yielding a speedup of Θ(l log l), where l is the problem size. For noisy exponentially-scaled problems, the outcome depends on whether scaling or noise is dominant. When scaling dominates, mutation is more efficient than crossover, yielding a speedup of Θ(l log l). When noise dominates, crossover is more efficient than mutation, yielding a speedup of Θ(l).


Let’s Get Ready to Rumble Redux: Crossover vs. Mutation Head to Head

on Exponentially-Scaled Problems

Kumara Sastry (1, 2) and David E. Goldberg (1)

(1) Illinois Genetic Algorithms Laboratory, (2) Materials Computation Center

University of Illinois at Urbana-Champaign, Urbana IL 61801

http://www.illigal.uiuc.edu

ksastry@uiuc.edu, deg@uiuc.edu

Supported by AFOSR FA9550-06-1-0096 and NSF DMR 03-25939.

Motivation

Great debate between crossover and mutation

When mutation works, it’s lightning quick

When crossover works, it tackles more complex problems

Compare crossover and mutation where both operators have access to same neighborhood information

Local search literature

Emphasis on good neighborhood operators [Barnes et al., 2003; Watson, 2003; Hansen et al., 2001]

Need for automatic induction of neighborhoods

Leads to adaptive time continuation operator [Lima et al., 2005, 2006, 2007]

Outline

Related work

Assumption of known or discovered linkage

Objective

Algorithm Description

Scalability analysis: Crossover vs. Mutation

Known or discovered linkage

Exponentially scaled additively-separable problem with and without Gaussian noise

Summary and Conclusions

Background

Empirical studies comparing crossover and mutation

Scalability of GAs and mutation-based hillclimbers [Mühlenbein, 1991, 1992; Mitchell, Holland, and Forrest, 1994; Baum, Boneh, and Garrett, 2001; Droste, 2002; Garnier, 1999; Jansen and Wegener, 2002, 2005]

Single GA run with large population vs. multiple GA runs with small population at fixed computational cost [Goldberg, 1999; Srivastava & Goldberg, 2001; Srivastava, 2002; Cantú-Paz & Goldberg, 2003; Luke, 2001; Fuchs, 1999]

Used fixed operators that don’t adapt linkage

Did not consider problems of bounded difficulty

Linkage and neighborhood information is critical

Known or Discovered Linkage

Assumption of known or induced linkage

Can use linkage-learning techniques

Linkage information is critical for selectorecombinative GA success

Provide the same information for mutation

Mutation searches in the building-block subspace

[Figure: exponential versus polynomial scalability (Pelikan, Ph.D. thesis, 2002)]

Algorithm Description

Selectorecombinative genetic algorithm

Population of size n

Binary tournament selection

Uniform building-block-wise crossover

Exchange BBs with probability 0.5

Selectomutative genetic algorithm

Start with a random individual

Enumerative BB-wise mutation

Consider BB partitions in arbitrary left-to-right order

Choose the best schema among the 2^k possible ones

[Figure: uniform BB-wise crossover example in which BBs #1 and #3 are exchanged]
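The two operators described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes individuals are lists of bits, that building blocks are contiguous and all of the same length k, and that higher fitness is better. The function names and the `fitness` callback are hypothetical.

```python
import random
from itertools import product


def bb_wise_uniform_crossover(parent_a, parent_b, k, rng, p_exchange=0.5):
    """Swap whole k-bit building blocks between two parents.

    Assumes contiguous, equally sized BBs (an illustrative simplification).
    """
    child_a, child_b = list(parent_a), list(parent_b)
    for start in range(0, len(parent_a), k):
        if rng.random() < p_exchange:
            # Exchange the entire building block, never individual bits.
            child_a[start:start + k], child_b[start:start + k] = (
                child_b[start:start + k],
                child_a[start:start + k],
            )
    return child_a, child_b


def enumerative_bb_mutation(individual, k, fitness):
    """Visit partitions left to right and keep the best of the 2^k schemata."""
    best = list(individual)
    best_fit = fitness(best)  # the initial solution is evaluated once
    for start in range(0, len(best), k):
        current = tuple(best[start:start + k])
        for schema in product((0, 1), repeat=k):
            if schema == current:
                continue  # already evaluated, so 2^k - 1 new evaluations
            candidate = best[:start] + list(schema) + best[start + k:]
            candidate_fit = fitness(candidate)
            if candidate_fit > best_fit:
                best, best_fit = candidate, candidate_fit
    return best, best_fit
```

With `p_exchange=0.5` the crossover matches the "exchange BBs with probability 0.5" setting on the slide; the mutation operator performs one evaluation for the starting point plus 2^k - 1 evaluations per partition.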

Crossover Versus Mutation: Uniform Scaling

Deterministic fitness: Mutation is more efficient

Noisy fitness: Recombination is more efficient

[Sastry & Goldberg, 2004]

Objective

Crossover and mutation both have access to same neighborhood information

Known or discovered linkage

Recombination exchanges building blocks

Mutation searches for the best BB in each partition

Compare scalability of crossover and mutation

Additively separable problems with exponentially-scaled BBs

With and without additive Gaussian noise

Where do they excel?

Derive, verify, and use facetwise models

Convergence time and population sizing
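To make the problem class concrete, here is a hedged sketch of an exponentially-scaled additively separable fitness function with optional additive Gaussian noise. The slides do not give the exact scaling or subfunction, so the power-of-two weights, the `all_ones` subfunction, and all names below are assumptions chosen for illustration.

```python
import random


def all_ones(block):
    """Hypothetical subfunction: 1 if every bit in the BB is set, else 0."""
    return 1.0 if all(block) else 0.0


def exp_scaled_fitness(bits, k, sub_fn, noise_sd=0.0, rng=None):
    """Sum of m subfunction values, the i-th weighted by 2^(m-1-i).

    The power-of-two weighting is an assumed form of exponential scaling.
    Gaussian noise with standard deviation noise_sd is added when requested.
    """
    m = len(bits) // k
    total = 0.0
    for i in range(m):
        block = bits[i * k:(i + 1) * k]
        total += 2 ** (m - 1 - i) * sub_fn(block)
    if noise_sd > 0.0:
        source = rng if rng is not None else random
        total += source.gauss(0.0, noise_sd)
    return total


# Three 2-bit BBs: the first and third are solved, the second is not.
print(exp_scaled_fitness([1, 1, 0, 1, 1, 1], 2, all_ones))  # 4*1 + 2*0 + 1*1 = 5.0
```

Setting `noise_sd=0.0` gives the deterministic case; a positive value gives the noisy case compared in the later slides.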

Scaling and Noise Cover Most Problems

Adversarial problem design [Goldberg, 2002]

[Figure: problem-difficulty space with axes Deception, Scaling, and Noise; labels recovered from the figure: Noisy BinInt, Fluctuating, P, R]

Convergence Time for Crossover: Deterministic Fitness Functions

Selection-Intensity based model [Rudnick, 1992; Thierens et al, 1998]

Derived for the BinInt problem

Applicable to additively-separable problems

Selection Intensity

Problem size (m·k)
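For readers unfamiliar with BinInt, the sketch below shows the standard binary-integer fitness, in which a bit string is read as an unsigned integer. It illustrates why salience is exponentially scaled: the most significant bit outweighs all less significant bits combined. The function name is illustrative.

```python
def binint(bits):
    """Interpret a list of bits (most significant first) as an integer."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value


# The leading bit alone beats every lower-order bit set together.
print(binint([1, 0, 0, 0]))  # 8
print(binint([0, 1, 1, 1]))  # 7
```

Because each position dominates everything to its right, selection resolves the most salient positions first, which is the behavior the selection-intensity model describes.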

Population Sizing for Crossover: Deterministic Fitness Functions

Domino convergence [Rudnick, 1992]

BB convergence in order of salience

Drift bound dictates population sizing

Drift time [Goldberg and Segrest, 1987]

Size the population such that:

Population size:

[Figure: domino convergence, proportion versus time, with BBs converging from most salient to least salient]

Scalability Analysis of Crossover & Mutation: Deterministic Fitness Functions

Selectorecombinative GA

Population size:

Convergence time:

Number of function evaluations:

Selectomutative GA

Initial solution is evaluated once

2^k − 1 evaluations in each of m partitions
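The evaluation count for the selectomutative GA follows directly from the two statements above: one evaluation for the initial solution plus 2^k − 1 evaluations in each of the m partitions. A small sketch, with a hypothetical function name, makes the arithmetic explicit.

```python
def mutation_evaluations(m, k):
    """Function evaluations used by enumerative BB-wise mutation.

    One evaluation of the initial solution, then 2^k - 1 new schemata
    evaluated in each of the m partitions.
    """
    return 1 + m * (2 ** k - 1)


print(mutation_evaluations(10, 4))  # 1 + 10 * 15 = 151
print(mutation_evaluations(20, 4))  # 1 + 20 * 15 = 301
```

For fixed k the count grows linearly in the number of building blocks m.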

Crossover vs. Mutation: Deterministic Fitness Functions

Speed-Up: Scalability ratio of mutation to that of crossover

Convergence Time for Crossover: Noisy Fitness Functions

Additive Gaussian noise with variance σ_N^2

Set proportional to maximum fitness variance

Scaling dominated:

Noise dominated:

Population Sizing for Crossover: Noisy Fitness Functions

Scaling dominated:

Noise dominated:

Scalability Analysis of Mutation: Noisy Fitness Functions

Fitness should be sampled to average out noise

What should the sample size, n_s, be?

BB-wise decision making [Goldberg, Deb, & Clark, 1992]

Square of the ordinate of a one-sided Gaussian deviate with specified error probability, α

Scalability Analysis of Crossover & Mutation: Noisy Fitness Functions

Selectorecombinative GA

Selectomutative GA

Fitness of each individual is sampled n_s times

2^k − 1 evaluations in each of m partitions
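The sampling step can be sketched as below. The slides do not reproduce the formula for n_s, so this example only shows the averaging itself: n_s independent noisy evaluations are averaged, which reduces the noise variance of the estimate by a factor of n_s. The noisy OneMax function and all names are hypothetical stand-ins for the noisy problem.

```python
import random


def sampled_fitness(noisy_fitness, individual, n_s):
    """Average n_s independent noisy evaluations of the same individual."""
    return sum(noisy_fitness(individual) for _ in range(n_s)) / n_s


def make_noisy_onemax(noise_sd, rng):
    """Hypothetical noisy fitness: bit count plus zero-mean Gaussian noise."""
    def noisy(bits):
        return sum(bits) + rng.gauss(0.0, noise_sd)
    return noisy


rng = random.Random(1)
noisy_fn = make_noisy_onemax(noise_sd=2.0, rng=rng)
# A single evaluation can be far from the true value of 10;
# the average of many samples should be much closer.
print(noisy_fn([1] * 10))
print(sampled_fitness(noisy_fn, [1] * 10, n_s=4000))
```

If the initial individual is also sampled n_s times, the mutation-based search would use roughly n_s times as many evaluations as in the deterministic case.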

Crossover vs. Mutation: Noisy BinInt

Speed-Up: Scalability ratio of crossover to that of mutation

Summary

Deterministic fitness: Mutation is more efficient

Noisy fitness: Recombination is more efficient in the noise-dominated regime

Conclusions

Good neighborhood information is essential

Quadratic scalability of crossover and mutation

Exponential scalability of simple crossover [Thierens & Goldberg, 1994]

e^k m^k scalability of simple mutation [Mühlenbein, 1991]

Leads to a theory of time continuation

Key facet of efficiency enhancement

Leads to principled design and development of adaptive time continuation operators

Promise of yielding supermultiplicative speedups
