Geometric Interpretation of Crossover
Alberto Moraglio
amoragn@essex.ac.uk
BCTCS 2005
Contents
I – Quick Preliminaries
II – Geometric Interpretation of Crossover
Extremely quick overview of its implications:
III – Unification of Major Representations
IV – Crossover Principled Design
V – Is Biological Recombination Geometric?
VI – Unity of Evolutionary Search
I. Quick Preliminaries
Evolutionary Algorithms…
• Are function optimizers
• Mimic biological evolution
• Are robust, hence preferred for real-world problems
• Have little theory to explain how and why they work
• Come in various flavours
Evolutionary Algorithm Template
Problem & representation independent
Standard representations & EA flavours/dialects
• Binary strings (genetic algorithms, the classic)
• Real-coded vectors (evolution strategies, continuous optimization)
• Permutations (order-based GAs, combinatorial optimization)
• Parse trees (genetic programming, evolution of computer programs)
Algorithmically irrelevant differences: name/authorship/solution interpretation/domain of application
Algorithmically relevant differences: solution representation/genetic operators
What is crossover?
[Figure: crossover of two 15-bit binary strings]
Parents: 100000011101000, 100111100011100
Offspring: 100110011101000, 100001100011100
Is there any common aspect?
Is it possible to give a representation-independent definition of crossover and mutation?
Mutation & Crossover for binary strings
• Mutation = bit flip at a random position
101001 → 101101
• Crossover = select a crossover point at random and swap the tails
1010|01 → 101000
1110|00 → 111001
1*10|0* → 1*100*
• All offspring match the parent schema
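The two binary-string operators above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function names (`bit_flip_mutation`, `one_point_crossover`, `common_schema`, `matches`) are hypothetical, and the crossover point is passed in explicitly so that the slide's example can be reproduced.

```python
import random

def bit_flip_mutation(parent: str, rng: random.Random) -> str:
    """Flip the bit at one uniformly chosen position."""
    i = rng.randrange(len(parent))
    flipped = "1" if parent[i] == "0" else "0"
    return parent[:i] + flipped + parent[i + 1:]

def one_point_crossover(p1: str, p2: str, point: int):
    """Cut both parents at `point` and swap the tails."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def common_schema(p1: str, p2: str) -> str:
    """Keep the positions where the parents agree, put '*' elsewhere."""
    return "".join(a if a == b else "*" for a, b in zip(p1, p2))

def matches(schema: str, s: str) -> bool:
    """True if string s agrees with the schema on every fixed position."""
    return all(c == "*" or c == b for c, b in zip(schema, s))

# The example from the slide: cut after the fourth bit.
p1, p2 = "101001", "111000"
c1, c2 = one_point_crossover(p1, p2, 4)
schema = common_schema(p1, p2)
print(c1, c2, schema)
print(matches(schema, c1), matches(schema, c2))
```

Whatever crossover point is chosen, both offspring keep every bit on which the parents agree, which is what "match the parent schema" means here.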
II. Geometric Interpretation of Crossover
Genetic operators & Neighbourhood structure
• Forget the representation and consider the neighbourhood structure (= search space structure)
• Mutation: offspring are “close to” their parent in the direct neighbourhood
Direct Neighbour Mutation
[Figure: the 3-bit hypercube with vertices 000, 001, 010, 011, 100, 101, 110, 111; edges join strings that differ by one bit flip]
Representation: Binary String
Move: Bit Flip
Neighbourhood: Hamming
Representation + Move = Neighbourhood
Mutation: Offspring in the direct neighbourhood
What is crossover?
Neighbourhood and Crossover
Crossover idea: combining the parents' genotypes to get children genotypes “somewhere in between” them
Topologically speaking, “somewhere in between” = somewhere on a shortest path
Why on a shortest path?
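Shortest Path Crossover
[Figure: bit-flip neighbourhood graph over the nodes 011001, 010001, 011101, 011011, 010101, 011111, 010011, 010111, with nodes marked by distance from Parent1: D0 = P1, D1, D2 = P2]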
Parent1: 011101
Parent2: 010111
Children: 01*1*1
Children are on shortest paths
More than one shortest path in general
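The slide's example can be checked by enumerating every shortest bit-flip path between the two parents. The sketch below assumes the bit-flip neighbourhood on binary strings, where a shortest path flips each differing bit exactly once, in any order; the helper names `flip` and `shortest_paths` are hypothetical.

```python
from itertools import permutations

def flip(s: str, i: int) -> str:
    """Return s with the bit at position i flipped."""
    return s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:]

def shortest_paths(p1: str, p2: str):
    """All shortest bit-flip paths from p1 to p2.

    Each path flips every differing position once; the order of the
    flips is what distinguishes one shortest path from another.
    """
    diff = [i for i, (a, b) in enumerate(zip(p1, p2)) if a != b]
    paths = []
    for order in permutations(diff):
        path, cur = [p1], p1
        for i in order:
            cur = flip(cur, i)
            path.append(cur)
        paths.append(path)
    return paths

paths = shortest_paths("011101", "010111")
children = sorted({node for path in paths for node in path})
for path in paths:
    print(" -> ".join(path))
print(children)
```

For these parents there are two shortest paths, and the strings lying on them are exactly those matching 01*1*1.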
Interpretation & Generalization
• Traditional mutation & crossover have a natural interpretation in the neighbourhood structure in terms of closeness and betweenness
• Given any representation plus a notion of neighbourhood (move), mutation & crossover operators are well-defined
From graphs to geometry
• Forget the neighbourhood structure and consider the metric space (= space with a notion of distance)
• The distance in the neighbourhood is the length of the shortest path connecting two solutions
• Mutation → Direct neighbourhood → Ball
• Crossover → All shortest paths → Line segment
Balls & Segments
In a metric space (S, d) the closed ball is the set of the form
$$B(x; r) = \{\, y \in S \mid d(x, y) \le r \,\}$$
where x belongs to S and r is a positive real number called the radius of the ball.
In a metric space (S, d) the line segment or closed interval is the set of the form
$$[x; y] = \{\, z \in S \mid d(x, z) + d(z, y) = d(x, y) \,\}$$
where x and y belong to S and are called extremes of the segment and identify the segment.
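For a finite search space, both definitions translate directly into set comprehensions. The sketch below is an illustration under stated assumptions: the space is given as an explicit list, the distance is any Python function of two points, and the names `ball`, `segment` and `hamming` are hypothetical. It is tried out on the 3-bit Hamming space.

```python
from itertools import product

def ball(space, d, x, r):
    """Closed ball B(x; r): all points within distance r of x."""
    return {y for y in space if d(x, y) <= r}

def segment(space, d, x, y):
    """Segment [x; y]: all points z with d(x, z) + d(z, y) = d(x, y)."""
    return {z for z in space if d(x, z) + d(z, y) == d(x, y)}

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(p != q for p, q in zip(a, b))

cube = ["".join(bits) for bits in product("01", repeat=3)]

print(sorted(ball(cube, hamming, "000", 1)))
print(sorted(segment(cube, hamming, "000", "011")))
print(segment(cube, hamming, "000", "011") == segment(cube, hamming, "001", "010"))
```

The last line illustrates that, unlike in Euclidean space, two different pairs of extremes can identify the same segment in Hamming space.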
Squared balls & Chunky segments
[Figure, Balls:
B(000; 1) in Hamming space;
B((3, 3); 1) in Euclidean space;
B((3, 3); 1) in Manhattan space]
[Figure, Line segments:
[000; 011] = [001; 010], 2 geodesics, in Hamming space;
[(1, 1); (3, 2)], 1 geodesic, in Euclidean space;
[(1, 1); (3, 2)] = [(1, 2); (3, 1)], infinitely many geodesics, in Manhattan space]
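Uniform Mutation & Uniform Crossover
Uniform topological crossover:
$$f_{UX}(z \mid x, y) = \Pr\{UX = z \mid P1 = x, P2 = y\} = \frac{\delta(z \in [x; y])}{|[x; y]|}$$
$$\mathrm{Im}[UX(x, y)] = \{\, z \in S \mid f_{UX}(z \mid x, y) > 0 \,\} = [x; y]$$
Uniform topological ε-mutation:
$$f_{UM}(z \mid x) = \Pr\{UM = z \mid P = x\} = \frac{\delta(z \in B(x; \varepsilon))}{|B(x; \varepsilon)|}$$
$$\mathrm{Im}[UM(x)] = \{\, z \in S \mid f_{UM}(z \mid x) > 0 \,\} = B(x; \varepsilon)$$
where δ(·) is 1 if its argument holds and 0 otherwise.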
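Read operationally, the uniform operators draw an offspring uniformly at random from a segment (crossover) or from a ball (mutation). The sketch below assumes a finite space listed explicitly, which is only practical for tiny examples; the names `uniform_crossover` and `uniform_mutation` are hypothetical, and the 3-bit Hamming space is used as the test bed.

```python
import random
from itertools import product

def hamming(a, b):
    return sum(p != q for p, q in zip(a, b))

def ball(space, d, x, r):
    return {y for y in space if d(x, y) <= r}

def segment(space, d, x, y):
    return {z for z in space if d(x, z) + d(z, y) == d(x, y)}

def uniform_crossover(space, d, x, y, rng):
    """Offspring drawn uniformly from the segment [x; y]."""
    return rng.choice(sorted(segment(space, d, x, y)))

def uniform_mutation(space, d, x, eps, rng):
    """Offspring drawn uniformly from the ball B(x; eps)."""
    return rng.choice(sorted(ball(space, d, x, eps)))

cube = ["".join(bits) for bits in product("01", repeat=3)]
rng = random.Random(0)

child = uniform_crossover(cube, hamming, "000", "011", rng)
mutant = uniform_mutation(cube, hamming, "000", 1, rng)
print(child in segment(cube, hamming, "000", "011"))
print(mutant in ball(cube, hamming, "000", 1))
```

The set of offspring that can be produced with non-zero probability is then exactly the segment, respectively the ball, by construction.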
Genetic operators have a geometric nature
Representation-independent and rigorous definition of crossover and mutation in the neighbourhood seen as a geometric space…
This is cheating! I have generalized from a single example of solution representation!
III. Unification of Major Representations & Operators
Minkowski spaces – real vectors
[Figure, Balls:
B((2, 2); 1) in Euclidean space;
B((2, 2); 1) in Manhattan space;
B((2, 2); 1) in Chessboard space]
[Figure, Line segments:
[(1, 1); (3, 2)], 1 geodesic, in Euclidean space;
[(1, 1); (3, 2)] = [(1, 2); (3, 1)], infinitely many geodesics, in Manhattan space;
[(1, 1); (3, 2)], infinitely many geodesics, in Chessboard space]
Representation: real vectors
Neighbourhoods: continuous (3 types)
Distances: Minkowski distances
Implementation: algebraic manipulation of real vectors (equation of the line passing through two points)
Pre-existing recombination operators:
- both blend crossovers and discrete crossovers fit the geometric definition
- extended blend crossovers do not fit
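The fit of these operators can be tested numerically with the segment condition d(x, z) + d(z, y) = d(x, y). The sketch below is an illustration under assumptions the slide does not spell out: the blend is taken to be a convex combination of the parents and is checked under the Euclidean distance, the discrete crossover copies each coordinate from one parent and is checked under the Manhattan distance, and "extended" means a combination coefficient outside [0, 1]. All function names are hypothetical.

```python
import math

def minkowski(p):
    """Return the Minkowski distance of order p (p = inf gives chessboard)."""
    def d(a, b):
        if math.isinf(p):
            return max(abs(u - v) for u, v in zip(a, b))
        return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1 / p)
    return d

euclid, manhattan = minkowski(2), minkowski(1)

def in_segment(d, x, y, z, tol=1e-9):
    """Segment membership, with a tolerance for floating-point error."""
    return abs(d(x, z) + d(z, y) - d(x, y)) <= tol

def blend(x, y, a):
    """Point (1 - a) * x + a * y; a in [0, 1] lies between the parents."""
    return tuple((1 - a) * u + a * v for u, v in zip(x, y))

def discrete(x, y, mask):
    """Take coordinate i from y where mask[i] is true, otherwise from x."""
    return tuple(v if m else u for u, v, m in zip(x, y, mask))

x, y = (1.0, 1.0), (3.0, 2.0)
print(in_segment(euclid, x, y, blend(x, y, 0.5)))                  # blend
print(in_segment(manhattan, x, y, discrete(x, y, (True, False))))  # discrete
print(in_segment(euclid, x, y, blend(x, y, 1.5)))                  # extended blend
```

With a = 1.5 the offspring lies on the line through the parents but beyond one of them, so the sum of its distances to the parents exceeds the distance between them and the segment condition fails.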
Hamming spaces – binary strings
[Figure, Hamming space H(2,3) over the strings 00, 01, 02, 10, 11, 12, 20, 21, 22:
ball B(00; 1);
segment [00; 11] = [01; 10], 2 geodesics]
[Figure, Hamming space H(3,2) over the strings 000, 001, 010, 011, 100, 101, 110, 111:
ball B(000; 1);
segment [000; 011] = [001; 010], 2 geodesics]
Representation: binary/multary strings
Neighbourhoods: bit-flip/site substitution
Distances: Hamming distances
Implementation: symbolic manipulation of multary strings (mask-based crossovers)
Pre-existing recombination operators:
- all binary crossovers fit the geometric definition
Cayley spaces - permutations
Representation: permutations
Neighbourhoods: adj. swap, swap, reversal, insertion
Distances: corresponding distances
Implementation: “minimal permutation sorting by X move” algorithms:
- adj. swap = bubble sort
- swap = selection sort
- insertion = insertion sort
- reversal = approximated MPS by reversals (NP-hard)
Pre-existing recombination operators: various pre-existing crossover operators are sorting algorithms in disguise (because sorting permutations is easier than sorting vectors of other items)
[Figure, each panel drawn over the six permutations abc, bac, acb, bca, cab, cba:
Adjacent swap space: ball B(abc; 1); segment [abc; bca], 1 geodesic
Swap space & Reversal space: ball B(abc; 1); segment [abc; bca], 3 geodesics
Insertion space: ball B(abc; 1); segment [abc; bca], 1 geodesic]
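The segments for the three-element permutations can be recomputed by brute force: build the neighbourhood graph from the move, measure distances by breadth-first search, and keep the permutations that satisfy the segment condition. This is a sketch for tiny examples only, not the sorting-based implementation named above; the move generators and the function names are hypothetical.

```python
from collections import deque

def adjacent_swaps(p):
    """Neighbours of p under the adjacent swap move."""
    for i in range(len(p) - 1):
        q = list(p)
        q[i], q[i + 1] = q[i + 1], q[i]
        yield "".join(q)

def swaps(p):
    """Neighbours of p under the swap of any two positions."""
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            q = list(p)
            q[i], q[j] = q[j], q[i]
            yield "".join(q)

def insertions(p):
    """Neighbours of p obtained by moving one element to another position."""
    for i in range(len(p)):
        rest = p[:i] + p[i + 1:]
        for j in range(len(rest) + 1):
            q = rest[:j] + p[i] + rest[j:]
            if q != p:
                yield q

def distances_from(start, moves):
    """Breadth-first search: number of moves from start to every permutation."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in moves(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist

def segment(x, y, moves):
    """Permutations lying on some shortest path between x and y."""
    dx, dy = distances_from(x, moves), distances_from(y, moves)
    return {z for z in dx if dx[z] + dy[z] == dx[y]}

for name, moves in [("adjacent swap", adjacent_swaps),
                    ("swap", swaps),
                    ("insertion", insertions)]:
    print(name, sorted(segment("abc", "bca", moves)))
```

Each of the three moves is its own inverse type, so the breadth-first distance is symmetric and the segment condition is well-defined.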
Syntactic tree spaces
Representation: syntactic tree (lisp expression)
Neighbourhood: weighted sub-tree neighbourhood
Distance: structural distance
Implementation:
- sub-tree swap crossover
- common region mask-based crossover
Pre-existing recombination operators:
- traditional crossover (non-geometric)
- homologous crossover
- the geometric framework can help clarify which landscape and distance are associated with homologous crossover, and identify a distance associated with a geometric crossover of which traditional crossover is an approximation
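[Figure: Parent 1 and Parent 2 are expression trees built from +, *, sin, x and y. The trees are aligned (Alignment), a Crossover Point is chosen, and the sub-trees below it are swapped (Swap), giving Offspring 1 and Offspring 2.]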
Significance of Unification
• Most of the pre-existing crossover operators for major representations fit the geometric definition
• Established pre-existing operators have emerged from experimental work done by generations of practitioners over decades
• Geometric crossover compresses an empirical phenomenon into a simple formula
IV. Crossover Principled Design
Crossover Principled Design
• Domain specific solution representation is effective
• Problem: for non-standard representations it is not clear what crossover should look like
• But: given a combinatorial problem you may already know a good neighbourhood structure
• Geometric Interpretation of Crossover → give me your neighbourhood definition and I will give you a crossover definition
+ = ?
Crossover Design Example
Non-labelled graph neighbourhood
MOVE: Insert/remove an edge
Fixed number of nodes
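[Figure: two parent graphs are combined (+) into an Offspring graph; the panels carry the numeric labels 0, 1, 2 and 1, 2, 3.]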
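To make the design recipe concrete, here is a sketch for a simplified version of this example. It assumes labelled nodes, which is not the non-labelled case on the slide: with labels, a graph on a fixed node set is just its edge set, the insert/remove-an-edge distance is the size of the symmetric difference of the edge sets, and an offspring on a shortest path keeps the common edges and takes each differing edge from one parent or the other. The non-labelled case would also have to account for node relabellings. All names are hypothetical.

```python
import random

def edge_distance(g1, g2):
    """Number of edge insertions/removals needed to turn g1 into g2."""
    return len(g1 ^ g2)

def graph_crossover(g1, g2, rng):
    """Offspring between g1 and g2 under the edge insert/remove move.

    Edges shared by both parents are kept; each edge present in only
    one parent is inherited with probability 1/2.
    """
    common = g1 & g2
    differing = sorted(g1 ^ g2)
    inherited = {e for e in differing if rng.random() < 0.5}
    return frozenset(common | inherited)

# Two graphs on the labelled nodes 0..3, edges stored as (low, high) pairs.
parent1 = frozenset({(0, 1), (1, 2), (2, 3)})
parent2 = frozenset({(0, 1), (0, 2), (1, 3)})

rng = random.Random(0)
child = graph_crossover(parent1, parent2, rng)
total = edge_distance(parent1, child) + edge_distance(child, parent2)
print(total == edge_distance(parent1, parent2))
```

Every differing edge is settled in favour of exactly one parent, so the offspring's distances to the two parents always add up to the distance between the parents.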
V. Is Biological Recombination Geometric?
Levenshtein spaces – sequences
Representation: multary sequences (DNA/amino acids)
Neighbourhood: insertion + deletion + substitution (compound edit move)
Distance: Levenshtein distance
Implementation: inexact sequence alignment (dynamic programming) and sites exchange (crossover mask)
Pre-existing recombination operators:
- none
- it could be a good crossover for linear GP
- it could be a better model of biological crossover to study molecular evolution because it takes into account the inexact alignment due to molecular annealing of DNA strands, which leads to the evolution of size variation
Parent1=AGCACACA
Parent2=ACACACTA
best inexact alignment (with gaps):
AGCA|CAC-A   Child1=AGCACACTA
A-CA|CACTA   Child2=ACACACA
A simple model of (homologous) biological recombination fits the geometric definition under a DNA distance used in bioinformatics
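The claim can be checked on the example above by computing Levenshtein distances and testing the segment condition for both children. The sketch below uses the standard dynamic-programming recurrence with unit costs for insertion, deletion and substitution; the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        cur = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # delete ca
                           cur[j - 1] + 1,            # insert cb
                           prev[j - 1] + (ca != cb))) # substitute or match
        prev = cur
    return prev[-1]

def in_segment(x: str, y: str, z: str) -> bool:
    """True if z lies on a shortest edit path between x and y."""
    return levenshtein(x, z) + levenshtein(z, y) == levenshtein(x, y)

parent1, parent2 = "AGCACACA", "ACACACTA"
child1, child2 = "AGCACACTA", "ACACACA"

print(levenshtein(parent1, parent2))
print(in_segment(parent1, parent2, child1))
print(in_segment(parent1, parent2, child2))
```

Both children differ in length from their parents, yet each sits one edit away from either parent, which is why they satisfy the segment condition.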
VI. Unity of Evolutionary Search
Example of evolutionary search
Abstract convex evolutionary search
Main result: an evolutionary algorithm using geometric crossover with any probability distribution, any kind of representation, any problem, any selection and replacement mechanism, does the same search: convex search
Proof based on abstract convexity (axiomatic geodesic convexity) and axiomatization of search process (abstract search process)
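A small experiment illustrates what convex search means in one concrete setting. The sketch assumes binary strings under Hamming distance, where the convex hull of a population is the smallest schema containing it, and it uses mask-based crossover with random mating and no fitness function, so it shows only the effect of geometric crossover on the hull, not the general proof. All names are hypothetical.

```python
import random

def hull_schema(pop):
    """Smallest schema containing the population: fixed where all agree."""
    return "".join(col[0] if len(set(col)) == 1 else "*" for col in zip(*pop))

def matches(schema, s):
    return all(c == "*" or c == b for c, b in zip(schema, s))

def mask_crossover(p1, p2, rng):
    """Each bit is copied from one parent or the other at random."""
    return "".join(a if rng.random() < 0.5 else b for a, b in zip(p1, p2))

def run(pop, generations, rng):
    """Replace the population by offspring; record the hull per generation."""
    hulls = [hull_schema(pop)]
    for _ in range(generations):
        pop = [mask_crossover(*rng.sample(pop, 2), rng) for _ in pop]
        hulls.append(hull_schema(pop))
    return hulls

rng = random.Random(0)
population = ["110100", "100110", "110010", "100000"]
hulls = run(population, 10, rng)
print(hulls[0])
print(hulls[-1])
```

A position that is fixed in the hull of one generation stays fixed with the same value in every later generation, so the hulls can only shrink or stay the same.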
…Nearly Over!
Future work
THEORY: Generalizing and accommodating pre-existing theories into the geometric framework (schema theorem, fitness landscapes, representation theories…)
PRACTICE: Testing crossover principled design on important problems with non-standard representations (problem domain representation)
Questions?