Basics of Genetic Algorithms
and some possibilities
Peter Spijker
Technische Universiteit Eindhoven
Department of Biomedical Engineering
Division of Biomedical Imaging and Modeling
California Institute of Technology
Materials Process and Simulation Center
Biochemistry & Molecular Biophysics
November 25, 2003
Presentation Overview
• Purpose of presentation
• General introduction to Genetic Algorithms (GA’s)
• Biological background
• Origin of species
• Natural selection
• Genetic Algorithm
• Search space
• Basic algorithm
• Coding
• Methods
• Examples
• Possibilities
Purpose of presentation
• Optimising the parameters of force fields is a difficult and time-consuming task
• Optimisation methods might help
• Methods:
- steepest descent
- simulated annealing (Monte Carlo)
- genetic algorithms
• Brief introduction to genetic algorithms in lecture style
General Introduction to GA’s
• Genetic algorithms (GA’s) are a technique to solve problems which need optimization
• GA’s are a subclass of Evolutionary Computing
• GA’s are based on Darwin’s theory of evolution
• History of GA’s
• Evolutionary computing evolved in the 1960’s.
• GA’s were created by John Holland in the mid-70’s.
Biological Background (1) – The cell
• Every animal cell is a complex of many small “factories” working together
• The center of this all is the cell nucleus
• The nucleus contains the genetic information
Biological Background (2) – Chromosomes
• Genetic information is stored in the chromosomes
• Each chromosome is built of DNA
• Chromosomes in humans form pairs
• There are 23 pairs
• The chromosome is divided into parts: genes
• Genes code for properties
• The possible variants of a gene for one property are called alleles
• Every gene has a unique position on the chromosome: the locus
Biological Background (3) – Genetics
• The entire combination of genes is called the genotype
• A genotype develops into a phenotype
• Alleles can be either dominant or recessive
• Dominant alleles are always expressed from the genotype in the phenotype
• Recessive alleles can survive in the population for many generations without being expressed
Biological Background (4) – Reproduction
• Reproduction of genetic information
• Mitosis
• Meiosis
• Mitosis copies the same genetic information to new offspring: there is no exchange of information
• Mitosis is the normal way in which multicellular structures, such as organs, grow
Biological Background (5) – Reproduction
• Meiosis is the basis of sexual reproduction
• Meiotic division produces haploid gametes
• In reproduction two gametes fuse into a zygote, which will become the new individual
• Hence genetic information from both parents is combined to create new offspring
Biological Background (6) – Reproduction
• During reproduction “errors” occur
• Due to these “errors” genetic variation exists
• Most important “errors” are:
• Recombination (cross-over)
• Mutation
Biological Background (7) – Natural selection
• The origin of species: “Preservation of favourable variations and rejection of unfavourable variations.”
• More individuals are born than can survive, so there is a continuous struggle for life.
• Individuals with an advantage have a greater chance to survive: survival of the fittest.
Biological Background (8) – Natural selection
• Important aspects in natural selection are:
• adaptation to the environment
• isolation of populations in different groups which cannot mutually mate
• Random changes in how often alleles occur, which have the most effect in small populations, are called genetic drift
• Success in life is expressed mathematically as fitness
Genetic Algorithm (1) – Search space
• Most often one is looking for the best solution in a specific subset of solutions
• This subset is called the search space (or state space)
• Every point in the search space is a possible solution
• Therefore every point has a fitness value, depending on the problem definition
• GA’s are used to search the search space for the best solution, e.g. a minimum
• Difficulties are the local minima and the starting point of the search
[Figure: example function plotted over a search space from 0 to 1000]
Genetic Algorithm (2) – Basic algorithm
• Start with a set of n randomly chosen solutions from the search space (i.e. chromosomes). This is the population
• This population is used to produce a next generation of individuals by reproduction
• Individuals with a higher fitness have a greater chance to reproduce (i.e. natural selection)
Genetic Algorithm (3) – Basic algorithm
• Outline of the basic algorithm
0 START : Create random population of n chromosomes
1 FITNESS : Evaluate fitness f(x) of each chromosome in the population
2 NEW POPULATION
    0 SELECTION : Select parents based on f(x)
    1 RECOMBINATION : Cross-over chromosomes
    2 MUTATION : Mutate chromosomes
    3 ACCEPTANCE : Reject or accept new one
3 REPLACE : Replace old with new population: the new generation
4 TEST : Test problem criterion
5 LOOP : Continue steps 1 – 4 until criterion is satisfied
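The outline above could be sketched in Python roughly as follows. This is a minimal illustration, not the code used in the presentation: the function name `run_ga`, its parameters and their default values are assumptions, fitness is assumed to be non-negative and to be maximised, and every child is accepted in the acceptance step.

```python
import random

def run_ga(fitness, n_bits, pop_size=20, p_cross=0.7, p_mut=0.01,
           target=None, max_generations=200, rng=None):
    """Minimal generational GA on bitstrings (lists of 0/1).

    Assumptions (not from the slides): fitness is non-negative and is
    maximised; all names and default values are illustrative only.
    """
    rng = rng or random.Random()
    # 0 START: create a random population of pop_size chromosomes
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for generation in range(max_generations):
        # 1 FITNESS: evaluate f(x) for each chromosome
        scores = [fitness(c) for c in population]
        best_score = max(scores)
        best = population[scores.index(best_score)]
        # 4 TEST: stop as soon as the criterion is satisfied
        if target is not None and best_score >= target:
            return best, generation
        # 2 NEW POPULATION
        weights = [s + 1e-9 for s in scores]  # avoid an all-zero wheel
        new_population = []
        while len(new_population) < pop_size:
            # 2.0 SELECTION: fitness-proportional choice of two parents
            mother, father = rng.choices(population, weights=weights, k=2)
            child = list(mother)
            # 2.1 RECOMBINATION: single-point cross-over
            if rng.random() < p_cross:
                point = rng.randint(1, n_bits - 1)
                child = mother[:point] + father[point:]
            # 2.2 MUTATION: flip each bit with a small probability
            child = [1 - b if rng.random() < p_mut else b for b in child]
            # 2.3 ACCEPTANCE: in this sketch every child is accepted
            new_population.append(child)
        # 3 REPLACE: the new generation replaces the old one
        population = new_population
    scores = [fitness(c) for c in population]
    return population[scores.index(max(scores))], max_generations
```

A toy call such as `run_ga(sum, n_bits=8, target=8)` searches for the all-ones bitstring; the test is placed directly after the fitness evaluation, which is equivalent to testing after the replacement step.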
Genetic Algorithm (4) – Coding
• Normal cells are diploid (containing 2 complete sets of chromosomes)
• Gametes, in contrast, are haploid
• Formalizing diploid reproduction is much more difficult than haploid reproduction
• Diploid populations have an extra dimension compared to haploid populations
• For simplicity, only haploid genetic algorithms are therefore considered here
Genetic Algorithm (5) – Coding
• Chromosomes are encoded by bitstrings
• Every bitstring is therefore a solution, but not necessarily the best solution
• What a bitstring encodes differs from problem to problem
Example: the bitstring 1001 is either a sequence of on/off settings or the number 9
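The two readings of a bitstring can be made concrete with a few helper functions. The names below are hypothetical and the unsigned binary convention is an assumption; the slides do not prescribe a particular decoding.

```python
def bits_to_int(bits):
    """Read a bitstring such as '1001' as an unsigned binary number."""
    return int(bits, 2)

def int_to_bits(value, length):
    """Encode a non-negative integer as a fixed-length bitstring."""
    if value < 0 or value >= 2 ** length:
        raise ValueError("value does not fit in the given number of bits")
    return format(value, "0{}b".format(length))

def bits_to_switches(bits):
    """Read the same bitstring as a sequence of on/off settings."""
    return [b == "1" for b in bits]
```

With a fixed length of 10 bits, every integer from 0 to 1023 has exactly one bitstring, so a search space of that size maps directly onto chromosomes.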
Genetic Algorithm (6) – Coding
• Recombination (cross-over) of bitstrings can be represented schematically:
• Using a specific cross-over point
Parents:   1001101 X 0101110
Offspring: 1001110   0101101
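Single-point cross-over on bitstrings can be sketched as below. The function name and its arguments are assumptions made for illustration. Cutting the parents 1001101 and 0101110 after the fourth bit is one choice of cross-over point that reproduces the offspring in the schematic.

```python
import random

def single_point_crossover(parent_a, parent_b, point=None, rng=random):
    """Exchange the tails of two equal-length bitstrings after `point`.

    If no point is given, one is drawn at random so that both parents
    contribute at least one bit (an illustrative choice).
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if point is None:
        point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```

For example, `single_point_crossover("1001101", "0101110", point=4)` exchanges the last three bits of the two parents.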
Genetic Algorithm (7) – Coding
• Mutation helps to prevent the algorithm from being trapped in a local minimum
• In the bitstring approach mutation is simply the flipping of one of the bits
1001101 → 1101101
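Bit-flip mutation can be sketched as follows. Both function names and the per-bit mutation probability `p_mut` are assumptions; the slides only state that a bit is flipped.

```python
import random

def flip_bit(bits, position):
    """Return a copy of the bitstring with the bit at `position` flipped."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]

def mutate_bits(bits, p_mut, rng=random):
    """Flip each bit independently with probability p_mut."""
    out = bits
    for i in range(len(bits)):
        if rng.random() < p_mut:
            out = flip_bit(out, i)
    return out
```

Flipping the second bit of 1001101 (index 1 when counting from zero) gives 1101101.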
Genetic Algorithm (8) – Coding
• Both recombination and mutation depend a lot on the exact definition of the problem and on the choice of representation for the chromosomes (which need not be bitstrings)
• Different encodings can be used:
• Binary encoding
• Permutation encoding
• Value encoding
• Tree encoding
• The focus in this presentation stays on binary encoding
Example Minimum of Function (1)
• First example shows how to find the minimum of a function
[Figure: plot of f(x) for x from 0 to 1000]
Minimum of f(x) at x = 809, encoded as the bitstring 1100101001
Example Minimum of Function (2)
[Figure: the population plotted on f(x) at generation 1, with the individuals and the best individual marked]
[Figure: best fitness and mean fitness versus generations]
Example Minimum of Function (3)
• Interactive show of this algorithm with Matlab
• Using the function: genalg2()
• Variables:
• Population size
• Bitstring length
• Mutation chance
• Recombination chance
• Starting population adaptation
Genetic Algorithm (9) – Remarks
• It is clear from the example that the convergence speed of the algorithm depends on many factors:
• Population size
• Mutation probability
• Recombination probability
• Elitism
• Selection methods
• Random selection of parents
• Roulette wheel selection of parents
• Strong points of GA’s: mutation helps to avoid getting stuck in a local minimum, and recombination gives fast initial convergence
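Roulette wheel selection and elitism, two of the factors listed above, could look like this in Python. This is a sketch under assumptions: fitness values are non-negative and larger is better (a minimisation problem would first need its fitness transformed), and the function names are hypothetical.

```python
import random

def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes non-negative fitness values where larger means fitter.
    """
    total = sum(fitnesses)
    if total <= 0:
        # Degenerate wheel: fall back to random selection of a parent
        return rng.choice(population)
    spin = rng.random() * total
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off

def elite(population, fitnesses, k=1):
    """Return the k fittest individuals, to be copied unchanged (elitism)."""
    ranked = sorted(zip(fitnesses, range(len(population))), reverse=True)
    return [population[i] for _, i in ranked[:k]]
```

Random selection of parents corresponds to `rng.choice(population)`; the roulette wheel biases that choice towards fitter individuals, while elitism guarantees that the current best solution is not lost between generations.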
Example Checkerboard (1)
• We are given an n by n checkerboard in which every field can have a colour from a set of four colours.
• The goal is to colour the checkerboard so that no horizontal or vertical neighbours have the same colour (diagonal neighbours do not count)
[Figure: two 10 by 10 checkerboards]
Example Checkerboard (2)
• Chromosomes represent the way the checkerboard is coloured.
• Chromosomes are not represented by bitstrings but by bitmatrices
• The entries (“bits”) of the bitmatrix take one of the four values 0, 1, 2 or 3, depending on the colour
• Cross-over involves matrix manipulation instead of point-wise operations. The parental matrices can be combined in a horizontal, vertical, triangular or square way
• Mutation still acts on single entries: an entry is changed to one of the other three values
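Some of these matrix operations could be sketched as below, with the board stored as a list of rows. The function names are hypothetical and the exact cutting rules (for instance, which triangle comes from which parent) are assumptions; the slides only name the variants.

```python
import random

def horizontal_crossover(mother, father, cut):
    """Child takes the rows above the cut from mother, the rest from father."""
    return [row[:] for row in mother[:cut]] + [row[:] for row in father[cut:]]

def vertical_crossover(mother, father, cut):
    """Child takes the columns left of the cut from mother, the rest from father."""
    return [m_row[:cut] + f_row[cut:] for m_row, f_row in zip(mother, father)]

def lower_triangular_crossover(mother, father):
    """Child takes entries on or below the diagonal from mother (assumed rule)."""
    n = len(mother)
    return [[mother[r][c] if c <= r else father[r][c] for c in range(n)]
            for r in range(n)]

def mutate_board(board, p_mut, colours=4, rng=random):
    """With probability p_mut, replace an entry by one of the other colours."""
    return [[rng.choice([k for k in range(colours) if k != cell])
             if rng.random() < p_mut else cell
             for cell in row] for row in board]
```

A square cross-over would copy a rectangular block of entries from one parent into the other in the same manner.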
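Example Checkerboard (3)
• Fitness curve for the checkerboard example
• This problem can be seen as a grid graph in which every row and every column has n nodes and (n-1) edges, giving 2 · (n-1) · n pairs of neighbours in total. The fitness f(x) counts the pairs with different colours, so its maximum is easily defined as: f(x) = 2 · (n-1) · n (180 for n = 10)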
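A fitness function consistent with this maximum can be sketched as follows. It assumes that the fitness counts horizontally and vertically adjacent pairs of fields with different colours, which gives at most 2 · (n-1) · n for an n by n board; the function names are hypothetical.

```python
def checkerboard_fitness(board):
    """Count horizontally or vertically adjacent pairs with different colours.

    `board` is a list of n rows, each a list of n colour values (0 to 3).
    """
    n = len(board)
    score = 0
    for r in range(n):
        for c in range(n):
            # Right-hand neighbour
            if c + 1 < n and board[r][c] != board[r][c + 1]:
                score += 1
            # Neighbour below
            if r + 1 < n and board[r][c] != board[r + 1][c]:
                score += 1
    return score

def max_fitness(n):
    """Number of neighbouring pairs on an n by n board: 2 * (n - 1) * n."""
    return 2 * (n - 1) * n
```

A board with alternating colours reaches the maximum, while a board in a single colour scores 0.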
[Figure: best fitness and mean fitness versus generations for the checkerboard example]
Example Checkerboard (4)
• Fitness curves for different cross-over rules
[Figure: four panels of fitness versus generations: Lower-Triangular Crossing Over, Square Crossing Over, Horizontal Cutting Crossing Over, Vertical Cutting Crossing Over]
Example Checkerboard (5)
• Interactive show of this algorithm with Matlab
• Using the functions:
• main()
• checkers()
• bestindividual()
• mutate()
• recombine()
• select()
• showbestindividual()
Possibilities
• Using the genetic algorithm to optimise parameters for a force field
• Parameters are real numbers, so adaptations of these algorithms are required
• Value encoding vs. bitstring encoding
• Difficulties:
• Definition of the fitness function
• Integration of the algorithm with software
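With value encoding, a chromosome could simply be a list of real-valued parameters, and the operators have to be adapted accordingly. The sketch below shows one common choice, arithmetic cross-over with Gaussian mutation. It is an assumption for illustration, not a method given in the presentation, and all names and the step size `sigma` are hypothetical.

```python
import random

def blend_crossover(parent_a, parent_b, rng=random):
    """Arithmetic cross-over: each child gene is a random mix of the parents' genes."""
    child = []
    for a, b in zip(parent_a, parent_b):
        w = rng.random()
        child.append(w * a + (1.0 - w) * b)
    return child

def gaussian_mutation(params, p_mut, sigma, rng=random):
    """With probability p_mut, perturb a parameter by Gaussian noise of width sigma."""
    return [p + rng.gauss(0.0, sigma) if rng.random() < p_mut else p
            for p in params]
```

The fitness function would then score a parameter set, for example by how well a simulation with those parameters reproduces reference data; defining that score is the first difficulty listed above.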