Basics of Genetic Algorithms and some possibilities

Peter Spijker

Technische Universiteit Eindhoven
Department of Biomedical Engineering
Division of Biomedical Imaging and Modeling

California Institute of Technology
Materials Process and Simulation Center
Biochemistry & Molecular Biophysics

November 25, 2003

Presentation Overview

• Purpose of presentation

• General introduction to Genetic Algorithms (GAs)

• Biological background

  - Origin of species

  - Natural selection

• Genetic Algorithm

  - Search space

  - Basic algorithm

  - Coding

  - Methods

• Examples

• Possibilities

Purpose of presentation

• Optimising the parameters of force fields is a difficult and time-consuming task

• Optimisation methods may help

• Methods:

  - steepest descent

  - simulated annealing (Monte Carlo)

  - genetic algorithms

• Brief introduction to genetic algorithms in lecture style

General Introduction to GAs

• Genetic algorithms (GAs) are a technique for solving optimisation problems

• GAs are a subclass of Evolutionary Computing

• GAs are based on Darwin’s theory of evolution

• History of GAs

  - Evolutionary computing emerged in the 1960s

  - GAs were introduced by John Holland in the mid-1970s

Biological Background (1) – The cell

• Every animal cell is a complex of many small “factories” working together

• At the centre of it all is the cell nucleus

• The nucleus contains the genetic information

Biological Background (2) – Chromosomes

• Genetic information is stored in the chromosomes

• Each chromosome is built of DNA

• Chromosomes in humans form pairs

• There are 23 pairs

• The chromosome is divided into parts: genes

• Genes code for properties

• The possible variants of a gene for one property are called alleles

• Every gene has a unique position on the chromosome: its locus

Biological Background (3) – Genetics

• The entire combination of genes is called the genotype

• A genotype develops into a phenotype

• Alleles can be either dominant or recessive

• Dominant alleles are always expressed in the phenotype

• Recessive alleles can survive in the population for many generations without being expressed

Biological Background (4) – Reproduction

• Reproduction of genetic information:

  - Mitosis

  - Meiosis

• Mitosis copies the same genetic information to the new offspring: there is no exchange of information

• Mitosis is the normal way in which multicellular structures, like organs, grow

Biological Background (5) – Reproduction

• Meiosis is the basis of sexual reproduction

• Meiotic division produces haploid gametes

• In reproduction two gametes fuse into a zygote, which will become the new individual

• Hence the genetic information of both parents is combined to create new offspring

Biological Background (6) – Reproduction

• During reproduction “errors” occur

• Due to these “errors” genetic variation exists

• The most important “errors” are:

  - Recombination (cross-over)

  - Mutation

Biological Background (7) – Natural selection

• The Origin of Species: “Preservation of favourable variations and rejection of unfavourable variations.”

• More individuals are born than can survive, so there is a continuous struggle for life

• Individuals with an advantage have a greater chance of survival: survival of the fittest

Biological Background (8) – Natural selection

• Important aspects of natural selection are:

  - adaptation to the environment

  - isolation of populations into different groups which cannot mate with each other

• When small changes in the genotypes of individuals spread easily, especially in small populations, we speak of genetic drift

• Success in life is expressed mathematically as fitness

Genetic Algorithm (1) – Search space

• Most often one is looking for the best solution in a specific set of candidate solutions

• This set is called the search space (or state space)

• Every point in the search space is a possible solution

• Therefore every point has a fitness value, depending on the problem definition

• GAs are used to search the search space for the best solution, e.g. a minimum

• Difficulties are the local minima and the starting point of the search

[Figure: example function plotted over the search space, x from 0 to 1000, function values from 0 to 2.5]

Genetic Algorithm (2) – Basic algorithm

• Start with a set of n randomly chosen solutions from the search space (i.e. chromosomes). This is the population

• This population is used to produce the next generation of individuals by reproduction

• Individuals with a higher fitness have a greater chance to reproduce (i.e. natural selection)

Genetic Algorithm (3) – Basic algorithm

• Outline of the basic algorithm

0 START : Create a random population of n chromosomes

1 FITNESS : Evaluate the fitness f(x) of each chromosome in the population

2 NEW POPULATION

  0 SELECTION : Select parents based on f(x)

  1 RECOMBINATION : Cross over the chromosomes

  2 MUTATION : Mutate the chromosomes

  3 ACCEPTANCE : Reject or accept the new one

3 REPLACE : Replace the old population with the new one: the new generation

4 TEST : Test the problem criterion

5 LOOP : Repeat steps 1 – 4 until the criterion is satisfied
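
The outline above can be sketched in Python roughly as follows. This is only a minimal illustration under stated assumptions, not the implementation used for the presentation: the names `run_ga`, `decode`, `p_cross` and `p_mut`, the default parameter values, fitness-proportional selection, and carrying the best individual over into each new generation are all choices made for this sketch. The problem is posed as minimising a cost function over bitstrings.

```python
import random


def decode(bits):
    """Read a list of 0/1 values as an unsigned integer."""
    return int("".join(map(str, bits)), 2)


def run_ga(cost, n=20, length=10, p_cross=0.7, p_mut=0.02,
           max_generations=200, target=None, seed=0):
    """Minimise cost(x) over integers x encoded as bitstrings (hypothetical helper)."""
    rng = random.Random(seed)
    # 0 START: random population of n chromosomes
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    best = min(population, key=lambda c: cost(decode(c)))
    for _ in range(max_generations):
        # 1 FITNESS: a lower cost means a higher fitness
        costs = [cost(decode(c)) for c in population]
        worst = max(costs)
        weights = [worst - c + 1e-9 for c in costs]
        # Assumption of this sketch: the best individual so far always survives
        new_population = [best[:]]
        while len(new_population) < n:
            # 2.0 SELECTION: chance to reproduce grows with fitness
            a, b = rng.choices(population, weights=weights, k=2)
            child = a[:]
            # 2.1 RECOMBINATION: single cross-over point
            if rng.random() < p_cross:
                point = rng.randint(1, length - 1)
                child = a[:point] + b[point:]
            # 2.2 MUTATION: flip each bit with a small probability
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            # 2.3 ACCEPTANCE: this sketch accepts every child
            new_population.append(child)
        # 3 REPLACE: the new generation takes over
        population = new_population
        best = min(population + [best], key=lambda c: cost(decode(c)))
        # 4 TEST: stop once the criterion is met
        if target is not None and cost(decode(best)) <= target:
            break
        # 5 LOOP: otherwise continue with the next generation
    return decode(best), cost(decode(best))
```

A call such as `run_ga(lambda x: (x - 809) ** 2)` returns the best x found and its cost. Because the run is stochastic, the result is not guaranteed to be the exact minimum.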

Genetic Algorithm (4) – Coding

• Normal cells are diploid (containing 2 complete sets of chromosomes)

• Gametes, by contrast, are haploid

• Formalizing diploid reproduction is much more difficult than haploid reproduction

• Diploid populations have an extra dimension compared to haploid populations

• For simplicity, only haploid genetic algorithms are therefore considered here

Genetic Algorithm (5) – Coding

• Chromosomes are encoded as bitstrings

• Every bitstring is therefore a solution, but not necessarily the best solution

• What a bitstring encodes differs from problem to problem

[Figure: a bitstring read either as a sequence of on/off switches or as the number 91]
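
As a small illustration of the two readings of a bitstring, the sketch below converts between an integer and its bits and also interprets the same bits as on/off switches. The function names `to_bits`, `from_bits` and `as_switches` are invented for this example, and the 7-bit width is an assumption.

```python
def to_bits(value, length):
    """Encode a non-negative integer as a bitstring of fixed length."""
    return format(value, f"0{length}b")


def from_bits(bits):
    """Decode a bitstring back into the integer it represents."""
    return int(bits, 2)


def as_switches(bits):
    """Read the same bitstring as a sequence of on/off switches."""
    return [bit == "1" for bit in bits]


print(to_bits(91, 7))          # 1011011
print(from_bits("1011011"))    # 91
print(as_switches("1011011"))  # [True, False, True, True, False, True, True]
```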

Genetic Algorithm (6) – Coding

• With bitstrings, recombination (cross-over) can be represented schematically:

• Using a specific cross-over point

Parents:   1001101  X  0101110

Offspring: 1001110     0101101
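
Single-point cross-over on bitstrings can be sketched as below. The function name and the choice of cross-over point 4 are assumptions for illustration. With the bitstrings of this slide, any point from 2 to 5 gives the same two offspring.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equally long bitstrings after position `point`."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if not 0 < point < len(parent_a):
        raise ValueError("cross-over point must lie inside the string")
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


print(single_point_crossover("1001101", "0101110", 4))
# ('1001110', '0101101')
```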

Genetic Algorithm (7) – Coding

• Mutation helps prevent the algorithm from becoming trapped in a local minimum

• In the bitstring approach, mutation is simply the flipping of one of the bits

1001101 → 1101101 (second bit flipped)
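
Bit-flip mutation can be sketched as follows. `flip_bit` flips one chosen position, while `mutate` flips every bit independently with probability `rate`. Both names and the per-bit rate are assumptions of this sketch rather than details taken from the presentation.

```python
import random


def flip_bit(bits, index):
    """Return a copy of the bitstring with the bit at `index` flipped."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


def mutate(bits, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in bits
    )


print(flip_bit("1001101", 1))  # 1101101
```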

Genetic Algorithm (8) – Coding

• Both recombination and mutation depend strongly on the exact definition of the problem and on the chosen representation of the chromosomes (which need not be bitstrings)

• Different encodings can be used:

  - Binary encoding

  - Permutation encoding

  - Value encoding

  - Tree encoding

• This presentation focuses on binary encoding

Example: Minimum of a Function (1)

• The first example shows how to find the minimum of a function

[Figure: f(x) plotted for x from 0 to 1000, function values from 0 to 2.5]

Minimum of f(x) at x = 809, encoded as the bitstring 1100101001

Example: Minimum of a Function (2)

[Figure: the individuals of generation 1 plotted on f(x), with the best individual marked]

[Figure: best fitness and mean fitness against generations (0 to 80)]

Example: Minimum of a Function (3)

• Interactive demonstration of this algorithm in Matlab

• Using the function: genalg2()

• Variables:

  - Population size

  - Bitstring length

  - Mutation chance

  - Recombination chance

  - Starting population adaptation

Genetic Algorithm (9) – Remarks

• It is clear from the example that the convergence speed of the algorithm depends on many factors:

  - Population size

  - Mutation probability

  - Recombination probability

  - Elitism

  - Selection method:

    - Random selection of parents

    - Roulette wheel selection of parents

• Strong points of GAs: mutation helps to escape local minima, and recombination gives fast initial convergence
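
Roulette wheel selection gives each individual a slice of the wheel proportional to its fitness. A minimal sketch, assuming non-negative fitness values where larger is better (a minimisation problem would first need its function values converted into such a fitness), could look like this. The function name and the `rng` parameter are hypothetical.

```python
import random
from itertools import accumulate


def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    spin = rng.uniform(0, total)  # where the wheel stops
    for individual, cumulative in zip(population, accumulate(fitnesses)):
        if spin < cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off
```

Random selection of parents corresponds to giving every individual the same fitness, so that all slices of the wheel are equally large.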

Example: Checkerboard (1)

• We are given an n by n checkerboard in which every field takes one colour from a set of four colours

• The goal is a colouring in which no two neighbouring fields have the same colour (diagonal neighbours do not count)

[Figure: two 10 by 10 checkerboards, rows and columns numbered 1 to 10]

Example: Checkerboard (2)

• Chromosomes represent the way the checkerboard is coloured

• Chromosomes are represented not by bitstrings but by matrices

• Each entry of the matrix takes one of the four values 0, 1, 2 or 3, depending on the colour

• Cross-over involves matrix manipulation instead of point-wise operations: the parental matrices can be combined in a horizontal, vertical, triangular or square way

• Mutation still acts on a single entry, changing it to one of the other three values
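
Two of the matrix cross-over rules and the mutation step might be sketched as below, with boards stored as lists of rows. The function names, the `cut` parameter and the use of plain Python lists are assumptions of this sketch; the Matlab functions of the presentation may work differently.

```python
import random


def horizontal_crossover(a, b, cut):
    """Rows above the cut come from parent a, the remaining rows from parent b."""
    return [row[:] for row in a[:cut]] + [row[:] for row in b[cut:]]


def vertical_crossover(a, b, cut):
    """Columns left of the cut come from parent a, the remaining columns from parent b."""
    return [row_a[:cut] + row_b[cut:] for row_a, row_b in zip(a, b)]


def mutate_field(board, row, col, rng=random, colours=4):
    """Return a copy of the board with one field changed to another colour."""
    current = board[row][col]
    new_colour = rng.choice([c for c in range(colours) if c != current])
    mutated = [r[:] for r in board]
    mutated[row][col] = new_colour
    return mutated
```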

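Example: Checkerboard (3)

• Fitness curve for the checkerboard example

• The board can be seen as a graph with n · n nodes (the fields) and 2 · (n-1) · n edges (the pairs of neighbouring fields)

• The fitness f(x) is then easily defined as the number of edges joining fields of different colours, with maximum f(x) = 2 · (n-1) · n, i.e. 180 for n = 10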

[Figure: best fitness and mean fitness against generations (0 to 600), fitness axis from 130 to 180]
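
A fitness function for the checkerboard can be sketched by counting the pairs of horizontally or vertically neighbouring fields that have different colours. That this is the fitness used in the presentation is an assumption, although it is consistent with the value 2 · (n-1) · n on the slide, which is the total number of such pairs.

```python
def fitness(board):
    """Count neighbouring pairs (right and down) with different colours."""
    n = len(board)
    score = 0
    for r in range(n):
        for c in range(n):
            if c + 1 < n and board[r][c] != board[r][c + 1]:
                score += 1
            if r + 1 < n and board[r][c] != board[r + 1][c]:
                score += 1
    return score


n = 10
perfect = [[(r + c) % 2 for c in range(n)] for r in range(n)]
print(fitness(perfect))  # 180
```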

Example: Checkerboard (4)

• Fitness curves for different cross-over rules

[Figure: four fitness curves (fitness from 130 to 180 against generations), one panel per cross-over rule: Lower-Triangular Crossing Over, Square Crossing Over, Horizontal Cutting Crossing Over, Vertical Cutting Crossing Over]

Example: Checkerboard (5)

• Interactive demonstration of this algorithm in Matlab

• Using the functions:

  - main()

  - checkers()

  - bestindividual()

  - mutate()

  - recombine()

  - select()

  - showbestindividual()

Possibilities

• Using the genetic algorithm to optimise parameters for a force field

• Parameters are real numbers, so these algorithms need to be adapted

• Value encoding vs. bitstring encoding

• Difficulties:

  - Definition of the fitness function

  - Integration of the algorithm with existing software
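
With value encoding a chromosome is simply a list of real-valued parameters, so cross-over and mutation need real-valued counterparts. One common choice, shown here purely as a sketch, is to blend the two parents with a random weight and to mutate by adding Gaussian noise. The operator names, the noise width `sigma` and the example parameter values are assumptions and are not taken from the presentation.

```python
import random


def blend_crossover(parent_a, parent_b, rng=random):
    """Child parameters are a random weighted average of the parents."""
    w = rng.random()
    return [w * x + (1 - w) * y for x, y in zip(parent_a, parent_b)]


def gaussian_mutation(params, rate, sigma, rng=random):
    """Add Gaussian noise of width sigma to each parameter with probability rate."""
    return [
        p + rng.gauss(0.0, sigma) if rng.random() < rate else p
        for p in params
    ]


# Hypothetical force-field parameter sets, for illustration only
parent_a = [1.0, 2.0, 340.0]
parent_b = [3.0, 0.5, 300.0]
child = gaussian_mutation(blend_crossover(parent_a, parent_b), rate=0.1, sigma=0.05)
```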

Further Questions

?