2101INT – Principles of Intelligent Systems Lecture 11

Biological Overview of Genetic Algorithms

Organisms produce a number of offspring similar, but not entirely so, to themselves

– Variations are caused by mutations – random changes in the genome, which are often triggered by environmental factors

– Variations are caused by sexual recombination – genes inherited from both parents, giving characteristics of each

On an evolutionary scale, the better adapted offspring are more likely to survive and produce their own offspring

Over time this reinforces their particular genetic characteristics

Genetic Algorithms use this process to evolve better solutions to problems

Classes of search techniques

Search Techniques
– Calculus-Based Techniques
    – Fibonacci
    – Sort
– Local Search Techniques
    – Tabu Search
    – Hill Climbing
    – Simulated Annealing
    – Evolutionary Algorithms
        – Genetic Programming
        – Genetic Algorithms
– Enumerative Techniques
    – BFS
    – DFS
    – Dynamic Programming

What is a GA?

An allele is the simplest piece of genetic information, a base-pair in humans, a bit in computers

A gene is a string of alleles. In humans it is responsible for the production of a protein; in a GA it describes one particular feature of a solution

A chromosome is a sequence of genes. Humans have many; GAs are generally considered to have just one, namely the complete set of genes describing all features of the problem

The genotype refers to the genes of an organism, the phenotype refers to the observable characteristics

What is a GA? - Example

Consider the problem of matching observed data to a polynomial curve – $ax^4 + bx^3 + cx^2 + dx + e$

Our environment variables are a, b, c, etc. and it is these that will form our genes. Let us assume we have an 8-bit signed integer for each variable. One random chromosome is then:

{01010101 11110011 11001010 00001000 11010010}

Each 8-bit word represents a gene. Each single bit is an allele. In total we have a 40-bit chromosome.
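As a rough illustration, the chromosome above can be decoded into its five coefficients. The slide only says "8-bit signed integer", so the two's-complement interpretation used here is an assumption, and the function names are made up for this sketch.

```python
# The example chromosome from the slide: five 8-bit genes (a, b, c, d, e).
CHROMOSOME = "01010101 11110011 11001010 00001000 11010010"


def decode_gene(bits):
    """Read an 8-bit string as a signed integer (two's complement assumed)."""
    value = int(bits, 2)
    return value - 256 if value >= 128 else value


def decode_chromosome(chromosome):
    """Split the chromosome into genes and decode each one."""
    return [decode_gene(gene) for gene in chromosome.split()]


a, b, c, d, e = decode_chromosome(CHROMOSOME)
print(a, b, c, d, e)  # 85 -13 -54 8 -46
```

Under that assumption this chromosome stands for the curve 85x^4 - 13x^3 - 54x^2 + 8x - 46.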

Genotypes and phenotypes

Continuing the curve fitting example, we can discuss the differences between genotypes and phenotypes

Genotype-space is the space of possible chromosomes, i.e. the space of possible 40-bit strings, so it has $2^{40}$ elements.

Phenotype-space is the space of possible 4th order polynomial curves (with 8-bit signed coefficients), which are the characteristics given by the genes. There may be fewer elements in the phenotype space, as multiple genotypes can map to a single phenotype.

Metaphor

Nature | Genetic Algorithm

Environment | Optimisation problem

Individuals – humans | Feasible solutions

Degree of adaptation | Solution quality/fitness

A population | A set of feasible solutions

Selection, recombination & mutation | Genetic operators – analogues of the biological operators

Life | Iteratively applying operators to population

Flowchart of a Genetic Algorithm

1. Initialize population

2. Evaluate fitness

3. Terminate? If yes, output solution

4. If no, perform selection, crossover and mutation

5. Evaluate fitness, then return to step 3
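The flowchart can be sketched as a short loop. This is only one possible arrangement: the fixed generation budget as the termination test, the binary tournament, the single-point crossover, the parameter defaults and the name `run_ga` are all assumptions made for the illustration, not details from the lecture.

```python
import random


def run_ga(fitness, n_bits=20, pop_size=30, generations=50, p_mut=0.02, seed=0):
    """Minimal GA over bit lists; higher fitness is treated as better."""
    rng = random.Random(seed)
    # Initialize population with random bit lists.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]  # Evaluate fitness
    for _ in range(generations):  # Terminate? (here: a fixed budget)
        new_pop = []
        while len(new_pop) < pop_size:
            # Selection: binary tournament for each parent.
            p1 = max(rng.sample(range(pop_size), 2), key=lambda i: scores[i])
            p2 = max(rng.sample(range(pop_size), 2), key=lambda i: scores[i])
            # Crossover: single random cut point.
            cut = rng.randrange(1, n_bits)
            child = pop[p1][:cut] + pop[p2][cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop
        scores = [fitness(ind) for ind in pop]  # Evaluate fitness again
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]  # Output solution


# Toy problem (an assumption): maximise the number of 1 bits.
best, score = run_ga(sum)
print(best, score)
```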

Problem Encoding

Each problem variable becomes a gene with sufficient bits to represent the domain of possible values

– Genes can represent integers, real numbers, permutations of a list, lists of rules, sequences of instructions

Must be careful (particularly with last three) that solutions remain feasible

After instantiating the genes, need some measure of fitness which can judge which genes give the best performance in the phenotype space

Fitness

Generally consider that higher fitness equates to a better solution. When it doesn’t, it is necessary to standardise the fitness function so that it does

Continuing the curve fitting example, we could find the sum of squared errors between the true curve (given by data points) and our evolved curve (given by its coefficients)

Of course, this function should ideally be zero, which happens when the evolved curve is a perfect match

$$f_p = \sum_{x_i \in \text{points}} \left( T(x_i) - \left( a x_i^4 + b x_i^3 + c x_i^2 + d x_i + e \right) \right)^2$$
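A minimal sketch of this error measure, assuming the data points are given as (x, observed value) pairs. The function names and the sample coefficients are hypothetical.

```python
def polynomial(coeffs, x):
    """Evaluate a*x^4 + b*x^3 + c*x^2 + d*x + e."""
    a, b, c, d, e = coeffs
    return a * x**4 + b * x**3 + c * x**2 + d * x + e


def sum_squared_error(coeffs, points):
    """Sum over the data points of (observed - evolved)^2; 0 is a perfect fit."""
    return sum((t - polynomial(coeffs, x)) ** 2 for x, t in points)


# Made-up "true" curve sampled at seven integer points.
true_coeffs = (1, 0, -2, 0, 3)
points = [(x, polynomial(true_coeffs, x)) for x in range(-3, 4)]

print(sum_squared_error(true_coeffs, points))        # 0
print(sum_squared_error((1, 0, -2, 0, 4), points))   # 7 (off by 1 at each of 7 points)
```

Note that lower is better here, so this raw measure runs in the opposite direction to the "higher fitness is better" convention.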

Standardising Fitness

Easiest way to standardise fitness is to subtract it from the largest fitness in the current population

p refers to a single population element, P to the current set of all population elements

$$f'_p = f^{P}_{\mathrm{MAX}} - f_p$$
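A small sketch of this standardisation, assuming the raw fitness values are errors where lower is better. The name `standardise` is hypothetical.

```python
def standardise(raw_fitnesses):
    """Subtract each raw value from the population maximum so higher is better."""
    largest = max(raw_fitnesses)
    return [largest - f for f in raw_fitnesses]


# Raw errors for three population elements: the perfect match (error 0)
# ends up with the highest standardised fitness.
print(standardise([0, 7, 3]))  # [7, 0, 4]
```

One consequence worth noticing is that the worst element always receives a standardised fitness of zero.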

Selection

Many different types of selection:

– Fitness proportionate: randomly select elements based on their fitness proportionate to total population fitness

– Fitness guaranteed: select each element a guaranteed minimum number of times based on its fitness

– Rank selection: select based on fitness rank, rather than on true fitness. Useful when the fitness function increases exponentially and some elements could dominate.

– Tournament selection: extract k elements from the population and make them compete against each other for selection. Useful when evolving strategies.

All selection methods give a pool of individuals that may participate in reproduction
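Two of these schemes can be sketched as follows. The sketch assumes non-negative, standardised fitness values (higher is better), and in the tournament the "competition" is simply a comparison of fitness, which is one possible reading. The function names are hypothetical.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Fitness proportionate: chance of selection is fitness / total fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point edge cases


def tournament_select(population, fitnesses, k, rng):
    """Tournament: draw k distinct elements and keep the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]


rng = random.Random(1)
population = ["A", "B", "C", "D"]
fitnesses = [1.0, 2.0, 3.0, 4.0]
pool = [roulette_select(population, fitnesses, rng) for _ in range(10)]
print(pool)
print(tournament_select(population, fitnesses, k=2, rng=rng))
```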

Genetic Operators

Primary operators are crossover, mutation and to some extent cloning.

Many other specialised operators have been formulated for particular problems

Genetic operators can be considered as performing a local search of the phenotype space. The local area of a genotype is the set of all other genotypes that can be reached by applying the genetic operators.

As such, the local area depends on the larger population, rather than a single genotype

Crossover/Recombination

Analogue of sexual reproduction – combines the genetic material of two parents to form two new offspring

This is the main genetic operator. Different types: gene based, crossover, random

Gene Preserving recombination

Gene based. Don’t break genes.

Parent 1:  01101001 01001110 10101101 10110101
Parent 2:  11010100 01011010 10110100 10100101
Offspring: 11011001 01011010 10101101 10100101

Crossover recombination

Pick a point and take all bits to the left from one parent, and all bits to the right from the other. Can preserve gene integrity:

Parent 1:  01101001 01001110|10101101 10110101
Parent 2:  11010100 01011010|10110100 10100101
Offspring: 01101001 01001110|10110100 10100101

But more often the point is simply chosen at random:

Parent 1:  01101001 0|1001110 10101101 10110101
Parent 2:  11010100 0|1011010 10110100 10100101
Offspring: 01101001 0|1011010 10110100 10100101
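Single-point crossover on bit strings can be sketched in a couple of lines. The two parents are the ones shown on the slide; the function name and the 0-based cut index are assumptions of this sketch.

```python
def single_point_crossover(parent1, parent2, cut):
    """Swap the tails of two equal-length bit strings after position `cut`."""
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


# Parents from the slide, with the spaces between genes removed.
p1 = "01101001 01001110 10101101 10110101".replace(" ", "")
p2 = "11010100 01011010 10110100 10100101".replace(" ", "")

# Cut on a gene boundary (after 16 bits): genes stay intact.
print(single_point_crossover(p1, p2, 16)[0])
# Cut inside the second gene (after 9 bits): that gene is a mix of both parents.
print(single_point_crossover(p1, p2, 9)[0])
```

The operator produces two offspring per pair of parents; the slide shows only the first of them.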

Multi-point crossover

Single-point crossover is less disruptive to a genotype than multi-point crossover

Two-point crossover treats the genotype as a ring, where the start and end allele are considered joined

Some evidence to suggest that multi-point crossover can be useful in smaller population sizes
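A minimal sketch of two-point crossover, under the assumption that the two cut points are given as 0-based indices. Seen as a ring, the segment between the cuts comes from one parent and the remainder, which wraps around from the end of the string to its start, comes from the other. The function name is hypothetical.

```python
def two_point_crossover(parent1, parent2, i, j):
    """Exchange the segment between two cut points of equal-length strings."""
    i, j = sorted((i, j))
    child1 = parent1[:i] + parent2[i:j] + parent1[j:]
    child2 = parent2[:i] + parent1[i:j] + parent2[j:]
    return child1, child2


# Letters instead of bits make the origin of each position visible.
print(two_point_crossover("AAAAAAAA", "BBBBBBBB", 2, 5))
# ('AABBBAAA', 'BBAAABBB')
```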

Mutation

Asexual reproduction, using just a single parent.

Flip the bits at a number of random positions in a chromosome.

Used to re/introduce diversity into the population.

Parent:    01101001 01001110 10101101 10110101
Offspring: 01111001 01001110 10001101 10010101
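Bit-flip mutation might be sketched like this. The per-bit flip probability `p_flip` is an assumption (the slide only says "a number of random positions"), and `flip_bits` is a hypothetical helper used to reproduce the slide's example deterministically.

```python
import random


def flip_bits(chromosome, positions):
    """Flip the bits at the given 0-based positions of a bit string."""
    return "".join(
        ("1" if bit == "0" else "0") if i in positions else bit
        for i, bit in enumerate(chromosome)
    )


def mutate(chromosome, p_flip, rng):
    """Flip each bit independently with probability p_flip."""
    positions = {i for i in range(len(chromosome)) if rng.random() < p_flip}
    return flip_bits(chromosome, positions)


parent = "01101001 01001110 10101101 10110101".replace(" ", "")
# The slide's offspring differs from the parent at positions 3, 18 and 26.
print(flip_bits(parent, {3, 18, 26}))
print(mutate(parent, 0.05, random.Random(0)))
```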

Cloning

Cloning exactly duplicates/copies an existing genotype into the successive generation

Often used to copy the best element(s) of one population to the next, in which case the GA is described as elitist, since the elite elements remain
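Elitism can be sketched as copying the top few elements unchanged and filling the rest of the next generation with offspring. The function name, the `n_elite` parameter and the way offspring are truncated are assumptions of this sketch.

```python
def elitist_next_generation(population, fitnesses, offspring, n_elite=1):
    """Clone the n_elite fittest elements, then fill up with offspring."""
    ranked = sorted(zip(fitnesses, population), key=lambda pair: pair[0], reverse=True)
    elite = [individual for _, individual in ranked[:n_elite]]
    return elite + offspring[: len(population) - n_elite]


population = ["a", "b", "c"]
fitnesses = [1, 5, 2]
offspring = ["x", "y", "z"]
print(elitist_next_generation(population, fitnesses, offspring))  # ['b', 'x', 'y']
```

With this arrangement the best fitness in the population can never decrease from one generation to the next.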

Genetic Programming

GP was developed by Koza around 1990

Extends the GA to a non-linear, tree-based structure.

Instead of single bits, nodes in the tree represent functions and terminals (constants).

Internal nodes are functions

Leaf nodes are terminals or 0-arity functions, such as rand()
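One simple way to picture such a tree is as nested tuples, with a function name at the head of each internal node. The representation, the small binary function set and the use of a variable "x" as a terminal are all assumptions made for this sketch.

```python
import operator

# Hypothetical function set: every function here takes two arguments.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node, x):
    """Evaluate an expression tree for a given value of the variable x."""
    if isinstance(node, tuple):  # internal node: (function, left, right)
        name, left, right = node
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    if node == "x":  # leaf: the input variable
        return x
    return node  # leaf: a constant terminal


tree = ("+", ("*", "x", "x"), 3)  # represents x*x + 3
print(evaluate(tree, 2))  # 7
```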

Unrestricted Size and Bloat

Beyond practical limitations, the chromosomes of GP are not restricted to a particular size or shape

The trees will continue to grow as long as there is no appreciable reduction in fitness

So as an example, consider curve fitting. If you didn’t know it was a 4th order polynomial, you could use GP instead, which could learn 4th, 5th, etc. order polynomials

Introduces the problem of bloat – trees will continue to increase in size even if there is no increase in fitness, as long as fitness remains constant

Multi-criteria Optimisation

Bloat is often difficult to control because it introduces a multi-criteria problem – that is, give me the equation of the best fitting curve that is also smallest

How do you trade-off an improved curve with a smaller description?

No easy solution – introduces the concept of Pareto optimisation, and Pareto fronts.

One answer is said to dominate another if it is at least as good according to every measure and strictly better according to at least one.

The current set of non-dominated solutions is termed the Pareto-optimal front
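Dominance and the non-dominated set can be sketched directly from these definitions. The sketch assumes every criterion is to be minimised, with each solution given as an (error, tree size) pair; the candidate values are made up.

```python
def dominates(a, b):
    """True if a is no worse than b on every criterion and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# (fitting error, tree size) pairs for five hypothetical evolved curves.
candidates = [(0.5, 9), (0.5, 12), (0.2, 15), (0.9, 4), (1.0, 5)]
print(pareto_front(candidates))  # [(0.5, 9), (0.2, 15), (0.9, 4)]
```

Here (0.5, 12) is dropped because (0.5, 9) fits equally well with a smaller tree, and (1.0, 5) is dropped because (0.9, 4) is better on both counts. None of the three survivors can be ranked against the others without deciding how to trade error against size.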

Use of GP

GP can be used to evolve programs

That can be computer programs, functions, strategies

What you can evolve is limited only by the choices of functions and terminals

GP is usually strongly typed to ensure that solutions remain feasible. For example, adding an Int and a String has no meaning, and would not be allowed to occur – interchanged subtrees are always of the same type

GP Operators

Extends crossover to interchange subtrees of two parents

Mutation generates an entirely new subtree rooted at a random point

Cloning is identical to the GA case
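Subtree crossover can be sketched on trees stored as nested tuples of the form (function, child, child). That representation and all helper names are assumptions, and the sketch is untyped, so it does not enforce the same-type restriction that strongly typed GP would require.

```python
import random


def paths(tree, prefix=()):
    """Yield the path (a tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get_subtree(tree, path):
    """Follow a path of child indices down from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def subtree_crossover(t1, t2, rng):
    """Swap one randomly chosen subtree of t1 with one of t2."""
    p1 = rng.choice(list(paths(t1)))
    p2 = rng.choice(list(paths(t2)))
    return (replace_subtree(t1, p1, get_subtree(t2, p2)),
            replace_subtree(t2, p2, get_subtree(t1, p1)))


parent1 = ("+", ("*", "x", "x"), 3)
parent2 = ("-", "x", ("*", 2, "x"))
print(subtree_crossover(parent1, parent2, random.Random(0)))
```

Mutation could reuse `replace_subtree` by substituting a freshly generated random subtree at the chosen path.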

Comparing GAs and GP

There are extensions to GAs that allow those linear data structures to represent non-linear, tree-like data. After all, computer memory is linear and it still manages to store trees somehow.

These GAs are generally termed messy GAs, and do not have a predetermined or fixed length

But generally, if you know the form of the solution you are searching for, you would use a GA; if not, use GP

Bibliography

I direct you towards the following books:

– Richard Dawkins, “The Selfish Gene”

– John Holland, “Adaptation in Natural and Artificial Systems”

– John Koza, “Genetic Programming”

The first one is popular science. The second two are the original GA and GP textbooks respectively.