Building Blocks CS 5764 Evolutionary Computation Hod Lipson

Building Blocks - Cornell University · The Building Block Hypothesis • GAs performs adaptation by identifying and recombining "building blocks", i.e. low order, low defining-length




Unifying ideas

• Knowledge represented as a population of solutions containing building blocks
• Progress is driven by two key processes:
  – Incremental progress, e.g. mutation (traditional optimization): refinement
  – Recombination of solutions, e.g. crossover: discovering new areas (possibly initially inferior)


Terminology

“Chromosome”, “Gene” (labels on the example bit string):

01010100111001010101010010110

Allele: one of two or more forms of a gene or a genetic locus


A GA Schema

• A “template”
  – a string of symbols taken from the alphabet {0,1,*}
  – 010*1, *110*, *****, 10101
• The character “*” means “don’t care”
  – *10*1 represents 01001, 01011, 11001, and 11011
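
The "don't care" rule above can be sketched in a few lines of Python. The helper names `matches` and `expand` are hypothetical, chosen for illustration; they are not part of the slides.

```python
from itertools import product

def matches(schema, bits):
    """Return True if the bit string is an instance of the schema."""
    return len(schema) == len(bits) and all(
        s == "*" or s == b for s, b in zip(schema, bits)
    )

def expand(schema):
    """List every bit string the schema represents."""
    # A '*' may be 0 or 1; a fixed symbol has only one choice.
    choices = [("0", "1") if s == "*" else (s,) for s in schema]
    return ["".join(p) for p in product(*choices)]

print(expand("*10*1"))  # ['01001', '01011', '11001', '11011']
```

The output reproduces the four strings listed for *10*1.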


Geometric Interpretation

A Schema is a hyperplane in the larger search space manifold


Order of a schema

• Number of specified (non-*) alleles in a gene

Order   Represented Strings
?       000 001 010 011
?       010 011 110 111
?       010 110
?       101


Order of a schema

• How many different strings of length N does a schema of order “O” represent?
  – A schema of order O represents $2^{N-O}$ different strings of length N

Schema   Order   Represented Strings
***      0       000 001 010 011 100 101 110 111
*1*      1       010 011 110 111
*10      2       010 110
101      3       101
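
As a quick check of the counting rule, the following sketch computes the order of a schema and the number of strings it represents. The function names `order` and `num_instances` are illustrative assumptions.

```python
def order(schema):
    """Order O: the number of fixed (non-'*') positions."""
    return sum(ch != "*" for ch in schema)

def num_instances(schema):
    """A schema of order O over length N represents 2**(N - O) strings."""
    return 2 ** (len(schema) - order(schema))

for s in ["***", "*1*", "*10", "101"]:
    print(s, order(s), num_instances(s))
```

For the four schemata in the table this gives 8, 4, 2, and 1 represented strings, matching the rows above.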


Destructive Dynamics

• Probability of surviving mutation

$S_m(H) = (1 - p_m)^{O(H)}$, where $p_m$ is the per-bit mutation probability and $O(H)$ is the order of H


Defining Length

• “D” = The distance between the furthest two non-* symbols

Schemata              D
****  *1**            0
*10*  10**            1
1*1*                  2
1*11  0**1  1001      3

Why is the length important?
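
The defining length can be computed directly from the positions of the fixed symbols. This is a minimal sketch; the name `defining_length` is an assumption, and schemata with fewer than two fixed symbols are given D = 0, as in the table.

```python
def defining_length(schema):
    """Distance between the outermost non-'*' positions (0 if fewer than two)."""
    fixed = [i for i, ch in enumerate(schema) if ch != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

for s in ["****", "*1**", "*10*", "10**", "1*1*", "1*11", "0**1", "1001"]:
    print(s, defining_length(s))
```

A single-point crossover on a string of length N has N - 1 possible cut points, and D of them fall between the outermost fixed symbols, which is why longer schemata are cut more often.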


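Destructive Dynamics

• Probability of surviving single point crossover

$S_c(H) \ge 1 - p_c \frac{D(H)}{N - 1}$, where $p_c$ is the crossover probability, $D(H)$ is the defining length of H, and N is the string length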


Strings containing schemata

• A bit string represented by a schema is said to “contain” the schema

Bit String   Contained Schemata
1            1 *
00           00 0* *0 **
110          110 11* 1*0 1** *10 *1* **0 ***
1011         1011 101* 10*1 10** 1*11 1*1* 1**1 1***
             *011 *01* *0*1 *0** **11 **1* ***1 ****

How many schemata does a string of length N include?
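
One way to explore the question is to enumerate the schemata a string contains: each position either keeps its bit or becomes "*". The helper name `contained_schemata` is hypothetical.

```python
from itertools import product

def contained_schemata(bits):
    """All schemata that the given bit string is an instance of."""
    # Each position independently keeps its bit or is replaced by '*'.
    choices = [(b, "*") for b in bits]
    return ["".join(p) for p in product(*choices)]

print(contained_schemata("110"))
# ['110', '11*', '1*0', '1**', '*10', '*1*', '**0', '***']
print(len(contained_schemata("1011")))  # 16
```

With two choices per position, a string of length N yields 2^N schemata.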


How many schemata in a population?

• There are $3^N$ different schemata (potential genes) of length N
• A population of P bit-strings each of length N contains between $2^N$ and $\min(P \cdot 2^N, 3^N)$ schemata

N    P     Number of Schemata
3    100   ? - ?

[Figure: all possible schemata]


How many schemata in a population?

• There are $3^N$ different schemata of length N
• A population of P bit-strings each of length N contains between $2^N$ and $\min(P \cdot 2^N, 3^N)$ schemata

N     P     Number of Schemata
6     20    64 - 729
20    50    1048576 - 52428800
40    100   -
100   300   -

[Figure: all possible schemata, N=3]
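
The bounds in the table can be reproduced with a short sketch, together with an exact count for a concrete population. The function names are illustrative assumptions, and the four-string population is example data.

```python
from itertools import product

def schema_count_bounds(n, p):
    """Lower and upper bounds on distinct schemata in P strings of length N."""
    return 2 ** n, min(p * 2 ** n, 3 ** n)

def count_schemata(population):
    """Exact number of distinct schemata contained in a population."""
    seen = set()
    for bits in population:
        choices = [(b, "*") for b in bits]
        seen.update("".join(p) for p in product(*choices))
    return len(seen)

print(schema_count_bounds(6, 20))   # (64, 729)
print(schema_count_bounds(20, 50))  # (1048576, 52428800)
print(count_schemata(["101", "100", "010", "110"]))
```

The lower bound is reached when all P strings are identical; the upper bound is capped by the 3^N schemata that exist at all.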


Estimating Fitness associated with a gene

Population   f
101          5
100          1
010          2
110          3

Schemata   f
***        (5+1+2+3) / 4 = 2.75
**0        (1+2+3) / 3 = 2
**1        5 / 1 = 5
*0*        (5+1) / 2 = 3
*00        1 / 1 = 1
*01        5 / 1 = 5
*1*        (2+3) / 2 = 2.5

Estimation uncertainty: Standard error
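
The averages in the table can be recomputed from the population, along with a standard error for each estimate. This is a sketch under stated assumptions: `estimate_fitness` is a hypothetical helper, and the standard error is taken as the sample standard deviation divided by the square root of the number of instances, left undefined for a single instance.

```python
from math import sqrt
from statistics import stdev

# (bit string, fitness) pairs from the population table
population = [("101", 5), ("100", 1), ("010", 2), ("110", 3)]

def matches(schema, bits):
    return all(s == "*" or s == b for s, b in zip(schema, bits))

def estimate_fitness(schema, population):
    """Return (mean fitness, standard error) over instances of the schema.

    Returns None if the schema has no instance in the population; the
    standard error is None when only one instance is available.
    """
    values = [f for bits, f in population if matches(schema, bits)]
    if not values:
        return None
    mean = sum(values) / len(values)
    sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else None
    return mean, sem

for schema in ["***", "**0", "**1", "*0*", "*00", "*01", "*1*"]:
    print(schema, estimate_fitness(schema, population))
```

Schemata sampled by a single string, such as **1, carry no uncertainty estimate at all, which is a reminder that their observed fitness may be far from the true schema average.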


Observations

• If only fitness-proportionate selection is applied (no crossover or mutation), schemata with above (below) average fitness are sampled, generation after generation, by an increasing (decreasing) number of chromosomes.
• Schemata with a long defining length have a higher probability of being disrupted by crossover
• Schemata with high order have a higher probability of being disrupted by mutation
• Schemata with a low order and a short defining length are called building blocks
• Building blocks are processed with minimum disruption by GAs, therefore GAs use building blocks of relatively high fitness to build entire solutions


• GAs will be successful insofar as the problem has been encoded in a way that can be solved with compact building blocks (low order, low defining lengths)
• What is the easiest problem you can think of?


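Dynamics

• H is a schema present in the population at time t
• m(H,t) is the number of instances of H at time t
• u(H,t) is the observed average fitness of H at time t
• The expected number of offspring of x is $f(x) / f_{\mathrm{avg}}(t)$
• If x is an instance of H, then summing over all instances of H gives
  $E[m(H,t+1)] = \sum_{x \in H} \frac{f(x)}{f_{\mathrm{avg}}(t)} = \frac{u(H,t)}{f_{\mathrm{avg}}(t)} \, m(H,t)$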
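
The selection-only dynamics can be illustrated numerically by summing the expected offspring f(x)/favg(t) over the instances of a schema. The sketch below assumes fitness-proportionate selection with no crossover or mutation; `expected_instances` is a hypothetical name and the four-string population is example data.

```python
# (bit string, fitness) pairs; example data
population = [("101", 5), ("100", 1), ("010", 2), ("110", 3)]

def matches(schema, bits):
    return all(s == "*" or s == b for s, b in zip(schema, bits))

def expected_instances(schema, population):
    """Expected number of instances of the schema in the next generation."""
    f_avg = sum(f for _, f in population) / len(population)
    # Each instance x contributes f(x) / f_avg expected offspring.
    return sum(f / f_avg for bits, f in population if matches(schema, bits))

for schema in ["***", "**1", "**0"]:
    print(schema, round(expected_instances(schema, population), 3))
```

In this example **1 has above-average fitness and is expected to grow from one instance to roughly 1.8, while **0 has below-average fitness and is expected to shrink from three instances to roughly 2.2.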


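Destructive Dynamics

• Probability of surviving single point crossover
  $S_c(H) \ge 1 - p_c \frac{D(H)}{N - 1}$, where $p_c$ is the crossover probability, $D(H)$ is the defining length of H, and N is the string length
• Probability of surviving mutation
  $S_m(H) = (1 - p_m)^{O(H)}$, where $p_m$ is the per-bit mutation probability and $O(H)$ is the order of H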
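
A small sketch can make the two survival probabilities concrete. It assumes the standard schema-theorem forms: a lower bound of 1 - p_c * D / (N - 1) for single-point crossover and (1 - p_m)^O for mutation. The function names and the parameters `p_c` and `p_m` are assumptions introduced for illustration.

```python
def order(schema):
    return sum(ch != "*" for ch in schema)

def defining_length(schema):
    fixed = [i for i, ch in enumerate(schema) if ch != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

def survive_crossover_bound(schema, p_c):
    """Lower bound on surviving single-point crossover applied with probability p_c."""
    return 1 - p_c * defining_length(schema) / (len(schema) - 1)

def survive_mutation(schema, p_m):
    """Probability that none of the fixed positions is flipped (per-bit rate p_m)."""
    return (1 - p_m) ** order(schema)

for s in ["*1**", "*10*", "1*1*", "1**1", "1001"]:
    print(s, survive_crossover_bound(s, 0.7), survive_mutation(s, 0.01))
```

Compact, low-order schemata such as *1** score close to 1 on both measures, whereas 1**1 is cut by every crossover point and 1001 exposes four positions to mutation.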


Combining Effects
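
Multiplying the selection growth factor by the two survival probabilities gives the usual statement of the schema theorem. The form below is a sketch in the notation of these slides (order O, defining length D, string length N); the symbols $p_c$ and $p_m$, for the crossover probability and the per-bit mutation probability, are assumptions introduced here.

```latex
E[m(H, t+1)] \;\ge\; \frac{u(H,t)}{f_{\mathrm{avg}}(t)} \, m(H,t)
  \left(1 - p_c \frac{D(H)}{N - 1}\right) (1 - p_m)^{O(H)}
```

The result is a lower bound because it accounts only for disruption of existing instances, not for new instances of H created by crossover or mutation.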


[Figure: Best Fitness (y-axis, 4 to 13) versus Generation / Evaluations (x100) (x-axis, 0 to 3000). Curves: Random Search; Parallel Hillclimber; Parallel Simulated Annealing; GA (Roulette, Tight Linkage); GA (Diversity, Poor Linkage); GA (Diversity, Tight Linkage)]

Large defining length and small order = poor linkage


The Building Block Hypothesis

• GAs perform adaptation by identifying and recombining "building blocks", i.e. low order, low defining-length schemata with above average fitness.
• GAs perform adaptation by implicitly and efficiently implementing this heuristic.


Caveats

• Model assumes particular form of representation:
  – Bit strings, single point crossover, mutation
• Assumes fitness-proportionate selection
• Assumes fixed fitness criterion
• Assumes fixed population size

Many variations have been published