Digression: Symbolic Regression
Suppose you are a criminologist, and you have some data about recidivism.
Years in Prison   Holds Ph.D.   IQ    Injects Heroin in Eyeballs   Recidivist
10                0             87    1                            1
4                 1             86    0                            0
22                1             186   1                            1
6                 0             108   0                            1
8                 0             143   0                            0
:                 :             :     :                            :
Criminology 101
You want a formula that predicts if someone will go back to jail after being released.
The formula will be based on the data collected, so the “independent variables” are
– x1 = number of years in jail
– x2 = holds Ph.D.
– x3 = IQ
– etc.
This is usually done with “regression”. Here is a simpler example, with one independent variable.
Symbolic Regression
A simple data set with one independent variable, called x. What’s the relationship between x and y?
[Figure: scatter plot of y against x]
x    y
1    2.1
2    3.3
4    3.1
5    1.8
7    3.2
:    :
Symbolic Regression
You might try “linear regression”:
[Figure: plot of y against x]
y = mx + b
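As a rough illustration of what fitting y = mx + b involves, here is a minimal sketch that fits the five data points from the x/y table by ordinary least squares. The slides do not say which fitting method is meant, so least squares is an assumption, and the names `LinearFitDemo` and `fit` are made up for this example.

```java
class LinearFitDemo {
    // Returns {m, b} minimizing the sum of squared residuals for y = m*x + b.
    // Assumes ordinary least squares; the slides do not name a fitting method.
    static double[] fit(double[] xs, double[] ys) {
        int n = xs.length;
        double sx = 0, sy = 0, sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sx += xs[i];
            sy += ys[i];
            sxy += xs[i] * ys[i];
            sxx += xs[i] * xs[i];
        }
        double m = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double b = (sy - m * sx) / n;
        return new double[] {m, b};
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 4, 5, 7};
        double[] ys = {2.1, 3.3, 3.1, 1.8, 3.2};
        double[] mb = fit(xs, ys);
        System.out.println("m = " + mb[0] + ", b = " + mb[1]);
    }
}
```

With these five points the slope works out to 6/114 (about 0.053) and the intercept to about 2.5, so a straight line captures very little of the variation in y.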
Symbolic Regression
You might try “quadratic regression”:
[Figure: plot of y against x]
y = ax^2 + bx + c
Symbolic Regression
You might try “exponential regression”:
[Figure: plot of y against x]
y = ax^b + c
Symbolic Regression
How would you choose?
Maybe there is some underlying “mechanism” that produced the data.
But you may not know…
“Symbolic regression” finds the form of the equation, and the coefficients, simultaneously.
How To Do Symbolic Regression?
One way: genetic programming.
“The evolution of computer programs through natural selection.”
The brainchild of John Koza, extending work by John Holland.
A very bizarre idea that actually works!
We will do this.
Regression via Genetic Programming
We know how to produce “algebraic expression trees.”
We can even form them randomly.
Koza says “Make a generation of random trees, evaluate their fitnesses, then let the more fit have sex to produce children.”
Maybe the children will be more fit?
Expression Trees Again
A one-variable tree is a regression equation:
[Figure: expression tree with root +; its left subtree is (x + 0.5) - x and its right subtree is 2 * x]
y = (((x + 0.5) - x) + (2 * x))
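One way such a tree might be represented in Java is sketched below. The class names (`Node`, `Const`, `Variable`, `Binop`, `ExpressionTreeDemo`) are hypothetical, chosen for this example only; they are not taken from the course implementation, which is not shown here.

```java
class ExpressionTreeDemo {
    // Builds the tree for y = (((x + 0.5) - x) + (2 * x)).
    static Node slideTree() {
        Node x = new Variable();
        return new Binop('+',
                new Binop('-', new Binop('+', x, new Const(0.5)), x),
                new Binop('*', new Const(2), x));
    }

    public static void main(String[] args) {
        Node tree = slideTree();
        System.out.println("y = " + tree);
        System.out.println(tree.eval(1.0)); // prints 2.5
    }
}

// Hypothetical node type: every subtree can be evaluated for a given x.
interface Node {
    double eval(double x);
}

final class Const implements Node {
    private final double value;
    Const(double value) { this.value = value; }
    public double eval(double x) { return value; }
    @Override public String toString() { return String.valueOf(value); }
}

final class Variable implements Node {
    public double eval(double x) { return x; }
    @Override public String toString() { return "x"; }
}

final class Binop implements Node {
    private final char op;
    private final Node left, right;

    Binop(char op, Node left, Node right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    public double eval(double x) {
        double l = left.eval(x), r = right.eval(x);
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default: throw new IllegalStateException("unknown operator " + op);
        }
    }

    @Override public String toString() {
        return "(" + left + " " + op + " " + right + ")";
    }
}
```

Evaluating the tree is a recursive walk: each operator node evaluates its two children and combines the results, so the tree and the equation are two views of the same thing.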
Evaluating Expression Trees
y^p = (((x + 0.5) - x) + (2 * x))
x    y^o    y^p     |y^o - y^p|^2
1    2.1    2.5     0.16
2    3.3    4.5     1.44
4    3.1    8.5     29.16
5    1.8    10.5    75.69
7    3.2    14.5    127.69
                    234.14 = “fitness”
Superscripts: “o” for “observed”, “p” for “predicted”
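The fitness calculation in the table can be reproduced with a few lines of Java. This is a sketch only: the tree is stood in for by a lambda, and the names `FitnessDemo` and `fitness` are invented for the example.

```java
import java.util.Locale;
import java.util.function.DoubleUnaryOperator;

class FitnessDemo {
    // Sum of squared differences between observed and predicted y.
    // Note that a smaller value means a better fit.
    static double fitness(DoubleUnaryOperator model, double[] xs, double[] observed) {
        double sum = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double diff = observed[i] - model.applyAsDouble(xs[i]);
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 4, 5, 7};
        double[] ys = {2.1, 3.3, 3.1, 1.8, 3.2};
        // The tree from the slide, written out as a formula.
        DoubleUnaryOperator tree = x -> ((x + 0.5) - x) + (2 * x);
        // prints fitness = 234.14
        System.out.println(String.format(Locale.ROOT, "fitness = %.2f", fitness(tree, xs, ys)));
    }
}
```

Because this "fitness" is really an error total, the better trees are the ones with the smaller numbers.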
A Generation of Random Trees
[Figure: Tree 1, Tree 2, Tree 3, Tree 4, …]
Tree   Fitness
1      335
2      1530
3      950
4      1462
:      :
(most of these are really rotten!)
Choosing Parents
[Figure: Generation 1: Tree 1, Tree 2, Tree 3, Tree 4, …]
Tree   Fitness
1      335
2      1530
3      950
4      1462
:      :
Choose these two, randomly, “proportional to their fitness”
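Choosing parents "proportional to their fitness" is often done with roulette-wheel selection, sketched below. Since the fitness values here are error totals (lower is better), the sketch assumes each tree is weighted by 1 / (1 + error); that weighting is one common choice and is not stated in the slides. The names `SelectionDemo`, `weights` and `pick` are hypothetical.

```java
import java.util.Random;

class SelectionDemo {
    // Assumption: convert an error (lower is better) into a selection weight
    // (higher is better) with 1 / (1 + error).
    static double[] weights(double[] errors) {
        double[] w = new double[errors.length];
        for (int i = 0; i < errors.length; i++) {
            w[i] = 1.0 / (1.0 + errors[i]);
        }
        return w;
    }

    // Roulette-wheel selection: index i is returned with probability w[i] / sum(w).
    static int pick(double[] w, Random rng) {
        double total = 0.0;
        for (double v : w) total += v;
        double r = rng.nextDouble() * total;
        for (int i = 0; i < w.length; i++) {
            r -= w[i];
            if (r < 0) return i;
        }
        return w.length - 1; // reached only through rounding error
    }

    public static void main(String[] args) {
        double[] errors = {335, 1530, 950, 1462}; // fitness column from the table
        double[] w = weights(errors);
        Random rng = new Random(42);
        int[] counts = new int[errors.length];
        for (int i = 0; i < 10000; i++) counts[pick(w, rng)]++;
        for (int i = 0; i < counts.length; i++) {
            System.out.println("Tree " + (i + 1) + " chosen " + counts[i] + " times");
        }
    }
}
```

Under this weighting Tree 1, with the smallest error, should be chosen most often, but every tree keeps some chance of becoming a parent.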
“Sexual Reproduction”
Choose “crossover points”, at random
Then, swap the subtrees to make two new child trees:
[Figure: two parent trees in Generation 1 and their two child trees in Generation 2]
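Subtree crossover might be sketched as follows. The `TreeNode` class here is a deliberately minimal, hypothetical stand-in (a label plus two children), and working on copies so the parents survive unchanged is a design choice of this sketch, not something the slides specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class CrossoverDemo {
    // Hypothetical minimal node: a label ("+", "x", "0.5", ...) and two children (null for leaves).
    static final class TreeNode {
        String label;
        TreeNode left, right;

        TreeNode(String label, TreeNode left, TreeNode right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }

        TreeNode copy() {
            return new TreeNode(label, left == null ? null : left.copy(), right == null ? null : right.copy());
        }

        int size() {
            return 1 + (left == null ? 0 : left.size()) + (right == null ? 0 : right.size());
        }

        @Override public String toString() {
            return left == null ? label : "(" + left + " " + label + " " + right + ")";
        }
    }

    static TreeNode leaf(String s) { return new TreeNode(s, null, null); }
    static TreeNode op(String s, TreeNode l, TreeNode r) { return new TreeNode(s, l, r); }

    // Lists every node of the tree, so a crossover point can be picked uniformly at random.
    static void collect(TreeNode n, List<TreeNode> out) {
        if (n == null) return;
        out.add(n);
        collect(n.left, out);
        collect(n.right, out);
    }

    // Returns two children; the parents are copied first and stay unmodified.
    static TreeNode[] crossover(TreeNode mom, TreeNode dad, Random rng) {
        TreeNode a = mom.copy(), b = dad.copy();
        List<TreeNode> aNodes = new ArrayList<>(), bNodes = new ArrayList<>();
        collect(a, aNodes);
        collect(b, bNodes);
        TreeNode pa = aNodes.get(rng.nextInt(aNodes.size())); // crossover point in child a
        TreeNode pb = bNodes.get(rng.nextInt(bNodes.size())); // crossover point in child b
        // Swapping the contents of the two nodes swaps the subtrees rooted at them.
        String label = pa.label;
        TreeNode l = pa.left, r = pa.right;
        pa.label = pb.label; pa.left = pb.left; pa.right = pb.right;
        pb.label = label; pb.left = l; pb.right = r;
        return new TreeNode[] {a, b};
    }

    public static void main(String[] args) {
        TreeNode mom = op("-", op("+", leaf("x"), leaf("0.5")), leaf("x"));
        TreeNode dad = op("*", leaf("2"), leaf("x"));
        TreeNode[] kids = crossover(mom, dad, new Random());
        System.out.println("parents:  " + mom + "   " + dad);
        System.out.println("children: " + kids[0] + "   " + kids[1]);
    }
}
```

Whatever crossover points are chosen, the two children together contain exactly as many nodes as the two parents: material is exchanged, never created.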
The Steps
1. Create Generation 1 by randomly generating 500 trees.
2. Find the fitness of each tree.
3. Choose pairs of parent trees, proportional to their fitness.
4. Crossover to make two child trees, adding them to Generation 2.
5. Continue until there are 500 child trees in Generation 2.
6. Repeat for 50 generations, keeping the best (most fit) tree over all generations.
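The six steps can be strung together in a compact sketch like the one below. Everything beyond the steps themselves is an assumption made for illustration: the `Tree` class, the operator set (+, -, *), the random-tree depth of 4, the 1 / (1 + error) selection weights, and the `MAX_NODES` cap that stops trees from growing without bound. It is not the course implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class GpLoopDemo {
    static final double[] XS = {1, 2, 4, 5, 7};
    static final double[] YS = {2.1, 3.3, 3.1, 1.8, 3.2};
    static final char[] OPS = {'+', '-', '*'};
    static final int MAX_NODES = 200; // assumed cap that keeps trees from growing without bound

    // Hypothetical compact tree: op is '+', '-', '*', 'x' (the variable) or 'c' (a constant).
    static final class Tree {
        char op;
        double value;     // used only when op == 'c'
        Tree left, right; // both null for leaves

        Tree(char op, double value, Tree left, Tree right) {
            this.op = op; this.value = value; this.left = left; this.right = right;
        }
        Tree copy() {
            return new Tree(op, value, left == null ? null : left.copy(), right == null ? null : right.copy());
        }
        double eval(double x) {
            switch (op) {
                case 'x': return x;
                case 'c': return value;
                case '+': return left.eval(x) + right.eval(x);
                case '-': return left.eval(x) - right.eval(x);
                default:  return left.eval(x) * right.eval(x);
            }
        }
        void collect(List<Tree> out) { // lists every node of this subtree
            out.add(this);
            if (left != null) { left.collect(out); right.collect(out); }
        }
    }

    static Tree randomTree(Random rng, int depth) {
        if (depth == 0 || rng.nextInt(3) == 0) { // leaf: the variable x or a small integer constant
            return rng.nextBoolean() ? new Tree('x', 0, null, null) : new Tree('c', rng.nextInt(11) - 5, null, null);
        }
        return new Tree(OPS[rng.nextInt(OPS.length)], 0, randomTree(rng, depth - 1), randomTree(rng, depth - 1));
    }

    // Sum of squared errors on the data set (the "fitness" of the slides); lower is better.
    static double error(Tree t) {
        double sum = 0;
        for (int i = 0; i < XS.length; i++) sum += Math.pow(YS[i] - t.eval(XS[i]), 2);
        return Double.isNaN(sum) ? Double.POSITIVE_INFINITY : sum;
    }

    // Roulette-wheel selection over precomputed weights.
    static Tree pick(List<Tree> pop, double[] w, double total, Random rng) {
        double r = rng.nextDouble() * total;
        for (int i = 0; i < w.length; i++) {
            r -= w[i];
            if (r < 0) return pop.get(i);
        }
        return pop.get(pop.size() - 1); // reached only through rounding error
    }

    // Swaps a random subtree of a with a random subtree of b, in place.
    static void crossover(Tree a, Tree b, Random rng) {
        List<Tree> an = new ArrayList<>(), bn = new ArrayList<>();
        a.collect(an);
        b.collect(bn);
        if (an.size() + bn.size() > MAX_NODES) return; // oversized pair: children stay plain copies
        Tree pa = an.get(rng.nextInt(an.size())), pb = bn.get(rng.nextInt(bn.size()));
        char op = pa.op; double v = pa.value; Tree l = pa.left, r = pa.right;
        pa.op = pb.op; pa.value = pb.value; pa.left = pb.left; pa.right = pb.right;
        pb.op = op; pb.value = v; pb.left = l; pb.right = r;
    }

    static Tree evolve(int popSize, int generations, long seed) {
        Random rng = new Random(seed);
        List<Tree> pop = new ArrayList<>();
        for (int i = 0; i < popSize; i++) pop.add(randomTree(rng, 4)); // step 1
        Tree best = pop.get(0);
        double bestError = error(best);
        for (int g = 0; g < generations; g++) {
            double[] w = new double[pop.size()];
            double total = 0;
            for (int i = 0; i < w.length; i++) { // step 2
                double e = error(pop.get(i));
                if (e < bestError) { bestError = e; best = pop.get(i); }
                w[i] = 1.0 / (1.0 + e); // assumed weighting: small error, large weight
                total += w[i];
            }
            if (g == generations - 1) break; // last generation is evaluated but not bred
            List<Tree> next = new ArrayList<>();
            while (next.size() < popSize) { // steps 3 to 5
                Tree a = pick(pop, w, total, rng).copy(), b = pick(pop, w, total, rng).copy();
                crossover(a, b, rng);
                next.add(a);
                next.add(b);
            }
            pop = next;
        }
        return best; // step 6: best tree seen in any generation
    }

    public static void main(String[] args) {
        System.out.println("best error = " + error(evolve(500, 50, 1L)));
    }
}
```

Because the best tree is tracked across all generations, the reported error can never be worse than the best error in Generation 1; how much it improves depends on the seed and on the assumptions listed above.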
How Could This Possibly Work?
No one seems to be able to say…
John Holland proved something called the “schema theorem,” but it really doesn’t explain much.
It’s a highly “parallel” process that recombines “good” building blocks.
It really does work very well for a huge variety of hard problems!
Why This, in a Java Course?
Because we’re going to implement it!
Because writing code to implement this isn’t too hard.
Because it illustrates a large number of O-O and Java ideas.
Because it’s fun!
Here is what my implementation looks like: