Kaizen Programming
Vinícius Veloso de [email protected]
Institute of Science and Technology (ICT), Federal University of São Paulo (UNIFESP)
Newfoundland & Labrador, Canada
07/14/14 GECCO 2014, VANCOUVER, BC, CANADA
Summary
● Context
● Kaizen Programming
● Experiments
● Summary and Conclusions
● Future works
Context
Evolutionary cycle: Initial population → Calculate the fitness → Selection → Generate offspring → Check stopping criteria / insert into current population → Solution
● Each individual represents a complete solution
● The quality of the individual (fitness) is how well it solves the problem
● Evolution is driven by natural selection (improvement of the fittest)
● Not-so-good individuals can pass to the next generation via tournament selection, which maintains diversity in the population
● Offspring are usually generated by random modifications
Context
● Suppose the following symbolic regression problem:
– Global optimum:
● f(x) = sin(x)
– Current best:
● f(x) = -(x²)/(123.91-x+tanh(10))-13.502*sin(x)+sqrt(abs((5.2134³)*x))
– GP (or similar) inserts expressions trying to reduce the error caused by the garbage expressions
● Bloat
– Is it easy for GP (or similar) to get to the global optimum from this current best?
● What if we could detect which parts of the expression are good and which are bad?
The Kaizen methodology
● The Japanese word Kaizen means “good change”; it has been adopted as a work philosophy of continuous improvement
● A Kaizen event is a brief period in which a team of workers and managers works together to find effective solutions to identified business problems
The Kaizen methodology
Plan-Do-Check-Act (PDCA)
Source: http://www.binaryspectrum.com/itservices/quality_assurance.html
Kaizen Programming
● Kaizen Programming (KP) is a novel evolutionary tool based on the concepts of the Kaizen methodology
● KP is a computational implementation of a Kaizen event with PDCA
● KP individuals are not complete solutions, but parts of one (divide and conquer strategy)
– Evolution becomes a collaborative approach instead of an egocentric one
Kaizen Programming
PDCA cycle
A team of experts is formed to propose ideas to solve a problem; the ideas are put together to become a complete solution
Construct a solution (build a model) using only the new ideas, or new and old ideas at the same time
The quality of the solution is how well it solves the problem
The quality of an idea is a measurement of its contribution to the solution
Now one can determine exactly which parts of the solution should be removed or improved
This property results in a reduction in bloat, smaller population sizes, and a lower number of function evaluations
Kaizen Programming
Application in Symbolic Regression
● The creation/modification of the ideas is performed by GP (crossover and mutation)
– Using the set of terminals and non-terminals, the new ideas (Kᵢ) are non-linear relationships among the variables, e.g.:
● K₁ = x²; K₂ = sin(x); K₃ = -x + 3/x
● The evaluation is performed by Ordinary Least Squares (multiple linear regression model)
– ŷ = β₁K₁ + β₂K₂ + β₃K₃
– βᵢ are used to scale the ideas and are discovered by OLS
● All models generated by KP are linear in the parameters
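As a minimal sketch of the evaluation step, the three example ideas can be scaled by ordinary least squares. The target function, the sampling grid, and the helper names (`solve_linear_system`, `ols`) are assumptions made for illustration and are not taken from the talk. The normal equations are solved with plain Gaussian elimination so that only the standard library is needed.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(columns, y):
    """Least-squares coefficients for y ~ sum(beta_i * column_i), no intercept."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * t for p, t in zip(columns[i], y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Sample points (assumed); x = 0 is avoided because the third idea divides by x.
xs = [0.5 + 0.1 * i for i in range(100)]

ideas = [
    lambda x: x ** 2,        # K1
    lambda x: math.sin(x),   # K2
    lambda x: -x + 3 / x,    # K3
]

# Hypothetical target that only needs the first two ideas.
ys = [2.0 * x ** 2 + 0.5 * math.sin(x) for x in xs]

columns = [[idea(x) for x in xs] for idea in ideas]
betas = ols(columns, ys)
print([round(b, 6) for b in betas])
```

On this noiseless target the fitted coefficients should come out close to 2, 0.5, and 0, so the third idea contributes nothing to the model even though it is a perfectly valid expression.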
Kaizen Programming
Application in Symbolic Regression
● The quality of the model (containing all partial solutions) is a measure of the goodness-of-fit
– Adjusted R²: proportion of variance explained
● The quality of each partial solution (idea) is its importance to the model, not how well it fits!
– P-value: hypothesis test at a significance level α
– Ideas with non-significant p-values (or very small β) are not useful to the model
● The analysis of the model is used to guide the search instead of using natural selection
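The two quality measures can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the talk: the noisy target, the random seed, the intercept column, and the function name `fit_and_check` are all hypothetical, and the p-values use a normal approximation to the t distribution (reasonable only when there are many more samples than ideas).

```python
import math
import random
from statistics import NormalDist


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_and_check(columns, y):
    """Return (adjusted R^2, idea coefficients, idea p-values)."""
    n = len(y)
    design = [[1.0] * n] + columns   # intercept column is an assumption
    p = len(design)
    xtx = [[sum(u * v for u, v in zip(design[i], design[j])) for j in range(p)]
           for i in range(p)]
    xty = [sum(u * t for u, t in zip(design[i], y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)

    fitted = [sum(beta[j] * design[j][i] for j in range(p)) for i in range(n)]
    sse = sum((t - f) ** 2 for t, f in zip(y, fitted))
    mean_y = sum(y) / n
    sst = sum((t - mean_y) ** 2 for t in y)
    dof = n - p
    adj_r2 = 1.0 - (sse / dof) / (sst / (n - 1))

    sigma2 = sse / dof               # residual variance estimate
    p_values = []
    for j in range(1, p):            # skip the intercept
        unit = [1.0 if r == j else 0.0 for r in range(p)]
        inv_col = solve_linear_system(xtx, unit)   # column j of (X^T X)^-1
        std_err = math.sqrt(sigma2 * inv_col[j])
        t_stat = beta[j] / std_err
        # Two-sided p-value, normal approximation to the t distribution.
        p_values.append(2.0 * (1.0 - NormalDist().cdf(abs(t_stat))))
    return adj_r2, beta[1:], p_values


rng = random.Random(0)
xs = [0.5 + 0.05 * i for i in range(200)]
ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]   # hypothetical noisy data

names = ["x^2", "sin(x)", "-x + 3/x"]
columns = [[x ** 2 for x in xs],
           [math.sin(x) for x in xs],
           [-x + 3 / x for x in xs]]

alpha = 0.05
adj_r2, betas, p_values = fit_and_check(columns, ys)
print("adjusted R^2:", round(adj_r2, 4))
for name, b, pv in zip(names, betas, p_values):
    verdict = "useful" if pv < alpha else "not useful"
    print(name, round(b, 4), round(pv, 4), verdict)
```

The adjusted R² scores the whole model, while each p-value scores one idea. Only the per-idea verdicts tell the search which expressions to keep for the next cycle.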
Remember me?
f(x) = -(x²)/(123.91-x+tanh(10))-13.502*sin(x)+sqrt((5.2134³)*x)
– Term -(x²)/(123.91-x+tanh(10)): P-value > α
– Term -13.502*sin(x): P-value < α
– Term sqrt((5.2134³)*x): P-value > α
f(x) = -13.502*sin(x) → -0.0740631*(-13.502)*sin(x) → 1.0*sin(x)
ONE SINGLE ITERATION!
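The rescaling in this example can be checked numerically. The sketch below assumes the only surviving idea is -13.502*sin(x) and refits it alone against the global optimum sin(x). The sampling grid and the variable names are illustrative choices, not part of the original slides.

```python
import math


def kept_idea(x):
    # The only term whose p-value is below alpha in the example.
    return -13.502 * math.sin(x)


# Sample points (assumed).
xs = [0.1 * i for i in range(1, 101)]
k = [kept_idea(x) for x in xs]
y = [math.sin(x) for x in xs]   # global optimum f(x) = sin(x)

# For a single idea K, the least-squares scale is beta = sum(K*y) / sum(K*K).
beta = sum(a * b for a, b in zip(k, y)) / sum(a * a for a in k)

# Largest gap between the rescaled idea and the target.
max_error = max(abs(beta * a - b) for a, b in zip(k, y))

print("beta:", beta)
print("max error:", max_error)
```

The fitted coefficient equals -1/13.502, so the OLS scale cancels the arbitrary constant and the model collapses to 1.0*sin(x).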
Kaizen Programming
Application in Symbolic Regression
Check this constant!
Experiments: Symbolic regression
Main results: Nguyen functions
● Artificial Bee Colony Programming (ABCP)
● Genetic Programming
– Standard Crossover (SC)
– No Same Mate (NSM) selection
– Semantics Aware Crossover (SAC)
– Context Aware Crossover (CAC)
– Soft Brood Selection (SBS)
– Semantic Similarity-based Crossover (SSC)
● Results taken from D. Karaboga, C. Ozturk, N. Karaboga, and B. Gorkemli. Artificial bee colony programming for symbolic regression. Information Sciences, 209:1–15, 2012.
Experiments: Symbolic regression
Main results: Nguyen functions (Karaboga et al., 2012)
Why so large for only 2 terminals?
Experiments: Symbolic regression
Main results: Nguyen functions using KP
● Kaizen Programming
– Number of experts: 8
– Maximum number of node evaluations: 1 × 10⁵
– Idea improver: 90% GP Uniform Mutation / 10% GP ERC Mutation
– Idea sharing: one-point crossover
– 100 independent runs
● This configuration will certainly give terrible results!
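For reference, the settings listed for KP could be captured in a small configuration object. The class name `KPConfig` and all field names are hypothetical, and the node-evaluation budget is taken to be 10⁵. Only the values come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KPConfig:
    """Hypothetical container for the KP settings listed on the slide."""
    n_experts: int = 8
    max_node_evaluations: int = 10 ** 5      # budget assumed to be 1 x 10^5
    p_uniform_mutation: float = 0.90         # idea improver: GP uniform mutation
    p_erc_mutation: float = 0.10             # idea improver: GP ERC mutation
    idea_sharing: str = "one-point crossover"
    n_runs: int = 100

    def __post_init__(self):
        # The two idea-improver operators should cover all cases.
        total = self.p_uniform_mutation + self.p_erc_mutation
        if abs(total - 1.0) > 1e-9:
            raise ValueError("mutation probabilities must sum to 1")


config = KPConfig()
print(config)
```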
Summary and Conclusions
● Kaizen Programming (KP) uses a collaborative problem solving approach in which partial solutions are put together to result in a complete solution
● The final solution is a multiple linear regression model
– Easier to understand (when compared to a single huge bloated solution generated by GP) and to interpret if necessary
● For instance: in the final model the best curve is an exponential component, or a sine component, etc.
– The resulting ideas can be seen as features extracted from the dataset. The features have distinct accuracies and complement each other
● PCA? ICA? FFT?
● KP’s methodology helps the search because it acts as a better guide than regular natural selection, in which only the best are useful
● One can know exactly which ideas are useful for the next improvement cycle
– The guessing, and consequently the bloat, is reduced when compared to Genetic Programming (GP) and similar approaches
Future Works
● For regression:
– Change the techniques in the modules
● GP, OLS, Adjusted R², p-value
– Test in other symbolic regression problems
– Test in real-world problems and datasets
– Perform sensitivity analysis of the parameters
● Test in other kinds of problems
– New results to be submitted soon
● Suggestions/collaborations?
Thank you!
[email protected]
This work was supported by CNPq (Universal) grant 486950/2013-1
Newfoundland & Labrador, Canada