No Free Lunch (NFL) Theorem
Many slides are based on a presentation of Y.C. Ho
Presentation by Kristian Nolde
25. August 2004– 2/29
General notes
Goal:
• Give an intuitive feeling for the NFL
• Present some mathematical background
To keep in mind:
• NFL is an impossibility theorem, such as
  – Gödel's proof in mathematics (roughly: any sufficiently powerful consistent mathematical system contains statements that can be neither proved nor disproved within it)
  – Arrow's theorem in economics (in principle, perfect democracy is not realizable)
• Thus, practical use is limited?!
The No Free Lunch Theorem
• Without specific structural assumptions, no optimization scheme can perform better than blind search on average
• But blind search is very inefficient!
• Prob(at least one out of N samples is in the top-n for a search space of size |X|) ≈ nN/|X|
  e.g. Prob ≈ 0.001 for |X| = 10^9, n = 1000, N = 1000
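The blind-search estimate can be checked numerically. The sketch below assumes the N samples are drawn uniformly and with replacement; the function names are illustrative, not from the slides. It compares the exact probability with the first-order approximation nN/|X|.

```python
def prob_hit_exact(space_size: int, n: int, num_samples: int) -> float:
    """Probability that at least one of num_samples uniform draws
    (with replacement) lands in the top-n of the search space."""
    miss_once = 1.0 - n / space_size
    return 1.0 - miss_once ** num_samples


def prob_hit_approx(space_size: int, n: int, num_samples: int) -> float:
    """First-order approximation n*N/|X|, valid when n*N << |X|."""
    return n * num_samples / space_size


exact = prob_hit_exact(10**9, 1000, 1000)
approx = prob_hit_approx(10**9, 1000, 1000)
print(f"exact  = {exact:.7f}")
print(f"approx = {approx:.7f}")
```

For these values both numbers are close to one in a thousand, which is why blind search alone is rarely practical on large spaces.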
Assume a finite World
Finite # of input symbols (x’s) and
finite # of output symbols (y’s) =>
finite # of possible mappings from input to output (f’s)
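With |X| input symbols and |Y| output symbols there are |Y|^|X| possible mappings. A minimal sketch that enumerates them for a toy world follows; the symbol names and sizes are assumptions chosen only for illustration.

```python
from itertools import product

X = ["x1", "x2", "x3"]  # hypothetical input symbols
Y = [0, 1]              # hypothetical output symbols

# A mapping f assigns one output to every input, so each f corresponds
# to one tuple of length |X| with entries drawn from Y.
mappings = [dict(zip(X, outputs)) for outputs in product(Y, repeat=len(X))]

print(len(mappings), "mappings; expected", len(Y) ** len(X))
print(mappings[0])
print(mappings[-1])
```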
The Fundamental Matrix F
[Matrix F: rows x1, x2, …, x|X| (inputs); columns f1, f2, …, f|F| (mappings); the entry in row x and column f is the output f(x), here 0 or 1]
FACT: equal number of 0's and 1's in each row!
In each row, each value of Y appears |Y|^(|X|-1) times!
Averaged over all f, the value is independent of x!
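The row property can be verified by brute force on a small example. The sketch below builds F for an assumed toy case (3 inputs, outputs 0, 1, 2), counts how often each output appears in each row, and computes each row's average. The sizes and variable names are illustrative.

```python
from collections import Counter
from itertools import product

NUM_INPUTS = 3
Y = [0, 1, 2]

# Each column of F is one mapping f, stored as the tuple (f(x1), f(x2), f(x3)).
columns = list(product(Y, repeat=NUM_INPUTS))

# F[i][j] is the output of mapping j at input i.
F = [[col[i] for col in columns] for i in range(NUM_INPUTS)]

row_counts = [Counter(row) for row in F]
row_means = [sum(row) / len(row) for row in F]
expected_count = len(Y) ** (NUM_INPUTS - 1)

for i, (counts, mean) in enumerate(zip(row_counts, row_means), start=1):
    print(f"x{i}: counts={dict(counts)} mean={mean}")
print("expected count per value:", expected_count)
```

Every row has the same counts and the same mean, so no single x is better than another once all f are weighted equally.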
Compare Algorithms
• Think of two algorithms: a1 and a2
e.g. a1 always selects from x1 to x_{0.5|X|} (the first half of X)
a2 always selects from x_{0.5|X|} to x|X| (the second half of X)
• For a specific f, a1 or a2 may be better. However, if f is not known, the average performance of both is equal:
$$\sum_f P(d^y \mid f, a_1) = \sum_f P(d^y \mid f, a_2)$$
where d is a sample and $d^y$ is the cost value associated with d.
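The equality can be illustrated by brute force on a toy problem. This sketch assumes 4 inputs, binary outputs, and two deterministic, non-adaptive algorithms that sample disjoint halves of X (a simplification of the a1/a2 example). It tallies, over all f, how often each sequence of observed values occurs.

```python
from collections import Counter
from itertools import product

X_SIZE = 4
Y = [0, 1]
all_f = list(product(Y, repeat=X_SIZE))  # f[i] is the value of f at input i


def a1(f):
    """Samples the first half of the inputs, in order."""
    return tuple(f[i] for i in range(X_SIZE // 2))


def a2(f):
    """Samples the second half of the inputs, in order."""
    return tuple(f[i] for i in range(X_SIZE // 2, X_SIZE))


# Number of mappings f under which each algorithm observes a given sequence.
hist_a1 = Counter(a1(f) for f in all_f)
hist_a2 = Counter(a2(f) for f in all_f)
print("a1:", dict(hist_a1))
print("a2:", dict(hist_a2))

# For one specific f the two algorithms can still differ.
f_example = (1, 1, 0, 0)
print("on f_example:", a1(f_example), "vs", a2(f_example))
```

Summed over all f the two histograms coincide, so any performance measure computed from the observed values has the same average for both algorithms, even though a1 clearly wins on `f_example`.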
Comparing Algorithms Continued
• Case 1: Algorithms can be more specific, e.g. a1 assumes a certain realization fk
• Case 2: Or they can be more general, e.g. a2 assumes a more uniform distribution of possible f
• Then the performance of a1 will be excellent for fk but catastrophic for all other cases (great performance, no robustness)
• In contrast, a2 performs only moderately in all cases, but does not fail (poor performance, high robustness)
Common sense says: Robustness * Efficiency = Constant
or Generality * Depth = Constant
Implication 1
• Let x be the optimization variable, f the performance function, and y the performance, i.e., y=f(x)
• Then, averaged over all possible optimization problems, the result is independent of the choice of x
• If you don't know the structure of f (which column you are dealing with), blind choice is as good as any!
Implications 2
• Let X be the space of all possible representations (as in genetic algorithms), or the space of all possible algorithms to apply to a class of problems
• Without understanding of the problem, blind choice is as good as any.
• “understanding” means you know which column of the F matrix you are dealing with
Implications 3
• If you know which column or group of columns you are dealing with, you can specialize the choice of rows
• Even then, you must accept that you will suffer LOSSES should other columns occur due to uncertainties or disturbances
The Fundamental Matrix F
[Matrix F: rows x1, x2, …, x|X| (inputs); columns f1, f2, …, f|F| (mappings); entries are the outputs f(x), here 0 or 1]
Assume a distribution over the columns, then pick the row that results in minimal expected loss or maximal expected performance. This is stochastic optimization.
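A minimal sketch of this row selection, under assumptions that are not from the slides: 3 inputs, binary outputs read as performance, and a made-up prior that gives three times more weight to mappings with f(x2) = 1.

```python
from itertools import product

NUM_INPUTS = 3
columns = list(product([0, 1], repeat=NUM_INPUTS))  # all 8 mappings f

# Hypothetical prior over columns: mappings with f(x2) = 1 are 3x more likely.
weights = [3 if col[1] == 1 else 1 for col in columns]
total = sum(weights)
prior = [w / total for w in weights]

# Expected performance of each row (input) under the prior.
expected = [
    sum(p * col[row] for p, col in zip(prior, columns))
    for row in range(NUM_INPUTS)
]
best_row = max(range(NUM_INPUTS), key=expected.__getitem__)

# Under a uniform prior every row has the same expected performance.
uniform = [
    sum(col[row] for col in columns) / len(columns)
    for row in range(NUM_INPUTS)
]

print("expected under prior:  ", expected)
print("best row: x%d" % (best_row + 1))
print("expected under uniform:", uniform)
```

The prior is what breaks the tie between rows; with a uniform distribution over columns the choice of row makes no difference.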
Implications 5
• Worse, if you estimate the probabilities incorrectly, your stochastically optimized solution may suffer catastrophically bad outcomes more frequently than you would like.
• Reason: you have already used up more of the good outcomes in your "optimal" choice. What is left are the bad ones that are not supposed to occur! (HOT design and power laws, Doyle)
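The effect of a wrong probability estimate can be sketched on a toy case. The assumptions here are illustrative only: 3 inputs, binary performance values, an assumed distribution that favours mappings with f(x2) = 1, and a true distribution that favours the opposite.

```python
from itertools import product

NUM_INPUTS = 3
columns = list(product([0, 1], repeat=NUM_INPUTS))  # all 8 mappings f


def normalize(weights):
    """Turn non-negative weights into probabilities."""
    total = sum(weights)
    return [w / total for w in weights]


def expected_performance(row, probs):
    """Expected value of f(x_row) when columns occur with the given probabilities."""
    return sum(p * col[row] for p, col in zip(probs, columns))


# Hypothetical distributions: the designer believes f(x2) = 1 is likely,
# while in reality f(x2) = 0 is likely.
assumed_probs = normalize([3 if col[1] == 1 else 1 for col in columns])
true_probs = normalize([1 if col[1] == 1 else 3 for col in columns])

chosen = max(range(NUM_INPUTS), key=lambda r: expected_performance(r, assumed_probs))
promised = expected_performance(chosen, assumed_probs)
delivered = expected_performance(chosen, true_probs)
blind = sum(expected_performance(r, true_probs) for r in range(NUM_INPUTS)) / NUM_INPUTS

print(f"chosen row: x{chosen + 1}")
print(f"promised={promised} delivered={delivered} blind choice={blind:.3f}")
```

In this constructed case the "optimal" row delivers less than a blind choice would, because it concentrated on exactly the outcomes that the true distribution makes rare.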
Implications 6
• Generality for generality's sake is not very fruitful
• Working on a specific problem can be rewarding
• Because:
  – the insight can be generalized
  – the problem is practically important
  – the 80-20 effect