MULTI-OBJECTIVE EVOLUTIONARY
OPTIMIZATION: FEW APPLICATIONS
Hugo Jair Escalante
Contents
• Single- and multi-objective optimization
• Multi-objective evolutionary algorithms (NSGA-II)
• Maximizing diversification of search results
• Prototype generation for classification
• Discussion
SINGLE/MULTI OBJECTIVE
OPTIMIZATION
Multi-objective Evolutionary Algorithms: two applications
http://en.wikipedia.org/wiki/Mathematical_optimization
• A single-objective optimization problem can
be defined as:
Single-objective optimization
min f(x)
s.t. gi(x) ≤ 0 for i = {1,…,I}
hj(x) = 0 for j = {1,…,J}
x_k^l ≤ x_k ≤ x_k^u for k = {1,…,n}
Single-objective optimization
Brian Birge’s PSO demo for MATLAB
Function: Rosenbrock
Single-objective optimization
• In this type of problem we want to find a
solution x* associated with an extreme value of
f. There are different types of methods for
approaching these problems (e.g., gradient-based,
simplex, heuristic, etc.)
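For illustration (not part of the original slides), a minimal fixed-step gradient descent on the 2-D Rosenbrock function, the same test function used in the PSO demo, can be sketched in Python:

```python
def rosenbrock(x, y):
    # Classic non-convex test function; global minimum f(1, 1) = 0
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def rosenbrock_grad(x, y):
    # Analytic gradient of the Rosenbrock function
    dx = -2 * (1 - x) - 400 * x * (y - x ** 2)
    dy = 200 * (y - x ** 2)
    return dx, dy

def gradient_descent(x, y, lr=1e-3, steps=50000):
    # Plain fixed-step gradient descent; slow along Rosenbrock's
    # curved valley, but sufficient as an illustration
    for _ in range(steps):
        dx, dy = rosenbrock_grad(x, y)
        x, y = x - lr * dx, y - lr * dy
    return x, y

x_star, y_star = gradient_descent(-1.0, 1.0)  # slowly approaches (1, 1)
```

The slow progress of plain gradient descent here is one motivation for heuristic alternatives such as PSO.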
• A multi-objective optimization problem can be
defined as:
Multi-objective optimization
min f(x) = ⟨f1(x), …, fN(x)⟩
s.t. gi(x) ≤ 0 for i = {1,…,I}
hj(x) = 0 for j = {1,…,J}
x_k^l ≤ x_k ≤ x_k^u for k = {1,…,n}
Multi-objective optimization
[Figure: mapping from decision space to objective space, with axes f1(x) and f2(x)]
Multi-objective optimization
• In MOO we deal with problems involving more
than one objective. Hence, a good candidate
solution must return acceptable values for all
of the considered objectives
• Optimum in MOO: The solution that
represents the best tradeoff among the
considered objectives
Multi-objective optimization
• Pareto optimality: one of the most accepted
notions of optimum
• (Some) MOO methods are based on the concept
of dominance to determine whether one solution
is better than another
Pareto dominance: Solution x1 dominates x2 iff x1 is better than x2
in at least one objective and not worse in the rest.
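The dominance test above maps directly to code; a small sketch for minimization problems, using hypothetical objective vectors:

```python
def dominates(a, b):
    # a dominates b (minimization): no worse in every objective,
    # strictly better in at least one
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    # Keep the solutions that no other solution dominates
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t is not s)]

points = [(1, 5), (2, 3), (3, 3), (4, 1), (5, 5)]
front = pareto_front(points)  # (1,5), (2,3), (4,1) survive
```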
Multi-objective optimization
• A solution x* is a Pareto
optimum iff there does not
exist another solution x′
such that x′ dominates x*
• Problem: The output of a
MOO method is not a
single solution but an
approximation to the
Pareto-optimal set
No solution is better than another in the Pareto optimal set.
Selecting a single solution is the job of the decision maker.
MULTI-OBJECTIVE EVOLUTIONARY
ALGORITHMS (NSGA-II)
Evolutionary Computing
• EC is the collective name for a range of problem-
solving techniques based on principles of
biological evolution, such as natural selection
and genetic inheritance.
• These techniques are being increasingly widely
applied to a variety of problems, ranging from
practical applications in industry and commerce
to leading-edge scientific research.
Evolutionary Computing
• Trial-and-error problem-solving approach:
– While not_satisfied_with_solution
1. Generate candidate solution(s) for the problem at hand
2. Evaluate the quality of the candidate solution(s)
– Return best_solution_found
EC techniques generate new
solutions according to (rough)
analogies with biological
evolution principles
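The trial-and-error loop above can be sketched as a generic evolutionary loop; the fitness function, variation operator, and population sizes below are illustrative placeholders, not the method used in the applications that follow:

```python
import random

def evolve(fitness, random_solution, vary, pop_size=20, generations=50):
    # Generic evolutionary loop: generate, evaluate, keep the best
    population = [random_solution() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)           # minimization: best first
        parents = population[: pop_size // 2]  # truncation selection
        children = [vary(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children        # elitism: parents survive
    return min(population, key=fitness)

# Toy usage: minimize f(x) = x^2 with Gaussian mutation
best = evolve(lambda x: x * x,
              lambda: random.uniform(-10, 10),
              lambda x: x + random.gauss(0, 0.5))
```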
NSGA-II : (perhaps) the most used
MOEA
Non-dominated
sorting
NSGA-II : (perhaps) the most used
MOEA
Crowding distance
NSGA-II : (perhaps) the most used
MOEA
NSGA-II’s output
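A compact (unoptimized) sketch of NSGA-II's two ranking devices, non-dominated sorting and crowding distance, assuming minimization:

```python
def dominates(a, b):
    # Minimization: no worse everywhere, strictly better somewhere
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(objs):
    # Peel off successive non-dominated fronts (lists of indices)
    remaining = list(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def crowding_distance(objs, front):
    # Larger distance = less crowded; boundary points get infinity
    dist = {i: 0.0 for i in front}
    n_obj = len(objs[front[0]])
    for m in range(n_obj):
        order = sorted(front, key=lambda i: objs[i][m])
        dist[order[0]] = dist[order[-1]] = float("inf")
        span = objs[order[-1]][m] - objs[order[0]][m] or 1.0
        for k in range(1, len(order) - 1):
            dist[order[k]] += (objs[order[k + 1]][m] - objs[order[k - 1]][m]) / span
    return dist
```

NSGA-II fills the next population front by front, breaking ties within the last admitted front by descending crowding distance.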
MAXIMIZING VISUAL DIVERSITY OF
IMAGE RETRIEVAL RESULTS
Hugo Jair Escalante, Alicia Morales. TIA-INAOE's approach for the 2013
Retrieving Diverse Social Images task. MediaEval 2013 Workshop,
October 18-19, 2013, Barcelona, Spain, CEUR Workshop Proceedings, Vol.
1043, 2013
Diversification of retrieval results in
content-based image retrieval
• Given a list of images (relevant to a query), to
re-rank the list such that the visual diversity of
top-ranked images is maximized
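As a point of reference (an illustration, not the method proposed here), a greedy re-ranker that trades the retrieval score against distance to already-selected images could look like this; `alpha` is a hypothetical tradeoff parameter:

```python
def greedy_diversify(scores, dist, k=10, alpha=0.7):
    # Greedy re-ranking: repeatedly pick the item that best trades off
    # retrieval score against distance to the items already selected.
    remaining = set(range(len(scores)))
    selected = [max(remaining, key=lambda i: scores[i])]  # seed: top-scored
    remaining -= {selected[0]}
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda i: alpha * scores[i] +
                   (1 - alpha) * min(dist[i][j] for j in selected))
        selected.append(best)
        remaining -= {best}
    return selected
```

The multi-objective formulation in the following slides avoids fixing a single weight `alpha` and instead explores the whole relevance/diversity front.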
[Figure: retrieval pipeline — a query (“Machu Picchu in the background”) run against an image collection through a retrieval model]
Diversification of retrieval results in
content-based image retrieval
• Given a list of images (relevant to a query), to
re-rank the list such that the visual diversity of
top-ranked images is maximized
[Figure: re-ranked list with visually diverse top positions]
• The 2013 Retrieving Diverse Social Images Task: result diversification in social photo retrieval. Organizers:
– Provide data
• Ranked lists of documents
• Textual features, visual features, tags, comments, etc.
– Evaluate participants
http://www.multimediaeval.org/mediaeval2013/diverseimages2013/
• Considered scenario:
– A user searches for images of a specific location in social
media (e.g., Flickr)
– Text is used for searching
– The user wants the images in the first positions of the list
to be visually diverse from each other
– Additionally, all of the images must be relevant:
• About the searched location (GPS coordinates)
• No person in the image
• …
Casas Grandes
Chihuahua Mexico
Multi-objective optimization
for result diversification
• Idea: to re-rank the list of images such that a
tradeoff between relevance and diversity is
maximized
Multi-objective optimization
for result diversification
• NSGA-II is used to approach the problem as
follows:
Maximize ⟨ ρ(S0, S) , β(S) ⟩
• Where: ρ(S0, S) is the relevance term and β(S) is the diversity term
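The slides do not spell out ρ and β; one plausible instantiation (an assumption, not necessarily the authors' exact definitions) scores relevance as top-k overlap with the original list and diversity as the minimum pairwise feature distance among the top-k images:

```python
def relevance(original_rank, new_rank, k=10):
    # Hypothetical rho: overlap between the top-k of the re-ranked
    # list and the top-k of the original (retrieval-score) list
    return len(set(new_rank[:k]) & set(original_rank[:k])) / k

def diversity(features, new_rank, k=10):
    # Hypothetical beta: minimum pairwise squared Euclidean distance
    # among the visual features of the top-k images
    top = [features[i] for i in new_rank[:k]]
    return min(sum((a - b) ** 2 for a, b in zip(u, v))
               for i, u in enumerate(top) for v in top[i + 1:])
```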
MORD: Representation
Each solution is the vector of scores used to generate the ranked list.
A solution to our problem is a ranked list of images.
MORD: Representation
Example scores vector S0 and the rank list it induces:
Rank  Score (S0)
1     1
2     0.5
3     0.3
4     0.25
5     0.2
6     0.16
…     …
Initial population
Multi-objective optimization
for result diversification
• NSGA-II is used to approach the problem as
follows:
Maximize ⟨ ρ(S0, S) , β(S) ⟩
• Where: ρ(S0, S) is the relevance term and β(S) is the diversity term
Multi-objective optimization
for result diversification
• Diversity criterion:
[Figure: illustration of the diversity criterion over candidate re-rankings, shown over three slides]
MORD: Evolutionary stuff
• Initialization: solutions are generated by adding random numbers to the original scores vector
• Evolutionary operators: standard crossover and mutation operators were used
MORD: Selection of a single solution
• We take the solution offering the best tradeoff between
both objectives
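One simple way to pick the best-tradeoff point (an assumption; the submitted system may use a different rule) is to maximize the product of the two normalized objectives over the front:

```python
def best_tradeoff(front):
    # front: list of (relevance, diversity) pairs, both maximized.
    # Normalize each objective to [0, 1] over the front, then pick
    # the point maximizing the product of the normalized values.
    lo0, hi0 = min(p[0] for p in front), max(p[0] for p in front)
    lo1, hi1 = min(p[1] for p in front), max(p[1] for p in front)
    def norm(v, lo, hi):
        return (v - lo) / (hi - lo) if hi > lo else 1.0
    return max(front, key=lambda p: norm(p[0], lo0, hi0) * norm(p[1], lo1, hi1))
```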
Experiments & results
• Three runs were submitted:
1. Visual
2. Textual
3. Visual+Textual
[Figure: Pareto fronts (visual diversity vs. relevance) obtained by NSGA-II for the submitted runs]
Experiments & results
Initial list (7 topics in top-12 images)
Experiments & results
Re-ranked (8 topics in top-12 images)
Experiments & results
• Comparison with other participants: 6th out of 11
[Figure: P@10, C@10 and F1@10 per team, visual run]
Experiments & results
• Comparison with other participants: 5th out of 11
[Figure: P@10, C@10 and F1@10 per team, textual run]
Experiments & results
• Comparison with other participants: 6th out of 11
[Figure: P@10, C@10 and F1@10 per team, multimedia run]
Conclusions
• The multi-objective formulation for RD is promising, but not as effective as we expected
• The initial ranked list may not have been reliable enough
• No feature selection / special processing of features
• Did not take advantage of meta-data (tags, comments, etc.)
• Too many parameters/decisions to fix/take
Future work
• Alternative objective functions for both relevance and diversity
• Evaluation of the gains over single-objective combinatorial approaches
• Efficient implementation on GPUs
• Incorporating feature selection into the optimization process
MOPG: MULTI-OBJECTIVE PROTOTYPE
GENERATION FOR CLASSIFICATION
Hugo Jair Escalante, Maribel Marin-Castro, Mario Graff, Alicia Morales-Reyes, Manuel Montes, Alejandro Rosales, Jesús A. González, Carlos A. Reyes. MOPG: Multi-objective prototype generation for classification. Submitted to Pattern Recognition, October 12, 2013.
KNN – classifier
• One of the most popular non-parametric
classifiers
• Easy to implement and very effective
• Main issues with KNN:
– The curse of dimensionality
– Efficiency
– Sensitivity to noisy data
[Figure: positive and negative examples]
Prototype-based classification
• KNN classifiers using a subset of the original data
• The goal is to reduce the computational cost of standard KNN by filtering out noisy/redundant instances and keeping the most informative ones
• Key issue: how to select/obtain the set of prototypes for a classification problem?
Prototype generation
• Problem: to select a (small) subset of instances such that the classification performance of a particular classifier (KNN) is not degraded significantly
[Figure: positive and negative examples with a reduced prototype set]
Accuracy vs reduction dilemma
• The two key aspects for the evaluation of PG methods are reduction and accuracy on unseen data
• Maximizing reduction may cause accuracy to decrease, and vice versa
[Figure: reduction vs. accuracy for many PG methods (PNN, BTS3, MCA, GMCA, ICPL, GENN, Depur, HYB, POC, 1NN, among others)]
MOPG: Multi-Objective
Prototype Generation
• Idea: approaching the PG problem as one of
multi-objective optimization, where the
objectives are: reduction and accuracy
• Goal: to obtain solutions that offer a good
tradeoff between both objectives, and then
select one for classification
MOPG: Multi-Objective
Prototype Generation
• NSGA-II is used to approach the following
problem:
• Where: f1(P) = δ(P, D) is the training-set reduction and f2(P) = γ(P, D) is the hold-out classification performance
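A sketch of plausible stand-ins for the two objectives, with a brute-force 1-NN for the accuracy term (the slides do not give the exact forms of δ and γ, so these are assumptions):

```python
def reduction(prototypes, train):
    # delta: fraction of training instances removed
    return 1 - len(prototypes) / len(train)

def nn_predict(x, prototypes, labels):
    # Brute-force 1-NN over the prototype set (squared Euclidean)
    d = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in prototypes]
    return labels[d.index(min(d))]

def accuracy(prototypes, proto_labels, val_x, val_y):
    # gamma: hold-out classification performance of 1-NN
    hits = sum(nn_predict(x, prototypes, proto_labels) == y
               for x, y in zip(val_x, val_y))
    return hits / len(val_y)
```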
MOPG: Representation
Each solution is encoded as a matrix of size P × d (P prototype instances, d features).
A solution to our problem is a set of instances (the prototypes).
MOPG: Initialization
• Training data is divided into development and validation partitions
• Development: instances from which prototypes can be generated
• Validation: hold-out data set used to evaluate solutions
• The partition is updated every iteration
• Initialization: for each class we randomly select a set of training instances (the class distribution is maintained)
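The class-distribution-preserving random initialization described above, sketched in Python (function and variable names are hypothetical):

```python
import random
from collections import defaultdict

def init_prototypes(X, y, n_prototypes):
    # Random per-class sampling that preserves the class distribution
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    protos, labels = [], []
    for c, items in by_class.items():
        # Each class gets prototypes in proportion to its frequency
        n_c = max(1, round(n_prototypes * len(items) / len(X)))
        for xi in random.sample(items, min(n_c, len(items))):
            protos.append(xi)
            labels.append(c)
    return protos, labels
```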
MOPG: Evolutionary operators
• Crossover: with uniform probability, either
• Interchange (same-class) prototypes between solutions
• Replace a prototype of class k in one solution with the average of all prototypes from class k in the other solution
• Mutation: with uniform probability, either
• Add a vector of random numbers to a prototype
• Replace a prototype with another instance from the development set
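A toy version of the two mutation moves listed above (the probability and noise magnitude are placeholders, not the paper's settings):

```python
import random

def mutate(prototypes, development_set, noise=0.1, p_noise=0.5):
    # Pick one prototype; with probability p_noise perturb it with
    # uniform random noise, otherwise replace it with a randomly
    # chosen instance from the development set
    new = [list(p) for p in prototypes]  # copy: leave the parent intact
    i = random.randrange(len(new))
    if random.random() < p_noise:
        new[i] = [x + random.uniform(-noise, noise) for x in new[i]]
    else:
        new[i] = list(random.choice(development_set))
    return new
```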
MOPG: Selection of a single-solution
• We evaluate the performance of each solution on the Pareto front and choose the one with the highest accuracy
[Figure: Pareto fronts (accuracy f2(P) vs. reduction f1(P)) for two sample data sets, ring and banana]
Experiments and results
• We performed experiments over 59
classification problems of diverse
characteristics
• Compared the performance of our proposal to
that of 25 alternative prototype generation
techniques
I. Triguero, J. Derrac, S. García and F. Herrera. A Taxonomy and Experimental Study on Prototype Generation for Nearest Neighbor Classification. IEEE Trans. on Systems, Man, and Cybernetics, Part C, 42(1):86-100, 2012
Experiments and results
Experiments & results
• Evaluation of the selection strategy:
[Figure: selected solutions on the Pareto fronts (accuracy f2(P) vs. reduction f1(P)) for the ring and banana data sets]
Experiments & results
• Parameter settings
[Figure: accuracy (%) and reduction (%) as a function of the crossover and mutation rates]
Experiments & results
• Comparison with related work
[Figure: accuracy vs. reduction compared with related work, small data sets]
[Figure: accuracy vs. reduction compared with related work, small and large data sets]
Experiments & results
• Reduction–Accuracy tradeoff (reduction × accuracy)
[Figure: accuracy × reduction per PG method, small data sets]
Experiments & results
[Figure: accuracy and accuracy × reduction per PG method, large data sets (reduction × accuracy tradeoff)]
Experiments & results
• Comparison with the best* methods (so far)
for PG
Conclusions
• The multi-objective formulation for PG is a promising alternative to mono-objective approaches
• We hope our work can foster the development of other multi-objective optimization methods for PG
• We showed evidence supporting the hypothesis that our proposal, MOPG, is very competitive in terms of both objectives: reduction and accuracy
• MOPG outperforms most PG methods proposed so far
Future work
• Devising better ways to select the best solution from the Pareto front
• Efficient implementation of MOPG to deal with big-data problems (GPUs)
• Adapt MOPG for the generation of visual vocabularies
SIMULTANEOUS GENERATION OF
PROTOTYPES AND FEATURES
M. García-Limón, H. J. Escalante, E. Morales, A. Morales. Simultaneous Generation of Prototypes and Features through Genetic Programming. GECCO '14 Proceedings of the 2014 Conference on Genetic and Evolutionary Computation, pp. 517-524 (full paper, oral presentation), Vancouver, Canada, July 12-17, 2014.
M. Alfonso García, H. J. Escalante, E. Morales. Towards Simultaneous Prototype and Feature Generation. Proc. of the XVI IEEE Autumn Meeting of Power, Electronics and Computer Science (ROPEC 2014 International), pp. 393-398, 2014.
Best MS Thesis on Artificial Intelligence 2015 (SMIA)
Simultaneous generation of features
and prototypes
• Is it possible to apply the same approach to generate features?
• Is it possible to perform both feature and prototype generation simultaneously?
• Would a multi-objective formulation further help?
Simultaneous generation of features
and prototypes
• We aim to find a set of prototypes and features such that:
– Accuracy is maximized
– The number of instances is reduced
– The number of features is kept low
• Proposed solution: multi-objective GP
– Same idea: combine instances/features to generate prototypes/features
– Multi-objective implementation (NSGA-II)
Simultaneous generation of features
and prototypes
Simultaneous generation of features
and prototypes
• A different feature space for each class
NSGA-II recap: non-dominated sorting, crowding distance, and the Pareto-front output, as introduced earlier.
Simultaneous generation of features
and prototypes
• We select a solution by looking at accuracy
only
Simultaneous generation of features
and prototypes
• Example:
– Original data set (initial instances and input space)
Simultaneous generation of features
and prototypes
• Example:
– Prototypes and input space for class 1
Simultaneous generation of features
and prototypes
• Example:
– Prototypes and input space for class 2
Simultaneous generation of features
and prototypes
• Some results:
Simultaneous generation of features
and prototypes
• Some results:
Small data sets
Large data sets
Simultaneous generation of features
and prototypes
Simultaneous generation of features
and prototypes
Simultaneous generation of features
and prototypes
• Competitive performance on generation of both prototypes and features
• Class-specific input spaces
• Other uses: oversampling, data embedding, visualization, …
• Issue: not scalable to large data sets
Questions?