Results
Classification – Comparison with existing approaches on benchmark datasets
Survival Analysis – International multi-centre double-blind study
Survival Analysis – Finding subgroups with significantly different survival curves
Conclusions And Future Work
• MOEAs are versatile algorithms for successful model extraction.
• MOEAs can produce models that perform well on real-world and benchmark data.
• The approach will be tested on artificial benchmark data to determine the conditions under which it performs well and those under which it does not.
Acknowledgments
This research has been supported by the Computer Science Department of the University of Liverpool.
Model Extraction Using Multi-Objective Evolutionary Algorithms
C Setzkorn*, AFG Taktak*, BE Damato#
Dept. of Clinical Engineering*, Ocular Oncology Centre#, RLBUHT ([email protected])
Summary
• Multi-objective evolutionary algorithms (MOEAs) are powerful optimisation algorithms which simulate the Darwinian principle of the survival of the fittest within a computer.
• MOEAs can be used to extract models from data as they can optimise several objectives at the same time (e.g. the parameters and structure of the model) and are less susceptible to interaction effects and noise within the data.
• The present work uses a MOEA to extract models for classification and survival analysis. In addition, the MOEA is used to find subgroups of patients with significantly different survival curves.
Introduction And Methods
Structure Of A Multi-Objective Evolutionary Algorithm
Genetic Operators And Selection
Why Multi-Objective Evolutionary Algorithms?
Advantages:
• Generate a set of trade-off solutions in a single run.
• Applicable to large and complex search spaces.
• Deal with incommensurable objectives.
• Insensitive to the shape of the trade-off surface.
• Alleviate over-fitting and reduce model complexity.
Disadvantages:
• Computationally expensive.
• No guaranteed convergence.
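The "set of trade-off solutions" mentioned above is usually defined through Pareto dominance. The following is a minimal sketch of that idea, assuming two objectives that are both minimised (model complexity and error); the example values and the function names `dominates` and `non_dominated` are hypothetical and not taken from the poster.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical models described as (complexity, error); illustration only.
models = [(2, 0.30), (5, 0.10), (3, 0.30), (8, 0.10), (4, 0.20)]
front = non_dominated(models)
print(front)
```

Each model left in `front` can only be improved on one objective by giving something up on the other, which is what makes the set a trade-off surface.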
Implementation of international multi-centre double-blind study
Table 1: Comparison of the MOEA with three existing approaches on several benchmark datasets.
Clinical Engineering
Royal Liverpool University Hospital
Figure 6: Decision surface for the spiral dataset. The spiral dataset is a complicated artificial problem
Approaches: Naive Bayes, MERBIS, RL, C4.5rules

Dataset    Result          Rules     Result          Result          Rules   Result          Rules
Bcw        97.6 +/- 1.3    306.5     93.1 +/- 5.5    97.6 +/- 3.0    8.2     99.4 +/- 0.6    13.4
Car        94.3 +/- 1.4    107.6     92.3 +/- 2.2    98.3 +/- 0.7    78.6    95.4 +/- 0.4    40.6
Cmc        63.9 +/- 4.0    196.6     64.1 +/- 5.8    66.5 +/- 4.9    39.1    71.1 +/- 1.4    27.3
Crx        90.2 +/- 4.2    758.5     87.6 +/- 4.3    90.1 +/- 3.4    12.9    94.0 +/- 3.4    10.1
German     71.9 +/- 4.9    807.5     77.1 +/- 4.5    67.9 +/- 9.6    23.4    79.1 +/- 2.6    20.8
Glass      74.4 +/- 10.0   183.7     74.0 +/- 8.7    75.7 +/- 5.9    12.2    94.4 +/- 1.6    48
Image      93.3 +/- 1.4    811.4     95.6 +/- 0.9    99.0 +/- 0.5    28.6    97.8 +/- 1.6    25.1
Kr-vs-kp   92.6 +/- 1.5    2328.3    95.1 +/- 0.8    99.7 +/- 0.2    26.3    90.4 +/- 4.4    5.9
Mushroom   100.0 +/- 0.0   2362.2    99.8 +/- 0.1    100.0 +/- 0.0   11.5    99.8 +/- 0.2    7.5
Nursery    97.1 +/- 0.2    606.6     98.0 +/- 0.2    99.8 +/- 0.1    336.8   97.0 +/- 0.3    34.8
Promoters  83.5 +/- 16.2   7432.2    97.7 +/- 4.0    88.4 +/- 12.8   8       93.2 +/- 6.7    15.9
Sonar      65.8 +/- 12.8   10075.7   76.1 +/- 13.0   77.8 +/- 13.7   9.1     83.1 +/- 6.7    31.4
Splice     87.3 +/- 1.6    8406.8    99.2 +/- 0.6    97.2 +/- 0.7    76.2    91.5 +/- 1.5    20.3
Figure 1: Structure of a multi-objective evolutionary algorithm.
Figure 2: Genetic operators: crossover and mutation. They can change the structure and the parameters of models.
Figure 3: Selection simulates the Darwinian principle of the survival of the fittest.
Figure 4 axis labels: Decrease complexity; Increase fit to the data
Figure 4: Set of trade-off solutions. Red points are so-called non-dominated solutions (models).
Participants: Hane/Paulo (UK), Patrizia (Italy), Ian (UK), Christian (UK), George (Greece)
Elia (Italy), Azzam (UK)
General Ocular Oncology Database (Geoconda)
Assessors
Referee
Figure 5: Model extraction performed by five centres from three countries. The results are evaluated in a double-blind manner.
Figure 8: Kaplan-Meier curves for the Leukemia data (solid line - Treatment, dashed line - Placebo) together with the estimates of the model (asterisks - Treatment, cross - Placebo).
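For readers unfamiliar with Kaplan-Meier curves, the sketch below shows how the product-limit estimate is computed from follow-up times and event indicators. It is a minimal illustration on made-up data, not the Leukemia data from the figure, and the function name `kaplan_meier` is hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        n_at_risk -= removed
    return curve


# Made-up follow-up data; illustration only.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored subjects never reduce the curve directly; they only shrink the number at risk for later event times.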
Figure 7: Results of an international multi-centre study. Performance measured by the CIndex. RBF is the MOEA approach.
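The poster does not say which variant of the CIndex was used. As an illustration, the sketch below assumes Harrell's concordance index, in which a pair of patients is comparable when the one with the shorter time had an observed event, and ties in predicted risk count as half. The function name `c_index` and the data are hypothetical.

```python
from typing import List


def c_index(times: List[float], events: List[int], risk: List[float]) -> float:
    """Harrell's concordance index (assumed variant).

    A higher risk score is expected to go with a shorter survival time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair (i, j) is comparable if i had an observed event
            # strictly before j's recorded time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up data; illustration only.
score = c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(score)
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking of patients by risk.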
Figure 9: The produced models can also be used via the Internet.
Figure 10: Kaplan-Meier curves of four generated subgroups with significantly different survival times (Log-Rank 320.0965, p < 0.0001).
Figure 11: The MOEA was used to describe the four subgroups within the feature space.
Figure 1 labels: Initialise Population Randomly; Apply Genetic Operators; Fitness Evaluation; Do It Again?; Output Archive; Archive; Selection
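The steps named in Figure 1 can be sketched as a short loop. This is a toy illustration under stated assumptions, not the authors' implementation: individuals are single real numbers rather than models, the two minimised objectives in `evaluate` are stand-ins, and the operators (arithmetic crossover, Gaussian mutation, binary tournament selection) are chosen here only for brevity.

```python
import random
from typing import List, Tuple

Scored = Tuple[float, Tuple[float, float]]  # (individual, objective values)


def evaluate(x: float) -> Tuple[float, float]:
    # Toy stand-in objectives, both minimised (assumption, not from the poster).
    return (x * x, (x - 2.0) ** 2)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def update_archive(archive: List[Scored], candidates: List[Scored]) -> List[Scored]:
    # Keep only solutions that no other solution in the pool dominates.
    pool = archive + candidates
    return [p for p in pool if not any(dominates(q[1], p[1]) for q in pool)]


def tournament(scored: List[Scored], rng: random.Random) -> float:
    # Binary tournament: prefer the contestant that dominates the other.
    a, b = rng.choice(scored), rng.choice(scored)
    return b[0] if dominates(b[1], a[1]) else a[0]


def run_moea(generations: int = 30, pop_size: int = 20, seed: int = 0) -> List[Scored]:
    rng = random.Random(seed)
    # Initialise population randomly.
    population = [rng.uniform(-5.0, 5.0) for _ in range(pop_size)]
    archive: List[Scored] = []
    for _ in range(generations):  # "Do it again?"
        scored = [(x, evaluate(x)) for x in population]  # fitness evaluation
        archive = update_archive(archive, scored)        # archive
        population = []
        for _ in range(pop_size):
            p1, p2 = tournament(scored, rng), tournament(scored, rng)  # selection
            w = rng.random()
            child = w * p1 + (1.0 - w) * p2   # crossover
            child += rng.gauss(0.0, 0.1)      # mutation
            population.append(child)
    return archive                            # output archive


archive = run_moea()
print(len(archive), "non-dominated solutions in the archive")
```

In model extraction the individual would encode a model's structure and parameters, and the objectives would measure fit to the data and complexity, as Figures 2 and 4 suggest.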