
Results

Classification - Comparison with existing approaches on benchmark datasets

Survival Analysis – International multi-centre double-blind study

Survival Analysis – Finding subgroups with significantly different survival curves

Conclusions And Future Work

• MOEAs are versatile algorithms for successful model extraction

• MOEAs can produce models that perform well on real-world and benchmark data

• In future work, the approach will be tested on artificial benchmark data to determine the conditions under which it performs well and those under which it does not.

Acknowledgments

This research has been supported by the Computer Science Department of the University of Liverpool.

Model Extraction Using Multi-Objective Evolutionary Algorithms

C Setzkorn*, AFG Taktak*, BE Damato#

Dept. of Clinical Engineering*, Ocular Oncology Centre#, RLBUHT ([email protected])

Summary

• Multi-objective evolutionary algorithms (MOEAs) are powerful optimisation algorithms which simulate the Darwinian principle of the survival of the fittest within a computer.

• MOEAs can be used to extract models from data as they can optimise several objectives at the same time (e.g. the parameters and structure of the model) and are less susceptible to interaction effects and noise within the data.

• The present work uses a MOEA to extract models for classification and survival analysis. In addition, the MOEA is used to find subgroups of patients with significantly different survival curves.

Introduction And Methods

Structure Of A Multi-Objective Evolutionary Algorithm

Genetic Operators And Selection

Why Multi-Objective Evolutionary Algorithms?

Advantages:

• Generate a set of trade-off solutions in a single run.

• Applicable to large and complex search spaces.

• Deal with incommensurable objectives.

• Insensitive to the shape of the trade-off surface.

• Alleviate over-fitting and reduce model complexity.

Disadvantages:

• Computationally expensive.

• No guaranteed convergence.

Implementation of the international multi-centre double-blind study

Table 1: Comparison of the MOEA with three existing approaches on several benchmark datasets.

Clinical Engineering

Royal Liverpool University Hospital

Figure 6: Decision surface for the spiral dataset. The spiral dataset is a complicated artificial problem.

Approaches: Naive Bayes, MERBIS, RL, C4.5rules

Dataset     Result           Rules     Result           Result           Rules   Result          Rules
Bcw         97.6 +/- 1.3     306.5     93.1 +/- 5.5     97.6 +/- 3.0     8.2     99.4 +/- 0.6    13.4
Car         94.3 +/- 1.4     107.6     92.3 +/- 2.2     98.3 +/- 0.7     78.6    95.4 +/- 0.4    40.6
Cmc         63.9 +/- 4.0     196.6     64.1 +/- 5.8     66.5 +/- 4.9     39.1    71.1 +/- 1.4    27.3
Crx         90.2 +/- 4.2     758.5     87.6 +/- 4.3     90.1 +/- 3.4     12.9    94.0 +/- 3.4    10.1
German      71.9 +/- 4.9     807.5     77.1 +/- 4.5     67.9 +/- 9.6     23.4    79.1 +/- 2.6    20.8
Glass       74.4 +/- 10.0    183.7     74.0 +/- 8.7     75.7 +/- 5.9     12.2    94.4 +/- 1.6    48
Image       93.3 +/- 1.4     811.4     95.6 +/- 0.9     99.0 +/- 0.5     28.6    97.8 +/- 1.6    25.1
Kr-vs-kp    92.6 +/- 1.5     2328.3    95.1 +/- 0.8     99.7 +/- 0.2     26.3    90.4 +/- 4.4    5.9
Mushroom    100.0 +/- 0.0    2362.2    99.8 +/- 0.1     100.0 +/- 0.0    11.5    99.8 +/- 0.2    7.5
Nursery     97.1 +/- 0.2     606.6     98.0 +/- 0.2     99.8 +/- 0.1     336.8   97.0 +/- 0.3    34.8
Promoters   83.5 +/- 16.2    7432.2    97.7 +/- 4.0     88.4 +/- 12.8    8       93.2 +/- 6.7    15.9
Sonar       65.8 +/- 12.8    10075.7   76.1 +/- 13.0    77.8 +/- 13.7    9.1     83.1 +/- 6.7    31.4
Splice      87.3 +/- 1.6     8406.8    99.2 +/- 0.6     97.2 +/- 0.7     76.2    91.5 +/- 1.5    20.3


Figure 1: Structure of a multi-objective evolutionary algorithm.
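The stages named in the flowchart (random initialisation, fitness evaluation, archive, selection, genetic operators, repetition, archive output) can be sketched as a loop. The following is a minimal illustration only, not the poster's implementation: the one-dimensional toy problem, the arithmetic-blend crossover, the Gaussian mutation, the binary tournament and all function names are assumptions made for the example.

```python
import random
from typing import List, Tuple


def evaluate(x: float) -> Tuple[float, float]:
    """Two conflicting toy objectives, both minimised (an assumed test problem)."""
    return (x * x, (x - 2.0) ** 2)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def update_archive(archive: List[float], candidates: List[float]) -> List[float]:
    """Keep only the solutions that no other solution dominates."""
    pool = list(dict.fromkeys(archive + candidates))  # drop exact duplicates
    scores = {x: evaluate(x) for x in pool}
    return [x for x in pool if not any(dominates(scores[y], scores[x]) for y in pool)]


def select(population: List[float], rng: random.Random) -> float:
    """Binary tournament: take the second candidate only if it dominates the first."""
    a, b = rng.sample(population, 2)
    return b if dominates(evaluate(b), evaluate(a)) else a


def run_moea(generations: int = 20, size: int = 10, seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    population = [rng.uniform(-5.0, 5.0) for _ in range(size)]  # initialise randomly
    archive: List[float] = []
    for _ in range(generations):                        # "do it again?"
        archive = update_archive(archive, population)   # fitness evaluation + archive
        offspring = []
        for _ in range(size):
            p1, p2 = select(population, rng), select(population, rng)  # selection
            child = 0.5 * (p1 + p2)                     # crossover (arithmetic blend)
            child += rng.gauss(0.0, 0.1)                # mutation (Gaussian noise)
            offspring.append(child)
        population = offspring
    return update_archive(archive, population)          # output archive


trade_offs = run_moea()
```

The returned archive is the set of trade-off solutions found in a single run; none of its members dominates another.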

Figure 2: Genetic operators: crossover and mutation. They can change the structure and the parameters of models.
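As a rough illustration of how operators can alter both structure and parameters, the sketch below encodes a model as a variable-length list of numbers. This encoding and the operator details are hypothetical and chosen only for the example; the poster does not specify its representation.

```python
import random
from typing import List, Tuple

Model = List[float]  # hypothetical encoding: one numeric parameter per model component


def crossover(a: Model, b: Model, rng: random.Random) -> Tuple[Model, Model]:
    """One-point crossover with independent cut points, so child lengths can differ."""
    cut_a = rng.randint(1, len(a))
    cut_b = rng.randint(1, len(b))
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]


def mutate(model: Model, rng: random.Random, sigma: float = 0.1) -> Model:
    """Perturb one parameter, add a component, or remove one (structure change)."""
    child = list(model)  # copy, so the parent is left untouched
    move = rng.choice(["perturb", "add", "remove"])
    if move == "perturb":
        i = rng.randrange(len(child))
        child[i] += rng.gauss(0.0, sigma)
    elif move == "add":
        child.insert(rng.randint(0, len(child)), rng.uniform(-1.0, 1.0))
    elif len(child) > 1:  # "remove", but never delete the last component
        del child[rng.randrange(len(child))]
    return child


rng = random.Random(1)
parent_a, parent_b = [0.1, 0.2, 0.3], [1.0, 2.0, 3.0, 4.0, 5.0]
child_a, child_b = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, rng)
```

Because the two cut points are drawn independently, crossover redistributes components between the children (a structural change), while the "perturb" move changes a parameter only.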

Figure 3: Selection simulates the Darwinian principle of the survival of the fittest.

Figure 4 axis labels: Decrease complexity; Increase fit to the data.

Figure 4: Set of trade-off solutions. Red points are so-called non-dominated solutions (models).
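To make the notion of a non-dominated solution concrete, here is a small sketch that filters a list of models scored on two objectives. It assumes both objectives are minimised (model complexity and error); the example scores are invented for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (complexity, error), both to be minimised


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical models as (number of rules, error rate) pairs.
models = [(2, 0.30), (5, 0.12), (5, 0.20), (9, 0.10), (12, 0.10)]
front = non_dominated(models)
```

Here (5, 0.20) is dropped because (5, 0.12) is equally complex but more accurate, and (12, 0.10) is dropped because (9, 0.10) is equally accurate but simpler.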

Participants: Hane/Paulo (UK), Patrizia (Italy), Ian (UK), Christian (UK), George (Greece)

Elia (Italy), Azzam (UK)

General Ocular Oncology Database (Geoconda)

Assessors

Referee

Figure 5: Model extraction performed by five centres from three countries. The results are evaluated in a double-blind manner.

Figure 8: Kaplan-Meier curves for the Leukemia data (solid line: Treatment, dashed line: Placebo) together with the estimates of the model (asterisks: Treatment, crosses: Placebo).
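For readers unfamiliar with Kaplan-Meier curves, the product-limit estimate can be computed as in the sketch below. The follow-up times and event indicators are hypothetical and are not the Leukemia data shown in the figure.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for tm in times if tm >= t)
        deaths = sum(1 for tm, e in zip(times, events) if tm == t and e == 1)
        survival *= 1.0 - deaths / at_risk  # product-limit update
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times (weeks) and event indicators (0 = censored).
times = [6, 6, 7, 9, 10, 13]
events = [1, 0, 1, 0, 1, 1]
curve = kaplan_meier(times, events)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at observed event times.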

Figure 7: Results of an international multi-centre study. Performance is measured by the concordance index (C-index). RBF is the MOEA approach.
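The C-index measures how often a model ranks patients' risks in the same order as their observed survival times. The sketch below follows Harrell's common definition; the poster does not state which variant was used, so this choice and the example data are assumptions.

```python
from typing import List


def concordance_index(times: List[float], events: List[int], risk: List[float]) -> float:
    """Fraction of comparable pairs that the risk score orders correctly.

    A pair (i, j) is comparable when patient i has an observed event strictly
    before time j. Tied risk scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: higher risk should mean shorter survival.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.5, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.5, 0.7, 0.9])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1 indicates perfect ranking, 0.5 is what a constant or random score achieves, and 0 indicates a perfectly inverted ranking.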

Figure 9: The produced models can also be used via the Internet.

Figure 10: Kaplan-Meier curves of four generated subgroups with significantly different survival times (Log-Rank 320.0965, p < 0.0001).
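The log-rank statistic compares observed and expected event counts across groups at each event time. The sketch below covers only the two-group case, which is a simplification of the four-group comparison reported in the figure; the function name and the example data are assumptions.

```python
import math
from typing import List, Tuple


def log_rank(times_a: List[float], events_a: List[int],
             times_b: List[float], events_b: List[int]) -> Tuple[float, float]:
    """Two-group log-rank test: returns (chi-square statistic, p-value, 1 d.o.f.)."""
    observed_a = expected_a = variance = 0.0
    all_times = times_a + times_b
    all_events = events_a + events_b
    for t in sorted({x for x, e in zip(all_times, all_events) if e == 1}):
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance; undefined when one subject remains
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("variance is zero; groups cannot be compared")
    statistic = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(statistic / 2.0))  # chi-square tail, 1 d.o.f.
    return statistic, p_value


# Hypothetical groups: A fails early, B fails late, no censoring.
stat, p = log_rank([1, 2], [1, 1], [3, 4], [1, 1])
```

With more than two groups, the same observed-minus-expected counts are combined through a covariance matrix and the statistic is referred to a chi-square distribution with (number of groups minus one) degrees of freedom.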

Figure 11: The MOEA was used to describe the four subgroups within the feature space.

Figure 1 flowchart labels: Initialise Population Randomly; Apply Genetic Operators; Fitness Evaluation; Do It Again?; Output Archive; Archive; Selection.
