Julian de Wit - Diagnosing heart disease using convolutional neural networks


Diagnosing heart diseases with deep neural networks

Introduction

• Julian de Wit

• Freelance software / machine learning engineer

• MSc. Software engineering

• Love biologically inspired computing

• Last few years neural net “revolution”

• Turn academic ideas into practical apps

• Documents, plant/fruit grading, medical, radar

Agenda

1. Diagnose heart disease challenge

2. Deep learning

3. Solution discussion

4. Results

5. Some extra slides

6. Feel free to ask questions during the talk!

Challenge

• Second national data science bowl

• Kaggle.com / Booz Allen Hamilton

• Automate a manual 30-minute clinical procedure

• Ca. 500,000 cases/year in the USA

• Estimate heart volume based on MRIs

• The systole/diastole volume ratio (ejection fraction) is the ‘health’ predictor

• 750 teams

• $200,000 prize money

Challenge

• Kaggle.com

• Competition platform for ‘data scientists’

• Challenges hosted for companies

• Prize money and exposure

• 400,000+ registered users

• Lesson: there is always someone smarter than you!

• Today’s state of the art is tomorrow’s baseline!

Challenge

• Given: MRIs, metadata, training volumes

• Train: 700 patients, Test: 1,000 patients, 300,000+ images

• Estimate the volume of the left ventricle

Deep learning

• Image data → Deep Learning (CNN)

• Neural networks 2.0

• Don’t believe ALL the hype

• Structured data → feature engineering + tree/linear models

• Great when “perception” data is involved

• Spectacular results with image analysis

• My take: “Super human” with a twist

Solution

• Step 1: Preprocessing

• Use DICOM info to make images uniform

• Crop a 180x180 region around the heart (fewer distractions)

• For my segmentation solution this also reduces class imbalance

• Local contrast enhancement (CLAHE); a sketch follows below
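A minimal sketch of this preprocessing step using pydicom and OpenCV: normalize the pixel spacing from the DICOM header, apply CLAHE, and crop a fixed window. The crop centering, target spacing and CLAHE parameters here are illustrative assumptions, not the author's exact settings.

```python
import numpy as np
import cv2
import pydicom

def preprocess_slice(dicom_path, target_spacing=1.0, crop_size=180):
    """Load one MRI slice, normalize pixel spacing, enhance local
    contrast (CLAHE) and crop a fixed window."""
    ds = pydicom.dcmread(dicom_path)
    img = ds.pixel_array.astype(np.float32)

    # Rescale so every image has the same mm-per-pixel resolution.
    row_mm, col_mm = map(float, ds.PixelSpacing)
    img = cv2.resize(img, None, fx=col_mm / target_spacing,
                     fy=row_mm / target_spacing)

    # Convert to 8-bit for CLAHE (contrast limited adaptive
    # histogram equalization), which boosts local contrast.
    img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    img = clahe.apply(img)

    # Crop a fixed window around the image center; in practice the
    # crop would be centered on the detected heart region.
    h, w = img.shape
    y0 = max(0, h // 2 - crop_size // 2)
    x0 = max(0, w // 2 - crop_size // 2)
    return img[y0:y0 + crop_size, x0:x0 + crop_size]
```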

Solution


• Step 2: Train a deep neural net

• Standard option: regression with a ‘vanilla’ architecture

• Approach used by most teams (e.g. #2, Ghent University)

• Input slices, regress on the provided volumes (sketch below)
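A minimal Keras sketch of this ‘vanilla’ regression option: a plain CNN that maps a slice directly to a volume in ml, trained with MAE. The architecture and layer sizes are illustrative only, not what any particular team used.

```python
from tensorflow.keras import layers, models

def build_regression_cnn(input_shape=(180, 180, 1)):
    """Plain CNN that regresses a volume (in ml) from one slice."""
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation='relu', padding='same')(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3, activation='relu', padding='same')(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(64, activation='relu')(x)
    out = layers.Dense(1)(x)          # predicted volume in ml
    model = models.Model(inp, out)
    model.compile(optimizer='adam', loss='mae')  # mean absolute error in ml
    return model
```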

Solution

• Less publicized approach (mine): segment the images

• Integrate the estimated areas into a volume using metadata

• Problem: no annotations provided → Sunnybrook data / hand labeling

Solution

• Segmentation: a traditional architecture is a bad fit

• Every layer gives higher-level features but less spatial info (BOW)

• Per-pixel classification is possible but coarse due to the spatial loss

• Cumbersome! H × W × 300,000 classifications

Solution

• Segmentation: fully convolutional architecture + upscaling (sketch below)

• Efficient: classify all pixels at once

• Still a problem: the spatial bottleneck at the bottom keeps the output coarse
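A minimal Keras sketch of the fully convolutional idea: downsample with convolutions and pooling, then upsample back to the input resolution and classify every pixel in one forward pass. The layer sizes are illustrative; the 45x45 middle layer is the spatial bottleneck that keeps the output coarse.

```python
from tensorflow.keras import layers, models

def build_fcn(input_shape=(180, 180, 1)):
    """Fully convolutional net: encode, then upsample back to the
    input size and predict left-ventricle yes/no for every pixel."""
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation='relu', padding='same')(inp)
    x = layers.MaxPooling2D()(x)                        # 90x90
    x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
    x = layers.MaxPooling2D()(x)                        # 45x45 spatial bottleneck
    x = layers.Conv2D(128, 3, activation='relu', padding='same')(x)
    x = layers.UpSampling2D()(x)                        # back to 90x90
    x = layers.Conv2D(64, 3, activation='relu', padding='same')(x)
    x = layers.UpSampling2D()(x)                        # back to 180x180
    out = layers.Conv2D(1, 1, activation='sigmoid')(x)  # per-pixel probability
    return models.Model(inp, out)
```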

Solution

• Segmentation: U-net architecture (sketch below)

• Skip connections give more detail in the segmentation output

• The author now works at DeepMind Health

• ResNet-like?!?
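A tiny U-net-style sketch in Keras showing the point of this slide: the Concatenate skip connections carry high-resolution encoder features into the decoder, which restores detail in the output. It is far smaller than the original U-net and is not the author's actual architecture.

```python
from tensorflow.keras import layers, models

def build_unet(input_shape=(180, 180, 1)):
    """Tiny U-net-style model: the skip connections (Concatenate)
    feed high-resolution encoder features into the decoder."""
    inp = layers.Input(shape=input_shape)
    c1 = layers.Conv2D(32, 3, activation='relu', padding='same')(inp)
    p1 = layers.MaxPooling2D()(c1)                      # 90x90
    c2 = layers.Conv2D(64, 3, activation='relu', padding='same')(p1)
    p2 = layers.MaxPooling2D()(c2)                      # 45x45
    b = layers.Conv2D(128, 3, activation='relu', padding='same')(p2)

    u2 = layers.UpSampling2D()(b)                       # 90x90
    u2 = layers.Concatenate()([u2, c2])                 # skip connection
    c3 = layers.Conv2D(64, 3, activation='relu', padding='same')(u2)
    u1 = layers.UpSampling2D()(c3)                      # 180x180
    u1 = layers.Concatenate()([u1, c1])                 # skip connection
    c4 = layers.Conv2D(32, 3, activation='relu', padding='same')(u1)
    out = layers.Conv2D(1, 1, activation='sigmoid')(c4)
    return models.Model(inp, out)
```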

Solution

• Segmentation results were impressive

• The machine did exactly what it was told

• Confused by uncommon examples (< 1%)

• Remedy: active learning

• Nice property: output brightness reflects (un)certainty

Solution

• Last step: integrate to volume... should be simple

• The devil was in the details

[Diagram: per-pixel left-ventricle segmentation (yes/no) on n slices → n overlays; sum the segmented pixels per slice and use the DICOM info to convert to ml, e.g. 100 ml]
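A minimal sketch of the integration step in the diagram: count segmented pixels per slice, convert to an area with the DICOM PixelSpacing, and sum areas times slice spacing to get millilitres. The simple sum-of-slabs rule and the function names are assumptions for illustration.

```python
import numpy as np

def masks_to_volume_ml(masks, pixel_spacing_mm, slice_thickness_mm):
    """Turn a stack of binary left-ventricle masks into a volume.

    masks              : numpy array of shape (n_slices, H, W), values 0/1
    pixel_spacing_mm   : (row_mm, col_mm) in-plane pixel size
    slice_thickness_mm : distance between neighbouring slices
    """
    pixel_area_mm2 = pixel_spacing_mm[0] * pixel_spacing_mm[1]
    slice_areas_mm2 = masks.reshape(len(masks), -1).sum(axis=1) * pixel_area_mm2
    volume_mm3 = slice_areas_mm2.sum() * slice_thickness_mm
    return volume_mm3 / 1000.0   # 1 ml == 1000 mm^3

# Usage example (dummy data):
# masks = np.zeros((10, 180, 180)); masks[:, 80:100, 80:100] = 1
# print(masks_to_volume_ml(masks, (1.0, 1.0), 8.0))
```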

Solution

• Devil in the details: MUCH data cleaning

• Slice order (see the sketch after this list)

• Missing slices

• Out-of-bound slices

• Wrong orientation

• Missing frames

• BAD ground-truth volumes

• Gradient boosting “calibration” procedure

• Not relevant in a real setting: just rescan the MRI
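A hedged sketch of the kind of slice-ordering and missing-slice checks implied by the list above, using pydicom's SliceLocation tag. The tolerance value and the median-gap heuristic are my own illustrative choices, not the author's cleaning pipeline.

```python
import numpy as np
import pydicom

def order_and_check_slices(dicom_paths, tolerance_mm=1.0):
    """Sort slices along the heart axis and flag gaps that hint at
    missing slices."""
    slices = [pydicom.dcmread(p) for p in dicom_paths]
    slices.sort(key=lambda s: float(s.SliceLocation))

    locations = np.array([float(s.SliceLocation) for s in slices])
    gaps = np.diff(locations)
    expected = np.median(gaps)          # typical slice-to-slice distance
    missing = np.where(gaps > expected + tolerance_mm)[0]
    if len(missing):
        print("Possible missing slice(s) after index:", missing.tolist())
    return slices
```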

Results

• 3rd place

• Only 1 model. No ensemble.

• Sub 10ml MAE → clinically significant

• Many improvements possible:

• More, cleaner train data

• Expert annotations

• Active learning

Appendix 1.

• Other approaches

• #1: Similar + 9 extra models (segmentation, age, 4-chamber view, regression on images, etc.)

• #2: Traditional, 250 (!) models, dynamic ensemble per patient

• “Cool” end-to-end model

Appendix 2.

• U-nets and the state of the art

• Potential successor: dilated convolutions.

• No more bottleneck.

• Somewhat easier to use.

• Small improvements on a personal project.

• The jury is still out.

• Kaggle: Ultrasound Nerve Segmentation

• U-nets were both the baseline and the best solution.

• FCN also worked.

• No significant “discoveries”

• Dilated convolutions did not seem to work.

Appendix 3.

• Medical image challenges

• Deep learning => success

• Example: Kaggle retinopathy challenge

• As good as a doctor (better in combination)

• Google DeepMind (Jeffrey De Fauw = Kaggler)

• Many other companies “copied” the solution

Summary

• Deep learning for medical imaging

THE END....

Diagnosing heart diseases with deep neural networks


My background

• Julian de Wit

• Freelance software / machine learning engineer

• Technical University Delft: Software Engineering

• Biologically inspired computing / AI

• Since 2006 heavily re-interested in neural nets

• Looking for opportunities to test them and bring them into practice

Approach

[Diagram: per-pixel left-ventricle segmentation (yes/no) on n slices → n overlays → clean data & sum → calibrate against the provided volumes → e.g. 110 ml]

Calibration

• Use the provided volumes to calibrate

• Remove systematic errors

• Use a gradient booster on the residuals (sketch below)

• Top 5 → top 3

• Beware of overfitting
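A minimal sketch of this calibration idea: fit a gradient booster to the residuals between the provided training volumes and the integrated estimates, then add the predicted correction to new estimates. The feature set and hyperparameters are placeholders, not the author's actual setup.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_calibration(est_volumes, true_volumes, extra_features=None):
    """Fit a booster on the residuals (true - estimated) so that
    systematic errors in the integration step can be corrected."""
    X = np.asarray(est_volumes, dtype=float).reshape(-1, 1)
    if extra_features is not None:          # e.g. age, sex, slice count (2-D array)
        X = np.hstack([X, extra_features])
    residuals = np.asarray(true_volumes) - np.asarray(est_volumes)
    model = GradientBoostingRegressor(n_estimators=200, max_depth=3)
    model.fit(X, residuals)
    return model

def apply_calibration(model, est_volumes, extra_features=None):
    """Return calibrated volumes: estimate + predicted residual."""
    X = np.asarray(est_volumes, dtype=float).reshape(-1, 1)
    if extra_features is not None:
        X = np.hstack([X, extra_features])
    return np.asarray(est_volumes) + model.predict(X)
```

To respect the "beware of overfitting" point, the residual model would typically be fit on out-of-fold estimates rather than on predictions from a network that saw the same patients.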

Approach

• Every pixel: left ventricle yes/no

• Use a convolutional neural network

• Sunnybrook data too simplistic

• Train with hand-labeled segmentations (a common loss choice is sketched below)

• Reverse engineer how to label

• Fix systematic errors with calibration against the provided volumes
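The slides do not say which loss was used for the per-pixel yes/no training; a Dice-overlap loss is one common choice for this kind of segmentation and is shown here purely as an illustrative sketch (the function names are mine).

```python
import tensorflow as tf

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Overlap between predicted and hand-labeled masks; 1.0 means a
    perfect per-pixel match."""
    y_true_f = tf.cast(tf.reshape(y_true, [-1]), tf.float32)
    y_pred_f = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (
        tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)

def dice_loss(y_true, y_pred):
    """Loss to minimize: 1 minus the Dice overlap."""
    return 1.0 - dice_coefficient(y_true, y_pred)

# Usage: model.compile(optimizer='adam', loss=dice_loss,
#                      metrics=[dice_coefficient])
```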


Labeling

• Hand labeling with my own tool

• A big performance-limiting factor

• Could not find out exactly how it should be done

[Figure: classification example (‘cat’) vs. per-pixel labels (‘cat’ / ‘grass’)]

Submission

• CRPS (continuous ranked probability score)

• Uncertainty based on the stdev of the error as a function of size

• The model provided the uncertainty

• However, this does not account for uncertainty in the labels

• Example: patient 429, an error of 89 ml!

• The provided label was wrong… (a CDF sketch for the CRPS submission follows below)
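For context, a CRPS-scored submission asks for cumulative probabilities P(volume ≤ v) over a range of candidate volumes. A minimal sketch of turning a point prediction plus a size-dependent standard deviation into such a CDF, assuming a normal error distribution; the 600 ml range, the sigma formula and all numbers are illustrative assumptions, not the author's actual submission code.

```python
import numpy as np
from scipy.stats import norm

def size_dependent_sigma(pred_ml, base=4.0, rel=0.05):
    """Uncertainty that grows with predicted size (illustrative numbers)."""
    return base + rel * pred_ml

def volume_to_cdf(pred_ml, sigma_ml, max_ml=600):
    """Cumulative probabilities P(volume <= v) for v = 0..max_ml-1,
    as required for a CRPS-scored submission."""
    v = np.arange(max_ml)
    return norm.cdf(v, loc=pred_ml, scale=sigma_ml)

# Usage example: a 110 ml prediction turned into a submission row.
cdf = volume_to_cdf(110.0, size_dependent_sigma(110.0))
```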