Diagnosing heart diseases with deep neural networks
Introduction
• Julian de Wit
• Freelance software / machine learning engineer
• MSc. Software Engineering
• Love biologically inspired computing
• Last few years: the neural net “revolution”
• Turn academic ideas into practical applications
• Documents, plant/fruit grading, medical, radar
Agenda
1. Diagnose heart disease challenge
2. Deep learning
3. Solution discussion
4. Results
5. Some extra slides
6. Feel free to ask questions during the talk!
Challenge
• Second National Data Science Bowl
• Kaggle.com / Booz Allen Hamilton
• Automate a manual 30-minute clinical procedure
• Ca. 500,000 cases/year in the USA
• Estimate heart volume from MRIs
• The systole/diastole volume ratio is the ‘health’ predictor
• 750 teams
• $200,000 prize money
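The systole/diastole ratio in the bullets above is what clinicians turn into the ejection fraction. A minimal sketch of that computation (function name and example values are my own, not from the talk):

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Fraction of blood pumped out per heartbeat.

    edv_ml: end-diastolic volume (heart fully relaxed, largest).
    esv_ml: end-systolic volume (heart fully contracted, smallest).
    """
    return (edv_ml - esv_ml) / edv_ml

# A healthy heart typically ejects roughly 50-70% of its diastolic volume.
print(ejection_fraction(150.0, 60.0))  # → 0.6
```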
Challenge
• Kaggle.com
• Competition platform for ‘data scientists’
• Challenges hosted for companies
• Prize money and exposure
• 400,000+ registered users
• Learn: there is always someone smarter than you!
• Today’s state of the art is tomorrow’s baseline!
Challenge
• Given: MRIs, metadata, training volumes
• Train: 700 patients, test: 1,000 patients, 300,000+ images
• Estimate the volume of the left ventricle
Deep learning
• Image data → deep learning (CNN)
• Neural networks 2.0
• Don’t believe ALL the hype
• Structured data → feature engineering + tree/linear models
• Great when “perception” data is involved
• Spectacular results in image analysis
• My take: “superhuman”, with a twist
Solution
• Step 1: Preprocessing
• Use DICOM info to make images uniform
• Crop a 180×180 region around the heart (fewer distractions)
• For my solution: less class imbalance
• Local contrast enhancement (CLAHE)
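The preprocessing steps above can be sketched roughly as follows. This is an illustrative sketch, not the author’s code: `pixel_spacing` would come from the DICOM `PixelSpacing` tag, the 180×180 crop size follows the slide, and CLAHE is left as a comment (it is available as `cv2.createCLAHE` in OpenCV) to keep the sketch dependency-light:

```python
import numpy as np
from scipy.ndimage import zoom

def make_uniform(img: np.ndarray, pixel_spacing: tuple,
                 target_spacing: float = 1.0) -> np.ndarray:
    """Resample so every image has the same mm-per-pixel scale."""
    factors = (pixel_spacing[0] / target_spacing,
               pixel_spacing[1] / target_spacing)
    return zoom(img, factors, order=1)

def crop_center(img: np.ndarray, size: int = 180) -> np.ndarray:
    """Crop a size x size window around the image center (the heart is
    assumed roughly centered after an earlier localization step)."""
    h, w = img.shape
    top, left = max(0, (h - size) // 2), max(0, (w - size) // 2)
    return img[top:top + size, left:left + size]

# CLAHE (local contrast enhancement) would follow, e.g. with OpenCV:
#   clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
#   img = clahe.apply(img.astype(np.uint8))

raw = np.random.rand(256, 200)            # stand-in for one MRI slice
uniform = make_uniform(raw, (1.4, 1.4))   # 1.4 mm spacing → 1.0 mm
patch = crop_center(uniform, 180)
print(patch.shape)  # → (180, 180)
```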
Solution
• Step 2: Train a deep neural net
• Standard option: regression with a ‘vanilla’ architecture
• Approach used by most teams (e.g. #2, Ghent University)
• Input slices, regress on the provided volumes
Solution
• Less publicized approach (mine): segment the images
• Integrate the estimated areas into a volume using the metadata
• Problem: no annotations provided → Sunnybrook dataset + hand labeling
Solution
• Segmentation: a traditional architecture is a bad fit
• Every layer gives higher-level features but less spatial info (BOW)
• Per-pixel classification is possible but coarse due to the spatial loss
• Cumbersome! H × W × 300,000 classifications
Solution
• Segmentation: fully convolutional architecture + upscaling
• Efficient: classify all pixels at once
• Remaining problem: the spatial bottleneck at the bottom → coarse output
Solution
• Segmentation: U-net architecture
• Skip connections give more detail in the segmentation output
• The author works at DeepMind Health now
• ResNet-like?!
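A minimal two-level U-net sketch of the idea on this slide: the skip connection carries full-resolution features past the bottleneck, so the decoder gets its detail back. PyTorch is used purely for illustration; the talk does not specify a framework, and the real network was deeper:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = conv_block(1, 8)                       # full resolution
        self.bottleneck = conv_block(8, 16)               # after 2x downscale
        self.up = nn.ConvTranspose2d(16, 8, 2, stride=2)  # learned upscale
        self.dec = conv_block(16, 8)                      # 8 up + 8 skip
        self.head = nn.Conv2d(8, 1, 1)                    # per-pixel LV yes/no

    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(nn.functional.max_pool2d(e, 2))
        u = self.up(b)
        d = self.dec(torch.cat([u, e], dim=1))  # skip connection: detail back
        return torch.sigmoid(self.head(d))

mask = TinyUNet()(torch.randn(1, 1, 64, 64))
print(mask.shape)  # → torch.Size([1, 1, 64, 64])
```

The output has the same spatial size as the input, which is exactly what per-pixel segmentation needs.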
Solution
• Segmentation results were impressive
• The machine did exactly what it was told
• Confused by uncommon examples (< 1%)
• Remedy: active learning
• Nice property: output brightness == (un)certainty
Solution
• Last step: integrate to a volume… should be simple
• The devil was in the details
[Figure: per-pixel segmentation of the left ventricle (yes/no) turns n slices into n overlays; sum all pixels and use DICOM info to convert to ml, e.g. 100 ml]
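The “sum all pixels and use DICOM info” step in the figure amounts to: count LV pixels per slice, convert to mm² with the pixel spacing, multiply by the slice spacing, and sum the disks. A sketch, with example spacings that are mine rather than from the talk:

```python
import numpy as np

def lv_volume_ml(masks, pixel_spacing_mm, slice_spacing_mm):
    """Integrate per-slice segmentation masks into a volume in ml.

    masks: list of 2-D boolean arrays, one per short-axis slice.
    pixel_spacing_mm: (row, col) in-plane spacing from DICOM PixelSpacing.
    slice_spacing_mm: distance between adjacent slices.
    """
    pixel_area = pixel_spacing_mm[0] * pixel_spacing_mm[1]   # mm^2 per pixel
    areas = [m.sum() * pixel_area for m in masks]            # mm^2 per slice
    volume_mm3 = sum(a * slice_spacing_mm for a in areas)    # stack the disks
    return volume_mm3 / 1000.0                               # mm^3 → ml

# 10 slices, each with 1000 LV pixels at 1.4 x 1.4 mm spacing, 10 mm apart:
masks = [np.ones((40, 25), dtype=bool) for _ in range(10)]
print(round(lv_volume_ml(masks, (1.4, 1.4), 10.0), 3))  # → 196.0
```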
Solution
• Devil in the details: MUCH data cleaning
• Slice order
• Missing slices
• Out-of-bounds slices
• Wrong orientation
• Missing frames
• BAD ground-truth volumes
• Gradient boosting “calibration” procedure
• Not relevant in a real setting: just rescan the MRI
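The gradient-boosting “calibration” in the last bullets can be sketched as fitting a booster on the residuals between the integrated volumes and the provided training volumes, then adding its prediction back. scikit-learn stands in for whichever booster was actually used, and the synthetic systematic error below is purely illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
true_vol = rng.uniform(50, 400, size=300)                   # provided volumes
raw_pred = true_vol * 0.9 - 5 + rng.normal(0, 3, 300)       # systematic bias

X = raw_pred.reshape(-1, 1)            # real features could include metadata
resid = true_vol - raw_pred            # what the pipeline systematically misses
calib = GradientBoostingRegressor(random_state=0).fit(X, resid)

calibrated = raw_pred + calib.predict(X)
raw_mae = np.abs(true_vol - raw_pred).mean()
cal_mae = np.abs(true_vol - calibrated).mean()
# Note: this fits and evaluates in-sample; the slide's "beware of
# overfitting" warning means real use needs cross-validation.
print(cal_mae < raw_mae)  # → True (systematic error removed)
```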
Results
• 3rd place
• Only 1 model, no ensemble
• Sub-10 ml MAE → clinically significant
• Many improvements possible:
• More, cleaner training data
• Expert annotations
• Active learning
Appendix 1
• Other approaches
• #1: similar + 9 extra models (segmentation, age, 4-chamber, regression on images, etc.)
• #2: traditional, 250(!) models, dynamic ensemble per patient
• “Cool” end-to-end model
Appendix 2
• U-nets and the state of the art
• Potential successor: dilated convolutions
• No more bottleneck
• Somewhat easier to use
• Small improvements on a personal project
• The jury is still out
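The “no more bottleneck” point is about receptive field: stacked dilated convolutions grow the receptive field quickly without any pooling, so no spatial resolution is lost. A quick check with the standard receptive-field formula (the helper function is my own, not from the talk):

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field of stacked stride-1 dilated convolutions:
    each layer adds dilation * (kernel_size - 1) pixels of context."""
    return 1 + sum(d * (kernel_size - 1) for d in dilations)

# Three 3x3 layers with dilations 1, 2, 4: a 15-pixel receptive field,
# while the feature map keeps its full resolution (no pooling).
print(receptive_field([1, 2, 4]))  # → 15
# Plain (dilation-1) 3x3 convs would need 7 layers for the same context:
print(receptive_field([1] * 7))    # → 15
```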
• Kaggle: ultrasound nerve segmentation
• U-net was the baseline and the best solution
• FCN also worked
• No significant “discoveries”
• Dilated convolutions did not seem to work
Appendix 3
• Medical image challenges
• Deep learning ⇒ success
• Example: Kaggle retinopathy challenge
• As good as a doctor (better in combination)
• Google DeepMind (Jeffrey De Fauw = Kaggler)
• Many other companies “copied” the solution
Summary
• Deep learning for medical imaging
THE END…
My background
• Julian de Wit
• Freelance software / machine learning engineer
• Technical University Delft: SE
• Biologically inspired computing / AI
• Heavily re-interested in neural nets since 2006
• Looking for opportunities to test them and bring them into practice
Approach
[Figure: pipeline overview — per-pixel left-ventricle segmentation (yes/no) turns n slices into n overlays; clean data & sum; calibrate against the provided volumes, e.g. 110 ml]
Calibration
• Use the provided volumes to calibrate
• Remove systematic errors
• Use a gradient booster on the residuals
• Top 5 → top 3
• Beware of overfitting
Approach
• Every pixel: left ventricle yes/no
• Use a convolutional neural network
• Sunnybrook too simplistic
• Train with hand-labeled segmentations
• Reverse engineer how to label
• Fix systematic errors with calibration against the provided volumes
Labeling
• Hand labeling with my own tool
• Big performance-limiting factor
• Could not find out exactly how the labeling should be done
[Figure: labeling example — image regions annotated “cat” and “grass”]
Submission
• Scored with CRPS (continuous ranked probability score)
• Uncertainty based on the stdev of the error as a function of size
• Model-provided uncertainty
• However, this does not account for uncertainty in the labels
• Example: patient 429, an error of 89 ml!!!
• The provided label was wrong…
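The CRPS mentioned above scored a predicted cumulative distribution over volumes against a step function at the true volume. A sketch under the competition's 0–599 ml binning (the example CDFs are mine):

```python
import numpy as np

def crps(predicted_cdf: np.ndarray, true_volume_ml: float) -> float:
    """Continuous ranked probability score for one submission row:
    mean squared difference between the predicted CDF P(V <= v) and
    the step function of the true volume, over v = 0..599 ml."""
    v = np.arange(600)
    heaviside = (v >= true_volume_ml).astype(float)
    return float(np.mean((predicted_cdf - heaviside) ** 2))

# A perfectly confident, correct prediction scores 0:
perfect = (np.arange(600) >= 123).astype(float)
print(crps(perfect, 123.0))  # → 0.0

# A smoothed (uncertain) CDF centered on the right answer scores worse,
# which is how the metric rewards well-calibrated confidence:
from scipy.stats import norm
smooth = norm.cdf(np.arange(600), loc=123, scale=10)
print(crps(smooth, 123.0) > 0.0)  # → True
```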