GENIE: Automated Feature Extraction
for Pathology Applications
Neal R. Harvey
Kim Edlund
Los Alamos National Laboratory
harve/[email protected]
Acknowledgements: We should like to thank the following for providing their medical expertise, data and some results shown during this presentation:
• Dr. Richard Levenson, CRI inc.
• Dr. David Rimm, Yale University
• Dr. Carola Zalles, Yale University
• Dr. Cesar Angeletti (formerly of Yale University)
So much Data, So Little Information
Satellite-based and other instrumentation today produces unprecedented quantities of raw image and signal data.
Hidden in this data is information of interest to analysts and scientists.
How can this information be extracted:
• Easily
• Rapidly
• Reliably
So much Data, So Little Information
Microscope cameras, slide scanners and other instrumentation today produce unprecedented quantities of raw image data.
Hidden in this data is information of interest to pathologists, other medics and scientists.
How can this information be extracted:
• Easily
• Rapidly
• Reliably
Traditional Approach
Physical Modeling
GENIE: Machine Learning
Easier to show a machine what to find...
...than to tell a machine how to find it
GENIE automatically generates an algorithm for future use
Train
Exploit
Evolving Solutions
• GENIE is an Adaptive System:
– It derives a general purpose image classifier from a limited set of user-supplied examples.
– It uses a hybrid genetic algorithm, combining evolutionary exploration with statistical machine learning.
Issues in Pixel Classification
• Spectral information often inadequate.
• Need to make use of textural and spatial context cues.
• Many, many ways of describing/encoding such spatial context information.
• Best techniques are task-specific.
• How do we learn to map pixels to categories in general?
The GENIE Approach
• Give GENIE a large and flexible “toolbox” of image processing algorithms.
• Use an evolutionary algorithm to explore which tools are most appropriate for the current task.
• Use statistical machine learning to learn how to combine those tools together to give an accurate classification.
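The three steps above can be illustrated with a toy sketch. This is not GENIE's implementation: the three-tool toolbox, the function names, and the nearest-class-mean classifier standing in for the statistical learning step are all assumptions made for illustration. Each chromosome is a tuple of on/off flags selecting tools, and its fitness is the training accuracy of the classifier built on the selected features.

```python
import random

def identity(img):
    return [list(map(float, row)) for row in img]

def local_mean(img):
    """3x3 neighbourhood mean (edges use only the pixels that exist)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def local_contrast(img):
    """Absolute difference between a pixel and its 3x3 mean: a crude texture cue."""
    mean = local_mean(img)
    return [[abs(p - m) for p, m in zip(prow, mrow)]
            for prow, mrow in zip(img, mean)]

TOOLBOX = [identity, local_mean, local_contrast]  # hypothetical toy toolbox

def pixel_features(chromosome, img):
    """Stack the selected tools' outputs into one feature vector per pixel."""
    planes = [tool(img) for tool, on in zip(TOOLBOX, chromosome) if on]
    return [[plane[y][x] for plane in planes]
            for y in range(len(img)) for x in range(len(img[0]))]

def fitness(chromosome, img, labels):
    """Training accuracy of a nearest-class-mean classifier (assumed stand-in)."""
    feats = pixel_features(chromosome, img)
    flat = [lab for row in labels for lab in row]
    if not feats[0]:          # no tool selected
        return 0.0
    means = {}
    for c in (0, 1):
        members = [f for f, lab in zip(feats, flat) if lab == c]
        means[c] = [sum(col) / len(members) for col in zip(*members)]
    def dist(f, m):
        return sum((a - b) ** 2 for a, b in zip(f, m))
    correct = sum((1 if dist(f, means[1]) < dist(f, means[0]) else 0) == lab
                  for f, lab in zip(feats, flat))
    return correct / len(flat)

def _repair(bits, rng):
    if not any(bits):         # keep at least one tool switched on
        bits[rng.randrange(len(bits))] = True
    return tuple(bits)

def evolve(img, labels, generations=10, pop_size=6, seed=0):
    """Elitist evolutionary search over tool subsets."""
    rng = random.Random(seed)
    pop = [_repair([rng.random() < 0.5 for _ in TOOLBOX], rng)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, img, labels), reverse=True)
        survivors = pop[: pop_size // 2]
        children = [_repair([(not b) if rng.random() < 0.3 else b
                             for b in rng.choice(survivors)], rng)
                    for _ in survivors]
        pop = survivors + children
    best = max(pop, key=lambda c: fitness(c, img, labels))
    return best, fitness(best, img, labels)

def toy_image():
    """Left half: checkerboard texture (class 1). Right half: flat grey (class 0)."""
    img = [[(10 if (x + y) % 2 == 0 else 0) if x < 4 else 5 for x in range(8)]
           for y in range(8)]
    labels = [[1 if x < 4 else 0 for x in range(8)] for y in range(8)]
    return img, labels
```

In the toy image both halves have the same average intensity, so the raw pixel value alone cannot separate them, while the texture cue can. This mirrors the earlier point that spectral information is often inadequate without spatial context.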
GENIE Development
1999: Initial funding from two NRO DII’s
Continued research funding from LANL, DOE and others
2002: R&D 100 Award
2003: Transition to NGA funding for operational version: Genie Pro.
2004: Genie Pro wins NGA Feature Extraction Evaluation (“bake-off”)
GENIE and Pathology?
• Initial experiments in applying GENIE to bio-medical data
– Apply GENIE “as is” on multi-spectral pathology data
– i.e. make no modifications to/customization of GENIE for the pathology field
GENIE and Colon Cancer Detection
H & E Stained
Colon Tissue
(Cancer & Normal)
GENIE
Classification
(cancer vs normal)
GENIE and colon cancer detection (Training)
True color image
Colon: containing cancer and normal tissue
Training data
Green: cancerous nuclei
Red: everything else (i.e. not cancerous nuclei)
True color image
Colon: containing only normal tissue
Training data
Green: none because no cancerous nuclei
Red: everything else (i.e. not cancerous nuclei)
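A green/red markup of this kind can be turned into pixel labels along the following lines. This is a minimal sketch, assuming the overlay is stored as rows of RGB tuples and that pure green and pure red are the marking colours. The colour values and function names are hypothetical and are not taken from GENIE.

```python
GREEN = (0, 255, 0)   # assumed colour for "feature of interest"
RED = (255, 0, 0)     # assumed colour for "everything else"

def overlay_to_labels(overlay):
    """Map each overlay pixel to 1 (green), 0 (red) or None (unmarked)."""
    mapping = {GREEN: 1, RED: 0}
    return [[mapping.get(tuple(px)) for px in row] for row in overlay]

def labelled_pixels(labels):
    """List (row, column, label) for every marked pixel, skipping unmarked ones."""
    return [(y, x, lab)
            for y, row in enumerate(labels)
            for x, lab in enumerate(row)
            if lab is not None]
```

An image with no green markup, such as the normal-tissue example, simply contributes only label-0 pixels.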
GENIE and colon cancer detection (Exploitation)
GENIE Result: Cancer
(Training Data)
GENIE Result: Normal
(Training Data)
GENIE Result: Cancer
(Testing Data)
GENIE Result: Normal
(Testing Data)
GENIE: Breast Cancer Detection (cancerous nuclei) - Training Data
Training Data: Cancer
Training Data: Normal
GENIE: Breast Cancer Detection (Cancerous Nuclei) – Results for training Data
Classification Results: Cancer
Classification Results: Normal
GENIE: Breast Cancer Detection (Cancerous Nuclei) – Results for testing data (cancer)
GENIE: Breast Cancer Detection (Cancerous Nuclei) – Results for testing data (normal)
GENIE and endometrial gland detection (training data)
True color image
Training data
Green: gland boundary
Red: everything else
True color image
Training data
Green: gland boundary
Red: everything else
GENIE endometrium gland detection: exploitation over training data
GENIE endometrium gland detection: exploitation over testing data
GENIE and kidney inflammation detection (training)
True color image
Training data
Green: inflammation
Red: everything else
Training result
Green: inflammation
Red: everything else
GENIE and kidney inflammation detection (testing)
True color image
Testing result
Green: inflammation
Red: everything else
GENIE and Other Bio-Medical Applications
• Vibrational Hyperspectral Imaging
– Fluorescence imaging
– FTIR (Fourier Transform Infra Red) imaging
– Raman spectroscopy
– CARS (Coherent Anti-Stokes Raman Scattering)
• Can exploit specific molecular signatures in vibrational spectrum
GENIE application to VHI
Hyperspectral fluorescence image of bacteria (E. coli) bio-engineered to express GFP (green fluorescent protein), added to a sample of macrophages stained to reveal ROS (reactive oxygen species). Task set for GENIE: find E. coli that had been taken up (engulfed) by the macrophages.
Training data provided to GENIE
GENIE classification result
GENIE: Urine Cytology Classification
GENIE Results: Cover of Laboratory Investigation
“When tested on urothelial cytology specimens collected at two separate institutions over a span of 4 years, GENIE showed a combined sensitivity and specificity of 85 and 95%, respectively. Of particular note is that when ‘training’ was performed on cases initially diagnosed as ‘equivocal’ on cytology but with follow-up biopsy, surgical specimen or cytology, which was unequivocally benign or malignant, GENIE was superior to the cytopathologist interpreting the initial ‘equivocal’ cytology specimen.”
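For reference, sensitivity and specificity of the kind quoted above are computed from per-case outcomes as sketched below. The function name and the example counts in the usage note are illustrative assumptions, not data from the study.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired binary labels (1 = malignant)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)   # fraction of malignant cases detected
    specificity = tn / (tn + fp)   # fraction of benign cases correctly cleared
    return sensitivity, specificity
```

With made-up counts of 20 malignant cases of which 17 are flagged, and 20 benign cases of which 19 are cleared, the function returns 85% sensitivity and 95% specificity.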
Genie Pro Commercialization
Genie Pro has been exclusively licensed to
Aperio
For all digital pathology applications