High Content 2016, September 12th-14th, 3rd Annual Conference
Joseph B. Martin Conference Center at Harvard Medical School, Boston, MA
High Content 2017, September 13th-15th, 4th Annual Conference
San Diego Conference Center, San Diego, CA
Educational Program:
Intro to HCS/HCA
Image and Data Analysis
Mark-Anthony Bray, Ph.D., Novartis Institutes of BioMedical Research, Cambridge, Massachusetts
High Content 2018, September 18th-20th, 5th Annual Conference
Joseph B. Martin Conference Center, Boston, MA
The Basic Skill Sets for an HCS Laboratory
[Figure: axis labels listing per-well MEAN features (e.g., VarInten, AvgInten, TotalInten, FiberLength, NeighborMinDist, IntenCoocContrast, EntropyInten, ShapeLWR) for the CMFDA, Actin, Tubulin, and DAPI channels, plus ValidObjectCount]
* An Introduction To High Content Screening: Imaging Technology, Assay Development and Data Analysis in Biology and Drug Discovery (2015), Haney, S.A., Bowman, D., Chakravarty, A., Davies, A. and Shamu, C.E. John Wiley Press, NY, NY (in production)
The HCS Laboratory
[Diagram components: Plate Handler Robot; HCS Imager; Instrument Control Workstation; Plate Visualization / Image Analysis Workstations; Image Analysis Computer Cluster; Data Management System; Network File Server; Network]
• The Wet Lab: Reagents, protocols, assay optimization
• Hardware and Image Acquisition
• Assay Types and Assay Development
• Image and Data Analysis
* An Introduction To High Content Screening And Analysis Techniques: Practical Advice and Examples, Haney, S.A., Bowman, D., Chakravarty, A., Davies, A. and Shamu, C.E. John Wiley Press, NY, NY (in production)
Outline
• The image as quantitative data
• Identifying the image foreground
• Splitting object clusters
• Identifying cellular compartments
• Measurement extraction
• Statistical analysis
Images Contain A Wealth Of Information
http://www.microscopyu.com Image: Javier Irazoqui
Fundamental Steps
• Image acquisition
• Preprocessing
• Object detection, segmentation (including 3D and tracking over time)
• Making measurements, feature extraction (length, width, curvature, texture…)
• Object classification, interpretation, recognition
• Result
Image Analysis Software Solutions
• Application modules
– Good for someone who is new to HCS or just needs a turn-key solution
– Polished user interfaces, fast
– Often integrated with microscope hardware
– Validated, standard assays
– Canned approach: No detailed knowledge re: image
analysis needed
• Development environment
– Good for new assay development, more flexible approach
– Customizable assay design instead of pre-built solution
– Typically, combine modules into a workflow
– Higher “cost-of-entry”: Time involved to understand image
analysis details, language, scripting…
Image Analysis Software Solutions
• Commercial
– PerkinElmer Acapella
– Definiens Tissue Studio
– Molecular Devices Metamorph
– GE InCell Analyzer
– Media Cybernetics ImagePro+
– Mathworks MATLAB
– Adobe Photoshop
– Etc
• Open-source
– ImageJ/FIJI
– CellProfiler
– BioImageXD
– Icy
– Vaa3D
– ITK/VTK
– KNIME
– Etc
Not comprehensive!
Object Identification
• Also known as segmentation: Partitioning an image into
regions of interest
• Step 1: Distinguish the foreground from the background by
picking a good threshold
• Foreground: Regions where I(x,y) > threshold T
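The foreground rule above can be written in a few lines. This is a minimal sketch using NumPy; the toy image values, the threshold of 50, and the function name `foreground_mask` are made up for illustration.

```python
import numpy as np

def foreground_mask(image, threshold):
    """Return a boolean mask that is True where I(x, y) > threshold."""
    return np.asarray(image) > threshold

# Toy 4x4 image: a bright 2x2 object on a dim background (made-up values).
image = np.array([
    [10, 12, 11, 10],
    [11, 90, 95, 12],
    [10, 92, 97, 11],
    [12, 11, 10, 10],
])
mask = foreground_mask(image, threshold=50)
n_foreground = int(mask.sum())  # number of foreground pixels
```

The hard part in practice is choosing the threshold, not applying it.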
Illumination Correction
• Nonuniformities introduced in the optical path of the
sample, microscope, and/or camera
Example: Uneven illumination from left to right
– Can lead to inaccurate segmentation and measurements
– Cell at (a) is brighter than (b) even if cells have same
amount of fluorescent material
(a) (b)
Carpenter et al, Genome Biology 2006, 7:R100
Illumination Correction
• Recommendations
– Create new illumination correction if switching microscopes
– Perform per-plate correction
– Perform per-channel correction, as absolute illumination intensities may differ between channels
Images from Carolina Wahlby
Output image = Input image ÷ Approximation of background
Approximation of background: Average many images, then fit a continuous function to the result or smooth heavily
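One way to sketch the "average, fit, divide" recipe is shown below. It assumes the illumination pattern is smooth enough for a low-order 2D polynomial; the polynomial order, the function names, and the synthetic left-to-right gradient are illustrative choices, not the method used for the slide images.

```python
import numpy as np

def estimate_illumination(images, order=2):
    """Average a stack of images, then fit a smooth 2D polynomial to the mean."""
    mean_img = np.mean(images, axis=0)
    h, w = mean_img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    x = xx.ravel() / (w - 1)
    y = yy.ravel() / (h - 1)
    # Design matrix with every term x^i * y^j where i + j <= order.
    terms = [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]
    A = np.stack(terms, axis=1)
    coeffs, *_ = np.linalg.lstsq(A, mean_img.ravel(), rcond=None)
    illum = (A @ coeffs).reshape(h, w)
    return illum / illum.mean()  # unit mean, so correction keeps the overall scale

def correct(image, illum):
    """Divide the input image by the illumination estimate."""
    return image / illum

# Synthetic plate: uniform signal of 100 under a left-to-right gradient.
rng = np.random.default_rng(0)
gradient = np.linspace(0.5, 1.5, 64)[None, :] * np.ones((48, 1))
images = np.stack([100 * gradient + rng.normal(0, 1, gradient.shape)
                   for _ in range(20)])

illum = estimate_illumination(images)
corrected = correct(images[0], illum)

# Ratio of right-edge to left-edge brightness, before and after correction.
before = images[0][:, -8:].mean() / images[0][:, :8].mean()
after = corrected[:, -8:].mean() / corrected[:, :8].mean()
```

In line with the recommendations above, a separate estimate would be computed per plate and per channel.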
Background Subtraction
• Top-hat (“rolling ball”) filtering
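As a rough illustration, a white top-hat filter subtracts the morphological opening of an image from the image itself, which removes structures larger than the structuring element. The sketch below uses `scipy.ndimage.white_tophat`; the synthetic image and the 15x15 element size are assumptions chosen so the element is larger than the object.

```python
import numpy as np
from scipy import ndimage

# Synthetic image: slowly varying background ramp plus one small bright spot.
background = np.tile(np.linspace(0, 50, 64), (64, 1))
image = background.copy()
image[30:33, 30:33] += 100

# White top-hat = image minus its opening. The structuring element
# (15x15 here, an assumed size) must be larger than the objects of interest.
tophat = ndimage.white_tophat(image, size=15)

spot_value = tophat[31, 31]        # object signal with the ramp removed
background_value = tophat[10, 10]  # interior background after filtering
```

If the element is smaller than the objects, the objects themselves are suppressed, so the size should be set from the expected object diameter.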
Image Thresholding
What is the best threshold value for dividing the intensity histogram into foreground and background pixels?
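[Figure: intensity histogram (x-axis: pixel values; y-axis: frequency) with two candidate thresholds marked “Here?” and “Or here?”]
[Figure panels: Raw input image → Thresholded binary image (0: Background, 1: Objects) → Labeled objects (colored ROI: connected pixels)]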
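The slide leaves the choice of threshold open. One widely used automatic answer is Otsu's method, which picks the histogram split that maximizes between-class variance. The sketch below is a simplified NumPy version (swapped in for illustration, not necessarily what the slide images used), followed by connected-component labeling with `scipy.ndimage.label`. The synthetic image is made up.

```python
import numpy as np
from scipy import ndimage

def otsu_threshold(image, nbins=256):
    """Return the histogram split that maximizes between-class variance."""
    counts, edges = np.histogram(image, bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(counts)                   # pixel count of the lower class
    w1 = w0[-1] - w0                         # pixel count of the upper class
    s0 = np.cumsum(counts * centers)
    m0 = s0 / np.maximum(w0, 1)              # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)   # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    k = int(np.argmax(between))
    return edges[k + 1]                      # upper edge of the last background bin

# Synthetic image: noisy dim background with two bright rectangular objects.
rng = np.random.default_rng(1)
image = rng.normal(10, 2, (40, 40))
image[5:12, 5:12] += 90
image[25:33, 20:30] += 90

t = otsu_threshold(image)
binary = image > t                            # 0: background, 1: objects
labels, n_objects = ndimage.label(binary)     # connected pixels share one label
```

Each connected group of foreground pixels receives its own integer label, which is the "labeled objects" panel of the figure.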
Pixel-Based Image Classification
• For images where a threshold cannot be found…
• Machine-learning tools can be helpful, e.g., ilastik
– User manually labels regions of image
– A suite of features is used to distinguish regions and create a classifier
Sommer and Gerlich, JCS 2013, 126:1
Separating Touching Objects
• Step 2: Distinguish multiple objects contained in
the same foreground blob
• Once the foreground blobs have been identified,
what next?
– Thresholding is not sufficient to separate clustered or
touching objects
Watershed Segmentation
• Consider the image as a surface with basins…
http://www.svi.nl/watershed
Images from Carolina Wahlby
Separating Touching Objects
• Identifying objects: Some options
– Intensity-based: Works best if objects are brighter at center, dimmer at edges
– Shape-based: Works best if objects have indentations where objects touch (esp. if objects are round)
[Figure panels: Peaks (intensity-based); Indentations (shape-based)]
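The shape-based idea can be sketched without a full watershed. In this simplified stand-in, the distance transform of the blob peaks near each object center and dips at the indentation, so thresholding it yields one seed per object. Each foreground pixel is then assigned to its nearest seed. The 60% cutoff and the two-disk test image are assumptions; a true watershed would flood from the seeds instead of using nearest-seed assignment.

```python
import numpy as np
from scipy import ndimage

# Two overlapping disks of radius 10 form a single foreground blob.
yy, xx = np.mgrid[0:40, 0:50]
blob = (((yy - 20) ** 2 + (xx - 15) ** 2 <= 100) |
        ((yy - 20) ** 2 + (xx - 33) ** 2 <= 100))

# Distance to the nearest background pixel: high at centers, low at the neck.
dist = ndimage.distance_transform_edt(blob)

# Seeds: cores where the distance exceeds 60% of its maximum (assumed cutoff).
seeds, n_seeds = ndimage.label(dist > 0.6 * dist.max(),
                               structure=np.ones((3, 3)))

# For every pixel, look up the coordinates of the nearest seed pixel,
# then copy that seed's label onto the foreground.
_, (iy, ix) = ndimage.distance_transform_edt(seeds == 0, return_indices=True)
labels = seeds[iy, ix] * blob
```

For intensity-based splitting, the same seed-and-assign pattern applies with the smoothed intensity image in place of the distance transform.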
Identifying Cell Objects
• Nuclei more easily separated than cells
– DNA markers are specific
– Yield good foreground/background contrast
– Uniform shape
• Identifying cells is more difficult
– Available markers often lower contrast
– Unclear boundaries between cells, depending on the cell type and culture conditions
Secondary Object Identification
• “Growing” the primary objects to identify cell boundaries
• Use segmented nuclei as “seeds” by using a cell stain channel
• Some assays do not require precise cell ID
– E.g., is a protein located in the nucleus or cytoplasm?
• Produce proxy cells by growing nuclei by N pixels if no cell stain is available
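Growing nuclei by N pixels can be sketched as a nearest-label expansion: each background pixel within N pixels of a nucleus takes the label of the closest nucleus, so neighboring proxy cells meet without merging. The function name `grow_labels` and the toy label image are hypothetical.

```python
import numpy as np
from scipy import ndimage

def grow_labels(nuclei_labels, n_pixels):
    """Expand each labeled nucleus by up to n_pixels without merging neighbors."""
    # Distance from each pixel to the nearest nucleus pixel, plus its coordinates.
    dist, (iy, ix) = ndimage.distance_transform_edt(
        nuclei_labels == 0, return_indices=True)
    grown = nuclei_labels[iy, ix]   # label of the nearest nucleus (a copy)
    grown[dist > n_pixels] = 0      # keep only pixels within n_pixels
    return grown

# Two nuclei separated by a 5-pixel gap (columns 9 to 13).
nuclei = np.zeros((20, 30), dtype=int)
nuclei[8:12, 5:9] = 1
nuclei[8:12, 14:18] = 2

proxy_cells = grow_labels(nuclei, n_pixels=4)
```

When a cell stain is available, the same seeds would instead be grown along the stain intensity, for example with a seeded watershed.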
Identifying Subcellular Structures
• With appropriate markers, other subcellular compartments can be labeled
• These can be identified using the same methods already mentioned
• Consider using the enclosing object as a mask for better pre-processing and thresholding
• Make sure to assign subfeatures to enclosing objects
[Figure panels: Pre-processing; Sub-object ID; Sub-object relation]
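Assigning sub-objects to their enclosing objects can be sketched as a lookup between two label images. In this hypothetical example, each child (e.g., a spot) is assigned to the parent (e.g., a cell) it overlaps most, and children per parent are counted. The function name and toy arrays are illustrative.

```python
import numpy as np

def count_children(parent_labels, child_labels):
    """Count child objects per parent; a child belongs to the parent it overlaps most."""
    counts = {int(p): 0 for p in np.unique(parent_labels) if p != 0}
    for c in np.unique(child_labels):
        if c == 0:
            continue
        parents = parent_labels[child_labels == c]
        parents = parents[parents != 0]
        if parents.size:  # children outside every parent stay unassigned
            owner = int(np.bincount(parents).argmax())
            counts[owner] += 1
    return counts

# Two cells side by side; three spots, two in cell 1 and one in cell 2.
cells = np.zeros((10, 20), dtype=int)
cells[:, :10] = 1
cells[:, 10:] = 2

spots = np.zeros((10, 20), dtype=int)
spots[2, 2] = 1
spots[5, 5:7] = 2
spots[7, 14] = 3

spots_per_cell = count_children(cells, spots)
```

The resulting per-cell counts are the "# of organelles per cell" style of readout.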
Measuring Object Counts
• Most common readout
– # of cells per image/well
– # of organelles per image/well
– # of organelles per cell
• Number of objects per image/well is often a useful readout for QC
purposes
Measuring Object Morphology
• Reduce an aspect of object shape to a single value
• Example features
– Area: Pixel coverage of object
– Perimeter: Length of object boundary
– Eccentricity: Object “oblongness”
– Major, minor axis length: Object elongation
– Form factor: Measure of compactness
– Zernike features
• Objects touching the image border should be excluded if shape is important
http://www.perkinelmer.co.uk/
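A few of these shape features can be sketched directly from a binary object mask. The version below derives axis lengths and eccentricity from the covariance of the pixel coordinates (an equivalent-ellipse approximation) and flags objects touching the image border. Perimeter-based features such as form factor are omitted because discrete perimeter estimates need more care. All names and test shapes are illustrative.

```python
import numpy as np

def shape_features(mask):
    """Area plus moment-based axis lengths and eccentricity of one binary object."""
    ys, xs = np.nonzero(mask)
    cov = np.cov(np.stack([xs, ys]).astype(float))   # 2x2 coordinate covariance
    lam_min, lam_max = np.sort(np.linalg.eigvalsh(cov))
    return {
        "area": int(ys.size),                         # pixel coverage
        "major_axis": 4 * np.sqrt(lam_max),           # equivalent-ellipse axes
        "minor_axis": 4 * np.sqrt(lam_min),
        "eccentricity": float(np.sqrt(max(0.0, 1 - lam_min / lam_max))),
    }

def touches_border(mask):
    """True if any object pixel lies on the image edge."""
    return bool(mask[0].any() or mask[-1].any() or
                mask[:, 0].any() or mask[:, -1].any())

# A compact square and an elongated rod that runs into the left image edge.
square = np.zeros((30, 40), dtype=bool)
square[10:21, 10:21] = True
rod = np.zeros((30, 40), dtype=bool)
rod[5:8, 0:30] = True

square_features = shape_features(square)
rod_features = shape_features(rod)
```

An object such as the rod, which touches the border, would be excluded from shape analysis because its true extent is unknown.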
Measuring Object Intensity
• Example features
– Integrated (total) intensity: Sum of the object's pixel intensities, proportional (∝) to the amount of substance labeled
– Mean, median, standard deviation intensities
– Lower/upper intensity quartiles
– Correlation coefficients between channels: Colocalization
• Make sure to apply illumination correction beforehand
• Intensity is related to the amount of marker at a pixel location
Images courtesy of Ilya Ravkin
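These intensity features are direct reductions over the pixels inside an object mask. The sketch below assumes the image has already been illumination corrected. The function names and the toy 4x4 channels are made up for illustration.

```python
import numpy as np

def intensity_features(channel, mask):
    """Summary statistics of one channel over the pixels of one object."""
    px = channel[mask]
    q1, med, q3 = np.percentile(px, [25, 50, 75])
    return {
        "integrated": float(px.sum()),   # total intensity
        "mean": float(px.mean()),
        "median": float(med),
        "std": float(px.std()),
        "lower_quartile": float(q1),
        "upper_quartile": float(q3),
    }

def colocalization(channel_a, channel_b, mask):
    """Pearson correlation between two channels inside the object."""
    return float(np.corrcoef(channel_a[mask], channel_b[mask])[0, 1])

channel_a = np.arange(16, dtype=float).reshape(4, 4)
channel_b = 2 * channel_a + 1            # perfectly correlated second channel
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                    # object covers values 5, 6, 9, 10

features = intensity_features(channel_a, mask)
pearson = colocalization(channel_a, channel_b, mask)
```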
Measuring Object Texture
• Determine whether the staining pattern is smooth or coarse at a particular scale
• Selecting the appropriate texture scale
– Higher scale: Larger patterns of texture
– Smaller scale: More localized (finer) patterns of texture
Virus Texture Dataset, http://www.cb.uu.se/~gustaf/virustexture/
Moffat et al., Cell, 2006, 124:1283
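To illustrate how texture depends on scale, the sketch below computes a simple contrast measure: the mean squared difference between pixels a given number of pixels apart along rows. This is a simplified stand-in in the spirit of co-occurrence contrast features, not a full Haralick implementation. The striped test image is made up.

```python
import numpy as np

def contrast_at_scale(image, scale):
    """Mean squared difference between pixels `scale` apart along each row."""
    img = np.asarray(image, dtype=float)
    diff = img[:, scale:] - img[:, :-scale]
    return float(np.mean(diff ** 2))

# Vertical stripes, 4 pixels wide, alternating between 0 and 1.
stripes = np.tile((np.arange(32) // 4) % 2, (16, 1))
smooth = np.full((16, 32), 0.5)

coarse_at_4 = contrast_at_scale(stripes, 4)   # offset equals the stripe width
coarse_at_8 = contrast_at_scale(stripes, 8)   # offset equals the stripe period
smooth_at_4 = contrast_at_scale(smooth, 4)
```

The same pattern scores high at one offset and zero at another, which is why the texture scale should match the size of the structures of interest.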
Measuring Location
• Cell or organelle location within image may be meaningful
• Example features
– Distance from organelle to nucleus, cell membrane
– Change in position often important in time-lapse imaging
Miller et al., PNAS 2003
Battich et al., Nat Meth 2013
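A distance feature of this kind can be sketched from object centroids. The function names and the `pixel_size` parameter (for converting pixels to physical units) are assumptions for illustration.

```python
import numpy as np

def centroid(mask):
    """Mean (row, column) position of the pixels in a binary object."""
    ys, xs = np.nonzero(mask)
    return np.array([ys.mean(), xs.mean()])

def centroid_distance(mask_a, mask_b, pixel_size=1.0):
    """Euclidean distance between two object centroids, in units of pixel_size."""
    return float(np.linalg.norm(centroid(mask_a) - centroid(mask_b)) * pixel_size)

nucleus = np.zeros((12, 12), dtype=bool)
nucleus[4:7, 4:7] = True          # centroid at (5, 5)
organelle = np.zeros((12, 12), dtype=bool)
organelle[8, 9] = True            # centroid at (8, 9)

d = centroid_distance(organelle, nucleus)
```

For time-lapse data, the same centroid computed in consecutive frames gives the change in position.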
Time-Lapse Analysis
• Very sensitive to problems in object identification
• GIGO (garbage in, garbage out): Assay development and image acquisition must be optimized for tracking success
• Take note of mis-segmentations, especially for cell cycle and lineage studies
• Software
– Bitplane Imaris, PerkinElmer Volocity, Molecular Devices Metamorph
– CellProfiler, FIJI, etc.
Schmitz et al. Nat Cell Biol 2010, 12:886
Measuring Clustering
• Characterization of spatial
relationships between
objects
• Example features
– Number of neighboring
objects
– Percent of the perimeter
touching neighbor objects
– Distance to the nearest neighbor
http://www.perkinelmer.co.uk/
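Two of these neighbor features can be sketched from object centroids alone: the number of neighbors within a radius and the distance to the nearest neighbor. Perimeter-contact features would need the label image and are not shown. The radius and the toy coordinates are assumptions.

```python
import numpy as np

def neighbor_features(centroids, radius):
    """Return (neighbor count within radius, nearest-neighbor distance) per object."""
    pts = np.asarray(centroids, dtype=float)
    # Pairwise Euclidean distances between all centroids.
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)   # an object is not its own neighbor
    return (d <= radius).sum(axis=1), d.min(axis=1)

# Three clustered objects and one isolated object.
centroids = [(0, 0), (0, 3), (4, 0), (50, 50)]
n_neighbors, nearest = neighbor_features(centroids, radius=6)
```

The pairwise matrix is fine for a few thousand objects per image; larger fields would call for a spatial index.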
Combinations of Measurements
• Phenotype identification may be difficult if hand-
selecting from a limited measurement set
• Machine learning (ML) approaches can identify
phenotypes from a combination of measurements
Sommer and Gerlich, JCS 2013, 126:1
• Some measurements (e.g., texture) are hard to interpret as readouts but
are excellent fodder for ML approaches to downstream analysis
– See ML advanced elective for more
Quality Control
– Focus imperfections, incorrect exposures, background problems, artifacts
– Identify, eliminate systematic aberrations
[Figure panels: Focal blur; Saturation artifact]
• Ideally, QC should be performed at beginning of workflow
• Use automated measures, with option of manual vetting
– Machine learning approaches can be useful here
Sommer and Gerlich, JCS 2013, 126:1
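Two simple automated QC measures are sketched below: the fraction of saturated pixels, and a focus score based on the variance of a discrete Laplacian. The Laplacian variance is one common blur heuristic chosen here for illustration, not a measure named on the slide. Any cutoffs for flagging images would need to be set per assay.

```python
import numpy as np

def saturation_fraction(image, max_value):
    """Fraction of pixels at or above the detector maximum."""
    return float(np.mean(np.asarray(image) >= max_value))

def focus_score(image):
    """Variance of a 5-point discrete Laplacian; lower values suggest blur."""
    img = np.asarray(image, dtype=float)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4 * img[1:-1, 1:-1])
    return float(lap.var())

# A sharp random texture and a 3x3 box-blurred copy of it.
rng = np.random.default_rng(2)
sharp = rng.random((64, 64))
blurred = sum(np.roll(np.roll(sharp, dy, axis=0), dx, axis=1)
              for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9

# An 8-bit image in which one row of ten is saturated.
exposed = np.zeros((10, 10), dtype=np.uint8)
exposed[0, :] = 255

sharp_score = focus_score(sharp)
blurred_score = focus_score(blurred)
saturated = saturation_fraction(exposed, max_value=255)
```

Scores like these can be plotted per well to flag images for manual vetting.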
Data Analysis
• What does this data set look like?
• Cytological profile, or Cytoprofile
• Shows all the measurements acquired
– For each individual cell
– In every image
– In the entire experiment
[Figure: cytoprofile color scale from −1 to +1; example row for Cell #6111617 with values −0.2, 0.7, −0.1, 0, 0.2, −0.9]
Data Normalization
• Used to remove systematic errors from the data
• Allows comparison of screening runs from different plates, acquisition times, etc.
• Ideally, results in:
– Similar measurement ranges observed across different wells with the same treatment
– Similar measurement distributions of the controls (positive or negative)
– Keep in mind the recommendations from Assay Development section!
• Common approaches
– % of control: Divide by mean of corresponding measurement from control
– % of samples: Divide by mean of corresponding measurement from all samples
– Z-score, robust Z-score: Transform to zero mean/median, unit variance/MAD
• Alternative approach: Normalized value = percentile within rank-ordered data
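The normalization approaches listed above can be sketched as follows. Using the control wells as the reference population and scaling the MAD by 1.4826 (so it is comparable to a standard deviation for normally distributed data) are assumptions made for this example; all function names and values are illustrative.

```python
import numpy as np

def percent_of_control(values, controls):
    """Express each value as a percentage of the control mean."""
    return 100.0 * np.asarray(values, dtype=float) / np.mean(controls)

def z_score(values, reference):
    """Center on the reference mean and scale by its standard deviation."""
    ref = np.asarray(reference, dtype=float)
    return (np.asarray(values, dtype=float) - ref.mean()) / ref.std(ddof=1)

def robust_z_score(values, reference):
    """Center on the reference median and scale by the scaled MAD."""
    ref = np.asarray(reference, dtype=float)
    med = np.median(ref)
    mad = np.median(np.abs(ref - med))
    return (np.asarray(values, dtype=float) - med) / (1.4826 * mad)

def percentile_rank(values):
    """Percentile (0 to 100) of each value within the rank-ordered data."""
    v = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(v))
    return 100.0 * ranks / (len(v) - 1)

controls = [8.0, 10.0, 12.0]   # made-up control-well measurements
samples = [5.0, 10.0, 20.0]    # made-up sample-well measurements

poc = percent_of_control(samples, controls)
z = z_score(samples, controls)
rz = robust_z_score(samples, controls)
pct = percentile_rank(samples)
```

The robust variants are less affected by outlier wells, which matters when strong hits are present on the plate.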
Statistical Analysis Software
• Spreadsheets (e.g., Microsoft Excel)
– Widely used because of familiarity
– Unable to handle large screening datasets
– Lack sophisticated analysis methods
• HCS/HTS microscope vendors often bundle data-analysis functionality with hardware, image-analysis software
http://www.essenbioscience.com
Statistical Analysis Software
• Specialized commercial tools
– Wide variety of products
– Often bundled with hardware
– Talk to vendors for more details
• Open-source tools
– KNIME
– CellProfiler Analyst
– Weka
– Bioconductor
• Custom scripts
– MATLAB
– R
– Python
Not comprehensive!
Summary: Fundamental Steps
Knowledge about the application!
• Image acquisition
• Preprocessing
• Object detection, segmentation (including 3D and tracking over time)
• Making measurements, feature extraction (length, width, curvature, texture…)
• Object classification, interpretation, recognition
• Result
Additional Resources
• Introduction to the Quantitative Analysis of Two-Dimensional Fluorescence Microscopy Images for Cell-Based Screening
– Ljosa and Carpenter, PLoS Computational Biology, 5(12), 2009
– DOI: 10.1371/journal.pcbi.1000603
• Biological imaging software tools
– Eliceiri et al, Nat Meth, 9(7), 2012
– DOI: 10.1038/nmeth.2084
• Assay Guidance Manual
– Introduction: http://www.ncbi.nlm.nih.gov/books/NBK100913
– Advanced methods: http://www.ncbi.nlm.nih.gov/books/NBK126174