Extraction and comparison of gene expression patterns from 2D RNA in situ hybridization images. BIOINFORMATICS, Gene expression, Vol. 26, no. 6, 2010, pages 761-769

Outline

• In situ hybridization
• Introduction
• Berkeley Drosophila in situ database
• Approach
  – Image registration
  – Representation of expression patterns
  – Correspondence of expression similarities to expert annotation
  – Expression similarity of known co-regulated genes

In situ hybridization (ISH)

• A type of hybridization that uses a labeled complementary DNA or RNA strand (i.e. a probe) to localize a specific DNA or RNA sequence in a portion or section of tissue or, if the tissue is small enough (e.g. plant seeds, Drosophila embryos), in the entire tissue (whole-mount ISH)

In situ hybridization

• In situ hybridization of wild-type Drosophila embryos at different developmental stages, detecting RNA from the gene hunchback

Introduction

• Recent advances in high-throughput imaging

• Methods for capturing spatial and/or temporal expression patterns
  – In situ hybridization
  – Fluorescent reporter constructs or tags

• The resulting patterns are still frequently assessed by subjective, qualitative comparison

Introduction

• A particularly interesting and fundamental problem: comparing two samples at the level of their expression profiles
  – For instance, the same gene under different conditions or across different species
  – Or different genes, with the goal of clustering them, akin to approaches developed for microarray data

Introduction

• Several problems arise when comparing expression images:
  – Processing the raw input images
    • to eliminate noise under a typically large range of imaging conditions (e.g. different viewpoints, different locations, multiple specimens per image)
    • to perform normalizations
  – Representing the expression patterns and specifying appropriate similarity metrics
  – Assessing the significance of observed similarities

Introduction

• The study uses a dataset of Drosophila embryonic expression patterns

• Their main contributions
  – Robust and fully automated image analysis techniques
  – A comprehensive comparison of different similarity metrics
  – A new significance testing framework for spatial similarity scores based on constrained realization Monte Carlo simulations

Berkeley Drosophila in situ database

• 78621 images of 3724 genes expressed across six time windows (covering the developmental stages 1-3, 4-6, 7-8, 9-10, 11-12, 13-15).

• Focus on the subset of 27157 images covering 3127 genes acquired during stages 4-6
  – Crucial information is not yet provided for later stages
  – These images are not subject to the additional complexity of expression in later stages

Image registration

• Prior to any quantitative analysis, it is necessary to normalize and register the images to a common frame

Image registration

• Statistical shape models
  – Signed distance maps describe object contours
    • Negative distances: inside the object
    • Positive distances: outside the object
    • Magnitude: the actual distance to the contour
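The sign convention above can be illustrated with a minimal sketch. It assumes a small binary mask that contains both inside and outside pixels and uses a brute-force search, which is only practical for toy sizes; the function name `signed_distance_map` is hypothetical and not taken from the paper.

```python
import math

def signed_distance_map(mask):
    """Brute-force signed distance map for a small binary mask.

    mask[y][x] is truthy inside the object. Inside pixels get negative
    distances, outside pixels positive ones; the magnitude is the Euclidean
    distance to the nearest pixel of the opposite class.
    Assumes the mask contains at least one inside and one outside pixel.
    """
    h, w = len(mask), len(mask[0])
    inside = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]
    outside = [(y, x) for y in range(h) for x in range(w) if not mask[y][x]]
    sdm = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Search the opposite class for the nearest pixel.
            targets = outside if mask[y][x] else inside
            d = min(math.hypot(y - ty, x - tx) for ty, tx in targets)
            sdm[y][x] = -d if mask[y][x] else d
    return sdm

# Toy 5x5 mask with a 3x3 object in the centre.
mask = [[1 <= y <= 3 and 1 <= x <= 3 for x in range(5)] for y in range(5)]
sdm = signed_distance_map(mask)
```

For real image sizes a dedicated distance transform would replace the double loop.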

Image registration

  – The Drosophila shape model was automatically created from a manually curated set of 120 embryo images
    • The contours of the embryos were manually segmented and transformed into signed distance maps
    • The maps were automatically normalized in size
      – by minimizing the distance of each individual signed distance map to the mean signed distance map
    • The resulting normalized maps were analyzed using hierarchical principal component analysis (PCA) decomposition
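As a rough illustration of the decomposition step, the sketch below extracts the mean and the leading principal direction from a set of flattened maps by power iteration. This is plain PCA on toy vectors, not the hierarchical PCA named on the slide, and `first_principal_component` is a hypothetical helper.

```python
import math

def first_principal_component(samples, iterations=200):
    """Return (mean, unit direction) of the leading principal component.

    samples: list of equal-length lists, e.g. flattened signed distance maps.
    Uses power iteration on the (unnormalised) covariance matrix.
    """
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    centered = [[s[i] - mean[i] for i in range(dim)] for s in samples]
    v = [1.0 / math.sqrt(dim)] * dim  # arbitrary non-zero start vector
    for _ in range(iterations):
        # Apply the covariance implicitly: w = X^T (X v).
        proj = [sum(c[i] * v[i] for i in range(dim)) for c in centered]
        w = [sum(proj[k] * centered[k][i] for k in range(n)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along v (degenerate input)
        v = [x / norm for x in w]
    return mean, v

# Toy "maps" that vary only along the direction (1, 2, 0).
samples = [[5 + t, 5 + 2 * t, 5.0] for t in (-2, -1, 0, 1, 2)]
mean, direction = first_principal_component(samples)
```

Shapes such as "mean plus or minus two standard deviations along a component" can then be formed as `mean[i] + k * direction[i]` for a suitable scalar `k`.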

Image registration

• Model the filtered intensity values around the contour of the embryos
  – Provides characteristic priors
  – Intensities are binned by signed distance from -25 to 25; values outside this range are not included
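The binning step might look roughly like the sketch below. The bin width of one distance unit, the use of the mean as the per-bin summary, and the name `intensity_profile` are all assumptions; the slide only states the range of -25 to 25.

```python
def intensity_profile(intensities, distances, lo=-25, hi=25):
    """Mean intensity per integer signed-distance bin in [lo, hi].

    intensities, distances: flat sequences of equal length, one entry per pixel.
    Pixels whose rounded signed distance falls outside [lo, hi] are ignored.
    Returns {bin: mean intensity} for every bin that received a pixel.
    """
    sums, counts = {}, {}
    for value, dist in zip(intensities, distances):
        b = round(dist)  # assumed bin width: one distance unit
        if lo <= b <= hi:
            sums[b] = sums.get(b, 0.0) + value
            counts[b] = counts.get(b, 0) + 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

# Two pixels just inside the contour, one outside, one too far away to count.
profile = intensity_profile([10, 20, 30, 99], [-1.2, -0.9, 3.0, 40.0])
```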

Image registration

(A) Example image showing the creation of the training set

(B) A subset of 120 images used for the training of the shape model

(C) The training set is normalized in size

(D) Four of the principal shapes of the embryo. These images depict 2 standard deviations of the principal component from the mean of the signed distance map.

Image registration

• The task of image registration: find the optimal set of parameters
  – Two categories:
    • Rigid transformation parameters of the image, r
    • The principal shape components of our shape prior, s
  – These two sets of parameters are simultaneously optimized using an in-house implementation of a particle swarm optimizer

• This approach estimates both the average shape of an embryo, as well as the main components of variation of embryo shape
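The authors' in-house optimizer is not shown on the slides. The following is a generic particle swarm sketch that optimizes a joint parameter vector (r and s concatenated) against a hypothetical stand-in cost; the inertia and acceleration constants are common textbook choices, not values from the paper.

```python
import random

def particle_swarm_minimize(cost, bounds, n_particles=30, n_iter=200,
                            inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimise cost(params) inside box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]           # personal best positions
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the personal and the global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost

# Hypothetical stand-in cost: squared distance to a known parameter vector.
# A real registration cost would compare the transformed image with the shape prior.
target = [3.0, -2.0, 0.5, 1.2]  # (tx, ty, angle) as r, plus one shape weight as s
cost = lambda p: sum((a - b) ** 2 for a, b in zip(p, target))
bounds = [(-10, 10), (-10, 10), (-3.14, 3.14), (-3, 3)]
best, best_cost = particle_swarm_minimize(cost, bounds)
```

Optimizing r and s in one vector mirrors the "simultaneously optimized" wording; no gradient of the cost is needed.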

Representation of expression patterns

• We are interested in comparing the global 2D expression pattern.

• After registration, the transformed image T and shape model S are used to calculate column and row vectors of expression data
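One plausible reading of this step is sketched below: average the registered image along rows and along columns, counting only pixels inside the embryo shape. Whether the paper uses means, sums, or another summary is not stated on the slide, so the mean and the name `row_and_column_profiles` are assumptions.

```python
def row_and_column_profiles(image, mask):
    """Mean intensity per row and per column, restricted to the embryo mask.

    image: 2D list of intensities (the registered image T).
    mask:  2D list of truthy/falsy values (inside of the shape S).
    Rows or columns with no pixel inside the mask get 0.0.
    """
    h, w = len(image), len(image[0])

    def mean_inside(pixels):
        vals = [image[y][x] for y, x in pixels if mask[y][x]]
        return sum(vals) / len(vals) if vals else 0.0

    rows = [mean_inside([(y, x) for x in range(w)]) for y in range(h)]
    cols = [mean_inside([(y, x) for y in range(h)]) for x in range(w)]
    return rows, cols

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
rows, cols = row_and_column_profiles(image, mask)
```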

Correspondence of expression similarities to expert annotations

• Goal: determine how the similarity values computed by each metric correspond to manually annotated expression terms

• Metrics
  – Spatial metrics
    • Haar wavelets (HWs)
    • Spatial mutual information (SMI)
  – Non-spatial metrics
    • Mean squared error (MSE)
    • Mutual information (MI)
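The two non-spatial metrics can be sketched as follows for expression vectors scaled to [0, 1]. The histogram estimator with four bins is an assumption made for illustration; the spatial metrics (HW, SMI) are not defined on the slides and are not reproduced here.

```python
import math
from collections import Counter

def mean_squared_error(a, b):
    """Average squared difference between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def mutual_information(a, b, n_bins=4, lo=0.0, hi=1.0):
    """Histogram estimate of mutual information (in bits) between a and b."""
    def bin_of(v):
        idx = int((v - lo) / (hi - lo) * n_bins)
        return min(max(idx, 0), n_bins - 1)  # clamp v == hi into the last bin

    qa = [bin_of(v) for v in a]
    qb = [bin_of(v) for v in b]
    n = len(a)
    pa, pb, pab = Counter(qa), Counter(qb), Counter(zip(qa, qb))
    # Sum over occupied joint bins of p(i,j) * log2(p(i,j) / (p(i) p(j))).
    return sum((c / n) * math.log2(c * n / (pa[i] * pb[j]))
               for (i, j), c in pab.items())

a = [0.1, 0.1, 0.9, 0.9]
c = [0.1, 0.9, 0.1, 0.9]
mi_same, mi_diff = mutual_information(a, a), mutual_information(a, c)
mse_same, mse_diff = mean_squared_error(a, a), mean_squared_error(a, c)
```

Both functions give the same result if the two vectors are permuted in the same way, which is what makes them non-spatial: they ignore where along the embryo a value occurs.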

Correspondence of expression similarities to expert annotations

• For each scoring metric, we calculated an enrichment significance for each annotation term.

• This describes how often genes annotated with a particular ontology term show the strongest similarity to genes annotated with the same term.
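The slides do not say which statistical test produces the enrichment significance. As one possible reading, the sketch below counts how many annotated genes have a nearest neighbour carrying the same term and scores that count with a binomial tail; the binomial null, the chance rate, and the function names are assumptions.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def term_enrichment(similarity, has_term):
    """Count same-term nearest neighbours and score them against chance.

    similarity: square matrix, higher values mean more similar patterns.
    has_term:   list of bools, True if the gene carries the annotation term.
    Returns (hits, p_value).
    """
    n_genes = len(has_term)
    annotated = [g for g in range(n_genes) if has_term[g]]
    hits = 0
    for g in annotated:
        # Most similar other gene (ties resolved by lowest index).
        nearest = max((h for h in range(n_genes) if h != g),
                      key=lambda h: similarity[g][h])
        hits += has_term[nearest]
    # Assumed null: the nearest neighbour is a uniformly random other gene.
    chance = (len(annotated) - 1) / (n_genes - 1)
    return hits, binomial_tail(hits, len(annotated), chance)

# Six genes; the first three share a term and are only similar to each other.
has_term = [True, True, True, False, False, False]
similarity = [[1.0 if has_term[i] == has_term[j] else 0.0 for j in range(6)]
              for i in range(6)]
hits, p_value = term_enrichment(similarity, has_term)
```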

Correspondence of expression similarities to expert annotations

At a P-value cutoff of 0.05, SMI performed best, with 22 of the 29 annotation terms significantly enriched.

Expression similarity of known co-regulated genes

• We use SMI and significance tests to validate known biological interactions, suggesting their usefulness for inference on biological data.

• Gene regulation and spatial patterning are tightly coupled processes
  – Transcription factors acting as activators for a gene are often co-expressed in similar spatial regions

Expression similarity of known co-regulated genes

(A, C) Nubbin and dichaete

(A, C, bottom) The extracted expression vector

(B, D) A set of random realizations with constraints on the correlation between spatially adjacent expression values

(E) Distribution of similarity values
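The idea behind panels (B), (D) and (E) can be sketched as a Monte Carlo test: generate random realizations of one expression vector that keep some local structure, score each against the other vector, and see how often the random score reaches the observed one. The block shuffle and the Pearson correlation used here are simple stand-ins; they are not the paper's constrained realization procedure or its SMI score.

```python
import random

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def block_shuffle(vec, block, rng):
    """Permute fixed-size blocks, keeping neighbours inside a block together."""
    blocks = [vec[i:i + block] for i in range(0, len(vec), block)]
    rng.shuffle(blocks)
    return [v for blk in blocks for v in blk]

def monte_carlo_p_value(a, b, n_sim=999, block=4, seed=0):
    """Empirical p-value of the observed similarity under block-shuffled b."""
    rng = random.Random(seed)
    observed = pearson(a, b)
    exceed = sum(pearson(a, block_shuffle(b, block, rng)) >= observed
                 for _ in range(n_sim))
    # Add-one correction so the p-value is never exactly zero.
    return observed, (1 + exceed) / (1 + n_sim)

# Two identical stripe patterns along a 24-position axis.
stripe = [0.0] * 8 + [1.0] * 4 + [0.0] * 12
observed, p_value = monte_carlo_p_value(stripe, stripe, block=2)
```

The smallest attainable p-value is 1 / (n_sim + 1), so the number of simulations bounds the resolution of the test.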

Expression similarity of known co-regulated genes

Significance of the pairwise score:

Blue: > 0.1

Green: (0.1, 0.05]

Yellow: (0.05, 0.01]

Red: < 0.01

Expression similarity of known co-regulated genes

• Examples
  – pdm2 and nubbin (also known as pdm1) share functional roles
  – Ocelliless (also known as orthodenticle) is positively regulated by bicoid
  – Ubx indirectly regulates dichaete through the intermediate activation of dpp
  – Hunchback represses the expression of nubbin, pdm2 and Ubx
  – Giant and Krueppel mutually repress each other

Expression similarity of known co-regulated genes

• Not all known interactions are detected as significant
  – The spatial expression pattern of a gene is the result of complex interactions of many genes across several time stages