Martin Ringwald, Mouse Gene Expression DB, fged_seattle_2013

The Mouse Gene Expression Database (GXD)

Martin Ringwald

The Jackson Laboratory

Mouse developmental gene expression data provide insights into

• organismal function of genes

• molecular mechanism of differentiation

• molecular basis of disease

Genotype Phenotype Expression

Mouse Strains and Mutants

Of mice and men …

• integrates different types of expression data

RNA in situ hybridization, immunohistochemistry, knock-in reporter studies, Northern blot, Western blot, RT-PCR

• focus on endogenous gene expression during mouse development

• all developmental stages

• expression data from wild-type and mutant mice

The Gene Expression Database (GXD)

Gene

RNA

Protein

1…n

1…p

Time Space

Genotype

Standardized description of expression patterns

Hierarchical structure:

• Extensibility

• Hierarchical searches

• Integrated description of expression patterns from assays with differing spatial resolution

Anatomical Ontology for Mouse Development: developed by the Edinburgh Mouse Atlas Project (EMAP); maintained and expanded by EMAP and GXD

Anatomical Ontology for the Adult Mouse: developed and maintained by GXD
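A hierarchical search can be sketched as follows. This is a minimal illustration, not GXD's implementation: the structure names, gene names, and the functions `is_within` and `hierarchical_search` are hypothetical. It shows how a query for a structure can also return results annotated to its substructures, so that assays with differing spatial resolution are searched together.

```python
# Toy anatomy hierarchy (child -> parent). Terms are illustrative only,
# not taken from the EMAP or GXD ontologies.
PARENT = {
    "brain": "central nervous system",
    "forebrain": "brain",
    "diencephalon": "forebrain",
    "telencephalon": "forebrain",
    "hindbrain": "brain",
}

# Hypothetical annotations: (gene, structure) pairs at differing resolution.
ANNOTATIONS = [
    ("GeneA", "diencephalon"),  # fine-grained, e.g. from a section in situ
    ("GeneB", "brain"),         # coarse, e.g. from a blot on dissected tissue
    ("GeneC", "hindbrain"),
]


def is_within(structure, ancestor):
    """Return True if structure equals ancestor or lies beneath it."""
    while structure is not None:
        if structure == ancestor:
            return True
        structure = PARENT.get(structure)  # None once the root is passed
    return False


def hierarchical_search(query):
    """Return genes annotated to the query structure or any substructure."""
    return sorted({gene for gene, s in ANNOTATIONS if is_within(s, query)})


print(hierarchical_search("forebrain"))
print(hierarchical_search("brain"))
```

A query for "brain" picks up the coarse "brain" annotation as well as the finer "diencephalon" and "hindbrain" annotations, while a query for "forebrain" returns only results at or below that level.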

Integrated access to complex and heterogeneous data to facilitate the use of the mouse as an experimental model to study human development and disease.

Integration with all the other data in MGI

Genotype Phenotype Expression

Function!

PubMed

OMIM

GenBank/EMBL/DDBJ

Entrez Gene

UniProt

InterPro

EMAGE

GenePaint

GEO

ArrayExpress

IMSR

Other species DB

Many links to other resources:

MGI Home Page: www.informatics.jax.org

GXD Home Page

• Data Acquisition and Current Data Content

• New Search and Display Features

Recent Progress

• curation of expression data from literature

• electronic submission from laboratories – small- and large-scale data

• collaboration with projects that generate data at a large scale

Data Acquisition for GXD

First step of literature curation: Each article is indexed with regard to

- Genes

- Assay types

- Embryonic ages

- Bibliographic information

As of 6/15/13:

• 149,941 entries

• 20,996 references

• 15,033 genes

• up-to-date, complete from 1993 (1990) to the present

Superior to PubMed:

• Manual annotation of whole manuscript

• Use of standard gene nomenclature

• Indexing of assay types and embryonic ages
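The literature index described above can be pictured as one record per combination of reference, gene, assay type, and embryonic age. The sketch below is an assumption about shape only: `IndexEntry`, `find_references`, the reference identifiers, and the gene names are all hypothetical and do not reflect the actual GXD schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    reference: str   # bibliographic identifier (hypothetical format)
    gene: str        # standard gene symbol
    assay_type: str
    age: str         # embryonic age as reported, e.g. "E10.5"


# Made-up example entries.
ENTRIES = [
    IndexEntry("ref-001", "GeneA", "RNA in situ", "E10.5"),
    IndexEntry("ref-001", "GeneA", "Northern blot", "E12.5"),
    IndexEntry("ref-002", "GeneB", "Immunohistochemistry", "E10.5"),
]


def find_references(entries, gene=None, assay_type=None, age=None):
    """Return sorted references whose entries match every given criterion."""
    hits = set()
    for e in entries:
        if gene is not None and e.gene != gene:
            continue
        if assay_type is not None and e.assay_type != assay_type:
            continue
        if age is not None and e.age != age:
            continue
        hits.add(e.reference)
    return sorted(hits)


print(find_references(ENTRIES, age="E10.5"))
print(find_references(ENTRIES, gene="GeneA", assay_type="Northern blot"))
```

Because assay type and age are indexed fields, a question such as "which papers report E10.5 data for this gene?" becomes a direct lookup instead of a free-text search.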

Primary Image Data

Example: RT-PCR

Primary Image Data

Example: Immunohistochemistry

Sections

Antibody detail

Gene

Specimens

Mutant alleles

Results

Link to images

• Standard nomenclature

• Extensive use of controlled vocabularies

• Manual and computational consistency checks

• Editorial Interface and QC reports

• Detailed and regularly updated editorial guidelines

Data Quality Control

Data Quality Control

• Text-based annotations complemented by primary image data

• Annotations are NOT based on our own interpretation of the images. They strictly rely on the statements of the authors.

• Resolution of annotations is determined by details provided in the text of the manuscript.

• We notify authors once data for their publications have been entered. Authors can provide comments and additional information.

Gene Expression Data – Result Annotations

Large-scale Gene Expression Data Sets

Incorporation of large-scale data sets

• Develop parsers to extract and evaluate data

• Manual and computational quality controls

- verify gene identity: probe to gene mapping

- verify probe identity: probe already in database?

- map results to anatomical ontology and other controlled vocabularies

- resolve ambiguities

- complete annotations

• Bring data into a standardized format for data loads

• Bulk-load curated data into GXD
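The computational part of these quality controls can be sketched roughly as below. Everything here is assumed for illustration: the lookup tables, the record fields, and the functions `check_record` and `partition` are hypothetical, and the real pipeline also involves manual review that code cannot show.

```python
# Hypothetical lookup tables standing in for database content.
PROBE_TO_GENE = {"probe1": "GeneA", "probe2": "GeneB"}
TERM_SYNONYMS = {"diencephalon": "diencephalon", "interbrain": "diencephalon"}


def check_record(record):
    """Return (curated_record_or_None, problems) for one parsed result row."""
    problems = []

    # Verify probe identity and the probe-to-gene mapping.
    known_gene = PROBE_TO_GENE.get(record["probe"])
    if known_gene is None:
        problems.append("probe not in database")
    elif known_gene != record["gene"]:
        problems.append("probe maps to a different gene")

    # Map the free-text structure onto a controlled vocabulary term.
    term = TERM_SYNONYMS.get(record["structure"].strip().lower())
    if term is None:
        problems.append("structure not mapped to ontology")

    curated = dict(record, structure=term) if not problems else None
    return curated, problems


def partition(records):
    """Split rows into those ready for bulk load and those needing review."""
    loadable, needs_review = [], []
    for rec in records:
        curated, problems = check_record(rec)
        if curated is not None:
            loadable.append(curated)
        else:
            needs_review.append((rec, problems))
    return loadable, needs_review


rows = [
    {"probe": "probe1", "gene": "GeneA", "structure": "Interbrain"},
    {"probe": "probe2", "gene": "GeneA", "structure": "diencephalon"},
    {"probe": "probe9", "gene": "GeneC", "structure": "unknown region"},
]
loadable, needs_review = partition(rows)
print(loadable)
print(needs_review)
```

Rows that pass every check come out in a standardized form ready for loading; the rest are kept with the reasons they failed, so that ambiguities can be resolved by a curator.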

GXD adds value to large-scale data sets from other databases

• data are integrated with all the other data in GXD and MGI

• data are accessible via many new search parameters

• data and data connections are maintained and kept up-to-date

GXD: Current Data Content

• 249,010 Expression Images

• 1,394,685 Annotated Expression Results

• 63,374 Expression Assays

• 13,751 Genes

• 1,820 Mouse Mutants with Expression Data

• Gene Expression Data Query Forms

• Expression Data Summaries

• Expression Assay Details

• Images

Improved Search and Display Capabilities

MGI

Gene Detail Page

Function (GO) Phenotype Disease

Anatomy Dev. Stage Age

Wild-type / mutant

Assay type

New Query Form - Standard Search

New Query Form - Differential Expression Search

Function (GO) Phenotype Disease

Anatomy Dev. Stage Age

Wild-type / mutant

Assay type

New Query Form - Standard Search

1824 genes are annotated to DNA binding. Expression data are available for this gene set (otherwise ‘DNA binding’ would be greyed out).

Auto-fill function

DNA binding genes detected in diencephalon at TS 17-20 by immunohistochemistry
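The example query above combines a functional term, an anatomical structure, a Theiler stage range, and an assay type. A minimal sketch of that combination follows; the `Result` record, the `GO_ANNOTATIONS` table, the gene names, and `standard_search` are hypothetical stand-ins, and for brevity the structure is matched exactly instead of hierarchically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    gene: str
    structure: str
    stage: int        # Theiler stage (TS)
    assay_type: str
    detected: bool


# Hypothetical functional annotations and expression results.
GO_ANNOTATIONS = {
    "GeneA": {"DNA binding"},
    "GeneB": {"DNA binding"},
    "GeneC": {"kinase activity"},
}

RESULTS = [
    Result("GeneA", "diencephalon", 18, "Immunohistochemistry", True),
    Result("GeneB", "diencephalon", 19, "RNA in situ", True),
    Result("GeneB", "diencephalon", 22, "Immunohistochemistry", True),
    Result("GeneC", "diencephalon", 17, "Immunohistochemistry", True),
    Result("GeneA", "diencephalon", 20, "Immunohistochemistry", False),
]


def standard_search(results, go_term, structure, stages, assay_type):
    """Return results where all criteria hold and expression was detected."""
    first, last = stages
    return [
        r for r in results
        if go_term in GO_ANNOTATIONS.get(r.gene, set())
        and r.structure == structure
        and first <= r.stage <= last
        and r.assay_type == assay_type
        and r.detected
    ]


hits = standard_search(
    RESULTS, "DNA binding", "diencephalon", (17, 20), "Immunohistochemistry"
)
print([(r.gene, r.stage) for r in hits])
```

Each criterion narrows the set independently, which is why the query form can offer them as separate fields that are combined at search time.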

New Summary – Assay Results

• 4 sortable data summaries: genes, assays, assay results, images

• links to detailed annotations and images

• summary data can be downloaded and exported to other applications
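Sorting a summary and exporting it for use elsewhere might look like the sketch below. The rows, column names, and the helpers `sort_rows` and `to_tsv` are hypothetical; the tab-separated format is an assumption chosen because spreadsheets read it easily, not a statement about the files GXD produces.

```python
import csv
import io

# Hypothetical summary rows.
ROWS = [
    {"gene": "GeneB", "assay_type": "RNA in situ", "stage": 19},
    {"gene": "GeneA", "assay_type": "Immunohistochemistry", "stage": 18},
    {"gene": "GeneA", "assay_type": "RT-PCR", "stage": 17},
]


def sort_rows(rows, *columns):
    """Sort rows by the given columns, in order of priority."""
    return sorted(rows, key=lambda r: tuple(r[c] for c in columns))


def to_tsv(rows):
    """Serialize a non-empty list of rows as tab-separated text with a header."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


print(to_tsv(sort_rows(ROWS, "gene", "stage")))
```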

Sort

New Summary – Assays

New Summary – Genes

Previous Assay Details

reference to 1H, 1J; link to Figure 1

all specimen information displayed upfront

reference to 1E, 1F; link to Figure 1

Links to 3-D mapped images in EMAGE

New Assay Details

focus on most important specimen information

images displayed together with result annotations

New Summary – Images

Search directly for images using many different query criteria

• Gene Expression Data Query Forms

- improved layout

- new query capabilities

• Strongly enhanced query performance

• Expression Data Summaries

- more flexible and interactive

- option to download and export data

- image summaries

• Expression Assay Details

- integration of images and annotations

- improved layout

- focus on essential data

Improved Search and Display Capabilities

• MGI Batch Query

• GXD BioMart

New ways to access GXD Data

• Enter list of gene symbols or IDs and look up associated expression data

• Download data and export data to other applications
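The batch lookup described above can be pictured as follows. This is a sketch under stated assumptions: the identifiers, gene symbols, data tables, and the `batch_query` function are hypothetical and do not represent the MGI Batch Query interface.

```python
# Hypothetical data: expression results keyed by gene symbol,
# plus a table resolving database IDs to symbols.
EXPRESSION_BY_GENE = {
    "GeneA": [("diencephalon", "TS18", "Immunohistochemistry")],
    "GeneB": [("heart", "TS20", "RNA in situ")],
}
ID_TO_SYMBOL = {"ID:0001": "GeneA", "ID:0002": "GeneB"}


def batch_query(inputs):
    """Look up expression data for a mixed list of symbols and IDs.

    Returns (found, not_found): found maps each input token to its
    expression results; not_found lists tokens that matched nothing.
    """
    found, not_found = {}, []
    for raw in inputs:
        token = raw.strip()
        symbol = ID_TO_SYMBOL.get(token, token)  # resolve IDs, pass symbols through
        if symbol in EXPRESSION_BY_GENE:
            found[token] = EXPRESSION_BY_GENE[symbol]
        else:
            not_found.append(token)
    return found, not_found


found, not_found = batch_query(["GeneA", " ID:0002 ", "GeneX"])
print(found)
print(not_found)
```

Reporting unmatched inputs separately lets the user see which symbols or IDs need correcting before the results are exported.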

GXD BioMart

Find expression data

• for a gene

• for a list of genes

• for an anatomical structure

• for a mutant

• for a reference

Integrated searches across different BioMarts

GXD BioMart: Query Results (default view)

Export Data

Link to Images

Link to Assay Details

Constance Smith, Jacqueline Finger, Terry Hayamizu, Ingeborg McCright, Jingxia Xu, David Shaw, Joanne Berghout, MGI Software Group, Jim Kadin, Joel Richardson, Janan Eppig

Acknowledgements

GXD is supported by NICHD