Annotating Microarray Data with the MGED Ontology

NCI Center for Bioinformatics
April 15, 2004

P. L. Whetzel, A. Pizarro, E. Manduchi, J. Liu, H. He, G. Grant, M. Mailman, C. Stoeckert

Center for Bioinformatics
University of Pennsylvania

Science 298:601-604, 2002

Science 298:597-600, 2002

To compare experiments, you need some minimum information about the microarray experiments.

Ivanova et al. Science 2003

Microarray Information to be Shared

Figure from: David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14

The Computational View of Microarray Information

MGED Society

International organization
Composed of biologists, computer scientists, and data analysts
Aims to facilitate the sharing and evaluation of microarray data
  Establish standards for microarray data annotation
  Create microarray databases
  Promote sharing of high quality, well-annotated data
  Generalize to data generated by functional genomics and proteomics experiments

www.mged.org

MGED Standardization Efforts

MIAME
  The formulation of the minimum information about a microarray experiment required to interpret and verify the results. (Brazma et al. Nature Genetics 2001)

MAGE-OM
  The establishment of a data exchange format and object model for microarray experiments. (Spellman et al. Genome Biol. 2002)

MGED Ontology
  The development of an ontology for microarray experiment description and biological material (biomaterial) annotation in particular. (Stoeckert & Parkinson, Comp. Funct. Genom. 2003)

Transformations
  The development of recommendations regarding microarray data transformations and normalization methods.

MGED Ontology (MO)

Purpose
  Provide standard terms for the annotation of microarray experiments
  Not to model biology but to provide descriptors for experiment components
Benefits
  Unambiguous description of how the experiment was performed
  Structured queries can be generated
Ontology concepts derived from the MIAME guidelines/MAGE-OM

MGED Ontology development

http://mged.sourceforge.net/ontologies/MGEDontology.php

OILed
File formats
  DAML file
  HTML file
  NCI DTS Browser
Changes
  Notes
  Term Tracker

Relationship of MO to MAGE-OM

MO class hierarchy follows that of MAGE-OM
  Association to OntologyEntry
MO provides terms for these associations by:
  Instances internal to MO
  Instances from external ontologies
Take advantage of existing ontologies
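The OntologyEntry association described above can be pictured with a small data structure. This is a minimal sketch, not the MAGE-OM class definitions: the field names `source` and `accession`, the `is_external` helper, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MO = "MGED Ontology"


@dataclass(frozen=True)
class OntologyEntry:
    """A term attached to an annotated object (simplified, hypothetical fields)."""
    category: str                    # class the term is drawn from, e.g. "Organism"
    value: str                       # the term itself
    source: str = MO                 # assumed field: where the term is defined
    accession: Optional[str] = None  # assumed field: identifier in the external source

    def is_external(self) -> bool:
        # True when the term is an instance from an external ontology or database
        return self.source != MO


@dataclass
class BioMaterial:
    """A biological sample described by a list of OntologyEntry terms."""
    name: str
    characteristics: List[OntologyEntry] = field(default_factory=list)


sample = BioMaterial(
    name="sample_1",
    characteristics=[
        # instance internal to MO (illustrative value)
        OntologyEntry("Sex", "female"),
        # instance taken from an external resource (illustrative value)
        OntologyEntry("Organism", "Mus musculus",
                      source="NCBI Taxonomy", accession="10090"),
    ],
)

external = [e.value for e in sample.characteristics if e.is_external()]
print(external)
```

Keeping the source with each term is what lets one annotation mix MO instances with terms from existing ontologies.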

MGED Ontology Class Hierarchy

MGED CoreOntology
  Coordinated development with MAGE-OM
  Ease of locating appropriate class to select terms from
MGED ExtendedOntology
  Classes for additional terms as the usage of genomics technologies expands

MAGE and MO

Main focus of MGED Ontology

Structured and rich description of BioMaterials

[Diagram: BioMaterial and OntologyEntry classes, with the labels +characteristics and +associations]

MO and References to External Ontologies

Use MGED Ontology for Structured Descriptions (MAGE-ML)

http://www.sofg.org

Desirable Microarray Queries

Return all experiments with species X examined at developmental stage Y
  Sort by platform type
  Which are untreated? Treated?
    Treated with what compound?
How comparable are these?
What can these experiments tell me?
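Once annotations use controlled terms, the queries listed above reduce to simple filters and groupings. The sketch below assumes a hypothetical in-memory list of annotated experiments; the record keys, function names and values are invented for illustration and do not reflect RAD or MAGE-ML.

```python
from collections import defaultdict

# Hypothetical annotated experiments; every value is a placeholder.
experiments = [
    {"id": "E1", "species": "species X", "stage": "stage Y",
     "platform": "spotted cDNA", "treatment": None},
    {"id": "E2", "species": "species X", "stage": "stage Y",
     "platform": "oligonucleotide", "treatment": "compound A"},
    {"id": "E3", "species": "species X", "stage": "stage Z",
     "platform": "spotted cDNA", "treatment": None},
    {"id": "E4", "species": "species W", "stage": "stage Y",
     "platform": "oligonucleotide", "treatment": "compound B"},
]


def find_experiments(records, species, stage):
    """Return experiments matching species and stage, sorted by platform type."""
    hits = [r for r in records
            if r["species"] == species and r["stage"] == stage]
    return sorted(hits, key=lambda r: r["platform"])


def split_by_treatment(records):
    """Group experiment ids by compound, using 'untreated' when none is recorded."""
    groups = defaultdict(list)
    for r in records:
        groups[r["treatment"] or "untreated"].append(r["id"])
    return dict(groups)


hits = find_experiments(experiments, "species X", "stage Y")
groups = split_by_treatment(hits)
print([r["id"] for r in hits])
print(groups)
```

The exact-match filters only work because each field holds a term from a shared vocabulary; free-text annotation would need fuzzy matching instead.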

MO and Structured Queries

RAD: RNA Abundance Database http://www.cbil.upenn.edu/RAD

RAD is part of GUS (Genomics Unified Schema)
  The GUS platform maximizes the utility of stored data by warehousing them in a schema that integrates the genome, transcriptome, gene regulation and networks, ontologies and controlled vocabularies, gene expression
Relational schema (implemented in Oracle)
Stores data from gene expression arrays and SAGE
Comes with a suite of web-annotation forms (Study-Annotator)
MAGE-RAD Translator (MR_T) generates MAGE-ML files for exports

Manduchi et al. 2004 Bioinformatics 20:452-459.

GUS (Genomics Unified Schema) http://www.gusdb.org

Namespace  Domain                   Features
SRes       Shared Resources         Ontologies
RAD        Gene Expression          MIAME/MAGE-OM
TESS       Gene regulation          Grammars
Core       Data Provenance          Documentation
DoTS       Sequence and annotation  Central dogma

RAD Schema

About 65 tables and 30 views
  Assay to Quantification tables
  Study Design tables
  BioMaterials tables
  Platform tables
  Quantification Result tables
  Processing tables
  Analysis Result tables
  Misc tables: Protocol, Contact*, Ontologies*
  Meta tables*: for data privacy and history tracking
  Integrity Checks tables

* These are used by RAD, but belong to common GUS components

[Figure label: Tables populated by the Study-Annotator]

RAD Study-Annotator

Covers all relevant parts of the MIAME checklist
Exploits the MGED Ontology
Allows entering of very specific details of an experiment
Web-based forms:
  Modular structure
  Written in PHP
  Front-end data integrity checks using JavaScript
Manages Data Privacy based on Project/Group selections present in GUS schema
Available at http://www.cbil.upenn.edu/RAD/RAD-installation.htm

RAD Study-Annotator Logical Flow

[Flow diagram. Labels: Login; New User Registration; Data Preferences (Project, Group); Module I, Module II, Module III; Study; Study Design; BioMaterials (samples, treatments); From Assay to Quantification; Misc]

Experiment Annotation: Study Design

BioMaterial Annotation: Conceptual View

RAD Study Annotator: BioMaterial Module

RAD Study Annotator: BioSource Form

RAD Study Annotator: Treatment Form

Using the Ontologies

[Diagram. Labels: RAD Study-Annotator; RAD; OntologyEntry; SRes; ExternalDatabases; MGED Ontology; Anatomy; DevelopmentalStage; Disease; Lineage; PATOAttribute; Phenotype; Taxon]

Ontology instances propagated to annotation web forms

New terms can be proposed

Sources of New Terms in OntologyEntry

MGED Ontology
  Continued development of new classes and terms
Shared Resources (SRes)
  Contains controlled vocabularies and ontologies
External Database Sources
Annotated term provided by user

Adding New Terms

1 Add term from SRes

2 Add term from External Database
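One way to picture how a term might be resolved against these sources is a simple fallback chain. This is a sketch under stated assumptions: the lookup order, the `resolve_term` function, the dictionaries and the example terms are all hypothetical, and the Study-Annotator's actual logic is not shown on the slides.

```python
def resolve_term(category, value, mo_terms, sres_terms, external_lookup):
    """Return (source, term_or_accession) for an annotation term.

    Assumed order: MGED Ontology, then SRes, then an external database,
    and finally a user-proposed new term.
    """
    if value in mo_terms.get(category, set()):
        return ("MGED Ontology", value)
    if value in sres_terms.get(category, set()):
        return ("SRes", value)
    accession = external_lookup(category, value)
    if accession is not None:
        return ("ExternalDatabase", accession)
    return ("proposed", value)


# Illustrative stand-ins for the three term sources (placeholder content).
mo_terms = {"Sex": {"female", "male"}}
sres_terms = {"DevelopmentalStage": {"adult"}}
external_db = {("Organism", "Mus musculus"): "10090"}


def lookup(category, value):
    # Hypothetical external database lookup; returns None when nothing matches.
    return external_db.get((category, value))


print(resolve_term("Sex", "female", mo_terms, sres_terms, lookup))
print(resolve_term("DevelopmentalStage", "adult", mo_terms, sres_terms, lookup))
print(resolve_term("Organism", "Mus musculus", mo_terms, sres_terms, lookup))
print(resolve_term("Sex", "not yet defined", mo_terms, sres_terms, lookup))
```

Returning the source alongside the term keeps proposed terms distinguishable from curated ones, so they can be reviewed before entering the ontology.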

Future Issues

Burning Issues
  Developing MO in synch with related efforts (MAGE-OM v.2.0)
  Use/presentation in annotation forms
  Coverage of other technologies and biological domains
Flame retardant structure
  ExtendedOntology
    Space to add new classes, terms and their relationship to one another

A Functional Genomics View

A. Jones et al. submitted

A Functional Genomics Object Model (FGE-OM)

Separate out common components from technology-specific ones
Allow new domains to be added as new modules to the model
Incorporate ideas from SysBio-OM (Xirasagar et al. Bioinformatics in press)

Jones et al. Bioinformatics in press

Proposed Development of FGE-OM

[Diagram. Labels: Microarray Standards; Proteomics Standards; Functional Genomics Standards; MIAME; MIAME-Tox; MAGE-OM; MGED Ontology; MIAPE; MIAPE-OM; Pedro; FGE-OM; Informal specification; Formal specification; Immutable type system; Strong type system; Use Cases]

Acknowledgements

MGED Ontology Working Group
  Chris Stoeckert, Trish Whetzel (Penn)
  Helen Parkinson (EBI)
  Joe White (TIGR)
  Gilberto Fragoso, Liju Fan, Mervi Heiskanen (NCI)
  Many others!
