THE INSTITUTE FOR GENOMIC RESEARCH TIGR MGED Ontology Workshop MGED7 September 8-10, 2004 Toronto, Canada


THE INSTITUTE FOR GENOMIC RESEARCH

TIGR

MGED Ontology Workshop

MGED7

September 8-10, 2004

Toronto, Canada


MAGE Workshop

May 24th- May 28th 2004

The Institute for Genomic Research

and

University of Maryland, Shady Grove

Rockville, MD


MAGE Workshop Goals

• MAGE -> MGED Ontology API
– Main goal for this meeting
– Build a mechanism for people to use the MO as part of MAGE

• Ontology Tools
– In order to use MO we need tools for manipulating and managing the ontology

• MAGE v2 Model
– Continue model discussions toward MAGE v2
– Document Model and changes as the model is developed
– Begin work on Mapping MAGE v1 to MAGE v2


MAGE Workshop Goals

• Standalone ADF Converter
– Continue work on simplified Array Design Format and a reader and writer for it
– Integrate this into MAGE v2

• Documentation
– MO policies and usage
– MOE class or methods
– MAGE v2

• Code, Code, Code …


Ontology Tools

• DAML file parsing scripts for extracting the classes, instances and properties from MO

• ANSI SQL scripts for creating MO in a relational database like MySQL or Sybase

• Script-based methods for updating a database implementation of the MO

• Perl and Java methods for searching the MO for classes, instances, and properties

• Others?
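As a rough illustration of the relational-database and search bullets above, the sketch below loads a few MO terms into a table and searches them by name. It is written in Python with the built-in sqlite3 module rather than MySQL or Sybase, and it is not the workshop's code: the table layout (`mo_term` with `name`, `kind`, `parent`), the function names, and the parent links in the sample rows are all assumptions made for this example.

```python
import sqlite3

# Hypothetical sample rows: (name, kind, parent). The term names come from
# the examples in these slides; the parent links are illustrative only.
TERMS = [
    ("Scale", "class", None),
    ("linear_scale", "instance", "Scale"),
    ("BioMaterialCharacteristics", "class", None),
    ("OrganismPart", "class", "BioMaterialCharacteristics"),
]


def create_mo_db(terms):
    """Create an in-memory database holding one row per MO term."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mo_term ("
        "name TEXT PRIMARY KEY, kind TEXT NOT NULL, parent TEXT)"
    )
    conn.executemany(
        "INSERT INTO mo_term (name, kind, parent) VALUES (?, ?, ?)", terms
    )
    conn.commit()
    return conn


def search_terms(conn, pattern, kind=None):
    """Return term names containing `pattern`, optionally limited to one kind."""
    sql = "SELECT name FROM mo_term WHERE name LIKE ?"
    params = ["%" + pattern + "%"]
    if kind is not None:
        sql += " AND kind = ?"
        params.append(kind)
    sql += " ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]


conn = create_mo_db(TERMS)
print(search_terms(conn, "scale"))
print(search_terms(conn, "scale", kind="instance"))
```

Keeping the term kind in a column lets one query serve classes, instances and properties alike; a production schema would likely need more than a single parent column.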


OWL Reader:

  Adam Witney developed a reader based on the Redland RDF reader that parses the MGED Ontology from .OWL as well as .DAML files. This reader became useful to the Ontology helper API. It is located in cvs: lib/Perl/script/MGEDOntology_parser.pl
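To give a feel for what such a reader extracts, here is a small sketch that pulls classes, properties and instances out of an OWL fragment. It does not use Redland and is not the MGEDOntology_parser.pl script: it is a Python example built on the standard xml.etree.ElementTree module, and the sample OWL text, the `mo` namespace URI and the property name `has_example_property` are invented for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Toy OWL fragment; the mo namespace and the property name are hypothetical.
SAMPLE_OWL = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:mo="http://example.org/toy-mo#">
  <owl:Class rdf:ID="Scale"/>
  <owl:Class rdf:ID="OrganismPart"/>
  <owl:ObjectProperty rdf:ID="has_example_property"/>
  <mo:Scale rdf:ID="linear_scale"/>
</rdf:RDF>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_terms(owl_text):
    """Sort the top-level nodes of an RDF/XML document into three lists."""
    root = ET.fromstring(owl_text)
    terms = {"classes": [], "properties": [], "instances": []}
    property_tags = {
        "{%s}ObjectProperty" % OWL,
        "{%s}DatatypeProperty" % OWL,
    }
    for node in root:
        node_id = node.get("{%s}ID" % RDF)
        if node_id is None:
            continue  # anonymous nodes are ignored in this sketch
        if node.tag == "{%s}Class" % OWL:
            terms["classes"].append(node_id)
        elif node.tag in property_tags:
            terms["properties"].append(node_id)
        elif not node.tag.startswith("{%s}" % OWL):
            # A typed node such as <mo:Scale rdf:ID="..."/> is an instance
            # of the class named by its tag.
            terms["instances"].append((node_id, local_name(node.tag)))
    return terms


print(extract_terms(SAMPLE_OWL))
```

A real RDF parser works on triples and handles the many equivalent RDF/XML spellings; this sketch only recognises the simple typed-node form used in the sample.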


Ontology Helper API:

• Eric Deutsch and Kjell Petersen developed ‘MGEDOntologyHelper.pm’, which creates OntologyEntry objects based on ‘leaf node’ data. Both applications follow MGEDOntology policies.

• The MO Traversal application written by Kjell Petersen and Stathis Sideris was extended to use ‘leaf node’ data to instantiate an OE MAGE-ML object.

• Eric Deutsch ported Stathis’ code to a set of Perl modules, which return nested OE objects. The Perl helper is nearly complete. It works well in simple cases, but in nested cases the final ‘value’ doesn’t get inserted.

• These ‘prototype’ modules and applications are available in cvs:

• Perl: MAGE-Perl/MAGE/Tools. Java: MAGE-Java/MGEDOntologyEntry

• The code is also a working prototype.

• Example scripts are located in ‘lib/Perl/MGEDOntology’.

• The Java Helper has been completed and is fully functional.


Perl modules

MGEDOntologyClassEntry.pm

MGEDOntologyEntry.pm

MGEDOntologyHelper.pm

MGEDOntologyPropertyEntry.pm

Java classes

MGEDOntologyEntry.java

MGEDOntologyClassEntry.java

MGEDOntologyPropertyEntry.java

OntologyHelper.java

StringManipHelpers.java


Example Perl for MGEDOntologyHelper

my $ontologyEntry1 = Bio::MAGE::Tools::MGEDOntologyClassEntry->new(
    parentObject => $qt,          ## ref to a QT object
    className    => 'QuantitationType',
    association  => 'Scale',
    values       => {
        Scale => 'linear_scale',
    },
    ontology     => $ontology,    ## ref to MO Helper obj
);


Resulting OE

<OntologyEntry value="linear_scale"
               category="Scale">
  <OntologyReference_assn>
    <DatabaseEntry URI="http://mged.sourceforge.net/ontologies/MGEDOntology.php#linear_scale"
                   accession="linear_scale">
      <Database_assnref>
        <Database_ref identifier="MO"/>
      </Database_assnref>
    </DatabaseEntry>
  </OntologyReference_assn>
</OntologyEntry>
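The mapping from one category/value pair to an element tree of this shape can be sketched in a few lines. The example below is Python using the standard xml.etree.ElementTree module, not the Perl or Java helper; the function name `build_ontology_entry` and its parameters are hypothetical, and the element and attribute names are simply copied from the OE shown above.

```python
import xml.etree.ElementTree as ET

# Base URI as it appears in the OE above.
MO_URI = "http://mged.sourceforge.net/ontologies/MGEDOntology.php"


def build_ontology_entry(category, value, database_id="MO"):
    """Build an OntologyEntry element that references the MO term `value`."""
    oe = ET.Element("OntologyEntry", {"value": value, "category": category})
    reference = ET.SubElement(oe, "OntologyReference_assn")
    db_entry = ET.SubElement(
        reference,
        "DatabaseEntry",
        {"URI": MO_URI + "#" + value, "accession": value},
    )
    db_ref_holder = ET.SubElement(db_entry, "Database_assnref")
    ET.SubElement(db_ref_holder, "Database_ref", {"identifier": database_id})
    return oe


oe = build_ontology_entry("Scale", "linear_scale")
print(ET.tostring(oe, encoding="unicode"))
```

The sketch covers only the flat case; it does not look anything up in the MO or check that the value is a valid instance of the category.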


Another example

my $ontologyEntry1 = Bio::MAGE::Tools::MGEDOntologyClassEntry->new(
    parentObject => $BioSource,   ## ref to parent
    className    => 'BioMaterial',
    association  => 'Characteristics',
    values       => {
        OrganismPart => 'lung',
    },
    ontology     => $ontology,    ## ref to MO obj.
);


Missing value in resulting OE

<OntologyEntry value="BioMaterialCharacteristics" category="BioMaterialCharacteristics">
  <OntologyReference_assn>
    <DatabaseEntry URI="http://mged.sourceforge.net/ontologies/MGEDOntology.php#BioMaterialCharacteristics" accession="BioMaterialCharacteristics">
      <Database_assnref>
        <Database_ref identifier="MO"/>
      </Database_assnref>
    </DatabaseEntry>
  </OntologyReference_assn>
  <Associations_assnlist>
    <OntologyEntry value="OrganismPart" category="OrganismPart">
      <OntologyReference_assn>
        <DatabaseEntry URI="http://mged.sourceforge.net/ontologies/MGEDOntology.php#OrganismPart"
                       accession="OrganismPart">
          <Database_assnref>
            <Database_ref identifier="MO"/>
          </Database_assnref>
        </DatabaseEntry>
      </OntologyReference_assn>
    </OntologyEntry>
  </Associations_assnlist>
</OntologyEntry>
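Output like this can be screened automatically for the missing value. The Python sketch below walks a nested OE and reports every innermost entry whose value merely repeats its category, which is how the unfilled 'lung' slot shows up above. The rule is a heuristic inferred from this one example, not an MO policy, and the function name and trimmed sample document are assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the nested OE above (reference elements omitted).
NESTED_OE = """<OntologyEntry value="BioMaterialCharacteristics"
                              category="BioMaterialCharacteristics">
  <Associations_assnlist>
    <OntologyEntry value="OrganismPart" category="OrganismPart"/>
  </Associations_assnlist>
</OntologyEntry>"""


def find_unfilled_leaves(oe):
    """Return categories of innermost entries whose value equals the category."""
    children = oe.findall("Associations_assnlist/OntologyEntry")
    if not children:
        # Innermost entry: it should carry an instance value such as 'lung'.
        if oe.get("value") == oe.get("category"):
            return [oe.get("category")]
        return []
    unfilled = []
    for child in children:
        unfilled.extend(find_unfilled_leaves(child))
    return unfilled


root = ET.fromstring(NESTED_OE)
print(find_unfilled_leaves(root))
```

Entries that have nested associations are skipped on purpose, because in the output above the outer entry also repeats its category as its value.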


Autogenerated MO classes:

Scott Gustofsen has devised a method of generating Java classes from the MO. He calls this Java Ontology Bindings. The code isn’t yet implemented, but will be in the near term. The classes would have to be regenerated with each release of MO. He has agreed to push it into cvs.
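The general idea of generating classes from an ontology can be sketched as follows. This is a Python illustration, not the Java Ontology Bindings code, which the slides do not show: the toy ontology table, the `make_binding` function and the choice to validate values in the constructor are all assumptions.

```python
# Toy ontology: class name -> known instance names (taken from these slides).
TOY_MO = {
    "Scale": ["linear_scale"],
    "OrganismPart": ["lung"],
}


def make_binding(class_name, allowed_values):
    """Create a Python class whose constructor accepts only known instances."""
    allowed = frozenset(allowed_values)

    def __init__(self, value):
        if value not in allowed:
            raise ValueError(
                "%r is not a known instance of %s" % (value, class_name)
            )
        self.value = value

    return type(class_name, (object,), {"__init__": __init__, "ALLOWED": allowed})


# One generated class per ontology class; rerun this for each new MO release.
bindings = {name: make_binding(name, values) for name, values in TOY_MO.items()}

scale = bindings["Scale"]("linear_scale")
print(type(scale).__name__, scale.value)
```

Because the classes are derived from the ontology content, they go stale when the ontology changes, which is why regeneration with each MO release is needed.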


BioMoby Web service:

Tina Boussard and Derek Fowler began work to establish a BioMoby service to do the following: 1) search MO for terms and definitions, 2) return instantiated classes [objects], 3) provide command-line client register service [for batch processes].

The service would likely be hosted at CBIL (UPenn). A namespace has been defined; the datatype and the input/output formats for searches still need to be defined. Currently, the service uses GO as the test case database. Derek Fowler has set up a BioMoby test service: test_GetMoTerm.

• The Moby namespace was problematic because another group at Cornell has registered their namespace incorrectly. They've been asked to fix it.


MO Term Tracker/Validator:

Trish Whetzel implemented a term tracker for the RAD database.

The use cases for the Tracker are:

1) return new terms proposed by date and/or submitter,

2) return all terms for any MO class,

3) consistency check the MO.

The Tracker would be hosted at CBIL. Helen Parkinson and Trish discussed methods and use cases for managing the MO in RAD.
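The three use cases can be sketched against an in-memory list of proposed terms. This is a Python illustration only; the real tracker is implemented for the RAD database, whose schema is not shown in these slides. The `ProposedTerm` fields, the function names, the sample submitters and dates, and the two consistency rules (unknown class, duplicate name) are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProposedTerm:
    name: str
    mo_class: str
    submitter: str
    proposed_on: date


def new_terms(terms, since=None, submitter=None):
    """Use case 1: terms proposed on/after a date and/or by one submitter."""
    return [
        t for t in terms
        if (since is None or t.proposed_on >= since)
        and (submitter is None or t.submitter == submitter)
    ]


def terms_for_class(terms, mo_class):
    """Use case 2: names of all terms filed under one MO class."""
    return [t.name for t in terms if t.mo_class == mo_class]


def consistency_problems(terms, known_classes):
    """Use case 3: flag terms with an unknown class or a duplicated name."""
    problems = []
    seen = set()
    for term in terms:
        if term.mo_class not in known_classes:
            problems.append("unknown class for " + term.name)
        if term.name in seen:
            problems.append("duplicate term " + term.name)
        seen.add(term.name)
    return problems


# Hypothetical sample data.
TERMS = [
    ProposedTerm("linear_scale", "Scale", "submitter_a", date(2004, 5, 24)),
    ProposedTerm("lung", "OrganismPart", "submitter_b", date(2004, 5, 26)),
    ProposedTerm("lung", "NotAClass", "submitter_a", date(2004, 5, 27)),
]

print(new_terms(TERMS, since=date(2004, 5, 26)))
print(terms_for_class(TERMS, "Scale"))
print(consistency_problems(TERMS, {"Scale", "OrganismPart"}))
```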


ADF Converter:

Philippe Rocca-Serra continued work on the ADF format with Michael Miller and Pierre Bushel. In order to incorporate ChIP/CGH data he had to remove Reporter and CompositeSequence identifiers. The format is now an Excel workbook with 3 worksheets: headers, Reporters and CompositeSequences. The code will be placed in cvs when ready.


MAGEv2 Progress:

Ugis Sarkans, Paul Spellman, Michael Miller and Angel Pizarro continued remodeling discussions. The following changes are being considered:

1. Use Channel to link FactorValues to particular BioMaterials

2. BioAssays are now generalized to (hopefully) include all types of experimental protocols. As such the BioMaterial Treatment object is now a type of BioAssay.

3. ArrayDesign and DesignElement will mostly be left as-is in the model and the reference implementation of the model will have native support for a simpler format, probably ADF. It is not yet clear whether the default serialization (XML schema) will have both formats.

4. Protocols have been changed to be multistep protocols, i.e. a set of ordered steps, and not just simple protocol descriptions. This has allowed the changes in BioAssay to take place.

5. A new abstract class, ‘Referenceable’, was devised to separate the concept of internal MAGE references (Identifiable) from references to objects that exist in other resources (e.g. they have associations to DatabaseEntry and BibliographicReference).

6. OntologyEntry will be redefined to allow for representation of frame-based and description logic ontologies (MO), as well as simpler node-based ontologies (GO).

7. HigherLevelAnalysis will be extended to represent other types of analytical results. The current cluster representation will remain the same, modulo some bug fixes.

8. The MAGE submitters' notes were searched for best-practices issues that could be solved by the model. The conclusion was that most of the best-practice recommendations stem from semantic checks, not syntactic ones, so the model changes would not suffice.
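Item 4 above, a protocol as a set of ordered steps, can be pictured with a small data structure. The Python sketch below is only one possible reading of that item; the class names, fields and sample step descriptions are hypothetical and do not come from the MAGE v2 model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProtocolStep:
    position: int       # 1-based position of the step within the protocol
    description: str


@dataclass
class MultiStepProtocol:
    identifier: str
    steps: List[ProtocolStep] = field(default_factory=list)

    def add_step(self, description):
        """Append a step and number it after the existing ones."""
        step = ProtocolStep(position=len(self.steps) + 1, description=description)
        self.steps.append(step)
        return step

    def descriptions(self):
        """Return the step descriptions in execution order."""
        ordered = sorted(self.steps, key=lambda s: s.position)
        return [s.description for s in ordered]


# Hypothetical protocol with three ordered steps.
protocol = MultiStepProtocol(identifier="example:protocol:1")
protocol.add_step("extract RNA")
protocol.add_step("label sample")
protocol.add_step("hybridize")
print(protocol.descriptions())
```

Storing the position on each step, rather than relying on list order alone, keeps the ordering explicit if steps are serialized and read back separately.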


MO Problems:

Kjell and Eric found a couple of inconsistencies in the Ontology which are being investigated - these have been posted into the MGED Ontology tracker.

There was an issue about how we handle deprecated instances and classes; this will also be investigated at the next VOW.


Ontology Helper API:

• Eric Deutsch and Kjell Petersen developed an API that creates OntologyEntry objects based on ‘leaf node’ data, i.e. the instance values for the OE. Both applications follow MGEDOntology policies.

• Kjell Petersen extended a Java application written together with Stathis Sideris after the EBI jamboree last December. The application traverses the MO to generate uninitialised data structures according to the policies, with the possible choices from MO available at each node. The extended code now takes a minimum set of 'leaf node' data as input, validates them against MO and instantiates the full data structure of MAGE OntologyEntry. The code is a working prototype. The classes are in cvs: MAGE-Java/MGEDOntologyEntry. Kjell completed the Java Helper such that it produces complete and filled OE objects.

• Eric Deutsch ported Stathis’ code to a set of Perl modules that are used by a further module, ‘MGEDOntologyHelper.pm’, which returns nested OE objects. The helper is nearly complete. It works well in simple cases, but in nested cases the final ‘value’ doesn’t get inserted. These modules are available in cvs: MAGE-Perl/MAGE/Tools. The code is also a working prototype. Example scripts that use it are located in ‘lib/Perl/MGEDOntology’ as test1.pl, test2.pl, and test3.pl.
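The "validate leaf data, then instantiate the full structure" step can be sketched as below. This is a Python illustration and not the Perl or Java helper: the toy ontology table, the generic `children` links (the slides do not say how the real MO relates these classes), the function names, and the shape of the filled result with 'lung' as the leaf value are all assumptions.

```python
# Toy ontology: each class lists its child classes and its known instances.
TOY_MO = {
    "BioMaterialCharacteristics": {"children": ["OrganismPart"], "instances": []},
    "OrganismPart": {"children": [], "instances": ["lung"]},
    "Scale": {"children": [], "instances": ["linear_scale"]},
}


def find_path(mo, start, target):
    """Return the list of classes from `start` down to `target`, or None."""
    if start == target:
        return [start]
    for child in mo[start]["children"]:
        path = find_path(mo, child, target)
        if path:
            return [start] + path
    return None


def build_entry(mo, root_class, leaf_class, leaf_value):
    """Validate one piece of leaf data and build the nested entry around it."""
    path = find_path(mo, root_class, leaf_class)
    if path is None:
        raise ValueError("%s is not reachable from %s" % (leaf_class, root_class))
    if leaf_value not in mo[leaf_class]["instances"]:
        raise ValueError("%r is not an instance of %s" % (leaf_value, leaf_class))
    # Innermost entry carries the instance value; outer entries name classes.
    entry = {"category": leaf_class, "value": leaf_value, "associations": []}
    for class_name in reversed(path[:-1]):
        entry = {"category": class_name, "value": class_name, "associations": [entry]}
    return entry


entry = build_entry(TOY_MO, "BioMaterialCharacteristics", "OrganismPart", "lung")
print(entry)
```

Building the entry from the leaf outwards means the final value is set first, so a nested result cannot be returned without it.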