BioPatML
Pattern sharing for the Genomic Sciences

2008 Microsoft eScience Workshop, 7-9 December, Indianapolis

Stefan Maetschke, Michael Towsey and James M. Hogan
MQUTeR (Microsoft QUT eResearch Centre)
Queensland University of Technology, Australia


Unifying the Description of Patterns in Biological Sequences

The BioPatML project includes:

– A comprehensive pattern description language
– Web services for pattern storage and searching
– Integration with the semantic web

BioPatML supports:

1. DNA, RNA and amino-acid (AA) sequences
2. Principled aggregation of different pattern types, e.g. motifs, gaps, loops
3. Hierarchical patterns
4. Pattern libraries
5. Integrated scoring of pattern matches
6. Some existing pattern databases, e.g. Prosite

BioPatML exploits the advantages of XML and RDF.

Simple Patterns

<Motif alphabet="DNA" motif="TA[AT]AAW" />

[Figure: the motif aligned against a DNA sequence]

<Motif alphabet="DNA" motif="TA[AT]AAW" name="Pribnow-box" threshold="0.5" />
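
To make the motif notation concrete, here is a minimal Python sketch that translates a motif string such as TA[AT]AAW into a regular expression and scans a sequence with it. It is not the BioPatML matcher. It assumes that W is the standard IUPAC code for "A or T", that bracketed groups list plain bases, and that matching is exact (the threshold attribute and any scoring are ignored). The helper names motif_to_regex and find_motif are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes, written as regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Translate a motif string into a regex (hypothetical helper)."""
    out = []
    in_class = False
    for ch in motif:
        if ch == "[":
            in_class = True
            out.append(ch)
        elif ch == "]":
            in_class = False
            out.append(ch)
        elif in_class:
            out.append(ch)          # assumed: plain bases inside [...]
        else:
            out.append(IUPAC[ch])   # expand ambiguity codes such as W
    return "".join(out)

def find_motif(sequence, motif):
    """Return (start, matched_text) for every match, overlaps included."""
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

print(motif_to_regex("TA[AT]AAW"))
print(find_motif("GGTATAATGG", "TA[AT]AAW"))
```

The lookahead wrapper is only there so that overlapping occurrences are reported; a real matcher would additionally score each hit and compare it with the threshold.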

Series Patterns

<Series ... >
  <Motif ... />
  <Gap ... />
  <Motif ... />
</Series>

[Figure: a bacterial promoter modelled as a Series of Motif, Gap, Motif: the -35 element (TTGACA), a gap, and the -10 element (TATAAT)]
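
The way a Series chains its components can be sketched in a few lines of Python. This is an illustration only, not BioPatML code: the classes Motif, Gap and Series below are hypothetical stand-ins that compile to a regular expression, matching is exact, and the gap bounds of 13 to 21 are borrowed from the Promoter definition shown later in the deck.

```python
import re
from dataclasses import dataclass

@dataclass
class Motif:
    motif: str
    def regex(self):
        return re.escape(self.motif)      # literal bases only in this sketch

@dataclass
class Gap:
    min_len: int
    max_len: int
    def regex(self):
        return ".{%d,%d}" % (self.min_len, self.max_len)

@dataclass
class Series:
    parts: list
    def regex(self):
        # One capture group per component, in order.
        return "".join("(" + p.regex() + ")" for p in self.parts)
    def search(self, sequence):
        m = re.search(self.regex(), sequence)
        return None if m is None else (m.start(), m.groups())

# -35 element, gap, -10 element, as in the bacterial promoter figure.
promoter = Series([Motif("TTGACA"), Gap(13, 21), Motif("TATAAT")])
sequence = "CC" + "TTGACA" + "A" * 17 + "TATAAT" + "GG"
print(promoter.search(sequence))
```

Returning one group per component keeps the hierarchical structure of the match visible, which is the point of describing the promoter as a series instead of one flat string.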

Libraries of Patterns

(BioPatML resource: uri=biopatml/promoter.bpl)

<Definition name="sigma70">
  <Definitions>
    <Definition name="-35element">
      <Motif motif="TTGACA" alphabet="DNA" />
    </Definition>
    <Definition name="-10element">
      <Motif motif="TATAAT" alphabet="DNA" />
    </Definition>
  </Definitions>
  <Void />
</Definition>


<Definition name="Promoter">
  <Definitions>
    <Import uri="biopatml/promoter.bpl" />
  </Definitions>
  <Series ... >
    <Use definition="sigma70.-35element" />
    <Gap min="13" max="21" />
    <Use definition="sigma70.-10element" />
  </Series>
</Definition>

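
How the Import and Use elements might be resolved can be sketched with Python's standard xml.etree.ElementTree module. Everything here is an assumption made for illustration and not the BioPatML implementation: the XML is a repaired, well-formed reading of the slide markup (the attributes elided on Series are simply omitted), the library is looked up in an in-memory dictionary keyed by uri, imported names are registered at top level, and the functions collect, pattern_of and resolve are hypothetical.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for fetching a library by uri (assumption).
LIBRARY = {"biopatml/promoter.bpl": """
<Definition name="sigma70">
  <Definitions>
    <Definition name="-35element">
      <Motif motif="TTGACA" alphabet="DNA" />
    </Definition>
    <Definition name="-10element">
      <Motif motif="TATAAT" alphabet="DNA" />
    </Definition>
  </Definitions>
  <Void />
</Definition>
"""}

PROMOTER = """
<Definition name="Promoter">
  <Definitions>
    <Import uri="biopatml/promoter.bpl" />
  </Definitions>
  <Series>
    <Use definition="sigma70.-35element" />
    <Gap min="13" max="21" />
    <Use definition="sigma70.-10element" />
  </Series>
</Definition>
"""

def collect(defn, prefix, table):
    """Register a Definition and its nested Definitions under dotted names."""
    name = prefix + defn.get("name")
    table[name] = defn
    inner = defn.find("Definitions")
    if inner is not None:
        for child in inner:
            if child.tag == "Definition":
                collect(child, name + ".", table)
            elif child.tag == "Import":
                imported = ET.fromstring(LIBRARY[child.get("uri")])
                collect(imported, "", table)

def pattern_of(defn):
    """Return the first child of a Definition that is not <Definitions>."""
    for child in defn:
        if child.tag != "Definitions":
            return child
    return None

def resolve(xml_text):
    """Flatten the Series of a definition, replacing each Use by its target."""
    root = ET.fromstring(xml_text)
    table = {}
    collect(root, "", table)
    parts = []
    for el in pattern_of(root):
        if el.tag == "Use":
            el = pattern_of(table[el.get("definition")])
        if el.tag == "Motif":
            parts.append(("Motif", el.get("motif")))
        elif el.tag == "Gap":
            parts.append(("Gap", int(el.get("min")), int(el.get("max"))))
    return parts

print(resolve(PROMOTER))
```

The dotted names mirror the references used on the slide (sigma70.-35element), so a pattern can reuse library components without repeating their motifs.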

BioPatML Web services
http://bio.mquter.qut.edu.au/biopatml

[Screenshot of the web interface, with panels labelled: Pattern creation, Annotation, XML, Semantic tagging]

SilverGene: Genome browser

[Screenshot of the genome browser, with labels: Gene CT323, Pattern matches]

BioPatML in the Semantic Web

BioPatML is part of the Bio2RDF project.

Bio2RDF is an initiative of the Quebec Genomics Centre and Université Laval.

Described as "a new integrated way to surf genomic knowledge".

[Figure: The world according to Bio2RDF]

BioPatML in the Semantic Web

BioPatML in Bio2RDF
– created a namespace and terms
– http://bio2rdf.org/ns/biopatml

Created an RDF database of BioPatML patterns
– encapsulate BioPatML patterns as RDF literals
– RDF tagging and search
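
Encapsulating a pattern as an RDF literal can be illustrated with a small Python sketch that writes N-Triples lines by hand. Only the namespace http://bio2rdf.org/ns/biopatml comes from the slides. The term names #pattern and #tag, the example.org subject URI and the function names are hypothetical placeholders, not the actual Bio2RDF vocabulary.

```python
NS = "http://bio2rdf.org/ns/biopatml"   # namespace given on the slide

def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def pattern_triples(subject_uri, pattern_xml, tags):
    """Build N-Triples lines: the pattern XML as one literal, plus tags.

    The predicates '#pattern' and '#tag' are hypothetical term names.
    """
    lines = ['<%s> <%s#pattern> "%s" .'
             % (subject_uri, NS, escape_literal(pattern_xml))]
    for tag in tags:
        lines.append('<%s> <%s#tag> "%s" .'
                     % (subject_uri, NS, escape_literal(tag)))
    return lines

xml = '<Motif alphabet="DNA" motif="TA[AT]AAW" name="Pribnow-box" />'
for line in pattern_triples("http://example.org/pattern/pribnow-box",
                            xml, ["promoter"]):
    print(line)
```

Storing the whole pattern as a single literal keeps the XML intact for the matcher, while the separate tag triples are what a semantic search would query.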

BioPatML: Semantic Tagging

BioPatML Resources

http://bio.mquter.qut.edu.au/biopatml (web demo)

http://www.mquter.qut.edu.au/bio (BioPatML manual)

http://bio2rdf.org/ns/biopatml (namespace & terms)

http://bio2rdf.org (Bio2RDF home page)

Bioinformatics team at MQUTeR

Scott Mann

Lawrence Buckingham

Jim Hogan

Chris Bowles

Xin-Yi Chua

Michael Towsey

Peter Ansell

Jiro Sumitomo