Data integration challenge entry presentation for iEvoBio 2011.
Phenoscape Knowledgebase
Jim Balhoff, Wasila Dahdul, Hilmar Lapp, Paula Mabee, Peter Midford, Todd Vision, Monte Westerfield
Phenoscape
• Collaboration between P. Mabee (U. South Dakota), M. Westerfield (ZFIN, U. Oregon), and T. Vision (NESCent, UNC)
• Aim: foster semantic integration of phenotype data by
  • prototyping a database of curated, machine-interpretable evolutionary phenotypes;
  • integrating these with mutant phenotypes from model organisms;
  • providing reasoner-enabled semantic tools that facilitate data mining of phenotypic diversity and discovery of candidate genes for evolutionary phenotype transitions.
Phenoscape Knowledgebase
• ~4 million asserted semantic links (+ 7 million inferred)
• 52 fish systematics publications
  • 4,700 morphological characters
  • 11,000 states + ontological descriptions (EQ)
  • 2,500 referenced taxa
• ZFIN
  • 3,900 genes, 6,800 genotypes
  • 30,000 experimental phenotype annotations
• 13 ontologies
  • ~80,000 terms
Workflow for phenotype annotation
1. Students: gather publications (scan hard copies, produce OCR PDFs)
2. Students: manual entry of free-text character descriptions, matrix, taxon list, specimens, and museum numbers using Phenex
3. Character annotation by experts: entry of phenotypes using Phenex
4. Phenoscape Knowledgebase: OBD, data services, web application
Dahdul et al., 2010, PLoS ONE
501,862 phenotypes for taxa
Knowledgebase architecture
[Architecture diagram. Components shown:]
• Skeletal Character Data (from phylogenetic treatments in literature)
• Phenex (Evolutionary EQ annotation)
• NeXML
• Evolutionary EQ Phenotypes (through annotation)
• Homology assertions
• Mutant EQ phenotypes from Zebrafish Model Organism Database
• Genes & genotypes
• OBO Library
• Anatomy Ontologies (ZFA, TAO)
• Phenotypic Quality Ontology (PATO)
• Teleost Taxonomy Ontology (TTO)
• Knowledgebase (OBD) (PostgreSQL)
• OBD Reasoner
• OBD Programming API (Java)
• Knowledgebase Data Services API (REST)
• Knowledgebase User Interface: Web Application for Exploration & Mining (Ruby on Rails, JavaScript)
• External web sites and client applications
OBD: Ontology-Based Database
• Stores data and ontologies in combined semantic model - triple based
• Reasoner executes inference rules as SQL queries - results iteratively added to database
• Supports class expressions based on property restrictions, intersections, unions; transitive properties, property chains, and subsumption
• Provenance-tracking via reification
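The idea of running inference rules as SQL and feeding the results back into the database until nothing new appears can be illustrated with a small sketch. This is not OBD's actual schema or rule set: the `link` table, its columns, the two rules, and the toy terms `a`, `b`, `c`, `d`, `x` are all assumptions made for illustration, and SQLite stands in for PostgreSQL so the example is self-contained.

```python
import sqlite3

# Hypothetical triple table; OBD's real schema is not shown in the slides.
SCHEMA = """
CREATE TABLE link (
    subject   TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object    TEXT NOT NULL,
    inferred  INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (subject, predicate, object)
)
"""

# Each rule is one SQL statement that derives new triples from existing ones.
RULES = [
    # Transitivity of is_a: (a is_a b), (b is_a c) => (a is_a c)
    """INSERT OR IGNORE INTO link (subject, predicate, object, inferred)
       SELECT x.subject, 'is_a', y.object, 1
       FROM link x JOIN link y ON x.object = y.subject
       WHERE x.predicate = 'is_a' AND y.predicate = 'is_a'""",
    # Propagation over is_a: (a part_of b), (b is_a c) => (a part_of c)
    """INSERT OR IGNORE INTO link (subject, predicate, object, inferred)
       SELECT x.subject, 'part_of', y.object, 1
       FROM link x JOIN link y ON x.object = y.subject
       WHERE x.predicate = 'part_of' AND y.predicate = 'is_a'""",
]


def count_links(conn):
    """Number of triples currently stored."""
    return conn.execute("SELECT COUNT(*) FROM link").fetchone()[0]


def saturate(conn):
    """Re-run every rule until a full pass adds no new triples."""
    while True:
        before = count_links(conn)
        for rule in RULES:
            conn.execute(rule)
        if count_links(conn) == before:
            return


def build_demo():
    """In-memory database holding four asserted toy triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO link (subject, predicate, object) VALUES (?, ?, ?)",
        [("a", "is_a", "b"), ("b", "is_a", "c"),
         ("c", "is_a", "d"), ("x", "part_of", "a")],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    saturate(conn)
    for row in conn.execute(
        "SELECT subject, predicate, object FROM link WHERE inferred = 1 "
        "ORDER BY subject, predicate, object"
    ):
        print(row)
```

The `inferred` flag keeps asserted and derived links distinguishable, in the spirit of the asserted/inferred split quoted for the Knowledgebase; the primary key plus `INSERT OR IGNORE` is what lets the loop terminate.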
Reasoning across logical relationships
• Brachyplatystoma capapretum exhibits some (round that inheres_in some ethmoid cartilage)
• tfap2a ts213/ts213 influences some (split that inheres_in some ethmoid cartilage)
[Figure: graph connecting the two annotations through ontology terms. Terms: round, split, shape, ethmoid cartilage, chondrocranium cartilage, olfactory region, Brachyplatystoma, Pimelodidae, tfap2a, sequence-specific DNA binding transcription factor activity. Relations: is_a, variant_of, inheres_in, part_of, has_function.]
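As a rough illustration of how a taxon phenotype and a mutant phenotype can be linked through a shared, more general phenotype, the sketch below walks is_a chains for the quality and the entity of two EQ pairs. The two `IS_A` edges (round and split both being kinds of shape) are one plausible reading of the slide's graph, not a statement of what the ontologies actually assert, and `common_subsumer` is a hypothetical helper rather than part of the OBD API.

```python
# Assumed is_a edges, read off the slide's graph; treat them as illustrative only.
IS_A = {
    "round": "shape",
    "split": "shape",
}


def ancestors(term):
    """Return the term followed by its is_a ancestors, nearest first."""
    chain = [term]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def nearest_common(term1, term2):
    """Most specific term that subsumes both inputs, or None."""
    others = set(ancestors(term2))
    return next((t for t in ancestors(term1) if t in others), None)


def common_subsumer(eq1, eq2):
    """Most specific (quality, entity) pair subsuming two EQ phenotypes."""
    quality = nearest_common(eq1[0], eq2[0])
    entity = nearest_common(eq1[1], eq2[1])
    if quality is None or entity is None:
        return None
    return (quality, entity)


if __name__ == "__main__":
    taxon_phenotype = ("round", "ethmoid cartilage")  # Brachyplatystoma capapretum
    gene_phenotype = ("split", "ethmoid cartilage")   # tfap2a ts213/ts213
    print(common_subsumer(taxon_phenotype, gene_phenotype))
```

A single-parent dictionary keeps the sketch short; real ontologies allow multiple parents, so a production version would need a graph search instead of a simple chain walk.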
Demo
[Figure: Phenotype variation in taxa (left) vs. zebrafish mutants (right), by anatomical system. Systems shown, each is_a anatomical system: cardiovascular, sensory, nervous, digestive, endocrine, hematopoietic, immune, liver and biliary, musculature, renal, reproductive, respiratory, skeletal. Legend: >85%, 20-30%, 15-19%, 10-14%, 5-9%, 1-4%, <1%.]
Distributed across anatomical systems
[Chart: skeletal data by taxonomic order (Siluriformes, Cypriniformes, Characiformes, Gymnotiformes, Gonorynchiformes, Clupeiformes), broken down by region (postcranial axial skeleton, paired fins, cranium); axis ticks from 0 to 70.]
Global view of skeletal data
Image from Sabaj-Perez
Skeletal variation across taxa and regions can be viewed, summarized, and synthesized at a scale not otherwise possible.
Similarity (IC) | Taxon | Taxon phenotype | Candidate gene | Gene phenotype | Subsuming phenotype
13.43 | Eels | gill rakers, absent | eda | gill rakers, absent | gill rakers, count
12.85 | Minytrema | lateral line, absent | pcsk5a | lateral line, reduced | lateral line, size
12.44 | Siluriformes | basihyal, absent | brpf1 | basihyal, absent | basihyal, count
11.32 | Gyrinocheilus | ceratobranchial 5 teeth, absent | eda | ceratobranchial teeth, absent | tooth, count
11.11 | Siluriformes | scales, absent | eda | scales, absent | scales, count
9.91 | Siluriformes | opercle, shape | edn1 | maxilla, shape | dermatocranium, shape
9.19 | Mola | caudal fin, absent | yap1 | median fin, absent | median fin, count
9.19 | Gonorynchiformes | dorsal fin, absent | tf1p2a | median fin, absent | median fin, count
5.17 | Siluriformes | eye, reduced | pbx2 | eye, small | eye, size
Quantify phenotypic similarity
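The slides do not spell out how the IC score is computed. A minimal sketch, assuming the common corpus-frequency definition of information content (the negative log of the fraction of annotated items that fall under the subsuming phenotype) and an assumed base-2 logarithm, might look like this. The function name and the counts in the example are hypothetical and are not taken from the Knowledgebase.

```python
import math


def information_content(annotated_under_term, total_annotated):
    """IC of a subsuming phenotype: -log2 of its annotation frequency.

    Assumes the standard frequency-based definition; the slide does not
    state which definition or log base Phenoscape uses.
    """
    if not 0 < annotated_under_term <= total_annotated:
        raise ValueError("need 0 < annotated_under_term <= total_annotated")
    return -math.log2(annotated_under_term / total_annotated)


if __name__ == "__main__":
    # Hypothetical counts: a phenotype covering 125 of 1000 annotated items
    # is rarer, and therefore more informative, than one covering 500.
    print(information_content(125, 1000))
    print(information_content(500, 1000))
```

Under this definition a rarer subsuming phenotype scores higher, which is consistent with specific matches ranking above broad ones in a similarity table.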
Summary
Semantic framework and reasoning tools provide:
• Powerful queries not previously possible for evolutionary phenotype data
• Meaningful integration with model organism phenotypic and genetic data
Acknowledgments
Contributors to Teleost Ontologies: Gloria Arratia, Stan Blum, Miles Coburn, Kevin Conway, Wasila Dahdul, Mário de Pinna, Jeff Engemen, Bill Eschmeyer, Terry Grande, Melissa Haendel, Brian Hall, Eric Hilton, John Lundberg, Richard Mayden, Mark Sabaj Pérez, Brian Sidlauskas, Richard Vari, Jacqueline Webb, Edward Wiley
National Science Foundation (BDI-0641025)
National Evolutionary Synthesis Center
Phenoscape Workshop Participants & Contributors: Arhat Abzhanov, Michael Ashburner, Judith Blake, Stan Blum, Quentin Cronk, Mário de Pinna, Andy Deans, George Gkoutos, Melissa Haendel, Hopi Hoekstra, Hans Hofmann, Elizabeth Jockusch, Elizabeth Kellogg, Chuck Kimmel, Suzanna Lewis, Anne Maglia, Austin Mast, Chris Mungall, Martin Ramirez, Sue Rhee, Martin Ringwald, Nelson Rios, Mark Sabaj Pérez, Eric Segerdell, Brian Sidlauskas, Barry Smith, David Stern, Peter Vize, Gunter Wagner, Nicole Washington, Edward Wiley
Curators: Miles Coburn, Jeff Engemen, Terry Grande, Eric Hilton, John Lundberg, Paula Mabee, Richard Mayden, Mark Sabaj, Sandrine Tercerie