Phenoscape Knowledgebase Jim Balhoff, Wasila Dahdul, Hilmar Lapp, Paula Mabee, Peter Midford, Todd Vision, Monte Westerfield

The Phenoscape Knowledgebase

DESCRIPTION

Data integration challenge entry presentation for iEvoBio 2011.

Page 1: The Phenoscape Knowledgebase

Phenoscape Knowledgebase

Jim Balhoff, Wasila Dahdul, Hilmar Lapp, Paula Mabee, Peter Midford, Todd Vision, Monte Westerfield

Page 2: The Phenoscape Knowledgebase

Phenoscape

• Collaboration between P. Mabee (U. South Dakota), M. Westerfield (ZFIN, U. Oregon), and T. Vision (NESCent, UNC)

• Aim: foster semantic integration of phenotype data by

• Prototyping a database of curated, machine-interpretable evolutionary phenotypes.

• Integrating these with mutant phenotypes from model organisms.

• Providing reasoner-enabled semantic tools which facilitate data-mining of phenotypic diversity and discovery of candidate genes for evolutionary phenotype transitions.

Page 3: The Phenoscape Knowledgebase

Phenoscape Knowledgebase

• ~4 million asserted semantic links (+ 7 million inferred)

• 52 fish systematics publications

  • 4700 morphological characters

  • 11,000 states + ontological descriptions (EQ)

  • 2500 referenced taxa

• ZFIN

  • 3900 genes, 6800 genotypes

  • 30,000 experimental phenotype annotations

• 13 ontologies

  • ~80,000 terms

Page 4: The Phenoscape Knowledgebase

Workflow for phenotype annotation

1. Students: gather publications (scan hard copies, produce OCR PDFs)

2. Students: Manual entry of free text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex

3. Character annotation by experts: Entry of phenotypes using Phenex

4. Phenoscape Knowledgebase: OBD, data services, web application

Dahdul et al., 2010 PLoS ONE

501,862 phenotypes for taxa

Page 5: The Phenoscape Knowledgebase

Knowledgebase architecture

[Architecture diagram; components shown:]

• Phenotypic Quality Ontology (PATO)

• Teleost Taxonomy Ontology (TTO)

• Anatomy Ontologies (ZFA, TAO)

• OBO Library

• Genes & genotypes

• Mutant EQ phenotypes from Zebrafish Model Organism Database

• Homology assertions

• Skeletal Character Data (from phylogenetic treatments in literature)

• Phenex (Evolutionary EQ annotation)

• NeXML

• Evolutionary EQ Phenotypes (through annotation)

• Knowledgebase (OBD) (PostgreSQL)

• OBD Reasoner

• OBD Programming API (Java)

• Knowledgebase Data Services API (REST)

• Knowledgebase User Interface: Web Application for Exploration & Mining (Ruby on Rails, JavaScript)

• External web sites and client applications

Page 6: The Phenoscape Knowledgebase

OBD: Ontology-Based Database

• Stores data and ontologies in combined semantic model - triple based

• Reasoner executes inference rules as SQL queries - results iteratively added to database

• Supports class expressions based on property restrictions, intersections, unions; transitive properties, property chains, and subsumption

• Provenance-tracking via reification
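
The rule-as-SQL idea in the bullets above can be illustrated with a minimal sketch. It is not OBD's actual schema or rule set: the `link` table, the three rules, and the toy anatomy links are assumptions made for illustration, and SQLite (from the Python standard library) stands in for PostgreSQL so the example is self-contained. Each rule is an INSERT ... SELECT over stored triples, and the rules are re-run until a full pass adds nothing new.

```python
import sqlite3

# Each rule is one INSERT ... SELECT that derives new links from stored ones.
# The rule set is illustrative only, not OBD's actual rule list.
RULES = {
    "is_a is transitive": """
        INSERT OR IGNORE INTO link (subject, predicate, object, inferred)
        SELECT a.subject, 'is_a', b.object, 1
        FROM link a JOIN link b ON a.object = b.subject
        WHERE a.predicate = 'is_a' AND b.predicate = 'is_a'
    """,
    "part_of is transitive": """
        INSERT OR IGNORE INTO link (subject, predicate, object, inferred)
        SELECT a.subject, 'part_of', b.object, 1
        FROM link a JOIN link b ON a.object = b.subject
        WHERE a.predicate = 'part_of' AND b.predicate = 'part_of'
    """,
    "is_a followed by part_of gives part_of": """
        INSERT OR IGNORE INTO link (subject, predicate, object, inferred)
        SELECT a.subject, 'part_of', b.object, 1
        FROM link a JOIN link b ON a.object = b.subject
        WHERE a.predicate = 'is_a' AND b.predicate = 'part_of'
    """,
}


def open_demo_db():
    """Create an in-memory triple table holding a few toy asserted links."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE link (
               subject   TEXT NOT NULL,
               predicate TEXT NOT NULL,
               object    TEXT NOT NULL,
               inferred  INTEGER NOT NULL DEFAULT 0,
               PRIMARY KEY (subject, predicate, object))"""
    )
    conn.executemany(
        "INSERT INTO link (subject, predicate, object) VALUES (?, ?, ?)",
        [
            # Toy links for illustration, not copied from the real ontologies.
            ("ethmoid cartilage", "is_a", "chondrocranium cartilage"),
            ("chondrocranium cartilage", "part_of", "chondrocranium"),
            ("chondrocranium", "part_of", "cranium"),
        ],
    )
    return conn


def count_links(conn):
    return conn.execute("SELECT COUNT(*) FROM link").fetchone()[0]


def run_reasoner(conn):
    """Apply every rule repeatedly until a full pass adds no rows.

    Returns the number of links added by inference.
    """
    start = count_links(conn)
    while True:
        before = count_links(conn)
        for sql in RULES.values():
            conn.execute(sql)
        if count_links(conn) == before:
            return before - start


if __name__ == "__main__":
    conn = open_demo_db()
    print(run_reasoner(conn), "links inferred")
    query = "SELECT subject, predicate, object FROM link WHERE inferred = 1"
    for row in conn.execute(query):
        print(row)
```

Storing inferred links in the same table as asserted ones (flagged by the hypothetical `inferred` column) means later queries need no reasoning at query time, which matches the slide's description of results being added to the database.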

Page 7: The Phenoscape Knowledgebase

Reasoning across logical relationships

• Brachyplatystoma capapretum exhibits some (round that inheres_in some ethmoid cartilage)

• tfap2a^ts213/ts213 influences some (split that inheres_in some ethmoid cartilage)

Page 8: The Phenoscape Knowledgebase

Reasoning across logical relationships

• Brachyplatystoma capapretum exhibits some (round that inheres_in some ethmoid cartilage)

• tfap2a^ts213/ts213 influences some (split that inheres_in some ethmoid cartilage)

[Diagram. Terms linked to the two statements: round, split, ethmoid cartilage, Brachyplatystoma, tfap2a. Relations shown: is_a, variant_of, inheres_in.]

Page 9: The Phenoscape Knowledgebase

Reasoning across logical relationships

• Brachyplatystoma capapretum exhibits some (round that inheres_in some ethmoid cartilage)

• tfap2a^ts213/ts213 influences some (split that inheres_in some ethmoid cartilage)

[Diagram. Terms linked to the two statements: round, split, ethmoid cartilage, Brachyplatystoma, tfap2a. Further terms: shape, chondrocranium cartilage, olfactory region, Pimelodidae, sequence-specific DNA binding transcription factor activity. Relations shown: is_a, variant_of, inheres_in, has_function, part_of.]
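
How a taxon phenotype and a mutant phenotype can meet at a shared, more general phenotype can be sketched as follows. The term names come from the slide, but which term each is_a link connects is an assumption here (the extracted slide text does not preserve the arrows), and `superclasses` and `subsumers` are hypothetical helper names. Only is_a is followed; part_of and the other relations are left out to keep the sketch short.

```python
# is_a links between slide terms. The pairing of terms is an assumption,
# since the slide's arrows did not survive text extraction.
IS_A = {
    "round": {"shape"},
    "split": {"shape"},
    "ethmoid cartilage": {"chondrocranium cartilage"},
}


def superclasses(term):
    """Return the term plus everything reachable from it over is_a."""
    seen, stack = {term}, [term]
    while stack:
        for parent in IS_A.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumers(phenotype):
    """All (quality, entity) pairs that generalize an EQ phenotype."""
    quality, entity = phenotype
    return {(q, e) for q in superclasses(quality) for e in superclasses(entity)}


# EQ phenotypes from the two statements on the slide.
taxon_phenotype = ("round", "ethmoid cartilage")  # Brachyplatystoma capapretum
gene_phenotype = ("split", "ethmoid cartilage")   # tfap2a mutant genotype

shared = subsumers(taxon_phenotype) & subsumers(gene_phenotype)

if __name__ == "__main__":
    for quality, entity in sorted(shared):
        print(f"{quality} that inheres_in some {entity}")
```

Neither phenotype subsumes the other (round and split are different qualities), yet both are kinds of "shape of ethmoid cartilage", which is what lets a query for one surface the other.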

Page 10: The Phenoscape Knowledgebase

Demo

Page 11: The Phenoscape Knowledgebase

Distributed across anatomical systems

Phenotype variation in taxa (left) vs. zebrafish mutants (right)

[Figure: anatomical systems, each linked by is_a to "anatomical system": cardiovascular, sensory, nervous, digestive, endocrine, hematopoietic, immune, liver and biliary, musculature, renal, reproductive, respiratory, skeletal. Color legend: >85%, 20-30%, 15-19%, 10-14%, 5-9%, 1-4%, <1%.]

Page 12: The Phenoscape Knowledgebase

Global view of skeletal data

[Bar chart, axis from 0 to 70. Taxa: Siluriformes, Cypriniformes, Characiformes, Gymnotiformes, Gonorynchiformes, Clupeiformes. Series: Postcranial axial skeleton, Paired fins, Cranium.]

Image from Sabaj-Perez

Skeletal variation across taxa and regions viewed, summarized, synthesized, at a scale not possible otherwise.

Page 13: The Phenoscape Knowledgebase

Quantify phenotypic similarity

Similarity (IC) | Taxon | Taxon phenotype | Candidate gene | Gene phenotype | Subsuming phenotype
13.43 | Eels | gill rakers, absent | eda | gill rakers, absent | gill rakers, count
12.85 | Minytrema | lateral line, absent | pcsk5a | lateral line, reduced | lateral line, size
12.44 | Siluriformes | basihyal, absent | brpf1 | basihyal, absent | basihyal, count
11.32 | Gyrinocheilus | ceratobranchial 5 teeth, absent | eda | ceratobranchial teeth, absent | tooth, count
11.11 | Siluriformes | scales, absent | eda | scales, absent | scales, count
9.91 | Siluriformes | opercle, shape | edn1 | maxilla, shape | dermatocranium, shape
9.19 | Mola | caudal fin, absent | yap1 | median fin, absent | median fin, count
9.19 | Gonorynchiformes | dorsal fin, absent | tf1p2a | median fin, absent | median fin, count
5.17 | Siluriformes | eye, reduced | pbx2 | eye, small | eye, size
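
The slide does not say how the IC scores are computed. A common approach, assumed here, is to score two phenotypes by the information content of their most specific shared subsuming phenotype, where IC is the negative log of the fraction of annotations that phenotype covers. The sketch below follows that approach. The annotation counts are invented so the arithmetic is easy to follow, and the function names are hypothetical.

```python
import math


def information_content(annotation_count, total_annotations):
    """IC = -log2(p), where p is the share of annotations a phenotype covers."""
    return -math.log2(annotation_count / total_annotations)


def similarity(subsumers_a, subsumers_b, counts, total):
    """Score two phenotypes by their most informative common subsumer.

    Returns (subsuming phenotype, IC), or (None, 0.0) if nothing is shared.
    """
    shared = subsumers_a & subsumers_b
    if not shared:
        return None, 0.0
    best = max(shared, key=lambda term: information_content(counts[term], total))
    return best, information_content(counts[best], total)


# Hypothetical annotation counts: rarer phenotypes carry more information.
TOTAL = 1024
COUNTS = {
    "median fin, count": 2,
    "fin, count": 32,
    "anatomical structure, quality": 1024,
}

# Hypothetical subsumer sets for a taxon phenotype and a gene phenotype.
taxon_subsumers = {"median fin, count", "fin, count", "anatomical structure, quality"}
gene_subsumers = {"median fin, count", "fin, count", "anatomical structure, quality"}

if __name__ == "__main__":
    term, score = similarity(taxon_subsumers, gene_subsumers, COUNTS, TOTAL)
    print(term, score)
```

Under this scheme a match on a rarely annotated phenotype scores higher than a match on a common one, which is consistent with the table ranking specific absences above the more general "eye, size".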

Page 14: The Phenoscape Knowledgebase

Summary

Semantic framework and reasoning tools provide:

• Powerful queries not previously possible for evolutionary phenotype data

• Meaningful integration with model organism phenotypic and genetic data

Page 15: The Phenoscape Knowledgebase

Acknowledgments

Contributors to Teleost Ontologies: Gloria Arratia, Stan Blum, Miles Coburn, Kevin Conway, Wasila Dahdul, Mário de Pinna, Jeff Engemen, Bill Eschmeyer, Terry Grande, Melissa Haendel, Brian Hall, Eric Hilton, John Lundberg, Richard Mayden, Mark Sabaj Pérez, Brian Sidlauskas, Richard Vari, Jacqueline Webb, Edward Wiley

National Science Foundation (BDI-0641025)

National Evolutionary Synthesis Center

Phenoscape Workshop Participants & Contributors: Arhat Abzhanov, Michael Ashburner, Judith Blake, Stan Blum, Quentin Cronk, Mário de Pinna, Andy Deans, George Gkoutos, Melissa Haendel, Hopi Hoekstra, Hans Hofmann, Elizabeth Jockusch, Elizabeth Kellogg, Chuck Kimmel, Suzanna Lewis, Anne Maglia, Austin Mast, Chris Mungall, Martin Ramirez, Sue Rhee, Martin Ringwald, Nelson Rios, Mark Sabaj Pérez, Eric Segerdell, Brian Sidlauskas, Barry Smith, David Stern, Peter Vize, Gunter Wagner, Nicole Washington, Edward Wiley

Curators: Miles Coburn, Jeff Engemen, Terry Grande, Eric Hilton, John Lundberg, Paula Mabee, Richard Mayden, Mark Sabaj, Sandrine Tercerie