PATO & Phenotypes: From model organisms to clinical medicine. Suzanna Lewis, September 4th, 2008. Signs, Symptoms and Findings Workshop: First Steps Toward an Ontology of Clinical Phenotypes.

PATO & Phenotypes: From model organisms to clinical medicine


Page 1: PATO & Phenotypes: From model organisms to clinical medicine

PATO & Phenotypes: From model organisms to clinical medicine

Suzanna Lewis

September 4th, 2008

Signs, Symptoms and Findings Workshop: First Steps Toward an Ontology of Clinical Phenotypes

Page 2: PATO & Phenotypes: From model organisms to clinical medicine

Describing phenotype using ontologies will aid in the identification of models of disease & candidate causative genes

GWAS (Genome-Wide Association Studies): any study of genetic variation across the entire human genome that is designed to identify genetic associations with observable traits (such as blood pressure or weight), or the presence or absence of a disease or condition.

Given an identified gene, then what?

Page 3: PATO & Phenotypes: From model organisms to clinical medicine

Animal disease models

Animal models

Mutant Gene

Mutant or missing Protein

Mutant Phenotype (disease model)

Page 4: PATO & Phenotypes: From model organisms to clinical medicine

Mutant Gene

Mutant or missing Protein

Mutant Phenotype (disease)

Humans Animal models

Mutant Gene

Mutant or missing Protein

Mutant Phenotype (disease model)

Animal disease models

Page 5: PATO & Phenotypes: From model organisms to clinical medicine

Mutant Gene

Mutant or missing Protein

Mutant Phenotype (disease model)

Mutant Gene

Mutant or missing Protein

Mutant Phenotype (disease)

Animal disease models

Humans Animal models

Page 6: PATO & Phenotypes: From model organisms to clinical medicine

Mutant Gene

Mutant or missing Protein

Mutant Phenotype (disease)

Humans Animal models

Mutant Gene

Mutant or missing Protein

Mutant Phenotype (disease model)

Animal disease models

Page 7: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype data mining = text searching?

Text-based phenotype resources: OMIM (NCBI), DECIPHER (Sanger), HGMD (Cardiff), disease-specific databases, MODs (model organism databases), PubMed

Page 8: PATO & Phenotypes: From model organisms to clinical medicine

Information retrieval from text-based resources (OMIM) is not straightforward:

Query                      # of records
"large bone"               713
"enlarged bone"            136
"big bones"                16
"huge bones"               4
"massive bones"            28
"hyperplastic bones"       8
"hyperplastic bone"        34
"bone hyperplasia"         122
"increased bone growth"    543

Thanks to: M Ashburner

Page 9: PATO & Phenotypes: From model organisms to clinical medicine

Even if we can find what we are looking for in one organism, how can we associate that with phenotypes observed in different organisms?

Methods to link phenotypic descriptions of human diseases to animal models currently don’t exist.

Page 10: PATO & Phenotypes: From model organisms to clinical medicine

Goal: Turn text-based phenotypes into ontology-based computable annotations

Define a model for representing phenotypes

Page 11: PATO & Phenotypes: From model organisms to clinical medicine

SHH-/+ SHH-/-

shh-/+ shh-/-

Page 12: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype (clinical sign) = entity + attribute

Page 13: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype (clinical sign) = entity + attribute

P1 = eye + hypoteloric

Page 14: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype (clinical sign) = entity + attribute

P1 = eye + hypoteloric

P2 = midface + hypoplastic

Page 15: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype (clinical sign) = entity + attribute

P1 = eye + hypoteloric

P2 = midface + hypoplastic

P3 = kidney + hypertrophied

Page 16: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype (clinical sign) = entity + attribute

P1 = eye + hypoteloric

P2 = midface + hypoplastic

P3 = kidney + hypertrophied

ZFIN: eye, midface, kidney

+

PATO: hypoteloric, hypoplastic, hypertrophied

Page 17: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype (clinical sign) = entity + attribute

Anatomical ontology

Cell & tissue ontology

Developmental ontology

Gene ontology: biological process, molecular function, cellular component

+ PATO (phenotype and trait ontology)

Page 18: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype (clinical sign) = entity + attribute

P1 = eye + hypoteloric

P2 = midface + hypoplastic

P3 = kidney + hypertrophied

Syndrome = P1 + P2 + P3 (disease) (package) = holoprosencephaly
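
The entity + attribute model on these slides can be sketched in a few lines of Python. This is only an illustration, not the PATO or Phenote data model: the class name `Phenotype`, the helper `syndromes_with`, and the use of plain strings instead of ontology term identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    """One phenotype (clinical sign): an entity plus an attribute."""
    entity: str     # e.g. an anatomy term such as "eye"
    attribute: str  # e.g. a PATO quality such as "hypoteloric"


# The three example phenotypes from the slides.
p1 = Phenotype("eye", "hypoteloric")
p2 = Phenotype("midface", "hypoplastic")
p3 = Phenotype("kidney", "hypertrophied")

# A syndrome is modelled here as a named package of phenotypes.
syndromes = {"holoprosencephaly": frozenset({p1, p2, p3})}


def syndromes_with(entity: str, attribute: str) -> list[str]:
    """Return the names of syndromes that include the given phenotype."""
    wanted = Phenotype(entity, attribute)
    return sorted(name for name, signs in syndromes.items() if wanted in signs)
```

Because each phenotype is a structured pair rather than free text, a lookup such as `syndromes_with("eye", "hypoteloric")` is an exact match on terms instead of a string search.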

Page 19: PATO & Phenotypes: From model organisms to clinical medicine

Phenotype annotation model

[Diagram labels: Genetic, Environment, Entity, Quality, relationship, Evidence, Qualifier, Units, Source, Assertion; Attribution (who makes the assertion); Properties (when, what organization)]

Page 20: PATO & Phenotypes: From model organisms to clinical medicine

OBD and annotations

[Diagram labels: investigator; experiment/investigation; observation; bio-entity; publish/create; communicate; read; agent (human/computer); community/expert; direct annotation; annotation (subj, relation, obj); information entity; computational representation; represents; influences; participates in; submit/consume; query/meta-analysis; local db (three instances); multiple schemas. Example annotations: Shh- / AbsenceOf aorta; Shh+ / Heart development.]

Source cited in the diagram: Dev Biol 2005 Jul 15;283(2):357-72, “Sonic hedgehog is required for cardiac outflow tract and neural crest cell development”

Page 21: PATO & Phenotypes: From model organisms to clinical medicine

Goal: Turn text-based phenotypes into ontology-based computable annotations

Define a model for representing phenotypes

Develop and extend requisite ontologies (for the entities being described: anatomies, processes, …)

Page 22: PATO & Phenotypes: From model organisms to clinical medicine

It is critical that ontologies are developed cooperatively so that their classification strategies augment one another.

Building a suite of orthogonal interoperable reference (evidence based) ontologies in the biomedical domain.

Truth springs from arguments amongst friends. (David Hume)

Page 23: PATO & Phenotypes: From model organisms to clinical medicine

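RELATION TO TIME: CONTINUANT (INDEPENDENT or DEPENDENT) vs. OCCURRENT

GRANULARITY: ORGAN AND ORGANISM

Independent continuant: Organism (NCBI Taxonomy?), Anatomical Entity (FMA, CARO)

Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO)

Occurrent: Biological Process (GO)

GRANULARITY: CELL AND CELLULAR COMPONENT

Independent continuant: Cell (CL), Cellular Component (FMA, GO)

Dependent continuant: Cellular Function (GO)

GRANULARITY: MOLECULE

Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)

Dependent continuant: Molecular Function (GO)

Occurrent: Molecular Process (GO)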

Page 24: PATO & Phenotypes: From model organisms to clinical medicine

Requisite ontologies

An ontology of qualities (PATO)

Organism specific anatomies

A controlled vocabulary of homologous and analogous anatomical structures (Uberon)

Gene Ontology

Cell Types

Page 25: PATO & Phenotypes: From model organisms to clinical medicine

Goal: Turn text-based phenotypes into ontology-based computable annotations

Define a model for representing phenotypes

Develop and extend requisite ontologies (for the entities being described: anatomies, processes, …)

Develop an intuitive annotation environment for rigorously capturing phenotypes (“semantic authoring”)

Page 26: PATO & Phenotypes: From model organisms to clinical medicine

Phenote: Simple software for annotating using ontologies

Provide tool for ontology-based annotation

Standardized model to record annotations for increased compatibility of data between disparate communities.

Simple & intuitive user interface (especially for users that don’t know/care about what an ontology is)

Easy-to-configure for different user-communities

Pluggable architecture for external applications to interface/embed in application

Provide interfaces with external SOAP and REST services for streamlined workflow (OBD, NCBI, EBI, etc).

www.phenote.org

Page 27: PATO & Phenotypes: From model organisms to clinical medicine

Ontologies can be utilized from various resources in OWL and OBO format:

CVS

BioPortal

External site

Local file

Page 28: PATO & Phenotypes: From model organisms to clinical medicine

Phenote tour

Page 29: PATO & Phenotypes: From model organisms to clinical medicine

Editor

Page 30: PATO & Phenotypes: From model organisms to clinical medicine

Refining terms on-the-spot

Post-composition: Join together 2 (or more) terms for specificity:

Apoptosis of neuron in skin (GO, CL, FMA)

S-phase of colon cancer cell (GO, CL)

Aster of human spermatocyte (GO, FMA)

Combine terms from different ontologies

Increase “information content” of an annotation

Pre-composed: Have decomposed definitions of ~2/3 of MP (Mammalian Phenotype) terms available to incorporate mouse data
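
Post-composition can be pictured as building a nested expression out of terms from different ontologies. The sketch below is a hedged illustration only: the classes `Term` and `PostComposed`, the relation names `occurs_in` and `part_of`, and the `^relation(filler)` rendering are assumptions chosen for the example, and labels are used in place of real term identifiers.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Term:
    ontology: str  # e.g. "GO", "CL", "FMA"
    label: str     # a label stands in for a real term ID here

    def __str__(self) -> str:
        return f"{self.ontology}:{self.label}"


@dataclass(frozen=True)
class PostComposed:
    """A term refined by a relation to another term or expression."""
    genus: Union[Term, "PostComposed"]
    relation: str  # hypothetical relation name, e.g. "occurs_in"
    filler: Union[Term, "PostComposed"]

    def __str__(self) -> str:
        return f"{self.genus}^{self.relation}({self.filler})"


def ontologies(expr: Union[Term, PostComposed]) -> set[str]:
    """Collect every ontology that contributes a term to the expression."""
    if isinstance(expr, Term):
        return {expr.ontology}
    return ontologies(expr.genus) | ontologies(expr.filler)


# "Apoptosis of neuron in skin" composed from GO, CL and FMA terms.
neuron_in_skin = PostComposed(Term("CL", "neuron"), "part_of", Term("FMA", "skin"))
apoptosis_of_neuron_in_skin = PostComposed(
    Term("GO", "apoptosis"), "occurs_in", neuron_in_skin
)
```

Nesting is what lets a curator reach the needed specificity on the spot without waiting for a new pre-composed term to be added to any one ontology.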

Page 31: PATO & Phenotypes: From model organisms to clinical medicine

Term Info Browser

Page 32: PATO & Phenotypes: From model organisms to clinical medicine

Annotation Table

Page 33: PATO & Phenotypes: From model organisms to clinical medicine

Retrieve data from NCBI: OMIM, PUBMED, …

(SOAP plug-in)

Page 34: PATO & Phenotypes: From model organisms to clinical medicine

Graphical Viewer

Page 35: PATO & Phenotypes: From model organisms to clinical medicine

Goal: Turn text-based phenotypes into ontology-based computable annotations

Define a model for representing phenotypes

Develop and extend requisite ontologies (for the entities being described: anatomies, processes, …)

Develop an intuitive annotation environment for rigorously capturing phenotypes (“semantic authoring”)

Develop a set of guidelines for biocurators

Annotate mutant phenotypes (OMIM and models)

Page 36: PATO & Phenotypes: From model organisms to clinical medicine

General Annotation Standards

Remarkable normality

Absence

Relative qualities (what does “small” mean?)

Rates/frequencies (does it inhere in the heart or a process?)

Homeotic transformation

Phenotypes specific to a stage or temporal duration

Page 37: PATO & Phenotypes: From model organisms to clinical medicine

Testing the methodology

Annotated 11 gene-linked human diseases described in OMIM, and their homologs in zebrafish and fruit fly:

ATP2A1, BRODY MYOPATHY
EPB41, ELLIPTOCYTOSIS
EXT2, MULTIPLE EXOSTOSES
EYA1, EYES ABSENT
FECH, PROTOPORPHYRIA
PAX2, RENAL-COLOBOMA SYNDROME
SHH, HOLOPROSENCEPHALY
SOX9, CAMPOMELIC DYSPLASIA
SOX10, PERIPHERAL DEMYELINATING NEUROPATHY
TNNT2, FAMILIAL HYPERTROPHIC CARDIOMYOPATHY
TTN, MUSCULAR DYSTROPHY

Page 38: PATO & Phenotypes: From model organisms to clinical medicine

An OMIM Record

Page 39: PATO & Phenotypes: From model organisms to clinical medicine

An OMIM Record

Page 40: PATO & Phenotypes: From model organisms to clinical medicine

An OMIM Record

Page 41: PATO & Phenotypes: From model organisms to clinical medicine

An OMIM Record

Page 42: PATO & Phenotypes: From model organisms to clinical medicine

An OMIM Record

Page 43: PATO & Phenotypes: From model organisms to clinical medicine

Goal: Turn text-based phenotypes into ontology-based computable annotations

Define a model for representing phenotypes

Develop and extend requisite ontologies (for the entities being described: anatomies, processes, …)

Develop an intuitive annotation environment for rigorously capturing phenotypes (“semantic authoring”)

Develop a set of guidelines for biocurators

Annotate mutant phenotypes (OMIM and models)

Collect & store annotations in a common resource (OBD) and make these broadly available

Page 44: PATO & Phenotypes: From model organisms to clinical medicine

4355 genes and genotypes in OBD

17782 entity-quality annotations in OBD

Page 45: PATO & Phenotypes: From model organisms to clinical medicine

OBD model: Requirements

Generic: We can’t define a rigid schema for all of biomedicine. Let the domain ontologies do the modeling of the domain.

Expressive: Use cases vary from simple ‘tagging’ to complex descriptions of biological phenomena.

Formal semantics: Amenable to logical reasoning (First Order Logic and/or OWL 1.1).

Standards-compatible: Integratable with the semantic web.

Page 46: PATO & Phenotypes: From model organisms to clinical medicine

OBD Model: overview

Graph-based: nodes and links

Nodes: classes, instances, relations

Links: relation instances; connect subject and object via relation plus additional properties

Annotations: posited links with attribution / evidence

Expressivity equivalent to RDF and OWL

Links aka axioms and facts in OWL

Attributed links: named graphs, reification, N-ary relation pattern

Supports construction of complex descriptions through graph model
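
The "links plus attribution" idea can be sketched as follows. This is a toy illustration under stated assumptions, not the OBD schema or API: the names `Link`, `Graph` and `annotations_about` are hypothetical, and the example annotation reuses labels from the earlier "OBD and annotations" diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    """A relation instance connecting a subject node to an object node."""
    subject: str
    relation: str
    obj: str
    source: Optional[str] = None    # attribution, e.g. a publication
    evidence: Optional[str] = None  # evidence supporting the assertion

    @property
    def is_annotation(self) -> bool:
        # An annotation is a posited link that carries attribution or evidence.
        return self.source is not None or self.evidence is not None


class Graph:
    """A minimal in-memory store of links (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.links: list[Link] = []

    def add(self, link: Link) -> None:
        self.links.append(link)

    def annotations_about(self, subject: str) -> list[Link]:
        return [link for link in self.links
                if link.subject == subject and link.is_annotation]


graph = Graph()
# A plain ontology link with no attribution.
graph.add(Link("neuron", "is_a", "cell"))
# An annotation: the same kind of link, plus who or what asserts it.
graph.add(Link("Shh-", "influences", "absence of aorta",
               source="Dev Biol 2005 Jul 15;283(2):357-72"))
```

Keeping ontology links and annotations in one structure is what allows the domain ontologies, rather than a rigid schema, to carry the modeling.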

Page 47: PATO & Phenotypes: From model organisms to clinical medicine

OBD Dataflow


Page 48: PATO & Phenotypes: From model organisms to clinical medicine


Example of annotation in OBD

Key: post-composition of phenotype classes (PATO EQ formalism); post-composition of complex anatomical entity descriptions

Page 49: PATO & Phenotypes: From model organisms to clinical medicine

OBD Architecture

Two stacks

1. Semantic web stack: built using Sesame triplestore + OWLIM. Future iterations: Science-commons Virtuoso

2. OBD-SQL stack: current focus; traditional enterprise architecture; plugs into Semantic Web stack via D2RQ

Page 50: PATO & Phenotypes: From model organisms to clinical medicine

OBD-SQL Stack

Alpha version of API implemented

Test clients access via SOAP

Phenote currently accesses via org.obo model & JDBC

Wraps org.obo model and OBD schema

Share relational abstraction layer

Org.obo wraps OWLAPI

Phenote currently connects via JDBC connectivity in org.obo


Page 51: PATO & Phenotypes: From model organisms to clinical medicine

Goal: Turn text-based phenotypes into ontology-based computable annotations

Define a model for representing phenotypes

Develop and extend requisite ontologies (for the entities being described: anatomies, processes, …)

Develop an intuitive annotation environment for rigorously capturing phenotypes (“semantic authoring”)

Develop a set of guidelines for biocurators

Annotate mutant phenotypes (OMIM and models)

Collect & store annotations in a common resource (OBD) and make these broadly available

Develop tools & resources for mining data for novel discovery

Developed a similarity search algorithm to identify genotypes with similar phenotype.

Page 52: PATO & Phenotypes: From model organisms to clinical medicine

sox9 mutations curated in PATO syntax

Human, SOX9 (Campomelic dysplasia)      Zebrafish, sox9a (jellyfish)
Male sex determination: disrupted
Scapula: hypoplastic                    Scapulocoracoid: aplastic
Lower jaw: decreased size               Cranial cartilage: hypoplastic
Heart: malformed or edematous           Heart: edematous
Phalanges: decreased length             Pectoral fin: decreased length
Long bones: bowed                       Cartilage development: disrupted

Page 53: PATO & Phenotypes: From model organisms to clinical medicine

[Figure: Average annotation consistency. Bar chart of # annotations (total annotations vs. similar annotations) per gene, with congruence values: EYA1 0.78, SOX10 0.71, SOX9 0.61, PAX2 0.72]

Page 54: PATO & Phenotypes: From model organisms to clinical medicine

Reasoning over phenotype descriptions recorded with ontologies provides linkages in annotations.

Page 55: PATO & Phenotypes: From model organisms to clinical medicine

Ontologies and reasoning can reveal similarities in phenotype annotations.
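
One simple form of such reasoning is subsumption: two annotations that use different terms may still share a more general ancestor. The sketch below assumes a hypothetical toy is_a hierarchy (`IS_A`) made up for illustration; real hierarchies come from the ontologies themselves, and the talk does not specify which reasoner OBD uses.

```python
# Hypothetical toy is_a hierarchy; real hierarchies come from the ontologies.
IS_A = {
    "scapula": ["bone"],
    "phalanx": ["bone"],
    "bone": ["skeletal element"],
    "cranial cartilage": ["cartilage"],
    "cartilage": ["skeletal element"],
}


def ancestors(term: str) -> set[str]:
    """Return the term itself plus everything reachable via is_a."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_subsumers(a: str, b: str) -> set[str]:
    """Terms that subsume both a and b, i.e. what the two have in common."""
    return ancestors(a) & ancestors(b)
```

With this toy hierarchy, annotations on "scapula" and "cranial cartilage" share no term textually, yet both are linked through "skeletal element".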

Page 56: PATO & Phenotypes: From model organisms to clinical medicine

A zebrafish shh similar-phenotype query returns known hedgehog pathway members

Gene     Similarity  Citation                                Role in hedgehog pathway
smo      0.445       Ochi et al., 2006                       Membrane protein binds shh receptor ptc1
disp1    0.444       Nakano et al., 2004                     Regulates secretion of lipid modified shh from midline
prdm1a   0.43        Roy et al., 2001                        Zinc-finger domain transcription factor, downstream target of shh signaling
hdac1    0.427       Cunliffe and Casaccia-Bonnefil, 2006    Transcriptional regulator required for shh mediated expression of olig2 in ventral hindbrain
scube2   0.398       Hollway et al., 2006                    May act during shh signal transduction at the plasma membrane
wnt11    0.380       Mullor et al., 2001                     Extracellular cysteine rich glycoprotein required for gli2/3 induced mesoderm development
gli2a    0.348       Karlstrom et al., 1999                  Zinc finger transcription factor target of shh signaling
bmp2b    0.303       Ke et al., 2008                         Downstream target of gli2 gene repression
gli1     0.303       Karlstrom et al., 2003                  Zinc finger transcription factor target of shh signaling
ndr2     0.289       Muller et al., 2000                     TGFbeta family member upstream of hedgehog signaling in the ventral neural tube
hhip     0.265       Ochi et al., 2006                       Binds shh in membrane and modulates interaction with smo

Page 57: PATO & Phenotypes: From model organisms to clinical medicine

OBD similarity query

A computational search that enables comparison of phenotypes within and across species. Given a set of phenotype annotations recorded for a mutant allele, we can identify other alleles of the same gene.

We can identify other known pathway members in the same species and known gene orthologs in other species simply by comparing phenotypes alone.

This annotation and search method provides a novel means for laboratory researchers to identify potential gene candidates participating in regulatory and/or disease pathways.
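
The slides do not give the similarity measure used by OBD, so the following sketch swaps in plain Jaccard similarity over sets of entity-quality pairs as a stand-in. The function names, the gene names `gene_a`, `gene_b`, `gene_c`, and their annotation profiles are all hypothetical; in practice each profile would first be expanded with ontology ancestors so that related terms can match across species.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_similarity(query: str, profiles: dict[str, set]) -> list[tuple[str, float]]:
    """Rank every other gene by phenotype similarity to the query gene."""
    target = profiles[query]
    scores = [(gene, jaccard(target, profile))
              for gene, profile in profiles.items() if gene != query]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical annotation profiles: sets of (entity, quality) pairs.
profiles = {
    "gene_a": {("eye", "hypoteloric"), ("midface", "hypoplastic"),
               ("kidney", "hypertrophied")},
    "gene_b": {("eye", "hypoteloric"), ("midface", "hypoplastic")},
    "gene_c": {("heart", "edematous")},
}

ranked = rank_by_similarity("gene_a", profiles)
```

Here `gene_b` ranks first because it shares two of the three annotations recorded for `gene_a`, while `gene_c` shares none.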

Page 58: PATO & Phenotypes: From model organisms to clinical medicine

Summary of (some of) the challenges

Curating the information

Efficiency (pre-composed vs. post-composed)

Consistency between curators

Missing contextual information (genetic background and environment)

Observation vs. the inference made from this

Representing homology (bones named by relative position)

Those attempting this anyway: zebrafish, Drosophila, C. elegans, Cyprinoid fish (evolution), Dictyostelium, mouse (many), Xenopus, paramecium…

Page 59: PATO & Phenotypes: From model organisms to clinical medicine

Credit to

Berkeley: Christopher Mungall, Mark Gibson, Nicole Washington, Rob Bruggner

U of Oregon: Monte Westerfield, Melissa Haendel

National Institutes of Health

U of Cambridge: Michael Ashburner, George Gkoutos (PATO), David Osumi-Sutherland

OBO Foundry: Michael Ashburner, Christopher Mungall, Alan Ruttenberg, Richard Scheuermann, Barry Smith