83
1 Ontology for Community- and Population-Based Medicine Barry Smith University at Buffalo http:// ontology.buffalo.edu/smith

1 Ontology for Community- and Population-Based Medicine Barry Smith University at Buffalo

Embed Size (px)

Citation preview

1

Ontology for Community- and Population-Based Medicine

Barry SmithUniversity at Buffalohttp://ontology.buffalo.edu/smith

Outline1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

1.1. Who am I?Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

Who am I?NCBO: National Center for Biomedical Ontology (NIH Roadmap Center)

4

− Stanford Medical Informatics− University of San Francisco Medical Center− Berkeley Drosophila Genome Project− Cambridge University Department of

Genetics− The Mayo Clinic− University at Buffalo (PI of Dissemination

and Ontology Best Practices)

Who am I?

Duke/Dallas CTSA Ontology Consortium

Cleveland Clinic Semantic Database in Cardiothoracic Surgery

Gene Ontology Scientific Advisory Board

Biomedical Informatics Research Network (BIRN) Ontology Task Force

Advancing Clinico-Genomic Trials on Cancer (ACGT)

5

1. Who am I?

2.2. How to find your dataHow to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

Multiple kinds of data in multiple kinds of silos

Lab / pathology data

Electronic Health Record data

Clinical trial data

Patient histories

Medical imaging

Microarray data

Protein chip data

Flow cytometry

Mass spec

Genotype / SNP data

7

How to find your data?

How to find and integrate other people’s data?

How to reason with data when you find it?

How to understand the significance of the data you collected 3 years earlier?

Part of the solution must involve consensus-based, standardized terminologies and coding schemes

8

Making data (re-)usable through standards

• Standards provide– common structure and terminology– single data source for review (less redundant

data)

• Standards allow– use of common tools and techniques– common training– single validation of data

9

10

Problems with standards

• Standards involve considerable costs of re-tooling, maintenance, training, ...

• Not all standards are of equal quality

• Bad standards create lasting problems

11

NIH Mandates for Sharing of Research Data

Investigators submitting an NIH application seeking $500,000 or more in any single year are expected to include a plan for data sharing

(http://grants.nih.gov/grants/policy/data_sharing)

12

Program Announcement Number: PAR-07-425

Title:  Data Ontologies for Biomedical Research (R01)NIH Blueprint for Neuroscience Research, (http://neuroscienceblueprint.nih.gov/)National Cancer Institute (NCI), (http://www.cancer.gov)National Center for Research Resources (NCRR), (http://www.ncrr.nih.gov/)National Eye Institute (NEI), (http://www.nei.nih.gov/)National Heart Lung and Blood Institute (NHLBI), (http://http.nhlbi.nih.gov )National Human Genome Research Institute (NHGRI), (http://www.genome.gov)National Institute on Alcohol Abuse and Alcoholism (NIAAA), (http://www.niaaa.nih.gov/)National Institute of Biomedical Imaging and Bioengineering (NIBIB), (http://www.nibib.nih.gov/)National Institute of Child Health and Human Development (NICHD), (http://www.nichd.nih.gov/)National Institute on Drug Abuse (NIDA), (http://www.nida.nih.gov/)National Institute of Environmental Health Sciences (NIEHS), (http://www.niehs.nih.gov/)National Institute of General Medical Sciences (NIGMS), (http://www.nigms.nih.gov/)National Institute of Mental Health (NIMH), (http://www.nimh.nih.gov/)National Institute of Neurological Disorders and Stroke (NINDS), (http://www.ninds.nih.gov/)National Institute of Nursing Research (NINR), (http://www.ninr.nih.gov)

13

Purpose. Optimal use of informatics tools and data resources depends upon explicit understandings of concepts related to the data upon which they compute. This is typically accomplished by a tool or resource adopting a formal controlled vocabulary and ontology. 

14

Currently, there is no convenient way to map the knowledge that is contained in one data set to that in another data set, primarily because of differences in language and structure

... in some areas there are emerging standards.

Examples include: •the Unified Medical Language System (UMLS), •the Gene Ontology, •the caBIG project,•Open Biomedical Ontologies (OBO)

15

NIH anticipates that, once important data sets in a topical area have been unified, others in that area will adopt the emerging standard. 

The nucleation points should be able to interact with each other, e.g. through the use of the tools made freely available by the National Center for Biomedical Ontology (NCBO) (http://bioontology.org/) or by caBIG.

16

Another determinate of ontology acceptance is the degree to which the ontology conforms to best practices governing ontology design and construction. 

... the applicant should specify the criteria with which the ontology will conform

Criteria have been developed by the Vocabulary and Common Data Element Work Group of caBIG and by the OBO Foundry (http://obofoundry.org/). 

1. Who am I?

2. How to find your data

3.3. How to do biology across the genomeHow to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDV

How to do biology across the genome?

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPIPSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDSFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVVAGEAASSNHHQKISRVTRKRPREPKSTNDILVAGQKLFGSSFEFRDLHQLRLCYEIYMADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPYTVLLANCMIRLGRRGCLNVAPVRNFIEEGYDGVTDLYVGIYDDLASTNFTDRIAAWENIVECTFRTNNVKLGYLIVDEFHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEAVADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSEVPLGHVHKIRKKVESQPEEALKLLLALFESEPESKAIVVASTTNEVEELACSWRKYFRVVWIHGKLGAAEKVSRTKEFVTDGSMQVLIGTKLVTEGIDIKQLMMVIMLDNRLNIIELIQGVGRLRDGGLCYLLSRKNSWAARNRKGELPPKEGCITEQVREFYGLESKKGKKGQHVGCCGSRTDLSADTVELIERMDRLAEKQATASMSIVALPSSFQESNSSDRYRKYCSSDEDSNTCIHGSANASTNASTNAITTASTNVRTNATTNASTNATTNASTNASTNATTNASTNATTNSSTNATTTASTNVRTSATTTASINVRTSATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSATTTASINVRTSATTTESTNSSTSATTTASINVRTSATTTKSINSSTNATTTESTNSNTNATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSAATTESTNSNTSATTTESTNASAKEDANKDGNAEDNRFHPVTDINKESYKRKGSQMVLLERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQKMFELCVCWAGQKVSYRRIAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKYFSVKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYFKVWSNLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVVVEALEKGGVFQRLRSMTSAGLQGPQYVKLQFSRHHRQLRSRYELSLGMHLRDQIALGVTPSKVPHWTAFLSMLIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGE

19

20

what cellular component?

what molecular function?

what biological process?

through annotation of data

21

what cellular component?

what molecular function?

what biological process?

and through curation of literature

22

what cellular component?

what molecular function?

what biological process?

three types of data

Clark et al., 2005

part_of

is_a

23

24

The Gene Ontology

25

WormBase

Gramene

FlyBase

Rat Genome Database

DictyBase

Mouse Genome Database

The Arabidopsis Information Resource

The Zebrafish Information Network

Berkeley Drosophila Genome Project

Saccharomyces Genome Database

...

26

Gene Ontology Consortium

Benefits of GO

1. rooted in basic experimental biology

2. links people to data and to literature

3. links data to data

• across species (human, mouse, yeast, fly ...)

• across granularities (molecule, cell, organ, organism, population)

4. links medicine to biological science

5. cumulation of scientific knowledge in algorithmically tractable form

27

A strategy for translational medicine

Sjöblöm T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers

using functional information captured by GO identified 189 genes as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention.

Science. 2006 Oct 13;314(5797):268-74.

28

29

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4.4. How to extend the GO methodology to clinical and How to extend the GO methodology to clinical and translational medicine: Open Biomedical Ontologiestranslational medicine: Open Biomedical Ontologies

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

31

Ontology Scope URL Custodians

Cell Ontology (CL)

cell types from prokaryotes to mammals

obo.sourceforge.net/cgi-

bin/detail.cgi?cell

Jonathan Bard, Michael Ashburner, Oliver Hofman

Chemical Entities of Bio-

logical Interest (ChEBI)

molecular entities ebi.ac.uk/chebiPaula Dematos,Rafael Alcantara

Common Anatomy Refer-

ence Ontology (CARO)

anatomical structures in human and model

organisms(under development)

Melissa Haendel, Terry Hayamizu, Cornelius

Rosse, David Sutherland,

Foundational Model of Anatomy (FMA)

structure of the human body

fma.biostr.washington.

edu

JLV Mejino Jr.,Cornelius Rosse

Functional Genomics Investigation

Ontology (FuGO)

design, protocol, data instrumentation, and

analysisfugo.sf.net FuGO Working Group

Gene Ontology (GO)

cellular components, molecular functions, biological processes

www.geneontology.org

Gene Ontology Consortium

Phenotypic Quality Ontology

(PaTO)

qualities of anatomical structures

obo.sourceforge.net/cgi

-bin/ detail.cgi?attribute_and_value

Michael Ashburner, Suzanna

Lewis, Georgios Gkoutos

Protein Ontology (PrO)

protein types and modifications

(under development)Protein Ontology

Consortium

Relation Ontology (RO)

relationsobo.sf.net/

relationshipBarry Smith, Chris

Mungall

RNA Ontology(RnaO)

three-dimensional RNA structures

(under development) RNA Ontology Consortium

Sequence Ontology(SO)

properties and features of nucleic sequences

song.sf.net Karen Eilbeck

32

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO)

OrganFunction

(FMP, CPRO) Phenotypic

Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Compone

nt(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

OBO Foundry Coordinators

Suzanna LewisBerkeleyMichael Ashburner

Cambridge, UKBarry Smith

BuffaloChris Mungall

Berkeley

Goal of the OBO Foundry

all biomedical research data should cumulate to form a single, algorithmically processable, whole

Smith, et al. Nature Biotechnology, Nov 2007

34

35

CRITERIA

The ontology is open and available to be used by all.

The ontology is instantiated in, a common formal language and shares a common formal architecture

The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontology where domains overlap.

OBO FOUNDRY CRITERIA

36

CRITERIA The developers of each ontology commit to its

maintenance in light of scientific advance, and to soliciting community feedback for its improvement.

They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.

37

Mature OBO Foundry ontologies

Cell Ontology (CL)Chemical Entities of Biological Interest (ChEBI)Foundational Model of Anatomy (FMA)Gene Ontology (GO)Phenotypic Quality Ontology (PaTO)Relation Ontology (RO)Sequence Ontology (SO)

38

Foundry ontologies being built ab initio

Common Anatomy Reference Ontology (CARO)Ontology for Biomedical Investigations (OBI)Protein Ontology (PRO)RNA Ontology (RnaO)Subcellular Anatomy Ontology (SAO)

39

Ontologies in planning phase

Environment Ontology (EnvO) Infectious Disease Ontology (IDO)Biobank/Biorepository OntologyFood OntologyAllergy OntologyVaccine Ontology

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5.5. An OBO Foundry success storyAn OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

Anatomy Ontologies

Fish Multi-Species Anatomy Ontology (NSF funding received)Ixodidae and Argasidae (Tick) Anatomy Ontology Mosquito Anatomy Ontology (MAO) Spider Anatomy Ontology (SPD)Xenopus Anatomy Ontology (XAO)

undergoing reform: Drosophila and Zebrafish Anatomy Ontologies

41

Ontologies facilitate grouping of annotations

brain 20 hindbrain 15 rhombomere 10

Query brain without ontology 20Query brain with ontology 45

42

Multiple axes of classification

Functional: cardiovascular system, nervous systemSpatial: head, trunk, limbDevelopmental: endoderm, germ ring, lens placodeStructural: tissue, organ, cell Stage: developmental staging series

43

CARO – Common Anatomy Reference Ontology

for the first time provides guidelines for model organism researchers who wish to achieve comparability of annotations

Haendel et al., “CARO: The Common Anatomy Reference Ontology”, in: Burger (ed.), Anatomy Ontologies for Bioinformatics: Springer, in press.

44

45

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6.6. IDO: The Infectious Disease OntologyIDO: The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

We have data

TBDB: Tuberculosis Database, including Microarray data

VFDB: Virulence Factor DB

TropNetEurop Dengue Case Data

ISD: Influenza Sequence Database at LANL

PathPort: Pathogen Portal Project

...

47

We need to annotate this data

to allow retrieval and integration of– sequence and protein data for pathogens– case report data for patients– clinical trial data for drugs, vaccines– epidemiological data for surveillance, prevention– ...

Goal: to make data deriving from different sources comparable and computable

48

IDO needs to work with

Disease Ontology (DO) + SNOMED CT

Gene Ontology Immunology Branch

Phenotypic Quality Ontology (PATO)

Protein Ontology (PRO)

Sequence Ontology (SO)

...

49

We need common controlled vocabularies to describe these data in ways that will assure

comparability and cumulation

What content is needed to adequately cover the infectious domain?– Host-related terms (e.g. carrier, susceptibility)– Pathogen-related terms (e.g. virulence)– Vector-related terms (e.g. reservoir, – Terms for the biology of disease pathogenesis (e.g.

evasion of host defense)– Population-level terms (e.g. epidemic, endemic,

pandemic, )

50

IDO Processes

51

IDOQualities

52

IDO Roles

53

IDO provides a common template

IDO works like CARO.

It contains terms (like ‘pathogen’, ‘vector’, ‘host’) which apply to organisms of all species involved in infectious disease and its transmission

Disease- and organism-specific ontologies built as refinements of the IDO core

54

Disease-specific IDO test projects

MITRE, Mount Sinai, UTSouthwestern – Influenza– Stuart Sealfon, Joanne Luciano,

IMBB/VectorBase – Vector borne diseases (A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus)– Kristos Louis

Colorado State University – Dengue Fever– Saul Lozano-Fuentes

Duke – Tuberculosis– Carol Dukes-Hamilton

Cleveland Clinic – Infective Endocarditis– Sivaram Arabandi

University of Michigan – Brucilosis– Yongqun He

56

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7.7. Towards a controlled vocabulary for community-based Towards a controlled vocabulary for community-based medicinemedicine

8. The Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

58

All OBO Foundry ontologies work in the same way

– we have data (biosample, haplotype, clinical data, survey data, ...)

– we need to make this data available for semantic search and algorithmic processing

– we create a consensus-based ontology for annotating the data

59

60

61

62

to enhance alignment of data about instances (communities, places, ...)

63

to enhance alignment of data about relevant types of entities

(origin, community, cell type, race, family ...)

64

65

to enhance coordination of research

66

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8.8. The Community Ontology and its branchesThe Community Ontology and its branches

9. The Environment Ontology: A new type of patient data

Community / Population Ontology

68

− family, clan− ethnicity− religion − diet− social networking− education (literacy ...)− healthcare (economics ...)− household forms− demography− public health−...

69

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO)

OrganFunction

(FMP, CPRO) Phenotypic

Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Compone

nt(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

70

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

ORGAN ANDORGANISM

Family, Community, Deme, Population

OrganFunction

(FMP, CPRO) Phenotypic

Quality(PaTO)

Biological Process

(GO)

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO)

CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Componen

t(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

71

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

COMPLEX OFORGANISMS

Family, Community, Deme, Population

OrganFunction

(FMP, CPRO)

Population Phenotype

PopulationProcess

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

Anatomical Entity(FMA, CARO) Phenotypic

Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cellular Componen

t(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)

http://obofoundry.org

1. Who am I?

2. How to find your data

3. How to do biology across the genome

4. How to extend the GO methodology to clinical and translational medicine

5. Anatomy Ontologies: An OBO Foundry success story

6. The Infectious Disease Ontology

7. Towards a controlled vocabulary for community-based medicine

8. The Community Ontology and its branches

9.9. The Environment Ontology: A new type of patient dataThe Environment Ontology: A new type of patient data

73

RELATION TO TIME

GRANULARITY

CONTINUANT OCCURRENT

INDEPENDENT DEPENDENT

COMPLEX OF ORGANISMS

Family, Community,

Deme, Population OrganFunction

(FMP, CPRO)

Population

Phenotype

Population Process

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

(FMA, CARO)

Phenotypic Quality(PaTO)

Biological Process

(GO)CELL AND CELLULAR

COMPONENT

Cell(CL)

Cell Com-

ponent(FMA, GO)

Cellular Function

(GO)

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular Function(GO)

Molecular Process

(GO)http://obofoundry.org

E N

V I R

O N

M E

N T

74

RELATION TO TIME

GRANULARITY

CONTINUANT

INDEPENDENT

COMPLEX OF ORGANISMS

Family, Community,

Deme, Population

Environment of population

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

(FMA, CARO)

Environment of single organism

CELL AND CELLULAR

COMPONENT

Cell(CL)

Cell Com-

ponent(FMA, GO)

Environment of cell

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular environment

http://obofoundry.org

E N

V I R

O N

M E

N T

75

RELATION TO TIME

GRANULARITY

CONTINUANT

INDEPENDENT

COMPLEX OF ORGANISMS

Family, Community, Deme, Population

Environment of population

ORGAN ANDORGANISM

Organism(NCBI

Taxonomy)

(FMA, CARO)

Environment of single organism*

CELL AND CELLULAR

COMPONENT

Cell(CL)

Cell Com-

ponent(FMA, GO)

Environment of cell

MOLECULEMolecule

(ChEBI, SO,RnaO, PrO)

Molecular environment

E N

V I R

O N

M E

N T

* The sum total of the conditions and elements that make up the surroundings and influence the development and actions of an individual.

76

RELATION TO TIME

GRANULARITY

CONTINUANT

INDEPENDENT

COMPLEX OFORGANISMS

biome / biotope, territory, habitat, neighborhood, ...

work environment, home environment;host/symbiont environment; ...

ORGAN ANDORGANISM

CELL AND CELLULAR

COMPONENT

extracellular matrix; chemokine gradient; ...

MOLECULE

hydrophobic surface; virus localized to cellular substructure; active site on

protein; pharmacophore ...

http://obofoundry.orgE

N V

I R O

N M

E N

T

The Environment Ontology

78

OBO FoundryGenomic Standards ConsortiumNational Environment Research Council (UK)USDA, Gramene, J. Craig Venter Institute ...

79

Applications of EnvO in biology

How EnvO currently works for information retrieval

Retrieve all experiments on organisms obtained from:– deep-sea thermal vents– arctic ice cores– rainforest canopy– alpine melt zone

Retrieve all data on organisms sampled from:– hot and dry environments– cold and wet environments– a height above 5,000 meters

Retrieve all the omic data from soil organisms subject to:– moderate heavy metal contamination

80

extending EnvO to clinical and translational research

• we have public heath, community and population data

• we need to make this data available for search and algorithmic processing

• we create a consensus-based ontology which can interoperate with ontologies for neighboring domains of medicine and basic biology

81

Environment = totality of circumstances external to a living organism or group of organisms

– pH– evapotranspiration– turbidity– available light– predominant vegetation– predatory pressure– nutrient limitation …

82

extend EnvO to the clinical domain– dietary patterns (Food Ontology: FAO, USDA) ...

allergies– neighborhood patterns

• built environment, living conditions• climate• social networking• crime, transport• education, religion, work• health, hygiene

– disease patterns• bio-environment (bacteriological, ...)• patterns of disease transmission (links to IDO)

83

a new type of patient data

a patient’s environmental history

use EnvO and the community ontology to mine relations between disease phenotypes and environmental patterns and patterns of community behavior

84

with thanks toCARO: Fabian Neuhaus (NIST), Melissa Haendel

(ZFin), David Sutherland (Flybase)

EnvO: Dawn Field, Norman Morrison, (NERC)

IDO: Lindsay Cowell (Duke)

OBO Foundry: Michael Ashburner, Suzanna Lewis, Chris Mungall (Flybase, GO), Alan Ruttenberg (MIT, Neurocommons)

NCBO: NIH RFA-RM-04-022

PRO: NIH R01 GM080646-01

ACGT: European Commission IST-2005-026996

85