An Ontological Approach for Describing Phospho-proteins in Rhodococcus. Dept. of Computer Science, University of British Columbia. Dennis Wang, Gavin Ha, Jennifer Chen, Nancy Wang. CPSC 445. April 5th, 2007.

Page 1: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Dept. of Computer Science, University of British Columbia

Dennis Wang, Gavin Ha, Jennifer Chen, Nancy Wang

CPSC 445. April 5th, 2007

Page 2: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

What is an ontology?

Purpose: knowledge representation and reasoning
Facilitates knowledge sharing and reuse

Definition: a data model that represents a set of concepts within a domain and the relationships between those concepts.
It is used to reason about the objects within that domain.
It describes individuals (instances), classes (concepts), attributes, relations and axioms.

Uses: AI, information architecture, semantic web, software engineering

Page 3: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Biology is knowledge-based: prior knowledge is used to infer new knowledge.

Biology is data-rich:
Biologists need extensive prior knowledge to analyze the data obtained.
The pace of data production is beyond one's ability to acquire knowledge.

An automated system is needed to apply domain experts' knowledge to biological data.

Page 4: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Joint effort of biologists and computer scientists

Build ontologies using domain knowledge
Rapid classification of large datasets
Allows queries to find instances of a class
Create controlled vocabularies for shared use across different biological and medical domains

In bioinformatics, an ontology can make knowledge available to the community and its applications.

Page 5: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

“provides structured, controlled vocabularies and classifications that cover several domains of molecular biology”

Uses:
Annotation of large data sets
The ability to group gene products under some high-level term
Computational (putative) assignment of molecular function based on sequence similarity to annotated genes or sequences

[Diagram: an unknown gene product shows sequence similarity to a sequence in SWISS-PROT with known function; the function of the unknown gene product is then inferred (gene function inferred from electronic annotation).]
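The similarity-based inference on this slide can be sketched in a few lines. This is a minimal illustration only: the reference sequences, their labels, the 0.8 threshold and the position-by-position identity score are all assumptions made for the example. A real pipeline would use an alignment tool and a curated database.

```python
from typing import Optional

# Toy stand-in for an annotated database. Sequences and labels are invented
# for illustration; they are not taken from SWISS-PROT.
ANNOTATED = {
    "ref_1": ("MKTAYIAKQR", "protein tyrosine phosphatase activity"),
    "ref_2": ("GSHMLVPRGS", "protein kinase activity"),
}


def percent_identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter sequence (no alignment)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def infer_function(query: str, threshold: float = 0.8) -> Optional[str]:
    """Return the function of the most similar annotated sequence if its
    score reaches the threshold; otherwise None (function stays unknown)."""
    best_label, best_score = None, 0.0
    for seq, label in ANNOTATED.values():
        score = percent_identity(query, seq)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None
```

A query that differs from `ref_1` at one position out of ten inherits its annotation, while a query with no close match is left unannotated.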

Page 6: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

There is no standardized methodology for building ontologies,
but there are efforts to produce more comprehensive guidelines.

In general:
Informal stage: natural language
Formal stage: formal knowledge representation language

Page 7: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Inspired by software engineering.

User Model (Biologist):

#1) Identification of the purpose and scope of the ontology
#2) Acquisition of domain knowledge

[Diagram: Identify purpose and scope → Knowledge Acquisition]

Page 8: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Conceptualization Model (Bioinformatician/Biologist):

#3) Identifying key concepts in the domain
#4) Integration by using and incorporating other existing ontologies

[Diagram: Identify purpose and scope → Knowledge Acquisition → Building (Conceptualization, Integrating existing ontologies)]

Page 9: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Implementation Model (Bioinformatician):

#5) Representing concepts with a formal language
#6) Documenting informal and formal definitions
#7) Evaluation of the appropriateness of the ontology for its intended application

[Diagram: Identify purpose and scope → Knowledge Acquisition → Building (Conceptualization, Integrating existing ontologies, Encoding) → Evaluation. Side labels: Language & Representation; Available Development Tools]

Page 10: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

[Diagram labels: Biologists; Signal Protein Experts; Provides (twice); Phosphatase & Kinase background knowledge; Proteomic experimental data; Ontology (Classes); Data (Instances/Individuals); Made up of; Bioinformatician; Build using OWL-DL; Pellet Reasoner; Uses; Results]

Can we use the Phosphabase ontology to describe phospho-proteins discovered by the Rhodococcus Genome Project?

Page 11: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

XML syntax
OWL-DL (Description Logic): certain restrictions to guarantee decidability, based on description logic
OWL uses the Resource Description Framework (RDF): Subject, Predicate, Object
Basic components in OWL: classes, individuals, properties

[Diagram labels: Class Professor; subClassOf; Superclass FacultyMember; InstanceOf; Individual Anne Condon; Individual Jennifer Chen; teaches]
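The subject, predicate, object model and the class/individual/property split can be illustrated with plain Python tuples. This is a rough sketch, not OWL or RDF tooling. The triples follow the labels in the diagram, but which individual is an instance of `Professor` and who teaches whom are assumed readings, and the predicate spellings are illustrative.

```python
# Each triple is (subject, predicate, object), as in RDF.
TRIPLES = {
    ("Professor", "subClassOf", "FacultyMember"),
    ("Anne_Condon", "instanceOf", "Professor"),   # assumed reading of the diagram
    ("Anne_Condon", "teaches", "Jennifer_Chen"),  # assumed reading of the diagram
}


def superclasses(cls: str) -> set[str]:
    """All classes reachable from cls by following subClassOf edges."""
    found: set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in TRIPLES:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls: str) -> set[str]:
    """Individuals asserted in cls or in any of its subclasses."""
    result: set[str] = set()
    for s, p, o in TRIPLES:
        if p == "instanceOf" and (o == cls or cls in superclasses(o)):
            result.add(s)
    return result
```

Asking for the instances of `FacultyMember` returns the individual asserted under `Professor`, which is the kind of inference a reasoner draws from `subClassOf`.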

Page 12: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Biological Motivation
Driven by protein domain architecture to describe signalling protein families
Background knowledge required for construction:
Signal protein domains
Presence of protein domains within signal proteins

OWL Ontology
Ontology uses OWL-DL
Description logic can be applied to classify proteins using reasoners
Many different ways to represent this knowledge in OWL

Wolstencroft et al., 2006
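Classification by domain architecture can be approximated with set inclusion: a protein belongs to a class when it contains every domain the class requires. The sketch below is a simplified stand-in for description-logic classification, not the Phosphabase definitions themselves. The class names and their required-domain sets are hypothetical; only the Pfam domain names (`LMWPc`, `Alk_phosphatase`, `Wzz`) appear in the annotation records in this deck.

```python
# Hypothetical class definitions: each class lists the protein domains that a
# member must contain. These are illustrative, not the Phosphabase axioms.
CLASS_DEFINITIONS = {
    "Low_MW_Phosphotyrosine_Phosphatase": {"LMWPc"},
    "Alkaline_Phosphatase": {"Alk_phosphatase"},
}


def classify(domains: set[str]) -> list[str]:
    """Return, in sorted order, every class whose required domains are all present."""
    return sorted(
        name
        for name, required in CLASS_DEFINITIONS.items()
        if required <= domains
    )
```

A protein whose only predicted domain is `LMWPc` falls into the first class, while one with only `Wzz` matches no class and stays unclassified under these toy definitions.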

Page 13: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Domain_Entity

Macromolecule

Protein_Phosphatase

Protein_Kinase

Page 14: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Input
Ontology: OWL-DL format
Axioms about classes go into the TBox; type and property assertions (individuals) go into the ABox
Query: RDQL (SPARQL) format
Instance data (individuals)

Tableau Reasoner
Checks satisfiability of an ABox with respect to a TBox
Tests for knowledge base consistency

[Parsia and Sirin, ISWC 2004]
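The TBox/ABox split and the idea of a consistency test can be shown with a much simpler check than a tableau procedure. The sketch below is not Pellet's algorithm: it only propagates types up a subclass chain and looks for an individual that ends up in two classes declared disjoint. The nesting of the classes and the disjointness axiom are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    # TBox: axioms about classes (child -> parent, and disjoint class pairs).
    subclass_of: dict[str, str] = field(default_factory=dict)
    disjoint: set[frozenset[str]] = field(default_factory=set)
    # ABox: type assertions about individuals.
    types: dict[str, set[str]] = field(default_factory=dict)

    def ancestors(self, cls: str) -> set[str]:
        """cls plus every superclass reachable through the TBox."""
        seen = {cls}
        while cls in self.subclass_of and self.subclass_of[cls] not in seen:
            cls = self.subclass_of[cls]
            seen.add(cls)
        return seen

    def inferred_types(self, individual: str) -> set[str]:
        """Asserted types of the individual plus all of their superclasses."""
        result: set[str] = set()
        for cls in self.types.get(individual, set()):
            result |= self.ancestors(cls)
        return result

    def is_consistent(self) -> bool:
        """False if any individual belongs to two classes declared disjoint."""
        for individual in self.types:
            inferred = self.inferred_types(individual)
            if any(pair <= inferred for pair in self.disjoint):
                return False
        return True


kb = KnowledgeBase(
    subclass_of={  # assumed nesting of the class names
        "Macromolecule": "Domain_Entity",
        "Protein_Phosphatase": "Macromolecule",
        "Protein_Kinase": "Macromolecule",
    },
    disjoint={frozenset({"Protein_Phosphatase", "Protein_Kinase"})},  # assumed axiom
    types={"RHA1_ro01186": {"Protein_Phosphatase"}},
)
```

With these assumed axioms the knowledge base is consistent; asserting that the same individual is also a `Protein_Kinase` would make `is_consistent()` return `False`.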

Page 15: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Locus ID: RHA1_ro01186
Strain: Rhodococcus sp. RHA1
Replicon: Chromosome
Refseq: NC_008268
Start: 1260414
Stop: 1260866
Gene Name:
Alternate gene name(s):
Protein / Product Name: protein-tyrosine-phosphatase
Alternate product name(s):
Refseq GI Number: 111018199
Category: Protein
Localization: Cytoplasmic (Class 3)
Transposon Mutant Available?: No transposon mutant available yet
COG predictions: Wzb, Protein-tyrosine-phosphatase [Signal transduction mechanisms] (COG0394)
EC Number: 3.1.3.48
Comments:
PFAM predictions: PF01451: LMWPc, Low molecular weight phosphotyrosine protein phosphatase
go_function: protein tyrosine phosphatase activity [goid 0004725]

Page 16: An Ontological Approach for Describing Phospho-proteins in Rhodococcus
Page 17: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Locus ID: RHA1_ro05453
Strain: Rhodococcus sp. RHA1
Replicon: Chromosome
Refseq: NC_008268
Start: 5845588
Stop: 5847288
Gene Name:
Alternate gene name(s):
Protein / Product Name: probable protein-tyrosine kinase
Alternate product name(s):
Refseq GI Number: 111022419
Category: Protein
Localization: Cytoplasmic Membrane (Class 3)
Transposon Mutant Available?: No transposon mutant available yet
COG predictions: Mrp, ATPases involved in chromosome partitioning [Cell division and chromosome partitioning] (COG0489)
EC Number: 2.7.10.1
TIGRFAM predictions:
TIGRFAM Accession: TIGR01007
TIGRFAM name and function: eps_fam - capsular exopolysaccharide family (6.7e-46)
TIGRFAM EC Number:
Role: Transport and binding proteins
Sub Role: Carbohydrates, organic alcohols, and acids
TIGRFAM to Gene Ontology Mappings:
Comments:
PFAM predictions: PF02706: Wzz, Chain length determinant protein. This family includes proteins involved in lipopolysaccharide (LPS) biosynthesis. This family comprises the whole length of chain length determinant protein (or Wzz protein) that confers a modal distribution of chain length on the O-antigen component of LPS. This region is also found as part of bacterial tyrosine kinases.
go_component: signal recognition particle (sensu Eukaryota) [goid 0005786]

Page 18: An Ontological Approach for Describing Phospho-proteins in Rhodococcus
Page 19: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Locus ID: RHA1_ro05554
Strain: Rhodococcus sp. RHA1
Replicon: Chromosome
Refseq: NC_008268
Start: 5971327
Stop: 5972865
Gene Name:
Alternate gene name(s):
Protein / Product Name: probable alkaline phosphatase
Alternate product name(s):
Refseq GI Number: 111022520
Category: Protein
Localization: Unknown (This protein may have multiple localization sites) (Class 3)
Transposon Mutant Available?: No transposon mutant available yet
COG predictions: PhoD, Phosphodiesterase/alkaline phosphatase D [Inorganic ion transport and metabolism] (COG3540)
TIGRFAM predictions:
TIGRFAM to Gene Ontology Mappings:
Comments:
PFAM predictions: PF00245: Alk_phosphatase, Alkaline phosphatase
go_component: organelle inner membrane [goid 0019866]

Page 20: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Ontologies can be used as a standard model for the exchange of biological information

Building ontologies can get very complicated
Biologists have little description logic training
Computer scientists have little knowledge of biology
More bioinformaticians are needed

Ontologies can facilitate automated annotation of genes / gene products

It is difficult to read and infer from ontologies
Ontologies can get very big (Phosphabase is only a small example)
Reasoners are sometimes slow and inaccurate

www.quicklybored.com

Page 21: An Ontological Approach for Describing Phospho-proteins in Rhodococcus

Rhodococcus sp. RHA1 data
Eltis Lab: Dr. Lindsay Eltis, Dept. Microbiology & Biochemistry

Phosphabase Ontology
Wolstencroft Lab, University of Manchester, UK
Bioinformatics paper: Wolstencroft et al., 2006

Phosphabase Ontology processing
Benjamin Good, iCAPTURE Centre, Vancouver