1 Gene Ontology and Functional Annotation Donghui Li ASPB Plant Biology, June 29, 2008, Merida

Gene Ontology and Functional Annotation

Donghui Li

ASPB Plant Biology, June 29, 2008, Merida

TAIR literature statistics

                              May 2007   May 2008
References                    31,058     34,179
Research articles             22,640     25,001
Full-text papers              15,572     16,638
Average new papers/month      204        216
Loci with valid references    9,289      10,847

Outline

• Functional annotation

• Controlled vocabularies: GO and PO

• Functional annotation at TAIR

• Community annotation

Functional annotation

Functional annotation is defined as the process of collecting information about a gene’s biological identity:

• molecular function (protein kinase)
• biological roles (protein phosphorylation)
• subcellular localization (cytoplasm)

• aliases
• mutant phenotype
• expression domain

What is an annotation?

An annotation is a statement that a gene product …

…has a particular molecular function

…is involved in a particular biological process

…is located within a certain cellular component

…as determined by a particular method

…as described in a particular reference

Adapted from Harold J Drabkin, The Jackson Laboratory

Smith et al. (2006) determined by an enzyme assay that Abc2 has protein kinase activity, is involved in the process of protein phosphorylation, and is located in the cytoplasm.

• Reference: Smith et al. (2006)
• Evidence code: determined by an enzyme assay
• Controlled vocabularies: protein kinase activity, protein phosphorylation, cytoplasm
• Gene product: Abc2

Adapted from Harold J Drabkin, The Jackson Laboratory
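The parts of the example statement can be captured in a small record. The following is a minimal Python sketch, not the data model used by TAIR or the GO Consortium: the class name `Annotation`, its field names, and the helper `annotations_from_example` are hypothetical, and the values are taken from the illustrative Smith et al. (2006) sentence.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotation statement (hypothetical structure, for illustration only)."""
    gene_product: str  # what is being annotated
    aspect: str        # molecular function, biological process, or cellular component
    term: str          # controlled-vocabulary term
    evidence: str      # method by which the statement was determined
    reference: str     # where the statement was described


def annotations_from_example() -> list[Annotation]:
    """Split the single example sentence into one record per controlled-vocabulary term."""
    gene, evidence, reference = "Abc2", "enzyme assay", "Smith et al. (2006)"
    terms = [
        ("molecular function", "protein kinase activity"),
        ("biological process", "protein phosphorylation"),
        ("cellular component", "cytoplasm"),
    ]
    return [Annotation(gene, aspect, term, evidence, reference) for aspect, term in terms]


if __name__ == "__main__":
    for record in annotations_from_example():
        print(record)
```

One sentence yields three records here because each record pairs the gene product with exactly one term; that split is a design choice of the sketch.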

Controlled vocabulary (CV)

Non-controlled vocabulary
• same name, different concept
• different name, same concept

Controlled vocabulary
• A standardized, restricted set of defined terms designed to reduce ambiguity in describing a concept

Same name, different concept

Cell

Same name, different concept

germination

seed germination, pollen germination, spore germination

Different name, same concept

• glucose biosynthesis = glucose synthesis = glucose formation = glucose anabolism = gluconeogenesis
  noncarbohydrate precursors (pyruvate, amino acids and glycerol) → glucose

• phytochromobilin synthase activity = phytochromobilin:ferredoxin oxidoreductase activity
  (3Z)-phytochromobilin + oxidized ferredoxin = biliverdin IXa + reduced ferredoxin (EC:1.3.7.4)

• protein formation = translation = protein biosynthesis

Cross-species cross-database comparison is problematic without CV

• translation
• protein biosynthesis

• phytochromobilin synthase activity
• phytochromobilin:ferredoxin oxidoreductase activity
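To make the comparison problem concrete, here is a minimal Python sketch in which two databases describe the same functions under different names. The synonym table, the choice of which name counts as the preferred term, and the function names `normalize` and `shared_terms` are assumptions made for illustration; they do not reproduce how GO stores synonyms.

```python
# Hypothetical synonym table: every free-text name maps to one preferred term.
# Which name is "preferred" is an arbitrary choice made for this sketch.
PREFERRED = {
    "translation": "translation",
    "protein biosynthesis": "translation",
    "phytochromobilin synthase activity":
        "phytochromobilin:ferredoxin oxidoreductase activity",
    "phytochromobilin:ferredoxin oxidoreductase activity":
        "phytochromobilin:ferredoxin oxidoreductase activity",
}


def normalize(name: str) -> str:
    """Map a free-text name to its preferred term; unknown names pass through."""
    key = name.strip().lower()
    return PREFERRED.get(key, key)


def shared_terms(db_a: set, db_b: set, use_cv: bool) -> set:
    """Return the descriptions two databases have in common."""
    if use_cv:
        return {normalize(n) for n in db_a} & {normalize(n) for n in db_b}
    return db_a & db_b


if __name__ == "__main__":
    db_a = {"translation", "phytochromobilin synthase activity"}
    db_b = {"protein biosynthesis",
            "phytochromobilin:ferredoxin oxidoreductase activity"}
    print(shared_terms(db_a, db_b, use_cv=False))  # plain string comparison
    print(shared_terms(db_a, db_b, use_cv=True))   # comparison after normalization
```

Without normalization the two sets share nothing, although both describe the same two concepts; after mapping to preferred terms they agree on both.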

Cross-species cross-database comparison is problematic without CV

pollen spore

germination

seed germination, pollen germination, spore germination

Controlled vocabularies used by TAIR

GO: The Gene Ontology, Gene Ontology Consortium

PO: The Plant Ontology, Plant Ontology Consortium

Gene Ontology

• molecular function: catalytic / binding activities
  kinase activity, DNA binding activity, transcription factor

• biological process: biological goal or objective
  signal transduction, mitosis, purine metabolism

• cellular component: location or complex
  nucleus, ribosome, proteasome

Term

Ontology structure: directed acyclic graph (DAG)

DAG: each child may have one or more parents

[Diagram: one child term with two parents, parent 1 and parent 2]

Ontology structure: directed acyclic graph (DAG)

[Diagram: a DAG of the terms protein complex, organelle, mitochondrion, and fatty acid beta-oxidation multienzyme complex]

Ontology structure: term-term relationships

[Diagram: the terms protein complex, organelle, mitochondrion, and fatty acid beta-oxidation multienzyme complex, connected by two is-a relationships and one part-of relationship]
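A DAG with typed relationships can be sketched as a mapping from each child term to its (relationship, parent) pairs. In the Python sketch below, the exact wiring of the edges is an assumption about what the diagram showed (mitochondrion is-a organelle; the multienzyme complex is-a protein complex and part-of mitochondrion), and the names `PARENTS` and `ancestors` are hypothetical.

```python
# Child term -> list of (relationship, parent term).
# ASSUMPTION: this edge wiring is a plausible reading of the slide, not a
# verified copy of the Gene Ontology.
PARENTS = {
    "mitochondrion": [("is-a", "organelle")],
    "fatty acid beta-oxidation multienzyme complex": [
        ("is-a", "protein complex"),
        ("part-of", "mitochondrion"),
    ],
}


def ancestors(term: str, parents: dict = PARENTS) -> set:
    """Collect every term reachable by following is-a and part-of edges upward."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in found:  # a term can be reached by more than one path
                found.add(parent)
                stack.append(parent)
    return found


if __name__ == "__main__":
    print(sorted(ancestors("fatty acid beta-oxidation multienzyme complex")))
```

Because a child may have more than one parent, the traversal keeps a set of visited terms rather than assuming a single path to the root, which is what distinguishes a DAG from a simple tree. Following both relationship types during traversal is a simplification of this sketch.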

Gene ontology browser: AmiGO

http://www.geneontology.org

http://amigo.geneontology.org

Plant Ontology

Plant structure
• morphological and anatomical structures
• stamen, petal, guard cell

Growth and developmental stages
• whole plant growth stages and plant structure developmental stages
• seedling growth, rosette growth, leaf development stages, embryo development stages

How are annotations made?

An association links a gene to a term, supported by evidence from a reference:

• gene: AT5G27620
• term: GO:0004672 protein kinase activity
• evidence: kinase assay
• reference: The Plant Journal (2006) 47:701

Evidence codes

Experimental evidence codes
• EXP - Inferred from Experiment
• IMP - Inferred from Mutant Phenotype
• IDA - Inferred from Direct Assay
• IGI - Inferred from Genetic Interaction
• IPI - Inferred from Physical Interaction
• IEP - Inferred from Expression Pattern

Computational analysis evidence codes
• ISS - Inferred from Sequence or structural Similarity
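The grouping of evidence codes lends itself to a simple lookup. This Python sketch covers only the seven codes named on this slide; GO defines further codes that are deliberately left out, and the function name `evidence_category` is hypothetical.

```python
# Only the codes listed on the slide; other GO evidence codes are not included.
EXPERIMENTAL = {
    "EXP": "Inferred from Experiment",
    "IMP": "Inferred from Mutant Phenotype",
    "IDA": "Inferred from Direct Assay",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IEP": "Inferred from Expression Pattern",
}
COMPUTATIONAL = {
    "ISS": "Inferred from Sequence or structural Similarity",
}


def evidence_category(code: str) -> str:
    """Classify an evidence code as experimental, computational, or not listed."""
    key = code.strip().upper()
    if key in EXPERIMENTAL:
        return "experimental"
    if key in COMPUTATIONAL:
        return "computational"
    return "not listed"


if __name__ == "__main__":
    for code in ("IDA", "ISS", "XYZ"):
        print(code, "->", evidence_category(code))
```

A lookup like this could be used, for example, to keep only experimentally supported annotations when comparing gene sets.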

Functional annotation of Arabidopsis genome using GO

[Chart, May 2008. Categories: Known; Known, EXP; Unknown; Unannotated]

Search GO Annotations

Papers entered into TAIR (May 07 to May 08)

                                Total   With gene-related data   Indexed   Curated
Papers in priority 1 journals   222     166                      100%      144 (86%)
Papers in priority 2 journals   546     385                      100%      207 (54%)
Papers in priority 3 journals   517     314                      100%      31 (10%)
Papers in priority 4 journals   1291    461                      100%      11 (2%)
Total                           2576    1326                     1326      393 (30%)

TAIR - Plant Physiology collaboration

• Author submits annotation after the paper is accepted

• Web-based interface

• AGI locus identifier (At1g01040)

• Gene function annotation linked to loci with method

• Will expand to include other journals (Plant Cell ...)
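Since submissions are keyed on AGI locus identifiers, a submission tool would likely check their format first. The Python sketch below is an illustration only: the pattern (the letters AT, a chromosome code of 1 to 5, C, or M, the letter G, then five digits) is an assumption generalized from identifiers such as At1g01040 and AT5G27620, and the function name `normalize_agi` is hypothetical rather than part of any TAIR interface.

```python
import re

# ASSUMED format: "AT" + chromosome (1-5, C, or M) + "G" + five digits.
AGI_PATTERN = re.compile(r"AT[1-5CM]G\d{5}", re.IGNORECASE)


def normalize_agi(identifier: str) -> str:
    """Return the identifier in upper case, or raise ValueError if malformed."""
    candidate = identifier.strip()
    if AGI_PATTERN.fullmatch(candidate) is None:
        raise ValueError(f"not a recognizable AGI locus identifier: {identifier!r}")
    return candidate.upper()


if __name__ == "__main__":
    print(normalize_agi("At1g01040"))
    print(normalize_agi("AT5G27620"))
```

Normalizing to one letter case means that At1g01040 and AT1G01040 are treated as the same locus when annotations are linked.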

Functional annotation submission form

[email protected]

Add your comment on TAIR