Global Annotation of the Protein Kinase Family Michael Gribskov University of California, San Diego


Global Annotation of the Protein Kinase Family

Michael Gribskov

University of California, San Diego

Signaling Cascades

Statistics

• Arabidopsis
  • 1028 putative kinases
  • 58 potentially alternatively spliced
  • 82% confirmed by full-length cDNA
  • Fewer than 100 experimentally investigated

• Rice
  • 1565 putative kinases

• What are the functions of each protein kinase?
  • Functional groupings
  • Substrate prediction
  • Pathway analysis and modeling

Targets

• Protein kinase
• Protein phosphatase
• Membrane transporters
• Proteasome complex

Some Receptor Kinases

Class I (EGF receptor)

Class II (Insulin receptor)

Class III (FGF receptor)

Requirements for Functional Clustering

• Must handle a very large number of objects (over 1200 for plants, over 9000 for all species)

• Must deal sensibly with paralogs from a functional point of view

• Must be based on entire sequence, not just kinase catalytic domain

• Must be tolerant to sequence errors and omissions

Orthology vs Paralogy

• Relationships between genes in multigene families are complex

• Multiple genes may exist before speciation
• Genes may be lost and replaced along lineages
• “Function space” must be filled

[Figure labels: Species A, Species B]

Maximal Linkage Clustering

Clustering

Clustering

[Figure: three panels. (A) objects a to h; (B) average linkage, leaf order a b c e f g h d; (C) maximum linkage, leaf order a b c d e f g h]
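The difference between average and maximum linkage can be illustrated with a small sketch. This is not the code behind the figure: the function names (`agglomerate`, `average`), the toy positions, and the dictionary-based distance table are all assumptions made for illustration. The only thing that changes between the two runs is how member-to-member distances are reduced to one inter-cluster distance.

```python
from itertools import combinations

def agglomerate(dist, linkage):
    """Greedy agglomerative clustering (illustrative sketch).

    dist maps frozenset({x, y}) to a distance; linkage reduces the list of
    member-to-member distances between two clusters to a single number.
    Returns the merge history as (cluster_a, cluster_b, merge_distance).
    """
    names = sorted({name for pair in dist for name in pair})
    clusters = [frozenset([name]) for name in names]
    history = []

    def between(a, b):
        # All distances between members of a and members of b, reduced by linkage.
        return linkage([dist[frozenset((x, y))] for x in a for y in b])

    while len(clusters) > 1:
        # Merge the closest pair of clusters under the chosen linkage.
        a, b = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        history.append((sorted(a), sorted(b), between(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return history

def average(values):
    return sum(values) / len(values)

# Toy data (hypothetical): four objects placed on a line.
positions = {"a": 0, "b": 1, "c": 3, "d": 7}
dist = {frozenset((x, y)): abs(positions[x] - positions[y])
        for x, y in combinations(positions, 2)}

max_history = agglomerate(dist, max)      # maximum (complete) linkage
avg_history = agglomerate(dist, average)  # average linkage

for label, history in (("maximum", max_history), ("average", avg_history)):
    print(label, [round(step[2], 2) for step in history])
```

On this toy input both linkages merge in the same order, but maximum linkage reports the distance to the farthest member, so its merge heights are larger and a fixed cutoff yields tighter groups.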

Clustering/Classification

Maximum linkage

Clustering/Classification

• Pairwise distances
• All-against-all BLAST
  • Uses entire sequence
  • Alignments not required
  • Longer matches, i.e. more domains, give better score

[Figure: plot of Number (0 to 25000) against -log(E-value) (0 to 180)]
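One way to turn all-against-all BLAST results into pairwise scores is sketched below. It assumes BLAST tabular output (12 tab-separated columns, E-value in the 11th), a base-10 logarithm, a floor of 1e-180 for E-values reported as 0, and keeping the better of the two directions for each pair. None of these choices is stated on the slide, and the sequence names and numbers are made up.

```python
import math

def blast_scores(lines, floor=1e-180):
    """Map each unordered sequence pair to -log10(E-value).

    Assumptions (not from the slides): tabular BLAST output with the E-value
    in column 11, E-values clamped at `floor` so that 0.0 stays finite, and
    the best score kept when a pair is reported in both directions.
    """
    scores = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query == subject:
            continue  # self-hits carry no pairwise information
        score = -math.log10(max(evalue, floor))
        pair = frozenset((query, subject))
        if pair not in scores or score > scores[pair]:
            scores[pair] = score
    return scores

# Hypothetical hits; names and values are illustrative only.
hits = [
    "kinA\tkinB\t45.0\t300\t160\t5\t1\t300\t1\t300\t1e-80\t290",
    "kinB\tkinA\t45.0\t300\t160\t5\t1\t300\t1\t300\t1e-78\t285",
    "kinA\tkinC\t30.0\t250\t170\t8\t10\t260\t5\t255\t0.0\t900",
    "kinA\tkinA\t100.0\t400\t0\t0\t1\t400\t1\t400\t0.0\t800",
]
scores = blast_scores(hits)
for pair, score in sorted(scores.items(), key=lambda item: item[1]):
    print(sorted(pair), round(score, 1))
```

Because the E-value of a hit depends on its full length, pairs that share more domains score higher, with no multiple alignment needed. A clustering step would still have to convert these similarity scores into distances, for example by subtracting from the maximum score, which is again an assumption.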

Basic Approach

• Maximum linkage clustering up to “natural” limit
• Recalculate average distances between groups
• Repeat until tree is complete
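The three steps above can be sketched as follows. The slide does not say how the “natural” limit is chosen or how it changes between rounds, so this sketch takes a caller-supplied list of increasing limits, which is purely an assumption. The function names and toy data are hypothetical as well.

```python
from itertools import combinations

def max_linkage_round(groups, gdist, limit):
    """Merge groups by maximum linkage while the merge distance is <= limit."""
    clusters = [[g] for g in groups]

    def between(a, b):
        return max(gdist[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        if between(a, b) > limit:
            break  # the limit for this round has been reached
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return clusters

def average_distance(g, h, dist):
    """Mean of the original pairwise distances between members of g and h."""
    return sum(dist[frozenset((x, y))] for x in g for y in h) / (len(g) * len(h))

def hybrid_cluster(items, dist, limits):
    """Alternate maximum-linkage rounds with average-distance recalculation."""
    groups = [frozenset([item]) for item in items]
    levels = []
    for limit in limits:  # assumed schedule of limits, one per round
        gdist = {frozenset((g, h)): average_distance(g, h, dist)
                 for g, h in combinations(groups, 2)}
        merged = max_linkage_round(groups, gdist, limit)
        groups = [frozenset().union(*cluster) for cluster in merged]
        levels.append(sorted(sorted(g) for g in groups))
        if len(groups) == 1:
            break  # tree is complete
    return levels

# Toy data (hypothetical): five objects on a line, three rounds.
positions = {"a": 0, "b": 1, "c": 3, "d": 10, "e": 11}
dist = {frozenset((x, y)): abs(positions[x] - positions[y])
        for x, y in combinations(positions, 2)}
levels = hybrid_cluster(list(positions), dist, [2, 5, float("inf")])
for level in levels:
    print(level)
```

Each printed level is one layer of the tree: tight groups form first under maximum linkage, and later rounds join whole groups using averaged distances.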

Complete Kinase Clustering

Statistics

• Class 1: RLKs (transmembrane) and RLCKs
• Class 2: “Raf-like”
• Class 3: Casein Kinase and CLK
• Class 4: Non-TM, Non-Receptor

BLAST Distance

Entire Sequence

BLAST Distance

Non-Kinase Domain

Yeast Signaling (MAPK)

Validating Transgenomic Predictions

SnRK

• At AKIN10 and AKIN11
• Rescue yeast SNF1 deletion
• Functional homolog

[Figure: cluster tree. Annotations: 50; E = 10^-80; see Fig. 2. Leaf labels in extracted order:]

GIN4/ERC47/CLA6/D9719.13/YDR507C, KCC4/YCL024W, HSL1/(SEL2)/NIK1/YKL453/YKL101W, SNF1/CAT1/GLC2/CCR1/PAS14/HAF3/D8035.20/YDR477W

At5g39440, At3g29160/AKIN11, At3g01090/AKIN10, At5g58380, At5g07070, At5g01810, At5g45820, At4g30960, At5g25110, At5g10930, At2g25090, At2g30360, At5g01820/AtSR1, At2g38490, At3g23000/AtSR2, At4g14580, At1g01140, At1g30270, At2g26980, At4g24400, At5g35410/SOS2, At1g48260, At3g17510, At5g57630, At1g60940, At1g10940, At5g08590, At5g63650, At2g23030, At1g78290, At3g50500, At5g66880, At4g33950, At4g40010, At1g29230, At2g34180, At4g18700, At5g45810

KIN1/YD9727.17/YDR122W, KIN2/L8004.3/L2546/YLR096W, KIN4/KIN31/(KIN3)/O5220/YOR233W, YPL141C/LPI5, YPL150W/P2597

MAPK

[Figure: cluster tree. Annotation: 50. Leaf labels in extracted order:]

Erk1/Human/CMGC MAPK ERK
Erk2/Human/CMGC MAPK ERK
rl/Fruit Fly/CMGC MAPK ERK
mpk 1/Nematode worm/CMGC MAPK ERK
Erk5/Human/CMGC MAPK ERK
FUS3/YBL016W/Bakers Yeast/CMGC MAPK ERK
KSS1/YGR040W/Bakers Yeast/CMGC MAPK ERK
HOG1/YLR113W/Bakers Yeast/CMGC MAPK p38
At1g10210.1 4 5 1 AtMPK1 MAP kinase 1
At1g59580.1 4 5 1 AtMPK2 MAP kinase 2
At1g59580.2 AtMPK2 MAP kinase 2
9630.m00469/protein MAP kinase MAPK2
9634.m04729/protein MAP kinase 2
At2g18170.1 4 5 1 AtMPK7 MAP kinase 7
At4g36450.1 4 5 1 MPK14 MAP kinase 14
At2g43790.1 4 5 1 AtMPK6 MAP kinase 6
9634.m00532/protein Protein kinase domain putative
At3g45640.1 4 5 1 AtMPK3 MAP kinase 3
9631.m01739/protein Protein kinase domain putative
At3g59790.1 4 5 1 AtMPK10 MAP kinase 10
At2g46070.1 4 5 1 AtMPK12 MAP kinase 12
At4g01370.1 4 5 1 AtMPK4 MAP kinase 4
9636.m00537/protein mitogen activated protein kinase MMK2 <EC
9638.m03508/protein putative serine/threonine protein kinase

SLT2/YHR030C/Bakers Yeast/CMGC MAPK ERK
YKL161C/Bakers Yeast/CMGC MAPK ERK
At4g11330.1 4 5 1 AtMPK5 MAP kinase 5
NLK/Human/CMGC MAPK nmo
nmo/Fruit Fly/CMGC MAPK nmo
lit 1/Nematode worm/CMGC MAPK nmo
At1g01560.1 4 5 1 MPK11 MAP kinase 11
At1g07880.1 4 5 1 AtMPK13 MAP kinase 13
At1g18150.1 4 5 1 AtMPK8 MAP kinase 8
At3g18040.2 AtMPK9 MAP kinase 9
At1g18150.2 AtMPK8 MAP kinase 8
At1g53510.1 4 5 1 AtMPK18 MAP kinase 18
At2g42880.1 4 5 1 AtMPK20 MAP kinase 20
9630.m00329/protein MAP kinase homolog
9633.m04760/protein MAP kinase putative
9633.m00448/protein expressed protein
9633.m04713/protein ATMPK9
9634.m04815/protein MAP kinase homolog
At3g14720.1 4 5 1 AtMPK19 MAP kinase 19
At2g01450.1 4 5 1 AtMPK17 MAP kinase 17
9629.m04359/protein blast and wounding induced
At3g18040.1 4 5 1 AtMPK9 MAP kinase 9
At5g19010.1 4 5 1 AtMPK16 MAP kinase 16
9629.m04231/protein Protein kinase domain putative
9634.m02609/protein mitogen activated protein kinase homologue
9629.m04560/protein Protein kinase domain putative
At1g73670.1 4 5 1 AtMPK15 MAP kinase 15
9633.m04602/protein Protein kinase domain putative
Erk7/Human/CMGC MAPK Erk7
CG2309/Fruit Fly/CMGC MAPK Erk7
C05D10.2/Nematode worm/CMGC MAPK Erk7
SMK1/YPR054W/Bakers Yeast/CMGC MAPK ERK
JNK1/Human/CMGC MAPK JNK
JNK3/Human/CMGC MAPK JNK
JNK2/Human/CMGC MAPK JNK
bsk/Fruit Fly/CMGC MAPK JNK
jnk 1/Nematode worm/CMGC MAPK JNK
T07A9.3/Nematode worm/CMGC MAPK JNK
ZC416.4/Nematode worm/CMGC MAPK JNK
p38a/Human/CMGC MAPK p38
p38b/Human/CMGC MAPK p38
Mpk2/Fruit Fly/CMGC MAPK p38
p38b/Fruit Fly/CMGC MAPK p38
p38d/Human/CMGC MAPK p38
p38g/Human/CMGC MAPK p38
pmk 1/Nematode worm/CMGC MAPK p38
pmk 2/Nematode worm/CMGC MAPK p38
P38c/Fruit Fly/CMGC MAPK p38
pmk 3/Nematode worm/CMGC MAPK p38
C04G6.1/Nematode worm/CMGC MAPK
F09C12.2/Nematode worm/CMGC MAPK
W06B3.2/Nematode worm/CMGC MAPK

MEME PSSM

PPC4.2.6 MEME Motifs

Summary

• Functional groups by clustering
• Functional assignment by transgenomic comparison
• Directed search for functional motifs by motif comparison
• Construction of public data resources

Bioinformatics Group

• Michael Gribskov
• Fariba Fana
• Degeng Wang
• Sheila Podell
• Tobey Tam *
• Jason Tchieu *
• Hannes Niedner

• Douglas Smith
• Guangfa Zhang *

• Jeff Harper

• Major Contributors
  • Catherine Chan
  • Alice Harmon
  • Estelle Hrabak
  • David Kerk
  • Shinhan Shiu