Outline
Homology
Profile-HMMs
Domains
Protein-family Databases
How to build a new (Pfam) protein family
Function annotation transfer
Pfam database
EMBO Workshop, Cape Town, 2014
Slide 2
Homology
Slide 3
Definition: Two proteins are homologous if they share a common
ancestor, i.e. they are evolutionarily related.
Slide 4
Symmetric: if A is homologous to B, then B is homologous to A.
Transitive: if A is homologous to B AND B is homologous to C, then A is homologous to C.
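Because homology is symmetric and transitive, pairwise homology calls can be merged into groups. The following is a minimal sketch in Python, assuming a hypothetical list of protein pairs that have already been judged homologous; the names A to E and the function `homology_groups` are illustrative only and do not come from the slides.

```python
def homology_groups(pairs):
    """Group proteins into sets connected by pairwise homology calls."""
    parent = {}

    def find(x):
        # Return the representative of x's group (union-find with path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for protein in list(parent):
        groups.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical input: A~B and B~C imply A~C by transitivity.
pairs = [("A", "B"), ("B", "C"), ("D", "E")]
groups = homology_groups(pairs)
print(groups)
```

A and C end up in the same group although they were never compared directly, which is the transitivity property stated above.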
Slide 5
Detecting homology
Slide 6
Sequence similarity
Human:   1 MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE 60
           MGLSDGEWQLVLNVWGKVEAD  GHGQEVLI LFK HPETL KFDKFK LKSE  MK SE
Mouse:   1 MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE 60
Human:  61 DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH 120
           DLKKHG TVLTALG ILKKKG H AEI PLAQSHATKHKIPVKYLEFISE II VL   H
Mouse:  61 DLKKHGCTVLTALGTILKKKGQHAAEIQPLAQSHATKHKIPVKYLEFISEIIIEVLKKRH 120
Human: 121 PGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154
            GDFGADAQGAM KALELFR D A  YKELGFQG
Mouse: 121 SGDFGADAQGAMSKALELFRNDIAAKYKELGFQG 154
By excess similarity (see Pearson, Curr Protoc Bioinformatics 2013)
Statistical significance (e.g. E-values)
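To make "excess similarity" concrete, the sketch below counts identical positions between the human and mouse sequences aligned above and compares that count with what shuffled versions of the mouse sequence achieve. This is an illustrative simplification under stated assumptions: the comparison is ungapped and position-by-position, and the helper names `identities` and `shuffle_p_value` are hypothetical. Search tools such as BLAST derive E-values from the distribution of local alignment scores, not from a shuffle test like this one.

```python
import random

human = (
    "MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE"
    "DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH"
    "PGDFGADAQGAMNKALELFRKDMASNYKELGFQG"
)
mouse = (
    "MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE"
    "DLKKHGCTVLTALGTILKKKGQHAAEIQPLAQSHATKHKIPVKYLEFISEIIIEVLKKRH"
    "SGDFGADAQGAMSKALELFRNDIAAKYKELGFQG"
)


def identities(a, b):
    """Count identical residues in an ungapped, position-by-position comparison."""
    return sum(x == y for x, y in zip(a, b))


def shuffle_p_value(a, b, n_shuffles=1000, seed=0):
    """Empirical p-value: how often does a shuffled b match a as well as the real b?"""
    rng = random.Random(seed)
    observed = identities(a, b)
    residues = list(b)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition, destroys order
        if identities(a, residues) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_shuffles + 1)


observed, p_value = shuffle_p_value(human, mouse)
print(f"{observed}/{len(human)} identical positions, empirical p <= {p_value:.4f}")
```

Shuffling preserves amino-acid composition, so any similarity that survives in the real pair but not in the shuffled pairs is similarity in excess of what composition alone explains.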
Slide 7
Structural similarity
2G2X:  1 MAYWLMKSEPDELSIEALARLGEARWDGVRNYQARNFLRAMSVGDEFFFYH-----SSCP 55
         MAYWL     D              W     Y   N      VGD    Y
2P5D:  4 MAYWLCITNEDNWKVIKEKKI----WGVAERY--KNTINKVKVGDKLIIYEIQRSGKDYK 57
2G2X: 56 QPGIAGIARITRAAYPD------PTALDPESHY 82
          P I G        Y D      PT   P
2P5D: 58 PPYIRGVYEVVSEVYKDSSKIFKPTPRNPNEKF 90
Excess sequence similarity?
Genomic context
http://www.microbesonline.org
See e.g. Jun et al. BMC Genomics 2009
Slide 12
Genomic context
http://www.microbesonline.org
Homology
See e.g. Jun et al. BMC Genomics 2009
Slide 13
Genomic context
http://www.microbesonline.org
Homology?
See e.g. Jun et al. BMC Genomics 2009
Slide 14
Genomic context
http://www.microbesonline.org
Mostly used for distinguishing orthology from paralogy
Slide 15
Origins of homology in proteins
Slide 16
Origin of homology in proteins
Speciation (orthology)
Gene duplication (paralogy)
Horizontal gene transfer (xenology)
Whole genome duplication (ohnology)
Gametology
Slide 17
Orthology
Myoglobin: serves as a reserve supply of oxygen and facilitates
the movement of oxygen within muscles.
Slide 18
Origin of protein homology
Speciation (orthology)
Gene duplication (paralogy)
Horizontal gene transfer (xenology)
Whole genome duplication (ohnology)
Gametology
Slide 19
Paralogy
Myoglobin: serves as a reserve supply of oxygen and facilitates
the movement of oxygen within muscles.
Hemoglobin: oxygen-transport protein in the red blood cells of vertebrates.
Slide 20
Slide 21
[Figure: Ancestral Globin; labels: A, B, C, Myo, Hemo]
Slide 22
Slide 23
[Figure: Ancestral Globin; labels: A, B, C, Myo (x3), Hemo (x3)]
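One way to read the globin figure is that a duplication of the ancestral globin produced the Myo and Hemo lineages, which were then inherited by species A, B and C through speciation. The sketch below labels each pair of genes as orthologs or paralogs by the event at their last common ancestor in a gene tree. The tree topology, the leaf names such as `Myo_A`, and the function names are assumptions made for illustration; the slides do not specify them.

```python
from itertools import combinations

# Hypothetical gene tree: internal nodes are (event, children), leaves are strings.
tree = ("duplication", [
    ("speciation", ["Myo_A", ("speciation", ["Myo_B", "Myo_C"])]),
    ("speciation", ["Hemo_A", ("speciation", ["Hemo_B", "Hemo_C"])]),
])


def leaves(node):
    """Return all leaf names below a node."""
    if isinstance(node, str):
        return [node]
    _event, children = node
    return [leaf for child in children for leaf in leaves(child)]


def classify_pairs(node, result=None):
    """Label each leaf pair by the event at its last common ancestor."""
    if result is None:
        result = {}
    if isinstance(node, str):
        return result
    event, children = node
    relation = "orthologs" if event == "speciation" else "paralogs"
    # Leaves from different children of this node have this node as their LCA.
    for left, right in combinations(children, 2):
        for a in leaves(left):
            for b in leaves(right):
                result[frozenset((a, b))] = relation
    for child in children:
        classify_pairs(child, result)
    return result


relations = classify_pairs(tree)
print(relations[frozenset(("Myo_A", "Myo_B"))])
print(relations[frozenset(("Myo_A", "Hemo_A"))])
```

Under this assumed tree, myoglobins from different species are orthologs of each other, while any myoglobin and any hemoglobin are paralogs, because their last common ancestor is the duplication node.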
Slide 24
Origin of protein homology
Speciation (orthology)
Gene duplication (paralogy)
Horizontal gene transfer (xenology)
Whole genome duplication (ohnology)
Gametology, Synology
Slide 25
Mindell and Meyer Trends in Ecology and Evolution 2001
Slide 26
Homology: why bother?
Slide courtesy of Alex Mitchell (EMBL-EBI)
Protein function(s)
Schubert et al. Nat. Struct. Biol. 5 (1998)
Slide 29
The Gene Ontology (GO)
A way to capture biological knowledge in a written and computable form
A set of concepts and their relationships to each other
www.ebi.ac.uk/QuickGO
Slide courtesy of Alex Mitchell (EMBL-EBI)
Slide 30
GO: 3 ontologies in 1
1. Molecular Function: an elemental activity or task or job (e.g. protein kinase activity, insulin receptor activity)
2. Biological Process: a commonly recognised series of events (e.g. cell division)
3. Cellular Component: where a gene product is located (e.g. mitochondrion, mitochondrial matrix, mitochondrial inner membrane)
Slide courtesy of Alex Mitchell (EMBL-EBI)
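The idea of "concepts and their relationships" in a computable form can be sketched as a small directed graph. The fragment below is a toy model, not the real ontology: it reuses the example terms from the slide, omits GO identifiers and all intermediate terms, and the edges are simplified shortcuts chosen for illustration. The names `toy_go` and `ancestors` are hypothetical.

```python
# Toy fragment: term -> list of (relation, parent term).
# Simplified for illustration; the real GO has intermediate terms between these.
toy_go = {
    "mitochondrial matrix": [("part_of", "mitochondrion")],
    "mitochondrial inner membrane": [("part_of", "mitochondrion")],
    "mitochondrion": [("is_a", "cellular component")],
    "protein kinase activity": [("is_a", "molecular function")],
    "insulin receptor activity": [("is_a", "molecular function")],
    "cell division": [("is_a", "biological process")],
}


def ancestors(term, graph):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


matrix_ancestors = ancestors("mitochondrial matrix", toy_go)
print(sorted(matrix_ancestors))
```

Following the links upward is what makes an annotation computable: a gene product annotated to the mitochondrial matrix is implicitly also located in the mitochondrion.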
Slide 31
Slide 32
Slide 33
Slide 34
Protein Families
Slide 35
http://www.studyblue.com/notes/note/n/exam-3/deck/8955883
Globins in Human
Slide 36
Definition: A family is a group of evolutionarily related
proteins or protein regions.
Slide 37
Why protein families?
[Diagram labels: P, A]
Slide 38
Why protein families?
[Human/mouse pairwise alignment, identical to the one on slide 6]
Slide 39
Slide 40
[Diagram labels: P, A, B, C, D, E, F, G, H]
Slide 41
We can detect functionally important residues
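With several family members aligned, positions that never change become visible. The sketch below scores each column of a multiple alignment by the fraction of sequences sharing its most common residue. The four aligned sequences are a hypothetical toy example, not data from the slides, and the function name `column_conservation` is illustrative. Strong conservation suggests, but does not prove, functional importance.

```python
from collections import Counter

# Hypothetical toy alignment: equal-length rows, '-' marks a gap.
alignment = [
    "MKHLAG-W",
    "MRHLSGAW",
    "MKHIAGSW",
    "MQHLTG-W",
]


def column_conservation(rows):
    """Per column: fraction of rows sharing the most common residue (gaps excluded)."""
    scores = []
    for column in zip(*rows):
        counts = Counter(residue for residue in column if residue != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


scores = column_conservation(alignment)
# Report 1-based positions that are identical in every sequence.
conserved = [i + 1 for i, score in enumerate(scores) if score == 1.0]
print(conserved)
```

A pairwise alignment cannot make this distinction reliably, since two closely related sequences agree at most positions whether or not those positions matter.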
Slide 42
Slide 43
We have a window open on evolutionary diversity
[Human/mouse pairwise alignment, identical to the one on slide 6]
Slide 44
We have a window open on evolutionary diversity
Slide 45
Example (using homology for protein annotation)
Slide 46
H. influenzae protein (3M71), 1.20 Å
Chen et al. Nature 467 (2010)
New York Consortium on Membrane Protein Structure (NYCOMPS)
TUM, January 2013