Homology Profile-HMMs Domains Protein-family Databases How to build a new (Pfam) protein family EMBO Workshop, Cape Town, 2014 Function annotation transfer

  • Slide 1
  • Outline: Homology; Profile-HMMs; Domains; Protein-family databases; How to build a new (Pfam) protein family; Function annotation transfer; Pfam database. EMBO Workshop, Cape Town, 2014
  • Slide 2
  • Homology
  • Slide 3
  • Definition: Two proteins are homologous if they share a common ancestor, i.e. they are evolutionarily related.
  • Slide 4
  • Symmetric: if A is homologous to B, then B is homologous to A. Transitive: if A is homologous to B AND B is homologous to C, then A is homologous to C.
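Because homology is symmetric and transitive, a set of pairwise homology calls can be merged into groups. The sketch below illustrates this with a small union-find; the function name `homology_groups` and the protein labels are made up for the example and are not from the slides.

```python
def homology_groups(proteins, homologous_pairs):
    """Group proteins so that any two linked by a chain of homology calls share a group."""
    parent = {p: p for p in proteins}

    def find(p):
        # Walk up to the group representative, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in homologous_pairs:
        # Symmetry: (a, b) and (b, a) lead to the same merge.
        parent[find(a)] = find(b)

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())


# A~B and C~B imply A~C by transitivity; D has no known homolog here.
print(homology_groups(["A", "B", "C", "D"], [("A", "B"), ("C", "B")]))
```

In practice each pairwise call carries some uncertainty, so chaining many weak calls this way can merge unrelated proteins; the sketch assumes every input pair is a true homology.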
  • Slide 5
  • Detecting homology
  • Slide 6
  • Sequence similarity: by excess similarity (see Pearson, Curr Protoc Bioinformatics 2013); statistical significance (e.g. E-values).
      Human:   1 MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE 60
                 MGLSDGEWQLVLNVWGKVEAD  GHGQEVLI LFK HPETL KFDKFK LKSE  MK SE
      Mouse:   1 MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE 60
      Human:  61 DLKKHGATVLTALGGILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISECIIQVLQSKH 120
                 DLKKHG TVLTALG ILKKKG H AEI PLAQSHATKHKIPVKYLEFISE II VL   H
      Mouse:  61 DLKKHGCTVLTALGTILKKKGQHAAEIQPLAQSHATKHKIPVKYLEFISEIIIEVLKKRH 120
      Human: 121 PGDFGADAQGAMNKALELFRKDMASNYKELGFQG 154
                  GDFGADAQGAM KALELFR D A  YKELGFQG
      Mouse: 121 SGDFGADAQGAMSKALELFRNDIAAKYKELGFQG 154
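The middle line of such an alignment marks identical residues. As a minimal sketch, the helpers below rebuild that line and compute a percent identity for the first 60 aligned residues shown on the slide. The function names are hypothetical, and percent identity here is defined over the alignment length, which is only one of several conventions. It is not an E-value and does not by itself establish statistical significance.

```python
def midline(a, b):
    """Return the identity line for two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    # Keep the residue where both rows agree (gap columns never count as a match).
    return "".join(x if x == y and x != "-" else " " for x, y in zip(a, b))


def percent_identity(a, b):
    """Identical columns as a percentage of the alignment length."""
    matches = sum(1 for c in midline(a, b) if c != " ")
    return 100.0 * matches / len(a)


# First alignment block from the slide (residues 1-60).
human = "MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE"
mouse = "MGLSDGEWQLVLNVWGKVEADLAGHGQEVLIGLFKTHPETLDKFDKFKNLKSEEDMKGSE"

print(midline(human, mouse))
print(percent_identity(human, mouse))
```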
  • Slide 7
  • Structural similarity. Excess sequence similarity? (2G2X vs. 2P5D)
      2G2X:   1 MAYWLMKSEPDELSIEALARLGEARWDGVRNYQARNFLRAMSVGDEFFFYH-----SSCP 55
                MAYWL     D              W     Y   N      VGD    Y
      2P5D:   4 MAYWLCITNEDNWKVIKEKKI----WGVAERY--KNTINKVKVGDKLIIYEIQRSGKDYK 57
      2G2X:  56 QPGIAGIARITRAAYPD------PTALDPESHY 82
                 P I G        Y D      PT   P
      2P5D:  58 PPYIRGVYEVVSEVYKDSSKIFKPTPRNPNEKF 90
  • Slide 8
  • 2G2X 2P5D Structural similarity
  • Slide 9
  • Structural similarity 2G2X 2P5D
  • Slide 10
  • Structural similarity, 2G2X vs. 2P5D: Z-score = 12.2, RMSD = 2.9 Å, Lali = 122, %id = 20. DALI: http://ekhidna.biocenter.helsinki.fi/dali_lite/start
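For readers unfamiliar with RMSD, here is a minimal sketch of the formula for paired atom positions, RMSD = sqrt(sum of squared distances / N). It assumes the two coordinate lists are already superposed and matched one to one. DALI performs its own structural alignment and scoring, which this does not reproduce; the function name and toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (not taken from 2G2X or 2P5D).
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(a, b))
```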
  • Slide 11
  • Genomic context (http://www.microbesonline.org). See e.g. Jun et al. BMC Genomics 2009
  • Slide 12
  • Genomic context: Homology (http://www.microbesonline.org). See e.g. Jun et al. BMC Genomics 2009
  • Slide 13
  • Genomic context: Homology? (http://www.microbesonline.org). See e.g. Jun et al. BMC Genomics 2009
  • Slide 14
  • Genomic context (http://www.microbesonline.org): mostly used for distinguishing orthology from paralogy
  • Slide 15
  • Origins of homology in proteins
  • Slide 16
  • Origin of homology in proteins: speciation (orthology); gene duplication (paralogy); horizontal gene transfer (xenology); whole genome duplication (ohnology); gametology
  • Slide 17
  • Orthology. Myoglobin: serves as a reserve supply of oxygen and facilitates the movement of oxygen within muscles.
  • Slide 18
  • Origin of protein homology: speciation (orthology); gene duplication (paralogy); horizontal gene transfer (xenology); whole genome duplication (ohnology); gametology
  • Slide 19
  • Paralogy. Myoglobin: serves as a reserve supply of oxygen and facilitates the movement of oxygen within muscles. Hemoglobin: oxygen-transport protein in the red blood cells of vertebrates.
  • Slide 20
  • Slide 21
  • [Figure: tree diagram; labels: Ancestral Globin, A, B, C, Myo, Hemo]
  • Slide 22
  • [Figure: tree diagram; labels: Ancestral Globin, A, B, C, Myo, Hemo]
  • Slide 23
  • [Figure: tree diagram; labels: Ancestral Globin, A, B, C, with Myo and Hemo repeated at several leaves]
  • Slide 24
  • Origin of protein homology: speciation (orthology); gene duplication (paralogy); horizontal gene transfer (xenology); whole genome duplication (ohnology); gametology, synology
  • Slide 25
  • Mindell and Meyer Trends in Ecology and Evolution 2001
  • Slide 26
  • Homology: why bother? Slide courtesy of Alex Mitchell (EMBL-EBI)
  • Slide 27
  • Homology: why bother? Function? Structure (homology modeling)
  • Slide 28
  • Protein function(s). Schubert et al. Nat. Struct. Biol. 5 (1998)
  • Slide 29
  • The Gene Ontology (GO): a way to capture biological knowledge in a written and computable form; a set of concepts and their relationships to each other. www.ebi.ac.uk/QuickGO. Slide courtesy of Alex Mitchell (EMBL-EBI)
  • Slide 30
  • GO: 3 ontologies in 1. (1) Molecular Function: an elemental activity or task or job, e.g. protein kinase activity, insulin receptor activity. (2) Biological Process: a commonly recognised series of events, e.g. cell division. (3) Cellular Component: where a gene product is located, e.g. mitochondrion, mitochondrial matrix, mitochondrial inner membrane. Slide courtesy of Alex Mitchell (EMBL-EBI)
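To make "written and computable form" concrete, the sketch below stores the example terms from the slide together with the ontology each belongs to, then groups them. The class, field and variable names are assumptions for illustration; real GO terms also carry identifiers and relationships to other terms, which are left out here.

```python
from dataclasses import dataclass

ONTOLOGIES = ("molecular_function", "biological_process", "cellular_component")


@dataclass(frozen=True)
class GOTerm:
    name: str
    ontology: str  # one of ONTOLOGIES


# Example terms as listed on the slide.
EXAMPLE_TERMS = [
    GOTerm("protein kinase activity", "molecular_function"),
    GOTerm("insulin receptor activity", "molecular_function"),
    GOTerm("cell division", "biological_process"),
    GOTerm("mitochondrion", "cellular_component"),
    GOTerm("mitochondrial matrix", "cellular_component"),
    GOTerm("mitochondrial inner membrane", "cellular_component"),
]


def terms_by_ontology(terms):
    """Group term names under the ontology they belong to."""
    grouped = {ontology: [] for ontology in ONTOLOGIES}
    for term in terms:
        grouped[term.ontology].append(term.name)
    return grouped


for ontology, names in terms_by_ontology(EXAMPLE_TERMS).items():
    print(ontology, "->", ", ".join(names))
```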
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Protein Families
  • Slide 35
  • Globins in Human (http://www.studyblue.com/notes/note/n/exam-3/deck/8955883)
  • Slide 36
  • Definition: A family is a group of evolutionarily related proteins or protein regions.
  • Slide 37
  • Why protein families? [Figure: proteins P and A]
  • Slide 38
  • Why protein families? (Human vs. Mouse pairwise alignment, as on Slide 6)
  • Slide 39
  • Why protein families? (Human vs. Mouse pairwise alignment, as on Slide 6)
  • Slide 40
  • [Figure: proteins P, A, B, C, D, E, F, G, H]
  • Slide 41
  • We can detect functionally important residues
  • Slide 42
  • We can detect functionally important residues
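One simple way to look for candidate functionally important residues in a family alignment is to find columns where every sequence has the same residue. The sketch below does this on a toy alignment. The sequences, function names and threshold are all hypothetical. Conservation only suggests importance and is not proof of function.

```python
from collections import Counter


def column_conservation(alignment):
    """For each column, return (most common residue, its frequency over all sequences)."""
    if not alignment or len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    result = []
    for column in zip(*alignment):
        residues = [c for c in column if c not in "-."]  # ignore gap characters
        if not residues:
            result.append(("-", 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Gaps lower the frequency because we divide by the number of sequences.
        result.append((residue, count / len(alignment)))
    return result


def conserved_positions(alignment, threshold=1.0):
    """1-based column numbers whose top residue frequency reaches the threshold."""
    return [i + 1 for i, (_, freq) in enumerate(column_conservation(alignment)) if freq >= threshold]


# Toy alignment, invented for the example.
toy = ["MKHLA", "MRHLA", "MKH-A", "MEHIA"]
print(conserved_positions(toy))
```

Lowering `threshold` (for example to 0.8) admits columns with a few substitutions, which is usually more realistic for diverse families.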
  • Slide 43
  • We have a window open on evolutionary diversity (Human vs. Mouse pairwise alignment, as on Slide 6)
  • Slide 44
  • We have a window open on evolutionary diversity
  • Slide 45
  • Example (using homology for protein annotation)
  • Slide 46
  • H. influenzae protein (3M71) 1.20. Chen et al. Nature 467 (2010). New York Consortium on Membrane Protein Structure (NYCOMPS). TUM, January 2013
  • Slide 47
  • Slide 48
  • Thomine and Barbier-Brygoo Nature 467:1058-59 (2010)
  • Slide 49
  • Thomine and Barbier-Brygoo Nature 467:1058-59 (2010)
  • Slide 50
  • Chen et al. Nature 467 (2010)
  • Slide 51
  • Chen et al. Nature 467 (2010)
  • Slide 52
  • Chen et al. Nature 467 (2010)
  • Slide 53
  • Slide 54
  • 1. Open Jalview. 2. File -> Input Alignment -> From File: PF03595_seed.txt
  • Slide 55
  • 1. Colour -> BLOSUM62
  • Slide 56
  • 1. Open Chimera. 2. File -> Open: 3M71.pdb
  • Slide 57
  • Slide 58
  • [Figure; label: out]
  • Slide 59
  • 1. Actions -> Atoms/Bonds -> wire. 2. Actions -> Atoms/Bonds -> show.
  • Slide 60
  • 1. Actions -> Atoms/Bonds -> wire. 2. Actions -> Atoms/Bonds -> show. [Figure; label: out]