
daTAA server

DESCRIPTION

My talk about domain annotation in trimeric autotransporter adhesins.

Page 1: daTAA server

Server daTAA: http://toolkit.tuebingen.mpg.de/dataa

Paweł Szczęsny
MPI for Developmental Biology, Tuebingen, Germany
Institute of Biochemistry and Biophysics PAS, Warsaw, Poland

Page 2: daTAA server

Internal complexity of TAAs

MFWMCFVIFFIGEFIMKKLSVTSKRQYNLYASPISRRLSLLMKLSLETVTVMFLLGASPVLA/SNLALTGAKNLSQNSPGVNYSKGSHGSIVLSGDDDFCGADYVLGRGGNSTVRNGIPISVEEEYERFVKQKLMNNATSPYSQSSEQQVWTGDGLTSKGSGYMGGKSTDGDKNILPEAYGIY-------------------------SFATGCGSSAQGNY-------------------------SVAFGANATALTGG-------------------------SQAFGVAALASGRV-------------------------SVAIGVGSEATGEA-------------------------GVSLGGLSKAAGAR-------------------------SVAIGTRANAYGEE-------------------------SIAIGGGLKQGSDNKIGSAVAQGLK-------------------------AISIGSDSVGFQHY-------------------------AVAIGAKSRALLLK-------------------------SVALGSYSVADVDAGVRGYDPVEDEPSKNVSFVWKSSVGAVSVGNRKEGLTRQIIGVAAG---TEDTDAVNVAQLKALR:GMISEK|GGWNLTVNNDNNTVVSSGGALDLSSGSKNLKIAKDGKKNNVTFDVARDLTLKSIKLDGVTLNETGLFIANGPQITASGINAGSQKITGVAEG---TDANDAVNFGQL-----------------------------------------------------------------------------------KKI|ETEVKE-----QVAASGFVKQDSDTK:YLTIGKDTDGDTINIANNKSDKRTLMGIKEGDISKDSSEAITGSQLFTTNQNVKTVSDNLQTAATNIAKTFGGDAKYE-DGEWTAPTFKVKTVTGEGKE-EEKTYQNVADALAGVGSSITNVQ-------NKVTEQVNNAIT--KVEGDALLWSDEANAFVARHEKSKLEKGASKATQENSKITYLLDGDVSKDSTDAITGKQLYSLGD--------------KIASYLGGNAKYE-NGEWTAPTFKVKTVKEDGKE-EEQTYHNVAAAFEGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKDDK-NGSINYASVTLGKGKDSAAVTLHNVAAGNIAKDSHDAINGSQIYSLNE--------------QLATYFGGGAGYNKEGKWTAPTFTVKTVKEDGEE-EEKTYQNVAEALTGVGTSFTNIK-------SEITKQIANEIS--NVTGDSLVKKDLDTNLITIGKEVAGTEINIASVSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------------------------DKGLKHLSDSLQSEDSAVVHYDKKTDETGGINYTSVTLG-GKDKTPVALHNVADGSISKDSHDAINGGQIHTIGE--------------DVAKFLGGAASFN-NGAFTGPTYKLSNIDAKGDV-QQSEFKDIGSAFAGLDTNIKNVNNNVTNKFNELTQNITNVTQ--QVKGDALLWSDEANAFVARHEKSKLGKGASKATQENSKITYLLDGDVSKDSTDAITGKQLYSLGD--------------KIASYLGGNAKYE-DGEWTAPTFKVKTVKEDGKE-EEKTYQNVAEALTGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKNKDETGGINYASVTLGKGKDSAAVTLHNVADGSISKDSRDAINGSQIYSLNE--------------QLATYFGGGAKYE-NGQWTAPIFKVKTVKEDGEE-EEKTYQNVAEALTGVGTSFTNIK-------SEITKQIANEIS--SVTGDSLVKKDLATNLITIGKEVAGTEINIAS
VSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPTFKVKTVNGEGKE-EEQTYQNVAEALTGVGASFMNVQNKIT---NEITNQVNNAIT--KVEGDSLVKQDNLG-IITLGKERGGLKVDFANRDGLDRTLSGVKEA---VNDNEAVNKGQL---------------------------------------------------------------------DADISKVNNNVTNKFNELTQNITNVTQ--QVKGDALLWSDEANAFVARHEKSKLEKGVSKATQENSKITYLLDGDISKGSTDAVTGGQLYSLNE--------------QLATYFGGDAKYE-NGQWTAPTFKVKTVNGEGKE-EEQTYHNVAAAFEGVGTSFTNIK-------SEITKQINNEIS--NVKGDSLVKKDLATNLITIGKEVAGTEINIASVSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPTFKVKTVNGDGKE-EEQTYQNVAEALTGVGTSFTNVQNKIT---NEITNQVNNAIT--KVEGDSLVKQDNLG-IITLGKERGGLKVDFANRDGLDRTLSGVKEA---VNDNEAVNKGQL---------------------------------------------------------------------DANISKVNNNVTNKFNELTQNITNVTQ--QVQGDTLLWSDEANAFVARHEKSKLEKGVSKATQENSKITYLLDGDISKGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGEWTAPTFKVKTVNGEGKE-EEQTYHNVAAAFEGVGTSFTNIK-------SEITKQIDNEII--NVKGDSLVKRDLATNLITIGKEIEGSAINIANKSGEARTISGVKEA---VNNNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPSFKVKTVKEDGKE-EEQTYQNVAEALTGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKNKDETGTINYASVTLGKGKDSAAVTLHNVADGSISKDSRDAINGGQIHTIGE--------------DVAKFLGGDAAFK-DGAFTGPTYKLSNIDAKGDV-QQSEFKDIGSAFAGLDTNIKNVNNNVTNKFNELTQSITNVTQ--QVKGDSLLWSDEANAFVARHEKSKLEKGASKAIQENSKITYLLDGNVSKGSTDAVTGGQLYSMSN--------------MLATYLGGNAKYE-NGEWTAPTFKVKTVNGEGKE-EEQTYQNVAEALTGVGTSFTNIK-------SEIAKQINHL----QSDDSAVIHYDKNKDETGTINYASVTLGKGEDSAAVALHNVAAGNIAKDSRDAINGSQLYSLNE--------------QLLTYFGGNAGYK-DGQWIAPKFQVSQFKSDGSSGEKESYDNVAAAFEGVNKSLAGM--------NERINNVVTAGQ-
-NVSSNSLNWNETEGGYDARHNGVDSKLTHVENGDVSEKSKEAVNGSQLWNTNEKVEAVEKDVKNIEKKVQDIATVADSAVKYEKDSTGKKTNVIKLVGGSESDPVLIDNVADGDIKEGSKQAVNGGQLRDYTEKQMKIVLEDAKKYTDERFNDVVNNGVNEAKAYTDMKFEALSYAVEDVRKEARQAQLLVWRYLTYVTMIYRDL AAIGLAVSNLRYYDIPGSLSLSFGTGIWRSQSAFAVGAGYTSEDGNIRSNLSITNAGGHWGVGAGITLRLK

Page 3: daTAA server

Automated vs manual annotation

Coverage of annotation

Domain type           PFAM   manually
Present in PFAM       28%    35%
Not present in PFAM   -      18%
Coiled coils          -      3%
Total                 28%    56%

Present in PFAM       26%    31%
Not present in PFAM   -      36%
Coiled coils          -      25%
Total                 26%    92%
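One way to read "coverage of annotation" is as the share of residues that fall inside at least one annotated domain. The slides do not define the measure, so this reading is an assumption, and the function name and the example intervals below are hypothetical. A minimal sketch:

```python
def annotation_coverage(seq_len, domains):
    """Return the percentage of residues covered by at least one domain.

    domains is an iterable of (start, end) pairs, 1-based and inclusive.
    Overlapping or adjacent intervals are merged so no residue counts twice.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")

    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(domains):
        # Clip each interval to the sequence and skip anything that falls outside it.
        start, end = max(start, 1), min(end, seq_len)
        if start > end:
            continue
        if cur_end is None:
            cur_start, cur_end = start, end
        elif start <= cur_end + 1:
            # Overlapping or directly adjacent: extend the current merged interval.
            cur_end = max(cur_end, end)
        else:
            covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
    if cur_end is not None:
        covered += cur_end - cur_start + 1

    return 100.0 * covered / seq_len


# Hypothetical example: three domain hits on a 100-residue sequence.
example = annotation_coverage(100, [(1, 20), (15, 30), (61, 80)])
```

Merging before counting matters here because hits from different sources (a PFAM match and a coiled-coil prediction, for instance) may overlap on the same residues.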

Page 4: daTAA server

Automated vs manual annotation

Coverage of annotation

Domain type           PFAM   daTAA   manually
Present in PFAM       28%    32%     35%
Not present in PFAM   -      13%     18%
Coiled coils          -      5%      3%
Total                 28%    50%     56%

Present in PFAM       26%    28%     31%
Not present in PFAM   -      27%     36%
Coiled coils          -      11%     25%
Total                 26%    66%     92%
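In the figures above, each Total is the sum of the three domain categories for that method. The short check below transcribes the slide's numbers, confirms that arithmetic, and reports how far each automated method falls short of manual annotation. The names BLOCK_1 and BLOCK_2 are placeholders, since the transcript does not say what distinguishes the two sets of rows.

```python
# Per method: (present in PFAM, not present in PFAM, coiled coils, stated total),
# all in percent. None stands for the "-" entries on the slide.
BLOCK_1 = {
    "PFAM": (28, None, None, 28),
    "daTAA": (32, 13, 5, 50),
    "manually": (35, 18, 3, 56),
}
BLOCK_2 = {
    "PFAM": (26, None, None, 26),
    "daTAA": (28, 27, 11, 66),
    "manually": (31, 36, 25, 92),
}


def summed_total(row):
    """Add up the three category values, treating '-' (None) as zero."""
    return sum(value for value in row[:3] if value is not None)


def totals_consistent(block):
    """True if every stated total equals the sum of its categories."""
    return all(summed_total(row) == row[3] for row in block.values())


def gap_to_manual(block, method):
    """Percentage points by which a method's total trails manual annotation."""
    return block["manually"][3] - block[method][3]


for name, block in (("block 1", BLOCK_1), ("block 2", BLOCK_2)):
    print(name, "totals consistent:", totals_consistent(block))
    for method in ("PFAM", "daTAA"):
        print(f"  {method}: {gap_to_manual(block, method)} points below manual")
```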

Page 5: daTAA server

Prediction of individual repeats in YadA

|----------Hep_Hag---------|---------Hep_Hag- |---Ylhead---|---Ylhead----|-----
ASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALG

----------| |---------Hep_Hag-------Ylhead--|---Ylhead---|---Ylhead---|----Ylhead--
DSAVTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIG

---| |----------Hep_Hag---------|-|----Ylhead-----|----Ylhead---|
HSSHVAANHGYSIAIGDRSKTDRENSVSIGHESL

Page 11: daTAA server

Key points

Approach of human annotator implemented in a computer system

Improvement in coverage and accuracy over general annotation servers

Unique workflow with knowledge-based rules

Visual helpers for interpretation of the results

Page 12: daTAA server

Acknowledgements

MPI for Developmental Biology: Andrei Lupas, Dirk Linke, Toolkit development team

Institute of Biochemistry and Biophysics PAS: Piotr Zielenkiewicz, Marcin Grynberg