daTAA server


My talk about domain annotation in trimeric autotransporter adhesins (TAAs).


Server daTAA: http://toolkit.tuebingen.mpg.de/dataa

Paweł Szczęsny

MPI for Developmental Biology, Tuebingen, Germany

Institute of Biochemistry and Biophysics PAS, Warsaw, Poland

Internal complexity of TAAs

MFWMCFVIFFIGEFIMKKLSVTSKRQYNLYASPISRRLSLLMKLSLETVTVMFLLGASPVLA/SNLALTGAKNLSQNSPGVNYSKGSHGSIVLSGDDDFCGADYVLGRGGNSTVRNGIPISVEEEYERFVKQKLMNNATSPYSQSSEQQVWTGDGLTSKGSGYMGGKSTDGDKNILPEAYGIY-------------------------SFATGCGSSAQGNY-------------------------SVAFGANATALTGG-------------------------SQAFGVAALASGRV-------------------------SVAIGVGSEATGEA-------------------------GVSLGGLSKAAGAR-------------------------SVAIGTRANAYGEE-------------------------SIAIGGGLKQGSDNKIGSAVAQGLK-------------------------AISIGSDSVGFQHY-------------------------AVAIGAKSRALLLK-------------------------SVALGSYSVADVDAGVRGYDPVEDEPSKNVSFVWKSSVGAVSVGNRKEGLTRQIIGVAAG---TEDTDAVNVAQLKALR:GMISEK|GGWNLTVNNDNNTVVSSGGALDLSSGSKNLKIAKDGKKNNVTFDVARDLTLKSIKLDGVTLNETGLFIANGPQITASGINAGSQKITGVAEG---TDANDAVNFGQL-----------------------------------------------------------------------------------KKI|ETEVKE-----QVAASGFVKQDSDTK:YLTIGKDTDGDTINIANNKSDKRTLMGIKEGDISKDSSEAITGSQLFTTNQNVKTVSDNLQTAATNIAKTFGGDAKYE-DGEWTAPTFKVKTVTGEGKE-EEKTYQNVADALAGVGSSITNVQ-------NKVTEQVNNAIT--KVEGDALLWSDEANAFVARHEKSKLEKGASKATQENSKITYLLDGDVSKDSTDAITGKQLYSLGD--------------KIASYLGGNAKYE-NGEWTAPTFKVKTVKEDGKE-EEQTYHNVAAAFEGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKDDK-NGSINYASVTLGKGKDSAAVTLHNVAAGNIAKDSHDAINGSQIYSLNE--------------QLATYFGGGAGYNKEGKWTAPTFTVKTVKEDGEE-EEKTYQNVAEALTGVGTSFTNIK-------SEITKQIANEIS--NVTGDSLVKKDLDTNLITIGKEVAGTEINIASVSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------------------------DKGLKHLSDSLQSEDSAVVHYDKKTDETGGINYTSVTLG-GKDKTPVALHNVADGSISKDSHDAINGGQIHTIGE--------------DVAKFLGGAASFN-NGAFTGPTYKLSNIDAKGDV-QQSEFKDIGSAFAGLDTNIKNVNNNVTNKFNELTQNITNVTQ--QVKGDALLWSDEANAFVARHEKSKLGKGASKATQENSKITYLLDGDVSKDSTDAITGKQLYSLGD--------------KIASYLGGNAKYE-DGEWTAPTFKVKTVKEDGKE-EEKTYQNVAEALTGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKNKDETGGINYASVTLGKGKDSAAVTLHNVADGSISKDSRDAINGSQIYSLNE--------------QLATYFGGGAKYE-NGQWTAPIFKVKTVKEDGEE-EEKTYQNVAEALTGVGTSFTNIK-------SEITKQIANEIS--SVTGDSLVKKDLATNLITIGKEVAGTEINIAS
VSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPTFKVKTVNGEGKE-EEQTYQNVAEALTGVGASFMNVQNKIT---NEITNQVNNAIT--KVEGDSLVKQDNLG-IITLGKERGGLKVDFANRDGLDRTLSGVKEA---VNDNEAVNKGQL---------------------------------------------------------------------DADISKVNNNVTNKFNELTQNITNVTQ--QVKGDALLWSDEANAFVARHEKSKLEKGVSKATQENSKITYLLDGDISKGSTDAVTGGQLYSLNE--------------QLATYFGGDAKYE-NGQWTAPTFKVKTVNGEGKE-EEQTYHNVAAAFEGVGTSFTNIK-------SEITKQINNEIS--NVKGDSLVKKDLATNLITIGKEVAGTEINIASVSKADRTLSGVKEA---VKDNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPTFKVKTVNGDGKE-EEQTYQNVAEALTGVGTSFTNVQNKIT---NEITNQVNNAIT--KVEGDSLVKQDNLG-IITLGKERGGLKVDFANRDGLDRTLSGVKEA---VNDNEAVNKGQL---------------------------------------------------------------------DANISKVNNNVTNKFNELTQNITNVTQ--QVQGDTLLWSDEANAFVARHEKSKLEKGVSKATQENSKITYLLDGDISKGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGEWTAPTFKVKTVNGEGKE-EEQTYHNVAAAFEGVGTSFTNIK-------SEITKQIDNEII--NVKGDSLVKRDLATNLITIGKEIEGSAINIANKSGEARTISGVKEA---VNNNEAVNKGQL---------------------------------------------------------------------DTNIKKVE-------DKLTEAVGKVTQ--QVKGDALLWSNEDNAFVADHGKDSAKTKSKITHLLDGNIASGSTDAVTGGQLYSLNE--------------QLATYFGGGAKYE-NGQWTAPSFKVKTVKEDGKE-EEQTYQNVAEALTGVGTSFTNVK-------NEITKQINHL----QSDDSAVVHYDKNKDETGTINYASVTLGKGKDSAAVTLHNVADGSISKDSRDAINGGQIHTIGE--------------DVAKFLGGDAAFK-DGAFTGPTYKLSNIDAKGDV-QQSEFKDIGSAFAGLDTNIKNVNNNVTNKFNELTQSITNVTQ--QVKGDSLLWSDEANAFVARHEKSKLEKGASKAIQENSKITYLLDGNVSKGSTDAVTGGQLYSMSN--------------MLATYLGGNAKYE-NGEWTAPTFKVKTVNGEGKE-EEQTYQNVAEALTGVGTSFTNIK-------SEIAKQINHL----QSDDSAVIHYDKNKDETGTINYASVTLGKGEDSAAVALHNVAAGNIAKDSRDAINGSQLYSLNE--------------QLLTYFGGNAGYK-DGQWIAPKFQVSQFKSDGSSGEKESYDNVAAAFEGVNKSLAGM--------NERINNVVTAGQ-
-NVSSNSLNWNETEGGYDARHNGVDSKLTHVENGDVSEKSKEAVNGSQLWNTNEKVEAVEKDVKNIEKKVQDIATVADSAVKYEKDSTGKKTNVIKLVGGSESDPVLIDNVADGDIKEGSKQAVNGGQLRDYTEKQMKIVLEDAKKYTDERFNDVVNNGVNEAKAYTDMKFEALSYAVEDVRKEARQAQLLVWRYLTYVTMIYRDL AAIGLAVSNLRYYDIPGSLSLSFGTGIWRSQSAFAVGAGYTSEDGNIRSNLSITNAGGHWGVGAGITLRLK

Automated vs manual annotation

Domain type            PFAM   Manually
Present in PFAM        28%    35%
Not present in PFAM    -      18%
Coiled coils           -      3%
Total                  28%    56%

Present in PFAM        26%    31%
Not present in PFAM    -      36%
Coiled coils           -      25%
Total                  26%    92%

Coverage of annotation

Automated vs manual annotation

Domain type            PFAM   daTAA   Manually
Present in PFAM        28%    32%     35%
Not present in PFAM    -      13%     18%
Coiled coils           -      5%      3%
Total                  28%    50%     56%

Present in PFAM        26%    28%     31%
Not present in PFAM    -      27%     36%
Coiled coils           -      11%     25%
Total                  26%    66%     92%

Coverage of annotation
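
The totals in the three-column table are the sums of the per-category rows, which can be checked with a short Python sketch. The figures are copied from the table. The labels `group_1` and `group_2` are placeholders, because the slides do not name the two row groups, and a "-" entry is treated as 0% here by assumption.

```python
# Row order used for every tuple below.
ROWS = ("Present in PFAM", "Not present in PFAM", "Coiled coils")

# Coverage in percent, copied from the table. "-" entries are stored as 0
# (an assumption). Group names are placeholders: the slides leave the two
# row groups unlabelled.
TABLE = {
    "group_1": {"PFAM": (28, 0, 0), "daTAA": (32, 13, 5), "manual": (35, 18, 3)},
    "group_2": {"PFAM": (26, 0, 0), "daTAA": (28, 27, 11), "manual": (31, 36, 25)},
}

# "Total" rows as printed in the table.
REPORTED_TOTALS = {
    "group_1": {"PFAM": 28, "daTAA": 50, "manual": 56},
    "group_2": {"PFAM": 26, "daTAA": 66, "manual": 92},
}


def column_totals(table):
    """Sum the per-category coverage for each method in each group."""
    return {
        group: {method: sum(values) for method, values in methods.items()}
        for group, methods in table.items()
    }


def gain_over_pfam(totals):
    """Percentage points of coverage gained relative to the PFAM column."""
    return {
        group: {
            method: total - cols["PFAM"]
            for method, total in cols.items()
            if method != "PFAM"
        }
        for group, cols in totals.items()
    }


totals = column_totals(TABLE)
gains = gain_over_pfam(totals)

for group, cols in totals.items():
    for method, total in cols.items():
        print(f"{group} {method}: {total}%")
```

Read this way, daTAA gains 22 and 40 percentage points over PFAM in the two groups, against 28 and 66 for manual annotation.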

Prediction of individual repeats in YadA

|----------Hep_Hag---------|---------Hep_Hag-
|---Ylhead---|---Ylhead----|-----
ASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALG

----------| |---------Hep_Hag----
---Ylhead--|---Ylhead---|---Ylhead---|----Ylhead--
DSAVTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIG

---| |----------Hep_Hag---------|-
|----Ylhead-----|----Ylhead---|
HSSHVAANHGYSIAIGDRSKTDRENSVSIGHESL
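
In the YadA example the Hep_Hag and Ylhead tracks annotate the same stretch of sequence. If coverage is counted per residue, overlapping annotations need to be merged so that no residue is counted twice. The Python sketch below shows one way to do that. It assumes coverage means the fraction of residues with at least one annotation, which the slides do not state, and the function name and the example intervals are hypothetical, not taken from daTAA.

```python
def covered_fraction(intervals, seq_len):
    """Fraction of residues covered by at least one annotation.

    intervals: iterable of (start, end) pairs, 0-based, end exclusive.
    seq_len:   length of the annotated sequence.
    Overlapping or adjacent intervals are merged before counting.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")

    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clamp to the sequence and skip empty intervals.
        start, end = max(start, 0), min(end, seq_len)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged run.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps or touches the current run: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / seq_len


# Hypothetical annotations on a 100-residue sequence (illustrative only):
# two overlapping domains and one separate domain.
example = [(0, 28), (20, 40), (60, 75)]
fraction = covered_fraction(example, 100)
print(f"{fraction:.0%} of residues annotated")
```

Summing the interval lengths directly would give 63 residues for this example. Merging the first two intervals gives the 55 that are actually covered.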

Key points

Approach of human annotator implemented in a computer system

Improvement in coverage and accuracy over general annotation servers

Unique workflow with knowledge-based rules

Visual helpers for interpretation of the results

Acknowledgements

MPI for Developmental Biology: Andrei Lupas, Dirk Linke, Toolkit development team

Institute of Biochemistry and Biophysics PAS: Piotr Zielenkiewicz, Marcin Grynberg
