EVOLUTIONARY HISTORY OF PHAGES WITH dsDNA GENOMES
Arcady Mushegian, Stowers Institute, Kansas City, USA
Galina Glazko, University of Rochester, USA
Vladimir Makarenkov, Université du Québec à Montréal, Canada
Jing Liu, U of Kansas - Utah Law
Image: A. Merkov © 2007, http://wsbs-msu.ru/foto
ICTV CLASSIFICATION: VIRION SHAPE PLUS A FEW MOLECULAR CHARACTERS
Siphoviridae: lambda
Myoviridae: T4
Podoviridae: P60
Tectiviridae: Bam35c & PRD1
Lipothrixviridae: SIFV
Fuselloviridae: SSV1
Rudiviridae: SIRV2
Corticoviridae: PM2
Plasmaviridae: L2
EVOLUTIONARY HISTORY OF PHAGES : A GENOME-BASED APPROACH ?
WHY ICTV APPROACH IS NOT ENOUGH :
- LOW RESOLUTION OF STRUCTURAL TRAITS
- STRUCTURE CONVERGENCE ?
- NO WAY TO ACCOUNT FOR HGT
CAN WE DO BETTER WITH GENOMES ?
- ARE THERE ENOUGH MOLECULAR TRAITS ?
- WILL HGT SWAMP EVERYTHING ?
RECOMBINATION DOES NOT HAVE TO BE HOMOLOGOUS
Modified from J.G.Lawrence et al., J.Bacteriol. 2002
PHYLOGENY BASED ON GENE CONTENT
THE CLOSER TWO GENOMES ARE, THE MORE GENES THEY HAVE IN COMMON
- ALL CLASSES OF METHODS CAN BE APPLIED
- NO NEED FOR OMNIPRESENT GENES
- HOW TO MEASURE AND NORMALIZE ?
USED WITH BACTERIA (Koonin, Bork) AND PHAGES (Rohwer and Edwards, J. Bacteriol. 2002)
- HGT WAS STILL UNACCOUNTED FOR
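As a minimal sketch of the idea on this slide: if each genome is stored as the set of gene families it carries, the number of shared genes for a pair of genomes is the size of the intersection. The genome names and family identifiers below are made up for illustration and do not come from the talk.

```python
from itertools import combinations

# Hypothetical toy data: each genome is the set of gene families it carries.
gene_content = {
    "phageA": {"fam1", "fam2", "fam3", "fam4"},
    "phageB": {"fam1", "fam2", "fam3", "fam5"},
    "phageC": {"fam4", "fam6"},
}


def shared_gene_counts(content):
    """Return {(g1, g2): number of gene families present in both genomes}."""
    return {
        (g1, g2): len(content[g1] & content[g2])
        for g1, g2 in combinations(sorted(content), 2)
    }


shared = shared_gene_counts(gene_content)
for pair, k in shared.items():
    print(pair, k)
```

The raw counts still depend on genome size, which is why the normalization question on this slide matters.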
“RETICULATE EVOLUTION” PROPOSAL FOR PHAGE TAXONOMY (Lawrence et al., 2002)
kingdom: Bacteriophages
domains: dsDNA, ssDNA, dsRNA, ssRNA
divisions: tailed, filamentous, icosahedral, …
modi: modus I, modus II, modus III, modus IV, …
“ Phage SfV might belong to the domain of dsDNA viruses, the division of tailed bacteriophages, but to at least 3 modi : (i) Phages with HK97-like head proteins and maturation processes, (ii) Phages with Mu-like contractile tails, and (iii) Integrase-mediated temperate phages. ”
BUT ….
… HOW DO WE DERIVE THE MODI ?
… WHAT IS THE EVOLUTIONARY SCENARIO ?
( esp. given the claim of “a new view of viral evolution, classification, and taxonomy” ?? )
ON THE BRIGHT SIDE :
THE IMPORTANCE OF HGT
A POST-MODERN DISTRACTION: A TREE ( OF LIFE ) OR NOT A TREE?
THE GOAL OF EVOLUTIONARY RECONSTRUCTION IS NOT “TO BUILD A TREE”, BUT TO LEARN WHAT HAPPENED
OR ?
W.F.Doolittle, Science 1999 © AAAS
PHAGE ORTHOLOGOUS GROUPS ( POGs ) - FOLLOWING THE NCBI COG FRAMEWORK
Tatusov et al., Science 1997; NCBI 1997-2007
VECTORS OF GENE CONTENT IN PHAGES
THE MAIN OBSERVATIONS
- FAT-TAILED DISTRIBUTION, MANY ZEROS
- HIGH POG CONTENT IN GENOMES
- av. 52 % (mostly 30 - 70 %)
- av. 42 % even for the unclassified phages
- “PHAGENESS QUOTIENT”
$PQ_i = \log \frac{f_i^{Vp}}{f_i^{Mp}}$, where $f_i^{Vp} = \frac{n_i^{Vp}}{V}$ and $f_i^{Mp} = \frac{n_i^{Mp}}{M}$,
i.e., “is this POG more likely to be drawn from a phage or from a cellular organism?”
WHY HIGH PQ IS IMPORTANT
PQmax = ∞ (phage-specific POGs)
82% OF POGs HAVE PQ = ∞
+ 8% - HIGH PQ, FORM A CLADE
- MOST OF THE POGS TAKE NO PART IN HOST-VIRUS GENE TRANSFER
- WE CAN RESTATE THE TASK AS DETECTION OF PHAGE-PHAGE HGT
$PQ_i = \log \frac{f_i^{Vp}}{f_i^{Mp}}$
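A small sketch of how the phageness quotient could be computed, reading the slide formula as PQ_i = log(f_i^Vp / f_i^Mp) with f_i^Vp = n_i^Vp / V and f_i^Mp = n_i^Mp / M. The slides do not define the counts, so the meaning given to them in the docstring, the use of the natural logarithm, and the function name are all assumptions.

```python
import math


def phageness_quotient(n_vp, V, n_mp, M):
    """PQ = log(f_vp / f_mp), with f_vp = n_vp / V and f_mp = n_mp / M.

    Assumed meaning (not spelled out on the slides): n_vp and n_mp count the
    occurrences of one POG among V phage and M cellular sequences.
    """
    if n_vp == 0 and n_mp == 0:
        raise ValueError("POG not observed in either collection")
    f_vp = n_vp / V
    f_mp = n_mp / M
    if f_mp == 0:
        # Never seen outside phages: a phage-specific POG, PQ is infinite.
        return math.inf
    if f_vp == 0:
        return -math.inf
    return math.log(f_vp / f_mp)  # natural log is an assumption


pq_specific = phageness_quotient(20, 1000, 0, 50000)
pq_mixed = phageness_quotient(20, 1000, 10, 50000)
print(pq_specific, pq_mixed)
```

Returning infinity for the phage-specific case mirrors the "PQmax = ∞" line above; a real pipeline might prefer pseudocounts, which the slides do not mention.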
TREES FROM GENE CONTENT, NO HGT YET
- PRETEND THAT DESPITE HGT, EVOLUTION OF PHAGES IS TREE-LIKE
- BUILD A CONVENTIONAL TREE, VERIFY THAT THE SIGNAL IS TRUE, THEN INFER HGT
- DISTANCE METHODS REQUIRE PROPER DISTANCE MEASURE ( SEPARATE STORY …)
[Figure: gene-content trees, NJ and MrBayes panels. Legend: Siphoviridae, Podoviridae, Myoviridae, Fuselloviridae, Tectiviridae, Unclassified]
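The slide names NJ (neighbor joining) as one of the two tree-building methods. The following is a textbook-style sketch of neighbor joining on a distance matrix, not the software used for the talk; it returns only the topology (nested tuples), and the four-taxon distances are invented.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Textbook neighbor joining; returns the topology as nested tuples.

    names: list of taxon labels.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Branch lengths are not computed in this sketch.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        total = {
            a: sum(d[frozenset((a, b))] for b in nodes if b != a)
            for a in nodes
        }

        def q(pair):
            # NJ selection criterion for joining the two nodes in `pair`.
            a, b = pair
            return (n - 2) * d[frozenset(pair)] - total[a] - total[b]

        a, b = min(combinations(nodes, 2), key=q)
        new = (a, b)
        rest = [c for c in nodes if c != a and c != b]
        # Distance from the new internal node to every remaining node.
        for c in rest:
            d[frozenset((new, c))] = 0.5 * (
                d[frozenset((a, c))] + d[frozenset((b, c))] - d[frozenset((a, b))]
            )
        nodes = rest + [new]
    return tuple(nodes)


# Hypothetical distances consistent with the tree ((A,B),(C,D)).
pairs = {
    ("A", "B"): 2, ("C", "D"): 2,
    ("A", "C"): 3, ("A", "D"): 3, ("B", "C"): 3, ("B", "D"): 3,
}
dist = {frozenset(p): v for p, v in pairs.items()}
tree = neighbor_joining(["A", "B", "C", "D"], dist)
print(tree)
```

Any of the normalized gene-content distances could be supplied as `dist`; the choice of distance is the "separate story" mentioned above.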
TREES FROM GENE CONTENT
[Figure legend: Siphoviridae, Podoviridae, Myoviridae, Fuselloviridae, Tectiviridae, Unclassified]
TREES FROM GENE CONTENT - CONCLUSIONS
- VERTICAL SIGNAL IS CONSIDERABLE
  - RESAMPLING, SIMULATIONS, COMPATIBILITY
- 18 WELL-SUPPORTED GROUPS
- 71 % OF ALL PHAGES
- SOME GROUPS INCLUDE PHAGES WITH DIFFERENT MORPHOLOGY
- THE LARGEST 3 ICTV MORPHOTYPES DO NOT RESOLVE AS MONOPHYLETIC
GROUPS WITH SIMILAR MORPHOLOGY
group 2: staphylococci phages
group 3: Sfi21-like siphoviruses
group 7: fuselloviruses
group 11: […]-like siphoviruses
group 14: T4-like myoviruses
group 18: P2-like myoviruses
and groups 4, 5, 6, 8, 10
[Legend: Siphoviridae, Myoviridae, Fuselloviridae, Unclassified]
GROUPS WITH DISSIMILAR MORPHOLOGY
group 9: PZA-like podoviruses
group 13: mycobacteriophages
group 15: T7-like podoviruses
group 16
and groups 1, 12, 17
[Legend: Siphoviridae, Podoviridae, Myoviridae, Tectiviridae, Unclassified]
HOW TO INFER RETICULATIONS ?
IN PRACTICE : T-REX (MAKARENKOV, 1999-2007) - SPR OF TGC , MINIMIZE DISTANCE TO TSF ,
WHILE MAINTAINING SUB-TREE TOPOLOGIES
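The search described above needs a way to score how far one tree is from another. The slide does not say which tree distance is minimized; as one common choice, the sketch below computes a Robinson-Foulds-style distance for rooted trees written as nested tuples. The tree encoding, function names, and example trees are illustrative assumptions, not the T-REX interface.

```python
def clades(tree):
    """Return the set of leaf sets (clades) of a rooted tree of nested tuples."""
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def rf_distance(t1, t2):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(t1) ^ clades(t2))


# Two hypothetical five-leaf trees that differ in the placement of leaf B.
t_a = ((("A", "B"), "C"), ("D", "E"))
t_b = ((("A", "C"), "B"), ("D", "E"))

print(rf_distance(t_a, t_a))
print(rf_distance(t_a, t_b))
```

Here `t_b` is obtained from `t_a` by a single SPR move (pruning leaf B and regrafting it above the A, C pair), so a search over SPR moves would accept that move if it brought the tree closer to the target.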
HGT EVENTS IN NUMBERS
• 294 HGT EVENTS, 90 % WITHIN GROUPS
• 114 (of 158) PHAGES ARE INVOLVED
• GENES FROM 229 POGs HAVE BEEN TRANSFERRED
• REMOVE THESE POGs: GROUPS STAY TOGETHER, SOME LOSE 1-2 MEMBERS
FREQUENCY DISTRIBUTIONS OF ALL HGT EVENTS ARE FAT-TAILED
MOST PHAGES RARELY EXCHANGE GENES
MOST POGs ARE NEVER TRANSFERRED
THE HGT DEBATE ( NOT ONLY IN PHAGES ) IS ABOUT THE OPPOSITE TAILS OF THE SAME DISTRIBUTION
WHAT NEXT
- MORE PHAGES ( ~500 )
  - MORE TRAITS, EVEN FOR CURRENT COLLECTION
  - HIGHER DENSITY OF TAXON SPACE
- TRAITS OTHER THAN ORFs
  - COS, PAC SITES?
- BETTER METHODS
- ARE ICTV FAMILIES MONOPHYLETIC ?
- RELATIONSHIP WITH HERPESVIRUSES ?
- ON TO BACTERIAL EVOLUTION ?
http://www.stowers-institute.org/ScientistsSought/TrainingPrograms.asp
52% of the analyzed phage proteins are clustered into 981 Phage Orthologous Groups (POGs)
ICTV family        Number of genomes   Number of proteins   Number of proteins in POGs   POG coverage
Myoviridae         28                  3538                 1526                         43%
Siphoviridae       80                  5815                 3431                         59%
Podoviridae        31                  1520                 879                          58%
Tectiviridae       2                   55                   10                           18%
Corticoviridae     1                   22                   3                            14%
Plasmaviridae      1                   14                   2                            14%
Fuselloviridae     4                   134                  79                           59%
Lipothrixviridae   1                   72                   5                            7%
Unclassified       16                  1052                 443                          42%
Total number       164                 12222                6378                         52%
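As a quick arithmetic check, the coverage column can be recomputed as proteins in POGs divided by total proteins, rounded to a whole percent. The numbers are copied from the table; the variable and function names are only illustrative.

```python
# Rows copied from the table: (family, genomes, proteins, proteins in POGs).
rows = [
    ("Myoviridae", 28, 3538, 1526),
    ("Siphoviridae", 80, 5815, 3431),
    ("Podoviridae", 31, 1520, 879),
    ("Tectiviridae", 2, 55, 10),
    ("Corticoviridae", 1, 22, 3),
    ("Plasmaviridae", 1, 14, 2),
    ("Fuselloviridae", 4, 134, 79),
    ("Lipothrixviridae", 1, 72, 5),
    ("Unclassified", 16, 1052, 443),
]


def coverage_percent(in_pogs, proteins):
    """Share of proteins assigned to POGs, as a rounded percentage."""
    return round(100 * in_pogs / proteins)


coverage = {name: coverage_percent(in_pogs, proteins)
            for name, _, proteins, in_pogs in rows}
totals = tuple(sum(row[i] for row in rows) for i in (1, 2, 3))
total_coverage = coverage_percent(totals[2], totals[1])

for name, pct in coverage.items():
    print(f"{name}: {pct}%")
print("Total:", totals, f"{total_coverage}%")
```

The recomputed totals (164 genomes, 12222 proteins, 6378 in POGs) and the 52% overall coverage agree with the last row of the table.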
POGs shared by phage genomes are suitable characters in evolutionary reconstruction
461 POGs belong to 14 functional categories
L, replication, recombination, and repair
K, transcription
F, nucleotide transport and metabolism
X, virion assembly
S, unknown function
X,Y,Z,W,V,U,A – phage specific categories
Phages vary significantly in genome size and content
[Figure labels: 381, 8]
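How to normalize shared gene count I. Correlation between gene-content trees and 16S rRNA-based tree
JC: Jaccard coefficient distance
$d_{JC}(G_1, G_2) = 1 - \frac{k}{N_1 + N_2 - k}$
MB: Maryland bridge distance
$d_{MB}(G_1, G_2) = 1 - \frac{k (N_1 + N_2)}{2 N_1 N_2}$
WA: Weighted average distance
$d_{WA}(G_1, G_2) = 1 - \frac{k \sqrt{N_1^2 + N_2^2}}{\sqrt{2}\, N_1 N_2}$
CORR: Standard correlation distance
$d_{CORR}(G_1, G_2) = 1 - \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sigma_X \sigma_Y}$
Here k is the number of genes shared by genomes G1 and G2, and N1, N2 are the genome sizes. X and Y are the vectors compared for G1 and G2 (presumably their gene-content vectors), with σ_X and σ_Y scaling the sum so that the fraction is the standard correlation of X and Y.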
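The four normalizations can be sketched in a few lines. This assumes the formulas read as JC = 1 - k/(N1 + N2 - k), MB = 1 - k(N1 + N2)/(2 N1 N2), WA = 1 - k sqrt(N1^2 + N2^2)/(sqrt(2) N1 N2) and CORR = 1 - Pearson correlation of the two presence/absence vectors; the WA and CORR forms in particular are reconstructions, and the toy vectors are invented.

```python
import math


def gene_content_distances(x, y):
    """Distances between two genomes given equal-length 0/1 presence vectors.

    Assumes both genomes carry at least one gene and neither vector is
    constant (otherwise the correlation is undefined).
    """
    n1, n2 = sum(x), sum(y)                  # genome sizes N1, N2
    k = sum(a * b for a, b in zip(x, y))     # number of shared genes
    jc = 1 - k / (n1 + n2 - k)
    mb = 1 - k * (n1 + n2) / (2 * n1 * n2)
    wa = 1 - k * math.sqrt(n1 ** 2 + n2 ** 2) / (math.sqrt(2) * n1 * n2)
    # Pearson correlation of the two vectors.
    mean_x, mean_y = n1 / len(x), n2 / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    corr = 1 - cov / (norm_x * norm_y)
    return {"JC": jc, "MB": mb, "WA": wa, "CORR": corr}


# Hypothetical genomes over six gene families: N1 = N2 = 3, k = 2.
g1 = [1, 1, 1, 0, 0, 0]
g2 = [1, 1, 0, 1, 0, 0]
result = gene_content_distances(g1, g2)
print(result)
```

Unlike the other three, the correlation distance also depends on the families absent from both genomes, so it changes with the total number of families considered.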
How to normalize shared gene count II. The effect of differences in genome sizes
[Figure: two panels plotting distance values (JC, MB, WA, CORR) against genome size N2, with the size of Genome 1 fixed at N1 = 1000. One panel: number of shared genes k = 100, N2 from 100 to 1000. Other panel: k = 500, N2 from 500 to 1000.]
JC: Jaccard coefficient distance; MB: Maryland bridge distance; WA: Weighted average distance; CORR: Standard correlation distance
Possible evolutionary links between viruses from different host domains
Hendrix 1999 Curr. Biol.
Shared colors indicate proposed evolutionary connections between relevant viruses
Herpesvirus proteases (family S21)
1 1 2 3 4 2 3 4 5 6 7 5
HSV-2 (19)PIYVAGFLALY(6) ELAL-DPDTVRAAL(5) LPINVDHR(3) EVGRVL A-VVND(2)GPFFVGLI(3)QLERVL(20)RLLYLITNYLPSVSL-ST(15)----------FA--HVALCAI(13)LDAAIAP(73)
VZV (1)ALYVAGYLALY(5) ELNI-TPEIVRSAL(5) IPINIDHR(3) VVGEVI A-IIED(2)GPFFLGIV(3)QLHAVL(20)RALYLVTNYLPSVSL-SS(5) ----------FT--HVALCVV(13)PESSIEP(66)
HCMC (12)PVYVGGFLARY(7) ELLL-PRDVVEHWL(13)LPLNINHD(3) VVGHVA A-MQSV(2)GLFCLGCV(3)RFLEIV(21)KVVEFLSGSYAGLSL-SS(27)----------FK--HVALCSV(13)PEWVTQR(73)
EBV (5)SVYVCGFVERP(7) CLHL-DPLTVKSQL(5) LPLTVEHL(3) PVGSVF G-LYQS(2)GLFSAASI(3)DFLSLL(20)PKVEALHAWLPSLSL-AS(19)----------FD--HVSICAL(13)LAWVLKH(70)
KSHV (3)GLYVGGFVDVV(7) ELYL-DPDQVTDYL(5) LPITIEHL(3) EVGWTL G-LFQV(2)GIFCTGAI(3)AFLELA(20)PLLEILHTWLPGLSL-SS(16)----------FQ--HVSLCAL(13)AEWVVSR(70)
Bacteriophage prohead proteases (family U35)
1) Prophages
CP-933C (20)SNTLTGYVVRW(12)EKFQ--RGAFTEWL(5) VRGLYEHD(3) LLGRTR(3)LKLEED(2)GLRFELTP(1)DTSTGR(1) VIELVKRGDISGMSF-GF(16)TVLVA-E---LY--EITVTSV(3) PDSGVEL(28)
LambdaBa04 (25)NRTLIGYAVKW(14)EQFK--NGAFTETL(4) QRFLWSHD(3) VLGRTK(3)LRLNED(2)GLRFELDL(1)DTTLGN(1) TYKSIKRGDVDGVSF-GF(17)TVTKA-K---LL--EVSAVAF(3) PDSEVSA(27)
Lp3 (35)GKTISGYAIVW(11)EVVT--PKALDGVD(3) VLMLNNHD(3) VLASVK(3)LTLETD(2)GLHFTAQL(1)NTSFAN(1) VYEEVQSGNVDSCSF-GF(19)TINQVKS---LF--DVSVVAV(3) DDTNVQV(322)
2) Siphoviridae
HK97 (22)QGIFEGYASVF(7) DIIL--PGAFKNAL(6) VAMFFNHK(4) PVGKWD S-LAED(2)GLYVRGQL(2)GHSGAA(1) LKAAMQHGTVEGMSV-GF(14)IFKNIQA---LR--EISVCTF(3) EQAGIAA(68)
P27 (30)SGEFEGYGSVF(7) DVVV--PGAFTTTL(9) PALLWQHR(3) PIGVY- TEMKED(2)GLYVRGRL(3)DDPLAK(1) AHAHMKAGSLTGLSI-GY(14)LLKEI-D---LW--EVSLVTF(3) DEARISD(61)
phi-C31 (23)RISMRGYAYRF(11)ERIV--PGAGAPSL(4) VYATFNHD(3) LLGRTS(3)LRVGED(2)GGWYEIDL(1)DTTVGR(1) VAKLLKRGDLQGSSF-TF(22)EITAM-D---VV--ELGPVVN(3) PTTQASL(44)
psiM100 (128)KRLVTGPVLVP(12)EQVE--EVAYKFME(2) QNIDIMHR(3) VARPVE(3)LRADEE(2)GVHLPRGT(3)TARIYD(2) IWEGVKTGKYTGFSI-TA(13)TLRELGW---PW--EVVTISI(3) PKAKYLS(135)
psiM2 (82)QRIITGPVLVP(12)EQVE--RVAYKFME(2) QNVDILHR(3) VAKPVE(3)LREDTM(2)GVDLPEGT(3)SAKVYD(2) TWRGILEGKYQGFSI-TA(8) TLADIGW---PF--DVVTVSI(3) PKARYLS(131)
Omega (35)TLVLEGYASTF(16)EQLD--RRAFEKTL(5) LHLLVNHA(2) PLARTK(3)LDLSVD(2)GLKVVARL(2)RDPDVQ(1) LAVKMERGDMDEMSF-AF(21)TITEV-S---LHKGDVSVVNF(3) PTTSVGL(235)
phi3626 (26)TKTITGYASKY(16)EVVA--EGAFDNSL(4) IKALYNHN(3) VLGSTK(3)LRLESD(2)GLRFEIDL(1)NTTVAN(1) LYESVKRGDVDGTSF-GF(20)TLLEI-D---LY--EISPTPF(3) EDTEVDC(26)
phiPV83 (19)EMVIEGYALKF(11)ETIS--RRALENTS(3) VRCLVDHI(3) IIGRTK(3)LELETD(2)GLKYRCKL(1)NTTFAR(1) LYENMRVGNINQCSF-GF(20)TLTAIRE---LT--DVSVVTY(3) KDTDVKP(31)
A2 (21)PAVIEGYALKF(15)EHID--PHALDNAD(3) VVALFNHD(3) VLGRTG(2)LELTVD(2)GLKYTLTP(1)DTQLGR(1) LLENVRRGIISQSSF-AF(23)TINNIDH---LF--DVSPVTT(3) PDTEVKV(38)
bIL285 (24)EKIISGYFIVF(11)EEIS--PESFDNVD(3) VRALIDHE(3) VLGRTK(3)LTLSVD(2)GVYGEIKV(2)NDTEAM(1) LYSRVQRGDVDQCSF-GF(17)TIKAI-E---LF--EVSVVTF(3) ADTAVEA(33)
bIL309 (24)IGQIAGYAIKF(11)EYIA--PIALDNVD(3) VLALYNHD(3) VLGRVD(3)LKLSID(2)GLHFVLDM(1)DTTVGH(1) VYNNIKAGNLKGMSF-GF(18)IINQLQT---LS--EISVVSR(3) DDTSVQV(29)
3) Podoviridae
ST64B (30)SGEFEGYGSVF(7) DVVM--SGAFAASL(8) PALLWQHR(3) PIGVY- TEMKED(2)GLYVKGRL(3)DDPLAK(1) AHAHMKAGSLTGLSI-GY(14)LLKEI----DLW--EVSLVTF(3) DEARISD(61)
phage V (21)PAHIIGYGSVF(11)EIIR--PGAFDDVL(3) VRALFNHD(3) ILGRSA(3)LNLSVD(2)GLRYDIQA(1)ETQTIR(2) VLAPMQRGDINQSSF-AF(17)VIREITRFSRLL--DVSPVTY(3) QEADSAV(34)
4) Myoviridae
P2 (6) KFFRIGVEGDT(1) DGRVISAQDIQEMA(9) CRINLEHL(10)RYGDV- AELKAE(8)KGKWALFA KITPTD(1) LIAMNKAAQKVYTSMEIQ(5) TGKCYLVGLAVT--DDPASLG(22)PENLISV(121)
HP1 (8) DFICIATSGYT(1) DGRQITAQELHEMA(9) ANLWPEHR(3) NMGQV- IELKAE(3)KGETQLFA IIAPNK(1) LIEYNRAGQYLFTSIEIT(5) SGKAYLSGLGVT--DSPASVG(21)VDFSAKE(145)
K139 (5) DWVIVATAGTT(2) DGRVISESWINDMA(9) ALIWPEHY(10)NWGEV- EELKAG(2)KDKLRLFA KLTPNH(1) LLEANKDGQKLFSSIEPE(5) EGRCYLLGLAVT--DSPASSG(15)LECSALE(148)
Mu (19)GWCQLLPAGHF(13)QGWFIDGEIAGRLV(9) VLIDYEHN(14)AAGWFN(1)DEMQWR -EGEGLFI HPRWTA(1) AQQRIDDGEFGYLSAVFP(4) TGAVLQIRLAALT-NDPGATG(16)QENKPMN(181)
5) Unclassified dsDNA phages
BFK20 (19)NGTFTAYASVF(7) DVVK--SGAFADTL(9) LPVLYGHD(3) PFSNIG(2)VEAEED(2)GLKITGKL(2)DNPKAA(1) VYKLLKEKRLSQMSF-AF(17)SIDKV-K---LY--EVSVVPI(3) QETEILA(43)
phBC6A52 (17)QVILDGYVNVV(15)ERIV--PKTFEKAL(5) VDLLFNHD(3) NLGSIE(3)LELYED(2)GLRAIA-- -TVTDE(1) VIKKARNKELRGWSF-GF(17)SIEEL-E---LL--EVSILDM(5) VATSIET(42)
phi13 (19)EMVIEGYALKF(11)ETIS--RRALENTD(3) VRCLVDHI(3) IIGRTK(3)LELETD(2)GLKYRCKL(1)NTTFAR(1) LYENMRVGNINQCSF-GF(20)TLTAIRE---LT--DVSVVTY(3) KDTDVKP(31)
Bacteriophage prohead proteases (family U9)
Aeh1 (31)KLYIEGIFMQS(12)KVL---QEAVTKYI(8) ALGELNHP(3) NVDPLH(1)AIIIEK(4)GNDVWGRA(3)EGDYAE(3) TAALIRAGWIPGVSSRGL(11)EVQEGFKLTVGV--DVVWGPS(3) PNAYVKP(32)
T4 (36)GLYIEGIFMQA(12)RIL---EKAVKDYI(8) ALGELNHP(3) NVDPMQ(1)AIIIED(4)GNDVYGRA(3)EGDHGP(3) LAANIRAGWIPGVSSRGL(11)IVNEGFKLTVGV--DAVWGPS(1) PDAWVTP(30)
S-PM2 (22)HLYIEGVFLQS(12)SVL---EKEVSRYN(8) ALGELGHP(3) TVNLDR(1)SHRITS(4)GSNFIGKA(3)ATPMGN(1) AKSLLDEGVRLGVSSRGM(11)VMDDFMHAATAA--DIDADPS(3) PDAFVNG(49)
RM378 (4)DKTYTALIMEA(12)EAV---KKAVERMK(7) MYGELDHP(8) FVSLER(1)AVQWVD(4)GNKVYGKF(3)PTPYGN(1) VKSLLENGINFGFSLRGS(14)IVDDFFIT--AI--DVVAVPS(3) QSARVLQ(24)
Many dsDNA phage prohead proteases are herpesvirus assemblin-like serine proteases
Liu & Mushegian, 2004 J. Bacteriol.