MI64CH18-Hatfull ARI 17 August 2010 15:32
Mycobacteriophages: Genes and Genomes
Graham F. Hatfull
Pittsburgh Bacteriophage Institute, Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260; email: [email protected]
Annu. Rev. Microbiol. 2010. 64:331-56
First published online as a Review in Advance on June 7, 2010
The Annual Review of Microbiology is online at micro.annualreviews.org
This article's doi: 10.1146/annurev.micro.112408.134233
Copyright © 2010 by Annual Reviews. All rights reserved
0066-4227/10/1013-0331$20.00
Key Words
bacteriophage, genome evolution, genomics, tuberculosis, mycobacteria
Abstract
Viruses are powerful tools for investigating and manipulating their hosts, but the enormous size and amazing genetic diversity of the bacteriophage population have emerged as something of a surprise. In light of the evident importance of mycobacteria to human health (especially Mycobacterium tuberculosis, which causes tuberculosis) and the difficulties that have plagued their genetic manipulation, mycobacteriophages are especially appealing subjects for discovery, genomic characterization, and manipulation. With more than 70 complete genome sequences available, the mycobacteriophages have provided a wealth of information on the diversity of phages that infect a common bacterial host, revealed the pervasively mosaic nature of phage genome architectures, and identified a huge number of genes of unknown function. Mycobacteriophages have provided key tools for tuberculosis genetics, and new methods for simple construction of mycobacteriophage recombinants will facilitate postgenomic explorations into mycobacteriophage biology.
Annu. Rev. Microbiol. 2010.64:331-356. Downloaded from www.annualreviews.org by University of California - Santa Cruz on 01/11/11. For personal use only.
Mycobacteriophage: a bacteriophage that infects mycobacterial hosts
Contents
INTRODUCTION
GENERAL PROPERTIES OF MYCOBACTERIOPHAGES
  Mycobacteriophage Virion Morphologies
  Host Range and Host Range Determinants
  Life Cycles
MYCOBACTERIOPHAGE GENOMICS
  Sequenced Mycobacteriophage Genomes
  Overview of Genomic Diversity
  Genome Organizations
  Genome Mosaicism
  Mechanisms for Generating Mosaic Genomes
  Transposons and Other Mobile Elements
MYCOBACTERIAL GENE FUNCTION AND EXPRESSION
  Lysis
  Integration and Prophage Maintenance
  Gene Expression and Its Regulation
  Other Mycobacteriophage Gene Functions
MYCOBACTERIOPHAGE GENETIC MANIPULATION
SUMMARY
INTRODUCTION
Mycobacteriophages are viruses that infect mycobacterial hosts. Interest in mycobacteriophages began in the late 1940s with the isolation of phages that infect Mycobacterium smegmatis (31, 121), followed shortly by phages that infect Mycobacterium tuberculosis (27). A primary motivation of these early studies was to type mycobacterial clinical isolates, which was further advanced by collecting sizable numbers of mycobacteriophages from a variety of environmental and clinical sources (37, 57, 105). The use of mycobacteriophages for typing purposes dominated the literature over the next 35 years, although important advances were made in understanding mycobacteriophage biology including the use of phage I3 as a generalized transducing phage for M. smegmatis (91), lysogeny in environmental and clinical strains (55, 72, 77), visualization by electron microscopy (100), and transfection of mycobacteriophage DNA (59, 114).
Mycobacteriophages emerged in the late 1980s as key players in the establishment of a facile genetic system for the mycobacteria (50). A breakthrough was established in 1987 by Jacobs et al., who used phage TM4 to construct novel shuttle phasmids that replicate as large cosmids in Escherichia coli and as phages in mycobacteria (53). These shuttle phasmids can be manipulated in E. coli using standard genetic engineering approaches and used to efficiently introduce foreign genes into mycobacteria. In the absence of other methods for direct manipulation of mycobacteriophage genomes, shuttle phasmids have proven invaluable for specialized transduction (1), transposon delivery (2, 98), and diagnostic introduction of reporter genes (51, 88). They also facilitated the use of antibiotic selectable markers through temperate phage L1 shuttle phasmids (103) and characterization of high-efficiency transformation mutants of M. smegmatis (104).
A notable feature of shuttle phasmid construction is that it does not require phage genomic information (52). However, realization of the full potential of mycobacteriophages for contributing to an understanding of their hosts clearly requires genomic characterization, and the first sequenced genome was that of mycobacteriophage L5 in 1993 (46). As the technologies for DNA sequencing advanced and became both quicker and cheaper, a large collection of complete mycobacteriophage genome sequences has emerged, revealing a delightfully complex, diverse, and interesting set of genomes. Seventy genome sequences are available in GenBank (Table 1) and a comparative analysis of 60 of these has been described (44).
dsDNA: double-stranded DNA
Mycobacteriophages hold considerable promise for elucidating phage diversity and evolution, gaining novel insights into the physiology and perhaps virulence of their mycobacterial hosts, and aiding the development of tools for mycobacterial genetics. In this review I focus primarily on the first of these, although the last two aspects have been greatly explored, providing insights into biofilm formation (80), cell wall composition (82, 87), tools for transposon delivery (2), reporter gene delivery (51), gene replacement (1, 118), point mutagenesis (119), single copy vectors (65), and non-antibiotic selectable markers (23), among others. Several additional reviews provide the reader with further information about mycobacteriophage genomics and applications (39-43, 75, 76). As our understanding of mycobacteriophage genomics expands, it will undoubtedly invigorate further utilities and insights.
GENERAL PROPERTIES OF MYCOBACTERIOPHAGES
Mycobacteriophage Virion Morphologies
All the characterized mycobacteriophages are double-stranded DNA (dsDNA) tailed phages belonging to the order Caudovirales. Most (61 of 70) are of the family Siphoviridae, characterized by relatively long flexible noncontractile tails, whereas nine are of the family Myoviridae, containing contractile tails (44). There is a notable absence of phages from the family Podoviridae (containing short stubby tails), although it is unclear whether their absence is due to evolutionary constraints or to physical problems in traversing the complex and relatively thick mycobacterial cell envelope.
Although the nine myoviral mycobacteriophages (Table 1) are morphologically indistinguishable, the siphoviruses show considerable variation. For example, the tail lengths vary by almost a factor of three (105 to 300 nm) and the structures at the tail tips are discernibly different in many of these phages (44). For the most part, the heads are isometric, although three (Corndog, Che9c, and Brujita) contain prolate heads, with the most extreme being Corndog, whose length-to-width ratio is almost four; the previously described but unsequenced phage R1 (106) has a prolate head similar to that of Che9c and Brujita (44). Those with isometric heads span a range of sizes, with the smallest being BPs and Halo (48 nm in diameter) and the largest being Bxz1 and its relatives (85 nm in diameter). In general, the capsid size correlates with genome size, suggesting there is a relatively constant DNA packaging density (44).
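The link between capsid size and genome size can be illustrated with a rough calculation. The sketch below is not from the review: it assumes the capsid can be treated as a sphere of the quoted outer diameter (ignoring shell thickness and the icosahedral shape), and the function name `packaging_density` is hypothetical. Diameters are those quoted above and genome lengths are taken from Table 1.

```python
import math


def packaging_density(genome_bp: int, capsid_diameter_nm: float) -> float:
    """Base pairs per cubic nanometre, treating the capsid as a sphere.

    The spherical approximation is an assumption made for illustration only.
    """
    radius = capsid_diameter_nm / 2.0
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return genome_bp / volume


# Diameters from the text; genome sizes from Table 1.
bps_density = packaging_density(41_901, 48.0)    # BPs, smallest isometric head
bxz1_density = packaging_density(156_102, 85.0)  # Bxz1, largest isometric head

print(f"BPs:  {bps_density:.2f} bp/nm^3")
print(f"Bxz1: {bxz1_density:.2f} bp/nm^3")
```

Under this crude model the two extremes differ by less than a factor of two in density even though their genome lengths differ almost fourfold, which is the sense in which the density is "relatively constant". A more careful estimate would need internal capsid dimensions, which the text does not give.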
Host Range and Host Range Determinants
The early phage-typing studies showed that mycobacteriophages can have an almost endless variety of preferences for different bacterial hosts. Some phages (e.g., D29) have broad host ranges and infect many species of both fast-growing and slowly growing mycobacteria, including M. smegmatis and M. tuberculosis (94), whereas others (e.g., Barnyard) have very narrow preferences and infect only a single known host (94). At least one phage (DS6A) has been reported whose host range is restricted to strains composing the M. tuberculosis complex (10, 56), although only a partial genome sequence of this potentially extremely useful and interesting phage is available. Several phages discriminate between strains or isolates of a particular species, and we note that phage 33D differentiates between BCG strains and Mycobacterium bovis, and several phages have preferences for specific strains of M. smegmatis (C. Bowman, G. Broussard, D. Jacobs-Sera & G.F. Hatfull, unpublished observations).
For the most part, the molecular and genetic barriers to mycobacteriophage host range preferences are not known. Presumably, differentiation occurs at the cell surface due to the presence or absence of specific receptors, from the need for particular metabolic requirements after DNA has been injected into the cell, or from specific phage protection mechanisms such as immunity and restriction.
Table 1. Genometrics of 70 sequenced mycobacteriophage genomes (a)
Phage | Size (bp) | GC% | No. of ORFs | tRNA # | tmRNA # | Ends | Accession no. | Cluster | Origins | Reference
Bethlehem | 52,250 | 63.3 | 87 | 0 | 0 | 10-base 3' | AY500153 | A1 | Bethlehem, PA | 45
Bxb1 | 50,550 | 63.7 | 86 | 0 | 0 | 9-base 3' | AF271693 | A1 | Bronx, NY | 76a
DD5 | 51,621 | 63.4 | 87 | 0 | 0 | 10-base 3' | EU744252 | A1 | Upp. St. Clair, PA | 44
Jasper | 50,968 | 63.7 | 94 | 0 | 0 | 10-base 3' | EU744251 | A1 | Lexington, MA | 44
KBG | 53,572 | 63.6 | 89 | 0 | 0 | 10-base 3' | EU744248 | A1 | Kentucky | 44
Lockley | 51,478 | 63.4 | 90 | 0 | 0 | 10-base 3' | EU744249 | A1 | Pittsburgh, PA | 44
Solon | 49,487 | 63.8 | 86 | 0 | 0 | 10-base 3' | EU826470 | A1 | Solon, IA | 44
U2 | 51,277 | 63.7 | 81 | 0 | 0 | 10-base 3' | AY500152 | A1 | Bethlehem, PA | 45
Che12 | 52,047 | 62.9 | 98 | 3 | 0 | 10-base 3' | DQ398043 | A2 | Chennai, India | 45
D29 | 49,136 | 63.5 | 77 | 5 | 0 | 9-base 3' | AF022214 | A2 | California | 24
L5 | 52,297 | 62.3 | 85 | 3 | 0 | 9-base 3' | Z18946 | A2 | Japan | 46
Pukovnik | 52,892 | 63.3 | 88 | 1 | 0 | 10-base 3' | EU744250 | A2 | Ft. Bragg, NC | 44
Peaches | 51,376 | 63.9 | 86 | 0 | 0 | 10-base 3' | GQ303263.1 | A2 | Monroe, LA | Unpublished data
Bxz2 | 50,913 | 64.2 | 86 | 3 | 0 | 10-base 3' | AY129332 | A2 | Bronx, NY | 83
Chah | 68,450 | 66.5 | 104 | 0 | 0 | Circ Perm | FJ174694 | B1 | Ruffsdale, PA | 44
Colbert | 67,774 | 66.5 | 100 | 0 | 0 | Circ Perm | GQ303259.1 | B1 | Corvallis, OR | Unpublished data
Orion | 68,427 | 66.5 | 100 | 0 | 0 | Circ Perm | DQ398046 | B1 | Pittsburgh, PA | 45
PG1 | 68,999 | 66.5 | 100 | 0 | 0 | Circ Perm | AF547430 | B1 | Pittsburgh, PA | 45
Puhltonio | 68,323 | 66.4 | 97 | 0 | 0 | Circ Perm | GQ303264.1 | B1 | Baltimore, MD | Unpublished data
UncleHowie | 68,016 | 66.5 | 98 | 0 | 0 | Circ Perm | GQ303266.1 | B1 | St. Louis, MO | Unpublished data
Qyrzula | 67,188 | 69.0 | 81 | 0 | 0 | Circ Perm | DQ398048 | B2 | Pittsburgh, PA | 45
Rosebush | 67,480 | 69.0 | 90 | 0 | 0 | Circ Perm | AY129334 | B2 | Latrobe, PA | 83
Phaedrus | 68,090 | 67.6 | 98 | 0 | 0 | Circ Perm | EU816589 | B3 | Pittsburgh, PA | 44
Phlyer | 69,378 | 67.5 | 103 | 0 | 0 | Circ Perm | FJ641182.1 | B3 | Pittsburgh, PA | Unpublished data
Pipefish | 69,059 | 67.3 | 102 | 0 | 0 | Circ Perm | DQ398049 | B3 | Pittsburgh, PA | 45
Cooper | 70,654 | 69.1 | 99 | 0 | 0 | Circ Perm | DQ398044 | B4 | Pittsburgh, PA | 45
Nigel | 69,904 | 68.3 | 94 | 1 | 0 | Circ Perm | EU770221 | B4 | Pittsburgh, PA | 44
Bxz1 | 156,102 | 64.8 | 225 | 35 | 1 | Circ Perm | AY129337 | C1 | Bronx, NY | 83
Cali | 155,372 | 64.7 | 222 | 35 | 1 | Circ Perm | EU826471 | C1 | Santa Clara, CA | 44
Catera | 153,766 | 64.7 | 218 | 35 | 1 | Circ Perm | DQ398053 | C1 | Pittsburgh, PA | 45
ET08 | 155,445 | 64.6 | 221 | 30 | 1 | Circ Perm | GQ303260.1 | C1 | San Diego, CA | Unpublished data
LRRHood | 154,349 | 64.7 | 227 | 30 | 1 | Circ Perm | GQ303262.1 | C1 | Santa Cruz, CA | Unpublished data
Rizal | 153,894 | 64.7 | 220 | 35 | 1 | Circ Perm | EU826467 | C1 | Pittsburgh, PA | 44
Scott McG | 154,017 | 64.8 | 221 | 35 | 1 | Circ Perm | EU826469 | C1 | Pittsburgh, PA | 44
Spud | 154,906 | 64.8 | 222 | 35 | 1 | Circ Perm | EU826468 | C1 | Pittsburgh, PA | 44
Myrna | 164,602 | 65.4 | 229 | 41 | 0 | Circ Perm | EU826466 | C2 | Upp. St. Clair, PA | 44
Adjutor | 64,511 | 59.7 | 86 | 0 | 0 | Circ Perm | EU676000 | D | Pittsburgh, PA | 44
Butterscotch | 64,562 | 59.7 | 86 | 0 | 0 | Circ Perm | FJ168660 | D | Pittsburgh, PA | 44
Gumball | 64,807 | 59.6 | 88 | 0 | 0 | Circ Perm | FJ168661 | D | Pittsburgh, PA | 44
P-lot | 64,787 | 59.7 | 89 | 0 | 0 | Circ Perm | DQ398051 | D | Pittsburgh, PA | 45
PBI1 | 64,494 | 59.7 | 81 | 0 | 0 | Circ Perm | DQ398047 | D | Pittsburgh, PA | 45
Troll4 | 64,618 | 59.6 | 88 | 0 | 0 | Circ Perm | FJ168662 | D | Silver Springs, MD | 44
244 | 74,483 | 62.9 | 142 | 2 | 0 | 9-base 3' | DQ398041 | E | Pittsburgh, PA | 45
Cjw1 | 75,931 | 63.1 | 141 | 2 | 0 | 9-base 3' | AY129331 | E | Pittsburgh, PA | 83
Kostya | 75,811 | 62.9 | 143 | 2 | 0 | 9-base 3' | EU816591 | E | Washington, DC | 44
Porky | 76,312 | 62.8 | 147 | 2 | 0 | 9-base 3' | EU816588 | E | Concord, MA | 44
Pumpkin | 74,491 | 63.0 | 143 | 2 | 0 | 9-base 3' | GQ303265.1 | E | Holland, MI | Unpublished data
Boomer | 58,037 | 61.1 | 105 | 0 | 0 | 10-base 3' | EU816590 | F1 | Pittsburgh, PA | 44
Che8 | 59,471 | 61.3 | 112 | 0 | 0 | 10-base 3' | AY129330 | F1 | Chennai, India | 83
Fruitloop | 58,471 | 61.8 | 102 | 0 | 0 | 10-base 3' | FJ174690 | F1 | Latrobe, PA | 44
Llij | 56,852 | 61.5 | 100 | 0 | 0 | 10-base 3' | DQ398045 | F1 | Pittsburgh, PA | 45
Pacc40 | 58,554 | 61.3 | 101 | 0 | 0 | 10-base 3' | FJ174692 | F1 | Pittsburgh, PA | 44
PMC | 56,692 | 61.4 | 104 | 0 | 0 | 10-base 3' | DQ398050 | F1 | Pittsburgh, PA | 45
Ramsey | 58,578 | 61.2 | 108 | 0 | 0 | 10-base 3' | FJ174693 | F1 | White Bear, MN | 44
Tweety | 58,692 | 61.7 | 109 | 0 | 0 | 10-base 3' | EF536069 | F1 | Pittsburgh, PA | 86
Che9d | 56,276 | 60.9 | 111 | 0 | 0 | 10-base 3' | AY129336 | F2 | Chennai, India | 83
Angel | 41,441 | 66.7 | 61 | 0 | 0 | 11-base 3' | EU568876.1 | G | O'Hara Twp, PA | 96
BPs | 41,901 | 66.6 | 63 | 0 | 0 | 11-base 3' | EU568876 | G | Pittsburgh, PA | 96
Halo | 42,289 | 66.7 | 64 | 0 | 0 | 11-base 3' | DQ398042 | G | Pittsburgh, PA | 45
Hope | 41,901 | 66.6 | 63 | 0 | 0 | 11-base 3' | GQ303261.1 | G | Atlanta, GA | Unpublished data
Konstantine | 68,952 | 57.3 | 95 | 0 | 0 | Circ Perm | FJ174691 | H1 | Pittsburgh, PA | 44
Predator | 70,110 | 56.3 | 92 | 0 | 0 | Circ Perm | EU770222 | H1 | Donegal, PA | 44
Barnyard | 70,797 | 57.3 | 109 | 0 | 0 | Circ Perm | AY129339 | H2 | Latrobe, PA | 83
Brujita | 47,057 | 66.8 | 74 | 0 | 0 | 11-base 3' | FJ168659 | I | Virginia | 44
Che9c | 57,050 | 65.4 | 84 | 0 | 0 | 10-base 3' | AY129333 | I | Chennai, India | 83
Corndog | 69,777 | 65.4 | 99 | 0 | 0 | 4-base 3' | AY129335 | Non | Pittsburgh, PA | 83
Giles | 53,746 | 67.5 | 78 | 0 | 0 | 14-base 3' | EU203571 | Non | Pittsburgh, PA | 78
Omega | 110,865 | 61.4 | 237 | 2 | 0 | 4-base 3' | AY129338 | Non | Upp. St. Clair, PA | 83
TM4 | 52,797 | 68.1 | 89 | 0 | 0 | 10-base 3' | AF068845 | Non | Colorado | 25
Wildcat | 78,296 | 56.9 | 148 | 24 | 1 | 11-base 3' | DQ398052 | Non | Latrobe, PA | 45
TOTAL | 5,078,090 | | 7930 | 363 | | | | | |
AVERAGE | 73,595.5 | 63.7 | 114.9 | 5.26 | | | | | |
(a) Colored shading corresponds to genome groupings according to cluster relationships.
For many mycobacteriophages the barriers appear to be absolute, and no plaques are observed on a nonpermissive host even after plating large numbers of phage particles. For other phages plaques are observed at modest plating efficiencies (10^-4 to 10^-6) on a nonpermissive host, and phages BPs and Halo, which were isolated on M. smegmatis, form plaques on M. tuberculosis at a frequency of 10^-5 (96). Further characterization shows that the plaques recovered on M. tuberculosis are expanded host range mutants that infect both strains with equal plating efficiency (96).
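Plating efficiency is conventionally the ratio of plaque counts obtained from the same lysate on a test host and on the host used for propagation. The short sketch below illustrates that arithmetic; the function name and the titers are hypothetical values chosen for illustration, not measurements from the studies cited.

```python
def efficiency_of_plating(pfu_test_host: float, pfu_reference_host: float) -> float:
    """Ratio of plaque-forming units on a test host to those on the reference host.

    Both titers must come from the same lysate and be expressed in the same units.
    """
    if pfu_reference_host <= 0:
        raise ValueError("reference titer must be positive")
    return pfu_test_host / pfu_reference_host


# Hypothetical titers (PFU/ml) for one lysate plated on two hosts.
titer_on_smegmatis = 2e9     # permissive host used for isolation
titer_on_tuberculosis = 2e4  # nonpermissive host

eop = efficiency_of_plating(titer_on_tuberculosis, titer_on_smegmatis)
print(f"efficiency of plating: {eop:.0e}")
```

With these made-up numbers the ratio is 10^-5, the same order as the frequency quoted for BPs and Halo on M. tuberculosis. An expanded host range mutant, by contrast, would give a ratio close to 1.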
Although mycobacteriophage host preferences are expected to be strongly dominated by the availability of specific cellular receptors, few have been identified or studied. Lipid extracts of M. smegmatis have been shown to inhibit infection by phages D29 and the uncharacterized D4 (113), and a specific peptidoglycolipid, mycoside C(sm), has been purified and proposed to play a role in phage D4 binding (29). Glycolipids may act as receptors for adsorption of mycobacteriophage Phlei (8), and a subset of lyxose-containing molecules has been further chemically characterized (60). More recently, a single methylated rhamnose residue on the M. smegmatis cell wall-associated glycopeptidolipid has been shown to be involved in adsorption of phage I3 (16).
Isolation of spontaneous M. smegmatis mutants resistant to D29 infection is simplified by the high efficiency with which this phage kills its host. However, characterization of the mutants is complicated by their poor growth and genetic instability. Surprisingly, robust resistance to D29 can arise through simple overexpression of the wild-type M. smegmatis mpr gene when present on an extrachromosomal plasmid (4), from an integrated mpr gene expressed from a strong promoter (4), or by transposon activation (93). It is thus plausible that spontaneous D29 resistance occurs through localized gene amplification at the mpr locus, leading to enhanced expression of the Mpr protein, and that the locus reduces back to a single copy when selective pressure is removed. It is not clear what the normal cellular function of mpr is, or why mpr overexpression gives D29 resistance.
CRISPR: clustered regularly interspaced short palindromic repeat
In many bacterial hosts, clustered regularly interspaced short palindromic repeats (CRISPRs) play roles in phage resistance (3, 116). Most sequenced mycobacterial genomes do not appear to have CRISPRs, with the exceptions being M. tuberculosis H37Rv (and related strains) and M. avium strain 104. The CRISPRs are composed of short direct repeats (21-47 bp) separated by short (30-50 bp) unique spacer sequences, and in the well-characterized CRISPRs the spacers have near sequence identity with phage genomic sequences, an important component for phage resistance (117). The mycobacterial CRISPR spacer sequences do not have compellingly similar counterparts in any of the sequenced mycobacteriophage genomes, consistent with the idea that many phages of these hosts remain unidentified.
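Checking whether a spacer has a near-identical counterpart in a phage genome amounts to finding its best placement on either strand. The following is a minimal sketch of that idea using an ungapped, brute-force scan. It is not the method used in the cited analyses, the function names are hypothetical, and the sequences are toy data; real searches would use an alignment tool that tolerates gaps.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_spacer_match(spacer: str, genome: str) -> tuple[int, int, str]:
    """Best ungapped placement of a spacer in a genome, on either strand.

    Returns (mismatches, 0-based start on the forward strand, strand).
    A start of -1 means the genome is shorter than the spacer.
    """
    best = (len(spacer) + 1, -1, "+")
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for start in range(len(genome) - len(query) + 1):
            window = genome[start:start + len(query)]
            mismatches = sum(a != b for a, b in zip(query, window))
            if mismatches < best[0]:
                best = (mismatches, start, strand)
    return best


# Toy data: a short "phage genome" and a spacer copied from it.
toy_genome = "TTGACCGTAGGCTAACGTTAGCATCGGATCCA"
toy_spacer = toy_genome[5:17]

forward_hit = best_spacer_match(toy_spacer, toy_genome)
reverse_hit = best_spacer_match(reverse_complement(toy_spacer), toy_genome)
print(forward_hit, reverse_hit)
```

A spacer with zero or very few mismatches would count as having a counterpart in the phage. The observation in the text corresponds to every mycobacterial spacer scoring poorly against every sequenced mycobacteriophage genome.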
Life Cycles
dsDNA tailed phages canonically are either temperate, forming stable lysogens at moderate frequencies (e.g., lambda), or lytic, such that all infections lead to phage growth and cell death (e.g., T4 and T7). Classification of mycobacteriophages into two such groups is, however, complex. A good example of a temperate phage is L5, which forms obviously turbid plaques from which stable lysogens immune to superinfection can be readily isolated (23); in contrast, D29 forms completely clear plaques in which virtually all host cells are killed. Genomic analysis, however, shows that D29 is a clear-plaque derivative of an L5-like temperate parent, not of a T4-like or T7-like phage (24). Of the genomically characterized phages, 12 others (the Cluster A phages; Table 1) behave similarly to L5. Most other mycobacteriophages form lightly turbid plaques, rather than clear or obviously turbid ones, and for Tweety, Giles, BPs, and Halo this reflects the ability to form lysogens at relatively low frequencies (3-5%) (78, 86, 96). Approximately one-half of the characterized mycobacteriophages (36 of 70) have an integration cassette and are candidates for forming lysogens, albeit at relatively low frequency. Phages such as Bxz1 and its relatives also form hazy plaques, although it is unclear whether the cellular survivors are uninfected cells, resistant mutants, or lysogens.
MYCOBACTERIOPHAGE GENOMICS
Sequenced Mycobacteriophage Genomes
The first completely sequenced mycobacteriophage genome was that of phage L5 (46), a temperate phage isolated in Japan (22); it is a close relative of phage L1, which shares a similar restriction pattern but does not grow at 42°C (65). Both L5 and L1 infect fast-growing and slowly growing mycobacterial strains, although efficient infection of slow-growers by L5 requires the presence of high calcium concentrations (28). Although the sequence of L1 has not been determined, derivatives that grow at both 42 and 30°C have been identified, followed by isolation and characterization of temperature-sensitive mutants (13, 15). The next complete genome reported was that of D29 (24), which was isolated in California from a soil sample by enrichment and infects both fast-growing and slowly growing strains, and is clearly lytic (27). D29 has considerable nucleotide sequence similarity to L5, especially in the left-most parts of the genomes that encode the virion structural genes (24). Whereas D29 forms distinctly clear plaques (perhaps more so than any other mycobacteriophage), the sequenced version is likely a recent derivative of a temperate parent, and Bowman (9) noted a mixture of plaque morphologies in his starting D29 stock; genomic comparison with L5 is consistent with this.
The third sequenced mycobacteriophage, TM4, was isolated by induction of a strain of M. avium (112). It is unclear whether the original strain was lysogenic or pseudolysogenic, since TM4 is capable of lysing it as well as M. smegmatis and M. tuberculosis (112); it does not appear to form stable lysogens in either of these strains. Genomic analysis shows that it is distinct from L5 and D29 at the nucleotide sequence level (25), and it does not encode any known integration system or any readily identifiable phage repressor.
All the other sequenced mycobacteriophage genomes were from phages isolated over the past 20 years, and all were isolated from environmental samples using M. smegmatis mc²155 as a host. At the time of writing, the total number of mycobacteriophage genomes deposited in GenBank is 70 (Table 1) and a detailed comparative genomic analysis of 60 has been described (44). These phages have come from a variety of geographic locations, although about half of them were isolated from the western Pennsylvania region. The isolation of new mycobacteriophages has been greatly spurred by the development of phage discovery and genomics as an educational platform (38, 45). It would be of considerable interest to take advantage of the faster and cheaper technologies to sequence the numerous mycobacteriophages isolated in the earlier period (1950-1980) for which detailed host range data have been reported, if these can still be recovered. We also note that the use of other mycobacterial strains for phage isolation will likely give distinct landscapes of genetic diversity to that described below for the current collection.
Overview of Genomic Diversity
The 70 sequenced mycobacteriophage genomes encompass substantial genetic diversity, and the genomic architectures are dominated by mosaic relationships. Although the overall diversity is high, it is not uniform, and any two particular phages may share either extensive nucleotide sequence similarity over the entire genome lengths with only a few base differences (e.g., phages Adjutor and PBI1), or as few as three genes whose products share greater than 25% amino acid identity (e.g., phages Barnyard and Giles) (Figure 1). Because of the mosaic nature of these genomes, many of the relationships lie between these extremes, with substantial numbers of genes shared among genomes that are not otherwise closely related.
To recognize the heterogeneous nature of genome diversity, the 70 genomes can be grouped into clusters according to their relationships to each other (Figure 1) (44). Several different methods can be used for determining the cluster assignments, including nucleotide sequence similarities and gene content analyses. For many genomes the placement into a particular cluster is simple because of extensive and clear nucleotide sequence similarity, but for other genomes it is more complex either because there is extensive but weaker similarity or because there is high nucleotide sequence similarity that extends over only a small genome segment. An arbitrary cutoff measure has been proposed that any two genomes with evident nucleotide sequence similarity spanning more than 50% of the genome lengths should be included within the same cluster (44). Using these criteria, an analysis of 60 sequenced genomes placed 55 into nine major clusters (A-I), and the remaining 5 were singleton genomes with no close relatives (44); the additional 10 genomes available in GenBank all fit within the nine major clusters (Figure 1) (Table 1). Five of these clusters can be further subdivided into subclusters, and it is anticipated that as additional genomes are sequenced new clusters will be formed (because of expected discovery of relatives of genomes that are currently singletons), and that current clusters will undergo further subdivision. The global population of mycobacteriophages would seem more likely to form a continuum of relationships, and the observed clusters may emerge from biases imposed by the isolation procedures. It is also likely that additional genomes unrelated to any of those sequenced to date remain to be discovered. Note that this clustering primarily provides a convenient framework for further analysis and does not provide an accurate portrayal of whole genome phylogenies, which involve reticulate relationships due to genomic mosaicism (64, 69, 70).
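The 50% cutoff described above can be sketched as a simple grouping procedure. The example below is an illustration only: it assumes that cluster membership is transitive (single linkage via union-find), which the text does not state, and the pairwise span fractions are invented placeholders rather than measured values. The function name `cluster_genomes` is hypothetical.

```python
def cluster_genomes(names, shared_span, threshold=0.5):
    """Group genomes whose pairwise similarity spans more than `threshold`.

    `shared_span` maps (name_a, name_b) to the fraction of genome length
    covered by evident nucleotide similarity. Grouping is transitive
    (single linkage), which is an assumption of this sketch.
    """
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, halving paths.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for (a, b), fraction in shared_span.items():
        if fraction > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Invented span fractions, for illustration only.
phages = ["Adjutor", "PBI1", "Barnyard", "Giles"]
spans = {
    ("Adjutor", "PBI1"): 0.98,
    ("Barnyard", "Giles"): 0.02,
    ("Adjutor", "Giles"): 0.0,
}
clusters = cluster_genomes(phages, spans)
print(clusters)
```

Genomes left alone in a group correspond to singletons. Because the cutoff is arbitrary, changing `threshold` changes the grouping, which mirrors the caveat that the clusters are a convenient framework and not a phylogeny.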
An indication that the current collection of mycobacteriophages underrepresents their full diversity is provided by several prophages resident in mycobacterial genomes. Full-length prophages can be identified in the genomes of M. avium strain 104, M. abscessus (92), and M. marinum (108), and there are smaller prophage-like elements in M. tuberculosis (18, 49) and Mycobacterium ulcerans (107, 109). However, none of these is closely related to any of the sequenced mycobacteriophages, and all should generally be classified as singletons in the clustering scheme described above. The roles of any of the prophages or prophage-like sequences in virulence of their hosts are not clear, but they are of interest because many of the sequenced mycobacteriophage genomes do encode genes capable of influencing host physiology (83).
Figure 1. Dotplot comparison of 70 sequenced mycobacteriophage genomes. The 70 sequenced mycobacteriophage genomes were concatenated into a single 5-Mbp sequence, which was compared with itself using Gepard (62). The genome order is the same as in Table 1, and the Cluster and Subcluster designations are shown along the axes.
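The self-comparison underlying the Figure 1 dotplot can be illustrated with a word-match sketch: genomes are concatenated, and a dot is recorded wherever the same short word occurs at two positions. This is not Gepard's algorithm (Gepard is a separate tool with its own indexing and scoring); the function name, word length, and sequences below are assumptions for illustration.

```python
from collections import defaultdict


def kmer_dotplot(seq_x: str, seq_y: str, k: int = 6) -> set[tuple[int, int]]:
    """Return the set of (i, j) where the k-mer at seq_x[i] equals that at seq_y[j]."""
    # Index every k-mer position in seq_y once, then look up k-mers from seq_x.
    index = defaultdict(list)
    for j in range(len(seq_y) - k + 1):
        index[seq_y[j:j + k]].append(j)
    dots = set()
    for i in range(len(seq_x) - k + 1):
        for j in index.get(seq_x[i:i + k], ()):
            dots.add((i, j))
    return dots


# Two toy "genomes" that share the segment ACGTTGCA.
genome_a = "ACGTTGCAAGGCT"
genome_b = "TTTACGTTGCACC"
concatenated = genome_a + genome_b

dots = kmer_dotplot(concatenated, concatenated, k=6)
off_diagonal = sorted(d for d in dots if d[0] < d[1])
print(off_diagonal)
```

Every position matches itself, giving the main diagonal. Shared segments between different genomes appear as short off-diagonal runs, which is what makes related genomes show up as blocks in a plot of this kind.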
Genome Organizations
Mycobacteriophage genome lengths vary greatly, from 41.4 (Angel) to 164.6 kbp (Myrna), with an average length of 73.6 kbp (Table 1). An example of genome organization is shown in Figure 2, in which the virion structure and assembly genes are arranged as an array in the left part of the genome, followed by the lysis cassette, an integration cassette, and a set of genes in the right part, some of which encode DNA replication or recombination functions, but most are of unknown function. However, there is considerable variation in genome organization and several themes emerge for different phage clusters (Figure 3). The most obvious is that all the mycobacteriophages with a siphoviral morphotype (all but Cluster C) share a syntenic group of genes encoding virion structure and assembly proteins, as seen in all siphoviruses regardless of their bacterial host and regardless of the lack of sequence similarity. For representational purposes these are shown in the left parts of the genomes (Figure 3).
Figure 2. Organization of the mycobacteriophage Angel genome. The linear genome is represented by a horizontal bar with markers in kilobase pairs. Predicted genes are shown as colored boxes with the gene name shown inside the box; genes shown above the genome bar are transcribed rightward, and those below it are transcribed leftward. Each of the genes has been grouped into a phamily of related mycobacteriophage genes (44), with the Pham number designation shown above the gene. Putative gene functions are noted where known. Angel is a member of Cluster G (see Table 1), in which there are three other members, Halo, Hope, and BPs. These four genomes are similar at the nucleotide level, and differ in structure primarily by insertions of a putative mycobacteriophage mobile element (MPME). Angel contains no insertions, both Hope and BPs contain insertions of MPME1, and Halo contains an insertion of MPME2 as shown.
Clusters F, G, and I all contain genomeswith defined ends with short single-stranded DNA extensions (Table 1), andthe leftmost of the structure and assem-bly genes (terminase) is located close tothe genome end (Figure 3). In contrast,Clusters A and E, together with singletonsCorndog, Giles, Omega, TM4, and Wildcat,have defined genome ends but additional genesare present between the terminase and the end(Figure 3), most of which likely do not encodevirion structure and assembly functions. Thenumber of genes varies from 4 (Cjw1; ClusterE) to 31 in the singleton Corndog, and in
[Figure 3 graphic: schematic genome maps for Clusters A–I and singletons Giles, Wildcat, TM4, Omega, and Corndog. Color key: Lysis, Integration, Immunity, Structure, Replication/recombination, Other.]
Figure 3
Schematic representations of mycobacteriophage genome architectures. The genomes of phages in the nine main clusters (A–I) and the five singleton genomes are represented by black bars with gene regions shown as colored boxes. Genes transcribed rightward are shown above the bar, and those transcribed leftward are shown below it. Putative functions of the gene blocks are represented by different colors, with the key shown at the bottom of the figure. In some clusters there is organizational variation within the cluster, and variations are given in parentheses. The genome organizations are schematic and are not drawn to scale.
Cluster A this is where the lysis genes are positioned.
Clusters B, D, and H have circularly permuted genomes, and for purposes of gene numbering and representing the genomes as linear maps, an arbitrary position close to the terminase gene is chosen as nucleotide position #1. In some genomes (e.g., Subcluster B1) this corresponds to the first base of the putative small terminase subunit gene, whereas in others it is within an upstream noncoding interval. There is a close relationship between terminase phylogeny and the nature of phage genome ends (12), and this is also observed in mycobacteriophages (44).
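The choice of an arbitrary position #1 for a circularly permuted genome amounts to rotating a circular sequence and renumbering its coordinates. The following is a minimal sketch of that bookkeeping, not the procedure used for any published map; the function names, the 12-bp toy sequence, and the terminase coordinate are all hypothetical.

```python
def rotate_circular(sequence: str, origin: int) -> str:
    """Return a linear map of a circular sequence that starts at `origin`.

    `origin` is a 1-based coordinate in the current linear representation;
    the base at that coordinate becomes position 1 of the returned map.
    """
    if not 1 <= origin <= len(sequence):
        raise ValueError("origin must lie within the sequence")
    cut = origin - 1  # convert the 1-based coordinate to a 0-based index
    return sequence[cut:] + sequence[:cut]


def remap_coordinate(position: int, origin: int, length: int) -> int:
    """Translate a 1-based coordinate into the numbering of the rotated map."""
    return (position - origin) % length + 1


# Toy 12-bp circle (not a real phage sequence) and a hypothetical
# coordinate for the first base of the terminase gene.
toy_genome = "AAACCCGGGTTT"
terminase_start = 7

rotated = rotate_circular(toy_genome, terminase_start)
print(rotated)
print(remap_coordinate(terminase_start, terminase_start, len(toy_genome)))
```

Because the rotation is only a change of numbering, any annotated feature can be carried onto the new map with `remap_coordinate` without touching the underlying sequence.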
In many of the genomes (i.e., Clusters A, D, F, G, and I) the virion structure and assembly genes are in the canonical and largely uninterrupted order: Terminase, Portal, Protease, Scaffold, Capsid, presumed head-tail joining genes, major tail subunit, G/T tail chaperones, tape measure protein, minor tail proteins (44). Many genomes encode both small and large terminase subunits (e.g., Clusters B1, B4, E, F, G, I, Corndog, TM4), whereas in others a small terminase subunit gene has not been identified (e.g., Clusters A, B2, B3, D, H). Not all genomes encode a scaffold protein, and these functions may be incorporated into the capsid subunit as they are in coliphage HK97
(19, 89). The tape measure protein gene is typically the largest in the genomes of the mycobacterial siphoviruses, reflecting their rather long tails (from 107 nm in L5 to nearly 300 nm in Predator). There are, however, numerous genomes that contain additional genes in the structure and assembly gene array (Figure 3). These insertions occur at multiple locations, such as between the small and large terminase subunits (Cluster E), immediately following the major capsid subunit gene (Subcluster B1), and between the portal and protease genes (in Cluster H), and there are relatively large insertions in the singletons Corndog and Omega (Figure 3). The insertions in Clusters B, E, and H correspond to Holliday junction resolving enzymes (RuvC-like in Cluster B and Endo VII-like in E and H), consistent with a role for these genes in DNA packaging (36).
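The idea of a canonical gene order that tolerates insertions and missing members can be made concrete with a short check. The sketch below assumes each gene has already been given a functional label; the label strings, the function names, and the toy gene array (loosely modeled on the Cluster H arrangement described above) are illustrative assumptions rather than annotations from any sequenced genome.

```python
# Canonical order of structure and assembly functions as listed in the text.
CANONICAL_ORDER = [
    "terminase", "portal", "protease", "scaffold", "capsid",
    "head-tail joining", "major tail subunit", "tail chaperones",
    "tape measure", "minor tail",
]
RANK = {name: index for index, name in enumerate(CANONICAL_ORDER)}


def follows_canonical_order(gene_functions):
    """True if the canonical functions that are present occur in canonical order.

    Genes with other labels (insertions) and absent functions are tolerated;
    repeated labels (e.g., small and large terminase) are allowed.
    """
    ranks = [RANK[f] for f in gene_functions if f in RANK]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


def insertions(gene_functions):
    """Return (index, label) for non-canonical genes lying inside the array."""
    positions = [i for i, f in enumerate(gene_functions) if f in RANK]
    if not positions:
        return []
    first, last = positions[0], positions[-1]
    return [(i, gene_functions[i]) for i in range(first, last + 1)
            if gene_functions[i] not in RANK]


# Hypothetical gene array with one inserted gene between portal and protease.
toy_array = ["terminase", "portal", "HJ resolvase", "protease", "capsid",
             "major tail subunit", "tape measure", "minor tail"]
print(follows_canonical_order(toy_array))
print(insertions(toy_array))
```

The order test deliberately ignores unlabeled genes, so synteny and the presence of insertions are reported separately.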
As noted above, in the Cluster A genomes the lysis genes are located between the terminase gene and the left end. However, this is unusual, and the lysis cassette is more typically located immediately downstream of the tail genes (Figures 2 and 3) and transcribed in the same direction. This is a notable departure from the lambda prototype, where the lysis functions are located close to the right end of the genome (97). Clusters A, E, F, G, and singletons Giles and Omega encode integration cassettes that are near the centers of their genomes regardless of substantial differences in genome lengths (40). In Giles the integration cassette is in an atypical location to the left of the lysis genes (78). Although genes involved in DNA replication (including DNA Pol I, Pol III, and Holliday junction resolving enzymes) and DNA metabolism (such as ThyX and ribonucleotide reductase) can be identified (Figure 3), most other genes in the siphoviral genomes are of unknown function (44).
All Cluster C mycobacteriophages have myoviral morphologies and relatively large genomes, and the virion structure and assembly genes do not appear to be organized into a well-defined array as they are in the siphoviruses. However, relatively few of the structure and assembly genes have been identified and the virion proteins are not well characterized. A striking feature of these genomes is that they encode a large number of tRNA genes (Table 1) organized into at least two large arrays. Myrna (Subcluster C2) is predicted to express 41 tRNAs, only modestly fewer than its M. smegmatis host (47 predicted tRNAs). The Subcluster C1 phages, as well as the singleton Wildcat, also encode a tmRNA gene (Table 1).
Genome Mosaicism
A notable feature of all bacteriophage genomes is their mosaic architectures, where each genome can be thought of as a specific assemblage of individual modules (81, 83, 101). Each module may correspond to a single gene or a group of genes, and its modular nature is reflected by its location in genomes that are otherwise not closely related. The exchange of modules may have occurred relatively recently in evolutionary time, in which case the modules may retain substantial similarity at the nucleotide sequence level, or it may have occurred at more distant times, with the only remaining evidence of common descent being weak but statistically significant amino acid sequence similarity. Examples of both extremes can be found among the mycobacteriophage genomes.
An excellent example of a relatively recent exchange is seen in the Cluster B genomes (Figure 4). Cluster B genomes can be readily subdivided into four subclusters (B1–B4) such that genomes within each subcluster have high levels of nucleotide sequence similarity over their entire genome lengths, but nucleotide sequence similarity is poor between genomes of different subclusters. However, there is a 1.9-kbp DNA sequence segment that departs from this pattern and is shared at a level of 94% nucleotide sequence similarity between phages Rosebush (a Subcluster B2 member) and all six of the Subcluster B1 genomes; the only other member of the B2 subcluster (Qyrzula) has a quite distinct sequence in its place (Figure 4a). Because sufficient evolutionary time has passed to allow for the accumulation of about 100 nucleotide differences between the sequences,
[Figure 4 graphic: (a) aligned genome maps of Orion, PG1, Rosebush, and Qyrzula with gene numbers, phamily assignments, and pairwise sequence-similarity shading on a scale of increasing similarity; (b) nucleotide alignment of PG1, Rosebush, and Qyrzula at the rightmost recombinant junction.]
Figure 4
Recombination between Cluster B mycobacteriophages. (a) Phages Orion and PG1 are members of Subcluster B1 and are closely related at the nucleotide level (Table 1). Phages Rosebush and Qyrzula are members of Subcluster B2 and are closely related at the nucleotide level across most of their genome spans. A short portion (7 kbp) of the genomes is shown and aligned, with sequence similarity represented as colored shading between the pairwise genomes. The strengths of the relationships are shown according to the color spectrum, with violet representing the closest similarity. Note the segment of Rosebush that is closely related to the Subcluster B1 genomes, but not to its B2 relative Qyrzula. Genes are shown as gray boxes, with the gene name within the box, the phamily assignment above the box, and the number of phamily members in parentheses. Figure was generated using the program Phamerator (S. Cresawn, R. Hendrix & G.F. Hatfull, unpublished data). (b) Alignment of PG1, Rosebush, and Qyrzula sequences at the rightmost recombinant junction. The arrow above the sequences shows the position of the 3′ ends of genes 35 of PG1 and Rosebush; the arrows below show the 3′ and 5′ ends of Qyrzula genes 33 and 34, respectively. The box shows a region of interrupted similarity between PG1 and Qyrzula within which recombination could have given rise to the Rosebush recombinant structure.
Phamily (Pham): a group of mycobacteriophage genes related to each other as defined by BlastP and ClustalW
examination of the recombinant junctions should be interpreted cautiously. Nonetheless, at the right junction, which corresponds closely with the 3′ ends of gene 35, there is a short segment of interrupted sequence similarity between PG1 (and all of its five relatives) and Qyrzula that could have served as a site for recombination to give rise to the Rosebush structure (Figure 4b). The common sequence at the junction is not completely conserved, and it is impossible to tell whether the differences have occurred subsequent to recombination or whether they might have been present in the parent genomes (which were not necessarily Qyrzula or other known Subcluster B1 phages). It has been proposed that homeologous recombination events (involving sequences that are similar but divergent) mediated by phage-encoded recombinases (such as lambda Red or the RecET systems) acting at partially conserved sequences could give rise to junctions such as these (74).
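The figure of about 100 nucleotide differences quoted for the shared Cluster B segment follows directly from its length (1.9 kbp) and its similarity (94%). The sketch below reproduces that arithmetic and shows one simple way to score identity over a gapped pairwise alignment; the function names and the short aligned strings are hypothetical, and treating similarity as simple per-site identity is an assumption.

```python
def expected_differences(length_bp: int, identity: float) -> int:
    """Approximate number of mismatched positions for a segment of
    `length_bp` bases shared at fractional `identity` (e.g., 0.94)."""
    return round(length_bp * (1.0 - identity))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# 1.9 kbp at 94% similarity, the values given in the text.
print(expected_differences(1900, 0.94))

# Toy alignment (not taken from Figure 4b).
print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```

On the values from the text this gives 114 expected differences, in line with the stated "about 100".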
The mycobacteriophages appear to have numerous examples in which individual modules correspond to single genes, with the relationships made evident by amino acid sequence similarity (83). When the phylogenies of individual genes are determined, they are often different, revealing distinct evolutionary paths to residence in any particular phage genome. To simplify the representation of this, we have utilized phamily circles, which have the advantage of displaying all genome members used in the analysis, including those that do not contain a particular gene member of the phamily being analyzed (83). Examples are shown in Figure 5, in which both Pham 233 and Pham 471 have a member in phage Omega but have Pham members in a variety of other genomes. In the Omega genome, genes 126 and 127 represent these two phamilies, respectively, and their distinct phylogenetic relationships strongly suggest they have evolved separately and have been juxtaposed by a recombination event between them. This is further illustrated by examining the locations of the related pham members in other genomes. For example, Pham 233 has a member in phage Cjw1 (gene 73) that is flanked on both sides by genes unrelated to those flanking Omega gene 126. Likewise, Pham 471 has a member in phage KBG (gene 84) flanked by genes unrelated to those in Omega. This single-gene mosaicism, especially among the nonstructural genes, is a prominent feature of these genomes and underscores the dominant role of horizontal exchange processes in bacteriophage evolution.
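The comparison of genomic contexts described above can be expressed as a small test: collect the phamilies of the genes that flank each member of a pham, and ask whether two genomes have any flanking pham in common. This is only an illustrative sketch. The gene numbers for Omega, Cjw1, and KBG follow the text, but the identifiers given to the flanking genes are placeholders, not the real phamily assignments, and the function names are hypothetical.

```python
from typing import List, Set, Tuple

# A genome segment is an ordered list of (gene number, pham identifier).
Segment = List[Tuple[int, str]]


def flanking_phams(segment: Segment, pham: str) -> Set[str]:
    """Phams of the genes immediately left and right of each member of `pham`."""
    flanks: Set[str] = set()
    for i, (_, p) in enumerate(segment):
        if p != pham:
            continue
        if i > 0:
            flanks.add(segment[i - 1][1])
        if i + 1 < len(segment):
            flanks.add(segment[i + 1][1])
    flanks.discard(pham)
    return flanks


def shares_context(a: Segment, b: Segment, pham: str) -> bool:
    """True if members of `pham` in both segments share any flanking pham."""
    return bool(flanking_phams(a, pham) & flanking_phams(b, pham))


# Placeholder flank identifiers; only Pham 233 and Pham 471 come from the text.
omega = [(125, "orpham-a"), (126, "233"), (127, "471"), (128, "orpham-b")]
cjw1 = [(72, "placeholder-1"), (73, "233"), (74, "placeholder-2")]
kbg = [(83, "placeholder-3"), (84, "471"), (85, "placeholder-4")]

print(shares_context(omega, cjw1, "233"))
print(shares_context(omega, kbg, "471"))
```

With real pham assignments in place of the placeholders, a result of `False` for both comparisons would correspond to the distinct contexts described for Cjw1 gene 73 and KBG gene 84.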
Mechanisms for Generating Mosaic Genomes
There has been considerable speculation regarding the specific molecular mechanisms that give rise to mosaic phage genomes (47, 48). An early model suggested that short, conserved boundary sequences located at gene boundaries may serve as targets for genetic exchange (110), and such boundary sequences have been described in coliphage HK620 (17). Boundary sequences are not, however, prevalent among mycobacteriophages (or other groups of phages) and thus seem unlikely to solely account for the pervasive mosaicism. A second view is that mosaicism results from events that are primarily illegitimate or nonsequence determined. Although most of these events will be destructive,
Figure 5
Examples of mycobacteriophage mosaicism. (a) A segment of the Omega genome is shown that encodes genes 125–128. Genes 125 and 128 are orphams and have no known mycobacteriophage homologs, and genes 126 and 127 are members of Pham 233 and Pham 471, which have five and eight members, respectively. Members of Pham 233 and Pham 471 are found in phages Cjw1 and KBG, and in each case they are in distinct genomic contexts. Presumably, recombination events between these genes occurred in distant evolutionary time to generate these mosaic structures. (b) Phamily circle representations for Pham 233 and Pham 471. Each of the sequenced mycobacteriophage genomes is represented around the circumference of each circle, grouped according to cluster. Maps and circles were generated using the program Phamerator (S. Cresawn, R. Hendrix & G.F. Hatfull, unpublished data).
[Figure 5 graphic: (a) maps of Omega genes 125–128 and of the regions of Cjw1 and KBG that contain members of Pham 233 and Pham 471, with phamily assignments; (b) phamily circles for Pham 233 and Pham 471 naming each sequenced mycobacteriophage genome, with a key distinguishing relationships identified by BlastP from those identified by ClustalW.]
MPME: mycobacteriophage mobile element
they have the capacity to position two unrelated DNA segments together in a highly creative process. The generation of successful progeny would likely require multiple low-frequency events, coupled with selection either for gene function or for DNA segments of packageable size. The low frequency of such events would not seem to be a serious impediment in light of the dynamic nature of phage-host interactions (10^24 infections per second globally), the vast number of phage particles (10^31), and probable early origins extending back perhaps 3 billion years (48, 111).
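A rough calculation shows why even very rare events need not be an impediment. The sketch below multiplies the infection rate and time span quoted above; holding the rate constant over 3 billion years is a simplifying assumption, and the per-infection probability used in the example is purely hypothetical, chosen only to illustrate the scale.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # 31,557,600 s


def total_infections(rate_per_second: float, years: float) -> float:
    """Cumulative infections, assuming a constant global infection rate."""
    return rate_per_second * years * SECONDS_PER_YEAR


def expected_events(probability_per_infection: float,
                    rate_per_second: float, years: float) -> float:
    """Expected number of rare events over the whole time span."""
    return probability_per_infection * total_infections(rate_per_second, years)


# Values from the text: 10^24 infections per second, about 3 billion years.
infections = total_infections(1e24, 3e9)
print(f"{infections:.2e} cumulative infections")

# Hypothetical probability of a productive illegitimate exchange per infection.
print(f"{expected_events(1e-20, 1e24, 3e9):.2e} expected events")
```

Under these assumptions the cumulative number of infections is on the order of 10^41, so an event with a probability as low as 10^-20 per infection would still be expected to occur an enormous number of times.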
A third view is that homeologous recombination plays an important role. Support for this is provided by the observation that lambda Red recombination is more proficient at recombination between divergent sequences than are host RecABCD pathways and can act at very short regions of sequence similarity (74). However, exchanges occurring at extremely short regions of sequence similarity may not be readily distinguishable from illegitimate recombination events, and exchanges at longer segments may not necessarily lead to disruptions of synteny (Figure 4). Nonetheless, the properties of phage-encoded recombination systems make them attractive for playing important roles in phage evolution, mediating exchange between short partially conserved sequences such as ribosome binding sites, transcriptional terminators, and repressor binding sites (11).
A potential caveat for a general role of lambda Red-like recombinases in generating phage mosaicism is that not all genomes obviously encode such recombination systems. In the mycobacteriophages, Clusters G, I, and Giles encode Escherichia coli RecET-like proteins, some of which are active in recombination (118–120); Wildcat encodes an Erf-like recombinase, and a number of other mycobacteriophages (Clusters C and E) encode RecA homologs. But recombinase genes mediating homologous exchange cannot be readily identified in the remaining 48 genomes, suggesting that they are absent or that these activities lie within the large number of genes of unknown function. It is noteworthy that high levels of recombination among TM4-derived cosmids were observed during shuttle phasmid construction (53) even though no TM4 recombination genes have been identified.
Transposons and Other Mobile Elements
While not all phage genomes necessarily harbor transposons or other mobile elements, they are not uncommon, and transposition is expected to contribute to genomic mosaicism. Curiously, although dozens of transposons and insertion sequences have been identified in mycobacterial genomes, none occurs in any of the sequenced mycobacteriophages (44). However, comparative genomics has revealed a novel class of mycobacteriophage mobile elements (MPMEs) that are broadly distributed among mycobacteriophage genomes (primarily in Clusters F, G, and I) but absent from other phages and mycobacterial chromosomes (96). Two main subclasses (MPME1 and MPME2) share 79% nucleotide sequence identity with each other, although the copies within each subclass share 100% nucleotide identity (96). The MPMEs are atypically small (MPME1 is 439 bp and MPME2 is 440 bp) and generate unusual 6-bp insertions between target DNA and the left inverted repeat at the insertion site.
There is good evidence to support one additional transposon insertion. In Llij (Cluster F), gp83 is related to transposases of the IS200 family and shares 73% amino acid identity with a putative transposase from Nocardia farcinica. A comparison of the Cluster F genomes at the nucleotide sequence level reveals a discontinuity 96 bp upstream of the beginning of gene 83 (coordinate 48209) that likely defines the junction between the left end of a putative IS200 family element and the target. The right end is not easy to identify, and Cluster F sequence similarities are less well defined, with possible junctions either at coordinate 49751 or at coordinate 49831. The ends of other IS200 family elements form hairpin loop structures 16 bp and
RDF: recombination directionality factor
6 bp from the left and right end junctions, respectively, and a plausible structure is present to the left of Llij gene 83. A second structure corresponding to the right end is less clear, raising the possibility that this element may have undergone subsequent rearrangements and may no longer be mobile.
Mycobacteriophage genomes are devoid of any clearly identifiable introns, although there are several inteins located within a variety of genes, all of which have inteinless counterparts. Five of these are terminases (encoded by phages Bethlehem, Cjw1, Kostya, Omega, and Pipefish), but the Pipefish terminase is distinct from the others in that Pipefish has a circularly permuted genome rather than a cos-packaging genome (44). An intein is also present in three genes related to the Bxb1 recombination directionality factor (RDF) (gene 47), and a related intein is present in a putative nucleotidyltransferase gene in Cali (gene 3). The inteins represent highly divergent sequences, and the intein in Bethlehem gene 51 has recently been shown to represent a novel functional class (115).
MYCOBACTERIAL GENE FUNCTION AND EXPRESSION
Lysis
A lysis cassette was first described for mycobacteriophage Ms6 (30, 90) and was proposed to contain five genes (Orfs 1–5). Although the complete genome sequence for Ms6 is not available, approximately 5 kbp of a 6.2-kbp sequenced segment is closely related to Cluster F phages, with Fruitloop the nearest relative (98% nucleotide identity). Of the five open reading frames (ORFs) identified, three are implicated in lysis: lysin A (Orf 2), lysin B (Orf 3), and a holin (Orf 4). All the sequenced mycobacteriophages appear to encode an endolysin (lysin A), even though the endolysins are an unusual and complex group of protein sequences composed of a large number of modules assembled in multiple combinations. These modules contain many different peptidoglycan hydrolysis motifs including glycoside hydrolases, amidases, and peptidases, as well as peptidoglycan binding motifs. A direct role for these modules in lysis is demonstrated by the behavior of a lysin A-defective mutant of phage Giles, and in peptidoglycan hydrolysis by the endolysins of phages Corndog, Bxz1, and Che8 (82). The Ms6 lysin B has lipolytic activity (34), and a Giles lysin B-defective mutant forms small plaques and exhibits a lysis defect (82); the D29 lysin B protein is structurally related to cutinase-like enzymes and functions as a mycolylarabinogalactan esterase (82). Curiously, four mycobacteriophages (Che12, Rosebush, Qyrzula, and Myrna) lack a lysin B homolog yet do not exhibit small plaque morphotypes like the Giles lysB mutant. An intriguing possibility is that these phages have evolved a mechanism for utilizing a host-encoded cutinase-like enzyme for this function.
Integration and Prophage Maintenance
Thirty-six of the sequenced mycobacteriophages (Clusters A, E, F, G, I, and singletons Giles and Omega) harbor integration cassettes composed of an integrase gene, an attP attachment site, and an RDF. With the exception of Cluster A1 phages, and phages Bxz2 and Peaches (both in Cluster A2), which encode serine integrases, most of the integrases are of the tyrosine-recombinase family. In each genome encoding a tyrosine integrase, a putative attP site can be identified owing to a short 25- to 45-bp common core region shared between attP and the attB site in the host chromosome. Frequently, the attB core overlaps a host tRNA gene, and this is observed for all characterized mycobacteriophages. In phages L5, D29, Halo, Ms6, Tweety, and Giles the attB site has been confirmed experimentally (26, 65, 78, 85, 86, 96) and it has been predicted for Che8, Llij, PMC (which are closely related to Tweety), Che9d, Omega, Che9c, Cjw1, and 244 (86). Of the remaining phages, Fruitloop integrase (gp40) is closely related to that of Ms6 and presumably uses the same attB site; Boomer and Pacc40 integrases are similar to Tweety
integrase and its relatives. A putative attP site for Brujita has yet to be identified. The M. tuberculosis prophage-like element phiRv2 is integrated into a host tRNA-Val gene (49).
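Identifying a putative attP site through a 25- to 45-bp common core shared with attB is, at its simplest, a search for the longest stretch of identical sequence between a phage region and a host region. The sketch below does this with a plain dynamic-programming longest-common-substring search. It is an illustration only: the function name and the toy sequences (including the 26-bp core) are hypothetical, and a quadratic-time search like this would be applied to short candidate regions, not to whole genomes, where an alignment tool would normally be used instead.

```python
def longest_common_core(phage_seq: str, host_seq: str):
    """Return (core, phage_start, host_start) for the longest exact shared
    substring, with 1-based start coordinates; ("", 0, 0) if none is shared."""
    best_len = 0
    best_end_phage = 0
    best_end_host = 0
    previous = [0] * (len(host_seq) + 1)
    for i in range(1, len(phage_seq) + 1):
        current = [0] * (len(host_seq) + 1)
        for j in range(1, len(host_seq) + 1):
            if phage_seq[i - 1] == host_seq[j - 1]:
                # Extend the match that ended at the previous base of both.
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len = current[j]
                    best_end_phage = i
                    best_end_host = j
        previous = current
    if best_len == 0:
        return "", 0, 0
    core = phage_seq[best_end_phage - best_len:best_end_phage]
    return core, best_end_phage - best_len + 1, best_end_host - best_len + 1


# Hypothetical 26-bp core embedded in toy phage and host flanking sequence.
toy_core = "GCGTAGCTAGGATCCGATACGTTAGC"
toy_phage = "ATAT" + toy_core + "TTTT"
toy_host = "CCGGCC" + toy_core + "GGAA"

core, phage_start, host_start = longest_common_core(toy_phage, toy_host)
print(len(core), phage_start, host_start)
```

A shared stretch in the expected size range that also overlaps a host tRNA gene would then be a candidate for the attachment-site core described above.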
Integrase-mediated excisive recombination typically requires an RDF, and the best-characterized RDF for the mycobacteriophage-encoded tyrosine integrases is the 56-residue gp36 Xis of L5 (66). However, the RDF class of proteins is highly diverse (67) and only the fellow Cluster A2 phages D29, Che12, and Pukovnik encode closely related homologs. Ramsey (Subcluster F1) encodes a more distant relative (gp34), although its location adjacent to the Ramsey integrase strongly implicates it in recombination directionality control. The functions conferring directional control in the other Cluster F phages as well as those in Cluster E and Brujita (Cluster I) remain elusive. In the Cluster G phages a putative RDF with similarity to other Xis proteins is located near the integrase gene (Figure 2), and a similar situation is observed in singletons Giles and Omega (genes 30 and 84, respectively). Che9c (Cluster I) encodes a putative RDF that is related to Giles gp30 but is located over 6 kbp away from the integrase gene.
Identification of attP and associated attB sites of serine integrases is more complicated because the common core sequence can be as short as 2–3 bp (102). However, these systems are of interest because the attB sites do not overlap tRNA genes but are within host protein-coding genes and therefore have the capacity to influence host physiology through gene inactivation or modification. A good example of this is Bxb1, which integrates into the 3′ end of the groEL1 gene of M. smegmatis (61, 80). As a result, Bxb1 lysogens are unable to form normal mature biofilms, unveiling the role of the unusual GroEL1 chaperone in the regulation of mycolic acid biosynthesis (80). The other Subcluster A1 phages share closely related integrases (>95% amino acid identity) and likely integrate into the same site. The Bxz2 integrase is more distantly related (27% amino acid identity with Bxb1 integrase) and integrates into a different attB site within Msmeg_5156 (86). The phage Peaches also encodes a serine integrase that is most closely related to the Bxb1 integrase (59%) but whose integration site specificity remains undetermined. The M. tuberculosis phiRv1 prophage-like element encodes a serine integrase whose attB site is unusually located within a repetitive element that provides multiple potential integration sites (7). The partial sequence of the glycopeptidolipid biosynthesis gene cluster of M. avium strain A5 shows the presence of a related serine integrase (63) that may be part of a prophage in this strain.
There are two types of RDFs associated with the mycobacteriophage-encoded serine integrases. The phiRv2 RDF is related to Xis proteins that are otherwise associated with tyrosine integrases, although its mechanism of action remains poorly defined (6). The Bxb1 RDF is not related to other known RDF proteins and was identified as the product of gene 47 through use of a genetic screen (33). Biochemical characterization shows that it is not a DNA binding protein but interacts directly with integrase-DNA complexes to promote formation of excisive synaptic complexes (32, 33). Bxb1 gene 47 is curious in that it is conserved among mycobacteriophages encoding tyrosine integrases including L5, for which all the components required for integrative and excisive recombination are known (and do not include the L5 gp54 homolog of Bxb1 gp47) (68, 84). It presumably is involved in some function other than recombinational control, and its genomic location among DNA replication genes is consistent with a replication function. Furthermore, Bxb1 gp47 has sequence similarity to proteins of the PP2A class of phosphatases, raising the question whether phosphatase activity plays any role in recombination.
Several mycobacteriophages encode proteins containing putative nuclease domains of the ParBc superfamily, including Subclusters B3 and C1 and the singleton Corndog (Cluster E phages encode genes with similarity to these but which do not include ParBc domains). None of these genomes encodes an integration cassette, and it is plausible that these form lysogens in
which the genomes replicate extrachromosomally. However, none of these (or any of the 70) mycobacteriophages encodes ParA homologs, and their mode of prophage maintenance, if any, remains unclear.
Gene Expression and Its Regulation
Little is known about gene expression in most mycobacteriophages. Perhaps the best understood is phage L5, where an early leftward promoter (Pleft) is under the control of the phage repressor (gp71) [the closely related L1 phage has an identical repressor (99)]. The Pleft promoter is similar to E. coli sigma-70 promoters, and a binding site (operator) for gp71 overlaps the promoter (11, 79). The gp71 repressor recognizes a 13-bp asymmetric sequence that is present 30 times in the L5 genome, mostly in small intergenic intervals and in one orientation relative to the direction of transcription; gp71 binding has been demonstrated for 24 of these sites (11). It is proposed that a repressor bound to these "stoperator" sites prevents unwanted transcripts from extending into cytotoxic phage genes during lysogeny (11). Bxb1 encodes a related repressor protein (gp69) that also binds to multiple operator and/or stoperator sites throughout the genome, although the binding site has a different consensus sequence and the phages are heteroimmune (54).
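Because the repressor binding site is asymmetric, each occurrence in a genome has an orientation, and locating the sites means scanning both strands. The sketch below shows one simple way to do this for exact matches. The 13-bp motif and toy sequence are hypothetical (the actual gp71 recognition sequence is not reproduced here), the function names are illustrative, and real binding sites would normally be sought with a tolerance for mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(genome: str, motif: str):
    """Return sorted (1-based start, strand) pairs for exact motif matches.

    '+' means the motif reads left to right on the top strand; '-' means its
    reverse complement is found on the top strand (motif on the bottom strand).
    """
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # a palindromic motif would otherwise be counted twice
        patterns.append(("-", rc))
    sites = []
    for strand, pattern in patterns:
        start = genome.find(pattern)
        while start != -1:
            sites.append((start + 1, strand))
            start = genome.find(pattern, start + 1)  # allow overlapping hits
    return sorted(sites)


# Hypothetical asymmetric 13-bp motif placed once on each strand of a toy sequence.
toy_motif = "GATTACAGGTCCA"
toy_genome = "TT" + toy_motif + "CCCC" + reverse_complement(toy_motif) + "AA"

for start, strand in find_sites(toy_genome, toy_motif):
    print(start, strand)
```

Comparing each site's strand with the direction of transcription of the neighboring genes would then give the orientation bias described for the L5 sites.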
All the other Cluster A1 phages encode repressors that share at least 98% amino acid identity with Bxb1 gp69 and presumably are homoimmune. A rightward promoter, which is also repressor regulated, occurs at the right end of the Bxb1 genome (54). Multiple promoters for repressor synthesis in L1, L5, and Bxb1 are presumably required for establishment and maintenance of lysogeny (14, 54, 79), although their specific roles remain unclear. Transcriptional promoters for Ms6 lysis genes (30) are located 214 bp upstream of the first of the genes in that region, Orf 1; however, it is not clear whether this is a general feature of phages that share closely related lysis genes (in Cluster F), as the extent of sequence similarity ends approximately 60 bp upstream of Ms6 Orf 1. No late promoters for any mycobacteriophages have been identified, even though protein expression patterns suggest that these may be among the most active of all mycobacterial expression systems (25, 46). A mutant defective in late synthesis of phage L1 has been reported, but the specific genes involved are not known (21).
Other Mycobacteriophage Gene Functions
Several mycobacteriophage genes involved in DNA metabolism have been cloned and characterized. Phage L5 encodes both a thymidylate synthase (ThyX) and a ribonucleotide reductase (RNR) (gp48 and gp50, respectively), and they are expressed early in lytic growth and appear to function as a complex (5). A mutant defective in early gene expression influences expression of a proposed phage nuclease (20). Giri et al. (35) characterized an early nuclease encoded by gene 65 of phage D29 and showed that it is a structure-specific nuclease with a preference for forked structures.
Initial studies of mycobacteriophage L5 identified at least two segments of the phage L5 genome that are not well tolerated in M. smegmatis and presumably encode cytotoxic proteins. Further analysis identified three cytotoxic proteins encoded by L5 genes 77, 78, and 79 (95) that prevent growth of M. smegmatis when expressed and presumably interrupt specific cellular processes, although these proteins remain ill-defined. We predict that the broader mycobacteriophage collection encodes numerous additional cytotoxic proteins with considerable potential for development of antituberculosis drugs as proposed for Staphylococcus phages (71).
MYCOBACTERIOPHAGE GENETIC MANIPULATION
As noted above, shuttle phasmids have been invaluable tools for constructing recombinant mycobacteriophages and for using them to deliver transposons, allelic exchange substrates, and reporter genes (1, 2, 51). However, with
MYCOBACTERIAL RECOMBINEERING
M. tuberculosis is unusual among bacteria in that when linear DNA substrates are introduced by electroporation there is a high propensity for illegitimate recombination (58). Mycobacteriophages have provided two useful strategies for constructing gene replacement mutants. First, mycobacteriophage shuttle phasmids can be used to introduce allelic exchange substrates by infection, and after selection a high proportion of the progeny are the result of homologous replacement (1). Second, a mycobacterial-specific recombineering system has been developed in which RecET-like proteins encoded by mycobacteriophage Che9c are expressed to confer high levels of recombination (118). Introduction of dsDNA or single-stranded DNA substrates into recombineering strains of M. smegmatis or M. tuberculosis provides an efficient means of generating gene replacement mutants and point mutations (118, 119). Single-stranded DNA recombineering is particularly attractive for generating isogenic strains with defined point mutations, and it can be applied to determining the contributions of single base substitutions to the drug-resistant phenotypes of multiple-drug-resistant tuberculosis and extensively drug-resistant tuberculosis clinical strains.
BRED: bacteriophage recombineering of electroporated DNA
an average mycobacteriophage genome length of over 70 kbp and packaging constraints of 50 kbp in lambda particles, many mycobacteriophages are not amenable to this technology.
Bacteriophage recombineering of electroporated DNA (BRED) provides a technique for direct genetic manipulation of mycobacteriophages that takes advantage of a mycobacteria-specific recombineering system (73, 118). This recombineering approach is based on the use of the RecET-like recombination system encoded by phage Che9c, such that expression of genes 60 and 61 generates high levels of recombination in both M. smegmatis and M. tuberculosis (118, 119). In the BRED application, recombineering-proficient cells are co-electroporated with two DNA substrates; one is genomic DNA of the phage to be manipulated and the other is a short (typically 200 bp) substrate that contains the desired mutation (73). For example, a defined gene deletion can be constructed by creating a 200-bp substrate containing 100 bp homologous to each of the upstream and downstream regions (120). The mutation can be designed to minimize genetic polarity, and because recombination is efficient, there is no need to include a selectable marker or identification tag.
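The deletion-substrate layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `deletion_substrate`, the 0-based end-exclusive coordinates, and the toy genome string are hypothetical and are not taken from the cited protocols (73, 120). The 100-bp arm length follows the example in the text.

```python
def deletion_substrate(genome: str, del_start: int, del_end: int, arm: int = 100) -> str:
    """Join the homology arms that flank a planned deletion.

    Hypothetical helper: genome is the phage sequence as a string, and the
    deleted interval is genome[del_start:del_end] (0-based, end-exclusive).
    Returns the upstream arm followed directly by the downstream arm, so a
    100-bp arm on each side gives the 200-bp substrate described in the text.
    """
    if not (0 <= del_start < del_end <= len(genome)):
        raise ValueError("deletion coordinates must lie within the genome")
    if del_start < arm or len(genome) - del_end < arm:
        raise ValueError("not enough flanking sequence for the requested arms")
    upstream = genome[del_start - arm:del_start]   # arm bases ending at the deletion
    downstream = genome[del_end:del_end + arm]     # arm bases starting after it
    return upstream + downstream


if __name__ == "__main__":
    # Toy sequence (assumption): 150 A, a 30-bp region to delete, then 150 G.
    toy_genome = "A" * 150 + "C" * 30 + "G" * 150
    substrate = deletion_substrate(toy_genome, 150, 180)
    print(len(substrate))
```

Because the substrate contains only the two arms, recombination with the co-electroporated genome removes the intervening interval without adding a marker, which matches the marker-free design noted above.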
Following co-electroporation, plaques are recovered by plating onto lawns of a permissive bacterial host (M. smegmatis) in an infectious center configuration, i.e., by plating prior to phage replication and lysis. Each plaque therefore derives from a single cell that has taken up phage genomic DNA. Screening of 12–18 plaques by PCR typically identifies at least one plaque that is mixed, containing both wild-type and mutant alleles. Importantly, this is typically observed whether or not the gene is essential for phage growth, because if the gene is essential, then the presence of wild-type helper phage supports mutant growth in the mixed plaque. Replating for isolated plaques and screening by PCR usually identify a homogeneous viable mutant (73). If a mutant is not viable, then it can be recovered with a complementing strain in which the essential gene is expressed from a plasmid (73). Because no selection is required, BRED can be used to make virtually any recombinant that is desired, including defined nonpolar deletions, insertions, point mutations, and addition of gene tags (73). BRED appears to be broadly applicable to mycobacteriophage genetic manipulation provided that plaques can be recovered by electroporation of phage genomic DNA. The BRED technology thus circumvents a major hurdle in mycobacteriophage manipulation, providing facile genetic approaches for addressing a multitude of questions in mycobacteriophage biology.
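The screening step above can be related to recombination frequency with a simple calculation. The sketch below assumes that each primary plaque is mixed independently with the same probability; that probability (`mixed_fraction`) is a hypothetical input, since the text reports only that screening 12–18 plaques typically finds at least one mixed plaque. The function name is likewise illustrative.

```python
def prob_at_least_one_mixed(n_plaques: int, mixed_fraction: float) -> float:
    """Probability that at least one of n screened plaques is mixed.

    Assumes independent plaques, each mixed with probability mixed_fraction
    (a hypothetical value, not a figure reported in the text).
    """
    if n_plaques < 0:
        raise ValueError("n_plaques must be non-negative")
    if not 0.0 <= mixed_fraction <= 1.0:
        raise ValueError("mixed_fraction must be between 0 and 1")
    # Complement of "none of the n plaques is mixed".
    return 1.0 - (1.0 - mixed_fraction) ** n_plaques


if __name__ == "__main__":
    # Compare the two ends of the 12-18 plaque range for an assumed 10% frequency.
    for n in (12, 18):
        print(n, round(prob_at_least_one_mixed(n, 0.10), 3))
```

Under this model, a larger screen is needed as the assumed frequency of mixed plaques falls, which is one way to decide how many plaques to test by PCR.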
SUMMARY
In conclusion, mycobacteriophage genomics reveals that the population is highly diverse and that substantial parts of it remain unexplored. Five of the 70 sequenced genomes have no close relatives, prophages emerging from mycobacterial genome sequencing projects are not closely related to known phages, and the diversity
of the CRISPR spacers in M. tuberculosis and M. avium genomes suggests there are many genomes yet to be discovered. With new technologies for global expression analyses and mycobacteriophage functional genomics, a new chapter in postgenomic mycobacteriophage biology is anticipated with considerable excitement.
SUMMARY POINTS
1. Mycobacteriophage genomes are genetically highly diverse.
2. Mycobacteriophages can be grouped into clusters according to their sequence relationships.
3. Mycobacteriophage genomes are architecturally mosaic.
4. Approximately 80% of mycobacteriophage gene phamilies are of unknown function.
5. Mycobacteriophages are sources of genetic novelty, including new classes of inteins and mobile elements.
6. BRED recombineering provides a facile and general means for constructing recombinant and mutant forms of mycobacteriophages.
7. Mycobacteriophages are rich resources for mycobacterial genetics.
FUTURE ISSUES
1. Newly discovered mycobacteriophages isolated on a variety of different mycobacterial strains are needed to fully understand mycobacteriophage genetic diversity.
2. The potential for generating new tools for mycobacterial genetics and gaining insights into mycobacterial physiology is great, and many advances await development.
3. Elucidating the function of mycobacteriophage genes will provide a fuller understanding of their biology and their evolution.
DISCLOSURE STATEMENT
The author is not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.
ACKNOWLEDGMENTS
I thank my colleagues and collaborators, Roger Hendrix, Jeffrey Lawrence, Craig Peebles, Steve Cresawn, Bill Jacobs, Debbie Jacobs-Sera, and the numerous members of the Hatfull laboratory who have participated in mycobacteriophage isolation, sequencing, and analysis.
LITERATURE CITED
1. Bardarov S, Bardarov S Jr, Pavelka MS Jr, Sambandamurthy V, Larsen M, et al. 2002. Specialized transduction: an efficient method for generating marked and unmarked targeted gene disruptions in Mycobacterium tuberculosis, M. bovis BCG and M. smegmatis. Microbiology 148:3007–17
2. Bardarov S, Kriakov J, Carriere C, Yu S, Vaamonde C, et al. 1997. Conditionally replicating mycobacteriophages: a system for transposon delivery to Mycobacterium tuberculosis. Proc. Natl. Acad. Sci. USA 94:10961–66
3. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, et al. 2007. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315:1709–12
4. Barsom EK, Hatfull GF. 1996. Characterization of Mycobacterium smegmatis gene that confers resistance to phages L5 and D29 when overexpressed. Mol. Microbiol. 21:159–70
5. Bhattacharya B, Giri N, Mitra M, Gupta