Casey M. Bergman
Faculty of Life Sciences
University of Manchester
[email protected]
http://bioinf.manchester.ac.uk/bergman
Computational analysis of transposable element evolution in Drosophila genomes.
Overview of Talk
• Demographic history of TEs in D. melanogaster
• TE population genomics using 454 sequencing
• Discovery and detection of TEs in genomes
• Abundance and distribution of TEs in Drosophila genomes
• Noncoding DNA (ncDNA) & transposable elements (TEs)
Higher organisms have a higher proportion of ncDNA
• Bacteria 15%
• Yeast 30%
• Fly 75%
• Human 98%
The function of most noncoding DNA is unknown & unannotated
[Figure: annotated genomic region. Legend: box = exon. Gene labels: Mef2, CG15863, CG12130, CG1418, CG12133, Adam, CG12134, eve, TER94, Pka-R2, CG12128. Enhancer labels (as extracted): AR3/72, APRCQ4/6, mes, 15RP2. Transposable element and repeat labels: BS, 1360, (A)n]
Goal: comprehensive functional annotation of noncoding sequences in Drosophila
3 major types of transposable element (TE)
DNA transposons (cut+paste)
• Terminal Inverted Repeat (TIR)
RNA retrotransposons (copy+paste)
• LINE-like (non-LTR), ending in (A)n
• Long Terminal Repeat (LTR)
Why is the discovery and detection of TE sequences in genomes important?
• Genome alignment
• Genome evolution
• Population genomics
• Genome organization
• Genome assembly
Discovery of new TE families
• Homology to TE proteins (e.g. transposase) => HMMer, tBLASTx
• Structural motifs (e.g. LTRs) => LTRstruc, LTRharvest
[Figure: LTR retrotransposon structure: 5’ TSD, 5’ LTR, PBS, gag, pol, env, PPT, 3’ LTR, 3’ TSD; LTR boundaries marked b5, e5 (5’ LTR) and b3, e3 (3’ LTR)]
Dmin ≤ (b3 – b5) ≤ Dmax
Lmin ≤ (e5 – b5), (e3 – b3) ≤ Lmax
• Comparative genomics => compTE
• Dispersed repeats (all-by-all, k-mers) => RECON, PILER, RepeatScout, ReAS
Reviewed in: Bergman & Quesneville (2007) Brief Bioinf. 8:382-92
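The distance and length constraints on LTR pairs shown above can be sketched as a simple filter. This is only an illustration of the two inequalities, not the implementation used by LTRharvest or LTRstruc; the `LtrCandidate` type, the `satisfies_constraints` function and the numeric bounds are hypothetical, and lengths are computed as plain coordinate differences, as the slide writes them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LtrCandidate:
    """Boundaries of a candidate LTR pair (hypothetical container).

    b5/e5: begin/end of the 5' LTR; b3/e3: begin/end of the 3' LTR.
    """
    b5: int
    e5: int
    b3: int
    e3: int


def satisfies_constraints(c, d_min, d_max, l_min, l_max):
    """Apply the two constraints from the slide to one candidate.

    Dmin <= (b3 - b5) <= Dmax              distance between LTR starts
    Lmin <= (e5 - b5), (e3 - b3) <= Lmax   length of each LTR
    """
    distance_ok = d_min <= (c.b3 - c.b5) <= d_max
    len5_ok = l_min <= (c.e5 - c.b5) <= l_max
    len3_ok = l_min <= (c.e3 - c.b3) <= l_max
    return distance_ok and len5_ok and len3_ok


# Illustrative bounds only; they are not taken from the talk.
BOUNDS = dict(d_min=1000, d_max=15000, l_min=100, l_max=1000)

plausible = LtrCandidate(b5=5000, e5=5300, b3=10000, e3=10310)
too_close = LtrCandidate(b5=5000, e5=5300, b3=5400, e3=5700)

print(satisfies_constraints(plausible, **BOUNDS))  # True
print(satisfies_constraints(too_close, **BOUNDS))  # False
```

A real structural search would also score LTR similarity and look for TSDs, PBS and PPT motifs; the filter above covers only the coordinate constraints.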
Detection of individual TE copies
[Figure: TE annotation pipeline. Labels: hmm, all-by-all, RECON, BLASTER, RepeatMasker, TBLASTX, RMBLR; Release 3, Release 4]
Quesneville, Bergman et al. (2005) PLoS Comp. Biol. 1:e22
Genomic TE distribution in D. melanogaster
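[Figure: number of TEs per 50 kb (y-axis, 10 to 50) along chromosome X (x-axis ticks 5 to 20). Annotations: ~3% and ~20% TE content, genome-wide average ~5.5%; markers at ~ centromere and ~ high-low rec. transition]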
Bergman, Quesneville et al. (2006) Genome Biology 7:R112
Drosophila 12 genomes project
Clark, Eisen, Smith, Bergman, et al (2007) Nature 450:203-18
TE abundance in 12 Drosophila genomes
[Figure: proportion TE/repeat in scaffolds > 200 Kb (y-axis, 0 to 0.35) for D.mel, D.sim, D.sec, D.yak, D.ere, D.ana, D.pse, D.per, D.wil, D.vir, D.moj, D.gri. Methods compared: BLASTER-tx+Repbase-NoDros, BLASTER-tx+BDGP, BLASTER-tx+PILER, RepeatMasker+ReAS, RepeatRunner+PILER, CompTE]
Clark, Eisen, Smith, Bergman, et al (2007) Nature 450:203-18
Abundance of major TE types is conserved across genus Drosophila
[Figure: proportion of each TE class (LTR, LINE, TIR, OTHER) in scaffolds > 200 Kb (y-axis, 0 to 0.8) for D.mel, D.sim, D.sec, D.yak, D.ere, D.ana, D.pse, D.per, D.wil, D.vir, D.moj, D.gri. Percentage labels on the figure, in extraction order (mapping to species not recoverable): 5.3%, 2.7%, 3.7%, 12.0%, 24.9%, 6.9%, 15.6%, 2.8%, 2.7%, 8.5%, 13.9%, 8.9%. Class annotations: non-LTR, LTR, TIR]
Clark, Eisen, Smith, Bergman, et al (2007) Nature 450:203-18
Is the genomic distribution of TEs in D. melanogaster affected by historical activity?
A brief introduction to transposable element (TE) evolution: the current paradigm
• TEs are mobile DNA sequences, intra-genomic parasites
• Transposition rates >> excision rates
• Equilibrium maintained by transposition-selection balance
• Mode of natural selection is debated
- deleterious effects of transposition
- deleterious effects of TE insertion
- deleterious effects of TE-mediated ectopic recombination
✴ TE insertions observed at low frequency in nature
Estimating the age of ‘pseudogene-like’ retrotransposons
Petrov & Hartl (1998) Mol. Biol. Evol. 15:293-302
Alignment of paralogous TEs
Most retrotransposon families exhibit a ‘pseudogene-like’ mode of sequence evolution

Set          # families  # elements  Total bp surveyed  1st    2nd    3rd    Total point sub.  P (Ho)    Ψ
All non-LTR  19          377         836,819            1,515  1,424  1,917  4,884             3.56E-24  N
Ψ non-LTR    10          158         336,748            791    746    781    2,341             0.192     Y
All LTR      27          385         1,973,013          677    603    1,120  2,420             2.18E-44  N
Ψ LTR        17          279         1,491,867          272    267    307    851               0.159     Y
Grand Total  46          762         2,809,832          2,192  2,027  3,037  7,304             5.18E-61  N
Total Ψ      27          437         1,828,615          1,063  1,013  1,088  3,192             0.06      Y

59% of retrotransposon families exhibit a pseudogene-like mode of evolution on terminal branches
Retrotransposon demographics in D. melanogaster
[Figure: terminal branch divergence (sub/site, 0.00 to 0.12) with corresponding age (Mya, 0 to 10.81) per retrotransposon family, with the D. mel - D. sim speciation marked. Family labels as shown on the figure: invader2_6, micropia_4, Tabor_3, 17.6_11, Stalker_4, rover_3, flea_16, copia_28, mdg3_10, roo_86, Transpac_4, opus_16, blood_22, 412_24, Burdock_13, diver_9, Tirant_20, jockey2_7, Helena_7, Cr1a_36, baggins_6, G4_10, Doc3_7, G5_8, BS_15, Juan_9, Doc_53]
Bergman & Bensasson (2007) PNAS 104:11340-5
Intra-element LTR-LTR comparisons support age estimates based on terminal branch length
5’ LTR ACGTAGCTAGGCTGACGTGGACTGTAC
       ||||||||||| ||||||| |||||||
3’ LTR ACGTAGCTAGGGTGACGTGCACTGTAC
T = D/(2*r)
T - absolute time
D - 5’ vs. 3’ LTR divergence
r - neutral substitution rate (0.0111/my)
see also Bowen and McDonald (2001) Genome Res 11:1527-1540
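As a worked instance of T = D/(2*r), the sketch below applies the formula to the toy 5’/3’ LTR alignment from the slide. It assumes D is the raw proportion of mismatched sites with no correction for multiple hits; the function names are hypothetical, and a 27 bp toy alignment is far too short for a real age estimate.

```python
NEUTRAL_RATE = 0.0111  # substitutions per site per My, as given on the slide


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def ltr_age_my(divergence, rate=NEUTRAL_RATE):
    """T = D / (2 * r): both LTRs accumulate substitutions after insertion."""
    return divergence / (2 * rate)


ltr5 = "ACGTAGCTAGGCTGACGTGGACTGTAC"
ltr3 = "ACGTAGCTAGGGTGACGTGCACTGTAC"

d = p_distance(ltr5, ltr3)        # 2 mismatches over 27 sites
print(f"D = {d:.4f} sub/site")    # D = 0.0741 sub/site
print(f"T = {ltr_age_my(d):.2f} My")  # T = 3.34 My
```

The factor of 2 reflects that the two LTRs are identical at insertion and then diverge independently, so the observed divergence accumulates along both copies.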
[Figure: intra-element LTR-LTR divergence (sub/site, 0.00 to 0.12) with corresponding age (Mya, 0 to 10.81) for LTR families invader2_6, micropia_4, Tabor_3, 17.6_11, Stalker_4, rover_3, flea_16, copia_28, mdg3_10, roo_86, Transpac_4, opus_16, blood_22, 412_24, Burdock_13, diver_9, Tirant_20]
Intra-element LTR-LTR age estimates correlate with terminal branch age estimates
[Figure: scatter plot of sqrt(LTR-LTR divergence) (y-axis) against sqrt(branch length) (x-axis)]
Spearman’s Rank Correlation Test: p = 0.006531
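For readers unfamiliar with the test, a rank correlation of this kind can be sketched in a few lines. The input values below are made-up toy numbers, not data from the talk, the function names are hypothetical, and the sketch computes only the coefficient rho, not the p-value reported above.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values (sub/site), for illustration only.
branch_length = [0.001, 0.004, 0.002, 0.010, 0.007]
ltr_divergence = [0.000, 0.006, 0.003, 0.012, 0.005]

rho = spearman_rho(branch_length, ltr_divergence)
rho_sqrt = spearman_rho([math.sqrt(v) for v in branch_length],
                        [math.sqrt(v) for v in ltr_divergence])
print(round(rho, 3), round(rho_sqrt, 3))  # 0.9 0.9
```

Because the test uses ranks only, the square-root transform used for plotting leaves rho unchanged.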
Recent demographic history of D. melanogaster
Li & Stephan (2006) PLoS Genet. 2:e166
[Figure: demographic model with time points at 15,800 ya and 60,000 ya]
Current paradigm is based on LTR families
Maside et al. (2000) Genet. Res 75:275-284
Current paradigm assumes transposition-selection equilibrium
Carr et al. (2002) Chromosoma 110:511-518
Current paradigm interprets low TE frequency as evidence for purifying selection
Aquadro et al. (1986) Genetics 114:1165-1190
Summary of retrotransposon demographics inferred from intra-genomic comparisons
• LTR elements systematically younger than non-LTR elements
=> Low frequency of LTR insertions may not be due to selection
• non-LTR families inserted in waves since speciation
• most LTR families inserted since colonization of non-African habitats
=> LTR insertions not at transposition-selection equilibrium
• LTR elements evolve under pseudogene-like mode like non-LTR elements
From evolutionary statics to dynamics: population genomics of TEs using 454 sequencing
Population genomics of TEs using 454 sequencing
[Figure: 454 reads from Strain X compared with TEs in the reference. Hybrid TE-unique reads (“Unique Flank Tags”) mark an insertion as KNOWN (✓, TE present in the reference) or NEW (!)]
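The KNOWN/NEW distinction can be illustrated with a small sketch of the final comparison step only. It assumes each hybrid read has already been reduced to the reference coordinate where its unique flank meets TE sequence; the function name, the interval representation and the `tolerance` slack are hypothetical and are not taken from the talk.

```python
from bisect import bisect_right


def classify_flank_tags(tag_positions, reference_tes, tolerance=50):
    """Label each flank-tag position as 'KNOWN' or 'NEW'.

    tag_positions: reference coordinates of inferred insertion sites
    reference_tes: sorted, non-overlapping (start, end) intervals of
                   TEs annotated in the reference sequence
    tolerance:     hypothetical slack (bp) around annotated boundaries
    """
    starts = [start for start, _ in reference_tes]
    labels = []
    for pos in tag_positions:
        # Last annotated TE whose start lies at or before pos + tolerance.
        i = bisect_right(starts, pos + tolerance) - 1
        known = i >= 0 and pos <= reference_tes[i][1] + tolerance
        labels.append("KNOWN" if known else "NEW")
    return labels


# Toy coordinates for illustration only.
annotated = [(1000, 1500), (8000, 9200)]
tags = [990, 4000, 9230, 9300]
print(classify_flank_tags(tags, annotated))
# ['KNOWN', 'NEW', 'KNOWN', 'NEW']
```

A real analysis would also need read mapping, TE family assignment and handling of repeats in the flanks; none of that is modelled here.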
[Figure: examples shown against the track “TEs in reference sequence”]
• Known INE-1 insertion present in NC and AF strains
• Novel jockey insertion present in >1 strain
Preliminary findings using 454 sequencing
• 10 strains of D. melanogaster: 6 USA, 4 Malawi
• ~1/3 of INE-1 found in nature, so estimate ~3900 annotated TEs present in >=1 wild strain
• ~24% of annotated TEs found in >=1 wild strain (1300/5400)
• ~72% found in nature in low recomb. regions (950/1300)
• DNA transposons (~30%) found in nature more often than LTR/non-LTR retrotransposons (~10%)
• consistent with all TEs fixed in low recomb. regions plus few hundred segregating in high recomb. regions
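The proportions quoted in these findings can be checked from the rounded counts on the slide. The sketch below is arithmetic only; reading the ~3900 estimate as the observed count divided by the ~1/3 INE-1 recovery rate is an assumption made here, not something the slide states explicitly, and the variable names are hypothetical.

```python
annotated_tes = 5400     # annotated TEs in the reference (rounded, from the slide)
found_in_wild = 1300     # annotated TEs found in >=1 wild strain
found_low_rec = 950      # of those, located in low recombination regions
ine1_recovery = 1 / 3    # fraction of INE-1 copies found in nature

pct_found = 100 * found_in_wild / annotated_tes
pct_low_rec = 100 * found_low_rec / found_in_wild

# Assumption: the INE-1 recovery rate is used as a detection-sensitivity
# correction, scaling the observed count up to an expected total.
corrected_estimate = found_in_wild / ine1_recovery

print(f"found in >=1 wild strain: {pct_found:.1f}%")
print(f"of those, in low recomb. regions: {pct_low_rec:.1f}%")
print(f"sensitivity-corrected estimate: {corrected_estimate:.0f}")
```

With these rounded counts the low-recombination share comes out slightly above the ~72% on the slide, which presumably reflects unrounded counts.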
Summary
• Mature methods exist for analysis of TEs in genomes
http://www.bioinf.manchester.ac.uk/bergman/te-tools.html
• Structural classes of TEs have different genome dynamics
• Recent LTR insertion has implications for transposition-selection balance paradigm
• Population genomics using next generation sequencing will help resolve forces controlling TE evolution
Hadi Quesneville
Ruqiang Li
Douda Bensasson
Andy Clark
Post-doctoral & PhD positions open
ento’09
RES Annual National Meeting
RES Symposium on insect infection and immunity: Evolution, Ecology and Mechanisms
University of Sheffield • 15-17 July 2009
Specialist topics to include:
• Immunity
• Comparative genomics
• Reproduction
• Range expansion/climate change
• Insect Evo-devo
• General Entomology
• Insect chemoreception
Speakers include:
• Professor Fotis Kafatos, Imperial College, London, UK
• Professor Paul Schmid-Hempel, ETH Zurich, Switzerland
• Professor Shelley Adamo, Dalhousie University, Canada
• Professor Bruno Lemaitre, EPF Lausanne, Switzerland
SYMPOSIUM CONVENORS:
• Professor Stuart Reynolds, University of Bath, [email protected]
• Jens Rolff, University of Sheffield, [email protected]
NATIONAL MEETING CONVENORS:
• Professor Roger Butlin, University of Sheffield, [email protected]
• Mike Siva-Jothy, University of Sheffield, [email protected]
• Klaus Reinhardt, University of Sheffield, [email protected]
Further information, registration, abstract and accommodation booking forms available on www.royensoc.co.uk
Tel: +44 (0)1727 899387 Fax: +44 (0)1727 894797
E-mail: [email protected]
PHOTO CREDITS: PAUL DEAN, STAINED CELL SHOTS; RICHARD NAYLOR, BED BUG.