Sequence-independent identification of active LTR retrotransposons in Arabidopsis
Supplementary Information
Materials and Methods
Supplementary Figures 1-7
Supplementary references
Authors
Jayne Griffiths, Marco Catoni, Mayumi Iwasaki2 and Jerzy Paszkowski1
1 To whom correspondence should be addressed: [email protected], tel: +44(0)1223761159, 2 Present address: University of Geneva, Geneva, Switzerland
Materials and Methods
Plants and Growth conditions
The wild-type background was Col-0 or Ler, as specified. Thirteenth-generation met1-1 plants were
also used in this study. The isolation of ddm1-2 was described previously (Vongs et al., 1993).
met1-1/Ler was kindly provided by Dr. Eric Richards. The met1-1 mutation was introgressed into Ler and
backcrossed four times, and the progeny used in this study were genotyped and confirmed to be
homozygous for the met1-1 mutation. Seedlings were grown on ½ MS medium (1% sucrose and 0.8%
agar, pH 5.7) for two weeks at 20°C under long-day (LD) conditions (16 h light/8 h dark). Pools of
plants were then harvested into liquid nitrogen. Individual seedlings or flowering plants were grown
on soil (Levington F2) at 20°C under long-day conditions (16h light/8h dark). Leaves were harvested
from individual plants at 2-4 weeks, depending on plant size.
Bioinformatics
We selected all LTR-TEs belonging to the superfamilies LTR/Copia and LTR/Gypsy in the TAIR10
transposon annotation. We then searched the LTR-TE DNA sequences for a match with the last 12 bp
of the Met-iCAT tRNA (GCTCTGATACCA). Within the selected group of LTR-TEs, we filtered for transposons
with at least one match to each of the different primers designed. Each primer consists of the last 12 bp of
the Met-iCAT tRNA (GCTCTGATACCA), followed by 0 to 3 nucleotides of any identity and the TG
dinucleotide (complementary to the last two base pairs of the LTR sequence). The pattern matching
was performed using the R “grep” function with the pattern “GCTCTGATACCA[ATCG][n]TG”, where
“n” was replaced by the number of nucleotides considered.
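
The pattern matching described above can be sketched as follows. This is an illustration in Python rather than the R “grep” call used in the study, and it writes the spacer with the standard repetition quantifier {n}, which is our reading of the “[n]” placeholder in the pattern. The function names and the toy sequences are hypothetical and are not taken from the TAIR10 annotation.

```python
import re

# Last 12 bp of the Met-iCAT tRNA, as given in the text.
PBS_ANCHOR = "GCTCTGATACCA"


def anchor_patterns(max_spacer=3):
    """Build one regex per spacer length n (0..max_spacer): anchor + n bases + TG."""
    return {
        n: re.compile(f"{PBS_ANCHOR}[ATCG]{{{n}}}TG")
        for n in range(max_spacer + 1)
    }


def matching_spacers(sequence, patterns=None):
    """Return the spacer lengths whose pattern occurs anywhere in the sequence."""
    patterns = patterns or anchor_patterns()
    seq = sequence.upper()
    return [n for n, pattern in patterns.items() if pattern.search(seq)]


# Toy sequences (hypothetical): a 0-nt spacer, a 2-nt spacer, and a mutated anchor.
toy_sequences = {
    "te_a": "AAAGCTCTGATACCATGCC",
    "te_b": "AAAGCTCTGATACCAATTGCC",
    "te_c": "AAAGCTCTGATACGATGCC",
}

hits = {name: matching_spacers(seq) for name, seq in toy_sequences.items()}
print(hits)
```

Only the strand supplied is searched; whether the annotated sequence or its reverse complement should be scanned is not specified here and would need to follow the orientation used in the annotation.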
BLAST searches against the NCBI database were performed on the NCBI website (https://blast.ncbi.nlm.nih.gov),
applying default parameters. The BLAST search against the Ler genome was done with the BLAST 2.2.25+
program with -evalue 1e-5, using as the sequence database the Ler-0 assembly from the Pacific
Biosciences website (http://www.pacb.com/uncategorized/new-data-release-arabidopsis-assembly).
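
As a rough illustration of the local search, the sketch below assembles BLAST+ command lines without running them. Only the -evalue 1e-5 setting comes from the text. The choice of blastn, the tabular output format and every file and database name are assumptions made for the example.

```python
import shlex


def makeblastdb_command(assembly_fasta, db_name):
    """Command to index a genome assembly as a nucleotide BLAST database."""
    return ["makeblastdb", "-in", assembly_fasta, "-dbtype", "nucl", "-out", db_name]


def blastn_command(query_fasta, db_name, evalue="1e-5", out_path="hits.tsv"):
    """Command for a nucleotide search; blastn is assumed, the text names only BLAST 2.2.25+."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", db_name,
        "-evalue", evalue,   # threshold stated in the text
        "-outfmt", "6",      # tabular output, chosen here for convenience
        "-out", out_path,
    ]


# Hypothetical file names, for illustration only.
db_cmd = makeblastdb_command("ler0_assembly.fasta", "ler0_assembly")
search_cmd = blastn_command("ltr_queries.fasta", "ler0_assembly")

for cmd in (db_cmd, search_cmd):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed lines can be pasted into a shell where BLAST+ is installed, or passed to subprocess.run as lists.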
Sequence-Independent Retrotransposon Trapping (SIRT)
Adaptors were modified from the GenomeWalking adaptors (Clontech) to prevent primer dimers
between the PBS primer and the original GenomeWalking adaptor/primers. SIRT adaptors were
prepared to 50 µM; SIRT_adaptor_1 (5’-GTAATACGACTCACTATAGGGCACGCGTCCACGACGGCCCGGGCTCCA-3’)
and SIRT_adaptor_2 (5’-Phos-TGGAGCCC-3’) were heated at 95°C for 10 min and allowed to cool to
room temperature. Genomic DNA was extracted from seedling tissue using the standard CTAB method;
200 ng DNA was ligated to
the SIRT adaptors (200 ng DNA, 2 µl 10x ligation buffer, 8 µl 50 µM SIRT adaptor, 1.5 units T4 ligase
(Promega), H2O to 20 µl) over 24 h at 4°C, with an additional 1.5 units T4 ligase added after 9 h. The
ligations were cleaned using a QIAGEN PCR Clean-up Kit and eluted in 50 µl H2O. PCR was performed
using the adaptor specific AP1 (5’-GTAATACGACTCACTATAGGGC-3’) and one of the anchor primers
(12.5 µl 2x PCRBIO Ultra Mix (PCR Biosystems), 1-2 µl SIRT-ligated DNA, 0.5 µl AP1, 0.5 µl Anchor(n)
and H2O to 25 µl). The PCR program consisted of 95°C for 2 min, then 40 cycles of 95°C for 10 s, 51°C for 10 s and 72°C for 15 s.
PCR products were run on a 1.5% gel at 80 V for 90 min.
Bands were extracted from the gel using a QIAGEN Gel Extraction Kit and eluted in 30 µl H2O. The
purified PCR products were ligated into pGEM-T. For ligation, 1.8 µl of the product was mixed with
2.5 µl 2x ligation buffer, 10 ng pGEM-T vector and 1.5 units of T4 DNA ligase (Promega). The mixture was
incubated at 4°C overnight and 2 µl was transformed into E. coli DH5α cells (50 µl). Random white
colonies were chosen and sequenced with either the adaptor-specific AP2 primer
(5’-ATAGGGCACGCGTCCACGAC-3’) or a standard M13R primer.
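
The relationship between the SIRT adaptor strands and the adaptor-specific primers can be checked directly from the sequences listed above. The short sketch below (variable and function names are ours) tests whether SIRT_adaptor_2 is the reverse complement of the 3’ end of SIRT_adaptor_1, and where AP1 and AP2 lie on the long strand.

```python
# Sequences copied from the text (5'->3').
SIRT_ADAPTOR_1 = "GTAATACGACTCACTATAGGGCACGCGTCCACGACGGCCCGGGCTCCA"
SIRT_ADAPTOR_2 = "TGGAGCCC"
AP1 = "GTAATACGACTCACTATAGGGC"
AP2 = "ATAGGGCACGCGTCCACGAC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# The short strand should pair with the last bases of the long strand.
short_strand_anneals = SIRT_ADAPTOR_1.endswith(reverse_complement(SIRT_ADAPTOR_2))

# Positions (0-based) of the two adaptor-specific primers on the long strand.
ap1_offset = SIRT_ADAPTOR_1.find(AP1)
ap2_offset = SIRT_ADAPTOR_1.find(AP2)

print("adaptor 2 anneals to 3' end of adaptor 1:", short_strand_anneals)
print("AP1 starts at position", ap1_offset)
print("AP2 starts at position", ap2_offset)
```

With these sequences, AP1 corresponds to the 5’ end of the long strand and AP2 lies downstream of the AP1 start, so AP2 sits closer to the ligated genomic DNA.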
Transposon-Specific Primers for SIRT Validation
The protocol for SIRT was followed with anchor primers replaced by transposon-specific primers:
ONSEN (5’-CGGGACTTGGAAGGGAACAT-3’), EVADÉ (5’-TGTATGTGAGAGAATATGAGACCA-3’),
COPIA21 (5’-ACTTGACGACGCCACATGAG-3’). PCR was carried out using the adaptor-specific AP1 and the
TE-specific primers (5 µl 5x buffer, 0.125 µl GoTaq2, 1 µl SIRT-ligated DNA, 0.5 µl AP1, 0.5 µl TE-specific
primer, 0.5 µl 10 mM dNTPs and 17.375 µl H2O) with the program 95°C for 2 min, then 35 cycles of 95°C for 30 s, 55°C for 30 s and 72°C for 30 s.
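
The component volumes listed for this reaction can be checked and scaled with a few lines of arithmetic. The sketch below reads the recipe as 5 µl of 5x buffer and 0.125 µl of GoTaq2, which is our interpretation of the component list. The 10% pipetting overage and the choice of which components to leave out of the master mix are assumptions, not part of the protocol.

```python
# Per-reaction volumes in microlitres, as read from the text.
PER_REACTION_UL = {
    "5x buffer": 5.0,
    "GoTaq2": 0.125,
    "SIRT-ligated DNA": 1.0,
    "AP1 primer": 0.5,
    "TE-specific primer": 0.5,
    "10 mM dNTPs": 0.5,
    "H2O": 17.375,
}


def total_volume(recipe):
    """Sum of all component volumes for a single reaction."""
    return sum(recipe.values())


def master_mix(recipe, n_reactions, overage=0.10,
               exclude=("SIRT-ligated DNA", "TE-specific primer")):
    """Scale shared components for n reactions plus overage (assumed 10%).

    Components in `exclude` vary between tubes and are added individually.
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 3)
        for name, volume in recipe.items()
        if name not in exclude
    }


reaction_total = total_volume(PER_REACTION_UL)
mix_for_8 = master_mix(PER_REACTION_UL, n_reactions=8)

print("per-reaction total (µl):", reaction_total)
for name, volume in mix_for_8.items():
    print(f"{name}: {volume} µl")
```

Under this reading the components add up to 25 µl per reaction.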
Transposon Display
Transposon display adaptors were prepared to 25 µM: GenomeWalker_adaptor_1
(5’-GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT-3’) and GenomeWalker_adaptor_2
(5’-ACCAGCCC-3’-Amino) were heated at 95°C for 10 min and allowed to cool to room
temperature. Genomic DNA was extracted from seedlings using the CTAB method; 300 ng DNA was
digested with DraI, which produces blunt-ended fragments compatible with
ligation to the GenomeWalker adaptor (300 ng DNA, 5 µl CutSmart buffer, 2.5 µl DraI and H2O to
50 µl). Digestion was carried out at 37°C for 16 h, and the DNA was cleaned using a QIAGEN Gel Extraction Kit and
eluted in 2x15 µl EB buffer. Adaptor ligation was carried out at 16°C for 24 h (5 µl digested DNA, 2 µl GW
adaptors (25 µM), 1.6 µl 10x ligation buffer, 1 µl T4 ligase and H2O to 16 µl). Ligated DNA was
diluted 1:20 before nested PCR was carried out. New insertions of COPIA21 in met1-1 were
identified using GenomeWalker_AP1 (5’-GTAATACGACTCACTATAGGGC-3’) and TD_COPIA21_138_R
(5’-TCACTGCTCTGATACCATGTGAT-3’) in the primary PCR and GenomeWalker_AP2
(5’-ACTATAGGGCACGCGTGGT-3’) with TD_COPIA21_62R (5’-AGTCTAGGGTATACACAACTACT-3’) in the
nested PCR. New insertions of DODGER were identified using GenomeWalker_AP1 and
TD_COPIA93_101R (5’-TCTCAATGTGTTCGGCCCAA-3’) in the primary PCR and GenomeWalker_AP2
(5’-ACTATAGGGCACGCGTGGT-3’) and TD_COPIA93_69R (5’-TACACACAACACTCGGCCTC-3’) in the
nested PCR. PCR was carried out using the adaptor-specific AP1 and the TE-specific primers in the
combinations listed above (5 µl 5x buffer, 0.125 µl GoTaq2, 1 µl transposon display ligated DNA,
0.5 µl AP1, 0.5 µl TE-specific primer, 0.5 µl 10 mM dNTPs and 17.375 µl H2O). The primary PCR program was 95°C for 2 min, then 33
cycles of 95°C for 30 s, 58°C for 30 s and 72°C for 1 min. The secondary PCR was run on a 1:100 dilution of the primary PCR with 95°C for 2 min, then
35 cycles of 95°C for 30 s, 58°C for 30 s and 72°C for 1 min. PCR products were cut from the gel and ligated into
pGEM-T for sequencing.
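
To illustrate the digestion step of transposon display, the sketch below cuts a linear sequence in silico at DraI sites. DraI recognises TTTAAA and cuts in the middle of the site (TTT^AAA), leaving blunt ends. Because the site is palindromic, scanning one strand is sufficient. The function name and the toy sequence are hypothetical.

```python
DRAI_SITE = "TTTAAA"  # DraI recognition sequence
CUT_OFFSET = 3        # blunt cut between TTT and AAA


def drai_fragments(sequence):
    """Return the fragments produced by complete DraI digestion of a linear sequence."""
    seq = sequence.upper()
    cut_positions = []
    start = 0
    while True:
        pos = seq.find(DRAI_SITE, start)
        if pos == -1:
            break
        cut_positions.append(pos + CUT_OFFSET)
        start = pos + 1
    bounds = [0] + cut_positions + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Toy sequence (hypothetical) containing two DraI sites.
toy = "GGCCTTTAAAGGCCGGTTTAAACC"
fragments = drai_fragments(toy)
print(fragments)
```

In the protocol, the blunt ends of such fragments are what the annealed GenomeWalker adaptor is ligated to.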
Insertion confirmation
PCR was carried out using transposon-specific primers and primers located in the flanking region of
the new insertion. Primers were designed in sequences bordering the COPIA21 insertion in QRT1.
PCR was carried out using the genome-specific primers QRT_F1 (5’-ACAAACGCTAACGCTAACGC-3’) and
QRT_R1 (5’-TGTTTCCAGTCACTCCAGCC-3’) to identify QRT1 with no insertion, and primers QRT_F1
with TD_COPIA21_62R to confirm the new insertion.
To confirm new insertions of DODGER and identify the DODGER copy that had reinserted, primers
were used in the sequences flanking the new insertion and also in the body of DODGER to allow
sequencing of the confirmation product. Primers were designed in the body of DODGER for
amplification of the entire LTR with the new flanking sequence. To amplify the 5’ LTR we used
DODGER_551R (5’-TGAGATTTTTCGAAGACGTGAAGA-3’) with an upstream flanking primer, and to
amplify the 3’ LTR we used DODGER_4235F (5’-GCTGTGAATCAGGTTAGCCAAC-3’) in combination
with a downstream flanking primer. To confirm the new insertion in At5g22500, we carried out PCR
using DODGER_551R with At5g22500_F1 (5’-GGCTCGCCAATCATAGTGGT-3’) to amplify the 5’ LTR of
the new insert and DODGER_4961F with At5g22500_R1 (5’-AGTTGTGATGACGACTCAAAGA-3’) to
amplify the 3’ region of the new insert, including an informative SNP. Confirmation of insertion 2 was
carried out using DODGER_551R and TD_8F3 (5’-AGTCGGTTTACTTTGTGGAAGGT-3’) to amplify the
5’ LTR and DODGER_4961F (5’-GACTTGCGTGAGTCTTACGC-3’) and TD_8R3
(5’-ACTTTTGCAGAAGGATGACCCA-3’) to amplify the 3’ LTR of the new insert. PCR products were then
extracted from the gel and ligated into pGEM-T Easy for Sanger sequencing.
qPCR
Total RNA was isolated from closed flower buds using the RNeasy kit (Qiagen) according to the
manufacturer’s instructions. RNA was treated with DNase from TurboDNase kit (Ambion) and 1 µg
was reverse transcribed into cDNA using Superscript VILO (Invitrogen). Analyses by qPCR were run
on a LightCycler480 (Roche) using LightCycler 480 SYBR Green I Master (Roche). The replicates
analysed are indicated in the figure legend, with two technical replicates being carried out for each
sample. QRT expression levels were analysed relative to ACT7. Primers used were qACT7_F1
(5’-CGTACAACCGGTATTGTGCT-3’) with qACT7_R1 (5’-TCAGTGAGAATCTTCATGAGTGAG-3’) and qQRT_F3
(5’-AGGAGCTGTTGATATGGTTCCT-3’) with qQRT_R3 (5’-TGTAAGTTACCTGTAAATCCCAG-3’).
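
The text states that expression was analysed relative to a reference gene but does not give the formula used. One common way to do this is the 2^-ΔΔCt calculation, sketched below on the assumption of equal, 100% amplification efficiency for both primer pairs. This is an illustration and not necessarily the calculation applied in the study; all Ct values are invented.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """2^-ddCt relative expression of a sample versus a calibrator sample.

    Each argument is a list of technical-replicate Ct values.
    Assumes both primer pairs amplify with 100% efficiency.
    """
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    return 2 ** -(delta_ct_sample - delta_ct_calibrator)


# Invented Ct values: two technical replicates per gene and sample.
wild_type = {"target": [24.0, 24.0], "reference": [20.0, 20.0]}
mutant = {"target": [26.0, 26.0], "reference": [20.0, 20.0]}

fold_mutant = relative_expression(
    mutant["target"], mutant["reference"],
    wild_type["target"], wild_type["reference"],
)
fold_wild_type = relative_expression(
    wild_type["target"], wild_type["reference"],
    wild_type["target"], wild_type["reference"],
)
print("mutant relative to wild type:", fold_mutant)
print("wild type relative to itself:", fold_wild_type)
```

In this toy case the target amplifies two cycles later in the mutant with an unchanged reference, which corresponds to a fourfold lower relative level.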
Environmental SEM
Samples were placed on an adhesive carbon pad fixed to an SEM stage stub. Environmental Scanning
Electron Microscopy was carried out on a Zeiss EVO HD15 SEM running in variable pressure mode at
75 Pa. All imaging used the backscatter detector set to high gain. Beam settings were 25 kV with a
probe size equivalent to 10 nA.
Supplementary Figures
Supplementary Figure 1. SIRT detection of known active LTR-TEs in Arabidopsis.
PCR analysis of the ecDNA, uncropped gels as in Figure 2. DNA was extracted from 2-week-old
seedlings with known retrotransposon activity and PCR was carried out using the adaptor-specific
primer with individual anchor primers. Seedlings with temperature treatment: non-stressed (NS) Col-
0 and nrpd1-3 with no known LTR-TEs, heat-stressed (HS) seedlings (ONSEN). Seedlings with no
temperature treatment: epiRIL12 (EVADÉ), met1-1 (COPIA21), and ddm1-2 (EVADÉ). Amplified
fragments of the expected sizes are indicated by arrows.
A) Adaptor-specific primer and Anchor0. B) Adaptor-specific primer and Anchor1. C) Adaptor-specific
primer and Anchor2. D) Adaptor-specific primer and Anchor3.
Supplementary Figure 2. Sequences of SIRT products.
Sequence analysis of PCR products amplified in Figure 2A. Bands extracted from the gel were ligated
into pGEM-T Easy and Sanger sequenced. All sequences contained the adaptor sequence (in orange)
and the PBS anchor sequence (underlined) with the recovered LTRs (highlighted in grey).
Supplementary Figure 3. Transposition of COPIA21 in met1-1 background
A) Transposon display analysis of genomic DNA digested with DraI. Adaptor-specific and COPIA21-
specific primers were used. New COPIA21 insertions detected in two different met1-1 plants in QRT1
gene (marked with the black arrow) are absent in control Col-0 plants. The band representing the
original COPIA21 location in the Col-0 accession is marked with a grey arrow. B) qPCR analysis of
QRT1 transcript levels in flowers of Col-0 (n=1), met1-1 without COPIA21 insertion in QRT1 (n=2) and
met1-1 with homozygous COPIA21 insertion in QRT1 (n=6). qPCR was carried out on two technical
replicates per sample. Primers designed in QRT1 downstream of the insertion site were used. The
relative expression was calculated using the actin 2 gene (ACT2) as reference and normalised to
expression in Col. C) Location of the primers used for PCR validation. QRT1-specific primers indicated
in blue and COPIA21-specific primers in red. Validation PCR assay using genomic DNA from 24 met1-
1 individuals grown from bulked met1-1 seeds (derived from multiple individuals). Homozygous
insertions of COPIA21 are marked with black arrows under the gels. D) SEM images of pollen
showing the known qrt phenotype (Preuss et al., 1994) of all plants used for qPCR: Col-0, and met1-1
with or without the COPIA21 insertion in QRT1.
Supplementary Figure 4. Sequences of Ler SIRT products.
Sequence analysis of SIRT products displayed in Figure 1G. Bands were extracted from the gel,
ligated into pGEM-T Easy and Sanger sequenced.
A) Sequences derived from met1-1/Ler samples contained the adaptor sequence (in orange) and the
PBS anchor sequence (underlined) with the recovered LTR in between (highlighted in grey).
B) The band present in Ler sample did not contain the adaptor sequence, only inverted PBS primers
(underlined).
Supplementary Figure 5. DODGER and EVADÉ-like retrotransposon families in the Ler accession.
A) Structures of the seven DODGER family members; predicted ORFs are marked by grey arrows,
LTRs by blue arrows and non-coding regions by a black line. Five DODGER copies encode a putative
polyprotein (DODGER_1, DODGER_2, DODGER_3, DODGER_4, DODGER_7). The percent identity between the 5’ and 3’
LTRs is given on the left, together with the LTR lengths.
B) Structures of two EVD-like copies annotated as in A.
Supplementary Figure 6. Predicted reconstitution of LTRs during ecDNA synthesis.
The genomic LTR-TE DNA consists of a coding region (grey box), a PBS (orange box), 5’ (green box)
and 3’ (blue box) LTRs, each LTR consisting of U3, R and U5 regions. During reverse transcription of
the LTR-TE transcript there are two template switches of DNA in R and PBS, resulting in two identical
“hybrid” LTRs found also in new insertions. Schematic displays of three chromosomal copies of
DODGER (marked “genomic”) and their structural changes due to the reverse transcription (marked
“new insertion”).
A) DODGER_1, the position of the T to A SNP in genomic DODGER_1 3’ LTR is indicated.
B) DODGER_2, the position of the T to A SNP in genomic DODGER_2 5’ LTR is indicated.
C) DODGER_3, the position of the G to A SNP in genomic DODGER_3 3’ LTR is indicated.
Supplementary Figure 7. Terminal sequences of new DODGER insertions.
Sequences of PCR products amplified as depicted in Figure 1I. In addition, DODGER-specific primers
were also designed to amplify the 3’ junction of the new insertion. Flanking sequences are shown
underlined, target site duplications are highlighted in green, LTR sequences are highlighted in grey,
PBS sequences are highlighted in bold, and the DODGER sequences are not highlighted.
A) Sequence validation of a new insertion of DODGER_3 in locus 1. The junctions of the new
insertion were amplified using DODGER-specific primers and locus-specific primers. Both the 5’ and
3’ junctions were amplified. In order to distinguish between DODGER_2 and DODGER_3, the 3’
coding region (containing an informative SNP, highlighted in pink) was also included.
B) Sequence validation of the new insertion of DODGER_1 in location 2.
Supplementary references
Preuss, D., Rhee, S.Y., and Davis, R.W. (1994). Tetrad analysis possible in Arabidopsis with mutation of the QUARTET (QRT) genes. Science 264:1458-1460.
Vongs, A., Kakutani, T., Martienssen, R.A., and Richards, E.J. (1993). Arabidopsis thaliana DNA methylation mutants. Science 260:1926-1928.