Sequence-independent identification of active LTR retrotransposons in Arabidopsis

Supplementary Information

Materials and Methods

Supplementary Figures 1-7

Supplementary references

Authors

Jayne Griffiths, Marco Catoni, Mayumi Iwasaki (2) and Jerzy Paszkowski (1)

(1) To whom correspondence should be addressed: [email protected], tel: +44(0)1223761159

(2) Present address: University of Geneva, Geneva, Switzerland

Materials and Methods

Plants and Growth conditions

The wild-type background was Col-0 or Ler, as specified; 13th-generation met1-1 plants were also used in this study. The isolation of ddm1-2 was described previously (Vongs et al., 1993). met1-1/Ler was kindly provided by Dr. Eric Richards. The met1-1 mutation was introgressed into Ler and backcrossed four times, and the progeny used in this study were genotyped and confirmed to be homozygous for the met1-1 mutation. Seedlings were grown on ½ MS medium (1% sucrose and 0.8% agar, pH 5.7) for two weeks at 20°C under long-day (LD) conditions (16 h light/8 h dark). Pools of plants were then harvested into liquid nitrogen. Individual seedlings or flowering plants were grown on soil (Levington F2) at 20°C under long-day conditions (16 h light/8 h dark). Leaves were harvested from individual plants at 2-4 weeks, depending on plant size.

Bioinformatics

We selected all LTR-TEs belonging to the superfamilies LTR/Copia and LTR/Gypsy in the TAIR10 transposon annotation. We then searched the LTR-TE DNA sequences for a match with the last 12 bp of the Met-iCAT tRNA (GCTCTGATACCA). Within this selected group of LTR-TEs, we filtered for transposons with at least one match with each of the different primers designed, which consist of the last 12 bp of the Met-iCAT tRNA (GCTCTGATACCA) followed by 0 to 3 nucleotides of any identity and the TG dinucleotide (complementary to the last two base pairs of the LTR sequence). The pattern matching was performed using the R “grep” function with the pattern “GCTCTGATACCA[ATCG][n]TG”, where “n” was replaced by the number of nucleotides considered.
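The anchor search described above can be sketched in Python. This is an illustration rather than the authors' R code: it assumes that "[n]" in the pattern stands for a repeat count (written here as the regex quantifier `{n}`), it searches one strand only, and the function names and the toy sequence are hypothetical.

```python
import re

# Last 12 bp of the Met-iCAT tRNA, as given in the text.
PBS_ANCHOR = "GCTCTGATACCA"

def anchor_pattern(n):
    """Regex for the anchor, then exactly n bases of any identity, then TG."""
    return re.compile(PBS_ANCHOR + "[ATCG]{%d}TG" % n)

def matching_spacers(sequence):
    """Return the spacer lengths (0 to 3) for which the sequence contains a match."""
    return [n for n in range(4) if anchor_pattern(n).search(sequence)]

# Hypothetical toy sequence: anchor, a 2-nt spacer (AC), then TG.
toy = "TTTT" + PBS_ANCHOR + "ACTG" + "CCCC"
print(matching_spacers(toy))  # [2]
```

Each spacer length corresponds to one of the four anchor primers (Anchor0 to Anchor3), so the returned list indicates which primers could, in principle, prime on a given element.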

BLAST searches against the NCBI database were performed on the NCBI website (https://blast.ncbi.nlm.nih.gov) with default parameters. BLAST searches against the Ler genome were run with the BLAST 2.2.25+ program with -evalue 1e-5, using as the sequence database the Ler-0 assembly from the Pacific Biosciences website (http://www.pacb.com/uncategorized/new-data-release-arabidopsis-assembly).

Sequence-Independent Retrotransposon Trapping (SIRT)

Adaptors were modified from the GenomeWalker adaptors (Clontech) to prevent primer dimers between the PBS primer and the original GenomeWalker adaptor/primers. SIRT adaptors were prepared at 50 µM: SIRT_adaptor_1 (5’-GTAATACGACTCACTATAGGGCACGCGTCCACGACGGCCCGGGCTCCA-3’) and SIRT_adaptor_2 (5’-Phos-TGGAGCCC-3’) were heated at 95°C for 10 min and allowed to cool to room temperature. Genomic DNA was extracted from seedling tissue using the standard CTAB method; 200 ng DNA was ligated to the SIRT adaptors (200 ng DNA, 2 µl 10x ligation buffer, 8 µl 50 µM SIRT adaptor, 1.5 units T4 ligase (Promega), H2O to 20 µl) over 24 h at 4°C, with an additional 1.5 units T4 ligase added after 9 h. The ligations were cleaned using the QIAGEN PCR Clean-up Kit and eluted in 50 µl H2O. PCR was performed using the adaptor-specific primer AP1 (5’-GTAATACGACTCACTATAGGGC-3’) and one of the anchor primers (12.5 µl 2x PCRBIO Ultra Mix (PCR Biosystems), 1-2 µl SIRT-ligated DNA, 0.5 µl AP1, 0.5 µl Anchor(n) and H2O to 25 µl). The PCR program consisted of 95°C for 2 min, then 40 cycles of 95°C for 10 s, 51°C for 10 s and 72°C for 15 s. PCR products were run on a 1.5% gel at 80 V for 90 min.

Bands were extracted from the gel using a QIAGEN Gel Extraction Kit and eluted in 30 µl H2O. The purified PCR products were ligated into pGEM-T. For ligation, 1.8 µl of the product was mixed with 2.5 µl 2x ligation buffer, 10 ng pGEM-T vector and 1.5 units of T4 DNA ligase (Promega). The mixture was incubated at 4°C overnight and 2 µl was transformed into E. coli DH5α cells (50 µl). Random white colonies were chosen and sequenced with either the adaptor-specific AP2 primer (5’-ATAGGGCACGCGTCCACGAC-3’) or a standard M13R primer.
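The relationships between the SIRT adaptor strands and the AP1/AP2 primers can be checked directly from the sequences quoted above. The short Python sketch below uses only those sequences; the variable and function names are illustrative.

```python
# Sequences as quoted in the text (5' to 3').
ADAPTOR_1 = "GTAATACGACTCACTATAGGGCACGCGTCCACGACGGCCCGGGCTCCA"
ADAPTOR_2 = "TGGAGCCC"
AP1 = "GTAATACGACTCACTATAGGGC"
AP2 = "ATAGGGCACGCGTCCACGAC"

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))

# The short strand should anneal to the 3' end of the long strand.
short_arm_pairs = ADAPTOR_1.endswith(reverse_complement(ADAPTOR_2))

# AP1 and AP2 should both be sub-sequences of the long strand,
# with AP2 positioned downstream of the start of AP1.
ap1_offset = ADAPTOR_1.find(AP1)
ap2_offset = ADAPTOR_1.find(AP2)

print(short_arm_pairs, ap1_offset, ap2_offset)  # True 0 15
```

The same check can be applied to any modified adaptor before ordering oligonucleotides, to confirm that the short strand still pairs with the long strand and that both adaptor primers still have a binding site.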

Transposon-Specific Primers for SIRT Validation

The SIRT protocol was followed with the anchor primers replaced by transposon-specific primers: ONSEN (5’-CGGGACTTGGAAGGGAACAT-3’), EVADÉ (5’-TGTATGTGAGAGAATATGAGACCA-3’) and COPIA21 (5’-ACTTGACGACGCCACATGAG-3’). PCR was carried out using the adaptor-specific primer AP1 and the TE-specific primers (5 µl 5x buffer, 0.125 µl GoTaq2, 1 µl SIRT-ligated DNA, 0.5 µl AP1, 0.5 µl TE-specific primer, 0.5 µl 10 mM dNTPs and 17.375 µl H2O) with the program 95°C for 2 min, then 35 cycles of 95°C for 30 s, 55°C for 30 s and 72°C for 30 s.
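As a quick arithmetic check of the reaction mix, the sketch below sums the listed volumes and derives the final buffer and dNTP concentrations. It reads the mix as 5 µl of 5x buffer plus 0.125 µl of polymerase, which is an interpretation of the order in which the components are listed; the dictionary keys are illustrative labels.

```python
# Volumes in µl for one TE-specific PCR, as listed in the text.
# Assumption: "5 µl" refers to the 5x buffer and "0.125 µl" to the polymerase.
components = {
    "5x buffer": 5.0,
    "GoTaq2 polymerase": 0.125,
    "SIRT-ligated DNA": 1.0,
    "AP1 primer": 0.5,
    "TE-specific primer": 0.5,
    "10 mM dNTPs": 0.5,
    "H2O": 17.375,
}

total = sum(components.values())

# Final concentrations follow from C1 * V1 = C2 * V2.
buffer_final_x = 5 * components["5x buffer"] / total
dntp_final_mM = 10 * components["10 mM dNTPs"] / total

print(total, buffer_final_x, dntp_final_mM)  # 25.0 1.0 0.2
```

Under this reading the components add up to a 25 µl reaction with 1x buffer and 0.2 mM dNTPs.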

Transposon Display

Transposon display adaptors were prepared at 25 µM: GenomeWalker_adaptor_1 (5’-GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT-3’) and GenomeWalker_adaptor_2 (5’-ACCAGCCC-3’-Amino) were heated at 95°C for 10 min and allowed to cool to room temperature. Genomic DNA was extracted from seedlings using the CTAB method; 300 ng DNA was digested with DraI, which produces blunt-ended fragments compatible with ligation to the GenomeWalker adaptor (300 ng DNA, 5 µl CutSmart buffer, 2.5 µl DraI and H2O to 50 µl). Digestion was carried out at 37°C for 16 h, and the DNA was cleaned using a QIAGEN Gel Extraction Kit and eluted in 2 x 15 µl EB buffer. Adaptor ligation was carried out at 16°C for 24 h (5 µl digested DNA, 2 µl GW adaptors (25 µM), 1.6 µl 10x ligation buffer, 1 µl T4 ligase and H2O to 16 µl). Ligated DNA was diluted 1:20 before nested PCR was carried out. New insertions of COPIA21 in met1-1 were identified using GenomeWalker_AP1 (5’-GTAATACGACTCACTATAGGGC-3’) and TD_COPIA21_138_R (5’-TCACTGCTCTGATACCATGTGAT-3’) in the primary PCR and GenomeWalker_AP2 (5’-ACTATAGGGCACGCGTGGT-3’) with TD_COPIA21_62R (5’-AGTCTAGGGTATACACAACTACT-3’) in the nested PCR. New insertions of DODGER were identified using GenomeWalker_AP1 and TD_COPIA93_101R (5’-TCTCAATGTGTTCGGCCCAA-3’) in the primary PCR and GenomeWalker_AP2 (5’-ACTATAGGGCACGCGTGGT-3’) and TD_COPIA93_69R (5’-TACACACAACACTCGGCCTC-3’) in the nested PCR. PCR was carried out using adaptor-specific AP1 and the TE-specific primers in the combinations listed above (5 µl 5x buffer, 0.125 µl GoTaq2, 1 µl transposon display ligated DNA, 0.5 µl AP1, 0.5 µl TE-specific primer, 0.5 µl 10 mM dNTPs and 17.375 µl H2O). The primary PCR consisted of 95°C for 2 min, then 33 cycles of 95°C for 30 s, 58°C for 30 s and 72°C for 1 min; the secondary PCR, on a 1:100 dilution of the primary PCR, consisted of 95°C for 2 min, then 35 cycles of 95°C for 30 s, 58°C for 30 s and 72°C for 1 min. PCR products were cut from the gel and ligated into pGEM-T for sequencing.
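To illustrate why DraI digestion yields fragments suitable for blunt-end adaptor ligation, the following sketch performs an in silico digest. DraI recognises TTTAAA and cuts in the middle of the site (TTT^AAA), leaving blunt ends. The function name and the toy sequence are hypothetical, and only one strand is modelled, which is sufficient because the site is palindromic.

```python
import re

DRAI_SITE = "TTTAAA"  # DraI recognition sequence; cut position is TTT^AAA
CUT_OFFSET = 3        # blunt cut after the third base of the site

def drai_fragments(seq):
    """Split a sequence at every DraI site and return the fragments in order."""
    cuts = [m.start() + CUT_OFFSET for m in re.finditer(DRAI_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical toy sequence with two DraI sites.
toy = "GGGTTTAAACCCTTTAAAGG"
print(drai_fragments(toy))  # ['GGGTTT', 'AAACCCTTT', 'AAAGG']
```

In transposon display, the fragment of interest is the one that runs from inside the element to the nearest DraI site in the flanking DNA, so each new insertion produces a band of a different size.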

Insertion confirmation

PCR was carried out using transposon-specific primers and primers located in the flanking region of the new insertion. Primers were designed in the sequences bordering the COPIA21 insertion in QRT1. PCR was carried out using the genome-specific primers QRT_F1 (5’-ACAAACGCTAACGCTAACGC-3’) and QRT_R1 (5’-TGTTTCCAGTCACTCCAGCC-3’) to identify QRT1 with no insertion, and QRT_F1 with TD_COPIA21_62R to confirm the new insertion.
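The two-reaction assay above lends itself to a simple genotype call per plant. The sketch below is one plausible scoring scheme, not a procedure stated in the text: it assumes that the flanking primer pair does not amplify across the inserted element under the conditions used, so that an insertion-only result indicates a homozygous insertion. The function name and return labels are hypothetical.

```python
def call_qrt1_genotype(wt_band, insertion_band):
    """Score one plant from two PCRs.

    wt_band: product observed with QRT_F1 + QRT_R1 (uninterrupted QRT1).
    insertion_band: product observed with QRT_F1 + TD_COPIA21_62R (junction).
    Assumes the flanking pair cannot amplify across the full element.
    """
    if wt_band and insertion_band:
        return "heterozygous insertion"
    if insertion_band:
        return "homozygous insertion"
    if wt_band:
        return "no insertion"
    return "no product (repeat PCR)"

print(call_qrt1_genotype(wt_band=False, insertion_band=True))
```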

To confirm new insertions of DODGER and identify the DODGER copy that had reinserted, primers were used in the sequences flanking the new insertion and also in the body of DODGER to allow sequencing of the confirmation product. Primers were designed in the body of DODGER for amplification of the entire LTR with the new flanking sequence. To amplify the 5’ LTR we used DODGER_551R (5’-TGAGATTTTTCGAAGACGTGAAGA-3’) with an upstream flanking primer, and to amplify the 3’ LTR we used DODGER_4235F (5’-GCTGTGAATCAGGTTAGCCAAC-3’) in combination with a downstream flanking primer. To confirm the new insertion in At5g22500, we carried out PCR using DODGER_551R with At5g22500_F1 (5’-GGCTCGCCAATCATAGTGGT-3’) to amplify the 5’ LTR of the new insert and DODGER_4961F with At5g22500_R1 (5’-AGTTGTGATGACGACTCAAAGA-3’) to amplify the 3’ region of the new insert, including an informative SNP. Confirmation of insertion 2 was carried out using DODGER_551R and TD_8F3 (5’-AGTCGGTTTACTTTGTGGAAGGT-3’) to amplify the 5’ LTR and DODGER_4961F (5’-GACTTGCGTGAGTCTTACGC-3’) and TD_8R3 (5’-ACTTTTGCAGAAGGATGACCCA-3’) to amplify the 3’ LTR of the new insert. PCR products were then extracted from the gel and ligated into pGEM-T Easy for Sanger sequencing.

qPCR

Total RNA was isolated from closed flower buds using the RNeasy kit (Qiagen) according to the manufacturer’s instructions. RNA was treated with DNase from the TurboDNase kit (Ambion) and 1 µg was reverse transcribed into cDNA using Superscript VILO (Invitrogen). qPCR analyses were run on a LightCycler 480 (Roche) using LightCycler 480 SYBR Green I Master (Roche). The replicates analysed are indicated in the figure legend, with two technical replicates carried out for each sample. QRT expression levels were analysed relative to ACT7. Primers used were qACT7_F1 (5’-CGTACAACCGGTATTGTGCT-3’) with qACT7_R1 (5’-TCAGTGAGAATCTTCATGAGTGAG-3’) and qQRT_F3 (5’-AGGAGCTGTTGATATGGTTCCT-3’) with qQRT_R3 (5’-TGTAAGTTACCTGTAAATCCCAG-3’).
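The text does not state how relative expression was computed. One common approach is the 2^-ΔΔCt method, sketched below under the assumptions of equal, 100% amplification efficiency for both primer pairs and averaging of technical replicates before the calculation. The function name and all Ct values are hypothetical.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """2^-ddCt for a sample relative to a calibrator (e.g. wild type).

    Each argument is a list of technical-replicate Ct values.
    Assumes 100% amplification efficiency for target and reference.
    """
    d_ct_sample = mean(ct_target) - mean(ct_reference)
    d_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    return 2 ** -(d_ct_sample - d_ct_calibrator)

# Hypothetical Ct values, two technical replicates each.
fold = relative_expression(
    ct_target=[27.0, 27.0], ct_reference=[20.0, 20.0],          # mutant sample
    ct_target_cal=[25.0, 25.0], ct_reference_cal=[20.0, 20.0],  # calibrator
)
print(fold)  # 0.25
```

Here the target amplifies two cycles later in the sample than in the calibrator after normalisation to the reference gene, which corresponds to a four-fold lower transcript level.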

Environmental SEM

Samples were placed on an adhesive carbon pad fixed to an SEM stage stub. Environmental scanning electron microscopy was carried out on a Zeiss EVO HD15 SEM running in variable pressure mode at 75 Pa. All imaging used the backscatter detector set to high gain. Beam settings were 25 kV with a probe size equivalent to 10 nA.

Supplementary Figures

Supplementary Figure 1. SIRT detection of known active LTR-TEs in Arabidopsis.

PCR analysis of the ecDNA; uncropped gels as in Figure 2. DNA was extracted from 2-week-old seedlings with known retrotransposon activity and PCR was carried out using the adaptor-specific primer with individual anchor primers. Seedlings with temperature treatment: non-stressed (NS) Col-0 and nrpd1-3 with no known LTR-TE activity, and heat-stressed (HS) seedlings (ONSEN). Seedlings with no temperature treatment: epiRIL12 (EVADÉ), met1-1 (COPIA21) and ddm1-2 (EVADÉ). Amplified fragments of the expected sizes are indicated by arrows.

A) Adaptor-specific primer and Anchor0. B) Adaptor-specific primer and Anchor1. C) Adaptor-specific primer and Anchor2. D) Adaptor-specific primer and Anchor3.

Supplementary Figure 2. Sequences of SIRT products.

Sequence analysis of PCR products amplified in Figure 2A. Bands extracted from the gel were ligated into pGEM-T Easy and Sanger sequenced. All sequences contained the adaptor sequence (in orange) and the PBS anchor sequence (underlined) with the recovered LTRs (highlighted in grey).

Supplementary Figure 3. Transposition of COPIA21 in met1-1 background

A) Transposon display analysis of genomic DNA digested with DraI. Adaptor-specific and COPIA21-specific primers were used. New COPIA21 insertions in the QRT1 gene, detected in two different met1-1 plants (marked with the black arrow), are absent in control Col-0 plants. The band representing the original COPIA21 location in the Col-0 accession is marked with a grey arrow.

B) qPCR analysis of QRT1 transcript levels in flowers of Col-0 (n=1), met1-1 without COPIA21 insertion in QRT1 (n=2) and met1-1 with homozygous COPIA21 insertion in QRT1 (n=6). qPCR was carried out on two technical replicates per sample. Primers designed in QRT1 downstream of the insertion site were used. The relative expression was calculated using the actin 2 gene (ACT2) as reference and normalised to expression in Col.

C) Location of the primers used for PCR validation. QRT1-specific primers are indicated in blue and COPIA21-specific primers in red. Validation PCR assay using genomic DNA from 24 met1-1 individuals grown from bulked met1-1 seeds (derived from multiple individuals). Homozygous insertions of COPIA21 are marked with black arrows under the gels.

D) SEM images of pollen showing the known qrt phenotype (Preuss et al., 1994) of all plants used for qPCR: Col-0, and met1-1 with or without the COPIA21 insertion in QRT1.

Supplementary Figure 4. Sequences of Ler SIRT products.

Sequence analysis of SIRT products displayed in Figure 1G. Bands were extracted from the gel, ligated into pGEM-T Easy and Sanger sequenced.

A) Sequences derived from met1-1/Ler samples contained the adaptor sequence (in orange) and the PBS anchor sequence (underlined) with the recovered LTR in between (highlighted in grey).

B) The band present in the Ler sample did not contain the adaptor sequence, only inverted PBS primers (underlined).

Supplementary Figure 5. DODGER and EVADÉ-like retrotransposon families in the Ler accession.

A) Structures of the seven DODGER family members; predicted ORFs are marked by grey arrows, LTRs by blue arrows and non-coding regions by a black line. Five DODGER copies encode a putative polyprotein (DODGER_1, DODGER_2, DODGER_3, DODGER_4, DODGER_7). The percent identity between the 5’ and 3’ LTRs is given on the left, together with the LTR lengths.

B) Structures of two EVD-like copies, annotated as in A.

Supplementary Figure 6. Predicted reconstitution of LTRs during ecDNA synthesis.

The genomic LTR-TE DNA consists of a coding region (grey box), a PBS (orange box), and 5’ (green box) and 3’ (blue box) LTRs, each LTR consisting of U3, R and U5 regions. During reverse transcription of the LTR-TE transcript there are two template switches of DNA, in R and PBS, resulting in two identical “hybrid” LTRs that are also found in new insertions. Schematic displays of three chromosomal copies of DODGER (marked “genomic”) and their structural changes due to reverse transcription (marked “new insertion”).

A) DODGER_1, the position of the T to A SNP in genomic DODGER_1 3’ LTR is indicated.

B) DODGER_2, the position of the T to A SNP in genomic DODGER_2 5’ LTR is indicated.

C) DODGER_3, the position of the G to A SNP in genomic DODGER_3 3’ LTR is indicated.
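The consequence of LTR reconstitution for SNP tracking can be sketched with a toy model. It follows the standard simplified picture of LTR retroelement reverse transcription, in which U3 of both new LTRs derives from the genomic 3’ LTR while R and U5 derive from the genomic 5’ LTR; this assignment is a modelling assumption rather than something stated in the legend, and the region labels in the example are hypothetical.

```python
from collections import namedtuple

LTR = namedtuple("LTR", ["U3", "R", "U5"])

def reconstituted_ltr(five_prime, three_prime):
    """Hybrid LTR carried at both ends of a new insertion (simplified model).

    Assumption: U3 is copied from the genomic 3' LTR; R and U5 are copied
    from the genomic 5' LTR.
    """
    return LTR(U3=three_prime.U3, R=five_prime.R, U5=five_prime.U5)

# Hypothetical element with a SNP in U3 of its genomic 3' LTR.
genomic_5 = LTR(U3="U3_ref", R="R_ref", U5="U5_ref")
genomic_3 = LTR(U3="U3_snp", R="R_ref", U5="U5_ref")

new_ltr = reconstituted_ltr(genomic_5, genomic_3)
print(new_ltr)  # both LTRs of the new insertion carry the U3 SNP
```

Under this model, a SNP in U3 of the genomic 3’ LTR appears in both LTRs of a new copy, whereas a SNP in U3 of the genomic 5’ LTR is lost, which is why LTR polymorphisms can help identify the parental copy of a new insertion.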

Supplementary Figure 7. Terminal sequences of new DODGER insertions.

Sequences of PCR products amplified as depicted in Figure 1I. In addition, DODGER-specific primers were designed to amplify the 3’ junction of the new insertion. Flanking sequences are underlined, target site duplications are highlighted in green, LTR sequences are highlighted in grey, PBS sequences are in bold, and the remaining DODGER sequences are not highlighted.

A) Sequence validation of a new insertion of DODGER_3 in locus 1. The junctions of the new insertion were amplified using DODGER-specific primers and locus-specific primers. Both the 5’ and 3’ junctions were amplified. In order to distinguish between DODGER_2 and DODGER_3, the 3’ coding region (containing an informative SNP, highlighted in pink) was also included.

B) Sequence validation of the new insertion of DODGER_1 in location 2.

Supplementary references

Preuss, D., Rhee, S.Y., and Davis, R.W. (1994). Tetrad analysis possible in Arabidopsis with mutation of the QUARTET (QRT) genes. Science 264:1458-1460.

Vongs, A., Kakutani, T., Martienssen, R.A., and Richards, E.J. (1993). Arabidopsis thaliana DNA methylation mutants. Science 260:1926-1928.