The DNA metabarcoding approach for analyzing environmental samples: high throughput plant and animal identification

Pierre Taberlet, Eric Coissac, François Pompanon, Johan Pansu, Wasim Shehzad, Tiayyba Riaz

Laboratoire d'Ecologie Alpine, CNRS UMR 5553, Université Joseph Fourier, Grenoble, France

Need for high throughput collection of biodiversity data

•  For research
•  For management

NASA Earth Observing System: Terra Satellite Platform

At the moment, satellites cannot be used to identify taxa or to collect biodiversity data.

Why not use DNA metabarcoding?

Our goal: a new high throughput approach for obtaining biodiversity data (based on the DNA barcoding concept, and using next generation sequencers)

•  A single sampling in the field
•  Simple and robust metabarcoding experiments at the bench
•  Complete biodiversity assessment at the sampling site

Environmental DNA (eDNA)

•  Environmental DNA refers to DNA that can be extracted from air, water, or soil, without isolating any specific type of organism beforehand

•  Two types:
   –  intracellular eDNA
   –  extracellular eDNA

•  Intracellular eDNA commonly used by microbiologists

•  We focus on extracellular eDNA

Constraints of working with environmental DNA

•  Complex mixture containing degraded DNA
•  The eDNA extract must be representative of the local biodiversity
•  The standard DNA barcodes are not optimal (they are by far too long to reveal the whole spectrum of biodiversity)
•  The primers must be highly versatile (to equally amplify the different target DNAs)
•  Problem of the taxonomic resolution when using very short barcodes
•  At the moment, problem of the reference database when using non-standard barcodes

"Roche Noire" experiment in French Alps

•  Four plant communities
   –  dry high alpine meadows dominated by Kobresia myosuroides
   –  low alpine meadows dominated by Carex sempervirens
   –  subalpine heath dominated by Vaccinium sp.
   –  subalpine grasslands dominated by Festuca paniculata
•  Three plots per plant community (12 plots)
•  Two soil samples per plot (with 80 cores per sample)
•  Two DNA extractions per sample
•  Two DNA amplifications per extraction
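The nested design above implies a fixed number of samples, extractions and PCRs. The following is a minimal sketch that enumerates them; the identifier scheme (for example `Carex-P1-S2-E1-A2`) is a hypothetical labelling chosen for illustration, not the naming used in the experiment.

```python
from itertools import product

# Design parameters taken from the slide.
communities = [
    "Kobresia myosuroides",
    "Carex sempervirens",
    "Vaccinium sp.",
    "Festuca paniculata",
]
PLOTS_PER_COMMUNITY = 3
SAMPLES_PER_PLOT = 2
CORES_PER_SAMPLE = 80
EXTRACTIONS_PER_SAMPLE = 2
PCRS_PER_EXTRACTION = 2

# One hypothetical identifier per PCR: community, plot, sample, extraction, amplification.
pcr_ids = [
    f"{community.split()[0]}-P{plot}-S{sample}-E{extraction}-A{pcr}"
    for community, plot, sample, extraction, pcr in product(
        communities,
        range(1, PLOTS_PER_COMMUNITY + 1),
        range(1, SAMPLES_PER_PLOT + 1),
        range(1, EXTRACTIONS_PER_SAMPLE + 1),
        range(1, PCRS_PER_EXTRACTION + 1),
    )
]

n_plots = len(communities) * PLOTS_PER_COMMUNITY
n_samples = n_plots * SAMPLES_PER_PLOT
n_cores = n_samples * CORES_PER_SAMPLE
n_extractions = n_samples * EXTRACTIONS_PER_SAMPLE
n_pcrs = n_extractions * PCRS_PER_EXTRACTION

print(f"{n_plots} plots, {n_samples} samples, {n_cores} soil cores")
print(f"{n_extractions} extractions, {n_pcrs} PCRs")
```

Counting the replicates this way makes it easy to check that every plot is represented by the same number of amplifications before sequencing.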

"Roche Noire" experiment in French Alps

[Figure: sampling plot with a 10 m scale; 80 soil cores per sample]

●  Extraction of extracellular DNA from kilograms of soil using a phosphate buffer
●  DNA amplification of the P6 loop of the chloroplast trnL (UAA) intron
●  Sequencing on the 454

"Roche Noire" experiment: projections of a between class analysis

[Figure: two panels (A, B) of between-class analysis projections. Classes: Carex, Festuca, Kobresia, Vaccinium. Axes shown: Axis 1 (18.9%), Axis 2 (15.4%), Axis 3 (13.2%)]

Taberlet P, Prud'homme S, Campione E, et al. (2012) Extraction of extracellular DNA from large amount of soil for metabarcoding studies. Molecular Ecology, 21, in press. doi:10.1111/j.1365-294X.2011.05317.x.

Simple and robust metabarcoding experiments

•  In silico analysis: design and test of short metabarcodes (ecoPrimers, ecoPCR)

•  Empirical experiments
   –  DNA amplification with barcode primers
   –  Sequencing of the PCR products on next generation sequencers

•  Sequence analysis
   –  OBITools (www.prabi.grenoble.fr/trac/OBITools)
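To give an idea of what the in silico step checks, here is a minimal sketch of a naive in silico PCR: it looks for a forward primer and the reverse complement of a reverse primer on a template, tolerating a few mismatches, and returns the sequence in between. This is not the ecoPCR or ecoPrimers algorithm; the function names, the primers and the template are all hypothetical toy values.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(seq, primer, max_mismatches):
    """Start positions where the primer matches with at most max_mismatches."""
    k = len(primer)
    return [
        i
        for i in range(len(seq) - k + 1)
        if hamming(seq[i:i + k], primer) <= max_mismatches
    ]


def in_silico_pcr(seq, forward, reverse, max_mismatches=1, min_len=10, max_len=150):
    """Return the barcodes (primers excluded) that the primer pair would delimit."""
    seq = seq.lower()
    forward = forward.lower()
    reverse_site = reverse_complement(reverse.lower())
    barcodes = []
    for f in find_sites(seq, forward, max_mismatches):
        start = f + len(forward)
        for r in find_sites(seq, reverse_site, max_mismatches):
            if min_len <= r - start <= max_len:
                barcodes.append(seq[start:r])
    return barcodes


# Toy data (hypothetical primers and template, for illustration only).
forward_primer = "acgtacgtag"
reverse_primer = "ttgcaagcta"
barcode = "ccattttaataggg"
template = "ggg" + forward_primer + barcode + reverse_complement(reverse_primer) + "ccc"

print(in_silico_pcr(template, forward_primer, reverse_primer))
```

Running the same function over a whole reference database, and counting how many taxa are amplified and how many distinct barcodes they yield, gives a rough view of primer versatility and taxonomic resolution.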

Ficetola GF, Coissac E, Zundel S, et al. (2010) An in silico approach for the evaluation of DNA barcodes. BMC Genomics, 11, 434.

Riaz T, Shehzad W, Viari A, Pompanon F, Taberlet P, Coissac E (2011) ecoPrimers: inference of new DNA barcode markers from whole genome sequence analysis. Nucleic Acids Research, doi:10.1093/nar/gkr732.

A collection of metabarcoding primers

| Taxonomic group | Gene | Length | Accuracy (Bs) |
|---|---|---|---|
| Angiosperms/Gymnosperms | cpDNA trnL intron | 10-100 bp | Genus/Species |
| Poaceae | ITS1 | 54-88 bp | Species |
| Fungi | ITS1 | ~200 bp | Species ? |
| Vertebrates | mtDNA 12S V05 | 76-110 bp | Genus/Species |
| Teleost fishes | mtDNA 12S | 60-70 bp | Species |
| Batrachia | mtDNA 12S | ~42-57 bp | Species |
| Earthworms | mtDNA 16S (ewB/ewC) | ~30 bp | Species |
| Earthworms | mtDNA 16S (ewD/ewE) | ~70 bp | Species |
| Oligochaetes | mtDNA 16S (ewB/ewE) | ~120 bp | Species |
| Arthropods/Mollusks | mtDNA 16S | 35-40 bp | Family/Genus |
| Termites | mtDNA 12S | ~30 bp | Species ? |
| Termites | mtDNA 12S | ~70 bp | Species ? |
| Collembola | mtDNA 12S | 39-44 bp | Species ? |
| Collembola | mtDNA 12S | 125-138 bp | Species ? |

More information soon on www.metabarcoding.org

Earthworms from soil DNA

•  Eight soil samples collected per plot
•  Universal short metabarcodes for earthworms
•  Reference database built using samples identified with the standardized COI barcoding approach
•  Sequencing on Illumina GA IIx

[Figure: positions of primers b, c, d and e on the target mtDNA region (labelled "mtDNA 12S" on the slide), giving amplicons of 30 bp and 70 bp]

Earthworms from soil DNA: results

Values are numbers of sequence reads.

| Species | Barcode | Chartreuse Plot 1 | Chartreuse Plot 2 | Grenoble Plot 1 | Grenoble Plot 2 |
|---|---|---|---|---|---|
| Aporrectodea icterica | catcttaatgaagactaaaacttcactaaa | 836954 | 649677 | 834031 | 1359355 |
| Aporrectodea longa | tattttaacaaaaacccaaaaattttcaataaa | 2 | 6 | 244463 | 271829 |
| Aporrectodea sp | cattttaataaaaattataaattttactaaa | 0 | 0 | 236024 | 236678 |
| Octolasion cyaneum | cattttaatagaagcttactattctaataaa | 468462 | 3823 | 0 | 2 |
| Lumbricus terrestris | aatttaaataaatataaaaaatttactaaa | 0 | 0 | 174286 | 143682 |
| Octolasion tyrtaeum | cattttaatagaaaaataatatcctaataaa | 306476 | 0 | 0 | 2 |
| Lumbricus castaneus | aatttaaataaatataaaaaaatttactaaa | 0 | 0 | 56 | 131001 |
| Aporrectodea longa | tattttaacaaaacccaaaaattttcaataaa | 2469 | 105312 | 159 | 145 |
| Allolobophora chlorotica | cattttaataaagatataaactttactaaa | 0 | 0 | 51953 | 43196 |
| Aporrectodea caliginosa | tattttaataaaaaaatataaatttttaataa | 0 | 23005 | 0 | 0 |

Bienert R, de Danieli S, Miquel C, Coissac E, Poillot C, Brun JJ, Taberlet P (2012) Tracking earthworm communities from soil DNA. Molecular Ecology, 21, in press.
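Read counts like these are usually turned into per-plot totals and species lists. The sketch below does this for the counts on the slide. The detection threshold `MIN_READS = 10` is an assumption made for illustration (the slide does not state how very low counts such as 2 or 6 reads were treated), and the two Aporrectodea longa barcodes are merged under one species name.

```python
PLOTS = ("Chartreuse plot 1", "Chartreuse plot 2", "Grenoble plot 1", "Grenoble plot 2")

# (species, read counts per plot), copied from the slide; one row per barcode.
READS = [
    ("Aporrectodea icterica", (836954, 649677, 834031, 1359355)),
    ("Aporrectodea longa", (2, 6, 244463, 271829)),
    ("Aporrectodea sp", (0, 0, 236024, 236678)),
    ("Octolasion cyaneum", (468462, 3823, 0, 2)),
    ("Lumbricus terrestris", (0, 0, 174286, 143682)),
    ("Octolasion tyrtaeum", (306476, 0, 0, 2)),
    ("Lumbricus castaneus", (0, 0, 56, 131001)),
    ("Aporrectodea longa", (2469, 105312, 159, 145)),
    ("Allolobophora chlorotica", (0, 0, 51953, 43196)),
    ("Aporrectodea caliginosa", (0, 23005, 0, 0)),
]

MIN_READS = 10  # hypothetical threshold below which a barcode is ignored


def summarize(reads, min_reads):
    """Map each plot to (total reads, sorted list of species detected)."""
    summary = {}
    for j, plot in enumerate(PLOTS):
        total = sum(counts[j] for _, counts in reads)
        detected = sorted({sp for sp, counts in reads if counts[j] >= min_reads})
        summary[plot] = (total, detected)
    return summary


summary = summarize(READS, MIN_READS)
for plot, (total, detected) in summary.items():
    print(f"{plot}: {total} reads, {len(detected)} species: {', '.join(detected)}")
```

Changing `MIN_READS` shows how sensitive the species lists are to the treatment of rare reads, which is one reason the threshold should be reported with any such table.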

Current limitations of the PCR-based approach

•  Dependency on PCR
   –  Amplification introduces errors
   –  Difficulty finding suitable barcodes
   –  Different groups of organisms are analyzed separately
•  Lack of comprehensive taxonomic reference databases for non-standard metabarcodes
•  Limitations linked to the use of organellar markers

Future: capture

•  Easier to find a single conserved region for designing the probe for the capture than two close conserved regions for PCR

•  ecoProbes: computer program for designing suitable probes (comparable to ecoPrimers)

•  Possibility to use hundreds of probes at the same time

•  Both organellar and nuclear DNA can be analyzed at the same time
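The first point can be illustrated with a toy alignment: a capture probe needs one conserved stretch, whereas a primer pair needs two conserved stretches close enough to flank a short barcode. The sketch below is only an illustration under simple assumptions (gap-free alignment, strict identity across all sequences); it is not how ecoProbes or ecoPrimers work, and all names and sequences are hypothetical.

```python
def conserved_runs(alignment, min_len):
    """Half-open (start, end) intervals where all aligned sequences are identical."""
    n = len(alignment[0])
    runs, start = [], None
    for i in range(n + 1):
        same = i < n and len({seq[i] for seq in alignment}) == 1
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def primer_pairs(runs, max_barcode_len):
    """Pairs of conserved runs separated by a non-empty gap of at most max_barcode_len."""
    return [(a, b) for a in runs for b in runs if 0 < b[0] - a[1] <= max_barcode_len]


# Toy alignment 1: one conserved block followed by a variable region.
one_block = ["acgtacgtac" + base * 20 for base in "acg"]
# Toy alignment 2: the same, plus a second conserved block at the end.
two_blocks = [seq + "ttgcattgca" for seq in one_block]

for name, aln in (("one block", one_block), ("two blocks", two_blocks)):
    runs = conserved_runs(aln, min_len=8)
    pairs = primer_pairs(runs, max_barcode_len=100)
    print(f"{name}: probe sites {runs}, primer pairs {pairs}")
```

In the first alignment a probe site exists but no primer pair can be placed, which is the situation where capture has an advantage over PCR.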

e.g. Briggs AW, Good JM, Green RE, et al. (2009) Targeted retrieval and analysis of five Neandertal mtDNA genomes. Science, 325, 318-321.

An idea of the HiSeq 2000 production per run

•  6 billion reads of 100 bp
•  6 lines per read
•  55 lines per page (Times, 11 pt)
•  654 545 454 pages
•  194 400 km long
•  70.5 km high
•  more than 3,000 tons of paper
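These figures can be checked with a few lines of arithmetic. The sketch below assumes A4 pages (297 mm long, 210 mm wide) laid end to end and 80 g/m² paper; neither assumption is stated on the slide, although both are consistent with the numbers given.

```python
READS_PER_RUN = 6_000_000_000  # from the slide
LINES_PER_READ = 6             # from the slide
LINES_PER_PAGE = 55            # from the slide

A4_LENGTH_M = 0.297            # assumption: A4 pages laid end to end
A4_WIDTH_M = 0.210             # assumption: A4 format
PAPER_G_PER_M2 = 80            # assumption: standard office paper
STACK_HEIGHT_KM = 70.5         # from the slide

pages = READS_PER_RUN * LINES_PER_READ // LINES_PER_PAGE
length_km = pages * A4_LENGTH_M / 1000
mass_tonnes = pages * A4_LENGTH_M * A4_WIDTH_M * PAPER_G_PER_M2 / 1e6
# Sheet thickness implied by the stack height quoted on the slide.
implied_thickness_mm = STACK_HEIGHT_KM * 1e6 / pages

print(f"{pages:,} pages")
print(f"{length_km:,.0f} km long")
print(f"{mass_tonnes:,.0f} tonnes")
print(f"{implied_thickness_mm:.3f} mm per sheet")
```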

Future: shotgun sequencing

•  Shotgun sequencing of soil extracellular DNA on HiSeq 2000

•  We do not know the percentage of informative reads

•  Might allow the use of the standard barcode reference libraries

•  Real bioinformatics challenge
•  Ongoing experiments…

Acknowledgements

Rike Bienert, Kari Anne Bråthen, Christian Brochmann, Anne Krag Brysting, Etienne Campione, Corinne Cruaud, Francesco de Bello, Tony Dejean, Mary Edwards, Francesco Ficetola, Frédérick Gavory, Ludovic Gielly, James Haile, Christelle Melo de Lima, Christian Miquel, Stéphanie Pellier-Cuit, Sophie Prud'homme, Delphine Rioux, Julien Roy, Jorn Henrik Sønstebø, Wilfried Thuiller, Alice Valentini, Eske Willerslev, Patrick Wincker, Nigel Yoccoz

Thank you for your attention

Contacts: [email protected]; [email protected]

Molecular Ecology will publish a special issue on environmental DNA in 2012