12
RESEARCH ARTICLE Open Access Effectiveness of ITS and sub-regions as DNA barcode markers for the identification of Basidiomycota (Fungi) Fernanda Badotti 1 , Francislon Silva de Oliveira 2 , Cleverson Fernando Garcia 1 , Aline Bruna Martins Vaz 3,4 , Paula Luize Camargos Fonseca 3 , Laila Alves Nahum 2,5 , Guilherme Oliveira 2,6 and Aristóteles Góes-Neto 3* Abstract Background: Fungi are among the most abundant and diverse organisms on Earth. However, a substantial amount of the species diversity, relationships, habitats, and life strategies of these microorganisms remain to be discovered and characterized. One important factor hindering progress is the difficulty in correctly identifying fungi. Morphological and molecular characteristics have been applied in such tasks. Later, DNA barcoding has emerged as a new method for the rapid and reliable identification of species. The nrITS region is considered the universal barcode of Fungi, and the ITS1 and ITS2 sub-regions have been applied as metabarcoding markers. In this study, we performed a large-scale analysis of all the available Basidiomycota sequences from GenBank. We carried out a rigorous trimming of the initial dataset based in methodological principals of DNA Barcoding. Two different approaches (PCI and barcode gap) were used to determine the performance of the complete ITS region and sub-regions. Results: For most of the Basidiomycota genera, the three genomic markers performed similarly, i.e., when one was considered a good marker for the identification of a genus, the others were also; the same results were observed when the performance was insufficient. However, based on barcode gap analyses, we identified genomic markers that had a superior identification performance than the others and genomic markers that were not indicated for the identification of some genera. Notably, neither the complete ITS nor the sub-regions were useful in identifying 11 of the 113 Basidiomycota genera. The complex phylogenetic relationships and the presence of cryptic species in some genera are possible explanations of this limitation and are discussed. Conclusions: Knowledge regarding the efficiency and limitations of the barcode markers that are currently used for the identification of organisms is crucial because it benefits research in many areas. Our study provides information that may guide researchers in choosing the most suitable genomic markers for identifying Basidiomycota species. Keywords: ITS, ITS1, ITS2, Probable correct identification, Barcode gap, Basidiomycota Background Fungi are one of the major eukaryotic lineages that are equivalent in species number to animals but exceed that of plants [1]. Fungi are among the most important organ- isms in the world because of their vital roles in decompos- ition, nutrient cycling, and obligate mutualistic symbioses with plants, algae, and cyanobacteria [2]. Fungi also have great economic importance for industrial fermentation, pharmaceutical, and biotechnological industries [3]. They may also cause food spoilage and diseases in plants and animals [4]. The diversity of activities is reflected in the high number of taxa, morphologies, habitats, and life strategies used by this group of organisms. Further studies are necessary to better understand their complex interac- tions with other organisms and environments. The phylum Basidiomycota is the second largest of the Fungi kingdom and comprises approximately 30% of all described fungal species [5]. This diverse phylum includes primarily macroscopic but also microscopic fungi, such as mushrooms and basidiomycotan yeasts, * Correspondence: [email protected] 3 Universidade Federal de Minas Gerais, Departamento de Microbiologia, Av. Antônio Carlos, Belo Horizonte 6627, 31270-901, MG, Brazil Full list of author information is available at the end of the article © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Badotti et al. BMC Microbiology (2017) 17:42 DOI 10.1186/s12866-017-0958-x

Effectiveness of ITS and sub-regions as DNA barcode

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Effectiveness of ITS and sub-regions as DNA barcode

RESEARCH ARTICLE Open Access

Effectiveness of ITS and sub-regions asDNA barcode markers for the identificationof Basidiomycota (Fungi)Fernanda Badotti1, Francislon Silva de Oliveira2, Cleverson Fernando Garcia1, Aline Bruna Martins Vaz3,4,Paula Luize Camargos Fonseca3, Laila Alves Nahum2,5, Guilherme Oliveira2,6 and Aristóteles Góes-Neto3*

Abstract

Background: Fungi are among the most abundant and diverse organisms on Earth. However, a substantial amountof the species diversity, relationships, habitats, and life strategies of these microorganisms remain to be discoveredand characterized. One important factor hindering progress is the difficulty in correctly identifying fungi.Morphological and molecular characteristics have been applied in such tasks. Later, DNA barcoding has emerged as anew method for the rapid and reliable identification of species. The nrITS region is considered the universal barcode ofFungi, and the ITS1 and ITS2 sub-regions have been applied as metabarcoding markers. In this study, we performed alarge-scale analysis of all the available Basidiomycota sequences from GenBank. We carried out a rigorous trimming ofthe initial dataset based in methodological principals of DNA Barcoding. Two different approaches (PCI and barcodegap) were used to determine the performance of the complete ITS region and sub-regions.

Results: For most of the Basidiomycota genera, the three genomic markers performed similarly, i.e., when one wasconsidered a good marker for the identification of a genus, the others were also; the same results were observed whenthe performance was insufficient. However, based on barcode gap analyses, we identified genomic markers that had asuperior identification performance than the others and genomic markers that were not indicated for the identificationof some genera. Notably, neither the complete ITS nor the sub-regions were useful in identifying 11 of the 113Basidiomycota genera. The complex phylogenetic relationships and the presence of cryptic species in some genera arepossible explanations of this limitation and are discussed.

Conclusions: Knowledge regarding the efficiency and limitations of the barcode markers that are currently used forthe identification of organisms is crucial because it benefits research in many areas. Our study provides informationthat may guide researchers in choosing the most suitable genomic markers for identifying Basidiomycota species.

Keywords: ITS, ITS1, ITS2, Probable correct identification, Barcode gap, Basidiomycota

BackgroundFungi are one of the major eukaryotic lineages that areequivalent in species number to animals but exceed thatof plants [1]. Fungi are among the most important organ-isms in the world because of their vital roles in decompos-ition, nutrient cycling, and obligate mutualistic symbioseswith plants, algae, and cyanobacteria [2]. Fungi also havegreat economic importance for industrial fermentation,

pharmaceutical, and biotechnological industries [3]. Theymay also cause food spoilage and diseases in plants andanimals [4]. The diversity of activities is reflected in thehigh number of taxa, morphologies, habitats, and lifestrategies used by this group of organisms. Further studiesare necessary to better understand their complex interac-tions with other organisms and environments.The phylum Basidiomycota is the second largest of

the Fungi kingdom and comprises approximately 30%of all described fungal species [5]. This diverse phylumincludes primarily macroscopic but also microscopicfungi, such as mushrooms and basidiomycotan yeasts,

* Correspondence: [email protected] Federal de Minas Gerais, Departamento de Microbiologia, Av.Antônio Carlos, Belo Horizonte 6627, 31270-901, MG, BrazilFull list of author information is available at the end of the article

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, andreproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link tothe Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Badotti et al. BMC Microbiology (2017) 17:42 DOI 10.1186/s12866-017-0958-x

Page 2: Effectiveness of ITS and sub-regions as DNA barcode

respectively [6, 7]; saprotrophs, such as wood-decayingfungi [8]; pathogens of plants [3] and animals [9, 10];and mycorrhizal symbionts [11]. Basidiomycota speciesare grouped into the following subphyla: Agaricomyco-tina, Pucciniomycotina, and Ustilaginomycotina. Thefirst is the largest subphylum with approximately one-thirdof all described fungal species [5, 12]. Thus, a substantialamount of the data that is currently available on diversity,distribution, and sequencing has targeted Agaricomycotina,particularly in the orders of Agaricales, Polyporales, andBoletales. This subphylum is primarily composed of wooddecayers, litter decomposers, and ectomycorrhizal fungi, aswell as pathogens and poisonous, hallucinogenic, or ediblespecies [13].The identification of fungi at the species level is crit-

ical to many research areas, such as health sciences andagriculture, where the determination of causal agents ofdiseases is central to the definition of the suitable treat-ment, elucidation of outbreaks, and transmission mecha-nisms [14, 15]. Furthermore, the understanding of thespecific roles of microorganisms in an ecosystem, theirabundance, and their community composition in eco-logical and biodiversity studies can only be attainedthrough their reliable identification [16]. However, dis-covering and describing all extant fungal species appearschallenging. According to the Dictionary of Fungi, onlyapproximately 100,000 species have been described thusfar [12], and the estimated diversity ranges from 1.5 to5.1 million [1, 17, 18].Morphological characteristics are useful for species de-

scription; however, they may be limited because manymacroscopic structures are produced infrequently andtemporarily [19], and many taxa often harbor crypticspecies complexes [20]. Molecular tools complementingmorphological ones are very promising in identifyingspecies and can be used to rapidly and reliably evaluatebiological diversity. These markers have been applied tothe identification of fungal species since the 1990s [21,22]; however, the strategy based on the sequencing ofstandardized genomic fragments (DNA barcoding) wasrecognized afterwards [23]. The primary difference be-tween molecular identification tools and the “DNAbarcode” approach is that the latter involves the use ofa standard DNA region that is specific for a taxonomicgroup. The use of a segment of the mitochondrial geneencoding the cytochrome c oxidase subunit I (COI) hasbeen proposed for animals [24]. For plants, various locicombinations have been proposed [25]; however, astudy conducted by the Consortium for the Barcode ofLife (CBOL) Plant Working Group agreed that thecombination of sequences of two plastid genes, matKand rbcL, is the most promising plant barcode [26]. In2012, the study conducted by Schoch and colleaguescompared six DNA regions as promising universal

barcodes for fungi. Mitochondrial COI and otherprotein-coding nuclear gene regions were excluded aspotential markers for various reasons such as difficul-ties in amplifying DNA and insufficient variability. Thenuclear ribosomal RNA internal transcribed spacer(ITS) region exhibited the highest probability of correctidentification (PCI) for a wide number of fungal line-ages analyzed and the most clearly defined barcode gap[27]. Since then, the ITS region has been accepted asthe standard barcode marker for fungi. However, athorough study of ITS sequences in the InternationalNucleotide Sequence Database (INSD: GenBank, EMBLand DDBJ) revealed that this region is not equally variablein all groups of fungi [28]. Notably, for some genera ofAscomycota, including Alternaria [29], Aspergillus [30],Cladosporium [31], Penicillium [30], and Fusarium [32],identification using the ITS barcode has been difficult.One advantage of using the ITS region as a standard

marker is that most fungal species have been identifiedbased on this genomic region. GenBank [33] is themost comprehensive and widely used sequence reposi-tory in the field. A database specific for fungal se-quences, the UNITE (User-friendly Nordic ITSEctomycorrhiza Database) has been developed [34].UNITE aims to unify the fungal taxonomic identifica-tion and correct the annotations associated with thetaxonomic names to the greatest extent possible. TheBarcode of Life Data System - BOLD [35] representsanother bioinformatics platform; however, fungi remainunderrepresented in it. BOLD supplies tools for thestorage, quality warranty, and analysis of specimens andsequences to validate a barcode library. To obtain abarcode status on BOLD, sequences must fulfill somerequirements, such as voucher data, collection record,and trace files. In the last few years, the scientific com-munity has observed the rapid improvement of DNAsequencing technologies and the huge volume of datagenerated. Trimming and identifying this enormousamount of data requires bioinformatics tools, such asautomated pipelines and various programs. However,the success of the analysis greatly depends on the cor-rect taxonomic identification of sequences. Specifically,in the case of publicly available fungal ITS sequences,the reliability and technical quality vary significantly[34, 36]. Schoch and colleagues [27] estimate that onlyapproximately 50% of the ITS sequences that are de-posited in public databases are annotated at the specieslevel. Moreover, Nilsson and colleagues [37] estimatedthat more than 10% of these fully identified fungal ITSsequences are incorrectly annotated at the species level.On the other hand, excellent initiatives, such as UNITEand that from NCBI that include a tool which allowsflagging a GenBank sequence with type material [38]have emerged to minimize such a problem.

Badotti et al. BMC Microbiology (2017) 17:42 Page 2 of 12

Page 3: Effectiveness of ITS and sub-regions as DNA barcode

The ITS region comprises two sections (ITS1 andITS2) that flank the conserved 5.8S region. The identifi-cation of multiple species from environmental samples(the DNA metabarcode) requires the use of high-throughput technologies, which may have limitations insequencing read lengths [39]. For such approaches, onlya portion of the ITS region is usually used, the ITS1 orthe ITS2. The efficiency of these sub-regions in the iden-tification of species in many fungal lineages has beenevaluated, and some authors claim that ITS1 is morevariable than ITS2 [28, 40–42]. Others have found op-posite results [43] or that both the sub-regions are suit-able as metabarcoding markers [44, 45]. In a recentwork, Guarnica and colleagues [46] demonstrated thatthe ITS1 region is not more variable than the ITS2region for Cortinarius. Furthermore, the complete ITSregion is highly effective in discriminating among speciesin this highly sampled genus of Basidiomycota.In the present study, an extensive comparative analysis

based on the probability of correct identification (PCI)and barcode gap analyses was performed using atrimmed dataset composed of all Basidiomycota se-quences deposited in GenBank. We evaluated the mostwidely used genomic markers for Fungi (the completeITS region and the ITS1 and ITS2 sub-regions) to deter-mine which is the most suitable for the identification ofBasidiomycota species. Issues related to the need of add-itional molecular barcode markers as well as the taxo-nomic complexities within the subphyla are discussed.

MethodsData acquisition and filteringIn this study, only sequences with complete nuclearribosomal ITS from permanent collections whosetaxonomic identifications were curated by specialists(voucher specimens) and deposited in GenBank [33]were used. Taxonomic information regarding the spec-imens was enriched, when available, from the UNITEdatabase [34]. This step was used after downloadingsequences from GenBank and before logical and qual-ity filters were applied. For this enrichment, we firstlydownloaded the FASTA sequence files from UNITE,and then we generated a tabular file with the UNITEdata, keeping only the access numbers that correspondedto our specimens. Then, we retrieved the information re-lated to sampling area and fungal classification fromUNITE. Finally, we used the UNITE information to enrichthe GenBank information.Quality filters removed sequences with one or more

IUB/IUPAC ambiguous characters, and logic filters en-sured that the sequences were suitable for DNA barcodestudy in accordance with Barcode of Life recommenda-tions (http://www.barcodeoflife.org/). The first logic fil-ter guaranteed that only sequences identified at the

species level were maintained in the database. Therefore,species with inconclusive names ('sp.', 'aff.', 'cf.', and 'un-cultured') were removed. FungalITSExtractor [47] wasused to guarantee that only sequences with completeITS regions were maintained in the database. More than99% of fungal complete ITS sequences deposited in Gen-Bank are shorter than 800 or longer than 400 pb; thusall sequences outside of this interval were excluded fromthe dataset. The low representativeness together withthe potential to distort the multiple sequence alignmentjustified this filter. Only species with specimens collectedfrom at least three different localities were included toguarantee that only distinct and geographically distantspecimens were evaluated and to avoid the possibility ofworking with genetically identical specimens. The list ofall species used to perform the analyses of this study isprovided (Additional file 1). All filters were performedusing custom scripts written in the Perl programminglanguage and are available upon request. The FungalIT-SExtractor software was used to identify and extract theITS, ITS1, and ITS2 regions.

Data AnalysisThe ITS, ITS1, and ITS2 datasets were partitioned inseveral sub-datasets, each containing sequences belong-ing to only one genus. Sequences from each sub-datasetwere aligned using MUSCLE (version 3.8.31) with de-fault parameters [48]. Distance matrices were generatedusing an uncorrected p-distance because it is simple andwithout any biological assumptions [49]. To evaluate thediscriminative power of the three genomic markers, theprobability of correct identification (PCI) was calculatedas the ratio of species successfully identified per totalnumber of species. A species was considered successfullyidentified if the minimum interspecific distance waslarger than its maximum intraspecific distance [50].Custom Perl scripts were written to calculate the dis-tance matrices and the PCI values. Boxplots were plottedin R language.Two statistical analyses were performed to graphically

represent the data, a scatter plot and a dot plot. Thescatter plot aimed to evaluate the correlations betweenthe PCI values for the genomic regions pairwise combi-nations (ITS versus ITS1, ITS versus ITS2, and ITS1versus ITS2), and the Spearman correlation coefficientwas determined. The dot plot was used to compare thePCI with the barcode gap analyses. For this purpose,the PCI values for the four groups previously definedfrom the barcode gap analyses (Groups 1 to 4) wererepresented for each genomic region. All data andgraphics were generated using Minitab (Minitab StatisticalSoftware, version 17.3.1, State College, Pennsylvania:Minitab Inc., 2016).

Badotti et al. BMC Microbiology (2017) 17:42 Page 3 of 12

Page 4: Effectiveness of ITS and sub-regions as DNA barcode

ResultsOur primary dataset was comprised of all complete ITS(ITS1 + 5.8S + ITS2) sequences of Basidiomycota and con-sisted of 37,699 sequences. The exclusion of sequenceswithout the field ‘specimen_voucher’ in GenBank file re-duced the number to 37,342. Removing sequences withambiguous nucleotides led to 27,459 sequences, and re-moving sequences with inconclusive species names resultedin 21,238 sequences. After applying FungalITSExtractor,19,578 sequences remained. ITS sequences with less than400 bp and more than 800 bp were also excluded from thedataset, as well as ITS1 and ITS2 sequences less than100 bp, leaving 19,149 sequences. The last filter was usedto ensure that only species with at least three sequencescollected from different geographic locations were retainedin the dataset. Because most of the sequences did not in-clude information regarding their origin, our final datasethad this number reduced to 7,731 sequences from 112countries from six continents. This dataset was used to

perform all subsequent analyses and represented three sub-phyla, five classes, 25 orders, 73 families, 211 genera, and936 species (Additional file 2). This dataset has 167 se-quences whose DNA were originated from biological speci-mens considered as type material. Many sequences fromtype materials were not included in our dataset only be-cause they did not pass in quality and logic filters.Although GenBank is known to be the most complete

available public database, the amount of sequences isbiased in our trimmed dataset as follows: 93.1% (7,197sequences) belong to species of Agaricomycotina,whereas only 5.7% (442 sequences) come from Puccinio-mycotina and 1.2%. (92 sequences) from Ustilaginomy-cotina. When other taxonomic ranks were analyzed, asimilar distribution was observed with the vast majorityof species belonging to Agaricomycotina (Fig. 1). Insidethe subphyla, the imbalance in the amount of sequencesis also enormous. For example, in Agaricomycotina, wefound very well represented taxa (such as Cortinarius,

Fig. 1 Pie charts represent abundance (number) of sequences (a) and species (b) for the three subphyla represented in the dataset used in thisstudy. The histograms show the number of species and sequences distributed for genera belonging to Agaricomycotina (c), Pucciniomycotina (d)and Ustilagomycotina (e) phylum

Badotti et al. BMC Microbiology (2017) 17:42 Page 4 of 12

Page 5: Effectiveness of ITS and sub-regions as DNA barcode

with 829 sequences from 124 species) and others thatwere poorly represented (such as Auriscalpium, withonly one species represented by three sequences). Mostof the genera from the Agaricomycotina dataset wereunderrepresented; 126 of 194 had 20 or fewer sequences,whereas only 16 genera were represented by more than100 sequences (Fig. 1 and Additional file 2).The probability of correct identification (PCI) for the

three genomic regions under study was estimated usingour trimmed dataset (7,028 sequences from 113 genera).The number of genera analyzed decreased comparedwith the original dataset (211 genera) because we neededat least two species to estimate intraspecific and inter-specific distances. Moreover, the sequences identified astype material are distributed in 27 distinct genera (23.9%of total) (Additional File 3), and only 25 sequences withRefSeq accessions interchangeably with GenBank num-bers were identified (Additional File 4). This representedapproximately only 0.36% of the sequences that com-prised the dataset used to estimate PCI and barcode gapindices.The mean PCI value for the complete ITS region was

63%, those for the sub-regions were slightly smaller asfollows: 59% for ITS1 and 58% for ITS2. For the ITS re-gion, 53.1% of the genera had PCI values higher than themean, whereas for ITS1 and ITS2, these values were46% and 48%, respectively (Table 1). The pairwise correl-ation between the three markers (ITS versus ITS1, ITSversus ITS2 and ITS1 versus ITS2) was estimated con-sidering the PCI values of all genera composing thedataset. The comparisons between complete ITS and thesub-regions showed most of the data on or near the re-gression line, meaning that most of the PCI values weresimilar for the genera (Spearman correlation factor forITS versus ITS1 = 0.8825 and for ITS versus ITS2 =0.9102). When the sub-regions were associated (ITS1versus ITS2), the distribution of data had a different pro-file and a lower correlation was observed (0.8158)(Fig. 2). The pairwise correlation between the genomicregions was carried out at the subphylum level; however,there were no observable patterns at this taxonomiclevel.Based on the analysis of the barcode gaps, we assessed

and compared the efficiency of the three genomicmarkers for the identification of Basidiomycota. Thus,we classified the marker performance into the followingthree distinct categories: good, intermediate, or poor.When a clear barcode gap was present (e.g., Agaricus,Fig. 3a), we conventionally stated that the identificationwas good, even if outliers were overlapping. The genomicmarkers were considered intermediate if the whiskersfrom an intraspecific distance overlap those from an in-terspecific distance (e.g., Hebeloma, Fig. 3b), and poor ifthe boxes overlap or the intraspecific distance values

were superior to those of interspecific distance (e.g., Lac-tarius, Fig. 3c). For most of the genera (91.5%) evaluated,the three genomic regions performed similarly, i.e., whenthe identification is good for one region, it is also goodfor the others. The same occurred when the perform-ance was intermediate or poor. However, for some gen-era, we found some genomic regions with superioridentification performance than others. For instance, thecomplete ITS had a clearer barcode gap for the generaAuricularia, Flammulina, Lentinellus, Microbotryum,Parasola, and Tuberculina compared with the ITS1 orITS2 sub-regions. ITS1 performed better than the otherregions in the identification of species from the generaHygrophorus and Stephanospora, as well as ITS2 for thespecies belonging to the genera Amanita, Amyloporia,Fomitopsis, Scleroderma, and Strobilurus (Table 2,Group 2). In some instances, one of the three geneticmarkers performed worse than the other(s). The ITS1sub-region is not sufficient to differentiate the species ofthe genera Collybia and Pleurotus, and the ITS2 is not agood marker for Sebacina, Hydnellum, or Vuilleminia.Finally, it is important to note that for 11 out of the 113genera evaluated (Botyriboletus, Clavulina, Crepidotus,Hohenbuehelia, Hydnum, Laccaria, Lactarius, Mucidula,Peniophorella, Phaeocollybia, and Pisolithus), none ofthe complete ITS, ITS1 or ITS2 sub-regions could beused to differentiate the species based on the barcodegap analyses (Table 2, Group 4). For a detailed classifica-tion of genera considering their barcodes, see Table 2and Additional File 5, where the boxplots for all ana-lyzed genera are shown.The results of barcode gap analyses were compared

with the PCI values for each genus using a dot plot(Fig. 4). For the genera for which the three genomicmarkers were classified as good in barcode gap analyses(Group 1, Table 2), most of the genera exhibited PCIabove the mean value (63%); however, some disagree-ments were found. Some genera within this group had aPCI equal to zero (Datronia, Hygrocybe, Tecaphora, andTelephora) or between 20 and 50% (Chroogomphus,Coprinopsis, Lactifluus, Melampsora, Phellinus, Pilo-derma, Puccinia, Russula, Tilletia, Xerocomus, and Xer-omphalina) (Fig. 4a). When the group for which one ortwo genomic regions showing a clearer barcode gap(Group 2, Table 2) was compared with the PCI, most ofthe genera had a PCI below the mean value (Fig. 4b).When the group for which most of the genomic regionsshowed an intermediate barcode gap (Group 3, Table 2),only Lentinus and Hyphoderma had higher PCI thanmean value (both for the ITS2 region, Fig. 4c). Whenthe groups for which all three genomic regions wereclassified as poor markers considering the barcode gap(Group 4, Table 2), most of the genera also had a PCIbelow the mean value (Fig. 4b) with the exception of

Badotti et al. BMC Microbiology (2017) 17:42 Page 5 of 12

Page 6: Effectiveness of ITS and sub-regions as DNA barcode

Table 1 Probable Correct Identification (PCI) values (%) for all ofthe Basidiomycota genera from our trimmed dataset. The PCIvalues were estimated for the three genomic regions studied,the complete ITS region (ITS1 + 5.8S + ITS2) and the sub-regionsITS1 and ITS2

Genera ITS (ITS1 + 5.8S+ ITS2) ITS1 ITS2

Agaricus 100 100 100

Alnicola 44 33 33

Amanita 24 27 27

Amyloporia 25 25 25

Antherospora 100 100 100

Antrodia 70 70 70

Antrodiella 75 50 75

Armillaria 20 20 20

Auricularia 80 40 60

Boletus 32 26 47

Butyriboletus 67 33 67

Calvatia 0 0 0

Ceriporiopsis 100 100 100

Chlorophyllum 100 100 100

Chroogomphus 50 50 50

Clavaria 100 100 100

Clavulina 33 33 33

Collybia 50 50 50

Coprinopsis 50 50 50

Cortinarius 36 45 37

Crepidotus 50 50 50

Cystoderma 50 50 33

Cystodermella 100 100 100

Datronia 0 50 0

Endoraecium 100 100 100

Entoloma 100 86 93

Entyloma 100 100 100

Exobasidium 100 100 100

Favolus 100 100 100

Fibroporia 100 100 67

Flammulina 100 50 75

Fomitopsis 50 25 50

Fuscoporia 100 100 100

Ganoderma 43 43 43

Geastrum 100 67 67

Gloeophyllum 100 100 100

Gymnopilus 100 100 50

Gymnopus 67 67 60

Hebeloma 42 37 32

Helicobasidium 100 75 100

Hohenbuehelia 0 0 0

Table 1 Probable Correct Identification (PCI) values (%) for all ofthe Basidiomycota genera from our trimmed dataset. The PCIvalues were estimated for the three genomic regions studied,the complete ITS region (ITS1 + 5.8S + ITS2) and the sub-regionsITS1 and ITS2 (Continued)

Hydnellum 56 56 56

Hydnum 50 50 50

Hygrocybe 0 0 0

Hygrophorus 67 67 33

Hymenopellis 100 100 100

Hyphoderma 60 60 80

Hyphodermella 100 100 100

Hypholoma 0 0 0

Inocybe 30 19 28

Laccaria 0 0 13

Lactarius 37 37 41

Lactifluus 50 50 50

Lentinellus 43 43 29

Lentinus 57 57 71

Lepiota 92 83 75

Lepista 100 100 100

Leucoagaricus 90 100 80

Leucopaxillus 100 100 100

Lycoperdon 100 100 100

Lyomyces 100 100 100

Lyophyllum 100 100 67

Macrolepiota 50 50 33

Megacollybia 83 83 33

Melampsora 40 40 20

Melanoleuca 38 38 25

Microbotryum 70 80 50

Mucidula 0 0 0

Mycena 22 22 33

Neofavolus 100 100 100

Octaviania 100 100 100

Oligoporus 100 100 100

Parasola 100 67 67

Paxillus 0 0 33

Peniophorella 50 50 50

Phaeocollybia 0 0 0

Phanerochaete 67 67 67

Phellinus 100 33 67

Phellodon 86 86 71

Piloderma 50 50 50

Pisolithus 0 0 0

Pleurotus 29 43 14

Pluteus 27 27 23

Badotti et al. BMC Microbiology (2017) 17:42 Page 6 of 12

Page 7: Effectiveness of ITS and sub-regions as DNA barcode

Butyriboletus (with a PCI value above the mean for ITSand ITS2, Fig. 4d).

DiscussionThe accepted DNA barcode for Fungi is the rDNA ITSregion [27]. ITS is recognized as a fungal barcode be-cause it is the most sequenced region of fungi and isroutinely used for systematics, phylogenetics, and identi-fication [51, 52]. In this study, we downloaded allcomplete ITS sequences of species belonging to thephylum Basidiomycota from GenBank. Although this isthe most complete repository of available ITS sequences,misidentifications or low-quality sequencing have been

encountered in this public database [37]. However, someauthors think that it is unrealistic that future databasesor even a barcode database could be more reliable thanGenBank because misidentified sequences would be ascommon as they are currently and because vouchers willnot be re-identified by taxonomic experts (for a widediscussion, see [53–55]). To overcome this drawback, lo-gical and quality filters were applied to our original data-set to obtain the most reliable results possible. Therestrictiveness in the filtering step aimed to create ahigh-quality dataset (accurate taxonomic annotation andpresence of relevant metadata) that would meet the theor-etical assumptions of the biological system of identifica-tion via DNA barcode and the principles recommended inBOLD Systems [24, 35].More than 90% of our trimmed dataset belonged to

the subphylum Agaricomycotina. This result is not sur-prising because it reflects the high diversity of this taxoncompared with the other subphyla, which is widely men-tioned in the literature [5, 12]. Kirk and colleagues [12]estimated that one-fifth of all known fungal species de-scribed belong to the Agaricomycete clade; this diversityis considered to be underestimated because new taxa arecontinually being described [1, 56]. This discrepancy inthe amount of species and sequences from the subphylamay reflect a natural event or may occur due to the spe-cific interests of the scientific community in Agaricomy-cotina species.Some criteria have been traditionally used to test the

DNA barcoding efficacy to classify and/or identify speci-mens at the species level, such as similarity measures,tree-based techniques, and identification based on directsequence comparison [57, 58]. However, all of these ap-proaches present several issues (see [55] for a detaileddiscussion). Similarity measures are generally used tocluster sequences in “molecular operational taxonomicunits”; however, the choice of the threshold value fordistinguishing intraspecific and interspecific distances islargely arbitrary [58, 59]. An important and acceptablemeasure of the efficacy of a genetic marker should re-flect the probability of correctly identifying a species.This concept has emerged as the probability of correctidentification (PCI) [50, 53, 55, 60]. However, there is noconsensus for the definition and calculation of PCI,which currently embraces a broad class of measures. Inthis work, we assume the concept described by Hollings-worth and colleagues [50] in which the authors consid-ered the “discrimination as successful if the minimumuncorrected interspecific p-distance involving a specieswas larger than its maximum intraspecific distance” tomeasure the PCI for each genus included in our dataset.Furthermore, the use of genetic distances enables theobservation of the ‘barcoding gap’, which is possible byplotting the intraspecific and interspecific distances.

Table 1 Probable Correct Identification (PCI) values (%) for all ofthe Basidiomycota genera from our trimmed dataset. The PCIvalues were estimated for the three genomic regions studied,the complete ITS region (ITS1 + 5.8S + ITS2) and the sub-regionsITS1 and ITS2 (Continued)

Polyporus 100 100 86

Porodaedalea 100 100 100

Postia 50 50 50

Psathyrella 100 100 100

Psilocybe 100 100 100

Puccinia 43 43 43

Pycnoporellus 100 100 100

Ramaria 33 33 33

Resinicium 100 100 100

Rhizopogon 27 27 27

Rhodocollybia 100 100 100

Rigidoporus 100 100 50

Russula 38 27 36

Sarcodon 86 86 86

Scleroderma 33 33 33

Sebacina 33 0 33

Stephanospora 80 80 20

Strobilurus 67 67 67

Suillus 83 83 83

Thecaphora 100 100 100

Thelephora 0 0 0

Tilletia 100 33 100

Tomentella 25 50 25

Trametes 42 42 42

Tricholoma 43 57 29

Tricholomopsis 100 100 100

Tuberculina 67 0 67

Vuilleminia 33 33 33

Xerocomus 100 50 100

Xeromphalina 100 100 50

Badotti et al. BMC Microbiology (2017) 17:42 Page 7 of 12

Page 8: Effectiveness of ITS and sub-regions as DNA barcode

Therefore, an ideal barcode marker would reveal intra-specific divergences lower than interspecific divergences[61].In this study, we aimed to identify the most suitable

genomic marker (complete ITS, ITS1 or ITS2) to iden-tify fungal species belonging to Basidiomycota. Our find-ings, based on PCI and barcode gap analyses, indicatedthat for most of the genera, the three genomic regionsperform similarly, i.e., when one genomic region wasconsidered a good marker (a PCI above the mean valueor the presence of a clear barcode gap) the other regions

were also; the same was observed when the performanceof genomic markers was considered insufficient. Whenthe performance of the genomic markers was individu-ally evaluated, barcode gap analyses provided a more op-timistic view than PCI values. Approximately half of thegenera exhibited PCI values lower than the mean (63%);however, the three genomic regions were classified asgood for most of the genera (Table 2) when the barcodegap is taken into account. Accordingly, the comparisonbetween barcode gap and PCI for each genus showedsome disagreements. This was primarily observed for

Fig. 2 Pairwise correlations (a, ITS X ITS1, b, ITS X ITS2 and c, ITS1 X ITS2) between PCI values of all genera from our dataset

Fig. 3 Examples of the barcode gap performance classifications used in this study. a. Clear barcode gap (identification performance classified asgood) for the genera Agaricus, b. Intermediate separation between the intra- and interspecific distances for Hebeloma and c. A poor barcode gapfor Lactarius

Badotti et al. BMC Microbiology (2017) 17:42 Page 8 of 12

Page 9: Effectiveness of ITS and sub-regions as DNA barcode

some of the genera that showed good identification per-formance using the barcode gap but had low PCI values(Fig. 4a). The opposite, i.e., high PCI values and pooridentification performance via barcode gap, was ob-served for only one genus, Botyriboletus (Fig. 4d).Initially, the low PCI values found for some genera

(such as Calvatia, Datronia, Hygrocybe, Hohenbuehelia,Hypholoma, Mucidula, and Pisolithus) could be ex-plained by dataset features, such as the low number of

species (genera represented by sequences from only twospecies) and/or by the high number of outliers, whichwould have distorted the PCI estimates. Additionally,the taxonomy appears very complex for many of thegenera for which the identification performance usingITS and sub-regions were insufficient. Taxonomy issuesfor two genera (Hygrocybe and Thelephora) for whichPCI values were low and three genera (Hypholoma,Phaeocollybia, and Pisolithus) for which both PCI and

Fig. 4 PCI values for the genera classified in the Group 1 (a), Group 2 (b), Group 3 (c) and Group 4 (d) are represented for the ITS, ITS1 and ITS2genomic region

Table 2 Grouping of Basidiomycota genera based on the barcode gap analyses (See Additional file 5 for more details)

Group 1 Group 2 Group 3 Group 4

Genera for which all the threegenetic regions are good markers

Genera for which one or two geneticregions showed a clearer barcodegap and are recommended overthe other (s)

Genera for which most of thegenetic regions showed intermediatebarcode gap and their use shouldbe carefully evaluated

Genera for which all threegenetic regions are poormarkers

Agaricus, Antherospora, Antrodia,Ceriporiopsis, Chroogomphus, Clavaria,Coprinopsis, Cystodermella, Datronia,Edoraecium, Entoloma, Entyloma,Exobasidium, Favolus, Fibroporia,Fuscoporia, Geastrum, Gloeophyllum,Helicobasidium, Hygrocybe,Hymenopellis, Hyphodermella, Lactifluus,Lepiota, Lepista, Leucopaxillus,Lycoperdon, Lyomyces, Lyophyllum,Melampsora, Neofavolus, Octaviania,Oligoporus, Phanerochaete, Phellinus,Phellodon, Piloderma, Polyporus,Porodaedalea, Psathyrella, Psilocybe,Puccinia, Pycnoporellus, Resinicium,Rhodocollybia, Russula, Sarcodon,Suillus, Telephora, Thecaphora, Tilletia,Tricholomopsis, Xerocomus,Xeromphalina

Amanita (ITS2), Amyloporia (ITS2),Antrodiella (ITS and ITS2), Auricularia(ITS), Calvatia (ITS and ITS1),Chlorophyllum (ITS and ITS1),Flammulina (ITS), Fomitopsis (ITS2),Ganoderma (ITS and ITS1), Gymnopilus(ITS and ITS1), Gymnopus (ITS andITS1), Hygrophorus (ITS1), Lentinellus(ITS), Leucoagaricus (ITS and ITS1),Macrolepiota (ITS and ITS1),Megacollybia (ITS and ITS1),Microbotryum (ITS), Parasola (ITS),Postia (ITS and ITS2), Rigidoporus (ITSand ITS1), Scleroderma (ITS2),Stephanospora (ITS1), Strobilurus (ITS2),Tuberculina (ITS)

Alnicola, Armillaria, Boletus, Collybia,Cortinarius, Cystoderma, Hebeloma,Hydnellum, Hyphoderma, Hypholoma,Inocybe, Lentinus, Melanoleuca,Mycena, Paxillus, Pleurotus, Pluteus,Ramaria, Rhizopogon, Sebacina,Tomentella, Tricholoma, Trametes,Vulleminia

Butyriboletus, Clavulina,Crepidotus, Hohenbuehelia,Hydnum, Laccaria, Lactarius,Mucidula, Peniophorella,Phaeocollybia, Pisolithus

Badotti et al. BMC Microbiology (2017) 17:42 Page 9 of 12

Page 10: Effectiveness of ITS and sub-regions as DNA barcode

barcode gap analyses proved that ITS, ITS1 and ITS2are not sufficient markers for the identification of spe-cies are discussed below based on pertinent literature.Hygrocybe species exhibit extremely high variability in

the ITS region, with sequences diverging by more than25%. Thus, the use of additional DNA barcode markershas been proposed to re-evaluate the taxonomy of thisgenus [62, 63]. Moreover, significant changes in the clas-sification of Hygrocybe, such as its division, are expected[64].The phylogenetic relationships between and within

species of Thelephora are also doubtful with ITS. Theexistence of cryptic species was described, and the im-portance of integrating morphological and moleculardata, as well as employing a meaningful number of sam-ples for the accurate identification is highlighted [65].Hypholoma has been poorly studied. However, a recentstudy based on the morphological and molecular aspectsof H. cinnabarinum samples showed that this species isnot a member of the genus Hypholoma but belongs in-stead to Agaricus [66]. The ecological role of Phaeocolly-bia is uncertain. Smith [67] argues that the genusharbors both saprobes and mycorrhiza formers. Singer[68] considered that members of the genus were notobligatorily ectomycorrhizal, whereas Norvell [69] pre-sented evidence for the consideration of Phaeocollybiaas a mycorrhizal genus. At the taxonomic level, thecomplexity remains, as may be exemplified in Norvell[70]. The author proposed the re-evaluation of the genusPhaeocollybia by revealing four new agaric species mor-phologically similar to Phaeocollybia kauffmanii. Thewide genetic divergence among Pisolithus ITS sequences[71–73] indicates significant evolutionary divergence andsuggests that this genus encompasses a species complex.This hypothesis was reinforced by Kope and Fortin [74]who separated three groups of Pisolithus using incom-patibility tests and basidiospore spine morphology.According to Bickford and colleagues [20], cryptic

species are two or more distinct species that are erro-neously classified under one species name. Large intra-specific genetic distances associated with morphologicaland geographical discrete differences have revealed abroad range of cryptic species for many organisms andhabitats [75, 76]. Although our knowledge of fungalspecies remains limited, the presence of cryptic speciesinside the group is well recognized [20] and was subse-quently described for many of the genera covered inthis study.The use of molecular techniques, primarily DNA se-

quences, generates information to re-evaluate classifica-tions and provides more accurate species delimitations[77]. Currently, the utility of DNA barcoding is evident.However, a universal barcode for the clear identificationof all fungal species does not appear feasible, and

secondary barcodes for Fungi have already been pro-posed [78]. In addition to the known limitations of ITSbarcodes for some genera of Ascomycota, our resultsindicated that for some genera of Basidiomycota, suchas Hygrocybe and Pisolithus, additional barcode markersmay contribute to a clear elucidation of the complex re-lationships between and within species. The failure tocorrectly identify biological species hampers the effortsof the scientific community to conserve, study, orutilize them. Future research in this field should in-clude discovering characteristics that natural selectionacts upon [20].

ConclusionsProgress in many research areas fundamentally dependson the rapid and reliable identification of biological spe-cies. Most fungal diversity is unknown, and issues re-lated to the conservation of these organisms are urgent;thus, studies related to species identification are crucial.Knowledge regarding the efficiency and limitations ofthe barcode markers that are currently used for specificgroups of organisms optimize the work of many studies.Therefore, the present study contributes to the rationalselection of barcode markers of species belonging to thephylum Basidiomycota.

Additional files

Additional file 1: List of species used in this study, their accession andtaxon ID in GenBank and taxonomic affiliations. (XLSX 345 kb)

Additional file 2: Number of species and sequences (specimens)recovered to each genus and their taxonomic affiliations. Data werecompiled from our trimmed dataset. (DOCX 36 kb)

Additional file 3: List of genera with sequences originated from typespecimens and their PCI values (ITS, ITS1, ITS 2) and groups according tothe barcode gap analysis. (DOCX 61 kb)

Additional file 4: List of sequences with RefSeq accessionsinterchangeably with GenBank numbers. (DOCX 110 kb)

Additional file 5: Barcode gap of all the 113 genera studied for ITS, ITS1and ITS2 genomic regions by plotting intra- and interspecific distances.(DOCX 9035 kb)

AbbreviationsBOLD: The barcode of life data system; CBOL: Consortium for the barcode oflife; INSD: International nucleotide sequence database; ITS: Nuclear ribosomalRNA internal transcribed spacer region; MUSCLE: Multiple sequencecomparison by log-expectation; PCI: Probability of correct identification;UNITE: User-friendly nordic its ectomycorrhiza database

AcknowledgementsWe thank all who contributed directly or indirectly to this work, especiallythe CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico),FIOCRUZ-MG (Fundação Oswaldo Cruz, Minas Gerais), CEFET-MG (CentroFederal de Educação Tecnológica de Minas Gerais), Vale Institute of Technology,and the Graduate Programs of Microbiology and Bioinformatics of theUniversidade Federal de Minas Gerais (UFMG).

FundingThis work was supported by grants from Conselho Nacional de DesenvolvimentoCientífico e Tecnológico (CNPq, 308148/2013-4 and 564944/2010-6).

Badotti et al. BMC Microbiology (2017) 17:42 Page 10 of 12

Page 11: Effectiveness of ITS and sub-regions as DNA barcode

Availability of data and materialThe data set supporting the results of this article is presented in the mainpaper or as additional files. Moreover, the reader can contact thecorresponding author to get the information needed.

Authors’ contributionsFB analyzed the data and drafted the manuscript. FSO wrote the scripts,downloaded and filtered the dataset. CFG and ABMV worked on statisticalanalyses. PLCF and LN assisted with the data analyses. GO and AGNdesigned the analyses, analyzed and discussed the data. All authors read andapproved the final manuscript.

Competing interestsThe authors declare that they have no competing interests.

Consent for publicationNot applicable.

Ethics approval and consent to participateNot applicable.

Author details1Centro Federal de Educação Tecnológica de Minas Gerais (CEFET-MG),Departamento de Química, 30.421-169 Belo Horizonte, MG, Brazil. 2FundaçãoOswaldo Cruz (FIOCRUZ), Centro de Pesquisas René Rachou – CPqRR,30190-002 Belo Horizonte, MG, Brazil. 3Universidade Federal de Minas Gerais,Departamento de Microbiologia, Av. Antônio Carlos, Belo Horizonte 6627,31270-901, MG, Brazil. 4Faculdade de Minas (FAMINAS), 66055-090 BeloHorizonte, MG, Brazil. 5Faculdade Promove de Tecnologia, 30140-061 BeloHorizonte, MG, Brazil. 6Instituto Tecnológico Vale, 66055-090 Belém, PA, Brazil.

Received: 30 November 2016 Accepted: 14 February 2017

References1. Blackwell M. The fungi: 1, 2, 3 … 5.1 million species? Am J Bot. 2011;98(3):

426–38.2. Gadd GM. Fungi and their role in the biosphere. In: Jorgensen SE, Brian F,

editors. Encyclopedia of ecology. Amsterdam: Elsevier; 2013. p. 1709–17.3. Lane CR, Beales PA, Hughes KJD. Fungal plant patogens. Wallingford: CAB

International; 2012. p. 2012.4. Fisher MC, Henk DA, Briggs CJ, Brownstein JS, Madoff LC, McCraw SL, Gurr

SJ. Emerging fungal threats to animal, plant and ecosystem health. Nature.2012;484(7393):186–94.

5. Hibbett DS. Major events in the evolution of the Fungi. In: Losos J, editor.Princeton Guide to Evolution, Princeton University Press, vol. 2014. 2014. p.152–8.

6. Morin E, et al. Genome sequence of the button mushroom Agaricusbisporus reveals mechanisms governing adaptation to a humic-richecological niche. Proc Natl Acad Sci U S A. 2012;109(43):17501–6.

7. Stajich JE, et al. Insights into evolution of multicellular fungi from theassembled chromosomes of the mushroom Coprinopsis cinerea (Coprinuscinereus). Proc Natl Acad Sci U S A. 2010;107(26):11889–94.

8. Floudas D, et al. The Paleozoic origin of enzymatic lignin decompositionreconstructed from 31 fungal genomes. Science. 2012;336(6089):1715–9.

9. Brown SM, Campbell LT, Lodge JK. Cryptococcus neoformans, a fungusunder stress. Curr Opin Microbiol. 2007;10(4):320–5.

10. Dawson Jr TL. Malassezia globosa and restricta: breakthrough understanding ofthe etiology and treatment of dandruff and seborrheic dermatitis throughwhole-genome analysis. J Investig Dermatol Symp Proc. 2007;12(2):15–9.

11. Martin F, et al. The genome of Laccaria bicolor provides insights intomycorrhizal symbiosis. Nature. 2008;452:88–92.

12. Kirk PM, Cannon PF, Minter DW, Stalpers JA. Dictionary of the fungi. 10thed. Wallingford: CABI; 2008.

13. Hibbett DS. A phylogenetic overview of the agaricomycotina. Mycologia.2006;98(6):917–25.

14. Araujo R. Towards the genotyping of fungi: methods, benefits andchallenges. Cur Fung Infect Rep. 2014;8(3):203–10.

15. McNeil M, Roberts AMI, Cockerell V, Mulholland V. Real-time PCR assay forquantification of tilletia caries contamination of UK wheat seed. Plant Pathol.2004;53(6):741–75.

16. Peay KG, Kennedy PG, Bruns TD. Fungal community ecology: a hybrid beastwith a molecular master. Bioscience. 2008;58(9):799–810.

17. Hawksworth DL. The fungal dimension of biodiversity: magnitude,significance and conservation. Mycol Res. 1991;95:641–55.

18. Hawksworth DL. The magnitude of fungal diversity: the 1.5 million speciesestimate revisited. Mycol Res. 2001;105(12):1422–32.

19. Slepecky RA, Starmer WT. Phenotypic plasticity in fungi: a review withobservations on Aureobasidium pullulans. Mycologia. 2009;101:823–32.

20. Bickford D, Lohman DJ, Sodhi NS, Ng PK, Meier R, Winker K, Ingram KK, DasI. Cryptic species as a window on diversity and conservation. Trends EcolEvol. 2007;22(3):148–55.

21. White TJ, Bruns T, Lee S, Taylor J. Amplification and direct sequencing offungal ribosomal RNA genes for phylogenetics. In: Innis MA, Gelfand DH,Sninsky JJ, White TJ, editors. PCR Protocols: a Guide to Methods andApplications. 1990. p. 315–22.

22. Bruns TD, White TJ, Taylor JW. Fungal molecular systematics. Annu Rev EcolSyst. 1991;22:525–64.

23. Hollingsworth PM. DNA barcoding: potential users. Genom Soc Pol. 2007;3:44–7.24. Hebert PD, Cywinska A, Ball SL, deWaard JR. Biological identifications

through DNA barcodes. Proc Biol Sci. 2003;270(1512):313–21.25. Pennisi E. Taxonomy. Wanted: a barcode for plants. Science. 2007;318:190–1.26. Hollingsworth PM, et al. A DNA barcode for land plants. Proc Natl Acad Sci

U S A. 2009;106:12794–7.27. Schoch CL, Seifert KA, Huhndorf S, Robert V, Spouge JL, Levesque CA, Chen

W. Fungal barcoding consortium. Nuclear ribosomal internal transcribedspacer (ITS) region as a universal DNA barcode marker for fungi. Proc NatlAcad Sci U S A. 2012;109(16):6241–6.

28. Nilsson RH, Kristiansson E, Ryberg M, et al. Intraspecific ITS variability in thekingdom fungi as expressed in the international sequence databases and itsimplications for molecular species identification. Evol Bioinforma. 2008;4:193–201.

29. Pryor B, Michailides T. Morphological, pathogenic, and molecularcharacterization of alternaria isolates associated with alternaria late blight ofpistachio. Phytopathology. 2002;92:406–16.

30. Skouboe P, Frisvadm J, Taylor J, Lauritsen D, Boysen M, Rossen L.Phylogenetic analysis of nucleotide sequences from the ITS region ofterverticillate Penicillium species. Mycol Res. 1999;103:873–81.

31. Schubert K, Groenewald J, Braun U, Dijksterhuis J, Starink M, Hill C, Zalar P,de Hoog G, Crous P. Biodiversity in the Cladosporium herbarum complex(Davidiellaceae, Capnodiales), with standardization of methods forCladosporium taxonomy and diagnostics. Stud Mycol. 2007;58:105–56.

32. O’Donnell K, Cigelnik E. Two divergent intragenomic rDNA ITS2 types withina monophyletic lineage of the fungus Fusarium are nonorthologous. MolPhylogenet Evol. 1997;7:103–16.

33. Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW.GenBank. Nucleic Acids Res. 2015;43(Database issue):D30–35.

34. Kõljalg, et al. Towards a unified paradigm for sequence-based identificationof fungi. Mol Ecol. 2013;22:5271–7.

35. Ratnasingham S, Hebert PD. A DNA-based registry for all animal species: thebarcode index number (BIN) system. PLoS One. 2013;8(7):e66213.

36. Bruns TD, Blackwell M, Edwards I, Taylor AF, Horton T, Zhang N, et al.Preserving accuracy in GenBank. Science. 2008;319(5870):1616.

37. Nilsson RH, Ryberg M, Kristiansson E, Abarenkov K, Larsson KH, Koljalg U.Taxonomic reliability of DNA sequences in public sequence databases: afungal perspective. PLoS One. 2006;1:e59.

38. Federhen S. Type material in the NCBI taxonomy database. Nucleic AcidsRes. 2014;43:D1086–98.

39. Cuadros-Orellana S, Leite LR, Smith A, Medeiros JD, Badotti F, Fonseca PL, VazABM, Oliveira G, Góes-Neto A. Assessment of fungal diversity in the environmentusing metagenomics: a decade in review. Fung Genom Biol. 2013;3:110.

40. Mullineux T, Hausner G. Evolution of rDNA ITS1 and ITS2 sequences andRNA secondary structures within members of the fungal generaGrosmannia and Leptographium. Fungal Genet Biol. 2009;46:855–67.

41. Wang XC, Liu C, Huang L, Bengtsson-Palme J, Chen H, Zhang JH, Cai D, LiJQ. ITS1: a DNA barcode better than ITS2 in eukaryotes? Mol Ecol Resour.2015;15(3):573–86.

42. Ryberg M, Kristiansson E, Sjökvist E, et al. An outlook on the fungal internaltranscribed spacer sequences in GenBank and the introduction of a web-based tool for the exploration of fungal diversity. New Phytol. 2009;181:471–7.

43. Bazzicalupo AL, Bálint M, Schmitt I. Comparison of ITS1 and ITS2 rDNA in454 sequencing of hyperdiverse fungal communities. Fungal Ecol. 2013;6(1):102–9.

Badotti et al. BMC Microbiology (2017) 17:42 Page 11 of 12

Page 12: Effectiveness of ITS and sub-regions as DNA barcode

44. Mello A, Napoli C, Murat C, Morin E, Marceddu G, Bonfante P. ITS-1 versusITS-2 pyrosequencing: a comparison of fungal populations in trufflegrounds. Mycologia. 2011;103(6):1184–93.

45. Blaalid R, Kumar S, Nilsson RH, Abarenkov K, Kirk PM, Kauserud H. ITS1versus ITS2 as DNA metabarcodes for fungi. Mol Ecol Resour. 2013;13(2):218–24. doi:10.1111/1755-0998.12065.

46. Garnica S, Schön ME, Abarenkov K, et al. Determining threshold values forbarcoding fungi: lessons from Cortinarius (Basidiomycota), a highly diverse andwidespread ectomycorrhizal genus. FEMS Microbiol Ecol. 2016;92(4):fiw045.

47. Nilsson RH, Bok G, Ryberg M, Kristiansson E, Hallenberg N. A softwarepipeline for processing and identification of fungal ITS sequences. SourceCode Biol Med. 2009;15(4):1.

48. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy andhigh throughput. Nucleic Acids Res. 2004;32(5):1792–7.

49. Russo CAM, Miyaki CY, Pereira SL. Reconstrução filogenética: Métodosgeométricos. In: Matioli SR, editor. Biologia Molecular e Evolução. RibeirãoPreto: Holos Editora; 2012. p. 108–16. 202p.

50. Hollingsworth ML, et al. Selecting barcoding loci for plants: evaluation ofseven candidate loci with species-level sampling in three divergent groupsof land plants. Mol Ecol Resour. 2009;9(2):439–57.

51. Begerow D, Nilsson H, Unterseher M, Maier W. Current state andperspectives of fungal DNA barcoding and rapid identification procedures.Appl Microbiol Biotechnol. 2010;87(1):99–108.

52. Bellemain E, Carlsen T, Brochmann C, Coissac E, Taberlet P, Kauserud H. ITSas an environmental DNA barcode for fungi: an in silico approach revealspotential PCR biases. BMC Microbiol. 2010;10:189.

53. Spouge JL, Marino-Ramirez L. The practical evaluation of DNA barcodeefficacy. Methods Mol Biol. 2012;858:365–77.

54. Seberg O. The future of systematics: Assembling the Tree of Life.Systematist. 2004;23:2–8.

55. Meier R, Shiyang K, Vaidya G, Ng PK. DNA barcoding and taxonomy inDiptera: a tale of high intraspecific variability and low identification success.Syst Biol. 2006;55(5):715–28.

56. Hibbett DS, Ohman A, Glotzer D, Nuhn M, Kirk P, Nilsson RH. Progress inmolecular and morphological taxon discovery in Fungi and options forformal classification of environmental sequences. Fungal Biol Rev. 2011;25(1):38–47.

57. Blaxter M, Mann J, Chapman T, Thomas F, Whitton C, Floyd R, Abebe E.Defining operational taxonomic units using DNA barcode data. Philos TransR Soc Lond B Biol Sci. 2005;360(1462):1935–43.

58. DeSalle R, Egan MG, Siddall M. The unholy trinity: taxonomy, speciesdelimitation and DNA barcoding. Philos Trans R Soc Lond B Biol Sci. 2005;360(1462):1905–16.

59. Will KW, Rubinoff D. Myth of the molecule: DNA barcodes for speciescannot replace morphology for identification and classification. Cladistics.2004;20:47–55.

60. Erickson DL, Spouge JL, Resch A, et al. DNA barcoding in land plants:developing standards to quantify and maximize success. Taxon. 2008;3:1304–16.

61. Meyer CP, Paulay G. DNA barcoding: Error rates based on comprehensivesampling. PLoS Biol. 2005;3:2229–38.

62. Bresinsky A. Beiträge zu einer mykoflora deutschlands (2): Die gattungenhydropus bis hypsizygus mit angaben zur ökologie und verbreitung derarten. Regensburger mykologische schriften band 15. RegensburgischeBotanische Gesellschaft: Regensburg; 2008.

63. Babos M, Halász K, Zagyva T, Zöld-Balogh Á, Szegő D, Bratek Z. Preliminarynotes on dual relevance of ITS sequences and pigments in Hygrocybetaxonomy. Persoonia. 2011;26:99–107. doi:10.3767/003158511X578349.

64. Boertmann D. Fungi of Northern Europe, Volume 1: The Genus Hygrocybe.Oberreifenberg: Koeltz Botanical Books; 2010.

65. Ramírez-López I, Villegas-Ríos M, Salas-Lizana R, Garibay-Orijel R, Alvarez-Manjarrez J. Thelephora versatilis and Thelephora pseudoversatilis: two newcryptic species with polymorphic basidiomes inhabiting tropical deciduousand sub-perennial forests of the Mexican Pacific coast. Mycologia. 2015;107(2):346–58.

66. Su M-S, et al. Hypholoma cinnabarinum revisited: A contribution toknowledge of Agaricus subgenus Lanagaricus section Trisulphurati(Agaricaceae, Agaricales). Mycoscience. 2014;55(6):449–55.

67. Smith AH. 1957. A contribution toward a monograph of Phaeocollybia.Brittonia. 1957. 9:195–217

68. Singer R. The Agaricales in modern taxonomy. 4th ed. Koenigstein: KoeltzScientific Books; 1986.

69. Norvell LL. The biology and taxonomy of Pacific Northwest species ofPhaeocollybia Heim (Agaricales, Cortinariaceae) [PhD dissertation]. Seattle,Washinton: University of Washington; 1998. 391 pp.

70. Norvell L. Phaeocollybia in western North America. I. The Phaeocollybiakauffmanii complex. Can J Bot. 2000;78:1055–76.

71. Anderson IC, Chambers SM, Cairney JWG. Molecular determination ofgenetic variation in Pisolithus isolates from a defined region in New SouthWales, Australia. New Phytol. 1998;138(1):151–62.

72. Martin F, Delaruelle C, Ivory M. Genetic variability in intergenic spacers ofribosomal DNA in Pisolithus isolates associated with pine, eucalyptus andAfzelia in Lowland Kenyan forests. New Phytol. 1998;139:341–52.

73. Díez J, Anta B, Manjón JL, Honrubia M. Genetic variability of Pisolithusisolates associated with native hosts and exotic eucalyptus in the westernMediterranean region. New Phytol. 2001;149:577–87.

74. Kope HH, Fortin JA. Germination and comparative morphology ofbasidiospores of Pisolithus arhizus. Mycologia. 1990;82:350–7.

75. Hebert PD, Penton EH, Burns JM, Janzen DH, Hallwachs W. Ten species in one:DNA barcoding reveals cryptic species in the neotropical skipper butterflyAstraptes fulgerator. Proc Natl Acad Sci U S A. 2004;101(41):14812–7.

76. Grundt HH, Kjolner S, Borgen L, Rieseberg LH, Brochmann C. High biologicalspecies diversity in the arctic flora. Proc Natl Acad Sci U S A. 2006;03(4):972–5.

77. Taylor JW, Jacobson DJ, Kroken S, Kasuga T, Geiser DM, Hibbett DS, FisherMC. Phylogenetic species recognition and species concepts in fungi. FungalGenet Biol. 2000;31(1):21–32.

78. Stielow JB, Levesque CA, Seifert KA, Meyer W, Iriny L, Smits D, Renfurm R,Verkley GJ, Groenewald M, Chaduli D, et al. One fungus, which genes?Development and assessment of universal primers for potential secondaryfungal DNA barcodes. Persoonia. 2015;35:242–63.

• We accept pre-submission inquiries

• Our selector tool helps you to find the most relevant journal

• We provide round the clock customer support

• Convenient online submission

• Thorough peer review

• Inclusion in PubMed and all major indexing services

• Maximum visibility for your research

Submit your manuscript atwww.biomedcentral.com/submit

Submit your next manuscript to BioMed Central and we will help you at every step:

Badotti et al. BMC Microbiology (2017) 17:42 Page 12 of 12