

www.elsevier.com/locate/ympev

Molecular Phylogenetics and Evolution 44 (2007) 325–345

Patterns of evolution of mitochondrial cytochrome c oxidase I and II DNA and implications for DNA barcoding

Amanda D. Roe *, Felix A.H. Sperling

CW403 Biological Sciences Building, Department of Biological Sciences, University of Alberta, Edmonton, Alta., Canada T6G 2E9

Received 9 August 2006; revised 27 November 2006; accepted 7 December 2006
Available online 20 December 2006

Abstract

DNA barcoding has focused increasing attention on the use of specific regions of mitochondrial cytochrome c oxidase I and II genes (COI–COII) to diagnose and delimit species. However, our understanding of patterns of molecular evolution within these genes is limited. Here we examine patterns of nucleotide divergence in COI–COII within species and between species pairs of Lepidoptera and Diptera using a sliding window analysis. We found that: (1) locations of maximum divergence within COI–COII were highly variable among taxa surveyed in this study; (2) there was major overlap in divergence within versus between species, including within individual COI–COII profiles; (3) graphical DNA saturation analysis showed variation in percent nucleotide transitions throughout COI–COII and only limited association with levels of DNA divergence. Ultimately, no single optimally informative 600 bp location was found within the 2.3 kb of COI–COII, and the DNA barcoding region was no better than other regions downstream in COI. Consequently, we recommend that researchers should maximize sequence length to increase the probability of sampling regions of high phylogenetic informativeness, and to minimize stochastic variation in estimating total divergence.
© 2007 Elsevier Inc. All rights reserved.

Keywords: COI; mtDNA; Species delimitation; Divergence; Saturation; Insects

1. Introduction

Molecular systematics is one of the most rapidly expanding fields in biology and yet our understanding of patterns of molecular character evolution remains relatively superficial. Analysis of molecular data has proven to be important for understanding deep phylogenetic relationships (Blair and Hedges, 2005; Regier et al., 2005), examining population structure within a species (Avise et al., 1987; Zhang and Hewitt, 2003), assigning unknown specimens or immatures to reference species (Olson, 1991; Bartlett and Davidson, 1992; Sperling et al., 1994; Hebert et al., 2003a,b), and diagnosing and delimiting cryptic species (Sperling and Hickey, 1994; Goetze, 2003; Hebert et al., 2003a,b; Scheffer et al., 2004; Blair et al., 2005; Hendrixson and Bond, 2005). The use of molecular characters to speed identification of unknowns has proven to be useful and highly effective, and can be achieved using only a small number of molecular differences (Sperling and Hickey, 1994; Wells et al., 2001; Hebert et al., 2003a,b). However, delimiting species ideally requires data from many different sources, such as morphology, behavior, and multiple molecular markers (Funk and Omland, 2003; Dayrat, 2005). Using only mitochondrial characters has always been controversial (Dunn, 2003; Lipscomb et al., 2003; Seberg et al., 2003; Tautz et al., 2003), and recent debates have been particularly heated (Moritz and Cicero, 2004; Ebach and Holdrege, 2005; Hebert and Gregory, 2005; Meyer and Paulay, 2005; Will et al., 2005) in light of the increasing popularity of projects such as the Consortium for the Barcode of Life (http://barcoding.si.edu/).

1055-7903/$ - see front matter © 2007 Elsevier Inc. All rights reserved.

doi:10.1016/j.ympev.2006.12.005

* Corresponding author. Present address: Department of Entomology, 1980 Folwell Avenue, Room 219, University of Minnesota, St. Paul, MN 55108, USA. Fax: +1 612 625 5299.

E-mail address: [email protected] (A.D. Roe).

Studies that use DNA fragments to delimit species, either separately or in combination with morphological and ecological data, have relied on a variety of gene markers. Gene choice and fragment length vary substantially, depending on the evolutionary question and taxon of interest (Caterino et al., 2000; Meyer and Zardoya, 2003). Mitochondrial DNA (mtDNA) genes have long dominated the field of molecular systematics because of their maternal inheritance, limited recombination, rapid evolution, and the robustness of mtDNA against degradation, making them ideal markers for many species-level questions (Avise et al., 1987). Species-level arthropod research has focused on a range of genes, in particular COI, COII, ND5 or 16S (Caterino et al., 2000). Fragment location within these gene regions is also variable, often dependent on the taxonomic group as well as the availability of previously developed primers (Folmer et al., 1994; Simon et al., 1994).

One of the key features of the DNA barcoding project, as proposed by Hebert et al. (2003a,b), is the designation of a single mtDNA fragment at the 5′ end of the cytochrome c oxidase I (COI) gene to act as a ‘barcode’ to identify and delineate all animal life. By choosing a standard DNA fragment, the efforts of multiple research groups can be coordinated, and they are able to construct a more comprehensive library of DNA sequences than would be possible if working independently (Caterino et al., 2000).

Although many aspects of DNA barcoding have been critiqued (Sperling, 2003b; Moritz and Cicero, 2004; Will and Rubinoff, 2004; Ebach and Holdrege, 2005; Will et al., 2005), there has been only limited discussion on the decision to use a specific ~600 bp fragment from the 5′ end of COI as the DNA barcoding region (Erpenbeck et al., 2005). Furthermore, there has been little discussion of the potential effects of using this particular region on delimiting closely related species.

Initially, the 5′ end of COI was chosen as the focal region because it is flanked by two “universal” primers that work for a range of metazoans (Folmer et al., 1994). The need to use widely applicable primers is understandable, but examination of the DNA barcoding literature reveals that the majority of projects actually rely on taxon specific primers, rather than universal primers, in order to optimize PCR performance (Hebert et al., 2004; Penton et al., 2004; Barrett and Hebert, 2005; Hebert et al., 2005), particularly with degraded material (Lambert et al., 2005). In addition, some DNA barcoding projects have used even smaller fragments of COI (Whiteman et al., 2004; Page et al., 2005).

On the other hand, with current sequencing technology, more than 900 bp can be routinely obtained with 98.5% accuracy from a single run (Gunning et al., 2002), so it is not necessary to limit the DNA fragment length to 600 bp. As well, maximum amounts of evidence should ideally be examined when inferring species boundaries (Dayrat, 2005; Will et al., 2005), due to stochastic variation in the genome. Thus it is reasonable to expect projects that rely solely on shorter DNA fragments for delimiting species to be more vulnerable to heterogeneous patterns of nucleotide substitutions within COI. The key question is whether there is a region or length that is optimally informative.

Functional constraints, unequal substitution patterns, and heterogeneous evolutionary rates can all potentially affect the usefulness of COI for species delimitation. The COI–COII genes are composed of interspersed, highly conserved membrane bound regions and variable extramembrane loops (Fig. 1) (Saraste, 1990; Lunt et al., 1996). Previous examinations of COI molecular evolution have shown that this structure leads to functional constraints on the mutation rate in this gene, particularly in the membrane bound regions (Saraste, 1990; Lunt et al., 1996; Caterino and Sperling, 1999), although the majority of studies have focused on amino acid substitution rate, rather than nucleotide change. Studying only amino acid change can overlook unequal substitution patterns, since nucleotide differences between closely related species are generally synonymous third codon position changes. Mutation hot spots, or adaptive substitutions, are known to exist in mtDNA, causing heterogeneous evolutionary rates across genes (Stoneking, 2000; Innan and Nordborg, 2002). DNA substitution patterns may play an important role in shaping phylogenetic relationships and population structure, and may vary between independent lineages and change as taxa become increasingly diverged (Galtier et al., 2006).

Fig. 1. Overview of COI–COII genes, with nucleotide locations as in Drosophila yakuba (Clary and Wolstenholme, 1985). Structural divisions are shown along the upper bar, where grey indicates tRNAs, white indicates membrane bound regions, and solid black indicates non-membrane bound regions. COI–COII regions that are commonly used in insect systematics (Caterino et al., 2000) are identified by their forward and reverse primers (Bogdanowicz et al., 1993; Folmer et al., 1994; Simon et al., 1994).

Molecular characters that are initially used for identifications or diagnoses are often later incorporated into studies with a broader phylogenetic scope. Consequently, it is important to consider how phylogenetic information changes at increasingly deeper divergences. Phylogenetically informative characters gradually accumulate in the form of mutations as lineages diverge and, in a simplistic sense, greater divergences should lead to increased phylogenetic information. However, at the same time the probability of multiple hits at a given site increases, resulting in the loss of informative characters (saturation). The transition/transversion (Ti/Tv) ratio can be used as an indirect measure of saturation since this ratio changes predictably with saturation. Transitions occur much more commonly than transversions at low levels of divergence, particularly in mtDNA, but as two sequences diverge from a common ancestor, transversions gradually increase, obscuring previous changes (Galtier et al., 2006). As the Ti/Tv ratio approaches 1:2 in sequences with equal base frequencies, phylogenetic information is lost, reducing the informativeness of a DNA sequence comparison. Phylogenetic information can thus be considered a tradeoff between overall divergence and Ti/Tv ratio, with different relative amounts of these two quantities necessitating different kinds of phylogenetic analyses (Swofford et al., 1996). We expected that combined comparison of divergence and saturation would be a good visual indicator of the quality of phylogenetic information. We subsequently use the term “DNA saturation analysis” to represent this graphical combination of nucleotide divergence and saturation, and describe the variation in phylogenetic information between DNA sequences.
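The Ti/Tv reasoning above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `count_ti_tv` and `percent_transitions` are hypothetical, and the sketch assumes two pre-aligned sequences in which gaps and ambiguous bases are simply skipped. A Ti/Tv ratio of 1:2 corresponds to roughly 33% transitions among observed differences.

```python
# Illustrative sketch only; names and gap handling are assumptions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    ti = tv = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue  # identical site, gap, or ambiguous base
        # A transition stays within purines (A<->G) or pyrimidines (C<->T).
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ti += 1
        else:
            tv += 1
    return ti, tv


def percent_transitions(seq_a, seq_b):
    """Transitions as a percentage of all observed differences."""
    ti, tv = count_ti_tv(seq_a, seq_b)
    total = ti + tv
    return 100.0 * ti / total if total else float("nan")
```

Plotting such a percentage against divergence for each window would give one possible version of the graphical comparison described above; the paper's own plots may have been produced differently.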

Considering the growth of DNA barcoding and DNA taxonomy (Monaghan et al., 2006), there is an urgent need to understand how patterns of nucleotide substitution within COI–COII can affect the delimitation of closely related species. It is also important to examine whether previous work has used the most informative gene regions. A variety of criteria could be considered when choosing an optimally informative COI–COII region. Such a region should maximize one or more of nucleotide divergence rate, consistency of evolutionary rate, accuracy as an indicator of wider mtDNA divergence, and robustness of phylogenetic signal. Ideally, a region should be chosen to maximize all of these factors simultaneously, although some tradeoffs will be inevitable.

In this study, we compare intra- and interspecific divergences among multiple groups of closely related COI–COII (2.3 kb) sequences to examine changes in signal due to variation in (1) localized nucleotide divergence patterns, (2) localized divergence relative to total COI–COII divergence, (3) fragment length, and (4) localized phylogenetic saturation. In particular, we examine how the COI fragment used by the DNA barcoding project compares to other similarly sized fragments throughout these genes using sliding window analyses. Finally, we provide recommendations for a more accurate and effective DNA region for delimiting closely related species.

2. Methods

2.1. Overview

In order to visualize patterns of nucleotide divergence throughout COI–COII we used sliding window analyses (Rozas et al., 2003). We selected a preset window and step size (600 bp window with 5 bp step), with each window represented by the median nucleotide location, or nucleotide midpoint, and calculated a specific parameter (e.g. nucleotide divergence) for each window. From this, a graphical representation or profile of the parameter along a length of sequences was produced. We compared nucleotide diversity within species and nucleotide divergence between sister species pairs across full COI–COII sequences. Profiles often contained regions of concentrated nucleotide divergence, suggesting the possibility of mutational hot spots. Using randomly generated sequences, we examined whether empirically observed regions of high divergence exceeded the stochastic variation produced by simulated random divergence, which would provide evidence for non-random, biologically significant mutation hot spots. We expressed nucleotide divergence in each window relative to total COI–COII divergence, allowing us to combine and compare all sister species pair profiles, regardless of absolute percent divergence. We also examined how fragment length and location affected relative divergence, and identified COI–COII sequence regions that could act as accurate indicators of total COI–COII nucleotide divergence. Finally, DNA saturation analysis was used to examine the relationship between absolute nucleotide divergence and frequency of transitions along the length of COI–COII for series of increasingly diverged taxa.

2.2. Available material

A total of 114 mtDNA sequences were used in this analysis (Appendix A). These represented all available insect sequences (as of December 2005) that extend across COI–COII and would allow comparison of haplotypes within and between closely related species. These sequences contiguously spanned a 2.3 kb region of mtDNA that begins in the tRNA tyrosine gene, crosses the COI, tRNA leucine, and COII genes, and ends in the tRNA lysine gene. The DNA barcoding region of Hebert et al. (2003a,b) includes most of the 5′ half of COI. All sequences were aligned by eye to the sequence of Drosophila yakuba (accession number: NC_001322) and nucleotide position was numbered based on the D. yakuba system (Clary and Wolstenholme, 1985). The aligned data set contained 2342 bp, although sequence length varied between species, ranging from 2254 to 2320 bp.

A total of 73 taxa (species and sub-species; see Appendix A) were examined from Lepidoptera and Diptera; no additional groups of complete COI–COII insect sequences were available outside these two orders. Eighteen sequences were previously unpublished (Appendix A) and four previously published partial sequences were completed for the full COI–COII region. These were obtained following DNA extraction, PCR amplification and sequencing protocols outlined in Sperling and Hickey (1994). Heterologous primers designed for a range of insect families were used to develop primers for each new insect species (Bogdanowicz et al., 1993; Simon et al., 1994; Sperling and Hickey, 1994). New sequences were deposited in GenBank under accession numbers DQ792576–DQ792593, while extended sequences retained the original accession numbers, and the remaining 92 sequences were already available on GenBank. All accession numbers and associated publications are listed in Appendix A.

To understand the relationship between nucleotide divergence patterns and species delimitation, patterns of change must be examined both within species, and between recognized sister species, since it is at this taxonomic level that identifications and delineations are most likely to be problematic (Sperling, 2003a; Moritz and Cicero, 2004). A total of 23 species with multiple COI–COII sequences (Table 1) were available to examine intraspecific variability. We also examined 23 pairs of sister species for interspecific variability (Table 2). Sister species pairs were chosen based on previously established phylogenetic relationships (see associated literature in Appendix A).

Table 1
Intraspecific nucleotide diversity (%) for 23 species with multiple specimens (N), calculated as the average diversity between all specimens within a species

| Speciesᵃ | N | Total% | Max% | Midpoint(s) | Min% | Midpoint(s) | BC | J–P | G–M | P–E |
|---|---|---|---|---|---|---|---|---|---|---|
| Lepidoptera | | | | | | | | | | |
| Choristoneura a + b lineages | 9 | 1.59 | 2.07 | 2470–2480 | 0.98 | 3187–3212 | 1.78 | 1.76 | 1.24 | 1.45 |
| Choristoneura fumiferana | 2 | 0.09 | 0.17 | 2150–2799, 2940–3534 | 0.00 | 1765–2145, 2750–2935 | 0.00 | 0.17 | 0.17 | 0.17 |
| Dioryctria abietivorella | 3 | 0.03 | 0.11 | 3134–3473 | 0.00 | 1759–3138 | 0.00 | 0.00 | 0.00 | 0.11 |
| Dioryctria auranticella | 2 | 0.09 | 0.33 | 3343–3408 | 0.00 | 1761–3138 | 0.00 | 0.00 | 0.17 | 0.17 |
| Dioryctria okanaganella | 2 | 0.26 | 0.67 | 1760–1880ᵇ | 0.00 | 2710–2890 | 0.67 | 0.17 | 0.17 | 0.17 |
| Dioryctria pentictonella | 3 | 0.35 | 0.67 | 1766–1861, 2611–2666ᵇ | 0.00 | 3212–3473 | 0.56 | 0.67 | 0.22 | 0.00 |
| Dioryctria sylvestrella | 2 | 0.09 | 0.17 | 2471–3070, 3392–3472 | 0.00 | 1761–2466, 3675–3387 | 0.00 | 0.17 | 0.00 | 0.17 |
| Dioryctria yiai | 2 | 0.30 | 0.67 | 3073–2291 | 0.00 | 2701–2991 | 0.50 | 0.33 | 0.17 | 0.33 |
| Feltia jaculifera | 3 | 2.25 | 3.22 | 2602–2607 | 1.67 | 3136–3141 | 2.16 | 3.22 | 1.78 | 2.11 |
| Hyles euphorbiae | 6 | 0.20 | 0.39 | 1994–2204ᶜ | 0.06 | 2594–2664 | 0.28 | 0.06 | 0.29 | 0.23 |
| Hyles gallii | 6 | 0.19 | 0.34 | 3233–3458 | 0.07 | 2011–2046 | 0.15 | 0.18 | 0.2 | 0.34 |
| Hyles robertsi peplidis | 3 | 0.03 | 0.11 | 3419–3473 | 0.00 | 1776–3396 | 0.00 | 0.00 | 0.00 | 0.11 |
| Hyles sammutii | 4 | 1.36 | 1.69 | 2155–2205 | 0.00 | 3015–3121 | 1.57 | 1.31 | 0.94 | 1.25 |
| Hyles tithymali | 12 | 0.12 | 0.22 | 1801–1816 | 0.00 | 2222–2492 | 0.20 | 0.09 | 0.12 | 0.11 |
| Lambdina fiscellaria | 2 | 1.81 | 3.00 | 1862–1992 | 0.83 | 2962–3170 | 2.99 | 1.67 | 0.83 | 1.33 |
| Papilio demodocus | 2 | 0.74 | 1.33 | 2497–2652 | 0.17 | 1767 | 0.5 | 1.33 | 0.83 | 0.50 |
| Papilio demoleus | 2 | 2.71 | 4.50 | 2031–2046ᶜ | 1.50 | 3103–3113 | 3.33 | 3.33 | 1.83 | 1.67 |
| Papilio grosesmithi | 2 | 0.26 | 0.50 | 3471 | 0.00 | 2031–2476, 3093–3168 | 0.17 | 0.17 | 0.17 | 0.33 |
| Papilio machaon | 3 | 0.95 | 1.83 | 1986–2081ᶜ | 0.11 | 2981–3097 | 1.50 | 0.78 | 0.11 | 0.56 |
| Diptera | | | | | | | | | | |
| Anopheles gambiae | 3 | 0.46 | 0.67 | 2352–2527, 3307–3407 | 0.22 | 1892–1877, 2002–2162 | 0.33 | 0.56 | 0.33 | 0.56 |
| Anopheles melas | 2 | 0.78 | 1.30 | 3267–3272 | 0.50 | 2082–2107, 2437–2492, 2802–3132 | 1.0 | 0.67 | 0.67 | 1.00 |
| Chrysomya megacephala | 2 | 0.17 | 0.67 | 1962–2117ᶜ | 0.00 | 2562–3472 | 0.50 | 0.00 | 0.00 | 0.00 |
| Chrysomya rufifacies | 2 | 0.78 | 1.33 | 2042ᶜ | 0.33 | 2982–3170 | 1.00 | 0.83 | 0.50 | 0.67 |
| Mean (±1 SD) | | 0.68 (±0.76) | 1.12 (±1.15) | | 0.28 (±0.49) | | 0.84 (±0.96) | 0.76 (±0.96) | 0.46 (±0.54) | 0.58 (±0.59) |

Full COI–COII diversity (Total), maximum diversity (Max, 600 bp fragment), and minimum diversity (Min, 600 bp fragment) are shown. Midpoints of the locations of maximum and minimum diversity are shown in D. yakuba numbers, and separate regions with multiple equal diversity values were combined when less than 100 bp apart. Diversity was also examined for representative 600 bp fragments for four commonly sequenced regions: DNA barcoding region (BC, midpoint Dy# 1843), Jerry–Pat (J–P, midpoint Dy# 2599), George–Marilyn (G–M, midpoint Dy# 3090), and Pierre–Eva (P–E, midpoint Dy# 3460).
ᵃ Source of specimens and original references are listed in Appendix A.
ᵇ Region of maximum divergence encompasses midpoint Dy#1843, which corresponds to the region used by DNA barcoding.
ᶜ Region of maximum divergence encompasses midpoint Dy#2042.

Table 2
Interspecific nucleotide diversity (%) for 23 species pairs

| Sister species pairsᵃ | Total% | Max% | Midpoint(s) | Min% | Midpoint(s) | BC | J–P | G–M | P–E |
|---|---|---|---|---|---|---|---|---|---|
| Lepidoptera | | | | | | | | | |
| Choristoneura a lineage × C. b lineage | 2.23 | 2.82 | 2470–2480 | 1.59 | 3187–3212 | 2.42 | 2.41 | 1.89 | 2.15 |
| C. fumiferana × C. a + b lineage | 2.47 | 3.96 | 2575–2615 | 1.52 | 3187–3212 | 1.91 | 3.96 | 2.28 | 2.42 |
| Dioryctria abietella × D. abietivorella | 3.87 | 5.17 | 2497–2632 | 1.91 | 3145–3210 | 4.29 | 5.17 | 2.20 | 3.81 |
| D. reniculelloides × D. pseudotsugella | 1.72 | 2.37 | 1830–1934ᵇ, 3382–3472 | 0.83 | 2941–3001 | 2.04 | 1.18 | 1.35 | 2.37 |
| D. rubella × D. sylvestrella | 1.27 | 2.28 | 2692–2702 | 0.83 | 2182–2307 | 1.18 | 1.94 | 1.18 | 1.43 |
| D. zimmermani × D. cambiicola | 0.31 | 0.67 | 2321–2486 | 0.16 | 2901–3472 | 0.34 | 0.33 | 0.17 | 0.30 |
| Hyles annei × H. euphorbiarum | 2.93 | 5.00 | 2671–2691 | 1.35 | 3472 | 3.24 | 4.64 | 2.72 | 1.52 |
| H. gallii × H. nicaea | 1.63 | 2.54 | 2620–2645ᶜ | 1.04 | 2014–2024 | 1.15 | 2.37 | 2.14 | 1.66 |
| Papilio canadensis × P. glaucus | 1.28 | 2.37 | 1891–1951 | 0.50 | 3167–3182 | 2.20 | 1.18 | 1.01 | 0.67 |
| P. demodocus × P. erithonioides | 4.11 | 6.53 | 2632–2642ᶜ | 2.98 | 1767 | 3.32 | 6.07 | 4.46 | 4.38 |
| P. erostratus × P. anchicides | 2.80 | 3.76 | 2706 | 1.86 | 2326–2356 | 3.24 | 3.41 | 2.89 | 2.54 |
| P. machaon × P. hospiton | 3.47 | 4.94 | 2066 | 2.49 | 3152–3157 | 4.05 | 3.29 | 2.77 | 3.58 |
| P. m. oregonius × P. m. hippocrates | 1.37 | 2.54 | 1986–2081 | 0.17 | 2981–3097, 3277 | 2.03 | 1.18 | 0.17 | 0.84 |
| P. memnon × P. rumanzovia | 3.81 | 4.64 | 3377–3422 | 2.89 | 3252–3277 | 3.76 | 4.11 | 4.11 | 4.29 |
| P. thoas × P. cresphontes | 5.90 | 7.53 | 2671–2676, 3192–3202 | 5.00 | 1836–1882 | 5.17 | 6.62 | 6.80 | 6.62 |
| P. troilus × P. palamedes | 6.76 | 9.02 | 2636–2651ᶜ | 5.00 | 2236–2246 | 5.89 | 8.46 | 6.25 | 7.90 |
| P. zelicaon × P. polyxenes | 2.67 | 3.58 | 1796–1861ᵇ, 2641–2746ᶜ | 1.35 | 2236–2241 | 3.41 | 3.41 | 2.72 | 2.72 |
| Yponomeuta cagnagella × Y. padella | 0.18 | 0.50 | 1843–2006 | 0.00 | 2446–2691, 3308–3473 | 0.50 | 0.00 | 0.17 | 0.00 |
| Diptera | | | | | | | | | |
| Anopheles gambiae × A. arabiensis | 0.52 | 1.00 | 3212–3232 | 0.28 | 1892–1927 | 0.33 | 0.45 | 0.84 | 0.50 |
| A. melas × A. quadriannulatis | 1.45 | 2.20 | 1852 | 0.84 | 2592–2617 | 2.03 | 0.84 | 0.18 | 1.35 |
| Chrysomya megacephala × C. bezziana | 4.11 | 6.98 | 2542–2574 | 2.54 | 3122–3147 | 3.32 | 6.62 | 3.06 | 3.58 |
| C. norrisi × C. variceps | 5.21 | 7.35 | 2317–2347 | 3.41 | 3078–3195 | 6.98 | 7.17 | 3.58 | 3.94 |
| C. rufifacies × C. albiceps | 3.06 | 4.55 | 2542–2567 | 2.28 | 1767–1772 | 2.89 | 4.38 | 2.98 | 3.50 |
| Mean (±1 SD) | 2.74 (±1.74) | 4.01 (±2.31) | | 1.77 (±1.40) | | 2.86 (±1.71) | 3.44 (±2.42) | 2.43 (±1.80) | 2.69 (±1.96) |

Values for full COI–COII divergence (Total), maximum divergence (Max, 600 bp fragment), and minimum divergence (Min, 600 bp fragment) are shown. Midpoints of the locations of maximum and minimum divergence are shown in D. yakuba numbers, and separate regions with multiple equal divergence values were combined when less than 100 bp apart. Divergence was also examined for representative 600 bp fragments for the four commonly sequenced regions described in Table 1.
ᵃ Source of specimens and original references are listed in Appendix A.
ᵇ Region of maximum divergence encompasses midpoint Dy#1843, which corresponds to the region used by DNA barcoding.
ᶜ Region of maximum divergence encompasses midpoint Dy#2662.

2.3. Sliding window analysis

Patterns of nucleotide substitution across COI–COII were visualized using sliding window analyses performed with DNAsp ver. 4.10.4 (Rozas et al., 2003). Nucleotide diversity (π) (Nei, 1987 equation 10.5; Rozas et al., 2003) was used to estimate variability within species. Nucleotide divergence (K) (Tajima, 1983 equation A3; Rozas et al., 2003) was used to analogously quantify variability between sister species. Both values were converted to percentages to facilitate comparisons. Analyses were performed using default settings (except for relative divergence profiles, see below), which included a Jukes–Cantor correction to all divergence calculations.
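As a rough illustration of these quantities, the following Python sketch computes an uncorrected proportion of differing sites, applies the standard Jukes–Cantor correction d = −(3/4) ln(1 − 4p/3), and averages the result over all sequence pairs. It is an assumption-laden stand-in, not DNAsp's implementation and not a transcription of the cited equations; the function names are hypothetical, and gaps or ambiguous bases are simply skipped.

```python
# Illustrative sketch only; not DNAsp's code. Function names are hypothetical.
import math
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jukes_cantor(p):
    """Standard Jukes-Cantor corrected distance; undefined for p >= 0.75."""
    if p >= 0.75:
        return float("inf")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def mean_pairwise_percent(seqs, corrected=True):
    """Average pairwise distance, as a percentage, over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    dists = []
    for a, b in combinations(seqs, 2):
        p = p_distance(a, b)
        dists.append(jukes_cantor(p) if corrected else p)
    return 100.0 * sum(dists) / len(dists)
```

Called on all specimens of one species, `mean_pairwise_percent` approximates a within-species diversity value; called on two sequences from sister species, it gives a pairwise divergence.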

To ensure that these results could be compared to data obtained from DNA barcoding, as well as other commonly used regions in COI–COII, we selected a 600 bp window size for all sliding window analyses. The universal primers (LCO 1490–HCO 2198; Folmer et al., 1994) were used by Hebert et al. (2003a,b) to amplify a 658 bp fragment (Dy#1514–2173), while the original sequences deposited in GenBank from the initial DNA barcoding paper were either 617 or 624 bp in length (Dy#1555–2171 and Dy#1533–2157, respectively) and subsequent studies have used variable lengths of sequence (Whiteman et al., 2004; Page et al., 2005).

Three additional universal primer regions were also compared: C1-J-2183 to TL2-N-3014 (Jerry–Pat), C1-J-2792 to C2-N-3389 (George–Marilyn), and C2-J-3138 to TK-N-3782 (Pierre–Eva) (Fig. 1) (Bogdanowicz et al., 1993; Simon et al., 1994). Actual length of these primer regions is variable (Table 3), so a 600 bp fragment from the middle of each region was used for comparison (similar to the barcoding region). Ranges and midpoints (in brackets) of each 600 bp fragment are as follows: LCO–HCO (barcoding region), Dy#1543–2143 (1843); Jerry–Pat, Dy#2299–2899 (2599); George–Marilyn, Dy#2790–3390 (3090); Pierre–Eva, Dy#3160–3760 (3460).

Table 3
Nucleotide divergence and relative divergence calculated as the percent of total COI–COII divergence for four commonly sequenced regions in insect systematics

| PCR fragment | Primer names | Length (bp) | Mean absolute divergence (%) (±1 SD) | Mean relative divergence (%) (±1 SD%) |
|---|---|---|---|---|
| LCOᵃ–HCOᵃ | LCO1490 (1514)ᵇ, HCO2198 (2173)ᵇ | 658 | 2.81 (±1.76) | 111.88 (±44.45) |
| Jerryᶜ–Patᶜ | C1-J-2183, TL2-N-3014 | 830 | 3.24 (±2.22) | 114.44 (±33.75) |
| Georgeᵈ–Marilynᶜ | C1-J-2792, C2-N-3389 | 596 | 2.40 (±1.78) | 87.09 (±26.42) |
| Pierreᶜ–Evaᵈ | C2-J-3138, TK-N-3782 | 643 | 2.65 (±2.05) | 87.24 (±30.12) |
| Total COI–COII | | | 2.74 (±1.74) | 100 (±0) |

Mean absolute divergence and relative divergence are calculated from 23 sister species pairs. Length of PCR fragments are measured in relation to D. yakuba, but actual length may vary in other sequences due to indels at the ends of COI and COII.
ᵃ Folmer et al., 1994.
ᵇ D. yakuba bp number at 3′ end.
ᶜ Simon et al., 1994.
ᵈ Bogdanowicz et al., 1993.

A step size of 5 bp was used for all sliding window analyses, with each window represented by its nucleotide midpoint. A 5 bp step size was chosen to provide a smoothing function to the sliding window analyses, which improved visualization of nucleotide changes along the COI–COII fragment. Using the default setting, sites with indels were ignored and every window included 600 nucleotides. This ensured that windows were comparable among profiles of COI–COII. To discern patterns of nucleotide change, we compared diversity profiles within species and divergence profiles between sister species pairs in order to identify (1) magnitude and locations of maximum divergence, (2) magnitude and locations of minimum divergence, and (3) magnitude of divergence in the four commonly used regions described above (Fig. 1).
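The windowing scheme described above can be sketched in Python as follows. This is an illustrative approximation rather than the DNAsp procedure: the function names and the `first_pos` parameter (the alignment coordinate of the first column, e.g. a D. yakuba position) are hypothetical. Columns containing a gap or ambiguous base in either sequence are dropped first, so that every window holds exactly `window` comparable sites, and each window is labelled by the midpoint of its first and last alignment coordinates.

```python
# Illustrative sketch only; not DNAsp's code. Names are hypothetical.
import math


def jc_percent(diffs, sites):
    """Jukes-Cantor corrected divergence for one window, as a percentage."""
    p = diffs / sites
    if p >= 0.75:
        return float("inf")
    return -75.0 * math.log(1.0 - 4.0 * p / 3.0)


def sliding_window_profile(seq_a, seq_b, window=600, step=5, first_pos=1):
    """Return [(midpoint, percent_divergence), ...] for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only columns where both sequences have an unambiguous base,
    # remembering each column's alignment coordinate.
    cols = [
        (first_pos + i, x != y)
        for i, (x, y) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if x in "ACGT" and y in "ACGT"
    ]
    profile = []
    for start in range(0, len(cols) - window + 1, step):
        chunk = cols[start:start + window]
        diffs = sum(is_diff for _, is_diff in chunk)
        midpoint = (chunk[0][0] + chunk[-1][0]) // 2
        profile.append((midpoint, jc_percent(diffs, window)))
    return profile
```

The maximum and minimum of the returned profile, together with their midpoints, correspond to the kind of values reported in the Max% and Min% columns of Tables 1 and 2, although exact numbers would depend on details of the original software.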

2.4. Randomizations

Although we identified some regions of high nucleotide divergence in profiles of sister species pairs that appeared to be mutational hot spots (Stoneking, 2000; Galtier et al., 2006), it was not clear whether these peaks of high divergence were non-random. To test for the presence of concentrated divergence, we compared the sliding window profile of each original sequence pair with the distribution of profiles from randomly generated sequence pairs simulated from the original sequences. If the original profile significantly exceeded the distribution of randomly generated sequences then the presence of a mutation hot spot would be supported. The Papilio sequences comprised the most extensive data set available in this study and the parameters from these were used to generate the random data sets. Modeltest vers. 3.7 (Posada and Crandall, 1998) was initially used to identify the most appropriate model, using hierarchical likelihood ratio tests. A likelihood tree (log-likelihood score –12699.587) was obtained from a maximum likelihood analysis of all the original Papilio sequences using the following GTR + Γ + I model: base frequencies A = 0.3255, C = 0.1014, G = 0.1206, T = 0.4525; rates AC = 10.6213, AG = 16.7683, AT = 8.8273, CG = 1.5416, CT = 122.9118, GT = 1.000; gamma shape parameter = 0.8468, and proportion of invariable sites = 0.5750. A total of 50 sets of random sequences, modeled from the original Papilio likelihood tree, were generated in Mesquite (Maddison and Maddison, 2004) using the above GTR model and a starting branch depth of 0.1. Sliding window analyses, as previously described, were performed on three random sequence sets in each of the 50 sets that corresponded to the following three original Papilio sister species pairs: P. canadensis × P. glaucus; P. demodocus × P. erithonioides; P. troilus × P. palamedes. Nucleotide divergence profiles in the random sequences were compared to the original data sets to identify any regions of divergence that fell beyond the 95% confidence limits of the mean random divergence (n = 50).
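The comparison of an observed profile against the simulated profiles can be sketched as follows. The text does not state how the 95% limits were computed, so this sketch assumes a normal approximation (mean ± 1.96 SD across the random profiles at each window); the function name and the `z` parameter are hypothetical.

```python
from statistics import mean, stdev


def windows_outside_envelope(observed, random_profiles, z=1.96):
    """Return indices of windows where the observed divergence lies outside
    mean ± z·SD of the randomly generated profiles.

    Assumption: a normal approximation stands in for the 95% limits.
    observed: list of per-window divergences for the real sequence pair.
    random_profiles: list of profiles (same windows) from simulated pairs.
    """
    flagged = []
    for i, value in enumerate(observed):
        simulated = [profile[i] for profile in random_profiles]
        centre = mean(simulated)
        spread = z * stdev(simulated)
        if value > centre + spread or value < centre - spread:
            flagged.append(i)
    return flagged
```

Under this assumption, a candidate hot spot would appear as a run of flagged windows where the observed profile exceeds the upper limit.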

2.5. Localized nucleotide divergence as an indicator of total COI–COII divergence

To assess whether a smaller region of sequence within COI–COII could act as an effective indicator of total COI–COII divergence, we first needed to convert absolute nucleotide divergence in each window into a form that would allow comparisons between species pairs with different total nucleotide divergences. This was accomplished by calculating the ratio of nucleotide divergence sampled in a window relative to total COI–COII divergence for that sister species pair, allowing us to combine all species pairs. Mean divergence was then calculated for each window for all 23 sister species pairs and combined into a single profile by using the following option in DNAsp: Sliding Window Options: Sites with alignment gaps considered. Mean relative divergence (presented as a percentage) and standard deviation were calculated for each window.

We used two criteria to identify regions that would act as the best indicators for total COI–COII divergence. First, mean relative divergence for the region should be 100% of total COI–COII divergence. Second, regions should have a low standard deviation from mean relative divergence, indicating that there is minimal variation among sister species pairs. We also compared divergence relative to total COI–COII divergence for the four commonly used primer regions described previously. We calculated mean absolute nucleotide divergence and divergence relative to COI–COII totals for each region for all 23 sister species pairs and identified regions that were effective indicators of total COI–COII divergence based on both mean and minimal variance.
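The relative divergence calculation and the two selection criteria can be sketched in Python. The function names, the dictionary layout, and the `tolerance` parameter are hypothetical; in particular, the text does not say how the two criteria were combined, so this sketch assumes that windows with a mean within a few percentage points of 100% are kept and the one with the lowest standard deviation is chosen.

```python
from statistics import mean, stdev


def relative_divergence_profile(window_divergence, total_divergence):
    """Return a list of (mean %, SD %) of relative divergence per window.

    window_divergence: dict mapping pair name -> list of per-window divergences.
    total_divergence: dict mapping pair name -> total COI-COII divergence.
    """
    pairs = list(window_divergence)
    n_windows = len(window_divergence[pairs[0]])
    summary = []
    for w in range(n_windows):
        # Express each window as a percentage of that pair's total divergence.
        relative = [
            100.0 * window_divergence[p][w] / total_divergence[p] for p in pairs
        ]
        summary.append((mean(relative), stdev(relative)))
    return summary


def best_indicator_window(summary, tolerance=5.0):
    """Index of the lowest-SD window whose mean is within `tolerance` of 100%.

    Assumption: this ranking rule is illustrative, not the authors' procedure.
    """
    candidates = [w for w, (m, _) in enumerate(summary) if abs(m - 100.0) <= tolerance]
    if not candidates:
        return None
    return min(candidates, key=lambda w: summary[w][1])
```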

2.6. Effect of fragment length on nucleotide divergence estimation

We examined how variation in fragment length affected estimates of total COI–COII divergence in the 23 sister species comparisons. Relative divergence was used to control for total divergence differences between sister species pairs, and was calculated for each fragment length for each sister species pair. Starting with a 200 bp fragment at the 5′ end of COI, fragment length was increased by 200 bp increments until the full COI–COII sequence was included. This process was repeated starting at the 3′ end of COII, as well as in the middle of COI–COII (midpoint Dy# 2610, fragment length increased 100 bp in either direction). For each set, we calculated mean divergence relative to total COI–COII nucleotide divergence and standard deviation for each fragment length, and identified a length where these two parameters leveled off relative to total COI–COII.
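For a single sister pair, the growing-fragment calculation from the 5′ end can be sketched as below. This is an illustration only: the function name is hypothetical, the sequences are assumed to be aligned without gaps, and divergence is again the uncorrected proportion of differing sites.

```python
def relative_divergence_by_length(seq_a, seq_b, increment=200):
    """Return (fragment length, % of total divergence) for fragments that grow
    from the 5' end in fixed increments until the whole alignment is included.

    Assumption: ungapped alignment and uncorrected divergence.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differs = [a != b for a, b in zip(seq_a, seq_b)]
    n = len(differs)
    total_differences = sum(differs)
    if total_differences == 0:
        raise ValueError("identical sequences have no divergence to scale by")
    lengths = list(range(increment, n, increment)) + [n]
    results = []
    for length in lengths:
        fragment_differences = sum(differs[:length])
        # (fragment differences / length) divided by (total differences / n), as a percentage.
        relative = 100.0 * fragment_differences * n / (length * total_differences)
        results.append((length, relative))
    return results
```

Reversing both sequences before the call gives the corresponding series starting at the 3′ end; averaging the results over all pairs at each length would give the mean and standard deviation described in the text.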

2.7. DNA saturation analysis

In addition to using absolute and relative divergences to identify hot spots, we also used DNA saturation analysis to examine how levels of saturation change across COI–COII. DNA saturation analysis allows graphical evaluation of how saturation of transitions accumulates in relation to nucleotide divergence. Optimal regions for phylogenetic analysis should have high nucleotide divergence relative to saturation. As an estimation of saturation, we used percent transitions (%Ti), rather than the more standard Ti/Tv ratio, to eliminate the problem of undefined values when transversions equal zero. Divergence and %Ti can potentially be combined by multiplying them together to give a single measure of potential DNA information content; however, they are kept separate here in order to provide instructive assessment of their independence.

Three series of species were compared for changes in %Ti, in separate phylogenetic time series. One reference taxon in each series was compared to a set of increasingly diverged congeners to examine how %Ti and absolute nucleotide divergence changed with increasing phylogenetic depth. For P. machaon oregonius the increasingly more distant comparisons were with P. m. hippocrates, P. hospiton, P. indra, and P. xuthus. For P. canadensis the comparisons were with P. glaucus, P. rutulus, P. multicaudatus, and P. scamander, all of which are placed in a different subgenus than the series that includes P. machaon oregonius. For Dioryctria zimmermani, which is a pyralid moth, the comparisons were with D. fordi, D. okanaganella, and D. magnifica. A sliding window analysis of nucleotide divergence was conducted for each phylogenetic time series in DNAsp ver. 4.10.4 (600 bp window; 5 bp step). Ti/Tv ratio could not be calculated in DNAsp, so a sliding window analysis was conducted using a small script written for use in PAUP* ver. 4.10b. Ti/Tv ratio was then converted to %Ti and graphed together with percent nucleotide divergence for each pair within the three phylogenetic time series. Regions of low %Ti were defined as regions where %Ti was less than or equal to 50% (Ti/Tv ≤ 1).
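The relationship between Ti/Tv and %Ti can be made concrete with a short Python sketch. The function names are hypothetical and this is not the PAUP* script used in the study; it only shows that %Ti = 100 × Ti / (Ti + Tv) stays defined when Tv = 0, and that a ratio of Ti/Tv = 1 corresponds to 50% Ti.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def percent_transitions(seq_a, seq_b):
    """Percent of observed differences that are transitions, or None if the
    two sequences show no differences at unambiguous sites."""
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps and ambiguity codes.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    if transitions + transversions == 0:
        return None
    return 100.0 * transitions / (transitions + transversions)


def ratio_to_percent(ti_tv):
    """Convert a Ti/Tv ratio into percent transitions."""
    return 100.0 * ti_tv / (1.0 + ti_tv)
```

Applying `percent_transitions` to each 600 bp window, in the same way as the divergence calculation, would give a %Ti profile that can be graphed alongside nucleotide divergence.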

3. Results

3.1. Intraspecific diversity

Sliding window profiles differed substantially among the 23 species that had multiple COI–COII sequences, and 12 profiles are presented here to illustrate this diversity (Fig. 2). Several species had more than one region of maximum diversity (Table 1: Choristoneura fumiferana, Dioryctria pentictonella, D. sylvestrella, and Anopheles gambiae). Although the location of maximum diversity was variable, some sliding window profiles bore substantial similarity to one another (Fig. 2: e.g. Papilio demoleus and P. machaon).

Full COI–COII diversity within species ranged from 2.71% (P. demoleus) to 0.03% (D. abietivorella, Hyles robertsi peplidis) with a mean diversity of 0.68% (Table 1). Maximum diversity among all 600 bp windows within the profiles ranged from 4.50% (P. demoleus) to 0.11% (D. abietivorella, H. robertsi peplidis) with a mean of 1.12% (Table 1). Minimum diversity among the windows ranged from 1.67% (Feltia jaculifera) to 0.0% (12 species) with a mean of 0.28% (Table 1).

Mean diversity within species for the four fragments was highest in the barcode fragment (0.84%) and lowest for the George–Marilyn fragment (0.46%) (Table 1). P. demoleus had the highest diversity over three of the four fragments, but the rank order of maximum and minimum diversity in the fragments for all other taxa was quite variable (Table 1).

Many differences consisted of single nucleotide changes and the locations of peak diversity occurred throughout COI–COII (Fig. 2). Five species had regions of maximum diversity that contained midpoint Dy# 2042 (fragment Dy# 1743–2342) (Table 1). By comparison, only two species had a region of maximum diversity that corresponded to the DNA barcoding region (midpoint Dy# 1843, fragment Dy# 1543–2143). In species with low intraspecific diversity (one or two nucleotide differences), regions of maximum diversity were often found at the 3′ end of COII, while in species with higher levels of diversity the regions of maximum diversity were more frequently found in COI (Table 1).

Fig. 2. Sliding window profiles (600 bp window; 5 bp step) of COI–COII intraspecific nucleotide diversity for 12 of 23 species. Average nucleotide diversity for total COI–COII follows each species name. Black dots on X-axes indicate midpoints of commonly used sequence regions. Scales on Y-axes differ between upper and lower panels.

3.2. Interspecific divergence

Profiles of divergences between the 23 sister species pairs examined in this study were quite variable; 15 were chosen to illustrate this variation (Fig. 3). Three species pairs had multiple distinct (≥100 bp apart) regions of maximum divergence (Table 2: D. reniculelloides × D. pseudotsugella, P. thoas × P. cresphontes, P. zelicaon × P. polyxenes). As was the case for diversity within species, divergence profiles were highly variable, but some sister pair profiles were remarkably similar (Fig. 3: P. demodocus × P. erithonioides and P. troilus × P. palamedes), even between distantly related lineages (Chrysomya megacephala × C. bezziana, and H. annei × H. euphorbiarum).

Interspecific divergences for the full COI–COII region ranged widely, from 6.76% (P. troilus × P. palamedes) to 0.18% (Yponomeuta cagnagella × Y. padella), with a mean nucleotide divergence of 2.74% (Table 2). Maximum divergences ranged from 9.02% (P. troilus × P. palamedes) to 0.50% (Y. cagnagella × Y. padella), with a mean of 4.01% (Table 2). Minimum divergences ranged from 5.00% (P. troilus × P. palamedes, P. thoas × P. cresphontes) to 0.00% (Y. cagnagella × Y. padella) with a mean of 1.77% (Table 2).

Mean divergence of commonly sequenced regions was highest in Jerry–Pat (3.44%) and lowest in George–Marilyn (2.43%). Nucleotide divergence varied by up to 4.02% within a single profile (Table 2, maximum versus minimum region of divergence: P. troilus × P. palamedes). The rank order of maximum and minimum diversity in the fragments for all other taxa was quite variable (Table 2).

Although regions of maximum divergence occurred throughout COI–COII, four sister species pairs had regions of maximum divergence that contained nucleotide midpoint Dy# 2662 (fragment Dy# 2362–2962), and 8 additional pairs were within 100 bp of this location (Table 2). By comparison, only one sister pair had a region of maximum divergence that corresponded to the DNA barcoding region (midpoint Dy# 1843, fragment Dy# 1543–2143), and four additional pairs were within 100 bp of this midpoint (Table 2). A single sister pair had multiple regions of maximum divergence that encompassed both midpoints.

When nucleotide divergence in the DNA barcoding region was compared with the region of maximum divergence, we found a surprising amount of discrepancy in the levels of divergence between the two regions (Table 2). Differences in divergence in the region of maximum divergence versus the barcoding region ranged from 3.66% (Chrysomya megacephala × C. bezziana) to 0.00% (Y. cagnagella × Y. padella). Nearly half (10 sister pairs) showed more than 1% difference in divergence between the two regions, and 5 pairs differed by more than 2% (C. megacephala × C. bezziana, P. demodocus × P. erithonioides, P. troilus × P. palamedes, P. thoas × P. cresphontes, C. fumiferana × C.a + b). For two sister pairs (A. gambiae × A. arabiensis and D. zimmermani × D. cambiicola) the difference between the two regions was greater than their total COI–COII divergence.

Fig. 3. Sliding window profiles (600 bp window; 5 bp step) of COI–COII interspecific nucleotide divergence for 15 of 23 sister species pairs. Total COI–COII nucleotide divergence follows each species pair name. Black dots on X-axes indicate midpoints of commonly used sequence regions. Scales on Y-axes are variable.

3.3. Overlap of intra- and interspecific divergence

Where possible, we compared intraspecific diversity and interspecific divergence across COI–COII, and for the majority of sister pairs these values did not overlap (Tables 1 and 2, e.g., maximum intraspecific diversity of H. gallii compared to minimum interspecific divergence between H. gallii × H. nicaea). One notable exception was between Anopheles gambiae and A. arabiensis (Fig. 4). When intraspecific nucleotide diversity of A. gambiae was plotted with the interspecific divergence between A. gambiae × A. arabiensis, there were several regions where intraspecific diversity in A. gambiae equaled or exceeded the interspecific divergence between A. gambiae and A. arabiensis (Fig. 4).

Fig. 4. Sliding window profiles (600 bp window; 5 bp step) of COI–COII intraspecific nucleotide diversity (A. gambiae) and interspecific divergence (A. gambiae × A. arabiensis) showing regions of overlap. Black dots on X-axes indicate midpoints of commonly used sequence regions.

3.4. Randomizations

When the profiles of three Papilio sister species pairs (P. canadensis × P. glaucus; P. demodocus × P. erithonioides; P. troilus × P. palamedes) were compared to randomly generated sister species pair profiles, we found that the real nucleotide divergence profile was mostly within the distribution of the random profiles (Fig. 5: P. troilus × P. palamedes). Two of the three profiles (P. canadensis × P. glaucus, and P. demodocus × P. erithonioides; data not shown) were entirely within the 95% confidence interval of the random profiles. Only P. troilus × P. palamedes had a small section that was outside the 95% confidence limits of the mean nucleotide divergence of the simulated sequences; it was nonetheless within the full set of random profiles (Fig. 5).

Fig. 5. Sliding window profile of P. troilus × P. palamedes and profiles of 50 randomly generated sequence pairs. Original profile (thick bold line), mean divergence of random profiles (thin bold line), 95% confidence intervals of random profiles (shaded region), and random profiles that exceed 95% confidence interval (grey lines).

3.5. Chimeric sequences

During the course of this investigation, we identified an erroneous chimeric sequence in P. demodocus (AY457588, Zakharov et al., 2004b) by using sliding window analysis. The COII gene region was highly divergent and identical to P. demoleus malayanus from Malaysia (accession number AF044000). We examined the original data set and discovered that an error had occurred when the sequence was submitted to GenBank. The corrected sequence file has been deposited in GenBank. This example serves to highlight an additional application of sliding window analyses.

3.6. Localized divergence relative to total COI–COII divergence

Relative local divergence was quite variable among windows across the full COI–COII sequence length (Fig. 6). Mean relative divergence for each window ranged from 124.5% (midpoints 2715 and 2720 bp) to 86.7% (3483 bp), with standard deviation ranging from ±54.5% (1869 bp) to ±14.6% (2133 bp). If a mean relative divergence of 100% and a narrow standard deviation indicate an optimal region, then the best 600 bp window spanned Dy# 1831–2430 (midpoint Dy# 2133 bp), which had a mean percent divergence of 100.7% and standard deviation ±14.6% (Fig. 6).

We also related patterns of nucleotide divergence to the values for the true length of commonly used sequence regions (Table 3) rather than the central 600 bp of these regions (Tables 1 and 2; Fig. 6). Jerry–Pat had the highest absolute divergence, LCO–HCO had the best relative divergence and absolute variance, and George–Marilyn had the best relative variance (Table 3). Overall, no single PCR fragment appears to optimally sample the sister pair divergences of this study; none outperform the optimal 600 bp fragment identified at Dy# 1831–2430, although LCO–HCO and Jerry–Pat both span parts of this region.

Fig. 6. Sliding window profile of mean relative divergence (bold) and ±1 standard deviation (shaded) for all 23 sister species pairs. Individual profiles (grey lines) show individual sister species pairs that exceed the zone of ±1 SD. Midpoint locations of commonly used COI–COII PCR regions and the “optimal” 600 bp region are shown.

3.7. Effect of fragment length on total COI–COII divergence estimation

By increasing fragment lengths in 200 bp increments, we graphed the approach to 100% relative divergence and 0% variance for the 23 sister species pairs (Fig. 7). The pattern of change in mean relative divergence was quite variable for the three starting locations.

Starting at the 5′ end of COI, the initial mean relative divergence was 77%, increasing beyond 100% between 400 and 800 bp, before dipping to near 100% at 1000 bp. Lengths greater than 1000 bp had increased relative divergence again, indicating that a region of high relative divergence was being sampled. Mean relative divergence did not stabilize again at 100% until fragment length was 2000 bp.

Starting in the middle of COI–COII (Dy# 2610), the mean relative divergence for the 200 bp fragment length was higher than the two other starting locations (Fig. 7). Mean divergence increased at 400 bp, but then decreased steadily to 100% at 1400 bp. Like the 5′ start, mean divergence deviated from 100% at longer fragment lengths (1400–2200 bp), and did not stabilize till the entire sequence length was sampled (2300 bp).

Fig. 7. Mean relative nucleotide divergence and variance (1 standard deviation) for fragments of increasing length starting at either the 5′ end of COI, the midpoint of COI–COII, or the 3′ end of COII for 23 sister species. Fragment length started at 200 bp, and increased in 200 bp increments until the entire sequence was included.

Starting at the 3′ end of COII, the initial mean relative divergence was 75%, similar to starting at the 5′ end of COI (Fig. 7). Mean divergence decreased again when the fragment reached 800 bp, and did not approach 100% until the fragment was 1200 bp in length. Beyond this length, mean divergence stabilized at 100%, unlike the 5′ or midpoint start.

Changes in variance with increases in fragment length were also non-uniform among the three start locations (Fig. 7). Differences in variance and rate of change were most pronounced at small fragment lengths, but once fragment lengths reached 1000 bp, decreases in variance were more similar for all start locations. At a fragment length of 1000 bp, the variance for the midpoint start was less than the other two start locations, and remained lower than the other two start locations for all larger fragment lengths.

Fig. 8. Example profile for DNA saturation analysis, with a closely related species pair (low divergence) on the left, and a more distantly related species pair (high divergence) on the right. Nucleotide divergence has thick lines and an axis on the left. Percent transitions (%Ti) has thin lines and an axis on the right. The dotted horizontal line represents 50% Ti and 6% divergence. Areas of <50% Ti are shaded to indicate regions of saturation.

3.8. DNA saturation analysis

Nucleotide divergence and %Ti were graphed together to analyse saturation of phylogenetic information among three phylogenetic time series (P. machaon oregonius, P. canadensis, and Dioryctria zimmermani). A heuristic example is shown in Fig. 8 in order to aid interpretation of phylogenetic time series graphs; the left side shows low divergence and high %Ti, and the right shows high divergence and low %Ti. Maximal phylogenetic information occurs when both nucleotide divergence and %Ti are maximized.

When nucleotide divergence and %Ti profiles for three phylogenetic time series were examined together, several patterns emerged. First, as expected, %Ti was initially high between closely related species and decreased as nucleotide divergence increased. Second, and less obviously, regions of high divergence did not always correspond to regions of low %Ti. For example, P. machaon oregonius × P. m. hippocrates (Fig. 9: arrow at midpoint 2060 bp) or P. canadensis × P. multicaudatus (Fig. 9: midpoint 3510 bp) show areas where both high nucleotide divergence and relatively high %Ti coincide. Third, accumulation of %Ti varied between series, and even within series. In both the P. machaon oregonius and P. canadensis time series (Fig. 9), regions of low %Ti (less than 50%) occurred initially between closely related species (P. machaon oregonius × P. m. hippocrates and P. canadensis × P. glaucus), but did not persist, and were recovered again when more diverged taxa were examined (P. machaon oregonius × P. indra and P. canadensis × P. multicaudatus). On the other hand, in the Dioryctria zimmermani time series regions of low %Ti do not occur, even at relatively high levels of divergence (Fig. 9: D. zimmermani × D. magnifica). Regions of maximum phylogenetic information were variable across the genes and taxa, and there was no common pattern in the pairs of species sampled, even within a single time series.

Fig. 9. Phylogenetic time series for P. machaon oregonius, P. canadensis, and D. zimmermani, with phylogenetic series arranged in columns and similar total divergences arranged in rows. Each species is compared to increasingly diverged congeners of roughly equal increments, with total divergence indicated for each comparison. See Fig. 8 for a heuristic profile. Nucleotide midpoint locations are given relative to D. yakuba. Arrows indicate regions of relatively high divergence and high percent transitions.

4. Discussion

Nucleotide divergence has been a primary criterion for delimiting species and detecting cryptic species in initiatives such as DNA taxonomy (Tautz et al., 2003; Monaghan et al., 2006) and DNA barcoding (Hebert et al., 2003a,b). We have shown that nucleotide change is heterogeneous throughout COI–COII. This finding contrasts with the report by Hebert et al. (2003b, p. S96) that “sequence divergences in the halves [of COI] were closely similar” and “because of this congruence, the measures of sequence divergence for other species pairs are analysed without reference to their source region in the gene.” Such arbitrary reliance on a single mtDNA fragment without compensation for the underlying heterogeneity of evolution of the molecular sequences could result in under- or overestimating species diversity. We examined four specific nucleotide divergence patterns that could affect such species delimitation and diagnosis: (1) overlap of intra- and interspecific divergences, (2) mutation hot spots, (3) levels of divergence relative to fragment size and location, and (4) DNA saturation that could lead to reduced phylogenetic informativeness.

4.1. Intra- versus interspecific divergence

When relying on molecular divergences to delimit species, it is important that nucleotide diversity within species is no greater than divergence between species (Meyer and Paulay, 2005). Over the entire data set, there was major overlap between intraspecific diversity (Table 1: 0.029–2.71%) and interspecific divergence (Table 2: 0.18–6.76%) for full COI–COII sequences. The DNA barcoding region, Jerry–Pat, George–Marilyn, and Pierre–Eva all showed similar overlap (Tables 1 and 2). For species with both intra- and interspecific data, the majority showed little to no overlap between these two values, with the exception of Anopheles gambiae × A. arabiensis (Tables 1 and 2, Fig. 4). Historically, these species have been notoriously difficult to separate, due to either hybridization or retained ancestral polymorphisms (Besansky et al., 2003). The recency of speciation, rather than inaccurate taxonomy, is the primary reason for difficulty in delimitation and serves to demonstrate the problems facing DNA barcode identification. It also serves to highlight the importance of exploring alternative markers and analytical methods, such as those used in population-level studies, to clarify the relationships among difficult taxa.

These data also provide an opportunity to evaluate the accuracy of standard divergence thresholds for delimiting species. Such thresholds rely on the presence of genetic “breaks” (Hebert et al., 2003a,b) or “barcoding gaps” (Meyer and Paulay, 2005) between intra- and interspecific divergences. It is important to test these thresholds against recognized sister species, rather than a random sample of arbitrarily chosen relatives (Moritz and Cicero, 2004). As shown in A. gambiae × A. arabiensis, intra- and interspecific overlap can vary by location within COI–COII for a single sister pair, but how accurate are thresholds for the entire data set? All comparisons within this study were based on closely related sister taxa, where identifications are most likely to be problematic. If we apply a 3% threshold (Hebert et al., 2003a,b) to divergences obtained from the barcoding region in Lepidoptera, only 9 out of 19 sister pairs (47%) are correctly delimited (Table 2). If a more conservative threshold of 2% is used for the entire data set, 17 out of 23 sister pairs (71%) are correctly delimited (Table 2). Even the most conservative thresholds failed to achieve the results claimed by barcoding advocates (98% correct delimitation, Hebert et al., 2003a,b). Ranges of intraspecific diversity and interspecific divergence were quite broad (Tables 1 and 2), even considering the possibility that some of the species with very high levels of intraspecific diversity may indicate undescribed species and some of the pairs with very low levels of interspecific divergence could result from poor taxonomic classification.
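The threshold test amounts to counting the recognized sister pairs whose divergence reaches the threshold. A minimal Python sketch follows; the function name is hypothetical, a pair is assumed to count as delimited when its divergence is at or above the threshold, and the values in the usage note are illustrative rather than taken from Table 2.

```python
def fraction_delimited(divergences, threshold):
    """Return (count, fraction) of sister pairs whose percent divergence is at
    or above the threshold.

    Assumption: "correctly delimited" means divergence >= threshold.
    """
    if not divergences:
        raise ValueError("at least one sister pair is required")
    delimited = sum(1 for d in divergences if d >= threshold)
    return delimited, delimited / len(divergences)
```

For a hypothetical set of five sister pairs with divergences of 0.5, 1.8, 2.4, 3.1 and 6.8%, a 3% threshold delimits two pairs and a 2% threshold delimits three, which shows how strongly the outcome depends on the threshold chosen.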

4.2. Mutation hot spots

Non-random regional variation has previously been shown to occur in mtDNA (Lin and Danforth, 2003; Bartolome et al., 2005; Broughton and Reneau, 2006; Galtier et al., 2006). Sliding window analyses demonstrated that nucleotide diversity within taxa (Fig. 2) and divergences between taxa (Fig. 3) were quite variable across COI–COII. However, regions of high divergence did not usually exceed the 95% confidence interval of the mean divergence between randomly generated sequence pairs (Fig. 5: P. troilus × P. palamedes), indicating that observed regions of high divergence did not differ in magnitude from simulated random peaks in divergence.

Although the presence of mutational hot spots was not confirmed, we found several patterns that suggested that COI–COII is not undergoing purely random regional nucleotide change. First, both mean relative divergence and variance in some regions of COI–COII were consistently higher than others, rather than being equal throughout COI–COII (Fig. 6). Second, patterns of nucleotide divergence showed similarity between some species pairs, even between distantly related taxa (Fig. 3: C. megacephala × C. bezziana, and H. annei × H. euphorbiarum). Third, maximum divergence values for a number of taxa appear concentrated in a region encompassing 2362–2962 bp (Table 2).

To better understand the phenomenon of mutational hot spots, conspecific specimens and closely related species should be examined throughout their geographic ranges, to maximize sampling of mtDNA haplotype diversity, and across as large a sequence region as possible, to minimize the effect of localized stochastic mutational anomalies. Variability of divergence rates is particularly important in groups characterized by extremely high levels of intraspecific variation (Johnson et al., 2002) or low levels of interspecific divergence (Sperling et al., 1995).

4.3. Location and size of mtDNA fragments for delimiting species

Fragment choice can influence species diagnoses that rely on nucleotide divergence, particularly when based on short DNA fragments. If divergences are generally low between sister species, targeting regions of maximum divergence should ensure the greatest probability of consistently delimiting these taxa by sequencing regions with the most informative nucleotide variation. However, to achieve maximum compatibility with other studies, it is advantageous to focus on regions that give accurate and consistent estimates of divergences relative to larger mtDNA regions. We identified several 600 bp regions that have a mean nucleotide divergence for sister species pairs similar to total COI–COII divergence, although these regions differ substantially in variability (Fig. 6). We identified a 600 bp fragment (Dy# 1831–2430) with a mean percent divergence of 100.7% relative to total COI–COII divergence, which also had relatively low variance (±14.6% SD). Based on our data, this 600 bp fragment would be the best indicator of total COI–COII divergence for sister species.

In recognition of the value of incorporating previously published data, we focused on regions of COI–COII that are commonly used in insect systematics (Caterino et al., 2000; Hebert et al., 2003a,b), with the most widely used primer pairs for reference (Bogdanowicz et al., 1993; Folmer et al., 1994; Simon et al., 1994). If an ideal region for species diagnoses is considered to minimize variance and maximize the approach of mean relative nucleotide divergence to 100%, then regions may be compared on the basis of these criteria. None of the commonly used PCR fragments optimized these values to the extent shown by the 600 bp window demarcated by Dy# 1831–2430, although LCO–HCO and Jerry–Pat partially span this region. We recommend that future taxonomic projects consider lengthening commonly used primer regions to incorporate this optimal region.

We have demonstrated that locations of maximum divergence occur throughout COI–COII, and that the PCR fragments traditionally used for insect systematics have high levels of variation in nucleotide divergence (Table 3). We also demonstrated that peaks of divergence were similar to peaks obtained from randomly generated sequences (Fig. 5). Using a larger fragment would help to minimize such nucleotide variability caused by random variation in COI–COII (Fig. 5). As sequence length increases (Fig. 7), variance within each fragment region does not stabilize until the fragment size reaches approximately 1000 bp. Of the three starting points tested (Fig. 7), some sets reach 100% mean relative divergence at fragment lengths less than 600 bp, but the variance is still very high.

To ensure that a fragment length will produce consistent divergence estimates across a range of taxa, minimizing variance within a fragment is important. Small fragments are more likely to be skewed by localized regions of unusual nucleotide divergence, whereas increased fragment length would reduce this risk. Consequently, we feel that it is more important to maximize fragment length than to target specific regions within COI–COII, for reasons outlined above. For example, if primer region LCO–HCO (DNA barcoding region) were lengthened 300 bp in the 3′ direction, it would contain the optimal 600 bp fragment (Dy# 1831–2430) identified in this study, and this 900 bp fragment could still potentially be sequenced as a single fragment (Gunning et al., 2002). Poor quality DNA template can be sequenced as shorter, overlapping PCR fragments. The added sequencing would contribute a relatively low additional cost since the more substantial investment in obtaining species and extracting DNA will have already been made.

4.4. DNA saturation analysis for evaluating phylogenetic information

Several systematically important patterns of DNA saturation were observed in the three phylogenetic time series examined in our study. Although %Ti generally decreased as nucleotide divergence accumulated, particular areas of maximum divergence often did not correspond to the regions of low %Ti (Fig. 9: P. machaon oregonius × P. hospiton or P. canadensis × P. multicaudatus), contradicting our initial expectation of a clear tradeoff between these two values. For closely related species, it is plausible to assume that saturation is not significant, but it is interesting to note that some areas of low divergence nonetheless already appear to show some saturation, or at least substantial stochastic variation in %Ti (Fig. 9: P. canadensis × P. multicaudatus). We also found similar patterns that appear to correspond in location to tRNA-Leu, which has different functional constraints that may lead to saturation at low levels of divergence (Fig. 9: P. machaon oregonius × P. m. hippocrates and P. canadensis × P. glaucus).

Hebert and Gregory (2005) have stated that projects such as DNA barcoding are not intended to reconstruct phylogenetic relationships, but instead focus explicitly on species delimitation and diagnostics. On the other hand, Hebert et al. (2003a,b, p. 1931) stated that COI “is more likely to provide deeper phylogenetic insights than alternative [genes]”, and as such, many researchers who employ a barcoding approach will try to use the data to examine relationships among their focal species. With this in mind, we also consider it important to survey COI–COII for regions optimal for phylogenetic studies. We evaluated phylogenetic informativeness graphically using DNA saturation analysis, searching for an ideal region that consistently maximizes nucleotide divergence while minimizing saturation (low %Ti). However, nucleotide divergence and %Ti accumulated unequally along COI–COII sequences; among the three time series no optimal region for maximizing nucleotide divergence while minimizing %Ti was evident. This variability suggests that researchers wishing to optimize the use of molecular data for inferring phylogenetic relationships need to examine the underlying patterns of DNA saturation for their taxon of interest if relying on short sequence fragments such as subsets of COI–COII.
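For readers who wish to examine saturation patterns in their own taxa, the following minimal Python sketch computes percent divergence and percent transitions (%Ti) in sliding windows along a pair of aligned sequences. It is an illustrative sketch, not the analysis behind Fig. 9: the window and step sizes are arbitrary defaults, the function names are hypothetical, and divergence is simplified to differences divided by window length.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify(x, y):
    """Return 'ti' (transition), 'tv' (transversion), or None for
    identical or ambiguous sites."""
    if x == y or x not in BASES or y not in BASES:
        return None
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return "ti" if same_class else "tv"


def window_profile(a, b, window=600, step=100):
    """List of (start, percent divergence, %Ti) per window.
    %Ti is None when a window contains no differences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    profile = []
    for start in range(0, len(a) - window + 1, step):
        ti = tv = 0
        seg_a = a[start:start + window].upper()
        seg_b = b[start:start + window].upper()
        for x, y in zip(seg_a, seg_b):
            kind = classify(x, y)
            if kind == "ti":
                ti += 1
            elif kind == "tv":
                tv += 1
        diffs = ti + tv
        pct_ti = 100.0 * ti / diffs if diffs else None
        profile.append((start, 100.0 * diffs / window, pct_ti))
    return profile
```

Plotting the two values against window position gives a profile in which windows that combine high divergence with high %Ti would be candidates for phylogenetic use under the criterion described above.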

5. Conclusion

Over the past decade, molecular sequence data, in conjunction with morphological and ecological characters, have become integral components of the ‘tool box’ (Armstrong and Ball, 2005) of systematics and taxonomy. As DNA sequencing methods become more accessible, increasing numbers of researchers are incorporating, or relying solely on, DNA characters to clarify species problems or identify specimens. The use of such characters will only increase, particularly as the number of DNA barcoding projects continues to grow, and these projects will have significant impacts on systematics, taxonomy, conservation, and pest identification. Species identification should be rapid, and can be routinely obtained using a small number of characters, be they DNA or morphological, if they correspond to a previously described species (Dayrat, 2005). However, since relatively few characters may be needed to identify a species, it is imperative that these characters are accurate. Species delimitation, on the other hand, is best achieved when many different characters are examined, such as independent molecular markers, behavior and morphology. This ideal cannot always be achieved due to constraints in time and money, but every effort must be made to maximize the number of characters used to clarify species boundaries or elucidate an organism’s complex history.

The danger of relying upon a single DNA fragment for identifying and delimiting species is simply that the results may be misleading. If the region of DNA with truly diagnostic nucleotide changes is not sampled, then species identifications may be incorrect. If only a small fragment of DNA is sampled, it may fail to produce an accurate representation of the total genetic variability in that gene. This could result in misrepresentation of the intra- and interspecific divergences between closely related species, leading to an inaccurate species delimitation. Consequently, we offer several recommendations for minimizing such misleading results in future projects that use COI–COII fragments to delimit and diagnose species.

First, we suggest that researchers should maximize the length of the DNA sequence used for initial pilot studies on any taxon. This will decrease the influence of labile sequence regions on sampled nucleotide divergence and phylogenetic signal. Longer fragments will also minimize stochastic variation across taxa and be more likely to reflect broader patterns of nucleotide divergence. The importance of larger fragments and more characters has now been highlighted elsewhere (Erpenbeck et al., 2005), although some projects, such as Barcodes of Life, are still opting to use only relatively short DNA fragments. As a standard for barcoding, the Database Working Group of the Consortium of the Barcode of Life (http://barcoding.si.edu/DNABarCoding.htm) has proposed that all barcodes (unless otherwise approved) be a 648 bp region from the COI gene, starting at position 58 of the mouse mitochondrial genome, and containing at least 500 unambiguous bases (Hanner, 2005). Assigning COI as a standard sequence region was a particularly important step for the DNA barcoding initiative and follows the recommendations of Caterino et al. (2000). Furthermore, since a large compilation of LCO–HCO sequences is now available, it is unreasonable to recommend using an entirely independent COI fragment for the standard sequence, even though a potentially better region exists. Rather, we suggest that sequence length requirements should be expanded. With developments of new and more efficient sequencing it is routinely possible to obtain longer fragments. For arthropods, the extension of the barcoding region by 300–400 bp in the 3′ direction would encompass significantly more variation and better reflect sequence divergence within and between species.
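The length criteria quoted above (a 648 bp region with at least 500 unambiguous bases) and the suggested 3′ extension can be checked with a few lines of Python. This is a minimal sketch under a strong assumption: the sequence is taken to be already trimmed so that its first base is the first base of the barcode region. The function name `barcode_summary` and the returned keys are hypothetical.

```python
def barcode_summary(seq, region_len=648, min_unambiguous=500):
    """Summarize a sequence assumed to start at the first barcode base.

    Counts unambiguous bases (A, C, G, T) inside the barcode region and
    reports how many bases extend beyond it in the 3' direction.
    """
    seq = seq.upper()
    barcode = seq[:region_len]
    unambiguous = sum(1 for base in barcode if base in "ACGT")
    return {
        "unambiguous_in_barcode": unambiguous,
        "meets_minimum": unambiguous >= min_unambiguous,
        "extra_3prime_bases": max(0, len(seq) - region_len),
    }
```

A record with 300 to 400 extra 3′ bases would correspond to the extended fragment recommended here for arthropods while still containing the standard barcode region.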

Second, in-depth projects that rely on mtDNA divergences to delimit species (e.g. Monaghan et al., 2006) could benefit from targeting regions of maximum divergence, particularly for species characterized by low levels of divergence. We did not find a single optimal region of maximal divergence across all taxa; this region varied considerably even between closely related taxa. If fragment length is increased in pilot studies, then consistency can be preserved for identification across broader taxonomic ranges, and these can be followed by more extensive surveys using shorter fragments that maximize the survey efficiency for that taxon.

Third, numerous sequences should be obtained to maximize sampling of diversity within and between species, so that overlap of intra- and interspecific divergence can be more accurately identified. This includes sampling multiple specimens from across the known geographic range of the species (Funk and Omland, 2003).
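Overlap between intra- and interspecific divergence can be checked directly once multiple specimens per species are available. The minimal Python sketch below compares the largest within-species distance with the smallest between-species distance; it assumes aligned sequences of equal length without gaps or ambiguity codes and an uncorrected p-distance, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected proportion of differing sites between aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def divergence_overlap(records):
    """records: list of (species_name, aligned_sequence) tuples.

    Returns (max intraspecific, min interspecific, overlap flag).
    Overlap is True when some within-species distance is at least as
    large as some between-species distance.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra = max(intra) if intra else None
    min_inter = min(inter) if inter else None
    overlap = (max_intra is not None and min_inter is not None
               and max_intra >= min_inter)
    return max_intra, min_inter, overlap
```

Both extremes depend on sampling: adding specimens from across a species' range can only raise the maximum intraspecific value or lower the minimum interspecific value, which is why sparse sampling tends to understate overlap.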

Fourth, we recommend that researchers wishing to use DNA sequence data for both species recognition and phylogenetic inference be aware of the underlying variability in phylogenetic signal, and adjust their choice of DNA region and length to minimize saturation and optimize divergence levels for their taxon of interest.

As molecular systematics evolves, and integrative taxonomy becomes increasingly prevalent, understanding the patterns of evolution in molecular characters becomes increasingly important. Researchers must be aware of the patterns of character change occurring in their data set and consciously consider the effects of these patterns on species delimitation, diagnoses and phylogenetic inference. There are no simple, universal solutions to the full range of problems that systematists routinely deal with, and so our final recommendation is that molecular systematists strive to retain flexibility and nuance in their responses to these challenges.

Acknowledgments

We would like to thank a number of people for help with this manuscript, including T. Fulton for her insightful comments and suggestions on an early version of the manuscript, L. Dosdall, L. Packer, H. Proctor, and J. Spence for comments on the Ph.D. thesis version, and B. Dayrat and an anonymous reviewer for review of the submitted manuscript. For previously unpublished sequence, we are grateful to F. Collins for Anopheles specimens, and to J. Singh, J. Leibovitz, D. Kain, M. Kuo, and D. Rubinoff for DNA extraction and sequencing. We are indebted to H. Proctor, P. Hooper, T. Klassen, W. Gallin and C. MacQuarrie for their patience and helpful suggestions on our statistical analyses. We would also like to thank the members of the Sperling lab for animated discussions about molecular evolution and species delimitations. Research funding was provided by an NSERC grant to Sperling and two NSERC PGS scholarships to Roe.

Appendix A. Specimen information for all examined sequences

Family | Species | DNA# (Haplotype) | Location | GenBank | Reference

Lepidoptera
Tortricidae | Choristoneura biennis Freeman | FS.b-53 (bl) | CAN: BC: Morrisey Creek | DQ792587 [a] | Sperling and Hickey (1994)
Tortricidae | C. biennis Freeman | FS.b-54 (bb) | CAN: BC: Morrisey Creek | L19096 [b] | Sperling and Hickey (1994)
Tortricidae | C. fumiferana (Clemens) | FS.b-12(f2) | CAN: ON: Ignace | L19098 [b] | Sperling and Hickey (1994)
Tortricidae | C. fumiferana (Clemens) | FS.b-37 (fl) | CAN: AB: Manning | L19094 [b] | Sperling and Hickey (1994)
Tortricidae | C. occidentalis Freeman | FS.b-16(ol) | CAN: BC: Monte Creek | DQ792584 [a] | Sperling and Hickey (1994)
Tortricidae | C. occidentalis Freeman | FS.b-367 (ob) | CAN: BC: Bridesville | DQ792585 [a] | Sperling and Hickey (1994)
Tortricidae | C. orae Freeman | FS.b-216(bl) | USA: AK: Fairbanks; phero. 82:9:9 | DQ792586 [a] | Sperling and Hickey (1994)
Tortricidae | C. pinus (Freeman) | FS.b-15(pl) | CAN: ON: Parry Sound | L19095 [b] | Sperling and Hickey (1994)
Tortricidae | C. retiniana (Walsingham) | FS.b-816 | USA: CA: Sierraville | DQ792588 | Previously unpublished
Tortricidae | C. retiniana (Walsingham) | FS.b-817 | USA: CA: Tehachapi | DQ792590 | Previously unpublished
Tortricidae | C. retiniana (Walsingham) | FS.b-866 | USA: NV: Mt. Charleston | DQ792589 | Previously unpublished
Pyralidae | Dioryctria abietella Denis and Schiffermüller | Du64 | China: Henan Province: Mt. Baiyun | DQ247739 | Du et al. (2005)
Pyralidae | D. abietivorella (Grote) | AR22 | USA: CA: Butte Co.: Chico | DQ295185 | Roe et al. (2006)
Pyralidae | D. abietivorella (Grote) | DuO4 | USA: CA: Butte Co.: Chico | DQ247740 | Du et al. (2005)
Pyralidae | D. abietivorella (Grote) | DuO5 | USA: CA: Butte Co.: Chico | DQ247741 | Du et al. (2005)
Pyralidae | D. auranticella (Grote) | AR144 | USA: CA: El Dorado Co., Placerville | DQ295176 | Roe et al. (2006)
Pyralidae | D. auranticella (Grote) | DuO2 | USA: CA: El Dorado Co., Placerville | DQ247736 | Du et al. (2005)
Pyralidae | D. cambiicola (Grote) | AR78 | CAN: BC: Prince George | DQ295183 | Roe et al. (2006)
Pyralidae | D. fordi Donahue and Neunzig | AR157 | USA: CA: Butte Co. Chico | DQ295184 | Roe et al. (2006)
Pyralidae | D. magnifica Munroe | Du69 | China: Mt. Baiyun, Henan Province | DQ247742 | Du et al. (2005)
Pyralidae | D. okanaganella Mutuura, Munroe & Ross | AR150 | USA: CA: El Dorado Co., Placerville | DQ295179 | Roe et al. (2006)
Pyralidae | D. okanaganella Mutuura, Munroe & Ross | AR148 | USA: CA: El Dorado Co., Placerville | DQ295178 | Roe et al. (2006)
Pyralidae | D. pentictonella Mutuura, Munroe & Ross | AR15 | USA: CA: Butte Co.: Chico | DQ295180 | Roe et al. (2006)
Pyralidae | D. pentictonella Mutuura, Munroe & Ross | AR58 | USA: CA: Butte Co.: Chico | DQ295181 | Roe et al. (2006)
Pyralidae | D. pentictonella Mutuura, Munroe & Ross | AR149 | USA: CA: El Dorado Co., Placerville | DQ295182 | Roe et al. (2006)
Pyralidae | D. pseudotsugella Munroe | AR82 | CAN: AB: Vulcan: 10 mi SE | DQ295186 | Roe et al. (2006)
Pyralidae | D. reniculelloides Mutuura and Munroe | DuOl | CAN: AB: Fort MacMurray | DQ247734 | Du et al. (2005)
Pyralidae | D. rubella Hampson | Du21 | China: Tianjin: Mt. Baxian | DQ247743 | Du et al. (2005)
Pyralidae | D. sylvestrella (Ratzeburg) | D130 | Germany: Bavaria: Parkstein-Hutten | DQ247746 | Du et al. (2005)
Pyralidae | D. sylvestrella (Ratzeburg) | SK02087 | Germany: Bavaria: Landshut | DQ247745 | Du et al. (2005)
Pyralidae | D. yiai Mutuura and Munroe | Dul3 | China: Hebei Province: Mt. Xiaowutai | DQ247737 | Du et al. (2005)
Pyralidae | D. yiai Mutuura and Munroe | D17 | China: Tianjin: Mt. Baxian | DQ247738 | Du et al. (2005)
Pyralidae | D. zimmermani (Grote) | Dull8 | USA: MS: Hinds Co. | DQ247730 | Du et al. (2005)
Noctuidae | Feltia jaculifera (Gn.) | FS.b-150 (pheroA) | CAN: AB: Lethbridge | U60990 | Sperling et al. (1996)
Noctuidae | F. jaculifera (Gn.) | FS.b-152 (pheroB) | CAN: AB: Lethbridge | DQ792591 [a] | Sperling et al. (1996)
Noctuidae | F. jaculifera (Gn.) | FS.b-464 (pheroC) | CAN: ON: Ottawa | DQ792592 [a] | Sperling et al. (1996)
Sphingidae | Hyles annei Guerin | 16157 | Chile: Toconao, north | AJ749430 | Hundsoerfer et al. (2005)
Sphingidae | H. e. euphorbiae (L.) | 695887 | France: South | AJ749480 | Hundsoerfer et al. (2005)
Sphingidae | H. e. euphorbiae (L.) | 020c | Spain: Alio: Catalonia | AJ749485 | Hundsoerfer et al. (2005)

Sphingidae | H. e. euphorbiae (L.) | 23172a | Germany | AJ749512 | Hundsoerfer et al. (2005)
Sphingidae | H. e. euphorbiae (L.) | 010 | Spain | AJ749514 | Hundsoerfer et al. (2005)
Sphingidae | H. euphorbiarum (Guerin-Mereville & Percheron) | 23274 | Argentina | AJ749428 | Hundsoerfer et al. (2005)
Sphingidae | H. gallii (Rottemburg) | 0042 | China: Yanqin: near Beijing | AJ749432 | Hundsoerfer et al. (2005)
Sphingidae | H. gallii (Rottemburg) | 695869 | England: ex bred stock, ex Neil West | AJ749433 | Hundsoerfer et al. (2005)
Sphingidae | H. gallii (Rottemburg) | 0027 | Czech Republic: ex bred stock | AJ749450 | Hundsoerfer et al. (2005)
Sphingidae | H. gallii (Rottemburg) | 16189 | Germany | AJ749451 | Hundsoerfer et al. (2005)
Sphingidae | H. gallii (Rottemburg) | 0080 | China | AJ749579 | Hundsoerfer et al. (2005)
Sphingidae | H. gallii (Rottemburg) | 695843 | Finland | AJ749580 | Hundsoerfer et al. (2005)
Sphingidae | H. nicaea castissima (Austaut) | 23208 | Morocco: High Atlas, SW Midelt | AJ749444 | Hundsoerfer et al. (2005)
Sphingidae | H. robertsi peplidis (Christoph) | 695835 | Iran: near Esfahan | AJ749464 | Hundsoerfer et al. (2005)
Sphingidae | H. r. peplidis (Christoph) | 695842 | Iran: near Esfahan | AJ749465 | Hundsoerfer et al. (2005)
Sphingidae | H. r. peplidis (Christoph) | 695872 | Iran: near Esfahan | AJ749466 | Hundsoerfer et al. (2005)
Sphingidae | H. sammuti Eitchberger, Danner & Surholt | 0053 | Malta | AJ749505 | Hundsoerfer et al. (2005)
Sphingidae | H. sammuti Eitchberger, Danner & Surholt | 23239 | Italy: Sicily: Zafferana | AJ749459 | Hundsoerfer et al. (2005)
Sphingidae | H. sammuti Eitchberger, Danner & Surholt | 0054 | Malta | AJ749461 | Hundsoerfer et al. (2005)
Sphingidae | H. sammuti Eitchberger, Danner & Surholt | 0065 | Malta | AJ749463 | Hundsoerfer et al. (2005)
Sphingidae | H. tithymali deserticola (Staudinger) | 055d | Morocco | AJ749497 | Hundsoerfer et al. (2005)
Sphingidae | H. t. deserticola (Staudinger) | 055c | Morocco | AJ749496 | Hundsoerfer et al. (2005)
Sphingidae | H. t. deserticola (Staudinger) | 055b | Morocco | AJ749495 | Hundsoerfer et al. (2005)
Sphingidae | H. t. himyarensis Meerman | 23173a | Yemen | AJ749521 | Hundsoerfer et al. (2005)
Sphingidae | H. t. himyarensis Meerman | 23224 | Yemen | AJ749499 | Hundsoerfer et al. (2005)
Sphingidae | H. t. mauretanica (Staudinger) | 23215 | Morocco | AJ749494 | Hundsoerfer et al. (2005)
Sphingidae | H. t. mauretanica (Staudinger) | 23216 | Morocco | AJ749545 | Hundsoerfer et al. (2005)
Sphingidae | H. t. tithymali (Boisduval) | 084a | Spain: Canary Islands | AJ479486 | Hundsoerfer et al. (2005)
Sphingidae | H. t. tithymali (Boisduval) | 100_AH | Spain: Canary Islands | AJ479488 | Hundsoerfer et al. (2005)
Sphingidae | H. t. gecki de Freina | 23238 | Portugal | AJ749491 | Hundsoerfer et al. (2005)
Sphingidae | H. t. gecki de Freina | 0164 | Portugal | AJ749490 | Hundsoerfer et al. (2005)
Geometridae | Lambdina fiscellaria fiscellaria (Guenee) | FS.b-#9 (Fl) | Canada: NF: Corner Brook | AF064521 | Sperling et al. (1999)
Geometridae | L.f. lugubrosa (Hulst) | FS.b-265 (LI) | Canada: BC: Mud Lake | DQ792593 [a] | Sperling et al. (1999)
Papilionidae | Papilio anchisiades Esper | FS.a-44 | Brazil: Campinas | AF044005 | Caterino and Sperling (1999)
Papilionidae | P. canadensis Rothschild & Jordan | FS.a-16 | USA: NY: Richford | AF044014 | Caterino and Sperling (1999)
Papilionidae | P. cresphontes Cramer | FS.a-170 | USA: WI: Sauk County | AF043999 | Caterino and Sperling (1999)
Papilionidae | P. demodocus demodocus Esper | FS.b-1929 | South Africa: Nelspruit | AY569091 | Zakharov et al. (2004b)
Papilionidae | P. demodocus Esper | FS.b-146 | Kenya (ex pupa) | AY457588 | Zakharov et al. (2004a)
Papilionidae | P. demoleus malaynus | FS.a-68 | Malaysia: Penang Island | AF044000 | Caterino and Sperling (1999)
Papilionidae | P. d. sthenelus W.S. Macleay | FS.b-1832 | Australia: New South Wales | AY569092 | Zakharov et al. (2004b)

Papilionidae | P. erithonioides Grose-Smith | FS.b-1626 | Madagascar: Ankitsanga | AY565095 | Zakharov et al. (2004b)
Papilionidae | P. erostratus Westwood | FS.b-973 | El Salvador | AY457599 | Zakharov et al. (2004a)
Papilionidae | P. glaucus L. | FS.a-69 | USA: MD: Potomac | AF044013 | Caterino and Sperling (1999)
Papilionidae | P. grosesmithi Rothschild | FS.b-1625 | Madagascar: Kirindy | AY569089 | Zakharov et al. (2004b)
Papilionidae | P. grosesmithi Rothschild | FS.b-1624 | Madagascar: Ambahibe | AY569090 | Zakharov et al. (2004b)
Papilionidae | P. hospiton Gene | FS.a-143 | Sardinia | AF044009 | Caterino and Sperling (1999)
Papilionidae | P. indra Reakirt | FS.a-66 | USA: WA: Wawawai | AF044011 | Caterino and Sperling (1999)
Papilionidae | P. machaon machaon L. | FS.a-27 | France: Coudoux | AF044006 | Caterino and Sperling (1999)
Papilionidae | P. m. hippocrates Felder · Felder | FS.b-78 | Japan: Gifu Pref. | AY457593 | Zakharov et al. (2004a)
Papilionidae | P. m. oregonius Edwards | FS.a-77 | USA: WA: Palouse Falls | AF044007 | Caterino and Sperling (1999)
Papilionidae | P. memnon L. | FS.b-91 | Japan: Gifu Pref. | AY457578 | Zakharov et al. (2004a)
Papilionidae | P. multicaudatus Kirby | FS.a-163 | USA: SD: Black Hills | AF044016 | Caterino and Sperling (1999)
Papilionidae | P. palamedes Drury | FS.a-18 | USA: FL: Ocala State Forest | AF044018 | Caterino and Sperling (1999)
Papilionidae | P. polyxenes Fabricius | FS.a-64 | USA: NY: Tompkins County | AF044010 | Caterino and Sperling (1999)
Papilionidae | P. rumanzovia Eshscholtz | FS.b-972 | Philippines | AY457582 | Zakharov et al. (2004a)
Papilionidae | P. rutulus Lucas | FS.a-472 | USA: WA: Orcas Island | AF044015 | Caterino and Sperling (1999)
Papilionidae | P. scamander Boisduval | FS.a-19 | Brazil: Campinas | AF044020 | Caterino and Sperling (1999)
Papilionidae | P. thoas L. | FS.b-302 | French Guiana: Pointe Macouria | AY457601 | Zakharov et al. (2004a)
Papilionidae | P. troilus L. | FS.a-29 | USA: FL: Ocala State Forest | AF044017 | Caterino and Sperling (1999)
Papilionidae | P. xuthus L. | FS.a-238 | Japan: Tokyo | AF043999 | Caterino and Sperling (1999)
Papilionidae | P. zelicaon Lucas | FS.a-76 | USA: CA: Riverside County | AF044008 | Caterino and Sperling (1999)
Yponomeutidae | Yponomeuta cagnagella Hubner | FS.b-467 | Canada: ON: Ottawa | DQ792583 [a] | Sperling et al. (1995)
Yponomeutidae | Y. padella (L.) | FS.b-470 | Canada: BC: Victoria | DQ792582 [a] | Sperling et al. (1995)

Diptera
Culicidae | Anopheles arabiensis Patton | EMI | Burkina Faso: Zaghtouli: CDC Atlanta, strain ARZAG | DQ792576 | Previously unpublished
Culicidae | A. gambiae Giles | — | Strain G3 | NC_002084 | Beard et al. (1993)
Culicidae | A. gambiae Giles | EM3 (Gl) | Kenya: Asembo Bay: CDC Atlanta, strain AS46 | DQ792577 | Previously unpublished
Culicidae | A. gambiae Giles | JA37 (G2) | Kenya: Asembo Bay: CDC Atlanta, strain AS46 | DQ792578 | Previously unpublished
Culicidae | A. melas Theobald | JA22 (ML1) | Gambia: Balingho: CDC Atlanta, strain BAL | DQ792579 | Previously unpublished
Culicidae | A. melas Theobald | EM5 (ML2) | Gambia: Balingho: CDC Atlanta, strain BAL | DQ792580 | Previously unpublished
Culicidae | A. quadriannulatis (Theobald) | EM2 (Ql) | Zimbabwe: CDC Atlanta, strain CHIL | DQ792581 | Previously unpublished
Calliphoridae | Chrysomya albiceps (Wiedemann) | C_alb5 | Egypt: Alexandria: Moharrem Bey | AF083657 | Wells and Sperling (1999)
Calliphoridae | C. bezziana Villeneuve | PNG#10 | Indonesia: Bogor | AF295548 | Wells and Sperling (2001)
Calliphoridae | C. megacephala (F.) | Variant 3 | Malaysia | AY909053 | Tan et al. (2005)
Calliphoridae | C. megacephala (F.) | — | Papua New Guinea: between Lae & Bulolo | AF295551 | Wells and Sperling (2001)
Calliphoridae | C. norrisi James | PNG#50 | Papua New Guinea: Wau | AF295552 | Wells and Sperling (2001)
Calliphoridae | C. rufifacies (Macquart) | Variant 5 | Malaysia | AF909055 | Tan et al. (2005)
Calliphoridae | C. rufifacies (Macquart) | C_ruf2 | USA: FL: Miami | AF083658 | Wells and Sperling (1999)
Calliphoridae | C. varipes (Macquart) | C_varl | Australia: Adelaide | AF295556 | Wells and Sperling (2001)

[a] Previously published but not submitted to GenBank.
[b] Partial COI–COII fragment extended to full 2.3 kb COI-CO.

Appendix B. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ympev.2006.12.005.

References

Armstrong, K.F., Ball, S.L., 2005. DNA barcodes for biosecurity: invasive species identification. Philos. T. Roy. Soc. B. 360, 1813–1823.

Avise, J.C., Arnold, J., Ball, R.M., Bermingham, E., Lamb, T., Neigel, J.E., Reeb, C.A., Saunders, N.C., 1987. Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18, 489–522.

Barrett, R.D.H., Hebert, P.D.N., 2005. Identifying spiders through DNA barcodes. Can. J. Zool. 83, 481–491.

Bartlett, S.E., Davidson, W.S., 1992. FINS (Forensically Informative Nucleotide Sequencing): a procedure for identifying animal origin of biological specimens. Biotechniques 12, 408–411.

Bartolome, C., Maside, X., Yi, S., Grant, A.L., Charlesworth, B., 2005. Patterns of selection on synonymous and nonsynonymous variants in Drosophila miranda. Genetics 169, 1495–1507.

Beard, C.B., Hamm, D.M., Collins, F.H., 1993. The mitochondrial genome of the mosquito Anopheles gambiae: DNA sequence, genome organization, and comparisons with mitochondrial sequences of other insects. Insect Mol. Biol. 2, 103–124.

Besansky, N.J., Krzywinski, J., Lehmann, T., Simard, F., Kern, M., Mukabayire, O., Fontenill, D., Toure, Y., Sagnon, N.F., 2003. Semipermeable species boundaries between Anopheles gambiae and Anopheles arabiensis: evidence from multilocus DNA sequence variation. Proc. Natl. Acad. Sci. USA 100, 10818–10823.

Blair, J.E., Hedges, S.B., 2005. Molecular phylogeny and divergence times of deuterostome animals. Mol. Biol. Evol. 22, 2275–2284.

Blair, C.P., Abrahamson, W.G., Jackman, J.A., Tyrrell, I., 2005. Cryptic speciation and host-race formation in a purportedly generalist tumbling flower beetle. Evolution 59, 304–316.

Bogdanowicz, S.M., Wallner, W.E., Bell, J., O’Dell, T.M., Harrison, R.G., 1993. Asian gypsy moths (Lepidoptera, Lymantridae) in North America – Evidence from molecular data. Ann. Entomol. Soc. Am. 86, 710–715.

Broughton, R.E., Reneau, P.C., 2006. Spatial covariation of mutation and nonsynonymous substitution rates in vertebrate mitochondrial genomes. Mol. Biol. Evol. 23, 1516–1524.

Caterino, M.S., Sperling, F.A.H., 1999. Papilio phylogeny based on mitochondrial cytochrome oxidase I and II genes. Mol. Phylogenet. Evol. 11, 122–137.

Caterino, M.S., Cho, S., Sperling, F.A.H., 2000. The current state of insect molecular systematics: A thriving tower of Babel. Annu. Rev. Entomol. 45, 1–54.

Clary, D.O., Wolstenholme, D.R., 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22, 252–271.

Dayrat, B., 2005. Towards integrative taxonomy. Biol. J. Linn. Soc. 85, 407–415.

Du, Y., Roe, A.D., Sperling, F.A.H., 2005. Phylogenetic framework for Dioryctria (Lepidoptera: Pyralidae: Phycitinae) based on combined analysis of mitochondrial DNA and morphology. Can. Entomol. 137, 685–711.

Dunn, C.P., 2003. Keeping taxonomy based in morphology. Trends Ecol. Evol. 18, 270–271.

Ebach, M.C., Holdrege, C., 2005. DNA barcoding is no substitute for taxonomy. Nature 434, 697.

Erpenbeck, D., Hooper, J.N.A., Worheide, G., 2005. CO1 phylogenies in diploblasts and the ‘Barcoding of Life’ – are we sequencing a suboptimal partition? Mol. Ecol. Notes 1, 1–4.

Folmer, O., Black, M., Hoeh, W., Lutz, R., Vrijenhoek, R., 1994. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 3, 294–299.

Funk, D.J., Omland, K.E., 2003. Species-level paraphyly and polyphyly: frequency, causes, and consequences, with insights from animal mitochondrial DNA. Annu. Rev. Ecol. Evol. Syst. 34, 397–423.

Galtier, N., Enard, D., Ranondy, Y., Bazin, E., Belkhir, K., 2006. Mutation hot spots in mammalian mitochondrial DNA. Genome Res. 16, 215–222.

Goetze, E., 2003. Cryptic speciation on the high seas; global phylogenetics of the copepod family Eucalanidae. Proc. Roy. Soc. Lond. B. Biol. 270, 2321–2331.

Gunning, K.M., McIntyre, E., Brown, C., Chansky, K., Bourey, J., Johnson, B., Wheaton, A., Wike, C. 2002. ABI PRISM® 3100 Genetic Analyzer: Further advances to expand productivity. Applied Biosystems Document #105549. Accessed August 1, 2006. Available from: http://docs.appliedbiosystems.com/basket.taf?_UserReference=FCC053A885AACB6344CF7A6D&_basket=add&searchid=.

Hanner, R. 2005. Proposed standards for BARCODE records in INSDC (BRIs). p. 3. National Museum of Natural History, Washington, DC. Accessed August 1, 2006. (Barcode Data Standards: http://barcoding.si.edu/resources.html).

Hebert, P.D.N., Gregory, T.R., 2005. The promise of DNA barcoding for taxonomy. Syst. Biol. 54, 852–859.

Hebert, P.D.N., Cywinska, A., Ball, S.L., deWaard, J.R., 2003a. Biological identification through DNA barcodes. Proc. Roy. Soc. Lond. B. Biol. 270, 313–321.

Hebert, P.D.N., Ratnasingham, S., deWaard, J.R., 2003b. Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc. Roy. Soc. Lond. B. Biol. (Suppl. 27), S96–S99.

Hebert, P.D.N., Stoeckle, M.Y., Zemiak, T.S., Francis, C.M., 2004. Identification of birds through DNA barcodes. Public Libr. Sci. Biol. 2, e312.

Hebert, P.D.N., Penton, E.H., Burns, J.M., Janzen, D.H., Hallwachs, W., 2005. Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proc. Natl. Acad. Sci. USA 101, 14812–14817.

Hendrixson, B.E., Bond, J.E., 2005. Testing species boundaries in the Anthrodiaetus unicolor complex (Araneae: Mygalomorphae: Antrodiaetidae): “Paraphyly” and cryptic diversity. Mol. Phylogenet. Evol. 36, 405–416.

Hundsoerfer, A.K., Kitching, I.J., Wink, M., 2005. A molecular phylogeny of the hawkmoth genus Hyles (Lepidoptera: Sphingidae: Macroglossinae). Mol. Phylogenet. Evol. 35, 442–458.

Innan, H., Nordborg, M., 2002. Recombination or mutational hot spots in human mtDNA? Mol. Biol. Evol. 19, 1122–1127.

Johnson, K.P., Williams, B.L., Drown, D.M., Adams, R.J., Clayton, D.H., 2002. The population genetics of host specificity: genetic differentiation in dove lice (Insects: Phthiraptera). Mol. Ecol. 11, 25–38.

Lambert, D.M., Baker, A., Huynen, L., Haddrath, O., Hebert, P.D.N., Millar, C.D., 2005. Is a large-scale DNA-based inventory of ancient life possible? J. Hered. 96, 279–284.

Lin, C.-P., Danforth, B.N., 2003. How do insect nuclear and mitochondrial gene substitution patterns differ? Insights from Bayesian analyses of combined datasets. Mol. Phylogenet. Evol. 30, 686–702.

Lipscomb, D., Platnick, N., Wheeler, Q., 2003. The intellectual content of taxonomy: a comment on DNA taxonomy. Trends Ecol. Evol. 18, 65–66.

Lunt, D.H., Zhang, D.-X., Szymura, J.M., Hewitt, G.M., 1996. The insect cytochrome oxidase I gene: evolutionary patterns and conserved primers for phylogenetic studies. Insect Mol. Biol. 5, 153–165.

Maddison, W.P., Maddison, D.R. 2004. Mesquite: A modular system for evolutionary analysis. Version 1.11. Accessed August 1, 2006. Available from: http://mesquiteproject.org.

Meyer, C.P., Paulay, G., 2005. DNA barcoding: error rates based on comprehensive sampling. Public Libr. Sci. Biol. 3, e422.

Meyer, A., Zardoya, R., 2003. Recent advances in the (molecular)phylogeny of vertebrates. Annu. Rev. Ecol. Evol. Syst. 34, 311–338.

Monaghan, M.T., Balke, M., Pons, J., Vogler, A.P., 2006. Beyondbarcodes: complex DNA taxonomy of a South Pacific island radiation.Proc. Roy. Soc. Lond. B. Biol. 273, 887–893.

Moritz, C., Cicero, C., 2004. DNA barcoding: promise and pitfalls. PublicLibr. Sci. Biol. 2, e354.

Nei, M., 1987. Molecular Evolutionary Genetics. Columbia UniversityPress, New York.

Olson, R.R., 1991. Whose larvae? Nature 351, 352–353.Page, T.J., Choy, S.C., Hughes, J.M., 2005. The taxonomic feedback loop:

symbiosis of morphology and molecules. Biol. Lett. 1, 139–142.Penton, E.H., Hebert, P.D.N., Crease, T.J., 2004. Mitochondrial DNA

variation in North American populations of Daphnia obtusa: contin-entalism or cryptic endemism? Mol. Ecol. 13, 97–107.

Posada, D., Crandall, K.A., 1998. Modeltest: testing the model of DNAsubstitution. Bioinformatics 14, 817–818.

Regier, J.C., Shultz, J.W., Kambic, R.E., 2005. Pancrustacean phylogeny:hexapods are terrestrial crustaceans and maxillopods are not mono-phyletic. Proc. Roy. Soc. Lond. B. Biol. 272, 395–401.

Roe, A.D., Stein, J.D., Gillette, N.E., Sperling, F.A.H., 2006. Identifica-tion of Dioryctria (Lepidoptera: Pyralidae) in a seed orchard at Chico,California. Ann. Entomol. Soc. Am. 99, 433–448.

Rozas, J., Sanchez-DelBarrio, J.C., Messeguer, X., Rozas, R., 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19, 2496–2497.

Saraste, M., 1990. Structural features of cytochrome oxidase. Quart. Rev. Biophys. 23, 331–366.

Scheffer, S.J., Giblin-Davis, R.M., Taylor, G.S., Davies, K.A., Purcell, M., Lewis, M.I., Goolsby, J., Center, T.D., 2004. Phylogenetic relationships, species limits, and host specificity of gall-forming Fergusonina flies (Diptera: Fergusoninidae) feeding on Melaleuca (Myrtaceae). Ann. Entomol. Soc. Am. 97, 1216–1221.

Seberg, O., Humphries, C.J., Knapp, S., Stevenson, D.W., Petersen, G., Scharff, N., Andersen, N.M., 2003. Shortcuts in systematics? A commentary on DNA-based taxonomy. Trends Ecol. Evol. 18, 63–65.

Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H., Flook, P., 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann. Entomol. Soc. Am. 87, 651–701.

Sperling, F.A.H., 2003a. Butterfly molecular systematics: from species definitions to higher-level phylogenies. In: Boggs, C., Watt, W., Ehrlich, P. (Eds.), Butterflies: Evolution and Ecology Taking Flight. University of Chicago Press, Chicago, pp. 431–458.

Sperling, F.A.H., 2003b. DNA barcoding: Deus ex Machina. Newsletter of the Biological Survey of Canada (Terrestrial Arthropods). Opinion Page [online] 22. Accessed January 15, 2006. Available from: http://www.biology.ualberta.ca/bsc/news22_2/opinionpage.htm.

Sperling, F.A.H., Hickey, D.A., 1994. Mitochondrial DNA sequence variation in the spruce budworm species complex (Choristoneura: Lepidoptera). Mol. Biol. Evol. 11, 656–665.

Sperling, F.A.H., Anderson, G.S., Hickey, D.A., 1994. A DNA-based approach to the identification of insect species used for postmortem interval estimation. J. Forensic Sci. 39, 418–427.

Sperling, F.A.H., Landry, J.-F., Hickey, D.A., 1995. DNA-based identification of introduced ermine moth species in North America (Lepidoptera: Yponomeutidae). Ann. Entomol. Soc. Am. 88, 155–162.

Sperling, F.A.H., Byers, R., Hickey, D.A., 1996. Mitochondrial DNA sequence variation among pheromotypes of the dingy cutworm, Feltia jaculifera (Gn.) (Lepidoptera: Noctuidae). Can. J. Zool. 74, 2109–2117.

Sperling, F.A.H., Raske, A.G., Otvos, I.S., 1999. Mitochondrial DNA sequence variation among populations and host races of Lambdina fiscellaria (Gn.) (Lepidoptera: Geometridae). Insect Mol. Biol. 8, 97–106.

Stoneking, M., 2000. Hypervariable sites in the mtDNA control region are mutational hotspots. Am. J. Hum. Genet. 67, 1029–1032.

Swofford, D.L., Olsen, G.J., Waddell, P.J., Hillis, D.M., 1996. Phylogenetic inference. In: Hillis, D.M., Moritz, C., Mable, B. (Eds.), Molecular Systematics. Sinauer Associates, Sunderland, MA, pp. 407–514.

Tajima, F., 1983. Evolutionary relationship of DNA sequences in finite populations. Genetics 105, 437–460.

Tan, S.H., Mohd-Aris, E., Omar, B., Surin, J., Mohamed, Z., 2005. DNA-based characteristics of forensically important insect species in Malaysia. GenBank. Accessed August 1, 2006. Available from: http://www.ncbi.nlm.nih.gov/.

Tautz, D., Arctander, P., Minelli, A., Thomas, R.H., Vogler, A.P., 2003. A plea for DNA taxonomy. Trends Ecol. Evol. 18, 70–74.

Wells, J.D., Sperling, F.A.H., 1999. Molecular phylogeny of Chrysomya albiceps and C. rufifacies (Diptera: Calliphoridae). J. Med. Entomol. 36, 222–226.

Wells, J.D., Sperling, F.A.H., 2001. DNA-based identification of forensically important Chrysomyinae (Diptera: Calliphoridae). Forensic Sci. Int. 3065, 1–6.

Wells, J.D., Pape, T., Sperling, F.A.H., 2001. DNA-based identification and molecular systematics of forensically important Sarcophagidae (Diptera). J. Forensic Sci. 46, 1098–1102.

Whiteman, N.K., Santiago-Alarcon, D., Johnson, K.P., Parker, P.G., 2004. Differences in straggling rates between two genera of dove lice (Insecta: Phthiraptera) reinforce population genetic and cophylogenetic patterns. Int. J. Parasitol. 34, 1113–1119.

Will, K.W., Rubinoff, D., 2004. Myth of the molecule: DNA barcodes for species cannot replace morphology for identification and classification. Cladistics 20, 47–55.

Will, K.W., Mishler, B.D., Wheeler, Q.D., 2005. The perils of DNA barcoding and need for integrative taxonomy. Syst. Biol. 54, 844–851.

Zakharov, E.V., Caterino, M.S., Sperling, F.A.H., 2004a. Molecular phylogeny, historical biogeography, and divergence time estimates for swallowtail butterflies of the genus Papilio (Lepidoptera: Papilionidae). Syst. Biol. 53, 193–215.

Zakharov, E.V., Smith, C.R., Lees, D.C., Cameron, A., Vane-Wright, R.I., Sperling, F.A.H., 2004b. Independent gene phylogenies and morphology demonstrate a Malagasy origin for a wide-ranging group of swallowtail butterflies. Evolution 58, 2763–2782.

Zhang, D.-X., Hewitt, G.M., 2003. Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects. Mol. Ecol. 12, 563–584.