5
Proc. Natl. Acad. Sci. USA Vol. 92, pp. 280-284, January 1995 Evolution Bidirectional interlocus concerted evolution following allopolyploid speciation in cotton (Gossypium) (hybridization/molecular evolution/phylogeny reconstruction/polyploidy/ribosomal DNA) JONATHAN F. WENDEL*, ANDREW SCHNABEL, AND TosAK SEELANAN Department of Botany, Iowa State University, Ames, IA 50011 Communicated by Major M. Goodman, North Carolina State University, Raleigh, NC, October 3, 1994 (received for review July 12, 1994) ABSTRACT Polyploidy is a prominent process in plant evolution; yet few data address the question of whether homeol- ogous sequences evolve independently subsequent to polyploidization. We report on ribosomal DNA (rDNA) evolution in five aliopolyploid (AD genome) species of cotton (Gossypium) and species representing their diploid progenitors (A genome, D genome). Sequence data from the internal transcribed spacer regions (1TSl and ITS2) and the 5.8S gene indicate that rDNA arrays are homogeneous, or nearly so, in all diploids and allopolyploids examined. Because these arrays occur at four chromosomal loci in allopolyploid cotton, two in each subge- nome, repeats from different arrays must have become homog- enized by interlocus concerted evolution. Southern hybridization analysis combined with copy-number estimation demonstrate that this process has gone to completion in the diploids and to completion or near-completion in all allopolyploid species and that it most likely involves the entire rDNA repeat Phylogenetic analysis demonstrates that interlocus concerted evolution has been bidirectional in allopolyploid species-i.e., rDNA from four polyploid lineages has been homogenized to a D genome repeat type, whereas sequences from Gossypium mustelinum have con- certed to an A genome repeat type. Although little is known regarding the functional significance of interlocus concerted evolution of homeologous sequences, this study demonstrates that the process occurs for tandemly repeated sequences in diploid and polyploid plants. That interlocus concerted evolution can occur bidirectionally subsequent to hybridization and polyploidization has significant implications for phylogeny re- construction, especially when based on rDNA sequences. Approximately 70% of angiosperms are thought to have experienced one or more episodes of polyploidy at some point in the past (1). Despite the prevalence of polyploidy in plants, surprisingly little is known regarding its genetic and genomic consequences. There are few data, for example, on the mode and tempo of homeologous sequence evolution in polyploid genomes. These sequences, which evolved independently while isolated in diverging diploid genomes, are united in a common nucleus at the time of polyploid formation, thereby providing the possibility of nonindependent molecular evolution. We have initiated studies of repeated sequence evolution (2) in a model system involving five allopolyploid species of cotton (Gossypium) and species that represent their two diploid progenitors (Fig. 1). The allopolyploid ("AD genome"; (2n = 4 x = 52) species originated following hybridization of an African or Asian diploid ("A genome"; 2n = 26) species with a diploid American ("D genome"; 2n = 26) species (4,5). Data suggest that the best models of the ancestral D genome parent are G. raimondii (4) and G. gossypioides (3) and that the A genome donor was similar to present-day G. herbaceum and G. arboreum (4). Polyploidization is suggested to have occurred in the mid-Pleistocene, following trans-oceanic dispersal of an A The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. genome taxon to the New World, which served as the female (seed) parent in the initial hybridization (3, 5). Following polyploidization, the allopolyploids diverged into several lineages represented by five modern species, including the commercially important species G. hirsutum ("upland cotton") and G. barba- dense ("pima" and "Egyptian" cotton). We chose to study rDNA as an example of a highly repeated, tandemly arranged sequence for two reasons. (i) Higher plant rDNA is organized into arrays at one or more chromosomal locations (6, 7); each array contains hundreds to thousands of identical to near-identical repeats (Fig. 2), the repeats having become homogenized by evolutionary forces [e.g., unequal crossing-over (9, 10), gene conversion (11-13)] that are col- lectively referred to as concerted evolution (14, 15). Although concerted evolutionary forces have long been known to ho- mogenize repeats within individual arrays (8, 15), there are few robust demonstrations of concerted evolution among repeats from different arrays (9, 13, 15). (ii) In situ hybridization experiments have shown that four rDNA loci exist in al- lopolyploid G. hirsutum, two in each of the A and D subge- nomes (16), thereby providing the opportunity for interlocus concerted evolution at the diploid and polyploid levels. The ITS region, containing the 5.8S RNA gene flanked by two spacer sequences (Fig. 2), was selected for study due to its relatively rapid rate of sequence evolution (7, 8). Describing the evolutionary dynamics of multilocus ITS sequences is additionally significant in that sequence variation in this region recently has become widely used for phylogeny reconstruction in plants (17-22).t MATERIALS AND METHODS Genomic DNAs were isolated (23) from individual plants of both extant A genome diploids (G. arboreum and G. herba- ceum), two representative D genome diploids (G. thurberi and G. raimondii), and all five existing allopolyploids (Table 1). For comparative purposes, we included an allopolyploid syn- thesized by colchicine-doubling the sterile intergenomic hybrid G. arboreum x G. thurberi. Because the C genome diploids are basal within the genus (3), we selected the C genome species G. robinsonii for purposes of rooting phylogenetic trees. To generate single-stranded ITS DNA for sequencing, we conducted two-stage PCR amplifications (24). We performed symmetric amplifications using primers a and b (Fig. 2), followed by asymmetric amplifications using aliquots of the double- stranded PCR products as template and only a single primer (a or b). Single-stranded DNAs were purified by ultrafiltration prior to sequencing. Because of problems presumably introduced by secondary structure, we sequenced both DNA strands with and without dGTP analogs, using the two amplification primers and Abbreviations: ITS, internal transcribed spacer; rDNA, ribosomal DNA. *To whom reprint requests should be addressed. tThe sequences discussed in this paper have been deposited in the GenBank data base (accession nos. U12710-U12719). 280

Bidirectional interlocus concerted evolutionfollowing ... · PCR-amplified ITS fragments were quantified fluorometri-cally and digested with the appropriate restriction enzymes; amounts

Embed Size (px)

Citation preview

Proc. Natl. Acad. Sci. USAVol. 92, pp. 280-284, January 1995Evolution

Bidirectional interlocus concerted evolution followingallopolyploid speciation in cotton (Gossypium)

(hybridization/molecular evolution/phylogeny reconstruction/polyploidy/ribosomal DNA)

JONATHAN F. WENDEL*, ANDREW SCHNABEL, AND TosAK SEELANANDepartment of Botany, Iowa State University, Ames, IA 50011

Communicated by Major M. Goodman, North Carolina State University, Raleigh, NC, October 3, 1994 (received for review July 12, 1994)

ABSTRACT Polyploidy is a prominent process in plantevolution; yet few data address the question of whether homeol-ogous sequences evolve independently subsequent topolyploidization. We report on ribosomal DNA (rDNA) evolutionin five aliopolyploid (AD genome) species of cotton (Gossypium)and species representing their diploid progenitors (A genome,Dgenome). Sequence data from the internal transcribed spacerregions (1TSl and ITS2) and the 5.8S gene indicate that rDNAarrays are homogeneous, or nearly so, in all diploids andallopolyploids examined. Because these arrays occur at fourchromosomal loci in allopolyploid cotton, two in each subge-nome, repeats from different arrays must have become homog-enized by interlocus concerted evolution. Southern hybridizationanalysis combined with copy-number estimation demonstratethat this process has gone to completion in the diploids and tocompletion or near-completion in all allopolyploid species andthat it most likely involves the entire rDNA repeat Phylogeneticanalysis demonstrates that interlocus concerted evolution hasbeen bidirectional in allopolyploid species-i.e., rDNA from fourpolyploid lineages has been homogenized to a D genome repeattype, whereas sequences from Gossypium mustelinum have con-certed to an A genome repeat type. Although little is knownregarding the functional significance of interlocus concertedevolution of homeologous sequences, this study demonstratesthat the process occurs for tandemly repeated sequences indiploid and polyploid plants. That interlocus concerted evolutioncan occur bidirectionally subsequent to hybridization andpolyploidization has significant implications for phylogeny re-construction, especially when based on rDNA sequences.

Approximately 70% of angiosperms are thought to haveexperienced one or more episodes of polyploidy at some pointin the past (1). Despite the prevalence of polyploidy in plants,surprisingly little is known regarding its genetic and genomicconsequences. There are few data, for example, on the modeand tempo of homeologous sequence evolution in polyploidgenomes. These sequences, which evolved independently whileisolated in diverging diploid genomes, are united in a commonnucleus at the time of polyploid formation, thereby providingthe possibility of nonindependent molecular evolution.We have initiated studies of repeated sequence evolution (2)

in a model system involving five allopolyploid species of cotton(Gossypium) and species that represent their two diploidprogenitors (Fig. 1). The allopolyploid ("AD genome"; (2n =4 x = 52) species originated following hybridization of anAfrican or Asian diploid ("A genome"; 2n = 26) species witha diploid American ("D genome"; 2n = 26) species (4,5). Datasuggest that the best models of the ancestral D genome parentare G. raimondii (4) and G. gossypioides (3) and that the Agenome donor was similar to present-day G. herbaceum and G.arboreum (4). Polyploidization is suggested to have occurred inthe mid-Pleistocene, following trans-oceanic dispersal of an A

The publication costs of this article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertisement" inaccordance with 18 U.S.C. §1734 solely to indicate this fact.

genome taxon to the New World, which served as the female(seed) parent in the initial hybridization (3, 5). Followingpolyploidization, the allopolyploids diverged into several lineagesrepresented by five modern species, including the commerciallyimportant species G. hirsutum ("upland cotton") and G. barba-dense ("pima" and "Egyptian" cotton).We chose to study rDNA as an example of a highly repeated,

tandemly arranged sequence for two reasons. (i) Higher plantrDNA is organized into arrays at one or more chromosomallocations (6, 7); each array contains hundreds to thousands ofidentical to near-identical repeats (Fig. 2), the repeats havingbecome homogenized by evolutionary forces [e.g., unequalcrossing-over (9, 10), gene conversion (11-13)] that are col-lectively referred to as concerted evolution (14, 15). Althoughconcerted evolutionary forces have long been known to ho-mogenize repeats within individual arrays (8, 15), there are fewrobust demonstrations of concerted evolution among repeatsfrom different arrays (9, 13, 15). (ii) In situ hybridizationexperiments have shown that four rDNA loci exist in al-lopolyploid G. hirsutum, two in each of the A and D subge-nomes (16), thereby providing the opportunity for interlocusconcerted evolution at the diploid and polyploid levels. TheITS region, containing the 5.8S RNA gene flanked by twospacer sequences (Fig. 2), was selected for study due to itsrelatively rapid rate of sequence evolution (7, 8). Describingthe evolutionary dynamics of multilocus ITS sequences isadditionally significant in that sequence variation in this regionrecently has become widely used for phylogeny reconstructionin plants (17-22).t

MATERIALS AND METHODS

Genomic DNAs were isolated (23) from individual plants ofboth extant A genome diploids (G. arboreum and G. herba-ceum), two representative D genome diploids (G. thurberi andG. raimondii), and all five existing allopolyploids (Table 1).For comparative purposes, we included an allopolyploid syn-thesized by colchicine-doubling the sterile intergenomic hybridG. arboreum x G. thurberi. Because the C genome diploids arebasal within the genus (3), we selected the C genome speciesG. robinsonii for purposes of rooting phylogenetic trees.To generate single-stranded ITS DNA for sequencing, we

conducted two-stage PCR amplifications (24). We performedsymmetric amplifications using primers a and b (Fig. 2), followedby asymmetric amplifications using aliquots of the double-stranded PCR products as template and only a single primer (aor b). Single-stranded DNAswere purified by ultrafiltration priorto sequencing. Because of problems presumably introduced bysecondary structure, we sequenced both DNA strands with andwithout dGTP analogs, using the two amplification primers and

Abbreviations: ITS, internal transcribed spacer; rDNA, ribosomalDNA.*To whom reprint requests should be addressed.tThe sequences discussed in this paper have been deposited in theGenBank data base (accession nos. U12710-U12719).

280

Proc. Natl Acad Sci USA 92 (1995) 281

FIG. 1. Organismal context for the evaluation of concerted evolutionof ribosomal DNA (rDNA) in the genus Gossypium Phylogenetic rela-tionships (solid lines) of three groups of diploid cotton are illustrated (3).Genomes of these groups vary >2-fold in DNA content and have beenassigned genomic designations (boxes) based on meiotic pairing behaviorin interspecies hybrids (4). A genome species are African-Asian in originwhile D genome species are endemic to the New World subtropics,primarily Mexico. The Australian, C genome diploids are basal within thegenus (3), thereby providing outgroup taxa for phylogenetic analysis.Hybridization betweenA genome and D genome representatives led tothe evolution of the New World allopolyploid -cottons (AD genome).Chloroplast DNA sequence divergence data suggest that theA genomeand D genome groups diverged from a common ancestor 5-10 millionyears ago and that the two diverged genomes becamne reunited in acommon nucleus, via allopolyploidization, in the mid-Pleistocene (3, 5).

two internal sequencing primers (c and d; Fig. 2). Sequences weregenerated by standard methods of dideoxy sequencing.We aligned the sequences using the computer program

MALIGN (25) and made minor adjustments manually to theresulting alignments. The phylogeny of these sequences wasexplored by using character-based and distance-based ap-proaches to phylogeny reconstruction. Maximum parsimonyanalysis was conducted with the aid of the computer programPAUP version 3.Os (26). In this analysis, nucleotide transfor-mations were weighted equally, and gaps were treated asadditional character states, with the exception of a single 15-bpgap that was treated as a binary, presence/absence character.For distance-based phylogeny estimation, we translated theobserved distances between all pairs of sequences to Jukes-Cantor or Kimura-2 "corrected" distances and subjected theresulting distance matrices to neighbor-joining analysis. Theseanalyses were facilitated by the computer program MEGA (27).

/-- \--

a

Table 1. Description of Gossypium accessions studied

Taxon Genome Accession Geographic origin

G. herbaceum Ai A1-73 BotswanaG. arboreum A2 A2-74 ChinaG. thurberi Di T no. 17 Arizona, USAG. raimondii D5 D5-37 PeruG. hirsutum (AD)1 TX2094 Yucatan, MexicoG. hirsutum (AD), Palmeri Oaxaca, MexicoG. barbadense (AD)2 K10l BoliviaG. tomentosum (AD)3 WT936 Hawaii, USAG. mustelinum (AD)4 W400 Bahia, BrazilG. mustelinum (AD)4 JL Ceara, BrazilG. darwinji (AD)s WB1215 Galapagos IslandsDoubled [G. arboreum 2(A2Dl) Beasley Syntheticx G. thurberi] allopolyploid

G. robinsonii C2 AZ-50 Western Australia

Genomic designations follow ref. 4.

rDNA copy number was estimated for each taxon by South-ern blot analysis. Aliquots of genomic DNAs that had beenquantified by fluororhetry were digested with restriction en-zymes (Acc I, Mbo I, Xho I,Xmn I) that were predicted, basedon the DNA sequence data, to lead to genome-specific ITScleavage patterns. Resulting fragments were separated byagarose gel electrophoresis (3.5% NuSieve 3:1, FMC) andtransferred to nylon membranes using standard procedures.Previous quantitation of genome sizes for Gossypium diploidsand allopolyploids (28, 29) facilitated the loading of specificnumbers of 1C genomic equivalents per lane, these rangingfrom 5 x 103 to 5 x 105. For internal standards on each blot,PCR-amplified ITS fragments were quantified fluorometri-cally and digested with the appropriate restriction enzymes;amounts ranging from 5 x 107 to 5 x 108 ITS copies wereloaded per lane. Hybridization probes were 32P-labeled, PCR-amplified ITS sequences from G. mustelinum or G. hirsutum.Hybridization and wash conditions were as described (30).

Restriction site variation in the rDNA nontranscribedspacer, external transcribed spacer, and the 18S and 26Scoding regions (Fig. 2) was assessed by Southern blot analysis.Fifteen restriction enzymes were used (listed in ref. 31).Membranes were probed with a labeled clone, derived from G.hirsutum (Ghrl), containing a full-length (9.4 kb) rDNArepeat (31). Restriction site maps of the ITS sequence dataallowed us to determine whether the resulting polymorphismswere due to variation in the ITS region or elsewhere within therepeat.

RESULTSITS sequences were deposited in GenBank under the followingaccession numbers: G. robinsonii, U12710; G. arboreum,

FIG. 2. Nuclear ribosomal genes in higher plants are organized into one or more arrays containing hundreds to thousands of tandemly arranged repeats(6, 8). Each repeat, which is -9.4 kb in length in Gossypium, consists of a nontranscribed spacer region and an external transcribed spacer (solid lines,upper) and a coding region (boxes, upper) that contains genes for the large and small subunit RNAs and the 5.8S RNA (163-164 bp in length). The twointernal transcribed spacer sequences (ITS1 and ITS2, lower) are cleaved from precursor transcripts during formation of the mature RNAs. Half-arrowsdenote PCR primers used for amplification and sequencing: a = 5'-GGAAGTAAAAGTCGTAACAAGG-3'; b = 5'-TCCTCCTCCGCTTATT-GATATGC-3'; c = 5'-GCATCGATGAAGAACGCAGC-3'; d = 5'-GCTGCGTTCTTCATCGATGC-3'.

Evolution: Wendel et al.

Proc. NatL Acad Sci USA 92 (1995)

U12712; G. herbaceum, U12713; G. mustelinum, U12714; G.barbadense, U12715; G. darwinii, U12716; G. hirsutum,U12719; G. tomentosum, U12717; G. thurberi, U12711; G.raimondii, U12718. Comparisons of these data with publishedsequences (17,32-34) allowed us to infer the boundaries of thethree components of the ITS region (ITS1, ITS2, and the 5.8Sgene). Because all Gossypium sequences are nearly identical inlength, sequence alignment necessitated the insertion of onlyfive gaps. Four of these (three in ITS1, two in ITS2) were asingle nucleotide in length, while the fifth was a 15-bp gap inpositions 670-684 of the sequences from G. arboreum, G.herbaceum, and G. mustelinum. ITS1 is longer in Gossypium(aligned length = 295 bp; absolute length = 293-294 bp) thanin other angiosperms studied to date (ITS1 is usually 200-260bp in length; refs. 17-22), but it has a typical GC content(58%). For ITS2, GC content (61%) and length (210-226 bp)are typical of other angiosperms. As expected, the 164-bp 5.8Sgene is strongly conserved, with no variable sites detected.Excluding polymorphisms introduced by indels, we observedover twice as many variable nucleotide positions in ITS1 (59)than in ITS2 (25).Although the DNA sequences were generated from a po-

tentially heterogeneous PCR-amplified pool of ITS fragments,monomorphic sequencing ladders were observed in all diploidsand allopolyploids examined. Sequences from a syntheticallopolyploid (doubled G. arboreum x G. thurberi) and fromdiploid hybrids, however, displayed all polymorphisms pre-dicted from the parental sequences. These observations sug-gest that the PCR pools were relatively homogeneous-i.e.,that sequence heterogeneity among repeats is minimal indiploid and allopolyploid Gossypium. When combined with thein situ hybridization data (16) that demonstrate four rDNA lociin allopolyploid G. hirsutum (two in each of its two subge-nomes), the observation of ITS homogeneity leads us toconclude that there must have been complete or nearlycomplete interlocus concerted evolution of the ITS region indiploid and polyploid cotton species.To evaluate whether the phenomenon of interlocus con-

certed evolution has extended beyond the ITS region intoother components of the 9.4-kb rDNA repeat, we analyzed

0 . e4

4-SA a?

mic DNAs, digested with 15 restriction enzymes, byiern hybridization, using labeled rDNA as a probe.-ularly informative were restriction enzymes (e.g.,ApaLI,II, EcoRI, Sac I) that have no recognition sequences inT'S region but reveal A genome and D genome specificdization patterns. In all such cases, we observed only onee two patterns in allopolyploid cotton (not shown).)ugh we did not map the restriction site variants observed,must have arisen from polymorphisms located in eitherSS gene, the 26S gene, or the large intergenic spacer (=anscribed spacer plus external transcribed spacer; Fig. 2).ibsence of heterogeneity among repeats within individ-br these polymorphisms is congruent with the results forPS region, suggesting that the entire repeat has undergoneocus concerted evolution.[lowing an exhaustive search, phylogenetic analysis of theequences produced a single most-parsimonious tree (Fig.,ith two primary, strongly supported branches (an Ane clade and aD genome clade). Topologically congruentwere obtained from distance-based (neighbor-joining)tgeny reconstruction methods and from analyses based onotide variation for only ITS1 or ITS2 (not shown). The;tness of this phylogenetic hypothesis is indicated by (i)g character support for the two divergent branches (220 character-state changes, respectively); (ii) high inter-cter congruence, as demonstrated by the fact that onlyucleotide characters were homoplasious; and (iii) topo-.1 stability with respect to distance-based phylogeneticods.e most striking feature of the ITS phylogeny is theigruence between the well-established organismal tree1) and the "gene tree" (Fig. 3). In the absence ofrted evolutionary forces, and given complete sampling,nces from all allopolyploids should be evident on bothclades (A genome and D genome), because each al-

yploid is expected to contain two sets of paralogous loci.results, however, show only one sequence for each al-yploid species, and, moreover, these sequences are notiphyletic: four of the five allopolyploids and both Dne diploids constitute one of the two primary branches,

MN C-, *.'0-T 4P 4N'

0 ct ta

FIG. 3. Phylogeny of rDNA ITS sequences in Gossypium. Taxon names are followed by their alphabetical genome designations. This singlemost-parsimonious cladogram (length = 90), obtained from an exhaustive search, is topologically identical to trees obtained using neighbor-joininganalysis. The phylogeny is rooted with the Australian, outgroup taxon G. robinsonii. Numbers indicate character support for each branch segment(overall consistency index = 0.98; retention index = 0.98). ITS sequences from the allopolyploid species (AD genome) occur on both primarybranches, demonstrating that different polyploid lineages have experienced homogenization to alternative rDNA repeat types.

282 Evolution: Wendel et al.

Proc. Natt Acad Sci USA 92 (1995) 283

while the other clade consists of the twoA genome diploids andthe sole remaining allopolyploid, G. mustelinum. To confirmthis result and check for PCR contamination, we analyzedsequences from an additional accession of one allopolyploidspecies on each branch-i.e., G. hirsutum and G. mustelinum(Table 1). We selected these accessions according to expec-tations that they would be maximally divergent from the initialaccessions studied, based on previous evidence (35-37). Inboth cases, the newly generated ITS sequence was identical tothat generated from the first accession studied. When com-bined with the observation of relatively homogeneous PCRpools and the in situ hybridization data, the phylogenetic resultthat theAD genome allopolyploid Gossypium species occur ontwo divergent clades leads to the conclusion that not only hasthere been interlocus concerted evolution among rDNA re-peats on four different chromosomes but also that this phe-nomenon has occurred bidirectionally-i.e., rDNA of G. mus-telinum has "concerted" to anA genome repeat type, whereasrDNA from the remaining polyploids has become homoge-nized to a D genome repeat type (the in situ hybridizationobservations exclude the formal possibility of differential lossof entire rDNA arrays in different polyploid lineages).We used Southern blot analysis to address the question of

whether interlocus concerted evolution has reciprocally ho-mogenized all rDNA repeats or simply most rDNA repeats. Asshown in Fig. 4, the synthetic allopolyploid exhibits both repeattypes in approximately equal amount, demonstrating that thephenomenon of interlocus concerted evolution requires alonger time to become evident than the few generations thathave elapsed since synthesis of the laboratory allopolyploid. Incontrast to the synthetic allopolyploid, only one repeat type isvisible in the naturally occurring allopolyploids (Fig. 4). Pro-longed exposure of the autoradiogram confirmed that homog-enization to anA genome repeat type has been complete in theG. mustelinum lineage but that in G. hirsutum, where con-certed evolution has operated in the direction of theD genomerepeat type, a small amount (two or three orders of magnitudeless) ofA genome rDNA remains.To quantify the extent of interlocus homogenization, we

estimated the number ofrDNA repeat units in the diploids andthe allopolyploids. These experiments (data not shown) dem-onstrated that the A genome and D genome diploids contain-3800 rDNA repeats per haploid genome and that this

Mbo I - digested

795_*~~~~~~k' knn ,....

437

....... ... .......................

FIG. 4. Southern hybridization demonstrating bidirectional inter-locus concerted evolution of ITS sequences in Gossypium. The firstand last lanes show the 795-bp PCR-amplified ITS region from G.hirsutum; other lanes display hybridization profiles of genomic DNAsdigested with the diagnostic restriction enzyme Mbo I or PCR productdigested with the same enzyme (lanes labeled G. hirsutum ITS and G.mustelinum ITS). A genome and D genome repeat types are repre-sented by G. arboreum and G. thurberi, respectively. The syntheticallopolyploid, 2(A2D1), derived from these two species, exhibits bothrepeat types, whereas the naturally occurring allopolyploids, G. mus-telinum and G. hirsutum, show only one of the two repeat types.

number is approximately additive in the allopolyploids. Wesuggest, accordingly, that since polyploid formation -3800 Dgenome repeats have become homogenized to an A genomerepeat type in the G. mustelinum lineage, while a nearlyequivalent number of repeats have been homogenized in theopposite direction in the G. hirsutum lineage.

DISCUSSIONThe sequence data and in situ hybridization results, combinedwith phylogenetic reconstruction and Southern blot analyses,lead to the conclusion that Gossypium rDNA has experiencedinterlocus as well as intralocus concerted evolution. Althoughseveral studies have documented the retention of more thanone rDNA repeat type in various plant groups followinghybridization (Fig. 4 and refs. 38-42) or polyploid formation(43-45), there appear to be few, if any, demonstrations ofinterlocus rDNA homogenization in plants (46, 47). This raisesthe question ofwhether our results will prove to be exceptionalas data accumulate for other plant groups. The paucity ofdocumented cases of rDNA interlocus concerted evolutionmay reflect, in part, the severity of the criteria required todemonstrate the process: a well-established phylogeneticframework; justified inferences of allopolyploidy (as opposedto autopolyploidy); knowledge of rDNA locus number indescendant polyploids, which is critical for ruling out rDNAlocus loss (48) as an alternative explanation for sequencehomogeneity. This combination of requirements has beenrarely met [e.g., Gossypium, Brassica (43), Triticum (46)].Concerted evolution among rDNA loci may be more commonthan previously recognized, however, given accumulating ev-idence of ITS sequence homogeneity in "diploid" plant groupssuspected of being paleopolyploid (refs. 17 and 49; this study).Although we have no direct evidence bearing on the mech-

anism of interlocus homogenization in Gossypium, perhaps themost plausible scenario is mitotic or meiotic unequal crossing-over between repeats located on different chromosomes (9). Inthis respect, it may be more than coincidental that GossypiumrDNA arrays occupy telomeric or subtelomeric chromosomalloci (16), because this would allow ephemeral, perhaps spo-radic, pairing and exchange to occur without deleteriouscytogenetic consequences. In this respect, the pattern ofallopolyploid Gossypium rDNA evolution is remarkably sim-ilar to that in humans, where repeats at five telomeric locationsappear to have experienced interlocus homogenization (9, 15).If interlocus unequal crossing-over is the operative mecha-nism, we would expect that following polyploidization, long-term maintenance of more than a single repeat type, as in theflowering plant family Winteraceae (21), is more likely inorganisms where rDNA loci are not telomeric in distribution.At present, we do not know whether interlocus concerted

evolution has homogenized other tandemly repeated se-quences or whether there is functional or selective significanceto the process. In this vacuum of empirical evidence, we canonly speculate about the possibility that functionally or selec-tively unequal repeated families can become united in acommon nucleus as a consequence of polyploidization (ordiploid hybridization) and thereby provide the opportunity fordifferential selection.Much clearer is the significance of our results for phylogeny

reconstruction based on ITS (and other rDNA) sequencing,which has become the tool of choice for inferring organismalrelationships based on nuclear sequences in plants (17-22, 49).As demonstrated in allopolyploid Gossypium, bidirectionalinterlocus concerted evolution is a possibility whenever two ormore divergent repeat types are maintained through at leastone cladogenetic (speciation) event. When followed by inter-locus homogenization to alternative repeat types, stronglysupported, but positively misleading, inferences of organismalrelationships may obtain. A necessary (but insufficient) con-

Evolution: Wendel et al.

Proc. Natl Acad ScL USA 92 (1995)

dition for this process to operate is that two (or more) rDNAarrays diverge in a single nucleus or that two divergent arraysbecome united in a common nucleus via polyploidization orhybridization at the diploid level. The known high frequencyof these latter two phenomena in plants (1, 50) underscores thepotential phylogenetic significance of bidirectional, interlocusconcerted evolution.

Finally, we note that even under conditions of complete andbidirectional homogenization to alternative repeat types,where misleading phylogenetic relationships are expected,rDNA data may still reinforce or extend our understanding ofevolutionary history. This is most likely to be true when otherindependent lines of phylogenetic evidence are available, as inGossypium. In such cases, ITS and other rDNA data providethe possibility of corroborating or providing insight into thegenomic composition of allopolyploids or diploids suspected ofhaving an introgressant history.

We thank B. Baldwin for initial advice regarding sequencing of theITS region and B. Baldwin, J. Doyle, and an anonymous reviewer forcomments on the manuscript. Financial support (to J.F.W.) wasprovided by the National Science Foundation.

1. Masterson, J. (1994) Science 264, 421-424.2. VanderWiel, P. S., Voytas, D. F. & Wendel, J. F. (1993) J. Mo.

Evol. 36, 429-447.3. Wendel, J. F. & Albert, V. A. (1992) Syst. Bot. 17, 115-143.4. Endrizzi, J. E., Turcotte, E. L. & Kohel, R. J. (1985)Adv. Genet.

23, 271-375.5. Wendel, J. F. (1989) Proc. Natl. Acad. Sci. USA 86, 4132-4136.6. Rogers, S. 0. & Bendich, A. J. (1987) Plant Moi. Biol. 9,509-520.7. Hillis, D. M. & Dixon, M. T. (1991) Q. Rev. Bio. 66, 411-453.8. Jorgensen, R. A. & Cluster, P. D. (1988) Ann. Mo. Bot. Gard. 75,

1238-1247.9. Arnheim, N. (1983) in Evolution of Genes and Proteins, eds. Nei,

M. & Koehn, R. K. (Sinauer, Sunderland, MA), pp. 38-61.10. Seperack, P., Slatkin, M. & Arnheim, N. (1988) Genetics 119,

943-949.11. Enea, V. & Corredor, V. (1991) J. Mol. Evol. 32, 183-186.12. Nagylaki, T. (1984) Proc. Natl. Acad. Sci. USA 81, 3796-3800.13. Hillis, D. M., Moritz, C., Porter, C. A. & Baker, R. J. (1991)

Science 251, 308-310.14. Zimmer, E. A., Martin, S. L., Beverley, S. M., Kan, Y. W. &

Wilson, A. C. (1980) Proc. Natl. Acad. Sci. USA 77, 2158-2162.15. Arnheim, N., Krystal, M., Schmickel, R., Wilson, G., Ryder, 0.

& Zimmer, E. (1980) Proc. Natl. Acad. Sci. USA 77, 7323-7327.16. Crane, C. F., Price, H. J., Stelly, D. M. & Czeshin, D. G. (1993)

Genome 36, 1015-1022.17. Baldwin, B. G. (1992) Mol. Phylogenet. Evol. 1, 3-16.18. Baldwin, B. G. (1993) Am. J. Bot. 80, 222-238.19. Hsiao, C., Chatterton, N. J., Asay, K. H. & Jensen, K. B. (1994)

Genome 37, 112-120.20. Savard, L., Michaud, M. & Bousquet, J. (1993) Mol. Phylogenet.

Evol. 2, 112-118.21. Suh, Y., Thien, L. B., Reeve, H. E. & Zimmer, E. A. (1993) Am.

J. Bot. 80, 1042-1055.

22. Wojciechowski, M. F., Sanderson, M. J., Baldwin, B. G. & Do-noghue, M. J. (1993) Am. J. Bot. 80, 711-722.

23. Paterson, A. H., Brubaker, C. L. & Wendel, J. F. (1993) PlantMol. Biol. Rep. 11, 122-127.

24. White, T. J., Bruns, T., Lee, S. & Taylor, J. (1990) in PCRProtocols: A Guide to Methods and Applications, eds. Innis, M.,Gelfin, D., Sninsky, J. & White, T. (Academic, San Diego), pp.315-322.

25. Wheeler, W. C. & Gladstein, D. S. (1994) J. Hered. 85, 417-418.26. Swofford, D. L. (1990) PhylogeneticAnalysis Using Parsimony (Ill.

Nat. Hist. Surv., Urbana).27. Kumar, S., Koichir, T. & Nei, M. (1993) MEGA, Molecular

Evolutionary Genetics Analysis (Penn. State Univ., UniversityPark), Version 1.0.

28. Kadir, Z. B. A. (1976) Chromosoma 56, 85-94.29. Gomez, M., Johnston, J. S., Ellison, J. R. & Price, H. J. (1993)

Biol. Zentralbl. 112, 351-357.30. Murray, M. G., Pitas, J. W., DeMars, S. J. & Cramer, J. H. (1992)

Plant Mol. Biol. Rep. 10, 173-177.31. Wendel, J. F., Stewart, J. M. & Rettig, J. H. (1991) Evolution 45,

694-711.32. Yokota, Y., Kawata, T., lida, Y., Kato, A. & Tanifuji, S. (1989)

J. Mol. Evol. 29, 294-301.33. Venkateswarlu, K. & Nazar, R. (1991) Plant Mol. Biol. 17,

189-194.34. Ritland, C. & Straus, N. A. (1993) Plant Mol. Biol. 22, 691-696.35. Brubaker, C. L. & Wendel, J. F. (1994)Am. J. Bot. 81, 1309-1326.36. Wendel, J. F., Brubaker, C. L. & Percival, A. E. (1992) Am. J.

Bot. 79, 1291-1310.37. Wendel, J. F., Rowley, R. & Stewart, J. M. (1994) Plant Syst. Evol.

192, 49-59.38. Doyle, J. J. & Doyle, J. L. (1988) Am. J. Bot. 75, 1238-1246.39. Doyle, J. J., Soltis, D. E. & Soltis, P. S. (1985) Am. J. Bot. 72,

1388-1391.40. Liston, A., Rieseberg, L. H. & Mistretta, 0. (1990) Biochem. Syst.

Ecol. 18, 239-244.41. Rieseberg, L. H., Carter, R. & Zona, S. (1990) Evolution 44,

1498-1511.42. Zimmer, E. A., Jupe, E. R. & Walbot, V. (1988) Genetics 120,

1125-1136.43. Delseny, M., McGrath, J. M., This, P., Chevre, A. M. & Quiros,

C. F. (1990) Genome 33, 733-744.44. Doyle, J. J., Doyle, J. L. & Brown, A. H. D. (1990)Aust. Syst. Bot.

3, 125-136.45. Soltis, P. S. & Soltis, D. E. (1991) Syst. Bot. 16, 407-413.46. Dvorak, J. (1990) in Plant Population Genetics, Breeding, and

Genetic Resources, eds. Brown, A. H. D., Clegg, M. T., Kahler,A. L. & Weir, B. S. (Sinauer, Sunderland, MA), pp. 83-97.

47. van Houten, W. H. J., Scarlett, N. & Bachmann, K. (1993) Theor.Appl. Genet. 87, 498-505.

48. Skorupska, H., Albertsen, M. C., Langholz, K. D. & Palmer,R. G. (1989) Genome 32, 1091-1095.

49. Baum, D. A., Sytsma, K. J. & Hoch, P. C. (1994) Syst. Bot. 19,363-388.

50. Rieseberg, L. H. & Wendel, J. F. (1993) in Hybrid Zones and theEvolutionary Process, ed. Harrison, R. (Oxford Univ. Press, NewYork), pp. 70-109.

284 Evolution: Wendel et al.