

Copyright 2004 by the Genetics Society of America

Population Genetics of the Wild Yeast Saccharomyces paradoxus

Louise J. Johnson,*,1 Vassiliki Koufopanou,* Matthew R. Goddard,† Richard Hetherington,* Stefanie M. Schäfer*,2 and Austin Burt*

*Department of Biological Sciences and †NERC Centre for Population Biology, Imperial College at Silwood Park, Ascot SL5 7PY, United Kingdom

Manuscript received November 4, 2002
Accepted for publication September 22, 2003

Genetics 166: 43–52 (January 2004)

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. AJ515177–AJ515216, AJ515322–AJ515352, and AJ515430–AJ515449.

1 Corresponding author: Institute of Genetics, University of Nottingham, Queens Medical Centre, Nottingham NG7 2UH, United Kingdom. E-mail: [email protected]

2 Present address: Department of Infectious Disease Epidemiology, Imperial College, London W2 1PG, United Kingdom.

ABSTRACT

Saccharomyces paradoxus is the closest known relative of the well-known S. cerevisiae and an attractive model organism for population genetic and genomic studies. Here we characterize a set of 28 wild isolates from a 10-km² sampling area in southern England. All 28 isolates are homothallic (capable of mating-type switching) and wild type with respect to nutrient requirements. Nine wild isolates and two lab strains of S. paradoxus were surveyed for sequence variation at six loci totaling 7 kb, and all 28 wild isolates were then genotyped at seven polymorphic loci. These data were used to calculate nucleotide diversity and number of segregating sites in S. paradoxus and to investigate geographic differentiation, population structure, and linkage disequilibrium. Synonymous site diversity is ~0.3%. Extensive incompatibilities between gene genealogies indicate frequent recombination between unlinked loci, but there is no evidence of recombination within genes. Some localized clonal growth is apparent. The frequency of outcrossing relative to inbreeding is estimated at 1.1% on the basis of heterozygosity. Thus, all three modes of reproduction known in the lab (clonal replication, inbreeding, and outcrossing) have been important in molding genetic variation in this species.

Many fields in biology have progressed by the concentrated study of a select group of model systems. In population and evolutionary genetics, only a few species such as Drosophila and humans have been widely adopted, and it might make sense to consider what other taxa might best complement these. The yeast Saccharomyces cerevisiae has a number of characteristics that would seem to make it ideal (Zeyl 2000): (i) It is already a well-studied model system in biochemistry, cell biology, classical genetics, and molecular biology; (ii) genomes can be precisely altered by homologous recombination; and (iii) long-term experiments with large population sizes and sensitive fitness assays are readily possible in the laboratory. These features suggest that one may be more likely to be able to investigate and interpret the functional significance of natural DNA sequence variation in this species than in any other eukaryote. Moreover, it has a relatively small and gene-rich genome, reducing the size of the problem to be solved. However, there is a problem: S. cerevisiae has long been associated with humans, and in collecting strains it is difficult to determine to what extent they are escaped domestics or otherwise greatly affected by human activity (Vaughan-Martini and Martini 1995; Naumov et al. 1992a). This could greatly affect their population genetics, severely complicating interpretations and reducing the extent to which lessons learned with this species are likely to be widely applicable. For example, one survey of S. cerevisiae in wineries revealed some surprising findings, including 31% of strains heterozygous for a lethal mutation and 23% heterozygous or homozygous for heterothallism, i.e., an inability to undergo mating-type switching (Mortimer 2000). The association between Drosophila and humans has posed similar problems (Andolfatto and Przeworski 2000; Wall et al. 2002).

One way to circumvent this problem would be to study a close relative that has the same advantages, but not the disadvantage. S. paradoxus is (along with S. cariocanus) the closest known relative of S. cerevisiae (Goddard and Burt 1999). The two species appear to be biochemically indistinguishable (Barnett et al. 1990), have the same chromosome number, and appear to be largely syntenic (Naumov et al. 1992b). Growth preferences in the lab are the same as for S. cerevisiae, and genetic engineering by the same homologous gene replacement methods used in S. cerevisiae is possible (E. Louis, personal communication). Thus, many of the advantages still apply. Moreover, it has been isolated from many natural locations worldwide (e.g., Sniegowski et al. 2002) and apparently has not been widely domesticated. Gene flow between S. cerevisiae and S. paradoxus is also unlikely; hybrids can be formed, but

are almost completely sterile (Naumov et al. 1997a). Overall DNA sequence divergence between the two species is thought to be ~20% (Herbert et al. 1988), and synonymous site divergence at the loci studied here is ~30%.

In the laboratory, the life cycle of S. paradoxus is the same as that of S. cerevisiae (Herskowitz 1988). It normally reproduces mitotically as a diploid, but when starved of nitrogen undergoes meiosis and produces four haploid spores encapsulated in an ascus. There are two mating types, and the spores usually mate within the ascus upon germination, but if this does not happen, they are able to reproduce mitotically as haploids. Haploid cells are constitutively ready to mate and can outcross. However, haploid mitoses are associated with a sophisticated mechanism of mating-type switching, with the result that cells can also mate with their clonemates, producing an entirely homozygous diploid (“autodiploidization”). Thus, S. paradoxus may undergo two types of self-fertilization: intra-ascus mating and autodiploidization. For a review of ascomycete mating systems, see Nelson (1996).

In this article we describe a preliminary investigation into the genetics of a single population of S. paradoxus, focusing on quantifying levels of nucleotide variation and analyzing the pattern of variation to infer mating system (and, to a lesser extent, dispersal).

MATERIALS AND METHODS

Collections: S. paradoxus was isolated from the bark of oak trees (Quercus, mainly Quercus robur; Naumov et al. 1998) in Silwood Park and Windsor Great Park. Bark scrapings (~1 g) were collected from 86 oak trees on each of two dates, with two scrapings on opposite sides of the tree on each date. Scrapings were aseptically transferred to acidified malt medium [5% malt extract (Sigma, Dorset, UK), 0.4% lactic acid (Sigma) w/v] in loosely capped vials and shaken for 2 days at 30°. Many types of microbe were present in the medium so a selection procedure was incorporated to isolate S. paradoxus. Dilutions of the 48-hr culture were plated on acidified malt and incubated for 24 hr at 30°. The resulting colony-forming units were visually inspected and colonies looking like S. paradoxus were picked, placed on YPD [1% yeast extract (Merck, Dorset, UK), 2% peptone (Merck), 2% glucose (BDH, Leicestershire, UK)], and then subsamples were tested for their ability to form tetrads when placed upon nitrogen-starving medium (2% potassium acetate; BDH). Heterozygosity was maintained in the original samples because they were not stimulated to sporulate. For those that formed tetrads, the internal transcribed spacer region (ITS1-5.8rRNA-ITS2) was amplified using primers ITS1 and ITS4 (White et al. 1990) and then visualized via electrophoresis through 1% agarose. ITS amplicons of roughly the correct size were sequenced (with an ABI 373) and compared to the ITS sequences from the S. paradoxus (CBS 432) and S. cerevisiae type strains. Three types of sequence were recovered. Two of these were largely unalignable to the Saccharomyces sequences and were identified as Hanseniaspora osmophila (CBS 313) and Torulaspora delbrueckii (CBS 404), using BLAST (Altschul et al. 1990). All sequences in the third category were very similar to the S. paradoxus sequence and were included in our sample. Our procedure therefore allowed the isolation of both S. paradoxus and S. cerevisiae strains with substantial variability within each species. The initial collection of 344 bark scrapings yielded 28 isolates.

Other strains: The Centraalbureau voor Schimmelcultures (CBS) supplied CBS 432, the type strain of S. paradoxus, and the Danish lab strain CBS 5829, here referred to as “Type” and “Danish,” respectively. Two S. paradoxus isolates from the Russian Far East (FE), CBS 8436 and CBS 8444, were included for comparison. These isolates differ from European S. paradoxus at allozyme loci (Naumov et al. 1997b) and show ~5% synonymous site divergence from the type strain of S. paradoxus at the six sequenced loci. These strains, referred to herein as FE1 and FE2, respectively, were kindly provided by Edward Louis. All S. cerevisiae sequence data were from the Yeast Genome Project (Goffeau et al. 1996).

Phenotypic assays: To isolate individual spores for phenotypic assays, all wild isolates were grown on sporulation medium for 4 days, and resultant asci were enzymatically digested (10 min in a 50-µl solution of 10 mg/ml sulfanotase, 10 mg/ml lyticase at 25°). Individual spores were removed with a Zeiss micromanipulator and incubated at 25° for 4 days on YPD agar to allow colony growth. Colonies were replica plated to minimal and sporulation media and after 3 days examined for growth or surveyed by microscopy for the presence of tetrads. The presence of tetrads was considered indicative of mating-type switching. All media were made according to Sherman (1991).

Molecular methods: Nine wild isolates were chosen randomly for an initial survey of sequence variation. Total DNA was extracted (Sherman 1991) and diluted 100-fold for use as a PCR template. Six genes involved in mate recognition were amplified from the nine wild isolates and from the Type strain, Danish, FE1, and FE2 isolates. Details of genes and primers are given in Table 1. All 28 wild isolates were then genotyped at polymorphic sites by restriction at the MFA1 and AGA2 loci, using enzymes Tsp45I and AseI, respectively, and by sequencing fragments of MFα1, SAG1, STE2, and STE3.

TABLE 1

Primer sequences

Gene: primers, 5′–3′, forward first

MFA1 (YDR461w): a-pheromone, chromosome 4
    MFL-5px: CTG TTG CTC GGA TAA AAT CAA G
    MFL-6px: GGA TAA CAG TAA CAG CGC TAA G

MFα1 (YPL187w): α-pheromone, chromosome 16
    sMFG1-U: AAA GCA ACA ACA GGT TTT GG
    sMFG1-L: CAA ATT GAA ATA TGG CAG GC
    MFAL-SEQF*: TTT TAA TAC ACA CAA ATA AAT TAT CC
    MFAL-SEQR*: TGA GAA AGT TGA TTT TGT TAC GC

STE2 (YFL026w): α-pheromone receptor, chromosome 6
    STE2-142F: ACT GTT ACT CAG GCT ATT ATG TTC G
    STE2-1539R: TAA TCC AAT GAA AAA AAA TCA CTG C
    STE2-497F*: TGA CAT CAA TAT CTT TCA CTT TCA CTT TAG G
    STE2-1078F*: TCA GAA AGA ACT TTT GTT GCT GAG G
    STE2-1148R*: CCT TGT ATT TTT TGA ACT CGT GG
    STE2-235R*: AAA CTT GGT TGA TAA TGA AAA TTG G

STE3 (YKL178c): a-pheromone receptor, chromosome 11
    STE3-F3: TGG ACA CAT TCA TTA CCT ACC ACG
    STE3-ENDR: TTT CTG AAC TAA GCT CAT TTG AAC
    STE3-530R*: GAA AAC GAA CAG CAC CAA GG
    STE3-989F*: AGG ATT TAC AGC AGG TGG ATG G
    STE3-997R*: TTT CAG AAT CGG TAG AGA ATG G

AGA2 (YGL032c): a-agglutinin subunit, chromosome 7
    AGA2-7PX: CTT TTG TTG TTC GGG CAT TTC C
    AGA2-8PX: GTT GGC TAT TAT GAT AGT CCA TCC

SAG1 (YJR004c): α-agglutinin, chromosome 10
    SAG1-78F: GCT ATG TGA ACC AAA AAA AGA TAC C
    SAG1-2005R: GCC TGA TGT TGA AGA ATA ATA TGC
    SAG1-411R*: GTT TTT TGC GAT GAA TCT GAC AGC
    SAG1-711F*: AAT GTC TGA TGT GGT GAA TTT CG
    SAG1-1317F*: GTC GGA AGT AAT CAG TCA TGT GG
    SAG1-1503R*: GAT GTT GAA GTC ACA ATA GGT ACG

* Internal sequencing primers.

Microsatellite locus: Twenty S. cerevisiae microsatellite primer pairs (Field and Wills 1998) were tested on S. paradoxus. Of these only 3 gave a PCR product with S. paradoxus, and 1 was found to be polymorphic, a variable-length repeat in the TFA1 gene (chromosome XI in S. cerevisiae). The wild isolates were genotyped at this locus by polyacrylamide gel electrophoresis of radioactively end-labeled PCR products (Sambrook et al. 1989). A representative of each mobility group was sequenced to determine the length of each allele.

Statistical analysis and software used: Nucleotide diversity π at synonymous and nonsynonymous sites, and synonymous site divergence, were calculated using DnaSP (Rozas and Rozas 1999; available at http://www.ub.es/dnasp/). Parsimony analysis of gene trees and comparisons among them by the partition homogeneity test (Farris et al. 1994) were performed using PAUP (Swofford 2002). To test for deviations from neutrality, we compared the variance of branch lengths on the genealogy to that from 1000 random genealogies with the same total branch length, constructed using N. Barton’s genealogies package (available at http://helios.bto.ed.ac.uk/evolgen/barton/index.html) for Mathematica (Wolfram Research 1999). Tests for overrepresentation of genotypes and linkage disequilibrium were performed using MultiLocus (Agapow and Burt 2001; available at http://www.bio.ic.ac.uk/evolve/software/multilocus/index.html). The correlation between genetic and geographical distance across all pairs of isolates was tested by randomization, in Mathematica.

RESULTS

Isolations: S. paradoxus was isolated from 28 of 344 bark scrapings, a success rate of 8%. There was no obvious difference in success rate between large and small trees or samples with different aspect. From 4 bark scrapings on each of two dates, 63 trees produced no isolates, 18 produced one isolate, and 5 produced two isolates. No S. cerevisiae strains were recovered although they were not excluded by our procedure.

Phenotypic variation: All 28 wild isolates were induced to undergo meiosis, and the four haploid spores were dissected from the asci. The resultant colonies were all capable of growth on minimal medium, demonstrating that none of the 28 strains carried an auxotrophic mutation. The frequency of auxotrophic mutants is thus 0, with a 2-unit upper support limit of 0.069. In S. cerevisiae, ~60 genes can mutate to auxotrophy, as estimated by counting gene names denoting amino acid auxotrophy in the yeast genome (Goffeau et al. 1996). The spontaneous mutation rate in the lab is ~10⁻⁸/locus/mitotic generation (Drake 1991; Zeyl and Devisser 2001). If the same values apply to S. paradoxus in nature, and the population is at mutation selection balance (i.e., the frequency of deleterious mutants is equal to q = u/s, where u is the mutation rate and s is the selection coefficient), the minimum harmonic mean selection coefficient against auxotrophic mutants necessary to keep them at the observed frequency is ~60 × 10⁻⁸/0.069 ≈ 10⁻⁵. Thus, even very small selection coefficients would be sufficient to keep the mutants at the observed low frequency.

All colonies grown from haploid spores were also capable of forming tetrads on sporulation medium, indicating that they had autodiploidized following mating-type switching (i.e., were homothallic). In S. cerevisiae it appears that there is only one locus that can mutate to give a heterothallic phenotype (HO); making the same calculations as above indicates that the minimum selection coefficient against such mutants in the wild is ~10⁻⁷.

Molecular data set 1: DNA sequences from nine isolates: The initial survey of molecular variation involved sequencing six loci from nine wild isolates plus the Type, Danish, FE1, and FE2 isolates. Sequence variation was discovered at each of the six loci, and there were a total of 24 polymorphic sites and one polymorphic repeat in ~7000 bp of sequence from nine isolates (see Table 2). None of the isolates was heterozygous at any of these polymorphic sites. Three isolates (T8.1, T21.4, and T32.1) had identical genotypes; subsequent analysis (described below for data set 2) suggests that they are part

TABLE 2

Polymorphisms

Gene    MFα1                MFA1      STE3                                              STE2                     SAG1           AGA2
Base    -130 -108 354  R    17   245  -10  792  805  1365 1544 1593 1679 1707 1718 1775 224  837  1106 1382 1387 -90  915  1578 346  347
Type    A    C    G    3    C    A    A    T    G    G    C    T    A    G    A    T    A    C    C    A    C    A    A    G    G    -
Danish  ?    .    .    4    T    -    G    .    .    A    T    .    .    .    .    .    G    .    T    T    T    .    .    .    T    T
T8.1    T    .    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    .    .    .    .    G    G    A    .    .
T21.4   T    .    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    .    .    .    .    ?    ?    A    .    .
T32.1   T    .    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    .    .    .    .    G    G    A    .    .
T62.1   T    .    .    4    T    -    .    C    .    A    T    A    C    C    T    A    G    .    .    .    .    .    .    .    .    .
T76.6   T    -    .    4    .    .    .    .    T    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    T    T
Q4.1    T    .    a    4    .    .    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    T    T
Q32.3   T    .    .    4    .    .    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    G    G    A    .    .
Q59.1   T    .    .    4    .    .    .    .    .    .    T    .    .    .    .    .    G    .    .    T    T    .    .    .    .    .
Q70.8   T    .    a    4    T    -    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    .    .
FE1     ?    ?    .    .    T    -    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    .    .
FE2     ?    ?    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    .    .
S.c     T    .    .    4    T    -    G    .    .    .    .    C    .    T    .    A    G    T    .    .    .    .    .    .    .    .

Nucleotide polymorphisms found in the initial survey of nine wild isolates and the Type and Danish strains are shown. Also shown, for comparison, are the nucleotides found in the Far Eastern strains and in S. cerevisiae (S.c). Dots indicate identity to Type. Bases are numbered from the start of the coding sequence; negative numbers indicate upstream positions. Column R shows number of repeats of the MFα1 pheromone unit. Noncoding regions are shown in italic type.
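The bounds quoted under Phenotypic variation can be checked with a few lines of arithmetic. The sketch below assumes that the 2-unit upper support limit is the mutant frequency q at which the log-likelihood of observing zero mutants among 28 isolates falls 2 units below its maximum (an interpretation; the text does not spell out the calculation), and then applies q = u/s with u summed over loci. Function and variable names are illustrative only.

```python
import math

def upper_support_limit(n_sampled, drop=2.0):
    """Upper support limit on mutant frequency when 0 of n_sampled carry a mutation.

    With zero mutants observed, the log-likelihood is n * ln(1 - q), which is
    maximal (0) at q = 0. Solving n * ln(1 - q) = -drop gives the limit.
    """
    return 1.0 - math.exp(-drop / n_sampled)

def min_selection_coefficient(n_loci, mu_per_locus, q_max):
    """Smallest s consistent with q <= q_max at mutation-selection balance (q = u/s).

    u is taken as the total mutation rate over all loci that can give the phenotype.
    """
    return n_loci * mu_per_locus / q_max

q_max = upper_support_limit(28)                       # about 0.069
s_aux = min_selection_coefficient(60, 1e-8, q_max)    # auxotrophy, ~60 loci
s_ho = min_selection_coefficient(1, 1e-8, q_max)      # heterothallism, one locus (HO)
print(f"q_max={q_max:.3f}  s_aux={s_aux:.1e}  s_ho={s_ho:.1e}")
```

Under these assumptions the values come out at roughly 0.069, 9 × 10⁻⁶, and 1.5 × 10⁻⁷, in line with the order-of-magnitude figures of ~10⁻⁵ and ~10⁻⁷ given in the text.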

TABLE 3

Estimates of nucleotide diversity in S. paradoxus wild isolates

                  Coding sequence                 Noncoding sequence
Gene    Strains   bp      πa × 10³   πs × 10³     bp      π × 10³
MFA1    9         111     0          23.31        463     1.2
MFα1    9         534     0          6.94         150     2.6
STE2    9         1296    0          2.16         214     1.8
STE3    9         1413    0.21       1.42         450     2.5
AGA2    9         264     0          0            221     1.8
SAG1    8         1956    0          3.96         158     3.4
Total             5534    0.07       3.53         1656    1.7

Average pairwise diversity per nucleotide site at synonymous (πs) and nonsynonymous (πa) sites of coding regions and of adjacent noncoding sequence is shown. Noncoding regions considered are upstream of MFα1 and SAG1; downstream of MFA1, STE2, and STE3; and 91 bp upstream and 130 bp downstream of AGA2. The STE2 sequence does not include the first 200 bp.
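As a reminder of what the π columns measure, the sketch below computes average pairwise diversity per site from a set of aligned sequences. It is a simplified illustration and not the DnaSP procedure: it treats all sites alike (no separation of synonymous and nonsynonymous sites), skips sites with missing data within each pair (an assumption), and uses toy sequences that are not from the study.

```python
from itertools import combinations
from typing import Sequence

def nucleotide_diversity(seqs: Sequence[str], missing: str = "?-") -> float:
    """Mean proportion of differing sites, averaged over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    per_pair = []
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a, b):
            if x in missing or y in missing:
                continue  # site not scored in one of the two sequences
            compared += 1
            diffs += x != y
        if compared:
            per_pair.append(diffs / compared)
    if not per_pair:
        raise ValueError("no comparable sites")
    return sum(per_pair) / len(per_pair)

# Toy alignment: three sequences of 10 sites (hypothetical data).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)  # pairwise values 0.1, 0.1, 0.2
print(round(pi * 1e3, 2), "x 10^-3 per site")
```

Multiplying by 10³, as in the table, simply rescales the per-site value for readability.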

of a single clone. No other pair of isolates had identical genotypes. Table 3 shows the average pairwise diversity per nucleotide site of these six genes in wild isolates. Only one amino acid polymorphism is seen among the nine wild isolates; the nonsynonymous nucleotide diversity at these loci is low (~0.01%), comparable to that found in humans (Li and Sadler 1991). By contrast, the synonymous and noncoding nucleotide diversity is relatively high (~0.3%), comparable to that found in Drosophila melanogaster (Begun and Aquadro 1992), although this is still far lower than the diversity of ~5% seen between sympatric isolates of Escherichia coli (Hall and Sharp 1992). These results indicate that the six genes are under purifying selection in S. paradoxus.

Gene trees for each locus, rooted using the Far Eastern isolates and S. cerevisiae, are shown in Figure 1. The data fit these trees perfectly, i.e., their consistency index is 1 (Farris 1989): There is no homoplasy within the European data. Far Eastern and European isolates, however, share a polymorphism in MFα1 pheromone repeat number. There are fixed differences between Far Eastern and European MFα1 sequences at other sites, so this homoplasy must have been created either by recombination between alleles from the Far East and

Figure 1. Gene trees for 11 European S. paradoxus isolates at six loci. Identical sequences at each locus are grouped together, and branch lengths are labeled with number of base changes. T21.4 and T32.1 are in all cases identical to T8.1 and have been omitted. Arrows indicate ancestral state as indicated by the Far Eastern isolates.
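For readers less familiar with the consistency index used here, it is the minimum number of character-state changes that any tree could require, divided by the number actually required on the tree in question; a value of 1 means no homoplasy. The sketch below is a generic illustration with hypothetical function names and a toy alignment. The last line uses the figures reported for the combined six-locus analysis in the Results (a 30-step tree, 7 steps longer than the minimum).

```python
def minimum_steps(characters):
    """Fewest changes any tree could need: (distinct states - 1) summed over characters."""
    return sum(len(set(states)) - 1 for states in characters)

def consistency_index(min_steps, tree_steps):
    """Consistency index: minimum possible steps divided by steps required on the tree."""
    if min_steps <= 0 or tree_steps < min_steps:
        raise ValueError("need 0 < min_steps <= tree_steps")
    return min_steps / tree_steps

# Toy characters: one string of states per alignment column (not data from the study).
toy_characters = ["AAAT", "CCGG", "TTTT"]
toy_min = minimum_steps(toy_characters)       # 1 + 1 + 0 changes
ci_perfect = consistency_index(toy_min, 2)    # tree needs no extra steps
ci_combined = consistency_index(30 - 7, 30)   # 30-step tree, 7 above the minimum
print(toy_min, ci_perfect, round(ci_combined, 2))
```

Any homoplasy, whether from recombination or parallel mutation, adds steps to the tree without changing the minimum, so the index falls below 1.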

Europe or by parallel mutations. Parallel mutation is a plausible cause, as repeat number is highly variable in Saccharomyces (Kitada and Hishinuma 1988) and varies from two to four repeats in our set of 28 wild isolates (see below). Overall, then, there is no compelling evidence of recombination within any of these genes.

To test for recombination between genes, the data from all six loci were combined for parsimony analysis. The European isolates give a shortest tree of 30 steps, 7 steps longer than the minimum possible (consistency index = 0.77), showing extensive homoplasy. Eight of the 15 possible pairs of gene trees conflict, and no branch is common to all 6 trees. Moreover, nucleotide sites in the same gene are significantly more likely to agree than sites in different genes (partition homogeneity test, P = 0.002). Recombination does therefore appear to have occurred between the six genes, each of which is on a different chromosome.

Interestingly, for none of the genes do our wild isolates form a monophyletic clade with respect to the Type and Danish strains (with the possible exception of SAG1). This indicates either gene flow on the scale of thousands of kilometers or large populations since divergence such that variation present at the time of divergence has not sorted out.

To compare the gene trees to the expectation under the null hypothesis of a neutral coalescent, we calculated the variance of branch lengths in the genealogies and compared them to those found on randomized genealogies with the same total number of mutations. For this analysis the sample size was taken as seven (i.e., clonemates were excluded). For STE3, seven of the eight differences segregating within our wild isolates are on the same branch and the variance of branch lengths is 4.1, significantly higher than that in random genealogies (P < 0.005). For SAG1, all three segregating differences are on the same branch, and the variance is 0.75, also significant (P < 0.05). This clumping of nucleotide changes on the genealogies could have resulted from nonindependent mutation (perhaps unlikely since the changes occurred 600 bp apart), introgression from other more divergent populations, or balancing selection at a linked locus.

Molecular data set 2: genotypes of 28 isolates at seven loci: The second data set consists of all 28 isolates genotyped for at least one polymorphism per locus sequenced, plus a microsatellite locus (Table 4). Six isolates, including the three found to be identical in data set 1, had identical genotypes. This is unlikely in a randomized data set (P < 0.001), and all 6 isolates were collected within 600 m of one another over a 3-month period (Figure 2). We interpret these 6 isolates as part of a clone. If five of these six clones are removed from the data set, there remain 5 pairs of identical isolates and only 18 different genotypes. This is fewer than would be expected in a randomized data set (P < 0.05), suggesting that one or more of these are also clonemates. One such pair (Q15.1 and Q16.1) was collected from the same tree at the same time and is the most likely candidate; each other pair is separated by 500 m and the data do not allow one to distinguish whether these are clonemates or are identical just by chance.

Apart from this localized clonal growth, there is no obvious correlation between genotype and geographic location. With all isolates included, there is a significant positive regression across all pairs of isolates of genotypic distance (proportion of loci at which the isolates differ) and geographical distance (slope = 0.01 km⁻¹, P = 0.02). However, if only a single (randomly chosen) isolate of each distinct genotype is included in the analysis, the regression is not significant (slope = 0.005 km⁻¹, P = 0.25). It appears that this population experiences frequent gene flow on a kilometer scale.

Homozygosity and inbreeding: In the entire data set, only a single isolate was heterozygous, at a single locus (Table 4). Wright’s inbreeding coefficient, F, estimated from the fixation index (Brown 1979) is 0.99. This suggests a high level of inbreeding. In the appendix we model a mixed-mating population in which diploid individuals are derived either from intra-ascus mating or from random outcrossing. Using this model, the maximum-likelihood estimate of the outcrossing rate is 1.1%, with 2-unit support limits of 0.06 and 5%. If autodiploidization occurs in the wild, this method will underestimate the true outcrossing rate, as autodiploidization removes heterozygosity far more quickly than intra-ascus mating does (appendix).

Recombination: In both data sets, there is abundant evidence of recombination between loci. Of the 21 possible pairs of loci, 18 of them are phylogenetically incompatible (i.e., show evidence of past recombination). Parsimony analysis of the entire data set gives a shortest tree of 22 steps, compared to a minimum possible of 12 (consistency index = 0.54). Taken as a whole there is significant multilocus linkage disequilibrium (IA = 0.21, rD = 0.035, P = 0.02), but not if each distinct genotype is reduced to a single observation (IA = −0.05, rD = −0.008, P = 0.6).

DISCUSSION

Like S. cerevisiae, S. paradoxus is capable of three types of reproduction in the laboratory: clonal replication, inbreeding, and outcrossing. All three appear to be important in molding the pattern of genetic variation in our natural population. Evidence for clonal replication comes from the repeated isolation of the same genotype, more than would be expected by chance: Among our 28 wild isolates, 6 appear to be members of a single clone, and at least one of the other five pairs of identical genotypes is also likely to be clonemates. There may have been inbreeding in the ancestry of these clonemates, or even mating between clonemates, but inbreeding alone without clonal replication would not lead to such

Page 7: Population Genetics of the Wild Yeast Saccharomyces paradoxus

49Population Genetics of S. paradoxus

TABLE 4

Genotypes of 28 wild S. paradoxus isolates

ID        Month       MFα1          MFA1   STE3       STE2         SAG1   AGA2       TFA1
          collected   333, 354, R   17     792, 805   1382, 1387   1578   346, 347

Silwood Park
W7        10/96       TA4           C      TG         ?            G      G-         2
S36.7     12/97       TG4           C      CG         TT           G      G-         3
T4bᵃ      5/98        TG4           T      TG         AC           A      G-         1
T8.1ᵃ     5/98        TG4           T      TG         AC           A      G-         1
T18.2     5/98        TG4           C      TG         ?            A/G    G-         3
T21.4ᵃ    5/98        TG4           T      TG         AC           A      G-         1
T22.1ᵃ    5/98        TG4           T      TG         AC           A      G-         1
T26.3     7/98        ?             C      TG         AC           A      TT         1
T27.3ᵃ    7/98        TG4           T      TG         AC           A      G-         1
T32.1ᵃ    7/98        TG4           T      TG         AC           A      G-         1
T62.1     7/98        TA4           T      CG         AC           G      G-         2
T68.2ᵃ    7/98        TA4           T      TG         AC           G      G-         2
T76.6ᵃ    7/98        TG4           C      TT         AC           G      TT         2

Windsor Great Park
Q4.1      9/98        TA4           C      TG         AC           G      TT         1
Q6.1      9/98        TA4           T      TG         AC           A      TT         1
Q14.4ᵃ    9/98        TA2           C      TG         TT           G      G-         2
Q15.1ᵃ    9/98        AA3           C      TG         TT           A      TT         2
Q16.1ᵃ    9/98        AA3           C      TG         TT           A      TT         2
Q31.4ᵃ    9/98        TG4           C      TT         AC           G      TT         2
Q32.3     9/98        TG4           C      TG         AC           A      G-         1
Q43.5ᵃ    9/98        TA4           C      TG         ?            G      G-         1
Q59.1     10/98       TG4           C      TG         TT           G      G-         1
Q62.5     10/98       TG4           C      TG         TT           G      TT         2
Q69.8ᵃ    10/98       TA2           C      TG         ?            G      G-         2
Q70.8ᵃ    10/98       TA4           T      TG         AC           G      G-         2
Q74.4ᵃ    10/98       TA4           C      TG         ?            G      G-         1
Q89.8     10/98       TG4           C      CG         TT           A      G-         1
Q95.3     10/98       TG4           C      TG         AC           G      G-         1

Bases or repeat numbers are shown for polymorphic sites at seven loci. Numbers under the gene names indicate polymorphic positions scored (see Table 2). Two MFα1 alleles were absent from the nine-isolate set: both differ from Type sequence by the G → A change at base 354; allele 3 has a further T → A change at base 333 and three pheromone repeats. Allele 4 has two pheromone repeats.

ᵃ Isolate IDs with indistinguishable genotypes.
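The genotype counts quoted for data set 2 can be checked directly against Table 4. The following is a minimal sketch, not the authors' procedure: it assumes that a missing entry ("?") is compatible with any allele, which is our reading of the footnote on indistinguishable genotypes, and the function names are illustrative. It does not attempt the randomization tests behind the reported P values.

```python
# Genotypes transcribed from Table 4, one field per locus in the order
# MFα1, MFA1, STE3, STE2, SAG1, AGA2, TFA1; "?" marks missing data.
GENOTYPES = {
    "W7": "TA4 C TG ? G G- 2",     "S36.7": "TG4 C CG TT G G- 3",
    "T4b": "TG4 T TG AC A G- 1",   "T8.1": "TG4 T TG AC A G- 1",
    "T18.2": "TG4 C TG ? A/G G- 3", "T21.4": "TG4 T TG AC A G- 1",
    "T22.1": "TG4 T TG AC A G- 1", "T26.3": "? C TG AC A TT 1",
    "T27.3": "TG4 T TG AC A G- 1", "T32.1": "TG4 T TG AC A G- 1",
    "T62.1": "TA4 T CG AC G G- 2", "T68.2": "TA4 T TG AC G G- 2",
    "T76.6": "TG4 C TT AC G TT 2", "Q4.1": "TA4 C TG AC G TT 1",
    "Q6.1": "TA4 T TG AC A TT 1",  "Q14.4": "TA2 C TG TT G G- 2",
    "Q15.1": "AA3 C TG TT A TT 2", "Q16.1": "AA3 C TG TT A TT 2",
    "Q31.4": "TG4 C TT AC G TT 2", "Q32.3": "TG4 C TG AC A G- 1",
    "Q43.5": "TA4 C TG ? G G- 1",  "Q59.1": "TG4 C TG TT G G- 1",
    "Q62.5": "TG4 C TG TT G TT 2", "Q69.8": "TA2 C TG ? G G- 2",
    "Q70.8": "TA4 T TG AC G G- 2", "Q74.4": "TA4 C TG ? G G- 1",
    "Q89.8": "TG4 C CG TT A G- 1", "Q95.3": "TG4 C TG AC G G- 1",
}


def indistinguishable(g1, g2):
    """True if two genotypes agree at every locus scored in both.

    Assumption: a "?" in either genotype matches anything at that locus.
    """
    return all(a == b or "?" in (a, b) for a, b in zip(g1.split(), g2.split()))


def group_genotypes(genotypes):
    """Greedily place each isolate in the first group all of whose members it matches."""
    groups = []
    for isolate, genotype in genotypes.items():
        for group in groups:
            if all(indistinguishable(genotype, genotypes[m]) for m in group):
                group.append(isolate)
                break
        else:
            groups.append([isolate])
    return groups


groups = group_genotypes(GENOTYPES)
sizes = sorted((len(g) for g in groups), reverse=True)
print(len(GENOTYPES), "isolates in", len(groups), "distinct genotypes")
print("group sizes:", sizes)
```

Under that assumption the grouping gives one group of six isolates, five pairs, and twelve singletons, which matches the 18 different genotypes reported in the text.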

an overrepresentation of genotypes. Evidence for inbreeding comes from the high homozygosity. An assumption in making this inference is that S. paradoxus in the field behaves as it does in the lab, and in particular that the diplophase predominates, and so the cells we isolated are diploid. In principle, an alternative explanation for the lack of heterozygosity is that cells are haploids in nature, but autodiploidize in the early stages of the isolation procedure. However, we do not consider it likely that S. paradoxus should change its life cycle so drastically in response to laboratory conditions. Finally, evidence for outcrossing comes from the single heterozygote we found plus the genealogical incompatibility between loci and absence of linkage disequilibrium.

This contrast between the great excess of homozygosity and the absence of linkage disequilibrium between genes reflects the fact that even small amounts of outcrossing and recombination will randomize alleles at different loci (Maynard Smith 1994). Nevertheless, inbreeding reduces the effective rate of recombination ($r_e$) in the population below the actual rate ($r_a$), according to the relation $r_e = (1 - F)\,r_a$ (Dye and Williams 1997; Nordborg 2000). This is because recombination is effective only in heterozygous individuals, and inbreeding reduces the frequency of heterozygotes. In our population, $F = 0.99$, and so the effective recombination rate is 1% of what it would be in a random-mating population. This means that linkage disequilibrium should extend for greater distances along the genome than would otherwise be the case and may have contributed to the absence of evidence for recombination within any of the genes studied. This extension of linkage disequilibrium along the genome means that DNA sequences will be more informative for at least some types of analyses than would otherwise be the case (Nordborg 2000), which makes S. paradoxus yet more


Figure 2. Locations of oak trees from which wild isolates were collected. Superimposed circles indicate isolates from the same tree. Suspected clones are shown as open circles.

attractive as a model system for population genetics and genomics. Also relevant, of course, is the actual rate of recombination, and it is interesting that S. cerevisiae has one of the highest known recombination rates per megabase of DNA. One explanation is that this has evolved to compensate for a low rate of outcrossing, as is suggested to explain the high chiasmata frequency seen in selfing plants (e.g., Zarchi et al. 1972). Alternatively, it is possible that the high rate of recombination has evolved as a consequence of intense selection pressures imposed by domestication (Burt and Bell 1987). It will be interesting to see whether S. paradoxus also has a high rate of recombination in lab crosses and to determine just how far linkage disequilibrium extends along the genome.

The low effective rate of recombination over distances of ~1 kb allowed us to reconstruct genealogies for each gene. We compared the variance of branch lengths to those found on random genealogies and detected significant deviations from neutrality in two genes, both in the direction of changes being clumped on the genealogy. Nonindependent mutation, introgression, or balancing selection could give rise to such a pattern, although formal theoretical work would be useful in clarifying this. If balancing selection operates, it is probably not heterozygote advantage (given the low levels of heterozygosity), but frequency-dependent selection.

Inbreeding in S. paradoxus can occur both by intra-ascus mating and by autodiploidization (as well as by mating between other types of relatives) and it is not possible with our data to determine the relative frequency of these alternatives. One possible approach would be to compare heterozygosity at loci tightly linked to the mating-type locus to that at unlinked loci; if there has not been switching, heterozygosity near the mating-type locus will be maintained, even with selfing. Presumably switching does occur at least occasionally, as otherwise selection would not maintain the underlying mechanism.

Inbreeding species present some difficulties for interpreting sequence variability, due to genotypes being nonindependent. Although inbreeding predominates over outcrossing in S. paradoxus, it is not as extreme in this regard as some other yeasts, at least in the laboratory; in many species, mating typically occurs between a haploid mother cell and a daughter bud (Johannsen


and van der Walt 1980; Kurtzman and Fell 1998). Other species probably outcross more than S. paradoxus, in particular species that are vegetatively haploid and heterothallic (Kurtzman and Fell 1998). It would be interesting to compare patterns of genetic variation for such species with those found here.

Finally, the results reported here differ markedly from those reported for S. cerevisiae from wineries, in which there was a high frequency of heterozygous strains, recessive lethals, and heterothallism (Mortimer 2000). These differences are presumably the effect of domestication, although the precise details remain obscure.

With the development of wild strain collections, such as are available for Drosophila, and the identification of more molecular markers in this species, S. paradoxus may prove to be a valuable addition to the current suite of model organisms available to the population geneticist.

Thanks go to Alexandra Eggington and Celine Vass for technical help. This work was funded by the Natural Environment Research Council in studentships to Louise Johnson, Matthew Goddard, and Richard Hetherington; and a grant to Austin Burt.

LITERATURE CITED

Agapow, P.-M., and A. Burt, 2001 Indices of multilocus linkage disequilibrium. Mol. Ecol. Notes 1: 101–102.

Altschul, S. F., W. Gish, W. Miller, E. W. Myers and D. J. Lipman, 1990 Basic local alignment search tool. J. Mol. Biol. 215: 403–410.

Andolfatto, P., and M. Przeworski, 2000 A genome-wide departure from the standard neutral model in natural populations of Drosophila. Genetics 156: 257–268.

Barnett, J. A., R. W. Payne and D. Yarrow, 1990 Yeasts: Characteristics and Identification. Cambridge University Press, Cambridge, UK/London/New York.

Begun, D. J., and C. F. Aquadro, 1992 Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356: 519–520.

Brown, A. H. D., 1979 Enzyme polymorphism in plant populations. Theor. Popul. Biol. 15: 1–42.

Burt, A., and G. Bell, 1987 Mammalian chiasma frequencies as a test of two theories of recombination. Nature 326: 803–805.

Drake, J. W., 1991 A constant rate of spontaneous mutation in DNA-based microbes. Proc. Natl. Acad. Sci. USA 88: 7160–7164.

Dye, C., and B. G. Williams, 1997 Multigenic drug resistance among inbred malaria parasites. Proc. R. Soc. Lond. Ser. B Biol. Sci. 264: 61–67.

Farris, J. S., 1989 The retention index and rescaled consistency index. Cladistics 5: 417–419.

Farris, J. S., M. Kallersjo, A. C. Kluge and C. Bult, 1994 Testing significance of incongruence. Cladistics 10: 315–319.

Field, D., and C. Wills, 1998 Abundant microsatellite polymorphism in S. cerevisiae, and the different distributions of microsatellites in eight prokaryotes and S. cerevisiae, result from strong mutation pressures and a variety of selective forces. Proc. Natl. Acad. Sci. USA 95: 1647–1652.

Goddard, M. R., and A. Burt, 1999 Recurrent invasion and extinction of a selfish gene. Proc. Natl. Acad. Sci. USA 96: 13880–13885.

Goffeau, A., B. G. Barrell, H. Bussey, R. W. Davis, B. Dujon et al., 1996 Life with 6000 genes. Science 274: 563–567.

Hall, B. G., and P. M. Sharp, 1992 Molecular population genetics of Escherichia coli: DNA sequence diversity at the celC, crr and gutB loci of natural isolates. Mol. Biol. Evol. 9: 654–665.

Herbert, C. J., G. Dujardin, M. Labouesse and P. P. Slonimski, 1988 Divergence of the mitochondrial leucyl transfer-RNA synthetase genes in 2 closely related yeasts, Saccharomyces cerevisiae and Saccharomyces douglasii: a paradigm of incipient evolution. Mol. Gen. Genet. 213: 297–309.

Herskowitz, I., 1988 Life cycle of the budding yeast Saccharomyces cerevisiae. Microbiol. Rev. 52: 536–553.

Johannsen, E., and J. P. van der Walt, 1980 Hybridization studies within the genus Schwanniomyces Klocker. Can. J. Microbiol. 26: 1199–1203.

Kitada, K., and F. Hishinuma, 1988 Evidence for preferential multiplication of the internal unit in tandem repeats of MFalpha genes in Saccharomyces yeasts. Curr. Genet. 13: 1–5.

Kurtzman, C. P., and J. W. Fell, 1998 The Yeasts: A Taxonomic Survey, Ed. 4. Elsevier, Amsterdam.

Li, W.-H., and L. A. Sadler, 1991 Low nucleotide diversity in man. Genetics 129: 513–523.

Maynard Smith, J., 1994 Estimating the minimum rate of genetic transformation in bacteria. J. Evol. Biol. 7: 525–534.

Mortimer, R. K., 2000 Evolution and variation of the yeast (Saccharomyces) genome. Genome Res. 10: 403–409.

Naumov, G. I., E. Naumova and M. Korhola, 1992a Genetic identification of natural Saccharomyces sensu stricto yeasts from Finland, Holland and Slovakia. Antonie van Leeuwenhoek 61: 237–243.

Naumov, G. I., E. S. Naumova, R. A. Lantto, E. J. Louis and M. Korhola, 1992b Genetic homology between Saccharomyces cerevisiae and its sibling species S. paradoxus and S. bayanus: electrophoretic karyotypes. Yeast 8: 599–612.

Naumov, G. I., E. S. Naumova and A. Querol, 1997a Genetic study of natural introgression supports delimitation of biological species in the Saccharomyces sensu stricto complex. Syst. Appl. Microbiol. 20: 595–601.

Naumov, G. I., E. S. Naumova and P. D. Sniegowski, 1997b Differentiation of European and Far East Asian populations of Saccharomyces paradoxus by allozyme analysis. Int. J. Syst. Bacteriol. 47: 341–344.

Naumov, G. I., E. S. Naumova and P. D. Sniegowski, 1998 Saccharomyces paradoxus and Saccharomyces cerevisiae are associated with exudates of North American oaks. Can. J. Microbiol. 44: 1045–1050.

Nelson, M. A., 1996 Mating systems in ascomycetes: a romp in the sac. Trends Genet. 12: 69.

Nordborg, M., 2000 Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization. Genetics 154: 923–929.

Rozas, J., and R. Rozas, 1999 DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15: 174–175.

Sambrook, J., E. F. Fritsch and T. Maniatis, 1989 Molecular Cloning: A Laboratory Manual, Ed. 2. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Sherman, F., 1991 Getting started with yeast, pp. 3–21 in Guide to Yeast Genetics and Molecular Biology, edited by C. Guthrie and G. R. Fink. Academic Press, San Diego.

Sniegowski, P. D., P. G. Dombrowski and E. Fingerman, 2002 Saccharomyces cerevisiae and Saccharomyces paradoxus coexist in a natural woodland site in North America and display different levels of reproductive isolation from European conspecifics. FEMS Yeast Res. 1: 299–306.

Swofford, D. L., 2002 PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods), Version 4. Sinauer Associates, Sunderland, MA.

Vaughan-Martini, A., and A. Martini, 1995 Facts, myths and legends on the prime industrial microorganism. J. Indust. Microbiol. 14: 514–522.

Wall, J. D., P. Andolfatto and M. Przeworski, 2002 Testing models of selection and demography in Drosophila simulans. Genetics 162: 203–216.

White, T. J., T. Bruns, S. Lee and J. W. Taylor, 1990 Amplification and direct sequencing of fungal rRNA genes for phylogenetics, pp. 315–322 in PCR Protocols: A Guide to Methods and Applications, edited by M. A. Innes, D. H. Gelfand, J. J. Sninsky and T. J. White. Academic Press, San Diego.

Wolfram Research, 1999 Mathematica, Version 4. Wolfram Research, Champaign, IL.

Zarchi, Y., G. Simchen, J. Hillel and T. Schaap, 1972 Chiasmata and the breeding system in wild populations of diploid wheats. Chromosoma 38: 77–94.


Zeyl, C., 2000 Budding yeast as a model organism for population genetics. Yeast 16: 773–784.

Zeyl, C., and J. A. G. M. DeVisser, 2001 Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157: 53–61.

Communicating editor: D. Charlesworth

APPENDIX

To estimate the frequency of outcrossing compatible with the observed level of heterozygosity, we first modeled a mixed-mating population in which haploid cells either mate within the ascus with probability $s$ or mate randomly in the population with probability $t$ ($= 1 - s$). Note first that in such a population, the probability that an individual chosen at random is derived from $x$ generations of selfing (i.e., there are exactly $x$ generations of selfing in its ancestry before one gets back to an outcrossing event) is $s^x t$. Second, the probability that an individual derived from $x$ generations of selfing is homozygous at locus $i$ is $1 - \mathrm{HW}_i (2/3)^x$, where $\mathrm{HW}_i$ is the Hardy-Weinberg proportion of heterozygotes in the population at that locus. Note that in this system selfing reduces heterozygosity by one-third every generation, not by one-half, as in more familiar systems where selfing gametes come from independent meioses (e.g., plants). This is because, with intra-ascus mating, each haploid spore produced from a heterozygous diploid shares an allele with only one of its three potential mating partners. Finally, the overall probability that a random individual is homozygous at the $i$th locus is the product of these two probabilities, summed over all possible numbers of generations of selfing in its ancestry:

$$p(\text{homozygous}) = \sum_{x=0}^{\infty} s^x t \left[1 - \mathrm{HW}_i \left(\tfrac{2}{3}\right)^x\right].$$

In our data set there are 7 loci, and the probability that an individual will be homozygous at all of them is then

$$p(\text{all homozygous}) = \sum_{x=0}^{\infty} s^x t \prod_{i=1}^{7} \left[1 - \mathrm{HW}_i \left(\tfrac{2}{3}\right)^x\right].$$

Note that this assumes the loci are independent. For the six isolates with missing data, the inside product is done over only the loci for which there are data. Finally, isolate T18.2 is homozygous at 5 loci and heterozygous at SAG1, and the probability of an individual being this is

$$p(\text{T18.2}) = \sum_{x=0}^{\infty} s^x t \prod_{i=1}^{5} \left[1 - \mathrm{HW}_i \left(\tfrac{2}{3}\right)^x\right] \mathrm{HW}_{SAG1} \left(\tfrac{2}{3}\right)^x.$$

When we count only one isolate of each distinct genotype, the data consist of 14 completely homozygous genotypes, two homozygous isolates with unknown STE2 genotype, one homozygous isolate with unknown MFα1 genotype, and the heterozygote T18.2. The probability of observing the entire data set is therefore

$$p(\text{data}) = p(\text{all homozygous})^{14} \times p(\text{missing STE2})^{2} \times p(\text{missing MF}\alpha 1) \times p(\text{T18.2}).$$

The maximum possible value of this occurs at an outcrossing rate of $t = 1.1\%$, with 2-unit support limits of 0.06 and 5%.

We also modeled a mixed-mating population in which individuals were derived either from mating between clonemates (autodiploidization) with probability $s$ or from random outcrossing with probability $t$. In this case individuals are either completely homozygous at all loci or heterozygous at Hardy-Weinberg proportions, and the probability an individual is homozygous at the $i$th locus is

$$p(\text{homozygous}) = s + t\,(1 - \mathrm{HW}_i).$$

With this model the maximum-likelihood outcrossing rate is 6%, with 2-unit support limits of 0.3 and 23%, higher than that in the previous model, as a greater frequency of outcrossing is needed to counterbalance the more intense inbreeding caused by autodiploidization.
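The homozygosity probabilities in the two mating models can be evaluated numerically. The sketch below is illustrative only and is not the authors' code (the reference list suggests Mathematica was used). The function names, the truncation tolerance, and the value HW = 0.4 are assumptions chosen for the example; the per-locus Hardy-Weinberg proportions used in the paper are not given in this section, so the sketch does not reproduce the reported maximum-likelihood estimates. For a single locus the infinite sum is a geometric series with closed form 1 − t·HW / (1 − 2s/3), which serves as a check on the truncated sum.

```python
from math import prod


def p_homozygous_intra_ascus(t, hw_loci, tol=1e-15):
    """Sum over x of s^x * t * prod_i [1 - HW_i * (2/3)^x], for t > 0.

    The series is truncated once the weight s^x * t falls below tol.
    """
    s = 1.0 - t
    total = 0.0
    weight = t      # s^x * t, probability of exactly x selfing generations
    decay = 1.0     # (2/3)^x, fraction of heterozygosity remaining
    while weight > tol:
        total += weight * prod(1.0 - hw * decay for hw in hw_loci)
        weight *= s
        decay *= 2.0 / 3.0
    return total


def p_homozygous_closed_form(t, hw):
    """Single-locus closed form of the series: 1 - t*HW / (1 - 2s/3)."""
    s = 1.0 - t
    return 1.0 - t * hw / (1.0 - 2.0 * s / 3.0)


def p_homozygous_autodiploid(t, hw):
    """Autodiploidization model: s + t * (1 - HW)."""
    return (1.0 - t) + t * (1.0 - hw)


HW_EXAMPLE = 0.4    # hypothetical Hardy-Weinberg heterozygosity, not from the paper
T_EXAMPLE = 0.011   # outcrossing rate near the estimate reported for the first model

series = p_homozygous_intra_ascus(T_EXAMPLE, [HW_EXAMPLE])
closed = p_homozygous_closed_form(T_EXAMPLE, HW_EXAMPLE)
auto = p_homozygous_autodiploid(T_EXAMPLE, HW_EXAMPLE)
print(f"intra-ascus, truncated series: {series:.6f}")
print(f"intra-ascus, closed form:      {closed:.6f}")
print(f"autodiploidization:            {auto:.6f}")
```

At the same outcrossing rate the autodiploidization model always gives the higher homozygosity, consistent with the statement that it needs a larger t to fit the same data.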