13
Testing the Neutral Fixation of Hetero-Oligomerism in the Archaeal Chaperonin CCT Valentin Ruano-Rubio and Mario A. Fares Evolutionary Genetics and Bioinformatics Laboratory, Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland The evolutionary transition from homo-oligomerism to hetero-oligomerism in multimeric proteins and its contribution to function innovation and organism complexity remain to be investigated. Here, we undertake the challenge of contributing to this theoretical ground by investigating the hetero-oligomerism in the molecular chaperonin cytosolic chaperonin containing tailless complex polypeptide 1 (CCT) from archaea. CCT is amenable to this study because, in contrast to eukaryotic CCTs where sub-functionalization after gene duplication has been taken to completion, archaeal CCTs present no evidence for subunit functional specialization. Our analyses yield additional information to previous reports on archaeal CCT paralogy by identifying new duplication events. Analyses of selective constraints show that amino acid sites from 1 subunit have fixed slightly deleterious mutations at inter-subunit interfaces after gene duplication. These mutations have been followed by compensatory mutations in nearby regions of the same subunit and in the interface contact regions of its paralogous subunit. The strong selective constraints in these regions after speciation support the evolutionary entrapment of CCTs as hetero-oligomers. In addition, our results unveil different evolutionary dynamics depending on the degree of CCT hetero-oligomerism. Archaeal CCT protein complexes comprising 3 distinct classes of subunits present 2 evolutionary processes. First, slightly deleterious and compensatory mutations were fixed neutrally at inter-subunit regions. Second, sub-functionalization may have occurred at substrate-binding and adenosine triphosphate–binding regions after the 2nd gene duplication event took place. CCTs with 2 distinct types of subunits did not present evidence of sub-functionalization. Our results provide the 1st in silico evidence for the neutral fixation of hetero-oligomerism in archaeal CCTs and provide information on the evolution of hetero-oligomerism toward sub- functionalization in archaeal CCTs. Introduction Protein dimerization and oligomerization is a universal phenomenon in organisms and proteins with different de- grees of oligomerization and is present in the most evolu- tionarily conserved protein complexes (Marianayagam et al. 2004). Dimerized and oligomerized proteins are pres- ent in many important pathways including functionally im- portant proteins as previously reported (Simon and Goodenough 1998; Amoutzias et al. 2004; Marianayagam et al. 2004; Ronnstrand 2004; Lockhart et al. 2006). Whether the transition from homo-oligomeric to hetero- oligomeric protein complexes is responsible for the emer- gence of new functions and metabolic pathways remains largely unanswered. Some authors have proposed a link be- tween structural stability and self-interaction of oligomers (Dunbar et al. 2004). Others have demonstrated an impor- tant effect of the abundance of oligomers on the protein– protein interaction networks geometry (Ispolatov et al. 2005). Examples for homo-oligomeric and hetero-oligomeric protein complexes coexist in organisms irrespective of their complexity. For example, heat shock proteins (Hsps), also called molecular chaperones and chaperonins such as the homo-oligomeric GroEL and the hetero-oligomeric cytosolic chaperonin containing tailless complex polypeptide 1 (CCT), perform equally complex functions and both interact with proteins showing different levels of structure complexity. The fact that Hsps interact with a significant number of proteins in the cell and that the numbers of known protein clients for the different Hsps are qualitatively different make them ideal candidates to test the relationship be- tween the degree of hetero-oligomerism and organisms complexity. Chaperonins are double back-to-back oriented, ringed proteins that assist de novo protein folding in most cellular compartments (Hartl 1996; Ellis 1997; Bukau and Horwich 1998; Frydman 2001; Hartl and Hayer-Hartl 2002). Two groups of chaperonins have been characterized, Group I with GroEL being the best-studied protein and Group II represented by CCTs. Both groups of chaperonins are known to share a common 60-kDa protein ancestor that branched at the base of the tree of life. Group I evolved along the bacterial lineage and eukaryotic organelles (Bukau and Horwich 1998; Ellis and Hartl 1999), whereas Group II evolved along the archaeal and eukaryal lineages (Gutsche et al. 1999; Leroux and Hartl 2000). Both types of multisubunit chaperonins are highly similar in their struc- tural as well as domain organization. Both proteins present complexes containing a central cavity that binds unfolded polypeptides, and each subunit comprises 3 structurally and functionally distinguishable domains, namely, the apical, equatorial, and intermediate domains (Ditzel et al. 1998; Llorca et al. 1998; Nitsch et al. 1998; Llorca, McCormack et al. 1999; Llorca, Smyth et al. 1999; Gutsche et al. 2000; Llorca et al. 2000; Schoehn, Hayes et al. 2000; Schoehn, Quaite-Randall et al. 2000). Despite their structural similarities, important func- tional differences exist between both types of proteins. Un- like GroEL that utilizes the ring-shaped co-chaperonin GroES to discharge bound proteins to the cavity of GroEL (Hartl 1996; Buckle et al. 1997; Kad et al. 1998; Grantcharova et al. 2001), CCTs use a helical protrusion at the apical do- main that emulates the function of the co-chaperone GroES (Klumpp et al. 1997; Ditzel et al. 1998; Nitsch et al. 1998; Llorca, McCormack et al. 1999). CCT also functions in con- junction with the hetero-hexameric chaperone prefoldin (Geissler et al. 1998; Vainberg et al. 1998). In contrast to GroEL, CCTs do not release proteins into the cavity of Key words: neutral evolution, positive selection, accelerated fixation rates, coevolution, CCT, archaea. E-mail: [email protected]. Mol. Biol. Evol. 24(6):1384–1396. 2007 doi:10.1093/molbev/msm065 Advance Access publication April 3, 2007 Ó The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] by guest on March 2, 2016 http://mbe.oxfordjournals.org/ Downloaded from

Testing the Neutral Fixation of Hetero-Oligomerism in the Archaeal Chaperonin CCT

  • Upload
    csic

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Testing the Neutral Fixation of Hetero-Oligomerism in the ArchaealChaperonin CCT

Valentin Ruano-Rubio and Mario A. FaresEvolutionary Genetics and Bioinformatics Laboratory, Department of Genetics, Smurfit Institute of Genetics,University of Dublin, Trinity College, Dublin, Ireland

The evolutionary transition from homo-oligomerism to hetero-oligomerism in multimeric proteins and its contribution tofunction innovation and organism complexity remain to be investigated. Here, we undertake the challenge ofcontributing to this theoretical ground by investigating the hetero-oligomerism in the molecular chaperonin cytosolicchaperonin containing tailless complex polypeptide 1 (CCT) from archaea. CCT is amenable to this study because, incontrast to eukaryotic CCTs where sub-functionalization after gene duplication has been taken to completion, archaealCCTs present no evidence for subunit functional specialization. Our analyses yield additional information to previousreports on archaeal CCT paralogy by identifying new duplication events. Analyses of selective constraints show thatamino acid sites from 1 subunit have fixed slightly deleterious mutations at inter-subunit interfaces after gene duplication.These mutations have been followed by compensatory mutations in nearby regions of the same subunit and in theinterface contact regions of its paralogous subunit. The strong selective constraints in these regions after speciationsupport the evolutionary entrapment of CCTs as hetero-oligomers. In addition, our results unveil different evolutionarydynamics depending on the degree of CCT hetero-oligomerism. Archaeal CCT protein complexes comprising 3 distinctclasses of subunits present 2 evolutionary processes. First, slightly deleterious and compensatory mutations were fixedneutrally at inter-subunit regions. Second, sub-functionalization may have occurred at substrate-binding and adenosinetriphosphate–binding regions after the 2nd gene duplication event took place. CCTs with 2 distinct types of subunits didnot present evidence of sub-functionalization. Our results provide the 1st in silico evidence for the neutral fixation ofhetero-oligomerism in archaeal CCTs and provide information on the evolution of hetero-oligomerism toward sub-functionalization in archaeal CCTs.

Introduction

Protein dimerization and oligomerization is a universalphenomenon in organisms and proteins with different de-grees of oligomerization and is present in the most evolu-tionarily conserved protein complexes (Marianayagamet al. 2004). Dimerized and oligomerized proteins are pres-ent in many important pathways including functionally im-portant proteins as previously reported (Simon andGoodenough 1998; Amoutzias et al. 2004; Marianayagamet al. 2004; Ronnstrand 2004; Lockhart et al. 2006).Whether the transition from homo-oligomeric to hetero-oligomeric protein complexes is responsible for the emer-gence of new functions and metabolic pathways remainslargely unanswered. Some authors have proposed a link be-tween structural stability and self-interaction of oligomers(Dunbar et al. 2004). Others have demonstrated an impor-tant effect of the abundance of oligomers on the protein–protein interaction networks geometry (Ispolatov et al.2005). Examples for homo-oligomeric and hetero-oligomericprotein complexes coexist in organisms irrespective of theircomplexity. For example, heat shock proteins (Hsps), alsocalled molecular chaperones and chaperonins such as thehomo-oligomeric GroEL and the hetero-oligomeric cytosolicchaperonin containing tailless complex polypeptide 1 (CCT),perform equally complex functions and both interact withproteins showing different levels of structure complexity.The fact that Hsps interact with a significant number ofproteins in the cell and that the numbers of known proteinclients for the different Hsps are qualitatively differentmake them ideal candidates to test the relationship be-

tween the degree of hetero-oligomerism and organismscomplexity.

Chaperonins are double back-to-back oriented, ringedproteins that assist de novo protein folding in most cellularcompartments (Hartl 1996; Ellis 1997; Bukau and Horwich1998; Frydman 2001; Hartl and Hayer-Hartl 2002). Twogroups of chaperonins have been characterized, Group Iwith GroEL being the best-studied protein and Group IIrepresented by CCTs. Both groups of chaperonins areknown to share a common 60-kDa protein ancestor thatbranched at the base of the tree of life. Group I evolvedalong the bacterial lineage and eukaryotic organelles(Bukau and Horwich 1998; Ellis and Hartl 1999), whereasGroup II evolved along the archaeal and eukaryal lineages(Gutsche et al. 1999; Leroux and Hartl 2000). Both types ofmultisubunit chaperonins are highly similar in their struc-tural as well as domain organization. Both proteins presentcomplexes containing a central cavity that binds unfoldedpolypeptides, and each subunit comprises 3 structurally andfunctionally distinguishable domains, namely, the apical,equatorial, and intermediate domains (Ditzel et al. 1998;Llorca et al. 1998; Nitsch et al. 1998; Llorca, McCormacket al. 1999; Llorca, Smyth et al. 1999; Gutsche et al. 2000;Llorca et al. 2000; Schoehn, Hayes et al. 2000; Schoehn,Quaite-Randall et al. 2000).

Despite their structural similarities, important func-tional differences exist between both types of proteins. Un-like GroEL that utilizes the ring-shaped co-chaperoninGroES to discharge bound proteins to the cavity of GroEL(Hartl1996;Buckleetal.1997;Kadetal.1998;Grantcharovaet al. 2001), CCTs use a helical protrusion at the apical do-main that emulates the function of the co-chaperone GroES(Klumpp et al. 1997; Ditzel et al. 1998; Nitsch et al. 1998;Llorca, McCormack et al. 1999). CCT also functions in con-junction with the hetero-hexameric chaperone prefoldin(Geissler et al. 1998; Vainberg et al. 1998). In contrast toGroEL, CCTs do not release proteins into the cavity of

Key words: neutral evolution, positive selection, accelerated fixationrates, coevolution, CCT, archaea.

E-mail: [email protected].

Mol. Biol. Evol. 24(6):1384–1396. 2007doi:10.1093/molbev/msm065Advance Access publication April 3, 2007

� The Author 2007. Published by Oxford University Press on behalf ofthe Society for Molecular Biology and Evolution. All rights reserved.For permissions, please e-mail: [email protected]

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

the complex, but proteins remain bound to the chaperonin(Llorca, Martin-Benito, Gomez-Puertas et al. 2001). Simi-larly to GroEL that binds a wide range of different proteinssharing a specific GroES-like motif (Stan et al. 2006), CCTalso binds a wide range of distinct proteins including cyto-skeletal proteins, such as actin and tubulin, and many non-cytoskeletal proteins, including luciferase (Frydman et al.1994), G-a transducin (Farr et al. 1997), hepatitis B viruscapsid protein (Lingappa et al. 1994), cyclin E (Yang et al.2005), the Ebsterin-Barr nuclear antigen 1 viral protein(Kashuba et al. 1999),myosin (SrikakulamandWinkelmann1999), and the Von Hippel-Lindau tumor suppressor VHL(Feldman et al. 1999). Studies based on proteomic ap-proaches (Gavin et al. 2002; Ho et al. 2002) have extendedthis list.

One of the most intriguing features of eukaryoticCCTs is their versatility in binding substrate proteins com-pared with GroEL (Leroux and Hartl 2000). For example,CCTs have the ability to establish hydrophobic interactionswith partially folded proteins (Klumpp et al. 1997; Ditzelet al. 1998) and yet establish specialized non-hydrophobic–specific interactions with the cytoskeleton protein actin(Llorca, McCormack et al. 1999; Hynes and Willison2000; Llorca et al. 2000;McCormack et al. 2001). This flex-ibility seems to be correlated with the hetero-oligomerismand sub-functionalization of eukaryotic CCTs subunits(Llorca, Martin-Benito, Grantham et al. 2001; Fares andWolfe 2003). For instance, unlike the homo-tetradecamerGroEL protein, CCTs may present up to 9 subunits per ringwith different degrees of sequence divergence (Liou andWillison 1997; Sigler et al. 1998; Gutsche et al. 1999;Llorca, McCormack et al. 1999; Grantham et al. 2000;Llorca et al. 2000; Valpuesta et al. 2002).

The repetitive lineage-specific gene duplication andconversion in archaeal CCTs (Archibald and Roger2002a, 2002b), with most CCTs presenting only 2–3 differ-ent subunits in each ring (Archibald et al. 1999) argueagainst duplication as a mean for sub-functionalization.In their insightful work, Archibald et al. (1999) proposedinstead the neutral fixation of evolutionarily trappedCCT hetero-oligomers as a more plausible explanation forsuch a scenario. To certify the neutral fixation of hetero-oligomerism, 4 conditions have to be met under our pointof view: 1) accelerated fixation rates of amino acid replace-ments in both copies should have followed gene duplicationas a result of relaxed selective constraints in at least one ofthe CCT subunits. Additionally, the average fixation rate of1 subunit should have been greater than that for its paral-ogous subunit. 2) These substitutions should have cumu-lated in within-ring inter-subunit interfaces at amino acidsites that became constrained afterward leading to an evo-lutionary entrapment of CCT as hetero-oligomers. We dis-tinguished 2 types of mutated sites depending on whetherthey are located in contact regions between subunits in thesame ring (within-ring inter-subunit regions) or betweensubunits in different rings at their equatorial domains(between-ring inter-subunit regions). 3) Fixation of com-pensatory mutations at accelerated rates or by adaptiveevolution in inter-subunit interfaces should have occurredto mitigate the effects of slightly deleterious mutations (e.g.,mutations that were fixed at high rate despite their slight

negative effect on the biological fitness of organisms) onthe molecule structure or/and function. 4) Compensatoryand slightly deleterious mutations should have coevolvedto maintain the structural and functional characteristicsof the ancestral homo-oligomer.

We tested all these evolutionary conditions in a dataset including sequences from the archaeal groups Crenarch-aeotes and Euryarchaeotes. We investigate this question byconducting a comprehensive in silico analysis of the selec-tive constraints governing the evolution of the archaealchaperonin CCT. The results obtained from this approachcontribute to building the theoretical ground for the originand evolution of hetero-oligomerism.

Materials and MethodsSequence Alignments and Phylogenetic Analyses

We searched for all protein sequences Hsp60 homo-logues within the Archaea domain using the NCBI Blastservice (Madden et al. 1996). We filtered out those sequen-ces that appear to be partial (less than 500 residues long) orredundant (very similar to other sequences in fully anno-tated genomes). We also re-predicted the gamma subunitsequence for Methanococcoides burtonii as the annotatedsequence seems to be missing its amino-terminal region.The resulting subset included 100 sequences. We also ex-cluded 4 eubacterial GroEL homologues probably acquiredby lateral gene transfer (LGT). Therefore, we constructedan alignment based on a full data set of 96 sequences. None-theless, phylogenetic analyses revealed a lack of consis-tency in lineage resolution due to possible phylogeneticartifacts caused by long branches. Accordingly, we ex-cluded those groups of sequences suspicious of presentinglong-branch attraction (LBA), non-functionalization, orother artifacts that we did not consider vital for the intendeddownstream analyses. Consequently, the final reduced dataset included just 72 sequences. For both data sets, wealigned protein sequences using the program Musclev3.6 (Edgar 2004) with the default settings. We thenaligned nucleotide sequences concatenating triplets of nu-cleotides according to the protein sequence alignment(alignments are available from the authors on request).

Regarding phylogenetic analyses, first we used Prot-Test 1.3 (Abascal et al. 2005; Keane et al. 2006) and Mod-elGenerator v0.82 (Keane et al. 2006) to determine the bestcandidate substitution rate matrix for maximum likelihood(ML) inference. Both programs pinpointed RtREVþGþFas 1st option by many criteria despite the fact that thismodel was specifically devised for retro-transcriptase phy-logenies (Dimmic et al. 2002). The second best modelbased on a different rate matrix was WAGþG (Whelanand Goldman 2001). Because both models produced similarML topologies considering just well supported lineages, wedecided to use the latter from that point on.

We ran the program Phyml v2.4.4 (Guindon andGascuel 2003) upon the full and the reduced data sets toobtain a single ML tree estimate and 1,000 nonparametricalbootstrapped topologies. Then, we used the program Con-sense from the PHYLIP v3.6 package (Felsenstein 1989) tosummarize those bootstrap replicates in a single fully re-solved consensus tree using the extended majority rule

Neutral Fixation of Hetero-Oligomerism in Archaeal CCT 1385

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

method. In either data set, the consensus tree seemedmore reliable than the single ML estimate considering pre-vious publications on the phylogeny of archaea (Brochier,Forterre, andGribaldo 2005; Brochier, Gribaldo et al. 2005).Accordingly, we used consensus topologies for all down-stream analyses and figures.

Additionally, we resolved an incompatibility in the or-der of speciation events between subunits in Halobacter-iales species by subunit alignment concatenation, MLreconstruction, and bootstrapped consensus tree.

Testing Non-Functionalization of Very Fast–EvolvingSequences

We used the program MEGA v3.1 (Kumar, Tamuraand Nei 2004) to calculate synonymous (dS) and non-synonymous (dN) change distances between very fast andmoderately evolving sequences in Methanomicrobia andHalobacteria clades. We calculated the transition/transver-sion ratio using the estimator (j 5 1.48) returned by theprogram Codeml in the PAML package v3.15 (Yang1997) and applied the Nei and Gojobori modified method(Nei and Kumar 2000). If a sequence or group of sequencesare pseudogenes, the ratio dN/dS observed between them oragainst any other sequence outside the groupmust approach1. In any case, it must be clearly greater than the ratio ob-tained between putative functional genes specially whenmeasured within orthologues where there is little roomfor functional divergence. We also used MEGA to calculatemean residue identity between accelerated, moderatelyevolving archaea sequences and eukaryotic CCT.

Testing the Constancy of Amino Acid Substitution Ratesafter Gene Duplications

To test whether changes in substitution rates occurredafter the different duplication events in archaeal CCTs, weused the 2-cluster test implemented in the program Lintree(Takezaki et al. 1995). The 2-cluster test examines theequality of substitution rates for 2 clusters linked by a nodeon the phylogenetic tree. In order to determine what clusteris accelerated with respect to the other, we needed to use anoutgroup cluster for the analyses. Because of the high di-vergence levels between CCTs from eukaryotes and thosefrom archaea, we performed this test dividing the archaeadata into 2 subsets, with the 1st set including sequencesfrom Crenarchaeotes and the 2nd set including sequencesfrom Euryarchaeotes (fig. 1 shows the phylogenetic tree withthe different groups of sequences). We used representativesequences from each CCT subunit of Crenarchaeotes asoutgroup sequences for the 2-cluster test analysis in the Eur-yarchaeotes data set and vice versa.

Identifying Lineages and Protein Regions under SelectiveConstraints

To detect accelerated rates of evolution simulta-neously in specific lineages of the tree and regions of thesequence alignment, we used the sliding-window–basedapproach (Fares et al. 2002) implemented in the program

SWAPSC v1.0 (Fares 2004). Briefly, the software slidesa statistically optimum window size along the sequencealignment to detect selective constraints and estimatesthe probability of replacements per non-synonymous sites(dN) and substitutions per synonymous sites (dS). The win-dow size is optimized by means of using a number of sim-ulated data sets. The standard way to measure the intensityof selection when analyzing DNA variability is by compar-ing dS with dN (Kimura 1977; Sharp 1997; Posada andCrandall 1998; Akashi 1999; Thulasiraman et al. 1999).The ratio between the 2 rates (x 5 dN/dS) helps to elucidateif the gene has been fixing amino acid replacements neu-trally (x 5 1), replacements have been removed by puri-fying selection (x , 1), or mutations have been fixed byadaptive evolution (x . 1). It has been shown, however,that x is a poor indicator of the action of adaptive evolution(Sharp 1997; Posada and Crandall 1998). The reason is thatdetection of episodic positive selection may be hamperedby the strong purifying selection against the majority ofamino acid mutations throughout most of the evolutionarytime resulting in x ,, 1. Thus, x is a conservative detec-tor of adaptive evolution.

We used 1,000 simulated data sets in our analysis ob-tained using the program Evolver from the PAML package(Yang 1997). To perform the simulations, we took as initialparameters the average x value, transition-to-transversionrates ratio, and codon-usage table generated under theGoldman and Yang (1994) (G&Y) model, using the realsequence alignment as input. The program then slides thewindow along the real sequence alignment and estimatesdN and dS by the method of Li (1993). The program deter-mines significance of these estimates under a Poisson dis-tribution of nucleotide substitutions along the alignment.Along with adaptive evolution, SWAPSC also tests for ac-celerated rates of amino acid substitutions without the re-striction of x . 1 (e.g., due to the fixation of slightlydeleterious mutations by genetic drift), saturations of syn-onymous sites, and hot spots (where both dS and dN are sig-nificantly high but where x , 1).

To figure out whether mutations fixed at inter-subunitinterfaces have selectively trapped CCTs into hetero-oligomerism, we investigated the mutational dynamicof these sites in the lineages after-duplication-before-speciation (ADBS) compared with that in the lineageafter-duplication-after-speciation (ADAS). If hetero-oligomers were selectively advantageous, we would expectrelaxed constraints ADBS permitting the fixation ofchanges that increase the affinity between distinct mono-mers. We should then detect strong selective constraintsADAS at these same sites due to their importance to main-tain hetero-oligomers.

Identifying Sites under Positive Darwinian Selection byML in CCTs

We examined whether sites surrounding within-ringinter-subunits or between-ring contacts were under adaptiveevolution as to compensate for slightly deleterious muta-tions fixed at these regions. The combination of bothsets of mutations, however, would have no effect on theprotein’s function or structure (neutral mutations).

1386 Ruano-Rubio and Fares

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

Compensatory mutations could be also under acceleratedrates of evolution, and hence, they did not need to be strictlyunder adaptive evolution. Selective constraints were hencetested using several codon substitution models imple-

mented in the program Codeml in the PAML packagev3.15 (Yang 1997).

As we wanted to find out whether post-duplicationlineages were under specific selective constraints, we

FIG. 1.—Consensus phylogeny of the archaeal CCT sequences data set composed of 72 sequences. Numbers at relevant internal nodes indicatebootstrap support out of 1,000 repetitions. Different colors indicate paralogy. Notice that these are reused in different taxonomic groups. Sequencesgrouped within the same color box or lineages are putative orthologues. Arrowheads indicate most probable gene duplication events consideringtopology and pattern of paralogy down toward leaves. The most remarkable difference with respect to the full data set (supplementary fig. 1,Supplementary Material online) is that Methanomicrobia becomes monophyletic. The distribution of sites under accelerated rates of evolution amongthe different functional categories is indicated using ‘‘cakes.’’ The amount of sites under accelerated evolution is proportional to the area, whereasdifferent colors indicate the percentage affecting within-ring inter-subunit interfaces (yellow), between-ring interfaces (brown), and substrate-bindingsites (blue). Classification of the sites in each one of the 3 functional groups is based on previous reports (Ditzel et al. 1998).

Neutral Fixation of Hetero-Oligomerism in Archaeal CCT 1387

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

compared the fitness of the data with 3 evolutionary modelsin 2 steps. First, we compared the fitness of the data with theG&Y model (Goldman and Yang 1994) that assumes a sin-gle x value for all lineages and sites against the occurrenceof different categories of x values per site shared acrossall lineages. Both models are implemented in the programCodeml from the PAML package v3.15 (Yang 1997) as M0and M3, respectively. Then, we compared the outcome ofM3 with models where each post-duplication branch wasallowed to have some sites under a particular x value po-tentially different to the x considered for the rest of the tree(the so-called branch-site model B [BSB]) (Yang andNielsen 2002; Zhang et al. 2005). M0 is a special caseof M3 where all site categories have the same x value.Similarly, BSB is an extension of M3 as long as just 2 sitecategories are considered (Yang et al. 2000). Therefore,both comparisons can be carried out using the likelihoodratio test.

Finally, to identify possible overestimated x valuesdue to saturation of synonymous sites we applied asliding-windows analysis (Fares et al. 2002) using theprogram SWAPSC (Fares 2004).

Analysis of Intramolecular Coevolution

To test the hypothesis of coevolution between proteinregions involved in CCT within-ring inter-subunits andbetween-ring contacts or between these regions and nearbyamino acid sites, we used the nonparametric method basedon the mutual information criterion (MIC) developedby Korber et al. (1993) (hereon called MICK) as well as aparametric method coevolution analysis using protein se-quences (CAPS) developed by Fares and Travers (2006).

The MI is represented by the entropies that involve thejoint probability distribution, Pðsi; s#j Þ, of occurrence ofsymbol i at position s and j at position s# of the multiple se-quence alignment. The MI-generated values range between0, indicating independent evolution, and a positive valuewhose magnitude depends on the amount of covariation.In contrast, CAPS compares the correlated variance of theevolutionary rates at 2 sites corrected by the time sincethe divergence of the 2 sequences they belong to. Thismethod compares the transition probability scores between2 sequences at 2 particular sites, using the blocks substitu-tion matrix (Henikoff S and Henikoff JG 1992). Variablepositions included in the alignment for both types of anal-yses were those that are parsimony informative (i.e., theycontain at least 2 types of amino acids and at least 2 of themoccur with a minimum frequency of 2). We assessed thesignificance of the MI values and the CAPS correlation val-ues by randomization of pairs of sites in the alignment,calculation of their MI or CAPS correlation values, andcomparison of the real values with the distribution of 1million randomly sampled values. To correct for multiplenonindependent tests, we implemented the step-downpermutation procedure in both methods and corrected theprobabilities accordingly (Westfall and Young 1993).MICK is implemented in the program PIMIC (availablefrom the corresponding author on request), and CAPS isimplemented in the program CAPS (Fares and McNally

2006). Coevolution analyses were performed in each dataset containing the groups of paralogues highlighted in thephylogenetic tree of figure 1. Each sub-alignment containedat least 9 sequences to ensure a minimum acceptable sen-sitivity for the methods used (Fares and Travers 2006).

ResultsReinvestigating Paralogy in the Phylogeny of ArchaealCCTs

Prior to the analysis of selective constraints in the ar-chaeal CCTs, we reinvestigated the phylogenetic distribu-tion of gene duplications and compared our study with thatby Archibald et al. (2001). We used all the sequences avail-able in the NCBI database until 1 October 2006. After ap-plying some quality filters, we reduced the number to 100full-length sequences. Four of them turned out to be bac-terial chaperonin (GroEL) homologues putatively gainedby LGT events inMethanomicrobia. The remaining 96 gen-uine archaeal CCTs distribute unevenly among species,ranging from a single copy to up to 5 genes in Methano-sarcina acetivorans that also count with a GroEL homo-logue. When all 96 sequences were considered, the treeresolution was poor at deep branches of the Euryarchaeotesgroup probably caused by artifacts due to fast-evolving andpoorly sampled clades indicating the possible presence ofLBA effects (supplementary fig. 1, Supplementary Materialonline). Gene gain and loss seem to be a recurrent phenom-enon in the phylogeny of archaea, as pointed out previously(Archibald et al. 2001). We also noticed the occurrence ofwithin-archaea LGT as the most plausible explanation ofthe deep paralogy of both gene copies in Methanosphaerastadtmanae species.

There are 4 sequences remarkably accelerated: 2 genecopies in M. acetivorans species and another pair of ortho-logues in close related species Natronomonas pharaonisand Haloarcula marismortui. There is no evidence of non-functionalization in both cases because synonymous versusnon-synonymous substitution ratio within accelerated pairsis much lower than 1 (dN/dS � 0.1 and 0.5 for Methanomi-crobia and Halobacteriales, respectively). These values arealso in the range of values obtained within related paraloguegroups that are putatively functional. Sequence identity in-dicates that these 4 sequences are more divergent from theremaining ‘‘moderately’’ evolving archaeal sequences thanare eukarotic homologues (26% identity within tree domaincompared with 35% across tree domains). Additionally,these sequences belong to species with maximum numberof CCT gene copies (4 or 5). These facts indicate that thesegenes may well constitute separate full-fledged chaperon-ing systems.

The final data set comprised 72 sequences that in-cluded those genes that belong to well-supported lineagesincluding more than 2 taxa and that allow for a tree reso-lution consistent with recently published phylogenies forarchaealdomainspeciation (Brochier,Forterre,andGribaldo2005; Brochier, Gribaldo et al. 2005). Nonetheless, thesupport for the deep branches of the tree remained weak(fig. 1). The final tree included a total of 22 sequences forCrenarchaeotes group and comprised species with 2 and 3

1388 Ruano-Rubio and Fares

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

subunits. Species with 3 subunits belonged to the generaSulfolobus, and previous reports pinpoint the evolution ofCCT protein complexes into 9-member rings in this group(Archibald et al. 1999). The paraphyletic nature of the a sub-unit of Crenarchaeotes indicates that a subset of species un-derwent a 2nd duplication event in the gene coding forsubunita,generatingasubunitwenamea#andathirdsubunitcalled c (fig. 1).

The Euryoarchaeotes group included 50 sequencesand comprised 2 distinct cases of single duplication events(2 subunits) in Thermoplasmata and Thermococci cladesand 2 groups of species with 3 subunits, Methanomicrobiaand Halobacteriales clades (fig. 1).

Gene Duplication Followed by Accelerated Rates ofEvolution in the Archaeal CCT

One of the first questions we aimed at answering waswhether gene duplication was followed by accelerated fix-ation rates of amino acid replacements indicating changeson selective constraints. Application of the 2-cluster test us-ing the gamma-corrected amino acid distances between se-quences supports that gene duplication was almost alwaysfollowed by accelerated rates of evolution (table 1). Theonly exception was the case of the Thermoplasmata lineagewhere, despite the greater fixation rate of mutations in the bsubunit compared with the a subunit, the difference was notsignificant. Our results, however, indicate that one of thesubunit clusters has undergone greater acceleration ofamino acid replacement rates than the other and thereforeduplication altered selective constraints.

Accelerated Evolution of CCT Inter-Subunit Interfacesafter Gene Duplication

To test the hypothesis of the origin of hetero-oligomer-ism as a result of accelerated rates of evolution in inter-subunit contacts after gene duplication, we applied theprogram SWAPSC v1.0 to detect selective constraints.The advantage of using this program over others is that,together with the analysis of dN and dS, it allows testingsaturation of synonymous sites and thus removing themfrom subsequent analyses. Our first approach was to ana-lyze species presenting CCTs formed by 2 divergent sub-

units (namely, a and b; note that the nomenclature of thesubunit is totally arbitrary). In order to distinguish species-specific selective constraints from those constraints im-posed after gene duplication, we focused on duplicatespresent in more than 1 species (ancestral paralogy). Selec-tive constraints analyses in CCTs present 2 different evo-lutionary scenarios in species comprising 2 distinct types ofsubunits compared with those with 3 different types of sub-units (fig. 1).

Analysis of the duplication events that took place onthe ancestor of the Thermoplasmata clade after gene dupli-cation uncovers events of accelerated rates of evolution inthe branch leading to subunits a and b at sites directlyinvolved in inter-subunits contacts (table 2; taking as ref-erence the sequence of Thermoplasma acidophilum, acces-sion number NP_394733, for which the 3-dimensionalstructure is available). Sites having undergone acceleratedrates of evolution not directly involved in inter-subunitinterfaces were located significantly close (less than 8 Adistance) to within-ring inter-subunits contact regions thatpresent evidence of accelerated rates of evolution (table 2and green space–fill structured sites in ). In figure 2a, werotated the 3-dimensional structure to make possible thevisualization of sites under selective constraints. We alsoverified whether CCTs became evolutionarily trapped inhetero-oligmerism by testing whether these regionswere ac-celerated in branches ADAS as well. All the within-ringinter-subunits contact regions presented significantlygreater x values in the branches ADBS compared withthose values in the branches ADAS. In fact, SWAPSCdetected these regions as significantly accelerated ADBSand under strong purifying selection ADAS (results notshown). Differences thus suggest that sites that fixed aminoacid replacements immediately after duplication becamehighly constrained after speciation. Some of the acceleratedregions were detected in substrate-binding sites, althoughthey were always surrounded by compensatory mutations(fig. 2a).

In the Thermococci clade, subunit a was much moreconstrained than subunit b. The latter presented several re-gions affected by accelerated rates of fixation of amino acidreplacements in the lineage ADBS (table 2). These siteshad significantly lower x values in the branch ADAS(data not shown). Evolutionarily accelerated sites not di-rectly involved in within-ring inter-subunits contacts were

Table 1Detection of Accelerated Rates of Amino Acid Substitution Using the 2-Cluster Test Method

Archaea group Species groupa Subunitsb d ± SEc Zd

Euryarchaeotes Methanomicrobia Sub1 � Sub2 , Sub3 0.174 ± 0.030 5.788**Methanomicrobia Sub1 , Sub2 0.041 ± 0.017 2.343*Halobacteriales Suba , Subb 0.108 ± 0.028 6.976**Halobacteriales Suba-b , Subc 0.157 ± 0.019 8.223**Thermococci Suba . Subb 0.032 ± 0.010 3.039**

Crenarchaeotes All species included Subb-b# , Suba-a#-c 0.105 ± 0.027 3.834**All species included Suba-a#, Subc 0.217 ± 0.026 8.422**

NOTE.—Only clades with significant results are shown.a Species included in the comparison between paralogues.b Subunits compared. Names of grouped subunits are separated by a hyphen. Subunits names are arbitrary as used in literature.c d is the absolute value of the difference between the mean branch lengths of the subunit groups compared.d The normal distribution value for d. Significant values at a 5 0.01 are indicated by **.

Neutral Fixation of Hetero-Oligomerism in Archaeal CCT 1389

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

neighboring regions that are involved in inter-subunit con-tacts (fig. 2b).

Unlike clades of species having 2 different CCT sub-units, groups of species presenting 3 different CCT subunitsin Euryarchaeotes and Crenarchaeotes showed a more com-plex evolutionary dynamic. In these groups, regions af-fected by accelerated rates of fixation in branches ADBSincluded sites involved in within-ring inter-subunit andbetween-ring interfaces, sites responsible for substrate bind-ing, and sites involved in ATP binding (fig. 1 and table 2).We also detected accelerated regions close to sites involvedin substrate binding, within-ring and between-ring contacts(fig. 2c and e). Other accelerated regions always surroundedmost of the accelerated sites in substrate-binding regions.However, some other regions involved in substrate bindingalso showed adaptive Darwinian selection and were notcompensated by neighboring accelerated regions (see nextsection). The results on CCT presenting 3 classes of sub-units suggest that sub-functionalization may have occurredin these cases in sites involved in ATP binding and substratebinding.

Episodic Darwinian Selection of CompensatoryMutations in CCTs

The next step to detect compensatory mutations was toidentify mutations fixed by adaptive evolution at regionsclose to accelerated amino acid sites. We calculated thelikelihood of the data under G&Y, M3, and the BSB codon

models (supplementary table 1, Supplementary Materialonline). In all the comparisons, the M3 improved signifi-cantly the log-likelihood value with respect to G&Y. Thisindicates that CCT subunits have fixed changes under het-erogeneous evolutionary constraints across codons in gen-eral.

To test for the presence of heterogeneous constraintsand adaptive evolution in ADBS lineages, we compared theBSB model for each of the lineages with the M3 model. Inall cases, the BSB significantly improved the log-likelihoodvalue of the M3 model even after correcting by Bonferroni(supplementary table 1, Supplementary Material online).Positive selection was detected in almost all lineages exam-ined especially in sites neighboring within-ring inter-subunits contacts (between 4 and 8 A distance) thatunderwent accelerated fixation rates of amino acid sub-stitutions (fig. 2a–e and table 3). Subunit a from the Ther-moplasmata clade showed evidence for adaptive evolutionin 1 single amino acid site located in the substrate-bindingregion, although this single site was close to a site underadaptive evolution that is located outside the substrate-binding region, indicating compensatory effects. Apartfrom this example, adaptive evolution in CCTs comprising2 distinct classes of subunits was always associated to sitesneighboring inter-subunit contact regions (fig. 2a and b)with a single exception. In contrast, adaptive evolutionwas massively detected in substrate-binding sites in CCTsshowing 3 different types of subunits in addition to inter-subunit regions (fig. 2c–e and table 3).

Amino Acids Coevolution Supports Compensation ofNeutrally Fixed Slightly Deleterious Mutations

The last condition a mutation should meet as to be con-sidered a compensatory mutation is to present evidence ofcoevolution with slightly deleterious mutations. This anal-ysis is particularly complex because of the many differentintermingled coevolutionary effects, including functional,structural, and interaction coevolution. Furthermore, inmany cases sites do not meet the statistical criteria as tobe considered in the analysis of coevolution. For example,many of the sites accelerated or under adaptive evolutionwere discarded because they were not parsimony informa-tive. In spite of these complications, our analysis showsclear intra-subunit coevolution in all CCT clusters exam-ined in Euryarchaeotes and Crenarchaeotes. We also foundthat most of the coevolution has happened among aminoacid sites 3-dimensionally proximal. Figure 3 presents ex-amples of these compensatory mutations. In this figure, werotated 3-dimensional structures conveniently as to showclearly sites under coevolution. Coevolution occurredamong sites belonging to substrate-binding domains, inter-subunit contacts, and neighbor sites in physical contact(less than 8 A distant). The degree of compensation (num-ber of sites under acceleration surrounding sites fixingslightly deleterious mutations), however, differs betweenthe different categories of sites, with the substrate-bindingsites presenting no evidence of coevolution with nearby re-gions in CCT protein complexes sharing 3 distinct classesof subunits, whereas inter-subunit regions presented high

Table 2Amino Acid Sites Detected As Having UndergoneAccelerated Fixation Rates of Evolution in the ArchaealChaperonin CCT

Clade Subunits Amino acid sites

Thermoplasmata a I208, E270, D297, M298,S505, T507, I511, M512

b S89, F195, A399, G403, A404Thermococci a —

b T142, K163, S164, D170, G403Methanomicrobia 1 —

2 S141, K143, K203, T3333 E124, M128, A133, K134, V201,

I217, M298, Q300, I334, S339, F360,V361, A450

1-2 V278, D279, S283Halobacteriales a —

b-c K24, S123, A133, V167, K202, T333b I503, M512c P43, R44, S130, V136, S141, I208, E336,

I337, K24, E25, R275–M277Crenarchaeotes a-a# I511, M512

a# L54, G55, E270, A299, Q300, P446, R447b E132, T142, A146, N272, M277,

E377, T378

NOTE.—Amino acids’ 1-letter code is followed by a number indicating the

position of sites taking the amino acid sequence of Thermoplasma acidophilum as

reference (accession number NP_394733). This table includes only subunits with

significant results within each group of the tree of figure 1. Here, we show only sites

showing accelerated rates of evolution detected by SWAPSC in regions neighboring

or located in within-ring inter-subunit interfaces (black letters), between-ring inter-

subunit interfaces (underlined letters), and substrate-binding sites (white letters in

a black background).

1390 Ruano-Rubio and Fares

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

FIG. 2.—Three-dimensional localization of amino acid regions having undergone accelerated fixation rates of amino acid substitutions and positiveDarwinian selection. We use the structure of Thermoplasma acidophilum (PDB file: 1A6D) that illustrates 2 adjacent subunits. Because of thehomogeneous distribution of selective constraints in CCTs with 3 different types of subunits, only examples for sites under constraints in 2 (a and b) orcombination of 2 subunits (e.g., the cluster a-b) were plotted in the structure. For example, where 3 subunits were present in a given organism, we onlyplotted the results of coevolution corresponding to 2 clusters. In the structure we emphasize important regions only. Regions under constraints areshown as solid sphere structures for Thermoplasmata (a), Thermococci (b), Methanomicrobia (c), Halobacteriales (d), and Crenarchaeotes (e). Regionsunder accelerated rates of evolution involved in within-ring inter-subunit interfaces, between-ring interfaces, substrate binding, and proximal aminoacids to any of these regions are labeled in yellow, pink, blue, and green, respectively. Regions under positive Darwinian selection for the samefunctional categories are labeled in brown, red, black, and green.

Neutral Fixation of Hetero-Oligomerism in Archaeal CCT 1391

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

percentage of compensatory mutations (fig. 3c–e). In addi-tion to the compensatory relationship between mutations,we also detected coevolution among sites belonging to in-ter-subunit contacts in all the groups examined (fig. 3).

Analysis of compensatory mutations in Crenarch-aeotes presented very similar results to those of the CCTwith 3 different types of subunits of Halobacteriales speciesgroup. Crenarchaeotes CCT subunits have undergone fixa-tion of coevolving amino acid replacements at regionsneighboring within-ring inter-subunits contacts (fig. 3e).It is worth noticing that accelerated sites were always de-tected to coevolve with positively selected sites (fig. 3).When the restriction of parsimony was removed, most ofthe sites detected in previous analyses as accelerated and un-der adaptive evolution at inter-subunit contacts were re-ported as coevolving. In summary, coevolution has beenoccurring mainly between accelerated sites and sites underadaptive evolution supporting the compensatory role hy-pothesis for positively selected amino acid substitutionsin CCTs presenting 2 distinct classes of subunits. In contrast,CCTs with 3 distinct classes of subunits presented clear ev-idence of such coevolution only in inter-subunit interfaces.

Discussion

In this study, we propose a role for gene duplicationand selection in protein functional innovation. Our resultsstudying the hetero-oligomeric molecular chaperonin CCTin archaea yield interesting information about the evolutionof protein complexity in terms of hetero-oligomerism ver-sus homo-oligomerism. Initially, there is no structuralreason as to why hetero-oligomerism should be more ad-vantageous than homo-oligomerism unless this hetero-oligomerism is linked to some functional role. Exampleof such role is the previously shown correspondence be-

tween hetero-oligomerism and sub-functionalisation inthe eukaryotic CCT (Fares and Wolfe 2003). This isstrongly supported by the fact that all 8-gene copies havebeen kept because the rapid successive gene duplications ofCCTs took place in the most recent ancestor of all eukar-yotes. As ours and others’ results indicate, archaeal CCTshow a strikingly contrasting evolutionary scenario com-pared with eukaryotic CCTs: recurrent paralogy followedby gene non-functionalization and disintegration has beena recurrent phenomenon (Archibald and Roger 2002a,2002b). These observations combined with evidence forgene conversion in archaeal CCTs yield information sup-porting the role of neutral evolution in the emergence ofprotein complexity.

Previous work presented the coevolved interdepen-dence between subunits as the most appealing possibilityto explain the patterns of recurrent paralogy in archaealCCTs (Archibald et al. 1999). These authors proposed thatancestral archaeal CCTs were homo-oligomers but thatgene duplication was followed at some evolutionary stageby the neutral fixation of slightly deleterious mutations inone of the CCT subunits. These mutations may have beenresponsible for a decrease in the fitness (stability) of homo-oligomers therefore favoring fixation of hetero-oligomers.Fixation of mutations in the other paralogue may have in-creased the stability of the hetero-oligomer outcompetinghomo-oligomeric CCTs. As a result, archaeal CCTs becameevolutionarily trapped as hetero-oligomers. A likely evolu-tionary explanation to such an entrapment is that the 1st setof mutations fixed in the 1st paralogue was slightly delete-rious or nearly neutral, whereas the 2nd set of mutations inthe 2nd paralogue was fixed by adaptive evolution condi-tional to the fixation of slightly deleterious mutations (ap-parently non-neutral mutations). Both sets of mutations hadno selective value on the function of the protein when com-bined and were hence neutrally fixed.

Table 3Amino Acid Sites under Adaptive Evolution Detected Using SWAPSC and/or the Branch-Site Model Implemented in PAML

Model ‘ Amino acid sites under adaptive evolution

G&Y �82846.864 —M3 �80906.659 —BSB (foreground branches)

Thermoplasmata a �80895.128 A207, E271Thermoplasmata b �80884.337 F90, K134, Y177, L400, G499Thermococci a �80882.896 V167Thermococci b �80879.584 V167, A168, K171, A404Methanomicrobia 1-2 �80899.282 S283Methanomicrobia 1 �80889.646 S18, E384Methanomicrobia 2 �80894.809 I137, D138, I144, G145, K202, L240–A242, H301, R275, E276Methanomicrobia 3 �80888.479 R127, D241, N272�R275, Q292, S304, A306, D335, E444, I445Halobacteriales a �80859.259 S18, K78, K134, S159, V167, K203

Halobacteriales b-c �80852.325E25, K78, V121, R127, K134, A168, K203, Q204, A207, I208, K223,M277, M298, A299, S332, L400, A404, E444, I496, G500

Halobacteriales b �80749.054E17–G19, K24, E25, A30, I31, D52, L54, V58, A76, K78, V121, S123, Y126, R127, K134,V136, A168, G206, I208, E336, S388, E401, P457, I496–V498, K500, E504, S505, I511–I513

Halobacteriales c �80780.069 K24, A168, E169, D209, K247, N272, L274, H301, T333, I334, R421, I477, F480, G499, K500Crenarchaeotes a-a#-c �80892.373 E271, A397, R497, A502Crenarchaeotes c �80888.914 P226, Q300, R312, R313, I445, A448Crenarchaeotes b-b# �80893.860 G145, A146, Y191, V201, D209, E271, N272, E379, H380, T390, Q425, V488, Q501

NOTE.—Only subunits presenting evidence of adaptive evolution at particular amino acid sites are shown. Here, we show only sites that are located in important

functional regions. Sites located in or neighboring within-ring inter-subunits, between-ring inter-subunit contact regions, substrate-binding sites, and ATP-binding sites are

labeled in black, black underlined, white in a black background, and white in a black background underlined, respectively.

1392 Ruano-Rubio and Fares

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

FIG. 3.—Three-dimensional localization of regions under coevolution in the archaeal CCTs based on the Thermoplasma acidophilum structure(PDB file: 1A6D). Here, we highlighted coevolving sites under accelerated rates of evolution as solid spheres including sites in within-ring inter-subunits contact interfaces (yellow), substrate-binding regions (blue), between-ring contact interfaces (pink), and proximal regions (green). To visualizethe sites clearly, we have rotated conveniently 3-dimensional structures. The different structures are therefore differently orientated. When sites evolvedunder adaptive evolution, we labeled them in brown, black, red, and green instead. Dashed-line circles surround sites belonging to the same coevolvinggroup. Here, we illustrate a total of 5 examples in Thermoplasmata (a), Thermococci (b), Methanomicrobia (c) Halobacteriales (d), and Crenarchaeotes(e). We only present cases of coevolution involving 2 subunits and arbitrarily assigning each subunit to 1 shown in this figure.

Neutral Fixation of Hetero-Oligomerism in Archaeal CCT 1393

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

Several lines of evidence in our study support the neu-tral fixation of archaeal CCT hetero-oligomerism and argueagainst the role of selection through sub-functionalizationor neo-functionalization. First, we found that most of themutations fixed in archaeal CCTs with 2 different classesof subunits are located within-ring inter-subunit or between-ring interfaces. Second, these regions show a significantlyaccelerated fixation rate of mutations. Third, acceleratedfixation rates of mutations and mutations fixed by adaptiveevolution have also affected peptide regions in the structureneighboring those accelerated interfaces. Finally, these 2sets of mutations present evidence of coevolution, indicat-ing a compensatory relationship between them.

Interestingly, we found contrasting evolutionary sce-narios when comparing archaeal CCTs with 2 differentclasses of subunits to archaeal CCTs showing 3 differentclasses of subunits. Several mutations in genes codingfor CCTs with 3 distinct subunits have affected substrate-binding regions and ATP-binding sites. These mutationswere fixed by adaptive evolution in one of the subunits, sup-porting their possible involvement in sub-functionalization,and present no evidence of secondary compensatorychanges. These differences between both groups of CCTsprovide evidence in support of a more complex hypothesisto explain the evolution of hetero-oligomerism. A plausibleevolutionary explanation for these conclusions is that oncehetero-oligomerism became neutrally fixed, distinct CCTsubunits started accumulating specializing mutations lead-ing to sub-functionalization. Our data suggest that sub-functionalization may depend on the number of distinctsubunits available in the complex. In this context, eukary-otic CCTs constitute an excellent example of such evolu-tionary process taken to completion. On the other hand,dispensability of several subunits in the hyperthermophilicarchaeon suggests functional redundancy of some of thesubunits and argues against sub-functionalization by forma-tion of CCTs comprising 3 distinct subunits with a fixedgeometry (Kagawa et al. 2003). Analysis of the dispensabil-ity of the different CCT genes (cct1, cct2, and cct3) in Hal-oferax volcanii demonstrates dispensability of at least 2 outof the 3 genes (Kapatai et al. 2006). These authors alsoshow that the CCT protein complex in this archaeon isa double ring with 8-fold symmetry but that the rings aremixed complexes of different subunits (Kapatai et al.2006). Although, this study does not support a fixed geom-etry for a CCT complex formed from 3 distinct subunits, thedifferent use of combinations of 2 subunits suggests that atleast one of the subunits may have undergone sub-function-alization after gene duplication, something also supportedby the fact that CCT3 (called CCT c subunit in this work)cannot support growth on its own (Kapatai et al. 2006). Thevariability in the dispensability of the different CCT subu-nits in different organisms and our limited knowledge aboutthe flexibility of the CCT complex geometry preclude stat-ing definitive conclusions and demonstrate that there is stillmuch to be learned about CCT complexes’ structural andevolutionarily properties in different archaeal lineages.

Coevolution between CCTs and client proteins in eu-karyotic cells may have fuelled a sub-functionalization pro-cess in eukaryotic CCTs compared with archaeal CCTs. Forinstance, there are several examples of coevolution between

eukaryotic CCTs and hetero-oligomeric proteins, also gen-erated at the origin of the eukaryotic cell, such as actin andtubulin (Horwich and Willison 1993; Llorca, McCormacket al. 1999; Llorca et al. 2000). Tubulin binds to 5 CCTsubunits in 2 different arrangements, utilizing hence the8 CCT subunits (Llorca et al. 2000). Llorca et al. (Llorca,McCormack et al. 1999; Llorca et al. 2000) proposed sub-functionalization leading to protein-binding specializationin CCT to explain the binding and folding model with actinand tubulin. Greater number of distinct subunits would per-mit greater possible arrangements in which protein clientscould bind CCTs and greater protein client’s versatility.This is only possible through the coevolution between theseprotein clients and CCTs. Then, it follows that CCTs het-ero-oligomerism and protein client’s versatility are 2 tightlylinked phenomena. In conclusion, hetero-oligomerism maywell be related to the gain of cell protein complexity. Es-tablishing a ground theory for the origin and evolution ofhetero-oligomerism may unearth breakthrough informationon the origin and evolutionary factors responsible for theemergence of cell complexity in eukaryotes.

Supplementary Material

Supplementary table 1 and figure 1 are available atMolecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work has been supported by Science FoundationIreland. We would also like to thank the editor and the re-viewers for their helpful suggestions.

Literature Cited

Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection ofbest-fit models of protein evolution. Bioinformatics. 21:2104–2105.

Akashi H. 1999. Within- and between-species DNA sequencevariation and the ‘footprint’ of natural selection. Gene.238:39–51.

Amoutzias GD, Robertson DL, Oliver SG, Bornberg-Bauer E.2004. Convergent evolution of gene networks by single-geneduplications in higher eukaryotes. EMBO Rep. 5:274–279.

Archibald JM, Blouin C, Doolittle WF. 2001. Gene duplicationand the evolution of group II chaperonins: implications forstructure and function. J Struct Biol. 135:157–169.

Archibald JM, Logsdon JM, Doolittle WF. 1999. Recurrentparalogy in the evolution of archaeal chaperonins. Curr Biol.9:1053–1056.

Archibald JM, Roger AJ. 2002a. Gene duplication and geneconversion shape the evolution of archaeal chaperonins. J MolBiol. 316:1041–1050.

Archibald JM, Roger AJ. 2002b. Gene conversion and the evo-lution of euryarchaeal chaperonins: a maximum likelihood-based method for detecting conflicting phylogenetic signals.J Mol Evol. 55:232–245.

Brochier C, Forterre P, Gribaldo S. 2005. An emergingphylogenetic core of Archaea: phylogenies of transcriptionand translation machineries converge following addition ofnew genome sequences. BMC Evol Biol. 5:36.

1394 Ruano-Rubio and Fares

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

Brochier C, Gribaldo S, Zivanovic Y, Confalonieri F, Forterre P.2005. Nanoarchaea: representatives of a novel archaealphylum or a fast-evolving euryarchaeal lineage related toThermococcales? Genome Biol. 6:R42.

Buckle AM, Zahn R, Fersht AR. 1997. A structural model forGroEL-polypeptide recognition. Proc Natl Acad Sci USA.94:3571–3575.

Bukau AL, Horwich B. 1998. The Hsp70 and Hsp60 chaperonemachines. Cell. 92:351–366.

Dimmic MW, Rest JS, Mindell DP, Goldstein RA. 2002. rtREV:an amino acid substitution matrix for inference of retrovirusand reverse transcriptase phylogeny. J Mol Evol. 55:65–73.

Ditzel L, Lowe J, Stock D, Stetter KO, Huber H, Huber R,Steinbacher S. 1998. Crystal structure of the thermosome,the archaeal chaperonin and homolog of CCT. Cell. 93:125–138.

Dunbar AY, Kamada Y, Jenkins GJ, Lowe ER, Billecke SS,Osawa Y. 2004. Ubiquitination and degradation of neuronalnitric-oxide synthase in vitro: dimer stabilization protects theenzyme from proteolysis. Mol Pharmacol. 66:964–969.

Edgar RC. 2004. MUSCLE: multiple sequence alignment withhigh accuracy and high throughput. Nucleic Acids Res.32:1792–1797.

Ellis RJ. 1997. Molecular chaperones: avoiding the crowd. CurrBiol. 7:R531–R533.

Ellis RJ, Hartl FU. 1999. Principles of protein folding in thecellular environment. Curr Opin Struct Biol. 9:102–110.

Fares MA. 2004. SWAPSC: sliding window analysis proce-dure to detect selective constraints. Bioinformatics. 20:2867–2868.

Fares MA, Elena SF, Ortiz J, Moya A, Barrio E. 2002. A slidingwindow-based method to detect selective constraints inprotein-coding genes and its application to RNA viruses. JMol Evol. 55:509–521.

Fares MA, McNally D. 2006. CAPS: coevolution analysis usingprotein sequences. Bioinformatics. 22:2821–2822.

Fares MA, Travers SA. 2006. A novel method for detectingintramolecular coevolution: adding a further dimension toselective constraints analyses. Genetics. 173:9–23.

Fares MA, Wolfe KH. 2003. Positive selection and subfunction-alization of duplicated CCT chaperonin subunits. Mol BiolEvol. 20:1588–1597.

Farr GW, Scharl EC, Schumacher RJ, Sondek S, Horwich AL.1997. Chaperonin-mediated folding in the eukaryotic cytosolproceeds through rounds of release of native and nonnativeforms. Cell. 89:927–937.

Feldman DE, Thulasiraman V, Ferreyra RG, Frydman J. 1999.Formation of the VHL-elongin BC tumor suppressor com-plex is mediated by the chaperonin TRiC. Mol Cell. 4:1051–1061.

Felsenstein J. 1989. PHYLIP—Phylogeny Inference Package(Version 3.2). Cladistics 5:164–166.

Frydman J. 2001. Folding of newly translated proteins in vivo:the role of molecular chaperones. Annu Rev Biochem.70:603–647.

Frydman J, Nimmesgern E, Ohtsuka K, Hartl FU. 1994. Foldingof nascent polypeptide chains in a high molecular massassembly with molecular chaperones. Nature. 370:111–117.

Gavin AC, Bosche M, Krause R, et al. (38 co-authors). 2002.Functional organization of the yeast proteome by systematicanalysis of protein complexes. Nature. 415:141–147.

Geissler S, Siegers K, Schiebel E. 1998. A novel protein complexpromoting formation of functional alpha- and gamma-tubulin.EMBO J. 17:952–966.

Goldman N, Yang Z. 1994. A codon-based model of nucleotidesubstitution for protein-coding DNA sequences. Mol BiolEvol. 11:725–736.

Grantcharova V, Alm EJ, Baker D, Horwich AL. 2001.Mechanisms of protein folding. Curr Opin Struct Biol. 11:70–82.

Grantham J, Llorca O, Valpuesta JM, Willison KR. 2000. Partialocclusion of both cavities of the eukaryotic chaperonin withantibody has no effect upon the rates of beta-actin or alpha-tubulin folding. J Biol Chem. 275:4587–4591.

Guindon S, Gascuel O. 2003. A simple, fast, and accuratealgorithm to estimate large phylogenies by maximum likeli-hood. Syst Biol. 52:696–704.

Gutsche I, Essen LO, Baumeister W. 1999. Group II chaperonins:new TRiC(k)s and turns of a protein folding machine. J MolBiol. 293:295–312.

Gutsche I, Mihalache O, Hegerl R, Typke D, Baumeister W.2000. ATPase cycle controls the conformation of an archaealchaperonin as visualized by cryo-electron microscopy. FEBSLett. 477:278–282.

Hartl FU. 1996. Molecular chaperones in cellular protein folding.Nature. 381:571–579.

Hartl FU, Hayer-Hartl M. 2002. Molecular chaperones in thecytosol: from nascent chain to folded protein. Science.295:1852–1858.

Henikoff S, Henikoff JG. 1992. Amino acid substitution matricesfrom protein blocks. Proc Natl Acad Sci USA. 89:10915–10919.

Ho Y, Gruhler A, Heilbut A, et al. (46 co-authors). 2002.Systematic identification of protein complexes in Saccharo-myces cerevisiae by mass spectrometry. Nature. 415:180–183.

Horwich AL, Willison KR. 1993. Protein folding in the cell:functions of two families of molecular chaperone, hsp 60 andTF55-TCP1. Philos Trans R Soc Lond B Biol Sci.339:313–325 ; discussion 325–316.

Hynes GM, Willison KR. 2000. Individual subunits of theeukaryotic cytosolic chaperonin mediate interactions withbinding sites located on subdomains of beta-actin. J BiolChem. 275:18985–18994.

Ispolatov A, Yuryev I, Mazo I, Maslov S. 2005. Bindingproperties and evolution of homodimers in protein-proteininteraction networks. Nucleic Acids Res. 33:3629–3635.

Kad NM, Ranson NA, Cliff MJ, Clarke AR. 1998. Asymmetry,commitment and inhibition in the GroE ATPase cycle imposealternating functions on the two GroEL rings. J Mol Biol.278:267–278.

Kagawa HK, Yaoi T, Brocchieri L, McMillan RA, Alton T,Trent JD. 2003. The composition, structure and stability ofa group II chaperonin are temperature regulated in a hyper-thermophilic archaeon. Mol Microbiol. 48:143–156.

Kapatai G, Large A, Benesch JL, Robinson CV, Carrascosa JL,Valpuesta JM, Gowrinathan P, Lund PA. 2006. All threechaperonin genes in the archaeon Haloferax volcanii areindividually dispensable. Mol Microbiol. 61:1583–1597.

Kashuba E, Pokrovskaja K, Klein G, Szekely L. 1999. Epstein-Barr virus-encoded nuclear protein EBNA-3 interacts with theepsilon-subunit of the T-complex protein 1 chaperonincomplex. J Hum Virol. 2:33–37.

Keane TM, Creevey CJ, Pentony MM, Naughton TJ,McLnerney JO. 2006. Assessment of methods for amino acidmatrix selection and their use on empirical data shows that adhoc assumptions for choice of matrix are not justified. BMCEvol Biol. 6:29.

Kimura M. 1977. Preponderance of synonymous changes asevidence for the neutral theory of molecular evolution.Nature. 267:275–276.

Klumpp M, Baumeister W, Essen LO. 1997. Structure of thesubstrate binding domain of the thermosome, an archaealgroup II chaperonin. Cell. 91:263–270.

Neutral Fixation of Hetero-Oligomerism in Archaeal CCT 1395

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from

Korber BT, Farber RM, Wolpert DH, Lapedes AS. 1993.Covariation of mutations in the V3 loop of human immuno-deficiency virus type 1 envelope protein: an informationtheoretic analysis. Proc Natl Acad Sci USA. 90:7176–7180.

Kumar S, Tamura K, Nei M. 2004. MEGA3: Integrated softwarefor molecular evolutionary genetics analysis and sequencealignment. Brief Bioinform. 5:150–163.

Leroux MR, Hartl FU. 2000. Protein folding: versatility of thecytosolic chaperonin TRiC/CCT. Curr Biol. 10:R260–R264.

Li WH. 1993. Unbiased estimation of the rates of synonymousand nonsynonymous substitution. J Mol Evol. 36:96–99.

Lingappa JR, Martin RL, Wong ML, Ganem D, Welch WJ,Lingappa VR. 1994. A eukaryotic cytosolic chaperonin isassociated with a high molecular weight intermediate in theassembly of hepatitis B virus capsid, a multimeric particle. JCell Biol. 125:99–111.

Liou AK, Willison KR. 1997. Elucidation of the subunitorientation in CCT (chaperonin containing TCP1) from thesubunit composition of CCT micro-complexes. EMBO J.16:4311–4316.

Llorca O, Martin-Benito J, Gomez-Puertas P, Ritco-Vonsovici M, Willison KR, Carrascosa JL, Valpuesta JM.2001. Analysis of the interaction between the eukaryoticchaperonin CCT and its substrates actin and tubulin. J StructBiol. 135:205–218.

Llorca O, Martin-Benito J, Grantham J, Ritco-Vonsovici M,Willison KR, Carrascosa JL, Valpuesta JM. 2001. The‘sequential allosteric ring’ mechanism in the eukaryoticchaperonin-assisted folding of actin and tubulin. EMBO J.20:4065–4075.

Llorca O, Martin-Benito J, Ritco-Vonsovici M, Grantham J,Hynes GM, Willison KR, Carrascosa JL, Valpuesta JM. 2000.Eukaryotic chaperonin CCT stabilizes actin and tubulinfolding intermediates in open quasi-native conformations.EMBO J. 19:5971–5979.

Llorca O, McCormack EA, Hynes G, Grantham J, Cordell J,Carrascosa JL, Willison KR, Fernandez JJ, Valpuesta JM.1999. Eukaryotic type II chaperonin CCT interacts with actinthrough specific subunits. Nature. 402:693–696.

Llorca O, Smyth MG, Carrascosa JL, Willison KR,Radermacher M, Steinbacher S, Valpuesta JM. 1999. 3Dreconstruction of the ATP-bound form of CCT reveals theasymmetric folding conformation of a type II chaperonin. NatStruct Biol. 6:639–642.

Llorca O, Smyth MG, Marco S, Carrascosa JL, Willison KR,Valpuesta JM. 1998. ATP binding induces large conforma-tional changes in the apical and equatorial domains of theeukaryotic chaperonin containing TCP-1 complex. J BiolChem. 273:10091–10094.

Lockhart P, Novis P, Milligan BG, Riden J, Rambaut A,Larkum T. 2006. Heterotachy and tree building: a case studywith plastids and eubacteria. Mol Biol Evol. 23:40–45.

Madden TL, Tatusov RL, Zhang J. 1996. Applications ofnetwork BLAST server. Methods Enzymol. 266:131–141.

Marianayagam NJ, Sunde M, Matthews JM. 2004. The power oftwo: protein dimerization in biology. Trends Biochem Sci.29:618–625.

McCormack EA, Llorca O, Carrascosa JL, Valpuesta JM,Willison KR. 2001. Point mutations in a hinge linking thesmall and large domains of beta-actin result in trapped foldingintermediates bound to cytosolic chaperonin CCT. J StructBiol. 135:198–204.

Nei M, Kumar S. 2000. Molecular Evolution and Phylogenetics.New York: Oxford University Press.

Nitsch M, Walz J, Typke D, Klumpp M, Essen LO,Baumeister W. 1998. Group II chaperonin in an open

conformation examined by electron tomography. Nat StructBiol. 5:855–857.

Posada D, Crandall KA. 1998. MODELTEST: testing the modelof DNA substitution. Bioinformatics. 14:817–818.

Ronnstrand L. 2004. Signal transduction via the stem cell factorreceptor/c-Kit. Cell Mol Life Sci. 61:2535–2548.

Schoehn G, Hayes M, Cliff M, Clarke AR, Saibil HR. 2000.Domain rotations between open, closed and bullet-shapedforms of the thermosome, an archaeal chaperonin. J Mol Biol.301:323–332.

Schoehn G, Quaite-Randall E, Jimenez JL, Joachimiak A,Saibil HR. 2000. Three conformations of an archaealchaperonin, TF55 from Sulfolobus shibatae. J Mol Biol. 296:813–819.

Sharp PM. 1997. In search of molecular darwinism. Nature.385:111–112.

Sigler PB, Xu Z, Rye HS, Burston SG, Fenton WA, Horwich AL.1998. Structure and function in GroEL-mediated proteinfolding. Annu Rev Biochem. 67:581–608.

Simon AM, Goodenough DA. 1998. Diverse functions ofvertebrate gap junctions. Trends Cell Biol. 8:477–483.

Srikakulam R, Winkelmann DA. 1999. Myosin II folding ismediated by a molecular chaperonin. J Biol Chem. 274:27265–27273.

Stan G, Brooks BR, Lorimer GH, Thirumalai D. 2006. Residuesin substrate proteins that interact with GroEL in the captureprocess are buried in the native state. Proc Natl Acad SciUSA. 103:4433–4438.

Takezaki N, Rzhetsky A, Nei M. 1995. Phylogenetic test of themolecular clock and linearized trees. Mol Biol Evol.12:823–833.

Thulasiraman V, Yang CF, Frydman J. 1999. In vivo newlytranslated polypeptides are sequestered in a protected foldingenvironment. EMBO J. 18:85–95.

Vainberg IE, Lewis SA, Rommelaere H, Ampe C,Vandekerckhove J, Klein HL, Cowan NJ. 1998. Prefoldin,a chaperone that delivers unfolded proteins to cytosolicchaperonin. Cell. 93:863–873.

Valpuesta JM, Martin-Benito J, Gomez-Puertas P, Carrascosa JL,Willison KR. 2002. Structure and function of a protein foldingmachine: the eukaryotic cytosolic chaperonin CCT. FEBSLett. 529:11–16.

Westfall PH, Young SS. 1993. Resampling-based multipletesting. New York: John Wiley & Sons.

Whelan S, Goldman N. 2001. A general empirical model ofprotein evolution derived from multiple protein families usinga maximum-likelihood approach. Mol Biol Evol. 18:691–699.

Yang Z. 1997. PAML: a program package for phylogeneticanalysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang Z, Nielsen R. 2002. Codon-substitution models fordetecting molecular adaptation at individual sites alongspecific lineages. Mol Biol Evol. 19:908–917.

Yang Z, Nielsen R, Goldman N, Pedersen AM. 2000. Codon-substitution models for heterogeneous selection pressure atamino acid sites. Genetics. 155:431–449.

Yang Z, Wong WS, Nielsen R. 2005. Bayes empirical bayesinference of amino acid sites under positive selection. MolBiol Evol. 22:1107–1118.

Zhang J, Nielsen R, Yang Z. 2005. Evaluation of an improvedbranch-site likelihood method for detecting positive selectionat the molecular level. Mol Biol Evol. 22:2472–2479.

Claudia Schmidt-Dannert, Associate Editor

Accepted March 15, 2007

1396 Ruano-Rubio and Fares

by guest on March 2, 2016

http://mbe.oxfordjournals.org/

Dow

nloaded from