Protein Refolding Computer Simulation


  • 8/2/2019 Protein Refolding Computer Simulation


    Protein Refolding Versus Aggregation: Computer Simulations on an Intermediate-resolution Protein Model

    Anne Voegler Smith and Carol K. Hall*

    Department of Chemical Engineering, North Carolina State University, Raleigh, NC 27695-7905, USA

    Computer simulations are performed on a system of eight model peptide chains to study how the competition between protein refolding and aggregation affects the optimal conditions for refolding of four-helix bundles. The discontinuous molecular dynamics algorithm is utilized along with an intermediate-resolution protein model that we developed for this work. Physically, the model is much more detailed than any model used to date for simulations of protein aggregation. Each model residue consists of a detailed, three-bead backbone and a simplified, single-bead side-chain. Excluded volume, hydrogen bond, and hydrophobic interactions are modeled with discontinuous (i.e. hard-sphere and square-well) potentials. Simulations efficiently sample conformational space, and complete folding trajectories from random initial configurations to two four-helix bundles are possible within two days on a single-processor workstation. Folding of the bundles follows two main pathways, one through a trimeric intermediate and the other through an intermediate with two dimers. The proportion of trajectories that follow each route is significantly different for the eight-peptide system in this work than in a previously studied four-peptide system, which yields one four-helix bundle, suggesting, as our previous simulations have, that protein folding properties are strongly influenced by the presence of other proteins. Folding of the bundles is optimal within a fixed temperature range, with the high-temperature boundary a function of the complexity of the protein (or oligomer) to be folded and the low-temperature boundary a function of the complexity of the protein's environment. Above the optimal temperature range for folding, the model chains tend to unfold; below the optimal range, the model chains tend to aggregate. As has been seen previously, aggregates have substantial levels of native secondary structure, suggesting that aggregates are composed largely of partially folded intermediates, not denatured chains.

    © 2001 Academic Press

    Keywords: discontinuous molecular dynamics; protein folding; protein misfolding; aggregation; four-helix bundle

    *Corresponding author. E-mail address of the corresponding author: [email protected]

    Abbreviations used: DMD, discontinuous molecular dynamics.

    doi:10.1006/jmbi.2001.4845, available online at http://www.idealibrary.com. J. Mol. Biol. (2001) 312, 187–202

    0022-2836/01/01018716 $35.00/0 © 2001 Academic Press

    Introduction

    The protein aggregation problem is just as complicated and interesting as the protein folding problem. In multi-protein systems, competition exists between the formation of correct intra-protein interactions during folding and incorrect inter-protein interactions during aggregation.1 The outcome of this competition is dependent on properties of both the protein itself and the protein's environment. Proteins that aggregate in vivo can have profound pathological implications, as in the aberrant aggregation of β-amyloid proteins to form plaques in Alzheimer's disease2 and the aggregation of the PrPSc variant of the prion protein in various neurodegenerative diseases.3 Protein aggregation is also a serious obstacle to protein-based drug production. In host cell systems genetically engineered to overproduce heterologous proteins of pharmaceutical importance, the desired proteins often aggregate into inclusion bodies which must be dissolved and then treated with specific renaturation conditions to recover active protein.4–6 Unfortunately, the fundamental mechanisms underlying aggregation are not well understood, and optimal refolding conditions must be individually determined for each protein of interest.

    The goal of our work is to use computer simulations to study how the competition between protein folding and aggregation affects the optimal refolding conditions for proteins. We investigate refolding as a function of temperature in an eight-peptide system designed to form two tetrameric α-helical bundles based on DeGrado and co-workers' de novo designed α family of proteins.7–11 We compare the results for this eight-peptide system with our previous study on a four-chain system designed to form one tetrameric α-helical bundle and a one-chain system designed to form an α-helix.12 With these simulations, we are able to probe the balance between protein folding and aggregation in multi-protein systems and to offer a physically based explanation for the optimum in refolding yield as a function of environmental conditions that is observed in this work as well as in previous simulations12,13 and in refolding experiments.1 We also study how the conformational properties of individual peptides differ in this eight-peptide system in comparison to the one- and four-peptide systems studied previously, and we characterize the overall conformational properties of multi-peptide aggregates.

    Computer simulations of isolated proteins with a variety of protein models have provided a wealth of information on protein stability and on the kinetics and thermodynamics of protein folding. Complex, high-resolution models, such as all-atom models which typically incorporate every atom of the protein (with the exception of some hydrogen atoms), are common14–16 and have allowed valuable insights into a variety of interesting phenomena including the process of protein unfolding,17–19 the conformational properties of the denatured state ensemble,20,21 and the nature of highly specific protein-protein interactions.22 Idealized, low-resolution models, such as on- and off-lattice homo- and heteropolymer chain models in which each amino acid residue is represented by a single sphere with identical (homo) or varied (hetero) interaction parameters,23–25 have been used to study conformational transitions during folding,26–28 the structures of molten globule intermediates,29,30 and the conformational variability in ensembles of low energy, native-like structures.31–37

    Intermediate-resolution protein models represent a powerful compromise between the extremely simplified (low-resolution) homo- and heteropolymer chain models and the complex all-atom (high-resolution) models currently favored by the protein folding community. The two-bead model,38–42 with one backbone bead and one side-chain bead per residue, and the three-bead model,43 with one backbone bead and one or two side-chain beads per residue, allow independent backbone and side-chain interactions in the system. Four-bead models,44,45 with three backbone beads and one side-chain bead, have been developed to more accurately represent protein backbone structure. The success with these models suggests that intermediate-resolution models, which inherently access longer times than their high-resolution counterparts, may offer reasonable estimates of the folding process and of three-dimensional folded structures.

    Low- and intermediate-resolution protein models of the α family of proteins have been used previously for computer simulation studies of protein folding.41,42,46 Guo and Thirumalai studied the thermodynamics and kinetics of the α4 protein using Langevin dynamics simulations on a heteropolymer chain model and found that four-helix bundle folding occurs via a variety of pathways, some of which are complicated and involve long-lived intermediates. They suggest that a regular pattern of hydrophobic and hydrophilic residues is crucial for four-helix bundle folding and that all de novo protein design efforts should carefully arrange the hydrophobic and hydrophilic residues so as to destabilize non-native folds.46 Using a simplified protein representation with one backbone bead and one side-chain bead per residue, Sikorski et al. performed lattice Monte Carlo simulations on α1B, α2, and α4 peptides.41 They too found that the sequence designed by DeGrado and co-workers provided enough information to successfully yield the native structure. Multiple folding pathways were observed with equilibrium intermediate structures possessing substantial native character.

    While computer simulations are widely used to study the dynamics of isolated proteins during the folding process, simulations of protein aggregation are rare and, to our knowledge, have been performed exclusively on low-resolution protein models. Patro and Przybycien presented some of the earliest aggregation simulations in which a hexagon with a mix of hydrophobic and polar sides was used as the model protein and protein aggregation was studied by monitoring the association of the hexagons with two-dimensional Monte Carlo simulations.47,48 We recently studied the competition between protein refolding and aggregation using a 40-chain heteropolymer system and two-dimensional lattice Monte Carlo simulations and found that refolding yield is optimal at intermediate values of denaturant concentration and that aggregation arises from the association of partially folded intermediates, not from completely denatured chains.13 These simulations were performed with a "two-letter" heteropolymer model termed an HP model because each bead on the chain is one of two possible types: hydrophobic (H) or polar (P). Istrail et al. performed simulations on a pair of HP chains using two-dimensional lattice simulations to study the innate ability of a pair of proteins to aggregate.49 Broglia et al. performed simulations on a pair of 20-letter model protein chains using a three-dimensional Monte Carlo method and found that aggregates consist of partially folded intermediates, not denatured chains.50 Recently, exact enumeration studies of HP chains have been reported by Giugliarelli et al.51 and by Harrison et al.52 Giugliarelli et al. reported a study of two-dimensional HP chains that probes the influence of the inter-residue interaction strength on the formation of either soluble, non-interacting native structures, aggregates composed of chains with native structures, or aggregates composed of chains with non-native structures. Harrison et al. studied pairs of two- and three-dimensional HP chains to examine the thermodynamics of the conformational change associated with aggregation of model prion proteins. Given the current power of computers and supercomputers, simulations of protein aggregation require simplified protein models. However, our efforts are aimed at simulating aggregation with a model that possesses significantly more physical detail than the protein models used previously to study aggregation.

    Here, we present our results from computer simulations on refolding and aggregation of two tetrameric α-helical bundle proteins using an intermediate-resolution protein model. The protein model used here offers a significant improvement in detail over previous low-resolution protein models used in simulations of aggregating systems. In previous papers, we described the development of an off-lattice, intermediate-resolution protein model that incorporates substantial physical protein detail yet is simple enough to permit simulations of multiple proteins over long times.53,54 In our most recent work, we presented results for simulations on the folding of the isolated 16-residue model peptide that serves as the building block for the tetrameric bundle and for simulations on the assembly of four of these 16-residue model peptides into the native tetrameric bundle.12 Here, we simulate eight-chain systems to study tetrameric α-helical bundle formation in the presence of competing aggregation events. We study folding to the native state as a function of temperature, and we characterize the predominant conformations of model peptides that are involved in aggregated structures. Despite the simplicity of our model, we obtain α-helical bundle structures with realistic physical properties in relatively short (on the order of hours) simulations on single-processor workstations.

    Highlights of our simulation results are the following. The simplicity of the protein model developed for this work, along with the power of the discontinuous molecular dynamics algorithm, enables observation of complete folding trajectories to the native state of two tetrameric α-helical bundles via simulations as short as two days on a 500 MHz single-processor workstation. There is an optimal temperature range for bundle assembly in our simulations, and the range is defined by the tendency of the peptides to aggregate at low temperatures and to unfold at high temperatures. We observe the same two main pathways of bundle assembly as seen previously in the four-peptide system (monomer-to-dimer-to-trimer-to-tetramer, and monomer-to-dimer-to-tetramer); however, the proportion of folding trajectories that follow each route is different for the four- and eight-peptide systems, suggesting that the number of chains in the system has a significant impact on likely folding trajectories. The boundaries marking the optimal temperature range for folding in the eight-peptide system are different from those in the one- and four-peptide systems studied previously. The high-temperature boundary of the optimal temperature range for folding appears to be a function of the complexity of the protein being simulated, and the low-temperature boundary appears to be a function of the complexity of the protein's environment, with more complex environments (such as a larger number of chains) contributing to aggregation. Aggregates tend to consist of chains with substantial native secondary structure and are held together by a significant number of non-native hydrophobic contacts, suggesting that partially folded chains, not denatured chains, are the main component of amorphous aggregates.

    The next section describes the model developed for this work, including the physical representation and the potential energy function, and the DMD simulation technique. A further section presents the results and discussion for the simulations of the folding and aggregation of two tetrameric α-helical bundles. The final section provides a brief conclusion.

    Models and Methods

    The physical protein representation and other model details are described fully elsewhere.54 We provide a brief description here.

    Physical chain representation

    The protein model has a fairly realistic backbone structure and a very simplified side-chain structure. Each amino acid residue is modeled with four beads as depicted by the four broken circles in Figure 1. An N united atom represents the amino acid's amide nitrogen and hydrogen, a Cα united atom represents the alpha-carbon and its hydrogen, and a C united atom represents the carbonyl carbon and oxygen. The fourth bead in the model, R, represents the side-chain group. This physical structure, a three-bead backbone and one-bead side-chain, has been used successfully to study the folding of isolated proteins elsewhere44,45 in concert with different search algorithms and potential energy functions. Model glycine residues do not have R beads, and the model cannot currently be used for proline residues because of proline's unusual structure. In our model, the inter-residue bond is assumed to be in the trans configuration, all backbone bond lengths and bond angles are fixed at their ideal values, and the distance between consecutive Cα united atoms is fixed in accordance with empirical observations. The side-chains in the model may vary in size and distance from Cα, depending on the particular amino acid residues being modeled, and are held in positions relative to the backbone so that all residues are L-isomers. The values of the bond lengths and angles and the method used to maintain these values and chirality are given below. Solvent molecules are not explicitly included in the model. The effect of solvent is factored into the energy function as a potential of mean force.

    Forces and interactions

    Beads in the protein model are subject to four different types of forces: repulsion due to excluded volume effects, attraction between bonded beads and pseudobonded beads (as will be defined below), attraction between pairs of backbone beads during hydrogen bond formation, and attraction between pairs of side-chain beads during hydrophobic interactions. Each of these forces is represented by a discontinuous potential, either a hard-sphere potential:

    $$u_{ij}(r) = \begin{cases} \infty, & r \le \sigma \\ 0, & r > \sigma \end{cases} \qquad (1)$$

    where r is the distance between beads i and j and σ is the bead diameter, or a square-well potential:

    $$u_{ij}(r) = \begin{cases} \infty, & r \le \sigma \\ -\varepsilon, & \sigma < r \le \lambda\sigma \\ 0, & r > \lambda\sigma \end{cases} \qquad (2)$$

    where λσ is the well diameter and ε is the well depth. Conceptually, a hard sphere refers to an impenetrable, solid sphere, and a square well refers to an attractive region, with its extent set by λ, that envelops that sphere. Deeper wells correspond to stronger attractive interactions between square-well beads, and shallower wells correspond to weaker attractive interactions. The well depth parameter is coupled to the temperature so that a single parameter, the reduced well depth (ε*), which is equal to ε/kBT, characterizes the protein environment. High values of ε*, for example, can be considered to characterize a low temperature or poor solvent environment. Reduced temperature, T*, is the inverse of reduced well depth. In this work, T* is defined by the strength of the hydrogen bonding potential and is therefore equal to kBT/ε_HB, where ε_HB is the depth of the square well on an N or a C bead. The strength of the hydrophobic potential (the depth of the square well on hydrophobic side-chains, ε_HP) may be varied independently of the strength of the hydrogen bonding potential (ε_HB). The reduced time in our simulations is defined to be $t^* = t\,(\varepsilon_{HB}/m\sigma^2)^{1/2}$, where t is the time and m is the average molecular mass of a bead in our model.
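The two discontinuous potentials and the reduced quantities described above can be written out in a few lines. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the attractive well is assumed to contribute an energy of −ε, consistent with ε being described as a well depth.

```python
import math


def hard_sphere(r, sigma):
    """Hard-sphere potential: infinite at or inside the bead diameter, zero outside."""
    return math.inf if r <= sigma else 0.0


def square_well(r, sigma, lam, eps):
    """Square-well potential with hard core sigma, well diameter lam*sigma, depth eps.

    Assumption: the well contributes -eps (attractive) between sigma and lam*sigma.
    """
    if r <= sigma:
        return math.inf
    if r <= lam * sigma:
        return -eps
    return 0.0


def reduced_temperature(kT, eps_hb):
    """T* = kB*T / eps_HB, the inverse of the reduced hydrogen-bond well depth."""
    return kT / eps_hb


def reduced_time(t, eps_hb, m, sigma):
    """t* = t * sqrt(eps_HB / (m * sigma**2))."""
    return t * math.sqrt(eps_hb / (m * sigma ** 2))


# Illustrative values only (not parameters from the paper).
print(square_well(1.2, sigma=1.0, lam=1.5, eps=2.0))
print(reduced_temperature(kT=0.1, eps_hb=1.0))
```

Because the potentials are piecewise constant, no forces act between events; this is what allows the event-driven scheme described later to jump directly from one event to the next.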

    Excluded volume

    Pairs of beads collide and repel when the distance between them becomes so small that their surfaces touch (when r_ij = σ). Diameters for each of the three types of backbone beads are chosen to be reasonable estimates for the sizes of the atoms they represent as described previously54 and are shown in Table 1. For interactions between pairs of neighbor beads (three or fewer bonds apart along the chain), we allow the beads to overlap by up to 25%. The amount of overlap is chosen to dictate the range of motion around N-Cα and Cα-C bonds, and our previous work demonstrates that the model exhibits realistic φ-ψ conformational freedom for both non-glycine and glycine residues.54

    Bonds

    Covalent bonds are maintained between neighboring beads along the chain backbone and between the Cα and R united atoms. Bonded beads move freely between separation distances of (1 − δ)l and (1 + δ)l, where δ is the bond tolerance and l is the ideal bond length between the bonded beads. The choice of δ defines the acceptable range of fluctuation in the bond length. Here, δ is chosen to be 0.02. In effect, bonds in the simulation fluctuate within 2% of their assigned lengths by experiencing a hard-sphere repulsion at (1 − δ)l and an infinitely strong square-well attraction at (1 + δ)l.
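The bond constraint described above amounts to keeping each bonded pair inside a narrow window of separations. A minimal sketch follows, assuming a bond tolerance of 0.02 as stated in the text; the function names and the 1.50 Å example length are hypothetical and are not taken from Table 1.

```python
def bond_window(ideal_length, delta=0.02):
    """Return the (inner, outer) separations between which a bonded pair moves freely."""
    return (1.0 - delta) * ideal_length, (1.0 + delta) * ideal_length


def bond_event(separation, ideal_length, delta=0.02):
    """Classify what happens to a bonded pair at a given separation.

    'repel'   : the pair has reached the inner limit (hard-sphere repulsion)
    'attract' : the pair has reached the outer limit (infinitely deep well wall)
    None      : the pair is inside the window and moves freely
    """
    inner, outer = bond_window(ideal_length, delta)
    if separation <= inner:
        return "repel"
    if separation >= outer:
        return "attract"
    return None


# Hypothetical ideal bond length of 1.50 angstroms, for illustration only.
inner, outer = bond_window(1.50)
print(round(inner, 3), round(outer, 3))
print(bond_event(1.50, 1.50), bond_event(1.46, 1.50), bond_event(1.54, 1.50))
```

In an event-driven simulation these two limits would be handled as events in the same queue as collisions, rather than checked at fixed time steps.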

    Figure 1. An amino acid residue. The side-chain group is denoted R and represents one of 20 different chemical groups. Broken circles depict atom groups, each of which is represented by a sphere in the model. φ is rotation around the bond between nitrogen and α-carbon atoms. ψ is rotation around the bond between α-carbon and carbonyl carbon atoms.


    are random, although they may not violate any of the assigned bond lengths and angles listed in Table 1. The initial velocities are chosen at random from a Maxwell-Boltzmann distribution at a very high temperature (T* = 0.5). At the start of each simulation, the system is annealed from T* = 0.5 to the desired run temperature to minimize kinetic trapping in local free energy minima. The annealing is complete within approximately five reduced time units, a small fraction of the overall folding time, which is typically 1200 or more reduced time units. When a DMD simulation begins, each bead moves with its individual velocity. The simulation proceeds according to the following schedule: identify the first event, move forward in time until that event occurs, calculate new velocities for the pair of beads involved in the event and calculate any changes in system energy resulting from hydrogen bond events or hydrophobic interactions, find the second event, and so on.
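The schedule above (find the next event, advance to it, update the pair, repeat) can be illustrated for the simplest case of hard-sphere contacts. This is a minimal sketch under stated assumptions, not the authors' implementation: it searches all pairs by brute force, treats every bead as having the same diameter, and uses hypothetical function names. Production codes use neighbor lists and event queues instead.

```python
import math
from itertools import combinations


def collision_time(r1, v1, r2, v2, sigma):
    """Time until two ballistically moving beads reach separation sigma, or None."""
    rx = [a - b for a, b in zip(r1, r2)]          # relative position
    vx = [a - b for a, b in zip(v1, v2)]          # relative velocity
    b = sum(r * v for r, v in zip(rx, vx))        # r . v, negative if approaching
    if b >= 0.0:
        return None                               # beads are not approaching
    vv = sum(v * v for v in vx)
    rr = sum(r * r for r in rx)
    disc = b * b - vv * (rr - sigma * sigma)
    if disc < 0.0:
        return None                               # beads pass without touching
    return (-b - math.sqrt(disc)) / vv


def next_event(positions, velocities, sigma):
    """Return (time, i, j) for the earliest hard-sphere contact, or None."""
    best = None
    for i, j in combinations(range(len(positions)), 2):
        t = collision_time(positions[i], velocities[i],
                           positions[j], velocities[j], sigma)
        if t is not None and (best is None or t < best[0]):
            best = (t, i, j)
    return best


def advance(positions, velocities, dt):
    """Move every bead in a straight line for time dt (no forces between events)."""
    return [[x + v * dt for x, v in zip(p, w)]
            for p, w in zip(positions, velocities)]


# Two beads approaching head-on along x; contact at separation sigma = 1.
pos = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
vel = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
t, i, j = next_event(pos, vel, sigma=1.0)
pos = advance(pos, vel, t)
print(t, i, j, pos)
```

After the advance, the velocities of beads i and j would be updated according to the event type, and the search for the following event would begin.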

    Types of events include excluded volume events, bond events, and square-well hydrogen bond and hydrophobic interaction events. An excluded volume event occurs when the surfaces of two hard-sphere beads collide and repel each other. Bond (or pseudobond) events include a hard-sphere repulsion event which occurs when the bond length is (1 − δ)l and an infinite square-well attraction event which occurs when the bond length is (1 + δ)l. Square-well events include capture, bounce, and dissociation events which occur when the square wells of N and C or the square wells of two H beads touch. Capture events occur when an attraction is felt between two beads, such as the attraction between an N and a C during the formation of a hydrogen bond. In the simulation, the attraction results in an increase in kinetic energy (beads N and C move faster toward each other) and a decrease in potential energy (in accordance with the depth of the N and C square wells). In essence, the capture event causes the beads to become partners. Dissociation events dissolve partnerships and are the opposite of capture events; the beads move away from each other and lose velocity (lowering the kinetic energy of the system) while the system gains potential energy. Bounce events occur between partnered beads that lack enough kinetic energy to dissociate. Both energy and momentum are conserved during all types of events. The event-to-event nature of DMD offers significant computational advantages over standard, continuous-potential molecular dynamics techniques which must proceed through time by taking very small steps.61 For details on DMD simulations with square-well potentials, see papers by Alder & Wainwright58 and Smith et al.61
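The capture, dissociation, and bounce rules can be expressed through the relative velocity along the line joining the two beads. The sketch below follows the standard energy- and momentum-conserving treatment of a square-well boundary; it is an illustration, not the authors' code, and the function names and sign convention (negative normal velocity means approaching) are assumptions made here.

```python
import math


def square_well_event(b, mu, eps, partnered):
    """Resolve a pair reaching the well boundary.

    b         : relative velocity along the line of centers (negative = approaching)
    mu        : reduced mass of the pair, m1*m2/(m1+m2)
    eps       : well depth (positive)
    partnered : True if the pair is currently inside the well
    Returns (event_name, new_b).
    """
    if not partnered:
        # Capture: potential energy drops by eps, so the pair speeds up.
        return "capture", -math.sqrt(b * b + 2.0 * eps / mu)
    if 0.5 * mu * b * b > eps:
        # Dissociation: the pair pays eps out of its kinetic energy and escapes.
        return "dissociation", math.sqrt(b * b - 2.0 * eps / mu)
    # Bounce: not enough kinetic energy to escape, so the normal velocity reverses.
    return "bounce", -b


def update_velocities(v1, v2, m1, m2, rhat, b, new_b):
    """Apply the change in normal relative velocity while conserving momentum.

    rhat is the unit vector pointing from bead 2 to bead 1, and
    b = (v1 - v2) . rhat before the event.
    """
    mu = m1 * m2 / (m1 + m2)
    db = new_b - b
    v1_new = [v + (mu / m1) * db * u for v, u in zip(v1, rhat)]
    v2_new = [v - (mu / m2) * db * u for v, u in zip(v2, rhat)]
    return v1_new, v2_new


# Two unit-mass beads approaching along x with eps = 1 (illustrative values).
event, new_b = square_well_event(b=-1.0, mu=0.5, eps=1.0, partnered=False)
v1, v2 = update_velocities([-0.5, 0.0, 0.0], [0.5, 0.0, 0.0], 1.0, 1.0,
                           [1.0, 0.0, 0.0], -1.0, new_b)
print(event, v1, v2)
```

In this example the total momentum stays zero and the kinetic energy rises by exactly the well depth, matching the description of a capture event.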

    Simulations are performed in the canonical ensemble, which means that the number of particles, volume, and temperature are held constant. Constant number of particles and volume are achieved by creating a virtual three-dimensional box for the simulation and allowing the model protein chains to move within that box. Periodic boundary conditions are used to eliminate artifacts due to simulation box wall effects. With this method, the primary simulation box is replicated infinitely in all dimensions, and chains are allowed to move freely between the boxes. Since each box is an exact replica of all others, when a chain appears to be leaving the primary box, its image simultaneously enters the primary box from the opposite face. The dimensions of the box are chosen to ensure that a chain cannot interact with more than one image of any other chain. For this study, we use a cubic box with sides 100 Å in length. Constant temperature is achieved by implementing the Andersen thermostat method62 as was used previously.28,54 With this procedure, all beads in the simulation are subject to random, infrequent collisions with ghost particles. The post-event velocity of a bead colliding with a ghost particle is chosen randomly from a Maxwell-Boltzmann distribution at the simulation temperature. We have implemented several optimization techniques in this work, including neighbor lists and false positioning, which have been described elsewhere.61
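Periodic boundaries and the ghost-particle thermostat are both short to express. The sketch below assumes the 100 Å cubic box stated in the text; the function names are hypothetical, units are left generic, and the thermostat shows only the velocity reassignment, not the scheduling of ghost collisions.

```python
import math
import random

BOX = 100.0  # cubic box edge in angstroms, as stated in the text


def wrap(position, box=BOX):
    """Map a position back into the primary box (the image enters the opposite face)."""
    return [x % box for x in position]


def minimum_image(displacement, box=BOX):
    """Shortest displacement between two beads, taken over all periodic images."""
    return [d - box * round(d / box) for d in displacement]


def ghost_collision_velocity(mass, kT, rng=random):
    """Andersen-style reassignment: draw a velocity from a Maxwell-Boltzmann
    distribution, i.e. each Cartesian component is Gaussian with variance kT/mass."""
    s = math.sqrt(kT / mass)
    return [rng.gauss(0.0, s) for _ in range(3)]


print(wrap([101.0, -1.0, 50.0]))
print(minimum_image([95.0, -60.0, 10.0]))
print(ghost_collision_velocity(mass=1.0, kT=0.1))
```

The minimum-image displacement is what would be passed to the collision-time calculation, so that beads near opposite faces of the box still interact correctly.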

    Simulations are performed on alpha workstations and range in length from two to four billion events, the length chosen in each case based on the progress of the system. We ran all simulations until the structures of the chains and system properties such as internal energy were constant for at least the final one billion events of the run. Average system properties for each run were calculated based on the system properties in the final 650 million events of each run. We also present raw data for system properties versus time over the course of individual simulations.

    Model peptides

    We perform DMD simulations on eight 16-residue model peptide chains, each designed to form an amphipathic α-helix. The peptides have the following sequence of hydrophobic (H) and polar (P) residues: PPHPPHHPPHPPHHPP. This sequence is derived from a sequence designed by Ho & DeGrado,7 GELEELLKKLKELLKG, where the polar residues glycine (G), glutamic acid (E), and lysine (K) and the hydrophobic residue leucine (L) are arranged to generate the simplest helical subunit of a four-helix bundle protein. For purposes of computer simulations, Guo & Thirumalai46 reduced the Ho & DeGrado sequence to the sequence of Hs and Ps shown above.
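The reduction from the amino acid sequence to the two-letter HP sequence can be checked directly. The sketch below assumes that leucine is the only residue mapped to H, which is the assignment that reproduces the HP string quoted above; the function name is hypothetical.

```python
HYDROPHOBIC = {"L"}  # assumption: only leucine maps to H in this sequence


def to_hp(sequence):
    """Reduce a one-letter amino acid sequence to hydrophobic (H) / polar (P) letters."""
    return "".join("H" if residue in HYDROPHOBIC else "P" for residue in sequence)


peptide = "GELEELLKKLKELLKG"
hp = to_hp(peptide)
print(hp)
print(hp == "PPHPPHHPPHPPHHPP")
```

The resulting pattern places hydrophobic residues at positions 3, 6, 7, 10, 13, and 14, so that they fall on one face of the helix, as expected for an amphipathic α-helix.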

    Results and Discussion

    In this section, we present results from 56 independent simulations on systems of eight 16-residue chains. The lowest energy (native) state for this system is two tetrameric α-helical bundles. When native, each tetrameric α-helical bundle has 48 α-helical hydrogen bonds, 12 intra-chain hydrophobic interactions, and 40 inter-chain hydrophobic interactions.


    Figure 2 shows three pictures of the native state for the model 16-residue chain that serves as the helical subunit for the tetrameric α-helical bundle. Figure 3 shows the native four-helix bundle structure with each chain shown in a different color. The structure has the "twist" characteristic of tetrameric α-helical bundle proteins,7,63–65 with helices lying at angles offset approximately 20° from the bundle axis. Bundles with all possible combinations of parallel and antiparallel helices are observed during our simulations and are isoenergetic in our model. The native state in our eight-peptide system is two α-helical bundles, each of which is like the one shown in Figure 3.

    We describe progression of the system to the native state via a single order parameter, Q, called the nativeness parameter. Q is defined as the sum of Q_HB, which is a function of the number of native α-helical hydrogen bonds that form, and Q_HP, which is a function of the number of chains that align due to hydrophobic contacts. Q_HB and Q_HP are calculated as follows:

    $$Q_{HB} = \frac{1}{4}\left(\frac{\text{no. of } \alpha\text{-helical hydrogen bonds formed}}{48}\right) \qquad (3)$$

    and

    $$Q_{HP} = \frac{1}{4}\left(\frac{\text{no. of aligned pairs of chains}}{6}\right) \qquad (4)$$

    where a pair of chains is "aligned" if the chains are in one of the following two arrangements. In the first arrangement, two chains are aligned if they lie in an antiparallel direction such that they have at least one inter-chain hydrophobic contact and the distances between the N- and C-terminal hydrophobic side-chains on opposite chains are both less than 7 Å. Alternatively, two chains are aligned if they lie in a parallel direction such that they have at least one inter-chain hydrophobic contact and the distances between the two N-terminal hydrophobic side-chains and between the two C-terminal hydrophobic side-chains are both less than 7 Å. With these definitions, the native structure has six aligned pairs of chains, since chains directly next to each other in the native bundle are separated by approximately 4.7 Å and chains diagonally across from each other in the native bundle are separated by approximately 6.6 Å. Defining alignments in bundle proteins in this way has been done elsewhere.66 The normalization constants of 1/4 in each equation were chosen so that a Q value of 1/2 corresponds to the native state for one tetramer, and equal weight is given for correct α-helical hydrogen bonds formed and correct hydrophobic arrangements. If both tetramers fold to the native state, Q = 1. To determine Q for the simulation, we consider, from the eight chains in the system, all possible combinations of two sets of four chains. We calculate Q for each set of four chains in a given combination using equations (3) and (4) and sum the Q for the two sets of four chains. The maximum average Q obtained from all possible combinations of two sets of four chains is used as the value of Q for that simulation.
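The search over all ways of splitting eight chains into two sets of four can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the inputs (a per-chain count of α-helical hydrogen bonds and a set of aligned chain pairs) are hypothetical data structures, the hydrogen bonds of a set are taken to be the sum over its chains, and the function names are invented here.

```python
from itertools import combinations

NATIVE_HBONDS = 48  # alpha-helical hydrogen bonds in one native tetramer (from the text)
NATIVE_PAIRS = 6    # aligned chain pairs in one native tetramer (from the text)


def q_four(chains, hbonds_per_chain, aligned_pairs):
    """Q for one candidate set of four chains: Q_HB + Q_HP, equations (3) and (4)."""
    n_hb = sum(hbonds_per_chain[c] for c in chains)
    n_pairs = sum(1 for pair in combinations(chains, 2)
                  if frozenset(pair) in aligned_pairs)
    q_hb = 0.25 * n_hb / NATIVE_HBONDS
    q_hp = 0.25 * n_pairs / NATIVE_PAIRS
    return q_hb + q_hp


def q_system(hbonds_per_chain, aligned_pairs):
    """Largest summed Q over all splits of the chains into two sets of four."""
    n = len(hbonds_per_chain)
    best = 0.0
    # Keeping chain 0 in the first set avoids counting each split twice (35 splits).
    for rest in combinations(range(1, n), 3):
        set_a = (0,) + rest
        set_b = tuple(c for c in range(n) if c not in set_a)
        total = (q_four(set_a, hbonds_per_chain, aligned_pairs)
                 + q_four(set_b, hbonds_per_chain, aligned_pairs))
        best = max(best, total)
    return best


# Hypothetical native-like input: bundles formed by chains {0,2,4,6} and {1,3,5,7},
# each chain carrying 12 helical hydrogen bonds (48 per tetramer).
hbonds = [12] * 8
aligned = {frozenset(p)
           for bundle in ((0, 2, 4, 6), (1, 3, 5, 7))
           for p in combinations(bundle, 2)}
print(q_system(hbonds, aligned))  # prints 1.0
```

Taking the maximum over splits means the result does not depend on how the chains happen to be numbered, which matters because any four of the eight identical chains may end up in the same bundle.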

    Folding of two tetrameric α-helical bundles (as measured by Q) is a strong function of temperature and is very similar to the functional relationship seen previously for the folding of one tetrameric α-helical bundle in a four-chain system. Variation in Q as a function of reduced temperature is shown in Figure 4 for both the eight-chain system (filled circles) and the four-chain system studied previously12 (open squares). Each system displays a maximum in Q over a range of temperatures which corresponds to an optimal temperature range for folding. In 22 of the 29 eight-chain simulations performed within the reduced temperature range where Q is a maximum (0.088–0.105), the eight-chain system successfully assembles into two tetrameric α-helical bundles and remains within small fluctuations of the native state. At high reduced temperatures (above 0.105) where both hydrogen bonds and hydrophobicity are weak, none of the 14 simulations performed yield stable native bundles. At low reduced temperatures (below 0.088) where both hydrogen bonds and hydrophobicity are very strong, all 20 of the simulations performed result in aggregated structures.

    Figure 2. The 16-residue peptide in an α-helix. Hydrophobic residues are dark gray, polar residues are light gray, and the N-terminal residue is black. The structure on the left is a RasMol77 cartoon rendering of an ideal α-helix (φ = −70°, ψ = −40°) with the HP sequence used in this work. The middle and right structures are drawn with the AVS software package (Advanced Visual Systems, Inc.) and are the native conformation of the model chain simulated here shown with full- and half-size beads, respectively.

    Figure 3. Native structure of the tetrameric α-helical bundle shown with half-size beads (left) and full-size beads (right). Chain backbones are pastel colors, hydrophobic side-chains are bright colors, polar side-chains are light gray, and N-terminal beads are black.

    One of the simplifications in our model is that the strengths of interactions, ε_HB and ε_HP, are independent of temperature, which is not the case for real proteins. Using low-resolution protein models with temperature-dependent hydrophobic interactions, Chan and Dill find that an optimum in folding rate as a function of temperature is primarily due to the temperature dependence of the hydrophobic term, which allows heat denaturation at high temperatures and cold denaturation at low temperatures.67 In our results shown in Figure 4, the optimum in folding yield cannot be ascribed strictly to heat and cold denaturation. The high-temperature behavior is similar to heat denaturation in that the yield of native bundles is low because the effective interactions, ε*_HB = 1/T* and ε*_HP = 1/(6T*), are too weak to persist long enough for complete folding of helices and assembly of bundles. The low-temperature behavior is due to aggregated structures being stabilized by long-lived non-native interactions, not to cold denaturation. In fact, in low-temperature simulations starting from the native state, the native state persists rather than degrading as would be expected during cold denaturation (data not shown). It has been shown previously that low-resolution model proteins tend to become kinetically trapped more often than real proteins,68 and a similar phenomenon may exist for intermediate-resolution models such as the one used here. Given that our model includes temperature-independent forces and may be more susceptible to kinetic traps than real proteins, our conclusion about the existence of an optimal temperature range for folding of real proteins is somewhat speculative. For that reason, we focus on qualitative differences between the one-, four-, and eight-chain systems rather than on the details of a single system.
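To make the temperature dependence of the effective interactions concrete, the sketch below evaluates the reduced hydrogen-bond and hydrophobic strengths at a few reduced temperatures. It assumes the relations 1/T* and 1/(6T*) given in the text; the function name and the sample temperatures are illustrative choices.

```python
def effective_strengths(t_star):
    """Return (eps*_HB, eps*_HP) at reduced temperature t_star.

    Assumes eps*_HB = 1/T* and eps*_HP = 1/(6 T*), as stated in the text.
    """
    if t_star <= 0:
        raise ValueError("reduced temperature must be positive")
    eps_hb = 1.0 / t_star
    eps_hp = 1.0 / (6.0 * t_star)
    return eps_hb, eps_hp


# Sample temperatures: an aggregating run, the eight-chain optimal-range
# boundaries, and the one-chain upper boundary quoted in the text.
for t_star in (0.076, 0.088, 0.105, 0.115):
    hb, hp = effective_strengths(t_star)
    print(f"T* = {t_star:.3f}: eps*_HB = {hb:.2f}, eps*_HP = {hp:.2f}")
```

Both strengths grow as T* falls, which is consistent with long-lived non-native contacts at low reduced temperature and short-lived native contacts at high reduced temperature.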

    Although folding in the one-, four-, and eight-chain systems shows similar trends with reduced temperature, the optimal temperature range for folding shrinks as the number of chains in the system increases. The high-temperature boundary on the optimal temperature folding range is 0.115 for the one-chain system and 0.105 for both the four- and eight-chain systems. The low-temperature boundary on the optimal temperature folding range is 0.065, 0.080, and 0.088 for the one-, four-, and eight-chain systems, respectively. The position of the upper boundary on the optimal temperature range for folding appears to be dictated by the complexity of the protein being folded. The native structure of model proteins in both the four- and the eight-chain systems is a tetrameric α-helical bundle, and the upper edge of the optimal temperature range for folding is constant at T* = 0.105. In contrast, the upper edge of the optimal temperature range for folding of isolated 16-residue peptides that serve as the building blocks for the tetrameric bundle was shown previously to be considerably higher (T* = 0.115, data not shown).12 The position of the lower boundary on the optimal temperature range for folding appears to be a function of the complexity of the protein environment. At low temperatures, misfolding and aggregation out-compete folding, and aberrant misfolding and aggregation become a more significant problem as the number of chains in the system increases. In the eight-chain system, significantly more non-native contacts and unproductive aggregation interactions are possible than in the four- or one-chain systems because of the increased number of chains; this added complexity shifts the lower boundary to a higher temperature.
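The narrowing of the optimal window with system size follows directly from the boundaries quoted above. A minimal sketch, with the dictionary layout and function name as illustrative assumptions:

```python
# (low, high) reduced-temperature bounds of the optimal folding range,
# keyed by the number of chains, as quoted in the text.
WINDOWS = {
    1: (0.065, 0.115),
    4: (0.080, 0.105),
    8: (0.088, 0.105),
}


def window_width(n_chains):
    """Width of the optimal reduced-temperature range for a given system size."""
    low, high = WINDOWS[n_chains]
    return high - low


for n_chains in sorted(WINDOWS):
    low, high = WINDOWS[n_chains]
    print(f"{n_chains}-chain: {low:.3f} to {high:.3f}, width {window_width(n_chains):.3f}")
```

The upper bound changes only between the one-chain and multi-chain systems, while the lower bound rises with every increase in chain number, so the width shrinks from 0.050 to 0.025 to 0.017.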

    Our hypothesis that the optimal temperature range for protein folding is bounded on the high side by the complexity of the protein and on the low side by the complexity of the solution offers an interesting perspective on the experiments that are performed to refold proteins in vitro. Our simulation results suggest that, for a given protein, only the low-temperature boundary on the optimal temperature range for folding can be manipulated experimentally because the high-temperature boundary is fixed by the protein being studied. Reduced temperature in our simulations is a parameter that can equivalently be considered to represent other environmental properties, such as solvent quality or denaturant concentration. High reduced temperature corresponds to good solvent or high denaturant concentration; low reduced temperature corresponds to poor solvent or low denaturant concentration. Experimental protein refolding is known to be optimal under particular conditions of temperature, solvent, and denaturant concentration,1 and we suggest that the boundaries on the optimal ranges observed experimentally are defined at one extreme by the protein being studied and at the other extreme by the chosen protein environment.

    Figure 4. Nativeness parameter, Q, versus reduced temperature for the four- (open squares) and eight-chain (filled circles) simulations.

    Folding to tetrameric α-helical bundles follows many different trajectories in the eight-chain system. However, two main pathways can be defined that summarize the possible folding pathways for a given set of four chains to a tetrameric bundle:

    four monomers → dimer + two monomers → trimer + monomer → tetramer

    four monomers → two dimers → tetramer
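One way to sort a trajectory into these two pathways is to track, for a given set of four chains, the sizes of the oligomers they form over time. The sketch below is a hypothetical classifier, not the analysis procedure used in this work; the input format (a time-ordered list of oligomer-size tuples) and the rule that any trimer seen before the final tetramer marks the first pathway are assumptions made for illustration.

```python
def classify_pathway(history):
    """Classify an assembly history for one set of four chains.

    history: time-ordered sequence of oligomer-size tuples, for example
    [(1, 1, 1, 1), (2, 1, 1), (3, 1), (4,)].
    """
    # Normalize each partition so that (1, 2, 1) and (2, 1, 1) compare equal.
    parts = [tuple(sorted(sizes, reverse=True)) for sizes in history]
    if any(sum(part) != 4 for part in parts):
        raise ValueError("each partition must account for all four chains")
    if not parts or parts[-1] != (4,):
        return "not folded"
    intermediates = parts[:-1]
    if (3, 1) in intermediates:
        return "trimeric intermediate"
    if (2, 2) in intermediates:
        return "two dimers"
    return "unclassified"


print(classify_pathway([(1, 1, 1, 1), (2, 1, 1), (3, 1), (4,)]))
print(classify_pathway([(1, 1, 1, 1), (2, 1, 1), (2, 2), (4,)]))
```

A history that visits both a trimer and a pair of dimers is assigned to the trimeric pathway here, matching the description of the second pathway as folding without a trimeric intermediate.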

    In our previous simulations on four-chain systems, the vast majority (18/22) were shown to fold via the pathway with a trimeric intermediate. In contrast, in the eight-chain simulations presented here, slightly fewer than half of the folding trajectories (20/44) follow this path. The rest of the folding trajectories in the eight-chain simulations (24/44) occur without a trimeric intermediate and fold via the second path shown above. The proportion of folding trajectories that follow each route is significantly different for the four- and eight-peptide systems. This result suggests that protein folding properties, such as dominant folding trajectories, are strongly influenced by the presence of other proteins. We obtained a similar result in our previous multiprotein simulations on low-resolution lattice models.13 These observations suggest that simulations of isolated proteins, as are standard in the computational protein folding field, may not accurately reflect dominant protein folding trajectories in vivo or in concentrated solutions.
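The text does not state which statistical test underlies the comparison of pathway proportions. As an illustration only, the sketch below applies a two-sided Fisher exact test to the quoted counts (18 of 22 four-chain trajectories versus 20 of 44 eight-chain trajectories through a trimeric intermediate). The choice of test and the function name are assumptions; the test also treats the 44 eight-chain bundle trajectories as independent, although they come in pairs from shared simulations.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(x_min, x_max + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: four-chain and eight-chain systems.
# Columns: trimeric-intermediate pathway, two-dimer pathway.
p_value = fisher_exact_two_sided(18, 4, 20, 24)
print(f"two-sided Fisher exact p = {p_value:.4f}")
```

Under these assumptions the p-value falls below 0.05, in line with the statement that the two proportions differ significantly.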

    Interprotein interactions have been shown experimentally to affect the kinetics of protein folding. Using theoretical and experimental approaches, Oliveberg demonstrated that true two-state kinetics for monomeric proteins may be masked by transient aggregation at high (≈5 μM) protein concentrations and appear as multistate kinetics.69 The effective peptide concentrations in our systems are comparatively high (≈50 μM or 60 mg/l in the four-peptide system and ≈100 μM or 120 mg/l in the eight-peptide system), well into the concentration regime where transient aggregates are suspected to exist experimentally. However, our model native structure is oligomeric, and therefore aggregation itself is a required part of the folding and assembly process. Further simulation studies at a range of effective peptide concentrations and, preferably, on a monomeric model peptide will be required to determine to what extent the kinetic intermediates observed during our simulations stem from concentration differences and whether these intermediates can be compared to those predicted by theoretical and experimental studies.
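The two ways of quoting each effective concentration can be cross-checked by computing the molar mass they jointly imply. The sketch below assumes the molar concentrations are micromolar (50 and 100) and the mass concentrations are 60 and 120 mg/l; the function name is an illustrative choice.

```python
def implied_molar_mass(conc_micromolar, conc_mg_per_l):
    """Molar mass (g/mol) implied by a micromolar and a mg/l concentration."""
    mol_per_l = conc_micromolar * 1e-6
    g_per_l = conc_mg_per_l * 1e-3
    return g_per_l / mol_per_l


for label, micromolar, mg_per_l in (("four-peptide", 50, 60), ("eight-peptide", 100, 120)):
    mass = implied_molar_mass(micromolar, mg_per_l)
    print(f"{label}: about {mass:.0f} g/mol per chain")
```

Both pairs of values give the same molar mass, about 1200 g/mol, which is a plausible order of magnitude for a 16-residue peptide. Millimolar concentrations would imply a molar mass near 1 g/mol, which is not physical.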

    Snapshots of an eight-chain simulation at T* = 0.10 (within the optimal temperature range for folding) in which one tetramer folds via a pathway with a trimeric intermediate and the other folds via a pathway without a trimeric intermediate are shown in Figure 5. The first snapshot (t* = 0) shows the random initial configuration of the system. The individual chains are shown with different colors (red, green, blue, yellow, orange, purple, magenta, and turquoise). At t* = 114.5, the chains have paired into four separate dimers via hydrophobic interactions (blue-yellow, red-green, purple-turquoise, and magenta-orange). (In this and in all subsequent snapshots, the monomeric chains or multi-chain complexes are shown side-by-side, rather than in their true relative positions in space.) The chains vary in the number of α-helical hydrogen bonds formed from 0 to 12. By t* = 142.6, three of the four dimers have formed an aggregate, while the fourth dimer (magenta-orange) remains free. The six-chain aggregate quickly separates into a trimer (green-turquoise-purple), a dimer (blue-yellow), and a monomer (red), as shown in the snapshots at t* = 194.5. As the simulation proceeds, the green-turquoise-purple trimer remains isolated and the chains in this trimer slowly maneuver into a native-like alignment that offers maximal hydrophobic contacts. In the meantime, the other five chains undergo significant changes. The blue-yellow dimer breaks apart, and then the red and yellow chains form a dimer. Later, the red-yellow and magenta-orange dimers come together to form a tetramer. By t* = 308.2, the blue chain, in a β-sheet structure, has joined the tetramer but has few connections with it and breaks away quickly. By t* = 822.5, the alignments within the red-orange-yellow-magenta tetramer have begun to resemble the native state, and the blue chain has lost its β-structure and begins to establish α-helical hydrogen bonds. At t* = 1210, the fully helical blue chain associates with the green-turquoise-purple trimer. Over the next 35 time units, the blue chain aligns with the other chains in its tetramer, and the native structure is observed in both tetramers at t* = 1260.

    Figures 6 and 7 offer a more detailed description

    of the folding trajectory shown in Figure 5 in which the red-orange-yellow-magenta set of chains and the green-blue-purple-turquoise set of chains assemble into independent tetrameric α-helical bundles. The top four panels in Figures 6 (red-orange-yellow-magenta bundle formation) and 7 (green-blue-purple-turquoise bundle formation) show the number of α-helical hydrogen bonds formed versus reduced time for each of the four chains in the set, and the middle six panels show the number of inter-chain hydrophobic contacts formed versus reduced time for each pair of chains in the set. Figure 6 shows that the red-orange-yellow-magenta tetramer forms via association of two dimeric intermediates. Early in the simulation (t* = 0-250), the red-orange-yellow-magenta set of chains (Figure 6) forms two dimers, as can be seen by the large number of hydrophobic contacts during this time in the second (red and yellow) and fifth (orange and magenta) hydrophobic contact panels. Despite these inter-chain hydrophobic interactions, each chain successfully adopts a long, α-helical structure during the time period from t* = 0 to t* = 250. At approximately t* = 580, the red-yellow and orange-magenta dimers associate, as can be seen by the prevalence of hydrophobic contacts between all pairs in Figure 6. Over approximately the next 680 time units, the red, yellow, orange, and magenta chains reorient to adopt the native α-helical bundle structure as shown in Figure 5 at t* = 1260. Figure 7 shows that the green-blue-purple-turquoise tetramer forms via a trimeric intermediate. The purple and turquoise chains are the first to form long-lasting hydrophobic contacts (at approximately t* = 75). By t* = 150, the green chain has joined the purple-turquoise dimer and each of the three chains in the resulting trimer has substantial α-helical character. The blue chain, however, makes very few hydrophobic contacts with the green, purple, and turquoise chains for the first 1200 time units of the simulation. The blue chain also spends the first 850 time units of the simulation in non-helical structures, as can be seen in Figure 5 at reduced times through 822.5. At t* = 1210, the blue chain finally associates with the green-purple-turquoise trimer; and at t* = 1260, the native tetrameric α-helical bundle is achieved.

    Non-native hydrogen bonds are common in our

    simulations, and multiple non-native hydrogen bonds can stabilize β-structures, such as the β-sheet exhibited by the blue chain at t* = 308.2 in Figure 5. Structures with non-native hydrogen bonds are observed in all of our eight-chain simulations, and β-structures (either β-turns, β-hairpins, or β-sheets) are observed in 73.0 % of the simulations. Table 2 shows the number of simulations in which non-native and β hydrogen bonds (the hydrogen bonds responsible for a β-structure) are observed for simulations that result in either folding to the native state or trapping in a misfolded or aggregated structure for the one- and four-chain systems studied previously and for the eight-chain system studied in this work. (Simulations resulting in misfolded/aggregated structures include all runs within the optimal temperature range for folding that do not yield the native state and all runs performed below the optimal temperature range for folding.) Non-native hydrogen bonds form in nearly all simulations regardless of the number of chains in the system and of the simulation outcome. Of the 54 single-chain simulations, only four follow trajectories that do not experience non-native hydrogen bonds. In all 37 four-chain simulations and in all 49 eight-chain simulations, non-native hydrogen bonds are observed. In all three systems, β-structure is more common in simulations that result in misfolds or aggregates than in simulations that result in folding to the native state. In the one-chain system, all trajectories ending in misfolded structures involved β hydrogen bonds, compared with only 43.5 % of trajectories that ended in the native state. In the multi-chain systems, slightly more misfolding and aggregation trajectories than folding trajectories (80.0 % compared to 72.7 % for the four-chain system and 92.6 % compared to 77.3 % for the eight-chain system) involved β hydrogen bonds. This trend suggests that the presence of β-structures hinders folding, as is expected for our particular system in which α-helical structures are required for folding. Extended, β-strand and β-sheet structures have been shown to be prevalent in the ordered, fibrillar aggregates common to amyloid diseases.2,70,71 Though we observe β-structures in our simulations, the amorphous aggregates that form contain little or no β-sheet content. In our system, β-structures represent relatively deep energetic traps that retard progression toward native, α-helical structures.

    Figure 5. Snapshots of the conformations of each chain or complex of chains in an eight-chain simulation that results in formation of two tetrameric α-helical bundles.

    Figure 6. Number of α-helical hydrogen bonds formed and number of inter-chain hydrophobic contacts formed for the red-orange-yellow-magenta tetramer during the folding trajectory of the two tetrameric α-helical bundles depicted in Figure 5.

    Figure 7. Number of α-helical hydrogen bonds formed and number of inter-chain hydrophobic contacts formed for the green-blue-purple-turquoise tetramer during the folding trajectory of the two tetrameric α-helical bundles depicted in Figure 5.
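The percentages reported in Table 2 can be recomputed from the raw simulation counts. The sketch below does so; the tuple layout and the helper name `pct` are illustrative assumptions, while the counts are those listed in Table 2.

```python
# Rows of Table 2: (chains, outcome, simulations,
#                   simulations with non-native H bonds, simulations with beta H bonds)
TABLE2 = [
    (1, "native", 46, 42, 20),
    (1, "misfolded", 8, 8, 8),
    (4, "native", 22, 22, 16),
    (4, "misfolded", 15, 15, 12),
    (8, "native", 22, 22, 17),
    (8, "misfolded/aggregated", 27, 27, 25),
]


def pct(count, total):
    """Percentage of simulations in a category, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


for chains, outcome, total, non_native, beta in TABLE2:
    print(f"{chains}-chain {outcome:22s} "
          f"non-native {pct(non_native, total):5.1f} %  beta {pct(beta, total):5.1f} %")
```

In every system the β percentage is higher for the misfolded or aggregated outcome than for the native outcome, which is the trend discussed in the text.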

    The structures of the individual chains involved in aggregates can offer clues as to the points along the folding trajectories that are most susceptible to detrimental aggregation events. In our simulations, we observe that aggregated structures, such as the one shown in Figure 8 for a simulation at T* = 0.076, often possess substantial native character. Table 3 describes the amount of native character found in misfolded and aggregated structures for the four-chain system studied previously and the eight-chain system studied in this work. In both the four- and eight-chain systems, the misfolds and aggregates have a large number of α-helical (native) hydrogen bonds, representing more than 70 % of the α-helical hydrogen bonds that would be present in the native state. The ratio of non-native to native hydrogen bonds is very low in both systems, which indicates that aggregated structures are not solely the product of excessive non-native hydrogen bonds. In fact, the ratio of non-native to native hydrogen bonds is nearly identical in the four- and eight-chain systems (0.18 compared with 0.20), suggesting that hydrogen bonding is unaffected by the number of chains in the system. On average, 2.76 pairs of chains are aligned in the aggregates in the four-chain system (nearly one-half of the six native alignments), which is a considerably higher fraction than the 4.18 pairs of chains that are aligned in the aggregates in the eight-chain system (about one-third of the 12 native alignments). Unlike hydrogen bonding, alignment of chains appears to be affected by the system size. The most striking difference between the four- and eight-chain systems is that, while aggregate structures in the four-chain systems have only 2.39 times as many non-native hydrophobic interactions as native hydrophobic interactions, the aggregate structures in the eight-chain systems have 4.12 times as many non-native hydrophobic interactions as native hydrophobic interactions. Hydrophobic interactions contribute to aggregate stability in both the four- and eight-chain systems. The larger the system, the larger the role non-native hydrophobic interactions play in stabilizing aggregated structures.

    Table 2. Occurrence of non-native and βᵃ hydrogen bonds during one-, four-, and eight-chain simulations within and below the optimal temperature range for folding

    No. chains  Simulation outcome  No. simulations  No. simulations in which non-native  No. simulations in which β
                                                     hydrogen bonds were observed         hydrogen bonds were observed
    1           Native              46               42 (91.3 %ᵇ)                         20 (43.5 %)
    1           Misfolded           8                8 (100 %)                            8 (100 %)
    4           Native              22               22 (100 %)                           16 (72.7 %)
    4           Misfolded           15               15 (100 %)                           12 (80.0 %)
    8           Native              22               22 (100 %)                           17 (77.3 %)
    8           Misfold/aggreg      27               27 (100 %)                           25 (92.6 %)

    ᵃ We define β hydrogen bonds as those bonds in stretches of three or more consecutive hydrogen bonds that contribute to a β-turn, β-hairpin, or β-sheet structure.

    ᵇ Percentages in parentheses are relative to the total number of simulations in the category.

    Figure 8. An eight-chain aggregate in a simulation at T* = 0.076.

    Table 3. Native and non-native characteristics of aggregated structures

                                                                                  Four-chain simulations  Eight-chain simulations
    Average number of native hydrogen bonds                                       34.6 (72.1 %ᵃ)          67.5 (70.3 %)
    Average ratio of non-native to native hydrogen bonds                          0.18                    0.20
    Average number of aligned pairs of chains                                     2.76 (46.0 %)           4.18 (34.8 %)
    Average ratio of inter-chain non-native to native hydrophobic interactions    2.39                    4.12

    ᵃ Percentages in parentheses are relative to the number in the native structure.
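The percentages in Table 3 are taken relative to the native structure. The sketch below recomputes them under two assumptions: each chain forms at most 12 α-helical hydrogen bonds (the upper limit quoted in the description of Figure 5), and each tetrameric bundle contains one aligned pair for every pair of its four chains. Function and constant names are illustrative.

```python
from math import comb

HBONDS_PER_CHAIN = 12   # assumed maximum of alpha-helical H bonds per 16-residue chain
CHAINS_PER_BUNDLE = 4


def native_totals(n_chains):
    """Native H-bond count and aligned-pair count for n_chains forming tetramers."""
    bundles = n_chains // CHAINS_PER_BUNDLE
    hbonds = n_chains * HBONDS_PER_CHAIN
    aligned_pairs = bundles * comb(CHAINS_PER_BUNDLE, 2)
    return hbonds, aligned_pairs


def pct_of_native(value, native):
    """Aggregate-average value as a percentage of its native-state count."""
    return round(100.0 * value / native, 1)


# Aggregate averages from Table 3: (native H bonds, aligned pairs of chains).
AGGREGATES = {4: (34.6, 2.76), 8: (67.5, 4.18)}

for n_chains, (hbonds, pairs) in AGGREGATES.items():
    native_hbonds, native_pairs = native_totals(n_chains)
    print(f"{n_chains}-chain: H bonds {pct_of_native(hbonds, native_hbonds)} %, "
          f"aligned pairs {pct_of_native(pairs, native_pairs)} %")
```

With these assumptions the recomputed values match the tabulated percentages: the retained fraction of native hydrogen bonds is nearly the same in both systems, while the fraction of aligned pairs drops in the larger system.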

    As with the one- and four-chain simulations performed previously, eight-chain simulations that result in correct assembly to the native state in this work are very efficient. Folding transitions require as few as two days on a single-processor 500 MHz workstation.

    Conclusions

    In eight-chain simulations, where each chain is designed to form an identical amphipathic α-helix, the model peptides successfully assemble into two tetrameric α-helical bundles when simulations are performed at intermediate values of reduced temperature. Despite the simplifications in our model, which include neglecting details of side-chain structure and implementing only steric, hydrogen bonding, and hydrophobic forces, the structural characteristics of the resulting bundle are consistent with experimental characterization of Ho & DeGrado's original de novo designed amphipathic α-helix sequence7-11 and with previous simulation results for this system.12,41,46 This agreement is encouraging and suggests that these and further simulations with this model may provide reasonable estimates of real peptide behavior. However, it is possible that the simplifications in the model affect the simulation results. For example, we monitor hydrophobic interactions between side-chains, but other side-chain interactions, such as hydrogen bonding and salt links, are likely to impact intermediate structures and aggregate stability. It is also possible that incorporating temperature-dependent hydrophobic interactions and hydrogen bonds will affect the system behavior. Further simulation studies are necessary to fully assess the robustness of the results.

    Folding of the α-helical bundle follows many different trajectories. However, two main pathways can be defined, one through a trimeric intermediate and the other involving the association of two dimers. Interestingly, the proportion of folding pathways that follow each route is significantly different for the eight-peptide system than for the previously studied four-peptide system. While the eight-chain simulations folded nearly equally via the two pathways, the four-chain simulations heavily favored the pathway with a trimeric intermediate. The different folding tendencies of the two systems suggest that protein folding properties, such as dominant pathways, are strongly influenced by the presence of other proteins; and simulations of isolated proteins, as is standard practice in the computational folding field, should be analyzed with this caveat in mind.

    The optimal temperature range for folding is different for each of the systems we have studied. From comparisons between one-, four-, and eight-peptide systems, it appears that the high-temperature boundary of the optimal temperature range is a function of the complexity of the protein (or oligomer) to be folded, while the low-temperature boundary is a function of the complexity of the protein's environment and the competition between protein folding and aggregation. Therefore, when experimental refolding of a particular protein is difficult, efforts to expand the optimal temperature range should focus on pushing the low-temperature boundary lower since the high-temperature boundary may be fixed.

    In simulations on eight-peptide systems below their optimal temperature ranges for folding, aggregation out-competes folding, as we saw previously in simulations on four-peptide systems. In general, aggregates in both the eight- and four-peptide systems have substantial levels of native secondary structure and appear to be stabilized by a significant number of non-native hydrophobic contacts. This observation is in agreement with previous experimental72-76 and simulation13,50 studies that suggest aggregates are composed largely of partially folded intermediates, as opposed to completely denatured chains. All aggregates observed in our simulations are amorphous, analogous to experimentally observed inclusion body aggregates, with each peptide chain in each amorphous aggregate adopting a unique partially folded or random coil configuration. We do not observe fibrillar aggregates with long-range order like those formed by β-amyloid proteins in Alzheimer's disease2 and by prion proteins in Creutzfeldt-Jakob disease.52

    In the eight-peptide system, a wide array of structures is actively sampled, including non-native compact structures and β-sheet conformations. However, the power of the DMD simulation algorithm, along with the simplicity of our intermediate-resolution protein model, enables observation of complete folding trajectories to two tetrameric α-helical bundles within two days on a 500 MHz single-processor workstation.


    Acknowledgments

    This work was supported by the GAANN Biotechnology Fellowship program of the U.S. Department of Education. Funding was also provided by the National Institutes of Health under grant number GM-56766 and the National Science Foundation under grant number CTS-9704044.

    References

    1. Jaenicke, R. & Seckler, R. (1997). Protein mis-assembly in vitro. Advan. Protein Chem. 50, 1-59.

    2. Selkoe, D. J. (1999). Translating cell biology into therapeutic advances in Alzheimer's disease. Nature, 399, A23-A31.

    3. Cohen, F. E. & Prusiner, S. B. (1998). Pathologic conformations of prion proteins. Annu. Rev. Biochem. 67, 793-819.

    4. Manning, M., Patel, K. & Borchardt, R. (1989). Stability of protein pharmaceuticals. Pharm. Res. 6, 903-918.

    5. Costantino, H. R., Langer, R. & Klibanov, A. M. (1995). Aggregation of lyophilized pharmaceutical protein, recombinant human albumin: effect of moisture and stabilization by excipients. Biotechnol. 13, 493-496.

    6. King, J. & Betts, S. (1999). A green light for protein folding. Nature Biotech. 17, 637-638.

    7. Ho, S. P. & DeGrado, W. F. (1987). Design of a 4-helix bundle protein: synthesis of peptides which self-associate into a helical protein. J. Am. Chem. Soc. 109, 6751-6758.

    8. Regan, L. & DeGrado, W. F. (1988). Characterization of a helical protein designed from first principles. Science, 241, 976-978.

    9. Hill, C. P., Anderson, D. H., Wesson, L., DeGrado, W. F. & Eisenberg, D. (1990). Crystal structure of α1: implications for protein design. Science, 249, 543-546.

    10. Betz, S. F., Bryson, J. W. & DeGrado, W. F. (1995). Native-like and structurally characterized designed α-helical bundles. Curr. Opin. Struc. Biol. 5, 457-463.

    11. Raleigh, D. P., Betz, S. F. & DeGrado, W. F. (1995). A de novo designed protein mimics the native state of natural proteins. J. Am. Chem. Soc. 117, 7558-7559.

    12. Smith, A. V. & Hall, C. K. (2001). Assembly of a tetrameric α-helical bundle: computer simulations on an intermediate-resolution protein model. Proteins: Struct. Funct. Genet. 44, 376-391.

    13. Gupta, P., Hall, C. K. & Voegler, A. C. (1998). Effect of denaturant and protein concentrations upon protein refolding and aggregation: a simple lattice model. Protein Sci. 7, 2642-2652.

    14. Weiner, P. K. & Kollman, P. A. (1981). AMBER - assisted model-building with energy refinement - a general program for modeling molecules and their interactions. J. Comput. Chem. 2, 287-303.

    15. Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S. & Karplus, M. (1983). CHARMM - a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4, 187-217.

    16. Levitt, M., Hirschberg, M., Sharon, R. & Daggett, V. (1995). Potential energy function and parameters for simulations of the molecular dynamics of proteins and nucleic acids in solution. Comput. Phys. Commun. 91, 215-231.

    17. Caflisch, A. & Karplus, M. (1995). Acid and thermal denaturation of barnase investigated by molecular dynamics simulations. J. Mol. Biol. 252, 672-708.

    18. Lazaridis, T. & Karplus, M. (1998). "New view" of protein folding reconciled with the old through multiple unfolding simulations. Science, 278, 1928-1931.

    19. Kazmirski, S. L. & Daggett, V. (1999). Analysis methods for comparison of multiple molecular dynamics trajectories: applications to protein unfolding pathways and denatured ensembles. J. Mol. Biol. 290, 283-304.

    20. Kazmirski, S. L. & Daggett, V. (1998). Simulations of the structural and dynamical properties of denatured proteins: the "molten coil" state of bovine pancreatic trypsin inhibitor. J. Mol. Biol. 277, 487-506.

    21. Wong, K. B., Clarke, J., Bond, C. J., Neira, J. L., Freund, S. M. V., Fersht, A. R. & Daggett, V. (2000). Towards a complete description of the structural and dynamic properties of the denatured state of barnase and the role of residual structure in folding. J. Mol. Biol. 296, 1257-1282.

    22. Elcock, A. H., Gabdoulline, R. R., Wade, R. C. & McCammon, J. A. (1999). Computer simulation of protein-protein association kinetics: acetylcholinesterase-fasciculin. J. Mol. Biol. 291, 149-162.

    23. Lau, K. & Dill, K. (1989). A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules, 22, 3986-3997.

    24. Shakhnovich, E. & Gutin, A. (1993). Engineering of stable and fast-folding sequences of model proteins. Proc. Natl Acad. Sci. USA, 90, 7195-7199.

    25. Socci, N. & Onuchic, J. (1994). Folding kinetics of protein-like heteropolymers. J. Chem. Phys. 101, 1519-1528.

    26. Doniach, S., Garel, T. & Orland, H. (1996). Phase diagram of a semiflexible polymer chain in a θ solvent: application to protein folding. J. Chem. Phys. 105, 1601-1607.

    27. Zhou, Y., Hall, C. K. & Karplus, M. (1996). First-order disorder-to-order transition in an isolated homopolymer model. Phys. Rev. Letters, 77, 2822-2825.

    28. Zhou, Y., Karplus, M., Wichert, J. M. & Hall, C. K. (1997). Equilibrium thermodynamics of homopolymers and clusters: molecular dynamics and Monte Carlo simulations of systems with square-well interactions. J. Chem. Phys. 107, 10691-10708.

    29. Hu, W. (1998). Structural transformation in the collapse transition of the single flexible homopolymer model. J. Chem. Phys. 109, 3686-3690.

    30. Wu, C. & Wang, X. (1998). Globule-to-coil transition of a single homopolymer chain in solution. Phys. Rev. Letters, 80, 4092-4094.

    31. Iori, G., Marinari, E., Parisi, G. & Struglia, M. V. (1992). Statistical mechanics of heteropolymer folding. Physica A, 185, 98-103.

    32. Camacho, C. J. & Thirumalai, D. (1993). Kinetics and thermodynamics of folding in model proteins. Proc. Natl Acad. Sci. USA, 90, 6369-6372.

    33. Bratko, D., Chakraborty, A. K. & Shakhnovich, E. I. (1997). The structure of a random heteropolymer in a disordered medium: ensemble growth simulation. J. Chem. Phys. 106, 1264-1278.

    34. Irbäck, A., Peterson, C., Potthast, F. & Sommelius, O. (1997). Local interactions and protein folding: a three-dimensional off-lattice approach. J. Chem. Phys. 107, 273-282.

    35. Zhdanov, V. P. & Kasemo, B. (1997). Monte Carlosimulation of protein folding with orientation-dependent monomer-monomer interactions. Proteins:Struct. Funct. Genet. 29, 508-516.

    36. Nymeyer, H., Garcia, A. E. & Onuchic, J. N. (1998).Folding funnels and frustraction in off-lattice minim-alist protein landscapes. Proc. Natl Acad. Sci. USA,95, 5921-5928.

    37. Dinner, A. R. & Karplus, M. J. (1999). The thermo-dynamics and kinetics of protein folding: a latticemodel analysis of multiple pathways with inter-mediates. Phys. Chem. B, 103, 7976-7994.

    38. Kolinski, A. & Skolnick, J. (1992). Discretized modelof proteins. I. Monte Carlo study of cooperativity inhomopolypeptides. J. Chem. Phys. 97, 9412-9426.

    39. Kolinski, A. & Skolnick, J. (1994). Monte Carlosimulations of protein folding. I. lattice model andinteraction scheme. Proteins: Struct. Funct. Genet. 18,338-352.

    40. Kolinski, A. & Skolnick, J. (1994). Monte Carlo simu-lations of protein folding. I. Application to ProteinA, ROP, and crambin. Proteins: Struct. Funct. Genet.18, 353-366.

    41. Sikorski, A., Kolinski, A. & Skolnick, J. (1998).Computer simulations of de novo designed helicalproteins. Biophys. J. 75, 92-105.

    42. Sikorski, A., Kolinski, A. & Skolnick, J. (2000). Com-puter simulations of the properties of the a2, a2C,and a2D de novo designed helical proteins. Proteins:Struct. Funct. Genet. 38, 17-28.

    43. Wallqvist, A. & Ullner, M. (1994). A simpliedamino acid potential for use in structure predictionsof proteins. Proteins: Struct. Funct. Genet. 18, 267-280.

    44. Sun, S. (1993). Reduced representation model of pro-tein structure prediction: statistical potential andgenetic algorithms. Protein Sci. 2, 762-785.

    45. Takada, S., Luthey-Schulten, Z. & Wolynes, P. G.(1999). Folding dynamics with nonadditive forces: asimulation study of a designed helical protein and a

    random heteropolymer. J. Chem. Phys. 110, 11616-11629.

    46. Guo, Z. & Thirumalai, D. (1996). Kinetics andthermodynamics of folding of a de novo designedfour-helix bundle protein. J. Mol. Biol. 263, 323-343.

    47. Patro, S. Y. & Przybycien, T. M. (1994). Simulationsof kinetically irreversible protein aggregate struc-ture. Biophys. J. 66, 1274-1289.

    48. Patro, S. Y. & Przybycien, T. M. (1996). Simulations of reversible protein aggregate and crystal structure. Biophys. J. 70, 2888-2902.

    49. Istrail, S., Schwartz, R. & King, J. (1999). Lattice simulations of aggregation funnels for protein folding. J. Comput. Biol. 6, 143-162.

    50. Broglia, R. A., Tiana, G., Pasquali, S., Roman, H. E. & Vigezzi, E. (1998). Folding and aggregation of designed proteins. Proc. Natl Acad. Sci. USA, 95, 12930-12933.

    51. Giugliarelli, G., Micheletti, C., Banavar, J. R. & Maritan, A. (2000). Compactness, aggregation, and prionlike behavior of protein: a lattice model study. J. Chem. Phys. 113, 5072-5077.

    52. Harrison, P. M., Chan, H. S., Prusiner, S. B. & Cohen, F. E. (1999). Thermodynamics of model prions and its implications for the problem of prion protein folding. J. Mol. Biol. 286, 593-606.

    53. Smith, A. V. & Hall, C. K. (2000). Bridging the gap between homopolymer and protein models: a discontinuous molecular dynamics study. J. Chem. Phys. 113, 9331-9342.

    54. Smith, A. V. & Hall, C. K. (2001). α-Helix formation: discontinuous molecular dynamics on an intermediate-resolution protein model. Proteins: Struct. Funct. Genet. 44, 344-360.

    55. Voet, D. & Voet, J. G. (1990). Biochemistry, John Wiley & Sons, New York, NY.

    56. Baker, E. N. & Hubbard, R. E. (1984). Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44, 97-179.

    57. Klimov, D. K., Betancourt, M. R. & Thirumalai, D. (1998). Virtual atom representation of hydrogen bonds in minimal off-lattice models of α helices: effect on stability, cooperativity and kinetics. Folding Des. 3, 481-496.

    58. Alder, B. J. & Wainwright, T. E. (1959). Studies in molecular dynamics. I. General method. J. Chem. Phys. 31, 459-466.

    59. Rapaport, D. C. (1978). Molecular dynamics simulation of polymer chains with excluded volume. J. Phys. A: Math. Gen. 11, L213-L217.

    60. Bellemans, A., Orban, J. & Belle, D. V. (1980). Molecular dynamics of rigid and non-rigid necklaces of hard discs. Mol. Phys. 39, 781-782.

    61. Smith, S. W., Hall, C. K. & Freeman, B. D. (1997). Molecular dynamics for polymeric fluids using discontinuous potentials. J. Comp. Phys. 134, 16-30.

    62. Andersen, H. C. (1980). Molecular dynamics simulations at constant temperature and/or pressure. J. Chem. Phys. 72, 2384-2393.

    63. Crick, F. H. C. (1953). The packing of α-helices: simple coiled-coils. Acta. Crystallog. 6, 689-697.

    64. Kamtekar, S. & Hecht, M. H. (1995). The four-helix bundle: what determines a fold? FASEB J. 9, 1013-1022.

    65. Zhong, Q., Jiang, Q., Moore, P. B., Newns, D. M. & Klein, M. L. (1998). Molecular dynamics simulation of a synthetic ion channel. Biophys. J. 74, 3-10.

    66. Rojnuckarin, A., Kim, S. & Subramaniam, S. (1998). Brownian dynamics simulations of protein folding: access to milliseconds time scale and beyond. Proc. Natl Acad. Sci. USA, 95, 4288-4292.

    67. Chan, H. S. & Dill, K. A. (1998). Protein folding in the landscape perspective: chevron plots and non-Arrhenius kinetics. Proteins: Struct. Funct. Genet. 30, 2-33.

    68. Chan, H. S. (1998). Matching speed and locality. Nature, 392, 761-763.

    69. Oliveberg, M. (1998). Alternative explanations for multistate kinetics in protein folding: transient aggregation and changing transition-state ensembles. Acc. Chem. Res. 31, 765-772.

    70. Kirschner, D. A., Abraham, C. & Selkoe, D. J. (1986). X-ray diffraction from intraneuronal paired helical filaments and extraneuronal amyloid fibers in Alzheimer disease indicates cross-β conformation. Proc. Natl Acad. Sci. USA, 83, 503-507.

    71. Kelly, J. W. (1998). The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr. Opin. Struct. Biol. 8, 101-106.

    72. Oberg, K., Chrunyk, B. A., Wetzel, R. & Fink, A. L. (1994). Nativelike secondary structure in interleukin-1-beta inclusion bodies by attenuated total reflectance FT-IR. Biochemistry, 33, 2628-2634.

    73. Speed, M. A., Wang, D. I. & King, J. (1995). Multimeric intermediates in the pathway to the aggregated inclusion body state for P22 tailspike polypeptide chains. Protein Sci. 4, 900-908.

    74. Wetzel, R. (1996). For protein misassembly, it's the "I" decade. Cell, 86, 699-702.

    75. King, J., Haase-Pettingell, C., Robinson, A. S., Speed, M. & Mitraki, A. (1996). Thermolabile folding intermediates: inclusion body precursors and chaperonin substrates. FASEB J. 10, 57-66.

    76. Fink, A. L. (1998). Protein aggregation: folding aggregates, inclusion bodies and amyloid. Folding Des. 3, R9-R23.

    77. Sayle, R. & Milner-White, E. J. (1995). RasMol: biomolecular graphics for all. Trends Biochem. Sci. 20, 333-379.

    Edited by F. Cohen

    (Received 9 November 2001; received in revised form 14 May 2001; accepted 14 May 2001)
