13
REVIEW Recent developments in structural proteomics for protein structure determination Hsuan-Liang Liu 1 and Jyh-Ping Hsu 2 1 Department of Chemical Engineering and Graduate Institute of Biotechnology, National Taipei Universityof Technology, Taipei, Taiwan 2 Departmentof Chemical Engineering, National Taiwan University, Taipei, Taiwan The major challenges in structural proteomics include identifying all the proteins on the ge- nome-wide scale, determining their structure-function relationships, and outlining the precise three-dimensional structures of the proteins. Protein structures are typically determined by experimental approaches such as X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. However, the knowledge of three-dimensional space by these techniques is still limited. Thus, computational methods such as comparative and de novo approaches and molec- ular dynamic simulations are intensively used as alternative tools to predict the three-dimen- sional structures and dynamic behavior of proteins. This review summarizes recent develop- ments in structural proteomics for protein structure determination; including instrumental methods such as X-ray crystallography and NMR spectroscopy, and computational methods such as comparative and de novo structure prediction and molecular dynamics simulations. Received: April 7, 2004 Revised: August 26, 2004 Accepted: October 28, 2004 Keywords: Molecular dynamics simulations / Nuclear magnetic resonance spectroscopy / Review / Structural proteomics / Structure-function relationship / X-ray crystallography 2056 Proteomics 2005, 5, 2056–2068 Contents 1 Introduction.......................... 2056 2 Instrumental methods for structure determinations ....................... 2057 2.1 X-ray crystallography ................. 2057 2.2 NMR spectroscopy .................... 2059 3 Computational methods for structure prediction ........................... 2060 3.1 Comparative and de novo structure prediction ........................... 2060 3.2 Molecular dynamics solutions .......... 2063 4 Conclusions.......................... 2066 5 References........................... 2066 1 Introduction The term “proteome” was coined in 1994 by Marc R. Wilkins, vice president and head of bioinformatics at Proteome Sys- tems in Sydney, Australia, to mean the protein complement encoded by a genome. Since then, the activities in proteom- ics can be broken down into three main categories: (1) iden- tifying all the proteins made in a given cell, tissue, or organ- ism; (2) determining how these proteins join forces to form networks akin to electrical circuits; and (3) outlining the precise three-dimensional (3-D) structures of the proteins in an effort to find their Achilles’ heels, i.e., where drugs might turn their activity on or off. Structural proteomics, the deter- mination and prediction of atomic resolution 3-D structures of proteins on a genome-wide scale for better understanding their structure–function relationships, has now provided a new rationale for structural biology and has become a major initiative in biotechnology [1]. It usually involves research in biochemistry, bioinformatics, molecular biology, instru- mental methods such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, and computational approaches. The activities involved in protein structure determination can be summarized in Fig. 1. The large num- Correspondence: Professor Hsuan-Liang Liu, Department of Chemical Engineering and Graduate Institute of Biotechnology, National Taipei University of Technology, Taipei, Taiwan 10608 E-mail: [email protected] Fax: 1886-2-27317117 Abbreviations: 3-D, three-dimensional; MD, molecular dynamics; NMR, nuclear magnetic resonance; PDB, protein data bank 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de DOI 10.1002/pmic.200401104

REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

REVIEW

Recent developments in structural proteomics

for protein structure determination

Hsuan-Liang Liu1 and Jyh-Ping Hsu2

1 Department of Chemical Engineering and Graduate Institute of Biotechnology,National Taipei University of Technology, Taipei, Taiwan

2 Department of Chemical Engineering, National Taiwan University, Taipei, Taiwan

The major challenges in structural proteomics include identifying all the proteins on the ge-nome-wide scale, determining their structure-function relationships, and outlining the precisethree-dimensional structures of the proteins. Protein structures are typically determined byexperimental approaches such as X-ray crystallography or nuclear magnetic resonance (NMR)spectroscopy. However, the knowledge of three-dimensional space by these techniques is stilllimited. Thus, computational methods such as comparative and de novo approaches and molec-ular dynamic simulations are intensively used as alternative tools to predict the three-dimen-sional structures and dynamic behavior of proteins. This review summarizes recent develop-ments in structural proteomics for protein structure determination; including instrumentalmethods such as X-ray crystallography and NMR spectroscopy, and computational methods suchas comparative and de novo structure prediction and molecular dynamics simulations.

Received: April 7, 2004Revised: August 26, 2004

Accepted: October 28, 2004

Keywords:

Molecular dynamics simulations / Nuclear magnetic resonance spectroscopy / Review /Structural proteomics / Structure-function relationship / X-ray crystallography

2056 Proteomics 2005, 5, 2056–2068

Contents

1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . 20562 Instrumental methods for structure

determinations . . . . . . . . . . . . . . . . . . . . . . . 20572.1 X-ray crystallography . . . . . . . . . . . . . . . . . 20572.2 NMR spectroscopy. . . . . . . . . . . . . . . . . . . . 20593 Computational methods for structure

prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . 20603.1 Comparative and de novo structure

prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . 20603.2 Molecular dynamics solutions . . . . . . . . . . 20634 Conclusions. . . . . . . . . . . . . . . . . . . . . . . . . . 20665 References. . . . . . . . . . . . . . . . . . . . . . . . . . . 2066

1 Introduction

The term “proteome” was coined in 1994 by Marc R. Wilkins,vice president and head of bioinformatics at Proteome Sys-tems in Sydney, Australia, to mean the protein complementencoded by a genome. Since then, the activities in proteom-ics can be broken down into three main categories: (1) iden-tifying all the proteins made in a given cell, tissue, or organ-ism; (2) determining how these proteins join forces to formnetworks akin to electrical circuits; and (3) outlining theprecise three-dimensional (3-D) structures of the proteins inan effort to find their Achilles’ heels, i.e., where drugs mightturn their activity on or off. Structural proteomics, the deter-mination and prediction of atomic resolution 3-D structuresof proteins on a genome-wide scale for better understandingtheir structure–function relationships, has now provided anew rationale for structural biology and has become a majorinitiative in biotechnology [1]. It usually involves research inbiochemistry, bioinformatics, molecular biology, instru-mental methods such as X-ray crystallography and nuclearmagnetic resonance (NMR) spectroscopy, and computationalapproaches. The activities involved in protein structuredetermination can be summarized in Fig. 1. The large num-

Correspondence: Professor Hsuan-Liang Liu, Department ofChemical Engineering and Graduate Institute of Biotechnology,National Taipei University of Technology, Taipei, Taiwan 10608E-mail: [email protected]: 1886-2-27317117

Abbreviations: 3-D, three-dimensional; MD, molecular dynamics;NMR, nuclear magnetic resonance; PDB, protein data bank

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

DOI 10.1002/pmic.200401104

Page 2: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

Proteomics 2005, 5, 2056–2068 Bioinformatics 2057

Figure 1. A summary of the steps in protein structure determination and prediction.

ber of protein structures will yield valuable information tothe rules for predicting protein folding/unfolding mecha-nisms and understanding their biological functions.

Although recent advances in the fields of X-ray crystal-lography [2, 3] and multidimensional NMR spectroscopy [4]have allowed structural biologists to annotate the structuresand biological functions of proteins by determining theiratomic coordinates, they still suffer from some experimentallimitations, such as the efficient and rational production ofproteins with structural properties including high-through-put cloning and expression from multiple vectors in multiplehost organisms, core domain identification using proteolysismethods, and the use of expression and detection tags.Therefore, computational approaches such as conforma-tional search methods and molecular dynamics (MD) simu-lations may become alternatives capable of predicting the3-D structures and understanding the folding/unfoldingmechanisms of proteins. One of the advantages to use com-putational approaches in structural proteomics is to suggestthe structures and biological functions of uncharacterizedproteins solely based on structural homology to another pro-tein with known structure and biological functions. Suchpredicted structural models may further provide valuableinformation for the better understanding of the putative ac-tivities or functions in protein–protein interactions andmetabolic pathways in a given cell, tissue, and organism.

This review focuses on recent developments in structuralproteomics for protein structure determination. Section 2gives a brief background on instrumental methods such asX-ray crystallography and NMR spectroscopy; Section 3 pro-vides details in computational methods including compara-tive and de novo structure prediction and MD simulations;and Section 4 outlines the concluding remarks of proteinstructure determination in structural proteomics.

2 Instrumental methods for structuredeterminations

Improvements in solving protein structures have been madeby numerous innovations over the past decade, includingsolving the X-ray diffraction phase problem by multiple-wavelength anomalous diffraction (MAD) phasing [5]. Theseefforts together have resulted in dramatic accumulation of3-D protein structures in the past few years (Fig. 2) due to theavailability of genome sequences and technical advances incloning and expression [6, 7], despite the fact that many ofthese structures are redundant (i.e., single amino acid muta-tions, ligand complexes, and duplications).

2.1 X-ray crystallography

To date, X-ray crystallography is still the most powerful tech-nique for structure determination and analysis of proteinsbecause it is capable of providing atomic coordinates of thewhole assembly [8, 9]. Ever since the determination of thestructure of myoglobin, these structural data have con-tributed tremendously to our understanding of structuralbiology at the atomic level [10]. Almost all proteins subjectedto structural studies are expressed in heterologous systems.Recent developments in molecular biology have allowedhigh-throughput cloning and expression of proteins. How-ever, overexpression in host cells may result in inappropriatePTM and incorrect folding, which usually forms inclusionbodies and insoluble aggregates, and thus must be dis-carded. Such limitations have been conquered by a numberof strategies, such as using genes from different species,altering constructs, screening for solubility, and utilizingdifferent cellular or cell-free expression systems. Further-more, several novel refolding techniques such as refolding

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 3: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

2058 H.-L. Liu and J.-P. Hsu Proteomics 2005, 5, 2056–2068

Figure 2. The growth of total available protein structures deposited in PDB (figure downloaded from PDB website: http://www.rcsb.org/pdb).

chromatography based on molecular chaperones that arepart of the cellular machinery responsible for foldingnascent protein [11] and the designed small-moleculeagents to assist the refolding process [12], have beendeveloped to recover these proteins.

Once the proteins are correctly expressed and purified,the next step is to form crystals of sufficient quality to collecthigh-resolution data for structure determination. Crystal-lization is usually regarded as a slow, resource-intensive stepwith low success rates. However, much of the failure in thisstep can be attributed to poor protein quality. Therefore, theuse of biophysical methods such as dynamic light scatteringto assess the quality of a protein is a key step before per-forming crystallization experiments [13]. In addition, it isusually necessary to screen a wide range of conditions suchas pH, temperature, salt and protein concentrations, andcofactors, in this step. Particularly for proteins with lowyields, the ability to screen more conditions at the requiredprotein concentration becomes critical. Crystallization hasbenefited enormously from automation and technologiesallowing the use of small protein sample volumes [2] over thepast few years. Meanwhile, crystallization robots have con-tributed to increasing the efficiency of screening crystal-lization conditions. Recently, the so-called second-generationcrystallization robots have been developed to allow manythousands of experiments to be carried out in parallel. These

robots significantly reduce the need for large amounts ofprotein samples by using nanoliter-sized drops to screen forthe optimal conditions. In addition, they also provide imagecapture and analysis systems in place, which monitor thedrops automatically and look for the presence of crystalsusing edge-detection methods [2].

As mentioned above, high-throughput crystallographyhas been facilitated by improved phasing and model buildingmethods, decreased sample requirements through miniatur-ization, as well as robotization and automation from thecrystallization stage to structure determination [14]. Thus, theprocess from genes to crystals to drug candidates is now pos-sible and has been intensively executed in academic researchand commercial groups [15]. Beyond the high-throughputcapabilities for protein expression, crystallization, imageanalysis for automatic crystal detection, and structure deter-mination, a late stage rate-determining step, is the manualintervention required for crystal mounting and alignment inthe X-ray beam. A novel method for automatic mounting,optical crystal alignment, and data collection has been repor-ted recently [16]. It exhibits the same range of accuracy as themanual process and allows data collection using conventionalX-ray systems. In addition, the number of high-brilliancesynchrotron X-ray beam lines dedicated to macromolecularcrystallography has increased significantly, and thus data col-lection times at these facilities can be dramatically reduced.

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 4: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

Proteomics 2005, 5, 2056–2068 Bioinformatics 2059

The technological advances in X-ray crystallography, overthe past decade, have resulted in rapid growth of proteinstructure data. However, crystal structures of multi-compo-nent systems and membrane proteins are still limited, pro-viding key challenges for these new approaches. Never-theless, it is highly anticipated that these novel approacheswill continue to generate protein crystal structures at unpre-cedented rates and drive a new wave of structure-based drugdiscovery [15].

2.2 NMR spectroscopy

Although X-ray crystallography is regarded as the potentialworkhorse for structural proteomics, the throughput ofstructure determination using it remains unclear. In con-trast, NMR spectroscopy does not require crystals and thussamples appropriate for structure determination can beidentified in a relatively short period of time. Moreover, NMRexperiments can be carried out in aqueous solution underconditions similar to the physiological conditions in whichthe protein normally functions. These particular featureshave provided insights into structure–function relationshipsfor a large number of proteins. However, structure determi-nation by NMR is currently limited by size constraints andlengthy data collection and analysis times. Nevertheless,NMR spectroscopy still plays a significant role in structuralproteomics even with its current limitations [17].

The procedure for NMR solution structure determina-tion can be dissected into six major parts. (1) A suitablesample, usually ,500 mL of a 1 mM protein solution is care-fully prepared. If the molecular mass of a protein exceeds,10 kDa, enrichment with 13C and 15N isotopes is requiredto resolve spectral overlap in 1H NMR spectroscopy. (2) Thissample is subsequently used to record a set of multi dimen-sional NMR experiments, which provide the NMR spectraafter suitable data processing. (3) Determine the completesequential NMR assignments such as the measurement ofresonance frequencies of the NMR-active spins in the pro-tein. (4) The resulting conformation-dependent dispersion ofthe chemical shifts is a prerequisite for deriving experi-mental constraints from various NMR experiments for theNMR structure calculation. (5) Iterations involving structurecalculations and identification of new constraints are carriedout until the overwhelming majority of experimentallyderived constraints are in good agreement with a bundle ofprotein conformations representing the NMR solutionstructure. (6) Finally, the NMR structure can be refined usingconformational energy force fields [4].

As conformational variations in the bundle of structuresreflect the precision of the NMR structure determination, theaccuracy of protein structures determined by NMR isstrongly dependent on the extent and quality of dataobtained. In addition, the sensitivity of the acquired NMRdata relies strongly on the performance of the NMR probe, asophisticated electronic device used to detect NMR signals.Although atomic coordinates in high-resolution crystal

structures are more precisely determined than in the corre-sponding NMR structures, the crystallization process mayselect a certain subset of conformers present under particu-lar conditions. Recently, various approaches towards theautomated structure determination from NMR spectra, suchas the NOESY-Jigsaw [18] in which sparse and unassignedNMR data can be used to reasonably and accurately assesssecondary structure and align it, have been developed. Theinformation obtained is thus useful in quick structural anal-ysis to assess folds before full structure determination andanalysis as a complement to fold prediction approaches [14].In addition, NMR is particularly valuable in structural geno-mics efforts for rapidly characterizing the foldedness of spe-cific protein or RNA constructs [4]. In principle, there arecorrelations between such foldedness criteria and crystal-lizability, so that data from a high-throughput NMR screenmight directly support efforts to generate samples for crys-tallographic analysis [4].

As mentioned, the major challenge for NMR spectrosco-py is the reduction of the data collection time required forstructure determination. Besides, the development of NMRexperiments that allow matching of instrument time invest-ments to the minimum time required for measuring thechemical shift data is another challenge. They can be over-come by reduced dimensionality experiments [19, 20], withsimultaneous frequency labeling of more than one atom typein indirect dimensions. Another advance involves partialdeuteration [21], which provides samples that can be studiedwith improved S/Ns that result from their sharper linewidthsand longer transverse relaxation times. These technologiestogether provide the basis for high-throughput NMR, and areparticularly valuable for samples with limited stabilities andlow solubilities. A novel spectroscopic concept, transverserelaxation optimized spectroscopy (TROSY) based on selec-tion of slowly relaxing NMR transitions, also can providesignificant sensitivity enhancement for large proteins [22,23].

Another important issue of development involves auto-mated analysis of NMR data. The design of cryogenic probeswill certainly assist in reducing the bottleneck of timerequired for data collection in NMR-based structural prote-omics. In addition, some automated resonance- and nuclearOverhauser effects (NOE)-assignment programs, such asNOAH [24], AUTOASSIGN [25], ARIA [26], ANSIG [27],TATAPRO [28], and SANE [29], hold great promise in reduc-ing the amount of time in structure determination. Theserecent developments provide automated analysis of NMRassignments and 3-D structures of proteins up to 200 aminoacids [30]. However, general methods for automated analysisof side chain resonance assignment are not yet well devel-oped, and there are as yet no examples of completely auto-mated protein structure determinations. Nevertheless, NMRspectroscopy can provide structural and biophysical infor-mation that is complementary to X-ray crystallography, andthese two methods will continue to play synergistic roles instructural proteomics.

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 5: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

2060 H.-L. Liu and J.-P. Hsu Proteomics 2005, 5, 2056–2068

3 Computational methods for structureprediction

3.1 Comparative and de novo structure prediction

Computational methods usually involve comparative struc-ture prediction and de novo structure prediction. Compara-tive methods usually follow four basic steps: (1) finding atemplate; (2) sequence-template alignment; (3) model build-ing; and (4) model assessment. Successful predictions basedon comparative models have been reviewed by Baker and Sali[31]. It usually entails the use of information from a variety ofsources including pattern analysis, X-ray diffraction data,and physical forces/energy states. The combined use of thesesources of information would predict the probable atomiccoordinates of proteins in space. If there is no empiricallydetermined structure with at least 30% sequence similarityto the target sequence, then there may be no template avail-able that is suitable for reliable comparative modeling. Thus,the potential accuracy of a comparative model is positivelyrelated to the percentage of sequence identity on which it isbased. High-accuracy comparative models are based on morethan 50% sequence identity to their templates. They tend tohave about 1 root-mean-square (RMS) deviation for themain-chain atoms, which is comparative to the accuracy of amedium-resolution NMR structure or a low-resolution X-raystructure [31]. The errors of the homology models occur pri-marily in the area of mistakes in side-chain packing, smallshifts or distortions in the core main-chain regions, and in-serted loops with no matching sequence in the solved struc-ture. Furthermore, sequence alignment mistakes may occa-sionally take place [32]. Although there is a wide range ofapplications of protein structure models, the scope of mod-eling is still limited to the content of the protein structuredatabase. According to Vitkup et al. [33], providing templatesfor 90% of all protein domain families, including membraneproteins, will require the empirical solution of about 16 000new domain structures. This is well in excess of the struc-tural information presently in the protein data bank (PDB),but may be achievable within a decade as a result ofadvancements in high-throughput crystallography and NMRspectroscopy. The homology models may be further investi-gated in a variety of ways, such as site-directed mutagenesis,ligand docking experiments, and protein–protein interac-tions.

An example of using comparative methods to constructthe homology model of the pore-loop domain of the voltage-gated potassium channel Kv1.1 from human Homo sapiensbased on the crystal structure of the bacterial potassiumchannel from Streptomyces lividans (KcsA) [34] is shown inFig. 3 [35]. The built model shares a similar fold as in KcsA,particularly in the selectivity filter, a signature sequenceforming the sequential K1 ion binding sites. In contrast,most of the structural variations occur in the turret as well asin the inner and outer helices. The homology models of theseKv channels provide particularly attractive targets for further

Figure 3. Ribbon presentations of the top (upper) and side(lower) views of the structural model of the pore loop domain ofKv1.1 K1 channel based on the structural homology to KcsA [35].The structural features corresponded to those defined in thecrystallographic structure of KcsA [34] are indicated. The back-bone RMSD of Kv1.1 in after superimposing the structural modelto the crystallographic structure of KcsA is given on the top rightcorner. Figure reproduced with permission.

structure-based studies. Molecular docking experiments of anewly solved scorpion toxin, Tc1, from Tityus cambridgei [36]were subsequently performed towards these channels tovalidate the built structure and the results are illustrated inFig. 4 [37]. The side chain of Lys14 was found to be directlyinvolved in the electrostatic interactions with the first K1 ion-binding site in the selectivity filter. In addition, Fig. 5 showsthat Tc1 exhibits stronger binding affinity towards Kv1.1 thanKcsA due to stronger electrostatic and hydrophobic interac-tions between Tc1 molecule and the outer vestibule of Kv1.1[38].

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 6: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

Proteomics 2005, 5, 2056–2068 Bioinformatics 2061

Figure 4. Ribbon representations of the top and side views of the pore loop domain of (A) KcsA and (B) Kv1.1 after docked by Tc1 molecule[37], which is illustrated in CPK. The four subunits of each channel, indicated as I, II, III, and IV, are shown in blue, orange, green, and red,respectively. The locations of the outer helix, turret, pore helix, selectivity filter (represented as sticks with the backbone carbonyl oxygenatoms in red), and inner helix [84] are indicated. One of the four subunits of the two channels is omitted in the side views for clarity. The fiveLys residues of Tc1 are labeled and colored in yellow. The rest of the residues in Tc1 are colored in purple. Figure reproduced with per-mission .

Unlike comparative methods that are limited to proteinfamilies with at least one known structure, de novo predictionis free of such a restriction. Starting on the premise that thenative state of a protein is at the global-free energy mini-mum, the search of conformational spaces for tertiary struc-tures that are particularly low in free energy for a givenamino acid sequence is possible. Recent advance in this areaintroduces a toolkit, ROSETTA, for ab initio protein structureprediction [39]. ROSETTA is based on the assumption thatthe distribution of conformations sampled for a given nine

residue segment of the chain is reasonably well approxi-mated by the distribution of structures adopted by thesequence in known protein structures [40]. Fragment librar-ies for all possible three- and nine-residue segments of thechain are extracted from the protein structure database byusing a sequence profile comparison method. The con-formational space defined by these fragments is searched byusing a Monte Carlo procedure with an energy function thatfavors compact structures with paired b-strands and buriedhydrophobic residues. The output structures are by con-

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 7: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

2062 H.-L. Liu and J.-P. Hsu Proteomics 2005, 5, 2056–2068

struction consistent with the local conformational biasesinherent in the sequence and have low free-energy non-local interactions by virtue of the Monte Carlo optimizationprocedure. For each query sequence a large number of in-dependent simulations are carried out by starting fromdifferent random number seeds. The resulting structuresare clustered, and the centers of the largest clusters areselected as the highest confident models. Thus, it providesmodeling support for different design domains, employingsemantics and syntax appropriate for each. Structuralgenomics of single proteins or their domains, combinedwith protein structure prediction, may help scientists toefficiently characterize the structures of large macro-molecular assemblies. Results from CASP (III, IV, and V)protein structure prediction experiments demonstratedthat considerable progress has been made towards pre-dicting protein structure from its primary sequence infor-mation alone [41–43]. The primary improvements to themethod since CASP III fall into three classes. The firstclass consists of improvements in the basic simulationmethod. The second class of improvements consists of fil-ters that eliminate nonprotein-like conformations fromlarge sets of simulated structures and increase the fre-quency of native-like conformations. The third class of

improvements involves the simultaneous clustering ofconformations generated independently for severalsequences related to a given target sequence.

Comparative modeling is based on the observation thatproteins with similar sequences almost always share similarstructures. Currently, about 30% of known sequences havesufficient sequence similarity to a known structure for currentcomparative modeling methods [44]. About one-third of thesesequences are similar over,80% oftheir length; consequently,complete 3-D models cannot be generated by comparativemethods alone [33]. Recently, a novel method based on thesuccessful de novo structure prediction method ROSETTA [45,46] for modeling structurally variable regions has been pro-posed [44]. Traditionally, loop modeling is defined as the prob-lem of constructing 3-D atomic models for short protein seg-ments corresponding to loops on the protein surface that con-nect regular secondary structure elements. Loop modelingmethods primarily differ in the method of conformation gen-eration and in the evaluation or scoring of alternative con-formations. Algorithms can be generally grouped into knowl-edge-based methods, de novo or ab initio strategies, and com-bined approaches. The knowledge-based approach uses thedatabase of experimental protein structures as a source of loopconformations. In the de novo approach, loop conformations

Figure 5.

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 8: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

Proteomics 2005, 5, 2056–2068 Bioinformatics 2063

Figure 5. Molecular surfaces of the complexes of (A) KcsA/Tc1 and (B) Kv1.1/Tc1 with Lys14 of Tc1 facing towards the selectivity filters ofthese two channels [38]. Top views of the P-loop regions of KcsA and Kv1.1 after removing Tc1 molecules are shown on the left of (A) and(B), respectively. The dashed-yellow lines indicate the binding locations of Tc1 molecules on the channel surfaces after docking. The fourAsp residues forming the hydrophobic interactions with Tc1 from each subunit are also labeled. Top and side views of the Tc1 moleculesafter docking towards KcsA and Kv1.1 are shown on the top-right and bottom-right of (A) and (B), respectively. The five Lys residues arelabeled. Figure reproduced with permission.

are generated by a variety of methods including MD [47, 48],simulated annealing [49, 50], exhaustive enumeration or heu-ristic sampling of a discrete set of (f,j) angles [51, 52], randomtweak [53, 54], or analytical methods [55, 56]. The ROSETTA-based method described by Rohl et al. [44] is a novel approach forcombining database-derived conformations and de novo predic-tion for loop modeling. The fragment assembly strategy used byROSETTA is currently perhaps the most successful method forde novo structure prediction, and it may be particularly wellsuited for modeling structurally variable regions in proteins.

Methods for protein structure prediction offer some hopefor narrowing the gap between the number of known proteinsequences and the number of experimentally determinedprotein structures. Fold recognition methods can be veryeffective despite their slowness, the requirement for humanintervention to interpret the results, and the inaccuracy ofsequence-structure alignments produced. Threading meth-ods were originally developed to recognize pairs of proteins,which have no obvious similarities in sequence yet havesimilar folds. Recently, it has become clear that in fold

recognition it is useful to distinguish between pairs of pro-teins, which are homologous and those, which are analo-gous. Where no evolutionary relationship is believed to existbetween two structurally similar proteins, threading wouldbe the only applicable method for identifying this type ofrelationship. Usually, the method for threading can be di-vided into three stages: (1) alignment of sequences; (2) calcu-lation of pair potential and salvation terms; and (3) evalua-tion of the alignment using a neural network. One of thepossible applications for a fold recognition method might beto prioritize these gene products for X-ray crystallographicexperiments. Thus, it can be hoped that a very large fractionof proteomes might be amenable to large-scale structureprediction in the near future.

3.2 Molecular dynamics simulations

Though their folding time may not be a subject of activerefinement of evolution, proteins need to adapt well-definedstructures soon after being synthesized and transported to

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 9: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

2064 H.-L. Liu and J.-P. Hsu Proteomics 2005, 5, 2056–2068

their designated locations to perform their biological func-tions. Under the right physiological conditions, proteins canfold into and subsequently maintain the well-defined struc-tures through the delicate balance of enthalpy and entropy[57], weak interactions including van der Waals, electrostatic,and hydrogen bonding forces, and a balance between proteinintramolecular interactions and the interactions with sol-vent. Therefore, in addition to the task of protein structuredetermination in structural proteomics, understanding themechanism of protein folding/unfolding is often referred toas the second half of genetics. There is a growing body ofevidence indicating that while folding can proceed along dif-ferent routes, some paths are more populated than the others[58–61]. Theoretical approaches to predict the stable foldedstructure or the process of protein folding/unfolding fall intothree categories: (1) statistical approaches; (2) conforma-tional search methods; and (3) dynamics simulations [62].Statistical approaches relate amino acid sequence to theknown 3-D structures and are reasonably successful at pre-dicting secondary structure elements [63]. Conformationalsearch methods can, in principle, yield the relative popula-tion of different conformers. A range of conformations isgenerated and an energy function is used to discriminatebetween them. Dynamics simulations have been intensivelyadopted to characterize particular folded states in solution[64, 65], but it is generally assumed that simulations involv-ing physical force fields, atomic degrees of freedom, andexplicit solvent cannot be used to predict protein folding insolution. The tendency has been, instead, to turn to simpli-fied models or representations [66, 67], which have beenwidely applied to understand the physical principles govern-ing the folding/unfolding processes and will continue to playimportant roles in the endeavor. Unfortunately, none of thesesimplified approaches has met with clear success. The chal-lenge is to reproduce the free-energy surface of the moleculein solution with sufficient fidelity. Although simplifiedmodels allow greater sampling of conformational space thanimplicit model, it may sample many regions with inap-propriate weights.

Computational studies of protein structures and folding/unfolding processes have come of age. Among the earlysuccesses were the lattice models, which allow efficientsampling of computational space. When designed properly,these models can give a well-defined global energy mini-mum and can be applied to structure prediction withencoraging results [68]. There are two types of lattice modelsaimed at two distinct objectives. The first category wasdesigned to understand the basic physics governing the pro-tein folding/unfolding mechanisms [69]. The second one,originated from the work by Miyazawa and Jernigan [70], wasgeared towards realistic folding of real proteins as templatesby statistical sampling of the available structures [71] and isoften referred to as statistical or knowledge-based potentials.In the past, the most widely used approaches in proteinstructure prediction have been based on residue-level meth-ods with typically statistical potentials obtained from the

structural database (i.e., PDB). A growing trend in the com-munity has been the development of atomic-level statisticalpotentials [72–74] in attempts to improve the accuracy. It hasbeen suggested that solvent plays a role as the lubricant priorto reaching the native state [75] and ejection of a solventmolecule from the interior may contribute a non-trivial por-tion to the free-energy barriers [76]. Therefore, the accuracyof these models can be further improved with the inclusionof the solvent effect [77, 78].

The parameters used in the all-atom representation ofboth solvent and protein, are obtained through high-levelquantum mechanical calculations on short peptide frag-ments. Such an approach assures the generality and allowsfurther refinement upon the availability of more accuratequantum mechanical methods and upon the need for suchan improvement. These models have been widely imple-mented to study the unfolding processes of small proteins bytemperature jump technique [75, 79–82], changing solventcondition [83, 84], applying external forces [85, 86], or pres-sure [83]. They can be also applied to study the ligand recog-nition [87–89] or the stabilization effect of secondary struc-tures [90]. Figure 6 shows the snapshots from a series of200 ps MD simulations at various temperatures [82] to inves-tigate the unfolding mechanism of the catalytic domain ofAspergillus awamori glucoamylase. The unfolding of this do-main was suggested to follow a putative hierarchical manner,in which the heavily O-glycosylated belt region from residuesT440 to A471 acted as the initiation site, followed by the a-helix secondary structure destruction, and then the collapseof the catalytic center pocket. In addition, MD simulationshave also been applied to determine the role of alcohols withvarious aliphatic chain lengths on the secondary structureintegrity of monomeric melittin [84]. The results of secondarystructure analysis according to DSSP [91] (Fig. 7) indicate thatthe weaker dielectric constant of longer aliphatic chain lengthof alcohol possibly reduces the hydrogen bonding betweenamide protons and surrounding solvent molecules andsimultaneously promotes the intramolecular hydrogenbonding in melittin. Besides unfolding simulations, limitedrefolding simulations were also attempted with partiallyunfolded structures generated from the unfolding simula-tions and considerable fluctuations were observed [92]. Theserefolding experiments have also identified the transitionstates in the vicinity of the native state [93].

Proteins can fold from fully unfolded states to the nativestate by going through many intermediate states of varyingdegrees of stability. Thus, studying the mechanism of pro-tein folding is regarded as the study of these intermediatestates, the relationship between them, and the relationshipbetween them and the native state [94, 95]. However, a majordifficulty in experimental studies of intermediate states is thetransient nature of these states, necessitating fast techniques[96–98]. In contrast, although it is still unrealistic to expectthe protein to reach the native state during the simulationtime scale by current computer power, this time scaleappears to be sufficient to observe some marginally stable

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 10: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

Proteomics 2005, 5, 2056–2068 Bioinformatics 2065

Figure 6. Snapshots of the catalytic domain from Aspergillus awamori glucoamylase at 50, 100, 150, and 200 ps at (A) 300 K; (B) 400 K; (C)600 K; and (D) 800 K during 200 ps MD simulations [82]. The N- and C-terminus are indicated in the upper starting structure. The upper andlower starting structures at t = 0 indicate the numbering and the corresponding colors of the 13 a-helices, respectively. The same coloringwas applied to each snapshot. Figure reproduced with permission.

intermediates. Previously, Kazmirski and Daggett [99] havesuccessfully sampled various stable intermediate structuresof hen egg-white lysozyme during two sets of 9 ns MDunfolding simulations. Their results suggested that thekinetic intermediate may be made up of distinct, but rapidlyinterconverting, partially folded conformations distinguish-ed primarily by differences in secondary structure packing.Furthermore, the free energy of the intermediate state hasbeen proved to be the lowest among all the states sampledduring the simulation [100]. These results strongly suggestthat the kinetics, not the force-field artifacts, is the barrier forthe simulation to reach the native state.

Another approach to simplify the protein-folding prob-lem is the building block folding model, in which proteinfragments are produced either experimentally [101, 102] orcomputationally [103–106]. Whether these fragments canform independent folding entities is further determined[107]. In general, there is a correspondence in the regions ofthe polypeptide chains being cleaved by these two methods,despite the difference in their premises [108]. Therefore,proteolytic enzymes may be used as reliable probes of pro-tein structure, dynamics, and folding pathways. Further-

more, the consistency with the experimental cleavagesappears to provide a validation of the building block foldingmodel, where the independently folding hydrophobic unitsare obtained through hierarchical assembly of these con-formationally fluctuating building blocks [109, 110]. As aresult, this model is in line with the proposal that proteinfolding is a hierarchical event [111, 112], where parts con-stituting local minima of energy fold first, with their sub-sequent association and mutual stabilization to finally yieldthe global fold.

With the promises of more accurate models and con-siderably faster computer speed, together with the advance-ment of experimental approaches, such as mutagenesis [113],hydrogen exchange experiments [114, 115], atomic forcemicroscope (AFM) [116], solution structure technique [117,118], and ultrafast kinetic experiments [119, 120], the goals ofpredicting protein structures from their primary sequencesand understanding of the protein folding/unfolding mech-anisms should be achieved in the near future. Hopefully, thediversity of protein structures and the complexity of the in vivofolding process can be dealt with by a combination of experi-mental approaches and further simulation work.

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 11: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

2066 H.-L. Liu and J.-P. Hsu Proteomics 2005, 5, 2056–2068

Figure 7. Secondary structures as a function of MD simulation time for melittin at 300, 400, and 600 K in (A) water; (B) methanol; (C) etha-nol; (D) propanol; and (E) butanol [84]. The numbers shown on the y-axis in each plot indicate the number of amino acid residues. a-Helix,turn, and coil estimated according to DSSP [91] are shown in red, blue, and green, respectively. Figure reproduced with permission.

4 Conclusions

Structure prediction methods are likely to play a key role inthe generation of full structural proteome maps. As thenumber of structural templates increases, a vast experi-mental pool of data will be available for future machinelearning experiments and hopefully lead to a rapid improve-ment in the accuracy and speed of such efforts. Furthermore,establishing protein interaction networks is crucial forunderstanding cellular operations. 3-D structures can beused to interrogate the whole interaction network to validateand infer molecular details for the interactions proposed byother approaches. We can expect that the determination andprediction of the 3-D structures and the structure–functionrelationship of proteins for studying protein–protein inter-actions will become increasingly important in the post-genomic era.

The authors gratefully acknowledge the financial supportfrom the National Science Council of Taiwan (NSC-93–2214-E-027-001).

5 References

[1] Smith, T., Nat. Struct. Biol. Suppl. 2000, 7, 927.

[2] Abola, E., Kuhn, P., Earnest, T., Stevens, R. C., Nat. Struct. Biol.2002, 7, 973–977.

[3] Lamzin, V. S., Perrakis, A., Nat. Struct. Biol. 2000, 7(Suppl.),978–981.

[4] Montelione, G. T., Zheng, D., Huang, Y., Szyperski, T., Nat.Struct. Biol. 2000, 7, 982–984.

[5] Ealick, S. E., Curr. Opin. Chem. Biol. 2000, 4, 495–499.

[6] Stevens, R. C., Struct. Fold Des. 2000, 8, R177-R185.

[7] Leskey, S. A., Protein Expr. Purif. 2001, 22, 159–164.

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 12: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

Proteomics 2005, 5, 2056–2068 Bioinformatics 2067

[8] Ban, N., Nissen, P., Hansen, J., Moore, P. B., Steitz, T. A., Sci-ence 2000, 289, 905–920.

[9] Zhang, G. Y., Campbell, E. A., Minakhin, L., Richter, C. et al.,Cell 1999, 98, 811–824.

[10] Sali, A., Glaeser, R., Earnest, T., Baumeister, W., Nature 2003,422, 216–225.

[11] Altamirano, M. M., Golbik, R., Zahn, R., Buckle, A. M., Fersht,A. R., Proc. Natl. Acad. Sci. USA 1997, 94, 3576–3578.

[12] Vuillard, L., Rabilloud, T., Goldberg, M. E., Eur. J. Biochem.1998, 256, 128–135.

[13] Moreno, A., Mas-Oliva, J., Soriano-García, M., Salvador, C.O. et al., J. Mol. Struct. 2000, 519, 243–256.

[14] Norin, M., Sundström, M., Trends Biotechnol. 2002, 20, 79–84.

[15] Jhoti, H., Trends Biotechnol. 2001, 19, S67-S71.

[16] Muchmore, S. W., Olson, J., Jones, R., Pan, J. et al., Struc-ture 2000, 8, R243-R246.

[17] Christendat, D., Yee, A., Dharamsi, A., Kluger, Y. et al., Nat.Struct. Biol. 2000, 7, 903–909.

[18] Bailey-Kellog, C., Widge, A., Kelley, J. J., Berardi, M. J. et al.,J. Comput. Biol. 2000, 7, 537–558.

[19] Szyperski, T., Wider, G., Bushweller, J. H., Wüthrich, K., J.Am. Chem. Soc. 1993, 115, 9307–9308.

[20] Szyperski, T., Banecki, B., Braun, D., Glaser, R. W., J. Biomol.NMR 1998, 11, 387–405.

[21] Gardner, K. H., Kay, L. E., Annu. Rev. Biophys. Biomol.Struct. 1998, 27, 357–406.

[22] Wüthrich, K., Nat. Struct. Biol. 1998, 5, 492–495.

[23] Wider, G., Wüthrich, K., Curr. Opin. Struct. Biol. 1999, 9, 594–601.

[24] Mumenthaler, C., Braun, W., J. Mol. Biol. 1995, 254, 465–480.

[25] Zimmerman, D., Hulikowski, C., Huang, Y., Feng, W. et al., J.Mol. Biol. 1997, 269, 592–610.

[26] Nilges, M., O’Donoghue, S., Prog. Nucl. Magn. Reson.Spectrosc. 1998, 32, 107–139.

[27] Helgstrand, M., Kraulis, P., Allard, P., Hard, T., J. Biomol.NMR 2000, 18, 329–336.

[28] Atreya, H., Sahu, S., Chary, K., Govil, G., J. Biomol. NMR2000, 17, 125–136.

[29] Duggan, B. M., Legge, G. B., Dyson, J., Wright, P. E., J. Bio-mol. NMR 2001, 19, 321–329.

[30] Moseley, H. N. B., Montelione, G. T., Curr. Opin. Struct. Biol.1999, 9, 635–642.

[31] Baker, D., Sali, A., Science 2001, 294, 93–96.

[32] Sanchez, R., Sali, A., Proc. Natl. Acad. Sci. USA 1998, 95,13597–13602.

[33] Vitkup, D., Melamud, E., Moult, J., Sander, C., Nat. Struct.Biol. 2001, 8, 559–566.

[34] Doyle, D. A., Morais Cabral, J., Pfuetzner, R. A., Kuo, A. et al.,Science 1998, 280, 69–77.

[35] Liu, H.-L., Lin, J.-C., Proteins: Struct. Funct. Bioinformatics2004, 55, 558–567.

[36] Wang, I., Wu, S.-H., Chang, H.-K., Shieh, R.-C. et al., ProteinSci. 2002, 11, 390–400.

[37] Liu, H.-L., Lin, J.-C., J. Biomol. Struct. Dyn. 2004, 21, 639–650.

[38] Liu, H.-L., Lin, J.-C., Chem. Phys. Lett. 2004, 381, 592–597.

[39] Simons, K. T., Bonneau, R., Ruczinski, I., Baker, D., Proteins:Struct. Funct. Genet. 1999, 37(Suppl. 3), 171–176.

[40] Simons, K. T., Strauss, C., Baker, D., J. Mol. Biol. 2001, 306,1191–1199.

[41] Moult, J., Hubbard, T., Fidelos, K., Pedersen, J. T., Proteins:Struct. Funct. Genet. 1999, 37, 2–6.

[42] Tramontano, A., Leplae, R., Morea, V., Proteins 2001, Suppl.5, 22–38.

[43] Sippl, M. J., Lackner, P., Dominguez, F. S., Prlíc, A. et al.,Proteins 2001, Suppl. 5, 55–67.

[44] Rohl, C. A., Strauss, C. E. M., Chivian, D., Baker, D., Proteins:Struct. Funct. Bioinformatics 2004, 55, 656–677.

[45] Simons, K. T., Kooperberg, C., Huang, E., Baker, D., J. Mol.Biol. 1997, 268, 209–225.

[46] Simons, K. T., Ruczinski, I., Kooperberg, C., Fox., B. et al.,Proteins 1999, 35, 82–95.

[47] Bruccoleri, R. E., Karplus, M., Biopolymers 1990, 29, 1847–1862.

[48] Hornak, V., Simmerling, C., Proteins 2003, 51, 577–590.

[49] Fiser, A., Do, R. K. G., Sali, A., Protein Sci. 2001, 9, 1753–1773.

[50] Rapp, C. S., Friesner, R. A., Proteins 1999, 35, 173–183.

[51] Galaktionov, S., Nikiforovich, G. V., Marshall, G. R., Biopoly-mers 2001, 60, 153–168.

[52] DePristo, M. A., de Bakker, P. I. W., Lovell, S. C., Blundell, T.L., Proteins 2003, 51, 41–55.

[53] Shenkin, P. S., Yarmush, D. L., Fine, R. M., Wang, H. J.,Levinthal, C., Biopolymers 1987, 26, 2053–2085.

[54] Xiang, Z., Soto, C. S., Honig, B., Proc. Natl. Acad. Sci. USA2002, 99, 7432–7437.

[55] Go, N., Scheraga, H. A., Macromolecules 1970, 3, 178–187.

[56] Wedemeyer, W., Scheraga, H. A., J. Comp. Chem. 1999, 20,819–844.

[57] Makhatadze, G. I., Privalov, P. L., Adv. Protein Chem. 1995,47, 307–425.

[58] Dill, K. A., Chan, H. S., Nat. Struct. Biol. 1997, 4, 10–19.

[59] Pande, V. S., Grosberg, A. Y., Tanaka, T., Rokhsar, D. S., Curr.Opin. Struct. Biol. 1998, 8, 66–79.

[60] Dill, K., Protein Sci. 1999, 8, 1166–1180.

[61] Tsai, C. J., Kumar, S., Ma, B., Nussinov, R., Protein Sci. 1999,8, 1181–1190.

[62] Troyer, J. M., Cohen, F. E., in: Lipkowitz, K. B., Boyd, D. B.,(Eds.), Simplified Models for Understanding and PredictingProtein Structure. in: Reviews in Computational Chemistry,vol. 2, VCH, New York 1991, pp. 57–80.

[63] Thomas, P. D., Dill, K. A., J. Mol. Biol. 1996, 257, 457–469.

[64] Constantine, K. L., Friedrichs, M. S., Stouch, T. R., Biopoly-mers 1996, 39, 591–614.

[65] Kovacs, H., Mark, A. E., Johansson, J., van Gunsteren, W. F.,J. Mol. Biol. 1995, 247, 808–822.

[66] Karplus, M., Sali, A., Curr. Opin. Struct. Biol. 1995, 5, 58–73.

[67] Mohanty, D., Elber, R., Thirumalai, D., Beglov, D., Roux, B., J.Mol. Biol. 1997, 272, 423–442.

[68] Skolnick, J., Kolinski, A., Science 1990, 250, 1121–1125.

[69] Go, N., Ann. Rev. Biophys. Bioeng. 1983, 12, 183–210.

[70] Miyazawa, S., Jernigan, R. L., Macromolecules 1985, 18,534–552.

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

Page 13: REVIEW Recent developments in structural proteomics for protein ...f10894/full text/43_Proteomics.pdf · The major challenges in structural proteomics include identifying all the

2068 H.-L. Liu and J.-P. Hsu Proteomics 2005, 5, 2056–2068

[71] Miyazawa, S., Jernigan, R. L., J. Mol. Biol. 1996, 256, 623–644.

[72] Rojunckarin, A., Subramaniam, S., Proteins: Struct. Funct.Genet. 1999, 36, 54–67.

[73] Samudrala, R., Moult, J., J. Mol. Biol. 1998, 275, 895–916.

[74] Sippl, M. J., Ortner, M., Jaritz, M., Lackner, P., Flockner, H.,Fold Des. 1998, 1, 289–298.

[75] Caflisch, A., Karplus, M., J. Mol. Biol. 1995, 252, 672–708.

[76] Bursulaya, B. D., Brooks, C. L. III, J. Am. Chem. Soc. 1999,121, 9947–9951.

[77] Lazaridis, T., Karplus, M., J. Mol. Biol. 1999, 288, 477–487.

[78] Lazaridis, T., Karplus, M., Proteins: Struct. Funct. Genet.1999, 35, 133–152.

[79] Liu, H.-L., Wang, W.-C., Chem. Phys. Lett. 2002, 366, 284–290.

[80] Liu, H.-L., Wang, W.-C., J. Biomol. Struct. Dyn. 2003, 20, 615–622.

[81] Liu, H.-L., Wang, W.-C., Protein Eng. 2003, 16, 19–25.

[82] Liu, H.-L., Wang, W.-C., Hsu, C.-M., J. Biomol. Struct. Dyn.2003, 20, 567–574.

[83] Hunenberger, P. H., Mark, A. E., van Gunsteren, W. F., Pro-teins 1995, 21, 196–213.

[84] Liu, H.-L., Hsu, C.-M., Chem. Phys. Lett. 2003, 375, 119–125.

[85] Krammer, A., Lu, H., Isralewitz, B., Schulten, K., Cogel, V.,Proc. Natl. Acad. Sci. USA 1999, 96, 1351–1356.

[86] Paci, E., Karplus, M., J. Mol. Biol. 1999, 288, 441–459.

[87] Chen, H.-M., Ho, C.-W., Liu, J.-W., Lin, K.-Y. et al., Biotechnol.Progr. 2003, 19, 864–873.

[88] Liu, H.-L., Ho, Y., Hsu, C.-M., Chem. Phys. Lett. 2003, 372,249–254.

[89] Liu, H.-L., Ho, Y., Hsu, C.-M., J. Biomed. Sci. 2003, 10, 302–312.

[90] Liu, H.-L., Shu, Y.-C., Wu, Y.-H., J. Biomol. Struct. Dyn. 2003,20, 741–746.

[91] Kabsch, W., Sander, C., Biopolymers 1983, 22, 2577–2637.

[92] Alonso, D. O. V., Daggett, V., J. Mol. Biol. 1995, 247, 501–520.

[93] Li, A., Daggett, V., J. Mol. Biol. 1996, 257, 412–429.

[94] Kim, P. S., Baldwin, R. L., Ann. Rev. Biochem. 1990, 59, 631–660.

[95] Privalov, P. L., J. Mol. Biol. 1996, 258, 707–725.

[96] Eaton, W. A., Munioz, V., Thompson, P. A., Chan, C. K.,Hofrichter, J., Curr. Opin. Struct. Biol. 1997, 7, 10–14.

[97] Roder, H., Colon, W., Curr. Opin. Struct. Biol. 1997, 7, 15–28.

[98] Chamberlian, A. K., Margusee, S., Adv. Protein Chem. 2000,53, 283–328.

[99] Kazmirski, S. L., Daggett, V., J. Mol. Biol. 1998, 284, 793–806.

[100] Lee, M. R., Duan, Y., Kollman, P. A., Proteins: Struct. Funct.Genet. 2000, 39, 309–316.

[101] Fontana, A., de Polverino Laureto, P., De Filippis, V., Scar-amella, E., Zambonin, M., Fold. Des. 1997, 2, R17-R26.

[102] Peng, Z. Y., Wu, L. C., Adv. Protein Chem. 2000, 53, 1–47.

[103] Tsai, C.-J., Nussinov, R., Protein Sci. 1997, 6, 24–42.

[104] Tsai, C.-J., Xu, D., Nussinov, R., Fold. Des. 1998, 3, R71-R80.

[105] Tsai, C.-J., Maizel, J. V., Nussinov, R., Protein Sci. 1999, 8,1591–1604.

[106] Tsai, C.-J., Maizel, J. V., Nussinov, R., Proc. Natl. Acad. Sci.USA 2000, 97, 12038–12043.

[107] Peng, Z.-Y., Kim, P. S., Biochemistry 1994, 33, 2136–2141.

[108] Tsai, C.-J., De Laureto, P. P., Fontana, A., Nussinov, R., Pro-tein Sci. 2002, 11, 1753–1770.

[109] Zehfus, M. H., Rose, G. D., Biochemistry 1986, 25, 5759–5765.

[110] Zehfus, M. H., Proteins Struct. Funct. Genet. 1993, 16, 293–300.

[111] Kumar, S., Ma, B., Tsai, C.-J., Wolfson, H., Nussinov, R.,Cell. Biochem. Biophys. 1999, 31, 141–164.

[112] Kumar, S., Ma, B., Tsai, C.-J., Sinha, N., Nussinov, R., Pro-tein Sci. 2999, 9, 10–19.

[113] Fersht, A. R., Curr. Opin. Struct. Biol. 1995, 5, 79–84.

[114] Englander, J. J., Louie, G., McKinnie, R. E., Englander, S.W., J. Mol. Biol. 1998, 284, 1695–1706.

[115] Hollien, J., Marqusee, S., Proc. Natl. Acad. Sci. USA 1999,96, 13674–13678.

[116] Fisher, T. E., Oberhauser, A. F., Carrion-Vazquez, M., Mars-zalek, P. E., Fernandez, J. M., Trans. Biomed. Sci. 1999, 24,379–384.

[117] Eliezer, D., Yao, J., Dyson, H. J., Wright, P. E., Nat. Struct.Biol. 998, 5, 148–155.

[118] Mok, Y. K., Kay, C. M., Kay, L. E., Forman-Kay, J., J. Mol.Biol. 1999, 289, 619–638.

[119] Gilmanshin, R, Williams, S., Callender, R. H., Woodruff, W.H., Dyer, R. B., Proc. Natl. Acad. Sci. USA 1997, 94, 3709–3713.

[120] Williams, S., Causgrove, T. P., Gilmanshin, R., Fang, K. S. etal., Biochemistry 1996, 35, 691–697.

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de