Uncover the Conserved Property Underlying Sequence-Distant and Structure-Similar Proteins

Jun Gao,1,2 Zhijun Li1,3

1 Department of Bioinformatics and Computer Science, University of the Sciences in Philadelphia, Philadelphia, PA 19104

2 Institute of Theoretical Chemistry, Shandong University, Jinan 250100, People’s Republic of China

3 Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA 19104

Received 6 August 2009; revised 23 October 2009; accepted 29 October 2009

Published online 4 November 2009 in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/bip.21342

This article was originally published online as an accepted preprint. The ‘‘Published Online’’ date corresponds to the preprint version. You can request a copy of the preprint by emailing the Biopolymers editorial office at biopolymers@wiley.com

ABSTRACT:

It is widely accepted that a protein’s sequence determines its structure. The surprising finding that proteins of distant sequence can adopt similar 3D structures has raised interesting questions regarding underlying conserved properties that are essential for protein folding and stability. Uncovering the conserved properties may shed light on the folding mechanism of proteins and help with the development of computational tools for protein structure prediction. We compiled and analyzed a dataset of 66 high-resolution, low sequence identity (16–38%) structure pairs of soluble proteins. Structure deviation for each pair was confirmed by calculating its Cα SiMax value and comparing its potential energy per residue. Analysis of favorable inter-residue interactions for each structure pair indicated that the average number of inter-residue interactions within each structure represents a conserved feature of homologous structures of distant sequence. Detailed comparison of individual types of interactions showed that the average number of either hydrophobic or hydrogen bonding interactions remains essentially unchanged for each structure pair. These findings should help improve the quality of homology models based on templates of low sequence identity, thus broadening the application of homology modeling techniques for protein studies. © 2009 Wiley Periodicals, Inc. Biopolymers 93: 340–347, 2010.

Keywords: inter-residue interaction; conservation; average number of interactions; homology modeling

Additional Supporting Information may be found in the online version of this article.

Correspondence to: Zhijun Li; e-mail: [email protected]

Contract grant sponsors: PhRMA Foundation, Institute for Translational Medicine and Therapeutics (ITMAT) Transdisciplinary Program in Translational Medicine and Therapeutics at University of Pennsylvania.

Biopolymers Volume 93 / Number 4

INTRODUCTION

Comparative studies of protein sequences and structures have revealed that proteins’ three-dimensional (3D) structures are much more conserved than their sequences.1 Protein sequences with detectable similarity are expected to share a common 3D structural fold. This is consistent with the law of protein folding, which states that the 3D structure of a protein is determined by its amino acid sequence and the solvent.2 Surprisingly, a number of examples have been reported in which distant sequences adopt similar structural folds.3 For example, the known crystal structures of intestinal fatty acid-binding protein and Manduca sexta fatty acid-binding protein 2 can be aligned to a 1.62 Å root-mean-square deviation (RMSD) of the Cα atoms, whereas they share a sequence identity of merely 19%.4 These observations raise an interesting question: what properties are actually conserved in the 3D structures of those distant protein sequences?5 Several studies have been performed based on the analysis of amino acid types of individual protein sequences6–8 and using position-specific scoring matrices.9 It has been proposed that the hydrophobic or hydropathic profile is preserved in those similar structures with dissimilar sequences.

From the physico-chemical perspective, the folding of a polypeptide sequence into a 3D structure is driven by residue–residue and residue–solvent interactions.10 The inter-residue interactions that stabilize the 3D structure are mainly hydrophobic interactions5,11 and hydrogen bonding.12 Contributions from ionic interactions and the formation of disulfide bonds are relatively small, but could be essential as well.13,14 In our previous study, we analyzed these four types of favorable inter-residue interactions using a dataset of helical transmembrane protein structure pairs.15 Each structure pair in the dataset has a sequence identity ranging from 7 to 36% and adopts a similar structural fold. It was found that the average number of inter-residue interactions remains conserved for those low sequence identity structures. This finding not only helps explain why proteins of distant sequence adopt similar structures, but also has practical application for computational modeling of helical membrane protein structures.15

Compared to helical membrane proteins, structures of soluble proteins are much more diverse. There are at least four major structural classes: all-α, α/β, α+β, and all-β.16 It is of interest to test whether the same conclusion holds for soluble proteins. In this study, we examined this by analyzing the same interactions in similar soluble protein structures with low sequence identity. We first compiled a dataset of 66 high-resolution protein structure pairs of similar fold but low sequence identity (16–38%). These structure pairs represent all four major structural classes. Next, we analyzed this dataset from several perspectives, including geometric comparison, potential energy comparison, and inter-residue interaction analysis. The results suggested that, unlike the geometric measure or the potential energy, the average number of favorable interactions remains conserved for this diverse soluble protein dataset containing structure pairs of similar fold and low sequence identity. This finding provides novel insight into the conserved properties underlying proteins of distant sequence and similar structure.

MATERIALS AND METHODS

High-Resolution Homologous Structure Pair Dataset

First, pairs of structurally similar, sequence dissimilar proteins were identified by searching the Protein Data Bank17 with the following criteria: (i) the structure was determined by X-ray crystallographic methods at a resolution of 2.5 Å or better; (ii) the structure contains only one chain; (iii) the structure has more than 60 amino acid residues; and (iv) the sequence identity between any two structures was <40%.
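The four search criteria above can be sketched as a simple filter. This is only an illustration: the `Entry` record, its field names, and the helper names are hypothetical and are not part of any Protein Data Bank interface; only the thresholds (2.5 Å, one chain, more than 60 residues, <40% identity) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical summary record for one PDB entry (field names are assumptions)."""
    pdb_id: str
    method: str        # experimental method, e.g. "X-RAY DIFFRACTION"
    resolution: float  # in angstroms
    n_chains: int
    n_residues: int


def passes_structure_criteria(entry, max_resolution=2.5, min_residues=60):
    """Criteria (i)-(iii): X-ray, 2.5 A or better, single chain, more than 60 residues."""
    return (
        entry.method == "X-RAY DIFFRACTION"
        and entry.resolution <= max_resolution
        and entry.n_chains == 1
        and entry.n_residues > min_residues
    )


def is_low_identity_pair(identity_percent, cutoff=40.0):
    """Criterion (iv): sequence identity between the two structures below 40%."""
    return identity_percent < cutoff
```

For instance, an entry described as X-ray, 1.00 Å, one chain, 151 residues would pass criteria (i) to (iii), while any pair sharing 40% identity or more would be rejected by criterion (iv).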

Next, all the hits obtained above were winnowed with several criteria related to their Structural Classification of Proteins (SCOP) classification,16 including: (i) the structure belongs to one of the four SCOP classes: all-α, all-β, α/β, and α+β; (ii) the structure contains only one SCOP domain; and (iii) for any structure retained, at least one other structure belonging to the same SCOP superfamily should also be present in the original hit list. Structures within each superfamily were generally regarded as homologous,16 and this criterion ensured that homologous pairs were included. A total of 468 homologous pairs were obtained.
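One way to picture criterion (iii) and the resulting pair list is to group the retained structures by SCOP superfamily and enumerate pairs within each group. The sketch below assumes a plain mapping from PDB ID to superfamily label (a hypothetical input format) and does not re-check the <40% identity requirement, which would have to be applied to each pair separately.

```python
from collections import defaultdict
from itertools import combinations


def homologous_pairs(superfamily_of):
    """Enumerate all within-superfamily pairs.

    superfamily_of: dict mapping a PDB ID to its SCOP superfamily label.
    Structures that are alone in their superfamily yield no pair, which
    mirrors criterion (iii) in the text.
    """
    groups = defaultdict(list)
    for pdb_id, superfamily in superfamily_of.items():
        groups[superfamily].append(pdb_id)

    pairs = []
    for superfamily in sorted(groups):
        members = sorted(groups[superfamily])
        for first, second in combinations(members, 2):
            pairs.append((superfamily, first, second))
    return pairs
```

With the three b.6.1 structures listed in Table I (1BQK, 1PLC, 1SF3), this enumeration gives the three pairs that appear there as pairs 20 to 22.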

Finally, this pair dataset was further refined using the following criteria: (i) the SiMax value of each pair was <5 Å. This criterion was proposed as the measure of true homologous structures.18 The calculation of SiMax values was based on the Cα RMSD value for only the aligned atoms of the two structures, as proposed.18 The Cα RMSD value was calculated using Molecular Operating Environment (MOE) (Chemical Computing Group, version 2006.08); (ii) for a superfamily containing more than four structure pairs, only four were randomly selected. The final trimmed dataset contained 66 pairs of homologous structures, representing 34 superfamilies (Table I). The sequence identity between each pair was recalculated using the pairwise alignment package Alignment of Multiple Protein Sequences (AMPS).19 The percentage of the sequence identity between pairs in this dataset ranges from ~16 to 38%.
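The refinement step can be sketched as follows. The text states only that SiMax is derived from the Cα RMSD over aligned atoms as proposed in ref. 18; the explicit formula used here, RMSD × max(L1, L2) / N_aligned, is the form in which SiMax is usually quoted and should be treated as an assumption to verify against that reference. The function names and the fixed random seed are likewise illustrative.

```python
import random


def simax(rmsd, len_a, len_b, n_aligned):
    """Size-normalized RMSD (assumed form): rmsd * max(len_a, len_b) / n_aligned."""
    if n_aligned <= 0:
        raise ValueError("n_aligned must be positive")
    return rmsd * max(len_a, len_b) / n_aligned


def cap_pairs_per_superfamily(pairs_by_superfamily, max_pairs=4, seed=0):
    """Keep at most max_pairs randomly chosen pairs for each superfamily."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    capped = {}
    for superfamily, pairs in pairs_by_superfamily.items():
        pairs = list(pairs)
        if len(pairs) > max_pairs:
            pairs = rng.sample(pairs, max_pairs)
        capped[superfamily] = pairs
    return capped
```

Under this assumed definition, a pair with a 2.0 Å RMSD over 100 aligned residues, where the longer chain has 150 residues, would have SiMax = 3.0 Å and would pass the <5 Å threshold.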

Calculation of All-Atom Potential Energy

Hydrogen atoms were added to each X-ray structure in the high-resolution structure pair dataset, and the structures were subjected to 100 steps of in vacuo energy minimization with the AMBER8 package using the all-atom protein force field (ff03).20 This force field represents a major extension of the general amber force field. For two structures (Protein Data Bank ID: 1E9M and 1WRI), an additional 900 steps of energy minimization were performed to obtain a negative potential energy value. The potential energy per residue of the minimized structures was subsequently calculated and compared within each structure pair.
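The per-residue comparison can be illustrated with a small helper. The text does not state how the percentage difference between two structures was normalized; the sketch below divides by the smaller magnitude, which is an assumption chosen only because it allows values above 100%, as reported in the Results. The function names are hypothetical.

```python
def energy_per_residue(total_energy, n_residues):
    """Potential energy of a minimized structure divided by its residue count."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return total_energy / n_residues


def percent_difference(energy_a, energy_b):
    """Absolute difference relative to the smaller magnitude, in percent.

    The choice of denominator is an assumption, not taken from the paper.
    """
    smaller = min(abs(energy_a), abs(energy_b))
    if smaller == 0:
        raise ValueError("cannot normalize by a zero energy")
    return abs(energy_a - energy_b) / smaller * 100.0
```

As a worked example with made-up numbers, per-residue energies of -10.0 and -15.0 (in the same energy unit) differ by 50% under this convention.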

Derivation of Favorable Inter-Residue Interactions

An inter-residue interaction between two residues within a protein structure was defined as one of four types: hydrophobic interaction, ionic bond, disulfide bond, or hydrogen bond. The first three interactions were determined in MOE with a sequence separation cutoff of one. Hydrogen bond interactions were detected using HBPLUS (version 3.0).21 The average number of interactions was subsequently calculated and compared.
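A minimal sketch of the averaging step is given below. It assumes that the "average number of interactions" is the count of distinct residue-residue interactions divided by the number of residues, and that a sequence separation cutoff of one means the two residue indices must differ by at least one; neither detail is spelled out in this section, so both are labeled assumptions. The input format (tuples of two residue indices and an interaction type) is hypothetical and stands in for the output of MOE and HBPLUS.

```python
def average_interactions(interactions, n_residues, min_separation=1):
    """Average number of distinct inter-residue interactions per residue.

    interactions: iterable of (residue_i, residue_j, kind) tuples, where kind
    is a label such as "hydrophobic", "ionic", "disulfide", or "hbond".
    Each unordered residue pair is counted once per interaction kind.
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    unique = set()
    for i, j, kind in interactions:
        if abs(i - j) >= min_separation:
            unique.add((min(i, j), max(i, j), kind))
    return len(unique) / n_residues


def average_difference(avg_first, avg_second):
    """Signed difference in the average between the two structures of a pair."""
    return avg_first - avg_second
```

Restricting `interactions` to a single kind before calling `average_interactions` gives the per-type averages (for example hydrophobic only) that are compared later in the Results.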

Construction and Analysis of Homology Model Dataset

For each pair in the above high-resolution structure dataset, the first structure was regarded as the target and the second as the template. The homology model of the target protein was constructed based on the template structure. For model building, the target and the template structures were first aligned using the structure-based alignment algorithm implemented in MOE with the BLOSUM62

Table I List of PDB Entries for Protein Pairs Included in the High-Resolution Structure Dataset

Columns, left to right: Pair No.; SCOP Superfamily; First Structure (PDB ID, Resolution in Å, Length); Second Structure (PDB ID, Resolution in Å, Length); Sequence Identity (%)

1 a.1.1 1A6M 1.00 151 2GDM 1.70 153 19.46

2 a.1.1 1MBA 1.85 147 1KFR 1.60 146 19.44

3 a.1.1 1KFR 1.85 147 2HBG 1.50 147 16.20

4 a.1.1 1MBA 1.60 146 2HBG 1.50 147 21.58

5 a.102.1 1AYX 1.70 492 1GAI 1.70 472 37.73

6 a.102.1 1KWF 0.94 363 1V5C 2.00 386 31.73

7 a.11.1 1HB6 2.00 86 1HBK 2.00 89 26.74

8 a.3.1 1C75 0.97 71 451C 1.60 82 34.29

9 a.3.1 1CTJ 2.50 83 1CC5 1.10 89 23.46

10 a.3.1 1CC5 2.50 83 451C 1.60 82 21.79

11 a.3.1 1CTJ 1.10 89 451C 1.60 82 28.21

12 a.96.1 1MUN 1.20 225 2ABK 1.85 211 22.38

13 b.1.8 1MFM 1.02 153 1OAL 1.50 151 30.00

14 b.18.1 1GWM 1.15 153 1UZ0 2.00 131 21.31

15 b.36.1 1G9O 1.50 91 1QAU 1.25 112 20.00

16 b.36.1 1G9O 1.50 91 1R6J 0.73 82 23.46

17 b.36.1 1QAU 1.25 112 1R6J 0.73 82 21.79

18 b.47.1 1PQ7 1.50 215 1P3C 0.80 224 22.5

19 b.50.1 1FMB 1.80 104 4FIV 1.80 113 31.07

20 b.6.1 1BQK 1.35 124 1PLC 1.33 99 32.58

21 b.6.1 1BQK 1.35 124 1SF3 1.05 105 34.07

22 b.6.1 1PLC 1.33 99 1SF3 1.05 105 24.72

23 b.60.1 1MDC 1.75 131 1TVQ 2.00 125 28.23

24 b.60.1 1QQS 2.40 174 1X8Q 0.85 184 18.18

25 b.68.1 1F8E 1.40 388 1INV 2.40 390 30.71

26 b.80.1 1EE6 2.30 197 1QCX 1.70 359 25.13

27 b.82.1 1FI2 1.60 201 1V70 1.30 105 20.95

28 c.1.2 1NSJ 2.00 205 1THF 1.45 253 22.28

29 c.2.1 1B2L 1.60 254 1NXQ 1.79 251 23.95

30 c.2.1 1B2L 1.60 254 1OAA 1.25 259 20.92

31 c.2.1 1EDO 2.30 244 1NXQ 1.79 251 31.12

32 c.2.1 1NXQ 1.79 251 1OAA 1.25 259 21.81

33 c.23.5 1F4P 1.30 147 1RCF 1.40 169 32.39

34 c.23.5 1F4P 1.30 147 5NUL 1.60 138 32.59

35 c.37.1 1QF9 1.70 194 1QHX 2.50 178 16.28

36 c.43.1 1DPB 2.50 243 1SCZ 2.20 233 31.74

37 c.62.1 1IJB 1.80 202 1MF7 1.25 194 19.15

38 c.62.1 1IJB 1.80 202 1QCY 2.30 193 17.02

39 c.62.1 1MF7 1.25 194 1MJN 1.30 179 34.08

40 c.62.1 1MF7 1.25 194 1QCY 2.30 193 31.35

41 c.66.1 1EJ0 1.50 180 1G8A 1.40 227 28.74

42 c.69.1 1TIB 1.84 269 1USW 2.50 260 33.33

43 c.71.1 1DF7 1.70 159 1RA9 1.55 159 37.50

44 c.71.1 1DF7 1.70 159 3DFR 1.70 162 31.82

45 c.71.1 1KMV 1.05 186 3DFR 1.70 162 28.40

46 c.71.1 1RA9 1.55 159 3DFR 1.70 162 28.03

47 c.93.1 1GCA 1.70 309 1RPJ 1.80 288 24.65

48 c.93.1 1RPJ 1.80 288 1TJY 1.30 316 18.86

49 c.93.1 2DRI 1.80 288 1RPJ 1.60 271 35.93

50 c.93.1 2DRI 1.30 316 1TJY 1.60 271 22.01

51 d.108.1 1CJW 1.80 166 1Q2Y 2.00 140 18.57

52 d.108.1 1CJW 1.80 166 1QST 1.70 160 16.89


scoring matrix. The generated alignment was then adopted for subsequent model building using the homology modeling software Modeller (version v7).22 Structural comparison between the homology model and the crystal structure of the target protein was measured by the TM-score.23 The average number of interactions in each homology model and its corresponding crystal structure were calculated as above and subsequently compared.
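For readers unfamiliar with the TM-score, the sketch below evaluates its standard expression for one fixed superposition: the sum of 1 / (1 + (d_i / d0)^2) over aligned residue pairs, divided by the target length, with d0 = 1.24 (L - 15)^(1/3) - 1.8. The published score is the maximum of this quantity over superpositions, which this sketch does not attempt; the input distances are assumed to come from an alignment and superposition computed elsewhere. The second helper computes the model-to-crystal ratio of the average number of interactions that is later compared with the TM-score.

```python
def tm_score_fixed_superposition(distances, target_length):
    """TM-score expression for a given superposition (no optimization).

    distances: distances in angstroms between aligned C-alpha pairs.
    target_length: number of residues in the target protein.
    """
    if target_length <= 15:
        raise ValueError("target_length must exceed 15 for the d0 formula")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("d0 is not positive for this target length")
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def interaction_ratio(avg_model, avg_crystal):
    """Ratio of the model's average interaction count to the crystal structure's."""
    if avg_crystal <= 0:
        raise ValueError("avg_crystal must be positive")
    return avg_model / avg_crystal
```

A perfectly superposed model covering every target residue scores 1.0, and one covering half of the residues perfectly scores 0.5.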

RESULTS

To uncover the conserved properties underlying similar protein folds of distant sequence, the computational approach included several steps: (i) compile a high-resolution, low sequence identity structure pair dataset; (ii) compute the Cα SiMax and compare the potential energy in each structure pair; (iii) compute and compare the average number of favorable inter-residue interactions in each structure pair; and (iv) compute and compare the average number of favorable interactions in each homology model and its corresponding X-ray structure.

Structures in Each Pair Vary Significantly in the High-Resolution Structure Dataset

The structural difference between pairs of the low sequence identity structures in the high-resolution structure dataset was measured using Cα SiMax values. As expected, their structures varied quite significantly, with the SiMax value ranging from 1.63 Å to 4.91 Å (see Figure 1). Among them, 35 (53%) structure pairs displayed Cα SiMax > 3.0 Å.

Potential Energy per Residue Also Varies Greatly for Each Pair in the High-Resolution Structure Dataset

Structural difference between structures of distant sequence in the high-resolution structure dataset was also characterized by comparing their potential energy per residue. The potential energy per residue within each structure pair differed quite dramatically, with the difference ranging from 0.6 to 338% (see Figure 2). Among them, 27 (41%) structure pairs displayed an energy change >30%. Clearly, the potential energy per residue did not remain conserved for each structure pair of low sequence identity.

A potential energy calculation is based on atom–atom interactions within a molecule. As the type and number of atoms between pairs of proteins vary, it is understandable that their potential energy differs. Further examining the value of the individual components of the potential energy indicated that variation in both van der Waals and electrostatic interactions contributed most to the difference in potential energy for each structure pair in the dataset (Supporting Information).

Average Number of Favorable Inter-Residue Interactions Remains Conserved for Each Pair in the High-Resolution Structure Dataset

Formation of favorable inter-residue interactions is a hallmark of folded protein structures. In our previous study, it was shown that the average number of such interactions remains conserved for low sequence identity, homologous structures of helical membrane proteins.15 For each pair in the high-resolution soluble structure dataset presented here, the difference in the average number of inter-residue interactions was also very small, ranging from 0 to 0.67 (see Figure 3). Further examination of the results showed that for 40 (61%) structure pairs, the absolute difference was ≤0.2. This result was in clear contrast with the Cα SiMax of these structure pairs or the difference in their potential energy per residue, confirming the hypothesis that the average number of favorable interactions represents a conserved property of soluble proteins of similar fold and dissimilar sequence.

Table I (Continued)

Columns, left to right: Pair No.; SCOP Superfamily; First Structure (PDB ID, Resolution in Å, Length); Second Structure (PDB ID, Resolution in Å, Length); Sequence Identity (%)

53 d.108.1 1Q2Y 2.00 140 1QST 1.70 160 17.86

54 d.110.3 1N9L 1.40 130 1EW0 1.90 109 30.19

55 d.110.3 1N9L 1.90 109 1NWZ 0.82 125 25.71

56 d.124.1 1IQQ 1.50 200 1UCD 1.30 190 32.26

57 d.15.1 1GNU 1.75 117 1WM3 1.20 72 20.83

58 d.15.4 1WRI 2.07 106 1E9M 1.20 93 20.43

59 d.165.1 1MRJ 1.60 247 1RL0 1.40 255 22.27

60 d.165.1 1MRJ 1.60 247 1UQ5 1.40 263 37.86

61 d.165.1 1RL0 1.40 255 1UQ5 1.40 263 26.23

62 d.169.1 1HQ8 1.95 123 1QDD 1.30 144 27.05

63 d.169.1 1HQ8 1.95 123 1TN3 2.00 137 25.41

64 d.169.1 1QDD 1.30 144 1TN3 2.00 137 22.39

65 d.58.5 2PII 1.45 102 1UKU 1.90 112 16.67

66 d.92.1 1EB6 1.00 177 1G12 1.60 167 21.60

In addition, detailed comparison of the average number of individual inter-residue interactions between two structures in each pair showed that the difference in both hydrophobic and hydrogen bonding interactions was negligible (see Figure 4). For the hydrophobic interactions, the range was −0.34 to 0.37, and for the hydrogen bonding interactions, the range was −0.37 to 0.50. This suggested that both types of interactions remain relatively conserved.

FIGURE 1 SiMax analysis of structure pairs in the high-resolution structure dataset. The PDB ID for pairs 1–66 is listed in Table I.

FIGURE 2 Comparison of potential energy per residue for structure pairs in the high-resolution structure dataset. For each pair, the structure with the larger value is represented by the gray bar, and the difference between the pair is represented by the black bar.

Average Number of Favorable Inter-Residue Interactions Correlates Directly with Quality of Homology Models

To explore the potential practical application of the above finding in homology modeling, a set of homology models was prepared using the structure pairs in the high-resolution structure dataset. For each pair, the first structure was designated as the target and the second as the template; a homology model of the target protein was subsequently constructed based on the template protein structure.

Unsurprisingly, the average number of interactions of all the homology models was lower than that of their corresponding crystal structures. Most interestingly, there was a good correlation between the ratio of their average number of interactions and their structure similarity, as measured by the TM-score (see Figure 5). The linear fitting parameter R was 0.72.
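The correlation reported above can be reproduced in outline with a Pearson correlation coefficient between the interaction ratios and the TM-scores. Whether the reported "linear fitting parameter R" is exactly the Pearson coefficient is an assumption; the function below is a generic, dependency-free implementation, and the per-pair data behind Figure 5 are not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold the model-to-crystal ratio of the average number of interactions for each pair and `ys` the corresponding TM-scores.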

DISCUSSION

The surprising finding that proteins of distant sequence can adopt a similar 3D structure has raised interesting questions regarding underlying conserved properties that are essential for protein folding and stability.3,5,6 Uncovering the conserved properties may shed light on the folding mechanism of proteins and help with the development of computational tools for protein structure prediction.15 Several bioinformatics approaches to this question have been carried out through comparative analysis of structural properties,5 analysis of types of amino acid residues in protein sequences,6,7 and singular value analysis of position-specific scoring matrices.9 These studies have shown the importance of the conserved hydrophobic or hydropathic profile.

FIGURE 3 Comparison of average number of inter-residue interactions for structure pairs in the high-resolution structure dataset. For each pair, the structure with the smaller value is represented by the gray bar, and the difference between the pair is represented by the black bar.

FIGURE 4 Difference in average number of hydrophobic and hydrogen bonding interactions for each structure pair in the high-resolution structure dataset.

Unlike the previous studies mentioned earlier, we have developed an approach that focuses on directly elucidating and comparing favorable inter-residue interactions such as hydrophobic interactions, hydrogen bonds, ionic bonds, and disulfide bonds in a simple and quantitative way.24 This approach was subsequently applied to the analysis of a high-resolution and low sequence identity structure dataset of similar membrane protein pairs.15 These structure pairs have a sequence identity ranging from ~6 to 35%. Despite their large structure deviation, little difference in the average number of interactions in each pair was observed. This suggested that the average number of favorable interactions is a conserved property of homologous membrane proteins.

Using the approach of our previous study on membrane proteins, we analyzed homologous structure pairs of soluble proteins of distant sequence. As the first step, we compiled a high-resolution and low sequence identity structure pair dataset by systematically analyzing all the reported high-resolution X-ray structures. The percentage of the sequence identity between the two structures in each pair in the dataset varied from 16 to 38%. Unsurprisingly, the two structures in each pair could differ quite substantially. The backbone SiMax between them ranged from 1.63 to 4.91 Å, and the difference in the potential energy per residue ranged from 0.6 to 338%. Nevertheless, the average number of favorable interactions remains conserved for these low sequence identity pairs.

Interestingly, for structures of each pair studied here, the difference in the value of the average number of hydrophobic and hydrogen bonding interactions was found to be negligible (see Figure 4). This is consistent with the earlier findings that homologous sequences of similar structure tend to conserve their hydropathic profile.6,7 Further, it provides a quantitative explanation of the conserved property underlying these structures from the perspective of physico-chemical interactions. To some extent, this result can be understood from a structural perspective. In the 3D structure of a protein, a favorable inter-residue interaction functions as an edge connecting two residues. For two proteins adopting a similar fold, more or less the same number of edges, oriented in approximately the same direction, is needed to keep the fold stable.

A conserved property among proteins of distant sequence yet similar structure is of potential use in improving homology modeling techniques based on templates of low sequence identity. Significant progress has been made in developing computational tools to recognize the fold templates for homologous proteins with little sequence identity.25 For a homology model based on such templates of low sequence identity, obtaining an accurate alignment between the target and the template sequences remains a challenge.26 On the basis of the finding here (see Figure 5), an iterative refinement strategy could be adopted. In this strategy, the average number of favorable interactions of a model would be compared with that of its template structures. If the model value is much less than the template value, then a specific alignment adjustment can be performed and a new model can be constructed based on it. The comparison of the average number of interactions between the new model and the template structures can then be repeated. Depending on the results, this iterative process can be continued until the difference in the average number of interactions between the model and its templates cannot be further reduced.
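The proposed refinement strategy can be written as a short loop. This is a sketch of the idea only: the three callables (`build_model`, `average_interactions`, `adjust_alignment`) are hypothetical placeholders for a modeling program, an interaction counter, and an alignment editor, and the stopping rule (stop when the gap to the template no longer shrinks, or after a fixed number of rounds) is one possible reading of "cannot be further reduced".

```python
def refine_alignment(alignment, build_model, average_interactions,
                     adjust_alignment, template_value,
                     max_rounds=20, tolerance=1e-6):
    """Iteratively adjust an alignment to close the interaction-count gap.

    Returns the best alignment, its model, and the remaining absolute gap
    between the model's and the template's average number of interactions.
    """
    model = build_model(alignment)
    best_gap = abs(template_value - average_interactions(model))
    for _ in range(max_rounds):
        candidate_alignment = adjust_alignment(alignment, model)
        candidate_model = build_model(candidate_alignment)
        gap = abs(template_value - average_interactions(candidate_model))
        if gap >= best_gap - tolerance:
            break  # no further reduction: keep the previous best model
        alignment, model, best_gap = candidate_alignment, candidate_model, gap
    return alignment, model, best_gap
```

Keeping the modeling and scoring steps as injected callables means the loop itself makes no claim about how the alignment adjustment is chosen, which the text leaves open.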

CONCLUSION

A dataset of high-resolution soluble protein pairs of similar fold and distant sequence was compiled and analyzed. It was confirmed that the average number of favorable inter-residue interactions represents a conserved property across homologous proteins of similar fold, regardless of the percentage of their sequence identity. Further, the two main types of interactions, the hydrophobic interaction and the hydrogen bonding, also remain conserved. On the basis of the results, a refinement strategy was suggested to help improve the homology models constructed based on templates of low sequence identity.

FIGURE 5 Correlation between the ratio of average number of inter-residue interactions in homology models relative to their X-ray structures and their Cα TM-score.

The authors thank Dr. Michael F. Bruist at University of the Sciences

in Philadelphia for his comments about the manuscript.

REFERENCES

1. Gibrat, J. F.; Madej, T.; Bryant, S. H. Curr Opin Struct Biol 1996, 6, 377–385.

2. Anfinsen, C. B. Science 1973, 181, 223–230.

3. Sippl, M. J. Curr Opin Struct Biol 2009, 19, 312–320.

4. Banaszak, L.; Winter, N.; Xu, Z.; Bernlohr, D. A.; Cowan, S.;

Jones, T. A. Adv Protein Chem 1994, 45, 89–151.

5. Russell, R. B.; Barton, G. J. J Mol Biol 1994, 244, 332–350.

6. Kinjo, A. R.; Nishikawa, K. Bioinformatics 2004, 20, 2504–2508.

7. Krissinel, E. Bioinformatics 2007, 23, 717–723.

8. Socolich, M.; Lockless, S. W.; Russ, W. P.; Lee, H.; Gardner, K.

H.; Ranganathan, R. Nature 2005, 437, 512–518.

9. Kinjo, A. R.; Nakamura, H. PLoS One 2008, 3, e1963.

10. Gromiha, M. M.; Selvaraj, S. Prog Biophys Mol Biol 2004, 86,

235–277.

11. Kauzmann, W. Adv Protein Chem 1959, 14, 1–63.

12. Gao, J.; Bosco, D. A.; Powers, E. T.; Kelly, J. W. Nat Struct Mol

Biol 2009, 16, 684–690.

13. Betz, S. F. Protein Sci 1993, 2, 1551–1558.

14. Waldburger, C. D.; Schildbach, J. F.; Sauer, R. T. Nat Struct Biol

1995, 2, 122–128.

15. Gao, J.; Li, Z. J Comput Aided Mol Des DOI 10.1007/s10822-

00-9220-.

16. Murzin, A. G.; Brenner, S. E.; Hubbard, T.; Chothia, C. J Mol

Biol 1995, 247, 536–540.

17. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.

N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids

Res 2000, 28, 235–242.

18. Redfern, O. C.; Harrison, A.; Dallman, T.; Pearl, F. M.; Orengo,

C. A. PLoS Comput Biol 2007, 3, e232.

19. Livingstone, C. D.; Barton, G. J. Comput Appl Biosci 1993, 9,

745–756.

20. Case, D. A.; Darden, T. A.; Cheatham, I. T. E.; Simmerling, C.

L.; Wang, J.; Duke, R. E.; Luo, R.; Merz, K. M.; Wang, B.; Pearl-

man, D. A. Amber8; University of California: San Francisco,

2004.

21. McDonald, I. K.; Thornton, J. M. J Mol Biol 1994, 238, 777–

793.

22. Sali, A.; Blundell, T. L. J Mol Biol 1993, 234, 779–815.

23. Zhang, Y.; Skolnick, J. Nucleic Acids Res 2005, 33, 2302–2309.

24. Muppirala, U. K.; Li, Z. Protein Eng Des Sel 2006, 19, 265–275.

25. Lobley, A.; Sadowski, M. I.; Jones, D. T. Bioinformatics 2009,

25, 1761–1767.

26. Sali, A.; Potterton, L.; Yuan, F.; van Vlijmen, H.; Karplus, M.

Proteins 1995, 23, 318–326.

Reviewing Editor: David A. Case

