Upload
vandieu
View
216
Download
0
Embed Size (px)
Citation preview
The role of methyl–induced polarization in ionbindingMariana Rossi ∗, Alexandre Tkatchenko ∗ Susan B. Rempe † and Sameer Varma † ‡
∗Fritz-Haber-Institut der Max-Planck-Gesellschaft, D-14195 Berlin, Germany,†Biological and Materials Sciences Center, Sandia National Laboratories, Albuquerque, NM-87185,
United States of America, and ‡Department of Cell Biology, Microbiology and Molecular Biology, Department of Physics, University of South Florida, 4202 E. Fowler Ave.,
Tampa, FL-33620, United States of America
Submitted to Proceedings of the National Academy of Sciences of the United States of America
The chemical property of methyl groups that renders them indis-pensable to biomolecules is their hydrophobicity. Quantum mechan-ical studies undertaken here to understand the effect of point sub-stitutions on potassium (K–) channels illustrate quantitatively howmethyl–induced polarization also contributes to biomolecular func-tion. K–channels regulate transmembrane salt concentration gradi-ents by transporting K+ ions selectively. One of the K+ binding sitesin the channel’s selectivity filter, the S4 site, also binds Ba2+ ions,which blocks K+ transport. This inhibitory property of Ba2+ ionshas been vital in understanding K–channel mechanism. In most K–channels, the S4 site is comprised of four threonine amino acids. TheK–channels that carry serine instead of threonine are significantlyless susceptible to Ba2+ block and have reduced stabilities. We findthat these differences can be explained by the lower polarizabilityof serine compared to threonine as serine carries one less branchedmethyl group than threonine. A T→S substitution in the S4 sitereduces its polarizability, which, in turn, reduces ion binding by sev-eral kcal/mol. While the loss in binding affinity is high for Ba2+,the loss in K+ binding affinity is also significant thermodynamically,which reduces channel stability. These results highlight, in general,how biomolecular function can rely on the polarization induced bymethyl groups, especially those that are proximal to charged moi-eties, including ions, titratable amino acids, sulphates, phosphatesand nucleotides.
ion channels | methylation | polarization | electronic structure | selectivity
Methyl groups play a central role in biomolecular func-tion. As constituents of biomolecules, these groups
define the biomolecule’s solvated configurations. Also, theirpost-translational addition to peptides is regulated tightly toenable numerous physiological processes, including gene tran-scription, and signal transduction [1]. Furthermore, methy-lation of nucleotides is a crucial epigenetic modification thatregulates many cellular processes such as embryonic develop-ment, transcription, chromatin structure, genomic imprint-ing and chromosome stability [2]. The chemical propertyof methyl groups that render them indispensable to theseprocesses is their hydrophobicity, that is, their inability tohydrogen-bond with water molecules, or more generally, withpolar groups.
Methyl groups are, however, also polarizable. For exam-ple, methanol differs from water chemically in that it includesa methyl group, and has an average static dipole polarizabil-ity more than twice that of water (3.3 vs 1.5 A3) [3]. In ad-dition, the average static dipole polarizabilities of alcohols in-crease with addition of methylene bridges (–CH2–). Methanol,ethanol and propanol have increasingly larger polarizabilitiesof 3.3, 4.5 and 6.7 A3, respectively [3]. While such trends in-dicate that methyl or methylene polarizability could be an im-portant contributor to the electrostatic and polarization forcesthat drive biomolecular function, there exist only suggestiveand qualitative evidences as to their actual role. Studies ex-amining the effect of methylation on DNA stability [4] pro-posed that the enhanced stability of methylated DNA may beexplained by considering that methylation of nucleotides in-creases their polarizability. In a different experimental study
concerning the relative stabilities of DNA and RNA helices [5],it was proposed that the largest contribution to stabilizationby methyl groups was due to increased base–stacking abilityrather than favorable hydrophobic methyl–methyl contacts.We present investigations on K–channels that illustrate quan-titatively how methyl–induced polarization contributes to thefunctional properties of biomolecules.
S1
S2
S3
S4
Thrside-chain
Extra-cellular sidea)KcsA … T T V G Y G …Kir 2.1 … T T I G Y G …Kir 2.1* … T S I G Y G …Kir 2.4 … T S I G Y G …Erg … T S V G F G …SK … L S I G Y G …Kcv … S T V G F G …Kcv* … S S V G F G …
b)
Threonine Serine
c)
S4
Fig. 1. (a) Selectivity filter of a representative K–channel, KcsA [7]. The orange
spheres denote K+ ions bound to two of their four preferred bindings sites, S2 and
S4 (PDB ID: 1K4C). Threonine side chains that makes up the S4 site are highlighted
in green. (b) Sequence alignment of the selectivity filter region of K–channels that
have serine residues in their S4 sites. The K–channels whose names are appended
with an asterisk are engineered via site-directed mutagenesis and were functionally
characterized recently by Chatelain et al [16]. (c) Representative configurations of
threonine and serine amino acids.
The primary function of K–channels is to regulate trans-membrane salt concentration gradients [6]. K–channels ac-complish this task by transporting K+ ions selectively throughtheir pores in response to specific external stimuli. Fig. 1ais a representative structure of the selectivity filter of theirpores [7], which shows four preferred binding sites for K+ ions,S1–4. Three of these binding sites, S1–3, can be consideredchemically identical as they provide eight backbone carbonyloxygens for ion coordination. The fourth site, S4, is typically
Reserved for Publication Footnotes
www.pnas.org/cgi/doi/10.1073/pnas.0709640104 PNAS Issue Date Volume Issue Number 1–7
composed of four threonine residues that provide four back-bone carbonyl oxygens and four side chain hydroxyl oxygensfor ion coordination. This S4 site is also the preferred bindingsite for Ba2+ ions that block K+ permeation [8, 9, 10, 11].This inhibitory property of Ba2+ has proven vital toward un-derstanding the mechanisms underlying K–channel function[8, 9, 10, 12, 13, 14, 15, 17, 18].
Genetic selection and site-directed mutagenesis experi-ments on a viral K–channel, Kcv, and the inward rectifierK–channel, Kir 2.1, show that a threonine-to-serine (T→S)substitution in the S4 sites reduces channel susceptibility toBa2+ block by over two orders in magnitude [16]. In addition,this substitution also reduces channel stability, including thechannel’s mean open probabilities. Furthermore, the sequencealignment of K–channels [19] shows that some K–channel subfamilies carry a serine instead of threonine residue in theirS4 sites (Fig. 1b). Among these serine-carrying channels,Kir 2.4 has also been found to exhibit similar characteristicdifferences with respect to the typical threonine–containingchannels [20, 21].
The single chemical difference between a serine and a thre-onine side chain is that serine has one less branched methylgroup than threonine (Fig. 1c). Consequently, serine can beexpected to be less polarizable than threonine. However, isthe difference in electronic polarizability sufficient to explainthe T→S induced changes in K-channel properties, especiallygiven that the ion in the S4 site resides at a distance greaterthan 5 A from the branched methyl group [7]? Serine andthreonine also have different hydrophobicities [22], so do T→Ssubstitutions alter the manner in which their side chain hy-droxyl groups align with the permeation pathway and interactwith ions?
The primary challenge associated with investigating theseissues is to model accurately the broad range of molecularforces involved in ion complexation, in particular, polariza-tion and dispersion that contribute non-trivially to ion-ligandand ligand-ligand energetics [23, 24, 25]. A consistent first-principles approach is, therefore, essential. Toward that end,we use density-functional theory with the semilocal PBE [26]and hybrid PBE0 [27, 28] exchange-correlation functionals.Since van der Waals (vdW) dispersion can be expected to be amajor contributor to ion–methyl energetics in the 5 A distancerange [29], we describe it explicitly using the recently devel-oped DFT+vdW method [30]. In this method, the C6[n(r)]coefficients of the interatomic interaction term C6[n(r)]/R6
are obtained from the self–consistent electron density n(r).This method yields an accuracy of about 0.3 kcal/mol in com-parison to “gold standard” quantum chemical calculationsfor a wide range of intermolecular interactions in moleculardimers [31]. We find that for our systems, the C6 coefficientsof the ions depend strongly on their local coordination en-vironments and can vary by an order in magnitude, makingtheir explicit electron–density dependent evaluation criticalto the accuracy of energetic and structural properties. Thesecalculations show that the reduction in methyl–induced po-larization of the S4 site associated with T→S substitutions issufficient to explain the aforementioned experimental obser-vations.
Results and DiscussionTo understand how T→S substitutions in the S4 site affect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate changes in enthalpies (∆H) and free energies(∆G) for the substitution reactions
AXn + nX′ AX′n + nX, [1]
in the gas phase. We consider first the set of reactions in whichA represents one of the ions Na+, K+ and Ba2+ and X repre-sents one of the molecules, water (W), methanol (M), ethanol(E), and propanol (P). While these four small molecules allprovide hydroxyl oxygens for ion coordination, they differfrom each other in the number of methyl groups or methylenebridges they carry. Consequently they have different staticdipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3, respectively[3]. Note, however, that these four small molecules all havecomparable gas phase dipole moments of 1.85, 1.70, 1.69 and1.68 Debye, respectively [32]. The PBE0 functional we em-ploy yields similar values for gas phase dipole moments, thatis, 1.91, 1.68, 1.72 and 1.77 Debye, respectively. The resultsof the substitution reaction energy calculations are plotted inFig. 2. The ∆G obtained for the set of reactions involving wa-ter molecules and methanols, for which experimental data areavailable [33, 34, 35], are in quantitative agreement with ex-perimental values (Table S1 of supporting information). Theexplicit inclusion of dispersion improves the performance ofPBE0 significantly.
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
< ⌧zz(t)⌧zz(0) >eq � < ⌧xx(t)⌧xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
✏|| = 0
d�/d✏ = 0.37 GPa
25
kxi � xjkAWn ! AMn
AMn ! AEn
AEn ! APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,?
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: August 7, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di↵erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di↵erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
dro
xylgro
upsalign
wit
hth
eper
mea
tion
path
way
and
inte
ract
wit
hio
ns?
Inth
isw
ork
,w
ein
ves
tigate
thes
eis
sues
usi
ng
quantu
mch
emic
alca
lcula
tions.
The
pri
mary
challen
ge
ass
oci
ate
dw
ith
studyin
gm
ethyl
gro
ups
usi
ng
quantu
mch
emis
try
isth
at
one
nee
ds
toacc
ount
for
the
contr
ibuti
on
from
dis
per
sion,
and
inth
isca
se,
for
syst
ems
conta
inin
gth
ousa
nds
of
elec
-tr
ons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[2
0]and
PB
E0
[21,22]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
3]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
ofm
ethylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Resu
lts
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r(W
),m
ethanol(M
),et
hanol(E
)and
pro
panol
(P).
While
all
thes
em
ole
cule
spro
vid
ehydro
xyl
ligands
for
ion-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hylgro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
er-
ent
stati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of
1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
of
thes
eca
lcula
tions
are
plo
tted
inFig
.2.
We
find
that,
ingen
eral,
com
ple
xst
ability
changes
non-t
rivia
lly
wit
hth
enum
ber
of
met
hyl
gro
ups
inth
eco
ord
inati
ng
ligand.
the
magnit
ude
of
the
incr
ease
dep
ends
non-t
rivia
lly
charg
e/si
zera
tio
ofio
n,th
enum
ber
ofligandsco
ord
inati
ng
the
ion,asw
ellasth
edis
tance
from
the
ion
wher
eth
em
ethylgro
up
isadded
.W
efind
that
as
the
num
ber
of
met
hyl
gro
ups
inth
elig-
ands
incr
ease
,co
mple
xst
ability
enhance
s.W
hile
the
magni-
tude
of
this
e↵ec
tex
pec
tedly
dec
rease
sw
hen
met
hyl
gro
ups
are
intr
oduce
dfu
rther
away
from
the
ion,
itre
main
sin
the
physi
olo
gic
ally
signifi
cant
range
even
when
met
hylgro
ups
are
intr
oduce
dat
dis
tance
sgre
ate
rth
an
6A
(AE
n!
AP
n).
Inaddit
ion,
the
change
inco
mple
xst
ability
als
odep
ends
non-
linea
rly
on
ion-c
oord
inati
on
num
ber
.Such
am
ult
i-body
e↵ec
tem
erges
as
aco
nse
quen
ceofre
puls
ion
bet
wee
nth
eco
ord
inat-
ing
ligands,
and
iscr
itic
alto
the
esti
mati
on
ofre
lati
ve
stabil-
itie
sbet
wee
ntw
oio
ns
[16,17].
Inth
ecr
yst
alst
ruct
ure
of
the
Kcs
A[8
],th
edis
tance
be-
twee
nth
eK
+io
nand
the
met
hylgro
up
ofth
eth
reonin
esi
de
chain
inth
eS4
site
is⇠
5A
.T
his
isw
ithin
the
ion-m
ethyl
dis
tance
sex
am
ined
inFig
.2,w
hic
hsu
gges
tsth
at
the
subst
i-tu
tion
ofth
ism
ethylgro
up
wit
ha
hydro
gen
ato
min
aT!
Sm
uta
tion
can
reduce
the
bin
din
ga�
nit
ies
of
all
thre
eio
ns,
Na+,
K+
and
Ba2+
signifi
cantl
y.H
owev
er,
inth
eS4
site
,th
ese
ions
coord
inate
wit
hei
ght
ligands.
Conse
quen
tly,
the
repuls
ion
bet
wee
nth
eligands
will
be
larg
er,
and
the
over
all
reduct
ion
inbin
din
ga�
nity
could
be
smaller
.To
evalu
ate
Table
2.
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxovir
us
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cter
izati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cteri
zation
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cter
izati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tative
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cteri
zation
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cter
izati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cteri
zation
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
H)an
dfree
ener
gie
s(�
G)due
toin
crea
sein
the
num
ber
ofm
ethyl
gro
ups
inth
eio
n-c
oor
din
atin
glig
ands.�
Han
d
�G
are
estim
ated
forth
esu
bst
itution
reac
tionsA
Xn
+nX
0⌦
AX
0 n+
AX
n,
whic
har
eden
ote
das
AX
n!
AX
0 n.
Inth
ese
reac
tions,
Are
pres
ents
one
of
the
ions,
Na+
,K+
orB
a2+
,an
dX
repr
esen
tsone
ofth
elig
and
mole
cule
s,wat
er(W
),
met
han
ol(M
),et
han
ol(E
)or
propan
ol(P
).A
llen
ergie
sar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tofT!
Ssu
bst
itution
on
elec
tron
den
sities
ofan
isola
ted
S4
site
.T
he
di↵
eren
ces
bet
wee
nth
eel
ectr
onic
den
sities
,⇢
Ba
T4�
4⇢
CH
3�
⇢B
aS4�
4⇢
H,
wer
ees
tim
ated
using
the
PB
E0+
vdW
funct
ional
for
the
optim
um
geo
met
ryof
the
BaT4
com
ple
x.T
he
geo
met
ryof
the
BaS
4co
mple
xwas
const
ruct
edfrom
the
BaT4
optim
ized
geo
met
ryby
repla
cing
the
CH
3gro
up
of
each
thre
onin
ere
sidue
with
anH
atom
,an
dth
enre
laxi
ng
the
coor
din
ates
ofH
atom
s.
Mate
rials
and
Meth
ods
All
geo
met
ric
rela
xations
wer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[24,25],
wher
ewe
em-
plo
yed
the
def
ault
“tight”
stan
dar
ds
regar
din
gbas
isse
tsan
din
tegra
tion
grids
for
all
elem
ents
.For
DFT
(LD
A/G
GA
)ca
lcula
tions
thes
ese
ttin
gs
produce
esse
ntial
lyco
n-
2w
ww
.pnas
.org
/cg
i/doi/
10.1
073/pnas
.0709640104
Footlin
eA
uth
or
dro
xylgro
upsalign
wit
hth
eper
mea
tion
path
way
and
inte
ract
wit
hio
ns?
Inth
isw
ork
,w
ein
ves
tigate
thes
eis
sues
usi
ng
quantu
mch
emic
alca
lcula
tions.
The
pri
mary
challen
ge
ass
oci
ate
dw
ith
studyin
gm
ethyl
gro
ups
usi
ng
quantu
mch
emis
try
isth
at
one
nee
ds
toacc
ount
for
the
contr
ibuti
on
from
dis
per
sion,
and
inth
isca
se,
for
syst
ems
conta
inin
gth
ousa
nds
of
elec
-tr
ons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[2
0]and
PB
E0
[21,22]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
3]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Resu
lts
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r(W
),m
ethanol(M
),et
hanol(E
)and
pro
panol
(P).
While
all
thes
em
ole
cule
spro
vid
ehydro
xyl
ligands
for
ion-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hylgro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
er-
ent
stati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of
1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
of
thes
eca
lcula
tions
are
plo
tted
inFig
.2.
We
find
that,
ingen
eral,
com
ple
xst
ability
changes
non-t
rivia
lly
wit
hth
enum
ber
of
met
hyl
gro
ups
inth
eco
ord
inati
ng
ligand.
the
magnit
ude
of
the
incr
ease
dep
ends
non-t
rivia
lly
charg
e/si
zera
tio
ofio
n,th
enum
ber
ofligandsco
ord
inati
ng
the
ion,asw
ellasth
edis
tance
from
the
ion
wher
eth
em
ethylgro
up
isadded
.W
efind
that
as
the
num
ber
of
met
hyl
gro
ups
inth
elig-
ands
incr
ease
,co
mple
xst
ability
enhance
s.W
hile
the
magni-
tude
of
this
e↵ec
tex
pec
tedly
dec
rease
sw
hen
met
hyl
gro
ups
are
intr
oduce
dfu
rther
away
from
the
ion,
itre
main
sin
the
physi
olo
gic
ally
signifi
cant
range
even
when
met
hylgro
ups
are
intr
oduce
dat
dis
tance
sgre
ate
rth
an
6A
(AE
n!
AP
n).
Inaddit
ion,
the
change
inco
mple
xst
ability
als
odep
ends
non-
linea
rly
on
ion-c
oord
inati
on
num
ber
.Such
am
ult
i-body
e↵ec
tem
erges
as
aco
nse
quen
ceofre
puls
ion
bet
wee
nth
eco
ord
inat-
ing
ligands,
and
iscr
itic
alto
the
esti
mati
on
ofre
lati
ve
stabil-
itie
sbet
wee
ntw
oio
ns
[16,17].
Inth
ecr
yst
alst
ruct
ure
of
the
Kcs
A[8
],th
edis
tance
be-
twee
nth
eK
+io
nand
the
met
hylgro
up
ofth
eth
reonin
esi
de
chain
inth
eS4
site
is⇠
5A
.T
his
isw
ithin
the
ion-m
ethyl
dis
tance
sex
am
ined
inFig
.2,w
hic
hsu
gges
tsth
at
the
subst
i-tu
tion
ofth
ism
ethylgro
up
wit
ha
hydro
gen
ato
min
aT!
Sm
uta
tion
can
reduce
the
bin
din
ga�
nit
ies
of
all
thre
eio
ns,
Na+,
K+
and
Ba2+
signifi
cantl
y.H
owev
er,
inth
eS4
site
,th
ese
ions
coord
inate
wit
hei
ght
ligands.
Conse
quen
tly,
the
repuls
ion
bet
wee
nth
eligands
will
be
larg
er,
and
the
over
all
reduct
ion
inbin
din
ga�
nity
could
be
smaller
.To
evalu
ate
Table
2.
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cte
rizati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cte
rizati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxovir
us
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cte
rizati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
ofm
ethylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cte
rizati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
ofm
ethylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cte
rizati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
ofm
ethylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cte
rizati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
H)an
dfree
ener
gie
s(�
G)due
toin
crea
sein
the
num
ber
ofm
ethyl
gro
ups
inth
eio
n-c
oor
din
atin
glig
ands.�
Han
d
�G
are
estim
ated
forth
esu
bst
itution
reac
tionsA
Xn
+nX
0⌦
AX
0 n+
AX
n,
whic
har
eden
ote
das
AX
n!
AX
0 n.
Inth
ese
reac
tions,
Are
pres
ents
one
of
the
ions,
Na+
,K+
orB
a2+
,an
dX
repr
esen
tsone
ofth
elig
and
mole
cule
s,wat
er(W
),
met
han
ol(M
),et
han
ol(E
)or
propan
ol(P
).A
llen
ergie
sar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tofT!
Ssu
bst
itution
on
elec
tron
den
sities
ofan
isola
ted
S4
site
.T
he
di↵
eren
ces
bet
wee
nth
eel
ectr
onic
den
sities
,⇢
Ba
T4�
4⇢
CH
3�
⇢B
aS4�
4⇢
H,
wer
ees
tim
ated
using
the
PB
E0+
vdW
funct
ional
for
the
optim
um
geo
met
ryof
the
BaT4
com
ple
x.T
he
geo
met
ryof
the
BaS
4co
mple
xwas
const
ruct
edfrom
the
BaT4
optim
ized
geo
met
ryby
repla
cing
the
CH
3gro
up
of
each
thre
onin
ere
sidue
with
anH
atom
,an
dth
enre
laxi
ng
the
coor
din
ates
ofH
atom
s.
Mate
rials
and
Meth
ods
All
geo
met
ric
rela
xations
wer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[24,25],
wher
ewe
em-
plo
yed
the
def
ault
“tight”
stan
dar
ds
regar
din
gbas
isse
tsan
din
tegra
tion
grids
for
all
elem
ents
.For
DFT
(LD
A/G
GA
)ca
lcula
tions
thes
ese
ttin
gs
produce
esse
ntial
lyco
n-
2w
ww
.pnas
.org
/cg
i/doi/
10.1
073/pnas
.0709640104
Footlin
eA
uth
or
< ⌧zz(t)⌧zz(0) >eq � < ⌧xx(t)⌧xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
✏|| = 0
d�/d✏ = 0.37 GPa
25
kxi � xjkAWn ! AMn
AMn ! AEn
AEn ! APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,?
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: August 7, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di↵erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di↵erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< ⌧zz(t)⌧zz(0) >eq � < ⌧xx(t)⌧xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
✏|| = 0
d�/d✏ = 0.37 GPa
25
kxi � xjkAWn ! AMn
AMn ! AEn
AEn ! APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,?
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: August 7, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di↵erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di↵erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (∆H) and free energies (∆G)
due to increase in the number of methyl groups or methylene bridges in the ion-
coordinating ligands. ∆H and ∆G are estimated at 298 K for the substitution
reactions given by Equ. 1, which are denoted as AXn → AX′n. All energies are in
units of kcal/mol.
We find that the stability of an ion complex, measuredin terms of both enthalpy and free energy, increases withthe numbers of methyl groups and methylene bridges in thecoordinating ligand. This enhanced stability cannot be ex-plained by considering the subtle differences in the gas phasedipole moments of the coordinating ligands [23], demonstrat-ing that the enhanced stability is due to the larger ligandpolarizabilities arising from the higher numbers of methylgroups or methylene bridges. The magnitude of the increasein stability, however, depends intricately on the interplay be-tween several factors. While the complex is stabilized lesswith each incremental change in hydrocarbon chain length,the increase in stability is still, for the molecules examined,large enough to be relevant physiologically. The ∆H and∆G associated with the substitution reactions also depend
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
non–linearly on ion coordination number. This non-linearityemerges as a consequence of repulsion between the coordinat-ing ligands [23, 36, 37]. Finally, ∆H and ∆G also depend onthe charge/size ratio of the ion, and once again, we find a non-linear relationship between the charge/size ratio of the ion andthe magnitude of the thermodynamic change. Doubling thecharge/size ratio of the ion does not necessarily imply that thestability of the complex also increases by a factor of two, asmay be the case when induced effects are small. For example,while the ∆G associated with a K+W→ K+M substitution is-1.9 kcal/mol, the ∆G associated with the Ba2+W → Ba2+Msubstitution is -7.9 kcal/mol, which is a 4-fold change insteadof a 2-fold change. This additional non-linear effect also un-derscores the importance of explicit polarization effects in ioncomplexation [23]. In a separate set of calculations, we findthat while polarizable force fields can describe such trends,there can be notable quantitative differences with respect toPBE0+vdW (Figure S1 of supporting information).
In the calculations above, the small molecules have similargas phase dipole moments, but markedly different polarizabil-ities. A ligand with a higher polarizability produces, in gen-eral, a more stable ion complex. Can ligands with similar gasphase dipole moments as well as static dipole polarizabilitiesproduce complexes with markedly different stabilities? Bu-tanol has three different isomers, n-butanol (B), isobutanol(iB) and 2-butanol (2B). While these isomers have compara-ble gas phase dipole moments (∼1.7 Debye) and polarizabili-ties (∼8.9 A3)[3], the differences in their chemical structureswill result in different distributions of methyl groups and/ormethylene bridges around the ion. Consequently, despite theirsimilar polarizabilities, they will respond to the spatially vary-ing field differently. It may also be expected that ion com-plexes comprised of isobutanol or 2-butanol, each of whichcarry one additional methyl group compared to n-butanol intheir branched hydrocarbon chains, will have a denser packingof methyl groups near the central ion and, therefore, higherstabilities compared to n-butanol.
To quantify this, we carry out substitution reactions(Equ. 1) involving these three butanol isomers. The resultsare plotted in Fig. 3. Indeed, we find that the isobutanoland 2-butanol complexes are, in general, more stable thanthe corresponding n-butanol complexes. Similar to the studyabove, ∆H depends non–linearly on ion coordination num-ber; a non–linearity that emerges from ligand-ligand repulsion[23, 36, 37], which increases with ligand number. Therefore,packing increasingly higher numbers of methyl groups aroundthe ion does not necessarily imply that the stability of theion complex will increase. For example, while ∆H = −3.0kcal/mol for the substitution reaction KB2 → K(2B)2, ∆H ∼0 for the substitution reaction KB4 → K(2B)4. In addition,despite the fact that one of the two terminal methyl groupsof 2-butanol is closer to the ion than that of isobutanol, com-plexes comprised of 2-butanols are not necessarily more sta-ble than the corresponding complexes of isobutanols. Thus,in addition to the static dipole polarizabilities of the methylgroups, their distribution around the ion also matter.
We note that the distances of the ions from the closestmethyl groups of 2-butanol (∼5 A) are comparable to thedistances of ions from the branched methyl group of threo-nine in the S4 site of KcsA [7, 11]. In addition, the packingof the branched methyl group of threonine in the S4 site ofKcsA is intermediate between the packing of branched methylgroups in 4-fold complexes comprised of 2-butanols and isobu-tanols (Figure S2 of supporting information). This suggeststhat the reduction in methyl–induced polarization from T→Ssubstitutions can reduce ion stabilities in the S4 sites of K–
channels. However, in the S4 sites of K–channels the ionscoordinate with eight ligands, and not just four hydroxyl lig-ands as studied above. Consequently, the repulsion betweenthe coordinating ligands can be larger in the S4 site, whichcould reduce or wash out the contribution of the threoninemethyl groups to the binding of ions at the S4 site. Further-more, a T→S substitution may also alter the manner in whichthe coordinating groups of the S4 site interact with ions.
ion-coordination number. This non-linearity emerges as aconsequence of repulsion between the coordinating ligands[23, 36, 37]. Finally, �H and �G, also depend on thecharge/size ratio of the ion, and once again, we find a non-linear relationship between the charge/size ratio of the ion andthe magnitude of the thermodynamic change. Doubling thecharge/size ratio of the ion does not necessarily imply that thestability of the complex also increases by a factor of two, asmay be the case when induced e↵ects are small. For example,while the �G associated with a K+W ! K+M substitution is-1.9 kcal/mol, the �G associated with the Ba2+W ! Ba2+Msubstitution is -7.9 kcal/mol, which is a 4-fold change insteadof a 2-fold change. This additional non-linear e↵ect also un-derscores the importance of explicit polarization e↵ects in ioncomplexation [23].
In the calculations above, the small molecules have similargas phase dipole moments, but markedly di↵erent polarizabil-ities, and a ligand with a higher polarizability produces, ingeneral, a more stable ion complex. Can ligands with simi-lar gas phase dipole moments as well as static dipole polariz-abilties produce complexes with markedly di↵erent stabilities?Butanol has three di↵erent isomers, n-butanol (B), isobutanol(iB) and 2-butanol (2B). While these isomers have compara-ble gas phase dipole moments (⇠1.7 Debye) and polarizabili-ties (⇠8.9 A3)[3], the di↵erences in their chemical structureswill result in di↵erent distributions of methyl groups and/ormethylene bridges around the ion. Consequently, despite hav-ing similar polarizabilities, they will respond to the distance–dependent field of the ion di↵erently, and the closer proxim-ity of methyl groups in isobutanol and 2-butanol to the ion,compared to n-butanol, can be expected to increase the sta-bility of the complex. To quantify this e↵ect we carry outsubstitution reactions (Equ. 1) involving these three butanolisomers. The results are plotted in Fig. 3. Indeed, we findthat the isobutanol and 2-butanol complexes are more stablethan the corresponding n-butanol complexes. Furthermore,complexes comprising of 2-butanols are more stable than thecorresponding complexes comprising of isobutanols, as one ofthe two terminal methyl groups of 2-butanol is closer to theion than that of isobutanol. Thus, in addition to the staticdipole polarizabilities of the methyl groups, the distributionof methyl groups around the ion also plays a role in ion com-plex stability. We also note that the distances of the ions fromthe closest methyl groups of 2-butanol (⇠5 A) are compara-ble to the distances of ions from the branched methyl groupof threonine in the S4 site of KcsA [7, 11]. This suggests fur-ther that the reduction in methyl–induced polarization due toT!S substitutions in the S4 sites of K–channels can reducethe binding a�nities of all three ions significantly.
ABn ! A(iB)n
Fig. 3. E↵ect of butanol isoform on ion complexation enthalpies (�H). The
enthalpies are estimated at 298 K for the substitution reactions given by Equ. 1,
which are denoted as AXn ! AX0n. All energies are in units of kcal/mol.
In the S4 sites of K–channels the ions coordinate witheight ligands, and not just four hydroxyl ligands as studiedabove. Consequently, the repulsion between the coordinatingligands will be larger in the S4 site, which could reduce orwash out the contribution of the threonine methyl groups tothe binding of ions at the S4 site. Furthermore, a T!S sub-stitution may also alter the manner in which the coordinatinggroups of the S4 site interact with ions.
To resolve these issues and understand how the threoninemethyl groups contribute to K–channel function, we consideran isolated S4 site composed of threonine residues. In thisrepresentative model, we substitute, in unit increments, thethreonines with serines, that is,
AT4 + nS ⌦ AT4�nSn + nT, [2]
and estimate the changes in free energy in the gas phase (Ta-ble 1). We find that the repulsion between the eight coordi-nating groups is not large enough to counteract the stabiliza-tion provided by the presence of methyl groups. The a�nityof the S4 site for all three ions, Na+, K+ and Ba2+, dimin-ishes, although non-linearly, with each threonine substitution.Furthermore, these substitutions result in only minor config-urational changes that do not alter the overall topology ofthe S4 site (Table S2 of supporting information), which im-plies that the e↵ect of T!S substitutions on configurationalentropy will be minimal, and may be neglected. While wefind that the configurational change is generally higher in thecase of Ba2+ complexes as compared to Na+ and K+ com-plexes, we also note that optimizations of Ba2+ complexesfollowing T!S substitutions, but under heavy atom positionrestraints, result in a higher drop in S4 site a�nity. For ex-ample, a BaT4 ! BaS4 substitution followed by a constrainedoptimization yields a �E = XX kcal/mol, as opposed to a �E= XX kcal/mol obtained after an unconstrained optimization.This indicates that if the K–channel matrix were rigid to dis-allow a structural change, a T!S substitution will result ina higher loss in S4 site a�nity for Ba2+, as compared to theresults in Table 1. Together, this suggests that the loss ina�nity of the S4 site noted in Table 1 is not a result of alteredion–binding topologies, but is primarily due to a reduction inthe methyl–induced polarization of the S4 site.
When all four threonines are replaced by serines, the es-timated drop in the a�nity of the S4 site is the highest forBa2+ (Table 1). The free energy di↵erence of 6.9 kcal/mol isequivalent to a 106-fold drop in binding a�nity, where morethan half of the a�nity drop is due to a loss in dispersionenergy (Table S3 of supporting information). The net drop inBa2+ binding a�nity accounts for the change in half maximalinhibitory concentrations (IC50) of Ba2+ seen in T!S substi-tution experiments [16]. In addition, it also accounts for thedi↵erence between the Kir 2.4 and Kir 2.1 experimental IC50
values of Ba2+ [20, 21], as Kir 2.4 carries a serine and Kir 2.1a threonine in the S4 site.
The computed drop in stability is, however, greater thanthat inferred from experiments (by over 3 kcal/mol). This dis-crepancy does not result from the pairwise approximation [31]employed in the estimation of the dispersion energy. Whilemany-body dispersion terms can contribute significantly toligand binding [38], we find their contribution to be only0.3 kcal/mol to the BaT4 ! BaS4 substitution reaction (Ta-ble S4 of supporting information). We also do not expectthis discrepancy to be due to the missing structural restraintsfrom the protein matrix on the isolated S4 site, as the T!Ssubstitutions are accompanied by only minor configurationalchanges (Table S2 of supporting information). The overesti-mated drop in Ba2+ binding a�nity is most likely because ourcalculations lack a polarization coupling between the S4 siteand its external environment. Fig. 4 shows that the polar-ization in the electron density of the threonine methyl groupsis along the electric field of the Ba2+ ion, which maximizesthe contribution of methyl polarization to Ba2+ binding. Thepresence of other polar chemical moieties proximal to the S4site, including water molecules, can reduce the contributionof polarization and thereby the overall e↵ect of T!S substi-tutions on Ba2+ binding. Thermodynamically, this reduction
Footline Author PNAS Issue Date Volume Issue Number 3
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
dro
xylgro
upsalign
wit
hth
eper
mea
tion
path
way
and
inte
ract
wit
hio
ns?
Inth
isw
ork
,w
ein
ves
tigate
thes
eis
sues
usi
ng
quantu
mch
emic
alca
lcula
tions.
The
pri
mary
challen
ge
ass
oci
ate
dw
ith
studyin
gm
ethyl
gro
ups
usi
ng
quantu
mch
emis
try
isth
at
one
nee
ds
toacc
ount
for
the
contr
ibuti
on
from
dis
per
sion,
and
inth
isca
se,
for
syst
ems
conta
inin
gth
ousa
nds
of
elec
-tr
ons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[2
0]and
PB
E0
[21,22]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
3]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Resu
lts
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r(W
),m
ethanol(M
),et
hanol(E
)and
pro
panol
(P).
While
all
thes
em
ole
cule
spro
vid
ehydro
xyl
ligands
for
ion-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hylgro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
er-
ent
stati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of
1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
of
thes
eca
lcula
tions
are
plo
tted
inFig
.2.
We
find
that,
ingen
eral,
com
ple
xst
ability
changes
non-t
rivia
lly
wit
hth
enum
ber
of
met
hyl
gro
ups
inth
eco
ord
inati
ng
ligand.
the
magnit
ude
of
the
incr
ease
dep
ends
non-t
rivia
lly
charg
e/si
zera
tio
ofio
n,th
enum
ber
ofligandsco
ord
inati
ng
the
ion,asw
ellasth
edis
tance
from
the
ion
wher
eth
em
ethylgro
up
isadded
.W
efind
that
as
the
num
ber
of
met
hyl
gro
ups
inth
elig-
ands
incr
ease
,co
mple
xst
ability
enhance
s.W
hile
the
magni-
tude
of
this
e↵ec
tex
pec
tedly
dec
rease
sw
hen
met
hyl
gro
ups
are
intr
oduce
dfu
rther
away
from
the
ion,
itre
main
sin
the
physi
olo
gic
ally
signifi
cant
range
even
when
met
hylgro
ups
are
intr
oduce
dat
dis
tance
sgre
ate
rth
an
6A
(AE
n!
AP
n).
Inaddit
ion,
the
change
inco
mple
xst
ability
als
odep
ends
non-
linea
rly
on
ion-c
oord
inati
on
num
ber
.Such
am
ult
i-body
e↵ec
tem
erges
as
aco
nse
quen
ceofre
puls
ion
bet
wee
nth
eco
ord
inat-
ing
ligands,
and
iscr
itic
alto
the
esti
mati
on
ofre
lati
ve
stabil-
itie
sbet
wee
ntw
oio
ns
[16,17].
Inth
ecr
yst
alst
ruct
ure
of
the
Kcs
A[8
],th
edis
tance
be-
twee
nth
eK
+io
nand
the
met
hylgro
up
ofth
eth
reonin
esi
de
chain
inth
eS4
site
is⇠
5A
.T
his
isw
ithin
the
ion-m
ethyl
dis
tance
sex
am
ined
inFig
.2,w
hic
hsu
gges
tsth
at
the
subst
i-tu
tion
ofth
ism
ethylgro
up
wit
ha
hydro
gen
ato
min
aT!
Sm
uta
tion
can
reduce
the
bin
din
ga�
nit
ies
of
all
thre
eio
ns,
Na+,
K+
and
Ba2+
signifi
cantl
y.H
owev
er,
inth
eS4
site
,th
ese
ions
coord
inate
wit
hei
ght
ligands.
Conse
quen
tly,
the
repuls
ion
bet
wee
nth
eligands
will
be
larg
er,
and
the
over
all
reduct
ion
inbin
din
ga�
nity
could
be
smaller
.To
evalu
ate
Table
2.
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tative
Chara
cte
rizati
on
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxovir
us
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
H)an
dfree
ener
gie
s(�
G)due
toin
crea
sein
the
num
ber
ofm
ethyl
gro
ups
inth
eio
n-c
oor
din
atin
glig
ands.�
Han
d
�G
are
estim
ated
forth
esu
bst
itution
reac
tionsA
Xn
+nX
0⌦
AX
0 n+
AX
n,
whic
har
eden
ote
das
AX
n!
AX
0 n.
Inth
ese
reac
tions,
Are
pres
ents
one
of
the
ions,
Na+
,K+
orB
a2+
,an
dX
repr
esen
tsone
ofth
elig
and
mole
cule
s,wat
er(W
),
met
han
ol(M
),et
han
ol(E
)or
propan
ol(P
).A
llen
ergie
sar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tofT!
Ssu
bst
itution
on
elec
tron
den
sities
ofan
isola
ted
S4
site
.T
he
di↵
eren
ces
bet
wee
nth
eel
ectr
onic
den
sities
,⇢
Ba
T4�
4⇢
CH
3�
⇢B
aS4�
4⇢
H,
wer
ees
tim
ated
using
the
PB
E0+
vdW
funct
ional
for
the
optim
um
geo
met
ryof
the
BaT4
com
ple
x.T
he
geo
met
ryof
the
BaS
4co
mple
xwas
const
ruct
edfrom
the
BaT4
optim
ized
geo
met
ryby
repla
cing
the
CH
3gro
up
of
each
thre
onin
ere
sidue
with
anH
atom
,an
dth
enre
laxi
ng
the
coor
din
ates
ofH
atom
s.
Mate
rials
and
Meth
ods
All
geo
met
ric
rela
xations
wer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[24,25],
wher
ewe
em-
plo
yed
the
def
ault
“tight”
stan
dar
ds
regar
din
gbas
isse
tsan
din
tegra
tion
grids
for
all
elem
ents
.For
DFT
(LD
A/G
GA
)ca
lcula
tions
thes
ese
ttin
gs
produce
esse
ntial
lyco
n-
2w
ww
.pnas
.org
/cg
i/doi/
10.1
073/pnas
.0709640104
Footlin
eA
uth
or
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
ion-coordination number. This non-linearity emerges as aconsequence of repulsion between the coordinating ligands[23, 36, 37]. Finally, �H and �G, also depend on thecharge/size ratio of the ion, and once again, we find a non-linear relationship between the charge/size ratio of the ion andthe magnitude of the thermodynamic change. Doubling thecharge/size ratio of the ion does not necessarily imply that thestability of the complex also increases by a factor of two, asmay be the case when induced e↵ects are small. For example,while the �G associated with a K+W ! K+M substitution is-1.9 kcal/mol, the �G associated with the Ba2+W ! Ba2+Msubstitution is -7.9 kcal/mol, which is a 4-fold change insteadof a 2-fold change. This additional non-linear e↵ect also un-derscores the importance of explicit polarization e↵ects in ioncomplexation [23].
In the calculations above, the small molecules have similargas phase dipole moments, but markedly di↵erent polarizabil-ities, and a ligand with a higher polarizability produces, ingeneral, a more stable ion complex. Can ligands with simi-lar gas phase dipole moments as well as static dipole polariz-abilties produce complexes with markedly di↵erent stabilities?Butanol has three di↵erent isomers, n-butanol (B), isobutanol(iB) and 2-butanol (2B). While these isomers have compara-ble gas phase dipole moments (⇠1.7 Debye) and polarizabili-ties (⇠8.9 A3)[3], the di↵erences in their chemical structureswill result in di↵erent distributions of methyl groups and/ormethylene bridges around the ion. Consequently, despite hav-ing similar polarizabilities, they will respond to the distance–dependent field of the ion di↵erently, and the closer proxim-ity of methyl groups in isobutanol and 2-butanol to the ion,compared to n-butanol, can be expected to increase the sta-bility of the complex. To quantify this e↵ect we carry outsubstitution reactions (Equ. 1) involving these three butanolisomers. The results are plotted in Fig. 3. Indeed, we findthat the isobutanol and 2-butanol complexes are more stablethan the corresponding n-butanol complexes. Furthermore,complexes comprising of 2-butanols are more stable than thecorresponding complexes comprising of isobutanols, as one ofthe two terminal methyl groups of 2-butanol is closer to theion than that of isobutanol. Thus, in addition to the staticdipole polarizabilities of the methyl groups, the distributionof methyl groups around the ion also plays a role in ion com-plex stability. We also note that the distances of the ions fromthe closest methyl groups of 2-butanol (⇠5 A) are compara-ble to the distances of ions from the branched methyl groupof threonine in the S4 site of KcsA [7, 11]. This suggests fur-ther that the reduction in methyl–induced polarization due toT!S substitutions in the S4 sites of K–channels can reducethe binding a�nities of all three ions significantly.
ABn ! A(2B)n
Fig. 3. E↵ect of butanol isoform on ion complexation enthalpies (�H). The
enthalpies are estimated at 298 K for the substitution reactions given by Equ. 1,
which are denoted as AXn ! AX0n. All energies are in units of kcal/mol.
In the S4 sites of K–channels the ions coordinate witheight ligands, and not just four hydroxyl ligands as studiedabove. Consequently, the repulsion between the coordinatingligands will be larger in the S4 site, which could reduce orwash out the contribution of the threonine methyl groups tothe binding of ions at the S4 site. Furthermore, a T!S sub-stitution may also alter the manner in which the coordinatinggroups of the S4 site interact with ions.
To resolve these issues and understand how the threoninemethyl groups contribute to K–channel function, we consideran isolated S4 site composed of threonine residues. In thisrepresentative model, we substitute, in unit increments, thethreonines with serines, that is,
AT4 + nS ⌦ AT4�nSn + nT, [2]
and estimate the changes in free energy in the gas phase (Ta-ble 1). We find that the repulsion between the eight coordi-nating groups is not large enough to counteract the stabiliza-tion provided by the presence of methyl groups. The a�nityof the S4 site for all three ions, Na+, K+ and Ba2+, dimin-ishes, although non-linearly, with each threonine substitution.Furthermore, these substitutions result in only minor config-urational changes that do not alter the overall topology ofthe S4 site (Table S2 of supporting information), which im-plies that the e↵ect of T!S substitutions on configurationalentropy will be minimal, and may be neglected. While wefind that the configurational change is generally higher in thecase of Ba2+ complexes as compared to Na+ and K+ com-plexes, we also note that optimizations of Ba2+ complexesfollowing T!S substitutions, but under heavy atom positionrestraints, result in a higher drop in S4 site a�nity. For ex-ample, a BaT4 ! BaS4 substitution followed by a constrainedoptimization yields a �E = XX kcal/mol, as opposed to a �E= XX kcal/mol obtained after an unconstrained optimization.This indicates that if the K–channel matrix were rigid to dis-allow a structural change, a T!S substitution will result ina higher loss in S4 site a�nity for Ba2+, as compared to theresults in Table 1. Together, this suggests that the loss ina�nity of the S4 site noted in Table 1 is not a result of alteredion–binding topologies, but is primarily due to a reduction inthe methyl–induced polarization of the S4 site.
When all four threonines are replaced by serines, the es-timated drop in the a�nity of the S4 site is the highest forBa2+ (Table 1). The free energy di↵erence of 6.9 kcal/mol isequivalent to a 106-fold drop in binding a�nity, where morethan half of the a�nity drop is due to a loss in dispersionenergy (Table S3 of supporting information). The net drop inBa2+ binding a�nity accounts for the change in half maximalinhibitory concentrations (IC50) of Ba2+ seen in T!S substi-tution experiments [16]. In addition, it also accounts for thedi↵erence between the Kir 2.4 and Kir 2.1 experimental IC50
values of Ba2+ [20, 21], as Kir 2.4 carries a serine and Kir 2.1a threonine in the S4 site.
The computed drop in stability is, however, greater thanthat inferred from experiments (by over 3 kcal/mol). This dis-crepancy does not result from the pairwise approximation [31]employed in the estimation of the dispersion energy. Whilemany-body dispersion terms can contribute significantly toligand binding [38], we find their contribution to be only0.3 kcal/mol to the BaT4 ! BaS4 substitution reaction (Ta-ble S4 of supporting information). We also do not expectthis discrepancy to be due to the missing structural restraintsfrom the protein matrix on the isolated S4 site, as the T!Ssubstitutions are accompanied by only minor configurationalchanges (Table S2 of supporting information). The overesti-mated drop in Ba2+ binding a�nity is most likely because ourcalculations lack a polarization coupling between the S4 siteand its external environment. Fig. 4 shows that the polar-ization in the electron density of the threonine methyl groupsis along the electric field of the Ba2+ ion, which maximizesthe contribution of methyl polarization to Ba2+ binding. Thepresence of other polar chemical moieties proximal to the S4site, including water molecules, can reduce the contributionof polarization and thereby the overall e↵ect of T!S substi-tutions on Ba2+ binding. Thermodynamically, this reduction
Footline Author PNAS Issue Date Volume Issue Number 3
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
non–linearly on ion coordination number. This non-linearityemerges as a consequence of repulsion between the coordinat-ing ligands [23, 36, 37]. Finally, �H and �G also depend onthe charge/size ratio of the ion, and once again, we find a non-linear relationship between the charge/size ratio of the ion andthe magnitude of the thermodynamic change. Doubling thecharge/size ratio of the ion does not necessarily imply that thestability of the complex also increases by a factor of two, asmay be the case when induced e↵ects are small. For example,while the �G associated with a K+W ! K+M substitution is-1.9 kcal/mol, the �G associated with the Ba2+W ! Ba2+Msubstitution is -7.9 kcal/mol, which is a 4-fold change insteadof a 2-fold change. This additional non-linear e↵ect also un-derscores the importance of explicit polarization e↵ects in ioncomplexation [23].
In the calculations above, the small molecules have similargas phase dipole moments, but markedly di↵erent polarizabil-ities. A ligand with a higher polarizability produces, in gen-eral, a more stable ion complex. Can ligands with similar gasphase dipole moments as well as static dipole polarizabiltiesproduce complexes with markedly di↵erent stabilities? Bu-tanol has three di↵erent isomers, n-butanol (B), isobutanol(iB) and 2-butanol (2B). While these isomers have compara-ble gas phase dipole moments (⇠1.7 Debye) and polarizabili-ties (⇠8.9 A3)[3], the di↵erences in their chemical structureswill result in di↵erent distributions of methyl groups and/ormethylene bridges around the ion. Consequently, despite theirsimilar polarizabilities, they will respond to the spatially vary-ing field di↵erently. It may also be expected that the closerproximity of the ion to the methyl groups in isobutanol and 2-butanol, compared to n-butanol, will result in higher complexstabilities.
To quantify this, we carry out substitution reactions(Equ. 1) involving these three butanol isomers. The resultsare plotted in Fig. 3. Indeed, we find that the isobutanoland 2-butanol complexes are, in general, more stable thanthe corresponding n-butanol complexes. Similar to the studyabove, �H depends non–linearly on ion coordination num-ber; a non–linearity that emerges from ligand-ligand repulsion[23, 36, 37], which increases with ligand number. Therefore,packing increasingly higher numbers of methyl groups aroundthe ion does not necessarily imply that the stability of the ioncomplex will increase. For example, while �H = �3 kcal/molfor the substitution reaction K(B)2 ! K(2B)2, �H ⇠ 0 forthe substitution reaction K(B)4 ! K(2B)4. In addition, de-spite the fact that one of the two terminal methyl groups of2-butanol is closer to the ion than that of isobutanol, com-plexes comprised of 2-butanols are not necessarily more sta-ble than the corresponding complexes of isobutanols. Thus,in addition to the static dipole polarizabilities of the methylgroups, their distribution around the ion also matter.
We note that while the distances of the ions from the clos-est methyl groups of 2-butanol (⇠5 A) are comparable to thedistances of ions from the branched methyl group of threoninein the S4 site of KcsA [7, 11], the packing of methyl groupsin the 4-fold 2-butanol complexes (tetrahedral geometry) isdenser than the that in the S4 site of KcsA (Fig. 1). This sug-gests that the reduction in methyl–induced polarization fromT!S substations will reduce ion stabilities in the S4 sites ofK–channels. However, in the S4 sites of K–channels the ionscoordinate with eight ligands, and not just four hydroxyl lig-ands as studied above. Consequently, the repulsion betweenthe coordinating ligands can be larger in the S4 site, whichcould reduce or wash out the contribution of the threoninemethyl groups to the binding of ions at the S4 site. Further-
more, a T!S substitution may also alter the manner in whichthe coordinating groups of the S4 site interact with ions.
-12
-9
-6
-3
0
3
1 2 3 4 1 2 3 4
dro
xylgro
upsalign
wit
hth
eper
mea
tion
path
way
and
inte
ract
wit
hio
ns?
Inth
isw
ork
,w
ein
ves
tigate
thes
eis
sues
usi
ng
quantu
mch
emic
alca
lcula
tions.
The
pri
mary
challen
ge
ass
oci
ate
dw
ith
studyin
gm
ethyl
gro
ups
usi
ng
quantu
mch
emis
try
isth
at
one
nee
ds
toacc
ount
for
the
contr
ibuti
on
from
dis
per
sion,
and
inth
isca
se,
for
syst
ems
conta
inin
gth
ousa
nds
of
elec
-tr
ons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[2
0]and
PB
E0
[21,22]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
3]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
of
met
hylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r(W
),m
ethanol(M
),et
hanol(E
)and
pro
panol
(P).
While
all
thes
em
ole
cule
spro
vid
ehydro
xyl
ligands
for
ion-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hylgro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
er-
ent
stati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of
1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
of
thes
eca
lcula
tions
are
plo
tted
inFig
.2.
We
find
that,
ingen
eral,
com
ple
xst
ability
changes
non-t
rivia
lly
wit
hth
enum
ber
of
met
hyl
gro
ups
inth
eco
ord
inati
ng
ligand.
the
magnit
ude
of
the
incr
ease
dep
ends
non-t
rivia
lly
charg
e/si
zera
tio
ofio
n,th
enum
ber
ofligandsco
ord
inati
ng
the
ion,asw
ellasth
edis
tance
from
the
ion
wher
eth
em
ethylgro
up
isadded
.W
efind
that
as
the
num
ber
of
met
hyl
gro
ups
inth
elig-
ands
incr
ease
,co
mple
xst
ability
enhance
s.W
hile
the
magni-
tude
of
this
e↵ec
tex
pec
tedly
dec
rease
sw
hen
met
hyl
gro
ups
are
intr
oduce
dfu
rther
away
from
the
ion,
itre
main
sin
the
physi
olo
gic
ally
signifi
cant
range
even
when
met
hylgro
ups
are
intr
oduce
dat
dis
tance
sgre
ate
rth
an
6A
(AE
n!
AP
n).
Inaddit
ion,
the
change
inco
mple
xst
ability
als
odep
ends
non-
linea
rly
on
ion-c
oord
inati
on
num
ber
.Such
am
ult
i-body
e↵ec
tem
erges
as
aco
nse
quen
ceofre
puls
ion
bet
wee
nth
eco
ord
inat-
ing
ligands,
and
iscr
itic
alto
the
esti
mati
on
ofre
lati
ve
stabil-
itie
sbet
wee
ntw
oio
ns
[16,17].
Inth
ecr
yst
alst
ruct
ure
of
the
Kcs
A[8
],th
edis
tance
be-
twee
nth
eK
+io
nand
the
met
hylgro
up
ofth
eth
reonin
esi
de
chain
inth
eS4
site
is⇠
5A
.T
his
isw
ithin
the
ion-m
ethyl
dis
tance
sex
am
ined
inFig
.2,w
hic
hsu
gges
tsth
at
the
subst
i-tu
tion
ofth
ism
ethylgro
up
wit
ha
hydro
gen
ato
min
aT!
Sm
uta
tion
can
reduce
the
bin
din
ga�
nit
ies
of
all
thre
eio
ns,
Na+,
K+
and
Ba2+
signifi
cantl
y.H
owev
er,
inth
eS4
site
,th
ese
ions
coord
inate
wit
hei
ght
ligands.
Conse
quen
tly,
the
repuls
ion
bet
wee
nth
eligands
will
be
larg
er,
and
the
over
all
reduct
ion
inbin
din
ga�
nity
could
be
smaller
.To
evalu
ate
Table
2.
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quanti
tati
ve
Chara
cteri
zati
on
ofIn
trin
sic
Mole
cula
rM
oti
on
Usi
ng
Support
Vect
or
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxovir
us
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cteri
zation
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vect
or
Mach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
trons-ALEX/MARIANA-PLEASEELABORATE...Weem-ploytherecentlydeveloped...Weusedensity-functionaltheorywithinthePBE[18]andPBE0[19,20]approximationsfortheexchange-correlationpotential.Inaddition,weemployare-centlydevelopedvanderWaals(vdW)correctionscheme[21]basedonasummationofpairwiseC6[n]/R
6terms,wherethe
C6coe�cientsarebasedontheself-consistentelectronicden-sity.Thisapproachallowsustotreataccuratelytheenergeticsande↵ectsofpolarizationinoursystems.
WefindthataT!Ssubstitutionisaccompaniedwithonlyminorstructuralchanges.Inaddition,wefindthatthereduc-tioninpolarizabilityofS4associatedwithaT!Ssubstitu-tionissu�cienttoexplainexperimentalobservations.Thesestudiesprovide,ingeneral,avividexampleofhowpeptidefunctionisinfluencedbythepolarizabilityofmethylgroups,especiallythosethatareproximaltochargedmoieties,includ-ingions,titratableside-acidsandphosphates.
ResultsandDiscussionTounderstandhowT!SsubstitutionsintheS4sitea↵ectK–channelfunction,weexaminefirstthegeneralconsequencesofmethylpolarizabilityonthethermodynamicsofionbind-ing.WeestimatethegasphaseenthalpiesandfreeenergiesofNa
+,K
+andBa
2+ionbindingtofourdi↵erentmolecules,
namelywater,methanol,ethanolandpropanol.Whileallthesemoleculesprovidehydroxylligandsforion-coordination,theydi↵erfromeachotherinthenumberofmethylgroupstheycarry.Consequentlytheyhavedi↵erentstaticaverageground-statedipolepolarizabilitiesof1.5,3.3,5.4and6.7A
3,
respectively[15].TheresultsofthesecalculationsareplottedinFig.2.Wefindthatwhiletheadditionofmethylgroupsenhancethestabilityofthecomplex,thee↵ectdecreaseswitheachincrementaladdition.Inaddition,multibodye↵ectsarethee↵ectofcoordinationnumberisnon-trivial.
Table2.distancebetweenmethylandKioninthes4siteisabout5A
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
-18-15-12-9-6-303
123412341234
-20
-16
-12
-8
-4
0
123412341234
K+
Ba2+
Na+ ΔHΔG
nnn
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
<�zz(t)�zz(0)>eq�<�xx(t)�xx(0)>eq
{a,b,c}k(xi,xj)=xi·xj
2nd
�||=0
d�/d�=0.37GPa
25
�xi�xj�AWn�AMn
AMn�AEn
AEn�APn
QuantitativeCharacterizationofIntrinsicMolecularMotionUsingSupportVectorMachines
RalphE.Leighty1
andSameerVarma1,2,3,�
1DepartmentofCellBiology,MicrobiologyandMolecularBiology,
2DepartmentofPhysics,UniversityofSouthFlorida,Tampa,FL33620,USAand
3InstituteofPureandAppliedMathematics,UniversityofCaliforniaLosAngeles,LosAngeles,CA90095,USA
(Dated:July30,2012DRAFT)
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,canbealteredbynumerousenvironmentalfactors,suchastemperature,andalsobythebindingofothermolecules.Anunderstandingofensemble-leveldi↵erencesfromageometricperspectiveisimportantbecauseitprovidesabasisforrelatingthermodynamicchangestochangesinmolecularmotion.Thetaskofdi↵erentiatingtwoconfigurationalensemblesis,however,challengingbecauseitrequirescomparingtwohigh-dimensionaldatasets.Traditionally,whenanalyzingmolecularsimulations,thisproblemiscircumventedbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingsummarystatisticsfromthetwoensemblesagainsteachother.However,sincedimensionalityreductioniscarriedoutpriortocomparisonofensembles,suchstrategiesaresusceptibletoartifactualbiasesfrominformationloss.Hereweintroduceamethodbasedonsupportvectormachines(SVM)thatcompares3-densemblesdirectlyagainstoneanotherintheHilbertspacewithoutaprerequisitereductioninphasespace.Whilethismethodcanbeappliedtoanymolecularsystem,hereweexploreitssensitivitytowardmodelsystemscomprisedofaminoacids.WealsoapplythetechniquetoidentifythespecificregionsofaparamyxovirusGproteinthatarea↵ectedbythebindingofitspreferredhumanreceptor,EphrinB2.Wefindthatthespecificregionsidentifiedbythismethodincludethesetofaminoacidsthatareknownfromexperimentalstudiestoplayavitalroleinviralfusion.Theconfigurationalensemblesoftheviralproteininbothitsboundandunboundstatesweregeneratedusingall-atommoleculardynamicssimulations.
INTRODUCTION
Theensembleof3-dconfigurationsexhibitedbyamolecule,thatis,itsintrinsicmotion,iscorrelatedwiththepropertiesofitsenvironment[1–12].Changesinintensivevariables,suchastemperatureandpressure,modifytheintrinsicmotionofamolecule.Additionally,molecularmotionalsochangesasaresultofbindingwithothermolecules,suchasinligand-substratecomplexesormolecularassemblies.Moreover,theextentofthechangeinmolecularmotionisdependentuponmultiplefactors,includingpropertiesofthemoleculeandthenatureoftheperturbationorexternalpotential.Aquantitativecharacterizationofsuchchangesinmolecularmotionisimportantfrombothscientificaswellasengineeringper-spectivesbecauseitprovidesabasistoassociatechanges
inthermodynamicpropertiesdirectlywithcorrespondingchangesinmolecularmotion.
Di�erentiatingbetweentwoensemblesofmolecularconfigurationsis,however,challenging.Thechallengeliesincomparingtwosetsofhigh-dimensiondata.Forthemotionofan-particlemoleculerepresentedbym-configurations,{an(x)}
m1,thetaskofdi�erentiatingit
fromthemolecule’sreferencestate,{an(x0)}m1,involves
comparingtwo3n-dimensionalvectorspaces.Tradition-ally,whenanalyzingmolecularsimulations,thisprob-lemisdealtwithbyfirstreducingthedimensionsofthetwoensemblesseparately,andthencomparingtheresult-ingsummarystatisticsfromthetwoensemblesagainsteachother[13].Dimensionalityreductioniscarriedout,forexample,byaveragingovern-spacethatinvolvesparticle-clusteringoraveragingoverm-spacethatyields
Fig.2.Changesinioncomplexationenthalpies(��H)andfreeenergies
(��G)duetomethylgroups.�Hand�Gwereestimatedseparatelyforre-
actionsA+nX⌦AXninwhichAisanion,Na+,K+orBa2+,andXis
aligandmolecule,water(W),methanol(M),ethanol(E)orpropanol(P).The
reactionswerethencombinedtoobtain��Hand��G.Allvaluesarereported
inkcal/mol.
Fig.3.E↵ectofT!Ssubstitutiononelectrondensities.Di↵erencesbetweenthe
electronicdensitiesof(BaT++4�4CH3)�(BaS
++4�4H),calculatedwith
PBE0+vdW,areshown.Thegeometryusedwasthatofthefullyrelaxed(PBE+vdW)
BaT++4complex.TobuildBaS
++4theCH3groupsofTweresubstitutedby
Hatomsandonlythesewererelaxed.Thedi↵erenceinelectrondensityshownthus
correspondstothee↵ectofpolarizationontheelectronicdensitiescausedbythe
additionofmethylgroups.
MaterialsandMethods
Allgeometricrelaxationswereperformedwiththeall-electron,localizedbasis
(numericatom-centeredorbitals)programpackageFHI-aims[22,23],whereweem-
2www.pnas.org/cgi/doi/10.1073/pnas.0709640104FootlineAuthor
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
ofm
ethylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mat
eria
lsan
dM
ethods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
ofm
ethylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mat
eria
lsan
dM
ethods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons-A
LE
X/M
AR
IAN
A-P
LE
ASE
ELA
BO
RA
TE
...W
eem
-plo
yth
ere
centl
ydev
eloped
...W
euse
den
sity
-funct
ionalth
eory
wit
hin
the
PB
E[1
8]and
PB
E0
[19,20]appro
xim
ati
onsfo
rth
eex
change-
corr
elati
on
pote
nti
al.
Inaddit
ion,
we
emplo
ya
re-
centl
ydev
eloped
van
der
Waals
(vdW
)co
rrec
tion
schem
e[2
1]
base
don
asu
mm
ati
on
ofpair
wis
eC
6[n
]/R
6te
rms,
wher
eth
eC
6co
e�ci
ents
are
base
don
the
self-c
onsi
sten
tel
ectr
onic
den
-si
ty.
This
appro
ach
allow
susto
trea
tacc
ura
tely
the
ener
get
ics
and
e↵ec
tsofpola
riza
tion
inour
syst
ems.
We
find
thata
T!
Ssu
bst
ituti
on
isacc
om
panie
dw
ith
only
min
or
stru
ctura
lch
anges
.In
addit
ion,w
efind
that
the
reduc-
tion
inpola
riza
bility
of
S4
ass
oci
ate
dw
ith
aT!
Ssu
bst
itu-
tion
issu�
cien
tto
expla
inex
per
imen
tal
obse
rvati
ons.
Thes
est
udie
spro
vid
e,in
gen
eral,
aviv
idex
am
ple
of
how
pep
tide
funct
ion
isin
fluen
ced
by
the
pola
riza
bility
ofm
ethylgro
ups,
espec
ially
those
thatare
pro
xim
alto
charg
edm
oie
ties
,in
clud-
ing
ions,
titr
ata
ble
side-
aci
ds
and
phosp
hate
s.
Res
ults
and
Discu
ssio
nTo
under
stand
how
T!
Ssu
bst
ituti
onsin
the
S4
site
a↵ec
tK
–ch
annel
funct
ion,
we
exam
ine
firs
tth
egen
eral
conse
quen
ces
of
met
hylpola
riza
bility
on
the
ther
modynam
ics
of
ion
bin
d-
ing.
We
esti
mate
the
gas
phase
enth
alp
ies
and
free
ener
gie
sofN
a+,K
+and
Ba2+
ion
bin
din
gto
four
di↵
eren
tm
ole
cule
s,nam
ely
wate
r,m
ethanol,
ethanol
and
pro
panol.
While
all
thes
em
ole
cule
spro
vid
ehydro
xylligandsfo
rio
n-c
oord
inati
on,
they
di↵
erfr
om
each
oth
erin
the
num
ber
of
met
hyl
gro
ups
they
carr
y.C
onse
quen
tly
they
hav
edi↵
eren
tst
ati
cav
erage
gro
und-s
tate
dip
ole
pola
riza
bilit
ies
of1.5
,3.3
,5.4
and
6.7
A3,
resp
ecti
vel
y[1
5].
The
resu
lts
ofth
ese
calc
ula
tions
are
plo
tted
inFig
.2.
We
find
that
while
the
addit
ion
of
met
hylgro
ups
enhance
the
stability
ofth
eco
mple
x,th
ee↵
ect
dec
rease
sw
ith
each
incr
emen
taladdit
ion.
Inaddit
ion,m
ult
ibody
e↵ec
tsare
the
e↵ec
tofco
ord
inati
on
num
ber
isnon-t
rivia
l.
Table
2.
dis
tance
bet
wee
nm
ethyland
Kio
nin
the
s4si
teis
about
5A
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
-18
-15
-12-9-6-303
12
34
12
34
12
34
-20
-16
-12-8-40
12
34
12
34
12
34
K+
Ba2+
Na+
ΔH ΔG
nn
n
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2n
d
� ||=
0
d�/d�
=0.3
7G
Pa
25 �x
i�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cte
rization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vecto
rM
ach
ines
Ralp
hE
.Lei
ghty
1and
Sam
eer
Varm
a1,2
,3,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biolo
gyand
Molecu
lar
Bio
logy
,2D
epart
men
tof
Physi
cs,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Date
d:
July
30,2012
DR
AFT
)
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,
its
intr
insi
cm
oti
on,
can
be
alt
ered
by
num
erous
envir
onm
enta
lfa
ctors
,su
chas
tem
per
atu
re,and
als
oby
the
bin
din
gofoth
erm
ole
cule
s.A
nunder
standin
gofen
sem
ble
-lev
eldi↵
eren
cesfr
om
ageo
met
ric
per
spec
tive
isim
port
ant
bec
ause
itpro
vid
esa
basi
sfo
rre
lati
ng
ther
modynam
icch
anges
toch
anges
inm
ole
cula
rm
oti
on.
The
task
ofdi↵
eren
tiati
ng
two
configura
tionalen
sem
ble
sis
,how
ever
,ch
allen
gin
gbec
ause
itre
quir
esco
mpari
ng
two
hig
h-d
imen
sional
data
sets
.Tra
dit
ionally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
ble
mis
circ
um
ven
ted
by
firs
tre
duci
ng
the
dim
ensi
ons
of
the
two
ense
mble
sse
para
tely
,and
then
com
pari
ng
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er.
How
ever
,si
nce
dim
ensi
onality
reduct
ion
isca
rrie
dout
pri
or
toco
mpari
son
of
ense
mble
s,su
chst
rate
gie
sare
susc
epti
ble
toart
ifact
ualbia
sesfr
om
info
rmati
on
loss
.H
ere
we
intr
oduce
am
ethod
base
don
support
vec
tor
mach
ines
(SV
M)
that
com
pare
s3-d
ense
mble
sdir
ectl
yagain
stone
anoth
erin
the
Hilber
tsp
ace
wit
hout
apre
requis
ite
reduct
ion
inphase
space
.W
hile
this
met
hod
can
be
applied
toany
mole
cula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
of
am
ino
aci
ds.
We
als
oapply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gio
ns
ofa
para
myxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gofit
spre
ferr
edhum
an
rece
pto
r,E
phri
nB
2.
We
find
thatth
esp
ecifi
cre
gio
ns
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
aci
ds
that
are
know
nfr
om
exper
imen
talst
udie
sto
pla
ya
vit
al
role
invir
al
fusi
on.
The
configura
tional
ense
mble
sof
the
vir
al
pro
tein
inboth
its
bound
and
unbound
state
sw
ere
gen
erate
dusi
ng
all-a
tom
mole
cula
rdynam
ics
sim
ula
tions.
INT
RO
DU
CT
ION
The
ense
mble
of
3-d
configura
tions
exhib
ited
by
am
ole
cule
,th
at
is,it
sin
trin
sic
moti
on,is
corr
elate
dw
ith
the
pro
per
ties
of
its
envir
onm
ent
[1–12].
Changes
inin
tensi
ve
vari
able
s,su
chas
tem
per
atu
reand
pre
ssure
,m
odify
the
intr
insi
cm
oti
on
ofa
mole
cule
.A
ddit
ionally,
mole
cula
rm
oti
on
als
och
anges
asa
resu
ltofbin
din
gw
ith
oth
erm
ole
cule
s,su
chasin
ligand-s
ubst
rate
com
ple
xes
or
mole
cula
rass
emblies
.M
ore
over
,th
eex
tentofth
ech
ange
inm
ole
cula
rm
oti
on
isdep
enden
tupon
mult
iple
fact
ors
,in
cludin
gpro
per
ties
of
the
mole
cule
and
the
natu
reof
the
per
turb
ati
on
or
exte
rnal
pote
nti
al.
Aquanti
tati
ve
chara
cter
izati
on
of
such
changes
inm
ole
cula
rm
oti
on
isim
port
antfr
om
both
scie
nti
fic
as
wel
las
engin
eeri
ng
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
basi
sto
ass
oci
ate
changes
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espondin
gch
anges
inm
ole
cula
rm
oti
on.
Di�
eren
tiati
ng
bet
wee
ntw
oen
sem
ble
sof
mole
cula
rco
nfigura
tions
is,
how
ever
,ch
allen
gin
g.
The
challen
ge
lies
inco
mpari
ng
two
sets
of
hig
h-d
imen
sion
data
.For
the
moti
on
of
an-p
art
icle
mole
cule
repre
sente
dby
m-
configura
tions,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiati
ng
itfr
om
the
mole
cule
’sre
fere
nce
state
,{a
n(x
0)}
m 1,in
volv
esco
mpari
ng
two
3n-d
imen
sionalvec
tor
space
s.Tra
dit
ion-
ally,
when
analy
zing
mole
cula
rsi
mula
tions,
this
pro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
para
tely
,and
then
com
pari
ng
the
resu
lt-
ing
sum
mary
stati
stic
sfr
om
the
two
ense
mble
sagain
stea
choth
er[1
3].
Dim
ensi
onality
reduct
ion
isca
rrie
dout,
for
exam
ple
,by
aver
agin
gov
ern-s
pace
that
involv
espart
icle
-clu
ster
ing
or
aver
agin
gov
erm
-space
that
yie
lds
<� z
z(t
)�zz(0
)>
eq�
<� x
x(t
)�xx(0
)>
eq
{a,b
,c}
k(x
i,x
j)
=x
i·x
j
2nd
� ||=
0
d�/d
�=
0.37
GPa
25 �xi�
xj�
AW
n�
AM
n
AM
n�
AE
n
AE
n�
AP
n
Quantita
tive
Chara
cter
ization
ofIn
trin
sic
Mole
cula
rM
otion
Usi
ng
Support
Vec
tor
Mach
ines
Ral
ph
E.
Lei
ghty
1an
dSam
eer
Var
ma1
,2,3
,�
1D
epart
men
tof
Cel
lBio
logy
,M
icro
biology
and
Molecu
lar
Bio
logy
,2D
epart
men
tof
Phys
ics,
Univ
ersi
tyof
South
Flo
rida,
Tam
pa,
FL
33620,
USA
and
3In
stitute
ofPure
and
Applied
Math
ematics
,U
niv
ersi
tyofC
alifo
rnia
Los
Ange
les,
Los
Ange
les,
CA
90095,U
SA
(Dat
ed:
July
30,20
12D
RA
FT
)
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,
can
be
alte
red
by
num
erou
sen
vir
onm
enta
lfa
ctor
s,su
chas
tem
per
ature
,an
dal
soby
the
bin
din
gof
other
mol
ecule
s.A
nunder
stan
din
gof
ense
mble
-lev
eldi↵
eren
cesfr
oma
geom
etri
cper
spec
tive
isim
por
tant
bec
ause
itpro
vid
esa
bas
isfo
rre
lati
ng
ther
modynam
icch
ange
sto
chan
ges
inm
olec
ula
rm
otio
n.
The
task
ofdi↵
eren
tiat
ing
two
configu
rati
onal
ense
mble
sis
,how
ever
,ch
alle
ngi
ng
bec
ause
itre
quir
esco
mpar
ing
two
hig
h-d
imen
sion
aldat
aset
s.Tra
dit
ional
ly,
when
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
ble
mis
circ
um
vente
dby
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
.H
owev
er,
since
dim
ensi
onal
ity
reduct
ion
isca
rrie
dou
tpri
orto
com
par
ison
ofen
sem
ble
s,su
chst
rate
gies
are
susc
epti
ble
toar
tifa
ctual
bia
sesfr
omin
form
atio
nlo
ss.
Her
ew
ein
troduce
am
ethod
bas
edon
suppor
tve
ctor
mac
hin
es(S
VM
)th
atco
mpar
es3-
den
sem
ble
sdir
ectl
yag
ainst
one
anot
her
inth
eH
ilber
tsp
ace
wit
hou
ta
pre
requis
ite
reduct
ion
inphas
esp
ace.
While
this
met
hod
can
be
applied
toan
ym
olec
ula
rsy
stem
,her
ew
eex
plo
reit
sse
nsi
tivity
tow
ard
model
syst
ems
com
pri
sed
ofam
ino
acid
s.W
eal
soap
ply
the
tech
niq
ue
toid
enti
fyth
esp
ecifi
cre
gion
sof
apar
amyxov
irus
Gpro
tein
that
are
a↵ec
ted
by
the
bin
din
gof
its
pre
ferr
edhum
anre
cepto
r,E
phri
nB
2.W
efind
that
the
spec
ific
regi
ons
iden
tified
by
this
met
hod
incl
ude
the
set
ofam
ino
acid
sth
atar
eknow
nfr
omex
per
imen
talst
udie
sto
pla
ya
vit
alro
lein
vir
alfu
sion
.T
he
configu
rati
onal
ense
mble
sof
the
vir
alpro
tein
inbot
hit
sbou
nd
and
unbou
nd
stat
esw
ere
gener
ated
usi
ng
all-at
omm
olec
ula
rdynam
ics
sim
ula
tion
s.
INT
RO
DU
CT
ION
The
ense
mble
of3-
dco
nfigu
rati
ons
exhib
ited
by
am
olec
ule
,th
atis
,it
sin
trin
sic
mot
ion,is
corr
elat
edw
ith
the
pro
per
ties
ofits
envir
onm
ent
[1–1
2].
Chan
ges
inin
tensi
veva
riab
les,
such
aste
mper
ature
and
pre
ssure
,m
odify
the
intr
insi
cm
otio
nof
am
olec
ule
.A
ddit
ional
ly,
mol
ecula
rm
otio
nal
soch
ange
sas
are
sult
ofbin
din
gw
ith
other
mol
ecule
s,su
chas
inliga
nd-s
ubst
rate
com
ple
xes
orm
olec
ula
ras
sem
blies
.M
oreo
ver,
the
exte
ntof
the
chan
gein
mol
ecula
rm
otio
nis
dep
enden
tupon
mult
iple
fact
ors,
incl
udin
gpro
per
ties
ofth
em
olec
ule
and
the
nat
ure
ofth
eper
turb
atio
nor
exte
rnal
pot
enti
al.
Aquan
tita
tive
char
acte
riza
tion
ofsu
chch
ange
sin
mol
ecula
rm
otio
nis
impor
tantfr
ombot
hsc
ienti
fic
asw
ellas
engi
nee
ring
per
-sp
ecti
ves
bec
ause
itpro
vid
esa
bas
isto
asso
ciat
ech
ange
s
inth
erm
odynam
icpro
per
ties
dir
ectl
yw
ith
corr
espon
din
gch
ange
sin
mol
ecula
rm
otio
n.
Di�
eren
tiat
ing
bet
wee
ntw
oen
sem
ble
sof
mol
ecula
rco
nfigu
rati
ons
is,
how
ever
,ch
alle
ngi
ng.
The
chal
lenge
lies
inco
mpar
ing
two
sets
ofhig
h-d
imen
sion
dat
a.For
the
mot
ion
ofa
n-p
arti
cle
mol
ecule
repre
sente
dby
m-
configu
rati
ons,
{an(x
)}m 1
,th
eta
skof
di�
eren
tiat
ing
itfr
omth
em
olec
ule
’sre
fere
nce
stat
e,{a
n(x
0)}
m 1,in
volv
esco
mpar
ing
two
3n-d
imen
sion
alve
ctor
spac
es.
Tra
dit
ion-
ally
,w
hen
anal
yzi
ng
mol
ecula
rsi
mula
tion
s,th
ispro
b-
lem
isdea
ltw
ith
by
firs
tre
duci
ng
the
dim
ensi
ons
ofth
etw
oen
sem
ble
sse
par
atel
y,an
dth
enco
mpar
ing
the
resu
lt-
ing
sum
mar
yst
atis
tics
from
the
two
ense
mble
sag
ainst
each
other
[13]
.D
imen
sion
ality
reduct
ion
isca
rrie
dou
t,fo
rex
ample
,by
aver
agin
gov
ern-s
pac
eth
atin
volv
espar
ticl
e-cl
ust
erin
gor
aver
agin
gov
erm
-spac
eth
atyie
lds
Fig
.2.
Chan
ges
inio
nco
mple
xation
enth
alpie
s(�
�H
)an
dfree
ener
gies
(��
G)
due
tom
ethyl
grou
ps.
�H
and�
Gwer
ees
tim
ated
separ
atel
yfo
rre
-
action
sA
+nX
⌦A
Xn
inw
hic
hA
isan
ion,
Na+
,K+
orB
a2+
,an
dX
is
alig
and
mol
ecule
,wat
er(W
),m
ethan
ol(M
),et
han
ol(E
)or
prop
anol
(P).
The
reac
tion
swer
eth
enco
mbin
edto
obta
in��
Han
d��
G.
All
valu
esar
ere
por
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
.D
i↵er
ence
sbet
wee
nth
e
elec
tron
icden
sities
of(B
aT
++
4�
4C
H3)�
(BaS
++
4�
4H
),ca
lcula
ted
with
PB
E0+
vdW
,ar
esh
own.
The
geom
etry
use
dwas
that
ofth
efu
llyre
laxe
d(P
BE+
vdW
)
BaT
++
4co
mple
x.To
build
BaS
++
4th
eC
H3
grou
ps
ofT
wer
esu
bst
itute
dby
Hat
oms
and
only
thes
ewer
ere
laxe
d.
The
di↵
eren
cein
elec
tron
den
sity
show
nth
us
corr
espon
ds
toth
ee↵
ect
ofpol
ariz
atio
non
the
elec
tron
icden
sities
cause
dby
the
additio
nof
met
hyl
grou
ps.
Mat
eria
lsan
dM
ethods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[22,
23],
wher
ewe
em-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
Fig
.2.
Chan
gesin
ion
com
ple
xation
enth
alpie
s(�
H)an
dfree
ener
gies
(�G
)due
toin
crea
sein
the
num
ber
ofm
ethyl
grou
ps
inth
eio
n-c
oor
din
atin
glig
ands.�
Han
d
�G
are
estim
ated
forth
esu
bst
itution
reac
tion
sA
Xn
+nX
0⌦
AX
0 n+
AX
n,
whic
har
eden
oted
asA
Xn!
AX
0 n.
Inth
ese
reac
tion
s,A
repr
esen
tson
eof
the
ions,
Na+
,K+
orB
a2+
,an
dX
repr
esen
tson
eof
the
ligan
dm
olec
ule
s,wat
er(W
),
met
han
ol(M
),et
han
ol(E
)or
prop
anol
(P).
All
ener
gies
are
repor
ted
inkc
al/m
ol.
Fig
.3.
E↵ec
tof
T!
Ssu
bst
itution
onel
ectr
onden
sities
ofan
isol
ated
S4
site
.T
he
di↵
eren
ces
bet
wee
nth
eel
ectr
onic
den
sities
,⇢
Ba
T4�
4⇢
CH
3�
⇢B
aS4�
4⇢
H,
wer
ees
tim
ated
using
the
PB
E0+
vdW
funct
ional
for
the
optim
um
geom
etry
ofth
e
BaT4
com
ple
x.T
he
geom
etry
ofth
eB
aS
4co
mple
xwas
const
ruct
edfrom
the
BaT4
optim
ized
geom
etry
byre
pla
cing
the
CH
3gr
oup
ofea
chth
reon
ine
residue
with
anH
atom
,an
dth
enre
laxi
ng
the
coor
din
ates
ofH
atom
s.
Mate
rials
and
Met
hods
All
geom
etric
rela
xation
swer
eper
form
edw
ith
the
all-el
ectr
on,
loca
lized
bas
is
(num
eric
atom
-cen
tere
dor
bital
s)pr
ogra
mpac
kage
FH
I-ai
ms
[24,
25],
wher
ewe
em-
plo
yed
the
def
ault
“tig
ht”
stan
dar
ds
rega
rdin
gbas
isse
tsan
din
tegr
atio
ngr
ids
for
all
elem
ents
.For
DFT
(LD
A/G
GA
)ca
lcula
tion
sth
ese
sett
ings
produce
esse
ntial
lyco
n-
2w
ww
.pnas
.org
/cgi
/doi
/10.
1073
/pnas
.070
9640
104
Foot
line
Auth
or
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
trons - ALEX/MARIANA - PLEASE ELABORATE...We em-ploy the recently developed...We use density-functional theorywithin the PBE [18] and PBE0 [19, 20] approximations for theexchange-correlation potential. In addition, we employ a re-cently developed van der Waals (vdW) correction scheme [21]based on a summation of pairwise C6[n]/R6 terms, where theC6 coe�cients are based on the self-consistent electronic den-sity. This approach allows us to treat accurately the energeticsand e↵ects of polarization in our systems.
We find that a T!S substitution is accompanied with onlyminor structural changes. In addition, we find that the reduc-tion in polarizability of S4 associated with a T!S substitu-tion is su�cient to explain experimental observations.Thesestudies provide, in general, a vivid example of how peptidefunction is influenced by the polarizability of methyl groups,especially those that are proximal to charged moieties, includ-ing ions, titratable side-acids and phosphates.
Results and DiscussionTo understand how T!S substitutions in the S4 site a↵ect K–channel function, we examine first the general consequencesof methyl polarizability on the thermodynamics of ion bind-ing. We estimate the gas phase enthalpies and free energiesof Na+, K+ and Ba2+ ion binding to four di↵erent molecules,namely water, methanol, ethanol and propanol. While allthese molecules provide hydroxyl ligands for ion-coordination,they di↵er from each other in the number of methyl groupsthey carry. Consequently they have di↵erent static averageground-state dipole polarizabilities of 1.5, 3.3, 5.4 and 6.7 A3,respectively [15]. The results of these calculations are plottedin Fig. 2. We find that while the addition of methyl groupsenhance the stability of the complex, the e↵ect decreases witheach incremental addition. In addition, multibody e↵ects arethe e↵ect of coordination number is non-trivial.
Table 2. distance between methyl and K ion in the s4 siteis about 5 A
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
-18-15-12-9-6-303
1 2 3 4 1 2 3 4 1 2 3 4
-20
-16
-12
-8
-4
0
1 2 3 4 1 2 3 4 1 2 3 4
K+
Ba2+
Na+
ΔHΔG
n n n
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
< �zz(t)�zz(0) >eq � < �xx(t)�xx(0) >eq
{a,b, c}k(xi,xj) = xi · xj
2nd
�|| = 0
d�/d� = 0.37 GPa
25
�xi � xj�AWn � AMn
AMn � AEn
AEn � APn
Quantitative Characterization of Intrinsic Molecular MotionUsing Support Vector Machines
Ralph E. Leighty1 and Sameer Varma1,2,3,�
1Department of Cell Biology, Microbiology and Molecular Biology,2Department of Physics, University of South Florida, Tampa, FL 33620, USA and
3Institute of Pure and Applied Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA
(Dated: July 30, 2012 DRAFT)
The ensemble of 3-d configurations exhibited by a molecule, that is, its intrinsic motion, can bealtered by numerous environmental factors, such as temperature, and also by the binding of othermolecules. An understanding of ensemble-level di↵erences from a geometric perspective is importantbecause it provides a basis for relating thermodynamic changes to changes in molecular motion.The task of di↵erentiating two configurational ensembles is, however, challenging because it requirescomparing two high-dimensional datasets. Traditionally, when analyzing molecular simulations,this problem is circumvented by first reducing the dimensions of the two ensembles separately,and then comparing summary statistics from the two ensembles against each other. However,since dimensionality reduction is carried out prior to comparison of ensembles, such strategies aresusceptible to artifactual biases from information loss. Here we introduce a method based on supportvector machines (SVM) that compares 3-d ensembles directly against one another in the Hilbertspace without a prerequisite reduction in phase space. While this method can be applied to anymolecular system, here we explore its sensitivity toward model systems comprised of amino acids.We also apply the technique to identify the specific regions of a paramyxovirus G protein that area↵ected by the binding of its preferred human receptor, Ephrin B2. We find that the specific regionsidentified by this method include the set of amino acids that are known from experimental studiesto play a vital role in viral fusion. The configurational ensembles of the viral protein in both itsbound and unbound states were generated using all-atom molecular dynamics simulations.
INTRODUCTION
The ensemble of 3-d configurations exhibited by amolecule, that is, its intrinsic motion, is correlated withthe properties of its environment [1–12]. Changes inintensive variables, such as temperature and pressure,modify the intrinsic motion of a molecule. Additionally,molecular motion also changes as a result of binding withother molecules, such as in ligand-substrate complexes ormolecular assemblies. Moreover, the extent of the changein molecular motion is dependent upon multiple factors,including properties of the molecule and the nature ofthe perturbation or external potential. A quantitativecharacterization of such changes in molecular motion isimportant from both scientific as well as engineering per-spectives because it provides a basis to associate changes
in thermodynamic properties directly with correspondingchanges in molecular motion.
Di�erentiating between two ensembles of molecularconfigurations is, however, challenging. The challengelies in comparing two sets of high-dimension data. Forthe motion of a n-particle molecule represented by m-configurations, {an(x)}m
1 , the task of di�erentiating itfrom the molecule’s reference state, {an(x0)}m
1 , involvescomparing two 3n-dimensional vector spaces. Tradition-ally, when analyzing molecular simulations, this prob-lem is dealt with by first reducing the dimensions of thetwo ensembles separately, and then comparing the result-ing summary statistics from the two ensembles againsteach other [13]. Dimensionality reduction is carried out,for example, by averaging over n-space that involvesparticle-clustering or averaging over m-space that yields
Fig. 2. Changes in ion complexation enthalpies (��H) and free energies
(��G) due to methyl groups. �H and �G were estimated separately for re-
actions A + nX ⌦ AXn in which A is an ion, Na+, K+ or Ba2+, and X is
a ligand molecule, water (W ), methanol (M ), ethanol (E ) or propanol (P). The
reactions were then combined to obtain ��H and ��G. All values are reported
in kcal/mol.
Fig. 3. E↵ect of T!S substitution on electron densities. Di↵erences between the
electronic densities of (BaT++4 � 4CH3)� (BaS++
4 � 4H), calculated with
PBE0+vdW, are shown. The geometry used was that of the fully relaxed (PBE+vdW)
BaT++4 complex. To build BaS++
4 the CH3 groups of T were substituted by
H atoms and only these were relaxed. The di↵erence in electron density shown thus
corresponds to the e↵ect of polarization on the electronic densities caused by the
addition of methyl groups.
Materials and Methods
All geometric relaxations were performed with the all-electron, localized basis
(numeric atom-centered orbitals) program package FHI-aims [22, 23], where we em-
2 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
ion-coordination number. This non-linearity emerges as aconsequence of repulsion between the coordinating ligands[23, 36, 37]. Finally, �H and �G, also depend on thecharge/size ratio of the ion, and once again, we find a non-linear relationship between the charge/size ratio of the ion andthe magnitude of the thermodynamic change. Doubling thecharge/size ratio of the ion does not necessarily imply that thestability of the complex also increases by a factor of two, asmay be the case when induced e↵ects are small. For example,while the �G associated with a K+W ! K+M substitution is-1.9 kcal/mol, the �G associated with the Ba2+W ! Ba2+Msubstitution is -7.9 kcal/mol, which is a 4-fold change insteadof a 2-fold change. This additional non-linear e↵ect also un-derscores the importance of explicit polarization e↵ects in ioncomplexation [23].
In the calculations above, the small molecules have similargas phase dipole moments, but markedly di↵erent polarizabil-ities, and a ligand with a higher polarizability produces, ingeneral, a more stable ion complex. Can ligands with simi-lar gas phase dipole moments as well as static dipole polariz-abilties produce complexes with markedly di↵erent stabilities?Butanol has three di↵erent isomers, n-butanol (B), isobutanol(iB) and 2-butanol (2B). While these isomers have compara-ble gas phase dipole moments (⇠1.7 Debye) and polarizabili-ties (⇠8.9 A3)[3], the di↵erences in their chemical structureswill result in di↵erent distributions of methyl groups and/ormethylene bridges around the ion. Consequently, despite hav-ing similar polarizabilities, they will respond to the distance–dependent field of the ion di↵erently, and the closer proxim-ity of methyl groups in isobutanol and 2-butanol to the ion,compared to n-butanol, can be expected to increase the sta-bility of the complex. To quantify this e↵ect we carry outsubstitution reactions (Equ. 1) involving these three butanolisomers. The results are plotted in Fig. 3. Indeed, we findthat the isobutanol and 2-butanol complexes are more stablethan the corresponding n-butanol complexes. Furthermore,complexes comprising of 2-butanols are more stable than thecorresponding complexes comprising of isobutanols, as one ofthe two terminal methyl groups of 2-butanol is closer to theion than that of isobutanol. Thus, in addition to the staticdipole polarizabilities of the methyl groups, the distributionof methyl groups around the ion also plays a role in ion com-plex stability. We also note that the distances of the ions fromthe closest methyl groups of 2-butanol (⇠5 A) are compara-ble to the distances of ions from the branched methyl groupof threonine in the S4 site of KcsA [7, 11]. This suggests fur-ther that the reduction in methyl–induced polarization due toT!S substitutions in the S4 sites of K–channels can reducethe binding a�nities of all three ions significantly.
ABn ! A(iB)n
Fig. 3. E↵ect of butanol isoform on ion complexation enthalpies (�H). The
enthalpies are estimated at 298 K for the substitution reactions given by Equ. 1,
which are denoted as AXn ! AX0n. All energies are in units of kcal/mol.
In the S4 sites of K–channels the ions coordinate witheight ligands, and not just four hydroxyl ligands as studiedabove. Consequently, the repulsion between the coordinatingligands will be larger in the S4 site, which could reduce orwash out the contribution of the threonine methyl groups tothe binding of ions at the S4 site. Furthermore, a T!S sub-stitution may also alter the manner in which the coordinatinggroups of the S4 site interact with ions.
To resolve these issues and understand how the threoninemethyl groups contribute to K–channel function, we consideran isolated S4 site composed of threonine residues. In thisrepresentative model, we substitute, in unit increments, thethreonines with serines, that is,
AT4 + nS ⌦ AT4�nSn + nT, [2]
and estimate the changes in free energy in the gas phase (Ta-ble 1). We find that the repulsion between the eight coordi-nating groups is not large enough to counteract the stabiliza-tion provided by the presence of methyl groups. The a�nityof the S4 site for all three ions, Na+, K+ and Ba2+, dimin-ishes, although non-linearly, with each threonine substitution.Furthermore, these substitutions result in only minor config-urational changes that do not alter the overall topology ofthe S4 site (Table S2 of supporting information), which im-plies that the e↵ect of T!S substitutions on configurationalentropy will be minimal, and may be neglected. While wefind that the configurational change is generally higher in thecase of Ba2+ complexes as compared to Na+ and K+ com-plexes, we also note that optimizations of Ba2+ complexesfollowing T!S substitutions, but under heavy atom positionrestraints, result in a higher drop in S4 site a�nity. For ex-ample, a BaT4 ! BaS4 substitution followed by a constrainedoptimization yields a �E = XX kcal/mol, as opposed to a �E= XX kcal/mol obtained after an unconstrained optimization.This indicates that if the K–channel matrix were rigid to dis-allow a structural change, a T!S substitution will result ina higher loss in S4 site a�nity for Ba2+, as compared to theresults in Table 1. Together, this suggests that the loss ina�nity of the S4 site noted in Table 1 is not a result of alteredion–binding topologies, but is primarily due to a reduction inthe methyl–induced polarization of the S4 site.
When all four threonines are replaced by serines, the es-timated drop in the a�nity of the S4 site is the highest forBa2+ (Table 1). The free energy di↵erence of 6.9 kcal/mol isequivalent to a 106-fold drop in binding a�nity, where morethan half of the a�nity drop is due to a loss in dispersionenergy (Table S3 of supporting information). The net drop inBa2+ binding a�nity accounts for the change in half maximalinhibitory concentrations (IC50) of Ba2+ seen in T!S substi-tution experiments [16]. In addition, it also accounts for thedi↵erence between the Kir 2.4 and Kir 2.1 experimental IC50
values of Ba2+ [20, 21], as Kir 2.4 carries a serine and Kir 2.1a threonine in the S4 site.
The computed drop in stability is, however, greater thanthat inferred from experiments (by over 3 kcal/mol). This dis-crepancy does not result from the pairwise approximation [31]employed in the estimation of the dispersion energy. Whilemany-body dispersion terms can contribute significantly toligand binding [38], we find their contribution to be only0.3 kcal/mol to the BaT4 ! BaS4 substitution reaction (Ta-ble S4 of supporting information). We also do not expectthis discrepancy to be due to the missing structural restraintsfrom the protein matrix on the isolated S4 site, as the T!Ssubstitutions are accompanied by only minor configurationalchanges (Table S2 of supporting information). The overesti-mated drop in Ba2+ binding a�nity is most likely because ourcalculations lack a polarization coupling between the S4 siteand its external environment. Fig. 4 shows that the polar-ization in the electron density of the threonine methyl groupsis along the electric field of the Ba2+ ion, which maximizesthe contribution of methyl polarization to Ba2+ binding. Thepresence of other polar chemical moieties proximal to the S4site, including water molecules, can reduce the contributionof polarization and thereby the overall e↵ect of T!S substi-tutions on Ba2+ binding. Thermodynamically, this reduction
Footline Author PNAS Issue Date Volume Issue Number 3
ion-coordination number. This non-linearity emerges as aconsequence of repulsion between the coordinating ligands[23, 36, 37]. Finally, �H and �G, also depend on thecharge/size ratio of the ion, and once again, we find a non-linear relationship between the charge/size ratio of the ion andthe magnitude of the thermodynamic change. Doubling thecharge/size ratio of the ion does not necessarily imply that thestability of the complex also increases by a factor of two, asmay be the case when induced e↵ects are small. For example,while the �G associated with a K+W ! K+M substitution is-1.9 kcal/mol, the �G associated with the Ba2+W ! Ba2+Msubstitution is -7.9 kcal/mol, which is a 4-fold change insteadof a 2-fold change. This additional non-linear e↵ect also un-derscores the importance of explicit polarization e↵ects in ioncomplexation [23].
In the calculations above, the small molecules have similargas phase dipole moments, but markedly di↵erent polarizabil-ities, and a ligand with a higher polarizability produces, ingeneral, a more stable ion complex. Can ligands with simi-lar gas phase dipole moments as well as static dipole polariz-abilties produce complexes with markedly di↵erent stabilities?Butanol has three di↵erent isomers, n-butanol (B), isobutanol(iB) and 2-butanol (2B). While these isomers have compara-ble gas phase dipole moments (⇠1.7 Debye) and polarizabili-ties (⇠8.9 A3)[3], the di↵erences in their chemical structureswill result in di↵erent distributions of methyl groups and/ormethylene bridges around the ion. Consequently, despite hav-ing similar polarizabilities, they will respond to the distance–dependent field of the ion di↵erently, and the closer proxim-ity of methyl groups in isobutanol and 2-butanol to the ion,compared to n-butanol, can be expected to increase the sta-bility of the complex. To quantify this e↵ect we carry outsubstitution reactions (Equ. 1) involving these three butanolisomers. The results are plotted in Fig. 3. Indeed, we findthat the isobutanol and 2-butanol complexes are more stablethan the corresponding n-butanol complexes. Furthermore,complexes comprising of 2-butanols are more stable than thecorresponding complexes comprising of isobutanols, as one ofthe two terminal methyl groups of 2-butanol is closer to theion than that of isobutanol. Thus, in addition to the staticdipole polarizabilities of the methyl groups, the distributionof methyl groups around the ion also plays a role in ion com-plex stability. We also note that the distances of the ions fromthe closest methyl groups of 2-butanol (⇠5 A) are compara-ble to the distances of ions from the branched methyl groupof threonine in the S4 site of KcsA [7, 11]. This suggests fur-ther that the reduction in methyl–induced polarization due toT!S substitutions in the S4 sites of K–channels can reducethe binding a�nities of all three ions significantly.
ABn ! A(2B)n
Fig. 3. E↵ect of butanol isoform on ion complexation enthalpies (�H). The
enthalpies are estimated at 298 K for the substitution reactions given by Equ. 1,
which are denoted as AXn ! AX0n. All energies are in units of kcal/mol.
In the S4 sites of K–channels the ions coordinate witheight ligands, and not just four hydroxyl ligands as studiedabove. Consequently, the repulsion between the coordinatingligands will be larger in the S4 site, which could reduce orwash out the contribution of the threonine methyl groups tothe binding of ions at the S4 site. Furthermore, a T!S sub-stitution may also alter the manner in which the coordinatinggroups of the S4 site interact with ions.
To resolve these issues and understand how the threoninemethyl groups contribute to K–channel function, we consideran isolated S4 site composed of threonine residues. In thisrepresentative model, we substitute, in unit increments, thethreonines with serines, that is,
AT4 + nS ⌦ AT4�nSn + nT, [2]
and estimate the changes in free energy in the gas phase (Ta-ble 1). We find that the repulsion between the eight coordi-nating groups is not large enough to counteract the stabiliza-tion provided by the presence of methyl groups. The a�nityof the S4 site for all three ions, Na+, K+ and Ba2+, dimin-ishes, although non-linearly, with each threonine substitution.Furthermore, these substitutions result in only minor config-urational changes that do not alter the overall topology ofthe S4 site (Table S2 of supporting information), which im-plies that the e↵ect of T!S substitutions on configurationalentropy will be minimal, and may be neglected. While wefind that the configurational change is generally higher in thecase of Ba2+ complexes as compared to Na+ and K+ com-plexes, we also note that optimizations of Ba2+ complexesfollowing T!S substitutions, but under heavy atom positionrestraints, result in a higher drop in S4 site a�nity. For ex-ample, a BaT4 ! BaS4 substitution followed by a constrainedoptimization yields a �E = XX kcal/mol, as opposed to a �E= XX kcal/mol obtained after an unconstrained optimization.This indicates that if the K–channel matrix were rigid to dis-allow a structural change, a T!S substitution will result ina higher loss in S4 site a�nity for Ba2+, as compared to theresults in Table 1. Together, this suggests that the loss ina�nity of the S4 site noted in Table 1 is not a result of alteredion–binding topologies, but is primarily due to a reduction inthe methyl–induced polarization of the S4 site.
When all four threonines are replaced by serines, the es-timated drop in the a�nity of the S4 site is the highest forBa2+ (Table 1). The free energy di↵erence of 6.9 kcal/mol isequivalent to a 106-fold drop in binding a�nity, where morethan half of the a�nity drop is due to a loss in dispersionenergy (Table S3 of supporting information). The net drop inBa2+ binding a�nity accounts for the change in half maximalinhibitory concentrations (IC50) of Ba2+ seen in T!S substi-tution experiments [16]. In addition, it also accounts for thedi↵erence between the Kir 2.4 and Kir 2.1 experimental IC50
values of Ba2+ [20, 21], as Kir 2.4 carries a serine and Kir 2.1a threonine in the S4 site.
The computed drop in stability is, however, greater thanthat inferred from experiments (by over 3 kcal/mol). This dis-crepancy does not result from the pairwise approximation [31]employed in the estimation of the dispersion energy. Whilemany-body dispersion terms can contribute significantly toligand binding [38], we find their contribution to be only0.3 kcal/mol to the BaT4 ! BaS4 substitution reaction (Ta-ble S4 of supporting information). We also do not expectthis discrepancy to be due to the missing structural restraintsfrom the protein matrix on the isolated S4 site, as the T!Ssubstitutions are accompanied by only minor configurationalchanges (Table S2 of supporting information). The overesti-mated drop in Ba2+ binding a�nity is most likely because ourcalculations lack a polarization coupling between the S4 siteand its external environment. Fig. 4 shows that the polar-ization in the electron density of the threonine methyl groupsis along the electric field of the Ba2+ ion, which maximizesthe contribution of methyl polarization to Ba2+ binding. Thepresence of other polar chemical moieties proximal to the S4site, including water molecules, can reduce the contributionof polarization and thereby the overall e↵ect of T!S substi-tutions on Ba2+ binding. Thermodynamically, this reduction
Footline Author PNAS Issue Date Volume Issue Number 3
K+Ba2+
Na+
Fig. 3. E↵ect of butanol isoform on ion complexation enthalpies (�H). The
enthalpies are estimated at 298 K for the substitution reactions given by Equ. 1,
which are denoted as AXn ! AX0n. All energies are in units of kcal/mol.
APn ! ABn
To resolve these issues and understand how the threoninemethyl groups contribute to K–channel function, we consideran isolated S4 site composed of threonine residues. In thisrepresentative model, we substitute, in unit increments, thethreonines with serines, that is,
AT4 + nS ⌦ AT4�nSn + nT, [2]
and estimate the changes in free energy in the gas phase (Ta-ble 1). We find that the repulsion between the eight coor-dinating groups is not large enough to counteract the sta-bilization provided by the presence of methyl groups. Thea�nity of the S4 site for all three ions (Na+, K+, Ba2+) di-minishes, although non-linearly, with each threonine substi-tution. Furthermore, these substitutions result in only minorconfigurational changes that do not alter the overall topol-ogy (geometry) of the S4 site (Table S2 of supporting infor-mation), which implies that the e↵ect of T!S substitutionson configurational entropy will be minimal [38], and may beneglected. While we find that the configurational change isgenerally higher in the case of Ba2+ complexes as comparedto Na+ and K+ complexes, we also note that optimizationsof Ba2+ complexes following T!S substitutions, but underheavy atom position restraints, result in a higher drop in S4site a�nity. For example, a BaT4 ! BaS4 substitution fol-lowed by a constrained optimization yields a single point en-ergy di↵erence �E = XX kcal/mol, as opposed to a �E =XX kcal/mol obtained from an unconstrained optimization.This indicates that if the structure of the S4 site remainedunchanged from a T!S mutation, then the a�nity of the S4site for Ba2+ would drop even further than that reported inTable 1. Together, this suggests that the loss in a�nity ofthe S4 site noted in Table 1 is not a result of altered ion–binding topologies, but is primarily due to a reduction in themethyl–induced polarization of the S4 site.
When all four threonines are replaced by serines, the es-timated drop in the a�nity of the S4 site is the highest forBa2+ (Table 1). The free energy di↵erence of 6.9 kcal/mol isequivalent to a 106-fold drop in binding a�nity, where morethan half of the a�nity drop is due to a loss in dispersionenergy (Table S3 of supporting information). The net drop inBa2+ binding a�nity accounts for the change in half maximalinhibitory concentrations (IC50) of Ba2+ seen in T!S substi-tution experiments [16]. In addition, it also accounts for thedi↵erence between the Kir 2.4 and Kir 2.1 experimental IC50
Footline Author PNAS Issue Date Volume Issue Number 3
K+Ba2+
Na+
1 2 3 4 1 2 3 4-12
-9
-6
-3
0
3
1 2 3 4
Fig. 3. Effect of butanol isomer on ion complexation enthalpies (∆H). The en-
thalpies are estimated at 298 K for the substitution reactions given by Equ. 1, which
are denoted as AXn → AX′n. All energies are in units of kcal/mol.
To resolve these issues and understand how the threoninemethyl groups contribute to K–channel function, we consideran isolated S4 site composed of threonine residues. In thisrepresentative model, we substitute, in unit increments, thethreonines with serines, that is,
AT4 + nS AT4−nSn + nT, [2]
and estimate the changes in free energy in the gas phase (Ta-ble 1). We find that the repulsion between the eight coor-dinating groups is not large enough to counteract the sta-bilization provided by the presence of methyl groups. Theaffinity of the S4 site for all three ions (Na+, K+, Ba2+) di-minishes, although non-linearly, with each threonine substi-tution. Furthermore, these substitutions result in only minorconfigurational changes that do not alter the overall topology(geometry) of the S4 site (Table S2 and Figure S3 of sup-porting information), which implies that the effect of T→Ssubstitutions on configurational entropy will be minimal [38],and may be neglected. While we find that the configurationalchange is generally higher in the case of Ba2+ complexes ascompared to Na+ and K+ complexes, we also note that op-timizations of Ba2+ complexes following T→S substitutions,but under heavy atom position restraints, result in a higherdrop in S4 site affinity. For example, a BaT4 → BaS4 substi-tution followed by a constrained optimization yields a singlepoint energy difference ∆E = 8.8 kcal/mol, as opposed toa ∆E = 5.2 kcal/mol obtained from an unconstrained opti-mization. This indicates that if the structure of the S4 siteremained unchanged after a T→S mutation, then the affinityof the S4 site for Ba2+ would drop even further than thatreported in Table 1. Together, this suggests that the loss inaffinity of the S4 site noted in Table 1 is not a result of alteredion–binding topologies, but is primarily due to a reduction inthe methyl–induced polarization of the S4 site.
When all four threonines are replaced by serines, the es-timated drop in the affinity of the S4 site is the highest forBa2+ (Table 1). The free energy difference of 6.9 kcal/mol isequivalent to a 106-fold drop in binding affinity, where morethan half of the affinity drop is due to a loss in dispersionenergy (Table S3 of supporting information). The net drop in
Footline Author PNAS Issue Date Volume Issue Number 3
Ba2+ binding affinity accounts for the change in half maximalinhibitory concentrations (IC50) of Ba2+ seen in T→S substi-tution experiments [16]. In addition, it also accounts for thedifference between the Kir 2.4 and Kir 2.1 experimental IC50
values of Ba2+ [20, 21], as Kir 2.4 carries a serine and Kir 2.1a threonine in the S4 site.
The computed drop in stability is, however, greater thanthat inferred from experiments (by over 3 kcal/mol). This dis-crepancy does not result from the pairwise approximation [31]employed in the estimation of the dispersion energy. Whilemany-body dispersion terms can contribute significantly toligand binding [39], we find their contribution to be only0.3 kcal/mol to the BaT4 → BaS4 substitution reaction (Ta-ble S4 of supporting information). We also do not expectthis discrepancy to be due to the missing structural restraintsfrom the protein matrix on the isolated S4 site, as the T→Ssubstitutions are accompanied by only minor configurationalchanges (Table S2 of supporting information).
The overestimated drop in Ba2+ binding affinity is mostlikely because our calculations lack a polarization couplingbetween the S4 site and its external environment. Fig. 4shows that the polarization in the electron density of the thre-onine methyl groups is along the electric field of the Ba2+ ion,which maximizes the contribution of methyl polarization toBa2+ binding. The presence of other polar chemical moietiesproximal to the S4 site, including water molecules, can reducethe contribution of polarization and thereby the overall effectof T→S substitutions on Ba2+ binding. Thermodynamically,this reduction will appear as an increase in the free energypenalty associated with threonine extraction from its localenvironment, which has been shown to influence ion bind-ing [40, 41]. Another possible explanation could be that thecomputed values may not be comparable directly with exper-imental estimates. It is plausible that a T→S substitutionindeed causes the Ba2+ binding affinity of the S4 site to dropclose to the computed value, and in such an event Ba2+ bindsto an alternative site in the filter that is more favorable thanthe serine–containing S4 site. The experimental estimates ofBa2+ IC50 values in the presence of threonines and serinescould, therefore, correspond to two different binding sites ofBa2+ in the filter, instead of two different chemistries of an S4binding site. Irregardless, we find that the presence of methylgroups contributes significantly to the binding of Ba2+ ionsto K–channels.
Fig. 4. Effect of T→S substitution on the electron density ρ of an isolated S4
site bound to Ba2+. This effect was estimated as ∆ρ = (ρBaT4− 4ρCH3
−ρBaS4
+ 4ρH) using the PBE0+vdW functional. In the expression above, ρX are
the electron densities of the isolated groups, X. The BaS4 complex was constructed
from the optimized geometry of the BaT4 complex by first replacing its side chain
CH3 groups with H atoms, and then relaxing the H atom coordinates.
From Table 1, we also see that when all four threoninesare replaced by serines, the stability of the S4 site complexedwith K+ ions drops by 3.4 kcal/mol. It is known that K+
binding to the selectivity filter contributes to the organiza-tion of the filter’s conductive state, as well as to the channel’sability to assemble into tetramers [7, 50, 51, 52]. A dropin K+ binding affinity at the S4 site could, therefore, desta-bilize the filter’s conductive state, and/or impede the chan-nel’s ability to form tetramers. Indeed, electrophoresis stud-ies show that T→S substitutions in Kcv reduce the channel’sability to assemble into tetramers [16]. In addition, electro-physiological studies show that T→S substitutions in Kir 2.1alter the channel’s mean open probabilities [16]. The bind-ing of K+ ions to the S4 site also plays a vital role in theconcerted movement of ions along the permeation pathway[10, 42, 43, 44, 45, 46, 47, 48, 49]. A T→S substitution could,therefore, potentially also alter the kinetics of K+ transportthrough the channel, although no such effects were observed inexperiments on Kcv and Kir 2.1 channels [16]. Nevertheless,it is clear from Table 1 and Figures 2 and 3 that the polar-ization induced by methyl groups can contribute significantlyto how K+ ions interact with the channel, and therefore, theion’s transport properties.
The data in Table 1 also indicates that T→S substitutionswill lower the affinity of the S4 site for Na+ ions. This reduc-tion is, in general, greater than that estimated in the case ofK+ ions, which suggests that T→S substitutions will also in-crease the K/Na selectivity of the S4 site. This, however, doesnot imply that the overall selectivity of the channel will alsoincrease. Thermodynamic calculations on KcsA indicate thatthe contribution of the S4 site to the channel’s K/Na selectiv-ity is small [53]. Furthermore, the channel’s overall selectivityalso depends on the number of 8-fold ion binding sites in theselectivity filter [54], and how these sites interact with theirproximal environment, including with each other [40, 55, 56].While it is not straightforward to ascertain how these effectswill interplay to alter the selective properties of T→S mutantchannels, electrophysiological studies show that T→S substi-tutions in Kcv channels do not alter K/Na selectivity [16].
Together, we find that the polarization induced by methylgroups stabilizes ion complexation significantly, and can,therefore, also be expected to influence the interactions ofother biological charged species. The magnitude of the ef-fect depends non–linearly on ion charge, the proximity of the
4 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
methyl group from the ion, as well as on the number of methylgroups present in the ion complex. T→S substitutions in theS4 sites of K–channels result in removal of methyl groups froma distance greater than 5 A from the ion. While the architec-ture of the binding site is affected minimally, the stability ofthe S4 site for Ba2+ drops by several kcal/mol with the T→Ssubstitution. This destabilization accounts for the changes inIC50 values of Ba2+ seen in experiments. Additionally, theaffinity of the S4 site also drops for K+ ions, which correlateswith changes in gating properties as well as a reduction inthe overall stability of the channel. These results show howpolarization induced by methyl groups is important to bothK–channel function and stability.
SummaryNaturally occurring or engineered T→S substitutions at theS4 ion-binding sites of K–channels reduce the number ofmethyl groups near the ion. Despite this seemingly benignchange in the number of hydrophobic groups, electrophysio-logical experiments show that T→S substitutions modify K–channel properties. Results from our first principles quantummechanical calculations account for these modifications. Wefind that the methyl–induced polarization is large enough thatits reduction in S4 sites due to T→S substitutions reduces ionaffinities dramatically, with minimal change in topology. Thisloss in ion affinity is consistent with experimental observa-tions, such as changes in open channel probabilities, and IC50
values of Ba2+. These results illustrate quantitatively howprotein function can also rely on the polarization induced bymethyl groups. Given the abundance of charged moieties inbiological systems, such as ions, titratable amino acids, sul-phates, phosphates and nucleotides, it appears unlikely thatwe have uncovered an unwonted instance of where the po-larizability of methyl groups matter. The recent advances in
computational methods now permit a proper quantitative as-sessment of the role of methyl groups in biomolecular function.
Materials and MethodsFree energy changes ∆G associated with the substitution reactions (Equations 1 and
2) are obtained after relaxing separately each molecular complex using the DFT+vdW
method [30] implemented in the all-electron, localized basis FHI-aims package [57, 58].
The vdW corrected PBE [26] functional (PBE+vdW) is used for relaxations. Within
the DFT+vdW method, PBE performs well against the S22 data set [59], exhibiting
a mean absolute error of only 0.3 kcal/mol [31]. Relativistic corrections are used only
for complexes containing barium ions, and using the atomic ZORA method [60]. We
employ the “tight” settings for integration grids and basis sets, as described in Ref.
[57], which yield converged energy differences and negligible basis set superposition
error. While the starting geometries of the ion-water complexes are taken from Ref.
[41], the starting geometries of the ion-alcohol complexes are constructed using the
ion-water complexes as templates. The coordinates for the S4 site of the K–channel
selectivity filter are taken from the X-ray structure of KcsA [7].
Following relaxation, the complex geometries are subjected to a single point
calculation using the hybrid PBE0[27, 28]+vdW functional. Due to the partial in-
clusion of exact exchange in PBE0, this functional improves the description of elec-
tron delocalization [61]. This is important to the estimation of vdW contributions,
C6[n(r)]/R6, as they depend explicitly on the self-consistent electron densities
n(r). The translational, rotational and vibrational contributions are estimated sep-
arately for each molecular complex, with the PBE+vdW functional, at a temperature
of 298 K and a pressure of 1 atmosphere. These contributions are then added to the
PBE0+vdW single point energies to obtain the free energies G of individual com-
plexes. The free energy changes ∆G associated with the substitution reactions are
then obtained by subtracting the free energies of reactant complexes from the product
complexes; that is, ∆G =∑
npGp −∑
nrGr , where np and nr are the
stoichiometries of the products and reactants.
ACKNOWLEDGMENTS. We thank Daniel L. Minor, Jr. and Carsten Baldauf forstimulating discussions. MR and AT acknowledge computational time from the Ar-gonne Leadership Computing Facility at ANL, which is supported by the Office ofScience of the US DoE under contract DE-AC02-06CH11357. SBR acknowledgesfunding from Sandia’s LDRD program.SNL is managed and operated by Sandia Cor-poration, a wholly owned subsidiary of LMC, for the US DoEs NNSA under contractDE AC04-94AL85000. AT and SV also acknowledge support from the Spring 2011long program of the Institute of Pure and Applied Mathematics at UCLA.
1. WK Paik, DC Paik, and S Kim. Historical review: the field of protein methylation.
TRENDS in Biochemical Sciences, 32(3):146-152, 2007.
2. KD Roberston. DNA methylation and human disease. Nature Rev. Genetics, 6:597-
610, 2005.
3. DR Lide. CRC Handbook of Chemistry and Physics. Taylor and Francis, Boca Raton,
FL, 2008.
4. LC Sowers, BR Shaw, and WD Sedwick. Base stacking and molecular polarizability:
Effect of a methyl group in the 5-position of pyrimidines. Biochem. Biophys. Res.
Comm., 148(2):790-794, 1987.
5. S Wang, and ET Kool. Origins of the Large Differences in Stability of DNA and RNA
Helices: C-5 Methyl and 2-Hydroxyl Effects Biochemistry, 34:4125-4132, 1995.
6. B Hille. Ionic Channels of Excitable Membranes. Sinauer Associates, Sunderland, MA,
2001.
7. Y Zhou, JH Morais-Cabral, A Kaufman, and R MacKinnon. Chemistry of ion coor-
dination and hydration revealed by a K+ channel-Fab complex at 2.0 A resolution.
Nature, 414:43-48, 2001.
8. CM Armstrong, and SR Taylor. Interaction of barium ions with potassium channels in
squid giant axons. Biophys. J, 30:473-488, 1980.
9. DC Eaton, and MS Brodwick. Effect of barium on the potassium conductance of squid
axons. J Gen. Physiol., 75:727-750, 1980.
10. J Neyton, and C Miller. Discrete Ba2+ block as a probe of ion occupancy and
pore structure in the high-conductance Ca2+-activated K+ channel. J Gen. Physiol.,
92:569-586, 1988.
11. Y Jiang, and R MacKinnon. The Barium Site in a Potassium Channel by X-Ray
Crystallography. J Gen. Physiol., 15(3):269-272, 2000.
12. C Vergara, and R Latorre. Kinetics of Ca2+-activated K+ channels from rabbit mus-
cle incorporated into planar bilayers. Evidence for a Ca2+ and Ba2+ blockade. J Gen.
Physiol., 75:727-750, 1980.
13. N Alagem, M Dvir, and E Reuveny. Mechanism of Ba(2+) block of a mouse in-
wardly rectifying K+ channel: differential contribution by two discrete residues. J
Gen. Physiol., 534:381-39, 2001.
14. P Proks, JF Antcliff, and FM Ashcroft. The ligand-sensitive gate of a potassium
channel lies close to the selectivity filter. EMBO Journal, 4:70-75, 2003.
15. FC Chatelain, N Alagem, Q Xu, R Pancaroglu, E Reuveny, and DL Minor Jr. The
Pore Helix Dipole Has a Minor Role in Inward Rectifier Channel Function Neuron, 47,
833-843, 2005.
16. FC Chatelain, S Gazzarrini., Y Fujiwara, C Arrigoni, C Domigan, G Ferrara, C Pantoja,
G Thiel, A Moroni, and DL Minor Jr. Selection of Inhibitor-Resistant Viral Potassium
Channels Identifies a Selectivity Filter Site that Affects Barium and Amantadine Block.
PLoS ONE, 4(10): e7496, 2009.
17. S Varma, DM Rogers, LR Pratt, and SB Rempe. Perspectives on Ion selectivity: De-
sign principles for K+ selectivity in membrane transport. J Gen. Physiol., 137:479-488,
2011.
18. I Kim and TW Allen. On the selective ion binding hypothesis for potassium channels.
Proc. Natl. Acad. Sci. USA., 108:17963-17968, 2011.
19. RT Shealy, AD Murphy, R Ramarathnam, E Jakobsson, and S Subramaniam.
Sequence-function analysis of the K+-selective family of ion channels using a compre-
hensive alignment and the KcsA channel structure. Biophys. J, 84:2929-2942, 2003.
20. C Topert, F Doring, E Wischmeyer, C Karschin, J Brockhaus, K Ballanyi, C Derst,
and A Karschin Kir2.4: A Novel K+ Inward Rectifier Channel Associated with Mo-
toneurons of Cranial Nerve Nuclei. J Neurosci., 18(11):4096-4105, 1998.
21. H Hibino, A Inanobe, K Furutani, S Murakami, I Findlay, and Y Kurachi Inwardly
Rectifying Potassium Channels: Their Structure, Function, and Physiological Roles.
Physiol. Rev., 90: 291-366, 2010.
22. WC Wimley, SH White. Experimentally determined hydrophobicity scale for proteins
at membrane interfaces. Nature Struct. Biol., 3:842-848, 1996.
23. S Varma, and SB Rempe. Multibody effects in ion binding and selectivity. Biophys.
J., 99:3394-3401, 2010.
24. DL Bostick, and CL Brooks III. Selective complexation of K+ and Na+ in simple
polarizable ion-ligating systems. J Amer. Chem. Soc., 132:13185-13187, 2010.
25. A Tkatchenko M Rossi, V Blum, J Ireta, and M Scheffler. Unraveling the stability
of polypeptide helices: Critical role of van der Waals interactions. Phys. Rev. Lett.,
106:118102-118106, 2011.
26. JP Perdew, K Burke, and M Ernzerhof. Generalized Gradient Approximation Made
Simple. Phys. Rev. Lett., 77(18):3865–3868, 1996.
27. M. Ernzerhof and G. E. Scuseria. Assessment of the Perdew-Burke-Ernzerhof
exchange-correlation functional. J Chem. Phys., 110:5029, 1999.
Footline Author PNAS Issue Date Volume Issue Number 5
28. C. Adamo and V. Barone. Toward reliable density functional methods without ad-
justable parameters: The PBE0 model. J Chem. Phys., 110(13):6158–6170, 1999.
29. N Marom, A Tkatchenko, M Scheffler and L Kronik Describing Both Dispersion In-
teractions and Electronic Structure Using Density Functional Theory: The Case of
Metal-Phthalocyanine Dimers. J Chem. Theory Comput., 6:81-90, 2010.
30. A. Tkatchenko and M. Scheffler. Accurate molecular van der Waals interactions
from ground-state electron density and free-atom reference data. Phys. Rev. Lett.,
102:073005, 2009.
31. N Marom, A Tkatchenko, M Rossi, V Gobre, O Hod, M Scheffler, and L Kronik.
Dispersion interactions within density functional theory: Benchmarking semi-empirical
and pair-wise corrected density functionals. J. Chem. Theory Comput., 7:3944–3951,
2011.
32. RD Nelson, DR Lide, and AA Maryott. Selected Values of Electric Dipole Moments
for Molecules in the Gas Phase. Natl. Stand. Ref. Data Ser. - Nat. Bur. Stnds., 10,
1967.
33. MD Tissandier et al. The proton’s absolute aqueous enthalpy and Gibbs free energy
of solvation from cluster–ion solvation data. J. Phys. Chem. A, 102:7787–7794, 1998.
34. M Peschke, AT Blades, and P Kebarle. Hydration energies and entropies for Mg2+,
Ca2+, Sr2+ and Ba2+ from gas-phase ion-water molecule equilibria determinations.
J. Phys. Chem. A, 102:9978–9985, 1998.
35. SB Nielsen, M Masella, and P Kebarle. Competitive gas-phase solvation of alkali
metal ions by water and methanol. J. Phys. Chem. A, 103:9891–9898, 1999.
36. EE Dahlke, and DJ Truhlar. Assessment of the pairwise additive approximation and
evaluation of many-body terms for water clusters. J. Phys. Chem. B Lett., 110:10595-
10601, 2006.
37. C Krekeler, and L Delle Site. Solvation of positive ions in water: the dominant role
of water-water interaction. J. Phys. Condens. Matter. , 19:192101-192107, 2007.
38. L Pauling. The Structure and Entropy of Ice and of Other Crystals with Some Ran-
domness of Atomic Arrangement. J. Am. Chem. Soc., 57:2680-2684, 1935.
39. RA DiStasio, OA von Lilienfeld, and A Tkatchenko. Collective many-body van der
Waals interactions in molecular systems. Proc. Natl. Acad. Sci. USA, 109: 14791-
14795, 2012.
40. S Varma, and SB Rempe. Tuning ion coordination architectures to enable selective
partitioning. Biophys. J., 93:1093-1099, 2007.
41. S Varma, and SB Rempe. Structural transitions in ion coordination driven by changes
in competition for ligand binding. J. Amer. Chem. Soc., 130:15405-15419, 2008.
42. AL Hodgkin, and RD Keynes. The potassium permeability of a giant nerve fibre. J
Physiol. (London), 128:61-88, 1955.
43. IH Shrivastava and MS Sansom. Simulations of ion permeation through a potas-
sium channel: molecular dynamics of KcsA in a phospholipid bilayer. Biophys J., 78:
557-570, 2000.
44. J Aqvist, and V Luzhkov. Ion permeation mechanism of the potassium channel. Na-
ture, 404:881-884., 2000.
45. S Berneche, and B Roux. Energetics of ion conduction through the K+ channel.
Nature, 414:73-77, 2001.
46. RJ Mashl, Y Tang, J Schnitzer, and E Jakobsson. Hierarchical approach to predicting
permeation in ion channels. Biophys J., 81: 2473-2483, 2001.
47. S Furini, and C Domene. Atypical mechanism of conduction in potassium channels.
Proc. Natl. Acad. Sci. USA, 106:16074-16077, 2009.
48. MO Jensen, DW Borhani, K Lindorff-Larsen, P Maragakis, V Jogini, MP Eastwood,
RO Dror, and DE Shaw. Principles of conduction and hydrophobic gating in K+
channels. Proc. Natl. Acad. Sci. USA, 107:5833-5838, 2010.
49. M Iwamoto and S Oiki Counting Ion and Water Molecules in a Streaming File through
the Open-Filter Structure of the K Channel. J. Neurosci., 31(34):12180-12188, 2011.
50. MN Krishnan, JP Bingham, SH Lee, P Trombley, and E Moczydlowski Functional role
and affinity of inorganic cations in stabilizing the tetrameric structure of the KcsA K+
channel. J Gen. Physiol., 126: 271-283, 2005.
51. MN Krishnan, P Trombley, and E Moczydlowski Thermal stability of the K+ channel
tetramer: cation interactions and the conserved threonine residue at the innermost
site (S4) of the KcsA selectivity filter. Biochemistry, 47: 5354-5367, 2008.
52. S Wang, Y Alimi, A Tong, CG Nichols, and D Enkvetchakul Differential roles of
blocking ions in KirBac1.1 tetramer stability. J Biol. Chem., 284: 2854-2860, 2009.
53. S Noskov, B Roux. Control of ion selectivity in potassium channels by electrostatic
and dynamic properties of carbonyl ligands. Nature, 431:830-834, 2004.
54. MG Derebea, DB Sauera, W Zeng, A Alam, N Shi, and Y Jiang. Tuning the ion
selectivity of tetrameric cation channels by changing the number of ion binding sites.
Proc. Natl. Acad. Sci. USA, 108(2):598-602, 2011.
55. DL Bostick, and CL Brooks. Selectivity in K+ channels is due to topological control
of the permeant ions coordinated state. Proc. Natl. Acad. Sci. USA., 104: 9260-9265,
2007.
56. M Thomas, D Jayatilaka, and B Corry. The predominant role of coordination number
in potassium channel selectivity. Biophys. J., 93:2635-2643, 2007.
57. V Blum, R Gehrke, F Hanke, P Havu, X Ren, K Reuter, and M Scheffler. Ab ini-
tio molecular simulations with numeric atom-centered orbitals. Comp. Phys. Comm.,
180:2175–2196, 2009.
58. V Havu, V Blum, P Havu, and M Scheffler. Efficient o(n) integration for all-
electron electronic structure calculation using numeric basis functions. J Comput.
Phys, 228(22):8367–8379, 2009.
59. P Jurecka, J Sponer, J Cerny, and P Hobza. Benchmark database of accurate (MP2
and CCSD(T) complete basis set limit) interaction energies of small model complexes,
DNA base pairs, and amino acid pairs. Phys. Chem. Chem. Phys., 8:1985, Jan 2006.
60. E. van Lenthe, E. J. Baerends, and J. G. Snijders. Relativistic regular two-component
Hamiltonians. J Chem. Phys., 99(6):4597–4610, 1993.
61. A Cohen, P Mori-Sanchez, and W. Yang. Insights into Current Limitations of Density
Functional Theory Science 321:792–794, 2008.
6 www.pnas.org/cgi/doi/10.1073/pnas.0709640104 Footline Author
Table 1. Free energy change ∆G at298 K associated with the substitu-tion reaction given by Equ. 2.
n Na+ K+ Ba2+
1 1.5 1.4 2.3
2 2.2 (2.5)a 2.4 (1.7)a 3.4 (3.8)a
3 3.3 2.5 5.0
4 4.6 3.4 6.9a Two simultaneous T → S substitutions can be introduced in two different ways. In one case, the substitutions can be made on adjacent
threonine residues, and in the other case, the substitutions are made on non-adjacent threonine residues. The numbers in brackets
correspond to T → S substitutions made on adjacent threonine residues.
Footline Author PNAS Issue Date Volume Issue Number 7