Some Surprises in the Biophysics of Protein Dynamics
Vijay S. Pande
Departments of Chemistry, Structural Biology, and Computer Science
Program in Biophysics
Stanford University
Friday, March 15, 13

BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics


Crystallography gives a wealth of information

  • P53 Oligomerization (50% of cancers)
  • Collagen Helix Formation (Osteogenesis Imperfecta)
  • Ribosome (Last step of Central Dogma, Antibiotic resistance)
  • Chaperonin Assisted Folding (relevant to cancer: HSP90 inhibitors)
  • Aβ peptide aggregation (Alzheimer’s Disease)


Ceci n’est pas une pipe. (“This is not a pipe.”)


“This is not a GPCR” (Hibert et al, TIPS Reviews, 1993)


“This is not a cell”

Age old challenges of molecular simulation

1. Finding a sufficiently accurate model

2. Sampling sufficiently long timescales

3. Learning something new from the resulting flood of data

How do you break a billion-fold impasse? Combine multiple, powerful, complementary technologies

1) Folding@home: very large-scale distributed computing
   Most powerful computer cluster in the world (~8 petaflops)
   Speedup: 10^4x to 10^5x
   http://folding.stanford.edu
   Voelz, et al, JACS (2010); Ensign et al, JMB (2007); Shirts and Pande, Science (2000)

2) OpenMM: Very fast MD (~1 µs/day) on GPUs
   ~1 µs/day for implicit solvent simulation of small proteins (~40 aa)
   Speedup: 10^2x to 10^3x
   http://simtk.org/home/openmm
   Elsen, et al. ACM/IEEE conf. on Supercomputing (2006); Friedrichs, et al. J. Comp. Chem. (2009); Eastman and Pande. J. Comp. Chem. (2009)

3) Markov State Models: Statistical mechanics of many trajectories
   Very long timescale dynamics by combining many simulations
   Speedup: 10^2x to 10^3x
   http://simtk.org/home/msmbuilder
   Bowman, et al, J. Chem. Phys. (2009); Singhal & Pande, J. Chem. Phys. (2005); Singhal, et al, J. Chem. Phys. (2004)
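As a rough check on the "billion-fold" figure, the three speedup ranges quoted above can be multiplied together. The sketch below assumes the three factors are independent and simply compound, which the slide implies but does not state. The dictionary name and layout are illustrative only.

```python
# Speedup ranges from the slide, stored as (low, high) exponents of ten.
speedups_log10 = {
    "Folding@home": (4, 5),
    "OpenMM on GPUs": (2, 3),
    "Markov State Models": (2, 3),
}

# Multiplying powers of ten is the same as adding their exponents.
low = sum(lo for lo, _ in speedups_log10.values())
high = sum(hi for _, hi in speedups_log10.values())

print(f"combined speedup: 10^{low}x to 10^{high}x")
```

Under that assumption the combined range is 10^8x to 10^11x, which brackets a billion (10^9).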


What are Markov State Models (MSMs)?

Markov State Models (MSMs) are a theoretical scheme to build models of long timescale phenomena (1) to help simulators reach long timescales and (2) to gain insight from their simulations

see the work of: Andersen, Caflisch, Chodera, Deuflhard, Dill, Grubmüller, Hummer, Levy, Noé, Pande, Pitera, Singhal-Heinrichs, Roux, Schütte, Swope, Weber

States avoid issues with projections and reaction coordinates (R.C.’s)

[Figure adapted from Dobson, et al, Nature. Labeled states: Synthesis; U; I; N; Oligomer; Disordered aggregate; Prefibrillar species; Amyloid fibril; Crystal; Fiber; Degraded fragments]

Master equation:

$$\frac{dp_i}{dt} = \sum_l \left[ k_{l,i}\, p_l - k_{i,l}\, p_i \right]$$
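A minimal sketch of how the master equation above can be integrated numerically. It assumes a hypothetical three-state chain U, I, N with made-up rate constants; the array `k`, the helper names, and the forward-Euler scheme are illustrative choices, not taken from the lecture.

```python
import numpy as np

# Hypothetical rate constants k[i, l] for transitions i -> l among three
# states (0 = U, 1 = I, 2 = N). The numbers are illustrative only.
k = np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 2.0],
    [0.0, 0.1, 0.0],
])

def master_rhs(p, k):
    """Right-hand side dp_i/dt = sum_l [k_{l,i} p_l - k_{i,l} p_i]."""
    gain = k.T @ p            # sum_l k[l, i] * p[l]: flow into state i
    loss = k.sum(axis=1) * p  # p[i] * sum_l k[i, l]: flow out of state i
    return gain - loss

def integrate(p0, k, dt=1e-3, n_steps=50_000):
    """Forward-Euler integration of the master equation."""
    p = np.array(p0, dtype=float)
    for _ in range(n_steps):
        p = p + dt * master_rhs(p, k)
    return p

# Start with everything in U and relax toward equilibrium.
p_final = integrate([1.0, 0.0, 0.0], k)
print(p_final, p_final.sum())
```

Because gain and loss terms cancel when summed over all states, total probability is conserved at every step. For this chain the long-time populations satisfy detailed balance between neighboring states.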

MSMs coarse grain conformation space (to ~3 Å) to build a Master equation

Build from MD: derive rate matrix from simulation w/ Bayesian methods
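The slide does not say which Bayesian method is used, so the following is only a sketch of one simple possibility. It counts transitions between discrete states at a fixed lag time and places an independent Dirichlet prior on each row. Note that it estimates a lag-time transition probability matrix rather than the rate matrix named on the slide. All function names, the toy trajectories, and the prior value are hypothetical, and this is not the MSMBuilder API.

```python
import numpy as np

def count_transitions(trajs, n_states, lag=1):
    """Count observed transitions a -> b separated by `lag` frames."""
    counts = np.zeros((n_states, n_states))
    for traj in trajs:
        for a, b in zip(traj[:-lag], traj[lag:]):
            counts[a, b] += 1
    return counts

def posterior_mean_T(counts, prior=1.0):
    """Posterior mean of each row under a Dirichlet(prior) prior."""
    post = counts + prior
    return post / post.sum(axis=1, keepdims=True)

def sample_T(counts, prior=1.0, rng=None):
    """Draw one transition matrix from the row-wise Dirichlet posterior."""
    rng = np.random.default_rng() if rng is None else rng
    return np.vstack([rng.dirichlet(row + prior) for row in counts])

# Toy state-index trajectories standing in for clustered MD frames.
trajs = [[0, 0, 1, 1, 2, 2, 2], [2, 2, 1, 0, 0, 1, 2]]
C = count_transitions(trajs, n_states=3)
T_mean = posterior_mean_T(C)
T_draw = sample_T(C, rng=np.random.default_rng(0))
print(T_mean)
```

Repeated calls to `sample_T` give an ensemble of plausible matrices, so the spread of any derived quantity (a rate, a population) gives a rough error bar.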

but also derive a coarser view for human consumption

Coarse grain MSM: use eigenvectors to identify collective modes
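A small sketch of what "use eigenvectors to identify collective modes" can look like in practice. It assumes a hypothetical four-state transition matrix with two fast-mixing pairs of states. Splitting states by the sign of the slowest non-stationary eigenvector is a simplified stand-in for the lumping methods used in real MSM software; the matrix, `lag_time`, and variable names are illustrative.

```python
import numpy as np

# Hypothetical row-stochastic transition matrix: states {0, 1} and {2, 3}
# interconvert quickly within each pair and slowly between pairs.
T = np.array([
    [0.89, 0.10, 0.01, 0.00],
    [0.10, 0.89, 0.00, 0.01],
    [0.01, 0.00, 0.89, 0.10],
    [0.00, 0.01, 0.10, 0.89],
])
lag_time = 1.0  # arbitrary time units (assumed)

# Eigen-decomposition, sorted from slowest (eigenvalue 1) to fastest.
evals, right_vecs = np.linalg.eig(T)
order = np.argsort(-evals.real)
evals = evals.real[order]
right_vecs = right_vecs.real[:, order]

# Each sub-unit eigenvalue sets a relaxation (implied) timescale.
timescales = -lag_time / np.log(evals[1:])

# The sign pattern of the slowest non-stationary eigenvector groups the
# microstates into two macrostates.
slowest = right_vecs[:, 1]
macrostate = (slowest > 0).astype(int)
print(evals, timescales, macrostate)
```

The gap between the first implied timescale and the rest is what justifies a two-macrostate picture for this toy matrix.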


Heart of the power of MSMs

Systematically identifying intermediate states allows us to (1) qualitatively understand and (2) quantitatively predict chemical mechanisms


The MSM can tell us which results are robust

  • Perturbations to transition matrix can be handled like QM perturbation theory
    • Transition matrix with error (T0 = “real” matrix)
    • We calculate perturbed eigenvalues (i.e. rates)
    • and perturbed eigenvectors (i.e. mechanism)
  • Key result
    • error perturbs eigenvalues
    • results will be robust in the discrete region of the eigenvalue spectrum
  • Relevant for both theory and experiment

classical perturbation theory and sloppiness theory to investigate MSM observable robustness in the face of transition probability perturbation. We also develop a quantitative Bayesian metric by which robustness can be evaluated, and we discuss implications such robustness holds for future applications of MSMs to biophysical phenomena.

Methodology.

Exploring mechanism in an MSM context. To qualify a sense of robustness in mechanistic properties, we must first consider how “mechanism” should be defined in an MSM context. On first thought, one might consider that a protein’s folding trajectory represents its folding mechanism. However, we argue that this view of mechanism is overly restrictive: individuals within an ensemble experience different state-to-state transition sequences in the folding process. While the transition matrix defines which trajectories are possible, it also does not provide a clear picture of which pathways the ensemble prefers over short and long periods of time.

The eigenspectrum of the MSM transition matrix, however, provides both kinetic and thermodynamic information about the ensemble. With units of probability density, the transition matrix eigenvectors represent the normal modes of time evolution in the system. The stationary distribution, the eigenvector with unit eigenvalue, describes the equilibrium populations in the ensemble. The other eigenvectors, with sub-unit eigenvalues, describe changes in the system’s population distribution at timescales set by their respective eigenvalues.

Formally, the transfer of probability density in MSMs is related to the eigenvectors via the expression:

$$\pi(n) \propto \sum_i \lambda_i^{n} \left\langle \pi(n-1),\, g_i \right\rangle e_i \qquad [1]$$

where $\pi(n)$ represents the system’s nth probability distribution vector, $\lambda$ denotes an eigenvalue of the transition matrix, and $g$ and $e$ are the corresponding right and left eigenvectors of the transition matrix, respectively [11].

This expression describes how an arbitrary population distribution converges to the equilibrium distribution over time. Note that, as the number of timesteps n becomes large, all sub-unit eigenvalues (through the term $\lambda^n$) and their eigenvectors decay to zero, and eventually only the stationary distribution, multiplied by the unit eigenvalue, remains. Due to discretization, the state probability distribution at step n is only rigorously equal to the right hand side when all states give rise to “slow” eigenvectors; no analogy exists to the continuous spectrum of eigenvalues in transfer operator theory [11]. In actuality, the above expression yields a vector proportional to the exact population distribution at a given timestep; the constant of proportionality can be determined through simple fitting.
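The decay described here can be checked numerically on a small example. The sketch below assumes a hypothetical three-state transition matrix and projects the initial distribution π(0) onto the eigenvectors. For a finite matrix with a complete eigenbasis this gives exact equality with direct propagation, which is a deliberate simplification of Eq. [1]. The matrix, the biorthogonal normalization through the matrix inverse, and the helper name are illustrative assumptions.

```python
import numpy as np

# Hypothetical row-stochastic transition matrix (illustrative numbers).
T = np.array([
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.20, 0.80],
])

evals, G = np.linalg.eig(T)   # columns of G are right eigenvectors g_i
E = np.linalg.inv(G)          # rows of E are left eigenvectors e_i,
                              # normalized so that e_i . g_j = delta_ij
evals, G, E = evals.real, G.real, E.real  # spectrum is real for this chain

def propagate_spectral(p0, n):
    """p(n) = sum_i lambda_i^n <p0, g_i> e_i."""
    coeffs = p0 @ G                 # projections <p0, g_i>
    return (coeffs * evals**n) @ E  # weight each left eigenvector

p0 = np.array([1.0, 0.0, 0.0])
direct = p0 @ np.linalg.matrix_power(T, 50)
spectral = propagate_spectral(p0, 50)
print(direct, spectral)
```

For large n every term with a sub-unit eigenvalue vanishes, and `propagate_spectral` returns the stationary distribution alone.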

Relating mechanism to an MSM eigenspectrum offers advantages over the alternatives that were previously discussed. The eigenvector decomposition method provides details about how entire probability distributions change, allowing for an idea of mechanism on an ensemble level. Large eigenvector entries represent states that are important to density transfer on the relaxation timescale of an associated eigenvalue. One can inspect the set of eigenvectors to find which individual states are mechanistically relevant at both fast and slow timescales. Information about trajectory (which folding pathways are most probable) and end result (how the state probability distribution converges to the stationary distribution) is intrinsic to the eigenspectrum. Together, we contend, trajectory and end result define the essential parts of a folding mechanism. As such, we suggest that an MSM mechanism be defined in the context of eigenvector decompositions.

Perturbation Theory Framework. In quantum mechanics, a popular method for approximating solutions to the Schroedinger equation involves splitting the system Hamiltonian into zeroth- and higher-order parts with expansion parameter $\epsilon$:

$$H = H_0 + \epsilon H' + \epsilon^2 H'' + \dots \qquad [2]$$

If the eigenvalue problem for the zeroth order Hamiltonian can be solved exactly, corrections to the eigenvalues and eigenvectors based on the “perturbed” Hamiltonian can be calculated with the well known eigenspectrum perturbation theory [19].

In analogy to the quantum mechanical problem, an MSM transition matrix could also be augmented by a “perturbation operator.” Suppose we would like to calculate the impact of a random perturbation on the eigenspectrum of the transition matrix. We could define a perturbed transition matrix T (to first order) such that

$$T \approx T_0 + \epsilon T' \qquad [3]$$

where $T_0$ is the original transition matrix and $T'$ is a matrix of random noise. The first order correction due to noise, $\lambda'_n$, for each eigenvalue $\lambda^0_n$ of the transition matrix is given by the simple inner product

$$\lambda'_n = \left\langle e^0_n \right| T' \left| e^0_n \right\rangle \qquad [4]$$

where $e^0_n$ is the nth eigenvector of the zeroth-order transition matrix [19]. Corrected eigenvectors are given by the formula

$$e_n = e^0_n + \sum_{j \neq n} \frac{\left\langle e^0_j \right| T' \left| e^0_n \right\rangle}{\lambda^0_n - \lambda^0_j}\, e^0_j \qquad [5]$$

Using these corrections due to perturbation, one could gauge the impact of a random noise (or a more systematic) change in a transition matrix on its eigenspectrum. We illustrate the application of this perturbation theory by applying the above analysis to the eigenvalues of the villin transition matrix.
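A minimal numerical check of the first-order corrections in Eqs. [4] and [5]. This is a sketch under simplifying assumptions: the zeroth-order matrix and the noise are both taken to be symmetric, so the eigenvectors are orthonormal and the bracket is a plain quadratic form. The matrix `T0`, the noise `T1`, and the small parameter `eps` are made up for illustration and have no connection to the villin model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric (doubly stochastic) zeroth-order matrix T0.
T0 = np.array([
    [0.90, 0.08, 0.02],
    [0.08, 0.85, 0.07],
    [0.02, 0.07, 0.91],
])
noise = rng.normal(size=T0.shape)
T1 = (noise + noise.T) / 2  # symmetric random noise matrix T'
eps = 1e-4                  # expansion parameter (assumed small)

# Zeroth-order spectrum: ascending eigenvalues, orthonormal columns.
evals0, vecs0 = np.linalg.eigh(T0)

# Eq. [4]: lambda'_n = <e0_n| T' |e0_n> for every n at once.
first_order = np.einsum("in,ij,jn->n", vecs0, T1, vecs0)
predicted = evals0 + eps * first_order

# Exact eigenvalues of the perturbed matrix, for comparison.
exact = np.linalg.eigvalsh(T0 + eps * T1)
print(np.abs(exact - predicted).max())

def corrected_vector(n):
    """Eq. [5]; the factor eps appears because the perturbation is eps * T'."""
    e_n = vecs0[:, n].copy()
    for j in range(len(evals0)):
        if j != n:
            coupling = vecs0[:, j] @ T1 @ vecs0[:, n]
            e_n += eps * coupling / (evals0[n] - evals0[j]) * vecs0[:, j]
    return e_n
```

The remaining discrepancy scales with the square of `eps`. It grows when two zeroth-order eigenvalues are close together, since the denominators in Eq. [5] become small.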

Need for a New Framework. The method more extensively used in this study is analogous, though not identical, to classical perturbation theory. We perturb a transition matrix with noise, calculate the “corrected” eigenspectrum, and compare that eigenspectrum to the original. We decide to use an alternative method for two reasons. First, we would like to gauge the rate of change (called the sensitivity) in an eigenvalue or eigenvector with respect to the magnitude of perturbation. Furthermore, we would like to know this eigenspectrum sensitivity for each individual parameter in the model. These desires are not trivially fulfilled with analytical perturbation theory. This paper’s method, drawn from the literature and tested on biological models, is designed to estimate such a rate of change [16–18].

Secondly, sophisticated theory for error propagation in MSMs has been developed using a sensitivity based analysis [20, 21]. These methods use advanced Bayesian schemes to estimate uncertainty based on the available data. The nature of the sensitivities used to estimate such errors, however, has never been well characterized. It would be useful to gain intuition about the relative magnitudes of eigenspectrum sensitivities in recently constructed MSMs. Sloppiness-based techniques, as discussed below, provide an avenue to do so. While schemes using perturbation theory to gauge robustness are explored briefly in the following sections, we suggest a sloppiness-based analysis will be preferable for much of this work.

Sloppiness Framework. To investigate sloppiness in MSM transition probabilities, we choose to use a so-called model-parameter cost function on the transition probability matrix. Given the perturbation of a certain system parameter, the cost function returns the induced sum-squared deviation in a dependent observable. In ...

J. Weber and V. S. Pande. Protein folding is mechanistically robust. Biophys J. (2011)

[Figure: eigenvalue spectrum (rates of MSM states) on an axis running from slow to fast, divided into a discrete region and a continuous region. (J. Weber, VSP)]


Folding simulation has come a long way in 15 years

[Figure: folding time (microseconds, log scale from 0.1 to 10,000) versus year (1998 to 2012) for simulated proteins and peptides: Fs Peptide, Trp Zip, Trp Cage, Chignolin, BBA, BBA5, Villin, Fip35 WW, GTT WW, Pin1 WW, NTL9, BBL, Protein B, Homeodomain, Protein G, α3D, ACBP, Lambda. Groups: Kollman, Pande (Folding@home), Shaw (ANTON supercomputer), Schulten, Noé. Blue = explicit solvent; red = implicit solvent.]


Can we quantitatively predict experiment?

[Figure: predicted folding time (μs) versus experimental folding time (μs), both on log scales, for Fs peptide, λ-repressor, ACBP, NTL9, Fip35 WW, WT villin, BBA5, Trp zip, and Trp-cage. Legend: Pande; implicit and explicit solvent.]

Page 26: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

What has the community done so far?

[Figure: predicted folding time (μs) versus experimental folding time (μs), both on log scales, for Fs peptide, λ-repressor, ACBP, NTL9, Protein G, Fip35 WW, homeodomain, villin Nle, Protein B, BBL, Pin1 WW, Trp-cage, α3D, WT villin, BBA5, and Trp zip. Legend: Pande, Shaw, Noé, Schulten; implicit and explicit solvent.]

Page 29: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Experiments can now probe detailed MSM aspects

[Figure: ∆G (kcal/mol) versus RMSD (Å)]

Many states have low ∆G and are highly structurally related

from Reiner, Henklein, & Kiefhaber, PNAS (2010)

(Beauchamp, Das, VSP)

Bowman, Beauchamp, Boxer, Pande, JCP (2009); Beauchamp, Das, Pande, PNAS (2011)

Page 30: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

“It is nice to know that the computer understands the problem. But I would like to understand it too.”

– Eugene Wigner, in response to a large-scale quantum mechanical calculation

The challenge of simulating vs understanding

Page 35: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

A brief history of protein folding kinetics theory

• 1990: Simple kinetic models
  • Master equation approaches (Shakhnovich et al; Orland et al; Wolynes et al)
  • Lattice model simulations (Dill; many others)

• 2000: A native-centric view dominates
  • Experiments suggest a two-state model for protein folding kinetics (Fersht)
  • Contact order (Plaxco, Simons, Baker)
  • Minimal frustration/protein design approach (Wolynes; Shakhnovich; Pande; others)
  • Consequence: Go model simulations, funnel energy landscape paradigm

• 2010: The native-centric view is unsatisfying
  • Structure in the unfolded state (e.g. Raleigh)
  • Slow diffusion (e.g. Lapidus)
  • non-native interactions (e.g. Matthews)

[Figure: protein structure with residues PHE11, PHE18, TRP24, and PHE35 labeled]

• What is a Go model?
  • $H_\alpha = -\varepsilon \sum_{ij} C^{\alpha}_{ij} C^{N}_{ij}$
  • interactions present in the folded state are attractive
  • all others are repulsive
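The Go-model Hamiltonian on this slide can be evaluated directly from two contact maps. The sketch below is an illustration under stated assumptions: contact maps are symmetric 0/1 matrices, each residue pair is counted once, and the 4-residue example is made up. In this minimal form, non-native contacts simply contribute zero energy; an explicit repulsive term for them would have to be added separately.

```python
import numpy as np


def contact_map(n_residues, pairs):
    """Build a symmetric 0/1 contact matrix from a list of residue pairs."""
    contacts = np.zeros((n_residues, n_residues), dtype=int)
    for i, j in pairs:
        contacts[i, j] = contacts[j, i] = 1
    return contacts


def go_energy(contacts, native_contacts, epsilon=1.0):
    """H = -epsilon * sum over pairs i<j of C_ij * C^N_ij.
    Only contacts that are also present in the native state lower the energy."""
    upper = np.triu_indices_from(contacts, k=1)  # count each pair once
    return -epsilon * float(np.sum(contacts[upper] * native_contacts[upper]))


# Hypothetical 4-residue example.
native = contact_map(4, [(0, 2), (1, 3), (0, 3)])
partially_folded = contact_map(4, [(0, 2), (0, 1)])  # (0, 1) is non-native

print(go_energy(native, native))            # all three native contacts formed
print(go_energy(partially_folded, native))  # one native contact formed
```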

Page 36: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

A key question dominating protein folding theory

How important are non-native (i.e. not present in the folded state) interactions?

Page 37: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Folding simulation has come a long way in 15 years

[Figure, repeated: folding time (microseconds, log scale) versus year (1998 to 2012), labeled by group (Kollman, Pande, Schulten, Noé, Shaw), with NTL9 and Lambda called out; blue = explicit solvent, red = implicit solvent; legend entries for Folding@home and the ANTON supercomputer.]

Page 38: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Folding simulation has come a long way in 15 years

[Figure, repeated: folding time (microseconds, log scale) versus year (1998 to 2012), labeled by group (Kollman, Pande, Schulten, Noé, Shaw), with NTL9 and Lambda highlighted; blue = explicit solvent, red = implicit solvent.]

Page 40: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Pathway seen in the movie: Series of metastable states

snapshots from the movie:

• starts in unfolded state
• helix forms early
• collapse, then beta sheet forms
• final part of beta ready to align
• folded structure forms

correspond to states from our Markov State Model

Voelz, Bowman, Beauchamp, Pande. JACS (2010)

Page 41: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Repeating with many more trajectories yields an MSM: coarse visualization

[Figure: network of macrostates a through n. The area of each state is proportional to macrostate free energy; the width of each arrow is proportional to transition flux.]

Top 10 folding pathways show us:

• A great deal of pathway heterogeneity exists
• non-native structure plays a key role in many states
• metastability is often structurally localized (analogous to the foldon concept)

a→l→n and a→m→n comprise 10% of the total flux

Flux calculation method: TPT: Vanden-Eijnden, et al (2006); Berezhkovskii, Hummer, Szabo (2009)

(Voelz, Bowman, Beauchamp, VSP)
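The flux calculation named on this slide (transition path theory, TPT) can be sketched for a tiny model. This is a minimal illustration, not the code used for the figure: it assumes a reversible transition matrix, so the backward committor is 1 minus the forward committor, and the 4-state matrix with two parallel routes (0→1→3 and 0→2→3) is made up.

```python
import numpy as np


def forward_committor(t, source, sink):
    """Probability of reaching the sink before the source from each state."""
    n = t.shape[0]
    sink_list = sorted(sink)
    q = np.zeros(n)
    q[sink_list] = 1.0
    inter = [i for i in range(n) if i not in source and i not in sink]
    if inter:
        # Solve (I - T_II) q_I = T_I,sink * 1 for the intermediate states.
        a = np.eye(len(inter)) - t[np.ix_(inter, inter)]
        b = t[np.ix_(inter, sink_list)].sum(axis=1)
        q[inter] = np.linalg.solve(a, b)
    return q


def stationary_distribution(t):
    """Left eigenvector of T with eigenvalue 1, normalized to sum to one."""
    values, vectors = np.linalg.eig(t.T)
    v = vectors[:, np.argmax(values.real)].real
    return v / v.sum()


def net_flux(t, source, sink):
    """Net reactive flux f_ij = pi_i (1 - q_i) T_ij q_j, antisymmetrized."""
    q = forward_committor(t, source, sink)
    pi = stationary_distribution(t)
    flux = pi[:, None] * (1.0 - q)[:, None] * t * q[None, :]
    np.fill_diagonal(flux, 0.0)
    return np.maximum(flux - flux.T, 0.0)


# Hypothetical reversible 4-state model: unfolded = 0, folded = 3.
T = np.array([[0.80, 0.15, 0.05, 0.00],
              [0.15, 0.70, 0.00, 0.15],
              [0.05, 0.00, 0.90, 0.05],
              [0.00, 0.15, 0.05, 0.80]])

q = forward_committor(T, source={0}, sink={3})
flux = net_flux(T, source={0}, sink={3})
fraction_via_state_1 = flux[0, 1] / (flux[0, 1] + flux[0, 2])
print(q, fraction_via_state_1)
```

Dividing the flux along one route by the total flux leaving the source gives statements of the form "this pathway carries a given share of the total flux".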

Page 43: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Contact map view of the states reveals non-native structure formation along the pathway

[Figure: contact maps for states a, h, k, m, and n. Labels: unfolded basin, transition state region, native basin (committor); more alpha, more beta.]

significant amount of non-native structure, even in high pfold states

(Voelz, Bowman, Beauchamp, VSP)

Page 44: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Beta sheet states slow folding in helical proteins?

(Bowman, Voelz, VSP)

G. Bowman, V. Voelz, and V. S. Pande. Atomistic folding simulations of the five helix bundle protein λ6-85. Journal of the American Chemical Society 133 664-667 (2011)

Lambda

Page 45: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

“Intramolecular amyloids”?

[Figure: Lambda structures A through H. Labels: xtal structure without helix 5; β-sheets in unfolded state.]

“λ6-85 is not only thermodynamically, but also kinetically protected from reaching intramolecular analogs of beta sheet aggregates while folding”

– Prigozhin & Gruebele

Page 50: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Consequences of projections

How can one reconcile this with the simple picture?

(Voelz, VSP)

‘‘Regarded from two sides’’ by Diet Wiegman (1984). Kruschel & Zagrovic, DOI:10.1039/b917186j

V. A. Voelz, et al. JACS (2012)

Page 55: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Conclusions

With MSMs, we can simulate folding on the 10ms timescale

[Figure, repeated: folding time (microseconds) versus year, 1998 to 2012]

Simulation methods are sufficiently accurate to predict experiment

[Figure, repeated: predicted versus experimental folding time (μs)]

folding via parallel paths of many metastable states

intramolecular amyloid hypothesis

Page 56: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Where do we go from here?

Page 59: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Petaflops on the cheap today, exaflops soon?

Folding@home

There are approximately a billion computers in the world

How many GPUs? How many GPU flops?

A million GPUs putting out 1 TFLOP each gets us to an exaflop: we could do this today
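The arithmetic behind the last line can be checked in a few lines. The function and constant names are illustrative; the only inputs are the figures quoted on the slide (a million GPUs at 1 TFLOP each).

```python
TERA = 10 ** 12  # 1 TFLOP in floating point operations per second
PETA = 10 ** 15
EXA = 10 ** 18


def aggregate_flops(n_devices, flops_per_device):
    """Total throughput if every device sustains the same rate."""
    return n_devices * flops_per_device


total = aggregate_flops(n_devices=1_000_000, flops_per_device=1 * TERA)
print(total / PETA, "petaflops")
print(total / EXA, "exaflops")
```

This is an upper bound that assumes every device sustains its full rate at the same time.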

Page 61: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

The combination of new simulation advances and chemically detailed models has suggested a paradigm change in how we conceptualize protein folding.

We are now looking to apply MSM approaches to new areas:
1) basis of signal transduction
2) protein misfolding diseases

both involving issues of small molecules and the role of chemical interactions

Page 62: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

New interest in my lab: probing the molecular nature of the mechanism of signal transduction

GPCRs, kinases

Page 66: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

What do we want to do?

kinases

• Understand how they function
  • what is the mechanism of activation & inactivation?
  • how is the signal transduced?
  • what is the role of chemical interactions in this process?

• Use this understanding to modulate their function
  • design/predict novel small inhibitors & activators
  • design/predict protein mutations which yield new functions or new behaviors

• Connect this new chemical insight to basic biology and aspects of disease

Page 67: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Protein Kinases

• Protein kinases are enzymes that modify the function of other proteins by attaching phosphate groups to them.

• The conformational change involves the transfer of a phosphate group from ATP to amino acids with OH groups (serine, threonine, and tyrosine).

Page 68: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Conformational change in src kinase

[Figure: inactive and active structures of src kinase, with the C-helix, A-loop, ATP, and residues TYR419, GLU310, ARG409, and LYS295 labeled, and an hbond marked.]

(Shukla, VSP)

Page 69: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Kinetic traces for activation/deactivation

• We see many activation events
• MSM kinetics can be used to predict experiment
• We get reasonable kinetics
  • Activation timescales consistent with experiment (sub-millisecond timescale)
• What does the mechanism look like?
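How a kinetic trace follows from an MSM can be sketched by propagating state populations with the transition matrix, p(t + τ) = p(t) T. The 3-state model below (inactive, intermediate, active) and its numbers are hypothetical and are not the src kinase model from the talk.

```python
def propagate(populations, transition_matrix):
    """One lag-time step: new_p[j] = sum_i p[i] * T[i][j]."""
    n = len(transition_matrix)
    return [sum(populations[i] * transition_matrix[i][j] for i in range(n))
            for j in range(n)]


def kinetic_trace(transition_matrix, start_state, observed_state, n_steps):
    """Population of observed_state over time, starting fully in start_state."""
    populations = [0.0] * len(transition_matrix)
    populations[start_state] = 1.0
    trace = [populations[observed_state]]
    for _ in range(n_steps):
        populations = propagate(populations, transition_matrix)
        trace.append(populations[observed_state])
    return trace


# Hypothetical model: state 0 = inactive, 1 = intermediate, 2 = active.
T = [[0.95, 0.05, 0.00],
     [0.05, 0.90, 0.05],
     [0.00, 0.05, 0.95]]

trace = kinetic_trace(T, start_state=0, observed_state=2, n_steps=200)
print(trace[0], trace[10], trace[-1])
```

Multiplying the step index by the lag time converts the trace to physical time, which is what would be compared against an experimental relaxation curve.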

Page 74: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Age old challenges of molecular simulation

1. Finding a sufficiently accurate model

2. Sampling sufficiently long timescales

3. Learning something new from the resulting flood of data

?

Page 75: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Kinase conformational change (Shukla, VSP)

Page 76: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Surprise: an intermediate state? (Shukla, VSP)

[Figure label: Inactive]

Page 77: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Surprise 2: intermediate state(s) (Shukla, VSP)

Page 78: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

We find many intermediates! (Shukla, VSP)

Page 82: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Problems with projections

MSMs can tell us where to look as we have a full model

from Chandler (1998)

Page 83: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Heart of the power of MSMs

Systematically identifying intermediate states allows us to
(1) qualitatively understand and
(2) quantitatively predict chemical mechanisms

Page 84: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

New challenges with conformational change

• Building MSMs for conformational change
  • much more challenging than for protein folding
  • as the changes are much more subtle

• We have developed novel theoretical approaches to tackle these new challenges
  • Metric learning approaches: use Machine Learning to identify which degrees of freedom are important and which are noise
  • Dimensionality reduction approaches: identify collective degrees of freedom systematically

• Use these new approaches both to build better MSMs and, ideally, to learn something new about the system

(McGibbon, Schwantes, VSP)
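As a generic illustration of identifying collective degrees of freedom, the sketch below applies principal component analysis (PCA) to a feature matrix. PCA is a stand-in chosen for brevity; the slide does not say which dimensionality reduction method the lab uses, and the synthetic data (one large-amplitude collective motion shared by two features, plus noise) is made up.

```python
import numpy as np


def principal_components(features, n_components):
    """Project mean-centered features onto their leading principal axes.
    Returns (projection, axes, fraction of variance explained per axis)."""
    x = np.asarray(features, dtype=float)
    centered = x - x.mean(axis=0)
    _, singular_values, axes = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    projection = centered @ axes[:n_components].T
    return projection, axes[:n_components], explained[:n_components]


# Synthetic "trajectory": 500 frames, 3 features (e.g. hypothetical distances).
rng = np.random.default_rng(0)
collective_motion = 5.0 * rng.normal(size=500)
noise = 0.1 * rng.normal(size=(500, 3))
features = np.column_stack(
    [collective_motion, -collective_motion, np.zeros(500)]) + noise

projected, components, explained = principal_components(features, 1)
print(components, explained)
```

The leading axis mixes the first two features with opposite signs and nearly ignores the third, which is the sense in which a collective coordinate is separated from noise.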

Page 85: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

MSM reveals key intermediates

• We see many activation events
• MSM kinetics can be used to predict experiment
• We get reasonable kinetics
  • Activation timescales consistent with experiment (sub-millisecond timescale)
• What does the mechanism look like?

Page 89: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Characterizing intermediate 2 (Shukla, VSP)

[Figure: two structures, each with E310 and R409 labeled: Cyclin-dependent Kinase 2 (PDB: 4BCQ) and Intermediate 2 of c-src Kinase (Simulation).]

• C-helix in inactive conformation
• A-loop unfolded
• E310-R409 H-bond broken

Page 90: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Simulations predict drug stabilizes intermediate 2

• ANS binding to the allosteric site adjacent to the C-helix in c-src kinase stabilizes the intermediate conformation
  • by blocking the interactions between K295 and E310
  • h-bond formation between K295 and E310 is required for the locking of the C-helix in the active conformation
  • the sulfonate group in ANS forms a hydrogen bond with K295, thereby locking it in its inactive conformation

(Shukla, VSP)

Page 91: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Simulations predict drug stabilizes intermediate 2

• ANS binding also pushes the C-helix away from the ATP binding pocket
  • Superimposition of the structures obtained from the simulations reveals the distinct conformations of the C-helix in the presence of ANS:
  • ATP-bound c-src kinase (cyan)
  • ATP and ANS-bound src kinase, 1 molecule of ANS in the allosteric site (orange)
  • ATP and ANS-bound src kinase, 2 molecules of ANS in the allosteric site (green)

(Shukla, VSP)

Page 92: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Simulating the kinome

c-src kinase (2SRC)

Lyn kinase (2ZV7)

Fyn kinase (2DQ7)

Hck kinase (2HCK)

Page 93: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Signal transduction in G-protein-coupled receptors

Rosenbaum et al., Nature, 2009.

Page 94: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

G-Protein Coupled Receptor Structure

Kobilka and coworkers, Nature, 2011.

Page 95: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Key Details

Page 96: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Trajectories of β2 behavior: Agonist bound

(Kohlhoff, Shukla, Lawrenz, …, VSP)

Page 97: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

What do we want to do?

• Understand how they function
  • what is the mechanism of activation & inactivation?
  • how is the signal transduced?
  • what is the role of chemical interactions in this process?

• Use this understanding to modulate their function
  • design/predict novel small inhibitors & activators
  • design/predict protein mutations which yield new functions or new behaviors

• Connect this new chemical insight to basic biology and aspects of disease

Page 98: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

As in life, in science it is very dangerous to fall in love with beautiful models.

Page 102: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Several different aspects of theoretical chemistry

Theory (simplicity, transparency)

Simulation (detail, accuracy)

Informatics (experiment, statistics)

Page 104: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

My approach: unify theoretical approaches

Theory (simplicity, transparency)

Simulation (detail, accuracy)

Informatics (experiment, statistics)

My approach: to unify simulation, theory, and informatics, to build models of long timescale biology in chemical detail

Page 105: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

“This is not a cell”

Page 106: BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Acknowledgements
