BIOS 203 Lecture 6: Some surprises in the biophysics of protein dynamics

Some Surprises in the Biophysics of Protein Dynamics

Vijay S. Pande
Departments of Chemistry, Structural Biology, and Computer Science

Program in Biophysics
Stanford University

Friday, March 15, 2013

http://events.berkeley.edu/index.php/calendar/sn/chem.html?event_ID=61767&date=2013-02-12&filter=Secondary%20Event%20Type&filtersel=





Crystallography gives a wealth of information

p53 Oligomerization (50% of cancers)

Collagen Helix Formation (Osteogenesis Imperfecta)

Ribosome (last step of Central Dogma, antibiotic resistance)

Chaperonin-Assisted Folding (relevant to cancer: HSP90 inhibitors)

Aβ peptide aggregation (Alzheimer’s Disease)


Ceci n’est pas une pipe. (“This is not a pipe.”)

“This is not a GPCR” (Hibert et al., TIPS Reviews, 1993)

“This is not a cell”

Age-old challenges of molecular simulation



1. Finding a sufficiently accurate model




2. Sampling sufficiently long timescales





3. Learning something new from the resulting flood of data


How do you break a billion-fold impasse? Combine multiple, powerful, complementary technologies

1) Folding@home: very large-scale distributed computing

http://folding.stanford.edu

Voelz, et al, JACS (2010); Ensign et al, JMB (2007); Shirts and Pande, Science (2000)

Most powerful computer cluster in the world (~8 petaflops)

10^4x to 10^5x


http://simtk.org/OpenMM




2) OpenMM: Very fast MD (~1 µs/day) on GPUs

~1 µs/day for implicit solvent simulation of small proteins (~40 aa)

http://simtk.org/home/openmm

Elsen, et al. ACM/IEEE Conf. on Supercomputing (2006); Friedrichs, et al. J. Comp. Chem. (2009); Eastman and Pande. J. Comp. Chem. (2009)

10^2x to 10^3x














3) Markov State Models: Statistical mechanics of many trajectories

very long timescale dynamics by combining many simulations

http://simtk.org/home/msmbuilder

Bowman, et al, J. Chem. Phys. (2009); Singhal & Pande, J. Chem. Phys. (2005); Singhal, et al, J. Chem. Phys. (2004)

10^2x to 10^3x
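The three speedup ranges quoted for Folding@home, OpenMM, and MSMs can be combined to check the "billion-fold" claim. The short sketch below assumes the three factors are independent and simply multiply, which the slides imply but do not state; the dictionary labels are descriptive names chosen here.

```python
# Speedup ranges quoted on the slides, as (low, high) powers of ten.
speedup_exponents = {
    "Folding@home (distributed computing)": (4, 5),
    "OpenMM (GPU-accelerated MD)": (2, 3),
    "Markov State Models (combining trajectories)": (2, 3),
}

# Assumption: the factors are independent, so speedups multiply
# and their base-10 exponents add.
low_total = sum(low for low, _ in speedup_exponents.values())
high_total = sum(high for _, high in speedup_exponents.values())

print(f"Combined speedup: 10^{low_total}x to 10^{high_total}x")
print(f"A billion-fold (10^9) lies in this range: {low_total <= 9 <= high_total}")
```

Under that assumption the combined range brackets a factor of a billion.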













































What are Markov State Models (MSMs)?

Markov State Models (MSMs) are a theoretical scheme to build models of long timescale phenomena

(1) to help simulators reach long timescales and (2) to gain insight from their simulations

see the work of: Andersen, Caflisch, Chodera, Deuflhard, Dill, Grubmüller, Hummer, Levy, Noé, Pande, Pitera, Singhal-Heinrichs, Roux, Schütte, Swope, Weber


States avoid issues with projections and reaction coordinates (R.C.’s)

[Figure adapted from Dobson, et al, Nature: network of protein states from synthesis through the U, I, and N states, with branches to disordered aggregates, oligomers, crystals, fibers, prefibrillar species, amyloid fibrils, and degraded fragments.]


Master equation:

$$\frac{dp_i}{dt} = \sum_l \left[ k_{l,i}\, p_l - k_{i,l}\, p_i \right]$$


MSMs coarse-grain conformation space (to ~3 Å) to build a master equation
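To make the master equation concrete, here is a minimal sketch that integrates dp_i/dt = Σ_l [k_{l,i} p_l − k_{i,l} p_i] with forward Euler steps for a three-state U, I, N chain. The rate values, time step, and function names are hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
def master_equation_rhs(p, k):
    """Return dp_i/dt = sum_l [k[l][i] * p[l] - k[i][l] * p[i]].

    k[a][b] is the rate constant for the transition a -> b.
    """
    n = len(p)
    return [
        sum(k[l][i] * p[l] - k[i][l] * p[i] for l in range(n) if l != i)
        for i in range(n)
    ]


def integrate(p0, k, dt, n_steps):
    """Propagate the populations with simple forward Euler steps."""
    p = list(p0)
    for _ in range(n_steps):
        dp = master_equation_rhs(p, k)
        p = [pi + dt * dpi for pi, dpi in zip(p, dp)]
    return p


# Hypothetical rate constants (arbitrary units) for U <-> I <-> N.
k = [
    [0.0, 1.0, 0.0],  # U -> I
    [0.5, 0.0, 2.0],  # I -> U, I -> N
    [0.0, 0.1, 0.0],  # N -> I
]

# Start with everything unfolded and relax toward equilibrium.
p_final = integrate([1.0, 0.0, 0.0], k, dt=0.01, n_steps=5000)
print([round(x, 4) for x in p_final])
```

For these rates, detailed balance gives equilibrium populations in the ratio 1 : 2 : 40 for U : I : N, which the integration approaches at long times.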


Build from MD: derive rate matrix from simulation with Bayesian methods
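As a rough illustration of building a kinetic model from simulation data, the sketch below counts transitions between discrete states at a fixed lag and adds a uniform pseudocount, which is the posterior mean under a symmetric Dirichlet prior on each row. This is a simplification assumed here: it estimates a transition probability matrix rather than a rate matrix, and it is not the specific Bayesian method used in the work cited on the slides. The function name, trajectories, and parameters are hypothetical.

```python
def estimate_transition_matrix(trajectories, n_states, lag=1, pseudocount=1.0):
    """Estimate a row-stochastic transition matrix from state-index trajectories.

    Each row is the posterior mean under a symmetric Dirichlet prior,
    i.e. (observed counts + pseudocount) normalised to sum to one.
    """
    counts = [[pseudocount] * n_states for _ in range(n_states)]
    for traj in trajectories:
        # Pair every frame with the frame `lag` steps later.
        for a, b in zip(traj[:-lag], traj[lag:]):
            counts[a][b] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


# Hypothetical discretised trajectories over three states (0, 1, 2).
trajectories = [
    [0, 0, 1, 1, 2, 2, 2],
    [0, 1, 2, 2, 1, 0],
]
T_est = estimate_transition_matrix(trajectories, n_states=3)
for row in T_est:
    print([round(x, 3) for x in row])
```

The pseudocount keeps unobserved transitions from being assigned exactly zero probability, which matters when the data are sparse.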


but also derive a coarser view for human consumption


Coarse-grain MSM: use eigenvectors to identify collective modes
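One simple way to see how eigenvectors identify collective modes is to split microstates by the sign of the slowest non-stationary right eigenvector. The sketch below does this for a hypothetical four-state transition matrix; it assumes NumPy is available and is a toy version of the idea, not the lumping procedure used in the cited MSM software.

```python
import numpy as np

# Hypothetical 4-state transition matrix: states {0, 1} and {2, 3}
# exchange quickly within each pair and slowly between pairs.
T = np.array([
    [0.89, 0.10, 0.01, 0.00],
    [0.10, 0.89, 0.00, 0.01],
    [0.01, 0.00, 0.89, 0.10],
    [0.00, 0.01, 0.10, 0.89],
])

eigenvalues, right_vectors = np.linalg.eig(T)

# Sort from slowest (eigenvalue 1, the stationary mode) to fastest.
order = np.argsort(-eigenvalues.real)
eigenvalues = eigenvalues.real[order]
right_vectors = right_vectors.real[:, order]

# The second eigenvector describes the slowest relaxation; its sign
# pattern separates the two groups of states that exchange slowly.
slow_mode = right_vectors[:, 1]
macrostate = (slow_mode > 0).astype(int)

print("eigenvalues:", np.round(eigenvalues, 3))
print("macrostate assignment:", macrostate)
```

The overall sign of an eigenvector is arbitrary, so which group is labeled 0 or 1 carries no meaning; only the grouping does.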


Heart of the power of MSMs

Systematically identifying intermediate states allows us to (1) qualitatively understand and (2) quantitatively predict chemical mechanisms


The MSM can tell us which results are robust

• Perturbations to the transition matrix can be handled like QM perturbation theory
  • Transition matrix with error (T0 = “real” matrix)
  • We calculate perturbed eigenvalues (i.e. rates)
  • and perturbed eigenvectors (i.e. mechanism)
• Key result
  • error perturbs eigenvalues
  • results will be robust in the discrete region of the eigenvalue spectrum
• Relevant for both theory and experiment
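The bullets above equate eigenvalues with rates. A standard way to make that link explicit, which the slide does not spell out and is therefore an assumption here, is the implied timescale t_i = −τ / ln λ_i, where τ is the lag time of the transition matrix. The numbers below are hypothetical.

```python
import math


def implied_timescale(eigenvalue, lag_time):
    """Relaxation timescale for a transition-matrix eigenvalue in (0, 1).

    Uses t = -lag_time / ln(eigenvalue); the result has the units of lag_time.
    """
    return -lag_time / math.log(eigenvalue)


# Hypothetical example: eigenvalue 0.98 at a 10 ns lag time.
t_slow = implied_timescale(0.98, lag_time=10.0)
print(f"implied timescale: {t_slow:.1f} ns")
```

Because the timescale depends on ln λ, a small error in an eigenvalue close to 1 changes the predicted timescale much more than the same error in a small eigenvalue.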

classical perturbation theory and sloppiness theory to investigate MSM observable robustness in the face of transition probability perturbation. We also develop a quantitative Bayesian metric by which robustness can be evaluated, and we discuss implications such robustness holds for future applications of MSMs to biophysical phenomena.

Methodology.
Exploring mechanism in an MSM context. To qualify a sense of robustness in mechanistic properties, we must first consider how “mechanism” should be defined in an MSM context. On first thought, one might consider that a protein’s folding trajectory represents its folding mechanism. However, we argue that this view of mechanism is overly restrictive: individuals within an ensemble experience different state-to-state transition sequences in the folding process. While the transition matrix defines which trajectories are possible, it also does not provide a clear picture of which pathways the ensemble prefers over short and long periods of time.

The eigenspectrum of the MSM transition matrix, however, provides both kinetic and thermodynamic information about the ensemble. With units of probability density, the transition matrix eigenvectors represent the normal modes of time evolution in the system. The stationary distribution, the eigenvector with unit eigenvalue, describes the equilibrium populations in the ensemble. The other eigenvectors, with sub-unit eigenvalues, describe changes in the system’s population distribution at timescales set by their respective eigenvalues.

Formally, the transfer of probability density in MSMs is related to the eigenvectors via the expression:

$$\pi(n) \propto \sum_i \lambda_i^{n} \left\langle \pi(n-1),\, g_i \right\rangle e_i \qquad [1]$$

where $\pi(n)$ represents the system’s nth probability distribution vector, $\lambda$ denotes an eigenvalue of the transition matrix, and $g$ and $e$ are the corresponding right and left eigenvectors of the transition matrix, respectively [11].

This expression describes how an arbitrary population distribution converges to the equilibrium distribution over time. Note that, as the number of timesteps n becomes large, all sub-unit eigenvalues (through the term $\lambda^n$) and their eigenvectors decay to zero, and eventually only the stationary distribution, multiplied by the unit eigenvalue, remains. Due to discretization, the state probability distribution at step n is only rigorously equal to the right hand side when all states give rise to “slow” eigenvectors; no analogy exists to the continuous spectrum of eigenvalues in transfer operator theory [11]. In actuality, the above expression yields a vector proportional to the exact population distribution at a given timestep; the constant of proportionality can be determined through simple fitting.
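The convergence described above can be checked numerically. The sketch below uses the standard spectral decomposition written in terms of the initial distribution, p(n) = Σ_i λ_i^n ⟨p(0), g_i⟩ e_i, which is an assumption about how to apply the expression in practice (the excerpt writes its equation differently). The three-state matrix is hypothetical and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical row-stochastic transition matrix for a 3-state chain.
T = np.array([
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.20, 0.80],
])

# Columns of G are right eigenvectors g_i; rows of E = G^-1 are the
# matching left eigenvectors e_i, normalised so that <e_i, g_j> = delta_ij.
eigenvalues, G = np.linalg.eig(T)
E = np.linalg.inv(G)

p0 = np.array([1.0, 0.0, 0.0])  # all population starts in state 0
n = 25                          # number of steps to propagate

# Spectral form: each mode decays as lambda_i ** n.
p_spectral = sum(
    eigenvalues[i] ** n * (p0 @ G[:, i]) * E[i] for i in range(3)
).real

# Direct form for comparison: multiply by T, n times.
p_direct = p0 @ np.linalg.matrix_power(T, n)

# Stationary distribution: left eigenvector with eigenvalue 1, normalised.
idx = int(np.argmax(eigenvalues.real))
stationary = E[idx].real / E[idx].real.sum()

print(np.round(p_spectral, 4), np.round(p_direct, 4), np.round(stationary, 4))
```

As n grows, every term with a sub-unit eigenvalue shrinks and only the stationary term survives, which is the behavior the paragraph describes.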

Relating mechanism to an MSM eigenspectrum offers advantages over the alternatives that were previously discussed. The eigenvector decomposition method provides details about how entire probability distributions change, allowing for an idea of mechanism on an ensemble level. Large eigenvector entries represent states that are important to density transfer on the relaxation timescale of an associated eigenvalue. One can inspect the set of eigenvectors to find which individual states are mechanistically relevant at both fast and slow timescales. Information about trajectory (which folding pathways are most probable) and end result (how the state probability distribution converges to the stationary distribution) are intrinsic to the eigenspectrum. Together, we extend, trajectory and end result define the essential parts of a folding mechanism. As such, we suggest that an MSM mechanism be defined in the context of eigenvector decompositions.

Perturbation Theory Framework. In quantum mechanics, a popular method for approximating solutions to the Schroedinger equation involves splitting the system Hamiltonian into zeroth- and higher-order parts with expansion parameter $\lambda$:

$$H = H_0 + \lambda H' + \lambda^2 H'' + \ldots \qquad [2]$$

If the eigenvalue problem for the zeroth order Hamiltonian can be solved exactly, corrections to the eigenvalues and eigenvectors based on the “perturbed” Hamiltonian can be calculated with the well known eigenspectrum perturbation theory [19].

In analogy to the quantum mechanical problem, an MSM transition matrix could also be augmented by a “perturbation operator.” Suppose we would like to calculate the impact of a random perturbation on the eigenspectrum of the transition matrix. We could define a perturbed transition matrix T (to first order) such that

$$T \approx T_0 + \lambda T' \qquad [3]$$

where $T_0$ is the original transition matrix and $T'$ is a matrix of random noise. The first order correction due to noise, $\lambda'_n$, for each eigenvalue $\lambda^0_n$ of the transition matrix is given by the simple inner product

$$\lambda'_n = \langle e^0_n | T' | e^0_n \rangle \qquad [4]$$

where $e^0_n$ is the nth eigenvector of the zeroth-order transition matrix [19]. Corrected eigenvectors are given by the formula

$$e_n = e^0_n + \sum_{j \neq n} \frac{\langle e^0_j | T' | e^0_n \rangle}{\lambda^0_n - \lambda^0_j}\, e^0_j \qquad [5]$$

Using these corrections due to perturbation, one could gauge the impact of a random noise (or a more systematic) change in a transition matrix on its eigenspectrum. We illustrate the application of this perturbation theory by applying the above analysis to the eigenvalues of the villin transition matrix.
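A small numerical check of the first-order eigenvalue correction is sketched below. To keep the simple inner-product form valid with orthonormal eigenvectors, the example assumes a symmetric zeroth-order matrix and symmetric noise; both are simplifying assumptions made here, and the matrix, noise, and expansion parameter are hypothetical rather than the villin model mentioned in the text. NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric transition matrix (rows and columns sum to 1).
T0 = np.array([
    [0.90, 0.08, 0.02],
    [0.08, 0.85, 0.07],
    [0.02, 0.07, 0.91],
])

# Symmetric random noise so the eigenvectors stay orthonormal (assumption).
noise = rng.normal(size=(3, 3))
T_prime = (noise + noise.T) / 2.0
lam = 1e-4  # expansion parameter controlling the noise magnitude

# Zeroth-order eigenvalues and orthonormal eigenvectors (ascending order).
evals0, evecs0 = np.linalg.eigh(T0)

# First-order correction: <e_n | T' | e_n> for each eigenvector.
first_order = np.array(
    [evecs0[:, n] @ T_prime @ evecs0[:, n] for n in range(3)]
)
predicted = evals0 + lam * first_order

# Exact eigenvalues of the perturbed matrix for comparison.
exact = np.linalg.eigvalsh(T0 + lam * T_prime)

print("first-order error:", np.abs(predicted - exact).max())
print("zeroth-order error:", np.abs(evals0 - exact).max())
```

The remaining error after the first-order correction scales with the square of the expansion parameter, so it should be far smaller than the error from ignoring the perturbation altogether.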

Need for a New Framework. The method more extensively used in this study is analogous, though not identical, to classical perturbation theory. We perturb a transition matrix with noise, calculate the “corrected” eigenspectrum, and compare that eigenspectrum to the original. We decide to use an alternative method for two reasons. First, we would like to gauge the rate of change (called the sensitivity) in an eigenvalue or eigenvector with respect to the magnitude of perturbation. Furthermore, we would like to know this eigenspectrum sensitivity for each individual parameter in the model. These desires are not trivially fulfilled with analytical perturbation theory. This paper’s method, drawn from the literature and tested on biological models, is designed to estimate such a rate of change [16–18].

Secondly, sophisticated theory for error propagation in MSMs has been developed using a sensitivity based analysis [20, 21]. These methods use advanced Bayesian schemes to estimate uncertainty based on the available data. The nature of the sensitivities used to estimate such errors, however, has never been well characterized. It would be useful to gain intuition about the relative magnitudes of eigenspectrum sensitivities in recently constructed MSMs. Sloppiness-based techniques, as discussed below, provide an avenue to do so. While schemes using perturbation theory to gauge robustness are explored briefly in the following sections, we suggest a sloppiness-based analysis will be preferable for much of this work.

Sloppiness Framework. To investigate sloppiness in MSM transition probabilities, we choose to use a so-called model-parameter cost function on the transition probability matrix. Given the perturbation of a certain system parameter, the cost function returns the induced sum-squared deviation in a dependent observable. In












J. Weber and V. S. Pande. Protein folding is mechanistically robust. Biophys J. (2011)

[Figure: eigenvalue spectrum (rates of MSM states), ordered from slow to fast, with a discrete region and a continuous region marked. (J. Weber, VSP)]


Folding simulation has come a long way in 15 years

[Figure: folding time (microseconds, log scale from 0.1 to 10,000) versus year (1998 to 2012) for simulated proteins, including Fs Peptide, Trp Cage, Trp Zip, BBA5, Chignolin, BBA, Villin, Pin1 WW, Fip35 WW, GTT WW, BBL, Protein B, Homeodomain, Protein G, α3D, ACBP, NTL9, and Lambda, from the Kollman, Pande, Schulten, Noé, and Shaw groups. Blue = explicit solvent; red = implicit solvent. Platforms labeled: Folding@home and the ANTON supercomputer.]





Can we quantitatively predict experiment?

[Figure: predicted folding time (μs) versus experimental folding time (μs), on logarithmic axes, for Pande-group simulations in implicit and explicit solvent: Fs Peptide, Trp-cage, Trp Zip, BBA5, WT Villin, Fip35 WW, NTL9, ACBP, and λ-repressor.]


What has the community done so far?

[Figure: predicted folding time (μs) on logarithmic axes for simulations from the Pande, Shaw, Noé, and Schulten groups in implicit and explicit solvent: Fs Peptide, Trp-cage, Villin Nle, Pin1 WW, Fip35 WW, BBL, Protein B, Homeodomain, Protein G, α3D, NTL9, ACBP, and λ-repressor.]


Experiments can now probe detailed MSM aspects

[Figure: ∆G (kcal/mol) versus RMSD (Å).]

Many states have low ∆G and are highly structurally related

(Beauchamp, Das, VSP)

Bowman, Beauchamp, Boxer, Pande, JCP (2009); Beauchamp, Das, Pande, PNAS (2011)



from Reiner, Henklein, & Kiefhaber, PNAS (2010)




“It is nice to know that the computer understands the problem. But I would like to understand it too.”

– Eugene Wigner, in response to a large-scale quantum mechanical calculation

The challenge of simulating vs understanding


A brief history of protein folding kinetics theory

• 1990: Simple kinetic models
  • Master equation approaches (Shakhnovich et al; Orland et al; Wolynes et al)
  • Lattice model simulations (Dill; many others)

• 2000: A native-centric view dominates
  • Experiments suggest a two-state model for protein folding kinetics (Fersht)
  • Contact order (Plaxco, Simmons, Baker)
  • Minimal frustration/protein design approach (Wolynes; Shakhnovich; Pande; others)
  • Consequence: Go model simulations, funnel energy landscape paradigm








[Figure: structure with residues PHE11, PHE18, TRP24, and PHE35 labeled.]

• What is a Go model?
  • $H_\alpha = -\varepsilon \sum_{ij} C^{\alpha}_{ij} C^{N}_{ij}$
  • interactions present in the folded state are attractive
  • all others are repulsive
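The Go-model energy can be evaluated directly from contact maps. The sketch below reads C^α_ij as the contact map of conformation α and C^N_ij as the native contact map; that reading, the choice to count each residue pair once (i < j), and ε = 1 are assumptions made for illustration. The four-residue contact maps and helper name are hypothetical.

```python
def go_energy(contacts, native_contacts, epsilon=1.0):
    """Go-model energy: -epsilon for every contact that is also native.

    Both arguments are symmetric 0/1 contact matrices; each residue pair
    is counted once (i < j).
    """
    n = len(native_contacts)
    return -epsilon * sum(
        contacts[i][j] * native_contacts[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    )


def contact_map(n, pairs):
    """Build a symmetric 0/1 contact matrix from a list of residue pairs."""
    cmap = [[0] * n for _ in range(n)]
    for i, j in pairs:
        cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical 4-residue example.
native = contact_map(4, [(0, 2), (0, 3), (1, 3)])
partly_folded = contact_map(4, [(0, 2), (1, 2)])  # (1, 2) is non-native

print(go_energy(native, native))         # all three native contacts formed
print(go_energy(partly_folded, native))  # only the (0, 2) contact counts
```

Note that the non-native (1, 2) contact contributes nothing to this energy, which is exactly the native-centric simplification the later slides question.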








• 2010: The native-centric view is unsatisfying
  • Structure in the unfolded state (e.g. Raleigh)
  • Slow diffusion (e.g. Lapidus)
  • Non-native interactions (e.g. Matthews)



A key question dominating protein folding theory

How important are non-native (i.e. not present in the folded state) interactions?





Pathway seen in the movie: Series of metastable states

snapshots from the movie:

starts in unfolded state
helix forms early
collapse, then beta sheet forms
final part of beta ready to align
folded structure forms

correspond to states from our Markov State Model

Voelz, Bowman, Beauchamp, Pande. JACS (2010)


Repeating with many more trajectories yields an MSM: coarse visualization

[Figure: network of macrostates a through n connected by arrows.]

area of each state is proportional to macrostate free energy

width of each arrow is proportional to transition flux

a→l→n and a→m→n comprise 10% of the total flux

Top 10 folding pathways show us:

• A great deal of pathway heterogeneity exists
• non-native structure plays a key role in many states
• metastability is often structurally localized (analogous to the foldon concept)

Flux calculation method: TPT: Vanden-Eijnden, et al (2006); Berezhkovskii, Hummer, Szabo (2009)

(Voelz, Bowman, Beauchamp, VSP)
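Transition path theory builds its reactive fluxes on the committor, the probability that a state reaches the folded state before returning to the unfolded one. The sketch below computes committors for a hypothetical four-state model by fixed-point iteration; the transition matrix, state labels, and function name are illustrative assumptions, not the NTL9 model from the slide.

```python
def committor(T, source, sink, n_iter=1000):
    """Forward committor q_i: probability of reaching `sink` before `source`.

    Solves q_i = sum_j T[i][j] * q_j for intermediate states by repeated
    substitution, with q = 0 on the source and q = 1 on the sink.
    """
    n = len(T)
    q = [1.0 if i in sink else 0.0 for i in range(n)]
    for _ in range(n_iter):
        q = [
            0.0 if i in source
            else 1.0 if i in sink
            else sum(T[i][j] * q[j] for j in range(n))
            for i in range(n)
        ]
    return q


# Hypothetical model: state 0 is unfolded, state 3 is native, and
# states 1 and 2 are intermediates on two parallel pathways.
T = [
    [0.80, 0.10, 0.10, 0.00],
    [0.10, 0.70, 0.00, 0.20],
    [0.30, 0.00, 0.60, 0.10],
    [0.00, 0.05, 0.05, 0.90],
]
q = committor(T, source={0}, sink={3})
print([round(x, 3) for x in q])
```

In this toy model the intermediate on the first pathway is more committed to folding than the one on the second, even though both connect the same two end states.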


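Contact map view of the states reveals non-native structure formation along the pathway

[Figure: contact maps for states a, h, k, m, and n, arranged from the unfolded basin through the transition state region to the native basin (committor axis), with a second axis running from more alpha to more beta.]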



significant amount of non-native structure, even in high p_fold states



Beta sheet states slow folding in helical proteins?


(Bowman, Voelz, VSP)

G. Bowman, V. Voelz, and V. S. Pande. Atomistic folding simulations of the five helix bundle protein λ6-85. Journal of the American Chemical Society 133 664-667 (2011)

Lambda


“Intramolecular amyloids”?

[Figure: panels A through H, showing β sheets in the unfolded state alongside the crystal structure without helix 5.]

“λ6-85 is not only thermodynamically, but also kinetically protected from reaching

intramolecular analogs of beta sheet aggregates while folding”

– Prigozhin & Gruebele

Lambda


Consequences of projections

How can one reconcile this with the simple picture?

(Voelz, VSP)

V. A. Voelz, et al. JACS (2012)


“Regarded from two sides” by Diet Wiegman (1984). Kruschel & Zagrovic, DOI:10.1039/b917186j

Conclusions



With MSMs, we can simulate folding on the 10ms timescale



Simulation methods are sufficiently accurate to predict experiment



folding via parallel paths of many metastable states



intramolecular amyloid hypothesis



Where do we go from here?


Petaflops on the cheap today, exaflops soon?

Folding@home

There are approximately a billion computers in the world



How many GPUs? How many GPU flops?

A million GPUs putting out 1 TFLOP each gets us to an exaflop: we could do this today
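The exaflop estimate is simple arithmetic, sketched below using the slide's own round numbers (a million GPUs at 1 TFLOP each, and roughly a billion computers worldwide). The variable names are illustrative.

```python
n_gpus = 1_000_000           # a million GPUs (from the slide)
flops_per_gpu = 1e12         # 1 TFLOP per GPU (from the slide)
exaflop = 1e18               # definition of one exaflop
computers_worldwide = 1e9    # "approximately a billion computers"

total_flops = n_gpus * flops_per_gpu
fraction_of_computers = n_gpus / computers_worldwide

print(f"total: {total_flops / exaflop:.1f} exaflop")
print(f"fraction of the world's computers needed: {fraction_of_computers:.1%}")
```

On these numbers, reaching an exaflop would need about one in a thousand of the world's computers to contribute a 1 TFLOP GPU.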


The combination of new simulation advances and chemically detailed models has suggested a paradigm change in how we conceptualize protein folding.


We are now looking to apply MSM approaches to new areas:
1) basis of signal transduction
2) protein misfolding diseases

both involving issues of small molecules and the role of chemical interactions


New interest in my lab: probing the molecular nature of the mechanism of signal transduction

GPCRs, kinases

What do we want to do?

• Understand how they function
  • what is the mechanism of activation & inactivation?
  • how is the signal transduced?
  • what is the role of chemical interactions in this process?



• Use this understanding to modulate their function
  • design/predict novel small inhibitors & activators
  • design/predict protein mutations which yield new functions or new behaviors

• Connect this new chemical insight to basic biology and aspects of disease


Protein Kinases

• Protein kinases are enzymes that modify the function of other proteins by attaching phosphate groups to them.

• The conformational change is involved in the transfer of the phosphate group of ATP to amino acids with OH groups (serine, threonine, and tyrosine).


Conformational change in src kinase

[Figure: inactive and active structures of src kinase, with the C-helix, A-loop, ATP, a hydrogen bond, and residues LYS295, GLU310, ARG409, and TYR419 labeled. (Shukla, VSP)]


Kinetic traces for activation/deactivation

• We see many activation events
• MSM kinetics can be used to predict experiment
• We get reasonable kinetics
  • Activation timescales consistent with experiment (sub-millisecond timescale)
• What does the mechanism look like?













Kinase conformational change (Shukla, VSP)

Surprise: an intermediate state? (Shukla, VSP)

Surprise 2: intermediate state(s) (Shukla, VSP)

We find many intermediates! (Shukla, VSP)


Problems with projections

[Figure from Chandler (1998)]

MSMs can tell us where to look as we have a full model


Heart of the power of MSMs

Systematically identifying intermediate states allows us to (1) qualitatively understand and (2) quantitatively predict chemical mechanisms


New challenges with conformational change

• Building MSMs for conformational change
  • much more challenging than for protein folding
  • as the changes are much more subtle
• We have developed novel theoretical approaches to tackle these new challenges
  • Metric learning approaches: use machine learning to identify which degrees of freedom are important and which are noise
  • Dimensionality reduction approaches: identify collective degrees of freedom systematically
• Use these new approaches both to build better MSMs and, ideally, to learn something new about the system

(McGibbon, Schwantes, VSP)


MSM reveals key intermediates


Characterizing intermediate 2 (Shukla, VSP)

Intermediate 2 of c-src kinase (simulation):
• C-helix in inactive conformation
• A-loop unfolded
• E310-R409 H-bond broken


[Figure: intermediate 2 of c-src kinase (simulation), with E310 and R409 labeled, shown alongside cyclin-dependent kinase 2 (PDB: 4BCQ).]


Simulations predict drug stabilizes intermediate 2

• ANS binding to the allosteric site adjacent to the C-helix in c-src kinase stabilizes the intermediate conformation
  • by blocking the interactions between K295 and E310
  • H-bond formation between K295 and E310 is required for the locking of the C-helix in the active conformation
  • the sulfonate group in ANS forms a hydrogen bond with K295, thereby locking it in its inactive conformation

(Shukla, VSP)


Simulations predict drug stabilizes intermediate 2

• ANS binding also pushes the C-helix away from the ATP binding pocket
  • Superimposition of the structures obtained from the simulations reveals the distinct conformations of the C-helix in the presence of ANS:
    • ATP-bound c-src kinase (cyan)
    • ATP- and ANS-bound src kinase, 1 molecule of ANS in the allosteric site (orange)
    • ATP- and ANS-bound src kinase, 2 molecules of ANS in the allosteric site (green)

(Shukla, VSP)


Simulating the kinome

c-src kinase (2SRC)

Lyn kinase (2ZV7)

Fyn kinase (2DQ7)

Hck kinase (2HCK)


Signal transduction in G-protein-coupled receptors

Rosenbaum et al., Nature, 2009.

G-Protein-Coupled Receptor Structure

Kobilka and coworkers, Nature, 2011.


Key Details

Trajectories of β2 behavior: Agonist bound

(Kohlhoff, Shukla, Lawrenz, …, VSP)

• Understand how they function
  • what is the mechanism of activation & inactivation?
  • how is the signal transduced?
  • what is the role of chemical interactions in this process?

• Use this understanding to modulate their function
  • design/predict novel small inhibitors & activators
  • design/predict protein mutations which yield new functions or new behaviors

• Connect this new chemical insight to basic biology and aspects of disease


As in life, in science it is very dangerous to fall in love with beautiful models.


Several different aspects of theoretical chemistry

Theory (simplicity, transparency)

Simulation (detail, accuracy)

Informatics (experiment, statistics)


My approach: unify theoretical approaches

My approach: to unify simulation, theory, and informatics, to build models of long timescale biology in chemical detail


“This is not a cell”

Acknowledgements

