39
S.Will, 18.417, Fall 2011 Predicting Protein Folding Paths

Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Predicting Protein Folding Paths

Probabilistic Roadmap Planning (PRM):

Thomas, Song, Amato. Protein folding by motion planning.Phys. Biol., 2005

Page 2: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Protein Folding by Robotics

Probabilistic Roadmap Planning (PRM):

Thomas, Song, Amato. Protein folding by motion planning.Phys. Biol., 2005

Page 3: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Aims

Find good quality folding paths (into given nativestructure)

no structure prediction!

Predict formation orders (of secondary structure)

Page 4: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Motion planning

Motion planning

Probabilistic roadmap planing

Sampling of configuration space QConnect nearest configurations by (simple) local plannerApply graph algorithms to “roadmap”: Find shortest path

Page 5: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Motion planning

Motion planning

Probabilistic roadmap planing

Sampling of configuration space QConnect nearest configurations by (simple) local plannerApply graph algorithms to “roadmap”: Find shortest path

Page 6: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Motion planning

Motion planning

Probabilistic roadmap planing

Sampling of configuration space QConnect nearest configurations by (simple) local plannerApply graph algorithms to “roadmap”: Find shortest path

Page 7: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Motion planning

Motion planning

Probabilistic roadmap planing

Sampling of configuration space QConnect nearest configurations by (simple) local plannerApply graph algorithms to “roadmap”: Find shortest path

Page 8: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Motion planning

Motion planning

Probabilistic roadmap planing

Sampling of configuration space QConnect nearest configurations by (simple) local plannerApply graph algorithms to “roadmap”: Find shortest path

Page 9: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

More on PRM for motion planning

tree-like robots (articulated robots)

Articulated Joint

configuration = vector of angles

configuration space

Q = {q | q ∈ Sn}

S — set of anglesn — number of angles = degrees of freedom (dof)

Page 10: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

More on PRM for motion planning

tree-like robots (articulated robots)

configuration = vector of angles

configuration space

Q = {q | q ∈ Sn}

S — set of anglesn — number of angles = degrees of freedom (dof)

Page 11: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Proteins are Robots (aren’t they?)

Obvious similarity ;-)

==?

Our model

Protein == vector of phi and psi angles (treelike robotwith 2n dof)possible models range from only backbone up to full atom

Page 12: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Proteins are Robots (aren’t they?)

Obvious similarity ;-)

==?

Our model

Protein == vector of phi and psi angles (treelike robotwith 2n dof)possible models range from only backbone up to full atom

Page 13: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Proteins are Robots (aren’t they?)

Obvious similarity ;-)

==?

Our model

C

C C

C C

C

N

NN

O

O O

Protein == vector of phi and psi angles (treelike robotwith 2n dof)possible models range from only backbone up to full atom

Page 14: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Proteins are Robots (aren’t they?)

Obvious similarity ;-)

==?

Our model

C

C C

C C

C

N

NN

O

O O

phi psi

Protein == vector of phi and psi angles (treelike robotwith 2n dof)possible models range from only backbone up to full atom

Page 15: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Differences to usual PRM

no external obstacles, but

self-avoidingnesstorsion angles

quality of paths

low energy intermediate stateskinetically prefered pathshighly probable paths

Page 16: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Energy Function

method can use any potential

Our coarse potential[Levitt. J.Mol.Biol., 1983. ]

each sidechain by only one “atom” (zero dof)

Utot =∑

restraints

Kd{[(di − d0)2 + d2c ]

12 − dc}+ Ehp

first term favors known secondary structure through mainchain hydrogen bonds and disulphide bondssecond term hydrophobic effectVan der Waals interaction modeled by step function

All-atom potential: EEF1[Lazaridis, Karplus. Proteins, 1999. ]

Page 17: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

PRM method for Proteins

Sampling Connecting Extracting

Page 18: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Sampling — Node Generation

Sampling Connecting Extracting

Page 19: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Node Generation

No uniform sampling

configuration space too large⇒ need biased sampling strategy

Gaussian sampling

centered around native conformationwith different STDs 5◦, 10◦, . . . , 160◦

ensure representants for different numbers of nativecontacts

Selection by energy

P(accept q) =

1 if E (q) < EminEmax−E(q)Emax−Emin

if Emin ≤ E (q) ≤ Emax

0 if E (q) > Emax

Page 20: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

More on Node Generation

Visualization of Sampling Strategy

Distribution

Psi and Phi angles RMSD vs. Energy

Page 21: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Node Connection

Sampling Connecting Extracting

Page 22: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 23: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 24: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 25: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 26: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 27: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

P1

P2

P3

P5

P4

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 28: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

Weight

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 29: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Connecting Nodes by Local Planner

connect configurations in close distance

generate N intermediary nodes by local planner

assign weights to edges

Pi =

{e−

∆EkT if ∆E > 0

1 if ∆E ≤ 0Weight =

N∑i=0

−log(Pi)

Page 30: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Extracting Paths

Sampling Connecting Extracting

Page 31: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Extracting Paths

Shortest Path

extract one shortest pathfrom some starting conformation, one path at a time

Single Source Shortest Paths (SSSP)

extract shortest paths from all starting conformationcompute paths simultaneouslygenerate tree of shortest paths (SSSP tree)

Page 32: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Big Picture

Sampling Connecting Extracting

Page 33: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Studied Proteins

Overview of studied proteins, roadmap size, andconstruction times

Page 34: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Formation orders

formation order of secondary structure for verifyingmethod

formation orders can be determined experimentally[ Li, Woodward. Protein Science, 1999. ]

Pulse labelingOut-exchange

prediction of formation orders

single pathsaveraging over multiple paths (SSSP-tree)

Page 35: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Timed Contact Maps

Page 36: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Formation Order

no (reported) contradictions between prediction andvalidation

different kind of information from experiment andprediction

Page 37: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

The Proteins G and L

Studied in more detail

good test case

structurally similar: 1α + 4β

fold differently

Protein G: β-turn 2 forms firstProtein L: β-turn 1 forms first

Page 38: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Comparison of Analysis Techniquesβ-Turn Formation

Page 39: Predicting Protein Folding Paths - Mathematicsmath.mit.edu/classes/18.417/Slides/folding-prm.pdf · Protein folding by motion planning. Phys. Biol., 2005. 2011 Aims Find good quality

S.W

ill,

18

.41

7,

Fa

ll2

01

1

Conclusion

PRM can be applied to “realistic” protein models

Introduced method makes verifiable prediction

Coarse potential is sufficient

Predictions in good accordance to experimental data