© Burkhard Rost
title: Membrane structure prediction 1
short title: cb1_tmh1
lecture: Computational Biology 1 - Protein structure (for Informatics) - TUM summer semester
Announcements
Videos: YouTube / www.rostlab.org/talks
THANKS:
EXERCISES:
Special lectures:
• Mikal Boden UQ Brisbane
No lecture:
• 04/26 Security check Rostlab (exercise WILL take place)
• 05/01 May Day (also no exercise)
• 05/08 Student representation (SVV) - exercise WILL happen
• 05/10 Ascension Day (also no exercise)
• 05/22 Whitsun holiday (also no exercise)
• 05/31 Corpus Christi (also no exercise)
• 06/19 no lecture (but exercise)
• 06/21 no lecture (but exercise)
LAST lecture: before Jul 12
Exam: Jul 12 18-20:00 (room TBA)
• Makeup: no makeup (sorry due to overload)
Dmitrij Nechaev
Your Name
Lothar Richter
Michael Heinzinger
next
CONTACT: [email protected]
© Michael Leunig
1D: TM transmembrane helix prediction
Intro: membranes
What to put around a cell?
Roshni Nelson: Cell Membranes
© Roshni Nelson UT Southwestern Dallas http://www.roshninelson.com
http://utsouthwestern.edu/STARS - vimeo.com/31412291
Roshni Nelson: Phospholipids
© Roshni Nelson UT Southwestern Dallas http://www.roshninelson.com
http://utsouthwestern.edu/STARS - vimeo.com/31412291
Cellular compartments
Prokaryotic Cell (bacillus type); Eukaryotic Cell
K Rogers (2011) Britannica
© Tatyana Goldberg (TUM Munich)
How to separate outside/inside?
Lipid bilayer
Wikipedia © http://en.wikipedia.org/wiki/Lipid_bilayer
Hydrophobic core of a protein
[Figure: hydrophobic residues (H) cluster in the protein core; charged residues (+/-) face the solvent]
Lipid bilayer: hydrophobic on the inside
© Wikipedia http://en.wikipedia.org/wiki/Lipid_bilayer
Lipid bilayer: hydrophobic on the inside, easy to pull around horizontally
© Wikipedia http://en.wikipedia.org/wiki/Lipid_bilayer
Lipid bilayer: hydrophobic on the inside, hard to enter
© Wikipedia http://en.wikipedia.org/wiki/Lipid_bilayer
Bacterial injection needles
Model of type VI secretion system (T6SS) in gram-negative bacteria
Marek Basler Biozentrum Basel
Borenstein DB, Ringel P, Basler M, Wingreen NS (2015) Established Microbial Colonies Can Survive Type VI Secretion Assault. PLoS Comput Biol 11(10): e1004520. doi:10.1371/
Shot through two membranes
Marek Basler
Biozentrum Basel
Marek Basler, BT Ho, JJ Mekalanos (2013) Cell 152:884-894
Tit-for-tat: type 6 secretion system counter-attack
Marek Basler
Biozentrum Basel
Localization for drug targets
TM Bakheet and AJ Doig (2008) Bioinformatics
Localization      Share of drug targets
Membrane          57 %
Cytoplasm         13 %
Extra-cellular    13 %
ER                 7 %
Nucleus            3 %
Mitochondria       2 %
Other              2 %
Microsome          2 %
Peroxisome         1 %
Drug targets tend to be found in membranes, in the cytoplasm, or are extra-cellular!
© Tatyana Goldberg (TUM Munich)
TMH (Transmembrane helix) background
1JB0 Cyanobacterial Photosystem I (Jordan P, Krauss N)
1E7P Fumarate Reductase (Lancaster CD, Michel H)
[Figure labels: periplasm; cytoplasm; Cytoplasm (stromal side); ?]
Membrane prediction
TM prediction: wait for database growth ...
[Figure panels: 1993, 1996, 1999]
Topology for membrane helical proteins.
[Figure: proteins A, B, and C cross the lipid bilayer membrane between the extra-cytoplasmic side (out, outside cytoplasm) and the intra-cytoplasmic side (in, inside cytoplasm); the C-terminus (C-term) of each protein is marked]
TMH prediction
PHDsec
Membrane helices are helices, right?
input local in sequence: local alignment, 13 adjacent residues
:::AAAAA.LLLLIIAAGCCSGVV:::
profile derived from the local alignment:
  A    C    L    I    G    S    V  ins  del  cons
100    0    0    0    0    0    0    0    0  1.17
100    0    0    0    0    0    0   33    0  0.42
  0    0  100    0    0    0    0    0   33  0.92
  0    0   33   66    0    0    0    0    0  0.74
 66    0    0    0   33    0    0    0    0  1.17
  0   66    0    0    0   33    0    0    0  0.74
  0    0    0   33    0    0   66    0    0  0.48
input global in sequence: global statistics, whole protein
• %AA: percentage of each amino acid in protein
• Length: length of protein (≤60, ≤120, ≤240, >240)
• ∆ N-term: distance: centre, N-term (≤40, ≤30, ≤20, ≤10)
• ∆ C-term: distance: centre, C-term (≤40, ≤30, ≤20, ≤10)
first level, sequence-to-structure: input layer (21+3 units per residue; 20, 4, 4, 4 global units), hidden layer, output layer (H, L, E)
second level, structure-to-structure: input layer (4+1 units per residue; 20, 4, 4, 4 global units), hidden layer, output layer (H, L, E)
example output values: 0.5, 0.1, 0.4
B Rost (1996) Methods Enzymol 266:525-39
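The local input described on this slide (a window of 13 adjacent residues, each represented by its profile columns) can be sketched as below. This is only a minimal illustration, not the PHDsec implementation: the function name `window_features`, the zero padding at the termini, and the use of NumPy are assumptions made for this example.

```python
import numpy as np

def window_features(profile, width=13):
    """Return one flattened window of `width` profile rows per residue.

    profile: array-like of shape (n_residues, n_columns).
    Positions beyond the termini are filled with zeros (an assumption).
    """
    profile = np.asarray(profile, dtype=float)
    half = width // 2
    # Pad only along the residue axis; np.pad defaults to constant zeros.
    padded = np.pad(profile, ((half, half), (0, 0)))
    n_residues = profile.shape[0]
    return np.stack([padded[i:i + width].ravel() for i in range(n_residues)])

# The seven profile rows from the slide
# (columns: A C L I G S V ins del cons).
rows = [
    [100, 0, 0, 0, 0, 0, 0, 0, 0, 1.17],
    [100, 0, 0, 0, 0, 0, 0, 33, 0, 0.42],
    [0, 0, 100, 0, 0, 0, 0, 0, 33, 0.92],
    [0, 0, 33, 66, 0, 0, 0, 0, 0, 0.74],
    [66, 0, 0, 0, 33, 0, 0, 0, 0, 1.17],
    [0, 66, 0, 0, 0, 33, 0, 0, 0, 0.74],
    [0, 0, 0, 33, 0, 0, 66, 0, 0, 0.48],
]
features = window_features(rows)
print(features.shape)
```

Each row of `features` would be the input for predicting the state of the central residue of that window.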
PHDsec “success” on Poly-Valine
HEADER LIPOPROTEIN(SURFACE FILM)
COMPND PULMONARY SURFACTANT-ASSOCIATED POLYPEPTIDE C(SP-C)
SOURCE PIG (SUS SCROFA)
AUTHOR J.JOHANSSON,T.SZYPERSKI,T.CURSTEDT,K.WUTHRICH
AA LRIPCCPVNLKRLLVVVVVVVLVVVVTVGALLMGL
OBS sec HHHHHHHHHHHHHHHHHHHHHHHHH
PHD sec EEEEEEEEEEEEEEEEEEEEEEE
Goes wrong because of the swap: outside/inside
[Figure, left: membrane protein in LIPID; H = hydrophobic residues face the lipid]
[Figure, right: non-membrane (globular, water-soluble) protein in Water; H = hydrophobic residues sit inside, L = hydrophilic residues face the water]
Hydrophobic core of a protein
Topology for membrane helical proteins.
[Figure: proteins A, B, and C cross the lipid bilayer membrane between the extra-cytoplasmic side (out, outside cytoplasm) and the intra-cytoplasmic side (in, inside cytoplasm); the C-terminus (C-term) of each protein is marked]
Hydrophobic side chains
Eisenberg hydrophobicity index
AA-3  AA-1  Eisenberg
Ile   I      1.38
Phe   F      1.19
Val   V      1.08
Leu   L      1.06
Trp   W      0.81
Met   M      0.64
Ala   A      0.62
Gly   G      0.48
Cys   C      0.29
Tyr   Y      0.26
Pro   P      0.12
Thr   T     -0.05
Ser   S     -0.18
His   H     -0.4
Glu   E     -0.74
Asn   N     -0.78
Gln   Q     -0.85
Asp   D     -0.9
Lys   K     -1.5
Arg   R     -2.53
David Eisenberg, UCLA © https://www.uclaaccess.ucla.edu/uploads/image/faculty/134.jpg
D Eisenberg et al. (1984) J Mol Biol 179:125-42
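As a rough illustration of how such an index is used, the sketch below averages the Eisenberg values from the table over a sliding window along a sequence. The window width of 19 residues, the function name `mean_hydrophobicity`, and the shortening of the window at the termini are assumptions for this example; the sequence is the SP-C sequence shown elsewhere in this lecture.

```python
# Eisenberg hydrophobicity index, values as listed on the slide.
EISENBERG = {
    "I": 1.38, "F": 1.19, "V": 1.08, "L": 1.06, "W": 0.81,
    "M": 0.64, "A": 0.62, "G": 0.48, "C": 0.29, "Y": 0.26,
    "P": 0.12, "T": -0.05, "S": -0.18, "H": -0.40, "E": -0.74,
    "N": -0.78, "Q": -0.85, "D": -0.90, "K": -1.50, "R": -2.53,
}

def mean_hydrophobicity(seq, width=19):
    """Mean Eisenberg index in a window centred on each residue.

    Windows are truncated at the termini (an assumption of this sketch).
    """
    half = width // 2
    scores = [EISENBERG[aa] for aa in seq]
    profile = []
    for i in range(len(seq)):
        start, end = max(0, i - half), min(len(seq), i + half + 1)
        profile.append(sum(scores[start:end]) / (end - start))
    return profile

sp_c = "LRIPCCPVNLKRLLVVVVVVVLVVVVTVGALLMGL"
profile = mean_hydrophobicity(sp_c)
peak = profile.index(max(profile))
print(f"most hydrophobic window is centred on residue {peak + 1}")
```

A stretch where the windowed mean stays high is a candidate membrane helix; where to put the threshold is a separate design choice.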
Pure hydrophobicity scales
[Figure: hydrophobicity scales GES, EISEN, and KYDO plotted for the amino acids A R N D C Q E G H I L K M F P S T W Y V; y-axis from -6.75 to 9.00]
5 Hydrophobicity/tm/occupancy scales
[Figure: scales GES, EISEN, KYDO, OOI, and HEIJNE plotted for the amino acids A R N D C Q E G H I L K M F P S T W Y V; y-axis from 0.00 to 1.00]
Many indices exist
K Tomii and M Kanehisa (1996) Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins. Protein Eng 9:27-36: Fig. 2 (402 indices)
PHDsec success on Poly-Valine
HEADER LIPOPROTEIN(SURFACE FILM)
COMPND PULMONARY SURFACTANT-ASSOCIATED POLYPEPTIDE C(SP-C)
SOURCE PIG (SUS SCROFA)
AUTHOR J.JOHANSSON,T.SZYPERSKI,T.CURSTEDT,K.WUTHRICH
AA LRIPCCPVNLKRLLVVVVVVVLVVVVTVGALLMGL
OBS sec HHHHHHHHHHHHHHHHHHHHHHHHH
PHD sec EEEEEEEEEEEEEEEEEEEEEEE
NLKRLLVVVVVVVLVVVVTVGALL
 h  hhhhhhhhhhhhhh h hhh
h: hydrophobic
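The "h" annotation of this segment can be reproduced with a very simple rule. The sketch below marks a residue as hydrophobic when its Eisenberg index (from the table earlier in this lecture) exceeds 0.5, which selects I, F, V, L, W, M, and A. That cutoff and the function name `mark_hydrophobic` are assumptions chosen for illustration; the slide does not say how its marks were assigned.

```python
# Residues with an Eisenberg index above 0.5 (assumed cutoff).
HYDROPHOBIC = set("IFVLWMA")

def mark_hydrophobic(seq):
    """Return a string with 'h' under hydrophobic residues, ' ' elsewhere."""
    return "".join("h" if aa in HYDROPHOBIC else " " for aa in seq)

segment = "NLKRLLVVVVVVVLVVVVTVGALL"
marks = mark_hydrophobic(segment)
print(segment)
print(marks)
```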
Identify hydrophobic regions
G von Heijne (1992) Membrane protein structure prediction. Hydrophobicity analysis and the positive-inside rule. J Mol Biol 225: 487-94: Fig. 4
Topology for membrane helical proteins.
[Figure: proteins A, B, and C cross the lipid bilayer membrane between the extra-cytoplasmic side (out) and the intra-cytoplasmic side (in); the C-terminus (C-term) of each protein is marked]
Positive-inside rule
G von Heijne (1986) The distribution of positively charged residues in bacterial inner membrane proteins correlates with the trans-membrane topology. EMBO J 5:3021-7 Fig. 2
[Figure labels: cytosolic loops; periplasmic loops]
Topology for membrane helical proteins.
[Figure: proteins A, B, and C cross the lipid bilayer membrane between the extra-cytoplasmic side (out) and the intra-cytoplasmic side (in); the C-terminus (C-term) of each protein is marked]
Heijne rule: positive inside out
[Figure: worked example on a lipid membrane bilayer with extra-cytoplasmic and intra-cytoplasmic sides, N-term to C-term]
• Eight best HTM's with scores 0.92, 0.95, 0.93, 0.91, 0.90, 0.92, 0.87, 0.89
• Thresholds: µ=0: 0 HTM; µ=1: 1 HTM; µ=2: 2 HTM; µ=3: 3 HTM
• Loop lengths: 5, 30, 6, 5
• Charge: number of R+K in loops 1-4: Σ=2, Σ=5, Σ=3, Σ=1
• final prediction: ∆ = (5+1) - (2+3) > 0 => first loop out
1. predict <H>
2. assign positive inside-out
3. choose threshold to optimize inside-out difference
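The charge-difference step of the worked example can be sketched as follows. This is a simplified illustration under stated assumptions, not von Heijne's procedure in full: loops are assumed to be numbered from the N-terminus, the R+K counts 2, 5, 3, 1 are assumed to belong to loops 1 to 4 in that order, and the names `count_positive` and `first_loop_side` as well as the handling of ties are hypothetical.

```python
def count_positive(loop_sequence):
    """Number of positively charged residues (R and K) in one loop."""
    return sum(loop_sequence.count(aa) for aa in "RK")

def first_loop_side(positive_counts):
    """Decide the side of the first loop from R+K counts per loop.

    Loops alternate sides of the membrane, so loops 1, 3, ... share one
    side and loops 2, 4, ... the other. The side with more R+K is taken
    as inside (positive-inside rule).
    """
    odd_loops = sum(positive_counts[0::2])   # loops 1, 3, ...
    even_loops = sum(positive_counts[1::2])  # loops 2, 4, ...
    delta = even_loops - odd_loops
    if delta > 0:
        return "out"
    if delta < 0:
        return "in"
    return "undecided"  # tie handling is an assumption of this sketch

# Slide example: delta = (5+1) - (2+3) = 1 > 0, so the first loop is out.
side = first_loop_side([2, 5, 3, 1])
print(side)
```

With loop sequences instead of counts, `[count_positive(loop) for loop in loops]` would provide the input list.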
G von Heijne (1992) Membrane protein structure prediction. Hydrophobicity analysis and the positive-inside rule. J Mol Biol 225: 487-94: Fig. 4
Identify hydrophobic regions
Hydrophobicity-based
S Jayasinghe, K Hristova, SH White (2001) Energetics, stability, and prediction of transmembrane helices. J Mol Biol 312:927-34
idea: optimize hydrophobicity scale for prediction
PHDsec success on Poly-Valine
HEADER LIPOPROTEIN(SURFACE FILM)
COMPND PULMONARY SURFACTANT-ASSOCIATED POLYPEPTIDE C(SP-C)
SOURCE PIG (SUS SCROFA)
AUTHOR J.JOHANSSON,T.SZYPERSKI,T.CURSTEDT,K.WUTHRICH
AA LRIPCCPVNLKRLLVVVVVVVLVVVVTVGALLMGL
OBS sec HHHHHHHHHHHHHHHHHHHHHHHHH
PHD sec EEEEEEEEEEEEEEEEEEEEEEE
PHDhtm
Same two-level network and inputs as PHDsec, but with two output states (HTM, nonHTM):
input local in sequence: local alignment, 13 adjacent residues
input global in sequence: global statistics, whole protein (%AA, Length, ∆ N-term, ∆ C-term)
first level, sequence-to-structure: input layer (21+3 units per residue; 20, 4, 4, 4 global units), hidden layer, output layer (HTM, nonHTM)
second level, structure-to-structure: input layer (3+1 units per residue; 20, 4, 4, 4 global units), hidden layer, output layer (HTM, nonHTM)
Dynamic programming on NN ‘energy’
[Figure: network output between 0 and 1 plotted against residue number, for the states T and N]
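To give a feeling for what dynamic programming on per-residue network output can look like, here is a generic sketch. It is not the PHDhtm refinement: it simply picks non-overlapping helices so that the summed score (output minus 0.5) inside helices is maximal. The length limits of 17 to 25 residues, the 0.5 offset, and the function name `best_helices` are all assumptions for illustration.

```python
def best_helices(prob_t, min_len=17, max_len=25):
    """Choose non-overlapping helices maximising sum(prob - 0.5) inside them.

    prob_t: per-residue output for the state T (values between 0 and 1).
    Returns a list of (start, end) index pairs, end exclusive.
    """
    n = len(prob_t)
    prefix = [0.0]
    for p in prob_t:
        prefix.append(prefix[-1] + (p - 0.5))

    best = [0.0] * (n + 1)   # best[i]: optimal score for residues 0..i-1
    choice = [0] * (n + 1)   # 0: residue i-1 is N; L > 0: helix of length L ends at i
    for i in range(1, n + 1):
        best[i] = best[i - 1]
        for length in range(min_len, min(max_len, i) + 1):
            score = best[i - length] + prefix[i] - prefix[i - length]
            if score > best[i]:
                best[i], choice[i] = score, length

    # Trace back from the end to recover the chosen helices.
    helices, i = [], n
    while i > 0:
        if choice[i]:
            helices.append((i - choice[i], i))
            i -= choice[i]
        else:
            i -= 1
    return helices[::-1]

output = [0.1] * 10 + [0.9] * 20 + [0.1] * 10
print(best_helices(output))
```

The point of the dynamic programming step is that the segmentation is optimised over the whole protein at once rather than residue by residue.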
PHDhtm: refine
[Figure: topology refinement on a lipid membrane bilayer with extra-cytoplasmic and intra-cytoplasmic sides, N-term to C-term]
• Eight best HTM's with scores 0.92, 0.95, 0.93, 0.91, 0.90, 0.92, 0.87, 0.89
• Thresholds: µ=0: 0 HTM; µ=1: 1 HTM; µ=2: 2 HTM; µ=3: 3 HTM
• Loop lengths: 5, 30, 6, 5
• Charge: number of R+K in loops 1-4: Σ=2, Σ=5, Σ=3, Σ=1
• final prediction: ∆ = (5+1) - (2+3) > 0 => first loop out
PHDhtm on Poly-Valine
HEADER LIPOPROTEIN(SURFACE FILM)
COMPND PULMONARY SURFACTANT-ASSOCIATED POLYPEPTIDE C(SP-C)
SOURCE PIG (SUS SCROFA)
AUTHOR J.JOHANSSON,T.SZYPERSKI,T.CURSTEDT,K.WUTHRICH
AA LRIPCCPVNLKRLLVVVVVVVLVVVVTVGALLMGL
OBS htm TTTTTTTTTTTTTTTTTTTTTTTTT
PHD htm TTTTTTTTTTTTTTTTTTTTTTTT
Membrane helix prediction: TMHMM
TMHMM: sketch
details: inside/outside loop
details: TM core
A Krogh, B Larsson, G von Heijne, EL Sonnhammer (2001) J Mol Biol 305:567-80, Fig. 1
Membrane helix prediction: HMMTOP
Gabor E Tusnady & Istvan Simon (2001) The HMMTOP transmembrane topology prediction server. Bioinformatics 17:849-850.
TMHs (helices) correctly predicted?
C-P Chen, A Kernytsky & B Rost 2002 Protein Science 11, 2774-91
Observed Helix 1 (O1) O2 O3
Predicted Helix 1 (P1) P2 P3
TMHs (helices) correctly predicted: if at most ±5 residues overlap
C-P Chen, A Kernytsky & B Rost 2002 Protein Science 11, 2774-91
Observed Helix 1 (O1) O2 O3
Predicted Helix 1 (P1) P2 P3
here=0
Prediction of membrane helices
Qok: % of proteins with all TMH right
J Reeb, E Kloppmann, M Bernhofer & B Rost (2015) Proteins 83:473-84
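A per-protein score such as Qok can be sketched as below. The matching criterion is one possible reading of the ±5 residue rule on the earlier slide (both ends of a predicted helix lie within 5 residues of the observed ends); the cited papers may define it differently. The function names and the toy data are hypothetical.

```python
def helix_match(observed, predicted, tolerance=5):
    """True if both helix ends deviate by at most `tolerance` residues."""
    return (abs(observed[0] - predicted[0]) <= tolerance
            and abs(observed[1] - predicted[1]) <= tolerance)

def protein_ok(observed, predicted, tolerance=5):
    """True if the helix counts agree and every helix pair matches in order."""
    if len(observed) != len(predicted):
        return False
    pairs = zip(sorted(observed), sorted(predicted))
    return all(helix_match(o, p, tolerance) for o, p in pairs)

def q_ok(proteins, tolerance=5):
    """Percentage of proteins for which all TMHs are predicted correctly."""
    hits = sum(protein_ok(obs, pred, tolerance) for obs, pred in proteins)
    return 100.0 * hits / len(proteins)

# Toy data: (observed helices, predicted helices) as (start, end) pairs.
toy_set = [
    ([(10, 30), (50, 70)], [(12, 28), (47, 72)]),  # both helices within 5
    ([(10, 30)], [(20, 40)]),                      # shifted by 10 residues
    ([(5, 25)], []),                               # helix missed
]
print(round(q_ok(toy_set), 1))
```

Because a single missed or extra helix makes the whole protein count as wrong, this kind of measure is much stricter than per-residue accuracy.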
Other problems unravelled by recent structures
Kingdoms similar in length
J Liu & B Rost (2001) Prot Sci 10, 1970-1979.
B Rost (2002) Curr Op Struct Biol 12, 409-416
[Figure legend: eukaryotes, bacteria, archaea]
Kingdoms similar in amino acids usage
J Liu & B Rost (2001) Prot Sci 10, 1970-1979.
B Rost (2002) Curr Op Struct Biol 12, 409-416
eukaryotes
bacteria
archaea
Inventory of life: membrane proteins
[Figure: bar chart of %mem (axis 0 to 30) per proteome; legend: eukaryotes, bacteria, archaea]
Proteomes shown: A pernix, A fulgidus, M jannaschii, M thermoautotrophicu, P abyssi, P horikoshii, A aeolicus, B subtilis, B burgdorferi, C jejuni, C pneumoniae, C trachomatis, D radiodurans, E coli, H influenzae, H pylori, M genitalium, M pneumoniae, M tuberculosis, N meningitidis, R prowazekii, S PCC6803, T maritima, T pallidum, U urealyticum, S cerevisiae, C elegans, D melanogaster, H sapiens (SP/TrEmbl, H sapiens (chr 22)
J Liu. & B Rost (2001) Prot. Sci. 10, 1970-1979.
2013 note: some issues with data (incomplete sequences?), e.g. human has more than 18%
Inventory of life: coiled-coil proteins
[Figure: bar charts of %mem (axis 0 to 30) and %coils (axis 0 to 12) per proteome, from A pernix to H sapiens (chr 22), as in the membrane-protein inventory; legend: eukaryotes, bacteria, archaea]
J Liu & B Rost (2001) Prot Sci 10, 1970-1979.
TMH proteins: reminders
statistics for PDB in June 2010:
• 67,086 structures in PDB
• 1,197 transmembrane
• 1,014 alpha helical
• 179 beta barrel
-> < 2% BUT: >20% of all proteins!
GE Tusnady, ZS Dosztanyi & I Simon (2005) Bioinformatics 21:1276-7
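The "< 2%" on this slide follows directly from the two counts given for June 2010. The short check below uses only those two numbers; the variable names are arbitrary.

```python
pdb_structures = 67_086   # structures in PDB, June 2010 (from the slide)
transmembrane = 1_197     # transmembrane structures (from the slide)

share = 100.0 * transmembrane / pdb_structures
print(f"{share:.2f}% of PDB structures are transmembrane")
```

Compared with the more than 20% of all proteins that are membrane proteins, this makes the under-representation in the PDB explicit.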
TMH proteins: reminders
statistics for PDB in June 2010:
• 67,086 structures in PDB
• 246 unique* transmembrane
-> way less than 2% BUT: >20% of all proteins!
• * unique = non-identical sequence (can have PIDE>99.5%!)
S Jayasinghe, K Hristova, SH White (2001) Protein Sci 10:455-8
TMH proteins: reminders
Edda Kloppmann & Marco Punta: 1,035 PDB unique TM structures (Jan 2012) -> 107 Pfam families
E Kloppmann, M Punta & B Rost (2012) Curr Op Struct Biol 22:326-32
Thanks to Arne Elofsson
Following slides taken from Arne Elofsson, Stockholm Univ
Interface helices (Granseth, JMB 2005)
• 78 interface helices
• ~50% of chains contain interface helix
• Average length ~ 9 aa
• Longest is 19 aa
• Most frequent in photosynthetic reaction center
E Granseth, G von Heijne & A Elofsson (2005) J Mol Biol 346:377-85
© Arne Elofsson (Stockholm Univ)
Re-entry regions
• 36 reentrant helices
• 20 in new classification
• 24% contain reentry
• 72% on the outside
• Length 3-32 residues
• Loops 11-117 residues
H Viklund, E Granseth & A Elofsson (2006) J Mol Biol 361:591-603
© Arne Elofsson (Stockholm Univ)
36 reentry regions in 3 classes
• Helix-coil / Coil-helix
• Helix-coil-helix
• Coil
H Viklund, E Granseth & A Elofsson (2006) J Mol Biol 361:591-603
© Arne Elofsson (Stockholm Univ)
Predict re-entry regions
H Viklund, E Granseth & A Elofsson (2006) J Mol Biol 361:591-603: Fig. 5
Re-entry predicted in entire genomes
H Viklund, E Granseth & A Elofsson (2006) J Mol Biol 361:591-603
© Arne Elofsson (Stockholm Univ)
Genome               Reentrants in  Reentrants out  Reentrant fraction + Proteins (digits fused in source)
Observed in dataset  0.28           0.72            0.24079
E. coli              0.52           0.48            0.167773
S. cerevisiae        0.40           0.60            0.10757
H. sapiens           0.54           0.46            0.154181

Function               Fraction
Channels               0.31
Active transporters    0.22
Electron transporters  0.11
Signal receptors       0.07
The not so simple TM proteins
Membrane protein structures are complex
• TM helices end at different locations
• Different angles
• Neighboring helices often interact
• Interface helices
• Reentrant regions
No sheets close to the membrane
H Viklund, E Granseth & A Elofsson (2006) J Mol Biol 361:591-603
© Arne Elofsson (Stockholm Univ)
More complex structures need new prediction methods
[Figure labels: Nout, Cin, C, N, cytoplasm, periplasm, Membrane]
H Viklund, E Granseth & A Elofsson (2006) J Mol Biol 361:591-603
© Arne Elofsson (Stockholm Univ)
The Z-coordinate
Z-coordinate: distance from residue to membrane center
[Figure: Z axis with ticks -15, 0, 15; labels Periplasm and Cytoplasm]
H Viklund, E Granseth & A Elofsson (2006) J Mol Biol 361:591-603
© Arne Elofsson (Stockholm Univ)
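A minimal sketch of how a Z-coordinate could be computed from 3D coordinates is given below. It assumes that the membrane center and the membrane normal are already known, and it returns a signed distance along that normal, since the axis on the slide runs from -15 to 15. The function name `z_coordinates`, the choice of one coordinate per residue, and the sign convention are assumptions for this example.

```python
import numpy as np

def z_coordinates(residue_coords, membrane_center, membrane_normal):
    """Signed distance of each residue from the membrane centre plane.

    residue_coords: array-like of shape (n_residues, 3), one point per
    residue (which atom to use is left open here).
    """
    normal = np.asarray(membrane_normal, dtype=float)
    normal = normal / np.linalg.norm(normal)  # unit vector along the normal
    offsets = np.asarray(residue_coords, dtype=float) - np.asarray(membrane_center, dtype=float)
    return offsets @ normal

coords = [[0.0, 0.0, 0.0], [1.0, 2.0, 15.0], [3.0, -1.0, -15.0]]
z = z_coordinates(coords, membrane_center=[0.0, 0.0, 0.0], membrane_normal=[0.0, 0.0, 2.0])
print(z)
```

Predicting this value per residue, rather than a helix or loop label, is what allows interface helices and re-entrant regions to be described on a continuous scale.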
Lecture plan (CB1 structure: INF)
01: 04/10 Tue: No lecture
02: 04/12 Thu: No lecture
03: 04/17 Tue: No lecture
04: 04/19 Thu: Intro 1: organization of lecture: intro into cells & biology
05: 04/24 Tue: Intro 2: amino acids, protein structure (comparison), domains
06: 04/26 Thu: No lecture
07: 05/01 Tue: SKIP: May Day
08: 05/03 Thu: Alignment 1
09: 05/08 Tue: SKIP: Student Representation (SVV)
10: 05/10 Thu: SKIP: Ascension Day
11: 05/15 Tue: Alignment 2
12: 05/17 Thu: Comparative modeling & exp structure determination & secondary structure assignment
13: 05/22 Tue: SKIP: Whitsun holiday
14: 05/24 Thu: Comparative modeling 2 & 1D: Secondary structure prediction 1
15: 05/29 Tue: 1D: Secondary structure prediction 2
16: 05/31 Thu: SKIP: Corpus Christi
17: 06/05 Tue: 1D: Secondary structure prediction 3
18: 06/07 Thu: 1D: Transmembrane structure prediction 1 (today)
19: 06/12 Tue: 1D: Transmembrane structure prediction 2 / Solvent accessibility prediction
20: 06/14 Thu: 1D: Disorder prediction; 2D prediction / 3D prediction
21: 06/19 Tue: No lecture (but exercises)
22: 06/21 Thu: No lecture (but exercises)
23: 06/26 Tue: recap 1
24: 06/28 Thu: recap 2
25: 07/03 Tue: TBA
26: 07/05 Thu: TBA
27: 07/10 Tue: TBA
28: 07/12 Thu: TBA