Comparative modeling Ole Lund, Associate Professor, CBS, BioCentrum, DTU

Comparative modeling

Ole Lund,

Associate Professor,

CBS, BioCentrum, DTU

Comparative modeling

Also known as homology modeling

Uses a template from a related protein to build a model

Based on the finding that
– Protein structure tends to remain approximately the same even when many amino acids have changed during evolution!
– Selection for conservation of structure?

Proteins with similar sequences often have similar structures

Why make structural models?

Fast and cheap alternative to experimental determination of structures (X-ray & NMR)
– Not as accurate as experimental methods
– Not all proteins can be modeled with current methods

Applications
– Drug discovery (requires accurate model)
– Plan new experiments (mutations)
– Understanding of function

Steps in comparative modeling

1. Find template

2. Make alignment

3. Build loops

4. Model side chains

5. Refinement

6. Evaluate model

Recovery from errors

An error on an earlier step is normally unrecoverable on a later step
– The alignment cannot make up for a wrong choice of template
– Loop modeling cannot make up for a wrong alignment

Errors may be discovered on a later step and corrected by going back to the step where they were made
– e.g. by selecting a new (and better) template

Template identification

Search with sequence
– Blast
– Psi-Blast
– Fold recognition methods

Use significance levels (P or E values), not %ID

BLAST reports E-values:
– Number of random hits expected to be found with at least a given score

Rather than P values:
– Probability of finding at least one hit with at least a given score

P = 1 - exp(-E)
E = -ln(1 - P)
– http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html

Use biological information
– Functional annotation in databases
– Active site/motifs
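The relation between E and P values can be checked numerically. The following is a minimal sketch; the function names `e_to_p` and `p_to_e` are illustrative and not part of BLAST. For small E the two values are nearly identical, which is why they are often used interchangeably for significant hits.

```python
import math


def e_to_p(e_value: float) -> float:
    """P = 1 - exp(-E): probability of at least one chance hit."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


def p_to_e(p_value: float) -> float:
    """E = -ln(1 - P): expected number of chance hits."""
    # log1p(-P) equals ln(1 - P) but stays accurate for very small P.
    return -math.log1p(-p_value)


for e in (7e-30, 0.01, 1.0, 10.0):
    print(f"E = {e:g}  ->  P = {e_to_p(e):.3g}")
```

For large E the P value saturates towards 1, while E keeps growing, so E is the more informative number for weak hits.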

Example: Query sequence

>gi|2065035|emb|CAA65601.1| beta-lactamase [Chryseobacterium meningosepticum]
MLKKIKISLILALGLTSLQAFGQENPDVKIEKLKDNLYVYTTYNTFNGTKYAANAVYLVTDKGVVVIDCPWGEDKFKSFTDEIYKKHGKKVIMNIATHSHDDRAGGLEYFGKIGAKTYSTKMTDSILAKENKPRAQYTFDNNKSFKVGKSEFQVYYPGKGHTADNVVVWFPKEKVLVGGCIIKSADSKDLGYIGEAYVNDWTQSVHNIQQKFSGAQYVVAGHDDWKDQRSIQHTLDLINEYQQKQKASN

Since the discovery of penicillin, bacteria have developed defense mechanisms against beta-lactam antibiotics. This has become a particular problem in recent decades, as certain pathogenic bacteria have become resistant to antibiotics. The primary defense mechanism is the production of beta-lactamases, enzymes that cleave beta-lactam antibiotics.

http://www.matfys.kvl.dk/~antony/

Blast search vs. pdb

>gi|3318914|pdb|1A7T|A Chain A, Metallo-Beta-Lactamase With Mes
 gi|3318915|pdb|1A7T|B Chain B, Metallo-Beta-Lactamase With Mes
 gi|3891997|pdb|1A8T|A Chain A, Metallo-Beta-Lactamase In Complex With L-159,061
 gi|3891998|pdb|1A8T|B Chain B, Metallo-Beta-Lactamase In Complex With L-159,061
 Length = 232

Score = 126 bits (317), Expect = 7e-30
Identities = 62/216 (28%), Positives = 111/216 (51%), Gaps = 1/216 (0%)

Query: 27  DVKIEKLKDNLYVYTTYNTFNG-TKYAANAVYLVTDKGVVVIDCPWGEDKFKSFTDEIYK 85
           D+ I +L D +Y Y +     G     +N + ++ +    ++D P  + + +   + +
Sbjct: 10  DISITQLSDKVYTYVSLAEIEGWGMVPSNGMIVINNHQAALLDTPINDAQTEMLVNWVTD 69

Query: 86  KHGKKVIMNIATHSHDDRAGGLEYFGKIGAKTYSTKMTDSILAKENKPRAQYTFDNNKSF 145
               KV   I  H H D  GGL Y  + G ++Y+ +MT  +  ++  P  ++ F ++ +
Sbjct: 70  SLHAKVTTFIPNHWHGDCIGGLGYLQRKGVQSYANQMTIDLAKEKGLPVPEHGFTDSLTV 129

Query: 146 KVGKSEFQVYYPGKGHTADNVVVWFPKEKVLVGGCIIKSADSKDLGYIGEAYVNDWTQSV 205
            +     Q YY G GH  DN+VVW P E +L GGC++K   +  +G I +A V  W +++
Sbjct: 130 SLDGMPLQCYYLGGGHATDNIVVWLPTENILFGGCMLKDNQTTSIGNISDADVTAWPKTL 189

Query: 206 HNIQQKFSGAQYVVAGHDDWKDQRSIQHTLDLINEY 241
             ++ KF  A+YVV GH ++     I+HT  ++N+Y
Sbjct: 190 DKVKAKFPSARYVVPGHGNYGGTELIEHTKQIVNQY 225

http://www.ncbi.nlm.nih.gov/blast/
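The "Identities" and "Gaps" figures in the BLAST hit can be recomputed from the aligned rows. This is a minimal sketch, not BLAST's own code: the function name `alignment_stats` is illustrative, and the two strings are the Query and Sbjct rows of the hit joined end to end.

```python
def alignment_stats(query: str, subject: str) -> dict:
    """Count columns, identical residues and gap columns in a pairwise alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(query, subject))
    identities = sum(1 for q, s in pairs if q == s and q != "-")
    gaps = sum(1 for q, s in pairs if "-" in (q, s))
    return {
        "columns": len(pairs),
        "identities": identities,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / len(pairs),
    }


# Aligned rows copied from the BLAST hit (query 27-241, subject 10-225).
query = (
    "DVKIEKLKDNLYVYTTYNTFNG-TKYAANAVYLVTDKGVVVIDCPWGEDKFKSFTDEIYK"
    "KHGKKVIMNIATHSHDDRAGGLEYFGKIGAKTYSTKMTDSILAKENKPRAQYTFDNNKSF"
    "KVGKSEFQVYYPGKGHTADNVVVWFPKEKVLVGGCIIKSADSKDLGYIGEAYVNDWTQSV"
    "HNIQQKFSGAQYVVAGHDDWKDQRSIQHTLDLINEY"
)
subject = (
    "DISITQLSDKVYTYVSLAEIEGWGMVPSNGMIVINNHQAALLDTPINDAQTEMLVNWVTD"
    "SLHAKVTTFIPNHWHGDCIGGLGYLQRKGVQSYANQMTIDLAKEKGLPVPEHGFTDSLTV"
    "SLDGMPLQCYYLGGGHATDNIVVWLPTENILFGGCMLKDNQTTSIGNISDADVTAWPKTL"
    "DKVKAKFPSARYVVPGHGNYGGTELIEHTKQIVNQY"
)
stats = alignment_stats(query, subject)
print(stats)
```

Percent identity depends on the denominator (alignment columns here). Other tools may divide by the length of the shorter sequence or exclude gap columns, so small differences between reported values are normal.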

Template sequence

1A8TB. Chain B, Metallo-...[gi:3891998]

LOCUS       1A8T_B       232 aa    linear   BCT 23-MAR-1998
DEFINITION  Chain B, Metallo-Beta-Lactamase In Complex With L-159,061.
ACCESSION   1A8T_B
VERSION     1A8T_B  GI:3891998
DBSOURCE    pdb: molecule 1A8T, chain 66, release Mar 23, 1998;
            deposition: Mar 23, 1998;
            class: Hydrolase;
            source: Mol_id: 1; Organism_scientific: Bacteroides Fragilis;
            Strain: Tal3636; Variant: Clinical Isolate; Gene: Ccra;
            Expression_system: Escherichia Coli;
            Exp. method: X-Ray Diffraction.
KEYWORDS    .
SOURCE      Bacteroides fragilis
  ORGANISM  Bacteroides fragilis
            Bacteria; Bacteroidetes; Bacteroides (class); Bacteroidales;
            Bacteroidaceae; Bacteroides.
……………
ORIGIN
        1 aqksvkisdd isitqlsdkv ytyvslaeie gwgmvpsngm ivinnhqaal ldtpindaqt
       61 emlvnwvtds lhakvttfip nhwhgdcigg lgylqrkgvq syanqmtidl akekglpvpe
      121 hgftdsltvs ldgmplqcyy lggghatdni vvwlptenil fggcmlkdnq ttsignisda
      181 dvtawpktld kvkakfpsar yvvpghgnyg gteliehtkq ivnqyiests kp
//

Template recognition

BlaB – Beta lactamase

Template 1A8T, chain A

Alignment of query and template

Look at the alignment used to find the template
– Are secondary structure elements, active sites and other motifs aligned?
– Can gaps be closed?
– Is there room for the insertions?

Change the alignment manually, or by using a different alignment program or different alignment parameters
– Take care not to change it for the worse
– On average I only make things slightly worse by manual intervention!

Alignment

BlaB – Beta lactamase

BLAB   EKLKDNLYVYTTYNTFNGTKY-AANAVYLVTDKGVVVIDCPWGEDKFKSFTDEIYKKHGKKVIMNIATHS
1A8T.A TQLSDKVYTYVSLAEIEGWGMVPSNGMIVINNHQAALLDTPINDAQTEMLVNWVTDSLHAKVTTFIPNHW

BLAB   HDDRAGGLEYFGKIGAKTYSTKMTDSILAKENKPRAQYTFDNNKSFKVGKSEFQVYYPGKGHTADNVVVW
1A8T.A HGDCIGGLGYLQRKGVQSYANQMTIDLAKEKGLPVPEHGFTDSLTVSLDGMPLQCYYLGGGHATDNIVVW

BLAB   FPKEKVLVGGCIIKSADSKDLGYIGEAYVNDWTQSVHNIQQKFSGAQYVVAGHDDWKDQRSIQHTLDLIN
1A8T.A LPTENILFGGCMLKDNQTTSIGNISDADVTAWPKTLDKVKAKFPSARYVVPGHGNYGGTELIEHTKQIVN

BLAB   EYQQKQK
1A8T.A QYIESTS

 

Sequence identity 27%

Template vs alignment identification

If the template was hard to find, the correct alignment will be tough to make

If the template is correct, part of the model will normally be correct

Build loops

Fragment based methods
– Many implementations (M Levitt, L Holm, D Baker etc.)
– Fast

Energy based methods
– Avoid stereo-chemically infeasible solutions
– Can see what is bad but not what is good!

Combination of methods is often used

No method can move the model (very much) towards the native conformation, i.e. reduce the root mean square deviation (RMSD) = how many Ångstrøms you are off
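RMSD can be computed directly from two matched sets of coordinates. This is a minimal sketch that assumes the two structures are already superimposed and that atoms are paired in the same order. It does no optimal fitting, and the coordinates are hypothetical.

```python
import math


def rmsd(model, native):
    """Root mean square deviation between two equally long lists of (x, y, z)."""
    if len(model) != len(native) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2
        for (mx, my, mz), (nx, ny, nz) in zip(model, native)
    )
    return math.sqrt(squared / len(model))


# Hypothetical C-alpha coordinates in Ångstrøm; every model atom is 1 Å off in z.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rmsd(model, native))
```

In practice the model is first superimposed on the native structure, since RMSD without superposition also counts rigid-body displacement.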

Loops: The Rosetta method

Find fragments (10 per amino acid) with the same sequence and secondary structure profile as the query sequence

Combine them using a Monte Carlo scheme to build the loop

Baker et al.

http://www.bioinfo.rpi.edu/~bystrc/hmmstr/server.php
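The idea of combining fragments with a Monte Carlo scheme can be illustrated with a toy example. This is only a sketch under strong assumptions and is not Rosetta's implementation: each residue is reduced to a single angle, the fragment library is random rather than matched to sequence and secondary structure, and `toy_score` is a hypothetical stand-in for an energy function.

```python
import math
import random


def assemble_loop(n_residues, fragments, score, steps=500, temperature=1.0, seed=0):
    """Metropolis Monte Carlo over fragment insertions; returns the best state seen."""
    rng = random.Random(seed)
    current = [0.0] * n_residues
    current_score = score(current)
    best, best_score = list(current), current_score
    starts = sorted(fragments)
    for _ in range(steps):
        # Propose a move: overwrite one window with a fragment from the library.
        start = rng.choice(starts)
        fragment = rng.choice(fragments[start])
        trial = list(current)
        trial[start:start + len(fragment)] = fragment
        trial_score = score(trial)
        delta = trial_score - current_score
        # Metropolis criterion: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


def toy_score(angles):
    # Hypothetical energy: squared distance of every angle from -60 degrees.
    return sum((a + 60.0) ** 2 for a in angles)


# Hypothetical library: 10 random 3-residue fragments per start position.
library_rng = random.Random(1)
n_residues, fragment_length = 8, 3
fragments = {
    start: [
        tuple(library_rng.uniform(-180.0, 180.0) for _ in range(fragment_length))
        for _ in range(10)
    ]
    for start in range(n_residues - fragment_length + 1)
}
best, best_score = assemble_loop(n_residues, fragments, toy_score)
print(best_score, toy_score([0.0] * n_residues))
```

Accepting some moves that make the score worse lets the search escape local minima, which a purely greedy fragment replacement cannot do.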

Model side chains

Knowledge based methods
– SCWRL performed well in CASP4 (http://dunbrack.fccc.edu/SCWRL3.php, http://dunbrack.fccc.edu/scwrl3protsci.pdf)

Energy calculations
– Slow

SCWRL (Bower, Cohen & Dunbrack)

Sidechain placement With a Rotamer Library

Assumes constant bond lengths and bond angles

1. Each residue begins in its most favored rotamer

2. Rotamer search to remove steric clashes between sidechains and backbone

3. Rotamer search to remove steric clashes between sidechains

Model (red) vs template (blue)

Model evaluation

Is the structure unlikely?

Distributions of
– Dihedral angles (fraction in most favored regions)
– Bond lengths and angles

Procheck
– www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
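The "fraction in most favored regions" check can be illustrated with a toy calculation. This is a sketch only: the rectangular helix and sheet boxes below are crude assumptions, not the empirically derived regions that Procheck uses, and the (phi, psi) values are hypothetical.

```python
def in_favored_region(phi, psi):
    """Crude rectangular stand-ins for the helical and sheet regions (degrees)."""
    helix = -100.0 <= phi <= -30.0 and -80.0 <= psi <= -5.0
    sheet = -180.0 <= phi <= -45.0 and 90.0 <= psi <= 180.0
    return helix or sheet


def fraction_favored(dihedrals):
    """Fraction of (phi, psi) pairs that fall inside the assumed favored boxes."""
    if not dihedrals:
        raise ValueError("no dihedral angles given")
    return sum(in_favored_region(phi, psi) for phi, psi in dihedrals) / len(dihedrals)


# Hypothetical (phi, psi) pairs: helix-like, sheet-like, helix-like, and one outlier.
angles = [(-60.0, -45.0), (-120.0, 130.0), (-65.0, -40.0), (60.0, -150.0)]
print(fraction_favored(angles))
```

A low fraction flags a model as unlikely, but a high fraction does not prove the model is correct. It only shows that the backbone geometry is plausible.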

Example of Procheck output

Benchmarking comparative modeling

CASP
– Critical Assessment of Structure Predictions
– Sequences from about-to-be-solved structures are given to groups who submit their predictions before the structure is published

EVA
– Newly solved structures are sent to prediction servers
– Evaluates automatic servers

CASP4: Best overall fold

1. Venclovas, C

2. Baker, D

3. Sternberg, M

4. Rychlewski, L (Bioinfo.PL)

5. SBI-AT

Tramontano et al., 2001

CASP4: Best details of models

1. Venclovas, C

2. Sternberg, M

3. Honig, B

4. Baker, D

5. SBI-AT

Tramontano et al., 2001

Accuracy of SwissModel

EVA

http://cubic.bioc.columbia.edu/eva/cm/res/rank.html

Analysis of Fold accuracy (% Equivalent Positions):

Ranking of the methods:

1. sdsc1

2. 3djigsaw

3. SwissModel

4. cphmodels

5. esypred

Links to modeling servers

Database of links
– http://mmtsb.scripps.edu/cgi-bin/renderrelres?protmodel

SwissModel
– www.expasy.ch/swissmod/SM_FIRST.html

3D-Jigsaw
– www.bmm.icnet.uk/servers/3djigsaw/

SDSC1
– http://cl.sdsc.edu/hm.html

ESyPred3D
– http://www.fundp.ac.be/urbm/bioinfo/esypred/

CPHmodels
– www.cbs.dtu.dk/services/CPHmodels-2.0

Practical conclusions

Several servers exist in the public domain

Template and alignment must be correct

Loops are difficult to model

More info on comparative modeling
– http://speedy.embl-heidelberg.de/gtsp/
– http://www.cmbi.kun.nl/gv/course/index.html
– http://www.umass.edu/microbio/chime/explorer/homolmod.htm