
Databanks + New tools = New insights


Page 1

THE AXIOM

Databanks + New tools = New insights

Simple Atom Depth Index Calculator (SADIC)

protein fold barcoding

CATH – ADAPT… -1

Page 2

SADIC: a new tool to analyze atom depth

Digging inside objects to discover their origins

Birth of the Earth

protein folding

Page 3

atom depth: 2D

HEWL 4lzt

Atom depth calculated as the distance to:

the closest external water*

the closest dot of the water accessible surface*

the closest surface exposed atom*

* Chakravarty S, Varadarajan R. Residue depth: a novel parameter for the analysis of protein structure and stability. Structure Fold Des. 1999 7:723-732.

* Pintar A, Carugo O, Pongor S. Atom depth as a descriptor of the protein interior. Biophys J. 2003 84:2553-2561.
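The distance-based ("2D") depth listed above can be illustrated with a minimal Python sketch. It assumes that the reference points (external waters, surface dots or surface-exposed atoms) are already available as coordinates; the function name `depth_2d` and the toy coordinates are hypothetical and are not taken from 4lzt or from the cited papers.

```python
import math

def depth_2d(atom, reference_points):
    """Distance (in Å) from `atom` to the closest reference point.

    `reference_points` may hold external waters, surface dots or
    surface-exposed atoms, depending on which definition is chosen.
    """
    return min(math.dist(atom, p) for p in reference_points)

# Toy coordinates in Å (hypothetical, for illustration only).
waters = [(0.0, 0.0, 10.0), (8.0, 0.0, 0.0)]
buried_atom = (0.0, 0.0, 0.0)
surface_atom = (6.0, 0.0, 0.0)

print(depth_2d(buried_atom, waters))   # 8.0
print(depth_2d(surface_atom, waters))  # 2.0
```

The measure depends only on the single nearest reference point, which is why the slides contrast it with the volume-based 3D index.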

Page 4

atom depth

HEWL 4lzt

2D

3D: Calculation of exposed volumes

Daniele Varrazzo, Andrea Bernini, Ottavia Spiga, Arianna Ciutti, Stefano Chiellini, Vincenzo Venditti, Luisa Bracci and Neri Niccolai. Three-dimensional Computation of Atomic Depth in Complex Molecular Structures. Bioinformatics 2005 21:2856-2860.

Page 5

atom depth

HEWL 4lzt

3D: Calculation of exposed volumes

Page 6

atom depth: 3D

HEWL 4lzt

Calculation of exposed volumes

Depth index: $D_{i,r} = \frac{2 V_{i,r}}{V_{0,r}}$

where $V_{i,r}$ is the exposed volume of a sphere of radius $r$ centered on atom $i$ of the molecule and $V_{0,r}$ is the exposed volume of the same sphere when centered on an isolated atom.

The sphere radius $r$ should be the largest value for which $V_{i,r} = 0$ for the most buried atom.
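The depth index defined above can be approximated numerically. The sketch below is not the SADIC implementation: it assumes a single atomic radius for every atom (`atom_radius`, hypothetical) and estimates exposed volumes by counting points of a regular grid inside the probe sphere. Because the same grid is used for the atom in the molecule and for the isolated atom, the ratio of point counts stands in for the ratio of volumes.

```python
import math
from itertools import product

def exposed_fraction(center, atoms, r, atom_radius=1.8, step=0.5):
    """Fraction of grid points inside the sphere (center, r) that lie outside every atom."""
    n = math.ceil(r / step)
    inside = exposed = 0
    for ix, iy, iz in product(range(-n, n + 1), repeat=3):
        dx, dy, dz = ix * step, iy * step, iz * step
        if dx * dx + dy * dy + dz * dz > r * r:
            continue  # grid point falls outside the probe sphere
        inside += 1
        point = (center[0] + dx, center[1] + dy, center[2] + dz)
        if all(math.dist(point, a) > atom_radius for a in atoms):
            exposed += 1
    return exposed / inside

def depth_index(i, atoms, r, atom_radius=1.8, step=0.5):
    """Approximate D_{i,r} = 2 V_{i,r} / V_{0,r} for atom i (requires r > atom_radius)."""
    v_i = exposed_fraction(atoms[i], atoms, r, atom_radius, step)
    v_0 = exposed_fraction(atoms[i], [atoms[i]], r, atom_radius, step)
    return 2.0 * v_i / v_0

# Toy "molecule": a cubic lattice of atoms spaced 1.5 Å apart (hypothetical data).
coords = [1.5 * k for k in range(-4, 5)]
lattice = list(product(coords, repeat=3))
center = lattice.index((0.0, 0.0, 0.0))

print(depth_index(center, lattice, r=3.0))        # 0.0 (fully buried)
print(depth_index(0, [(0.0, 0.0, 0.0)], r=3.0))   # 2.0 (isolated atom)
print(depth_index(0, lattice, r=3.0))             # corner atom: between 0 and 2
```

With this definition an isolated atom scores 2 and a fully buried atom scores 0, so lower values mean deeper atoms.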

Page 7

[Plot: Di,r (axis from 0.0 to 2.0) versus r [Å] (axis from 4.0 to 24.0)]

Page 8

atom depth: 3D vs 2D

HEWL 4lzt

Thr 47 α carbon: $D_{i,9} = 1.59$

Ile 58 α carbon: $D_{i,9} = 0.13$

Trp 28 α carbon: $D_{i,9} = 0.03$

Page 9

3D atom depth analysis

from PDB ID 1UBQ

http://www.sbl.unisi.it/prococoa/

Di

Page 10

SBL Bioinformatics Projects

Projects related to SADIC:

1. fold-dependent aa compositions of protein cores;

2. towards i-SADIC.

Projects unrelated to SADIC:

1. systematic analysis of PPI

Page 11

Di analysis of protein atoms

defining structural layers in protein 3D structures

each structural layer includes atoms with similar Di values

fast and accurate analysis of the aa content of structural layers

Page 12

Di analysis of protein atoms

3VTR (chitinolytic enzyme, 572 aa)

Ln    Di           color
L6    > 1.2        red
L5    1.0 – 1.2    orange
L4    0.8 – 1.0    yellow
L3    0.6 – 0.8    green
L2    0.4 – 0.6    blue
L1    0.2 – 0.4    indigo
L0    < 0.2        violet
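The layer scheme above amounts to binning each Di value into one of seven intervals. A minimal Python sketch follows; the names `layer_of`, `LAYER_BOUNDS` and `LAYER_COLORS` are hypothetical, and the handling of values falling exactly on a boundary (assigned here to the upper layer) is an assumption, since the slide does not say which side is inclusive.

```python
from bisect import bisect_right

# Upper bounds of layers L0..L5; anything above the last bound is L6.
LAYER_BOUNDS = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
LAYER_COLORS = ["violet", "indigo", "blue", "green", "yellow", "orange", "red"]

def layer_of(di):
    """Return (layer name, color) for a depth index value.

    Assumption: a value equal to a boundary goes to the upper layer.
    """
    n = bisect_right(LAYER_BOUNDS, di)
    return f"L{n}", LAYER_COLORS[n]

print(layer_of(0.03))  # ('L0', 'violet')
print(layer_of(0.50))  # ('L2', 'blue')
print(layer_of(1.59))  # ('L6', 'red')
```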

Page 13

3D atom depth analysis

from PDB ID 1UBQ

http://www.sbl.unisi.it/prococoa/

Residue   Di per atom
K63       N 0.19, CA 0.30, C 0.25, O 0.23, CB 0.50, CG 0.68, CD 0.91, CE 1.11, NZ 1.29
E24       N 0.38, CA 0.52, C 0.50, O 0.52, CB 0.76, CG 0.95, CD 1.17, OE1 1.24, OE2 1.24
L43       N 0.10, CA 0.05, C 0.11, O 0.18, CB 0.02, CG 0.02, CD1 0.02, CD2 0.00

Dimax is marked for each residue.
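A short Python sketch of how a per-residue Dimax could be derived from the per-atom values listed for 1UBQ. It assumes that Dimax is the largest Di among the atoms of a residue, which is inferred from the slide labels rather than stated explicitly. The 0.2 core threshold is the one given later in the deck ("core aa if Dimax < 0.2"); the function names are hypothetical.

```python
# Per-atom Di values for three ubiquitin residues, copied from the slide.
UBQ_DI = {
    "K63": {"N": 0.19, "CA": 0.30, "C": 0.25, "O": 0.23, "CB": 0.50,
            "CG": 0.68, "CD": 0.91, "CE": 1.11, "NZ": 1.29},
    "E24": {"N": 0.38, "CA": 0.52, "C": 0.50, "O": 0.52, "CB": 0.76,
            "CG": 0.95, "CD": 1.17, "OE1": 1.24, "OE2": 1.24},
    "L43": {"N": 0.10, "CA": 0.05, "C": 0.11, "O": 0.18, "CB": 0.02,
            "CG": 0.02, "CD1": 0.02, "CD2": 0.00},
}

def dimax(atom_di):
    """Largest Di among the atoms of one residue (assumed definition of Dimax)."""
    return max(atom_di.values())

def is_core(atom_di, threshold=0.2):
    """A residue counts as core when even its most exposed atom is below the threshold."""
    return dimax(atom_di) < threshold

for name, atoms in UBQ_DI.items():
    print(name, dimax(atoms), "core" if is_core(atoms) else "not core")
```

Under these assumptions only L43 qualifies as a core residue, while the charged side chains of K63 and E24 reach the outer layers.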

Page 14

Dimax analysis of protein residues

defining aa occupancy in protein structural layers

each structural layer includes residues with similar Dimax values

fast and accurate analysis of aa distribution in protein structures

Page 15

Dimax analysis of protein singles

quite a few proteins like to stay single (at least in the crystalline state)

Bioinformatiha 2, Florence, 18 October

-9

Page 16

DOOPS: a database of protein singles

Experimental Method: X-RAY (79,770)
Chain Type: Protein (74,456)
Only 1 chain in asym. unit: (28,803)
Oligomeric state: 1 (21,193)
Number of Entities: 1 (3,517)
Homologue Removal @ 95% identity (2,410)

2,410 proteins in the dataset

4,657,574 atoms

589,383 residues

[Histogram: axis ticks from 2 to 1922 and from 0 to 18; axis labels not recoverable]

Page 17

DOOPS: a database of protein singles

2,410 proteins in the dataset

4,657,574 atoms

589,383 residues

Swiss-Prot: 540,958 proteins in the dataset (192 Maa)

[Histogram: axis ticks from 2 to 1922, from 0 to 18 and from 0 to 2000; axis labels not recoverable]

Page 18

Dimax analysis of protein cores

calculation of % amino acid content in L0: the first quantitative analysis of a large array of protein cores!

DOOPS: 2,410 proteins; 4,657,574 atoms; 589,383 residues

core aa if Dimax < 0.2

ΣDOOPS aa(L0) = 106,088 (from 2,410 proteins)

~20 % of total molecular volume

aa               % in L0
Alanine          11.51
Cysteine         2.63
Aspartate        1.77
Glutamate        1.2
Phenylalanine    6.36
Glycine          10.81
Histidine        1.32
Isoleucine       11.74
Lysine           0.58
Leucine          16.27
Methionine       2.49
Asparagine       1.7
Proline          2.45
Glutamine        1.21
Arginine         0.83
Serine           4.85
Threonine        4.65
Valine           13.7
Tryptophan       1.43
Tyrosine         2.5
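The core composition reported above is, in essence, a count of residue types among residues with Dimax < 0.2, expressed as percentages. A minimal sketch under that reading follows; the function name `core_composition`, the `(residue name, Dimax)` input format and the toy values are hypothetical and are not DOOPS data.

```python
from collections import Counter

def core_composition(residues, threshold=0.2):
    """Percent amino acid content of the core layer L0.

    `residues` is an iterable of (residue_name, dimax) pairs; a residue is
    counted as core when its Dimax is below `threshold`.
    """
    core = [name for name, dimax in residues if dimax < threshold]
    if not core:
        return {}
    counts = Counter(core)
    return {aa: 100.0 * n / len(core) for aa, n in counts.items()}

# Toy input (hypothetical values, for illustration only).
toy = [("LEU", 0.18), ("LYS", 1.29), ("GLU", 1.24), ("VAL", 0.05), ("LEU", 0.10)]
for aa, pct in sorted(core_composition(toy).items()):
    print(f"{aa} {pct:.2f}")
```

Applied to every residue of a dataset, the same counting would yield a table shaped like the one on this slide.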

Page 19

Di analysis of protein cores

folding clues from aa core composition?

Class                Architectures   Topology   Homologous superfamily   Domains
1 (mainly α)         5               386        875                      37,038
2 (mainly β)         20              229        520                      43,881
3 (α & β)            14              594        1113                     90,029
4 (few sec. str.)    1               104        118                      2,588
Total                40              1313       2626                     173,536

Page 20

Di analysis of protein cores

folding clues from aa core composition?

DOOPS + CATH: selected Architectures with ≥ 10 PDB files

Architecture   # proteins (mono domain)
1.10           213 (84)
1.20           84 (40)
1.25           19 (17)
1.50           10 (3)
2.10           17 (13)
2.30           57 (37)
2.40           94 (73)
2.60           134 (110)
2.80           12 (12)
3.10           84 (73)
3.20           52 (44)
3.30           139 (106)
3.40           218 (203)
3.60           10 (8)
3.90           49 (49)
total          1,190 (872)

Page 21

Towards protein folding barcodes

CATH-ADAPT: CATH - atom depth assisted protein tomography

Legend: aa % relative to the average value (av): av + 2σ, av + σ, av, av - σ, av - 2σ

Examples:

Cys: PDB ID 1UZK (A01), ribbon

Leu, Phe: PDB ID 1RG8 (A00), trefoil

Val: PDB ID 2IMH (A01), four layer sandwich

Ala: PDB ID 3CKC (A02), alpha horseshoe

% L0   1.10   1.20   1.25   1.50   2.10   2.30   2.40   2.60   2.80   3.10   3.20   3.30   3.40   3.60   3.90   overall
ALA    13.28  10.32  21.46  12.74  9.26   10.05  8.43   9.32   5.5    10.69  10.08  12.58  11.88  14.95  12.01  11.51
ARG    0.6    1.28   0.24   1.39   0      0.64   1.72   0.75   0      0.55   1.11   1.75   0.3    0.47   0.95   0.83
ASN    0.67   2.62   0.73   2.77   1.85   2.04   1.77   1.36   0      2.1    2.9    0.96   1.52   2.8    2.1    1.70
ASP    1.61   2.62   0.24   2.91   1.23   1.27   2.03   1.79   0      2.1    2.9    3.02   1.77   2.34   0.95   1.77
CYS    3.35   2.99   5.37   0.83   22.84  2.04   1.46   4.42   0.92   2.83   2.1    1.49   1.86   1.4    3.05   2.63
GLN    0.6    1.5    0.24   1.11   1.23   1.15   1.81   1.69   0      0.46   1.56   2.15   0.99   1.4    1.33   1.21
GLU    1.48   1.44   0.73   1.52   0      1.15   1.19   1.04   0      0.91   2.59   2.41   1.08   0.93   0.67   1.20
GLY    8.05   8.72   9.76   13.85  16.05  9.92   16.2   10.82  9.17   8.78   11.81  11.35  12.64  13.08  9.91   10.81
HIS    1.01   1.6    2.44   1.11   0.62   0.76   0.79   0.56   0      2.65   1.96   3.02   1.91   0.47   2.48   1.32
ILE    12.68  9.95   10.73  8.59   6.79   13.61  10.68  10.78  13.76  12.8   11.77  12.53  11.53  7.01   11.34  11.74
LEU    23.88  18.34  22.44  11.77  8.02   17.18  12.97  13.98  33.94  16.54  11.9   14.33  14.22  15.42  13.63  16.27
LYS    0.67   0.91   0      1.11   0      0.38   0.49   0.56   0      0.09   0.62   1.36   0.55   0      0.67   0.58
MET    2.62   4.17   1.71   4.99   0      2.8    2.65   3.15   1.83   2.93   2.76   2.41   2.39   3.27   1.91   2.49
PHE    6.44   6.79   2.93   4.57   4.32   7.12   7.06   6.73   15.6   7.22   4.95   6.18   6.07   4.21   6.01   6.36
PRO    1.34   2.46   3.41   2.63   3.09   3.31   3      2.78   0      3.29   2.9    1.84   2.25   1.4    1.81   2.45
SER    3.49   4.55   3.66   5.96   3.09   5.34   5.56   5.13   2.75   2.83   5.35   4.43   4.23   6.07   5.34   4.85
THR    2.28   4.81   4.15   7.2    5.56   3.31   5.12   4.47   0.92   3.2    5.22   4.25   4.94   5.14   5.91   4.65
TRP    1.01   1.55   0      2.77   3.7    0.38   1.63   2.78   2.75   2.19   1.52   0.66   1.26   0.47   2.1    1.43
TYR    2.62   3.69   0.24   4.57   2.47   1.27   2.69   4.38   0.92   3.29   3.12   1.58   2.32   0      2.29   2.50
VAL    12.34  9.68   9.51   7.62   9.88   16.28  12.75  13.51  11.93  14.53  12.88  11.7   16.29  19.16  15.54  13.7

# PDB: 1.10: 213 (84); 1.20: 84 (40); 1.25: 19 (17); 1.50: 10 (3); 2.10: 17 (13); 2.30: 57 (37); 2.40: 94 (73); 2.60: 134 (110); 2.80: 12 (12); 3.10: 84 (73); 3.20: 52 (44); 3.30: 139 (106); 3.40: 218 (203); 3.60: 10 (8); 3.90: 49 (49); overall: 2,410

Di of 173,536 CATH domains: 28 h, 5’ (average comp. time 1.72 s/domain)

Calculations performed on a 6-core 990X CPU-based computer
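The av ± σ legend suggests that each architecture is flagged when the core content of an amino acid deviates from the average by one or two standard deviations. The sketch below applies that idea to the CYS row of the % L0 table. It assumes that av and σ are the unweighted mean and the population standard deviation over the 15 listed architectures, which the slide does not state; `sigma_band` and `barcode` are hypothetical helper names.

```python
from statistics import mean, pstdev

ARCHITECTURES = ["1.10", "1.20", "1.25", "1.50", "2.10", "2.30", "2.40", "2.60",
                 "2.80", "3.10", "3.20", "3.30", "3.40", "3.60", "3.90"]
# CYS % in L0 per architecture, copied from the table.
CYS_L0 = [3.35, 2.99, 5.37, 0.83, 22.84, 2.04, 1.46, 4.42,
          0.92, 2.83, 2.10, 1.49, 1.86, 1.40, 3.05]

def sigma_band(value, av, sd):
    """Whole number of σ by which `value` departs from av, clipped to [-2, 2]."""
    z = (value - av) / sd
    if z >= 2:
        return 2
    if z >= 1:
        return 1
    if z <= -2:
        return -2
    if z <= -1:
        return -1
    return 0

def barcode(values):
    """One band per architecture (assumes unweighted mean and population σ)."""
    av, sd = mean(values), pstdev(values)
    return [sigma_band(v, av, sd) for v in values]

flagged = {a: b for a, b in zip(ARCHITECTURES, barcode(CYS_L0)) if b != 0}
print(flagged)  # {'2.10': 2}
```

Under these assumptions only architecture 2.10 stands out for cysteine, lying more than 2σ above the average.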

Page 22

Towards protein folding barcodes

Putting the protein universe in order

Page 24

towards i-SADIC (implemented SADIC)

Page 25

towards i-SADIC (implemented SADIC)

H/D exchange rate profiles

Page 30

2D atom depth or 3D atom depth

H/D exchange rate profiles

data from Pedersen TG, Thomsen NK, Andersen KV, Madsen JC, Poulsen FM. Determination of the rate constants k1 and k2 of the Linderstrom-Lang model for protein amide hydrogen exchange. A study of the individual amides in hen egg-white lysozyme. J Mol Biol. 1993 230(2):651-660.

dnwi: atom distance to the nearest water molecule

Di,9: atom depth index with a probe of radius 9 Å

Page 31

iSADIC atom depth, 3D atom depth

H/D exchange rate profiles

Di,9: atom depth index with a probe of radius 9 Å

$iD_{i,9} = \frac{a D_{i,9} + b\,\mathrm{ASA}_i}{c D_{i,9} + d\,\mathrm{Dnw}_i}$

Page 33

protein-protein interface analysis

biological vs crystallographic interfaces

Page 34

crystallographic dimers

biological dimers

Page 35
Page 36
Page 37

vs

ARG atoms: N, CA, C, O, CB, CG, CD, NE, CZ, NH1, NH2, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE, HH11, HH12, HH21, HH22

LYS atoms: N, CA, C, O, CB, CG, CD, CE, NZ, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE2, HE3, HZ1, HZ2, HZ3