IT og Sundhed 2010/11 Sequence based predictors. Secondary structure and surface accessibility Bent Petersen 13 January 2011


IT og Sundhed 2010/11

Sequence-based predictors: secondary structure and surface accessibility

Bent Petersen, 13 January 2011


NetSurfP

Real Value Solvent Accessibility predictions with amino acid associated reliability


Objective

• Predict residues as being either buried or exposed (25 % threshold)

- Two states/classes, Buried/Exposed

• Predict the Relative Solvent Accessibility, RSA

- “Real” Value


What is ASA?

• Accessible Surface Area, Å²

• Surface area accessible to a rolling water molecule


RSA

RSA = Relative Solvent Accessibility
ACC = Accessible area in protein structure
ASA = Accessible Surface Area in Gly-X-Gly or Ala-X-Ala

Classification networks:
- Buried = RSA < 25 %, Exposed = RSA > 25 %

“Real” value networks:
- Values 0 - 1, RSA > 1 set to 1
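The definitions above amount to a ratio followed by a cut-off. A minimal sketch, assuming RSA is computed as ACC divided by the reference ASA of the same residue type in the extended tripeptide; the function names and the example numbers are hypothetical, and the reference ASA has to come from a tripeptide table that is not given on the slides:

```python
def relative_solvent_accessibility(acc, max_asa):
    """Assumed definition: RSA = ACC / ASA, capped at 1.0 (RSA > 1 set to 1)."""
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return min(acc / max_asa, 1.0)


def classify(rsa, threshold=0.25):
    """Two-state label using the 25 % threshold."""
    # The slide does not say which class RSA == 25 % belongs to;
    # counting it as exposed is an assumption made here.
    return "exposed" if rsa >= threshold else "buried"


# Hypothetical values: ACC as reported by DSSP, max_asa from a reference table.
rsa = relative_solvent_accessibility(acc=30.0, max_asa=200.0)
print(rsa, classify(rsa))
```

The cap matters for the real-value networks only; for the two-state networks any value above the threshold gives the same label.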


Why predict RSA?

• Residues exposed on the surface can be:
- Involved in PTMs
- Potential epitopes
- Involved in protein-protein interactions
- Relevant for prediction of disease SNPs


How to start?

• What do we want?

- We want to be able to predict the exposure of an amino acid (AA)

• What do we need?

- A training dataset and an independent evaluation dataset

• What information do we need?

- True structural information the neural network can train on

• Where do we get that?

- PDB, DSSP


Protein Data Bank, PDB

Berman, H.M., et al., The Protein Data Bank. Nucl. Acids Res., 2000. 28(1): p. 235-242.


Define Secondary Structure of Proteins, DSSP

Kabsch, W. and C. Sander, Dictionary of Protein Secondary Structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 1983. 22(12): p. 2577-2637.

==== Secondary Structure Definition by the program DSSP, updated CMBI version by ElmK / April 1,2000 ==== DATE=23-MAR-2009 .
REFERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637 .
HEADER TOXIN 12-AUG-98 3BTA .
COMPND 2 MOLECULE: PROTEIN (BOTULINUM NEUROTOXIN TYPE A); .
SOURCE 2 ORGANISM_SCIENTIFIC: CLOSTRIDIUM BOTULINUM; .
AUTHOR R.C.STEVENS,D.B.LACY .
 1277 2 2 1 1 TOTAL NUMBER OF RESIDUES, NUMBER OF CHAINS, NUMBER OF SS-BRIDGES(TOTAL,INTRACHAIN,INTERCHAIN) .
 55121.0 ACCESSIBLE SURFACE OF PROTEIN (ANGSTROM**2) .
 815 63.8 TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(J) , SAME NUMBER PER 100 RESIDUES .
 24 1.9 TOTAL NUMBER OF HYDROGEN BONDS IN PARALLEL BRIDGES, SAME NUMBER PER 100 RESIDUES .
 198 15.5 TOTAL NUMBER OF HYDROGEN BONDS IN ANTIPARALLEL BRIDGES, SAME NUMBER PER 100 RESIDUES .
 1 0.1 TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I-5), SAME NUMBER PER 100 RESIDUES .
 10 0.8 TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I-4), SAME NUMBER PER 100 RESIDUES .
 125 9.8 TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+2), SAME NUMBER PER 100 RESIDUES .
 134 10.5 TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+3), SAME NUMBER PER 100 RESIDUES .
 276 21.6 TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+4), SAME NUMBER PER 100 RESIDUES .
 9 0.7 TOTAL NUMBER OF HYDROGEN BONDS OF TYPE O(I)-->H-N(I+5), SAME NUMBER PER 100 RESIDUES .
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 *** HISTOGRAMS OF *** .
 0 0 0 0 0 3 3 1 2 1 0 3 1 1 0 1 0 0 1 0 1 0 1 1 0 0 0 0 0 2 RESIDUES PER ALPHA HELIX .
 2 0 1 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PARALLEL BRIDGES PER LADDER .
 15 10 7 5 8 2 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ANTIPARALLEL BRIDGES PER LADDER .
 3 3 0 0 1 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 LADDERS PER SHEET .

 # RESIDUE AA STRUCTURE BP1 BP2 ACC N-H-->O O-->H-N N-H-->O O-->H-N TCO KAPPA ALPHA PHI PSI X-CA Y-CA Z-CA
 1 1 A P 0 0 5 0, 0.0 2,-3.8 0, 0.0 3,-0.2 0.000 360.0 360.0 360.0 132.0 74.7 55.7 73.4
 2 2 A F - 0 0 115 92,-0.4 93,-0.1 1,-0.1 36,-0.1 -0.206 360.0-142.1 55.7 -62.1 74.7 59.2 74.7
 3 3 A V - 0 0 11 -2,-3.8 35,-0.2 91,-0.1 -1,-0.1 0.867 4.9-143.8 70.2 103.3 78.3 59.8 73.7
 4 4 A N S S+ 0 0 127 33,-0.3 2,-0.5 -3,-0.2 33,-0.1 0.914 73.7 44.0 -67.5 -53.8 80.1 61.9 76.4
 5 5 A K S S- 0 0 94 32,-0.1 2,-0.5 1,-0.0 -1,-0.1 -0.857 79.6-124.0-105.1 133.1 82.5 64.2 74.5
 6 6 A Q - 0 0 192 -2,-0.5 2,-0.1 1,-0.1 82,-0.1 -0.568 35.9-150.4 -71.8 118.5 81.6 66.2 71.4
 7 7 A F - 0 0 14 -2,-0.5 2,-0.3 80,-0.1 3,-0.1 -0.388 16.9-164.3 -91.4 166.8 84.2 65.3 68.7
 8 8 A N > - 0 0 71 -2,-0.1 3,-0.9 1,-0.1 77,-0.0 -0.977 28.9-124.4-143.4 141.5 85.7 67.1 65.7
 9 9 A Y T 3 S+ 0 0 17 -2,-0.3 -1,-0.1 1,-0.2 72,-0.1 0.908 109.3 50.7 -57.8 -43.3 87.5 65.3 62.9
 10 10 A K T 3 S+ 0 0 141 -3,-0.1 -1,-0.2 70,-0.1 3,-0.1 0.650 77.9 122.5 -70.3 -17.2 90.7 67.4 63.3
 11 11 A D S < S- 0 0 45 -3,-0.9 3,-0.1 1,-0.1 2,-0.1 -0.203 77.6 -91.4 -48.0 134.3 91.0 66.8 67.1
 12 12 A P - 0 0 99 0, 0.0 -1,-0.1 0, 0.0 -2,-0.1 -0.246 38.0-108.3 -57.6 128.3 94.4 65.3 67.8
 13 13 A V + 0 0 41 -3,-0.1 6,-0.2 1,-0.1 4,-0.1 -0.238 38.6 179.2 -51.8 138.5 94.8 61.5 67.8
 14 14 A N - 0 0 67 4,-3.7 2,-1.4 2,-0.2 5,-0.2 -0.085 45.1-107.4-144.3 45.7 95.4 60.3 71.4
 15 15 A G S S+ 0 0 0 122,-0.4 2,-0.3 3,-0.2 4,-0.2 0.248 100.3 58.5 54.3 -18.1 95.7 56.6 71.7
 16 16 A V S S- 0 0 72 -2,-1.4 -2,-0.2 2,-0.5 20,-0.1 -0.996 116.3 -7.4-142.5 145.9 92.2 56.3 73.3
 17 17 A D S S+ 0 0 22 -2,-0.3 19,-2.5 18,-0.1 2,-0.2 0.389 136.6 45.3 53.3 -7.2 88.7 57.3 72.3
 18 18 A I E S+A 35 0A 6 17,-0.3 -4,-3.7 -11,-0.0 -2,-0.5 -0.649 85.9 128.7-161.1 96.3 90.4 59.0 69.2


Define Secondary Structure of Proteins, DSSP

• DSSP defines 8 types of secondary structure

- G = 3-turn helix (3-10 helix)
- H = 4-turn helix (α-helix)
- I = 5-turn helix (π-helix)
- T = Hydrogen bonded turn (3, 4 or 5 turn)
- E = Extended strand
- B = Residue in isolated β-bridge
- S = Bend
- Rest is C = coil


Required datasets

• Training/test

- Used for optimization of settings using 10-fold cross-validation

• Evaluation

- Used for final evaluation; less than 25 % sequence identity to the training/test dataset.


10-fold Cross Validation

- Break the dataset into 10 subsets, each 1/10 of the data

- Train on 9 subsets and test on the remaining 1

- Repeat 10 times and take the mean accuracy
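The three steps above can be sketched in a few lines. This is a minimal illustration, not the actual training setup: `k_fold_splits`, `cross_validate` and the `evaluate` callable are hypothetical names, the split here is a simple shuffled round-robin, and `evaluate` stands in for "train a network on `train`, return its accuracy on `test`".

```python
import random


def k_fold_splits(items, k=10, seed=0):
    """Yield (train, test) pairs; each item lands in exactly one test fold."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]  # k folds of (nearly) equal size
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


def cross_validate(items, evaluate, k=10):
    """Mean of evaluate(train, test) over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_splits(items, k)]
    return sum(scores) / len(scores)


# Dummy evaluate that only reports the test-fold fraction, to show the call shape.
print(cross_validate(range(100), lambda train, test: len(test) / 100))
```

For protein data the items would normally be whole sequences rather than single residues, so that windows from one protein never end up on both sides of a split.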


Learning / Training dataset

• Training set: Cull_1764

- Max. Seq. ID: 25 %
- Resolution: ≤ 2.0 Å
- R-Factor: ≤ 0.2
- Seq. Length: 30-3000 AA
- Including X-ray entries only


PISCES


Learning / Training dataset

• Homology reduced towards evaluation set CB513 (302 sequences removed)

• Final Training set:
- 1764 sequences
- 417,978 amino acids
‣ Buried: 55.80 % (233,221 amino acids)
‣ Exposed: 44.20 % (184,757 amino acids)


Learning / Training dataset

---Sequence/residue statistics---
Number of seq.: 1764
Longest seq.: 1T3T.A (1283)
Shortest seq.: 1YTV.M (6)
Number of amino acids: 417978

---Assignment category statistics---
B 184757 (44.20%)
A 233221 (55.80%)

---Amino acid statistics---
H 10025 (2.40%)
G 31743 (7.59%)
Y 14927 (3.57%)
V 30171 (7.22%)
E 27774 (6.64%)
S 24430 (5.84%)
P 19589 (4.69%)
A 35658 (8.53%)
R 21435 (5.13%)
Q 15535 (3.72%)
C 5202 (1.24%)
K 23054 (5.52%)
L 38489 (9.21%)
N 17756 (4.25%)
T 22998 (5.50%)
F 17181 (4.11%)
D 24743 (5.92%)
I 23550 (5.63%)
W 6365 (1.52%)
M 7353 (1.76%)


Evaluation dataset

• Final Evaluation dataset: CB513
- 513 non-homologous sequences
- Seq. Length: 20-754 AA
- 84,119 amino acids
- Buried: 55.81 % (46,948 amino acids)
- Exposed: 44.19 % (37,171 amino acids)


Evaluation dataset

---Sequence/residue statistics---
Number of seq.: 513
Longest seq.: 6acn.all (754)
Shortest seq.: 1atpi-1 (20)
Number of amino acids: 84119

---Assignment category statistics---
B 37171 (44.19%)
A 46948 (55.81%)

---Amino acid statistics---
R 3812 (4.53%)
T 5015 (5.96%)
D 4973 (5.91%)
C 1381 (1.64%)
Y 3065 (3.64%)
G 6657 (7.91%)
N 3976 (4.73%)
V 5795 (6.89%)
I 4642 (5.52%)
A 7267 (8.64%)
S 5222 (6.21%)
K 4976 (5.92%)
P 3903 (4.64%)
E 5050 (6.00%)
L 7134 (8.48%)
Q 3108 (3.69%)
M 1710 (2.03%)
H 1865 (2.22%)
W 1236 (1.47%)
F 3268 (3.88%)
X 19 (0.02%)
B 31 (0.04%)
Z 14 (0.02%)


Amino acid distribution (bar chart; values in % of all amino acids)

AA   Cull/Learning   CB513
A    8.53            8.64
C    1.24            1.64
D    5.92            5.91
E    6.64            6.00
F    4.11            3.88
G    7.59            7.91
H    2.40            2.22
I    5.63            5.52
K    5.52            5.92
L    9.21            8.48
M    1.76            2.03
N    4.25            4.73
P    4.69            4.64
Q    3.72            3.69
R    5.13            4.53
S    5.84            6.21
T    5.50            5.96
V    7.22            6.89
W    1.52            1.47
Y    3.57            3.64


Neural Network - Input

• Position Specific Scoring Matrices, PSSM

               A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
B H 2BEM.A 1  -4 -3 -2 -4 -6 -2 -3 -5 11 -6 -5 -3 -4 -4 -5 -3 -4 -5 -1 -6
A G 2BEM.A 2  -2 -5 -3 -4 -5 -4 -5  7 -5 -7 -6 -4 -5 -6 -5 -3 -4 -5 -6 -6
A Y 2BEM.A 3  -1  1 -4 -3 -5 -4 -4 -4  1 -4 -1 -4 -1  2 -5  0 -1  4  7 -2
A V 2BEM.A 4  -1 -5 -5 -6 -4 -4 -5 -5 -5  4  1 -5  6 -3 -2 -2  0 -5 -4  4
B E 2BEM.A 5  -2 -4 -3  0 -4 -1  3 -2 -4  0 -3 -2  1 -2 -3  3  3 -5 -4  0

4 iterations of PSI-BLAST against nr70

• Secondary Structure predictions

B H 2BEM.A 1  0.003 0.003 0.966
A G 2BEM.A 2  0.018 0.086 0.868
A Y 2BEM.A 3  0.020 0.199 0.752
A V 2BEM.A 4  0.021 0.271 0.679
B E 2BEM.A 5  0.020 0.199 0.752

(secondary structure predictor by Pernille Andersen)


Secondary structure predictor

• Developed by Pernille Andersen, incorporated in NetSurfP

• Trained on 2,085 sequences using DSSP

- H = H, E = E, C = ., G, I, B, S and T

- H ~ 30 %, E ~ 20 %, C ~ 50 %

• Performance of ~80 %

• Maximum theoretical limit is ~88 %
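The reduction from the eight DSSP classes to three states (H = H, E = E, everything else = C) is a plain character mapping. A minimal sketch; the function name is hypothetical, and treating both "." and a blank as coil is an assumption about how the unassigned DSSP state is written in the input:

```python
def reduce_to_three_states(dssp_string):
    """Map a DSSP state string to H / E / C.

    H stays H and E stays E; G, I, B, S, T and the unassigned state
    ("." or blank) are all collapsed into C.
    """
    keep = {"H": "H", "E": "E"}
    return "".join(keep.get(state, "C") for state in dssp_string)


print(reduce_to_three_states("HHHGGTEEB.S"))  # HHHCCCEECCC
```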


Neural Network - Settings

• Window Size: 11-19

• Hidden units: 10, 20, 25, 30, 40, 50, 75, 150, (200)

• Learning rate: 0.01 / (0.005)

• Epochs (training rounds): 200

• 10-fold cross-validation

- 9/10 used for training, 1/10 for testing
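The settings above define a grid of architectures, each trained once per cross-validation fold. A small sketch of how such a grid could be enumerated, assuming only odd window sizes between 11 and 19 are used and leaving out the bracketed 200 hidden units; both are assumptions, not something the slide states:

```python
from itertools import product

window_sizes = range(11, 20, 2)                    # 11, 13, 15, 17, 19 (assumed odd only)
hidden_units = [10, 20, 25, 30, 40, 50, 75, 150]   # bracketed 200 left out (assumption)
folds = 10                                         # 10-fold cross-validation

architectures = list(product(window_sizes, hidden_units))
n_networks = len(architectures) * folds

print(len(architectures), n_networks)  # 40 400
```

Under these assumptions the grid gives 40 architectures and 400 trained networks.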


Neural network window

Sliding window of 7

170 2BEM.A mol:aa CHITIN-BINDING PROTEIN

HGYVESPASRAYQCKLQLNTQCGSVQYEPQSVEGLKGFPQAGPADGHIASADKSTFFELDQQTPTRWNKLNLKTGPNSFTWKLTARHSTTSWRYFITKPNWDASQPLTRASFDLTPFCQFNDGGAIPAAQVTHQCNIPADRSGSHVILAVWDIADTANAFYQAIDVNLSK
BAAABBAAAAAAAABBBBABBABBAABBABAABABBBAABBBABBABAAAABBBBABAAABABBBAABABBABAABABAAABABBBBAABAAAAAAABBBABABBBAAABAABBBAAAAAABBBBBABBBABABABAABBABBBAAAAAAAAABBBBBAAAAAAAABABB

Prediction on middle residue: Serine, buried


Prediction on middle residue: Proline, exposed


Prediction on middle residue: Alanine, exposed
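The sliding window used in these slides can be sketched as a generator that pairs every residue with the window centred on it. This is an illustration only: the function name is hypothetical, and padding the sequence ends with "X" is an assumption, since the slides do not show how the termini are handled.

```python
def sliding_windows(sequence, window=7, pad="X"):
    """Yield (centre_residue, window_string) for every position in sequence."""
    if window % 2 == 0:
        raise ValueError("window size must be odd so that there is a middle residue")
    half = window // 2
    padded = pad * half + sequence + pad * half  # pad both termini (assumption)
    for i, residue in enumerate(sequence):
        yield residue, padded[i:i + window]


# First nine residues of 2BEM.A from the slide.
for centre, win in sliding_windows("HGYVESPAS", window=7):
    print(centre, win)
```

Each window would then be encoded (for example with the PSSM rows of its residues) and fed to the network, which predicts the class of the middle residue only.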


Method


Wisdom of the crowd

Selecting best performing network architectures based on test performance

Better than choosing any single network

10-fold % correct predictions, average of sets A-J with secondary structure (chart values):

Average of top 1     79.55
Average of top 2     79.66
Average of top 3     79.69
Average of top 4     79.72
Average of top 5     79.75
Average of top 6     79.75
Average of top 7     79.75
Average of top 8     79.74
Average of top 9     79.75
Average of top 10    79.76
Average of top 11    79.77
Average of top 12    79.77
Average of top 13    79.76
Average of top 14    79.75
Average of top 15    79.76
Average of top 16    79.75
Average of top 17    79.75
Average of top 18    79.76
Average of top 19    79.77
Average of top 20    79.77
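The "average of top N" idea can be sketched as follows. The slide does not say how the network outputs are combined, so this assumes a plain arithmetic mean of the per-residue outputs of the N networks with the best test performance; the function name and the example numbers are hypothetical.

```python
def average_top_n(network_outputs, test_scores, n):
    """Average the per-residue outputs of the n networks with the best test score.

    network_outputs: one list of per-residue outputs per network (equal lengths).
    test_scores: one test-set score per network, higher is better.
    """
    if not 1 <= n <= len(test_scores):
        raise ValueError("n must be between 1 and the number of networks")
    ranked = sorted(range(len(test_scores)),
                    key=lambda i: test_scores[i], reverse=True)[:n]
    length = len(network_outputs[0])
    return [sum(network_outputs[i][pos] for i in ranked) / n
            for pos in range(length)]


# Three hypothetical networks, two residues each.
outputs = [[0.9, 0.2], [0.7, 0.4], [0.1, 0.9]]
scores = [0.80, 0.79, 0.60]
print(average_top_n(outputs, scores, n=2))
```

Ranking on test performance and reporting on the same folds is slightly optimistic, which is one reason for keeping a separate evaluation set.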


Results - Classification networks

• Training:

                            % Correct   MCC     # Networks
Best Single Architecture    79.5        0.587   10
All Architectures           79.7        0.592   400
Top 20 Architectures        79.8        0.593   200


• Evaluation:

                            % Correct   MCC
Dor and Zhou                78.8        Not published
NetSurfP CB500/CB513        79.0        0.577
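Both measures in these tables can be computed from a two-class confusion matrix. MCC is not spelled out on the slides; the sketch below assumes the standard Matthews correlation coefficient with "exposed" as the positive class, and the function names and counts are made up for illustration.

```python
from math import sqrt


def percent_correct(tp, tn, fp, fn):
    """Share of residues assigned to the right class, in percent."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when a marginal count is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion matrix, not taken from the results above.
tp, tn, fp, fn = 350, 450, 100, 100
print(percent_correct(tp, tn, fp, fn), mcc(tp, tn, fp, fn))
```

Unlike % correct, MCC stays near zero for a predictor that always outputs the majority class, which is useful here because the buried and exposed classes are not perfectly balanced.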


Results • Evaluation


NetSurfP

/usr/cbs/bio/src/NetSurfP/NetSurfP -h


NetSurfP


NetDiseaseSNP

• Disease-SNP prediction (Morten Bo Johansen)

• Without NetSurfP:
- Cross-validation: MCC = 0.569
- Cross-evaluation: MCC = 0.560

• With NetSurfP:
- Cross-validation: MCC = 0.583
- Cross-evaluation: MCC = 0.572


Paper is out. What then?


Statistics

• Submissions to the webserver from CBS website


As of 12 Jan 2011

136,003 sequences submitted from 13,494 unique IPs


First citation: 24 October 2009 :-)
