Proteomics Informatics – Protein identification II: search engines and protein sequence databases (Week 5)


Page 1: Proteomics Informatics –

Proteomics Informatics – Protein identification II: search engines and protein sequence databases (Week 5)

Page 2: Proteomics Informatics –

General Criteria for a Good Protein Identification Algorithm

• The response to random input data should be random.

• The number of correct identifications should be maximal and the number of incorrect identifications minimal for any data set.

• The separation between the scores for correct identifications and the distribution of scores for randomly matching proteins should be maximal for any data set.

• The statistical significance of the results should be calculated.

• The searches should be fast.

Page 3: Proteomics Informatics –

Search Parameters

Parent tolerance: +/- daltons/ppm

Fragment tolerance: +/- daltons/ppm

Complete mods: Cys alkylation

Potential mods (artifacts): Met/Trp oxidation, Gln/Asn deamidation

Potential mods (PTMs): Phosphoryl, sulfonyl, acetyl, methyl, glycosyl, GPI

Cleavage: Trypsin ([KR]|{P})

Scoring method: Scores or statistics

Sequences: FASTA files
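
The cleavage rule "[KR]|{P}" reads as: cut after K or R unless the next residue is P. A minimal sketch of that rule in Python is given here; the function name and the example sequence are hypothetical and chosen only for illustration, and missed cleavages are not considered.

```python
def tryptic_peptides(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal peptide when the sequence does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical sequence: the R at position 7 is followed by P, so it is not cut.
print(tryptic_peptides("MKWVTFRPSLLK"))
```

A search engine would apply such a rule to every entry of the FASTA file before comparing masses.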

Page 4: Proteomics Informatics –

Identification – Peptide Mass Fingerprinting

[Workflow diagram. Labels: Digestion; MS; Sequence DB; Pick Protein; All Peptide Masses; Compare, Score, Test Significance; Repeat for each protein; Identified Proteins.]
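
The loop in this workflow (pick a protein, compare, score, repeat) can be sketched as follows. This is a toy illustration only: the score is simply the number of measured masses that match within a tolerance, which is an assumption and not the scoring of any particular search engine, and the significance test is left out. The database entries and masses are made up.

```python
def count_matches(measured, theoretical, tol_da):
    """Count measured masses that lie within tol_da of any theoretical mass."""
    return sum(any(abs(m - t) <= tol_da for t in theoretical) for m in measured)


def rank_proteins(measured, database, tol_da=0.5):
    """Score every protein in the database and sort by descending score."""
    scores = {name: count_matches(measured, masses, tol_da)
              for name, masses in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical theoretical peptide masses (Da) for two database entries.
toy_db = {
    "protA": [500.3, 812.4, 1045.6, 1500.8],
    "protB": [501.9, 900.5, 1300.7],
}
measured = [500.2, 1045.7, 1500.9, 2000.0]
print(rank_proteins(measured, toy_db))
```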

Page 5: Proteomics Informatics –

Response to Random Data

[Figure. Axis label: Normalized Frequency.]

Page 6: Proteomics Informatics –

ProFound – Search Parameters

http://prowl.rockefeller.edu/

Page 7: Proteomics Informatics –

ProFound – Protein Identification by Peptide Mapping

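\[
P(k \mid D I) \propto P(k \mid I)\,\frac{(N-r)!}{N!}\,\prod_{i=1}^{r}\left\{\sqrt{\frac{2}{\pi}}\,\frac{m_{\max}-m_{\min}}{\sigma_i}\,\sum_{j=1}^{g_i}\exp\left[-\frac{(m_i-m_{ij0})^2}{2\sigma_i^2}\right]\right\} F_{\mathrm{pattern}}
\]

W. Zhang & B.T. Chait, Analytical Chemistry 72 (2000) 2482-2489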

Page 8: Proteomics Informatics –

ProFound Results

Page 9: Proteomics Informatics –

Peptide Mapping – Mass Accuracy

[Figure, two panels. ProFound: -log(e), 0 to 7, versus Mass Tolerance (Da), 0 to 2. Mascot: Score, 0 to 140, versus Mass Tolerance (Da), 0 to 2.]

Page 10: Proteomics Informatics –

Peptide Mapping - Database Size

Expectation values (peptide mapping example):

S. cerevisiae   4.8e-7
Fungi           8.4e-6
All Taxa        2.9e-4

Page 11: Proteomics Informatics –

Missed Cleavage Sites

Expectation values (peptide mapping example):

u = 1   4.8e-7
u = 2   1.1e-5
u = 4   6.8e-4
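
One way to see why allowing more missed cleavage sites weakens the result is that the number of candidate peptides per protein grows. The sketch below assumes that u denotes the maximum number of missed cleavage sites allowed (the slide does not define it) and uses a hypothetical list of fully cleaved fragments.

```python
def peptides_with_missed_cleavages(fragments, max_missed):
    """Join up to max_missed + 1 consecutive fully cleaved fragments."""
    peptides = []
    for start in range(len(fragments)):
        for missed in range(max_missed + 1):
            end = start + missed + 1
            if end > len(fragments):
                break
            peptides.append("".join(fragments[start:end]))
    return peptides


# Hypothetical fully cleaved fragments of one protein, in sequence order.
fragments = ["MK", "WVTFR", "GDAK", "LLER", "SSP"]
for u in (0, 1, 2, 4):
    print(u, len(peptides_with_missed_cleavages(fragments, u)))
```

More candidate peptides mean more chances for a random mass match, which is consistent with the larger expectation values listed for larger u.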

Page 12: Proteomics Informatics –

Peptide Mapping - Partial Modifications

[Figure panels: No Modifications; Phosphorylation (S, T, or Y).]

Protein     Searched without possible modifications   Searched with phosphorylation of S/T/Y
DARPP-32    0.00006                                   0.01
CFTR        0.00002                                   0.005

Even if the protein is modified, it is usually better to search a protein sequence database without specifying possible modifications when using peptide mapping data.

Page 13: Proteomics Informatics –

Peptide Mapping - Ranking by Direct Calculation of the Significance

Page 14: Proteomics Informatics –

Tandem MS – Database Search

[Workflow diagram. Labels: Lysis; Fractionation; Digestion; LC-MS; MS/MS; Sequence DB; Pick Protein; Pick Peptide; All Fragment Masses; Compare, Score, Test Significance; Repeat for all peptides; Repeat for all proteins.]

Page 15: Proteomics Informatics –

Algorithms

Page 16: Proteomics Informatics –

Comparing and Optimizing Algorithms

[Figure. For Algorithm 1 and for Algorithm 2: the score distributions of True and False identifications (x-axis: Score), each shown with a curve of Sensitivity versus 1-Specificity.]
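
A curve of sensitivity versus 1-specificity can be computed from the scores of known true and false identifications by sweeping a score threshold. The following is a minimal sketch under that assumption; the function name and the example scores are hypothetical.

```python
def roc_points(true_scores, false_scores):
    """Return (1 - specificity, sensitivity) pairs for each score threshold."""
    thresholds = sorted(set(true_scores) | set(false_scores), reverse=True)
    points = []
    for t in thresholds:
        # Sensitivity: fraction of true identifications accepted at threshold t.
        sensitivity = sum(s >= t for s in true_scores) / len(true_scores)
        # 1 - specificity: fraction of false identifications accepted at t.
        one_minus_specificity = sum(s >= t for s in false_scores) / len(false_scores)
        points.append((one_minus_specificity, sensitivity))
    return points


# Hypothetical scores for identifications known to be true or false.
print(roc_points([5, 4, 3], [3, 2, 1]))
```

An algorithm whose curve rises faster (higher sensitivity at the same 1-specificity) separates true from false identifications better.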

Page 17: Proteomics Informatics –


MS/MS - Parent Mass Error and Enzyme Specificity

)!!( ybIII nnxx

Expectation values (MS/MS example):

Δm=2, Trypsin           2.5e-5
Δm=100, Trypsin         2.5e-5
Δm=2, non-specific      7.9e-5
Δm=100, non-specific    1.6e-4

Page 18: Proteomics Informatics –

Sequest

Cross-correlation

Page 19: Proteomics Informatics –

X! Tandem - Search Parameters

http://www.thegpm.org/

Page 20: Proteomics Informatics –

X! Tandem - Search Parameters

Page 21: Proteomics Informatics –

X! Tandem - Search Parameters

Page 22: Proteomics Informatics –

Conventional, single stage searching

[Diagram. Labels: spectra; sequences; Generic search engine (Test all cleavages, modifications, & mutations for all sequences); sequences.]

Page 23: Proteomics Informatics –

Some hard problems in MS/MS analysis in proteomics

Determining potential modifications
- e.g., oxidation, phosphorylation, deamidation
- calculation order 2^n
- NP complete

Allowing for unanticipated peptide cleavages
- e.g., chymotryptic contamination in trypsin
- calculation order ~ 200 × tryptic cleavage
- “unfortunate” coefficient

Detecting point mutations
- e.g., sequence homology
- calculation order 18^N
- NP complete
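
The 2^n growth for potential modifications can be made concrete by enumerating every combination of modified and unmodified sites in one peptide. This is a small sketch, assuming a single potential modification per site (here on S, T, or Y); the function name and the peptide are hypothetical.

```python
from itertools import product


def modification_states(peptide, modifiable="STY"):
    """List every subset of modifiable positions (0-based) that could be modified."""
    sites = [i for i, aa in enumerate(peptide) if aa in modifiable]
    states = []
    for flags in product((False, True), repeat=len(sites)):
        states.append(tuple(site for site, on in zip(sites, flags) if on))
    return states


# Hypothetical peptide with three modifiable residues (S, T, Y).
states = modification_states("ASTPYK")
print(len(states))
```

Each additional modifiable site doubles the number of candidate peptides that must be scored.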

Page 24: Proteomics Informatics –

Multi-stage searching

[Diagram. Labels: spectra; sequences; X! Tandem stages: Tryptic cleavage; Modifications #1; Modifications #2; Point mutation; sequences.]

Page 25: Proteomics Informatics –

Search Results

Page 26: Proteomics Informatics –

Search Results

Page 27: Proteomics Informatics –

Sequence Annotations

Page 28: Proteomics Informatics –

Search Results

Page 29: Proteomics Informatics –

Search Results

Page 30: Proteomics Informatics –

Mascot

http://www.matrixscience.com/cgi/search_form.pl?FORMVER=2&SEARCH=MIS

Page 31: Proteomics Informatics –

Identification – Spectrum Library Search

[Workflow diagram. Labels: Lysis; Fractionation; Digestion; LC-MS/MS; MS/MS; Spectrum Library; Pick Spectrum; Compare, Score, Test Significance; Repeat for all spectra; Identified Proteins.]

Page 32: Proteomics Informatics –

Steps in making an Annotated Spectrum Library (ASL):

1. Find the best 10 spectra for a particular sequence, with the same PTMs and charge.

2. Add the spectra together and normalize the intensity values.

3. Assign a “quality” value: the median expectation value of the 10 spectra used.

4. Record the 20 most intense peaks in the averaged spectrum, its parent ion z, m/z, sequence, protein accessions & quality.
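
Steps 2 to 4 above can be sketched as follows. Several details are assumptions made only for illustration: peaks are binned to unit m/z before adding, intensities are normalized to the most intense bin, and the caller has already selected the best spectra (step 1). The function and field names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def build_library_entry(spectra, expectation_values, n_peaks=20, bin_width=1.0):
    """Sum spectra, normalize, keep the most intense peaks, assign a quality."""
    summed = defaultdict(float)
    for spectrum in spectra:
        for mz, intensity in spectrum:
            summed[round(mz / bin_width)] += intensity  # step 2: add together
    top = max(summed.values())
    normalized = {b: i / top for b, i in summed.items()}  # step 2: normalize
    strongest = sorted(normalized.items(), key=lambda item: item[1],
                       reverse=True)[:n_peaks]  # step 4: most intense peaks
    return {
        "peaks": sorted((b * bin_width, i) for b, i in strongest),
        "quality": median(expectation_values),  # step 3: median expectation value
    }


# Two hypothetical spectra as (m/z, intensity) pairs, with their expectation values.
spectra = [[(100.1, 10.0), (200.2, 5.0)], [(100.0, 30.0), (300.3, 4.0)]]
entry = build_library_entry(spectra, [1e-3, 1e-5], n_peaks=2)
print(entry)
```

A full entry would also carry the parent ion z, m/z, sequence, and protein accessions listed in step 4.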

Page 33: Proteomics Informatics –

Spectrum Library Characteristics – Peptide Length

[Figure: fraction of library (%), 0 to 10, versus peptide length, 0 to 50.]

Page 34: Proteomics Informatics –

Spectrum Library Characteristics – Protein Coverage

[Figure: % coverage, 0 to 50, versus protein Mr (kDa), 10 to 190. Series: residues; peptides.]

Page 35: Proteomics Informatics –

Identification – Spectrum Library Search

[Figure: Library spectrum (5:25) compared with Test spectrum (5:25).]

Results: 4 peaks selected, 1 peak missed

Page 36: Proteomics Informatics –

Identification – Spectrum Library Search

How likely is this?

Apply a hypergeometric probability model:
- 25 possible m/z values;
- 5 peaks in the library spectrum; and
- 4 selected by the test spectrum.

Matches   Probability
1         0.45
2         0.15
3         0.016
4         0.00039
5         0.0000037
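
The hypergeometric model can be evaluated directly. The sketch below assumes that the test spectrum makes 4 draws from 25 possible m/z values, 5 of which are library peaks; under that reading the computed values agree closely with the first four rows of the table. The function name is hypothetical.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """Probability of exactly k successes in `draws` draws without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


# 25 possible m/z values, 5 library peaks, 4 peaks selected by the test spectrum.
for k in range(1, 5):
    print(k, f"{hypergeom_pmf(k, 25, 5, 4):.3g}")
```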

Page 37: Proteomics Informatics –

Identification – Spectrum Library Search

If you have 1000 possible m/z values and 20 peaks in test and library spectrum?

[Figure: p, log scale from 1.0E-14 to 1.0E+00, versus matches, 1 to 10.]

1 matched: p = 0.6
5 matched: p = 0.0002
10 matched: p = 0.0000000000001

Page 38: Proteomics Informatics –

Identification – Spectrum Library Search

[Diagram. Labels: Experimental Mass Spectrum; Library of Assigned Mass Spectra; M/Z; Best search result.]

Page 39: Proteomics Informatics –

X! Hunter

Page 40: Proteomics Informatics –

X! Hunter algorithm:

1. Use dot product to find a library spectrum that best matches a test spectrum.

2. Calculate p-value with hypergeometric distribution.

3. Use p-value to calculate expectation value, given the identification parameters.

4. If expectation value is less than the median expectation value of the library spectrum, report the median value.
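
The four steps can be put together in a short sketch. This is not the X! Hunter implementation: the spectra are plain dictionaries of m/z bin to intensity, the p-value is taken as the hypergeometric probability of at least the observed number of shared peaks, and the expectation value is assumed to be the p-value times the number of candidate library spectra. All names and numbers are hypothetical.

```python
from math import comb


def dot_product(a, b):
    """Dot product of two spectra given as {m/z bin: intensity} dicts."""
    return sum(intensity * b.get(mz_bin, 0.0) for mz_bin, intensity in a.items())


def p_at_least(k, population, library_peaks, test_peaks):
    """Hypergeometric chance of k or more shared peaks arising at random."""
    upper = min(library_peaks, test_peaks)
    ways = sum(comb(library_peaks, j) * comb(population - library_peaks, test_peaks - j)
               for j in range(k, upper + 1))
    return ways / comb(population, test_peaks)


def search(test, library, n_bins, n_candidates):
    # Step 1: best library spectrum by dot product.
    best = max(library, key=lambda entry: dot_product(test, entry["peaks"]))
    # Step 2: p-value from the number of shared peaks.
    shared = len(set(test) & set(best["peaks"]))
    p = p_at_least(shared, n_bins, len(best["peaks"]), len(test))
    # Step 3: expectation value (assumed here to be p times the candidate count).
    expectation = p * n_candidates
    # Step 4: never report a value better than the library spectrum's quality.
    return best["sequence"], max(expectation, best["quality"])


library = [
    {"sequence": "PEPTIDEK", "peaks": {100: 1.0, 200: 0.5, 300: 0.2}, "quality": 1e-4},
    {"sequence": "ELVISK", "peaks": {110: 1.0, 210: 0.5, 310: 0.2}, "quality": 1e-3},
]
test_spectrum = {100: 0.9, 200: 0.6, 310: 0.1}
print(search(test_spectrum, library, n_bins=1000, n_candidates=2))
```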

Page 41: Proteomics Informatics –

X! Hunter Result

Query Spectrum

Library Spectrum

Page 42: Proteomics Informatics –

Dynamic Range In Proteomics

The goal is to identify and characterize all components of a proteome.

There is a large discrepancy between the experimental dynamic range and the range of amounts of different proteins in a proteome.

[Figure: Distribution of Protein Amounts, Number of Proteins versus Log (Protein Amount), with the Experimental Dynamic Range and the Desired Dynamic Range marked.]

Page 43: Proteomics Informatics –

[Diagram. Workflow steps: Sample Extraction; Protein Labeling; Protein Separation; Digestion; Peptide Labeling; Peptide Separation; Ionization; Mass Separation; Fragmentation; Mass Separation; Detection. Annotations: Loss of material; Limit of amount of material; Separation of material; Detection limit; Dynamic range. Axis: Protein Abundance.]

Page 44: Proteomics Informatics –

Experimental Designs

[Diagram of the simulated experiment. Labels: Sample; Protein Separation; Digestion; Peptide Separation ("Retention time" bins 1 to k; # of peptides per bin); Mass Spectrometry (MS dynamic range applied to peptides m1 to m6); Protein Abundance.]

Page 45: Proteomics Informatics –

Parameters in Simulation

● Distribution of protein amounts in sample

● Loss of peptides before binding to the column

● Loss of peptides after elution off the column

● Distribution of mass spectrometric response for different peptides present at the same amount

● Total amount of peptides that are loaded on column (limited by column loading capacity)

● # of peptide fractions

● # of proteins in each fraction

● Dynamic range of mass spectrometer

● Detection limit of mass spectrometer

[Diagram. Labels: Sample; Protein Separation; Digestion; Peptide Separation ("Retention time" bins 1 to k; # of peptides per bin); Mass Spectrometry (MS dynamic range applied to peptides m1 to m6); Protein Abundance.]

Page 46: Proteomics Informatics –

Simulation Results for 1D-LC-MS

[Figure. Workflow labels: Complex Mixtures of Proteins; Digestion; RPC; MS Analysis. Four panels of Number of Proteins versus log(Protein Amount): Tissue (0 to 6) and Body Fluid (0 to 10), each with No Protein Separation and with Protein Separation: 10 fractions.]

Page 47: Proteomics Informatics –

Success Rate of a Proteomics Experiment

DEFINITION: The success rate of a proteomics experiment is defined as the number of proteins detected divided by the total number of proteins in the proteome.

[Figure: Distribution of Protein Amounts, Number of Proteins versus Log (Protein Amount), with the Proteins Detected marked.]
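
The definition of success rate amounts to a simple ratio. The sketch below is a toy illustration: it assumes a protein counts as detected when its amount reaches a single detection limit, which ignores the separation, loss, and dynamic-range effects discussed elsewhere in the deck. The amounts and the limit are made-up numbers.

```python
def success_rate(amounts, detection_limit):
    """Number of proteins detected divided by the total number of proteins."""
    detected = sum(1 for amount in amounts if amount >= detection_limit)
    return detected / len(amounts)


# Hypothetical protein amounts (arbitrary units) and detection limit.
amounts = [1, 10, 100, 1000]
print(success_rate(amounts, detection_limit=50))
```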

Page 48: Proteomics Informatics –

Relative Dynamic Range of a Proteomics Experiment

DEFINITION: RELATIVE DYNAMIC RANGE, RDRx, where x is e.g. 10%, 50%, or 90%

[Figure: Distribution of Protein Amounts and Proteins Detected, Number of Proteins versus Log (Protein Amount), with the Fraction of Proteins Detected and the positions of RDR90, RDR50, and RDR10 marked.]

Page 49: Proteomics Informatics –

Number of Proteins in Mixture

[Figure: Success Rate and Relative Dynamic Range (RDR50), each 0 to 1, versus Number of Proteins in Mixture, 1 to 100000, for Tissue and Body Fluid, with points 1 and 2 marked. Insets: Number of Proteins versus log(Protein Amount) for Tissue (0 to 6) and Body Fluid (0 to 10).]

Page 50: Proteomics Informatics –

Amount of Peptides Loaded on the Column

[Figure: Success Rate and Relative Dynamic Range (RDR50), each 0 to 1, versus Amount Loaded [mg], 0.01 to 100, for Tissue and Body Fluid, with points 2 and 3 marked. Insets: Number of Proteins versus log(Protein Amount) for Tissue (0 to 6) and Body Fluid (0 to 10).]

Page 51: Proteomics Informatics –

Peptide Separation

[Figure: Success Rate and Relative Dynamic Range (RDR50), each 0 to 1, versus Number of Peptide Fractions, 10 to 100000, for Tissue and Body Fluid, with points 3 and 4 marked. Insets: Number of Proteins versus log(Protein Amount) for Tissue (0 to 6) and Body Fluid (0 to 10).]

Page 52: Proteomics Informatics –

Amount loaded and peptide separation

[Figure: Relative Dynamic Range versus Success Rate, both 0 to 1.0, for Tissue, with points 1 to 4 marked and steps labeled Protein separation, Amount loaded, and Peptide separation. Insets: Number of Proteins versus log(Protein Amount), 0 to 6.]

Order:
1. Protein separation
2. Amount loaded
3. Peptide separation

Order:
1. Protein separation
2. Peptide separation
3. Amount loaded

Ranges:
Protein separation: 30000 – 3000 proteins in each fraction
Amount loaded: 0.1 ug – 10 ug
Peptide separation: 100 – 1000 fractions

Page 53: Proteomics Informatics –

Repeat Analysis

1 Analysis

Page 54: Proteomics Informatics –

2 Analyses

Repeat Analysis

Page 55: Proteomics Informatics –

3 Analyses

Repeat Analysis

Page 56: Proteomics Informatics –

4 Analyses

Repeat Analysis

Page 57: Proteomics Informatics –

5 Analyses

Repeat Analysis

Page 58: Proteomics Informatics –

6 Analyses

Repeat Analysis

Page 59: Proteomics Informatics –

7 Analyses

Repeat Analysis

Page 60: Proteomics Informatics –

8 Analyses

Repeat Analysis

Page 61: Proteomics Informatics –

Repeat Analysis: Simulations

[Figure, two panels. Success Rate, 0 to 0.3, versus Number of Repeats, 0 to 10. RDR10, 0 to 0.5, versus Number of Repeats, 0 to 10. Series in both panels: Experiment; Simulation.]

Page 62: Proteomics Informatics –

Summary

• The success rate of proteome analysis is influenced by the following factors (listed in order of importance):

• Amount of peptides loaded on column or mass spectrometric detection limit

• The degree of peptide separation or mass spectrometric dynamic range

• The degree of protein separation

Page 63: Proteomics Informatics –

Proteomics Informatics – Protein identification II: search engines and protein sequence databases (Week 5)