Developing Algorithms for the Determination of Relative ...people.mpim-bonn.mpg.de/aruray/documents/nebpres.pdf · Jake Marcus, Anne Eaton, Melanie Kanter, Arunima Ray Mentors: Shawn

Developing Algorithms for the Determination ofRelative Peptide Abundances from LC/MS Data

Jake Marcus, Anne Eaton, Melanie Kanter, Arunima Ray

Mentors: Shawn Cokus, Matteo Pellegrini, Parag Mallick, Roland Luethy

Presented by: Arunima Ray, SUNY Geneseo

• Spielberg Family Center for Applied Proteomics (SFCAP)

– Aims to develop and use proteomic tools to facilitate patient management decisions

• Proteomics

– Large-scale study of proteins

– Proteins are the effectors of most biological functions

– The proteome is dynamic: • differs across cell types

• changes over time based on external and internal factors

Proteomics and SFCAP

• Need to quantify the differences in proteomes

Individualized treatment

Data generation: LC/MS

• Start with a sample of proteins

• Protein sizes are varied:

– Digest into fragments: peptides

• Separate by affinity to water

• Ionize

• Further separate by mass/charge

Affinity to water Mass/charge Peptide Protein

… … … …

246 725.4 K.ACSQRPR.W ADH

793 432.87 R.IGYADIK.W EPO

1075 5367.91 K.LGANAILK.W HB

Affinity to water Mass/charge Peptide Protein

… … … …

246 725.4 K.ACSQRPR.W ADH

793 432.87 R.IGYADIK.W EPO

1075 5367.91 K.LGANAILK.W HB

Aff

inity to w

ate

r

Mass/Charge

List of Identifications

This picture was generated by just

6 proteins !!

The data

Challenges

• Locate isotopes

• Identifications not centered

• Unknown spread along vertical axis

• Noise

In

ten

sit

y

Isotopes

Point of Identification

Step 1: find associated feature

• Use affinity and charge information from identification

• Mass of the most abundant isotope

• Consider a region large enough to include the entire feature of interest

Mass/Charge

Inte

nsity (

x10

5)

Point of Identification

Step 2: consolidate information from isotopes

• Isotopes form parallel curves

• Select relevant mass/charge values, extract corresponding data

Mass/Charge

Inte

nsity (

x10

5)

Step 2: consolidate information from isotopes

Mass/Charge

Inte

nsity

Step 3: limit along vertical axis (basic)

• Find highest peak

• Search along vertical axis until 4 out of 5 consecutive data points are below threshold

Affinity to water

Inte

nsity (

x10

5)

Threshold

Step 3: limit along vertical axis (fingerprinting)

• Carbon in nature:

– 98.93% in C-12 form

– 1.07% in C-13 form

• For a peptide of n carbons:

– Probability for all C-12 = 0.9893n

– Probability for 1 C-13, n-1 C-12 = n.(0.9893)n-1.(0.0107)

• Compute similarity score for intensities along mass/charge

• Graph for different affinity values

• Stop when scores below 10%

0

5

10

15

20

25

30

35

40

0 1 2 3 4 5

Exp

ecte

d p

erce

nta

ge

Number of C-13 atoms

Step 4: fit a curve and quantify

10 15 20 25 30 35 40 45 50 55 600

0.2

0.4

0.6

0.8

1

1.2

1.4Gamma, R2=0.98137

Affinity to water

Inte

nsi

ty (

scale

d)

Gamma, R2 = 0.9814Gamma, R2 = 0.98

• Need to model the distribution of intensities along vertical axis

• Gauss or gamma curve fit to data by nonlinear regression

Evaluation

• 5 protein mix – (~250 peptides)– Different amounts in

each sample

• 6 protein mix – (~150 peptides)– Same amount in every

sample

1x 2x 3x

1x 1x 1x

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

1 2 3 4 5 6 7

Modules

IQR

Final Version

Module combinations

Optimizing the algorithm

1.1

1.12

1.14

1.16

1.18

1.2

1.22

1.24

1.26

1.28

1 2 3 4 5 6 7

Modules

Ob

s/E

xp

Final Version

Module combinations

5 protein mix evaluations

6 protein mix evaluations

Outcomes

• Adaptable algorithm

• Modularity facilitates comparative analysis

• Addressed key challenges in quantification

• Investigated characteristics of features

• Attempt with real data

– The first (and only) run!

Acknowledgments

• Institute for Pure and Applied Mathematics

• Spielberg Family Center for Applied Proteomics

• My teammates Jake, Anne and Melanie

• Our awesome mentors Shawn, Matteo, Roland and Parag.

Documents

Developing Algorithms for the Determination of Relative ...people.mpim-bonn.mpg.de/aruray/documents/nebpres.pdf · Jake Marcus, Anne Eaton, Melanie Kanter, Arunima Ray Mentors: Shawn