SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE SEQUENCING FOR HCD AND ETD SPECTRA PAIRS

Yan Yan
Department of Computer Science, University of Western Ontario, Canada

Slides: admis.fudan.edu.cn/giw2016/slides/session-01/2-PresentationGIW20…


OUTLINE

• Background
  - Tandem mass spectrometry
  - Peptide sequencing methods
• Proposed method
  - Use of spectra libraries
  - Spectra merging
  - Peptide tags
  - De novo sequencing model
• Experiments and results
  - Data
  - Experiments and comparison
• Conclusions

BACKGROUND

• Mass spectrometry (MS)
  - An analytical technique that measures the mass-to-charge ratio (m/z) of individual compounds
• Tandem mass spectrometry (MS/MS)
  - Uses two or more mass analyzers
  - Breaks the compounds into smaller fragment ions
  - Fragmentation techniques:
    - Collision-induced dissociation (CID)
    - Higher-energy collisional dissociation (HCD)
    - Electron transfer dissociation (ETD)
• MS/MS experiments
  - Input: protein samples
  - Output: tandem mass (MS/MS) spectra

MS/MS PROCESS

[Figure: MS/MS workflow. Sample preparation: protein sample, protein digestion (enzyme), peptide sample, peptide separation. MS: ion source, mass analyzer 1, precursor ions of interest. MS/MS: fragmentation (CID, HCD, ETD, …), fragment ions, mass analyzer 2, detector.]

BACKGROUND

• Different ion types in MS/MS
  - There are commonly 6 different ion types, and they form 3 complementary ion pairs
  - b-ions and y-ions are the most common ions in CID/HCD spectra
  - c-ions and z-ions are common in ETD spectra
  - Ions may lose small molecules such as ammonia (NH3) and water (H2O)

[Figure: peptide fragmentation notation. Source: http://en.wikipedia.org/wiki/Tandem_mass_spectrometry]

TWO WAYS OF PEPTIDE SEQUENCING

[Figure from Frank et al., JPR 2006]

BACKGROUND

• Spectra library
  - A set of annotated experimental MS/MS spectra
  - Publicly available libraries, e.g. chemdata.nist.gov
  - The additional information helps with de novo sequencing

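PROPOSED METHOD

• Spectra merging
  - Peak selection (within each spectrum)
    - Find all length-2 paths and select the middle peak
    - Find complementary ion pairs and select them
  - Output: all selected peaks form one spectrum S

[Figure: a length-2 path through peaks i, j, k. The edge i to j is labelled m(aa1) and the edge j to k is labelled m(aa2); the middle peak j is output.]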
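The length-2 path rule can be illustrated with a short Python sketch. It is only one reading of the slide, under stated assumptions: peaks are given as singly charged masses, an edge joins two peaks whose mass difference equals one amino acid residue mass within a tolerance, and the names `RESIDUE_MASS`, `residue_steps`, `middle_peaks` and the 0.02 Da tolerance are hypothetical choices, not taken from the talk. The residue table is deliberately partial.

```python
from bisect import bisect_left, bisect_right

# Monoisotopic residue masses (Da) for a subset of amino acids (assumed table).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111,
}


def residue_steps(masses, start, tol):
    """Indices of peaks heavier than masses[start] by one residue mass."""
    hits = set()
    for m_aa in RESIDUE_MASS.values():
        lo = bisect_left(masses, masses[start] + m_aa - tol)
        hi = bisect_right(masses, masses[start] + m_aa + tol)
        hits.update(range(lo, hi))
    return hits


def middle_peaks(peak_masses, tol=0.02):
    """Return the sorted masses j that sit in the middle of a path i -> j -> k."""
    masses = sorted(peak_masses)
    has_pred = set()  # peaks reached from a lighter peak by one residue
    has_succ = set()  # peaks from which a heavier peak is reached
    for i in range(len(masses)):
        steps = residue_steps(masses, i, tol)
        if steps:
            has_succ.add(i)
            has_pred.update(steps)
    return [masses[j] for j in sorted(has_pred & has_succ)]


peaks = [100.0, 100.0 + 57.02146, 100.0 + 57.02146 + 71.03711, 500.0]
selected = middle_peaks(peaks)  # only the second peak has both neighbours
```

A peak is kept only when it has both a lighter and a heavier neighbour one residue away, which is why isolated noise peaks such as 500.0 drop out. Neutral losses and the complementary-pair rule are not handled here.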

PROPOSED METHOD

• Improved spectra merging: use spectra libraries as training sets
  - Calculate significance scores for
    - the middle ion of the 2-tags
    - complementary ion pairs
  - Assign significance scores to peaks during selection
    - Peak I = (m/z, charge, score)

[Figure: two tags, Tag1 and Tag2, placed between the N-terminal and the C-terminal.]

PROPOSED METHOD

• De novo sequencing
  - Generate peptide tags from both spectra
  - Graph with multiple edge types (GMET)

Y. Yan, A. Kusalik, and F.-X. Wu, "NovoHCD: De novo peptide sequencing from HCD spectra," IEEE Transactions on NanoBioscience, vol. 13, no. 2, pp. 65–72, June 2014.

PROPOSED METHOD

• Peptide tag improvement
  - Generate length-3 tags
  - Merge tags that overlap in two successive amino acids (and whose ions overlap)
    - E.g. t_i = TAG and t_j = AGT give t_ij = TAGT
  - Assign significance scores to tags
    - Rank candidate peptides by significance score
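The tag-merging rule can be sketched in a few lines of Python. This is a minimal illustration that checks only the sequence overlap; the ion-overlap condition mentioned on the slide is not modelled, and the function names `merge_tags` and `merge_all` are hypothetical.

```python
def merge_tags(t_i, t_j, overlap=2):
    """Merge two tags if the last `overlap` residues of t_i start t_j.

    Returns the merged tag, or None when the tags do not overlap.
    """
    if len(t_i) < overlap or len(t_j) < overlap:
        return None
    if t_i[-overlap:] != t_j[:overlap]:
        return None
    return t_i + t_j[overlap:]


def merge_all(tags, overlap=2):
    """Return every merge obtainable from one ordered pair of distinct tags."""
    merged = set()
    for a in tags:
        for b in tags:
            if a != b:
                m = merge_tags(a, b, overlap)
                if m is not None:
                    merged.add(m)
    return sorted(merged)


example = merge_tags("TAG", "AGT")  # the example from the slide
```

Applying `merge_all` repeatedly to its own output would grow longer tags; the sketch performs a single round.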

EXPERIMENTS AND RESULTS

• Experiment: MS/MS spectra datasets
  - Spectra libraries
    - Human peptide spectral library (from chemdata.nist.gov) of 183,140 HCD spectra
    - 100,000 ETD spectra of synthetic unmodified peptides

Dataset pair                           # of spectra (HCD / ETD)   Spectrum charge   Selected pairs
SCX_HCD_decon / SCX_ETD_decon          1952 / 612                 +2 to +6          161
SCX_HCD_no_decon / SCX_ETD_no_decon    2557 / 1298                +2 to +5          249

EXPERIMENTS AND RESULTS

• Results: significance scores calculated using spectra libraries

EXPERIMENTS AND RESULTS

• Results: full-length accuracy, top three candidates output
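As an illustration of how a top-k full-length accuracy could be computed, the sketch below counts a spectrum as correct when the true peptide appears among the first k ranked candidates. The exact definition used in the talk is not given, so two things here are assumptions: a hit requires an exact full-sequence match, and isoleucine and leucine are treated as identical because they have the same mass. The name `full_length_accuracy` and the dictionary layout are hypothetical.

```python
def full_length_accuracy(candidates, truth, k=3):
    """Fraction of spectra whose true peptide is among the top-k candidates.

    candidates: dict mapping spectrum id -> ranked list of peptide strings
    truth:      dict mapping spectrum id -> true peptide string
    """
    def norm(peptide):
        # Assumption: I and L are indistinguishable by mass, so equate them.
        return peptide.replace("I", "L")

    if not truth:
        return 0.0
    hits = 0
    for spec_id, true_pep in truth.items():
        top_k = {norm(c) for c in candidates.get(spec_id, [])[:k]}
        if norm(true_pep) in top_k:
            hits += 1
    return hits / len(truth)


ranked = {
    "s1": ["PEPTLDE", "PEPTIDE"],
    "s2": ["AAAA", "AAAK", "AAKA", "KAAA"],
}
known = {"s1": "PEPTIDE", "s2": "KAAA"}
top3 = full_length_accuracy(ranked, known, k=3)
```

Varying `k` gives the kind of comparison across output sizes that the results slides refer to.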

EXPERIMENTS AND RESULTS

• Results: accuracy comparison with different outputs
  - SCX_HCD_decon and SCX_ETD_decon dataset pair
  - SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair

EXPERIMENTS AND RESULTS

• Results: computational time

CONCLUSIONS

• Spectra libraries add information that improves peptide sequencing performance
• Full-length accuracy increases by up to 11% when only the highest-ranked candidates are output
• Computation time is reduced by up to 40% when merged long peptide tags are used

THANK YOU

FUTURE WORK PLAN

• Experiments
  - Compare with more methods
  - Use more datasets
• Algorithm improvement
  - Other ways of selecting signal peaks to merge spectra
    - Spectrum-specific features
• Method development
  - Multiple spectra sequencing

BACKGROUND

• Peptides
  - Peptides are organic compounds consisting of 2 or more amino acids
  - There are 20 standard amino acids in nature
  - Two amino acids connect to each other by forming a peptide bond and losing a molecule of water

[Figure: amino acid structure. Source: http://en.wikipedia.org/wiki/Amino_acid, Dec 20, 2013]
[Figure: formation of a peptide from two amino acids. Source: http://www.scientificpsychic.com/fitness/aminoacids.html, Dec 20, 2013]
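Because each peptide bond releases one water molecule, the mass of a peptide equals the sum of its residue masses plus a single water. The Python sketch below shows that arithmetic with monoisotopic masses; the residue table is a small assumed subset and the name `peptide_mass` is hypothetical.

```python
WATER = 18.010565  # monoisotopic mass of H2O in Da

# Monoisotopic residue masses (amino acid minus water) for a small subset.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "P": 97.05276, "V": 99.06841, "T": 101.04768,
}


def peptide_mass(sequence):
    """Neutral monoisotopic mass: residue masses plus one water for the termini."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Free glycine and free alanine each carry their own water.
glycine = RESIDUE_MASS["G"] + WATER
alanine = RESIDUE_MASS["A"] + WATER

# Joining them with one peptide bond loses one water.
dipeptide = glycine + alanine - WATER
```

Here `dipeptide` and `peptide_mass("GA")` agree up to floating-point rounding, which is the bookkeeping behind the statement on the slide.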

BACKGROUND

• ECD/ETD spectra
  - New data that has become available in recent years
  - c-ions and z-ions are dominant in ETD/ECD spectra
  - Ions lose small molecules such as ammonia (NH3) and water (H2O)

BACKGROUND

• CID/HCD spectra: common ion types and mass calculation
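The mass calculation itself did not survive from this slide. As a hedged reminder, the standard formulas for singly charged b- and y-ions are sketched below in Python: a b-ion is the sum of the N-terminal residues plus a proton, and a y-ion is the sum of the C-terminal residues plus water and a proton. The residue table is an assumed subset and `by_ions` is a hypothetical name.

```python
PROTON = 1.007276  # mass of a proton in Da
WATER = 18.010565  # monoisotopic mass of H2O in Da

# Monoisotopic residue masses for a small subset (assumed table).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "P": 97.05276, "V": 99.06841, "T": 101.04768,
}


def by_ions(sequence):
    """Return (b, y): singly charged m/z lists for b_1..b_(n-1) and y_1..y_(n-1)."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


b_ions, y_ions = by_ions("GAS")
```

With these definitions, b_i and y_(n-i) always add up to the neutral peptide mass plus two protons, which is the complementary relation used for CID/HCD spectra.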

BACKGROUND

• ECD/ETD spectra: common ion types and mass calculation

BACKGROUND

• Complementary ion pairs
  - In CID/HCD: $b_i + y_{n-i} = MH^+ + H = M_{parent} + 2H$
  - In ECD/ETD: $c_i + z_{n-i} = M_{parent} + 3H$ and $(c_i - 1) + (z_{n-i} + 1) = M_{parent} + 3H$
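These relations suggest a simple search: two peaks are complementary when their singly charged m/z values sum to the parent mass plus 2H (CID/HCD) or 3H (ECD/ETD). The Python sketch below is one way to do this with a two-pointer scan. It assumes singly charged peaks, approximates every H by the proton mass, reports at most one partner per peak, and uses a hypothetical function name and a 0.02 Da tolerance.

```python
PROTON = 1.007276  # used here as an approximation for each "H" in the formulas


def complementary_pairs(mz_values, parent_mass, hydrogens, tol=0.02):
    """Find pairs of singly charged peaks that sum to parent_mass + hydrogens * H.

    hydrogens = 2 for b/y pairs (CID/HCD), 3 for c/z pairs (ECD/ETD).
    """
    target = parent_mass + hydrogens * PROTON
    peaks = sorted(mz_values)
    pairs = []
    lo, hi = 0, len(peaks) - 1
    while lo < hi:
        total = peaks[lo] + peaks[hi]
        if abs(total - target) <= tol:
            pairs.append((peaks[lo], peaks[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1  # need a heavier light-side peak
        else:
            hi -= 1  # need a lighter heavy-side peak
    return pairs


# b1, y1, b2, a noise peak, and y2 of the peptide GAS (neutral mass 233.101165).
spectrum = [58.028736, 106.049871, 129.065846, 150.0, 177.086981]
by_pairs = complementary_pairs(spectrum, 233.101165, hydrogens=2)
```

The scan is linear after sorting, so it stays cheap even on dense spectra. The noise peak at 150.0 has no partner and is left out.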

PROPOSED METHOD

• Spectra merging
  - Peak selection (within each spectrum)
    1. Find all length-2 paths and output the middle peak
       - Relationship: amino acid mass difference
       - Ions: regular or with loss of small molecules
    2. Find complementary ion pairs and output them
       - Ions: consider different charges
  - Output peaks form spectrum S

[Figure: a length-2 path through peaks i, j, k. The edge i to j is labelled m(aa1) and the edge j to k is labelled m(aa2); the middle peak j is output.]
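Considering different charges usually means mapping each observed m/z to the value it would have as a singly charged ion, so that peaks can be compared on one scale. The conversion is standard; how the method enumerates charge states is not stated, so trying every charge up to a maximum is an assumption of this Python sketch, and both function names are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da


def to_singly_charged(mz, charge):
    """Convert an m/z observed at `charge` to the equivalent singly charged m/z."""
    return mz * charge - (charge - 1) * PROTON


def charge_hypotheses(mz, max_charge):
    """List (charge, singly charged m/z) for every charge from 1 to max_charge."""
    return [(z, to_singly_charged(mz, z)) for z in range(1, max_charge + 1)]


# A fragment of neutral mass 1000 Da seen as a doubly charged ion.
observed = (1000.0 + 2 * PROTON) / 2
as_single = to_singly_charged(observed, 2)  # close to 1000 + PROTON
```

Each hypothesis can then be fed to the path and complementary-pair searches, keeping whichever charge assignment yields a match.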

PROPOSED METHOD (HAVE PROBLEMS)

• De novo sequencing
  - Tags
    - Find 3-tags from each of the original spectra (denoted as 𝑆 and 𝑆)
    - Find peaks associated with a tag t in S
    - Calculate the AACs separated by t (use 𝑃&)
    - Threshold control if more tags are needed
  - Extend paths to form graphs with multiple edge types (GMETs) with AAC restriction
    - Ion types include b/y- and c/z-ions
    - Ion type is determined by the previous step
  - Assemble tags and partial peptides from GMETs into candidate peptides

EXPERIMENTS AND RESULTS

• Experiment: data
  - Select spectra pairs having the same peptide sequences

Results inferred by two previously published methods, for HCD and ExD data respectively

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

29

Results inferred by previous published two methods for HCD and ExD data respectively

Page 2: SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE HCD ETDadmis.fudan.edu.cn/giw2016/slides/session-01/2-PresentationGIW20… · ¢ human peptide spectral library (from chemdata.nist.gov) of

OUTLINE

cent Background Tandem mass spectrometry Peptide sequencing methods

cent Proposed method Use of spectra libraries Spectra merging Peptide tags De novo sequencing model

cent Experiments and results Data Experiments and comparison

cent Conclusions2

cent Mass spectrometry (MS) An analytical technique measuring mass-to-charge ratio (mz) of

individual compoundscent Tandem mass spectrometry (MSMS)

It contains two or more mass analyzers It breaks the compounds into smaller fragment ions Fragmentation techniques

cent Collision-induced dissociation (CID)cent High-energy collisional dissociation (HCD)cent Electron transfer dissociation (ETD)

cent MSMS experiments Input protein samples Output tandem mass (MSMS) spectra

3

BACKGROUND

MSMS PROCESS

Precursor ions of interest

FragmentationCIDHCDETDhellip

MSMSMass analyzer 2

Ion source Mass analyzer 1

MS

Fragment ions

Protein digestion

Peptide sampleenzyme

Peptide separation

Protein sample

Detector

Sample preparation

4

cent Different ion types of MSMS There are commonly 6 different ionsand they form 3 complimentary

ion pairs B-ions and y-ions are the most common ions in CIDHCD spectra C-ions and z-ions are common in ETD spectra Losing small molecules such as ammonia (NH3) and water (H2O)

Peptide fragmentation notation httpenwikipediaorgwikiTandem_mass_spectrometry

BACKGROUND

5

6

TWO WAYS OF PEPTIDE SEQUENCING

Frank et al JPR 2006

BACKGROUND

cent Spectra library Set of annotated experimental MSMS spectra Publicly available libraries

cent chemdatanistgov Additional information help with de novo sequencing

7

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

cent Find all length 2 paths select middle peak

cent Find complimentary ion pairs and select them Output all selected peaks form one spectrum S

8

k

maa2

j

maa1

i

output

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

9

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

Assign significant scores on peaks during selectioncent Peak I (mz charge score)

10

N-terminal

Tag1 Tag2

C-terminal

PROPOSED METHOD

11

cent De novo sequencing Generate peptide tags from both spectra Graph with multiple edge types (GMET)

Y Yan A Kusalik and F-X Wu ldquoNovoHCD De novo peptide sequencing from HCD spectrardquo NanoBioscience IEEE Transactions on vol 13 no 2 pp 65ndash72June 2014

PROPOSED METHOD

12

cent Peptide tag improvement Generate length-3 tags Merge tags with two successive amino acids overlap

(and ions overlap)cent Eg ti = TAG and tj = AGT tij = TAGT

Assign significant scores to tagscent Rank candidate peptides with significant scores

EXPERIMENTS AND RESULTS

cent Experiment MSMS spectra dataset

Spectra librariescent human peptide spectral library (from chemdatanistgov) of

183140 HCD spectracent 100000 ETD spectra of synthetic unmodified peptides

13

Dataset of

spectraSpectrum

chargeSelected

pairs

SCX_HCD_decon 1952+2 to +6 161

SCX_ETD_decon 612SCX_HCD_no_decon 2557

+2 to +5 249SCX_ETD_no_decon 1298

EXPERIMENTS AND RESULTS

cent Results Significant score calculated using spectra libraries

14

EXPERIMENTS AND RESULTS

cent Results Full length accuracy ndash top three candidates output

15

EXPERIMENTS AND RESULTS

cent Results Accuracy comparison with different output

cent SCX_HCD_decon and SCX_ETD_decon dataset pair

cent SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair

16

EXPERIMENTS AND RESULTS

cent Results Computational time

17

CONCLUSIONS

cent Spectra libraries adds additional information forbetter peptide sequencing performance

cent Full length accuracy increases up to 11 whenonly the highest ranked candidates output

cent Computation time saves up to 40 when mergedlong peptide tags were used

18

THANK YOU19

FUTURE WORK PLAN

cent Experiments Compare with more methods Use more datasets

cent Algorithm improvement Other ways of selecting signal peaks to merge spectra

cent Spectrum specific features

cent Method development Multiple spectra sequencing

20

cent Peptides Peptides are organic compounds consisting of 2 or more amino

acids There are 20 standard amino acids in nature Two amino acids connect to each other by forming a peptide

bond and losing a molecule of water

21

BACKGROUND

Amino acid structure Dec 20 2013httpenwikipediaorgwikiAmino_acid

Formation of a peptide from two amino acids Dec 20 2013 httpwwwscientificpsychiccomfitnessaminoacidshtml

cent ECDETD spectra New data that has been available in recent years C-Ions and z-ions are dominant in ETDECD spectra Ions lose small molecules such as ammonia (NH3) and water

(H2O)

BACKGROUND

22

cent CIDHCD spectra Common ion types and mass calculation

BACKGROUND

23

cent ECDETD spectra Common ion types and mass calculation

BACKGROUND

24

cent Complimentary ion pairs In CIDHCD

bi-ion + yn-i-ion = MH+ +H = Mparent + 2H In ECDETD

ci-ion + zn-i-ion = Mparent + 3H(ci-1)-ion + (zn-i+1)-ion = Mparent + 3H

BACKGROUND

25

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

1 Find all length 2 paths output middle peakcent Relationship amino acid difference cent Ions regular or with loss of small molecules

2 Find complimentary ion pairs and outputcent Ions consider different charges

Output peaks to form spectra S26

k

maa2

j

maa1

i

output

PROPOSED METHOD (HAVE PROBLEMS)cent De novo sequencing

Tagscent Find 3-tags from each of the original spectrum (denoted as 119878 and 119878)

Find peaks associated with a tag t in S Calculate AACs separated by t (use 119875amp )

cent Threshold control if more tags needed Extend paths to form Graphs with multiple edge type

(GMETs) with AAC restrictioncent Ion types including by- and cz-ionscent Ion type determined by previous step

Assembling tags and partial peptides from GMETs to be candidate peptides 27

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

28

Results inferred by previous published two methods for HCD and ExD data respectively

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

29

Results inferred by previous published two methods for HCD and ExD data respectively

Page 3: SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE HCD ETDadmis.fudan.edu.cn/giw2016/slides/session-01/2-PresentationGIW20… · ¢ human peptide spectral library (from chemdata.nist.gov) of

cent Mass spectrometry (MS) An analytical technique measuring mass-to-charge ratio (mz) of

individual compoundscent Tandem mass spectrometry (MSMS)

It contains two or more mass analyzers It breaks the compounds into smaller fragment ions Fragmentation techniques

cent Collision-induced dissociation (CID)cent High-energy collisional dissociation (HCD)cent Electron transfer dissociation (ETD)

cent MSMS experiments Input protein samples Output tandem mass (MSMS) spectra

3

BACKGROUND

MSMS PROCESS

Precursor ions of interest

FragmentationCIDHCDETDhellip

MSMSMass analyzer 2

Ion source Mass analyzer 1

MS

Fragment ions

Protein digestion

Peptide sampleenzyme

Peptide separation

Protein sample

Detector

Sample preparation

4

cent Different ion types of MSMS There are commonly 6 different ionsand they form 3 complimentary

ion pairs B-ions and y-ions are the most common ions in CIDHCD spectra C-ions and z-ions are common in ETD spectra Losing small molecules such as ammonia (NH3) and water (H2O)

Peptide fragmentation notation httpenwikipediaorgwikiTandem_mass_spectrometry

BACKGROUND

5

6

TWO WAYS OF PEPTIDE SEQUENCING

Frank et al JPR 2006

BACKGROUND

cent Spectra library Set of annotated experimental MSMS spectra Publicly available libraries

cent chemdatanistgov Additional information help with de novo sequencing

7

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

cent Find all length 2 paths select middle peak

cent Find complimentary ion pairs and select them Output all selected peaks form one spectrum S

8

k

maa2

j

maa1

i

output

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

9

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

Assign significant scores on peaks during selectioncent Peak I (mz charge score)

10

N-terminal

Tag1 Tag2

C-terminal

PROPOSED METHOD

11

cent De novo sequencing Generate peptide tags from both spectra Graph with multiple edge types (GMET)

Y Yan A Kusalik and F-X Wu ldquoNovoHCD De novo peptide sequencing from HCD spectrardquo NanoBioscience IEEE Transactions on vol 13 no 2 pp 65ndash72June 2014

PROPOSED METHOD

12

cent Peptide tag improvement Generate length-3 tags Merge tags with two successive amino acids overlap

(and ions overlap)cent Eg ti = TAG and tj = AGT tij = TAGT

Assign significant scores to tagscent Rank candidate peptides with significant scores

EXPERIMENTS AND RESULTS

cent Experiment MSMS spectra dataset

Spectra librariescent human peptide spectral library (from chemdatanistgov) of

183140 HCD spectracent 100000 ETD spectra of synthetic unmodified peptides

13

Dataset of

spectraSpectrum

chargeSelected

pairs

SCX_HCD_decon 1952+2 to +6 161

SCX_ETD_decon 612SCX_HCD_no_decon 2557

+2 to +5 249SCX_ETD_no_decon 1298

EXPERIMENTS AND RESULTS

cent Results Significant score calculated using spectra libraries

14

EXPERIMENTS AND RESULTS

cent Results Full length accuracy ndash top three candidates output

15

EXPERIMENTS AND RESULTS

cent Results Accuracy comparison with different output

cent SCX_HCD_decon and SCX_ETD_decon dataset pair

cent SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair

16

EXPERIMENTS AND RESULTS

cent Results Computational time

17

CONCLUSIONS

cent Spectra libraries adds additional information forbetter peptide sequencing performance

cent Full length accuracy increases up to 11 whenonly the highest ranked candidates output

cent Computation time saves up to 40 when mergedlong peptide tags were used

18

THANK YOU19

FUTURE WORK PLAN

cent Experiments Compare with more methods Use more datasets

cent Algorithm improvement Other ways of selecting signal peaks to merge spectra

cent Spectrum specific features

cent Method development Multiple spectra sequencing

20

cent Peptides Peptides are organic compounds consisting of 2 or more amino

acids There are 20 standard amino acids in nature Two amino acids connect to each other by forming a peptide

bond and losing a molecule of water

21

BACKGROUND

Amino acid structure Dec 20 2013httpenwikipediaorgwikiAmino_acid

Formation of a peptide from two amino acids Dec 20 2013 httpwwwscientificpsychiccomfitnessaminoacidshtml

cent ECDETD spectra New data that has been available in recent years C-Ions and z-ions are dominant in ETDECD spectra Ions lose small molecules such as ammonia (NH3) and water

(H2O)

BACKGROUND

22

cent CIDHCD spectra Common ion types and mass calculation

BACKGROUND

23

cent ECDETD spectra Common ion types and mass calculation

BACKGROUND

24

cent Complimentary ion pairs In CIDHCD

bi-ion + yn-i-ion = MH+ +H = Mparent + 2H In ECDETD

ci-ion + zn-i-ion = Mparent + 3H(ci-1)-ion + (zn-i+1)-ion = Mparent + 3H

BACKGROUND

25

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

1 Find all length 2 paths output middle peakcent Relationship amino acid difference cent Ions regular or with loss of small molecules

2 Find complimentary ion pairs and outputcent Ions consider different charges

Output peaks to form spectra S26

k

maa2

j

maa1

i

output

PROPOSED METHOD (HAVE PROBLEMS)cent De novo sequencing

Tagscent Find 3-tags from each of the original spectrum (denoted as 119878 and 119878)

Find peaks associated with a tag t in S Calculate AACs separated by t (use 119875amp )

cent Threshold control if more tags needed Extend paths to form Graphs with multiple edge type

(GMETs) with AAC restrictioncent Ion types including by- and cz-ionscent Ion type determined by previous step

Assembling tags and partial peptides from GMETs to be candidate peptides 27

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

28

Results inferred by previous published two methods for HCD and ExD data respectively

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

29

Results inferred by previous published two methods for HCD and ExD data respectively

Page 4: SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE HCD ETDadmis.fudan.edu.cn/giw2016/slides/session-01/2-PresentationGIW20… · ¢ human peptide spectral library (from chemdata.nist.gov) of

MSMS PROCESS

Precursor ions of interest

FragmentationCIDHCDETDhellip

MSMSMass analyzer 2

Ion source Mass analyzer 1

MS

Fragment ions

Protein digestion

Peptide sampleenzyme

Peptide separation

Protein sample

Detector

Sample preparation

4

cent Different ion types of MSMS There are commonly 6 different ionsand they form 3 complimentary

ion pairs B-ions and y-ions are the most common ions in CIDHCD spectra C-ions and z-ions are common in ETD spectra Losing small molecules such as ammonia (NH3) and water (H2O)

Peptide fragmentation notation httpenwikipediaorgwikiTandem_mass_spectrometry

BACKGROUND

5

6

TWO WAYS OF PEPTIDE SEQUENCING

Frank et al JPR 2006

BACKGROUND

cent Spectra library Set of annotated experimental MSMS spectra Publicly available libraries

cent chemdatanistgov Additional information help with de novo sequencing

7

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

cent Find all length 2 paths select middle peak

cent Find complimentary ion pairs and select them Output all selected peaks form one spectrum S

8

k

maa2

j

maa1

i

output

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

9

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

Assign significant scores on peaks during selectioncent Peak I (mz charge score)

10

N-terminal

Tag1 Tag2

C-terminal

PROPOSED METHOD

11

cent De novo sequencing Generate peptide tags from both spectra Graph with multiple edge types (GMET)

Y Yan A Kusalik and F-X Wu ldquoNovoHCD De novo peptide sequencing from HCD spectrardquo NanoBioscience IEEE Transactions on vol 13 no 2 pp 65ndash72June 2014

PROPOSED METHOD

12

cent Peptide tag improvement Generate length-3 tags Merge tags with two successive amino acids overlap

(and ions overlap)cent Eg ti = TAG and tj = AGT tij = TAGT

Assign significant scores to tagscent Rank candidate peptides with significant scores

EXPERIMENTS AND RESULTS

cent Experiment MSMS spectra dataset

Spectra librariescent human peptide spectral library (from chemdatanistgov) of

183140 HCD spectracent 100000 ETD spectra of synthetic unmodified peptides

13

Dataset of

spectraSpectrum

chargeSelected

pairs

SCX_HCD_decon 1952+2 to +6 161

SCX_ETD_decon 612SCX_HCD_no_decon 2557

+2 to +5 249SCX_ETD_no_decon 1298

EXPERIMENTS AND RESULTS

cent Results Significant score calculated using spectra libraries

14

EXPERIMENTS AND RESULTS

cent Results Full length accuracy ndash top three candidates output

15

EXPERIMENTS AND RESULTS

cent Results Accuracy comparison with different output

cent SCX_HCD_decon and SCX_ETD_decon dataset pair

cent SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair

16

EXPERIMENTS AND RESULTS

cent Results Computational time

17

CONCLUSIONS

cent Spectra libraries adds additional information forbetter peptide sequencing performance

cent Full length accuracy increases up to 11 whenonly the highest ranked candidates output

cent Computation time saves up to 40 when mergedlong peptide tags were used

18

THANK YOU19

FUTURE WORK PLAN

cent Experiments Compare with more methods Use more datasets

cent Algorithm improvement Other ways of selecting signal peaks to merge spectra

cent Spectrum specific features

cent Method development Multiple spectra sequencing

20

cent Peptides Peptides are organic compounds consisting of 2 or more amino

acids There are 20 standard amino acids in nature Two amino acids connect to each other by forming a peptide

bond and losing a molecule of water

21

BACKGROUND

Amino acid structure Dec 20 2013httpenwikipediaorgwikiAmino_acid

Formation of a peptide from two amino acids Dec 20 2013 httpwwwscientificpsychiccomfitnessaminoacidshtml

cent ECDETD spectra New data that has been available in recent years C-Ions and z-ions are dominant in ETDECD spectra Ions lose small molecules such as ammonia (NH3) and water

(H2O)

BACKGROUND

22

cent CIDHCD spectra Common ion types and mass calculation

BACKGROUND

23

cent ECDETD spectra Common ion types and mass calculation

BACKGROUND

24

cent Complimentary ion pairs In CIDHCD

bi-ion + yn-i-ion = MH+ +H = Mparent + 2H In ECDETD

ci-ion + zn-i-ion = Mparent + 3H(ci-1)-ion + (zn-i+1)-ion = Mparent + 3H

BACKGROUND

25

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

1 Find all length 2 paths output middle peakcent Relationship amino acid difference cent Ions regular or with loss of small molecules

2 Find complimentary ion pairs and outputcent Ions consider different charges

Output peaks to form spectra S26

k

maa2

j

maa1

i

output

PROPOSED METHOD (HAVE PROBLEMS)cent De novo sequencing

Tagscent Find 3-tags from each of the original spectrum (denoted as 119878 and 119878)

Find peaks associated with a tag t in S Calculate AACs separated by t (use 119875amp )

cent Threshold control if more tags needed Extend paths to form Graphs with multiple edge type

(GMETs) with AAC restrictioncent Ion types including by- and cz-ionscent Ion type determined by previous step

Assembling tags and partial peptides from GMETs to be candidate peptides 27

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

28

Results inferred by previous published two methods for HCD and ExD data respectively

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

29

Results inferred by previous published two methods for HCD and ExD data respectively

Page 5: SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE HCD ETDadmis.fudan.edu.cn/giw2016/slides/session-01/2-PresentationGIW20… · ¢ human peptide spectral library (from chemdata.nist.gov) of

cent Different ion types of MSMS There are commonly 6 different ionsand they form 3 complimentary

ion pairs B-ions and y-ions are the most common ions in CIDHCD spectra C-ions and z-ions are common in ETD spectra Losing small molecules such as ammonia (NH3) and water (H2O)

Peptide fragmentation notation httpenwikipediaorgwikiTandem_mass_spectrometry

BACKGROUND

5

6

TWO WAYS OF PEPTIDE SEQUENCING

Frank et al JPR 2006

BACKGROUND

cent Spectra library Set of annotated experimental MSMS spectra Publicly available libraries

cent chemdatanistgov Additional information help with de novo sequencing

7

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

cent Find all length 2 paths select middle peak

cent Find complimentary ion pairs and select them Output all selected peaks form one spectrum S

8

k

maa2

j

maa1

i

output

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

9

PROPOSED METHOD

cent Improved spectra merging -- use spectra libraries as training sets Calculate significant scores

cent Middle ion of the 2-tagscent Complementary ion pairs

Assign significant scores on peaks during selectioncent Peak I (mz charge score)

10

N-terminal

Tag1 Tag2

C-terminal

PROPOSED METHOD

11

cent De novo sequencing Generate peptide tags from both spectra Graph with multiple edge types (GMET)

Y Yan A Kusalik and F-X Wu ldquoNovoHCD De novo peptide sequencing from HCD spectrardquo NanoBioscience IEEE Transactions on vol 13 no 2 pp 65ndash72June 2014

PROPOSED METHOD

12

cent Peptide tag improvement Generate length-3 tags Merge tags with two successive amino acids overlap

(and ions overlap)cent Eg ti = TAG and tj = AGT tij = TAGT

Assign significant scores to tagscent Rank candidate peptides with significant scores

EXPERIMENTS AND RESULTS

cent Experiment MSMS spectra dataset

Spectra librariescent human peptide spectral library (from chemdatanistgov) of

183140 HCD spectracent 100000 ETD spectra of synthetic unmodified peptides

13

Dataset of

spectraSpectrum

chargeSelected

pairs

SCX_HCD_decon 1952+2 to +6 161

SCX_ETD_decon 612SCX_HCD_no_decon 2557

+2 to +5 249SCX_ETD_no_decon 1298

EXPERIMENTS AND RESULTS

cent Results Significant score calculated using spectra libraries

14

EXPERIMENTS AND RESULTS

cent Results Full length accuracy ndash top three candidates output

15

EXPERIMENTS AND RESULTS

cent Results Accuracy comparison with different output

cent SCX_HCD_decon and SCX_ETD_decon dataset pair

cent SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair

16

EXPERIMENTS AND RESULTS

cent Results Computational time

17

CONCLUSIONS

cent Spectra libraries adds additional information forbetter peptide sequencing performance

cent Full length accuracy increases up to 11 whenonly the highest ranked candidates output

cent Computation time saves up to 40 when mergedlong peptide tags were used

18

THANK YOU19

FUTURE WORK PLAN

cent Experiments Compare with more methods Use more datasets

cent Algorithm improvement Other ways of selecting signal peaks to merge spectra

cent Spectrum specific features

cent Method development Multiple spectra sequencing

20

cent Peptides Peptides are organic compounds consisting of 2 or more amino

acids There are 20 standard amino acids in nature Two amino acids connect to each other by forming a peptide

bond and losing a molecule of water

21

BACKGROUND

Amino acid structure Dec 20 2013httpenwikipediaorgwikiAmino_acid

Formation of a peptide from two amino acids Dec 20 2013 httpwwwscientificpsychiccomfitnessaminoacidshtml

cent ECDETD spectra New data that has been available in recent years C-Ions and z-ions are dominant in ETDECD spectra Ions lose small molecules such as ammonia (NH3) and water

(H2O)

BACKGROUND

22

cent CIDHCD spectra Common ion types and mass calculation

BACKGROUND

23

cent ECDETD spectra Common ion types and mass calculation

BACKGROUND

24

cent Complimentary ion pairs In CIDHCD

bi-ion + yn-i-ion = MH+ +H = Mparent + 2H In ECDETD

ci-ion + zn-i-ion = Mparent + 3H(ci-1)-ion + (zn-i+1)-ion = Mparent + 3H

BACKGROUND

25

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

1 Find all length 2 paths output middle peakcent Relationship amino acid difference cent Ions regular or with loss of small molecules

2 Find complimentary ion pairs and outputcent Ions consider different charges

Output peaks to form spectra S26

k

maa2

j

maa1

i

output

PROPOSED METHOD (HAVE PROBLEMS)cent De novo sequencing

Tagscent Find 3-tags from each of the original spectrum (denoted as 119878 and 119878)

Find peaks associated with a tag t in S Calculate AACs separated by t (use 119875amp )

cent Threshold control if more tags needed Extend paths to form Graphs with multiple edge type

(GMETs) with AAC restrictioncent Ion types including by- and cz-ionscent Ion type determined by previous step

Assembling tags and partial peptides from GMETs to be candidate peptides 27

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

28

Results inferred by previous published two methods for HCD and ExD data respectively

EXPERIMENTS AND RESULTS

cent Experiment Data

cent Select spectra pairs having the same peptide sequences

29

Results inferred by previous published two methods for HCD and ExD data respectively


TWO WAYS OF PEPTIDE SEQUENCING

Frank et al., JPR, 2006

BACKGROUND

• Spectra library
  • Set of annotated experimental MS/MS spectra
  • Publicly available libraries
    • chemdata.nist.gov
  • Additional information helps with de novo sequencing

PROPOSED METHOD

• Spectra merging
  • Peak selection (within each spectrum)
    • Find all length-2 paths and select the middle peak
    • Find complementary ion pairs and select them
  • Output all selected peaks to form one spectrum S

[Figure: peaks i, j and k linked by amino acid masses m_aa1 and m_aa2; the middle peak j is output]


PROPOSED METHOD

• Improved spectra merging: use spectra libraries as training sets
  • Calculate significance scores
    • Middle ion of the 2-tags
    • Complementary ion pairs
  • Assign significance scores to peaks during selection
    • Peak I: (m/z, charge, score)
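The slides do not say how a significance score is computed from a library. One plausible reading, stated here purely as an assumption, is the fraction of peaks picked by a selection rule (middle ion of a 2-tag, or complementary pair) that the library annotates as genuine fragment ions. The sketch below computes that fraction; the function name and input layout are hypothetical.

```python
def significance_score(library):
    """Fraction of rule-selected peaks that are annotated fragment ions.

    library: iterable of (selected, annotated) pairs, one per library
    spectrum, where each element is a set of peak indices. `selected`
    holds the peaks chosen by a selection rule and `annotated` holds
    the peaks the library labels as real fragment ions.
    """
    selected_total = 0
    confirmed_total = 0
    for selected, annotated in library:
        selected_total += len(selected)
        confirmed_total += len(selected & annotated)
    # An empty selection gives no evidence, so return 0.0 by convention.
    return confirmed_total / selected_total if selected_total else 0.0
```

Under this reading, a separate score would be estimated for each selection rule and each fragmentation type, then attached to every peak that the rule selects.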

PROPOSED METHOD

• De novo sequencing
  • Generate peptide tags from both spectra
  • Graph with multiple edge types (GMET)

[Figure: peptide tags Tag1 and Tag2 placed between the N-terminal and the C-terminal]

Y. Yan, A. Kusalik, and F.-X. Wu, "NovoHCD: De novo peptide sequencing from HCD spectra," IEEE Transactions on NanoBioscience, vol. 13, no. 2, pp. 65-72, June 2014.

PROPOSED METHOD

• Peptide tag improvement
  • Generate length-3 tags
  • Merge tags that overlap by two successive amino acids (and whose ions overlap)
    • E.g., t_i = TAG and t_j = AGT give t_ij = TAGT
  • Assign significance scores to tags
    • Rank candidate peptides by significance score
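The tag-merging rule can be illustrated with a small sketch. The `Tag` structure and `merge_tags` function are hypothetical names. The sketch assumes each tag stores the m/z values of its supporting peaks, one more peak than residues, so that "ions overlap" can be checked alongside the two-residue sequence overlap.

```python
from typing import NamedTuple, Optional, Tuple


class Tag(NamedTuple):
    sequence: str
    peaks: Tuple[float, ...]  # len(sequence) + 1 supporting m/z values


def merge_tags(left: Tag, right: Tag, overlap: int = 2,
               tol: float = 0.02) -> Optional[Tag]:
    """Merge two tags that share `overlap` residues and their peaks.

    Returns the merged tag, or None when the sequences or the
    supporting peaks do not overlap.
    """
    if left.sequence[-overlap:] != right.sequence[:overlap]:
        return None
    # Two shared residues are bounded by three shared peaks.
    shared_left = left.peaks[-(overlap + 1):]
    shared_right = right.peaks[:overlap + 1]
    if any(abs(a - b) > tol for a, b in zip(shared_left, shared_right)):
        return None
    return Tag(left.sequence + right.sequence[overlap:],
               left.peaks + right.peaks[overlap + 1:])
```

Merging `TAG` with `AGT` in this way yields `TAGT`, supported by five peaks instead of four. The longer tag leaves fewer ways to extend the path during sequencing.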

EXPERIMENTS AND RESULTS

• Experiment
  • MS/MS spectra datasets (see table)
  • Spectra libraries
    • Human peptide spectral library (from chemdata.nist.gov) of 183,140 HCD spectra
    • 100,000 ETD spectra of synthetic unmodified peptides

Dataset             Number of spectra   Spectrum charge   Selected pairs
SCX_HCD_decon       1952                +2 to +6          161
SCX_ETD_decon       612                 +2 to +6          161
SCX_HCD_no_decon    2557                +2 to +5          249
SCX_ETD_no_decon    1298                +2 to +5          249

EXPERIMENTS AND RESULTS

• Results
  • Significance scores calculated using spectra libraries

EXPERIMENTS AND RESULTS

• Results
  • Full-length accuracy: top three candidates output

EXPERIMENTS AND RESULTS

• Results
  • Accuracy comparison with different outputs
    • SCX_HCD_decon and SCX_ETD_decon dataset pair
    • SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair

EXPERIMENTS AND RESULTS

• Results
  • Computational time

CONCLUSIONS

• Spectra libraries add information that improves peptide sequencing performance
• Full-length accuracy increases by up to 11% when only the highest-ranked candidates are output
• Computation time is reduced by up to 40% when merged long peptide tags are used

THANK YOU

FUTURE WORK PLAN

• Experiments
  • Compare with more methods
  • Use more datasets
• Algorithm improvement
  • Other ways of selecting signal peaks to merge spectra
    • Spectrum-specific features
• Method development
  • Multiple-spectra sequencing

BACKGROUND

• Peptides
  • Peptides are organic compounds consisting of two or more amino acids
  • There are 20 standard amino acids in nature
  • Two amino acids connect to each other by forming a peptide bond and losing a molecule of water

Amino acid structure. Dec 20, 2013. http://en.wikipedia.org/wiki/Amino_acid
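The water loss per peptide bond gives a simple way to compute a peptide's mass from the masses of its free amino acids. The following is a small worked sketch; the function name and the three-entry mass table are illustrative assumptions, using standard monoisotopic values.

```python
WATER = 18.010565

# Monoisotopic masses (Da) of free amino acids; a partial table.
AMINO_ACID_MASS = {"G": 75.032028, "A": 89.047678, "S": 105.042593}


def peptide_mass(sequence):
    """Neutral peptide mass: sum of amino acid masses minus one water
    molecule for each peptide bond formed."""
    total = sum(AMINO_ACID_MASS[aa] for aa in sequence)
    bonds = max(len(sequence) - 1, 0)
    return total - bonds * WATER
```

For the dipeptide GA this gives 75.032028 + 89.047678 - 18.010565 = 146.069141 Da.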







Page 12: SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE HCD ETDadmis.fudan.edu.cn/giw2016/slides/session-01/2-PresentationGIW20… · ¢ human peptide spectral library (from chemdata.nist.gov) of

PROPOSED METHOD

12

cent Peptide tag improvement Generate length-3 tags Merge tags with two successive amino acids overlap

(and ions overlap)cent Eg ti = TAG and tj = AGT tij = TAGT

Assign significant scores to tagscent Rank candidate peptides with significant scores

EXPERIMENTS AND RESULTS

cent Experiment MSMS spectra dataset

Spectra librariescent human peptide spectral library (from chemdatanistgov) of

183140 HCD spectracent 100000 ETD spectra of synthetic unmodified peptides

13

Dataset of

spectraSpectrum

chargeSelected

pairs

SCX_HCD_decon 1952+2 to +6 161

SCX_ETD_decon 612SCX_HCD_no_decon 2557

+2 to +5 249SCX_ETD_no_decon 1298

EXPERIMENTS AND RESULTS

cent Results Significant score calculated using spectra libraries

14

EXPERIMENTS AND RESULTS

cent Results Full length accuracy ndash top three candidates output

15

EXPERIMENTS AND RESULTS

cent Results Accuracy comparison with different output

cent SCX_HCD_decon and SCX_ETD_decon dataset pair

cent SCX_HCD_no_decon and SCX_ETD_no_decon dataset pair

16

EXPERIMENTS AND RESULTS

cent Results Computational time

17

CONCLUSIONS

cent Spectra libraries adds additional information forbetter peptide sequencing performance

cent Full length accuracy increases up to 11 whenonly the highest ranked candidates output

cent Computation time saves up to 40 when mergedlong peptide tags were used

18

THANK YOU19

FUTURE WORK PLAN

cent Experiments Compare with more methods Use more datasets

cent Algorithm improvement Other ways of selecting signal peaks to merge spectra

cent Spectrum specific features

cent Method development Multiple spectra sequencing

20

cent Peptides Peptides are organic compounds consisting of 2 or more amino

acids There are 20 standard amino acids in nature Two amino acids connect to each other by forming a peptide

bond and losing a molecule of water

21

BACKGROUND

Amino acid structure Dec 20 2013httpenwikipediaorgwikiAmino_acid

Formation of a peptide from two amino acids Dec 20 2013 httpwwwscientificpsychiccomfitnessaminoacidshtml

cent ECDETD spectra New data that has been available in recent years C-Ions and z-ions are dominant in ETDECD spectra Ions lose small molecules such as ammonia (NH3) and water

(H2O)

BACKGROUND

22

cent CIDHCD spectra Common ion types and mass calculation

BACKGROUND

23

cent ECDETD spectra Common ion types and mass calculation

BACKGROUND

24

cent Complimentary ion pairs In CIDHCD

bi-ion + yn-i-ion = MH+ +H = Mparent + 2H In ECDETD

ci-ion + zn-i-ion = Mparent + 3H(ci-1)-ion + (zn-i+1)-ion = Mparent + 3H

BACKGROUND

25

PROPOSED METHOD

- Spectra merging: peak selection (within each spectrum)
  1. Find all length-2 paths and output the middle peak
     - Relationship: amino acid mass difference
     - Ions: regular or with loss of small molecules
  2. Find complementary ion pairs and output them
     - Ions: consider different charges
  - Output peaks to form spectra S

[Figure: peaks i, j, k separated by amino acid masses m_aa1 (between i and j) and m_aa2 (between j and k); the middle peak j is output]
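The two selection rules can be sketched in a few lines of Python. This is a simplified illustration, not the authors' code: it assumes singly charged peaks given as m/z values, ignores neutral losses and higher charge states (both of which the slide says the method considers), and uses only the CID/HCD complementary sum. The function names and the tolerance `tol` are hypothetical.

```python
from itertools import combinations

# Distinct monoisotopic residue masses in daltons (standard values, assumed).
RESIDUE_MASSES = [
    57.02146, 71.03711, 87.03203, 97.05276, 99.06841, 101.04768,
    103.00919, 113.08406, 114.04293, 115.02694, 128.05858, 128.09496,
    129.04259, 131.04049, 137.05891, 147.06841, 156.10111, 163.06333,
    186.07931,
]
PROTON = 1.00728


def is_residue_gap(delta, tol):
    """True if a mass difference matches one amino acid residue."""
    return any(abs(delta - m) <= tol for m in RESIDUE_MASSES)


def select_peaks(peaks, parent_mass, tol=0.02):
    """Keep middle peaks of length-2 paths and members of complementary pairs."""
    peaks = sorted(peaks)
    kept = set()
    # Rule 1: peak j is kept if some lighter peak i and some heavier peak k
    # are each one residue mass away from it (path i -> j -> k).
    for j, mid in enumerate(peaks):
        has_left = any(is_residue_gap(mid - p, tol) for p in peaks[:j])
        has_right = any(is_residue_gap(p - mid, tol) for p in peaks[j + 1:])
        if has_left and has_right:
            kept.add(mid)
    # Rule 2: two peaks whose sum equals M_parent + 2H form a b/y pair.
    target = parent_mass + 2 * PROTON
    for p, q in combinations(peaks, 2):
        if abs(p + q - target) <= tol:
            kept.update((p, q))
    return sorted(kept)
```

For example, with peaks at 100.0, 171.03711 and 228.05857, the middle peak is kept because its neighbours lie one alanine and one glycine mass away.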

PROPOSED METHOD (HAVE PROBLEMS)

- De novo sequencing
  - Tags
    - Find 3-tags from each of the original spectra (each denoted $S$)
  - Find peaks associated with a tag t in S
  - Calculate AACs separated by t (use $P$)
    - Threshold control if more tags are needed
  - Extend paths to form graphs with multiple edge types (GMETs) with AAC restriction
    - Ion types including b/y- and c/z-ions
    - Ion type determined by previous step
  - Assemble tags and partial peptides from GMETs into candidate peptides
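A 3-tag corresponds to four peaks whose three consecutive mass gaps each match an amino acid. The Python sketch below shows one simple way to enumerate such tags by depth-first extension. It is an assumption-laden illustration, not the method on the slide: peaks are treated as singly charged m/z values from a single ion series, there is no scoring or threshold control, and `find_tags` and `match_residue` are hypothetical names. Isoleucine is folded into "L" because the two share a mass.

```python
# Monoisotopic residue masses in daltons (standard values, assumed).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def match_residue(delta, tol):
    """Return the residue letter whose mass matches delta, or None."""
    for aa, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return aa
    return None


def find_tags(peaks, length=3, tol=0.02):
    """List (tag, peak path) for every chain of `length` residue-sized gaps."""
    peaks = sorted(peaks)
    tags = []

    def extend(idx, letters, path):
        if len(letters) == length:
            tags.append(("".join(letters), tuple(path)))
            return
        # Try every heavier peak as the next node on the path.
        for nxt in range(idx + 1, len(peaks)):
            aa = match_residue(peaks[nxt] - peaks[idx], tol)
            if aa is not None:
                extend(nxt, letters + [aa], path + [peaks[nxt]])

    for start in range(len(peaks)):
        extend(start, [], [peaks[start]])
    return tags
```

The letters are read in order of increasing m/z, so a tag found on a C-terminal ion series (y or z) would need to be reversed before it is compared with one found on an N-terminal series (b or c).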

EXPERIMENTS AND RESULTS

- Experiment: data
  - Select spectra pairs having the same peptide sequences
  - Results inferred by two previously published methods, for HCD and ExD data respectively


Page 23: SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE HCD ETDadmis.fudan.edu.cn/giw2016/slides/session-01/2-PresentationGIW20… · ¢ human peptide spectral library (from chemdata.nist.gov) of

cent CIDHCD spectra Common ion types and mass calculation

BACKGROUND

23

cent ECDETD spectra Common ion types and mass calculation

BACKGROUND

24

cent Complimentary ion pairs In CIDHCD

bi-ion + yn-i-ion = MH+ +H = Mparent + 2H In ECDETD

ci-ion + zn-i-ion = Mparent + 3H(ci-1)-ion + (zn-i+1)-ion = Mparent + 3H

BACKGROUND

25

PROPOSED METHOD

cent Spectra merging Peak selection (within each spectrum)

1 Find all length 2 paths output middle peakcent Relationship amino acid difference cent Ions regular or with loss of small molecules

2 Find complimentary ion pairs and outputcent Ions consider different charges

Output peaks to form spectra S26

k

maa2

j

maa1

i

output

PROPOSED METHOD (HAVE PROBLEMS)cent De novo sequencing

Tagscent Find 3-tags from each of the original spectrum (denoted as 119878 and 119878)

Find peaks associated with a tag t in S Calculate AACs separated by t (use 119875amp )

cent Threshold control if more tags needed Extend paths to form Graphs with multiple edge type

(GMETs) with AAC restrictioncent Ion types including by- and cz-ionscent Ion type determined by previous step

Assembling tags and partial peptides from GMETs to be candidate peptides 27

EXPERIMENTS AND RESULTS

• Experiment data
  • Select spectra pairs having the same peptide sequences
    Results inferred by two previously published methods, for HCD and ExD data respectively

