22/09/2015 v 1.0.0 LC-MS Pre-processing (xcms) W4M Core Team


22/09/2015 v 1.0.0

LC-MS

Pre-processing (xcms)

W4M Core Team


SECTION 1

Acquisition files pre-processing with xcms: extraction, alignment and retention time drift correction.

Extraction with XCMS

Web: http://metlin.scripps.edu/xcms/

Forums:

  • https://groups.google.com/forum/#!forum/xcms

  • http://metabolomics-forum.com

  • R-based software

  • Free

  • Many parameters to tune

  • No graphical interface

  • Requires writing an R script

  • xcmsOnline web service

LC-MS Data

  • What is provided by the LC-MS devices…

  • What we want for data analysis

Extraction with XCMS

• Extraction

    • Extraction of ions in each sample independently.

    • Baseline correction

    • Creation of extracted ion chromatograms (EIC)

• Grouping (alignment)

    • Each ion is aligned across all samples

• Retention time correction (optional)

    • On the basis of "well-behaved" peaks, a LOESS (non-linear) regression is used to correct the retention time of each ion in order to improve the alignment. Useful for HPLC, less useful for UPLC.

• Fill peaks

    • Replace missing data with the baseline value

• Statistics and visualisation (optional)

• CAMERA

    • For annotation of adducts, fragments and isotopes


xcms Extraction algorithms

• matchedFilter is dedicated to centroid or profile low-resolution MS data

• centWave is dedicated to centroid high-resolution MS data


xcmsSet parameters


xcms extraction matchedFilter algorithm

• Extraction of ions in each sample independently.

• Creation of extracted ion base peak chromatograms (EIBPC)

• Filtering with a second-derivative Gaussian model

• Intensity is reported as the integrated peak area or the peak height

Smith C., Anal. Chem., 2006
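The filtering step can be illustrated with a minimal sketch, assuming an EIC sampled at regular scan intervals. The function names and the toy chromatogram are illustrative only; this is not the xcms implementation, just the general idea of convolving a chromatogram with a second-derivative Gaussian whose width is derived from fwhm.

```python
import math

def second_derivative_gaussian(sigma, half_width):
    """Negated second derivative of a Gaussian, sampled at integer scan offsets."""
    kernel = []
    for x in range(-half_width, half_width + 1):
        g = math.exp(-x * x / (2.0 * sigma * sigma))
        kernel.append((1.0 - (x * x) / (sigma * sigma)) * g)
    # Subtract the mean so that a constant baseline gives a zero response.
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]

def matched_filter(eic, fwhm_scans):
    """Convolve an EIC with the kernel; fwhm_scans is the expected peak FWHM in scans."""
    sigma = fwhm_scans / 2.3548  # FWHM = 2 * sqrt(2 * ln 2) * sigma
    half_width = int(math.ceil(4 * sigma))
    kernel = second_derivative_gaussian(sigma, half_width)
    filtered = []
    for i in range(len(eic)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + j - half_width
            if 0 <= idx < len(eic):  # ignore kernel points outside the chromatogram
                acc += k * eic[idx]
        filtered.append(acc)
    return filtered

# Toy EIC (assumed data): flat baseline of 100 counts plus a peak centred on scan 30.
eic = [100.0 + 1000.0 * math.exp(-((s - 30) ** 2) / (2 * 3.0 ** 2)) for s in range(60)]
filtered = matched_filter(eic, fwhm_scans=7)
apex = max(range(len(filtered)), key=lambda i: filtered[i])
print("peak apex found at scan", apex)
```

The filtered signal is largest where the chromatogram locally resembles a peak of the expected width, which is why a poorly chosen fwhm degrades detection.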

xcms extraction matchedFilter algorithm

• Extraction of ions in each sample independently. Baseline correction

• Creation of extracted ion chromatograms (matchedFilter)

Parameters labelled on the figure: step, steps, fwhm

Smith C., Anal. Chem., 2006

xcms extraction matchedFilter algorithm

Figure: influence of the fwhm parameter (full width at half maximum) on the extracted ion chromatogram (EIC).

Tautenhahn, BioInformatics, 2008


xcms Extraction centWave algorithm

As for matchedFilter, ions are extracted in each sample independently.

"Mass traces", or regions of interest (ROI), in which the m/z deviation between consecutive scans is less than a defined value are located first.

Then, chromatographic peaks are detected.

Two main parameters have to be set:

- ppm, according to the mass accuracy

- peakwidth, according to the chromatographic peak width range.
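To make the role of ppm concrete, here is a minimal sketch of mass-trace (ROI) detection, assuming centroided scans given as lists of (m/z, intensity) pairs. The function names, the min_scans argument and the toy scans are assumptions made for illustration; the real centWave code is more elaborate.

```python
def ppm_tolerance(mz, ppm):
    """Absolute m/z tolerance (in Da) corresponding to a relative tolerance in ppm."""
    return mz * ppm / 1e6

def find_rois(scans, ppm, min_scans):
    """Chain centroids of consecutive scans whose m/z stays within `ppm` of the trace mean.

    Returns a list of ROIs, each a list of (scan_index, mz, intensity).
    """
    open_rois = []  # traces that were extended in the previous scan
    finished = []
    for scan_index, scan in enumerate(scans):
        still_open = []
        used = set()
        for roi in open_rois:
            mean_mz = sum(p[1] for p in roi) / len(roi)
            tol = ppm_tolerance(mean_mz, ppm)
            best = None  # index of the closest unused centroid within tolerance
            for i, (mz, _) in enumerate(scan):
                if i in used or abs(mz - mean_mz) > tol:
                    continue
                if best is None or abs(mz - mean_mz) < abs(scan[best][0] - mean_mz):
                    best = i
            if best is None:
                finished.append(roi)  # trace interrupted: close it
            else:
                used.add(best)
                roi.append((scan_index, scan[best][0], scan[best][1]))
                still_open.append(roi)
        # Centroids that extended no trace start new candidate traces.
        for i, (mz, inten) in enumerate(scan):
            if i not in used:
                still_open.append([(scan_index, mz, inten)])
        open_rois = still_open
    finished.extend(open_rois)
    return [roi for roi in finished if len(roi) >= min_scans]

# Toy data (assumed): one stable trace near m/z 452.305 and two isolated noise centroids.
scans = [
    [(452.3050, 1e4), (300.1000, 5e2)],
    [(452.3053, 5e4)],
    [(452.3049, 9e4), (300.2000, 4e2)],
    [(452.3052, 4e4)],
]
rois = find_rois(scans, ppm=5, min_scans=3)
print(len(rois), "ROI kept")
```

At m/z 452.3, a 5 ppm tolerance is only about 0.0023 Da, which is why ppm has to match the mass accuracy of the instrument.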


xcms centwave parameters


xcms Extraction centWave algorithm

centWave chromatographic peak detection (same example as for matchedFilter).

Tautenhahn, BioInformatics, 2008


xcms centwave parameters

xcms forum: how to choose peakwidth?

"The main purpose of the peakwidth parameter is to roughly estimate the peak width range, this parameter is not a threshold. The wavelets used for peak detection are calculated from this parameter. If you use HPLC and your peaks are normally 20 - 60 s wide (base peak width), just go with that, i.e. peakwidth=c(20,60). centWave will still detect peaks that are 15 s or 80 s wide! Important: Do not choose the minimum peak width too small, it will not increase sensitivity, but cause peaks to be split."

Example: peak width ~ 45 s

  • Using peakwidth = c(20,60), the peak will be split into three peaks, each detected as a separate ~10 s wide peak (since they are separated by a local minimum).

  • Using peakwidth = c(20,120) will keep the peak intact.

Centwave parameters effect on extraction

Figure panels: number of extracted peaks; number of extracted groups (ions); significant groups (QC pools vs blanks); duplicate groups (same nominal m/z with RT +/- 30 s).

Centwave mzdiff effect on extraction

Figure panels: number of extracted peaks; number of extracted groups (ions); significant groups (QC pools vs blanks); duplicate groups (same nominal m/z with RT +/- 30 s).

xcms extraction output

sampleMetadata.tsv is initialized at this step. It must contain all the information needed for further analyses: batch correction and statistical analyses.

This file must be downloaded, completed with this information, and then uploaded again.


group parameters


xcms alignment group

• The width of the peak density chromatogram determines the number of peaks that are included in the same group. The parameter used, bw, corresponds to the standard deviation of the peak density chromatogram. This parameter can be interpreted as a retention time window.

• The other parameter, mzwid, corresponds to the mass window.

Figure labels: bw = 30 sec; bw = 10 sec; mzwid

xcms group output

mzwid defines the m/z intervals.

bw defines the width of the Gaussian curve.
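The interplay of mzwid and bw can be sketched as follows. This is a simplified illustration, assuming peaks pooled from all samples as (m/z, RT) pairs; the function names, the fixed (non-overlapping) m/z slices and the toy peak list are assumptions and do not reproduce the actual xcms grouping code.

```python
import math

def density(rts, bw, grid):
    """Gaussian kernel density of peak retention times (bw = kernel standard deviation)."""
    return [sum(math.exp(-0.5 * ((t - rt) / bw) ** 2) for rt in rts) for t in grid]

def group_peaks(peaks, mzwid, bw, rt_step=1.0):
    """Group (mz, rt) peaks into features: m/z slices first, then density maxima in RT."""
    # 1. Slice the m/z axis into bins of width mzwid.
    slices = {}
    for mz, rt in peaks:
        slices.setdefault(int(mz // mzwid), []).append((mz, rt))
    groups = []
    for _, members in sorted(slices.items()):
        rts = [rt for _, rt in members]
        n = int((max(rts) - min(rts)) / rt_step) + 1
        grid = [min(rts) + i * rt_step for i in range(n)]
        dens = density(rts, bw, grid)
        # 2. Local maxima of the density chromatogram are the group centres.
        centres = [grid[i] for i in range(len(grid))
                   if (i == 0 or dens[i] > dens[i - 1])
                   and (i == len(grid) - 1 or dens[i] >= dens[i + 1])]
        # 3. Each peak joins the nearest centre.
        assigned = {}
        for mz, rt in members:
            c = min(centres, key=lambda x: abs(x - rt))
            assigned.setdefault(c, []).append((mz, rt))
        groups.extend(assigned[c] for c in sorted(assigned))
    return groups

# Toy data (assumed): two ions of the same m/z eluting near 100 s and 120 s in 3 samples.
peaks = [(200.010, 98.0), (200.011, 100.0), (200.012, 102.0),
         (200.010, 118.0), (200.011, 120.0), (200.012, 122.0)]
print(len(group_peaks(peaks, mzwid=0.025, bw=5)), "groups with bw = 5")
print(len(group_peaks(peaks, mzwid=0.025, bw=30)), "group(s) with bw = 30")
```

With the narrow bandwidth the two ions stay separate, while the wide bandwidth smooths the density into a single maximum and merges them.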

xcms group output

Two distinct m/z values are merged into one group when mzwid and bw are too large.

group parameters effect on extraction

Figure panels: number of extracted groups (ions); significant groups (QC pools vs blanks); duplicate groups (same nominal m/z with RT +/- 30 s).

Example of xcms parameters

Instrument                      ppm   peakwidth   bw   mzwid   prefilter
HPLC/Q-TOF                      30    c(10,60)    5    0.025   c(0,0)
HPLC/Q-TOF (high resolution)    15    c(10,60)    5    0.015   c(0,0)
HPLC/Orbitrap                   2.5   c(10,60)    5    0.015   c(3,5000)
UPLC/Q-TOF                      30    c(5,20)     2    0.025   c(0,0)
UPLC/Q-TOF (high resolution)    15    c(5,20)     2    0.015   c(0,0)
UPLC/Orbitrap                   2.5   c(5,20)     2    0.015   c(3,5000)

UPLC: ultraperformance liquid chromatography.


xcms workflow retcor


xcms retcor output

Once retcor has corrected the retention times, it must be followed by a second group step.
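The retention time correction can be illustrated with a minimal sketch of a LOESS-style fit, assuming that for one sample we know the retention times of a few "well-behaved" peaks and the median retention time of their groups. The function names, the sign convention (deviation = sample RT minus median RT) and the numbers are assumptions for illustration, not the xcms code.

```python
import math

def loess(xs, ys, x0, span=0.5):
    """Locally weighted linear regression (tricube weights) evaluated at x0."""
    n = len(xs)
    k = max(2, int(math.ceil(span * n)))
    dists = sorted(abs(x - x0) for x in xs)
    # Bandwidth: slightly more than the distance to the k-th nearest point.
    h = dists[k - 1] * 1.01 + 1e-12
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Assumed data for one sample: RT of well-behaved peaks and the median RT of their groups.
rt_sample = [60.0, 120.0, 180.0, 240.0, 300.0]
rt_median = [58.0, 117.0, 176.0, 235.0, 294.0]
deviation = [s - m for s, m in zip(rt_sample, rt_median)]

def correct_rt(rt):
    """Subtract the locally estimated drift from a raw retention time."""
    return rt - loess(rt_sample, deviation, rt)

print(round(correct_rt(150.0), 3))
```

Because every retention time in the sample is shifted by the fitted drift, the peak groups computed before the correction no longer match the corrected data, hence the second group step.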


xcms fillPeaks

Filling method:

  • «chrom» for LC-MS

  • «MSW» for direct infusion
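As a rough illustration of the «chrom» idea, the sketch below integrates the raw signal of one sample inside the m/z and RT region of a group for which no peak was detected in that sample. The data layout (a list of (rt, centroids) scans), the function name and the trapezoid integration are assumptions; xcms computes its own intensity measure.

```python
def fill_peak(scans, mzmin, mzmax, rtmin, rtmax):
    """Integrate the raw signal inside an m/z-RT box (trapezoid rule on the EIC).

    `scans` is a list of (rt, [(mz, intensity), ...]) sorted by rt.
    """
    # Build the extracted ion chromatogram restricted to the box.
    eic = []
    for rt, centroids in scans:
        if rtmin <= rt <= rtmax:
            signal = sum(inten for mz, inten in centroids if mzmin <= mz <= mzmax)
            eic.append((rt, signal))
    # Integrate it over time.
    area = 0.0
    for (rt0, i0), (rt1, i1) in zip(eic, eic[1:]):
        area += (i0 + i1) / 2.0 * (rt1 - rt0)
    return area

# Toy raw data (assumed): weak signal near m/z 200.01 that peak picking missed.
scans = [
    (10.0, [(200.01, 5.0)]),
    (11.0, [(200.01, 7.0), (300.00, 99.0)]),
    (12.0, [(200.01, 5.0)]),
]
print(fill_peak(scans, 200.00, 200.02, 10.0, 12.0))
```

When nothing is present in the region, the integral falls to the noise or baseline level, so the data matrix receives a small value instead of a missing one.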


22/09/2015 v 1.0.0

MS data processing

Report creation and Annotations

Yann GUITTON

xcms diffreport & CAMERA

CAMERA, another R package integrated in Galaxy

The R package CAMERA is a Collection of Algorithms for MEtabolite pRofile Annotation.

Its primary purpose is the annotation and evaluation of LC-MS data. It includes algorithms for annotation of isotope peaks, adducts and fragments in peak lists. Additional methods cluster mass signals that originate from a single metabolite, based on rules for mass differences and peak shape comparison.

xcms diffreport & CAMERA

CAMERA.annotate = CAMERA::annotateDiffreport

In detail:

1- xcms::diffreport: generates the feature list, EICs, boxplots and statistics

Figure: feature table (features labelled MxTy), with EICs and boxplots for m/z = Mx (axes: m/z, intensity).

xcms diffreport & CAMERA

CAMERA.annotate = CAMERA::annotateDiffreport

In LC-MS ESI, features are usually not alone: the number of features is not equal to the number of detected molecules.

Search in raw data

Figure: extracted ion chromatogram for m/z = 452.30000-452.32000, RT 11.14 - 12.50 min, main peak at 11.92 min (sample VB-2J-F-3; F: FTMS + c ESI Full ms [100.00-1000.00]; NL: 1.21E6; axes: time (min), relative abundance).

Figure: mass spectrum of VB-2J-F-3, scan #828, RT 11.94 (NL: 1.15E6), with labelled peaks at m/z 450.49002, 451.57953, 452.30527, 453.30850, 454.31198 and 456.63266; annotation: C13 isotopes (axes: m/z, relative abundance).
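The C13 isotope pattern in the spectrum above can be checked with a short sketch that looks for pairs of m/z values separated by the 13C-12C mass difference. The function name, the 5 ppm tolerance and the singly charged assumption are illustrative choices; CAMERA applies further criteria, such as intensity ratios, that are not reproduced here.

```python
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C, in Da

def find_isotope_pairs(mzs, ppm=5.0, charge=1):
    """Return (light, heavy) m/z pairs spaced by one 13C-12C difference."""
    pairs = []
    ordered = sorted(mzs)
    for i, light in enumerate(ordered):
        expected = light + C13_C12_DELTA / charge
        tol = expected * ppm / 1e6
        for heavy in ordered[i + 1:]:
            if abs(heavy - expected) <= tol:
                pairs.append((light, heavy))
    return pairs

# m/z values labelled on the spectrum of scan #828.
spectrum = [450.49002, 451.57953, 452.30527, 453.30850, 454.31198, 456.63266]
pairs = find_isotope_pairs(spectrum)
for light, heavy in pairs:
    print(f"{light:.5f} -> {heavy:.5f}")
```

Three features (452.30527, 453.30850 and 454.31198) are linked as one isotope series here, which illustrates why the number of features exceeds the number of molecules.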

xcms diffreport & CAMERA

CAMERA.annotate = CAMERA::annotateDiffreport

In detail:

2- CAMERA::xsAnnotate: reads the xcms object

3- CAMERA::groupFWHM: searches for co-eluting features, based on RT

4- CAMERA::findIsotopes: searches for isotopic relations between features (C12/C13)

5- CAMERA::groupCorr: tries to improve the separation of co-eluting features

6- CAMERA::findAdducts: searches for known adducts and fragments: [M+Na]+, [M+H-H2O]+, ….

Figure: annotated spectrum. Some peaks are not annotated, but have low intensity. The annotation table lists the value 682.589 for each of: [M+H]+, [M+H+NH3]+, [M+Na]+, [M+K]+, [M+NH4+ACN]+, [2M+Na]+, [2M+K]+.
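The adduct search can be sketched as a rule table: each rule says how many molecules M and which mass shift produce the observed m/z, and two co-eluting features are annotated when they imply the same neutral mass. The rule values below are commonly tabulated, rounded mass shifts for singly charged positive ions; the function names, the reduced rule list, the tolerance and the example masses (chosen to correspond to glucose, M = 180.063388) are assumptions, not CAMERA's own rule set.

```python
# name: (number of molecules M, mass shift in Da), positive mode, charge 1
ADDUCT_RULES = {
    "[M+H]+":     (1, 1.007276),
    "[M+NH4]+":   (1, 18.033823),
    "[M+Na]+":    (1, 22.989218),
    "[M+K]+":     (1, 38.963158),
    "[M+H-H2O]+": (1, 1.007276 - 18.010565),
    "[2M+Na]+":   (2, 22.989218),
    "[2M+K]+":    (2, 38.963158),
}

def adduct_mz(neutral_mass, adduct):
    """m/z expected for a neutral mass under a given rule."""
    n_m, shift = ADDUCT_RULES[adduct]
    return n_m * neutral_mass + shift

def neutral_mass(mz, adduct):
    """Neutral mass implied by an observed m/z under a given rule."""
    n_m, shift = ADDUCT_RULES[adduct]
    return (mz - shift) / n_m

def consistent_annotations(mz_a, mz_b, ppm=5.0):
    """Rule pairs for which both m/z values imply the same neutral mass."""
    hits = []
    for name_a in ADDUCT_RULES:
        mass = neutral_mass(mz_a, name_a)
        for name_b in ADDUCT_RULES:
            if name_b == name_a:
                continue
            expected = adduct_mz(mass, name_b)
            if abs(expected - mz_b) <= mz_b * ppm / 1e6:
                hits.append((name_a, name_b))
    return hits

# Two co-eluting features (assumed values).
hits = consistent_annotations(181.070664, 203.052606)
print(hits)
```

An adduct that is missing from the rule table can never be annotated, which is one reason why some peaks remain without annotation.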

xcms diffreport & CAMERA

CAMERA adduct annotation: defining rules

  • Green = annotation OK

  • Red = no annotation, wrong annotation, or adduct not in CAMERA

Figure labels: [M+H-NH3]+; [M+H-NH3-H2O]+ ?; zoom

xcms diffreport & CAMERA

Sometimes co-elutions are not fully resolved: have a look at your data.

xcms diffreport & CAMERA

Additional information added to the diffreport by CAMERA

Figure labels: MxTy; annotations

xcms diffreport & CAMERA

CAMERA.annotate = CAMERA::annotateDiffreport

In detail:

1- xcms::diffreport: generates the feature list, EICs, boxplots and statistics

2- CAMERA::xsAnnotate: reads the xcms object

3- CAMERA::groupFWHM: searches for co-eluting features, based on RT

4- CAMERA::findIsotopes: searches for isotopic relations between features (C12/C13)

5- CAMERA::groupCorr: tries to improve the separation of co-eluting features

6- CAMERA::findAdducts: searches for known adducts and fragments: [M+Na]+, [M+H-H2O]+, ….

Many steps = quite a lot of parameters
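Step 3 (grouping of co-eluting features by retention time) can be illustrated with a simplified sketch: the most intense remaining feature seeds a group, and every feature whose RT falls within a window derived from the seed's peak width joins it. The tuple layout, the window definition (perfwhm times the seed FWHM, centred on the seed) and the feature values are assumptions made for this example and may differ from what CAMERA::groupFWHM actually computes.

```python
def group_coeluting(features, perfwhm=0.6):
    """Group features by retention time.

    `features` is a list of (name, rt, fwhm, intensity) tuples.
    Returns a list of groups, each a sorted list of feature names.
    """
    remaining = sorted(features, key=lambda f: f[3], reverse=True)
    groups = []
    while remaining:
        seed = remaining[0]  # most intense feature not yet grouped
        half_window = perfwhm * seed[2] / 2.0
        members = [f for f in remaining if abs(f[1] - seed[1]) <= half_window]
        groups.append(sorted(f[0] for f in members))
        remaining = [f for f in remaining if f not in members]
    return groups

# Invented features: three co-eluting near 716 s and one eluting at 740 s.
features = [
    ("M181T716", 716.0, 10.0, 1.0e6),
    ("M182T716", 716.5, 9.0, 2.0e5),
    ("M203T717", 717.2, 11.0, 1.0e5),
    ("M452T740", 740.0, 10.0, 5.0e5),
]
groups = group_coeluting(features)
print(groups)
```

Isotope and adduct searches are then run within each group only, so a window that is too wide mixes unrelated compounds and one that is too narrow splits a single compound.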



More parameters


xcms diffreport & CAMERA: outputs

Output files

• xset.annotate.variableMetadata.tsv

For each metabolite (row): the value of the intensity in each sample, fold, anova, mzmed, mzmin, mzmax, rtmed, rtmin, rtmax, npeaks, isotopes, adduct and pcgroup.

• xset.annotate.dataMatrix.tsv

A tabular file which gives, for each metabolite (row), the value of the intensity in each sample (column).

• xset.annotate.zip

It contains filebase_eic, filebase_box and filebase.tsv for one condition vs another (Anova analysis).

• xset.annotate.Rdata (rdata.camera.quick, rdata.camera.positive or rdata.camera.negative)

Rdata file that can be used outside Galaxy, in R.
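Once downloaded, the variableMetadata file can be explored outside Galaxy with any tool that reads tab-separated text. The sketch below groups features by pcgroup using only the Python standard library. The rows, the "name" column header and the annotation strings are invented for illustration; only the mzmed, rtmed, isotopes, adduct and pcgroup column names come from the list above.

```python
import csv
import io
from collections import defaultdict

# Invented excerpt standing in for xset.annotate.variableMetadata.tsv.
EXAMPLE_TSV = (
    "name\tmzmed\trtmed\tisotopes\tadduct\tpcgroup\n"
    "M181T716\t181.0707\t716.2\t[1][M]+\t[M+H]+ 180.063\t1\n"
    "M182T716\t182.0740\t716.3\t[1][M+1]+\t\t1\n"
    "M203T716\t203.0526\t716.1\t\t[M+Na]+ 180.063\t1\n"
    "M452T900\t452.3053\t900.0\t\t\t2\n"
)

def features_by_pcgroup(handle):
    """Map each pcgroup to the names of the features it contains."""
    reader = csv.DictReader(handle, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row["pcgroup"]].append(row["name"])
    return dict(groups)

groups = features_by_pcgroup(io.StringIO(EXAMPLE_TSV))
for pcgroup, names in sorted(groups.items()):
    print(pcgroup, names)
```

With a real file, `io.StringIO(EXAMPLE_TSV)` would be replaced by an opened file handle; counting pcgroups rather than rows gives a first estimate of how many compounds lie behind the features.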


And next…database search!