Proteomics, Metabolomics And Mass Spectrometry
Steven S. Gross Department of Pharmacology, x66257
Clinical and Research Genomics - 4/30/2014
Plan for Today’s Lecture!
- Describe enabling mass spectrometry (MS)-based strategies & principles used for analysis of biomolecules
- Show how MS can be applied to broadly identify proteins in complex biological mixtures – PROTEOMICS
- Show how untargeted LC-MS can be used to broadly discover changes in small molecule expression in extracts of cells, tissues and biofluids – METABOLOMICS
PROTEOME
The Entire Protein Complement of a Genome
A major determinant of cell phenotype
Dynamically changing with cell cycle, aging, environment, differentiation, interventions, acute insults, disease
Much more complex than the one gene, one protein adage!
All proteins predicted from the genome (22K ORFs predicted in humans) + proteins arising by:
- Alternative mRNA splicing
- Posttranslational protein processing (proteolysis and > 40 known chemical modifications)
Undefined Complexity of the Proteome
Defines cell phenotype on a moment-to-moment basis
Not predicted by any other “omic” info: Genomic, Transcriptomic or Proteomic
A biomedical frontier that is largely undefined & only partly “self” determined
Elucidation is a challenge that will yield breakthrough knowledge & disease cures
METABOLOME The entire Metabolite complement of a Genome
Mass Spectrometers Weigh Molecules
Unit of mass is the “Dalton” (Da, amu) = 1/12 of the mass of 12C, the most abundant isotope of carbon
1 Da = 1 g / Avogadro’s number = 1.6605 x 10^-24 g
Ions are detected by MS as m/z, not absolute mass!
Nomenclature Used to Describe Molecular Mass
Nominal Mass considers only unitary mass of ions
Mono-isotopic Mass considers mass defects, e.g., H = 1.00782
Average Mass considers mass defects and isotopic abundance, e.g., 1.109% 13C vs. 12C; 0.38% 15N vs. 14N
Alanine (A): 3 C = 36; 2 O = 32; 1 N = 14; 7 H = 7; total = 89; minus H2O (-18) = residue mass 71
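The alanine arithmetic above can be repeated for all three mass conventions. The following is a minimal sketch, not part of the original slides: the element masses are standard rounded values, and the names `ELEMENTS` and `masses` are made up for illustration.

```python
# Per-element masses: (nominal, monoisotopic, average). Rounded standard values.
ELEMENTS = {
    "C": (12, 12.000000, 12.011),
    "H": (1, 1.007825, 1.008),
    "N": (14, 14.003074, 14.007),
    "O": (16, 15.994915, 15.999),
}

def masses(formula):
    """Return (nominal, monoisotopic, average) mass for a dict such as {'C': 3}."""
    totals = [0.0, 0.0, 0.0]
    for element, count in formula.items():
        for k in range(3):
            totals[k] += ELEMENTS[element][k] * count
    return tuple(totals)

alanine = {"C": 3, "H": 7, "N": 1, "O": 2}  # free amino acid, C3H7NO2
water = {"H": 2, "O": 1}                    # lost when a peptide bond forms

free = masses(alanine)
h2o = masses(water)
# Residue mass = free amino acid minus one water, in each convention.
residue = tuple(a - b for a, b in zip(free, h2o))

print("free alanine (nominal, mono, avg):", free)
print("alanine residue (nominal, mono, avg):", residue)
```

The nominal residue mass reproduces the 71 worked out above, and the monoisotopic value agrees with the residue-mass table entry for alanine to within rounding.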
Residue Masses for Amino Acids
Enabling Breakthroughs for Protein Identification by Mass Spectrometry
Soft ionization techniques enabled very large molecules to be studied without excess fragmentation
- MALDI – matrix-assisted laser desorption ionization
- ESI – electrospray ionization
Establishment of protein sequence databases (from translation of the genome)
Development of powerful search engines for databases
MALDI
ESI
Mass Spectrometry Can be Used To Inform on Protein Structure
- Confirm identity of a predicted protein
- Identify an unknown protein – peptide fingerprinting
- Identify an unknown protein – fragmentation pattern of one or more component peptides
- Specify posttranslational protein modifications and sites
- Elucidate protein binding partners in a coprecipitate – learn the composition of multi-protein assemblies
- Quantify differential protein expression
Protein mass spectrometry may be:
- Top-down – Analyzing intact proteins
- Bottom-up – Analyzing peptides in a protein digest
C381H856N107O114S6, monoisotopic mass = 8676.15
Sample Preparation for Bottom-up Mass Spectrometry
Enzymatic Digest and Fractionation
Silver-stained 2-D Gel of Soluble Yeast Proteins - Gygi et al., PNAS 2000
Single Stage MS
MS
Trypsin-Digested Protein Spectrum
To Determine Peptide Mass, Must Know Charge: Isotopic Resolution of a Singly-Charged Peptide
m/z=1285.508 (mass of ion/charge=1285.508/1)
mass of ion=1285.508
Isotopic Resolution of a Doubly-Charged Peptide
m/z=785.865 (z=2)
M = (m/z x z) - z = (785.865 x 2) - 2 = 1571.73 - 2 (each of the z protons adds ~1 Da)
mass of uncharged peptide = 1569.73
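The two isotope-resolution examples can be written as a short calculation. This is a sketch under stated assumptions: ions are taken to be protonated, [M + zH]z+, and the proton mass (1.007276 Da) and 13C-12C spacing (1.00335 Da) are standard constants that the slides round to 1. The function names are illustrative only.

```python
PROTON = 1.007276      # mass of one proton, Da
ISOTOPE_GAP = 1.00335  # 13C minus 12C mass difference, Da

def charge_from_spacing(spacing):
    """Infer charge z from the m/z gap between adjacent isotope peaks."""
    return round(ISOTOPE_GAP / spacing)

def neutral_mass(mz, z):
    """Mass of the uncharged molecule for a protonated ion [M + zH]z+."""
    return z * (mz - PROTON)

# Isotope peaks ~1.0 m/z apart imply z = 1; ~0.5 m/z apart imply z = 2.
print(charge_from_spacing(1.0), charge_from_spacing(0.5))

# The singly and doubly charged examples from the slides.
print(round(neutral_mass(1285.508, 1), 3))
print(round(neutral_mass(785.865, 2), 3))
```

With the exact proton mass the doubly charged example comes out about 0.015 Da below the 1569.73 obtained with the slide's "subtract z" shortcut. That difference matters only when working at accurate-mass precision.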
Protein Identification from Peptide Maps
What Does the Search Engine Do?
Database Searching: MS
Information from Peptide Map Expts
Incubate gel band with trypsin, extract tryptic peptides, desalt
Peptide ion masses determined by MS
Protein Identification by In-gel Trypsinolysis + MS
54 kDa
45 kDa
Excise band
MKKCTILVVASLLLVNSLLPGYGQNKIIQA QRNLNELCYNEGNDNKLYHVLNSKNGKIYN RNTVNRLLPMLRRKKNEKKNEKIERNNKLK QPPPPPNPNDPPPPNPNDPPPPNPNDPPPP
NPNDPPPPNANDPPPPNANDPAPPNANDPA PPNANDPAPPNANDPAPPNANDPAPPNAND PAPPNANDPPPPNPNDPAPPQGNNNPQPQP
RPQPQPQPQPQPQPQPQPQPRPQPQPQPGG NNNNKNNNNDDSYIPSAEKILEFVKQIRDS ITEEWSQCNVTCGSGIRVRKRKGSNKKAED LTLEDIDTEICKMDKCSSIFNIVSNSLGFV
ILLVLVFFN
Compare observed peptide ion masses with peptide database
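The comparison step can be sketched as a toy peptide-mass-fingerprint search. This is an illustration, not the algorithm of any particular search engine. The two-entry `database`, the observed masses, the 50 ppm tolerance and all function names are hypothetical; the residue masses are the monoisotopic values tabulated later in these slides. The zero-width `re.split` needs Python 3.7 or later.

```python
import re

# Monoisotopic residue masses (Da).
RESIDUE = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER = 18.01056
PROTON = 1.00728

def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def mh_plus(peptide):
    """Singly protonated monoisotopic mass, [M+H]+."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON

def count_matches(observed, sequence, tol_ppm=50.0):
    """Number of observed masses explained by some tryptic peptide."""
    theoretical = [mh_plus(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical)
        for obs in observed
    )

# Hypothetical database and peak list, for illustration only.
database = {
    "protein_A": "SGFLEEDELKAVTPR",
    "protein_B": "MKKCTILVVASLLLVNSLLPGYGQNK",
}
observed = [1166.558, 543.325]

scores = {name: count_matches(observed, seq) for name, seq in database.items()}
best = max(scores, key=scores.get)
print(scores, "best match:", best)
```

Real search engines also allow for missed cleavages and modifications, and they score matches statistically rather than by a simple count.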
Tandem Mass Spectrometry (MS/MS) Provides More Confident Protein ID
MS/MS
Tandem Mass Spectrometry (MS/MS)
Peptide Fragmentation
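[Figure: peptide backbone fragmentation nomenclature. Cleavage of the backbone -HN-CH(R)-CO-NH-CH(R')-CO-NH- between residues i and i+1 gives the N-terminal fragment ions a_i, b_i, c_i and the complementary C-terminal ions x_(n-i), y_(n-i), z_(n-i); the figure also marks b_(i+1) and y_(n-i-1) at the next peptide bond.]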
Peptide Fragmentation
Peptide: S-G-F-L-E-E-D-E-L-K

MW     ion   N-terminal   C-terminal   ion   MW
88     b1    S            GFLEEDELK    y9    1079
145    b2    SG           FLEEDELK     y8    1022
292    b3    SGF          LEEDELK      y7    875
405    b4    SGFL         EEDELK       y6    762
534    b5    SGFLE        EDELK        y5    633
663    b6    SGFLEE       DELK         y4    504
778    b7    SGFLEED      ELK          y3    389
907    b8    SGFLEEDE     LK           y2    260
1020   b9    SGFLEEDEL    K            y1    147
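The table above can be regenerated from residue masses. This is a minimal sketch assuming singly charged ions and standard monoisotopic values: a b ion is the sum of its residues plus one proton, and a y ion is the sum of its residues plus water plus one proton. The function name `fragment_ions` is illustrative.

```python
# Monoisotopic residue masses (Da) for the residues in SGFLEEDELK.
RESIDUE = {
    "S": 87.03203, "G": 57.02147, "F": 147.06842, "L": 113.08407,
    "E": 129.04260, "D": 115.02695, "K": 128.09497,
}
WATER = 18.01056
PROTON = 1.00728

def fragment_ions(peptide):
    """Return ([b1..b(n-1)], [y1..y(n-1)]) as singly charged m/z values."""
    masses = [RESIDUE[aa] for aa in peptide]
    b_ions, running = [], 0.0
    for m in masses[:-1]:            # N-terminal prefixes
        running += m
        b_ions.append(running + PROTON)
    y_ions, running = [], 0.0
    for m in reversed(masses[1:]):   # C-terminal suffixes
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions

peptide = "SGFLEEDELK"
b_ions, y_ions = fragment_ions(peptide)
n = len(peptide)
# Pair b_i with its complementary y_(n-i), as in the slide's table.
for i in range(1, n):
    print(f"b{i} {b_ions[i - 1]:9.3f}  {peptide[:i]:<9} "
          f"{peptide[i:]:<9} y{n - i} {y_ions[n - i - 1]:9.3f}")
```

Truncating the computed values to integers gives the nominal masses in the table. Note that y9 computes to about 1079.53, which explains why the same peak appears as 1079 in the table and as 1080 on the spectrum.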
Peptide Fragmentation
[Figure: MS/MS spectrum of SGFLEEDELK, % Intensity (0-100) vs. m/z (0-1000), with peaks labeled b3-b9 and y2-y9.
b ions (m/z): S 88, G 145, F 292, L 405, E 534, E 663, D 778, E 907, L 1020, K 1166
y ions (m/z): 147, 260, 389, 504, 633, 762, 875, 1022, 1080, 1166]
Digest with Specific Protease
- Trypsin (K, R; not followed by P)
- Chymotrypsin (F, W, Y, L, M)
- Lys-C (K)
- Arg-C (R)
- Asp-N (D, N-terminal)
- V8-bicarb (E)
- V8-biphosph (E, D)
- {CNBr (M)}
Digest with Specific Protease
Why trypsin?
High specificity (K or R, not followed by P)
Acetylated form commercially available (acetylation lessens autodigestion)
Autolysis peaks are great internal calibrants (m/z 842.509 and 2212.11)
Protein Identification by MS/MS
Input:
• Mass of the parent ion, and pattern match to the MS/MS spectrum
Output:
• Amino-acid sequence of the peptide and protein of origin
Database Searching: MS/MS
De Novo Peptide Sequencing
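[Figure: MS/MS spectrum, % Intensity (0-100) vs. m/z (0-1000), with the gaps between successive peaks annotated by residue letters (S, G, F, L, E, E, D, E, L, K), read in both directions along the ion series.]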
De Novo Interpretation
Amino acid         Residue MW    Amino acid        Residue MW
A Alanine          71.03712      M Methionine      131.04049
C Cysteine         103.00919     N Asparagine      114.04293
D Aspartic acid    115.02695     P Proline         97.05277
E Glutamic acid    129.04260     Q Glutamine       128.05858
F Phenylalanine    147.06842     R Arginine        156.10112
G Glycine          57.02147      S Serine          87.03203
H Histidine        137.05891     T Threonine       101.04768
I Isoleucine       113.08407     V Valine          99.06842
K Lysine           128.09497     W Tryptophan      186.07932
L Leucine          113.08407     Y Tyrosine        163.06333
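De novo interpretation reads the sequence from mass differences between neighbouring peaks in one ion series. The sketch below makes several assumptions: the peak list is a clean, complete b-ion ladder (computed here for SGFLEEDELK, not measured), the mass tolerance is 0.02 Da, and the function names are invented for illustration.

```python
# Monoisotopic residue masses (Da), from the table above.
RESIDUE = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}

def residues_for_gap(gap, tol=0.02):
    """All residues whose mass matches a peak-to-peak gap within tol (Da)."""
    return sorted(aa for aa, m in RESIDUE.items() if abs(m - gap) <= tol)

def read_ladder(peaks, tol=0.02):
    """Candidate residues for each consecutive gap in a sorted ion series."""
    peaks = sorted(peaks)
    return [residues_for_gap(hi - lo, tol) for lo, hi in zip(peaks, peaks[1:])]

# Idealized b1..b9 ladder for SGFLEEDELK (computed values).
b_ladder = [88.039, 145.061, 292.129, 405.213, 534.256,
            663.298, 778.325, 907.368, 1020.452]

calls = read_ladder(b_ladder)
print(calls)
print(residues_for_gap(128.07, tol=0.05))  # a loose tolerance cannot split K from Q
```

Leucine and isoleucine have identical masses, so each such gap returns both candidates. Lysine and glutamine differ by only about 0.036 Da, so they separate only with accurate-mass data.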
Common Posttranslation Protein Modifications
Analysis of Protein Complexes: Mitochondrial Proteins Resolved by 2D Native Blue-Gels
[Figure: HPLC-Chip/MS interface to an Ion Trap MS (NanoLC). HPLC-Chip (not to scale) components: nanospray tip, tip assembly & fittings; nano LC column; enrichment column, capillaries, fittings, frits; HV ESI contact; RF tag.]
HPLC or Capillary Electrophoresis Coupled to MS: High-Res Peptide Separation with “Chip-Column”
General Strategy for Proteome Analysis
- Sample Preparation (sub-proteome?)
- Protein Separation (2-DE, Capillary LC)
- Protein Identification (MALDI-TOF, MS/MS)
- Specify Nature and Position of Modifications (analyze protein fragmentation by MS)
Quantifying Absolute and Differential Protein Expression by Mass Spectrometry
- Stable Isotope Labeling by Amino acids in Cell culture (SILAC)
- Isotope Coded Affinity Tag MS (ICAT)
- Label-free Quantification
- Fluorescence Multiplex 2-DE
SILAC
Stable Isotope Labeling by Amino acids in Cell Culture
ICAT
Isotope Coded Affinity Tag MS
METABOLOMICS
Metabolites, not Genes or Proteins Define a Cell’s Phenotype
Untargeted Profiling of Metabolites in Cell Extracts & Biofluids
Targeted
- Look for known metabolites, one-at-a-time
- Absolute quantification, based on comparison with standards
Untargeted
- Quantify known and unknown metabolites, en masse
- Metabolites (aka “features”) defined by LC RT/accurate mass pairs
- Relative quantification, based on ion count abundance
- Differentially-expressed features established by ANOVA
- Molecular identities based on comparison with Stds & database
QTOF for Metabolite Profiling
- Accurate mass (<1 ppm) to enable unequivocal assignment of formulae & MS/MS
- Robust high-resolution chromatographic systems and software for reproducible specification of “features” with broadly differing biophysical properties (i.e., all cellular metabolites in the range of 50-1000 Da m/z)
- Wide dynamic range and low fmole detection to broadly cover feature space
- Use searchable small molecule database for feature identification (METLIN)
- Statistical software for relative quantification of “features” and group changes
- Standards and MS/MS-based fragmentation to annotate the growing METLIN database with RT/accurate mass pairs for confident molecular identification of features in high-throughput
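The accurate-mass requirement can be illustrated with a ppm-error calculation and a lookup against a small table. This is only a sketch: `TOY_DB` is a hand-made table of five neutral monoisotopic masses, not METLIN and not its API, and the observed m/z and the 5 ppm window are hypothetical.

```python
PROTON = 1.007276  # Da

# Hand-made table of neutral monoisotopic masses (Da); NOT the METLIN database.
TOY_DB = {
    "taurine": 125.01466,     # C2H7NO3S
    "proline": 115.06333,     # C5H9NO2
    "valine": 117.07898,      # C5H11NO2
    "betaine": 117.07898,     # C5H11NO2, an isomer of valine
    "citrulline": 175.09569,  # C6H13N3O3
}

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def candidates(mz, tol_ppm=5.0):
    """Names whose neutral mass matches an observed [M+H]+ ion within tol_ppm."""
    neutral = mz - PROTON
    return sorted(
        name for name, mass in TOY_DB.items()
        if abs(ppm_error(neutral, mass)) <= tol_ppm
    )

# Hypothetical observed [M+H]+ ion.
hits = candidates(118.08626)
print(hits)
```

Both valine and betaine come back because they share the formula C5H11NO2. Accurate mass narrows a feature to a formula, and retention time against standards or MS/MS fragmentation is still needed to tell isomers apart, consistent with the last bullet above.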
[Figure: workflow diagram with the following boxes]
- UHD QTOF
- Raw MS TOF data (RT, m/z, Abundance)
- Molecular Feature Extraction
- Feature Alignment (RT, Mass)
- Chemometrics profiling & Multivariate Analysis
- Recursive Analysis (Find by ions)
- Metabolite ID: STD RT match, DB search and fragmentation confirmation
- Pathway interpretation
Untargeted Metabolite Profiling Data Acquisition/Analysis Workflow
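The feature alignment step (RT, Mass) can be sketched as a greedy grouping of features across runs. This is an assumption-laden illustration, not the vendor software's algorithm: the tolerances (0.2 min, 10 ppm), the data and the function name `align_features` are all hypothetical.

```python
def align_features(runs, rt_tol=0.2, ppm_tol=10.0):
    """Greedily group (rt, mass, abundance) features that agree across runs.

    Each group keeps the RT and mass of its first member plus a
    {run_index: abundance} map; a run contributes at most one feature per group.
    """
    aligned = []
    for run_index, features in enumerate(runs):
        for rt, mass, abundance in features:
            for group in aligned:
                same_rt = abs(group["rt"] - rt) <= rt_tol
                same_mass = abs(group["mass"] - mass) / group["mass"] * 1e6 <= ppm_tol
                if same_rt and same_mass and run_index not in group["abundances"]:
                    group["abundances"][run_index] = abundance
                    break
            else:
                # No existing group matched: start a new feature group.
                aligned.append(
                    {"rt": rt, "mass": mass, "abundances": {run_index: abundance}}
                )
    return aligned

# Two hypothetical runs: (retention time in min, neutral mass in Da, ion counts).
runs = [
    [(1.52, 175.0957, 2.1e5), (3.10, 117.0790, 8.0e4)],
    [(1.55, 175.0958, 1.9e5), (6.40, 290.1226, 5.0e4)],
]

aligned = align_features(runs)
for group in aligned:
    print(group["rt"], group["mass"], group["abundances"])
```

Features seen in only one run remain as single-member groups. A recursive "find by ions" pass, as named in the workflow, would go back to the raw data to look for them in the other runs.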
≈850 features have a METLIN database match
[Figure: repeated profiling of one human serum sample (56 injections)
- Profile plot for 56 repeated injections of a human serum sample (Normalized Intensity Values vs. Injection Number): normalized ion intensities for 615 metabolites quantified in all samples show a flat run-to-run distribution
- Overlay of chromatograms for 56 repeat analyses of a human serum sample (Ion Intensity, x 10^7, vs. Retention Time in min), showing reproducible quantification of total ion counts
- C: Extracted ion counts showing peaks for 56 repeated quantifications (as overlays) for some exemplary molecules: Taurine, Carnitine, Proline, Arginine, Uric acid, Betaine, Valine, Guanosine, Lysine, Citrulline, Creatinine, Tryptophan, Ornithine, LysoPE(18:0), LysoPC(18:0)]
Repeated Metabolite Profiling of a Single Human Plasma
[Photo labels: two siblings marked "Screened, Well, GA-I"; one sibling marked "Not Screened, Affected, GA-I"]
The above siblings have the same genetic defect, Glutaric Acidemia, type I (GA-I). Unlike the flanking sisters, the severely afflicted middle child did not benefit from early screening, diagnosis and treatment. (From Dr. Robert E. Grier, Detroit MC).
Benefit of Early Screening for IEMs
Limitations to Newborn Screening for IEMs
Cost – Each condition is extremely rare (1:5,000 to 1:250,000). Too expensive to individually test for all treatable genetic diseases, even if assays were available - and they are not!
Speed – Screening needs to be complete between 2 and 3 days after birth, to enable the initiation of appropriate therapy before mother and baby have left the hospital. Current approaches for IEM screening specifically target each condition and do not offer the possibility of broad coverage.
Sample Availability – Babies simply don’t have enough blood to enable testing for all known IEMs using prevailing methods
The Heel Stick
Results of Expanded NBS by MS/MS: Doing the Math
(Schulze et al. Pediatrics 2003)
250,000 neonates screened for 23 IEM
- Overall sensitivity = 100% for classic forms of disorders
- 106 newborns with confirmed metabolic disorder (70 required treatment)
- Prevalence of 23 metabolic disorders = 1/2,400
- Overall specificity = 99.67%
- 825 false positives (0.33% false positive rate)
- 61/106 were judged to have benefited from screening and treatment
Beneficiaries = 58% of true positives = 1/4,100 newborns
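"Doing the math" on these figures takes a few lines. The counts are those quoted above; the variable names are illustrative. The positive predictive value is derived here from those counts and is not stated on the slide. Zero false negatives is an assumption that follows from the reported 100% sensitivity.

```python
screened = 250_000
true_pos = 106    # newborns with a confirmed metabolic disorder
false_pos = 825
false_neg = 0     # assumption implied by the reported 100% sensitivity
benefited = 61

true_neg = screened - true_pos - false_pos - false_neg

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)  # derived; not quoted on the slide

print(f"sensitivity       = {sensitivity:.2%}")
print(f"specificity       = {specificity:.2%}")
print(f"false-pos rate    = {1 - specificity:.2%}")
print(f"prevalence        = 1/{screened / true_pos:,.0f}")
print(f"PPV               = {ppv:.1%}")
print(f"benefit, of TP    = {benefited / true_pos:.0%}")
print(f"benefit, newborns = 1/{screened / benefited:,.0f}")
```

Even at 99.67% specificity, false positives outnumber true positives roughly eight to one because each condition is so rare. That is the motivation for the "Specificity" item in the strengths list that follows.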
Potential Strengths of Untargeted Metabolite Profiling for IEM Screening of Neonates
Coverage: multiplex screening for known and unrecognized IEMs
Specificity: Low false-positive rate due to verification by consideration of multiple IEM-informing metabolites
Sample: Only a few µl of plasma needed for comprehensive assays
Speed: Screening can be completed before baby and mom leave hospital for swift initiation of therapy, if needed
Cost: Low, after initial investment in hardware/software
IEM Insights: Possibility to establish new knowledge regarding the systems biology of various monogenetic diseases
Proof-of-Principle Test of Untargeted Metabolite Profiling for IEM Screening
In collaboration with Dr. Tila Worgall & the Metabolism Lab at Columbia Presbyterian Hospital, NYC
Tyrosinemia Argininosuccinic Aciduria
Hyperprolinemia
Babies are unaffected at birth, but later poisoned by failure to effectively metabolize proteins in mother's milk
A few days after birth, develop hyperammonemia, ketoacidosis, vomiting, respiratory distress, lethargy, seizures & possibly coma
Prevalence of 1 in 70,000 (autosomal recessive)
Untreated, results in death; treated, can still result in severe developmental disabilities
TREATMENT – Low protein diet with arginine supplementation. Liver transplant, as for other Urea Cycle defects
Argininosuccinate Lyase Deficiency (ASA; Urea Cycle Defect)
ASL Deficiency
ASL-deficiency | Control
Heat map depicting hierarchical clustering analysis performed on 1185 masses identified with 100% frequency in at least one group, comparing plasma from control neonates (n = 4) vs. a patient diagnosed with argininosuccinate lyase deficiency. All samples were analyzed in triplicate
Untargeted Plasma Metabolite Profiling of Patient with Confirmed ASL Deficiency
ANP chromatog. + ion monitoring
A Volcano Plot of argininosuccinate lyase-deficiency vs. control yielded 228 features that were exported for recursion analysis. Of these, 167 features were expressed with 100% frequency in at least one group and 124/167 demonstrated fold-changes > 2.0, with P < 0.05. Right Panel: PCA, performed on 167 features identified with 100% frequency after recursion analysis. Left Panel: Loadings Plot
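The volcano-plot selection (fold-change > 2.0 with P < 0.05) amounts to a two-condition filter. In this minimal sketch the feature names, group means and p-values are invented for illustration, and the p-values are assumed to come from a separate statistical test that is not shown.

```python
import math

def volcano_hits(features, min_fold=2.0, max_p=0.05):
    """Keep features changed more than min_fold in either direction with p < max_p.

    features: iterable of (name, mean_case, mean_control, p_value).
    Returns (name, log2 fold-change, -log10 p), i.e. volcano-plot coordinates.
    """
    hits = []
    for name, mean_case, mean_control, p in features:
        log2_fc = math.log2(mean_case / mean_control)
        if abs(log2_fc) > math.log2(min_fold) and p < max_p:
            hits.append((name, log2_fc, -math.log10(p)))
    return hits

# Hypothetical features: (name, patient mean, control mean, p-value).
features = [
    ("feature_1", 8.0e5, 1.0e4, 0.001),  # strongly up, significant
    ("feature_2", 1.2e5, 1.0e5, 0.40),   # small change, not significant
    ("feature_3", 2.0e4, 9.0e4, 0.01),   # down more than 2-fold, significant
    ("feature_4", 5.0e5, 1.0e5, 0.20),   # large change, not significant
]

hits = volcano_hits(features)
for name, log2_fc, neg_log_p in hits:
    print(f"{name}: log2 FC = {log2_fc:+.2f}, -log10 p = {neg_log_p:.2f}")
```

Working in log2 fold-change treats increases and decreases symmetrically, so a metabolite that falls to less than half its control level passes the same threshold as one that more than doubles.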
ASL Deficiency Statistics
Loading plot (colored by mass) demonstrates that argininosuccinate and citrulline have the top loadings among 124 discriminating metabolites
Argininosuccinate
ASL Deficiency - PCA Loading Plot
Citrulline
[Figure: ASL Deficiency. Per-metabolite panels comparing ASA and control (con) samples; labeled metabolites include inosine, hydrouracil and deoxyinosine.]
Untargeted Metabolite Profiling Offers a Powerful Systems Biology Approach for Discovering the Actions of Genes and Drugs - Likely to Revolutionize Clinical Diagnostics
CONCLUSION