

Literature Thesis

by

Natalia Charotti Barquinero

10629718

May-June 2016 12 EC

Supervisor: Examiner:

dr. A. (Alina) Astefanei dr. G.L. (Garry) Corthals dr. P.J.(Peter) Schoenmakers

MSc Chemistry

Analytical Sciences

Identification and quantitation of proteins in human plasma

and serum by LC-MS/MS


Abstract

Plasma and serum have been referred to as the most complex human proteome and are considered among the most attractive samples for biomarker discovery because they are readily available and representative of the physiological state of an individual at a given time. The biggest challenge when analyzing proteins in these matrixes is their wide dynamic range, covering more than ten orders of magnitude. Since LC-MS/MS is the method of choice for plasma and serum proteomics, recent studies applying this approach are reviewed in this work. The aim is to give insight into the fundamentals and into the techniques, workflows and technology currently used to successfully identify and quantify proteins in these matrixes according to the objective of a given experiment. To date, immunoaffinity depletion of high-abundance proteins followed by nano-2D-LC is preferred, following the current trend of increasing sample fractionation prior to MS analysis. Miniaturized LC coupled through nano-ESI to hybrid spectrometers seems to be the most efficient setup to deliver the highest number of identified proteins. Tandem mass spectrometry is pivotal for protein identification, and the acquisition mode (i.e., MRM, DDA or DIA) and analyzer should be selected according to the aim of the study to deliver optimal results.


Table of contents

1. Introduction

2. Sample complexity

2.1. Plasma

2.2. Serum

2.3. Sample selection

3. Sample Preparation

3.1. Depletion of high abundance proteins

3.1.1. Antibody-based affinity methods

3.1.2. Protein ligand-based affinity methods

3.1.3. Precipitation and ultrafiltration

3.2. Enrichment of low abundance proteins

3.2.1. Peptide-protein affinity methods

3.3. Digestion

4. Downstream separation methods

4.1. One-dimensional liquid chromatography (1D-LC)

4.2. Two-dimensional liquid chromatography (2D-LC)

5. Mass Spectrometry

5.1. Ionization techniques

5.2. Mass analyzers

5.3. Ion detectors

5.4. Peptide fragmentation

5.5. Acquisition strategies

5.5.1. Data-Dependent Acquisition (DDA)

5.5.2. Data-Independent Acquisition (DIA)

5.6. Advances in MS instrumentation

6. Quantitative proteomics with LC-MS/MS

6.1. Stable isotopic labelling

6.2. Absolute quantitation (AQUA)

6.3. Label-free methods

6.3.1. Spectral counting

6.3.2. Ion currents-based methods

7. Bioinformatics

7.1. Protein identification

7.2. Validation of results

8. Discussion

Conclusion

References


1. Introduction

Proteomics is the study of the proteins in a highly complex biological specimen. It involves determining protein abundance, profile, structure, interactions, and modifications. Developments in this field have increased exponentially in the last three decades in search of answers to questions raised by genome studies 1. Proteins hold information not only about human complexity but also about the state of the organism at a given point in time. The variety of proteins expressed in the human proteome exceeds the number of protein-encoding genes by almost two orders of magnitude 1-3. Alternative splicing and post-translational modifications (PTMs) are responsible for this difference. These modifications carry a wealth of information about the function of proteins, for example in signaling pathways inside cells 4.

Two main strategies have arisen for protein analysis. The first focuses on protein function and expression under specific physiological conditions, whereas the other monitors quantitative changes in the pattern of expression under different circumstances, e.g. disease versus a control group 5. Considering the vast diversity of the protein population, the initial approaches in proteomics made use of basic fractionation techniques such as gel electrophoresis and differential precipitation. In the late 1990s, the fast development of two-dimensional electrophoresis and mass spectrometry had a significant impact on protein analysis and characterization 5,6. Proteomics has further benefited from the coupling of liquid chromatography to mass spectrometry. This hyphenated technique has become a powerful analytical tool and the method of choice for the characterization and quantification of proteins and metabolites, giving insight into biochemical pathway activity 4.

Mass spectrometry was originally applied to the analysis of small molecules due to limitations in the dynamic range. Consequently, the most logical step towards protein analysis was to cleave proteins into peptides. These shorter fragments have the advantage of being more soluble and easier to separate by multidimensional methods 5,7. However, breaking the protein into pieces significantly increases the diversity of the sample, and crucial information about the location of modifications might be lost, allowing only a fraction of the peptides to be identified. The study of proteins through the analysis of enzymatically cleaved peptides is called 'bottom-up' or 'shotgun' proteomics 4,8. Most efforts in proteomic research have followed this approach, shaping the rapid development of liquid chromatography and tandem mass spectrometry to improve peak capacity (the number of peaks that can be resolved by the system 9) and peptide identification in complex peptide mixtures 10,11. As the dynamic range of mass spectrometers improved dramatically and the existing ionization techniques produced multiply charged ions, the possibility of analyzing intact proteins emerged. This approach, known as 'top-down' proteomics, became an attractive alternative, opening the door to the analysis of PTMs 4,10-12. The workflow of both approaches is shown in Figure 1.1.


Nowadays 'bottom-up' proteomics remains the most popular method for protein analysis and is the approach adopted in this review. In this context, the main challenge is dealing with the enormous number of peptides generated after protein digestion, which adds to the existing complexity of the protein mixture in the sample. For this reason, sample preparation is a key step to reduce the sample complexity and is discussed at length in section 3.

Regarding sample matrixes, blood has been extensively studied because it is easily accessible and circulates throughout the body. It is the most attractive source of biomarkers because plasma and serum contain not only blood-related proteins and peptides but also tissue-specific proteins released into the blood while it perfuses all tissues, carrying valuable information about the biological status of the body 13-15. The dynamic range of the proteins present in plasma is in the order of 10^10 (from mg/mL to pg/mL) 16. To date, no single technology can deal with such a broad spectrum of concentrations. More than 90% of the protein mass in plasma is accounted for by only 14 proteins, albumin being the most abundant. The removal of this overwhelming fraction uncovers a "hidden proteome" that is most suitable for biomarker discovery. The most important strategies for reducing the dynamic range of proteins in plasma and serum are sample fractionation (differential precipitation, (gel) electrophoresis and chromatography), selective depletion of high abundance proteins (HAP), and enrichment of low abundance proteins (LAP) 17.

Figure 1.1: 'top-down' versus 'bottom-up' proteomics strategies 11


Unfortunately, all of these strategies have the disadvantage of suffering from protein losses because albumin and other abundant proteins are known to carry or bind other proteins 16. Additionally, new mass spectrometer acquisition modes have been developed to reach the least abundant proteins, some with promising results 18,19.

Figure 1.2: Workflow for the analysis of proteins by LC-MS/MS from plasma or serum

[Workflow diagram. Box labels: Plasma/Serum; Sample preparation (quantitation-specific steps): HAP depletion, LAP enrichment; Digestion; Downstream fractionation methods: 1DE/2DE, nano-LC/UPLC, LC x LC; Mass spectrometry: MALDI/SELDI TOF MS, QTOF MS/MS, ESI/nano-ESI, LTQ-ion trap, LTQ-Orbitrap, QTOF, DDA, DIA, SRM, MRM; Protein identification/quantification; Bioinformatics]


Besides the ability to detect and identify a given protein, quantification is pivotal in the pharmaceutical arena and in biomarker validation stages 17,20. Many methodologies exist to quantify proteins 4,12. They differ according to whether the objective is relative or absolute quantification and whether or not they make use of labels. The nature of these labels and the stage of the workflow at which they are introduced affect the quality of the quantification process 12. Liquid chromatography coupled online with mass spectrometry (LC-MS/MS) has become the preferred method for sensitive detection, identification and quantification in blood proteomics and many other applications 4,21.

In this work, the current techniques and new advances in the analysis of plasma and serum proteins by LC-MS/MS are reviewed. The main focus is on the sample preparation steps (enrichment of LAP and depletion of HAP), downstream separation methods (liquid chromatography), quantification strategies, mass spectrometric instrumentation, and data acquisition modes. The review follows the workflow shown in Figure 1.2, with the aim of giving a useful overview of the recent developments in the area of blood proteomics.


2. Sample complexity

Plasma and serum are especially attractive matrixes for proteomic studies because specific changes in the abundance of small secreted proteins and peptide hormones are expected to produce shifts in the low molecular weight protein/peptide profile of plasma and serum, giving information about a disease process occurring somewhere in the organism 22. The plasma or serum proteome is a very dynamic entity, susceptible to physiological conditions such as stress, sport, meals, sleep, disease and, in women, pregnancy 13. Currently, capillary LC-nanoESI-MS/MS is the method of choice to study these proteomes. Most efforts have been directed towards developing methods to access the low abundance proteins and peptides 13. In 2003, the Human Proteome Organization (HUPO), in collaboration with nearly fifty teams, started a project to complete the plasma proteome map 23,24. To date, a total of 10,546 proteins and isoforms are listed in the plasma proteome database available online (http://www.plasmaproteomedatabase.org) 13,23.

In this section, the nature and main differences between plasma and serum are highlighted to facilitate the identification of features to consider when selecting the type of sample for a proteomic study according to the analysis methodology and instrumentation available.

2.1. Plasma

Plasma is the liquid portion of the blood, in which the cellular components are suspended. It is obtained as the supernatant after centrifugation of blood collected with anticoagulants. The latter are exogenous substances added to the collection tube that interact with specific elements, hindering the formation of a clot 15. The most used anticoagulants are potassium EDTA, sodium citrate and lithium heparin. The first two act by sequestering the calcium ions needed for the clotting process. The third is a glycosaminoglycan that binds to antithrombin III, inhibiting the formation of thrombin, which prevents fibrin from forming a clot 24-26.

The choice of anticoagulant has been shown to change the low-mass proteome profile. For example, the profile obtained from plasma collected with EDTA is substantially different from that obtained from plasma collected with citrate. This could be explained by the aggregating effect that EDTA has on platelets, which changes the protein content of plasma 25,27. In addition, the removal of divalent metal ions may affect the natural folding of some proteins, leading to different interactions with depletion columns 28. Moreover, sodium citrate is added in liquid form to the blood collection tubes and therefore dilutes the blood sample. The downstream analysis techniques are also important to consider when selecting the anticoagulant: assays that require Ca2+ or Mg2+ ions are not compatible with plasma obtained with EDTA or citrate. Similarly, methodologies such as SELDI-MS (surface enhanced laser desorption ionization-MS), which relies on affinity interactions to bind proteins to a chip surface, might be affected by the hindering effect of negatively charged molecules such as heparin on chip-protein interactions 26.

Platelet-poor plasma is the preferred sample for peptidomics studies. Platelets contain a substantial number of peptides and enzymes that are released upon aggregation. This process can be activated by a variety of factors, one of which is low temperature; it is therefore crucial to remove the platelets before cooling and freezing the samples 24,26. The time between centrifugation and separation of the globular fraction also has an impact on the protein profile obtained, because the metabolism of blood cells continues after the blood has been collected. Therefore, standardization of sample collection, processing and storage protocols is critical to improving reproducibility, and it would allow comparisons between samples or states without being misled by pre-processing artifacts 26.

2.2. Serum

Serum is obtained by centrifuging the blood after letting it clot in a glass tube, or in a plastic tube containing glass beads or powder to accelerate the clotting process. In some cases, the tube includes a separation gel of intermediate density that facilitates the isolation of the fractions after centrifugation 15,22,26. The composition of serum is similar to that of plasma. The main difference is that serum lacks fibrinogen and other proteins involved in the clotting process (called coagulation factors) 15. Historically, serum has been the sample of choice for clinical analysis, mainly because the addition of anticoagulants interferes with several tests 22. The coagulation time is important because it influences the small protein/peptide profile of this matrix. The changes have been attributed to transformations of coagulation factors and to fibrinolysis 22. As with plasma, the standardization of pre-processing protocols is required to ensure the comparability of proteomes 26.

2.3. Sample selection

The choice between plasma and serum must be made according to the specific target and the methodology selected for the analysis. In most proteomics studies, they have been used interchangeably, ignoring the fundamental differences between the matrixes. Even though most studies have used serum, mainly to avoid detrimental effects of anticoagulants (competing ionization in MS, an overwhelming signal in NMR) 29, plasma might be the best option when coagulation factors are to be studied or a more stable sample is required 16,22.


Although the differences can be neglected in most biochemical analyses, in proteomic studies many authors report obtaining significantly different protein and peptide profiles from plasma and serum, to the point of stating that their proteomes are not comparable and therefore should not be considered interchangeable in biomarker discovery and comparative proteome studies 22,25,26,29. Hemolysis, which is the mechanical rupture of the membrane of red blood cells during sample collection, has a substantial impact on the analytical outcome. Studies have shown that additional peaks are found in hemolyzed specimens, which makes this type of sample inadmissible for comparative proteomics 1,25. The addition of protease inhibitors is recommended for both matrixes in 'top-down' approaches, but if they are included before digestion is complete in the 'bottom-up' workflow, they have an adverse effect on the results. In addition, glycerol has shown a positive effect on preserving protein structure and function during storage after lyophilization or crystallization of plasma and serum 26.


3. Sample Preparation

Most proteomics studies are interested in the low concentration proteins and peptides present in blood because of their greater potential for biomarker discovery. For this reason, the most used strategies and the latest advances to access this "hidden proteome" are discussed in detail in this section. The steps associated with specific quantification methods are addressed later, in the section on quantitative proteomics (section 6).

Figure 3.1 shows the extent of the dynamic range of proteins in serum. Plasma and serum have been referred to as the most complex of human proteomes 30. The ten most abundant proteins, which include albumin (human serum albumin, HSA), immunoglobulins (Ig), transferrin, haptoglobin, glycoprotein, complement C3, etc., are part of the so-called high abundance protein fraction (HAP), with concentrations in the order of mg/mL. Even though together they represent less than 0.1% of the whole variety of proteins, they account for more than 95% of the total protein mass in plasma 1,14,31. On the other hand, the low abundance proteins (LAP) with biomarker potential are in the concentration range of ng/mL to pg/mL and include cytokines, interleukins, prostate-specific antigen, etc. 31. This broad dynamic range has made the biomarker discovery process comparable to the search for a needle in a haystack.
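To make the scale of this dynamic range concrete, the short sketch below computes the span, in orders of magnitude, between a high abundance and a low abundance protein. The two concentrations (about 40 mg/mL for albumin and about 5 pg/mL for a cytokine) are round illustrative values assumed here, not measurements taken from the cited studies, and the function name is hypothetical.

```python
import math

# Illustrative concentrations in g/mL (assumed round values, not measured data).
ALBUMIN_G_PER_ML = 40e-3    # high abundance protein, mg/mL range
CYTOKINE_G_PER_ML = 5e-12   # low abundance protein, pg/mL range


def orders_of_magnitude(high: float, low: float) -> float:
    """Return log10 of the concentration ratio high/low."""
    if high <= 0 or low <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(high / low)


span = orders_of_magnitude(ALBUMIN_G_PER_ML, CYTOKINE_G_PER_ML)
print(f"Dynamic range: about {span:.1f} orders of magnitude")
```

Under these assumptions the span is close to ten orders of magnitude, which is in line with the range quoted for plasma and serum.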

Figure 3.1: The dynamic range of serum proteins 16


The two main approaches to reducing the sample complexity are still either to deplete HAP or to enrich LAP 8. However, recent developments in mass spectrometry acquisition modes and in two-dimensional liquid chromatography (LC x LC) separations allow the whole sample to be used, with encouraging results 16,18. These recent approaches are discussed along with MS acquisition modes and 2D-LC.

3.1. Depletion of high abundance proteins

3.1.1. Antibody-based affinity methods

These methods use specifically designed antibodies directed against HAP. The antibodies are immobilized on particles packed in various formats. The most common products use polyclonal IgG and IgY antibodies, which provide high specificity and efficient removal of most HAP (between 84 and 98% of the total protein mass) 31. The commercially available kits based on this principle and used in most proteomic studies are listed in Table 3.1 31,32.

The Multi-affinity removal system (MARS-6, 7 or 14) columns use polyclonal IgG antibodies, which are immobilized on an affinity resin packed inside a spin cartridge or chromatographic column. The number refers to the number of HAP targeted for removal. All three are meant to deplete albumin, IgG, transferrin, α1-antitrypsin, haptoglobin, and IgA. Human-7 and Human-14 additionally remove fibrinogen, and only Human-14 removes α2-macroglobulin, α1-acid glycoprotein, IgM, apolipoprotein A-I, apolipoprotein A-II, complement C3, and transthyretin. The latter column can deplete 94% of the total protein mass with an efficiency of 95 to 99% and, although it is the most expensive one, it has been reported to work for more than 200 runs 14,30. It has been used in several plasma proteomics studies 33-37 and has been mentioned as the preferred column for biomarker discovery studies 13,32,38.

The Seppro® columns deplete the top 14 HAP using IgY (avian) antibodies attached to microbeads, in a variety of formats ranging from bulk slurry, chromatography column and spin column to tip. Several biomarker discovery studies have reported using these columns 32,39-42, and their performance has been reported to be comparable to that of MARS-Human 14 32, depleting about 95% of the total protein mass 29.

ProteoPrep® 20 has been used in many comparative studies of depletion techniques 32,37,43,44 and in several biomarker studies 26,39,45, but it was discontinued by Sigma-Aldrich after the company acquired the Seppro® technology. It applied a mixture of polyclonal IgG antibodies against 20 HAP from human plasma. The antibodies were attached to a spherical support specially designed to diminish non-specific binding. It has been reported to deplete 98% of the total protein mass in plasma 31,46.


None of the above products is free from the principal drawbacks of this approach, namely co-depletion due to non-specific binding and losses of LAP bound to carrier proteins such as albumin. Another important factor is the high cost associated with the use of antibodies 31.
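The effect of depletion on the proteins that remain can be estimated with simple arithmetic. The sketch below assumes an idealized column that removes a fraction of the total protein mass without any co-depletion, which is not the case in practice given the drawbacks just described. The function name is illustrative; the depletion levels are the ones quoted in the text.

```python
def relative_enrichment(depleted_fraction: float) -> float:
    """Fold increase in the mass share of the non-depleted proteins.

    Idealized model: a fraction `depleted_fraction` of the total protein
    mass is removed and nothing else is lost (no co-depletion).
    """
    if not 0.0 <= depleted_fraction < 1.0:
        raise ValueError("depleted_fraction must be in [0, 1)")
    return 1.0 / (1.0 - depleted_fraction)


# Depletion levels quoted in the text for the commercial columns.
columns = [("MARS-Human 14", 0.94), ("Seppro", 0.95), ("ProteoPrep 20", 0.98)]
for name, fraction in columns:
    fold = relative_enrichment(fraction)
    print(f"{name}: roughly {fold:.0f}-fold relative enrichment")
```

Even in this best case the gain is one to two orders of magnitude, which is small compared with the overall dynamic range and helps explain why depletion is usually combined with further fractionation.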

Table 3.1: Most used immunoaffinity depletion columns in proteomic studies of plasma and serum

| Supplier | Commercial name | HAP removed |
|---|---|---|
| Agilent Technologies (polyclonal IgG) | MARS-Human 6 | Albumin, IgG, transferrin, α1-antitrypsin, haptoglobin, and IgA |
| Agilent Technologies (polyclonal IgG) | MARS-Human 7 | All the above plus fibrinogen |
| Agilent Technologies (polyclonal IgG) | MARS-Human 14 | All the above plus α2-macroglobulin, α1-acid glycoprotein, IgM, apolipoprotein A-I, apolipoprotein A-II, complement C3, and transthyretin |
| Sigma-Aldrich, developed by GenWay Biotech, Inc. (polyclonal IgY) | Seppro® MIXED12-LC20 column / Seppro® SuperMix LC2 column used in conjunction with IgY14 LC5 | Albumin, IgG, IgA, α2-macroglobulin, α1-antitrypsin, IgM, haptoglobin, fibrinogen, α1-acid glycoprotein, apolipoprotein A-I and A-II, apolipoprotein B, complement C3, transferrin |
| Sigma-Aldrich (polyclonal mixed IgG antibodies; product has been discontinued) | ProteoPrep® 20 | Albumin, IgG, transferrin, fibrinogen, IgA, α2-macroglobulin, IgM, α1-antitrypsin, complement C3, haptoglobin, apolipoprotein A1, A3, and B, α1-acid glycoprotein, ceruloplasmin, complement C4, C1q, IgD, prealbumin, and plasminogen |

3.1.2. Protein ligand-based affinity methods

These methods make use of proteins (natural or recombinant) that can specifically recognize and bind other proteins. For example, protein A and protein G are bacterial proteins that bind to the Fc regions of immunoglobulins. If they are used in combination with a dye-ligand that selectively binds albumin 47, a high proportion of the total protein mass is depleted 22,31,48. Another application based on ligand affinity is the use of affibody ligands, which are recombinant proteins designed to have properties similar to those of protein A 20,28.

Recently, a new class of affinity ligands consisting of heavy chains of Camelidae antibodies has been developed. Commercially available as CaptureSelect® HumanPlasma14 (Thermo Fisher Scientific), these antibody fragments are capable of removing 14 of the most abundant proteins of plasma and serum. The advantage is that the ligands are produced in yeast cells and do not require animal immunization. Hence, there is less variability in the product. The fragments bind the proteins through the variable domain of the heavy chain in a manner comparable to whole antibodies. They remove albumin, IgG, IgM, IgA, IgE, IgD and free light chains, transferrin, fibrinogen, α1-antitrypsin, α2-macroglobulin, α1-acid glycoprotein, apolipoprotein A1, and haptoglobin 49.

3.1.3. Precipitation and ultrafiltration

The selective precipitation of albumin with chemicals has been demonstrated using sodium chloride with ethanol or ammonium sulfate 47, as well as reducing agents such as dithiothreitol (DTT) and tris(2-carboxyethyl)phosphine (TCEP) 50. Furthermore, the use of organic solvents (acetonitrile 2:1 or trichloroacetic acid) in the presence of ion-pairing agents can have similar effects 22,51. These methods have low specificity and consequently suffer from significant co-depletion and from losses due to the protein carrier effect 22,39.

3.2. Enrichment of low abundance proteins

Immunoaffinity columns prepared with specially designed antibodies that bind target proteins (immunocapture) have been used to enrich a particular fraction of LAP, especially in pharmaceutical applications 52-55. Stable isotope standards and capture by anti-peptide antibodies (SISCAPA) is a high-throughput method developed to enrich low-abundance peptides from complex mixtures and to improve multiple reaction monitoring performance 56. The use of dye-ligands in the study of albumin is an example of an enrichment technique that uses chemical affinity interactions 1,22. Lectins have been employed in the study of serum glycoproteins thanks to their capacity to recognize and bind these modifications 22,31. Another area that has benefited from enrichment methodologies is the study of the phosphoproteome, by applying immobilized metal affinity capture (IMAC), which uses beads with copper and iron variants or titanium oxide (TiO2)-based solid-phase material in chromatographic columns to selectively bind proteins carrying this PTM 22,25,31,39.

3.2.1. Peptide-protein affinity methods

A. Cysteinyl enrichment

The presence of the free sulfhydryl group of cysteine residues in peptides and proteins can be exploited to enrich them using thiol affinity resins 57. This type of resin can be used with plasma samples either to deplete mercaptalbumin (which has a free sulfhydryl group at position 34, HSA-Cys34) or to enrich cysteinyl peptides after albumin depletion 58. It is important to keep in mind that, although albumin contains only one free sulfhydryl group, given its high abundance it accounts for the largest fraction of thiols in plasma 59. One study reported using cysteinyl enrichment to remove mercaptalbumin and obtain an enriched fraction of HSA-Cys34 adducts, which are considered biomarkers of exposure to xenobiotic toxicants and reactive endogenous species 60.

All the above examples are applications aimed at enriching a particular part of the low abundance proteome. A relatively new and innovative method, based on peptide-protein affinity interactions, has been developed to enrich LAP in general and is described below 31.

B. Combinatorial library of hexapeptides

This technology, commercially available as ProteoMiner from Bio-Rad Laboratories, takes advantage of the capacity of amino acids and short peptides to bind proteins. It uses a combinatorial library of hexapeptides generated to bind an enormous variety of proteins. For example, a library made by combining the 20 proteinogenic amino acids would contain 20^6 = 64 x 10^6 unique hexapeptide binders 61.
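The size of such a library follows directly from counting sequences with repetition allowed. The snippet below is a minimal sketch of that arithmetic; the function name and its parameters are illustrative and not part of any vendor documentation.

```python
def library_size(n_amino_acids: int = 20, peptide_length: int = 6) -> int:
    """Number of distinct linear peptides of a given length.

    Every position can hold any of the amino acids, so the count is
    n_amino_acids raised to the power of peptide_length.
    """
    return n_amino_acids ** peptide_length


print(library_size())  # 64000000, i.e. 64 x 10^6 hexapeptides
```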

The hexapeptides are immobilized on a solid support packed inside a mini spin column. As the plasma/serum sample passes through it, each protein species is bound by the hexapeptides capable of capturing it. The binding sites able to capture HAP are rapidly saturated, and the excess HAP is no longer bound. In contrast, the hexapeptides that capture LAP keep binding more molecules as more sample is added to the column. The overall effect is an enrichment of LAP relative to the amount of HAP retained on the column 62. This technique has been compared with the depletion columns described previously; the results are similar and complementary to those of the other technologies, making it a good alternative for exploring the "hidden proteome" 61-64.
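The saturation principle can be illustrated with a deliberately simplified model. The sketch below assumes that every protein species has the same fixed binding capacity on the beads and that a species is captured completely until that capacity is reached. Both assumptions, the loaded amounts, and all names are hypothetical and serve only to show how the dynamic range is compressed.

```python
import math


def bound_amount(loaded: float, capacity: float) -> float:
    """Amount captured by the beads: everything until the sites saturate."""
    return min(loaded, capacity)


def dynamic_range(amounts: dict[str, float]) -> float:
    """Orders of magnitude between the most and least abundant species."""
    return math.log10(max(amounts.values()) / min(amounts.values()))


# Hypothetical amounts loaded onto the column (arbitrary mass units).
loaded = {
    "albumin": 1e6,
    "transferrin": 1e4,
    "mid-abundance": 1e2,
    "low-abundance": 1.0,
}
CAPACITY_PER_SPECIES = 10.0  # assumed identical for every protein species

bound = {name: bound_amount(x, CAPACITY_PER_SPECIES) for name, x in loaded.items()}

print(f"before: {dynamic_range(loaded):.1f} orders of magnitude")
print(f"after:  {dynamic_range(bound):.1f} orders of magnitude")
```

In this toy model the low abundance species only reaches its capacity if enough sample is loaded, which is one way to see why the method depends on the available sample volume.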

The main advantages of this method over immunoaffinity columns are that the enrichment is not selective but takes a rather holistic approach, and that it is reproducible 51, has a higher sample loading capacity, is simple to use, and is disposable, which avoids sample contamination 61. On the other hand, these advantages might not be achieved if the sample size is not large enough. Additionally, it comes in the spin format, which has low reproducibility for a high number of samples 31.


Figure 3.2: Schematic of techniques shrinking the dynamic range of plasma and serum samples by (a) immunodepletion methods and (b) enrichment by combinatorial library of hexapeptides 61


3.3. Digestion

In the 'bottom-up' approach the proteins are digested by enzymes, and the cleavage sites depend on the type of protease selected for the digestion. Several proteases are used in proteomics. Their selectivity is an important factor to consider when choosing one because it determines the type and variability of the peptides that can be obtained 65.

Table 3.2: Common proteases used for ‘Shotgun’ proteomics8

Protease | Cleavage specificity | Common proteomic usage
Trypsin | -K,R--Z- not -K,R--P- | general protein digestion
Endoproteinase Lys-C | -K--Z- | alternative to trypsin for increased peptide length; multiple protease digestion; 18O labeling
Chymotrypsin | -W,F,Y--Z- and -L,M,A,D,E--Z- at slower rate | multiple protease digestion
Subtilisin | Broad specificity to native and denatured proteins |
Elastase | -B--Z- | multiple protease digestion
Endoproteinase Lys-N | -Z--K- | increase peptide length; create a higher charge state for ETD
Endoproteinase Glu-C | -E--Z- and 3000 times slower at -D--Z- | multiple protease digestion; 18O labeling
Endoproteinase Arg-C | -R--Z- | multiple protease digestion
Endoproteinase Asp-N | -Z--D- and -Z--cysteic acid but not -Z--C- |
Proteinase K | -X--Y- | nonspecific digestion of membrane-bound proteins
OmpT | -K,R--K,R- | increase peptide length for middle down proteomics

a B = uncharged, nonaromatic amino acids (i.e., A, V, L, I, G, S); X = aliphatic, aromatic, or hydrophobic amino acids; Z = any amino acid.

Trypsin is the most used enzyme in proteomics studies. It is a serine protease that cleaves the bonds at the carboxyl side of arginine and lysine. Consequently, the tryptic peptides obtained should have a Lys (K) or Arg (R) at the C-terminus, which is useful information when identifying peptides8 . Other enzymes also used in proteomics are listed in Table 3.2.
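The trypsin rule in Table 3.2 (cleavage after K or R, but not when the next residue is P) can be sketched as a small in-silico digestion. This is a minimal illustration only: the function name `tryptic_digest`, the `missed_cleavages` parameter and the example sequence are hypothetical and do not come from the reviewed studies.

```python
import re

def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from cleaving C-terminal to K or R, except before P."""
    # Zero-width split: a position preceded by K or R and not followed by P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides

# Hypothetical sequence: the R-P bond is not cleaved.
print(tryptic_digest("AKRPGKLR"))  # ['AK', 'RPGK', 'LR']
```

Every fully cleaved peptide in the output ends in K or R, which is the C-terminal signature mentioned above.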

There are several features to be considered when digesting proteins65 :

The complexity of the sample increases exponentially.

The physicochemical properties of the peptides are more diverse than those of the intact proteins, and a separation method might struggle to separate them efficiently.

The first-order kinetics followed by endopeptidases such as trypsin bias the cleavage efficiency towards high-abundance proteins if the digestion time is not sufficiently long.

There will always be peptides that are too long or too short to be analyzed by a given LC-MS system, making the full protein sequence recovery impossible.

The digestion process introduces interlaboratory variability considering the conditions and time allowed for the process to take place. Most laboratories perform the digestion overnight.

For the above-mentioned reasons, in the case of small proteins and peptides the analysis of the intact molecules might be preferred; this approach is called ‘peptidomics’8,66 .

For the endoprotease to have access to the whole protein structure, the proteins must be adequately solubilized and unfolded. Since the tertiary and even quaternary structures of many proteins are maintained by disulfide bridges, proteins have to be denatured and reduced. This is achieved by reduction with dithiothreitol (DTT) followed by alkylation with iodoacetamide (IAA) or iodoacetic acid, which prevents the bridges from forming again (protection)51,52,67 . Organic solvents, urea and surfactants such as SDS (sodium dodecyl sulfate) have been shown to improve trypsin digestion because they unfold the macromolecules52 , but the incompatibility of such surfactants with LC-MS led to the development of MS-compatible surfactants such as ProteaseMAX, Invitrosol, Rapigest and PPS Silent Surfactant8 .

Modification of the digestion by physical methods such as heating, microwave heating under acidic conditions, covalent immobilization of trypsin within microreactors and microparticles can improve the efficiency of the process8 .


4. Downstream separation methods

To simplify the sample prior to MS analysis, different methods are used to further fractionate the peptide mixture. This section focuses on the reported separation methods based on liquid chromatography. Gel fractionation methods such as 1D and 2D gel electrophoresis, isoelectric focusing (IEF), capillary electrophoresis, ion mobility separations and related techniques are not discussed in this work.

Liquid chromatography

The ‘bottom-up’ proteomic approach relies on the identification of sufficiently unique peptides to allow protein identification. It combines efficient separations with mass spectrometry for the characterization of a complex mixture of peptides. Liquid chromatography is a very important separation tool in proteomics due to its high resolving power and compatibility with ESI-MS. The separation is based on the distribution of the analyte between the mobile phase (MP) and the stationary phase (SP). The latter is packed inside a column and consists of spherical particles or a porous monolithic material that selectively interacts with the analytes, separating them while they are pushed through the column by the mobile phase68 . The composition of the mobile phase can remain constant (isocratic conditions) or it can be changed (gradient conditions) to induce a gradual elution of the peptides from the column69 . Different formats exist depending on the type of interaction that produces the separation69 :

Reversed phase (RP, hydrophobic interactions): SP: non-polar (C4-18), MP: aqueous composition + organic modifier (acetonitrile, methanol, etc.).

Normal phase (NP, hydrophilic interactions): SP: polar (silica, amino, cyano, diol), MP: organic solvent (dichloromethane, chloroform, toluene, etc.).

Ion exchange (IEX, electrostatic interactions): SP: Strong cation exchange (SCX), weak cation exchange (WCX), strong anion exchange (SAX) and weak anion exchange (WAX) (ion exchange resins), MP: buffers with decreasing or increasing pH or salt concentration in cation or anion exchange, respectively.

Affinity (chemical affinity): SP: dye, lectins, hydrazide, TiO2, thiol resins, etc., MP: acidic buffers.

Size exclusion (SEC, physical interactions): size, molecular weight. SP: gels with specific pore sizes, MP: aqueous or organic solvents.

Hydrophilic interaction liquid chromatography (HILIC, hydrophilic interactions): SP: polar (silanol, diol, etc.), neutral (n-HILIC), zwitterionic (ZIC-HILIC or z-HILIC) and charged (ERLIC) 70,71 , MP: aqueous-organic miscible mixture (at least 70% of organic modifier) 88,89.

Figure 4.1 shows a schematic of the interactions in the LC systems most used in proteomics.

The efficiency of a chromatographic separation in proteomics is measured by the peak capacity ($n_c$), which has been defined as the number of peaks that can be resolved by the system9 . It can be calculated with the following equation (Eq. 4.1):

$$n_c = \frac{t_g}{W} + 1 \qquad \text{(Eq. 4.1)}$$

where $t_g$ is the gradient time and $W$ is the average peak width at half maximum9. High peak capacity is required to improve ionization efficiency, decrease ion suppression (caused by co-eluting high- and low-abundance peptides) and increase peptide identifications 41,52,72,73 .
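As a minimal sketch of Eq. 4.1, the function below computes the peak capacity from the gradient time and the average peak width. The function name and the numerical values (a 60 min gradient with 0.5 min wide peaks) are hypothetical and serve only as an illustration.

```python
def peak_capacity(gradient_time, peak_width):
    """Eq. 4.1: n_c = t_g / W + 1, with both arguments in the same time unit."""
    if peak_width <= 0:
        raise ValueError("peak width must be positive")
    return gradient_time / peak_width + 1

# Hypothetical values: 60 min gradient, 0.5 min average peak width.
print(peak_capacity(60.0, 0.5))  # 121.0
```

Halving the peak width at constant gradient time roughly doubles the peak capacity, which is why narrower peaks are so valuable for complex samples.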

4.1. One-dimensional liquid chromatography (1D-LC)

Reversed phase LC is the standard one-dimensional format applied in proteomic studies because it can be directly coupled with electrospray mass spectrometry. The reason is that the mobile phase is a miscible mixture of water and an organic solvent, frequently acetonitrile, that is easily evaporated in the electrospray. Furthermore, the addition of acids (formic, acetic or trifluoroacetic) to the mobile phase denatures and positively charges the peptides, which is also beneficial for the ionization process 68 .

Figure 4.1: Schematic representation of the different interactions between a tryptic peptide and different stationary phases71

In the search for higher separation efficiencies, the standard HPLC (high performance liquid chromatography) reversed phase columns with dimensions of 15-30 cm × 2.1-4.6 mm × 2-50 μm (length × internal diameter × particle size) have been reduced to 50 mm × 2.1 mm × 1.7-5 μm, mostly using C-8 and C-18 bonded silica particles 41,74,75 . The reduction of particle size and gradient length has been shown to increase the peak capacity, but a further reduction in particle diameter comes at the expense of an increase in backpressure 72,74 . Strategies for decreasing the pressure on the system are the use of superficially porous particles (1.7 μm solid core surrounded by a 0.5 μm porous layer) or monolithic columns (with no particles at all, but a porous skeleton with channels extending across the column length). Increasing the column temperature (35 to 60°C) also reduces the viscosity of the mobile phase, lowering the backpressure 52 .

Another platform that became very popular in the pharmaceutical field due to its high sample throughput and separation efficiency is UPLC (ultra performance liquid chromatography), because it allows the use of sub-2-μm particles at high backpressures (20,000 psi)41,76-78 . On the other hand, nano-LC, which sometimes sacrifices analysis time for sensitivity, gained popularity in biomarker discovery thanks to the low flow rates used with very narrow columns such as capillary columns (0.1-1 mm i.d.) and nanobore columns (0.025-0.1 mm i.d.). This format, in combination with small porous particles and monolithic column technology, improves peak resolution and ionization efficiency because it allows the use of nano-electrospray interfaces and relatively long columns and gradients 41,79 . A typical nano-LC column has an i.d. of 75 μm and requires flows of 200 nL/min or less. It is common practice to place a wider-i.d. guard column before the analytical column to protect the latter, increase the loading capacity and desalt the sample 80 .

Even though 1D-LC has been very useful for peptide separation and protein identification in many proteomics applications, the peak capacity of a 1D system is not high enough to deal with the overwhelming sample complexity of plasma and serum. For such cases, two-dimensional liquid chromatography performs better 68,81 .

4.2. Two-dimensional liquid chromatography (2D-LC)

Multidimensional chromatographic separations are more efficient at resolving a higher number of peptides and identifying a larger number of proteins 1,73,82 . According to Giddings 83 , the peak capacity of a two-dimensional LC (2D-LC) system is equal to the product of the peak capacities of the individual 1D separations ($n_{c1} \times n_{c2}$), but this is only achievable if the separation mechanisms in the two dimensions are independent from each other, i.e., orthogonal 9,84 .
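The product rule can be illustrated with a short worked example. The numbers are hypothetical and not taken from the cited studies: a first dimension with a peak capacity of 20 is combined with a second dimension with a peak capacity of 121 (for instance a 60 min gradient with 0.5 min wide peaks, according to Eq. 4.1).

```latex
n_{c,\mathrm{2D}} = n_{c1} \times n_{c2} = 20 \times 121 = 2420
```

This value should be read as an upper limit: it is only approached when the two dimensions are fully orthogonal, and any correlation between the retention mechanisms lowers the peak capacity that is effectively available.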

An example of the combination of two orthogonal separations is IEX-RP chromatography, where the separation in the first dimension is performed through electrostatic interactions and in the second dimension by hydrophobic mechanisms9 . Recent studies proposed that coupling two RP systems using sufficiently high pH (10) in the first dimension and low pH (2.6) in the second one could also provide orthogonal separations9,20,70-73. Additionally, HILIC has been shown to be a good alternative for the first-dimension separation. It is ideal for the separation of polar and highly hydrophilic compounds, requires a low salt concentration and a high proportion of organic modifier, and shows the highest degree of orthogonality when combined with RPLC (even more than SCX-RP and SEC-RP). The zwitterionic stationary phase has both positive and negative charges; hence, it provides hydrophilic and electrostatic interactions that facilitate the separation of similarly charged peptides that cannot be resolved by SCX. Even though the incompatibility of the mobile phases between HILIC and RP is a major issue, Zhao et al.85 successfully coupled the two, offering a sensitive alternative to conventional 2D-LC for protein identification 41,70,86,87 .

Figure 4.2 shows normalized retention time plots, where the highest degree of orthogonality corresponds to the lack of correlation between the separations in the two dimensions. HILIC, SCX and basic RP appear to be good partners for acidic RP to achieve a high degree of orthogonality.

Figure 4.2: Normalized retention time plots of different combinations of LC modes168

The two dimensions can be combined in either online or offline mode71 . In offline 2D-LC, the fractions eluting from the first dimension are collected and the solvent is evaporated. The concentrated analytes are later re-dissolved in a solvent compatible with the second-dimension separation system. The obvious advantage is a higher peak capacity, because the separation in each dimension can be optimized. On the other hand, the additional steps are time-consuming and can lead to analyte losses and low reproducibility9,88 . Online 2D-LC is technically challenging because it requires compatibility between the mobile phases used in both dimensions and a very fast second dimension to maintain the resolution obtained in the first one. A strategy to accomplish this is to use multiple second-dimension columns. Some of the advantages of the online setups are higher reproducibility, automation, higher throughput and reduced sample amount. However, they require complex valve systems (2-10 valves) and customized software9,87. Different instrumental setups exist to couple multidimensional LC to MS, and the most commonly used ones are shown in Figure 4.3. Pivotal importance has been given to the online coupling to tandem MS. This approach, known as MudPIT (Multidimensional Protein Identification Technology), has been successfully used in many plasma and serum proteomic analyses 1,73,82,89,90 (see also Table 4.1).

Figure 4.3: Different instrumental setups for two-dimensional separations applying SCX and RP. A) In an off-line setup the sample is first separated by SCX and fractions are collected. The fractions can be processed if needed and are subsequently separated by RP-LC and analyzed by MS. B) An example of an on-line column switching setup. The sample is first loaded onto the SCX column and eluted stepwise onto the trap column. The sample is then desalted and subsequently eluted onto the analytical RP column followed by MS analysis. C) In the MudPIT approach the SCX and RP materials are in one capillary that also functions as spray tip for direct MS analysis. In the triphasic setup an additional RP phase is packed before the SCX and functions as a trap for desalting the sample prior to SCX–RP-MS. D) Chromatograms of a MudPIT analysis. Each color indicates the RP separation after one salt step71

The classic combination performs the first-dimension separation on an SCX column and an RP separation in the second one. The reason for this preference is that SCX requires elution with increasing salt concentrations or decreasing buffer pH, which is compatible with RP stationary phases (this is not the case for SAX, which needs a high pH that would dissolve the silica-based support material) 80 . The analytes exiting the SCX column are focused at the beginning of the second column (RP, C-8 or C-18) because the strong eluent of the first dimension is a weak eluent of the second dimension. In addition, this gives the RP column a desalting capability that is crucial for compatibility with the mass spectrometer 9,71,73 .


Table 4.1: Analysis of plasma and serum proteins by bottom-up proteomics using LC-MS/MS

Objective and matrix | Year | Sample preparation | Liquid chromatography (1D-LC / 2D-LC) | Ionization | Instrument | Acq. mode | Proteins identified | Ref.
Identification of carbonylated proteins in plasma | 2014 | Avidin affinity enrichment; trypsin digestion | Nano-UPLC C-18, 10 cm x 75 μm x 1.7 μm | Nano-ESI | LTQ Orbitrap | DDA | 114 carbonylated proteins | 91
Plasma proteome | 2005 | Trypsin digestion; C-18 SPE | Online MicroSPE C-18, 4 cm x 75 μm x 5 μm, 300 Å; SCX Polysulfoethyl A, 5 μm, 300 Å, 80 cm x 320 μm; Nano-LC C-18, 85 cm x 30 μm x 3 μm, 300 Å | Nano-ESI | LCQ XP Ion trap | DDA | 1682 | 92
Plasma proteome | 2014 | Depletion of HAP using PROT20S kit; trypsin digestion | Nano-LC C-18, 15 & 50 cm x 75 μm x 2 μm, 100 Å | Nano-ESI | LTQ-Orbitrap | DDA | 114 | 93
Human plasma N-glycoproteome | 2005 | Depletion by MARS-6; enrichment with hydrazide resins; trypsin digestion | SCX Polysulfoethyl A, 20 cm x 2.1 mm, 5 μm, 300 Å; Nano-LC C-18, 65 cm x 150 μm x 3 μm, 300 Å | Nano-ESI | LCQ Ion trap | DDA | 303 N-glycoproteins | 94
Plasma quantitative proteomics | 2015 | Whole plasma; trypsin digestion | Nano-LC Ethyl hybrid C-18, 25 cm x 75 μm x 1.7 μm | Nano-ESI | QTOF | DIA MSE | 59 | 95
Protein quantitation in plasma | 2011 | Precipitation with IPA; trypsin digestion | HPLC C-18, 5 cm x 2.1 mm x 3.5 μm | ESI | Q-LIT | SRM | n.a. | 55
Plasma proteome | 2015 | Depletion of HAP by MARS14; trypsin digestion | Offline HPLC C-18, 5 cm x 4.6 mm x 5 μm, 300 Å; Nano-LC C-18, 15 cm x 75 μm x 3 μm, 100 Å | Nano-ESI | TripleTOF | DDA | 303 | 96
Plasma proteomics | 2006 | Whole plasma; depletion of HAP by MARS14; depletion of HAP by IgY-12; depletion of HAP by IgY-12; depletion of HAP by IgY-12; trypsin digestion | SCX Polysulfoethyl A, 20 cm x 4.6 mm, 5 μm, 300 Å; Nano-LC C-18, 65 cm x 150 μm x 3 μm | Nano-ESI | RP-LTQ; RP-LTQ; RP-LTQ; RP-FT-ICR; SCX-RP-LTQ | DDA | 96; 111; 122; 162; 369, respectively | 41
Targeted protein quantitation in plasma | 2011 | Trypsin digestion; online C-18 SPE | HPLC C-18, 10 cm x 2 mm x 5 μm | ESI | Q-LIT | SRM | n.a. | 97

Targeted plasma proteomics | 2012 | SISCAPA; trypsin digestion | UPLC C-18, 5 cm x 2.1 mm x 1.8 μm, 300 Å | Nano-ESI | QQQ | MRM | 5 | 56
Plasma proteomics | 2015 | Precipitation with saturated ammonium sulfate solution; trypsin digestion | Nano-LC C-18, 50 cm x 75 μm x 3 μm, 100 Å | Nano-ESI | LTQ-Orbitrap | DDA | 224 | 98
Growth factor quantification in plasma | 2013 | Immunoaffinity enrichment; trypsin digestion | Nano-LC C-18, 5 cm x 175 μm x 3 μm, 100 Å | Nano-ESI | Q-LIT | SRM | n.a. | 99
Plasma proteomics | 2005 | Trypsin digestion; C-18 SPE | Online MicroSPE C-18, 4 cm x 150 μm, 5 or 3 μm, 300 Å; Nano-LC C-18, 200, 90 and 40 cm x 50 μm x 3, 2 and 1.4 μm (300, 100 and 120 Å) | Nano-ESI | LTQ | DDA | 835 | 100
Plasma proteomics | 2008 | Depletion with ProteomeLab + Seppro IgY-SuperMix; trypsin digestion | SCX polysulfoethyl A, 20 cm x 2.1 mm, 5 μm, 200 Å; Capillary LC C-18, 65 cm x 75 μm x 3 μm | Nano-ESI | LTQ | DDA | 695 | 101
Plasma proteomics | 2009 | Depletion with ProteoMiner; depletion with MARS-14; trypsin digestion | Nano-LC C-18, 15 cm x 75 μm | MALDI | TOF/TOF | DDA | 86; 94 | 63
Rheumatoid arthritis plasma proteome | 2009 | Depletion with albumin and IgG depletion kit + multi-lectin affinity column; trypsin digestion | Nano-LC C-18, 15 cm x 75 μm x 5 μm, 200 Å | Nano-ESI | LTQ FT | DDA | 308 | 102
Quantitative plasma proteomics | 2014 | Depletion with MARS-14; trypsin digestion | SCX SPE; Nano-LC, 50 cm x 75 μm x 2 μm, 100 Å | Nano-ESI | LTQ-Orbitrap | DDA | 149 | 103
Quantitative plasma proteomics | 2005 | Precipitation with methanol; trypsin digestion | SCX polysulfoethyl A, 20 cm x 4.6 mm x 5 μm, 300 Å; Capillary LC C-8, 65 cm x 150 μm x 3 μm | ESI | FT-ICR | DDA | 429 | 104
Quantitation of plasma proteins | 2006 | | Nano-LC C-18, 75 μm | Nano-ESI | Q-LIT | MRM | n.a. | 105

Plasma proteomics | 2009 | Depletion with MARS-7; trypsin digestion | Nano-LC C-18, 15 cm x 75 μm x 5 μm, 100 Å | Nano-ESI | LTQ-Orbitrap | DIA PAcIFIC | 746 | 18
Characterization of albumin adducts in plasma | 2010 | Albumin adduct enrichment with thiol-affinity resins; trypsin digestion | HPLC C-8, 7.5 cm x 500 μm | ESI | LTQ-Orbitrap | DDA | n.a. | 57
Glycoproteins, non-glycoproteins, cysteinyl proteins and non-cysteinyl proteins in plasma | 2006 | Depletion with Seppro MIXED-12; trypsin digestion; SPE C-18 | Hydrazide capture, SCX polysulfoethyl A, 20 cm x 2.1 mm; thiol-affinity resins, SCX polysulfoethyl A, 20 cm x 2.1 mm; Capillary-LC C-18, 65 cm x 150 μm x 3 μm | Nano-ESI | LIT | DDA | 662; 1,486; 1,977; 1,914 | 58
Depletion of HAP vs. enrichment of LAP in plasma | 2011 | Enrichment of LAP with ProteoMiner; depletion of HAP with ProteoPrep20; depletion of HAP with ProteoExtract + ProteoPrep20; trypsin digestion | SPE SCX; Capillary LC C-18, 15 cm x 75 μm | Nano-ESI | QTOF | DDA | 318; 334; 429 | 64
LAP plasma proteomics | 2010 | Whole plasma; HSA depletion; HSA depletion + MARS-14; trypsin digestion | Nano-LC C-12, 10 cm x 75 μm x 4 μm | Nano-ESI | LTQ-Orbitrap | DDA | 29; 61; 76 | 106
Identification and quantitation of the plasma proteome | 2015 | Depletion of HAP proteins with IgY14; digestion with Lys-C; iTRAQ labels | Offline HPLC basic pH RP, 15 cm x 2.1 mm, 300 Å; Nano-LC C-18, 20 cm x 75 μm x 1.9 μm; C-18, 10 cm x 75 μm x 3 μm | Nano-ESI | Q-Orbitrap; Q-Orbitrap | DDA; DDA | 5303; 3400 | 107

Glycated albumin quantitation in plasma | 2015 | Whole plasma; trypsin digestion | UPLC C-18, 15 cm x 2.1 mm x 1.9 μm; Micro-LC C-18, 10 cm x 300 μm x 3 μm, 120 Å; Micro-LC Ethyl hybrid, 25 cm x 75 μm x 1.7 μm | Nano-ESI | Q-Orbitrap; TripleTOF; Q-IM-TOF | DDA; PRM; DDA; DIA SWATH; DIA MSE | n.a. | 108
Quantitation of albumin adducts in plasma | 2015 | Albumin adduct enrichment with thiol-affinity resins; trypsin digestion | Nano-LC C-18, 25 cm x 3 μm | Nano-ESI | Q-Orbitrap | DIA | n.a. | 109
Quantitation of peptides in plasma | 2013 | SPE C-18; trypsin digestion | UPLC C-18, 10 cm x 2.1 mm x 1.7 μm | Nano-ESI | QQQ | SRM | n.a. | 110
Serum proteomics | 2002 | Depletion by protA/G; trypsin digestion | SCX Polysulfoethyl A, 5 μm, 300 Å; Nano-LC C-18, 60 cm x 150 μm x 5 μm | Nano-ESI | LCQ XP ion trap | DDA | 490 | 90
Serum proteomics | 2011 | Whole serum; depletion of HAP by MARS14; depletion by ProteoPrep20; trypsin digestion | Nano-LC C-18, 15 cm x 100 μm x 5 μm | Nano-ESI | LTQ-Orbitrap | DDA | 266; 404; 442, respectively | 38
Protein drugs quantitation in serum | 2009 | Albumin depletion kit; protein A enrichment; immunoaffinity enrichment; trypsin and Lys-C digestion | Nano-LC C-18, 15 cm x 75 μm x 3 μm, 200 Å | Nano-ESI | LTQ-XL | SRM | n.a. | 111
Interferon quantification in serum | 2009 | Monolithic C-18 SPE; cation exchange SPE | Nano-LC C-18, 5 cm x 2.1 mm x 5 μm | Heated-ESI | QQQ | SRM | n.a. | 112

Serum proteomics | 2014 | Precipitation with acetone; trypsin digestion | Online SCX, 5 cm x 75 μm; Nano-LC C-18, 10 cm x 75 μm | Nano-ESI | LTQ-XL | DDA | >1000 | 16
Human serum albumin | 2012 | Trypsin digestion | Nano-UPLC C-18, 75 μm x 360 μm, 100 Å | Nano-ESI | LTQ-FT; LTQ-Orbitrap | DDA; DIA FT-ARM | 150 attomoles of BSA | 113
Interferon quantification in serum | 2014 | SCX-SPE | UPLC-LC C-18, 5 cm x 2.1 mm x 1.7 μm, 135 Å | Nano-ESI | QQQ | MRM | n.a. | 114
Quantification of enterotoxin A in serum | 2012 | Streptavidin-coated magnetized beads; trypsin digestion | Nano-LC C-18, 15 cm x 75 μm x 3 μm | Nano-ESI | Q-LIT | SRM | n.a. | 115
Quantification of transferrin in serum | 2012 | Depletion with ProteoPrep and albumin and IgG depletion kit; trypsin digestion | Capillary LC C-18, 2 cm x 2.1 mm x 3 μm | Nano-ESI | QQQ | MRM | n.a. | 116
Serum proteomics | 2005 | Depletion with MARS-6; trypsin digestion | Offline SCX, 5 cm x 0.8 mm; Nano-LC | Nano-ESI; MALDI | Ion trap; TOF | | 89; 76 | 117
Quantification of parathyroid hormone in serum | 2010 | Immunocapture beads; trypsin digestion | Nano-LC C-18, 5 cm x 2.1 mm x 3 μm, 120 Å | ESI | QQQ | SRM | n.a. | 118
Quantification of HCG in serum | 2012 | Immunocapture beads; trypsin digestion | HPLC C-18, 5 cm x 1 mm x 5 μm, 300 Å | Nano-API | QQQ | SRM | n.a. | 119
Serum proteomics | 2006 | Dye-ligand Multiple Affinity Removal chromatography; trypsin digestion | SCX Poly SEA, 15 cm x 2 mm, 5 μm, 300 Å; Capillary LC C-18, 15 cm x 1 mm x 3 μm | Nano-ESI | Ion trap | DDA | n.a. | 21
Quantitation of serum glycoproteins | 2007 | Hydrazide capture; trypsin digestion | Nano-LC C-18, 75 μm | Nano-ESI | LTQ-FT-ICR | MRM | n.a. | 120

Serum PTMs proteomics | 2016 | Immunoaffinity enrichment; IMAC; trypsin digestion | Nano-LC C-18, 10 cm x 75 μm x 5 μm, 100 Å | Nano-ESI | LTQ-Orbitrap | DDA | 520; 688 | 121
Quantitation of 4 proteins in serum | 2011 | | Nano-LC C-18, 20 cm x 75 μm x 1.7 μm | Nano-ESI | QTOF | DIA MSE | n.a. | 122
Quantitation of proteins in serum | 2011 | Whole serum; trypsin digestion | Nano-LC C-18, 15 cm x 75 μm x 5 μm, 100 Å | Nano-ESI | LTQ-XL | DIA PAcIFIC | 311 | 123
Absolute quantitation of 6 exogenous proteins in serum | 2006 | Whole serum; trypsin digestion | Nano-LC C-18, 15 cm x 300 μm | Nano-ESI | QTOF | DIA MSE | n.a. | 124
Targeted detection of LAP proteins in serum | 2008 | Depletion by precipitation of HAP with acetonitrile; trypsin digestion | Nano-LC C-18, 15 cm x 75 μm | Nano-ESI | QQQ | MRM | 29 | 51
Serum proteomics | 2004 | | Online SCX polysulfoethyl A, 10 cm x 320 μm; offline SCX Hypersil, 25 cm x 4.6 mm; offline SCX Hypersil, 25 cm x 4.6 mm; Micro-LC C-18, 10 cm x 180 μm x 5 μm, 300 Å; Micro-LC C-18, 25 cm x 150 μm; Nano-LC C-18, 15 cm x 150 μm | Micro-ESI; Micro-ESI; Nano-ESI | Ion trap | DDA | 131; 224; 330 | 89
Quantitation of peptides in serum | 2009 | SPE C-18; trypsin digestion; immunocapture | UPLC C-8, 5 cm x 1 mm x 5 μm, 300 Å | ESI | QQQ | SRM | n.a. | 53
Quantitation of therapeutic protein in serum | 2015 | Precipitation with methanol; trypsin digestion | Capillary LC C-18, 10 cm x 2.1 mm x 1.7 μm; Capillary LC C-8, 10 cm x 2.1 mm x 2.5 μm | API | QQQ | MRM | n.a. | 20

Serum proteomics | 2009 | Whole serum; trypsin digestion | SCX polysulfoethyl A, 5 cm x 4.6 mm x 5 μm, 300 Å; C-18, 15 cm x 2.1 mm x 3.5 μm, 130 Å; C-18, 15 cm x 2.1 mm x 3.5 μm, 130 Å; Capillary-LC C-18, 15 cm x 300 μm; Capillary-LC C-18, 15 cm x 300 μm; Capillary-LC C-18, 15 cm x 300 μm | Nano-ESI | Q-TOF | DIA MSE | 142; 184; 52 | 125
Differential expression proteins in serum | 2014 | Enrichment of LAP with ProteoMiner + lectin affinity; trypsin digestion | Capillary LC C-8, 20 cm x 75 μm x 3 μm | Nano-ESI | LTQ-Orbitrap | DDA | 58 | 126
Serum proteomics | 2005 | Depletion of HAP with MARS-6; digestion with Lys-C | Offline SCX polysulfoethyl A, 3.5 cm x 3 μm x 3.5 μm; Capillary LC C-18, 3.5 cm x 300 μm x 5 μm | ESI | Ion trap | DDA | 107 | 82
Plasma and serum proteomics | 2007 | | Offline SAX, 10 mm x 10 μm; Nano-LC C-18, 25 cm x 75 μm | Nano-ESI | LTQ-FT-ICR | DDA | 1662 | 127
Quantification of thyroglobulin in plasma or serum | 2013 | Immunoprecipitation; trypsin digestion | Zorbax XDB-CN, 5 cm x 2.1 mm, 5 μm; Nano-LC Poroshell C-18, 10 cm x 3 mm x 2.7 μm, 120 Å | Nano-API | QQQ | MRM | n.a. | 128
LAP plasma and serum proteomics | 2006 | Precipitation of HAP with acetonitrile; trypsin digestion; supernatant desalting; C-18 ZipTip | Nano-LC C-18, 15 cm x 75 μm x 5 μm, 200 Å | Nano-ESI | Ion trap-FT | DDA | 34 in plasma; 50 in serum | 129
Plasma and serum proteomics | 2009 | LAP enrichment with ProteoMiner; trypsin digestion | Nano-LC C-18, 15 cm x 75 μm | Nano-ESI | LTQ-Orbitrap | DDA | 134 | 130

5. Mass Spectrometry

Regardless of the method used to fractionate the peptides in serum or plasma, the preferred identification tool for peptides is mass spectrometry. This technique generates ions from analyte molecules in the liquid phase by several mechanisms, separates them under vacuum according to their mass-to-charge ratio (m/z), and detects them and measures their intensity. MS and MS/MS analyses generate a spectrum with the ion intensities on the y-axis and the m/z values on the x-axis. The essential components of a mass spectrometer are the ion source, the mass analyzer(s) and the ion detection system1,65 .

5.1. Ionization techniques

Mass spectrometry has been used for more than four decades in many analytical laboratories to analyze and characterize small molecules, applying ionization techniques such as electron ionization that are too harsh for protein analysis65 . Proteomics, on the other hand, switched from immunoassays to mass spectrometry as an identification tool only when the coupling with liquid chromatography became possible through the development of ionization techniques such as matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI). These softer ionization methods allow proteins and peptides to be charged without breaking apart, and in the case of ESI the ions carry multiple charges1,131 .
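Because ESI produces multiply charged ions, the same peptide appears at several m/z values. The sketch below assumes simple protonation, i.e. ions of the form [M + zH]z+, and uses a hypothetical peptide with a monoisotopic mass of 1500 Da; the function name is illustrative only.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton

def mz_of_protonated_ion(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a neutral monoisotopic mass in Da."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge

# Hypothetical 1500 Da peptide observed at charge states 1+ to 3+.
for z in (1, 2, 3):
    print(z, round(mz_of_protonated_ion(1500.0, z), 4))
```

Higher charge states move a peptide to lower m/z, which brings larger peptides into the working range of the mass analyzer.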

In MALDI the peptides are co-crystallized with a matrix (an organic acid) on a metal plate surface, and the matrix and the analytes are brought into the gas phase together by a laser. Because the sample sits on a solid support, MALDI is performed offline from the LC separation, and it is preferred for the analysis of tissue samples and cells (see Figure 5.1a) 131 . ESI uses a different principle for the ionization. In this case, the analytes in solution are ionized and vaporized by forming a fine spray at the end of a high-voltage needle (see Figure 5.1b). Once the analytes are in the gas phase, they are separated in one or several consecutive mass analyzers. When analytes co-elute from the LC into the ESI source, hydrophobic molecules are ionized preferentially; this effect is known as ion suppression and is responsible for the concentration-related sensitivity of this ionization technique8,132 . One way to improve the ionization and reduce ion suppression is to decrease the flow rate of the chromatographic separation entering the sprayer4,133 . Therefore, capillary LC with nano-ESI formats (performed at atmospheric pressure) are the preferred tools for proteomic studies of plasma and serum134 .

Surface enhanced laser desorption/ionization (SELDI) is an application where the sample-presenting medium acts as a solid phase extractor, combining on-chip sample preparation steps (purification, extraction, modifications, amplification) with laser desorption ionization. SELDI is mostly used in biomarker discovery studies for intact protein profiling and for comparing disease and control states13,135 .

Figure 5.1: Schematic of (a) MALDI ionization65 and (b) the ESI ionization process131

5.2. Mass analyzers

There are basically four types of mass analyzers: quadrupole (Q), time of flight (TOF), ion trap and Fourier transform ion cyclotron resonance analyzers. Depending on the application and the required performance parameters, they can be used individually or in a hybrid combination1 . The quadrupole has four parallel electrode rods. When a radio frequency (RF) voltage and a direct current (DC) voltage are applied to them, they generate an electric field that focuses the ions in the center of the rods. At any given combination of RF and DC, only ions with a particular m/z have a stable trajectory that directs them to the detector; the rest collide with the rods. Consequently, this mass analyzer can be used as a mass filter or as a scanner (see Figure 5.2) 131,136 . Ion traps are based on principles similar to those of the quadrupoles. However, they differ in the architecture of the electrodes: the 3D ion trap (LCQ) has a ring electrode and two end-cap electrodes instead of parallel rods (see Figure 5.3a). A potential is applied to the end-caps to confine the ions inside the trap, which makes it possible to store ions and to select those with a particular m/z to be ejected to the detector. Similarly, quadrupoles can be used to trap ions by adding capping electrodes that generate a potential well inside; this type of ion trap is called a 2D ion trap or linear ion trap (LIT or LTQ, see Figure 5.3b)136,137 .

Figure 5.2: Schematic of the principle of a quadrupole mass analyzer65


Time of flight (TOF) analyzers measure the time it takes the ions to travel through a 1 m field-free tube after being accelerated to the same kinetic energy by an electric pulse. Low m/z ions reach the detector earlier than ions with higher m/z, and the flight time is then converted into m/z values. TOF analyzers use a reflectron to compensate for differences in the initial kinetic energy of ions with the same m/z; in this way, ions with the same m/z arrive at the detector at the same time regardless of their initial kinetic energy (see Figure 5.4 below)131,136 .

Figure 5.4: Schematic of the time of flight analyzer with reflectron131
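The relation between flight time and m/z can be sketched for an idealized linear TOF, in which every ion acquires the kinetic energy zeU and then drifts through the field-free tube. This is a simplified textbook model that ignores the acceleration region and the reflectron; the 20 kV acceleration voltage is an assumed value, and only the 1 m tube length comes from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
DALTON = 1.66053906660e-27   # atomic mass constant in kg

def flight_time(mz, accel_voltage=20_000.0, tube_length=1.0):
    """Drift time in seconds for an ion of a given m/z (Da per charge)."""
    # z*e*U = (1/2)*m*v^2  ->  v = sqrt(2*e*U / (m/z))
    velocity = math.sqrt(2 * E_CHARGE * accel_voltage / (mz * DALTON))
    return tube_length / velocity

# Flight time scales with the square root of m/z.
for mz in (500.0, 1000.0, 4000.0):
    print(mz, f"{flight_time(mz) * 1e6:.2f} us")
```

In this model an ion with four times the m/z takes twice as long to reach the detector, which is the basis for converting flight times into m/z values.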

Ion cyclotron analyzers accelerate the ions orthogonally to the field of a powerful magnet, which generates a cyclotron motion. The frequencies of these motions are recorded on detector plates, deconvoluted by Fourier transformation and converted into m/z values (see Figure 5.5a). Instruments working on this principle are called Fourier-transform ion cyclotron resonance (FT-ICR) mass spectrometers; they are the most expensive and accurate ones, which restricts their use to research facilities131,136,138 . Orbitrap (OT) instruments are based on a principle similar to that of the FT-ICR, but they do not require a powerful magnet: the ions spin around a central barrel-like electrode. The image current of the axial motion of the ions is recorded and Fourier-transformed into m/z values (see Figure 5.5b). These differences have a significant impact on the price of these spectrometers, making them more accessible for analytical laboratories131,136,139 .
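The conversion from frequency to m/z in FT-ICR can be made explicit with the standard cyclotron relation. It is a textbook expression that is not given in the cited sources; here $f_c$ is the cyclotron frequency, $z$ the charge number, $e$ the elementary charge, $B$ the magnetic field strength and $m$ the ion mass.

```latex
f_c = \frac{z e B}{2 \pi m}
\quad\Longrightarrow\quad
\frac{m}{z} = \frac{e B}{2 \pi f_c}
```

Since the frequency depends only on m/z and the field strength, and frequencies can be measured very precisely, this relation helps explain the high mass accuracy attributed to these instruments.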

(a) (b)

Figure 5.3: (a) Schematic of a transversal section of a LCQ and (b) and schematic of a standalone LQT with two detectors for radial expulsion of the ions131


The performance of a mass analyzer is frequently assessed in terms of resolution (m/z divided by the full width at half maximum (FWHM) of the peak) and mass accuracy (the difference between the theoretical mass and the measured mass of a compound). Price is another characteristic that can be decisive when purchasing an instrument.
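The two performance parameters defined above can be computed as follows. This is a minimal sketch; the function names and the example peak values are hypothetical, and the error in parts per million (ppm) is included because mass accuracy is often reported in that unit.

```python
def resolution(mz, fwhm):
    """Resolution as m/z divided by the peak width at half maximum."""
    return mz / fwhm

def mass_error_da(measured, theoretical):
    """Mass accuracy as the absolute deviation in Da."""
    return measured - theoretical

def mass_error_ppm(measured, theoretical):
    """Mass accuracy relative to the theoretical mass, in ppm."""
    return (measured - theoretical) / theoretical * 1e6

# Hypothetical peak at m/z 1000 with a FWHM of 0.04, measured 0.005 too high.
print(resolution(1000.0, 0.04))
print(mass_error_da(1000.005, 1000.000))
print(mass_error_ppm(1000.005, 1000.000))
```

For this hypothetical peak the resolution is about 25,000 and the mass error about 5 ppm (0.005 Da).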

Quadrupoles and ion traps are the analyzers most frequently used for high-throughput applications due to their robustness and reasonably low price. However, they have a relatively low resolution (3000) and mass accuracy (0.1 Da)86. One study reported a 4- to 6-fold increase in the number of peptides and proteins identified using an LTQ compared with an LCQ with the same protocols and acquisition parameters137. TOF analyzers are more accurate than quadrupoles and ion traps but less sensitive (typical resolution of 25,000 and mass accuracy of 0.005 Da) and have an intermediate price; they are also sensitive to factors such as room temperature, ion intensities and detector dead time131. Finally, the best performance parameters are obtained with the FT-ICR and OT (sub-ppm mass accuracy and a resolution of 400,000 for an ion of m/z 1,000)136, but their price is also high. Several setups of these analyzers make it possible to perform tandem MS (MS/MS). In some cases, hybrid mass spectrometers are required; these combine the advantages of different mass analyzers to improve the sensitivity and selectivity of the detection and to isolate, fragment and analyze ions sequentially. A typical MS/MS (MS2) experiment uses one analyzer to select ions with particular m/z values (precursor ions); in a collision cell these are then activated and fragmented by collision with a neutral gas (collision-induced dissociation) into smaller ions (product ions) that are subsequently analyzed. This process can be repeated n times, in which case it is called an MSn experiment.


Figure 5.5: Principle of the (a) FT-ICR and (b) orbitrap136


MS2 experiments were initially performed in triple quadrupoles (QQQ), where the first quadrupole selects the ions, the second serves as a collision cell and the third can either filter specific ions or scan them131. However, in search of higher selectivity and sensitivity, and to be able to perform MSn experiments, hybrid mass spectrometers are currently used in most proteomics studies of plasma and serum (see Table 4.1). The most common setups are shown in Figure 5.6.

Figure 5.6: Schematic of hybrid mass spectrometers used for plasma and serum proteomics: (a) triple quadrupole-linear ion trap (QLT), (b) LTQ-Orbitrap131 and (c) TripleTOF (QTOF)8

Figure 5.7: Schematic of a 2D nano-LC system coupled to a nano-ESI QTOF mass spectrometer. The first-dimension separation is performed on an SCX column, followed by trapping on C-18 trap columns for desalting and transporting the sample to the second-dimension RP separation131


Bottom-up workflows typically apply 2D gel electrophoresis of proteins followed by digestion and analysis by MALDI-TOF, or protein digestion with trypsin followed by capillary LC or nano-LC (1D or 2D) coupled to nano-ESI MS/MS. One important characteristic to consider when coupling LC to MS is the duty cycle of the mass analyzer. In conventional LC, where peak widths are in the range of 10-30 s, most analyzers are capable of performing several MS or MS/MS experiments across a peak. In fast LC, where peaks can be as narrow as 1-2 s, only TOF analyzers are fast enough to deliver good resolution131.
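The interplay between peak width and acquisition speed can be made concrete by counting how many acquisition cycles fit within one chromatographic peak. In the sketch below the cycle times are hypothetical round numbers chosen for illustration only; they are not specifications of any particular analyzer.

```python
import math

def scans_across_peak(peak_width_s, cycle_time_s):
    """Number of complete acquisition cycles that fit within one LC peak."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    # The small offset guards against floating-point rounding just below an integer.
    return math.floor(peak_width_s / cycle_time_s + 1e-9)

# Peak widths from the text (10-30 s conventional LC, 1-2 s fast LC);
# cycle times are assumed values.
for cycle_time in (1.0, 0.1):
    for peak_width in (10.0, 30.0, 1.0, 2.0):
        n = scans_across_peak(peak_width, cycle_time)
        print(f"cycle {cycle_time:4.1f} s, peak {peak_width:4.1f} s: {n} scans")
```

With a 1 s cycle, a 1-2 s peak is sampled only once or twice, which is too few points to describe its shape, whereas a 10-30 s peak is sampled many times.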

5.3. Ion detectors After the ions are separated according to their m/z by the mass analyzer, they are detected and the signal is converted into an ion current. Detection is done by electron and photon multipliers, where the ions hit a surface that emits either electrons or photons, and the signal is amplified and recorded. There are basically four types of ion detectors131:

Discrete-dynode multipliers

Channel electron multipliers

Photon multiplier

Multichannel plate multipliers

5.4. Peptide fragmentation To identify peptides, and through them proteins, the peptides must undergo fragmentation so that structural information can be acquired from the mass spectrometric analysis. This is achieved in a tandem MS experiment inside an activation or collision cell, where the ionized peptides are activated by energy transfer to induce dissociation. Currently, the most common activation method is collision-induced dissociation (CID), also called collision-activated dissociation (CAD). Here a neutral gas is introduced into the collision cell containing the ionized peptides. The gas molecules collide with the peptide ions, and part of the kinetic energy of the collision is transferred to the peptide. As a result, the peptide bond breaks, leaving the positive charge on the N- or C-terminal fragment (generating b-type or y-type ions, respectively; see Figure 5.8)8. Other activation methods break other bonds and therefore produce different ions; the information obtained by these methods is complementary to CID and adds valuable information for the identification of peptides. Electron capture dissociation (ECD) and electron transfer dissociation (ETD) generate c-type and z-type ions, using thermal electrons and fluoranthene anions, respectively, to introduce an electron into the peptide and induce random fragmentation of the peptide backbone (see Figure 5.9). The primary applications of these dissociation methods are in the study of PTMs, because the fragmentation is limited to the backbone, leaving the modification intact. The generation of a ladder of ions facilitates determining the sequence of the peptide and locating the modification8,136.
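The b- and y-ion ladders produced by CID can be predicted from the peptide sequence. The following sketch assumes unmodified peptides, singly charged fragments and standard monoisotopic residue masses; the function name `b_y_ions` is illustrative and does not correspond to any particular software package.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O retained by the C-terminal fragment
PROTON = 1.007276   # mass of the charge-carrying proton

def b_y_ions(peptide):
    """Return a dict of singly charged b- and y-ion m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    ions = {}
    for i in range(1, n):
        # b_i holds the first i residues (N-terminal fragment).
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        # y_(n-i) holds the remaining residues plus water (C-terminal fragment).
        ions[f"y{n - i}"] = sum(masses[i:]) + WATER + PROTON
    return ions

# Peptide used as an example in Figure 5.12.
for name, mz in sorted(b_y_ions("VLENTEIGDSIFDK").items(),
                       key=lambda item: (item[0][0], int(item[0][1:]))):
    print(f"{name:>4}: {mz:.4f}")
```

Consecutive ions of the same series differ by the mass of one residue, which is what allows the sequence to be read from the ladder.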


5.5. Acquisition strategies In the early days of LC-MS and LC-MS/MS the three following acquisition modes were commonly used 140 :

Full scan mode (the most frequently used mode for LC-MS): scans all the ions (the whole m/z spectrum) present at a given time.

Selected Ion Monitoring (SIM): a targeted approach in which a particular m/z is selected and only the ions with this m/z are detected and recorded in the MS spectrum. In the case of MS/MS, a spectrum of fragment ions is also acquired from each peptide with the selected m/z.

Selected Reaction Monitoring (SRM) or Multiple Reaction Monitoring (MRM): also a targeted approach, performed by tandem MS (primarily in a QQQ). It consists of choosing a precursor ion in the MS1 analyzer (Q1), fragmenting it inside the collision cell (Q2) and selecting and detecting a particular fragment/product ion in the MS2 analyzer (Q3). The MS can also monitor several transitions from the same sample (MRM). This approach has been used extensively for sensitive and specific quantification and has become the gold standard in protein quantification methods141.

Figure 5.8: Illustration of the bonds that are broken during CID8

Figure 5.9: Schematic of the ETD or ECD dissociation and the fragment ions generated8

With the development of faster and hybrid mass spectrometers, new automated acquisition routines became possible, such as data-dependent acquisition (DDA), which is currently the standard in bottom-up proteomics, and data-independent acquisition (DIA), which appears to be a good alternative for exploring the large dynamic range of the proteins present in plasma and serum. Additionally, a targeted variation of SRM called parallel reaction monitoring (PRM) has been developed for high-resolution, high-mass-accuracy instruments; it increases specificity and robustness in the presence of interfering ions because it records all product ions from the selected precursor ion141 (see Figure 5.10).

Figure 5.10: Scheme of acquisition modes for targeted proteomic quantification. PRM is more robust against interfering ions because it records all fragment ions instead of only a few pre-selected ones as in SRM141

5.5.1. Data-Dependent Acquisition (DDA) DDA is a tandem MS acquisition method that cycles between the acquisition of a survey MS1 scan (full MS scan), from which the most abundant precursor ions are selected, and subsequent targeted MS2 scans of the isolated ions. The MS/MS scans have a narrow isolation window (typically 2 m/z)8. With this data collection method, sometimes also referred to as information-dependent acquisition (IDA), when peptides co-elute from the LC column (always the case for plasma and serum samples) the most abundant peptides mask the less abundant ones, biasing the obtained data142. Additionally, the stochasticity in the selection of precursor ions lowers the reproducibility and interferes with quantification (restricting it to MS1 or to reporter ions in MS2). For example, if a peptide has no matching spectra, it is impossible to know whether the peptide was not detectable or was simply missed during the sampling for MS/MS143. A strategy to include low-abundance ions in the selection is to add the most abundant ones to an exclusion list for a short period of time (dynamic exclusion, DE)144,145. Even though DE is essential for the sensitivity of DDA, this “one peak at a time” sampling process adds a limitation by reducing the probability that a peptide is sampled at the apex of its elution profile from the LC146. The spectra obtained by DDA are suitable for database searches for peptide and protein identification147. DDA has been used extensively in shotgun proteomics to identify peptides in complex mixtures and in ultra-high throughput applications141. However, as mentioned above, the problem with DDA lies in the detection of low-abundance peptides148. Several types of analyzers are used in the analysis of plasma and serum (see Table 4.1): LIT, LTQ, LTQ-Orbitrap, QTOF and LTQ-FT-ICR141.
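The selection logic of DDA with dynamic exclusion can be sketched in a few lines. This is a simplified illustration, not the acquisition software of any instrument: precursors are matched by exact m/z (real instruments use a mass tolerance), and the function name `select_precursors`, the top-N value and the exclusion time are assumptions made for the example.

```python
def select_precursors(survey_scan, top_n, excluded, now_s, exclusion_s):
    """Pick the top-N most intense precursors that are not excluded.

    survey_scan: list of (mz, intensity) tuples from the MS1 scan
    excluded:    dict mapping mz -> time (s) at which its exclusion expires
    """
    # Release precursors whose exclusion period has expired.
    for mz in [m for m, expiry in excluded.items() if expiry <= now_s]:
        del excluded[mz]
    # Rank the remaining candidates by intensity, most abundant first.
    candidates = [(mz, inten) for mz, inten in survey_scan if mz not in excluded]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    chosen = [mz for mz, _ in candidates[:top_n]]
    # Dynamic exclusion: selected precursors are skipped for a while.
    for mz in chosen:
        excluded[mz] = now_s + exclusion_s
    return chosen

# Hypothetical survey scan with two abundant and two low-abundance ions.
scan = [(500.3, 1e6), (620.8, 5e5), (710.4, 2e4), (455.2, 1e4)]
exclusion_list = {}
print(select_precursors(scan, 2, exclusion_list, 0.0, 30.0))
print(select_precursors(scan, 2, exclusion_list, 1.0, 30.0))
```

In the first cycle the two most abundant ions are fragmented; in the second cycle they are excluded, so the two low-abundance ions are sampled instead.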

5.5.2. Data-Independent Acquisition (DIA) DIA was developed as an alternative to DDA in order to extend the detectable dynamic range, lower the detection limits and improve confidence in peptide identification and relative quantification of proteins145. It uses a different approach in which there is no selection of precursor ions and thus no bias towards abundant peptides. Specifically, the spectrometer isolates and fragments all the peptide ions present, either as they enter the mass analyzer (broadband DIA) or inside a relatively broad window (about 6-90 m/z), moving to the next non-overlapping window until the whole predetermined mass range is covered18,19,145,148.
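The windowed scheme can be illustrated by generating the list of consecutive isolation windows for a given mass range. The sketch below uses adjacent, non-overlapping windows as described above and the values quoted in the caption of Figure 5.13 (25 m/z windows over 400–1,200 m/z); the function names are hypothetical.

```python
def isolation_windows(mz_start, mz_end, width):
    """Consecutive, non-overlapping isolation windows covering a mass range."""
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)  # the last window may be narrower
        windows.append((lower, upper))
        lower = upper
    return windows

def find_window(windows, precursor_mz):
    """Return the window in which a precursor would be co-fragmented."""
    for lower, upper in windows:
        if lower <= precursor_mz < upper:
            return (lower, upper)
    return None  # precursor lies outside the covered mass range

swaths = isolation_windows(400.0, 1200.0, 25.0)
print(len(swaths), swaths[0], swaths[-1])
print(find_window(swaths, 512.3))
```

Every precursor that falls inside the same window is fragmented together, which is the origin of the multiplexed MS/MS spectra that DIA produces.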

The MS/MS spectra obtained frequently contain fragment ions of more than one precursor ion, which makes the data analysis much more challenging than with DDA because the information about the precursor ions is lost (see Figure 5.11).

There are several DIA methods developed by different research groups. In general, they share the same comprehensive approach, but they differ in the window size, the location of the fragmentation (in-source or in the collision cell) and the data analysis methods used to identify peptides and proteins145 (see Table 5.1).

Figure 5.11: (a) Orbitrap scan showing the ±1 m/z isolation window where two precursor ions A (blue) and B (red) co-elute; the subscripts indicate different isotopes. (b) The multiplexed (mixed) MS/MS spectrum, showing the fragment ions from peptide A (blue) and from peptide B (red)145


Table 5.1: DIA methods and their associated publications, m/z selection window, whether multiplex spectra were considered and their dissociation methods145

Method        Reference  m/z selection window  Multiplex  Dissociation method
Shotgun CID   149        Full m/z range        Y          CID in-source
Original DIA  150        10 m/z                N          CID in collision cell
MSE           151        Full m/z range        Y          CID in collision cell
p2CID         152        Full m/z range        Y          CID in-source & in collision cell
PAcIFIC       18         2.5 m/z               N          CID in collision cell
AIF           153        Full m/z range        Y          CID in HCD collision cell
XDIA          154        20 m/z                Y          ETD in cell
SWATH         147        25 m/z                Y          CID in collision cell
FT-ARM        113        12 m/z or 100 m/z     Y          CID in collision cell

MSE: MS everything; p2CID: parallel collision-induced dissociation; PAcIFIC: precursor acquisition independent from ion count; XDIA: extended data-independent acquisition; FT-ARM: Fourier transform-all reaction monitoring; SWATH: sequential window acquisition of all theoretical fragment-ion spectra

Of the methods listed above, PAcIFIC18, XDIA154, SWATH-MS147 and FT-ARM113 work in a similar manner. First, they collect ions from a specific m/z window; then all the ions present are fragmented, the multiplexed MS/MS spectrum is recorded, and finally the window is shifted and another cycle begins145. MSE, or MS everything, is also a data-independent acquisition approach; it uses increased duty cycles and a quadrupole to transfer the ions into a collision cell where the collision energy (C.E.) alternates between low and high levels151. With low C.E. the spectrum of the precursor ions is obtained and with high C.E. the spectrum of the fragment ions is obtained, in alternating cycles; in this way the coverage of the dynamic range of peptides is increased122.

These methods require instruments that can selectively trap ions and are therefore only possible in analyzers with trapping capabilities, except for SWATH-MS and MSE, which are performed in QTOF instruments. There are several methods for the deconvolution of multiplexed spectra with the aim of peptide identification. Some of the experimental efforts make use of high-resolution and high-mass-accuracy measurements, aligned chromatographic retention times, ion-mobility drift profiles or known ion fragmentation patterns145,147,155.

Figure 5.12 shows the information obtained from the analysis of one precursor ion (VLENTEIGDSIFDK++) with the DDA and DIA acquisition modes. DDA pre-selects the precursor ion (semi-stochastically) in a very narrow m/z window (2 m/z) across the whole predefined mass range (500 – 900 m/z). DIA, on the other hand, covers the same range with wide consecutive isolation windows, fragmenting all the precursor ions present in each window. As a result, DDA information is sparse and belongs to a specific precursor ion, whereas DIA gives a comprehensive map of fragment ions that does not depend on the isolation of precursor ions. Theoretically, a DDA-equivalent spectrum can be extracted from the elution profiles of the fragment ions obtained by DIA. Furthermore, DIA data can be re-extracted in silico to include new fragments for the identification of peptides without the need to re-acquire data143.

Figure 5.12: Comparison of DDA and DIA. DDA acquires MS/MS scans with a narrow m/z isolation window (dots of 2 m/z) of ions detected in a previous MS1 scan, whereas DIA uses wide windows (20-25 m/z) without pre-selection of ions. The same DDA spectrum can be extracted from the elution profiles obtained with DIA experiments143.

Although DIA methods are unbiased, more sensitive and more reproducible than DDA, and provide a higher level of certainty for peptide identification, the data analysis is not straightforward, and the data search engines and databases currently available for DDA data cannot be used directly with DIA data. The application of chemometrics and the development of data analysis software such as Skyline, PeakView, PLGS-Identify and OpenSWATH make it possible to extract useful information for the identification of peptides and proteins contained in DIA maps156. Figure 5.13 shows a workflow for the deconvolution of MS/MS spectra from DIA data obtained by SWATH-MS using OpenSWATH analysis155.

Figure 5.13: (a) The DIA method used here consists of sequential acquisition of fragment-ion spectra with overlapping precursor isolation windows. Here, a swath window width of 25 m/z is depicted, which allows stepping through a mass range of 400–1,200 m/z in 32 individual steps. If all fragment-ion spectra of the same isolation window are aligned, an MS2 map (so-called swath) is obtained (right side, swath 4 out of 32 is schematically shown). (b) The individual steps performed by the OpenSWATH software are illustrated for a peptide precursor with three transitions: red, green and blue. The steps are data conversion, retention-time alignment, chromatogram extraction, peak-group scoring and statistical analysis to estimate an FDR (false-discovery rate)155

Several studies have reported using DIA for the analysis of plasma and serum95,108,109,122,123,125 (see also Table 4.1). However, DDA continues to be the bottom-up proteomics method of choice due to its easier data analysis and compatibility with data search engines. It is expected that in the future DIA methods, particularly SWATH-MS, will become routine because of the advantages over DDA mentioned above, once data analysis becomes simpler156. As technology develops, it is also plausible that DDA, DIA and other acquisition modes will be able to deliver equivalent comprehensive information.

5.6. Advances in MS instrumentation The major advances in instrumentation have been made to improve sensitivity and specificity for accurate protein identification and quantification.

The main issues that affect sensitivity are157 :

Efficiency of ESI ionization: has been improved by advances in nano-ESI with a concomitant reduction of the LC flow133,158.

Ion losses due to reduced detection duty cycles in the ESI-MS interface: have been tackled by implementing electrodynamic single-stage and dual-stage ion funnels to ensure high efficiency in ion transmission into the MS157.

Selectivity can be enhanced by improving the fractionation techniques and the MS resolving power and by reducing the background noise. Some approaches include the use of differential mobility spectrometry (DMS) and selected reaction monitoring cubed (SRM3)157.


6. Quantitative proteomics with LC-MS/MS Quantification of proteins in plasma or serum is crucial for using the level of certain proteins (biomarkers) as an indicator of the physiological state of an individual. It can be used for diagnosis, therapy monitoring (protein biopharmaceutical or biomarker) and prognosis of the course of a disease. In the validation stage of biomarkers, absolute quantification methods are frequently required95. Additionally, quantification of protein biopharmaceuticals is applied in pharmacokinetic studies and in validation and quality control processes. Doping testing is also a major application of quantitative proteomics in these matrixes54,119. Mass spectrometry is not intrinsically quantitative because there is no direct relationship between the quantity of a compound and the intensity of the response of its ions in an MS experiment. The main reason for this is the wide variety of physicochemical properties of proteolytic peptides, which leads to differences in the mass spectrometric response. Consequently, accurate quantification from MS signals can only be achieved by comparing the responses of specific peptides between experiments. Comparing the amount of a protein between two physiological states, or relative to a control sample, is called relative quantitation. Conversely, absolute quantitation determines the total amount of a protein or peptide present in a sample11,159.

Discovery vs. Target proteomics Proteomics can take two different approaches depending on the aim of the study. Discovery proteomics focuses on optimizing protein identification by increasing the number of peptides identified, at the expense of the number of samples and the analysis time. Target proteomics, on the other hand, focuses on monitoring a few well-defined peptides and optimizes the chromatographic separation and instrumental parameters to achieve high sensitivity and throughput in order to analyze a large number of samples. Quantitation of proteins in discovery proteomics is performed with the aim of detecting differences between the levels of proteins in different physiological states (healthy/disease). For this reason, relative quantitation is more frequently done, after fractionating the sample to reduce the dynamic range of plasma or serum. The main applications are found in biomarker discovery. The quantification of target proteins by MS uses purpose-designed methods to increase sensitivity, specificity and throughput and to assess the abundance of a few proteins in complex mixtures; here SRM and MRM have been crucial. This approach commonly performs absolute quantitation and is carried out daily in pharmaceutical and diagnostic applications160. In this section, the focus is on LC-MS and LC-MS/MS methods for the quantification of proteins, since those are currently applied in the analysis of complex samples such as plasma and serum. Many factors contribute to the accuracy of the quantification, including sample preparation steps, LC resolution, scan speed, sensitivity and the ability of the spectrometer to isolate precursor ions for MS/MS experiments, to mention a few161. Figure 6.1 shows the fraction of peptides and proteins that can be quantitated in a complex sample with current methods and gives an impression of how challenging it is to achieve accurate quantitation161. Quantitative LC-MS-based proteomics is performed using isotopic labels or label-free methods. Both approaches are discussed briefly below, and their main advantages and drawbacks are mentioned.

Figure 6.1: This figure demonstrates the gap between peptide content and ability to quantitate those peptides and proteins comprehensively to provide quantitative coverage161

6.1. Stable isotopic labelling This method is based on the introduction of a differential mass tag (stable isotope) that affects only the mass of the protein or peptide without changing its chromatographic or MS behaviour. Mainly relative quantitation is achieved, by comparing the responses of the peptides carrying the heavy isotope (known amount) and their light counterparts160. There are several methods to introduce the isotopic label11,160,161:

Metabolic labelling: also called stable isotope labelling by amino acids in cell culture (SILAC), where the heavy isotopes are incorporated into the proteins during cellular growth. The main advantage is that the heavy and light samples can be mixed before sample processing, which reduces variability; at the same time, the need for metabolic activity (the method is restricted to cell culture) impairs its applicability to plasma.

Chemical or enzymatic labelling: the isotopes are incorporated via a chemical or enzymatic reaction. Probably the simplest way to introduce a heavy isotope is adding 18O during protein digestion so that it is incorporated at the C-termini of the tryptic peptides. An example of chemical labelling is isotope-coded affinity tags (ICAT); this approach uses chemical reactions to bind labels (heavy and light) specifically to cysteine residues, reducing sample complexity. Considering that not all proteins have Cys residues, this specificity can also be seen as a drawback, limiting its application to certain types of proteins. Also, it only allows the comparison of two samples per analysis.

Other types of chemical labelling tags are the isobaric tags for relative and absolute quantification (iTRAQ) and tandem mass tags (TMT), where the mass-balanced labelled peptides co-elute in LC but produce different fragments (reporter ions) upon MS/MS fragmentation. iTRAQ allows the quantification of up to 8 different samples per experiment, but it has the disadvantage of requiring trap instruments and high collision energy to accurately detect the reporter ions.

6.2. Absolute quantitation (AQUA) AQUA is performed in targeted approaches because the identity of the protein or peptide to be quantified must be known. The method consists of synthesizing a target peptide containing heavy isotopes and spiking a known amount into the peptide mixture as an internal standard (IS) before LC-MS/MS analysis8. The structure of the internal standard should be the same as that of the analyte, except for the incorporated heavy isotopes. In this way, the IS and the analyte co-elute and are only separated by the mass analyzer, so that the IS compensates for losses during the analysis process. Finally, the ratio of the abundances of the heavy and light fragment ions is used along with the known amount of IS to calculate the concentration of the analyte accurately17. MRM or SRM are the primary acquisition modes used in AQUA. They target specific peptides in complex mixtures using a triple quadrupole or a quadrupole-LIT that filters the precursor ions and the fragment ions to monitor them over time159,161.
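The final calculation step of AQUA can be written out as a simple ratio. The sketch below assumes that the light (analyte) and heavy (internal standard) fragment ions respond identically, so that the ratio of their peak areas equals the ratio of their amounts; the function name and the example numbers are hypothetical.

```python
def aqua_amount(light_area, heavy_area, spiked_is_amount):
    """Amount of analyte from the light/heavy peak-area ratio.

    The result is in the same unit as the spiked internal standard.
    """
    if heavy_area <= 0:
        raise ValueError("the heavy internal standard was not detected")
    return light_area / heavy_area * spiked_is_amount

# Hypothetical example: 50 fmol of heavy peptide spiked into the digest;
# the light transition shows twice the peak area of the heavy one.
print(aqua_amount(light_area=2.4e6, heavy_area=1.2e6, spiked_is_amount=50.0))
```

In this example the analyte peptide is estimated at 100 fmol, twice the spiked amount.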

Figure 6.2: Schematic of the different quantification methods using isotopes. Note the stage where the heavy isotope peptide/protein or amino acid is incorporated into the workflow160


Examples of applications of these modes for the quantification of plasma and serum proteins are found in Table 4.1. The quantitation methods described above have in common that the final quantities are calculated by comparing ratios of heavy and light isotopic fragments. Besides differing in the type of label and the mechanism of label attachment, they also vary in the workflow stage at which the label is introduced and in whether quantification is based on MS1 or MS2 data. Figure 6.2 shows the different stages at which the labels are introduced. The peptide carrying the heavy isotope is shown in red and the analyte in blue160. Labelling techniques have disadvantages related to the cost of the labelling reagents, the labelling efficiency, the additional processing steps required, the limited number of samples per analysis and the difficulty of detecting LAP. Label-free quantitation methods were developed to meet the need for methods with broad applicability, low expense and high sample throughput that allow the comparison of an unlimited number of samples161.

6.3. Label-free methods In this section, the two most widely used label-free quantification strategies are discussed.

6.3.1. Spectral counting As mentioned in the previous section, DDA experiments are biased towards the most abundant peptides, i.e., a peptide with higher abundance will be selected more frequently for MS/MS than a peptide with lower abundance. Based on this principle, the number of times that a peptide is selected for MS/MS (spectral count) can be used as a rough estimate of its concentration in the sample. This method is limited to relative quantitation and is not reliable for trace analysis and low protein amounts because it favors the sampling of high-abundance proteins161. Variations of methods based on this principle differ from each other in the normalization procedure of the spectral counts and in their scope4,8. The methods used for statistical analysis and normalization are simpler than the ones required for ion-current methods8. The protein abundance index (PAI) is the ratio between the number of peptides identified and the theoretical number of observable peptides of each protein, and it is used to estimate the protein abundance. A modification of this method is known as exponentially modified PAI (emPAI)161. Furthermore, absolute protein expression (APEX) is a variation introduced to inventory the protein content of a cell116,159.
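The PAI defined above, and its exponentially modified form, can be computed directly from peptide counts. The emPAI formula used here (10 to the power of PAI, minus 1) is not given in the text and is taken from the original definition of the index; the function names and the example counts are hypothetical.

```python
def pai(observed_peptides, observable_peptides):
    """Protein abundance index: identified / theoretically observable peptides."""
    if observable_peptides <= 0:
        raise ValueError("a protein needs at least one observable peptide")
    return observed_peptides / observable_peptides

def empai(observed_peptides, observable_peptides):
    """Exponentially modified PAI, assumed here as 10**PAI - 1."""
    return 10 ** pai(observed_peptides, observable_peptides) - 1

# Hypothetical proteins: (identified peptides, observable peptides).
proteins = {"protein A": (12, 20), "protein B": (3, 25)}
for name, (seen, possible) in proteins.items():
    print(f"{name}: PAI = {pai(seen, possible):.2f}, emPAI = {empai(seen, possible):.2f}")
```

A protein for which no peptide is identified has a PAI of 0 and therefore an emPAI of 0.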

6.3.2. Ion current-based methods This quantification method consists of measuring the elution profile of one or more peptide fragments and correlating it with their concentration in the samples. The elution profile is determined by integrating the peak areas or heights of the m/z of the fragment ions to be quantified from extracted ion chromatograms (XICs). Using this method, only the intensities of the same fragment ions can be compared between samples161. The data are acquired using SRM or MRM for specificity and sensitivity4. Absolute quantification can be achieved by adding a known amount of a non-labelled internal standard prior to LC-MS/MS analysis and measuring the intensities of three peptides derived from the analyte protein. Comparing the intensities of these peptides with that of the IS provides an estimate of the amount of protein in absolute units with a certain degree of accuracy4.
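The extraction of an XIC and the integration of its peak area can be sketched as follows. The data layout (a list of scans, each holding a retention time and its peaks), the mass tolerance and the function names are assumptions made for this example; the area is obtained with the trapezoidal rule, without baseline correction or peak detection.

```python
def extract_xic(scans, target_mz, tol_da=0.01):
    """Extracted ion chromatogram for one m/z value.

    scans: list of (retention_time_s, [(mz, intensity), ...])
    Returns a list of (retention_time_s, summed intensity within tolerance).
    """
    xic = []
    for rt, peaks in scans:
        signal = sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol_da)
        xic.append((rt, signal))
    return xic

def peak_area(xic):
    """Area under the chromatogram, trapezoidal rule over consecutive points."""
    area = 0.0
    for (t0, i0), (t1, i1) in zip(xic, xic[1:]):
        area += 0.5 * (i0 + i1) * (t1 - t0)
    return area

# Hypothetical scans: one fragment ion of interest (m/z 500.25) and one interferent.
scans = [
    (0.0, [(500.25, 0.0)]),
    (1.0, [(500.25, 100.0), (600.10, 50.0)]),
    (2.0, [(500.25, 0.0)]),
]
xic = extract_xic(scans, 500.25)
print(xic, peak_area(xic))
```

Only the signal within the mass tolerance contributes to the chromatogram, so the interfering ion at m/z 600.10 does not affect the integrated area.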

The accuracy of ion-current methods increases with high peptide resolution, retention time alignment and high-accuracy mass spectrometers. The combination of LC-MS and LC-MS/MS data from a sample makes it possible to match the retention times of the peptides to their accurate mass and to identify the peaks to be integrated with high certainty. However, several extra steps are required compared to labelled methods to ensure accurate quantitation (see Figure 6.3)161.

DDA experiments have been used extensively for quantification using label-free methods, but they are known to be unsuitable for LAP. DIA methods such as XDIA and SWATH have been reported to be suitable for the quantification of proteins with a wider coverage147,154.

Figure 6.3: Generic data processing and analysis workflow for quantitative mass spectrometry. Yellow icons indicate steps common to all quantification approaches with or without the use of stable isotopes. Blue icons in the boxed area refer to extra steps required when using mass spectrometric signal intensity values for quantification159


Several software packages have been developed for label-free quantification. A few examples are listed below8:

Skyline

SIEVE

QuanLynx

Elucidator

Expressionist

ProteinQuant

IDEAL-Q

SuperHirn

PEPPeR

IdentiQuantXL

Msight

Figure 6.4: Schematic of a relative quantitation of proteins using label-free methods160


7. Bioinformatics Bioinformatics is essential to deal with the enormous amount of data generated by LC-MS/MS experiments.

Shotgun proteomics bioinformatics workflows require several steps to identify and quantify proteins from the analysis of peptides. The integration of software tools to process this kind of data into commercially available instruments has helped to simplify proteomic analysis. Nevertheless, there is still a lot to do to standardize workflows and methodologies in order to make proteomics accessible to all users and not only to experts1,8.

Examples of integrated software are:

Integrated Proteomics Pipeline (IPL)

pFind Studio

ProteoIQ

Proteome Discoverer

Scaffold

MaxQuant

Trans-Proteomic Pipeline

The main steps and the tools required to achieve protein identification are described as follows:

7.1. Protein identification A typical bioinformatics workflow for the identification and characterization of proteins by bottom-up proteomics using LC-MS/MS analysis is shown in Figure 7.1. The identification of peptides/proteins is achieved by comparing theoretical data from databases with experimental data acquired in MS/MS experiments. The data search can be performed using one of the following search engines8,162:

Free software: Mascot, Comet, X!Tandem, OMSSA, Phenyx and Protein Prospector.

Commercial software: BioTools, MassLynx, SEQUEST, SpectrumMill and Protein Pilot.

The vendor software usually incorporates a search engine to perform automatic database searches and provide a list of results with scores8. Commonly used databases are NCBInr, Swiss-Prot, UniRef100, EST, IPI-Human and MSDB. Swiss-Prot is suitable for human proteomics: it is a very well curated, non-redundant, high-quality database, and searches are concise and fast, but proteins with low abundance might be missed if they are represented by only one or two spectra. An alternative is a non-identical, comprehensive database that contains the explicit sequences of known proteins (NCBInr, EST and UniRef100); the disadvantage is that the searches are time-consuming due to the large size of the database162.

Figure 7.1: Workflow for protein identification and characterization by bottom-up proteomics162

The most important parameters that influence the quality of the results are163 :

Digestion enzyme (e.g. trypsin) and missed cleavages

Database

Modifications: phosphorylation, glycosylation, N-terminal pyroglutamic acid, etc.

The steps to perform data searches are shown in Figure 7.2.
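The influence of the digestion enzyme and the number of allowed missed cleavages on the search space can be illustrated with an in silico digestion. The sketch below assumes the commonly used trypsin rule (cleavage C-terminal to K or R, except when the next residue is P); the function name and the example sequence are hypothetical, and real search engines apply additional filters such as peptide length and mass limits.

```python
def tryptic_digest(sequence, missed_cleavages=0, min_length=1):
    """Theoretical tryptic peptides of a protein sequence.

    Assumed rule: cleave after K or R unless the following residue is P.
    """
    # Positions at which the sequence is cut, including both termini.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        for missed in range(missed_cleavages + 1):
            end = start + 1 + missed
            if end >= len(sites):
                break
            peptide = sequence[sites[start]:sites[end]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides

print(tryptic_digest("AAKRPGGRCC"))
print(tryptic_digest("AAKRPGGRCC", missed_cleavages=1))
```

Allowing one missed cleavage increases the number of candidate peptides from three to five in this short example, which shows why this parameter affects both search time and the number of possible matches.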

Figure 7.3: Schematic of the steps followed to perform a database search for protein identification. The typical steps are on the left and the potential additional steps on the right163

7.2. Validation of results The data sets are matched and scored according to the search engine, and the scoring method can be descriptive, interpretative or probabilistic. An important parameter to be determined for large data sets is the false discovery rate (FDR), which is calculated by validating the search results at the peptide and protein level.

Validation is accomplished by repeating the search with the same data against a database in which the sequences have been reversed or randomized; the number of matches against this decoy database provides an estimate of the rate of false positives in the results [164]. It is important to determine whether the FDR is in the range of 0.1, 1 or 10%, where the lowest value is preferred since it is associated with a high confidence level. However, FDR estimates are not reliable if the data set is small (only a few spectra per protein) [117,162].
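The validation described above can be sketched as follows. This is an illustration only: it assumes each peptide-spectrum match is reduced to a score and a flag saying whether it came from the reversed or randomized database, and it uses the simple estimate FDR = decoy matches / target matches above a score threshold. The function names and the example scores are hypothetical.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate the FDR as decoy matches / target matches at or above a score."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    best = None
    for score in sorted({s for s, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, score) <= max_fdr:
            best = score
    return best


# Hypothetical (score, is_decoy) pairs from a target search and a decoy search.
example_psms = [
    (90, False), (85, False), (80, False), (70, True),
    (65, False), (60, False), (50, True), (40, False),
]
print(fdr_at_threshold(example_psms, 60))
print(threshold_for_fdr(example_psms, max_fdr=0.01))
```

With only eight matches the estimate moves in large steps, which reflects the caveat that FDR values are unreliable for small data sets.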

It is important to remember that search engines identify peptides rather than proteins and that, in a complex mixture, a peptide can be assigned to more than one protein. Hence, protein identification generates a list of possible proteins that match the peptides. The FDR values for peptides and for proteins are not the same; either may be lower or higher than the other, and it is important to consider the identification parameters pre-set in the software. Several peptides must be matched to a protein for a confident identification; if only one peptide matches a protein, it is considered a suspect rather than an identified protein [162].
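The rule that a protein supported by a single peptide is only a suspect can be made concrete with a small sketch. It assumes the search output has already been reduced to a mapping from each identified peptide to the proteins containing it. The two-peptide cut-off, the function name and the toy identifiers are illustrative assumptions, not settings from a specific search engine.

```python
from collections import defaultdict


def classify_proteins(peptide_to_proteins, min_peptides=2):
    """Group peptides by protein and flag weakly supported proteins."""
    protein_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_peptides[protein].add(peptide)

    result = {}
    for protein, peptides in protein_peptides.items():
        # A peptide is unique when it maps to exactly one protein.
        unique = {p for p in peptides if len(peptide_to_proteins[p]) == 1}
        status = "identified" if len(peptides) >= min_peptides else "suspect"
        result[protein] = {
            "status": status,
            "peptides": len(peptides),
            "unique": len(unique),
        }
    return result


# Hypothetical matches: peptide GGR is shared between proteins P1 and P2.
example_matches = {"AAK": ["P1"], "GGR": ["P1", "P2"], "LLK": ["P3"]}
for protein, info in classify_proteins(example_matches).items():
    print(protein, info)
```

Here P2 is supported only by a shared peptide, so it is reported as a suspect with no unique evidence, which is the ambiguity described above.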

A specific online database of plasma proteins is the Plasma Proteome Database (PPD, http://www.plasmaproteomedatabase.org). It has been developed with the aim of being a central repository of information regarding the plasma proteome by collecting the data produced by the Human Plasma Proteome Project (HPPP) and other scientific studies. It contains information about PTMs, isoforms, protein localization and tissue expression, diseases, functions, etc. [23].


8. Discussion

Table 4.1 summarizes recent proteomic studies on plasma and/or serum using LC-MS/MS. It includes the analyzed matrix, the primary strategy to reduce the sample complexity along with the enzyme used for digestion, the LC setup, the MS ionization techniques, analyzers and acquisition modes and, finally, the number of identified proteins. To achieve a high number of protein identifications, it is necessary to reduce the sample complexity by using different techniques in each stage of the analytical workflow (sample preparation, LC separation, mass spectrometry analysis). The main findings of this review are discussed below.

Sample preparation

The importance of LAP enrichment and HAP depletion can be seen across the whole table. For example, one of the studies reviewed here [18] depleted the 7 most abundant proteins in plasma using MARS-7, whereas a second study [117] analyzed whole serum. As a result, the depleted sample yielded more than double the number of identified proteins (746) compared with the non-depleted one (311). Similarly, Qian et al. [9] reported 96 protein identifications using whole serum and 122 using a depleted sample. Smith et al. [38] obtained 442 identifications from depleted serum and 266 from non-depleted serum, which is in line with the other works and supports the claim that depletion of HAP helps to increase protein identifications.
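The size of the depletion effect in these three comparisons can be checked with simple arithmetic. The short sketch below uses only the identification counts quoted in the paragraph above; the labels and the helper name `fold_increase` are illustrative. Note that the first pair compares two different studies, so its ratio also reflects other differences in workflow.

```python
def fold_increase(depleted, whole):
    """Ratio of identifications with depletion to identifications without."""
    return depleted / whole


# (depleted, non-depleted) identification counts quoted in the text.
comparisons = {
    "MARS-7 depleted plasma vs whole serum": (746, 311),
    "Qian et al.": (122, 96),
    "Smith et al.": (442, 266),
}

for label, (depleted, whole) in comparisons.items():
    print(f"{label}: {fold_increase(depleted, whole):.2f}-fold")
```

The ratios are roughly 2.4, 1.3 and 1.7, so the gain from depletion is consistent in direction but varies considerably in size between studies.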

Consequently, immunoaffinity depletion of HAP is the most popular method to reduce sample complexity in blood analysis and delivers the highest numbers of identified proteins (i.e., 5303 [107] and 1662 [127]). Enrichment of a specific subset of proteins, such as cysteinyl enrichment (thiol resins), glycoprotein enrichment (hydrazide capture or lectin affinity resins) and phosphate enrichment (IMAC), following depletion of HAP also provides a relatively high number of identifications (1977 [57], 688 [115], 308 [102]).

Regarding the digestion step, trypsin is used almost exclusively, with a few exceptions where Lys-C is preferred [82,107,111,112,114]. The authors explain this choice by the fact that Lys-C produces longer peptides because it cleaves only at the carboxyl side of Lys residues, and longer peptides help the protein identification process in general proteomics [101]. The effect of this enzyme on the number of identifications is difficult to assess from the reviewed studies because of marked differences in their experimental designs.

Liquid chromatography

Further fractionation is performed by chromatographic separation using 1D or 2D setups. One-dimensional LC often made use of C-18 columns with lengths between 15 and 20 cm, 75 μm i.d., particle sizes of 1.9 to 5 μm and pore sizes between 100 and 200 Å, producing a high number of protein identifications (300-746) [18,38,127]. In addition, three papers used C-8 columns (5-20 cm x 75 μm - 1 mm x 3-5 μm, 300 Å) [53,57,165] and one used a C-12 column (10 cm x 75 μm x 4 μm) [106]. C-8 columns were often used for high-throughput targeted quantitative applications on UPLC systems in both serum and plasma, and the C-12 column was applied to investigate LAP in plasma, delivering 29-76 identifications.

The maximum number of proteins identified by 1D-LC methods does not exceed 746, even when combined with depletion of HAP, DIA mode and an LTQ-Orbitrap [18].

On the other hand, a study reporting a high-throughput 2D-LC-MS/MS method delivered about 5300 identified proteins, with an average of 4600 proteins identified per sample [101]. This work applied IgY depletion, Lys-C digestion, RP (15 cm x 2.1 mm, 300 Å) under basic conditions (pH 10) in the first dimension and nano-RP-LC (C-18, 20 cm x 75 μm i.d. x 3 μm) in the second (pH 2.6), coupled to a nanoESI-Q-Orbitrap mass spectrometer (DDA acquisition mode) [101]. The difference in pH between the two RP separations is responsible for the high number of identifications, indicating that the systems are sufficiently orthogonal. Several other research groups [20,96,129,166] also applied 2D-LC using RP in both dimensions (C-18/C-18 or C-18/C-8), resulting in 300-835 identifications.

Moreover, between 1000 and 3300 non-redundant identified proteins in plasma and serum have been reported using the typical 2D-nano-LC configuration (SCX-RP), in either online or offline formats, on a capillary column (5 to 65 cm, 75-150 μm i.d. and 3 μm particle diameter) [16,41,58]. The orthogonality of these two separations increases the peak capacity and the resolution of the method, which translates into a larger number of proteins that can be identified. Nonetheless, using ion-exchange columns in the first dimension implies the use of C-18 trapping columns before the RP analytical column for desalting purposes. These extra steps make offline setups more frequent, but also more time-consuming, and increase the probability of sample losses. A less frequent but notably effective combination was the use of offline SAX (10 mm x 10 μm) in the first dimension and C-18 (25 cm x 75 μm) in the second, identifying a total of 1662 proteins in serum [127]. The success of this approach is probably related to the completely different, and consequently incompatible, separation modes, for which the only solution is an offline setup.

However, the success in protein identification could also be attributed to strategies other than the LC separation, such as sample complexity reduction by depletion or enrichment of a selected group of proteins, the high resolution and mass accuracy of the mass analyzer used, or the acquisition mode selected. Even so, a comparison between 1D- and 2D-LC-MS/MS in which all other steps were kept the same showed a significant improvement in the number of identifications, from 52 with 1D-LC to 184 with 2D-LC [125].

Recent studies (2014-2016) [16,20,91,93,95,96,98,103,107-109,114,121,165] suggest a tendency to increase sample fractionation prior to MS analysis (affinity enrichment clean-up and multidimensional LC) and to apply DIA methods for quantitative purposes. The high resolving power of these setups and the comprehensive data obtained would be expected to produce a large number of identifications, but this relation was difficult to establish given the differences in the aims and experimental setups.

Mass spectrometry

Regarding the ionization techniques, nano-ESI is preferred (see Table 4.1) because its low flow through the sprayer increases the ionization efficiency and reduces ion suppression and interferences due to matrix effects. For the same reason, capillary LC columns (75-300 μm i.d.) are commonly used to couple the chromatographic system to the nano-ESI source.

It is important to note that the mass accuracy and resolution of the mass analyzer have a significant influence on the number of identifications. For example, Faca et al. [127] reported identifying 1662 proteins in serum using MARS-6 and trypsin digestion followed by 2D nano-LC coupled with nano-ESI and an LTQ-FT-ICR analyzer (DDA mode). In contrast, Li et al. [89] reported a maximum of 330 identifications using a similar workflow but a less accurate analyzer (ion trap, DDA mode); see Table 4.1.

The most popular analyzers for untargeted qualitative plasma and serum proteomics were linear ion traps, which were frequently coupled to a quadrupole, FT-ICR or Orbitrap analyzer (Q-LIT, LTQ-FT-ICR, LTQ-Orbitrap) and operated almost exclusively in DDA mode (see Table 4.1).

Concerning the reported quantitative studies, there are mainly two avenues:

1. Targeted quantitation by SRM or MRM experiments on Q-LIT and QQQ instruments using isotopically labeled IS. This method is highly selective and sensitive but requires previous knowledge of the signature peptides and fragment ion patterns produced by the protein of interest. The high sensitivity, reproducibility and accuracy of these methods are based on the capacity of the mass spectrometer to isolate and selectively measure pre-selected ions in complex mixtures. Several examples are shown in Table 4.1 [20,51,53,55,56,97,99,105,110-112,114-116,119,120,128,167].

2. Untargeted quantitation, which became possible thanks to DIA methods, is performed on TripleTOF, QTOF or LTQ analyzers. The main DIA approaches applied to these matrixes were MSE [90,102,116,118,161], PAcIFIC [18,123] and SWATH-MS [108]. All three methods produce a massive amount of data, especially SWATH-MS, which creates a comprehensive 3D map of all the peaks detected in each 25 m/z window across a predetermined range (400-1200 m/z) that can be re-interrogated whenever new questions arise. Extracting quantitative information requires specially developed, complex software. In the case of MSE, continuum MS data are generated, requiring alignment, baseline correction and extensive data pre-processing to extract quantitative information. Nowadays, MassLynx, Skyline and MassQuant apply algorithms to deal with MSE comprehensive data [124,151]. Conversely, PAcIFIC has the advantage that, by using a smaller window (2.5 m/z), the acquired spectrum is more likely to belong to a unique precursor ion, as in DDA. As a result, it can be used with standard instrumentation, software and databases.
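The practical consequence of the window width can be seen by counting the isolation windows needed to cover a precursor range. The sketch below uses the 25 m/z width and the 400-1200 m/z range quoted above for SWATH-MS; applying the 2.5 m/z PAcIFIC width to the same range is an assumption made purely for comparison, and the function name is hypothetical. Contiguous, non-overlapping windows are assumed.

```python
def isolation_windows(mz_start, mz_end, width):
    """Return contiguous (low, high) isolation windows covering the range.

    Assumes the range is a whole multiple of the window width.
    """
    n = round((mz_end - mz_start) / width)
    return [(mz_start + i * width, mz_start + (i + 1) * width) for i in range(n)]


swath = isolation_windows(400, 1200, 25)      # widths and range from the text
narrow = isolation_windows(400, 1200, 2.5)    # same range assumed for comparison

print(len(swath), swath[0], swath[-1])
print(len(narrow), narrow[0], narrow[-1])
```

A tenfold narrower window means ten times as many windows over the same range, which makes each spectrum more likely to contain a single precursor but lengthens the acquisition.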

With regard to general quantitation software, Skyline is freely available online and is frequently used for quantitation because, besides building SRM/MRM, DDA or DIA experiments, it can also perform the data analysis [107,109,143,156].


Conclusion

The choice between plasma and serum has to be made according to the aim of the study and the downstream methodology to be applied. Regardless of the chosen matrix, the analysis of proteins will require techniques to reduce its wide dynamic range, considering that most studies focus on low-abundance proteins as a source of biomarkers. Each step of the bottom-up proteomics workflow can be designed and optimized to achieve the highest degree of separation of proteins and peptides.

Despite its elevated cost, immunoaffinity depletion of high-abundance proteins seems to be the most efficient and most popular method to achieve a high number of identified proteins. A more economical alternative is the use of a combinatorial library of hexapeptides for the enrichment of low-abundance proteins. Regarding the chromatographic separations, 1D-nano-LC (RP) and 2D-nano-LC using SCX in the first dimension and RP (C-18) in the second are commonly used in proteomics studies of plasma and serum. In terms of the number of identified proteins, 2D-LC is by far the most efficient separation method for these matrixes. Capillary and nano-LC are preferred because of the improved ionization efficiency obtained by reducing the flow going into the nano-ESI source. For high-throughput applications, UPLC is frequently used, whereas for biomarker discovery nano-LC with longer capillary columns is more common.

Qualitative studies use all sorts of hybrid mass analyzers, such as QTOF, Q-LIT, LTQ-Orbitrap and LTQ-FT-ICR, mostly applying data-dependent acquisition with dynamic exclusion to improve sensitivity. For targeted quantification, on the other hand, QQQ instruments in SRM or MRM mode are still pivotal for absolute quantification using isotopic internal standards, TMT or iTRAQ. Relative quantification is frequently performed with spectral counting methods or the ion current approach.

Recently, research groups have started to use data-independent acquisition modes for quantitation and to improve the coverage of proteins in plasma and serum. MSE has been employed in several studies, and there are indications that PAcIFIC and SWATH-MS are gaining acceptance. The latter will probably become routine in many proteomics laboratories as data analysis becomes simpler, because it comprehensively scrutinizes the sample, producing a map that can be re-queried to improve the number of identifications without new data acquisition. It seems that, regardless of the acquisition mode selected, as chromatography and mass spectrometry technology develops to improve resolution, mass accuracy and duty cycle, any mode will deliver similarly comprehensive information in the future.

Overall, increasing the fractionation of the sample to reduce its complexity prior to MS analysis and the implementation of DIA approaches seem to be the trend in recent years for the analysis of plasma and serum proteins using LC-MS/MS. The choice of each component, from sample selection to bioinformatics tools and methods, should be made keeping in mind the research question and the high complexity of these matrixes.


References

1. Chandramouli K, Qian P. Proteomics: Challenges, techniques and possibilities to overcome biological sample complexity. Human Genomics and Proteomics : HGP. 2009.

2. Dhingra V, Gupta M, Andacht T, Fu ZF. New frontiers in proteomics research: A perspective. Int J Pharm. 2005;299(1):1-18.

3. Blackstock WP, Weir MP. Proteomics: Quantitative and physical mapping of cellular proteins. Trends Biotechnol. 1999;17(3):121-127.

4. Cutillas PR TJ. LC-MS/MS in proteomics: Methods and applications. Humana Press; 2010.

5. Anderson NL, Matheson AD, Steiner S. Proteomics: Applications in basic and applied biology. Curr Opin Biotechnol. 2000;11(4):408-412.

6. Yates JR. Mass spectrometry: From genomics to proteomics. Trends in Genetics. 2000;16(1):5-8.

7. Veenstra TD, Smith RD. Proteome characterization and proteomics. Academic Press; 2003.

8. Zhang Y, Fonslow BR, Shan B, Baek M, Yates JR. Protein analysis by shotgun/bottom-up proteomics. Chem Rev. 2013;113(4):2343.

9. Donato P, Cacciola F, Mondello L, Dugo P. Comprehensive chromatographic separations in proteomics. Journal of Chromatography A. 2011;1218(49):8777-8790.

10. Resing KA, Ahn NG. Proteomics strategies for protein identification. FEBS Lett. 2005;579(4):885-889.

11. Lottspeich F. Chapter 1 top down and bottom up analysis of proteins (focusing on quantitative aspects). In: The Royal Society of Chemistry; 2011:1-10. http://dx.doi.org/10.1039/9781849733144-00001.

12. Verrastro I, Pasha S, Karina TJ, Pitt AR, Spickett CM. Mass spectrometry-based methods for identifying oxidized proteins in disease: Advances and challenges. Biomolecules. 2015;5(2):378-411.

13. Liumbruno G, D'Alessandro A, Grazzini G, Zolla L. Blood-related proteomics. Journal of Proteomics. 2010;73(3):483-507.

14. Mrozinski P, Zolotarjova N, Chen H. Human serum and plasma protein depletion – novel high-capacity affinity column for the removal of the “Top 14” abundant proteins. Agilent Technologies, Inc. 2008.

15. Tissot J. Blood proteomics. Journal of Proteomics. 2010;73(3):466-467.

16. Xiao M, Chen Y, Yu H, et al. Analysis of the whole serum proteome using an integrated 2D LC-MS/MS system. Analytical Methods. 2014;6(18):7157-7160.


17. Villar-Garea A, Griese M, Imhof A. Biomarker discovery from body fluids using mass spectrometry. Journal of Chromatography B. 2007;849(1):105-114.

18. Panchaud A, Scherl A, Shaffer SA, et al. Precursor acquisition independent from ion count: How to dive deeper into the proteomics ocean. Anal Chem. 2009;81(15):6481.

19. Zeliadt N. Moving target, New mass spectrometry–based techniques are blurring the lines between discovery and targeted proteomics. The Scientist, Lab tools/Mass Spectrometry. 2014.

20. Shen Y, Zhang G, Yang J, et al. Online 2D-LC-MS/MS assay to quantify therapeutic protein in human serum in the presence of pre-existing antidrug antibodies. Anal Chem. 2015;87(16):8555.

21. Govorukhina NI, Reijmers TH, Nyangoma SO, van DZ, Jansen RC, Bischoff R. Analysis of human serum by liquid chromatography–mass spectrometry: Improved sample preparation and data analysis. Journal of Chromatography A. 2006;1120(1):142-150.

22. Luque-Garcia J, Neubert TA. Sample preparation for serum/plasma profiling and biomarker identification by mass spectrometry. Journal of Chromatography A. 2007;1153(1):259-276.

23. Muthusamy B, Hanumanthu G, Suresh S, et al. Plasma proteome database as a resource for proteomics research. Proteomics. 2005;5(13):3531-3536.

24. Tammen H, Schulte I, Hess R, et al. Peptidomic analysis of human blood specimens: Comparison between plasma specimens and serum by differential peptide display. Proteomics. 2005;5(13):3414-3422.

25. Hsieh S, Chen R, Pan Y, Lee H. Systematical evaluation of the effects of sample collection procedures on low-molecular-weight serum/plasma proteome profiling. Proteomics. 2006;6(10):3189-3198.

26. Rai A, Gelfand C, Haywood B, et al. HUPO plasma proteome project specimen collection and handling: Towards the standardization of parameters for plasma proteome samples. Proteomics. 2005;5(13):3262-3277.

27. White JG. EDTA-induced changes in platelet structure and function: Clot retraction. Platelets. 2000;11(1):49.

28. Alsaif M, Guest PC, Schwarz E, et al. Analysis of serum and plasma identifies differences in molecular coverage, measurement variability, and candidate biomarker selection. PROTEOMICS – Clinical Applications. 2012;6(5-6):297-303.

29. Denery JR, Nunes AAK, Dickerson TJ. Characterization of differences between blood sample matrices in untargeted metabolomics. Anal Chem. 2011;83(3):1040.

30. Anderson NL, Polanski M, Pieper R, et al. The human plasma proteome: A nonredundant list developed by combination of four separate sources. Molecular & cellular proteomics : MCP. 2004;3(4):311.


31. Fang X, Zhang W. Affinity separation and enrichment methods in proteomic analysis. Journal of Proteomics. 2008;71(3):284-303.

32. Polaskova V, Kapur A, Khan A, Molloy MP, Baker MS. High-abundance protein depletion: Comparison of methods for human plasma biomarker discovery. Electrophoresis. 2010;31(3):471-482.

33. Brand J, Haslberger T, Zolg W, Pestlin G, Palme S. Depletion efficiency and recovery of trace markers from a multiparameter immunodepletion column. Proteomics. 2006;6(11):3236-3242.

34. Echan LA, Tang H, Ali-Khan N, Lee K, Speicher DW. Depletion of multiple high-abundance proteins improves protein profiling capacities of human serum and plasma. Proteomics. 2005;5(13):3292-3303.

35. Shen Z, Want EJ, Chen W, et al. Sepsis plasma protein profiling with immunodepletion, three-dimensional liquid chromatography tandem mass spectrometry, and spectrum counting. Journal of proteome research. 2006;5(11):3154.

36. Ahn S, Khan A. Detection and quantitation of twenty-seven cytokines, chemokines and growth factors pre- and post-high abundance protein depletion in human plasma. EuPA Open Proteomics. 2014;3:78-84.

37. Yadav AK, Bhardwaj G, Basak T, et al. A systematic analysis of eluted fraction of plasma post immunoaffinity depletion: Implications in biomarker discovery (immunodepleted plasma: Analysis of eluted fraction). PLoS ONE. 2011;6(9):e24442.

38. Smith MPW, Wood SL, Zougman A, et al. A systematic analysis of the effects of increasing degrees of serum immunodepletion in terms of depth of coverage and other key aspects in top-down and bottom-up proteomic analyses. Proteomics. 2011;11(11):2222-2235.

39. Puangpila C, Mayadunne E, El Rassi Z. Liquid phase based separation systems for depletion, prefractionation, and enrichment of proteins in biological fluids and matrices for in-depth proteomics analysis-an update covering the period 2011-2014. Electrophoresis. 2015;36(1):238.

40. Keshishian H, Addona T, Burgess M, Kuhn E, Carr SA. Quantitative, multiplexed assays for low abundance proteins in plasma by targeted mass spectrometry and stable isotope dilution. Molecular & cellular proteomics : MCP. 2007;6(12):2212.

41. Qian W, Jacobs JM, Liu T, Camp DG, Smith RD. Advances and challenges in liquid chromatography-mass spectrometry-based proteomics profiling for clinical applications. Molecular & cellular proteomics : MCP. 2006;5(10):1727.

42. Keshishian H, Addona T, Burgess M, et al. Quantification of cardiovascular biomarkers in patient plasma by targeted mass spectrometry and stable isotope dilution. Molecular & cellular proteomics : MCP. 2009;8(10):2339.

43. Cañas B, Piñeiro C, Calvo E, López-Ferrer D, Gallardo JM. Trends in sample preparation for classical and second generation proteomics. Journal of Chromatography A. 2007;1153(1):235-258.


44. Wehr T. Recent developments in high-abundance protein removal techniques. LC-GC North America. 2008;26(3):278.

45. Steinsträßer L, Jacobsen F, Hirsch T, et al. Immunodepletion of high-abundant proteins from acute and chronic wound fluids to elucidate low-abundant regulators in wound healing. BMC Research Notes. 2010;3:335-335.

46. Schuchard MD, Melm CD, Crawford AS, Chapman HA, Cockrill SL, Ray KB. Immunoaffinity depletion of 20 high abundance human plasma proteins. Removal of approximately 97% of total plasma protein improves identification of low abundance proteins. Origins. 2005;21:17-23.

47. Colantonio DA, Dunkinson C, Bovenkamp DE, Van Eyk JE. Effective removal of albumin from serum. Proteomics. 2005;5(15):3831-3835.

48. Ahmed N, Barker G, K TO, et al. Proteomic-based identification of haptoglobin-1 precursor as a novel circulating biomarker of ovarian cancer. Br J Cancer. 2004;91(1):129.

49. Kullolli M, Warren J, Arampatzidou M, Pitteri SJ. Performance evaluation of affinity ligands for depletion of abundant plasma proteins. Journal of Chromatography B. 2013;939:10-16.

50. Warder SE, Tucker LA, Strelitzer TJ, et al. Reducing agent-mediated precipitation of high-abundance plasma proteins. Anal Biochem. 2009;387(2):184-193.

51. Kay R, Barton C, Ratcliffe L, et al. Enrichment of low molecular weight serum proteins using acetonitrile precipitation for mass spectrometry based proteomic analysis. Rapid Communications in Mass Spectrometry. 2008;22(20):3255-3260.

52. van DB, Niessen WMA, van Dongen WD. Bioanalytical LC–MS/MS of protein-based biopharmaceuticals. Journal of Chromatography B. 2013;929:161-179.

53. Winther B, Nordlund M, Paus E, Reubsaet L, Halvorsen TG. Immuno-capture as ultimate sample cleanup in LC-MS/MS determination of the early stage biomarker ProGRP. Journal of Separation Science. 2009;32(17):2937-2943.

54. Yu N, Ho E, Wan T, Wong A. Doping control analysis of recombinant human erythropoietin, darbepoetin alfa and methoxy polyethylene glycol-epoetin beta in equine plasma by nano-liquid chromatography–tandem mass spectrometry. Anal Bioanal Chem. 2010;396(7):2513-2521.

55. Wu ST, Ouyang Z, Olah TV, Jemal M. A strategy for liquid chromatography/tandem mass spectrometry based quantitation of pegylated protein drugs in plasma using plasma protein precipitation with water-miscible organic solvents and subsequent trypsin digestion to generate surrogate peptides for detection. Rapid communications in mass spectrometry : RCM. 2011;25(2):281.

56. Razavi M, Pearson TW, Frick LE, et al. High-throughput SISCAPA quantitation of peptides from human plasma digests by ultrafast, liquid chromatography-free mass spectrometry. Journal of Proteome Research. 2012;11(12):5642-5649.


57. Funk WE, Li H, Iavarone AT, Williams ER, Riby J, Rappaport SM. Enrichment of cysteinyl adducts of human serum albumin. Anal Biochem. 2010;400(1):61-68.

58. Liu T, Qian W, Gritsenko MA, et al. High dynamic range characterization of the trauma patient plasma proteome. Molecular & cellular proteomics : MCP. 2006;5(10):1899.

59. Fabisiak JP, Sedlov A, Kagan VE. Quantification of oxidative/nitrosative modification of CYS(34) in human serum albumin using a fluorescence-based SDS-PAGE assay. Antioxidants & redox signaling. 2002;4(5):855.

60. Haugen DA. Charge-shift strategy for isolation of hemoglobin-carcinogen adducts formed at the β93 cysteine sulfhydryl groups. Chem Res Toxicol. 1989;2(6):379-385.

61. Paulus A, Freeby S, Academia K, et al. Accessing low-abundance proteins in serum and plasma with a novel, simple enrichment and depletion method. Life Science Group, Bio-Rad Laboratories, Inc. 2007.

62. Boschetti E, Righetti PG. The ProteoMiner in the proteomic arena: A non-depleting tool for discovering low-abundance species. Journal of Proteomics. 2008;71(3):255-264.

63. Wehr T, Sun C, Li L, et al. Comparison of high-abundance protein depletion techniques for biomarker discovery. 2009.

64. Millioni R, Tolin S, Puricelli L, et al. High abundance proteins depletion vs low abundance proteins enrichment: Comparison of methods to reduce the plasma proteome complexity (depletion vs enrichment for plasma proteome study). PLoS ONE. 2011;6(5):e19603.

65. Scherl A. Clinical protein mass spectrometry. Methods. 2015;81:3-14.

66. Tammen H, Hess R, Rose H, Wienen W, Jost M. Peptidomic analysis of blood plasma after in vivo treatment with protease inhibitors—A proof of concept study. Peptides. 2008;29(12):2188-2195.

67. Hale JE, Butler JP, Gelfanova V, You J, Knierman MD. A simplified procedure for the reduction and alkylation of cysteine residues in proteins prior to proteolytic digestion and mass spectral analysis. Anal Biochem. 2004;333(1):174-181.

68. Shi Y, Xiang R, Horváth C, Wilkins JA. The role of liquid chromatography in proteomics. Journal of Chromatography A. 2004;1053(1):27-36.

69. Tuli L RH. LC-MS based detection of differential protein expression. J Proteomics Bioinform. 2009(Oct 2;2:):416-438.

70. Di Palma S, Boersema PJ, Heck AJR, Mohammed S. Zwitterionic hydrophilic interaction liquid chromatography (ZIC-HILIC and ZIC-cHILIC) provide high resolution separation and increase sensitivity in proteome analysis. Anal Chem. 2011;83(9):3440.


71. Di Palma S, Hennrich ML, Heck AJR, Mohammed S. Recent advances in peptide separation by multidimensional liquid chromatography for proteome analysis. Journal of Proteomics. 2012;75(13):3791-3813.

72. Hsieh E, Bereman M, Durand S, Valaskovic G, MacCoss M. Effects of column and gradient lengths on peak capacity and peptide identification in nanoflow LC-MS/MS of complex proteomic samples. J Am Soc Mass Spectrom. 2013;24(1):148-153.

73. Camerini S, Mauri P. The role of protein and peptide separation before mass spectrometry analysis in clinical proteomics. Journal of Chromatography A. 2015;1381:1-12.

74. Fast and ultrafast HPLC on sub-2 μm porous particles - where do we go from here? LCGC Europe. 2006;19(6):352.

75. Henry RA. The early days of HPLC at DuPont. LC-GC North America. 2009;27(2):146.

76. Oyaert M, Peersman N, Kieffer D, et al. Novel LC–MS/MS method for plasma vancomycin: Comparison with immunoassays and clinical impact. Clinica Chimica Acta. 2015;441:63-70.

77. Fekete S, Gassner A, Rudaz S, Schappler J, Guillarme D. Analytical strategies for the characterization of therapeutic monoclonal antibodies. Trends in Analytical Chemistry. 2013;42:74-83.

78. Li H, Ortiz R, Tran L, et al. General LC-MS/MS method approach to quantify therapeutic monoclonal antibodies using a common whole antibody internal standard with application to preclinical studies. Anal Chem. 2012;84(3):1267.

79. Shen Y, Smith RD, Unger KK, Kumar D, Lubda D. Ultrahigh-throughput proteomics using fast RPLC separations with ESI-MS/MS. Anal Chem. 2005;77(20):6692.

80. Rieux L. A nanoLC-MS-based platform for peptide analysis. s.n.; 2006.

81. Liu H, Lin D, Yates JR. Multidimensional separations for protein/peptide analysis in the post-genomic era. Biotechniques. 2002;Apr;32(4):898, 900, 902.

82. Martosella J, Zolotarjova N, Liu H, Nicol G, Boyes BE. Reversed-phase high-performance liquid chromatographic prefractionation of immunodepleted human serum proteins to enhance mass spectrometry identification of lower-abundant proteins. Journal of proteome research. 2005;4(5):1522.

83. Giddings JC. Concepts and comparisons in multidimensional separation. J. High Resolut. Chromatogr. Commun. 1987;10:319-323.

84. Camenzuli M, Schoenmakers PJ. A new measure of orthogonality for multidimensional chromatography. Anal Chim Acta. 2014.

85. Zhao Y, Kong RPW, Li G, et al. Fully automatable two-dimensional hydrophilic interaction liquid chromatography–reversed phase liquid chromatography with online tandem mass spectrometry for shotgun proteomics. Journal of Separation Science. 2012;35(14):1755-1763.

86. Palma SD, Mohammed S, Albert JRH. ZIC-cHILIC as a fractionation method for sensitive and powerful shotgun proteomics. Nature Protocols. 2012;7(11):2041.

87. Puangpila C, Mayadunne E, El Rassi Z. Liquid phase based separation systems for depletion, prefractionation, and enrichment of proteins in biological fluids and matrices for in-depth proteomics analysis: An update covering the period 2011-2014. Electrophoresis. 2015;36(1):238-252.

88. Motoyama A, Yates JR. Multidimensional LC separations in shotgun proteomics. Anal Chem. 2008;80(19):7187.

89. Li X, Gong Y, Wang Y, et al. Comparison of alternative analytical techniques for the characterisation of the human serum proteome in HUPO plasma proteome project. Proteomics. 2005;5(13):3423-3441.

90. Adkins JN, Varnum SM, Auberry KJ, et al. Toward a human blood serum proteome: Analysis by multidimensional separation coupled with mass spectrometry. Molecular & cellular proteomics : MCP. 2002;1(12):947.

91. Bollineni RC, Fedorova M, Blüher M, Hoffmann R. Carbonylated plasma proteins as potential biomarkers of obesity induced type 2 diabetes mellitus. Journal of proteome research. 2014;13(11):5081.

92. Shen Y, Jacobs JM, Camp DG II, et al. Ultra-high-efficiency strong cation exchange LC/RPLC/MS/MS for high dynamic range characterization of the human plasma proteome. Anal Chem. 2004;76(4):1134.

93. Dayon L, Kussmann M. Proteomics of human plasma: A critical comparison of analytical workflows in terms of effort, throughput and outcome. EuPA Open Proteomics. 2013;1:8-16.

94. Liu T, Qian W, Gritsenko MA, et al. Human plasma N-glycoproteome analysis by immunoaffinity subtraction, hydrazide chemistry, and mass spectrometry. Journal of Proteome Research. 2005;4(6).

95. Kramer G, Woolerton Y, van Straalen JP, et al. Accuracy and reproducibility in quantification of plasma protein concentrations by mass spectrometry without the use of isotopic standards. PloS one. 2015;10(10):e0140097.

96. Hassis ME, Niles RK, Braten MN, et al. Evaluating the effects of preanalytical variables on the stability of the human plasma proteome. Anal Biochem. 2015;478:14-22.

97. Halquist MS, Karnes HT. Quantification of alefacept, an immunosuppressive fusion protein in human plasma using a protein analogue internal standard, trypsin cleaved signature peptides and liquid chromatography tandem mass spectrometry. Journal of Chromatography B. 2011;879(11):789-798.


98. Bollineni RC, Guldvik I, Gronberg H, Wiklund F, Mills I, Thiede B. A differential protein solubility approach for the depletion of highly abundant proteins in plasma using ammonium sulfate. Analyst. 2015;140(24):8109-8117.

99. Neubert H, Muirhead D, Kabir M, Grace C, Cleton A, Arends R. Sequential protein and peptide immunoaffinity capture for mass spectrometry-based quantification of total human beta-nerve growth factor. Anal Chem. 2013;85(3):1719.

100. Shen Y, Zhang R, Moore RJ, et al. Automated 20 kpsi RPLC-MS and MS/MS with chromatographic peak capacities of 1000-1500 and capabilities in proteomics and metabolomics. Anal Chem. 2005;77(10):3090.

101. Qian W, Kaleta DT, Petritis BO, et al. Enhanced detection of low abundance human plasma proteins using a tandem IgY12-SuperMix immunoaffinity separation strategy. Molecular & cellular proteomics : MCP. 2008;7(10):1963.

102. Zheng X, Wu S, Hincapie M, Hancock WS. Study of the human plasma proteome of rheumatoid arthritis. Journal of Chromatography A. 2009;1216(16):3538-3545.

103. Dayon L, Núñez Galindo A, Corthésy J, Cominetti O, Kussmann M. Comprehensive and scalable highly automated MS-based proteomic workflow for clinical biomarker discovery in human plasma. J Proteome Res. 2014;13(8):3837-3845.

104. Qian W, Monroe ME, Liu T, et al. Quantitative proteome analysis of human plasma following in vivo lipopolysaccharide administration using 16O/18O labeling and the accurate mass and time tag approach. Molecular & cellular proteomics : MCP. 2005;4(5):700.

105. Anderson L, Hunter CL. Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Molecular & cellular proteomics : MCP. 2006;5(4):573.

106. Shuford CM, Hawkridge AM, Burnett JC, Muddiman DC. Utilizing spectral counting to quantitatively characterize tandem removal of abundant proteins (TRAP) in human plasma. Anal Chem. 2010;82(24):10179.

107. Keshishian H, Burgess MW, Gillette MA, et al. Multiplexed, quantitative workflow for sensitive biomarker discovery in plasma yields novel candidates for early myocardial injury. Molecular and Cellular Proteomics. 2015;14(9):2375-2393.

108. Korwar AM, Vannuruswamy G, Jagadeeshaprasad MG, et al. Development of diagnostic fragment ion library for glycated peptides of human serum albumin: Targeted quantification in prediabetic, diabetic, and microalbuminuria plasma by parallel reaction monitoring, SWATH, and MSE. Molecular & cellular proteomics : MCP. 2015;14(8):2150.

109. Porter C, Bereman M. Data-independent-acquisition mass spectrometry for identification of targeted-peptide site-specific modifications. Anal Bioanal Chem. 2015;407(22):6627-6635.


110. Bronsema KJ, Bischoff R, van de Merbel NC. High-sensitivity LC-MS/MS quantification of peptides and proteins in complex biological samples: The impact of enzymatic digestion and internal standard selection on method performance. Anal Chem. 2013;85(20):9528.

111. Lu Q, Zheng X, Mcintosh T, et al. Development of different analysis platforms with LC-MS for pharmacokinetic studies of protein drugs. Anal Chem. 2009;81(21):8715.

112. Yang Z, Ke J, Hayes M, Bryant M, Tse FLS. A sensitive and high-throughput LC–MS/MS method for the quantification of pegylated-interferon-α2a in human serum using monolithic C18 solid phase extraction for enrichment. Journal of Chromatography B. 2009;877(18):1737-1742.

113. Weisbrod CR, Eng JK, Hoopmann MR, Baker T, Bruce JE. Accurate peptide fragment mass analysis: Multiplexed peptide identification and quantification. Journal of proteome research. 2012;11(3):1621.

114. Bowen CL, Kehler J, Mencken T, Orr B, Szapacs M. Utilizing LC-MS/MS to provide adaptable clinical bioanalytical support for an extended half-life bioactive peptide fused to an albumin-binding domain antibody. Analytical Methods. 2014;7(1):237-243.

115. Adrait A, Lebert D, Trauchessec M, et al. Development of a protein standard absolute quantification (PSAQ™) assay for the quantification of Staphylococcus aureus enterotoxin A in serum. Journal of Proteomics. 2012;75(10):3041-3049.

116. Yu Y, Xu J, Liu Y, Chen Y. Quantification of human serum transferrin using liquid chromatography-tandem mass spectrometry based targeted proteomics. Journal of Chromatography B. 2012;902:10-15.

117. Tang N, Miller CA. An integrated approach to improve sequence coverage and protein identification by combining LC-MALDI MS/MS and nano-LC/MS/MS. Agilent Technologies, Inc. 2005.

118. Kumar V, Barnidge DR, Chen L, et al. Quantification of serum 1-84 parathyroid hormone in patients with hyperparathyroidism by immunocapture in situ digestion liquid chromatography-tandem mass spectrometry. Clin Chem. 2010;56(2):306.

119. Lund H, Løvsletten K, Paus E, Halvorsen TG, Reubsaet L. Immuno-MS based targeted proteomics: Highly specific, sensitive, and reproducible human chorionic gonadotropin determination for clinical diagnostics and doping analysis. Anal Chem. 2012;84(18):7926.

120. Stahl-Zeng J, Lange V, Ossola R, et al. High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Molecular & cellular proteomics : MCP. 2007;6(10):1809.

121. Gu H, Ren J, Jia X, et al. Quantitative profiling of post-translational modifications by immunoaffinity enrichment and LC-MS/MS in cancer serum without immunodepletion. Molecular & Cellular Proteomics. 2016;15(2):692-702.


122. Levin Y, Hradetzky E, Bahn S. Quantification of proteins using data-independent analysis (MSE) in simple and complex samples: A systematic evaluation. Proteomics. 2011;11(16):3273-3287.

123. Acosta-Martin A, Panchaud A, Chwastyniak M, et al. Quantitative mass spectrometry analysis using PAcIFIC for the identification of plasma diagnostic biomarkers for abdominal aortic aneurysm (AAA plasma biomarkers detected by PAcIFIC MS). PLoS ONE. 2011;6(12):e28698.

124. Silva JC, Gorenstein MV, Li G, Vissers JPC, Geromanos SJ. Absolute quantification of proteins by LCMSE: A virtue of parallel MS acquisition. Molecular & cellular proteomics : MCP. 2006;5(1):144.

125. Gilar M, Olivova P, Chakraborty AB, Jaworski A, Geromanos SJ, Gebler JC. Comparison of 1-D and 2-D LC MS/MS methods for proteomic analysis of human serum. Electrophoresis. 2009;30(7):1157-1167.

126. Selvaraju S, El Rassi Z. Targeting deeper the human serum fucome by a liquid-phase multicolumn platform in combination with combinatorial peptide ligand libraries. Journal of Chromatography B. 2014;951-952:135-142.

127. Faca V, Pitteri SJ, Newcomb L, et al. Contribution of protein fractionation to depth of analysis of the serum and plasma proteomes. Journal of proteome research. 2007;6(9):3558.

128. Kushnir MM, Rockwood AL, Roberts WL, Abraham D, Hoofnagle AN, Meikle AW. Measurement of thyroglobulin by liquid chromatography-tandem mass spectrometry in serum and plasma in the presence of antithyroglobulin autoantibodies. Clin Chem. 2013;59(6):982.

129. Zheng X, Baker H, Hancock WS. Analysis of the low molecular weight serum peptidome using ultrafiltration and a hybrid ion trap-Fourier transform mass spectrometer. Journal of Chromatography A. 2006;1120(1):173-184.

130. Li L, Sun C, Freeby S, Yee D, Kieffer-Jaquinod S. Protein sample treatment with peptide ligand library: Coverage and consistency. J Proteomics Bioinform. 2009;2:485-494.

131. Hopfgartner G. Introduction to MS in bioanalysis, mass spectrometry in medicinal chemistry. Wiley-VCH Verlag GmbH & Co. KGaA. 2007:1-62.

132. Cech NB, Krone JR, Enke CG. Predicting electrospray response from chromatographic retention time. Anal Chem. 2001;73(2):208.

133. Wilm MS, Mann M. Electrospray and Taylor-cone theory, Dole's beam of macromolecules at last? International Journal of Mass Spectrometry and Ion Processes. 1994;136(2):167-180.

134. Wilm M, Mann M. Analytical properties of the nanoelectrospray ion source. Anal Chem. 1996;68(1):1.

135. Tang N, Tornatore P, Weinberger SR. Current developments in SELDI affinity technology. Mass Spectrom Rev. 2004;23(1):34-44.


136. Scherl A. Clinical protein mass spectrometry. Methods. 2015;81:3-14.

137. Mayya V, Rezaul K, Cong Y, Han D. Systematic comparison of a two-dimensional ion trap and a three-dimensional ion trap mass spectrometer in proteomics. Molecular & cellular proteomics : MCP. 2005;4(2):214.

138. Marshall AG, Hendrickson CL, Jackson GS. Fourier transform ion cyclotron resonance mass spectrometry: A primer. Mass Spectrom Rev. 1998;17(1):1-35.

139. Makarov A. Electrostatic axially harmonic orbital trapping: A high-performance technique of mass analysis. Anal Chem. 2000;72(6):1156-1162.

140. IonSource. Introduction to MS Quantitation and Modes of LC/MS monitoring. http://www.ionsource.com/tutorial/msquan/intro.htm. Updated January 19, 2016. Accessed June 3, 2016.

141. Peterson AC, Russell JD, Bailey DJ, Westphall MS, Coon JJ. Parallel reaction monitoring for high resolution and high mass accuracy quantitative, targeted proteomics. Molecular & cellular proteomics : MCP. 2012;11(11):1475.

142. Panchaud A, Jung S, Shaffer SA, Aitchison JD, Goodlett DR. Faster, quantitative, and accurate precursor acquisition independent from ion count. Anal Chem. 2011;83(6):2250.

143. Jarrett DE, Maclean B, Johnson R, Xuan Y, Michael JM. Multiplexed peptide analysis using data-independent acquisition and Skyline. Nature Protocols. 2015;10(6):887.

144. Gatlin CL, Eng JK, Cross ST, Detter JC, Yates JR. Automated identification of amino acid sequence variations in proteins by HPLC/microspray tandem mass spectrometry. Anal Chem. 2000;72(4):757.

145. Chapman JD, Goodlett DR, Masselon CD. Multiplexed and data-independent tandem mass spectrometry for global proteome profiling. Mass Spectrom Rev. 2014;33(6):452-470.

146. Bern M, Finney G, Hoopmann MR, Merrihew G, Toth MJ, MacCoss MJ. Deconvolution of mixture spectra from ion-trap data-independent-acquisition tandem mass spectrometry. Anal Chem. 2010;82(3):833.

147. Gillet LC, Navarro P, Tate S, et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: A new concept for consistent and accurate proteome analysis. Molecular & cellular proteomics : MCP. 2012;11(6):O111.016717.

148. Zhang W. Progress in mass spectrometry acquisition approach for quantitative proteomics. Chinese Journal of Analytical Chemistry. 2014;42(12):1859-1868.

149. Purvine S, Eppel J, Yi EC, Goodlett DR. Shotgun collision-induced dissociation of peptides using a time of flight mass analyzer. Proteomics. 2003;3(6):847-850.


150. Venable JD, Dong MQ, Wohlschlegel J, Dillin A, Yates JR. Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nature Methods. 2004;1(1):39.

151. Silva JC, Denny R, Dorschel CA, et al. Quantitative proteomic analysis by accurate mass retention time pairs. Anal Chem. 2005;77(7):2187.

152. Ramos AA, Yang H, Rosen LE, Yao X. Tandem parallel fragmentation of peptides for mass spectrometry. Anal Chem. 2006;78(18):6391.

153. Geiger T, Cox J, Mann M. Proteomics on an orbitrap benchtop mass spectrometer using all-ion fragmentation. Molecular & Cellular Proteomics : MCP. 2010;9(10):2252-2261.

154. Carvalho PC, Han X, Xu T, et al. XDIA: Improving on the label-free data-independent analysis. Bioinformatics. 2010;26(6):847-848.

155. Hannes LR, Rosenberger G, Navarro P, et al. OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nat Biotechnol. 2014;32(3):219.

156. Hopfgartner G, Lesur A, Varesio E. Analysis of biopharmaceutical proteins in biological matrices by LC-MS/MS II. LC-MS/MS analysis. Trends in Analytical Chemistry. 2013;48:52-61.

157. Shi T, Su D, Liu T, et al. Advancing the sensitivity of selected reaction monitoring-based targeted quantitative proteomics. Proteomics. 2012;12(8):1074-1092.

158. Smith RD, Shen Y, Tang K. Ultra-sensitive and quantitative analyses from combined separations-mass spectrometry for the characterization of proteomes. Acc Chem Res. 2004;37(4).

159. Bantscheff M, Schirle M, Sweetman G, Rick J, Kuster B. Quantitative mass spectrometry in proteomics: A critical review. Anal Bioanal Chem. 2007;389(4):1017-1031.

160. Thermo Fisher Scientific Inc. Quantitative proteomics. https://www.thermofisher.com/nl/en/home/life-science/protein-biology/protein-biology-learning-center/protein-biology-resource-library/pierce-protein-methods/quantitative-proteomics.html#icat. Updated 2015. Accessed 08/06/2016.

161. Wasinger VC, Zeng M, Yau Y. Current status and advances in quantitative proteomic mass spectrometry. . 2013;2013.

162. Cottrell JS. Protein identification using MS/ MS data. Journal of Proteomics. 2011;74(10):1842-1851.

163. Webhofer C, Schrader M. Chapter 7 bioinformatic tools for the LC-MS/MS analysis of proteins and peptides. In: The Royal Society of Chemistry; 2011:87-103. http://dx.doi.org/10.1039/9781849733144-00087.


164. Wang G, Wu WW, Zhang Z, Masilamani S, Shen R. Decoy methods for assessing false positives and false discovery rates in shotgun proteomics. Anal Chem. 2009;81(1):146.

165. Selvaraju S, El Rassi Z. Targeting deeper the human serum fucome by a liquid-phase multicolumn platform in combination with combinatorial peptide ligand libraries. Journal of Chromatography B. 2014;951-952:135-142.

166. Gilar M, Olivova P, Chakraborty AB, Jaworski A, Geromanos SJ, Gebler JC. Comparison of 1-D and 2-D LC MS/MS methods for proteomic analysis of human serum. Electrophoresis. 2009;30(7):1157-1167.

167. Kumar V, Barnidge DR, Chen L, et al. Quantification of serum 1-84 parathyroid hormone in patients with hyperparathyroidism by immunocapture in situ digestion liquid chromatography-tandem mass spectrometry. Clin Chem. 2010;56(2):306.

168. Gilar M, Olivova P, Daly AE, Gebler JC. Orthogonality of separation in two-dimensional liquid chromatography. Anal Chem. 2005;77(19):6426.
