Article

pubs.acs.org/jpr

LC−MS/MS Characterization of O-Glycosylation Sites and Glycan Structures of Human Cerebrospinal Fluid Glycoproteins

Adnan Halim, Ulla Ruetschi, Goran Larson, and Jonas Nilsson*

Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, Sahlgrenska Academy at the University of Gothenburg, 413 45 Gothenburg, Sweden

Supporting Information

ABSTRACT: The GalNAc O-glycosylation on Ser/Thr residues of extracellular proteins has not been well characterized from a proteomics perspective. We previously reported a sialic acid capture-and-release protocol to enrich tryptic N- and O-glycopeptides from human cerebrospinal fluid glycoproteins using nano-LC−ESI−MS/MS with collision-induced dissociation (CID) for glycopeptide characterization. Here, we have introduced peptide N-glycosidase F (PNGase F) pretreatment of CSF samples to remove the N-glycans, facilitating the selective characterization of O-glycopeptides and enabling the use of an automated CID−MS2/MS3 search protocol for glycopeptide identification. We used electron-capture and -transfer dissociation (ECD/ETD) to pinpoint the glycosylation site(s) of the glycopeptides, identified as predominantly core-1-like HexHexNAc-O- structures attached to one to four Ser/Thr residues. We characterized 106 O-glycosylations and found Pro residues preferentially in the n − 1, n + 1, and/or n + 3 positions in relation to the Ser/Thr attachment site (n). The characterization of glycans and glycosylation sites in glycoproteins from human clinical samples provides a basis for future studies addressing the biological and diagnostic importance of specific protein glycosylations in relation to human disease.

KEYWORDS: glycoproteomics, glycopeptide, tandem mass spectrometry, PNGase F, hydrazide chemistry

Received: June 12, 2012
Published: December 13, 2012

© 2012 American Chemical Society

dx.doi.org/10.1021/pr300963h | J. Proteome Res. 2013, 12, 573−584

Journal of Proteome Research

■ INTRODUCTION

Extracellular proteins are frequently modified post-translationally with N-glycans on Asn residues and O-glycans on Ser/Thr residues.1 Recently, O-glycosylation of Tyr residues has also been reported.2,3 Both N- and O-glycans are often terminated with sialic acids, with N-acetyl-5-neuraminic acid (Neu5Ac) being the dominant form in human glycoproteins,4 which are essential for a multitude of cellular interactions.5−8 Mucins, that is, glycoproteins with long stretches rich in Ser, Thr, and Pro residues, are heavily GalNAc O-glycosylated on these Ser/Thr residues.9,10 Such “mucin glycosylations” are known to protect epithelial cells from physical stress and to act as decoy mechanisms for microbes.11 Nonmucin glycoproteins also carry GalNAc O-glycosylations on site-specific Ser/Thr residues,3,12−15 and single or few clustered O-glycans have been shown to selectively block proteases from cleaving their peptide target sites.16−18 The proteolytic destiny, processing pathway, lifetime, and biological function of a glycoprotein can thus be specifically determined by its glycosylation status. To better address the significance of site-specific O-glycosylation of specific glycoproteins, it is accordingly important to map the O-glycosylation sites.

As opposed to the Asn-X-Ser/Thr consensus motif of N-glycosylation, no apparent consensus motif for O-glycosylation seems to exist. This is likely due to the existence and differential expression of up to 20 different mammalian genes coding for a family of polypeptide GalNAc transferases (ppGalNAc-Ts), which are together responsible for addition of the initial GalNAcα1-O-Ser/Thr on the polypeptide substrates.19,20 Each ppGalNAc-T seems to exhibit rather unique specificity for the O-glycosylation motif and also to show a tissue-specific distribution. Accordingly, it has been shown that model peptides containing S/T-X-X-P and P-S/T sequences are favorably subjected to initial glycosylation by ppGalNAc-T1 and -T2 (refs 21−23) due to substrate recognition by their catalytic domain. However, additional glycosylation of neighboring Ser/Thr residues might also be facilitated, because of binding of ppGalNAc-T1 and -T2 through their lectin domains24,25 to the newly formed O-glycopeptide, which undermines the straight peptide-sequence-dependent O-glycosylation.26 Additionally, ppGalNAc-T4 (ref 27) and -T10 (ref 28) glycosylate several Ser/Thr residues by specific recognition of preformed GalNAc-O- through their lectin domains and also independently by recognition of GalNAc-O- through their catalytic domains.26,28 Two web resources are available where GalNAc O-glycosylation sites are predicted based on known glycosylation sites [NetOGlyc 3.1, http://www.cbs.dtu.dk/services/NetOGlyc/; and Isoform Specific O-Glycosylation Prediction (ISOGlyP), http://isoglyp.utep.edu], but the validity of predicted sites must be questioned when experimental confirmation is lacking.

Glycoproteomic techniques aimed at mapping O-glycosylation sites have recently been introduced as powerful tools for structural characterization of native glycoproteins.3,12,13,29 Darula and Medzihradszky used the jacalin lectin, recognizing GalNAcα1-O- of the core 1 structure, to purify tryptic O-glycopeptides and identified 23 O-glycosylation sites from bovine serum glycoproteins.13,30 Recently, they expanded their list to include 125 O-glycosylation sites by using prefractionation steps at both the glycoprotein and glycopeptide levels.31 Steentoft et al. used zinc-finger nuclease-induced knockout of the core 1 Gal T1 chaperone cosmc, inhibiting further elongation of GalNAc-O- precursor substrates, for studies of O-glycosylation in Simple cell cultures.3 Additionally, they employed Vicia villosa agglutinin (VVA) lectin chromatography to enrich GalNAc-modified tryptic glycopeptides and identified more than 350 O-glycosylation sites.3 Another glycoproteomics approach is the usage of TiO2 solid phases for the enrichment of sialylated glycopeptides in combination with peptide N-glycosidase F (PNGase F) treatment to release formerly N-glycosylated peptides from the solid phase.32 The usability of this methodology for the purification of O-glycopeptides has yet to be demonstrated. We initially developed a protocol for sialic acid capture and release of both N- and O-glycoproteins/glycopeptides from clinical samples12 using hydrazide chemistry.33 Mild periodate oxidation was used to introduce an aldehyde on sialic acid terminated glycoproteins, which were then covalently captured onto hydrazide beads and trypsin digested, and finally, tryptic glycopeptides were released by formic acid hydrolysis of the acid-labile sialic acid glycosidic bond. Using liquid chromatography coupled to tandem mass spectrometry (LC−MS/MS) for the glycopeptide analyses, we identified desialylated glycans of 36 N- and 44 O-glycosylation sites on human cerebrospinal fluid (CSF) glycoproteins.12 We also used this method to identify desialylated glycans of 58 N- and 63 O-glycosylation sites from human urine samples.34 The N-glycan structures were essentially all of the complex type, and the O-glycans were mainly of the core 1 type. For these CSF and urine samples, the presence of abundant N-glycopeptides was prominent in the ion chromatograms and reduced the likelihood of fragmenting less abundant coeluting O-glycopeptides. To specifically study the site-specific O-glycosylation of CSF proteins, we have now included a pretreatment step using PNGase F to selectively remove N-glycans from native glycoproteins and thus facilitate the selective MS analysis of O-glycopeptides. We have now also developed an automated protocol to search for the HexHexNAc-O-substituted peptides using the Mascot search engine. For the assignment of specific Ser/Thr/Tyr glycosylation site(s) for peptides containing multiple hydroxylated amino acids, we used electron-capture dissociation (ECD) and electron-transfer dissociation (ETD) to allow for selective peptide backbone fragmentation of O-glycopeptides.

MATERIALS AND METHODS

The CSF samples (10 mL, n = 8) were taken on the suspicion of infection but were, upon analysis, found to have normal white blood cell count and blood-brain barrier function. The samples were collected by lumbar puncture and were centrifuged at 1800g for 10 min within 30 min after sample collection, aliquoted (1 mL fractions), and stored at −80 °C pending analysis. The aliquots of the CSF samples were deidentified, that is, all patient information was removed, before usage in this study. The use of deidentified clinical samples for method development is in agreement with Swedish law, and the study was permitted by the head of the Clinical Chemistry laboratory, Sahlgrenska University Hospital (Dnr 797-550/12).

PNGase F Pretreatment and Sialic Acid Capture-and-Release Protocols

Aliquots of CSF samples (1 mL) were dialyzed against water using membranes with a 12−14 kDa molecular-weight cutoff (MWCO) (Spectrum Lab) (n = 2) or desalted on Sephadex PD-10 columns (GE Healthcare) (n = 6). The samples were lyophilized, dissolved in 50 μL of water, and subjected to PNGase F treatment according to the manufacturer’s protocol (New England Biolabs). The samples were denatured at 50 °C for 10 min in the glycoprotein denaturing buffer. Temperatures higher than 60 °C should be avoided because of risk of irreversible sample denaturation. G7 buffer, NP40, and PNGase F were added, and the samples were incubated at 37 °C for 16 h. The samples were then desalted against water using 10 kDa MWCO microdialysis (Pierce). Finally, the samples (100−200 μL) were subjected to sialic acid capture and release for the enrichment of desialylated glycopeptides, as described elsewhere.12

Liquid Chromatography−Mass Spectrometry

Mass spectrometric analysis was performed essentially as described in ref 12. In short, samples were dissolved in 20 μL of 0.1% formic acid and separated by nano-liquid chromatography on a 150 × 0.075 mm C18 reverse-phase column (Zorbax; Agilent Technologies) in 50 min for elution of narrow chromatographic peaks and 120 min for broader peaks, with a gradient from 0 to 50% acetonitrile in 0.1% formic acid at a flow rate of 200−300 nL/min. The eluting peptides were passed through a nano-ESI source to a hybrid linear quadrupole ion trap/FT ion cyclotron resonance (ICR) mass spectrometer equipped with a 7 T magnet (LTQ-FT; Thermo Fisher Scientific). All spectra were acquired in positive-ion mode, and the mass spectrometer was operated in the data-dependent mode to automatically switch between MS1, MS2, and MS3 acquisition. The FTICR precursor scan was acquired at an isotopic resolution of 50000, and the most intense ion was isolated and fragmented in the linear ion trap (LTQ) using a normalized collision energy of 30%. For each MS2 spectrum, the five most intense fragment ions were sequentially selected for CID fragmentation in MS3. A repeat count of two was used, and ions were then dynamically excluded for 180 s. For ECD, the precursor ions were guided to the ICR cell and fragmented. The most abundant ion from an inclusion list, obtained by initial use of the CID−MS2/MS3 approach, was selected for fragmentation and irradiated with low-energy electrons produced by an emitter cathode for 80 ms using an arbitrary energy setting of 4 or 5 in duplicate fragmentation events.

For higher-energy collision dissociation (HCD) and ETD, we used Orbitrap Velos and Orbitrap XL instruments (Thermo), respectively. The reverse-phase C18 chromatography and ESI interface setups were as previously described.35 The MS run times were 70 min, and the gradient ranged from 0 to 40% acetonitrile in 0.1% formic acid. For the Orbitrap Velos experiments, the MS1 precursor scans and CID−MS2 spectra were acquired with an isotopic resolution of 30000 and 7500, respectively, in the Orbitrap. The software could thus assign the charge states of MS2 peaks, which was necessary for attaining data-dependent CID−MS3 transitions from the five most abundant peaks in each MS2 spectrum. The CID−MS3 spectra were acquired as profile data in the LTQ. The normalized collision energies for CID−MS2 and −MS3 were set to 30%, and the minimum signal intensities for data-dependent triggering of CID were set to 10000 and 500 counts in the MS1 and MS2 steps, respectively. Also, one HCD-MS2 spectrum was acquired on the Orbitrap Velos, after the MS3 events, at a normalized collision energy of 40%. For the ETD experiments, using the Orbitrap XL, the normalized collision energy was set to 35% and the activation time was 200 ms. For each MS1 spectrum, at a resolution of 30000, three ETD spectra were collected, and the minimum signal required was set to 100000 counts. The ETD spectra were collected either as profile data from the Orbitrap, at a resolution of 7500, or as centroided data from the LTQ.
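The data-dependent MS3 step described above (five most intense MS2 fragment ions, with a 500-count trigger threshold quoted for the Orbitrap Velos runs) can be illustrated with a minimal Python sketch. This is not the instrument firmware logic: the function name `select_ms3_precursors` and the example peak list are hypothetical, and repeat counts and dynamic exclusion are deliberately left out.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity in counts)


def select_ms3_precursors(
    ms2_peaks: List[Peak],
    top_n: int = 5,               # "five most intense fragment ions" from the text
    min_intensity: float = 500.0  # MS2-level trigger threshold quoted for the Velos runs
) -> List[Peak]:
    """Return up to top_n MS2 peaks, most intense first, that pass the threshold."""
    eligible = [peak for peak in ms2_peaks if peak[1] >= min_intensity]
    return sorted(eligible, key=lambda peak: peak[1], reverse=True)[:top_n]


# Hypothetical MS2 peak list, for illustration only (not data from the study).
example_ms2 = [
    (366.1, 1200.0), (528.2, 450.0), (810.4, 9800.0), (911.9, 15000.0),
    (993.5, 700.0), (1012.6, 3100.0), (1215.7, 2600.0),
]
selected = select_ms3_precursors(example_ms2)
```

In this toy list, the peak at m/z 528.2 is dropped by the threshold and the peak at m/z 993.5 is dropped because it ranks sixth among the remaining peaks.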

Analysis of MS Data

The LC−MS/MS files from CID acquisitions were converted to Mascot general format (.mgf) using the Raw2msm application.36 The top 12 peaks per 100 Da were selected, and MS3 spectra were included. The in-house Mascot server was accessed through Mascot Daemon (version 2.3.0), and searches were performed with the enzyme specificity set to Trypsin and then changed to Semitrypsin. The human sequences of the Swiss-Prot database were searched (20249 sequences; January 25, 2011), but then the NCBI database (16392747 sequences; December 27, 2011) was used to account for sequence variations. HexHexNAc (365.1322 Da) on Ser, Thr, and Tyr residues was set as a variable modification together with neutral loss of HexHexNAc and Hex (162.0528 Da) for scoring purposes and from the “peptide” to account for neutral loss of HexHexNAc and Hex from the precursor. Alternatively, Hex2HexNAc2 (730.2644 Da) and HexHexNAc2 (568.2116 Da) on Ser, Thr, and Tyr residues were set as variable modifications together with neutral losses of the same masses in separate searches. Other variable modifications were Asn-to-Asp conversion (+0.9840 Da), methionine oxidation, and loss of NH3 for peptides with N-terminal Gln and N-terminal carbamidomethyl-Cys. Carbamidomethyl-Cys was set as a fixed modification. The Instrument setting of ion trap was selected. Peptide tolerance was set to 10 ppm, and fragment tolerance was set to 0.6 Da. All MS2 and MS3 spectra of Mascot-proposed O-glycopeptides were manually checked to contain the anticipated HexHexNAc-O- or (HexHexNAc-O-)2 structures and were further investigated for matches that pinpointed the glycan to a specific Ser/Thr/Tyr residue within the peptide.

The ECD and ETD spectra were converted and aggregated using Mascot Distiller (version 2.3.2.0, Matrix Science), and the ions were presented as singly protonated in the output Mascot file. Search parameters were set as described above, except that the fragment tolerance was set to 0.03 Da, no neutral losses were allowed for the HexHexNAc modification, and the Instrument parameters were set to consider c, z, and z + 1 ions. Also, the precursor ion masses of ECD and ETD spectra were matched manually to those of glycopeptides that had been identified by the automated Mascot search protocol. The MS-Product tool from Protein Prospector (http://prospector.ucsf.edu) was used to prepare peak lists of c and z ions for glycopeptide matches, and O-glycosylation sites were pinpointed to unique Ser/Thr/Tyr residues by tracing c and z ions that included or lacked HexHexNAc-O- modifications.

Data Analysis of Glycosylation Sites

Glycosylation sites of human proteins in the Uniprot knowledge base (UniprotKB) database were compiled by specifying the topic glycosylation in the FT line, the search terms (GalNAc...) and (HexNAc...), and experimentally verified. The neighboring ±10 amino acid residues were plotted, and Weblogos37 were constructed using version 3.1 (http://weblogo.threeplusone.com), where the previously reported sites from CSF were omitted to avoid bias from the methodology used both previously12 and in this report.

RESULTS

PNGase F Treatment

We subjected eight deidentified CSF samples to peptide N-glycosidase F (PNGase F) treatment and enriched O-linked glycopeptides (O-glycopeptides) with the sialic acid capture-and-release protocol (Figure 1A). For two CSF samples, half of the volume was treated with PNGase F and the other half was left untreated, and then both were subjected to glycopeptide enrichment. Peaks of tryptic N-linked glycopeptides (N-glycopeptides) were virtually absent in the PNGase F treated CSF samples (Figure 1B) but were prominent in the untreated samples (Figure 1C). By inspection of the CID−MS2 and −MS3 spectra, we identified several O-glycopeptides with mainly HexHexNAc-O- structure, most likely corresponding to the core 1 (Galβ3GalNAcα-O-) glycan.

Automated Mascot Search to Identify O-Glycopeptides

To efficiently analyze the fragment-ion spectra, we designed a protocol to automate the Mascot searches for HexHexNAc-O-substituted peptides (Figure 2). Use of the Raw2msm application36 for the generation of Mascot .mgf search files allowed the precursor masses (MS1) to be assigned not only to the CID−MS2 spectrum but also to five consecutive MS3 spectra. Thus, the high mass accuracy (<5 ppm) of the FTICR or Orbitrap Velos MS1 precursor ions was implemented for the subsequent MS2 and MS3 spectra that were measured at low resolution, but with high sensitivity, in the LTQ. Accordingly, a variable modification corresponding to HexHexNAc (365.1322 Da) on Ser/Thr/Tyr residues and the simultaneous neutral loss of the same mass to account for the lack of HexHexNAc of the peptide ion were included as parameters during database searches. An advantage of using Raw2msm was that all isotopic peaks were used to calculate the precursor mass, which gives a better mass accuracy compared to merely picking the first isotopic peak (Figure S1 and Table S1, Supporting Information).

We first tested this search protocol on the LC−MS/MS files that we previously had analyzed manually.12 Of the 43 HexHexNAc-O- and (HexHexNAc-O-)2-substituted peptides that had been manually identified, we now were able to automatically identify 35 in less than 5 min as opposed to weeks of manual interpretation. The O-glycopeptides that were not automatically identified either had precursor-ion intensities that were too weak or had MS1 precursor ions that were assigned wrong charge states by the Raw2msm application. The automated Mascot search protocol identified one additional O-glycopeptide, 60-AIMGAAHEPSPPGGLDAR-77 from β-galactoside α-2,6-sialyltransferase 2 (ST6Gal II/SIAT2, UniprotKB ID used in Table 1), for which Ser-69 is the only possible glycosylation site. We then analyzed the O-glycopeptides from the PNGase F treated CSF samples and identified 85 peptides constituting 106 unique O-glycosylation sites, of which about half had not previously been described (Table 1). For identified O-glycopeptides containing several hydroxylated residues, we were able to pinpoint 50 attachment sites correctly using CID or ECD/ETD.

Figure 1. PNGase F pretreatment and sialic acid capture-and-release protocol of CSF samples. (A) CSF samples were subjected to PNGase F treatment (step 1), subjected to periodate oxidation, captured on hydrazide beads, and trypsin-digested while still attached to the beads (steps 2−4). Desialylated O-glycopeptides were released by formic acid hydrolysis (step 5). The LC−MS total-ion chromatogram of O-glycopeptides enriched from (B) PNGase F pretreated and (C) untreated CSF. Selected parent ions corresponding to chromatographic peaks are annotated with their nominal m/z values. N, N-glycopeptide; O, O-glycopeptide.

CID Fragmentation to Pinpoint the Correct Glycosylation Site

We allowed for a neutral loss of Hex (−162.0528 Da) from the HexHexNAc-O-substituted precursor. Thus, all possible HexNAc-O-substituted b- and y-ion peaks in the MS2 and MS3 spectra were taken into account in the Mascot search. Two examples where CID was used to pinpoint glycosylation sites are given below. The O-glycosylation site of 55-YSQAVPAVTEGPIPEVLK-72 from cathepsin D (CATD) was assigned to Thr-63 with a Mascot score of 36 (p < 0.05 threshold of 29), but for the Tyr-55 and Ser-56 alternatives, the scores were 32. This difference in scores was due to the diagnostic presence of a glycosylated y13 fragment ion [y(13)-162, Figure 2D], which pinpointed the glycosylation site to Thr-63 and excluded Tyr-55 and Ser-56 from being glycosylated (Figure 2B−D). A second example of CID fragmentation of the peptide backbone in the presence of intact glycosylation was for the O-glycosylation of 899-ALLIPPSSPMPGP-911 from Brevican core protein (PGCB), which was automatically assigned to Ser-905 (Figure S2, Supporting Information). In general, high abundance of b- and/or y-ion peaks, arising from fragmentation at the N-terminal side of Pro,38 was often significant for CID peptide fragmentation, also in the presence of intact or partially intact HexHexNAc-O- structures.

ECD and ETD Fragmentation to Pinpoint Correct Glycosylation Sites

To further verify O-glycopeptide identities and to assign the correct Ser, Thr, or Tyr glycosylation sites for glycopeptides containing two or more Ser/Thr/Tyr residues, we used ECD for peptide fragmentation without simultaneous fragmentation of glycans39 (Figure 3). For example, we assigned the glycosylation site of the abundant ion (m/z 993 in Figure 1B and Figure S1A, Supporting Information) of the C-terminal 301-VQAAVGTSAAPVPSDNH-317 peptide from apolipoprotein E (APOE) to Ser-308 (m/z 662 in Figure 3A), which is in accordance with other studies.3,40 Also, a Hex2HexNAc2 glycoform of this glycopeptide was present, and ECD-MS2 showed that both Ser-308 and Thr-307 were glycosylated with two separate HexHexNAc-O- structures (Figure 3B). A third ECD example was the 23-LLSDHSKPTAETVAPDNTAIPSLR-46 glycopeptide, from SPARC-like protein 1 (SPRL1), where Thr-31 and Thr-40 were found to be glycosylated with HexHexNAc-O- whereas the four additional Ser/Thr residues were unglycosylated (Figure 3C). However, by the use of CID−MS2/MS3, we identified (HexHexNAc-O-)3 and (HexHexNAc-O-)4 glycoforms of the same tryptic peptide (Figure S3, Supporting Information), and the identifications were based on the presence of a diagnostic y4 fragment ion (m/z 472) in common to the three glycoforms. We also used electron-transfer dissociation (ETD) fragmentation and pinpointed, for example, the O-glycosylation site of the HexHexNAc-O-substituted 649-GLTTRPGSGLTNIK-662 peptide from the amyloid precursor protein (APP/A4) to Thr-651 or Thr-652 (Figure 3D). By the ECD and ETD approach, we assigned 31 glycosylation sites to unique Ser/Thr residues of peptides with several Ser/Thr alternatives. We did not identify any Tyr-glycosylated Aβ peptides in the CSF samples,2 nor did we observe any evidence for other Tyr-glycosylated peptides.3 In total, using a combination of CID and ECD/ETD, 67 desialylated glycans of unique O-glycosylation sites were pinpointed to correct Ser/Thr residues. Seventeen O-glycopeptides contained only one HexHexNAc-O- structure and one available Ser/Thr glycosylation site.

Automated Search for More Complex Glycoforms

Apart from the core-1-like HexHexNAc-O- structure, the core 2 compatible Hex(HexHexNAc)HexNAc-O- and Hex(HexNAc)HexNAc-O- structures (730.2644 and 568.2116 Da, respectively) were introduced as allowed modifications in separate Mascot searches. A few false hits of Hex2HexNAc2 arose from O-glycopeptides containing two separate HexHexNAc-O- structures but were disqualified because of a lack of diagnostic saccharide oxonium ions otherwise typically found in the CID−MS2/MS3 spectra of more complex O-glycopeptides.34 One novel O-glycopeptide with Hex2HexNAc2 modification was, however, identified as 210-AATVGSLAGQPLQER-224 from Apolipoprotein E (APOE) (Figure S4, Supporting Information). Peaks from the saccharide oxonium ions [HexHexNAc2]+ (m/z 569.3 in Figure S4A, Supporting Information) and [Hex2HexNAc2]+ (m/z 731.3 in Figure S4A, Supporting Information) exceeding [HexHexNAc]+ (m/z 366.3 in Figure S4A, Supporting Information) in mass were observed, which verified the presence of the more complex core-2-like Hex(HexHexNAc)HexNAc-O- structure as opposed to two separate HexHexNAc-O- structures. The CID−MS3 spectrum of the peptide + HexNAc ion (m/z 850.9) was used for the automated identification (Figure S4A, right spectrum; Supporting Information). Also, the ETD spectrum of the [M + 3H]3+ precursor indeed showed that the complete glycan was attached solely to Thr-212 (Figure S4B, Supporting Information). A manual survey of all CID−MS2 and −MS3 spectra was performed to investigate for the presence of O-glycopeptides with more complex glycans, but none were found, demonstrating that, using this methodology, sialylated HexHexNAc-O- structures appeared vastly dominating in these samples.

Figure 2. Mascot search method for automated identification of HexHexNAc-O-substituted peptides. (A) The MS1 precursor was measured in FTICR or Orbitrap mode and (B) was subjected to CID to generate the MS2 spectrum. (C) Further CID of the peptide (top) and peptide + HexNAc ion (bottom) generated the MS3 spectra that were used in Mascot searches to identify the O-glycopeptide. The CID−MS2 and −MS3 transitions follow black (straight) arrows, and the MS1 precursor assignments follow red (rounded) arrows.
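The glycan modification masses and oxonium-ion values quoted for the complex-glycoform searches follow from simple residue-mass arithmetic, which the Python sketch below reproduces. Two things here are assumptions rather than statements from the text: the HexNAc residue mass of 203.0794 Da is obtained as 365.1322 − 162.0528, and the 0.6 Da window (the Mascot fragment tolerance) is reused only as an illustrative matching window for the low-resolution ion-trap peaks. The function names are hypothetical.

```python
HEX = 162.0528     # Hex residue mass (Da), as quoted in the text
HEXNAC = 203.0794  # HexNAc residue mass (Da), derived as 365.1322 - 162.0528 (assumption)
PROTON = 1.00728   # mass of a proton (Da)


def glycan_residue_mass(n_hex: int, n_hexnac: int) -> float:
    """Mass added to a peptide by a glycan with the given composition."""
    return n_hex * HEX + n_hexnac * HEXNAC


def oxonium_mz(n_hex: int, n_hexnac: int) -> float:
    """m/z of the singly charged saccharide oxonium ion for a composition."""
    return glycan_residue_mass(n_hex, n_hexnac) + PROTON


def within_tolerance(observed: float, theoretical: float, tol_da: float = 0.6) -> bool:
    """True if an observed peak lies within tol_da of the theoretical m/z."""
    return abs(observed - theoretical) <= tol_da


# Observed ion-trap values quoted in the text, keyed by (n_hex, n_hexnac).
OBSERVED = {(1, 1): 366.3, (1, 2): 569.3, (2, 2): 731.3}

checks = {
    composition: within_tolerance(observed, oxonium_mz(*composition))
    for composition, observed in OBSERVED.items()
}
```

Under these assumptions the three compositions give 365.1322, 568.2116, and 730.2644 Da, and each quoted oxonium peak falls inside the window.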


Table 1. O-Glycopeptides Identified from Human CSF Glycoproteinsᵃ,ᵇ

ᵃSer/Thr residues of glycosylation sites that were experimentally verified are shown in bold red. Pro residues in glycosylated S/T-X-X-P, P-S/T and S/T-P sequences are shown in bold blue. Ser/Thr/Tyr residues in glycopeptides with experimentally unverified glycosylation site(s) are shown in bold green. The number of HexHexNAc-O- sites is indicated in the last column when more than one are present.
ᵇ(a) Previously not reported glycosylation site. (b) Reported from human CSF.12 (c) Previously reported (UniprotKB). (d) Previously reported in immunopurified APP/A4 from human CSF.2 (e) Reported from human cell culture.3

Reproducibility of Sample Preparation, LC−MS/MS Analysis, and Glycosylation Pattern of Human CSF Samples

The reproducibility of the sample preparation, LC−MS/MS analysis, and presence of the same O-glycosylation sites across individual CSF samples was assayed by analyzing 19 abundant glycopeptides from six CSF samples that were acquired sequentially using identical preparative and LC−MS/MS settings on the FTICR instrument (Table S2, Supporting Information). These glycopeptides were selected because they were automatically identified by Mascot searches in at least three of the six samples. The Mascot scores for these 19 glycopeptides were similar across the six samples; for example, the differences between median and average scores were <5% for all glycopeptides except for (HexHexNAc-O-)2-substituted 301-VQAAVGTSAAPVPSDNH-317 from APOE. MS1 peaks of the 19 glycopeptides were manually identified as having the correct mass (±5 ppm) and expected elution time (±2 min) in all six samples. We also used an alternative approach, exemplified by the (HexHexNAc-O-)2-substituted 23-LLSDHSKPTAETVAPDNTAIPSLR-46 glycopeptide from SPRL1, which was automatically identified by Mascot in only one of the six LC−MS/MS runs (Table S2, Supporting Information). However, the MS1 peak (Figure S5A inset, Supporting Information) was indeed present, although at varying intensities, in all six samples (Figure S5A−F, Supporting Information). Thus, the glycosylation patterns in the CSF samples were typically reproducible across the sample preparations and LC−MS/MS analyses and between individuals.
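The manual MS1 matching criteria used for the reproducibility check (mass within ±5 ppm, elution time within ±2 min) and the median-versus-average comparison of Mascot scores can be written down as a short Python sketch. The function names are hypothetical, the example scores are invented for illustration, and expressing the median/mean difference relative to the mean is an assumption, since the text does not state the denominator.

```python
from statistics import mean, median
from typing import Sequence


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def is_expected_peak(
    observed_mz: float,
    theoretical_mz: float,
    observed_rt_min: float,
    expected_rt_min: float,
    ppm_tol: float = 5.0,     # ±5 ppm from the text
    rt_tol_min: float = 2.0,  # ±2 min from the text
) -> bool:
    """True if an MS1 peak matches both the mass and elution-time windows."""
    mass_ok = abs(ppm_error(observed_mz, theoretical_mz)) <= ppm_tol
    rt_ok = abs(observed_rt_min - expected_rt_min) <= rt_tol_min
    return mass_ok and rt_ok


def median_mean_difference_pct(scores: Sequence[float]) -> float:
    """Absolute difference between median and mean, as a percentage of the mean."""
    average = mean(scores)
    return abs(median(scores) - average) / average * 100.0


# Hypothetical Mascot scores for one glycopeptide across samples (not study data).
example_scores = [36, 38, 40, 41, 45]
difference_pct = median_mean_difference_pct(example_scores)
```

A peak observed at m/z 993.4710 against a theoretical 993.4685 deviates by roughly 2.5 ppm and would pass, whereas one at 993.4785 (about 10 ppm) would not.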

Figure 3. ECD and ETD spectra of O-glycopeptides. ECD spectrum of the C-terminal tryptic HexHexNAc-O-substituted 301-VQAAVGTSAAPVPSDNH-317 peptide (A) from APOE and (B) its (HexHexNAc-O-)2 glycoform. (C) ECD spectrum of (HexHexNAc-O-)2-substituted 23-LLSDHSKPTAETVAPDNTAIPSLR-46 from SPRL1. (D) ETD spectrum of HexHexNAc-O-substituted 649-GLTTRPGSGLTNIK-662 from APP/A4.

Weblogo Analysis of the O-Glycosylation Sites

We prepared a Pro frequency plot for the ±10 residues surrounding the 67 experimentally verified O-glycosylation sites and found that the fractions of Pro occurrence at S/T-X-X-P, P-S/T, and S/T-P sequences were about one-half, one-third, and one-fourth, respectively (Figure 4A, left). The sum of the fraction values exceeds 1 because more than one sequence combination often occurred for a single glycosylation site (e.g., in S/T-P-X-P). The Weblogo plot (Figure 4A, right) demonstrated that a Pro residue was sequence conserved at the n − 1, n + 1, and n + 3 positions where n is the O-glycosylation site. As a comparison, all experimentally verified GalNAc-O-glycosylation sites for human proteins in the UniprotKB (222 sites, release 2012_02) were analyzed in Pro frequency and Weblogo plots (Figure 4B), essentially confirming our results. However, the frequencies were not as pronounced as when only our CSF data were used. The combination of Pro in n + 1 and n + 3 (S/T-P-X-P) was found in approximately one-third of the experimentally verified glycosylation sites (Figure 4C). The frequency of Pro in the n + 2 position was low, but that of Ala and Leu was higher at the n + 2 position for the S/T-X-X-P sequence (Figure 4C). Two typically glycosylated sequences were thus T-P-A-P and T-P-L-P, where T-P-A-P was a favorable motif for the O-glycosylation of model peptides by ppGalNAc-T1 (ref 23) and was the first site to be glycosylated in a peptide containing multiple Ser/Thr and Pro residues by the brain-specific ppGalNAc-T13 (ref 41).
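The Pro frequency analysis amounts to counting, over a set of glycosylation sites, how often Pro occupies the n − 1, n + 1, and n + 3 positions. The Python sketch below shows that counting on four sites named in the text (APOE Ser-308, CATD Thr-63, and ETBR2 Thr-79 and Ser-86). It is only an illustration: it works on the tryptic peptide rather than the ±10 protein residues used for the plots, positions that fall outside the peptide are counted as not Pro, and the function name is hypothetical, so the fractions are not expected to reproduce Figure 4.

```python
from typing import Dict, List, Sequence, Tuple

Site = Tuple[str, int]  # (peptide sequence, 0-based index of the glycosylated Ser/Thr)


def pro_fraction_by_offset(
    sites: Sequence[Site], offsets: Tuple[int, ...] = (-1, 1, 3)
) -> Dict[int, float]:
    """Fraction of sites with Pro at each offset relative to the site (n)."""
    if not sites:
        raise ValueError("at least one glycosylation site is required")
    fractions: Dict[int, float] = {}
    for offset in offsets:
        hits = 0
        for sequence, index in sites:
            position = index + offset
            # Positions beyond the peptide ends are treated as "not Pro".
            if 0 <= position < len(sequence) and sequence[position] == "P":
                hits += 1
        fractions[offset] = hits / len(sites)
    return fractions


# Sites taken from peptides discussed in the text; indices are 0-based.
SITES: List[Site] = [
    ("VQAAVGTSAAPVPSDNH", 7),        # APOE 301-317, Ser-308 (S/T-X-X-P)
    ("YSQAVPAVTEGPIPEVLK", 8),       # CATD 55-72, Thr-63 (S/T-X-X-P)
    ("PIHPAGLQPTKPLVATSPNPGK", 9),   # ETBR2 70-91, Thr-79 (P-S/T)
    ("PIHPAGLQPTKPLVATSPNPGK", 16),  # ETBR2 70-91, Ser-86 (S/T-P-X-P)
]
fractions = pro_fraction_by_offset(SITES)
```

Because a single site such as ETBR2 Ser-86 contributes to both the n + 1 and n + 3 counts, the fractions need not sum to 1, which mirrors the remark about S/T-P-X-P above.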

Selected Examples of Identified O-Glycoproteins

Some selected examples of identified O-glycosylations were chosen because partial manual analysis was also used and a glycoform abundance study was carried out (APOE), because six previously unknown O-glycosylation sites were identified in one O-glycoprotein (ETBR2), because O-glycosylation was identified at Thr residues of the N-glycosylation Asn-X-Ser/Thr consensus sequence (ETBR2 and YIPF3), and because an unexpected lack of anticipated O-glycosylation in the PNGase F treated CSF samples was found (HEMO).

Apolipoprotein E

The dominating MS1 precursors in the LC−MS chromatograms (Figure 1B) were HexHexNAc-O-substituted 301-VQAAVGTSAAPVPSDNH-317 [m/z 993 in Figure 1B and Figure S1A (Supporting Information) and m/z 662 in Figure 3A and Table S1 (Supporting Information)] and 210-AATVGSLAGQPLQER-224 [m/z 931 in Figure 1B and Figure S1C and Table S1 (Supporting Information)] containing the well-established Ser-308 and Thr-212 glycosylation sites, respectively.3,12,40,42 Also, additional O-glycosylation of Thr-307 (Figure 3B) has been identified from cell culture3 and from CSF.12 Additionally, Steentoft et al. identified a third glycosylation site on Ser-314 of 301-VQAAVGTSAAPVPSDNH-317, where Thr-307, Ser-308, and Ser-314 were all substituted with HexNAc.3 We manually searched for the corresponding (HexHexNAc-O-)3-substituted peptide in the LC−MS/MS spectra and found one MS1 precursor that deviated by 2.3 ppm from the theoretical mass and had an ion intensity that was approximately 1% compared to the (HexHexNAc-O-)2-substituted peptide (Figure S6, Supporting Information), which, in turn, usually was in the range of 2% in relation to the HexHexNAc-O-substituted peptide (Figure S1, Supporting Information). The CID−MS2 spectrum supported the (HexHexNAc-O-)3 structure, and we thus confirmed Ser-314 to be a minor glycosylation site in APOE from a human CSF sample.

We also identified two additional minor glycosylation sites carrying core-1-like HexHexNAc-O- structure at Thr-26 of the N-terminal 19-KVEQAVETEPEPELR-33 peptide and at Thr-36 of the sequential peptide 34-QQTEWQSGQR-42 from APOE. They are minor because the two major O-glycopeptides containing the Thr-212 and Ser-308 glycosylation sites dominate the ion chromatograms and the two newly observed APOE glycopeptides are present at much lower ion intensities (Figure S1 and Table S1, Supporting Information) and were only automatically selected for CID− and ECD−MS2 in CSF samples that had been treated with PNGase F.
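The nominal m/z values quoted for the two dominant APOE glycopeptides (993 and 662 for 301-VQAAVGTSAAPVPSDNH-317, 931 for 210-AATVGSLAGQPLQER-224) can be checked from the peptide sequence plus one HexHexNAc (365.1322 Da). The Python sketch below does this with standard monoisotopic residue masses. The charge states (2+ for m/z 993 and 931, 3+ for m/z 662) are inferred from the arithmetic rather than stated in the text, and the function name is hypothetical.

```python
# Standard monoisotopic amino acid residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056      # H2O added for the peptide termini (Da)
PROTON = 1.00728      # mass of a proton (Da)
HEXHEXNAC = 365.1322  # HexHexNAc modification mass quoted in the text (Da)


def glycopeptide_mz(sequence: str, n_hexhexnac: int, charge: int) -> float:
    """m/z of [M + zH]z+ for a peptide carrying n_hexhexnac HexHexNAc groups."""
    neutral = sum(MONO[aa] for aa in sequence) + WATER + n_hexhexnac * HEXHEXNAC
    return (neutral + charge * PROTON) / charge


apoe_c_term_2plus = glycopeptide_mz("VQAAVGTSAAPVPSDNH", 1, 2)
apoe_c_term_3plus = glycopeptide_mz("VQAAVGTSAAPVPSDNH", 1, 3)
apoe_210_2plus = glycopeptide_mz("AATVGSLAGQPLQER", 1, 2)
```

Truncating the three results to integers gives the nominal values 993, 662, and 931 annotated in the figures.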

Endothelin B Receptor-Like Protein 2 and YIPF3

For Endothelin B receptor-like protein 2 (ETBR2), we identified six glycosylation sites, all with core-1-like HexHexNAc-O- structures, present on three glycopeptides of the extracellular part of the protein (Figure S7A, Supporting Information). We found that the 22-VSGGAPLHLGR-32 tryptic peptide was glycosylated (Figure S7B, Supporting Information), which is different from the proposed signal sequence cleavage at Gly-25 (Figure S7A, Supporting Information; see entry ETBR2_HUMAN in the UniprotKB database) and indicates that O-glycosylation can indeed affect the cleavage of the signal peptide of glycoproteins. Two and three glycosylation sites were identified on the 70-PIHPAGLQPTKPLVATSPNPGK-91 peptide of ETBR2 (panels C and D, respectively, of Figure S7, Supporting Information), where Thr-85 was unglycosylated in the (HexHexNAc-O-)2-substituted peptide, supporting an initial glycosylation of Thr-79 within the P-S/T sequence and Ser-86 within the S/T-P-X-P sequence. Also, the 104-GNLTGAPGQR-113 peptide from ETBR2 was found to be glycosylated (Figure S7F, Supporting Information), and interestingly, Asn-105 had been changed to Asp-105. Because this Asp residue lies within a former Asn-X-Ser/Thr N-glycosylation motif, this indicates that an N-glycan at Asn-105 was hydrolyzed during the PNGase F treatment. A second example of O-glycosylation of Ser/Thr in the Asn-X-Ser/Thr consensus was demonstrated from the HexHexNAc-O-substituted 331-LPTTVLNATAK-341 peptide from YIPF3 where the Asn also had been converted to Asp (Figure 5). The presence of HexNAc-substituted y5, y6, and y7 and unglycosylated b6, b7, and b8 fragments (expanded in Figure 5B) demonstrated that Thr-339 was the O-glycosylation site.

Figure 4. Proline frequency and Weblogo probability plots of O-glycosylation sites. (A) Proline frequency (left) and Weblogo (right) for the 67 experimentally verified O-glycosylation sites in this study. (B) Proline frequency (left) and Weblogo (right) for 222 experimentally verified O-glycosylation sites from human proteins in the UniprotKB database and (C) Weblogo plot for glycopeptides containing the S/T-X-X-P glycosylation sequence and identified in this study.
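The fragment-based reasoning used to assign Thr-339 in the YIPF3 peptide can be sketched in a few lines of Python. The sketch assumes a single glycan on the peptide and that the HexNAc stays attached to the fragments that contain the site; the function name and its arguments are hypothetical and serve only to illustrate the logic.

```python
def candidate_sites(peptide, glyco_y=(), bare_b=(), start=1):
    """Narrow down the glycosylated Ser/Thr from fragment evidence.

    glyco_y: indices k of y-ions (last k residues) observed WITH the glycan.
    bare_b:  indices k of b-ions (first k residues) observed WITHOUT it.
    start:   protein residue number of the first peptide residue.
    Assumes exactly one glycan on the peptide.
    """
    n = len(peptide)
    # Begin with every Ser/Thr (1-based positions within the peptide).
    candidates = {i for i, aa in enumerate(peptide, 1) if aa in "ST"}
    for k in glyco_y:
        # A glycosylated y_k places the site within the last k residues.
        candidates &= set(range(n - k + 1, n + 1))
    for k in bare_b:
        # An unglycosylated b_k excludes the first k residues.
        candidates -= set(range(1, k + 1))
    return [(peptide[i - 1], start + i - 1) for i in sorted(candidates)]


if __name__ == "__main__":
    # YIPF3 peptide 331-341 as observed after PNGase F (Asn-337 read as Asp).
    print(candidate_sites("LPTTVLDATAK", glyco_y=(5, 6, 7),
                          bare_b=(6, 7, 8), start=331))
```

Without fragment evidence the peptide has three candidate Thr residues (333, 334, and 339); the glycosylated y5 ion alone already restricts the site to the C-terminal DATAK stretch, which contains only Thr-339.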

Figure 5. CID−MS2 and −MS3 of HexHexNAc-O-substituted LPTTVLDATAK from protein YIPF3. (A) CID−MS2 at the MS1 precursor (m/z 747.9). (B) CID−MS3 of the peptide + HexNAc ion (m/z 666.8 in spectrum A) and an expansion showing the presence of HexNAc-substituted y5-y7 fragments. Note that the Asn (N) residue was identified as Asp (D).

Hemopexin

Hemopexin (HEMO) is both N- and O-glycosylated,43 and HexHexNAc-O-substituted 24-TPLPPTSAHGNVAEGETKPD-43 at m/z 795 and HexHexNAc-O-substituted 24-TPLPPTSAHGNVAEGETKPDPDVTER-49 at m/z 771, where Thr-24 is the O-glycosylation site, were prominent in the LC−MS/MS spectra of CSF samples even without treatment with PNGase F (Figure 1C). The CID−MS2/MS3 and ECD spectra of HexHexNAc-O- and (HexHexNAc-O-)2-substituted Hemopexin peptides are shown for entry HEMO in the Additional Spectra section of the Supporting Information. We initially believed that these O-glycopeptides would be major ions also in the LC−MS/MS spectra of PNGase F treated CSF, but quite surprisingly, there was no trace of them in the PNGase F treated samples (Figure 1B). Although this was a reproducible result limited to Hemopexin, the explanation for this finding is still unclear.

DISCUSSION

In this study, we have added the use of PNGase F treatment to remove N-glycans prior to our sialic acid capture-and-release protocol to selectively characterize O-glycopeptides originating from CSF glycoproteins. This pretreatment was reproducibly successful and made it possible to identify a larger number of O-glycosylations because there was a reduced analytical interference from N-glycopeptides in the LC−MS/MS spectra (Figure 1). As N-glycans are hydrolyzed by the PNGase F treatment, formerly N-glycosylated Asn residues were changed to Asp residues (+0.9840 Da), which was introduced as an allowed modification in the Mascot searches to facilitate the possible identification of O-glycosylation sites also on previously N-glycosylated tryptic peptides. Two such O-glycopeptides were identified, 331-LPTTVLDATAK-341 from YIPF3, which was glycosylated at Thr-339 (Figure 5) and contained Asp-337 instead of Asn-337, and 104-GDLTGAPGQR-113 from ETBR2 containing an Asn-105 to Asp-105 change (Figure S7F, Supporting Information). Interestingly, in both cases, the Thr residue of the N-glycosylation Asn-X-Ser/Thr consensus motif was O-glycosylated, demonstrating that a preformed N-glycan structure does not necessarily block the ppGalNAc-T from interaction with its substrate. Simultaneous N- and O-glycosylations of the Asn-X-Ser/Thr motif have previously been described.44
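To make the Asn-to-Asp bookkeeping concrete, the Python sketch below locates Asn-X-Ser/Thr sequons in a peptide and reports the residue numbers of the Asn and of the Ser/Thr that could additionally carry an O-glycan. It also derives the +0.9840 Da shift from standard monoisotopic residue masses. The exclusion of Pro at position X is the commonly used form of the sequon rule and is an assumption not stated in this article; the function name is hypothetical.

```python
import re

ASN_RESIDUE = 114.04293  # standard monoisotopic residue mass of Asn (Da)
ASP_RESIDUE = 115.02694  # standard monoisotopic residue mass of Asp (Da)
DEAMIDATION_SHIFT = ASP_RESIDUE - ASN_RESIDUE  # Asn -> Asp after PNGase F


def sequons(peptide, start=1):
    """Return (Asn position, Ser/Thr position) for each N-X-S/T sequon.

    Positions are protein residue numbers, with `start` being the number
    of the first peptide residue. X = Pro is excluded (assumed rule).
    """
    hits = []
    # A lookahead is used so that overlapping sequons are all reported.
    for match in re.finditer(r"(?=N[^P][ST])", peptide):
        i = match.start()
        hits.append((start + i, start + i + 2))
    return hits


if __name__ == "__main__":
    print(f"Asn -> Asp shift: {DEAMIDATION_SHIFT:+.4f} Da")
    print("YIPF3 331-341:", sequons("LPTTVLNATAK", start=331))
    print("ETBR2 104-113:", sequons("GNLTGAPGQR", start=104))
```

For the two peptides discussed above, the sketch returns Asn-337/Thr-339 for YIPF3 and Asn-105/Thr-107 for ETBR2, using the unmodified (Asn) sequences as input.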

Automated strategies for the structural characterization of O-glycopeptides are known to be demanding.29 Some protocols are available for glycan fragmentation analysis of glycopeptides with already known peptide sequence(s),45,46 but automated protocols aimed at analyzing both the glycan structure and the peptide sequence are scarce.29 The predominance of HexHexNAc-O-substituted peptides in our study made it efficient to design an automated Mascot search protocol to identify core 1 substituted O-glycosylation sites using a multistage CID−MS2/MS3 approach. Conveniently, the HexHexNAc-O- structure is predominantly fragmented during the MS2 step generating the peptide and the peptide + HexNAc fragments as major ion peaks. Subsequent MS3 of the peptide ion generates peptide backbone fragmentation into the b- and y-ion series. Thus, by introducing HexHexNAc (+365.1322 Da) as a variable modification of Ser/Thr/Tyr in the Mascot search, which simultaneously allowed for neutral loss of HexHexNAc, the high accuracy measured glycopeptide mass was used as the precursor for the CID−MS3 spectrum of the peptide ion (Figure 2). The use of a similar strategy for assignment of high-accuracy MS1 precursor masses for subsequent MS2 and MS3 of phosphopeptides has been shown to increase the number of identified peptides.47 CID−MS2 of the HexHexNAc-O-substituted peptide ion and MS3 of the peptide + HexNAc ion often resulted in peptide backbone fragmentation of the remaining glycopeptide, which was used to assign the glycosylated Ser/Thr site within peptides containing several Ser/Thr residues (Figure 2 and Figure S2, Supporting Information). The automated Mascot search protocol could also be expanded to search for more complex glycans because a core-2-like Hex(HexHexNAc)HexNAc-O- structure on Thr-212 of 210-AATVGSLAGQPLQER-224 from APOE was also identified (Figure S4, Supporting Information), which is in accordance with a previous glycoproteomics study of APOE.40

Occasionally, we also performed manual analysis of CID−MS2/MS3 spectra to further characterize O-glycosylation sites (Figures S3 and S6, Supporting Information).

To correctly pinpoint the attachment site(s) of the O-glycopeptides, we used ECD and ETD on FTICR and Orbitrap instruments, respectively. In total, we have successfully identified 106 O-glycosylation sites from CSF proteins and experimentally verified the exact attachment site for 67 of these (Table 1). To check for analytical reproducibility, we selected

the same instrument (Table S2, Supporting Information) and one O-glycopeptide, which was automatically identified in only one of the six samples (Figure S5, Supporting Information). Based on Mascot scores, retention times, and the presence of accurate MS1 peaks in all of the LC−MS/MS runs, the 20 glycopeptides were reproducibly found in all six CSF samples, thus showing analytical reproducibility and similarity with respect to O-glycosylation pattern between individuals. We were unable to identify any Tyr O-glycosylations in our PNGase F treated CSF samples, indicating that the recently described HexNAc-O-Tyr modifications are relatively unusual.2,3

The sialic acid capture-and-release protocol is very specific for the enrichment of formerly sialylated glycopeptides, and it is important to note that nonsialylated glycoproteins will not be enriched. It is thus not possible to assay glycosylation sites and glycan structure of nonsialylated O-glycans using this methodology. A second drawback of the protocol, which is common for all protocols where glycopeptides are purified from their unglycosylated peptide counterparts, is that important information regarding site occupancy, that is, the relative distribution of glycosylation versus unglycosylation of peptides, cannot be addressed in a quantitative manner. Another possible limitation of the protocol could be that the presence of O-glycans in the vicinity of Lys/Arg residues might block the access of trypsin to cleave the glycoproteins while attached to the hydrazide beads, and thus some glycopeptides could be missed in the LC−MS/MS analysis. This would be particularly valid for mucins containing highly O-glycosylated regions, and some highly glycosylated O-glycopeptides may be too large and complex for the present LC−MS/MS and/or automated Mascot analysis and will thus not be identified. However, only four of the 84 identified O-glycopeptides of Table 1 contained an internal (i.e., not present at the glycopeptide N- or C-terminus) missed trypsin cleavage site, and thus O-glycans do not seem to block trypsin to any large extent from cleaving reduced/alkylated nonmucin O-glycoproteins immobilized onto the beads.

Steentoft et al. recently identified more than 350 O-glycosylation sites from five different human cell lines, of which mucin-16 contributed with about 100 sites.3 Interestingly, only twelve of their reported O-glycosylation sites are in common with this study (Table 1), four of which were from APP/A4 and four from APOE. We recently reported the identification of 57 O-glycosylation sites from human urine proteins using the sialic acid capture-and-release strategy,34 and 15 of those glycosylation sites were in common with this study. Thus, a combination of methods and sample sources is needed to accomplish comprehensive O-glycoproteomic mapping of proteins in relevant cells and clinical samples.

A few of the O-glycopeptides reported here are most likely peptide fragments, that is, neuropeptides that are released into the CSF. For instance, endogenous neuropeptides containing the three different glycopeptide stretches from ProSAAS that we present here are generated by convertase cleavage of the proprotein (see entry PCSK1_HUMAN in the UniprotKB database) and were also identified in a neuropeptidomics study of human chromaffin secretory vesicles.48 More importantly, the same ProSAAS neuropeptides were identified and found to be


We observed predominantly O-glycosylations on Ser/Thr residues (position n), which had Pro residues at the n − 1, n + 1, and/or n + 3 positions (P-S/T, S/T-P, and S/T-X-X-P, respectively). This selective glycosylation-enhancing effect of Pro has been described previously based on data analysis of reported O-glycosylation sites.51−54 Such glycosylation motifs have also been demonstrated for ppGalNAc-T1 and -T2 toward S/T-P and S/T-X-X-P and for ppGalNAc-T2 toward P-S/T on model peptide libraries,21 and ppGalNAc-T3, -T5, and -T12 also exhibit similar Pro specificities.22 In addition, model peptides containing T-P-A-P have been identified to be prone to O-glycosylation by ppGalNAc-T1 (ref 23) and brain-specific ppGalNAc-T13 (ref 41), a glycosylation sequence that was also identified in this study (Figure 4C). Any of these ppGalNAc-Ts are thus likely candidates for performing the O-glycosylations observed for CSF O-glycoproteins. Our study gives support to similar sequence-specific interactions between the ppGalNAc-Ts and their substrates irrespective of whether they are model peptides or natural proteins occurring in vivo. For 41 of the 106 identified O-glycosylation sites, we were not able to pinpoint the actual glycosylated Ser/Thr residue using CID, ECD, or ETD (Table 1). Many of these O-glycopeptides indeed contained P-S/T, S/T-P, and S/T-X-X-P sequences but could nevertheless not be confidently assigned because of a lack of unequivocal MS/MS data.

In conclusion, by removing the N-glycans from human CSF samples by PNGase F in a pretreatment step, it was possible to selectively enrich tryptic O-glycopeptides using a sialic acid capture-and-release protocol. The core-1-like HexHexNAc-O- structure was vastly dominant, which facilitated the use of an automated Mascot search protocol for identification of the O-glycosylation sites. By using this methodology, we were able to expand our list of O-glycosylation sites of CSF glycoproteins by a factor of 3. We believe that this strategy should be useful for other clinical subproteomes as well, particularly those where complex N-glycosylations are quantitatively dominating, such as human serum samples.
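The Pro positional preferences discussed in this section (P-S/T, S/T-P, and S/T-X-X-P) can be checked for any peptide with a short Python sketch such as the one below. It only reports where Pro occurs relative to each Ser/Thr; the presence of Pro is a preference described in the text, not a predictor of glycosylation, and the function name is hypothetical.

```python
def pro_context(peptide, start=1):
    """Map each Ser/Thr to the offsets (-1, +1, +3) at which Pro is found.

    Keys combine the residue letter and its protein residue number,
    with `start` being the number of the first peptide residue.
    """
    context = {}
    for i, aa in enumerate(peptide):
        if aa not in "ST":
            continue
        offsets = [
            off for off in (-1, 1, 3)
            if 0 <= i + off < len(peptide) and peptide[i + off] == "P"
        ]
        context[f"{aa}{start + i}"] = offsets
    return context


if __name__ == "__main__":
    # ETBR2 peptide 70-91 described in the Results.
    for site, offsets in pro_context("PIHPAGLQPTKPLVATSPNPGK", start=70).items():
        print(site, offsets)
```

For the ETBR2 peptide, Thr-79 has Pro at n − 1 and Ser-86 has Pro at n + 1 and n + 3, whereas Thr-85, which was unglycosylated in the (HexHexNAc-O-)2-substituted peptide, has no Pro at any of the three positions.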

ASSOCIATED CONTENT

* Supporting Information

Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected] Tel.: +46 31 342 2174. Fax: +46 31 82 84 58.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

We thank Prof. Henrik Zetterberg and Prof. Kaj Blennow at the Neurochemistry Laboratory, Sahlgrenska University Hospital, for access to CSF samples. Expert MS assistance by Dr. Carina Sihlbom and Sjoerd van der Post at the Proteomics Core Facility, The Sahlgrenska Academy, is acknowledged. This study was supported by grants from the Swedish Research Council (8266 to G.L.), Alzheimer Foundation, and Magn. Bergwall Foundation and governmental grants to the Sahlgrenska University Hospital. The Inga-Britt and Arne Lundberg Research Foundation and the Knut and Alice Wallenberg Foundation are acknowledged for MS instrumentation funding.

REFERENCES

(1) Varki, A.; Cummings, R.; Esko, J.; Freeze, H.; Stanley, P.; Bertozzi, C. R.; Hart, G.; Etzler, M. E. Essentials of Glycobiology; Cold Spring Harbor Laboratory Press: New York, 2009.
(2) Halim, A.; Brinkmalm, G.; Ruetschi, U.; Westman-Brinkmalm, A.; Portelius, E.; Zetterberg, H.; Blennow, K.; Larson, G.; Nilsson, J. Site-specific characterization of threonine, serine, and tyrosine glycosylations of amyloid precursor protein/amyloid β-peptides in human cerebrospinal fluid. Proc. Natl. Acad. Sci. U.S.A. 2011, 108 (29), 11848−11853.
(3) Steentoft, C.; Vakhrushev, S. Y.; Vester-Christensen, M. B.; Schjoldager, K. T.-B. G.; Kong, Y.; Bennett, E. P.; Mandel, U.; Wandall, H.; Levery, S. B.; Clausen, H. Mining the O-glycoproteome using zinc-finger nuclease−glycoengineered SimpleCell lines. Nat. Methods 2011, 8 (11), 977−982.
(4) Varki, A. Uniquely human evolution of sialic acid genetics and biology. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 8939−8946.
(5) Schauer, R. Sialic acids as regulators of molecular and cellular interactions. Curr. Opin. Struct. Biol. 2009, 19 (5), 507−514.
(6) Liu, Y.-C.; Yen, H.-Y.; Chen, C.-Y.; Chen, C.-H.; Cheng, P.-F.; Juan, Y.-H.; Chen, C.-H.; Khoo, K.-H.; Yu, C.-J.; Yang, P.-C.; Hsu, T.-L.; Wong, C.-H. Sialylation and fucosylation of epidermal growth factor receptor suppress its dimerization and activation in lung cancer cells. Proc. Natl. Acad. Sci. U.S.A. 2011, 108 (28), 11332−11337.
(7) Sørensen, A. L.; Rumjantseva, V.; Nayeb-Hashemi, S.; Clausen, H.; Hartwig, J. H.; Wandall, H. H.; Hoffmeister, K. M. Role of sialic acid for platelet life span: Exposure of β-galactose results in the rapid clearance of platelets from the circulation by asialoglycoprotein receptor-expressing liver macrophages and hepatocytes. Blood 2009, 114 (8), 1645−1654.
(8) Pang, P.-C.; Chiu, P. C. N.; Lee, C.-L.; Chang, L.-Y.; Panico, M.; Morris, H. R.; Haslam, S. M.; Khoo, K.-H.; Clark, G. F.; Yeung, W. S. B.; Dell, A. Human sperm binding is mediated by the sialyl-Lewisx oligosaccharide on the zona pellucida. Science 2011, 333 (6050), 1761−1764.
(9) Larsson, J. M. H.; Karlsson, H.; Sjovall, H.; Hansson, G. C. A complex, but uniform O-glycosylation of the human MUC2 mucin from colonic biopsies analyzed by nanoLC/MSn. Glycobiology 2009, 19 (7), 756−766.
(10) Sihlbom, C.; van Dijk, H. I.; Lidell, M. E.; Noll, T.; Hansson, G. C.; Backstrom, M. Localization of O-glycans in MUC1 glycoproteins using electron-capture dissociation fragmentation mass spectrometry. Glycobiology 2009, 19 (4), 375−381.
(11) Johansson, M. E. V.; Larsson, J. M. H.; Hansson, G. C. The two mucus layers of colon are organized by the MUC2 mucin, whereas the outer layer is a legislator of host-microbial interactions. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 4659−4665.
(12) Nilsson, J.; Ruetschi, U.; Halim, A.; Hesse, C.; Carlsohn, E.; Brinkmalm, G.; Larson, G. Enrichment of glycopeptides for glycan structure and attachment site identification. Nat. Methods 2009, 6 (11), 809−811.
(13) Darula, Z.; Medzihradszky, K. F. Affinity enrichment and characterization of mucin core-1 type glycopeptides from bovine serum. Mol. Cell. Proteomics 2009, 8 (11), 2515−2526.
(14) Balog, C.; Mayboroda, O.; Wuhrer, M. Mass spectrometric identification of aberrantly glycosylated human apolipoprotein C-III peptides in urine from Schistosoma mansoni-infected individuals. Mol. Cell. Proteomics 2010, 9 (4), 667−681.
(15) Sun, W.; Parry, S.; Ubhayasekera, W.; Engstrom, Å.; Dell, A.; Schedin-Weiss, S. Further insight into the roles of the glycans attached to human blood protein C inhibitor. Biochem. Biophys. Res. Commun. 2010, 403 (2), 198−202.
(16) Semenov, A. G.; Postnikov, A. B.; Tamm, N. N.; Seferian, K. R.; Karpova, N. S.; Bloshchitsyna, M. N.; Koshkina, E. V.; Krasnoselsky, M. I.; Serebryanaya, D. V.; Katrukha, A. G. Processing of Pro-Brain Natriuretic Peptide Is Suppressed by O-Glycosylation in the Region Close to the Cleavage Site. Clin. Chem. 2009, 55 (3), 489−498.
(17) Schjoldager, K. T.-B. G.; Vester-Christensen, M. B.; Bennett, E. P.; Levery, S. B.; Schwientek, T.; Yin, W.; Blixt, O.; Clausen, H. O-glycosylation modulates proprotein convertase activation of angiopoietin-like protein 3: Possible role of polypeptide GalNAc-transferase-2 in regulation of concentrations of plasma lipids. J. Biol. Chem. 2010, 285 (47), 36293−36303.
(18) Maryon, E. B.; Zhang, J.; Jellison, J. W.; Kaplan, J. H. Human copper transporter 1 lacking O-linked glycosylation is proteolytically cleaved in a Rab9-positive endosomal compartment. J. Biol. Chem. 2009, 284 (41), 28104−28114.
(19) Tabak, L. A. The role of mucin-type O-glycans in eukaryotic development. Sem. Cell Dev. Biol. 2010, 21 (6), 616−621.
(20) ten Hagen, K. G.; Fritz, T. A.; Tabak, L. A. All in the family: the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases. Glycobiology 2003, 13 (1), 1R−16R.
(21) Gerken, T. A.; Raman, J.; Fritz, T. A.; Jamison, O. Identification of common and unique peptide substrate preferences for the UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferases T1 and T2 derived from oriented random peptide substrates. J. Biol. Chem. 2006, 281 (43), 32403−32416.
(22) Gerken, T. A.; Jamison, O.; Perrine, C. L.; Collette, J. C.; Moinova, H.; Ravi, L.; Markowitz, S. D.; Shen, W.; Patel, H.; Tabak, L. A. Emerging paradigms for the initiation of mucin-type protein O-glycosylation by the polypeptide GalNAc transferase family of glycosyltransferases. J. Biol. Chem. 2011, 286 (16), 14493−14507.
(23) Yoshida, A.; Suzuki, M.; Ikenaga, H.; Takeuchi, M. Discovery of the shortest sequence motif for high level mucin-type O-glycosylation. J. Biol. Chem. 1997, 272 (27), 16884−16888.
(24) Fritz, T. A.; Hurley, J. H.; Trinh, L.-B.; Shiloach, J.; Tabak, L. A. The beginnings of mucin biosynthesis: The crystal structure of UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferase-T1. Proc. Natl. Acad. Sci. U.S.A. 2004, 101 (43), 15307−15312.
(25) Fritz, T. A.; Raman, J.; Tabak, L. A. Dynamic association between the catalytic and lectin domains of human UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferase-2. J. Biol. Chem. 2006, 281 (13), 8613−8619.
(26) Raman, J.; Fritz, T. A.; Gerken, T. A.; Jamison, O.; Live, D.; Liu, M.; Tabak, L. A. The catalytic and lectin domains of UDP-GalNAc:polypeptide α-N-Acetylgalactosaminyltransferase function in concert to direct glycosylation site selection. J. Biol. Chem. 2008, 283 (34), 22942−22951.
(27) Wandall, H. H.; Irazoqui, F.; Tarp, M. A.; Bennett, E. P.; Mandel, U.; Takeuchi, H.; Kato, K.; Irimura, T.; Suryanarayanan, G.; Hollingsworth, M. A.; Clausen, H. The lectin domains of polypeptide GalNAc-transferases exhibit carbohydrate-binding specificity for GalNAc: Lectin binding to GalNAc-glycopeptide substrates is required for high density GalNAc-O-glycosylation. Glycobiology 2007, 17 (4), 374−387.
(28) Perrine, C. L.; Ganguli, A.; Wu, P.; Bertozzi, C. R.; Fritz, T. A.; Raman, J.; Tabak, L. A.; Gerken, T. A. Glycopeptide-preferring polypeptide GalNAc transferase 10 (ppGalNAc T10), involved in mucin-type O-glycosylation, has a unique GalNAc-O-Ser/Thr-binding site in its catalytic domain not found in ppGalNAc T1 or T2. J. Biol. Chem. 2009, 284 (30), 20387−20397.
(29) Jensen, P. H.; Kolarich, D.; Packer, N. H. Mucin-type O-glycosylation: Putting the pieces together. FEBS J. 2010, 277 (1), 81−94.
(30) Darula, Z.; Chalkley, R. J.; Baker, P.; Burlingame, A. L.; Medzihradszky, K. F. Mass spectrometric analysis, automated identification and complete annotation of O-linked glycopeptides. Eur. J. Mass Spectrom. 2010, 16 (3), 421−428.
(31) Darula, Z.; Sherman, J.; Medzihradszky, K. F. How to dig deeper? Improved enrichment methods for mucin core-1 type glycopeptides. Mol. Cell. Proteomics 2012, 11, O111−016774.
(32) Palmisano, G.; Lendal, S. E.; Engholm-Keller, K.; Leth-Larsen, R.; Parker, B. L.; Larsen, M. R. Selective enrichment of sialic acid-containing glycopeptides using titanium dioxide chromatography with analysis by HILIC and mass spectrometry. Nat. Protoc. 2010, 5 (12), 1974−1982.
(33) Zhang, H.; Li, X.-j.; Martin, D. B.; Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003, 21 (6), 660−666.
(34) Halim, A.; Nilsson, J.; Ruetschi, U.; Hesse, C.; Larson, G. Human urinary glycoproteomics; attachment site specific analysis of N- and O-linked glycosylations by CID and ECD. Mol. Cell. Proteomics 2012, 11, M111−013649.
(35) Carlsohn, E.; Nystrom, J.; Karlsson, H.; Svennerholm, A.-M.; Nilsson, C. L. Characterization of the outer membrane protein profile from disease-related Helicobacter pylori isolates by subcellular fractionation and nano-LC FT-ICR MS analysis. J. Proteome Res. 2006, 5 (11), 3197−3204.
(36) Olsen, J. V. Parts per Million Mass Accuracy on an Orbitrap Mass Spectrometer via Lock Mass Injection into a C-trap. Mol. Cell. Proteomics 2005, 4 (12), 2010−2021.
(37) Crooks, G. E.; Hon, G.; Chandonia, J.-M.; Brenner, S. E. WebLogo: A sequence logo generator. Genome Res. 2004, 14 (6), 1188−1190.
(38) Breci, L. A.; Tabb, D. L.; Yates, J. R.; Wysocki, V. H. Cleavage N-terminal to proline: Analysis of a database of peptide tandem mass spectra. Anal. Chem. 2003, 75 (9), 1963−1971.
(39) Cooper, H. J.; Hakansson, K.; Marshall, A. G. The role of electron capture dissociation in biomolecular analysis. Mass Spectrom. Rev. 2005, 24 (2), 201−222.
(40) Lee, Y.; Kockx, M.; Raftery, M. J.; Jessup, W.; Griffith, R.; Kritharides, L. Glycosylation and sialylation of macrophage-derived human apolipoprotein E analyzed by SDS-PAGE and mass spectrometry: Evidence for a novel site of glycosylation on Ser290. Mol. Cell. Proteomics 2010, 9 (9), 1968−1981.
(41) Zhang, Y.; Iwasaki, H.; Wang, H.; Kudo, T.; Kalka, T.; Hennet, T.; Kubota, T.; Cheng, L.; Inaba, N.; Gotoh, M.; Togayachi, A.; Guo, J.; Hisatomi, H.; Nakajima, K.; Nishihara, S.; Nakamura, M.; Marth, J.; Narimatsu, H. Cloning and characterization of a new human UDP-N-acetyl-α-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase, designated pp-GalNAc-T13, that is specifically expressed in neurons and synthesizes GalNAc α-serine/threonine antigen. J. Biol. Chem. 2003, 278 (1), 573−584.
(42) Wernette-Hammond, M. E.; Lauer, S. J.; Corsini, A.; Walker, D.; Taylor, J. M.; Rall, S. C. Glycosylation of human apolipoprotein E. The carbohydrate attachment site is threonine 194. J. Biol. Chem. 1989, 264 (15), 9094−9101.
(43) Takahashi, N.; Takahashi, Y.; Putnam, F. W. Structure of human hemopexin: O-Glycosyl and N-glycosyl sites and unusual clustering of tryptophan residues. Proc. Natl. Acad. Sci. U.S.A. 1984, 81 (7), 2021−2025.
(44) Bock, S. C.; Skriver, K.; Nielsen, E.; Thøgersen, H. C.; Wiman, B.; Donaldson, V. H.; Eddy, R. L.; Marrinan, J.; Radziejewska, E.; Huber, R. Human C1 inhibitor: Primary structure, cDNA cloning, and chromosomal localization. Biochemistry 1986, 25 (15), 4292−4301.
(45) Deshpande, N.; Jensen, P. H.; Packer, N. H.; Kolarich, D. GlycoSpectrumScan: Fishing glycopeptides from MS spectra of protease digests of human colostrum sIgA. J. Proteome Res. 2010, 9 (2), 1063−1075.
(46) Cooper, C. A.; Gasteiger, E.; Packer, N. H. GlycoMod: A software tool for determining glycosylation compositions from mass spectrometric data. Proteomics 2001, 1 (2), 340−349.
(47) Timm, W.; Ozlu, N.; Steen, J. J.; Steen, H. Effect of high-accuracy precursor masses on phosphopeptide identification from MS3 spectra. Anal. Chem. 2010, 82 (10), 3977−3980.
(48) Gupta, N.; Bark, S. J.; Lu, W. D.; Taupenot, L.; O’Connor, D. T.; Pevzner, P.; Hook, V. Mass Spectrometry-Based Neuropeptidomics of Secretory Vesicles from Human Adrenal Medullary Pheochromocytoma Reveals Novel Peptide Products of Prohormone Processing. J. Proteome Res. 2010, 9, 5065−5075.
(49) Zougman, A.; Pilch, B.; Podtelejnikov, A.; Kiehntopf, M.; Schnabel, C.; Kumar, C.; Mann, M. Integrated analysis of the cerebrospinal fluid peptidome and proteome. J. Proteome Res. 2008, 7 (1), 386−399.
(50) Gram Schjoldager, K. T.-B.; Vester-Christensen, M. B.; Goth, C. K.; Petersen, T. N.; Brunak, S.; Bennett, E. P.; Levery, S. B.; Clausen, H. A Systematic Study of Site-specific GalNAc-type O-Glycosylation Modulating Proprotein Convertase Processing. J. Biol. Chem. 2011, 286 (46), 40122−40132.
(51) Wilson, I. B.; Gavel, Y.; von Heijne, G. Amino acid distributions around O-linked glycosylation sites. Biochem. J. 1991, 275 (Pt 2), 529−534.
(52) Elhammer, A. P.; Poorman, R. A.; Brown, E.; Maggiora, L. L.; Hoogerheide, J. G.; Kézdy, F. J. The specificity of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase as inferred from a database of in vivo substrates and from the in vitro glycosylation of proteins and peptides. J. Biol. Chem. 1993, 268 (14), 10029−10038.
(53) Gupta, R.; Birch, H.; Rapacki, K.; Brunak, S.; Hansen, J. E. O-GLYCBASE version 4.0: A revised database of O-glycosylated proteins. Nucleic Acids Res. 1999, 27 (1), 370−372.
(54) Thanka Christlet, T. H.; Veluraja, K. Database analysis of O-glycosylation sites in proteins. Biophys. J. 2001, 80 (2), 952−960.

dx.doi.org/10.1021/pr300963h | J. Proteome Res. 2013, 12, 573−584