Scheme for Ranking Potential HLA-A2 Binding Peptides Based on Independent Binding of Individual Peptide Side-Chains


    Kenneth C. Parker,* Maria A. Bednarek,+ and John E. Coligan*

    *Biological Resources Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892; and +Merck Sharp and Dohme Research Laboratories, Rahway, NJ 07065

    ABSTRACT. A method to predict the relative binding strengths of all possible nonapeptides to the MHC class I molecule HLA-A2 has been developed based on experimental peptide binding data. These data indicate that, for most peptides, each side-chain of the peptide contributes a certain amount to the stability of the HLA-A2 complex that is independent of the sequence of the peptide. To quantify these contributions, the binding data from a set of 154 peptides were combined together to generate a table containing 180 coefficients (20 amino acids × 9 positions), each of which represents the contribution of one particular amino acid residue at a specified position within the peptide to binding to HLA-A2. Eighty peptides formed stable HLA-A2 complexes, as assessed by measuring the rate of dissociation of β2m. The remaining 74 peptides formed complexes that had a half-life of β2m dissociation of less than 5 min at 37°C, or did not bind to HLA-A2, and were included because they could be used to constrain the values of some of the coefficients. The theoretical binding stability (calculated by multiplying together the corresponding coefficients) matched the experimental binding stability to within a factor of 5. The coefficients were then used to calculate the theoretical binding stability for all the previously identified self or antigenic nonamer peptides known to bind to HLA-A2. The binding stability for all other nonamer peptides that could be generated from the proteins from which these peptides were derived was also predicted. In every case, the previously described HLA-A2 binding peptides were ranked in the top 2% of all possible nonamers for each source protein. Therefore, most biologically relevant nonamer peptides should be identifiable using the table of coefficients. We conclude that the side-chains of most nonamer peptides to the first approximation bind independently of one another to the HLA-A2 molecule. Journal of Immunology, 1994, 152: 163.

    MHC class I molecules are normally expressed on the cell surface in a stable complex with any one of a large number of peptides generated upon proteolysis of intracellular proteins (1, 2). In theory, each allelic variant of a class I MHC molecule selects these peptides based on the complementary structure of the peptide and the polymorphic pockets within the peptide-binding groove (3, 4). In the past, motifs specific to individual class I molecules have been determined by comparing the sequences of endogenous peptides isolated from purified class I molecules (5, 6), or by comparing the sequences of peptides that are known to bind to each class I molecule (7). In every case studied so far, at certain positions within the peptide, one aa or a small number of related aa are found to be nearly invariant; these aa are called dominant anchor residues (5) and appear to anchor the peptide into the class I peptide binding site by having a structure that is complementary to a pocket of the peptide-binding groove. For example, endogenous peptides isolated from purified HLA-A2 contain as dominant anchor residues Leu or Met at P2, and Val or Leu at P9 (5, 8), which are thought to bind in the B and F pockets, respectively (4). Some of the other positions within the endogenous peptides are also enriched for specific aa; these are defined as auxiliary anchor residues (5). In many cases, it is not clear to what degree the allele-specific peptide-binding motifs consisting of dominant and auxiliary anchor residues are a consequence of the requirements for peptide binding (9), or whether part of the motif is a consequence of Ag processing (5). These motifs have occasionally proven useful for localizing the optimal class I binding peptide within the sequence of a longer peptide that is known to contain a HLA-A2-restricted T cell epitope (10, 11). However, in other cases, no obvious motif is present within the antigenic peptide (12-15), possibly reflecting the limitations in our knowledge about what are acceptable variations in peptide-binding motifs.

    Received for publication June 25, 1993. Accepted for publication October 6, 1993.

    The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

    Address correspondence and reprint requests to Dr. Kenneth C. Parker, Building 4, Room 413, BRB, NIAID, NIH, Bethesda, MD 20892.

    Abbreviations used in this paper: aa, amino acid residue; GF, gel filtration; IBS, independent binding of side-chains; M1, influenza A matrix protein peptide 58-66 (GILGFVFTL); P1, position 1 in a peptide; PΩ, last position in a peptide; β2m, β2-microglobulin.

    Copyright © 1994 by The American Association of Immunologists 0022-1767/94/$02.00

    To determine to what degree the dominant anchor residues or the auxiliary anchor residues of the HLA-A2 motif are important for peptide binding, we have extended our previous peptide-binding study (16) to include a more extensive set of nonapeptides, many of which are interrelated by single aa substitutions. These new data indicate that at each position within the peptide, some aa are more favorable than others, regardless of the sequence of the rest of the peptide; therefore, it should theoretically be possible to improve upon predictions based solely on the anchor residues at P2 and P9 that are specific to HLA-A2 binding peptides. We present a table of 180 coefficients specific for each of the 20 aa at each of the 9 positions within the peptide. This table of coefficients incorporates all of the data that we have collected and can be used to predict the stability of HLA-A2 complexes containing any desired peptide. A mathematical combinatorial approach similar to that used to generate the HLA-A2 binding coefficients could also be applied to other macromolecular interactions, such as between oligonucleotides and DNA-binding proteins.

    Materials and Methods

    Peptides

    Peptides were synthesized and purified as described (17).

    HLA-A2 binding assays

    Native isoelectric focusing gel and GF peptide binding assays were used as described (17). These assays measure peptide binding indirectly by monitoring the ability of peptides to promote incorporation of 125I-labeled β2m into HLA-A2/β2m/peptide heterotrimeric complexes. Because the HLA H chain is refolded from inclusion bodies prepared from Escherichia coli, there are no endogenous peptides present to confound the data. Instead, 125I-β2m is incorporated into HLA complexes only when an appropriate synthetic peptide is present.

    Rate measurements

    The stability of HLA-A2 complexes containing specific peptides was assessed by measuring the β2m dissociation rate as described (17). Briefly, HLA complexes containing 125I-β2m were isolated by GF and β2m dissociation was measured by means of a second round of GF. To make accurate measurements of half-lives that were less than 20 min, it was necessary to collect the purified complexes from the first round of GF directly into microcentrifuge tubes that were then maintained at 0°C. Each aliquot to be used for a time point was separately incubated from between 1 and 30 min at 37°C before the second round of GF.

    Mathematical modeling

    To combine the data mathematically from a large number of experiments, a Fortran program was written that could optimize for the values of all 180 (20 aa × 9 positions) coefficients with any number of simultaneous equations. The rate data for a peptide whose sequence was GILGFVFTL would be entered as follows:

    $$\mathrm{ERR} = \ln(t_{1/2}) - \ln(\mathrm{G1} \times \mathrm{I2} \times \mathrm{L3} \times \mathrm{G4} \times \mathrm{F5} \times \mathrm{V6} \times \mathrm{F7} \times \mathrm{T8} \times \mathrm{L9} \times \mathrm{Constant})$$

    where ERR squared equals the error function to be minimized, t1/2 equals the measured half-life of dissociation in minutes at 37°C, G1 represents, for example, the coefficient for Gly at P1 to be determined (see Table I), and Constant equals the overall normalization constant. For the peptides that had half-lives of dissociation of


    Torrance, CA). In this program, the inputs are the table of coefficients to be used, a table containing peptide sequences (in single aa code) and experimental binding data. The program calculates the theoretical dissociation rate and the ratio of the experimental to theoretical.
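    The calculation performed by such a program can be illustrated with a minimal Python sketch. This is not the authors' software: the function names are hypothetical, and the coefficient dictionary holds only the three values quoted in the Results text (G1, K1, and L2) plus the normalization constant given with Table V. Every other aa/position pair falls back to the neutral value 1.0, so the numbers produced are illustrative and are not the published predictions.

    ```python
    from math import prod

    # Partial, illustrative table keyed by (amino acid, position).
    # Only values quoted in the text are included; all other entries
    # default to 1.0, so outputs are NOT the paper's predictions.
    COEFFS = {("G", 1): 0.578, ("K", 1): 3.465, ("L", 2): 103.183}
    NORMALIZATION = 0.151  # overall normalization constant (Table V footnote)


    def theoretical_half_life(peptide, coeffs=COEFFS, constant=NORMALIZATION):
        """Multiply the nine position-specific coefficients and the constant."""
        if len(peptide) != 9:
            raise ValueError("the scheme is defined for nonamers only")
        return constant * prod(
            coeffs.get((aa, pos), 1.0) for pos, aa in enumerate(peptide, start=1)
        )


    def fold_difference(experimental, theoretical):
        """Factor by which two half-lives differ, reported as a value >= 1."""
        return max(experimental / theoretical, theoretical / experimental)
    ```

    Reporting the ratio as a value of at least 1 is a choice made for this sketch so that over- and under-predictions are treated alike; with a complete coefficient table the same two functions would reproduce the "Theo" and "Ratio" style of columns.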

    Ranking of potential HLA-A2 binding peptides

    Software was written in the dBase III programming language that generates a table containing the sequence of every possible contiguous nine aa peptide starting from a protein's primary sequence. The peptides can then be indexed according to the theoretical β2m dissociation rate, calculated using the coefficients (see Table V).
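    The enumeration and indexing step can be sketched in Python as follows. The original software was written in dBase III and is not reproduced here; the function names, the toy scoring function, and the toy sequence are all assumptions made for illustration. The scoring function uses only three coefficients quoted in the text and treats everything else as 1.0.

    ```python
    # Toy coefficients (values quoted in the text); all others default to 1.0.
    TOY_COEFFS = {("G", 1): 0.578, ("K", 1): 3.465, ("L", 2): 103.183}


    def toy_score(peptide):
        """Illustrative score: product of the available coefficients."""
        score = 1.0
        for pos, aa in enumerate(peptide, start=1):
            score *= TOY_COEFFS.get((aa, pos), 1.0)
        return score


    def nonamers(protein):
        """Every contiguous nine-residue peptide with its 1-based start."""
        return [(i + 1, protein[i:i + 9]) for i in range(len(protein) - 8)]


    def rank_nonamers(protein, score=toy_score):
        """Return (rank, start, peptide, score), best score first.

        Ties are broken by start position so the order is deterministic.
        """
        scored = [(score(pep), start, pep) for start, pep in nonamers(protein)]
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [(rank, start, pep, s)
                for rank, (s, start, pep) in enumerate(scored, start=1)]


    # Hypothetical 20-residue sequence, used only to exercise the functions.
    TOY_PROTEIN = "AGLAAAAAAAKLAAAAAAAV"
    RANKED = rank_nonamers(TOY_PROTEIN)
    ```

    A protein of length n yields n - 8 overlapping nonamers, so the rank of a known peptide can be expressed as a fraction of that total.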

    Results

    β2m dissociation rate data for pairs of peptides that differ at single aa

    It has been found that the β2m dissociation rate from HLA-A2 complexes containing peptides with a Leu at P2 and a Val at P9 varies over at least four orders of magnitude, depending on the sequence of the rest of the peptide (17). Nonetheless, the Leu at P2 in both peptide GLGGGGGGV (t1/2 < 1 min) and peptide LLFGYPVYV (t1/2 4000 min) might stabilize the corresponding HLA-A2 complex to the same degree. One way to test this idea is to compare the β2m dissociation rates of pairs of peptides that differ in sequence by a single aa substitution. Table I contains data of this kind, listed according to the position within the peptide, and then alphabetically according to the single letter aa code of the aa to be compared (first column). For example, at the top of Table I, there are five pairs of peptides that differ only by substitution of Lys for Gly at P1. The ratio of β2m dissociation rates obtained for HLA-A2 complexes containing each pair of peptides is shown in Table I, second column. It can be seen that, in this instance, the ratio of the rate constants falls within a rather narrow range, from 2.5 to 6.1. Similarly, peptides that contain Leu at P2 bind between 100 and 200 times better than the corresponding Ala peptide, whereas Ile peptides bind about 15- to 25-fold worse than the corresponding Leu peptides. In most cases, of the 15 combinations of aa that are compared in Table I, the range in the ratio observed is considerably less than an order of magnitude. The variability in these ratios may be due to experimental error (especially in cases where one or both dissociation rates are

    24170>2 2>2 415

    113 73.451.4.88

    1043>loo>2 4>400>3 6

    >2 29.24 .01.12.31.2

    3.81.11.610.32.615.03.8

    7.5>1 81.8

    KALGFVFTLK ILG FVFTLK I LGKVFTLKLFGGGGGVKLFGGVGGVGLFGGVGGVGLLGFVFTLGLFGGGGGVGLFGGVGGVGLLGFVFTLGLFGGVGGVGLLGFVFTLLLFGY P V Y VGLFGGGGGVGLFGGVGGVLLFGYPVYVGIAGFVFTLILASLFAAVGLFGGGGGVGLFGGVGGVGLFGGGFGVGLFGGFGGVGLFGGGVGVGLFGGGVGVGLGFVFTLGLFGGGGGVGILGFVFTLKLGFVFTLGLFGGVGGVGMFGGVGGVKLFGGVGGVGILGFVFTLGLFGGGFGVGILGFVFTLGLFGGGFGVGLFGGGVGVGLLGGGVGVGILGFVFTL

    41 0250021 004707301 2 0150001 1 01 2 0150001 2 01500064001101 2 064002 4 03481 1 01 2 020001 8 01 1 08 3 0

    1 0 0 01 1 010002500

    1208 47301000200010002000

    8 3 09 01 0 0 0

    GALGFVFTLGLGFVFTLGILGKVFTLGLFGGGGGVGLFGGVGGVGAFGGVGGVGALGFVFTLGF G G G G G VGF G G V G G VGILGFVFTLGQFGGVGGVGQLGFVFTLLQFGY P V Y VG M F G G G G G VGMFGGVGGVLMFGY P V Y VGIEGFVFTLI ESLFAAVGLGGGGGGVGLGGGVGGVGLGGGGFGVGLGGGFGGVGLLGGGVGVGLLGGGVGVG ILKFVFTLGLFKGGGGVGILGKVFTLKILGKVFTLGLFGGGGGVGMFGGGGGVKLFGGGGGVGILGFVATLGLFGGGAGVGLGFVETLGLFGGGEGVGLFGGGGGVGLLGGGGGVGILGFVFTA

    8910004401101 2 0


    calculations, we chose to limit the number of variables by solving for only those coefficients that are most important for binding to HLA-A2 (82 coefficients in all), based on the peptides that we have studied. All other coefficients were assigned a neutral value of 1.0.

    We calculated these coefficients from data on the stability of HLA-A2 complexes containing individual peptides, and also took account of which peptides did not bind to HLA-A2. Peptides that form HLA-A2 complexes can be distinguished from nonbinding peptides by use of a GF assay in which the ability of each peptide to promote incorporation of 125I-β2m into HLA-A2 complexes is assessed. Average percentages of peptide-dependent β2m incorporation are listed in column 2 of Tables II, III, IV, and VI. The peptides that formed HLA-A2 complexes could be further subdivided according to how stable the complexes were once they were formed. Those peptides that formed complexes that had a half-life of dissociation of >5 min at 37°C (see column 3) are listed in Table II, whereas those peptides that formed less stable complexes are listed in Table III. Peptides that caused the incorporation of less than 10% of the labeled β2m into HLA-A2 complex when assayed at a concentration of 1 mM were considered to be nonbinders for HLA-A2, and are listed in Table IV. In the calculations, each peptide in Table II corresponded to one independent equation, in which the product of the appropriate coefficients and an overall normalization constant was set equal to the experimentally measured half-life (see Materials and Methods). In contrast, the peptides in Table III and Table IV corresponded to an inequality, in which the product of the appropriate coefficients and the overall normalization constant was set equal to less than a half-life of 5 min. For the purpose of discussion, each category of peptide was further divided by sequence into three categories: those based on a poly-Gly or poly-Ala backbone (Tables IIA, IIIA, and IVA), those related to the M1 peptide, which is an optimal HLA-A2 restricted antigenic influenza A matrix peptide (Table IIB and IVB), and other peptides (Table IIC, IIIB, and IVC), including HLA-A2-restricted antigenic peptides and essentially random viral peptides that we had synthesized.

    When the equations and inequalities corresponding to each of the peptides listed in Tables II, III, and IV were simultaneously solved, the coefficients listed in Table V were obtained. These coefficients were then used to calculate the theoretical half-life of dissociation of β2m listed in the column labeled theo in Tables II, III, and IV. It can be seen from the ratio listed in the fifth column of Table II that, in every case where accurate experimental half-lives are obtainable, the theoretical binding stabilities differ from the actual binding stabilities by less than a factor of 5.0, and the average ratio is a factor of 1.6. The overall fit of the data is shown graphically in Figure 1A, where the theoretical half-life of β2m dissociation is plotted vs the experimental half-life. Considering that these rate constants vary over at least four orders of magnitude, the fit is impressive. A ratio of coefficients similar to that listed in Table I, second column, can be calculated from the coefficients in Table V. For example, the ratio of coefficients from Table V for K1/G1 is 3.465/0.578 = 6.0, compared to a range of between 2.5 and 6.1 as shown in Table I, second column. This verifies that the coefficients shown in Table V faithfully reflect the contribution of each aa of a nonamer peptide for binding to HLA-A2. The fact that the majority of the half-lives is predicted well (Fig. 1A) supports the premise that side-chain/side-chain interactions are in the majority of cases of minimal importance in peptide binding.

    Two features of the coefficients listed in Table V are of particular relevance. First, the most important coefficients in Table V are those that are significantly different from 1.0. Second, the higher the frequency of the coefficient among the equations, the more accurately known the value of the coefficient. This is because the value of a coefficient has a greater impact on the overall error if the coefficient is present in a large number of equations, especially if the peptides that correspond to those equations form stable HLA-A2 complexes. The frequency of each aa/peptide position combination in peptides that form stable HLA-A2 complexes (Table II) and in peptides that do not form stable complexes (Tables III and IV) is listed in parentheses in Table V. It can be seen that the coefficient for L2 is both important and accurately known, because it has a high numerical value (103.183), and it appears in 33 different equations (corresponding to 33 peptides that form stable HLA-A2 complexes), and in 39 additional inequalities (corresponding to 39 different peptides that either form unstable HLA-A2 complexes, or do not bind at all). In contrast, the coefficient for K2 is more tentative, because only one peptide containing a Lys at P2 was tested that formed a stable HLA-A2 complex (footnote 3).
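    The simultaneous solution of equations and inequalities can be sketched as a least-squares fit in log space. The following Python example is an assumption-laden illustration, not the authors' Fortran program: it uses plain gradient descent, treats each inequality as a one-sided squared penalty that is active only when the predicted half-life exceeds the 5 min limit, and is exercised on synthetic half-lives generated from three values quoted in the text (K1, L2, and the normalization constant). The function name, learning rate, and step count are hypothetical choices suited to this toy problem.

    ```python
    from math import exp, log


    def fit_coefficients(equations, inequalities, free,
                         limit=5.0, lr=0.05, steps=5000):
        """Fit ln(coefficients) and ln(constant) by gradient descent.

        equations:    list of (peptide, measured half-life in min)
        inequalities: list of peptides assumed to have half-life < limit
        free:         (aa, position) keys to solve for; all others stay 1.0
        """
        theta = {key: 0.0 for key in free}  # ln(coefficient), start at 1.0
        theta["const"] = 0.0                # ln(normalization constant)

        def active(peptide):
            keys = [(aa, pos) for pos, aa in enumerate(peptide, start=1)
                    if (aa, pos) in theta]
            return keys + ["const"]

        for _ in range(steps):
            grad = dict.fromkeys(theta, 0.0)
            for peptide, t_half in equations:
                keys = active(peptide)
                err = log(t_half) - sum(theta[k] for k in keys)  # ERR
                for k in keys:
                    grad[k] -= 2.0 * err  # derivative of ERR squared
            for peptide in inequalities:
                keys = active(peptide)
                excess = sum(theta[k] for k in keys) - log(limit)
                if excess > 0.0:  # penalize only violated inequalities
                    for k in keys:
                        grad[k] += 2.0 * excess
            for k in theta:
                theta[k] -= lr * grad[k]

        constant = exp(theta.pop("const"))
        return {k: exp(v) for k, v in theta.items()}, constant


    # Synthetic half-lives (NOT experimental data), generated from
    # K1 = 3.465, L2 = 103.183 and constant = 0.151 as quoted in the text.
    TOY_EQUATIONS = [
        ("GLAAAAAAA", 0.151 * 103.183),
        ("KLAAAAAAA", 0.151 * 3.465 * 103.183),
        ("KAAAAAAAA", 0.151 * 3.465),
    ]
    TOY_INEQUALITIES = ["GAAAAAAAA"]
    FITTED, FITTED_CONSTANT = fit_coefficients(
        TOY_EQUATIONS, TOY_INEQUALITIES, free=[("K", 1), ("L", 2)])
    ```

    Working with logarithms turns the product of coefficients into a sum, so each equation is linear in the unknowns; coefficients that are not listed as free stay at the neutral value 1.0, mirroring the constraint applied to undetermined coefficients in Table V.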
    To get the most accurate values for the coefficients, we included in the calculations equations corresponding to as large a number of peptides as possible, because each additional peptide adds an additional constraint to the values of nine different coefficients. However, there were certain peptides that were excluded from the set (listed in Table VI) because their binding properties appeared to be inconsistent with the bulk of the peptides. In particular, three peptides stood out (Table VIA) that bound reasonably well to HLA-A2 and formed complexes in high yield. These three peptides, we believe, violate the assumption of independent binding of side-chains (see below). The remainder of the peptides were excluded because the data seemed in some way to be dubious (Table VIB). Some of these peptides promoted the incorporation of a rather small percentage of 125I-β2m into complexes. One possible explanation for this could be that a contaminant in the peptide preparation is the active

    3 Analysis of the stability of complexes containing five additional peptides that contain Lys at P2 indicates that the coefficient for Lys at P2 as listed in Table V is too high, and should be close to 1.0.


    Table II. Continued

    Sequence^a GF^b Expt. t1/2^c Theoretical t1/2^d Ratio^e Predicted t1/2^f Ratio^g

    C. Other nonapeptides that form stable HLA-A2 complexes (Continued)FIFPNYTIV 1.5 670I ISWDQS 60

    3.4I I A S I F A A V 6 9.8 1.770 1.1350I ESLFAAV 330 1 .040 1.170 108K L G E F F N Q M 90 4.7K I G E F Y N Q M

    35090 1.8 44 7.9K M F G Y P V Y V

    8020 1.5 5790400 3.4 58000 133.9

    I I F G Y P V Y V 70 6400 7800 1.2 9000I I W K G E G A V 80 1.4LMFGYPVYV 2 70 1.3 12090 7300.7 3100 3.14400I N F G Y P V Y V 40 411 1 o 350 2.4I Q F G Y P V Y V 90 8.6NIVAHTFKV 1800 750 2.5 2 70 6.98040 75N LV PMVATV 90 1.948030 1.1 360QMAAR 1.38010 49 2.3 1 3 8.5QMWQARITV 90 42060 1.3 1400RILQTGIHV 3.39000 1 o 280R I V N G S L A I 70 81 64 1.3 36 1.1S L Y N T V A T L 8020 2100 5.6

    2.33 70TINAWVKVV 70 1004 1.10WIYRETCNL 80 1.4

    Y I F K R M I D I 90 570 420 1.4 390 1.5

    360

    91 210600 2.3

    species, as has been found in other assay systems (19). In most cases, dissociation rates for the peptides listed in Table VIB were difficult to calculate because the majority of the counts incorporated into HLA-A2 complex dissociated rapidly, although a small percentage of the complex dissociated with the half-life listed. For ALFAAAAAY and GQLGFVFTK no internally consistent half-life could be obtained. In addition, all of these peptides have anchor residues at P2, P3, or P9 (marked in bold in Table VIB) that are infrequent or absent among peptides that form stable HLA-A2 complexes. For all of these reasons, we believe it would be wisest to exclude these peptides from the calculations for the time being, until more peptides are synthesized and tested that could help address these problems.

    Explanation of the values obtained for the coefficients

    The coefficients in Table V corroborate the data obtained previously from endogenous peptide sequence analyses (5, 8) that the most important anchor positions are at P2 and P9. In addition to Leu and Met at P2, Ile and Gln are also relatively well tolerated. Although Gln has not been previously reported to be an anchor residue at P2 for wild-type HLA-A2, it was recently found to be present at P2 in pooled endogenous peptides isolated from mutant HLA-A*0205 molecules (20), which differ from A*0201 molecules by a single substitution (F9Y) in the B pocket. Our data suggest that the most abundant anchor residues at P2 for this mutant are different from wild type HLA-A2 in a quantitative sense only. At P1, negatively charged residues are unfavorable, whereas Lys is favorable. This can most easily be explained by an ionic interaction with E63, which is known to be located near the N-terminus of the peptide-binding site (4). At P3, aromatic residues are favorable, and charged residues are most often unfavorable. A few exceptional peptides, notably ILDKKVEKV and ILKEPVHGV, can form stable HLA-A2 complexes despite the charged residue at P3, presumably by means of overriding favorable interactions with other peptide residues (a violation of the IBS condition). Most residues are equally well tolerated at P4; however, our data tentatively indicate that large hydrophobic residues like Phe are unfavorable. At P5-P7, aromatic residues seem to be favored, as at P3; however, KLFGFVFTV, which contains Phe at P3, P5, and P7, binds much less well than would be predicted (2,000 min vs 300,000 min predicted) if each of these positions contributed independently. Most likely, this is due to the limited space that is available within the peptide-binding groove to accommodate bulky side-chains (this would be a second violation of the IBS principle). At P8, Val is significantly less favorable than Ala, Glu, Lys, or Thr, at least in the context of the matrix peptide sequence (GILGFVFTL). This may indicate that the hydrophobic isopropyl group of Val cannot be accommodated as easily as hydrophilic, or smaller side-chains. At P9, Val and Leu are better than Met and Ile, and all other residues examined appear to be very much worse. The importance of the P9 position is exemplified by the data collected using peptides that belong to the paradigm GLFGGGFGX, because GLFGGGFGF, GLFGGGFGN, and GLFGGGFGS form complexes that are at least 1000-fold less stable than GLFGGGFGV. Moreover, most peptides that contain either Lys or Tyr at P9 do not bind appreciably, despite


    Table III. A. Poly-Gly and poly-Ala nonapeptides that form unstable HLA-A2 complexes

    Sequencea CFb t7 1 2 ~ Theod RatioeG A F G G V G G VG A F G G V G G YG E F G G V G G VG G F G G V G G VG I F G G G G G VG I G G F G G G LG I G G G G G G IG L D G G G G G VG L D G K G G G VG I D K K G G G VG L F G G G F G FG L F G G G F G GG L F G G G F G NG L F G G G F G SG L F G G G G G AG I F G G G G G IG L F G G G G G MG I F G G G G G TG I F G G G G G YG L G F G G G G VG L G G F G G G VG L G G G F G G VG L G G G G G F VG L G G G G G G YG L G G G V G G VG I L G G G G G VG I P G G G G G VG N F G G V G G VG S F G G V G G VG T F G G V G G V

    203020104020206010307050806050706050104060606010405 040102040

    3

    415131144

    253313

    1

    1.10.0024.00.104.40.1 20.01 62.97.47.44.94.94.94.99.24.90.0820.0830.0092.61.40.340.0010.0845 O0.585.06.4

    14

    15

    1.3

    1.57.41.14.91.54.96.61.13.1

    250.01.82.37.51 15.2

    4.3B . Other nonapeptides that form unstable HLA-A2 complexes

    A G N S A YY V 10 0.21G L F P G Q F A Y 10H I I I G V F M L 10I E S L F R A V 20K K K Y K L K H I

    2 5.0 2.710M L A S I D L K Y 20

    M L E R E L V R K 10 0.01 1

    4.85 O0.1 80.14

    a Sequence, in single-letter aa code.
    b Average % of β2m incorporation as assessed by gel filtration.
    c Experimentally measured half-life of β2m dissociation in min at 37°C. If no number is present, the half-life was difficult to measure, but is probably less than 5 min.
    d Theoretical half-life of β2m dissociation, calculated using coefficients in Table V.
    e Factor by which the theoretical half-life differs from the measured half-life.

    otherwise very favorable residues (e.g., GILGFVFTK, KLYEKVYTY; see Table IVB and IVC).

    Application of the binding coefficients to ranking of known antigenic and endogenous peptides

    It would be interesting to know if the known endogenously synthesized self and antigenic peptides are among the best HLA-A2 binding peptides. Theoretically, large numbers of peptides may be more capable of binding to HLA-A2, but might never be generated in vivo. To determine whether this is likely, the coefficients in Table V were used to rank all of the potential nonamers from each of the proteins for which a known antigenic or endogenous peptide has been identified. The parameters that describe the

    Table IV. A. Poly-Gly and poly-Ala nonapeptides that do not bind to HLA-A2

    Sequencea CFb Theo'A L A A A A A A KG D F G G V G G VG F F G G V G G VG H F G G V G G VG I F G G G G G AG I G G G F G G LG I G G G G F G LG L F G G G G G FG L G G G G G G IG L G G G G G G VG P F G G V G G VG R I G G G G G IG Y F G G V G G V

    1458932645556

    0.245.05 O5.00.900.0681 o0.080.1 70.345.00.0365 O

    B. M1 related nonapeptides that do not bind to HLA-A2E I L G F V F T K 0G I L G F V F T E

    5 O2 4.9G I I G F V F T K 4.0

    C . Other nonapeptides that do not b ind to HLA-A2D I Y R I F A E L 4 5.0E I K D T K E A I 3E I Y K R W I I L 7 4.00.01E L D A P N S H YE L K S K Y W A IE I K V K N L E IE L R S L Y N T VE L R S R Y W A I

    1 0.0591 5.12 1.12 5 O3 3.3

    E R YK D Q Q I 4 4.9G E I Y K R W I I 5 4.0G I P V G G N E KG M Q W N S T A FI I K Q K I A D LI R G S V A H KK I F I A G N S A

    5 0.1 54 0.0448 1.97 0.0201 5 O

    K L YK V Y T Y 3 5.0L G F V F T L T V 5.0I L S F I P S D F 5P I N P F V S H K

    0.0041R Y W A I R T R S

    3.72T P O D I N T M I

    0.0823 1.7

    a Sequence, in single letter aa code.
    b Average % of β2m incorporation, as assessed by gel filtration.
    c Theoretical half-life of β2m dissociation, calculated using coefficients in Table V.

    ranking of each peptide are shown in Table VII. Column 3 shows the number of overlapping nonamers that could be generated from each protein. Column 4 shows the theoretical half-life of β2m dissociation for the most stable nonamer. The next two columns list the rank of the peptide using the experimentally measured half-life of dissociation, followed by the measured half-life. Finally, the last two columns list the rank of the peptide when the peptide's theoretical half-life of dissociation is used, followed by the theoretical half-life. (Note that our current algorithm is capable of ranking nonamers only, although some longer peptides could form comparably stable complexes.) The first peptide, the influenza matrix peptide GILGFVFTL, was previously found to be a major target of all HLA-A2-restricted, influenza-specific CTL, both in humans (21) and in HLA-A2 transgenic mice (22). It ranks first among all possible nonamers from the matrix protein


    Table V. Coefficients used to calculate theoretical rate constantsaaa CoeffreqC aa CoeffreqaoeffreqA1c1DlE lF lG1H1I1K lLfM1N1P1Q1R1s1T1V lW lY1A2c 2D2E2F2G2H212K2L2M2N2P24 2R2s2T2v 2w 2Y2A3c 3D3E3F3G3H313K3L3M3N3P3R3s3T3v 3w 3Y3

    4 3

    1 ooo0.5970.0410.5781 ooo0.5780.0441.0003.4651.ooo1.ooo1.0001.0001 ooo1 ooo1 ooo1.0001 ooo1.0001 ooo1 .ooo0.5000.5002.8400.5000.5000.50010.15120.5243103.18357.9200.5420.50010.0060.5000.5006.0805.9190.5000.5001.0001.0000.7260.05411.3830.0881.0001.8490.0243.6851.0001 ooo1.2611.0000.0331.0001.ooo2.17312.9787.61 3

    A4c 4D4E4F4G4H414K4L4M 4N4P44 4R4s4T4v 4w 4Y4A5c 50 5E5F5G 5H515K5L5M5N5P5Q5R5s5T5v 5w 5Y5A6C6D6E6F6G6H616K6L6M 6N6P6R6S6T6V6W6

    Q6

    1 ooo1.0001.0001 ooo0.0271.0001 ooo0.0781 ooo0.6461 ooo1.0001,0001 .ooo1 .ooo1 ooo1.0001 ooo1 ooo1 ooo1 ooo1 ooo1.0000.7566.0440.8041 ooo1 ooo2.0851 ooo1 .ooo1.0001 .ooo1.0001.ooo1.ooo1.0001.0002.6808.0021 ooo1.0001 ooo3.2464.3691.0601.0001.0000.2391 ooo1.0001 ooo1 ooo1 ooo1 ooo1 ooo1.0002.5880.251

    A7c7D7E7F7G7H717K7L7M7N7P7R7s7T7v 7w 7Y7A8C8D8E8F8G8H818K8L8M8N8P84 8R8S8T8V8W8Y8A9c 9D9E9F9G9H919K9L9M9N9P9Q9R9s9T9v 9w 9Y9

    Q7

    1.ooo1.ooo1 ooo0.8206.3830.1051 ooo1 oo o0.6031.0001 ooo1 ooo1.ooo1 ooo0.2771.0001 oo o1.1205.9511.0001 ooo1 .ooo1.0001.0001 ooo1 ooo1.ooo1.0001 ooo1 .ooo1 .ooo1.0001 ooo1 ooo1.0001.ooo1 oo o0.2931.ooo1.0001 ooo0.0100.0100.0150.0090.0090.0100.5340.0152.3571.5010.0090.0100.0100.0100.0090.0094.8840.0100.0092.416 3.470 ~~1 I \-, .

    a aa using single letter code, followed by the position within the peptide.
    b Coefficient, calculated by solving simultaneously equations corresponding to each of the peptides in Tables II, III, and IV. Coefficients whose value equal exactly 1.000 were constrained to equal 1.0. No coefficient is known to better than two decimal places; many coefficients may be off by greater than a factor of 2.0. The value in this table is representative of the raw output from the Fortran program. At P2, coefficients were assigned a value of 0.500 if no peptides were studied that formed stable complexes with this aa/peptide position combination. At P9, coefficients were assigned a value of 0.010 if no peptides were studied that contained this aa/peptide position combination. Note that all undetermined coefficients in Table V have been assigned the value of 1.0, which corresponds to the coefficient for Ala at that same position. In making predictions of the stability of HLA-A2 complexes containing unknown peptides, one could substitute a coefficient with the corresponding coefficient of a chemically more similar aa. The overall normalization coefficient = 0.151.
    c First number: number of peptides that contain the aa at the position in question in Table II. Second number: number of peptides that contain the aa at the position in question in Tables III and IV.


    [Figure 1. Panel A: "Fit of experimental to theoretical t1/2 in minutes at 37°C", x-axis log(theoretical). Panel B: "Fit of experimental to predicted t1/2 in minutes at 37°C", x-axis log(predicted).]

    FIGURE 1. A, Comparison of theoretical half-life of β2m dissociation to the experimentally measured half-life. The data from the third column of Table II were plotted against the fourth column. The line indicates the position of a perfect fit. B, Comparison of the predicted half-life of β2m dissociation to the experimentally measured half-life. The data from the third column of Table II were plotted against the sixth column. The line indicates the position of a perfect fit.

    (Table VIIA) and third among all possible nonamers encoded by the influenza genome (data not shown). The HTLV-1-derived, HLA-A2-restricted peptide LLFGYPVYV (23) also ranks first from its source protein. The HIV polymerase-derived, HLA-A2-restricted, antigenic peptide ILKEPVHGV (11) "theoretically" ranks 45th of the 1007 possible nonamers in the pol protein, which would place it only in the top 5%. However, in the case of ILKEPVHGV, the experimental rank is much higher than the theoretical rank, because ILKEPVHGV binds much better than expected based on the coefficients in Table V. Notably, none of the other even higher-ranking HIV polymerase-encoded peptides are predicted to bind much more than twofold better (data not shown). The remaining three antigenic peptides, KLGEFYNQMM (24), FIAGNSAYEYV (23), and FLPSDFFPSV (25), are longer than nine aa, which is why no theoretical rank or half-life is listed. When the experimentally measured half-life of these peptides is compared against the theoretical half-lives of all possible nonamers from the source protein, each of these peptides ranks close to the top.

    Table VI. A. Nonapeptides that may violate the side-chain independence rule

    Sequence^a   CF^b   t1/2^c   Theo^d    Ratio^e
    KLFGFVFTV    60     2000     300,000   150
    ILDKKVEKV    50     2900     250       12
    ILKEPVHGV    8090 4.8 38

    B. Peptides that form HLA-A2 complexes that behave irregularly

    ALFAAAAAY 30 2
    GIGFGGGGL 20 200 0 200000
    GKFGGVGGV 100 22 3.7
    GLFGGGGGK 30 0
    EILGFVFTL' 10 85
    GIKGFVFTLR
    77060 2500
    9.1
    GQLGFVFTK 70 5
    ILGFVFTLT" 50 140 0
    KILGFVFTK 5 210 30 7.2
    KKLGFVFTL 30 750 9300 13
    KLFEKVYNY 20 9 8
    LRFGYPVYV 20 400 180 2.3
    1.2
    5 500

    a Sequence, in single letter aa code.
    b Average % of β2m incorporation, as assessed by gel filtration.
    c Experimentally measured half-life of β2m dissociation in min at 37°C.
    d Theoretical half-life of β2m dissociation, calculated using coefficients in Table V.
    e Factor by which the theoretical half-life differs from the measured half-life.
    f This peptide would have been placed in Table IIB if it were better able to form HLA-A2 complexes. The dissociation rate of complexes containing this peptide is consistent with the rest of the data, but it was not used to calculate the coefficients.
    g This peptide reproducibly fails to form complexes with the expected pI. Instead, the complex has the same charge as GILGFVFTL complexes. This peptide is likely to be contaminated with trace amounts of ILGFVFTL, which is known to form complexes with the measured stability (16).

    When the endogenous peptides are examined, we see that ILDKKVEKV, like ILKEPVHGV, binds much more tightly than expected using the coefficients in Table V. When its experimentally measured half-life is used for ranking purposes, it ranks at the top of the list. With the exception of LLDVPTAAV, the other endogenous peptides also rank in the top few percent of all possible nonamers from their source protein. Note that our estimates for the ranking of these remaining endogenous peptides are inherently less accurate because we have not measured the half-lives of complexes containing these peptides. The endogenous peptide that ranks the lowest, LLDVPTAAV, was derived from the leader peptide of IP30, and was isolated from a cell line with a mutation in Ag processing, so that presumably only peptides derived from the leader peptide were available for binding to HLA-A2 (26). We conclude that most antigenic peptides and most predominant self peptides are selected from among those peptides that can form the most stable class I complexes.
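    The ranking procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_score` is a hypothetical stand-in for the product of Table V coefficients, and the protein sequence and the measured half-life of 75 are invented. The rank of a value is taken to be one plus the number of nonamers from the same protein whose theoretical score exceeds it, which is one plausible reading of how the theoretical and experimental ranks were assigned.

```python
def nonamers(protein):
    """All overlapping 9-residue windows of a protein sequence."""
    return [protein[i:i + 9] for i in range(len(protein) - 8)]


def rank_among_nonamers(value, protein, score):
    """Return (rank, total); rank is 1 + the number of nonamers scoring above value."""
    scores = [score(p) for p in nonamers(protein)]
    return 1 + sum(1 for s in scores if s > value), len(scores)


def toy_score(peptide):
    # Hypothetical stand-in for the coefficient product:
    # rewards Leu at P2 and Val at P9, neutral otherwise.
    return (10.0 if peptide[1] == "L" else 1.0) * (5.0 if peptide[8] == "V" else 1.0)


protein = "MSAGLAAAAAAVKTEDQRST"  # invented 20-residue sequence (12 nonamers)

# Theoretical rank: compare the peptide's own score with all nonamer scores.
theoretical = rank_among_nonamers(toy_score("GLAAAAAAV"), protein, toy_score)

# Experimental rank: compare a measured half-life (invented here) with the same scores.
experimental = rank_among_nonamers(75.0, protein, toy_score)
```

    Passing the value to be ranked separately from the scoring function lets the same routine serve for both kinds of rank.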


    Table VII. Ranking of HLA-A2 antigenic and endogenous peptides using the coefficients

                                                                                     Experimental^f    Theoretical^g
    Protein^b                        Sequence^c     No. 9-mers^d   Highest t1/2^e    Rank    t1/2      Rank    t1/2
    A. Antigenic peptides
    Flu matrix (10, 21)              GILGFVFTL      244            800               1st     1000      1st     800
    HTLV-1 tax (23)                  LLFGYPVYV      350            8000              1st     4000      1st     8000
    HIV polymerase (11)              ILKEPVHGV      1007           400               8th     190       45th    10
    Influenza nucleoprotein (24)     KLGEFYNQMM     552            600               4th     190
    HCMV gB (23)                     FIAGNSAYEYV    889            2000              3rd     1000
    Hepatitis core Ag (25)           FLPSDFFPSV     175            400               1st     1500
    B. Endogenous peptides
    hsp 84 (16)                      ILDKKVEKV      715            900               1st     2800      19th    20
    ip30 (8)                         LLDVPTAAV      295            500                                 7th     60
    tis 21 (8)                       TLWVDPYEV      150            1000                                1st     1000
    helicase (8)                     YLLPAIVHI      546            400                                 8th     30
    pp61 (8)                         SLLPAIVEL      581            900                                 6th     200
    phosphorylase regulatory A (8)   SLLPAIVEL      581            900                                 6th     200
    phosphorylase regulatory B (8)   SLLPAIVEL      567            900                                 6th     200

    a There are other examples of sequences that are known to contain HLA-A2-restricted peptides but, in these other cases, the optimal peptide has not been identified (12-14).
    b Protein of origin.
    c Amino acid sequence.
    d Number of nonamers in the protein.
    e Theoretical half-life of β2m dissociation (in min at 37°C) for the peptide that ranked first for this protein.
    f Rank of peptide determined by comparing the experimentally measured half-life of β2m dissociation to the theoretical half-life of β2m dissociation for all the nonamers that could be generated from the same protein. These columns are blank for peptides that have not been tested.
    g Rank of peptide determined by comparing the theoretical half-life of β2m dissociation to the theoretical half-life of β2m dissociation for all the nonamers that could be generated from the same protein. These columns are blank for peptides that are longer than nonamers, because we cannot make an accurate prediction of the theoretical half-life of β2m dissociation for longer peptides.

    Discussion

    One of the major reasons to study peptide binding to class I molecules is to be able to determine which peptides are likely to be antigenic, starting from the primary sequence of (for example) a viral protein. In addition, it would be useful to know why certain peptides are antigenic, but most peptides are immunologically silent. The data in Table VII suggest that, so far as we can tell, dominant antigenic peptides in HLA-A2-restricted immune responses are among those peptides that bind most tightly to HLA-A2. If this turns out to be generally correct, then it should be possible to develop mathematical algorithms to identify most antigenic peptides using approaches similar to that described herein that are tailored to the peptide-binding properties of each histocompatibility Ag.

    The class I MHC protein HLA-A2 has been shown to bind certain peptides, generally 9 aa in length, that preferentially contain Leu or Met at P2 and a Val or Leu at the C-terminus (P9) (5, 8). The residues at these two positions have been termed anchor residues (5) because their relative lack of variability indicates that they serve as primary contact points between the peptide and the class I binding site. However, peptides that contain both Leu at P2 and Val at P9 form complexes whose stability spans at least four orders of magnitude (16), indicating that the aa at other positions can serve as auxiliary anchor residues that are critical for peptide binding. Therefore, to make useful predictions about peptide binding affinity, the contribution of both the dominant and auxiliary anchor residues must, if possible, be analyzed on a quantitative basis. The simplest approach is to assume that each amino acid side-chain binds independently of the rest of the peptide (the IBS hypothesis). It seems reasonable to expect that for many peptides, the IBS hypothesis will adequately explain peptide binding, and for other peptides, more complicated explanations will be needed. Whenever IBS is true, the binding affinity of any nonamer can be broken down into nine different coefficients, each of which is dependent only on the identity of the aa and the position within the peptide. Therefore, a table containing 180 different coefficients would contain the information necessary to calculate a probable binding affinity for any possible nonamer.
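    As a minimal sketch of what the IBS hypothesis implies computationally, the following Python fragment stores 180 coefficients keyed by (residue, position) and multiplies the nine that apply to a given nonamer. The table contents are assumptions for illustration: every entry starts at the neutral value 1.0, the values for L2 and V9 are invented, and only the value for Phe at P4 (0.027) is quoted from elsewhere in this Discussion. The `normalization` parameter is assumed to play the role of the overall normalization coefficient mentioned in the footnote to Table V.

```python
from math import prod

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 residues x 9 positions = 180 entries

# Start every (residue, position) coefficient at the neutral value 1.0.
coefficients = {(aa, pos): 1.0 for aa in AMINO_ACIDS for pos in range(1, 10)}

# Invented overrides, for illustration only (not taken from Table V).
coefficients[("L", 2)] = 50.0
coefficients[("V", 9)] = 2.0
# The one value quoted in the text: Phe at P4.
coefficients[("F", 4)] = 0.027


def theoretical_half_life(peptide, table, normalization=1.0):
    """Product of the nine per-position coefficients, times a global constant."""
    if len(peptide) != 9:
        raise ValueError("the coefficient table applies to nonamers only")
    return normalization * prod(
        table[(aa, pos)] for pos, aa in enumerate(peptide, start=1)
    )
```

    With these placeholder values, `theoretical_half_life("ALAAAAAAV", coefficients)` multiplies 50.0 by 2.0 and seven neutral factors.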

    To calculate the coefficients, we measured the stability of a large number of HLA-A2 complexes containing distinct peptides, as assessed by measuring the rate of β2m dissociation. We also compiled a list of peptides that were unable to make stable complexes with HLA-A2. To solve for the coefficients, the β2m dissociation data for each peptide were treated as an independent equation, in which the measured half-life of β2m dissociation was set equal to the product of the nine coefficients (see Materials and Methods). In theory, a sufficiently large set of peptide binding data could be used to solve for all of the coefficients simultaneously. In practice, we calculated values for the coefficients that were most important to our current peptide database. Until every aa at every position has been tested, we cannot exclude the possibility that other coefficients may also contribute significantly to peptide binding to HLA-A2. Despite these approximations, we found that for the vast majority of the peptides that we have tested, the binding data were consistent with the IBS hypothesis. Only for 3 of 83 peptides was it necessary to propose significant side-chain/side-chain interactions to explain the observed peptide-binding properties. We conclude that for most peptides, the stability of the HLA-A2/β2m/peptide complex is what would be expected if each side-chain of the peptide bound independently to the class I molecule.
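    Because each equation sets a half-life equal to a product of coefficients, taking logarithms turns the problem into a linear one in the log-coefficients. The sketch below illustrates that idea with plain gradient descent on the squared log error. It is a stand-in chosen for brevity, not the authors' Fortran program or the optimization routine they cite (18). The four peptides, their half-lives, and the choice of which coefficients are free are all invented. Coefficients not listed in `free` stay fixed at 1.0, and peptides that do not bind (which cannot be log-transformed) are not handled.

```python
from math import exp, log


def fit_coefficients(half_lives, free, steps=5000, rate=0.1):
    """Fit free (residue, position) coefficients to measured half-lives.

    half_lives: dict mapping nonamer -> measured half-life (must be > 0)
    free:       set of (residue, position) pairs allowed to deviate from 1.0
    """
    log_coef = {key: 0.0 for key in free}  # log(1.0): start neutral
    for _ in range(steps):
        grad = {key: 0.0 for key in free}
        for peptide, t_half in half_lives.items():
            keys = [(aa, pos) for pos, aa in enumerate(peptide, start=1)
                    if (aa, pos) in free]
            # Residual in log space: predicted log half-life minus measured.
            residual = sum(log_coef[k] for k in keys) - log(t_half)
            for k in keys:
                grad[k] += 2.0 * residual
        for key in free:
            log_coef[key] -= rate * grad[key]
    return {key: exp(value) for key, value in log_coef.items()}


# Invented data: four equations, four unknowns (Ala is neutral everywhere).
data = {"ALAAAAAAA": 10.0, "ALAAAAAAV": 100.0,
        "ALAAAAAAL": 50.0, "AMAAAAAAV": 40.0}
free = {("L", 2), ("M", 2), ("V", 9), ("L", 9)}
fitted = fit_coefficients(data, free)
```

    In this toy system the peptide with Ala at P9 pins down the value for L2, which in turn fixes the others; without such a reference the coefficients at P2 and P9 could only be determined up to a shared scale factor.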

    Compiling a table of peptide binding coefficients based on individual peptide side-chains has several powerful advantages. First, the coefficients in Table V incorporate all of the peptide-binding data that we have collected so far, with a few exceptions (see Table VI). The table of coefficients can then be used to estimate the binding stability of HLA-A2 complexes containing an untested peptide. As soon as additional binding data become available, the new data can be used to refine the accuracy of the table of coefficients. Second, the table can be used to make a quantitative prediction about which aa in a given peptide are of primary importance for binding to HLA-A2. For example, in the case of the influenza matrix peptide GILGFVFTL, the Phe residues at P5 and P7 are predicted to be almost as important as the Ile at P2. This information could be used to predict which substitutions in an antigenic peptide might allow it to bind more tightly to HLA-A2. In some cases, a peptide that binds very weakly to HLA-A2 might be converted into a useful vaccine candidate by this means. Third, experiments can be designed to test every coefficient in the table by measuring the stability of HLA-A2 complexes containing peptides that differ at the aa in question. Fourth, whenever the binding of a peptide is badly predicted by the table of coefficients, one would predict that significant side-chain/side-chain interactions are taking place or that some side-chain is oriented in a significantly different direction than usual.

    The most obvious way to test the validity of the coefficients in Table V would be to predict the half-lives of β2m dissociation for complexes formed with a new set of peptides, and then to compare the predictions against experimental measurements. We have not explicitly done this, because we have used all new information to improve the values of the coefficients. Instead, to test the power of this methodology to predict which peptides would make the most stable HLA-A2 complexes, the coefficients were recalculated for each of the 80 peptides that bind stably to HLA-A2, using all of the equations used to calculate the coefficients in Table V except for the equation corresponding to the peptide to be tested. The factor by which the predicted half-life of β2m dissociation differs from the measured half-life is listed in Table II, seventh column. It can be seen that although this factor is always greater than the factor obtained when the peptide to be tested is included in the set of equations (Table II, fifth column), the half-lives of 62 of the 80 peptides were still predicted within a factor of five. In most cases, the poorest predictions can be easily explained. For example, ALFFFDIDL (Table IIC) was predicted poorly because it was the only peptide that formed stable HLA-A2 complexes that contained a Phe at P4. When the equation for ALFFFDIDL was deleted, the program calculated the highest value for the coefficient for Phe at P4 that was consistent with the observation that GLGFGGGGV (Table IIA) and LLSFLPSDF (Table IVC) do not form stable HLA-A2 complexes. It turns out that this causes the value of the coefficient for F4 to increase from 0.027 (Table V) to 16.7, which is an artificially high value. This happens because GLGFGGGGV and LLSFLPSDF have such poor anchor residues at P3 and P9, respectively, that the coefficient for F4 could be as high as 16.7, and these peptides would still not be expected to form stable HLA-A2 complexes.
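    The leave-one-out test described above can be outlined as follows. This is a sketch under stated assumptions: `fit` and `predict` are hypothetical callables standing in for the coefficient calculation and the coefficient product, and the toy model and the four half-lives are invented solely so that the example runs. The fold error is defined here as the larger of predicted/measured and measured/predicted, which is one natural reading of "the factor by which the predicted half-life differs from the measured half-life".

```python
def fold_error(predicted, measured):
    """Factor (always >= 1) by which a prediction differs from the measurement."""
    return max(predicted / measured, measured / predicted)


def leave_one_out(half_lives, fit, predict):
    """Refit without each peptide in turn, then predict the held-out peptide."""
    errors = {}
    for held_out, measured in half_lives.items():
        training = {p: t for p, t in half_lives.items() if p != held_out}
        model = fit(training)
        errors[held_out] = fold_error(predict(model, held_out), measured)
    return errors


def count_within(errors, factor=5.0):
    """Number of held-out peptides predicted within the given factor."""
    return sum(1 for e in errors.values() if e <= factor)


# Placeholder model (not the coefficient table): predict every peptide to have
# the smallest half-life seen in the training set.
def toy_fit(training):
    return min(training.values())


def toy_predict(model, peptide):
    return model


measurements = {"GLAAAAAAV": 10.0, "ALAAAAAAV": 20.0,
                "KLAAAAAAV": 30.0, "ILAAAAAAV": 1000.0}  # invented values
errors = leave_one_out(measurements, toy_fit, toy_predict)
```

    A real run would pass a fitting routine for the coefficient table as `fit`; the bookkeeping of held-out peptides and fold errors would stay the same.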

    As a further check on the logic behind the calculations, the coefficients were recalculated allowing all of the 180 coefficients to be variables. This allowed the overall error function to decrease from a value of 22.8 to a value of 12.3. The new set of coefficients was very similar to that in Table V (data not shown), especially for the coefficients that apply to a large number of peptides (like L2 and V9). As would be expected considering the number of variables, certain coefficients were poorly defined. For example, the coefficients for both W1 and C7 were present only in the equation corresponding to the data for WLYRETCNL (see Table II), and other coefficients could not be calculated at all because peptides containing the corresponding aa were not available. Nonetheless, with a sufficiently large set of peptides, these difficulties would be overcome, and all of the coefficients could be simultaneously calculated. Thus, it will not always be necessary to make intuitive choices as to which coefficients should be allowed to deviate from a value of 1.0. However, it would be possible to reduce the number of variable coefficients used in our calculations by two distinct means. First, some of the coefficients that were allowed to be variables were calculated to have values near 1.0 (e.g., D3, P3, L4, E5, G5, G6, E7, and V7), and therefore (in retrospect) need not have been variables. Second, in some cases, chemically similar aa were found to have similar coefficients, even though the algorithm used to calculate the coefficients did not take this into account (e.g., F3, Y3, W3 and F5, Y5, W5), and therefore the number of variables could be reduced by constraining several coefficients to have the same value.

    The IBS hypothesis is based on the following theoretical considerations. The logarithm of each coefficient can be thought of as being related to a partial free energy of activation for the process of dissociation of the complex. The partial free energies of activation should be additive, assuming the peptide side-chains bind independently to the HLA H chain, and assuming the rate-limiting step for the dissociation of each complex is the same. The sign of the logarithm of a coefficient can be either positive or negative, depending on whether the aa makes a favorable or unfavorable contribution to the stability of the complex. The equation E = -RT ln K converts from energy, which is additive, to the coefficients themselves, which are factors that contribute to the value of the rate constant. Instead of measuring free energies, we have measured exclusively the complex stability, as deduced from the half-life of β2m dissociation. These kinetic measurements may have some advantages over free energy measurements, because there is no contribution from variations in free peptide solvation or from the uncertain status of the HLA H chain/β2m dimer. However, whenever two peptide side-chains interact with one another, compete for binding to the same pocket, transmit structural changes to other pockets, or alter the structure of the rate-limiting step for dissociation, the IBS condition would be violated.
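    The step from additive energies to multiplicative coefficients can be written out explicitly. The derivation below is a sketch under assumptions that are not spelled out in the text: a transition-state style rate law with an unspecified prefactor A, a sequence-independent baseline term, and per-position contributions written as Δg_i. These symbols are introduced here only for illustration.

```latex
% Assumed rate law for dissociation; A is an unspecified prefactor.
k = A \exp\!\left(-\frac{\Delta G^{\ddagger}}{RT}\right), \qquad
t_{1/2} = \frac{\ln 2}{k}

% IBS assumption: the activation free energy is a baseline plus one
% additive contribution from the residue a_i at each position i.
\Delta G^{\ddagger} = \Delta G^{\ddagger}_{0} + \sum_{i=1}^{9} \Delta g_i(a_i)

% Substituting converts the sum of energies into a product of coefficients.
t_{1/2} = \underbrace{\frac{\ln 2}{A}
          \exp\!\left(\frac{\Delta G^{\ddagger}_{0}}{RT}\right)}_{c_0}
          \prod_{i=1}^{9} c_i(a_i), \qquad
c_i(a) = \exp\!\left(\frac{\Delta g_i(a)}{RT}\right)

% Equivalently, in the form E = -RT ln K, with K = 1/c_i(a) the factor
% that the residue contributes to the rate constant k.
\Delta g_i(a) = RT \ln c_i(a) = -RT \ln \frac{1}{c_i(a)}
```

    In this notation a coefficient greater than 1 corresponds to a positive contribution to the activation free energy, that is, to slower dissociation and a more stable complex.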

    The IBS hypothesis is based on the approximation that any one aa at a given peptide position would be able to adopt nearly the same limited set of conformations, regardless of the rest of the peptide's sequence. The currently available crystal structure data indicate that the overall structure of the peptide-binding groove is similar for all peptide complexes (4, 27-32), and adjusts only slightly to different peptides (28). In particular, the conformation of the peptide is constrained by the canonical hydrogen bonds between the class I molecule and the peptide termini at both ends of the peptide-binding groove (28, 30, 32). In addition, one might expect that for HLA-A2 the anchor residue at P2 would always be buried in the B pocket (4). These constraints would be expected to limit the potential flexibility of the peptide. The energetics of the conformations of each aa, and its interactions with the peptide-binding groove, would determine the values of the coefficients, which might also incorporate side-chain solvation effects in the case of exposed aa, and also secondary effects transmitted to other residues by limitations to the conformational flexibility of the peptide backbone. Theoretically, energy minimization calculations based on crystallographically determined coordinates of one peptide/H chain complex should be able to quantitate the energetic consequences of substitutions in the peptide, making the table of coefficients obsolete. At this point in time, however, these calculations are cumbersome and unreliable, and the table of coefficients can provide a first approximation to the binding properties of an unknown peptide.

    Because it appears that longer peptides can loop out in the middle in order to maintain favorable contacts at both termini (28, 31), it would be possible to extend the IBS idea to account for the binding properties of peptides longer than 9 aa. In this case, the coefficients for P1-P4 might be the same as with nonamers, but the coefficients for P6-P9 would apply to P(Ω-4)-PΩ, where Ω stands for the last amino acid in the peptide. We have found that the binding properties of some peptides can be explained adequately in this way (data not shown), but many peptides are predicted very poorly, especially when there is a Gly residue at P2 or P3. So far, the longest peptide that we have tested that appears to bind by the looping-out mechanism is the 15-mer GLFGGGGGVKGGFGV, which contains favorable dominant anchor residues at P2 and PΩ, and also favorable auxiliary anchor residues at P3 and P(Ω-2). Before this extension of the IBS hypothesis will be generally useful, it will be necessary to work out an additional set of rules that takes into account the variety of peptide backbone conformations that can be used to accommodate the looped-out residues.

    We have used the coefficients listed in Table V to ask whether the well-studied antigenic and endogenous peptides represent the highest affinity peptides that could be generated from their parent proteins. The calculations listed in Table VII indicate that, so far as we can tell, these biologically important peptides are usually among the top 2% of all possible HLA-A2 binding peptides. For example, the optimal HLA-A2-restricted peptide GILGFVFTL is predicted to bind more tightly to HLA-A2 than any other peptide that can be derived from the influenza matrix protein, even though it contains a relatively unfavorable Ile at P2. Thus, there is no reason to believe that antigenic peptides are preferentially selected from a lower affinity set of peptides, as has been proposed (33). It would be interesting to determine whether any of the other peptides that are predicted to form stable complexes with HLA-A2 are ever antigenic or associated with HLA-A2 in vivo. In this way, one could address the relative importance to antigenicity of peptide binding to HLA-A2 compared to other factors like protein proteolysis, protein turnover, peptide stability, peptide transport, the rate of formation of the complex, and holes in the T cell repertoire. Unlike the dissociation of the HLA-A2 complex, which is a unimolecular process, the processes that affect the rate of formation of the HLA-A2 complex are potentially subject to control mechanisms that may differ between cell types, making it much more difficult to study them. In any case, we believe that the coefficients in Table V provide the best means available so far to identify HLA-A2 binding peptides, whether they turn out to be antigenic, immunologically silent, or never formed in vivo. It would be interesting to determine whether a table of coefficients calculated by similar means would be able to improve the predictive power of the motifs that have been elucidated for class II binding peptides (34). Other macromolecular interactions such as ligand/antibody and oligonucleotide/DNA binding protein might also be addressed using a mathematical approach similar to that described here.

    Note added in proof. Software is being developed to make the coefficients in Table V publicly accessible through the National Center for Biotechnology Information at the National Library of Medicine.


    Acknowledgments

    The authors thank Drs. David Alling, William Biddison, George Hutchinson, William Magner, David Margulies, Michael Shields, and John Spouge for useful discussions, and thank Lisa Hull for excellent technical assistance.

    References

    1. Bjorkman, P. J., and P. Parham. 1990. Structure, function, and diversity of class I major histocompatibility complex molecules. Annu. Rev. Biochem. 59:253.
    2. Townsend, A., and H. Bodmer. 1989. Antigen recognition by class I-restricted T lymphocytes. Annu. Rev. Immunol. 7:601.
    3. Garrett, T. P., M. A. Saper, P. J. Bjorkman, J. L. Strominger, and D. C. Wiley. 1989. Specificity pockets for the side chains of peptide antigens in HLA-Aw68. Nature 342:692.
    4. Saper, M. A., P. J. Bjorkman, and D. C. Wiley. 1991. Refined structure of the human histocompatibility antigen HLA-A2 at 2.6 Å resolution. J. Mol. Biol. 219:277.
    5. Falk, K., O. Rotzschke, S. Stevanovic, G. Jung, and H.-G. Rammensee. 1991. Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules. Nature 351:290.
    6. Jardetzky, T. S., W. S. Lane, R. A. Robinson, D. R. Madden, and D. C. Wiley. 1991. Identification of self peptides bound to purified HLA-B27. Nature 353:326.
    7. Romero, P., G. Corradin, I. F. Luescher, and J. L. Maryanski. 1991. H-2Kd-restricted antigenic peptides share a simple binding motif. J. Exp. Med. 174:603.
    8. Hunt, D. F., R. A. Henderson, J. Shabanowitz, K. Sakaguchi, H. Michel, N. Sevilir, A. L. Cox, E. Appella, and V. H. Engelhard. 1992. Characterization of peptides bound to the class I MHC molecule HLA-A2.1 by mass spectroscopy. Science 255:1261.
    9. Corr, M., L. F. Boyd, S. R. Frankel, S. Kozlowski, E. A. Padlan, and D. H. Margulies. 1992. Endogenous peptides of a soluble major histocompatibility complex class I molecule, H-2Ld: Sequence motif, quantitative binding, and molecular modeling of the complex. J. Exp. Med. 176:1681.
    10. Bednarek, M. A., S. Y. Sauma, M. C. Gammon, G. Porter, S. Tamhankar, A. R. Williamson, and H. J. Zweerink. 1991. The minimum peptide epitope from the influenza virus matrix protein: Extra and intracellular loading of HLA-A2. J. Immunol. 147:4047.
    11. Tsomides, T. J., B. D. Walker, and H. N. Eisen. 1991. An optimal viral peptide recognized by CD8+ T cells binds very tightly to the restricting class I major histocompatibility complex protein on intact cells but not to the purified class I protein. Proc. Natl. Acad. Sci. USA 88:11276.
    12. Claverie, J., P. Kourilsky, P. Langlade-Demoyen, A. Chalufour-Prochnika, G. Dadaglio, F. Tekaia, F. Plata, and L. Bougueleret. 1988. T-immunogenic peptides are constituted of rare sequence patterns. Use in the identification of T epitopes in the human immunodeficiency virus gag protein. Eur. J. Immunol. 18:1547.
    13. Clerici, M., D. R. Lucey, R. A. Zajac, R. N. Boswell, H. M. Gebel, H. Takahashi, J. A. Berzofsky, and G. M. Shearer. 1991. Detection of cytotoxic T lymphocytes specific for synthetic peptides of gp160 in HIV-seropositive individuals. J. Immunol. 146:2214.
    14. Dadaglio, G., A. Leroux, P. Langlade-Demoyen, E. M. Bahraoui, F. Traincard, R. Fisher, and F. Plata. 1991. Epitope recognition of conserved HIV envelope sequences by human cytotoxic T lymphocytes. J. Immunol. 147:2302.
    15. Choppin, J., J.-G. Guillet, and J.-P. Lévy. 1992. HLA class I binding regions of HIV-1 proteins. Crit. Rev. Immunol. 12:1.
    16. Parker, K. C., M. A. Bednarek, L. K. Hull, U. Utz, B. Cunningham, H. J. Zweerink, W. E. Biddison, and J. E. Coligan. 1992. Sequence motifs important for peptide binding to the human MHC class I molecule, HLA-A2. J. Immunol. 149:3580.
    17. Parker, K. C., M. Dibrino, L. Hull, and J. E. Coligan. 1992. The β2-microglobulin dissociation rate is an accurate measure of the stability of MHC class I heterotrimers and depends on which peptide is bound. J. Immunol. 149:1896.
    18. Davidon, W. C. 1975. Optimally conditioned optimization algorithms without line searches. Math. Program. 9:1.
    19. Schumacher, T. N. M., M. L. H. De Bruijn, L. N. Vernie, W. M. Kast, C. J. M. Melief, J. J. Neefjes, and H. L. Ploegh. 1991. Peptide selection by MHC class I molecules. Nature 350:703.
    20. Rotzschke, O., K. Falk, S. Stevanovic, G. Jung, and H.-G. Rammensee. 1992. Peptide motifs of closely related HLA class I molecules encompass substantial differences. Eur. J. Immunol. 22:2453.
    21. Gotch, F., J. Rothbard, K. Howland, A. Townsend, and A. McMichael. 1987. Cytotoxic T lymphocytes recognize a fragment of influenza virus matrix protein in association with HLA-A2. Nature 326:881.
    22. Vitiello, A., D. Marchesini, J. Furze, L. A. Sherman, and R. W. Chesnut. 1991. Analysis of the HLA-restricted influenza-specific cytotoxic T lymphocyte response in transgenic mice carrying a chimeric human-mouse class I major histocompatibility complex. J. Exp. Med. 173:1007.
    23. Utz, U., . Koenig, J. E. Coligan, and W. E. Biddison. 1992. Presentation of three different viral peptides, HTLV-1 tax, HCMV gB, and influenza virus M1, is determined by common structural features of the HLA-A2.1 molecule. J. Immunol. 149:214.
    24. Robbins, P. A., L. A. Lettice, P. Rota, J. Santos-Aguado, J. Rothbard, A. J. McMichael, and J. L. Strominger. 1989. Comparison between two peptide epitopes presented to cytotoxic T lymphocytes by HLA-A2. Evidence for discrete locations within HLA-A2. J. Immunol. 143:4098.
    25. Bertoletti, A., F. V. Chisari, A. Penna, S. Guilhot, L. Galati, G. Missale, P. Fowler, H.-J. Schlicht, A. Vitiello, R. C. Chesnut, F. Fiaccadori, and C. Ferrari. 1993. Definition of a minimal optimal cytotoxic T-cell epitope within the hepatitis B virus nucleocapsid protein. J. Virol. 67:2376.
    26. Henderson, R. A., H. Michel, K. Sakaguchi, J. Shabanowitz, E. Appella, D. F. Hunt, and V. H. Engelhard. 1992. HLA-A2.1-associated peptides from a mutant cell line: a second pathway of antigen presentation. Science 255:1264.
    27. Madden, D. R., J. C. Gorga, J. L. Strominger, and D. C. Wiley. 1992. The three-dimensional structure of HLA-B27 at 2.1 Å resolution suggests a general mechanism for tight peptide binding to MHC. Cell 70:1035.
    28. Fremont, D. H., M. Matsumura, E. A. Stura, P. A. Peterson, and I. A. Wilson. 1992. Crystal structures of two viral peptides in complex with murine MHC class I H-2Kb. Science 257:919.
    29. Zhang, W., A. C. M. Young, M. Imarai, S. G. Nathenson, and J. C. Sacchettini. 1992. Crystal structure of the major histocompatibility complex class I H-2Kb molecule containing a single viral peptide: Implications for peptide binding and T-cell receptor recognition. Proc. Natl. Acad. Sci. USA 89:8403.
    30. Silver, M. L., H.-C. Guo, J. L. Strominger, and D. C. Wiley. 1992. Atomic structure of a human MHC molecule presenting an influenza virus peptide. Nature 360:367.
    31. Guo, H.-C., T. S. Jardetzky, T. P. J. Garrett, W. S. Lane, J. L. Strominger, and D. C. Wiley. 1992. Different length peptides bind to HLA-Aw68 similarly at their ends but bulge out in the middle. Nature 360:364.
    32. Madden, D. R., J. C. Gorga, J. L. Strominger, and D. C. Wiley. 1991. The structure of HLA-B27 reveals nonamer self-peptides bound in an extended conformation. Nature 353:321.
    33. Ohno, S. 1992. Self to cytotoxic T cells has to be 1000 or less high affinity nonapeptides per MHC antigen. Immunogenetics 36:22.
    34. O'Sullivan, D., T. Arrhenius, J. Sidney, M.-F. Del Guercio, M. Albertson, M. Wall, C. Oseroff, S. Southwood, S. M. Colón, F. C. A. Gaeta, and A. Sette. 1991. On the interaction of promiscuous antigenic peptides with different DR alleles: identification of common structural motifs. J. Immunol. 147:2663.