

Design of a binding scaffold based on variable lymphocyte receptors of jawless vertebrates by module engineering

Sang-Chul Lee(a,1), Keunwan Park(b,1), Jieun Han(a,1), Joong-jae Lee(a,1), Hyun Jung Kim(c,d), Seungpyo Hong(b), Woosung Heu(a), Yu Jung Kim(e), Jae-Seok Ha(e), Seung-Goo Lee(e), Hae-Kap Cheong(c), Young Ho Jeon(c), Dongsup Kim(b,2), and Hak-Sung Kim(a,2)

(a) Department of Biological Sciences, and (b) Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 305-701, Korea; (c) Division of Magnetic Resonance Research, Korea Basic Science Institute, Cheongwon, Chungbuk, 363-883, Korea; (d) College of Pharmacy, Chungbuk National University, Cheongju, Chungbuk, 361-763, Korea; and (e) Industrial Biotechnology and Bioenergy Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon 305-806, Korea

Edited by Max D. Cooper, Emory University, Atlanta, GA, and approved December 28, 2011 (received for review August 12, 2011)

Repeat proteins have recently been of great interest as potential alternatives to immunoglobulin antibodies due to their unique structural and biophysical features. We here present the development of a binding scaffold based on variable lymphocyte receptors, which are nonimmunoglobulin antibodies composed of Leucine-rich repeat modules in jawless vertebrates, by module engineering. A template scaffold was first constructed by joining consensus repeat modules between the N- and C-capping motifs of variable lymphocyte receptors. The N-terminal domain of the template scaffold was redesigned based on the internalin-B cap by analyzing the modular similarity between the respective repeat units using a computational approach. The newly designed scaffold, termed “Repebody,” showed a high level of soluble expression in bacteria, displaying high thermodynamic and pH stabilities. Ease of molecular engineering was shown by designing repebodies specific for myeloid differentiation protein-2 and hen egg lysozyme, respectively, by a rational approach. The crystal structures of designed repebodies were determined to elucidate the structural features and interaction interfaces. We demonstrate general applicability of the scaffold by selecting repebodies with different binding affinities for interleukin-6 using phage display.

non-antibody scaffold ∣ repeat protein ∣ modular architecture ∣ molecular binder

Repeat proteins consist of varying numbers of consecutive homologous structural modules (units) of 20–40 amino acid residues with characteristic secondary structures. Numerous repeat proteins have been identified in nature, and their modular architecture has evolved to be suitable for mediating many important biological functions, including protein–protein interactions, cell adhesion, signaling processes, neural development, bacterial pathogenicity, extracellular matrix assembly, and immune response (1). Due to their unique features, such as modular architecture and large interaction surfaces, repeat proteins have recently attracted much attention as templates for the development of alternative scaffolds to immunoglobulin antibodies (2–4). Immunoglobulin antibodies are widely used in biotechnology and biomedical fields as binding molecules and therapeutics; however, they have some drawbacks, such as the requirement for an expensive mammalian cell-based manufacturing system, difficulty in rational design, large molecular mass, a tendency to aggregate, and intellectual property restrictions (5, 6). Hence, considerable effort has been made to develop alternatives, and a variety of protein scaffolds have been reported, including ankyrin repeats, fibronectins, anticalins, affibodies, and A domains (7, 8).

It was recently shown that the adaptive immune system in jawless vertebrates such as lampreys and hagfish is based on variable lymphocyte receptors (VLRs) instead of immunoglobulin antibodies (9–11). VLRs consist of highly diverse Leucine-rich repeat (LRR) modules and are characterized by an assembly of repeating 20–29 residue LRR modules, each of which has a β-strand-turn-α helix structure, in a horseshoe-shaped solenoid fold. LRR modules are known to be one of the most abundant structural motifs and have been identified in more than 2,000 proteins in nature (12). VLRs are produced in lymphocytes by somatic gene rearrangement of diverse LRR modules, giving rise to a vast repertoire of >10^17 unique receptors. Since their initial discovery in jawless vertebrates, the functional and structural aspects of VLRs have been demonstrated (13–18). Along with the unique structural feature of repeat proteins, the inherent role of VLRs as highly diverse antibodies suggests that they could be developed into alternative binding scaffolds devoid of the limitations found in immunoglobulin antibodies.

Here, we present the development of a binding scaffold based on VLRs by module engineering. A template scaffold was first constructed by joining consensus LRR repeat modules between the N- and C-capping motifs from VLRs found in nature. For ease of molecular engineering and manufacturing using a bacterial expression system, the N-terminal domain of the template scaffold was redesigned based on the internalin-B cap by analyzing the modular similarity between the respective repeat units using a computational approach. The newly designed scaffold was named “Repebody” because it was derived from naturally occurring antibodies composed of repeat modules. Ease of molecular engineering was shown by designing repebodies specific for myeloid differentiation protein-2 (MD2) and hen egg lysozyme (HEL), respectively, by a rational approach. The crystal structures of designed repebodies were determined to elucidate the structural features and interaction interfaces. To demonstrate general applicability of the scaffold, we constructed a phage-displayed library and selected repebodies with different binding affinities for interleukin-6 (IL-6).

Results

Design of Consensus LRRV Module and a Template Scaffold. Variable lymphocyte receptors (VLRs) of jawless vertebrates are composed of an N-terminal cap (LRRNT), the first LRR (LRR1), up to nine 24-residue variable LRR (LRRV), an end LRRV

Author contributions: S.-C.L., K.P., J.H., J.-j.L., D.K., and H.-S.K. designed research; S.-C.L., K.P., J.H., J.-j.L., H.J.K., S.H., W.H., Y.J.K., and J.-S.H. performed research; S.-C.L., K.P., J.H., J.-S.H., S.-G.L., H.-K.C., Y.H.J., D.K., and H.-S.K. analyzed data; and S.-C.L., K.P., J.H., S.-G.L., H.-K.C., and H.-S.K. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 3RFJ and 3RFS).

(1) S.-C.L., K.P., J.H., and J.-j.L. contributed equally to this work.

(2) To whom correspondence may be addressed. E-mail: [email protected] or kds@kaist.ac.kr.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1113193109/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1113193109 PNAS ∣ February 28, 2012 ∣ vol. 109 ∣ no. 9 ∣ 3299–3304

BIOCHEMISTRY


(LRRVe), a connecting peptide (CP), and the C-terminal cap (LRRCT) (Fig. 1A). To develop a binding scaffold based on VLRs, we first designed a consensus LRRV module consisting of the framework residues by analyzing the conserved pattern in variable LRR modules of VLRs. Consensus design of repeating modules was shown to increase the solubility and stability of various repeat proteins (19–22). Based on the sequence alignments for 1,000 LRR modules from the UniProt database (23) and 439 from the National Center for Biotechnology Information (NCBI) nonredundant protein sequences (nr) database (24), we obtained a consensus 24-residue LRRV module (Fig. 1B). The LRRV framework residues were conserved, and a moderately conserved region was fixed with the most relevant and less-charged residues. The residues in the variable regions, which are denoted as “x,” can be rationally selected or randomized for the generation of molecular binders for various targets. Ten positions were almost perfectly conserved, and dominant amino acids were found at six positions. At positions 3, 15, and 24, three amino acid residues (N, Q, and K) were chosen as consensus residues because they were found in a large portion of both databases and have smaller charges. Another five sites showed large variations in sequences; four were in the concave region, and one was in the convex region. The residues on the most highly conserved and hypervariable regions are shown in the typical VLR structure (Fig. S1 A and B).
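The consensus step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `consensus`, the frequency thresholds (0.9 for "almost perfectly conserved", 0.5 for "dominant"), and the toy six-residue sequences are all assumptions chosen for illustration. The sketch classifies each column of an ungapped alignment as conserved (upper case), dominant (lower case), or variable ("x").

```python
from collections import Counter


def consensus(aligned, conserved=0.9, dominant=0.5):
    """Classify each column of equal-length aligned sequences.

    Upper-case letter: top residue frequency >= conserved.
    Lower-case letter: top residue frequency >= dominant.
    'x': variable position (no residue reaches the dominant threshold).
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        frequency = count / len(aligned)
        if frequency >= conserved:
            out.append(residue.upper())
        elif frequency >= dominant:
            out.append(residue.lower())
        else:
            out.append("x")
    return "".join(out)


# Hypothetical toy "modules" (not real LRRV sequences).
toy_alignment = ["LTNLTA", "LTNLKE", "LSNLTQ", "LTNLRW"]
toy_consensus = consensus(toy_alignment)
print(toy_consensus)
```

In a real analysis the input would be the 24-residue LRRV modules extracted from the databases, and positions marked "x" would correspond to the sites left open for rational selection or randomization.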

The number of LRRV modules in naturally occurring VLRs has been shown to range from 0 to 9 (17). By taking into account the number of LRRV modules in the known VLRs with antigen-binding affinities, we constructed a template scaffold by sealing five consensus LRRV modules using the N- and C-capping motifs of VLRs found in nature, which was designated as VLRc-5. Amino acid residues at hypervariable regions were selected based on their frequencies for each position in LRR modules in nature. Hence, the template scaffold was composed of LRRNT, LRR1, five 24-residue LRRV, LRRVe, CP, and LRRCT, which had a molecular mass of approximately 29 kDa. The nucleotide sequence of the gene coding for the template scaffold is shown in Table S1, and the gene was synthesized after codon optimization for expression in Escherichia coli. The template scaffold was expressed in soluble form in the E. coli Origami strain, showing an expression level of about 2 mg/L.

Redesign of the N-Terminal Capping Motif. Even though the template scaffold was expressed in soluble form in E. coli, its expression level was too low for practical applications. Thus, we used a variety of methods to increase the expression level of the template scaffold, including fusion of various protein partners and expression under different conditions in the presence of chaperones; however, none of these approaches significantly increased expression. It was recently shown that the folding of LRR proteins proceeds through an N-terminal transition state ensemble and that the α-helical cap may better polarize the folding pathway of the repeat proteins by acting as a fast-growing nucleus (25). Therefore, we attempted to redesign the N-terminal domain of the template scaffold based on the most favorable capping motif found in LRR proteins. We analyzed the N-terminal capping motifs found in LRR proteins in terms of the helical content and similarity to the consensus LRRV module, and found that the N-terminal capping motif of internalin B best fit the criteria. Internalin B also belongs to a family of LRR proteins and is composed of 22-residue LRR modules (26). Internalin B was shown to induce phagocytosis in hepatocytes in nature, and its crystal structure was determined [Protein Data Bank (PDB) ID code 1OTO].

Our strategy was to substitute the internalin-B cap for the N-terminal domain of the template scaffold using a computational approach involving two consecutive steps. The first step was to determine the site to connect the N-terminal cap of internalin B to the template scaffold by analyzing the modular similarity between the respective repeat units, and the second step was to

Fig. 1. Design of consensus LRR module and Repebody scaffold. (A) Overall architecture of a VLR in jawless vertebrates. A typical VLR consists of an N-terminal cap (LRRNT), the first LRR (LRR1), up to nine 24-residue variable LRR (LRRV), an end LRRV (LRRVe), a connecting peptide (CP), and the C-terminal cap (LRRCT). (B) Sequence of a consensus LRRV module based on the analysis of a conserved pattern in diverse LRR modules of VLRs. Perfectly conserved residues at 10 positions are indicated in black, and the residues dominantly occurring at six positions are presented in orange. X denotes arbitrary amino acid residues. (C) Module-pairs composed of two adjoining repeat modules from internalin B and the template scaffold (VLRc-5), respectively. Six and five module-pairs were generated from internalin B and the template scaffold, respectively, and the module-pair 2 from internalin B was shown to have the highest similarity to the module-pair 1 from the template scaffold as indicated in red. (D) Schematic of the design of the Repebody scaffold. Based on the similarity scores of the module-pairs, the internalin-B domain spanning from the N-terminal cap to LRR2 (violet) was fused to the domain of the template scaffold spanning from LRRV2 to the C terminus (red), resulting in the Repebody scaffold. Hence, the newly designed Repebody scaffold was composed of the N-terminal cap, LRR1, LRRV1 from internalin B (violet), four LRRV (2–5), LRRVe, CP, and the C-terminal cap from the template scaffold (red). The model structures of the template and Repebody scaffolds were generated by homology modeling using the crystal structures of VLR (PDB ID code 2O6S) and internalin B (PDB ID code 1OTO) as the templates, respectively.


optimize the amino acid residues at the fusion site connecting the two heterogeneous modules. For this, we obtained the model structure of the template scaffold by homology modeling (see SI Materials and Methods, Modeling of Protein Structures). To determine the optimal connecting region, six and five module-pairs composed of two adjoining LRR modules from internalin B and the template scaffold, respectively, were generated (Fig. 1C). Comparison of the modular similarity between the respective module-pairs from internalin B and the template scaffold revealed that the module-pair 2 from internalin B had the highest similarity to the module-pair 1 of the template scaffold (Table S2). Based on this analysis, we fused the internalin-B domain spanning from the N-terminal cap to LRR2 with the domain of the template scaffold spanning from LRRV2 to the LRRCT as depicted in Fig. 1D. We assumed that if the sequences of two module-pairs were most similar, the first module of internalin B would match well with the second module in the template scaffold, and the corresponding module-pairs were considered a primary candidate for connecting the two different LRR proteins. Using this approach, both LRR3 of internalin B and LRRV1 of the template scaffold were deleted. We further optimized the amino acid residues on the connected modules using the fixed backbone design protocol of the Rosetta software (27). The newly designed scaffold was termed “Repebody” because it was derived from repeat module-based antibodies. Consequently, the Repebody scaffold was composed of the N-terminal cap, LRR1, LRRV1 from internalin B, four LRRV (2–5), LRRVe, CP, and the C-terminal cap from the template scaffold; its nucleotide sequence is shown in Table S1. We tested the expression of the Repebody scaffold in E. coli after codon optimization, and the expression level was significantly increased, up to 60 mg/L culture (Fig. 2A).
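The module-pair comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the similarity score used by the authors is given in their supporting information and is not reproduced here, so ungapped sequence identity stands in for it; the function names (`module_pairs`, `identity`, `best_junction`) and the four-residue toy modules are hypothetical. Real internalin-B and VLR modules differ in length (22 versus 24 residues) and would need a proper alignment.

```python
def module_pairs(modules):
    """Join each module with its successor: n modules give n - 1 module-pairs."""
    return [first + second for first, second in zip(modules, modules[1:])]


def identity(a, b):
    """Fraction of identical residues over the shorter length (no gaps).

    A stand-in score; the paper's actual similarity measure is not shown here.
    """
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n


def best_junction(donor, acceptor):
    """Return (score, donor_pair_number, acceptor_pair_number), numbered from 1."""
    scored = [
        (identity(donor_pair, acceptor_pair), i + 1, j + 1)
        for i, donor_pair in enumerate(module_pairs(donor))
        for j, acceptor_pair in enumerate(module_pairs(acceptor))
    ]
    return max(scored)


# Hypothetical toy modules (not real internalin-B or VLR sequences).
donor_modules = ["AAAA", "LTNL", "LSGN", "CCCC"]
acceptor_modules = ["LTNL", "LSGN", "DDDD"]
best = best_junction(donor_modules, acceptor_modules)
print(best)
```

With the toy data the best match is donor pair 2 against acceptor pair 1, which mirrors the shape of the result reported in the text; the fusion point would then be placed so that the donor contributes everything up to the matched pair and the acceptor everything after it.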

Variation in the Size of the Repebody Scaffold. Repeat proteins exhibit unique structural features resulting from the assembly of homologous structural repeat units (1, 12). To test the modularity of the Repebody scaffold, we varied the number of LRRV modules from 3 to 6, and investigated the biophysical properties of the resulting scaffolds in terms of expression level and stabilities against temperature and pH. The nucleotide sequences of the constructed Repebody scaffolds are listed in Table S1. All of them were well expressed in soluble form in E. coli (Fig. 2B), and their expression levels ranged from 50 to 80 mg/L culture (Fig. 2C). The scaffold containing three LRRV modules (Repebody-3) displayed the highest expression level of about 80 mg/L, and the expression level decreased as the number of LRRV modules increased. The melting temperatures of the Repebody scaffolds were closely related to the number of LRRV modules, and the scaffold with six LRRV modules (Repebody-6) had the highest melting temperature of 85 °C (Fig. 2C). The number of LRRV modules seems to be the major factor dictating the thermodynamic stability of the Repebody, whereas the substituted internalin-B cap had only a negligible effect. The stability of the Repebody scaffold against pH was also tested, and it was shown to be stable over pH values ranging from 3 to 12 (Fig. 2D). These results indicate that the Repebody scaffold retains a modular architecture and highly stable conformation.

Design of Repebodies by a Rational Approach. To assess ease of molecular engineering, we first designed a repebody for myeloid differentiation protein-2 (MD2) by a rational approach. MD2 was chosen as a target protein because it plays a crucial role in the mammalian TLR4-mediated innate immune response (28–30), and the crystal structure of the TV3/MD2 (PDB ID code 2Z65) complex has been reported (31). TV3 is a hybrid protein comprising the LRR modules from human TLR4 (Toll-like receptor 4) and a hagfish VLR. To design a repebody for MD2, we predicted the residues that interact with MD2 by superimposing the model structure of the Repebody scaffold onto the crystal structure of TV3 in complex with MD2 (Fig. S2A). Based on the analysis, 11 residues on the scaffold were selected and changed (Fig. S2B): N91I, S93T, T94G, A115V, N117V, T118E, S139N, G141A, Y142H, R163D, and N165S. The gene coding for the molecular binder for MD2 (Table S1), designated as MD2-repebody, was synthesized and expressed in E. coli. MD2-repebody was also expressed at a high level in monomeric form (Fig. S3A), and showed distinct binding to MD2 (Fig. S3 B and C) and negligible cross-reactivity (Fig. S3D). The binding affinity (KD) of MD2-repebody for MD2 was estimated to be 388 nM by SPR equilibrium analysis (Fig. 3A). We then tested whether the MD2-repebody had an effect on the lipopolysaccharide (LPS)-induced immune response using a cell-based assay system. The MD2-repebody exhibited considerable attenuating activity (Fig. S3E), implying that it may be a potential therapeutic for treatment of severe inflammation and sepsis.

To further test ease of molecular design, we attempted to produce a repebody for hen egg lysozyme (HEL). We chose HEL because it is easy to prepare large amounts of the protein and the crystal structure of the VLR/HEL (PDB ID code 3G3A) complex has been reported (17). Similarly, the crystal structure of the MD2-repebody was superimposed on the crystal structure of

Fig. 2. Expression of the Repebody scaffold in E. coli. (A) SDS-PAGE analyses of the expressed template and Repebody scaffolds. Lanes 1 and 2 are the supernatant fractions from the template and Repebody scaffolds, respectively. Lane 3 is the Repebody scaffold after purification over a Ni-NTA column. M represents standard size marker. (B) SDS-PAGE analysis of the expressed Repebody scaffolds with different LRRV module numbers ranging from 3 to 6. Lanes 1 and 2, three modules; Lanes 3 and 4, four modules; Lanes 5 and 6, five modules; Lanes 7 and 8, six modules. P indicates the insoluble pellet fraction from the protein lysate, and S the supernatant fraction from the protein lysate. M indicates standard size marker. (C) Expression levels and melting temperatures of the Repebody scaffolds with different LRRV module numbers. Expression levels of proteins were measured after purification over a Ni-NTA column. Expression level and melting temperature of the template scaffold were about 2 mg/L and 81 °C, respectively. (D) Melting temperature of the Repebody scaffold at different pH values. Melting temperatures were determined by measuring molar ellipticities at 222 nm as a function of temperature at the indicated pH.


the VLR/HEL complex to determine the residues to be changed (Fig. S2C). As a result, nine residues of the MD2-repebody were selected (Fig. S2D): N139Y, A141Y, E161R, S165D, Y166N, D185Q, R187S, Y189N, and Q190D. In addition to the nine mutations, CP and LRRCT spanning from Gln-208 to Thr-266 were replaced with those of VLR in complex with HEL because the loop of LRRCT on VLR was shown to participate in the interaction with HEL (17). The designed repebody for HEL was designated as HEL-repebody, and its nucleotide sequence is listed in Table S1. HEL-repebody was also shown to be expressed at a high level and to form a complex with HEL in solution (Fig. S4A). HEL-repebody displayed distinct binding to HEL (Fig. S4B) and negligible cross-reactivity (Fig. S4C). The binding affinity of HEL-repebody for HEL was estimated to be 3.7 μM by isothermal titration calorimetry (ITC) analysis (Fig. 3B).
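The two substitution lists above use the usual wild-type/position/replacement notation (for example, N91I), which is easy to handle programmatically. The following Python sketch is an illustration only: the helper names `parse` and `apply_mutations` and the five-residue toy sequence are hypothetical, and the real scaffold sequence (Table S1) is not reproduced here. Because the HEL-repebody was derived from the MD2-repebody, the sketch also checks that, at positions mutated in both designs, the residue introduced by the MD2 list is the wild-type residue named in the HEL list.

```python
import re

POINT_MUTATION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse(label):
    """Split a label such as 'N91I' into (wild_type, position, replacement)."""
    match = POINT_MUTATION.fullmatch(label)
    if match is None:
        raise ValueError(f"not a point mutation: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence, labels):
    """Apply point mutations (1-based positions), verifying each wild-type residue."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse(label)
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(f"{label}: expected {wild_type}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)


# Substitution lists as given in the text.
MD2 = "N91I S93T T94G A115V N117V T118E S139N G141A Y142H R163D N165S".split()
HEL = "N139Y A141Y E161R S165D Y166N D185Q R187S Y189N Q190D".split()

# Residues introduced by the MD2 design vs. wild-type residues named in the HEL design.
introduced = {position: new for _, position, new in map(parse, MD2)}
expected = {position: old for old, position, _ in map(parse, HEL)}
shared = sorted(introduced.keys() & expected.keys())
consistent = all(introduced[p] == expected[p] for p in shared)
print(shared, consistent)

# Hypothetical toy sequence, only to show apply_mutations in use.
toy_mutant = apply_mutations("MNSTA", ["N2I", "S3T"])
print(toy_mutant)
```

The wild-type check in `apply_mutations` is a deliberate design choice: it catches off-by-one numbering errors, which are a common problem when residue numbers come from a crystal structure rather than from the expressed construct.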

To gain insight into the structural features and interaction interfaces, the crystal structures of the MD2- and HEL-repebodies were determined at 1.7 Å and 1.8 Å resolutions, respectively (Table S3). The crystal structures of the two repebodies revealed that both proteins maintain a characteristic horseshoe-shaped fold, displaying well-conserved backbone structures despite a number of mutations. To check if substitution of the internalin-B cap caused any changes in the conformation of the LRR domain, we superimposed the crystal structures of the two designed repebodies on the crystal structure of a lamprey VLR (PDB ID code 2O6S) (Fig. 3 C and D). The backbone root-mean-square deviations of the two repebodies relative to VLR, except for the substituted internalin-B cap, were about 0.33 Å and 0.36 Å, respectively. This result indicates that the Repebody scaffold retained nearly the same conformation, and that fusion of the internalin-B cap had a negligible effect on the conformation of the original LRR domain. To understand the interaction interfaces, the crystal structures of the MD2- and HEL-repebodies were superimposed on the complex crystal structures of TV3/MD2 (PDB ID code 2Z65) and VLR/HEL (PDB ID code 3G3A), respectively (Fig. 3 E and F). Analysis of the binding hot spots by FoldX 3.0 beta (complex_alascan function) suggests that hydrogen bonds between the side chains play a major role in interactions. Specifically, in the case of the MD2-repebody, E118, D163, and S165 were predicted to be the binding hot spots, and E118 and S165 appeared to form hydrogen bonds with T112, E111, and R106 of MD2. In addition, D163 of the MD2-repebody was likely to interact with the positively charged residue, R106, of MD2. As for the HEL-repebody, hydrogen bonds involving R161, D165, Y241, and N243 seemed to be critical for the binding of the HEL-repebody to HEL. The charged residues, R161 and D165, were likely to have charge interactions with D48 and R73 of HEL, mediating the hydrogen bonding with P70 and R73 of HEL. The accuracy of the model structure of the Repebody scaffold was tested by superimposition on the crystal structure of the MD2-repebody (Fig. S5) (32). The model structure fitted well into the crystal structure of the MD2-repebody with a Cα rmsd of 0.95 Å.

Selection of a Repebody by Phage Display. In order to show general applicability of the Repebody scaffold, we attempted to generate a repebody for another target by phage display selection. As a protein target, interleukin-6 (IL-6) was employed because it is known to be involved in many diseases such as inflammation and cancers (33). Two adjoining repeat modules (LRRV modules 1 and 2) of the Repebody scaffold were chosen, and three hypervariable sites (positions 8, 10, and 11) on each repeat module were subjected to randomization for generating synthetic diversity (Fig. 4A). The library was constructed by repeat module-based overlap PCR using primers with the NNK degenerate codon (where N stands for any of the four nucleotides and K for G or T), and a phage-displayed library of approximately 10^8 clones was generated. After four rounds of a standard panning process against IL-6, 96 clones were randomly chosen for the assay of binding activity in a 96-well plate using phage ELISA. We selected 82 positive clones showing significant signals (signal to noise >10) and determined their sequences. Selected repebodies were shown to have distinct amino acid residues at the mutation sites, and their sequence conservation is shown by sequence logos (Fig. 4B). Of the 82 positive clones, we selected three repebody clones (B3, C8, and F11) showing high signals in phage ELISA. The three isolated clones were tested in terms of specificity using phage ELISA, and

Fig. 3. Binding affinities and crystal structures of designed repebodies. (A) Binding affinity of MD2-repebody for MD2 by SPR equilibrium analysis. The fitting curve was obtained by plotting the response units (RU) against the MD2 concentrations. (B) Binding affinity of HEL-repebody for HEL by ITC. (C) Superposition of the crystal structure of the MD2-repebody (blue) on the crystal structure of VLR (PDB ID code 2O6S, yellow). (D) Superposition of the crystal structure of the HEL-repebody (pink) on the crystal structure of VLR (PDB ID code 2O6S, yellow). (E) Interaction interface of the MD2-repebody. Eleven mutated residues potentially interacting with MD2 are shown in blue sticks. (F) Interaction interface of the HEL-repebody. Nine mutated residues potentially interacting with HEL are indicated in pink sticks, and the replaced C-cap region is represented in light blue.

3302 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1113193109 Lee et al.


they were shown to be highly specific for IL-6 (Fig. 4C), displaying negligible cross-reactivities against other proteins. The selected repebodies were expressed in E. coli, and their binding affinities for IL-6 were determined by ITC (Fig. 4D). The selected repebodies displayed binding affinities ranging from 48 to 117 nM, with variation at between one and three amino acid residues (Fig. 4E). This result indicates that the Repebody scaffold can be broadly used for generating molecular binders with high affinity and specificity for a variety of targets by phage display selection.
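The library design described above (six positions randomized with the NNK codon, approximately 10^8 clones) can be put in numerical context with a short calculation. The following is a minimal sketch, not part of the reported procedure: it assumes the standard genetic code and equal representation of all NNK codons, and the variable names are illustrative only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: N = any of the four nucleotides, K = G or T (as defined in the text).
NNK_CODONS = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

N_SITES = 6            # three hypervariable positions on each of two modules
LIBRARY_SIZE = 10**8   # approximate number of clones reported

# Stop codons and distinct amino acids reachable through NNK.
stop_codons = [c for c in NNK_CODONS if CODON_TABLE[c] == "*"]
encoded = {CODON_TABLE[c] for c in NNK_CODONS} - {"*"}

# Theoretical diversity at the DNA and protein level for six randomized sites.
dna_diversity = len(NNK_CODONS) ** N_SITES
protein_diversity = len(encoded) ** N_SITES

# Fraction of clones expected to carry no stop codon at any randomized site
# (assumes every NNK codon is equally likely, which real primers only approximate).
stop_free_fraction = (1 - len(stop_codons) / len(NNK_CODONS)) ** N_SITES

print(f"NNK codons: {len(NNK_CODONS)}, stop codons: {stop_codons}")
print(f"Amino acids encoded: {len(encoded)}")
print(f"DNA-level diversity: {dna_diversity:.3g}")
print(f"Protein-level diversity: {protein_diversity:.3g}")
print(f"Stop-free fraction: {stop_free_fraction:.3f}")
print(f"Library size / DNA-level diversity: {LIBRARY_SIZE / dna_diversity:.3f}")
```

Under these assumptions the DNA-level sequence space (32^6) is larger than the reported library, whereas the protein-level space (20^6) is smaller. This comparison alone does not establish how completely the library covered either space.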

Discussion

We have successfully developed the Repebody scaffold based on VLRs by module engineering. The present results demonstrate that the developed scaffold can be widely used for generating target-specific molecular binders for applications in biotechnology and biomedical fields by rational design and phage display selection. One of the key issues in the development of an alternative scaffold is the ease of engineering and mass production using a bacterial expression system (7, 8). Our approach of redesigning the N-terminal domain of the template scaffold based on the internalin-B cap achieved high-level soluble expression of the Repebody scaffold, up to 80 mg/L in E. coli, enabling ease of manufacturing and engineering using a bacterial expression system. This result seemed to stem from the role of the N-terminal capping motif of internalin B, which may have acted as a fast-growing nucleus to create a discrete and polarized folding pathway onto which proximal LRR modules can propagate. Approximately half of internalin B, which is composed of the N-terminal helical cap and the first three LRR modules, is likely to facilitate such a folding pathway. We tested the applicability of our approach to Toll-like receptor 4 (TLR4), which has a different modular structure. A template scaffold was first constructed by assembling seven LRR modules of the TLR4 ectodomain between the N- and C-terminal capping motifs from a hagfish VLR. However, this template scaffold was found not to be expressed at all in E. coli, and a variety of methods to express the scaffold had no effect. We redesigned the N-terminal domain of the scaffold based on the internalin-B cap using the same approach as described above. The resulting TLR4 scaffold was shown to be expressed in a soluble form in E. coli, and its expression level reached about 10 mg/L (Fig. S6), which supports the view that our approach can be used effectively for developing binding scaffolds based on repeat proteins composed of LRR modules.

The unique structural feature of repeat proteins lies in a modular architecture stemming from an assembly of homologous structural modules (repeats) in a horseshoe-shaped solenoid fold (34). The modular architecture of the Repebody scaffold allowed for variation in the number of LRR modules, and all the constructed scaffolds were also well expressed at high levels in soluble form in E. coli, showing high thermodynamic and pH stabilities. It is worth noting that the target interaction surface of the Repebody scaffold can be easily controlled by adding or deleting LRRV modules without disrupting the overall structure of the Repebody scaffold, whereas nonrepeat globular proteins have interaction surfaces of fixed size. Increases in the concave surface area and molecular mass of the Repebody scaffold per LRRV module were estimated to be 220 Å² (16) and 3 kDa, respectively, which indicates that the interaction surface of the Repebody scaffold can be effectively modulated. The melting temperature of the Repebody scaffold increased in proportion to the number of LRRV modules. A similar trend in thermodynamic stability was reported for the consensus-designed ankyrin-repeat proteins (35). It was shown that LRR modules with the framework residues assemble into a solenoid fold, forming a tight hydrophobic core that laterally stabilizes consecutive repeat modules. Hence, the number of LRRV modules appeared to be the main factor dictating the thermodynamic stability of the Repebody scaffold, whereas the internalin-B cap had a negligible effect on thermodynamic stability. The

Fig. 4. Selection of a repebody for IL-6 by phage display. (A) Sites for introducing mutations for the construction of a phage-displayed library. The numbers indicate the positions on the Repebody scaffold. (B) Sequence conservation of 82 repebody clones shown as sequence logos. The height of individual letters indicates the frequency of the recovered amino acid at the specified position. The numbers indicate the positions on each repeat module. (C) Specificity of the selected repebodies by phage ELISA. Purified proteins were coated on a 96-well plate and sequentially reacted with purified phage and HRP-conjugated anti-M13 monoclonal antibody, and absorbance was measured at 450 nm using a plate reader. In the case of a competitive assay, soluble interleukin-6 (sIL-6) was added. Error bars indicate the deviation in triplicate experiments. (D) Isothermal titration calorimetry data for IL-6 binding to the selected repebodies. (E) Amino acid sequences and KD values from ITC of the selected repebodies.

Lee et al. PNAS ∣ February 28, 2012 ∣ vol. 109 ∣ no. 9 ∣ 3303


high stability of the Repebody scaffold against pH is likely to be closely related to its high thermodynamic stability.

We have shown the ease of molecular engineering and the general applicability of the Repebody scaffold by creating molecular binders for three defined targets by rational design and phage display selection. The designed repebodies for MD2 and HEL had binding affinities of 388 nM and 3.7 μM, respectively, displaying unique interaction interfaces for the respective targets. The crystal structures of the MD2- and HEL-repebodies indicated that they retain a characteristic horseshoe-shaped fold despite numerous mutations on the concave surface, presenting a modular architecture and rigid backbone structure. A diverse phage-displayed library could be constructed by introducing mutations specifically into six hypervariable sites on two repeat modules. The use of phage display enabled a successful selection of repebodies for IL-6 with KD values ranging from 48 to 117 nM. The selected repebodies were shown to be highly specific for IL-6, displaying negligible cross-reactivities, which seems to stem from the inherent role of VLRs in the adaptive immune system. The modularity of the Repebody scaffold allowed variations in the number of repeat modules as well as in amino acid residues on individual modules. Thus, the interacting surface of the Repebody scaffold for a target can be easily modulated by changing the number of repeat modules to be mutated for library construction. It has been suggested that proteins with a large flat surface and rigid structure offer a distinct advantage in the design of molecular binders for a variety of targets, partly because they induce rigid-body interactions and consequently a low loss of entropy upon binding (5). With a modular architecture and rigid backbone structure, the Repebody scaffold offers distinct advantages over globular proteins in creating target-specific molecular binders by rational and library-based approaches.
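The affinities quoted above span roughly two orders of magnitude, and they can be compared on a common energy scale through the standard relation ΔG° = RT ln(KD). The sketch below is illustrative only: it assumes a temperature of 298.15 K, which is not stated in this section, and a 1 M standard state; the function and dictionary names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ mol^-1 K^-1
T_ASSUMED = 298.15      # assumed temperature in K (25 °C); not given in the text


def delta_g_kj_per_mol(kd_molar, temperature=T_ASSUMED):
    """Standard binding free energy from a dissociation constant.

    Uses dG = R * T * ln(KD / 1 M); a tighter binder gives a more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KJ * temperature * math.log(kd_molar)


# KD values reported in the text, converted to molar units.
KD_VALUES = {
    "MD2-repebody / MD2": 388e-9,
    "HEL-repebody / HEL": 3.7e-6,
    "IL-6 repebody (tightest)": 48e-9,
    "IL-6 repebody (weakest)": 117e-9,
}

for name, kd in KD_VALUES.items():
    print(f"{name}: KD = {kd:.3g} M, dG = {delta_g_kj_per_mol(kd):.1f} kJ/mol")
```

Because ΔG° depends on the logarithm of KD, the roughly 77-fold difference between the tightest IL-6 binder and the HEL-repebody corresponds to a free energy difference of about 11 kJ/mol at the assumed temperature.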

In conclusion, the present results demonstrate a successful development of the Repebody scaffold based on VLRs by module engineering as an alternative to immunoglobulin antibodies. With unique biophysical and structural features, the Repebody scaffold can broadly be used for generating molecular binders for therapeutic purposes as well as for applications in diagnostics such as protein chips, bioimaging, and immunoassays by rational design and library-based approaches. In addition, a repebody with high affinity and specificity for a target is expected to be applied to affinity purification, due to its high thermodynamic and pH stabilities.

Materials and Methods

The genes encoding various Repebody scaffolds were synthesized after codon optimization for E. coli (Genscript), and their sequences are listed in Table S1. The model structure of the template scaffold was obtained by homology modeling using Modeller9v4 software (32). The crystal structure of a VLR (PDB ID code 2O6S) was used as a template. We designed molecular binders for MD2 and hen egg lysozyme (HEL) as the protein targets based on the complex crystal structures of TV3/MD2 (PDB ID code 2Z65) and VLR/HEL (PDB ID code 3G3A), respectively. Binding affinities of repebodies were determined by surface plasmon resonance (SPR) analysis using a Biacore 3000 system (GE Healthcare) and isothermal titration calorimetry (ITC) (iTC200 system, Microcal). A cell-based assay for the LPS-induced immune response was performed using the macrophage-like cell line THP-1 (TIB-202™, ATCC). Phage display selection was carried out as described elsewhere (36). A repebody library was constructed by introducing random mutations specifically into hypervariable sites on repeat modules of the Repebody scaffold using primers with the NNK degenerate codon. Detailed experimental procedures are described in SI Materials and Methods.

ACKNOWLEDGMENTS. We thank Zeev Pancer for helpful discussion. This research was supported by Pioneer Research Program for Converging Technology (20110001745), Advanced Biomass R and D Center (Grant ABC-2011-0031363), Brain Korea 21, and Basic Research Lab (2009-0086964) of Ministry of Education, Science, and Technology. The support from the high-field Nuclear Magnetic Resonance Research Program of Korea Basic Science Institute is also appreciated (Y.H.J.).

1. Andrade MA, Perez-Iratxeta C, Ponting CP (2001) Protein repeats: Structures, functions, and evolution. J Struct Biol 134:117–131.

2. Binz HK, Stumpp MT, Forrer P, Amstutz P, Pluckthun A (2003) Designing repeat proteins: Well-expressed, soluble and stable proteins from combinatorial libraries of consensus ankyrin repeat proteins. J Mol Biol 332:489–503.

3. Binz H, et al. (2004) High-affinity binders selected from designed ankyrin repeat protein libraries. Nat Biotechnol 22:575–582.

4. Main ERG, et al. (2005) A recent theme in protein engineering: The design, stability and folding of repeat proteins. Curr Opin Struct Biol 15:464–471.

5. Skerra A (2007) Alternative non-antibody scaffolds for molecular recognition. Curr Opin Biotechnol 18:295–304.

6. Werner RG (2004) Economic aspects of commercial manufacture of biopharmaceuticals. J Biotechnol 113:171–182.

7. Gebauer M, Skerra A (2009) Engineered protein scaffolds as next-generation antibody therapeutics. Curr Opin Chem Biol 13:245–255.

8. Binz HK, Amstutz P, Pluckthun A (2005) Engineering novel binding proteins from non-immunoglobulin domains. Nat Biotechnol 23:1257–1268.

9. Pancer Z, et al. (2004) Somatic diversification of variable lymphocyte receptors in the agnathan sea lamprey. Nature 430:174–180.

10. Alder MN, et al. (2005) Diversity and function of adaptive immune receptors in a jawless vertebrate. Science 310:1970–1973.

11. Saha NR, Smith J, Amemiya CT (2010) Evolution of adaptive immune recognition in jawless vertebrates. Semin Immunol 22:25–33.

12. Enkhbayar P, Kamiya M, Osaki M, Matsumoto T, Matsushima N (2004) Structural principles of leucine-rich repeat (LRR) proteins. Proteins 54:394–403.

13. Alder MN, et al. (2008) Antibody responses of variable lymphocyte receptors in the lamprey. Nat Immunol 9:319–327.

14. Herrin BR, et al. (2008) Structure and specificity of lamprey monoclonal antibodies. Proc Natl Acad Sci USA 105:2040–2045.

15. Tasumi S, et al. (2009) High-affinity lamprey VLRA and VLRB monoclonal antibodies. Proc Natl Acad Sci USA 106:12891–12896.

16. Han BW, Herrin BR, Cooper MD, Wilson IA (2008) Antigen recognition by variable lymphocyte receptors. Science 321:1834–1837.

17. Velikovsky CA, et al. (2009) Structure of a lamprey variable lymphocyte receptor in complex with a protein antigen. Nat Struct Biol 16:725–744.

18. Deng L, et al. (2010) A structural basis for antigen recognition by the T cell-like lymphocytes of sea lamprey. Proc Natl Acad Sci USA 107:13408–13413.

19. Kajander T, Cortajarena AL, Regan L (2006) Consensus design as a tool for engineering repeat proteins. Methods Mol Biol 340:151–170.

20. Main ERG, Xiong Y, Cocco MJ, D’Andrea L, Regan L (2003) Design of stable alpha-helical arrays from an idealized TPR motif. Structure 11:497–508.

21. Mosavi LK, Minor DL, Peng ZY (2002) Consensus-derived structural determinants of the ankyrin repeat motif. Proc Natl Acad Sci USA 99:16029–16034.

22. Stumpp MT, Forrer P, Binz HK, Pluckthun A (2003) Designing repeat proteins: Modular leucine-rich repeat protein libraries based on the mammalian ribonuclease inhibitor family. J Mol Biol 332:471–487.

23. Apweiler R, et al. (2010) The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res 38:D142–D148.

24. Pruitt KD, Tatusova T, Maglott DR (2007) NCBI reference sequences (RefSeq): A curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 35:D61–D65.

25. Courtemanche N, Barrick D (2008) The leucine-rich repeat domain of internalin B folds along a polarized N-terminal pathway. Structure 16:705–714.

26. Marino M, Braun L, Cossart P, Ghosh P (1999) Structure of the InlB leucine-rich repeats, a domain that triggers host cell invasion by the bacterial pathogen L-monocytogenes. Mol Cell 4:1063–1072.

27. Liu Y, Kuhlman B (2006) RosettaDesign server for protein design. Nucleic Acids Res 34:W235–W238.

28. Medzhitov R (2001) Toll-like receptors and innate immunity. Nat Rev Immunol 1:135–145.

29. Kumar H, Kawai T, Akira S (2009) Toll-like receptors and innate immunity. Biochem Biophys Res Commun 388:621–625.

30. Jung K, et al. (2009) Toll-like receptor 4 decoy, TOY, attenuates gram-negative bacterial sepsis. PLoS One 4:e7403.

31. Kim HM, et al. (2007) Crystal structure of the TLR4-MD-2 complex with bound endotoxin antagonist eritoran. Cell 130:906–917.

32. Fiser A, Do RKG, Sali A (2000) Modeling of loops in protein structures. Protein Sci 9:1753–1773.

33. Nishimoto N, Kishimoto T (2006) Interleukin 6: From bench to bedside. Nat Clin Pract Rheumatol 2:619–626.

34. Kobe B, Kajava AV (2000) When protein folding is simplified to protein coiling: The continuum of solenoid protein structures. Trends Biochem Sci 25:509–515.

35. Wetzel SK, Settanni G, Kenig M, Binz HK, Pluckthun A (2008) Folding and unfolding mechanism of highly stable full-consensus ankyrin repeat proteins. J Mol Biol 376:241–257.

36. Lee CMY, Iorno N, Sierro F, Christ D (2007) Selection of human antibody fragments by phage display. Nat Protoc 2:3001–3008.

