14
1 The sequence of human ACE2 is suboptimal for binding 1 the S spike protein of SARS coronavirus 2 2 Erik Procko 3 Department of Biochemistry, University of Illinois, Urbana IL 61801 4 Email: [email protected] 5 SUMMARY. The rapid and escalating spread of SARS coronavirus 2 (SARS-CoV-2) 6 poses an immediate public health emergency. The viral spike protein S binds ACE2 7 on host cells to initiate molecular events that release the viral genome 8 intracellularly. Soluble ACE2 inhibits entry of both SARS and SARS-2 coronaviruses 9 by acting as a decoy for S binding sites, and is a candidate for therapeutic and 10 prophylactic development. Using deep mutagenesis, variants of ACE2 are identified 11 with increased binding to the receptor binding domain of S. Mutations are found 12 across the interface, in the N90-glycosylation motif, and at buried sites where they 13 are predicted to enhance folding and presentation of the interaction epitope. When 14 single substitutions are combined, large increases in binding can be achieved. The 15 mutational landscape offers a blueprint for engineering high affinity proteins and 16 peptides that block receptor binding sites on S to meet this unprecedented 17 challenge. 18 In December, 2019, a novel zoonotic betacoronavirus closely related to bat coronaviruses 19 spilled over to humans at the Huanan Seafood Market in the Chinese city of Wuhan (1, 2). 20 The virus, called SARS-CoV-2 due to its similarities with the severe acute respiratory 21 syndrome (SARS) coronavirus responsible for a smaller outbreak nearly two decades prior 22 (3, 4), has since spread human-to-human rapidly across the world, precipitating 23 extraordinary containment measures from governments (5). Stock markets have fallen, 24 travel restrictions have been imposed, public gatherings canceled, and large numbers of 25 people are self-isolating. These events are unlike any experienced in generations. 26 Symptoms of coronavirus disease 2019 (COVID-19) range from mild to dry cough, fever, 27 pneumonia and death, and SARS-CoV-2 is devastating among the elderly and other 28 vulnerable groups (6, 7). 29 The S spike glycoprotein of SARS-CoV-2 binds angiotensin-converting enzyme 2 (ACE2) on 30 host cells (2, 8-13). S is a trimeric class I viral fusion protein that is proteolytically 31 processed into S1 and S2 subunits that remain noncovalently associated in a prefusion 32 state (8, 11, 14). Upon engagement of ACE2 by a receptor binding domain (RBD) in S1 (15), 33 conformational rearrangements occur that cause S1 shedding, cleavage of S2 by host 34 proteases, and exposure of a fusion peptide adjacent to the S2' proteolysis site (14, 16-18). 35 Favorable folding of S to a post-fusion conformation is coupled to host cell/virus 36 membrane fusion and cytosolic release of viral RNA. Atomic contacts with the RBD are 37 restricted to the protease domain of ACE2 (19, 20), and soluble ACE2 (sACE2) in which the 38 neck and transmembrane domains are removed is sufficient for binding S and neutralizing 39 infection (12, 21-24). In principle, the virus has limited potential to escape sACE2- 40 . CC-BY-NC-ND 4.0 International license was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprint (which this version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236 doi: bioRxiv preprint

The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

1

ThesequenceofhumanACE2issuboptimalforbinding1

theSspikeproteinofSARScoronavirus22

ErikProcko3

DepartmentofBiochemistry,UniversityofIllinois,UrbanaIL618014Email:[email protected]

SUMMARY. The rapid and escalating spread of SARS coronavirus 2 (SARS-CoV-2)6posesanimmediatepublichealthemergency.TheviralspikeproteinSbindsACE27on host cells to initiate molecular events that release the viral genome8intracellularly. SolubleACE2inhibitsentryofbothSARSandSARS-2coronaviruses9by acting as a decoy for S binding sites, and is a candidate for therapeutic and10prophylacticdevelopment.Usingdeepmutagenesis,variantsofACE2areidentified11with increased binding to the receptor binding domain of S. Mutations are found12across the interface, in theN90-glycosylationmotif,andatburiedsiteswhere they13arepredictedtoenhancefoldingandpresentationoftheinteractionepitope.When14single substitutionsare combined, large increases inbinding canbeachieved.The15mutational landscape offers a blueprint for engineering high affinity proteins and16peptides that block receptor binding sites on S to meet this unprecedented17challenge.18

InDecember,2019,anovelzoonoticbetacoronaviruscloselyrelatedtobatcoronaviruses19spilledovertohumansattheHuananSeafoodMarketintheChinesecityofWuhan(1,2).20The virus, called SARS-CoV-2 due to its similarities with the severe acute respiratory21syndrome(SARS)coronavirusresponsibleforasmalleroutbreaknearlytwodecadesprior22(3, 4), has since spread human-to-human rapidly across the world, precipitating23extraordinary containmentmeasures from governments (5). Stockmarkets have fallen,24travel restrictions have been imposed, public gatherings canceled, and large numbers of25people are self-isolating. These events are unlike any experienced in generations.26Symptomsof coronavirusdisease2019 (COVID-19) range frommild todry cough, fever,27pneumonia and death, and SARS-CoV-2 is devastating among the elderly and other28vulnerablegroups(6,7).29

TheSspikeglycoproteinofSARS-CoV-2bindsangiotensin-convertingenzyme2(ACE2)on30host cells (2, 8-13). S is a trimeric class I viral fusion protein that is proteolytically31processed into S1 and S2 subunits that remain noncovalently associated in a prefusion32state(8,11,14).UponengagementofACE2byareceptorbindingdomain(RBD)inS1(15),33conformational rearrangements occur that cause S1 shedding, cleavage of S2 by host34proteases,andexposureofafusionpeptideadjacenttotheS2'proteolysissite(14,16-18).35Favorable folding of S to a post-fusion conformation is coupled to host cell/virus36membrane fusion and cytosolic release of viral RNA. Atomic contactswith the RBD are37restrictedtotheproteasedomainofACE2(19,20),andsolubleACE2(sACE2)inwhichthe38neckandtransmembranedomainsareremovedissufficientforbindingSandneutralizing39infection (12, 21-24). In principle, the virus has limited potential to escape sACE2-40

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 2: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

2

mediated neutralization without simultaneously decreasing affinity for native ACE241receptors,therebyattenuatingvirulence.Furthermore,fusionofsACE2totheFcregionof42human immunoglobulin can provide an avidity boost while recruiting immune effector43functions and increasing serum stability, an especially desirable quality if intended for44prophylaxis (23, 25), and sACE2 has proven safe in healthy human subjects (26) and45patientswith lungdisease(27). RecombinantsACE2hasnowbeenrushed intoaclinical46trial forCOVID-19 inGuangdongprovince,China (Clinicaltrials.gov#NCT04287686), and47peptidederivativesofACE2arealsobeingexploredascellentryinhibitors(28).48

SincehumanACE2hasnot evolved to recognize SARS-CoV-2 S, itwashypothesized that49mutationsmaybefoundthatincreaseaffinityfortherapeuticanddiagnosticapplications.50The coding sequence of full length ACE2 with an N-terminal c-myc epitope tag was51diversifiedtocreatealibrarycontainingallpossiblesingleaminoacidsubstitutionsat11752sitesspanningtheentireinterfacewithSandliningthesubstrate-bindingcavity.Sbinding53isindependentofACE2catalyticactivity(23)andoccursontheoutersurfaceofACE2(19,5420), whereas angiotensin substrates bindwithin a deep cleft that houses the active site55(29).Substitutionswithinthesubstrate-bindingcleftofACE2thereforeactascontrolsthat56areanticipatedtohaveminimalimpactonSinteractions,yetmaybeusefulforengineering57out substrate affinity to enhance in vivo safety. It is important to note though that58catalyticallyactiveproteinmayhavedesirableeffectsforreplenishinglostACE2activityin59COVID-19patientsinrespiratorydistress(30,31).60

TheACE2librarywastransientlyexpressedinhumanExpi293Fcellsunderconditionsthat61typically yield nomore than one coding variant per cell, providing a tight link between62genotypeandphenotype(32,33).Cellswerethenincubatedwithasubsaturatingdilution63of medium containing the RBD of SARS-CoV-2 fused C-terminally to superfolder GFP64(sfGFP:(34))(Fig.1A).LevelsofboundRBD-sfGFPcorrelatewithsurfaceexpressionlevels65ofmyc-taggedACE2measuredbydualcolorflowcytometry.Comparedtocellsexpressing66wildtypeACE2(Fig.1C),manyvariants intheACE2 library fail tobindRBD,while there67appeared tobea smallernumberofACE2variantswithhigherbindingsignals (Fig.1D).68Cells expressing ACE2 variants with high or low binding to RBD were collected by69fluorescence-activatedcellsorting(FACS),referredtoas"nCoV-S-High"and"nCoV-S-Low"70sorted populations, respectively. During FACS, fluorescence signal for boundRBD-sfGFP71continuouslydeclined,requiringthecollectiongatestoberegularlyupdatedto'chase'the72relevant populations. This is consistent with RBD dissociating over hours during the73experiment.ReportedaffinitiesofRBDforACE2rangefrom1to15nM(8,10).74

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 3: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

3

75

Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-76CoV-2 S. 77(A) Media from Expi293F cells secreting the SARS-CoV-2 RBD fused to sfGFP was collected 78and incubated at different dilutions with Expi293F cells expressing myc-tagged ACE2. Bound 79RBD-sfGFP was measured by flow cytometry. The dilutions of RBD-sfGFP-containing medium 80used for FACS selections are indicated by arrows. 81(B-C) Expi293F cells were transfected with wild type ACE2 plasmid diluted with a large excess 82of carrier DNA. It has been previously shown that under these conditions, cells typically acquire 83no more than one coding plasmid and most cells are negative. Cells were incubated with RBD-84sfGFP-containing medium and co-stained with fluorescent anti-myc to detect surface ACE2 by 85flow cytometry. During analysis, the top 67% (magenta gate) were chosen from the ACE2-86positive population (purple gate) (B). Bound RBD was subsequently measured relative to 87surface ACE2 expression (C). 88(D) Expi293F cells were transfected with an ACE2 single site-saturation mutagenesis library and 89analyzed as in B. During FACS, the top 15% of cells with bound RBD relative to ACE2 90expression were collected (nCoV-S-High sort, green gate) and the bottom 20% were collected 91separately (nCoV-S-Low sort, blue gate). 92

Figure 2. A mutational landscape of ACE2 for high binding signal to the RBD of SARS-93CoV-2 S. 94Log2 enrichment ratios from the nCoV-S-High sorts are plotted from ≤ -3 (i.e. 95depleted/deleterious, orange) to neutral (white) to ≥ +3 (i.e. enriched, dark blue). ACE2 primary 96structure is on the vertical axis, amino acid substitutions are on the horizontal axis. *, stop 97codon. 98

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 4: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

4

99

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 5: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

5

Transcripts in the sorted populationswere deep sequenced, and frequencies of variants100werecomparedtothenaiveplasmidlibrarytocalculatetheenrichmentordepletionofall1012,340 coding mutations in the library (Fig. 2). This approach of tracking an in vitro102selectionorevolutionbydeepsequencingisknownasdeepmutagenesis(35).Enrichment103ratios (Fig. 3A and 3B) and residue conservation scores (Fig. 3D and 3E) closely agree104between two independent sort experiments, giving confidence in thedata. For themost105part,enrichmentratios(Fig.3C)andconservationscores(Fig.3F)inthenCoV-S-Highsorts106are anticorrelatedwith thenCoV-S-Low sorts,with the exception of nonsensemutations107whichwereappropriatelydepletedfrombothgates. Thisindicatesthatmost,butnotall,108nonsynonymousmutations in ACE2 did not eliminate surface expression. The library is109biasedtowardssolvent-exposedresiduesandhasfewsubstitutionsofburiedhydrophobics110thatmighthavebiggereffectsonplasmamembranetrafficking(33).111

112

Figure 3. Data from independent replicates show close agreement. 113(A-B) Log2 enrichment ratios for ACE2 mutations in the nCoV-S-High (A) and nCoV-S-Low (B) 114sorts closely agree between two independent FACS experiments. Nonsynonymous mutations 115are black, nonsense mutations are red. Replicate 1 used a 1/40 dilution and replicate 2 used a 1161/20 dilution of RBD-sfGFP-containing medium. R2 values are for nonsynonymous mutations. 117(C) Average log2 enrichment ratios tend to be anticorrelated between the nCoV-S-High and 118nCoV-S-Low sorts. Nonsense mutations (red) and a small number of nonsynonymous 119mutations (black) are not expressed at the plasma membrane and are depleted from both sort 120populations (i.e. fall below the diagonal). 121(D-F) Correlation plots of residue conservation scores from replicate nCoV-S-High (D) and 122nCoV-S-Low (E) sorts, and from the averaged data from both nCoV-S-High sorts compared to 123

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 6: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

6

both nCoV-S-Low sorts (F). Conservation scores are calculated from the mean of the log2 124enrichment ratios for all amino acid substitutions at each residue position. 125

MappingtheexperimentalconservationscoresfromthenCoV-S-Highsortstothestructure126ofRBD-boundACE2(19)showsthatresiduesburiedintheinterfacetendtobeconserved,127whereas residues at the interface periphery or in the substrate-binding cleft are128mutationallytolerant(Fig.4A).TheregionofACE2surroundingtheC-terminalendofthe129ACE2α1helixandβ3-β4strandshasaweaktoleranceofpolarresidues,whileaminoacids130at theN-terminal endofα1and theC-terminal endofα2preferhydrophobics (Fig.4B),131likely in part to preserve hydrophobic packing betweenα1-α2. These discrete patches132contacttheglobularRBDfoldandalongprotrudingloopoftheRBD,respectively.133

TwoACE2residues,N90andT92thattogetherformaconsensusN-glycosylationmotif,are134notablehotspotsforenrichedmutations(Fig.2and4A). Indeed,allsubstitutionsofN90135andT92,withtheexceptionofT92SwhichmaintainstheN-glycan,arehighlyfavorablefor136RBDbinding,andtheN90-glycanisthuspredictedtopartiallyhinderS/ACE2interaction.137

138

Figure 4. Sequence preferences of ACE2 residues for high binding to the RBD of SARS-139CoV-2 S. 140

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 7: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

7

(A) Conservation scores from the nCoV-S-High sorts are mapped to the cryo-EM structure 141(PDB 6M17) of RBD (pale green ribbon) bound ACE2 (surface). The view at left is looking 142down the substrate-binding cavity, and only a single protease domain is shown for clarity. 143Residues conserved for high RBD binding are orange; mutationally tolerant residues are pale 144colors; residues that are hot spots for enriched mutations are blue; and residues maintained as 145wild type in the ACE2 library are grey. Glycans are dark red sticks. 146(B) Average hydrophobicity-weighted enrichment ratios are mapped to the RBD-bound ACE2 147structure, with residues tolerant of polar substitutions in blue, while residues that prefer 148hydrophobic amino acids are yellow. 149(C) A magnified view of part of the ACE2 (colored by conservation score as in A) / RBD (pale 150green) interface. Accompanying heatmap plots log2 enrichment ratios from the nCoV-S-High 151sort for substitutions of ACE2-T27, D30 and K31 from ≤ -3 (depleted) in orange to ≥ +3 152(enriched) in dark blue. 153

Mining thedata identifiesmanyACE2mutations that areenriched forRBDbinding. For154instance,thereare122mutationsto35positionsinthelibrarythathavelog2enrichment155ratios>1.5 in thenCoV-S-High sort. At least adozenACE2mutations at the structurally156characterized interface enhance RBD binding, and will be useful for engineering highly157specific and tight binders of SARS-CoV-2 S. Themolecular basis for how some of these158mutations enhance RBD binding can be rationalized from the RBD-bound cryo-EM159structure(Fig.4C):hydrophobicsubstitutionsofACE2-T27increasehydrophobicpacking160witharomaticresiduesofS,ACE2-D30EextendsanacidicsidechaintoreachS-K417,and161aromatic substitutions of ACE2-K31 contribute to an interfacial cluster of aromatics.162However,engineeredACE2receptorswithmutationsattheinterfacemaypresentbinding163epitopes that are sufficiently different from native ACE2 that virus escape mutants can164emerge, or theymay be strain specific and lack breadth to bind S in future coronavirus165outbreaks.166

Instead, attention was drawn to mutations in the second shell and farther from the167interface that do not directly contact S but instead have putative structural roles. For168example,prolinesubstitutionswereenrichedatfivelibrarypositions(S19,L91,T92,T324169andQ325)where theymightentropically stabilize the first turnsofhelices. Prolinewas170also enriched atH34,where itmay enforce the central bulge inα1. Multiplemutations171were also enriched at buried positionswhere theywill change local packing (e.g. A25V,172L29F, W69V, F72Y and L351F). The selection of ACE2 variants for high binding signal173thereforenotonlyreportsonaffinity,butalsoonpresentationatthemembraneoffolded174structurerecognizedbySARS-CoV-2S. Thepresenceofenrichedstructuralmutations in175the sequence landscape is especially notable considering the ACE2 library was biased176towardssolvent-exposedpositions.177

Thirty single substitutions highly enriched in the nCoV-S-High sort were validated by178targetedmutagenesis(Fig.5).BindingofRBD-sfGFPtofulllengthACE2mutantsincreased179compared to wild type, yet improvements were small andweremost apparent on cells180expressing low ACE2 levels (Fig. 5A). To rapidly assess mutations in a format more181relevant to therapeutic development, the soluble ACE2 protease domain was fused to182sfGFP. Expression levels of sACE2-sfGFPwerequalitatively evaluatedby fluorescenceof183thetransfectedcultures(Fig.6A),andbindingofsACE2-sfGFPtofulllengthSexpressedat184

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 8: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

8

the plasmamembrane wasmeasured by flow cytometry (Fig. 6B). A single substitution185(T92Q) that eliminates theN90 glycan gave a small increase in binding signal (Fig. 6B).186FocusingonthemosthighlyenrichedsubstitutionsintheselectionforSbindingthatwere187alsospatiallysegregatedtominimizenegativeepistasis(36),combinationsofmutationsin188sACE2 gave large increases in S binding (Table 1 and Fig. 6B). While this assay only189providesrelativedifferences,thecombinatorialmutantshaveenhancedbindingbyatleast190an order of magnitude, which would give KDs in the picomolar range based on the191publishedKDofwildtypeACE2withS(8,10).Othercombinationsofmutationsmayhave192even greater effects. While monomeric sACE2 has been extensively studied in vivo193(includinginman)andisthemostlikelytoadvancefirsttoclinicaluse,fusionsofsACE2to194FcofIgGorIgAclassesmayprovideadditionalbenefitsofavidity,recruitmentofimmune195effector functions, andsecretion in to the respiratorymucosa.Plasmids for these fusions196aredepositedwithAddgene.197

198

Figure 5. Single amino acid substitutions in ACE2 predicted from the deep mutational 199scan to increase RBD binding have small effects. 200(A) Expi293F cells expressing full length ACE2 were stained with RBD-sfGFP-containing 201medium and analyzed by flow cytometry. Data are compared between wild type ACE2 (black) 202and a single mutant (L79T, red). Increased RBD binding is most discernable in cells expressing 203low levels of ACE2 (blue gate). 204(B) RBD-sfGFP binding was measured for 30 single amino acid substitutions in ACE2. Data are 205GFP mean fluorescence in the low expression gate (blue box in panel A) with background 206fluorescence subtracted. 207

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 9: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

9

208

Table1.CombinatorialmutantsofsACE2.Variant MutationssACE2.v1 H34A,T92Q,Q325P,A386LsACE2.v2 T27Y,L79T,N330Y,A386LsACE2.v3 A25V,T27Y,T92Q,Q325P,A386LsACE2.v4 H34A,L79T,N330Y,A386LsACE2.v5 A25V,T92Q,A386LsACE2.v6 T27Y,Q42L,L79T,T92Q,Q325P,N330Y,A386L

209

210

Figure 6. Engineered sACE2 with enhanced binding to S. 211(A) Expression of sACE2-sfGFP mutants was qualitatively evaluated by fluorescence of the 212transfected cell cultures. 213(B) Cells expressing full length S were stained with dilutions of sACE2-sfGFP-containing media 214and binding was analyzed by flow cytometry. 215

While deep mutagenesis of viral proteins in replicating viruses has been extensively216pursued to understand escape mechanisms from drugs and antibodies, the work here217

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 10: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

10

shows how deepmutagenesis can be directly applicable to therapeutic designwhen the218selectionmethodisdecoupledfromvirusreplicationandfocusedonhostfactors.219

METHODS220

Plasmids.Thematurepolypeptide(a.a.19-805)ofhumanACE2(GenBankNM_021804.1)221was cloned in to theNheI-XhoI sites of pCEP4 (Invitrogen)with aN-terminalHA leader222(MKTIIALSYIFCLVFA),myc-tag, and linker (GSPGGA). Soluble sACE2 fused to superfolder223GFP(34)wasconstructedbygeneticallyjoiningtheproteasedomain(a.a.1-615)ofACE2224tosfGFP(GenBankASL68970)viaagly/ser-richlinker(GSGGSGSGG),andpastingbetween225theNheI-XhoIsitesofpcDNA3.1(+)(Invitrogen).Asynthetichumancodon-optimizedgene226fragment (Integrated DNA Technologies) for the RBD (a.a. 333-529) of SARS-CoV-2 S227(GenBankYP_009724390.1)wasN-terminallyfusedtoaHAleaderandC-terminallyfused228to superfolder GFP (34) and ligated in to the NheI-XhoI sites of pcDNA3.1(+). Human229codon-optimized full length S (a.a. 16-1273) was subcloned from pUC57-2019-nCoV-230S(Human)(MolecularCloud)withaN-terminalHAleader(MKTIIALSYIFCLVFA),myc-tag,231andlinker(GSPGGA).232

Tissue Culture. Expi293F cells (ThermoFisher) were cultured in Expi293 Expression233Medium(ThermoFisher)at125rpm,8%CO2,37 °C. ForproductionofRBD-sfGFP,cells234werepreparedto2×106/ml.Permlofculture,500ngofpcDNA3-RBD-sfGFPand3µgof235polyethylenimine (MW25,000; Polysciences)weremixed in 100µl ofOptiMEM (Gibco),236incubatedfor20minutesatroomtemperature,andaddedtocells.TransfectionEnhancers237(ThermoFisher)wereadded19hpost-transfection,andcellswereculturedfor110h.Cells238wereremovedbycentrifugationat800×gfor5minutesandmediumwasstoredat-20°C.239After thawing and immediately prior to use, remaining cell debris andprecipitateswere240removedby centrifugation at 20,000× g for 5minutes. SolubleACE2-sfGFPproteinwas241producedby thesameprotocolwith the followingmodifications:TransfectionEnhancers242wereadded22-1/2hpost-transfection,andmediumsupernatantwasharvestedafter60h.243

Deepmutagenesis.117residueswithintheproteasedomainofACE2werediversifiedby244overlap extension PCR (37) using primers with degenerate NNK codons. The plasmid245library was transfected in to Expi293F cells using Expifectamine under conditions246previouslyshowntotypicallygivenomorethanasinglecodingvariantpercell(32,33);1247ngcodingplasmidwasdilutedwith1,500ngpCEP4-ΔCMVcarrierplasmidpermlofcell248cultureat2×106/ml,andthemediumwasreplaced2hpost-transfection.Thecellswere249collected after 24 h, washed with ice-cold PBS supplemented with 0.2% bovine serum250albumin(PBS-BSA),andincubatedfor30minutesonicewitha1/20(replicate1)or1/40251(replicate2)dilutionofmediumcontainingRBD-sfGFPintoPBS-BSA.Cellswereco-stained252with anti-myc Alexa 647 (clone 9B11, 1/250 dilution; Cell Signaling Technology). Cells253werewashed twicewith PBS-BSA, and sorted on a BD FACSAria II at the Roy J. Carver254Biotechnology Center. Themain cell populationwas gated by forward/side scattering to255removedebrisanddoublets,andDAPIwasaddedtothesampletoexcludedeadcells.Of256themyc-positive(Alexa647)population,thetop67%weregated(Fig.1B).Ofthese,the15257% of cells with the highest and 20% of cells with the lowest GFP fluorescence were258collected (Fig. 1D) in tubes coated overnight with fetal bovine serum and containing259

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 11: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

11

Expi293 Expression Medium. Total RNA was extracted from the collected cells using a260GeneJETRNApurificationkit(ThermoScientific),andcDNAwasreversetranscribedwith261high fidelity Accuscript (Agilent) primed with gene-specific oligonucleotides. Diversified262regions of ACE2were PCR amplified as 5 fragments. Flanking sequences on the primers263addedadapterstotheendsoftheproductsforannealingtoIlluminasequencingprimers,264uniquebarcoding,andforbindingtheflowcell.AmpliconsweresequencedonanIllumina265NovaSeq6000usinga2×250ntpairedendprotocol.DatawereanalyzedusingEnrich(38),266andcommandsareprovidedintheGEOdeposit.Briefly,thefrequenciesofACE2variantsin267thetranscriptsofthesortedpopulationswerecomparedtotheirfrequenciesinthenaive268plasmidlibrarytocalculateanenrichmentratio.269

Flow Cytometry Analysis of ACE2-S Binding. Expi293F cells were transfected with270pcDNA3-myc-ACE2orpcDNA3-myc-Splasmids(500ngDNApermlofcultureat2×106/271ml) using Expifectamine (ThermoFisher). Cells were analyzed by flow cytometry 24 h272post-transfection. To analyze binding of RBD-sfGFP to full lengthmyc-ACE2, cells were273washedwithice-coldPBS-BSA,andincubatedfor30minutesonicewitha1/30dilutionof274medium containingRBD-sfGFP and a 1/240 dilution of anti-mycAlexa 647 (clone 9B11,275CellSignalingTechnology).CellswerewashedtwicewithPBS-BSAandanalyzedonaBD276LSR II. To analyze binding of sACE2-sfGFP to full lengthmyc-S, cellswerewashedwith277PBS-BSA,andincubatedfor30minutesonicewithaserialdilutionofmediumcontaining278sACE2-sfGFP and a 1/240 dilution of anti-myc Alexa 647 (clone 9B11, Cell Signaling279Technology).CellswerewashedtwicewithPBS-BSAandanalyzedonaBDAccuriC6.Data280wereprocessedwithFCSExpress(DeNovoSoftware)281

Reagentanddataavailability.PlasmidsaredepositedwithAddgeneunderIDs141183-5282and 145145-78. Raw and processed deep sequencing data are deposited inNCBI’s Gene283ExpressionOmnibus(GEO)withseriesaccessionno.GSE147194.284

ACKNOWLEDGEMENTS.StaffattheUIUCRoyJ.CarverBiotechnologyCenterassistedwith285FACS and Illumina sequencing. Hannah Choi assisted with plasmid preparation. The286developmentofdeepmutagenesis to studyvirus-receptor interactionswassupportedby287NIHawardR01AI129719.288

CONFLICTOFINTERESTSTATEMENT.E.P.istheinventoronaprovisionalpatentfiling289bytheUniversityofIllinoiscoveringaspectsofthiswork.E.P.isacofounderofOrthogonal290Biologics,Inc.291

REFERENCES292

1. N.Zhuetal.,ANovelCoronavirusfromPatientswithPneumoniainChina,2019.N.293Engl.J.Med.382,727–733(2020).294

2. P.Zhouetal.,Apneumoniaoutbreakassociatedwithanewcoronavirusofprobable295batorigin.Nature.579,270–273(2020).296

3. J.S.M.Peirisetal.,Coronavirusasapossiblecauseofsevereacuterespiratory297syndrome.Lancet.361,1319–1325(2003).298

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 12: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

12

4. CoronaviridaeStudyGroupoftheInternationalCommitteeonTaxonomyofViruses,299ThespeciesSevereacuterespiratorysyndrome-relatedcoronavirus:classifying3002019-nCoVandnamingitSARS-CoV-2.NatMicrobiol.4,3(2020).301

5. A.Patel,D.B.Jernigan,2019-nCoVCDCResponseTeam,InitialPublicHealth302ResponseandInterimClinicalGuidanceforthe2019NovelCoronavirusOutbreak-303UnitedStates,December31,2019-February4,2020.MMWRMorb.Mortal.Wkly.Rep.30469,140–146(2020).305

6. W.Wang,J.Tang,F.Wei,Updatedunderstandingoftheoutbreakof2019novel306coronavirus(2019-nCoV)inWuhan,China.J.Med.Virol.92,441–447(2020).307

7. C.Huangetal.,Clinicalfeaturesofpatientsinfectedwith2019novelcoronavirusin308Wuhan,China.Lancet.395,497–506(2020).309

8. A.C.Wallsetal.,Structure,Function,andAntigenicityoftheSARS-CoV-2Spike310Glycoprotein.Cell(2020),doi:10.1016/j.cell.2020.02.058.311

9. Y.Wan,J.Shang,R.Graham,R.S.Baric,F.Li,Receptorrecognitionbynovel312coronavirusfromWuhan:Ananalysisbasedondecade-longstructuralstudiesof313SARS.J.Virol.(2020),doi:10.1128/JVI.00127-20.314

10. D.Wrappetal.,Cryo-EMstructureofthe2019-nCoVspikeintheprefusion315conformation.Science,eabb2507(2020).316

11. M.Hoffmannetal.,SARS-CoV-2CellEntryDependsonACE2andTMPRSS2andIs317BlockedbyaClinicallyProvenProteaseInhibitor.Cell(2020),318doi:10.1016/j.cell.2020.02.052.319

12. W.Lietal.,Angiotensin-convertingenzyme2isafunctionalreceptorfortheSARS320coronavirus.Nature.426,450–454(2003).321

13. M.Letko,A.Marzi,V.Munster,Functionalassessmentofcellentryandreceptor322usageforSARS-CoV-2andotherlineageBbetacoronaviruses.NatMicrobiol.11,1860323(2020).324

14. M.A.Tortorici,D.Veesler,Structuralinsightsintocoronavirusentry.Adv.VirusRes.325105,93–116(2019).326

15. S.K.Wong,W.Li,M.J.Moore,H.Choe,M.Farzan,A193-aminoacidfragmentofthe327SARScoronavirusSproteinefficientlybindsangiotensin-convertingenzyme2.J.Biol.328Chem.279,3197–3201(2004).329

16. I.G.Madu,S.L.Roth,S.Belouzard,G.R.Whittaker,Characterizationofahighly330conserveddomainwithinthesevereacuterespiratorysyndromecoronavirusspike331proteinS2domainwithcharacteristicsofaviralfusionpeptide.J.Virol.83,7411–3327421(2009).333

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 13: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

13

17. A.C.Wallsetal.,Tectonicconformationalchangesofacoronavirusspike334glycoproteinpromotemembranefusion.Proc.Natl.Acad.Sci.U.S.A.114,11157–33511162(2017).336

18. J.K.Millet,G.R.Whittaker,HostcellentryofMiddleEastrespiratorysyndrome337coronavirusaftertwo-step,furin-mediatedactivationofthespikeprotein.Proc.Natl.338Acad.Sci.U.S.A.111,15214–15219(2014).339

19. R.Yanetal.,StructuralbasisfortherecognitionoftheSARS-CoV-2byfull-length340humanACE2.Science,eabb2762(2020).341

20. F.Li,W.Li,M.Farzan,S.C.Harrison,StructureofSARScoronavirusspikereceptor-342bindingdomaincomplexedwithreceptor.Science.309,1864–1868(2005).343

21. H.Hofmannetal.,SusceptibilitytoSARScoronavirusSprotein-driveninfection344correlateswithexpressionofangiotensinconvertingenzyme2andinfectioncanbe345blockedbysolublereceptor.Biochem.Biophys.Res.Commun.319,1216–1221346(2004).347

22. C.Leietal.,Potentneutralizationof2019novelcoronavirusbyrecombinantACE2-Ig.348bioRxiv,2020.02.01.929976(2020).349

23. M.J.Mooreetal.,Retrovirusespseudotypedwiththesevereacuterespiratory350syndromecoronavirusspikeproteinefficientlyinfectcellsexpressingangiotensin-351convertingenzyme2.J.Virol.78,10628–10635(2004).352

24. V.Monteiletal.,InhibitionofSARS-CoV-2infectionsinengineeredhumantissues353usingclinical-gradesolublehumanACE2.Cell.DOI:10.1016/j.cell.2020.04.004,1–35428(2020).355

25. P.Liuetal.,NovelACE2-Fcchimericfusionprovideslong-lastinghypertension356controlandorganprotectioninmousemodelsofsystemicreninangiotensinsystem357activation.KidneyInt.94,114–125(2018).358

26. M.Haschkeetal.,Pharmacokineticsandpharmacodynamicsofrecombinanthuman359angiotensin-convertingenzyme2inhealthyhumansubjects.ClinPharmacokinet.52,360783–792(2013).361

27. A.Khanetal.,Apilotclinicaltrialofrecombinanthumanangiotensin-converting362enzyme2inacuterespiratorydistresssyndrome.CritCare.21,234(2017).363

28. G.Zhang,S.Pomplun,A.R.Loftis,A.Loas,B.L.Pentelute,Thefirst-in-classpeptide364bindertotheSARS-CoV-2spikeprotein.bioRxiv,2020.03.19.999318(2020).365

29. P.Towleretal.,ACE2X-raystructuresrevealalargehinge-bendingmotion366importantforinhibitorbindingandcatalysis.J.Biol.Chem.279,17996–18007367(2004).368

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint

Page 14: The sequence of human ACE2 is suboptimal for binding the S ... · 3 75 76 Figure 1. A selection strategy for ACE2 variants with high binding to the RBD of SARS-77 CoV-2 S. 78 (A)

14

30. R.L.Kruse,Therapeuticstrategiesinanoutbreakscenariototreatthenovel369coronavirusoriginatinginWuhan,China.F1000Res.9,72(2020).370

31. H.Zhang,J.M.Penninger,Y.Li,N.Zhong,A.S.Slutsky,Angiotensin-converting371enzyme2(ACE2)asaSARS-CoV-2receptor:molecularmechanismsandpotential372therapeutictarget.IntensiveCareMed.309,1864(2020).373

32. J.D.Herediaetal.,MappingInteractionSitesonHumanChemokineReceptorsby374DeepMutationalScanning.J.Immunol.200,ji1800343–3839(2018).375

33. J.Parketal.,StructuralarchitectureofadimericclassCGPCRbasedonco-trafficking376ofsweettastereceptorsubunits.JournalofBiologicalChemistry.294,4759–4774377(2019).378

34. J.-D.Pédelacq,S.Cabantous,T.Tran,T.C.Terwilliger,G.S.Waldo,Engineeringand379characterizationofasuperfoldergreenfluorescentprotein.Nat.Biotechnol.24,79–38088(2006).381

35. D.M.Fowler,S.Fields,Deepmutationalscanning:anewstyleofproteinscience.Nat.382Methods.11,801–807(2014).383

36. J.D.Heredia,J.Park,H.Choi,K.S.Gill,E.Procko,ConformationalEngineeringofHIV-3841EnvBasedonMutationalToleranceintheCD4andPG16BoundStates.J.Virol.93,385e00219–19(2019).386

37. E.Prockoetal.,Computationaldesignofaprotein-basedenzymeinhibitor.J.Mol.387Biol.425,3563–3575(2013).388

38. D.M.Fowler,C.L.Araya,W.Gerard,S.Fields,Enrich:softwareforanalysisofprotein389functionbyenrichmentanddepletionofvariants.Bioinformatics.27,3430–3431390(2011).391

392

.CC-BY-NC-ND 4.0 International licensewas not certified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (whichthis version posted April 6, 2020. . https://doi.org/10.1101/2020.03.16.994236doi: bioRxiv preprint