
UNIT 5.1 Production of Recombinant Proteins in Escherichia coli

The genetics and biochemistry of Escherichia coli are probably the best understood of any known organism. The knowledge gained in the study of E. coli biology has been applied to the development of many of today’s molecular cloning techniques. Most cloning vectors and methods utilize E. coli or its phages as a preferred host, primarily because of the ease with which the bacterium can be grown and genetically manipulated. These same characteristics made E. coli an attractive early choice as a host for the production of large quantities of protein encoded by cloned genes. Aside from its well-studied biology, E. coli is suitable as the basis of an expression system because of its rapid doubling time and its ability to grow in inexpensive media. Years of study devoted to gene expression in E. coli have provided numerous choices for transcriptional and translational control elements that can be applied to the expression of foreign genes (UNITS 5.2 & 5.3). As a result, E. coli has been and continues to be the expression system of choice, and a substantial body of literature has accumulated on the successful expression of foreign genes in this host. Several problems with protein expression in E. coli have been encountered, and many have been ultimately solved. This unit describes methods that have been developed for production of recombinant proteins in E. coli and potential pitfalls that may be encountered.

GENERAL STRATEGIES FOR GENE EXPRESSION IN E. COLI

The basic approach used to express foreign genes in E. coli begins with insertion of the gene into an expression vector, usually a plasmid (obtained from a commercial supplier or from the author of a published study). The next step involves transforming a suitable E. coli host strain with the plasmid, for example, by electroporation (UNIT 5.2). Transformed cells are then subjected to evaluation of plasmid stability and foreign protein expression following induction of the controllable transcriptional promoter present in the expression vector (UNIT 5.2). Once small-scale shaker flask experiments have identified successful expression systems (UNIT 5.2), the transformed E. coli strain can be used in large-scale fermentation systems (UNIT 5.3), which can be designed to produce recombinant proteins with altered amino acids (e.g., selenomethionine in place of methionine) or stable isotope tags to facilitate structural studies. Production is followed by protein purification (Chapter 6) and characterization (Chapter 7).

If the gene to be expressed in E. coli is of eukaryotic origin, it must be a cDNA copy, as E. coli will not recognize and splice out introns from transcripts of genomic copies of eukaryotic genes. Typical E. coli expression vectors contain the following elements.

Selectable Marker

Expression plasmids contain sequences encoding a selectable marker to ensure maintenance of the vector in the host cell. Commonly used selectable markers in E. coli include bla (which encodes β-lactamase and confers resistance to ampicillin and other β-lactam antibiotics), cat (which encodes chloramphenicol acetyltransferase and confers resistance to chloramphenicol), and tet (which encodes a membrane protein that confers resistance to tetracycline).

Origin of Replication

Replication of a plasmid as an independent extrachromosomal element is controlled by its origin of replication. Some plasmids are under so-called stringent control, wherein the plasmid’s replication is coupled to replication of the host chromosome. As such, only one or at most a few copies of the plasmid are maintained in each cell. More commonly, E. coli vectors used for expression of cloned genes utilize relaxed control such as the ColE1 replicon found in pBR322 and derivatives. These plasmids generally have copy numbers of 10 to 200 plasmids per cell. The presence of multiple copies of the gene and its associated control elements can provide advantages in mRNA production and, as a result, may be reflected in an increase in the total amount of protein that accumulates. Nevertheless, the reduced copy number of stringent replicons may sometimes be advantageous: it allows for modulation of gene dosage, which can sometimes be helpful for altering the kinetics of expression or reducing the basal transcriptional levels of the plasmid-encoded promoter.

Contributed by Edward R. LaVallie
Current Protocols in Protein Science (1995) 5.1.1-5.1.8
Copyright © 2000 by John Wiley & Sons, Inc.


Promoters

An essential component of expression vectors is a controllable transcriptional promoter which, when induced, can direct the production of large amounts of mRNA from the cloned gene. There are a variety of controllable promoters that are routinely used. The lac promoter system utilizes the transcriptional control elements from the E. coli β-galactosidase gene (Miller and Reznikoff, 1978). The lac promoter (like the tac and trc promoters, which have optimized lac promoter RNA polymerase recognition sequences) is controlled by binding of lac repressor (the lacI gene product) to an operator sequence in the promoter region. The repressor can be expressed either from the host genome (single copy, in which case an overexpressing repressor allele called lacIq should be used) or from the expression vector (multiple copies, with resultant tighter transcriptional control). Induction of the promoter is generally accomplished by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG), a lactose analog that binds to the lac repressor and prohibits its binding to the lac operator. An example of a commercially available lac promoter vector is pBluescript (Stratagene). The pKK223-3 vector from Pharmacia Biotech is a source of the tac promoter. The trc promoter is used in the vectors pSE280, pSE380, and pSE420 from Invitrogen.

Another commonly used promoter system utilizes the major leftward promoter of bacteriophage lambda, pL, and its control elements (Shimatake and Rosenberg, 1981). The pL promoter contained on an expression vector can be controlled by the phage-encoded cI repressor, which is typically expressed from an integrated copy of the phage in the host genome. A temperature-sensitive cI repressor (called cI857) is usually used; this encodes a repressor that is functional at lower temperatures but denatures at temperatures above 37.5°C. Thus, pL-mediated protein synthesis can be induced by a simple temperature shift. Alternatively, temperature-independent pL promoter systems have been developed that utilize a cI repressor gene under the control of a separate inducible promoter (Mieschendahl et al., 1986; LaVallie et al., 1993a). A commercial source for the pL vector is the pPL-Lambda vector from Pharmacia Biotech.

The T7 RNA polymerase promoter is another popular transcriptional control element for heterologous expression. The RNA polymerase from bacteriophage T7 is highly selective for specific T7 phage promoter sequences that are uncommon in other DNAs (Studier et al., 1990). A gene of interest placed under the control of a T7 promoter can be selectively expressed by induction of host-encoded T7 polymerase synthesis, which itself is under the control of an inducible promoter such as lac. Induction of T7 polymerase synthesis causes almost exclusive expression of the gene under the control of the T7 promoter. In some instances, the T7 transcription can outcompete the host RNA polymerase, resulting in accumulation of large amounts of the gene product of interest. There are many commercially available expression vectors that utilize the T7 polymerase/promoter system, such as the pGEMEX vectors from Promega, the pRSET vectors from Invitrogen, and the pET vectors from Novagen.

Translation Initiation Sequence

Initiation of translation on mRNAs requires the presence of a so-called Shine and Dalgarno sequence or ribosome binding site (RBS) in close proximity to an initiator methionine codon (Shine and Dalgarno, 1974). The RBS consists of a purine-rich stretch of nucleotides complementary to the 3′ end of 16S rRNA, located 5 to 13 bases 5′ to an initiator ATG. RBS elements typically used in expression vectors derive from well-translated E. coli or bacteriophage genes. For instance, the pTrcHis and pRSET vectors (Invitrogen) and the pGEMEX (Promega) vectors use the T7 gene 10 RBS.
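The spacing rule for the RBS can be checked quickly on a candidate construct. The Python sketch below is illustrative only. The function name `find_rbs`, the use of fragments of the consensus AGGAGG as the search motif, the example leader sequence, and the way the gap is counted (bases between the 3′ end of the motif and the A of the ATG) are all assumptions made for this example and are not taken from the unit.

```python
import re

# Assumption: fragments (>= 4 nt) of the consensus Shine-Dalgarno sequence
# AGGAGG stand in for "a purine-rich stretch"; real sites are more varied.
# The lookahead allows overlapping candidates to be found, and longer
# fragments are listed first so the longest match at a position is reported.
SD_MOTIF = re.compile(r"(?=(AGGAGG|AGGAG|GGAGG|AGGA|GGAG|GAGG))")


def find_rbs(seq, atg_index, min_gap=5, max_gap=13):
    """Return (motif, gap) for the longest SD-like motif whose 3' end lies
    min_gap..max_gap bases upstream of the ATG at atg_index, or None."""
    seq = seq.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    best = None
    # endpos=atg_index restricts the search to sequence 5' of the ATG.
    for match in SD_MOTIF.finditer(seq, 0, atg_index):
        motif = match.group(1)
        gap = atg_index - (match.start(1) + len(motif))
        if min_gap <= gap <= max_gap and (best is None or len(motif) > len(best[0])):
            best = (motif, gap)
    return best


# Hypothetical leader: SD-like motif, 8-base spacer, then the initiator ATG.
leader = "TTAAGGAGGTAATATACATGGCT"
print(find_rbs(leader, leader.index("ATG")))
```

A result of `None` only means that no motif from this short list falls in the 5-to-13-base window; it does not prove that the transcript lacks a functional RBS.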

SPECIFIC EXPRESSION STRATEGIES

Direct Intracellular Expression

Direct expression refers to the fusing of the coding sequence of interest to transcriptional and translational control sequences on an expression vector, with an initiator methionine codon preceding the open reading frame. This approach can be used to produce cytoplasmic proteins, and it can also be used for the intracellular expression of normally secreted proteins. In the latter case, the DNA sequence encoding the signal peptide is replaced by the initiator methionine codon.

Success with the direct approach is often variable. First, translation initiation is inconsistent due to the fact that sequences 3′ to the initiator methionine can influence the efficiency of ribosome binding (Looman et al., 1987; Bucheler et al., 1990). For reasons that are not fully understood, maximizing the A+T content of the 5′ end of the coding sequence (taking advantage of the degeneracy of the genetic code) can sometimes improve the efficiency of translation initiation (De Lamarter et al., 1985; Devlin et al., 1988). Second, recombinant proteins produced in the cytoplasm often form dense, insoluble aggregates of protein called inclusion bodies (Schein, 1989). In some ways, this can be viewed as an advantage because inclusion bodies are easily purified from the soluble proteins (UNIT 6.3). In addition, proteins in inclusion bodies are usually resistant to proteolytic degradation. In cases where activity of the expressed protein is unnecessary (e.g., in production of protein to be used as an antigen to produce antibodies), inclusion body formation and resultant insolubility of the protein may actually be preferred. If active protein is desired, however, recovery of properly folded protein from inclusion bodies requires the total denaturation of the protein using reagents such as urea or guanidine, followed by subsequent refolding using protocols that must be determined empirically for each protein (UNIT 6.5). The success of refolding protocols is variable, depending on the particular protein, and may yield little or no correctly folded material.
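As an illustration of the A+T enrichment mentioned earlier in this section, the following Python sketch swaps the codons that follow the initiator ATG for synonymous codons with the highest A+T count, leaving the encoded amino acids unchanged. The function names, the default of six codons, and the example sequence are hypothetical choices for this sketch. It deliberately ignores codon usage, so it may select codons that are rare in E. coli (AGA for arginine, for example); a real design would weigh both considerations.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def at_count(codon):
    """Number of A or T bases in a codon."""
    return sum(base in "AT" for base in codon)


def at_fraction(seq):
    """Fraction of A or T bases in a DNA sequence."""
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_synonym(codon):
    """Synonymous codon with the most A/T bases (first one found on ties)."""
    amino_acid = CODON_TABLE[codon]
    synonyms = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    return max(synonyms, key=at_count)


def enrich_5prime(cds, n_codons=6):
    """Recode up to n_codons codons after the initiator ATG for higher A+T."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for i in range(1, min(n_codons + 1, len(codons))):  # index 0 is the ATG
        codons[i] = at_rich_synonym(codons[i])
    return "".join(codons)


original = "ATGGCCCTGGGCAGC"  # hypothetical 5' end: Met-Ala-Leu-Gly-Ser
recoded = enrich_5prime(original)
print(recoded, round(at_fraction(original), 2), round(at_fraction(recoded), 2))
```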

If protein insolubility is encountered and is undesired, production of soluble protein can sometimes be enhanced by simply lowering the growth temperature during protein synthesis (Bishai et al., 1987; Schein and Noteborn, 1988). If the protein is normally secreted, then a secretory construct may produce more soluble protein (see section on Secretion). Alternatively, production of the protein as a fusion to a highly soluble partner may alleviate the problem (see section on Fusion Proteins).

If the gene is expressed poorly, translation efficiency may be a problem. If this is the case, it is often advantageous to maximize the A+T content of the 5′ coding sequence and/or replace the RBS with a sequence that initiates translation more efficiently (Olins and Rangwala, 1989). Another potential cause of poor translation is a preponderance of “rare” codons in the coding sequence, that is, codons that are underutilized by E. coli (Robinson et al., 1984). Stretches of contiguous rare codons may contribute to lower expression levels and should be changed to codons that are frequently used by E. coli. Finally, poor expression levels may be caused by product instability. In this instance, use of host strains and induction methods that minimize proteolytic degradation should be attempted (UNIT 5.2).
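Stretches of contiguous rare codons can be located with a simple scan of the coding sequence. The sketch below is a minimal example under stated assumptions: the `RARE_CODONS` set is an illustrative shortlist of codons often described as infrequently used in E. coli, not a table from this unit, and the function name and `min_run` threshold are hypothetical. A codon usage table for the actual host strain should be substituted in practice.

```python
# Assumption: an illustrative shortlist of codons commonly described as
# rarely used in E. coli; replace with a usage table for the real host.
RARE_CODONS = frozenset({"AGA", "AGG", "ATA", "CTA", "CGA", "CCC"})


def rare_codon_runs(cds, min_run=2):
    """Return (first_codon_index, run_length) for each stretch of at least
    min_run contiguous rare codons in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    runs, start = [], None
    for index, codon in enumerate(codons + [None]):  # None closes a final run
        if codon in RARE_CODONS:
            if start is None:
                start = index
        else:
            if start is not None and index - start >= min_run:
                runs.append((start, index - start))
            start = None
    return runs


# Hypothetical sequence: ATG AGG AGA CTG CCC ATA
print(rare_codon_runs("ATGAGGAGACTGCCCATA"))
```

Codon indices are 0-based, so index 0 is the initiator ATG.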

In instances where the gene is expressed well and soluble protein can be produced, the protein must then be purified from an abundant and diverse mixture of other E. coli cytoplasmic proteins. This can be a difficult and time-consuming process, and the purification scheme will be different for each protein based on its physical and biochemical characteristics (Chapters 6 and 8).

Secretion

Secretion of proteins in E. coli is mediated by the presence of an N-terminal signal sequence that is cleaved after translocation of the protein. Expression of cloned gene products as secreted proteins in E. coli has been utilized as an alternative to cytoplasmic expression for proteins that are normally secreted. In E. coli the protein is secreted to the periplasmic space between the cytoplasmic and outer membranes, in contrast to extracellular secretion that occurs in gram-positive bacteria and eukaryotic cells. The result is that in E. coli the secreted protein remains cell-associated, although in a “compartment” separated from the cytoplasmic proteins that make up the vast majority of the total cellular protein. This can be advantageous in terms of protein purification if techniques are used that release only periplasmic contents while leaving the cytoplasmic membrane intact (Neu and Heppel, 1965).

Secretion of heterologous gene products has been successfully employed for various proteins that are difficult to produce in the cytoplasm of E. coli as soluble and active proteins, including various growth factors (Cheah et al., 1994), receptors (Fuh et al., 1990), and recombinant Fab fragments (Skerra, 1994). Although eukaryotic signal peptides have been reported to function in E. coli, most secretion vectors utilize signal peptides derived from prokaryotic genes such as the ompA (Takahara et al., 1988; Cheah et al., 1994), pelB (Power et al., 1992), phoA (Oka et al., 1985), or hisJ (Vasquez et al., 1989) signal peptides. Although secretion provides an alternative to cytoplasmic expression that can sometimes result in the production of properly folded and active protein, the yield of desired protein is often low. Also, overexpressed gene products have been reported to form inclusion bodies even in the periplasm (Bowden and Georgiou, 1990), so secretion is not a panacea for the problem of insolubility.

Fusion Proteins

The problems that have plagued successful overexpression of cloned gene products in E. coli (namely, inconsistent expression levels, protein insolubility, and difficult purification of the gene product from E. coli contaminants) have been most successfully addressed by the use of fusion proteins. Fusion proteins are created via a translational fusion of the coding sequence for the protein of interest to a gene for a highly expressed protein partner (or “carrier” protein). Typically, the gene encoding the protein of interest is inserted in-frame 3′ to the coding sequence for the carrier protein, in place of the usual termination codon. This allows uniform translational initiation of the carrier protein regardless of the coding sequence fused to its 3′ end, which helps to ensure consistent expression levels.
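The in-frame requirement for a translational fusion can be expressed as a short check. The following Python sketch is a simplified illustration: the function name `fuse_in_frame` and the toy sequences are hypothetical, and the sketch ignores linkers, restriction sites, and protease recognition sequences that a real vector junction would contain. It removes the carrier's termination codon, joins the two coding sequences, and rejects fusions that are out of frame or contain a premature stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_in_frame(carrier_cds, insert_cds):
    """Join insert_cds 3' to carrier_cds in place of the carrier stop codon."""
    carrier_cds, insert_cds = carrier_cds.upper(), insert_cds.upper()
    for name, cds in (("carrier", carrier_cds), ("insert", insert_cds)):
        if len(cds) % 3:
            raise ValueError(f"{name} length is not a multiple of 3")
    carrier = split_codons(carrier_cds)
    if carrier and carrier[-1] in STOP_CODONS:
        carrier = carrier[:-1]  # drop the carrier's own termination codon
    fusion = carrier + split_codons(insert_cds)
    # Any stop codon before the final position would truncate the fusion.
    premature = [i for i, codon in enumerate(fusion[:-1]) if codon in STOP_CODONS]
    if premature:
        raise ValueError(f"premature stop codon(s) at codon index {premature}")
    return "".join(fusion)


# Toy example: carrier ATG AAA TAA fused to insert GGT TCT TGA.
print(fuse_in_frame("ATGAAATAA", "GGTTCTTGA"))
```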

Carrier proteins are usually chosen based on specific attributes that make them suitable in this role. The most successful fusion systems employ the maltose-binding protein (MBP; Maina et al., 1988), glutathione S-transferase (GST; Smith and Johnson, 1988), or thioredoxin (TRX; LaVallie et al., 1993a). The genes for these proteins are well expressed, and the proteins are highly soluble and provide specific physical characteristics or affinities to aid purification. These qualities also extend to the protein sequences fused to them, thereby enhancing the solubility and ease of purification of the entire fusion protein.

Maltose-binding protein is a 43-kDa secreted protein from E. coli. As its name implies, it binds specifically to maltose or amylose, a property that can be exploited in purification schemes (Guan et al., 1988). MBP fusions can be secreted into the periplasm, or they can be expressed without the MBP signal peptide, which results in accumulation of the fusion protein in the cytoplasm. MBP fusions are usually well expressed, and a high proportion of MBP fusion proteins are soluble. MBP expression plasmids and reagents can be purchased from New England Biolabs. The MBP expression vectors utilize the tac promoter and a plasmid-encoded lacIq gene for transcriptional control. These vectors contain a recognition sequence for the site-specific protease factor Xa to allow removal of the carrier protein following purification (see section on Fusion Protein Cleavage Methods).

Glutathione S-transferase is a 26-kDa cytoplasmic protein from Schistosoma japonicum. Carboxy-terminal protein fusions to GST are usually soluble and well-expressed (Smith and Johnson, 1988). GST binds specifically to glutathione, and GST fusion proteins can be purified in a single step from crude bacterial lysates by affinity chromatography on immobilized glutathione. The GST expression plasmids are available from Pharmacia Biotech, and utilize the tac promoter and plasmid-encoded lacIq for inducible transcriptional control. A variety of vectors are available which contain either the factor Xa or the thrombin recognition sequence to allow cleavage and removal of the GST from the protein of interest following affinity purification (see section on Fusion Protein Cleavage Methods).

Thioredoxin is a 12-kDa intracellular E. coli protein. It is very soluble, can be highly overexpressed from plasmid vectors (Lunn et al., 1984), and has been shown to be a very successful fusion partner. A wide variety of gene products can be produced abundantly in soluble fashion when fused to TRX (LaVallie et al., 1993a). Two different characteristics of TRX can be exploited to allow specific purification of some TRX fusion proteins. TRX accumulates at specific sites along the inner surface of the cytoplasmic membrane, called Bayer’s patches or adhesion zones. These sites constitute an osmotically sensitive compartment, and TRX (and some TRX fusions) can be selectively released from the cytoplasm and separated from the bulk of E. coli proteins by osmotic shock or freeze/thaw procedures. In addition, TRX is thermostable, and some TRX fusions can be purified by selective thermal denaturation of contaminants. In instances where osmotic shock or heat treatments do not provide adequate purification, an altered TRX protein has been developed (E.R.L., manuscript in preparation) that allows specific purification by metal-chelate affinity chromatography. The TRX fusion vector (available from Invitrogen) contains an enterokinase recognition sequence positioned at the junction between TRX and the C-terminal fusion partner.

In addition to the carrier proteins listed above, there are many other proteins that have been used to produce fusions for the purpose of generating large amounts of protein in E. coli. Among them are E. coli β-galactosidase (Ruther and Muller-Hill, 1983) and TrpE (Yansura, 1990), Staphylococcus aureus protein A (Nilsson et al., 1985; vector available from Pharmacia Biotech), chloramphenicol acetyltransferase (Knott et al., 1988), bacteriophage lambda cII protein (Nagai and Thøgersen, 1987), and various carbohydrate-binding proteins (Taylor and Drickamer, 1991; Helman and Mantsala, 1992). Table 5.1.1 lists many fusion partners that have been described in the literature.


Fusion Tags

Fusion tags are small stretches of amino acids added to the N-terminal or C-terminal end of a protein. Although they do not usually help to increase expression levels or protein solubility, they can be advantageous in protein purification and detection. The tags are generally chosen either because they encode an epitope that can be detected and purified using an antibody that binds to it, or because the tag amino acids provide a physical characteristic that can be exploited for easy and specific purification. A popular example is the polyhistidine tag, usually a stretch of six consecutive histidine residues added to either the N or C terminus of a protein, that provides specific binding to metal chelate resins. There are a number of polyhistidine vectors on the market, such as the pTrcHis vector from Invitrogen. Other examples, such as polyarginine (Brewer and Sassenfeld, 1985) or polyaspartic acid tags (Dalbøge et al., 1987), can be used to alter the binding behavior of a protein on ion-exchange resins. A noteworthy fusion tag is a small stretch of amino acids recognized and biotinylated in vivo by the biotin-protein ligase in E. coli (Schatz, 1993). This allows specific capture of the biotinylated protein on immobilized avidin or streptavidin. Table 5.1.2 lists a number of the more popular fusion tags.

Table 5.1.1 Common Fusion Proteins

Fusion protein                          Specific purification method [a]
Maltose-binding protein                 Amylose binding
Glutathione S-transferase               Glutathione binding
Thioredoxin                             Selective release, heat stability, MCAC
Protein A                               IgG binding
β-Galactosidase                         APTG or anti-β-galactosidase antibody binding
Chloramphenicol acetyltransferase       Chloramphenicol binding
lac repressor                           lac operator binding (Lundeberg et al., 1990)
Galactose-binding protein               Galactose binding
Cyclomaltodextrin glucanotransferase    Cyclodextrin binding
Lambda cII protein                      None
TrpE protein                            None

[a] MCAC, metal-chelate affinity chromatography; IgG, immunoglobulin G; APTG, p-amino-β-D-thiogalactoside.

Table 5.1.2 Fusion Tags

Amino acid tag                  Ligand                          Reference/supplier
Polyhistidine                   Metal-chelate affinity resin    Qiagen
Polyaspartic acid               Anion-exchange resin            Dalbøge et al. (1987)
Polyarginine                    Cation-exchange resin           Brewer and Sassenfeld (1985)
Polyphenylalanine               HIC resin [a]                   Persson et al. (1988)
Polycysteine                    Thiol                           Persson et al. (1988)
In vivo biotinylated peptide    Avidin or streptavidin          Schatz (1993)
Flag peptide                    Anti-Flag antibody              International Biotechnologies (IBI)

[a] HIC, hydrophobic interaction chromatography (UNIT 8.4).

Fusion Protein Cleavage Methods

Whereas fusion proteins and fusion tags have proved useful for the reasons outlined above, the end result is that the protein of interest will carry additional sequences that may hamper its functional activity. There are several methods for removal of the carrier protein, which can be divided into chemical cleavage methods and enzymatic (proteolytic) cleavage methods. Chemical cleavage of fusion proteins can be accomplished with reagents such as cyanogen bromide (Met↓; Itakura et al., 1977; Villa et al., 1989), 2-(2-nitrophenylsulfenyl)-3-methyl-3′-bromoindolenine or BNPS-skatole (Trp↓; Dykes et al., 1988), hydroxylamine (Asn↓Gly; Bornstein and Balian, 1977), or low pH (Asp↓Pro; Szoka et al., 1986). The use of these reagents, as described in UNIT 11.4, can be applied to the site-specific cleavage of fusion proteins provided that the fusion is designed so that the chemically labile bond is positioned at the desired point of scission. Chemical cleavage reagents tend to be inexpensive and efficient, and many of the reactions can be performed under denaturing conditions so that even insoluble proteins can be cleaved. However, one disadvantage of using chemical cleavage reagents for the site-specific cleavage of fusion proteins is that the reactions are generally performed under extremes of pH and/or temperature that can result in unwanted amino acid side-chain modifications. Chemical cleavage methods also have the disadvantage of low specificity, so it is more likely that the protein of interest will contain an internal cleavage site.

Enzymatic digestion is usually the method of choice for soluble fusion protein cleavage as the reactions are carried out under relatively mild conditions. Proteases such as trypsin (Lys↓ or Arg↓), endoproteinase Asp-N (↓Asp), and S. aureus V8 protease (Glu↓ or Asp↓) can be used in this role. High-quality trypsin, S. aureus V8 protease (endoproteinase Glu-C), and endoproteinase Asp-N can all be obtained from Boehringer Mannheim. These proteases, like chemical cleaving reagents, are limited by their low degree of specificity (recognizing and cleaving at single amino acids).
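The practical consequence of low specificity is easy to see by listing every bond that a given reagent could cut in a protein of interest. The Python sketch below encodes the cleavage rules quoted in this section as zero-width regular expressions. The dictionary keys, the function name `cut_positions`, and the test peptide are hypothetical, and the rules are taken at face value from the text (for example, no allowance is made for residues that may block cleavage at an otherwise matching site).

```python
import re

# Each pattern matches the (zero-width) position of the cleaved bond.
# (?<=X) means "cut after X"; (?=X) means "cut before X".
RULES = {
    "cyanogen bromide": r"(?<=M)",      # Met|
    "BNPS-skatole": r"(?<=W)",          # Trp|
    "hydroxylamine": r"(?<=N)(?=G)",    # Asn|Gly
    "low pH": r"(?<=D)(?=P)",           # Asp|Pro
    "trypsin": r"(?<=[KR])",            # Lys| or Arg|
    "Asp-N": r"(?=D)",                  # |Asp
    "V8 protease": r"(?<=[ED])",        # Glu| or Asp|
}


def cut_positions(protein, reagent):
    """Return 0-based indices i such that the bond between residues i-1 and i
    would be cleaved; the two chain ends are not counted as bonds."""
    protein = protein.upper()
    return [m.start() for m in re.finditer(RULES[reagent], protein)
            if 0 < m.start() < len(protein)]


peptide = "MKWNGDPDE"  # hypothetical 9-residue test sequence
for reagent in RULES:
    print(reagent, cut_positions(peptide, reagent))
```

Running the same function over a full-length protein of interest gives a quick count of internal sites for each reagent before a cleavage strategy is chosen.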

Alternatively, thrombin (Leu-Val-Pro-Arg↓Gly-Ser; Gearing et al., 1989), factor Xa [Ile-Glu(or Asp)-Gly-Arg↓; Nagai and Thøgersen, 1984, 1987; Gardella et al., 1990], renin (Pro-Phe-His-Leu↓Leu-Val-Tyr; Haffey et al., 1987), collagenase (Pro-X↓Gly-Pro-Y↓; Germino and Bastia, 1984), and enterokinase (Asp-Asp-Asp-Asp-Lys↓; Dykes et al., 1988; LaVallie et al., 1993b) are much more suitable in this role. All of these enzymes have extended substrate recognition sequences (up to seven amino acids in the case of renin), which greatly reduces the likelihood of unwanted cleavages elsewhere in the protein. Factor Xa and enterokinase are most useful for cleaving off C-terminal fusion partners because they cleave on the carboxyl-terminal side of their respective recognition sequences. This allows the release of fusion partners containing their authentic amino terminus. In addition, the catalytic subunit of bovine enterokinase has been cloned and expressed (LaVallie et al., 1993b), and the recombinant enzyme has been shown to be approximately 100-fold more efficient than the native intestinally derived enzyme in fusion protein cleavage reactions (Racie et al., 1995). The recombinant enterokinase is available from New England Biolabs.
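Before a site-specific protease is chosen, the protein of interest can be screened for internal copies of the recognition sequence. The sketch below, in Python, covers three of the enzymes named above using one-letter codes (LVPR↓GS for thrombin, I-E/D-G-R↓ for factor Xa, DDDDK↓ for enterokinase). Renin and collagenase are omitted for brevity. The function name, the dictionary layout, and the example fusion sequence are hypothetical, and exact matching of the canonical sequence is an assumption: it will not reveal weaker, noncanonical sites.

```python
import re

# name -> (recognition pattern, residues from match start to the cut)
SITES = {
    "thrombin": (re.compile(r"LVPRGS"), 4),     # Leu-Val-Pro-Arg|Gly-Ser
    "factor Xa": (re.compile(r"I[ED]GR"), 4),   # Ile-Glu/Asp-Gly-Arg|
    "enterokinase": (re.compile(r"DDDDK"), 5),  # Asp-Asp-Asp-Asp-Lys|
}


def recognition_sites(protein):
    """Map each protease to the 0-based indices of the first residue
    C-terminal to every predicted cut in the given sequence."""
    protein = protein.upper()
    return {name: [m.start() + offset for m in pattern.finditer(protein)]
            for name, (pattern, offset) in SITES.items()}


# Hypothetical sequence containing one copy of each recognition sequence.
fusion = "MSDKIIHLVPRGSAPTIEGRQQDDDDKW"
print(recognition_sites(fusion))
```

An empty list for a protease means that no canonical site was found in the sequence, which is the desired outcome when screening the protein of interest itself.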

SUMMARY

Methods for the overexpression of cloned gene products in E. coli have improved significantly since it was first attempted. Common problems such as variable expression levels, inclusion body formation, and purification difficulties have been successfully addressed by advancements in expression technology. Probably the most significant of these advancements has been the development of fusion proteins and fusion tag expression and purification techniques. These methods have resulted in more consistent production of soluble and active protein, and have allowed for simple and efficient purification of the proteins from bacterial lysates. Although the production of soluble, properly folded, and active recombinant proteins in E. coli is still not guaranteed, the likelihood of success is far greater than it was just a few years ago. This progress should help ensure that E. coli will continue to be the host organism of choice for recombinant protein production.

LITERATURE CITED

Bishai, W.R., Rappuoli, R., and Murphy, J.R. 1987. High-level expression of a proteolytically sensitive diphtheria toxin fragment in Escherichia coli. J. Bacteriol. 169:5140-5151.

Bornstein, P. and Balian, G. 1977. Cleavage at Asn-Gly bonds with hydroxylamine. Methods Enzymol. 47:132-145.

Bowden, G.A. and Georgiou, G. 1990. Folding and aggregation of β-lactamase in the periplasmic space of Escherichia coli. J. Biol. Chem. 265:16760-16766.

Brewer, S.J. and Sassenfeld, H.M. 1985. The purification of recombinant proteins using C-terminal poly-arginine fusions. Trends Biotechnol. 3:119-122.

Bucheler, U.S., Werner, D., and Schirmer, R.H. 1990. Random silent mutagenesis in the initial triplets of the coding region: A technique for adapting human glutathione reductase-encoding cDNA to expression in Escherichia coli. Gene 96:271-276.

Cheah, K.C., Harrison, S., King, R., Crocker, L., Well, J.R., and Robins, A. 1994. Secretion of eukaryotic growth hormones in Escherichia coli is influenced by the sequence of the mature proteins. Gene 138:9-15.

Dalbøge, H., Dahl, H.H.M., Pedersen, J., Hansen, J.W., and Christensen, T. 1987. A novel enzymatic method for production of authentic hGH from an Escherichia coli-produced hGH precursor. Bio/Technology 5:161-164.

De Lamarter, J.F., Mermod, J.J., Liang, C.M., Eliason, J.F., and Thatcher, D.R. 1985. Recombinant murine GM-CSF from E. coli has biological activity and is neutralized by a specific antiserum. EMBO J. 4:2575-2581.

Devlin, P.E., Drummond, R.J., Toy, P., Mark, D.F., Watt, K.W.K., and Devlin, J.J. 1988. Alteration of the amino-terminal codons of human granulocyte colony-stimulating factor increases expression levels and allows efficient processing by methionine aminopeptidase in Escherichia coli. Gene 65:13-22.

Dykes, C.W., Bookless, A.B., Coomber, B.A., Noble, S.A., Humber, D.C., and Hobden, A.N. 1988. Expression of atrial natriuretic factor as a cleavable fusion protein with chloramphenicol acetyltransferase in Escherichia coli. Eur. J. Biochem. 174:411-416.

Fuh, G., Mulkerrin, M.G., Bass, S., McFarland, N., Brochier, M., Bourell, J.H., Light, D.R., and Wells, J.A. 1990. The human growth hormone receptor. Secretion from Escherichia coli and disulfide bonding pattern of the extracellular binding domain. J. Biol. Chem. 265:3111-3115.

Gardella, T.J., Rubin, D., Abou-Samra, A.-B., Keutmann, H.T., Potts, J.T. Jr., Kronenberg, H.M., and Nussbaum, S.R. 1990. Expression of human parathyroid hormone (1-84) in Escherichia coli as a factor X-cleavable fusion protein. J. Biol. Chem. 265:15854-15859.

Gearing, D.P., Nicola, N.A., Metcalf, D., Foote, S., Willson, T.A., Gough, N.M., and Williams, R.L. 1989. Production of leukemia factor in Escherichia coli by a novel procedure and its use in maintaining embryonic stem cells in culture. Bio/Technology 7:1157-1161.

Germino, J. and Bastia, D. 1984. Rapid purification of a gene product by genetic fusion and site-specific proteolysis. Proc. Natl. Acad. Sci. U.S.A. 81:4692-4696.

Guan, C., Li, P., Riggs, P.D., and Inouye, H. 1988. Vectors that facilitate the expression and purification of foreign peptides in Escherichia coli by fusion to maltose-binding protein. Gene 67:21-30.

Haffey, M.L., Lehman, D., and Boger, J. 1987. Site-specific cleavage of a fusion protein by renin. DNA 6:565-571.

Hellman, J. and Mantsala, P. 1992. Construction of an E. coli export-affinity vector for expression and purification of foreign proteins by fusion to cyclomaltodextrin glucanotransferase. J. Biotechnol. 23:19-34.

Itakura, K., Hirose, T., Crea, R., Riggs, A.D., Heyneker, H.L., Bolivar, F., and Boyer, H.W. 1977. Expression in Escherichia coli of a chemically synthesized gene for the hormone somatostatin. Science 198:1056-1063.

Knott, J.A., Sullivan, C.A., and Weston, A. 1988. The isolation and characterisation of human atrial natriuretic factor produced as a fusion protein in Escherichia coli. Eur. J. Biochem. 174:405-410.

LaVallie, E.R., DiBlasio, E.A., Kovacic, S., Grant, K.L., Schendel, P.F., and McCoy, J.M. 1993a. A thioredoxin gene fusion system that circumvents inclusion body formation in the E. coli cytoplasm. Bio/Technology 11:187-193.

LaVallie, E.R., Rehemtulla, A., Racie, L.A., DiBlasio, E.A., Ferenz, C., Grant, K.L., Light, A., and McCoy, J.M. 1993b. Cloning and functional expression of a cDNA encoding the catalytic subunit of bovine enterokinase. J. Biol. Chem. 268:23311-23317.

Looman, A.C., Bodlaender, J., Comstock, L.J., Eaton, D., Jhurani, P., de Boer, H.A., and van Knippenberg, P.H. 1987. Influence of the codon following the AUG initiation codon on the expression of a modified lacZ gene in Escherichia coli. EMBO J. 6:2489-2492.

Lundeberg, J., Wahlberg, J., and Uhlen, M. 1990. Affinity purification of specific DNA fragments using a lac repressor fusion protein. Genet. Anal. Techn. Appl. 7:47-52.

Lunn, C.A., Kathju, S., Wallace, B.J., Kushner, S.R., and Pigiet, V. 1984. Amplification and purification of plasmid-encoded thioredoxin from Escherichia coli K12. J. Biol. Chem. 259:10469-10474.

Maina, C.V., Riggs, P.D., Grandea, A.G., Slatko, B.E., Moran, L.S., Tagliamonte, J.A., McReynolds, L.A., and Guan, C. 1988. An Escherichia coli vector to express and purify foreign proteins by fusion to and separation from maltose-binding protein. Gene 74:365-373.

Mieschendahl, M., Petri, T., and Hanggi, U. 1986. A novel prophage independent trp regulated lambda pL expression system. Bio/Technology 4:802-808.

Miller, J.H. and Reznikoff, W.S. (eds.) 1978. The Operon. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

Nagai, K. and Thøgersen, H.C. 1984. Generation of β-globin by sequence-specific proteolysis of a hybrid protein in Escherichia coli. Nature 309:810-812.

Nagai, K. and Thøgersen, H.C. 1987. Synthesis and sequence-specific proteolysis of hybrid proteins produced in Escherichia coli. Methods Enzymol. 153:461-481.

Neu, H.C. and Heppel, L.A. 1965. Release of enzymes from Escherichia coli by osmotic shock during the formation of spheroplasts. J. Biol. Chem. 240:3685-3692.

Nilsson, B., Abrahmsen, L., and Uhlen, M. 1985. Immobilization and purification of enzymes with staphylococcal protein A gene fusion vectors. EMBO J. 4:1075-1080.

Oka, T., Sakamoto, S., Miyoshi, K., Fuwa, T., Yoda, K., Yamasaki, M., Tamura, G., and Miyake, T. 1985. Synthesis and secretion of human epidermal growth factor by Escherichia coli. Proc. Natl. Acad. Sci. U.S.A. 82:7212-7216.

Olins, P.O. and Rangwala, S.H. 1989. A novel sequence element derived from bacteriophage T7 mRNA acts as an enhancer of translation of the lacZ gene in Escherichia coli. J. Biol. Chem. 264:16973-16976.

Persson, M., Bergstrand, M.G., Bulow, L., and Mosbach, K. 1988. Enzyme purification by genetically attached polycysteine and polyphenylalanine tags. Anal. Biochem. 172:330-337.

Power, B.E., Ivancic, N., Harley, V.R., Webster, R.G., Kortt, A.A., Irving, R.A., and Hudson, P.J. 1992. High-level temperature-induced synthesis of an antibody VH-domain in Escherichia coli using the PelB secretion signal. Gene 113:95-99.

Racie, L.A., McColgan, J.M., Grant, K.L., DiBlasio-Smith, E.A., McCoy, J.M., and LaVallie, E.R. 1995. Production of recombinant bovine enterokinase catalytic subunit in Escherichia coli using the novel secretory fusion partner DsbA. Bio/Technology. In press.

Robinson, M., Lilley, R., Little, S., Emtage, J.S., Yarranton, G., Stephens, P., Millican, A., Eaton, M., and Humphreys, G. 1984. Codon usage can affect efficiency of translation of genes in Escherichia coli. Nucl. Acids Res. 12:6663-6671.

Ruther, U. and Muller-Hill, B. 1983. Easy identification of cDNA clones. EMBO J. 2:1791-1794.

Schatz, P.J. 1993. Use of peptide libraries to map the substrate specificity of a peptide-modifying enzyme: A 13 residue consensus peptide specifies biotinylation in Escherichia coli. Bio/Technology 11:1138-1143.

Schein, C.H. 1989. Production of soluble recombinant proteins in bacteria. Bio/Technology 7:1141-1149.

Schein, C.H. and Noteborn, M.H.M. 1988. Formation of soluble recombinant proteins in Escherichia coli is favored by lower growth temperature. Bio/Technology 6:291-294.

Shimatake, H. and Rosenberg, M. 1981. Purified λ regulatory protein cII positively activates promoters for lysogenic development. Nature 292:128-132.

Shine, J. and Dalgarno, L. 1974. The 3′-terminal sequence of Escherichia coli 16S ribosomal RNA: Complementarity to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. U.S.A. 71:1342-1346.

Skerra, A. 1994. A general vector, pASK84, for cloning, bacterial production, and single-step purification of antibody Fab fragments. Gene 141:79-84.

Smith, D.B. and Johnson, K.S. 1988. Single-step purification of polypeptides expressed in Escherichia coli as fusions with glutathione S-transferase. Gene 67:31-40.

Studier, F.W., Rosenberg, A.H., Dunn, J.J., and Dubendorff, J.W. 1990. Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185:60-89.

Szoka, P.R., Schreiber, A.B., Chan, H., and Murthy, J. 1986. A general method for retrieving components of a genetically engineered fusion protein. DNA 5:11-20.

Takahara, M., Sagai, H., Inouye, S., and Inouye, M. 1988. Secretion of human superoxide dismutase in Escherichia coli. Bio/Technology 6:195-198.

Taylor, M.E. and Drickamer, K. 1991. Carbohydrate-recognition domains as tools for rapid purification of recombinant eukaryotic proteins. Biochem. J. 274:575-580.

Vasquez, J.R., Evnin, L.B., Higaki, J.N., and Craik, C.S. 1989. An expression system for trypsin. J. Cell. Biochem. 39:265-276.

Villa, S., DeFazio, G., and Canosi, U. 1989. Cyanogen bromide cleavage at methionine residues of polypeptides containing disulfide bonds. Anal. Biochem. 177:161-164.

Yansura, D.G. 1990. Expression as trpE fusion. Methods Enzymol. 165:161-166.

Contributed by Edward R. LaVallie
Genetics Institute, Inc.
Cambridge, Massachusetts

Current Protocols in Protein Science
