High throughput protein production for functional proteomics


Pascal Braun¹ and Josh LaBaer²

¹Harvard University, Department for Chemistry and Chemical Biology, 12 Oxford Street, Cambridge, MA 02138, USA
²Institute of Proteomics, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02129, USA

A major impact of genome projects on human health will be their contribution to the understanding of protein function. Proteins are the engines of biological systems, nearly all pharmaceuticals act on proteins and increasingly proteins themselves are used therapeutically. As biology enters the post-genomic era, researchers have begun to embrace the exciting opportunity of investigating proteins in high throughput (HT) experiments. The study of proteins includes a vast array of techniques ranging from enzyme catalysis assays to interaction and structural studies. Many of these methods depend on purified proteins. The discovery of thousands of novel protein-coding sequences and the increased availability of large cDNA collections provide the opportunity to investigate protein function in a systematic manner and at an unprecedented scale. This opportunity highlights the need for development of HT methods for protein isolation. This article describes the challenges faced and the approaches taken to develop proteome-scale protein expression systems.

Purified proteins are a key reagent for numerous assays that address fundamental questions about their structure, function and regulation. The applications of high-throughput protein expression and purification fall into two broad classes (Fig. 1). In one set of applications, HT protein expression can be used as a first step to screen for optimal conditions or gene constructs before scaling up for high-yield protein production. Experimental approaches, such as protein crystallization and the production of protein affinity reagents, often require milligram quantities of protein. Obtaining protein for these studies can be challenging because proteins sometimes express poorly or fold improperly when produced in heterologous systems. In these applications, success often depends on the time-consuming trial-and-error process of attempting to express different versions of the target protein until a well-expressing, soluble and correctly folded construct can be identified. The ability to screen many constructs simultaneously in a multi-well format could speed up this screening process considerably.

In the second class of applications, the HT expression of proteins can be used directly as the front end for various HT applications. A growing number of reports describe methods for HT biochemical experiments. Although some are still in early development, applications such as protein microarrays [1,2], multi-well solution biochemistry [3,4] and the isolation of protein complexes for analysis by mass spectrometry [5–7] would all benefit from improved methods for HT protein isolation. Because each of these applications supports a broad range of powerful experiments, it is clear that HT protein expression systems will become an important common application for post-genomic biology.

Expression systems

Over the past few decades several protein expression systems have been developed for recombinant protein expression. Each of these systems has its strengths and weaknesses concerning yield, proper folding, post-translational modification (PTM), cost, speed and ease of use. With respect to HT application, it is also useful to consider the ‘success rate’: the fraction of proteins that can be produced in practical yields.

Fig. 1. Applications for high throughput (HT) protein purification. For applications that require large amounts of purified proteins (mg), HT methods for protein isolation can be used to efficiently screen many different constructs (orthologues, tags etc.) to identify those that produce a high yield of soluble protein. For microscale applications, HT protein purification provides the front end to produce proteins for various applications that require a limited amount of protein per sample.

[Figure 1 labels: HT expression and purification; Pre-screening; Large scale production (Structural studies, Affinity reagents, HT compound screens); Microscale applications (Protein interactions, Enzyme activities, Enzyme substrate networks, Effects of PTM, Targets of small molecules).]

Corresponding author: Pascal Braun ([email protected]).

Opinion TRENDS in Biotechnology Vol.21 No.9 September 2003 383

http://tibtec.trends.com 0167-7799/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/S0167-7799(03)00189-6

Escherichia coli

Escherichia coli is the simplest and by far the most widely used organism for protein expression. The most appreciated advantages of this system are its speed, ease of use and low cost. These advantages are sometimes offset by the lack of eukaryotic PTMs and the low solubility of some proteins. Several groups have developed screening assays to quickly assess the solubility of recombinant proteins, and some have developed in vivo assays to avoid the tedious processing associated with lysis of the bacteria and protein purification [8–10]. Although potentially very powerful, these approaches have not been evaluated on a large scale. Others have used more direct assays for protein solubility that involve microscale purification or lysis followed by relative quantification of soluble protein by ELISA [9], dot-blot [11,12], mass spectrometry [13], or SDS–PAGE [14–16].

Purification tags

High-throughput protein purification depends on affinity tags to provide a generic purification strategy. Additionally, certain affinity tags have a beneficial effect on protein solubility, especially in bacterial protein expression [17,18]. Although the advantages of affinity tags greatly outweigh their potential problems, it is important to consider that any tag can potentially interfere with folding, function or crystallization of the target proteins. It is often reasonably assumed, although rarely formally demonstrated, that small tags, such as the His6-tag, bear a smaller risk of steric interference than larger tags, such as glutathione-S-transferase (GST) or maltose binding protein (MBP).

The His6-tag is a popular purification tag because of its small size, its relatively strong reversible binding and because it functions under denaturing conditions. These advantages have led many structural proteomics groups to use the His6-tag almost exclusively for their initial studies. Table 1 summarizes the cumulative data of several studies. Although most of these approaches did not involve HT microscale purifications, the data are nevertheless instructive with regard to the success rates. For prokaryotic proteins the success rate is typically ~50%, which most likely reflects the close phylogenetic relationship between the target organisms and the expression system. For eukaryotic proteins, the success rates are significantly lower (Table 2). One exception is the study by Yee et al., who successfully purified 63% of 93 proteins from Saccharomyces cerevisiae, although notably all of these proteins were <23 kDa [19]. All reports note that proteins become progressively more difficult to purify as their molecular weight increases [14,19–21].

Given the poor performance of the His6-tag for higher eukaryotic proteins, many alternative purification tags have been developed, many of which increase the solubility of the recombinant proteins. Several studies systematically examined the success rates obtained with different tags in HT purifications (Table 2). Hammarstrom et al. selected 27 small human proteins (6–19 kDa) to assess the solubilizing properties of seven different tags (Table 2) [15]. However, because small proteins are generally easier to solubilize than larger proteins [14,19–21], conclusions from this study are somewhat limited. By contrast, Shih et al. investigated the effects of eight different tags on the solubility of 40 different proteins covering a size range of 9 kDa to 140 kDa [16]. To assess the effect of different tags on the ability to obtain purified protein, our study examined the yield and purity of 32 human proteins (16–150 kDa) purified with each of four affinity tags [14]. In all of the relevant studies, the large tags, MBP and the bacterial protein NusA, performed consistently well with respect to solubility. In the study by Hammarstrom et al., the B1 domain of streptococcal Protein G (GB1) and thioredoxin (Trx) seem to combine good solubilizing properties with a small size. However, Trx did not perform as well when larger proteins were tested. In addition, NusA and Trx are not affinity tags and would require combination with a second tag that can contribute an affinity moiety for purification. Sebastian et al. used a GST-Strep double tag to purify 88% of 42 small peptide fusion proteins in a semi-automated process to yields between 80 mg ml⁻¹ and 590 mg ml⁻¹ [22].

In our purification experiments, the GST-tag and the MBP-tag enabled the purification of 80% of 32 test set proteins with yields of 0.3–5.0 µg ml⁻¹. To further evaluate the GST-tag on a larger scale, a randomly chosen set of 428 human proteins was expressed and purified using the GST-tag. In this experiment, ~50% of all proteins could be purified [14] (P. Braun, unpublished). In addition, ~70% of 728 tested proteins could be purified under denaturing conditions using the His6-tag. The denatured proteins can be used for antibody production, for the manufacture of analytical protein arrays or submitted to preparative refolding [23].

Table 1. Protein purification success of His6-tagged proteins under non-denaturing conditions in different studies

Organism                               # Prot.   % Purified (% soluble)   Protein properties                        Purification scale                   Ref.
Methanobacterium thermoautotrophicum   424       41 (47)                  No transmembrane proteins                 50 ml                                [20]
Thermotoga maritima                    1376      47                       Random                                    96 well, after 65 ml fermentation    [21]
Thermotoga maritima                    21        93                       <23 kDa, no transmembrane proteins        1 l                                  [19]
Escherichia coli                       130       61                       <23 kDa, no transmembrane proteins        1 l                                  [19]
Methanobacterium thermoautotrophicum   250       51                       <23 kDa, no transmembrane proteins        1 l                                  [19]
Myxoma virus                           25        46                       <23 kDa, no transmembrane proteins        1 l                                  [19]
Saccharomyces cerevisiae               93        63                       <23 kDa, no transmembrane proteins        1 l                                  [19]
Caenorhabditis elegans                 167       6                        10–140 kDa, no transmembrane proteins     96 well, 1 ml                        [13]
Homo sapiens                           32        13                       16–110 kDa, no transmembrane proteins     96 well, 1 ml                        [14]

Several studies utilized or examined the His6-tag for protein purification under non-denaturing conditions. For prokaryotes and archaea the success rates are ~50%, whereas for higher eukaryotes the success rates are significantly lower.
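The contrast drawn from Table 1 can be checked with a size-weighted average of the reported percentages. The sketch below is only an illustration: the grouping of rows into "prokaryote" (bacteria and archaea) and "higher eukaryote" is an assumption made here, the Myxoma virus and S. cerevisiae rows are left out, and the function name `pooled_success_rate` is hypothetical. Pooling also ignores the fact that the studies differ in protein selection and purification scale.

```python
# Rows transcribed from Table 1: (organism, proteins tested, % purified, group).
# The group labels are an assumption made for this illustration.
TABLE_1 = [
    ("Methanobacterium thermoautotrophicum", 424, 41, "prokaryote"),
    ("Thermotoga maritima", 1376, 47, "prokaryote"),
    ("Thermotoga maritima", 21, 93, "prokaryote"),
    ("Escherichia coli", 130, 61, "prokaryote"),
    ("Methanobacterium thermoautotrophicum", 250, 51, "prokaryote"),
    ("Caenorhabditis elegans", 167, 6, "higher eukaryote"),
    ("Homo sapiens", 32, 13, "higher eukaryote"),
]


def pooled_success_rate(rows, group):
    """Return the size-weighted % purified across all studies in one group."""
    selected = [(n, pct) for _, n, pct, g in rows if g == group]
    total = sum(n for n, _ in selected)
    purified = sum(n * pct / 100 for n, pct in selected)
    return 100 * purified / total


if __name__ == "__main__":
    for group in ("prokaryote", "higher eukaryote"):
        rate = pooled_success_rate(TABLE_1, group)
        print(f"{group}: {rate:.1f}% purified")
```

Under these assumptions the pooled rates come out at roughly 48% for the bacterial and archaeal studies and roughly 7% for the two higher eukaryote studies, in line with the ~50% figure quoted in the Table 1 caption.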

Functional integrity of purified proteins

An important question of recombinant protein production regards the functional integrity of the produced proteins. Based on the quality of NMR spectra, Christendat et al. estimated that 57 out of 100 soluble proteins from E. coli might be in a state of aggregation and thus potentially non-functional [20]. This finding is supported by Yee et al., who found that the NMR spectra of 27% to 55% of proteins (depending on the organism) that were soluble in E. coli indicated aggregation or conformational instability of the protein [19]. At this point, the significance of these findings with respect to the function of the target proteins is unclear. Nonetheless, these results underscore the requirement for approaches that can assess the functional integrity of large numbers of purified proteins. Generic assays for specific groups of proteins might be a step in this direction. However, despite justified skepticism, it must also not be forgotten that thousands of functional proteins have been produced in bacteria over the past few decades and that, despite numerous alternatives, E. coli is still the most widely used protein expression system.

Cell-free expression

Cell-free expression systems are particularly attractive for HT applications because the absence of a cell membrane eliminates the harsh process steps associated with introducing DNA into cells, lysing cells and clearing lysate. Expression systems from eukaryotic cell lysates have the additional benefit that most PTMs are executed properly.

The most widely used open expression systems are bacterial, wheat germ and reticulocyte lysates, although lysates from other cell types have also been made [24,25]. Early on, cell-free expression had very low protein yields, but several developments in recent years have significantly improved them, so that yields of up to 6 mg ml⁻¹ have been reported for individual proteins [26]. Important changes included altering the concentration of the lysate [27], introducing semi-continuous and continuous reactions [28,29], adding energy regeneration systems [30] and, in the case of wheat-germ lysate, removing a ‘suicide’ system on the outside of the seeds [31]. The lysate itself is the dialyzed S30 supernatant of a cell lysate [32]. One tricky aspect of making these lysates is the concentration of various essential components, such as magnesium ions. Unfortunately, the transcription and translation steps have very specific and different requirements for the concentration of various ions. Thus the optimal concentrations have to be identified and carefully controlled. All major cell lysates are also commercially available but are relatively expensive and of proprietary composition, which can be a disadvantage if salt or buffer concentrations need to be adjusted.

The Riken Structural Genomics Initiative in Japan (http://www.rsgi.riken.go.jp) produces its target proteins almost exclusively in bacterial cell-free expression systems. About one quarter of randomly chosen mouse cDNA clones could be produced with yields of 0.1 mg ml⁻¹ or higher [33]. Adapting the bacterial cell-free system to HT expression, Busso et al. found that most of 24 evaluated prokaryotic proteins behaved similarly when expressed in vivo or in vitro. Interestingly, the authors also found that the location of the His6-tag can affect solubility and total expression levels and that a C-terminal His6-tag often performs worse [12]. In our own experiments using 60 His6-tagged proteins from Pseudomonas aeruginosa, only 15% could be purified under non-denaturing conditions. Under denaturing conditions 90% of proteins were purified using an N-terminal His6-tag (T. Murthy, unpublished).

A recent report by Sawasaki et al. explores wheat germ lysate for proteomic microscale protein production [34]. The authors report that 50 out of 54 human and Arabidopsis thaliana proteins could be detected by colloidal coomassie staining after HT PCR cloning, transcription and HT cell-free protein production. The protein yields obtained ranged from 0.1 mg ml⁻¹ to 2.3 mg ml⁻¹. The authors show that their method generates functional protein by demonstrating autophosphorylation activity of four out of five kinases and recording an NMR spectrum of one further protein [34].

Table 2. The effects of polypeptide tags on solubility and purification of test proteins

                                          Hammarstrom et al. [15]   Shih et al. [16]   Braun et al. [14]
                                          27 proteins               40 proteins        32 proteins
                                          6–19 kDa                  9–140 kDa          16–110 kDa
Tag                          Size (kDa)   Soluble                   Soluble            Soluble    Purified
His6                         1            29%                       n/a                16%        13%
Calmodulin binding peptide   4            n/a                       <38%               67%        15%
GB1                          6            68%                       n/a                n/a        n/a
Thioredoxin                  11           74%                       <38%               n/a        n/a
ZZ                           14           55%                       n/a                n/a        n/a
Cellulose binding protein    17           n/a                       <38%               n/a        n/a
GST                          26           48%                       38%                50%        80%
MBP                          42           70%                       60%                90%        80%
NusA                         54           52%                       60%                n/a        n/a
Intein-CN                    55           n/a                       <38%               n/a        n/a

Three studies investigate the effects of polypeptide tags on the solubility and the ability to purify a limited set of eukaryotic test proteins of various sizes. Hammarstrom et al. and Shih et al. score solubility based on coomassie blue analysis of total versus cleared lysates on SDS-PAGE. By contrast, Braun et al. scored a protein as soluble if 20% of total protein could be detected in the cleared lysate by western blot analysis. After protein purification, a protein was scored as positive if 15% of the eluate produced a visible band of the correct size on a coomassie stained SDS-PAGE. This corresponds to a minimal yield of ~300 ng protein per ml of culture.

Other systems

For the expression of physiological proteins, including therapeutics, many groups are turning to mammalian cells such as CHO or HEK293 cells because these cells offer a cellular environment closer to the physiological one. Durocher et al. report the high-level and HT production of proteins in HEK293-EBNA1 cells. They expressed 14 different proteins in vessels ranging from 50 ml shaker flasks to 14 l bioreactors and obtained levels of purified recombinant protein between 1 mg ml⁻¹ and 50 mg ml⁻¹ [35]. Users must weigh the fidelity of PTMs, correct folding and processing in mammalian cells against their slow doubling rates, potentially inefficient transfection, overall lower productivity and their dependence on expensive reagents, such as serum. Moreover, owing to the fragility of mammalian cells, growth in HT multi-well formats poses an additional challenge.

Interest in Saccharomyces cerevisiae as a protein expression system has increased recently because this organism combines the advantages of an inexpensive, fast growing unicellular organism with the physiological properties of a eukaryotic cell. In 1999 Phizicky et al. built a collection of all coding sequences of S. cerevisiae, expressed and purified these in 64 pools of 96 proteins each and then screened them for several biochemical activities; however, protein expression and integrity were not specifically addressed [36]. Later Zhu et al. used S. cerevisiae to express 5800 proteins from the same organism, which were used to build protein arrays, with apparent success [2]. Both these studies expressed yeast proteins in yeast. However, for proteins from other species it is not clear if S. cerevisiae is a preferable alternative with respect to success rate and yield. One important difference in PTMs between yeast and mammalian cells lies in the glycosylation patterns: yeast adds a different type of chain (high mannose) compared with higher eukaryotes.

Protein production in insect cells is commonly performed using the baculovirus system. This protein expression system is popular for large scale protein expression. Appreciated advantages of insect cells are the robust and relatively inexpensive cell culture and the fact that most eukaryotic PTMs are executed properly [37]. Albala et al. described an HT-compatible strategy to express proteins in insect cells. In initial tests, 34 out of 81 wells produced soluble protein [38]. For HT operations, however, the need to make the viruses, a multistep process, and the need to maintain high virus titers are challenges of the baculovirus system that are being addressed [39].

The role of bioinformatics

The broad range of biochemical properties represented by the proteome requires a variety of different approaches to ensure that all proteins can be produced. In general, the goal is to use the simplest and least expensive method that produces protein of sufficient quantity and quality. Currently, the best approach for each protein can only be ascertained by trial and error. However, as more and more HT studies are executed in different systems, the resulting data can be used to infer rules that predict which expression systems are likely to succeed for proteins with specific features. Such rules will enable researchers to triage their protein expression clones into the best systems (Fig. 2). This can only happen if positive and negative results for protein expression and purification are collected in databases.
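The triage idea can be made concrete with a short control loop. The following is a minimal sketch under stated assumptions, not a description of any existing pipeline: the function `triage`, its parameters and the toy outcomes are all hypothetical. Each protein starts in the system a predictor suggests, falls through to the next easiest system on failure, and every attempt (including failures) is logged so that the predictor could later be refined.

```python
from typing import Callable, Dict, List, Tuple


def triage(
    proteins: List[str],
    systems: List[Tuple[str, Callable[[str], bool]]],
    predict_first: Callable[[str], int],
) -> Tuple[Dict[str, str], List[Tuple[str, str, bool]]]:
    """Try each protein in successive systems, starting at the predicted one.

    systems is ordered from easiest to hardest; each entry is (name, attempt),
    where attempt(protein) returns True if the purified protein passed quality
    control. Returns the system that worked for each protein and a log of
    every attempt.
    """
    produced: Dict[str, str] = {}
    log: List[Tuple[str, str, bool]] = []
    for protein in proteins:
        start = predict_first(protein)
        for name, attempt in systems[start:]:
            ok = attempt(protein)
            log.append((protein, name, ok))  # negative results are kept too
            if ok:
                produced[protein] = name
                break
    return produced, log


if __name__ == "__main__":
    # Toy stand-ins: these outcomes are invented, not experimental results.
    works_in = {"P1": {"E. coli"}, "P2": {"insect cells"}, "P3": set()}
    systems = [
        (name, lambda p, name=name: name in works_in[p])
        for name in ("E. coli", "cell-free", "insect cells")
    ]
    produced, log = triage(list(works_in), systems, predict_first=lambda p: 0)
    print(produced)
    print(len(log), "attempts logged")
```

Keeping the full attempt log, rather than only the successes, mirrors the point made in the text that negative results must be collected if predictive rules are to be inferred.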

Efforts in this area have already begun and data from protein production in E. coli have been systematically analysed. In one study, the authors found that the average pI of their successfully expressed soluble proteins was slightly lower than that of the starting protein population (6.54 versus 7.18) [21], whereas our own preliminary analysis suggests a bimodal distribution of protein purification success in which some protein groups, such as small GTPases or kinases, can be purified with high success rates, whereas proteins from other groups do not express and purify well [14]. The most sophisticated analysis has been done by Bertone et al., who developed an integrated database that stores nearly all conceivable parameters of purified proteins, including organism of origin, protein parameters and various purification parameters [40]. Using the data of 740 proteins that were expressed and purified in E. coli for structural studies, the authors perform a parameter analysis to identify criteria that can be used to predict protein solubility. Mining these data, the authors compute a decision tree, which enables the prediction of the solubility (‘yes’ versus ‘no’) of untested proteins based on biochemical and biophysical parameters. Although the database is currently intended to identify promising targets for structural proteomics, this type of analysis will also have value for the assignment of the most promising expression system for HT protein production pipelines [40].
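To illustrate what a decision-tree style rule looks like, the sketch below finds a single threshold on one numeric protein parameter that best separates soluble from insoluble proteins. This is a one-level tree (a decision stump) chosen for brevity; it is not the method or the feature set of Bertone et al., and the function name `best_split`, the feature name `mass_kda` and the example records are hypothetical, invented purely for illustration.

```python
def best_split(records, feature):
    """Find the threshold on one numeric feature that best separates
    soluble from insoluble proteins (a one-level decision tree).

    records: list of (features_dict, soluble_bool).
    Returns (threshold, below_is_soluble, accuracy), or None if the feature
    takes only one distinct value.
    """
    values = sorted({feats[feature] for feats, _ in records})
    # Candidate thresholds lie midway between consecutive observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        for below_is_soluble in (True, False):
            # Predicted soluble if the value falls on the "soluble" side.
            correct = sum(
                ((feats[feature] < threshold) == below_is_soluble) == soluble
                for feats, soluble in records
            )
            accuracy = correct / len(records)
            if best is None or accuracy > best[2]:
                best = (threshold, below_is_soluble, accuracy)
    return best


if __name__ == "__main__":
    # Invented toy records (not data from any study): mass in kDa, solubility.
    toy = [
        ({"mass_kda": 10}, True),
        ({"mass_kda": 15}, True),
        ({"mass_kda": 20}, True),
        ({"mass_kda": 60}, False),
        ({"mass_kda": 90}, False),
        ({"mass_kda": 120}, False),
    ]
    print(best_split(toy, "mass_kda"))
```

A full decision tree would apply the same search recursively to each side of the split and over several parameters; real data would of course not separate as cleanly as this toy set.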

Conclusions and future outlook

The future of many functional proteomics experiments depends on the ability to obtain large numbers of different, functional proteins. For several expression systems HT processes have been developed that can provide proteins for first experiments. These expression systems can be used in a complementary fashion dictated by the application and the protein set; however, coordination of different expression and purification schemes might pose challenges. The results from analysis of the bacterial data sets suggest that some rules can be found that will help to predict which protein groups can be made with high success rates in the respective systems. Independent of the expression systems used, it will be crucial to develop HT-compatible methods to assess the physical and functional integrity of the purified proteins so that experiments with these proteins are meaningful and can be interpreted.

As the methods for HT protein expression continue to be refined and improved, they will have a larger role in the development of other technologies. Two general strategies seem possible to satisfy long-term demand for proteins. One strategy would be to focus on increasing protein yield to maximize the number of experiments possible from a single preparation. However, this approach is associated with the cost of storing the proteins in an accessible manner that preserves protein function. Alternatively, a focus on decreasing costs and improving automation could lead to a ‘just-in-time’ approach to protein preparation. Using this approach, the process of producing protein via HT methods could become sufficiently robust and inexpensive that it will be easier to prepare protein fresh for each experiment than to make large amounts of protein and store it.

Acknowledgements

The authors appreciate the valuable input from Doug Buckley (Exelixis), Jim Hartley (NIH), Andre Iffland (Harvard Medical School) and T. Murthy (Harvard Medical School).

References

1 MacBeath, G. and Schreiber, S.L. (2000) Printing proteins as microarrays for high-throughput function determination. Science 289, 1760–1763

2 Zhu, H. et al. (2001) Global analysis of protein activities using proteome chips. Science 293, 2101–2105

3 King, R.W. et al. (1997) Expression cloning in the test tube. Science 277, 973–974

4 Elia, A.E. et al. (2003) Proteomic screen finds pSer/pThr-binding domain localizing Plk1 to mitotic substrates. Science 299, 1228–1231

5 Gavin, A.C. et al. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141–147

6 Ho, Y. et al. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180–183

7 Blagoev, B. et al. (2003) A proteomics strategy to elucidate functional protein-protein interactions applied to EGF signaling. Nat. Biotechnol. 21, 315–318

8 Waldo, G.S. (2003) Genetic screens and directed evolution for protein solubility. Curr. Opin. Chem. Biol. 7, 33–38

9 Lesley, S.A. et al. (2002) Gene expression response to misfolded protein as a screen for soluble recombinant protein. Protein Eng. 15, 153–160

10 Wigley, W.C. et al. (2001) Protein solubility and folding monitored in vivo by structural complementation of a genetic marker protein. Nat. Biotechnol. 19, 131–136

11 Knaust, R.K. and Nordlund, P. (2001) Screening for soluble expression of recombinant proteins in a 96-well format. Anal. Biochem. 297, 79–85

12 Busso, D. et al. (2003) Expression of soluble recombinant proteins in a cell-free system using a 96-well format. J. Biochem. Biophys. Methods 55, 233–240

13 Chance, M.R. et al. (2002) Structural genomics: a pipeline for providing structures for the biologist. Protein Sci. 11, 723–738

Fig. 2. Triage approach to protein purification. In the envisioned triage approach, the biochemical and biophysical parameters of all target protein sequences would be analysed. Subsequently, the proteins would be sorted to the easiest expression system. After quality control, the successfully purified proteins will be used and the failed proteins will be re-entered into the process to be attempted in the next easiest system. In addition, all wrong predictions will be used to refine the analysis and sorting process.

[Figure 2 labels: Target proteins; Analysis of protein properties; Protein sorting; System A (Condition 1, Condition 2); System B (Condition 1, Condition 2); System C (Condition 1); Successful; Failed; Application; Algorithm refinement.]

Abbreviations: CAP, cellulose associated protein; CBP, calmodulin binding peptide; GB1, B1 domain of streptococcal Protein G; GST, glutathione-S-transferase; His6, hexa-histidine-tag; HT, high throughput; MBP, maltose binding protein; PTM, posttranslational modification; Trx, thioredoxin; ZZ, two sequential Z-domains of Staphylococcus aureus Protein A.

14 Braun, P. et al. (2002) Proteome-scale purification of human proteins from bacteria. Proc. Natl. Acad. Sci. U. S. A. 99, 2654–2659

15 Hammarstrom, M. et al. (2002) Rapid screening for improved solubility of small human proteins produced as fusion proteins in Escherichia coli. Protein Sci. 11, 313–321

16 Shih, Y.P. et al. (2002) High-throughput screening of soluble recombinant proteins. Protein Sci. 11, 1714–1719

17 Kapust, R.B. and Waugh, D.S. (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci. 8, 1668–1674

18 Smith, D.B. (2000) Generating fusions to glutathione S-transferase for protein studies. Methods Enzymol. 326, 254–270

19 Yee, A. et al. (2002) An NMR approach to structural proteomics. Proc. Natl. Acad. Sci. U. S. A. 99, 1825–1830

20 Christendat, D. et al. (2000) Structural proteomics of an archaeon. Nat. Struct. Biol. 7, 903–909

21 Lesley, S.A. et al. (2002) Structural genomics of the Thermotoga maritima proteome implemented in a high-throughput structure determination pipeline. Proc. Natl. Acad. Sci. U. S. A. 99, 11664–11669

22 Sebastian, P. et al. (2003) Semi automated production of a set of different recombinant GST-Streptag fusion proteins. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 786, 343–355

23 Middelberg, A.P. (2002) Preparative protein refolding. Trends Biotechnol. 20, 437–443

24 Giannakouros, T. and Georgatsos, J.G. (1988) A high-yield cell-free system of protein synthesis of mouse liver. Int. J. Biochem. 20, 511–519

25 Tuite, M.F. et al. (1980) Faithful and efficient translation of homologous and heterologous mRNAs in an mRNA-dependent cell-free system from Saccharomyces cerevisiae. J. Biol. Chem. 255, 8761–8766

26 Kigawa, T. et al. (1999) Cell-free production and stable-isotope labeling of milligram quantities of proteins. FEBS Lett. 442, 15–19

27 Nakano, H. et al. (1994) An increased rate of cell-free protein synthesis by condensing wheat-germ extract with ultrafiltration membranes. Biosci. Biotechnol. Biochem. 58, 631–634

28 Sawasaki, T. et al. (2002) A bilayer cell-free protein synthesis system for high-throughput screening of gene products. FEBS Lett. 514, 102–105

29 Hino, M. et al. (2002) Requirement of continuous transcription for the synthesis of sufficient amounts of protein by a cell-free rapid translation system. Protein Expr. Purif. 24, 255–259

30 Kim, D.M. and Swartz, J.R. (1999) Prolonging cell-free protein synthesis with a novel ATP regeneration system. Biotechnol. Bioeng. 66, 180–188

31 Madin, K. et al. (2000) A highly efficient and robust cell-free protein synthesis system prepared from wheat embryos: plants apparently contain a suicide system directed at ribosomes. Proc. Natl. Acad. Sci. U. S. A. 97, 559–564

32 Lesley, S.A. (1995) Preparation and use of E. coli S-30 extracts. Methods Mol. Biol. 37, 265–278

33 Yokoyama, S. et al. (2000) Structural genomics projects in Japan. Nat. Struct. Biol. 7 (Suppl), 943–945

34 Sawasaki, T. et al. (2002) High-throughput expression of proteins from cDNAs catalogue from Arabidopsis in wheat germ cell-free protein synthesis system. Tanpakushitsu Kakusan Koso 47 (Suppl. 8), 1003–1008

35 Durocher, Y. et al. (2002) High-level and high-throughput recombinant protein production by transient transfection of suspension-growing human 293-EBNA1 cells. Nucleic Acids Res. 30, E9

36 Martzen, M.R. et al. (1999) A biochemical genomics approach for identifying genes by the activity of their products. Science 286, 1153–1155

37 Possee, R.D. (1997) Baculoviruses as expression vectors. Curr. Opin. Biotechnol. 8, 569–572

38 Albala, J.S. et al. (2000) From genes to proteins: high-throughput expression and purification of the human proteome. J. Cell. Biochem. 80, 187–191

39 Zhao, Y. et al. (2003) Improving baculovirus recombination. Nucleic Acids Res. 31, E6–6

40 Bertone, P. et al. (2001) SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics. Nucleic Acids Res. 29, 2884–2898

