It was first identified (1885) by Theodor Escherich as a

Page 1

THE MOST POPULAR AMONG BACTERIAL HOSTS IS ESCHERICHIA COLI

It was first identified (1885) by Theodor Escherich as a commensal of the human gut

It is now the most widely employed microorganism worldwide, in both research and applied labs

Page 2

Most of the knowledge on transcription, translation, DNA replication, the genetic code, transducing phages and so on has been acquired by studying E. coli

Every single gene of the E. coli K-12 type strain (MG1655) has been silenced to determine which genes are essential and which are not

Page 3

It is a Gram-negative bacterium belonging to Proteobacteria, γ-proteobacteria, Enterobacteriaceae

Diderm structure (from outside to inside):

• Outer membrane (OM)

• Periplasm, containing the cell wall (peptidoglycan)

• Cellular (inner) membrane (IM)

• Cytoplasm

OUTER MEMBRANE: where surface epitopes can be expressed

PERIPLASM: an oxidizing environment, ~4% of total proteins

CYTOPLASM: a reducing compartment

Page 4

Straight rod, ~2 µm x 0.5 µm

4 main phylogenetic groups (A, B1, B2, D)

Plenty of serovars, defined by the antigens O: … H: … K: …

O (>170) = polysaccharide chains of the LPS

H (Hauch, German for «mist, whiff») (>50) = flagellar proteins

K (Kapsel, German for «capsule») (>100) = capsular antigens (microcapsular equivalent)

Escherichia coli K-12

Model microorganism

A very useful biotechnological tool

Page 5

IT GROWS WELL ON RICH MEDIA (37-42 °C), WITH A MEAN GENERATION TIME = 20 min, BUT CAN GROW IN THE RANGE 10-45 °C

ATP (ENERGY) IS PRODUCED BY

AEROBIC OR ANAEROBIC RESPIRATION

OR FERMENTATION
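To give a feel for what a 20-minute mean generation time implies, the sketch below applies the standard doubling relation N = N0 · 2^(t/g). It assumes ideal, unlimited exponential growth, which real cultures only sustain for a limited time; the function name `cells_after` is just an illustrative choice.

```python
def cells_after(n0: float, minutes: float, generation_time_min: float = 20.0) -> float:
    """Ideal exponential growth: N = N0 * 2 ** (t / g).

    Assumption: the culture stays in exponential phase for the whole
    interval, so lag and stationary phases are ignored.
    """
    return n0 * 2 ** (minutes / generation_time_min)


if __name__ == "__main__":
    # One cell after 1 hour (3 generations) and after 200 minutes (10 generations)
    print(cells_after(1, 60))
    print(cells_after(1, 200))
```

Under these assumptions a single cell gives 8 cells after one hour and 1024 after 200 minutes.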

Page 6

IT IS ABLE TO ADAPT TO SEVERAL CONDITIONS

Page 7

E. coli is the most efficient and widely used host for recombinant protein production and for the production of plasmid DNA to be used to transform other hosts

Well-known genetics

Plenty of mutants are available

Well-known fermentation pathways, which make it easier to set up and control industrial processes

Although not naturally competent, it can reach a high transformation efficiency

Fast growth in many media

E. coli provides large biomasses, and its cells are very easy to lyse

Pathogenicity class = 1: safe to use

Page 8

A LOT OF PLASMID VECTORS ARE AVAILABLE FOR E. COLI

The most common origin in the plasmid vectors employed for E. coli is the ColE1/pMB1 one

ColE1 ≈ pMB1: they differ by 1 base

The ColE1 family encompasses multicopy plasmids, in the range from 15-20 to ~500 copies/cell

ColE1/pMB1-harbouring plasmids have a relaxed control

When protein synthesis is halted, the plasmid still replicates, as the polymerases are very stable enzymes. So, the ratio of plasmid to cellular debris can be modified

Page 9

Further replicons (different incompatibility groups)

p15A (from a natural E. coli plasmid), e.g. pACYC184

Relaxed replicon, 20-40 copies

The plasmids harbouring this replication origin can coexist with ColE1/pMB1 in the same cell

pSC101 (from a natural Salmonella plasmid)

Stringent replicon, very low copy number (1-5)

Compatible with both p15A and ColE1/pMB1

F (natural plasmid, Fertility factor) and P1 (bacteriophage, φ) origins

1-2 copies/cell, used for artificial chromosomes (BACs & PACs)
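The compatibility statements on this slide can be summarised as a lookup: two plasmids are expected to coexist only if their origins belong to different incompatibility groups. The sketch below is a minimal illustration limited to the three families for which the slide states compatibility; the dictionary and the function name `can_coexist` are hypothetical.

```python
# Replicon families taken from the slide; ColE1 and pMB1 are treated as a
# single family. The labels are illustrative, not official Inc group names.
REPLICON_FAMILY = {
    "ColE1": "ColE1/pMB1",
    "pMB1": "ColE1/pMB1",
    "p15A": "p15A",
    "pSC101": "pSC101",
}


def can_coexist(ori_a: str, ori_b: str) -> bool:
    """Return True when the two origins belong to different families."""
    return REPLICON_FAMILY[ori_a] != REPLICON_FAMILY[ori_b]


if __name__ == "__main__":
    print(can_coexist("ColE1", "p15A"))   # different families
    print(can_coexist("ColE1", "pMB1"))   # same family
```

In practice, stable maintenance of two plasmids also requires a different selection marker on each of them, which this sketch does not model.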

Page 10

Conditional replication systems (suicide vectors)

Conditional replication systems are usually employed to obtain chromosomal integration of heterologous fragments, or allele exchange

E.g. R6K replication occurs at three sites on the plasmid, called the alpha, beta and gamma origins

It requires the “pi protein”, encoded by the pir gene, to function

Plasmids carrying the oriR6K will not be able to replicate in the absence of the pir gene product

In E. coli the pi protein can be supplied in trans by a prophage (lambda pir) that carries a cloned copy of the pir gene: the plasmid will replicate in a “lambda pir” host, but will be unable to do so in other E. coli strains

Conditional (temperature-sensitive) ORIs are also widely employed: they are based on mutated replication enzymes which are inactivated at >30 °C

Page 11

SELECTION MARKERS

Only a few cells take up DNA: we need selectable markers to detect them

A selective agent kills or prevents the growth of those cells that do not contain the foreign DNA, leaving only the desired ones

In bacteria, antibiotics are used almost exclusively

Escherichia coli is sensitive to

AMPICILLIN

CHLORAMPHENICOL

TETRACYCLINE

KANAMYCIN

Page 12

AMPICILLIN

It is a β-lactam antibiotic, which interferes with peptidoglycan synthesis

Its target is the transpeptidation reaction: PBPs (Penicillin Binding Proteins) act by removing the terminal D-Ala residue and creating the trans-peptide bond

β-lactam antibiotics act as pseudosubstrates for the PBPs by stably acylating the serine of their active site

[Figure: peptidoglycan cross-linking between D-Ala (D-A) and DAP residues; serine active site of the PBP]

Page 13

The resistance determinant used on plasmids (blaTEM) encodes a periplasmic class A β-lactamase (BLA_TEM)

which acts mainly on penicillins, by hydrolyzing the β-lactam ring and preventing them from binding to PBPs

[Figure: BLA_TEM opens the β-lactam ring, giving penicilloic acid (inactive); PBPs with their TPase and TGase domains]

Page 14

AMPICILLIN

ADVANTAGES

Easy to use

Low costs

Fast growth of the resistant colonies

Excellent choice for the laboratory

DRAWBACKS, DISCOURAGING FOR LARGE-SCALE PRODUCTION

The Ampr cells secrete the enzyme

The antibiotic concentration decreases

Plasmidless cells (S) can grow around the resistant (R) ones

Page 15

On solid media, this is evident from the appearance of satellite colonies

In liquid media the loss of plasmid is high (up to 80%)

If the selective agent is ampicillin, the starter cultures should not grow for more than 8-10 h

- Ampicillin degrades spontaneously at acidic pH

- If not correctly stored, its activity decreases quickly

A possible alternative is carbenicillin, more expensive but more stable

[Figure: a resistant colony (R) surrounded by sensitive satellite colonies (S)]

Page 16

CHLORAMPHENICOL

It blocks protein synthesis by binding to the 50S ribosomal subunit

The determinant used as a marker (Chloramphenicol Acetyl Transferase, CAT) derives from the Tn9 transposon

CAT is a cytosolic, trimeric protein

CAT + CM + acetyl-CoA → acetylated derivatives, unable to bind to the target

DRAWBACKS:

Chloramphenicol slows the bacterial growth rate

It is toxic and potentially carcinogenic

It is not allowed for productions intended for human use

It has to be dissolved in ethanol, and particular attention must be paid when mixing the antibiotic into the medium, as microclumps could occur

Page 17

TETRACYCLINES

They are a large group of antibiotics with a molecular structure containing four rings

They are either naturally produced, by Streptomyces aureofaciens or S. rimosus, or semisynthetic

Tetracyclines act by forming a stable bond with the 30S ribosomal subunit, so deforming the “A” site

The gene used for selection in E. coli vectors is tetC from pSC101, which encodes a tetracycline efflux system regulated by the repressor TetR

For selection, just a small amount of the hydrochloride (usually 5 μg/mL) will be sufficient

DRAWBACKS: a limited solubility (0.4 mg/mL in water; 20 mg/mL in alcohol) and photolability

[Figure: structure of tetracycline hydrochloride (HCl); E, P and A sites on the 30S subunit, with the aminoacyl-tRNA AA2]

Page 18

KANAMYCIN

Produced by Streptomyces kanamyceticus, it is the most frequently employed among the aminoglycosides

It acts by interacting with at least three ribosomal proteins, so inhibiting protein synthesis and increasing translation errors

The genetic determinant used for plasmid selection is derived from the Tn903 transposon and confers resistance to both kanamycin and neomycin

Its product is APH(3')-I, an aminoglycoside 3'-phosphotransferase

For selection, kanamycin sulfate (H2SO4 salt) is used, generally at a concentration of 30-50 μg/mL

Page 19

Antibiotic resistance genes can also be used for SCREENING (older plasmids), but this implies a longer handling time

E.g. cloning within the tet gene, which is therefore disrupted

Day 1: after selection on AMP, the transformant colonies are streaked or replica-plated on tetracycline plates

Day 2: the clones unable to grow on tetracycline harbour a plasmid carrying the insert and must be picked from the first (AMP) plate

[Figure: plasmid map with the AMP and TET resistance genes; AMP and TET plates]
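The two-day replica-plating logic amounts to a set difference: clones that grow on ampicillin but not on tetracycline are the candidates carrying an insert in the tet gene. A minimal sketch, in which the colony labels and the function name `recombinant_clones` are hypothetical:

```python
def recombinant_clones(grown_on_amp, grown_on_tet):
    """Clones growing on AMP but not on TET (insert disrupts the tet gene)."""
    return sorted(set(grown_on_amp) - set(grown_on_tet))


if __name__ == "__main__":
    amp_plate = ["c1", "c2", "c3", "c4"]   # hypothetical colony labels
    tet_plate = ["c1", "c4"]               # colonies that also grew on TET
    print(recombinant_clones(amp_plate, tet_plate))
```

Here `c2` and `c3` would be picked from the AMP plate for further analysis.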

Page 20

The best-suited screening markers are those conferring features that can be detected by histochemical methods

α-peptide of the β-galactosidase (blue/white)

Reagent: X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside)

It requires a defective host strain, deleted in the α-peptide encoding region, so that it can be complemented by the plasmid

The insertional inactivation is obtained by cloning the DNA fragment within the marker-encoding gene

Page 21

Some other screening systems can be used with every E. coli strain (i.e. they do not depend upon mutations in the host genome)

phoC (Morganella morganii): a non-specific phosphatase, which can be detected by adding BCIP (5-bromo-4-chloro-3-indolyl-phosphate) to the agar medium

Melanin-like pigment of Streptomyces: can be detected by adding tyrosine to the agar medium

It has, however, a rather low sensitivity, and the browning is not clear-cut on isolated colonies

The insertional inactivation is obtained by cloning the DNA fragment between the marker-encoding gene and the promoter

Page 22

PROMOTERS

A good promoter must:

be strong enough, that is, be as consistent as possible with the E. coli consensus

-35: T78 T82 G68 A58 C52 A54 -- spacer: 16 (21%), 17 (52%), 18 (19%) nt -- Pribnow (-10): T82 A89 T52 A59 A49 T89

(the number after each base is its % conservation at that position)

function in many (most) E. coli strains

be inducible in a simple and inexpensive manner

independent from the normal components of culture media

Page 23

A tight control is needed because

a strict control makes it possible to obtain a good biomass yield before starting to express the product

The presence of heterologous proteins can induce/activate the host proteases

The product could be toxic to the cell

or could interact non-specifically with some cellular components

By binding or damaging DNA

By sequestering essential proteins

Through an improper interaction

By causing oxidative stresses or inducing response systems

Page 24

“UP” SEQUENCES are «optional» promoter features

A+T-RICH SEQUENCES LOCATED UPSTREAM OF THE -35

Whenever present, they increase the expression level (30-70%)

by binding to the alpha subunit of the RNA polymerase

They are less conserved than the -35 and -10 regions. Their consensus is [-59 nnAAA(A/T)(A/T)T(A/T)TTTTnnAAAAnnn -38]

UP sequences have been observed in

Monoderms (Bacillus, Clostridium)

Diderms (E. coli)

Bacteriophages (λ and Mu)

Page 25

Shine-Dalgarno (SD) and start of the ORF

A good SD region in E. coli is about 6-8 bp long

consensus 5’-UAAGGAGG-3’

The start codon is usually AUG. Some microbial species with a high G+C content can use alternative codons (the most frequent one is GUG). To express these ORFs in E. coli, it is expedient to modify the start codon

In the region between the SD and the start codon, it is very important to check:

The number of bp

4-14 possible

7-9 most frequent

8 OPTIMAL

UAAGGAGG-xxxxxxxx-AUG

The quality of the bases

A+T OK

G+C NO!

C and G negatively bias the translation efficiency
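The two checks on the SD-to-start-codon spacer (length and base composition) can be sketched in a few lines. This is only an illustration under simplifying assumptions: it looks for an exact match to the consensus UAAGGAGG and takes the first AUG downstream as the start codon; the function name `sd_spacer_report` and the example sequence are hypothetical.

```python
SD_CONSENSUS = "UAAGGAGG"


def sd_spacer_report(mrna: str, sd: str = SD_CONSENSUS) -> dict:
    """Report length and G+C count of the spacer between SD and AUG.

    Assumptions: the SD matches the consensus exactly, and the first AUG
    found downstream of the SD is the real start codon.
    """
    rna = mrna.upper().replace("T", "U")
    sd_pos = rna.find(sd)
    if sd_pos < 0:
        raise ValueError("SD consensus not found")
    spacer_start = sd_pos + len(sd)
    start_pos = rna.find("AUG", spacer_start)
    if start_pos < 0:
        raise ValueError("no AUG downstream of the SD")
    spacer = rna[spacer_start:start_pos]
    length = len(spacer)
    return {
        "spacer": spacer,
        "length": length,
        "gc_count": sum(base in "GC" for base in spacer),
        "length_possible": 4 <= length <= 14,   # ranges quoted on the slide
        "length_frequent": 7 <= length <= 9,
        "length_optimal": length == 8,
    }


if __name__ == "__main__":
    # Hypothetical 5' region with an 8-nt, A+U-only spacer
    print(sd_spacer_report("ccUAAGGAGGAAUAAUAUAUGGCU"))
```

A thermodynamic tool remains the better choice for real designs; this sketch only flags spacers that fall outside the ranges given above or that contain G and C.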

Page 26

It’s also necessary to avoid palindromic sequences involving the ORF start or masking the SD region

ccugaauUAAGGAGGnnnnnnAUGauucagg

UAAGGAGGacucgagaAUGnnnnnnnnucucgagu

The best thing would be to calculate the translation rate for each individual CDS, in terms of thermodynamics

and, if needed, to modify the SD sequence accordingly (the translation rate should not exceed the folding and secretion rates)

This is almost impossible to do experimentally, but there are programs that can help, such as the RBS Calculator at http://salis.psu.edu/software
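As a rough first screen for the palindromic sequences shown on this slide, one can search an mRNA for a segment whose reverse complement reappears further downstream. The sketch below is a naive exact-match search, not a thermodynamic calculation; the stem length, the minimum loop and the function names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string made of A, C, G, U."""
    return rna.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(rna: str, stem: int = 7, min_loop: int = 3):
    """Return (left_start, left_seq, right_start, right_seq) tuples.

    A hit means that the window starting at left_start could base-pair
    with an exact reverse-complement copy found downstream. Windows
    containing letters other than A, C, G, U (e.g. 'n') are skipped.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        if set(left) - set("ACGU"):
            continue
        j = rna.find(reverse_complement(left), i + stem + min_loop)
        if j != -1:
            hits.append((i, left, j, rna[j:j + stem]))
    return hits


if __name__ == "__main__":
    # First example from the slide: the flanks can pair and mask SD + AUG
    print(find_inverted_repeats("ccugaauUAAGGAGGnnnnnnAUGauucagg"))
```

On the first slide example this reports the 7-nt flank CCUGAAU at position 0 pairing with AUUCAGG at position 24, i.e. a hairpin enclosing both the SD and the start codon.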

Page 27

The effects of the overexpression are depicted by the growth curve of the host

Favourable curves

The growth does not change after the induction

The growth rate decreases after the induction

If the host cells lyse: check for lysis, consider its extent and the related loss of product

and decide whether or not to try other conditions and/or induction times

[Figure: growth curves, with the induction time marked]

Page 28

Negative curves

Constitutive expression of a toxic product

The growth is inhibited / the host cells lyse

[Figure: growth curves, with the induction time marked]

Page 29

Promoters frequently used in E. coli

lac promoter

INDUCTION: ALLOLACTOSE (lactose byproduct) / IPTG

YIELD: MEDIUM to LOW

REGULATION: NEGATIVE (Repressor: LacI)

CONTROL: INADEQUATE

[Figure: lac operon, with I (repressor), P (promoter), O (operator), Z (β-galactosidase), Y (permease) and A; LacZ + H2O converts lactose into glucose + galactose, or into allolactose]

Page 30

Negatively regulated genes are expressed at a low (basal) level in the bacterial cell

Due to the basal expression, a few molecules of LacZ and LacY are already there when lactose is made available

so as to convert lactose to allolactose and relieve the repression (induction)

Page 31

Plac is a rather leaky promoter: its basal expression is hardly limited

unless the repressor is located on the same plasmid

or the host is a LacI-overexpressing (lacIq) one

In the lab, induction is often not actually necessary

Page 32

IPTG is a gratuitous inducer

it binds to the repressor, which is no longer able to bind the operator

but it is not metabolized

so its concentration remains constant and can be controlled

However, it is relatively expensive (lactose can be used instead)

IPTG is not allowed in GMP for human use, further limiting the usefulness of Plac for industrial productions

Page 33

Although widely used for lab applications, Plac is not a strong promoter

TTTACA (-35)  TATGTT (-10)  TGGAATTGTGAGCGGATAACAATT (operator region)  Plac

it differs from the consensus in 3 nucleotides

the spacer length is sub-optimal (18 nt)

The derivative PlacUV5 is much stronger, due to a 2 bp mutation in the -10 hexamer that enhances the recruitment of the RNA polymerase σ70 subunit

TTTACA (-35)  TATAAT (-10)  TGGAATTGTGAGCGGATAACAATT (operator region)  lacUV5

the spacer length is still not ideal
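The statement that Plac differs from the consensus in 3 nucleotides, and that lacUV5 repairs the -10 hexamer, can be verified by counting mismatches against the consensus hexamers TTGACA (-35) and TATAAT (-10). A minimal sketch, with illustrative function names:

```python
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def mismatches(seq: str, ref: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq) != len(ref):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq.upper(), ref.upper()))


def promoter_mismatches(box35: str, box10: str) -> int:
    """Total mismatches of the -35 and -10 hexamers versus the consensus."""
    return mismatches(box35, CONSENSUS_35) + mismatches(box10, CONSENSUS_10)


if __name__ == "__main__":
    print("Plac   :", promoter_mismatches("TTTACA", "TATGTT"))
    print("lacUV5 :", promoter_mismatches("TTTACA", "TATAAT"))
```

Plac scores 3 (one mismatch in the -35, two in the -10), while lacUV5 scores 1, since only the -35 mismatch remains. The count ignores the spacer length, which is a separate criterion.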

Page 34

Region from -65 to -35 (dots = same base as the wild type):

WILD TGTGAGTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCT

UV5  ...A..................................................

Region with the -10, +1, operator and RBS:

WILD CGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATG

UV5  .....AA................................................

The good recruitment of σ70 bypasses the need for the positive activation (CAP) typical of the wild-type Plac

Moreover, a third point mutation, located in the CAP/cAMP binding site, decreases the affinity of PlacUV5 for CAP/cAMP and its sensitivity to catabolite repression

The net effect of the three point mutations is the creation of a stronger promoter that is less sensitive to the glucose effect

Page 35

PtacI

It is a synthetic hybrid promoter derived from the E. coli trp and lacUV5 promoters

INDUCTION: IPTG

YIELD: MEDIUM-HIGH (5x higher than the parental promoters)

REGULATION: LacI

BASAL EXPRESSION: HIGH

Promoter   -35      -10      lac operator
trp        TTGACA   TTAACT
lacUV5     TTTACA   TATAAT   AATTGTGAGCGGATAACAATT
tac I      TTGACA   TATAAT   AATTGTGAGCGGATAACAATT   RBS

Ptac ensures a good yield: higher than Plac but not as high as the T7 promoter

Its strength could be a drawback for toxic products and membrane proteins

Page 36

tetA

INDUCTION: ANHYDROTETRACYCLINE

YIELD: MEDIUM TO HIGH

REGULATION: NEGATIVE

CONTROL: TIGHT

BASAL EXPRESSION: LOW

[Figure: TetR represses (-) PtetA, which drives the heterologous gene]

The repressor-encoding gene is placed on the plasmid: that’s why the system does not depend on the background of the host strain

NOT DEPENDENT UPON THE STRAIN AND/OR THE METABOLIC STATE OF THE HOST

THE INDUCER MOLECULE IS NOT EXPENSIVE

Page 37

The induction is performed with ANHYDROTETRACYCLINE (A-TET) at low concentrations

A full induction (~50 ng/mL) does not affect the growth rate of E. coli at all

Binding efficiency: A-TET binds about 35 times more efficiently than TET

Bactericidal action: A-TET is about 100 times less active than TET

Non-induced cells : induced cells = 1 : 100

This promoter has been successfully used to produce many heterologous proteins (Fabs, toxins...)

[Figure: structure of anhydrotetracycline]

Page 38

AraBAD

The promoter of this tightly controlled operon is a suitable tool for heterologous expression in E. coli

INDUCTION: ARABINOSE, DOSE-DEPENDENT

YIELD: TUNABLE

REGULATION: POSITIVE

CONTROL: TIGHT

BASAL EXPRESSION: VERY LOW

araBAD     CTGACG -- 18 -- TACTGT
consensus  TTGACA -- 17 -- TATAAT

The almost total lack of basal expression, in arabinose-free media supplemented with glucose, balances the structural weakness of this promoter

Inducer = arabinose: very cheap and suitable for Good Manufacturing Practices

The amount of arabinose needed to induce expression depends upon the genetic background of the host strain

[Figure: AraC acts positively (+) on the promoter driving the heterologous gene; arabinose (ARA) doses from 0.001% to 1%]

Page 39

rhaPBAD

INDUCTION: RHAMNOSE

YIELD: MEDIUM

REGULATION: TIGHT AND ELABORATE

CONTROL: TIGHT

BASAL EXPRESSION: LOW

RhaR: transcriptional activator

RhaS positively regulates the entire regulon (rhaBAD: catabolism, and rhaT: transport)

The regulon is, however, also subject to catabolite repression

The optimal time to induce and to collect the product must be carefully determined experimentally

[Figure: L-rhamnose, RhaR, RhaS and the products RhaB, RhaA, RhaD, RhaT]

Page 40

Phage promoters (Lambda and T7)

Bacteriophages offer a possible alternative to the regulated promoters of metabolic genes

λ phages can enter a lytic (right arm) or a lysogenic (left arm) cycle

The choice between the lysogenic and the lytic cycle depends upon the balance between CI and CRO

The CI product is the Lambda repressor

If CI is already expressed in a bacterial cell (that is, if there is a prophage), no other lambda phage is able to infect the same cell

[Figure: λ genome map, with cos, genes A W B C D E F Z U V G T H M L K I J, att, int, xis, cIII, N, PL, cI, PR, cro, cII, O, P, Q, S, R]

Page 41

PL and PR promoters are directly recognized by the bacterial RNA polymerase

and are very efficiently regulated by the Lambda repressor

The Lambda PL/OL promoter-operator ensures medium to high expression levels

To further improve the control of the promoter, mutated forms of the CI repressor are usually employed

[Figure: λ genome map, highlighting λPL]

Page 42

To easily switch the expression on/off, a temperature-sensitive mutant of the lambda repressor (cI857; C→T at position 37742) is most frequently used

29-30 °C: temperature-sensitive cI857 repressor ON (active form); λPL = repressed

42 °C: temperature-sensitive cI857 repressor OFF (inactive form); λPL = constitutive, the heterologous gene is expressed

It is very advantageous for the overexpression of proteins susceptible to proteolysis, which would be degraded at more growth-friendly temperatures

Page 43

An alternative to the use of a temperature-sensitive mutant is to put the CI repressor under the control of Ptrp

Without tryptophan, Ptrp drives the cI repressor, which keeps λPL and the heterologous gene switched off

An economical culture medium, practically devoid of tryptophan, contains molasses and acid casein hydrolysate

By adding tryptone (a tryptic digest of casein, rich in tryptophan) to the medium, the expression of the repressor halts and the expression of the foreign gene starts

Page 44

In natural conditions the lytic cycle is triggered by the SOS response, through the action of RecA

Another interesting application of the CI repressor was proposed in 2016 by G. Durante-Rodríguez et al.

These authors engineered an artificial chimeric regulator

by fusing the DNA-binding domain of CI with the BzdR repressor of the bacterium Azoarcus, which responds to benzoyl-CoA

Page 45

[Figure: λNTCI + λCCI: PR promoter repressed, then derepressed upon stress; λNTCI + LCBzdR: PR promoter derepressed by benzoyl-CoA]

Page 46

Unlike the Lambda PL promoter, the T7 promoter is not recognised by the E. coli RNA polymerase, as it drives the class 2 (late) genes

Transcription from the T7 promoter depends upon the T7 RNA polymerase, which is produced among the class 1 (early) genes, transcribed by the host RNA polymerase

So, when choosing to use a T7 promoter, we need a bacterial host expressing the T7 RNA polymerase

Page 47

The T7 RNA polymerase is faster than the E. coli one, and its expression has to be tightly controlled

The most popular host is the commercially available E. coli BL21(DE3)

A lysogenic E. coli «B» strain harbouring the DE3 Lambda phage

In the λDE3 phage the T7 polymerase is expressed from the lacUV5 promoter, so that it can be successfully produced even in the presence of glucose

The λDE3 mutant has been obtained by inserting the PlacUV5-T7 fragment within the integrase-encoding gene

The disruption of the int gene prevents the excision of the DE3 prophage

Page 48

Of course, it is possible to construct one’s own lysogenic (DE3) strain, with the same technique used to construct BL21(DE3) or other commercially available (DE3) strains

To do so, 3 (4) different bacteriophages are needed:

λDE3 (int-): it is not able to integrate into (or excise from) the bacterial chromosome by itself

A Helper Phage (B10), which provides the int function to λDE3 but cannot form a lysogen by itself because it is cI- (it has no repressor)

Neither of these phages can propagate, due to another mutation affecting the ability to lyse the bacteria

By means of the integrase provided in trans by the Helper Phage, λDE3 will be able to integrate in some bacterial cells

However, the colonies of the lysogenic cells would grow among an overwhelming number of WT ones, so we need a selection tool

Page 49

The selection phage (B482)

- It cannot kill λDE3 lysogens, because it has the same immunity as λDE3

- It cannot integrate into susceptible cells (cI-)

- It kills the mutated bacterial cells resistant to λDE3, which would otherwise survive and hamper the screening

Lysogens are prepared by co-infection with the three phages (λDE3, B10, B482)

B10 + λDE3 integration: unaffected by B482

No DE3 integration: killed by B482

Most of the growing colonies should be (DE3) lysogens

Page 50

To check the lysogenic state, a 4th phage is employed

The Tester Phage is a T7 mutant, deleted in the T7 RNA polymerase region

So it is unable to form large plaques on WT E. coli cells

But it will succeed in killing the λDE3 lysogenic cells, as they produce T7 polymerase once induced with IPTG

Page 51

Notwithstanding the tight control, some basal activity still occurs; it is not usually a problem, but it could be if the product is toxic

A further control may be exerted with the T7 lysozyme (LSZM)

Lysozyme binds directly to the T7 RNA polymerase, hampering its activity

But it has the drawback of slowing the host growth rate, as it cuts the cell wall, weakening the cell structure

The LSZM-encoding gene has been cloned in a p15A plasmid, obtaining pLysE and pLysS

[Figure: maps of pLysE and pLysS, both with the p15A origin, Cmr and the T7 LSZM gene]

Page 52

pLysE and pLysS differ only in the orientation of the T7 lysozyme-encoding gene

But it is a very important difference

In pLysE the gene is transcribed from the very strong Tet promoter

Actually, this was the intended construct

However, the amount of lysozyme produced is too high

The cells become too frail

[Figure: map of pLysE, with the p15A origin, Cmr, the Tet promoter and the T7 LSZM gene]

Page 53

Unexpectedly, the opposite orientation (pLysS) proved to work

as the gene was transcribed from the weak T7 Φ3.8 promoter

T7 Φ3.8 is located DOWNSTREAM of the gene, but transcribes it together with the entire plasmid

In this case the lysozyme produced is sufficient to hamper the T7 polymerase activity without halting the growth of the culture

[Figure: map of pLysS, with the p15A origin, Cmr, the T7 LSZM gene and the T7 Φ3.8 promoter]

Page 54

INDUCTION

After the induction, the product output steps up abruptly, but the growth of the hosts harbouring a T7-driven plasmid could stop

It is essential to carefully determine the right timing in order to get the maximal yield

A T7-driven expression can sometimes cause some aggregation of the produced proteins

A possible solution is to slow the production rate

By dosing the IPTG (25-100 µM instead of 1 mM)

By decreasing the incubation temperature
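When the IPTG dose is lowered from 1 mM to 25-100 µM, the volume of stock to add follows from C1·V1 = C2·V2. The sketch below assumes a 1 M IPTG stock, which is a common but here purely hypothetical choice, and ignores the small increase in culture volume; the function name `stock_volume_ul` is illustrative.

```python
def stock_volume_ul(final_um: float, culture_ml: float, stock_mm: float = 1000.0) -> float:
    """Microlitres of stock needed to reach final_um (µM) in culture_ml (mL).

    Uses C1 * V1 = C2 * V2 with concentrations in µM and volumes in µL.
    Assumption: stock_mm defaults to 1000 mM (a 1 M stock).
    """
    stock_um = stock_mm * 1000.0          # mM -> µM
    culture_ul = culture_ml * 1000.0      # mL -> µL
    return final_um * culture_ul / stock_um


if __name__ == "__main__":
    for dose_um in (1000, 100, 25):       # 1 mM, 100 µM, 25 µM
        print(dose_um, "µM ->", stock_volume_ul(dose_um, culture_ml=100), "µL")
```

For a 100 mL culture this gives 100 µL of 1 M stock for 1 mM, 10 µL for 100 µM and 2.5 µL for 25 µM, so an intermediate dilution of the stock may be more practical at the lowest doses.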

Page 55

CE6 BACTERIOPHAGE

The basal expression of T7 RNA polymerase prevents the use of lysogenic hosts with particularly toxic genes transcribed from the T7 promoter

In such cases the λCE6 bacteriophage can be used to provide a source of T7 RNA polymerase to susceptible hosts

The polymerase gene is cloned into the int gene, so as to be transcribed from the λPL and λPI promoters during the infection, under the control of cI857

λCE6 is not able to enter the lytic cycle because of the “Sam” mutation (A→G at position 45352), which inactivates the lysis protein gpS

[Figure: map of λCE6, with the T7 RNA polymerase gene inserted in int, PI, xis, cIII, N, PL, the immunity region (cI857, PR, cro), cII, O, P, Q]

Page 56

When CE6 infects the cell, the T7 RNA polymerase synthesized de novo starts to transcribe the target DNA with a very high efficiency

The system is not as efficient as the use of lysogenic (DE3) strains

But until the infection is performed there are no polymerases able to recognize the T7 promoter and to transcribe the target gene!

CE6 can be propagated in the E. coli LE392 (supF) strain, which suppresses the Sam7 mutation, allowing the phage to enter the lytic cycle

Page 57

Another possible drawback of the T7 promoter in lysogenic strains is the uneven expression among the single transformant clones

so that it is necessary to examine several colonies to look for those with the highest production level

[Figure: several BL21(DE3) transformant clones with different expression levels]

Page 58

Another variant is the T7/lac hybrid promoter

[Figure: PT7 followed by the lac operator (+1) and the heterologous gene, with LacI bound and with LacI released by IPTG]

The basal transcription is blocked by LacI

The IPTG used to induce the T7 RNA polymerase expression in BL21(DE3) relieves at the same time the repression of the target gene

Very often the lacI gene is included in the vectors with this system, to ensure a tight control

T7/lac can also be combined with the use of pLysS

Page 59

Another important feature is the TERMINATOR

On the expression vectors, a strong rho-independent terminator is often located downstream from the MCS, to ensure the proper release of the mRNA from the RNA polymerase

Terminators are palindromic sequences followed by an A stretch (a U stretch at the 3’ end of the mRNA)

T7

<<<<<<<:::<:<<:-:--:-:>>:>:::>>>>>>>

AACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTG

rrnD

<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>

AAAACAAAAGGCTCAGTCGGAAGACTGGGCCTTTTGTTTT
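The palindromic nature of the rrnD terminator can be checked by pairing the two halves of the stem. The sketch below assumes an 18-nt stem and a 4-nt loop, as the bracket notation above appears to indicate, and counts Watson-Crick pairs plus G·U wobble pairs; the function name `stem_pairs` is illustrative.

```python
# Watson-Crick pairs plus the G-U wobble pair allowed in RNA stems
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_pairs(seq: str, stem_len: int, loop_len: int) -> int:
    """Count paired positions between the 5' and 3' halves of a hairpin.

    The hairpin is assumed to start at the first base of seq: stem_len
    bases, then loop_len unpaired bases, then stem_len bases again.
    """
    rna = seq.upper().replace("T", "U")
    left = rna[:stem_len]
    right = rna[stem_len + loop_len: 2 * stem_len + loop_len]
    return sum((a, b) in PAIRS for a, b in zip(left, reversed(right)))


if __name__ == "__main__":
    rrnd = "AAAACAAAAGGCTCAGTCGGAAGACTGGGCCTTTTGTTTT"
    print(stem_pairs(rrnd, stem_len=18, loop_len=4), "of 18 positions paired")
```

With these assumptions all 18 stem positions pair (17 Watson-Crick pairs and one G·U wobble), consistent with a strong hairpin.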

Page 60

The terminator prevents the genes located downstream of the target one, and in the same orientation, from being co-transcribed from the promoter

[Figure: promoter (P), target gene and downstream gene, without and with the terminator]

Page 61

On some plasmids, for example, the β-lactamase (BLA) encoding gene can be transcribed from the T7 promoter

The degradation rate of ampicillin increases

Plasmidless (sensitive, S) cells can overgrow

Page 62

If there are other promoters, even far from the target ORF but oriented in the same direction

it may be expedient to put a strong terminator also UPSTREAM of the expression cassette

so as to prevent undue transcripts involving the gene of interest

Page 63

WHERE IS IT BETTER TO DIRECT THE RECOMBINANT PRODUCT IN THE CELL?

CYTOPLASM

Advantages: inclusion bodies (easy purification, protection from proteases, inactive and thus non-toxic proteins); higher yield; simpler plasmid design

Drawbacks: inclusion bodies (protein folding, denaturation/refolding); N-terminal extension

PERIPLASM

Advantages: simple purification; low proteolysis level; improved folding; N-terminus authenticity

Drawbacks: the signal does not always facilitate the export; inclusion bodies may form

OUTSIDE THE CELL

Advantages: the least extensive proteolysis; simple purification; improved folding; N-terminus authenticity

Drawbacks: usually no secretion; cell lysis

Page 64

CYTOPLASM

E. coli cytoplasm is a very crowded environment

The macromolecule concentration can reach 300–400 mg/mL

On average, a ribosome releases one protein chain every 35 seconds

In such conditions the first challenge for a protein is to fold correctly

The small (<100 residues) single-domain host proteins efficiently reach a native conformation owing to their fast folding kinetics

Large multidomain and overexpressed recombinant proteins often require the assistance of folding modulators

Page 65

IN VIVO, PHYSIOLOGICAL PROTEIN FOLDING IS ASSISTED BY CHAPERONES

The nascent polypeptides first interact with the ribosome-bound Trigger Factor (TF)

TF transiently associates with the ribosome

to form a protected folding space where nascent polypeptides are shielded from both proteases and aggregation

Once released from TF, the peptides can either

fold spontaneously (roughly 70% of cytosolic proteins under normal growth conditions)

or require further folding assistance from downstream chaperones

Page 66

Proteins requiring folding assistance can enter

the KJE system (5-18%): DnaK (Hsp70) with its co-chaperone DnaJ and the nucleotide exchange factor GrpE

the ELS system (10-15%): GroEL (Hsp60) and its co-chaperone GroES

DnaK and DnaJ stabilize the nascent chain in a folding-competent state already during translation

by binding hydrophobic segments that are exposed in the extended chain but will later be buried within the folded protein

The substrate proteins ejected from DnaK can

fold into a proper conformation

be recaptured by DnaK/DnaJ for additional cycles of binding/release

be transferred in a semi-folded form to GroEL/GroES

Page 67

GroEL is made of two rings, and the substrate protein binds inside the open one

The GroEL ring binds ATP

and rapidly recruits GroES, which caps the cavity

The ring becomes an enlarged folding chamber with a hydrophilic lining, where substrate folding (~10 s) is timed by ATP hydrolysis

[Figure: GroEL (EL) and GroES (ES) reaction cycle]

Page 68

Once folded, the peptide is released when ATP binds again and a new peptide enters on the opposite side

Meanwhile, GroES, the bound ADP and the folded peptide from the previous cycle are ejected from the opposite ring

If the ejected substrate still exhibits significant surface hydrophobicity, it is recaptured and enters a new cycle

[Figure: GroEL (EL) and GroES (ES) reaction cycle, continued]

Page 69

To facilitate the correct folding of recombinant proteins

some vectors include the genes encoding DnaK/DnaJ or GroEL/GroES

The chance of success, however, critically depends on the structure of the heterologous protein

As the cytoplasm is a REDUCING compartment, proteins that depend on disulphide bond formation are easily misfolded

In such cases it is often more expedient to manipulate the host strain rather than rely on plasmid features

OR

to direct the peptide to the PERIPLASM

Page 70

In eukaryotic cells, disulphide bonds are preferentially formed in the endoplasmic reticulum (ER)

The BACTERIAL PERIPLASM is an oxidizing compartment

It can therefore act as a surrogate for the ER, although translocating nascent polypeptides through the inner membrane introduces a delicate step

It hosts enzymes catalyzing both disulphide bond formation and isomerization, as well as specific chaperones and foldases

Page 71

The number of available gates to the periplasm is limited, so metastable precursors may accumulate in the cytoplasm

Fusion to a suitable leader peptide allows the unfolded precursors to be translocated into the periplasm by either

the Sec (post-translational) system

or the SRP (co-translational) system

Page 72

SEC-dependent translocation (General Secretory Pathway, GSP)

The SecB chaperone binds the pre-protein and brings it to the general secretion apparatus

The Sec(YEG) translocases form a transmembrane channel

The energy is provided by the ATPase SecA

The co-translational SRP system uses the same channel as the GSP but, instead of SecB, a small ribonucleoprotein, the signal recognition particle (SRP, containing the Ffh protein), acts as the chaperone

[Figure labels: periplasm, Ffh, 5’ 3’]

Page 73

SRP binds to the signal peptide of the nascent polypeptide, forming an SRP–ribosome nascent chain complex

It then binds to its own receptor (FtsY) on the cytoplasmic side of the inner membrane, pulling the complex to the membrane

The SRP pathway is an alternative to post-translational Sec secretion and is required when premature folding of the protein in the cytoplasm must be avoided

Page 74

Another translocation system is TAT (Twin Arginine Translocase)

The Tat apparatus is energised exclusively by the transmembrane proton electrochemical gradient (Δp)

It is made of three essential transmembrane proteins (TatA, TatB, TatC) and two accessory ones (TatD, TatE)

Page 75

The signal peptides recognized by Tat are characterized by a pair of arginine residues in the N-terminal region

Unlike Sec, Tat translocates only correctly folded polypeptides into the periplasm

Page 76

The typical signal sequence of Gram-negative (didermal) bacteria

It is about 25 amino acids long (17-18 25-27)

THE LENGTH IS CRITICAL

[Figure: positive region, hydrophobic region, polar region, followed by the mature peptide]

Page 77

AND SO IS THE STRUCTURE

At least one positively charged residue (Lys, Arg) within the first 7

A central hydrophobic region (L, V, I)

A Ser- or Ala-rich C-TERMINUS

Frequently Pro at (-6)

Typically Ala at positions -3 and -1 = cut site for the signal peptidase (Ala-X-Ala); less frequently Ala is replaced by Ser or Gly

P S S A H A L V L F L A L L Y M K K F

Co-translational translocation by SRP requires highly hydrophobic leader sequences
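The structural rules listed on this slide can be turned into a rough checklist. The sketch below is illustrative only. The function name `check_sec_signal`, the choice of residues counted as hydrophobic, the window taken as the "central" region and the 50% threshold are all assumptions. Only the "first 7 residues" rule and the Ala/Ser/Gly positions at -3 and -1 come from the slide. The test leader is the first 19 residues of the Pseudomonas otitidis β-lactamase precursor shown later in these slides. The negative control is a made-up sequence.

```python
def check_sec_signal(leader):
    """Check a candidate Sec leader against three rules of thumb.

    Returns a dict of booleans; it does not predict export efficiency.
    """
    # Rule 1 (from the slide): at least one Lys/Arg within the first 7 residues
    n_region_ok = any(aa in "KR" for aa in leader[:7])

    # Rule 2 (window and threshold are assumptions): the stretch between
    # residue 8 and the last 6 residues is mostly hydrophobic
    core = leader[7:-6]
    hydrophobic = sum(aa in "LVIAFM" for aa in core)
    h_region_ok = len(core) > 0 and hydrophobic / len(core) >= 0.5

    # Rule 3 (from the slide): Ala, or less often Ser/Gly, at -3 and -1
    cut_site_ok = len(leader) >= 3 and leader[-3] in "ASG" and leader[-1] in "ASG"

    return {"n_region": n_region_ok, "h_region": h_region_ok, "cut_site": cut_site_ok}


leader = "MRTLTTLGLALLLAQPAVA"       # from the P. otitidis precursor in these slides
not_a_leader = "MSDEEQTTGSPNDEQTS"   # hypothetical acidic, polar sequence

print(check_sec_signal(leader))
print(check_sec_signal(not_a_leader))
```

A sequence that passes all three checks is only a plausible candidate: dedicated predictors are still needed to estimate the actual cleavage site.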

Page 78

TAT signal sequences are longer than their SEC counterparts (38 amino acids, on average)

This is mainly due to the extended N-region

N-region H-region C-region

MNNNDLFQASRRRFLAQLGGLTVAGMLGPSLLTPRRATAAQA

To be targeted to the Tat pathway, the N-terminal signal peptide must harbour at least two consecutive arginine residues within an S/T–R–R–x–F–L–K consensus

The H-region of Tat signal peptides is less hydrophobic than that of Sec-specific signal peptides, owing to the presence of more glycine and threonine residues
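The twin-arginine consensus lends itself to a simple pattern search. The sketch below is a minimal illustration, not a Tat predictor. The function name `find_tat_motif` is hypothetical, and the pattern deliberately omits the final K of the S/T-R-R-x-F-L-K consensus, because the example sequence on this slide has an alanine at that position (treating the K as optional is an assumption).

```python
import re

# S or T, two consecutive Arg, any residue, then Phe-Leu.
# Assumption: the trailing Lys of the consensus is treated as optional.
TAT_MOTIF = re.compile(r"[ST]RR.FL")


def find_tat_motif(signal_peptide):
    """Return (1-based start position, matched residues) or None."""
    match = TAT_MOTIF.search(signal_peptide)
    if match is None:
        return None
    return match.start() + 1, match.group()


tat_leader = "MNNNDLFQASRRRFLAQLGGLTVAGMLGPSLLTPRRATAAQA"  # sequence from this slide
sec_leader = "MRTLTTLGLALLLAQPAVA"  # Sec-type leader used elsewhere in these slides

print(find_tat_motif(tat_leader))
print(find_tat_motif(sec_leader))
```

The motif is found at residue 10 of the 42-residue example, which is consistent with the extended N-region described above.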

Page 79

A Sec signal sequence is present in almost all expression vectors, upstream of the recombinant peptide

Some of the most popular:

PelB (PECTATE LYASE)

ST-II (E. coli THERMOSTABLE TOXIN)

DsbA (OXIDOREDUCTASE)

OmpT (MEMBRANE PROTEASE)

PhoA (ALKALINE PHOSPHATASE)

[Figure labels for the source organisms: K12, ETEC, PEC]

Page 80

Any ambiguity in the leader peptide should be avoided

This signal peptide, from the chromosomal β-lactamase B3 of Pseudomonas otitidis, showed two possible cutting points

MRTLTTLGLALLLAQPAVAAQAVLPQLQPYTAPAAWLTPVAPLRIADN

MRTLTTLGLALLLAQPAVA AQAVLPQLQPYTAPAAWLTPVAPLRIADN ?

MRTLTTLGLALLLAQPAVAAQA VLPQLQPYTAPAAWLTPVAPLRIADN ?

Some online services analyze putative signal sequences and try to predict their efficiency and the most probable cutting site

http://www.cbs.dtu.dk/services/SignalP/
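The ambiguity in this leader can be reproduced with a naive scan for Ala-X-Ala motifs. This is a sketch under stated assumptions, not a substitute for a trained predictor such as SignalP. The function name `axa_cut_sites` and the search window (leaders ending between residues 15 and 30) are hypothetical choices. Only strict Ala-X-Ala is considered, ignoring the less frequent Ser/Gly variants.

```python
def axa_cut_sites(precursor, window=(15, 30)):
    """List candidate signal peptidase cut sites (1-based).

    A site `cut` means the leader ends at residue `cut`, with Ala at
    positions -3 and -1 relative to the cleaved bond.
    """
    first, last = window
    sites = []
    for cut in range(first, min(last, len(precursor)) + 1):
        # precursor[cut - 1] is residue -1, precursor[cut - 3] is residue -3
        if precursor[cut - 3] == "A" and precursor[cut - 1] == "A":
            sites.append(cut)
    return sites


precursor = "MRTLTTLGLALLLAQPAVAAQAVLPQLQPYTAPAAWLTPVAPLRIADN"  # from this slide

for cut in axa_cut_sites(precursor):
    print(cut, precursor[:cut], precursor[cut:])
```

The scan reports two candidate sites, after residue 19 (A-V-A) and after residue 22 (A-Q-A), which correspond to the two alternative cuts marked on the slide.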

Page 81

Artificial signal peptides can also be designed

MIA-1/MIA-2: the same peptide encoded by two different nucleotide sequences (same CAI value), designed to co-express two different peptides from the same vector while avoiding homologous recombination

MIAmax: the best one

MIAperC: introduces an NcoI cloning site (CCATGG) without altering the AMA motif for the signal peptidase II
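The idea behind MIA-1/MIA-2, one peptide encoded by two different nucleotide sequences, can be sketched as follows. The two DNA sequences and the peptide MKKLA are invented for illustration and are not the MIA sequences. The CAI is not computed here. The codon table is the standard genetic code written in its usual compact form.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA sequence, codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length sequences."""
    same = sum(x == y for x, y in zip(seq_a, seq_b))
    return same / len(seq_a)


# Hypothetical pair: same peptide, different synonymous codons
version_1 = "ATGAAAAAACTGGCG"
version_2 = "ATGAAGAAGTTAGCC"

print(translate(version_1), translate(version_2))
print(identity(version_1, version_2))
```

Lowering the nucleotide identity between two copies of the same leader reduces the stretches of perfect homology available for recombination when both copies sit on one vector.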

Page 82

Even when fused with a suitable signal sequence, some large cytoplasmic proteins, or some mutants obtained by a combinatorial approach, may fail to be translocated

The bias could arise from

Secondary/tertiary structure

Chaperone recognition

The problem can be tackled by

Overexpressing the DsbC disulphide isomerase

Overexpressing the broad-range chaperone Skp/OmpH

Testing different leader/chaperone combinations

Page 83

To avoid possible congestion of the translocation systems

it is possible to:

Decrease the expression rate

Increase some limiting components of the translocation system

e.g. introduce plasmid copies of the prlA4 and secE genes, encoding the main transport proteins

Page 84

Once in the periplasm, folding is assisted by

the Dsb oxidases/isomerases

chaperones such as Skp, DegP and FkpA

peptidyl-prolyl isomerases such as SurA, PpiA and PpiB

The Dsb protein system (DsbA, B, C, D, G) mediates disulphide bond formation and rearrangement

With the exception of DsbB, these proteins belong to the thioredoxin protein superfamily

Page 85

The formation of disulphide bonds in a protein is made possible by two related pathways:

an oxidative pathway, which is responsible for the formation of the disulphides

and an isomerization pathway, which shuffles incorrectly formed disulphides

OXIDATIVE PATHWAY

Disulphides are introduced into the substrate proteins through exchange with the periplasmic protein DsbA, which is, in turn, reoxidized by the inner membrane protein DsbB

Page 86

THE ISOMERASE PATHWAY

Substrates with more than two cysteines may form incorrect disulphides, causing them to misfold

Non-native disulphides are corrected by DsbC and DsbG, which are maintained in their active reduced state by the inner membrane protein DsbD

Page 87

Once expressed, the recombinant protein has to be purified

To facilitate the purification of heterologous proteins, they can be fused to other proteins (fusion partners) or to short amino acid stretches (peptide TAGS)

The MCS allows the TAG to be fused in frame with the coding sequence of the recombinant peptide

Vectors are available that allow the tag to be positioned at either the N- or the C-terminal end

[Figure: 5’ FUSION and 3’ FUSION layouts of MCS and TAG]

Whenever the tertiary structure of the peptide is available, the TAG is placed at the solvent-accessible end

If a signal peptide for secretion has to be added, the TAG is placed at the C-terminal end

Page 88

DRAWBACKS OF TAGS

Need for expensive proteases

The cutting efficiency never reaches 100%, which limits the yield

Frequent need for further treatment (e.g. the formation and isomerization of disulphide bridges) in order to obtain an active product

No way to know in advance whether solubilization will be achieved: this can only be determined experimentally

Page 89

POPULAR PLASMID VECTORS FOR E. COLI

pBR322 (4361 bp)

Designed in 1977 by Bolivar and Rodriguez, pBR322 was one of the first artificial plasmid vectors

ori: pMB1 (belongs to the ColE1 family of plasmids)

Selection/screening: blaTEM (Tn3) and tet (pSC101)

The mob gene is absent, but the bom/nic site could allow its mobilization by a helper plasmid

[Map labels: TetR, AmpR, pMB1 ori, Rop, bom/nic]

Page 90

This plasmid has a low copy number (~20 copies per cell) due to the action of the Rop protein and of RNA I and RNA II

Plasmid replication is initiated by RNA II, which hybridizes strongly to the plasmid

The formation of this hybrid at the origin is critical for plasmid replication

RNase H digests RNA II, yielding a 550 nt molecule that acts as a primer for DNA polymerase I and initiates the replication of the entire plasmid

RNA I is a non-coding RNA that acts as an antisense repressor of plasmid replication in ColE1 plasmids

[Figure labels: RNA II, RNase H, primer, replication, RNA I]

Page 91

The RNA I concentration increases together with the copy number

RNA I anneals to RNA II by complementary base pairing and blocks access by RNase H, thus preventing RNA II from priming replication

This results in a negative feedback loop for replication, setting the average number of plasmids per cell

The plasmid also encodes the Rop (repressor of primer) protein

Rop further enhances and stabilizes the interaction between RNA I and RNA II

[Figure labels: RNA II, RNA I, Rop]

Page 92

At a low plasmid concentration

RNA II is transcribed from the plasmid

The plasmid is replicated and the copy number increases

But when the plasmid concentration exceeds a certain threshold

RNA I accumulates and hybridizes with RNA II

Replication stops and the plasmid copy number reaches a steady state
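The negative feedback described above can be illustrated with a toy numerical model. Everything in it is an assumption made for illustration: the rate law, the parameter names (`k_inhib`, `r0`, `mu`) and their values are not measured quantities. The model assumes that the per-plasmid replication rate falls as r0 / (1 + n / k_inhib) as the copy number n rises (standing in for RNA I accumulation) and that plasmids are diluted by cell growth at rate mu.

```python
def steady_state_copy_number(k_inhib, r0=3.0, mu=1.0, n0=1.0, dt=0.01, steps=10000):
    """Integrate dn/dt = n * (r0 / (1 + n / k_inhib) - mu) with Euler steps.

    k_inhib: copy number at which replication is half-inhibited (hypothetical)
    r0:      uninhibited replication rate per plasmid (hypothetical)
    mu:      dilution rate due to cell growth (hypothetical)
    """
    n = n0
    for _ in range(steps):
        replication_rate = r0 / (1.0 + n / k_inhib)  # inhibited by RNA I
        n += dt * n * (replication_rate - mu)
    return n


# Strong inhibition (low k_inhib) versus weakened inhibition (high k_inhib)
tight_control = steady_state_copy_number(k_inhib=10)
relaxed_control = steady_state_copy_number(k_inhib=250)

print(round(tight_control, 1), round(relaxed_control, 1))
```

In this model the steady state is n = k_inhib * (r0 / mu - 1), so weakening the inhibition raises the plateau without removing the feedback itself. The two values of `k_inhib` were picked only so that the plateaus fall near 20 and 500 copies.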

Page 93

The RNA I-Rop regulation prevents the copy number from increasing

The deletion of the rop gene, coupled with a point mutation that reduces the formation of the RNA I/RNA II duplex, led to

the higher copy number of pUC plasmids (derivatives of pMB1 plasmids)

The pMB1 ori and the blaTEM selection are derived from pBR322 (4361 bp)

Page 94

pUC18/19: small plasmids with a very high copy number, up to 500-700/cell

Most probably the point mutation hampers the interaction between RNA I and RNA II by producing a temperature-dependent alteration of the RNA II conformation

The screening marker is lacZ; lacI is not on the vector, so basal activity is high and can be limited in lacIq strains

The absence of both mob and bom/nic ensures that pUC18/19 cannot be mobilized by helper plasmids

pUC18 and pUC19 differ only in the orientation of the MCS

[Map labels: Plac, MCS, 5’-lacZ]

Page 95

pBluescript II SK/KS (+)/(-)

ori: ColE1; selection: blaTEM; screening: lacZ

They include phage elements: PHAGEMIDS

T3 and T7 promoters flank the MCS

This allows in vitro transcription by phage polymerases

MCS = KpnI to SacI (KS) or SacI to KpnI (SK), within the β-galactosidase α-peptide

Page 96

Filamentous phage f1 origin, (+) or (-)

It allows the phagemid to be rescued as sense or antisense single-stranded (ss) DNA by a helper phage

The bacterial cells are infected with an M13 mutant phage (M13KO7)

M13KO7 copies the ssDNA according to the (+) or (-) orientation

Once packed in the viral particles, the ssDNA can be isolated and used as a probe or for site-specific mutagenesis techniques

[Map labels: ColE1, lacZ fragment, f1 (-), f1 (+)]

Page 97

M13 biological cycle

Infectious form: (+) strand

Host enzymes convert it into the replicative form (dsDNA)

Bidirectional replication

pII inserts a nick in the (+) strand

The rolling circle replication starts

Once completed, the (+) strand is cut by pII, released and circularized

Page 98

M13KO7 + phagemid

M13KO7 is an M13 phage mutated in pII

The original replicon of the phage has been modified by several lacZ insertions

The mutant pII recognizes the engineered origin only poorly

So the phagemid is preferentially replicated

and packed into the phage particles

[Figure labels: host enzymes, replicative form, phagemid, pII mut]

Page 99

The pET series

Based on pBR322

Expression vectors

Transcription vectors

Cassette layout: T7, SD, S10, MCS, T7-T (T7, SD and gene «10» are derived from ΦT7)

The cloned gene is fused to the N-terminal end of S10; T7 is induced according to the features of the host strain

Page 100

Over time, other optional features were added to this popular series: different selection markers, TAGs such as histidine stretches, and/or phage elements

The series includes several vectors

The lowercase letter (a, b, c or d) denotes the reading frame of S10 relative to the BamHI cloning site, which is placed downstream of the signal sequence

a) GGT CGC GGA TCC

b) GGT CGG GAT CCG

c) GGT CGG ATC CGG

d) The same frame as “c” but with NcoI (CCATGG) instead of NdeI (CATATG) upstream of the signal sequence
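The three frames can be checked directly from the sequences listed above. The short sketch below locates the BamHI site (GGATCC) in each variant and reports its offset modulo 3 relative to the first codon shown. The function name `bamhi_frame` is hypothetical and the codon grouping is taken as printed on the slide.

```python
BAMHI = "GGATCC"

# Codon-spaced sequences as listed for the a, b and c variants
variants = {
    "a": "GGT CGC GGA TCC",
    "b": "GGT CGG GAT CCG",
    "c": "GGT CGG ATC CGG",
}


def bamhi_frame(codons):
    """Offset (0, 1 or 2) of the BamHI site relative to the reading frame."""
    dna = codons.replace(" ", "")
    return dna.index(BAMHI) % 3


frames = {name: bamhi_frame(seq) for name, seq in variants.items()}
print(frames)
```

Each variant places the first base of the site at a different position within a codon, so an insert cut with BamHI can be matched to the S10 frame by choosing the appropriate vector.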

Selection: Amp, except in the «9» series, where it is Kan

11: T7/lac and lacI are on the plasmid

12: downstream of S10 there is the OmpT signal sequence, suitable for secretion

5: no terminator has been inserted