Proteins, Enzymes, Biochemistry Sept. 21, 2001 Duncan MacCannel: Historical Perspective on Molecular Biology / Genetics


Background

The Thread of Life. Susan Aldridge. Chapter 2

Molecular Biology of the Cell. Alberts et al. Garland Press

Suggested further reading

• Protein molecules as computational elements in living cells. D. Bray. Nature. 1995 Jul 27;376(6538):307-12.

• Signaling complexes: biophysical constraints on intracellular communication. D. Bray. Annu Rev Biophys Biomol Struct. 1998;27:59-75.

• Metabolic modeling of microbial strains in silico. M. W. Covert, et al. Trends in Biochemical Sciences (2001) 26: 179-186.

• Modelling cellular behaviour. D. Endy & R. Brent. Nature (2001) 409: 391-395.

A - Introduction to Proteins / Translation

• The primary structure is defined as the sequence of amino acids in the protein. This is determined by, and is co-linear with, the sequence of bases (triplet codons) in the gene*.

DNA       5’---CTCAGCGTTACCAT---3’
          3’---GAGTCGCAATGGTA---5’
             ↓ transcription
RNA       5’---CUCAGCGUUACCAU---3’
             ↓ translation
PROTEIN   N---Leu-Ser-Val-Thr---C

* - this is not strictly true in most eukaryotic genomes
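The DNA → RNA → protein example above can be reproduced in a few lines. This is a minimal sketch, not a general translator: the names `CODON_TABLE`, `transcribe` and `translate` are illustrative, and the codon table contains only the four codons that occur in the example.

```python
# Minimal sketch: transcription and translation of the example fragment.
# CODON_TABLE holds only the four codons needed here (an assumption for brevity).
CODON_TABLE = {"CUC": "Leu", "AGC": "Ser", "GUU": "Val", "ACC": "Thr"}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching the coding (top) DNA strand, with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list:
    """Read complete codons starting at position 0; a trailing partial codon is ignored."""
    usable = len(mrna) - len(mrna) % 3
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE[codon] for codon in codons]


dna = "CTCAGCGTTACCAT"
mrna = transcribe(dna)
peptide = "-".join(translate(mrna))
print(mrna)     # CUCAGCGUUACCAU
print(peptide)  # Leu-Ser-Val-Thr
```

The last two bases (AU) do not form a complete codon, so they are not translated, as in the slide.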

Structure of Genes In Eukaryotic Organisms

[Figure: Control elements, exons and introns on the DNA → transcription → hnRNA (heterogeneous nuclear RNA) → RNA splicing → mRNA. Alternative RNA splicing can produce more than one mRNA from the same hnRNA.]

• Coding sequence can be discontinuous and the gene can be composed of many introns and exons.

• The control regions (= operators) can be spread over a large region of DNA and exert action-at-a-distance.

• There can be many different regulators acting on a single gene – i.e. more signal integration than in bacteria.

• Alternative splicing can give rise to more than one protein product from a single ‘gene’.

• Predicting genes (introns, exons and proper splicing) is very challenging.

• Because the control elements can be spread over a large segment of DNA, predicting the important sites and their effects on gene expression is not very feasible at this time.

Schematic Illustration of Transcription

The nucleotides in an mRNA are joined together to form a complementary copy of the DNA sequence.

Translation

Note that many ribosomes can read one message like beads on a string generating many polypeptide chains simultaneously.

• Translation is the synthesis of a polypeptide (protein) chain using the mRNA template.
• Note the mRNA has directionality and is read from the 5’ end towards the 3’ end.
• The 5’ end is defined at the DNA level by the promoter, but this does not define the translation start.
• The translation start sets the ‘register’ or reading frame for the message.
• The end is determined by the presence of a STOP codon (in the correct reading frame).

Schematic Illustration of Translation

Protein Synthesis involves specialized RNA molecules called transfer RNA or tRNA.

The translation start is dependent on:
1) a sequence motif called a ribosome binding site (rbs)
2) an AUG start codon 5-10 bp downstream from the rbs

Translation Start Position

16S rRNA      3’-AUUCCUCA-//-5’   (3’ end of 16S rRNA)
                   ||||||
mRNA     5’-NNNNNNNAGGAGU-N(5-10)-AUG-//-3’
                   rbs            start
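The two rules for the translation start (an rbs motif, then an AUG 5-10 bases downstream) and the rule for the end (a STOP codon in the same reading frame) can be sketched as a small search. This is only an illustration: `find_orf` and `START_PATTERN` are hypothetical names, the motif is hard-coded to the AGGAGU shown in the figure although real ribosome binding sites vary, and the test message is invented.

```python
import re

# Hypothetical pattern: the rbs motif from the figure (AGGAGU), a spacer of
# 5-10 bases, then the AUG start codon (captured as group 1).
START_PATTERN = re.compile(r"AGGAGU[ACGU]{5,10}?(AUG)")

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orf(mrna: str):
    """Return (start_index, codons) for the first rbs-anchored reading frame, or None."""
    match = START_PATTERN.search(mrna)
    if match is None:
        return None
    start = match.start(1)  # the start codon sets the reading frame
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:  # an in-frame STOP codon ends translation
            break
        codons.append(codon)
    return start, codons


# Invented message: leader + rbs + 7-base spacer + coding region + stop + trailer.
message = "GGCAUCC" + "AGGAGU" + "AACCGUU" + "AUG" + "CUCAGCGUUACC" + "UAA" + "GG"
print(find_orf(message))  # (20, ['AUG', 'CUC', 'AGC', 'GUU', 'ACC'])
```

Changing the spacer length to fewer than 5 or more than 10 bases makes the search return `None`, which mirrors the spacing requirement stated above.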

In bacteria, a single mRNA molecule can code for several proteins. Such messages are said to be polycistronic. Since the messages for all genes in such a transcript are present at the same concentration (they are on the same molecule), one might predict that translation levels will be the same for all the genes. This is not the case: translation efficiency can vary for the different messages within a transcript.

[Figure: Polycistronic mRNA. On the DNA, a promoter (start) is followed by Gene 1, Gene 2, Gene 3 and Gene 4 and then a terminator (stop); all four genes are transcribed as one mRNA (4 genes, 1 message).]

Protein                     Tar    Tap    R      B      Y       Z
Protein monomers per cell   5000   1000   <100   1000   18000   10000

Translation Efficiency is an important part of gene expression

A single mRNA may encode several proteins. The final level of each protein may vary significantly and is a function of:
1) translation efficiency
2) protein stability
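One simple way to see how these two factors combine is a steady-state balance: if protein is made at a rate proportional to the number of mRNA copies and lost by first-order degradation, the level settles at (translation rate × mRNA copies) / degradation rate. The sketch below assumes that simple model; the function name and all parameter values are hypothetical and are not measurements from the table above.

```python
def steady_state_level(mrna_copies: float, translation_rate: float,
                       degradation_rate: float) -> float:
    """Steady state of dP/dt = translation_rate * mrna_copies - degradation_rate * P."""
    return translation_rate * mrna_copies / degradation_rate


# Hypothetical parameters for two genes on the same polycistronic message.
mrna_copies = 2.0  # identical for both genes (they are on the same molecule)
genes = {
    "gene_1": {"translation_rate": 50.0, "degradation_rate": 0.02},
    "gene_2": {"translation_rate": 1.0, "degradation_rate": 0.02},
}

levels = {name: steady_state_level(mrna_copies, **params)
          for name, params in genes.items()}
for name, level in levels.items():
    print(name, round(level))
```

With equal message concentration and equal stability, the 50-fold difference in translation rate gives a 50-fold difference in protein level.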


B – Introduction to Proteins / Characteristics


[Figure: structure of an amino acid (alanine), H2N-CH(CH3)-COOH, with the amino group (H2N-) and the carboxylic acid group (-COOH) labelled.]

There are 20 naturally occurring amino acids in proteins, each with distinctive ‘side chains’ that give them characteristic chemical properties.

[Figure: alanine again, with the α-carbon and its -CH3 (methyl) side chain labelled.]

Amino acids differ in the side chains on the α-carbon.

[Figure: formation of a peptide bond. Alanine (ala, A) + Tryptophan (trp, W) → Dipeptide (Ala-Trp) + H2O]

By convention polypeptides are written from the N-terminus (amino) to the C-terminus (carboxy)

Amino acid      3-letter   1-letter
Alanine         ala        A
Arginine        arg        R
Asparagine      asn        N
Aspartic acid   asp        D
Cysteine        cys        C
Glutamine       gln        Q
Glutamic acid   glu        E
Glycine         gly        G
Histidine       his        H
Isoleucine      ile        I
Leucine         leu        L
Lysine          lys        K
Methionine      met        M
Phenylalanine   phe        F
Proline         pro        P
Serine          ser        S
Threonine       thr        T
Tryptophan      trp        W
Tyrosine        tyr        Y
Valine          val        V
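The table of codes lends itself to a lookup. The sketch below converts a peptide written N-terminus to C-terminus in three-letter codes into single-letter codes; `THREE_TO_ONE` and `to_one_letter` are illustrative names, and hyphen-separated input is an assumption.

```python
# Three-letter to one-letter amino acid codes, as listed in the table above.
THREE_TO_ONE = {
    "ala": "A", "arg": "R", "asn": "N", "asp": "D", "cys": "C",
    "gln": "Q", "glu": "E", "gly": "G", "his": "H", "ile": "I",
    "leu": "L", "lys": "K", "met": "M", "phe": "F", "pro": "P",
    "ser": "S", "thr": "T", "trp": "W", "tyr": "Y", "val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert e.g. 'Ala-Trp' (N-terminus first) to 'AW'."""
    return "".join(THREE_TO_ONE[residue.lower()] for residue in peptide.split("-"))


print(to_one_letter("Ala-Trp"))          # AW
print(to_one_letter("Leu-Ser-Val-Thr"))  # LSVT
```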

[Figure: structures of glycine (side chain -H), proline and cysteine (side chain -CH2-SH).]

The Newly Synthesized Polypeptide

• The information from DNA → RNA → Protein is linear and the final polypeptide synthesized will have a sequence of amino acids defined by the sequence of codons in the message.

• The sequence of amino acids is called the primary structure.

• Secondary structure refers to local regular/repeating structural elements.

• The folded three dimensional structure is referred to as tertiary structure.

Protein function depends on an ordered / defined three dimensional folding. The final three dimensional folded state of the protein is an intrinsic property of the primary sequence. How the primary sequence defines the final folded conformation is generally referred to as the Protein Folding Problem.

Primary structure of green fluorescent protein

(single letter AA codes)

Sequence: 238 AA; MW: 26886

MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

The primary sequence can be derived directly from the gene sequence but going from sequence to structure or sequence to function is not possible unless there is a related protein for which structure or function is known. Likewise, the structure alone rarely provides information about function (only if the function of a related protein is known).

Projections of the Tertiary Structure of Green Fluorescent Protein

[Figures: the same structure shown in several representations, with residues Ile188-Gly189-Asp190-Gly191-Pro192-Val193 labelled:
• Backbone tracing
• “Ribbon diagram” showing secondary structures (α-helix, β-strand)
• “Wireframe” model showing all atoms and chemical bonds
• “Stick” model showing all atoms and chemical bonds
• “Space filling” model where each atom is represented as a sphere of its van der Waals radius]

[Figure: Random coil (“denatured”, “unfolded”) ⇌ “native” (“folded”); the forward process is “folding”, the reverse is “denaturation”.]

The final folded three dimensional (tertiary) structure is an intrinsic property of the primary structure.

Primary structure → Tertiary structure

In general, proteins are unstable outside of the cell and very sensitive to solvent conditions.

Active site - the region of a protein (enzyme) to which a substrate molecule binds.
• The active site is formed by the three dimensional folding of the peptide backbone and amino acid side chains (lock and key / induced fit).
• The active site is highly specific in binding interactions (stereochemical specificity).

The three dimensional structure of CAP and the cAMP ligand-binding site (Figures 3-45 and 3-55 from Alberts)

Proteins can undergo changes in their three dimensional structure in response to changing conditions or interactions with other molecules. This usually alters the ‘activity’ of the protein.

Conformational Change in Protein Structure


Binding of the substrate (glucose) causes the protein (hexokinase) to shift from an open to a closed conformation. (Fig. 5-2, Alberts)

C - Introduction to Proteins / Protein Functions

Proteins carry out a wide variety of functions in, on and outside the cell. For the purpose of this course, we will generalize these functions into three categories. These are not mutually exclusive and many proteins fit into more than one of these categories.

1 - Structural

2 - Enzymatic

3 - Signal Transduction (information processing)

C1 - Protein Functions: Structural

Proteins can form large complexes that function primarily as structural elements:

Protein coats of viruses. These are large, regular repeating structures composed of hundreds to thousands of protein subunits (Figs 6-74 and 6-72, Alberts).

Electron micrographs of A) Phage T4, B) potato virus X, C) adenovirus, D) influenza virus. SV40 structure determined by X-ray crystallography.

The cytoskeleton in eukaryotic cells is responsible not only for determining shape but also for cell movement, mechanical sensing, intracellular trafficking and cell division.

A human cell grown in tissue culture and stained for protein (such that only large regular structures are highlighted). Note the variety of structures (Fig 16-1, Alberts)

Microtubules form by the polymerization of tubulin subunits. Whether the polymer grows or shrinks is influenced by conditions in the cell (dynamic instability).

(Fig 16-33, Alberts; for discussion of dynamic instability see Flyvbjerg H, Holy TE, Leibler S. Stochastic dynamics of microtubules: A model for caps and catastrophes. Phys Rev Lett. 1994 Oct 24;73(17):2372-2375.)

C2 - Protein Functions: Enzymatic

Enzyme: a protein* that catalyzes a chemical reaction, where a catalyst is defined as a substance that accelerates a chemical reaction without itself undergoing change.

* some RNA molecules can also be considered enzymes

[Figure: A → B, catalyzed by enzyme X; A + B → C + D, catalyzed by enzyme Y]

• Specificity
• Accelerated reaction rates
• Control (regulation)
• Enzymes can only affect the rate (kinetics) of a reaction; they cannot make a reaction more energetically favorable.
• Enzymes can be saturated by substrate.

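Basics of Enzyme Kinetics

Michaelis-Menten Kinetics - for a simple enzyme reaction, the interaction of enzyme and substrate is considered an equilibrium and the overall reaction is as follows:

$$E + S \underset{k_{-1}}{\overset{k_{+1}}{\rightleftharpoons}} ES \overset{k_{+2}}{\longrightarrow} E + P$$

$$v = \frac{V s}{K_M + s}$$

v = velocity (reaction rate)
s = substrate concentration
V = maximum velocity (reached when the enzyme is saturated with substrate)
K_M = Michaelis constant

$$K_M = \frac{k_{+2} + k_{-1}}{k_{+1}}$$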
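The rate law above is easy to explore numerically. The following sketch evaluates v = V·s / (K_M + s) and computes K_M from the three rate constants; the function names and the numerical values are illustrative assumptions, not data from the lecture.

```python
def michaelis_constant(k_plus1: float, k_minus1: float, k_plus2: float) -> float:
    """K_M = (k_+2 + k_-1) / k_+1 for the scheme E + S <-> ES -> E + P."""
    return (k_plus2 + k_minus1) / k_plus1


def mm_rate(s: float, v_max: float, k_m: float) -> float:
    """Michaelis-Menten velocity v = V * s / (K_M + s)."""
    return v_max * s / (k_m + s)


# Hypothetical rate constants and maximum velocity (arbitrary units).
k_m = michaelis_constant(k_plus1=1.0, k_minus1=3.0, k_plus2=1.0)  # 4.0
v_max = 10.0

for s in (0.1 * k_m, k_m, 10 * k_m, 1000 * k_m):
    print(f"s = {s:8.1f}   v = {mm_rate(s, v_max, k_m):.2f}")
```

At s = K_M the velocity is exactly half of V, and at very high s it approaches V, which is the saturation behaviour listed among the enzyme properties.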

C3 - Protein Functions: Signal Transduction

Signal Transduction
- in general, the relaying of a signal from one physical form to another
- in biological terms, the process by which a cell responds to signals (can be intracellular or extracellular)

Examples of ‘signals’ (inputs):
• chemicals
• light
• temperature
• electrical (ion gradients)
• other cells (cell-cell contact)
• mechanical sensing

[Figure: Input → Signal Transduction → Output]

Generalized Model of Response to Extracellular Signal

[Figure: Ligand + Receptor → Activated Receptor → “Action”]

• Ligand can activate or inactivate the receptor.
• Output (action) depends on the system and sometimes on the cell type.
• In metazoans (multi-cellular eukaryotes), there are about 16 classes of intercellular signaling systems.

Example 1: Transmembrane Tyrosine Kinase Receptors

[Figure: Ligand + Receptor → Activated Receptor (dimerized and phosphorylated, ~P) → “Action”]

• Ligand binding results in receptor dimerization.
• The cytoplasmic (intracellular) domains are tyrosine kinases which phosphorylate each other on Tyr residue side chains.
• This sets off a series of intracellular events.

Example 2: Steroid Receptors

[Figure: Ligand + Receptor → Activated Receptor, which moves into the nucleus]

• The steroid binds to its receptor in the cytoplasm.
• The steroid-receptor complex, but not the free receptor, can move into the nucleus.
• The steroid-receptor complex binds to specific binding site(s) on the DNA to regulate gene expression.

Example 3: Heterotrimeric G-Proteins

[Figure: Ligand + Receptor → Activated Receptor; the heterotrimeric G-protein (αβγ complex) exchanges GDP for GTP.]

• Ligand binding causes activation of the α subunit, which promotes exchange of GDP for GTP.
• In the GTP form, the α subunit and the associated βγ subunits dissociate from the complex.
• Each subunit can go on to initiate a series of intracellular events.

D - Regulation of Protein Activity

Proteins are often regulated such that the ‘activity’ of a protein is not a constant function of its concentration.

The concentration of a protein in the cell is a function of the rate of synthesis and the rate of degradation. Both these processes can be regulated.

[Figure: DNA → (transcription) → RNA → (translation) → Protein → (degradation). Transcription and translation together make up synthesis. Protein (active) ⇌ Protein (inactive).]
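The statement that concentration depends on both synthesis and degradation can be made concrete with the simplest possible model, dP/dt = synthesis rate − degradation rate × P. The sketch below uses the closed-form solution of that equation; the model itself, the function name and the parameter values are assumptions chosen for illustration.

```python
import math


def protein_level(t: float, synthesis_rate: float, degradation_rate: float,
                  p0: float = 0.0) -> float:
    """Solution of dP/dt = synthesis_rate - degradation_rate * P with P(0) = p0."""
    p_steady = synthesis_rate / degradation_rate
    return p_steady + (p0 - p_steady) * math.exp(-degradation_rate * t)


# Hypothetical values: 20 molecules made per minute, 2% degraded per minute.
synthesis_rate, degradation_rate = 20.0, 0.02
half_time = math.log(2) / degradation_rate  # time to reach half of steady state

for t in (0.0, half_time, 5 * half_time, 20 * half_time):
    level = protein_level(t, synthesis_rate, degradation_rate)
    print(f"t = {t:7.1f} min   P = {level:7.1f}")
```

In this model the steady-state level is set by the ratio of the two rates, while the time needed to reach it is set by the degradation rate alone, so regulating either process changes the outcome in a different way.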

Regulation of Enzyme Activity

[Figure: negative feedback (product inhibition) of enzyme X, shown for a single step A → B and for a multi-step pathway A → B → C → D → E → F.]

Mechanistically, negative feedback can be by direct competition of the product with the substrate for the active site, or it can be indirect, through interaction with the enzyme away from the active site.
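For the first mechanism, direct competition of the product with the substrate for the active site, the standard competitive-inhibition form of the Michaelis-Menten rate law can serve as a sketch: the product raises the apparent K_M by the factor (1 + p / K_i). The inhibition constant `k_i`, the function name and the numbers are assumptions introduced here, not values from the lecture.

```python
def rate_with_product_inhibition(s: float, p: float, v_max: float,
                                 k_m: float, k_i: float) -> float:
    """Competitive inhibition by product p: v = V * s / (K_M * (1 + p / K_i) + s)."""
    return v_max * s / (k_m * (1.0 + p / k_i) + s)


# Hypothetical parameters (arbitrary units).
v_max, k_m, k_i = 3.0, 1.0, 1.0
substrate = 1.0

for product in (0.0, 1.0, 10.0, 100.0):
    v = rate_with_product_inhibition(substrate, product, v_max, k_m, k_i)
    print(f"p = {product:6.1f}   v = {v:.3f}")
```

As product accumulates the rate falls, which is the negative feedback; because the inhibition is competitive, a high enough substrate concentration can still overcome it.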

[Figure: the same step A → B (enzyme X) under positive feedback and under positive feedforward.]

Cooperativity / Allosteric Regulation

Hypothetical examples of binding of a ligand to a dimeric protein. The binding curve is very sensitive to the effect of one site on the other.

• Two independent sites (n = 1)
• Positive cooperativity (n = 2, n = 3)
• Negative cooperativity (n = 0.5)

[Plots: fraction bound (0 to 1) vs ligand concentration (0.01 to 10000, logarithmic axis) for each case.]
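Binding curves of this kind are commonly described with the Hill equation, fraction bound = L^n / (K^n + L^n). The sketch below assumes that the n on the plots is a Hill coefficient and uses a hypothetical half-saturation concentration of 100 (the midpoint of the plotted range on a log scale); neither assumption is stated in the slides.

```python
def fraction_bound(ligand: float, k_half: float, n: float) -> float:
    """Hill equation: fraction of sites occupied at a given ligand concentration."""
    return ligand ** n / (k_half ** n + ligand ** n)


K_HALF = 100.0  # hypothetical concentration giving half-maximal binding
concentrations = [0.01, 1.0, 100.0, 10000.0]

for n in (0.5, 1, 2, 3):
    row = [round(fraction_bound(c, K_HALF, n), 3) for c in concentrations]
    print(f"n = {n}: {row}")
```

Every curve passes through 0.5 at the half-saturation concentration; a larger n makes the transition around that point steeper (positive cooperativity), while n below 1 spreads it over a wider concentration range (negative cooperativity).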

Allosteric protein: a protein that changes from one conformation to another upon binding a ligand or when it is covalently (chemically) modified. The change in conformation alters the activity of the protein. Historically associated with multimeric proteins (e.g. hemoglobin).

[Figure: a protein with a positive allosteric effector and a ligand.]

Regulation of Protein Activity by Covalent Modification

The activity of a protein can be modified by addition of a chemical group to, or its removal from, an amino acid side chain (i.e. the protein acts as a substrate for another enzyme).

The most common modifications are:
• Methylation (-CH3)
• Phosphorylation (-PO3)
• Nucleotidyl
• Fatty acid
• Myristoyl

Note that many proteins are modified in other ways, such as addition of sugar groups (glycosylation), but these are not ‘regulatory’ modifications.

Phosphorylation is the most common mechanism of regulation by covalent modification.

Kinase - an enzyme that phosphorylates
Phosphatase - an enzyme that removes phosphate

Regulation by Localization

Protein activity can be regulated by changing the localization of the protein. This turns out to be a common theme in eukaryotic signal transduction.

Localization can be altered allosterically or by covalent modification.

Addition of a fatty acid group can cause a cytoplasmic protein to associate with the cell membrane.


Covalent modification of a protein can generate a binding site for another protein.

E - General Considerations

Proteins have a diverse range of functions and a variety of mechanisms of regulation. The ability to form networks of proteins acting on proteins, the sharing of common reaction intermediates and forming multi-step chemical pathways allows for an endless number of possibilities.

Some general considerations about protein systems:

• A reaction can behave as a step function (digital, boolean) if there is significant cooperativity in the system or if there is a modifying enzyme that works near saturation.

• Since proteins can act in a catalytic manner, there can be signal amplification.

• Many systems are adaptive, in that the response to a signal is not necessarily constant over time (e.g. a signal transduction system may become desensitized and no longer respond to the presence of a ligand, cf. heterotrimeric G protein).

EnvZ/OmpR system in E. coli bacteria

[Figure: with increasing osmolarity, EnvZ transfers ~P to OmpR, forming OmpR~P.]

EnvZ is a histidine kinase (phosphorylates specific histidine residues) that responds to changes in osmolarity (salt concentration). The ~P group is transferred to OmpR to form OmpR~P. EnvZ also catalyzes the dephosphorylation of OmpR~P.

OmpR~P is a transcriptional regulator of two genes (ompF and ompC). It binds to DNA only in the phosphorylated state.

OmpR~P can activate or repress expression of a gene depending on the position of the binding site relative to the promoter.

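[Figure: OmpR~P binding sites positioned to switch a promoter ON (activation) or OFF (repression).]

Activation and repression of the ompF promoter are regulated by a high affinity and a low affinity binding site, respectively. Activation of ompC is through a low affinity activator site.

[Figure: ompF has an activator (+) site and a repressor (-) site; ompC has an activator (+) site.]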

Note that OmpR~P is required for both ompF and ompC transcription.

[Figure: at low osmolarity (little OmpR~P), ompF is ON and ompC is OFF; at high osmolarity (more OmpR~P), ompF is OFF and ompC is ON. Plot: protein level vs osmolarity for OmpR~P, OmpC and OmpF.]

Not an ON/OFF switch but more like a thermostat (i.e. gradients of expression levels).
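The graded, thermostat-like behaviour follows from the site affinities described above. The sketch below is a deliberately crude model, not the measured system: each site is treated as a simple binding site with occupancy x / (K_d + x), ompF expression requires its high-affinity activator site to be occupied and its low-affinity repressor site to be free, and ompC expression follows occupancy of its low-affinity activator site. The function names and the two K_d values are hypothetical.

```python
def occupancy(ompr_p: float, k_d: float) -> float:
    """Fraction of time a simple binding site is occupied by OmpR~P."""
    return ompr_p / (k_d + ompr_p)


def ompf_expression(ompr_p: float, k_high: float, k_low: float) -> float:
    """Activator site (high affinity) occupied AND repressor site (low affinity) free."""
    return occupancy(ompr_p, k_high) * (1.0 - occupancy(ompr_p, k_low))


def ompc_expression(ompr_p: float, k_low: float) -> float:
    """Activator site (low affinity) occupied."""
    return occupancy(ompr_p, k_low)


K_HIGH_AFFINITY = 1.0    # hypothetical K_d: binds at low OmpR~P levels
K_LOW_AFFINITY = 100.0   # hypothetical K_d: binds only at high OmpR~P levels

for ompr_p in (0.1, 1.0, 10.0, 100.0, 1000.0):
    f = ompf_expression(ompr_p, K_HIGH_AFFINITY, K_LOW_AFFINITY)
    c = ompc_expression(ompr_p, K_LOW_AFFINITY)
    print(f"OmpR~P = {ompr_p:7.1f}   ompF = {f:.2f}   ompC = {c:.2f}")
```

In this toy model ompF output rises, peaks at intermediate OmpR~P and then falls as the repressor site fills, while ompC output rises steadily, so both genes need OmpR~P and neither behaves as a strict ON/OFF switch.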

Playing with Switches

[Figure: Receptor → Regulator~P with increasing signal. Plots of [output signal] vs [Signal]:
• Linear dependence
• Adding cooperativity
• Adding more cooperativity: approximates a step function (ON/OFF switch)]
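How closely a cooperative response approximates a step can be quantified by the fold change in signal needed to move the output from 10% to 90% of maximum. The sketch below assumes a Hill-type response, output = S^n / (K^n + S^n), for which that fold change works out to 81^(1/n); the Hill form and the function names are assumptions, since the slides show only the qualitative curves.

```python
def signal_for_output(fraction: float, k_half: float, n: float) -> float:
    """Signal at which a Hill-type response reaches the given fraction of maximum."""
    return k_half * (fraction / (1.0 - fraction)) ** (1.0 / n)


def switching_range(n: float, k_half: float = 1.0) -> float:
    """Fold change in signal needed to move the output from 10% to 90%."""
    return signal_for_output(0.9, k_half, n) / signal_for_output(0.1, k_half, n)


for n in (1, 2, 4, 8):
    print(n, round(switching_range(n), 2))
# 1 81.0
# 2 9.0
# 4 3.0
# 8 1.73
```

Without cooperativity (n = 1) an 81-fold change in signal is needed to switch the output; with n = 8 less than a 2-fold change is enough, which is why adding cooperativity makes the response look like an ON/OFF switch.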

Not as bad as it looks!

Not all pathways will operate in a single cell.

Epidermal Growth Factor Signaling Pathway

http://www.grt.kyushu-u.ac.jp/spad/pathway/egf.html

• Protein interactions
• Protein modification (activation/inhibition)
• Protein re-localization
• Transcriptional regulation