Welcome to Introduction to Bioinformatics Friday, 1 September Introduction to Molecular Biology

Welcome to Introduction to Bioinformatics

Introduction to Molecular Biology: DNA to protein

Molecular biology forms part of a long intellectual tradition

• c. 350 BCE - Aristotle begins the Western tradition of natural philosophy
• 1665 - Robert Hooke coins the term “cell” in his treatise Micrographia
• 1676 - Anton van Leeuwenhoek discovers bacteria, later spermatozoa
• 1735 - Carl Linnaeus’ Systema Naturae lays the foundations for taxonomy
• 1859 - Charles Darwin’s On the Origin of Species
• 1860s - Louis Pasteur disproves spontaneous generation (abiogenesis) and develops the “germ theory”
• 1866 - Gregor Mendel describes the “inheritance of traits” in peas
• 1868 - Ernst Haeckel postulates that the nucleus is responsible for heredity
• 1869 - Friedrich Miescher isolates a crude nucleic acid preparation
• 1880s - Cytologists work out the details of mitosis, meiosis and fertilization


• 1903 - Walter Sutton proposes a chromosomal theory of heredity
• 1908 - Thomas Hunt Morgan discovers that genes can mutate
• 1909 - Archibald Garrod proposes the “gene-enzyme” hypothesis
• 1915 - Morgan and colleagues publish linkage maps of D. melanogaster
• 1927 - H.J. Muller and L.J. Stadler show that radiation can induce mutation
• 1928 - Frederick Griffith demonstrates genetic transformation
• 1931 - Barbara McClintock proves genetic recombination (“crossing over”)
• 1944 - Avery, MacLeod & McCarty show that DNA carries genetic information
• 1940s - The “modern synthesis”: E. Mayr, T. Dobzhansky, & J. Huxley
• 1949 - Erwin Chargaff formulates “Chargaff’s rules” of DNA composition
• 1952 - Hershey and Chase demonstrate that DNA is the genetic material
• 1953 - James Watson and Francis Crick describe the double helix of DNA
• 1957 - Vernon Ingram shows that genes determine the sequence of amino acids
• 1958 - Matt Meselson and Frank Stahl prove semiconservative replication


• 1958 - Arthur Kornberg discovers DNA polymerase
• 1960 - Sam Weiss and Jerard Hurwitz independently discover RNA polymerase
• 1960s - Genetic code cracked by Crick, Marshall Nirenberg, Har Gobind Khorana, etc.
• 1961 - Charles Yanofsky & Sydney Brenner show colinearity of DNA & protein
• 1965 - Holley and Zamir determine the structure of a tRNA

20xx – Your contribution!!

The “Central Dogma” of Molecular Biology

DNA → RNA → Protein

Term coined by Francis Crick in 1956 to describe the flow of information in the cell

Replication

Transcription

Translation
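The transcription and translation steps can be sketched in a few lines of Python. This is a minimal illustration, not a full model: the example sequence is hypothetical, the helper names `transcribe` and `translate` are invented here, and translation is assumed to start at the first AUG and end at the first stop codon.

```python
from itertools import product

# Standard genetic code, built in UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching the DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTTTTAA"  # hypothetical coding strand, written 5'->3'
mrna = transcribe(dna)
print(mrna, translate(mrna))
```

Replication is left out of the sketch; it would simply copy the DNA string.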

Information flow is compartmentalized

[Figure: DNA → pre-mRNA → mRNA → protein. Labelled steps: transcription, processing, export, translation, and decay.]

Frederick Griffith demonstrates “transformation” of a heritable character in the bacterium Streptococcus pneumoniae

What is the nature of the Gene?

Oswald Avery, Colin MacLeod & Maclyn McCarty first show that DNA is the “genetic principle”

What is the nature of the Gene?

Enzymes used to degrade proteins

The Hershey-Chase experiment confirms that DNA is the stuff of heredity

Erwin Chargaff and his rules

1. Ratio of nucleotides depends on species
2. A=T and G=C no matter the organism
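Chargaff's second rule follows directly from base pairing, which a short Python sketch can make concrete. The strand below is a hypothetical sequence, and `base_fractions` and `reverse_complement` are illustrative helper names: a single strand need not have A=T, but the duplex formed with its complement always does.

```python
from collections import Counter


def base_fractions(seq: str) -> dict:
    """Fraction of each base (A, C, G, T) in a DNA sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def reverse_complement(seq: str) -> str:
    """Sequence of the complementary strand, written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


strand = "ATGCGCGTTAAGGCATTC"  # hypothetical single strand
duplex = strand + reverse_complement(strand)  # both strands pooled together

print("single strand:", base_fractions(strand))
print("duplex:       ", base_fractions(duplex))
```

Rule 1 shows up as the freedom to choose the ratio of (A+T) to (G+C): changing the example strand changes that ratio, but never the A=T and G=C equalities in the duplex.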

What is the structure of DNA?

1952

Rosalind Franklin and Maurice Wilkins produce X-ray diffraction images of DNA which suggest that DNA must have a helical arrangement

What is the structure of DNA?

1953

Francis Crick and James Watson put together all of the clues and correctly deduce that DNA is a Double Helix

DNA base pairing occurs through hydrogen bonds

A:T pairs: 2 bonds

G:C pairs: 3 bonds
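As a rough illustration, the number of hydrogen bonds holding a duplex together can be counted from one strand's sequence, taking A:T pairs as two hydrogen bonds and G:C pairs as three. The function names and the example sequences are assumptions made for this sketch.

```python
# Hydrogen bonds contributed by the base pair that each base forms.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def duplex_hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds between a strand and its perfect complement."""
    return sum(H_BONDS[base] for base in strand.upper())


def gc_content(strand: str) -> float:
    """Fraction of bases in the strand that are G or C."""
    strand = strand.upper()
    return sum(base in "GC" for base in strand) / len(strand)


for seq in ("ATATATAT", "GCGCGCGC"):  # hypothetical 8-mers
    print(seq, gc_content(seq), duplex_hydrogen_bonds(seq))
```

GC-rich duplexes of the same length carry more hydrogen bonds, which is one reason they tend to be harder to separate.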

The double helix strongly suggested that DNA replication might proceed by a “semiconservative” process

The Meselson-Stahl experiment argues for strand separation during DNA replication
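The expected outcome of semiconservative replication can be simulated with a toy model. In this sketch each duplex is a pair of strand labels ("15N" for heavy parental strands, "14N" for newly made light strands); the names `replicate` and `density_classes` are illustrative, not taken from the original experiment.

```python
from collections import Counter


def replicate(duplexes):
    """Semiconservative replication: every old strand templates a new light strand."""
    return [(old_strand, "14N") for duplex in duplexes for old_strand in duplex]


def density_classes(duplexes):
    """Count duplexes that are heavy (two 15N strands), hybrid (one) or light (none)."""
    names = {2: "heavy", 1: "hybrid", 0: "light"}
    return Counter(names[duplex.count("15N")] for duplex in duplexes)


population = [("15N", "15N")]  # start with fully heavy DNA
for generation in range(1, 4):
    population = replicate(population)
    print(generation, dict(density_classes(population)))
```

Under this model the first generation is entirely hybrid and later generations add only light duplexes, which is the banding pattern that argues for strand separation.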

Genes control the amino acid sequence of proteins

• 1957 - Vernon Ingram shows that sickle-cell haemoglobin differs from wild type by the substitution of one amino acid
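A small sketch shows how a single base change produces a single amino acid substitution. The two mRNA fragments are assumed here to be the commonly cited first seven codons of mature β-globin, normal and sickle (GAG to GUG at codon 6); the helper names are illustrative.

```python
from itertools import product

# Standard genetic code, built in UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_all(rna: str) -> str:
    """Translate every complete codon in frame 0 (no start/stop handling)."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))


def substitutions(rna_a: str, rna_b: str):
    """List (position, amino acid in a, amino acid in b) where the proteins differ."""
    pairs = zip(translate_all(rna_a), translate_all(rna_b))
    return [(i, a, b) for i, (a, b) in enumerate(pairs, start=1) if a != b]


normal = "GUGCACCUGACUCCUGAGGAG"  # assumed: codons 1-7, normal beta-globin
sickle = "GUGCACCUGACUCCUGUGGAG"  # assumed: same fragment with GAG -> GUG
print(translate_all(normal), translate_all(sickle), substitutions(normal, sickle))
```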

Genes control the amino acid sequence of proteins

Alteration of amino acid sequence is also observed in many other hereditary anaemias!

DNA cannot directly specify the sequence of amino acids in proteins

• Protein synthesis in eukaryotic cells known to take place in the cytoplasm

• There must therefore be a SECOND information-containing molecule that gets its specificity from DNA, but then moves to the cytoplasm

• Attention immediately focuses on RNA: it was easy to imagine that it could be produced from a DNA template

• Torbjörn Caspersson and Jean Brachet demonstrated that RNA was mostly in the cytoplasm

Jean Brachet (1909-1998)

Discovery of mRNA

• T2 is a bacteriophage that infects E. coli

• Infection completely shuts down normal cellular transcription; only viral protein is made

• RNA made after T2 infection always has the same base composition as T2 DNA

• T2 carries none of its own RNA

• ³²P is incorporated into RNA made after T2 infection

• Only about 3-5% of cellular RNA is messenger RNA!

The case for RNA

Chemically very similar to DNA

[Figure labels: hydroxyl group; missing methyl group in uracil relative to thymine]

Easy to imagine RNA being produced from a DNA template

There must be a molecular machine that makes RNA from a DNA template

• Jerard Hurwitz and Sam Weiss independently discover an enzyme that will only make RNA in the presence of DNA.

• The enzyme uses ATP, GTP, CTP, and UTP as precursors
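The templating rule behind this enzyme can be sketched as follows: each template base selects the complementary ribonucleotide (A, G, C or U). The template sequence is hypothetical and is written 3'→5' so that the RNA reads 5'→3'; `transcribe_template` is an illustrative name.

```python
# Ribonucleotide incorporated opposite each DNA template base.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') made from a DNA template strand given 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())


template = "TACCGGAAAATT"  # hypothetical template strand, written 3'->5'
print(transcribe_template(template))
```

Without a template the mapping has nothing to read, which mirrors the observation that the enzyme only makes RNA in the presence of DNA.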

POOR GUYS!

The 1959 Nobel Prize had already been awarded to Severo Ochoa for what later turned out to be the WRONG ENZYME!

RNA Polymerase is a molecular machine that carries out transcription

RNA is synthesised in the nucleus but travels to the cytoplasm

Cells pulse-labelled with ³H-labelled cytidine

T = 15 minutes T = ~90 minutes

D.M. Prescott

Ribosomes are the site of protein synthesis

Ribosomes studding the endoplasmic reticulum

Shown using radiolabelled amino acids in conjunction with ultracentrifugation to isolate different cell fractions. Where does the radioactivity end up at various times?

Ribosomes and associated rRNAs are the factories for protein synthesis

Crick’s adaptor hypothesis

• Can folded RNA act directly as the template for protein synthesis?

• Seems unlikely:
  • the nucleotide bases mostly present hydrophilic (water-soluble) groups
  • but many amino acids are non-polar (hydrophobic)
  • no clear way to discriminate between chemically similar amino acids

Crick proposes that an adaptor molecule must fit between RNA and the incoming amino acids, but its nature is unknown

Incoming amino acid

Adaptor molecule

RNA

Crick’s adaptors (tRNAs) are themselves RNA molecules

• Self-folding by complementary base pairs gives a structure with several functional domains

• Account for ~10% of cellular RNA abundance

• Typically include several modified, non-standard bases.

Mahlon Hoagland Paul Zamecnik

Zamecnik and Hoagland discover aminoacyl-tRNA synthetases

Enzymes that attach an adaptor (which we now know to be tRNA) to amino acids prior to their incorporation into proteins

It turns out these tRNAs are Crick’s proposed adaptors

Translation proceeds through a tRNA intermediate
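The adaptor idea can be illustrated by computing the anticodon that would pair with a given codon. This is a simplified sketch that assumes strict Watson-Crick pairing and ignores wobble and modified bases; `anticodon` is an illustrative helper name.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon(codon: str) -> str:
    """Anticodon (written 5'->3') that pairs antiparallel with an mRNA codon."""
    return codon.upper().translate(COMPLEMENT)[::-1]


for codon in ("AUG", "UUU", "GCC"):
    print(codon, "pairs with anticodon", anticodon(codon))
```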

Nature of the genetic code

• It was obvious early on that the code was most likely a triplet code, in order to encode 20 amino acids:

• 4 x 4 nucleotides can specify 4² = 16 amino acids (too few)
• 4 x 4 x 4 nucleotides can specify 4³ = 64 amino acids

• Code must be redundant

• Not overlapping – Sydney Brenner’s thought experiment

• Marshall Nirenberg and Heinrich Matthaei showed that a homopolymer (UUUUUU...) produced a polyphenylalanine protein
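The counting argument for a triplet code can be checked directly. The sketch below assumes fixed-length code words over four bases and finds the shortest length with at least 20 distinct words; `min_codon_length` is an illustrative name.

```python
def min_codon_length(n_amino_acids: int = 20, n_bases: int = 4) -> int:
    """Smallest code word length whose number of words covers all amino acids."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


for length in (1, 2, 3):
    print(f"length {length}: {4 ** length} possible code words")
print("minimum length for 20 amino acids:", min_codon_length())
```

Because 64 exceeds 20, a triplet code has spare code words, which is why it must be redundant.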

RNAs with two repeating units:

(UCUCUCU → UCU CUC UCU) produced a polypeptide of alternating serine and leucine: UCU codes for serine and CUC codes for leucine

RNAs with three repeating units:

(UACUACUA → UAC UAC UAC, or ACU ACU ACU, or CUA CUA CUA) produced three different strings of amino acids, one for each reading frame

RNAs with four repeating units that include UAG, UAA, or UGA produced only dipeptides and tripeptides, revealing that UAG, UAA and UGA are stop codons.

Khorana's synthetic RNA approach to cracking the genetic code
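The logic of the repeating-unit experiments can be reproduced by translating a synthetic repeat in all three reading frames. This sketch uses the standard codon table and an illustrative helper, `frame_products`; it does not model where ribosomes actually initiate, and a stop codon would simply appear as "*" in the output.

```python
from itertools import product

# Standard genetic code, built in UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def frame_products(unit: str, copies: int = 6):
    """Translate a repeated RNA unit in reading frames 0, 1 and 2."""
    rna = unit * copies
    results = []
    for frame in range(3):
        codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
        results.append("".join(CODON_TABLE[c] for c in codons))
    return results


print("UC repeat: ", frame_products("UC"))   # every frame alternates S and L
print("UAC repeat:", frame_products("UAC"))  # each frame gives a different homopolymer
```

A dinucleotide repeat yields the same alternating polypeptide in every frame, while a trinucleotide repeat yields one homopolymer per frame.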

The genetic code is (almost) universal

Amino acids fall into five functional categories

Most common: Leu, Gly, Ser

Least common: Trp, Met, His

Study Question 11: Degeneracy and frequency of amino acids
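One way to start on this question is to count how many codons specify each amino acid in the standard genetic code. The sketch below only tabulates degeneracy; whether it explains amino acid frequency is left for the question itself.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built in UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid (one-letter codes), ignoring stop codons.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

for amino_acid, n_codons in degeneracy.most_common():
    print(amino_acid, n_codons)
```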

Study Question 12: Single mutation from AGA

[Answer grid: for each single-base mutation of AGA, classify the change as silent, conservative, hydrophilic/hydrophilic, hydrophilic/hydrophobic, or other.]
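The nine single-base mutants of AGA can be enumerated mechanically. This sketch sorts them only into silent, missense and nonsense changes; mapping the missense changes onto the finer categories of the answer grid depends on the amino acid groupings used in the course, which are not reproduced here. The helper names are illustrative.

```python
from itertools import product

# Standard genetic code, built in UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_mutants(codon: str):
    """Yield every codon that differs from the input at exactly one position."""
    for position, base in enumerate(codon):
        for new_base in BASES:
            if new_base != base:
                yield codon[:position] + new_base + codon[position + 1:]


def classify(original: str, mutant: str) -> str:
    """Label a codon change as silent, nonsense (to stop) or missense."""
    before, after = CODON_TABLE[original], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    return "nonsense" if after == "*" else "missense"


results = {m: (CODON_TABLE[m], classify("AGA", m)) for m in single_mutants("AGA")}
for mutant, (amino_acid, label) in results.items():
    print(f"AGA -> {mutant}: R -> {amino_acid} ({label})")
```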

Proteins have four levels of structure

The primary structure of proteins is the sequence of amino acids joined by peptide bonds

Formation of this bond is associated with a small positive ΔG

Protein synthesis depends on coupled reactions!
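The coupling argument can be written out symbolically. The subscripts below are labels chosen for this sketch: "peptide" for peptide bond formation and "hydrolysis" for the energy-releasing reaction it is coupled to (for example, nucleoside triphosphate hydrolysis during amino acid activation). No numerical values are assumed.

```latex
\begin{align*}
\Delta G_{\text{peptide}} &> 0, \qquad \Delta G_{\text{hydrolysis}} < 0 \\
\Delta G_{\text{coupled}} &= \Delta G_{\text{peptide}} + \Delta G_{\text{hydrolysis}} \\
\Delta G_{\text{coupled}} &< 0 \quad \text{whenever} \quad
  \lvert \Delta G_{\text{hydrolysis}} \rvert > \Delta G_{\text{peptide}}
\end{align*}
```

Because free energy changes add, an unfavourable step can proceed when it is linked to a sufficiently favourable one.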

Secondary structure - the alpha helix

Alpha helical conformation is stabilised by hydrogen bonding

H bonds

Secondary structure – beta sheets

Parallel configuration

Antiparallel configuration

The enzyme acetylcholinesterase bound to acetylcholine

Secondary structures combine to determine tertiary structure

Proteins combine to form quaternary structures

Collagen Haemoglobin

Interactions (usually with a small molecule) can alter the shape and activity of an enzyme

Allosteric interactions

+ve -ve

Enzymes lower the activation energies associated with biochemical reactions

G

Typical energy of activation is 20-30 kcal/mol
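To get a feel for what lowering an activation energy means, the Arrhenius relation k = A·exp(-Ea/RT) can be used as a rough model. The Arrhenius form is brought in here as an assumption and is not taken from the slides; the sketch also assumes an unchanged pre-exponential factor A, a temperature of 310 K, and a hypothetical reduction from 25 to 15 kcal/mol.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def rate_enhancement(ea_uncatalysed: float, ea_catalysed: float,
                     temperature_k: float = 310.0) -> float:
    """Ratio of catalysed to uncatalysed rate constants under the Arrhenius model.

    Activation energies are in kcal/mol; the pre-exponential factor is
    assumed to be the same for both reactions, so it cancels.
    """
    delta = ea_uncatalysed - ea_catalysed
    return math.exp(delta / (R_KCAL * temperature_k))


# Hypothetical example: an enzyme lowers Ea from 25 to 15 kcal/mol.
print(f"{rate_enhancement(25.0, 15.0):.2e}")
```

Because the dependence is exponential, even a modest reduction in activation energy corresponds to a rate increase of many orders of magnitude.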

Eukaryotic mRNA must often be spliced in order to produce a mature transcript

Exons often correspond to functional protein domains and alternative splicing can give rise to variant proteins
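Splicing and alternative splicing can be sketched as joining selected exon intervals of a pre-mRNA. The toy transcript below is entirely hypothetical (three short exons separated by two introns), and `splice` is an illustrative helper, not a model of the spliceosome.

```python
def splice(pre_mrna: str, exons) -> str:
    """Join the given exon intervals (0-based, end-exclusive) in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical pre-mRNA laid out as alternating exons and introns.
segments = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUUUUCAG"),
    ("exon", "UUUGGA"),
    ("intron", "GUACGUCCCAAG"),
    ("exon", "AAAUAA"),
]
pre_mrna = "".join(seq for _, seq in segments)

# Work out exon coordinates from the layout above.
exons, position = [], 0
for kind, seq in segments:
    if kind == "exon":
        exons.append((position, position + len(seq)))
    position += len(seq)

full_length = splice(pre_mrna, exons)                 # all three exons
exon_skipped = splice(pre_mrna, [exons[0], exons[2]])  # exon 2 skipped
print(full_length, exon_skipped)
```

Skipping the middle exon gives a shorter mature transcript from the same gene, which is the sense in which alternative splicing produces variant proteins.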
