The Genetic Code Math-CS Camp, 19.07.06, Singapore Mikhail S. Gelfand Research and Training Center of Bioinformatics, Institute for Information Transmission Problems, Moscow, Russia and Department of Bioengineering and Bioinformatics, Moscow State University

The Genetic Code

Math-CS Camp, 19.07.06, Singapore

Mikhail S. Gelfand

Research and Training Center of Bioinformatics, Institute for Information Transmission Problems, Moscow, Russia

and Department of Bioengineering and Bioinformatics, Moscow State University

The Biological Code by Martynas Yčas (London, 1969)

Russian translation: Биологический код (Moscow, 1971)

[Figure: number of references (refs) by year(s)]

To apply mathematics in biology, a mathematician has to understand biology. Israel Gelfand

Plan

• Pre-history
  – Genetics
  – Evolutionary theory
  – Chemistry

• Cracking the Code

• Update

Genetics: Gregor Mendel (1822-1884)

• Attended the Philosophical Institute in Olomouc

• Since 1843 – at the Augustinian Abbey of St. Thomas in Brno

• 1851-1853 – studied at the University of Vienna

• 1856-1863 – cultivated 28 thousand pea plants

• The Three Laws of Genetics (“Experiments on Plant Hybridization”)
  – Read to the Natural History Society of Brünn (Brno), Moravia (1865)
  – Published in the Proceedings of the Natural History Society (1866)

• Since 1868 – abbot, stopped working in science

The seven traits of pea plants studied by Mendel

The first law

Crossing two pure lines that differ in some trait (e.g. yellow / green seeds) gives only one variant of the trait in the first generation: the one determined by the dominant allele.

The second law

Crossing two pure lines that differ in some trait (e.g. yellow / green seeds) gives only one variant of the trait in the first generation (the dominant one), and a 3:1 distribution of the dominant and recessive variants in the second generation.

(Law of large numbers)

The 3:1 ratio is seen only when the number of observations is sufficiently high.
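
The dependence on sample size can be illustrated with a small simulation. This is only a sketch under simple assumptions: each F1 parent is a heterozygote written `Aa`, each parent passes on one of its two alleles with equal probability, and the function names and the seed are chosen here for illustration.

```python
import random
from fractions import Fraction
from itertools import product

def exact_dominant_fraction():
    """Enumerate the four equally likely F2 genotypes of an Aa x Aa cross."""
    offspring = list(product("Aa", repeat=2))  # one allele from each F1 parent
    dominant = sum(1 for genotype in offspring if "A" in genotype)
    return Fraction(dominant, len(offspring))

def simulated_dominant_fraction(n_plants, rng):
    """Draw n_plants F2 offspring at random; return the observed dominant fraction."""
    dominant = 0
    for _ in range(n_plants):
        genotype = (rng.choice("Aa"), rng.choice("Aa"))
        if "A" in genotype:
            dominant += 1
    return dominant / n_plants

rng = random.Random(2006)  # arbitrary seed, used only for repeatability
print("exact:", exact_dominant_fraction())
for n in (8, 80, 800, 8000, 80000):
    print(n, simulated_dominant_fraction(n, rng))
```

With a handful of plants the observed fraction can be far from 3/4; it settles near 3/4 only for large samples.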

The third law

Two different traits are inherited independently (in the second generation the ratio is 9:3:3:1)

What if we take a pair with a different assortment of the same traits?

Same F1

Same F2… regardless of the initial assortment
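
Both claims (the 9:3:3:1 ratio, and the same F1 and F2 for either initial assortment) can be checked by enumerating gametes. A minimal sketch, assuming two unlinked genes with alleles written `A`/`a` and `B`/`b`, upper case being dominant; the notation and the helper names are illustrative and not taken from the slides.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """All gametes of a two-gene genotype such as ('Aa', 'Bb'): one allele per gene."""
    return [a + b for a, b in product(genotype[0], genotype[1])]

def cross(parent1, parent2):
    """Counter of offspring genotypes, every pairing of gametes being equally likely."""
    offspring = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # Sort the alleles of each gene so that 'aA' and 'Aa' count as one genotype.
        gene_a = "".join(sorted(g1[0] + g2[0]))
        gene_b = "".join(sorted(g1[1] + g2[1]))
        offspring[(gene_a, gene_b)] += 1
    return offspring

def phenotypes(offspring):
    """Collapse genotypes to phenotypes: one upper-case allele is enough to show the dominant trait."""
    result = Counter()
    for (gene_a, gene_b), n in offspring.items():
        pheno = ("A" if "A" in gene_a else "a") + ("B" if "B" in gene_b else "b")
        result[pheno] += n
    return result

def f1_and_f2(pure_line_1, pure_line_2):
    f1 = cross(pure_line_1, pure_line_2)
    (f1_genotype,) = f1  # pure lines give a single F1 genotype
    return f1_genotype, phenotypes(cross(f1_genotype, f1_genotype))

print(f1_and_f2(("AA", "BB"), ("aa", "bb")))
print(f1_and_f2(("AA", "bb"), ("aa", "BB")))
```

Both calls give the F1 genotype `('Aa', 'Bb')` and the F2 phenotype counts 9, 3, 3 and 1 out of 16.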

Incomplete dominance

Charles Darwin (1809-1882)

• 1825-27 at Edinburgh University and 1827-31 at the University of Cambridge – natural history, geology, botany

• 1831-1836 – Voyage of the Beagle

• Journal of Researches into the Geology and Natural History of the various countries visited by H.M.S. Beagle (1839)

The Law of Natural Selection

• Species make more offspring than can grow to adulthood.

• Populations remain roughly the same size.

• Food resources are limited, but are relatively constant most of the time.

• In such an environment there will be a struggle for survival among individuals.

• In sexually reproducing species, generally no two individuals are identical.

• Much of the variation is heritable.

• Individuals with the "best" characteristics will be more likely to survive …

• … those desirable traits will be passed to their offspring …

• … and then inherited by following generations, becoming prevalent and then fixed among the population through time.

Re-discovery of the Mendel laws and emergence of modern genetics

• Hugo de Vries (1900)

• William Bateson
  – genetics, gene, allele

• Walter Sutton
  – Link between genes and chromosomes (1902)

• Archibald Garrod
  – Genetic cause of some human disease (1902-08-23)

• Thomas Morgan, work on Drosophila
  – Mutants: spontaneous appearance of new alleles (a fly with white eyes in a population of flies with red eyes) (1908)
  – Universal acceptance of chromosomes (1915)

Gene = a set of non-complementing mutations

Edward Lewis: Do two recessive mutations occur in the same gene?

F1: Mutant phenotype

F1: Wild-type phenotype

F2: Mutant phenotypes persist in cis (same gene). Mutant phenotypes reappear in trans (different genes).

F1: Mutant phenotype → F2: All mutant phenotypes

F1: Wild-type phenotype → F2:

WT  WT  WT  WT  Mut Mut Mut Mut Mut
1   2   2   4   1   2   1   2   1     (9:7)
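
The 9:7 ratio follows from counting the sixteen gamete pairings of the wild-type F1. A minimal sketch, assuming the F1 is a double heterozygote `Aa Bb`, that `a` and `b` are recessive mutations in two different unlinked genes, and that a plant is mutant whenever either gene has lost both wild-type copies; the allele names are illustrative.

```python
from collections import Counter
from itertools import product

def f2_phenotype_counts():
    """Phenotype counts over the 16 equally likely gamete pairings of Aa Bb x Aa Bb."""
    counts = Counter()
    gametes = ["".join(g) for g in product("Aa", "Bb")]  # AB, Ab, aB, ab
    for g1, g2 in product(gametes, repeat=2):
        has_wild_a = "A" in (g1[0], g2[0])  # at least one working copy of gene A
        has_wild_b = "B" in (g1[1], g2[1])  # at least one working copy of gene B
        counts["WT" if has_wild_a and has_wild_b else "Mut"] += 1
    return counts

print(f2_phenotype_counts())
```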

DNA

• Friedrich Miescher (1869)
  – Nuclein
  – Richard Altmann: nucleic acid (1889). Only in chromosomes

• Phoebus Levene (1929)
  – Components (four bases, the sugar-phosphate chain)
  – Nucleotide: phosphate + sugar + base unit

• Hammarsten and Caspersson (1930s)
  – DNA is a long polymer; crystals

• Astbury (1938)
  – X-ray photographs

• Chargaff rules (1947)
  – In many organisms, #A=#T, #C=#G

Transforming factor (Frederick Griffith, 1928)

… = DNA (Oswald Avery, Colin MacLeod, Maclyn McCarty, 1944)

DNA is the genetic medium of phages (Alfred Hershey and Martha Chase, 1952)

32P – radioactive DNA
35S – radioactive proteins

Only DNA enters the cell

… and only DNA is inherited by progeny phages

Erwin Schrödinger

“What Is Life?”, 1944: The gene is an aperiodic crystal

The structure of DNA …

• Maurice Wilkins and Rosalind Franklin: high-resolution crystals (1950-1953)

… is the double helix

James Watson and Francis Crick (1953)

The Nature paper: a few lines more than one page

The DNA chain

Complementary pairs of nucleotides

C – G

T – A

Figures from the second Watson-Crick paper

The main distances are the same

One base-pair in the double helix (axial view)

The double helix, stick and ball models, axial view

The double helix, stick and ball models, side view

Three models for the replication of DNA

The semi-conservative one is correct (Matthew Meselson and Franklin Stahl, 1958)

Q: What would be the outcome if one of the two other models were correct?

Cells are grown on the 15N (heavy) medium for several generations, then transferred to 14N (light) medium
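
One way to approach the question is to follow the fraction of heavy (15N) material in each duplex under the three models. The sketch below is a toy calculation, not a description of the original experiment: each duplex is a pair of strands, each strand is recorded by its heavy fraction, and the model names and function names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction

def replicate(duplex, model):
    """Return the two daughter duplexes produced on light (14N) medium."""
    s1, s2 = duplex
    new = Fraction(0)  # a freshly made strand contains no heavy nitrogen
    if model == "semi-conservative":
        return [(s1, new), (s2, new)]   # each old strand pairs with a new one
    if model == "conservative":
        return [(s1, s2), (new, new)]   # the old duplex stays intact
    if model == "dispersive":
        h = (s1 + s2) / 4               # old material is spread evenly over all four strands
        return [(h, h), (h, h)]
    raise ValueError(model)

def bands(model, generations):
    """Map duplex density (heavy fraction) to the share of molecules with that density."""
    population = [(Fraction(1), Fraction(1))]  # fully heavy DNA at the moment of transfer
    for _ in range(generations):
        population = [d for duplex in population for d in replicate(duplex, model)]
    counts = Counter((s1 + s2) / 2 for s1, s2 in population)
    return {density: Fraction(n, len(population)) for density, n in counts.items()}

for model in ("semi-conservative", "conservative", "dispersive"):
    print(model, bands(model, 1), bands(model, 2))
```

After one generation only the conservative model predicts two bands (fully heavy and fully light). After two generations the semi-conservative model predicts a hybrid band and a light band in equal amounts, while the dispersive model still predicts a single band that keeps getting lighter.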

Electron micrograph of replicating DNA

The Central Dogma (F. Crick)

DNA → RNA → protein

Crossing-over and recombination

• Genes from one chromosome are not inherited independently

• Recombination allows for relative mapping of gene positions on the chromosome: if two genes are close, the frequency of recombination will be lower
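
The mapping idea can be sketched for three genes: the pair that recombines most often lies at the two ends, so the best order is the one in which neighbouring genes recombine least. The gene names and frequencies below are hypothetical, invented purely for illustration.

```python
from itertools import permutations

# Hypothetical pairwise recombination frequencies (fraction of recombinant offspring).
RF = {
    frozenset(("a", "b")): 0.10,
    frozenset(("b", "c")): 0.05,
    frozenset(("a", "c")): 0.14,
}

def gene_order(genes, rf):
    """Return the linear order whose neighbouring genes recombine least in total."""
    def total(order):
        return sum(rf[frozenset(pair)] for pair in zip(order, order[1:]))
    return min(permutations(genes), key=total)

order = gene_order(("c", "a", "b"), RF)
print(order)
```

The order is defined only up to reversal, so `a b c` and `c b a` describe the same map.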

Collinearity of the gene and the protein (Charles Yanofsky, 1967)

The Genetic Code

• The genetic code: correspondence between DNA and protein (George Gamow, 1954) (Георгий Гамов)

• Crick and co-authors (1961):
  – Non-overlapping (one mutation affects one amino acid)
  – Degenerate (many codons for one amino acid)
  – Comma-less (no specific markers between codons)
  – Periodic

The codon is a triplet

• Mutations caused by acridine
  – Non-leaky (instead of weakened function, simply no function)
  – Mechanism: insertions and deletions of nucleotides (the downstream part of the gene is completely scrambled → the code is comma-less)

CUA CUA CUA CUA CUA CUA CUA CUA CUA CUA CUA CUA CUA
Leu Leu Leu Leu Leu Leu Leu Leu Leu Leu Leu Leu Leu

insertion (G)
CUA CUA CUA CGU ACU ACU ACU ACU ACU ACU ACU ACU ACU
Leu Leu Leu Arg Thr Thr Thr Thr Thr Thr Thr Thr Thr

deletion (U)
CUA CUA CUA CUA CUA CUA CUA CUA CUA CAC UAC UAC UAC
Leu Leu Leu Leu Leu Leu Leu Leu Leu His Tyr Tyr Tyr
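
The effect of a single insertion or deletion can be replayed in a few lines. This is a sketch that assumes the standard genetic code and prints one-letter amino acid codes (L = Leu, R = Arg, T = Thr, H = His, Y = Tyr); the 0-based positions 10 and 28 are read off the sequences above, and the helper names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))

def translate(rna):
    """Translate complete codons starting at position 0; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))

def insert(rna, pos, base):
    return rna[:pos] + base + rna[pos:]

def delete(rna, pos):
    return rna[:pos] + rna[pos + 1:]

wild_type = "CUA" * 13
plus = insert(wild_type, 10, "G")   # the insertion shown above
minus = delete(wild_type, 28)       # the deletion shown above

for name, rna in (("wild type", wild_type), ("insertion", plus), ("deletion", minus)):
    print(f"{name:10s} {translate(rna)}")
```

Every codon downstream of the change is read in a shifted frame, which is why these mutants lose function completely instead of being weakened.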

Double mutants and revertants

• Two classes of mutations: (+) and (–)

• Double mutants (+)¤(+) and (–)¤(–) still produce loss-of-function phenotypes

• Double mutants (+)¤(–) and (–)¤(+) produce leaky phenotypes

CUA CUA CUA CGU ACU ACU ACU ACU ACU ACU ACU ACU ACU
Leu Leu Leu Arg Thr Thr Thr Thr Thr Thr Thr Thr Thr

¤

CUA CUA CUA CUA CUA CUA CUA CUA CUA CAC UAC UAC UAC
Leu Leu Leu Leu Leu Leu Leu Leu Leu His Tyr Tyr Tyr

CUA CUA CUA CGU ACU ACU ACU ACU ACU ACA CUA CUA CUA
Leu Leu Leu Arg Thr Thr Thr Thr Thr Thr Leu Leu Leu
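
Combining one insertion with one deletion restores the reading frame downstream of the second mutation, so only the stretch between the two sites is altered. A sketch under the same assumptions as the slide (standard genetic code, one-letter amino acid codes, G inserted at 0-based position 10 and U deleted at position 28 of the wild-type sequence); the `apply_edits` helper is illustrative.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))

def translate(rna):
    """Translate complete codons starting at position 0."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))

def apply_edits(rna, edits):
    """Apply ('+', pos, base) insertions and ('-', pos, None) deletions.

    Positions are wild-type coordinates, so edits are applied from right to left.
    """
    for kind, pos, base in sorted(edits, key=lambda e: e[1], reverse=True):
        if kind == "+":
            rna = rna[:pos] + base + rna[pos:]
        else:
            rna = rna[:pos] + rna[pos + 1:]
    return rna

wild_type = "CUA" * 13
double = apply_edits(wild_type, [("+", 10, "G"), ("-", 28, None)])

wild_protein, double_protein = translate(wild_type), translate(double)
changed = [i for i, (w, m) in enumerate(zip(wild_protein, double_protein)) if w != m]
print(double_protein, changed)
```

Seven residues in the middle differ from the wild type and the tail is read correctly again, which is consistent with a leaky phenotype.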

Triple mutants are revertants!

• Triple mutants of the same class, (+)¤(+)¤(+) and (–)¤(–)¤(–), produce leaky phenotypes

CUA CUA CUA CGU ACU ACU ACU ACU ACU ACU ACU ACU ACU ACU
Leu Leu Leu Arg Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr

¤

CUA CUA CUA CUA CUA CUA CGU ACU ACU ACU ACU ACU ACU ACU
Leu Leu Leu Leu Leu Leu Arg Thr Thr Thr Thr Thr Thr Thr

double mutant – loss of function phenotype

CUA CUA CUA CGU ACU ACU ACG UAC UAC UAC UAC UAC UAC UAC
Leu Leu Leu Arg Thr Thr Thr Tyr Tyr Tyr Tyr Tyr Tyr Tyr

¤

CUA CUA CUA CUA CUA CUA CUA CUA CUA CGU ACU ACU ACU ACU
Leu Leu Leu Leu Leu Leu Leu Leu Leu Arg Thr Thr Thr Thr

triple mutant – leaky phenotype

CUA CUA CUA CGU ACU ACU ACG UAC UAC UAC GUA CUA CUA CUA
Leu Leu Leu Arg Thr Thr Thr Tyr Tyr Tyr Val Leu Leu Leu
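
The pattern of leaky and non-leaky combinations reduces to arithmetic modulo 3, which is the core of the argument that the codon is a triplet. A minimal sketch, assuming every (+) mutation inserts exactly one base, every (–) mutation deletes exactly one base, and the scrambled stretch between the mutations is tolerated by the protein; the function names are illustrative.

```python
from itertools import combinations_with_replacement

def frame_restored(mutations):
    """mutations: sequence of +1 (one-base insertion) and -1 (one-base deletion)."""
    return sum(mutations) % 3 == 0

def classify(max_mutations=3):
    """Predicted phenotype for every combination of up to max_mutations mutations."""
    table = {}
    for k in range(1, max_mutations + 1):
        for combo in combinations_with_replacement((+1, -1), k):
            table[combo] = "leaky" if frame_restored(combo) else "loss of function"
    return table

for combo, phenotype in classify().items():
    print(combo, phenotype)
```

Only (+)(–), (+)(+)(+) and (–)(–)(–) come out as leaky. If codons were two or four bases long, the modulus would differ and the triple mutants of one class would not revert.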

Cracking the Code (F. Crick, M. Nirenberg, J. Matthaei, S. Ochoa, G. Khorana, … and you)

• Regular oligonucleotides
  – … UUUUUUUUUU …
  – … UCUCUCUCUC …
  – … UCAUCAUCAU …

• Random oligonucleotides with known composition

• Changes in proteins caused by deamination-caused mutations: C→U, A→G

• Changes in proteins caused by random mutations

• (tRNA binding in the presence of trinucleotides)
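
What the regular polymers reveal can be previewed by translating each of them in all three reading frames. This sketch assumes the standard genetic code (which is the answer the experiments were looking for) and prints one-letter amino acid codes; the helper names are illustrative.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))

def translate(rna):
    """Translate complete codons starting at position 0."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))

def frame_products(repeat_unit, copies=12):
    """Peptides read from a repeating polymer in each of the three reading frames."""
    rna = repeat_unit * copies
    n_codons = (len(rna) - 2) // 3  # same number of codons in every frame
    return [translate(rna[frame:frame + 3 * n_codons]) for frame in range(3)]

for unit in ("U", "UC", "UCA"):
    print(unit, frame_products(unit))
```

A one-base repeat gives a single homopolymer, a two-base repeat gives one alternating copolymer, and a three-base repeat gives three different homopolymers, one per reading frame.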

20 amino acids and 64 codons

• Alanine
• Cysteine
• Aspartate
• Glutamate
• Phenylalanine
• Glycine
• Histidine
• Isoleucine
• Lysine
• Leucine
• Methionine
• Asparagine
• Proline
• Glutamine
• Arginine
• Serine
• Threonine
• Valine
• Tryptophan
• Tyrosine

UUU Phe   UCU       UAU       UGU
UUC       UCC       UAC       UGC
UUA       UCA       UAA       UGA
UUG       UCG       UAG       UGG

CUU       CCU       CAU       CGU
CUC       CCC Pro   CAC       CGC
CUA       CCA       CAA       CGA
CUG       CCG       CAG       CGG

AUU       ACU       AAU       AGU
AUC       ACC       AAC       AGC
AUA       ACA       AAA Lys   AGA
AUG       ACG       AAG       AGG

GUU       GCU       GAU       GGU
GUC       GCC       GAC       GGC
GUA       GCA       GAA       GGA
GUG       GCG       GAG       GGG
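
The two numbers in the slide title are linked by a short counting argument: with four bases, a codon must be at least three bases long to distinguish 20 amino acids. A small sketch of that count; the function name is illustrative.

```python
from itertools import product

def shortest_codon_length(n_bases=4, n_amino_acids=20):
    """Smallest word length for which n_bases ** length covers all amino acids."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length

codon_length = shortest_codon_length()
codons = ["".join(c) for c in product("UCAG", repeat=codon_length)]
print(codon_length, len(codons), codons[:4])
```

Pairs give only 16 words, triplets give 64, so a triplet code is necessarily degenerate.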

Triplet binding data (from Crick’s Croonian lecture, 1966)

Reading the code: The ribosome

Translation

Polysomes

Adaptors (F. Crick and S. Brenner)

tRNA: secondary structure

tRNA: three-dimensional structure

tRNA and aminoacyl-tRNA synthetase

Initiation of translation

Translation start sites

dnaN ACATTATCCGTTAGGAGGATAAAAATG

gyrA GTGATACTTCAGGGAGGTTTTTTAATG

serS TCAATAAAAAAAGGAGTGTTTCGCATG

bofA CAAGCGAAGGAGATGAGAAGATTCATG

csfB GCTAACTGTACGGAGGTGGAGAAGATG

xpaC ATAGACACAGGAGTCGATTATCTCATG

metS ACATTCTGATTAGGAGGTTTCAAGATG

gcaD AAAAGGGATATTGGAGGCCAATAAATG

spoVC TATGTGACTAAGGGAGGATTCGCCATG

ftsH GCTTACTGTGGGAGGAGGTAAGGAATG

pabB AAAGAAAATAGAGGAATGATACAAATG

rplJ CAAGAATCTACAGGAGGTGTAACCATG

tufA AAAGCTCTTAAGGAGGATTTTAGAATG

rpsJ TGTAGGCGAAAAGGAGGGAAAATAATG

rpoA CGTTTTGAAGGAGGGTTTTAAGTAATG

rplM AGATCATTTAGGAGGGGAAATTCAATG

Translation start sites aligned

dnaN ACATTATCCGTTAGGAGGATAAAAATG

gyrA GTGATACTTCAGGGAGGTTTTTTAATG

serS TCAATAAAAAAAGGAGTGTTTCGCATG

bofA CAAGCGAAGGAGATGAGAAGATTCATG

csfB GCTAACTGTACGGAGGTGGAGAAGATG

xpaC ATAGACACAGGAGTCGATTATCTCATG

metS ACATTCTGATTAGGAGGTTTCAAGATG

gcaD AAAAGGGATATTGGAGGCCAATAAATG

spoVC TATGTGACTAAGGGAGGATTCGCCATG

ftsH GCTTACTGTGGGAGGAGGTAAGGAATG

pabB AAAGAAAATAGAGGAATGATACAAATG

rplJ CAAGAATCTACAGGAGGTGTAACCATG

tufA AAAGCTCTTAAGGAGGATTTTAGAATG

rpsJ TGTAGGCGAAAAGGAGGGAAAATAATG

rpoA CGTTTTGAAGGAGGGTTTTAAGTAATG

rplM AGATCATTTAGGAGGGGAAATTCAATG
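
One way to produce such an alignment is to shift each sequence so that its best match to a short purine-rich motif falls in the same column. The sketch below is an assumption-laden illustration: the slide does not name the motif, so `GGAGG` (the core of what is usually called the Shine-Dalgarno box) is assumed here, only three of the sequences are used, and the scoring is a plain count of matching positions.

```python
SITES = {
    "dnaN": "ACATTATCCGTTAGGAGGATAAAAATG",
    "tufA": "AAAGCTCTTAAGGAGGATTTTAGAATG",
    "rpoA": "CGTTTTGAAGGAGGGTTTTAAGTAATG",
}
MOTIF = "GGAGG"  # assumed motif; not stated on the slide

def best_offset(seq, motif=MOTIF):
    """Leftmost offset at which seq agrees with motif in the most positions."""
    def matches(offset):
        return sum(a == b for a, b in zip(seq[offset:offset + len(motif)], motif))
    return max(range(len(seq) - len(motif) + 1), key=matches)

def align(sites, motif=MOTIF):
    """Pad each sequence on the left so that the best motif matches line up."""
    offsets = {name: best_offset(seq, motif) for name, seq in sites.items()}
    widest = max(offsets.values())
    return {name: " " * (widest - offsets[name]) + seq for name, seq in sites.items()}

for name, row in align(SITES).items():
    print(f"{name:5s} {row}")
```

Because the score counts matching positions rather than requiring an exact hit, the same function can also place sites whose box deviates from the motif, although ties are then broken arbitrarily in favour of the leftmost position.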

Elongation

Termination of translation

Dialects

• The genetic code is not universal

• … but the differences are relatively minor

• … occur mainly in small genomes of organelles

• … and involve specific codon families.

• In many cases symmetry is increased, or entire families reassigned.

• Many changes involve stop codons

Reassignment

CUN (=CUU, CUC, CUA, CUG): Leu → Thr

Possible initiation codons in addition to AUG (Met): NUG (=GUG, UUG, CUG), AUN (=AUU, AUC, AUA)

UAA, UAG: stop → Gln

More symmetry

AUU Ile
AUC Ile
AUA Ile → Met
AUG Met

AGU Ser
AGC Ser
AGA Arg → Ser
AGG Arg → Ser

UGU Cys
UGC Cys
UGA stop → Trp
UGG Trp
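
The gain in symmetry can be made concrete by listing the codon pairs XYA / XYG that do not share a meaning. This is a sketch: the standard code is written with one-letter amino acid codes, and the three reassignments shown above are applied together purely for illustration, without claiming that any particular genome uses exactly this combination.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))

# The reassignments from the slide: AUA -> Met, AGA/AGG -> Ser, UGA -> Trp.
VARIANT = dict(STANDARD, AUA="M", AGA="S", AGG="S", UGA="W")

def split_purine_pairs(table):
    """Codon pairs XYA / XYG that are not translated the same way."""
    prefixes = ("".join(p) for p in product(BASES, repeat=2))
    return [(xy + "A", xy + "G") for xy in prefixes if table[xy + "A"] != table[xy + "G"]]

print("standard:", split_purine_pairs(STANDARD))
print("variant: ", split_purine_pairs(VARIANT))
```

In the standard code exactly two pairs are split (UGA/UGG and AUA/AUG); after the reassignments none is, and the AGN family codes for serine throughout.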

Vulnerable codon families

CGU Arg
CGC Arg
CGA Arg → none
CGG Arg → none

AGU Ser
AGC Ser
AGA Arg → Ser, Gly, stop
AGG Arg → Ser, Gly, stop, none

GGU Gly
GGC Gly
GGA Gly
GGG Gly

Stop-containing families

UGU Cys
UGC Cys
UGA stop → Trp, Cys, Sec
UGG Trp

UAU Tyr
UAC Tyr
UAA stop → Tyr, Gln
UAG stop → Gln (Pyl)

How many letters are there in the English alphabet?

• 26 (everybody knows) …

• … but we are discussing the book by Yčas …

• … so everybody is naïve

How many amino acids?

• Chemists: hundreds
  – many occur in proteins: post-translational modifications

• How many amino acids are encoded by DNA?

Crick:

Is formyl-methionine a “standard” amino acid?

• Occurs in bacteria at N-termini of all recently synthesized proteins (may be enzymatically removed later on)

• Has three codons: AUG, GUG, UUG
  – unlike “internal” methionine, which is encoded only by AUG
  – by the way, internal GUG encodes Valine and internal UUG encodes Leucine

Selenocysteine

• In all three domains of life (bacteria, eukaryotes, archaea)

• Encoded by UGA followed by a special hairpin structure (SECIS)
  – without this hairpin UGA is a stop codon
  – several genes for selenoproteins per genome (or none)
  – corresponds to cysteine in homologs (more efficient in enzymes)

• Complicated mechanism of incorporation (specific tRNA, seryl-tRNA synthetase, conversion to SeCys on tRNA, specific elongation factor)

Alignment of SECIS elements

The consensus SECIS structure

SECIS elements: examples

Pyrrolysine

• In methanogenic archaea

• A derivative of lysine

• Directly encoded (unlike selenocysteine). Standard mechanism:
  – UAG codon
  – specific tRNA
  – aminoacyl-tRNA

• UAG rarely used as a stop codon
  – never as the only stop of a gene

Thanks

• Wikipedia
• Ergito
• Authors of papers, photographs and Internet resources
• Professor Leong Hon Wai
• The organizers
• The assistants
• The students