A Zero-Knowledge Based Introduction to Biology
Konstantinos (Gus) Katsiapis
25 Sep 2009
Thanks to Cory McLean and George Asimenos
Cells: Building Blocks of Life
© 1997-2005 Coriell Institute for Medical Research
cell, membrane, cytoplasm, nucleus, mitochondrion
DNA: “Blueprints” for a cell
• Genetic information encoded in long strings of double-stranded DNA
• DNA (DeoxyriboNucleic Acid) is built from nucleotides, which come in only four flavors: Adenine, Cytosine, Guanine, Thymine
Nucleotide
deoxyribose, nucleotide, base, purine (A,G), pyrimidine (T,C), 3’, 5’
[Figure: structure of a nucleotide: a deoxyribose sugar and a phosphate group, with bonds to the previous nucleotide, to the next nucleotide, and to the base; the 3’ and 5’ ends are labeled]
Purines: Adenine (A), Guanine (G)
Pyrimidines: Cytosine (C), Thymine (T)
Let’s write “AGACC”!
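“AGACC” (backbone)
[Figure: the sugar-phosphate backbone for “AGACC”]
“AGACC” (DNA)
deoxyribonucleic acid (DNA)
[Figure: the single strand “AGACC” with its 5’ and 3’ ends labeled]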
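DNA is double stranded
strand, reverse, complement, reverse-complement
[Figure: the two paired strands running in opposite directions, each with its 5’ and 3’ ends labeled]
DNA is always written 5’ to 3’: AGACC or GGTCT (its reverse-complement)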
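The relationship between AGACC and GGTCT can be checked with a short sketch. The function names below are illustrative and not from the slides; the only assumption is standard Watson-Crick pairing (A with T, C with G).

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return complement(strand)[::-1]


print(complement("AGACC"))          # TCTGG (the opposite strand read 3' to 5')
print(reverse_complement("AGACC"))  # GGTCT (the opposite strand read 5' to 3')
```

Applying `reverse_complement` twice returns the original strand, which is why either AGACC or GGTCT identifies the same double-stranded molecule.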
DNA Packaging
histone, nucleosome, chromatin, chromosome, centromere, telomere
[Figure labels: DNA (~146bp), histones H1, H2A, H2B, H3, H4, nucleosome, chromatin, centromere, telomere]
The Genome
The genome is the full set of hereditary information for an organism
Humans bundle two copies of the genome into 46 chromosomes in every cell: 46 = 2 × (chromosomes 1-22 + X/Y)
Building an Organism
Every cell has the same sequence of DNA
Subsets of the DNA sequence determine the identity and function of different cells
[Figure labels: cell, DNA]
From DNA To Organism
?
Proteins do most of the work in biology, and are encoded by subsequences of DNA, known as genes.
RNA
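ribose, ribonucleotide, U
[Figure: structure of a ribonucleotide: a ribose sugar (carrying an OH group where deoxyribose has an H) and a phosphate group, with bonds to the previous ribonucleotide, to the next ribonucleotide, and to the base; the 3’ and 5’ ends are labeled]
Purines: Adenine (A), Guanine (G)
Pyrimidines: Cytosine (C), Uracil (U)
T → U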
Genes & Proteins
gene, transcription, translation, protein
Double-stranded DNA:
5’ TAGGATCGACTATATGGGATTACAAAGCATTTAGGGA...TCACCCTCTCTAGACTAGCATC 3’
3’ ATCCTAGCTGATATACCCTAATGTTTCGTAAATCCCT...AGTGGGAGAGATCTGATCGTAG 5’
(transcription)
Single-stranded RNA:
AUGGGAUUACAAAGCAUUUAGGGA...UCACCCUCUCUAGACUAGCAUC
(translation)
protein
Gene Transcription
promoter
5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’
Promoter: A region of DNA facilitating transcription of a gene. Usually located on the same strand, upstream of the gene and nearby.
Gene Transcription
transcription factor, binding site, RNA polymerase
5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’
Transcription factors (TFs): a type of protein that binds to DNA and helps initiate gene transcription.
Transcription factor binding sites: Short sequences of DNA (6-20 bp) recognized and bound by TFs.
RNA polymerase binds a complex of TFs in the promoter.
Gene Transcription
The two strands are separated
5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’
Gene Transcription
An RNA copy of the 5’→3’ sequence is created from the 3’→5’ template
5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’
RNA: G A U U A C A
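A minimal sketch of this step, assuming the template strand is given left to right in its 3’→5’ orientation as drawn on the slide. The function name is illustrative. Each template base is paired with its RNA partner, with U taking the place of T.

```python
# Pairing used when building RNA on a DNA template: the RNA partner of A is U.
RNA_PARTNER = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA (5' to 3') that pairs with a template given 3' to 5'."""
    return "".join(RNA_PARTNER[base] for base in template_3_to_5)


template = "CTAATGT"  # the 3'->5' strand from the slide
coding = "GATTACA"    # the 5'->3' strand from the slide

rna = transcribe(template)
print(rna)  # GAUUACA

# The RNA matches the 5'->3' strand with every T written as U.
print(rna == coding.replace("T", "U"))  # True
```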
Gene Transcription
5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’
pre-mRNA: 5’ G A U U A C A . . . 3’
RNA Processing
5’ cap, polyadenylation, exon, intron, splicing, UTR, mRNA
[Figure labels: 5’ cap, 5’ UTR, exon, intron, 3’ UTR, poly(A) tail, mRNA]
Gene Structure
[Figure: gene layout from 5’ to 3’: promoter, 5’ UTR, exons separated by introns, 3’ UTR; regions are marked coding or non-coding]
How many? (Human Genome)
• Exons per gene: ~8 on average (max: 148)
• Nucleotides per exon: 170 on average (max: 12k)
• Nucleotides per intron: 5,500 on average (max: 500k)
• Nucleotides per gene: 45k on average (max: 2.2M)
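A rough back-of-the-envelope sketch of what these averages imply. It multiplies the averages together, which is only an approximation (averages of different quantities do not combine exactly, and UTRs are ignored); the variable names are illustrative.

```python
# Averages quoted on the slide.
exons_per_gene = 8
nt_per_exon = 170
nt_per_intron = 5_500
nt_per_gene = 45_000

# A gene with n exons has n - 1 introns between them.
exonic_nt = exons_per_gene * nt_per_exon            # 1360
intronic_nt = (exons_per_gene - 1) * nt_per_intron  # 38500

# Share of an average gene that ends up in exons (rough estimate).
exonic_fraction = exonic_nt / nt_per_gene

print(exonic_nt, intronic_nt, f"{exonic_fraction:.1%}")  # 1360 38500 3.0%
```

Under these assumptions only a few percent of a typical gene is exon sequence; the rest is mostly intron.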
From RNA to Protein
• Proteins are long strings of amino acids joined by peptide bonds
• Translation from RNA sequence to amino acid sequence performed by ribosomes
• 20 amino acids → 3 RNA letters required to specify a single amino acid
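The reason three letters are needed can be seen by counting: with 4 RNA letters, words of length 1 or 2 give only 4 or 16 combinations, fewer than 20. A minimal sketch follows (the function name is illustrative):

```python
def min_codon_length(alphabet_size: int, meanings: int) -> int:
    """Smallest word length whose number of combinations covers all meanings."""
    length = 1
    while alphabet_size ** length < meanings:
        length += 1
    return length


for length in (1, 2, 3):
    print(length, 4 ** length)  # 1 4, then 2 16, then 3 64

print(min_codon_length(4, 20))  # 3
```

Three letters give 64 combinations for 20 amino acids, so several codons can map to the same amino acid.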
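Amino acid
[Figure: structure of an amino acid: an amino group (N, H), a central carbon carrying the side chain R, and a carboxyl group (C, O, OH)]
There are 20 standard amino acids:
Alanine, Arginine, Asparagine, Aspartate, Cysteine, Glutamate, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine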
Proteins
N-terminus, C-terminus
[Figure: an amino acid within a protein chain, bonded to the previous aa and to the next aa; the chain runs from the N-terminus (start) to the C-terminus (end), following the mRNA from 5’ to 3’]
Translation
The ribosome (a complex of protein and RNA) synthesizes a protein by reading the mRNA in triplets (codons). Each codon is translated to an amino acid.
ribosome, codon
[Figure labels: mRNA, ribosome P site, A site]
Translation
Codon table (each block shares a first letter; columns are the second letter G, A, C, U; rows are the third letter G, A, C, U)
First letter G:
GGG Gly   GAG Glu   GCG Ala   GUG Val
GGA Gly   GAA Glu   GCA Ala   GUA Val
GGC Gly   GAC Asp   GCC Ala   GUC Val
GGU Gly   GAU Asp   GCU Ala   GUU Val
First letter A:
AGG Arg   AAG Lys   ACG Thr   AUG Met or START
AGA Arg   AAA Lys   ACA Thr   AUA Ile
AGC Ser   AAC Asn   ACC Thr   AUC Ile
AGU Ser   AAU Asn   ACU Thr   AUU Ile
First letter C:
CGG Arg   CAG Gln   CCG Pro   CUG Leu
CGA Arg   CAA Gln   CCA Pro   CUA Leu
CGC Arg   CAC His   CCC Pro   CUC Leu
CGU Arg   CAU His   CCU Pro   CUU Leu
First letter U:
UGG Trp   UAG STOP  UCG Ser   UUG Leu
UGA STOP  UAA STOP  UCA Ser   UUA Leu
UGC Cys   UAC Tyr   UCC Ser   UUC Phe
UGU Cys   UAU Tyr   UCU Ser   UUU Phe
Abbreviations: Gly = Glycine, Glu = Glutamic acid, Asp = Aspartic acid, Ala = Alanine, Val = Valine, Arg = Arginine, Lys = Lysine, Asn = Asparagine, Ser = Serine, Thr = Threonine, Met = Methionine, Ile = Isoleucine, Gln = Glutamine, His = Histidine, Pro = Proline, Leu = Leucine, Trp = Tryptophan, Cys = Cysteine, Tyr = Tyrosine, Phe = Phenylalanine
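The codon table can be held in a small lookup structure. The sketch below builds the standard genetic code from a compact 64-character string of one-letter amino acid codes (with `*` for STOP), listed in U, C, A, G order; that ordering and the one-letter codes are conventions assumed here, not taken from the slides.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons, in UCAG order; '*' marks STOP.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first letter U
    "LLLLPPPPHHQQRRRR"  # first letter C
    "IIIMTTTTNNKKSSRR"  # first letter A
    "VVVVAAAADDEEGGGG"  # first letter G
)

# Map each three-letter codon to its amino acid.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

print(CODON_TABLE["AUG"])  # M (Met, also the START codon)
print(CODON_TABLE["UGG"])  # W (Trp)

stops = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
print(stops)  # ['UAA', 'UAG', 'UGA']
```

Counting the entries per amino acid shows the redundancy of the code: for example, six different codons map to Leu.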
Translation
5’ . . . A U U A U G G C C U G G A C U U G A . . . 3’
UTR: A U U
Start Codon: A U G (Met)
G C C (Ala), U G G (Trp), A C U (Thr)
Stop Codon: U G A
Translation
5’ . . . A U U A U G G C C U G G A C U U G A . . . 3’
[Figure labels: Met, Ala (chain so far), Trp (next amino acid)]
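The example above can be reproduced with a minimal sketch: scan for the first AUG, then read codons until a STOP. It assumes the standard genetic code, written compactly with one-letter amino acid codes (`*` for STOP); the function name is illustrative, and real translation initiation is more involved than finding the first AUG.

```python
from itertools import product

# Standard genetic code in UCAG order, one-letter amino acid codes, '*' = STOP.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first STOP."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUUAUGGCCUGGACUUGA"))  # MAWT, i.e. Met-Ala-Trp-Thr
```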
Transcription and Translation [3D Simulation]
Errors?
• What if the transcription / translation machinery makes mistakes?
• What is the effect of mutations in coding regions?
mutation
Reading Frames
reading frame
G C U U G U U U A C G A A U U A G
Frame 1: GCU UGU UUA CGA AUU AG
Frame 2: G CUU GUU UAC GAA UUA G
Frame 3: GC UUG UUU ACG AAU UAG
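A short sketch of how the three reading frames of one strand arise: the same sequence is cut into triplets starting at offset 0, 1, or 2. The function name is illustrative; leftover bases that do not fill a complete codon are dropped.

```python
def reading_frames(seq: str) -> list[list[str]]:
    """Split a sequence into complete codons for each of the three offsets."""
    return [
        [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        for offset in range(3)
    ]


for number, frame in enumerate(reading_frames("GCUUGUUUACGAAUUAG"), start=1):
    print(number, " ".join(frame))
# 1 GCU UGU UUA CGA AUU
# 2 CUU GUU UAC GAA UUA
# 3 UUG UUU ACG AAU UAG
```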
Synonymous Mutation
synonymous (silent) mutation, fourfold site
Original: GCU UGU UUA CGA AUU AG → Ala Cys Leu Arg Ile
Mutation: A → G at the third position of the Leu codon
Mutated: GCU UGU UUG CGA AUU AG → Ala Cys Leu Arg Ile
Missense Mutation
missense mutation
Original: GCU UGU UUA CGA AUU AG → Ala Cys Leu Arg Ile
Mutation: U → G at the third position of the Cys codon
Mutated: GCU UGG UUA CGA AUU AG → Ala Trp Leu Arg Ile
Nonsense Mutation
nonsense mutation
Original: GCU UGU UUA CGA AUU AG → Ala Cys Leu Arg Ile
Mutation: U → A at the third position of the Cys codon
Mutated: GCU UGA UUA CGA AUU AG → Ala STOP
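The three kinds of point mutation differ only in what the changed codon now encodes, which a small sketch can make concrete. It assumes the standard genetic code (one-letter codes, `*` for STOP), a reading frame starting at the first base, and 0-based positions; the function name is illustrative.

```python
from itertools import product

# Standard genetic code in UCAG order, one-letter amino acid codes, '*' = STOP.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(mrna: str, position: int, new_base: str) -> str:
    """Classify a single-base substitution at a 0-based position (frame 0)."""
    codon_start = position - position % 3
    old_codon = mrna[codon_start:codon_start + 3]
    offset = position - codon_start
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if new_aa == old_aa:
        return "synonymous"  # same amino acid
    if new_aa == "*":
        return "nonsense"    # premature STOP
    return "missense"        # different amino acid


seq = "GCUUGUUUACGAAUUAG"
print(classify_substitution(seq, 8, "G"))  # synonymous (UUA -> UUG, Leu stays Leu)
print(classify_substitution(seq, 5, "G"))  # missense   (UGU -> UGG, Cys -> Trp)
print(classify_substitution(seq, 5, "A"))  # nonsense   (UGU -> UGA, Cys -> STOP)
```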
Frameshift
frameshift
Original: GCU UGU UUA CGA AUU AG → Ala Cys Leu Arg Ile
Mutation: one U is deleted from the U U U run
Mutated: GCU UGU UAC GAA UUA G → Ala Cys Tyr Glu Leu
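A sketch of why a single deleted base changes every downstream amino acid: all later codon boundaries shift by one. It assumes the standard genetic code (one-letter codes, `*` for STOP) and a frame starting at the first base; the helper name is illustrative.

```python
from itertools import product

# Standard genetic code in UCAG order, one-letter amino acid codes, '*' = STOP.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(seq: str) -> str:
    """Translate every complete codon, starting at the first base."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


original = "GCUUGUUUACGAAUUAG"
# Delete one U from the UUU run (0-based index 6).
shifted = original[:6] + original[7:]

print(translate_frame(original))  # ACLRI = Ala Cys Leu Arg Ile
print(translate_frame(shifted))   # ACYEL = Ala Cys Tyr Glu Leu
```

Only the codons before the deletion (Ala, Cys) are preserved; an insertion or deletion of a multiple of three bases would instead keep the downstream frame intact.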
Gene Expression Regulation
• When should each gene be expressed?
• Regulate gene expression. Examples:
– Make more of gene A when substance X is present
– Stop making gene B once you have enough
– Make genes C1, C2, C3 simultaneously
• Why? Every cell has the same DNA but each cell expresses different proteins.
• Signal transduction: One signal converted to another
– Cascade has “master regulators” turning on many proteins, which in turn each turn on many proteins, ...
Regulation, signal transduction
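The cascade idea can be illustrated with a toy model: a directed graph in which each regulator turns on its targets, traversed breadth-first from the incoming signal. Every name in the network below is hypothetical and chosen only for illustration; real regulatory networks also include repression, feedback, and timing, none of which is modeled here.

```python
from collections import deque

# Hypothetical network: each key turns on the entries in its list.
CASCADE = {
    "signal": ["master_regulator"],
    "master_regulator": ["tf_1", "tf_2"],
    "tf_1": ["gene_C1", "gene_C2"],
    "tf_2": ["gene_C3"],
}


def activated_by(signal: str, network: dict[str, list[str]]) -> list[str]:
    """Return everything switched on downstream of a signal, in activation order."""
    seen = {signal}
    order = []
    queue = deque([signal])
    while queue:
        regulator = queue.popleft()
        for target in network.get(regulator, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


print(activated_by("signal", CASCADE))
# ['master_regulator', 'tf_1', 'tf_2', 'gene_C1', 'gene_C2', 'gene_C3']
```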
Gene Regulation
Gene expression is controlled at many levels:
• DNA chromatin structure
• Transcription
• Post-transcriptional modification
• RNA transport
• Translation
• mRNA degradation
• Post-translational modification
Transcription Regulation
• Much gene regulation occurs at the level of transcription.
• Primary players:
– Binding sites (BS) in cis-regulatory modules (CRMs)
– Transcription factor (TF) proteins
– RNA polymerase II
• Primary mechanism:
– TFs link to BSs
– Complex of TFs forms
– Complex assists or inhibits formation of the RNA polymerase II machinery
Tx Factor Binding Sites
• Short, degenerate DNA sequences recognized by particular TFs
• For complex organisms, cooperative binding of multiple TFs required to initiate transcription
[Figure: binding sequence logo]
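What a sequence logo displays can be sketched numerically: for each position in a set of aligned binding sites, the information content is 2 bits minus the entropy of the observed bases, so a fully conserved position scores 2 and a degenerate one scores less. The aligned sites below are hypothetical, invented purely for illustration, and the small-sample correction used by real logo tools is omitted.

```python
import math
from collections import Counter

# Hypothetical aligned binding sites for one TF (illustrative data only).
SITES = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTAA"]


def information_content(sites: list[str]) -> list[float]:
    """Bits of information per aligned position (maximum 2 for DNA)."""
    bits = []
    for column in zip(*sites):
        counts = Counter(column)
        total = len(column)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        bits.append(2.0 - entropy)
    return bits


for position, value in enumerate(information_content(SITES), start=1):
    print(position, round(value, 2))
```

In this toy alignment, position 1 is always T and scores 2 bits, while position 4 is split evenly between C and G and scores 1 bit, which is the sense in which the site is "degenerate".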
Transcription Regulation Mechanisms
enhancer, silencer, insulator
[Figures: Transcription Factor Specificity; Enhancer; Silencer; Insulator]
Unicellular vs. Multicellular
[Figure labels: unicellular, multicellular]
Non-coding RNAs
• RNAs transcribed from DNA but not translated into protein
• Structural ncRNAs: Conserved secondary structure (A-U, C-G, G-U)
• Involved in gene regulation
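The pairing rules listed for structural ncRNAs (A-U, C-G, and the G-U wobble pair) can be sketched as a check on whether two arms of an RNA could pair into a stem. Both arms are written 5’→3’, so the second arm is read in reverse; the function name and example sequences are illustrative.

```python
# Allowed RNA base pairs: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("C", "G"), ("G", "C"),
    ("G", "U"), ("U", "G"),
}


def can_form_stem(five_prime_arm: str, three_prime_arm: str) -> bool:
    """True if the two arms (both written 5'->3') can pair base by base."""
    if len(five_prime_arm) != len(three_prime_arm):
        return False
    return all(
        (a, b) in PAIRS
        for a, b in zip(five_prime_arm, reversed(three_prime_arm))
    )


print(can_form_stem("GGGAU", "AUCCC"))  # True  (Watson-Crick pairs only)
print(can_form_stem("GGGAU", "AUUCC"))  # True  (includes one G-U wobble pair)
print(can_form_stem("GGGAU", "AUACC"))  # False (G cannot pair with A)
```

Because G-U is tolerated, a sequence can change while the structure is conserved, which is why such RNAs are compared by structure rather than by sequence alone.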
Summary
• All hereditary information encoded in double-stranded DNA
• Each cell in an organism has same DNA
• DNA → RNA → protein
• Proteins have many diverse roles in cell
• Gene regulation diversifies protein products within different cells
The end?