43
A Zero-Knowledge Based Introduction to Biology Konstantinos (Gus) Katsiapis 25 Sep 2009 Thanks to Cory McLean and George Asimenos

A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

A Zero-Knowledge Based Introduction to Biology

Konstantinos (Gus) Katsiapis25 Sep 2009

Thanks to Cory McLean and George Asimenos

Page 2: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Cells: Building Blocks of Life

© 1997-2005 Coriell Institute for Medical Research

cell, membrane, cytoplasm, nucleus, mitochondrion

2

Page 3: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

DNA: “Blueprints” for a cell

• Genetic information encoded in long strings of double-stranded DNA

• DeoxyriboNucleic Acid comes in only four flavors: Adenine, Cytosine, Guanine, Thymine

3

Page 4: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Nucleotide

O

C C

CC

H

H

HHH

H

H

COPO

O-

O

to next nucleotide

to previous nucleotide

to base

deoxyribose, nucleotide, base, purine (A,G), pyrimidine (T,C), 3’, 5’

3’

5’ Adenine (A)

Cytosine (C)

Guanine (G)

Thymine (T)

Let’s write “AGACC”!

pyrimidines

purines

4

Page 5: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

“AGACC” (backbone)

5

Page 6: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

“AGACC” (DNA)deoxyribonucleic acid (DNA)

5’

5’3’

3’

6

Page 7: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

DNA is double stranded

3’

5’

5’

3’

DNA is always written 5’ to 3’

AGACC or GGTCT

strand, reverse, complement, reverse-complement

7

Page 8: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

DNA Packaginghistone, nucleosome, chromatin, chromosome, centromere, telomere

H1DNA

H2A, H2B, H3, H4

~146bp

telomerecentromere nucleosome

chromatin

8

Page 9: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

The Genome

The genome is the full set of hereditary information for an organism

Humans bundle two copies of the genome into 46 chromosomes in every cell= 2 * (1-22 + X/Y)

9

Page 10: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Building an Organism

cellDNA Every cell has the same sequence of

DNA

Subsets of the DNA sequence determine the identity and function of different cells

10

Page 11: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

From DNA To Organism

?

Proteins do most of the work in biology, and are encoded by subsequences of DNA, known as genes.

11

Page 12: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

RNA

O

C C

CC

H

OH

HHH

H

H

COPO

O-

O

to next ribonucleotide

to previous ribonucleotide

to base

ribose, ribonucleotide, U

3’

5’ Adenine (A)

Cytosine (C)

Guanine (G)

Uracil (U)

pyrimidines

purines

12

T U

Page 13: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Genes & Proteins

3’5’

5’3’ TAGGATCGACTATATGGGATTACAAAGCATTTAGGGA...TCACCCTCTCTAGACTAGCATC

ATCCTAGCTGATATACCCTAATGTTTCGTAAATCCCT...AGTGGGAGAGATCTGATCGTAG

AUGGGAUUACAAAGCAUUUAGGGA...UCACCCUCUCUAGACUAGCAUC

(transcription)

(translation)

Single-stranded RNA

protein

Double-stranded DNA

gene, transcription, translation, protein

13

Page 14: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Transcriptionpromoter

3’5’

5’3’

G A T T A C A . . .

C T A A T G T . . .

14

Promoter: A region of DNA facilitating transcription of a gene. Usually located one the same strand, upstream and nearby.

Page 15: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Transcriptiontranscription factor, binding site, RNA polymerase

3’5’

5’3’

Transcription factors: a type of protein that binds to DNA and helps initiate gene transcription.

Transcription factor binding sites: Short sequences of DNA (6-20 bp) recognized and bound by TFs.

RNA polymerase binds a complex of TFs in the promoter.

G A T T A C A . . .

C T A A T G T . . .

15

Page 16: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Transcription

3’5’

5’3’

The two strands are separated

G A T T A C A . . .

C T A A T G T . . .

16

Page 17: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Transcription

3’5’

5’3’

An RNA copy of the 5’→3’ sequence is created from the 3’→5’ template

G A T T A C A . . .

C T A A T G T . . .

G A U U A C A

17

Page 18: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Transcription

3’5’

5’3’

G A U U A C A . . .

G A T T A C A . . .

C T A A T G T . . .

pre-mRNA 5’ 3’

18

Page 19: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

RNA Processing5’ cap, polyadenylation, exon, intron, splicing, UTR, mRNA

5’ cap poly(A) tail

intron

exon

mRNA

5’ UTR 3’ UTR

19

Page 20: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Structure

5’ 3’

promoter5’ UTR exons 3’ UTR

introns

codingnon-coding

20

Page 21: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

How many?(Human Genome)

• Exons per gene:~ 8 on average (max: 148)

• Nucleotides per exon:170 on average (max: 12k)

• Nucleotides per intron:5,500 on average (max: 500k)

• Nucleotides per gene:45k on average (max: 2,2M)

21

Page 22: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

From RNA to Protein

• Proteins are long strings of amino acids joined by peptide bonds

• Translation from RNA sequence to amino acid sequence performed by ribosomes

• 20 amino acids 3 RNA letters required to specify a single amino acid

22

Page 23: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Amino acidamino acid

C

O

N

H

C

H

H OH

R

There are 20 standard amino acids

AlanineArginine

AsparagineAspartateCysteine

GlutamateGlutamine

GlycineHistidine

IsoleucineLeucineLysine

MethioninePhenylalanine

ProlineSerine

ThreonineTryptophan

TyrosineValine

23

Page 24: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

C

O

N

H

C

H

R

to previous aa to next aa

N-terminus

(start)

H OH

C-terminus

(end)

N-terminus, C-terminus

from 5’ 3’ mRNA24

Proteins

Page 25: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Translation

The ribosome (a complex of protein and RNA) synthesizes a protein by reading the mRNA in triplets (codons). Each codon is translated to an amino acid.

ribosome, codon

mRNA

P site A site

25

Page 26: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Translation

GGGG GlyGAG GluGCG AlaGUG Val

A GGA GlyGAA Glutamic acid (Glu)

GCA AlaGUA Val

C GGC GlyGAC AspGCC AlaGUC Val

U GGU Glycine (Gly)GAU Aspartic acid (Asp)

GCU Alanine (Ala)GUU Valine (Val)

G

GAGG Arg AAG LysACG ThrAUG Methionine (Met) or START

A AGA Arginine (Arg)AAA Lysine (Lys)ACA Thr AUA Ile

C AGC Ser AAC AsnACC ThrAUC Ile

U AGU Serine (Ser)AAU Asparagine (Asn)ACU Threonine (Thr)

AUU Isoleucine (Ile)

A

GCGG Arg CAG GlnCCG ProCUG Leu

A CGA Arg CAA Glutamine (Gln)CCA ProCUA Leu

C CGC Arg CAC HisCCC ProCUC Leu

U CGU Arginine (Arg)CAU Histidine (His)CCU Proline (Pro)CUU Leucine (Leu)

C

GUGG Tryptophan (Trp)

UAG STOPUCG Ser UUG Leu

A UGA STOPUAA STOPUCA Ser UUA Leucine (Leu)

C UGC CysUAC TyrUCC SerUUC Phe

U UGU Cysteine (Cys)UAU Tyrosine (Tyr)UCU Serine (Ser)UUU Phenylalanine (Phe)

U

GACU

26

Page 27: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Translation

5’ . . . A U U A U G G C C U G G A C U U G A . . . 3’

UTR Met

Start Codon

Ala Trp ThrStop

Codon27

Page 28: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Translation

Met Ala

5’ . . . A U U A U G G C C U G G A C U U G A . . . 3’

Trp

28

Transcription and Translation [3D Simulation]

Page 29: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Errors?

• What if the transcription / translation machinery makes mistakes?

• What is the effect of mutations in coding regions?

mutation

29

Page 30: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Reading Framesreading frame

G C U U G U U U A C G A A U U A G

G C U U G U U U A C G A A U U A G

G C U U G U U U A C G A A U U A G

G C U U G U U U A C G A A U U A G

30

Page 31: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Synonymous Mutation

G C U U G U U U A C G A A U U A G

Ala Cys Leu Arg Ile

G C U U G U U U A C G A A U U A G

synonymous (silent) mutation, fourfold siteG

G C U U G U U U G C G A A U U A G

Ala Cys Leu Arg Ile

31

Page 32: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Missense Mutation

G C U U G U U U A C G A A U U A G

Ala Cys Leu Arg Ile

G C U U G U U U A C G A A U U A G

missense mutationG

G C U U G G U U A C G A A U U A G

Ala Trp Leu Arg Ile

32

Page 33: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Nonsense Mutation

G C U U G U U U A C G A A U U A G

Ala Cys Leu Arg Ile

G C U U G U U U A C G A A U U A G

nonsense mutationA

G C U U G A U U A C G A A U U A G

Ala STOP

33

Page 34: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Frameshift

G C U U G U U U A C G A A U U A G

Ala Cys Leu Arg Ile

G C U U G U U U A C G A A U U A G

frameshift

G C U U G U U A C G A A U U A G

Ala Cys Tyr Glu Leu

34

Page 35: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Expression Regulation

• When should each gene be expressed?• Regulate gene expression

Examples:

– Make more of gene A when substance X is present– Stop making gene B once you have enough– Make genes C1, C2, C3 simultaneously

• Why? Every cell has same DNA but each cell expresses different proteins.

• Signal transduction: One signal converted to another– Cascade has “master regulators” turning on many

proteins, which in turn each turn on many proteins, ...

Regulation, signal transduction

35

Page 36: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Gene Regulation

Gene expression is controlled at many levels: DNA chromatin structure Transcription Post-transcriptional modification RNA transport Translation mRNA degradation Post-translational modification

36

Page 37: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Transcription Regulation

• Much gene regulation occurs at the level of transcription.

• Primary players:– Binding sites (BS) in cis-regulatory

modules (CRMs)

– Transcription factor (TF) proteins

– RNA polymerase II

• Primary mechanism: – TFs link to BSs

– Complex of TFs forms

– Complex assists or inhibits formation of the RNA polymerase II machinery

37

Page 38: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Tx Factor Binding Sites

• Short, degenerate DNA sequences recognized by particular TFs

• For complex organisms, cooperative binding of multiple TFs required to initiate transcription

Binding Sequence Logo

38

Page 39: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Transcription Regulation Mechanisms

enhancer, silencer, insulator

Transcription Factor Specificity:

Enhancer:

Silencer:

39

Insulator:

Page 40: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

40

unicellular

multicellular

Unicellular vs. Multicellular

Page 41: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Non-coding RNAs

• RNAs transcribed from DNA but not translated into protein

• Structural ncRNAs: Conserved secondary structure (A-U, C-G, G-U)

• Involved in gene regulation

41

Page 42: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

Summary

• All hereditary information encoded in double-stranded DNA

• Each cell in an organism has same DNA

• DNA RNA protein• Proteins have many diverse roles in

cell• Gene regulation diversifies protein

products within different cells42

Page 43: A Zero-Knowledge Based Introduction to Biologyweb.stanford.edu/class/cs273a/presentations.aut09/CS273a... · 2018-01-09 · Gene Expression Regulation • When should each gene be

The end?

43