Designs in DNA

Richard Deem, Paradoxes Class, March 16, 2014

Page 1

[Figure: title-slide cell diagram labeled Nucleus, DNA, mRNA, Protein, Transcription, Translation, and ER, with the RNA bases U, C, A, G]

Page 2

The problem with biology…

…is you need to know things

Page 3

Central Dogma of Biology

[Figure: cell diagram labeled Nucleus, DNA, mRNA, Protein, Transcription, Translation, ER, Chloroplast, and Mitochondrion]

Page 4

DNA Bases

Purines: Adenine (A), Guanine (G)

Pyrimidines: Thymine (T), Cytosine (C)

[Figure: chemical structures of the four bases]

Page 5

Nucleotide

Structure of Deoxyadenosine

[Figure: chemical structure labeled Adenine (base), Glycosidic Bond, Sugar (Deoxyribose), the 5’ and 3’ positions, the phosphate group, Nucleoside, and Nucleotide]

Page 6

DNA Structure

[Figure: chemical structure of two paired DNA strands, one running 5’ to 3’ and the other 3’ to 5’, with the bases Adenine, Cytosine, Guanine, and Thymine on each strand and a Hydrogen Bond labeled between paired bases]

Page 7

DNA Double Helix

[Figure: double helix with paired bases A, T, G, C]

Page 8

DNA Structure

[Figure: DNA packaging diagram labeled DNA, Nucleosome, Chromosome, Histone H1, and 4 Histone protein pairs]

Page 9

Human Chromosomes

[Figure: electron micrograph and karyotype of human chromosomes, labeled Telomere and Centromere]

Page 10

DNA Structure: Chromatin

[Figure: nucleus showing Heterochromatin (condensed DNA) and Euchromatin (actively transcribed DNA)]

Page 11

Bases Found in DNA vs. RNA

DNA         RNA
Adenine     Adenine
Cytosine    Cytosine
Guanine     Guanine
Thymine     Uracil

[Figure: chemical structures of Thymine (on Deoxyribose) and Uracil (on Ribose); Thymine carries a CH3 group where Uracil has H]
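The thymine-to-uracil substitution between DNA and RNA can be sketched in Python. This is a minimal illustration, assuming the input is the coding (sense) strand written 5’ to 3’; the function name `transcribe` is hypothetical and not from the slides.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA spelling of a DNA coding strand (T becomes U)."""
    dna = coding_strand.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("expected only the DNA bases A, C, G, T")
    return dna.replace("T", "U")


# ATG is the DNA spelling of the methionine codon AUG.
print(transcribe("ATGGCC"))  # AUGGCC
```

Adenine, cytosine, and guanine pass through unchanged, matching the three shared rows of the table.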

Page 12

Transfer RNA (tRNA)

[Figure: Transfer RNA carrying Methionine, with its Anti-codon paired to a Codon on the Messenger RNA (mRNA)]

Page 13

Electron Micrograph of Translation Process

[Figure: electron micrograph labeled Ribosomes, mRNA, and Protein chains]

Page 14

The Genetic Code

Codon AA    Codon AA    Codon AA    Codon AA
UUU   Phe   UCU   Ser   UAU   Tyr   UGU   Cys
UUC   Phe   UCC   Ser   UAC   Tyr   UGC   Cys
UUA   Leu   UCA   Ser   UAA   Stop  UGA   Stop
UUG   Leu   UCG   Ser   UAG   Stop  UGG   Trp
CUU   Leu   CCU   Pro   CAU   His   CGU   Arg
CUC   Leu   CCC   Pro   CAC   His   CGC   Arg
CUA   Leu   CCA   Pro   CAA   Gln   CGA   Arg
CUG   Leu   CCG   Pro   CAG   Gln   CGG   Arg
AUU   Ile   ACU   Thr   AAU   Asn   AGU   Ser
AUC   Ile   ACC   Thr   AAC   Asn   AGC   Ser
AUA   Ile   ACA   Thr   AAA   Lys   AGA   Arg
AUG   Met   ACG   Thr   AAG   Lys   AGG   Arg
GUU   Val   GCU   Ala   GAU   Asp   GGU   Gly
GUC   Val   GCC   Ala   GAC   Asp   GGC   Gly
GUA   Val   GCA   Ala   GAA   Glu   GGA   Gly
GUG   Val   GCG   Ala   GAG   Glu   GGG   Gly

Page 15

DNA as a Language

• Four “letters” (bases A, T, G, C in DNA; U replaces T in RNA)
• 64 three-letter “words” (codons)
• “Redundant”: many “words” have the identical “meaning”
• 20 unique “meanings” (amino acids)
• Unlimited “sentences” (proteins)
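The counts in this list can be checked with a short Python sketch. It assumes the standard genetic code written in RNA letters; the names `BASES`, `AMINO_ACIDS`, and `CODON_TABLE` are illustrative, and the one-letter amino acid string uses `*` for stop codons.

```python
from itertools import product

BASES = "UCAG"  # four "letters", in the usual codon-table order
# One-letter amino acid codes for the standard code, '*' = stop,
# listed with the first base varying slowest and the third fastest.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

words = len(CODON_TABLE)                                  # three-letter "words"
meanings = {aa for aa in CODON_TABLE.values() if aa != "*"}
stops = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
leucine = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")

print(words)          # 64
print(len(meanings))  # 20
print(stops)          # ['UAA', 'UAG', 'UGA']
print(leucine)        # ['CUA', 'CUC', 'CUG', 'CUU', 'UUA', 'UUG']
```

Leucine illustrates the redundancy: six different codons share one meaning, while methionine (AUG) and tryptophan (UGG) each have a single codon.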

Page 16

Transcription/Translation

[Figure: cell diagram labeled Nucleus, DNA, mRNA, Protein, Transcription, Translation, and ER]

Page 17

DNA Design: Alternative Splicing of RNA

Multiple proteins from one gene

Page 18

Genes to Proteins: DNA → mRNA

[Figure: gene diagram drawn 5’ to 3’. DNA: Exons, with introns (between exons). Pre-mRNA: Transcribed region. mRNA: UTR, Translated region, UTR. Protein.]

Page 19

Alternative Splicing of RNA

Pre-mRNA: Exon1, Int1, Exon2, Int2, Exon3, Int3, Exon4, Int4, Exon5
mRNA: Exon1, Exon2, Exon3, Exon4, Exon5 (Protein isoform A)
mRNA: Exon1, Exon2, Exon4, Exon5 (Protein isoform B)
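The splicing diagram can be sketched in Python as list filtering. This is a minimal illustration, assuming isoform A keeps all five exons and isoform B skips Exon3; the `splice` helper and the segment names are hypothetical stand-ins for real sequences.

```python
PRE_MRNA = ["Exon1", "Int1", "Exon2", "Int2", "Exon3",
            "Int3", "Exon4", "Int4", "Exon5"]


def splice(pre_mrna, skipped_exons=()):
    """Drop every intron, plus any exons named in skipped_exons."""
    return [
        segment for segment in pre_mrna
        if segment.startswith("Exon") and segment not in skipped_exons
    ]


isoform_a = splice(PRE_MRNA)                           # all five exons kept
isoform_b = splice(PRE_MRNA, skipped_exons={"Exon3"})  # cassette exon skipped

print(isoform_a)  # ['Exon1', 'Exon2', 'Exon3', 'Exon4', 'Exon5']
print(isoform_b)  # ['Exon1', 'Exon2', 'Exon4', 'Exon5']
```

One pre-mRNA yields two different mRNAs, which is the sense in which one gene gives multiple proteins.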

Page 20

Types of Alternative Splicing

AS Pattern Type                 Acronym
Cassette exon (skipped exon)    CE
Intron retention                IR
Mutually exclusive exons        MXE
Alternative 3' sites            A3SS
Alternative 5' sites            A5SS
Alternative first exon          AFE
Alternative last exon           ALE

Page 21

DNA Design: Duons

Overlapping regulatory and protein codes

Page 22

Promoter region

[Figure: promoter region of the DNA upstream of the Exons, with Transcription Factors labeled NFAT, Y1, Y2, AP-1, AP-2, and NFkB bound at marked upstream positions]

Page 23

Genome-Wide Transcription Factor Binding Sites

• Used enzyme DNase I
• Digested DNA from 81 different cell lines
• Sequenced and mapped the location of all TF binding sites

[Figure: DNase I cleavage per nucleotide (PLBD2 gene), with footprints labeled NRSF, USF, SP1, and SP1]

Page 24

Duon Sequences

• 86% of genes expressed at least one duon sequence

• Duons comprise 14% of all exonic coding

• Over 12 million base pairs

Andrew B. Stergachis et al. 2013. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution. Science 342, 1367.

Page 25

Example of Duon in DNA

Protein Sequence:       Leu Gln Ala Ile Thr Arg Gly Arg Ser Thr
DNA Sequence:           CTG CAG GCC ATC ACC AGG GGG CGC AGC ACC
CTCF Binding Sequence:  CCACCAGGGGGCGCA

CELSR2 Gene: Chr1:109806358-109806387

Andrew B. Stergachis et al. 2013. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution. Science 342, 1367.
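Where the CTCF binding sequence sits inside the coding sequence can be sketched in Python by sliding it along the DNA and counting mismatches. This is only an illustration under a simplifying assumption: plain mismatch counting stands in for real binding-site detection, and the `best_match` helper is hypothetical. The two sequences are the ones shown on the slide.

```python
CODING_DNA = "CTGCAGGCCATCACCAGGGGGCGCAGCACC"  # CELSR2 fragment from the slide
CTCF_SITE = "CCACCAGGGGGCGCA"                   # CTCF binding sequence from the slide


def best_match(sequence: str, motif: str):
    """Return (offset, mismatches) for the window of sequence closest to motif."""
    scores = []
    for offset in range(len(sequence) - len(motif) + 1):
        window = sequence[offset:offset + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        scores.append((mismatches, offset))
    mismatches, offset = min(scores)  # fewest mismatches, then leftmost
    return offset, mismatches


offset, mismatches = best_match(CODING_DNA, CTCF_SITE)
first_codon = offset // 3 + 1                        # 1-based codon numbers
last_codon = (offset + len(CTCF_SITE) - 1) // 3 + 1

print(offset, mismatches)       # 10 1
print(first_codon, last_codon)  # 4 9
```

The closest window differs from the binding sequence at one base and spans codons 4 through 9, so the same bases are read both as codons and as a transcription factor site.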

Page 26

Duons are Functional

[Figure: DNase I Footprint Density. Bar chart of DNase I footprints/kb (axis 0 to 4) for First exon, Internal exons, Final exon, and Non-coding bases; p < 10^-15]

Andrew B. Stergachis et al. 2013. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution. Science 342, 1367.

Page 27

DNA Design: Dual Coding Genes

Multiple proteins from alternative reading frames

Page 28

Reading Frames

Frame +1:  Leu Gln Ala Ile Thr Arg Gly Arg Ser Thr
Frame +2:  Cys Arg Pro Ser Pro Gly Gly Ala Ala
Frame +3:  Ala Gly His His Gln Gly Ala Gln His
5’ CTGCAGGCCATCACCAGGGGGCGCAGCACC 3’
3’ GACGTCCGGTAGTGGTCCCCCGCGTCGTGG 5’
Frame -1:  Gln Leu Gly Asp Gly Pro Pro Ala Ala Gly
Frame -2:  Cys Ala Met Val Leu Pro Arg Leu Val
Frame -3:  Ala Pro Trp Stop Trp Pro Ala Cys Cys

(Reverse-strand frames are written left to right under the 3’ to 5’ strand, so each reads from its last amino acid to its first.)
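The six reading frames of this sequence can be reproduced with a short Python sketch. It assumes the standard genetic code and prints one-letter amino acid codes, with `*` for a stop codon; the helper names (`reverse_complement`, `translate`, `six_frames`) are illustrative and not from the slides.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in one-letter codes, '*' = stop, in TCAG order.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str, frame: int = 0) -> str:
    """Translate complete codons starting at offset `frame` (0, 1, or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def six_frames(dna: str) -> dict:
    """Translate three frames on each strand, each read 5' to 3'."""
    rc = reverse_complement(dna)
    frames = {f"+{f + 1}": translate(dna, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(rc, f) for f in range(3)})
    return frames


SEQ = "CTGCAGGCCATCACCAGGGGGCGCAGCACC"
for name, protein in six_frames(SEQ).items():
    print(name, protein)
```

With the standard code, frame +1 reads LQAITRGRST (Leu Gln Ala Ile Thr Arg Gly Arg Ser Thr), and one reverse frame, CCAPW*WPA, contains a stop codon. Reverse frames are printed 5’ to 3’, from first amino acid to last.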

Page 29

Dual-Coding Genes

Coding of multiple proteins by overlapping reading frames is not a feature one would associate with eukaryotic genes. Indeed, codependency between codons of overlapping protein-coding regions imposes a unique set of evolutionary constraints, making it a costly arrangement. Yet in cases of tightly coexpressed interacting proteins, dual coding may be advantageous. Here we show that although dual coding is nearly impossible by chance, a number of human transcripts contain overlapping coding regions.

Wen-Yu Chung, et al. A First Look at ARFome: Dual-Coding Genes in Mammalian Genomes. PLoS Computational Biology 3 (5) e91.

Page 30

Finding Dual Coding Genes

• Evolutionary assumptions underestimate true numbers of dual coding genes
• 9% of human and 7% of mouse
• Less than 30% shared between mouse and human
• 90% of genes on opposite strands
• 1259 human alternative proteins detected by mass spectrometry

Chaitanya R Sanna, et al. Overlapping genes in the human and mouse genomes. BMC Genomics 2008, 9:169.

Benoît Vanderperre, et al. 2013. Direct Detection of Alternative Open Reading Frames Translation… PLoS ONE 8(8): e70698.

Page 31

“Sentences” from Two Directions

• A man, a plan, a canal: Panama
• Live not on evil
• Was it a car or a cat I saw?

Page 32

Dual Coding Gene: EIF6 (ITGB4BP)

[Figure: EIF6 sequence translated in two overlapping reading frames. Frame 1: 108 aa. Frame 2: 226 aa.]

DNA sequence shown:
GACAGAAGAAATTCTGGCAGATGTGCTCAAGGTGGAAGTCTTCAGACAGACAGTGGC GAC CCAGGTGCTAGTAGGAAGCTACTGTGTCTTCAGCAATCA

Amino acid labels shown for the two frames:
Leu Gln Gln Thr Asp Gly Ser Arg Ser Ser Gly Ala Leu Gln Gly Ser Gln Gly Asp Asn Pro Gly Arg Arg Ala Cys Arg Leu Ser Lys Leu Cys Arg
Phe Asn Gln Gln Thr Val Val Asp Leu Val Ala Leu Tyr Ser Val Val Arg Ala Thr Ile Gln Glu Gly Asp Leu Val Glu Phe Gln Ser Cys Val Glu

Han Liang and Laura F. Landweber. 2006. A genome-wide study of dual coding regions in human alternatively spliced genes. Genome Research 16:190–196.

Page 33

Dual Coding Gene: Ncaph2

[Figure: Alternative Transcripts of Ncaph2, joining Exon 1 to the Long, Intermediate, or Short form of Exon 2, with an Alternative Reading Frame marked]

Amino acid labels shown:
Pro Glu His Arg Asp Trp Gln Arg Glu Leu Thr Glu Ala Gly Leu Val Ile Val Met Val Asn Leu Asp Lys Ala Gln Glu Leu Glu Val Ala Phe Asp
Gly Cys Leu Arg Gly Ser Leu Ala Thr Gly Arg Arg Ser Trp Ile Leu Met Cys Thr Pro Thr Arg Ser Ser Trp Trp Trp His Thr Arg
Glu Met Val Glu Asp
Leu Ile Asp Gln (shown three times, once per transcript)

Angelo Theodoratos, et al. Splice variants of the condensin II gene Ncaph2 include alternative reading frame... FEBS Journal 279 (2012) 1422–1432.

Page 34

Protein Products of Ncaph2

[Figure: gel with lanes Ladder (50 bp and 200 bp marked), Thymus, Muscle, Brain, Bone Marrow, Kidney, Testis, Heart, Spleen, Liver, and Lung. Bands: Long 232 bp, Int 215 bp, Short 140 bp.]

Page 35

Conclusions

• At least three independent examples of design in DNA
  – Alternative splicing producing multiple proteins from one gene
  – Duons: overlapping sequences of coding and transcription factor binding
  – Dual coding genes