Lecture 15: HW2 Feedback, Ultraconservation. Ultraconserved Elements in the Human Genome: The Hip & The Hype.
http://cs273a.stanford.edu [Bejerano Fall10/11] 1
Lecture 15:
HW2 Feedback
Ultraconservation
GGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAGGCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCGAAAGACCTGTTGGAGGCTATGAATGCAATCAAGGTGACAGACAACTGGTGCAATGATGGTAGTGGAAATGGAGGAGAGGGGATTGATTCAAGATGCATTTAGGACCAAGAATCGGGAGCTTGTGAACGTGTGTATGAGTACTGTAGACGGAGTGGGTGTGTCATCAGAGAAGATCTGAGCATTTGGGCTTGCTCTCCTCAGAGGCCCTGCGAGTGGAGTTCAGCTTTTCCTCATGGGGCAAATCTCACTTTCGCTCCAGTTCCTGGGGCTCAGAGTCCCTGGCCCAGATGCCTCTTGCCATCTCATCTTCACCCTGCCTGGCTTCCCTTGCTTGTTCCAGGATTGTTTCATAAAGAGGGATGTGGTTGGTCTTTAACCCTATGAATGCTGGCTGAGGATGCCTGCGGAACCTGTAGTGAAGCTTTCAGGGGCTGCTCGGGTTCTGGCTGGTAGGTGAACACTGTCCATCTTGCCGGCTGGGACACAGTGACTCTGGGTAGTTGTGTAAGAGAGGGGCCCTTGGCAGACAAACAGGTTCTTCTCTGTTGGTGGGCCAGCCAGCAGGTCAGTGGGAAGGTTAAAGGTCATGGGGTTTGGGAGAACTGGGTGAGGAGTTCAGCCCCATCCCCCGTAAAGCTCCTGGGAAGCACTTCTCTACTGGGGCAGCCCCTGATACCAGGGCACTCATTAACCCTCTGGGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAGGCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGGAAAGACCTGTTGGAGGCTATGAATGCAATCAAGGTGACAGACAACTGGTGCAATGATGGTAGTGGAAATGGAGGAGAGGGGATTGATTCAAGATGCATTTAGGACCAAGAATCGGGAGCTTGTGAACGTGTGTATGAGTACTGTAGACGGAGTGGGTGTGTCATCAGAGAAGATCTGAGCATTTGGGCTTGCTCTCCTCAGAGGCCCTGCGAGTGGAGTTCAGCTTTTCCTCATGGGGCAAATCTCACTTTCGCTCCAGTTCCTGGGGCTCAGAGTCCCTGGCCCAGATGCCTCTTGCCATCTCATCTTCACCCTGCCTGGCTTCCCTTGCTTGTTCCAGGATTGTTTCATAAAGAGGGATGTGGTTGGTCTTTAACCCTATGAATGCTGGCTGAGGATGCCTGCGGAACCTGTAGTGAAGCTTTCAGGGGCTGCTCGGGTTCTGGCTGGTAGGTGAACACTGTCCATCTTGCCGGCTGGGACACAGTGACTCTGGGTAGTTGTGTAAGAGAGGGGCCCTTGGCAGACAAACAGGTTCTTCTCTGTTGGTGGGCCAGCCAGCAGGTCAGTGGGAAGGTTAAAGGTCATGGGGTTTGGGAGAAACTGGGTGAGGAGTTCAGCCCCATCCCCCGTAAAGCTCCTGGGAAGCACTTCTCTACTGGGGCAGCCCCTGATACCAGGGCACTCATTAACCCTCTGGGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAG
GCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCGAAAGACCTGTTGGAGGCTATGAATGCAATCAAGGTGACAGACAACTGGTGCAATGATGGTAGTGGAAATGGAGGAGAGGGGATTGATTCAAGATGCATTTAGGACCAAGAATCGGGAGCTTGTGAACGTGTGTATGAGTACTGTAGACGGAGTGGGTGTGTCATCAGAGAAGATCTGAGCATTTGGGCTTGCTCTCCTCAGAGGCCCTGCGAGTGGAGTTCAGCTTTTCCTCATGGGGCAAATCTCACTTTCGCTCCAGTTCCTGGGGCTCAGAGTCCCTGGCCCAGATGCCTCTTGCCATCTCATCTTCACCCTGCCTGGCTTCCCTTGCTTGTTCCAGGATTGTTTCATAAAGAGGGATGTGGTTGGTCTTTAACCCTATGAATGCTGGCTGAGGATGCCTGCGGAACCTGTAGTGAAGCTTTCAGGGGCTGCTCGGGTTCTGGCTGGTAGGTGAACACTGTCCATCTTGCCGGCTGGGACACAGTGACTCTGGGTAGTTGTGTAAGAGAGGGGCCCTTGGCAGACAAACAGGTTCTTCTCTGTTGGTGGGCCAGCCAGCAGGTCAGTGGGAAGGTTAAAGGTCATGGGGTTTGGGAGAACTGGGTGAGGAGTTCAGCCCCATCCCCCGTAAAGCTCCTGGGAAGCACTTCTCTACTGGGGCAGCCCCTGATACCAGGGCACTCATTAACCCTCTGGGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAGGCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCTGTCTGCTCTCTGTGCCCAG
Ultraconserved Elements in the Human Genome: The Hip & The Hype
Dept. of Developmental Biology, Dept. of Computer Science
Stanford University
Gill Bejerano
Sequence Conservation implies Function
(but which function/s?...)
human
mouse
mammalian ancestor
...CTTTGCGA-TGAGTAGCATCTACTATTT...
...ACGTGGGACTGACTA-CATCGACTACGA...
functional region!
Comparative Genomics of related species highlights:
Human Genome: 3×10^9 letters
Human Genome full of Conserved Non-Coding Elements
1.5% known function
>50% junk
3x more functional DNA than known!
compare to other species
>5% human genome functional
~10^6 genomic loci do not code for protein
What do they do then?
Conserved elements in the Human Genome
[Figure: all human-mouse alignments vs. human-mouse ancestral repeats alignment (85% id on average)]
Difference: 5% of Human Genome (selection)
[Mouse consortium, Nature 2002]
Ultraconservation
Typical DNA Conservation levels
Conserved elements between human and mouse are on average 85% identical. [mouse consortium, 2002]
(dot = base identical to human)
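A minimal sketch of how a figure such as "85% identical" can be computed from a pairwise alignment. The function name `percent_identity` and the gap handling (a column with a gap in one sequence counts as a mismatch, a column with gaps in both is skipped) are assumptions made for illustration, not the mouse consortium's exact definition.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns where both sequences carry the same base.

    Assumption: both strings are rows of the same alignment, so they have
    equal length and use '-' for gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both rows; they carry no information.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    # A column matches only if both rows show the same non-gap base.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy example: 4 identical columns out of 6.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

In the dot notation of the slide, every column counted as a match here would be drawn as a dot.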
Ultraconserved Elements
[Bejerano et al., Science 2004]
fish
481 elements perfectly conserved (100% id) over 200 bp or more between human, mouse and rat.
using 2 vs. 3 species
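The definition above (100% identity over 200 bp or more in human, mouse and rat) can be sketched as a scan for long runs of identical columns in a multiple alignment. This is an illustrative sketch only: the function name `ultraconserved_runs`, the input format (equal-length, upper-case aligned strings with '-' for gaps) and the treatment of gaps as breaking a run are assumptions, not the pipeline used in the cited paper.

```python
def ultraconserved_runs(aligned, min_len=200):
    """Return (start, end) column ranges, end exclusive, where every
    sequence in the alignment carries the same non-gap base for at
    least min_len consecutive columns."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None  # first column of the current identical run, if any
    # Iterate one column past the end so a trailing run gets closed.
    for i in range(length + 1):
        identical = (
            i < length
            and aligned[0][i] != "-"
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if identical and start is None:
            start = i
        elif not identical and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Toy example with a small threshold instead of 200.
human = "ACGTACGTAA"
mouse = "ACGTACGTCA"
rat = "ACGTACGAAA"
print(ultraconserved_runs([human, mouse, rat], min_len=5))  # [(0, 7)]
```

Adding more species to `aligned` can only shorten or split the runs, which is one way to read the "2 vs. 3 species" comparison.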
Contamination
What exactly is an Ultraconserved Element?
Aha!!
using3 vs. 43species
Ultraconservation as a Phenomenon
Few species More and more species
Hmmm….
Ultraconserved Elements: Why?
Hundreds of long genomic regions are identical between amniotes; they must have rejected many different changes.
But... all functions we understand in our genome are encoded using redundant codes.
[Figure: CDS, ncRNA and TFBS sequence examples]
Why did I look at the tail?
DNA Replication is Imperfect
It’s imperfect on all scales: small, medium and large.
In particular it begets novel functional entities:
...ACGTACGACTGACTAGCATCGACTACGA...
...ACGTACGACTGACTAGCATCGACTACGA........TCTGACTAGCATCGACTACGA...
functional | junk
functional | functional
functional’’ | functional’
regional duplication
functional divergence
Protein & RNA gene families come to life this way. What else does?
Computational Approach I
Group them into paralog families of human functional regions of common origins:
• Annotated members induce function on all.
• Examine core, substitutions in family.
• Test for “guilt by association”.
[Bejerano et al., ISMB 2004]
.....ACGTGCATGACTGACTAGCATCAGACGACTAC..GATAATACGCTACGACTAGCTAC.....human DNA
...TGACTAGCATCGACTAC..GATAATACGAC... ...CATCGACTAC..GATAATACGACGGTTGGT...AC T
~400bp
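One simple way to turn pairwise similarity hits into families is single-linkage clustering with a union-find structure, sketched below. This is an illustration under stated assumptions: the slide does not say how families were built, and the function name `group_into_families`, the element identifiers and the precomputed `similar_pairs` input are all hypothetical.

```python
def group_into_families(element_ids, similar_pairs):
    """Single-linkage grouping: two elements end up in the same family
    if a chain of pairwise similarity hits connects them.

    element_ids   -- iterable of hashable element names (hypothetical)
    similar_pairs -- iterable of (id_a, id_b) hits from some alignment
                     step that is assumed to have been run already
    """
    element_ids = list(element_ids)
    parent = {e: e for e in element_ids}

    def find(x):
        # Follow parent links to the family representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for e in element_ids:
        families.setdefault(find(e), []).append(e)
    # Largest families first; singletons are elements that appear unique.
    return sorted(families.values(), key=len, reverse=True)


ids = ["e1", "e2", "e3", "e4", "e5"]
hits = [("e1", "e2"), ("e2", "e3")]
print(group_into_families(ids, hits))
```

With families in hand, an annotation on one member (for example a known ncRNA) can be proposed for the others, which is the "guilt by association" step in the list above.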
Functional Annotation by Families
[Bejerano et al., ISMB 2004]
After removing all annotated regions, and more, from the top 5% of the Human Genome:
700,000 elements, covering 3.5% of the Human Genome
Puzzling News: 96% of the 700,000 appear unique(!)
Good News: We still find 12,027 families
novel putative ncRNAs, cis-regulatory elements, etc.
human
mouse
rat
related genes / related elements (75% id over 200 bp)
same element: 96% id over 200 bp
same element: 95% id over 200 bp
Computational Approach II
Classical Biological approach: experiment to understand these regions
Computational approach: how many regions like this or “better” are there?
Out popped the Ultraconserved Elements
Puzzling News: 96% of the 700,000 conserved non-coding elements appear unique(!)
Same with Ultras
What could ultras be doing?
• exonic
• non
• possibly
Associating distal peaks in a gene-based context is statistically inappropriate
Gene transcription start site
Ultraconserved Element
Ontology term (e.g. ‘development’)
N = 8 genes in genome
K = 3 genes annotated with the ontology term
n = 3 genes selected by proximal peaks
k = 2 selected genes annotated with the ontology term
P = Pr(k ≥ 2 | n = 3, K = 3, N = 8)
1. Set gene regulatory domain.
2. Associate Ultras with genes.
3. Per ontology term, count annotated genes selected.
4. Rank terms by enrichment hypergeometric p-value.
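Steps 3 and 4 can be sketched with the upper tail of the hypergeometric distribution. The code below is a minimal illustration, not the GREAT implementation: the function names `hypergeom_tail` and `rank_terms`, the gene names and the toy ontology are hypothetical. The toy numbers reuse N = 8, K = 3, n = 3 and k = 2 from the slide.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """Pr(X >= k) when n genes are drawn without replacement from a
    genome of N genes, K of which carry the annotation."""
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)


def rank_terms(term_to_genes, selected_genes, genome_size):
    """Step 3: per term, count annotated genes among the selected ones.
    Step 4: rank terms by hypergeometric p-value, smallest first."""
    n = len(selected_genes)
    results = []
    for term, genes in term_to_genes.items():
        k = len(genes & selected_genes)
        p = hypergeom_tail(k, n, len(genes), genome_size)
        results.append((p, term, k))
    return sorted(results)


# Toy data (hypothetical gene names), 8 genes in the genome.
ontology = {
    "development": {"g1", "g2", "g3"},
    "metabolism": {"g4", "g5"},
}
selected = {"g1", "g2", "g6"}
for p, term, k in rank_terms(ontology, selected, genome_size=8):
    print(f"{term}: k={k}, p={p:.4f}")
```

For "development" this evaluates Pr(k ≥ 2 | n = 3, K = 3, N = 8) = 16/56. A gene-based test like this treats every selected gene equally, which is the property the slide title flags as a problem once distal elements are assigned to genes.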
Evolved into
http://great.stanford.edu/
Enrichment Association of Ultraconserved Elements
Exonic Ultras
Non-exonic Ultras
Ultras are Functional
Back in 2004 we hypothesized:
481 ultraconserved elements
exonic subset –
post-transcriptional regulation
[Ni et al., Genes Dev.; Lareau et al., Nature, 2007]
“nonexonic” subset –
transcriptional regulators
[Pennacchio et al., Nature, 2006]
Ultraconserved Non-coding RNA
[Calin et al., Cancer Cell, 2007]
miRNA complementarity
About 1/3 of all ultras are expressed.
Some are predicted to provide microRNA targets.
A few are anti-correlated with miRNA expression levels.
A few even act as oncogenes.
Ultras are Under Strong Human Selection
[Figure: Ultra DAF vs. NonSyn DAF (DAF = derived allele frequency)]
[Katzman et al., Science, 2007]
Mutational cold spots? NO. Rare (new) mutations are introduced to the population.
Fierce purifying selection? YES. Very few of these get anywhere near fixation.
chimp: A
humans: G, A, A, A
Touch an Ultra And You - DIY
[Ahituv et al., PLoS Biology, 2007]
What can’t we measure in the lab?
Pr(fixation | N_e, s)
N_e is population size, s is selective dis/advantage. Both of which are VERY wrong in the lab.
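To get a feel for how strongly Pr(fixation | N_e, s) depends on both quantities, the sketch below evaluates Kimura's textbook diffusion approximation for a new mutation in a diploid population. This specific formula is an assumption brought in for illustration; the slide may use a different parametrization. The function name `fixation_probability` and the default starting frequency 1/(2 N_e) are likewise illustrative choices.

```python
import math


def fixation_probability(n_e, s, p0=None):
    """Kimura's diffusion approximation (assumed form, see lead-in):

        P = (1 - exp(-4 * N_e * s * p0)) / (1 - exp(-4 * N_e * s))

    n_e -- effective population size
    s   -- selection coefficient (negative = deleterious)
    p0  -- starting allele frequency; defaults to a single new copy
           in a diploid population, 1 / (2 * N_e)
    """
    if p0 is None:
        p0 = 1.0 / (2.0 * n_e)
    if s == 0:
        # Neutral limit: fixation probability equals starting frequency.
        return p0
    # expm1(x) = exp(x) - 1 keeps precision for small |x|; the two
    # minus signs cancel in the ratio.
    return math.expm1(-4.0 * n_e * s * p0) / math.expm1(-4.0 * n_e * s)


for s in (0.01, 0.0, -0.01):
    print(s, fixation_probability(1000, s))
```

Under this approximation a mildly deleterious change that is invisible in a small lab colony still has a vanishingly small chance of fixing in a large natural population, which is the point the slide makes about N_e and s.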
So it can happen – but does it FIX?
[Figure: DNA element along a time axis t, mouse]
Count Fraction Lost, Binned by %id
[Figure: species tree over time t (human, macaque, dog, mouse, rat). A 100bp sliding window yields count_all and count_hole, binned by %id.]
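The counting scheme named on this slide can be sketched as follows. The interpretation is an assumption read off the figure labels: each 100bp window contributes its %id to `count_all`, and also to `count_hole` if the window is missing (deleted) in the lineage being tested. The function name `fraction_lost_by_identity`, the input format and the 5-point bin width are hypothetical.

```python
from collections import defaultdict


def fraction_lost_by_identity(windows, bin_width=5):
    """windows: iterable of (pct_id, is_hole) pairs, one per window.

    pct_id  -- percent identity of the window among the species that
               retain it (0 to 100)
    is_hole -- True if the window is deleted in the tested lineage

    Returns {bin_start: count_hole / count_all} for every non-empty bin.
    """
    count_all = defaultdict(int)
    count_hole = defaultdict(int)
    for pct_id, is_hole in windows:
        # Floor to the start of the bin; 100% id gets its own bin.
        bin_start = int(pct_id // bin_width) * bin_width
        count_all[bin_start] += 1
        if is_hole:
            count_hole[bin_start] += 1
    return {b: count_hole[b] / count_all[b] for b in sorted(count_all)}


# Toy windows, not real data.
toy = [(100.0, False), (100.0, False), (97.0, True), (96.0, False), (72.0, True)]
print(fraction_lost_by_identity(toy))  # {70: 1.0, 95: 0.5, 100: 0.0}
```

Plotting the returned fractions against the bin starts gives the kind of curve the slide title describes: fraction lost as a function of %id.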
Quite Some Time Later
[McLean & Bejerano, Genome Res., 2008]
Ultras are Fiercely Retained through Evolution
Ultras are >300 fold more persistent than neutral DNA (25% deleted)
the genomic deletion is
100%id primates-dog: 1,691,090bp
rodents deleted: 1,447bp (0.086%)
Pr(fixation | N_e, s)
How special are the Ultras?
selection
Ultraconservation
Ultraconservation as a Phenomenon
Few species More and more species
Hmmm….
We do not see a bump in the curve
Ultraconserved Elements: What do we know?
• Excessive sequence conservation exists.
• Set is heterogeneous from a functional perspective.
• Four can be KO-ed with no clear phenotype.
• Yet, the set is under extreme selection in natural populations, both for mutations and deletions.
• Most ultras have deep orthology, and no paralogy.
• One ultra comes from a mobile element co-option event.
• Others may have come from similar events.
• Ultras appear to be the tip of a continuum, not a unique peak.
Ultraconserved Elements: What we don’t know
• What maintains so much conservation?