Replication-associated strand asymmetries in mammalian genomes
In silico detection of replication origins
Samuel Nicolay, Benjamin Audit, Edward Brodie of Brodie, Alain Arneodo (ENS-Lyon)
Maxime Huvet, Marie Touchon, Yves d'Aubenton-Carafa, Claude Thermes (CGM, Gif-sur-Yvette)
Support: CNRS, ACI IMPBio, ANR
Long genome sequence fragments tend to show on the same strand: fA = fT and fG = fC
« SECOND PARITY RULE »
[Figure: A (Mb) plotted against T (Mb), and G (Mb) against C (Mb); panels: Bacteria/Archaebacteria, Human chromosomes]
Second parity rule (PR2), at equilibrium: fA = fT and fG = fC (at large scales)
(Chargaff, 1962; Sueoka, Lobry, 1995)
Same mutation/repair processes on the 2 DNA strands
Same values of complementary substitution rates
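The second parity rule can be checked on any long sequence by counting bases on a single strand. The sketch below is only an illustration: the function names and the toy fragment are hypothetical and do not come from the slides. A sequence joined to its own reverse complement satisfies the rule exactly, which serves as a sanity check.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def base_frequencies(seq: str) -> dict:
    """Frequencies of A, C, G, T on one strand (other symbols are ignored)."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def pr2_deviation(seq: str) -> tuple:
    """Return (fA - fT, fG - fC); both are zero when PR2 holds exactly."""
    f = base_frequencies(seq)
    return (f["A"] - f["T"], f["G"] - f["C"])


# Toy fragment (hypothetical): rich in G and T, so it violates PR2 on its own.
fragment = "GGGTTTGAGT"
print(pr2_deviation(fragment))

# Joined to its reverse complement, every base is balanced by its complement.
symmetric = fragment + reverse_complement(fragment)
print(pr2_deviation(symmetric))
```

On real genomes the deviations are small but not exactly zero, and those residual deviations are what the skews introduced in the following slides measure.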
LARGE SCALE PROPERTIES OF GENOMIC MUTATIONS
REPLICATION : asymmetry of mutation/repair processes between leading and lagging strands
[Figure: replication fork at a replication origin; labels: leading strand, lagging strand, 5’, 3’]
What mechanisms cause composition asymmetries ?
EUBACTERIA: G > C and T > A in the leading strand
SGC = (nG - nC) / (nG + nC) > 0
STA = (nT - nA) / (nT + nA) > 0
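The two skews can be computed directly from base counts in successive windows. The following is a minimal sketch: the function names, the non-overlapping window scheme and the choice to return 0 for a window lacking the relevant bases are assumptions made for illustration, not details taken from the slides.

```python
def skews(window: str) -> tuple:
    """Return (STA, SGC) for one window of sequence."""
    w = window.upper()
    n_a, n_c, n_g, n_t = (w.count(b) for b in "ACGT")
    # Assumption: a window with no T/A (or no G/C) gets a skew of 0.
    s_ta = (n_t - n_a) / (n_t + n_a) if (n_t + n_a) else 0.0
    s_gc = (n_g - n_c) / (n_g + n_c) if (n_g + n_c) else 0.0
    return s_ta, s_gc


def skew_profile(seq: str, window: int = 1000) -> list:
    """(STA, SGC) in adjacent, non-overlapping windows along the sequence."""
    return [
        skews(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


# Toy sequence (hypothetical): a G- and T-rich half followed by its
# reverse complement, which is C- and A-rich.
toy = "GGCTTA" * 100 + "TAAGCC" * 100
profile = skew_profile(toy, window=600)
print(profile)
```

In this toy sequence both skews are positive in the first window and negative in the second, mimicking a change of sign between two strands with opposite composition biases.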
Composition asymmetry in procaryotes
SGC = (nG - nC) / (nG + nC), computed in 1 kb windows
[Figure: SGC along the Bacillus subtilis genome (position in 10^6 bp); labels: ORI, TER, leading strand (G > C), lagging strand (G < C)]
[Figure: RNA polymerase on the DNA; labels: transcribed strand, non-transcribed strand, 5’, 3’]
TRANSCRIPTION : asymmetry of mutation/repair processes between transcribed and non-transcribed strands
What mechanisms cause composition asymmetries ?
SGC = (nG - nC) / (nG + nC) > 0
STA = (nT - nA) / (nT + nA) > 0
EUBACTERIA: G > C and T > A on the non-transcribed strand
Skew profiles associated with transcription and replication in Eubacteria
[Figure: three schematic skew profiles S along the sequence]
• transcriptional skew profile: (+) and (-) genes, transcribed and non-transcribed strands
• replicative skew profile: leading and lagging strands on either side of ORI
• superposition of replication and transcription
S = STA + SGC
[Figure: skew S along the Bacillus subtilis genome (Mbp); legend: genes (strand +), genes (strand -), intergenic regions]
STRAND ASYMMETRIES IN EUKARYOTES ?
1. Strand asymmetries associated with transcription in the human genome
Strand asymmetries associated with transcription in human genes
Introns (126 000) of ≈ 12 000 genes (no exons, no repeats)
Upward jumps (5’), downward jumps (3’)
STA = (nT - nA) / (nT + nA)
SGC = (nG - nC) / (nG + nC)
∆S = STA + SGC ~ 7%
Mean skew associated with transcription
[Figure: mean STA and SGC around the 5’ and 3’ ends of genes, flanked by intergenic sequences; x axis: distance (kb), from -40 to 40; y axis: from -2 to 8]
2. Strand asymmetries associated with replication in the human genome
Skew profiles around human replication origins
[Figure legend: genes (strand +), genes (strand -), intergenic regions]
Superimposition of replication and transcription biases
[Figure: schematic skew S around ORI, 5' to 3'; legend: genes (strand +), genes (strand -), intergenic regions]
Transcription : ∆S ~ ± 7%
Replication : ∆S ~ + 14%
Conservation of skew profiles in mammalian genomes
[Figure: skew profiles in human, mouse, rat and dog]
Conservation of replication origins in mammalian genomes
3. In silico detection of replication origins in the human genome
Detection of upward jumps associated with replication
Main problem:
• jumps due only to transcription must be avoided
Scale of analysis:
• larger than the typical size of genes
• smaller than the typical size of replicons
Hence the need for a multi-scale analysis
[Figure: schematic skew S with upward jumps at successive ORI; scale bars: 100 kb and 1 Mb]
Genes: mean size 30 kb
[Figure: S and its first derivative, smoothed at scales w = 10 kb, 50 kb, 100 kb and 200 kb]
• small w: numerous jumps, high precision
• large w: few jumps, low precision
Multi scale jump detection using the wavelet transform
Signal smoothed at large scale (200 kb): identification of transitions
Position of transitions (1 kb)
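The coarse-to-fine idea can be sketched as follows: transitions are identified on the derivative of the signal smoothed at a large scale, then positioned using a small scale. This is a simplified illustration, not the authors' implementation. It uses the first derivative of a Gaussian as the analysing wavelet, a fixed threshold, and a synthetic noisy profile; the function names, scales, threshold and toy data are all assumptions.

```python
import numpy as np


def gaussian_derivative_kernel(scale: float) -> np.ndarray:
    """First derivative of a unit-area Gaussian of width `scale` (in samples)."""
    half = int(np.ceil(4 * scale))
    x = np.arange(-half, half + 1, dtype=float)
    gauss = np.exp(-x**2 / (2.0 * scale**2))
    gauss /= gauss.sum()
    return -x / scale**2 * gauss


def smoothed_derivative(signal, scale: float) -> np.ndarray:
    """Derivative of the signal smoothed at `scale`; same length as the input."""
    kernel = gaussian_derivative_kernel(scale)
    half = (len(kernel) - 1) // 2
    padded = np.pad(np.asarray(signal, dtype=float), half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def detect_upward_jumps(signal, large_scale, small_scale, threshold):
    """Find upward jumps at the large scale, then position them at the small scale."""
    coarse = smoothed_derivative(signal, large_scale)
    fine = smoothed_derivative(signal, small_scale)
    above = np.concatenate(([False], coarse > threshold, [False])).astype(int)
    # Start (inclusive) and stop (exclusive) of each run above the threshold.
    edges = np.flatnonzero(np.diff(above))
    return [
        int(start + np.argmax(fine[start:stop]))
        for start, stop in zip(edges[::2], edges[1::2])
    ]


# Synthetic profile (hypothetical): linear decrease with upward jumps
# at samples 1000 and 2000, plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.arange(3000)
skew = 0.07 - 0.14 * (x % 1000) / 1000.0
skew += rng.normal(0.0, 0.01, size=x.size)

coarse = smoothed_derivative(skew, 50)
jumps = detect_upward_jumps(skew, large_scale=50, small_scale=3,
                            threshold=0.5 * coarse.max())
print(jumps)
```

The large scale suppresses noise and short-range fluctuations at the cost of precision, while the small scale restores the position once a transition has been identified.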
Histograms of jump amplitude
[Figure: histograms (%) of upward and downward jump amplitudes]
Asymmetry of the human genome
[Figure: x axis: position x (Mb)]
« Factory roof » skew profiles
[Figure: skew S versus position x (kb); panels: MCM4, TOP1]
« Factory roofs » around experimentally determined replication origins
Conservation of potential origins in mammalian genomes
[Figure: panels for human, mouse and dog]
Replication termination sites: distributed between fixed adjacent origins
Model of eucaryotic replicon
[Figure: origins (O) and termination sites (T) in the eucaryote and procaryote replicons]
[Figure: skew S between Ori 1 and Ori 2 at each cycle, after several cycles, and after N cycles]
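A toy simulation illustrates how termination sites distributed between two fixed origins can lead to a linearly decreasing skew. The assumptions are stated loudly because they are not spelled out on the slides: at each cycle the termination site is drawn uniformly between Ori 1 and Ori 2, positions replicated by the fork coming from Ori 1 contribute +delta, and positions replicated by the fork coming from Ori 2 contribute -delta. All names and parameter values are hypothetical.

```python
import numpy as np


def mean_skew_profile(n_cycles: int, n_positions: int = 101,
                      delta: float = 1.0, seed: int = 0):
    """Average per-cycle skew contribution between Ori 1 (x=0) and Ori 2 (x=1)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_positions)
    # Assumption: one termination site per cycle, uniform between the origins.
    terminations = rng.uniform(0.0, 1.0, size=n_cycles)
    # Row = cycle, column = position: +delta left of the termination site
    # (fork from Ori 1), -delta right of it (fork from Ori 2).
    contributions = np.where(x[None, :] < terminations[:, None], delta, -delta)
    return x, contributions.mean(axis=0)


# One cycle gives a single step from +delta to -delta at the termination site.
x, one_cycle = mean_skew_profile(1)

# Many cycles average out to a straight line from +delta to -delta.
x, many_cycles = mean_skew_profile(20000)
print(np.max(np.abs(many_cycles - (1.0 - 2.0 * x))))
```

Under these assumptions the expected profile is delta * (1 - 2x) between the origins, so repeating the replicon side by side yields an upward jump of 2 * delta at each origin followed by a linear decrease.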
759 « factory roofs » spanning ~ 40% of the human genome
[Figure labels: factory roof, wavelets]
Detection of factory roofs using the wavelet transform
ASYMMETRY OF HUMAN GENOME
factory roofs = 40 %
factory roofs < 1 %
[Figure: three schematic skew profiles S along the sequence]
• transcriptional skew profile: (+) and (-) genes, transcribed and non-transcribed strands
• replicative skew profile between adjacent ORI
• superposition of transcription and replication
EUCARYOTIC REPLICON MODEL
Comparison with replication timing data
Woodfine et al., Cell Cycle (2005)
[Figure: replication timing (early to late) versus position on human chromosome 6 (Mbp), with ori positions marked]
GENE ORGANISATION IN HUMAN CHROMOSOMES
Organisation of transcription around predicted replication origins
Co-orientation of transcription and replication
[Figure: skew S between adjacent ORI]
Open chromatin
Replication origins are situated at the center of open chromatin regions
[Figure label: genomic DNA]
Model of mammalian chromatin organization
Conclusions
• Existence of replication-coupled strand asymmetries in the human genome
• Replication origins correspond to large transitions of skew profiles
• These transitions are conserved in mammalian genomes
• Detection of more than one thousand putative origins active in germ-line cells
• « Factory roof » profiles: regularly distributed termination sites
• Essential role of replication in the organisation of gene order and expression