Upload
nguyentram
View
226
Download
0
Embed Size (px)
Citation preview
IntroductionMaterialMethod
Results and AnalysesConclusion
The last universal common ancestor
Relationship between genomic GC content andoptimal growth temperature in Bacteria
Norbert Kopocz;Supervisor: Allen Rodrigo
December 9, 2009
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Table of contents
1 IntroductionMusto et al.Wang et al.ideas
2 Materialdata
3 Methodtree, ancestral states and delta values
4 Results and Analysescorrelation
5 Conclusion
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
overview
Why is the relationship between genomic G+C content and theoptimal growth temperature so interesting?
environmental temperature based mutation of nucleotides
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
overview
Why is the relationship between genomic G+C content and theoptimal growth temperature so interesting?
environmental temperature based mutation of nucleotides
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
background
source: http://en.wikipedia.org/wiki/File:AT-GC.jpg
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
publication by .....
.... Musto et al.
correlation over all species
no positive correlationfound
R = -0.167p-Value: < 0.0000195% confidence intervall:-0.238 to -0.094
20 40 60 80 100
3040
5060
70
opt_growth_temp
GC_content
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
publication by .....
.... Musto et al.
analyzing families of prokaryotes:
20 prokaryotic families
15 out of them with positive correlation
but only 8 with statistically significance
all in all: ”Topt is one of the factors that influences genomic GC inprokaryotes”
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
publication by .....
.... Musto et al.
analyzing families of prokaryotes:
20 prokaryotic families
15 out of them with positive correlation
but only 8 with statistically significance
all in all: ”Topt is one of the factors that influences genomic GC inprokaryotes”
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
publication by .....
.... Wang et al.
”no significance”
analyzing a dataset of 1065 species:
separating into 5 temperature groups
less than 30 ◦C30 ◦C to 40 ◦C40 ◦C to 50 ◦C50 ◦C to 80 ◦Cgreater than 80 ◦C
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
publication by .....
.... Wang et al.
results:
average genomic GC ishighest in the lowesttemperature group(less than 30 ◦C)significant correlationonly in lowtemperature range
Temp. group [◦C] R p-value<= 30 0.29 < 10−6
30-40 -0.38 < 10−6
40-50 0.14 0.41
50-80 -0.21 0.12
>= 80 0.23 0.25
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
overview
two different ideas:
consideration of observed values (Musto et al. and Wang etal.)
our idea:
find a correlation between:evolutionary change in GC contentand evolutionary change in optimal growth temperature
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Musto et al.Wang et al.ideas
overview
why do we consider the evolutionary change in both values?
different species - different lifestyle:
GC poor e.g.: pathogens or symbionts [1] andspecies with small genomes [2]GC rich: large genomes [1]
[1] EP Rocha, A. Danchin, Base composition bias might result from competition for metabolic resources[2] N.A. Moran, Microbial Minimalism: Genome Reduction in Bacterial Pathogens
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
data
Data
706 species (Archaeabacteria and Prokaryotes)
genomic GC content and optimal growth temperature
16S ribosomal RNA sequences from NCBI
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
tree, ancestral states and delta values
creating tree
alignment over 706 16S rRNA sequences
editing alignment using program Squint
creating maximum likelihood phylogenetic tree
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
tree, ancestral states and delta values
creating tree
alignment over 706 16S rRNA sequences
editing alignment using program Squint
creating maximum likelihood phylogenetic tree
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
tree, ancestral states and delta values
creating tree
alignment over 706 16S rRNA sequences
editing alignment using program Squint
creating maximum likelihood phylogenetic tree
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
tree, ancestral states and delta values
ancestral states
ancestral states: squared change parsimony methodancestral states for opt. growth temperatureand genomic GC content
Furthermore: program to calculate 16S rRNA GC content
observed values
observed values
observed values
ancestral state at node
ancestral state at
root
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
tree, ancestral states and delta values
delta values
set ancestral state values on right position in treecalculating delta values using java, jebl library
delta 1
delta 5
delta 4
delta 3
delta 2
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
correlation
∆ genomic GC vs. ∆ temp
over all ancestral states:
-20 -15 -10 -5 0 5 10 15
-10
-50
510
1520
dTempall
dGCall
correlation:
R = 0.104
p-value: < 0.0001
95% confidence intervall:0.052 to 0.155
degrees of freedom: 1408
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
correlation
∆ genomic GC vs. ∆ temp
over internal nodes:
-20 -15 -10 -5 0 5 10
-10
-50
510
dTempint
dGCint
correlation:
R = 0.068
p-value: 0.06966
95% confidence intervall:-0.0055 0.14158
degrees of freedom: 702
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
correlation
∆ genomic GC vs. ∆ temp
over external nodes:
-15 -10 -5 0 5 10 15
-10
-50
510
1520
dTempnew
dGCnew
correlation:
R = 0.128
p-value: 0.0006
95% confidence intervall:0.055 to 0.2
degrees of freedom: 704
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
correlation
∆ rRNA GC content vs. ∆ temp
over all ancestral states (rRNA):
-20 -15 -10 -5 0 5 10 15
-10
-50
510
RNA_Temp
RNA_gc
correlation:
R = 0.094
p-value: 0.0004
95% confidence intervall:0.042 to 0.145
degrees of freedom: 1408
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
correlation
∆ rRNA GC content vs. ∆ temp
over internal nodes (rRNA):
-20 -15 -10 -5 0 5 10
-1.0
-0.5
0.0
0.5
1.0
RNA_int_temp
RNA_int_gc
correlation:
R = 0.115
p-value: 0.002
95% confidence intervall:0.0416 to 0.1874
degrees of freedom: 702
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
correlation
∆ rRNA GC content vs. ∆ temp
over external nodes (rRNA):
-15 -10 -5 0 5 10 15
-10
-50
510
RNA_ex_temp
RNA_ex_gc
correlation:
R = 0.12
p-value: 0.001
95% confidence intervall:0.0464 to 0.1919
degrees of freedom: 704
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
summary
new way to figure out a relationship using:
phylogenetic backgroundevolutionary change in both valuessignificance
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
??Relationship??
Yes, we can ...... say there is a relationship between:
genomic GC content and Topt
rRNA GC content and Topt
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
Acknowledgements
Thank you very much for your attention!
Allen Rodrigo
Peter Tsai
Sibon Li
Kevin Chang
Bioinformatics Institute
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
overview
our idea:
Temp +Temp -
GC +
GC -
allowed
allowed
allowed
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
distances
developing a permutation-test of the distances to the origin
*
**
**
***
*
**
*
*
*
*
*
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
how works my permutation test?
48
42
53
52
37
43
55
41
39
61
54
50
41
43
51
41
37
62
64
21
36
71
40
28
89
78
41
21
37
54
GC Temp
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
statistic
permutation test of the distances tothe origin
distance to origin:d = 2
√∆GC 2 + ∆Temp2
S = d IV
d I,II,III
I II
III IV
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria
IntroductionMaterialMethod
Results and AnalysesConclusion
statistic
0.9 1.0 1.1 1.2
02
46
density.default(x = x, kernel = "gaussian")
N = 500 Bandwidth = 0.01381
Density
number of permutations:N = 500
observed value fordistance proportionS = 0.909
significant for α = 1 %
Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria