32
Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between genomic GC content and optimal growth temperature in Bacteria Norbert Kopocz; Supervisor: Allen Rodrigo December 9, 2009 Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genom

The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

Embed Size (px)

Citation preview

Page 1: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

The last universal common ancestor

Relationship between genomic GC content andoptimal growth temperature in Bacteria

Norbert Kopocz;Supervisor: Allen Rodrigo

December 9, 2009

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 2: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Table of contents

1 IntroductionMusto et al.Wang et al.ideas

2 Materialdata

3 Methodtree, ancestral states and delta values

4 Results and Analysescorrelation

5 Conclusion

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 3: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

overview

Why is the relationship between genomic G+C content and theoptimal growth temperature so interesting?

environmental temperature based mutation of nucleotides

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 4: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

overview

Why is the relationship between genomic G+C content and theoptimal growth temperature so interesting?

environmental temperature based mutation of nucleotides

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 5: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

background

source: http://en.wikipedia.org/wiki/File:AT-GC.jpg

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 6: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

publication by .....

.... Musto et al.

correlation over all species

no positive correlationfound

R = -0.167p-Value: < 0.0000195% confidence intervall:-0.238 to -0.094

20 40 60 80 100

3040

5060

70

opt_growth_temp

GC_content

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 7: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

publication by .....

.... Musto et al.

analyzing families of prokaryotes:

20 prokaryotic families

15 out of them with positive correlation

but only 8 with statistically significance

all in all: ”Topt is one of the factors that influences genomic GC inprokaryotes”

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 8: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

publication by .....

.... Musto et al.

analyzing families of prokaryotes:

20 prokaryotic families

15 out of them with positive correlation

but only 8 with statistically significance

all in all: ”Topt is one of the factors that influences genomic GC inprokaryotes”

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 9: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

publication by .....

.... Wang et al.

”no significance”

analyzing a dataset of 1065 species:

separating into 5 temperature groups

less than 30 ◦C30 ◦C to 40 ◦C40 ◦C to 50 ◦C50 ◦C to 80 ◦Cgreater than 80 ◦C

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 10: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

publication by .....

.... Wang et al.

results:

average genomic GC ishighest in the lowesttemperature group(less than 30 ◦C)significant correlationonly in lowtemperature range

Temp. group [◦C] R p-value<= 30 0.29 < 10−6

30-40 -0.38 < 10−6

40-50 0.14 0.41

50-80 -0.21 0.12

>= 80 0.23 0.25

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 11: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

overview

two different ideas:

consideration of observed values (Musto et al. and Wang etal.)

our idea:

find a correlation between:evolutionary change in GC contentand evolutionary change in optimal growth temperature

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 12: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Musto et al.Wang et al.ideas

overview

why do we consider the evolutionary change in both values?

different species - different lifestyle:

GC poor e.g.: pathogens or symbionts [1] andspecies with small genomes [2]GC rich: large genomes [1]

[1] EP Rocha, A. Danchin, Base composition bias might result from competition for metabolic resources[2] N.A. Moran, Microbial Minimalism: Genome Reduction in Bacterial Pathogens

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 13: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

data

Data

706 species (Archaeabacteria and Prokaryotes)

genomic GC content and optimal growth temperature

16S ribosomal RNA sequences from NCBI

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 14: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

tree, ancestral states and delta values

creating tree

alignment over 706 16S rRNA sequences

editing alignment using program Squint

creating maximum likelihood phylogenetic tree

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 15: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

tree, ancestral states and delta values

creating tree

alignment over 706 16S rRNA sequences

editing alignment using program Squint

creating maximum likelihood phylogenetic tree

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 16: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

tree, ancestral states and delta values

creating tree

alignment over 706 16S rRNA sequences

editing alignment using program Squint

creating maximum likelihood phylogenetic tree

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 17: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

tree, ancestral states and delta values

ancestral states

ancestral states: squared change parsimony methodancestral states for opt. growth temperatureand genomic GC content

Furthermore: program to calculate 16S rRNA GC content

observed values

observed values

observed values

ancestral state at node

ancestral state at

root

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 18: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

tree, ancestral states and delta values

delta values

set ancestral state values on right position in treecalculating delta values using java, jebl library

delta 1

delta 5

delta 4

delta 3

delta 2

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 19: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

correlation

∆ genomic GC vs. ∆ temp

over all ancestral states:

-20 -15 -10 -5 0 5 10 15

-10

-50

510

1520

dTempall

dGCall

correlation:

R = 0.104

p-value: < 0.0001

95% confidence intervall:0.052 to 0.155

degrees of freedom: 1408

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 20: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

correlation

∆ genomic GC vs. ∆ temp

over internal nodes:

-20 -15 -10 -5 0 5 10

-10

-50

510

dTempint

dGCint

correlation:

R = 0.068

p-value: 0.06966

95% confidence intervall:-0.0055 0.14158

degrees of freedom: 702

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 21: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

correlation

∆ genomic GC vs. ∆ temp

over external nodes:

-15 -10 -5 0 5 10 15

-10

-50

510

1520

dTempnew

dGCnew

correlation:

R = 0.128

p-value: 0.0006

95% confidence intervall:0.055 to 0.2

degrees of freedom: 704

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 22: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

correlation

∆ rRNA GC content vs. ∆ temp

over all ancestral states (rRNA):

-20 -15 -10 -5 0 5 10 15

-10

-50

510

RNA_Temp

RNA_gc

correlation:

R = 0.094

p-value: 0.0004

95% confidence intervall:0.042 to 0.145

degrees of freedom: 1408

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 23: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

correlation

∆ rRNA GC content vs. ∆ temp

over internal nodes (rRNA):

-20 -15 -10 -5 0 5 10

-1.0

-0.5

0.0

0.5

1.0

RNA_int_temp

RNA_int_gc

correlation:

R = 0.115

p-value: 0.002

95% confidence intervall:0.0416 to 0.1874

degrees of freedom: 702

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 24: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

correlation

∆ rRNA GC content vs. ∆ temp

over external nodes (rRNA):

-15 -10 -5 0 5 10 15

-10

-50

510

RNA_ex_temp

RNA_ex_gc

correlation:

R = 0.12

p-value: 0.001

95% confidence intervall:0.0464 to 0.1919

degrees of freedom: 704

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 25: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

summary

new way to figure out a relationship using:

phylogenetic backgroundevolutionary change in both valuessignificance

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 26: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

??Relationship??

Yes, we can ...... say there is a relationship between:

genomic GC content and Topt

rRNA GC content and Topt

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 27: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

Acknowledgements

Thank you very much for your attention!

Allen Rodrigo

Peter Tsai

Sibon Li

Kevin Chang

Bioinformatics Institute

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 28: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

overview

our idea:

Temp +Temp -

GC +

GC -

allowed

allowed

allowed

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 29: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

distances

developing a permutation-test of the distances to the origin

*

**

**

***

*

**

*

*

*

*

*

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 30: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

how works my permutation test?

48

42

53

52

37

43

55

41

39

61

54

50

41

43

51

41

37

62

64

21

36

71

40

28

89

78

41

21

37

54

GC Temp

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 31: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

statistic

permutation test of the distances tothe origin

distance to origin:d = 2

√∆GC 2 + ∆Temp2

S = d IV

d I,II,III

I II

III IV

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria

Page 32: The last universal common ancestor Relationship between ... · Introduction Material Method Results and Analyses Conclusion The last universal common ancestor Relationship between

IntroductionMaterialMethod

Results and AnalysesConclusion

statistic

0.9 1.0 1.1 1.2

02

46

density.default(x = x, kernel = "gaussian")

N = 500 Bandwidth = 0.01381

Density

number of permutations:N = 500

observed value fordistance proportionS = 0.909

significant for α = 1 %

Norbert Kopocz; Supervisor: Allen Rodrigo The last universal common ancestorRelationship between genomic GC content and optimal growth temperature in Bacteria