Molecular Phylogeny in a context of possible Lateral Gene Transfers
Eric Bapteste
W.F. Doolittle Lab
The reason(s) why we doubt a strict tree-like representation should be used
• Biological processes favoring lateral exchanges of DNA... are powerful
• Phylogenetic evidence for a unique Tree of Life is weak
• Molecular phylogenies might even suggest that LGT happens
… at least in some lineages
Biological Processes contribute to lateral exchanges of DNA
Internal source of variation
Mutator phenotype
Baseline replication errors (point mutations)
Intragenomic recombination (legitimate and illegitimate)
Hypervariable loci
Genome of the Organism
Deletion of genetic material (Gene loss)
Gene duplication
Vertical inheritance
Genome of the Descendent
External source of variation
DNA viruses, lytic RNA viruses, retroviruses
Conjugative plasmids and transposons
DNA from divergent lineage
Transduction
Transformation
Conjugation
Horizontal inheritance
Cell fusions
Membrane vesicle transfer
Phylogenetic evidence for a unique Tree of Life is weak
“The general lack of conflict observed among the 203 remaining families was not due to the absence of phylogenetic signal in the gene alignments because most genes did conflict with several other topologies (see Figure 3). We interpreted this congruence as a reflection of shared history and a lack of LGT. Therefore, we chose these genes as the basis for inferring the true organismal phylogeny for these 13 species.”
Gamma-proteobacteria: an apparent agreement on a tree
Lerat E et al., PLoS Biol. 2003 Oct;1(1):E19.
[Figure: two panels, AU test and SH test. x-axis: Topologies; y-axis: Number of alignments. Blue: not different from the ML tree (5%). Red: different from the ML tree.]
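Testing the congruence/conflict between markers
[Table: values recovered from the slide, axes labelled "Topologies" and "Genes"; columns numbered 1 to 7, five rows shown, further rows elided on the slide]
1      2      3      4      5      6      7
0      0.1    4.1    0.6    0.1    0.4    0.3
4      5.4    7.8    4.3    4.7    3.1    9.7
13     13     0.4    17     9      22     47
42     27     0      33     37     29     41
0.2    1.7    3.3    0.1    0      0.4    0.5
…..
R² = 0.9
R² = 0.05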
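The two R² values on this slide compare pairs of markers. Here is a minimal sketch of one plausible reading: the squared Pearson correlation between the p-value profiles of two genes over the same set of topologies. The slide does not say how R² was computed, and the function name `r_squared` and all numbers below are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical p-values of three genes for the same four topologies
gene_a = [0.1, 0.5, 0.9, 0.3]
gene_b = [0.2, 0.6, 1.0, 0.4]   # same ranking of topologies as gene_a
gene_c = [0.9, 0.1, 0.5, 0.5]   # a different ranking

congruent = r_squared(gene_a, gene_b)
conflicting = r_squared(gene_a, gene_c)
```

Under this reading, a high value means two genes accept and reject the same topologies, and a low value means they disagree.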
Principal Component Analysis of p-values for each gene and topology
[Figure: PCA plot of the example genes, points labelled 1 to 7]
Principal Component Analysis of 205 genes of gamma-proteobacteria and simulated markers with transfers
[Figure: PCA plot shown in successive builds. Legend: genes; Random; 1 LGT event; 2 LGT events; 3 LGT events]
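This kind of analysis can be sketched in a few lines. The sketch below assumes a matrix with one row per gene and one column per candidate topology, each cell holding a p-value. The slides do not state which PCA implementation or scaling was used, and the name `pca_scores` and the data are hypothetical.

```python
import numpy as np

def pca_scores(pvals, n_components=2):
    """Project genes (rows) onto the leading principal components.

    pvals: array of shape (n_genes, n_topologies).
    Returns (scores, explained_variance_ratio).
    """
    x = np.asarray(pvals, dtype=float)
    centered = x - x.mean(axis=0)  # center each topology column
    # SVD of the centered matrix: rows of vt are the principal axes
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

# Hypothetical data: 6 genes x 4 topologies
pvals = np.array([
    [0.80, 0.60, 0.01, 0.02],
    [0.75, 0.55, 0.02, 0.01],
    [0.82, 0.58, 0.03, 0.02],
    [0.02, 0.01, 0.70, 0.65],
    [0.03, 0.02, 0.72, 0.60],
    [0.40, 0.35, 0.30, 0.33],
])
scores, explained = pca_scores(pvals)
```

Genes that accept the same topologies land close together on the first axis. In this toy matrix the first three genes fall on one side and the next two on the other, while the last gene, which rejects nothing clearly, sits near the origin.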
[Figure: matrix of p-values, axes GENES and TOPOLOGIES; each cell is the P-value for GENE NUMBER i and TOPOLOGIES NUMBER i]
[Figure: the same matrix after clustering, CLUSTER OF GENES against CLUSTER OF PLAUSIBLE TOPOLOGIES. BLUE: genes with LGT. RED: genes. Regions labelled 1, 2, 3, 4]
genes clearly showing lateral transfer
genes showing nothing clearly
genes clearly showing vertical descent
enthusiastic lateralists
committed verticalists
INCONGRUENCE OF ORTHOLOGOUS GENES: HOW MUCH IS NOISE, HOW MUCH IS TRANSFER (ORTHOLOGOUS REPLACEMENT)? TRUTH IS, NO ONE REALLY KNOWS
What we propose to do
A synthesis
Vertical part Horizontal part
Principles to make a synthesis
[Figure: Reference phylogeny (taxa A to F), Phylogeny of gene 1 and Phylogeny of gene 2 (groupings AB, CDE and F, with node values of 99), combined into a Synthesis]
From a tree …
ML trees, BV > 50, strict consensus
… to a synthesis
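One way to picture the step from trees to a synthesis is to compare each well-supported grouping of a gene tree with the reference phylogeny. A grouping that fits goes with the vertical part; one that conflicts is a candidate for the horizontal part. The sketch below assumes groupings are given as sets of taxa with a bootstrap value (BV). The function names, the toy trees and the rule "conflict means candidate transfer" are simplifying assumptions, not the procedure used in the talk.

```python
TAXA = frozenset("ABCDEF")

def compatible(clade_a, clade_b, taxa=TAXA):
    """Two bipartitions are compatible when at least one of the four
    intersections between their sides is empty."""
    a, b = frozenset(clade_a), frozenset(clade_b)
    a_rest, b_rest = taxa - a, taxa - b
    return any(not (x & y) for x in (a, a_rest) for y in (b, b_rest))

def classify(gene_clades, reference_clades, min_bv=50):
    """Split supported gene-tree groupings into congruent and conflicting."""
    congruent, conflicting = [], []
    for clade, bv in gene_clades:
        if bv <= min_bv:
            continue  # too weakly supported to say anything
        if all(compatible(clade, ref) for ref in reference_clades):
            congruent.append(clade)
        else:
            conflicting.append(clade)
    return congruent, conflicting

# Hypothetical reference phylogeny, written as its internal groupings
reference = ["AB", "CD", "CDE"]
# Hypothetical gene tree: CDE is recovered, but B groups with F
gene = [("CDE", 99), ("BF", 99), ("CD", 40)]
congruent, conflicting = classify(gene, reference)
```

Groupings below the support threshold are left out of both lists, in the spirit of "when we do not know, we do not know".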
Conclusions
We need better trees to have a better synthesis
LGT should be accounted for when reconstructing the evolutionary history
Many interesting biological and epistemological avenues to explore in the near future
Many thanks to the Doolittle and Roger labs
Celine Brochier
Yan Boucher
Dave MacLeod
Robert Charlebois
Jessica Leigh
Ed Susko
Ford Doolittle
David Walsh
Topology
I respect (and more) Vincent Daubin
The reason why my interpretation of the dataset is different:
- I believe that most of these genes do not contain enough phylogenetic signal to tell the whole history of gamma-proteobacteria alone
This is the very motive for concatenation: genes are too weak alone
However, based on biological evidence, transfer could have happened,
- so we should not prejudge that these genes with an unknown history have been transmitted only vertically. In a context of LGT, concatenation is not safe a priori.
In other words, in the possible presence of LGT, "when we do not know, we do not know!"
- Test concatenations of markers of entirely simulated data, full of transfers, also give robust phylogenies (Douady and Doolittle, unpublished)
So, even good support for a tree coming from a concatenation is no guarantee that the true history has been recovered. Careful analyses of each marker are required.
- During these analyses, if we also see some conflict, we should show it, and then build a synthesis instead of a tree
The phylogenetic signal is not robust over the whole Synthesis: basal branches are poorly supported.
Distribution of the phylogenetic signal along the synthesis
[Figure: x-axis: distance from the root (1 to 8); y-axis: 0 to 1. Series: Total phylogenetic signal; Longest consecutive vertical path supported]
More precisely, many inner nodes are supported by only a minority of the genes (in purple). There are always genes (in dark green) whose phylogenetic history we do not know.
[Figure: bar chart, y-axis 0 to 200; x-axis categories 1A, 1B, 2A, 2B, 3A, 4A, 4B, 5A, 5B, 6A, 6B. Legend: Horizontal and vertical inheritance; Mode of transmission unknown. Taxa: Baphi, Ecoli, Hinfl, Paer, Pmult, Styphi, Vchol, Wigg, Xaxo, Xcamp, Xfast, YpesCO92, YpesKIM]
A brief view of the differences between the 16 plausible topologies (AU test, 5%)
[Figure: topologies drawn for the taxa Xfast, Xaxo, Xcamp, Paer, Wigg, Baphi, Vchol, Pmult, Hinfl, YpesCO92, YpesKIM, Styphi, Ecoli]
What are the main evolutionary routes?
The road of relationships
Are there main routes? Unique routes? Side-issues?
Are the genes involved in LGT especially mobile ones?
Average p-value (AU test) for each topology over all the genes
[Figure: bar chart, y-axis 0 to 0.5 (au); the bar for the concatenated tree is labelled]
The average p-value of the best tree for each gene is: 0.83
The concatenated tree is a good "average", but for most genes it is not the best tree
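The contrast between a good "average" tree and the best tree of each gene can be made concrete with a small table of p-values. This is a minimal sketch, assuming `pvals[g][t]` holds the AU-test p-value of topology `t` for gene `g`. The numbers are hypothetical and chosen only to reproduce the pattern described on the slide.

```python
def summarize(pvals):
    """Return (mean p-value per topology, number of genes for which each
    topology is the best one, mean p-value of each gene's best topology)."""
    n_genes = len(pvals)
    n_topos = len(pvals[0])
    mean_per_topology = [
        sum(row[t] for row in pvals) / n_genes for t in range(n_topos)
    ]
    best_counts = [0] * n_topos
    for row in pvals:
        best_counts[row.index(max(row))] += 1
    mean_of_best = sum(max(row) for row in pvals) / n_genes
    return mean_per_topology, best_counts, mean_of_best

# Hypothetical p-values: 4 genes x 3 topologies.
# Topology 0 plays the role of the concatenated tree.
pvals = [
    [0.50, 0.90, 0.05],
    [0.55, 0.05, 0.85],
    [0.50, 0.80, 0.10],
    [0.45, 0.02, 0.95],
]
means, best_counts, mean_of_best = summarize(pvals)
```

In this toy case topology 0 has the highest average p-value, yet no gene prefers it: every gene has a different topology with a higher p-value.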
Electric network
Railway network
Maritime network
Crystal Web
NEW QUESTIONS: Optimisation, functionality, economy, shorter paths?
[Figure: synthesis with the groups Bacteria, Archaea, Euka and Chloro, each labelled 70]
Genes which were laterally transferred: 0rpl4_1.puz_bip.out, 0efg_1.puz_bip.out, 0rpl18_1.puz_bip.out, 0fmt_1.puz_bip.out
Strict consensus, BV > 50 %
Genes which were laterally transferred: gp25boocon.txt.out, gp46boocon.txt.out
These two transfer events support two phylogenetic relationships: the last common ancestor of (133, rb69 and T4) would have given two genes to the last common ancestor of 25, 31, and 44RR
“A radical departure from conventional thinking” W. Martin/M. Embley
Me crazy, but on the shoulders of many philosophers:
Leibniz, Whitehead, Deleuze, Parrochia, etc.
“A radical departure from thinking?”
Rivera and Lake, Nature, 2004
ROOT OF THE RING
[Figure: five alternative trees for the taxa B, H, M, E, P, Y1 and Y2, labelled 60.5%, 16.8%, 10%, 7.2% and 1.8%, followed by two network drawings with "Unknown descendent" branches. Values shown on the networks: 96.3, 77.7, 79.1, 89.1, 94.5, 16.8, 10, 7.2, 1.8]
[Figure: CLUSTER OF GENES against CLUSTER OF PLAUSIBLE TOPOLOGIES]
We can question:
- the choice of how evolution is drawn
- whether a non-tree-like null hypothesis should be considered when building evolutionary scenarios
Heuristic of the synthesis...
There are 26 vertical branches and 11 lateral branches
The total vertical thickness is about 13 times greater than the total horizontal thickness
Yet, 18 genes were laterally transferred
8 lateral branches are mostly compatible with the reference tree
3 lateral branches are mostly incompatible with the reference tree
Thus, 72.7% of the lateral branches (8 of 11) are mostly compatible with the reference tree
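The percentage follows directly from the branch counts on this slide. A short check, using only the numbers given above (the variable names are illustrative):

```python
vertical_branches = 26
lateral_branches = 11
compatible_lateral = 8      # mostly compatible with the reference tree
incompatible_lateral = 3    # mostly incompatible with the reference tree

# The two kinds of lateral branch account for all 11 lateral branches
assert compatible_lateral + incompatible_lateral == lateral_branches

share_compatible = 100 * compatible_lateral / lateral_branches
print(f"{share_compatible:.1f}%")  # prints 72.7%
```

Note that the figure is a share of lateral branches, not of the 18 transferred genes, since one lateral branch may carry more than one gene.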