Are essential genes conserved?
Fatemeh Ashari Ghomi1, Paul Gardner1, Lars Barquist2
1 School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
2 Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany
Transposon-directed insertion-site sequencing is an approach for studying the essentiality of genes in prokaryotes. In this method, pools of single-insertion mutants are constructed using transposon mutagenesis, and the effect of each mutation on the mutants' survival is evaluated by sequencing the survivors. This can lead to the identification of essential genes.
We have used transposon-directed insertion-site sequencing to study the essentiality of genes in 12 strains from Enterobacteriaceae, which are depicted in the figure. To do so, we examined different biases that can affect our transposon insertion experiment. After correcting for the biases, we studied the relation between the essentiality of genes and their conservation.
Klebsiella pneumoniae Ecl8
Salmonella Typhimurium SL1344
Salmonella Enteritidis P125109
Escherichia coli UPEC ST131
Salmonella Typhimurium D23580
Escherichia coli ETEC CS17
Salmonella Typhimurium A130
Enterobacter cloacae NCTC 9394
Citrobacter rodentium ICC168
Escherichia coli ETEC H10407
Klebsiella pneumoniae RH201207
Salmonella Typhi Ty2
Introduction
Questions
1. Are there any biases that affect the results of transposon insertion experiments?
2. Is the conservation of essentiality consistent with the species tree?
3. Are the essentiality of genes and their conservation related?
Transposon insertion is the process of inserting a nucleotide sequence into a gene so that it disrupts the gene and causes it to lose its function.
• If the gene is essential, the organism will not be able to survive.
• If it is non-essential, the organism will be able to survive.
• If it is a beneficial loss, the organism will benefit from losing it.
After genome sequencing:
• No or few transposon insertions are spotted in essential genes.
• An intermediate number of transposon insertions are detected in non-essential genes.
• Many transposon insertions are observed in beneficial losses.
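The read-out described above can be sketched in a few lines of Python. This is only an illustration: the poster does not define its insertion index or its class boundaries, so the definition used here (insertions per base pair of gene) and the cutoffs `low` and `high` are assumptions, and the names `Gene`, `insertion_index` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    length: int      # gene length in base pairs
    insertions: int  # transposon insertion sites observed inside the gene


def insertion_index(gene: Gene) -> float:
    """Insertions per base pair of gene (assumed definition, not from the poster)."""
    if gene.length <= 0:
        raise ValueError("gene length must be positive")
    return gene.insertions / gene.length


def classify(index: float, low: float, high: float) -> str:
    """Label a gene from its insertion index using two hypothetical cutoffs."""
    if index < low:
        return "essential"        # no or few insertions
    if index > high:
        return "beneficial loss"  # many insertions
    return "non-essential"        # intermediate number of insertions


if __name__ == "__main__":
    # Made-up genes and cutoffs, for illustration only.
    for g in (Gene("geneA", 1000, 0), Gene("geneB", 1000, 50), Gene("geneC", 1000, 300)):
        print(g.name, classify(insertion_index(g), low=0.01, high=0.1))
```

In practice the two cutoffs would be chosen from the observed distribution of insertion indices rather than fixed by hand.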
Transposon-directed insertion-site sequencing
We have divided each gene into 3 segments: the first 5% of the gene at the 5' end, the last 20% of the gene at the 3' end, and the internal region between them. The figure shows that, relative to the internal region, the 5' and 3' ends carry more insertions in essential genes and fewer insertions in beneficial losses.
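A minimal sketch of the segmentation described above, assuming 1-based inclusive gene coordinates and a known strand. The function name `segment_counts` and its parameters are hypothetical. The 5% and 20% boundaries follow the figure labels, and integer arithmetic is used so that boundary positions are assigned exactly.

```python
def segment_counts(gene_start, gene_end, strand, insertion_positions,
                   first_pct=5, last_pct=20):
    """Count insertions in the first 5%, internal region and last 20% of a gene.

    gene_start, gene_end: 1-based inclusive coordinates with gene_start <= gene_end.
    strand: '+' or '-'; the 5' end is gene_start on '+' and gene_end on '-'.
    """
    length = gene_end - gene_start + 1
    counts = {"first": 0, "internal": 0, "last": 0}
    for pos in insertion_positions:
        if not gene_start <= pos <= gene_end:
            continue  # insertion lies outside this gene
        # Offset from the 5' end of the gene, in bases.
        offset = pos - gene_start if strand == "+" else gene_end - pos
        if offset * 100 < first_pct * length:
            counts["first"] += 1
        elif offset * 100 >= (100 - last_pct) * length:
            counts["last"] += 1
        else:
            counts["internal"] += 1
    return counts


if __name__ == "__main__":
    print(segment_counts(1, 100, "+", [1, 5, 6, 50, 80, 81, 100]))
```

To compare segments of different widths, each count would then be divided by the segment length before averaging across genes.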
[Figure: mean insertion index by position within the gene, from 5' to 3' (first 5%, internal, last 20%), in two panels: Essential and Beneficial loss.]
Are transposon insertions evenly distributed within genes?
[Figure: Distance bias. Insertion index plotted against distance from the origin (0.0 to 0.5).]
We have also investigated whether the number of insertions in a gene is related to the position of the gene within the genome or to the G-C content of the gene. The results suggest that the further a gene is from the origin of replication, the fewer insertions it contains (the left figure). Moreover, the right figure shows that when the G-C content is greater than 0.5, there is no bias.
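The two covariates examined above can be computed as in the sketch below. It assumes a circular chromosome and measures distance as a fraction of genome length, which is consistent with the 0.0 to 0.5 axis of the distance plot but is not stated on the poster. The function names are hypothetical.

```python
def distance_from_origin(gene_mid, origin, genome_length):
    """Shortest distance around a circular chromosome, as a fraction of its length.

    The result lies between 0.0 (at the origin) and 0.5 (opposite the origin).
    """
    d = abs(gene_mid - origin) % genome_length
    return min(d, genome_length - d) / genome_length


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


if __name__ == "__main__":
    print(distance_from_origin(gene_mid=900, origin=100, genome_length=1000))
    print(gc_content("GGCCAATT"))
```

Plotting the insertion index of each gene against either value gives the kind of scatter shown in the distance and GC panels.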
[Figure: GC bias. Insertion index plotted against GC content (0.2 to 0.7).]
To test whether the results are biased towards certain motifs, we have generated logos from the 10 nucleotides flanking the 100 most frequent insertion sites. The analysis shows no significant bias.
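The quantity a sequence logo displays can be sketched as follows, assuming that 10 bases are taken on each side of a 0-based insertion coordinate (giving the 20 positions seen in the logos) and that no small-sample correction is applied. The helper names `top_sites`, `flank` and `information_content` are hypothetical and are not the tool used for the poster.

```python
import math
from collections import Counter


def top_sites(read_sites, n=100):
    """Return the n insertion sites supported by the most reads."""
    return [site for site, _ in Counter(read_sites).most_common(n)]


def flank(genome, site, width=10):
    """Return `width` bases on each side of a 0-based site, or None near the ends."""
    if site - width < 0 or site + width > len(genome):
        return None
    return genome[site - width:site + width]


def information_content(seqs):
    """Per-column information content in bits for equal-length DNA sequences.

    A column with a single base scores 2 bits; a uniform column scores 0 bits.
    """
    n = len(seqs)
    bits = []
    for column in zip(*seqs):
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(2.0 - entropy)
    return bits


if __name__ == "__main__":
    print(information_content(["AC", "AG", "AT", "AA"]))
```

Columns whose information content stays close to 0 bits across all positions indicate no preferred motif.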
[Figure: sequence logos (probability and bits) for positions 1 to 20 around the top 100 most frequent insertion sites.]
Are transposons biased towards certain positions in the genome?
We have compared the number of genes that are conserved in different strains in our study with the number of genes that are essential in these strains. The results suggest that although the conservation of genes follows a tree-like trend, essentiality does not show a tree-like signal.
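The comparison above rests on counting how many genes are shared by each combination of strains. A minimal sketch, assuming that each strain is represented by a set of gene-cluster identifiers and that intersections are exclusive (a cluster is counted once, under the exact set of strains that contain it). The function name and the input layout are hypothetical.

```python
from collections import Counter


def intersection_sizes(gene_sets):
    """Count gene clusters by the exact combination of strains that carry them.

    gene_sets: dict mapping strain name -> set of gene-cluster identifiers.
    Returns a Counter mapping frozenset(strain names) -> number of clusters.
    """
    membership = {}
    for strain, clusters in gene_sets.items():
        for cluster in clusters:
            membership.setdefault(cluster, set()).add(strain)
    return Counter(frozenset(strains) for strains in membership.values())


if __name__ == "__main__":
    # Toy data, not taken from the poster.
    toy = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3}}
    for strains, size in intersection_sizes(toy).most_common():
        print(sorted(strains), size)
```

Running the same function once on the sets of conserved genes and once on the sets of essential genes gives the two sets of intersection sizes being compared.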
[Figure: Conservation. Bar chart of intersection size (number of genes) across genomes. Intersection sizes, largest first: 1909, 779, 471, 261, 208, 165, 121, 97, 93, 93, 84, 82, 78, 77, 74, 64, 61, 61, 58, 56. Genomes: Escherichia coli K-12 MG1655, Salmonella Typhi Ty2, Escherichia coli ETEC H10407, Salmonella Typhimurium A130, Escherichia coli ETEC CS17, Salmonella Typhimurium D23580, Escherichia coli UPEC ST131, Klebsiella pneumoniae RH201207, Klebsiella pneumoniae Ecl8, Salmonella Enteritidis P125109, Salmonella Typhimurium SL1344, Enterobacter cloacae NCTC 9394, Citrobacter rodentium ICC168.]
[Figure: Essentiality. Bar chart of intersection size (number of genes) across genomes; the two largest intersections contain 180 and 124 genes. Genomes: Escherichia coli K-12 MG1655, Salmonella Typhi Ty2, Escherichia coli ETEC H10407, Salmonella Typhimurium A130, Escherichia coli ETEC CS17, Salmonella Typhimurium D23580, Escherichia coli UPEC ST131, Klebsiella pneumoniae RH201207, Klebsiella pneumoniae Ecl8, Salmonella Enteritidis P125109, Salmonella Typhimurium SL1344, Enterobacter cloacae NCTC 9394, Citrobacter rodentium ICC168.]
Is the conservation of essentiality consistent with the species tree?
We have divided the genes in our 12 strains into 3 groups: genus-specific genes, genes with one copy per genome, and genes with multiple copies per genome. The study of essentiality in these groups shows that most of the essential genes are present as a single copy per genome and most of the beneficial losses are genus specific.
We have performed a pathway enrichment analysis on different groups of genes in Salmonella Typhi using KOBAS 2.0. The results indicate that essential genes are mostly involved in essential pathways such as replication and translation; the enrichment of the pathways related to non-essential genes is not statistically significant; and the beneficial losses are mostly involved in pathways that are not needed in nutrient-rich broth.
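The idea behind a pathway enrichment analysis can be illustrated with a generic over-representation test. This sketch is not KOBAS 2.0 and does not reproduce its statistics or its multiple-testing correction. It assumes a one-sided hypergeometric test, and the function name and argument names are hypothetical.

```python
from math import comb, log10


def hypergeom_pvalue(k, K, n, N):
    """Probability of seeing at least k pathway genes in a gene group.

    N: genes in the genome, K: genes annotated to the pathway,
    n: genes in the group of interest (for example essential genes),
    k: genes in the group that are annotated to the pathway.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Toy numbers, not taken from the poster.
    p = hypergeom_pvalue(k=2, K=4, n=3, N=10)
    print(p, -log10(p))
```

The value `-log10(p)` is the quantity shown on the horizontal axis of the pathway panels; a pathway with a small p-value appears as a long bar.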
[Figure: frequency histograms of insertion index (0 to 4) for essential, non-essential and beneficial-loss genes, in four panels: All clusters (n = 6550), Genus specific (n = 2884), Single copy (n = 2742), Multiple copy (n = 924).]
[Figure: pathway enrichment, shown as −log10(P-value) per pathway, in three panels.
Essential: Protein export; DNA replication; Homologous recombination; Terpenoid backbone biosynthesis; Ribosome.
Non-essential: Flagellar assembly; Microbial metabolism in diverse environments; Phosphotransferase system (PTS); Sulfur metabolism; Two-component system.
Beneficial losses: Phosphotransferase system (PTS); Lipopolysaccharide biosynthesis; Bacterial invasion of epithelial cells; Salmonella infection; Bacterial secretion system.]
Are essential genes more likely to be conserved?
• The 5’ and 3’ ends of genes have a different tolerance for insertions compared to the internal region in transposon-directed insertion-site sequencing.
• The number of transposons inserted into a gene is related to the distance of the gene from the origin of replication.
• The transposons are not biased towards certain motifs or G-C content of the gene.
• The conservation of essentiality is not consistent with the species tree.
• Essential genes are more likely to be conserved.
Conclusions
Contact
Fatemeh Ashari Ghomi [email protected]