Vol 6 | June 2008

Presenter: Constantin Bode

extra information in the notepad!

Classifying bacteria

- in the 1970s: DNA-DNA hybridization was introduced
  - isolates that showed >70% DNA homology were considered to belong to the same species

- 16S ribosomal RNA
  - ubiquitous in bacterial and archaeal genomes
  - agrees to 98% with the 70% cut-off method
  - high-throughput method

- with a few drawbacks:
  - you cannot distinguish between some phenotypically distinct species (e.g. Bacillus thuringiensis and B. anthracis)
  - organisms where the 'universal' primers do not fit are not detected
  - poor method for resolving sub-populations within species

Classifying bacteria

- Multilocus enzyme electrophoresis (MLEE)
  - classifies bacteria on the basis of the isoforms of a combination of approx. 15 metabolic enzymes
  - drawback: low throughput (intensive laboratory work)
  - not widely used

- Multilocus sequence typing (MLST)
  - based on the partial sequences of 7 housekeeping genes of approx. 450 bp each
  - high throughput
  - allows direct comparison between different laboratories -> Database (MLST Public Repository)
  - drawback: for some species there is too little sequence variation in the housekeeping genes for sufficient discrimination

- drawback for 16S rRNA and MLST: only limited genome coverage
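The MLST idea of turning seven gene fragments into a comparable type can be sketched as follows. This is a minimal illustration only: the locus names, the toy sequences and the numbering rule (next free integer) are assumptions made for the example and do not reflect any real MLST scheme or the MLST Public Repository.

```python
from typing import Dict, Tuple

# Hypothetical locus names, invented for illustration only.
LOCI = ("locusA", "locusB", "locusC", "locusD", "locusE", "locusF", "locusG")


def allelic_profile(fragments: Dict[str, str],
                    known_alleles: Dict[str, Dict[str, int]]) -> Tuple[int, ...]:
    """Translate the sequenced fragment of each locus into an allele number."""
    profile = []
    for locus in LOCI:
        alleles = known_alleles.setdefault(locus, {})
        seq = fragments[locus].upper()
        # A sequence that was not seen before gets the next free allele number.
        if seq not in alleles:
            alleles[seq] = len(alleles) + 1
        profile.append(alleles[seq])
    return tuple(profile)


def sequence_type(profile: Tuple[int, ...],
                  known_sts: Dict[Tuple[int, ...], int]) -> int:
    """Map a seven-number allelic profile to a sequence type (ST)."""
    if profile not in known_sts:
        known_sts[profile] = len(known_sts) + 1
    return known_sts[profile]


# Toy usage: two isolates that differ in a single locus get different STs.
alleles_db: Dict[str, Dict[str, int]] = {}
st_db: Dict[Tuple[int, ...], int] = {}
isolate_1 = {locus: "ACGT" for locus in LOCI}
isolate_2 = dict(isolate_1, locusC="ACGA")
for isolate in (isolate_1, isolate_2):
    profile = allelic_profile(isolate, alleles_db)
    print(profile, "-> ST", sequence_type(profile, st_db))
```

Because both tables are shared, two laboratories that use the same tables arrive at the same ST for the same isolate, which is the point of the "direct comparison" bullet.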

Classifying bacteria

- Single-nucleotide polymorphisms (SNPs)
  - originally developed for use in humans
  - analysis of single genes
  - e.g. reconstructing the evolutionary history of Salmonella Typhi (82 SNPs)
  - potential for more general use in bacterial population genetics is still unproven
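As a rough illustration of what an SNP comparison does, the sketch below lists the positions at which two already aligned sequences differ by one base. It assumes that the alignment was done beforehand and that gaps or ambiguous bases should be ignored. The function name and the toy sequences are made up for this example.

```python
from typing import List


def snp_positions(reference: str, isolate: str) -> List[int]:
    """Return 0-based positions where two aligned sequences differ by one base."""
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    positions = []
    for i, (ref_base, iso_base) in enumerate(zip(reference.upper(), isolate.upper())):
        # Only count clear base substitutions; skip gaps and ambiguity codes.
        if ref_base in "ACGT" and iso_base in "ACGT" and ref_base != iso_base:
            positions.append(i)
    return positions


# Toy usage with two short, invented sequences.
print(snp_positions("ACGTACGT", "ACGAACGC"))
```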

Genomic coverage of genetic typing methods

Core genome: encodes proteins that are involved in essential functions (replication, transcription and translation)

Dispensable genome: encodes proteins that facilitate organismal adaptation

Neisseria meningitidis:
- 16S rRNA: 0.07%
- MLST: 0.2%

Salmonella Typhi:
- SNP: 2%
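The coverage percentages are simple ratios of typed sequence length to genome length. The sketch below redoes this arithmetic under stated assumptions: a genome of roughly 2.2 Mb for N. meningitidis and a 16S rRNA gene of roughly 1,500 bp are assumed values that do not appear on the slides, and the 500 bp fragment length is the upper end of the 450-500 bp range given in the definitions.

```python
def coverage_percent(typed_bp: int, genome_bp: int) -> float:
    """Share of the genome (in percent) that the typed sequence covers."""
    return 100.0 * typed_bp / genome_bp


# Assumed values, not taken from the slides:
GENOME_BP = 2_200_000     # rough size of a N. meningitidis genome
RRNA_16S_BP = 1_500       # rough length of one 16S rRNA gene
# Values from the slides:
MLST_LOCI = 7             # number of housekeeping genes
MLST_FRAGMENT_BP = 500    # upper end of the 450-500 bp fragment range

print(f"16S rRNA: {coverage_percent(RRNA_16S_BP, GENOME_BP):.2f}%")
print(f"MLST:     {coverage_percent(MLST_LOCI * MLST_FRAGMENT_BP, GENOME_BP):.2f}%")
```

With these assumed inputs the results are of the same order as the slide values, which shows how little of the genome both methods actually look at.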

Genetic markers and deviations from population structure

Schematic representation of different resolution levels

eMLST: extended MLST

ST: sequence types
ET: electrophoretic types

Taxonomic rank

Rank               Human        E. coli
Domain             Eukarya      Bacteria
Kingdom            Animalia     Monera
Division           Chordata     Proteobacteria
Subdivision        Vertebrata
Class              Mammalia     Gammaproteobacteria
Subclass           Theria
Order              Primates     Enterobacteriales
Suborder           Haplorrhini
Family             Hominidae    Enterobacteriaceae
Subfamily          Homininae
Genus              Homo         Escherichia
Species            H. sapiens   E. coli
Subspecies/strain  sapiens      O157:H7

modified after: http://en.wikipedia.org/wiki/Taxonomic_rank

sequencing technologies

Frederick Sanger developed the Sanger chain-termination method in the late 1970s (Nobel Prize)

Post-Sanger sequencing technologies

Post-Sanger sequencing technologies

Roche company

available since 2005
poor resolution of homopolymer DNA segments (multiple copies of a single base)
400 million high-quality bases per 10-hour instrument run
since Oct 2008: reads of 400 base pairs in length

reads: 250 bases

http://www.454.com/about-454/index.asp
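To make the homopolymer problem concrete, the sketch below finds stretches of a single repeated base in a sequence, i.e. the segments that are hard to resolve. The minimum run length of 4 is an arbitrary assumption for the example and is not a property of the instrument.

```python
import re
from typing import List, Tuple


def homopolymer_runs(seq: str, min_len: int = 4) -> List[Tuple[int, str]]:
    """Return (start position, run) for every run of one base of at least min_len."""
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]


# Toy usage with an invented sequence.
print(homopolymer_runs("ACGTTTTTAGGGGC"))
```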

Post-Sanger sequencing technologies

available since 2007

reads: approx. 25 bases

http://www3.appliedbiosystems.com/AB_Home/applicationstechnologies/SOLiDSystemSequencing/OverviewofSOLiDSequencingChemistry/index.htm

glass slide

Post-Sanger sequencing technologies

Illumina Inc.

available since 2006
since June 2009: Full Genome Sequencing Service for $48,000 per genome (first commercial personal genome sequence)

reads: approx. 40 bases

http://www.illumina.com

Molecular evolutionary mechanisms that shape bacterial species diversity

genetic information of a bacterial species – pan genome

intra-species inter-species

population dynamics

Metagenomics (environmental genomics or community genomics)
ability to capture genomic diversity within a natural population

Pan genome

Pan genome
- core genome: shared by all strains
- dispensable genes: shared by some but not all isolates
- strain-specific genes: unique to each isolate
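The three gene classes above can be expressed with plain set operations. The following is a minimal sketch, assuming that each strain is already described by a set of gene identifiers. The strain and gene names are invented for the example.

```python
from typing import Dict, Set


def partition_pan_genome(strains: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split the gene repertoire of several strains into the three classes."""
    if not strains:
        raise ValueError("at least one strain is required")
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in all strains
    # Count in how many strains each gene occurs.
    counts = {gene: sum(gene in genes for genes in gene_sets) for gene in pan}
    strain_specific = {gene for gene, n in counts.items() if n == 1} - core
    dispensable = pan - core - strain_specific
    return {
        "pan": pan,
        "core": core,
        "dispensable": dispensable,
        "strain_specific": strain_specific,
    }


# Toy usage with three invented strains.
toy = {
    "strain1": {"a", "b", "c"},
    "strain2": {"a", "b", "d"},
    "strain3": {"a", "c", "e"},
}
for name, genes in partition_pan_genome(toy).items():
    print(name, sorted(genes))
```

Each newly sequenced strain can only shrink the core set and grow the pan set, which is why a single genome cannot describe the species.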

analysis of 17 Streptococcus pneumoniae genomes

core genome of 1,454 genes
pan genome of approx. 5,000 genes

142 genomes would need to be sequenced to obtain the complete S. pneumoniae pan genome (just an assumption)
-> it is not possible to characterize a species from a single genome sequence

pan genome reflects the selective pressure to generate new adaptive combinations

Diversity generating mechanism

Evading or avoiding an immune system
Colonizing a highly variable environment

requires diversity!

- exchange of DNA
- modified clonal growth

Simplest mechanism: random change in length (during DNA replication)

Diversity generating mechanism

Simplest mechanism: random change in length (during DNA replication)

Campylobacter jejuni expresses different subsets of surface proteins

(Helicobacter pylori also uses this mechanism of phase variation)
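How a random length change can switch a gene on and off may be sketched with a toy model. It assumes that the changing element is a short repeat tract inside a coding sequence, so that the gene is only read in frame when the tract length differs from the starting length by a multiple of three. The slip probability, the tract length and all function names are hypothetical values chosen for the example.

```python
import random


def replicate_tract(length: int, slip_prob: float, rng: random.Random) -> int:
    """One replication round: the tract may gain or lose one base."""
    if rng.random() < slip_prob:
        length += rng.choice((-1, 1))
    return max(length, 1)  # keep at least one base in the tract


def gene_is_on(length: int, in_frame_length: int) -> bool:
    """The gene stays in frame if the length changed by a multiple of three."""
    return (length - in_frame_length) % 3 == 0


def fraction_on(n_cells: int, generations: int, start_length: int,
                slip_prob: float, seed: int = 0) -> float:
    """Fraction of cells that still express the gene after some generations."""
    rng = random.Random(seed)
    lengths = [start_length] * n_cells
    for _ in range(generations):
        lengths = [replicate_tract(n, slip_prob, rng) for n in lengths]
    return sum(gene_is_on(n, start_length) for n in lengths) / n_cells


# Toy usage: a clonal population becomes a mix of ON and OFF cells.
print(fraction_on(n_cells=1000, generations=50, start_length=9, slip_prob=0.05))
```

The point of the sketch is that one clone produces a mixed population without any exchange of DNA, so some cells always show a different set of surface proteins.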

Diversity generating mechanism

DNA inversion

A small section of DNA is inverted by a cleavage-and-ligation reaction that is mediated by a site-specific recombinase
-> can control the expression of entire operons

invertible promoters of Bacteroides fragilis
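The effect of an inversion on a promoter can be illustrated on a string. The sketch below replaces a segment with its reverse complement, which is what an inversion does to the strand that is read. The promoter motif, the toy sequence and the coordinates are invented for the example and are not B. fragilis sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def invert_segment(genome: str, start: int, end: int) -> str:
    """Invert genome[start:end] in place, as a recombinase would."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


def promoter_is_on(genome: str, motif: str) -> bool:
    """Toy rule: the promoter is ON if the motif reads in the forward direction."""
    return motif in genome


# Toy usage with a hypothetical promoter motif in an invertible segment.
MOTIF = "TTGACA"
genome_on = "AAAA" + MOTIF + "CCCC"
genome_off = invert_segment(genome_on, 4, 10)
print(genome_off, promoter_is_on(genome_off, MOTIF))
print(invert_segment(genome_off, 4, 10) == genome_on)
```

A second inversion of the same segment restores the original orientation, so the switch is reversible.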

Diversity generating mechanism

DNA inversion

exchange of parts of the coding sequences of expressed genes with sequences from silent cassettes
-> random expression of different alternative proteins

Bacteroides fragilis

S: specificity
M: methylation
R: restriction

Diversity generating mechanism

single-base point mutations
DNA recombination

To see this diversity you need variation between shotgun sequences of the metagenome of a single organism.
It can easily be overlooked in clonal cultures.

Estimation: >99% of the bacteria in the environment cannot be cultured in the laboratory
-> de novo sequencing is required

Applications of the genomic era

reverse vaccinology -> develop vaccines

Rappuoli R. Current Opinion in Microbiology 2000, 3:445-450

Questions

Thanks for your attention!

Definitions

Genome: The entire hereditary information of an organism that is encoded by its DNA (or RNA for some viruses).

Bacterial typing: A procedure for identifying types and strains of bacteria.

Metagenome: The global genetic repertoire of an environmental niche that is constituted by diverse organisms such as free-living microorganisms in the wild or the commensals of a particular niche in a mammalian host.

16S ribosomal RNA: 16S ribosomal RNA is a component of the small ribosomal subunit of bacteria and archaea. Its gene includes hypervariable regions that contain species-specific signature sequences, which are useful for bacterial and archaeal identification at the species level.

MLEE: The characterization of bacterial species by the relative electrophoretic mobility of approximately 15 cellular metabolic enzymes.

MLST: An unambiguous procedure for characterizing isolates of bacterial species using the sequences of internal fragments of (usually) seven housekeeping genes. Internal fragments of approximately 450-500 bp of each gene are used, as these can be accurately sequenced on both strands using an automated DNA sequencer.

Pan-genome: The global gene repertoire of a bacterial species that comprises the sum of the core and the dispensable genome.

SNP: DNA sequence variation that occurs when a single nucleotide in the genome differs between members of a species.

Core genome: The pool of genes that is shared by all the strains of the same bacterial species.

Lateral gene transfer: The mechanism by which an individual of one species transfers genetic material (that is, DNA) to an individual of a different species.

Metagenomics: The study of the genomic repertoire of all the organisms that live in a particular environment and their activities as a collective. The genomic analysis is applied to entire communities of microorganisms, which bypasses the need to isolate and culture individual microbial species.

Reverse vaccinology: A genomic approach to vaccine development that searches the entire genetic repertoire of a pathogen for protective antigens.