A Phylogeny-Driven Genomic Encyclopedia of Bacteria and Archaea Jonathan A. Eisen Talk for ASBMB April 25, 2010 Sunday, April 25, 2010

Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project


Talk at ASBMB.


Page 1: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

A Phylogeny-Driven Genomic Encyclopedia of Bacteria and Archaea

Jonathan A. Eisen

Talk for ASBMB, April 25, 2010

Sunday, April 25, 2010

Page 2: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Eisen Lab - Phylogenomics of Novelty

Origin of New Functions and Processes

• New genes

• Changes in old genes

• Changes in pathways

Species Evolution

• Phylogenetic history

• Vertical vs. horizontal descent

• Needed to track gain/loss of processes, infer convergence

Genome Dynamics

• Evolvability

• Repair and recombination processes

• Intragenomic variation

Page 3: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Bacteria evolve

Page 4: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Fleischmann et al. 1995 (TIGR)

Page 5: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Microbial genomes

From http://genomesonline.org

Page 6: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project


Page 7: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

rRNA Tree of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Page 8: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

The Tree is not Happy

Figure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Page 9: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla. Labels: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter]

• At least 40 phyla of bacteria

Based on Hugenholtz, 2002

Page 10: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

Based on Hugenholtz, 2002

Page 11: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

Based on Hugenholtz, 2002

Page 12: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Archaea

Based on Hugenholtz, 2002

Page 13: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Eukaryotes

Based on Hugenholtz, 2002

Page 14: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Filling in the Genomic Phylogenetic Gaps

• Common approach within some eukaryotic groups

• Many small projects funded to fill in some bacterial or archaeal gaps

• Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature


Page 15: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Solution I: sequence more phyla

• NSF-funded Tree of Life Project

• A genome from each of eight phyla

Eisen & Ward, PIs

Page 16: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

The Tree of Life is Still Angry

Figure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Eukaryotes

Bacteria

Archaea

Page 17: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Major Lineages of Actinobacteria

2.5.1 Acidimicrobidae: 2.5.1.1 Unclassified; 2.5.1.2 "Microthrixineae"; 2.5.1.3 Acidimicrobineae; 2.5.1.4 BD2-10; 2.5.1.5 EB1017

2.5.2 Actinobacteridae: 2.5.2.1 Unclassified; 2.5.2.10 Ellin306/WR160; 2.5.2.11 Ellin5012; 2.5.2.12 Ellin5034; 2.5.2.13 Frankineae; 2.5.2.14 Glycomyces; 2.5.2.15 Intrasporangiaceae; 2.5.2.16 Kineosporiaceae; 2.5.2.17 Microbacteriaceae; 2.5.2.18 Micrococcaceae; 2.5.2.19 Micromonosporaceae; 2.5.2.2 Actinomyces; 2.5.2.20 Propionibacterineae; 2.5.2.21 Pseudonocardiaceae; 2.5.2.22 Streptomycineae; 2.5.2.23 Streptosporangineae; 2.5.2.3 Actinomycineae; 2.5.2.4 Actinosynnemataceae; 2.5.2.5 Bifidobacteriaceae; 2.5.2.6 Brevibacteriaceae; 2.5.2.7 Cellulomonadaceae; 2.5.2.8 Corynebacterineae; 2.5.2.9 Dermabacteraceae

2.5.3 Coriobacteridae: 2.5.3.1 Unclassified; 2.5.3.2 Atopobiales; 2.5.3.3 Coriobacteriales; 2.5.3.4 Eggerthellales

2.5.4 OPB41

2.5.5 PK1

2.5.6 Rubrobacteridae: 2.5.6.1 Unclassified; 2.5.6.2 "Thermoleiphilaceae"; 2.5.6.3 MC47; 2.5.6.4 Rubrobacteraceae

2.5 Actinobacteria (expanded)

2.5.1 Acidimicrobidae: 2.5.1.1 Unclassified; 2.5.1.2 "Microthrixineae"; 2.5.1.3 Acidimicrobineae (2.5.1.3.1 Unclassified; 2.5.1.3.2 Acidimicrobiaceae); 2.5.1.4 BD2-10; 2.5.1.5 EB1017

2.5.2 Actinobacteridae: 2.5.2.1 Unclassified; 2.5.2.10 Ellin306/WR160; 2.5.2.11 Ellin5012; 2.5.2.12 Ellin5034; 2.5.2.13 Frankineae (2.5.2.13.1 Unclassified; 2.5.2.13.2 Acidothermaceae; 2.5.2.13.3 Ellin6090; 2.5.2.13.4 Frankiaceae; 2.5.2.13.5 Geodermatophilaceae; 2.5.2.13.6 Microsphaeraceae; 2.5.2.13.7 Sporichthyaceae); 2.5.2.14 Glycomyces; 2.5.2.15 Intrasporangiaceae (2.5.2.15.1 Unclassified; 2.5.2.15.2 Dermacoccus; 2.5.2.15.3 Intrasporangiaceae); 2.5.2.16 Kineosporiaceae; 2.5.2.17 Microbacteriaceae (2.5.2.17.1 Unclassified; 2.5.2.17.2 Agrococcus; 2.5.2.17.3 Agromyces); 2.5.2.18 Micrococcaceae; 2.5.2.19 Micromonosporaceae; 2.5.2.2 Actinomyces; 2.5.2.20 Propionibacterineae (2.5.2.20.1 Unclassified; 2.5.2.20.2 Kribbella; 2.5.2.20.3 Nocardioidaceae; 2.5.2.20.4 Propionibacteriaceae); 2.5.2.21 Pseudonocardiaceae; 2.5.2.22 Streptomycineae (2.5.2.22.1 Unclassified; 2.5.2.22.2 Kitasatospora; 2.5.2.22.3 Streptacidiphilus); 2.5.2.23 Streptosporangineae (2.5.2.23.1 Unclassified; 2.5.2.23.2 Ellin5129; 2.5.2.23.3 Nocardiopsaceae; 2.5.2.23.4 Streptosporangiaceae; 2.5.2.23.5 Thermomonosporaceae); 2.5.2.3 Actinomycineae; 2.5.2.4 Actinosynnemataceae; 2.5.2.5 Bifidobacteriaceae; 2.5.2.6 Brevibacteriaceae; 2.5.2.7 Cellulomonadaceae; 2.5.2.8 Corynebacterineae (2.5.2.8.1 Unclassified; 2.5.2.8.2 Corynebacteriaceae; 2.5.2.8.3 Dietziaceae; 2.5.2.8.4 Gordoniaceae; 2.5.2.8.5 Mycobacteriaceae; 2.5.2.8.6 Rhodococcus; 2.5.2.8.7 Rhodococcus; 2.5.2.8.8 Rhodococcus); 2.5.2.9 Dermabacteraceae (2.5.2.9.1 Unclassified; 2.5.2.9.2 Brachybacterium; 2.5.2.9.3 Dermabacter)

2.5.3 Coriobacteridae: 2.5.3.1 Unclassified; 2.5.3.2 Atopobiales; 2.5.3.3 Coriobacteriales; 2.5.3.4 Eggerthellales

2.5.4 OPB41

2.5.5 PK1

2.5.6 Rubrobacteridae: 2.5.6.1 Unclassified; 2.5.6.2 "Thermoleiphilaceae" (2.5.6.2.1 Unclassified; 2.5.6.2.2 Conexibacter; 2.5.6.2.3 XGE514); 2.5.6.3 MC47; 2.5.6.4 Rubrobacteraceae

Page 18: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

From http://genomesonline.org

Microbial genomes 2010- ...


Page 19: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9. Legend: well sampled phyla]

• At least 100 phyla of bacteria

• Genome sequences are mostly from three phyla

• Most phyla with cultured species are sparsely sampled

• Lineages with no cultured taxa even more poorly sampled

• Solution - use tree to really fill gaps

Page 20: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Why Increase Phylogenetic Coverage?

• Gene discovery

• Annotation, functional prediction

• Metagenomic analysis

• Mechanisms of diversification

• Species phylogeny and classification

Page 21: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

A Genomic Encyclopedia of Bacteria and Archaea (GEBA)


Page 22: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Pilot Project Overview

• Identify major branches in rRNA tree for which no genomes are available

• Identify a cultured representative for each group

• Grow > 200 of these and prep. DNA

• Sequence and finish 100

• Annotate, analyze, release data

• Assess benefits of tree guided sequencing

• Paper published in Nature Dec. 2009.
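The first steps above (find branches of the rRNA tree with no genomes, then pick a representative for each) can be illustrated with a small sketch. It assumes a greedy rule that repeatedly picks the candidate adding the most uncovered branch length; the slides do not state the exact selection rule used for GEBA, so this rule, the toy tree `TOY_TREE`, and the function names are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical toy tree, not GEBA data: node -> (parent, branch length to parent).
Tree = Dict[str, Tuple[Optional[str], float]]

TOY_TREE: Tree = {
    "root": (None, 0.0),
    "A": ("root", 1.0),
    "B": ("root", 1.0),
    "a1": ("A", 0.1),
    "a2": ("A", 0.1),
    "b1": ("B", 0.5),
    "b2": ("B", 2.0),
}


def branches_to_root(tree: Tree, leaf: str) -> List[str]:
    """Branches (named by their child node) on the path from a leaf to the root."""
    path = []
    node = leaf
    while tree[node][0] is not None:
        path.append(node)
        node = tree[node][0]
    return path


def pd_gain(tree: Tree, covered: Set[str], leaf: str) -> float:
    """Branch length a leaf would add beyond the branches already covered."""
    return sum(tree[b][1] for b in branches_to_root(tree, leaf) if b not in covered)


def pick_targets(tree: Tree, sequenced: Iterable[str],
                 candidates: Iterable[str], k: int) -> List[str]:
    """Greedily pick up to k candidates, each maximizing the added branch length."""
    covered: Set[str] = set()
    for leaf in sequenced:
        covered.update(branches_to_root(tree, leaf))
    remaining = set(candidates) - set(sequenced)
    chosen: List[str] = []
    for _ in range(min(k, len(remaining))):
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=lambda leaf: pd_gain(tree, covered, leaf))
        chosen.append(best)
        covered.update(branches_to_root(tree, best))
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    # With a1 already sequenced, the long unsampled branch b2 is picked first, then b1.
    print(pick_targets(TOY_TREE, ["a1"], ["a2", "b1", "b2"], k=2))
```

In this toy case a second genome from the already sampled clade A adds little branch length, which is the intuition behind tree-guided selection.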


Page 23: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Pilot Target List

[Bar chart: # of genomes (y-axis, 0 to 35) per phylum (x-axis). Bacteria (B): Actinobacteria (High GC), Aminanaerobia, Aquificae, Bacteroidetes, Chloroflexi, Deferribacteres (two bars), Deinococci, Delta Proteobacteria, Epsilon Proteobacteria, Firmicutes, Fusobacteria, Gamma Proteobacteria, Gemmatimonadetes, Haloanaerobiales, Planctomycetes, Spirochaetes, Thermodesulfobacteria, Thermodesulfobia, Thermovenabulae. Archaea (A): Halobacteria, Archaeoglobi, Methanobacteria, Methanomicrobia, Thermococci, Thermoprotei]

Page 24: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA and Openness

• All data being released as quickly as possible with no restrictions to IMG-GEBA, GenBank, etc.

• Data also available in Biotorrents (http://biotorrents.net)

• Individual genome reports being published in new Open Access journal “Standards in Genomic Sciences (SIGS)”

• Main GEBA paper in Nature freely available and published using Creative Commons License

Page 25: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Assess Benefits of GEBA

• All genomes have some value

• But what, if any, is the benefit of tree-guided sequencing over other selection methods?

• Lessons for other large scale microbial genome projects?

Page 26: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Lesson 1

rRNA Tree is Useful for Identifying Phylogenetically Novel Genomes

rRNA Tree topology is not perfect; Genome-based trees better

Page 27: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

rRNA Tree of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Page 28: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project


Page 29: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Network of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Page 30: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Whole genome tree built using AMPHORA by Martin Wu and Dongying Wu

Page 32: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Proteobacteria


Page 33: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Lesson 2

Phylogenetically-guided genome selection improves genome annotation

Page 34: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Predicting Function

• Key step in genome projects

• More accurate predictions help guide experimental and computational analyses

• Many diverse approaches

• Comparative and evolutionary analysis greatly improves most predictions

Page 35: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling

• Better definition of protein family sequence “patterns”

• Conversion of hypotheticals into conserved hypotheticals

• Greatly improves “comparative” and “evolutionary” based predictions

• Linking distantly related members of protein families

• Improved non-homology prediction


Page 37: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Lesson 3

Improves analysis of genome data from uncultured organisms


Page 38: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Metagenomics

shotgun

clone


Page 39: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project


Page 40: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

rRNA phylotyping from metagenomics

Venter et al., 2004


Page 41: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Shotgun Sequencing Allows Use of Alternative Anchors (e.g., RecA)

Venter et al., 2004


Page 42: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Shotgun Sequencing Allows Use of Other Markers

[Bar chart: Sargasso phylotypes. Y-axis: weighted % of clones (0 to 0.5). X-axis: major phylogenetic group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota). Series: EFG, EFTu, HSP70, RecA, RpoB, rRNA]

Venter et al., 2004

Page 43: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Shotgun Sequencing Allows Use of Other Markers

[Bar chart: Sargasso phylotypes. Y-axis: weighted % of clones (0 to 0.5). X-axis: major phylogenetic group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota). Series: EFG, EFTu, HSP70, RecA, RpoB, rRNA]

Venter et al., 2004

Cannot be done without good sampling of genomes

Page 44: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

ABCDEFG

TUVWXYZ

Binning challenge


Page 45: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

ABCDEFG

TUVWXYZ

Binning challenge

Best binning method: reference genomes


Page 46: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

ABCDEFG

TUVWXYZ

Binning challenge

No reference genome? What do you do?


Page 47: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

ABCDEFG

TUVWXYZ

Binning challenge

No reference genome? What do you do?

Phylogeny ....

Page 48: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Phylogenetic Binning Using AMPHORA

[Bar chart: y-axis 0 to 0.7. X-axis groups: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified Proteobacteria, Cyanobacteria, Chlamydiae, Acidobacteria, Bacteroidetes, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified Bacteria. Series (marker genes): dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf]

AMPHORA - each read on its own tree

Page 49: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Phylogenetic Binning Using AMPHORA

[Bar chart: y-axis 0 to 0.7. X-axis groups: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified Proteobacteria, Cyanobacteria, Chlamydiae, Acidobacteria, Bacteroidetes, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified Bacteria. Series (marker genes): dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf]

AMPHORA - each read on its own tree

Cannot be done without good sampling of genomes
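The summary step behind a chart like this one can be sketched as follows. This is only an illustration: it assumes each read has already been placed on its marker's tree and assigned to a phylogenetic group by some upstream tool, and it covers the tallying only, not the placement itself. The function name `bin_fractions` and the toy assignments are hypothetical, not AMPHORA's interface or real data.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical per-read results: (marker gene, assigned phylogenetic group).
TOY_ASSIGNMENTS = [
    ("rpoB", "Alphaproteobacteria"),
    ("rpoB", "Alphaproteobacteria"),
    ("rpoB", "Cyanobacteria"),
    ("rpoB", "Cyanobacteria"),
    ("rplA", "Alphaproteobacteria"),
]


def bin_fractions(assignments: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """For each marker, return the fraction of its reads assigned to each group."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for marker, group in assignments:
        counts[marker][group] += 1
    fractions: Dict[str, Dict[str, float]] = {}
    for marker, per_group in counts.items():
        total = sum(per_group.values())
        fractions[marker] = {group: n / total for group, n in per_group.items()}
    return fractions


if __name__ == "__main__":
    for marker, per_group in bin_fractions(TOY_ASSIGNMENTS).items():
        print(marker, per_group)
```

Comparing the per-marker fractions side by side is what lets disagreement between markers, or a large "unclassified" share caused by missing reference genomes, show up.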


Page 50: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Phylogenomic Lesson 5

We have still only scratched the surface of microbial diversity


Page 51: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Protein Family Rarefaction Curves

• Take data set of multiple complete genomes

• Identify all protein families using MCL

• Plot # of genomes vs. # of protein families
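The plotting step of a protein family rarefaction curve can be sketched as below. The sketch assumes the MCL clustering has already been run, so each genome is represented by the set of protein family identifiers it contains. Averaging over random genome orderings is a common way to smooth such curves and is an assumption here, not something the slides specify; the names and toy data are hypothetical.

```python
import random
from typing import Dict, List, Set

# Hypothetical toy input: genome -> protein family IDs found in it
# (in practice these would come from a clustering step such as MCL).
TOY_FAMILIES: Dict[str, Set[str]] = {
    "genome1": {"fam1", "fam2"},
    "genome2": {"fam2", "fam3"},
    "genome3": {"fam1", "fam2", "fam3", "fam4"},
}


def rarefaction_curve(families_by_genome: Dict[str, Set[str]],
                      n_permutations: int = 100,
                      seed: int = 0) -> List[float]:
    """Mean number of distinct families after adding 1, 2, ..., N genomes.

    Genomes are added in random order; the counts are averaged over
    n_permutations random orderings.
    """
    rng = random.Random(seed)
    genomes = sorted(families_by_genome)
    totals = [0.0] * len(genomes)
    for _ in range(n_permutations):
        order = genomes[:]
        rng.shuffle(order)
        seen: Set[str] = set()
        for i, genome in enumerate(order):
            seen |= families_by_genome[genome]
            totals[i] += len(seen)
    return [t / n_permutations for t in totals]


if __name__ == "__main__":
    curve = rarefaction_curve(TOY_FAMILIES)
    for n_genomes, n_families in enumerate(curve, start=1):
        print(n_genomes, round(n_families, 2))
```

A curve that is still climbing at the last genome suggests that more protein families remain to be found, while a flattening curve suggests the sampled group is close to saturated.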

Page 52: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Page 53: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Page 54: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Page 55: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Page 56: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Page 57: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Motility

Mobile Element?

Page 58: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Phylogenetic Distribution Novelty: Bacterial Actin Related Protein

Haliangium ochraceum DSM 14365

Patrik D’haeseleer, Adam Zemla, Victor Kunin

[Figure: phylogenetic tree for the bacterial actin related protein]

See also Guljamow et al. 2007 Current Biology.

Page 59: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

rRNA Tree of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Page 60: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Phylogenetic Diversity: Sequenced Bacteria & Archaea

From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html


Page 61: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Phylogenetic Diversity with GEBA

From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html


Page 63: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Phylogenetic Diversity: All

From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html


Page 64: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9. Legend: well sampled phyla; poorly sampled; no cultured taxa]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Most phyla with cultured species are sparsely sampled

• Lineages with no cultured taxa even more poorly sampled

Page 65: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

Uncultured Lineages: Technical Approaches

• Get into culture

• Enrichment cultures

• If abundant in low diversity ecosystems

• Flow sorting

• Microbeads

• Microfluidic sorting

• Single cell amplification

Page 66: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Phylogenomic Lesson 6

Need Experiments from Across the Tree of Life too


Page 67: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

As of 2002

Based on Hugenholtz, 2002

Page 68: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Experimental studies are mostly from three phyla

As of 2002

Based on Hugenholtz, 2002

Page 69: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Experimental studies are mostly from three phyla

• Some studies in other phyla

As of 2002

Based on Hugenholtz, 2002

Page 70: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Eukaryotes

As of 2002

Based on Hugenholtz, 2002

Page 71: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9]

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Viruses

As of 2002

Based on Hugenholtz, 2002

Page 72: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

[Figure: tree of bacterial phyla, with the same labels as on Page 9; scale bar 0.1]

Tree based on Hugenholtz (2002) with some modifications.

Need experimental studies from across the tree too

Page 73: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project


Page 74: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

MICROBES


Page 75: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

A Happy Tree of Life


Page 76: Talk by J. Eisen at ASBMB on "Phylogeny driven genomic encyclopedia" project

GEBA Pilot Project: Components

• Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)

• Project management (David Bruce, Eileen Dalin, Lynne Goodwin)

• Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)

• Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)

• Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.)

• Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)

• Adopt a microbe education project (Cheryl Kerfeld)

• Outreach (David Gilbert)

• $$$ (DOE, Eddy Rubin, Jim Bristow)