Networks in genomics and bioinformatics: from phylogeny to Twitter ISCB2012 July 12, 2012 Jonathan A. Eisen University of California, Davis @phylogenomics Friday, July 13, 12

Jonathan Eisen talk for #SCS2012 at #ISMB "Networks in genomics and bioinformatics: from phylogeny to Twitter"

DESCRIPTION

Talk as part of http://www.iscb.org/ismb2012-program/ismb2012-scs "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Citation preview

Page 1: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Networks in genomics and bioinformatics: from phylogeny to Twitter

ISCB2012
July 12, 2012

Jonathan A. Eisen
University of California, Davis

@phylogenomics

Friday, July 13, 12

Page 3: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

A meandering path and lessons “learned”

ISCB2012
July 12, 2012

Jonathan A. Eisen
University of California, Davis

@phylogenomics

Page 4: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Page 5: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Social Networking in Science

Page 6: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Bacteria evolve

Page 7: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Page 8: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Phylogenomics of Novelty

Page 17: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Phylogenomics of Novelty

Origin of New Functions and Processes
• New genes
• Changes in old genes
• Changes in pathways

Species Evolution
• Phylogenetic history
• Vertical vs. horizontal descent
• Needed to track gain/loss of processes, infer convergence

Genome Dynamics
• Evolvability
• Repair and recombination processes
• Intragenomic variation

Page 18: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Undergrad Lesson 1: Be prepared for random events

• Gould’s class b/c planned on not majoring in Biology
• RMBL via backpacking trip
• Geology library job w/ Nabokov collection b/c went to wrong building
• Discovering Colleen Cavanaugh’s lab via street encounter

Page 19: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Undergrad Lesson 2: Phylogeny Matters

• “MacClade”
• Phylogenetic ecology
• Phylotyping

Page 20: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Phylogeny Matters

Eisen et al. 1992

Page 21: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Grad school lesson I: find right people to work with

• Went to work on butterfly population biology and phylogeny

• Advisor and I did not see eye to eye
• Despite great subject for me (combined phylogeny, molecular evolution, RMBL, etc.), chose not to join lab
• Did many rotations …
• Picked final lab in part b/c advisor was right match

Page 22: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Grad school lesson II: never too late to change

• Wanted to combine DNA repair studies and molecular evolution
• I: Thymineless death
• II: Adaptive mutation
• III: Repair in archaea

Page 23: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Page 24: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Grad school lesson II: never too late to change

• Wanted to combine DNA repair studies and molecular evolution
• I: Thymineless death
• II: Adaptive mutation
• III: Repair in archaea
• IV: Bioinformatics and genome analysis …

Page 25: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Grad school lesson III: Get others to do your work

• Interested in RecA structure function relationships

• Using phylogeny to look for correlated substitutions in RecA structure, like done with rRNA

• But not enough sequences …

Page 26: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Page 27: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Shotgun Sequencing Allows Use of Alternative Anchors (e.g., RecA)

Venter et al., 2004

Page 28: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Grad school lesson IV: Stealing is good

• Phylogenetic perspective in bioinformatics missing

Page 29: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

“Nothing in biology makes sense except in the light of evolution.”

T. H. Dobzhansky (1973)

Page 30: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Evolutionary Perspective and Comparative Biology

• Comparative biology is the analysis of differences and similarities between species.

• An evolutionary perspective is useful in such studies because this allows one to focus not just on the levels and degrees of similarity or difference but on how and why similarities and differences came to be.

Page 31: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Phylogenomics

• Lots of sequences being produced with no functions associated with them

• Much debate in community about how to predict functions

Page 32: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Predicting Function

• Identification of motifs
• Homology/similarity based methods

• Highest hit
• Top hits
• Clusters of orthologous groups
• HMM models
• Structural threading and modeling
• Evolutionary reconstructions

Page 33: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Phylogeny Matters

Eisen et al. 1992

Page 34: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Evolutionary Functional Prediction

[Figure: gene trees for two worked examples (EXAMPLE A and EXAMPLE B), with gene labels 1-6 and 1A-3A/1B-3B, species labels Species 1-3, and nodes marked "Duplication?", "Duplication", and "Ambiguous"]

METHOD

1. CHOOSE GENE(S) OF INTEREST
2. IDENTIFY HOMOLOGS
3. ALIGN SEQUENCES
4. CALCULATE GENE TREE
5. OVERLAY KNOWN FUNCTIONS ONTO TREE
6. INFER LIKELY FUNCTION OF GENE(S) OF INTEREST

ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN)

Based on Eisen, 1998 Genome Res 8: 163-167.
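
The last two method steps on this slide (overlay known functions onto the tree, then infer the likely function of the gene of interest) can be illustrated with a small Python sketch. This is not the implementation from Eisen (1998). It assumes the gene tree is already computed and supplied as nested tuples, and it uses a simple rule chosen for illustration: take the smallest clade around the gene of interest that contains any characterized gene, and report "ambiguous" if those genes disagree. The helper names (`leaves`, `path_to`, `predict_function`), the tree, and the function labels are all hypothetical; only the gene labels are borrowed from the slide.

```python
def leaves(tree):
    """Return all leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def path_to(tree, query):
    """Return the clades from the root down to the query leaf, or None."""
    if isinstance(tree, str):
        return [tree] if tree == query else None
    for child in tree:
        sub_path = path_to(child, query)
        if sub_path is not None:
            return [tree] + sub_path
    return None


def predict_function(tree, known, query):
    """Infer a function for `query` from characterized genes in its clade."""
    path = path_to(tree, query)
    if path is None:
        raise ValueError(f"{query} is not in the tree")
    # Walk outward from the smallest clade that contains the query.
    for clade in reversed(path[:-1]):
        functions = {known[g] for g in leaves(clade)
                     if g in known and g != query}
        if len(functions) == 1:
            return functions.pop()
        if len(functions) > 1:
            return "ambiguous"
    return "unknown"


# Hypothetical gene tree: an old duplication produced paralogs A and B,
# and each paralog is present in species 1, 2, and 3.
gene_tree = ((("1A", "2A"), "3A"), (("1B", "2B"), "3B"))
# Hypothetical experimentally known functions overlaid on the tree.
known_functions = {"1A": "function X", "3A": "function X",
                   "1B": "function Y", "3B": "function Y"}

print(predict_function(gene_tree, known_functions, "2A"))
print(predict_function(gene_tree, known_functions, "2B"))
```

In this toy tree, `2A` receives the function of the other A-family genes and `2B` that of the B family, because the prediction follows tree position rather than the single most similar sequence.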

Page 35: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Similarity ≠ Relatedness

Page 36: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Evolutionary Rate Variation

Page 37: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Phylogenetic Prediction of Function

• Many powerful and automated similarity based methods for assigning genes to protein families
  • COGs
  • PFAM HMM searches

• Some limitations of similarity based methods can be overcome by phylogenetic approaches

• Automated methods now available
  • Sean Eddy
  • Steven Brenner
  • Kimmen Sjölander

• But …

Page 38: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Grad school lesson V: Teaching helps you learn

Page 39: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Grad school lesson VI: There are no career rules

Page 40: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Career Lesson I: Build on what you know

• Phylogenetic approaches to genomics
• Genomics of endosymbionts
• Genomic studies of communities
• Analysis of DNA repair genes in genome sequences
• Phylogenomics of halophilic archaea
• GEBA
• Phylogenetic metagenomics
• ...

Page 41: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Career Lesson II: Don’t Only Use What You Know

Page 42: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

What We Don’t Know Can Hurt Us

Page 43: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

D. radiodurans genome

Page 44: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

DNA Repair Genes in D. radiodurans

Process                         Genes in D. radiodurans
Nucleotide Excision Repair      UvrABCD, UvrA2
Base Excision Repair            AlkA, Ung, Ung2, GT, MutM, MutY-Nths, MPG
AP Endonuclease                 Xth
Mismatch Excision Repair        MutS, MutL
Recombination
  Initiation                    RecFJNRQ, SbcCD, RecD
  Recombinase                   RecA
  Migration and resolution      RuvABC, RecG
Replication                     PolA, PolC, PolX, phage Pol
Ligation                        DnlJ
dNTP pools, cleanup             MutTs, RRase
Other                           LexA, RadA, HepA, UVDE, MutS2

Page 45: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Problem ...

• List of DNA repair gene homologs in D. radiodurans genome is not significantly different from that of other bacterial genomes of similar size

Page 46: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Repair Studies in Different Species (via Medline searches as of 1998)

Humans          7028
E. coli         3926
S. cerevisiae   988
Drosophila      387
B. subtilis     284
S. pombe        116
Xenopus         56
C. elegans      25
A. thaliana     20
Methanogens     16
Haloferax       5
Giardia         0

Page 47: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: phylogenetic tree of bacterial phyla (scale bar 0.1). Labeled groups: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter]

Tree based on Hugenholtz (2002) with some modifications.

~40 Phyla of Bacteria

Page 48: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: tree of bacterial phyla, same labels as on the earlier slide (scale bar 0.1)]

Tree based on Hugenholtz (2002) with some modifications.

Most DNA metabolism studies in two Phyla

Page 49: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: tree of bacterial phyla, same labels as on the earlier slide (scale bar 0.1)]

Tree based on Hugenholtz (2002) with some modifications.

Deinococcus is very distant from well studied groups

Page 50: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: gains (+) and losses (-) of DNA repair genes mapped onto a tree of sequenced genomes grouped as BACTERIA, ARCHAEA, and EUKARYOTES, with some eukaryotic genes marked "from mitochondria". Taxon labels: Ecoli, Haein, Neigo, Helpy, Bacsu, Strpy, Mycge, Mycpn, Borbu, Trepa, Synsp, Metjn, Arcfu, Metth, Human, Yeast]

Gain and Loss of Repair Genes

Eisen and Hanawalt, 1999 Mut Res 435: 171-213

Page 51: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Solution - Experiments

Page 52: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

What We Don’t Know Can Hurt Us

Page 53: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: tree of bacterial phyla, same labels as on the earlier slide]

• At least 40 phyla of bacteria

As of 2002

Based on Hugenholtz, 2002

Page 54: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: tree of bacterial phyla, same labels as on the earlier slide]

• At least 40 phyla of bacteria

• Most genomes from three phyla

As of 2002

Based on Hugenholtz, 2002

Page 55: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: tree of bacterial phyla, same labels as on the earlier slide]

• At least 40 phyla of bacteria

• Most genomes from three phyla

• Some studies in other phyla

As of 2002

Based on Hugenholtz, 2002

Page 56: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: tree of bacterial phyla, same labels as on the earlier slide]

• At least 40 phyla of bacteria

• Most genomes from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Eukaryotes

As of 2002

Based on Hugenholtz, 2002

Page 57: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure: tree of bacterial phyla, same labels as on the earlier slide]

• At least 40 phyla of bacteria

• Most genomes from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Viruses

As of 2002

Based on Hugenholtz, 2002

Page 58: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Page 59: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

GEBA

http://www.jgi.doe.gov/programs/GEBA/pilot.html

Page 60: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

rRNA Tree of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.

Based on tree from Pace 1997 Science 276:734-740

Archaea

Eukaryotes

Bacteria

Page 64: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

rRNA Tree of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.

Based on tree from Pace 1997 Science 276:734-740

Archaea

Eukaryotes

Bacteria

??????

Wu et al. (2011) PLoS ONE 6(3): e18011. doi:10.1371/journal.pone.0018011

Page 65: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

????

Phage

Phage

????

Thaumarchaeot

Page 66: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Number of SAGs from Candidate Phyla

Site                                    OD1   OP11   OP3   SAR406
Site A: Hydrothermal vent               4     1      -     -
Site B: Gold Mine                       6     13     2     -
Site C: Tropical gyres (Mesopelagic)    -     -      -     2
Site D: Tropical gyres (Photic zone)    1     -      -     -

Sample collections at 4 additional sites are underway.

Phil Hugenholtz

GEBA uncultured

Page 67: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Uncharacterized genes

Page 68: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Non homology functional

• Many genes have homologs in other species but no homologs have ever been studied experimentally

• Non-homology methods can make functional predictions for these

Page 69: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Phylogenetic profiling basis

• Microbial genes are lost rapidly when not maintained by selection

• Genes can be acquired by lateral transfer
• Frequently gain and loss occurs for entire pathways/processes
• Thus might be able to use correlated presence/absence information to identify genes with similar functions

Page 70: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Non-Homology Predictions: Phylogenetic Profiling

• Step 1: Search all genes in organisms of interest against all other genomes

• Ask: Yes or No, is each gene found in each other species

• Cluster genes by distribution patterns (profiles)
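
The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the talk: it assumes the all-against-all searches have already been run and summarized as, for each gene, the set of genomes in which a homolog was found. The gene names, genome names, and helper functions (`build_profiles`, `cluster_identical`, `hamming`) are hypothetical, and grouping by identical profile is only the simplest possible clustering rule.

```python
from collections import defaultdict


def build_profiles(hits, genomes):
    """Turn search results into yes/no (1/0) presence profiles.

    hits maps each gene to the set of genomes where a homolog was found.
    """
    return {gene: tuple(1 if g in found else 0 for g in genomes)
            for gene, found in hits.items()}


def cluster_identical(profiles):
    """Group genes that share exactly the same presence/absence profile."""
    groups = defaultdict(list)
    for gene, profile in sorted(profiles.items()):
        groups[profile].append(gene)
    return dict(groups)


def hamming(p, q):
    """Count the genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


# Hypothetical search summary for four genes across four genomes.
genomes = ["sp1", "sp2", "sp3", "sp4"]
hits = {
    "geneA": {"sp1", "sp3"},
    "geneB": {"sp1", "sp3"},
    "geneC": {"sp2", "sp4"},
    "geneD": {"sp1", "sp2", "sp3", "sp4"},
}

profiles = build_profiles(hits, genomes)
clusters = cluster_identical(profiles)
for profile, genes in clusters.items():
    print(profile, genes)
```

Genes that fall in the same group (here `geneA` and `geneB`) are candidates for a shared pathway or process. With real data, a distance such as `hamming` would normally feed a tolerant clustering method rather than exact matching.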

Page 71: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Carboxydothermus hydrogenoformans

• Isolated from a Russian hot spring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows very efficiently on CO (Carbon Monoxide)
• Produces hydrogen gas
• Low GC Gram positive (Firmicute)
• Genome Determined (Wu et al. 2005 PLoS Genetics 1: e65.)

Page 75: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

PG Profiling Works Better Using Orthology

Page 76: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

PG Profiling Works Better Using Independent Contrasts

Page 77: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Career Lesson III: Networks Matter

Page 78: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Protein Family Rarefaction Curves

• Take data set of multiple complete genomes

• Identify all protein families using MCL
• Plot # of genomes vs. # of protein families
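
A rarefaction curve of this kind can be sketched as follows. The sketch assumes the protein families have already been identified (for example by MCL, which is not run here) and are supplied as a mapping from genome to the set of family identifiers it contains. The function name `rarefaction_curve`, its parameters, and the toy data are hypothetical. It averages the number of distinct families seen over random orderings of the genomes and returns the values to plot against the number of genomes.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Average number of distinct families after sampling 1..N genomes."""
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)  # random order in which genomes are added
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]
            totals[i] += len(seen)
    return [total / n_permutations for total in totals]


# Hypothetical family assignments for three genomes.
genome_families = {
    "genome1": {"fam1", "fam2", "fam3"},
    "genome2": {"fam2", "fam3", "fam4"},
    "genome3": {"fam1", "fam4", "fam5"},
}

curve = rarefaction_curve(genome_families)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, n_families)
```

A curve that keeps rising as genomes are added suggests that new protein families are still being discovered. A curve that flattens suggests the sampled diversity is close to saturated.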

Page 85: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Metagenomics

Page 86: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Binning challenge

Page 87: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

[Figure with panels A, B, and C]

Sharpton et al. submitted

Page 88: Jonathan Eisen talk for #SCS2012 at #ISMB  "Networks in genomics and bioinformatics: from phylogeny to Twitter"

Career Lesson IV: Openness Helps
