Page 1: The (IMG) Systems for Comparative Analysis of Microbial Genomes & Metagenomes:

The (IMG) Systems for Comparative Analysis of Microbial Genomes & Metagenomes: Postmortem

N America: 1,180

Europe: 386

Asia: 235

Africa: 6

Oceania: 81

S America: 96

1,979

Page 2: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

Outline

User Community

Data Transformation

Data Flow

What’s Next

Page 3: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

Transformation (characterization)

Sequencing: Raw reads

Sequencing: Qualified reads

Assembly: Assembled reads

Structural annotation: Predicted genes

Functional annotation: Annotated proteins

Functional annotation*: Pathways

Questions

Data: structure & semantics
Logical: objects, correlations
Physical: files, formats, size

Processing: methods, tools

Implementation: data management, computing infrastructure

Genome exploration

Browse & search genome

Browse genome sequence: gene coordinates, features

Search genome for presence of specific genes, functions

Sequence browser

Chromosome map

Data interpretation for individual genomes
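
The browse and search operations listed on this slide can be illustrated with a minimal sketch. The `Gene` record, the toy gene list, and both helper functions are hypothetical and are not IMG's data model or API; they only show the idea of browsing by coordinates and searching by function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    start: int      # 1-based start coordinate on the sequence
    end: int        # inclusive end coordinate
    strand: str     # "+" or "-"
    function: str   # free-text function annotation


# Hypothetical toy genome; all values are made up for illustration.
GENES = [
    Gene("g1", 100, 1200, "+", "DNA polymerase III subunit alpha"),
    Gene("g2", 1500, 2300, "-", "ABC transporter permease"),
    Gene("g3", 2400, 3100, "+", "hypothetical protein"),
]


def genes_in_region(genes, start, end):
    """Browse: return the genes that overlap the window [start, end]."""
    return [g for g in genes if g.start <= end and g.end >= start]


def search_by_function(genes, keyword):
    """Search: case-insensitive substring match on the function annotation."""
    kw = keyword.lower()
    return [g for g in genes if kw in g.function.lower()]
```

For example, `genes_in_region(GENES, 1000, 1600)` returns the records for `g1` and `g2`, and `search_by_function(GENES, "transporter")` returns only `g2`.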

Page 4: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

Genome integration

Genes

Functions

Pathways

Genome fusion: pangenomes

“OMICS” integration

Questions

Data: structure & semantics
Logical: data model
Physical: database system

Integration: methods, tools

Implementation: analysis operations, flow (composition), performance

Gene correlations

Genome 1: g3 g2 g1

Genome k: g1 g2 g3

Genome n: g1 g2 g3 g4

Gene expression from: Proteomics, Transcriptomics

Conserved genes

Function Profile

Comparative Analysis

Review, revise, improve quality of annotations

Explore/compare gene & functional content of genomes & metagenomes

Detect/correct annotation gaps & inconsistencies

Data interpretation across genomes
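
As a rough illustration of the "conserved genes" and "function profile" comparisons named on this slide, the sketch below treats each genome as a mapping from gene identifier to a function label. The genome names, gene identifiers, function labels, and both helpers are assumptions made for the example; IMG's actual tools and data model are not shown in the slides.

```python
from collections import Counter

# Hypothetical input: genome name -> {gene id: function label}.
GENOMES = {
    "genome_1": {"g1": "transport", "g2": "replication", "g3": "transport"},
    "genome_k": {"g1": "transport", "g2": "replication", "g3": "transport"},
    "genome_n": {"g1": "transport", "g2": "replication", "g3": "transport",
                 "g4": "motility"},
}


def conserved_genes(genomes):
    """Genes present in every genome (set intersection of the gene ids)."""
    gene_sets = [set(genes) for genes in genomes.values()]
    return set.intersection(*gene_sets)


def function_profile(genomes):
    """Per-genome count of genes assigned to each function label."""
    functions = sorted({f for genes in genomes.values() for f in genes.values()})
    profile = {}
    for name, genes in genomes.items():
        counts = Counter(genes.values())
        profile[name] = {f: counts.get(f, 0) for f in functions}
    return profile
```

With this toy input, `conserved_genes(GENOMES)` yields `g1`, `g2`, and `g3`, while the profile shows that only `genome_n` has a gene labeled `motility`.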

Page 5: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

Questions
Completeness & consistency of functional catalogue for genomes
• Consistency: IMG terms & pathways
• Completeness: IMG metabolic reconstruction
Expert curation in IMG ER

Data Integration

Functional annotation

Structural annotation

Scaffolds

Genes

Functions

Σ genes

Functional catalogue

Phenotypes

Genomes

Phenotype rules

Phenotype prediction

Biological data interpretation process

Questions
Gene prediction accuracy
Need re-annotation of all microbial genomes

Questions
Multiple resources, methods
• Potential conflicts, errors
• Missing annotations
Requires integrated context (IMG ER) + tools for review/curation
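
The step from a functional catalogue to phenotype prediction via phenotype rules might look like the following minimal sketch. It assumes a rule is simply a set of functions that must all be present in a genome; the rule table and function names are illustrative only and are not taken from IMG.

```python
# Hypothetical rules: phenotype -> functions that must all be present.
PHENOTYPE_RULES = {
    "nitrogen_fixation": {"nifH", "nifD", "nifK"},
    "motility": {"fliC", "motA"},
}


def predict_phenotypes(functional_catalogue, rules=PHENOTYPE_RULES):
    """Return phenotypes whose required functions are all in the catalogue."""
    present = set(functional_catalogue)
    return sorted(p for p, required in rules.items() if required <= present)


def missing_functions(functional_catalogue, phenotype, rules=PHENOTYPE_RULES):
    """Required functions that are absent: candidate annotation gaps."""
    return rules[phenotype] - set(functional_catalogue)
```

A helper such as `missing_functions` reflects why completeness of the functional catalogue matters: a single missing annotation is enough to suppress a predicted phenotype under an all-required rule.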

Page 6: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

IMG systems data flow: up to Dec 2011

Every 4 months

Monthly

Monthly

On demand

Instructor & Student Tools

Page 7: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

IMG systems data flow: May 2012

9,991 Public Genomes: 22.5 Mil genes

1,293 Private Genomes: 6.1 Mil genes

7,989 Genomes: 12.6 Mil genes

+ 1,077 Samples: >120 Studies + 2.5 Bil Genes

357 Samples: >95 Studies + 140 Mil Genes

Every 2-3 weeks

On demand

Monthly

Biweekly

Instructor & Student Tools

Page 8: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

IMG development focus

Large metagenome datasets in IMG/M ER
• Extended underlying datastore

Revision of metagenome analysis tools

New User Workspace for handling sets of genomes, functions, genes

Long-running operations transitioned to background execution mode

Content update process
• New genomes added to IMG ER & IMG/M ER at the same time

Data distribution

Documentation
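
The slide does not say how long-running operations were moved to background execution. As a generic illustration of the pattern only, the sketch below submits a job to a worker pool and collects the result later; `long_running_analysis` and `submit_job` are hypothetical names, and the sleep stands in for real work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def long_running_analysis(genome_ids):
    """Stand-in for an expensive comparison over a set of genomes."""
    time.sleep(0.1)  # placeholder for real work
    return {"genomes": len(genome_ids), "status": "done"}


# A small worker pool; the request handler returns immediately after submit.
executor = ThreadPoolExecutor(max_workers=2)


def submit_job(genome_ids):
    """Queue the analysis and return a Future the caller can poll."""
    return executor.submit(long_running_analysis, list(genome_ids))


job = submit_job(["genome_1", "genome_k"])
# The caller can check job.done() periodically, or block for the result.
result = job.result(timeout=5)
```

The design point is that the user-facing request only enqueues work, so the interface stays responsive while the analysis runs.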

Page 9: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

genome.jgi.doe.gov

IMG data distribution

Page 10: The (IMG) Systems for Comparative Analysis of  Microbial Genomes & Metagenomes:

IMG documentation