guinevere-madden
DESCRIPTION
The (IMG) Systems for Comparative Analysis of Microbial Genomes & Metagenomes: Postmortem. Europe: 386. N America: 1,180. Asia: 235. Africa: 6. 1,979. S America: 96. Oceania: 81. Outline: User Community, Data Transformation, Data Flow, What’s Next.
The (IMG) Systems for Comparative Analysis of Microbial Genomes & Metagenomes: Postmortem
N America: 1,180
Europe: 386
Asia: 235
Africa: 6
Oceania: 81
S America: 96
1,979
Outline
User Community
Data Transformation
Data Flow
What’s Next
Transformation
Sequencing: Raw reads
Sequencing: Qualified reads
Assembly: Assembled reads
Structural annotation: Predicted genes
Functional annotation: Annotated proteins
Functional annotation*: Pathways
Characterization
Data: Structure & semantics
Logical: objects, correlations
Physical: files, formats, size
Processing: Methods, tools
Implementation: Data management, computing infrastructure
Questions
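The transformation steps named on this slide can be read as an ordered chain in which each step yields the data product the next one consumes. The sketch below is only an illustration of that reading: the `Stage` type, the `PIPELINE` list, and the `upstream_of` helper are hypothetical names, and the ordering is an assumption based on the usual sequencing-to-annotation flow, not something the slide states explicitly.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    step: str     # processing step as labeled on the slide
    product: str  # data product that step yields


# Assumed order: the slide lists these labels but not their sequence.
PIPELINE = [
    Stage("Sequencing", "Raw reads"),
    Stage("Sequencing", "Qualified reads"),
    Stage("Assembly", "Assembled reads"),
    Stage("Structural annotation", "Predicted genes"),
    Stage("Functional annotation", "Annotated proteins"),
    Stage("Functional annotation*", "Pathways"),
]


def upstream_of(product: str) -> List[str]:
    """Return the products that precede `product` in the assumed chain."""
    names = [stage.product for stage in PIPELINE]
    return names[: names.index(product)]
```

For example, `upstream_of("Predicted genes")` lists the three read-level products that come before gene prediction in this assumed order.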
Genome exploration
Browse & search genome
Browse genome sequence: gene coordinates, features
Search genome for presence of specific genes, functions
Sequence browser
Chromosome map
Data interpretation for individual genomes
Genome fusion: pangenomes
Pathways
Functions
Genome integration
“OMICS” integration
Genes
Data: Structure & semantics
Logical: data model
Physical: database system
Integration: Methods, tools
Questions
Implementation: Analysis operations, flow (composition), performance
Genome 1 g3 g2 g1
g1 g2 g3
Genome k
Gene correlations
Genome n: g1 g2 g3 g4
Gene expression from: Proteomics Transcriptomics
Conserved genes
Function profile
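One way to picture the conserved-gene and profile ideas from this diagram is with plain set operations. The sketch below is an illustration only, not the IMG implementation: it assumes that a shared label such as g1 denotes the same gene family in every genome, and the function names `conserved_genes` and `presence_profile` are hypothetical.

```python
from typing import Dict, Set


def conserved_genes(genomes: Dict[str, Set[str]]) -> Set[str]:
    """Genes present in every genome (intersection of all gene sets)."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


def presence_profile(genomes: Dict[str, Set[str]], gene: str) -> Dict[str, bool]:
    """Presence/absence of one gene across the genomes."""
    return {name: gene in genes for name, genes in genomes.items()}


# Toy data mirroring the labels on the slide.
genomes = {
    "Genome 1": {"g1", "g2", "g3"},
    "Genome k": {"g1", "g2", "g3"},
    "Genome n": {"g1", "g2", "g3", "g4"},
}
```

With this toy data, g1 to g3 are conserved across all three genomes, while the profile for g4 marks it as present only in Genome n.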
Comparative Analysis
Review, revise, improve quality of annotations
Explore/compare gene & functional content of genomes & metagenomes
Detect/correct annotation gaps & inconsistencies
Data interpretation across genomes
Questions
Completeness & consistency of functional catalogue for genomes
• Consistency: IMG terms & pathways
• Completeness: IMG metabolic reconstruction
Expert curation in IMG ER
Data Integration
Functional annotation
Structural annotation
Scaffolds
Genes
Functions
Σ genes
Functional catalogue
Phenotypes
Genomes
Phenotype rules
Phenotype prediction
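The step from functional catalogue to phenotype prediction via phenotype rules can be sketched as a simple rule check. This is a minimal illustration under stated assumptions: it assumes a rule fires when every function it requires is present in the genome's catalogue, and the rule names, function names, and `predict_phenotypes` helper are placeholders, not actual IMG rules or APIs.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical rules: a phenotype is asserted when all listed functions are present.
PHENOTYPE_RULES: Dict[str, FrozenSet[str]] = {
    "phenotype_X": frozenset({"function_A", "function_B"}),
    "phenotype_Y": frozenset({"function_C"}),
}


def predict_phenotypes(
    functions: Set[str],
    rules: Dict[str, FrozenSet[str]] = PHENOTYPE_RULES,
) -> List[str]:
    """Return, sorted by name, the phenotypes whose required functions are all present."""
    return sorted(name for name, required in rules.items() if required <= functions)
```

Under these placeholder rules, a genome whose catalogue contains function_A and function_B would be assigned phenotype_X only.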
Biological data interpretation process
Questions
Gene prediction accuracy
Need re-annotation of all microbial genomes
Questions
Multiple resources, methods
• Potential conflicts, errors
• Missing annotations
Requires integrated context (IMG ER) + tools for review/curation
Every 4 months
Monthly
Monthly
Instructor & Student Tools
On demand
IMG systems data flow: up to Dec 2011
9,991 Public Genomes, 22.5 mil genes
1,293 Private Genomes, 6.1 mil genes
7,989 Genomes 12.6 Mil genes
+ 1,077 Samples: >120 Studies + 2.5 Bil Genes
357 Samples > 95 Studies +140 Mil Genes
Every 2-3 weeks
On demand
Monthly
Biweekly
Instructor & Student Tools
IMG systems data flow: May 2012
IMG development focus
Large metagenome datasets in IMG/M ER: extended underlying datastore
Revision of metagenome analysis tools
New User Workspace for handling sets of genomes, functions, genes
Long running operations transitioned to background execution mode
Content update process: new genomes added to IMG ER & IMG/M ER at the same time
Data distribution
Documentation
genome.jgi.doe.gov
IMG data distribution
IMG documentation