VectorBase
A Resource Centre for Invertebrate Hosts of Human Pathogens
Bob MacCallum
Imperial College London
Outline
• Introduction to VectorBase
• Two important recent developments:
– Community Annotations
– Gene Expression Data
What is VectorBase?
• Aim: Genomic bioinformatics resource for invertebrate vectors of human pathogens; data hub for the community
• Funding: US NIAID (National Institute of Allergy and Infectious Diseases) via its Bioinformatics Resource Centre (BRC) program
Why VectorBase?
• Sequencing initiatives do not include “after-care”
• Ensembl had no long-term plans for insects
Main VectorBase activities
• www.vectorbase.org
– Browse, search & download genomic data
• Genome annotation
– Automatic & manual
• Functional genomics
• Ontologies
• Training/outreach/consultancy
Invertebrate vectors
Species | Disease | Status | Funder
Aedes aegypti | Yellow fever, Dengue fever | Complete† | NIAID
Anopheles gambiae PEST | Malaria | Complete† | -
Anopheles gambiae M & S form | Malaria | Assembled | NHGRI
Culex pipiens quinquefasciatus | Lymphatic filariasis | Complete† | NIAID
Glossina morsitans morsitans | Sleeping sickness | Initiated | Wellcome Trust
Ixodes scapularis | Lyme disease | Draft gene set | NIAID
Lutzomyia longipalpis | Leishmania | Planned | NHGRI/Wellcome Trust
Pediculus humanus | Typhus | Draft gene set | NHGRI
Phlebotomus papatasi | Leishmania | Planned | NHGRI/Wellcome Trust
Rhodnius prolixus | Chagas disease | Initiated | NHGRI
Notre Dame
PIs: Frank Collins, Dave Severson, Greg Madey, Nora Besansky
Tasks: project coordination; core website development; community annotation pipeline; Aedes and Anopheles community reps.
EBI (European Bioinformatics Institute)
PI: Ewan Birney
Tasks: “automated” genome annotation; comparative genomics; GenBank submissions; genome browser technology
IMBB, Crete
PI: Kitsos Louis
Tasks: ontologies for anatomy, insecticide resistance, biological processes; population genetics
Imperial College, London
PIs: George Christophides, Fotis Kafatos
Tasks: functional genomics (gene expression, RNAi phenotypes)
Genome annotation cycle
[Cycle diagram. Labels: Assembly; Automatic gene build; Manual annotations; Community annotations; Other genomes, gene sets; Repeat library (TEs etc.); ESTs, cDNAs; Protein domains]
Manual annotation
• FlyBase team (Kathy Campbell)
• Anopheles 2L completed Sep 2006
• Anopheles 2R completed Sep 2007
• Anopheles X completed Feb 2008
• 875 Culex genes completed July 2008
• Three mosquitoes better than one
Community annotation
• Expertise from around world
• Gene models, symbols, literature, function
• Need system to track contributions
• Incorporated in gene build updates
• Credit sources
Community Annotation Pipeline (CAP)
CAP: gene model submission
• Gene symbol
• Gene description
• mRNA sequence
• Translation start
• Translation stop
• Determination method
• GO IDs
• PubMed IDs
Excel spreadsheet
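The submission fields above map naturally onto a record that can be checked before a spreadsheet row is accepted. The following is a minimal sketch only, not the actual CAP code: the class name, field names, and `validate` function are hypothetical, and it assumes that translation start and stop are 1-based positions on the submitted mRNA sequence, which the slide does not state.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Gene Ontology IDs have the form "GO:" followed by seven digits.
GO_ID = re.compile(r"^GO:\d{7}$")


@dataclass
class GeneModelSubmission:
    """One spreadsheet row; all names here are hypothetical."""
    gene_symbol: str
    gene_description: str
    mrna_sequence: str
    translation_start: int  # assumption: 1-based position on the mRNA
    translation_stop: int   # assumption: 1-based, inclusive
    determination_method: str
    go_ids: List[str] = field(default_factory=list)
    pubmed_ids: List[str] = field(default_factory=list)


def validate(sub: GeneModelSubmission) -> List[str]:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if not sub.gene_symbol.strip():
        problems.append("gene symbol is empty")
    seq = sub.mrna_sequence.upper()
    if not seq or set(seq) - set("ACGTUN"):
        problems.append("mRNA sequence must be non-empty and use only A, C, G, T, U, N")
    if not (1 <= sub.translation_start < sub.translation_stop <= len(seq)):
        problems.append("translation start/stop must lie within the mRNA, start before stop")
    problems += [f"malformed GO ID: {g}" for g in sub.go_ids if not GO_ID.match(g)]
    problems += [f"malformed PubMed ID: {p}" for p in sub.pubmed_ids if not p.isdigit()]
    return problems
```

Returning a list of problems rather than raising on the first one lets a submitter fix a whole row in one pass.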
CAP: what happens next
• Transcript aligned to genome
• Gene model constructed
• Reviewed by community representative
CAP: other annotations
• Publications
• CV/ontology terms
• Free text comment*
(* unmoderated)
Expression data
• Many microarray technologies
• Many experimental designs
• Large amount of information
• Many ways to do analysis
Microarray repositories
• Widely adopted standard: MIAME
• GEO (NCBI) & ArrayExpress (EBI)
• Repository ≠ Useful data
• Curation backlog at central repositories
• VectorBase data is manageable
• We manage and curate
Microarray pipeline at VB
What | Where
Alignments & gene assignments | Ensembl-style database
Microarray data, raw & processed | BASE
Statistics and web interface | VB’s GESOL API
No time today for…
• Averaging over multiple reporters
• Ambiguous reporters
• List of microarray experiments in VB
• Community microarray data submission
• Expert analysis & collaboration
• Future developments
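The first two skipped topics, averaging over multiple reporters and ambiguous reporters, can be illustrated with a small sketch. This is an assumption about what those terms involve, not VectorBase's method: here a reporter is "ambiguous" if it maps to more than one gene, such reporters are simply dropped, and the remaining values are combined with an arithmetic mean. The function and argument names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_gene(reporter_values, reporter_genes):
    """Average expression values over all reporters assigned to each gene.

    reporter_values: dict of reporter ID -> expression value (e.g. log ratio)
    reporter_genes:  dict of reporter ID -> set of gene IDs it aligns to
    Reporters mapping to zero or several genes are skipped (assumed policy).
    """
    per_gene = defaultdict(list)
    for reporter, value in reporter_values.items():
        genes = reporter_genes.get(reporter, set())
        if len(genes) != 1:
            continue  # unmapped or ambiguous reporter
        (gene,) = genes
        per_gene[gene].append(value)
    return {gene: mean(values) for gene, values in per_gene.items()}
```

Other policies are possible, for example a median instead of a mean, or keeping ambiguous reporters with a flag; the sketch only shows the simplest case.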
VectorBase’s future directions
• More genomes & sequencing
• Population biology, association studies
• More community involvement in genome annotation
• Enhanced functional genomics resources