Marie-Adèle Rajandream The Pathogen Sequencing Unit The Sanger Institute The Wellcome Trust Genome Campus Hinxton Cambridge United Kingdom


Marie-Adèle Rajandream

The Pathogen Sequencing Unit

The Sanger Institute

The Wellcome Trust Genome Campus

Hinxton

Cambridge

United Kingdom


The Sanger Institute

Principally funded by Wellcome Trust (about 96 %)

60,000,000 bases per day of raw data

600 employees

Sequencing of human, mouse, zebrafish and pathogen genomes

Manual and automatic genome annotation (Ensembl, Artemis)

Identification of cancer-causing mutations (recently, a BRAF gene mutation)

Sequence variation and disease association


The Pathogen Sequencing Unit

Sequencing

• Small genomes (bacterial and model organisms)

• 60-70 projects

• Current capacity 4 M reads p/a, sufficient for 100 Mb of finished sequence

• Mainly whole genome/chromosome shotguns, including finishing

• Many are international collaborations

• Larger, more complex genomes (35-100 Mb) on the horizon

Informatics

• Automatic analysis

• Manual annotation by expert biologists

• Tools: finishing (Cyclops), annotation (Artemis), comparative analysis (ACT)

• Data dissemination

• Database resources

Functional Genomics

• S. pombe

• Bacterial Genomes

• D. discoideum
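
The capacity figure on this slide (4 M reads per annum for 100 Mb of finished sequence) can be turned into a per-megabase cost with simple arithmetic. The Python sketch below does that. The function names are illustrative, and the mean read length of 500 bases is an assumption that does not come from the slides, so the redundancy it yields is only a rough indication.

```python
# Figures quoted on the slide.
READS_PER_YEAR = 4_000_000               # "4 M reads p/a"
FINISHED_BASES_PER_YEAR = 100_000_000    # "100 Mb of finished sequence"

# ASSUMPTION: not stated in the slides; chosen only to illustrate the arithmetic.
ASSUMED_MEAN_READ_LENGTH = 500


def reads_per_finished_mb(reads: int, finished_bases: int) -> float:
    """Number of reads spent per megabase of finished sequence."""
    return reads / (finished_bases / 1_000_000)


def implied_redundancy(reads: int, finished_bases: int, mean_read_length: int) -> float:
    """Raw bases sequenced per finished base, for a given mean read length."""
    return reads * mean_read_length / finished_bases


per_mb = reads_per_finished_mb(READS_PER_YEAR, FINISHED_BASES_PER_YEAR)
redundancy = implied_redundancy(
    READS_PER_YEAR, FINISHED_BASES_PER_YEAR, ASSUMED_MEAN_READ_LENGTH
)
print(f"{per_mb:.0f} reads per finished Mb")
print(f"{redundancy:.1f} raw bases per finished base (assumed read length)")
```

The first figure follows directly from the slide. The second changes in proportion to whatever read length is assumed.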


GeneDB

http://www.genedb.org


Project pages

annotation

sequences

analysis

GeneDB

http://www.genedb.org

FTP site

BLAST

curation


What is GeneDB?

• a generic organism database

• annotated sequences as well as functional data

• visualisation in user-friendly environment

• annotation and analysis of data by biologists

• flexible enough to incorporate new data types

• linked to external databases

• fully curated


The GeneDB project

• Started in 2001

• Funded by the Wellcome Trust for a period of 5 years

• Initially for 3 organisms: S. pombe, Leishmania & Trypanosome

• 2 full-time programmers, 1 part-time programmer

• One curator for each organism

• One helpdesk person / programmer

• Prototype now done and in use


Technical Outline Prototype

“Java”: biojava, data, gui, minelet, mining, test, utils, web

Web: jsp, cgi, blast, ominblast, asp, common, cerevisiae, pombe, malaria, leish, tryp

Data: asp (images, serialise, indices), cerevisiae (images, serialise, indices), pombe, malaria, tryp, leish, EMBL


Broad specifications for production version

• Relational database

• Curator / annotator interface incorporating functionality of Artemis (MESS)

• Facility for doing more complex queries

For comprehensive, detailed specs see our Functional Specifications document


P. falciparum chr. 14


“biotin carboxylase”

Inferred by Sequence Similarity with a yeast sequence, SGD:S0005299 (which was originally annotated based on a published mutant phenotype)
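
The annotation above combines three pieces of information: a product name, an evidence statement, and the database entry the inference was drawn from. The Python sketch below shows one way such an entry could be represented and its cross-reference split into database and accession. The `Annotation` class and `split_xref` helper are hypothetical illustrations, not part of GeneDB or Artemis.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one curated annotation."""
    product: str
    evidence: str            # evidence statement, as worded on the slide
    with_ref: Optional[str]  # "DB:accession" the inference was made with, if any


def split_xref(xref: str) -> Tuple[str, str]:
    """Split a 'DB:accession' cross-reference at the first colon."""
    db, sep, accession = xref.partition(":")
    if not sep or not db or not accession:
        raise ValueError(f"not a DB:accession reference: {xref!r}")
    return db, accession


# Values taken from the slide.
example = Annotation(
    product="biotin carboxylase",
    evidence="Inferred by Sequence Similarity",
    with_ref="SGD:S0005299",
)

db, accession = split_xref(example.with_ref)
print(db, accession)
```

Keeping the cross-reference alongside the evidence statement lets a reader trace a similarity-based annotation back to the entry that was annotated experimentally.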


Pathogen Sequencing Unit

Analysis: Martin Aslett, Steven Bentley, Matthew Berriman, Ana Cerdeno, Christiane Hertz-Fowler, Matthew Holden, Keith James, Rachel Lyne, Arnab Pain, Chris Peacock, Mohammed Sebaihia, Nick Thomson, Valerie Wood

Project Management: Bart Barrell, Julian Parkhill, Marie-Adèle Rajandream, Al Ivens, Neil Hall

Programming: Rob Davies, David Harper, Arnaud Kerhornou, Paul Mooney, Kim Rutherford, Adrian Tivey, Ed Zuiderwijk

Karen Mungall, Theresa Feltwell, Ian Goodhead, Zahra Hance, Heidi Hauser, Mandy Sanders, Mark Simmonds, Danielle Walker

Barbara Harris, Becky Atkin, Andrew Barron, Carol Chillingworth, Louise Clarke, Craig Corton, Jonathan Doggett, Nicola Lennard, Alexandra Line, Doug Ormand

David Harris, Matthew Collins, Nigel Fosker, Arlette Goble, Lee Murphy, Susan O’Neil, Simon Rutter, David Saunders, Kathy Seeger, Robert Squares, Steven Squares

Carol Churcher, Karen Brooks, Inna Cherevach, Tracey Chillingworth, Kay Clarke, Paul Davies, Nancy Hamlin, Kay Jagels, Sharon Moule, Brian White, Sally Whitehead

Subcloning

Ann Cronin, Audrey Fraser, David Johnson, Mike Quail, Claire Price, Ester Rabbinowitsch, Sarah Sharp

Mapping: Maria Fookes, John Woodward

Sequencing

Wellcome Trust Sanger Institute

Administration: Yvonne Shaw