2015-06-27
“All of your answers are approximate, you might as well live with it…”
Andrew Rau-Chaplin, 1½ hours ago
Integrated Rapid Infectious Disease Analysis
www.irida.ca
Rob Beiko Faculty of Computer Science Dalhousie University June 12
Microbial genomics for rapid investigation of infectious disease
Image © Kenneth Todar
2009 and Influenza A
VIRUS: Influenza A. RNA genome (14,000 nucleotides), eight segments (Image: Tao and Zheng, Science 2012)
BACTERIUM: S. Typhi CT18. DNA genome (~5,100,000 nucleotides), one chromosome + two plasmids (Science, 2001)
Outbreak investigation
Similarities: place, time, genetics
fda.gov
2014
2010-2013
Inns et al. (2015)
Outbreak investigation in Canada
NATIONAL MICROBIOLOGY LABORATORY
PROVINCIAL PUBLIC HEALTH LABORATORIES: CLINICAL ISOLATES
SENTINEL SURVEILLANCE (FoodNet Canada): CLINICAL, FOOD, ENVIRONMENTAL
CANADIAN FOOD INSPECTION AGENCY (Regulatory): FOOD ISOLATES
LISTERIA - E. COLI O157:H7 - SALMONELLA - SHIGELLA
PFGE/MLVA
PUBLIC HEALTH ACTION!
Pulsed Field Gel Electrophoresis Serratia - NICU
Jang et al., J Hosp Infect (2001)
DNA Sequencing
1977: Sanger sequencing
• < 1 kilobase per run
• $2 / run, 1-3 hours (96 in parallel)
• Tiny pieces (600-1000 bases)
2011: Illumina MiSeq
• 15 gigabases per run
• $1000-$1500 / run, 1 day
• Tinier pieces (150-400 bases)
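The gap between the two platforms is easier to see as cost per megabase and as fold coverage of a bacterial genome. The sketch below is only back-of-envelope arithmetic on the slide's approximate figures. It treats "< 1 kilobase" as an upper bound of 1,000 bases, ignores the 96-lane parallelism and all sample preparation costs, and uses function names that are illustrative rather than taken from any tool.

```python
# Approximate per-run figures as quoted on the slide (not measured values).
PLATFORMS = {
    "Sanger (1977)": {"bases_per_run": 1_000, "cost_per_run": (2.0, 2.0)},
    "Illumina MiSeq (2011)": {
        "bases_per_run": 15_000_000_000,
        "cost_per_run": (1000.0, 1500.0),
    },
}

# S. Typhi CT18 genome size (~5.1 Mb), taken from the virus/bacterium slide.
GENOME_SIZE = 5_100_000


def cost_per_megabase(bases_per_run, cost_per_run):
    """Return (low, high) cost in dollars per million bases sequenced."""
    low, high = cost_per_run
    megabases = bases_per_run / 1_000_000
    return low / megabases, high / megabases


def fold_coverage(bases_per_run, genome_size=GENOME_SIZE):
    """Average number of times one run covers each position of the genome."""
    return bases_per_run / genome_size


for name, figures in PLATFORMS.items():
    low, high = cost_per_megabase(**figures)
    coverage = fold_coverage(figures["bases_per_run"])
    print(f"{name}: ${low:,.2f}-${high:,.2f} per Mb, {coverage:,.4f}x coverage per run")
```

On these numbers a single MiSeq run yields far more sequence than one bacterial genome needs, which is why many isolates are normally pooled on one run.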
10/10/2013 VanBUG
MiSeq projects at Dalhousie
• Bedford Basin microbial monitoring
• Pediatric Crohn’s disease samples
• Global microbial air sampling
• Mink genomes
• Sequencing Lactobacillus genomes from the poop of old mice
• Wastewater diversity and function in the Arctic
• Verifying ingredients in dog food
• Exercise and the Microbiome
Integrated Rapid Infectious Disease Analysis
www.irida.ca
• 1.56M, 3-year Genome Canada Large-Scale Applied Platform Grant
• SFU / BCCDC / PHAC-NML / Dalhousie
• DNA sequencing and downstream applications
  • data management / federation
  • analysis workflows
  • ontologies
  • APIs
  • 3rd-party applications
• Implementation in provincial public health labs
• Training
Five Pillars of IRIDA
• Ontologies and data standards
  • NCBI, MIxS, vegetables
• Metadata
  • Data provenance
  • Data quality
  • Environmental information
Data sharing!
• BIG challenges – different jurisdictions, “ownership” of epi data. Privacy!
• Health service providers – concerns about privacy and data breach
• Technology outstrips policy
• What digital records could we get TODAY?
• Canada lagging in data sharing
• Calling isolates based on genetic variation
• Traditional:
  • Pulsed-field
  • Multi-locus (standards! mlst.net)
• Whole genomes:
  • Lots of information!
  • Too much information!
  • Lots of filtering and quality control required
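A minimal sketch of what multi-locus typing compares: each isolate is reduced to an allele number per locus, and two isolates are compared by counting the loci where they differ. The locus names follow the commonly used seven-gene Salmonella scheme. The isolate names and allele numbers are invented for illustration, and `allele_differences` is a hypothetical helper, not part of mlst.net or IRIDA.

```python
from typing import Dict, Optional

# An allele profile maps locus name -> allele number (None = no confident call).
Profile = Dict[str, Optional[int]]


def allele_differences(a: Profile, b: Profile) -> int:
    """Count loci where both isolates have a call and the calls differ."""
    shared = [
        locus
        for locus in a
        if locus in b and a[locus] is not None and b[locus] is not None
    ]
    return sum(1 for locus in shared if a[locus] != b[locus])


# Invented example profiles (allele numbers are NOT real data).
isolate_1 = {"aroC": 10, "dnaN": 7, "hemD": 12, "hisD": 9, "purE": 5, "sucA": 9, "thrA": 2}
isolate_2 = {"aroC": 10, "dnaN": 7, "hemD": 12, "hisD": 9, "purE": 5, "sucA": 9, "thrA": 2}
isolate_3 = {"aroC": 10, "dnaN": 14, "hemD": 12, "hisD": None, "purE": 5, "sucA": 9, "thrA": 3}

print(allele_differences(isolate_1, isolate_2))  # identical profiles
print(allele_differences(isolate_1, isolate_3))  # differs at dnaN and thrA
```

Seven loci give at most seven differences, which is the resolution limit that whole-genome comparison removes, at the price of the filtering and quality control noted above.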
• Workflow management
• REST-like API (3rd-party applications)
• Security: authentication / authorization
• Data models & implementation
Local Storage
Remote APIs
IRIDA’s Federated Design
List Samples
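One way to picture a federated "List Samples" call is a single function that asks local storage and each remote instance for their samples and merges the answers. The sketch below is an assumption-laden illustration, not IRIDA's actual API: every class and function name is hypothetical, and the remote side is a stub where a real client would make an authenticated HTTP request. Sample IDs are borrowed from the line-list slide.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Protocol, Tuple


@dataclass(frozen=True)
class Sample:
    sample_id: str
    organism: str
    source: str  # name of the instance that reported the sample


class SampleSource(Protocol):
    def list_samples(self) -> Iterable[Sample]: ...


class LocalStorage:
    """Samples held by this instance (hypothetical in-memory stand-in)."""

    def __init__(self, name: str, records: List[Tuple[str, str]]):
        self.name = name
        self.records = records

    def list_samples(self) -> List[Sample]:
        return [Sample(sid, org, self.name) for sid, org in self.records]


class RemoteAPI:
    """Stub for a remote instance; `fetch` stands in for a network call."""

    def __init__(self, name: str, fetch: Callable[[], List[Tuple[str, str]]]):
        self.name = name
        self._fetch = fetch

    def list_samples(self) -> List[Sample]:
        return [Sample(sid, org, self.name) for sid, org in self._fetch()]


def list_all_samples(sources: Iterable[SampleSource]) -> List[Sample]:
    """Merge samples from all sources; the first source to report an ID wins."""
    seen = {}
    for source in sources:
        for sample in source.list_samples():
            seen.setdefault(sample.sample_id, sample)
    return sorted(seen.values(), key=lambda s: s.sample_id)


local = LocalStorage("local", [("F14231", "Salmonella sp."), ("F14235", "Salmonella sp.")])
remote = RemoteAPI("remote", lambda: [("F14235", "Salmonella sp."), ("M6542", "Salmonella sp.")])

for sample in list_all_samples([local, remote]):
    print(sample.sample_id, sample.source)
```

Listing local storage first means a sample held in both places is reported from the local copy. That is a design choice of this sketch, not a documented IRIDA behaviour.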
• Each pipeline is implemented as a Galaxy workflow
• Internal analysis pipelines
  • Assembly and annotation
  • Phylogenetics
  • “Line list” management
• 3rd-party applications
Single-Nucleotide Variant Phylogenetic Pipeline (SNVPhyl)
Sampled genomes, quality control, tree generation / visualization
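The quantity a single-nucleotide-variant pipeline feeds into tree building can be illustrated with a toy pairwise distance: count the aligned positions where two genomes differ, skipping positions that fail a simple quality filter. This is a sketch of the general idea only, not SNVPhyl's implementation. The sequences are invented, and the filter (ignore anything that is not A, C, G or T) is an assumed stand-in for real quality control.

```python
from itertools import combinations
from typing import Dict

VALID_BASES = set("ACGT")


def snv_distance(a: str, b: str) -> int:
    """Number of aligned positions with two confident, differing base calls."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a, b)
        if x in VALID_BASES and y in VALID_BASES and x != y
    )


def distance_matrix(alignment: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Symmetric matrix of pairwise SNV distances, keyed by isolate name."""
    names = sorted(alignment)
    matrix = {name: {name: 0} for name in names}
    for first, second in combinations(names, 2):
        d = snv_distance(alignment[first], alignment[second])
        matrix[first][second] = d
        matrix[second][first] = d
    return matrix


# Invented toy alignment; 'N' marks a position with no confident call.
toy_alignment = {
    "isolate_1": "ACGTACGT",
    "isolate_2": "ACGTACGA",
    "isolate_3": "ACNTTCGA",
}

for name, row in distance_matrix(toy_alignment).items():
    print(name, row)
```

A distance matrix like this is the input that a tree-building method would consume to produce the phylogeny shown in the visualization step.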
GenGIS
Data from Haiti cholera outbreak, 2010 http://kiwi.cs.dal.ca/GenGIS
IslandViewer
http://www.pathogenomics.sfu.ca/islandviewer/browse
• Interfaces / environment
• Personas
  • Researchers
  • Epidemiologists
  • Clinical microbiologists / lab technicians
• Workflow design and execution
Full Privileges
Cluster  Line List ID  Patient Name   Prov. Health No.  Age  Sex  Location   Sample ID  Collection Date  Culture Result
A        1             John Smith     4513253244        26   M    Vancouver  F14231     14/03/21         Salmonella sp.
A        2             Sally Smith    4519567458        24   F    Vancouver  F14235     14/03/21         Salmonella sp.
B        3             Tom Jones      4517543216        35   M    Vancouver  M6542      14/03/24         Salmonella sp.
B        4             Helen Jones    9856321124        35   F    Vancouver  S1245      14/03/22         Salmonella sp.
C        5             Jennifer Lee   4516853122        29   F    Vancouver  S5642      14/03/22         Salmonella sp.
C        6             Michael Brown  9456534561        45   M    Victoria   T68954     14/03/25         Salmonella sp.
Phylogenetic Tree
Genetic Distance
Limited Privileges
Cluster  Line List ID  Patient Name   Prov. Health No.  Age  Sex  Location   Sample ID  Collection Date  Culture Result
A        1             John Smith     4513253244        26   M    Vancouver  F14231     14/03/21         Salmonella sp.
A        2             Sally Smith    4519567458        24   F    Vancouver  F14235     14/03/21         Salmonella sp.
B        3             Tom Jones      4517543216        35   M    Vancouver  M6542      14/03/24         Salmonella sp.
B        4             Helen Jones    9856321124        35   F    Vancouver  S1245      14/03/22         Salmonella sp.
C        5             Jennifer Lee   4516853122        29   F    Vancouver  S5642      14/03/22         Salmonella sp.
C        6             Michael Brown  9456534561        45   M    Victoria   T68954     14/03/25         Salmonella sp.
Phylogenetic Tree
Genetic Distance
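The two line-list slides contrast what a fully privileged and a limited user would see. The extracted text does not show which columns the limited view hides, so the sketch below assumes, purely for illustration, that patient name and provincial health number are the restricted fields. The role names and the `view_for` helper are hypothetical.

```python
from typing import Dict, List

# Two rows copied from the (fictional) line list on the slide.
LINE_LIST: List[Dict[str, str]] = [
    {
        "Cluster": "A", "Line List ID": "1", "Patient Name": "John Smith",
        "Prov. Health No.": "4513253244", "Age": "26", "Sex": "M",
        "Location": "Vancouver", "Sample ID": "F14231",
        "Collection Date": "14/03/21", "Culture Result": "Salmonella sp.",
    },
    {
        "Cluster": "C", "Line List ID": "6", "Patient Name": "Michael Brown",
        "Prov. Health No.": "9456534561", "Age": "45", "Sex": "M",
        "Location": "Victoria", "Sample ID": "T68954",
        "Collection Date": "14/03/25", "Culture Result": "Salmonella sp.",
    },
]

# ASSUMPTION: which columns are identifying; the slide text does not say.
RESTRICTED_COLUMNS = {"Patient Name", "Prov. Health No."}


def view_for(role: str, rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return a copy of the line list with columns filtered for the role."""
    if role == "full":
        return [dict(row) for row in rows]
    if role == "limited":
        return [
            {col: val for col, val in row.items() if col not in RESTRICTED_COLUMNS}
            for row in rows
        ]
    raise ValueError(f"unknown role: {role!r}")


for row in view_for("limited", LINE_LIST):
    print(row)
```

Filtering on the server side, before rows leave the system, keeps the restricted fields out of anything a limited user's client ever receives.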
Large-scale sequencing initiatives
en.wikipedia.org
FDA GenomeTrakr
http://www.fda.gov/Food/FoodScienceResearch/WholeGenomeSequencingProgramWGS/ucm363134.htm
Public Health England project (>10,000 Salmonella so far)
• As of 2015, sequencing every sampled Salmonella isolate collected in England
• Over 10,000 sequenced to date
• 8000 already available for download in the public databases
Gary van Domselaar, NML
The Global Microbial Identifier
What’s next?
2015: Oxford Nanopore MinION
• ??? per run
• $900 / run, 6 hours
• Huge pieces (max so far: 200-300 kilobases)
• Can stop / restart using same disposable flowcell
• 15 cm (-ish)
thehightechsociety.com
Quick et al. (2015)
“Using a novel streaming phylogenetic placement method samples can be assigned to a serotype in 40 minutes and determined to be part of the outbreak in less than 2 h.”
Ebola monitoring
blogs.biomedcentral.com Joshua Quick, Nick Loman
Example workflow
Load DNA
Samples evaluated against reference in real time
Positive ID / placement (confidence)
Change flowcell (6 hrs)
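The workflow above amounts to a loop: take each read as it comes off the sequencer, compare it with the references, and stop as soon as one reference is supported with enough confidence. The following toy sketch illustrates that loop with shared k-mer counting. It is not the streaming phylogenetic placement method of Quick et al. The references, reads, thresholds and the `stream_classify` name are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def stream_classify(
    reads: Iterable[str],
    references: Dict[str, str],
    k: int = 4,
    threshold: float = 0.9,
    min_hits: int = 2,
) -> Tuple[Optional[str], float, int]:
    """Consume reads one at a time; return (best reference, confidence,
    reads used) once the leading reference holds at least `threshold` of
    all k-mer hits, or (None, 0.0, reads used) if that never happens."""
    index = {name: kmers(seq, k) for name, seq in references.items()}
    hits: Counter = Counter()
    reads_seen = 0
    for read in reads:
        reads_seen += 1
        read_kmers = kmers(read, k)
        for name, ref_kmers in index.items():
            hits[name] += len(read_kmers & ref_kmers)
        total = sum(hits.values())
        if total >= min_hits:
            best, score = hits.most_common(1)[0]
            confidence = score / total
            if confidence >= threshold:
                return best, confidence, reads_seen
    return None, 0.0, reads_seen


# Invented toy references and reads.
toy_references = {"ref_A": "AAAAAAAAAAAA", "ref_B": "CCCCCCCCCCCC"}
toy_reads = iter(["AAAAAA", "AAAAAA", "CCCCCC"])

print(stream_classify(toy_reads, toy_references))
```

Because the function returns as soon as the threshold is met, the remaining reads are never consumed. That early stop is the property that lets a run be halted and the flowcell reused.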
Challenges
• Sample extraction: getting DNA from stuff
• Clinical-grade evaluation
• Training
• Equipment reliability
• Sequencing errors
• Quality of reference data / attribution algorithms
• Database updates in real time
• Ethics / privacy (Genomes Sequenced While U Wait)
The Point
• Comprehensive monitoring
• Accurate typing
• Rapid identification
• Real-time decision making
Acknowledgements
PIs: Fiona Brinkman (SFU), Will Hsiao (PHMRL), Gary Van Domselaar (NML), Morag Graham (NML), Rob Beiko (Dalhousie)
University of Lisbon: João Carriço
National Microbiology Laboratory (NML): Franklin Bristow, Aaron Petkau, Thomas Matthews, Josh Adam, Adam Olsen, Tara Lynch, Shaun Tyler, Philip Mabon, Philip Au, Celine Nadon, Matthew Stuart-Edwards, Chrystal Berry, Lorelee Tschetter
Laboratory for Foodborne Zoonoses (LFZ): Eduardo Taboada, Peter Kruczkiewicz, Chad Laing, Vic Gannon, Matthew Whiteside, Ross Duncan, Steven Mutschall
Simon Fraser University (SFU): Melanie Courtot, Emma Griffiths, Geoff Winsor, Julie Shay, Matthew Laird, Bhav Dhillon, Raymond Lo
BC Public Health Microbiology & Reference Laboratory (PHMRL) and BC Centre for Disease Control (BCCDC): Judy Isaac-Renton, Patrick Tang, Natalie Prystajecky, Jennifer Gardy, Damion Dooley, Linda Hoang, Kim MacDonald, Yin Chang, Eleni Galanis, Marsha Taylor, Cletus D’Souza, Ana Paccagnella
University of Maryland: Lynn Schriml
Canadian Food Inspection Agency (CFIA): Burton Blais, Catherine Carrillo, Dominic Lambert
Dalhousie University: Alex Keddy
McMaster University: Andrew McArthur, Daim Sardar
European Nucleotide Archive: Guy Cochrane, Petra ten Hoopen, Clara Amid
European Food Safety Agency: Leibana Criado Ernesto, Vernazza Francesco, Rizzi Valentina
Seminar from Will Hsiao, BC Centre for Disease Control
Materials to be available on http://bioinformatics.ca/
June 24-26, 2015
The Bioinformatics Exam of the Future
tagc.com.au
commons.wikimedia.org/wiki/File:DNA_ahelatest_moodustunud_niit_katsuti_korgil..JPG
http://omicfrontiers.com/2014/06/11/diaryofaminion_part2/
2009 was a long time ago
J. Craig Venter Institute
Photo credit: Emma Allen-Vercoe
Some slides courtesy of Gary Van Domselaar, NML