IRIDA: Canada’s federated platform for genomic epidemiology
William Hsiao
@wlhsiao
BC Centre for Disease Control Public Health Laboratory and University of British Columbia
IRIDA Platform Overview
• IRIDA = Integrated Rapid Infectious Disease Analysis
• A free, open-source, standards-compliant, high-quality genomic epidemiology analysis platform to support real-time disease outbreak investigations
Core Functions:
• Management of strain and genomic sequence data
• Rapid processing and analysis of genomic data
• Informative display of genomic results
• Management of sample, case, and aggregate data (“metadata”)
Target audience:
• Public health agencies who need a platform to manage and process genomic data
• Public health agencies who need a platform to use genomics for outbreak investigations
Figure: IRIDA components (Sequencing Instruments, Web Application, Data Management, Built-in Analytical Tools, External Galaxy, Command-line Tools)
Download the latest version at https://github.com/phac-nml/irida
10 simple rules (wish list) to build a better public health microbiology genomic epidemiology analysis system
1: Engage the Users Through the Entire Software Development Cycle
- Project team has direct access to state-of-the-art research in academia
- Project team is directly embedded in the user organization
2: Have A Simple User Interface
Line List View (under testing)
Timeline View (conceptualization)
Selectable fields: Travel, Symptoms and Onset, Exposure Types, Hospitalization
Launch a pipeline
Be Like
3: Build a Robust, Extensible Platform
• IRIDA uses Galaxy to manage workflows
• Adding additional pipelines is relatively easy
• A standard API allows third-party tools to obtain data from IRIDA (e.g. IslandViewer and GenGIS)
Figure: IRIDA architecture (Servlet Container, Web Interface, REST API, Application Logic, Central File Storage, Galaxy, Compute Cluster, command line)
IslandViewer: http://www.pathogenomics.sfu.ca/islandviewer/
GenGIS: http://kiwi.cs.dal.ca/GenGIS/Main_Page
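As a rough illustration of how a third-party tool might obtain data through a REST API, the sketch below builds an authenticated request and parses a JSON listing. The server address, the `projects/1/samples` path, the bearer token, and the JSON shape are all placeholders chosen for illustration; they are not taken from the IRIDA API documentation.

```python
import json
from typing import List
from urllib.parse import urljoin
from urllib.request import Request


def build_request(base_url: str, path: str, token: str) -> Request:
    """Build an authenticated GET request for a REST resource."""
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return Request(url, headers=headers)


def sample_names(payload: str) -> List[str]:
    """Extract sample names from a JSON listing (hypothetical shape)."""
    document = json.loads(payload)
    return [sample["name"] for sample in document.get("samples", [])]


# Hypothetical server and path. A real client would pass the request to
# urllib.request.urlopen() and read the response body.
request = build_request("https://irida.example.org/api", "projects/1/samples", "TOKEN")

# Hypothetical response body, used here so the sketch runs offline.
listing = '{"samples": [{"name": "SampleA"}, {"name": "SampleB"}]}'
names = sample_names(listing)
```

Keeping request construction separate from response parsing lets a tool swap in a different server or authentication scheme without touching the parsing code.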
4: Have Extensive Documentation
• Documentation should be available for:
  • Users – step-by-step tutorials with screenshots / FAQ
  • System Administrators – installation instructions / issue trackers
  • Developers – open source, collaborative development / IRC channel
• Easily Accessible at https://irida.corefacility.ca/documentation/
5: Implement QC Throughout the Whole Application
• Genomics is sensitive and sequence data are inherently noisy
• Genomics is a rapidly advancing technology
• Standardizing pipelines is difficult and can stifle innovation
• Better to standardize the performance and reporting metrics and ensure any validated pipelines meet the testing criteria
• Developing a general QC testing module (RCQC) that uses ontology to standardize QC metrics (https://github.com/Public-Health-Bioinformatics/rcqc)
• Data provenance and version control (data + pipelines) are musts for diagnostic labs
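The idea of standardizing the reporting metrics rather than the pipelines themselves can be sketched as a shared set of acceptance criteria applied to each pipeline's QC report. The metric names and thresholds below are invented for illustration and are not taken from RCQC.

```python
from typing import Dict, List, Optional, Tuple

Bounds = Tuple[Optional[float], Optional[float]]

# Hypothetical acceptance criteria: metric name -> (minimum, maximum).
# None means that bound is not enforced.
CRITERIA: Dict[str, Bounds] = {
    "mean_coverage": (30.0, None),
    "n50": (50_000, None),
    "contig_count": (None, 500),
}


def failed_metrics(report: Dict[str, float],
                   criteria: Dict[str, Bounds] = CRITERIA) -> List[str]:
    """Return the metrics that are missing or fall outside their bounds."""
    failures = []
    for metric, (low, high) in criteria.items():
        value = report.get(metric)
        if value is None:
            failures.append(metric)  # a missing metric counts as a failure
        elif low is not None and value < low:
            failures.append(metric)
        elif high is not None and value > high:
            failures.append(metric)
    return failures


# Two hypothetical pipelines reporting the same standardized metrics.
pipeline_a = {"mean_coverage": 45.2, "n50": 120_000, "contig_count": 210}
pipeline_b = {"mean_coverage": 18.7, "n50": 130_000, "contig_count": 640}

for name, report in [("pipeline_a", pipeline_a), ("pipeline_b", pipeline_b)]:
    print(name, "passes" if not failed_metrics(report) else failed_metrics(report))
```

Because both pipelines report against the same metric names, either can be validated or replaced without changing the criteria.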
6: Build to Enable Collaboration
• Be able to compare pipelines
• Pipelines implemented using Galaxy – transparent and shareable
• Define QC criteria using ontology to compare different pipelines that serve the same purpose
• Be able to share data in standard formats to minimize data re-entry from one platform to another
• Federation of platforms using standard API to share data and analysis results
7: Use Compatible Data Standards
• Sequence data are more compatible / shareable, but metadata are currently siloed and incompatible
• Collaboration and Sharing are difficult when data are incompatible
• Compatibility != Sameness
• Use ontology to allow customization of term lists, but all terms with the same meaning (semantics) should have the same universal ID (e.g. a URL) to facilitate mapping of terms
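A minimal sketch of "compatibility != sameness": two labs keep their own term lists, but every term points to a shared identifier, so terms can be mapped between labs. The lab vocabularies and the `example.org` identifiers are placeholders, not real ontology IDs.

```python
from typing import Dict, Optional

BASE = "http://example.org/ontology/"  # placeholder namespace, not a real ontology

# Each lab customizes its wording, but terms with the same meaning share one ID.
LAB_A_TERMS: Dict[str, str] = {
    "ground beef": BASE + "FOOD_0001",
    "raw milk": BASE + "FOOD_0002",
}
LAB_B_TERMS: Dict[str, str] = {
    "minced beef": BASE + "FOOD_0001",
    "unpasteurized milk": BASE + "FOOD_0002",
}


def same_meaning(term_a: str, term_b: str) -> bool:
    """True when a Lab A term and a Lab B term resolve to the same ID."""
    id_a = LAB_A_TERMS.get(term_a)
    return id_a is not None and id_a == LAB_B_TERMS.get(term_b)


def translate(term: str, source: Dict[str, str], target: Dict[str, str]) -> Optional[str]:
    """Map a term from one lab's list to the other's via the shared ID."""
    shared_id = source.get(term)
    for candidate, candidate_id in target.items():
        if candidate_id == shared_id:
            return candidate
    return None


print(translate("raw milk", LAB_A_TERMS, LAB_B_TERMS))
```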
8: Implement Fine Grained Access Control
Figure: Detailed View vs. Restricted View
E.g. user role permissions control visibility and editing of content
Authorization
• Industry-standard authentication and authorization mechanisms
• Local authorization per instance.
• Method-level authorization.
• Object-level authorization.
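The difference between method-level and object-level authorization can be sketched as follows. This is a generic illustration, not how IRIDA implements it; the `User` and `Project` classes, the `analyst` role, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from functools import wraps
from typing import Set


@dataclass
class User:
    name: str
    roles: Set[str] = field(default_factory=set)


@dataclass
class Project:
    name: str
    members: Set[str] = field(default_factory=set)


def requires_role(role: str):
    """Method-level check: the caller must hold the given role."""
    def decorator(func):
        @wraps(func)
        def wrapper(user: User, *args, **kwargs):
            if role not in user.roles:
                raise PermissionError(f"{user.name} lacks role {role!r}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@requires_role("analyst")
def read_project(user: User, project: Project) -> str:
    """Object-level check: the caller must also be a member of this project."""
    if user.name not in project.members:
        raise PermissionError(f"{user.name} cannot access {project.name}")
    return f"{user.name} reads {project.name}"


alice = User("alice", {"analyst"})
print(read_project(alice, Project("outbreak-1", {"alice"})))
```

The role check guards the operation as a whole, while the membership check guards one specific object, so a user may be allowed to read projects in general yet still be refused a particular one.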
9: Use Technology to Safeguard Patient Privacy
It’s easy to lose control of the Excel Line List - someone can make a copy of the content and pass it around without your knowledge; typos are common and cumulative!
Technology can control who sees what and when
Separate sensitive patient data from pathogen sequence data, but be able to bring them together when necessary without resorting to emailing line lists!
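One way to picture this separation is two stores that share only an opaque sample identifier and are joined on demand for users who are permitted to see patient data. The store contents, field names, and the `may_view_patient_data` flag below are invented for illustration.

```python
from typing import Any, Dict

# Two separate stores linked only by an opaque sample identifier (hypothetical data).
SEQUENCE_STORE: Dict[str, Dict[str, Any]] = {
    "S-001": {"organism": "Salmonella enterica", "reads": "S-001.fastq"},
}
PATIENT_STORE: Dict[str, Dict[str, Any]] = {
    "S-001": {"age_group": "30-39", "hospitalized": True},
}


def linked_record(sample_id: str, may_view_patient_data: bool) -> Dict[str, Any]:
    """Return sequence data, adding patient data only for permitted users."""
    record: Dict[str, Any] = {"sample_id": sample_id, **SEQUENCE_STORE[sample_id]}
    if may_view_patient_data:
        record["patient"] = PATIENT_STORE.get(sample_id)
    return record


restricted = linked_record("S-001", may_view_patient_data=False)
detailed = linked_record("S-001", may_view_patient_data=True)
print(sorted(restricted), sorted(detailed))
```

Nothing is copied out of the patient store unless the join is explicitly requested, which is the property a circulated spreadsheet cannot offer.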
10: Have Multiple, Flexible Access Options
• There is no one-size-fits-all solution; having many platforms to choose from is a good thing (but data should be portable across platforms!)
• IRIDA is available in several different flavours:
  • Local Install
    - Advantages: Full control of the system; your data never leave your centre
    - Disadvantages: Computing infrastructure and IT support needed to maintain the resource
  • Virtual Machine
    - Advantages: Full control of the system; easy to set up
    - Disadvantages: Not really scalable if run on your own desktop; some performance loss
  • Cloud Instance
    - Advantages: Full control of the system; does not require local computing infrastructure
    - Disadvantages: Data go into a cloud environment; uploading to a cloud environment can be slow
  • Public Version
    - Advantages: No setup required; upload your data and have it processed using Compute Canada resources
    - Disadvantages: Data go into a public instance (data remain private to your account); upload can be slow
Acknowledgements
Project Leaders: Fiona Brinkman – SFU; Will Hsiao – PHMRL; Gary Van Domselaar – NML
University of Lisbon: João Carriço
National Microbiology Laboratory (NML): Franklin Bristow, Aaron Petkau, Thomas Matthews, Josh Adam, Adam Olson, Tarah Lynch, Shaun Tyler, Philip Mabon, Philip Au, Celine Nadon, Matthew Stuart-Edwards, Morag Graham, Chrystal Berry, Lorelee Tschetter, Aleisha Reimer
Laboratory for Foodborne Zoonoses (LFZ): Eduardo Taboada, Peter Kruczkiewicz, Chad Laing, Vic Gannon, Matthew Whiteside, Ross Duncan, Steven Mutschall
Simon Fraser University (SFU): Melanie Courtot, Emma Griffiths, Geoff Winsor, Julie Shay, Matthew Laird, Bhav Dhillon, Raymond Lo
BC Public Health Microbiology & Reference Laboratory (PHMRL) and BC Centre for Disease Control (BCCDC): Judy Isaac-Renton, Patrick Tang, Natalie Prystajecky, Jennifer Gardy, Damion Dooley, Linda Hoang, Kim MacDonald, Yin Chang, Eleni Galanis, Marsha Taylor, Cletus D’Souza, Ana Paccagnella
University of Maryland: Lynn Schriml
Canadian Food Inspection Agency (CFIA): Burton Blais, Catherine Carrillo, Dominic Lambert
Dalhousie University: Rob Beiko, Alex Keddy
McMaster University: Andrew McArthur, Daim Sardar
European Nucleotide Archive: Guy Cochrane, Petra ten Hoopen, Clara Amid
European Food Safety Agency: Leibana Criado Ernesto, Vernazza Francesco, Rizzi Valentina
IRIDA Annual General Meeting, Winnipeg, April 8-9, 2015