RD-CONNECT WP5 UPDATE
RD-Connect Annual Meeting, Berlin, May 1st 2017
Platform moved to RD-Connect cluster
RD-Connect cluster
• 19 servers
• Each server has:
  • 256 GB RAM
  • 20 TB SATA disk + 900 GB SSD disk
  • 32 CPU cores
• Software suite
  • Apache Mesos + DC/OS for cluster management
  • Marathon for Docker orchestration
  • Foreman + Puppet
  • Jenkins
• Aim: moving towards 100% CI/CD (continuous integration/deployment)
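As a rough illustration of what deploying one containerised service on this stack could look like, the sketch below builds a Marathon application definition as a Python dictionary and checks that each instance fits on a single server of the size listed above. The application id, Docker image name and port are hypothetical placeholders; only the per-server figures (32 CPU cores, 256 GB RAM) come from the slide.

```python
import json

# Per-server capacity taken from the slide.
SERVER_CPU_CORES = 32
SERVER_RAM_MB = 256 * 1024  # 256 GB expressed in MB, the unit of Marathon's "mem" field


def marathon_app(app_id: str, image: str, cpus: float, mem_mb: int,
                 instances: int, container_port: int) -> dict:
    """Build a Marathon app definition for a Docker container using bridge networking."""
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem_mb,
        "instances": instances,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks Marathon to assign a free port on the host.
                "portMappings": [{"containerPort": container_port, "hostPort": 0}],
            },
        },
    }


def fits_on_one_server(app: dict) -> bool:
    """Each instance must fit within a single server's CPU and RAM."""
    return app["cpus"] <= SERVER_CPU_CORES and app["mem"] <= SERVER_RAM_MB


# Hypothetical service: the id and image are placeholders, not real RD-Connect artefacts.
APP = marathon_app("/example/genomics-api", "example/genomics-api:latest",
                   cpus=2.0, mem_mb=4096, instances=3, container_port=8080)
APP_JSON = json.dumps(APP, indent=2)  # body that would be POSTed to Marathon's /v2/apps
```

In a CI/CD setup of the kind the slide aims for, a Jenkins job could generate and submit such a definition after each successful build; that wiring is an assumption and is not described in the slides.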
Platform moved to RD-Connect cluster
Monitoring
- Software stack: Beats + Elasticsearch + Kibana
- Status of monitoring:
  - Proxies (production and integration) monitored
  - Genomics application monitoring under development
- Future developments:
  - Complete monitoring of all the applications
  - Integration of application monitoring with resource/metric monitoring for performance optimization and resource allocation minimization
  - Analysis of logs for anomaly detection (Kafka + Spark)
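To make the planned log-based anomaly detection more concrete, here is a minimal sketch of one simple approach: flag a time window whose error count is far above the recent average. This is a plain-Python stand-in chosen for illustration, not the Kafka + Spark pipeline mentioned on the slide, and the window size and threshold factor are assumed values.

```python
from statistics import mean, pstdev
from typing import List


def flag_anomalies(counts: List[int], window: int = 5, k: float = 3.0) -> List[int]:
    """Return indices of windows whose count exceeds mean + k * std of the
    preceding `window` counts. The std is floored at 1.0 so that a perfectly
    flat history does not flag tiny fluctuations."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        threshold = mean(history) + k * max(pstdev(history), 1.0)
        if counts[i] > threshold:
            flagged.append(i)
    return flagged


# Toy data: error-log lines per minute, with one obvious spike at index 6.
ERRORS_PER_MINUTE = [2, 3, 2, 4, 3, 2, 40, 3]
ANOMALIES = flag_anomalies(ERRORS_PER_MINUTE)
```

The same rule could in principle be applied per application or per proxy once their logs are indexed; which metrics are actually monitored is not specified in the slides.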
Whole RD-Connect Platform Architecture Overview 2017
[Figure: architecture diagram. Labels recoverable from the figure:]
• Application server (Liferay - Java): PostgreSQL, REST API, Security
• Application server (Play2 - Scala): Psql, El, REST API, Security
• Application server (XWiki): MySQL, Solr, REST API (*), Security
• Application server (Spring - Java): MySQL, El, Security, VCFs, REST API
• Application server (Play2 - Scala): LDAP, REST API, Security
• DiseaseCard (REST API), Alfa (REST API), LUMC Tools (REST API) ***, Filtration tool (web), Client (Angular), UMD (web service **), Web browser
• PhenoTips, CAS, Biobanks and Registries (ID-Cards), Samples (Molgenis), ID relationships (RDF, Postgres, D2R), Genomics, IDs, Integrated security
Legend: El = Elasticsearch; Psql = PostgreSQL; Solr = Apache Solr
RD-Connect Genomics Platform
https://platform.rd-connect.eu
CAS login
Samples and users
2016 Annual Meeting 2017 Annual Meeting
Users
Users connected T1 (Jan-Mar)
24798
34 41
Genomic samples 567 2123
Data flow to RD-Connect
[Figure: data-flow diagram. Labels recoverable from the figure:]
• Sequencing lab
• Raw data (FASTQ/BAM)
• EGA
• Standard analysis pipeline
• RD-Connect platform
• Analysis tools
• PhenoTips (HPO terms)
• Researcher/Clinician
N = 2123 and counting …
Benchmarking of VC Pipelines
Laurie et al. Human Mutation, 2016
NA12878 50x WGS FASTQs (Illumina Platinum), analysed with several pipelines. Concordance with the gold-standard variant call set from GIAB/NIST (Zook et al., 2014) for the reliably callable region of the genome (70%).
Reliably callable: 99% 65% 62%
Not reliably callable: 76% 31% 31%
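The comparison against a gold-standard call set can be sketched as follows. This is one plausible reading of "concordance", computing sensitivity and precision restricted to a set of callable regions; the exact metric used in Laurie et al. is not stated on the slide, and the variants and regions below are toy data.

```python
from typing import Iterable, List, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)
Region = Tuple[str, int, int]        # (chromosome, start, end), both ends inclusive


def in_regions(variant: Variant, regions: Iterable[Region]) -> bool:
    """True if the variant position falls inside any of the regions."""
    chrom, pos, _, _ = variant
    return any(c == chrom and start <= pos <= end for c, start, end in regions)


def concordance(calls: Set[Variant], gold: Set[Variant], regions: List[Region]) -> dict:
    """Compare a pipeline's calls with the gold standard inside the given regions."""
    calls_in = {v for v in calls if in_regions(v, regions)}
    gold_in = {v for v in gold if in_regions(v, regions)}
    shared = calls_in & gold_in
    return {
        "sensitivity": len(shared) / len(gold_in) if gold_in else 0.0,
        "precision": len(shared) / len(calls_in) if calls_in else 0.0,
    }


# Toy example: one callable region on chromosome 1; position 5000 lies outside it.
CALLABLE = [("1", 1, 1000)]
GOLD = {("1", 100, "A", "G"), ("1", 200, "C", "T"), ("1", 5000, "G", "A")}
CALLS = {("1", 100, "A", "G"), ("1", 300, "T", "C"), ("1", 5000, "G", "A")}
RESULT = concordance(CALLS, GOLD, CALLABLE)
```

Running the same function with the complement of the callable regions would give the corresponding figures for the not reliably callable part of the genome.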
Genomics platform architecture
[Figure: architecture diagram. Labels recoverable from the figure:]
• gVCFs
• Variant calling & annotation pipeline
• Hadoop file system (HDFS): table format (Parquet), external Hive table
• Indexed data (Elasticsearch): real-time queries
• Metadata, user info & permissions (Postgres)
• REST web server (Scala)
• Browser client (Angular)
• Authorised access
• Web services
D. Piscia, J. Protasio, S. Laurie, S. Beltran, J.M. Fernández, A. Cañada, V. de la Torre et al.
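To illustrate how real-time queries against the indexed data might be expressed, the sketch below builds an Elasticsearch query body that combines several filters in a `bool` query. The field names (`gene`, `sample_id`, `population_af`) are assumptions made for the example; the platform's actual index mapping is not shown in the slides.

```python
import json
from typing import List, Optional


def variant_query(gene: str, sample_ids: List[str],
                  max_af: Optional[float] = None, size: int = 100) -> dict:
    """Build an Elasticsearch query body: all filters must match (bool/filter)."""
    filters = [
        {"term": {"gene": gene}},             # exact match on a gene symbol
        {"terms": {"sample_id": sample_ids}},  # restrict to samples the user may see
    ]
    if max_af is not None:
        # Keep only variants at or below the given population allele frequency.
        filters.append({"range": {"population_af": {"lte": max_af}}})
    return {"size": size, "query": {"bool": {"filter": filters}}}


# Hypothetical request: gene symbol and sample ids are placeholders.
QUERY = variant_query("GENE_A", ["S1", "S2"], max_af=0.01)
QUERY_JSON = json.dumps(QUERY)  # body that a REST server could send to the _search endpoint
```

Restricting `sample_id` on the server side is one way the permissions stored in Postgres could be enforced at query time; whether the platform does it this way is an assumption.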
Improved GUI
RD-Connect Genomics Platform
D. Piscia, J. Protasio, S. Laurie, A. Papakonstantinou, S. Beltran
Preset filters and share queries
Added ClinVar (and looking at HGMD)
ClinVar can be used for filtering, and ClinVar categories are shown
Started conversations to explore integration of HGMD
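Filtering by ClinVar category could look roughly like the sketch below, which keeps only variants whose annotation falls in a chosen set of clinical significance categories. The `clinvar` key and the variant records are hypothetical; the category names are the standard ClinVar clinical significance terms.

```python
from typing import Iterable, List

CLINVAR_CATEGORIES = (
    "Pathogenic",
    "Likely pathogenic",
    "Uncertain significance",
    "Likely benign",
    "Benign",
)


def filter_by_clinvar(variants: Iterable[dict], keep: Iterable[str]) -> List[dict]:
    """Keep variants whose 'clinvar' annotation is one of the requested categories."""
    keep = set(keep)
    unknown = keep - set(CLINVAR_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown ClinVar categories: {sorted(unknown)}")
    # Variants without a ClinVar annotation are dropped by this filter.
    return [v for v in variants if v.get("clinvar") in keep]


# Toy records; identifiers are placeholders.
VARIANTS = [
    {"id": "v1", "clinvar": "Pathogenic"},
    {"id": "v2", "clinvar": "Benign"},
    {"id": "v3"},  # no ClinVar annotation
    {"id": "v4", "clinvar": "Likely pathogenic"},
]
KEPT = filter_by_clinvar(VARIANTS, ["Pathogenic", "Likely pathogenic"])
```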
Get lists of genes associated with OMIM and HPO
Search for OMIM and HPO terms through OMIM and PhenoTips APIs
Predefined lists of genes
OMIM and HPO related genes accessed through OMIM and PhenoTips APIs
Added more lists of genes
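The gene-list feature can be pictured as two steps: resolve the selected OMIM/HPO terms to a set of genes, then restrict the variant list to those genes. In the sketch below a local dictionary stands in for the responses of the OMIM and PhenoTips APIs, whose actual request and response formats are not shown in the slides; all term ids and gene names are placeholders.

```python
from typing import Dict, Iterable, List, Set


def genes_for_terms(term_ids: Iterable[str], term_to_genes: Dict[str, List[str]]) -> Set[str]:
    """Union of the genes associated with each selected term; unknown terms add nothing."""
    genes: Set[str] = set()
    for term in term_ids:
        genes |= set(term_to_genes.get(term, ()))
    return genes


def filter_by_genes(variants: Iterable[dict], genes: Set[str]) -> List[dict]:
    """Keep variants located in one of the listed genes."""
    return [v for v in variants if v["gene"] in genes]


# Placeholder mapping standing in for API lookups.
TERM_TO_GENES = {
    "HP:EXAMPLE1": ["GENE_A", "GENE_B"],
    "OMIM:EXAMPLE2": ["GENE_B", "GENE_C"],
}
GENE_LIST = genes_for_terms(["HP:EXAMPLE1", "OMIM:EXAMPLE2"], TERM_TO_GENES)
IN_LIST = filter_by_genes(
    [{"id": "v1", "gene": "GENE_A"}, {"id": "v2", "gene": "GENE_Z"}],
    GENE_LIST,
)
```

A predefined list is then just a stored gene set that skips the first step.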
New links (incl. HSF, HGMD and gnomAD)
Development of common API to integrate tools through Links
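One simple way to integrate external tools through links is a registry of URL templates that are filled in from a variant's fields. The sketch below shows that idea only; the tool names and `example.org` URLs are hypothetical and do not reflect the real URL schemes of HSF, HGMD or gnomAD, nor the platform's actual common API.

```python
from typing import Dict
from urllib.parse import quote

# Hypothetical registry: tool name -> URL template with named placeholders.
LINK_TEMPLATES = {
    "tool_a": "https://tool-a.example.org/variant/{chrom}-{pos}-{ref}-{alt}",
    "tool_b": "https://tool-b.example.org/search?gene={gene}",
}


def build_links(variant: dict, templates: Dict[str, str] = LINK_TEMPLATES) -> Dict[str, str]:
    """Fill each template from the variant's fields; skip tools whose
    required fields are missing from the variant."""
    safe = {key: quote(str(value), safe="") for key, value in variant.items()}
    links = {}
    for name, template in templates.items():
        try:
            links[name] = template.format(**safe)
        except KeyError:
            continue  # this tool needs a field the variant does not have
    return links


# Toy variant without a gene annotation: only tool_a can be linked.
LINKS = build_links({"chrom": "1", "pos": 12345, "ref": "A", "alt": "G"})
```

Adding a new tool then only requires registering one more template, which is the appeal of a common link API.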
Search across samples (per gene/s) with all filters
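Conceptually, a cross-sample search selects the variants in the requested genes that pass the active filters and groups them by sample. The sketch below does this in memory on toy records; the field names and the `impact` filter are assumptions, and the platform itself presumably runs such searches against its indexed data rather than in Python.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def search_across_samples(variants: Iterable[dict], genes: Iterable[str],
                          passes_filters: Callable[[dict], bool] = lambda v: True
                          ) -> Dict[str, List[dict]]:
    """Group variants in the requested genes that pass the filters by sample id."""
    wanted = set(genes)
    hits = defaultdict(list)
    for variant in variants:
        if variant["gene"] in wanted and passes_filters(variant):
            hits[variant["sample_id"]].append(variant)
    return dict(hits)


# Toy records; sample ids, gene names and the "impact" field are placeholders.
VARIANTS = [
    {"sample_id": "S1", "gene": "GENE_A", "impact": "high"},
    {"sample_id": "S1", "gene": "GENE_B", "impact": "high"},
    {"sample_id": "S2", "gene": "GENE_A", "impact": "low"},
    {"sample_id": "S3", "gene": "GENE_A", "impact": "high"},
]
HITS = search_across_samples(VARIANTS, ["GENE_A"], lambda v: v["impact"] == "high")
```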
Exomiser in production
Run Exomiser on filtered results (coming soon)
HPO terms and inheritance model extracted from PhenoTips through API
BBMRI-LPC Whole Exome Sequencing Call for RD (2016)
Goal: to promote the utilization of cutting-edge next-generation sequencing technology for the identification of novel causative variants and genes and to molecularly diagnose rare disease patients. BBMRI-LPC also wants to promote biobanking for rare diseases, the use of rare disease biobanks and responsible data sharing.
To sequence and analyse: 900 exomes in 17 coordinated projects.
Sequencing and analysis carried out at the CNAG-CRG and the Wellcome Trust Sanger Institute (WTSI).
Researchers are able to analyse the data in RD-Connect’s platform.
3/17 projects released through RD-Connect (follow-up session by Manuel and Marina on submission and from Hanns on results).
2016 Main Achievements
Deployment of platform in RD-Connect’s cluster
Improved the CAS; connection with ID-Cards, Sample Catalogue and PhenoTips underway
Genomics platform with 2123 experiments
Filtering by genes linked to OMIM and HPO; PhenoTips API improved
Common API – to add informative links to external
Integration of ClinVar and links to external tools (HSF, gnomAD, HGMD etc.)
Exomiser in production
Additional features for genomics platform
Processing of the BBMRI-LPC projects
Contributors
WP1: Coordination
WP2: Patient registries
WP3: Biobanks
WP4: Bioinformatics
WP5: Unified platform
WP6: Ethical/legal/social
WP7: Impact/Innovation
Hanns Lochmüller (Newcastle and TREAT-NMD)
Domenica Taruscio (ISS and EPIRARE)
Lucia Monaco (Fondaz. Telethon & EuroBioBank)
Ivo Gut (CNAG Barcelona)
Christophe Béroud (INSERM Marseille)
Mats Hansson (Uppsala)
Kate Bushby (Newcastle and EUCERD/EJARD)
CNAG: I. Gut, S. Beltran, D. Piscia, S. Laurie, J. Protasio, A. Papakonstantinou, I. Martinez, R. Tonda, J.R. Trotta
CNIO: A. Valencia, S. Capella, V. de la Torre, J.M. Fernández, A. Cañada
AMU (Marseille): C. Béroud, D. Salgado, J.P. Desvignes
Interactive BioSoftware: A. Blavier, S. Lair
LUMC: P.B. ’t Hoen, M. Roos, M. Thompson, R. Kaliyaperumal, B. Mons
U. of Toronto: M. Brudno, M. Girdea, S. Dumitriu, O. Buske
EGA: T. Keane, D. Spalding, J. Paschall, J. Almeida-King, J. Rambla
Newcastle U.: H. Lochmüller, R. Thompson, A. Topf, I. Zaharieva
U. Aveiro: J.L. Oliveira, P. Lopes, P. Sernadela
U. of Patras: G. Patrinos
Murdoch U.: M. Bellgard