RD-CONNECT WP5 UPDATE · 2017-10-03

RD-Connect Annual Meeting, Berlin, May 1st 2017

Platform moved to RD-Connect cluster

RD-Connect cluster

• 19 servers

• Each server has:

  • 256 GB RAM

  • 20 TB SATA disk + 900 GB SSD disk

  • 32 CPU cores

• Software suite

  • Apache Mesos + DC/OS for cluster management

  • Apache Marathon for Docker orchestration

  • Foreman + Puppet

  • Jenkins

• Aim: moving towards 100% CI/CD (continuous integration/deployment)
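The per-server figures above can be multiplied out to give the raw capacity of the whole cluster. The following is a minimal sketch; the names `ServerSpec` and `cluster_totals` are illustrative only, and the totals are raw hardware sums that ignore operating system overhead and storage replication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hardware of one server, as listed on the slide."""
    ram_gb: int
    sata_tb: int
    ssd_gb: int
    cpu_cores: int


N_SERVERS = 19
SPEC = ServerSpec(ram_gb=256, sata_tb=20, ssd_gb=900, cpu_cores=32)


def cluster_totals(n_servers: int, spec: ServerSpec) -> dict:
    """Multiply the per-server resources by the number of servers."""
    return {
        "ram_gb": n_servers * spec.ram_gb,
        "sata_tb": n_servers * spec.sata_tb,
        "ssd_gb": n_servers * spec.ssd_gb,
        "cpu_cores": n_servers * spec.cpu_cores,
    }


totals = cluster_totals(N_SERVERS, SPEC)
for resource, amount in totals.items():
    print(f"{resource}: {amount}")
```

Under these figures the cluster offers 4864 GB of RAM, 380 TB of SATA disk, 17100 GB of SSD and 608 CPU cores in total.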

Platform moved to RD-Connect cluster

Monitoring

- Software stack: Beats + Elasticsearch + Kibana

- Status of monitoring

  - Proxies (production and integration) are monitored

  - Genomics application monitoring under development

- Future developments

  - Complete monitoring of all the applications

  - Integration of application monitoring with resource/metric monitoring for performance optimization and resource allocation minimization

  - Analysis of logs for anomaly detection (Kafka + Spark)
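To make the idea of log-based anomaly detection concrete, here is a minimal sketch in plain Python. It is not the planned Kafka + Spark pipeline: it swaps in a simple trailing-window z-score over per-minute error counts, and the function name, window size, threshold and sample data are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_anomalies(counts, window=5, threshold=3.0):
    """Return indices whose count deviates from the mean of the preceding
    `window` values by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if counts[i] != mu:
                anomalies.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical number of error lines per minute in a proxy log.
errors_per_minute = [2, 3, 2, 4, 3, 2, 3, 40, 3, 2]
print(find_anomalies(errors_per_minute))
```

A streaming implementation would apply the same comparison to windowed counts as they arrive, rather than to a list held in memory.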

Whole RD-Connect Platform Architecture Overview 2017

Application server (Liferay - Java)

PostgreSQL

REST API

Security

Application server (Play2 - Scala)

Psql, El

REST API

Security

Application server (XWiki)

MySQL, Solr

REST API (*)

Security

Application server (Spring - Java)

MySQL, El

Security

VCFs

REST API

DiseaseCard (REST API)

Alfa (REST API)

LUMC Tools (REST API) ***

Filtration tool (web)

Client (Angular)

UMD (web service **)

Web browser

PhenoTips

CAS

Biobanks and Reg. (ID-Card), Samples (Molgenis)

Biobanks and Registries Samples

Legend: El = Elasticsearch; Psql = Postgres; Solr = Apache Solr

ID relationships (RDF, Postgres, D2R)

Application server (Play2 - Scala)

LDAP

REST API

Security

Genomics

IDs

Integrated security

RD-Connect Genomics Platform

https://platform.rd-connect.eu

CAS login

Samples and users

2016 Annual Meeting 2017 Annual Meeting

Users

Users connected T1 (Jan-Mar)

24798

34 41

Genomic samples 567 2123

Data flow to RD-Connect

EGA

RD-Connect platform

Sequencing lab

Standard analysis pipeline

Raw data (FASTQ/BAM)

Researcher/Clinician

Analysis Tools

PhenoTips (HPO terms)

N=2123 and counting …

Benchmarking of VC Pipelines

Laurie et al. Human Mutation, 2016

NA12878 50x WGS FASTQs (Illumina Platinum), analysed with several pipelines. Concordance with the gold-standard variant call set from GIAB/NIST (Zook et al., 2014) for the reliably callable region of the genome (70%).
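As an illustration of what concordance with a gold-standard call set means, the sketch below compares two sets of variants inside a set of callable regions and reports sensitivity and precision. It is a simplified stand-in, not the method of Laurie et al.: it uses exact matching on chromosome, position and alleles, ignores genotypes and variant normalisation, and all names and toy data are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position
    ref: str
    alt: str


def in_regions(variant, regions):
    """regions maps chromosome -> list of (start, end), 1-based inclusive."""
    return any(start <= variant.pos <= end
               for start, end in regions.get(variant.chrom, []))


def concordance(calls, truth, regions):
    """Compare pipeline calls with a truth set, restricted to the regions."""
    calls_in = {v for v in calls if in_regions(v, regions)}
    truth_in = {v for v in truth if in_regions(v, regions)}
    tp = len(calls_in & truth_in)   # called and in the truth set
    fp = len(calls_in - truth_in)   # called but not in the truth set
    fn = len(truth_in - calls_in)   # in the truth set but missed
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
    }


# Toy data: one callable region and a handful of made-up variants.
callable_regions = {"1": [(100, 200)]}
truth_set = [Variant("1", 110, "A", "G"), Variant("1", 150, "C", "T"),
             Variant("1", 190, "G", "GA"), Variant("1", 500, "T", "C")]
pipeline_calls = [Variant("1", 110, "A", "G"), Variant("1", 150, "C", "T"),
                  Variant("1", 160, "G", "A"), Variant("1", 500, "T", "C")]

result = concordance(pipeline_calls, truth_set, callable_regions)
print(result)
```

The variant at position 500 is shared by both sets but lies outside the callable region, so it does not contribute to any of the counts.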

Benchmarking of VC Pipelines

Laurie et al. Human Mutation, 2016

Reliably Callable: 99% 65% 62%

Not Reliably Callable: 76% 31% 31%

Genomics platform architecture

Hadoop File-system (HDFS)

REST Web Server (Scala)

Metadata, user info & permissions (Postgres)

gVCFs

Variant Calling & Annotation pipeline

Table format (Parquet)

Real-time Queries

Indexed Data (Elasticsearch)

External Hive table

D. Piscia, J. Protasio, S. Laurie, S. Beltran, J.M. Fernández, A. Cañada, V. de la Torre et al.

Browser Client (Angular)

Authorised Access

Web Services
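The real-time queries against the indexed data could take the form of an Elasticsearch bool query. The sketch below only builds the request body as a Python dictionary; the field names (`sample_id`, `gene`, `population_frequency`) and the function name are assumptions, since the actual index mapping of the platform is not shown in the slides.

```python
import json


def build_variant_query(sample_id, genes, max_pop_freq, size=100):
    """Build an Elasticsearch query body that keeps variants of one sample,
    in a given gene list, at or below a population frequency cut-off.
    All field names are hypothetical."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter clauses restrict the results without scoring them.
                "filter": [
                    {"term": {"sample_id": sample_id}},
                    {"terms": {"gene": sorted(genes)}},
                    {"range": {"population_frequency": {"lte": max_pop_freq}}},
                ]
            }
        },
    }


body = build_variant_query("SAMPLE_001", {"GENE2", "GENE1"}, 0.01)
print(json.dumps(body, indent=2))
```

A REST server in the position shown in the diagram would send a body like this to the index after checking the user's permissions.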

Improved GUI

RD-Connect Genomics Platform

D. Piscia, J. Protasio, S. Laurie, A. Papakonstantinou, S. Beltran

Preset filters and query sharing

Added ClinVar (and looking at HGMD)

ClinVar can be used for filtering, and ClinVar categories are shown

Started conversations to explore integration of HGMD

Get lists of genes associated with OMIM and HPO terms

Search for OMIM and HPO terms through OMIM and PhenoTips APIs

Predefined lists of genes

OMIM and HPO related genes accessed through OMIM and PhenoTips APIs

Added more lists of genes
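Filtering by gene lists can be sketched as follows: collect the genes linked to the selected terms, merge them with any predefined list, and keep only variants in those genes. The term-to-gene mapping here is an in-memory dictionary with placeholder identifiers standing in for the OMIM and PhenoTips API lookups, which are not reproduced; all names and data are hypothetical.

```python
def genes_for_terms(term_ids, term_to_genes):
    """Union of the genes associated with each selected term."""
    genes = set()
    for term_id in term_ids:
        genes |= term_to_genes.get(term_id, set())
    return genes


def filter_variants_by_genes(variants, genes):
    """Keep only variants whose gene is in the given gene set."""
    return [v for v in variants if v["gene"] in genes]


# Placeholder associations; a real lookup would query the external APIs.
term_to_genes = {
    "HPO_TERM_A": {"GENE1", "GENE2"},
    "OMIM_ENTRY_B": {"GENE2", "GENE3"},
}
predefined_list = {"GENE9"}

selected = genes_for_terms(["HPO_TERM_A", "OMIM_ENTRY_B"], term_to_genes)
selected |= predefined_list

variants = [
    {"id": "var1", "gene": "GENE1"},
    {"id": "var2", "gene": "GENE5"},
    {"id": "var3", "gene": "GENE9"},
]
kept = filter_variants_by_genes(variants, selected)
print([v["id"] for v in kept])
```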

New links (incl. HSF, HGMD and gnomAD)

Development of a common API to integrate tools through links
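One way to integrate external tools through links is to describe each tool by a URL template and fill it in from the fields of a variant. The sketch below assumes that approach; the template format, the tool names and the `example.org` URLs are invented for illustration and do not describe the actual RD-Connect API or any real tool's URL scheme.

```python
from urllib.parse import quote

# Hypothetical registry: one URL template per external tool.
LINK_TEMPLATES = {
    "tool_a": "https://tool-a.example.org/variant/{chrom}-{pos}-{ref}-{alt}",
    "tool_b": "https://tool-b.example.org/search?gene={gene}",
}


def build_links(variant, templates=LINK_TEMPLATES):
    """Return {tool name: URL} for every template the variant can fill."""
    # Percent-encode each value so it is safe inside a URL.
    safe = {key: quote(str(value), safe="") for key, value in variant.items()}
    links = {}
    for name, template in templates.items():
        try:
            links[name] = template.format(**safe)
        except KeyError:
            # The variant lacks a field this template needs; skip the tool.
            continue
    return links


variant = {"chrom": "1", "pos": 12345, "ref": "A", "alt": "G", "gene": "GENE1"}
print(build_links(variant))
```

With this design, adding a new tool means registering one more template rather than changing the client code.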

Search across samples (per gene/s) with all filters

Exomiser in production

Run Exomiser on filtered results (coming soon)

HPO terms and inheritance model extracted from PhenoTips through API
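The extraction step can be sketched as below: read the observed HPO terms and the inheritance model from a patient record and turn them into inputs for a prioritisation run. The record layout (`features`, `observed`, `global_mode_of_inheritance`) is an assumption about what a PhenoTips-style export looks like, and the inheritance mode labels are illustrative names, not necessarily the values Exomiser accepts.

```python
# Illustrative mapping from HPO inheritance terms to mode labels (assumed).
INHERITANCE_LABELS = {
    "HP:0000006": "AUTOSOMAL_DOMINANT",
    "HP:0000007": "AUTOSOMAL_RECESSIVE",
}


def exomiser_inputs(patient):
    """Collect observed HPO term IDs and inheritance modes from a record."""
    hpo_ids = [
        feature["id"]
        for feature in patient.get("features", [])
        if feature.get("observed") == "yes"
        and feature.get("id", "").startswith("HP:")
    ]
    modes = [
        INHERITANCE_LABELS[entry["id"]]
        for entry in patient.get("global_mode_of_inheritance", [])
        if entry.get("id") in INHERITANCE_LABELS
    ]
    return {"hpo_ids": hpo_ids, "inheritance_modes": modes}


# Hypothetical patient record in the assumed layout.
patient = {
    "features": [
        {"id": "HP:0001250", "observed": "yes"},
        {"id": "HP:0000252", "observed": "no"},
        {"id": "HP:0001263", "observed": "yes"},
    ],
    "global_mode_of_inheritance": [{"id": "HP:0000007"}],
}

inputs = exomiser_inputs(patient)
print(inputs)
```

Terms recorded as not observed are left out, since only phenotypes present in the patient should drive the gene ranking.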

BBMRI-LPC Whole Exome Sequencing Call for RD (2016)

Goal: to promote the utilization of cutting-edge next-generation sequencing technology for the identification of novel causative variants and genes and to molecularly diagnose rare disease patients. BBMRI-LPC also wants to promote biobanking for rare diseases, the use of rare disease biobanks and responsible data sharing.

To sequence and analyse: 900 exomes in 17 coordinated projects.

Sequencing and analysis carried out at the CNAG-CRG and the Wellcome Trust Sanger Institute (WTSI).

Researchers are able to analyse the data in RD-Connect’s platform.

3/17 projects released through RD-Connect (follow-up session by Manuel and Marina on submission and from Hanns on results).

2016 Main Achievements

Deployment of platform in RD-Connect’s cluster

Improved the CAS; connection with ID-Cards, Sample Catalogue and PhenoTips underway

Genomics platform with 2123 experiments

Filtering by genes linked to OMIM and HPO; PhenoTips API improved

Common API to add informative links to external tools

Integration of ClinVar and links to external tools (HSF, gnomAD, HGMD etc.)

Exomiser in production

Additional features for genomics platform

Processing of the BBMRI-LPC projects

Contributors

WP1: Coordination

WP2: Patient registries

WP3: Biobanks

WP4: Bioinformatics

WP5: Unified platform

Hanns Lochmüller (Newcastle and TREAT-NMD)

Domenica Taruscio (ISS and EPIRARE)

Lucia Monaco (Fondaz. Telethon & EuroBioBank)

WP6: Ethical/legal/social

Ivo Gut (CNAG Barcelona)

Christophe Béroud (INSERM Marseille)

WP7: Impact/Innovation

Mats Hansson (Uppsala)

Kate Bushby (Newcastle and EUCERD/EJARD)

CNAG: I. Gut, S. Beltran, D. Piscia, S. Laurie, J. Protasio, A. Papakonstantinou, I. Martinez, R. Tonda, J.R. Trotta

CNIO: A. Valencia, S. Capella, V. de la Torre, J.M. Fernández, A. Cañada

AMU (Marseille): C. Béroud, D. Salgado, J.P. Desvignes

Interactive Biosoftware: A. Blavier, S. Lair

LUMC: P.B. ’t Hoen, M. Roos, M. Thompson, R. Raliyaperumal, B. Mons

U. of Toronto: M. Brudno, M. Girdea, S. Dumitriu, O. Buske

EGA: T. Keane, D. Spalding, J. Paschall, J. Almeida-King, J. Rambla

Newcastle U.: H. Lochmüller, R. Thompson, A. Topf, I. Zaharieva

U. Aveiro: J.L. Oliveira, P. Lopes, P. Sernaleda

U. of Patras: G. Patrinos

Murdoch U.: M. Bellgard
