RD-CONNECT WP5 UPDATE: RD-Connect Annual Meeting, Berlin, May 1st 2017

RD-CONNECT WP5 UPDATE · 2017-10-03 · 2 Platform moved to RD-Connect cluster RD-Connect clúster •19 servers • Each server has : • 256 GB RAM • 20 TB sata disk + 900 GB



Platform moved to RD-Connect cluster

RD-Connect cluster

• 19 servers
• Each server has:
  • 256 GB RAM
  • 20 TB SATA disk + 900 GB SSD disk
  • 32 CPU cores
• Software suite
  • Apache Mesos + DC/OS for cluster management
  • Marathon for Docker orchestration
  • Foreman + Puppet
  • Jenkins
• Aim:
  • Moving towards 100% CI/CD (continuous integration/deployment)
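
The per-server figures above imply an aggregate capacity for the cluster. The following is a minimal sketch of that arithmetic, assuming all 19 servers are identical, as the slide states. The names `ServerSpec` and `cluster_totals` are illustrative only and are not part of the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Per-server hardware, taken from the slide."""
    ram_gb: int
    sata_tb: int
    ssd_gb: int
    cpu_cores: int


def cluster_totals(spec: ServerSpec, n_servers: int) -> dict:
    """Multiply the per-server figures by the number of servers.

    Assumes a homogeneous cluster; no overhead or redundancy is subtracted.
    """
    return {
        "ram_gb": spec.ram_gb * n_servers,
        "sata_tb": spec.sata_tb * n_servers,
        "ssd_gb": spec.ssd_gb * n_servers,
        "cpu_cores": spec.cpu_cores * n_servers,
    }


if __name__ == "__main__":
    spec = ServerSpec(ram_gb=256, sata_tb=20, ssd_gb=900, cpu_cores=32)
    for key, value in cluster_totals(spec, n_servers=19).items():
        print(f"{key}: {value}")
```

These are raw totals; the usable capacity depends on HDFS replication and on what the cluster manager reserves, neither of which is given in the slides.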


Platform moved to RD-Connect cluster

Monitoring

- Software stack
  Beats + Elasticsearch + Kibana
- Status of monitoring
  Proxies (production and integration) monitored
  Genomics application monitoring under development
- Future developments
  Complete monitoring of all the applications
  Integration of application monitoring with resource/metric monitoring for performance optimization and resource allocation minimization
  Analysis of logs for anomaly detection (Kafka + Spark)


Whole RD-Connect Platform Architecture Overview 2017

Components shown in the diagram:

• Application server (Liferay - Java): PostgreSQL; REST API; Security
• Application server (Play2 - Scala): Psql, El; REST API; Security
• Application server (XWiki): MySQL, Solr; REST API (*); Security
• Application server (Spring - Java): MySQL, El; Security
• VCFs; REST API
• DiseaseCard (REST API)
• Alfa (REST API)
• LUMC Tools (REST API) ***
• Filtration tool (web)
• Client (Angular)
• UMD (web service **)
• Web browser
• PhenoTips
• CAS
• Biobanks and Reg. (ID Card); Samples (Molgenis)
• Biobanks and Registries; Samples
• ID relationships (RDF, Postgres, d2r)
• Application server (Play2 - Scala): LDAP; REST API; Security
• Genomics; IDs; Integrated security

Legend: El: Elasticsearch; Psql: Postgres; Solr: Apache Solr


RD-Connect Genomics Platform


https://platform.rd-connect.eu


CAS login


Samples and users

2016 Annual Meeting 2017 Annual Meeting

Users

Users connected T1 (Jan-Mar)

24798

34 41

Genomic samples 567 2123


Data flow to RD-Connect

• EGA
• RD-Connect platform
• Sequencing lab
• Standard analysis pipeline
• Raw data (FASTQ/BAM)
• Researcher/Clinician
• Analysis tools
• PhenoTips (HPO terms)

N = 2123 and counting …


Benchmarking of VC Pipelines

Laurie et al., Human Mutation, 2016

NA12878 50x WGS FASTQs (Illumina Platinum), analysed with several pipelines. Concordance with the Gold Standard VC set from GIAB/NIST (Zook et al., 2014) for the reliably callable region of the genome (70%).
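
The slide does not define how concordance was computed. As a rough illustration only, the sketch below compares a pipeline's calls with a gold-standard set inside a list of callable regions, assuming exact matching on chromosome, position, reference and alternate allele. The names `Variant`, `in_regions` and `concordance` are hypothetical, and variant normalisation, which real benchmarking usually needs, is omitted.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


# (chromosome, start inclusive, end exclusive); an assumed region format
Region = Tuple[str, int, int]


def in_regions(variant: Variant, regions: List[Region]) -> bool:
    """True if the variant position falls inside any listed region."""
    return any(
        chrom == variant.chrom and start <= variant.pos < end
        for chrom, start, end in regions
    )


def concordance(
    calls: Iterable[Variant], truth: Iterable[Variant], regions: List[Region]
) -> Dict[str, float]:
    """Compare calls with the truth set, restricted to the given regions."""
    calls_in = {v for v in calls if in_regions(v, regions)}
    truth_in = {v for v in truth if in_regions(v, regions)}
    shared = calls_in & truth_in
    return {
        "shared": float(len(shared)),
        # Fraction of truth variants that the pipeline recovered
        "recall": len(shared) / len(truth_in) if truth_in else 0.0,
        # Fraction of pipeline calls that are in the truth set
        "precision": len(shared) / len(calls_in) if calls_in else 0.0,
    }


if __name__ == "__main__":
    # Toy data, not taken from the benchmark
    regions = [("chr1", 100, 200)]
    truth = [
        Variant("chr1", 110, "A", "G"),
        Variant("chr1", 150, "C", "T"),
        Variant("chr1", 180, "G", "A"),
        Variant("chr1", 500, "T", "C"),  # outside the callable region
    ]
    calls = [
        Variant("chr1", 110, "A", "G"),
        Variant("chr1", 150, "C", "T"),
        Variant("chr1", 160, "T", "G"),  # not in the truth set
        Variant("chr1", 500, "T", "C"),  # ignored: outside the region
    ]
    print(concordance(calls, truth, regions))
```

Restricting both sets to the callable regions before comparing is what keeps variants outside those regions from counting for or against a pipeline.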


Benchmarking of VC Pipelines

Laurie et al., Human Mutation, 2016

Reliably callable: 99% 65% 62%

Not reliably callable: 76% 31% 31%


Genomics platform architecture

Components shown in the diagram:

• gVCFs
• Variant calling & annotation pipeline
• Hadoop file system (HDFS)
• Table format (Parquet)
• External Hive table
• Indexed data (Elasticsearch)
• Real-time queries
• REST web server (Scala)
• Metadata, user info & permissions (Postgres)
• Web services
• Authorised access
• Browser client (Angular)

D. Piscia, J. Protasio, S. Laurie, S. Beltran, J.M. Fernández, A. Cañada, V. de la Torre et al.
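
To illustrate the kind of real-time query that the indexed data could serve, the sketch below builds an Elasticsearch query body using the standard `bool`, `term`, `terms` and `range` clauses. The index schema is not shown in the slides, so the field names `gene`, `sample_id` and `population_freq` are hypothetical, as is the function `build_variant_query`. The permission check is represented only by passing in the sample IDs a user is allowed to see.

```python
import json
from typing import Iterable


def build_variant_query(
    gene: str,
    permitted_sample_ids: Iterable[str],
    max_population_freq: float,
    size: int = 100,
) -> dict:
    """Build a filter-only Elasticsearch query body.

    All field names are assumptions made for this example.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter clauses match exactly and do not affect scoring
                "filter": [
                    {"term": {"gene": gene}},
                    {"terms": {"sample_id": list(permitted_sample_ids)}},
                    {"range": {"population_freq": {"lte": max_population_freq}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    body = build_variant_query("TTN", ["S1", "S2"], max_population_freq=0.01)
    # The JSON body would be sent to the index's _search endpoint
    print(json.dumps(body, indent=2))
```

Using `filter` rather than `must` suits this case because variant filtering is a yes/no match and relevance scores are not needed.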


Improved GUI



RD-Connect Genomics Platform

D. Piscia, J. Protasio, S. Laurie, A. Papakonstantinou, S. Beltran


Preset filters and share queries


Added ClinVar (and looking at HGMD)

ClinVar can be used for filtering, and ClinVar categories are shown

Started conversations to explore integration of HGMD


Get lists of genes associated with OMIM and HPO

Search for OMIM and HPO terms through OMIM and PhenoTips APIs


Predefined lists of genes

OMIM and HPO related genes accessed through OMIM and PhenoTips APIs

Added more lists of genes


New links (incl. HSF, HGMD and gnomAD)


Development of common API to integrate tools through Links


Search across samples (per gene/s) with all filters



Exomiser in production

Run Exomiser on filtered results (coming soon)

HPO terms and inheritance model extracted from PhenoTips through API


BBMRI-LPC Whole Exome Sequencing Call for RD (2016)

Goal: to promote the utilization of cutting-edge next-generation sequencing technology for the identification of novel causative variants and genes and to molecularly diagnose rare disease patients. BBMRI-LPC also wants to promote biobanking for rare diseases, the use of rare disease biobanks and responsible data sharing.

To sequence and analyse: 900 exomes in 17 coordinated projects.

Sequencing and analysis carried out at the CNAG-CRG and the Wellcome Trust Sanger Institute (WTSI).

Researchers are able to analyse the data in RD-Connect’s platform.

3/17 projects released through RD-Connect (follow-up session by Manuel and Marina on submission and from Hanns on results)


2016 Main Achievements

Deployment of platform in RD-Connect’s cluster

Improved the CAS; connection with ID-Cards, Sample Catalogue and PhenoTips underway

Genomics platform with 2123 experiments

Filtering by genes linked to OMIM and HPO; PhenoTips API improved

Common API – to add informative links to external

Integration of ClinVar and links to external tools (HSF, gnomAD, HGMD etc.)

Exomiser in production

Additional features for genomics platform

Processing of the BBMRI-LPC projects


Contributors

Work packages and leaders

• WP1 (Coordination): Hanns Lochmüller (Newcastle and TREAT-NMD)
• WP2 (Patient registries): Domenica Taruscio (ISS and EPIRARE)
• WP3 (Biobanks): Lucia Monaco (Fondaz. Telethon & EuroBioBank)
• WP4 (Bioinformatics): Christophe Béroud (INSERM Marseille)
• WP5 (Unified platform): Ivo Gut (CNAG Barcelona)
• WP6 (Ethical/legal/social): Mats Hansson (Uppsala)
• WP7 (Impact/Innovation): Kate Bushby (Newcastle and EUCERD/EJARD)

Contributors by institution

• CNAG: I. Gut, S. Beltran, D. Piscia, S. Laurie, J. Protasio, A. Papakonstantinou, I. Martinez, R. Tonda, J.R. Trotta
• CNIO: A. Valencia, S. Capella, V. de la Torre, J.M. Fernández, A. Cañada
• AMU (Marseille): C. Béroud, D. Salgado, J.P. Desvignes
• Interactive BioSoftware: A. Blavier, S. Lair
• LUMC: P.B. ’t Hoen, M. Roos, M. Thompson, R. Raliyaperumal, B. Mons
• U. of Toronto: M. Brudno, M. Girdea, S. Dumitriu, O. Buske
• EGA: T. Keane, D. Spalding, J. Paschall, J. Almeida-King, J. Rambla
• Newcastle U.: H. Lochmüller, R. Thompson, A. Topf, I. Zaharieva
• U. Aveiro: J.L. Oliveira, P. Lopes, P. Sernaleda
• U. of Patras: G. Patrinos
• Murdoch U.: M. Bellgard