Plan TL InfoDay, 21 October 19

Alfonso Valencia, Ph.D.
ICREA Professor
Director, Life Sciences Dept., BSC
Director, Spanish Bioinformatics Institute, INB-ISCIII, ELIXIR-ES

Big data, artificial intelligence, and HPC in health and biomedicine

• DATA

Weather data

Biology data: genomics, transcriptomics, epigenomics, proteomics, metabolomics, lipidomics, …

Ten billion cell phones

Spanish National Bioinformatics Institute (INB)
Spanish Node of ELIXIR (ELIXIR-ES)

TransBioNet: network of bioinformaticians in research institutes of Spanish hospitals (INB hosted)

ELIXIR: European infrastructure for biological information

• DATA

• ARTIFICIAL INTELLIGENCE

10/12/2019

Advances in AI and HPC go hand in hand

Since GPUs were first used in AI (2012), the computing power available to generate AI models has increased exponentially, and improvements in computing power have been key to AI progress.

Petaflops/day used to train neural networks

BSC works on Medical Imaging

Glaucoma, epiretinal membrane, nevus, macular degeneration

● Detecting retina pathologies

○ Trained models competitive with ophthalmologists

○ With Lenovo & Hospital Vall Hebron

● Learning from liver conditions

○ Learning about rare diseases

○ With Hospital Clinic

● Predicting and guiding in-vitro success

○ Finding the best embryo ASAP

○ With Hospital Clinic

● Supporting medical doctors on Rx review

○ Aid for doctors in rural areas

○ With Asepeyo and ICS

By Ulises Cortes and Dario Garcia UPF & BSC

Predicting protein structure from the sequence is one of the fundamental problems in molecular biology.

It is the key to predicting the consequences of mutations in human diseases and to drug design.

• DATA

• ARTIFICIAL INTELLIGENCE

• HPC

The Evolution of the Research Paradigm

Numerical Simulation and Big Data Analysis

• Reduce expense

• Avoid suffering

• Help to build knowledge where experiments are impossible or not affordable

Digital Twin for Future Medicine

Is this scenario possible? When?

By Mariano Vazquez, CASE - BSC

Simulations of biological systems at different levels

Model’s readouts

Drug target

~48 h simulation time, 30 min wall time, ~2500 cells

Slide from M. Ponce de L, BSC

By Victor Guallar ICREA & BSC

Exploration of the parameter space

Monitoring and tailoring simulations during execution time

Analysing the results of the simulations

Large Scale Simulations

Event Recognition System

AI / ML systems

Advances in AI and HPC go hand in hand

Since GPUs were first used in AI (2012), the computing power available to generate AI models has increased exponentially, and improvements in computing power have been key to AI progress.

Petaflops/day used to train neural networks

From Jordi Torres

Text mining & Cognitive computing for melanoma research

Bio-Knowledge

corpus

Healthcare dictionaries

IBM Text Analytics Catalog

Content Analytics

graph

Bio-entities associations

Functional interpretation

Translational applications

16M abstracts

BSC Text mining

WATSON Cognitive computing

In collaboration with IBM Spain

BSC: specialized info
Watson: general info

Corpus/entities extraction
Network of entities
Inference of relationships

Example entities (figure labels): myasthenia, azathioprine, nephrectomy, sarcoidosis, breast cancer, tyrosine kinase inhibitor, potency, apc, tetanus, hypophysectomy, diplopia, dysphagia, lung cancer, AIDS, elastosis, cheilitis, lobectomy, chest pain, leukoderma, autoimmunity, pancreatic cancer, pancreatitis, lymphadenitis, toxoplasmosis, lymphadenopathy, colon cancer, vitiligo, hepatoma, uveitis, ascites, splenectomy, irritation, vulvectomy, hyperactivity, anemia, appendectomy, appendicitis, neurosurgery, sequela, dilation and curettage, endometriosis, fertility, carcinoid, thyroidectomy, hypercalcemia, goiter, photocoagulation, amputation, hemosiderosis, scleritis, iritis, retinal detachment, enucleation, atypicality, psoriasis, melanosis, exenteration, neurofibroma, teratoma, gastrectomy, amaurosis

Example genes (figure labels): LOX, MIF, IL10, SMAD3, CTGF, SKI, MMP1, CAV1, TGFB1, NOTCH1, IRF8, IGF1, MST1, ITK, CD81, XIAP, FAS, TP53, CCND, BRCA1, BRCA2, CDKN2, CDKN2B, MSH2, MLH1, PTEN, LTBP2, CD55, TF, IL6

Word embeddings

100M tokens

v1  0.78  0.65  0.98
v2  0.23  0.12  0.32
v3  0.90  0.32  0.56
v4  0.08  0.43  0.65
v5  0.77  0.88  0.77

|V|

Intrinsic evaluation: computing the similarity between terms (synonyms in SNOMED)

Extrinsic evaluation: checking their usefulness in other NLP tasks (neuroNER)
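The intrinsic evaluation mentioned above compares term vectors by similarity. The slides do not say which measure was used; the sketch below assumes cosine similarity, a common choice for word embeddings. The vectors are the toy three-dimensional values from the illustrative matrix on the slide, and the function names `cosine_similarity` and `most_similar` are hypothetical helpers, not part of Word2Vec, fastText, or any BSC tool. In a real evaluation the keys would be clinical terms and the ranking would be checked against known SNOMED synonym pairs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors taken from the slide's illustrative matrix (assumption: rows are terms).
vectors = {
    "v1": [0.78, 0.65, 0.98],
    "v2": [0.23, 0.12, 0.32],
    "v3": [0.90, 0.32, 0.56],
    "v4": [0.08, 0.43, 0.65],
    "v5": [0.77, 0.88, 0.77],
}


def most_similar(term, vectors):
    """Rank every other term by cosine similarity to `term`, highest first."""
    query = vectors[term]
    scores = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != term
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar("v1", vectors):
        print(f"{other}\t{score:.3f}")
```

If two terms are listed as synonyms in the reference vocabulary, a well-trained embedding would be expected to rank one near the top of the other's similarity list.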

Word2Vec

fastText

85M tokens

Word vectors: v1, v2, v3, v4

Word embeddings

MAPK3

H3K9 methylation

cerebellar granule cells

individualized Paediatric Cure (iPC)

Cloud-based virtual-patient models for precision paediatric oncology

Gender and other biases …

Explainable Artificial Intelligence
