Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017

Machine Learning to cure the World

Xavier Amatriain, Curai

MLConf SF ‘17

Medicine is hard(er)

● Doctors have ~15 minutes to capture information* about a patient, diagnose, and recommend treatment

● *Information
○ Patient’s history
○ Patient’s symptoms
○ Medical knowledge
■ Learned years ago
■ Latest research findings
■ Different demographics

● Data is growing over time, and so is complexity

● Very hard for doctors to “manually” personalize their “recommendations”

Medical Diagnosis

● Diagnosis (R.A. Miller 1990):

○ Mapping from patient’s data (history, examination, lab exams…) to a possible condition.

○ It depends on the ability to:
■ Evoke history
■ Surface symptoms and findings
■ Generate hypotheses that suggest how to refine or pursue different hypotheses

○ In a compassionate, cost-effective manner

Cost of medical errors

● 400k deaths a year can be attributed to medical errors as well as 4M serious health events

○ This compares to 500k deaths from cancer or 40k from vehicle accidents

● Almost half of those events could be preventable

● 30% of spending, or $750B, is wasted by the US healthcare system every year

How to improve medical care?

● Automate processes through AI/ML

● Use of (big) data

● More/better personalization

● Improved user experience both for patients and doctors

Does this sound familiar?

Diagram labels: ML/AI Medical System; Medical Decision Support + Knowledge Bases; Personalization; NLP; Multimodal input

Medical Decision Support + Knowledge Bases

Medical Knowledge Bases encode years of Doctor Expertise

Doctor Expertise, Medical Research

An example: Internist-1/QMR/Vddx

● Internist (1971) was led by Jack Myers, considered (one of) the best clinical diagnostic experts in the US

○ University of Pittsburgh, Chairman of the National Board of Medical Examiners, President of the American College of Physicians, and Chairman of the American Board of Internal Medicine

● Process for adding a disease requires 2-4 weeks of full-time effort and doctors reading 50 to 250 relevant publications

ML/AI Approaches to Diagnosis

● Early DDSS based on Bayesian reasoning (60s-70s)
● Bayesian networks (80s-90s)
● Neural networks (lately)
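
To make the Bayesian-reasoning idea concrete, here is a minimal sketch of single-disease Bayesian diagnosis. It assumes exactly one disease is present and that findings are conditionally independent given the disease (a naive Bayes simplification). The diseases, priors, and likelihoods are invented for illustration and are not taken from the talk or from any real knowledge base.

```python
from math import prod

# Hypothetical priors P(disease) and likelihoods P(finding present | disease).
# The numbers are made up for illustration; they are not clinical estimates.
PRIORS = {"flu": 0.10, "cold": 0.30, "allergy": 0.60}
LIKELIHOODS = {
    "flu": {"fever": 0.90, "cough": 0.80, "sneezing": 0.30},
    "cold": {"fever": 0.20, "cough": 0.70, "sneezing": 0.80},
    "allergy": {"fever": 0.01, "cough": 0.30, "sneezing": 0.90},
}


def posterior(findings):
    """Return P(disease | findings), assuming exactly one disease is present
    and findings are conditionally independent given the disease."""
    scores = {}
    for disease, prior in PRIORS.items():
        table = LIKELIHOODS[disease]
        # Multiply P(f|d) for present findings and 1 - P(f|d) for absent ones.
        evidence = prod(
            table[f] if present else 1.0 - table[f]
            for f, present in findings.items()
        )
        scores[disease] = prior * evidence
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()}


def rank(findings):
    """Differential diagnosis: diseases sorted by posterior, highest first."""
    post = posterior(findings)
    return sorted(post.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank({"fever": True, "cough": True, "sneezing": False})` puts the hypothetical "flu" entry first. Bayesian networks relax the independence assumption by modeling dependencies among findings and diseases explicitly.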

Health knowledge graphs

Ontologies

● SNOMED Clinical Terms
○ Computer-processable collection of medical terms used in clinical documentation and reporting.
○ Clinical findings, symptoms, diagnoses, procedures, body structures, organisms, substances, pharmaceuticals, devices...

● ICD-10
○ 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD)
○ Codes for diseases, signs and symptoms, abnormal findings, complaints, social circumstances, and external causes

● UMLS
○ Compendium of many controlled vocabularies
○ Mapping structure among vocabularies
○ Allows translation among the various terminology systems
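
The translation idea can be sketched as a lookup through a shared concept identifier: each vocabulary's code points to a concept, and the concept points back to codes in every other vocabulary. This is only an illustration of the mapping structure. The table layout, the function names, and every identifier below are hypothetical placeholders, not real UMLS, SNOMED CT, or ICD-10 codes, and not the actual UMLS file format.

```python
from collections import defaultdict

# Hypothetical rows of (concept_id, vocabulary, code). All identifiers are
# invented placeholders, not real UMLS, SNOMED CT, or ICD-10 codes.
ROWS = [
    ("C-0001", "SNOMEDCT", "S-111"),
    ("C-0001", "ICD10", "X10"),
    ("C-0002", "SNOMEDCT", "S-222"),
    ("C-0002", "SNOMEDCT", "S-223"),
    ("C-0002", "ICD10", "X20"),
]


def build_index(rows):
    """Build lookups in both directions: code -> concept and concept -> codes."""
    to_concept = {}
    to_codes = defaultdict(lambda: defaultdict(list))
    for concept, vocab, code in rows:
        to_concept[(vocab, code)] = concept
        to_codes[concept][vocab].append(code)
    return to_concept, to_codes


def translate(code, source, target, index):
    """Map a code from one vocabulary to another via the shared concept."""
    to_concept, to_codes = index
    concept = to_concept.get((source, code))
    if concept is None:
        return []  # Unknown code: nothing to translate.
    return list(to_codes[concept].get(target, []))
```

With `index = build_index(ROWS)`, `translate("X20", "ICD10", "SNOMEDCT", index)` returns both placeholder codes attached to the same concept, which shows why such mappings are often one-to-many.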

NLP

● Understanding what doctors and patients say
● Extracting knowledge from medical texts
● ...

Electronic Health Records

● EHR/EMRs include digital information about patients’ encounters with doctors or the health system

NLP

Methods and algorithms to extract meaning and knowledge from unstructured text

Patient understanding

The Language of Medicine

Doctor’s Notes

Medical research publications
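
As a toy illustration of extracting meaning from unstructured text such as a doctor's note, the sketch below matches phrases from a small lexicon and flags mentions that follow a negation cue in the same clause. The lexicon, the negation cues, and the function name are assumptions made for this example. Real clinical NLP systems are far more involved, and this is not the approach described in the talk.

```python
import re

# Hypothetical lexicon mapping surface phrases to normalized concepts.
LEXICON = {
    "shortness of breath": "dyspnea",
    "sob": "dyspnea",
    "chest pain": "chest_pain",
    "fever": "fever",
}
# A negation cue counts only if no punctuation separates it from the mention.
NEGATION = re.compile(r"\b(no|denies|without)\b[^.;,]*$")


def extract_concepts(note):
    """Return sorted (concept, negated) pairs found in a free-text note."""
    text = note.lower()
    found = set()
    # Try longer phrases first so multi-word terms are matched as a unit.
    for phrase in sorted(LEXICON, key=len, reverse=True):
        for match in re.finditer(r"\b" + re.escape(phrase) + r"\b", text):
            # Look back to the start of the sentence for a negation cue.
            sentence_start = max(text.rfind(".", 0, match.start()), 0)
            window = text[sentence_start:match.start()]
            negated = NEGATION.search(window) is not None
            found.add((LEXICON[phrase], negated))
    return sorted(found)
```

For the note "Patient reports chest pain and SOB. Denies fever." this yields chest pain and dyspnea as present and fever as negated. Both "sob" and "shortness of breath" normalize to the same concept, which is the kind of mapping the ontologies above support.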

Multimodal input

We will include many different signals besides direct patient input

Speech interfaces

Image recognition

Sensors/lab data

Inputs to DDSS

● Improve accuracy of signals input to diagnostic systems by using AI/ML techniques

Precision medicine

● Precision medicine (NIH):

"an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person."

● Term is relatively new, but concept has been around for many years.

○ E.g. blood transfusion is not given from a randomly selected donor

Personalization

The best and most relevant information “for you”

Patient profile & medical history

Biological markers & other lab data

Lessons learned from Recsys

What is different from other domains?

● Cost of errors
● We care about causality
● Implicit user signals are not enough
● Need for conversational approaches
○ Importance of eliciting information
○ Importance of communicating outcomes

● Complex interactions between diseases and symptoms, including temporal sequences

What are we doing?

● Building an awesome team (Netflix, Quora, Facebook, Google, Microsoft, Uber, Stanford…)

● Combining AI/ML and best product/UX practices to build a service that revolutionizes healthcare by empowering patients to make their own decisions

● Leveraging pre-existing resources and state-of-the-art approaches

● We are in stealth mode; it is too soon to say much about what we have

Challenges

● Algorithmic: e.g. combining expert rule-based and ML approaches
● Data: quality, sparsity, and bias in data
● UX: trustworthiness and engagement of the system, incentives…
● Legal
● …

It’s about time we overcome all of these.

References

● “Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base”. Shwe et al. 1991.
● “Computer-assisted diagnostic decision support: history, challenges, and possible paths forward”. Miller. 2009.
● “Mining Biomedical Ontologies and Data Using RDF Hypergraphs”. Liu et al. 2013.
● “Health Recommender Systems: Concepts, Requirements, Technical Basics & Challenges”. Wiesner & Pfeifer. 2014.
● “A ‘Green Button’ For Using Aggregate Patient Data At The Point Of Care”. Longhurst et al. 2014.
● “Building the graph of medicine from millions of clinical narratives”. Finlayson et al. 2014.
● “Comparison of Physician and Computer Diagnostic Accuracy”. Semigran et al. 2016.
● “Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization”. Joshi et al. 2016.
● “Clinical Tagging with Joint Probabilistic Models”. Halpern et al. 2016.
● “Deep Patient: An Unsupervised Representation to Predict the Future of Patients from EHR”. Miotto et al. 2016.
● “Learning a Health Knowledge Graph from Electronic Medical Records”. Rotmensch et al. 2017.
● “Clustering Patients with Tensor Decomposition”. Ruffini et al. 2017.
● “Patient Similarity Using Population Statistics and Multiple Kernel Learning”. Conroy et al. 2017.
● “Diagnostic Inferencing via Clinical Concept Extraction with Deep Reinforcement Learning”. Ling et al. 2017.
● “Generating Multi-label Discrete Patient Records using Generative Adversarial Networks”. Choi et al. 2017.
● “The Use of Autoencoders for Discovering Patient Phenotypes”. Suresh et al. 2017. arXiv.org.

Yes, we’re hiring!