Utilizing Social Health Websites for Cognitive Computing: Exploring the Potential of User-Generated Health Content for Clinical Decision Support Systems

Harriëtte Smook [email protected]

28 October 2014



DESCRIPTION

Crowdsourced annotation data offers cognitive computing systems insight into lay semantics. This is especially important in health care, where medical terminology is often not aligned with patients' lay language. However, the general crowd often has limited medical knowledge. This research therefore investigated the opportunities of social health websites for obtaining ground truth annotation data for cognitive computing systems, including clinical decision support systems. By identifying these websites and analyzing their data, it offers a starting point for the future utilization of user-generated health content in cognitive systems. However, the opportunities of social health data are currently limited by various legal regulations. This paper therefore also discusses the legal aspects of using social health data in cognitive computing systems.


Page 1: Utilizing Social Health Websites for Cognitive Computing and Clinical Decision Support Systems

Utilizing Social Health Websites for Cognitive Computing: Exploring the Potential of User-Generated Health Content for Clinical Decision Support Systems

Harriëtte Smook [email protected]

28 October 2014

Page 2

Cognitive Computing Systems: ‘Prostheses’ for human cognition

Introduce a new generation of Clinical Decision Support Systems

• Learn by being used: Humans can often easily detect machine errors. System usage can be arranged in such a way that humans understand the system and the problems it solves.
• Expand human cognition: Ease processes, especially those with large data sets or data that requires human interpretation.
• Interact naturally: Machines and users should be brought closer together by enabling machines to understand human natural language.

Examples: Apple Siri, Google Glass, IBM Watson

Why Cognitive Systems? IBM Research. Retrieved from http://www.research.ibm.com/cognitive-computing/why-cognitive-systems.shtml, accessed 16 July 2014.

Lora Aroyo. CrowdTruth: The 7 Myths of Human Annotation. Cognitive Computing Forum 2014. Retrieved from http://www.slideshare.net/laroyo/truth-is-a-lie-7-myths-about-human-annotation-cogcomputing-forum-2014, accessed 28 October 2014.

Page 3

Clinical Decision Support Systems: IBM Watson

1. Understands human natural language & human communication
2. Generates & evaluates evidence-based hypotheses
3. Adapts & learns from user selections & responses

Transformational technologies combined

Lora Aroyo. CrowdTruth: The 7 Myths of Human Annotation. Cognitive Computing Forum 2014. Retrieved from http://www.slideshare.net/laroyo/truth-is-a-lie-7-myths-about-human-annotation-cogcomputing-forum-2014, accessed 28 October 2014.

Page 4

How can Health 2.0 help cognitive computing systems?

Collaboration of patients, medical experts and researchers

Collective aggregation of information, experiences and data

Tools for collecting, tracking and sharing health information:
• Monitoring new treatments
• Collecting real-world experiences
• Patients have more explicit control over their own data

Social Health Websites:
• PatientsLikeMe
• …
• HealthUnlocked

Health Tracking Tools

Page 5

How can Health 2.0 help cognitive computing systems?

Crowdsourcing Human Semantics

• The crowd provides human perspectives: patients, health-aware citizens
• Experts provide formal knowledge: doctors

Doctor: "My patient has acute coryza!"
Patient: "Well, I only have a cold."

New generation of Clinical Decision Support Systems
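The coryza/cold exchange is the kind of vocabulary mismatch that crowdsourced lay semantics is meant to bridge. A minimal Python sketch of such a mapping follows, under clearly stated assumptions: only the pair cold / acute coryza comes from the slide, while the `LAY_TO_CLINICAL` table, its other entries and the `to_clinical` helper are hypothetical illustrations and not part of any system described in this presentation.

```python
from typing import Optional

# Hypothetical lay-to-clinical lookup table.
# Only "cold" -> "acute coryza" is taken from the slide; the rest is illustrative.
LAY_TO_CLINICAL = {
    "cold": "acute coryza",
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}


def to_clinical(lay_term: str) -> Optional[str]:
    """Return the clinical term for a lay expression, or None if it is unknown."""
    return LAY_TO_CLINICAL.get(lay_term.strip().lower())


print(to_clinical("Cold"))      # known lay term
print(to_clinical("sniffles"))  # not in the table, so None
```

In practice such a table would have to be learned from annotated data rather than written by hand, which is where ground truth from patients and health-aware citizens could come in.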

Page 6

How to utilize user-generated health content as training data for cognitive computing systems?

1. Gather the data: PatientsLikeMe, publicly available pages
2. Data Analysis: representativeness, validity, consistency
3. Create Ground Truth Data: compare with existing Watson data

Page 7

Data Analysis: Important aspects for obtaining widespread health data

PatientsLikeMe (PLM)

• Coverage of different medical conditions: > 500 conditions
• Availability of different kinds of data: diverse health tracking tools
• Consistency in the used vocabulary: 43% of the symptoms covered by UMLS
• Cultural and geographical dispersion of users: > 260,000 users, website in English

Catherine Arnott Smith and Paul J. Wicks. PatientsLikeMe: Consumer health vocabulary as a folksonomy. In AMIA Annual Symposium Proceedings, vol. 2008, p. 682. American Medical Informatics Association, 2008.

Page 8

PLM Data Analysis

Demographic analysis:
• Data analysis in terms of demographics & population
• Countries of residence, gender & age

Analysis of top-reported conditions:
• Prevalence on PLM vs. prevalence in the U.S.
• Demographics per top-reported condition vs. official health statistics: gender, peak age & onset age

Analysis of top-reported treatments:
• Top-reported treatments vs. official drug prescription statistics
• PLM treatments per top-reported condition vs. officially listed treatments in the U.S.

Lexical analysis:
• PLM conditions and treatments compared with official medical terminology (UMLS)

Page 9

PLM Data Characteristics

• 697 conditions: current age, onset age
• 432 conditions: reported treatments, perceived effectiveness of treatments
• 1617 treatments: current patients, stopped patients, adherence, burden, costs, current duration, past duration, severity of side effects
• 1257 treatments: reported purpose, perceived effectiveness per purpose
• 1172 treatments: top reported dosages
• 1032 treatments: top reasons why people stopped
• 663 treatments: top reported side effects
• 663 conditions: current patients, gender, primary condition, condition status, top reported symptoms
• 373,600 patients: age, gender, gender per age category
• 233,153 unique members
• 99,274 U.S. members

Page 10

Demographic Analysis: Countries of residence, gender and age

Country of residence    Share of members
United States           37.2%
United Kingdom          4.2%
Canada                  2.7%
Australia               1.1%
India                   0.8%
South Africa            0.3%
Ireland                 0.3%
New Zealand             0.2%
Other                   51.7%

37% of PatientsLikeMe’s members live in the United States

Page 11

The dataset is biased towards women

[Figure: percentage of all members per age category (0 – 4 to 75 – 79), by gender]

Gender ratio: Male 1 : Female 2.35
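To make the reported gender ratio easier to read, it can be converted into shares. The sketch below is only a worked piece of arithmetic on the ratio stated on the slide (male 1 : female 2.35); the helper name `shares_from_ratio` is made up for illustration, and the calculation assumes the ratio covers only members who reported one of these two genders.

```python
from typing import Tuple


def shares_from_ratio(male: float, female: float) -> Tuple[float, float]:
    """Convert a male:female ratio into (male share, female share)."""
    total = male + female
    return male / total, female / total


# Ratio taken from the slide: male 1, female 2.35.
male_share, female_share = shares_from_ratio(1.0, 2.35)
print(f"male {male_share:.1%}, female {female_share:.1%}")
# prints: male 29.9%, female 70.1%
```

Under that assumption, roughly seven out of ten members in the data set are women.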

Page 12

[Figure: percentage per age category (0 – 4 to 75 – 79), PLM USA members compared with the USA population]

People aged 30 – 70 are overrepresented

Page 13

Top-reported conditions are more prevalent on PatientsLikeMe than in the United States

Rank  Condition                        PLM     U.S.
1     Fibromyalgia                     21.4%   2%
2     Multiple Sclerosis               19.3%   0.1%
3     Major Depressive Disorder        8.7%    6.7%
4     Generalized Anxiety Disorder     7%      3.1%
5     Chronic Fatigue Syndrome         6.6%    0.3%
6     Parkinson’s Disease              6.6%    0.3%
7     Epilepsy                         4.5%    0.2%
8     Rheumatoid Arthritis             2.4%    0.6%
9     Amyotrophic Lateral Sclerosis    3.3%    0.01%
10    Post-Traumatic Stress Disorder   3.4%    3.6%

The most prevalent conditions in the U.S. are mainly related to heart disease and overweight
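The gap between the two columns can be made explicit as a ratio of PLM share to U.S. prevalence. The Python sketch below uses only the percentages listed on this slide; the `PREVALENCE` table layout and the `overrepresentation` helper are illustrative names, not part of the original analysis, and the ratio is just one simple way to express the comparison.

```python
# (PLM share in %, U.S. prevalence in %) per condition, as listed on the slide.
PREVALENCE = {
    "Fibromyalgia": (21.4, 2.0),
    "Multiple Sclerosis": (19.3, 0.1),
    "Major Depressive Disorder": (8.7, 6.7),
    "Generalized Anxiety Disorder": (7.0, 3.1),
    "Chronic Fatigue Syndrome": (6.6, 0.3),
    "Parkinson's Disease": (6.6, 0.3),
    "Epilepsy": (4.5, 0.2),
    "Rheumatoid Arthritis": (2.4, 0.6),
    "Amyotrophic Lateral Sclerosis": (3.3, 0.01),
    "Post-Traumatic Stress Disorder": (3.4, 3.6),
}


def overrepresentation(plm_pct: float, us_pct: float) -> float:
    """Ratio of the PLM share to the U.S. prevalence (above 1 means overrepresented on PLM)."""
    return plm_pct / us_pct


# List the conditions from most to least overrepresented on PLM.
ranked = sorted(PREVALENCE.items(), key=lambda kv: overrepresentation(*kv[1]), reverse=True)
for condition, (plm_pct, us_pct) in ranked:
    print(f"{condition}: {overrepresentation(plm_pct, us_pct):.1f}x")
```

A ratio above 1 means the condition is reported more often on PLM than it occurs in the U.S. population; a ratio below 1 means the opposite.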

Page 14

Demographics per condition

Gender:
• Women are overrepresented in all top conditions on PatientsLikeMe

Peak age:
• PLM patients suffering from mental health conditions are remarkably older than the peak age
• PLM patients suffering from conditions common among the elderly are remarkably younger

Onset age:
• PLM patients suffering from mental health conditions often experience these already in childhood

Page 15

Top-reported treatments are less popular prescription drugs in the U.S.

Top-reported PLM treatments versus official U.S. rankings

PLM rank  PLM treatment    U.S. rank
1         Gabapentin       20
2         Duloxetine       n.a.
3         Pregabalin       n.a.
4         Baclofen         n.a.
5         Clonazepam       n.a.
6         Copaxone         n.a.
7         Levothyroxine    2
8         Tramadol         21
9         Lamotrigine      n.a.
10        Bupropion        n.a.

Official U.S. rankings versus top-reported PLM treatments

U.S. rank  U.S. treatment             PLM rank
1          Hydrocodone/Paracetamol    13
2          Levothyroxine Sodium       7
3          Lisinopril                 37
4          Simvastatin                42
5          Metoprolol                 53
6          Amlodipine                 57
7          Omeprazole                 9
8          Metformin                  22
9          Salbutamol                 28
10         Atorvastatin               n.a.

Frequently prescribed drugs in the U.S. are less popular on PLM

Page 16

Lexical analysis: The majority of the treatments and conditions are covered by UMLS

Lexical tools:
• BeCas [1]
• UMLS Metathesaurus Browser [2]
• NCBO BioPortal Annotator [3]
• RxTerms [4]

All treatments and conditions from the data set are compared with UMLS:
• Only 2 out of 1025 unique treatments & 9 out of 663 unique conditions are not covered, for one of these reasons:
  • The term is too general (e.g. accidental fall)
  • The term is proposed and not yet included in UMLS, or is under discussion
  • The term has been removed from UMLS
  • The term is not evidence-based and is used by alternative healers

[1] http://bioinformatics.ua.pt/becas/#!/about
[2] http://uts.nlm.nih.gov/home.html
[3] http://bioportal.bioontology.org/annotator
[4] http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm
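The coverage figures implied by these counts can be worked out directly. The sketch below does so in Python using only the numbers on the slide (2 of 1025 treatments and 9 of 663 conditions not covered). The `uncovered_terms` helper and its tiny example vocabulary are hypothetical and only illustrate a set-based check; they do not call UMLS or any of the lexical tools listed.

```python
from typing import Set


def coverage(total: int, not_covered: int) -> float:
    """Fraction of terms that are covered by the reference vocabulary."""
    return (total - not_covered) / total


def uncovered_terms(terms: Set[str], vocabulary: Set[str]) -> Set[str]:
    """Terms that have no case-insensitive match in the reference vocabulary."""
    known = {v.lower() for v in vocabulary}
    return {t for t in terms if t.lower() not in known}


# Counts taken from the slide.
print(f"treatments covered: {coverage(1025, 2):.1%}")  # prints: treatments covered: 99.8%
print(f"conditions covered: {coverage(663, 9):.1%}")   # prints: conditions covered: 98.6%

# Hypothetical mini-vocabulary, for illustration only.
example_vocabulary = {"Gabapentin", "Tramadol"}
print(uncovered_terms({"gabapentin", "some new remedy"}, example_vocabulary))
```

An exact string match like this is the simplest possible check; real lexical matching would also need to handle synonyms and spelling variants.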

Page 17

Issues in utilizing user-generated health content as training data for cognitive computing systems

• Bias & Limitations: Each data source comes with bias and limitations that need to be considered
• Accessibility: Data is not easily accessible
• Privacy issues: How to avoid?

Page 18

Opportunities in utilizing user-generated health content as training data for cognitive computing systems

• Access to high coverage of (rare) medical conditions
• Access to patients and health-aware citizens as an intermediate between the general crowd and experts
• Knowledge from the patients’ perspective

Page 19

In the future...

• Perform analysis on data from alternative geographical contexts
• Perform analysis on data with different characteristics
• Generate better ground truth data