Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data Project 3: High-Throughput Phenotyping Jyotishman Pathak, PhD Assistant Professor of Biomedical Informatics June 11, 2012


Page 1: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data

Project 3: High-Throughput Phenotyping

Jyotishman Pathak, PhD
Assistant Professor of Biomedical Informatics

June 11, 2012

Page 2: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Project 3: Collaborators & Acknowledgments

• CDISC (Clinical Data Interchange Standards Consortium)
  • Rebecca Kush, Landen Bain
• Centerphase Solutions
  • Gary Lubin, Jeff Tarlowe
• Group Health Seattle
  • David Carrell
• Harvard University/MIT
  • Guergana Savova, Peter Szolovits
• Intermountain Healthcare/University of Utah
  • Susan Welch, Herman Post, Darin Wilcox, Peter Haug
• Mayo Clinic
  • Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor

©2012 MFMER | slide-2

Page 3: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Page 4: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Phenotyping is still a bottleneck…

[Image from Wikipedia]

©2012 MFMER | slide-4

Page 5: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

EHR systems: United States 2002—2011

©2012 MFMER | slide-5

[Millwood et al. 2012]

Page 6: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Electronic health records (EHRs) driven phenotyping

• EHRs are becoming more and more prevalent within the U.S. healthcare system
  • Meaningful Use is one of the major drivers
• Overarching goal
  • To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

©2012 MFMER | slide-6

Page 7: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping ©2012 MFMER | slide-7

http://gwas.org

Page 8: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

EHR-driven Phenotyping Algorithms - I

• Typical components
  • Billing and diagnosis codes
  • Procedure codes
  • Labs
  • Medications
  • Phenotype-specific covariates (e.g., demographics, vitals, smoking status, CASI scores)
  • Pathology
  • Imaging?
• Organized into inclusion and exclusion criteria
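The idea of organizing components into inclusion and exclusion criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Patient` fields, the helper names, and the codes `DX-A`, `MED-X`, and `DX-EXCL` are hypothetical placeholders, not part of any SHARPn or eMERGE algorithm.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class Patient:
    """Hypothetical, heavily simplified view of one patient's EHR data."""
    patient_id: str
    diagnosis_codes: Set[str] = field(default_factory=set)
    medications: Set[str] = field(default_factory=set)


# A criterion is any predicate over a patient record.
Criterion = Callable[[Patient], bool]


def has_any_diagnosis(codes: Set[str]) -> Criterion:
    """Criterion that holds when the patient has at least one of the codes."""
    return lambda p: bool(p.diagnosis_codes & codes)


def on_any_medication(meds: Set[str]) -> Criterion:
    """Criterion that holds when the patient is on at least one of the meds."""
    return lambda p: bool(p.medications & meds)


def select_cohort(patients: Iterable[Patient],
                  inclusion: List[Criterion],
                  exclusion: List[Criterion]) -> List[str]:
    """Keep patients meeting every inclusion and no exclusion criterion."""
    return [
        p.patient_id
        for p in patients
        if all(c(p) for c in inclusion) and not any(c(p) for c in exclusion)
    ]


# Placeholder data: the codes below are invented for illustration only.
patients = [
    Patient("p1", {"DX-A"}, {"MED-X"}),
    Patient("p2", {"DX-A", "DX-EXCL"}, {"MED-X"}),
    Patient("p3", {"DX-A"}, set()),
]
cohort = select_cohort(
    patients,
    inclusion=[has_any_diagnosis({"DX-A"}), on_any_medication({"MED-X"})],
    exclusion=[has_any_diagnosis({"DX-EXCL"})],
)
print(cohort)
```

Treating each criterion as a plain predicate keeps the phenotype logic separate from the data access code, which is the same separation the later slides pursue with QDM and Drools.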

©2012 MFMER | slide-8

Page 9: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

EHR-driven Phenotyping Algorithms - II

[Figure: algorithm development workflow; labels: Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP, SQL, Rules, Mappings]

[eMERGE Network]

©2012 MFMER | slide-9

Page 10: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Example: Hypothyroidism Algorithm

[Figure: flowchart defining the Case 1, Case 2, and Control groups; node labels follow]

No secondary causes (e.g., pregnancy, ablation)

No ICD-9s for hypothyroidism

No abnormal TSH/FT4

No antibodies for TTG/TPO

ICD-9s for hypothyroidism

Antibodies for TTG or TPO (anti-thyroglobulin, anti-thyroperoxidase)

Abnormal TSH/FT4

No thyroid-altering medications (e.g., phenytoin, lithium)

Thyroid replacement meds

Case 1, Case 2

No thyroid replacement meds

Control

2+ non-acute visits in 3 yrs

No hx of myasthenia gravis

©2012 MFMER | slide-10

[Denny et al., 2012]

Page 11: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Hypothyroidism Algorithm: Validation

Positive Predictive Values (PPV) Based on Chart Review – All Sites

Site | EHR-based Cases/Controls | Sampled for Chart Review (Cases/Controls) | Old Case PPV (%) | New Case PPV (%)
Group Health | 430/1,188 | 50/50 | 92 | 98
Marshfield | 509/1,193 | 50/50 | 88 | 91
Mayo Clinic | 250/2,145 | 100/100 | 76 | 97
Northwestern | 103/516 | 50/50 | 88 | 98
Vanderbilt | 184/1,344 | 50/50 | 90 | 98
All sites | 1,421/6,362 | - | 87 | 96

©2012 MFMER | slide-11

[Denny et al., 2012]
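For readers unfamiliar with the metric, a positive predictive value from chart review is the share of algorithm-identified cases that reviewers confirm. The sketch below shows the arithmetic only. The count of 46 confirmed charts is a hypothetical figure chosen because it yields 92% on a sample of 50; the slide reports percentages and sample sizes, not the confirmed counts.

```python
def positive_predictive_value(confirmed: int, reviewed: int) -> float:
    """Return PPV as a percentage: confirmed cases / charts reviewed."""
    if reviewed <= 0:
        raise ValueError("reviewed must be a positive number of charts")
    if not 0 <= confirmed <= reviewed:
        raise ValueError("confirmed must be between 0 and reviewed")
    return 100.0 * confirmed / reviewed


# Hypothetical counts: 46 of 50 sampled charts confirmed on review.
ppv = positive_predictive_value(confirmed=46, reviewed=50)
print(f"PPV = {ppv:.0f}%")
```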

Page 12: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Data Categories used to define the EHR-driven Phenotyping Algorithms

Phenotype | Clinical gold standard | EHR-derived phenotype | Phenotype Definitions | Validation (PPV/NPV)
Alzheimer’s Dementia | Demographics, clinical examination of mental status, histopathologic examination | Diagnoses, medications | Demographics, laboratory tests, radiology reports | 73%
Cataracts | Clinical exam finding (Ophthalmologic examination) | Diagnoses, procedure codes | Demographics, medications | 98%/98%
Peripheral Arterial Disease | Radiology test results (ankle-brachial index or arteriography) | Diagnoses, procedure codes, medications, radiology test results | Demographics | 94%/99%
Type 2 Diabetes | Laboratory Tests | Diagnoses, laboratory tests, medications | Demographics, height, weight, family history | 98%/100%
Cardiac Conduction | ECG measurements | ECG report results | Demographics, diagnoses, procedure codes, medications, laboratory tests | 97%

[eMERGE Network]

©2012 MFMER | slide-12

Page 13: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Genotype-Phenotype Association Results

[Figure: plot of observed vs. published odds ratios by disease, marker, and gene/region; odds ratio axis ticks at 0.5, 1.0, 2.0, and 5.0]

Diseases: atrial fibrillation, Crohn's disease, multiple sclerosis, rheumatoid arthritis, type 2 diabetes

Markers (gene/region): rs2200733 (Chr. 4q25), rs10033464 (Chr. 4q25), rs11805303 (IL23R), rs17234657 (Chr. 5), rs1000113 (Chr. 5), rs17221417 (NOD2), rs2542151 (PTPN22), rs3135388 (DRB1*1501), rs2104286 (IL2RA), rs6897932 (IL7RA), rs6457617 (Chr. 6), rs6679677 (RSBN1), rs2476601 (PTPN22), rs4506565 (TCF7L2), rs12255372 (TCF7L2), rs12243326 (TCF7L2), rs10811661 (CDKN2B), rs8050136 (FTO), rs5219 (KCNJ11), rs5215 (KCNJ11), rs4402960 (IGF2BP2)

[Ritchie et al. 2010]

©2012 MFMER | slide-13

Page 14: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Key lessons learned from eMERGE

• Algorithm design and transportability
  • Non-trivial; requires significant expert involvement
  • Highly iterative process
  • Time-consuming manual chart reviews
  • Representation of “phenotype logic” for transportability is critical
• Standardized data access and representation
  • Importance of unified vocabularies, data elements, and value sets
  • Questionable reliability of ICD & CPT codes (e.g., billing the wrong code since it is easier to find)
  • Natural Language Processing (NLP) is critical

©2012 MFMER | slide-14

Page 15: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Algorithm Development Process - Modified

[Figure: algorithm development workflow; labels: Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP, SQL, Rules, Mappings, Semi-Automatic Execution]

[eMERGE Network]

©2012 MFMER | slide-15

Page 16: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Algorithm Development Process - Modified

[Figure: algorithm development workflow; labels: Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP, SQL, Rules, Mappings, Semi-Automatic Execution]

©2012 MFMER | slide-16

• Standardized representation of clinical data

• Create new and re-use existing clinical element models (CEMs)

• Standardized and structured representation of phenotype definition criteria

• Use the NQF Quality Data Model (QDM)

• Conversion of structured phenotype criteria into executable queries

• Use JBoss® Drools (DRLs)

[Welch et al. 2012][Thompson et al., submitted 2012]

[Li et al., submitted 2012]

Page 17: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

The SHARPn “phenotyping funnel”

[Figure: funnel diagram; labels: Mayo Clinic EHR, Intermountain EHR, CEMs, QDMs, DRLs, Phenotype specific patient cohorts]

[Welch et al. 2012][Thompson et al., submitted 2012]

[Li et al., submitted 2012]

©2012 MFMER | slide-17

Page 18: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Clinical Element Models: Higher-Order Structured Representations

©2012 MFMER | slide-18

[Stan Huff, IHC]

Page 19: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Pre- and Post-Coordination

©2012 MFMER | slide-19

[Stan Huff, IHC]

Page 20: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping [Stan Huff, IHC]

CEMs available for patient demographics, medications, lab measurements, procedures etc.

Page 21: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

©2012 MFMER | slide-21

SHARPn data normalization flow - I

CEM MySQL database with normalized patient information

[Welch et al. 2012]

Page 22: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

SHARPn data normalization flow - II

©2012 MFMER | slide-22

CEM MySQL database with normalized patient information

Page 23: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Algorithm Development Process - Modified

[Figure: algorithm development workflow; labels: Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP, SQL, Rules, Mappings, Semi-Automatic Execution]

©2012 MFMER | slide-23

• Standardized representation of clinical data

• Create new and re-use existing clinical element models (CEMs)

• Standardized and structured representation of phenotype definition criteria

• Use the NQF Quality Data Model (QDM)

[Welch et al. 2012][Thompson et al., submitted 2012]

[Li et al., submitted 2012]

Page 24: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Our task: from human readable to machine computable

©2012 MFMER | slide-24

[Thompson et al., submitted 2012]

Page 25: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

NQF Quality Data Model (QDM)

• Standard of the National Quality Forum (NQF)
  • A structure and grammar to represent quality measures in a standardized format
• Groups of codes in a code set (ICD-9, etc.)
  • "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"
• Supports temporality & sequences
  • AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"
• Implemented as a set of XML schemas
  • Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm etc.)

©2012 MFMER | slide-25
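To make the pairing of a QDM data type with a value set more concrete, here is a minimal Python sketch. It is not the QDM XML schema: the `ValueSet` and `Criterion` classes, the event tuple layout, and the codes `PLACEHOLDER-1` and `PLACEHOLDER-2` are assumptions made for illustration. Only the data type string and the value set OID are taken from the slide.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

# One coded entry: (code system, code).
Code = Tuple[str, str]
# One patient event: (QDM-style data type, code system, code).
Event = Tuple[str, str, str]


@dataclass(frozen=True)
class ValueSet:
    """A named group of codes identified by an OID."""
    oid: str
    name: str
    codes: FrozenSet[Code]


@dataclass(frozen=True)
class Criterion:
    """A data type (e.g. "Diagnosis, Active") constrained to a value set."""
    data_type: str
    value_set: ValueSet


def matches(criterion: Criterion, events: Iterable[Event]) -> bool:
    """True if any event has the criterion's data type and a code in its value set."""
    return any(
        data_type == criterion.data_type
        and (system, code) in criterion.value_set.codes
        for data_type, system, code in events
    )


# The OID is quoted from the slide; the member codes are placeholders.
steroid_induced_diabetes = ValueSet(
    oid="2.16.840.1.113883.3.464.0001.113",
    name="steroid induced diabetes",
    codes=frozenset({("ICD-9", "PLACEHOLDER-1"), ("SNOMED-CT", "PLACEHOLDER-2")}),
)
criterion = Criterion("Diagnosis, Active", steroid_induced_diabetes)

events = [("Diagnosis, Active", "ICD-9", "PLACEHOLDER-1")]
print(matches(criterion, events))
```

Temporal operators such as "starts before or during" are deliberately left out of this sketch; they would need timestamps on each event.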

Page 26: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping ©2012 MFMER | slide-26

116 Meaningful Use Phase I Quality Measures

Page 27: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Example: Diabetes & Lipid Mgmt. - I

©2012 MFMER | slide-27

Human readable HTML

Page 28: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Example: Diabetes & Lipid Mgmt. - II

©2012 MFMER | slide-28

Computable XML

Page 29: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

NQF Measure Authoring Tool (MAT)

©2012 MFMER | slide-29

Page 30: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Algorithm Development Process - Modified

[Figure: algorithm development workflow; labels: Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP, SQL, Rules, Mappings, Semi-Automatic Execution]

©2012 MFMER | slide-30

• Standardized representation of clinical data

• Create new and re-use existing clinical element models (CEMs)

• Standardized and structured representation of phenotype definition criteria

• Use the NQF Quality Data Model (QDM)

• Conversion of structured phenotype criteria into executable queries

• Use JBoss® Drools (DRLs)

[Welch et al. 2012][Thompson et al., submitted 2012]

[Li et al., submitted 2012]

Page 31: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

JBoss® open-source Drools rules based management system (RBMS)

©2012 MFMER | slide-31

• Represents knowledge with declarative production rules
  • Origins in artificial intelligence expert systems
  • Simple when <pattern> then <action> rules specified in text files

• Separation of data and logic into separate components

• Forward chaining inference model (Rete algorithm)

• Domain specific languages (DSL)
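The when/then production-rule idea and forward chaining can be illustrated with a small Python sketch. This is a naive fixed-point loop, not the Rete algorithm and not Drools itself; the `Rule` class, the `forward_chain` function, and the fact names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Rule:
    """A production rule: when all `when` facts hold, assert the `then` facts."""
    name: str
    when: FrozenSet[str]
    then: FrozenSet[str]


def forward_chain(rules: Iterable[Rule], initial_facts: Iterable[str]) -> Set[str]:
    """Fire rules repeatedly until no rule can add a new fact."""
    rule_list = list(rules)
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for rule in rule_list:
            # Fire only if the pattern matches and the rule adds something new.
            if rule.when <= facts and not rule.then <= facts:
                facts |= rule.then
                changed = True
    return facts


# Hypothetical rules; the second must fire before the first can match.
rules = [
    Rule("confirm case",
         when=frozenset({"candidate_case", "on_treatment"}),
         then=frozenset({"case"})),
    Rule("flag candidate",
         when=frozenset({"diagnosis_code", "abnormal_lab"}),
         then=frozenset({"candidate_case"})),
]
derived = forward_chain(rules, {"diagnosis_code", "abnormal_lab", "on_treatment"})
print(sorted(derived))
```

The rules are data and the loop is generic, which mirrors the separation of data and logic listed above. A Rete-based engine avoids re-evaluating every rule on every pass, which this loop does not attempt.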

Page 32: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Example Drools rule

©2012 MFMER | slide-32

rule "Glucose <= 40, Insulin On"
when
    $msg : GlucoseMsg(glucoseFinding <= 40, currentInsulinDrip > 0)
then
    glucoseProtocolResult.setInstruction(GlucoseInstructions.GLUCOSE_LESS_THAN_40_INSULIN_ON_MSG);
end

Callouts: {Rule Name}, {binding}, {Java Class}, {Class Getter Method}, Parameter {Java Class}, {Class Setter Method}

Page 33: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Automatic translation from NQF QDM criteria to Drools

[Figure: pipeline "From non-executable to executable". Measure Authoring Toolkit outputs: data types (XML-based structured representation), value sets (saved in XLS files), and measures (XML-based structured representation). Remaining labels: mapping data types and value sets; fact models; converting measures to Drools scripts; Drools scripts; Drools Engine]

©2012 MFMER | slide-33

[Li et al., submitted 2012]

Page 34: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Automatic translation from NQF QDM criteria to Drools

©2012 MFMER | slide-34

[Li et al., submitted 2012]

Page 35: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

The “executable” Drools flow

©2012 MFMER | slide-35

Page 36: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

©2012 MFMER | slide-36

Phenotype library and workbench - I

1. Converts QDM to Drools
2. Executes rules by querying the CEM database
3. Generates summary reports

http://phenotypeportal.org

Page 37: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

©2012 MFMER | slide-37

Phenotype library and workbench - II

http://phenotypeportal.org

Page 38: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

©2012 MFMER | slide-38

Phenotype library and workbench - III

http://phenotypeportal.org

Page 39: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping ©2012 MFMER | slide-39

Phenotype library and workbench - IV

Page 40: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Page 41: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Additional on-going research efforts - I

• Machine learning and association rule mining
  • Manual creation of algorithms takes time
  • Let computers do the “hard work”
  • Validate against expert-developed ones

©2012 MFMER | slide-41

[Carroll et al. 2011]

Page 42: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Additional on-going research efforts - I

• Origins from sales data
• Items (columns): co-morbid conditions
• Transactions (rows): patients
• Itemsets: sets of co-morbid conditions
• Goal: find all itemsets (sets of conditions) that frequently co-occur in patients.
  • One of those conditions should be DM.
• Support: # of transactions the itemset I appeared in
  • Support({TB, DLM, ND})=3
• Frequent: an itemset I is frequent, if support(I)>minsup

Patient TB DLM ND … IEC

001 Y Y Y Y

002 Y Y Y Y

003 Y Y

004 Y

005 Y Y Y

X: infrequent

[Simon et al. 2012]

©2012 MFMER | slide-42
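The support and frequent-itemset definitions from this slide can be written out directly in Python. The sketch below enumerates itemsets by brute force rather than using Apriori or any other pruning strategy. The five patient records are hypothetical: they were chosen so that {TB, DLM, ND} has support 3, as on the slide, but they are not a reconstruction of the slide's table. The `required` argument is an assumed way to express the condition that one item (DM on the slide) must be present.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Set


def support(itemset: Set[str], transactions: List[Set[str]]) -> int:
    """Number of transactions (patients) that contain every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions: List[Set[str]],
                      minsup: int,
                      required: Optional[str] = None) -> Dict[FrozenSet[str], int]:
    """Brute-force search for itemsets with support(I) > minsup.

    If `required` is given, only itemsets containing that item are returned.
    """
    items = sorted(set().union(*transactions))
    result: Dict[FrozenSet[str], int] = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if required is not None and required not in candidate:
                continue
            count = support(candidate, transactions)
            if count > minsup:  # strict inequality, as defined on the slide
                result[candidate] = count
    return result


# Hypothetical patients; each set lists the conditions recorded for one patient.
patients = [
    {"TB", "DLM", "ND", "IEC"},
    {"TB", "DLM", "ND", "IEC"},
    {"TB", "IEC"},
    {"DLM"},
    {"TB", "DLM", "ND"},
]
triple_support = support({"TB", "DLM", "ND"}, patients)
frequent = frequent_itemsets(patients, minsup=2)
print(triple_support, len(frequent))
```

Brute-force enumeration grows exponentially with the number of conditions, so a real analysis would rely on a pruning algorithm; the definitions being computed stay the same.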

Page 43: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Additional on-going research efforts - II

©2012 MFMER | slide-43

Page 44: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping ©2012 MFMER | slide-44

TRALI/TACO sniffer

Additional on-going research efforts - II

Page 45: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping ©2012 MFMER | slide-45

Page 46: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Active Surveillance for TRALI and TACO

Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) of these were reported to the blood bank by the clinical service.

Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.

©2012 MFMER | slide-46

Page 47: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

Additional on-going research efforts - III

• Phenome-wide association scan (PheWAS)
  • Do a “reverse GWAS” using EHR data
  • Facilitate hypothesis generation

©2012 MFMER | slide-47

[Pathak et al. submitted 2012]

Page 48: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Publications to date (conservative)

[Figure: bar chart of papers, abstracts, and manuscripts under review for Year 1 (2011), Year 2 (2012), and Year 3 (2013); y-axis from 0 to 14; bar labels as extracted: 8, 66, 2, 12]

©2012 MFMER | slide-48

Page 49: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Mayo projects and collaborations

• Ongoing
  • Transfusion related acute lung injury (Kor)
  • Drug induced liver injury (Talwalkar)
  • Drug induced thrombocytopenia and neutropenia (Al-Kali)
  • Active surveillance for celiac disease (Murray)
  • Warfarin dose response & heart valve replacements (Pereira)
  • Phenotype definition standardization (HCPR/Quality)
• Getting started/planning
  • Pharmacogenomics of systolic heart failure (Bielinski/Pereira)
  • Pharmacogenomics of SSRI (Mrazek/Weinshilboum)
  • Lumbar image reporting with epidemiology (Kallmes)
  • Active clinical trial alerting (CTMS/Cancer Center)

©2012 MFMER | slide-49

Page 50: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

HTP related presentations

• June 11th, 2012
  • Using EHRs for clinical research (Vitaly Herasevich)
  • Association rule mining and T2D risk prediction (Gyorgy Simon)
  • Scenario-based requirements engineering for developing EHR add-ons to support CER in patient care settings (Junfeng Gao)
• June 12th, 2012
  • Exploring patient data in context clinical research studies: Research Data Explorer (Adam Wilcox et al.)
  • Utilizing previous result sets as criteria for new queries with FURTHeR (Dustin Schultz et al.)
  • Semantic search engine for clinical trials (Yugyung Lee)
  • Knowledge-driven workbench for predictive modeling (Peter Haug et al.)
  • Clinical analytics driven care coordination for 30-day readmission – Demonstration from 360 Fresh.com (Ramesh Sairamesh)

©2012 MFMER | slide-50

Page 51: Jyotishman Pathak,  PhD Assistant Professor of Biomedical Informatics

SHARPn High-Throughput Phenotyping

Thank You!

©2012 MFMER | slide-51

[email protected]