Samwald ore 2014

Page 1: Samwald   ore 2014

An update on Genomic CDS

A complex ontology for pharmacogenomics / personalized medicine and clinical decision support

Matthias Samwald, José Antonio Minarro-Giménez

Medical University of Vienna

W3C Semantic Web for Healthcare and Life Science Interest Group

Page 2: Samwald   ore 2014

Drug efficacy and toxicity can vary drastically between patients with different genetic profiles

Up to 100,000 deaths and 2 million hospitalizations are caused by adverse drug reactions per year in the United States alone.

Page 3: Samwald   ore 2014
Page 4: Samwald   ore 2014

Goals of ontology development

• Providing a simple and concise formalism for representing pharmacogenomic knowledge

• Finding errors and missing definitions in pharmacogenomic knowledge bases

• Automatically assigning alleles and phenotypes to patients

• Matching patients to clinically appropriate pharmacogenomic guidelines and clinical decision support messages

• Detecting inconsistencies between pharmacogenomic treatment guidelines from different sources

Page 5: Samwald   ore 2014

This is how it actually looks in the ontology

Class: rs1057911

SubClassOf: polymorphism

Annotations:
    rsid "rs1057911",
    relevant_for CYP2C9,
    can_be_tested_with 23andMe_v2,
    can_be_tested_with 23andMe_v3,
    can_be_tested_with Affymetrix_DMET_chip,
    rdfs:seeAlso <http://bio2rdf.org/dbsnp:rs1057911>,
    dbsnp_orientation_on_reference_genome "forward"

Class: rs1057911_A

SubClassOf: rs1057911

Class: rs1057911_T

SubClassOf: rs1057911

DisjointClasses: rs1057911_A, rs1057911_T
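
Because every SNP follows the same pattern (one class for the polymorphism, one subclass per allele, all alleles mutually disjoint), such frames can be produced by a script. The following is a minimal sketch in Python, assuming a hypothetical helper `snp_to_manchester`; it is not the project's actual tooling, and it emits only the `rsid` annotation.

```python
def snp_to_manchester(rsid, alleles):
    """Return Manchester Syntax frames for one SNP and its allele variants.

    Hypothetical helper for illustration; only the rsid annotation is emitted.
    """
    variants = [f"{rsid}_{allele}" for allele in alleles]
    lines = [
        f"Class: {rsid}",
        "    SubClassOf: polymorphism",
        f'    Annotations: rsid "{rsid}"',
    ]
    # One subclass per allele variant of the SNP.
    for variant in variants:
        lines += ["", f"Class: {variant}", f"    SubClassOf: {rsid}"]
    # A single position cannot carry two different alleles at once.
    if len(variants) > 1:
        lines += ["", "DisjointClasses: " + ", ".join(variants)]
    return "\n".join(lines)


print(snp_to_manchester("rs1057911", ["A", "T"]))
```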

Page 6: Samwald   ore 2014

Examples of OWL axioms representing humans with homozygous or heterozygous genotypes. Humans usually have two copies of each gene (and hence each polymorphism occurs twice).

Class: human_with_genotype_rs1057911_variant_A_A

SubClassOf: has exactly 2 rs1057911_A

Class: human_with_genotype_rs1057911_variant_A_T

SubClassOf: (has some rs1057911_A) and (has some rs1057911_T)
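
The two cases differ only in the class expression: a homozygous genotype uses an exact cardinality of 2, a heterozygous genotype uses one existential restriction per allele. A small Python sketch of that mapping follows; the function names `genotype_expression` and `genotype_class` are assumptions made for this illustration.

```python
def genotype_expression(rsid, allele_1, allele_2):
    """Build the class expression for a diploid genotype at one SNP."""
    if allele_1 == allele_2:
        # Homozygous: both copies carry the same allele.
        return f"has exactly 2 {rsid}_{allele_1}"
    # Heterozygous: one copy of each allele; sort for a stable order.
    first, second = sorted([allele_1, allele_2])
    return f"(has some {rsid}_{first}) and (has some {rsid}_{second})"


def genotype_class(rsid, allele_1, allele_2):
    """Return a Manchester Syntax class frame for the genotype."""
    first, second = sorted([allele_1, allele_2])
    name = f"human_with_genotype_{rsid}_variant_{first}_{second}"
    expression = genotype_expression(rsid, allele_1, allele_2)
    return f"Class: {name}\n    SubClassOf: {expression}"


print(genotype_class("rs1057911", "A", "A"))
print(genotype_class("rs1057911", "T", "A"))
```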

Page 7: Samwald   ore 2014

An excerpt of a translational allele/haplotype table for the gene CYP2C9

Page 8: Samwald   ore 2014

An excerpt of a translational allele/haplotype table for the gene CYP2C9

Page 9: Samwald   ore 2014

An excerpt of a translational allele/haplotype table for the gene CYP2C9

Page 10: Samwald   ore 2014

Examples of scenarios where automated scripts helped in the curation of haplotype definitions

Page 11: Samwald   ore 2014

Dosing guideline from a US Food and Drug Administration (FDA) drug label

Page 12: Samwald   ore 2014

An excerpt of a CDS rule derived from the warfarin drug label

Class: 'human triggering CDS rule 9'

Annotations:

CDS_message "0.5-2 mg warfarin per day should be considered as a starting dose range for a patient with this genotype according to the warfarin drug label.",

relevant_for Warfarin,

recommendation_importance "Important modification"

EquivalentTo:

human and

(has some 'CYP2C9 *1') and

(has some 'CYP2C9 *3') and

(has exactly 2 rs9923231_T)

Page 13: Samwald   ore 2014

An example of how pharmacogenomic findings about an individual patient can be represented

Individual: 'John Doe'

Types:

human,

(has some rs6025_C) and (has some rs6025_T),

(has some rs9934438_A) and (has some rs9934438_G),

has exactly 2 rs12979860_T,

has exactly 2 rs9923231_T,

(has some 'CYP2C9*1') and (has some 'CYP2C9*3'),

has exactly 2 'CYP2D6*2'

Page 14: Samwald   ore 2014

An example of how pharmacogenomic findings about an individual patient can be represented

Individual: 'John Doe'

Types:

human,

(has some rs6025_C) and (has some rs6025_T),

(has some rs9934438_A) and (has some rs9934438_G),

has exactly 2 rs12979860_T,

has exactly 2 rs9923231_T,

(has some 'CYP2C9*1') and (has some 'CYP2C9*3'),

has exactly 2 'CYP2D6*2'

Inferred by the OWL reasoner:

"0.5-2 mg warfarin per day should be considered as a starting dose range for a patient with this genotype according to the warfarin drug label."
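
The match between the patient description and CDS rule 9 can be approximated without an OWL reasoner. The sketch below is plain Python under a closed-world assumption (each allele count is taken as complete), which differs from OWL's open-world semantics. The data layout and the names `john_doe`, `RULE_9`, `triggers` and `messages_for` are assumptions for illustration, not the project's code, and allele labels are written without a space.

```python
from collections import Counter

# Patient data as copy counts per allele class (closed-world simplification).
john_doe = Counter({
    "rs6025_C": 1, "rs6025_T": 1,
    "rs9934438_A": 1, "rs9934438_G": 1,
    "rs12979860_T": 2,
    "rs9923231_T": 2,
    "CYP2C9*1": 1, "CYP2C9*3": 1,
    "CYP2D6*2": 2,
})

# Each condition is (allele class, kind); kind is "some" or "exactly 2".
RULE_9 = {
    "message": (
        "0.5-2 mg warfarin per day should be considered as a starting dose "
        "range for a patient with this genotype according to the warfarin "
        "drug label."
    ),
    "conditions": [
        ("CYP2C9*1", "some"),
        ("CYP2C9*3", "some"),
        ("rs9923231_T", "exactly 2"),
    ],
}


def triggers(patient, rule):
    """Return True if the patient satisfies every condition of the rule."""
    for allele, kind in rule["conditions"]:
        copies = patient[allele]  # Counter yields 0 for absent alleles
        if kind == "some" and copies < 1:
            return False
        if kind == "exactly 2" and copies != 2:
            return False
    return True


def messages_for(patient, rules):
    """Collect the CDS messages of all rules the patient triggers."""
    return [rule["message"] for rule in rules if triggers(patient, rule)]


for message in messages_for(john_doe, [RULE_9]):
    print(message)
```

Unlike a reasoner, this check cannot detect inconsistent definitions or infer haplotypes from SNP data; it only evaluates conditions against counts that are already known.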

Page 15: Samwald   ore 2014

Some basic statistics

The ontology currently represents

• 336 SNPs with 707 variants

• 665 haplotypes related to 43 genes

• 22 rules related to human phenotypes

• 308 dosage recommendation rules

It is made up of approximately

• 22,000 axioms

• 7,700 logical axioms

• 4,100 classes

Page 16: Samwald   ore 2014

Time taken by different reasoners for classifying and realising the demo ontology.

Ontologies have ALCQ expressivity.

System specifications: Windows 7 Professional, Java version 1.6.0_29-b11 and 64-bit platform, running on an Intel Core i5-2430M with 4 GB of memory

Page 17: Samwald   ore 2014

Ontology development and application was characterized by cycling through 3 emotional stages

Page 18: Samwald   ore 2014

Ontology development and application was characterized by cycling through 3 emotional stages

Page 19: Samwald   ore 2014

Ontology development and application was characterized by cycling through 3 emotional stages

Page 20: Samwald   ore 2014

The good

• The majority of the primary goals of ontology development have largely been met

o But the devil is in the details, and there are roadblocks for practical application

• Helped to find a concise formalisation and identify pitfalls that might have been overlooked with another approach, at least initially

• Manchester Syntax is easily readable with this ontology

o Some decision support axioms were curated by a medical student who wrote them down in Manchester Syntax with minimal training

Page 21: Samwald   ore 2014

The challenging

• TrOWL still performs best among freely available reasoners by a wide margin, but might only provide partial results

o Seems complete, but hard to tell for sure

o Bad for critical applications such as health care

o Predictable incompleteness would be better than unpredictable incompleteness

• Konclude also worked and is complete; it needs further evaluation (as do other commercial reasoners)

Page 22: Samwald   ore 2014

The challenging

• The OWL approach pushed everything firmly into ‘research prototype’ mode

o Still feels quite adventurous and somewhat burdensome when used for mission-critical applications

o We re-implemented part of the reasoning process with our own code to get rid of OWL for mission-critical inferences (this also helped to make decision support algorithms run on Android)

Page 23: Samwald   ore 2014

The challenging

• Awkward moment when starting the reasoner after extending/modifying the ontology: will it still terminate within an acceptable timespan?

o Quite unpredictable, shrouds the development process in doubt

o It would be great if all reasoners shipped with end-user-friendly heuristics describing ontology features known to significantly decrease performance
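
One pragmatic guard against non-terminating runs is to launch the reasoner as a separate process with a hard time limit after each ontology change. The following Python sketch shows the idea with the standard library only; `run_with_timeout` is a hypothetical helper, and the command is a stand-in, since the actual command line of a given reasoner is not covered here.

```python
import subprocess
import sys
import time


def run_with_timeout(command, timeout_seconds):
    """Run a command; return (finished, elapsed_seconds).

    finished is False if the process was killed after the time limit.
    """
    start = time.monotonic()
    try:
        subprocess.run(
            command,
            check=True,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        finished = True
    except subprocess.TimeoutExpired:
        finished = False
    return finished, time.monotonic() - start


# Stand-in command; replace it with the real invocation of a reasoner.
stand_in = [sys.executable, "-c", "pass"]
finished, elapsed = run_with_timeout(stand_in, timeout_seconds=60)
print(f"finished={finished} after {elapsed:.2f} s")
```

Logging the elapsed time per ontology revision gives at least a rough record of which change made classification slow.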

Page 24: Samwald   ore 2014

The challenging

• After implementing 80% of the needed features in an elegant OWL 2 DL ontology, I found that the missing 20% cannot be expressed in OWL…

o There should be more end-user-friendly documentation describing patterns that might seem as if they could be handled by a specific reasoner, but cannot actually be handled

o For me: realizing that I would need cardinality restrictions on transitive properties / property paths, but that is a no-go. Sigh.
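
The limitation comes from the global restrictions of OWL 2 DL: cardinality restrictions may only use simple object properties, and a property is not simple if it is transitive, defined via a property chain, or has such a property as a subproperty. The Python sketch below illustrates a simplified check that covers only transitivity and the subproperty hierarchy (chains and inverses are ignored); all property and class names in it are hypothetical.

```python
def non_simple_properties(transitive, sub_property_of):
    """Return properties that are transitive or have a transitive subproperty.

    sub_property_of maps a property to the list of its direct superproperties.
    """
    non_simple = set(transitive)
    frontier = list(transitive)
    while frontier:
        prop = frontier.pop()
        # Every superproperty of a non-simple property is non-simple too.
        for parent in sub_property_of.get(prop, ()):
            if parent not in non_simple:
                non_simple.add(parent)
                frontier.append(parent)
    return non_simple


def illegal_cardinality_restrictions(restrictions, transitive, sub_property_of):
    """Return the restrictions that use a non-simple property.

    Each restriction is a tuple (property, kind, number, filler).
    """
    blocked = non_simple_properties(transitive, sub_property_of)
    return [r for r in restrictions if r[0] in blocked]


# Hypothetical example: 'has_part_transitively' is declared transitive and
# is a subproperty of 'has_component_or_part'; 'has' stays simple.
transitive = {"has_part_transitively"}
sub_property_of = {"has_part_transitively": ["has_component_or_part"]}
restrictions = [
    ("has", "exactly", 2, "rs9923231_T"),
    ("has_part_transitively", "exactly", 2, "some_class"),
    ("has_component_or_part", "min", 1, "some_class"),
]
for restriction in illegal_cardinality_restrictions(
        restrictions, transitive, sub_property_of):
    print("not allowed in OWL 2 DL:", restriction)
```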

Page 25: Samwald   ore 2014

The bad

• TrOWL did not alert us about some errors while other reasoners did. Some of the time.

• But those other reasoners often could not explain the errors either (waiting forever), so not very helpful with the complex ontology we are working with.

• When explanations were available, it was often very tricky to spot the actual mistake

o Need (even) better explanation summaries

o A few times the error reports seemed to be errors by the reasoners, since the explanations did not make sense and we were unable to find a cause ourselves

Page 26: Samwald   ore 2014

The bad

• If reasoning takes long / forever, there is no easy means of profiling to find out what is causing the performance problems, which makes them difficult to fix

Page 27: Samwald   ore 2014

All this is part of a larger project

Page 28: Samwald   ore 2014

Thanks

W3C collaborators:

Michel Dumontier (Carleton University)

Robert R. Freimuth (Mayo Clinic)

Richard Boyce (University of Pittsburgh)

Simon Lin (Marshfield Clinic)

Robert L. Powers (Predictive Medicine, Inc.)

Joanne S. Luciano (Rensselaer Polytechnic Institute)

Eric Prud’hommeaux (W3C)

M. Scott Marshall (MAASTRO Clinic)

Funding:

Austrian Science Fund (FWF): [PP 25608-N15]

http://www.genomic-cds.org/

http://safety-code.org/