Beyond Ontologies: Putting Biomedical Knowledge to Work
Philip R.O. Payne, Ph.D.
Associate Professor and Chair, Biomedical Informatics (College of Medicine)
Associate Professor, Health Services Management and Policy (College of Public Health)
Associate Director for Data Sciences, Center for Clinical and Translational Science
Executive-in-residence, Office of Technology Transfer and Commercialization
NCBO Project Meeting
March 13, 2013
COI/Disclosures
Federal Funding: NCI, NLM, NCATS, AHRQ
Additional Research Funding: SAIC, Rockefeller Philanthropy Associates, Academy Health, Pfizer
Academic Consulting: CWRU, Cleveland Clinic, University of Cincinnati, Columbia University, Emory University, Virginia Commonwealth University, University of California San Diego, University of California Irvine, University of California San Francisco, University of Minnesota, Northwestern University
Other Consulting/Honoraria: American Medical Informatics Association (AMIA), Institute of Medicine (IOM)
Editorial Boards: Journal of the American Medical Informatics Association, Journal of Biomedical Informatics
Study Sections: NLM (BLIRC), NCATS (formerly NCRR), NIDDK
Corporate: Oracle, HP, Epic, Accelmatics (interim-CEO)
Outline
• Working definitions and assumptions
• Putting biomedical knowledge to work
– A practical approach to CDEs
– Resource discovery
– Hypothesis generation
• Discussion
– Knowledge-based systems engineering
– Big data
The Multiple Dimensions of Biomedical Knowledge Engineering
Technology
• Deployment model
• Systems integration
• Extensibility
Data Sharing
• Vocabularies
• Semantics
• Knowledge engineering processes
Empowering Knowledge Workers
• Tools and best practices
• Governance
• Socio-cultural factors
Creating Dynamic, Interoperable Systems
Lowering Barriers to Adoption
Facilitating Sharing and Integration
Solving Real World Problems
A Balanced Approach to Realizing the Benefits of Shared Semantics
Computable Interoperability
Working Interoperability
Peer-to-Peer Negotiation of Project-Specific Constructs
Impact on Community-wide:
• Governance
• Technologies
• Software Engineering Approaches
Community-wide Vocabularies and Semantics
Historical Focus
Empowering Knowledge Workers
Driving Biological and Clinical Problems
Subject Matter Experts
Solutions to Real World Interoperability Needs
Critical Issues:
• Workflows that enable engagement by Subject Matter Experts
• Tight coupling of knowledge engineering efforts and research programs that can define “real world” driving problems
• Facilitation and support of interdisciplinary, team science models (including basic and translational scientists, clinical researchers, and informaticians)
Biomedical Informatics ≠ Engineering
Systems-level Approaches To Knowledge Engineering and Usability Are Essential
4 Assumptions Regarding the Current State and Future of the NCBO
1) The tools and knowledge collections created and maintained by the NCBO have become a substrate for a broad spectrum of biomedical informatics innovations
– Analogous to the role played by NLM-provided resources
2) Future directions for the center and its work will trend towards an applied science focus
3) At the same time, outreach, engagement, and education will remain high priorities
4) The current funding climate presents a significant and unknown challenge to the preceding 3 assumptions
A Pragmatic Approach to CDEs: The OpenMDR Project
Defining Common Data Elements (CDEs)
Common Data Elements (CDEs) are standardized terms for the collection and exchange of data
CDEs are metadata
– CDEs describe the type of data being collected, not the data itself
Critical role(s) for CDEs:
– to identify discrete, defined items for data collection
– to promote consistent data collection in the field
– to eliminate unneeded or redundant data collection
– to promote consistent reporting and analysis
– to reduce the possibility of error related to data translation and transmission
– to facilitate data sharing
Source: National Cancer Institute (NCI)
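The idea that a CDE is metadata (a description of what is collected, not the collected value) can be sketched in a few lines of Python. This is a minimal illustration only: the class name, its fields, and the concept code "EX:0001" are assumptions made for the example and are not taken from the NCI definition or from any CDE repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommonDataElement:
    """Metadata describing one item to collect (not the collected value)."""
    name: str                      # human-readable label
    definition: str                # what the item means
    concept_code: str              # hypothetical ontology concept identifier
    datatype: type                 # expected Python type of a collected value
    permissible_values: Optional[frozenset] = None  # None means unconstrained

    def validate(self, value) -> bool:
        """Return True if a collected value conforms to this element."""
        if not isinstance(value, self.datatype):
            return False
        if self.permissible_values is not None:
            return value in self.permissible_values
        return True


# Illustrative element; "EX:0001" is a placeholder, not a real identifier.
sex = CommonDataElement(
    name="Patient Sex",
    definition="Sex of the patient as recorded at enrollment",
    concept_code="EX:0001",
    datatype=str,
    permissible_values=frozenset({"Female", "Male", "Unknown"}),
)

print(sex.validate("Female"))  # conforms to the permissible values
print(sex.validate("F"))       # rejected: not a permissible value
```

Keeping the permissible values on the element, rather than in each collection form, is what lets two sites collect the same item consistently and share it without translation.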
OpenMDR: a Distributed CDE Platform
Semantic Metadata Management Suite
– Locally relevant ontology-anchored data elements
– Rapid and agile development paradigm
Distributed terminology ecosystem
– Federated queries across multiple deployments
Interaction with other semantic management systems
– ISO 11179 semantic repository
– Integration with industry standard tools
http://openmdr.org
OpenMDR Functional Components
Create and manage terminology
Discover and reuse concepts
Annotate models for discovery and interoperability
Utilize data elements to build semantically anchored services
OpenMDR Is Federated
Multiple deployments for locally relevant terminology.
DNS-like hierarchy of authority
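One possible reading of a federated deployment with a hierarchy of authority is sketched below: each deployment holds locally relevant elements, and a name that is not defined locally falls back to the parent authority. The `Registry` class, `federated_query` function, and the element names are hypothetical and are not part of the OpenMDR API.

```python
from typing import Dict, List, Optional, Tuple


class Registry:
    """One metadata registry deployment holding locally defined elements."""

    def __init__(self, name: str, parent: Optional["Registry"] = None):
        self.name = name
        self.parent = parent                  # higher authority, if any
        self.elements: Dict[str, str] = {}    # element name -> definition

    def define(self, element: str, definition: str) -> None:
        self.elements[element] = definition

    def resolve(self, element: str) -> Optional[Tuple[str, str]]:
        """Local definitions win; otherwise walk up the chain of authority."""
        node: Optional[Registry] = self
        while node is not None:
            if element in node.elements:
                return (node.name, node.elements[element])
            node = node.parent
        return None


def federated_query(registries: List[Registry], term: str) -> List[Tuple[str, str]]:
    """Search every deployment and report which registry holds each match."""
    term = term.lower()
    return [
        (reg.name, elem)
        for reg in registries
        for elem in sorted(reg.elements)
        if term in elem.lower()
    ]


community = Registry("community")
site_a = Registry("site-a", parent=community)
community.define("Tumor Stage", "Stage of the primary tumor")
site_a.define("Local Biopsy Code", "Site-specific biopsy identifier")

print(site_a.resolve("Tumor Stage"))           # answered by the parent authority
print(federated_query([community, site_a], "biopsy"))
```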
OpenMDR as part of an MDA Workflow
Empowers knowledge workers
Enterprise Architect plugin
– Formulate searches against local or distributed OpenMDR instances
Identify semantic terms in detail
– Concept codes help distinguish similar elements
Apply annotations to the data model
OpenMDR and the TRIAD SOA
Resource Discovery: ResearchIQ
Motivation for the Design of ResearchIQ
• Clinical and translational researchers frequently need to identify and engage:
– Collaborators
– Shared resources
– Data, information, and knowledge collections
• A multitude of sources can be used to support such needs; however, they are usually:
– Heterogeneous
– Difficult to find
– Not linked
How do we overcome these barriers to the efficient planning and conduct of clinical and translational studies?
A Potential Solution:
What is ResearchIQ (Research Integrative Query)?
– A single knowledge resource portal for the clinical and translational research community that will provide a “front door” for a variety of resources.
How does ResearchIQ work?
– Knowledge-anchored semantic search
– Leveraging semantic web technologies
Current project focus is on the development and deployment of an end-user facing proof-of-concept
– Can it be done?
– How difficult will it be?
– Can it scale?
– What kind of coverage would we have?
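As a rough illustration of what knowledge-anchored search can mean, the sketch below expands a query concept to its narrower concepts in a toy is-a hierarchy and then matches resources annotated with any of them. The hierarchy, the resource records, and the function names are invented for the example; they do not describe the ResearchIQ implementation, which the slides attribute to semantic web technologies.

```python
from typing import Dict, List, Set

# Toy is-a hierarchy: child concept -> parent concept (illustrative only).
IS_A: Dict[str, str] = {
    "chronic lymphocytic leukemia": "leukemia",
    "leukemia": "hematologic malignancy",
    "lymphoma": "hematologic malignancy",
}

# Hypothetical resources, each annotated with ontology concepts.
RESOURCES: Dict[str, Set[str]] = {
    "CLL tissue bank": {"chronic lymphocytic leukemia"},
    "Lymphoma registry": {"lymphoma"},
    "Flow cytometry core": {"leukemia"},
    "Cardiology imaging core": {"heart disease"},
}


def descendants(concept: str) -> Set[str]:
    """Return the concept plus every concept below it in the hierarchy."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in IS_A.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def semantic_search(concept: str) -> List[str]:
    """Match resources annotated with the query concept or any narrower one."""
    wanted = descendants(concept)
    return sorted(name for name, tags in RESOURCES.items() if tags & wanted)


print(semantic_search("hematologic malignancy"))
```

A plain keyword search for "hematologic malignancy" would match none of these records; anchoring the query in the hierarchy is what links otherwise unconnected sources.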
High-level System Architecture
Query Performance Optimization
Presentation Layer
Knowledge Base Growth (2012)
[Figure: Cumulative Total Resources by month, Jan-12 through Dec-12. x-axis: Months - 2012; y-axis: Log (Monthly Cumulative Resource Count), ranging from 3 to 4.4]
Managing Knowledge Base Growth
Hypothesis Generation: TOKEN Knowledge Synthesis Platform
Putting Conceptual Knowledge to Work: Constructive Induction (CI) & Hypothesis Generation
Conceptual Knowledge Constructs (CKCs)
• Conceptual knowledge-anchored concepts + relationships
• Higher order constructs (multiple intermediate concepts)
• Controls for concept granularity (search depth)
• Basis for inference of hypotheses concerning relationships between data elements
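The bullets above can be read as a bounded path search over an ontology graph: a CKC links two concepts through zero or more intermediate concepts, and the search depth limits how many relationships may be traversed. The sketch below shows that reading only. The graph, its relationships, and the function name `find_ckcs` are assumptions for illustration and do not reproduce the TOKEn algorithm or any result from the study.

```python
from typing import Dict, List, Tuple

# Toy ontology graph: concept -> directly related concepts (illustrative only;
# these links are invented and are not findings from the CLL data).
RELATED: Dict[str, List[str]] = {
    "chromosome loss": ["cytogenetic abnormality"],
    "cytogenetic abnormality": ["chromosome loss", "molecular abnormality"],
    "molecular abnormality": ["cytogenetic abnormality", "treatment response"],
    "treatment response": ["molecular abnormality"],
}


def find_ckcs(start: str, goal: str, max_depth: int) -> List[Tuple[str, ...]]:
    """Enumerate simple paths from start to goal using at most max_depth edges."""
    results: List[Tuple[str, ...]] = []

    def walk(path: Tuple[str, ...]) -> None:
        node = path[-1]
        if node == goal:
            results.append(path)
            return
        if len(path) - 1 >= max_depth:   # depth budget used up
            return
        for nxt in RELATED.get(node, []):
            if nxt not in path:          # keep paths simple (no revisits)
                walk(path + (nxt,))

    walk((start,))
    return results


# Each returned path is a candidate hypothesis linking two data elements.
for ckc in find_ckcs("chromosome loss", "treatment response", max_depth=3):
    print(" -> ".join(ckc))
```

Lowering `max_depth` to 2 returns no paths for this pair, which shows how the depth control trades coverage against the number of intermediate concepts a reviewer has to assess.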
Experimental Context: CLL Research Consortium
NCI-funded Program/Project (P01)
– Translational research targeting Chronic Lymphocytic Leukemia (CLL)
– Established in 1999
– Cohort of over 6,000 patients
– Comprehensive phenotypic and bio-molecular data sets, as well as bio-specimens
– 8 participating sites
Informatics platform:
– Research networking
– Clinical trials management
– Correlative data management
– Bio-specimen management
Multi-part CI Evaluation Study in CLL
(1) Efficacy
CKC Evaluation
• 108 data elements
• 822 UMLS concepts
• 5800 CKCs
• 5 SMEs
• Random sample (250)
• 86% valid
• 90% “meaningful”
(2) Verification & Validation
Search depth controls
TOKEn browser
Automated lit. queries
• Random sample (50)
SME “gold standard”
• Support metric
Critical relationship
• support metric
• “meaningful”
• Significant correlation
(3) Mining Domain Literature
Mining CLL literature
• Medline, 2005-2008
Comparison
• Literature-based CKCs
• Ontology-based CKCs
Critical findings
• No overlap
• Differing granularity
• More timely (SMEs)
1. Payne PR, Borlawsky T, Kwok A, Dhaval R, Greaves A. Ontology-anchored Approaches to Conceptual Knowledge Discovery in a Multi-dimensional Research Data Repository. AMIA Translational Bioinformatics Summit Proc. 2008.
2. Payne PR, Borlawsky T, Kwok A, Greaves A. Supporting the Design of Translational Clinical Studies Through the Generation and Verification of Conceptual Knowledge-anchored Hypotheses. AMIA Annu Symp Proc. 2008.
3. Payne PR, Borlawsky T, Lele O, James S, Greaves AW. The TOKEN Project: Knowledge Synthesis for in-silico Science. Journal of the American Medical Informatics Association (JAMIA). 2011.
CKC Visualization
[Figure: TOKEn CKC Network: CLL Research Consortium Metadata. Labeled clusters: Cytogenetic & Chromosomal Abnormalities; Bio-molecular Products; Hematologic Malignancies; Bone Marrow Morphology; Tissues of Origin; Solid Tumors; Myelogenous Malignancies]
[Figure: TOKEn CKC Network: Semantic Partitions. Labeled partitions: Cytogenetic Abnormalities; Treatment Response; Bone Marrow Morphology; Lymphomas; Leukemias; Chromosome Loss; Laboratory Findings; Protein Expression; Molecular Abnormalities; Tissues of Origin]
Applying Conceptual Knowledge: Building Knowledge-Based Systems
Payne PR et al. Translational informatics: enabling high-throughput research paradigms. In: Physiol. Genomics 39: 131-140, 2009
Knowledge-based Systems: Replicating Expert Performance
Adapted from Gaines and Shaw, “Knowledge Acquisition Tools Based On Personal Construct Psychology”, 1993
The Importance of Knowledge-based Systems Engineering is Amplified by Our Increased Focus on Big Data
• Volume
• Velocity
• Variability
• Scalability
• Extensibility
• Reproducibility
• Multi-dimensional data, information, and knowledge integration
Moving beyond the “hype cycle” and solving real world problems
Over $100M investment by NIH, including the creation of centers of excellence
Acknowledgements
Laboratory for Knowledge Based Applications and Systems Engineering (KBASE)
Collaborators:
• Peter J. Embi, MD, MS
• Albert M. Lai, PhD
• Kun Huang, PhD
• Po-Yin Yen, RN, PhD
• Yang Xiang, PhD
• Marcelo Lopetegui, MD
• Tara Borlawsky-Payne, MA
• Omkar Lele, MS, MBA
• Marjorie Kelley
• William Stephens
• Arka Pattanayak
• Caryn Roth
• Andrew Greaves
Funding:
• NCI: R01CA134232, R01CA107106, P01CA081534, P50CA140158, P30CA016058
• NCATS: U54RR024384
• NLM: R01LM009533, T15LM011270
• AHRQ: R01HS019908
• Rockefeller Philanthropy Associates
• Academy Health – EDM Forum
Thank you for your time and attention!
• [email protected]
• http://go.osu.edu/payne