2013 Summit on Clinical Research Informatics
Characterization of the Biomedical Query Mediation Process
Gregory W. Hruby, MA, Mary Regina Boland, MA, James J. Cimino, MD, Junfeng Gao, PhD, Adam B. Wilcox, PhD,
Julia Hirschberg, PhD, Chunhua Weng, PhD
Department of Biomedical Informatics, Columbia University, New York City
Outline
§ Introduction
• Problem: data access bottleneck
• Limitations of related work
• Study goal: understanding of the query mediation process
§ Data
§ Methods
§ Results
§ Conclusions
Introduction
§ Between January 2005 and December 2011, the Columbia Urology Department published a total of 244 manuscripts
§ 244 publications · 18 clinical trials · 41 basic science · 123 retrospective outcomes · 61 literature reviews
§ Retrospective outcomes work accounts for 51% of Columbia Urology’s contribution to the medical literature
Data access bottleneck
§ The accelerated adoption of electronic health records (EHRs)
• Implies that massive amounts of clinical data will (maybe) be available in electronic form
• The Electronic Data Methods (EDM) Forum, an AHRQ working group, states plainly that “big data” from the EHR is indispensable for comparative effectiveness research (CER)
§ Thus, expanding researchers' access to clinical data is an important priority for accelerating clinical and translational research
Limitations of related work
Current methods providing EHR data access
§ One mechanism to relieve the burden on the query analyst has been the development of information retrieval tools
• i2b2, SHRINE, VISAGE, STRIDE, RedX, etc.
The mediated query
• The more common approach to data access involves a mediated process, or query negotiation
§ This process involves an unstructured interview between a data analyst and the clinical investigator
§ The query analyst represents system knowledge, and the clinical researcher is the domain knowledge expert
§ Inherently, the lack of a common vocabulary can lead to uncertainty and vagueness throughout the conversation
Long-term goal
§ Toward a structured reference interview
§ Model and automate the structured reference interview
Short-term goal
§ This study attempts to decompose and characterize the negotiation space between the query expert (QE) and the medical researcher (MR)
Key Players
[Figure: key players in query mediation. Labels: QE, MR, Data, Urology Research, CDR Informatician, Principal Investigator, Statistician]
Data
§ Between July 2011 and January 2012, we recorded and transcribed 31 discussions between one query expert (QE) and eight medical researchers (MRs)
§ Example of 5 dialogue acts
QE: Alright. So we're going to be talking about your study so I guess briefly describe to me what you want to do.
MR: So, I haven't really put much thought into it, I just talked with a guy and he suggested that he had talked with umm a pathologist and with other urologists and it would be like very, very interesting to see like after cystectomies see if the urethra was involved
QE: Uh huh
MR: Umm because that could umm like possibly umm affect you know the outcomes of like long term outcomes of the of the like complications and overall prognosis, that's what he told me. But I haven't like
QE: So we're looking at the effect of urethral involvement, urethral or ureteral?
Methods: Annotation schema
§ Ten randomly selected projects were used to develop a dialogue act classification schema
§ Derived common tasks of dialogue acts
§ Grouped common tasks
§ The schema was iteratively improved on sample transcripts
Results: Schema
1. State the Problem (e.g., “Alright. So we're going to be talking about your study so I guess briefly describe to me what you want to do.”)
2. Explain the Clinical Process (e.g., “And then they are diagnosed with cancer after the image?”)
3. Locate Data Elements in EHRs (e.g., “You will have to look in the operative note.”)
4. Discuss Study Design (e.g., “Because we want to exclude any disease that could potentially have an effect on the GFR.”)
5. Clarify Research Workflow (e.g., “It's gonna be rare. So you're probably gonna have to update it as well.”)
6. Explain Data Results to Researchers (e.g., “So follow-up is last time known alive. So this is corresponding to overall survival information.”)
7. Review IRB and Privacy Policies (e.g., “It is expedited because it is de-identified.”)
8. Confirm Completed Process (e.g., “Alright. I think we have enough information.”)
Results: Schema
2. Explain the Clinical Process (e.g., “And then they are diagnosed with cancer after the image?”)
   1. Patient Demographics
   2. Temporal Aspect of Clinical Process
      1. Initial Diagnosis of Disease
      2. Primary Treatment of Disease
      3. Follow-up/Surveillance of Disease
      4. Salvage Treatment of Disease
   3. Laboratory Tests
   4. Radiographical Studies
   5. Clinical Findings
      1. Disease Confounders and Comorbidities
      2. Social History
      3. Family History
      4. Clinical Stage/Risk Assessment/Disease Status of Diagnosis
      5. Survival: Disease-Specific and Overall Survival
   6. Surgical Procedure
   7. Pathology
   8. Medical Therapy
   9. Radiation Therapy
   10. Other Treatment
   11. Treatment Toxicities, Complications, and Adverse Events
Methods: Dialogue act annotation
§ Three raters independently annotated all 3160 dialogue acts
  § GH – 9174 codes
  § MB – 8703 codes
  § JG – 8530 codes
§ Inter-rater agreement assessment
  § Kappa score 0.61
§ Rater disagreement was resolved with consensus
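The slides report a kappa score of 0.61 but do not say which kappa variant was used for the three raters, or how multiple codes per dialogue act were handled. As a minimal sketch of the general idea, the function below computes Cohen's kappa for one pair of raters, assuming each rater assigns exactly one top-level class (1 to 8) per dialogue act. The function name and the example labels are hypothetical and are not taken from the study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who each assign one label per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of six dialogue acts (schema classes 1 to 8).
rater_one = [1, 2, 2, 3, 4, 8]
rater_two = [1, 2, 3, 3, 4, 8]
print(round(cohen_kappa(rater_one, rater_two), 2))
```

With three raters, one option is to average the pairwise values; another is a multi-rater statistic such as Fleiss' kappa. Either choice here would be an assumption about the study's method.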
Methods: Data analysis
§ Consensus annotations used for further analysis
§ Consensus resulted in 8444 codes used to describe the negotiation space
§ Data normalization
Methods: Data analysis
[Figure: data normalization. Labels: Conversation A, Conversation B, Normalized; dialogue acts DA 1 to DA 5 and DA 1 to DA 3]
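The slides do not spell out how conversations of different lengths were normalized. One plausible reading, sketched here as an assumption, is that each conversation is rescaled onto a fixed number of positions so that sessions with different numbers of dialogue acts can be compared. The function name, the proportional mapping, and the example codes are all hypothetical.

```python
def normalize_conversation(acts, n_bins):
    """Map a sequence of dialogue act codes onto n_bins normalized positions.

    Each act keeps its relative position in the conversation, so the total
    number of codes is preserved. Some bins may hold several codes, and
    others may stay empty when the conversation is shorter than n_bins.
    """
    if n_bins <= 0:
        raise ValueError("n_bins must be positive")
    bins = [[] for _ in range(n_bins)]
    for i, code in enumerate(acts):
        # Proportional mapping of position i (out of len(acts)) onto a bin index.
        bins[i * n_bins // len(acts)].append(code)
    return bins


# Hypothetical conversations coded with the top-level schema classes.
conversation_a = [1, 2, 2, 4, 8]  # five dialogue acts
conversation_b = [1, 2, 8]        # three dialogue acts
print(normalize_conversation(conversation_a, 3))
print(normalize_conversation(conversation_b, 3))
```

After this step both conversations occupy the same three positions, which is what makes position-wise comparison possible.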
Methods: Data analysis
§ Data aggregation
§ DAs were summed over all the normalized conversations
§ Descriptive statistics and theme river visualizations
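The aggregation step can be sketched as a per-position count of codes across all normalized conversations, which is the kind of table a theme river plot is drawn from. This is a minimal illustration, assuming each normalized conversation is a list of positions and each position holds a list of schema class codes. The function name and the sample data are hypothetical.

```python
from collections import Counter


def aggregate_conversations(conversations):
    """Sum dialogue act codes per normalized position over all conversations.

    Returns one Counter per position, mapping schema class to its count.
    """
    if not conversations:
        return []
    n_bins = len(conversations[0])
    if any(len(conv) != n_bins for conv in conversations):
        raise ValueError("all conversations must share the same number of positions")
    totals = [Counter() for _ in range(n_bins)]
    for conv in conversations:
        for position, codes in enumerate(conv):
            totals[position].update(codes)
    return totals


# Two hypothetical conversations, already normalized to three positions.
normalized = [
    [[1, 2], [2], [8]],
    [[1], [2, 3], [8]],
]
for position, counts in enumerate(aggregate_conversations(normalized), start=1):
    print(position, dict(counts))
```

Each row of the result gives the height of every class's band at that position on the x-axis.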
Results: Theme river
[Figure: theme river of all eight dialogue act classes. X-axis: Conversation Act (1–133); y-axis: Count of Codes (0–80). Series: 1.0 State the Problem; 2.0 Explain the Clinical Process; 3.0 Locate Data Elements in EHRs; 4.0 Discuss Study Design; 5.0 Clarify Research Workflow; 6.0 Explain Data Results to Researchers; 7.0 Review IRB and Privacy Policies; 8.0 Confirm Completed Process]
Results: Theme river
[Figure: 2.0 Explain the Clinical Process, with trend line. X-axis: Conversation Acts in a Normalized Session (1–133); y-axis: Count of Codes (0–50)]
Results: Theme river
[Figure: 4.0 Discuss Study Design and 5.0 Clarify Research Workflow, each with a trend line. X-axis: Conversation Acts in a Normalized Session (1–133); y-axis: Count of Codes (0–18)]
Conclusions
§ This study contributes an early understanding of the mediated query process
§ The query negotiation space is an iterative process necessary to reach an understanding
§ Query mediation represents a process-based needs assessment and clarification
Acknowledgments
The research was supported by grants:
§ R01 LM009886, “Bridging the semantic gap between research eligibility criteria and clinical data,” from the National Library of Medicine (PI: Weng)
§ 5T15LM007079, Columbia University Biomedical Informatics Training Program (PI: Hripcsak)