Upload
jared56
View
794
Download
1
Embed Size (px)
DESCRIPTION
Citation preview
Rapid Identification of Pneumonias in BioSense Data Using Radiology Text Reports
Armen Asatryan, MD, MPH,1 Haobo Ma, MD, MS,1 Roseanne English, BS,2 Jerome Tokars, MD, MPH2
1Science Applications International Corporation 2Centers for Disease Control and Prevention
Atlanta, GA
The findings and conclusions in this presentation are those of the author(s) and do not necessarily represent the views of the
Centers for Disease Control and Prevention and/or Science Applications International Corporation
Background -- BioSense
• BioSense started in 2003 as CDC’s electronic biosurveillance system
• Purposes: early detection and situational awareness • Data sources: DoD Medical Treatment Facilities, VA
Medical Centers, LabCorp, non-federal acute care hospitals
• Currently, 371 acute care hospitals send ≥1 type of data to BioSense, 42 send radiology text reports
• Available data: facility, patient disposition, visit dates, patient demographics, microbiology lab order/results, radiology orders/results, hospital pharmacy orders, chief complaint, ICD-9 codes, vital signs (temp, BP) in the ER
Background -- Pneumonia
• Pneumonia is a major contributor to morbidity and mortality– 1.4 M pneumonia hospital discharges in the US in
2003 (American Lung Association, www.lungusa.org)– Pneumonia and influenza are 7th leading cause of
death in the US– Total cost to the US economy estimated at $37.5
billion in 2004 (ibid)• Early detection of pneumonia is important as
several Category A bioterrorism agent-related diseases as well as influenza can present as pneumonia
Previous Research: Administrative Data for Pneumonia Detection
• ICD-9 code- and DRG-based algorithms have Sn 48-66%, Sp >99%, and PPV 73-81% for identifying pneumonia when compared to clinical (physician) reference (Aronsky et al, 2005).
• Bayesian network based algorithm for identifying pneumonia (ICD-9 code as a reference): AUC 0.93 (CI 0.907, 0.948), at fixed Sn 95%, Sp was 69%, PPV 7.3%, NPV 99.8% (Aronsky et al, 2000).
Objective
• Objective of study: find keywords in radiology reports that are most highly associated with pneumonia diagnosis
• Radiology text reports may be available with 1-3 days, whereas final diagnosis may take 1-3 weeks– In our data set, radiology text reports were available,
on average, 9 days earlier than ICD-9 discharge codes
Methods• Studied radiology text reports from 13 hospitals sending
data to BioSense• Study period: 2/2006 through 1/2007• Included inpatient, outpatient, and ER patients• Keyword-based text parsing SAS® program using
RegEx and accounting for negations and double negations (e.g. pneumonia cannot be excluded)
• Search terms: airspace, consolidation, density, infiltrate, opacity, and pneumonia/pneumonitis
• Logistic regression to evaluate the independent association of the keywords with a final diagnosis of pneumonia (ICD-9 codes 480--486).
Example of a CXR showing pneumonia
Example of a text radiology report with keywords underlined
• AP and lateral chest film on 01/01/2007 at 12:00 AM. No comparison films available. Clinical history: 58 years old black female with cough and fever. Findings: Right lower lobe infiltrate, consistent with pneumonia. Left lung, heart, mediastinum, and pulmonary vasculature unremarkable. No pneumothorax. No pleural effusion. Impression: Right lower lobe pneumonia.
Validation of Text Parsing• Sample dataset: 400 reports (300 with any
of the keywords, including negations, and 100 without)
• Gold-standard for text parsing validation: manual review of the radiology reports
• Manual review for the search keywords: “pneumonia” 77, others 63, none 260
• Text parsing: sensitivity 98.5%, specificity 98.6%
Results• 67,714 reports of chest X-rays performed
for 42,510 hospital visits of 34,883 patients
• Male : female ratio = 1 : 1
• Age– Range: <1 day (newborn) to 105 years old– Mean: 55 years old
Results• Among all 67,714 radiology reports
– ICD-9 Dx of Pneumonia: 8,631 (12.8%)– Keywords found (some reports had > 1
keyword)• Opacity 22.1%• Infiltrate 15%• Density 8.2%• Pneumonia 6%• Consolidation 5.5%• Airspace disease 1.1%
Logistic Regression Model to Evaluate Keywords
• Reports may have >1 keyword in many combinations:– Right lower lobe infiltrate, consistent with pneumonia– Right lower lobe density, suggestive of airspace
disease or mass
• Model shows independent association between each keyword and final diagnosis
• Model contained 6 dichotomous indicator variables (i.e., 0=keyword not present in report, 1=keyword present)
Logistic Regression ModelPredictors of Final Diagnosis of Pneumonia
Keyword
Odds Ratio* 95% Confidence Interval
Infiltrate 3.5 3.3-3.6
Pneumonia 3.0 2.7-3.2
Airspace 2.7 2.3-3.3
Consolidation 2.7 2.5-2.9
Opacity 2.0 1.9-2.1
Density 1.2 1.1-1.3
* Reference group is no mention of the keyword, i.e., reports mentioning “infiltrate” are 3.5 times more likely to have a final ICD-9 diagnosis of pneumonia than reports without this keyword
Keyword Index
Pneumonia diagnosis
No pneumonia diagnosis
One or more keyword
6,186 18,180
No keywords 2,445 40,903
All 8,631 59,083
Results: Keyword Index
• One or more of the 5 keywords was found in– 6,186/8,631 (71.7%) of reports of patients
with a pneumonia diagnosis– 18,180/59,083 (30.8%) of patients without a
pneumonia diagnosis– Sensitivity 72%, specificity 69%, kappa=0.23
Limitations / Strengths
• Limitations– Study included a small number of facilities in a limited
geographical area– No formal data verification performed– Based on ICD-9 coded diagnosis, rather than formal
clinical case definition– Variations in readings between radiologists (style,
preferences, quality)• Strengths
– Used empirical process to determine keywords associated with pneumonia
– Used clinically rich data
Discussion• Computerized text parsing can successfully identify
keywords in radiology text reports• Five keywords (infiltrate, pneumonia, airspace,
consolidation, and opacity) were independently associated with a final ICD-9 coded diagnosis of pneumonia
• Reasonable sensitivity for the 5 keywords and ICD-9 Dx, but low kappa. Reasons:– Inadequacy of coded diagnoses– Clinical Dx in the presence of a negative CXR– Either ICD-9 code or CXR may indicate pneumonia, but neither
is definitive– Possible missing CXRs
Radiology Text Report Processed by Natural Language Processing
• History Section – HPI AP PORTABLE CHEST AT 1100 HOURS HISTORY:
Shortness of breath. • Diagnostic Tests Section
– There are chronic and degenerative changes. There is marked chronic lung disease. Extensive pulmonary opacity is seen in the right mid and right lower lung zone compatible with pneumonia. This is superimposed on diffuse chronic lung disease. A focal area of scarring and/ or discoid atelectasis is seen at the medial left lung base. A central line enters from the right and terminates in the superior vena cava.
• Assessment Section – IMPRESSION: Marked chronic lung disease. Interval
development of a large right mid and right lower lung zone opacity compatible with pneumonia.
Future Steps
• Study additional facilities• Evaluate radiology text keywords using a formal
case definition along with chart review as the gold standard
• Evaluate the value of natural language processors in finding additional keywords and implementing more sophisticated concepts such as degrees of certainty, presence of comorbidities, and earliness of detection