Ohio Center of Excellence in Knowledge-Enabled Computing
A New CPXR-Based Logistic Regression Method and Clinical Prognostic Modeling Results Using the Method on Traumatic Brain Injury

Vahid Taslimitehrani, Guozhu Dong
Kno.e.sis Center, Department of Computer Science and Engineering
Wright State University, Dayton, OH
Outline
• Motivation and background
• Preliminaries
  – Contrast pattern mining
  – Logistic regression
• CPXR(Log)
• TBI data
• Results of CPXR(Log) on TBI
• Conclusion
• References
Motivation and Background
• CPXR(Log): accurate and informative prognostic models
  – Prognostic models are central to medicine. [Steyerberg, 2009]
  – They facilitate physicians' decision making on patient treatment planning, screening, etc.
  – They help us understand disease behavior, including identifying new biomarkers.
  – The number of articles listed in PubMed with "prediction model" in the title in 2012 is 7 times that in 2000. [PubMed]
Motivation and Background
• CPXR(Log): a powerful new generic logistic regression method
  – Logistic regression is one of the most popular approaches for building clinical prediction models. [Steyerberg, 2009]
  – Logistic regression models are desirable because:
    • they have an interpretable representation;
    • they are probability based;
    • they are flexible in the predictor variables they accept (categorical and numerical).
Motivation and Background
• Traumatic brain injury
  – One of the leading causes of death and disability worldwide.
  – Roughly 1.5 million deaths worldwide annually. [Perel, 2006]
  – $76.5 billion in direct and indirect costs in the US in 2010. [www.cdc.gov]
  – Early, accurate prognostic models built from admission-time data alone let physicians make time-critical clinical decisions.
Challenges in clinical modeling
• Accuracy of the clinical prediction models
• Ease of interpreting clinical prediction models
  – to explain medical decisions to the patient
  – to identify important risk factors
• Avoiding overfitting, to make clinical prediction models more generalizable
• Early decision making
• Ability to capture heterogeneous patient-group behavior
CPXR works well by using several (pattern, local model) pairs

Different subpopulations need different prediction models; using just one prediction function does not work well. This is not an extreme case; it happens very often.
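The failure mode above can be sketched on synthetic data (an illustration of the idea, not an experiment from the talk): two subpopulations whose predictor-response relationships run in opposite directions. Any single global decision rule stays near chance, while one rule per subpopulation fits perfectly. The groups, thresholds, and rules below are made up.

```python
# Two subpopulations with opposite predictor-response relationships:
# group A's outcome rises with x, group B's falls with x.
import random

random.seed(0)
data = []  # (group, x, y)
for _ in range(500):
    x = random.uniform(-1, 1)
    data.append(("A", x, int(x > 0)))   # group A: y = 1 when x > 0
    x = random.uniform(-1, 1)
    data.append(("B", x, int(x < 0)))   # group B: y = 1 when x < 0

def accuracy(rows, rule):
    return sum(rule(g, x) == y for g, x, y in rows) / len(rows)

# Best single global threshold rule on x, ignoring the group:
best_global = max(
    accuracy(data, lambda g, x, t=t, s=s: int((x > t) == s))
    for t in [i / 50 - 1 for i in range(100)]
    for s in (True, False)
)

# Pattern-specific rules, one local "model" per subpopulation:
local = accuracy(data, lambda g, x: int(x > 0) if g == "A" else int(x < 0))

print(round(best_global, 2))  # near 0.5: one global rule cannot fit both groups
print(local)                  # 1.0: per-subpopulation rules fit perfectly
```

The same effect is what motivates pairing each contrast pattern with its own local model.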
How is CPXR(Log) different from other classifiers?

• CPXR introduced the idea of
  – using patterns to logically characterize different subpopulations of the data,
  – using local regression models to represent the predictor-response relationship within each subpopulation, and
  – choosing a pattern only if its local model is very accurate. [Dong, 2014]
• CPXR(Log)
  – can capture diversified/heterogeneous behavior,
  – is more generalizable, and
  – is less prone to overfitting than other classifiers.
• CPXR(Log) is more accurate than other classifiers such as SVM and random forest.
Traditional classification vs CPXR
Traditional classification: training data goes into a classification engine, which outputs a single classifier (model).

CPXR: training data goes into a classification engine that first builds a baseline model, splits the data into large-error and small-error instances, and then builds and selects contrast patterns (CPs) and local models, producing the pairs (Pattern 1, Model 1), (Pattern 2, Model 2), ..., (Pattern k, Model k).
CPXR(Log) – PXR concept
• Definition: Let D be training data for regression. Let f0 be a regression model built on D, which we will call the baseline model on D. A pattern aided regression (PXR) model is a tuple M = (PS, LM, fd), where PS = {P1, ..., Pk} is the pattern set of M, the fi in LM are local regression models of the Pi's, and fd is the default regression model. We define the regression model fM of M as

  fM(x) = Σ_{Pi ∈ πx} wi fi(x) / Σ_{Pi ∈ πx} wi   if πx ≠ ∅,   and fM(x) = fd(x) otherwise,

  for each instance x, where πx = {Pi ∈ PS : x matches Pi} and the wi are the patterns' weights.
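A minimal sketch of the PXR prediction rule just defined: average the matching patterns' local models (weighted), and fall back to the default model when no pattern matches. The function names and the constant toy models below are illustrative assumptions.

```python
# PXR prediction: weighted average of the local models whose patterns
# match x, or the default model's prediction when no pattern matches.
def pxr_predict(x, patterns, local_models, weights, default_model):
    matched = [i for i, p in enumerate(patterns) if p(x)]
    if not matched:                       # no pattern matches: use default
        return default_model(x)
    total = sum(weights[i] for i in matched)
    return sum(weights[i] * local_models[i](x) for i in matched) / total

# Toy usage: two patterns over a one-field instance, constant local models.
patterns = [lambda x: x["age"] >= 65, lambda x: x["age"] < 65]
models = [lambda x: 0.8, lambda x: 0.2]
weights = [1.0, 1.0]
default = lambda x: 0.5

print(pxr_predict({"age": 70}, patterns, models, weights, default))  # 0.8
```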
Preliminaries: Contrast Patterns
• A toy example:

  TID | Class | Items
  1   |       | b, d, e, g, i
  2   |       | b, c, e, g, i
  3   |       | a, c, e, g, j
  4   |       | a, c, e, h, j
  5   |       | b, d, f, g, i

• Given a support-ratio threshold such as 2, a pattern whose support ratio between the two classes reaches the threshold is a contrast pattern.
• Details: We only consider one minimal generator pattern for each "equivalence class" of contrast patterns.
Quality measures
• CPXR(Log) needs to efficiently extract a desirable pattern set from a huge search space of potential pattern sets.
• Definition: The average residual reduction (arr) of a pattern P w.r.t. a model f0 and a dataset D is

  arr(P) = Σ_{x ∈ mds(P)} (r_f0(x) − r_fP(x)) / |mds(P)|,

  where fP is P's local model, mds(P) is P's matching dataset, and r_f(x) denotes the absolute residual of model f on instance x.
• Definition: The total residual reduction (trr) of a pattern set PS w.r.t. a model f0 and a dataset D is

  trr(PS) = Σ_{x ∈ mds(PS)} (r_f0(x) − r_fM(x)) / Σ_{x ∈ D} r_f0(x),

  where mds(PS) = ∪_{P ∈ PS} mds(P) and fM is the PXR model built from PS.
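The arr definition above translates directly into code; residuals here are absolute prediction errors, and the pattern and models in the usage example are toy stand-ins.

```python
# Average residual reduction of a pattern: the mean improvement of the
# local model's absolute residual over the baseline's, on matching data.
def arr(pattern, local_model, baseline, data):
    mds = [(x, y) for x, y in data if pattern(x)]  # matching dataset
    if not mds:
        return 0.0
    reduction = sum(abs(baseline(x) - y) - abs(local_model(x) - y)
                    for x, y in mds)
    return reduction / len(mds)

# Toy usage: baseline always predicts 0.5; local model is exact on x >= 0.
data = [(-1, 0), (1, 1), (2, 1)]
p = lambda x: x >= 0
print(arr(p, lambda x: 1.0, lambda x: 0.5, data))  # 0.5
```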
CPXR(Log) algorithm -- outline
• First step: split the training dataset D into two classes, LE and SE.
  – LE: instances of D where the baseline model makes a large error.
  – SE: instances of D where the baseline model makes a small error.
• Second step: extract all contrast patterns on LE satisfying a minimum support threshold.
• Third step: search for a small set of patterns that maximizes error reduction, and use that set to build a model.

• Notes:
  – Each pattern is associated with a local regression model built on the pattern's matching data.
  – Using a pattern and its associated local regression model is a flexible way to represent one predictor-response relationship.
  – Different (pattern, model) pairs represent highly different predictor-response relationships.
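The first step of the outline can be sketched as follows; the fixed 50/50 split fraction is an illustrative assumption (CPXR(Log) chooses the cut to optimize an objective).

```python
# Step 1 of the outline: rank instances by the baseline model's absolute
# error and split them into a large-error set LE and a small-error set SE.
def split_by_error(data, baseline, le_fraction=0.5):
    ranked = sorted(data, key=lambda xy: abs(baseline(xy[0]) - xy[1]),
                    reverse=True)
    k = int(len(ranked) * le_fraction)
    return ranked[:k], ranked[k:]       # (LE, SE)

# Toy usage: the baseline predicts 0.5 everywhere.
data = [(0, 0.5), (1, 0.9), (2, 0.1), (3, 0.5)]
LE, SE = split_by_error(data, lambda x: 0.5)
print(sorted(x for x, _ in LE))  # [1, 2]: the two worst-fit instances
```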
CPXR(Log) – details (1)
• Inputs:
  – training data D
  – baseline model f0
  – a ratio ρ to partition D into LE and SE
  – a minimum support threshold minSup on contrast patterns
• Output:
  – a PXR model M

  Let r(x) denote f0's error on instance x;
  Determine a cutoff κ on r to minimize the partition objective; let LE = {x ∈ D : r(x) ≥ κ} and SE = D \ LE;
  Discretize each numerical variable using entropy-based binning;
  Extract all contrast patterns for minSup in the class LE;
CPXR(Log) – details (2)
  For each contrast pattern P, build the local regression model fP for the data in mds(P);
  Let PS = {P*}, where P* is the contrast pattern with the highest arr;
  Let fd be the regression model trained from the remaining (unmatched) data;
  Return M = (PS, LM, fd);
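The pattern-set construction can be sketched as a greedy search: seed with the best single pattern and keep adding patterns while the score improves. The additive toy score below stands in for trr and is an assumption, not the paper's exact search procedure.

```python
# Greedy pattern-set selection: start empty, repeatedly add the candidate
# pattern that most improves the score, and stop when nothing helps.
def greedy_select(patterns, score, max_k=5):
    chosen, best = [], 0
    while len(chosen) < max_k:
        gains = [(score(chosen + [p]), p) for p in patterns if p not in chosen]
        if not gains:
            break
        top, p = max(gains, key=lambda gp: gp[0])
        if top <= best:                  # no candidate improves the score
            break
        chosen, best = chosen + [p], top
    return chosen, best

# Toy usage: three candidate patterns with a made-up additive score.
vals = {"P1": 4, "P2": 3, "P3": 0}
chosen, best = greedy_select(list(vals), lambda ps: sum(vals[p] for p in ps))
print(chosen, best)  # ['P1', 'P2'] 7
```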
TBI data
• The TBI dataset is a collection of international and US Tirilazad trials.
• 2159 instances. [Steyerberg, 2008]
• 15 numerical and categorical predictor variables.
• Missing values were treated using multiple imputation.
• The outcome variable is the Glasgow Outcome Scale: GOS 1 (dead), ..., GOS 5 (good recovery).
• This study used two dichotomized versions of GOS: mortality vs survival (GOS 1 vs GOS 2-5), and unfavorable vs favorable (GOS 1-3 vs GOS 4-5).

  Category                 | Predictor variables
  Basic                    | Cause of injury, age, GCS motor score, pupil reactivity
  Computed tomography (CT) | Hypoxia, hypotension, Marshall CT, tSAH, eDH, compressed cistern, midline shift more than 5 mm
  Lab                      | Glucose, pH, sodium, Hb
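The two GOS dichotomizations described above are simple to state in code:

```python
# Collapse the 5-level Glasgow Outcome Scale into the two binary
# outcomes used in this study.
def dichotomize(gos):
    return {
        "mortality": int(gos == 1),    # GOS 1 (dead) vs GOS 2-5
        "unfavorable": int(gos <= 3),  # GOS 1-3 vs GOS 4-5 (favorable)
    }

print(dichotomize(1))  # {'mortality': 1, 'unfavorable': 1}
print(dichotomize(4))  # {'mortality': 0, 'unfavorable': 0}
```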
Results – Performance of SLogR and CPXR(Log) on Mortality models
  Model        | SLogR (Spec / Sens / F1 / AUC) | CPXR(Log) (Spec / Sens / F1 / AUC)
  Basic        | 0.95 / 0.18 / 0.27 / 0.77      | 0.96 / 0.18 / 0.28 / 0.80
  Basic+CT     | 0.95 / 0.32 / 0.42 / 0.80      | 0.96 / 0.42 / 0.53 / 0.88
  Basic+CT+Lab | 0.94 / 0.36 / 0.46 / 0.80      | 0.97 / 0.46 / 0.58 / 0.92
CPXR(Log) is, as expected, more accurate than standard logistic regression.
Results – Performance of SLogR and CPXR(Log) on Unfavorable models
  Model        | SLogR (Spec / Sens / F1 / AUC) | CPXR(Log) (Spec / Sens / F1 / AUC)
  Basic        | 0.85 / 0.52 / 0.59 / 0.76      | 0.89 / 0.54 / 0.63 / 0.82
  Basic+CT     | 0.85 / 0.60 / 0.66 / 0.80      | 0.87 / 0.65 / 0.70 / 0.87
  Basic+CT+Lab | 0.84 / 0.61 / 0.66 / 0.81      | 0.91 / 0.72 / 0.76 / 0.93
Results – Impact of adding more variables on AUC
AUC improvement when more variables are used by CPXR(Log) and SLogR:

  Variable set change   | Mortality: CPXR(Log) | SLogR | Unfavorable: CPXR(Log) | SLogR
  Basic → Basic+CT      | 10%                  | 7.7%  | 6%                     | 5.2%
  Basic → Basic+CT+Lab  | 15%                  | 11.1% | 13.4%                  | 6.6%

AUC improvement of CPXR(Log) over SLogR:

  Outcome     | Basic | Basic+CT | Basic+CT+Lab
  Mortality   | 11.1% | 12.8%    | 15%
  Unfavorable | 7.9%  | 8.8%     | 14.8%
Results – ROC curves of Basic models
Results - ROC curves of (Basic + CT) models
Results - ROC curves of (Basic+CT+Lab) models
Results – Performance comparison
Comparing CPXR(Log) performance with:
• Logistic regression
• SVM
• Random forest
Example: patterns used by CPXR(Log) & Mortality (Basic+CT+Lab)
  Pattern                                                                  | arr | Cov
  (CT classification = III)                                                | 15% | 20%
  (CT classification = V) AND (midline shift) AND (0.56 < glucose <= 10.4) | 12% | 15%
  (No compressed cistern) AND (No midline shift) AND (7.22 < pH <= 7.45)   | 10% | 40%
  (10.77 < glucose <= 21.98) AND (134 < sodium <= 144)                     | 18% | 18%
  (No hypotension) AND (134 < sodium < 144) AND (10.55 < Hb <= 14.57) AND (with tSAH) | 19% | 20%
  (No tSAH) AND (134 < sodium <= 144) AND (10.77 < glucose <= 21.98) AND (No hypotension) AND (No midline shift) AND (One reactive pupil) | 19% | 20%
  (No tSAH) AND (One reactive pupil)                                       | 18% | 40%
Odds ratios
[Figure: odds ratios of the predictor variables for the pattern (CT classification = V) AND (midline shift) AND (0.56 < glucose <= 10.4)]
Residual reduction and example patient
Example patient:
• Age = 15 years
• Cause of injury = motorbike accident
• GCS motor score = 5 (no eye response)
• No reactive pupil
• No hypoxia
• No hypotension
• CT scan classification = V (mass lesion)
• No tSAH
• With ePDH
• Midline shift more than 5 mm
• Glucose = 9.06 mmol/l
• pH = 7.37
• Sodium = 141 mmol/l
• Hb = 14.4 g/dl
• Patient is dead.

Standard logistic regression predicted a 0.78 risk of survival for this patient!
[Figure: error distribution of the TBI dataset under SLogR]
This patient matches "pattern II", and CPXR(Log) predicted a 0.38 risk of survival.
[Figure: error distribution of the TBI dataset under CPXR(Log)]
Results – Box plot of RMSE reduction in CPXR
How much CPXR can reduce RMSE (root mean square error) across 50 datasets, compared to:
• Piecewise linear regression
• Support vector regression
• Bayesian additive regression trees
• Gradient boosting method
Results – Noise sensitivity and impact of the number of patterns
• How much do noisy datasets affect the performance of CPXR and other methods?
• The number of patterns is determined automatically by the method.
Conclusion
• We presented an effective new method, CPXR(Log), for logistic regression and for clinical predictive modeling.
• We showed that CPXR(Log) is more accurate than standard logistic regression and several other classification algorithms.
• We also presented CPXR(Log) models, including their patterns and local models, and new odds ratios of predictor variables.
References
• G. Dong, V. Taslimitehrani: Pattern-aided regression modeling and prediction model analysis. Tech report, CSE, Wright State University, 2014.
• E.W. Steyerberg: Clinical Prediction Models. Springer, 2009.
• P. Perel, P. Edwards, R. Wentz, I. Roberts: Systematic review of prognostic models in traumatic brain injury. BMC Medical Informatics and Decision Making, 6(1): 1-10, 2006.
• G. Dong, J. Li: Efficient mining of emerging patterns: Discovering trends and differences. In Proc. KDD, 43-52, 1999.
• E.W. Steyerberg, et al.: Predicting outcome after traumatic brain injury: development and international validation of prognostic scores based on admission characteristics. PLoS Medicine, 5(8): e165, 2008.
Preliminaries: Logistic Regression
• Regression modeling: predicting a response variable (output) from predictor variables (inputs).
• Logistic regression: the response variable is binary. For example:
  – "having the disease" or not
  – "mortal" or not
• Let X = (X1, ..., Xp) be a vector of predictor variables and Y be the binary response variable.
• The goal of logistic regression is to learn a function satisfying

  P(Y = 1 | X) = 1 / (1 + e^-(β0 + β1 X1 + ... + βp Xp)).

• Chi-square (χ²) is one of the goodness-of-fit measures for logistic regression.
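The model above, written out in code; the coefficients in the usage example are made up for illustration.

```python
import math

# P(Y = 1 | X): the sigmoid of a linear combination of the predictors.
def logistic(x, beta0, betas):
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

print(logistic([0.0, 0.0], 0.0, [1.0, 2.0]))  # 0.5 (z = 0)
```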
Preliminaries: Contrast Patterns
• An item is a single-variable condition of the form "A = a" (for a categorical variable) or an interval condition such as "a1 < A <= a2" (for a numerical variable).
• A pattern is a finite set of items.
• An instance X from dataset D is said to match a pattern P if X satisfies every item in P.
• Example: a pattern with TWO items: a lower-bound condition on Age together with "Diagnosed with high cholesterol = YES". One instance (patient ID = 1) matches this pattern.

  Patient ID | Age | BMI | Sys blood pressure | Diagnosed with high cholesterol | Diagnosed with heart failure (C)
  1          | 75  | 22  | 120                | YES                             | YES
  2          | 67  | 27  | 131                | NO                              | NO
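Matching as defined above is a conjunction over items. The rows below mirror the toy table; the age cutoff of 70 is an assumption (the slide's exact value did not survive extraction).

```python
# An instance matches a pattern iff it satisfies every item in it.
def matches(instance, pattern):
    return all(pred(instance[attr]) for attr, pred in pattern)

rows = [
    {"id": 1, "age": 75, "bmi": 22, "sbp": 120, "high_chol": "YES"},
    {"id": 2, "age": 67, "bmi": 27, "sbp": 131, "high_chol": "NO"},
]
# Two-item pattern: an age lower bound (assumed cutoff) plus high
# cholesterol = YES.
pattern = [("age", lambda a: a >= 70), ("high_chol", lambda c: c == "YES")]

print([r["id"] for r in rows if matches(r, pattern)])  # [1]
```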
Preliminaries: Contrast Patterns
• The matching data of pattern P in dataset D, mds(P, D), is the set of all instances in D matching P.
• The support of pattern P in D is supp(P, D) = |mds(P, D)| / |D|.
• Given two classes C1 and C2, the support ratio of pattern P from C2 to C1 is suppRatio(P) = supp(P, C1) / supp(P, C2).
• Given a threshold σ > 1, a contrast pattern (emerging pattern) of class C1 is a pattern P satisfying suppRatio(P) ≥ σ. [Dong, 1999]
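The definitions above translate directly into code; the classes and the threshold in the toy usage are made up.

```python
# Support of a pattern in a dataset, and support ratio between classes.
def supp(pattern, data):
    return sum(pattern(x) for x in data) / len(data)

def supp_ratio(pattern, c1, c2):
    s1, s2 = supp(pattern, c1), supp(pattern, c2)
    return float("inf") if s2 == 0 else s1 / s2

# Toy usage: "x > 3" is frequent in class c1 and rare in class c2.
c1, c2 = [4, 5, 6, 1], [1, 2, 3, 4]
p = lambda x: x > 3
ratio = supp_ratio(p, c1, c2)
print(ratio, ratio >= 2)  # 3.0 True -> contrast pattern of c1 at sigma = 2
```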