Ohio Center of Excellence in Knowledge-Enabled Computing
A New CPXR-Based Logistic Regression Method and Clinical Prognostic Modeling Results Using the Method on Traumatic Brain Injury

Vahid Taslimitehrani, Guozhu Dong
Kno.e.sis Center, Department of Computer Science and Engineering
Wright State University, Dayton, OH
Outline
• Motivation and background
• Preliminaries
  – Contrast pattern mining
  – Logistic regression
• CPXR(Log)
• TBI data
• Results of CPXR(Log) on TBI
• Conclusion
• References
Motivation and Background
• CPXR(Log): accurate and informative prognostic models
  – Prognostic models are central to medicine. [Steyerberg, 2009]
  – They facilitate physicians' decision making on patient treatment planning, screening, etc.
  – They help us understand disease behavior, including identifying new biomarkers.
  – The number of articles listed in PubMed with "prediction model" in the title in 2012 is 7 times that in 2000. [PubMed]
Motivation and Background
• CPXR(Log): a powerful new generic logistic regression method
  – Logistic regression is one of the most popular approaches for building clinical prediction models. [Steyerberg, 2009]
  – Logistic regression models are desirable because:
    • they have an interpretable representation;
    • they are probability based;
    • they are flexible in the predictor variables they accept (categorical and numerical).
Motivation and Background
• Traumatic brain injury
  – One of the leading causes of death and disability worldwide.
  – Roughly 1.5 million deaths worldwide annually. [Perel, 2006]
  – $76.5 billion in direct and indirect costs in the US in 2010. [www.cdc.gov]
  – Early, accurate prognostic models built from admission-time data alone let physicians make time-critical clinical decisions.
Challenges in clinical modeling
• Accuracy of the clinical prediction models
• Ease of interpreting clinical prediction models
  – to explain medical decisions to the patient
  – to identify important risk factors
• Avoiding overfitting, to make clinical prediction models more generalizable
• Early decision making
• Ability to capture heterogeneous patient-group behavior
CPXR works well by using several (pattern, local model) pairs

Different subpopulations need different prediction models; using just one prediction function does not work well. This is not an extreme case; it happens very often.
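The failure mode above can be sketched on synthetic data (an illustration of the idea, not an experiment from the talk): two subpopulations whose predictor-response relationships run in opposite directions. Any single global decision rule stays near chance, while one rule per subpopulation fits perfectly. The groups, thresholds, and rules below are made up.

```python
# Two subpopulations with opposite predictor-response relationships:
# group A's outcome rises with x, group B's falls with x.
import random

random.seed(0)
data = []  # (group, x, y)
for _ in range(500):
    x = random.uniform(-1, 1)
    data.append(("A", x, int(x > 0)))   # group A: y = 1 when x > 0
    x = random.uniform(-1, 1)
    data.append(("B", x, int(x < 0)))   # group B: y = 1 when x < 0

def accuracy(rows, rule):
    return sum(rule(g, x) == y for g, x, y in rows) / len(rows)

# Best single global threshold rule on x, ignoring the group:
best_global = max(
    accuracy(data, lambda g, x, t=t, s=s: int((x > t) == s))
    for t in [i / 50 - 1 for i in range(100)]
    for s in (True, False)
)

# Pattern-specific rules, one local "model" per subpopulation:
local = accuracy(data, lambda g, x: int(x > 0) if g == "A" else int(x < 0))

print(round(best_global, 2))  # near 0.5: one global rule cannot fit both groups
print(local)                  # 1.0: per-subpopulation rules fit perfectly
```

The same effect is what motivates pairing each contrast pattern with its own local model.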
How is CPXR(Log) different from other classifiers?

• CPXR introduced the idea of
  – using patterns to logically characterize different subpopulations of the data,
  – using local regression models to represent the predictor-response relationship within each subpopulation, and
  – choosing a pattern only if its local model is very accurate. [Dong, 2014]
• CPXR(Log)
  – can capture diversified/heterogeneous behavior,
  – is more generalizable, and
  – is less prone to overfitting than other classifiers.
• CPXR(Log) is more accurate than other classifiers such as SVM and random forest.
Traditional classification vs CPXR
Traditional classification: training data goes into a classification engine, which outputs a single classifier (model).

CPXR: training data goes into a classification engine that first builds a baseline model, splits the data into large-error and small-error instances, and then builds and selects contrast patterns (CPs) and local models, producing the pairs (Pattern 1, Model 1), (Pattern 2, Model 2), ..., (Pattern k, Model k).
CPXR(Log) – PXR concept
• Definition: Let D be training data for regression. Let f0 be a regression model built on D, which we will call the baseline model on D. A pattern aided regression (PXR) model is a tuple M = (PS, LM, fd), where PS = {P1, ..., Pk} is the pattern set of M, the fi in LM are local regression models of the Pi's, and fd is the default regression model. We define the regression model fM of M as

  fM(x) = Σ_{Pi ∈ πx} wi fi(x) / Σ_{Pi ∈ πx} wi   if πx ≠ ∅,   and fM(x) = fd(x) otherwise,

  for each instance x, where πx = {Pi ∈ PS : x matches Pi} and the wi are the patterns' weights.
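A minimal sketch of the PXR prediction rule just defined: average the matching patterns' local models (weighted), and fall back to the default model when no pattern matches. The function names and the constant toy models below are illustrative assumptions.

```python
# PXR prediction: weighted average of the local models whose patterns
# match x, or the default model's prediction when no pattern matches.
def pxr_predict(x, patterns, local_models, weights, default_model):
    matched = [i for i, p in enumerate(patterns) if p(x)]
    if not matched:                       # no pattern matches: use default
        return default_model(x)
    total = sum(weights[i] for i in matched)
    return sum(weights[i] * local_models[i](x) for i in matched) / total

# Toy usage: two patterns over a one-field instance, constant local models.
patterns = [lambda x: x["age"] >= 65, lambda x: x["age"] < 65]
models = [lambda x: 0.8, lambda x: 0.2]
weights = [1.0, 1.0]
default = lambda x: 0.5

print(pxr_predict({"age": 70}, patterns, models, weights, default))  # 0.8
```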
Preliminaries: Contrast Patterns
• A toy example:

  TID | Class | Items
  1   |       | b, d, e, g, i
  2   |       | b, c, e, g, i
  3   |       | a, c, e, g, j
  4   |       | a, c, e, h, j
  5   |       | b, d, f, g, i

• Given a support-ratio threshold such as 2, a pattern whose support ratio between the two classes reaches the threshold is a contrast pattern.
• Details: We only consider one minimal generator pattern for each "equivalence class" of contrast patterns.
Quality measures
• CPXR(Log) needs to efficiently extract a desirable pattern set from a huge search space of potential pattern sets.
• Definition: The average residual reduction (arr) of a pattern P w.r.t. a model f0 and a dataset D is

  arr(P) = Σ_{x ∈ mds(P)} (r_f0(x) − r_fP(x)) / |mds(P)|,

  where fP is P's local model, mds(P) is P's matching dataset, and r_f(x) denotes the absolute residual of model f on instance x.
• Definition: The total residual reduction (trr) of a pattern set PS w.r.t. a model f0 and a dataset D is

  trr(PS) = Σ_{x ∈ mds(PS)} (r_f0(x) − r_fM(x)) / Σ_{x ∈ D} r_f0(x),

  where mds(PS) = ∪_{P ∈ PS} mds(P) and fM is the PXR model built from PS.
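The arr definition above translates directly into code; residuals here are absolute prediction errors, and the pattern and models in the usage example are toy stand-ins.

```python
# Average residual reduction of a pattern: the mean improvement of the
# local model's absolute residual over the baseline's, on matching data.
def arr(pattern, local_model, baseline, data):
    mds = [(x, y) for x, y in data if pattern(x)]  # matching dataset
    if not mds:
        return 0.0
    reduction = sum(abs(baseline(x) - y) - abs(local_model(x) - y)
                    for x, y in mds)
    return reduction / len(mds)

# Toy usage: baseline always predicts 0.5; local model is exact on x >= 0.
data = [(-1, 0), (1, 1), (2, 1)]
p = lambda x: x >= 0
print(arr(p, lambda x: 1.0, lambda x: 0.5, data))  # 0.5
```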
CPXR(Log) algorithm -- outline
• First step: split the training dataset D into two classes, LE and SE.
  – LE: instances of D where the baseline model makes a large error.
  – SE: instances of D where the baseline model makes a small error.
• Second step: extract all contrast patterns on LE satisfying a minimum support threshold.
• Third step: search for a small set of patterns that maximizes error reduction, and use that set to build a model.

• Notes:
  – Each pattern is associated with a local regression model built on the pattern's matching data.
  – Using a pattern and its associated local regression model is a flexible way to represent one predictor-response relationship.
  – Different (pattern, model) pairs represent highly different predictor-response relationships.
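The first step of the outline can be sketched as follows; the fixed 50/50 split fraction is an illustrative assumption (CPXR(Log) chooses the cut to optimize an objective).

```python
# Step 1 of the outline: rank instances by the baseline model's absolute
# error and split them into a large-error set LE and a small-error set SE.
def split_by_error(data, baseline, le_fraction=0.5):
    ranked = sorted(data, key=lambda xy: abs(baseline(xy[0]) - xy[1]),
                    reverse=True)
    k = int(len(ranked) * le_fraction)
    return ranked[:k], ranked[k:]       # (LE, SE)

# Toy usage: the baseline predicts 0.5 everywhere.
data = [(0, 0.5), (1, 0.9), (2, 0.1), (3, 0.5)]
LE, SE = split_by_error(data, lambda x: 0.5)
print(sorted(x for x, _ in LE))  # [1, 2]: the two worst-fit instances
```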
CPXR(Log) – details (1)
• Inputs:
  – training data D
  – baseline model f0
  – a ratio ρ to partition D into LE and SE
  – a minimum support threshold minSup on contrast patterns
• Output:
  – a PXR model M

  Let r(x) denote f0's error on instance x;
  Determine a cutoff κ on r to minimize the partition objective; let LE = {x ∈ D : r(x) ≥ κ} and SE = D \ LE;
  Discretize each numerical variable using entropy-based binning;
  Extract all contrast patterns for minSup in the class LE;
CPXR(Log) – details (2)
  For each contrast pattern P, build the local regression model fP for the data in mds(P);
  Let PS = {P*}, where P* is the contrast pattern with the highest arr;
  Let fd be the regression model trained from the remaining (unmatched) data;
  Return M = (PS, LM, fd);
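The pattern-set construction can be sketched as a greedy search: seed with the best single pattern and keep adding patterns while the score improves. The additive toy score below stands in for trr and is an assumption, not the paper's exact search procedure.

```python
# Greedy pattern-set selection: start empty, repeatedly add the candidate
# pattern that most improves the score, and stop when nothing helps.
def greedy_select(patterns, score, max_k=5):
    chosen, best = [], 0
    while len(chosen) < max_k:
        gains = [(score(chosen + [p]), p) for p in patterns if p not in chosen]
        if not gains:
            break
        top, p = max(gains, key=lambda gp: gp[0])
        if top <= best:                  # no candidate improves the score
            break
        chosen, best = chosen + [p], top
    return chosen, best

# Toy usage: three candidate patterns with a made-up additive score.
vals = {"P1": 4, "P2": 3, "P3": 0}
chosen, best = greedy_select(list(vals), lambda ps: sum(vals[p] for p in ps))
print(chosen, best)  # ['P1', 'P2'] 7
```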
TBI data
• The TBI dataset is a collection of international and US Tirilazad trials.
• 2159 instances. [Steyerberg, 2008]
• 15 numerical and categorical predictor variables.
• Missing values were treated using multiple imputation.
• The outcome variable is the Glasgow Outcome Scale: GOS 1 (dead), ..., GOS 5 (good recovery).
• This study used two dichotomized versions of GOS: mortality vs survival (GOS 1 vs GOS 2-5), and unfavorable vs favorable (GOS 1-3 vs GOS 4-5).

  Category                 | Predictor variables
  Basic                    | Cause of injury, age, GCS motor score, pupil reactivity
  Computed tomography (CT) | Hypoxia, hypotension, Marshall CT, tSAH, eDH, compressed cistern, midline shift more than 5 mm
  Lab                      | Glucose, pH, sodium, Hb
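The two GOS dichotomizations described above are simple to state in code:

```python
# Collapse the 5-level Glasgow Outcome Scale into the two binary
# outcomes used in this study.
def dichotomize(gos):
    return {
        "mortality": int(gos == 1),    # GOS 1 (dead) vs GOS 2-5
        "unfavorable": int(gos <= 3),  # GOS 1-3 vs GOS 4-5 (favorable)
    }

print(dichotomize(1))  # {'mortality': 1, 'unfavorable': 1}
print(dichotomize(4))  # {'mortality': 0, 'unfavorable': 0}
```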
Results – Performance of SLogR and CPXR(Log) on Mortality models
  Model        | SLogR (Spec / Sens / F1 / AUC) | CPXR(Log) (Spec / Sens / F1 / AUC)
  Basic        | 0.95 / 0.18 / 0.27 / 0.77      | 0.96 / 0.18 / 0.28 / 0.80
  Basic+CT     | 0.95 / 0.32 / 0.42 / 0.80      | 0.96 / 0.42 / 0.53 / 0.88
  Basic+CT+Lab | 0.94 / 0.36 / 0.46 / 0.80      | 0.97 / 0.46 / 0.58 / 0.92
CPXR(Log) is, as expected, more accurate than standard logistic regression.
Results – Performance of SLogR and CPXR(Log) on Unfavorable models
  Model        | SLogR (Spec / Sens / F1 / AUC) | CPXR(Log) (Spec / Sens / F1 / AUC)
  Basic        | 0.85 / 0.52 / 0.59 / 0.76      | 0.89 / 0.54 / 0.63 / 0.82
  Basic+CT     | 0.85 / 0.60 / 0.66 / 0.80      | 0.87 / 0.65 / 0.70 / 0.87
  Basic+CT+Lab | 0.84 / 0.61 / 0.66 / 0.81      | 0.91 / 0.72 / 0.76 / 0.93
Results – Impact of adding more variables on AUC
AUC improvement when more variables are used by CPXR(Log) and SLogR:

  Variable set change   | Mortality: CPXR(Log) | SLogR | Unfavorable: CPXR(Log) | SLogR
  Basic → Basic+CT      | 10%                  | 7.7%  | 6%                     | 5.2%
  Basic → Basic+CT+Lab  | 15%                  | 11.1% | 13.4%                  | 6.6%

AUC improvement of CPXR(Log) over SLogR:

  Outcome     | Basic | Basic+CT | Basic+CT+Lab
  Mortality   | 11.1% | 12.8%    | 15%
  Unfavorable | 7.9%  | 8.8%     | 14.8%
Results – ROC curves of Basic models
Results - ROC curves of (Basic + CT) models
Results - ROC curves of (Basic+CT+Lab) models
Results – Performance comparison
Comparing CPXR(Log) performance with:
• Logistic regression
• SVM
• Random forest
Example: patterns used by CPXR(Log) & Mortality (Basic+CT+Lab)
  Pattern                                                                  | arr | Cov
  (CT classification = III)                                                | 15% | 20%
  (CT classification = V) AND (midline shift) AND (0.56 < glucose <= 10.4) | 12% | 15%
  (No compressed cistern) AND (No midline shift) AND (7.22 < pH <= 7.45)   | 10% | 40%
  (10.77 < glucose <= 21.98) AND (134 < sodium <= 144)                     | 18% | 18%
  (No hypotension) AND (134 < sodium < 144) AND (10.55 < Hb <= 14.57) AND (with tSAH) | 19% | 20%
  (No tSAH) AND (134 < sodium <= 144) AND (10.77 < glucose <= 21.98) AND (No hypotension) AND (No midline shift) AND (One reactive pupil) | 19% | 20%
  (No tSAH) AND (One reactive pupil)                                       | 18% | 40%
Odds ratios
[Figure: odds ratios of the predictor variables for the pattern (CT classification = V) AND (midline shift) AND (0.56 < glucose <= 10.4)]
Residual reduction and example patient
Example patient:
• Age = 15 years
• Cause of injury = motorbike accident
• GCS motor score = 5 (no eye response)
• No reactive pupil
• No hypoxia
• No hypotension
• CT scan classification = V (mass lesion)
• No tSAH
• With ePDH
• Midline shift more than 5 mm
• Glucose = 9.06 mmol/l
• pH = 7.37
• Sodium = 141 mmol/l
• Hb = 14.4 g/dl
• Patient is dead.

Standard logistic regression predicted a 0.78 risk of survival for this patient!
[Figure: error distribution of the TBI dataset under SLogR]
This patient matches "pattern II", and CPXR(Log) predicted a 0.38 risk of survival.
[Figure: error distribution of the TBI dataset under CPXR(Log)]
Results – Box plot of RMSE reduction in CPXR
How much CPXR can reduce RMSE (root mean square error) across 50 datasets, compared to:
• Piecewise linear regression
• Support vector regression
• Bayesian additive regression trees
• Gradient boosting method
Results – Noise sensitivity and impact of the number of patterns
• How much do noisy datasets affect the performance of CPXR and other methods?
• The number of patterns is determined automatically by the method.
Conclusion
• We presented an effective new method, CPXR(Log), for logistic regression and for clinical predictive modeling.
• We showed that CPXR(Log) is more accurate than standard logistic regression and several other classification algorithms.
• We also presented CPXR(Log) models, including their patterns and local models, and new odds ratios of predictor variables.
References
• G. Dong, V. Taslimitehrani: Pattern-aided regression modeling and prediction model analysis. Tech report, CSE, Wright State University, 2014.
• E.W. Steyerberg: Clinical Prediction Models. Springer, 2009.
• P. Perel, P. Edwards, R. Wentz, I. Roberts: Systematic review of prognostic models in traumatic brain injury. BMC Medical Informatics and Decision Making, 6(1): 1-10, 2006.
• G. Dong, J. Li: Efficient mining of emerging patterns: Discovering trends and differences. In Proc. KDD, 43-52, 1999.
• E.W. Steyerberg, et al.: Predicting outcome after traumatic brain injury: development and international validation of prognostic scores based on admission characteristics. PLoS Medicine, 5(8): e165, 2008.
Preliminaries: Logistic Regression
• Regression modeling: predicting a response variable (output) from predictor variables (inputs).
• Logistic regression: the response variable is binary. For example:
  – "having the disease" or not
  – "mortal" or not
• Let X = (X1, ..., Xp) be a vector of predictor variables and Y be the binary response variable.
• The goal of logistic regression is to learn a function satisfying

  P(Y = 1 | X) = 1 / (1 + e^-(β0 + β1 X1 + ... + βp Xp)).

• Chi-square (χ²) is one of the goodness-of-fit measures for logistic regression.
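The model above, written out in code; the coefficients in the usage example are made up for illustration.

```python
import math

# P(Y = 1 | X): the sigmoid of a linear combination of the predictors.
def logistic(x, beta0, betas):
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

print(logistic([0.0, 0.0], 0.0, [1.0, 2.0]))  # 0.5 (z = 0)
```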
Preliminaries: Contrast Patterns
• An item is a single-variable condition of the form "A = a" (for a categorical variable) or an interval condition such as "a1 < A <= a2" (for a numerical variable).
• A pattern is a finite set of items.
• An instance X from dataset D is said to match a pattern P if X satisfies every item in P.
• Example: a pattern with TWO items: a lower-bound condition on Age together with "Diagnosed with high cholesterol = YES". One instance (patient ID = 1) matches this pattern.

  Patient ID | Age | BMI | Sys blood pressure | Diagnosed with high cholesterol | Diagnosed with heart failure (C)
  1          | 75  | 22  | 120                | YES                             | YES
  2          | 67  | 27  | 131                | NO                              | NO
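Matching as defined above is a conjunction over items. The rows below mirror the toy table; the age cutoff of 70 is an assumption (the slide's exact value did not survive extraction).

```python
# An instance matches a pattern iff it satisfies every item in it.
def matches(instance, pattern):
    return all(pred(instance[attr]) for attr, pred in pattern)

rows = [
    {"id": 1, "age": 75, "bmi": 22, "sbp": 120, "high_chol": "YES"},
    {"id": 2, "age": 67, "bmi": 27, "sbp": 131, "high_chol": "NO"},
]
# Two-item pattern: an age lower bound (assumed cutoff) plus high
# cholesterol = YES.
pattern = [("age", lambda a: a >= 70), ("high_chol", lambda c: c == "YES")]

print([r["id"] for r in rows if matches(r, pattern)])  # [1]
```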
Preliminaries: Contrast Patterns
• The matching data of pattern P in dataset D, mds(P, D), is the set of all instances in D matching P.
• The support of pattern P in D is supp(P, D) = |mds(P, D)| / |D|.
• Given two classes C1 and C2, the support ratio of pattern P from C2 to C1 is suppRatio(P) = supp(P, C1) / supp(P, C2).
• Given a threshold σ > 1, a contrast pattern (emerging pattern) of class C1 is a pattern P satisfying suppRatio(P) ≥ σ. [Dong, 1999]
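The definitions above translate directly into code; the classes and the threshold in the toy usage are made up.

```python
# Support of a pattern in a dataset, and support ratio between classes.
def supp(pattern, data):
    return sum(pattern(x) for x in data) / len(data)

def supp_ratio(pattern, c1, c2):
    s1, s2 = supp(pattern, c1), supp(pattern, c2)
    return float("inf") if s2 == 0 else s1 / s2

# Toy usage: "x > 3" is frequent in class c1 and rare in class c2.
c1, c2 = [4, 5, 6, 1], [1, 2, 3, 4]
p = lambda x: x > 3
ratio = supp_ratio(p, c1, c2)
print(ratio, ratio >= 2)  # 3.0 True -> contrast pattern of c1 at sigma = 2
```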