37
Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9

Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Survival Analysis

STAT 526

Professor Olga Vitek

May 4, 2011

9

Page 2: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Survival Data andSurvival Functions

• Statistical analysis of time-to-event data

– Lifetime of machines and/or parts(called failure time analysis in engineering)

– Time to default on bonds or credit card(called duration analysis in economics)

– Patients survival time under different treatment(called survival analysis in clinical trial)

– Event-history analysis (sociology)

• Why a special topic on survival analysis?

– Non-normal and skewed distribution

– Need to answer questions related to

P (X > t+ t0|X ≥ t0)

– Censored/truncated data

– Here we only focus on right-censored data

9-1

Page 3: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Survival Function

• Continuous survival time T

– its probability density function is f(t)

– its cumulative probability function is F (t)

F (t) = P (T ≤ t) =

∫ t

0f(s)ds

• The survival function of T is

S(t) = P (T > t) = 1− F (t)

– also called survival rate

– steeper curve → shorter survival

– S′(t) = −f(t)

• Mean survival time

µ = E{T} =

∫ ∞0

tf(t) dt =

∫ ∞0

t dF (t) =

∫ ∞0

t d[1− S(t)]

=

∫ ∞0

[∫ t

0dx

]d[1− S(t)] =

∫ ∞0

[∫ ∞x

d[1− S(t)]

]dx

=

∫ ∞0

S(x) dx

9-2

Page 4: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Hazard Function

• The hazard function of T is

λ(t) = lim∆t↘0

P (t ≤ T < t+ ∆t |T ≥ t)∆t

– proportion of subjects with the event per unittime, around time t; λ(t) ≥ 0

– measure of ’proneness’ to the even as functionof time

– λ(t) 6= f(t): λ(t) is conditional on survival to t

• Relates to the survival function

λ(t) = lim∆t↘0

F (t+ ∆t)− F (t)

∆t

/S(t) =

f(t)

S(t)= −

S′(t)

S(t)

= −d

dtlogS(t) (negative slope of log-survival)

• The cumulative hazard function of T is

defined as Λ(t) =∫ t0 λ(s)ds

– Λ(t) = − logS(t)

– S(t) = exp{−Λ(t)}

– λ(t), Λ(t) or S(t) define the distribution

9-3

Page 5: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Parametric Survival Models

• Exponential distribution:

– λ(t) = ρ, where ρ > 0 is a constant, and t > 0

– S(t) = e−ρt, ⇒ f(t) = −S′(t) = ρe−ρt

• Weibull distribution:

– λ(t) = λp(λt)p−1; λ, p > 0 are constants, t > 0

– S(t) = e−(λt)p, ⇒ f(t) = −S′(t) = e−(λt)pλp(λt)p−1

– is the Exponential distribution when p = 1

• Gompertz distribution:

– λ(t) = αeβt; α, β > 0 are constants, t > 0

– is the Exponential distribution when β = 0

• Gompertz-Makeham distribution:

– λ(t) = λ+ αeβt; α, β > 0 are constants, t > 0

– adds an initial fixed component to the hazard

9-4

Page 6: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Non-Parametric SurvivalModels

• Approximate time by discrete intervals

– f(xi) =

{P{T = xi}0 otherwise

– S(xi) =∑

j: xj≥t f(xj)

– λ(xi) = f(xi) / S(xi)

• After substitution:

– f(xi) = λ(xi)∏i−1j=1 (1− λj)

– S(xi) =∏i−1j=1(1− λj)

– S(t) =∏j:xj<t

(1− λj), t > 0

9-5

Page 7: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Right-Censored Data

• Survival time of i-th subject Ti, i = 1, · · · , n.

• Censoring time of i-th subject Ci, i = 1, · · · , n.

• Observed event for ith subject:

Yi = min(Ti, Ci), δi = 1{Ti≤Ci} =

{1, if Ti ≤ Ci0, if Ti > Ci

– Data are reported in pairs (Yi, δi)

= (observedTime, hasEvent)

• Assumption:

– Ci are predetermined and fixed, or

– Ci are random, mutually independent, and inde-pendent of Ti

9-6

Page 8: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Non-parametric Estimation

of Survival Function

And

Comparison Between Groups

With Censoring

9-7

Page 9: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Example: Remission Timesof Leukaemia Patients

• 21 leukemia patients treated with drug (6-mercaptopurine)

• 21 matched controls, no covariates.

> library(MASS) #Described in Venable & Ripley, Ch.13> data(gehan)

> gehanpair time cens treat

1 1 1 1 control2 1 10 1 6-MP3 2 22 1 control4 2 7 1 6-MP5 3 3 1 control6 3 32 0 6-MP7 4 12 1 control8 4 23 1 6-MP9 5 8 1 control10 5 22 1 6-MP11 6 17 1 control12 6 6 1 6-MP13 7 2 1 control14 7 16 1 6-MP15 8 11 1 control16 8 34 0 6-MP17 9 8 1 control.........................

9-8

Page 10: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Kaplan-Meier Estimator ofSurvival Function

• Also called product limit estimator

• Re-organize the data

Distinct Event Times t1 · · · ti · · · tk# of Events d1 · · · di · · · dk

# Survivors Right Before ti n1 · · · ni · · · nk

• Kaplan-Meier estimator S(t) :

– Estimate P (T > t1) by p1 = (n1 − d1)/n1

– Estimate P (T > ti|T > ti−1) by pi = (ni − di)/ni,i = 2, · · · , k

– If ti−1 < t < ti,

S(t) =

P (T > t) = P (T > t1)P (T > t|T > t1) = · · ·

P (T > t1){i−1∏m=2

P (T > tm|T > tm−1)}P (T > t|T > ti−1)

– Estimate S(t) by S(t) = S(ti−1)ni−dini

=∏i:ti≤t

ni−dini

,

(Kaplan & Meier, 1958, JASA)

9-9

Page 11: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Non-Parametric Fit ofSurvival Curves

• Represent data in the censored survival form– create a data structure ’Surv’

– ’+’ represents censored time

> library(survival)> Surv(gehan$time, gehan$cens)1 10 22 7 3 32+ 12 23 8 22 17 6 2 1611 34+ 8 32+ 12 25+ 2 11+ 5 20+ 4 19+ 15 68 17+ 23 35+ 5 6 11 13 4 9+ 1 6+ 8 10+

• K-M estimation of survival function– subset of results for the treatment group:

> fit <- survfit(Surv(time, cens) ~ treat, data=gehan)> summary(fit)

treat=6-MPtime n.risk n.event survival std.err lower95%CI upper95%CI

6 21 3 0.857 0.0764 0.720 1.0007 17 1 0.807 0.0869 0.653 0.996

10 15 1 0.753 0.0963 0.586 0.96813 12 1 0.690 0.1068 0.510 0.93516 11 1 0.627 0.1141 0.439 0.89622 7 1 0.538 0.1282 0.337 0.85823 6 1 0.448 0.1346 0.249 0.807

9-10

Page 12: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Interval K-M Estimatorof Survival Function

• Direct interval estimation of S(t)

– The variance of S(t) can be estimated by

var(S(t)) = [S(t)]2∑i:ti≤t

di

ni(ni − di)

– This is Greenwood’s formula (Greenwood, 1926)

– CI for S(t) is S(t)± 1.96√var[S(t)],

– S(t) ∈ [0,1], but var(S(t)) may be large enough

that the usual CI will exceed this interval

• Interval estimation using log S(t) (preferred)

– The variance of log S(t) can be estimated as

var[ log S(t) ] =∑i:ti≤t

di

ni(ni − di)

– Construct the CI for log S(t)

[L,U ] = log S(t)± 1.96√var[log S(t)],

– For CI of S(t), transform the limits with [eL, eU ].

9-11

Page 13: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Estimate the Cumulative Hazard

Function Λ(t) = − logS(t)

• Using the Kaplan-Meier estimator of S(t):

Λ(t) = − log S(t)

• Alternatively, the Nelson-Aalen Estimator:

Λ(t) =

{0, if t ≤ t1∑

i:ti≤tdini, if t ≥ ti

V ar(Λ(t) =∑i:ti≤t

di

n2i

– better small-sample-size performance than basedon the K-M procedure

– useful in comparing the fit of a parametric modelto its non-parametric alternative

9-12

Page 14: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Non-Parametric Fit ofSurvival Curves

>plot(fit, conf.int=TRUE, lty=3:2, log=TRUE,xlab="Remission (weeks)", ylab="Log-Survival", main="Gehan")

• Plot curves and CI; default CI are on the log scale

• ’log=TRUE’ plots y axis (i.e. S(t)) on the log scale(i.e. y axis shows the hegative cumulative hazard)

0 5 10 15 20 25 30 35

0.05

0.10

0.20

0.50

1.00

Gehan

Remission (weeks)

Sur

viva

l

control6−MP

9-13

Page 15: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Log-Rank Test for Homogeneity

• Non-parametric test

– Compare two populations with hazard functionsλi(t), i = 1,2.

– Collect two samples from each population.

– Construct a pooled sample with k distinct eventtimes

Distinct Failure t1 · · · ti · · · tkTime

Pool # of Failures d1 · · · di · · · dkSample # survivors n1 · · · ni · · · nk

right before tiSample # of Failures d11 · · · d1i · · · d1k

1 # survivors ti n11 · · · n1i · · · n1k

right before tiSample # of Failures d21 · · · d2i · · · d2k

2 # survivors n21 · · · n2i · · · n2k

right before ti

– Note di = d1i + d2i, ni = n1i + n2i, i = 1, · · · , k

– Test the hypotheses

H0 : λ1(t) = λ2(t), t ≤ τvs. Ha : λ1(t) 6= λ2(t) for some t ≤ τ

9-14

Page 16: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Log-Rank Test for

Homogeneity: Procedure

• Consider [ti, ti + ∆) for small ∆Sample Sample Pooled

1 2 Sample# failures d1i d2i di

# survivors at ti n1i − d1i n2i − d2i ni − di# of survivors ti n1i n2i ni

right before ti

• If λ1(t) 6= λ2(t) around ti, there should be associ-ation between sample and event in this 2 × 2 table

• d1i|(ni, di, n1i)H0∼ Hypergeometric(ni, di, n1i):

P (d1i = x|ni, di, n1i) =

(dix

)(ni − din1i − x

)(

nin1i

)=⇒ E[d1i|ni, di, n1i] =

din1i

ni

• Define U =∑k

i=1

[d1i − din1i

ni

]E[U ] = 0, var(U) =

∑ki=1

din1i(ni−di)(ni−n1i)n2i (ni−1)

U/var(U)1/2 H0, asy∼ N(0,1)

9-15

Page 17: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Log-Rank Test in R

(Venable & Ripley Sec.13.2)

> ?survdiff

> survdiff(Surv(time,cens)~treat,data=gehan)

N Observed Expected (O-E)^2/E (O-E)^2/Vtreat=6-MP 21 9 19.3 5.46 16.8treat=control 21 21 10.7 9.77 16.8

Chisq= 16.8 on 1 degrees of freedom, p= 4.17e-05

• Conclusion: Reject H0

• Warning: this test does not adjust for covariates,and may often be inappropriate

– Appropriate for this study, since the individualsare matched pairs

9-16

Page 18: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Parametric Estimation of

Survival Function

and

Comparison Between Groups

9-17

Page 19: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

The Likelihood Function

• The likelihood of i-th observation (yi, δi, xi):{f(yi|Xi), if δi = 1, (not censored)

S(yi|Xi), if δi = 0, (right− censored)

• The likelihood of the data is

L(β) ∝n∏i=1

{[f(yi|Xi)]δi[S(yi|Xi)]1−δi

}– The inference approaches based on the likeli-

hood function can be applied.

∗ See chapter by Lee on ML parameterestimation

– Use AIC or BIC to compare nested models.

– Graphical check of model adequacy

∗ Plot Λ(t) vs. t for exponential models;

∗ Plot log(Λ) vs. log(t) for Weibull models;

∗ Can also plot deviance residuals.

9-18

Page 20: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Accounting for Covariates:Models for Hazard Function

• General form: λ(t) = λ0(t)ex′β

– λ0(t): the baseline hazard

– ex′β: multiplicative effect, independent of time

• Implications for the Survival Function:

– S(t) = e−Λ(t) = e−∫ t

0λ(s) ds = e−Λ0(t)ex

′β

– logΛ(t) = log[−log S(t)] = logΛ0 + x′β

– If the model is appropriate, a plot of log[−log S(t)KM ]for different groups yields roughly parallel lines

9-19

Page 21: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Accounting for Covariates:Models for Hazard Function

• Exponential distribution, no predictors:

– λ(t) = ρ - constant hazard function

– S(t) = e−ρ t - survival function

• Exponential distribution, one predictor:

– λ(t) = eβ0+β1X = eβ0 · eβ1X,

– eβ0 plays the role of λ0,i.e. the constant baseline hazard

• Weibull distribution, no predictors:

– λ(t) = λp p tp−1 - hazard function

– S(t) = e−(λ t)p - survival function

• Weibull distribution, one predictor:

– λ(t) = p tp−1 · e(β0+β1X)p = p tp−1eβp0 · epβ1X

– eβ0+β1X plays the role of λ in overall hazard

– p tp−1eβp0 is the baseline hazard

– eβ0 plays the role of λ in the baseline hazard

9-20

Page 22: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

More on Proportional Hazard

• Assumption of proportional hazard implies:

– The hazard ratio for two subjects i and j withdifferent covariates, at a gven time is

λi(t)

λj(t)=λ0(t)

λ0(t)·eβ0+β1Xi

eβ0+β1Xj

= e(Xi−Xj)β

– constant over time

– only function of covariates

• Can graphically verify the assumption of

proportional hazard.

– Check for parallel lines of SKM(t) on the com-plementary log-log scale

– For the gehan dataset:

> plot(fit, lty=3:4, col=2:3, fun="cloglog",xlab="Remission (weeks)", ylab="log Lambda(t)")

> legend("topleft", c("control", "6-MP"),lty=4:3, col=3:2)

9-21

Page 23: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Verify the Assumptionof Proportional Hazard:

Gehan Study

1 2 5 10 20

−2.

0−

1.5

−1.

0−

0.5

0.0

0.5

1.0

Remission (weeks)

log

Lam

bda(

t)

control6−MP

• Good agreement with the additive structure on thelog-log scale

• The assumption of proportional hazard is plausible

9-22

Page 24: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Accounting for Covariates:Models fot Survival Function

• Also called Accelerated Failure Models

S(t) = S0(t · ex′β)

– S0 is the baseline survival function

– Covariates accelerate or contract the time toevent

– Survival time T0 = T ·ex′β has a fixed distribution

log T0 = log T + X′β, ⇒log T = log T0 −X′β

– Weibull (and its special case Exponential) arethe only distributions that can be simultaneously(and equivalently) specified as proportional haz-ard and accelerated failure models.

9-23

Page 25: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Accelerated Failure:Exponential

• Exponential distribution:

– λ(t) = ρ, S(t) = e−ρ t

– S0(t) = e−t - survival of the standard exponential

• Effect of the covariates:

– S(t) = S0(t · eβ0+β1X), eβ0+β1X = ρ

– If T0 ∼ exp(1), and T are the observed times

T0 = T · eβ0+β1X

log T0 = log T + β0 + β1X, and

log T = −β0 − β1X + log T0

• In R:

log T = β0 + β1X + σ · log ε,

– σ is a scale parameter fixed at σ = 1

– ε ∼ Exp(1)

– use opposite sign of β to estimate survival:

S(t) = S0(t · e−β0−β1X)

9-24

Page 26: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Accelerated Failure:Exponential

> fit.exponential <- survreg(Surv(time,cens) ~ treat,data=gehan, dist="exponential")

> summary(fit.exponential)Value Std. Error z p

(Intercept) 3.69 0.333 11.06 2.00e-28treatcontrol -1.53 0.398 -3.83 1.27e-04

Scale fixed at 1

Exponential distributionLoglik(model)= -108.5 Loglik(intercept only)= -116.8Chisq= 16.49 on 1 degrees of freedom, p= 4.9e-05

• Parameter interpretation:

– −βtreatcontrol = 1.53 > 0

– time until remission is longer for controls

• Predicted survival above T = 10 for trt:

– exp(-10*exp(-3.69))=0.7790189

– comparable to StrtKM(10) = 0.753

9-25

Page 27: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Accelerated Failure: Weibull

• Weibull distribution:

– λ(t) = λp p tp−1 - hazard function

– S(t) = e−(λ t)p - survival function

– Can be viewed as the survival function of theexponential random variable T ′ = T p ∼ Exp(λp)

• Effect of the covariates:

– Identical to the exponential:S(t′) = S0(t′ · eβ0+β1X), eβ0+β1X = λp

– If T0 ∼ exp(1), and T ′ are the observed times

T0 = T ′ · eβ0+β1X

log T0 = log T ′ + β0 + β1X

– Returning to the original notation T ′ = T p:

log T0 = p logT + β0 + β1X

log T = −1

pβ0 −

1

pβ1X +

1

plog T0

9-26

Page 28: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Accelerated Failure: Weibull

• In R Weibull is the default distribution:

log T = β0 + β1X + σ · log ε,

– σ is a scale parameter, σ = 1p, ε ∼ Exp(1)

– use −β/σ to estimate survival:

S(t) = S0(t1/σ · e−β0/σ−β1/σX)

> fit.weibull <- survreg(Surv(time,cens)~treat, data=gehan)

> summary(fit.weibull)Value Std. Error z p

(Intercept) 3.516 0.252 13.96 2.61e-44treatcontrol -1.267 0.311 -4.08 4.51e-05Log(scale) -0.312 0.147 -2.12 3.43e-02

Scale= 0.732Weibull distributionLoglik(model)= -106.6 Loglik(intercept only)= -116.4Chisq= 19.65 on 1 degrees of freedom, p= 9.3e-06

• Predicted survival above T = 10 for trt:

– exp(-10^(1/0.732)*exp(-3.1516/0.732))=0.7308616

– comparable to StrtKM(10) = 0.753

9-27

Page 29: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Variable Selectionand Prediction

• Compare models with and without pair asblocking factor

anova( survreg(Surv(time,cens) ~ treat, data=gehan),survreg(Surv(time,cens)~factor(pair)+treat, data=gehan))

Terms Res.Df -2*LL TestDf Deviance P(>|Chi|)treat 39 213.159 NA NA NA

(pair)+treat 19 181.343 +(pair) 20 31.81597 0.04529

• Prediction for the median survival

– On the linear predictor scale

> fit.weibull.nopairs <-survreg(Surv(time,cens)~treat, data=gehan)

> ?predict.survreg> pred.contr <- predict(fit.weibull.nopairs,

data.frame(treat=’control’), type="uquantile",p=0.5, se=TRUE)

– On the survival function scale

> exp( c(L=pred.contr$fit - 2*pred.contr$se.fit,U=pred.contr$fit + 2*pred.contr$se.fit) )

L.1 U.15.045779 10.396147

9-28

Page 30: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Cox Proportional Hazards

Model

A semi-parametric

approach

9-29

Page 31: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Cox Proportional Hazards(Relative Risk) Model

• Assume:

– Ci and Ti conditionally independent, given Xi.

– Observe k distinct exact failure times(i.e. δi = 1); t1 < t2 < · · · < tk.

– No ties in observed exact failure times(i.e., if δi = δj = 1, then yi 6= yj for i 6= j).

• Define the risk set at time t

R(t) = {i : yi ≥ t} = {individuals alive right before time t}

• Cox (1975) considered the conditional prob-ability

– Probability of event in a small interval aroundtj, given that the individual is in the risk set

P (ind. i failed at [tj, tj + ∆) | i ∈ R(tj)) ≈λ(tj)∆∑

k∈R(tj)λ(tj)∆

9-30

Page 32: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Partial Likelihood Approachto Estimate β (Cox, 1975)

• With covariate Xi(t), assume a proportional hazardsmodel

λ(t) = λ0(t) expXi(t)′β(t)

where λ0(t) is a baseline hazard and eX′iβ(t) is the rel-

ative risk. Treating λ0(t) as a nuisance parameter,one may estimate β(t) by maximizing the partiallikelihood

L(β) =∏j

expXi(tj)′β(tj)∑l∈R(tj)

expXl(t)′β(tj),

where the i(j)th item fails at tj and R(tj) is the riskset at tj.

• Conditional on the history up to tj and the fact thatone item fails at tj, each term within the productis proportional to the likelihood of a multinomialmodel.

• λ0(t) is not specified, which is the non-parametricpart of the model.

• Xl(t)′β(tj) is the parametric part of the model.

9-31

Page 33: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Partial Likelihood Approach toEstimate β (Cox, 1975)

• The partial likelihood function by Cox (1975) is

L(β) =nobs∏i=1

eX′iβ∑

k∈R(tj)eX

′kβ

=n∏i=1

{eX

′iβ∑

k∈R(yi)eX

′kβ

}δi

— Cox argued that the partial likelihood has allproperties of an usual likelihood, e.g., maximizingit for an optimal β =⇒ maximum partial likelihoodestimator (MPLE) β

• Breslow’s Estimator of the Baseline Cumulative Haz-ard Rate:

Λ0(t) =∑j:tj≤t

1∑i∈R(t) e

X′jβ=∑i:yi≤t

δi∑j∈R(yi)

eX′jβ

• There are different ways to relax Assumption 3to handle tied failure times, e.g., exact method,Efron’s method, and Breslow’s method.

Method R Options CommentExact method="exact" accurate, long

Efron’s method="efron" approximate, betterBreslow’s method="breslow" approximate

9-32

Page 34: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Cox Proportional Hazard

• Use exact method to account for ties

• β > 0 as expected

> fit.cox1 <-coxph(Surv(time,cens)~treat,method="exact", data=gehan)

> summary(fit.cox1)

n= 42, number of events= 30coef exp(coef) se(coef) z Pr(>|z|)

treatcontrol 1.6282 5.0949 0.4331 3.759 0.000170 ***

exp(coef) exp(-coef) lower .95 upper .95treatcontrol 5.095 0.1963 2.18 11.91

Rsquare= 0.321 (max possible= 0.98 )Likelihood ratio test= 16.25 on 1 df, p=5.544e-05Wald test = 14.13 on 1 df, p=0.0001704Score (logrank) test = 16.79 on 1 df, p=4.169e-05

9-33

Page 35: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Cox Proportional Hazard

• Add pair as block

> fit.cox2 <- coxph(Surv(time,cens)~treat+factor(pair),method="exact", data=gehan)

> summary(fit.cox2)coef exp(coef) se(coef) z Pr(>|z|)

treatcontrol 3.314679 27.513571 0.742620 4.463 8.06e-06 ***factor(pair)2 -5.015219 0.006636 1.550131-3.235 0.001215 **factor(pair)3 -3.598195 0.027373 1.547371-2.325 0.020053 *

.................................................Rsquare= 0.662 (max possible= 0.98 )Likelihood ratio test= 45.51 on 21 df, p=0.001484Wald test = 27.42 on 21 df, p=0.1573Score (logrank) test = 39.73 on 21 df, p=0.008023

• LR test compares log(partial likelihoods),

but the test has similar properties

> anova(fit.cox1, fit.cox2, test="Chisq")Analysis of Deviance TableCox model: response is Surv(time, cens)Model 1: ~ treatModel 2: ~ treat + factor(pair)

loglik Chisq Df P(>|Chi|)1 -74.5432 -59.915 29.256 20 0.08283 .

9-34

Page 36: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Visualize Model Fit

• Survival curve and CI for an individual withaverage covariates– K-M curves refer to unadjusted populations, Cox

curves refer to an ’average’ patient

– can make survival curves for the Cox model forspecific values of covariates by providing newdata

>plot(survfit(fit.cox1),lty=2:3,xlab="Remission (weeks)",ylab="Survival", main="Gehan", cex=1.5)

0 5 10 15 20 25 30 35

0.0

0.2

0.4

0.6

0.8

1.0

Gehan

Remission (weeks)

Sur

viva

l

9-35

Page 37: Survival Analysis - Purdue University · 2011-05-04 · Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9. Survival Data and Survival Functions Statistical analysis of

Model Diagnostics

• Cox-Snell residuals are most useful for examiningthe overall fit of a model

rc,i = − log[S(yi|xi)] = Λ(yi|xi) = Λ0(yi)eX′iβ

— Λ0(t) is estimated by the Breslow’s estimator.

— If the estimates Λ0(t) and β were accurate, {(rc,i, δi), i =1, · · · , n} are right-censored observations of Exponential(1).

— For the right-censored data {(rc,i, δi), i = 1, · · · , n},construct the Nelson-Aalen estimator Λ(t) and plotΛ(t) vs. t.

• Martingale residuals can be used to determine thefunctional form of a covariate

rm,i = δj − rc,j

— Fit the Cox model with all covariates except theone of interest, and plot the martingale residualsagainst the covariate of interest.

• Deviance residuals are approximately symmetricallydistributed about zero and large values may indicateoutliers.

rd,i = sign(rm,i)√−2[rm,i + δi log(rc,j)]

9-36