
Some New Methods for Latent Variable Models and Survival Analysis

Marie Davidian

Department of Statistics

North Carolina State University

http://www.stat.ncsu.edu/~davidian

(Joint work with X. Huang, L. Stefanski, K. Doehler, L. Tang, M. Zhang)

Greenberg Lecture IV: Latent Variable/Survival 1

Outline

1. Introduction

2. Latent-model theoretical robustness

3. Empirically checking latent-model robustness

4. “Smooth” inference for survival functions with arbitrarily censored data

5. “Smooth” semiparametric regression analysis with arbitrarily censored data


1. Introduction

Two mainstays of biostatistical methodology and practice:

• “Latent-variable models” – e.g., measurement error models, models with random effects

• Survival analysis

Two mini-talks: Research by my PhD students

• Tools for checking whether inference in latent variable models is robust to assumptions on the latent variable distribution – with Xianzheng Huang and Len Stefanski

• Methods for survival analysis based on mild smoothness assumptions – with Kirsten Doehler, Lihua Tang, and Min Zhang


Latent-Model Robustness in Structural Measurement Error Models

Xianzheng Huang, Len Stefanski, and Marie Davidian

Department of Statistics

North Carolina State University


2. Latent-model theoretical robustness

Particular latent variable model: Structural measurement error model

Y = observed response

X = true predictor (q × 1), with true density $f_X^*(x)$

W = observed predictor (q × 1)

Usual assumptions: Take q = 1 for simplicity

• Conditional density of Y|X is $f_{Y|X}(y|x;\theta)$, true value $\theta^*$

• $W = X + U$, $U \sim N(0, \sigma_U^2)$, $\sigma_U^2$ known ⇒ conditional density of W|X is $f_{W|X}(w|x;\sigma_U^2)$ (normal)

• $f_{Y,W|X}(y,w|x;\theta) = f_{Y|X}(y|x;\theta)\, f_{W|X}(w|x;\sigma_U^2)$ (surrogacy)

• Interested in inference on θ

Observed data: $(Y_j, W_j)$, $j = 1, \ldots, n$, iid


2. Latent-model theoretical robustness

X is a latent variable: Assumptions on X?

• One approach to inference on θ: Make a parametric assumption about the true density of X (i.e., the latent variable model)

• Assumed parametric latent variable model: $f_X^{(a)}(x;\tau^{(a)})$, depending on a parameter vector $\tau^{(a)}$

Likelihood inference: Estimate θ, $\tau^{(a)}$ by $\hat\theta$, $\hat\tau^{(a)}$ maximizing

$$L(\theta,\tau^{(a)}) = \prod_{j=1}^{n} f_{Y,W}(Y_j,W_j;\theta,\tau^{(a)}) = \prod_{j=1}^{n} \int f_{Y|X}(Y_j|x;\theta)\, f_{W|X}(W_j|x;\sigma_U^2)\, f_X^{(a)}(x;\tau^{(a)})\, dx$$

• If $f_X^{(a)}(x;\tau^{(a)})$ is correctly specified ⇒ $\hat\theta$ is consistent and asymptotically efficient
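A minimal sketch of how this likelihood can be evaluated numerically. The choices here are illustrative assumptions, not part of the slides: a probit model for Y|X, an assumed normal latent density with τ = (mean, sd), Gauss-Hermite quadrature for the integral over x, and hypothetical function names.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z, mean, sd):
    """Normal density; mean may be an array of quadrature nodes."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def log_likelihood(theta, tau, y, w, sigma_u, n_nodes=40):
    """log L(theta, tau): probit Y|X, normal W|X, assumed normal f_X^(a) (all assumptions)."""
    beta0, beta1 = theta
    mu_x, sd_x = tau
    # Gauss-Hermite rule: integral of exp(-t^2) g(t) dt ~ sum_k wt_k g(t_k)
    t, wt = np.polynomial.hermite.hermgauss(n_nodes)
    x = mu_x + math.sqrt(2.0) * sd_x * t          # nodes mapped to N(mu_x, sd_x^2)
    p1 = np.array([norm_cdf(beta0 + beta1 * xk) for xk in x])
    total = 0.0
    for yj, wj in zip(y, w):
        f_y = p1 if yj == 1 else 1.0 - p1         # f_{Y|X}(Y_j | x; theta)
        f_w = norm_pdf(wj, x, sigma_u)            # f_{W|X}(W_j | x; sigma_U^2)
        total += math.log(np.sum(wt * f_y * f_w) / math.sqrt(math.pi))
    return total
```

The quadrature is only an approximation; more nodes may be needed when σ_U is small relative to the spread of the assumed density of X.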


2. Latent-model theoretical robustness

What if $f_X^{(a)}(x;\tau^{(a)})$ is incorrectly specified?

• $\hat\theta$ can be inconsistent (and hence asymptotically biased)

Our definition of “latent-model robustness”: The estimator $\hat\theta$, and more generally the model, is said to be robust if this doesn’t happen!

• I.e., latent-model robustness means lack of asymptotic bias

• ⇒ The estimator under a correct model is trivially robust

• Asymptotic bias is only possible if both $f_X^{(a)}(x;\tau^{(a)})$ is misspecified and $\sigma_U^2 > 0$

• So we are interested in whether there is an “interaction” between these factors ⇒ nonrobustness


2. Latent-model theoretical robustness

Definition: Full latent-model robustness

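• Score for assumed model

$$\psi(y,w,\theta,\tau^{(a)}) = \frac{\partial}{\partial(\theta,\tau^{(a)})}\left\{ \log f_{Y,W}(y,w;\theta,\tau^{(a)}) \right\}$$

• $\theta(\sigma_U)$, $\tau^{(a)}(\sigma_U)$ satisfy

$$E\left[\psi\{Y,W,\theta(\sigma_U),\tau^{(a)}(\sigma_U)\}\right] = 0 \quad \text{(with respect to the true distribution)}$$

• Under conditions, $\hat\theta \stackrel{p}{\longrightarrow} \theta(\sigma_U)$

• In general, if $f_X^{(a)}(x;\tau^{(a)})$ is incorrect and $\sigma_U > 0$, $\theta(\sigma_U) \neq \theta^*$

• The MLE for θ under $f_X^{(a)}(x;\tau^{(a)})$ is robust if

$$\theta(\sigma_U) \equiv \theta^* \quad \text{for all } \sigma_U \ge 0$$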


2. Latent-model theoretical robustness

Remarks: As we noted already

• If $f_X^{(a)}(x;\tau^{(a)})$ is correctly specified, then this condition will hold

• . . . but it can also hold when $f_X^{(a)}(x;\tau^{(a)})$ is incorrectly specified!

• E.g., if $f_X^{(a)}(x;\tau^{(a)})$ is incorrect but is sufficiently flexible to capture moments of the true model on which $\theta(\sigma_U)$ depends


2. Latent-model theoretical robustness

Full model robustness: Only verifiable in simple models; not very practically useful

A little easier: First-order latent-model robustness

$$\theta(\sigma_U) = \theta^* + \sigma_U^2\,\theta''(0) + o(\sigma_U^2)$$

• Can get by implicit differentiation of E{ψ(·)} as in Stefanski (1985, Biometrika)

• Implies a necessary, first-order condition for robustness is $\theta''(0) = 0$

• Example where this holds (and can be shown analytically)

$$Y|X \sim N(\beta_0 + \beta_1 X, \sigma_e^2), \qquad f_X^{(a)}(x;\tau^{(a)}) = \tau_2^{(a)}\, h(\tau_1^{(a)} + \tau_2^{(a)} x),$$

h(·) an arbitrary density (see Huang et al., 2006, Biometrika)


2. Latent-model theoretical robustness

Realistically: First-order robustness is still too hard to be practically useful for fancier models arising in real applications

• Need an accessible way to assess robustness to the choice of the model $f_X^{(a)}(x;\tau^{(a)})$ that can be used in data analysis

Idea: Exploit these concepts of theoretical robustness

• If $\hat\theta$ is robust, then a plot of $\theta(\sigma_U)$ vs. $\sigma_U$ should be flat! If not robust, $\theta(\sigma_U)$ will change with $\sigma_U$

• Construct an empirical plot in this spirit based on data by exploiting the simulation step of simulation-extrapolation (SIMEX). . .


3. Empirically checking robustness

Remeasured data: Add additional increments of measurement error

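• Actual observed data (Y, W), $\mathrm{var}(W|X) = \sigma_U^2$

• “Remeasured data” $\{Y, W(\lambda)\}$, $\mathrm{var}\{W(\lambda)|X\} = (1+\lambda)\sigma_U^2$

$$W(\lambda) = W + \lambda^{1/2}\sigma_U Z, \qquad Z \sim N(0,1), \quad \lambda > 0$$

• Key: If the assumed model

$$\int f_{Y|X}(y|x;\theta)\, f_{W|X}(w|x;\sigma_U^2)\, f_X^{(a)}(x;\tau^{(a)})\, dx$$

is correct for (Y, W), then

$$\int f_{Y|X}(y|x;\theta)\, f_{W|X}\{w|x;(1+\lambda)\sigma_U^2\}\, f_X^{(a)}(x;\tau^{(a)})\, dx$$

is correct for $\{Y, W(\lambda)\}$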


3. Empirically checking robustness

Result: If the assumed model $f_X^{(a)}(x;\tau^{(a)})$ is correct or yields robust inferences, an estimator based on remeasured data should be approximately unbiased regardless of the size of λ

• Thus, estimators based on remeasured data for different λ should show no dependence on λ

• ⇒ Inspect such estimators for a range of λ in a plot

• Write $\hat\theta(\lambda)$ for an estimator based on λ-remeasured data


3. Empirically checking robustness

Observed data: $(Y_j, W_j)$, $j = 1, \ldots, n$, λ = 0

• MLE $\hat\theta(0)$ (estimates θ when measurement error variance = $\sigma_U^2$)

Remeasurement method: For each λ on a grid of $\lambda \in [0, \lambda_{max}]$, $1 \le \lambda_{max} \le 3$

• Construct B remeasured data sets, where the bth remeasured data set is $\{Y_j, W_{b,j}(\lambda)\}$, $j = 1, \ldots, n$

$$W_{b,j}(\lambda) = W_j + \lambda^{1/2}\sigma_U Z_{b,j}, \qquad Z_{b,j} \stackrel{iid}{\sim} N(0,1), \quad j = 1, \ldots, n, \quad b = 1, \ldots, B$$

(B = 50 or 100 suffices)

• For each b, compute MLE $\hat\theta_b(\lambda)$ using $\{Y_j, W_{b,j}(\lambda)\}$, $j = 1, \ldots, n$

• Compute $\hat\theta_B(\lambda) = B^{-1}\sum_{b=1}^{B} \hat\theta_b(\lambda)$ (estimates θ when measurement error variance = $(1+\lambda)\sigma_U^2$)
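The remeasurement steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `remeasurement_curve` and `corrected_slope` are hypothetical names, the estimator is passed in as a function of (y, w, measurement error variance), and the demo uses a simple error-corrected slope in linear regression on simulated data as a stand-in for a likelihood fit.

```python
import numpy as np

def remeasurement_curve(y, w, sigma_u, estimator, lambdas, B=100, seed=0):
    """Return the averaged estimates theta_B(lambda) over a grid of lambda values."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    curve = []
    for lam in lambdas:
        if lam == 0:
            # lambda = 0 is the observed data; no extra error is added
            curve.append(estimator(y, w, sigma_u ** 2))
            continue
        fits = []
        for _ in range(B):
            z = rng.standard_normal(w.shape[0])
            w_lam = w + np.sqrt(lam) * sigma_u * z            # W_b(lambda)
            fits.append(estimator(y, w_lam, (1.0 + lam) * sigma_u ** 2))
        curve.append(np.mean(fits, axis=0))                   # average over b
    return np.array(curve)

def corrected_slope(y, w, var_u):
    """Hypothetical stand-in estimator: slope corrected for known error variance."""
    s_wy = np.cov(w, y)[0, 1]
    s_ww = np.var(w, ddof=1)
    return s_wy / (s_ww - var_u)

# Simulated illustration; every numerical value here is made up.
rng = np.random.default_rng(1)
n, sigma_u = 2000, 0.4
x = rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)
y = 0.5 + 1.0 * x + 0.5 * rng.standard_normal(n)
lambdas = np.linspace(0.0, 2.0, 5)
curve = remeasurement_curve(y, w, sigma_u, corrected_slope, lambdas, B=50)
```

Plotting `curve` against `lambdas` gives the empirical analogue of the theoretical plot; any estimator with the same call signature can be substituted.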


3. Empirically checking robustness

Proposed plot for checking robustness: Plot $\hat\theta_B(\lambda)$ vs. λ

• If $f_X^{(a)}(x;\tau^{(a)})$ is correct or robust, the plot should be approximately flat across the range $\lambda \in [0, \lambda_{max}]$

• If $f_X^{(a)}(x;\tau^{(a)})$ is nonrobust, the plot will exhibit change with λ

• In fact: Can apply the remeasurement method to any estimation technique for measurement error models (not just likelihood estimators)


3. Empirically checking robustness

Example: Y binary, $P(Y = 1|X = x) = \Phi(\beta_0 + \beta_1 x)$, $\theta = (\beta_0, \beta_1)^T$

• True density of X is a bimodal mixture of two normals

• Three estimators for θ: Take $f_X^{(a)}(x;\tau^{(a)})$ to be

– a normal density (n)

– the flexible SNP density (s)

– a normal mixture density (m), which is the correct specification

Plots: For $\beta_1$ ($\beta_0$ plots similar)

• Theoretical, $\beta_1^{(\cdot)}(\sigma_U) - \beta_1^{(m)}(\sigma_U)$ vs. $\sigma_U$

• Remeasurement method, $\hat\beta_{1,B}^{(\cdot)}(\lambda) - \hat\beta_{1,B}^{(m)}(\lambda)$ vs. λ, B = 100, $\sigma_U = 0.4$


3. Empirically checking robustness

Theoretical plot:

[Figure: $\beta_1^{(\cdot)}(\sigma_U) - \beta_1^{(m)}(\sigma_U)$ (vertical axis, about −0.3 to 0.0) vs. $\sigma_U$ (horizontal axis, 0.0 to 1.0)]

Solid = $\beta_1^{(n)}(\sigma_U)$, Dashed = $\beta_1^{(s)}(\sigma_U)$


3. Empirically checking robustness

Empirical plot: λ = 0 corresponds to $\sigma_U = 0.4$

[Figure: $\hat\beta_{1,B}^{(\cdot)}(\lambda) - \hat\beta_{1,B}^{(m)}(\lambda)$ (vertical axis, about −0.12 to 0.0) vs. λ (horizontal axis, 0.0 to 3.0)]

Solid = $\hat\beta_1^{(n)}(\lambda)$, Dashed = $\hat\beta_1^{(s)}(\lambda)$


3. Empirically checking robustness

Test statistic: In addition to visual assessment

$$t(\lambda^*) = \frac{\hat\theta_B(0) - \hat\theta_B(\lambda^*)}{SE\{\hat\theta_B(0) - \hat\theta_B(\lambda^*)\}}, \qquad \lambda^* > 0$$

• Choose $\lambda^*$ in accordance with B; we have used $\lambda^* = 1$ or 3 with little difference

• “Large” $|t(\lambda^*)|$ indicates lack of robustness

• Reasonable operating characteristics in simulations


3. Empirically checking robustness

Summary:

• The remeasurement method for empirically checking robustness can be applied to any measurement error model, e.g., multiplicative error, additional error-free covariates, etc.

• Currently: Extension to more complicated joint models for longitudinal data and a primary/survival endpoint

Example/details: Huang, X., Stefanski, L., and Davidian, M. (2006) Latent-model robustness in structural measurement error models. Biometrika 93, 53-64.


“Smooth” Inference for Arbitrarily Censored Time-to-Event Data

Kirsten Doehler, Min Zhang, Lihua Tang and Marie Davidian

Department of Statistics

North Carolina State University


4. “Smooth” inference for survival functions

Survival analysis: Tradition

• Parametric models too restrictive ⇒

• Nonparametric or semiparametric models and methods

• Advantage: Minimal assumptions ⇔ robustness

• Disadvantage: Includes implausible models as possibilities, computational/inferential difficulties

Perspective: Impose mild “smoothness” assumptions

• Disadvantage: More restrictive (but not too much)

• Advantage: Computational ease, unified handling of arbitrary censoring, possible efficiency gains


4. “Smooth” inference for survival functions

Assume: Time-to-event random variable T, values in (0, ∞)

• Survival function S(t) = P(T > t), 0 < t < ∞

• Density f(t), f ∈ H, where H is a class of “smooth” densities

• Objective: Estimate S(t) under these assumptions

Class H: Gallant and Nychka (1987)

• “Sufficiently differentiable”

• No “unusual” behavior, e.g., oscillations, jumps, other weirdness

• May be multimodal, skewed, fat- or thin-tailed

• q-dimensional; we consider q = 1 for now


4. “Smooth” inference for survival functions

Representation of h ∈ H:

$$h(z) = P_\infty^2(z)\,\psi(z) + \text{lower bound}$$

• Infinite Hermite series + lower bound governing tails

• $P_\infty(\cdot)$ infinite-dimensional polynomial

• ψ(·) is a density with moment generating function; the “base density”

• Almost always in published applications: ψ(·) is the standard normal density ϕ(·) (but doesn’t have to be. . . )

• “SemiNonParametric” (SNP)


4. “Smooth” inference for survival functions

Practical use: Truncate

$$h_K(z) = P_K^2(z)\,\psi(z)$$

• “Standardized version”

• E.g., K = 2, $P_K(z) = a_0 + a_1 z + a_2 z^2$

• Flexible: K = 1, 2 often suffices to approximate almost any shape

• Approximation has same support as the base density

• $\int h_K(z)\,dz = 1$ ensured “automatically” by a reparameterization of polynomial coefficients via a spherical transformation (Zhang and Davidian, 2001) ⇒ in terms of K angles φ

• K selected via information criteria. . .

• Approximate any f ∈ H by shifting/scaling of Z with this density
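A minimal sketch of the truncated SNP density with the standard normal base, assuming one common way to carry out the spherical reparameterization: write the constraint as $a^T A a = 1$ with $A_{ij} = E(Z^{i+j})$, take a point c on the unit sphere from K angles, and set $a = A^{-1/2} c$. The function names and the particular ordering of sines and cosines are assumptions made for illustration.

```python
import math
import numpy as np

def normal_moment(k):
    """E[Z^k] for Z ~ N(0,1): 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0.0
    return float(np.prod(np.arange(k - 1, 0, -2))) if k > 0 else 1.0

def snp_coefficients(angles):
    """Map K angles to K+1 coefficients a so that the density integrates to 1."""
    K = len(angles)
    # Point on the unit sphere in R^(K+1) built from spherical coordinates
    c = np.ones(K + 1)
    for i, ang in enumerate(angles):
        c[i] *= math.sin(ang)
        c[i + 1:] *= math.cos(ang)
    # A[i, j] = E[Z^(i+j)], so the integral of P_K^2 * phi equals a' A a
    A = np.array([[normal_moment(i + j) for j in range(K + 1)]
                  for i in range(K + 1)])
    vals, vecs = np.linalg.eigh(A)
    A_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return A_inv_sqrt @ c              # then a' A a = c' c = 1

def snp_density(z, angles):
    """h_K(z) = P_K(z)^2 * phi(z) with the standard normal base density."""
    a = snp_coefficients(angles)
    poly = np.polynomial.polynomial.polyval(z, a)
    return poly ** 2 * np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
```

With an empty list of angles (K = 0) the sketch reduces to the standard normal density; any real-valued angles give a valid density without constrained optimization.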


4. “Smooth” inference for survival functions

Representation of survival density: Consider two base density representations:

• $\log(T) = \mu + \sigma Z$, Z has density h ∈ H. Approximate h by $h_K(z;\phi)$ with $\psi(z) = \varphi(z) = (2\pi)^{-1/2} e^{-z^2/2}$, the standard normal density

• $T = \mu Z^{\sigma}$, Z has density h ∈ H. Approximate h by $h_K(z;\phi)$ with $\psi(z) = \mathcal{E}(z) = e^{-z}$, the standard exponential density

• Alternatively, on the log scale with extreme value base density

• In either case ⇒ approximate f(t) by $f_K(t;\theta)$, $\theta = (\mu, \sigma, \phi)$

• Evidence: Virtually any plausible survival density can be approximated with small K and one of these base densities


4. “Smooth” inference for survival functions

Survival function approximation: $S_K(t;\theta) = \int_t^\infty f_K(u;\theta)\,du$

• E.g., Normal base density

$$S_K(t;\theta) = \int_{(\log t - \mu)/\sigma}^{\infty} P_K^2(z)\,\varphi(z)\,dz$$

• Linear combination of easy integrals $I(k,r) = \int_r^\infty z^k \varphi(z)\,dz$,

$$I(0,r) = 1 - \Phi(r), \qquad I(1,r) = \varphi(r), \qquad I(k,r) = r^{k-1}\varphi(r) + (k-1)\,I(k-2,r), \quad k \ge 2$$

• Similar recursion for exponential base representation

Result: Straightforward approximation to S(t)

• Trivial computation
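The recursion translates directly into code. The sketch below assumes the normal base representation and polynomial coefficients a that are already normalized so the density integrates to 1; the function names are hypothetical.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def upper_tail(r):
    """1 - Phi(r), computed with the complementary error function."""
    return 0.5 * math.erfc(r / math.sqrt(2.0))

def tail_moment(k, r):
    """I(k, r) = integral from r to infinity of z^k * phi(z) dz."""
    if k == 0:
        return upper_tail(r)
    if k == 1:
        return phi(r)
    return r ** (k - 1) * phi(r) + (k - 1) * tail_moment(k - 2, r)

def snp_survival(t, mu, sigma, a):
    """S_K(t) for the normal base: sum over i, j of a_i a_j I(i + j, r)."""
    r = (math.log(t) - mu) / sigma
    return sum(a[i] * a[j] * tail_moment(i + j, r)
               for i in range(len(a)) for j in range(len(a)))
```

Expanding $P_K^2(z)$ gives terms $a_i a_j z^{i+j}$, so only $I(0,r), \ldots, I(2K,r)$ are needed. With a = [1.0] (K = 0) the function returns the lognormal survival function.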


4. “Smooth” inference for survival functions

Straightforward likelihood: Any censoring/truncation pattern

• Right-censored data: Observe iid $(V_i, \Delta_i)$, $i = 1, \ldots, n$, $V_i = \min(T_i, C_i)$, $T_i \perp\!\!\!\perp C_i$, $\Delta_i = I(T_i \le C_i)$

• Log-likelihood for θ based on observed data for fixed K and base

$$\ell_K(\theta) = \sum_{i=1}^{n} \left[ \Delta_i \log\{f_K(V_i;\theta)\} + (1-\Delta_i) \log\{S_K(V_i;\theta)\} \right]$$

• Interval-censored data: $T_i$ known to lie in $[L_i, R_i]$

$$\ell_K(\theta) = \sum_{i=1}^{n} \log\{S_K(L_i;\theta) - S_K(R_i^+;\theta)\}$$

• Etc.
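A small sketch of both log-likelihoods for the simplest case, K = 0 with the normal base, where $f_K$ and $S_K$ reduce to the lognormal density and survival function. This special case and the function names are assumptions chosen to keep the example self-contained; a general K would plug in the SNP density and survival function instead.

```python
import math

def log_density(t, mu, sigma):
    """log f(t) for log(T) = mu + sigma * Z, Z standard normal (K = 0)."""
    z = (math.log(t) - mu) / sigma
    return -0.5 * z * z - math.log(sigma * t * math.sqrt(2.0 * math.pi))

def survival(t, mu, sigma):
    """S(t) = P(T > t), with S(0) = 1 and S(infinity) = 0."""
    if t <= 0.0:
        return 1.0
    if math.isinf(t):
        return 0.0
    return 0.5 * math.erfc((math.log(t) - mu) / (sigma * math.sqrt(2.0)))

def loglik_right(v, delta, mu, sigma):
    """Right censoring: events contribute log f, censored times contribute log S."""
    return sum(d * log_density(vi, mu, sigma)
               + (1 - d) * math.log(survival(vi, mu, sigma))
               for vi, d in zip(v, delta))

def loglik_interval(left, right, mu, sigma):
    """Interval censoring: each subject contributes log{S(L) - S(R)}."""
    return sum(math.log(survival(l, mu, sigma) - survival(r, mu, sigma))
               for l, r in zip(left, right))
```

Because the fitted survival function is continuous, $S_K(R_i^+)$ equals $S_K(R_i)$ in this sketch, and a right-censored observation can be coded as an interval with right endpoint `math.inf`.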


4. “Smooth” inference for survival functions

Choosing K-base: Standard information criteria

• Fit K = 0, 1, . . . , $K_{max}$ for each base density, $K_{max} = 3$ generally suffices; i.e., estimate θ

• Choose K-base optimizing a given information criterion, e.g., AIC, BIC, $HQ = -2\ell_K(\hat\theta) + 2\dim(\theta)\log\log n$

• Starting values over a grid to ensure global maximum
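The selection step can be sketched as below, assuming the usual "smaller is better" forms of AIC and BIC alongside the HQ formula on the slide. The dictionary layout and function names are hypothetical, and the fitted values in any call are supplied by the user.

```python
import math

def information_criteria(loglik, n_params, n):
    """AIC, BIC and HQ from a maximized log-likelihood (smaller is better)."""
    return {
        "AIC": -2.0 * loglik + 2.0 * n_params,
        "BIC": -2.0 * loglik + n_params * math.log(n),
        "HQ": -2.0 * loglik + 2.0 * n_params * math.log(math.log(n)),
    }

def select_fit(fits, n, criterion="HQ"):
    """fits maps (K, base) -> (maximized loglik, dim(theta)); return the best key."""
    return min(fits, key=lambda kb: information_criteria(*fits[kb], n)[criterion])
```

Since every candidate K-base pair is scored on the same scale, the same function can compare normal-base and exponential-base fits in one call.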


4. “Smooth” inference for survival functions

Details/remarks:

• For chosen K-base, standard errors for $S_K(t;\hat\theta)$ via delta method treating as standard parametric problem work well

• Computation: standard optimization routines (e.g., SAS IML nlpqn), very fast

• Test statistic for comparing two groups: integrated weighted difference

$$T = \frac{\int_0^{t_0} w(u)\{S_{1,K_1}(u;\hat\theta_1) - S_{2,K_2}(u;\hat\theta_2)\}\,du}{SE\left[\int_0^{t_0} w(u)\{S_{1,K_1}(u;\hat\theta_1) - S_{2,K_2}(u;\hat\theta_2)\}\,du\right]}$$

compare to standard normal critical values


4. “Smooth” inference for survival functions

Representative Monte Carlo simulations: S(t) is Weibull, n = 200, 1000 data sets, estimation at S(t0) = 0.9, 0.8, . . . , 0.1

           30% Right cens             75% Interval cens, 25% Right cens
S(t0)      Rel eff KM    Cov prob     Rel eff NPML    Cov prob
0.1        1.21          0.93         2.97            0.91
0.2        1.39          0.94         3.49            0.92
0.3        1.41          0.95         2.84            0.93
0.4        1.42          0.95         2.75            0.94
0.5        1.47          0.95         2.57            0.93
0.6        1.43          0.95         2.35            0.95
0.7        1.40          0.95         2.21            0.95
0.8        1.45          0.94         2.15            0.93
0.9        1.87          0.94         –               0.94

• SNP bias < 1.5%


4. “Smooth” inference for survival functions

[Figure: two panels, “Right censored” and “Interval-censored”; each plots probability (0.0 to 1.0) against time (0 to 10)]


4. “Smooth” inference for survival functions

ACTG 175: Time to AIDS or death. ZDV monotherapy ($n_1 = 617$, 68% right censored) vs. combination therapy ($n_2 = 1847$, 80% right censored)

[Figure: “ACTG 175 Data”; survival probability (0.5 to 1.0) vs. time (days, 0 to 1200)]

$T^2 = 39.9$, p-value < 0.0001 (logrank test = 47.2, p-value < 0.0001)


4. “Smooth” inference for survival functions

Breast Cosmesis data (Finkelstein and Wolfe, 1985): Time to cosmetic deterioration. Radiation ($n_1 = 46$, 25 RC, 21 IC) vs. Radiation+chemo ($n_2 = 48$, 13 RC, 35 IC)

[Figure: “Cosmetic Deterioration Data”; survival probability (0.0 to 1.0) vs. time (months, 0 to 60)]

$T^2 = 7.84$, p-value = 0.005 (FW test = 6.83, p-value < 0.01)


5. “Smooth” semiparametric regression

Regression analysis: Consider right-censoring

• $T_i$, $C_i$ as before, $V_i = \min(T_i, C_i)$, $\Delta_i = I(T_i \le C_i)$

• Observed data: $(V_i, \Delta_i, X_i)$, $X_i$ (p × 1) vector of covariates

• Usual assumption: $T_i \perp\!\!\!\perp C_i \mid X_i$

• Interested in a model that describes the association between $T_i$ and $X_i$


5. “Smooth” semiparametric regression

Popular models: With meaningful interpretation

• Accelerated failure time (AFT) model

$$\log T_i = X_i^T\beta + e_i, \qquad e_i \text{ has density } f_0(t)$$

⇒ Represent $f_0(t)$ by SNP

• Proportional hazards model (PH) – unspecified baseline survival function $S_0(t)$

$$S(t|X_i) = S_0(t)^{\exp(X_i^T\beta)}$$

⇒ Represent density $f_0(t)$ of $S_0(t)$ by SNP

• Proportional odds model (PO) – unspecified baseline log odds $a_0(t)$

$$-\mathrm{logit}\{S(t|X_i)\} = a_0(t) + X_i^T\beta, \qquad a_0(t) = -\mathrm{logit}\{S_0(t)\} = -\log[S_0(t)/\{1 - S_0(t)\}]$$

⇒ Represent density $f_0(t)$ of $S_0(t)$ by SNP
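A short sketch of how the PH and PO models map a baseline survival probability to a covariate-specific one. It takes the baseline log odds to be $a_0(t) = -\mathrm{logit}\{S_0(t)\}$, so that a zero linear predictor recovers $S_0(t)$; the function names are hypothetical and `linpred` stands for $X_i^T\beta$.

```python
import math

def ph_survival(s0, linpred):
    """Proportional hazards: S(t|X) = S0(t) ** exp(X'beta)."""
    return s0 ** math.exp(linpred)

def po_survival(s0, linpred):
    """Proportional odds: odds of failure by t are multiplied by exp(X'beta)."""
    odds_fail = (1.0 - s0) / s0 * math.exp(linpred)
    return 1.0 / (1.0 + odds_fail)
```

In either model the baseline $S_0(t)$ would come from the SNP representation of $f_0(t)$, so the same likelihood machinery applies across AFT, PH, and PO fits.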


5. “Smooth” semiparametric regression

Remarks:

• Arbitrary censoring straightforward

• All models in a common framework ⇒ model selection via information criteria

• Standard errors, confidence intervals, etc. straightforward

• Easy computation


5. “Smooth” semiparametric regression

Extensions:

• “Heteroscedastic” AFT model

• Subject-specific AFT model for clustered data

$$\log T_{ij} = X_{ij}^T\beta + b_i + e_{ij}, \qquad b_i \sim N(0, \sigma_b^2), \quad e_{ij} \stackrel{iid}{\sim} f_0(t)$$

• Bivariate survival data: $T_1$, $T_2$ have “smooth” density $f(t_1, t_2)$ ⇒ represent by bivariate (q = 2) SNP

• Joint longitudinal-survival models

• Etc.
