1 MEASUREMENT ERROR

J. S. Buzas, T. D. Tosteson and L. A. Stefanski *

1 University of Vermont, [email protected]
2 North Carolina State University, [email protected]
3 Dartmouth College, [email protected]

Summary. This article focuses on statistical issues related to the problems of fitting models relating a disease response variable, Y, to true predictors X and error-free predictors Z, given values of measurements W, in addition to Y and Z. Although disease status may also be subject to measurement error, attention is limited to measurement error in predictor variables.

The article is organized in three main sections. The first defines basic concepts and models of measurement error and outlines the effects of ignoring measurement error on the results of standard statistical analyses. An important aspect of most measurement error problems is the inability to estimate parameters of interest given only the information contained in a sample of (Y,Z,W) values. Some features of the joint distribution of (Z,X,W) must be known or estimated in order to estimate parameters of interest. Thus additional data, depending on the type of error model, must often be collected. Consequently it is important to include measurement error considerations when planning a study, both to enable application of a measurement error analysis of the data and to ensure validity of conclusions. Planning studies in the presence of measurement error is the topic of the second section. Methods for the analysis of data measured with error differ according to the nature of the measurement error, the additional parameter-identifying information that is available, and the strength of the modeling assumptions appropriate for a particular problem. The third section describes a number of common approaches to the analysis of data measured with error, including simple, generally applicable bias-adjustment approaches, conditional likelihood, and full likelihood approaches.

Institute of Statistics Mimeo Series No. 2544, April 2003

* J. S. Buzas is Associate Professor, Department of Mathematics and Statistics, University of Vermont. Leonard A. Stefanski is Professor, Department of Statistics, North Carolina State University. Tor D. Tosteson is Associate Professor of Community and Family Medicine (Biostatistics), Dartmouth College. Email addresses: [email protected], [email protected], [email protected].


1.1 Introduction

Factors contributing to the presence or absence of disease are not always easily determined or accurately measured. Consequently, epidemiologists are often faced with the task of inferring disease patterns using noisy or indirect measurements of risk factors or covariates. Problems of measurement arise for a number of reasons, including, for example: reliance on self-reported information; the use of records of suspect quality; intrinsic biological variability; sampling variability; and laboratory analysis error. Although the reasons for imprecise measurement are diverse, the inference problems they create share a common structure: statistical models must be fit to data formulated in terms of well-defined but unobservable variables X, using information on measurements W that are less than perfectly correlated with X. Problems of this nature are called measurement error problems, and the statistical models and methods for analyzing such data are called measurement error models.

This article focuses on statistical issues related to the problems of fitting models relating a disease response variable, Y, to true predictors X and error-free predictors Z, given values of measurements W, in addition to Y and Z. Although disease status may also be subject to measurement error, attention is limited to measurement error in predictor variables. We further restrict attention to measurement error in continuous predictor variables. Categorical predictors are not immune from problems of ascertainment, but misclassification is a particular form of measurement error. Consequently, misclassification error is generally studied separately from measurement error, although there is clearly much overlap.

The article is organized in three main sections. Section 1.2 defines basic concepts and models of measurement error and outlines the effects of ignoring measurement error on the results of standard statistical analyses. An important aspect of most measurement error problems is the inability to estimate parameters of interest given only the information contained in a sample of (Y,Z,W) values. Some features of the joint distribution of (Z,X,W) must be known or estimated in order to estimate parameters of interest. Thus additional data, depending on the type of error model, must often be collected. Consequently it is important to include measurement error considerations when planning a study, both to enable application of a measurement error analysis of the data and to ensure validity of conclusions. Planning studies in the presence of measurement error is the topic of Section 1.3. Methods for the analysis of data measured with error differ according to the nature of the measurement error, the additional parameter-identifying information that is available, and the strength of the modeling assumptions appropriate for a particular problem. Section 1.4 describes a number of common approaches to the analysis of data measured with error, including simple, generally applicable bias-adjustment approaches, conditional likelihood, and full likelihood approaches.


This article is intended as an introduction to the topic. In-depth coverage of linear measurement error models is provided by Fuller [42]. Carroll et al. [30] provide detailed coverage of nonlinear models as well as density estimation. Other review articles geared toward measurement error in epidemiology include Carroll [21], Thomas et al. [115], and Armstrong et al. [6]. Prior to the book by Fuller [42], the literature on measurement error models was largely concerned with linear measurement error models and went under the name errors-in-variables.

1.2 Measurement Error and Its Effects

This section presents the basic concepts and definitions used in the literature on nonlinear measurement error models. The important distinction between differential and nondifferential error is discussed first, followed by a description of two important models for measurement error. The major effects of measurement error are described and illustrated in terms of multivariate normal regression models.

1.2.1 Differential and Nondifferential Error, and Surrogate Variables

The error in W as a measurement of X is nondifferential if the conditional distribution of Y given (Z,X,W) is the same as that of Y given (Z,X), that is, fY|ZXW = fY|ZX. When fY|ZXW ≠ fY|ZX the error is differential. The key feature of a nondifferential measurement is that it contains no information for predicting Y in addition to the information already contained in Z and X. When fY|ZXW = fY|ZX, W is said to be a surrogate for X.

Many statistical methods in the literature on measurement error modeling are based on the assumption that W is a surrogate. It is important to understand this concept and to recognize when it is or is not an appropriate assumption. Nondifferential error is plausible in many cases, but there are situations where it should not be assumed without careful consideration.

If measurement error is due solely to instrument or laboratory-analysis error, then it can often be argued that the error is nondifferential. However, in epidemiologic applications measurement error commonly has multiple sources, and instrument and laboratory-analysis error are usually minor components of the total measurement error. Often in these cases it is not clear whether measurement error is nondifferential.

The potential for differential error is greater in case-control studies, because ascertainment of covariate information and measurement of exposure follow disease response determination. In this case selective recall, or a tendency for cases to overestimate exposure, can induce dependencies between the response and the measured exposure even after conditioning on the true exposure.

A useful exercise for thinking about the plausibility of the assumption that W is a surrogate is to consider whether W would have been measured (or included in a regression model) had X been available. For example, suppose that the natural predictor X is defined as the temporal or spatial average value of a time-varying risk factor or spatially varying exposure (e.g., blood pressure, cholesterol, lead exposure, particulate matter exposure), and the observed W is a measurement at a single point in time or space. In such cases, it might be convincingly argued that the single measurement contributes little or no information in addition to that contained in the long-term average.

However, this line of reasoning is not foolproof. The surrogate status of W can depend on the particular model being fit to the data. For example, consider models where Z has two components, Z = (Z1, Z2). It is possible to have fY|Z1Z2XW = fY|Z1Z2X and fY|Z1XW ≠ fY|Z1X. Thus W is a surrogate in the full model including Z1 and Z2, but not in the reduced model including only Z1. In other words, whether a variable is a surrogate or not depends on the other variables in the model. A simple example illustrates this feature. Let X ∼ N(µx, σ²x). Assume that ε1, ε2, U1 and U2 are mean-zero normal random variables such that X, ε1, ε2, U1, U2 are mutually independent. Let Z = X + ε1 + U1, Y = β1 + βzZ + βxX + ε2, and W = X + ε1 + U2. Then E(Y|X) ≠ E(Y|X,W) but E(Y|Z,X,W) = E(Y|Z,X). The essential feature of this example is that the measurement error W − X is correlated with the covariate Z. Whether Z is in the model or not determines whether W is a surrogate or not. Such a situation has the potential of arising in air pollution health effects studies. Suppose that X is the spatial-average value of an air pollutant, W is the value measured at a single location, the components of Z include meteorological variables, and Y is a spatially aggregated measure of morbidity or mortality (all variables recorded daily; X, W and Z suitably lagged). If weather conditions influence both health and the measurement process (e.g., by influencing the spatial distribution of the pollutant), then it is possible that W would be a surrogate only for the full model containing Z.

With nondifferential measurement error, it is possible to estimate parameters in the model relating the response to the true predictor using the measured predictor, with only minimal additional information on the error distribution; i.e., it is not necessary to observe the true predictor. This is not generally possible with differential measurement error, in which case it is necessary to have a validation subsample in which both the measured value and the true value are recorded. The data requirements are discussed more fully in Section 1.3. Much of the literature on measurement error models deals with nondifferential error, and hence that is the focus of this article. Problems with differential error are often better analyzed via techniques for missing data.


1.2.2 Error Models

The number of ways a surrogate W and predictor X can be related is countless. However, in practice it is often possible to reduce most problems to one of two simple error structures. For understanding the effects of measurement error and the statistical methods for analyzing data measured with error, an understanding of these two simple error structures is generally sufficient.

Classical Error Model The standard statistical model for the case in which W is a measurement of X in the usual sense is W = X + U, where U has mean zero and is independent of X. As explained in the preceding section, whether W is a surrogate or not depends on more than just the joint distribution of X and W. However, in the sometimes plausible case that the error U is independent of all other variables in a model, the error is nondifferential and W is a surrogate. This is often called the classical error model. More precisely, it is an independent, unbiased, additive measurement error model. Because E(W | X) = X, W is said to be an unbiased measurement of X.

Not all measuring methods produce unbiased measurements. However, it is often possible to calibrate a biased measurement, resulting in an unbiased measurement. Error calibration is discussed in greater detail later.

Berkson Error Model For the case of Berkson error, X varies around W, and the accepted statistical model is X = W + U, where U has mean zero and is independent of W. For this model, E(X | W) = W, and W is called an unbiased Berkson predictor of X, or simply an unbiased predictor of X. The terminology results from the fact that the best squared-error predictor of X given W is E(X | W) = W.

Berkson [8] described a measurement error model which is superficially similar to the classical error model but has very different statistical properties. He described the error model for experimental situations in which the observed variable was controlled, hence the alternative name controlled variable model, and the error-free variable, X, varied around W. For example, suppose that an experimental design called for curing a material in a kiln at a specified temperature W, determined by thermostat setting. Although the thermostat is set to W, the actual temperature in the kiln, X, often varies randomly from W due to less-than-perfect thermostat control. For a properly calibrated thermostat a reasonable assumption is that E(X | W) = W, which is the salient feature of a Berkson measurement (compare to an unbiased measurement, for which E(W | X) = X).

Apart from experimental situations in which W is truly a controlled variable, the unbiased Berkson error model seldom arises as a consequence of sampling design or direct measurement. However, as with the classical error model, it is possible to calibrate a biased surrogate so that the calibrated measurement satisfies the assumptions of the Berkson error model.
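
The contrast between classical and Berkson error is easy to demonstrate by simulation. The following sketch (Python with NumPy; the parameter values and variable names are illustrative assumptions, not taken from the text) fits naive least squares regressions of Y on W under each error model: the classical-error slope is attenuated by the reliability ratio λ = σ²x/(σ²x + σ²u), while the Berkson-error slope is unbiased.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200_000
    beta1, betax = 1.0, 2.0                      # true intercept and slope
    sigma_x, sigma_u, sigma_eps = 1.0, 0.7, 0.5  # illustrative values

    def naive_slope(w, y):
        # least squares slope from the naive regression of y on w
        return np.polyfit(w, y, 1)[0]

    # Classical error: W = X + U, U independent of X
    x = rng.normal(0.0, sigma_x, n)
    y = beta1 + betax * x + rng.normal(0.0, sigma_eps, n)
    w_classical = x + rng.normal(0.0, sigma_u, n)

    # Berkson error: X = W + U, U independent of W
    w_berkson = rng.normal(0.0, sigma_x, n)
    x_b = w_berkson + rng.normal(0.0, sigma_u, n)
    y_b = beta1 + betax * x_b + rng.normal(0.0, sigma_eps, n)

    lam = sigma_x**2 / (sigma_x**2 + sigma_u**2)       # reliability ratio
    print(naive_slope(w_classical, y), betax * lam)    # attenuated: both ~1.34
    print(naive_slope(w_berkson, y_b))                 # unbiased: ~2.0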

Reduction to Unbiased Error Model The utility of the classical and Berkson error structures is due to the fact that many error structures can be transformed to one or the other. Suppose that W* is a surrogate for X. For the case that a linear model for the dependence of W* on X is reasonable, that is, W* = γ1 + γxX + U*, where U* is independent of X, the transformed variable W = (W* − γ1)/γx satisfies the classical error model W = X + U, where U = U*/γx. In other words, W* can be transformed into an independent, unbiased, additive measurement.

Alternatively, for the transformation W = E(X | W*) it follows that X = W + U, where U = X − E(X | W*) is uncorrelated with W. Thus, apart from the distinction between independence and zero correlation of the error U, any surrogate W* can be transformed to an unbiased additive Berkson error structure.

Both types of calibration are useful. The transformation that maps an uncalibrated surrogate W* into a classical error model is called error calibration. The transformation that maps W* into a Berkson error model is called regression calibration [30]; see Tosteson et al. [117] for an interesting application of regression calibration.

In theory, calibration reduces an arbitrary surrogate to a classical error measurement or a Berkson error measurement, explaining the attention given to these two unbiased error models. In practice, things are not so simple. Seldom are the parameters in the regression of W on X (error calibration) or in the regression of X on W (regression calibration) known; these parameters must be estimated, which is generally possible only if supplementary data are available for doing so. In these cases the estimation of the parameters in the chosen calibration function introduces yet another source of variability, which should be accounted for in the standard errors of estimators calculated from the calibrated data.

1.2.3 Measurement Error in the Normal Linear Model

We now consider the effects of measurement error in a normal simple linear regression model. This model has limited use in epidemiology, but it is one of the few models in which the effects of measurement error can be explicitly derived and explained. Measurement error affects relative risk coefficients in much the same way as regression coefficients, so the insights gained from this simple model carry over to more useful epidemiologic models.

Consider the multivariate normal formulation of the simple linear regression model,

    ( Y )        ( β1 + βxµx )   ( β²xσ²x + σ²ε    βxσ²x )
    ( X )  ∼  N( ( µx        ),  ( βxσ²x           σ²x   ) ).        (1.1)

If, as is assumed here, the substitute variable W is jointly normally distributed with (Y,X), then in the absence of additional assumptions on the relationship between W and (Y,X), the multivariate normal model for (Y,X,W) is

    ( Y )        ( β1 + βxµx )   ( β²xσ²x + σ²ε    βxσ²x    βxσxw + σεw )
    ( X )  ∼  N( ( µx        ),  ( βxσ²x           σ²x      σxw         ) ),    (1.2)
    ( W )        ( µw        )   ( βxσxw + σεw     σxw      σ²w         )

where σxw = Cov(X,W) and σεw = Cov(ε,W). In measurement error modeling the available data consist of observations (Y,W), so the relevant sampling model is the marginal distribution of (Y,W).

We now describe biases that arise from the so-called naive analysis of the data, that is, the analysis of the observed data using the usual methods for error-free data. In this case the naive analysis is least squares analysis of {(Wi, Yi), i = 1, ..., n}, so that the naive analysis results in unbiased estimates of the parameters in the regression model for Y on W, or what we refer to as the naive model. Naive-model parameters are given in Table 1.1 for some particular error models.

Differential Error For the general case of measurement with possibly differential error the naive estimator of slope is an unbiased estimator of (βxσxw + σεw)/σ²w rather than βx. Depending on the covariances between ε and W, and X and W, and the variance of W, the naive-model slope could be less than or greater than βx, so that no general conclusions about bias are possible. Similarly, the residual variance of the naive regression could be either greater or less than the true-model residual variance. It follows that for a general measurement W, the coefficient of determination for the naive analysis could be greater or less than for the true model. These results indicate the futility of trying to make generalizations about the effects of using a general measurement for X in a naive analysis.

Surrogate For the multivariate normal model with 0 < ρ²xw < 1, W is a surrogate if and only if σεw = 0. With an arbitrary surrogate measurement the naive estimator of slope unbiasedly estimates βxσxw/σ²w. Depending on the covariance between X and W and the variance of W, the naive-model slope could be less or greater than βx, so that again no general statements about bias in the regression parameters are possible. For an uncalibrated measurement, E(W|X) = γ0 + γxX, σxw = Cov(X,W) = γxσ²x, and Var(W) = γ²xσ²x + σ²u. In this case the relative bias, σxw/σ²w = γxσ²x/(γ²xσ²x + σ²u), is bounded in absolute value by 1/|γx|. For an uncalibrated Berkson measurement, E(X|W) = α1 + αwW, σxw = αwσ²w, and the relative bias is αw. When W is a surrogate, the residual variance from the naive analysis is never less than the true-model residual variance, and is strictly greater except in the extreme case that X and W are perfectly correlated, ρ²xw = 1. It follows that for an arbitrary surrogate the coefficient of determination for the naive model is always less than or equal to that for the true model. The use of a surrogate always entails a loss of predictive power. The form of the naive-model slope indicates that in order to recover βx from an analysis of the observed data, only σxw would have to be known. A validation study in which bivariate observations (X,W) were obtained would provide the necessary information for estimating σxw.

Classical Error If the surrogate W is an unbiased measurement, E(W | X) = X, and the classical error model holds, then µw = µx, σxw = σ²x, and σ²w = σ²x + σ²u. In this case the naive slope estimator unbiasedly estimates βxσ²x/(σ²x + σ²u). The sign of βxσ²x/(σ²x + σ²u) is always the same as the sign of βx, and the inequality {σ²x/(σ²x + σ²u)}|βx| ≤ |βx| shows that the naive estimator of slope is always biased toward 0. This type of bias is called attenuation or attenuation toward the null. The attenuation factor λ = σ²x/(σ²x + σ²u) is called the reliability ratio, and its inverse is called the linear correction for attenuation. In this case the coefficient of determination is also attenuated toward zero, and the term attenuation is often used to describe both the attenuation in the slope coefficient and the attenuation in the coefficient of determination. Regression dilution has also been used in the epidemiology literature to describe attenuation (MacMahon et al. [63]). In order to recover βx from an analysis of the observed data it would be sufficient to know σ²u. Either replicate measurements or validation data provide information for estimating the measurement error variance σ²u.

Berkson Error With W a surrogate, the Berkson error model is embedded in the multivariate normal model by imposing the condition E(X | W) = W. In this case µx = µw, σxw = σ²w, and σ²x = σ²w + σ²u. When W and X satisfy the unbiased Berkson error model, X = W + U, the naive estimator of slope is an unbiased estimator of βx, i.e., there is no bias. Thus there is no bias in the naive regression parameter estimators, but there is an increase in the residual variance and a corresponding decrease in the model coefficient of determination. Even though no bias is introduced, there is still a penalty incurred with the use of Berkson predictors. However, with respect to valid inference on regression coefficients, the linear model is robust to Berkson errors. The practical importance of this robustness property is limited because the unbiased Berkson error model is seldom appropriate without regression calibration, except in certain experimental settings as described previously.

Table 1.1. Table entries are slopes and residual variances of the linear model relating Y to W for the cases in which W is a differential measurement, a surrogate, an unbiased classical-error measurement, an unbiased Berkson predictor, and the case of no error (W = X).

    Error Model    Slope                       Residual Variance
    Differential   βx(σxw/σ²w) + (σεw/σ²w)     σ²ε + β²xσ²x − (σxwβx + σεw)²/σ²w
    Surrogate      βx(σxw/σ²w)                 σ²ε + β²xσ²x(1 − ρ²xw)
    Classical      βx σ²x/(σ²x + σ²u)          σ²ε + β²xσ²x σ²u/(σ²x + σ²u)
    Berkson        βx                          σ²ε + β²xσ²x(σ²u/σ²x)
    No Error       βx                          σ²ε

Discussion Measurement error is generally associated with attenuation, and as Table 1.1 shows, attenuation in the coefficient of determination occurs with any surrogate measurement. However, attenuation in the regression slope is, in general, specific only to the classical error model. The fact that measurement-error-induced bias depends critically on the type of measurement error underlies the importance of correctly identifying the measurement error in applications. Incorrect specification of the measurement error component of a model can create problems as great as those caused by ignoring measurement error.

The increase in residual variance associated with surrogate measurements (including classical and Berkson) gives rise not only to a decrease in predictive power, but also contributes to reduced power for testing. The noncentrality parameter for testing H0: βx = 0 with surrogate measurements is

    nβ²xσ²xρ²xw / {σ²ε + β²xσ²x(1 − ρ²xw)},

which is less than the true-data noncentrality parameter, nβ²xσ²x/σ²ε, whenever ρ²xw < 1. These expressions give rise to the equivalent-power sample size formula

    nw = nx [{σ²ε + β²xσ²x(1 − ρ²xw)} / {σ²ε ρ²xw}] ≈ nx/ρ²xw,

where nw is the number of (W,Y) pairs required to give the same power as a sample of size nx of (X,Y) pairs. The approximation is reasonable near the null value βx = 0 (or, more precisely, when β²xσ²x(1 − ρ²xw) is small).

The loss of power for testing is not always due to an increase in variability of the parameter estimates. For the classical error model the variance of the naive estimator is asymptotically less than the variance of the true-data estimator if and only if β²xσ²x/(σ²x + σ²u) < σ²ε/σ²x, which is possible when σ²ε is large, or σ²u is large, or |βx| is small. So, relative to the case of no measurement error, classical errors can result in more precise estimates of the wrong (i.e., biased) quantity. This cannot occur with Berkson errors, for which the asymptotic variance of the naive estimator is never less than the variance of the true-data estimator.

The normal linear model also illustrates the need for additional information in measurement error models. For example, for the case of an arbitrary surrogate the joint distribution of Y and W contains eight unknown parameters (β1, βx, µx, µw, σ²x, σ²ε, σxw, σ²w), whereas a bivariate normal distribution is completely determined by only five parameters. This means that not all eight parameters can be estimated with data on (Y,W) alone. In particular, βx is not estimable. However, from Table 1.1 it is apparent that if a consistent estimator of σxw can be constructed, say from validation data, then the method-of-moments estimator β̂x = (s²w/σ̂xw)β̂w is a consistent estimator of βx, where β̂w is the least squares estimator of slope in the linear regression of Y on W, s²w is the sample variance of W, and σ̂xw is the validation-data estimator of σxw.

For the case of additive, unbiased measurement error the joint distribution of Y and W contains six unknown parameters (β1, βx, µx, σ²x, σ²ε, σ²u), so that again not all of the parameters are identified. Once again βx is not estimable. However, if a consistent estimator of σ²u can be constructed, say from either replicate measurements or validation data, then the method-of-moments estimator β̂x = {s²w/(s²w − σ̂²u)}β̂w is a consistent estimator of βx, where σ̂²u is the estimator of σ²u.

For the Berkson error model there are also six unknown parameters in the joint distribution of Y and W, (β1, βx, µx, σ²x, σ²ε, σ²w), so that again not all of the parameters are identified. The regression parameters β1 and βx are estimated unbiasedly by the intercept and slope estimators from the least squares regression of Y on W. However, without additional data it is not possible to estimate σ²ε.
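
The method-of-moments correction for classical error is simple to carry out when replicate measurements are available. The sketch below (Python with NumPy; the simulated data and parameter values are illustrative assumptions) estimates σ²u from the within-pair variation of two replicates and applies the linear correction for attenuation.

    import numpy as np

    rng = np.random.default_rng(2)
    n, beta1, betax = 5000, 1.0, 2.0
    sigma_x, sigma_u, sigma_eps = 1.0, 0.8, 0.5

    x = rng.normal(0.0, sigma_x, n)
    y = beta1 + betax * x + rng.normal(0.0, sigma_eps, n)
    w1 = x + rng.normal(0.0, sigma_u, n)           # replicate 1
    w2 = x + rng.normal(0.0, sigma_u, n)           # replicate 2

    wbar = (w1 + w2) / 2                           # error variance sigma_u^2 / 2
    sigma_u2_hat = np.mean((w1 - w2) ** 2) / 2     # moment estimator of sigma_u^2
    beta_w = np.polyfit(wbar, y, 1)[0]             # naive (attenuated) slope
    s2_wbar = np.var(wbar, ddof=1)
    betax_hat = beta_w * s2_wbar / (s2_wbar - sigma_u2_hat / 2)
    print(beta_w, betax_hat)                       # ~1.5 (attenuated) vs ~2.0 (corrected)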

1.2.4 Multiple Linear Regression

The entries in Table 1.1 and the qualitative conclusions based on them generalize to the case of multiple linear regression with multiple predictors measured with error. For the Berkson error model it remains the case that no bias in the regression parameter estimators results from the substitution of W for X, and the major effects of measurement error are those resulting from an increase in the residual variation.

For the classical measurement error model there are important aspects of the problem that are not present in the simple linear regression model. When the model includes both covariates measured with error, X, and covariates measured without error, Z, it is possible for measurement error to bias the naive estimator of βz as well as the naive estimator of βx. Furthermore, attenuation in the coefficient of a variable measured with error is no longer a simple function of the variance of that variable and the measurement error variance. When there are multiple predictors measured with error, the bias in regression coefficients is a nonintuitive function of the measurement error covariance matrix and the true-predictor covariance matrix.

Suppose that the multiple linear regression model for Y given Z and X is Y = β1 + βzᵀZ + βxᵀX + ε. For the additive error model W = X + U, the naive estimator of the regression coefficients is estimating

    ( βz* )     ( σzz   σzx       )⁻¹ ( σzz   σzx ) ( βz )
    ( βx* )  =  ( σxz   σxx + σuu )   ( σxz   σxx ) ( βx )        (1.3)

and not (βzᵀ, βxᵀ)ᵀ. For the case of multiple predictors measured with error, with no restrictions on the covariance matrices of the predictors or the measurement errors, bias in individual coefficients can take almost any form. Coefficients can be attenuated toward the null or inflated away from zero. The bias is not always multiplicative. Coefficients can change sign, and zero coefficients can become nonzero (null predictors can appear to be significant). Very little can be said in general; individual cases must be analyzed separately.

However, in the case where only one variable is measured with error, i.e., X is a scalar, the attenuation factor in βx* is λ1 = σ²x|z/(σ²x|z + σ²u), where σ²x|z is the residual variance from the regression of X on Z; that is, βx* = λ1βx. Because σ²x|z ≤ σ²x, attenuation is accentuated relative to the case of no covariates when the covariates in the model are correlated with X, i.e., λ1 ≤ λ, with strict inequality when σ²x|z < σ²x. Also, in the case of a single variable measured with error, βz* = βz + (1 − λ1)βxΓz, where Γz is the coefficient vector of Z in the regression of X on Z, that is, E(X | Z) = Γ1 + ΓzᵀZ. Thus measurement error in X can induce bias in the regression coefficients of Z. This has important implications for analysis-of-covariance models in which the continuous predictor is measured with error.
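
Formula (1.3) is easy to evaluate numerically. The sketch below (Python with NumPy; the covariance values are illustrative assumptions) shows, for scalar Z and X, how error in X both attenuates βx by λ1 and shifts βz when X and Z are correlated.

    import numpy as np

    # illustrative covariances for scalar Z and scalar X
    s_zz, s_zx, s_xx, s_uu = 1.0, 0.5, 1.0, 0.5
    beta = np.array([1.0, 2.0])                    # (beta_z, beta_x)

    S = np.array([[s_zz, s_zx], [s_zx, s_xx]])
    S_err = S + np.diag([0.0, s_uu])               # measurement error adds to X only
    beta_star = np.linalg.solve(S_err, S @ beta)   # equation (1.3)

    # compare with the scalar-X formulas
    s2_x_given_z = s_xx - s_zx**2 / s_zz           # residual variance of X on Z
    lam1 = s2_x_given_z / (s2_x_given_z + s_uu)
    gamma_z = s_zx / s_zz                          # coefficient of Z in E(X|Z)
    print(beta_star)                               # [1.4, 1.2]
    print(beta[0] + (1 - lam1) * beta[1] * gamma_z, lam1 * beta[1])   # 1.4, 1.2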

The effects of measurement error on naive tests of hypotheses can be understood by exploiting the fact that in the classical error model W is a surrogate. In this case E(Y|Z,W) = E{E(Y|Z,X,W)|Z,W} = E{E(Y|Z,X)|Z,W} = β1 + βzᵀZ + βxᵀE(X|Z,W). With multivariate normality E(X|Z,W) is linear, say E(X|Z,W) = α0 + αzᵀZ + αwᵀW, and thus

    E(Y|Z,W) = β1 + βxᵀα0 + (βzᵀ + βxᵀαzᵀ)Z + βxᵀαwᵀW.        (1.4)

This expression holds for any surrogate W. Our summary of hypothesis testing in the presence of measurement error is appropriate for any surrogate-variable model provided αw is an invertible matrix, as it is for the classical error model. Suppose that the naive model is parameterized as

    E(Y|Z,W) = γ0 + γzᵀZ + γwᵀW.        (1.5)

A comparison of (1.4) and (1.5) reveals the main effects of measurement error on hypothesis testing.

First note that (βzᵀ, βxᵀ)ᵀ = 0 if and only if (γzᵀ, γwᵀ)ᵀ = 0. This implies that the naive-model test that none of the predictors are useful for explaining variation in Y is valid in the sense of having the desired Type I error rate. Further examination of (1.4) and (1.5) shows that γz = 0 is equivalent to βz = 0 only if αzβx = 0. It follows that the naive test of H0: βz = 0 is valid only if X is unrelated to Y (βx = 0) or if Z is unrelated to X (αz = 0). Finally, because αw is invertible, γw = αwβx = 0 is equivalent to βx = 0, which implies that the naive test of H0: βx = 0 is valid. The naive tests that are valid, those that maintain the Type I error rate, still suffer reduced power relative to the tests based on the true data.
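
A quick simulation illustrates the invalid case. In the sketch below (Python with NumPy; all values are illustrative assumptions) βz = 0 but Z is correlated with X, so the naive coefficient on Z converges to βxαz ≠ 0 and the naive test of H0: βz = 0 rejects far too often.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 100_000
    z = rng.normal(size=n)
    x = 0.8 * z + rng.normal(size=n)          # Z related to X (alpha_z != 0)
    y = 1.0 + 2.0 * x + rng.normal(size=n)    # beta_z = 0, beta_x = 2
    w = x + rng.normal(size=n)                # classical measurement error

    # naive least squares of Y on (1, Z, W)
    D = np.column_stack([np.ones(n), z, w])
    gamma = np.linalg.lstsq(D, y, rcond=None)[0]
    print(gamma)   # coefficient on Z is ~0.8, not 0: the naive test of beta_z is invalid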

1.2.5 Nonlinear Regression

The effects of measurement error in nonlinear models are qualitatively much the same as in the normal linear model. The use of a surrogate measurement generally results in reduced power for testing associations, produces parameter bias, and results in a model with less predictive power. However, the nature of the bias depends on the model, the type of parameter, and the error model. Generally, the more nonlinear the model, the less relevant are the results for the linear model. Parameters other than linear regression coefficients (e.g., polynomial coefficients, transformation parameters, and variance function parameters) have no counterpart in the normal linear model, and the effect of measurement error on such parameters must be studied on a case-by-case basis.

Regression coefficients in generalized linear models, including models of particular interest in epidemiology such as logistic regression and Poisson regression, are affected by measurement error in much the same manner as are linear model regression coefficients. This means that relative risks and odds ratios derived from logistic regression models are affected by measurement error much the same as linear model regression coefficients. However, unlike the linear model, unbiased Berkson measurements generally produce biases in nonlinear models, although they are often much less severe than biases resulting from classical measurement errors (for comparable ρxw). This fact forms the basis for the method known as regression calibration, in which an unbiased Berkson predictor is estimated by a preliminary calibration analysis, and then the usual (naive) analysis is performed with E(X|W) replacing X. This fact also explains why more attention is paid to the classical error model than to the Berkson error model.

The effects of classical measurement error on flexible regression models, e.g., nonparametric regression, are not easily quantified, but there are general tendencies worth noting. Measurement error generally "smooths out" regression functions. Nonlinear features of E(Y|X) such as curvature at local extremes, points of nondifferentiability, and discontinuities will generally be less pronounced or absent in E(Y|W). For normal measurement error, E(Y|W) is smooth whether or not E(Y|X) is, and local maxima and minima will be less extreme; measurement error tends to wear off the peaks and fill in the valleys. This can be seen in a simple parametric model. If E(Y|X) = β0 + β1X + β2X² and (X,W) are jointly normal with µx = 0, then E(Y|W) is also quadratic with the quadratic coefficient attenuated by ρ⁴xw. The local extremes of the two regressions differ by β2σ²x(1 − ρ²xw), which is positive (negative) when E(Y|X) is convex (concave).

The effects of classical measurement error on density estimation are qualitatively similar to those on nonparametric regression. Modes are attenuated and regions of low density are inflated. Measurement error can mask multimodality in the true density and will inflate the tails of the distribution. Naive estimates of tail quantiles are generally more extreme than the corresponding true-data estimates.

1.2.6 Logistic Regression Example

This section closes with an empirical example illustrating the effects of measurement error in logistic regression and the utility of the multivariate normal linear regression model results for approximating the effects of measurement error. The data used are a subset of the Framingham Heart Study data and are described in detail in Carroll et al. [30]. For these data X is long-term average systolic blood pressure after transformation via ln(SBP − 50), denoted TSBP. There are replicate measurements (W1,W2) for each of n = 1615 subjects in the study. The true-data model is logistic regression of coronary heart disease (0,1) on X and covariates (Z) including age, smoking status (0,1), and cholesterol level.

Assuming the classical error model for the replicate measurements, Wj = X + Uj, analysis of variance produces the estimate σ̂²u = .0126. The average W̄ = (W1 + W2)/2 provides the best measurement of X, with an error variance of σ²u/2 (estimated as .0063).

The three measurements, W1, W2 and W̄, can be used to empirically demonstrate attenuation due to measurement error. The measurement error variances of W1 and W2 are equal and are twice as large as the measurement error variance of W̄. Thus the attenuation in the regressions using W1 and W2 should be equal, whereas the regression using W̄ should be less attenuated. Three naive logistic models,

    logit{Pr(CHD = 1)} = β0 + βz1 AGE + βz2 SMOKE + βz3 CHOL + βx TSBP,

were fit using each of the three measurements W1, W2 and W̄. The estimates of the TSBP coefficient from the logistic regressions using W1 and W2 are both 1.5 (to one decimal place). The coefficient estimate from the fit using W̄ is 1.7. The relative magnitudes of the coefficients (1.5 < 1.7) are consistent with the anticipated effects of measurement error: greater attenuation is associated with larger error variance. The multiple linear regression attenuation coefficient for a measurement with error variance σ² is λ1 = σ²x|z/(σ²x|z + σ²). Assuming that this applies to the logistic model suggests that

    1.7 ≈ {σ²x|z/(σ²x|z + σ²u/2)}βx   and   1.5 ≈ {σ²x|z/(σ²x|z + σ²u)}βx.

Because βx is unknown, these approximations cannot be checked directly. However, a check on their consistency is obtained by taking ratios, leading to 1.13 = 1.7/1.5 ≈ (σ²x|z + σ²u)/(σ²x|z + σ²u/2). Using the ANOVA estimate σ̂²u = .0126, and the mean squared error from the linear regression of W̄ on AGE, SMOKE and CHOL as an estimate of σ²w̄|z, produces the estimate σ̂²x|z = σ̂²w̄|z − σ̂²u/2 = .0423 − .0063 = .0360. Thus (σ²x|z + σ²u)/(σ²x|z + σ²u/2) is estimated to be (.0360 + .0126)/(.0360 + .0063) = 1.15. In other words, the attenuation in the logistic regression coefficients is consistent (1.13 ≈ 1.15) with the attenuation predicted by the normal linear regression model result.

These basic statistics can also be used to calculate a simple bias-adjusted estimator as β̂x = 1.7(σ̂²x|z + σ̂²u/2)/σ̂²x|z = 1.7(.0360 + .0063)/.0360 = 2.0, which is consistent with estimates reported by Carroll et al. [30] obtained using a variety of measurement error estimation techniques. We do not recommend using linear-model corrections for logistic regression, as there are a number of methods better suited to the task, described in Section 1.4. Our intent with this example is to demonstrate the general relevance of the results for linear regression to other generalized linear models.

The odds ratio for a ∆ change in transformed systolic blood pressure is exp(βx∆). With the naive analysis this is estimated to be exp(1.7∆); the bias-corrected analysis produces the estimate exp(2.0∆). Therefore the naive odds ratio is attenuated by approximately exp(−.3∆). More generally, the naive (ORN) and true (ORT) odds ratios are related via ORN/ORT = ORT^(λ1−1), where λ1 is the attenuation factor in the naive estimate of βx. The naive and true relative risks have approximately the same relationship under the same conditions (small risks) that justify approximating relative risks by odds ratios.
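
The arithmetic of this example is easily scripted. The sketch below (Python; the inputs are the estimates quoted above) reproduces both the consistency check and the bias-adjusted coefficient.

    # estimates quoted in the text
    beta_rep, beta_mean = 1.5, 1.7        # naive TSBP coefficients: W1 or W2, and Wbar
    sigma_u2 = 0.0126                     # ANOVA estimate of the error variance
    sigma_wbar2_given_z = 0.0423          # MSE of the regression of Wbar on Z

    sigma_x2_given_z = sigma_wbar2_given_z - sigma_u2 / 2             # 0.0360
    ratio = (sigma_x2_given_z + sigma_u2) / (sigma_x2_given_z + sigma_u2 / 2)
    print(beta_mean / beta_rep, ratio)    # 1.13 vs 1.15: consistent attenuation

    # linear correction for attenuation applied to the naive logistic coefficient
    beta_adj = beta_mean * (sigma_x2_given_z + sigma_u2 / 2) / sigma_x2_given_z
    print(beta_adj)                       # ~2.0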

1.3 Planning Epidemiologic Studies with Measurement Error

As the previous sections have established, exposure measurement error is common in epidemiologic studies and, under certain assumptions, can be shown to have dramatic effects on the properties of relative risk estimates or other types of coefficients derived from epidemiologic regression models. It is therefore wise to include measurement error considerations in the planning of a study, both to enable the application of a measurement error analysis at the conclusion of the study and to assure scientific validity.

1 MEASUREMENT ERROR 15

In developing a useful plan, it is important to consider a number of questions. To begin with, what are the scientific objectives of the study? Is the goal to identify a new risk factor for disease, perhaps for the first time, or is this a study to provide improved estimates of the quantitative impact of a known risk factor? Is prediction of future risks the ultimate goal? The answers to these questions will determine the possible responses to dealing with the measurement error in the design and analysis of the study, including the choice of a criterion for statistical optimality. It is even possible that no measurement error correction is needed to achieve the purposes of the study, and in certain instances, absent other considerations such as cost, the most scientifically valid design would eliminate measurement error entirely.

The nature of the measurement error should be carefully considered. For instance, is the measurement error nondifferential? What is the evidence to support this conclusion? Especially in the study of complex phenomena such as nutritional factors in disease, the nondifferential assumption deserves scrutiny. For example, much has been made of the diet record as the gold standard for nutritional intakes, but recent analyses have cast doubt on the assumption of nondifferential measurement error when monthly food frequency questionnaires are substituted (Kipnis et al. [56]). On the other hand, measurement errors from validated scientific instruments may be more easily justified as nondifferential.

Another consideration is the possible time dependency of exposure errors, and how this may affect the use of nondifferential error models. This issue often arises in case-control studies, where exposures must be assessed retrospectively. An interesting example occurs in a recent study of arsenic exposure where both drinking water and toenail measurements are available as personal exposure measures in a cancer case-control study (Karagas et al. [55]). Toenail concentrations give a biologically time-averaged measure of exposure, but the time scale is limited and the nail concentrations are influenced by individual metabolic processes. Drinking water concentrations may be free from possible confounding due to unrelated factors affecting metabolic pathways, but could be less representative of average exposures over the time interval of interest. This kind of ambiguity is common in many epidemiologic modelling situations and should indicate caution in the rote application of measurement error methods.

Depending on the type of nondifferential error, different study plans may be required to identify the desired relative risk parameters. For instance, replicate measurements of an exposure variable may adequately identify the necessary variance parameters in a classical measurement error model. Under certain circumstances, an "instrumental" variable may provide the information needed to correct for measurement error. These types of reliability/validity data lead to identifiable relative risk regression parameters in the classical or Berkson error cases.


In more complex "surrogate" variable situations with nondifferential error, an internal or external validation study may be necessary, in which the "true" exposure, measured without error, is available for a subset or independent sample of subjects. These designs are also useful and appropriate for classical measurement error models, but are essential in the case of surrogates which cannot be considered "unbiased". Internal validation studies have the capability of checking the nondifferential assumption, and thus are potentially more valuable. With external validation studies, there may be doubt as to whether the populations characterized by the validation and main study samples are comparable in the sense that the measurement error model is equivalent or "transportable" between the populations. The considerations above are summarized in the following table for some of the options that should be considered when planning a study in the presence of measurement error.

Table 1. Appropriate plans for collecting validation data in epidemiologic studies with different types of measurement error.

                                       Validation Data
    Measurement Error    Replicates   Instrumental   External   Internal
                                      Variables      Study      Study
    Classical            yes          yes            yes        yes
    Berkson              no           yes            yes        yes
    General Surrogate    no           no             yes        yes
    Differential         no           no             no         yes
    Non-Transportable    yes          yes            no         yes

Based on validity concerns alone, internal validation studies may have the greatest advantage. However, this neglects the important issue of the costs of obtaining the true exposures, which may be considerably larger than those for a more readily available surrogate. For instance, it may be the case that a classical additive error model applies and that replicate measures are easier and cheaper to obtain than true values. Depending on the relative impact on the optimality criterion used, the replicate design might be more cost-effective, although the internal validation study would still be valid.

A number of approaches have been suggested and used for the design of epidemiologic studies based on variables measured with error. These may be characterized broadly as sample size calculation methods, where the design decision concerns mainly the size of the main study and the measurement error is known or can be ignored, and design approaches for studies using internal or external validation data, where both the size of the main study and the validation sample must be chosen. In the sections that follow, we review both of these approaches.


1.3.1 Methods for Sample Size Calculations

Methods for sample size calculations are typically based on the operating characteristics of a simple hypothesis test. In the case of measurement error in a risk factor included in an epidemiologic regression model, the null hypothesis is that the regression coefficient for the risk factor equals zero, implying no association between the exposure and the health outcome. For a specific alternative one might calculate the power for a given sample size or, alternatively, the sample size required to achieve a given power.

It has been known for some time that the effect of measurement error is to reduce the power of the test for no association, both in linear models (Cochran 1968 [34]) and in 2 × 2 tables with nondifferential misclassification (Fleiss 1981 [40]). This result has been extended to survival models (Prentice 1982 [75]) and to generalized linear models with nondifferential exposure measurement error (Tosteson and Tsiatis 1988 [118]), including linear regression, logistic regression, and tests for association in 2 × 2 contingency tables. Using small relative risk approximations, it is possible to show that for all of these common models for epidemiologic data, the ratio of the sample size required using the data measured without error to the sample size required using the error-prone exposure is approximately nx/nw = ρ²xw, the square of the correlation between X and W. This relation provides a handy method for determining sample size requirements in the presence of measurement error:

    nw = nx/ρ²xw.        (1.6)

For example, a measurement with ρ²xw = 0.5 approximately doubles the required sample size. If additional covariates Z are included in the calculation, a partial correlation can be used instead. The same formula has been used for sample size calculations based on regression models for prospective studies with log-linear risk functions and normal distributions for exposures and measurement error (McKeown-Eyssen and Tibshirani 1994 [65]) and for case-control studies with conditionally normal exposures within the case and control groups (White et al. 1994 [127]). Recent developments have improved this approximation (Tosteson et al., in press [120]), but formula (1.6) remains a useful tool for checking sample size requirements in studies with measurement error.

For generalized linear models [118] and survival models (Prentice 1982 [75]), it has been shown that the optimal score test can be computed by replacing the error-prone exposure variable W with E[X|W], a technique that was later termed "regression calibration" (Carroll et al. [30]). Subsequent work extended these results to a more general form of the score test incorporating a nonparametric estimate of the measurement error distribution (Stefanski and Carroll 1995 [108]). One implication of this result is that in common measurement error models, including normally distributed exposure errors and nondifferential misclassification errors, the optimal test is computed simply by ignoring the measurement error and using the usual test based on W rather than X, the true exposure. However, the test will still suffer the loss of power implicit in formula (1.6).

It is interesting to consider the effects of Berkson errors on sample size calculations. The implications for analysis are somewhat different, inasmuch as regression coefficients are unbiased by Berkson errors for linear models and, to first order, for all generalized linear models. However, as applied to epidemiologic research, there is no distinction with respect to the effect of this type of nondifferential error on sample size calculations for simple regression models without confounders, and formula (1.6) applies directly.

1.3.2 Planning for Reliability/Validation Data

In most epidemiologic applications, a measurement error correction will be planned, although this may be deemed unnecessary in some situations where the investigators only wish to demonstrate an association or where the measurement error is known. Information on the measurement error parameters can come from a number of possible designs, including replicate measurements, instrumental variables, external validation studies measuring the true and surrogate exposures (i.e., just X and W), or internal validation studies. A variety of statistical criteria can be used to optimize aspects of the design, most commonly the variance of the unbiased estimate of the relative risk for the exposure measured with error. Other criteria have included the power of tests of association, as in the previous section, and criteria based on the power of tests for null hypotheses other than "no association" (Spiegelman and Gray 1991 [97]).

To choose a design, it is usually necessary to have an estimate of the measurement error variance and/or other parameters. This may be difficult, since validation data are needed to derive these estimates and will not yet have been collected at the time when the study is being planned. However, this dilemma is present in most practical design settings and can be overcome in a number of informal ways by deriving estimates from previous publications, pilot data, or theoretical considerations of the measurement error process. Certain sequential designs can be useful in this regard, and some suggestions are discussed here in the context of the design of internal validation studies.

In studies where a correction is planned for classical measurement error using replicates, the simple approach to sample size calculations may provide a guideline for choosing an appropriate number of replicates and a sample size by replacing ρ²xw with ρ²xw̄, where w̄ is the mean of the nr replicates. Depending on the relative costs of replication and of obtaining a study participant, these expressions may be used to find an optimal value for the overall sample size, n, and the number of replicates, nr. For instrumental variables, a similar calculation can be made using a variation on the regression calibration procedure as applied to the score test for no association. In this case, the inflation in sample size for (1.6) is based on ρ²xx̂, where x̂ = E[X|W1,W2] is the predicted value of the true exposure given the unbiased surrogate W1 and the instrumental variable W2.
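
Under the classical error model the reliability of the mean of nr replicates is ρ²xw̄ = σ²x/(σ²x + σ²u/nr), so (1.6) gives the required sample size for any replication scheme. A minimal sketch (Python; the variances and nx are illustrative assumptions):

    # reliability of the mean of n_r replicates under the classical error model
    sigma_x2, sigma_u2 = 1.0, 1.0          # illustrative variances
    n_x = 300                              # sample size needed with the true exposure

    for n_r in (1, 2, 3, 5):
        rho2 = sigma_x2 / (sigma_x2 + sigma_u2 / n_r)
        print(f"n_r={n_r}: reliability={rho2:.2f}, required n_w={n_x / rho2:.0f}")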

External and internal validation studies both involve a main study, with a sample size of n1, and a validation study, with a sample size of n2. The external validation study involves an independent set of measurements of the true and surrogate exposures, whereas the internal validation study is based on a subset of the subjects in the main study. Both the size of the main study and the size of the validation study must be specified. In the internal validation study, n2 is by necessity less than or equal to n1, with equality implying a "fully validated" design. In the external validation study, n2 is not limited, but the impact of increasing the amount of validation data is more limited than in the internal validation study. This is because the fully validated internal validation study has no loss of power versus a study that has no measurement error, whereas the external validation study can only improve the power to that of a study with measurement error in which the measurement error parameters are known.

For common nonlinear epidemiologic regression analyses such as logistic regression, calculations to determine optimal values of n1 and n2 have typically involved specialized computations (Spiegelman and Gray 1991 [97]; Stram 1995 [112]). More tractable expressions, not involving numerical integrations, are available for linear discriminant models (Buonaccorsi 1990 [11]). The actual analysis of the data from such studies may be possible using approximations such as the regression calibration method, requiring less sophisticated software (Spiegelman et al. 2001 [96]).

A variant on the internal validation study is the class of designs that use surrogate exposures and outcomes as stratification variables to select a highly efficient validation sample. Cain and Breslow (1988) [20] developed methods for case-control studies where surrogate variables were available during the design phase for cases and controls. Tosteson and Ware (1990) [119] developed methods for studies where surrogates were available for both exposures and a binary outcome. These designs can be analyzed with ordinary logistic regression if that model is appropriate for the population data. Methods for improving the analysis of the designs and adapting them to other regression models have been proposed (Tosteson et al. 1994 [117]; Holcroft et al. 1997 [49]; Reilly 1996 [80]).

1.3.3 Examples and Applications

Much of the research on methods for planning studies with measurement error has been stimulated by applications from environmental, nutritional, and occupational epidemiology. Nevertheless, it is fair to say that published examples of studies designed with measurement error in mind are relatively rare, and the best source of case studies may be methods papers such as those cited in this review. This may reflect a lack of convenient statistical software other than what individual researchers have been able to make available. However, some useful calculations can be quite simple, as shown above, and a more important factor in future applications of these methods will be proper education to raise awareness among statisticians and epidemiologists of the importance of addressing the problem of measurement error in the planning phases of health research.

1.4 Measurement Error Models and Methods

1.4.1 Overview

This section describes some common methods for correcting biases induced by non-differential covariate measurement error. The focus is on nonlinear regression models, and the logistic model in particular, though all the methods apply to the linear model. The intent is to familiarize the reader with the central themes and key ideas that underlie the proposals, and to contrast the assumptions and types of data required to implement the procedures.

The starting point for all measurement error analyses is the disease model of interest relating the disease outcome $Y$ to the true exposure(s) $X$ and covariates $Z$, and a measurement error model relating the mismeasured exposure $W$ to $(Z,X)$. Measurement error methods can be grouped according to whether they employ functional or structural modeling. Functional models make no assumptions on $X$ beyond what are made in the absence of measurement error, e.g. $\sum_{i=1}^{N}(X_i - \bar{X})^2 > 0$ for simple linear regression. Functional modeling is compelling because often there is little information in the data on the distribution of $X$. For this reason, much of the initial research in measurement error methods focused on functional modeling. Methods based on functional modeling can be divided into approximately consistent methods (remove most bias) and fully consistent methods (remove all bias as $N \to \infty$). Fully consistent methods for nonlinear regression models typically require assumptions on the distribution of the measurement error. Regression calibration and SIMEX are examples of approximately consistent methods, while corrected scores, conditional scores and some instrumental variable (IV) methods are fully consistent for large classes of models. Each of these approaches is described below.

Structural models assume $X$ is random and require an exposure model for $X$, with the normal distribution as the default exposure model. Likelihood based methods are used with structural models.

Note that the terms functional and structural refer to assumptions on $X$, not on the measurement error model. The advantage of functional modeling is that it provides valid inference regardless of the distribution of $X$. On the other hand, structural modeling can result in large gains in efficiency and allows construction of likelihood ratio based confidence intervals that often have coverage probabilities closer to the nominal level than the large sample normal theory intervals used with functional models. The choice between functional and structural modeling depends both on the assumptions one is willing to make and, in a few cases, on the form of the model relating $Y$ to $(Z,X)$. The type and amount of data available also plays a role. For example, validation data provide information on the distribution of $X$, and may make structural modeling more palatable. The remainder of the chapter describes methods for correcting for measurement error. Functional methods are described first.

1.4.2 Regression calibration

Regression calibration is a conceptually straightforward approach to bias reduction and has been successfully applied to a broad range of regression models. It is the default approach for the linear model. The method is fully consistent in linear models, and in log-linear models when the conditional variance of $X$ given $(Z,W)$ is constant. Regression calibration is approximately consistent in nonlinear models. The method was first studied in the context of proportional hazards regression (Prentice 1982 [75]). Extensions to logistic regression and to a general class of regression models were studied in (Rosner et al. 1989 [85], 1990 [84]) and (Carroll and Stefanski 1990 [25]), respectively. A detailed and comprehensive discussion of regression calibration can be found in (Carroll et al. 1995 [30]).

When the measurement error is non-differential, the induced disease model, or regression model, relating $Y$ to the observed exposure $W$ and covariates $Z$ is $E[Y \mid Z,W] = E[E[Y \mid Z,X] \mid Z,W]$, i.e. the induced disease model is obtained by regressing the true disease model on $(Z,W)$. A consequence of the identity is that the form of the observed disease model depends on the conditional distribution of $X$ given $(Z,W)$. This distribution is typically not known, and even when known, evaluating the right hand side of the identity can be difficult. For example, if the true disease model is logistic and the distribution of $X$ conditional on $(Z,W)$ is normal, there is no closed form expression for $E[Y \mid Z,W]$.

Regression calibration circumvents these problems by approximating the disease model relating $Y$ to the observed covariates $(Z,W)$. The approximation is obtained by replacing $X$ with $E[X \mid Z,W]$ in the model relating $Y$ to $(Z,X)$. Because regression calibration provides a model for $Y$ on $(Z,W)$, the observed data can be used to assess the adequacy of the model.

To describe how to implement the method, it is useful to think of the approach as a method for imputing values for $X$. The idea is to estimate the unobserved $X$ with $X^* \equiv$ the predicted value of $X$ from the regression of $X$ on $(Z,W)$. Modeling and estimating the regression of $X$ on $(Z,W)$ requires additional data in the form of internal/external replicate observations, instrumental variables or validation data; see the example below. The regression parameters in the true disease model are estimated by regressing $Y$ on $(Z, X^*)$. Note that $X^*$ is the best estimate of $X$ using the observed predictors $(Z,W)$; best in the sense of minimizing mean squared prediction error. To summarize, regression calibration estimation consists of two primary steps:

1. Model and estimate the regression of $X$ on $(Z,W)$ to obtain $X^*$.
2. Regress $Y$ on $(Z, X^*)$ to obtain regression parameter estimates.

A convenient feature of regression calibration is that standard software can often be used for estimation. However, standard errors for parameter estimates in step 2 must account for the fact that $X^*$ is estimated in step 1, something standard software does not do. Bootstrap or asymptotic methods based on estimating equation theory are typically used; see (Carroll et al. 1995 [30]) for details.

When $(Z,X,W)$ is approximately jointly normal, or when $X$ is strongly correlated with $(Z,W)$, the regression of $X$ on $(Z,W)$ is approximately linear:

$$E[X \mid Z,W] \approx \mu_x + \Sigma_{x|zw}\Sigma_{zw}^{-1}\begin{pmatrix} Z - \mu_z \\ W - \mu_w \end{pmatrix},$$

where $\Sigma_{x|zw}$ is the covariance of $X$ with $(Z,W)$ and $\Sigma_{zw}$ is the variance matrix of $(Z,W)$. Implementing regression calibration using the linear approximation requires estimation of the calibration parameters $\mu_x$, $\Sigma_{x|zw}$, $\Sigma_{zw}$, $\mu_w$, and $\mu_z$.

Example. We illustrate estimation of the calibration function when two replicate observations of $X$ are available in the primary study and the error model is $W = X + \sigma U$. For ease of illustration, we assume there are no additional covariates $Z$. Let $\{W_{i1}, W_{i2}\}_{i=1}^{N}$ denote the replication data and suppose that

$$E[X \mid \bar{W}] \approx \mu_x + \Sigma_{x|w}\Sigma_w^{-1}(\bar{W} - \mu_w) = \mu_w + \frac{\sigma_w^2 - \sigma^2}{\sigma_w^2}(\bar{W} - \mu_w),$$

where the last equality follows from the form of the error model. Note that $(\sigma_w^2 - \sigma^2)/\sigma_w^2$ is the attenuation factor discussed earlier in the chapter. The method of moments calibration parameter estimators are

$$\hat\mu_w = \sum_{i=1}^{N}\bar{W}_i/N, \qquad \hat\sigma_w^2 = \sum_{i=1}^{N}(\bar{W}_i - \hat\mu_w)^2/(N-1),$$
$$\hat\sigma^2 = \sum_{i=1}^{N}\sum_{j=1}^{2}(W_{ij} - \bar{W}_i)^2/N = \sum_{i=1}^{N}(W_{i1} - W_{i2})^2/(2N),$$

where $\bar{W}_i = (W_{i1} + W_{i2})/2$. The imputed value for $X$ is

$$X_i^* = \hat\mu_w + \frac{\hat\sigma_w^2 - \hat\sigma^2}{\hat\sigma_w^2}(\bar{W}_i - \hat\mu_w).$$

If the model relating $Y$ to $X$ is the simple linear regression model, $Y = \beta_1 + \beta_x X + \epsilon$, regressing $Y$ on $X^*$ results in $\hat\beta_x = \{\hat\sigma_w^2/(\hat\sigma_w^2 - \hat\sigma^2)\}\hat\beta_w$, where $\hat\beta_w$ is the 'naive' estimator obtained from regressing $Y$ on $\bar{W}$. Note that for the linear model the regression calibration estimator coincides with the method of moments estimator given in Section 1.2 of the chapter.

Our illustration of calibration parameter estimation assumed exactly two replicates were available for each $X_i$. This estimation scheme can be easily extended to an arbitrary number of replicates for each $X_i$; see (Carroll et al. 1995 [30]) for details.
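The calculations in this example are simple enough to script directly. The following is a minimal sketch in Python with NumPy; the simulated sample size, parameter values and random seed are illustrative assumptions, not part of the original example.

import numpy as np

rng = np.random.default_rng(1)
N = 2000
beta1, beta_x, sigma = 1.0, 2.0, 0.5               # assumed true values
X = rng.normal(0.0, 1.0, N)                        # unobserved exposure
W = X[:, None] + sigma * rng.normal(size=(N, 2))   # replicates W_i1, W_i2
Y = beta1 + beta_x * X + rng.normal(0.0, 1.0, N)

Wbar = W.mean(axis=1)                              # replicate means
mu_w = Wbar.mean()
s2_w = Wbar.var(ddof=1)
s2 = np.sum((W[:, 0] - W[:, 1]) ** 2) / (2 * N)    # method of moments sigma^2

# Step 1: impute X with the estimated linear calibration function.
X_star = mu_w + ((s2_w - s2) / s2_w) * (Wbar - mu_w)

# Step 2: regress Y on the imputed exposure X*.
beta_hat = np.polyfit(X_star, Y, 1)[0]
print("regression calibration estimate of beta_x:", beta_hat)

As noted above, the point estimate is the easy part; valid standard errors must still account for step 1, for example by bootstrapping both steps together.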


Regression calibration can be ineffective in reducing bias in nonlinear models when: a) the effect of $X$ on $Y$ is large, for example large odds ratios in logistic regression; b) the measurement error variance is large; c) the model relating $Y$ to $(Z,X)$ is not 'smooth'. It is difficult to quantify what is meant by 'large' in a) and b) because all three factors a)-c) can act together. In logistic regression, the method has been found to be effective in a number of applications (Rosner et al. 1989 [85], 1990 [84]; Carroll et al. 1995 [30]). Segmented regression is an example of a model where regression calibration fails due to lack of model smoothness (Kuchenhoff and Carroll 1997 [57]). Segmented models relate $Y$ to $X$ using separate regression models on different segments along the range of $X$. Extensions of regression calibration that address the potential pitfalls listed in a)-c) are given in (Carroll and Stefanski 1990 [25]).

1.4.3 SIMEX

Simulation-extrapolation (SIMEX) can correct for bias in a very broad range of settings and is the only method that provides a visual display of the effects of measurement error on regression parameter estimation. SIMEX is fully consistent for linear disease models and approximately consistent for nonlinear models. SIMEX is founded on the observation that bias in parameter estimation varies in a systematic way with the magnitude of the measurement error. Essentially, the method is to incrementally add measurement error to $W$ using computer simulated random errors and compute the corresponding regression parameter estimate (the simulation step). The extrapolation step models the relation between the parameter estimates and the magnitude of the measurement errors. The SIMEX estimate is the extrapolation of this relation to the case of zero measurement error. The method was developed in (Cook and Stefanski 1994 [35]; Stefanski and Cook 1995 [110]) and is summarized in detail in (Carroll et al. 1995 [30]).

Details of the method are best understood in the context of the classical additive measurement error model. However, the method is not limited to this model. To describe the method, suppose $W_i = X_i + \sigma U_i$ for $i = 1, \ldots, n$, and for $s = 1, \ldots, B$ define $W_{is}(\lambda) = W_i + \sqrt{\lambda}\,\sigma U_{is}$, where $\lambda > 0$ and $\{U_{is}\}_{s=1}^{B}$ are i.i.d. computer simulated standard normal variates. Note that the variance of the measurement error for the constructed measurement $W_{is}(\lambda)$ is $(1+\lambda)\sigma^2$. Let $\hat\beta_s(\lambda_j)$ denote the vector of regression parameter estimators obtained by regression of $Y$ on $\{Z, W_s(\lambda_j)\}$ for $0 = \lambda_1 < \lambda_2 < \cdots < \lambda_M$; the value $\lambda_M = 2$ is recommended (Carroll et al. 1995 [30]). The notation explicitly indicates the dependence of the estimator on $\lambda_j$. Let $\hat\beta(\lambda_j) = B^{-1}\sum_{s=1}^{B}\hat\beta_s(\lambda_j)$. Here we are averaging over the $B$ simulated samples to eliminate variability due to simulation, and empirical evidence suggests $B = 100$ is sufficient. Each component of the vector $\hat\beta(\lambda)$ is then modeled as a function of $\lambda$, and the SIMEX estimator is the extrapolation of each model to $\lambda = -1$. Note that $\lambda = -1$ represents a measurement error variance of zero.


Consider, for example, estimation of $\beta_x$. The 'observations' produced by the simulation, $\{\hat\beta_x(\lambda_j), \lambda_j\}_{j=1}^{M}$, are plotted and used to develop and fit an extrapolation model relating the dependent variable $\hat\beta_x(\lambda)$ to the independent variable $\lambda$. In most applications, an adequate extrapolation model is provided by either the nonlinear extrapolant function, $\hat\beta_x(\lambda_j) \approx \gamma_1 + \gamma_2/(\gamma_3 + \lambda_j)$, or a quadratic extrapolant function, $\hat\beta_x(\lambda_j) \approx \gamma_1 + \gamma_2\lambda_j + \gamma_3\lambda_j^2$. The appropriate extrapolant function is fit to $\{\hat\beta_x(\lambda_j), \lambda_j\}_{j=1}^{M}$ using ordinary least squares. It is worth noting that the nonlinear extrapolant function can be difficult to fit numerically; details for doing so are given in (Carroll et al. 1995 [30]).

Example. SIMEX was developed to understand and correct for the effects of covariate measurement error in nonlinear disease models. However, it is instructive to consider the simple linear regression model as an example. In Section 1.2 the bias of the naive estimator was studied, and it follows from those results that

$$\hat\beta_x(\lambda) = \frac{\beta_x\sigma_x^2}{\sigma_x^2 + \sigma^2(1+\lambda)} + O_p(n^{-1/2}),$$

where the symbol $O_p(n^{-1/2})$ denotes terms that are negligible for $n$ large. Therefore, the nonlinear extrapolant will result in a fully consistent estimator:

$$\hat\beta_x(-1) = \frac{\beta_x\sigma_x^2}{\sigma_x^2 + \sigma^2(1+[-1])} = \beta_x + O_p(n^{-1/2}).$$

Refinements and further details for the SIMEX method, including calculation of standard errors, are given in (Carroll et al. 1995 [30]).

1.4.4 Estimating Equations and Corrected Scores

Regression parameter estimators in nonlinear models are defined implicitly through estimating equations. Estimating equations are often based on the likelihood score, i.e. the derivative of the log-likelihood, or on quasi-likelihood scores that only require assumptions on the first and second conditional moments of the disease model. The criterion of least squares also leads to parameter estimation based on estimating equations.

Corrected scores, conditional scores and certain instrumental variable methods have been developed starting with the estimating equations that define regression parameter estimates in the absence of measurement error. An estimating score is unbiased if it has expectation zero. Measurement error induces bias in estimating equations, which translates into bias in the parameter estimator; modifying the estimating equations to remove the bias yields estimators free of measurement error induced bias. This is readily seen in the no-intercept simple linear regression model with classical measurement error: $Y = \beta_x X + \epsilon$ and $W = X + \sigma U$. In the absence of measurement error, the least squares estimator for $\beta_x$ solves $\sum_{i=1}^{N}\psi(Y_i, X_i; \beta_x) = 0$, where $\psi(Y_i, X_i; \beta_x) = (Y_i - \beta_x X_i)X_i$ is the least squares score. The score is unbiased: $E[\psi(Y_i, X_i; \beta_x)] = \beta_x\sigma_x^2 - \beta_x\sigma_x^2 = 0$. The score is no longer unbiased when $W$ replaces $X$: $E[\psi(Y_i, W_i; \beta_x)] = \beta_x\sigma_x^2 - \beta_x(\sigma_x^2 + \sigma^2) \neq 0$ whenever $\sigma^2 > 0$ and $\beta_x \neq 0$.


Corrected scores are unbiased estimators of the score that would be used in the absence of measurement error. A corrected score $\psi^*(Y_i, W_i; \beta_x)$ satisfies $E[\psi^*(Y_i, W_i; \beta_x)] = \psi(Y_i, X_i; \beta_x)$, where the expectation is with respect to the measurement error distribution. Corrected scores were first defined in (Stefanski 1989 [99]) and (Nakamura 1990 [69]). Note that corrected scores are unbiased whenever the original score is unbiased. This means that estimators obtained from corrected scores are fully consistent.

The corrected score for the simple linear no-intercept regression model is easily seen to be $\psi^*(Y_i, W_i; \beta_x) = \psi(Y_i, W_i; \beta_x) + \sigma^2\beta_x$, resulting in the estimator $\hat\beta_x = \sum_{i=1}^{N} Y_i W_i \big/ \left(\sum_{i=1}^{N} W_i^2 - N\sigma^2\right)$. In applications an estimate of the measurement error variance replaces $\sigma^2$. Note that the corrected score estimator for the linear model is also the method of moments estimator.
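In code, the corrected score estimator for this model is a one-line adjustment to the naive moment ratio. A sketch in Python with NumPy; the arrays Y and W and the error variance estimate s2 are assumed to be available, for example from replicates as in the regression calibration example.

import numpy as np

def corrected_score_linear(Y, W, s2):
    # Solves sum_i [(Y_i - b W_i) W_i + s2 * b] = 0 for b.
    return np.sum(Y * W) / (np.sum(W ** 2) - len(W) * s2)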

For the linear model, the corrected score was identified without making an assumption on the distribution of the measurement error. For nonlinear regression models, obtaining a corrected score generally requires specification of the measurement error distribution, and typically the normal distribution is used.

Consider Poisson regression with no intercept. The likelihood score in the absence of measurement error is $\psi(Y_i, X_i; \beta_x) = (Y_i - \exp\{\beta_x X_i\})X_i$. If we assume that the measurement error satisfies $U \sim N(0,1)$, then

$$\psi^*(Y_i, W_i; \beta_x) = (Y_i - \exp\{\beta_x W_i - \beta_x^2\sigma^2/2\})W_i + \beta_x\sigma^2\exp\{\beta_x W_i - \beta_x^2\sigma^2/2\}$$

is the corrected score. Using results for the moment generating function of a normal random variable, one can verify that $E[\psi^*(Y_i, W_i; \beta_x)] = (Y_i - \exp\{\beta_x X_i\})X_i$, where the expectation is with respect to the measurement error. The corrected score estimator solves $\sum_{i=1}^{N}\psi^*(Y_i, W_i; \beta_x) = 0$; the solution must be obtained numerically for Poisson regression.
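Numerical solution is straightforward with a one-dimensional root finder. A sketch in Python with NumPy and SciPy; the simulated data and the search bracket are illustrative assumptions.

import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(3)
N, beta_x, sigma = 2000, 0.5, 0.4                  # assumed true values
X = rng.normal(size=N)
W = X + sigma * rng.normal(size=N)                 # normal measurement error
Y = rng.poisson(np.exp(beta_x * X))

def corrected_score_sum(b):
    # sum of psi*(Y_i, W_i; b) over the sample
    m = np.exp(b * W - b ** 2 * sigma ** 2 / 2.0)
    return np.sum((Y - m) * W + b * sigma ** 2 * m)

beta_hat = brentq(corrected_score_sum, -2.0, 2.0)  # root of the corrected score
print("corrected score estimate:", beta_hat)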

It is not always possible to obtain a corrected score (Stefanski 1989 [99]). For example, the likelihood score for logistic regression does not admit a corrected score, except under certain restrictions (Buzas and Stefanski 1996 [17]). A method for obtaining corrected scores via computer simulation was recently studied in (Novick and Stefanski 2002 [71]), where an approximate corrected score for logistic regression is also obtained using computer simulation.

1.4.5 Conditional Scores

Conditional score estimation is the default method for logistic regression when the classical additive error model holds. The statistical theory of sufficient statistics and maximum likelihood underlies the derivation of conditional scores, and conditional score estimators retain certain optimality properties of likelihood estimators. Though we focus on logistic regression here, the method applies to a broader class of regression models, including Poisson and gamma regression. The method was derived in (Stefanski and Carroll 1987 [107]). Construction of the conditional score estimator requires that the measurement error is normally distributed. However, the estimator remains effective in reducing bias and is surprisingly efficient for modest departures from the normality assumption (Huang and Wang 2001 [51]). Computing conditional score estimators requires an estimate of the measurement error variance.

The conditional score estimator is defined implicitly as the solution to estimating equations that are closely related to the logistic regression maximum likelihood estimating equations used in the absence of measurement error. In the absence of measurement error, the maximum likelihood estimator of $(\beta_1, \beta_z, \beta_x)$ is defined implicitly as the solution to

$$\sum_{i=1}^{N}\{Y_i - F(\beta_1 + \beta_z Z_i + \beta_x X_i)\}\begin{pmatrix} 1 \\ Z_i \\ X_i \end{pmatrix} = 0,$$

where $F(v) = \{1 + \exp(-v)\}^{-1}$ is the logistic distribution function. The conditional score estimator is defined as the solution to the equations

$$\sum_{i=1}^{N}\{Y_i - F(\beta_1 + \beta_z Z_i + \beta_x \Delta_i)\}\begin{pmatrix} 1 \\ Z_i \\ \Delta_i \end{pmatrix} = 0,$$

where $\Delta_i = W_i + (Y_i - \tfrac{1}{2})\hat\sigma^2\beta_x$ and $\hat\sigma^2$ is an estimate of the measurement error variance. Conditional score estimation for logistic regression replaces the unobserved $X_i$ with $\Delta_i$. It can be shown that $E[Y \mid Z, \Delta] = F(\beta_1 + \beta_z Z + \beta_x\Delta)$, and it follows that the conditional score is unbiased. Because $\Delta_i$ depends on the parameter $\beta_x$, it is not possible to estimate $(\beta_1, \beta_z, \beta_x)$ using standard software by replacing $X$ with $\Delta$. Standard errors are computed using the sandwich estimator or the bootstrap.
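Because $\Delta_i$ depends on $\beta_x$, the equations are usually passed to a general multivariate root finder. A minimal sketch in Python with NumPy and SciPy; the simulated data, the treatment of $\sigma^2$ as known, and the starting values are illustrative assumptions.

import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(4)
N, b1, bz, bx, sigma = 3000, -1.0, 0.5, 1.0, 0.5   # assumed true values
Z = rng.normal(size=N)
X = rng.normal(size=N)
W = X + sigma * rng.normal(size=N)
Y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(b1 + bz * Z + bx * X))))
s2 = sigma ** 2                                    # in practice, an estimate

def F(v):
    return 1.0 / (1.0 + np.exp(-v))

def conditional_score(beta):
    beta1, betaz, betax = beta
    delta = W + (Y - 0.5) * s2 * betax             # Delta_i
    resid = Y - F(beta1 + betaz * Z + betax * delta)
    return [np.sum(resid), np.sum(resid * Z), np.sum(resid * delta)]

sol = root(conditional_score, x0=[0.0, 0.0, 0.0])
print("conditional score estimates:", sol.x)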

For models other than the logistic, the simple scheme of replacing $X$ with $\Delta$ does not hold in general, and the conditional score estimating equations for Poisson and gamma regression are much more complicated.

The conditional score estimator for the logistic model compares favorably in terms of efficiency to the full maximum likelihood estimator that requires specification of an exposure model; see Stefanski and Carroll (1990) [109].

1.4.6 Instrumental variables

The methods described so far require additional data that allow estimation of the measurement error variance. Replicate observations and internal/external validation data are two sources of such additional information. Another source of additional information is instrumental variables. Instrumental variables, denoted $T$, are additional measurements of $X$ that satisfy three requirements: i) $T$ is non-differential, i.e. $f_{Y|Z,X,T} = f_{Y|Z,X}$; ii) $T$ is correlated with $X$; and iii) $T$ is independent of $W - X$. Note that a replicate observation is an instrumental variable, but an instrumental variable is not necessarily a replicate. It is possible to use an instrumental variable to estimate the measurement error variance and then use one of the above methods. Doing so can be inefficient, and IV methods typically do not directly estimate the measurement error variance.

Consider the cancer case-control study of arsenic exposure mentioned in Section 3. Two measurements of arsenic exposure are available for each case/control in the form of drinking water and toenail concentrations. Neither measure is an exact measure of long-term arsenic exposure ($X$). Taking toenail concentration to be an unbiased measurement of $X$, the drinking water concentration can serve as an instrumental variable.

Instrumental variable methods have been used in linear measurement error models since the 1940's; see (Fuller 1987 [42]) for a good introduction. Instrumental variable methods for nonlinear models were first studied in (Amemiya 1990 [2]). Extensions of regression calibration and conditional score methodology to instrumental variables were given in (Carroll and Stefanski 1994 [26]; Stefanski and Buzas 1995 [105]; Buzas and Stefanski 1996 [19]).

The essential idea underlying instrumental variable estimation can be understood by studying the simple linear model without intercept: $Y = \beta_x X + \epsilon$ and $W = X + \sigma U$. Then $Y = \beta_x W + \tilde\epsilon$, where $\tilde\epsilon = \epsilon - \beta_x\sigma U$, and it appears that $Y$ and $W$ follow a simple linear regression model. However, $W$ and $\tilde\epsilon$ are correlated, violating a standard assumption in linear regression, and the least squares estimator for $\beta_x$ is biased; see Section 2. The least squares estimating equation $\sum_{i=1}^{N}\{Y_i - \beta_x W_i\}W_i = 0$ is biased because $W_i$ and $Y_i - \beta_x W_i$ are correlated. This suggests that an unbiased equation can be constructed by replacing the $W_i$ outside the brackets with a measurement uncorrelated with $Y_i - \beta_x W_i$. An IV $T$ satisfies this requirement, and the IV estimating equation $\sum_{i=1}^{N}\{Y_i - \beta_x W_i\}T_i = 0$ results in the consistent estimator $\hat\beta_x = \sum_{i=1}^{N} Y_i T_i \big/ \sum_{i=1}^{N} W_i T_i$. Non-zero correlation between $X$ and $T$ is required so that the denominator is not estimating zero. The key idea is that the score factors into two components, where the first component $\{Y_i - \beta_x W_i\}$ has expectation zero and the second component $T_i$ is uncorrelated with the first.
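In code the linear IV estimator is a single ratio. A sketch in Python with NumPy; the arrays Y, W and T are assumed to be available.

import numpy as np

def iv_linear(Y, W, T):
    # Solves sum_i {Y_i - b W_i} T_i = 0 for b.
    return np.sum(Y * T) / np.sum(W * T)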

The method must be modified for nonlinear problems. Logistic regression will be used to illustrate the modification. If we ignore measurement error, the estimating equations for logistic regression are

$$\sum_{i=1}^{N}\{Y_i - F(\beta_1 + \beta_z Z_i + \beta_x W_i)\}\begin{pmatrix} 1 \\ Z_i \\ W_i \end{pmatrix} = 0.$$

Unlike the linear case, for the logistic model and nonlinear models generally, the first term in the estimating score, $\{Y_i - F(\beta_1 + \beta_z Z_i + \beta_x W_i)\}$, does not have expectation zero, so replacing $W_i$ with $T_i$ outside the brackets in the above equation does not result in an estimator that reduces bias.

Define the logistic regression instrumental variable estimating equations

$$\sum_{i=1}^{N} h(Z_i, W_i, T_i)\,\{Y_i - F(\beta_1 + \beta_z Z_i + \beta_x W_i)\}\begin{pmatrix} 1 \\ Z_i \\ T_i \end{pmatrix} = 0,$$

where

$$h(Z_i, W_i, T_i) = \sqrt{\frac{F'(\beta_1 + \beta_z Z_i + \beta_x T_i)}{F'(\beta_1 + \beta_z Z_i + \beta_x W_i)}}$$

is a scalar valued weight function and $F'$ denotes the derivative of $F$. It can be shown that the estimating equation is unbiased provided the distribution of the measurement error is symmetric, implying that the estimator obtained from the equations is fully consistent. See (Buzas 1997 [15]) for extensions to other disease models, including the Poisson and gamma models.
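These weighted equations can also be handed to a multivariate root finder. A sketch in Python with NumPy and SciPy; here the instrument is taken to be a replicate measurement, and all data generating values are illustrative assumptions.

import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(5)
N, b1, bz, bx, sigma = 3000, -1.0, 0.5, 1.0, 0.5   # assumed true values
Z = rng.normal(size=N)
X = rng.normal(size=N)
W = X + sigma * rng.normal(size=N)                 # error-prone surrogate
T = X + sigma * rng.normal(size=N)                 # instrument (a replicate here)
Y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(b1 + bz * Z + bx * X))))

def F(v):
    return 1.0 / (1.0 + np.exp(-v))

def Fprime(v):
    return F(v) * (1.0 - F(v))                     # derivative of F

def iv_score(beta):
    beta1, betaz, betax = beta
    h = np.sqrt(Fprime(beta1 + betaz * Z + betax * T) /
                Fprime(beta1 + betaz * Z + betax * W))
    resid = h * (Y - F(beta1 + betaz * Z + betax * W))
    return [np.sum(resid), np.sum(resid * Z), np.sum(resid * T)]

print("IV estimates:", root(iv_score, x0=[0.0, 0.0, 0.0]).x)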

1.4.7 Likelihood methods

Likelihood methods for estimation and inference are appealing because of the optimality properties of maximum likelihood estimates and the dependability of likelihood ratio confidence intervals. In the context of measurement error problems, the advantages of likelihood methods relative to functional methods have been studied in Schafer and Purdy (1996) [92] and Kuchenhoff and Carroll (1997) [57]. However, the advantageous properties are contingent on correct specification of the likelihood. As discussed below, this is often a difficult task in measurement error problems.

The likelihood for an observed data point $(Y,W)$ conditional on $Z$ is

$$f_{YW|Z} = \int f_{Y|Z,X,W}\, f_{W|Z,X}\, f_{X|Z}\, dx = \int f_{Y|Z,X}\, f_{W|Z,X}\, f_{X|Z}\, dx,$$

where the second equality follows from the assumption of non-differential measurement error. The integral is replaced by a sum if $X$ is a discrete random variable. The likelihood for the observed data is then $\prod_{i=1}^{N} f_{Y_i W_i|Z_i}$, and maximum likelihood estimates are obtained by maximizing the likelihood over all the unknown parameters in each of the three component distributions comprising the likelihood. In principle, the procedure is straightforward. However, there are several important points to be made.

1. The likelihood for the observed data requires complete distributional specification for the disease model ($f_{Y|Z,X}$), the error model ($f_{W|Z,X}$) and an exposure model ($f_{X|Z}$).

2. As was the case for functional models, estimation of parameters in the disease model generally requires, for all intents and purposes, observations that allow estimation of parameters in the error model, for example replicate measurements.

3. When the exposure is modeled as a continuous random variable, for example with the normal distribution, the likelihood requires evaluation of an integral. For many applications the integral cannot be evaluated analytically, and numerical methods must be used, typically Gaussian quadrature or Monte Carlo methods.

4. Finding the maximum of the likelihood is not always straightforward.

While the last two points must be addressed to implement the method, they are technical points and will not be discussed in detail. In principle, numerical integration followed by a maximization routine can be used, but this approach is often difficult to implement in practice; see (Schafer 2002 [90]). Algorithms for computation and maximization of the likelihood in general regression models with exposure measurement error are given in (Higdon and Schafer 2001 [48]; Schafer 2002 [90]). Alternatively, a Bayesian formulation can be used to circumvent some of the computational difficulties; see Carroll, Roeder and Wasserman (1999) [22]. For the normal theory linear model, and for probit regression with a normal distribution for the exposure model, the likelihood can be obtained analytically (Fuller 1987 [42]; Carroll et al. 1984 [31]). The analytic form of the likelihood for the probit model often provides an adequate approximation to the likelihood for the logistic model.
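To illustrate the integration in point 3 above, the sketch below evaluates the observed data log-likelihood for a logistic disease model with normal measurement error and a normal exposure model using Gauss-Hermite quadrature (Python with NumPy). The parameterization is an illustrative assumption, constants not depending on the parameters are dropped, and in practice this function would be passed to a numerical optimizer.

import numpy as np

# probabilists' Gauss-Hermite rule: sum_k w_k h(u_k) approximates
# the integral of h(u) exp(-u^2 / 2) du
nodes, weights = np.polynomial.hermite_e.hermegauss(40)

def loglik(theta, Y, W, Z):
    b1, bz, bx, sigma_u, a1, ax, sigma_x = theta
    ll = 0.0
    for y, w, z in zip(Y, W, Z):
        x = a1 + ax * z + sigma_x * nodes          # exposure model f(X|Z)
        p = 1.0 / (1.0 + np.exp(-(b1 + bz * z + bx * x)))
        f_y = p ** y * (1.0 - p) ** (1 - y)        # disease model f(Y|Z,X)
        f_w = np.exp(-0.5 * ((w - x) / sigma_u) ** 2) / sigma_u  # error model, up to a constant
        ll += np.log(np.sum(weights * f_y * f_w))
    return ll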

The first point above deserves discussion. None of the preceding methods required specification of an exposure model (functional methods). Here an exposure model is required. It is common to assume $X \mid Z \sim N(\alpha_1 + \alpha_x Z, \sigma^2_{x|z})$, but, unless there are validation data, it is not possible to assess the adequacy of the exposure model using the data. Some models are robust to the normality assumption. For example, in the normal theory linear model, i.e. when $(Y,Z,X,W)$ is jointly normal, maximum likelihood estimators are fully consistent regardless of the distribution of $X$. The literature currently lacks results on the robustness of other disease models to assumptions on $X$. In a Bayesian framework, Richardson and Leblond (1997) [82] show that misspecification of the exposure model can seriously affect estimation for logistic disease models.

Semi-parametric and flexible parametric modeling are two approaches that have been explored to address potential robustness issues in specifying an exposure model. Semi-parametric methods leave the exposure model unspecified, and the exposure model is essentially considered another parameter to be estimated. These models have the advantage of model robustness but may lack efficiency relative to the full likelihood. See Roeder, Carroll, and Lindsay (1996) [83], Schafer (2001) [91] and Taupin (2001) [114].

Flexible parametric exposure models typically use a mixture of normal random variables to model the exposure distribution, as normal mixtures are capable of capturing moderately diverse distributional features. Flexible parametric approaches have been studied in Kuchenhoff and Carroll (1997) [57], Carroll, Roeder and Wasserman (1999) [22] and Schafer (2002) [90].

The likelihood can also be obtained conditional on both $W$ and $Z$. In this case the likelihood is

$$f_{Y|Z,W} = \int f_{Y|Z,X}\, f_{X|Z,W}\, dx,$$

necessitating an exposure model relating $X$ to $W$ and $Z$. This form of the likelihood is natural for Berkson error models. In general, the choice of which likelihood to use is a matter of modeling convenience.

1.4.8 Survival analysis

Analysis of survival data with exposure measurement error using proportional hazards models presents some new issues. Of the methods presented, only SIMEX can be applied without modification in the proportional hazards setting.

Many of the proposed methods for measurement error correction in proportional hazards models fall into one of two general strategies. The first strategy is to approximate the induced hazard and then use the approximated hazard in the partial likelihood equations. This strategy is analogous to the regression calibration approximation discussed earlier. The second strategy is to modify the partial likelihood estimating equations. Methods based on this strategy stem from the corrected and conditional score paradigms.

In the absence of measurement error, the proportional hazards model postulates a hazard function of the form $\lambda(t \mid Z, X) = \lambda_0(t)\exp(\beta_z^T Z + \beta_x X)$, where $\lambda_0(t)$ is an unspecified baseline hazard function. Estimation and inference for $(\beta_x, \beta_z)$ are carried out through the partial likelihood function, as it does not depend on $\lambda_0(t)$.

Prentice (1982) [75] has shown that when $(Z,W)$ is observed, the induced hazard is $\lambda(t \mid Z, W) = \lambda_0(t)E[\exp(\beta_z^T Z + \beta_x X) \mid T \geq t, Z, W]$. The induced hazard requires a model for $X$ conditional on $(T \geq t, Z, W)$. This is problematic because the distribution of $T$ is left unspecified in proportional hazards models. However, when the disease is rare, $\lambda(t \mid Z, W) \approx \lambda_0(t)E[\exp(\beta_z^T Z + \beta_x X) \mid Z, W]$ (Prentice 1982 [75]), and if we further assume that $X \mid Z, W$ is approximately normal with constant variance, then the induced hazard is proportional to $\exp\{\beta_z^T Z + \beta_x E[X \mid Z, W]\}$. In other words, regression calibration is appropriate in the proportional hazards setting when the disease is rare and $X \mid Z, W$ is approximately normal.
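Under the rare disease approximation, regression calibration for the proportional hazards model amounts to fitting a standard Cox model to the imputed exposure. A sketch in Python, assuming the third-party lifelines package for the Cox fit and two replicate measurements; all data generating values are illustrative assumptions, and the calibration step uses the best linear predictor of X given the replicate mean.

import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(6)
n, beta_x, sigma = 2000, 0.5, 0.5                  # assumed true values
X = rng.normal(size=n)
W = X[:, None] + sigma * rng.normal(size=(n, 2))   # two replicates
time = rng.exponential(1.0 / (0.01 * np.exp(beta_x * X)))   # rare events
event = (time < 10.0).astype(int)
time = np.minimum(time, 10.0)                      # administrative censoring

# calibration step: best linear predictor of X given the replicate mean
# (the error variance of a two-replicate mean is sigma^2 / 2)
Wbar = W.mean(axis=1)
s2 = np.sum((W[:, 0] - W[:, 1]) ** 2) / (2 * n)
s2_x = Wbar.var(ddof=1) - s2 / 2.0
X_star = Wbar.mean() + (s2_x / (s2_x + s2 / 2.0)) * (Wbar - Wbar.mean())

df = pd.DataFrame({"time": time, "event": event, "X_star": X_star})
CoxPHFitter().fit(df, duration_col="time", event_col="event").print_summary()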

Modifications to the regression calibration algorithm have been developed for applications where the rare disease assumption is untenable; see Clayton (1991) [33], Tsiatis et al. (1995) [122], Wang, Hsu, Feng and Prentice (1997) [125], and Xie, Wang and Prentice (2001) [132]. Conditioning on $T \geq t$ cannot be ignored when the disease is not rare. The idea is to re-estimate the calibration function $E[X \mid Z,W]$ in each risk set, that is, the set of individuals known to be at risk at time $t$. Clayton's proposal assumes the calibration functions across risk sets have a common slope, and his method can be applied provided one has an estimate of the measurement error variance. Xie et al. [132] extend the idea to varying slopes across the risk sets and require replication (reliability data). Tsiatis et al. [122] consider time varying covariates and also allow for varying slopes across the risk sets.


When a validation subsample is available it is possible to estimate the induced hazard nonparametrically, that is, without specifying a distribution for $X \mid (T \geq t, Z, W)$; see Zhou and Pepe (1995) [134] and Zhou and Wang (2000) [135] for the cases where the exposure is discrete and continuous, respectively.

The second strategy avoids modeling the induced hazard and instead modifies the partial likelihood estimating equations. Methods based on the corrected score concept are explored in Nakamura (1992) [70], Buzas (1998) [16] and Huang and Wang (2000) [50]. The methods in Nakamura (1992) [70] and Buzas (1998) [16] assume the measurement error is normally distributed and only require an estimate of the measurement error variance. In contrast, the approach in Huang and Wang (2000) [50] does not require assumptions on the measurement error distribution, but replicate observations on the mismeasured exposure are needed to compute the estimator. Each of the methods has been shown to be effective in reducing bias in parameter estimators. Tsiatis and Davidian (2001) [123] extend conditional score methodology to the proportional hazards setting with possibly time dependent covariates.

References

1. Amemiya, Y. (1985): Instrumental variable estimator for the nonlinear errors-in-variables model. Journal of Econometrics, 28, 273-289.
2. Amemiya, Y. (1990): Instrumental variable estimation of the nonlinear measurement error model, in Statistical Analysis of Measurement Error Models and Application, P.J. Brown & W.A. Fuller, eds. American Mathematical Society, Providence.
3. Amemiya, Y. (1990b): Two-stage instrumental variable estimators for the nonlinear errors-in-variables model. Journal of Econometrics, 44, 311-332.
4. Amemiya, Y., Fuller, W.A. (1988): Estimation for the nonlinear functional relationship. Annals of Statistics, 16, 147-160.
5. Armstrong, B. (1985): Measurement error in generalized linear models. Communications in Statistics, Part B — Simulation and Computation, 14, 529-544.
6. Armstrong, B.K., White, E., Saracci, R. (1992): Principles of Exposure Measurement in Epidemiology. Oxford University Press, Oxford.
7. Armstrong, B.G., Whittemore, A.S., Howe, G.R. (1989): Analysis of case-control data with covariate measurement error: application to diet and colon cancer. Statistics in Medicine, 8, 1151-1163.
8. Berkson, J. (1950): Are there two regressions? Journal of the American Statistical Association, 45, 164-180.
9. Breslow, N.E., Cain, K.C. (1988): Logistic regression for two-stage case-control data. Biometrika, 75, 11-20.
10. Buonaccorsi, J.P. (1990): Errors in variables with systematic biases. Communications in Statistics — Theory and Methods, 18, 1001-1021.
11. Buonaccorsi, J.P. (1990): Double sampling for exact values in some multivariate measurement error problems. Journal of the American Statistical Association, 85, 1075-1082.


12. Buonaccorsi, J.P. (1990): Double sampling for exact values in the normal discriminant model with application to binary regression. Communications in Statistics — Theory and Methods, 19, 4569-4586.
13. Buonaccorsi, J.P. (1991): Measurement error, linear calibration and inferences for means. Computational Statistics and Data Analysis, 11, 239-257.
14. Buonaccorsi, J.P., Tosteson, T. (1993): Correcting for nonlinear measurement error in the dependent variable in the general linear model. Communications in Statistics — Theory and Methods, 22, 2687-2702.
15. Buzas, J.S. (1997): Instrumental variable estimation in nonlinear measurement error models. Communications in Statistics — Theory and Methods, 26, 2861-2877.
16. Buzas, J.S. (1998): Unbiased scores in proportional hazards regression with covariate measurement error. Journal of Statistical Planning and Inference, 67, 247-257.
17. Buzas, J.S., Stefanski, L.A. (1996): A note on corrected score estimation. Statistics and Probability Letters, 28, 1-8.
18. Buzas, J.S., Stefanski, L.A. (1996): Instrumental variable estimation in probit measurement error models. Journal of Statistical Planning and Inference, 55, 47-62.
19. Buzas, J.S., Stefanski, L.A. (1996): Instrumental variable estimation in generalized linear measurement error models. Journal of the American Statistical Association, 91, 999-1006.
20. Cain, K.C., Breslow, N.E. (1988): Logistic regression analysis and efficient design for two-stage studies. American Journal of Epidemiology, 128, 1198-1206.
21. Carroll, R.J. (1998): Measurement error in epidemiologic studies, in Encyclopedia of Biostatistics, 2491-2519.
22. Carroll, R.J., Roeder, K., Wasserman, L. (1999): Flexible parametric measurement error models. Biometrics, 55, 44-54.
23. Carroll, R.J., Ruppert, D. (1988): Transformation and Weighting in Regression. Chapman & Hall, London.
24. Carroll, R.J., Ruppert, D. (1996): The use and misuse of orthogonal regression in measurement error models. American Statistician, 50, 1-6.
25. Carroll, R.J., Stefanski, L.A. (1990): Approximate quasilikelihood estimation in models with surrogate predictors. Journal of the American Statistical Association, 85, 652-663.
26. Carroll, R.J., Stefanski, L.A. (1994): Measurement error, instrumental variables and corrections for attenuation with applications to meta-analyses. Statistics in Medicine, 13, 1265-1282.
27. Carroll, R.J., Gail, M.H., Lubin, J.H. (1993): Case-control studies with errors in predictors. Journal of the American Statistical Association, 88, 177-191.
28. Carroll, R.J., Gallo, P.P., Gleser, L.J. (1985): Comparison of least squares and errors-in-variables regression, with special reference to randomized analysis of covariance. Journal of the American Statistical Association, 80, 929-932.
29. Carroll, R.J., Kuchenhoff, H., Lombard, F., Stefanski, L.A. (1996): Asymptotics for the SIMEX estimator in structural measurement error models. Journal of the American Statistical Association, 91, 242-250.
30. Carroll, R.J., Ruppert, D., Stefanski, L.A. (1995): Measurement Error in Nonlinear Models. Chapman & Hall, London.


31. Carroll, R.J., Spiegelman, C., Lan, K.K., Bailey, K.T., Abbott, R.D. (1984): On errors-in-variables for binary regression models. Biometrika, 71, 19-26.
32. Carroll, R.J., Wang, S., Wang, C.Y. (1995): Asymptotics for prospective analysis of stratified logistic case-control studies. Journal of the American Statistical Association, 90, 157-169.
33. Clayton, D.G. (1991): Models for the analysis of cohort and case-control studies with inaccurately measured exposures, in Statistical Models for Longitudinal Studies of Health, J.H. Dwyer, M. Feinleib, P. Lipsert et al., eds. Oxford University Press, New York, 301-331.
34. Cochran, W.G. (1968): Errors of measurement in statistics. Technometrics, 10, 637-666.
35. Cook, J., Stefanski, L.A. (1994): A simulation extrapolation method for parametric measurement error models. Journal of the American Statistical Association, 89, 1314-1328.
36. Crouch, E.A., Spiegelman, D. (1990): The evaluation of integrals of the form $\int_{-\infty}^{\infty} f(t)\exp(-t^2)\,dt$: applications to logistic-normal models. Journal of the American Statistical Association, 85, 464-467.
37. Devanarayan, V., Stefanski, L.A. (2002): Empirical simulation extrapolation for measurement error models with replicate measurements. Statistics and Probability Letters, 59, 219-225.
38. Devine, O.J., Smith, J.M. (1998): Estimating sample size for epidemiologic studies: the impact of ignoring exposure measurement uncertainty. Statistics in Medicine, 17, 1375-1389.
39. Dosemeci, M., Wacholder, S., Lubin, J.H. (1990): Does non-differential misclassification of exposure always bias a true effect towards the null value? American Journal of Epidemiology, 132, 746-748.
40. Fleiss, J.L. (1981): Statistical Methods for Rates and Proportions. Wiley, New York.
41. Freedman, L.S., Carroll, R.J., Wax, Y. (1991): Estimating the relationship between dietary intake obtained from a food frequency questionnaire and true average intake. American Journal of Epidemiology, 134, 510-520.
42. Fuller, W.A. (1987): Measurement Error Models. Wiley, New York.
43. Ganse, R.A., Amemiya, Y., Fuller, W.A. (1983): Prediction when both variables are subject to error, with application to earthquake magnitude. Journal of the American Statistical Association, 78, 761-765.
44. Gleser, L.J. (1981): Estimation in a multivariate errors in variables regression model: large sample results. Annals of Statistics, 9, 24-44.
45. Gleser, L.J. (1990): Improvements of the naive approach to estimation in nonlinear errors-in-variables regression models, in Statistical Analysis of Measurement Error Models and Application, P.J. Brown & W.A. Fuller, eds. American Mathematical Society, Providence.
46. Greenland, S. (1980): The effect of misclassification in the presence of covariates. American Journal of Epidemiology, 112, 564-569.
47. Greenland, S., Robins, J.M. (1985): Confounding and misclassification. American Journal of Epidemiology, 122, 495-506.
48. Higdon, R., Schafer, D.W. (2001): Maximum likelihood computations for regression with measurement error. Computational Statistics and Data Analysis, 35, 283-299.
49. Holcroft, C.A., Rotnitzky, A., Robins, J.M. (1997): Efficient estimation of regression parameters from multistage studies with validation of outcome and covariates. Journal of Statistical Planning and Inference, 65, 349-374.


50. Huang, Y., Wang, C.Y. (2000): Cox regression with accurate covariates unascertainable: a nonparametric-correction approach. Journal of the American Statistical Association, 95, 1209-1219.
51. Huang, Y., Wang, C.Y. (2001): Consistent functional methods for logistic regression with errors in covariates. Journal of the American Statistical Association, 96, 1469-1482.
52. Hwang, J.T., Stefanski, L.A. (1994): Monotonicity of regression functions in structural measurement error models. Statistics and Probability Letters, 20, 113-116.
53. Hughes, M.D. (1993): Regression dilution in the proportional hazards model. Biometrics, 49, 1056-1066.
54. Hunter, D.J., Spiegelman, D., Adami, H.O., Beeson, L., van den Brandt, P.A., Folsom, A.R., Fraser, G.E., Goldbohm, A., Graham, S., Howe, G.R., Kushi, L.H., Marshall, J.R., McDermott, A., Miller, A.B., Speizer, F.E., Wolk, A., Yaun, S.S., Willett, W. (1996): Cohort studies of fat intake and the risk of breast cancer — a pooled analysis. New England Journal of Medicine, 334, 356-361.
55. Karagas, M.R., Tosteson, T.D., Blum, J., Morris, S.J., Baron, J.A., Klaue, B. (1998): Design of an epidemiologic study of drinking water arsenic and skin and bladder cancer risk in a U.S. population. Environmental Health Perspectives, 106, 1047-1050.
56. Kipnis, V., Carroll, R.J., Freedman, L.S., Li, L. (1999): Implications of a new dietary measurement error model for estimation of relative risk: application to four calibration studies. American Journal of Epidemiology, 150, 642-651.
57. Kuchenhoff, H., Carroll, R.J. (1997): Segmented regression with errors in predictors: semi-parametric and parametric methods. Statistics in Medicine, 16, 169-188.
58. Kuha, J. (1994): Corrections for exposure measurement error in logistic regression models with an application to nutritional data. Statistics in Medicine, 13, 1135-1148.
59. Kuha, J. (1997): Estimation by data augmentation in regression models with continuous and discrete covariates measured with error. Statistics in Medicine, 16, 189-201.
60. Lagakos, S. (1988): Effects of mismodeling and mismeasuring explanatory variables on tests of their association with a response variable. Statistics in Medicine, 7, 257-274.
61. Little, R.J.A., Rubin, D.B. (1987): Statistical Analysis with Missing Data. Wiley, New York.
62. Liu, X., Liang, K.Y. (1992): Efficacy of repeated measures in regression models with measurement error. Biometrics, 48, 645-654.
63. MacMahon, S., Peto, R., Cutler, J., Collins, R., Sorlie, P., Neaton, J., Abbott, R., Godwin, J., Dyer, A., Stamler, J. (1990): Blood pressure, stroke and coronary heart disease: Part 1, prolonged differences in blood pressure: prospective observational studies corrected for the regression dilution bias. Lancet, 335, 765-774.
64. Mallick, B.K., Gelfand, A.E. (1996): Semiparametric errors-in-variables models: a Bayesian approach. Journal of Statistical Planning and Inference, 52, 307-322.
65. McKeown-Eyssen, G.E., Tibshirani, R. (1994): Implications of measurement error in exposure for the sample sizes of case-control studies. American Journal of Epidemiology, 139, 415-421.


66. McNamee, R. (2002): Optimal designs of two-stage studies for estimation of sensitivity, specificity and positive predictive value. Statistics in Medicine, 21, 3609-3625.
67. Michalek, J.E., Tripathi, R.C. (1980): The effect of errors in diagnosis and measurement on the probability of an event. Journal of the American Statistical Association, 75, 713-721.
68. Müller, P., Roeder, K. (1997): A Bayesian semiparametric model for case-control studies with errors in variables. Biometrika, 84, 523-537.
69. Nakamura, T. (1990): Corrected score functions for errors-in-variables models: methodology and application to generalized linear models. Biometrika, 77, 127-137.
70. Nakamura, T. (1992): Proportional hazards models with covariates subject to measurement error. Biometrics, 48, 829-838.
71. Novick, S.J., Stefanski, L.A. (2002): Corrected score estimation via complex variable simulation extrapolation. Journal of the American Statistical Association, 97, 472-481.
72. Prentice, R.L. (1982): Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika, 69, 331-342.
73. Pepe, M.S., Self, S.G., Prentice, R.L. (1989): Further results in covariate measurement errors in cohort studies with time to response data. Statistics in Medicine, 8, 1167-1178.
74. Pierce, D.A., Stram, D.O., Vaeth, M., Schafer, D. (1992): Some insights into the errors in variables problem provided by consideration of radiation dose-response analyses for the A-bomb survivors. Journal of the American Statistical Association, 87, 351-359.
75. Prentice, R.L. (1982): Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika, 69, 331-342.
76. Prentice, R.L. (1989): Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine, 8, 431-440.
77. Prentice, R.L. (1996): Dietary fat and breast cancer: measurement error and results from analytic epidemiology. Journal of the National Cancer Institute, 88, 1738-1747.
78. Prentice, R.L., Pyke, R. (1979): Logistic disease incidence models and case-control studies. Biometrika, 66, 403-411.
79. Racine-Poon, A., Weihs, C., Smith, A.F.M. (1991): Estimation of relative potency with sequential dilution errors in radioimmunoassay. Biometrics, 47, 1235-1246.
80. Reilly, M. (1996): Optimal sampling strategies for two phase studies. American Journal of Epidemiology, 143, 92-100.
81. Richardson, S., Gilks, W.R. (1993): A Bayesian approach to measurement error problems in epidemiology using conditional independence models. American Journal of Epidemiology, 138, 430-442.
82. Richardson, S., Leblond, L. (1997): Some comments on misspecification of priors in Bayesian modelling of measurement error problems. Statistics in Medicine, 16, 203-213.
83. Roeder, K., Carroll, R.J., Lindsay, B.G. (1996): A nonparametric mixture approach to case-control studies with errors in covariables. Journal of the American Statistical Association, 91, 722-732.


84. Rosner, B., Spiegelman, D., Willett, W.C. (1990): Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology, 132, 734-745.
85. Rosner, B., Willett, W.C., Spiegelman, D. (1989): Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Statistics in Medicine, 8, 1051-1070.
86. Rudemo, M., Ruppert, D., Streibig, J.C. (1989): Random effect models in nonlinear regression with applications to bioassay. Biometrics, 45, 349-362.
87. Satten, G.A., Kupper, L.L. (1993): Inferences about exposure-disease association using probability of exposure information. Journal of the American Statistical Association, 88, 200-208.
88. Schafer, D. (1987): Covariate measurement error in generalized linear models. Biometrika, 74, 385-391.
89. Schafer, D. (1993): Likelihood analysis for probit regression with measurement errors. Biometrika, 80, 899-904.
90. Schafer, D. (2002): Likelihood analysis and flexible structural modeling for measurement error model regression. Journal of Statistical Computation and Simulation, 72, 33-45.
91. Schafer, D. (2001): Semiparametric maximum likelihood for measurement error model regression. Biometrics, 57, 53-61.
92. Schafer, D., Purdy, K. (1996): Likelihood analysis for errors-in-variables regression with replicate measurements. Biometrika, 83, 813-824.
93. Schmid, C.H., Rosner, B. (1993): A Bayesian approach to logistic regression models having measurement error following a mixture distribution. Statistics in Medicine, 12, 1141-1153.
94. Smith, A.F.M., Gelfand, A.E. (1992): Bayesian statistics without tears: a sampling-resampling perspective. American Statistician, 46, 84-88.
95. Spiegelman, D. (1994): Cost-efficient study designs for relative risk modeling with covariate measurement error. Journal of Statistical Planning and Inference, 42, 187-208.
96. Spiegelman, D., Carroll, R.J., Kipnis, V. (2001): Efficient regression calibration for logistic regression in main study/internal validation study designs with an imperfect reference instrument. Statistics in Medicine, 20, 139-160.
97. Spiegelman, D., Gray, R. (1991): Cost-efficient study designs for binary response data with Gaussian covariate measurement error. Biometrics, 47, 851-869.
98. Stefanski, L.A. (1985): The effects of measurement error on parameter estimation. Biometrika, 72, 583-592.
99. Stefanski, L.A. (1989): Unbiased estimation of a nonlinear function of a normal mean with application to measurement error models. Communications in Statistics — Theory and Methods, 18, 4335-4358.
100. Stefanski, L.A. (1989): Correcting data for measurement error in generalized linear models. Communications in Statistics — Theory and Methods, 18, 1715-1733.
101. Stefanski, L.A. (2000): Measurement error models. Journal of the American Statistical Association, 95, 1353-1358.
102. Stefanski, L.A. (2001): Measurement error, in Encyclopedia of Environmetrics, A. El-Shaarawi & W.W. Piegorsch, eds. Wiley, UK.


103. Stefanski, L.A. (2002): Measurement error, in Statistics in the 21st Century, A.E. Raftery, M.A. Tanner & M.T. Wells, eds. Chapman and Hall.
104. Stefanski, L.A., Bay, J.M. (1996): Simulation extrapolation deconvolution of finite population cumulative distribution function estimators. Biometrika, 83, 407-417.
105. Stefanski, L.A., Buzas, J.S. (1995): Instrumental variable estimation in binary measurement error models. Journal of the American Statistical Association, 90, 541-550.
106. Stefanski, L.A., Carroll, R.J. (1985): Covariate measurement error in logistic regression. Annals of Statistics, 13, 1335-1351.
107. Stefanski, L.A., Carroll, R.J. (1987): Conditional scores and optimal scores in generalized linear measurement error models. Biometrika, 74, 703-716.
108. Stefanski, L.A., Carroll, R.J. (1990): Score tests in generalized linear measurement error models. Journal of the Royal Statistical Society B, 52, 345-359.
109. Stefanski, L.A., Carroll, R.J. (1990): Structural logistic regression measurement error models, in Proceedings of the Conference on Measurement Error Models, P.J. Brown & W.A. Fuller, eds. Wiley, New York.
110. Stefanski, L.A., Cook, J. (1995): Simulation extrapolation: the measurement error jackknife. Journal of the American Statistical Association, 90, 1247-1256.
111. Stephens, D.A., Dellaportas, P. (1992): Bayesian analysis of generalized linear models with covariate measurement error, in Bayesian Statistics 4, J.M. Bernardo, J.O. Berger, A.P. Dawid & A.F.M. Smith, eds. Oxford University Press, Oxford, 813-820.
112. Stram, D.O., Longnecker, M.P., Shames, L., Kolonel, L.N., Wilkens, L.R., Pike, M.C., Henderson, B.E. (1995): Cost-efficient design of a diet validation study. American Journal of Epidemiology, 142, 353-362.
113. Tanner, M.A. (1993): Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions, 2nd Ed. Springer-Verlag, New York.
114. Taupin, M. (2001): Semi-parametric estimation in the nonlinear structural errors-in-variables model. Annals of Statistics, 29, 66-93.
115. Thomas, D., Stram, D., Dwyer, J. (1993): Exposure measurement error: influence on exposure-disease relationships and methods of correction. Annual Review of Public Health, 14, 69-93.
116. Titterington, D.M., Smith, A.F.M., Makov, U.E. (1985): Statistical Analysis of Finite Mixture Distributions. Wiley, New York.
117. Tosteson, T., Stefanski, L.A., Schafer, D.W. (1989): A measurement error model for binary and ordinal regression. Statistics in Medicine, 8, 1139-1147.
118. Tosteson, T.D., Tsiatis, A.A. (1988): The asymptotic relative efficiency of score tests in the generalized linear model with surrogate covariates. Biometrika, 75, 507-514.
119. Tosteson, T.D., Ware, J.H. (1990): Designing a logistic regression study using surrogate measures of exposure and outcome. Biometrika, 77, 11-20.
120. Tosteson, T.D., Buzas, J.S., Demidenko, E., Karagas, M.R. (2003): Power and sample size calculations for generalized regression models with covariate measurement error. Statistics in Medicine (in press).
121. Tosteson, T.D., Titus-Ernstoff, L., Baron, J.A., Karagas, M.R. (1994): A two-stage validation study for determining sensitivity and specificity. Environmental Health Perspectives, 102, 11-14.


122. Tsiatis, A.A., DeGruttola, V., Wulfsohn, M.S. (1995): Modeling the relationship of survival to longitudinal data measured with error: applications to survival and CD4 counts in patients with AIDS. Journal of the American Statistical Association, 90, 27-37.
123. Tsiatis, A.A., Davidian, M. (2001): A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika, 88, 447-458.
124. Wang, N., Carroll, R.J., Liang, K.Y. (1996): Quasi-likelihood and variance functions in measurement error models with replicates. Biometrics, 52, 401-411.
125. Wang, C.Y., Hsu, L., Feng, Z.D., Prentice, R.L. (1997): Regression calibration in failure time regression. Biometrics, 53, 131-145.
126. Weinberg, C.R., Wacholder, S. (1993): Prospective analysis of case-control data under general multiplicative-intercept models. Biometrika, 80, 461-465.
127. White, E., Kushi, L.H., Pepe, M.S. (1994): The effect of exposure variance and exposure measurement error on study sample size: implications for design of epidemiologic studies. Journal of Clinical Epidemiology, 47, 873-880.
128. Whittemore, A.S. (1989): Errors in variables regression using Stein estimates. American Statistician, 43, 226-228.
129. Whittemore, A.S., Gong, G. (1991): Poisson regression with misclassified counts: application to cervical cancer mortality rates. Applied Statistics, 40, 81-93.
130. Whittemore, A.S., Keller, J.B. (1988): Approximations for regression with covariate measurement error. Journal of the American Statistical Association, 83, 1057-1066.
131. Wittes, J., Lakatos, E., Probstfield, J. (1989): Surrogate endpoints in clinical trials: cardiovascular trials. Statistics in Medicine, 8, 415-425.
132. Xie, S.X., Wang, C.Y., Prentice, R.L. (2001): A risk set calibration method for failure time regression by using a covariate reliability sample. Journal of the Royal Statistical Society B, 63, 855-870.
133. Zhao, L.P., Lipsitz, S. (1992): Designs and analysis of two-stage studies. Statistics in Medicine, 11, 769-782.
134. Zhou, H., Pepe, M.S. (1995): Auxiliary covariate data in failure time regression analysis. Biometrika, 82, 139-149.
135. Zhou, H., Wang, C.Y. (2000): Failure time regression with continuous covariates measured with error. Journal of the Royal Statistical Society B, 62, 657-665.