

Intercept Studies, Clinical Trials, and Cluster Experiments: To Whom Can We Extrapolate?

Charles D. Cowan, PhD

Resolution Trust Corporation, Washington, DC

Janet Wittes, PhD

Statistics Collaborative, Washington, DC

ABSTRACT: While sample surveys rely on an explicit theory of the relationship between a sample and its reference population, many nonsurvey-based designs lack a clear relationship to the population to which inferences are to be drawn. For example, in order to be applicable to large populations, intercept studies (which include observational studies of migrant populations and randomized studies such as mall intercepts) must make implicit or explicit assumptions about the probability of finding members of the population. In conducting clinical trials, constraints imposed by the research often lead large segments of the population to have zero probability of observation. Although such experiments generally conduct research at only a few sites, investigators usually intend to apply their results quite broadly. This paper compares intercept studies and clinical trials with particular emphasis on inference to a population for which the probabilities of being observed are unknown, unreliable, or very small.

KEY WORDS: Clinical trial methodology, generalization, intercept studies

INTRODUCTION

While we both have been involved in research studies among people, we come from two very different schools of thought concerning inference. CC has had experience in formal surveys with complex designs for inference and estimation whereas JW has worked in clinical trials that take extreme care to randomize treatments among participants. Thus, CC approaches inference through statistical rules that allow generalization from sample to population while JW depends on randomization. Our intellectual differences center on our divergent views regarding the generalizability of a clinical trial to a population broader than the cohort randomized. CC expects a clear relationship between the experimental group and the population to which inferences are to be made. JW is more likely to be comfortable with the leap of faith required to apply the results of a clinical trial to cohorts of people with zero probability of having been entered into the trial so long as the analysis strictly reflects the randomization and the people to whom the results are to be applied are essentially similar to those in the study. CC argues that “essentially similar” is a nonoperational concept without a formal, probabilistic link. The two of us do agree that our ability to draw inference from a study is a function of the statistical design as well as the biological and broadly sociological variability in the outcomes being studied. We have tried to develop a paradigm to help us decide when we can reasonably draw inferences from a study cohort to a larger population.

Address reprint requests to: Janet Wittes, Statistics Collaborative, 1816 Jefferson Place NW, Washington, DC 20036.

Received November 8, 1991; revised July 12, 1993.

Controlled Clinical Trials 15:24-29 (1994) 0197-2456/94/$7.00 © Elsevier Science Inc. 1994

655 Avenue of the Americas, New York, New York 10010

DIFFERENCES BETWEEN SURVEYS AND OTHER FORMS OF RESEARCH

An ideal survey is a special type of study that bases inference on its design. Although several factors determine the probability of sampling an individual, in an ideal survey we can calculate this probability precisely. A sample survey uses these probabilities to provide an unambiguous basis for inference from the sample to the population from which it was drawn [1].
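As an illustration of how known sampling probabilities support inference, the sketch below uses the standard Horvitz-Thompson estimator of a population total, in which each sampled value is weighted by the inverse of its inclusion probability. The paper does not name this estimator; it is offered here only as one familiar instance of design-based estimation, and the function name and the three-person sample are hypothetical.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each observed value is divided by the (known, nonzero) probability
    that the individual was included in the sample.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    total = 0.0
    for y, p in zip(values, inclusion_probs):
        if not 0.0 < p <= 1.0:
            # A zero probability means the design says nothing about this person.
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        total += y / p
    return total


# Hypothetical sample: 1 = has the characteristic, 0 = does not.
sample_values = [1.0, 0.0, 1.0]
sample_probs = [0.1, 0.1, 0.5]  # assumed known from the design
estimated_total = horvitz_thompson_total(sample_values, sample_probs)
```

The check on the probabilities matters for the argument that follows: the estimator is defined only when every member of the population has a known, nonzero chance of selection.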

In surveys we encounter problems in drawing inferences when we have imprecise or unreliable knowledge of the sampling probabilities. We may fail to know the precise probabilities if some sampled individuals refuse to be interviewed or cannot be found. In some household surveys, a housing unit is sampled in order to interview the occupants, but the interviewer cannot even ascertain whether the unit is occupied. Model-based adjustments to probabilities deal with problems of nonresponse and indeterminacy [2]. Even in the presence of nonresponse, however, inference from most large-scale surveys is still overwhelmingly design-based because the assumptions of the models used lead only to small discrepancies from the planned design-based analysis. Sources of variation that affect our measures are either assumed to be completely represented within the survey design or incorporated through linkages with large, complete record systems and thus are not an important source of potential bias [3].
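One common model-based adjustment of this kind is the weighting-class adjustment, sketched below under the assumption that nonrespondents resemble respondents within the same adjustment cell. The paper does not specify which adjustment it has in mind, so the technique, the function name, and the data are illustrative only.

```python
from collections import defaultdict


def weighting_class_adjust(units):
    """Inflate respondents' weights to account for nonrespondents.

    `units` is a list of dicts with keys 'cell', 'weight', 'responded'.
    Within each cell, respondent weights are multiplied by
    (sum of all sampled weights) / (sum of respondent weights).
    """
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for u in units:
        sampled[u["cell"]] += u["weight"]
        if u["responded"]:
            responded[u["cell"]] += u["weight"]

    for cell in sampled:
        if responded.get(cell, 0.0) == 0.0:
            # No respondent can stand in for this cell; the model fails here.
            raise ValueError(f"cell {cell!r} has no respondents")

    adjusted = []
    for u in units:
        if u["responded"]:
            factor = sampled[u["cell"]] / responded[u["cell"]]
            adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted


# Hypothetical sample with two adjustment cells.
sample = [
    {"cell": "urban", "weight": 10.0, "responded": True},
    {"cell": "urban", "weight": 10.0, "responded": True},
    {"cell": "urban", "weight": 10.0, "responded": False},
    {"cell": "urban", "weight": 10.0, "responded": False},
    {"cell": "rural", "weight": 20.0, "responded": True},
    {"cell": "rural", "weight": 20.0, "responded": False},
]
adjusted_sample = weighting_class_adjust(sample)
adjusted_total_weight = sum(u["weight"] for u in adjusted_sample)
```

After adjustment the respondents carry the full sampled weight of their cell, so the design-based analysis can proceed; the price is the untestable assumption that response is unrelated to the outcome within a cell.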

INTERCEPT BASED DESIGNS AND ASSUMPTIONS AFFECTING INFERENCE

The sampler, whose stock in trade is design-based estimation procedures, has problems drawing inferences when the data do not arise from a tightly controlled design. Consider an intercept study, i.e., a study that measures individuals who happen to be found at a specific location and time. Two common types of intercept studies are the watering point study and the mall intercept study. A watering point study, which investigates people with no fixed location, looks for individuals at some place where they eventually must go or at least are very likely to go. Examples of such studies are surveys of nomads in Somalia, where interception is literally at the watering points, and surveys of the homeless in New York with public shelters as the watering points. A mall intercept, on the other hand, is a study in which interviewers establish a stationary location in an area that may be visited frequently by one subgroup of the population, infrequently by a second, and not at all by a third. The analysis requires assumptions about the frequency with which people are likely to visit a mall, about whether the entire relevant population for the study will attend a mall at some time, and about the types of clustering effects at different malls. Intercept studies can be expensive, but for certain types of populations (the homeless, for example) they may be the only way to conduct research. In other cases, they may be much less expensive than looking for a relatively rare member of a much larger general population [4]. In still other cases, an intercept may be required to identify subjects for recruitment into trials and to provide a central locus for administering the treatments. For example, Cowan [5] compared a clinical trial to an intercept study with hospitals and clinics as the watering points.

The assumptions regarding the probability of observing a particular kind of individual form the basis for inference in watering point and mall intercept studies. The possibility that the chance of observing an individual may be related to the measures under study is of great importance to making valid inferences. In many cases, we assume that all individuals in the population have a nonzero chance of being sampled, so that no member of the population is entirely excluded from the possibility of participating in the study. In a watering point study we may assign equal weights to everyone in the population at large whereas in a mall intercept study the weights may be functions of known covariates. When this assumption of nonzero probability of interception is violated for a large number of people, we are unable to infer to a segment of the population through the mechanism provided by the sample design; instead, we must rely on some assumed model. In a typical watering point study all people are highly likely to come to the site, while in a mall intercept study some percentage of the population (perhaps large) will never appear at the study site.
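To make the role of these weights concrete, the sketch below assumes that the chance of intercepting a person is proportional to the number of visits that person reports making to the site during the study period, and weights each respondent by the inverse of that count. This visit-frequency weighting is an assumption introduced for illustration, not a procedure taken from the paper; the function name and data are hypothetical.

```python
def visit_weighted_mean(responses):
    """Weighted mean of an outcome among people who visit the site.

    `responses` is a list of (value, reported_visits) pairs. Each person
    is weighted by 1 / reported_visits, on the assumption that the chance
    of interception is proportional to the number of visits.
    """
    numerator = 0.0
    denominator = 0.0
    for value, visits in responses:
        if visits < 1:
            # Anyone intercepted has visited at least once.
            raise ValueError("an intercepted person must report >= 1 visit")
        weight = 1.0 / visits
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no responses supplied")
    return numerator / denominator


# Hypothetical intercepts: (outcome, visits during the study period).
intercepts = [(1, 4), (0, 1), (1, 2)]
unweighted = sum(v for v, _ in intercepts) / len(intercepts)
weighted = visit_weighted_mean(intercepts)
```

In this example frequent visitors are more likely to have the outcome, so the unweighted mean overstates it. Note what the weights cannot do: people who never visit the site contribute nothing, and no reweighting of those observed recovers them without an assumed model.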

CONDITIONS FOR INFERENCE IN CLINICAL TRIALS

In most clinical trials, one selects a cohort of patients, intervenes with two or more randomly assigned treatments, and then evaluates which of the tested treatments performs “better.” The selection of the study cohort is made with an eye toward compliance and risk; that is, the trial enters people who are believed to be at high risk for the condition under study and at relatively low risk for competing events. For instance, in studying prevention of heart disease, people with cancer are often excluded because they are likely to die of their underlying cancer and thus contribute little to the inference concerning heart disease. Similarly, clinical trials tend to exclude people who are likely to move to a different city, unlikely to adhere to therapy, unable to follow instructions, or potentially disruptive in the clinic. Age limits may be set, perhaps to exclude minors and people over 65. These exclusions may reflect administrative ease and have little biological justification. Studies often exclude women of childbearing years from trials because information about the consequences of the drugs for pregnancy or lactation may be lacking, or because the investigators fear that mothers of young children may not have the time to be involved in a clinical trial. Recent clinical trials exclude even more people: the informed consent form may be frightening and the trial may be preceded by a long run-in period to eliminate people who do not comply with a test protocol. Geography and social class passively prevent many people from participation. A person living in a rural area may be unlikely to be in a clinical trial simply because the medical facilities nearby may not be involved in research. Similarly, a trial that attracts participants through advertisements is unlikely to enroll many people on the socioeconomic fringes of society. Only recently have large simple trials such as ISIS [6] and GISSI [7] provided models for randomizing patients from a wide cross-section of people and medical centers.

In spite of the extremely limited range of people in most clinical trials, we often apply the results to a much broader population. How broadly should results be applied? We both believe it is reasonable for a doctor to compare a newly presenting patient to the profile of patients in the trial. If the patient matches the profile, then the results are likely to be applicable. If, however, the patient differs from those randomized (she is a woman while the participants in the trial were men, he is 75 years old while those in the trial were all under 65, he is an alcoholic while substance abusers were excluded), what should the treating physician do? Our basic thesis is that the closer an intervention is to a purely biological process, the more confident we feel in extrapolating beyond the types of patients studied. Indeed, we contend that in such cases the appropriate cohort to enter into a clinical trial ought to consist of cooperative people without much attendant competing risk so that the trial can be efficient. On the other hand, if the intervention is far from a purely biological process, then the failure to include a wide range of people in the study can lead to conclusions that should not be extended to other populations.

Consider, for example, two cases that represent opposite extremes: interventions for septic shock and interventions for contraception. Shock is so overwhelming that the effects of such variables as social class, compliance, and competing risk pale compared to the process of the disease itself. Contraception is at the other extreme; investigators in the field of population research have long known that methods that are highly effective in some ethnic groups are ineffective among other groups [8]. For example, ethnicity and religion are strongly related to the success or failure of particular methods. Failure to account clearly for the methods that determined the composition of the study cohort can lead to serious mistakes in inference to other populations.

Still another reason to be cautious in drawing broad inferences from clinical trials is that the effectiveness of medical care often differs dramatically by facility. The effect of certain types of treatments (e.g., surgery, behavioral interventions, diagnostic imaging, critical care) may depend on the skill of the physician or the sophistication of the medical care facility.

For purely biological processes, we may infer to a more general population from a randomized clinical trial of a well-defined study cohort as long as treatment effects are not importantly related to baseline covariates. The biological model and not the study design tells us that inference is valid in this situation. JW tends to ignore such usual demographic variables as age, race, and sex in drawing inferences from most clinical interventions, whereas CC tends to take more seriously the possibility that these variables are of predictive importance in determining the efficacy of a clinical intervention.
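The condition that treatment effects not be importantly related to baseline covariates can be illustrated with simple arithmetic. The sketch below is not from the paper: it treats the average effect in any cohort as a mixture of subgroup effects weighted by subgroup shares, and every number in it is hypothetical. In particular, the effect assigned to people aged 65 and over is pure assumption, since a trial that excludes them cannot estimate it.

```python
def mixture_effect(subgroup_effects, shares):
    """Average treatment effect in a cohort with the given subgroup shares.

    Both arguments are dicts keyed by subgroup; shares must sum to 1.
    """
    if set(subgroup_effects) != set(shares):
        raise ValueError("subgroup keys must match")
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(shares[g] * subgroup_effects[g] for g in shares)


# Hypothetical absolute risk reductions by age group. The value for
# "65_and_over" is an assumption; the trial below never observes it.
assumed_effects = {"under_65": 0.10, "65_and_over": 0.02}

trial_shares = {"under_65": 1.0, "65_and_over": 0.0}       # trial excludes 65+
population_shares = {"under_65": 0.7, "65_and_over": 0.3}  # target population

trial_effect = mixture_effect(assumed_effects, trial_shares)
population_effect = mixture_effect(assumed_effects, population_shares)

# If the effect does not depend on age, the two cohorts agree.
homogeneous = {"under_65": 0.10, "65_and_over": 0.10}
same_effect = mixture_effect(homogeneous, population_shares)
```

When the effect is the same in both groups, the trial result carries over unchanged, which is the situation JW has in mind for purely biological processes. When it differs, the population effect (about 0.076 here) departs from the trial effect (0.10), and the size of the gap depends entirely on a quantity the trial could not measure, which is CC's concern.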

As we move away from situations defined clearly by biology, a number of factors constrain our ability to draw inferences. The sampler would argue that as the range of variation of response increases, as the level of control by the respondent or caregiver increases, or as the relative importance of covariates or intervening variables increases, the possibility of drawing an inference to a broader population decreases. Consider, therefore, the characteristics of a study according to the rigor of the design employed and the validity of the assumptions made regarding how patients or respondents are selected. In a formal survey with no nonresponse, inferences can be entirely design-based. In the presence of nonresponse, our ability to draw inferences depends on the accuracy of assumptions we make to correct for the fact that some individuals select themselves out of the survey. An intercept-based sample requires some design considerations: Do all members of the overall population have a chance of being observed? Can we determine the probability of an observation? Are persons who are excluded for any reason similar to those who were included?

In watering point studies, we make a formal attempt to cover the entire population by determining whether the entire population has at least a small chance to visit a watering point. We also attempt to determine whether individuals go to more than one point during the study or visit the same watering point repeatedly, and, if so, how often. Finally, if there are persons not covered by the span of interception of the watering point, we are comfortable with the study if we expect their characteristics to be the same as those who are covered by our frame. Once these factors are considered, we can effectively subsample potential intercept sites and apply the information from the intercept study using design-based techniques.
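One way to combine a subsample of sites with information on how many sites each person uses is a multiplicity-style estimator, sketched below. It assumes that every person found at a sampled site is enumerated, that each person reports accurately the number of distinct sites he or she uses during the study period, and that site selection probabilities are known. The paper does not prescribe this estimator; the function name and data are hypothetical.

```python
def multiplicity_total(sampled_sites):
    """Estimate a population total from a probability sample of sites.

    `sampled_sites` is a list of dicts with keys:
      'site_prob': probability that the site was selected (0 < p <= 1)
      'people':    list of (value, n_sites_used) pairs for everyone
                   found at the site.
    Each person's value is divided by the number of sites he or she
    uses, so that someone reachable at several sites is not overcounted.
    """
    total = 0.0
    for site in sampled_sites:
        prob = site["site_prob"]
        if not 0.0 < prob <= 1.0:
            raise ValueError("site selection probability must lie in (0, 1]")
        site_sum = 0.0
        for value, n_sites_used in site["people"]:
            if n_sites_used < 1:
                raise ValueError("a person found at a site uses >= 1 site")
            site_sum += value / n_sites_used
        total += site_sum / prob
    return total


# Hypothetical study: two sites sampled, each with probability 0.5.
sites = [
    {"site_prob": 0.5, "people": [(1, 1), (1, 2)]},
    {"site_prob": 0.5, "people": [(1, 2), (0, 1)]},
]
estimated_people_with_trait = multiplicity_total(sites)
```

Persons who use no site at all remain outside the frame; as the text notes, the estimate extends to them only if their characteristics are expected to match those of the people who are covered.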

CONCLUSIONS: HOW WE DRAW INFERENCES

One purpose of this paper was to begin a dialogue between practitioners from different areas in statistics who view inference in very different ways. A second purpose was to sensitize researchers to the assumptions they make in drawing inferences. No hard-and-fast rules dictate how to draw inferences safely to a population; the only certainty is that the reliability of the inference decreases as one gets further from either the ideal model or the ideal design. No specific point marks the boundary between where it is safe and where it is unsafe to draw inferences. We two authors know that in practice we disagree with each other: CC is more insistent that the cohort studied bear some formal relation to the population to which generalizations are to be made; JW relies more on biological similarities among people and the experience that the usual measured covariates are often disappointing predictors of the efficacy of treatment. We both agree that reports of a clinical trial should describe explicitly the process by which the study enrolled participants, so that questions and concerns about how an inference might be drawn from the research can be answered as part of the design, instead of by wishful thinking after the data have been collected.

We are grateful to Kent Bailey for his interesting suggestions on an earlier draft of this paper.


REFERENCES

1. Cochran WG: Sampling Techniques. John Wiley and Sons, New York, 1977

2. Cowan CD, Ahmed S: Sample weights for coverage: a stochastic interpretation. Proc Section of Survey Research Methods, American Statistical Association, 1990

3. Kalton G: Compensating for missing data. Institute for Social Research, University of Michigan, Ann Arbor, 1983

4. Sudman S, Sirken MG, Cowan CD: Sampling rare and elusive populations. Science 240:991-996, 1988

5. Cowan CD: Mall intercepts and clinical trials: the philosophy of inference from different types of research designs. Marketing Res 1:15-22, 1989

6. ISIS-1 (First International Study of Infarct Survival) Collaborative Group: Randomized trial of intravenous atenolol among 16027 cases of suspected acute myocardial infarction: ISIS-1. Lancet 2:57-66, 1986

7. Tognoni G, Franzosi MG, Garattini S, Maggioni A, Lotto A, Mauri F, Rovelli F: The case of GISSI in changing the attitudes and practices of Italian cardiologists. Stat Med 9:17-27, 1990

8. Draper E: Birth Control in the Modern World: The Role of the Individual in Population Control. Penguin, Baltimore, 1965