
Intention-to-Treat Analyses in Behavioral Medicine Randomized Clinical Trials



Sherry L. Pagoto & Andrea T. Kozak & Priya John & Jamie S. Bodenlos & Donald Hedeker & Bonnie Spring & Kristin L. Schneider

Published online: 25 March 2009
© International Society of Behavioral Medicine 2009

Abstract

Background Intention-to-treat (ITT) is an analytic approach where all randomized participants are included in analyses and in their originally assigned condition, regardless of adherence or protocol deviation.

Purpose The present study aimed to determine whether reporting and correct use of ITT in behavioral medicine randomized clinical trials (RCTs) published in behavioral journals has improved in recent years.

Method ITT and related analytic conventions were examined in behavioral medicine RCTs (N=87) published in Annals of Behavioral Medicine, Health Psychology, and the Journal of Consulting and Clinical Psychology in the years 2000–2003 and then again in 2006–2007. Logistic regression analyses tested whether ten indicators associated with ITT were being used increasingly over time. Also tested was whether reporting and correct use of ITT improved following the adoption of Consolidated Standards of Reporting Clinical Trials (CONSORT) statement.

Results Results revealed that less than half of RCTs (42%) used ITT analyses correctly. Over time, reporting of sample size estimation and primary outcome as well as use of the term “ITT” to describe analyses improved; however, correct implementation of ITT did not. Improvement was not specifically attributable to CONSORT adoption.

Conclusion Investigators’ claims of using ITT analyses have increased over time, but correct use of ITT has not.

Keywords Randomized controlled trials · Analytic quality · Intention-to-treat · Research design · Systematic review

Int. J. Behav. Med. (2009) 16:316–322
DOI 10.1007/s12529-009-9039-3

S. L. Pagoto (*) · J. S. Bodenlos · K. L. Schneider
Division of Preventive and Behavioral Medicine, Department of Medicine, University of Massachusetts Medical School, 55 Lake Avenue North, Worcester, MA 01655, USA
e-mail: [email protected]

A. T. Kozak
Department of Psychology, Oakland University, Rochester, MI, USA

B. Spring
Edward Hines, Jr. VA Hospital, Hines, IL, USA

D. Hedeker
Division of Epidemiology and Biostatistics, University of Illinois at Chicago, Chicago, IL, USA

B. Spring
Feinberg School of Medicine, Northwestern University, Chicago, IL, USA

P. John
University of Chicago, Chicago, IL, USA

Introduction

Intention-to-treat (ITT) is an analytic approach where all randomized participants are included in analyses and in their originally assigned condition, regardless of adherence or protocol deviation [1–5]. Preserving randomization increases the likelihood that group differences are due to the intervention and unaffected by biases that lead to an overestimation of the intervention effect [6]. The alternatives to ITT are per protocol (PP) and available case (AC) analyses. PP is when cases are included in the analyses only if they adhered to the assigned intervention and completed the follow-up [7]. In AC, only participants for whom data were obtained are included in the analyses [8]. Missing data are ignored in PP and AC analyses, but procedures for estimating missing data are required for ITT analyses in trials with incomplete data [6,9]. While ITT analyses guard against some of the biases associated with PP analyses, exorbitant missing data can even bias ITT analyses. Complete transparency of possible sources of bias requires a description of the analytic approach (i.e., ITT, PP, AC), the quantity of missing data, and techniques for handling missing data.
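The distinction between the three analysis sets can be made concrete with a minimal Python sketch. The `Participant` record, its field names, and the five-person toy trial are illustrative assumptions, not data or code from the reviewed trials.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Participant:
    pid: int
    assigned_arm: str          # arm assigned at randomization (hypothetical labels)
    adhered: bool              # followed the assigned intervention
    outcome: Optional[float]   # None when the follow-up outcome is missing


def itt_set(randomized: List[Participant]) -> List[Participant]:
    """ITT: every randomized participant, kept in the originally assigned arm."""
    return list(randomized)


def per_protocol_set(randomized: List[Participant]) -> List[Participant]:
    """PP: only participants who adhered and completed follow-up."""
    return [p for p in randomized if p.adhered and p.outcome is not None]


def available_case_set(randomized: List[Participant]) -> List[Participant]:
    """AC: only participants for whom outcome data were obtained."""
    return [p for p in randomized if p.outcome is not None]


# Toy trial (invented values, for illustration only).
trial = [
    Participant(1, "intervention", True, 4.2),
    Participant(2, "intervention", False, 3.1),
    Participant(3, "intervention", True, None),
    Participant(4, "control", True, 2.8),
    Participant(5, "control", False, None),
]

# Under ITT these participants still count, so their outcomes must be estimated.
needs_estimation = [p.pid for p in itt_set(trial) if p.outcome is None]

print(len(itt_set(trial)), len(per_protocol_set(trial)), len(available_case_set(trial)))
print(needs_estimation)
```

In this toy trial the ITT set keeps all five participants, whereas PP and AC silently shrink the denominator, which is why only ITT forces an explicit strategy for the two missing outcomes.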

The reporting of whether or not ITT analyses were used, the amount of missing data, and strategies used for dealing with missing data are required by the Consolidated Standards of Reporting Clinical Trials (CONSORT) statement. The CONSORT statement, first published in 1996 and updated in 2001, was developed to improve the transparency of randomized clinical trial (RCT) reports by establishing reporting standards. Both authors and reviewers are asked to ensure that reports of RCTs address each item on a checklist [10–12]. CONSORT has been endorsed by editorial boards of major medical journals for a decade and recently by the editorial boards of behavioral medicine journals, including Health Psychology (HP) [13], Annals of Behavioral Medicine (ABM) [14], and Journal of Consulting and Clinical Psychology (JCCP) [15].

The CONSORT statement is exclusively focused on transparent reporting, and while it requires authors to report whether or not ITT was used, it does not directly endorse any particular analytic strategy, including ITT. By requiring researchers to be transparent, CONSORT adoption might have the positive effect of raising awareness and proper use of methodological conventions in RCTs, like ITT. Adoption of the CONSORT statement by medical journals has been associated with greater use of ITT [16]. The impact of CONSORT on reporting and use of ITT in behavioral medicine journals remains unknown. One study revealed that behavioral medicine RCTs published in medical journals were more likely to utilize ITT than those published in behavioral journals [17]. One possible explanation for this finding is that medical journals were in compliance with CONSORT but behavioral journals were not. Behavioral journals have recently accepted the CONSORT statement, but it remains unknown whether this has affected the analytic quality of their RCTs.

The present study examines whether the implementation of ITT and related analytic conventions has improved as a function of time from 2000 to 2007 in RCTs of behavioral medicine interventions published in three behavioral journals. We then investigated whether the use of ITT procedures and related analytic conventions improved as a function of the adoption of CONSORT in these journals.

Methods

Inclusion Criteria

Studies included in our review were required to meet criteria for phase II or III RCTs. These criteria include (1) use of randomization procedures to determine treatment assignment, (2) test of an intervention in humans against a standard treatment or control group, and (3) purpose is to evaluate efficacy [18]. To limit sample heterogeneity, we included only those RCTs that randomized at the level of the individual, excluding those that randomized by group (e.g., family, site, school). Also, the tested intervention was required to have employed a behavioral treatment modality. RCTs that exclusively tested the effects of medications or medical devices were excluded. Further, to qualify as a behavioral medicine RCT, either the intervention or the primary endpoint needed to pertain to the prevention or treatment of an acute or chronic disease or condition. Psychological disorders (e.g., mood disorders, schizophrenia) and substance abuse were included only in so far as they contributed to a health condition as an endpoint. Primary endpoints could be either biological or behavioral and the target population could be either healthy or ill. A total of 87 articles met criteria for inclusion.

Selection Procedures

Three journals (HP, ABM, JCCP) were selected because (1) they routinely publish behavioral medicine RCTs (at least one RCT per year during 2000–2003 and 2006–2007) and (2) each had formally adopted the CONSORT statement by 2005 per their publication instructions. Other journals were considered but did not meet these criteria. Two authors independently hand-searched every journal issue from January 2000 to June 2003 and January 2006 to December 2007 to identify behavioral medicine RCTs. Articles were then assigned to assessor pairs who excluded articles that did not meet inclusion criteria.

Development of Assessment Criteria

Four investigators developed the rating system for each item by assessing the same ten papers through progressive drafts of an assessment scheme. Assessor discrepancies were discussed and ambiguous items were revised or eliminated. Each time an element’s rating description was revised, all ten articles were reassessed until intercoder agreement across all ten articles reached ≥90%. The 12 items shown in Table 1 comprised the final roster of items. Each of these items was deemed necessary to evaluate not only whether ITT was used correctly, but also the degree to which the analyses, ITT or otherwise, could be biased by missing data and missing data strategies. The items were dichotomously coded, requiring a judgment of present or absent (or not applicable).

The first item assessed was sample size estimation, because any analytic approach, ITT or PP, is dependent on adequate power to detect an effect. Articles were given credit if a sample size estimation was performed that accounted for some anticipated rate of missing data. Whether authors identified a primary endpoint was assessed because the primary endpoint (as opposed to secondary endpoints) should be analyzed via ITT. Whether the denominator of the primary analyses was reported was assessed because this is a necessary step to evaluating whether all randomized cases were included in the analyses. Also assessed was whether authors referred to the analyses as “ITT”, to determine the discrepancy between correct and incorrect use of ITT. Correct use of ITT was evaluated by rating the articles on whether all randomized cases were actually included in the primary analyses. To assess this, the denominator of the primary analyses must match the number randomized.

Also assessed was whether missing data were reported and how much were missing at baseline and follow-up. Because exorbitant missing data introduce biases into even an ITT analysis, also assessed was whether the article reported greater than 10% missing data. Whether the article used an analytic model that accounted for missing data or simple imputation for missing data was also assessed. Finally, whether both ITT and PP analyses were employed was assessed. When data are missing, both ITT and PP are subject to bias; thus, the conclusion of a clinical trial cannot rest on a single report of either ITT or PP analysis alone [7]. Reports of studies that have missing data which include both ITT and PP analyses provide the reader greater information about the conclusion and the degree of bias introduced by that missing data.
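Three of the dichotomous ratings described above reduce to simple comparisons of reported counts. The following Python sketch illustrates that logic under stated assumptions: the function name `rate_article`, its arguments, and the example counts are hypothetical, and the actual assessment was done by human raters rather than by code.

```python
def rate_article(n_randomized, n_analyzed, n_missing_followup, declared_itt):
    """Return a few dichotomous ratings for one trial report (illustrative only)."""
    if n_randomized <= 0:
        raise ValueError("n_randomized must be positive")

    # Correct ITT: the denominator of the primary analysis matches the number randomized.
    analyzed_all = n_analyzed == n_randomized

    # Share of randomized participants with missing follow-up data.
    missing_fraction = n_missing_followup / n_randomized

    return {
        "analyzed_all_randomized": analyzed_all,
        "more_than_10pct_missing": missing_fraction > 0.10,
        "itt_declared": declared_itt,
        # The label "ITT" is used although not all randomized cases were analyzed.
        "itt_claim_inaccurate": declared_itt and not analyzed_all,
    }


# Hypothetical report: 120 randomized, 104 analyzed, 16 lost to follow-up, labeled "ITT".
example = rate_article(n_randomized=120, n_analyzed=104,
                       n_missing_followup=16, declared_itt=True)
print(example)
```

Keeping "ITT declared" and "analyzed all randomized" as separate fields mirrors the separation in the rating scheme, which is what allows the discrepancy between claimed and correct use to be measured.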

Assessment Phase

The assessment phase for articles published between 2000 and 2003 occurred from August 2003 to June 2004; for articles published in 2006, it occurred from January 2006 to December 2006; and for articles published in 2007, it occurred from May 2008 to July 2008. Assessors consisted of two biostatisticians and six clinical psychologists. Two assessors were responsible for selecting eligible articles and distributing them to assessor dyads. As in other reviews of quality reporting [12,19], it was not deemed necessary to mask the articles. Each article was reviewed independently by two assessors, using all possible combinations of assessor dyads. To minimize drift, each dyad assessed two to four articles before alternating partners. Prior to resolving discrepant ratings, average interassessor agreement across all 87 articles was 82% and did not differ across the two time periods (p=0.73). After initial ratings were made independently, assessment discrepancies were discussed. When agreement could not be achieved within a dyad, the item was referred to the full assessment team to attain consensus. Monthly conference calls were held to resolve discrepancies. Once agreement was achieved, a final assessment sheet was entered into a database.
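Interassessor agreement of the kind reported above is commonly computed as simple percent agreement between two raters on the same items. The sketch below assumes that definition (the article does not state the exact formula), and the two rating vectors are invented for illustration.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of items on which two assessors gave the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


# Hypothetical ratings of one article on 12 items by the two members of a dyad.
assessor_1 = ["present"] * 6 + ["absent"] * 5 + ["na"]
assessor_2 = list(assessor_1)
assessor_2[2] = "absent"    # first disagreement
assessor_2[8] = "present"   # second disagreement

agreement = percent_agreement(assessor_1, assessor_2)
print(round(agreement, 1))
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa would be an alternative, but it is not what the article reports.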

Analytic Plan

The proportion of articles achieving criterion for each item was computed for each year. All 12 dichotomously rated features were analyzed using logistic regression models including only an overall effect of time across 2000 through 2007. Next, a dichotomous dummy variable was created based on whether an article was published in 2000–2003 (pre-CONSORT) or in 2006–2007 (post-CONSORT). In the second set of models, this dummy variable was added to each logistic regression model to test whether the post-CONSORT period represented a departure from the overall linear time trend. If significant, this would suggest that the adoption of CONSORT significantly affected that item. For two variables, “dropout reported” and “use of both ITT and complete-case analyses”, the second model could not be fully estimated because there was no variation on these variables in 2006–2007 (i.e., the percentages were 100% and 0%, respectively). As a result, Fisher’s exact tests were used to compare the collapsed totals at pre- and post-CONSORT.

Table 1 Analytic quality indicators by journal

| Year | 2000 | 2001 | 2002 | 2003 | Pre-CONSORT | 2006 | 2007 | Post-CONSORT |
|---|---|---|---|---|---|---|---|---|
| # articles | 16 | 12 | 12 | 10 | 50 | 19 | 18 | 37 |
| Study size rationale (%) | 6 | 8 | 25 | 20 | 14 | 37 | 44 | 40 |
| Define primary outcome (%) | 25 | 75 | 58 | 90 | 58 | 63 | 78 | 70 |
| Report missing baseline data (%) | 63 | 78 | 20 | 88 | 62 | 83 | 82 | 83 |
| Report dropout (%) | 94 | 92 | 83 | 100 | 92 | 100 | 100 | 100 |
| Less than 10% dropout (%) | 21 | 33 | 20 | 40 | 26 | 42 | 39 | 40 |
| Denominator reported (%) | 88 | 58 | 50 | 90 | 72 | 76 | 78 | 77 |
| ITT declared (%) | 25 | 42 | 25 | 40 | 32 | 67 | 72 | 70 |
| Analyze all randomized? (%) | 38 | 33 | 17 | 50 | 34 | 47 | 56 | 51 |
| Missing data strategy? (%) | 50 | 50 | 42 | 60 | 50 | 58 | 61 | 59 |
| Analytic model accounts for missing data? (%) | 6 | 25 | 17 | 0 | 12 | 21 | 39 | 30 |
| Simple imputation (%) | 37 | 25 | 17 | 40 | 30 | 32 | 22 | 27 |
| Used ITT and complete case (%) | 13 | 17 | 8 | 20 | 14 | 0 | 0 | 0 |
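The two modelling steps in the Analytic Plan (a linear time trend, then the same model with a post-CONSORT dummy) can be sketched in plain Python with a small Newton-Raphson fitter. This is not the authors' code: the statistical software they used is not stated, time is coded here as years since 2000 as an assumption, and the "ITT declared" counts are approximations back-calculated from the rounded percentages in Table 1, so the estimates need not match the published coefficients.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(rows, successes, totals, n_iter=50):
    """Maximum-likelihood logistic regression on grouped (binomial) data."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, s, n in zip(rows, successes, totals):
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += x[i] * (s - n * p)          # score vector
                for j in range(k):
                    hess[i][j] += n * p * (1 - p) * x[i] * x[j]  # information matrix
        step = solve(hess, grad)                        # Newton-Raphson update
        beta = [b + d for b, d in zip(beta, step)]
    return beta


years = [2000, 2001, 2002, 2003, 2006, 2007]
totals = [16, 12, 12, 10, 19, 18]        # articles per year (Table 1)
declared = [4, 5, 3, 4, 13, 13]          # ASSUMED counts, rounded from Table 1 percentages

# Model 1: intercept + linear time (years since 2000, an assumed coding).
design_1 = [[1.0, float(y - 2000)] for y in years]
trend_only = fit_logistic(design_1, declared, totals)

# Model 2: adds a dummy equal to 1 for the post-CONSORT years (2006-2007).
design_2 = [[1.0, float(y - 2000), 1.0 if y >= 2006 else 0.0] for y in years]
trend_plus_period = fit_logistic(design_2, declared, totals)

print("time slope, model 1:", round(trend_only[1], 2))
print("post-CONSORT coefficient, model 2:", round(trend_plus_period[2], 2))
```

A post-CONSORT coefficient near zero in the second model means the 2006–2007 values sit on the line already implied by the earlier years, which is the sense in which an improvement is "not specifically attributable" to CONSORT. Testing that coefficient would additionally require its standard error, which this sketch does not compute.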

Results

A total of 87 RCTs published during 2000–2007 met inclusion criteria: 26 articles (30%) were published in ABM, 36 articles (41%) were published in HP, and 25 articles (29%) were published in the JCCP. Frequencies for each item by year are shown in Table 1. A significant linear effect of time was revealed for sample size estimation (p=0.01), primary outcome (p=0.03), and whether authors referred to analyses as “ITT” (p<0.001). All increased over time. Two variables were marginally significant (p=0.05) including use of both ITT and PP analyses, which decreased over time, and use of an analytic model that accounts for missing data, which increased over time. No other variables significantly changed across time (see Table 2). In the second set of models examining whether the post-CONSORT period represents a departure from the time trend, no variables reached significance (see Table 3). The Fisher exact test for “dropout reported” was nonsignificant; however, the proportion of studies conducting both completer and ITT analyses declined over time from 14% to 0% (p=0.01).
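The Fisher exact comparisons of collapsed pre- and post-CONSORT totals can be illustrated with a short, dependency-free Python sketch. The cell counts are assumptions back-calculated from the rounded percentages in Table 1 (for example, 14% of 50 articles is taken as 7), and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention, so the resulting p values may differ somewhat from those reported.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k "yes" articles in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# ASSUMED counts: both ITT and completer analyses, 7 of 50 pre vs 0 of 37 post.
p_both = fisher_exact_two_sided(7, 43, 0, 37)

# ASSUMED counts: dropout reported, 46 of 50 pre vs 37 of 37 post.
p_dropout = fisher_exact_two_sided(46, 4, 37, 0)

print(round(p_both, 3), round(p_dropout, 3))
```

With these assumed counts the first comparison falls below 0.05 and the second does not, which agrees in direction with the results reported above.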

Discussion

Results revealed some evidence of improved reporting of quality indicators relevant to ITT analyses in recent years in behavioral medicine RCTs published in three behavioral journals. The reporting of sample size estimation and primary outcome, as well as the tendency to refer to analyses as “ITT”, were the three indicators that increased over time. There was also a trend toward increased use of analytic models that accounted for missing data. Improvements did not appear to be attributable specifically to CONSORT, given that the trends evident after its adoption were consistent with the increasing trends that preceded its adoption. However, the reporting of some quality indicators relevant to ITT analyses was high across all years. For example, more than 80% of articles in all years reported the number of dropouts in their trials.

That trial reports increasingly include information on sample size estimation is encouraging. More difficult to ascertain is whether an increased tendency to refer to analyses as “ITT” is wholly a good thing. Although this tendency increased linearly over time, the number of RCTs that actually performed ITT correctly (i.e., included all randomized cases in the analyses) did not change significantly over time. In essence, investigators were more likely to say they were using ITT over time, but not more likely to actually use it. Awareness of CONSORT in the behavioral medicine community may have had the unintended consequence of inflated reporting of ITT analyses. After the adoption of CONSORT, authors more often claimed to have conducted ITT analyses (a change from 32% pre-CONSORT to 68% post-CONSORT) while the number of RCTs implementing ITT correctly rose more modestly (and nonsignificantly) from 34% pre-CONSORT to 51% post-CONSORT. Inaccurate reporting of ITT policy increased from 8% to 19% after CONSORT was adopted. In an earlier investigation, we reported a slightly higher rate (24%) of inaccurate ITT reporting for behavioral medicine trials published in major medical journals (i.e., Journal of the American Medical Association (JAMA), New England Journal of Medicine (NEJM)) that had adopted CONSORT since its inception [17]. Adoption of CONSORT is intended to improve RCT reporting; thus, monitoring of the accuracy of reports of CONSORT criteria may be needed. When ITT was inaccurately claimed, this often occurred because an author defined ITT as an analysis of cases according to their original randomized assignment, omitting the key feature regarding inclusion of all randomized cases in the analysis. In other cases, the author did not define ITT, but merely labeled the analyses as such. Interestingly, the frequency of reports correctly using ITT analyses without specifically naming the analytic approach “ITT” decreased from 10% pre-CONSORT to 2% post-CONSORT.

Table 2 Linear effect of time on each ITT criteria

| Variable | B | SE | χ² | p value |
|---|---|---|---|---|
| Sample size | 0.40 | 0.14 | 7.42 | 0.01 |
| Primary outcome | 0.25 | 0.11 | 4.97 | 0.03 |
| Missing data at baseline | 0.23 | 0.12 | 3.41 | 0.06 |
| Dropout | 0.48 | 0.32 | 2.27 | 0.13 |
| 10% dropout | 0.14 | 0.12 | 1.44 | 0.23 |
| Denominator | 0.04 | 0.12 | 0.08 | 0.77 |
| ITT declared | 0.38 | 0.12 | 10.59 | 0.00 |
| ITT used | 0.17 | 0.11 | 2.37 | 0.12 |
| Missing data strategy | 0.14 | 0.11 | 1.72 | 0.18 |
| Analytic model accounts for missing data | 0.21 | 0.10 | 3.83 | 0.05 |
| Simple imputation | −0.04 | 0.08 | 0.18 | 0.66 |
| Completer plus ITT analyses | −0.44 | 0.23 | 3.59 | 0.05 |

B beta, SE standard error, χ² chi-square

Some areas of analytic quality were consistently reported across time. For example, nearly all articles reported how much dropout occurred, with 100% reporting dropout in 2003, 2006, and 2007. Other good news is that ITT was the most frequent analytic policy implemented in these behavioral medicine RCTs, and while not statistically significant, the proportion using ITT increased from 34% to 51% from 2000–2003 to 2006–2007, a difference of 17 percentage points. Unfortunately, ITT analysis was still used correctly in only a minority of RCTs (42%). PP analysis was the second most common analytic approach, in 27% of reports in both 2000–2003 and 2006–2007. PP analysis reduces the sample size and, as a result, the power of the study, especially when sample size estimates are not performed or do not account for missing data. PP analyses can also bias the estimates unless missing data are missing completely at random, which is atypical in RCTs. Other analytic approaches were observed, such as including all randomized cases except those who did not begin treatment, excluding randomized cases who did not meet inclusion criteria, cross-sectional analyses of a single time point, and others.

Wood et al. [9] conducted a review examining the use of ITT and missing data strategies in RCTs published in four top-tier medical journals (British Medical Journal, Journal of the American Medical Association, New England Journal of Medicine, Lancet). The vast majority of articles (89%) reported missing data, which is comparable to the rate found in our sample of articles (87%). In that study, ITT was accurately used in only 14% of trials, although 41% reported that their analytic approach was ITT. The most common approach to handling missing data was PP analysis (taken by 58% of articles). In our previous review of behavioral medicine RCTs published in medical journals (JAMA, NEJM) from 2000 to 2003, we found that 57% used ITT and only 17% used PP analyses, which may suggest a very promising improving trend in those medical journals. Whether this trend is attributable to increasing acceptance of CONSORT in those journals or some other factor is less clear. Perhaps with time, a similar trend will be observed in behavioral journals. Even when a journal accepts the CONSORT guidelines, it may require several years for the attitudes and behavior of authors, reviewers, and editors to change. Even though the intention of CONSORT is to enhance accurate reporting of the design, conduct, analyses, and interpretation of an RCT [12], it remains important for authors and reviewers to monitor the accuracy of that reporting.

The present study found trends toward change in use of various missing data strategies over time. The use of analytic models that account for missing data increased from 6% to 39% from 2000 to 2007, a change that approached significance (p=0.05), while use of PP analyses did not change over time. While 14% of RCTs reported results via both ITT and PP during 2000–2003, no RCTs reported both strategies in 2006–2007. Performing both ITT and PP analyses can be informative, by revealing outcomes for the ITT sample as well as those who completed treatment. When findings from these two approaches converge, biases associated with the missing data strategy are likely to be minimized.

Table 3 Effect of post-CONSORT period on linear time trend for each ITT criteria

| Variable | B | SE | χ² | p value |
|---|---|---|---|---|
| Sample size | −0.62 | 1.09 | 0.33 | 0.56 |
| Primary outcome | 1.61 | 1.08 | 2.23 | 0.13 |
| Missing data at baseline | −0.64 | 1.01 | 0.40 | 0.52 |
| 10% dropout | 0.09 | 0.95 | 0.00 | 0.92 |
| Denominator | −0.96 | 0.93 | 1.04 | 0.30 |
| ITT declared | −0.21 | 0.90 | 0.06 | 0.81 |
| ITT used | −0.39 | 0.89 | 0.19 | 0.65 |
| Accounted for missing data | −0.11 | 0.92 | 0.02 | 0.89 |
| Analytic model accounts for missing data | 0.68 | 1.82 | 0.14 | 0.70 |
| Simple imputation | 0.43 | 1.43 | 0.07 | 0.76 |

B beta, SE standard error, χ² chi-square

The current study has some limitations. The sample size of 87 articles from three journals may have limited power to detect changes over time, especially for the comparison of pre- and post-CONSORT time periods. This study represents an initial exploration into the use of ITT and related analytic conventions in behavioral medicine RCTs and future research should continue this evaluative process. The alternative to comparing the pre- and post-CONSORT time periods (i.e., within-groups comparison) would be to compare articles from journals that had accepted CONSORT to those that had not accepted CONSORT (i.e., between-groups comparison). The difficulty with a between-groups comparison is that the two sets of journals would likely differ on multiple characteristics (in addition to their endorsement of CONSORT) which might make the findings difficult to interpret. Also, a between-groups comparison would not allow us to observe change over time in journals that are undergoing editorial policy changes, which was the purpose of our investigation.

Another limitation to the present investigation is that the three journals selected do not necessarily represent the entire field of behavioral medicine. The content of the JCCP, for example, is not exclusively behavioral medicine. JCCP was selected though, because while covering a broader range of behavioral research, it publishes a high number of RCTs. In our sample, 29% of studies were published in JCCP which is comparable to the 30% published in ABM and not much less than the 41% in HP. Several other behavioral medicine field journals were explored for inclusion but did not meet inclusion criteria. Future studies over longer time periods and including more journals might provide a much broader view of the analytic quality of clinical trials in the field of behavioral medicine. Additionally, only trials that randomized at the individual level were included. Group- and cluster-randomized trials were not included because of the unique analytic complexities of such trials. The vast majority of RCTs performed randomization at the individual level.

By 2006, CONSORT had been adopted for a short time (about 1–2 years) in each journal, and changes in reporting and analytic approach may become apparent only after several years. HP adopted CONSORT in 2003, ABM in 2004, and JCCP in 2005. Articles published in 2006 should have been processed under CONSORT guidelines, with the possible exception of JCCP. However, our primary focus was on change in use of ITT over time, regardless of CONSORT adoption. In addition, because the focus was on ITT analytic quality, adherence to other aspects of CONSORT was not assessed. Finally, it should be noted that in the presence of missing data, ITT analyses are biased toward no group differences and thus may not be suitable for trials that are establishing equivalence or adverse effects [8].

The RCT maximizes potential to measure unbiased treatment effects and is a cornerstone of the evidence that informs clinical practice. Its full potential can be undermined when analytic quality is not maximized, randomization is not preserved, missing data strategies lead to bias, or large dropout rates occur. Increased analytic rigor of behavioral medicine clinical trials will increase the impact of behavioral medicine in the broader field of health care and allow for a more valid comparison of findings across trials.

Acknowledgment This paper represents work undertaken on behalf of the Society of Behavioral Medicine’s Evidence-Based Behavioral Medicine Committee.

References

1. Bradford H. Principles of medical statistics. London: Oxford University Press; 1961.

2. Gravel J, Opatrny L, Shapiro S. The intention-to-treat approach in randomized controlled trials: Are authors saying what they do and doing what they say? Clin Trials. 2007;4:350–6.

3. Hollis S, Campbell F. What is meant by intention to treat analysis? Survey of published randomised controlled trials. BMJ. 1999;319(7211):670–4.

4. Last JM. A dictionary of epidemiology. 4th ed. New York: Oxford University Press; 2001.

5. Newell DJ. Intention-to-treat analysis: Implications for quantitative and qualitative research. Int J Epidemiol. 1992;21(5):837–41.

6. Heritier SR, Gebski VJ, Keech AC. Inclusion of patients in clinical trial analysis: the intention-to-treat principle. Med J Aust. 2003;179(8):438–40.

7. Porta N, Bonet C, Cobo E. Discordance between reported intention-to-treat and per protocol analyses. J Clin Epidemiol. 2007;60:663–9.

8. Cochrane Handbook for Systematic Reviews of Interventions Version 5.0.1. (2008) Retrieved November 22, 2008, from www.cochrane-handbook.org

9. Wood AM, White IR, Thompson SG. Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clin Trials. 2004;1(4):368–76.

10. Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, et al. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med. 2001;134(8):663–94.

11. Chiles JA, Lambert MJ, Hatcher JW. The impact of psychological interventions on medical cost offset: A meta-analytic review. Clin Psychol Sci Pract. 1999;6:204–20.

12. Moher D, Jones A, Lepage L. Use of the CONSORT statement and quality of reports of randomized trials: a comparative before-and-after evaluation. JAMA. 2001;285(15):1992–5.

13. Stone A. Editorial: modification of instructions to authors. Health Psychol. 2003;22(4):331.

14. Kaplan RM, Trudeau KJ, Davidson KW. New policy on reports of randomized clinical trials. Ann Behav Med. 2004;27(2):81.

15. La Greca AM. Editorial. J Consult Clin Psychol. 2005;73(1):3–5.


16. Kane RL, Wang J, Garrard J. Reporting in randomized clinical trials improved after adoption of the CONSORT statement. J Clin Epidemiol. 2007;60(3):241–9.

17. Spring B, Pagoto SL, Knatterud GL, Kozak A, Hedeker D. Examination of the analytic quality of behavioral health randomized clinical trials. J Clin Psychol. 2007;63(1):53–71.

18. Dickersin K, Scherer R, Lefebvre C. Identifying relevant studies for systematic reviews. BMJ. 1994;309(6964):1286–91.

19. Berlin JA. Does blinding of readers affect the results of meta-analyses? University of Pennsylvania Meta-analysis Blinding Study Group. Lancet. 1997;350(9072):185–6.
