This article was downloaded by: [University of Connecticut] On: 08 October 2014, At: 01:30 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Biopharmaceutical Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lbps20 Joint One-Sided and Two-Sided Simultaneous Confidence Intervals S. Braat a , D. Gerhard b & L. A. Hothorn b a Biometrics, N.V. Organon, Oss, The Netherlands Biometrics, Global Clinical Information , b Institute of Biostatistics , Leibniz University of Hannover , Germany Published online: 18 Mar 2008. To cite this article: S. Braat , D. Gerhard & L. A. Hothorn (2008) Joint One-Sided and Two-Sided Simultaneous Confidence Intervals, Journal of Biopharmaceutical Statistics, 18:2, 293-306, DOI: 10.1080/10543400701697174 To link to this article: http://dx.doi.org/10.1080/10543400701697174 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. 
This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms- and-conditions


Journal of Biopharmaceutical Statistics, 18: 293–306, 2008
Copyright © Taylor & Francis Group, LLC
ISSN: 1054-3406 print/1520-5711 online
DOI: 10.1080/10543400701697174

JOINT ONE-SIDED AND TWO-SIDED SIMULTANEOUS CONFIDENCE INTERVALS

S. Braat¹, D. Gerhard², and L. A. Hothorn²
¹Biometrics, N.V. Organon, Oss, The Netherlands Biometrics, Global Clinical Information
²Institute of Biostatistics, Leibniz University of Hannover, Germany

For the analysis of multiarmed clinical trials, a set consisting of a mixture of one- and two-sided tests can often be preferred over a set of common two-sided hypothesis settings. Here we show the straightforward application of existing multiple comparison procedures for the difference and ratio of normally distributed means to complex trial designs involving one and two test directions. The proposed contrast tests provide a more flexible framework than the existing methods at nearly similar power. An application is illustrated for an example with multiple treatment doses and two active controls; statistical software code is included for R and the SAS System.

Key Words: Clinical trials; Multiple comparisons; Simultaneous confidence intervals; User-defined contrasts.

INTRODUCTION

In multiarm clinical trials multiplicity of inferences is present, for example for the comparison of treatment and dose groups or multiple endpoints (Anonymous, 2001). Therefore, the control of the type I error rate is an important principle for the assessment of the results (CPMP, 2002). For the evaluation of multiple treatment or dose groups in a trial, it is popular to use two-sided tests or confidence intervals in multiple comparisons, like the Tukey–Kramer procedure (Tukey, 1953), as available in SAS PROC GLM or PROC MIXED. The Tukey–Kramer procedure requires overall two-sided testing, whereas for some study objectives, one-sided decisions can be appropriate. In the ICH E9 guidelines (1998) the use of one-sided approaches is stated as controversial, and it is important to justify their appropriate use prospectively. Koch (1991) gives some examples where, in confirmatory studies of pharmaceutical companies for approval of production, one-sided objectives are of importance:

1. To test the superiority of a test drug over a placebo
2. To demonstrate better efficacy of a combination drug versus its components
3. To demonstrate that a tolerance Δ is an upper bound for the extent to which a test drug has poorer efficacy than an active reference control drug
4. To demonstrate that a tolerance Δ is an upper bound for the extent to which a test drug has poorer safety than a reference control drug or placebo

Received February 16, 2007; Accepted June 27, 2007
Address correspondence to D. Gerhard, Institute of Biostatistics, Leibniz University of Hannover, Herrenhäuser Str. 2, Hannover D-30419, Germany; E-mail: [email protected]

In particular, the proof of noninferiority of a new treatment relative to a standard represents a one-sided hypothesis in multiarm clinical trials, e.g., a three-arm design (D’Agostino et al., 2003).

In a trial taken from Westfall et al. (1999) (for raw and summary data, see Table 1), three test treatments (1-time, 2-times, 4-times) are compared to two active controls (drug D, drug E), where a decreasing effect of these treatments is desired. A two-sided all-pairs comparison includes many redundant hypotheses. We can reduce the number of hypotheses by performing a two-sided comparison of the active controls and assessing inferiority of the test treatments to each control in only one direction. Further details and the confidence interval calculation for this example are given in the section “Example.”

The use of one-sided hypotheses depends on the objective of the trial. If, for example, it is not necessary to demonstrate that a test drug is similar or inferior to a negative control, then this test direction needs no inferential attention; in this situation a one-sided test is appropriate. If both directions of an effect are of interest, such as the comparison of two treatments, a two-sided test is required. A typical design for randomized dose-finding trials includes multiple doses of a test treatment, an active control, and a placebo (Bauer et al., 1998). For showing superiority to the placebo one-sided tests are adequate, whereas two-sided tests should be used for comparing the treatment doses with an active control to assess a new drug’s position in comparison to existent drugs.

Table 1 Example dataset (Westfall et al., 1999)

                         Test treatment              Active control
Treatments        1-time   2-times   4-times      drug D    drug E

                  3.8612   10.3993   13.9621     16.9819   21.5119
                 10.3868    8.6027   13.9606     15.4576   27.2445
                  5.9059   13.6320   13.9176     19.9793   20.5199
                  3.0609    3.5054    8.0534     14.7389   15.7707
                  7.7204    7.7703   11.0432     13.5850   22.8850
                  2.7139    8.6266   12.3692     10.8648   23.9527
                  4.9243    9.2274   10.3921     17.5897   21.5925
                  2.3039    6.3159    9.0286      8.8194   18.3058
                  7.5301   15.8258   12.8416     17.9635   20.3851
                  9.4123    8.3443   18.1794     17.6316   17.3071

Mean               5.782     9.225    12.375      15.361    20.948
Std. deviation     2.878     3.483     2.923       3.455     3.345

In an in vivo study (Long et al., 2004), tumor volumes were measured for different administration schedules of combinations of the antiestrogen tamoxifen and the nonsteroidal aromatase inhibitor letrozole. The treatments are made up of a negative control, 100 µg/day tamoxifen alone and 100 µg/day letrozole alone, tamoxifen and letrozole concurrently, 4-week courses of first tamoxifen followed by letrozole, and 4-week courses of letrozole followed by tamoxifen. For comparisons with the negative control, one-sided tests on superiority are appropriate. Further, between the mono- and the combination therapies, as well as between the combination therapies alone, two-sided tests are relevant. A scheme illustrating the treatment comparisons of interest is shown in Table 2.

Table 2 Comparisons of interest for the study from Long et al. (2004). C is the negative control; Tamoxifen and Letrozole are denoted by T and L, respectively

Direction     Comparison

one-sided     C vs. T
one-sided     C vs. L
one-sided     C vs. T & L
one-sided     C vs. T → L
one-sided     C vs. L → T
two-sided     T vs. T & L
two-sided     T vs. T → L
two-sided     T vs. L → T
two-sided     L vs. T & L
two-sided     L vs. T → L
two-sided     L vs. L → T
two-sided     T & L vs. T → L
two-sided     T & L vs. L → T

Cheung et al. (2004) developed a multiple testing procedure for many-to-one comparisons, according to the Dunnett approach (1955), with multiple one-sided and two-sided hypotheses. They present tables with two sets of critical values, one for one-sided, a second for the two-sided decisions, to calculate multiple tests and their corresponding confidence intervals for various numbers of treatments, degrees of freedom, and correlation structures. Additionally, results of a simulation-based power study are presented where, for settings with a large number of one-sided tests, Cheung’s approach achieved a gain of up to 60% higher average power against the overall two-sided Dunnett procedure. A different approach was suggested by Hayter et al. (2000) to calculate two-sided tests and confidence intervals with the sensitivity of one-sided procedures. For this method critical values are used, which are a mixture of the critical values of a studentized range distribution and critical values derived by Hayter and Liu (1996).

In this work, existing multiple contrast test methods are used in an intuitive and straightforward way to provide simultaneous one- and two-sided test decisions, or adjusted p-values, and confidence intervals for the difference and ratio of normally distributed means in a linear model setting. We focus on the calculation of simultaneous confidence intervals, as they are appropriate for interpretation, convey quantitative information, and are explicitly recommended by the ICH E9 Guidelines (1998). With a multiple contrast test we are not restricted to many-to-one (Dunnett, 1955) or all-pairs (Tukey, 1953) comparisons. The flexible framework of a user-defined contrast allows us to make inferences about every desired linear combination of means, for example, detecting a trend between multiple treatment doses (Bretz and Hothorn, 2003), or to identify a change point (Hirotsu, 2002).

DECOMPOSITION OF TWO-SIDED HYPOTHESES INTO TWO ONE-SIDED HYPOTHESES IN A CONTRAST TEST SETTING OF NORMAL MEANS

Consider a one-way analysis of variance

$$Y_{kl} = \mu_k + \varepsilon_{kl}, \qquad k = 1, \dots, m, \quad l = 1, \dots, n_k,$$

where $Y_{kl}$ is the response, $\mu_k$ the mean of treatment $k$, and $\varepsilon_{kl} \overset{iid}{\sim} N(0, \sigma^2)$; $m$ is the number of treatments with $n_k$ observations for treatment $k$. In a trial, we may be interested in making inferences on $s$ linear combinations of means. For an all-pairs comparison the hypotheses are

$$H_0: \sum_{k=1}^{m} c_{ik}\mu_k = 0 \qquad H_A: \sum_{k=1}^{m} c_{ik}\mu_k \neq 0 \qquad \text{for } i = 1, \dots, s,$$

where $c_{ik}$ is a contrast matrix for $s$ linear combinations with length $m$. Each of the $s$ two-sided decisions can be decomposed into 2 one-sided decisions by defining contrast coefficients $c_{ik}$ and $(-1)c_{ik}$, reflecting the two possible directions. Hypotheses that are not of interest can be omitted; for $r$ one-sided tests and $(s - r)$ two-sided tests the associated hypotheses are

$$H^{1}_{0}: \sum_{k=1}^{m} c_{pk}\mu_k \ge 0 \qquad H^{1}_{A}: \sum_{k=1}^{m} c_{pk}\mu_k < 0 \qquad \text{for } p = 1, \dots, r$$

$$H^{2}_{0}: \sum_{k=1}^{m} c_{qk}\mu_k \ge 0 \qquad H^{2}_{A}: \sum_{k=1}^{m} c_{qk}\mu_k < 0 \qquad \text{for } q = r + 1, \dots, s$$

$$H^{2'}_{0}: \sum_{k=1}^{m} -c_{qk}\mu_k \ge 0 \qquad H^{2'}_{A}: \sum_{k=1}^{m} -c_{qk}\mu_k < 0 \qquad \text{for } q = s + 1, \dots, 2s - r,$$

which can be reduced to one hypothesis by the combination of all three contrast matrices $c_{pk}$, $c_{qk}$, and $-c_{qk}$ into one large contrast matrix $c_{ik}$:

$$H_0: \sum_{k=1}^{m} c_{ik}\mu_k \ge 0 \qquad H_A: \sum_{k=1}^{m} c_{ik}\mu_k < 0 \qquad \text{for } i = 1, \dots, 2s - r.$$
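The stacking of the three contrast matrices can be sketched in a few lines of base R. The helper name `decompose_contrasts` and the four-group layout (control, A, B, C) are hypothetical and serve only to illustrate the construction; they are not taken from any package.

```r
# Stack the one-sided contrasts, the two-sided contrasts, and the
# sign-reversed two-sided contrasts into one (2s - r) x m matrix.
decompose_contrasts <- function(one_sided, two_sided) {
  stopifnot(ncol(one_sided) == ncol(two_sided))
  rbind(one_sided, two_sided, -two_sided)
}

# Hypothetical layout: m = 4 groups (control, A, B, C); s = 3 comparisons,
# of which r = 1 is one-sided and s - r = 2 are two-sided.
one_sided <- rbind("A - control" = c(-1, 1, 0, 0))
two_sided <- rbind("B - control" = c(-1, 0, 1, 0),
                   "C - control" = c(-1, 0, 0, 1))

cmat <- decompose_contrasts(one_sided, two_sided)
rownames(cmat)[4:5] <- c("control - B", "control - C")  # label the reversed rows
colnames(cmat) <- c("control", "A", "B", "C")
print(cmat)
```

The resulting matrix has 2s - r = 5 rows, and every row still sums to zero, so each row remains a valid contrast.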

Performing multiple comparisons, as shown in Hochberg and Tamhane (1987) or Hsu (1996), a contrast test is calculated with

$$T_i = \frac{\sum_{k=1}^{m} c_{ik}\hat{\mu}_k}{\hat{\sigma}\sqrt{\sum_{k=1}^{m} c_{ik}^2 / n_k}}, \qquad \text{where } \sum_{k=1}^{m} c_{ik} = 0 \text{ and } i = 1, \dots, 2s - r,$$

where the pooled standard error is denoted by $\hat{\sigma}$. The joint distribution of $T_1, \dots, T_{2s-r}$ follows an $i$-variate t-distribution (here $i = 2s - r$) with $\nu = \sum_{k=1}^{m} n_k - m$ degrees of freedom and correlation matrix $R = (\rho_{ij})$ for $i, j = 1, \dots, 2s - r$. The multiple contrast test uses $\max(T_1, \dots, T_{2s-r})$ as a test statistic and controls the familywise error rate. The correlation matrix is computed by

$$\rho_{ij} = \frac{\sum_{k=1}^{m} c_{ik}c_{jk} / n_k}{\sqrt{\left(\sum_{k=1}^{m} c_{ik}^2 / n_k\right)\left(\sum_{k=1}^{m} c_{jk}^2 / n_k\right)}}.$$

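The $(2s - r) \times (2s - r)$ correlation matrix for the one-sided tests $(T_p)_{1 \le p \le r}$ and the two-sided tests $(T_q, -T_q)_{r+1 \le q \le s}$ has the form

$$
\begin{array}{r} T_1 \rightarrow \\ T_p \rightarrow \\ \vdots \;\; \\ T_q \rightarrow \\ -T_q \rightarrow \end{array}
\begin{pmatrix}
1 & \rho_{1p} & \cdots & \rho_{1q} & -\rho_{1q} \\
\rho_{1p} & 1 & \cdots & \rho_{pq} & -\rho_{pq} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\rho_{1q} & \rho_{pq} & \cdots & 1 & -1 \\
-\rho_{1q} & -\rho_{pq} & \cdots & -1 & 1
\end{pmatrix}_{(2s-r) \times (2s-r)}
$$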

This correlation matrix is symmetrical and singular, as the comparison $-T_q$ is fully specified by $T_q$. Because only $s$ contrasts are linearly independent, the inverted $s - r$ contrasts are linear combinations of the $s$ other ones. Singular matrices pose a problem for the numerical evaluation of multivariate t-probabilities. The integration region for calculating the multivariate t-distribution can be transformed in such a way that each multiple contrast test uses an $s$-variate nonsingular t-distribution. This reduction of dimensionality is handled automatically by available software packages and is described in detail by Bretz et al. (2001) and Bretz (1999) for the multivariate t-distribution, and by Tong (1990) for the multivariate normal distribution.
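The correlation structure and its singularity can be checked numerically. The following base R sketch assumes a hypothetical balanced layout of three treatments and a control with 16 observations per group, with one one-sided and two two-sided comparisons; the function name `contrast_correlation` is an illustrative choice.

```r
# Correlation matrix of the contrast statistics:
# rho_ij = sum_k(c_ik c_jk / n_k) / sqrt(sum_k(c_ik^2 / n_k) * sum_k(c_jk^2 / n_k))
contrast_correlation <- function(cmat, n) {
  v <- cmat %*% (t(cmat) / n)      # v[i, j] = sum_k c_ik * c_jk / n_k
  v / sqrt(outer(diag(v), diag(v)))
}

# Hypothetical layout: control plus three treatments, n = 16 per group.
cmat <- rbind(c(-1, 1,  0,  0),    # T_1: one-sided, treatment 1 vs. control
              c(-1, 0,  1,  0),    # T_2: two-sided, treatment 2 vs. control
              c(-1, 0,  0,  1),    # T_3: two-sided, treatment 3 vs. control
              c( 1, 0, -1,  0),    # -T_2
              c( 1, 0,  0, -1))    # -T_3
n <- rep(16, 4)

R <- contrast_correlation(cmat, n)
print(round(R, 2))
print(qr(R)$rank)                  # rank below 5: the matrix is singular
```

In this balanced many-to-one layout the off-diagonal correlation between two different comparisons is 0.5, a statistic and its sign-reversed copy have correlation -1, and the rank of the 5 x 5 matrix equals the number of linearly independent contrasts.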

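The null hypothesis belonging to each single contrast $T_i$ can be rejected if this test statistic is larger than the $i$-variate t-quantile $t_{i,1-\alpha,\nu,R}$. The corresponding simultaneous confidence intervals can be adopted to determine the one-sided and two-sided confidence intervals. The two-sided confidence intervals can be obtained by reconnecting the decomposed one-sided limits whereby the sign of one of the limits is inverted. For the following confidence intervals, the index $p = 1, \dots, r$ denotes the contrasts for the one-sided hypotheses and $q = r + 1, \dots, s$ are the contrasts for the two-sided hypotheses.

$$\sum_{k=1}^{m} c_{pk}\mu_k \in \left(-\infty,\; \sum_{k=1}^{m} c_{pk}\hat{\mu}_k + t_{i,1-\alpha,\nu,R}\,\hat{\sigma}\sqrt{\sum_{k=1}^{m} \frac{c_{pk}^2}{n_k}}\,\right]$$

$$\sum_{k=1}^{m} c_{qk}\mu_k \in \left[-\left(\sum_{k=1}^{m} -c_{qk}\hat{\mu}_k + t_{i,1-\alpha,\nu,R}\,\hat{\sigma}\sqrt{\sum_{k=1}^{m} \frac{(-c_{qk})^2}{n_k}}\right),\; \sum_{k=1}^{m} c_{qk}\hat{\mu}_k + t_{i,1-\alpha,\nu,R}\,\hat{\sigma}\sqrt{\sum_{k=1}^{m} \frac{c_{qk}^2}{n_k}}\,\right]$$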
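The reconnection of the decomposed limits can be illustrated with the summary statistics of Table 1. This is a minimal sketch under two assumptions that are not part of the method above: the spread row of Table 1 is treated as per-group standard deviations, and a Bonferroni quantile from `qt` stands in for the multivariate t-quantile, which makes the limits conservative. The one-sided contrasts are oriented as treatment minus control so that the alternative of the text (a negative contrast) applies.

```r
means <- c(5.782, 9.225, 12.375, 15.361, 20.948)  # Table 1 group means
sds   <- c(2.878, 3.483, 2.923, 3.455, 3.345)     # Table 1 spread row, assumed to be SDs
n     <- rep(10, 5)
nu    <- sum(n) - length(n)                       # residual degrees of freedom
sigma_hat <- sqrt(sum((n - 1) * sds^2) / nu)      # pooled standard deviation

one_sided <- rbind(c(1, 0, 0, -1,  0),            # 1-time  - drug D
                   c(0, 1, 0, -1,  0),            # 2-times - drug D
                   c(0, 0, 1, -1,  0),            # 4-times - drug D
                   c(1, 0, 0,  0, -1),            # 1-time  - drug E
                   c(0, 1, 0,  0, -1),            # 2-times - drug E
                   c(0, 0, 1,  0, -1))            # 4-times - drug E
two_sided <- rbind(c(0, 0, 0, -1, 1))             # drug E - drug D
cmat <- rbind(one_sided, two_sided, -two_sided)   # 2s - r = 8 rows

est  <- unname(drop(cmat %*% means))
se   <- sigma_hat * sqrt(unname(drop(cmat^2 %*% (1 / n))))
crit <- qt(1 - 0.05 / nrow(cmat), df = nu)        # Bonferroni stand-in (assumption)
upper <- est + crit * se                          # one-sided upper limit per row

ci_one_sided <- cbind(lower = -Inf, upper = upper[1:6])
# Reconnect rows 7 and 8: the negated upper limit of the reversed contrast
# is the lower limit of the two-sided interval.
ci_two_sided <- c(lower = -upper[8], upper = upper[7])
print(ci_one_sided)
print(ci_two_sided)
```

With an exact multivariate t-quantile in place of `crit`, for example from a package such as mvtnorm, the same reconnection step applies unchanged and only the width of the intervals shrinks.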


CONFIDENCE INTERVALS FOR MULTIPLE RATIOS OF NORMAL MEANS

The contrast approach can also be used to perform multiple comparisons for the ratio in a one-way layout of a general linear model. Confidence intervals for ratios have the advantage of defining a scale-invariant threshold and, for some configurations, a higher power can be observed than for the inference about the difference. Dilba et al. (2006) showed a method to calculate exact simultaneous confidence intervals based on the multivariate t-distribution and proposed two other approximate methods (plug-in and resampling approaches). For ratios, the following set of hypotheses is of interest:

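$$H^{1}_{0}: \frac{\sum_{k=1}^{m} c_{pk}\mu_k}{\sum_{k=1}^{m} d_{pk}\mu_k} \ge 1 \qquad H^{1}_{A}: \frac{\sum_{k=1}^{m} c_{pk}\mu_k}{\sum_{k=1}^{m} d_{pk}\mu_k} < 1 \qquad \text{for } p = 1, \dots, r$$

$$H^{2}_{0}: \frac{\sum_{k=1}^{m} c_{qk}\mu_k}{\sum_{k=1}^{m} d_{qk}\mu_k} \ge 1 \qquad H^{2}_{A}: \frac{\sum_{k=1}^{m} c_{qk}\mu_k}{\sum_{k=1}^{m} d_{qk}\mu_k} < 1 \qquad \text{for } q = r + 1, \dots, s$$

$$H^{2'}_{0}: \frac{\sum_{k=1}^{m} d_{qk}\mu_k}{\sum_{k=1}^{m} c_{qk}\mu_k} \ge 1 \qquad H^{2'}_{A}: \frac{\sum_{k=1}^{m} d_{qk}\mu_k}{\sum_{k=1}^{m} c_{qk}\mu_k} < 1 \qquad \text{for } q = s + 1, \dots, 2s - r.$$

These hypotheses can be written as one hypothesis, where the contrast matrices $c_{pk}, c_{qk}, d_{qk}$ and $d_{pk}, d_{qk}, c_{qk}$ are merged into two large contrast matrices $c_{ik}$ and $d_{ik}$.

$$H_0: \frac{\sum_{k=1}^{m} c_{ik}\mu_k}{\sum_{k=1}^{m} d_{ik}\mu_k} \ge 1 \qquad H_A: \frac{\sum_{k=1}^{m} c_{ik}\mu_k}{\sum_{k=1}^{m} d_{ik}\mu_k} < 1 \qquad \text{for } i = 1, \dots, 2s - r$$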

Here $c_i$ and $d_i$ are known vectors of real constants associated with the $i$th ratio. Let

$$\hat{\gamma}_i = \frac{\sum_{k=1}^{m} c_{ik}\hat{\mu}_k}{\sum_{k=1}^{m} d_{ik}\hat{\mu}_k} = \frac{c_i'\hat{\mu}}{d_i'\hat{\mu}}$$

be the point estimate for the ratio of interest, where $c_i$ and $d_i$ are linear combinations of the design matrix $X$. Then the related confidence intervals are calculated, according to Fieller’s theorem (1954), with

$$\gamma_p \in \left[0,\; \frac{-B_p + \sqrt{B_p^2 - 4A_pC_p}}{2A_p}\right] \qquad \text{for } p = 1, \dots, r$$

$$\gamma_q \in \left[1 \Big/ \frac{-B'_q + \sqrt{(B'_q)^2 - 4A'_qC'_q}}{2A'_q},\; \frac{-B_q + \sqrt{B_q^2 - 4A_qC_q}}{2A_q}\right] \qquad \text{for } q = r + 1, \dots, s$$

where

$$A_{(p,q)} = (d'_{(p,q)}\hat{\mu})^2 - (c_{1-\alpha})^2\,\hat{\sigma}^2\, d'_{(p,q)} M d_{(p,q)}$$

$$B_{(p,q)} = -2\left((c'_{(p,q)}\hat{\mu})(d'_{(p,q)}\hat{\mu}) - (c_{1-\alpha})^2\,\hat{\sigma}^2\, c'_{(p,q)} M d_{(p,q)}\right)$$

$$C_{(p,q)} = (c'_{(p,q)}\hat{\mu})^2 - (c_{1-\alpha})^2\,\hat{\sigma}^2\, c'_{(p,q)} M c_{(p,q)}$$

$$M = (X'X)^{-1}.$$

In $A'_q$, $B'_q$, $C'_q$ the $d'_q$ and $c'_q$ are switched, as the reciprocal value of the ratio estimate is used to calculate the opposite-sided confidence limit. For the plug-in approach, $c_{1-\alpha}$ denotes a quantile of the multivariate t-distribution, for example, with the maximum likelihood estimates of the multiple ratios plugged into the correlation matrix $R(\hat{\gamma})$.
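The Fieller limits can be traced for a single ratio with a short base R sketch. The function name `fieller_upper` is hypothetical, the spread row of Table 1 is assumed to hold per-group standard deviations, and a univariate t-quantile replaces the multivariate quantile $c_{1-\alpha}$, so no multiplicity adjustment is made here.

```r
# Upper Fieller limit for the ratio (c' mu) / (d' mu), following the
# A, B, C coefficients given in the text.
fieller_upper <- function(cvec, dvec, mu_hat, sigma_hat, M, crit) {
  k   <- crit^2 * sigma_hat^2
  num <- sum(cvec * mu_hat)
  den <- sum(dvec * mu_hat)
  A <- den^2 - k * drop(t(dvec) %*% M %*% dvec)
  B <- -2 * (num * den - k * drop(t(cvec) %*% M %*% dvec))
  C <- num^2 - k * drop(t(cvec) %*% M %*% cvec)
  disc <- B^2 - 4 * A * C
  if (A <= 0 || disc < 0) stop("no bounded Fieller limit for this ratio")
  (-B + sqrt(disc)) / (2 * A)
}

means <- c(5.782, 9.225, 12.375, 15.361, 20.948)  # Table 1 group means
sds   <- c(2.878, 3.483, 2.923, 3.455, 3.345)     # assumed to be per-group SDs
n     <- rep(10, 5)
nu    <- sum(n) - length(n)
sigma_hat <- sqrt(sum((n - 1) * sds^2) / nu)
M     <- diag(1 / n)                # (X'X)^{-1} for the cell-means one-way layout

cvec <- c(1, 0, 0, 0, 0)            # numerator: 1-time
dvec <- c(0, 0, 0, 1, 0)            # denominator: drug D
crit <- qt(0.95, df = nu)           # univariate stand-in for c_{1-alpha} (assumption)

estimate <- sum(cvec * means) / sum(dvec * means)
upper <- fieller_upper(cvec, dvec, means, sigma_hat, M, crit)
# Opposite-sided limit: switch c and d, then take the reciprocal.
lower <- 1 / fieller_upper(dvec, cvec, means, sigma_hat, M, crit)
print(c(lower = lower, estimate = estimate, upper = upper))
```

The guard on `A` reflects the usual condition that the denominator must differ clearly from zero; otherwise the Fieller set is not a bounded interval.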

For a closer look at the calculation of these confidence intervals, the interested reader is referred to Dilba et al. (2006), where many existing methods for inference about ratios are described. Software for calculating simultaneous confidence intervals for the ratio is implemented as an R package, which is illustrated in Dilba et al. (2007).

POWER STUDY FOR INFERENCE ABOUT THE DIFFERENCE OF NORMAL MEANS

The probability of rejecting a null hypothesis is higher when the confidence interval has a smaller width. By reducing the number of hypotheses to be tested, the confidence intervals should be shortened, as fewer multiplicity adjustments have to be made. Therefore, with user-defined contrast approaches, a higher power should be obtained than for the two-sided Dunnett or Tukey–Kramer procedure because only the hypotheses of interest are used. The differences in interval range between the Cheung et al. (2004) and the contrast approaches for balanced designs depend only on the size of the critical values, because point estimates and variance terms are equal for both methods. For the contrast approach one critical value for every comparison is calculated, whereas for Cheung’s approach a set of two critical values, separated into one- and two-sided quantiles, is needed. Taking the sum of the critical values, weighted by the number of decisions, the behavior of both methods can be compared with this average critical value for one- and two-sided comparisons (Fig. 1). The average critical values calculated by Cheung’s method are slightly smaller than for the contrast method. An increase in the number of one-sided tests is associated with an increase in the difference between the two methods. When considering only the one-sided decisions, the confidence intervals by Cheung are up to 1% narrower than for the contrast approach; the reverse is true for the two-sided decisions, which are up to 2% wider. For the contrast approach every decision is weighted equally, whereas by the two different critical values from Cheung, the one-sided decisions are preferred. For the Bonferroni method, where the type I error rate is weighted by the number of one-sided comparisons $(\alpha / i)$, this difference between one- and two-sided critical values is even higher.

As in Cheung et al. (2004), the increase in average power (the percentage of false hypotheses that are correctly rejected) vs. the two-sided Dunnett approach is investigated. Power was calculated by performing a simulation with 100,000 replications for each parameter setting, comparing three groups with a control ($df = 60$, $\rho = 0.5$), increasing the noncentrality parameter

$$\delta = \frac{\sum_{k=1}^{m} c_{ik}\mu_k}{\sigma\sqrt{\sum_{k=1}^{m} c_{ik}^2 / n_k}}.$$
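A reduced version of such a simulation can be sketched in base R. The sketch below is not the study reported here: it uses 20,000 replications instead of 100,000, a single noncentrality value of 2 for all three treatments, and critical values estimated from simulated null statistics rather than from the multivariate t-distribution; all object names are illustrative.

```r
set.seed(1)
n <- 16; m <- 4                  # control + 3 treatments, balanced
nu <- m * (n - 1)                # 60 degrees of freedom
reps <- 20000; alpha <- 0.05

# Simulate the three many-to-one t statistics from group means and a pooled
# standard deviation (sigma = 1); every treatment is shifted by delta.
simulate_t <- function(delta) {
  mu <- c(0, rep(delta * sqrt(2 / n), m - 1))
  ybar <- matrix(rnorm(reps * m, mean = rep(mu, each = reps), sd = 1 / sqrt(n)),
                 nrow = reps)
  s <- sqrt(rchisq(reps, df = nu) / nu)
  (ybar[, -1] - ybar[, 1]) / (s * sqrt(2 / n))
}

# The first r comparisons are one-sided (T); the remaining ones are
# two-sided, i.e. the larger of T and -T, which is |T|.
fold <- function(tmat, r) {
  if (r < ncol(tmat)) {
    idx <- (r + 1):ncol(tmat)
    tmat[, idx] <- abs(tmat[, idx])
  }
  tmat
}

t_null <- simulate_t(0)          # statistics under the global null
t_alt  <- simulate_t(2)          # statistics with noncentrality 2

result <- t(sapply(0:3, function(r) {
  crit <- quantile(apply(fold(t_null, r), 1, max), 1 - alpha, names = FALSE)
  # Average power: share of the three false hypotheses that are rejected.
  c(r = r, crit = crit, avg_power = mean(fold(t_alt, r) > crit))
}))
print(round(result, 3))
```

The row with r = 0 corresponds to the overall two-sided many-to-one comparison; as r grows, the simulated critical value cannot increase, which is the mechanism behind the power gain discussed above.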


Figure 1 Ratio of critical values (CV) between Cheung’s (2004) and the user-defined contrast approaches over the percentage of two-sided decisions of all comparisons, at different degrees of freedom. The critical values for the single one-sided and single two-sided hypotheses, and the weighted average critical values are observed for comparisons of 2–6 groups.

The critical values for the Cheung et al. approach were taken from tables presented in their article. In Fig. 2, similar to Cheung’s results, the methods that account for whether one- or two-sided decisions are made have a higher average power than the overall two-sided Dunnett procedure. Both methods keep the overall level $\alpha$. With a higher noncentrality parameter $\delta$ ($\ge 4$) the power of all tests is nearly equal. For a large number of two-sided tests, the contrast and Cheung’s approaches behave quite similarly; with an increasing number of one-sided tests, the difference between the approaches becomes larger, and the contrast approach shows only a marginal decrease in performance compared to Cheung’s method.

For the contrast approach, a closed expression for the calculation of the power under any alternative hypothesis $H_A$ can be given. The probability of a correct rejection of the global null hypothesis $H_0$ is obtained by

$$P(T \ge t_\alpha \mid H_A) = 1 - P\left(\max_{1 \le i \le 2s-r} \frac{\sum_{k=1}^{m} c_{ik}\hat{\mu}_k}{\hat{\sigma}\sqrt{\sum_{k=1}^{m} c_{ik}^2 / n_k}} < t_\alpha \,\Big|\, H_A\right) = 1 - T_{2s-r}(-t_\alpha a,\; t_\alpha b;\; \nu, R, \delta).$$

Here, $t_\alpha$ denotes the critical value obtained under $H_0$, the integration bounds are $a = (\infty, \dots, \infty)$ and $b = 1_{2s-r}$; $\delta$ is the noncentrality vector.


Figure 2 Increase in average power of user-defined contrast tests, Cheung, and Bonferroni methods compared to an overall two-sided many-to-one comparison, for 3 treatments with a control in a balanced design (df = 60, r = number of one-sided tests, s − r = number of two-sided tests).

EXAMPLE

Here we describe the calculation of simultaneous confidence intervals for the example data set (Westfall et al., 1999) mentioned in the introduction. In this study, cholesterol reduction was observed for three doses of a test drug (20 mg once a day, 10 mg twice a day, 5 mg four times a day) and for two competing drugs used as control. Instead of a two-sided all-pairs comparison, a user-defined contrast setting allows the computation of six one-sided confidence intervals, to examine the inferiority of the three doses vs. each of the two controls, and one two-sided confidence interval comparing the two active controls. The problem-adequate contrast matrix c is shown in Table 3. A single contrast is used for one-sided hypotheses, whereas two contrasts with opposite signs are used for the two-sided hypotheses.

Table 3 Contrast matrix for the cholesterol example (Westfall et al., 1999)

Comparison          1-time   2-times   4-times   drug D   drug E

1-time – drug D       −1        0         0         1        0
2-times – drug D       0       −1         0         1        0
4-times – drug D       0        0        −1         1        0
1-time – drug E       −1        0         0         0        1
2-times – drug E       0       −1         0         0        1
4-times – drug E       0        0        −1         0        1
drug D – drug E        0        0         0        −1        1
drug E – drug D        0        0         0         1       −1

For the all-pairs method, 20 decomposed one-sided confidence intervals have to be calculated; in the contrast approach, the multiplicity adjustment is made for only 8 decisions. In Fig. 3 the calculated confidence intervals are shown for the difference and ratio. Overall, the contrast approach yields shorter intervals than the all-pairs comparisons.
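The effect of adjusting for 8 instead of 20 decisions can be made visible by comparing the two critical values. The sketch below approximates the 95% quantile of the maximum statistic by simulation under the global null hypothesis for a balanced layout with five groups of 10 observations; the number of replications, the seed, and all object names are arbitrary choices of this sketch.

```r
set.seed(2008)
m <- 5; n <- 10; nu <- m * (n - 1); reps <- 20000

# All-pairs contrasts: 10 pairs, decomposed into 20 one-sided contrasts.
pairs <- combn(m, 2)
half <- t(apply(pairs, 2, function(p) {
  v <- numeric(m); v[p[1]] <- -1; v[p[2]] <- 1; v
}))
all_pairs <- rbind(half, -half)

# User-defined contrasts as in Table 3: 6 one-sided rows plus 2 rows for
# the decomposed two-sided comparison of the active controls.
user_defined <- rbind(c(-1,  0,  0,  1,  0), c( 0, -1,  0,  1,  0),
                      c( 0,  0, -1,  1,  0), c(-1,  0,  0,  0,  1),
                      c( 0, -1,  0,  0,  1), c( 0,  0, -1,  0,  1),
                      c( 0,  0,  0, -1,  1), c( 0,  0,  0,  1, -1))

# Null draws of the group means (sigma = 1) and the pooled standard deviation.
ybar <- matrix(rnorm(reps * m, sd = 1 / sqrt(n)), nrow = reps)
s    <- sqrt(rchisq(reps, df = nu) / nu)

# Simulated (1 - alpha) quantile of max_i T_i for a given contrast matrix.
max_t_quantile <- function(cmat, level = 0.95) {
  se_scale <- sqrt(rowSums(cmat^2) / n)
  tstat <- (ybar %*% t(cmat)) / outer(s, se_scale)
  quantile(apply(tstat, 1, max), level, names = FALSE)
}

crit_all  <- max_t_quantile(all_pairs)
crit_user <- max_t_quantile(user_defined)
print(c(all_pairs = crit_all, user_defined = crit_user,
        ratio = crit_user / crit_all))
```

Because the 8 user-defined rows are a subset of the 20 all-pairs rows and the same draws are used for both, the user-defined critical value cannot exceed the all-pairs value, and the interval half-widths shrink by the printed ratio.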

SOFTWARE

Multiple tests and confidence intervals for user-defined contrasts can be computed using statistical software such as R or the SAS System. In R (R Development Core Team, 2006), the package multcomp (Bretz et al., 2002; Hothorn et al., 2006) allows the calculation of multiple tests (function simtest) and simultaneous confidence intervals (function simint) by using the multivariate t- or normal distribution. Package multcomp is based on package mvtnorm (Genz et al., 2006), which is used to compute multivariate normal and t-probabilities, quantiles, and densities. With the R package mratios, by Dilba and Schaarschmidt (2006), confidence intervals for the ratios in a one-way general linear model can be computed similarly to the package multcomp. R can be downloaded for free at http://www.r-project.org. In the SAS System, using PROC GLM for example, it is only possible to calculate simultaneous confidence intervals or adjusted p-values for all-pairs or many-to-one comparisons as predefined contrast settings. With the two SAS macros “SimTests” and “SimIntervals” (Westfall et al., 1999), multiple tests and simultaneous confidence intervals are available for user-defined contrasts, but only for inferences about the differences of means. Instead of the multivariate t-distribution, here the quantiles are approximated by simulation. The two SAS macros can be found at http://ftp.sas.com/samples/A56648. Computer code for the Westfall example using R and the SAS System is given in the appendix.

Figure 3 Confidence intervals for a user-defined contrast and two-sided all-pairs comparisons for the difference and ratio (Cholesterol dataset).

CONCLUSIONS

In this article, we show a way to adopt multiple contrast tests, for settings with a mixture of one-sided and two-sided hypotheses, by decomposition of the two-sided hypotheses into two one-sided hypotheses, hereby controlling the type I error rate. Available statistical software in SAS and R using the multivariate t-distribution can be used for inference in this setting via simultaneous confidence intervals for both differences and ratios of normal means. The adopted multiple contrast test can be applied to more general linear combinations of hypotheses than the existing methods at nearly similar power. Furthermore, we are able to calculate power directly over the multivariate noncentral t-distribution.

APPENDIX

Here we present code for R (package multcomp) and the Westfall macro for multiple comparisons in the SAS System for the example data set (see the section “Example”).

R Code

    library(multcomp)
    data(cholesterol)

    # Contrast matrix of Table 3: six one-sided rows and the decomposed
    # two-sided comparison of the active controls (last two rows)
    cmat <- rbind(c(-1,  0,  0,  1,  0),
                  c( 0, -1,  0,  1,  0),
                  c( 0,  0, -1,  1,  0),
                  c(-1,  0,  0,  0,  1),
                  c( 0, -1,  0,  0,  1),
                  c( 0,  0, -1,  0,  1),
                  c( 0,  0,  0, -1,  1),
                  c( 0,  0,  0,  1, -1))
    colnames(cmat) <- levels(cholesterol$trt)
    rownames(cmat) <- c("drugD-1time", "drugD-2time", "drugD-4time",
                        "drugE-1time", "drugE-2time", "drugE-4time",
                        "drugE-drugD", "drugD-drugE")

    # One-way linear model and simultaneous one-sided confidence limits
    linmod <- lm(response ~ trt, data = cholesterol)
    glht_obj <- glht(linmod, linfct = mcp(trt = cmat),
                     alternative = "greater")
    confint(glht_obj)


Code for SAS System

After the data set is read into the SAS System and the SimTests and SimIntervals macros are invoked, the contrasts are set with

    proc glm data=cholesterol outstat=stat;
      class trt;
      model response=trt;
      lsmeans trt / out=ests cov;
    run;
    quit;

    %macro Estimates;
      use ests;
      read all var {LSMEAN} into EstPar;
      read all var {COV1 COV2 COV3 COV4 COV5} into Cov;
      use stat (where=(_TYPE_='ERROR'));
      read all var {df} into df;
    %mend;

    %macro Contrasts;
      C = {-1  0  0  1  0,
            0 -1  0  1  0,
            0  0 -1  1  0,
           -1  0  0  0  1,
            0 -1  0  0  1,
            0  0 -1  0  1,
            0  0  0 -1  1,
            0  0  0  1 -1};
      C = C`;
      Clab = {"drugD-1time", "drugD-2time", "drugD-4time",
              "drugE-1time", "drugE-2time", "drugE-4time",
              "drugE-drugD", "drugD-drugE"};
    %mend;

    %SimIntervals(seed=12345, conf=0.95, side=L);

ACKNOWLEDGMENTS

We thank Miriam Annett as well as two anonymous referees for several suggestions that have improved the presentation of this paper.

REFERENCES

Anonymous (2001). Points to consider on multiplicity issues in clinical trials. Biometrical Journal 43:1039–1048.

CPMP (2002). Adjustment for multiplicity and related topics. Points to consider on multiplicity issues in clinical trials. European Agency for the Evaluation of Medicinal Products, CPMP/EWP/908/99. London, UK.


Bauer, P., Röhmel, J., Maurer, W., Hothorn, L. (1998). Testing strategies in multi-dose experiments including active control. Statistics in Medicine 17:2133–2146.

Bretz, F. (1999). Powerful Modification of Williams’ Test on Trend. PhD thesis, University of Hannover.

Bretz, F., Hothorn, L. A. (2003). Statistical analysis of monotone or non-monotone dose-response data from in vitro toxicological assays. ATLA-Alternatives to Laboratory Animals 31:81–96 (Suppl. 1).

Bretz, F., Genz, A., Hothorn, L. A. (2001). On the numerical availability of multiple comparison procedures. Biometrical Journal 43:645–656.

Bretz, F., Hothorn, T., Westfall, P. (2002). On multiple comparisons in R. R News 2(3):14–17.

Cheung, S. H., Kwong, K. S., Chan, W. S., Leung, S. P. (2004). Multiple comparisons with a control in families with both one-sided and two-sided hypotheses. Statistics in Medicine 23:2975–2988.

D’Agostino, R. B. Sr., Massaro, J. M., Sullivan, L. M. (2003). Non-inferiority trials: Design concepts and issues - The encounters of academic consultant statistics. Statistics in Medicine 22:169–186.

Dilba, G., Schaarschmidt, F. (2006). Mratios: Inferences for ratios of coefficients in the general linear model. R package version 1.0.

Dilba, G., Bretz, F., Guiard, V. (2006). Simultaneous confidence sets and confidence intervals for multiple ratios. Journal of Statistical Planning and Inference 136:2640–2658.

Dilba, G., Schaarschmidt, F., Hothorn, L. A. (2007). Inferences for ratios of normal means. R News 7(1):20–23.

Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treatments with a control. Journal of the American Statistical Association 50(272):1096–1121.

Fieller, E. C. (1954). Some problems in interval estimation. Journal of the Royal Statistical Society Series B - Statistical Methodology 16(2):175–185.

Genz, A., Bretz, F., R port by Hothorn, T. (2006). Mvtnorm: Multivariate normal and t distribution. R package version 0.7-5.

Hayter, A. J., Liu, W. (1996). Exact calculations for the one-sided studentized range test for testing against a simple ordered alternative. Computational Statistics and Data Analysis 22:17–25.

Hayter, A. J., Miwa, T., Liu, W. (2000). Combining the advantages of one-sided and two-sided procedures for comparing several treatments with a control. Journal of Statistical Planning and Inference 86:81–99.

Hirotsu, C., Marumo, K. (2002). Changepoint analysis as a method for isotonic inference. Scandinavian Journal of Statistics 29(1):125–138.

Hochberg, Y., Tamhane, A. C. (1987). Multiple Comparison Procedures. John Wiley & Sons, Inc.

Hothorn, T., Bretz, F., Westfall, P. (2006). Multcomp: Simultaneous Inference for General Linear Hypotheses. R package version 0.991-5.

Hsu, J. (1996). Multiple Comparisons: Theory and Methods. London, UK: Chapman & Hall.

ICH guideline E9 (1998). Guidance for Industry. Statistical principles for clinical trials. See URL http://www.fda.gov/cder/guidance/ICH_E9-fnl.PDF.

Koch, G. G. (1991). One-sided p-values and two-sided tests and p values. Journal of Biopharmaceutical Statistics 1:161–170.

Long, B. J., Jelovac, D., Handratta, V., Thiantanawat, A., MacPherson, N., Ragaz, J., Goloubeva, O. G., Brodie, A. M. (2004). Therapeutic strategies using the aromatase inhibitor letrozole and tamoxifen in a breast cancer model. Journal of the National Cancer Institute 96:456–465.


R Development Core Team (2006). A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

Tong, Y. L. (1990). The Multivariate Normal Distribution. New York: Springer.

Tukey, J. W. (1953). The problem of multiple comparisons. Mimeographed Notes. Princeton University.

Westfall, P. H., Tobias, R. D., Rom, D., Wolfinger, R. D., Hochberg, Y. (1999). Multiple Comparisons and Multiple Tests Using the SAS System. Cary, NC: SAS Institute Inc.
