Week 11 Testing Hypotheses, Part I
personal.psu.edu/acq/401/course.info/week11.pdf



Week 11 Objectives

1 Simulations are used to consolidate the understanding of confidence intervals (CI). In addition, the lab session demonstrates the use of R for

   • constructing CIs for a mean, a proportion and the regression parameters, and
   • constructing prediction intervals.

2 Concepts surrounding the subject of hypothesis testing are introduced.

3 Test procedures, including formulas for p-values, for hypotheses regarding the mean, a proportion, the regression parameters, and the median are presented.



1 Lab 7: CIs and PIs

CIs and PIs in Regression

CI for µ (through “lm”) and CI for p

2 Hypothesis Testing

Introduction

Tests for µ and for p

Tests for Regression Parameters

The Sign Test for the Median



Lab 7: CIs and PIs

Explaining CIs

Each CI is a Bernoulli trial: it either contains the true parameter value or not. After the CI is constructed, we talk about how "confident" we are that it contains the true value.

The following commands simulate 50 such CIs and plot them as horizontal segments, with a vertical line at the true value of p:

    m = 50; n = 20; p = .5
    numH = rbinom(m, n, p)                   # toss 20 coins 50 times, store the number of heads each time in numH
    phat = numH/n                            # get the 50 estimators of 0.5
    SE = sqrt(phat*(1-phat)/n)               # get the 50 estimated standard errors
    alpha = 0.10; zstar = qnorm(1-alpha/2)   # compute z_{α/2}
    matplot(rbind(phat - zstar*SE, phat + zstar*SE), rbind(1:m, 1:m), type="l", lty=1); abline(v=p)
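The Bernoulli-trial interpretation can also be checked numerically by repeating the experiment many more times and recording the fraction of intervals that contain p. The following is a minimal sketch along the lines of the commands above; the seed, the choice m = 10000 and the name `covered` are assumptions made only for this illustration.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
m <- 10000; n <- 20; p <- 0.5        # many more repetitions than the 50 that are plotted
numH <- rbinom(m, n, p)              # number of heads in each of the m experiments
phat <- numH / n
SE <- sqrt(phat * (1 - phat) / n)
zstar <- qnorm(1 - 0.10 / 2)         # z_{alpha/2} for 90% CIs
covered <- (phat - zstar * SE <= p) & (p <= phat + zstar * SE)
mean(covered)                        # empirical coverage, to compare with the nominal 0.90
```

The empirical coverage should be close to, but need not equal, the nominal 90%, because the interval relies on a normal approximation with only n = 20 tosses.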








CIs and PIs in Regression







CIs for α1 and β1

If y and x contain the response and predictor values, 95% CIs for α1 and β1 of the model Y = α1 + β1X + ε are obtained as:

    out=lm(y~x); confint(out)

or

    out=lm(y~x); confint(out, level=0.95)

For 90% CIs use

    out=lm(y~x); confint(out, level=0.90)

(∗) Can ask for CIs separately for β1 and α1 by

    confint(out, parm="x", level=0.90)

and

    confint(out, parm="(Intercept)", level=0.90)

respectively. Omitting the level specification gives 95% CIs.




CIs for µY|X(x) = α1 + β1x

If y and x contain the response and predictor values, a CI for µY|X(x), e.g., at x = 5.5, is obtained by:

    out=lm(y ~ x); newx=data.frame(x=5.5)
    predict(out, newx, interval="confidence", level=0.9)

REMARK: If the predictor values were in the R object "t", the "newx" command should be "newx=data.frame(t=5.5)".

Can do multiple CIs at different x-values. E.g., for CIs at x = 4.5 and x = 5.5, change the "newx" command to

    newx=data.frame(x=c(4.5,5.5))




CI for E(Y) (or for β0 = α1 + β1E(X))

• If y and x contain the response and predictor values, a CI for E(Y) can be obtained in two ways:

    out=lm(y ~ x); x0=mean(x); newx=data.frame(x=x0); predict(out, newx, interval="confidence", level=0.9)

    xc = x - mean(x); out2=lm(y ~ xc); confint(out2, level=0.90)

  In the second approach the CI for E(Y) is the "(Intercept)" row of the output, because centering the predictor makes the intercept equal to β0.

• REMARK: This CI for µY, which uses the structure of the SLR model, is typically shorter than the T CI.
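The agreement of the two approaches, and the remark about interval length, can be checked on simulated data. The following is only a sketch: the seed is arbitrary, the data are generated from an SLR model chosen for illustration, and the names `ci_pred`, `ci_cent` and `ci_T` are introduced here for the example.

```r
set.seed(2)                                    # arbitrary seed
e <- rnorm(50, 0, 5); x <- runif(50, 0, 10); y <- 25 - 3.4 * x + e

# Way 1: CI for the regression line evaluated at x = mean(x)
out <- lm(y ~ x)
ci_pred <- predict(out, data.frame(x = mean(x)), interval = "confidence", level = 0.9)

# Way 2: CI for the intercept after centering the predictor
xc <- x - mean(x)
ci_cent <- confint(lm(y ~ xc), level = 0.90)["(Intercept)", ]

# T CI for E(Y) that ignores the predictor
ci_T <- confint(lm(y ~ 1), level = 0.90)

ci_pred[, c("lwr", "upr")]    # same endpoints as ci_cent
ci_cent
ci_T                          # wider here, since x explains much of the variability in y
```

In this simulation the predictor accounts for most of the variability of the response, which is the situation in which the SLR-based interval is clearly shorter.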




Illustration with Simulated Data

    e=rnorm(50,0,5); x=runif(50,0,10); y=25-3.4*x+e

95% CIs for α1 and β1:

    out=lm(y~x); confint(out)

90% CIs for α1 and β1:

    out=lm(y~x); confint(out, level=0.90)

95% CIs for µY|X(x) = α1 + β1x at x = 4.5 and x = 5.5:

    out=lm(y ~ x); newx=data.frame(x=c(4.5,5.5)); predict(out, newx, interval="confidence")

95% CI for E(Y):

    x0=mean(x); newx=data.frame(x=x0); predict(out, newx, interval="confidence")




PIs in regression

• Recall that for prediction you need the normal SLR model.

90% PI for a new observation at x = 4.5:

    out=lm(y~x); newx=data.frame(x=4.5)
    predict(out, newx, interval="predict", level=0.9)

As usual, the default level is 95%.

Can do multiple PIs at different x-values. E.g., for 95% PIs at x = 4.5 and x = 5.5, change the "newx" command to

    newx=data.frame(x=c(4.5,5.5))




PIs without a covariate

• Recall that for prediction you need the normality assumption.

With the data points in x, the simple command

    predict(lm(x ~ 1), interval="predict", level=0.9)

gives a 90% PI for a new observation, but does so n times. To get it only once use:

    x2=rep(1,length(x)); newx=data.frame(x2=1)
    predict(lm(x ~ -1 + x2), newx, interval="predict", level=0.9)
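As a cross-check, the interval returned by the command above should agree with the standard normal-theory PI, mean ± t_{α/2,n−1} · S · sqrt(1 + 1/n). That formula is not stated on these slides and is brought in here as an assumption for comparison; the sample, the seed and the names `pi_formula` and `pi_lm` are likewise hypothetical.

```r
set.seed(3)                                   # arbitrary seed
x <- rnorm(30, 10, 2)                         # hypothetical sample
n <- length(x)

# 90% PI from the textbook formula (assumed here, not taken from the slides)
half <- qt(0.95, df = n - 1) * sd(x) * sqrt(1 + 1 / n)
pi_formula <- c(mean(x) - half, mean(x) + half)

# 90% PI from lm(), computed once via a constant predictor
x2 <- rep(1, n); newx <- data.frame(x2 = 1)
pi_lm <- predict(lm(x ~ -1 + x2), newx, interval = "predict", level = 0.9)

pi_formula
pi_lm[, c("lwr", "upr")]                      # same endpoints as pi_formula
```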




Illustration with Simulated Data

    e=rnorm(50,0,5); x1=rnorm(50,9,3); y1=25-3.4*x1+e

95% PIs for new observations at x1 = 7.5 and x1 = 8.5:

    out1=lm(y1 ~ x1); newx=data.frame(x1=c(7.5,8.5)); predict(out1, newx, interval="predict")

95% PI for a new X1 observation:

    x2=rep(1,length(x1)); newx=data.frame(x2=1)
    predict(lm(x1 ~ -1 + x2), newx, interval="predict")








CI for µ (through "lm") and CI for p







If y contains the sample,

    confint(lm(y ~ 1), level=0.9)

gives the 90% T CI for E(Y). The 90% T CI for E(Y) can also be obtained by

    mean(y) ± qt(0.95, df=length(y)-1)*sd(y)/sqrt(length(y))

keeping in mind that "+" and "−" must be used separately (± is not an R operator).

REMARK: Compare with the CI for E(Y) that used the SLR model.

If t is the number of "successes" in n trials,

    phat=t/n; phat ± qnorm(0.975)*sqrt(phat*(1-phat)/n)

gives the 95% CI for p. To obtain 90% or other CIs, adjust the 0.975 in the above command accordingly.
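Since ± is not valid R, the two endpoints have to be computed explicitly. A minimal sketch follows, using a hypothetical sample and hypothetical counts (42 successes in 120 trials); the seed and the names `ci_hand`, `ci_lm` and `ci_p` are assumptions made for the example.

```r
set.seed(4)                                    # arbitrary seed
y <- rnorm(25, 50, 8)                          # hypothetical sample

# 90% T CI for E(Y): apply "-" and "+" separately
half <- qt(0.95, df = length(y) - 1) * sd(y) / sqrt(length(y))
ci_hand <- c(mean(y) - half, mean(y) + half)
ci_lm <- confint(lm(y ~ 1), level = 0.9)       # same interval through lm
ci_hand
ci_lm

# 95% CI for p; c(-1, 1) gives both endpoints in one command
t <- 42; n <- 120                              # hypothetical counts
phat <- t / n
ci_p <- phat + c(-1, 1) * qnorm(0.975) * sqrt(phat * (1 - phat) / n)
ci_p
```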



Hypothesis Testing





Introduction







• A hypothesis is a statement regarding the range of values of a parameter θ of interest.

• Hypothesis testing is a set of rules and procedures that allow us to declare a hypothesis as being significantly (i.e. with high probability/confidence) untrue, or not.

• If the hypothesis is declared significantly untrue it is rejected, in which case we talk of a significant outcome.




Can CIs be used for testing hypotheses?

• A hypothesis of the form H0 : θ = θ0, where θ0 is a specified value, is rejected at level of significance α if θ0 does not belong in the (1 − α)100% CI for θ.

Suppose we are interested in testing the hypothesis that the mean of a population equals 9.8, i.e., H0 : µ = 9.8. Suppose, further, that the data yield a 95% CI for µ of (9.3, 9.9). Since 9.8 belongs in the 95% CI, we say that H0 is not rejected at level of significance α = 0.05.

• Nevertheless, there are a number of specific issues that arise in hypothesis testing which deserve separate treatment. These are discussed in the next few pages.




The null and alternative hypotheses

In every hypothesis testing situation, there is a null hypothesis (H0) and an alternative hypothesis (Ha). Typically, the statement in Ha is the logical complement of the statement in H0.

For example, if H0 : θ = θ0 then Ha : θ ≠ θ0. Other common choices for H0 and Ha pairs are:

    H0 : θ ≤ θ0, Ha : θ > θ0 and

    H0 : θ ≥ θ0, Ha : θ < θ0.

These alternatives will be called one-sided, while Ha : θ ≠ θ0 is called two-sided.




Asymmetric treatment of H0 and Ha

Test procedures do not reject H0 unless there is strong evidence against it. In this respect, H0 is like the presumption of innocence in a court of law. As a consequence, when H0 is not rejected, we have no evidence that H0 is true.

In the previous example, H0 : µ = 9.8 is not rejected since 9.8 belongs in the 95% CI for µ, which is (9.3, 9.9). However, H0 : µ = 9.4 would also not be rejected. Thus there is no evidence that either of these two null hypotheses is true.

On the other hand, if H0 is rejected at level of significance α, we have statistical proof, at the (1 − α)100% level, that H0 is false.

The court of law analogy of the level of significance is reasonable doubt.




Rule for Designating H0 and Ha

• The fact that test procedures favor H0 implies that proper designation of the hypotheses is very important.

RULE: The hypothesis that the investigator wants to claim as true, or wants evidence for, should be taken as the alternative hypothesis. The negation, or complementary statement, of the alternative hypothesis is the null hypothesis.

• As a convention we will be mainly stating only the alternative hypothesis. The correct H0 will be implied.




Example

A trucking firm is suspicious of a tire manufacturer's claim that certain tires last, on average, at least 28,000 miles. The firm initiates a study to confirm this suspicion. Designate Ha and H0.

Solution: The trucking firm wants evidence that the claim is wrong, i.e. that µ < 28,000. This is designated as Ha. The complementary statement is designated as H0. Thus the hypotheses to be tested are H0 : µ ≥ 28,000 vs Ha : µ < 28,000.




Example

A tire manufacturing firm wants to claim that certain tires last, on average, at least 28,000 miles. The firm initiates a study to support the validity of the claim. Designate Ha and H0.

Solution: The manufacturer wants evidence that the claim they are about to make, i.e. µ > 28,000, is correct. This is designated as Ha. The complementary statement is H0. Thus the two hypotheses to be tested are H0 : µ ≤ 28,000 vs Ha : µ > 28,000.




Test Statistics and Rejection Regions

The CI-based procedure for rejecting H0 : θ = θ0 is not suitable for H0 : θ ≤ θ0, or H0 : θ ≥ θ0, or other hypotheses such as the equality of several means. Instead, test procedures will be specified in terms of a test statistic (TS).

Hypotheses about p use a Z test statistic, and those about µ, β1 and µY|X=x use a T test statistic.

If the value of the TS is deemed very unlikely if H0 were true, then H0 is rejected. The collection of such unlikely values for the TS is called the rejection region (RR).




Reporting the outcome and p-values

• Reporting the outcome of a test procedure as rejecting or not rejecting a null hypothesis is not as informative as it can be.

For example, the fact that H0 : µ = µ0 is rejected at level of significance α = 0.1 does not mean that it would also be rejected at α = 0.05. This is so because a 95% CI for µ is wider than the 90% CI.

• It is more informative to report the so-called p-value, which is defined to be the smallest level of significance at which H0 would be rejected.

• The p-value determines the outcome of the test by the rule:

    p-value ≤ α ⇒ reject H0 at level α
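For the two-sided T test of a mean, the p-value rule and the CI-based rule give the same decision at every level α. The following sketch illustrates this with R's built-in `t.test`; the sample, the seed, the value of `mu0` and the object names are hypothetical and chosen only for the illustration.

```r
set.seed(5)                                   # arbitrary seed
x <- rnorm(40, mean = 9.6, sd = 1)            # hypothetical sample
mu0 <- 9.8                                    # value specified by H0
pval <- t.test(x, mu = mu0)$p.value           # two-sided p-value

alphas <- c(0.10, 0.05, 0.01)
reject_by_pvalue <- pval <= alphas            # rule: p-value <= alpha => reject H0
mu0_outside_CI <- sapply(alphas, function(a) {
  ci <- t.test(x, mu = mu0, conf.level = 1 - a)$conf.int
  mu0 < ci[1] || mu0 > ci[2]                  # CI rule: reject if mu0 is not in the CI
})
data.frame(alpha = alphas, reject_by_pvalue, mu0_outside_CI)
```

The two logical columns agree row by row, whatever the sample happens to be.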




The issue of precision

• Precision in hypothesis testing is quantified in quite a different way than in estimation. Consequently, sample size determination in hypothesis testing is based on very different considerations.

In what follows we will:

1 Give the TS, rejection regions, and formulas for the p-value for testing hypotheses about µ, p, β1 and µY|X(x).

2 Apply the principle of analysis of variance, or ANOVA, for testing H0 : β1 = 0 vs Ha : β1 ≠ 0.

3 Quantify precision in hypothesis testing.








Tests for µ and for p







T tests for the mean

• Let X1, ..., Xn be a simple r.s. from a population with finite variance. If the population is normal then, under H0 : µ = µ0,

$$T_{H_0} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim T_{n-1}. \tag{3.1}$$

Without normality (3.1) holds approximately for n ≥ 30.

• The RR for testing H0 against different alternatives are:

    Ha          RR at level α
    µ > µ0      T_H0 ≥ t_{α,n−1}
    µ < µ0      T_H0 ≤ −t_{α,n−1}
    µ ≠ µ0      |T_H0| ≥ t_{α/2,n−1}




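• Formulas for the p-value of the T test for the mean:

$$\text{P-value} = \begin{cases} 1 - G_{n-1}(T_{H_0}) & \text{for } H_a: \mu > \mu_0 \\ G_{n-1}(T_{H_0}) & \text{for } H_a: \mu < \mu_0 \\ 2\left[1 - G_{n-1}(|T_{H_0}|)\right] & \text{for } H_a: \mu \neq \mu_0 \end{cases}$$

where G_{n−1} is the cumulative distribution function of the T_{n−1} distribution.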




Example

The change of a tire design is economically justified if the average lifetime with the new design exceeds 20,000 miles. A random sample of n = 36 new tires yields X̄ = 20,580 and S = 1500. Should the new design be adopted? Test at level of significance α = 0.01, and report the p-value.

Solution. Here H0 : µ = 20,000 vs Ha : µ > 20,000. Since

$$T_{H_0} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{20{,}580 - 20{,}000}{1500/\sqrt{36}} = 2.32 \not> t_{.01,35} = 2.44,$$

H0 is not rejected, and thus the new design is not adopted. The p-value is 1 − G35(2.32) = 0.013, meaning that H0 can be rejected at level α = 0.05.
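The numbers in this example can be reproduced in R from the summary statistics, with `qt` giving the critical value and `pt` playing the role of G35. The object names below are chosen only for this sketch.

```r
n <- 36; xbar <- 20580; s <- 1500; mu0 <- 20000; alpha <- 0.01
TH0 <- (xbar - mu0) / (s / sqrt(n))        # observed test statistic
tcrit <- qt(1 - alpha, df = n - 1)         # critical value t_{alpha, n-1}
pval <- 1 - pt(TH0, df = n - 1)            # p-value for Ha: mu > mu0
round(c(TH0 = TH0, tcrit = tcrit, pval = pval), 3)
TH0 >= tcrit                               # is TH0 in the rejection region at level 0.01?
pval <= 0.05                               # would H0 be rejected at level 0.05?
```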




Z tests for proportions

• If np0 ≥ 5 and n(1 − p0) ≥ 5 then, under H0 : p = p0,

$$Z_{H_0} = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \mathrel{\dot\sim} N(0,1).$$

    Ha          RR at level α
    p > p0      Z_H0 ≥ z_α
    p < p0      Z_H0 ≤ −z_α
    p ≠ p0      |Z_H0| ≥ z_{α/2}




• Formulas for the p-value of the Z test for a proportion:

$$\text{P-value} = \begin{cases} 1 - \Phi(Z_{H_0}) & \text{for } H_a: p > p_0 \\ \Phi(Z_{H_0}) & \text{for } H_a: p < p_0 \\ 2\left[1 - \Phi(|Z_{H_0}|)\right] & \text{for } H_a: p \neq p_0 \end{cases}$$

where Φ is the cumulative distribution function of the standard normal distribution.




Example (Cell-phone-induced car accidents in city)

A r.s. of 200 accident reports yields that 151 are cell phone induced. Does this support the suspicion that more than 70% are due to cell phone use? Test at α = 0.01, and report the p-value.

Solution. Here, H0 : p = 0.7 vs Ha : p > 0.7. The TS is

$$Z_{H_0} = \frac{\hat{p} - 0.7}{\sqrt{(0.7)(0.3)/200}} = \frac{0.755 - 0.7}{\sqrt{(0.7)(0.3)/200}} = 1.697.$$

Since 1.697 ≯ z_{0.01} = 2.33, H0 is not rejected. The p-value is 1 − Φ(1.697) = 0.045, so the test is (barely) significant at α = 0.05.
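The same computation in R, as a sketch with object names chosen for the example; `qnorm` gives z_α and `pnorm` is Φ.

```r
n <- 200; t <- 151; p0 <- 0.7; alpha <- 0.01
phat <- t / n                                   # sample proportion, 0.755
ZH0 <- (phat - p0) / sqrt(p0 * (1 - p0) / n)    # observed test statistic
zcrit <- qnorm(1 - alpha)                       # critical value z_alpha
pval <- 1 - pnorm(ZH0)                          # p-value for Ha: p > p0
round(c(ZH0 = ZH0, zcrit = zcrit, pval = pval), 3)
c(n * p0, n * (1 - p0)) >= 5                    # conditions for the normal approximation
```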








Tests for Regression Parameters







T tests for the regression slope

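• If the error distribution is normal then, under H0 : β1 = β1,0,

$$T_{H_0} = \frac{\hat\beta_1 - \beta_{1,0}}{S_{\hat\beta_1}} \mathrel{\dot\sim} T_{n-2}, \tag{3.2}$$

where

$$S_{\hat\beta_1} = \sqrt{\frac{S_\varepsilon^2}{\sum X_i^2 - \frac{1}{n}\left(\sum X_i\right)^2}} \quad\text{and}\quad S_\varepsilon^2 = \frac{\sum Y_i^2 - \hat\alpha_1 \sum Y_i - \hat\beta_1 \sum X_i Y_i}{n-2}.$$

Without normality, (3.2) holds approximately if n ≥ 30.

    Ha              RR at level α
    β1 > β1,0       T_H0 > t_{n−2,α}
    β1 < β1,0       T_H0 < −t_{n−2,α}
    β1 ≠ β1,0       |T_H0| > t_{n−2,α/2}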




• Formulas for the p-values for the T test for the slope:

$$\text{P-value} = \begin{cases} 1 - G_{n-2}(T_{H_0}) & \text{for } H_a: \beta_1 > \beta_{1,0} \\ G_{n-2}(T_{H_0}) & \text{for } H_a: \beta_1 < \beta_{1,0} \\ 2\left[1 - G_{n-2}(|T_{H_0}|)\right] & \text{for } H_a: \beta_1 \neq \beta_{1,0} \end{cases}$$

• The most common testing problem is for

    H0 : β1 = 0 vs Ha : β1 ≠ 0.

The test for this hypothesis is called the model utility test.
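In R, the coefficient table of `summary(lm(...))` reports the statistic and two-sided p-value of the model utility test in the row of the predictor. The sketch below, on simulated data with an arbitrary seed and hypothetical object names, recomputes both from the formulas above to show how they correspond.

```r
set.seed(6)                                   # arbitrary seed
x <- runif(50, 0, 10); y <- 25 - 3.4 * x + rnorm(50, 0, 5)
out <- lm(y ~ x)
coefs <- summary(out)$coefficients            # columns: Estimate, Std. Error, t value, Pr(>|t|)

TH0 <- coefs["x", "Estimate"] / coefs["x", "Std. Error"]   # (beta1.hat - 0) / S_beta1.hat
pval <- 2 * pt(-abs(TH0), df = out$df.residual)            # equals 2[1 - G_{n-2}(|TH0|)]

c(TH0 = TH0, reported = coefs["x", "t value"])
c(pval = pval, reported = coefs["x", "Pr(>|t|)"])
```

Using `pt(-abs(TH0), ...)` instead of `1 - pt(abs(TH0), ...)` is a numerical choice: the two are mathematically equal, but the former stays accurate when the p-value is extremely small.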




Example

Given the data on conductivity (µS/cm) measurements of surface water (X), and water in the sediment at the bank of a river (Y), taken at 10 points during winter [reported in a 2004 article in Environmental Toxicology], the least squares estimators are α̂1 = 193.96, β̂1 = 0.934 and S²ε = 4118.56.

(a) The scientific question of interest is whether surface conductivity (which is easier to measure) can be used for predicting the conductivity of water in the sediment. Test at level of significance α = 0.05, and report the p-value.

(b) Test H0 : β1 ≥ 1.0 vs Ha : β1 < 1.0, at level α = 0.05, and report the p-value.




Solution: (a) This is another way of asking for the model utility test. Since (calculation details are omitted)

$$T_{H_0} = 0.934/0.0983 = 9.5 \gg t_{8,0.025} = 2.31,$$

the test is significant at α = 0.05. Thus, surface conductivity can be used for predicting the conductivity of water in the sediment. The p-value equals 2[1 − G8(9.5)] = 0.000 (to three decimals), so the test would have been significant at all common α levels.

(b) The TS (calculation details omitted) is

$$T_{H_0} = (0.934 - 1)/0.0983 = -0.67.$$

Since −0.67 ≮ −t_{8,0.05} = −1.86, there is no evidence, at α = 0.05, to support Ha : β1 < 1.0. The p-value G8(−0.67) is found by the R command pt(-0.67,8) to be 0.260. Using Table A.4 we can say that the p-value is > 0.1.
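Both parts can be reproduced in R from the reported estimate and its standard error. This is a sketch only: the raw data are not used, the value 0.0983 is taken from the solution above, and the object names are chosen for the example.

```r
n <- 10; b1 <- 0.934; se_b1 <- 0.0983; df <- n - 2

# (a) model utility test, H0: beta1 = 0 vs Ha: beta1 != 0
Ta <- b1 / se_b1
crit_a <- qt(1 - 0.05 / 2, df)               # t_{8, 0.025}
pval_a <- 2 * pt(-abs(Ta), df)               # two-sided p-value

# (b) H0: beta1 >= 1 vs Ha: beta1 < 1
Tb <- (b1 - 1) / se_b1
crit_b <- -qt(1 - 0.05, df)                  # -t_{8, 0.05}
pval_b <- pt(Tb, df)                         # lower-tail p-value, about 0.26

round(c(Ta = Ta, crit_a = crit_a, Tb = Tb, crit_b = crit_b), 2)
c(pval_a = pval_a, pval_b = pval_b)
```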




Tests about the Regression Line

• If the error distribution is normal then, under H0 : µY|X=x = µY|X=x,0,

$$T_{H_0} = \frac{\hat\mu_{Y|X=x} - \mu_{Y|X=x,0}}{S_{\hat\mu_{Y|X=x}}} \mathrel{\dot\sim} T_{n-2}, \tag{3.3}$$

where

$$S_{\hat\mu_{Y|X=x}} = S_\varepsilon \sqrt{\frac{1}{n} + \frac{n(x - \bar{X})^2}{n\sum X_i^2 - \left(\sum X_i\right)^2}}$$

and S²ε as before.

Without normality, (3.3) holds approximately if n ≥ 30.

    Ha                          RR at level α
    µY|X=x > µY|X=x,0           T_H0 > t_{n−2,α}
    µY|X=x < µY|X=x,0           T_H0 < −t_{n−2,α}
    µY|X=x ≠ µY|X=x,0           |T_H0| > t_{n−2,α/2}




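• Formulas for the p-values for the T test for the regression line:

$$\text{P-value} = \begin{cases} 1 - G_{n-2}(T_{H_0}) & \text{for } H_a: \mu_{Y|X=x} > \mu_{Y|X=x,0} \\ G_{n-2}(T_{H_0}) & \text{for } H_a: \mu_{Y|X=x} < \mu_{Y|X=x,0} \\ 2\left[1 - G_{n-2}(|T_{H_0}|)\right] & \text{for } H_a: \mu_{Y|X=x} \neq \mu_{Y|X=x,0} \end{cases}$$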


Example

With the toxicity assessment data of the previous example, test the hypothesis H0 : µY|X=400 ≥ 600 vs Ha : µY|X=400 < 600, at level α = 0.1, and report the p-value.




Solution: Note that the value 400 lies in the range of the surface conductivity measurements (this can be verified by checking the ToxAssesData.txt data set). Next, µ̂Y|X(400) = 193.96 + 0.934 × 400 = 567.56. Since (computation details omitted)

$$T_{H_0} = \frac{567.56 - 600}{20.47} = -1.58 < -t_{8,0.1} = -1.397,$$

the null hypothesis is rejected at level α = 0.1. The p-value G8(−1.58) is found by the R command pt(-1.58,8) to be 0.076. Using Table A.4 we can say that the p-value is between 0.05 and 0.1. In either case, we conclude that H0 would not have been rejected at level α = 0.05.
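The arithmetic of this solution can be checked in R from the reported estimates. This is a sketch under the assumption that the standard error 20.47 is taken as given from the solution; the object names are chosen for the example.

```r
a1 <- 193.96; b1 <- 0.934; x0 <- 400; mu0 <- 600
se_mu <- 20.47                               # standard error of the fitted value, as reported
df <- 10 - 2

muhat <- a1 + b1 * x0                        # estimated regression line at x = 400
TH0 <- (muhat - mu0) / se_mu                 # observed test statistic
crit <- -qt(1 - 0.1, df)                     # -t_{8, 0.1}
pval <- pt(TH0, df)                          # lower-tail p-value for Ha: mu < 600

round(c(muhat = muhat, TH0 = TH0, crit = crit, pval = pval), 3)
c(reject_at_0.10 = pval <= 0.10, reject_at_0.05 = pval <= 0.05)
```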








The Sign Test for the Median







• Testing for the median (or other percentiles) is of interest in its own right.

• Moreover, when the data exhibit skewness and the sample size is small, or if there are outliers suggesting a heavy-tailed distribution, T tests for the mean should be avoided.

• The sign test is based on the fact that, when sampling from a continuous distribution, each data point Xi is greater than the median µ̃ with probability 0.5.

• This fact helps convert a hypothesis H0 : µ̃ = µ̃0 into a hypothesis that a binomial proportion is 0.5.

The conversion proceeds as follows:




Let X1, ..., Xn be a sample from a continuous distribution and set

    Y = # of Xi that are > µ̃0.

Then Y ∼ Bin(n, p), where p = P(Xi > µ̃0), and

• if H0 : µ̃ = µ̃0 holds, then p = 0.5.

• if Ha : µ̃ > µ̃0 holds, then p > 0.5. Why? (Hint: A picture helps.)

• if Ha : µ̃ < µ̃0 holds, then p < 0.5. Why?
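A small simulation can stand in for the picture. The sketch below assumes, purely for illustration, an exponential population whose true median (1.5) exceeds the hypothesized median (1), and compares the simulated proportion of observations above the hypothesized median with its exact value; the distribution, the seed and all names are assumptions.

```r
set.seed(7)                                   # arbitrary seed
med0 <- 1                                     # hypothesized median (assumed value)
true_med <- 1.5                               # true median, so Ha: median > med0 holds
rate <- log(2) / true_med                     # exponential rate that gives median true_med
x <- rexp(10000, rate = rate)

p_sim <- mean(x > med0)                       # simulated P(X > med0)
p_exact <- exp(-rate * med0)                  # exact P(X > med0) for the exponential
c(p_sim = p_sim, p_exact = p_exact)           # both exceed 0.5
```

More than half of the probability mass lies to the right of the true median's left neighbor µ̃0, which is why p > 0.5 under this alternative; the mirror-image argument gives p < 0.5 when µ̃ < µ̃0.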




• The above imply that testing H0 : µ̃ = µ̃0

    vs Ha : µ̃ > µ̃0,
    or vs Ha : µ̃ < µ̃0,
    or vs Ha : µ̃ ≠ µ̃0,

is equivalent to testing H0 : p = 0.5

    vs Ha : p > 0.5,
    or vs Ha : p < 0.5,
    or vs Ha : p ≠ 0.5,

respectively. Thus we have the following test procedure.




The Sign Test Procedure for H0 : µ̃ = µ̃0

• Assume n ≥ 10 and set

$$Z_{H_0} = \frac{\hat{p} - 0.5}{\sqrt{0.25/n}}, \quad\text{where } \hat{p} = \frac{Y}{n}.$$

    Ha          RR at level α
    µ̃ > µ̃0      Z_H0 ≥ z_α
    µ̃ < µ̃0      Z_H0 ≤ −z_α
    µ̃ ≠ µ̃0      |Z_H0| ≥ z_{α/2}




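• Formulas for the p-value of the sign test:

$$\text{P-value} = \begin{cases} 1 - \Phi(Z_{H_0}) & \text{for } H_a: \tilde\mu > \tilde\mu_0 \\ \Phi(Z_{H_0}) & \text{for } H_a: \tilde\mu < \tilde\mu_0 \\ 2\left[1 - \Phi(|Z_{H_0}|)\right] & \text{for } H_a: \tilde\mu \neq \tilde\mu_0 \end{cases}$$

where Φ is the cumulative distribution function of the standard normal distribution.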




Example

The data set InfantSBP.txt contains systolic blood pressure measurements from a sample of 36 infants. Do the data suggest that the median is greater than 94? Test at α = 0.05 and report the p-value.

Solution. Here H0 : µ̃ = 94 and Ha : µ̃ > 94. Since n = 36 > 10, the sign test can be used. Importing the data into the R object x, the R command sum(x > 94) returns Y = 22. Thus, p̂ = Y/n = 0.611, and

$$Z_{H_0} = \frac{0.611 - 0.5}{\sqrt{0.25/36}} = 1.332.$$

Since 1.332 < z_{0.05} = 1.645, H0 is not rejected. The p-value is 1 − Φ(1.332) = 0.091.
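The sign test computation can be sketched in R starting from the count Y = 22, since the data file itself is not reproduced here. Working with the unrounded p̂ gives a statistic of about 1.333 rather than 1.332; the difference comes only from rounding p̂ to 0.611. The last lines add the exact binomial p-value via `binom.test`, which is not part of the slides and is shown only as a comparison; all object names are chosen for the example.

```r
n <- 36; Y <- 22; alpha <- 0.05
phat <- Y / n                                 # proportion of observations above 94
ZH0 <- (phat - 0.5) / sqrt(0.25 / n)          # sign test statistic
zcrit <- qnorm(1 - alpha)                     # z_{0.05}
pval <- 1 - pnorm(ZH0)                        # normal-approximation p-value for Ha: median > 94
round(c(ZH0 = ZH0, zcrit = zcrit, pval = pval), 3)

# Exact binomial alternative (an addition for comparison, not the slides' method)
pval_exact <- binom.test(Y, n, p = 0.5, alternative = "greater")$p.value
pval_exact
```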

