CH7.2 2 Sample Tests

  • CH7.2

    2 Sample Tests

  • Intro

    Suppose we wish to compare the heights, X1, of IN 12 year olds with the heights, X2, of KY 12 year olds. We proceed as follows.

    Step I: Take an SRS of size n1 from the IN pop and compute x̄1. If we assume the population is N(µ1, σ1), then the distribution of x̄1 is N(µx̄1 = µ1, σx̄1 = σ1/√n1).

    Step II: Take an SRS of size n2 from the KY pop and compute x̄2. If we assume the population is N(µ2, σ2), then the distribution of x̄2 is N(µx̄2 = µ2, σx̄2 = σ2/√n2).

    Step III: Compute the statistic

    diff = x̄1 − x̄2

  • Distribution of diff

    diff will be used to estimate the difference in the population means µ1 − µ2. Question: What is the distribution of our statistic? We use the rules from CH4 to answer this.

    $$\mu_{\text{diff}} = \mu_{\bar{x}_1} - \mu_{\bar{x}_2} = \mu_1 - \mu_2$$

    diff is an unbiased estimator.

    $$\sigma^2_{\text{diff}} = \sigma^2_{\bar{x}_1} + (-1)^2 \sigma^2_{\bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

    diff is normal with the above mean and variance.
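
    The mean and variance rules above can be checked by simulation. The following is a minimal sketch, assuming two independent samples and using made-up population values (µ1 = 63, σ1 = 5, µ2 = 60, σ2 = 3) chosen only for illustration; the function name `simulate_diff` is hypothetical.

```python
import random
import statistics


def simulate_diff(mu1, sigma1, n1, mu2, sigma2, n2, reps=5000, seed=0):
    """Simulate diff = xbar1 - xbar2 for independent SRSs from two normal populations."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Draw one sample from each population and record the difference of sample means.
        xbar1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        xbar2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(xbar1 - xbar2)
    return statistics.fmean(diffs), statistics.variance(diffs)


# Assumed population values, for illustration only.
mu1, sigma1, n1 = 63, 5, 30
mu2, sigma2, n2 = 60, 3, 40

sim_mean, sim_var = simulate_diff(mu1, sigma1, n1, mu2, sigma2, n2)
theory_mean = mu1 - mu2                           # mu_diff = mu1 - mu2
theory_var = sigma1**2 / n1 + sigma2**2 / n2      # sigma^2_diff

print(f"mean:     simulated {sim_mean:.3f}, theory {theory_mean:.3f}")
print(f"variance: simulated {sim_var:.3f}, theory {theory_var:.3f}")
```

    The simulated values should land close to the theoretical ones, with the gap shrinking as `reps` grows.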

  • SEdiff

    We assumed knowledge of σ1 and σ2 in the previous slide. These are generally unknown, and we approximate them with the sample standard deviations s1 and s2. Substituting these into the expression for σ²diff and taking the square root, we have the standard error of diff:

    $$SE_{\text{diff}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

  • Example

    Group   n    x̄    s
    IN      30   63   5
    KY      40   60   3

    The data for the comparison of the heights of 12 yr olds is given in the table. IN is Pop 1 and KY is Pop 2. Find SEdiff.

    Soln: First find SE²diff.

    $$SE^2_{\text{diff}} = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = \frac{5^2}{30} + \frac{3^2}{40} = 25/30 + 9/40 = 1.0583$$

  • Example continued

    It’s important that you use your calculators correctly. First do the divisions and then do the addition. Now take the square root of the above answer.

    $$SE_{\text{diff}} = \sqrt{1.0583} = 1.0288$$
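
    The calculator steps above (divide, add, then take the square root) can also be written as a short function. This is a minimal sketch; the name `se_diff` is hypothetical and the inputs are the IN and KY values from the table.

```python
import math


def se_diff(s1, n1, s2, n2):
    """Standard error of diff = xbar1 - xbar2 from sample standard deviations."""
    variance_term_1 = s1**2 / n1   # first do the divisions ...
    variance_term_2 = s2**2 / n2
    return math.sqrt(variance_term_1 + variance_term_2)  # ... then add, then take the root


# IN: s = 5, n = 30; KY: s = 3, n = 40
se_heights = se_diff(5, 30, 3, 40)
print(round(se_heights, 4))  # 1.0288
```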

  • Quiz: prob 54, p505

    Group          n     x̄      s
    Intervention   165   4.10   1.19
    Control        212   3.67   1.12

    You should read 53, 54 p505. We will return to the data above. Find SEdiff.

    Soln: First find SE²diff.

    $$SE^2_{\text{diff}} = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = \frac{1.19^2}{165} + \frac{1.12^2}{212} = .0086 + .0059 = .0145$$

    Now take the square root.

    $$SE_{\text{diff}} = \sqrt{.0145} = .1204$$

  • 2 sample hypothesis testing

    The hypothesis for the 2 sample test has the form

    H0 : no difference

    Ha : there is a difference

    For the null, this translates to

    H0 : µ1 = µ2

    or the equivalent

    H0 : µ1 − µ2 = 0

    The assumption then is that there is no difference between the two population means.

  • The test statistic

    Recall that the test statistic is of the form

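    $$z = \frac{\text{stat} - \mu_{\text{stat}}}{\sigma_{\text{stat}}}$$

    Our statistic is diff = x̄1 − x̄2. We found its mean and standard deviation. So

    $$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

    But under the null hypothesis H0 : µ1 − µ2 = 0, this reduces to

    $$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$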

  • The test statistic, final version

    But as we stated before, we use SEdiff to estimate the denominator of the statistic z. Using this and the fact that the numerator is diff gives us the final version of the test statistic.

    $$t = \frac{\text{diff}}{SE_{\text{diff}}}$$

    which has an approximate t distribution with df = the smaller of n1 − 1 and n2 − 1.
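
    As a rough sketch of how the final test statistic and its conservative df fit together, the function below takes the summary statistics of both groups. The name `two_sample_t` is hypothetical, and the inputs are the intervention and control values from the quiz table on slide 7.

```python
import math


def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Return (t, df) for H0: mu1 - mu2 = 0, with df = smaller of n1 - 1 and n2 - 1."""
    diff = xbar1 - xbar2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # SEdiff
    df = min(n1 - 1, n2 - 1)                  # conservative degrees of freedom
    return diff / se, df


# Intervention: n = 165, xbar = 4.10, s = 1.19; Control: n = 212, xbar = 3.67, s = 1.12
t_stat, df = two_sample_t(4.10, 1.19, 165, 3.67, 1.12, 212)
print(f"t = {t_stat:.3f}, df = {df}")
```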

  • Example

    We return to the table on slide 5 of the heights of the 12 year olds. Is there a difference in the heights?

    1. State H0 and Ha.
       Ans: “difference” implies a two tail alternative.

       H0 : µ1 = µ2

       Ha : µ1 ≠ µ2

    2. Find the test statistic t.
       Ans: We found SEdiff = 1.0288. We need to find diff.

       diff = x̄1 − x̄2 = 63 − 60 = 3

  • Example continued

    So

    $$t = \frac{\text{diff}}{SE_{\text{diff}}} = \frac{3}{1.0288} = 2.916$$

    with df the smaller of n1 − 1 = 29 and n2 − 1 = 39, so df = 29.

  • Example continued

    3. Find the P value.
       Ans: Go to table D, row 29 df. There t.005 = 2.756 and t.0025 = 3.038, so

       t.005 < t < t.0025

       So

       2 ∗ .0025 < P < 2 ∗ .005

       (Multiply by 2 since it is two tail). Hence

       .005 < P < .01

    4. Make your decision at the α = .05 level.
       Ans: P < .01 ≤ .05 = α. The test is significant. Reject H0 in favor of the alternative that there is a difference in the heights of the two populations.
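
    The table lookup in step 3 can be sketched in code. This is an illustration only: the list `T_TABLE_DF_29` is an assumption, typed in from a standard t table for 29 df and worth checking against table D, and the helper name `bracket_upper_tail` is hypothetical.

```python
import math

# (upper-tail probability, critical value t*) pairs for df = 29.
# Assumption: copied from a standard t table; verify against table D before relying on it.
T_TABLE_DF_29 = [
    (0.25, 0.683), (0.20, 0.854), (0.15, 1.055), (0.10, 1.311),
    (0.05, 1.699), (0.025, 2.045), (0.02, 2.150), (0.01, 2.462),
    (0.005, 2.756), (0.0025, 3.038), (0.001, 3.396), (0.0005, 3.659),
]


def bracket_upper_tail(t, row):
    """Bound the upper-tail probability P(T > t), for t > 0, using one table row."""
    low, high = 0.0, 0.5          # for t > 0 the upper tail is below one half
    for prob, crit in row:        # probabilities decrease as critical values increase
        if t > crit:
            high = prob           # t is beyond this critical value
        else:
            low = prob            # first critical value that t does not exceed
            break
    return low, high


# Heights example: diff = 3, SEdiff computed from s1 = 5, n1 = 30, s2 = 3, n2 = 40.
t = 3 / math.sqrt(5**2 / 30 + 3**2 / 40)
low, high = bracket_upper_tail(t, T_TABLE_DF_29)
two_tail = (2 * low, 2 * high)    # multiply by 2 for a two tail alternative
print(two_tail)
```

    For a one tail alternative, the bounds `low` and `high` would be used directly without doubling.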

  • Quiz: prob 54 again

    We return to the table on slide 7 of the intervention and control groups. The researcher thinks that the test scores for the intervention group are higher.

    1. State H0 and Ha.
       Ans: “higher” implies a one tail alternative.

       H0 : µ1 = µ2

       Ha : µ1 > µ2

    2. Find the test statistic t.
       Ans: We found SEdiff = .1204. We need to find diff.

       diff = x̄1 − x̄2 = 4.10 − 3.67 = .43

  • Quiz continued

    So

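    $$t = \frac{\text{diff}}{SE_{\text{diff}}} = \frac{.43}{.1204} = 3.571$$

    with df the smaller of n1 − 1 = 164 and n2 − 1 = 211, so df = 164. Table D has no row for 164 df, so we will use the next smaller row, df = 100.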

  • Quiz continued

    3. Find the P value.
       Ans: Go to table D, row 100 df. The largest t∗ value is t.0005. Since t > t.0005,

       P < .0005

    4. Make your decision at the α = .01 level.
       Ans: P < .0005 ≤ .01 = α. The test is significant. Reject H0 in favor of the alternative that the intervention group has higher test scores.

  • Two sample CIs

    Recall that a CI has the form

    statistic - error < parameter < statistic + error

    For the two sample CI

    parameter → µ1 − µ2
    statistic → diff
    error → t∗SEdiff

    t∗ depends on C and the df. So the CI has the form

    diff − t∗SEdiff < µ1 − µ2 < diff + t∗SEdiff

    with df = min{n1 − 1, n2 − 1}.

  • Example

    We once again use the comparison of the heights, slide 5. Find a 95% CI for the difference in the means. We found that

    diff = 3, SEdiff = 1.0288, df = 29.

    Using table D, row 29 df, column 95%, we get t∗ = 2.045, so

    diff − t∗SEdiff < µ1 − µ2 < diff + t∗SEdiff

    3 − 2.045 ∗ 1.0288 < µ1 − µ2 < 3 + 2.045 ∗ 1.0288

    or

    .8961 < µ1 − µ2 < 5.1039

  • Significance

    Recall that the null hypothesis is

    H0 : µ1 − µ2 = 0.

    A point of interest then is “does the CI contain 0”? In our case the answer is no.

    Based on the CI, reject the null hypothesis of no difference against a 2 tail alternative at the α = .05 level since 0 is not contained in the 95% CI.

    This is the same result as the two tail hypothesis test that we conducted.
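
    The decision rule on this slide reduces to a single check. The sketch below is one way to express it; the name `reject_from_ci` is hypothetical, and the two sample intervals are illustrative inputs.

```python
def reject_from_ci(ci_low, ci_high, null_value=0.0):
    """Two tail decision at level alpha = 1 - C: reject H0 when the level C CI excludes null_value."""
    contains_null = ci_low < null_value < ci_high
    return not contains_null


# An interval that excludes 0 leads to rejecting H0: mu1 - mu2 = 0.
decision_excludes_zero = reject_from_ci(0.1911, 0.6689)
# An interval that contains 0 does not.
decision_contains_zero = reject_from_ci(-1, 2)
print(decision_excludes_zero, decision_contains_zero)  # True False
```

    Note that this shortcut applies only to two tail alternatives, with α matched to the confidence level (for example α = .05 with a 95% CI).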

  • Quiz

    We use the data in prob 54 (slide 7). Find a 95% CI for the difference in the means. You found that

    diff = .43, SEdiff = .1204, df = 100.

    1. Find t∗. Ans: df = 100; t∗ = 1.984.

    2. Find error.

       error = t∗SEdiff = 1.984 × .1204 = .2389

  • Quiz continued

    3. Construct the CI.

    .43 − .2389 < µ1 − µ2 < .43 + .2389

    or

    .1911 < µ1 − µ2 < .6689

    4. Do you reject H0 : µ1 = µ2 against Ha : µ1 ≠ µ2 at the α = .05 level based on the CI?

    Ans: Yes. The CI does not contain 0.

  • More Quiz questions

    These are questions (a) and (b) in prob 54.

    (a) The scores on the exam are integers from 0 to 6. Are the data normally distributed?

    Ans: No. The data are discrete and the normal distribution is continuous. A normal variable takes on all values from −∞ to ∞.

    (b) Do you think that it is appropriate to use the T test on these data?

    Ans: The assumption behind the T test is that the data are normally distributed. These data are not. However, the sample sizes are large. The CLT tells us that the approximate distributions of x̄1 and x̄2 are normal. The answer is “Yes”.

  • Yet more Quiz questions

    1. A 98% CI for the difference of two means is

    −1 < µ1 − µ2 < 2

    Do you reject H0 : µ1 = µ2 against Ha : µ1 ≠ µ2 at the α = .02 level based on the CI?

    Ans: No. The CI contains 0.

    2. A test H0 : µ1 = µ2 against Ha : µ1 ≠ µ2 yields a P value of .06. Does a 95% CI contain 0?

    Ans: Yes. α = .05 and P > α. Do not reject H0 ⇒ the CI contains 0.

  • Homework

    p504: 53, 55, 59, 61, 73, 82, 83