Two Population Means Hypothesis Testing and Confidence Intervals With Unknown Standard Deviations 

Two Populations -- Unknown Sigmas


Page 1: Two Populations -- Unknown Sigmas

7/29/2019 Two Populations -- Unknown Sigmas

http://slidepdf.com/reader/full/two-populations-unknown-sigmas 1/39

Two Population Means

Hypothesis Testing and Confidence Intervals

With Unknown Standard Deviations


The Problem

• µ1 or µ2 are unknown

• σ1 and σ2 are not known (the usual case)

OBJECTIVES

• Test whether µ1 > µ2 (by a certain amount)

 – or whether µ1 ≠ µ2

• Determine a confidence interval for the difference in the means: µ1 - µ2


KEY ASSUMPTIONS

Sampling is done from two populations.

 – Population 1 has mean µ1 and variance σ1².

 – Population 2 has mean µ2 and variance σ2².

 – A sample of size n1 will be taken from population 1.

 – A sample of size n2 will be taken from population 2.

 – Sampling is random and both samples are drawn independently.

 – Either the sample sizes will be large or the populations are assumed to be normally distributed.

Random variable $\bar{X}_1$: mean $\mu_1$, standard deviation $\sigma_1/\sqrt{n_1}$, variance $\sigma_1^2/n_1$

Random variable $\bar{X}_2$: mean $\mu_2$, standard deviation $\sigma_2/\sqrt{n_2}$, variance $\sigma_2^2/n_2$


Distribution of X̄1 - X̄2

• Since X̄1 and X̄2 are both assumed to be normal, or the sample sizes n1 and n2 are assumed to be large, then because σ1 and σ2 are unknown, the random variable X̄1 - X̄2 has a:

 – Distribution -- t

 – Mean = µ1 - µ2

 – Standard deviation that depends on whether or not the standard deviations of X1 and X2 (although unknown) can be assumed to be equal

 – Degrees of freedom that also depend on whether or not the standard deviations of X1 and X2 can be assumed to be equal


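Appropriate Standard Deviation For X̄1 - X̄2 When the σ's Are Known

• Recall the appropriate standard deviation for X̄1 - X̄2 is:

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

• Now if σ1 = σ2 we can simply call it σ and write it as:

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

• So if the standard deviations are unknown, we need an estimate for the common variance, σ².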


Estimating σ²

Degrees of Freedom

• If we can assume that the populations have equal variances, then the common variance σ² is estimated by the weighted average of s1² and s2², weighted by their DEGREES OF FREEDOM.

• There are n1 - 1 degrees of freedom from the first sample and n2 - 1 degrees of freedom from the second sample, so

• Total Degrees of Freedom for the hypothesis test or confidence interval = (n1 - 1) + (n2 - 1) = n1 + n2 - 2


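The Appropriate Standard Deviation For X̄1 - X̄2 When the σ's Are Unknown, but Can Be Assumed to Be Equal

• The best estimate for σ² then is the pooled variance, sp²:

$$s_p^2 = \frac{DF_1}{\text{Total DF}}\,s_1^2 + \frac{DF_2}{\text{Total DF}}\,s_2^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}$$

• Thus the best estimates for the variance and standard deviation of X̄1 - X̄2 are:

$$s_{\bar{X}_1 - \bar{X}_2}^2 = s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right), \qquad s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$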
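The pooled-variance arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names `pooled_variance` and `pooled_standard_error` and the sample figures are made up here and do not come from the slides.

```python
import math

def pooled_variance(s1, n1, s2, n2):
    """Weighted average of the two sample variances, weighted by degrees of freedom."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1**2 + df2 * s2**2) / (df1 + df2)

def pooled_standard_error(s1, n1, s2, n2):
    """Estimated standard deviation of the difference in sample means."""
    return math.sqrt(pooled_variance(s1, n1, s2, n2) * (1 / n1 + 1 / n2))

# Hypothetical inputs: s1 = 4 with n1 = 10, and s2 = 6 with n2 = 15.
sp2 = pooled_variance(4.0, 10, 6.0, 15)
se = pooled_standard_error(4.0, 10, 6.0, 15)
print(f"pooled variance = {sp2:.4f}, standard error = {se:.4f}")
```

Because the weights are the degrees of freedom, the larger sample pulls the pooled variance toward its own sample variance.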


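t-Statistic and t-Confidence Interval Assuming Equal Variances

Degrees of Freedom = n1 + n2 - 2

t-Statistic

$$t = \frac{(\text{Point Estimate}) - v}{\text{Standard Error}} = \frac{(\bar{x}_1 - \bar{x}_2) - v}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Here v is the hypothesized value of µ1 - µ2.

Confidence Interval

$$(\text{Point Estimate}) \pm t_{\alpha/2}\,(\text{Standard Error}) = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$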


The Appropriate Standard Deviation For X̄1 - X̄2 When the σ's Are Unknown, And Cannot Be Assumed to Be Equal

• If we cannot assume that the populations have equal variances, then the best estimate for σ1² is s1² and the best estimate for σ2² is s2².

• Thus the best estimates for the variance and standard deviation of X̄1 - X̄2 are:

$$s_{\bar{X}_1 - \bar{X}_2}^2 = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}, \qquad s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$


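t-Statistic and t-Confidence Interval Assuming Unequal Variances

t-Statistic

$$t = \frac{(\text{Point Estimate}) - v}{\text{Standard Error}} = \frac{(\bar{x}_1 - \bar{x}_2) - v}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Confidence Interval

$$(\text{Point Estimate}) \pm t_{\alpha/2}\,(\text{Standard Error}) = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Total Degrees of Freedom

$$DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

Round the resulting value.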
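The degrees-of-freedom formula is easy to mistype by hand, so a small helper may be useful. The sketch below is illustrative only: the name `unequal_variance_df` and the inputs are assumptions, not something the slides define.

```python
def unequal_variance_df(s1, n1, s2, n2):
    """Degrees of freedom for the unequal-variance t-test, before rounding."""
    a = s1**2 / n1  # estimated variance of the first sample mean
    b = s2**2 / n2  # estimated variance of the second sample mean
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

# Hypothetical inputs chosen only to exercise the formula.
df = unequal_variance_df(4.0, 10, 6.0, 15)
print(f"unrounded DF = {df:.3f}, rounded DF = {round(df)}")
```

When both samples have the same size and the same standard deviation, the formula reduces to n1 + n2 - 2, the equal-variance degrees of freedom.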


Testing whether the Variances Can Be Assumed to Be Equal

• The following hypothesis test tests whether or not equal variances can be assumed:

H0: σ1²/σ2² = 1 (They are equal)

HA: σ1²/σ2² ≠ 1 (They are different)

This is an F-test!

If the larger of s1² and s2² is put in the numerator, then the test is:

Reject H0 if $F = s_1^2/s_2^2 > F_{\alpha/2,\,DF_1,\,DF_2}$
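A minimal sketch of this decision rule follows. The function name `f_test_equal_variances` is hypothetical, and the critical value is passed in by the caller because it comes from an F table; the 5.0 used below is a placeholder, not a real table entry.

```python
def f_test_equal_variances(s1, n1, s2, n2, f_critical):
    """Return (F, df_numerator, df_denominator, reject_H0).

    The larger sample variance goes in the numerator, so F >= 1.
    f_critical is the table value F(alpha/2, df_num, df_den).
    """
    v1, v2 = s1**2, s2**2
    if v1 >= v2:
        f, df_num, df_den = v1 / v2, n1 - 1, n2 - 1
    else:
        f, df_num, df_den = v2 / v1, n2 - 1, n1 - 1
    return f, df_num, df_den, f > f_critical

# Hypothetical samples; 5.0 stands in for a value looked up in an F table.
f, df_num, df_den, reject = f_test_equal_variances(2.0, 9, 6.0, 7, 5.0)
print(f"F = {f:.2f} with ({df_num}, {df_den}) DF, reject H0: {reject}")
```

Note that the degrees of freedom swap along with the variances: the numerator DF always belongs to the sample with the larger variance.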


Hypothesis Test/Confidence Interval Approach With Unknown σ's

• Take a sample of size n1 from population 1

 – Calculate x̄1 and s1²

• Take a sample of size n2 from population 2

 – Calculate x̄2 and s2²

• Perform an F-test to determine if the variances can be assumed to be equal

• Perform the Appropriate Hypothesis Test or Construct the Appropriate Confidence Interval


Example 1

Based on the following two random samples,

 – Can we conclude that women on the average score better than men on civil service tests?

 – Construct a 95% confidence interval for the difference in average scores between women and men on civil service tests.

• Because the sample sizes are large, we do not have to assume that test scores have a normal distribution to perform our analyses.

Women

 – Number sampled = 32

 – Sample Average = 75

 – Sample St’d Dev. = 13.92

Men

 – Number sampled = 30

 – Sample Average = 73

 – Sample St’d Dev. = 11.79


Example 1 – F-test

Do an F-test to determine if variances can be assumed to be equal.

H0: σW²/σM² = 1 (Equal Variances)

HA: σW²/σM² ≠ 1 (Unequal Variances)

• Select α = .05.

• Reject H0 (Accept HA) if Larger s²/Smaller s² > F.025,DF(Larger s²),DF(Smaller s²) = F.025,31,29 = 2.09*

(*Note this is F.025,30,29 since the table does not give the value for F.025,31,29.)

Calculation: sW²/sM² = (13.92)²/(11.79)² = 1.39

Since 1.39 < 2.09, we cannot conclude unequal variances.

Do the Equal Variance t-test with 32 + 30 - 2 = 60 degrees of freedom.
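The F-ratio for Example 1 can be checked with a short Python sketch. The variable names are illustrative, and the critical value 2.09 is simply the table value quoted above.

```python
# Sample statistics from Example 1.
s_women, n_women = 13.92, 32
s_men, n_men = 11.79, 30

# Larger sample variance in the numerator.
f_ratio = max(s_women, s_men) ** 2 / min(s_women, s_men) ** 2
f_critical = 2.09  # table value quoted for F.025,30,29

reject_equal_variances = f_ratio > f_critical
print(f"F = {f_ratio:.2f}, reject equal variances: {reject_equal_variances}")
# F = 1.39, reject equal variances: False
```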


Example 1

The Equal Variance t-Test

H0: µW - µM = 0

HA: µW - µM > 0

• Select α = .05.

• Reject H0 (Accept HA) if t > t.05,60 = 1.671

$$s_p^2 = \frac{31}{60}(13.92)^2 + \frac{29}{60}(11.79)^2 = 167.30$$

$$t = \frac{(75 - 73) - 0}{\sqrt{167.30\left(\frac{1}{32} + \frac{1}{30}\right)}} = .608$$

Since .608 < 1.671, we cannot conclude that women average better than men on the tests.
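The pooled variance and t-statistic for Example 1 can be reproduced with the sketch below. It uses only the sample statistics given in the example; the variable names are illustrative.

```python
import math

# Sample statistics from Example 1.
n_w, mean_w, s_w = 32, 75.0, 13.92   # women
n_m, mean_m, s_m = 30, 73.0, 11.79   # men
hypothesized_diff = 0.0              # value of the difference under H0

df = n_w + n_m - 2
sp2 = ((n_w - 1) * s_w**2 + (n_m - 1) * s_m**2) / df
std_error = math.sqrt(sp2 * (1 / n_w + 1 / n_m))
t_stat = ((mean_w - mean_m) - hypothesized_diff) / std_error

print(f"df = {df}, sp^2 = {sp2:.2f}, t = {t_stat:.3f}")
# df = 60, sp^2 = 167.30, t = 0.608
```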

 

  

 

 

  

 

 

  

 


Example 1

95% Confidence Interval

$$(\bar{x}_W - \bar{x}_M) \pm t_{.025,60}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$(75 - 73) \pm 2.000\sqrt{167.30\left(\frac{1}{32} + \frac{1}{30}\right)}$$

2 ± 6.57

-4.57 to 8.57
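A short Python check of this interval follows. It is a sketch with illustrative variable names; the critical value 2.000 is the t.025,60 value used in the calculation above.

```python
import math

# Sample statistics from Example 1.
n_w, mean_w, s_w = 32, 75.0, 13.92
n_m, mean_m, s_m = 30, 73.0, 11.79
t_crit = 2.000  # t.025,60 as used in the example

sp2 = ((n_w - 1) * s_w**2 + (n_m - 1) * s_m**2) / (n_w + n_m - 2)
margin = t_crit * math.sqrt(sp2 * (1 / n_w + 1 / n_m))
diff = mean_w - mean_m
lower, upper = diff - margin, diff + margin

print(f"{diff:.0f} ± {margin:.2f} -> ({lower:.2f}, {upper:.2f})")
# 2 ± 6.57 -> (-4.57, 8.57)
```

Because the interval contains 0, it is consistent with the test result that no difference in average scores can be concluded.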


Example 2

Based on the following random samples of basketball attendances at the Staples Center,

 – Can we conclude that the Lakers average attendance is more than 2000 more than the Clippers average attendance at the Staples Center?

 – Construct a 95% confidence interval for the difference in average attendance between Lakers and Clippers games at the Staples Center.

• Since sample sizes are small, we must assume that attendances at Lakers and Clippers games have normal distributions to perform the analyses.

LA Lakers

 – Number sampled = 13

 – Sample Average = 16,675

 – Sample St’d Dev. = 1014.97

LA Clippers

 – Number sampled = 11

 – Sample Average = 12,009

 – Sample St’d Dev. = 3276.73


Example 2 – F-test

• Do an F-test to determine if variances can be assumed to be equal.

H0: σC²/σL² = 1 (Equal Variances)

HA: σC²/σL² ≠ 1 (Unequal Variances)

Note: Clipper variance is the larger sample variance

• Choose α = .05.

• Reject H0 (Accept HA) if Larger s²/Smaller s² > F.025,DF(Larger variance),DF(Smaller variance) = F.025,10,12 = 3.37

Calculation: sC²/sL² = (3276.73)²/(1014.97)² = 10.42

Since 10.42 > 3.37, we can conclude unequal variances.

Do the Unequal Variance t-test.
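The same F-ratio check for Example 2, as a hedged sketch: the variable names are illustrative, and 3.37 is the table value quoted above for F.025,10,12.

```python
# Sample statistics from Example 2.
s_lakers, n_lakers = 1014.97, 13
s_clippers, n_clippers = 3276.73, 11

# The Clippers variance is larger, so it goes in the numerator.
f_ratio = s_clippers**2 / s_lakers**2
df_num, df_den = n_clippers - 1, n_lakers - 1
f_critical = 3.37  # table value quoted for F.025,10,12

reject_equal_variances = f_ratio > f_critical
print(f"F = {f_ratio:.2f} with ({df_num}, {df_den}) DF, "
      f"reject equal variances: {reject_equal_variances}")
# F = 10.42 with (10, 12) DF, reject equal variances: True
```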


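Degrees of Freedom for the Unequal Variance t-Test

• The degrees of freedom for this test are given by:

$$DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\frac{(1014.97)^2}{13} + \frac{(3276.73)^2}{11}\right)^2}{\frac{\left((1014.97)^2/13\right)^2}{12} + \frac{\left((3276.73)^2/11\right)^2}{10}} = 11.626$$

This is rounded to 12 degrees of freedom.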
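This calculation is tedious by hand, so here is a small Python sketch that reproduces it from the Example 2 statistics. The variable names are illustrative.

```python
# Sample statistics from Example 2 (1 = Lakers, 2 = Clippers).
s1, n1 = 1014.97, 13
s2, n2 = 3276.73, 11

a = s1**2 / n1  # estimated variance of the Lakers sample mean
b = s2**2 / n2  # estimated variance of the Clippers sample mean
df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

print(f"unrounded DF = {df:.3f}, rounded DF = {round(df)}")
```

The result is much closer to the smaller sample's n - 1 = 10 than to the pooled n1 + n2 - 2 = 22, because the Clippers sample contributes nearly all of the variance.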


Example 2 – the t-Test

Proceed to the hypothesis test for the difference in means with unequal variances:

H0: µL - µC = 2000

HA: µL - µC > 2000

• Select α = .05.

• Reject H0 (Accept HA) if t > t.05,12 = 1.782

$$t = \frac{(16{,}675 - 12{,}009) - 2000}{\sqrt{\frac{(1014.97)^2}{13} + \frac{(3276.73)^2}{11}}} = 2.595$$

Since t = 2.595 > 1.782, we can conclude that the Lakers average more than 2000 per game more than the Clippers at the Staples Center.
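The t-statistic for Example 2 can be verified with the following sketch. Variable names are illustrative; the critical value 1.782 is the t.05,12 value used in the example.

```python
import math

# Sample statistics from Example 2.
n_l, mean_l, s_l = 13, 16675.0, 1014.97   # Lakers
n_c, mean_c, s_c = 11, 12009.0, 3276.73   # Clippers
hypothesized_diff = 2000.0                # value of the difference under H0
t_crit = 1.782                            # t.05,12 as used in the example

std_error = math.sqrt(s_l**2 / n_l + s_c**2 / n_c)
t_stat = ((mean_l - mean_c) - hypothesized_diff) / std_error

print(f"t = {t_stat:.3f}, reject H0: {t_stat > t_crit}")
# t = 2.595, reject H0: True
```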


Example 2

95% Confidence Interval

$$(\bar{x}_L - \bar{x}_C) \pm t_{.025,12}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

$$(16{,}675 - 12{,}009) \pm 2.179\sqrt{\frac{(1014.97)^2}{13} + \frac{(3276.73)^2}{11}}$$

4666 ± 2238.47

2427.53 to 6904.47
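A quick Python check of the Example 2 interval, again only as a sketch with illustrative names; 2.179 is the t.025,12 value used above.

```python
import math

# Sample statistics from Example 2.
n_l, mean_l, s_l = 13, 16675.0, 1014.97   # Lakers
n_c, mean_c, s_c = 11, 12009.0, 3276.73   # Clippers
t_crit = 2.179                            # t.025,12 as used in the example

diff = mean_l - mean_c
margin = t_crit * math.sqrt(s_l**2 / n_l + s_c**2 / n_c)
lower, upper = diff - margin, diff + margin

print(f"{diff:.0f} ± {margin:.2f} -> ({lower:.2f}, {upper:.2f})")
# 4666 ± 2238.47 -> (2427.53, 6904.47)
```

The whole interval lies above 2000, which agrees with the one-tail test conclusion.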


Excel Approach

F-test, t-test Assuming Equal Variances, t-test Assuming Unequal Variances are all

found in Data Analysis. 

• Excel only performs a one-tail F-test.

 – Multiply this 1-tail p-value by 2 to get the p-

value for the 2-tail F-test.

• Formulas must be entered for the LCL and

UCL of the confidence intervals.

 – All values for these formulas can be found in

the Equal or Unequal Variance t-test Output.


Inputting/Interpreting Results

From Hypotheses Tests

• Express H0 and HA so that the number on the right side is positive (or 0)

• The p-value returned for the two-tailed test will always be correct.

• The p-value returned for the one-tail test is usually correct. It is correct if:

 – HA is a “> test” and the t-statistic is positive

   • This is the usual case

   • If t < 0, the true p-value is 1 – (p-value printed by Excel)

 – HA is a “< test” and the t-statistic is negative

   • This is the usual case

   • If t > 0, the true p-value is 1 – (p-value printed by Excel)
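The one-tail correction rule can be written as a small helper. This is a sketch of the rule as stated above, not an Excel feature: the function name `one_tail_p_value` and the p-values in the usage lines are hypothetical.

```python
def one_tail_p_value(excel_one_tail_p, t_stat, alternative):
    """Adjust the one-tail p-value printed by Excel.

    alternative is ">" or "<", the direction of HA. The printed value is kept
    when the sign of t agrees with HA; otherwise the true value is 1 - printed.
    """
    if alternative not in (">", "<"):
        raise ValueError("alternative must be '>' or '<'")
    agrees = t_stat > 0 if alternative == ">" else t_stat < 0
    return excel_one_tail_p if agrees else 1 - excel_one_tail_p

# Hypothetical printed p-value of 0.27 for illustration.
print(one_tail_p_value(0.27, 0.608, ">"))    # t agrees with HA: value kept
print(one_tail_p_value(0.27, -0.608, ">"))   # t disagrees with HA: 1 - value
```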


Excel For Example 1 – F-Test

Go to Tools

Select Data Analysis

Select F-Test Two-Sample For Variances


Example 1 – F-Test (Cont’d) 

Use Women (Column A) for Variable Range 1

Use Men (Column B) for Variable Range 2 

Check Labels

Designate first cell

for output.


Example 1 – F-Test (Cont’d) 

p-value for

one-tail test


Example 1 – F-Test (Cont’d) 

p-value for

one-tail test

=2*D9

Multiply the one-tail p-value by 2 to get the 2-tail p-value.

High p-value (.371671)

Cannot conclude Unequal Variances

Use Equal Variance t-test 


Example 1 – t-Test

Go to Tools

Select Data Analysis

Select t-Test: Two-Sample Assuming Equal Variances


Example 1 – t-Test (Cont’d) 

Since HA is µW - µM > 0, enter

Column A for Range 1

Column B for Range 2

0 for Hypothesized Mean Difference

Check Labels

Designate first cell for output.


Example 1 – t-test (Cont’d) 

p-value for

the one-tail “>” test 

p-value for a two-tail “≠” test

High p-value for 1-tail test!

Cannot conclude average

women’s score >

average men’s score 


Example 1 – 95% Confidence Interval

=(D15-E15)-TINV(.05,D20)*SQRT(D18*(1/D17+1/E17))

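The formula computes the lower limit $(\bar{x}_1 - \bar{x}_2) - t_{.025,DF}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$.

Highlight Cell G19

Add $ Signs Using F4 key

Drag to cell G20

Change “-” to “+”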


Excel For Example 2 – F-Test

Go to Tools

Select Data Analysis

Select F-Test Two-Sample For Variances


Example 2 – F-Test (Cont’d) 

Use Lakers (Column B) for Variable Range 1

Use Clippers (Column D) for Variable Range 2 

Check Labels

Designate first cell

for output.


Example 2 – F-Test (Cont’d) 

Enter =2*F9

to give the p-value

for the two-tailed test

p-value for

one-tail test

Low p-value (.000352) – Can conclude Unequal Variances

Use Unequal Variance t-test 


Example 2 – t-Test

Go to Tools

Select Data Analysis

Select

t-Test: Two Sample Assuming Unequal Variances


Example 2 – t-Test (Cont’d) 

Since HA is µL - µC > 2000, enter

Column B for Range 1

Column D for Range 2

2000 for Hypothesized Mean Difference

Check Labels

Designate first cell for output.


Example 2 – t-test (Cont’d) 

Low p-value for 1-tail test

(compared to α = .05)!

Can conclude the Lakers average

more than 2000 more people per

game than the Clippers.

p-value for

the one-tail “>” test 

p-value for a two-tail “≠” test


Example 2 – 95% Confidence Interval

=(F15-G15)-TINV(.05,F19)*SQRT(F16/F17+G16/G17)

The formula computes the lower limit $(\bar{x}_1 - \bar{x}_2) - t_{.025,DF}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.

Highlight Cell I14

Add $ Signs Using F4 key

Drag to cell I15

Change “-” to “+”


Review

• Standard Errors and Degrees of Freedom when:

 – Variances are assumed equal

 – Variances are not assumed equal

• F-statistic to determine if variances differ

• t-statistic and confidence interval when:

 – Variances are assumed equal

 – Variances are not assumed equal

• Hypothesis Tests/Confidence Intervals for Differences in Means (Assuming Equal or Unequal Variances)

 – By hand

 – By Excel