Chapter 10: STATISTICAL INFERENCE FOR TWO SAMPLES …homepage.stat.uiowa.edu/~rdecook/stat2020/notes/ch10_pt1.pdf · Chapter 10: STATISTICAL INFERENCE FOR TWO SAMPLES Part 1: Hypothesis

Chapter 10: STATISTICAL INFERENCE FOR TWO SAMPLES

Part 1: Hypothesis tests on µ1 − µ2 for independent groups

Sections 10-1 & 10-2: Independent Groups

• It is common to compare two groups, and do a hypothesis test regarding the parameters of the groups.

• We will discuss two data collection designs in this chapter, and we will discuss how the design choice affects how we analyze the data.

– Number of accidents for the 9-5pm work shift compared to number of accidents for the midnight-8am work shift. (Comparison of means, µ1 vs. µ2)

– The weights of individuals before a diet starts compared to the weights of the individuals after 10 weeks on the diet. (Comparison of means, µ1 vs. µ2)

This year, we will not cover comparison of proportions, but they are also common:

– Proportion of defects in a device manufactured under process 1 vs. proportion of defects in a device manufactured under process 2. (Comparison of proportions, p1 vs. p2)

– Satisfied customer rate for AT&T cellular compared to satisfaction rate for T-mobile. (Comparison of proportions, p1 vs. p2)

• Hypothesis testing for comparison of means:

H0 : µ1 = µ2 ⇒ H0 : µ1 − µ2 = 0
H1 : µ1 ≠ µ2 ⇒ H1 : µ1 − µ2 ≠ 0

• When comparing the means of two samples (X̄1 vs. X̄2), we must determine if the data from the groups are totally independent, or if they are related, because this affects the type of analysis performed (more on this later).

• We start with independent groups (sections 10-1 to 10-2)...

———————————————————

• Example: Time spent exercising for males and females.

QUESTION: Do males and females at UI spend the same amount of time, on average, at the UI fitness center?

X = Time in minutes spent at the fitness center.

Women (n1 = 15)        Men (n2 = 15)
63, 32, 86, 53, 49     52, 75, 74, 68, 93
73, 39, 56, 45, 67     77, 41, 87, 72, 53
49, 51, 65, 54, 56     84, 65, 66, 69, 62

Women: x̄1 = 55.87 minutes, s1 = 13.527 minutes
Men: x̄2 = 69.20 minutes, s2 = 13.790 minutes

The sample of females spends, on average, 13.33 minutes less time at the fitness center than this sample of males.

x̄1 − x̄2 = 55.87 − 69.20 = −13.33 minutes
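The summary statistics above can be recomputed from the raw data. The following is a minimal Python sketch using only the standard library; the variable names (`women`, `men`, `xbar1`, and so on) are chosen here for illustration, and `stdev` is the sample standard deviation (divisor n − 1).

```python
from statistics import mean, stdev

# Fitness-center times (minutes), copied from the data table in the notes.
women = [63, 32, 86, 53, 49, 73, 39, 56, 45, 67, 49, 51, 65, 54, 56]
men = [52, 75, 74, 68, 93, 77, 41, 87, 72, 53, 84, 65, 66, 69, 62]

xbar1, xbar2 = mean(women), mean(men)
s1, s2 = stdev(women), stdev(men)  # sample SDs (divide by n - 1)
diff = xbar1 - xbar2               # group 1 (women) minus group 2 (men)

print(f"women: n={len(women)}, mean={xbar1:.2f}, sd={s1:.3f}")
print(f"men:   n={len(men)}, mean={xbar2:.2f}, sd={s2:.3f}")
print(f"difference in sample means: {diff:.2f}")
```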

• Is this difference in sample means large enough to say that µ1 ≠ µ2?

• As before, to answer this from a statistical viewpoint, we need to consider the chance of getting a sample difference this large, even when the two populations actually spend the same amount of time, on average, i.e. when µ1 − µ2 = 0.

• To get at this probability, we need to know the ‘behavior’ of the difference in sample means X̄1 − X̄2.

• We looked at the distribution of this random variable (X̄1 − X̄2) in chapter 7, and we will revisit it here.
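One way to get a feel for this ‘behavior’ is to simulate it. The sketch below is only an illustration under stated assumptions: both populations are taken to be normal with the same mean (so H0 is true) and a common standard deviation that is treated as known. The values `MU = 60` and `SIGMA = 13.66` are assumed for the simulation and are not estimates endorsed by the notes. Because σ is treated as known, the simulated tail share illustrates the idea of "how often a difference this large occurs by chance" and is not meant to reproduce a t-test p-value.

```python
import random
from statistics import stdev

random.seed(0)  # fixed seed so the sketch is repeatable

N1 = N2 = 15    # group sizes from the example
SIGMA = 13.66   # ASSUMED common population SD (illustrative value)
MU = 60.0       # ASSUMED common population mean; H0 is true in this simulation
REPS = 20_000   # number of simulated studies

def one_difference():
    """Draw one sample per group under H0 and return xbar1 - xbar2."""
    g1 = [random.gauss(MU, SIGMA) for _ in range(N1)]
    g2 = [random.gauss(MU, SIGMA) for _ in range(N2)]
    return sum(g1) / N1 - sum(g2) / N2

diffs = [one_difference() for _ in range(REPS)]

# Theory: SD of (Xbar1 - Xbar2) is sqrt(sigma^2/n1 + sigma^2/n2).
theory_sd = (SIGMA**2 / N1 + SIGMA**2 / N2) ** 0.5
frac_extreme = sum(abs(d) >= 13.33 for d in diffs) / REPS

print(f"simulated SD of differences:    {stdev(diffs):.2f}")
print(f"theoretical SD:                 {theory_sd:.2f}")
print(f"share of |difference| >= 13.33: {frac_extreme:.4f}")
```

The simulated standard deviation should sit close to the theoretical value, and differences as large as the observed 13.33 minutes should be rare when the population means are truly equal.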


• Difference in sample means:

1. If $\sigma_1^2$ and $\sigma_2^2$ are known, and the distribution of values from both groups is normal, we have

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

and

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where Z has a N(0,1) distribution.

If the original distributions are not normal but n1 and n2 are large, the above will also hold approximately, by the central limit theorem.

• Difference in sample means:

2. If $\sigma_1^2$ and $\sigma_2^2$ are NOT known, we will have to estimate them. IF IT IS REASONABLE TO ASSUME BOTH GROUPS HAVE A COMMON $\sigma^2$, we will pool the information from both groups to estimate this common $\sigma^2$.

The pooled estimator of $\sigma^2$, denoted by $S_p^2$, is defined by

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

This value estimates both $\sigma_1^2$ and $\sigma_2^2$ because $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

$S_1^2$ is the sample variance from group 1.

$S_2^2$ is the sample variance from group 2.

$S_p^2$ is a weighted average of the two sample variances.

If the distribution of values from both groups is normal, we have the random variable T as

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

and T has a $t_{n_1+n_2-2}$ distribution, i.e. a t-distribution with n1 + n2 − 2 degrees of freedom.
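The "weighted average" description of the pooled estimator can be made explicit by splitting the defining fraction into two terms. The weights $w_1$ and $w_2$ below are notation introduced here for illustration; they do not appear in the notes.

```latex
S_p^2 = w_1 S_1^2 + w_2 S_2^2,
\qquad
w_1 = \frac{n_1 - 1}{n_1 + n_2 - 2},
\quad
w_2 = \frac{n_2 - 1}{n_1 + n_2 - 2},
\quad
w_1 + w_2 = 1 .
```

The group with more observations gets more weight. When n1 = n2, both weights equal 1/2 and the pooled variance is the simple average of the two sample variances.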


• Difference in sample means:

3. If $\sigma_1^2$ and $\sigma_2^2$ are NOT known, AND they have different variances, then we should not take a pooled estimate of the variability; we should instead leave them separate as $S_1^2$ and $S_2^2$. In this case we have the random variable T as

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

where T has approximately a $t_\nu$ distribution, and the degrees of freedom ν for the t distribution is given by

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$

This is known as Welch’s approximate t.

• Thus, in working with X̄1 − X̄2 to make an inference on µ1 − µ2:

– if we know $\sigma_1^2$ and $\sigma_2^2$, we will use a Z-distribution.

– if we don’t know them and we think $\sigma_1^2 = \sigma_2^2$, we will use a t-distribution with n1 + n2 − 2 degrees of freedom and use a pooled estimate of the common variance, $S_p^2$.

– if we don’t know them and we think $\sigma_1^2 \neq \sigma_2^2$, we will use a t-distribution with ν degrees of freedom (ugly but useful; formula on previous slide) and have separate estimates for the variances, $S_1^2$ and $S_2^2$.

• Back to the example, where we will utilize a hypothesis test...

QUESTION: Do males and females at UI spend the same amount of time, on average, at the UI fitness center?

1. H0 : µ1 = µ2 ⇒ H0 : µ1 − µ2 = 0
   H1 : µ1 ≠ µ2 ⇒ H1 : µ1 − µ2 ≠ 0

Group 1 is the females, group 2 is the males.

2. TEST STATISTIC:
We have a small sample from each group and $\sigma_1^2$ and $\sigma_2^2$ are not given to us. We will use a t-statistic.

We will assume the groups have a common variance $\sigma^2$ (we can check this assumption later).

n1 = 15, x̄1 = 55.87 min., s1 = 13.527 min.
n2 = 15, x̄2 = 69.20 min., s2 = 13.790 min.

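Pooled estimate of common $\sigma^2$:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{14 \times 13.527^2 + 14 \times 13.790^2}{15 + 15 - 2} = 186.572$$

and $s_p = \sqrt{186.572} = 13.659$.

The observed test statistic under H0 true:

$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(-13.33) - (0)}{13.659\sqrt{\dfrac{1}{15} + \dfrac{1}{15}}} = -2.67$$

and T0 ∼ t28 under H0 true (n1 + n2 − 2 = 28).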

3. P-VALUE:
Under H0 true, T0 ∼ t28.

Compute P(T0 ≤ −2.67) = 0.0062 (from software, not your t-table).

This is a 2-sided test, so P-value = 2 × 0.0062 = 0.0124.

4. DECISION:
Because the P-value is < α = 0.05, we reject H0. There IS statistically significant evidence that the mean time spent at the fitness center is not the same for men and women.

5. CHECK ANY ASSUMPTIONS:
With the T test statistic (unknown $\sigma^2$), we’ll check that the original distributions are nearly normal.

[Figure: normal probability plots, one for Women and one for Men]

We could also check the constant variance assumption (we note that s1 and s2 are very similar, but there are specific tests and plots we can use to check this).

We should also make sure that we have independent random samples from the two populations we were interested in.

There are many choices for statistical software, Minitab being one of them. Here is the output from this analysis performed in the freely available software called R:

    > t.test(women, men, var.equal=TRUE)

    Two Sample t-test

    data: women and men
    t = -2.6733, df = 28, p-value = 0.01239
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval of difference:
    -23.55011 -3.11656
    sample estimates:
    mean of x mean of y
     55.86667  69.20000
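As a cross-check on the hand calculation and the R output, here is a minimal Python sketch of the same pooled two-sample t-test using only the standard library. Because the standard library has no t distribution, the p-value is obtained by integrating the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` are made up for this sketch, and in practice a statistics package would normally be used instead.

```python
import math
from statistics import mean, variance

women = [63, 32, 86, 53, 49, 73, 39, 56, 45, 67, 49, 51, 65, 54, 56]
men = [52, 75, 74, 68, 93, 77, 41, 87, 72, 53, 84, 65, 66, 69, 62]

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the Simpson's-rule integral of the density on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n1, n2 = len(women), len(men)
df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances.
sp2 = ((n1 - 1) * variance(women) + (n2 - 1) * variance(men)) / df
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t0 = (mean(women) - mean(men) - 0) / se  # hypothesized difference is 0
p_value = 2 * t_upper_tail(t0, df)       # two-sided test

print(f"t = {t0:.4f}, df = {df}, p-value = {p_value:.5f}")
```

The statistic computed from the unrounded data agrees with the R output; the hand calculation differs only through rounding of the summary statistics.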


Sometimes we want to do a test such as

H0 : µ1 − µ2 = ∆0

where we’re interested in a specific difference ∆0 between the means...

• Example: Viscosity

Fifteen batches of polymer are manufactured under the present process and the viscosity is measured:

724, 718, 776, 760, 745, 759, 795, 756, 742, 740, 761, 749, 739, 747, 742

A process change is made and eight batches are manufactured and the viscosity is measured:

755, 785, 729, 775, 783, 760, 738, 795

From a long history of viscosity measurements, they know the variability of viscosity is fairly stable, and they know σ = 20. They also know the viscosity measurements are normally distributed.

They would like to detect it if the mean viscosity of the new process is more than 10 units above the old mean (this could cause problems in manufacturing down the line).

Perform a hypothesis test at the α = 0.10 level.

Because σ is known, the test statistic will be a Z-statistic.

x̄2 = 765 and x̄1 = 750.2 and x̄2 − x̄1 = 14.8

where group 2 is from the new manufacturing process.

The difference in sample means IS more than 10 units, but have we collected enough data to feel fairly confident that the population means are more than 10 units apart?

1. Hypotheses:
H0 : µ2 − µ1 = 10
H1 : µ2 − µ1 > 10

where µ2 is the average viscosity for the new process.

2. Test statistic:

$$z_0 = \frac{(\bar{x}_2 - \bar{x}_1) - (\mu_2 - \mu_1)}{\sqrt{\dfrac{\sigma_2^2}{n_2} + \dfrac{\sigma_1^2}{n_1}}} = \frac{(14.8) - (10)}{\sqrt{\dfrac{20^2}{8} + \dfrac{20^2}{15}}} = 0.55$$

3. P-value:
P(Z > 0.55) = 0.2912 {1-sided test}

4. Decision:
p-value = 0.2912 is not less than α = 0.10. We fail to reject H0.

5. We were given info that viscosity followed a normal distribution.

There is NOT statistically significant evidence that the mean of the new process is more than 10 units above the mean of the old process.

------------------------------------------------------------

If the true difference in means is this large (14.8), increasing n will make this apparent and we would eventually reject. If the true difference is actually less than 10, this would also become apparent as we collect more data (we’ll home in on the truth with larger n).
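The viscosity test can be reproduced with a short Python sketch. It uses `statistics.NormalDist` from the standard library for the normal tail probability; the variable names (`old`, `new`, `delta0`) are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

# Viscosity measurements copied from the example.
old = [724, 718, 776, 760, 745, 759, 795, 756, 742, 740, 761, 749, 739, 747, 742]
new = [755, 785, 729, 775, 783, 760, 738, 795]

sigma = 20.0   # known population SD, same for both processes
delta0 = 10.0  # hypothesized difference mu2 - mu1 under H0
alpha = 0.10

diff = mean(new) - mean(old)                          # xbar2 - xbar1
se = sqrt(sigma**2 / len(new) + sigma**2 / len(old))  # SD of Xbar2 - Xbar1
z0 = (diff - delta0) / se
p_value = 1 - NormalDist().cdf(z0)                    # upper tail: H1 is mu2 - mu1 > 10

print(f"xbar2 - xbar1 = {diff:.1f}, z0 = {z0:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The p-value from this sketch is about 0.292 rather than 0.2912 because it uses the unrounded z0 instead of 0.55; the decision is unchanged.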


• Example: Differential Gene Expression

A gene that shows differential expression between two groups can be very informative from a biological perspective.

Comparisons:
– Cancer vs. healthy patients
– Fast running mice vs. lazy mice
– Obese individuals vs. healthy weight individuals
– High yield plants vs. low yield plants

In a plant genetics study, the gene called At5g50550 in the Arabidopsis plant showed the following two sample expression distributions for the genetic lines called Columbia and Landsberg.

Perform a hypothesis test for differential expression (i.e. for non-equality of mean expression).

The numerical summaries (expression values have been normalized):

Columbia: n1 = 19, x̄1 = 0.60, $s_1^2 = 0.017$
Landsberg: n2 = 11, x̄2 = −0.16, $s_2^2 = 0.042$

We do not know the population variances, so we will use a t-statistic.

We will NOT assume a common variance, so we will use Welch’s approximate t and get the degrees of freedom ν from...

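$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{0.017}{19} + \dfrac{0.042}{11}\right)^2}{\dfrac{\left(0.017/19\right)^2}{18} + \dfrac{\left(0.042/11\right)^2}{10}} = 14.78$$

and since more degrees of freedom means more information, we don’t want to imply we have more info than we do, so we round down to 14 (to be conservative).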
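The degrees-of-freedom formula is easy to get wrong by hand, so a small helper is a convenient check. This is a minimal Python sketch; the function name `welch_df` is hypothetical and simply encodes the formula above.

```python
import math

def welch_df(s1_sq, n1, s2_sq, n2):
    """Welch's approximate degrees of freedom from sample variances and sizes."""
    a = s1_sq / n1  # estimated variance of Xbar1
    b = s2_sq / n2  # estimated variance of Xbar2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

# Columbia (group 1) and Landsberg (group 2) summaries from the example.
nu = welch_df(0.017, 19, 0.042, 11)
nu_conservative = math.floor(nu)  # round down, as the notes recommend

print(f"nu = {nu:.2f}, rounded down to {nu_conservative}")
```

The result always lies between min(n1, n2) − 1 and n1 + n2 − 2, which gives a quick sanity check: here it must fall between 10 and 28.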


1. Hypotheses:
H0 : µ1 = µ2 {equal expression}
H1 : µ1 ≠ µ2

where µ1 is the average gene expression for Columbia.

2. Test statistic:

$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(0.60 - (-0.16)) - (0)}{\sqrt{\dfrac{0.017}{19} + \dfrac{0.042}{11}}} = 11.07$$

And under H0 true, T0 ∼ t14.

3. P-value:

2 × P(T0 > 11.07) ≈ 0. Very small.

4. Decision:
Reject H0. There is strong, statistically significant evidence that these two groups have different mean gene expression for this gene.

5. Since we used a t-statistic, we should check normality of the two distributions (normal probability plots not shown here). There was one outlier in the Landsberg group which may be of some concern.
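To see just how small "very small" is, the Welch test above can be sketched in Python from the summary statistics alone. As before, this uses only the standard library and integrates the t density numerically (Simpson's rule) to approximate the p-value; the helper names `t_pdf` and `two_sided_p` are made up for this sketch, and the degrees of freedom are taken as the rounded-down value 14 used in the notes.

```python
import math

# Summary statistics from the example (group 1 = Columbia, group 2 = Landsberg).
n1, xbar1, s1_sq = 19, 0.60, 0.017
n2, xbar2, s2_sq = 11, -0.16, 0.042
df = 14  # Welch degrees of freedom, rounded down as in the notes

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=4000):
    """2 * P(T > |t|), using Simpson's rule for the density integral on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * (0.5 - total * h / 3)

se = math.sqrt(s1_sq / n1 + s2_sq / n2)  # variances kept separate (no pooling)
t0 = (xbar1 - xbar2 - 0) / se            # hypothesized difference is 0
p_value = two_sided_p(t0, df)

print(f"t0 = {t0:.2f}, two-sided p-value = {p_value:.2e}")
```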


Comparison of two independent groups

also called...

Two-sample t-test (for H0 : µ1 − µ2 = ∆0)

...but if we know σ we actually do a Z-test.

• This type of test is performed when the measurements in the first group are independent of the measurements in the second group.

• In these comparative experiments, we should have a simple random sample from each population (or group).

Summary of which test statistic to use:

1. If $\sigma_1^2$ and $\sigma_2^2$ are KNOWN, we’ll use a Z-statistic.

(We should have the original distributions be normal, or n large enough for the X̄’s to be normal.)

2. If $\sigma_1^2$ and $\sigma_2^2$ are NOT KNOWN, we’ll use a t-statistic:

(a) If it is reasonable to assume both groups have A COMMON $\sigma^2$, we will pool the information from both groups to estimate this common $\sigma^2$ with $S_p^2$, and the degrees of freedom for the t is (n1 + n2 − 2).

(b) If the groups DO NOT HAVE A COMMON $\sigma^2$, we should not pool the information for a common estimate of $\sigma^2$. We will instead keep separate estimates for the variances as $S_1^2$ and $S_2^2$, and the degrees of freedom for the t will be ν, where ν comes from Welch’s approximate t degrees of freedom formula.

100(1−α)% Confidence interval for µ1 − µ2

• The point estimate for µ1 − µ2 is x̄1 − x̄2.

• We can form a 100(1−α)% confidence interval for the difference in parameters µ1 − µ2 using the same criteria as on the previous pages:

1. If $\sigma_1^2$ and $\sigma_2^2$ are KNOWN and two independent samples are taken from two normal distributions

$$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

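2. If $\sigma_1^2$ and $\sigma_2^2$ are NOT KNOWN and two independent samples are taken from two normal distributions

(a) with a common $\sigma^2$

$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,n_1+n_2-2} \cdot s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

(b) without a common $\sigma^2$

$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where ν is from Welch’s approximate t degrees of freedom.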
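As an illustration of the known-variance interval (case 1), the sketch below computes a 90% confidence interval for the viscosity example. Two choices here are assumptions made for the illustration and are not stated in the notes: the confidence level (90%, matching α = 0.10) and the orientation µ2 − µ1 (new minus old), which mirrors the hypothesis test for that example. The t-based intervals in case 2 have the same shape but need a t quantile from a table or software in place of `z_crit`.

```python
from math import sqrt
from statistics import NormalDist

# Viscosity example: group 1 = old process, group 2 = new process.
n1, xbar1 = 15, 750.2
n2, xbar2 = 8, 765.0
sigma = 20.0  # known SD for both processes
alpha = 0.10  # ASSUMED level for this illustration (90% interval)

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
se = sqrt(sigma**2 / n1 + sigma**2 / n2)
diff = xbar2 - xbar1                          # point estimate of mu2 - mu1

lower, upper = diff - z_crit * se, diff + z_crit * se
print(f"{100 * (1 - alpha):.0f}% CI for mu2 - mu1: ({lower:.2f}, {upper:.2f})")
```

The interval is wide relative to the 14.8-unit point estimate, which is consistent with the earlier conclusion that the data do not pin down the difference well enough to claim it exceeds 10.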


Comparison of two independent groups

... we do a Two-sample t-test (for H0 : µ1 − µ2 = ∆0)

• This type of test is performed when the measurements in the first group are independent of the measurements in the second group.

• The µ1 − µ2 hypothesis examples so far fit this scenario:

– Time exercising for males and females

∗ n1 = 15 and n2 = 15, and the men and women chosen didn’t have anything in common; they were independent

– Viscosity from old process and new process

∗ n1 = 15 and n2 = 8, and the two groups of measurements were taken independently of each other

– Differential gene expression in Columbia and Landsberg

∗ n1 = 19 and n2 = 11 for two independent groups of plants

If the two groups are not independent and we have paired data, we will perform a Paired t-test... next section.
