Two sample tests


Study designs

• Single sample, compare two sub-samples (cross-sectional survey)

• Compare samples from two different populations (two cross-sectional surveys; case-control study)

• Single sample; subjects randomly allocated to different interventions (experiment, clinical trial)

Simple randomization

1. Generate n uniform (0,1) random deviates u_i.
2. If u_i < 0.5, assign intervention A to unit i; if u_i ≥ 0.5, assign B.
3. Note n_A is a random variable with E(n_A) = n/2.
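A minimal sketch of this scheme in Python (numpy assumed; the function name and group labels are illustrative):

```python
import numpy as np

def simple_randomization(n, seed=None):
    """Assign each of n units to A or B using independent U(0,1) deviates."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    # u_i < 0.5 -> A, otherwise B; n_A is random with E(n_A) = n/2
    return np.where(u < 0.5, "A", "B")
```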

Restricted randomization

1. Generate a U(0,1) deviate, u_i, for each unit in the sample.
2. Sort the deviates from smallest to largest.
3. Assign intervention A to the units with the n/2 smallest u_i's.
4. Note this results in half of the sample assigned to each intervention; i.e., n_A is fixed.
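A corresponding sketch of restricted randomization (again numpy assumed; names illustrative):

```python
import numpy as np

def restricted_randomization(n, seed=None):
    """Assign exactly n/2 of n units to each intervention (n assumed even)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    labels = np.full(n, "B")
    labels[np.argsort(u)[: n // 2]] = "A"  # units with the n/2 smallest u_i get A
    return labels  # n_A is fixed at n/2
```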

Suppose we have N subjects. How many should be allocated to each group? Let $N = n_1 + n_2$.

$$V(\bar{Y}_1 - \bar{Y}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{N - n_1}.$$

Find $n_1$ that minimizes $V(\bar{Y}_1 - \bar{Y}_2)$:

$$\frac{dV}{dn_1} = -\frac{\sigma_1^2}{n_1^2} + \frac{\sigma_2^2}{(N - n_1)^2} = 0 \;\Rightarrow\; \frac{n_1}{n_2} = \frac{\sigma_1}{\sigma_2}.$$

If we randomly assign subjects to interventions, it is reasonable to assume $\sigma_1^2 = \sigma_2^2$. Then the optimum allocation is $n_1 = n_2 = N/2$.
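A quick symbolic check of this derivation (a sketch using sympy; variable names mirror the formulas above):

```python
import sympy as sp

n1, N, sigma1, sigma2 = sp.symbols("n1 N sigma1 sigma2", positive=True)
V = sigma1**2 / n1 + sigma2**2 / (N - n1)   # V(Ybar1 - Ybar2) with n2 = N - n1
roots = sp.solve(sp.Eq(sp.diff(V, n1), 0), n1)
print(roots)  # the admissible root (0 < n1 < N) is N*sigma1/(sigma1 + sigma2),
              # i.e. n1/n2 = sigma1/sigma2; equal sigmas give n1 = n2 = N/2
```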

Two independent samples, normally distributed data

$Y_{1i},\; i = 1,2,3,\dots,n_1$; $\quad Y_{2j},\; j = 1,2,3,\dots,n_2$.

$$\bar{Y}_k \sim N(\mu_k,\, \sigma_k^2 / n_k), \quad k = 1, 2.$$

If X and Y are two random variables and a and b are constants, then

$$E(aX + bY) = aE(X) + bE(Y), \quad\text{and}$$

$$V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,\mathrm{cov}(X, Y).$$

Therefore, taking a = 1 and b = −1 (with $\mathrm{cov}(\bar{Y}_1, \bar{Y}_2) = 0$ for independent samples), $\bar{Y}_1 - \bar{Y}_2$ is approximately distributed as

$$N\!\left(\mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right),$$

so

$$\frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \sim N(0, 1),$$

but

$$\frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$

does not follow a t-distribution.

Additional assumption: $\sigma_1^2 = \sigma_2^2 = \sigma^2$. Then $s_1^2$ estimates $\sigma^2$ and $s_2^2$ estimates $\sigma^2$.

Pooled estimate (weighted average):

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$$

Then

$$\frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}.$$
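A hand-rolled version of the pooled statistic (a sketch; numpy assumed, function name illustrative):

```python
import numpy as np

def pooled_t(y1, y2):
    """Two-sample t statistic with pooled variance (assumes equal variances)."""
    y1, y2 = np.asarray(y1, float), np.asarray(y2, float)
    n1, n2 = len(y1), len(y2)
    sp2 = ((n1 - 1) * y1.var(ddof=1) + (n2 - 1) * y2.var(ddof=1)) / (n1 + n2 - 2)
    t = (y1.mean() - y2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # statistic and its degrees of freedom
```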

Example: An experiment was conducted to see if a drug could prevent premature birth. 30 women at risk of premature birth were assigned to take the drug or a placebo (15 in each group). Outcome: birthweight.

$H_0$: $\mu_1 = \mu_2$ (1 = drug; 2 = placebo)
$H_A$: $\mu_1 > \mu_2$

$\alpha = .05$, df = 28,

$C = \{t : t \geq t_{.05,28}\} = \{t : t \geq 1.7\}$.

Birthweights (lbs.)

Drug  Placebo
6.9   6.4
7.6   6.7
7.3   5.4
7.6   8.2
6.8   5.3
7.2   6.6
8.0   5.8
5.5   5.7
5.8   6.2
7.3   7.1
8.2   7.0
6.9   6.9
6.8   5.6
5.7   4.2
8.6   6.8

$\bar{y}_1 = 7.08$; $s_1 = 0.899$
$\bar{y}_2 = 6.26$; $s_2 = 0.961$

$$s_p^2 = \frac{14(.899)^2 + 14(.961)^2}{28} = 0.8659, \qquad s_p = 0.931.$$

$$t = \frac{7.08 - 6.26}{0.931\sqrt{\frac{1}{15} + \frac{1}{15}}} = 2.412.$$

$t \in C$, so reject $H_0$; $p \approx .01$.
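The same result via scipy (the one-sided `alternative` argument assumes scipy >= 1.6):

```python
from scipy import stats

drug = [6.9, 7.6, 7.3, 7.6, 6.8, 7.2, 8.0, 5.5, 5.8, 7.3, 8.2, 6.9, 6.8, 5.7, 8.6]
placebo = [6.4, 6.7, 5.4, 8.2, 5.3, 6.6, 5.8, 5.7, 6.2, 7.1, 7.0, 6.9, 5.6, 4.2, 6.8]

# pooled-variance t test with the one-sided alternative mu_drug > mu_placebo
t, p = stats.ttest_ind(drug, placebo, equal_var=True, alternative="greater")
print(t, p)  # t ~= 2.41, one-sided p ~= .011
```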

Testing $H_0$: $\sigma_1^2 = \sigma_2^2$:

We know that

$$\frac{(n_k - 1)s_k^2}{\sigma_k^2} \sim \chi^2_{n_k - 1}, \quad k = 1, 2.$$

It can be shown that the ratio of two independent $\chi^2$-distributed random variables, each divided by its degrees of freedom, follows the F distribution with $df_1$ and $df_2$ degrees of freedom.

In general: if $X_k \sim \chi^2_{\nu_k}$, $k = 1, 2$, are independent, then

$$Y = \frac{X_1/\nu_1}{X_2/\nu_2} \sim F_{\nu_1, \nu_2}.$$

It follows that, letting $X_k = (n_k - 1)s_k^2/\sigma_k^2$, $k = 1, 2$,

$$Y = \frac{(n_2 - 1)(n_1 - 1)s_1^2/\sigma_1^2}{(n_1 - 1)(n_2 - 1)s_2^2/\sigma_2^2} = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1 - 1,\, n_2 - 1}.$$

Under $H_0$: $\sigma_1^2 = \sigma_2^2$,

$$\frac{s_1^2}{s_2^2} \sim F_{n_1 - 1,\, n_2 - 1}.$$

So, to test $H_0$, we find the critical values $F_{n_1-1,\, n_2-1,\, \alpha/2}$ and $F_{n_1-1,\, n_2-1,\, 1-\alpha/2}$. If

$$\frac{s_1^2}{s_2^2} \leq F_{n_1-1,\, n_2-1,\, \alpha/2} \quad\text{or}\quad \frac{s_1^2}{s_2^2} \geq F_{n_1-1,\, n_2-1,\, 1-\alpha/2},$$

we reject $H_0$.

Note:

$$F_{\nu_1, \nu_2, \alpha} = \frac{1}{F_{\nu_2, \nu_1, 1-\alpha}}.$$

Example, birthweights:

$H_0$: $\sigma_1^2 = \sigma_2^2$; $H_A$: $\sigma_1^2 \neq \sigma_2^2$.

$C = \{F : F \geq F_{.025,14,14} = 2.79 \text{ or } F \leq 1/2.79 = 0.36\}$.

$$F = \frac{(.899)^2}{(.961)^2} = 0.88;$$

do not reject $H_0$.
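A sketch of the two-sided variance-ratio test (scipy assumed; function name illustrative):

```python
import numpy as np
from scipy import stats

def variance_ratio_test(y1, y2, alpha=0.05):
    """Two-sided F test of H0: sigma1^2 = sigma2^2 (normality assumed)."""
    n1, n2 = len(y1), len(y2)
    F = np.var(y1, ddof=1) / np.var(y2, ddof=1)
    lo = stats.f.ppf(alpha / 2, n1 - 1, n2 - 1)      # lower critical value
    hi = stats.f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)  # upper critical value
    return F, (lo, hi), F <= lo or F >= hi           # reject if F falls outside
```

For the birthweight data this gives F ≈ 0.88, inside the acceptance region, as above.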

What do we do if $\sigma_1^2 \neq \sigma_2^2$? This is the Behrens-Fisher problem. Options:

1. Satterthwaite approximation.
2. Transform the data.
3. Nonparametric methods.

Satterthwaite approximation:

The statistic is the usual t statistic (with $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ in the denominator), but with

$$df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}.$$
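In scipy this is Welch's unequal-variance test; a sketch with the df also computed explicitly (the array values are illustrative):

```python
import numpy as np
from scipy import stats

y1 = np.array([6.9, 7.6, 7.3, 7.6, 6.8])  # illustrative data
y2 = np.array([6.4, 6.7, 5.4, 8.2, 5.3])

t, p = stats.ttest_ind(y1, y2, equal_var=False)  # Welch/Satterthwaite df

# the df formula above, spelled out
v1, v2 = y1.var(ddof=1) / len(y1), y2.var(ddof=1) / len(y2)
df = (v1 + v2) ** 2 / (v1**2 / (len(y1) - 1) + v2**2 / (len(y2) - 1))
```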

If $\sigma_1^2 \neq \sigma_2^2$, we can consider a variance stabilizing transformation. Some examples:

If $\sigma^2 \propto \mu$, take W = √Y.
If $\sigma^2 \propto \mu^2$, take W = ln(Y).
If $\sigma^2 \propto \mu^4$, take W = 1/Y.

Notes:

(1) We transform both $Y_1$ and $Y_2$.

(2) I rarely use transformations.

If $n_1$ and $n_2$ are large, the homogeneity of variance assumption is not important. Recall that if $n_j$ is large, $\bar{Y}_j$ is approximately distributed as $N(\mu_j, \sigma_j^2/n_j)$.

To test $H_0$: $\mu_1 = \mu_2$, we use

$$z = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}.$$

Under $H_0$, z is approximately distributed as a N(0,1) variate. The approximation gets better as $n_1, n_2 \to \infty$.

This approximation is good enough for practical purposes if $n_j \geq 25$, j = 1, 2. Note also that the assumption that the Y's are normally distributed is not needed for this statistic (Central Limit Theorem).

A study was done to compare the percent body fat of 3rd graders at schools on two Native American reservations: Gila River (Tohono O'odham) and White River (Apache).

$H_0$: $\mu_T = \mu_A$; $H_A$: $\mu_T \neq \mu_A$. $\alpha = .05$.

$n_T = 63$; $n_A = 35$.

$C = \{z : |z| \geq 1.96\}$.

A A

2 2

0

y 37.9%; s 8.66

y 32.8% s 6.88

37.9 32.8z 3.20

8.66 6.8863 35

reject H ; p=0.0014
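A sketch of the large-sample z computation from the summary statistics (scipy assumed):

```python
import numpy as np
from scipy import stats

n_t, ybar_t, s_t = 63, 37.9, 8.66   # Gila River (T)
n_a, ybar_a, s_a = 35, 32.8, 6.88   # White River (A)

z = (ybar_t - ybar_a) / np.sqrt(s_t**2 / n_t + s_a**2 / n_a)
p = 2 * stats.norm.sf(abs(z))       # two-sided p-value
print(z, p)                         # z ~= 3.20, p ~= 0.0014
```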

If the sample sizes are small and the Y's are not normally distributed:

1. Transform the Y's.
2. Use a nonparametric method.

Wilcoxon-Mann-Whitney rank sum test:

1. Pool the two samples and rank the observations from smallest to largest.
2. Replace the observations with their ranks.
3. Compute the sum of the ranks, $W_1$, in group 1.
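A minimal sketch of steps 1-3 (scipy assumed; midranks handle ties):

```python
import numpy as np
from scipy import stats

def rank_sum_W1(y1, y2):
    """Wilcoxon rank-sum statistic: sum of group-1 ranks in the pooled sample."""
    ranks = stats.rankdata(np.concatenate([y1, y2]))  # average ranks for ties
    return ranks[: len(y1)].sum()
```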

What hypothesis does the Wilcoxon procedure test?

Assume $Y_j \sim F_j(y)$, j = 1, 2.

$H_0$: $F_1(y) = F_2(y)$
$H_A$: $F_1(y) = F_2(y - \Delta)$,

where $\Delta$ is a constant.

There are $N = n_1 + n_2$ subjects in our study. Thus there are $\binom{N}{n_1}$ possible outcomes. Under $H_0$, each is equally likely. We compute the distribution of $W_1$ by enumeration.

1 2

Example : 7 students are taking a series

of exams. They are randomly divided into

2 groups: n 3, n =4. After the first exam,

group 1 is told they scored badly on exam 1

regardless of their score; group

2 was

told nothing.

The null hypothesis is that telling the students that they are doing poorly will have no effect on their subsequent grade.

$H_0$: $\Delta = 0$
$H_A$: $\Delta \neq 0$

Grades on the second exam:

Group 1  Group 2

65 89

73 70

69 92

88

There are $\binom{7}{3} = 35$ possible outcomes of the study.

The 35 possible assignments of ranks to group 1, and the resulting rank sums:

Ranks  W1   Ranks  W1   Ranks  W1

1,2,3 6 1,5,6 12 2,6,7 15

1,2,4 7 1,5,7 13 3,4,5 12

1,2,5 8 1,6,7 14 3,4,6 13

1,2,6 9 2,3,4 9 3,4,7 14

1,2,7 10 2,3,5 10 3,5,6 14

1,3,4 8 2,3,6 11 3,5,7 15

1,3,5 9 2,3,7 12 3,6,7 16

1,3,6 10 2,4,5 11 4,5,6 15

1,3,7 11 2,4,6 12 4,5,7 16

1,4,5 10 2,4,7 13 4,6,7 17

1,4,6 11 2,5,6 13 5,6,7 18

1,4,7 12 2,5,7 14

C.d.f. of $W_1$ for $n_1 = 3$ and $n_2 = 4$:

w F(w)

6 0.02856

7 0.05714

8 0.1143

9 0.2000

10 0.3142

11 0.4286

12 0.5714

13 0.6857

14 0.8000

15 0.8857

16 0.9429

17 0.9714

18 1.0000

Note: it is impossible to conduct a two-sided $\alpha = 0.05$ test. We will do a two-sided $\alpha = 0.1$ test: $C = \{6, 18\}$.

Observed $W_1 = 1 + 2 + 4 = 7 \notin C$, so do not reject $H_0$. P-value = 2(0.05714) = 0.1143.
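The enumeration is easy to reproduce (a sketch with itertools):

```python
from itertools import combinations

# all C(7,3) = 35 equally likely sets of group-1 ranks, and their rank sums
w_values = [sum(c) for c in combinations(range(1, 8), 3)]
print(len(w_values))                                   # 35
print(sum(w <= 7 for w in w_values) / len(w_values))   # P(W1 <= 7) = 0.05714...
```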

Note: $W_1 + W_2 = \dfrac{N(N+1)}{2}$.

$$E(W_1) = \frac{n_1(N+1)}{2}, \qquad V(W_1) = \frac{n_1 n_2 (N+1)}{12}.$$

If $n_1$ and $n_2$ are large,

$$z = \frac{W_1 - E(W_1)}{\sqrt{V(W_1)}}$$

will be approximately distributed as $\Phi(z)$, i.e., standard normal.

This approximation is good for $n_1, n_2 \geq 12$.

If there are ties:

$$V(W_1) = \frac{n_1 n_2 (N+1)}{12} - \frac{n_1 n_2}{12N(N-1)} \sum_{i=1}^{q} t_i (t_i - 1)(t_i + 1),$$

where q is the number of tied groups and $t_i$ is the number of observations tied at the ith tied value.

Birthweights (lbs.)

Drug  Rank  Placebo  Rank

6.9 18 6.4 11

7.6 25.5 6.7 13

7.3 23.5 5.4 3

7.6 25.5 8.2 27.5

6.8 15 5.3 2

7.2 22 6.6 12

8.0 29 5.8 8.5

5.5 4 5.7 6.5

5.8 8.5 6.2 10

7.3 23.5 7.1 21

8.2 27.5 7.0 20

6.9 18 6.9 18

6.8 15 5.6 5

5.7 6.5 4.2 1

8.6 30 6.8 15

$H_0$: $\Delta = 0$; $H_A$: $\Delta > 0$; $\alpha = .05$.

$C = \{z : z \geq 1.645\}$.

$$E(W_1) = \frac{15(31)}{2} = 232.5.$$

$$V(W_1 \mid \text{no ties}) = \frac{15^2(31)}{12} = 581.25.$$

Here $q = 7$: $t_1 = 2$; $t_2 = 2$; $t_3 = 3$; $t_4 = 3$; $t_5 = 2$; $t_6 = 2$; $t_7 = 2$.

$$\sum_{i=1}^{q} t_i(t_i - 1)(t_i + 1) = 78.$$

$$V(W_1 \text{ adj. for ties}) = 581.25 - \frac{78(15)(15)}{12(30)(29)} = 581.25 - 1.68 = 579.57.$$

$W_1 = 291.5$;

$$z = \frac{291.5 - 232.5}{\sqrt{579.57}} = 2.45.$$

Reject $H_0$; p = 0.0071.
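A sketch reproducing this with scipy (the `method`/`alternative` arguments assume scipy >= 1.7; scipy also applies a continuity correction by default, so its p differs slightly):

```python
import numpy as np
from scipy import stats

drug = [6.9, 7.6, 7.3, 7.6, 6.8, 7.2, 8.0, 5.5, 5.8, 7.3, 8.2, 6.9, 6.8, 5.7, 8.6]
placebo = [6.4, 6.7, 5.4, 8.2, 5.3, 6.6, 5.8, 5.7, 6.2, 7.1, 7.0, 6.9, 5.6, 4.2, 6.8]

ranks = stats.rankdata(np.concatenate([drug, placebo]))
W1 = ranks[:15].sum()                                  # 291.5

res = stats.mannwhitneyu(drug, placebo, alternative="greater", method="asymptotic")
print(W1, res.statistic, res.pvalue)  # U1 = W1 - 15(16)/2 = 171.5, p ~= 0.007
```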

Mann-Whitney test:

Consider all $n_1 n_2$ possible pairs $(Y_{1i}, Y_{2j})$; $i = 1, 2, \dots, n_1$; $j = 1, 2, \dots, n_2$. Let $U_1$ = # of pairs with $Y_{1i} > Y_{2j}$.

It can be shown that

$$U_1 = W_1 - \frac{n_1(n_1 + 1)}{2}.$$

Therefore the Mann-Whitney and Wilcoxon tests are equivalent.
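A direct check of the identity on the exam-scores example (plain Python):

```python
from itertools import product

y1, y2 = [65, 73, 69], [89, 70, 92, 88]
U1 = sum(a > b for a, b in product(y1, y2))  # pairs with Y_1i > Y_2j
W1 = 1 + 4 + 2                               # group-1 ranks in the pooled sample
assert U1 == W1 - 3 * 4 // 2                 # U1 = W1 - n1(n1+1)/2 = 1
```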

The Wilcoxon statistic is a special case of a set of simple linear rank statistics. Let $R_k$ be the rank of the kth observation in the combined groups, $k = 1, 2, \dots, N$.

Simple linear rank statistic:

$$S[a(R_k), c_k] = \sum_{k=1}^{N} c_k\, a(R_k),$$

where $a(\cdot)$ is a known function and the $c_k$ are a series of constants.

It can be shown that

$$E\{S[a(R_k), c_k]\} = \frac{1}{N}\left[\sum_{k=1}^{N} a(R_k)\right] \sum_{k=1}^{N} c_k = N\bar{a}\bar{c},$$

and

$$V(S) = \frac{1}{N-1}\left[\sum_{k=1}^{N} \{a(R_k) - \bar{a}\}^2\right] \sum_{k=1}^{N} (c_k - \bar{c})^2.$$

If N is large, the distribution of

$$z = \frac{S - E(S)}{\sqrt{V(S)}}$$

is approximately $\Phi(z)$.

If $a(R_k) = R_k$ and

$$c_k = \begin{cases} 1 & \text{if the kth obs. is in group 1} \\ 0 & \text{if the kth obs. is in group 2,} \end{cases}$$

then $S[a(R_k), c_k] = \sum c_k R_k = W_1$.

Normal Scores Test:

$c_k$ as before,

$$a(R_k) = \Phi^{-1}\!\left(\frac{R_k}{N+1}\right).$$

Savage scores (logrank) test:

$$a_n(R_k) = \frac{1}{N} + \frac{1}{N-1} + \dots + \frac{1}{N - R_k + 1}.$$

Using $\ln(c) \approx 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{c}$, we have approximately

$$a_n(R_k) \approx \ln(N) - \ln(N - R_k) = \ln\!\left(\frac{N}{N - R_k}\right).$$

However, this is undefined if $R_k = N$, so take

$$a_n(R_k) = \ln\!\left(\frac{N+1}{N - R_k + 1}\right).$$

This form of the statistic is called the logrank statistic and is used in survival analysis.

Savage and logrank scores for N = 10:

Rank  Savage score  Logrank score
1     0.100000      0.095310
2     0.211111      0.200671
3     0.336111      0.318454
4     0.478968      0.451985
5     0.645635      0.606136
6     0.845635      0.788457
7     1.095635      1.011601
8     1.428968      1.299283
9     1.928968      1.704748
10    2.928968      2.397895
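The score columns are easy to reproduce (a sketch; scipy is only needed for the normal scores defined above):

```python
import numpy as np
from scipy import stats

N = 10
R = np.arange(1, N + 1)
savage = np.cumsum(1.0 / np.arange(N, 0, -1))   # 1/N + 1/(N-1) + ... + 1/(N-R+1)
logrank = np.log((N + 1) / (N - R + 1))         # ln((N+1)/(N-R+1))
normal_scores = stats.norm.ppf(R / (N + 1))     # a(R) = Phi^{-1}(R/(N+1)), for comparison

for row in zip(R, savage.round(6), logrank.round(6)):
    print(*row)   # matches the table above
```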

Optimum simple linear rank statistic:

Distribution   a_n(R_k)
Normal         Normal scores
Exponential    Savage scores
Logistic       Wilcoxon scores

Permutation test:

N subjects are randomly assigned to 2 groups, with $n_1 + n_2 = N$. There are $\binom{N}{n_1}$ possible assignments, and each is equally likely.

Each of these assignments results in a value of $\bar{Y}_1 - \bar{Y}_2$. Compute $\bar{Y}_1 - \bar{Y}_2$ for each possible outcome.

Compute the empirical distribution under $H_0$ of equal means. From the e.d.f., determine the critical region of the test.

Example: test scores; $\binom{7}{3} = 35$ possible outcomes.

Grp 1       Grp 2           diff.
65 69 70    73 88 89 92    -17.50
65 69 73    70 88 89 92    -15.75
65 69 88    70 73 89 92     -7.00
65 69 89    70 73 88 92     -6.42
65 69 92    70 73 88 89     -4.67
65 70 73    69 88 89 92    -15.17
65 70 88    69 73 89 92     -6.42
65 70 89    69 73 88 92     -5.83
65 70 92    69 73 88 89     -4.08
65 73 88    69 70 89 92     -4.67
65 73 89    69 70 88 92     -4.08
65 73 92    69 70 88 89     -2.33
65 88 89    69 70 73 92      4.67
65 88 92    69 70 73 89      6.42
65 89 92    69 70 73 88      7.00
69 70 73    65 88 89 92    -12.83
69 70 88    65 73 89 92     -4.08
69 70 89    65 73 88 92     -3.50
69 70 92    65 73 88 89     -1.75
69 73 88    65 70 89 92     -2.33
69 73 89    65 70 88 92     -1.75
69 73 92    65 70 88 89      0.00
69 88 89    65 70 73 92      7.00
69 88 92    65 70 73 89      8.75
69 89 92    65 70 73 88      9.33
70 73 88    65 69 89 92     -1.75
70 73 89    65 69 88 92     -1.17
70 73 92    65 69 88 89      0.58
70 88 89    65 69 73 92      7.58
70 88 92    65 69 73 89      9.33
70 89 92    65 69 73 88      9.92
73 88 89    65 69 70 92      9.33
73 88 92    65 69 70 89     11.08
73 89 92    65 69 70 88     11.67
88 89 92    65 69 70 73     20.42

CDF of diff:

d        P(diff ≤ d)     d       P(diff ≤ d)
-17.50   0.029           6.42    0.686
-15.75   0.057           7.00    0.743
-15.17   0.086           7.58    0.771
-12.83   0.114           8.75    0.800
 -7.00   0.143           9.33    0.886
 -6.42   0.200           9.92    0.914
 -5.83   0.229          11.08    0.943
 -4.67   0.286          11.67    0.971
 -4.08   0.371          20.42    1.000
 -3.50   0.400
 -2.33   0.457
 -1.75   0.543
 -1.17   0.571
  0.00   0.600
  0.58   0.629
  4.67   0.657

$\alpha = .1$.

Critical region: $C = \{d : d \leq -17.5 \text{ or } d \geq 20.42\}$, where $d = \bar{Y}_1 - \bar{Y}_2$.

Observed d = −15.75; do not reject $H_0$.
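The full permutation distribution is quick to enumerate (a sketch with itertools/numpy):

```python
from itertools import combinations
import numpy as np

scores = [65, 69, 70, 73, 88, 89, 92]   # pooled second-exam grades
diffs = []
for grp1 in combinations(scores, 3):
    grp2 = [s for s in scores if s not in grp1]   # scores are distinct
    diffs.append(np.mean(grp1) - np.mean(grp2))
diffs = np.sort(diffs)
print(len(diffs))            # 35 equally likely assignments
print(diffs[0], diffs[-1])   # -17.5 and 20.42 bound the alpha = .1 critical region
```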

Permutation Test

• No assumptions except random assignment.

• Computations are extensive if N is moderately large.

• A CLT-type theorem shows that for large samples the normal approximation is good.

Recommended.