The paired sample experiment The paired t test


The paired sample experiment

The paired t test


Frequently one is interested in comparing the effects of two treatments (drugs, etc.) on a response variable. The two treatments determine two different populations:

– Popn 1: cases treated with treatment 1.
– Popn 2: cases treated with treatment 2.

The response variable is assumed to have a normal distribution within each population differing possibly in the mean (and also possibly in the variance)


Two independent sample design

A sample of n cases is selected from population 1 (cases receiving treatment 1) and a second sample of m cases is selected from population 2 (cases receiving treatment 2). The data:

– x1, x2, x3, …, xn from population 1.
– y1, y2, y3, …, ym from population 2.

The test that is used is the t-test for two independent samples

The test statistic (if equal variances are assumed):

$$t = \frac{\bar{y} - \bar{x}}{s_{Pooled}\sqrt{\dfrac{1}{n} + \dfrac{1}{m}}}$$

where

$$s_{Pooled} = \sqrt{\frac{(n - 1)s_x^2 + (m - 1)s_y^2}{n + m - 2}}$$
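The pooled statistic can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `pooled_t` and the two small samples are assumptions made for the example and do not come from the slides.

```python
import math
from statistics import mean, variance

def pooled_t(x, y):
    """Two-independent-sample t statistic, assuming equal variances."""
    n, m = len(x), len(y)
    # Pooled standard deviation: weighted average of the two sample variances.
    s_pooled = math.sqrt(
        ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    )
    t = (mean(y) - mean(x)) / (s_pooled * math.sqrt(1 / n + 1 / m))
    return t, n + m - 2  # statistic and its degrees of freedom

# Hypothetical illustrative data (not from the slides).
x = [12.1, 9.8, 11.4, 10.3, 13.0]
y = [14.2, 12.9, 15.1, 13.3]
t, df = pooled_t(x, y)
print(f"t = {t:.3f}, df = {df}")
```

The statistic is compared with the t distribution on n + m - 2 degrees of freedom.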

The matched pair experimental design (The paired sample experiment)

Prior to assigning the treatments the subjects are grouped into pairs of similar subjects.

Suppose that there are n such pairs (a total of 2n = n + n subjects or cases). The two treatments are then randomly assigned within each pair: one member of a pair receives treatment 1, while the other receives treatment 2. The data collected are as follows:

– (x1, y1), (x2, y2), (x3, y3), …, (xn, yn).

xi = the response for the case in pair i that receives treatment 1.

yi = the response for the case in pair i that receives treatment 2.

Let di = yi - xi. Then d1, d2, d3, …, dn is a sample from a normal distribution with mean

$$\mu_d = \mu_2 - \mu_1,$$

variance

$$\sigma_d^2 = \sigma_x^2 + \sigma_y^2 - 2\,\mathrm{cov}(x, y) = \sigma_x^2 + \sigma_y^2 - 2\rho_{xy}\sigma_x\sigma_y,$$

and standard deviation

$$\sigma_d = \sqrt{\sigma_x^2 + \sigma_y^2 - 2\rho_{xy}\sigma_x\sigma_y}.$$

Note: if the x and y measurements are positively correlated (this will be true if the cases in each pair are matched effectively), then $\sigma_d$ will be small.
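To see how much effective matching can shrink the standard deviation of the differences, the variance formula can be evaluated for a few correlations. The sketch below is only illustrative: the function name and the values σx = σy = 10 are assumptions, not figures from the slides.

```python
import math

def sd_of_difference(sigma_x, sigma_y, rho):
    """Standard deviation of d = y - x when x and y have correlation rho."""
    return math.sqrt(sigma_x**2 + sigma_y**2 - 2 * rho * sigma_x * sigma_y)

# Hypothetical standard deviations; only the trend in rho matters here.
for rho in (0.0, 0.5, 0.9):
    print(f"rho = {rho:.1f}  sigma_d = {sd_of_difference(10.0, 10.0, rho):.3f}")
```

With rho = 0 the pairing gains nothing over independent samples; as rho approaches 1 the standard deviation of the differences approaches 0 when the two standard deviations are equal.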

Testing H0: $\mu_1 = \mu_2$ is equivalent to testing H0: $\mu_d = 0$ (we have converted the two sample problem into a single sample problem).

The test statistic is the single sample t-test on the differences d1, d2, d3, …, dn, namely

$$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}, \qquad df = n - 1$$


Example

We are interested in comparing the effectiveness of two methods for reducing high cholesterol.

The methods

1. Use of a drug.

2. Control of diet.

The 2n = 8 subjects were paired into 4 matched pairs.

In each matched pair one subject was given the drug treatment, the other subject was given the diet control treatment. Assignment of treatments was random.

The data: reduction in cholesterol after a 6 month period

Treatment                 Pair 1   Pair 2   Pair 3   Pair 4
Drug treatment             30.3     10.2     22.3     15.0
Diet control treatment     25.7      9.4     24.6      8.9

Differences

Treatment                 Pair 1   Pair 2   Pair 3   Pair 4
Drug treatment             30.3     10.2     22.3     15.0
Diet control treatment     25.7      9.4     24.6      8.9
di (drug - diet)            4.6      0.8     -2.3      6.1

$$\bar{d} = 2.3, \qquad s_d = 3.792$$

$$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} = \frac{2.3}{3.792 / \sqrt{4}} = 1.213$$

For df = n – 1 = 3, $t_{0.025} = 3.182$. Since |t| = 1.213 < 3.182, we accept H0.
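The arithmetic of this example can be checked with a short script. It is a minimal sketch using only the Python standard library; the data and the critical value 3.182 are taken from the slides, while the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Reduction in cholesterol for the 4 matched pairs (from the slides).
drug = [30.3, 10.2, 22.3, 15.0]
diet = [25.7, 9.4, 24.6, 8.9]

d = [a - b for a, b in zip(drug, diet)]  # differences, drug minus diet
n = len(d)
d_bar = mean(d)
s_d = stdev(d)                           # sample standard deviation (n - 1 divisor)
t = (d_bar - 0) / (s_d / math.sqrt(n))

t_crit = 3.182                           # t_0.025 for df = 3, as quoted in the slides
reject = abs(t) > t_crit
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.3f}, reject H0: {reject}")
```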


Nonparametric Statistical Methods


Many statistical procedures make assumptions

The t test and the z test assume that the populations being sampled are normally distributed (true for both the one sample and the two sample tests).


This assumption for large sample sizes is not critical.

(Reason: The Central Limit Theorem)

The sample mean, and hence the statistic z, will have approximately a normal distribution for large sample sizes even if the population is not normal.

For small sample sizes the departure from the assumption of normality could affect the performance of a statistical procedure that assumes normality.

For testing, the probability of a type I error may not be the desired value of α = 0.05 or 0.01.

For confidence intervals, the probability of capturing the parameter may not be the desired value (95% or 99%) but a value considerably smaller.

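Example: Consider the z-test

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

where $\bar{x}$ is the sample mean and $\mu_0$ is the hypothesized value of the mean.

For α = 0.05 we reject the hypothesized value of the mean if z < -1.96 or z > 1.96.

Suppose the population is an exponential population with parameter λ (μ = 1/λ and σ = 1/λ).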

[Figure: density of the actual (exponential) population compared with the assumed (normal) population.]

Suppose the population is an exponential population with parameter λ (μ = 1/λ and σ = 1/λ). It can be shown (use mgf’s) that the sampling distribution of $\bar{x}$ is the Gamma distribution with shape and rate parameters

$$\alpha = n \quad \text{and} \quad \lambda_{\bar{x}} = n\lambda.$$

The distribution of $\bar{x}$ is not the normal distribution with

$$\mu_{\bar{x}} = \mu = \frac{1}{\lambda} \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1}{\lambda\sqrt{n}}.$$
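Because the Gamma distribution of the sample mean has integer shape n, the actual rejection probabilities of the nominal 5% z-test can be computed exactly from the closed-form Erlang CDF. The sketch below assumes λ = 1 (the result does not depend on λ, since z depends only on λ times the sample mean); the function names are illustrative and not from the slides.

```python
import math

def erlang_cdf(t, n):
    """P(T <= t) where T is the sum of n independent Exponential(rate 1) variables."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-t) * sum(t**k / math.factorial(k) for k in range(n))

def actual_tail_probs(n, z_crit=1.96):
    """Lower and upper rejection probabilities of the z-test under H0
    when the population is exponential with lambda = 1 (mu = sigma = 1)."""
    # Reject when x_bar lies outside 1 +/- z_crit / sqrt(n);
    # on the scale of the sum T = n * x_bar the limits are:
    lo = n * (1 - z_crit / math.sqrt(n))
    hi = n * (1 + z_crit / math.sqrt(n))
    return erlang_cdf(lo, n), 1.0 - erlang_cdf(hi, n)

for n in (2, 5, 20):
    lower, upper = actual_tail_probs(n)
    print(f"n = {n:2d}  lower = {lower:.4f}  upper = {upper:.4f}  total = {lower + upper:.4f}")
```

Under normality each tail would be 0.025. For small n the two tails are very unequal, which reflects the skewness of the exponential population.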

[Figure: sampling distribution of x̄ for n = 2, actual distribution compared with the distribution assuming normality.]

[Figure: sampling distribution of x̄ for n = 5, actual distribution compared with the distribution assuming normality.]

[Figure: sampling distribution of x̄ for n = 20, actual distribution compared with the distribution assuming normality.]

Definition

When the data are generated from a process (model) that is known except for a finite number of unknown parameters, the model is called a parametric model.

Otherwise, the model is called a non-parametric model.

Statistical techniques that assume a non-parametric model are called non-parametric.


The sign test

A nonparametric test for the central location of a distribution

We want to test:

H0: median = μ0

against

HA: median ≠ μ0

(or against a one-sided alternative)


• The assumption will be only that the distribution of the observations is continuous.

• Note for symmetric distributions the mean and median are equal if the mean exists.

• For non-symmetric distributions, the median is probably a more appropriate measure of central location.

The Sign test:

1. The test statistic:

S = the number of observations that exceed μ0

Comment: If H0: median = μ0 is true we would expect 50% of the observations to be above μ0, and 50% of the observations to be below μ0.

[Figure: density of the observations when H0 is true; 50% of the area lies on each side of median = μ0.]

If H0 is true then S will have a binomial distribution with p = 0.50, n = sample size.

If H0 is not true then S will still have a binomial distribution. However p will not be equal to 0.50.

[Figure: density with μ0 > median; the area p to the right of μ0 is less than 0.50.]

[Figure: density with μ0 < median; the area p to the right of μ0 is greater than 0.50.]

p = the probability that an observation is greater than μ0.

Summarizing: If H0 is true then S will have a binomial distribution with p = 0.50, n = sample size.

[Figure: bar chart of the binomial distribution of S for n = 10, p = 0.50.]

For n = 10:

 x    p(x)
 0    0.0010
 1    0.0098
 2    0.0439
 3    0.1172
 4    0.2051
 5    0.2461
 6    0.2051
 7    0.1172
 8    0.0439
 9    0.0098
10    0.0010
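The null distribution of S for n = 10 can be reproduced directly from the binomial formula with p = 0.50. A minimal sketch using the standard library; the function name is illustrative.

```python
from math import comb

def sign_test_null_pmf(n):
    """P(S = k), k = 0..n, when S is Binomial(n, 0.5)."""
    return [comb(n, k) / 2**n for k in range(n + 1)]

pmf = sign_test_null_pmf(10)
for k, p in enumerate(pmf):
    print(f"{k:2d}  {p:.4f}")
```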

The critical and acceptance region (n = 10):

Choose the critical region so that α is close to 0.05 or 0.01.

e.g. If the critical region is {0,1,9,10} then α = .0010 + .0098 + .0098 + .0010 = .0216

e.g. If the critical region is {0,1,2,8,9,10} then α = .0010 + .0098 + .0439 + .0439 + .0098 + .0010 = .1094

• If one can’t determine a fixed critical region to achieve a fixed significance level α, one then randomizes the choice of the critical region.

• In the example with n = 10, if the critical region is {0,1,9,10} then α = .0010 + .0098 + .0098 + .0010 = .0216

• If the values 2 and 8 are added to the critical region the value of α increases to 0.0216 + 2(.0439) = 0.0216 + 0.0878 = 0.1094

• Note 0.05 = 0.0216 + 0.3235(.0878)

Consider the following critical region:

1. Reject H0 if the test statistic is in {0,1,9,10}.
2. If the test statistic is in {2,8}, perform a success-failure experiment with p = P[success] = 0.3235. If the experiment is a success, reject H0.
3. Otherwise we accept H0.
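The randomization probability can be computed rather than read off the rounded table. The sketch below is illustrative only (names such as `gamma` and `randomized_sign_test` are assumptions). With exact binomial probabilities it gives roughly 0.324; the value 0.3235 above comes from using the table entries rounded to four decimals.

```python
import random
from math import comb

n = 10
pmf = [comb(n, k) / 2**n for k in range(n + 1)]  # null distribution of S

target_alpha = 0.05
alpha_inner = pmf[0] + pmf[1] + pmf[9] + pmf[10]  # size of {0, 1, 9, 10}
boundary_mass = pmf[2] + pmf[8]                   # probability of {2, 8}
gamma = (target_alpha - alpha_inner) / boundary_mass

def randomized_sign_test(s, rng=random):
    """Return True if H0 is rejected for the observed value s of S."""
    if s in (0, 1, 9, 10):
        return True                   # always reject
    if s in (2, 8):
        return rng.random() < gamma   # reject with probability gamma
    return False                      # otherwise accept H0

print(f"gamma = {gamma:.4f}")
print(f"overall size = {alpha_inner + gamma * boundary_mass:.4f}")
```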


Example

Suppose that we are interested in determining if a new drug is effective in reducing cholesterol.

Hence we administer the drug to n = 10 patients with high cholesterol and measure the reduction.

The data

        Cholesterol
Case   Initial   Final   Reduction
  1      240      228       12
  2      237      222       15
  3      264      262        2
  4      233      224        9
  5      236      240       -4
  6      234      237       -3
  7      264      264        0
  8      241      219       22
  9      261      252        9
 10      256      254        2

Let S = the number of negative reductions = 2

[Figure: bar chart of the binomial distribution of S for n = 10, p = 0.50.]

If H0 is true then S will have a binomial distribution with p = 0.50, n = 10.

We would expect S to be small if H0 is false.

Choosing the critical region to be {0, 1, 2} the probability of a type I error would be

α = 0.0010 + 0.0098 + 0.0439 = 0.0547

Since S = 2 lies in this region, the null hypothesis should be rejected.

Conclusion: There is a significant positive reduction (α = 0.0547) in cholesterol.
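The sign test for this example can be reproduced as follows. This is a minimal sketch: the reductions are the ones listed in the data table, and the one-sided tail probability P(S ≤ 2) under H0 equals the size of the critical region {0, 1, 2}.

```python
from math import comb

# Reduction in cholesterol for the 10 cases (from the data table).
reductions = [12, 15, 2, 9, -4, -3, 0, 22, 9, 2]

n = len(reductions)
S = sum(1 for r in reductions if r < 0)  # number of negative reductions

# P(S <= observed value) when S is Binomial(n, 0.5).
p_lower = sum(comb(n, k) for k in range(S + 1)) / 2**n

print(f"S = {S}, P(S <= {S}) = {p_lower:.4f}")
```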

If n is large we can use the Normal approximation to the Binomial. Namely, S has a Binomial distribution with p = ½ and n = sample size. Hence for large n, S has approximately a Normal distribution with mean

$$\mu_S = np = \frac{n}{2}$$

and standard deviation

$$\sigma_S = \sqrt{npq} = \sqrt{n \cdot \frac{1}{2} \cdot \frac{1}{2}} = \frac{\sqrt{n}}{2}$$

Hence for large n, use as the test statistic (in place of S)

$$z = \frac{S - \mu_S}{\sigma_S} = \frac{S - \dfrac{n}{2}}{\dfrac{\sqrt{n}}{2}}$$

Choose the critical region for z from the Standard Normal distribution,

i.e. reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$ (two tailed; a one tailed test can also be set up).
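The large-sample version of the sign test can be sketched as below, using `statistics.NormalDist` from the Python standard library for the critical value. The function names and the illustrative values S = 38, n = 100 are assumptions for the example, not data from the slides.

```python
import math
from statistics import NormalDist

def sign_test_z(S, n):
    """Normal-approximation statistic: (S - n/2) / (sqrt(n)/2)."""
    return (S - n / 2) / (math.sqrt(n) / 2)

def two_sided_sign_test(S, n, alpha=0.05):
    """Return z, the critical value z_{alpha/2}, and the reject decision."""
    z = sign_test_z(S, n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit

# Hypothetical large sample: 38 of 100 observations exceed the hypothesized median.
z, z_crit, reject = two_sided_sign_test(38, 100)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

For a one tailed test, compare z with the critical value for α in a single tail instead of α/2 in each.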