Analysis of two-way tables - Inference for two-way tables IPS chapter 9.2 © 2006 W.H. Freeman and Company



Objectives (IPS chapter 9.2)

Inference for two-way tables

The “no association” hypothesis

Expected counts in two-way tables

The Chi-square test


Hypothesis: no association

Again, we want to know whether the differences in sample proportions between groups are likely to have occurred by chance.

We use the chi-square (χ²) test to assess the hypothesis of no relationship between the two categorical variables of a two-way table.


Expected counts in two-way tables

Two-way tables summarize the data according to two categorical variables. We want to test the hypothesis that there is no relationship between these two categorical variables (H0).

To test this hypothesis, we compare the actual counts from the sample data with the counts we would expect if the hypothesis of no relationship were true.

The expected count in any cell of a two-way table when H0 is true is:

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{n}$$


Expected counts in two-way tables

If the row proportions do not differ across the columns, we can estimate the common proportion for any row as the total for that row divided by n. Multiplying that proportion by a column total gives (row total × column total) / n, which is the expected count for that cell.

That number is how many observations we expect to see in the cell if there is no association between the row and column factors.


Cocaine addiction

Cocaine produces short-term feelings of physical and mental well-being. To maintain that effect, the drug may have to be taken more frequently and at higher doses. After stopping use, users may feel tired, sleepy or depressed.

The pleasurable high followed by unpleasant after-effects encourages repeated compulsive use, which can easily lead to dependency.

Desipramine is an antidepressant that acts on the brain in a way that may mitigate depression. It was tested to see if it aided recovery from cocaine addiction.

Treatment with desipramine was compared to a standard treatment (lithium) and a placebo.


Cocaine addiction

Expected relapse counts

              No                 Yes
Desipramine   25*26/74 ≈ 8.78    16.22
Lithium       9.14               16.86
Placebo       8.08               14.92

[Figure: observed vs. expected percent by treatment; the expected percent is 35% in all three groups.]
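The expected counts in this table can be reproduced from the table margins alone. The sketch below is a minimal illustration in Python, assuming the row totals (25, 26, 23) and column totals (26, 48) implied by the expected counts on the slide; the function name `expected_counts` is a hypothetical helper, not something from the textbook.

```python
# Margins implied by the slide's expected counts (n = 74).
# Row totals are per treatment; column totals are relapse No / Yes.
row_totals = {"Desipramine": 25, "Lithium": 26, "Placebo": 23}
col_totals = {"No": 26, "Yes": 48}


def expected_counts(row_totals, col_totals):
    """Return {row: {col: row total * column total / n}}, the counts expected under H0."""
    n = sum(row_totals.values())
    if n != sum(col_totals.values()):
        raise ValueError("row and column totals must add up to the same n")
    return {
        r: {c: rt * ct / n for c, ct in col_totals.items()}
        for r, rt in row_totals.items()
    }


expected = expected_counts(row_totals, col_totals)
for treatment, cells in expected.items():
    print(treatment, {c: round(v, 2) for c, v in cells.items()})
```

Each printed value is one cell of the expected-count table, rounded to two decimals.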


The chi-square test

The chi-square statistic (X²) is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts.

The formula for the X² statistic is

$$X^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$

where the sum runs over all cells of the table.

Large values of X² are caused by large deviations from the expected counts and provide strong evidence against H0. How large is “large”?
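To make the formula concrete, here is a minimal Python sketch that computes the statistic directly from a table of observed counts. The observed counts below are assumed for illustration only: they are consistent with the row and column totals of the cocaine example, but they are not reproduced from this transcript. The function name `chi_square_statistic` is also hypothetical.

```python
# ASSUMED observed counts (rows: Desipramine, Lithium, Placebo;
# columns: relapse No, Yes). Illustrative values, not taken from the slides.
observed = [
    [15, 10],
    [7, 19],
    [4, 19],
]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under H0: row total * column total / n
            exp = row_totals[i] * col_totals[j] / n
            x2 += (obs - exp) ** 2 / exp
    return x2


x2 = chi_square_statistic(observed)
print(round(x2, 2))
```

A table whose rows all have the same proportions gives a statistic of 0, since every observed count then equals its expected count.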


For the chi-square test, H0 states that there is no association between the row and column variables in a two-way table. The alternative is that these variables are related.

If H0 is true, the chi-square test statistic has approximately a χ² distribution with (r − 1)(c − 1) degrees of freedom, where r is the number of rows and c is the number of columns. See Table F in the textbook.

The P-value for the chi-square test is the area to the right of X² under the χ² distribution with df = (r − 1)(c − 1):

P(χ² ≥ X²).
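In place of a table lookup, the upper-tail area can be computed directly. The sketch below is one possible approach using only the Python standard library, assuming a positive integer df: it starts from the closed forms for df = 1 and df = 2 and applies the standard recurrence that adds two degrees of freedom at a time. The function name `chi_square_p_value` is hypothetical.

```python
import math


def chi_square_p_value(x2, df):
    """Upper-tail area P(chi-square >= x2) for a positive integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x2 < 0:
        raise ValueError("x2 must be non-negative")
    half = x2 / 2.0
    if df % 2 == 1:
        k = 1
        tail = math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        k = 2
        tail = math.exp(-half)             # closed form for df = 2
    # Recurrence: tail(df = k + 2) = tail(df = k) + half^(k/2) * e^(-half) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1)(c - 1) for an r x c two-way table."""
    return (n_rows - 1) * (n_cols - 1)


# Example: a 3 x 3 table has df = 4; the tail area beyond 9.49 is close to 5%.
print(chi_square_p_value(9.49, degrees_of_freedom(3, 3)))
```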


Table F

If we have 3 rows and 3 columns, then the degrees of freedom are (3 − 1)(3 − 1) = 4.

To choose a value that we are unlikely to reach by chance, we might select the 95th percentile, since there is only a 5% chance of getting a number that large if H0 is true. From page T-20 we get the value 9.49.

If we calculate X² from the sample and that value is greater than 9.49, we would conclude that we have strong evidence against H0.
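The tabled value can be checked numerically. For df = 4 specifically, the upper-tail area of the χ² distribution has the closed form exp(−x/2)(1 + x/2), so a simple bisection search recovers the 95th percentile. This is only a sketch for that one case; the function names are hypothetical.

```python
import math


def upper_tail_df4(x):
    """P(chi-square with 4 df >= x); this closed form is valid for df = 4 only."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def critical_value_df4(alpha, lo=0.0, hi=100.0, tol=1e-8):
    """Find x such that upper_tail_df4(x) = alpha, by bisection.

    The tail area decreases as x grows, so the search moves right while
    the tail is still larger than alpha.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if upper_tail_df4(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


crit = critical_value_df4(0.05)
print(round(crit, 2))
```

Any X² larger than this value has a P-value below 5% when df = 4.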


When is it safe to use a χ² test?

We can safely use the chi-square test when:

All individual expected counts are 1 or more (≥1)

No more than 20% of expected counts are less than 5 (< 5)

For a 2x2 table, this implies that all four expected counts

should be 5 or more.
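These two conditions are easy to check mechanically. The following Python sketch, with a hypothetical function name, takes a table of expected counts and reports whether both conditions hold.

```python
def chi_square_is_safe(expected):
    """Check the two rules of thumb on a table (list of rows) of expected counts."""
    cells = [e for row in expected for e in row]
    # Rule 1: every expected count is at least 1.
    if any(e < 1 for e in cells):
        return False
    # Rule 2: no more than 20% of the expected counts are below 5.
    small = sum(1 for e in cells if e < 5)
    return small <= 0.20 * len(cells)


# Expected counts from the cocaine example: all are above 5, so the test is safe.
cocaine_expected = [[8.78, 16.22], [9.14, 16.86], [8.08, 14.92]]
print(chi_square_is_safe(cocaine_expected))

# In a 2x2 table a single cell below 5 is already 25% of the cells, so it fails.
print(chi_square_is_safe([[4.5, 10.0], [10.0, 10.0]]))
```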


Chi-square test vs. z-test for two proportions

When comparing only two proportions, such as in a 2x2 table where the columns represent counts of “success” and “failure,” we can test

H0: p1 = p2 vs. Ha: p1 ≠ p2

equally well with a two-sided z-test or with a chi-square test with 1 degree of freedom, and get the same p-value. In fact, the two test statistics are related by the simple expression X² = z².
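The relationship between the two statistics can be verified numerically. The sketch below uses a hypothetical 2x2 table (the counts are made up for illustration) and assumes the z statistic uses the pooled proportion in its standard error, which is the usual form for testing H0: p1 = p2.

```python
import math

# HYPOTHETICAL counts: rows are the two groups, columns are (success, failure).
table = [[30, 20], [20, 30]]


def two_proportion_z(table):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    (a, b), (c, d) = table
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    pooled = (a + c) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


z = two_proportion_z(table)
x2 = chi_square_statistic(table)

# Two-sided normal p-value and chi-square (df = 1) upper-tail p-value.
p_z = math.erfc(abs(z) / math.sqrt(2))
p_x2 = math.erfc(math.sqrt(x2 / 2))

print(z ** 2, x2)
print(p_z, p_x2)
```

Both printed pairs agree up to floating-point rounding, which is the equivalence the slide describes.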


Cocaine addiction

[Figure: observed counts and Minitab statistical software output for the cocaine study]

The p-value is 0.005, or half a percent. This is very significant. We reject the null hypothesis of no association and conclude that there is a significant relationship between treatment (desipramine, lithium, placebo) and outcome (relapse or not).


Successful firms

Franchise businesses are often given exclusive rights to a territory. This

means that the outlet will not have to compete with other outlets of the same

chain within that territory. How does the presence of an exclusive-territory

clause in the contract relate to the survival of the business?

A random sample of 170 new franchises recorded two categorical variables

for each firm: (1) whether the firm was successful or not (based on economic

criteria) and (2) whether or not the firm had an exclusive-territory contract.

This is a 2x2 table (two levels

for success, yes/no; two levels

for exclusive territory, yes/no).

df = (2 − 1)(2 − 1) = 1


Successful firms

How does the presence of an exclusive-territory clause in the contract relate to the survival of the business?

To compare firms that have an exclusive territory with those that do not, we start by examining column percents (conditional distribution).

The difference in the percent of successes between the two types of firms is quite large. The chi-square test can tell us whether or not this difference can be plausibly attributed to chance. Specifically, we will test

H0: No relationship between exclusivity clause and success

Ha: There is some relationship between the two variables


Successful firms

[Figure: chi-square output from Minitab]

The result is significant at α = 5% (the p-value is about 1.5%), so we reject H0: we have found a significant relationship between an exclusive territory and the success of a franchised firm.