STATISTICS: INFORMED DECISIONS USING DATA, Fifth Edition

Chapter 12: Inference on Categorical Data

Copyright © 2017, 2013, 2010 Pearson Education, Inc. All Rights Reserved

STATISTICS - Islamic University of Gaza


12.1 Goodness-of-Fit Test

Learning Objective

1. Perform a goodness-of-fit test

12.1.1 Perform a goodness-of-fit test

Characteristics of the Chi-Square Distribution

1. It is not symmetric.

2. Its shape depends on the degrees of freedom, just like Student’s t-distribution.

3. As the number of degrees of freedom increases, it becomes more nearly symmetric.


A goodness-of-fit test is an inferential procedure used to determine whether a frequency distribution follows a specific distribution.


Expected Counts

Suppose that there are n independent trials of an experiment with k ≥ 3 mutually exclusive possible outcomes. Let p1 represent the probability of observing the first outcome and E1 represent the expected count of the first outcome; p2 represent the probability of observing the second outcome and E2 represent the expected count of the second outcome; and so on. The expected counts for each possible outcome are given by

Ei = μi = npi for i = 1, 2, …, k


Parallel Example 1: Finding Expected Counts

A sociologist wishes to determine whether the distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000. According to the United States Census Bureau, in 2000, 22.8% of care-giving grandparents had been responsible for their grandchildren for less than 1 year; 23.9% for 1 or 2 years; 17.6% for 3 or 4 years; and 35.7% for 5 or more years. If the sociologist randomly selects 1,000 care-giving grandparents, compute the expected number within each category assuming the distribution has not changed from 2000.

Solution

Step 1: The probabilities are the relative frequencies from the 2000 distribution:

p1 = 0.228 (less than 1 year), p2 = 0.239 (1 or 2 years), p3 = 0.176 (3 or 4 years), p4 = 0.357 (5 or more years)

Step 2: There are n = 1,000 trials of the experiment, so the expected counts are:

E1 = 1000(0.228) = 228, E2 = 1000(0.239) = 239, E3 = 1000(0.176) = 176, E4 = 1000(0.357) = 357
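The expected-count rule Ei = npi can be sketched in a few lines of Python. This is only an illustration: the names `proportions` and `expected_counts` are hypothetical, and the category labels are shorthand for the four groups in the example.

```python
# Hypothesized 2000 distribution from the example, and the sample size.
proportions = {
    "<1 year": 0.228,
    "1-2 years": 0.239,
    "3-4 years": 0.176,
    "5+ years": 0.357,
}
n = 1000


def expected_counts(n, probs):
    """Return E_i = n * p_i for each category."""
    return {category: n * p for category, p in probs.items()}


expected = expected_counts(n, proportions)
for category, count in expected.items():
    print(f"{category}: {count:.1f}")
```

Because the probabilities sum to 1, the expected counts sum to n, which is a quick check on the arithmetic.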


Test Statistic for Goodness-of-Fit Tests

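Let Oi represent the observed counts of category i, Ei represent the expected counts of category i, k represent the number of categories, and n represent the number of independent trials of an experiment. Then the formula

$$\chi^2_0 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

approximately follows the chi-square distribution with k − 1 degrees of freedom, provided that (1) all expected counts are greater than or equal to 1 (all Ei ≥ 1) and (2) no more than 20% of the expected counts are less than 5.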


CAUTION!

Goodness-of-fit tests are used to test hypotheses regarding the distribution of a variable based on a single population. If you wish to compare two or more populations, you must use the tests for homogeneity presented in Section 12.2.


The Goodness-of-Fit Test

To test the hypotheses regarding a distribution, we use the steps that follow.

Step 1: Determine the null and alternative hypotheses.

H0: The random variable follows a certain distribution

H1: The random variable does not follow the distribution in the null hypothesis


Step 2: Decide on a level of significance, α, depending on the seriousness of making a Type I error.


Step 3:

a) Calculate the expected counts, Ei, for each of the k categories. The expected counts are Ei = npi for i = 1, 2, …, k, where n is the number of trials and pi is the probability of the ith category, assuming that the null hypothesis is true.


b) Verify that the requirements for the goodness-of-fit test are satisfied.

1. All expected counts are greater than or equal to 1 (all Ei ≥ 1).

2. No more than 20% of the expected counts are less than 5.
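The two requirements above can be checked mechanically. The following is a minimal sketch; the function name `requirements_met` is hypothetical, and the sample lists are made-up inputs apart from the first, which uses the expected counts from the grandparent example.

```python
def requirements_met(expected):
    """Check the two expected-count conditions for a chi-square test."""
    counts = list(expected)
    # Requirement 1: every expected count is at least 1.
    all_at_least_one = all(e >= 1 for e in counts)
    # Requirement 2: no more than 20% of expected counts fall below 5.
    share_below_five = sum(e < 5 for e in counts) / len(counts)
    return all_at_least_one and share_below_five <= 0.20


print(requirements_met([228, 239, 176, 357]))  # expected counts from the example
print(requirements_met([0.8, 12, 30, 40]))     # one expected count below 1
print(requirements_met([3, 4, 30, 40, 50]))    # 40% of expected counts below 5
```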


CAUTION!

If the requirements in Step 3(b) are not satisfied, one option is to combine two or more of the low-frequency categories into a single category.


Classical Approach

Step 3 (continued):

c) Compute the test statistic:

$$\chi^2_0 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Step 4: Determine the critical value. All goodness-of-fit tests are right-tailed tests, so the critical value is $\chi^2_{\alpha}$ with k − 1 degrees of freedom.

Compare the critical value to the test statistic. If $\chi^2_0 > \chi^2_{\alpha}$, reject the null hypothesis.
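As a rough illustration of the classical approach, the critical value can be found numerically rather than read from a table. The sketch below is not the text's method: the helper names `chi2_sf` and `chi2_critical_value` are hypothetical, only the Python standard library is used, and the right-tail area is computed for whole-number degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Right-tail area P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points: df = 2 gives exp(-z), df = 1 gives erfc(sqrt(z)).
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))
    # Step up to a = df / 2 using Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail


def chi2_critical_value(alpha, df):
    """Find the value with right-tail area alpha by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # widen the bracket until it contains the answer
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


critical = chi2_critical_value(0.05, 3)
print(round(critical, 3))
```

With α = 0.05 and 3 degrees of freedom, the result should agree with the tabled critical value of about 7.815.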


P-value Approach

By Hand Step 3 (continued):

c) Compute the test statistic:

$$\chi^2_0 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$


d) Use Table VIII to approximate the P-value by determining the area under the chi-square distribution with k − 1 degrees of freedom to the right of the test statistic.
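The area to the right of the test statistic can also be computed directly instead of being bracketed from a table. The sketch below assumes whole-number degrees of freedom and uses only the standard library; `chi2_sf` is a hypothetical helper name, not a function from the text or from any statistics package.

```python
import math


def chi2_sf(x, df):
    """Right-tail area P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points: df = 2 gives exp(-z), df = 1 gives erfc(sqrt(z)).
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))
    # Step up to a = df / 2 using Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail


# A test statistic equal to the 0.05 critical value for 3 degrees of freedom
# should give a P-value very close to 0.05.
print(round(chi2_sf(7.815, 3), 3))
```

Larger test statistics leave less area to their right, so the P-value shrinks as the observed counts move away from the expected counts.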


Technology Step 3 (continued):

c) Use a statistical spreadsheet or calculator with statistical capabilities to obtain the P-value. The directions for obtaining the P-value using the TI-84 Plus graphing calculator, MINITAB, Excel, and StatCrunch are in the Technology Step-by-Step in the text.


Step 4: If the P-value < α, reject the null hypothesis.


Step 5: State the conclusion.


Parallel Example 2: Conducting a Goodness-of-Fit Test

A sociologist wishes to determine whether the distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000.

According to the United States Census Bureau, in 2000, 22.8% of care-giving grandparents had been responsible for their grandchildren for less than 1 year; 23.9% for 1 or 2 years; 17.6% for 3 or 4 years; and 35.7% for 5 or more years. The sociologist randomly selects 1,000 care-giving grandparents and obtains the following data.


Number of Years      Frequency
Less than 1 year     252
1 or 2 years         255
3 or 4 years         162
5 or more years      331

Test the claim that the distribution is different today than it was in 2000 at the α = 0.05 level of significance.


Solution

Step 1: We want to know if the distribution today is different than it was in 2000. The hypotheses are then:

H0: The distribution for the number of years care-giving grandparents are responsible for their grandchildren is the same today as it was in 2000

H1: The distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000


Step 2: The level of significance is α = 0.05.

Step 3:

a) The expected counts were computed in Example 1.

Number of Years    Observed Counts    Expected Counts
<1                 252                228
1-2                255                239
3-4                162                176
≥5                 331                357

b) Since all expected counts are greater than or equal to 5, the requirements for the goodness-of-fit test are satisfied.

Solution: Classical Approach

Step 3:

c) The test statistic is

$$\chi^2_0 = \frac{(252-228)^2}{228} + \frac{(255-239)^2}{239} + \frac{(162-176)^2}{176} + \frac{(331-357)^2}{357} \approx 6.605$$

Step 4: There are k = 4 categories, so we find the critical value using 4 − 1 = 3 degrees of freedom. The critical value is $\chi^2_{0.05} = 7.815$. Since the test statistic, $\chi^2_0 \approx 6.605$, is less than the critical value, we fail to reject the null hypothesis.

Solution: P-value Approach

Step 3:

c) The test statistic is $\chi^2_0 \approx 6.605$.

d) The P-value is the area under the chi-square distribution with 3 degrees of freedom to the right of the test statistic.

Step 4: Since the P-value ≈ 0.09 is greater than the level of significance α = 0.05, we fail to reject the null hypothesis.


Solution

Step 5: There is insufficient evidence to conclude that the distribution for the number of years care-giving grandparents are responsible for their grandchildren is different today than it was in 2000 at the α = 0.05 level of significance.
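The whole goodness-of-fit example can be reproduced in a short script. This is a sketch under stated assumptions: the variable names are hypothetical, and the P-value line uses a closed form that is valid only for 3 degrees of freedom, which is what this example has.

```python
import math

# Observed counts from the sample and the hypothesized 2000 proportions.
observed = [252, 255, 162, 331]
probs = [0.228, 0.239, 0.176, 0.357]
alpha = 0.05

n = sum(observed)
expected = [n * p for p in probs]

# Test statistic: sum of (O - E)^2 / E over the categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Right-tail area for df = 3 only: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
p_value = math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"chi-square = {chi2:.3f}, df = {df}, P-value = {p_value:.3f}, {decision}")
```

The decision depends only on comparing the P-value with α, so the same script can be rerun with a different significance level without recomputing the statistic.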

12.2 Tests for Independence and the Homogeneity of Proportions

Learning Objectives

1. Perform a test for independence

2. Perform a test for homogeneity of proportions

12.2.1 Perform a Test for Independence

The chi-square test for independence is used to determine whether there is an association between a row variable and column variable in a contingency table constructed from sample data. The null hypothesis is that the variables are not associated; in other words, they are independent. The alternative hypothesis is that the variables are associated, or dependent.


“In Other Words”

In a chi-square independence test, the null hypothesis is always

H0: The variables are independent

The alternative hypothesis is always

H1: The variables are not independent


The idea behind testing these types of claims is to compare actual counts to the counts we would expect if the null hypothesis were true (if the variables are independent). If a significant difference between the actual counts and expected counts exists, we would take this as evidence against the null hypothesis.


If two events are independent, then

P(E and F) = P(E)P(F)

We can use the Multiplication Rule for Independent Events to obtain the expected proportion of observations within each cell under the assumption of independence. Multiplying this proportion by n, the sample size, gives the expected count within each cell.


Parallel Example 1: Determining the Expected Counts in a Test for Independence

In a poll, 883 males and 893 females were asked “If you could have only one of the following, which would you pick: money, health, or love?” Their responses are presented in the table below. Determine the expected counts within each cell assuming that gender and response are independent.

         Money    Health    Love
Men      82       446       355
Women    46       574       273


Solution

Step 1: We first compute the row and column totals:

                 Money    Health    Love    Row Totals
Men              82       446       355     883
Women            46       574       273     893
Column totals    128      1020      628     1776


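Step 2: Next compute the relative marginal frequencies for the row variable and column variable:

Men: 883/1776 ≈ 0.4972    Women: 893/1776 ≈ 0.5028

Money: 128/1776 ≈ 0.0721    Health: 1020/1776 ≈ 0.5743    Love: 628/1776 ≈ 0.3536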


Step 3: Assuming gender and response are independent, we use the Multiplication Rule for Independent Events to compute the proportion of observations we would expect in each cell.

         Money     Health    Love
Men      0.0358    0.2855    0.1758
Women    0.0362    0.2888    0.1778


Step 4: We multiply the expected proportions from step 3 by 1776, the sample size, to obtain the expected counts under the assumption of independence.

         Money                      Health                      Love
Men      1776(0.0358) ≈ 63.5808     1776(0.2855) ≈ 507.048      1776(0.1758) ≈ 312.2208
Women    1776(0.0362) ≈ 64.2912     1776(0.2888) ≈ 512.9088     1776(0.1778) ≈ 315.7728


Expected Frequencies in a Chi-Square Test for Independence

To find the expected frequency in a cell when performing a chi-square independence test, multiply the cell’s row total by its column total and divide this result by the table total. That is,

$$\text{Expected frequency} = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$
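The rule, expected frequency = (row total)(column total)/(table total), can be sketched in Python for the poll data. The function name `expected_frequencies` is hypothetical. Because this version does not round the intermediate proportions to four decimal places, its results may differ slightly from the hand-computed values in the example (for instance, about 63.64 rather than 63.5808 for men who chose money).

```python
def expected_frequencies(table):
    """Return (row total * column total) / table total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


# Rows: Men, Women. Columns: Money, Health, Love.
observed = [[82, 446, 355], [46, 574, 273]]
expected = expected_frequencies(observed)

for label, row in zip(["Men", "Women"], expected):
    print(label, [round(value, 2) for value in row])
```

The expected frequencies in each row and column add up to the same totals as the observed table, which is a useful check.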

The test statistic

$$\chi^2_0 = \sum \frac{(O_i - E_i)^2}{E_i}$$

where Oi is the observed frequency and Ei is the expected frequency in the ith cell, approximately follows the chi-square distribution with (r − 1)(c − 1) degrees of freedom, where r is the number of rows and c is the number of columns in the contingency table, provided that (1) all expected frequencies are greater than or equal to 1 and (2) no more than 20% of the expected frequencies are less than 5.


Chi-Square Test for Independence

To test the hypothesis regarding the association between (or independence of) two variables in a contingency table, we use the steps that follow:

Step 1: Determine the null and alternative hypotheses.

H0: The row variable and column variable are independent.

H1: The row variable and column variable are dependent.


Step 2: Choose a level of significance, α, depending on the seriousness of making a Type I error.


Step 3:

a) Calculate the expected frequencies (counts) for each cell in the contingency table.

b) Verify that the requirements for the chi-square test for independence are satisfied:

1. All expected frequencies are greater than or equal to 1 (all Ei ≥ 1).

2. No more than 20% of the expected frequencies are less than 5.


Classical Approach

Step 3:

c) Compute the test statistic:

$$\chi^2_0 = \sum \frac{(O_i - E_i)^2}{E_i}$$

Step 4: Determine the critical value. All chi-square tests for independence are right-tailed tests, so the critical value is $\chi^2_{\alpha}$ with (r − 1)(c − 1) degrees of freedom.

Compare the critical value to the test statistic. If $\chi^2_0 > \chi^2_{\alpha}$, reject the null hypothesis.


P-value Approach

By Hand Step 3:

c) Compute the test statistic:

$$\chi^2_0 = \sum \frac{(O_i - E_i)^2}{E_i}$$


d) Use Table VIII to determine an approximate P-value by determining the area under the chi-square distribution with (r − 1)(c − 1) degrees of freedom to the right of the test statistic.


Technology Step 3:

c) Use a statistical spreadsheet or calculator with statistical capabilities to obtain the P-value. The directions for obtaining the P-value using the TI-83/84 Plus graphing calculator, MINITAB, Excel, and StatCrunch are in the Technology Step-by-Step in the text.


Step 4: If the P-value < α, reject the null hypothesis.


Step 5: State the conclusion.


Parallel Example 2: Performing a Chi-Square Test for Independence

In a poll, 883 males and 893 females were asked “If you could have only one of the following, which would you pick: money, health, or love?” Their responses are presented in the table below. Test the claim that gender and response are independent at the α = 0.05 level of significance.

         Money    Health    Love
Men      82       446       355
Women    46       574       273


Solution

Step 1: We want to know whether gender and response are dependent or independent so the hypotheses are:

H0: gender and response are independent

H1: gender and response are dependent

Step 2: The level of significance is α = 0.05.


Step 3:

(a) The expected frequencies were computed in Example 1 and are given in parentheses in the table below, along with the observed frequencies.

         Money            Health            Love
Men      82 (63.5808)     446 (507.048)     355 (312.2208)
Women    46 (64.2912)     574 (512.9088)    273 (315.7728)

(b) Since none of the expected frequencies are less than 5, the requirements for the chi-square test for independence are satisfied.

Solution: Classical Approach

Step 3:

(c) The test statistic is

$$\chi^2_0 = \frac{(82-63.5808)^2}{63.5808} + \frac{(446-507.048)^2}{507.048} + \frac{(355-312.2208)^2}{312.2208} + \frac{(46-64.2912)^2}{64.2912} + \frac{(574-512.9088)^2}{512.9088} + \frac{(273-315.7728)^2}{315.7728} \approx 36.82$$

Step 4: There are r = 2 rows and c = 3 columns, so we find the critical value using (2 − 1)(3 − 1) = 2 degrees of freedom. The critical value is $\chi^2_{0.05} = 5.991$. Since the test statistic, $\chi^2_0 \approx 36.82$, is greater than the critical value, we reject the null hypothesis.

Solution: P-value Approach

Step 3:

(c) The test statistic is $\chi^2_0 \approx 36.82$.

(d) Using Table VIII with 2 degrees of freedom, the area to the right of the test statistic is less than 0.005, so the P-value is less than 0.005.

Step 4: Since the P-value is less than the level of significance α = 0.05, we reject the null hypothesis.


Solution

Step 5: There is sufficient evidence to conclude that gender and response are dependent at the α = 0.05 level of significance.
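The independence test on the poll data can be sketched end to end as follows. All names are hypothetical, and two assumptions are worth flagging: the expected frequencies are left unrounded, so the statistic may differ in the second decimal place from a hand calculation based on the rounded expected counts in the table, and the P-value line uses the closed form exp(−x/2), which holds only when there are 2 degrees of freedom.

```python
import math

# Rows: Men, Women. Columns: Money, Health, Love.
observed = [[82, 446, 355], [46, 574, 273]]
alpha = 0.05

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency for each cell: (row total)(column total) / table total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Test statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Right-tail area; this closed form is valid only for df = 2.
p_value = math.exp(-chi2 / 2)

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.2e}, {decision}")
```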


To see the relation between response and gender, we draw bar graphs of the conditional distributions of response by gender. Recall that a conditional distribution lists the relative frequency of each category of a variable, given a specific value of the other variable in a contingency table.


Parallel Example 3: Constructing a Conditional Distribution and Bar Graph

Find the conditional distribution of response by gender for the data from the previous example, reproduced below.

         Money    Health    Love
Men      82       446       355
Women    46       574       273


Solution

We first compute the conditional distribution of response by gender.
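A conditional distribution of response by gender divides each count by its row total. The sketch below shows one way to compute it; the names `counts` and `conditional_distribution` are hypothetical, and the bar graph itself is left out.

```python
responses = ["Money", "Health", "Love"]
counts = {"Men": [82, 446, 355], "Women": [46, 574, 273]}


def conditional_distribution(row):
    """Relative frequency of each response within one gender."""
    total = sum(row)
    return [count / total for count in row]


conditional = {gender: conditional_distribution(row) for gender, row in counts.items()}

for gender, distribution in conditional.items():
    summary = ", ".join(f"{r}: {p:.4f}" for r, p in zip(responses, distribution))
    print(f"{gender} -> {summary}")
```

Each gender's relative frequencies sum to 1, so the two distributions can be compared side by side even though the group sizes differ.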


12.2.2 Perform a Test for Homogeneity of Proportions

In a chi-square test for homogeneity of proportions, we test whether different populations have the same proportion of individuals with some characteristic.


The procedures for performing a test of homogeneity are identical to those for a test of independence.


Parallel Example 5: A Test for Homogeneity of Proportions

The following question was asked of a random sample of individuals in 1992, 2002, and 2008: “Would you tell me if you feel being a teacher is an occupation of very great prestige?” The results of the survey are presented below:

       1992    2002    2008
Yes    418     479     525
No     602     541     485


Solution

Step 1: The null hypothesis is a statement of “no difference” so the proportions for each year who feel that being a teacher is an occupation of very great prestige are equal. We state the hypotheses as follows:

H0: p1 = p2 = p3

H1: At least one of the proportions is different from the others.

Step 2: The level of significance is α = 0.01.


Step 3:

a) The expected frequencies are found by multiplying the appropriate row and column totals and then dividing by the total sample size. They are given in parentheses in the table below, along with the observed frequencies.

       1992             2002             2008
Yes    418 (475.554)    479 (475.554)    525 (470.892)
No     602 (544.446)    541 (544.446)    485 (539.108)


b) Since none of the expected frequencies are less than 5, the requirements are satisfied.


Solution: Classical Approach

Step 3:

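c) The test statistic is

$$\chi^2_0 = \frac{(418-475.554)^2}{475.554} + \frac{(479-475.554)^2}{475.554} + \frac{(525-470.892)^2}{470.892} + \frac{(602-544.446)^2}{544.446} + \frac{(541-544.446)^2}{544.446} + \frac{(485-539.108)^2}{539.108} \approx 24.74$$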

Step 4: There are r = 2 rows and c = 3 columns, so we find the critical value using (2 − 1)(3 − 1) = 2 degrees of freedom. The critical value is $\chi^2_{0.01} = 9.210$. Since the test statistic, $\chi^2_0 \approx 24.74$, is greater than the critical value, we reject the null hypothesis.


Solution: P-value Approach

By Hand Step 3:

c) The test statistic is $\chi^2_0 \approx 24.74$.

d) Using Table VIII with (2 − 1)(3 − 1) = 2 degrees of freedom, the area to the right of the test statistic $\chi^2_0 \approx 24.74$ is less than 0.005, so the P-value is less than 0.005.

Step 4: Because the P-value is less than the level of significance α = 0.01, we reject the null hypothesis.


Solution

Step 5: There is sufficient evidence to reject the null hypothesis at the α = 0.01 level of significance. We conclude that the proportion of individuals who believe that teaching is a very prestigious career is different for at least one of the three years.
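Since the homogeneity test uses the same procedure as the test for independence, the teacher-prestige example can be sketched the same way. The names are hypothetical, and the P-value line relies on the closed form exp(−x/2), which is valid only for the 2 degrees of freedom in this table.

```python
import math

years = [1992, 2002, 2008]
# Rows: Yes, No. Columns: 1992, 2002, 2008.
observed = [[418, 479, 525], [602, 541, 485]]
alpha = 0.01

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Sample proportion answering "Yes" in each year.
yes_proportions = [yes / n for yes, n in zip(observed[0], col_totals)]

# Expected frequencies and test statistic, as in the test for independence.
expected = [[r * c / total for c in col_totals] for r in row_totals]
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Right-tail area; this closed form is valid only for df = 2.
p_value = math.exp(-chi2 / 2)

for year, p in zip(years, yes_proportions):
    print(f"{year}: proportion yes = {p:.4f}")
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.2e}, {decision}")
```

Printing the sample proportions alongside the test result shows which years drive the difference, something the test statistic alone does not reveal.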