majesty-ortiz
THE CHI-SQUARE TEST FOR TWO INDEPENDENT SAMPLES
FUNCTION
correlation/test of relationship
nominal or categorical scaling
the question arises as to whether the two variables are independent of each other
used to determine the significance of differences between two independent groups
the hypothesis being tested is usually that the two groups differ with respect to some characteristics
APPLICATION
two political groups differ in their agreement or disagreement with some opinion
sexes differ in the frequency with which they choose certain leisure time activities
METHOD: SAMPLE PROBLEM 1
Fifty male and 50 female high school students were asked about their drink preferences. Among the males, 35 preferred cold drinks and 15 preferred regular drinks. Among the females, 40 preferred cold drinks and 10 preferred regular drinks. Is sex related to drink preference?
1) The data are arranged into a frequency or contingency table in which rows represent groups and each column represents a category of the measured variable. The frequencies shown are observed frequencies.
          Cold   Regular   Total
Male        35        15      50
Female      40        10      50
Total       75        25     100
2) Expected frequencies must be obtained. For any contingency table with R rows and C columns, the expected frequency of a cell is obtained by multiplying its row and column marginal totals and dividing by the total number of observations N.
          Cold   Regular   Total
Male      37.5      12.5      50
Female    37.5      12.5      50
Total       75        25     100
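The expected-frequency rule in step 2 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `expected_frequencies` is hypothetical, and the table is entered as a list of rows (Male, Female) with columns Cold and Regular.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Sample Problem 1: rows are Male, Female; columns are Cold, Regular.
observed = [[35, 15], [40, 10]]
expected = expected_frequencies(observed)
print(expected)  # [[37.5, 12.5], [37.5, 12.5]]
```

The same function works for any R x C table, since it only relies on the marginal totals.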
I. H₀: Sex is not related to drink preference; that is, sex and drink preference are independent.
H₁: Sex is related to drink preference; that is, sex and drink preference are dependent.
II. Statistical Test: Since drink preference (cold or regular) is a categorical variable, there are two independent groups (male and female), and the categories are mutually exclusive, the chi-square test for independent groups is appropriate to test H₀.
III. Significance Level: Let α = .01; N = 100. The degrees of freedom are df = (r – 1)(c – 1), where r is the number of rows or groups (2) and c is the number of columns or categories (2). Hence, df = (2 – 1)(2 – 1) = 1. Appendix Table C indicates that the critical value of χ² is 6.64.
IV. Decision Rule: If χ² ≥ 6.64, reject H₀; if χ² < 6.64, accept H₀.
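The degrees of freedom and the tabled critical value can be checked without a lookup table when df = 1. The sketch below is an illustration under that assumption only: a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(√(x/2)). The function names are hypothetical.

```python
import math


def degrees_of_freedom(rows, cols):
    """df = (r - 1)(c - 1) for an r x c contingency table."""
    return (rows - 1) * (cols - 1)


def chi_square_p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with df = 1 only."""
    return math.erfc(math.sqrt(statistic / 2.0))


df = degrees_of_freedom(2, 2)
p_at_critical = chi_square_p_value_df1(6.64)
print(df)                      # 1
print(p_at_critical < 0.0101)  # True: 6.64 cuts off roughly the upper 1%
```

For tables with df > 1 this shortcut does not apply, and a printed table or a statistics library is needed instead.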
V. Computation:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

  O       E    O - E   (O - E)²   (O - E)²/E
 35    37.5    -2.5       6.25         0.17
 15    12.5     2.5       6.25         0.50
 40    37.5     2.5       6.25         0.17
 10    12.5    -2.5       6.25         0.50
100   100                         χ² = 1.34

Summing the rounded terms gives 1.34; the unrounded sum is 1.33.

VI. Decision: Since 1.34 < 6.64, we accept H₀.
VII. Thus, sex is not related to drink preference; sex and drink preference are independent.
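The computation in step V can be reproduced with a short Python sketch. The function name `chi_square_statistic` is hypothetical; the observed and expected values are those of Sample Problem 1. Because the code does not round the individual terms, it returns 1.33 rather than the 1.34 obtained by adding rounded terms.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [35, 15, 40, 10]
expected = [37.5, 12.5, 37.5, 12.5]
chi_square = chi_square_statistic(observed, expected)
print(round(chi_square, 2))  # 1.33

# Decision rule from step IV: reject H0 only if the statistic reaches 6.64.
reject_h0 = chi_square >= 6.64
print(reject_h0)  # False
```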
METHOD: SAMPLE PROBLEM 2
Suppose we wish to test whether tall and short people differ with respect to leadership qualities. The table below shows frequencies with which 43 short people and 52 tall people are categorized as “leaders,” “followers,” and “unclassifiable.”
                 Short   Tall   Combined
Follower            22     14         36
Unclassifiable       9      6         15
Leader              12     32         44
Total               43     52         95
2 X 2 CONTINGENCY TABLE
With the cells labeled A and B in the first row and C and D in the second row:

$$\chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)}$$

where df = 1
Going back to Sample Problem 1:
$$\chi^2 = \frac{100[(35)(10) - (15)(40)]^2}{(35 + 15)(40 + 10)(35 + 40)(15 + 10)} = 1.33$$
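The 2 x 2 shortcut formula can be sketched as a small Python function. The name `chi_square_2x2` is hypothetical; A and B are taken as the first-row cells and C and D as the second-row cells, matching the substitution used for Sample Problem 1.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for a 2 x 2 table (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Sample Problem 1: A = 35, B = 15, C = 40, D = 10.
print(round(chi_square_2x2(35, 15, 40, 10), 2))  # 1.33
```

The result agrees with the cell-by-cell sum of (O - E)²/E, since the shortcut is an algebraic rearrangement of the same statistic.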
PARTITIONING OF DEGREES OF FREEDOM IN R X 2 TABLES
If the table is larger than 2 x 2 and H₀ is rejected, the contingency table may be partitioned into independent subtables to determine just where the differences lie in the original table.
Going back to Sample Problem 2:
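(1)
                 Short   Tall
Follower            22     14     36
Unclassifiable       9      6     15
                    43     52     95

(2)
                              Short   Tall
Follower or unclassifiable       31     20     51
Leader                           12     32     44
                                 43     52     95

In subtable (1) the bottom row shows the column totals and N of the full table, which the partition formula uses.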
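$$\chi_1^2 = \frac{N^2(AD - BC)^2}{C_1 C_2 R_2 R_1 (R_1 + R_2)}$$

$$\chi_2^2 = \frac{N[F(A + C) - E(B + D)]^2}{C_1 C_2 R_3 (R_1 + R_2)}$$

where A and B, C and D, and E and F are the Short and Tall frequencies in rows 1, 2, and 3 of the original table, R₁, R₂, and R₃ are the row totals, C₁ and C₂ are the column totals, and N is the total number of observations.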
χ₁² = .005
With df = 1 at α = .01, H₀ is accepted since .005 < 6.64. Thus, there is no relation between stature and people who are either followers or not classifiable in terms of leadership.
χ₂² = 10.707
With df = 1 at α = .01, H₀ is rejected since 10.707 ≥ 6.64. Thus, the distribution of leaders and nonleaders differs as a function of stature.
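The two partition statistics for Sample Problem 2 can be reproduced with the sketch below. It assumes the cell labeling A, B (Follower), C, D (Unclassifiable), E, F (Leader) for the Short and Tall columns, with R for row totals and C for column totals; the function name `partition_3x2` is hypothetical.

```python
def partition_3x2(table):
    """Return the two 1-df partition chi-squares for a 3 x 2 table."""
    (a, b), (c, d), (e, f) = table
    r1, r2, r3 = a + b, c + d, e + f          # row totals
    c1, c2 = a + c + e, b + d + f             # column totals
    n = c1 + c2
    # Subtable (1): row 1 versus row 2.
    chi1 = n ** 2 * (a * d - b * c) ** 2 / (c1 * c2 * r2 * r1 * (r1 + r2))
    # Subtable (2): rows 1 and 2 combined versus row 3.
    chi2 = n * (f * (a + c) - e * (b + d)) ** 2 / (c1 * c2 * r3 * (r1 + r2))
    return chi1, chi2


# Rows: Follower, Unclassifiable, Leader; columns: Short, Tall.
chi1, chi2 = partition_3x2([(22, 14), (9, 6), (12, 32)])
print(round(chi1, 3), round(chi2, 3))  # 0.005 10.707
```

For these data the two components add up to about 10.71, which matches the chi-square computed cell by cell on the full 3 x 2 table.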
WHEN TO USE: THE 2 x 2 CASE
• When N ≤ 20, always use the Fisher exact test.
• When 20 < N < 40, the χ² test may be used if all expected frequencies are 5 or more. If the smallest expected frequency is less than 5, use the Fisher exact test.
• When N > 40, use χ² corrected for continuity.
WHEN TO USE: CONTINGENCY TABLES WITH df > 1
• χ² may be used if fewer than 20% of the cells have an expected frequency of less than 5 and if no cell has an expected frequency of less than 1.
• If these requirements are not met, the researcher should combine adjacent categories to increase the expected frequencies in the various cells.
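The expected-frequency requirements for tables with df > 1 can be checked mechanically. The sketch below is an illustration only; the function names are hypothetical, and the check is applied to the leadership table of Sample Problem 2.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_is_appropriate(expected):
    """True if fewer than 20% of cells are below 5 and no cell is below 1."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(1 for e in cells if e < 5) / len(cells)
    return share_below_5 < 0.20 and min(cells) >= 1


# Rows: Follower, Unclassifiable, Leader; columns: Short, Tall.
leadership = [[22, 14], [9, 6], [12, 32]]
usable = chi_square_is_appropriate(expected_frequencies(leadership))
print(usable)  # True
```

When the check returns False, the rule above suggests combining adjacent categories and recomputing the expected frequencies before testing.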
WHEN TO USE: SMALL EXPECTED VALUES
• Yates’ correction for continuity
To apply this correction, we reduce by .5 the obtained frequencies that are greater than expectation and increase by .5 the obtained frequencies that are less than expectation.

$$\chi^2 = \frac{N\left(|AD - BC| - \frac{N}{2}\right)^2}{(A + B)(C + D)(A + C)(B + D)}$$
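The continuity-corrected formula can be sketched in Python as follows. The function name `chi_square_2x2_yates` is hypothetical, and applying it to Sample Problem 1 is only an illustration of the arithmetic; the slides themselves report the uncorrected value for that problem.

```python
def chi_square_2x2_yates(a, b, c, d):
    """Chi-square for a 2 x 2 table with Yates' correction for continuity."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Sample Problem 1 cells: A = 35, B = 15, C = 40, D = 10.
print(round(chi_square_2x2_yates(35, 15, 40, 10), 2))  # 0.85
```

The corrected statistic is smaller than the uncorrected 1.33, as expected, because the correction shrinks each |O - E| by .5 before squaring.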
APPENDIX TABLE C
REFERENCES
Siegel, S. and Castellan, N. J. (1988). Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.
Ferguson, G. A. and Takane, Y. (1989). Statistical Analysis in Psychology and Education. United States: McGraw-Hill.