Chi-square Basics


The Chi-square distribution

• Positively skewed but becomes symmetrical with increasing degrees of freedom

• Mean = k, where k = degrees of freedom

• Variance = 2k

• Assuming a normally distributed dataset and sampling a single z² value at a time: $\chi^2(1) = z^2$

– If more than one: $\chi^2(N) = \sum_{i=1}^{N} z_i^2$
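The link between squared z scores and the mean and variance of the distribution can be checked with a small simulation. This is only a sketch: the seed, the choice of k = 4, the number of draws, and the helper name `sample_chi_square` are illustrative assumptions, not part of the slides.

```python
import random
import statistics

def sample_chi_square(k, rng):
    """Draw one chi-square(k) value as the sum of k squared standard normal z scores."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(0)  # fixed seed so the run is repeatable (arbitrary choice)
k = 4                   # degrees of freedom (illustrative choice)
draws = [sample_chi_square(k, rng) for _ in range(50_000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

# Theory says mean = k and variance = 2k; the simulated values should be close.
print(f"mean     ~ {sample_mean:.2f} (theory: {k})")
print(f"variance ~ {sample_var:.2f} (theory: {2 * k})")
```

Raising k and plotting a histogram of `draws` should also show the skew shrinking as the degrees of freedom increase.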


Why used?

• Chi-square analysis is primarily used to deal with categorical (frequency) data

• We measure the “goodness of fit” between our observed outcome and the expected outcome for some variable

• With two variables, we test in particular whether they are independent of one another using the same basic approach.


One-dimensional

• Suppose we want to know how people in a particular area will vote in general and go around asking them.

• How will we go about seeing what’s really going on?

Republican Democrat Other

20 30 10


• Hypothesis: Dems should win district

• Solution: chi-square analysis to determine if our outcome is different from what would be expected if there was no preference

$\chi^2 = \sum \frac{(O - E)^2}{E}$


• Plug in to formula

            Republican   Democrat   Other
Observed    20           30         10
Expected    20           20         20

$\chi^2 = \frac{(20 - 20)^2}{20} + \frac{(30 - 20)^2}{20} + \frac{(10 - 20)^2}{20}$
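The same arithmetic can be written as a few lines of Python. This is a minimal sketch; the function name `chi_square_statistic` is an assumption made for illustration, and the expected counts follow the "no preference" null hypothesis of equal frequencies.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [20, 30, 10]  # Republican, Democrat, Other
n = sum(observed)        # 60 respondents in total

# No preference: every category is expected to receive an equal share.
expected = [n / len(observed)] * len(observed)  # 20 each

chi2 = chi_square_statistic(observed, expected)
print(chi2)  # 0/20 + 100/20 + 100/20 = 10.0
```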


• Reject H0

• The district will probably vote Democratic

• However…

$\chi^2(2) = 10$

$\chi^2_{.05}(2) = 5.99$
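The decision can be reproduced without a table because, for exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2). The sketch below relies on that special case only; the helper names are hypothetical, and other degrees of freedom would need a table or a statistics library.

```python
import math

def chi_square_sf_df2(x):
    """Upper-tail probability P(X > x) for a chi-square with exactly 2 df."""
    return math.exp(-x / 2.0)

def chi_square_critical_df2(alpha):
    """Critical value for a chi-square with exactly 2 df at level alpha."""
    return -2.0 * math.log(alpha)

chi2 = 10.0   # statistic from the voting example
alpha = 0.05  # significance level used on the slide

critical = chi_square_critical_df2(alpha)
p_value = chi_square_sf_df2(chi2)
reject_h0 = chi2 > critical

print(f"critical = {critical:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```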


Conclusion

• Note that all we can really conclude is that our data differ from the outcome expected under the null hypothesis (here, no preference)

– Although it would appear that the district will vote Democratic, really we can only conclude they were not responding by chance

– Regardless of the position of the frequencies we’d have come up with the same result

– In other words, it is a non-directional test regardless of the prediction
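The point about the position of the frequencies can be demonstrated by shuffling the observed counts across the three categories. A short sketch, with `chi_square_statistic` as an illustrative helper name:

```python
from itertools import permutations

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected = [20, 20, 20]  # no preference among Republican, Democrat, Other

# Try every assignment of the counts 20, 30, 10 to the three categories.
results = {
    perm: chi_square_statistic(perm, expected)
    for perm in permutations([20, 30, 10])
}

for perm, chi2 in results.items():
    print(perm, chi2)
```

Every ordering yields the same statistic, so a result with 30 Republicans would reject H0 just as strongly as one with 30 Democrats.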


More complex

• What do stats kids do with their free time?

           TV    Nap   Worry   Stare at Ceiling
Males      30    40    20      10
Females    20    30    40      10


• Is there a relationship between gender and what the stats kids do with their free time?

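• Expected: $E_{ij} = (R_i \times C_j)/N$, where $R_i$ is the total for row i, $C_j$ is the total for column j, and N is the grand total

• Example for males TV: (100*50)/200 = 25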

           TV    Nap   Worry   Stare at Ceiling   Total
Males      30    40    20      10                 100
Females    20    30    40      10                 100
Total      50    70    60      20                 200
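The expected frequency for every cell follows from the row and column totals in the same way as the males/TV example. A minimal sketch, assuming the table is stored as a list of rows; `expected_frequencies` is a hypothetical helper name:

```python
def expected_frequencies(table):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [
    [30, 40, 20, 10],  # Males:   TV, Nap, Worry, Stare at Ceiling
    [20, 30, 40, 10],  # Females: TV, Nap, Worry, Stare at Ceiling
]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Because both rows have the same total (100), the two rows of expected counts come out identical.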


• df = (R-1)(C-1)

– R = number of rows

– C = number of columns

– Here: (2-1)(4-1) = 3

               TV        Nap       Worry     Stare at Ceiling   Total
Males (E)      30 (25)   40 (35)   20 (30)   10 (10)            100
Females (E)    20 (25)   30 (35)   40 (30)   10 (10)            100
Total          50        70        60        20                 200


Interpretation

• Reject H0, there is some relationship between gender and how stats students spend their free time

$\chi^2(3) = 10.10$

$\chi^2_{.05}(3) = 7.82$
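The reported statistic can be reproduced end to end from the observed table. The sketch below is illustrative only: the function name is an assumption, and the critical value 7.82 is copied from the slide rather than computed.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

observed = [
    [30, 40, 20, 10],  # Males
    [20, 30, 40, 10],  # Females
]

chi2, df = chi_square_independence(observed)
critical = 7.82  # tabled value for df = 3, alpha = .05 (taken from the slide)
reject_h0 = chi2 > critical

print(f"chi2({df}) = {chi2:.2f}, reject H0: {reject_h0}")
```

Most of the statistic comes from the Worry column, where both cells are 10 away from the expected count of 30.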


Other

• Because the test is non-directional, the chi-square test by itself cannot speak to specific hypotheses about the way the results would come out

• For the same reason it is not useful for ordinal data: the statistic ignores the ordering of the categories


Assumptions

• Normality

– Rule of thumb is that each expected frequency should be at least 5

• Inclusion of non-occurrences

– Must include all responses, not just the positive ones

• Independence

– Not that the variables are independent or related (that’s what the test can be used for), but rather, as with our t-tests, that the observations (data points) don’t have any bearing on one another.

• To help with the last two, make sure that your N equals the total number of people who responded
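Two of these checks are mechanical and can be automated: the expected-frequency rule of thumb and the match between N and the number of respondents. The sketch below assumes a hypothetical helper `check_assumptions` and uses the free-time table as its example; independence of observations depends on the study design and cannot be verified from the counts alone.

```python
def check_assumptions(observed, expected, n_respondents, min_expected=5):
    """Check the expected-frequency rule of thumb and that N matches respondents."""
    observed_cells = [o for row in observed for o in row]
    expected_cells = [e for row in expected for e in row]
    return {
        # Rule of thumb: every expected frequency is at least min_expected.
        "expected_at_least_min": all(e >= min_expected for e in expected_cells),
        # Each respondent should be counted exactly once across all cells.
        "n_matches_respondents": sum(observed_cells) == n_respondents,
    }

observed = [[30, 40, 20, 10], [20, 30, 40, 10]]
expected = [[25, 35, 30, 10], [25, 35, 30, 10]]

checks = check_assumptions(observed, expected, n_respondents=200)
print(checks)
```

If the cell counts add up to more than the number of respondents, some people were counted more than once; if they add up to less, some responses (often the non-occurrences) were left out.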


Measures of Association

• Contingency coefficient

• Phi

• Cramer’s Phi

• Odds Ratios

• Kappa

• These were discussed in 5700
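As a reminder of how two of these measures relate to the chi-square statistic, the sketch below uses the standard textbook definitions, which the slides do not restate: contingency coefficient C = sqrt(χ² / (χ² + N)) and Cramér’s phi = sqrt(χ² / (N(k - 1))), where k is the smaller of the number of rows and columns. The function names are illustrative, and the input values come from the free-time example.

```python
import math

def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (chi2 + N))."""
    return math.sqrt(chi2 / (chi2 + n))

def cramers_phi(chi2, n, n_rows, n_cols):
    """sqrt(chi2 / (N * (k - 1))), with k the smaller of rows and columns."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))

chi2 = 10.0952  # statistic for the 2 x 4 free-time table
n = 200         # total number of respondents

c = contingency_coefficient(chi2, n)
v = cramers_phi(chi2, n, n_rows=2, n_cols=4)

print(f"C = {c:.3f}, Cramer's phi = {v:.3f}")
```

For a 2 x 2 table k - 1 equals 1, so Cramér’s phi reduces to the ordinary phi coefficient, sqrt(χ² / N).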