gerard-jacobs
χ² Tests
How to know when to use them. Which one should I use?
Chi-Square Tests
How do I know to use a chi-square test?
Only use a chi-square test if all of the data in question are categorical, and remember to check your assumptions and conditions:
• Randomness
• Independence
• 10% Rule
• Counted Data – the values in each cell must be counts for the categories of a categorical (qualitative) variable.
• Expected Cell Frequency – every cell should have an expected count of at least 5.
Chi-Square Goodness of Fit Test
This test allows us to compare a collection of categorical data with some theoretical expected distribution.
Sometimes called a one-sample test.
Degrees of Freedom = number of categories – 1.
Example: birth month of baseball players.
Chi-Square Goodness of Fit Test
After getting trounced by your little brother in a children’s game, you suspect the die he gave you to roll may be unfair. To check, you roll it 60 times, recording the number of times each face appears. Do these results cast doubt on the die’s fairness?
• If the die is fair, how many times would you expect each face to show?
• To see if these results are unusual, what type of test will you perform?
• State your hypotheses.
Face   Count
1      11
2      7
3      9
4      15
5      12
6      6
Chi-Square Goodness of Fit Test (continued)
Using the same 60 die rolls:
• Check the conditions.
• How many degrees of freedom are there?
• Find χ² and the P-value.
• State your conclusion.
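The die questions above can be worked through numerically. The following is a minimal sketch, assuming a fair die gives equal expected counts and using only Python's standard library; the helper `chi2_sf` is a hypothetical name introduced here for illustration, and it assumes the degrees of freedom are a positive integer.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Assumes df is a positive integer, so the regularized incomplete gamma
    function reduces to a finite sum.
    """
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum of y^i / i! for i < df/2
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: start from erfc(sqrt(y)), the df = 1 case, then add one term per extra 2 df.
    tail = math.erfc(math.sqrt(y))
    for j in range((df - 1) // 2):
        tail += math.exp(-y) * y ** (j + 0.5) / math.gamma(j + 1.5)
    return tail


# Observed counts for each face, taken from the table above.
observed = {1: 11, 2: 7, 3: 9, 4: 15, 5: 12, 6: 6}

n_rolls = sum(observed.values())        # total number of rolls
expected = n_rolls / len(observed)      # a fair die spreads the rolls evenly over the faces

# Chi-square statistic: sum over faces of (observed - expected)^2 / expected
chi2 = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1                  # categories minus one
p_value = chi2_sf(chi2, df)

print(f"expected per face = {expected:.0f}")
print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.3f}")
```

Under these assumptions the statistic comes out at 5.6 on 5 degrees of freedom, with a P-value of roughly 0.35, so these rolls would not give convincing evidence that the die is unfair.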
Chi-Square Test of Homogeneity
A test comparing the distribution of counts for two or more groups on the same categorical variable.
Finds the expected counts based on the overall frequencies, adjusted for the totals in each group, under the null-hypothesis assumption that the distributions are the same for each group.
Degrees of Freedom = (rows – 1)(columns – 1), where rows gives the number of categories and columns gives the number of independent groups.
Example: future plans of graduates based on college of study.
Chi-Square Test of Homogeneity
Does your doctor know? A survey of articles from the New England Journal of Medicine (NEJM) classified them according to the principal statistical methods used. The articles recorded were all non-editorial articles appearing during the indicated years. Has there been a change in the use of statistics?
• What kind of test would be appropriate?
• State the hypotheses.
• How many degrees of freedom are there?
• The smallest expected count will be in the 1989/No Stats cell. What is it?
                 Publication Year
           1978-79   1989   2004-05   Total
No Stats        90     14        40     144
Stats          242    101       271     614
Total          332    115       311     758
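The expected counts and degrees of freedom asked for above follow directly from the table margins. Here is a small sketch in plain Python; the variable names are illustrative and not part of the original slides.

```python
# Observed article counts from the NEJM table (rows: use of statistics, columns: years).
observed = {
    "No Stats": {"1978-79": 90, "1989": 14, "2004-05": 40},
    "Stats": {"1978-79": 242, "1989": 101, "2004-05": 271},
}
years = ["1978-79", "1989", "2004-05"]

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {year: sum(observed[row][year] for row in observed) for year in years}
grand_total = sum(row_totals.values())

# Expected count under the null hypothesis: row total * column total / grand total
expected = {
    row: {year: row_totals[row] * col_totals[year] / grand_total for year in years}
    for row in observed
}

df = (len(observed) - 1) * (len(years) - 1)   # (rows - 1)(columns - 1)

# Find the cell with the smallest expected count.
smallest = min((expected[row][year], row, year) for row in observed for year in years)

print(f"df = {df}")
print(f"smallest expected count = {smallest[0]:.2f} in the {smallest[2]}/{smallest[1]} cell")
```

The smallest expected count lands in the 1989/No Stats cell at about 21.85, which comfortably satisfies the expected-cell-frequency condition.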
Chi-Square Test of Homogeneity (continued)
Using the same NEJM table of article counts:
• Check the assumptions and conditions for inference.
• Calculate the component of chi-square for the 1989/No Stats cell.
• For this test, χ² = 25.28. What’s the P-value?
• State your conclusion.
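A short sketch of the two calculations requested above. It relies on one shortcut that is an assumption worth flagging: with exactly 2 degrees of freedom, the chi-square upper-tail probability simplifies to exp(-χ²/2), so no statistics library is needed. That shortcut does not apply for other degrees of freedom.

```python
import math

# 1989/No Stats cell: observed count and expected count from the table margins.
observed_count = 14
expected_count = 144 * 115 / 758        # row total * column total / grand total

# Component of chi-square contributed by this one cell.
component = (observed_count - expected_count) ** 2 / expected_count

# P-value for the reported statistic; df = (2 - 1)(3 - 1) = 2.
chi2, df = 25.28, 2
p_value = math.exp(-chi2 / 2)           # closed form valid only when df == 2

print(f"component = {component:.2f}")
print(f"P-value = {p_value:.2e}")
```

The P-value is on the order of a few in a million, so the data give strong evidence that the use of statistics in NEJM articles changed across the three periods.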
Chi-Square Test of Homogeneity (continued)
Standardized residuals for the same NEJM table:
• Show how the residual for the 1989/No Stats cell was calculated.
• What can you conclude from the patterns in the standardized residuals?
                 Publication Year
           1978-79   1989   2004-05
No Stats      3.39  -1.68     -2.48
Stats        -1.64   0.81      1.20
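The residuals in this table appear to be computed as (observed – expected) / √expected for each cell. The sketch below recomputes them from the original counts under that assumption; squaring and summing them should also recover the χ² statistic quoted earlier.

```python
import math

# Observed counts: row 0 = No Stats, row 1 = Stats; columns = 1978-79, 1989, 2004-05.
counts = [
    [90, 14, 40],
    [242, 101, 271],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

residuals = []
for row, row_total in zip(counts, row_totals):
    res_row = []
    for obs, col_total in zip(row, col_totals):
        exp = row_total * col_total / n            # expected count for this cell
        res_row.append((obs - exp) / math.sqrt(exp))
    residuals.append(res_row)

# The chi-square statistic is the sum of the squared residuals.
chi2 = sum(r ** 2 for res_row in residuals for r in res_row)

for label, res_row in zip(["No Stats", "Stats"], residuals):
    print(label, [round(r, 2) for r in res_row])
print(f"chi-square = {chi2:.2f}")
```

The largest residual in magnitude belongs to the 1978-79/No Stats cell: far more articles without statistics appeared in that period than the null hypothesis would predict.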
Chi-Square Test of Independence
A test of whether two categorical variables are independent. The data are usually displayed in a contingency table.
Contingency table – a two-way table that classifies individuals according to two categorical variables.
Examines the distribution of counts for one group of individuals classified according to both variables.
Degrees of Freedom = (rows – 1)(columns – 1), where rows gives the number of categories in one variable and columns gives the number of categories in the other.
Chi-Square Test of Independence
There is some concern that if a woman has an epidural to reduce pain during childbirth, the drug can get into the baby’s bloodstream, making the baby sleepier and less willing to breastfeed. Researchers followed up on 1178 births, noting whether the mother had an epidural and whether the baby was still nursing after 6 months.
• What kind of test would be appropriate?
• State the null and alternative hypotheses.
• How many degrees of freedom are there?
                                Epidural?
Breastfeeding at 6 months?    Yes     No   Total
Yes                           206    498     704
No                            190    284     474
Total                         396    782    1178
Chi-Square Test of Independence (continued)
Using the same table of 1178 births:
• The smallest expected count will be in the epidural/no breastfeeding cell. What is it?
• Check the assumptions and conditions for inference.
• Calculate the component of chi-square for the epidural/no breastfeeding cell.
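The expected count and chi-square component for the epidural/no breastfeeding cell can be sketched as follows. The code assumes the table is read with epidural status in the columns and breastfeeding status in the rows, which is the reading under which that cell has the smallest expected count; the dictionary keys are illustrative names.

```python
# Observed counts keyed by (breastfeeding at 6 months, epidural).
observed = {
    ("bf_yes", "epi_yes"): 206,
    ("bf_yes", "epi_no"): 498,
    ("bf_no", "epi_yes"): 190,
    ("bf_no", "epi_no"): 284,
}

n = sum(observed.values())

# Marginal totals for each breastfeeding row and each epidural column.
bf_totals = {}
epi_totals = {}
for (bf, epi), count in observed.items():
    bf_totals[bf] = bf_totals.get(bf, 0) + count
    epi_totals[epi] = epi_totals.get(epi, 0) + count

# Expected counts if epidural use and breastfeeding were independent.
expected = {(bf, epi): bf_totals[bf] * epi_totals[epi] / n for (bf, epi) in observed}

cell = ("bf_no", "epi_yes")             # epidural, not breastfeeding at 6 months
component = (observed[cell] - expected[cell]) ** 2 / expected[cell]
smallest_cell = min(expected, key=expected.get)

print(f"expected count = {expected[cell]:.2f}")
print(f"chi-square component = {component:.2f}")
```

With an expected count of about 159.3, even the smallest cell is far above 5, so the expected-cell-frequency condition is met.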
Chi-Square Test of Independence (continued)
Standardized residuals for the same table of 1178 births:
• For this test, χ² = 14.87. What’s the P-value?
• State your conclusion.
• Show how the residual for the epidural/no breastfeeding cell was calculated.
• What can you conclude from the standardized residuals?
                                Epidural?
Breastfeeding at 6 months?    Yes      No
Yes                         -1.99    1.42
No                           2.43   -1.73
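The statistic, P-value, and residuals for the epidural data can be recomputed from the counts. This sketch assumes residuals of the form (observed – expected) / √expected and uses the fact that, for 1 degree of freedom, the chi-square upper-tail probability equals erfc(√(χ²/2)); that identity is specific to df = 1.

```python
import math

# Rows: breastfeeding at 6 months (Yes, No); columns: epidural (Yes, No).
counts = [
    [206, 498],
    [190, 284],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

residuals = []
for row, row_total in zip(counts, row_totals):
    res_row = []
    for obs, col_total in zip(row, col_totals):
        exp = row_total * col_total / n            # expected count under independence
        res_row.append((obs - exp) / math.sqrt(exp))
    residuals.append(res_row)

chi2 = sum(r ** 2 for res_row in residuals for r in res_row)
df = (len(counts) - 1) * (len(counts[0]) - 1)      # (rows - 1)(columns - 1) = 1
p_value = math.erfc(math.sqrt(chi2 / 2))           # closed form valid only when df == 1

for label, res_row in zip(["Breastfeeding: Yes", "Breastfeeding: No"], residuals):
    print(label, [round(r, 2) for r in res_row])
print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.5f}")
```

Recomputed this way, the no-epidural/breastfeeding cell has a residual of about 1.42, and the largest residual (about 2.43) sits in the epidural/no breastfeeding cell: more babies than expected had stopped nursing by 6 months when the mother had an epidural. The P-value is well below 0.001.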
Chi-Square Test of Independence (continued)
Suppose a broader study of the same births included several additional issues, including whether a mother drank alcohol, whether this was a first child, and whether the parents occasionally supplemented breastfeeding with bottled formula. Why would it not be appropriate to use chi-square methods on the 2 × 8 table with yes/no columns for each potential factor?