Quantitative Methods Part 3: Chi-Squared Statistic


Recap on T-Statistic

It used the mean and standard error of a population sample
The data is on an “interval” or scale
Mean and standard error are the parameters
This approach is known as parametric
Another approach is non-parametric testing

Introduction to Chi-Squared

It does not use the mean and standard error of a population sample
Each respondent can only choose one category (unlike scale in T-Statistic)
The expected frequency must be at least 5 in every category for the test to be valid. If any category has an expected frequency of less than 5, you need to increase your sample size

Example using Chi-Squared

“Is there a preference amongst the UW student population for a particular web browser?” (Dr C Price’s Data)
◦They could only indicate one choice
◦These are the observed frequencies (responses from the sample)

                       Firefox  IExplorer  Safari  Chrome  Opera
Observed frequencies   30       6          4       8       2

Was it just chance?

How confident am I?
◦Was the sample representative of all UW students?
◦Was it just chance?
Chi-Squared test for significance
◦Some variations on test
◦Simplest is Null Hypothesis
H0: The students show “no preference” for a particular browser

Chi-Squared: “Goodness of fit” (No preference)

H0: The students show no preference for a particular browser

This leads to Hypothetical or Expected distribution of frequency
◦We would expect an equal number of respondents per category
◦We had 50 respondents and 5 categories

                       Firefox  IExplorer  Safari  Chrome  Opera
Expected frequencies   10       10         10      10      10

Expected frequency table

Stage 1: Formulation of Hypothesis

H0: There is no preference in the underlying population for the factor suggested.
H1: There is a preference in the underlying population for the factor suggested.

The basis of the chi-squared test is to compare the observed frequencies against the expected frequencies

Stage 2: Expected Distribution

As our “null hypothesis” is no preference, we need to work out the expected frequency:
◦You would expect each category to have the same number of respondents
◦Show this in “Expected frequency” table
◦Each expected frequency has to be at least 5 to be valid

                       Firefox  IExplorer  Safari  Chrome  Opera
Expected frequencies   10       10         10      10      10
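The working for Stage 2 can be sketched in a few lines of Python. This is only an illustration: the function name `expected_no_preference` is made up here, and the validity check uses the rule of thumb of at least 5 per category.

```python
# Observed counts from the browser survey in the slides.
observed = {"Firefox": 30, "IExplorer": 6, "Safari": 4, "Chrome": 8, "Opera": 2}


def expected_no_preference(observed_counts):
    """Spread the total number of respondents evenly over the categories."""
    total = sum(observed_counts.values())
    per_category = total / len(observed_counts)
    return {name: per_category for name in observed_counts}


expected = expected_no_preference(observed)

# Rule of thumb: every expected frequency should be at least 5.
valid = all(value >= 5 for value in expected.values())

print(expected)
print("Expected frequencies large enough:", valid)
```

With 50 respondents and 5 categories, each expected frequency comes out as 10, matching the table.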

Stage 3a: Level of confidence

Choose the level of confidence (often 0.05)
◦0.05 means we accept a 5% chance of rejecting the null hypothesis when the result was really just chance
◦Equivalently, we work at the 95% confidence level

Stage 3b: Degree of freedom

We need to find the degree of freedom
This is calculated from the number of categories
◦We had 5 categories, so df = 5 - 1 = 4

Stage 3: Critical value of Chi-Squared

In order to compare our calculated chi-squared value with the “critical value” in the chi-squared table we need:
◦Level of confidence (0.05)
◦Degree of freedom (4)
Our critical value from the table = 9.49
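If no printed table is to hand, the critical value can be approximated numerically. The sketch below is not the method used in the slides: it relies on a closed form for the chi-squared tail probability that only holds for an even number of degrees of freedom, and finds the critical value by bisection. The function names are invented for illustration.

```python
import math


def chi2_survival_even_df(x, df):
    """P(chi-squared > x) for an even number of degrees of freedom.

    Uses exp(-x/2) * sum over k < df/2 of (x/2)^k / k!,
    a closed form that is valid only when df is even.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


def critical_value(alpha, df, lo=0.0, hi=200.0):
    """Find x with P(chi-squared > x) = alpha by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_survival_even_df(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid  # tail area small enough: move left
    return (lo + hi) / 2.0


categories = 5
df = categories - 1
crit = critical_value(0.05, df)

print("df =", df, "critical value =", round(crit, 2))
```

For df = 4 at the 0.05 level this agrees with the table value of 9.49 to two decimal places.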

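Stage 4: Calculate statistics

We compare the observed against the expected for each category (take the difference)
We square each one and divide it by the expected frequency
We add all of them up

           Firefox  IExplorer  Safari  Chrome  Opera
Observed   30       6          4       8       2
Expected   10       10         10      10      10

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(30-10)^2}{10} + \frac{(6-10)^2}{10} + \frac{(4-10)^2}{10} + \frac{(8-10)^2}{10} + \frac{(2-10)^2}{10} = 52$$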
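The Stage 4 arithmetic can be checked with a short Python sketch. The helper name `chi_squared` is an illustrative choice, and it assumes the usual form of the statistic: each squared difference is divided by its expected frequency before summing.

```python
# Observed and expected counts in the order
# Firefox, IExplorer, Safari, Chrome, Opera.
observed = [30, 6, 4, 8, 2]
expected = [10, 10, 10, 10, 10]


def chi_squared(observed_counts, expected_counts):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum(
        (o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts)
    )


statistic = chi_squared(observed, expected)
print(round(statistic, 2))
```

The five terms are 40, 1.6, 3.6, 0.4 and 6.4, which add up to the 52 quoted on the slide.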

Stage 5: Decision

Can we reject the H0 that students show no preference for a particular browser?
◦Our value of 52 is way beyond 9.49. We are 95% confident the value did not occur by chance
So yes we can safely reject the null hypothesis
Which browser do they prefer?
◦Firefox, as it is way above the expected frequency of 10
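The decision step can also be sketched in Python. This is an illustration, not part of the original method: it compares the statistic with the table value and then looks at each browser's contribution to see where the departure from "no preference" comes from. The variable names are made up.

```python
browsers = ["Firefox", "IExplorer", "Safari", "Chrome", "Opera"]
observed = [30, 6, 4, 8, 2]
expected = [10, 10, 10, 10, 10]
critical = 9.49  # table value for df = 4 at the 0.05 level

# Contribution of each category to the chi-squared statistic.
contributions = {
    name: (o - e) ** 2 / e
    for name, o, e in zip(browsers, observed, expected)
}
statistic = sum(contributions.values())

# Reject the null hypothesis when the statistic exceeds the critical value.
reject_null = statistic > critical

# Categories chosen more often than expected under "no preference".
over_represented = [
    name for name, o, e in zip(browsers, observed, expected) if o > e
]
largest = max(contributions, key=contributions.get)

print("Reject H0:", reject_null)
print("Above expected:", over_represented)
print("Largest contribution:", largest)
```

Firefox is the only browser above its expected frequency, and it also accounts for most of the statistic.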

Chi-Squared: “No Difference from a Comparison Population”

RQ: Are drivers of high performance cars more likely to be involved in accidents?
◦Sample n = 50 and Market Research data of proportion of people driving these categories
◦Once the expected frequencies under the null hypothesis have been worked out, the analysis is the same as the no preference calculation

       High Performance  Compact  Midsize  Full size
FO     20                14       9        9
MR%    10%               40%      30%      20%
FE     5 (10% of 50)     20       15       10
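The FE row can be reproduced with a short Python sketch: each expected frequency is the sample size multiplied by the comparison population's percentage. The dictionary name `market_share_pct` is invented for illustration.

```python
n = 50  # sample size stated in the slides

# Market research percentages for each car category (the MR% row).
market_share_pct = {
    "High Performance": 10,
    "Compact": 40,
    "Midsize": 30,
    "Full size": 20,
}

# Expected frequency = sample size * population proportion.
expected = {
    category: n * pct / 100 for category, pct in market_share_pct.items()
}

for category, value in expected.items():
    print(category, value)
```

From here the statistic is calculated in the same way as in the no preference case, using these expected frequencies in place of equal ones.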

Chi-Squared test for “Independence”

What makes computer games fun?
Review found the following
◦Factors (Mastery, Challenge and Fantasy)
◦Different opinion depending on gender
Research sample of 50 males and 50 females

          Mastery  Challenge  Fantasy
Male      10       32         8
Female    24       8          18

Observed frequency table

What is the research question?

A single sample with individuals measured on 2 variables
◦RQ: “Is there a relationship between fun factor and gender?”
◦H0: “There is no such relationship”
Two separate samples representing 2 populations (male and female)
◦RQ: “Do male and female players have different preferences for fun factors?”
◦H0: “Male and female players do not have different preferences”

Chi-Squared analysis for “Independence”

Establish the null hypothesis (previous slide)
Determine the critical value of chi-squared dependent on the confidence limit (0.05) and the degrees of freedom.
◦ df = (R - 1)*(C - 1) = 1 * 2 = 2 (R=2, C=3)
Look up in chi-squared table
◦ Critical chi-squared value = 5.99

          Mastery  Challenge  Fantasy
Male      10       32         8
Female    24       8          18
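The degrees of freedom and the critical value for this table can be checked with a small Python sketch. It uses a shortcut that applies only when df = 2, where the tail probability of the chi-squared distribution is exp(-x/2), so the critical value is -2 ln(alpha). This is an illustrative alternative to the table lookup, and the function name is made up.

```python
import math


def critical_value_df2(alpha):
    """Critical chi-squared value for df = 2 only.

    For df = 2, P(chi-squared > x) = exp(-x / 2),
    so solving for x gives -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


rows, cols = 2, 3  # 2 genders, 3 fun factors
df = (rows - 1) * (cols - 1)

critical = critical_value_df2(0.05)
print("df =", df, "critical value =", round(critical, 2))
```

The result rounds to the 5.99 found in the table.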


Calculate the expected frequencies
◦ Add each column and divide by the number of groups (in this case 2 genders)
◦ Easier if you have equal number for each gender (if not come and see me)

               Mastery  Challenge  Fantasy  Respondents
Male (FO)      10       32         8        50
Female (FO)    24       8          18       50
Cat total      34       40         26
Male (FE)      17       20         13
Female (FE)    17       20         13
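Dividing each column total by 2 works here because both groups have 50 respondents. A more general rule, which also covers unequal group sizes, is expected = row total × column total / grand total. The sketch below uses that rule as an assumption beyond what the slide states, and the function name `expected_frequencies` is hypothetical.

```python
observed = {
    "Male": {"Mastery": 10, "Challenge": 32, "Fantasy": 8},
    "Female": {"Mastery": 24, "Challenge": 8, "Fantasy": 18},
}


def expected_frequencies(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return {
        row: {
            col: row_totals[row] * col_totals[col] / grand_total
            for col in col_totals
        }
        for row in table
    }


expected = expected_frequencies(observed)
for row, cells in expected.items():
    print(row, cells)
```

With equal group sizes this gives 17, 20 and 13 for both genders, the same as the column total divided by 2.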


Calculate the statistics using the chi-squared formula
◦ Ensure you include both male and female data

               Mastery  Challenge  Fantasy
Male (FO)      10       32         8
Female (FO)    24       8          18
Male (FE)      17       20         13
Female (FE)    17       20         13

$$\chi^2 = \frac{(10-17)^2}{17} + \frac{(32-20)^2}{20} + \dots + \frac{(24-17)^2}{17} + \frac{(8-20)^2}{20} + \dots = 24.01$$
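The sum over all six cells can be verified with a short Python sketch. The nested-list layout and variable names are illustrative choices; the calculation is the same (observed - expected)² / expected sum, taken over both the male and the female rows.

```python
# Rows: Male, Female. Columns: Mastery, Challenge, Fantasy.
observed = [[10, 32, 8], [24, 8, 18]]
expected = [[17, 20, 13], [17, 20, 13]]

# Add up (observed - expected)^2 / expected over every cell of the table.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(round(statistic, 2))
```

Each row contributes roughly 12.0, giving the 24.01 shown on the slide.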

Stage 5: Decision

Can we reject the null hypothesis?
◦ Our value of 24.01 is way beyond 5.99. We are 95% confident the value did not occur by chance
Conclusion: We are 95% confident that there is a relationship between gender and fun factor
But what else can we get from this?
◦ Significant fun factor for males = Challenge
◦ Significant fun factor for females = Mastery and Fantasy

               Mastery  Challenge  Fantasy
Male (FO)      10       32         8
Female (FO)    24       8          18
Male (FE)      17       20         13
Female (FE)    17       20         13
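One simple way to read the fun factors off the table is to list, for each gender, the factors whose observed frequency is above the expected frequency. The sketch below does only that comparison; it is an informal illustration rather than a formal follow-up test, and the names are made up.

```python
factors = ["Mastery", "Challenge", "Fantasy"]
observed = {"Male": [10, 32, 8], "Female": [24, 8, 18]}
expected = {"Male": [17, 20, 13], "Female": [17, 20, 13]}

# For each gender, keep the factors chosen more often than expected.
above_expected = {
    gender: [
        factor
        for factor, o, e in zip(factors, observed[gender], expected[gender])
        if o > e
    ]
    for gender in observed
}

for gender, chosen in above_expected.items():
    print(gender, chosen)
```

This reproduces the reading on the slide: Challenge for males, Mastery and Fantasy for females.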

Workshop

Work on Workshop 7 activities
Your journal (Homework)
Your Literature Review (Complete/update)

References

Dr C. Price’s notes 2010
Gravetter, F. and Wallnau, L. (2003) Statistics for the Behavioral Sciences, New York: West Publishing Company
