KNOWLEDGE FOR THE BENEFIT OF HUMANITY
BIOSTATISTICS (HFS3283)
CATEGORICAL DATA (CHI-SQUARE & FISHER EXACT TEST)
Dr. Mohd Razif Shahril
School of Nutrition & Dietetics
Faculty of Health Sciences
Universiti Sultan Zainal Abidin
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Topic Learning Outcomes
At the end of this lecture, students should be able to:
• identify types of categorical data analysis and their use
• explain the assumptions to be met when using the chi-square and Fisher's exact tests
• perform the chi-square and Fisher's exact tests using SPSS
• explain how to interpret the SPSS outputs from the chi-square and Fisher's exact tests
What is categorical data analysis?
• Independent (Explanatory) Variable is Categorical (Nominal or Ordinal)
• Dependent (Response) Variable is Categorical (Nominal or Ordinal)
• Most common combinations:
– 2 x 2 (each variable has 2 levels)
– Nominal/Nominal
– Nominal/Ordinal
– Ordinal/Ordinal
• The data are summarised in a CONTINGENCY TABLE
Contingency Table
• Tables representing all combinations of levels of
explanatory and response variables
• Numbers in table represent Counts of the
number of cases in each cell
• Row and column totals are called Marginal
counts
Example of Contingency Table
• Response Variable – Cognitive Level (Low,
High)
• Explanatory Variable – BMI (Underweight,
Normal, Overweight, Obese)
                Cognitive Level
BMI             Low     High    Total
Underweight      59      232      291
Normal           54      367      421
Overweight      114      101      215
Obese           173       54      227
Total           400      754     1154

The cell entries are Counts. The Total column and the Total row hold the Marginal Counts.
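The marginal counts in the example are the row and column sums of the cell counts. A minimal Python sketch, with the counts copied from the table and variable names chosen here only for illustration:

```python
# Cell counts from the BMI x Cognitive Level example (columns: Low, High).
bmi_levels = ["Underweight", "Normal", "Overweight", "Obese"]
counts = [
    [59, 232],
    [54, 367],
    [114, 101],
    [173, 54],
]

# Marginal counts: one total per row, one total per column, and the grand total.
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Print the table with its margins.
for level, row, total in zip(bmi_levels, counts, row_totals):
    print(f"{level:<12}{row[0]:>6}{row[1]:>6}{total:>7}")
print(f"{'Total':<12}{col_totals[0]:>6}{col_totals[1]:>6}{grand_total:>7}")
```

The row totals, column totals and grand total computed this way agree with the Total row and column of the example.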
2 x 2 Contingency Table
• Each variable has 2 levels
– Explanatory Variable – Groups (typically based on demographics, exposure, or treatment)
– Response Variable – Outcome (typically presence or absence of a characteristic)
                Cognitive Level
BMI             Low     High    Total
≤ 24.9          113      599      712
> 24.9          287      155      442
Total           400      754     1154
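The 2 x 2 counts are consistent with collapsing the four BMI categories of the earlier example at the 24.9 cut-off (Underweight + Normal versus Overweight + Obese). The sketch below assumes that grouping; the dictionary names and the ASCII labels are illustrative only.

```python
# Counts (Low, High) for each BMI category, taken from the 4 x 2 example.
counts = {
    "Underweight": (59, 232),
    "Normal": (54, 367),
    "Overweight": (114, 101),
    "Obese": (173, 54),
}

# Assumed grouping of the four categories into two levels.
groups = {
    "<= 24.9": ["Underweight", "Normal"],
    "> 24.9": ["Overweight", "Obese"],
}

# Add the counts of the member categories, column by column.
collapsed = {
    label: tuple(sum(counts[member][col] for member in members) for col in range(2))
    for label, members in groups.items()
}

for label, (low, high) in collapsed.items():
    print(f"{label:<8}{low:>6}{high:>6}{low + high:>7}")
```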
Chi-Square Test (χ²)
• Hypothesis:
– Comparing two or more proportions
– H0: P1 = P2
• Assumptions:
– Random samples (based on study design and method)
– Observations are independent (based on study design and method)
– The number of cells with Expected Count (EC) less than 5 must be less than 20% of the total number of cells
– The smallest EC must be at least 2
– For the two EC assumptions, calculate the expected count for each cell (SPSS will do it)
The chi-square test for independence, also called Pearson's chi-square test or the chi-square test of association, is used to discover whether there is a relationship between two categorical variables.
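The two Expected Count assumptions can be checked by hand. The sketch below uses the standard formula, expected count = (row total x column total) / grand total, and applies the two rules from the list above. The function names are illustrative, and the second table is hypothetical data made up only to show a failing case.

```python
def expected_counts(table):
    """Expected count of each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def check_ec_assumptions(table):
    """Return (assumptions met?, share of cells with EC < 5, smallest EC)."""
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    smallest = min(cells)
    return share_below_5 < 0.20 and smallest >= 2, share_below_5, smallest


# 2 x 2 BMI x Cognitive Level table from the earlier example.
bmi_table = [[113, 599], [287, 155]]
ok, share_below_5, smallest_ec = check_ec_assumptions(bmi_table)
print(ok, share_below_5, round(smallest_ec, 1))

# Hypothetical small-sample table (not from the lecture) that fails the 20% rule.
small_table = [[1, 9], [3, 7]]
ok_small, share_small, smallest_small = check_ec_assumptions(small_table)
print(ok_small, share_small, smallest_small)
```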
Example Chi-Square Test (χ²) – (1)
• Hypothesis:
– Association between gender and Knowledge on Nutrition (KoN)
– Comparing the proportion of Low KoN between genders
– H0: P(Low KoN)male = P(Low KoN)female
• Assumptions:
– Random samples [ √ ]
– Observations are independent [ √ ]
– The number of cells with Expected Count (EC) less than 5 must be less than 20% of the total number of cells (calculated by SPSS)
– The smallest EC must be at least 2 (calculated by SPSS)
Chi-square using SPSS - procedure:
[Figure: SPSS screenshots, steps 1 to 9]
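Chi-square using SPSS - Output:
[Figure: SPSS crosstabulation and chi-square test output for gender]
• The crosstabulation gives descriptive statistics for each group
• Chi-square statistic = 0.417; df = 1; P-value = 0.518
• The percentage of cells with EC less than 5 must be < 20%
• The minimum EC must be ≥ 2
• Both EC assumptions are met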
Chi-square using SPSS – Table and Interpretation:

Table 1: Factors (categorical variables) associated with Knowledge on Nutrition

Variable           n    Low KoN      High KoN     χ² statistic a   P-value
                        Freq (%)     Freq (%)     (df)
Gender
  Male             39   19 (48.7)    20 (51.3)    0.417 (1)        0.518
  Female           34   14 (41.2)    20 (58.8)
Ethnicity
  Malay
  Others
Education Level
  Low
  High

a Chi-square test for independence

The prevalence (proportion) of Low Knowledge on Nutrition does not differ significantly between males and females (P = 0.518). Therefore, there is no significant association between gender and Knowledge on Nutrition.
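The reported statistic can be reproduced from the gender counts with Pearson's formula, the sum over cells of (observed − expected)² / expected. The sketch below is a hand calculation, not the SPSS procedure; the function name is illustrative, and the P-value shortcut through `math.erfc` is valid only for df = 1, which is the case for a 2 x 2 table.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and P-value for a 2 x 2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # With df = 1 the upper-tail chi-square probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: Male, Female. Columns: Low KoN, High KoN.
gender = [[19, 20], [14, 20]]
chi2, p_value = chi_square_2x2(gender)
print(f"chi-square = {chi2:.3f}, df = 1, P = {p_value:.3f}")
# prints: chi-square = 0.417, df = 1, P = 0.518
```

Because P = 0.518 is well above the conventional 0.05 threshold, the null hypothesis of equal proportions is not rejected, which matches the interpretation above.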
What if assumptions were not met?
• Combine adjacent columns and/or rows to increase the EC, if possible.
• If the expected count assumptions are still not met, Fisher's exact (FE) test can be applied (only for 2 x 2 tables in SPSS).
Example Chi-Square Test (χ²) – (2)
• Hypothesis:
– Association between ethnicity and Knowledge on Nutrition (KoN)
– Comparing the proportion of Low KoN between ethnic groups
– H0: P(Low KoN)Malay = P(Low KoN)Chinese = P(Low KoN)Indian = P(Low KoN)Others
• Assumptions:
– Random samples [ √ ]
– Observations are independent [ √ ]
– The number of cells with Expected Count (EC) less than 5 must be less than 20% of the total number of cells (calculated by SPSS)
– The smallest EC must be at least 2 (calculated by SPSS)
Chi-square using SPSS - Output:
[Figure: SPSS crosstabulation and chi-square test output for ethnicity, four levels]
• The crosstabulation gives descriptive statistics for each group
• 4 cells (50%) have EC less than 5 (must be < 20%)
• The smallest EC is 1.36 (must be ≥ 2)
• Neither EC assumption is met
• One remedy may be to combine Indian and Others (or even to combine three levels) and call the new category "Others". The combination should be interpretable and meaningful.
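Chi-square using SPSS - Output:
[Figure: SPSS crosstabulation and chi-square test output for ethnicity after combining categories]
• The crosstabulation gives descriptive statistics for each group
• The percentage of cells with EC less than 5 must be < 20%, and the minimum EC must be ≥ 2
• Both EC assumptions are met
• Chi-square statistic = 0.072; df = 1; P-value = 0.788
• If the EC assumptions are still not met, use the Fisher's exact test result instead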
Chi-square using SPSS – Table and Interpretation:

Table 1: Factors (categorical variables) associated with Knowledge on Nutrition

Variable           n    Low KoN      High KoN     χ² statistic a   P-value
                        Freq (%)     Freq (%)     (df)
Gender
  Male             39   19 (48.7)    20 (51.3)    0.417 (1)        0.518
  Female           34   14 (41.2)    20 (58.8)
Ethnicity
  Malay            43   20 (46.5)    23 (53.5)    0.072 (1)        0.788
  Others           30   13 (43.3)    17 (56.7)
Education Level
  Low
  High

a Chi-square test for independence

The prevalence (proportion) of Low Knowledge on Nutrition does not differ significantly between Malay and other ethnic groups (P = 0.788). Therefore, there is no significant association between ethnicity and Knowledge on Nutrition.
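The n and Freq (%) columns of Table 1 are row totals and row percentages of the raw counts. A short sketch that rebuilds those cells from the gender and ethnicity counts in the table; the data layout and the helper name are illustrative only.

```python
# (Low KoN, High KoN) counts per group, copied from Table 1.
table_1 = {
    "Gender": {"Male": (19, 20), "Female": (14, 20)},
    "Ethnicity": {"Malay": (20, 23), "Others": (13, 17)},
}


def freq_percent(count, n):
    """Format a cell as 'frequency (row percentage)' with one decimal place."""
    return f"{count} ({100 * count / n:.1f})"


rows = {}
for variable, groups in table_1.items():
    print(variable)
    for group, (low, high) in groups.items():
        n = low + high  # row total
        rows[group] = (n, freq_percent(low, n), freq_percent(high, n))
        print(f"  {group:<8}{n:>4}   {rows[group][1]:<12}{rows[group][2]}")
```

Row percentages are the natural choice here because the comparison of interest is the proportion of Low KoN within each group.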
Fisher Exact Test
• Fisher's Exact Test is a test for independence in a 2 x 2 table.
• It is most useful when the total sample size and the expected values are small.
– Useful when E(cell counts) < 5.
• The output consists of more than one P-value:
– Choose Exact Sig. (2-sided)
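To show what a two-sided exact P-value means, the sketch below computes Fisher's exact test for a 2 x 2 table with the standard library only. It uses one common definition, the sum of the hypergeometric probabilities of all tables with the same margins that are no more likely than the observed one; that this matches the SPSS "Exact Sig. (2-sided)" value is an assumption here. The function name and the example table are hypothetical, not taken from the lecture.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact P-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    p_observed = prob(a)

    # Small relative tolerance so equally likely tables are not lost to rounding.
    cutoff = p_observed * (1 + 1e-7)
    total = sum(p for p in (prob(x) for x in range(lowest, highest + 1)) if p <= cutoff)
    return min(1.0, total)


# Hypothetical small-sample table; every expected count is 2, so the
# chi-square assumptions would not be met.
example = [[3, 1], [1, 3]]
p_exact = fisher_exact_two_sided(example)
print(f"Exact two-sided P = {p_exact:.4f}")
# prints: Exact two-sided P = 0.4857
```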
Thank You