Punit K.Dwivedi 1
Cross Tabulation, Frequency Distribution & Chi-Square Test
By Punit K. Dwivedi
AIMABBCIT
Themes of My Presentation
Relationships and Correlations
Monotonic and Nonmonotonic relations
Cross-Tabulations
Chi-squared analysis
Tests of Associations
Examine associations between two or more variables.
When two variables are studied, one is often treated as predicting the other. The predictor variable is the independent variable, and the criterion variable is the dependent variable.
Tests to measure statistical relationships between variables include: Regression Analysis, Correlation Analysis
Relationships Between Two Variables
Relationship: a consistent, systematic linkage between the levels or labels for two variables
“Levels” refers to the characteristic of description for interval or ratio scales…the level of customer service, the level of sales dollars, the frequency of advertisements.
“Labels” refers to the characteristic of description for nominal or ordinal scales…buyers v. non-buyers, yes v. no, rank 1, 2 or 3, brand A or brand B.
This concept is important in understanding the type of relationship.
Types of Relationships Between Two Variables
Nonmonotonic: two variables are associated, but only in a very general sense. While we do not know a direction of the relationship, we do know that the presence or absence of one variable is associated with the presence or absence of another variable.
With the presence of cold weather, we have the presence of purchases of heavy coats.
With the presence of warm weather, we have the absence of purchases of heavy coats.
Apparel managers know this relationship exists…they know to build inventory for coats in the winter.
With the Presence of Breakfast, we have the Presence of Coffee…
McDonald’s example of a nonmonotonic relationship for the type of drink ordered for breakfast and for lunch.
Types of Relationships Between Two Variables
Monotonic: the general direction of a relationship between two variables is known. The direction is either increasing or decreasing.
Shoe store managers know that there is an association between the age of a child and shoe size. The older a child, the larger the shoe size. We know the direction of the relationship…in this case, it is increasing.
We only know the general direction. We cannot determine exact shoe size by knowing a child’s age.
Types of Relationships Between Two Variables
Linear: “straight-line” association between two variables
Here, knowledge of one variable will yield knowledge of another variable
Types of Relationships Between Two Variables
Curvilinear: some curved pattern describes the association
Research shows that job satisfaction is high when one first starts to work for a company but goes down after a few years and then back up after workers have been with the same company for many years.
This would be a U-shaped relationship.
Characterizing Relationships Between Variables
Presence: whether any systematic relationship exists between two variables of interest
Direction: whether the relationship is positive or negative
Strength of association: how strong is the systematic relationship: Strong? Moderate? Or Weak?
You should assess relationships in this ORDER: 1. Presence 2. Direction and 3. Strength
Cross-tabulations
While a frequency distribution describes one variable at a time, a cross-tabulation describes two or more variables simultaneously.
Cross-tabulation results in tables that reflect the joint distribution of two or more variables with a limited number of categories or distinct values.
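The difference between the two summaries can be sketched in a few lines of Python. This is only an illustration: the six responses and the helper names `cross_tabulate` and `frequency_distribution` are made up for the example and do not come from the deck's data.

```python
from collections import Counter

# Hypothetical raw responses: one (studied, result) pair per respondent.
responses = [
    ("Yes", "Pass"), ("Yes", "Pass"), ("Yes", "Fail"),
    ("No", "Fail"), ("No", "Pass"), ("No", "Fail"),
]

def frequency_distribution(pairs, position):
    """Count one variable at a time (position 0 = first, 1 = second)."""
    return Counter(pair[position] for pair in pairs)

def cross_tabulate(pairs):
    """Count how often each (first category, second category) pair occurs."""
    return Counter(pairs)

studied = frequency_distribution(responses, 0)
table = cross_tabulate(responses)

print(dict(studied))  # one variable: counts of Yes and No
for (row, col), count in sorted(table.items()):
    print(row, col, count)  # two variables: one count per cell
```

The frequency distribution has one count per category, while the cross-tabulation has one count per combination of categories, which is why it only stays readable when each variable has a limited number of categories.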
Two Way Cross Tabulations
Since two variables have been cross classified, percentages could be computed either column wise, based on column totals, or row wise, based on row totals.
The general rule is to compute the percentages in the direction of the independent variable, across the dependent variable.
Cross-Tabulations: Looking at Two Variables Simultaneously
Cross-tabulation: consists of rows and columns defined by the categories classifying each of two variables
Cross-tabulation table: four types of numbers in each cell: frequency, raw percentage, column percentage, row percentage
In SPSS we go to ANALYZE, DESCRIPTIVE STATISTICS, CROSSTABS
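Outside SPSS, the four numbers in a cell can be worked out by hand. The sketch below assumes the counts from the study/test example used later in this deck (71, 6, 7, 16); the function name `cell_numbers` and the dictionary layout are illustrative choices, not anything SPSS exposes.

```python
# Counts from the study/test example used later in this deck:
# rows = Did you study? (Yes, No); columns = performance (Pass, Fail).
counts = {
    ("Yes", "Pass"): 71, ("Yes", "Fail"): 6,
    ("No", "Pass"): 7, ("No", "Fail"): 16,
}

def cell_numbers(counts, row, col):
    """Return frequency, raw %, column % and row % for one cell."""
    n = sum(counts.values())
    row_total = sum(v for (r, _), v in counts.items() if r == row)
    col_total = sum(v for (_, c), v in counts.items() if c == col)
    freq = counts[(row, col)]
    return {
        "frequency": freq,
        "raw_pct": 100 * freq / n,             # share of the whole sample
        "column_pct": 100 * freq / col_total,  # share of the column total
        "row_pct": 100 * freq / row_total,     # share of the row total
    }

cell = cell_numbers(counts, "Yes", "Pass")
for name, value in cell.items():
    print(f"{name}: {value:.1f}")
```

Which percentage to read depends on which variable is treated as independent; here studying is the row variable, so the row percentage is the one that answers "of those who studied, how many passed?".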
SPSS will generate cross-tabulation tables: ANALYZE, DESCRIPTIVE STATISTICS, CROSSTABS
Cross-Tabulations
When we have two nominal-scaled variables and we want to know if they are associated, we use cross-tabulations to examine the relationship and the Chi-Square test to test for the presence of a systematic relationship.
In this situation, with both variables measured on nominal scales, we are testing for a nonmonotonic relationship.
Let's suppose we want to know if there is a relationship between studying and test performance, and both of these variables are measured using nominal scales.
Cross-Tabulations
Did you study for the midterm test? ___ Yes ___ No
How did you perform on the midterm test? ___ Pass ___ Fail
Now, let's look at the data in a crosstabulation table:

Did You Study for the Test? * How Did You Perform on the Test? Crosstabulation
Count
                               How Did You Perform on the Test?
Did You Study for the Test?    Pass    Fail    Total
Yes                              77       2       79
No                                3      18       21
Total                            80      20      100
Cross-Tabulations
Do you “see” a relationship? Do you “see” the “presence” of studying with the “presence” of passing? Do you “see” the “absence” of passing with the “presence” of not studying?
This is a nonmonotonic relationship.

Did You Study for the Test? * How Did You Perform on the Test? Crosstabulation
Count
                               How Did You Perform on the Test?
Did You Study for the Test?    Pass    Fail    Total
Yes                              71       6       77
No                                7      16       23
Total                            78      22      100
Cross-Tabulations
Bar charts can be used to “see” nonmonotonic relationships
[Bar chart: Count (0 to 100) by Did You Study for the Test? (No, Yes), with separate bars for How Did You Perform (Fail, Pass)]
Statistics for Cross-tabulations
To determine whether a systematic association exists, the probability of obtaining a value of chi-square as large as or larger than the one calculated from the cross-tabulation is estimated.
An important characteristic of the chi-square statistic is the number of degrees of freedom (df) associated with it. That is, df = (r - 1) x (c - 1), where r is the number of rows and c is the number of columns in the cross-tabulation.
The null hypothesis (H0) of no association between the two variables will be rejected only when the calculated value of the test statistic is greater than the critical value of the chi-square distribution with the appropriate degrees of freedom.
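The decision rule can be sketched as follows. The critical values are the usual 5% upper-tail entries from a standard chi-square table for 1 to 4 degrees of freedom; the function names are illustrative, and 39.382 is the Pearson chi-square reported in the SPSS output later in this deck.

```python
# Upper 5% critical values of the chi-square distribution for small df,
# as listed in standard statistical tables.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1) x (c - 1) for an r x c cross-tabulation."""
    return (n_rows - 1) * (n_cols - 1)

def reject_null(chi_square, n_rows, n_cols):
    """Reject H0 (no association) when the statistic exceeds the critical value."""
    df = degrees_of_freedom(n_rows, n_cols)
    return chi_square > CRITICAL_05[df]

# A 2 x 2 table (study: yes/no by result: pass/fail) has 1 degree of freedom.
print(degrees_of_freedom(2, 2))
print(reject_null(39.382, 2, 2))
```

The lookup table only covers df up to 4, which is enough for tables as large as 3 x 3 or 2 x 5; larger tables would need more entries from a full chi-square table.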
Cross-Tabulations
But, while we can “see” this association, how do we know there is the presence of a systematic association?
Is this association statistically significant? Would it likely appear again and again if we sampled other students?
We use the Chi-Square test to tell us if nonmonotonic associations are really present.
In SPSS we run Chi-Square by ANALYZE, DESCRIPTIVE STATISTICS, CROSSTABS; within the CROSSTABS dialog box, click on STATISTICS and select CHI-SQUARE
Chi-Square Analysis
Chi-square analysis: The Chi-square value is based upon differences between observed and expected frequencies
Observed frequencies: counts for each cell found in the sample
Expected frequencies: calculated on the null hypothesis of “no association” between the two variables under examination
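Chi-Square Formula
Computed chi-square value: chi-square = sum over all cells of (fo - fe)^2 / fe, where fo is the observed frequency and fe is the expected frequency in a cell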
Expected Frequencies
The chi-square statistic is used to test the statistical significance of the observed association in a cross-tabulation.
The expected frequency for each cell can be calculated by using a simple formula:
fe = (nr x nc) / n
Where:
nr = total number in the row
nc = total number in the column
n = total sample size
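As a worked illustration, the expected frequencies for the study/test table (71, 6, 7, 16) can be computed with the formula above and then combined into the usual Pearson chi-square, the sum over all cells of (observed - expected)^2 / expected. The function names are illustrative; the sketch works for any r x c table given as a list of rows.

```python
observed = [[71, 6],   # studied:        pass, fail
            [7, 16]]   # did not study:  pass, fail

def expected_frequencies(table):
    """fe = (row total x column total) / n for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )

expected = expected_frequencies(observed)
for row in expected:
    print([round(fe, 2) for fe in row])
print(round(chi_square(observed), 3))
```

The smallest expected count here is 23 x 22 / 100 = 5.06, for the cell "did not study and failed", where the observed count is 16. That cell contributes the most to the statistic.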
Chi-Square and Degrees of Freedom
The chi-square distribution’s shape changes depending on the number of degrees of freedom
The computed chi-square value is compared to a table value to determine statistical significance
Chi-Square Interpretation
How do I interpret a Chi-square result?
The chi-square analysis yields a P value: the probability of obtaining a chi-square value as large as or larger than the computed one if the null hypothesis (no association) were true.
If the P value is less than or equal to .05, there is little support for the null hypothesis (no association).
Therefore, we have a significant association: we have the PRESENCE of a systematic relationship between the two variables!
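For a 2 x 2 table the P value can be checked without a statistics package. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal variable, so its upper-tail probability is erfc(sqrt(x / 2)). The function name is illustrative, and the shortcut applies to 1 df only.

```python
import math

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    With 1 df the statistic is a squared standard normal variable, so
    P(X >= x) = erfc(sqrt(x / 2)). This shortcut does not apply to other df.
    """
    return math.erfc(math.sqrt(chi_square / 2))

ALPHA = 0.05  # the significance level used in this deck

p = p_value_df1(39.382)  # Pearson chi-square from the study/test example
print(p < ALPHA)         # True means: little support for "no association"
```

A value this far beyond the 5% critical value gives a P value that SPSS displays as .000, which means smaller than .0005 rather than exactly zero.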
Chi-Square Output

Chi-Square Tests
                            Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                             (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square          39.382 (b)   1   .000
Continuity Correction (a)   35.865       1   .000
Likelihood Ratio            34.970       1   .000
Fisher's Exact Test                                        .000         .000
N of Valid Cases            100

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 5.06.

Read the P value (Asymp. Sig.) across from Pearson Chi-Square. Since the P value is < .05, we have a SIGNIFICANT association!
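The first two rows and footnote b of this output can be cross-checked by hand from the counts 71, 6, 7, 16. The sketch uses the 2 x 2 shortcut n(ad - bc)^2 divided by the product of the four marginal totals. Treating the "Continuity Correction" row as Yates' correction (subtracting n / 2 from |ad - bc|) is an assumption, although the result agrees with the reported value to three decimals. The function names are illustrative.

```python
def pearson_2x2(a, b, c, d, continuity=False):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]].

    Uses the shortcut n * (ad - bc)^2 / (product of the four marginal totals).
    With continuity=True, n / 2 is subtracted from |ad - bc| first
    (Yates' correction); matching this to the SPSS row is an assumption.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        diff = max(diff - n / 2, 0)
    marginals = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / marginals

def min_expected_2x2(a, b, c, d):
    """Smallest expected count: smallest row total x smallest column total / n."""
    n = a + b + c + d
    return min(a + b, c + d) * min(a + c, b + d) / n

print(round(pearson_2x2(71, 6, 7, 16), 3))
print(round(pearson_2x2(71, 6, 7, 16, continuity=True), 3))
print(round(min_expected_2x2(71, 6, 7, 16), 2))
```

The likelihood ratio and Fisher's exact test rows are computed differently and are not reproduced here.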
Chi-Square Interpretation
How do I interpret a Chi-square result?
A significant chi-square result means the researcher should look at the cross-tabulation row and column percentages to examine the association pattern.
SPSS will calculate row, column (or both) percentages for you. See the CELLS box at the bottom of the CROSSTABS dialog box…
Chi-Square Analysis

Did You Study for the Test? * How Did You Perform on the Test? Crosstabulation
                                                                    How Did You Perform on the Test?
Did You Study for the Test?                                         Pass      Fail      Total
Yes      Count                                                        71         6         77
         % within Did You Study for the Test?                      92.2%      7.8%     100.0%
No       Count                                                         7        16         23
         % within Did You Study for the Test?                      30.4%     69.6%     100.0%
Total    Count                                                        78        22        100
         % within Did You Study for the Test?                      78.0%     22.0%     100.0%

Look at the ROW %’s: 92% of those who studied passed; almost 70% of those who didn’t study failed! Examine the relationship!
What about Presence, Direction and Strength of our Nonmonotonic Relationship?
Presence? Yes, we have presence in that our Chi-square statistic was significant (P value less than or equal to .05).
This means the pattern we observe between studying/not studying and passing/failing is a systematic relationship. That is, we could expect to see this same relationship if we ran our study many, many times.
Direction? Nonmonotonic relationships do not have direction, only presence and absence.
Strength? Since the Chi-square test only tells us presence, you must judge the strength by looking at the pattern.
Do you think there is a “strong” relationship between studying/not studying and passing/failing?
When To Use Crosstabs and the Chi-Square Test
When you want to know if there is an association between two variables and both of those variables have nominal (or ordinal) scales.
You can also run Crosstabs and Chi-Square when one variable has a nominal (or ordinal) scale and the other variable is either interval or ratio scaled.
However, the variable that is interval or ratio scaled should have a limited number of descriptors, such as a 5- or 7-point response scale. Otherwise, interpretation is difficult.
Thank You