Cross Tabulation, Frequency Distribution & Chi-Square Test

By Punit K. Dwivedi

AIMA BBCIT



Themes of My Presentation

Relationships and correlations

Monotonic and nonmonotonic relations

Cross-tabulations

Chi-squared analysis


Tests of Associations

Examine associations between two or more variables.

When two variables are studied together, one is usually treated as predicting the other. The predictor variable is the independent variable, and the criterion variable is the dependent variable.

Tests to measure statistical relationships between variables include regression analysis and correlation analysis.


Relationships Between Two Variables

Relationship: a consistent, systematic linkage between the levels or labels for two variables

“Levels” refers to the characteristic of description for interval or ratio scales…the level of customer service, the level of sales dollars, the frequency of advertisements.

“Labels” refers to the characteristic of description for nominal or ordinal scales…buyers v. non-buyers, yes v. no, rank 1, 2 or 3, brand A or brand B.

This concept is important in understanding the type of relationship.


Types of Relationships Between Two Variables

Nonmonotonic: two variables are associated, but only in a very general sense. While we do not know the direction of the relationship, we do know that the presence or absence of one variable is associated with the presence or absence of another variable.

In the presence of cold weather, we have the presence of purchases of heavy coats.

In the presence of warm weather, we have the absence of purchases of heavy coats.

Apparel managers know this relationship exists…they know to build inventory for coats in the winter.


With the Presence of Breakfast, we have the Presence of Coffee…

McDonald’s example of a nonmonotonic relationship for the type of drink ordered for breakfast and for lunch.


Types of Relationships Between Two Variables

Monotonic: the general direction of a relationship between two variables is known. The direction can be increasing or decreasing.

Shoe store managers know that there is an association between the age of a child and shoe size. The older a child, the larger the shoe size. We know the direction of the relationship…in this case, it is increasing.

We only know the general direction. We cannot determine exact shoe size by knowing a child’s age.


Types of Relationships Between Two Variables

Linear: a “straight-line” association between two variables.

Here, knowledge of one variable will yield knowledge of another variable.


Types of Relationships Between Two Variables

Curvilinear: some curved pattern describes the association

Research shows that job satisfaction is high when one first starts to work for a company but goes down after a few years and then back up after workers have been with the same company for many years.

This would be a U-shaped relationship.


Characterizing Relationships Between Variables

Presence: whether any systematic relationship exists between two variables of interest

Direction: whether the relationship is positive or negative

Strength of association: how strong the systematic relationship is: strong, moderate, or weak

You should assess relationships in this ORDER: 1. Presence 2. Direction and 3. Strength


Cross-tabulations

While a frequency distribution describes one variable at a time, a cross-tabulation describes two or more variables simultaneously.

Cross-tabulation results in tables that reflect the joint distribution of two or more variables with a limited number of categories or distinct values.
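The difference can be sketched in a few lines of Python. This is only an illustration: the six survey records and the names `responses`, `frequency_distribution`, and `cross_tabulate` are made up for the example and do not come from the slides.

```python
from collections import Counter

# Hypothetical survey records: (studied, result). Illustrative data only.
responses = [
    ("Yes", "Pass"), ("Yes", "Pass"), ("Yes", "Fail"),
    ("No", "Fail"), ("No", "Pass"), ("No", "Fail"),
]

def frequency_distribution(records, index):
    """Count how often each category of ONE variable occurs."""
    return Counter(record[index] for record in records)

def cross_tabulate(records):
    """Count how often each PAIR of categories occurs (the joint distribution)."""
    return Counter(records)

studied_counts = frequency_distribution(responses, 0)
joint_counts = cross_tabulate(responses)

print(dict(studied_counts))
for (studied, result), count in sorted(joint_counts.items()):
    print(studied, result, count)
```

The frequency distribution has one count per category of a single variable, while the cross-tabulation has one count per combination of categories.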


Two Way Cross Tabulations

Since two variables have been cross classified, percentages could be computed either column wise, based on column totals, or row wise, based on row totals.

The general rule is to compute the percentages in the direction of the independent variable, across the dependent variable.


Cross-Tabulations: Looking at Two Variables Simultaneously

Cross-tabulation: consists of rows and columns defined by the categories classifying each of two variables

Cross-tabulation table: four types of numbers in each cell: frequency, raw percentage, column percentage, and row percentage

In SPSS we go to ANALYZE, DESCRIPTIVE STATISTICS, CROSSTABS
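Outside SPSS, the four numbers in each cell can be computed by hand. The sketch below is a rough illustration, assuming that "raw percentage" means the cell's share of the whole sample; the counts are the study/test counts that appear later in this deck, and the function name `cell_numbers` is hypothetical.

```python
# Counts from the study/test cross-tabulation used later in this deck.
# Rows: Did you study? (Yes, No). Columns: performance (Pass, Fail).
counts = {
    ("Yes", "Pass"): 71, ("Yes", "Fail"): 6,
    ("No", "Pass"): 7, ("No", "Fail"): 16,
}

def cell_numbers(counts):
    """Return frequency, raw %, column %, and row % for every cell."""
    n = sum(counts.values())
    row_totals, col_totals = {}, {}
    for (row, col), freq in counts.items():
        row_totals[row] = row_totals.get(row, 0) + freq
        col_totals[col] = col_totals.get(col, 0) + freq
    table = {}
    for (row, col), freq in counts.items():
        table[(row, col)] = {
            "frequency": freq,
            "raw_pct": 100 * freq / n,                   # share of the whole sample (assumed meaning)
            "column_pct": 100 * freq / col_totals[col],  # share of the column total
            "row_pct": 100 * freq / row_totals[row],     # share of the row total
        }
    return table

cells = cell_numbers(counts)
for key, numbers in cells.items():
    print(key, {name: round(value, 1) for name, value in numbers.items()})
```

Because studying is the independent variable and sits in the rows here, the row percentages are the ones to read first.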


SPSS will generate cross-tabulation tables: ANALYZE, DESCRIPTIVE STATISTICS, CROSSTABS


Cross-Tabulations

When we have two nominal-scaled variables and we want to know if they are associated, we use cross-tabulations to examine the relationship and the Chi-Square test to test for the presence of a systematic relationship.

In this situation, with both variables on nominal scales, we are testing for a nonmonotonic relationship.

Let's suppose we want to know if there is a relationship between studying and test performance, and both of these variables are measured using nominal scales.


Cross-Tabulations

Did you study for the midterm test? ___ Yes ___ No

How did you perform on the midterm test? ___ Pass ___ Fail

Now, let's look at the data in a cross-tabulation table:

Did You Study for the Test? * How Did You Perform on the Test? Crosstabulation

Count

                              How Did You Perform on the Test?
Did You Study for the Test?       Pass    Fail   Total
Yes                                 77       2      79
No                                   3      18      21
Total                               80      20     100


Cross-Tabulations

Did You Study for the Test? * How Did You Perform on the Test? Crosstabulation

Count

                              How Did You Perform on the Test?
Did You Study for the Test?       Pass    Fail   Total
Yes                                 71       6      77
No                                   7      16      23
Total                               78      22     100

Do you “see” a relationship? Do you “see” the “presence” of studying with the “presence” of passing? Do you “see” the “absence” of passing with the “presence” of not studying?

This is a nonmonotonic relationship.


Cross-Tabulations

• Bar charts can be used to “see” nonmonotonic relationships

[Bar chart: Count (0 to 100) by Did You Study for the Test? (No, Yes), with separate bars for How Did You Perform (Fail, Pass)]


Statistics For Cross-tabulations

To determine whether a systematic association exists, the probability of obtaining a value of chi-square as large or larger than the one calculated from the cross-tabulation is estimated.

An important characteristic of the chi-square statistic is the number of degrees of freedom (df) associated with it. That is, df = (r - 1) × (c - 1), where r is the number of rows and c is the number of columns in the table.

The null hypothesis (H0) of no association between the two variables will be rejected only when the calculated value of the test statistic is greater than the critical value of the chi-square distribution with the appropriate degrees of freedom.
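The decision rule above can be sketched as follows. This is a minimal illustration, not SPSS's procedure: the critical values are the usual .05-level entries from a standard chi-square table (only df 1 through 4 are included), and the names `degrees_of_freedom` and `reject_null` are hypothetical.

```python
# Critical values of the chi-square distribution at the .05 level,
# copied from a standard chi-square table (df 1 through 4 only).
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def degrees_of_freedom(rows, cols):
    """df = (r - 1) x (c - 1) for a cross-tabulation with r rows and c columns."""
    return (rows - 1) * (cols - 1)

def reject_null(chi_square, rows, cols):
    """Reject H0 (no association) when the statistic exceeds the critical value."""
    df = degrees_of_freedom(rows, cols)
    return chi_square > CRITICAL_VALUES_05[df]

# The 2x2 study/test table in this deck has a reported Pearson chi-square of 39.382.
print(degrees_of_freedom(2, 2))
print(reject_null(39.382, 2, 2))
```

For a 2x2 table there is a single degree of freedom, so any chi-square value above 3.841 leads to rejecting the null hypothesis at the .05 level.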


Cross-Tabulations

But, while we can “see” this association, how do we know there is the presence of a systematic association?

Is this association statistically significant? Would it likely appear again and again if we sampled other students?

We use the Chi-Square test to tell us if nonmonotonic associations are really present.

In SPSS we run Chi-Square by ANALYZE, DESCRIPTIVE STATISTICS, CROSSTABS, and within the CROSSTABS dialog box, click on STATISTICS and select CHI-SQUARE


Chi-Square Analysis

Chi-square analysis: The Chi-square value is based upon differences between observed and expected frequencies

Observed frequencies: counts for each cell found in the sample

Expected frequencies: calculated on the null hypothesis of “no association” between the two variables under examination


Chi-Square Formula

Computed chi-square values:
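The statistic is conventionally written in the standard Pearson form below, summing over all k cells of the table. The symbol f_o for the observed frequency in a cell is a notation assumed here, chosen to pair with the f_e used for expected frequencies elsewhere in this deck.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{\left(f_{o,i} - f_{e,i}\right)^2}{f_{e,i}}
```

Each cell contributes more to the total the further its observed count lies from the count expected under "no association".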


Expected Frequencies

The chi-square statistic is used to test the statistical significance of the observed association in a cross-tabulation.

The expected frequency for each cell can be calculated by using a simple formula:

$f_e = \dfrac{n_r \times n_c}{n}$

Where:
$n_r$ = total number in the row
$n_c$ = total number in the column
$n$ = total sample size
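As a worked illustration of this formula, the sketch below computes the expected frequency of every cell from the row totals, column totals, and sample size. The observed counts are the deck's study/test table (71, 6, 7, 16); the function name `expected_frequencies` is hypothetical.

```python
def expected_frequencies(observed):
    """Expected count per cell under 'no association': row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Observed counts from the deck's study/test table.
# Rows: studied Yes, No. Columns: Pass, Fail.
observed = [[71, 6], [7, 16]]
expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 2) for value in row])
```

The smallest expected count, for students who did not study and failed, is 23 × 22 / 100 = 5.06, which agrees with the minimum expected count reported in the SPSS output later in the deck.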


Chi-Square and Degrees of Freedom

The chi-square distribution’s shape changes depending on the number of degrees of freedom

The computed chi-square value is compared to a table value to determine statistical significance


Chi-Square Interpretation

How do I interpret a Chi-square result?

The chi-square analysis yields a P value: the probability of obtaining a chi-square value as large as or larger than the one observed if the null hypothesis (no association) were true.

If the P value is less than or equal to .05, there is little support for the null hypothesis (no association).

Therefore, we have a significant association: we have the PRESENCE of a systematic relationship between the two variables!
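For a 2x2 table the P value can be checked without a statistics package. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x / 2)). It applies only when df = 1, and the function names are hypothetical.

```python
import math

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is the square of a standard normal variable,
    so P(X >= x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2))

def is_significant(chi_square, alpha=0.05):
    """True when the P value is at or below the chosen significance level."""
    return p_value_df1(chi_square) <= alpha

print(p_value_df1(3.841))      # close to .05, the usual cutoff for df = 1
print(is_significant(39.382))  # the Pearson chi-square reported in this deck
```

A chi-square of 39.382 gives a P value far below .05, which is why SPSS displays it as .000.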


Chi-Square Output

Chi-Square Tests

                           Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                            (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square         39.382(b)    1   .000
Continuity Correction(a)   35.865       1   .000
Likelihood Ratio           34.970       1   .000
Fisher's Exact Test                                       .000         .000
N of Valid Cases           100

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 5.06.

Read the P value (Asymp. Sig.) across from Pearson Chi-Square. Since the P value is < .05, we have a SIGNIFICANT association!
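The three chi-square values in this output can be reproduced by hand. The sketch below assumes the output was generated from the 71 / 6 / 7 / 16 study/test table (its minimum expected count of 5.06 is consistent with that) and uses the textbook formulas for the Pearson statistic, Yates' continuity correction, and the likelihood ratio. It is not SPSS's own code, and the function name is hypothetical.

```python
import math

def chi_square_2x2(observed):
    """Pearson, continuity-corrected, and likelihood-ratio statistics for a 2x2 table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    pearson = corrected = likelihood = 0.0
    for i, row in enumerate(observed):
        for j, f_obs in enumerate(row):
            f_exp = row_totals[i] * col_totals[j] / n
            diff = abs(f_obs - f_exp)
            pearson += diff ** 2 / f_exp
            # Yates' continuity correction shrinks each difference by 0.5.
            corrected += max(diff - 0.5, 0.0) ** 2 / f_exp
            if f_obs > 0:
                likelihood += 2 * f_obs * math.log(f_obs / f_exp)
    return pearson, corrected, likelihood

# Rows: studied Yes, No. Columns: Pass, Fail.
pearson, corrected, likelihood = chi_square_2x2([[71, 6], [7, 16]])
print(round(pearson, 3), round(corrected, 3), round(likelihood, 3))
```

All three statistics have 1 degree of freedom here and all lead to the same conclusion; the slide reads the Pearson row.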


Chi-Square Interpretation

How do I interpret a Chi-square result?

A significant chi-square result means the researcher should look at the cross-tabulation row and column percentages to examine the association pattern.

SPSS will calculate row, column (or both) percentages for you. See the CELLS box at the bottom of the CROSSTABS dialog box…


Chi-Square Analysis

Did You Study for the Test? * How Did You Perform on the Test? Crosstabulation

                                                How Did You Perform on the Test?
Did You Study for the Test?                          Pass     Fail    Total
Yes     Count                                          71        6       77
        % within Did You Study for the Test?        92.2%     7.8%   100.0%
No      Count                                           7       16       23
        % within Did You Study for the Test?        30.4%    69.6%   100.0%
Total   Count                                          78       22      100
        % within Did You Study for the Test?        78.0%    22.0%   100.0%

Look at the ROW %’s: 92% of those who studied passed; almost 70% of those who didn’t study failed! Examine the relationship!


What about Presence, Direction and Strength of our Nonmonotonic Relationship?

Presence? Yes, we have presence in that our Chi-square statistic was significant (P value less than or equal to .05).

This means the pattern we observe between studying/not studying and passing/failing is a systematic relationship. That is, we could expect to see this same relationship if we ran our study over many, many times.

Direction? Nonmonotonic relationships do not have direction, only presence and absence.

Strength? Since the Chi-square only tells us presence, you must judge the strength by looking at the pattern.

Do you think there is a “strong” relationship between studying/not studying and passing/failing?


When To Use Crosstabs and the Chi-Square Test

When you want to know if there is an association between two variables and…

Both those variables have nominal (or ordinal) scales.

You can also run Crosstabs and Chi-Square when one variable has a nominal (or ordinal) scale and the other variable is either interval or ratio scaled.

However, the variable that is interval or ratio scaled should have a limited number of descriptors, such as a 5- or 7-point response scale. Otherwise, interpretation is difficult.


Thank You