71
16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross- Tabulation

16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Embed Size (px)

Citation preview

Page 1: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

16-1

Chapter 16Data Analysis:

Frequency Distribution, Hypothesis Testing, and

Cross-Tabulation

Page 2: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.1 Relationship of Frequency Distribution, Hypothesis Testing, and Cross-Tabulation to the Previous Chapters and the Marketing Research Process

Focus of this Chapter

Relationship toPrevious Chapters

• Frequency

• General Procedure for Hypothesis Testing

• Cross Tabulation

• Research Questions and Hypothesis (Chapter 2)

• Data Analysis Strategy (Chapter 15)

Approach to Problem

Field Work

Data Preparation and Analysis

Report Preparationand Presentation

Research Design

Problem Definition

Relationship to MarketingResearch Process

Chapter 16 - 2Copyright © 2011 Pearson Education, Inc.

Page 3: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Copyright © 2011 Pearson Education, Inc.

Figure 16.2 Frequency Distribution, Hypothesis Testing, and Cross Tabulation:

An Overview

Page 4: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Frequency Distribution (Fig 16.3 & 16.4)(Tables 16.1 & 16.2)

Statistics Associated With Frequency Distribution (Fig 16.5)

Introduction to Hypothesis Testing (Fig 16.6, 16.7, 16.8 & 16.9)

Cross Tabulation (Tables 16.3, 16.4 & 16.5)

Statistics Associated With Cross Tabulation (Fig 16.1)

Cross Tabulation in Practice (Fig 16.11)

Be

a D

M!

Be

an M

R!

Exp

erie

nti

al L

earn

ing

Wh

at Wo

uld

Yo

u D

o?

Opening Vignette

Application to Contemporary Issues (Figs 16.12, 16.13 & 16.14)

International Social Media Ethics

Chapter 16 - 4Copyright © 2011 Pearson Education, Inc.

Page 5: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Frequency Distribution

In a frequency distribution, one variable is considered at a time.

A frequency distribution for a variable produces a table of frequency counts, percentages, and cumulative percentages for all the values associated with that variable.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 5

Page 6: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Calculate the Descriptive Statistics, Measures of Location, and Variability

Plot the Frequency Histogram

Calculate the Percentage and Cumulative Percentage

Calculate the Frequency for Each Value of the Variable

Figure 16.3 Conducting Frequency Analysis

Chapter 16 - 6Copyright © 2011 Pearson Education, Inc.

Page 7: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

No User Group Sex Attitude 1. 3.00 2.00 7.00 2. 1.00 1.00 2.00 3. 1.00 1.00 3.00 4. 3.00 2.00 6.00 5. 3.00 2.00 5.00 6. 2.00 2.00 4.00 7. 2.00 1.00 5.00 8. 1.00 1.00 2.00 9. 2.00 2.00 4.00 10. 1.00 1.00 3.00 11. 3.00 2.00 6.00 12. 3.00 2.00 6.00 13. 1.00 1.00 2.00 14. 3.00 2.00 6.00 15. 1.00 2.00 4.00 16. 1.00 2.00 3.00 17. 3.00 1.00 7.00 18. 2.00 1.00 6.00 19. 1.00 1.00 1.00 20. 3.00 1.00 5.00 21. 3.00 2.00 6.00 22. 2.00 2.00 2.00 23. 1.00 1.00 1.00 24. 3.00 1.00 6.00 25. 1.00 2.00 3.00 26. 2.00 2.00 5.00 27. 3.00 2.00 7.00 28. 2.00 1.00 5.00 29. 1.00 1.00 9.00 30. 2.00 2.00 5.00 31. 1.00 2.00 1.00 32. 1.00 2.00 4.00 33. 2.00 1.00 3.00 34. 2.00 1.00 4.00 35. 3.00 1.00 5.00 36. 3.00 1.00 6.00 37. 3.00 2.00 6.00 38. 3.00 2.00 5.00 39. 3.00 2.00 7.00 40. 1.00 1.00 4.00 41. 1.00 1.00 2.00 42. 1.00 1.00 1.00 43. 1.00 1.00 2.00 44. 1.00 1.00 3.00 45. 1.00 1.00 1.00

Table 16.1 Usage and Attitude Toward Nike Shoes

Chapter 16 - 7Copyright © 2011 Pearson Education, Inc.

Page 8: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Value Label Value Frequency PercentageValid

PercentageCumulative Percentage

Very unfavorable

Very favorable

12345679

56668941

11.113.313.313.317.820.08.92.2

11.413.613.613.618.220.59.1

Missing

11.425.038.652.370.590.9

100.0

Total 45 100.0 100.0 100.0

Table 16.2 Frequency Distribution of Attitude Toward Nike

Chapter 16 - 8Copyright © 2011 Pearson Education, Inc.

Page 9: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.4 Frequency Histogram

0

1

2

3

4

5

6

7

8

9

Fre

qu

en

cy

1 2 3 4 5 6 7

Attitude Toward Nike

Chapter 16 - 9Copyright © 2011 Pearson Education, Inc.

Page 10: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The mean, or average value, is the most commonly used measure of central tendency.

The mean, , is given by

Where, Xi = Observed values of the variable X n = Number of observations (sample size)

The mode is the value that occurs most frequently. It represents the highest peak of the distribution.

16-10

Statistics Associated with Frequency DistributionMeasures of Location

Copyright © 2011 Pearson Education, Inc.

Page 11: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The median of a sample is the middle value when the data are arranged in ascending or descending order. If the number of data points is even, the median is usually estimated as the midpoint between the two middle values – by adding the two middle values and dividing their sum by 2. The median is the 50th percentile.

Statistics Associated with Frequency DistributionMeasures of Location

Chapter 16 - 11Copyright © 2011 Pearson Education, Inc.

Page 12: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.5 Skewness of a Distribution

Skewed Distribution

Symmetric Distribution

Mean Median

Mode (a)

Mean Median Mode (b)

Copyright © 2011 Pearson Education, Inc. 15 - 12

Page 13: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The range measures the spread of the data. It is simply the difference between the largest and smallest values in the sample.

Range =

Statistics Associated with Frequency DistributionMeasures of Variability

Chapter 16 - 13Copyright © 2011 Pearson Education, Inc.

Page 14: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The variance is the mean squared deviation from the mean. The variance can never be negative.

The standard deviation is the square root of the variance.

Statistics Associated with Frequency DistributionMeasures of Variability

Chapter 16 - 14Copyright © 2011 Pearson Education, Inc.

𝑆 𝑋=√∑𝑖=1

𝑛 (𝑋 𝑖−𝑋 )2

𝑛−1

Page 15: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.6 A General Procedure for Hypothesis Testing

Step 1

Step 2

Step 3

Step 4

Step 5

Step 6

Step 7

Step 8

Formulate H0 and H1

Select Appropriate Test

Choose Level of Significance,

Collect Data and Calculate Test Statistic

Determine ProbabilityAssociated with Test

Statistic (TSCAL)

Determine CriticalValue of Test Statistic

TSCR

Compare with Level of Significance,

Determine if TSCAL fallsinto (Non) Rejection Region

Reject or Do Not Reject H0

Draw Marketing Research Conclusion

a) b)

a) b)

Chapter 16 - 15Copyright © 2011 Pearson Education, Inc.

Page 16: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A null hypothesis is a statement of the status quo, one of no difference or no effect. If the null hypothesis is not rejected, no changes will be made.

An alternative hypothesis is one in which some difference or effect is expected. Accepting the alternative hypothesis will lead to changes in opinions or actions.

The null hypothesis refers to a specified value of the population parameter (e.g. μ, σ, π), not a sample statistic (e.g. ).

A General Procedure for Hypothesis TestingStep 1: Formulate the Hypothesis

Chapter 16 - 16Copyright © 2011 Pearson Education, Inc.

Page 17: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A null hypothesis may be rejected, but it can never be accepted based on a single test. In classical hypothesis testing, there is no way to determine whether the null hypothesis is true.

In marketing research, the null hypothesis is formulated in such a way that its rejection leads to the acceptance of the desired conclusion. The alternative hypothesis represents the conclusion for which evidence is sought.

A General Procedure for Hypothesis TestingStep 1: Formulate the Hypothesis (Cont.)

Chapter 16 - 17Copyright © 2011 Pearson Education, Inc.

Page 18: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The test of the null hypothesis is a one-tailed test, because the alternative hypothesis is expressed directionally.

If that is not the case, then a two-tailed test would be required, and the hypotheses would be expressed as:

A General Procedure for Hypothesis TestingStep 1: Formulate the Hypothesis (Cont.)

Chapter 16 - 18Copyright © 2011 Pearson Education, Inc.

Page 19: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The test statistic measures how close the sample has come to the null hypothesis and follows a well-known distribution, such as the normal, t, or chi-square.

In our example, the z statistic, which follows the standard normal distribution, would be appropriate.

A General Procedure for Hypothesis TestingStep 2: Select an Appropriate Test

where

Chapter 16 - 19Copyright © 2011 Pearson Education, Inc.

𝜎 𝑝=√ 𝜋 (1−𝜋)𝑛

𝑧=𝑝−𝜋𝜎 𝑝

Page 20: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Type I Error Type I error occurs when the sample results lead to the

rejection of the null hypothesis when it is in fact true. The probability of type I error () is also called the level of

significance.

A General Procedure for Hypothesis TestingStep 3: Choose a Level of Significance

Chapter 16 - 20Copyright © 2011 Pearson Education, Inc.

Page 21: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Type II Error Type II error occurs when, based on the sample results, the

null hypothesis is not rejected when it is in fact false. The probability of type II error is denoted by . Unlike , which is specified by the researcher, the magnitude

of depends on the actual value of the population parameter (proportion).

A General Procedure for Hypothesis TestingStep 3: Choose a Level of Significance (Cont.)

Chapter 16 - 21Copyright © 2011 Pearson Education, Inc.

Page 22: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Power of a Test The power of a test is the probability (1 - ) of rejecting the

null hypothesis when it is false and should be rejected. Although is unknown, it is related to . An extremely low

value of (e.g., = 0.001) will result in intolerably high errors.

Therefore, it is necessary to balance the two types of errors.

A General Procedure for Hypothesis TestingStep 3: Choose a Level of Significance (Cont.)

Chapter 16 - 22Copyright © 2011 Pearson Education, Inc.

Page 23: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.7 Type I Error (α) and Type II Error (β)

Chapter 16 - 23Copyright © 2011 Pearson Education, Inc.

Page 24: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.8 Probability of z With a One-Tailed TestChosen Confidence Level = 95%

Chosen Level of Significance α=0.05

z = 1.645

Chapter 16 - 24Copyright © 2011 Pearson Education, Inc.

Page 25: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A General Procedure for Hypothesis TestingStep 4: Collect Data and Calculate Test Statistic

 

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 25

Page 26: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A General Procedure for Hypothesis TestingStep 4: Collect Data and Calculate Test Statistic (Cont.)

The test statistic z can be calculated as follows:

Chapter 16 - 26Copyright © 2011 Pearson Education, Inc.

Page 27: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A General Procedure for Hypothesis TestingStep 5: Determine the Probability (Critical Value)

Using standard normal tables (Table 2 of the Statistical Appendix), the probability of obtaining a z value of 1.83 can be calculated (see Figure 15.5).

The shaded area between - and 1.83 is 0.9664. Therefore, the area to the right of z = 1.83 is 1.0000 - 0.9664 = 0.0336.

Alternatively, the critical value of z, which will give an area to the right side of the critical value of 0.05, is between 1.64 and 1.65 and equals 1.645.

Note, in determining the critical value of the test statistic, the area to the right of the critical value is either or /2. It is for a one-tail test and /2 for a two-tail test.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 27

Page 28: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A General Procedure for Hypothesis TestingSteps 6 & 7: Compare the Probability (Critical Value) and Making the Decision

If the probability associated with the calculated or observed value of the test statistic (TSCAL) is less than the level of significance (), the null hypothesis is rejected.

The probability associated with the calculated or observed value of the test statistic is 0.0336. This is the probability of getting a p value of 0.44 when p = 0.40. This is less than the level of significance of 0.05. Hence, the null hypothesis is rejected.

Alternatively, if the absolute calculated value of the test statistic |(TSCAL)| is greater than the absolute critical value of the test statistic |(TSCR)|, the null hypothesis is rejected.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 28

Page 29: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A General Procedure for Hypothesis TestingSteps 6 & 7: Compare the Probability (Critical Value) and Making the Decision (Cont.)

The calculated value of the test statistic z = 1.83 lies in the rejection region, beyond the value of 1.645. Again, the same conclusion to reject the null hypothesis is reached.

Note that the two ways of testing the null hypothesis are equivalent but mathematically opposite in the direction of comparison.

If the probability of TSCAL < significance level () then reject H0 but if |TSCAL | > |TSCR | then reject H0.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 29

Page 30: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

A General Procedure for Hypothesis TestingStep 8: Marketing Research Conclusion

The conclusion reached by hypothesis testing must be expressed in terms of the marketing research problem.

In our example, we conclude that there is evidence that the proportion of customers preferring the new plan is significantly greater than 0.40. Hence, the recommendation would be to introduce the new service plan.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 30

Page 31: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.9 A Broad Classification of Hypothesis Testing Procedures

Hypothesis Testing

Test of Association Test of Difference

Means Proportions

Chapter 16 - 31Copyright © 2011 Pearson Education, Inc.

Page 32: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Cross-Tabulation

While a frequency distribution describes one variable at a time, a cross-tabulation describes two or more variables simultaneously.

Cross-tabulation results in tables that reflect the joint distribution of two or more variables with a limited number of categories or distinct values, e.g., Table 16.3.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 32

Page 33: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Table 16.3A Cross-Tabulation of Gender and Usage of Nike Shoes

Chapter 16 - 33Copyright © 2011 Pearson Education, Inc.

Usage

GENDER

Female Male Raw Total

Light Users 14 5 19

Medium Users 5 5 10

Heavy Users 5 11 16

Column Total 24 21

Page 34: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Two Variables Cross-Tabulation

Since two variables have been cross classified, percentages could be computed either column wise, based on column totals (Table 16.4), or row wise, based on row totals (Table 16.5).

The general rule is to compute the percentages in the direction of the independent variable, across the dependent variable. The correct way of calculating percentages is as shown in Table 16.4.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 34

Page 35: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Table 16.4 Usage of Nike Shoes by Gender

Usage

GENDER

Female Male

Light Users 58.4% 23.8%

Medium Users 20.8% 23.8%

Heavy Users 20.8% 52.4%

Column Total 100.0% 100.0%

Chapter 16 - 35Copyright © 2011 Pearson Education, Inc.

Page 36: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Table 16.5 Gender by Usage of Nike Shoes

Usage

GENDER

Female Male Raw Total

Light Users 73.7% 26.3% 100.0%

Medium Users 50.0% 50.0% 100.0%

Heavy Users 31.2% 68.8% 100.0%

Chapter 16 - 36Copyright © 2011 Pearson Education, Inc.

Page 37: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Statistics Associated with Cross-TabulationChi-Square

To determine whether a systematic association exists, the probability of obtaining a value of chi-square as large or larger than the one calculated from the cross-tabulation is estimated.

An important characteristic of the chi-square statistic is the number of degrees of freedom (df) associated with it. That is, df = (r - 1) x (c -1).

The null hypothesis (H0) of no association between the two variables will be rejected only when the calculated value of the test statistic is greater than the critical value of the chi-square distribution with the appropriate degrees of freedom, as shown in Figure 16.10.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 37

Page 38: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.10 Chi-Square Test of Association

2

Level of Significance, α

Chapter 16 - 38Copyright © 2011 Pearson Education, Inc.

Page 39: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The chi-square statistic (2) is used to test the statistical significance of the observed association in a cross-tabulation. The expected frequency for each cell can be calculated by using a simple formula:

Statistics Associated with Cross-TabulationChi-Square (Cont.)

where = total number in the row= total number in the column= total sample size

rncnn

Chapter 16 - 39Copyright © 2011 Pearson Education, Inc.

𝑓 𝑒=𝑛𝑟 𝑛𝑐

𝑛

Page 40: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

For the data in Table 16.3, for the six cells from left to right and top to bottom

= (24 x 19)/45 = 10.1 = (21 x 19)/45 =8.9

= (24 x 10)/45 = 5.3 = (21 x 10)/45 =4.7

= (24 x 16)/45 = 8.5 = (21 x 16)/45 =7.5

Expected Frequency

ef

ef

ef

ef

ef

ef

Chapter 16 - 40Copyright © 2011 Pearson Education, Inc.

Page 41: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

= (14 -10.1)2 + (5 – 8.9)2

10.1 8.9

+ (5 – 5.3)2 + (5 – 4.7)2

5.3 4.7

+ (5 – 8.5)2 + (11 – 7.5)2

8.5 7.5

= 1.51 + 1.71 + 0.02 + 0.02 + 1.44 + 1.63

= 6.33

Chapter 16 - 41Copyright © 2011 Pearson Education, Inc.

χ 2= ∑𝑎 𝑙𝑙𝑐𝑒𝑙𝑙𝑠

( 𝑓 0− 𝑓 𝑒)2

𝑓 𝑒

Page 42: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Statistics Associated with Cross-Tabulation Chi-Square (Cont.)

The chi-square distribution is a skewed distribution whose shape depends solely on the number of degrees of freedom. As the number of degrees of freedom increases, the chi-square distribution becomes more symmetrical.

Table 3 in the Statistical Appendix contains upper-tail areas of the chi-square distribution for different degrees of freedom. For 2 degrees of freedom, the probability of exceeding a chi-square value of 5.991 is 0.05.

For the cross-tabulation given in Table 16.3, there are (3-1) x (2-1) = 2 degrees of freedom. The calculated chi-square statistic had a value of 6.333. Since this is greater than the critical value of 5.991, the null hypothesis of no association is rejected indicating that the association is statistically significant at the 0.05 level.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 42

Page 43: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Statistics Associated with Cross-TabulationPhi Coefficient

The phi coefficient () is used as a measure of the strength of association in the special case of a table with two rows and two columns (a 2 x 2 table).

The phi coefficient is proportional to the square root of the chi-square statistic:

It takes the value of 0 when there is no association, which would be indicated by a chi-square value of 0 as well. When the variables are perfectly associated, phi assumes the value of 1 and all the observations fall just on the main or minor diagonal.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 43

Page 44: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Statistics Associated with Cross-TabulationContingency Coefficient

While the phi coefficient is specific to a 2 x 2 table, the contingency coefficient (C) can be used to assess the strength of association in a table of any size.

The contingency coefficient varies between 0 and 1. The maximum value of the contingency coefficient

depends on the size of the table (number of rows and number of columns). For this reason, it should be used only to compare tables of the same size.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 44

𝐶=√ 𝜒2

𝜒 2+𝑛

Page 45: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Statistics Associated with Cross-TabulationCramer’s V

Cramer's V is a modified version of the phi correlation coefficient, , and is used in tables larger than 2 x 2.

or

. Chapter 16 - 45

𝑉=√ 2

min (𝑟 −1 ) ,(𝑐−1)

Copyright © 2011 Pearson Education, Inc.

Page 46: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Cross-Tabulation in Practice

While conducting cross-tabulation analysis in practice, it is useful to proceed along the following steps:

Test the null hypothesis that there is no association between the variables using the chi-square statistic. If you fail to reject the null hypothesis, then there is no relationship.

If H0 is rejected, then determine the strength of the association using an appropriate statistic (phi-coefficient, contingency coefficient, or Cramer's V), as discussed earlier.

If H0 is rejected, interpret the pattern of the relationship by computing the percentages in the direction of the independent variable, across the dependent variable. Draw marketing conclusions.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 46

Page 47: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Figure 16.11 Conducting Cross-Tabulation Analysis

Construct the Cross-Tabulation Data

Calculate the Chi-Square Statistic, Test the Null Hypothesis of No Association

Reject H0?

Interpret the Pattern of Relationship by Calculating

Percentages in the Direction of the Independent Variable

No Association

Determine the Strength of Association Using an Appropriate Statistic

NO YES

Chapter 16 -47Copyright © 2011 Pearson Education, Inc.

Page 48: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Chapter 16 -48Copyright © 2011 Pearson Education, Inc.

Page 49: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Chapter 16 -49Copyright © 2011 Pearson Education, Inc.

Page 50: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Chapter 16 -50Copyright © 2011 Pearson Education, Inc.

Page 51: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

International Marketing Research

Report preparation can be complicated by the need to prepare reports for management in different countries and in different languages.

In oral presentations, the presenter should be sensitive to cultural norms. for example, telling jokes, which is frequently done in the united states, is not appropriate in all cultures.

Different recommendations might be made for implementing the research findings in different countries.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 51

Page 52: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Marketing Research & Social Media

Social media, particularly blogs and Twitter, can play a crucial role in disseminating the results and the report of a marketing research project and the decisions made by the company based on the findings.

Blogs can also provide an avenue for a company to obtain consumer reaction to the research findings as well as their feedback on the decisions made and actions taken by the company based on the research findings.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 52

Page 53: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Marketing Research & Social Media (Cont.)

Social media research results can be effectively presented using charts and graphs such as the Twitter trends’ statistics graph.

Social media community members’ stories can often be effective illustrations of statistical findings when used in reports or executive presentations.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 53

Page 54: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Ethics in Social Media

Ethical issues include ignoring pertinent data when drawing conclusions or making recommendations, not reporting relevant information (such as low response rates), deliberately misusing statistics, falsifying figures, altering research results, and misinterpreting the results with the objective of supporting a personal or corporate viewpoint.

The researchers should prepare reports that accurately and fully document the details of all the procedures and findings.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 54

Page 55: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Ethics in Social Media (Cont.)

Clients also have the responsibility for full and accurate disclosure of the research findings and are obligated to employ these findings honorably.

ethical issues also arise when client firms, such as tobacco companies, use marketing research findings to formulate questionable marketing programs.

Copyright © 2011 Pearson Education, Inc. Chapter 16 - 55

Page 56: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

The main program in SPSS is FREQUENCIES. It produces a table of frequency counts, percentages, and cumulative percentages for the values of each variable. It gives all of the associated statistics.

If the data are interval scaled and only the summary statistics are desired, the DESCRIPTIVES procedure can be used.

The EXPLORE procedure produces summary statistics and graphical displays, either for all of the cases or separately for groups of cases. Mean, median, variance, standard deviation, minimum, maximum, and range are some of the statistics that can be calculated.

SPSS Windows: Frequencies

Chapter 16 - 56Copyright © 2011 Pearson Education, Inc.

Page 57: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

To select these frequencies procedures, click the following:

Analyze > Descriptive Statistics > Frequencies . . .or

Analyze > Descriptive Statistics > Descriptives . . .or

Analyze > Descriptive Statistics > Explore . . .

We illustrate the detailed steps using the data of Table 16.1.

SPSS Windows: Frequencies

Chapter 16 - 57Copyright © 2011 Pearson Education, Inc.

Page 58: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

1. Select ANALYZE on the SPSS menu bar.

2. Click DESCRIPTIVE STATISTICS, and select FREQUENCIES.

3. Move the variable "Attitude toward Nike <attitude>" to the VARIABLE(S) box.

4. Click STATISTICS.

5. Select MEAN, MEDIAN, MODE, STD. DEVIATION, VARIANCE, and RANGE.

6. Click CONTINUE.

7. Click CHARTS.

8. Click HISTOGRAMS, then click CONTINUE.

9. Click OK.

SPSS Detailed Steps: Frequencies

Chapter 16 - 58Copyright © 2011 Pearson Education, Inc.

Page 59: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Analyze > Descriptive Statistics > Crosstabs

We illustrate the detailed steps using the data of Table 16.1.

SPSS Windows: Cross-tabulations

The major cross-tabulation program is CROSSTABS. This program will display the cross-classification tables and provide cell counts, row and column percentages, the chi-square test for significance, and all the measures of the strength of the association that have been discussed. To select these procedures, click the following:

Chapter 16 - 59Copyright © 2011 Pearson Education, Inc.

Page 60: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

1. Select ANALYZE on the SPSS menu bar.

2. Click DESCRIPTIVE STATISTICS, and select CROSSTABS.

3. Move the variable "User Group <usergr>" to the ROW(S) box.

4. Move the variable "Sex <sex>" to the COLUMN(S) box.

5. Click CELLS.

6. Select OBSERVED under COUNTS, and select COLUMN under PERCENTAGES.

7. Click CONTINUE.

8. Click STATISTICS.

9. Click CHI-SQUARE, PHI, and CRAMER'S V.

10. Click CONTINUE.

11. Click OK.

SPSS Detailed Steps: Cross-Tabulations

Chapter 16 - 60Copyright © 2011 Pearson Education, Inc.

Page 61: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Excel: Frequencies

The Data > Data Analysis > Histogram function computes the descriptive statistics. The output produces the mean, standard error, median, mode, standard deviation, variance, range, minimum, maximum, sum, count, and confidence level. Frequencies can be selected under the histogram function. A histogram can be produced in bar format.

We illustrate the detailed steps using the data of Table 16.1.

Chapter 16 - 61Copyright © 2011 Pearson Education, Inc.

Page 62: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

1. Select DATA tab.

2. In the ANALYSIS Group, select DATA ANALYSIS.

3. The DATA ANALYSIS Window pops up.

4. Select HISTOGRAM from the DATA ANALYSIS Window.

5. Click OK.

6. The HISTOGRAM pop-up window appears on screen.

7. The HISTOGRAM window has two portions:

a. INPUT

b. OUTPUT OPTIONS

Excel Detailed Steps: Frequencies

Chapter 16 - 62Copyright © 2011 Pearson Education, Inc.

Page 63: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

8. Input portion asks for two inputs.

a. Click in the INPUT RANGE box and select (highlight) all 45

rows under ATTITUDE. $D$2: $D$46 should appear

in the INPUT RANGE box.

b. Do not enter anything into the BIN RANGE box.

(Note that this input is optional; if you do not provide a

reference, the system will take values based on

range of the data set).

9. In the OUTPUT portion of pop-up window, select the following options:

a. NEW WORKBOOK

b. CUMULATIVE PERCENTAGE

c. CHART OUTPUT

10. Click OK.

Excel Detailed Steps: Frequencies (Cont.)

Chapter 16 - 63Copyright © 2011 Pearson Education, Inc.

Page 64: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Excel: Cross-Tabulations

The Insert > Pivot Table function performs cross-tabulations in Excel. To do additional analysis or customize data, select a different summary function, such as maximum, minimum, average, or standard deviation. In addition, a custom calculation can be selected to analyze values based on other cells in the data plane. The Chi-square Test can be accessed under Formulas > Insert Function > ChiTest.

We illustrate the detailed steps using the data of Table 16.3

Chapter 16 - 64Copyright © 2011 Pearson Education, Inc.

Page 65: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

1. Select Insert (Alt + N).

2. Click on Pivot Table. The Pivot Table window pops up.

3. Select columns A to D and rows 1 to 46. “$A$1:$D$46” should appear in the range box.

4. Select NEW WORKSHEET in the CREATE PIVOT TABLE window.

5. Click OK.

Excel Detailed Steps: Cross-Tabulations

Chapter 16 - 65Copyright © 2011 Pearson Education, Inc.

Page 66: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

6. Drag variables into the layout on the left in the following format:

SEX

USERGR CASENO (Double-click CASENO

and select Count)

Excel Detailed Steps: Cross-Tabulations (Cont.)

Chapter 16 - 66Copyright © 2011 Pearson Education, Inc.

Page 67: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

7. Right click on SUM OF CASENO in the lower-right corner. Select “VALUE FIELD SETTINGS”.

8. Under “SUMMARIZE VALUE FIELD BY” select COUNT.

9. Click OK.

Excel Detailed Steps: Cross-Tabulations (Cont.)

Chapter 16 - 67Copyright © 2011 Pearson Education, Inc.

Page 68: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Exhibit 16.1Other Computer Programs: Frequencies

MINITAB

The main function is STAT > DESCRIPTIVE STATISTICS. The output values include the mean, median, mode, standard deviation, minimum, maximum, and quartiles. A histogram in a bar chart or graph can be produced from the GRAPH > HISTOGRAM selection.

SAS

The SUMMARY STATISTICS task provides summary statistics including basic summary statistics, percentile summary statistics, and more advanced summary statistics including confidence intervals, t statistics, coefficient of variation, and sums of squares. It also provides graphical displays including histograms. The ONE-WAY FREQUENCIES task can be used to generate frequency tables as well as binomial and chi-square tests.

Chapter 16 - 68Copyright © 2011 Pearson Education, Inc.

Page 69: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Exhibit 16.2 Other Computer Programs: Cross-Tabulations

MINITAB

In Minitab, cross-tabulations and chi-square are under the STAT > TABLES function. Each of these features must be selected separately under the Tables function.

SAS

The major cross-tabulation task is called TABLE ANALYSIS. This task will display the cross-classification table and provide cell counts, row and column percentages, the chi-square test for significance, and all the measures of strength of the association that have been discussed.

Chapter 16 - 69Copyright © 2011 Pearson Education, Inc.

Page 70: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Copyright © 2011 Pearson Education, Inc.

Acronym: Frequencies

The statistics associated with frequencies can be summarized by the acronym FREQUENCIES:

F requency histogram

R ange

E stimate of location: mean

Q uotients: percentages

U ndulation: variance

E stimate of location: mode

N umbers or counts

C umulative percentage

I ncorrect and missing values

E stimate of location: median

S hape of the distributionChapter 11 - 70

Page 71: 16-1 Chapter 16 Data Analysis: Frequency Distribution, Hypothesis Testing, and Cross-Tabulation

Copyright © 2011 Pearson Education, Inc.

Acronym: Tabulate

The salient characteristics of cross-tabulations can be summarized by the acronym TABULATE:

T wo variables at a time

A ssociation and not causation is measured

B ased on cell count of at least five

U derstood easily by managers

L imited number of categories

A ssociated statistics

T wo ways to calculate percentages

E xpected cell frequencies

Chapter 11 - 71