Chapter 17: Statistical Analysis
CONTENTS
• The statistics approach
• Statistical tests
  – Types of data and appropriate tests
  – Chi-square
  – Comparing two means: the t-test
  – A number of means: one-way analysis of variance
  – A table of means: factorial analysis of variance
  – Correlation
  – Linear regression
  – Multiple regression
  – Factor and cluster analysis
The statistics approach
• Probabilistic statements
• The normal distribution
• Probabilistic statement formats
• Significance
• The null hypothesis
• Dependent and independent variables
A. J. Veal & S. Darcy (2014) Research Methods for Sport Studies and Sport Management: A practical guide. London: Routledge
Probabilistic statements
• descriptive: e.g. 10% of adults play tennis
• comparative: e.g. 10% play tennis, but 12% play golf
• relational: e.g. 15% of people with high incomes play tennis but only 7% of people with low incomes do so: there is a positive relationship between tennis-playing and income.
• However: when based on a sample, the above statements must be made in a probabilistic format
Probabilistic statements contd
• We can be 95% confident that the proportion of adults that plays tennis is between 9% and 11%
• The proportion of golf players is significantly higher than the proportion of tennis players (at the 95% level of probability)
• There is a positive relationship between level of income and level of tennis playing (at the 95% level)
• (See discussion of Confidence intervals: Chapt 13).
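The 95% confidence statement above can be illustrated with a minimal sketch. It assumes a simple random sample and the normal approximation for a proportion; the sample size of 3,500 is hypothetical (the slide does not give one) and was chosen only because it yields roughly the 9% to 11% range quoted.

```python
import math

def proportion_confidence_interval(p, n, z=1.96):
    """Approximate 95% confidence interval for a sample proportion p from n cases."""
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error

# Hypothetical sample size; 10% is the tennis figure used on the slide.
low, high = proportion_confidence_interval(0.10, 3500)
print(f"95% confident the proportion is between {low:.1%} and {high:.1%}")
```

A smaller sample would give a wider interval around the same 10%.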
Probabilistic statement formats
• 95% probability
  – sometimes expressed as 5%
  – sometimes as 0.05
• 99% probability is also used
  – also expressed as 1% or 0.01
Normal distribution (Fig. 17.1):
a. Drawing repeated samples (theory)
[Figure: sample values from repeated samples spread either side of the population value, on a scale from -4 to +4]
Normal distribution contd
b. Normal distribution/curve
[Figure: sample values forming a normal curve centred on the population value, on a scale from -4 to +4]
Normal curve (Fig. 13.1)
[Figure: number of samples plotted against standard errors (-4 to +4) either side of the population value; the central area holds 95% of samples, with 2.5% in each tail]
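The 95% central area and the 2.5% tails of the normal curve can be checked numerically. This is a small sketch using Python's standard library; the cut-off of 1.96 standard errors is the conventional value for 95% and is not stated on the slide.

```python
from statistics import NormalDist

# Standard normal curve: centre 0, measured in standard errors.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

central_area = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)
lower_tail = standard_normal.cdf(-1.96)

print(f"Within +/-1.96 standard errors: {central_area:.1%}")
print(f"In each tail: {lower_tail:.1%}")
```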
Significance
• Statistically significant: unlikely to have happened by chance, so highly probable that the finding also holds in the population
• Level of significance is affected by sample size (not by population size)
• The probability of a finding happening by chance is related to the normal curve and similar theoretical distributions.
• But NB: small differences or weak relationships may not be socially or managerially significant – even when they are statistically significant
Null hypothesis
• H0 – Null hypothesis: there is no significant difference or relationship
• H1 – Alternative hypothesis: there is a significant difference or relationship
• e.g.
  – H0: tennis and golf participation levels are the same;
  – H1: tennis and golf participation levels are significantly different.
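As a rough illustration of testing H0 against H1, the sketch below applies a two-proportion z-test to the 10% (tennis) and 12% (golf) figures used earlier. It assumes two independent random samples of equal, hypothetical size, which is a simplification: the slides do not say how the figures were collected. Running it for two sample sizes also shows how the level of significance depends on sample size.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(p1, n1, p2, n2):
    """z statistic and two-tailed p-value for H0: the two proportions are equal."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

results = {}
for sample_size in (500, 5000):  # hypothetical sample sizes
    z, p_value = two_proportion_z_test(0.10, sample_size, 0.12, sample_size)
    results[sample_size] = p_value
    decision = "reject H0" if p_value < 0.05 else "do not reject H0"
    print(f"n = {sample_size}: z = {z:.2f}, significance = {p_value:.3f}, {decision}")
```

Under these assumptions the same two-point difference is not significant with 500 cases per sample but is significant with 5,000.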
Dependent and independent variables
[Diagram: Independent variables 1, 2 and 3, each linked to the dependent variable]
Statistical tests
Task | Format of data | No. of variables | Types of variable | Test
Relationship between 2 variables | Crosstabulation of frequencies | 2 | Nominal | Chi-square
Difference between 2 means - paired | Means: for a whole sample | 2 | Two scale/ordinal | t-test - paired
Difference between 2 means - independent samples | Means: for 2 sub-groups | 2 | 1. scale/ordinal (means); 2. nominal (2 groups only) | t-test - independent samples
Statistical tests contd
Task | Format of data | No. of variables | Types of variable | Test
Relationship between 2 variables | Means: for 3+ sub-groups | 2 | 1. scale/ordinal (means); 2. nominal (3+ groups) | One-way analysis of variance
Relationship between 3 or more variables | Means: crosstabulated | 3+ | 1. scale/ordinal (means); 2. two or more nominal | Factorial analysis of variance
Statistical tests contd
Task | Format of data | No. of variables | Types of variable | Test
Relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Correlation
Linear relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Linear regression
Linear relationship between 3+ variables | Individual measures | 3+ | Three or more scale/ordinal | Multiple regression
Relationships between large nos of variables | Individual measures | Many | Large nos of scale/ordinal | Factor analysis; cluster analysis
Data
• Extended version of Campus Sporting Life survey with
  – additional variables
  – additional cases
• See Appendix 17.2
• SPSS used, as in Chapt. 16
Chi-square (χ²)
• Testing the relationship between two variables presented in a frequency crosstabulation.
• Null/alternative hypotheses:
  – H0: there is no relationship between student status and gender in the population
  – H1: there is a relationship between status and gender in the population
• Findings (Fig. 17.5):
  – Value of Chi-square: 6.522
  – Significance: 0.011
  – Less than 0.05 (5%)
  – Conclusion: H0 rejected, H1 accepted: there is a relationship
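The slides use SPSS for this test. As a hedged illustration of what the calculation involves, the sketch below computes Pearson's chi-square (without continuity correction) for a 2 x 2 crosstabulation. The counts are hypothetical, not the survey data, and the closed-form significance used here holds only for one degree of freedom.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if the two variables were unrelated (H0).
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def significance_one_df(chi2):
    """Upper-tail probability of chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: rows are two groups, columns are two categories.
observed_counts = [[30, 20], [15, 35]]
chi2_value = chi_square_2x2(observed_counts)
significance = significance_one_df(chi2_value)
print(f"Chi-square = {chi2_value:.3f}, significance = {significance:.4f}")

# The slide's value of 6.522, if it had one degree of freedom (an assumption;
# the slide does not give the dimensions of the table).
slide_significance = significance_one_df(6.522)
```

Under that one-degree-of-freedom assumption, the slide's statistic of 6.522 corresponds to a significance of about 0.011, in line with the reported figure.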
Comparing two means: t-test
• Paired samples: whole sample: comparing means for two variables
• Independent samples: sample divided into two groups (eg. males and females) and comparing means for one variable
Comparing 2 means: t-test : Paired samples (Fig. 17.9)
• Example 1: Compare average times played sport in last 3 months (12.2) with average times visited national parks (9.8)
• Difference is 2.4
• Value of t is 1.245
• Significance is 0.219, which is larger than 0.05
• Null hypothesis is accepted: difference is not significant
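A paired-samples t value is the mean of the case-by-case differences divided by its standard error. The sketch below shows this with ten hypothetical respondents (not the survey data); the critical value of 2.262 is the standard two-tailed 5% value from t tables for 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """t statistic for the difference between two measures on the same cases."""
    differences = [a - b for a, b in zip(first, second)]
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error

# Hypothetical counts over three months for the same ten people.
times_played_sport = [10, 10, 12, 12, 11, 7, 11, 14, 10, 8]
times_visited_parks = [8, 11, 9, 12, 10, 9, 7, 13, 10, 6]

t_value = paired_t(times_played_sport, times_visited_parks)
CRITICAL_T_DF9 = 2.262  # two-tailed, 0.05 level, 9 degrees of freedom
is_significant = abs(t_value) > CRITICAL_T_DF9
print(f"t = {t_value:.3f}, significant at 0.05: {is_significant}")
```

With these made-up figures t falls below the critical value, so the null hypothesis would be retained, the same kind of outcome as on the slide.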
Comparing 2 means: t-test: Independent samples (Fig. 17.10)
• Compare course costs for males ($110.00 pa) and females ($136.60)
• Difference is $28.60
• Value of t is -1.245
• Significance is 0.219
• Null hypothesis is accepted: difference is not significant
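When the sample is split into two groups, the t value compares the two group means using a pooled estimate of variance. This is a minimal sketch assuming equal variances in the two groups; the costs are hypothetical, and 2.306 is the standard two-tailed 5% critical value for 8 degrees of freedom.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for the means of two separate groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical annual course costs in dollars.
male_costs = [100, 120, 110, 90, 130]
female_costs = [140, 120, 150, 130, 110]

t_value = independent_t(male_costs, female_costs)
CRITICAL_T_DF8 = 2.306  # two-tailed, 0.05 level, 8 degrees of freedom
is_significant = abs(t_value) > CRITICAL_T_DF8
print(f"t = {t_value:.3f}, significant at 0.05: {is_significant}")
```

The negative sign simply reflects that the first group's mean is the lower of the two.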
One-way analysis of variance (ANOVA) (Fig. 17.11, 13)
• Means of one variable for groups defined by another variable
• F-test rather than t-test
• e.g. Means of times played sport by student status:
  – F/T student/no paid work: mean = 9.7 times in 3 months
  – F/T student/paid work: 9.6 times
  – P/T student – F/T job: 19.1 times
  – P/T student – Other: 12.2 times
• Value of F: 2.485, Significance 0.072, which is greater than 0.05
• Null hypothesis accepted: no relationship between status and sport
• But for ‘going out for a meal’: F = 6.64 and Sig. = 0.001, which is less than 0.05, so null hypothesis rejected: there is a significant relationship
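The F value compares the variation between group means with the variation within groups. The sketch below computes it for three small hypothetical groups (not the survey data); it stops at F, since the significance would come from the F distribution as reported by a package such as SPSS.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical times played sport in three months, by group.
groups = [
    [8, 10, 12],
    [7, 9, 11],
    [15, 19, 23],
]
f_value = one_way_anova_f(groups)
print(f"F = {f_value:.3f}")
```

The larger F is, the less plausible it is that all groups share the same population mean.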
Factorial analysis of variance (Fig. 17.14, 15)
• A table of means: two variables and means of a third
• e.g. Mean visits to theatre by gender by student status

Mean number of visits in three months
Status                      Male   Female
F/T student/no paid work    3.1    1.5
F/T student/paid work       1.6    5.4
P/T student - F/T job       1.4    2.6
P/T student/Other           3.5    3.2

• Status not significant and gender not significant
• But for status x gender: F = 3.681, Sig. = 0.019, which is <0.05, so null hypothesis rejected: there is a significant relationship.
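One way to see why the status x gender combination can be significant when neither variable is significant on its own is to look at the male/female gap within each status group. The sketch below uses the means from the table above; it only describes the pattern and does not reproduce the F-test itself.

```python
# Mean theatre visits in three months, copied from the table of means.
mean_visits = {
    "F/T student/no paid work": {"male": 3.1, "female": 1.5},
    "F/T student/paid work": {"male": 1.6, "female": 5.4},
    "P/T student - F/T job": {"male": 1.4, "female": 2.6},
    "P/T student/Other": {"male": 3.5, "female": 3.2},
}

# Male minus female mean, per status group.
gender_gaps = {
    status: round(means["male"] - means["female"], 1)
    for status, means in mean_visits.items()
}
for status, gap in gender_gaps.items():
    print(f"{status}: male minus female = {gap:+.1f}")

# The gap changes direction across groups, the pattern an interaction term picks up.
gap_changes_direction = len({gap > 0 for gap in gender_gaps.values()}) > 1
```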
Correlation (Fig. 17.16) Watched sport by income: weak positive: r = 0.46
Correlation (Fig. 17.16) Played sport by income: weak negative: r = -0.44
Correlation (Fig. 17.16) Sport exp. by income: strong positive: r = 0.91
Correlation (Fig. 17.18)
• Correlation coefficient (r) expresses the relationship numerically
• No relationship: r = 0
• Exact relationship: r = 1 (positive) or -1 (negative)
• Correlation matrix shows correlations between a number of variables
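The correlation coefficient can be computed directly from paired measures. This is a minimal sketch of Pearson's r; the variable names and values are hypothetical and are not taken from the survey.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

income_band = [1, 2, 3, 4, 5]            # hypothetical
sport_spending = [2, 4, 5, 4, 5]         # hypothetical
r_partial = pearson_r(income_band, sport_spending)

# Exact relationships give r = 1 (positive) or r = -1 (negative).
r_exact_positive = pearson_r(income_band, [2, 4, 6, 8, 10])
r_exact_negative = pearson_r(income_band, [10, 8, 6, 4, 2])
print(round(r_partial, 2), round(r_exact_positive, 2), round(r_exact_negative, 2))
```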
Correlation matrix (simplified Fig. 17.18)
             Income   Sport    Watch sport   Visit park   Meal    Sport exp.
Income       1.00
Sport        -.44**   1.00
Watch sport  .46**    -.68**   1.00
Visit park   .02      .27      -.29*         1.00
Meal         .08      .45**    -.29*         -.04         1.00
Sport exp.   .91**    -.37**   .38           .06          .12     1.00
* = significant at the 0.05 level
** = significant at the 0.01 level
Regression: fits a best-fit ‘regression line’ to a scatterplot (Fig. 17.21)
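A best-fit straight line is usually found by least squares, which is the method assumed in this sketch. The data points are hypothetical and serve only to show how the slope and intercept are obtained.

```python
def fit_regression_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

income_band = [1, 2, 3, 4, 5]        # hypothetical independent variable
sport_spending = [2, 4, 5, 4, 5]     # hypothetical dependent variable
slope, intercept = fit_regression_line(income_band, sport_spending)
print(f"spending = {intercept:.1f} + {slope:.1f} x income band")
```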
Regression: best fit may be a curve (Fig. 17.22)
Multi-variate analysis
• Multiple regression has one dependent variable and a number of independent, influencing, variables
• One development: Structural Equation Modelling explores inter-relationships between a number of variables
• Cluster and factor analysis: combine large numbers of variables into groups – eg. lifestyle or personality groups
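Multiple regression extends the regression line to several independent variables. The sketch below fits the coefficients by solving the least-squares normal equations with plain Gaussian elimination; it is one simple way to do it, not how SPSS works internally. The data are hypothetical and were constructed so that the dependent variable equals 1 + 2 x (first variable) + 3 x (second variable) exactly.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so the inputs are left unchanged.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def multiple_regression(predictors, y):
    """Return [intercept, b1, b2, ...] for rows of independent variables."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1 gives the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical cases: two independent variables and one dependent variable.
predictors = [[1, 1], [2, 1], [3, 2], [4, 3], [5, 5], [6, 4]]
dependent = [6, 8, 13, 18, 26, 25]
coefficients = multiple_regression(predictors, dependent)
print([round(c, 3) for c in coefficients])
```

Because the example data follow the equation exactly, the fitted coefficients recover the intercept of 1 and the weights 2 and 3; real survey data would leave unexplained variation.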