Overview Identify and explain descriptive and inferential (parametric) statistics. Descriptive statistics are the measures of central tendency. Mean,


Overview

• Identify and explain descriptive and inferential (parametric) statistics.

• The descriptive statistics covered are the measures of central tendency and dispersion.
• Mean, median, mode, standard deviation, and standard error of the mean.

• The inferential statistics are:
• The z-test, t-test, F-test, and the Pearson correlation coefficient (r).


Overview

• To provide the information necessary to interpret the tables for these tests as presented in the Statistical Package for the Social Sciences (SPSS).

• This includes:
• Confidence level
• Confidence interval
• Significance (sig.)
• Assumption of equal variance


Quantitative Research

• Most of the quantitative research done at the MBA level is survey research.

• Issues concerning quantitative survey research design.
• What are the variables for the study?
• How will these variables be measured?
• How will the sample be selected?
• How will the data be collected?
• How many participants are needed?
• How will the data be analyzed?


Quantitative Research

• There are four types of quantitative research questions.

Type of Research Question | Survey Research Example
Descriptive | What are participants’ opinions about hazardous waste?
Group Difference | Is there a significant difference in opinions about hazardous waste between Democrats and Republicans?
Relationship | Is there a significant relationship between age of participants and their opinions about hazardous waste?
Prediction | Can opinions about hazardous waste be accurately predicted by level of education or religious preference?


Descriptive Questions

• Descriptive questions count and group responses.
• In our study regarding participants’ opinions about hazardous waste, descriptive questions would be used to:
• Count similar opinions
• Group opinions by demographic characteristics (age, education level, sex, etc.)
• Describe grouped opinions by average, most often expressed, degree of intensity, etc.
• Descriptive statistics are used to answer these questions.
• Mean, median, mode, standard deviation.


Group Difference Questions

• Group difference questions compare the responses of one group of participants with those of another group to identify significant differences.
• A single sample can be compared to a constant – a researcher might want to test whether a group’s average level of concern expressed on a hazardous waste survey differs from 10 (on a 1 to 10 scale).

• Two independent samples can be compared with each other - a researcher might want to test whether the average level of concern expressed by Republicans on a hazardous waste survey differs from that expressed by Democrats.


Group Difference Questions

• Two paired samples can be compared with each other – a researcher might want to test whether the average level of concern expressed by a group on a hazardous waste survey before a hazardous waste ad campaign differs from that expressed by the same group after the ad campaign.

• Mean comparison statistics are used to answer these questions.
• One-sample t test
• Independent samples t test
• Paired samples t test


Relationship Questions

• Relationship questions look for relationships between dependent and independent variables.
• A researcher may want to determine if there is a significant relationship between a participant’s age and the level of concern expressed on a hazardous waste survey.

• Bivariate correlation statistics are used to answer these questions.
• Pearson’s correlation coefficient.


Prediction Questions

• Prediction questions look for independent variables that can be used to predict the outcome of the dependent variable.
• A researcher may want to determine if participants’ level of education or religious preference predicts the average level of concern expressed on a hazardous waste survey.

• Forecasting statistics are used to answer these questions.
• Linear regression analysis
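
As a rough illustration of the linear regression analysis named above, the following Python sketch fits a least-squares line with a single predictor. The data (years of education and a concern score) are hypothetical and are not taken from the study; a nominal predictor such as religious preference would need a different coding and is left out here.

```python
# Hypothetical data (not from the slides): years of education and
# level of concern expressed on a hazardous waste survey.
education = [10, 12, 14, 16, 18]
concern = [3, 4, 5, 5, 7]

n = len(education)
mean_x = sum(education) / n
mean_y = sum(concern) / n

# Least-squares estimates for: concern = intercept + slope * education
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(education, concern))
s_xx = sum((x - mean_x) ** 2 for x in education)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict(years):
    """Predicted concern score for a given number of years of education."""
    return intercept + slope * years


print(slope, intercept, predict(15))
```

The slope is the predicted change in the dependent variable for each one-unit change in the independent variable.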


Levels of Measurement

• In putting together a survey the researcher must pay attention to the level of measurement used to collect data on the variables in the study.

• There are four levels or scales of measurement.
• Nominal
• Ordinal
• Interval
• Ratio


Nominal Measurements

• In nominal measurement the numerical values just "name" the attribute uniquely. No ordering of the cases is implied.

• For example, jersey numbers in basketball are measured at the nominal level. A player with number 30 is not more of anything than a player with number 15, and is certainly not twice whatever number 15 is.

• Examples include:
• 1. Male 2. Female
• 1. Caucasian 2. Indian 3. Asian
• 1. Married 2. Single 3. Divorced


Ordinal Measurements

• In ordinal measurement the attributes can be rank-ordered. Here, distances between attributes do not have any meaning.

• For example, you might code Educational Attainment as 0=less than H.S., 1=some H.S., 2=H.S. degree, 3=some college, 4=college degree, 5=post college.

• In this measure, higher numbers mean more education. But is the distance from 0 to 1 the same as the distance from 3 to 4? Of course not. The interval between values is not interpretable in an ordinal measure.

• Examples include:
• Movie ratings
• Class standing – Freshman, Sophomore, Junior, Senior


Interval Measurements

• In interval measurement the distance between attributes does have meaning. For temperature (in Fahrenheit), the distance from 30 to 40 is the same as the distance from 70 to 80.

• Because the interval between values is interpretable, we can compute an average of an interval variable, whereas we cannot for ordinal scales.

• Note that in interval measurement ratios don't make any sense - 80 degrees is not twice as hot as 40 degrees.

• Examples include measures where 0 (zero) does not mean the absence of the attribute.


Ratio Measurements

• In ratio measurement there is always an absolute zero that is meaningful. This means that you can construct a meaningful fraction (or ratio) with a ratio variable.

• Weight is a ratio variable.
• In applied social research most "count" variables are ratio, for example, the number of clients in the past six months. Why? Because you can have zero clients and because it is meaningful to say that "...we had twice as many clients in the past six months as we did in the previous six months."

• Examples include: degrees Kelvin, annual income in dollars, length or distance in inches, feet, miles, etc.


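Levels of Measurement

Allowable descriptive statistics, arithmetic operations, and statistics at each level:

• Nominal. Descriptive: mode. Arithmetic operations: counts.
• Ordinal. Descriptive: median, mode. Arithmetic operations: greater or less than.
• Interval. Descriptive: mean, median, mode, standard deviation. Arithmetic operations: addition and subtraction of scale values. Statistics: z test, t test, F test, Pearson’s correlation coefficient, linear regression.
• Ratio. Descriptive: mean, median, mode, standard deviation. Arithmetic operations: multiplication and division of scale values. Statistics: z test, t test, F test, Pearson’s correlation coefficient, linear regression.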

• In social science research the Likert scale is often used to promote ordinal values to interval values. This allows the data to be analyzed using inferential statistics.


The Likert Scale

• The scale is named after Rensis Likert, who first developed it.

• The Likert scale is a psychometric scale commonly used in questionnaires, and is the most widely used scale in survey research.

• When responding to a Likert questionnaire item, respondents specify their level of agreement to a statement.

A Seven Level Likert Scale

1 2 3 4 5 6 7


The Likert Scale

• Likert scales can be used to evaluate subjective or objective criteria.

• They are bipolar scales in that generally some level of agreement or disagreement is measured.

• Many social science researchers recommend scales consisting of 7 to 9 items.

A Seven Level Likert Scale

1 2 3 4 5 6 7


The Likert Scale

• Likert scale values should always be accompanied by item labels.

• Without labels a mean result with a scale value of 1.9 would be reported as 1.9 on a scale of 1 to 7.

• With labels a mean result of 1.9 can additionally be reported as ‘Dissatisfied’. This adds meaning to the interpretation of the results.

A Seven Level Likert Scale

1 = Very Dissatisfied, 2 = Dissatisfied, 3 = Somewhat Dissatisfied, 4 = Neither Dissatisfied or Satisfied, 5 = Somewhat Satisfied, 6 = Satisfied, 7 = Very Satisfied


The Likert Scale

• Some researchers object to the middle item indicating the absence of satisfaction (or absence of agreement or disagreement).

• They suggest that participants should be forced to respond or to select a no response option.

A Seven Level Likert Scale

1 = Very Dissatisfied, 2 = Dissatisfied, 3 = Somewhat Dissatisfied, 4 = Neither Dissatisfied or Satisfied, 5 = Somewhat Satisfied, 6 = Satisfied, 7 = Very Satisfied


The Likert Scale

• Moving the no response option out of the scale adds to the meaningfulness of measures of central tendency (mean, median, mode).

• It also allows data to be interpreted in two groups.
• The researcher can indicate the number of participants responding as neither satisfied nor dissatisfied.
• And, when the calculations are made they are based on participants with an opinion about satisfaction.

A Six Level Likert Scale

1 = Very Dissatisfied, 2 = Dissatisfied, 3 = Somewhat Dissatisfied, 4 = Somewhat Satisfied, 5 = Satisfied, 6 = Very Satisfied, 0 = Neither Satisfied or Dissatisfied
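
The two-group interpretation described above can be sketched in a few lines of Python. The responses below are hypothetical, and the coding of 0 for the no-opinion option follows the six level scale on this slide.

```python
# Hypothetical responses on the six level scale.
# 0 = "Neither Satisfied or Dissatisfied" (the no-opinion option).
responses = [1, 5, 6, 0, 4, 0, 2, 5, 6, 3]

# Group 1: how many participants expressed no opinion.
no_opinion_count = responses.count(0)

# Group 2: participants with an opinion; only these enter the mean.
with_opinion = [r for r in responses if r != 0]
mean_opinion = sum(with_opinion) / len(with_opinion)

print(no_opinion_count, len(with_opinion), mean_opinion)
```

Keeping the 0 codes out of the calculation matters: including them would pull the mean toward the dissatisfied end of the scale even though they express no dissatisfaction.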


Types of Statistics

• There are two types of statistics used in social science research.
• Descriptive
• Inferential

• Descriptive statistics refer to methods used to organize, summarize, and tabulate data.

• Descriptive statistics provide a picture of what happened in the study.

• Descriptive statistics provide a basis for inferential statistics.


Types of Statistics

• Inferential statistics refers to methods used to draw inferences about a population based on descriptive data available on a sample drawn from the population.


The Mean

• The mean is the sum of the individual observations (x) divided by the number of observations (n).

• μ (mu) is the mean of the entire population.

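• $\bar{x}$ (x-bar) is the mean of the sample.

Observation | x
1 | 60
2 | 34
3 | 74
4 | 10
5 | 86
6 | 59
7 | 34
8 | 50
9 | 43
10 | 59
11 | 68
12 | 35
13 | 53
14 | 28
15 | 82
16 | 47
17 | 60
18 | 40
19 | 19
20 | 59
$\sum x$ | 1,000

$\bar{x} = \frac{\sum x}{n} = \frac{1{,}000}{20} = 50$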
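
The calculation on this slide can be checked with a short Python sketch. The list holds the 20 observations from the table; the variable names are only illustrative.

```python
# The 20 observations from the slide's table.
x = [60, 34, 74, 10, 86, 59, 34, 50, 43, 59,
     68, 35, 53, 28, 82, 47, 60, 40, 19, 59]

n = len(x)        # number of observations
total = sum(x)    # sum of the observations
mean = total / n  # sample mean (x-bar)

print(n, total, mean)
```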


The Median

• The median is the number that separates the higher half of a sample from the lower half.

• It is found by arranging the observations from highest to lowest and picking the middle one.

• When the number of observations is even, the median is the mean* of the two middle observations.

Observation | x
1 | 86
2 | 82
3 | 74
4 | 68
5 | 60
6 | 60
7 | 59
8 | 59
9 | 59
10 | 53
11 | 50
12 | 47
13 | 43
14 | 40
15 | 35
16 | 34
17 | 34
18 | 28
19 | 19
20 | 10

$\text{median} = \frac{53 + 50}{2} = 51.5$

* The strength of the median as a measure of central tendency is that, unlike the mean, it is a value that occurs in the sample. This strength is nullified when the median is the mean of the two middle observations.
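
A minimal Python sketch of the same procedure, using the 20 observations from the slides: sort from highest to lowest, then average the two middle values because n is even. The standard library's `statistics.median` is shown only as a cross-check.

```python
from statistics import median

x = [60, 34, 74, 10, 86, 59, 34, 50, 43, 59,
     68, 35, 53, 28, 82, 47, 60, 40, 19, 59]

ordered = sorted(x, reverse=True)  # highest to lowest, as on the slide
mid = len(ordered) // 2            # index of the 11th observation

# n is even here, so the median is the mean of the 10th and 11th observations.
upper_middle = ordered[mid - 1]
lower_middle = ordered[mid]
median_by_hand = (upper_middle + lower_middle) / 2

print(upper_middle, lower_middle, median_by_hand, median(x))
```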


The Mode

• The mode is the value that occurs most frequently in a sample.

• When multiple values occur with the highest frequency the sample is said to be bi-modal or multi-modal.

Observation | x
1 | 86
2 | 82
3 | 74
4 | 68
5 | 60
6 | 60
7 | 59
8 | 59
9 | 59
10 | 53
11 | 50
12 | 47
13 | 43
14 | 40
15 | 35
16 | 34
17 | 34
18 | 28
19 | 19
20 | 10
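
Counting how often each value occurs makes the mode easy to read off. The sketch below uses the same 20 observations; `statistics.multimode` returns every value tied for the highest frequency, so it also covers the bi-modal and multi-modal cases.

```python
from collections import Counter
from statistics import multimode

x = [60, 34, 74, 10, 86, 59, 34, 50, 43, 59,
     68, 35, 53, 28, 82, 47, 60, 40, 19, 59]

counts = Counter(x)                      # frequency of each value
top_value, top_count = counts.most_common(1)[0]
modes = multimode(x)                     # all values sharing the top frequency

print(top_value, top_count, modes)
```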


Standard Deviation

• It is a measure of the dispersion of a collection of numbers.

• It indicates how widely spread the values in a dataset are with respect to their mean.

[Figure: A data set with a mean of 50 (shown in blue) and a standard deviation (σ) of 20.]


Standard Deviation

• It is calculated by determining the square root of the variance.

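$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{7{,}612}{19}} \approx 20$

Observation | $x$ | $\bar{x}$ | $x - \bar{x}$ | $(x - \bar{x})^2$
1 | 60 | 50 | 10 | 100
2 | 34 | 50 | -16 | 256
3 | 74 | 50 | 24 | 576
4 | 10 | 50 | -40 | 1600
5 | 86 | 50 | 36 | 1296
6 | 59 | 50 | 9 | 81
7 | 34 | 50 | -16 | 256
8 | 50 | 50 | 0 | 0
9 | 43 | 50 | -7 | 49
10 | 59 | 50 | 9 | 81
11 | 68 | 50 | 18 | 324
12 | 35 | 50 | -15 | 225
13 | 53 | 50 | 3 | 9
14 | 28 | 50 | -22 | 484
15 | 82 | 50 | 32 | 1024
16 | 47 | 50 | -3 | 9
17 | 60 | 50 | 10 | 100
18 | 40 | 50 | -10 | 100
19 | 19 | 50 | -31 | 961
20 | 59 | 50 | 9 | 81
$\sum (x - \bar{x})^2$ | | | | 7,612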
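
The table's arithmetic can be reproduced with the sketch below, which follows the slide's steps (deviations, squared deviations, division by n - 1, square root). The unrounded result is slightly above 20; the slides round it to 20. `statistics.stdev` is included only as a cross-check.

```python
from math import sqrt
from statistics import stdev

x = [60, 34, 74, 10, 86, 59, 34, 50, 43, 59,
     68, 35, 53, 28, 82, 47, 60, 40, 19, 59]

n = len(x)
mean = sum(x) / n

# Squared deviation of each observation from the mean.
squared_deviations = [(value - mean) ** 2 for value in x]
sum_of_squares = sum(squared_deviations)

variance = sum_of_squares / (n - 1)  # sample variance
sd = sqrt(variance)                  # standard deviation

print(sum_of_squares, variance, sd, stdev(x))
```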


Standard Error of the Mean (SEM)

• Because the mean of the population is usually unknown it is important to estimate the error between the sample mean and the population mean.

• The SEM is an unbiased estimate of expected error in the sample estimate of a population mean.

$SEM = \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{20}} = 4.47$
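
A short sketch of the SEM calculation, first with the rounded standard deviation of 20 used on the slide and then with the unrounded value computed from the data. The small difference between the two is a rounding effect only.

```python
from math import sqrt
from statistics import stdev

x = [60, 34, 74, 10, 86, 59, 34, 50, 43, 59,
     68, 35, 53, 28, 82, 47, 60, 40, 19, 59]
n = len(x)

sem_rounded = 20 / sqrt(n)           # slide's version: standard deviation rounded to 20
sem_from_data = stdev(x) / sqrt(n)   # same formula with the unrounded standard deviation

print(round(sem_rounded, 2), round(sem_from_data, 3))
```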


Inferential Statistics

• With these descriptive statistics (mean, median, mode, standard deviation, and standard error of the mean), we are now able to statistically compare samples.

• We have our sample (n = 20)
• Mean = 50
• Median = 51.5
• Mode = 59
• Standard Deviation = 20
• Standard Error of the Mean = 4.47

• Let’s assume that we want to compare our sample with another sample having a mean of 60 to determine if there is a significant difference between the samples.


The z-test

• The z-test is used primarily with standardized testing to determine if the test scores of a particular sample of test takers are within or outside of the standard performance of test takers (a second group of test scores).

• The assumptions necessary for the z-test are:
• The population standard deviation must be known.
• The sample must be random.
• The sample must be normally distributed.

• The null hypothesis for the test is that the means are equal H0: μ1=μ2 (the samples come from the same population).


The z-test

• First we need to determine the confidence level (CL) we require for this comparison.
• The CL is determined by the researcher.
• The CL tells us how sure we can be of our results.
• Most social science research uses a CL of 95%.

• Next we need to determine the alpha level (α).
• α = 1-CL
• Most social science research uses an α of 5%.


The z-test

• The z score shows the distance of our test mean from our sample mean in units of the standard error of the mean.

• Our test mean = 60
• Our sample mean = 50
• Our SEM = 4.47
• Therefore,

$z = \frac{x - \bar{x}}{SEM} = \frac{60 - 50}{4.47} = 2.23$

• Our z-score is 2.23
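
The z-score arithmetic can be sketched as follows, using the values given on this slide. Note that the unrounded quotient is a little above 2.23 and rounds to about 2.24; the slides report 2.23.

```python
test_mean = 60     # comparison mean from the slide
sample_mean = 50   # mean of our sample
sem = 4.47         # standard error of the mean from the previous slide

# Distance between the two means, expressed in standard errors.
z = (test_mean - sample_mean) / sem

print(z)
```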


The Normal Distribution

[Figure: normal distribution curve with the mean, median, and mode marked at the center.]

That means our comparison mean is 2.23 standard errors above our sample mean.


The Normal Distribution

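[Figure: normal distribution curve with the mean, median, and mode at the center; 50% of the area lies below the mean and 48.75% lies between the mean and our z-score.]

A z-table tells us that 48.75% of the scores fall between the mean (z = 0) and our comparison score of 60 (z = 2.23).

In the normal distribution, 50% of the scores fall below the mean (between 0 and -∞).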
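
The z-table lookup can be approximated in Python with `statistics.NormalDist`, which models the standard normal curve when called with no arguments. This is a sketch using the unrounded z-score, so the result agrees with the table figure of 48.75% only to within rounding.

```python
from statistics import NormalDist

z = (60 - 50) / 4.47            # z-score from the previous slide, unrounded
standard_normal = NormalDist()  # mean 0, standard deviation 1

area_below_mean = standard_normal.cdf(0)                    # half of the curve
area_mean_to_z = standard_normal.cdf(z) - area_below_mean   # between z = 0 and our z
area_below_z = standard_normal.cdf(z)                       # everything below our z

print(area_below_mean, area_mean_to_z, area_below_z)
```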


The Normal Distribution

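[Figure: normal distribution curve with 98.75% of the area lying below our z-score.]

This tells us that 98.75% of the area under the normal curve (50% + 48.75%) lies below our z-score of 2.23.

Our confidence level was 95%, which in a two-tailed test leaves only 2.5% of the area in each tail. Our z-score falls in the upper tail beyond that cutoff. Therefore, we must reject H0: (μsamp=μcomp).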


The Normal Distribution

[Figure: normal distribution curve with the central 95% of the area marked.]

The minimum/maximum allowable z-score for our 95% CL in this two-tailed test is ±1.96.
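
The ±1.96 cutoff can be recovered from the confidence level rather than read from a table. The sketch below assumes a two-tailed test, so α is split evenly between the two tails, and then applies the decision rule to the z-score from the earlier slide.

```python
from statistics import NormalDist

confidence_level = 0.95
alpha = 1 - confidence_level

# Two-tailed test: alpha / 2 in each tail, so the upper cutoff sits at 1 - alpha / 2.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

z = (60 - 50) / 4.47          # z-score for the comparison mean
reject_h0 = abs(z) > z_crit   # reject when z lies outside the allowable range

print(round(z_crit, 2), reject_h0)
```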


Confidence Interval

• Now that we know the number of standard deviations for the 95% CL for the z-test we can calculate our confidence interval.
• z-score for 95% CL = ±1.96
• SEM = 4.47
• Mean = 50

The upper bound is $ub = \bar{x} + (SEM)(z) = 50 + 8.76 = 58.76$

The lower bound is $lb = \bar{x} - (SEM)(z) = 50 - 8.76 = 41.24$
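
The two bounds can be computed in a few lines. This sketch uses the rounded values from the slide (SEM = 4.47, z = 1.96) and then checks whether the comparison mean of 60 falls inside the interval.

```python
sample_mean = 50
sem = 4.47
z_crit = 1.96              # z-score for the 95% CL

margin = sem * z_crit      # half-width of the confidence interval
upper_bound = sample_mean + margin
lower_bound = sample_mean - margin

comparison_mean = 60
inside_interval = lower_bound <= comparison_mean <= upper_bound

print(round(lower_bound, 2), round(upper_bound, 2), inside_interval)
```

Because 60 lies outside the interval, this check leads to the same decision as the z-score comparison: reject H0.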


Confidence Interval

• The range of means for which we can fail to reject H0: (μsamp=μcomp) is 41.24 to 58.76.


Errors in Hypothesis Testing

• Hypothesis testing provides a negative assessment.
• Researchers do not prove hypotheses. They test them in an attempt to disprove them.

• Researchers either reject H0 or fail to reject H0.

• There are two possible errors in hypothesis testing.

• Type 1 Error: reject H0 when H0 is true.

• Type 2 Error: fail to reject H0 when H0 is false.

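• The α level of the test sets the probability of making a Type 1 Error. The probability of a Type 2 Error (β) is not set by α; it depends on the sample size and on the size of the true difference.

• In most social science research the accepted probability of a Type 1 Error is 5%.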
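
A small simulation can make the Type 1 error rate concrete. The sketch below is illustrative only: it assumes a population that is normal with mean 50 and standard deviation 20, draws many samples of n = 20 for which H0 is true, and counts how often a two-tailed z-test at α = 0.05 rejects anyway. The seed and the number of trials are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population for the simulation (H0 is true by construction).
mu, sigma, n, alpha = 50, 20, 20, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

trials = 10_000
rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    z = (sample_mean - mu) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1  # H0 rejected although it is true: a Type 1 error

type1_rate = rejections / trials
print(type1_rate)
```

The observed rejection rate should land close to α, which is what it means for α to set the Type 1 error probability.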


The t distribution

• The t test is the mean comparison test most often used in social science research. The z-test requires that the population standard deviation is known. That is almost never the case in social science research.

• The assumptions of the t test are:
• The sample must be random.
• The sample must be normally distributed.

• However the t test is much more forgiving of violations of these assumptions (random samples are difficult).


The t Distribution

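[Figure: t distribution curve with the mean, median, and mode marked at the center.]

• The t test is much better suited to smaller samples.
• The shape of the t distribution changes with each sample size below 200; smaller samples put more of the area in the tails.
• With sample sizes above 200 the t and z tests give essentially the same results.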


The t Distribution

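[Figure: t distribution curve with the mean, median, and mode marked at the center.]

• By placing more of the area under the curve in the tails for smaller samples, the t test requires a larger test statistic to reject H0, which decreases the likelihood of a Type 1 error.

• t test and z test results are read and interpreted in the same way.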
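
As a sketch of how the t statistic parallels the z-score, the code below runs a one-sample comparison of the 20 observations against the comparison mean of 60, using the standard deviation estimated from the sample. The critical value 2.093 is an assumption taken from a standard t table (df = 19, two-tailed, α = 0.05); it does not appear in the slides.

```python
from math import sqrt
from statistics import mean, stdev

x = [60, 34, 74, 10, 86, 59, 34, 50, 43, 59,
     68, 35, 53, 28, 82, 47, 60, 40, 19, 59]
comparison_mean = 60

n = len(x)
df = n - 1                       # degrees of freedom
sem = stdev(x) / sqrt(n)         # standard error from the sample standard deviation
t = (comparison_mean - mean(x)) / sem

# Assumption: critical t for df = 19, two-tailed, alpha = 0.05, from a standard t table.
t_crit = 2.093
reject_h0 = abs(t) > t_crit

print(df, round(t, 3), reject_h0)
```

The cutoff is wider than the ±1.96 used for the z-test, which reflects the heavier tails of the t distribution at this sample size.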


The F Distribution

• The F test has a variety of applications.
• It measures variance within and between samples.
• In inferential statistics it is used to test variance between two samples (t test) or to analyze the variance of more than two samples as in analysis of variance calculations (ANOVA).

• It is read and interpreted in the same way as the t test and the z test.


SPSS

• Pause here and use the Statistical Package for the Social Sciences (SPSS) to compare sample means.


Bivariate Correlation

• The relationship between two sets of variables can be calculated numerically with a correlation coefficient.

• The correlation coefficient is a measure of the strength of association between two sets of variables.

• The correlation coefficient is a value between -1 and 1.
• The stronger the correlation between two sets of variables the closer the correlation coefficient is to -1 (for negative correlations) or 1 (for positive correlations).


Bivariate Correlation

• Hinkle provides a table used by many social science researchers to describe the level of correlation present in their data.

Interpretation | Size of Correlation | Direction | Size of Correlation | Direction
Very High | .90 to 1.00 | Positive | -.90 to -1.00 | Negative
High | .70 to .89 | Positive | -.70 to -.89 | Negative
Moderate | .50 to .69 | Positive | -.50 to -.69 | Negative
Low | .30 to .49 | Positive | -.30 to -.49 | Negative
Little if any | .00 to .29 | Positive | -.00 to -.29 | Negative
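
The table above translates directly into a small lookup. The function name `hinkle_label` is hypothetical, and treating a coefficient of exactly 0 as positive is a convention chosen here, following the table's ".00 to .29 Positive" row.

```python
def hinkle_label(r):
    """Return (interpretation, direction) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")

    size = abs(r)
    if size >= 0.90:
        interpretation = "Very High"
    elif size >= 0.70:
        interpretation = "High"
    elif size >= 0.50:
        interpretation = "Moderate"
    elif size >= 0.30:
        interpretation = "Low"
    else:
        interpretation = "Little if any"

    direction = "Positive" if r >= 0 else "Negative"
    return interpretation, direction


print(hinkle_label(0.95), hinkle_label(-0.55), hinkle_label(0.10))
```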


The Pearson r

• The Pearson correlation coefficient or Pearson r is used when both the X and Y variables are measured on at least the interval scale.

• All correlation coefficients operate in a similar fashion.

• The H0 for the Pearson r is H0: ρ=0; that is, there is no correlation in the population.
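
A minimal sketch of how the Pearson r is computed, written out by hand so each step is visible. The paired data (participant age and level of concern) are hypothetical and are not from the study, and the function name `pearson_r` is illustrative.

```python
from math import sqrt

# Hypothetical paired observations (not from the slides).
age = [25, 32, 41, 48, 56, 63]
concern = [3, 4, 4, 5, 6, 6]


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r(age, concern)
print(round(r, 3))
```

A value of r near 1 or -1 indicates a strong linear relationship; whether it is statistically significant is what the test of H0: ρ=0 reported by SPSS addresses.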


SPSS

• Pause here and use the Statistical Package for the Social Sciences (SPSS) to calculate correlations.