Chapter 5: Measures of Dispersion

• Dispersion or variation in statistics is the degree to which the responses or values obtained from the respondents differ from each other. "Spread" is another term for the same idea.

• Dispersion helps us to measure the extent to which the values or responses deviate from each other.

• The tools used to measure the amount of variability or spread in the categories or values of a variable are called the measures of dispersion or measures of variability.

Example: Which one of these data sets has more variability?
A: 22, 24, 25, 28, 30
B: 1, 5, 12, 20, 32
Data set B is clearly more spread out.

Measures of Dispersion for Numerical Variables

• Dispersion in the case of a numerical variable is the degree to which the numbers in a distribution vary from the typical, or average value.

• Three typical methods to measure dispersion for numerical variables are:
– Range
– Interquartile range
– Standard deviation

Range

• The range is the difference between the highest value and lowest value in a data set.

• It is the simplest way to obtain information about dispersion.

• Daily temperature reports are a familiar example: they usually give the day's highest and lowest values, and the range is the difference between the two.

Example: Compute the range for the following set of data: 12, 3, 17, 21, 11, 18, 20, 19, 6, 15
Range: 21 – 3 = 18

• There are a few disadvantages to using range:
1) The range does not give any information about the distribution of the observations within its two endpoint values.
2) The range can be skewed by outliers, just as the mean can be.

For example, add the value 200 to the above data set. Now the range is 197. That is not representative of the data.
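The range calculation and its sensitivity to a single outlier can be sketched in a few lines of Python. The helper name `value_range` is only an illustrative choice, not something from the slides.

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)


data = [12, 3, 17, 21, 11, 18, 20, 19, 6, 15]

print(value_range(data))          # 21 - 3 = 18
print(value_range(data + [200]))  # 200 - 3 = 197 once the outlier is added
```

Only the two extreme values enter the calculation, which is why one unusual observation changes the result so much.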

Interquartile Range

• Interquartile range is the range of the middle 50% of a distribution after the bottom 25% and top 25% have been eliminated.

• Interquartile range is more resistant to outliers than the range.

• In order to calculate the IQR, we first have to find the quartiles of the distribution, which are the values that divide the distribution into four groups of roughly the same size.

Steps to calculate quartiles
1) Put the data in numerical order and find the median (this also happens to be the second quartile).
2) Find the median of the values whose position in the ordered list is to the left of the median. This value is the first quartile.
3) Find the median of the values whose position in the ordered list is to the right of the median. This value is the third quartile.
Note: When the number of observations is odd, don’t include the median value in the calculations in steps 2 and 3.

• If the position of a quartile is between two numbers, average those numbers as you would to find a median.

• Subtract the first quartile from the third quartile to get the IQR.

• When dealing with frequency distributions, there are useful formulas to help us figure out the position of the quartiles.

• If the position is a whole number, then the quartile is exactly the observation at that position.

• If the position is a decimal, then the quartile is the average of the two observations on either side of that position.

Example: Find the interquartile range.
5, 6, 5, 10, 11, 16, 15, 13
Order the data: 5, 5, 6, 10, 11, 13, 15, 16
The median is 10.5.
Q1 is 5.5.
Q3 is 14.
IQR = 14 – 5.5 = 8.5
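The median-of-halves steps can be sketched in Python as follows. The function names `quartiles` and `interquartile_range` are illustrative, and the sketch assumes at least two observations. Library quantile functions often use other conventions and may give slightly different quartiles.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the median itself is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def interquartile_range(values):
    """IQR = third quartile minus first quartile."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


data = [5, 6, 5, 10, 11, 16, 15, 13]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)                 # 5.5 10.5 14.0
print(interquartile_range(data))  # 8.5
```

`statistics.median` already averages the two middle values when a half has an even number of observations, which matches the rule for a quartile that falls between two numbers.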

Example: Find the IQR for the following data.

Standard Deviation

• Standard deviation measures the variability in a distribution using the squared deviations from the mean.

• A deviation is the difference between the score and the mean of the data set.

• Deviations can be positive, negative, or 0.

Example: Find the deviations and the sum of the deviations for the following data set: 25, 26, 30.

• Since the sum of the deviations from the mean is always zero, their average will always be zero. We need a different way to measure the average deviation from the mean.

• To counteract that problem, we will square each deviation and then find the average of the squared deviations.
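A short Python sketch for the data set 25, 26, 30 shows the deviations cancelling out and the squared deviations that replace them. The variable names are illustrative only.

```python
data = [25, 26, 30]

mean = sum(data) / len(data)            # 81 / 3 = 27.0
deviations = [x - mean for x in data]   # each score minus the mean
squared = [d ** 2 for d in deviations]  # squaring removes the signs

print(deviations)       # [-2.0, -1.0, 3.0]
print(sum(deviations))  # 0.0, the positives and negatives cancel
print(squared)          # [4.0, 1.0, 9.0]
print(sum(squared))     # 14.0
```

The sum of squared deviations (14 here) is the quantity that gets averaged to produce the variance.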

The Variance

• The variance is the average of the squared deviations from the mean.

• Standard deviation is the square root of the variance.

• The variance (and as a result standard deviation) is calculated differently for the population and the sample.

Notation
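As a hedged sketch, the two variance formulas can be written in conventional textbook notation. The symbols are assumed here rather than taken from the slides: $\mu$ and $\sigma^2$ for the population mean and variance, $\bar{x}$ and $s^2$ for the sample mean and variance, and $n$ for the number of observations in either case.

```latex
% Population variance and standard deviation (divide by n)
\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n},
\qquad
\sigma = \sqrt{\sigma^2}

% Sample variance and standard deviation (divide by n - 1)
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},
\qquad
s = \sqrt{s^2}
```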

Steps to find the variance.
1) Find the sum of your data set.
2) Find the mean.
3) Calculate the deviations.
4) Square the deviations.
5) Sum the squared deviations.
6) Divide the sum of the squared deviations by n if working with the population or n-1 if working with the sample.

To find the standard deviation, take the square root of step 6.
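The six steps translate almost line for line into Python. This is a minimal sketch: the function names and the `population` flag are illustrative choices, and the sample 25, 26, 30 is borrowed from the chapter's examples.

```python
import math


def variance(values, population=False):
    """Follow the six steps; divide by n (population) or n - 1 (sample)."""
    n = len(values)
    total = sum(values)                      # step 1: sum of the data
    mean = total / n                         # step 2: mean
    deviations = [x - mean for x in values]  # step 3: deviations
    squared = [d ** 2 for d in deviations]   # step 4: squared deviations
    sum_squared = sum(squared)               # step 5: sum of squares
    divisor = n if population else n - 1     # step 6: choose the divisor
    return sum_squared / divisor


def standard_deviation(values, population=False):
    """Square root of the variance."""
    return math.sqrt(variance(values, population))


sample = [25, 26, 30]
print(variance(sample))                      # 14 / 2 = 7.0
print(round(standard_deviation(sample), 2))  # 2.65
```

Python's standard library offers the same calculations as `statistics.variance` and `statistics.stdev` for samples, and `statistics.pvariance` and `statistics.pstdev` for populations.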

Example: Find the variance and the standard deviation for the sample data: 25, 26, 30.

You try. Find the variance and standard deviation for the population data: 2, 4, 8, 14.

Short-cut Formula for Variance and SD

• There are two quantities in the short-cut formula that we need to clarify.

• Note: For the population variance and standard deviation, divide by n instead of n-1.

• Let’s retry the previous example of the sample of 25, 26, and 30. We know the variance should come out to be 7 and the sd 2.65.
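As a sketch, this assumes the short-cut formula is the common computational form, in which the numerator is the sum of the squared values minus the square of the sum divided by n. Under that assumption, the two quantities to keep apart are Σx² (square each value, then add) and (Σx)² (add the values, then square). The function name and the `population` flag are illustrative.

```python
import math


def shortcut_variance(values, population=False):
    """Computational form: (sum of x^2 - (sum of x)^2 / n) / divisor."""
    n = len(values)
    sum_x = sum(values)                        # sum of the values
    sum_x_squared = sum(x * x for x in values) # square first, then add
    numerator = sum_x_squared - sum_x ** 2 / n # add first, then square
    divisor = n if population else n - 1
    return numerator / divisor


sample = [25, 26, 30]
# sum of x^2 = 625 + 676 + 900 = 2201, (sum of x)^2 = 81^2 = 6561
# numerator  = 2201 - 6561 / 3 = 14
s2 = shortcut_variance(sample)
print(s2)                       # 7.0
print(round(math.sqrt(s2), 2))  # 2.65
```

The result agrees with the deviation-based calculation, and no individual deviations need to be computed.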

Now try the example with 2, 4, 8, and 14. Remember we are dealing with a population!