Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed.


Averages and Variation

3

Section 3.1

Measures of Central Tendency: Mode, Median, and Mean


Focus Points

• Compute mean, median, and mode from raw data

• Interpret what mean, median, and mode tell you

• Explain how mean, median, and mode can be affected by extreme data values

• What is a trimmed mean?

• Compute a weighted average


Measures of Central Tendency

Measure of central tendency

A value that represents a typical, or central, entry of a data set.

Most common measures of central tendency:
• Mean
• Median
• Mode

Measures of Central Tendency: Mode, Median, and Mean

• The average price of an ounce of gold is $1200. The Zippy car averages 39 miles per gallon on the highway. A survey showed the average shoe size for women is size 9.

• In each of the preceding statements, one number is used to describe the entire sample or population. Such a number is called an average.

• There are many ways to compute averages, but we will study only three of the major ones. The easiest average to compute is the mode.


Exercise 1 – Mode

• Count the letters in each word of this sentence and give the mode. The numbers of letters in the words of the sentence are

• 5 3 7 2 4 4 2 4 8 3 4 3 4

• Scanning the data, we see that 4 is the mode because more words have 4 letters than any other number.

• For larger data sets, it is useful to order (or sort) the data before scanning them for the mode.
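The scan for the mode can be sketched in Python as a check on the exercise. The helper name `modes` is hypothetical, and returning an empty list when no value repeats is an assumption that follows the "no mode" convention used in these slides.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency ([] if nothing repeats)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumption: when no value occurs more than once, report no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)

sentence = "Count the letters in each word of this sentence and give the mode"
letter_counts = [len(word) for word in sentence.split()]

print(letter_counts)
print(modes(letter_counts))
```

Because the helper returns a list, a data set with two tied values reports both of them.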


Measures of Central Tendency: Mode, Median, and Mean

• Not every data set has a mode. For example, if Professor Fair gives equal numbers of A’s, B’s, C’s, D’s, and F’s, then there is no modal grade.

• In addition, the mode is not very stable. Changing just one number in a data set can change the mode dramatically.

• However, the mode is a useful average when we want to know the most frequently occurring data value, such as the most frequently requested shoe size.


Measures of Central Tendency: Mode, Median, and Mean

• When no number occurs more than once in a data set, there is no mode.

• If two numbers tie for the greatest number of occurrences, we say the set is bimodal.

• Example (bimodal): 1, 1, 1, 2, 2, 2, 3

• 1 and 2 both have the most occurrences, so they are both modes.

• Since they tie for first place, they both get it.

• 3 is not a mode because it occurs less frequently.


Measures of Central Tendency: Mode, Median, and Mean

• Another average that is useful is the median, or central value, of an ordered distribution. The median is the middle value.

• When you are given the median, you know there are an equal number of data values in the ordered distribution that are above it and below it.


Measure of Central Tendency: Median

Median

The value that lies in the middle of the data when the data set is ordered.

Measures the center of an ordered data set by dividing it into two equal parts.

If the data set has an
• odd number of entries: median is the middle data entry.
• even number of entries: median is the mean of the two middle data entries.

Measures of Central Tendency: Mode, Median, and Mean

• Procedure:


Exercise 2 – Median

• Case: n is even

• What do barbecue-flavored potato chips cost? According to Consumer Reports, Vol. 66, No. 5, the prices per ounce in cents of the rated chips are

• 19 19 27 28 18 35

(a) To find the median, we first order the data, and then note that there are an even number of entries.

• So the median is constructed using the two middle values.


Exercise 2 – Median

• (Sorted) prices per ounce in cents of the rated chips are

• 18 19 19 27 28 35

• The two middle values are 19 and 27, so the median is (19 + 27)/2 = 23 cents.

• (b) According to Consumer Reports, the brand with the lowest overall taste rating costs 35 cents per ounce.

• Eliminate that brand, and find the median price per ounce for the remaining barbecue-flavored chips.


Exercise 2 – Median

• Case: n is odd

• Again order the data.

• The median is simply the middle value.

• 18 19 19 27 28

• Median = middle value = 19 cents
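Both cases of the median rule can be checked with a short Python sketch. The function name `median` is illustrative, and the code assumes the chip prices listed in Exercise 2.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

prices = [19, 19, 27, 28, 18, 35]           # cents per ounce, n is even
remaining = [p for p in prices if p != 35]  # drop the lowest-rated brand, n is odd

print(median(prices))
print(median(remaining))
```

Sorting inside the function means the caller does not have to order the data first.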


Exercise 2 – Median

• (c) One ounce of potato chips is considered a small serving. Is it reasonable to budget about $10.45 to serve the barbecue-flavored chips to 55 people?

• Yes, since the median price of the chips is 19 cents per small serving: 55 * $0.19 = $10.45

• This budget for chips assumes that there is plenty of other food! Each guest gets just 1 ounce!


Measures of Central Tendency: Mode, Median, and Mean

• The median uses the position rather than the specific value of each data entry. If the extreme values of a data set change, the median usually does not change.

• This is why the median is often used as the average for house prices.

• If one mansion costing several million dollars sells in a community of much-lower-priced homes, the median selling price for houses in the community would be affected very little, if at all.


Measures of Central Tendency: Mode, Median, and Mean

• Note:

• For small ordered data sets, we can easily scan the set to find the location of the median.

• However, for large ordered data sets of size n, it is convenient to have a formula to find the middle of the data set.


Measures of Central Tendency: Mode, Median, and Mean

• For instance, if n = 99 then the middle value is the (99 +1)/2 or 50th data value in the ordered data.

• If n = 100, then (100 + 1)/2 = 50.5 tells us that the two middle values are in the 50th and 51st positions.

• An average that uses the exact value of each entry is the mean (sometimes called the arithmetic mean).


Measure of Central Tendency: Mean

Mean (average)

The sum of all the data entries divided by the number of entries.

Sigma notation: Σx = add all of the data entries (x) in the data set.

Population mean: $\mu = \dfrac{\Sigma x}{N}$

Sample mean: $\bar{x} = \dfrac{\Sigma x}{n}$

Exercise 3 – Population Mean

• To graduate, Linda needs at least a B in biology. She did not do very well on her first three tests; however, she did well on the last four. Here are all her scores:

• 58 67 60 84 93 98 100

• Compute the mean and determine if Linda’s grade will be a B (80 to 89 average) or a C (70 to 79 average).


Exercise 3 – Solution

• $\mu = \dfrac{\Sigma x}{N} = \dfrac{58 + 67 + 60 + 84 + 93 + 98 + 100}{7} = \dfrac{560}{7} = 80$

• Since the average is 80, Linda will get the needed B.

Exercise 4: Finding a Sample Mean

The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. What is the mean price of the flights?

872 432 397 427 388 782 397


Solution: Finding a Sample Mean

872 432 397 427 388 782 397


The sum of the flight prices is

Σx = 872 + 432 + 397 + 427 + 388 + 782 + 397 = 3695

To find the mean price, divide the sum of the prices by the number of prices in the sample:

$\bar{x} = \dfrac{\Sigma x}{n} = \dfrac{3695}{7} \approx 527.9$

The mean price of the flights is about $527.90.
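The same arithmetic can be reproduced in a few lines of Python. This is a minimal sketch using the flight prices from the exercise; the variable names are illustrative.

```python
flight_prices = [872, 432, 397, 427, 388, 782, 397]  # dollars

total = sum(flight_prices)               # Σx
mean_price = total / len(flight_prices)  # Σx / n

print(total)
print(round(mean_price, 1))
```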


Exercise 5: Compare Measures of Center

The unit loads of 40 randomly selected students from a college are shown below. Find the mean, median, and mode:

17 12 14 17 13 16 18 20 13 12

12 17 16 15 14 12 12 13 17 14

15 12 17 16 12 18 20 19 12 15

18 14 16 17 15 19 12 13 13 15


Exercise 5: Compare Measures of Center


Solution:

Mean 15.0

Median 15

Mode 12

If the state is going to fund the college according to the “average” credit load, which “average” do you think the college will report? Why?


Measures of Central Tendency: Mode, Median, and Mean

• We have seen three averages: the mode, the median, and the mean. For later work, the mean is the most important.

• A disadvantage of the mean, however, is that it can be affected by exceptional values. A resistant measure is one that is not influenced by extremely high or low data values.

• The mean is not a resistant measure of center because we can make the mean as large as we want by changing the size of only one data value.


Measures of Central Tendency: Mode, Median, and Mean

• The median, on the other hand, is more resistant. However, a disadvantage of the median is that it is not sensitive to the specific size of a data value.

• A measure of center that is more resistant than the mean but still sensitive to specific data values is the trimmed mean.

• A trimmed mean is the mean of the data values left after “trimming” a specified percentage of the smallest and largest data values from the data set.


Measures of Central Tendency: Mode, Median, and Mean

• Usually a 5% trimmed mean is used. This implies that we trim the lowest 5% of the data as well as the highest 5% of the data. A similar procedure is used for a 10% trimmed mean.

• Procedure:


Exercise 6: Find measures of central tendency

The class sizes of 20 randomly chosen Introductory Algebra classes in California are shown.

14 20 20 20 20 23 25 30 30 30

35 35 35 40 40 42 50 50 80 80

a) Compute the mean

b) Compute a 5% trimmed mean


Exercise 6 Solution: Find Trimmed Mean
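One way to work this exercise is sketched below in Python. The helper `trimmed_mean` is hypothetical, and rounding the number of trimmed values to the nearest whole number is an assumption, since the slide's own procedure is not shown here. For 20 values, 5% is exactly one value from each end.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent` of the values from each end of the ordered data."""
    ordered = sorted(values)
    # Assumption: round the trim count to a whole number of data values.
    k = round(len(ordered) * percent / 100)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

class_sizes = [14, 20, 20, 20, 20, 23, 25, 30, 30, 30,
               35, 35, 35, 40, 40, 42, 50, 50, 80, 80]

mean_size = sum(class_sizes) / len(class_sizes)
trimmed = trimmed_mean(class_sizes, 5)

print(round(mean_size, 2))
print(round(trimmed, 2))
```

Trimming removes the 14 and one of the 80s, so the trimmed mean comes out a little below the ordinary mean.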


Measures of Central Tendency: Mode, Median, and Mean

• In general, when a data distribution is mound-shaped symmetrical, the values for the mean, median, and mode are the same or almost the same.

• For skewed-left distributions, the mean is less than the median and the median is less than the mode.

• For skewed-right distributions, the mode is the smallest value, the median is the next largest, and the mean is the largest.


Measures of Central Tendency: Mode, Median, and Mean

• Figure 3.1 shows the general relationships among the mean, median, and mode for different types of distributions.

Figure 3.1: (a) Mound-shaped symmetrical (b) Skewed left (c) Skewed right

Example: Comparing the Mean, Median, and Mode

Find the mean, median, and mode of the sample ages of a class shown. Which measure of central tendency best describes a typical entry of this data set? Are there any outliers?


Ages in a class

20 20 20 20 20 20 21

21 21 21 22 22 22 23

23 23 23 24 24 65

Example: Comparing the Mean, Median, and Mode

Mean: $\bar{x} = \dfrac{\Sigma x}{n} = \dfrac{20 + 20 + \cdots + 24 + 65}{20} \approx 23.8$ years

Median: $\dfrac{21 + 22}{2} = 21.5$ years

Mode: 20 years (the entry occurring with the greatest frequency)


Example: Comparing the Mean, Median, and Mode


Mean ≈ 23.8 years Median = 21.5 years Mode = 20 years

• The mean takes every entry into account, but is influenced by the outlier of 65.

• The median also takes every entry into account, through its position in the ordered data, and it is not affected by the outlier.

• In this case the mode exists, but it doesn't appear to represent a typical entry.
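The effect of the outlier can be seen by computing the three measures with and without the age of 65. The sketch below uses Python's standard `statistics` module; dropping the outlier is only an illustration and is not a step in the original example.

```python
import statistics

ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]
without_outlier = [a for a in ages if a != 65]

for label, data in (("with 65", ages), ("without 65", without_outlier)):
    print(label,
          round(statistics.mean(data), 1),
          statistics.median(data),
          statistics.mode(data))
```

The mean shifts by about two years when the outlier is removed, while the median moves by half a year and the mode does not change.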


Comparing the Mean, Median, and Mode

All three measures describe a typical entry of a data set.

Advantage of using the mean:
• The mean is a reliable measure because it takes into account every entry of a data set.

Disadvantage of using the mean:
• Greatly affected by outliers (data entries that are far removed from the other entries in the data set).

Weighted Average

• Sometimes we wish to average numbers, but we want to assign more importance, or weight, to some of the numbers.

Category Weight

Class Work 20%

Homework 20%

Test 30%

Quiz 30%


Weighted Average

• The average you need is the weighted average.


Exercise 7: Find Weighted Average

You are taking a class in which your grade is determined from five sources: 50% from your test mean, 15% from your midterm, 20% from your final exam, 10% from your computer lab work, and 5% from your homework.

Your scores are 86 (test mean), 96 (midterm), 82 (final exam), 98 (computer lab), and 100 (homework).

What is the weighted mean of your scores? If the minimum average for an A is 90, did you get an A?


Ex 7 Solution: Finding a Weighted Mean

Source          Score, x    Weight, w    x∙w
Test Mean       86          0.50         86(0.50) = 43.0
Midterm         96          0.15         96(0.15) = 14.4
Final Exam      82          0.20         82(0.20) = 16.4
Computer Lab    98          0.10         98(0.10) = 9.8
Homework        100         0.05         100(0.05) = 5.0
                            Σw = 1       Σ(x∙w) = 88.6

$\bar{x} = \dfrac{\Sigma(x \cdot w)}{\Sigma w} = \dfrac{88.6}{1} = 88.6$

Your weighted mean for the course is 88.6. You did not get an A.
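The weighted mean for this exercise can be verified with a small Python sketch. The function name `weighted_mean` is illustrative; it divides by the sum of the weights, so the weights do not have to add up to 1.

```python
def weighted_mean(values, weights):
    """Σ(x·w) / Σw for paired values and weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

scores = [86, 96, 82, 98, 100]             # test mean, midterm, final, lab, homework
weights = [0.50, 0.15, 0.20, 0.10, 0.05]

course_mean = weighted_mean(scores, weights)

print(round(course_mean, 1))
print(course_mean >= 90)   # is it an A?
```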


Example: Weighted Mean

The data below represents customer satisfaction ratings from 4 different restaurants in a chain. Find the average customer satisfaction rating.

Location    Average Rating, x    Number of Customers, w    Product, xw
1           7.8                  117                       912.6
2           8.5                  86                        731
3           6.6                  68                        448.8
4           7.4                  90                        666
TOTAL                            361                       2758.4

Example: Weighted Mean

$\text{weighted avg} = \dfrac{\Sigma xw}{\Sigma w} = \dfrac{2758.4}{361} \approx 7.64$


Mean of Grouped Data

Mean of a Frequency Distribution

Approximated by

$\bar{x} = \dfrac{\Sigma(x \cdot f)}{n}$, with $n = \Sigma f$

where x and f are the midpoints and frequencies of a class, respectively

Finding the Mean of a Frequency Distribution

In Words: In Symbols

1. Find the midpoint of each class: $x = \dfrac{(\text{lower limit}) + (\text{upper limit})}{2}$

2. Find the sum of the products of the midpoints and the frequencies: $\Sigma(x \cdot f)$

3. Find the sum of the frequencies: $n = \Sigma f$

4. Find the mean of the frequency distribution: $\bar{x} = \dfrac{\Sigma(x \cdot f)}{n}$

Exercise 8: Find the Mean of a Frequency Distribution

Use the frequency distribution to approximate the mean number of minutes that a sample of Internet subscribers spent online during their most recent session.


Class Midpoint Frequency, f

7 – 18 12.5 6

19 – 30 24.5 10

31 – 42 36.5 13

43 – 54 48.5 8

55 – 66 60.5 5

67 – 78 72.5 6

79 – 90 84.5 2


Solution: Find the Mean of a Frequency Distribution

Class Midpoint, x Frequency, f (x∙f)

7 – 18 12.5 6 12.5∙6 = 75.0

19 – 30 24.5 10 24.5∙10 = 245.0

31 – 42 36.5 13 36.5∙13 = 474.5

43 – 54 48.5 8 48.5∙8 = 388.0

55 – 66 60.5 5 60.5∙5 = 302.5

67 – 78 72.5 6 72.5∙6 = 435.0

79 – 90 84.5 2 84.5∙2 = 169.0

n = 50 Σ(x∙f) = 2089.0

$\bar{x} = \dfrac{\Sigma(x \cdot f)}{n} = \dfrac{2089}{50} \approx 41.8$ minutes
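The four steps for a frequency distribution can be sketched in Python as follows. The tuple layout `(lower limit, upper limit, frequency)` is an illustrative choice, not something the slides prescribe.

```python
# (lower limit, upper limit, frequency) for each class of online minutes
classes = [(7, 18, 6), (19, 30, 10), (31, 42, 13), (43, 54, 8),
           (55, 66, 5), (67, 78, 6), (79, 90, 2)]

midpoints = [(low + high) / 2 for low, high, _ in classes]             # step 1
sum_xf = sum(x * f for x, (_, _, f) in zip(midpoints, classes))        # step 2
n = sum(f for _, _, f in classes)                                      # step 3
grouped_mean = sum_xf / n                                              # step 4

print(midpoints)
print(sum_xf, n)
print(round(grouped_mean, 1))
```

The result is an approximation because every entry in a class is treated as if it sat at the class midpoint.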


Summary: Section 3.1

• Computed mean, median, and mode from raw data

• Interpreted mean, median, and mode

• Explained how mean, median, and mode can be affected by extreme data values

• Computed the trimmed mean, weighted average, and mean of a frequency distribution

Section 3.2

Measures of Variation


Objectives

Determine the range of a data set

Determine the variance and standard deviation of a population and of a sample

Use Chebychev’s Theorem to interpret standard deviation

Approximate the sample standard deviation for grouped data


Range

Range

The difference between the maximum and minimum data entries in the set.

The data must be quantitative.

Range = (Max. data entry) – (Min. data entry)


Example: Finding the Range

A corporation hired 10 graduates. The starting salaries for each graduate are shown. Find the range of the starting salaries.

Starting salaries (1000s of dollars)

41 38 39 45 47 41 44 41 37 42


Solution: Finding the Range

Ordering the data helps to find the least and greatest salaries.

37 38 39 41 41 41 42 44 45 47

Range = (Max. salary) – (Min. salary)

= 47 – 37 = 10

The range of starting salaries is 10 or $10,000.


Deviation, Variance, and Standard Deviation

Deviation

The difference between the data entry, x, and the mean of the data set.

Population data set:
• Deviation of x = x – μ

Sample data set:
• Deviation of x = x – x̄

Exercise 1: Finding the Deviation

A corporation hired 10 graduates. The starting salaries for each graduate are shown. Find the deviation of each starting salary.

Starting salaries (1000s of dollars)

41 38 39 45 47 41 44 41 37 42

Solution:
• First determine the mean starting salary.

$\mu = \dfrac{\Sigma x}{N} = \dfrac{415}{10} = 41.5$

Solution: Finding the Deviation


• Determine the deviation for each data entry.

Salary ($1000s), x Deviation: x – μ

41 41 – 41.5 = –0.5

38 38 – 41.5 = –3.5

39 39 – 41.5 = –2.5

45 45 – 41.5 = 3.5

47 47 – 41.5 = 5.5

41 41 – 41.5 = –0.5

44 44 – 41.5 = 2.5

41 41 – 41.5 = –0.5

37 37 – 41.5 = –4.5

42 42 – 41.5 = 0.5

Σx = 415 Σ(x – μ) = 0


Deviation, Variance, and Standard Deviation

• The variance can be thought of as a kind of average of the squares of the deviations

• Standard deviation is a measure of the typical amount an entry deviates from the mean

• The more the entries are spread out, the greater the standard deviation


Deviation, Variance, and Standard Deviation

Population Variance

$\sigma^2 = \dfrac{\Sigma(x - \mu)^2}{N}$, where the numerator $\Sigma(x - \mu)^2$ is the sum of squares, $SS_x$

Population Standard Deviation

$\sigma = \sqrt{\sigma^2} = \sqrt{\dfrac{\Sigma(x - \mu)^2}{N}}$

Finding the Population Variance & Standard Deviation

In Words: In Symbols

1. Find the mean of the population data set: $\mu = \dfrac{\Sigma x}{N}$

2. Find deviation of each entry: $x - \mu$

3. Square each deviation: $(x - \mu)^2$

4. Add to get the sum of squares: $SS_x = \Sigma(x - \mu)^2$

Finding the Population Variance & Standard Deviation

In Words: In Symbols

5. Divide by N to get the population variance: $\sigma^2 = \dfrac{\Sigma(x - \mu)^2}{N}$

6. Find the square root to get the population standard deviation: $\sigma = \sqrt{\dfrac{\Sigma(x - \mu)^2}{N}}$

Exercise 2: Finding the Population Standard Deviation

A corporation hired 10 graduates. The starting salaries for each graduate are shown. Find the population variance and standard deviation of the starting salaries.

Starting salaries (1000s of dollars)

41 38 39 45 47 41 44 41 37 42

Recall μ = 41.5


Solution: Finding the Population Standard Deviation


• Determine SSx

• N = 10

Salary, x    Deviation: x – μ        Squares: (x – μ)²
41           41 – 41.5 = –0.5        (–0.5)² = 0.25
38           38 – 41.5 = –3.5        (–3.5)² = 12.25
39           39 – 41.5 = –2.5        (–2.5)² = 6.25
45           45 – 41.5 = 3.5         (3.5)² = 12.25
47           47 – 41.5 = 5.5         (5.5)² = 30.25
41           41 – 41.5 = –0.5        (–0.5)² = 0.25
44           44 – 41.5 = 2.5         (2.5)² = 6.25
41           41 – 41.5 = –0.5        (–0.5)² = 0.25
37           37 – 41.5 = –4.5        (–4.5)² = 20.25
42           42 – 41.5 = 0.5         (0.5)² = 0.25
             Σ(x – μ) = 0            SSx = 88.5

Solution: Finding the Population Standard Deviation

Population Variance

$\sigma^2 = \dfrac{\Sigma(x - \mu)^2}{N} = \dfrac{88.5}{10} \approx 8.9$

Population Standard Deviation

$\sigma = \sqrt{\sigma^2} = \sqrt{8.85} \approx 3.0$

The population standard deviation is about 3.0, or $3000.
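The six steps for the population variance and standard deviation can be followed line by line in Python. This is a minimal sketch using the salary data from the exercise; the variable names are illustrative.

```python
import math

salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]  # $1000s

N = len(salaries)
mu = sum(salaries) / N                       # step 1: population mean
deviations = [x - mu for x in salaries]      # step 2: x - mu
squares = [d ** 2 for d in deviations]       # step 3: (x - mu)^2
ss_x = sum(squares)                          # step 4: sum of squares
variance = ss_x / N                          # step 5: divide by N
sigma = math.sqrt(variance)                  # step 6: square root

print(mu, sum(deviations), ss_x)
print(round(variance, 2), round(sigma, 1))
```

The deviations always sum to zero, which is why they are squared before averaging.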


Deviation, Variance, and Standard Deviation

Sample Variance

$s^2 = \dfrac{\Sigma(x - \bar{x})^2}{n - 1}$

Sample Standard Deviation

$s = \sqrt{s^2} = \sqrt{\dfrac{\Sigma(x - \bar{x})^2}{n - 1}}$

Finding the Sample Variance & Standard Deviation

In Words: In Symbols

1. Find the mean of the sample data set: $\bar{x} = \dfrac{\Sigma x}{n}$

2. Find deviation of each entry: $x - \bar{x}$

3. Square each deviation: $(x - \bar{x})^2$

4. Add to get the sum of squares: $SS_x = \Sigma(x - \bar{x})^2$

Finding the Sample Variance & Standard Deviation

In Words: In Symbols

5. Divide by n – 1 to get the sample variance: $s^2 = \dfrac{\Sigma(x - \bar{x})^2}{n - 1}$

6. Find the square root to get the sample standard deviation: $s = \sqrt{\dfrac{\Sigma(x - \bar{x})^2}{n - 1}}$

Exercise 3 : Finding the Sample Standard Deviation

The starting salaries are for the Chicago branches of a corporation. The corporation has several other branches, and you plan to use the starting salaries of the Chicago branches to estimate the starting salaries for the larger population. Find the sample standard deviation of the starting salaries.

Starting salaries (1000s of dollars)

41 38 39 45 47 41 44 41 37 42


Solution: Finding the Sample Standard Deviation


• Determine SSx

• n = 10

Salary, x    Deviation: x – x̄        Squares: (x – x̄)²
41           41 – 41.5 = –0.5        (–0.5)² = 0.25
38           38 – 41.5 = –3.5        (–3.5)² = 12.25
39           39 – 41.5 = –2.5        (–2.5)² = 6.25
45           45 – 41.5 = 3.5         (3.5)² = 12.25
47           47 – 41.5 = 5.5         (5.5)² = 30.25
41           41 – 41.5 = –0.5        (–0.5)² = 0.25
44           44 – 41.5 = 2.5         (2.5)² = 6.25
41           41 – 41.5 = –0.5        (–0.5)² = 0.25
37           37 – 41.5 = –4.5        (–4.5)² = 20.25
42           42 – 41.5 = 0.5         (0.5)² = 0.25
             Σ(x – x̄) = 0            SSx = 88.5

Solution: Finding the Sample Standard Deviation

Sample Variance

$s^2 = \dfrac{\Sigma(x - \bar{x})^2}{n - 1} = \dfrac{88.5}{10 - 1} \approx 9.8$

Sample Standard Deviation

$s = \sqrt{s^2} = \sqrt{\dfrac{88.5}{9}} \approx 3.1$

The sample standard deviation is about 3.1, or $3100.
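Python's standard `statistics` module provides both versions of the calculation, which makes the effect of dividing by n – 1 instead of N easy to see. The sketch below assumes the same salary data.

```python
import statistics

salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]  # $1000s

s2 = statistics.variance(salaries)      # sample variance, divides by n - 1
s = statistics.stdev(salaries)          # sample standard deviation
sigma = statistics.pstdev(salaries)     # population standard deviation, divides by N

print(round(s2, 1), round(s, 1), round(sigma, 1))
```

The sample standard deviation is slightly larger than the population value because the same sum of squares is divided by 9 rather than 10.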


Sample Variance: Computational Formula


Exercise 3: Variance Computational Formula

       L1     L2
Row    x      x²
1      41     1681
2      38     1444
3      39     1521
4      45     2025
5      47     2209
6      41     1681
7      44     1936
8      41     1681
9      37     1369
10     42     1764
SUM    415    17311
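The column sums in the table feed the computational form of the sum of squares, SSx = Σx² – (Σx)²/n. That formula is a standard identity and is assumed here to be the one the slide intends, since the slide's own formula did not survive. A minimal Python sketch:

```python
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]  # column L1

n = len(salaries)
sum_x = sum(salaries)                    # Σx, the L1 total
sum_x2 = sum(x * x for x in salaries)    # Σx², the L2 total

ss_x = sum_x2 - sum_x ** 2 / n           # computational sum of squares
sample_variance = ss_x / (n - 1)

print(sum_x, sum_x2)
print(ss_x, round(sample_variance, 1))
```

This gives the same sum of squares as the deviation table without computing any individual deviation.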


Example: Using Technology to Find the Standard Deviation

Sample office rental rates (in dollars per square foot per year) for Miami’s central business district are shown in the table. Use a calculator or a computer to find the mean rental rate and the sample standard deviation. (Adapted from: Cushman & Wakefield Inc.)


Office Rental Rates

35.00 33.50 37.00

23.75 26.50 31.25

36.50 40.00 32.00

39.25 37.50 34.75

37.75 37.25 36.75

27.00 35.75 26.00

37.00 29.00 40.50

24.50 33.00 38.00


Solution: Using Technology to Find the Standard Deviation


Sample Mean

Sample Standard Deviation
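As one possible choice of technology, the two statistics can be computed with Python's standard `statistics` module. The sketch below assumes the 24 rental rates listed in the example; it stands in for the calculator output, which is not reproduced here.

```python
import statistics

rental_rates = [
    35.00, 33.50, 37.00,
    23.75, 26.50, 31.25,
    36.50, 40.00, 32.00,
    39.25, 37.50, 34.75,
    37.75, 37.25, 36.75,
    27.00, 35.75, 26.00,
    37.00, 29.00, 40.50,
    24.50, 33.00, 38.00,
]  # dollars per square foot per year

sample_mean = statistics.mean(rental_rates)
sample_sd = statistics.stdev(rental_rates)   # divides by n - 1

print(round(sample_mean, 2))
print(round(sample_sd, 2))
```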


Interpreting Standard Deviation

Standard deviation is a measure of the typical amount an entry deviates from the mean.

The more the entries are spread out, the greater the standard deviation.


Coefficient of Variation

Notice that the numerator and denominator in the definition of CV have the same units, so CV itself has no units of measurement.


Coefficient of Variation

• This gives us the advantage of being able to directly compare the variability of two different populations using the coefficient of variation.

• In the next example, we will compute the CV of a population and of a sample and then compare the results.


Exercise 4 – Coefficient of Variation

The Trading Post on Grand Mesa is a small, family-run store in a remote part of Colorado. It has just eight different types of spinners for sale. The prices (in dollars) are 2.10 1.95 2.60 2.00 1.85 2.25 2.15 2.25

Since the Trading Post has only eight different kinds of spinners for sale, we consider the eight data values to be the population.


Exercise 4a – Coefficient of Variation

• (a) Use a calculator with appropriate statistics keys to verify that, for the Trading Post data, μ ≈ $2.14 and σ ≈ $0.22.

• Solution:

• Since the computation formulas for x̄ and μ are identical, most calculators provide the value of x̄ only.

• Use the output of this key for μ. The computation formulas for the sample standard deviation s and the population standard deviation σ are slightly different. Because these eight prices are treated as a population, use the key for σ.


Page 81: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 4b – Coefficient of Variation

• (b) Compute the CV of prices for the Trading Post and comment on the meaning of the result.

• Solution:

• Since the Trading Post is very small, it carries a small selection of spinners that are all priced similarly.

• The CV tells us that the standard deviation of the spinner prices is only 10.28% of the mean.
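The computation in parts (a) and (b) can be sketched in Python. This is a minimal illustration, not the textbook's method: the variable names are assumptions, and the mean and standard deviation are rounded to two decimals before dividing so that the result matches the 10.28% quoted on the slide.

```python
from statistics import mean, pstdev

# Spinner prices (in dollars) from the Trading Post exercise, treated as a population.
prices = [2.10, 1.95, 2.60, 2.00, 1.85, 2.25, 2.15, 2.25]

mu = round(mean(prices), 2)       # population mean, rounded as on the slide
sigma = round(pstdev(prices), 2)  # population standard deviation (divides by N, not N - 1)

# CV expresses the standard deviation as a percentage of the mean.
cv = sigma / mu * 100

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}, CV = {cv:.2f}%")
# prints: mu = 2.14, sigma = 0.22, CV = 10.28%
```

Without the intermediate rounding, the same calculation gives a CV of roughly 10.1%, so the quoted figure depends on rounding μ and σ first.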


Page 82: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Chebychev’s Theorem

The portion of any data set lying within k standard deviations (k > 1) of the mean is at least:

$$1 - \frac{1}{k^2}$$

• k = 2: In any data set, at least $1 - \frac{1}{2^2} = \frac{3}{4}$, or 75%, of the data lie within 2 standard deviations of the mean.

• k = 3: In any data set, at least $1 - \frac{1}{3^2} = \frac{8}{9}$, or 88.9%, of the data lie within 3 standard deviations of the mean.

Page 83: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Chebychev’s Theorem

The portion of any data set lying within k standard deviations (k > 1) of the mean is at least:

$$1 - \frac{1}{k^2}$$

k           2      3       4       5     10
At least    75%    88.9%   93.8%   96%   99%
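The values in the table follow directly from the formula. A minimal Python sketch, in which the function name `chebyshev_bound` is an illustrative assumption:

```python
def chebyshev_bound(k: float) -> float:
    """Minimum portion of any data set lying within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebychev's Theorem requires k > 1")
    return 1 - 1 / k**2


# Reproduce the table for the same values of k.
for k in (2, 3, 4, 5, 10):
    print(f"k = {k:>2}: at least {chebyshev_bound(k):.1%}")
```

The bound holds for any distribution, which is why it is often much lower than what a particular data set actually shows.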

Page 84: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Example: Using Chebychev’s Theorem

The age distribution for Florida is shown in the histogram. Apply Chebychev’s Theorem to the data using k = 2. What can you conclude?


Page 85: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Solution: Using Chebychev’s Theorem

k = 2: μ – 2σ = 39.2 – 2(24.8) = -10.4 (use 0 since age can’t be negative)

μ + 2σ = 39.2 + 2(24.8) = 88.8


At least 75% of the population of Florida is between 0 and 88.8 years old.

Page 86: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 5: Using Chebychev’s Theorem

A newspaper periodically runs an ad in its own advertising section offering a free month’s subscription. Over a period of two years the mean number of responses was 525 with a sample standard deviation of s = 30.

a) What is the smallest percentage of data we expect to fall within 2 standard deviations of the mean (i.e., between 465 and 585)?

Answer: 75%

b) Determine the interval from A to B about the mean in which at least 88.9% of the data fall.

Answer: 435 to 615 (88.9% corresponds to k = 3, so the interval is 525 ± 3(30))

Page 87: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 5: Using Chebychev’s Theorem


A newspaper periodically runs an ad in its own advertising section offering a free month’s subscription. Over a period of two years the mean number of responses was 525 with a sample standard deviation of s = 30.

c) What is the smallest percentage of responses to the ad that falls within 2.5 standard deviations of the mean?

d) What is the interval from A to B in part c? Explain its meaning in this application.
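The intervals in this exercise all have the form mean ± k·s. A small Python sketch, assuming the mean of 525 and s = 30 given above (the function name `chebyshev_interval` is hypothetical):

```python
def chebyshev_interval(mean: float, sd: float, k: float) -> tuple[float, float, float]:
    """Return (A, B, minimum portion) for the interval mean - k*sd to mean + k*sd."""
    if k <= 1:
        raise ValueError("Chebychev's Theorem requires k > 1")
    return mean - k * sd, mean + k * sd, 1 - 1 / k**2


# k = 2 and k = 3 correspond to parts a) and b); k = 2.5 to parts c) and d).
for k in (2, 3, 2.5):
    a, b, portion = chebyshev_interval(525, 30, k)
    print(f"k = {k}: at least {portion:.1%} of responses fall between {a:g} and {b:g}")
```

For k = 2.5 the sketch yields an interval from 450 to 600 responses containing at least 84% of the data, which is one way to check answers to parts c) and d).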

Page 88: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Standard Deviation for Grouped Data: Defining Formula

Sample standard deviation for a frequency distribution

When a frequency distribution has classes, estimate the sample mean and standard deviation by using the midpoint of each class.

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$$

where n = Σf (the number of entries in the data set)

Page 89: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 6: Finding the Standard Deviation for Grouped Data


You collect a random sample of the number of children per household in a region. Find the sample mean and the sample standard deviation of the data set.

Number of Children in 50 Households

1  3  1  1  1
1  2  2  1  0
1  1  0  0  0
1  5  0  3  6
3  0  3  1  1
1  1  6  0  1
3  6  6  1  2
2  3  0  1  1
4  1  1  2  2
0  3  0  2  4

Page 90: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Solution: Finding the Standard Deviation for Grouped Data

First construct a frequency distribution. Then find the mean of the frequency distribution.

x    f     xf
0    10    0(10) = 0
1    19    1(19) = 19
2    7     2(7) = 14
3    7     3(7) = 21
4    2     4(2) = 8
5    1     5(1) = 5
6    4     6(4) = 24

Σf = 50     Σ(xf) = 91

$$\bar{x} = \frac{\sum xf}{n} = \frac{91}{50} \approx 1.8$$

The sample mean is about 1.8 children.

Page 91: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Solution: Finding the Standard Deviation for Grouped Data

Determine the sum of squares.

x    f     x − x̄               (x − x̄)²             (x − x̄)²f
0    10    0 – 1.8 = –1.8      (–1.8)² = 3.24       3.24(10) = 32.40
1    19    1 – 1.8 = –0.8      (–0.8)² = 0.64       0.64(19) = 12.16
2    7     2 – 1.8 = 0.2       (0.2)² = 0.04        0.04(7) = 0.28
3    7     3 – 1.8 = 1.2       (1.2)² = 1.44        1.44(7) = 10.08
4    2     4 – 1.8 = 2.2       (2.2)² = 4.84        4.84(2) = 9.68
5    1     5 – 1.8 = 3.2       (3.2)² = 10.24       10.24(1) = 10.24
6    4     6 – 1.8 = 4.2       (4.2)² = 17.64       17.64(4) = 70.56

Σ(x − x̄)²f = 145.40

Page 92: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Solution: Finding the Standard Deviation for Grouped Data

Find the sample standard deviation.

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{145.40}{50 - 1}} \approx 1.7$$

The standard deviation is about 1.7 children.
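The three steps of this solution (frequency distribution, mean, sum of squares) can be checked with a short Python sketch. The dictionary `freq` and the other names are illustrative assumptions; the frequencies are the ones tabulated in the solution.

```python
from math import sqrt

# Frequency distribution from the solution: number of children x -> frequency f.
freq = {0: 10, 1: 19, 2: 7, 3: 7, 4: 2, 5: 1, 6: 4}

n = sum(freq.values())                                       # n = sum of f
mean = sum(x * f for x, f in freq.items()) / n               # x-bar = sum(xf) / n
sum_sq = sum((x - mean) ** 2 * f for x, f in freq.items())   # sum of (x - x-bar)^2 * f
s = sqrt(sum_sq / (n - 1))                                   # defining formula for s

print(f"n = {n}, mean = {mean:.2f}, s = {s:.2f}")
```

The sketch uses the unrounded mean 1.82, so its sum of squares (about 145.38) differs slightly from the slide's 145.40, which uses the rounded mean 1.8. Both round to s ≈ 1.7.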

Page 93: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Standard Deviation for Grouped Data: Computational Formula
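For reference, the usual computational form of the sample standard deviation for grouped data is sketched below. It is assumed here to use the same symbols as the defining formula (x is a class midpoint, f its frequency, and n = Σf):

```latex
\[
s = \sqrt{\frac{\sum x^2 f - \dfrac{\left(\sum x f\right)^2}{n}}{n - 1}}
\]
```

This form is algebraically equivalent to the defining formula but avoids computing each deviation x − x̄ separately.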


Page 94: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Example: Sample Variance of a Frequency Distribution – Computational Formula

Lower Limit   Upper Limit   Midpoint x   f     x² · f      xf
7             18            12.5         6     937.5       75
19            30            24.5         10    6002.5      245
31            42            36.5         13    17319.25    474.5
43            54            48.5         8     18818       388
55            66            60.5         5     18301.25    302.5
67            78            72.5         6     31537.5     435
79            90            84.5         2     14280.5     169

TOTAL                                    50    107196.5    2089
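The column totals in this table are the inputs to the computational formula. A hedged Python sketch, in which the list `classes` and the variable names are assumptions, and the variance expression is the standard computational form (Σx²f − (Σxf)²/n) / (n − 1):

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class in the table.
classes = [(7, 18, 6), (19, 30, 10), (31, 42, 13), (43, 54, 8),
           (55, 66, 5), (67, 78, 6), (79, 90, 2)]

# Use the midpoint of each class as its representative value x.
midpoints = [((lo + hi) / 2, f) for lo, hi, f in classes]

n = sum(f for _, f in midpoints)
sum_xf = sum(x * f for x, f in midpoints)
sum_x2f = sum(x**2 * f for x, f in midpoints)

# Computational formula for the sample variance of grouped data.
variance = (sum_x2f - sum_xf**2 / n) / (n - 1)
std_dev = sqrt(variance)

print(f"n = {n}, sum(xf) = {sum_xf}, sum(x^2 f) = {sum_x2f}")
print(f"sample variance = {variance:.2f}, sample standard deviation = {std_dev:.2f}")
```

With the totals shown in the table, this gives a sample variance of about 406.49 and a sample standard deviation of about 20.16.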

Page 95: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Summary

• Determined the range of a data set

• Determined the variance and standard deviation of a population and of a sample

• Used the Empirical Rule and Chebychev’s Theorem to interpret standard deviation

• Approximated the sample standard deviation for grouped data

Page 96: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Section 3.3

Measures of Position

Box-and-Whisker Plots


Page 97: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Objectives

• Determine the quartiles of a data set

• Interpret other fractiles such as percentiles

• Determine the interquartile range of a data set

• Create a box-and-whisker plot

Page 98: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Percentiles

A percentile measures the position of a single data item based on the percentage of data items below that item.

Standardized tests taken by large numbers of students convert raw scores to percentile scores.

If approximately n percent of the items in a distribution are less than the number x, then x is the nth percentile of the distribution, denoted Pn.
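The definition above can be illustrated with a short Python sketch. The function name `percentile_rank` is an assumption, and the data are the 15 CPR test scores used in the quartile exercise in this section.

```python
def percentile_rank(data, x):
    """Percentage of items in data that are strictly less than x."""
    below = sum(1 for value in data if value < x)
    return 100 * below / len(data)


# CPR test scores borrowed from the quartile exercise in this section.
scores = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]

print(percentile_rank(scores, 17))  # 9 of the 15 scores are below 17, so this prints 60.0
```

Under this definition a score of 17 sits at about the 60th percentile of these scores, even though 17 is far from 60% of the maximum score.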


Page 99: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Quartiles

Fractiles are numbers that partition (divide) an ordered data set into equal parts.

Quartiles approximately divide an ordered data set into four equal parts.

• First quartile, Q1: About one quarter of the data fall on or below Q1.

• Second quartile, Q2: About one half of the data fall on or below Q2 (median).

• Third quartile, Q3: About three quarters of the data fall on or below Q3.


Page 100: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Percentiles and Other Fractiles

Fractiles      Summary                             Symbols
Quartiles      Divides data into 4 equal parts     Q1, Q2, Q3
Deciles        Divides data into 10 equal parts    D1, D2, D3, …, D9
Percentiles    Divides data into 100 equal parts   P1, P2, P3, …, P99

Page 101: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed
Page 102: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Example: Interpreting Percentiles

The ogive represents the cumulative frequency distribution for SAT test scores of college-bound students in a recent year. What test score represents the 72nd percentile? How should you interpret this? (Source: College Board Online)


Page 103: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Solution: Interpreting Percentiles

The 72nd percentile corresponds to a test score of 1700.

This means that 72% of the students had an SAT score of 1700 or less.


Page 104: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 1: Interpreting Percentiles

Suppose you challenge freshman composition by taking an exam.

a. If your score was in the 89th percentile, what percentage of scores was at or below your score?

Answer: 89 %

b. If the scores ranged from 0 to 100 and your raw score was 95, does that mean that your score is at the 95th percentile?

Answer: No. A percentile score is based on the position of a score relative to the other scores, not on the raw score itself.

Page 105: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 2: Finding Quartiles

The test scores of 15 employees enrolled in a CPR training course are listed. Find the first, second, and third quartiles of the test scores.

13 9 18 15 14 21 7 10 11 20 5 18 37 16 17

Solution:

• Q2 divides the data set into two halves.

Ordered data: 5 7 9 10 11 13 14 15 16 17 18 18 20 21 37

Q2 = 15 (the middle entry)

Lower half: 5 7 9 10 11 13 14

Upper half: 16 17 18 18 20 21 37

Page 106: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Solution: Finding Quartiles

The first and third quartiles are the medians of the lower and upper halves of the data set.

Lower half: 5 7 9 10 11 13 14, so Q1 = 10

Q2 = 15

Upper half: 16 17 18 18 20 21 37, so Q3 = 18

About one fourth of the employees scored 10 or less, about one half scored 15 or less, and about three fourths scored 18 or less.
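The median-of-halves procedure used in this solution can be sketched in Python. The function name `quartiles` is an assumption, and the sketch follows the slide's convention of excluding the median from both halves when the number of entries is odd. Other software may use different quartile conventions and give slightly different values.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd number of entries, the median itself belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


scores = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]
print(quartiles(scores))  # prints (10, 15, 18)
```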

Page 107: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Interquartile Range

Interquartile Range (IQR)

The difference between the third and first quartiles.

IQR = Q3 – Q1


Page 108: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 3: Find the Interquartile Range

Find the interquartile range of the test scores.

Recall Q1 = 10, Q2 = 15, and Q3 = 18


Solution:

• IQR = Q3 – Q1 = 18 – 10 = 8

The test scores in the middle portion of the data set vary by at most 8 points.

Page 109: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Box-and-Whisker Plot

Box-and-whisker plot

Exploratory data analysis tool.

Highlights important features of a data set.

Requires (five-number summary):

• Minimum entry

• First quartile Q1

• Median Q2

• Third quartile Q3

• Maximum entry


Page 110: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Drawing a Box-and-Whisker Plot

1. Find the five-number summary of the data set.

2. Construct a horizontal scale that spans the range of the data.

3. Plot the five numbers above the horizontal scale.

4. Draw a box above the horizontal scale from Q1 to Q3 and draw a vertical line in the box at Q2.

5. Draw whiskers from the box to the minimum and maximum entries.

[Figure: box-and-whisker plot labeling, from left to right, the minimum entry, the left whisker, Q1, the box with the median Q2 marked inside, Q3, the right whisker, and the maximum entry]

Page 111: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 4: Draw Box-and-Whisker Plot

Draw a box-and-whisker plot that represents the 15 test scores.

Recall: Min = 5, Q1 = 10, Q2 = 15, Q3 = 18, Max = 37

Solution:

[Figure: box-and-whisker plot of the 15 test scores on a horizontal scale, marked at 5, 10, 15, 18, and 37]

Page 112: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Exercise 5: Interpret Box-and-Whisker Plot

Recall: Min = 5, Q1 = 10, Q2 = 15, Q3 = 18, Max = 37

[Figure: box-and-whisker plot of the 15 test scores on a horizontal scale, marked at 5, 10, 15, 18, and 37]

Solution:

About half the scores are between 10 and 18. By looking at the length of the right whisker, you can conclude 37 is a possible outlier.
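The interpretation above rests on a few differences taken from the five-number summary. A minimal Python sketch with illustrative variable names, taking the five numbers recalled on this slide as given:

```python
# Five-number summary of the 15 test scores, as recalled on the slide.
minimum, q1, q2, q3, maximum = 5, 10, 15, 18, 37

iqr = q3 - q1                  # width of the box (middle half of the data)
left_whisker = q1 - minimum    # length of the left whisker
right_whisker = maximum - q3   # length of the right whisker

print(f"IQR = {iqr}, left whisker = {left_whisker}, right whisker = {right_whisker}")
# prints: IQR = 8, left whisker = 5, right whisker = 19
```

The right whisker is nearly four times as long as the left one and more than twice the width of the box, which is what makes 37 stand out as a possible outlier.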

Page 113: Copyright © Cengage Learning. All rights reserved. Averages and Variation 3 Larson/Farber 4th ed

Summary

• Determined the quartiles of a data set

• Interpreted other fractiles such as percentiles

• Determined the interquartile range of a data set

• Created a box-and-whisker plot