
Measures of Central Tendency and Dispersion

Copyright © Houghton Mifflin Company. All rights reserved.

• Mean
• Median
• Mode
• Range
• Standard Deviation
• Variance

Statistics involves the collection, organization, summarization, presentation, and interpretation of data. The branch of statistics that involves the collection, organization, summarization, and presentation of data is called descriptive statistics. The branch that interprets and draws conclusions from the data is called inferential statistics.

Statisticians often collect data from small portions of a large group in order to determine information about the group. For instance, to determine who will be elected as the next president of the United States, an organization may poll a small group of voters and, from the information it obtains from the small group, make conjectures about the voting preferences of the entire group of voters. In such situations the entire group under consideration is known as the population, and any proper subset of the population is a sample.

Due to practical constraints such as time and money, it is common to apply descriptive statistical procedures to a sample of a population and then to make use of inferential statistics to draw conclusions about the population. Some samples are more representative of a population than others.

One of the most basic statistical concepts involves finding measures of central tendency of a set of numerical data. Here is a scenario in which it would be helpful to find numerical values that locate, in some sense, the center of a set of data. Elle is a senior at a university. In a few months she plans to graduate and start a career as a graphic artist. A small sample of five graphic artists from her class shows that they have received job offers with the following yearly salaries.

$41,000 $39,500 $34,000 $32,500 $30,500

Before Elle interviews for a job, she wishes to determine an average of these five salaries. This average should be a “central” number around which the salaries cluster. We will consider three types of averages, known as the arithmetic mean, the median, and the mode. Each of these averages is a measure of central tendency for numerical data.

Mean

The arithmetic mean is the most commonly used measure of central tendency. The arithmetic mean of a set of numbers is often referred to as simply the mean. The mean is denoted by \(\bar{x}\) (which is read “x bar”). To find the mean of a set of data, find the sum of the data values and divide by the number of data values. For instance, to find the mean of the five salaries listed above, divide the sum of the salaries by 5.

\[ \text{Mean} = \bar{x} = \frac{41{,}000 + 39{,}500 + 34{,}000 + 32{,}500 + 30{,}500}{5} = \frac{177{,}500}{5} = 35{,}500 \]

The mean of the five salaries is $35,500.

In statistics it is often necessary to find the sum of a set of numbers. The traditional symbol used to indicate a summation is the Greek letter sigma, \(\Sigma\). Thus the notation \(\sum x\) denotes the sum of all the numbers in a given set. The use of summation notation enables us to define the mean as follows.

Mean
The mean of n numbers is the sum of the numbers divided by n. Thus

\[ \bar{x} = \frac{\sum x}{n} \]
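The definition of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the variable `salaries` are choices made here, not part of the text. The data are the five salary offers from Elle's scenario.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# The five yearly salary offers from the scenario above
salaries = [41_000, 39_500, 34_000, 32_500, 30_500]
print(mean(salaries))  # 35500.0
```

The guard against an empty list reflects the fact that dividing by n is undefined when n is 0.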

EXAMPLE 1 Find the Mean

In her biology class, Rebecca has received test grades of

92, 84, 65, 76, 88, and 90

Find the mean of Rebecca’s test scores.

Solution

\[ \bar{x} = \frac{\sum x}{n} = \frac{92 + 84 + 65 + 76 + 88 + 90}{6} = \frac{495}{6} = 82.5 \]

The mean of Rebecca’s test scores is 82.5.

CHECK YOUR PROGRESS 1

A patient had total blood cholesterol levels of

245, 235, 220, and 210

Find the mean of the total blood cholesterol levels.

Solution See page S1.

It is traditional to denote the mean of a sample by \(\bar{x}\) and to denote the mean of a population by \(\mu\).

From a physical perspective, numerical data can be represented by weights on a seesaw, and the mean of the data is represented by the balance point of the seesaw. For instance, the mean of 1, 3, 5, 5, 5, and 8 is \(4\tfrac{1}{2}\). If equal weights are placed at locations 1, 3, and 8, and three more weights are placed at 5 (as measured by a uniform scale), then the seesaw will balance at \(4\tfrac{1}{2}\).

[Figure: A physical interpretation of the mean. A seesaw with a scale from 0 to 9 and its balance point at 4 1/2.]

Median

Another type of average is called the median. The median is the middle number or the average of the two middle numbers in a list of numbers that have been arranged in numerical order from smallest to largest or from largest to smallest. Any list of numbers that is arranged in numerical order from smallest to largest or from largest to smallest is a ranked list.

Median
The median of a ranked list of n numbers is
• the middle number if n is odd.
• the mean of the two middle numbers if n is even.

EXAMPLE 2 Find the Median

Find the median for the data in the following lists.

a. 4, 8, 1, 14, 9, 21, 12 b. 46, 23, 92, 89, 77, 108

Solution

a. The list 4, 8, 1, 14, 9, 21, 12 contains seven numbers. The median of a list with an odd number of numbers is found by ranking the numbers and finding the middle number. Ranking the numbers from smallest to largest gives us 1, 4, 8, 9, 12, 14, 21. The middle number is 9. Thus 9 is the median.

b. The list 46, 23, 92, 89, 77, 108 contains six numbers. The median of a list with an even number of numbers is found by ranking the numbers and computing the mean of the two middle numbers. Ranking the numbers from smallest to largest gives us 23, 46, 77, 89, 92, 108. The two middle numbers are 77 and 89. The mean of 77 and 89 is 83. Thus 83 is the median.
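The two cases in the definition of the median (odd n and even n) can be sketched in Python as follows. The function name `median` is an illustrative choice; the two lists are the ones from Example 2.

```python
def median(values):
    """Return the median: rank the values, then take the middle one (odd n)
    or the mean of the two middle ones (even n)."""
    ranked = sorted(values)          # rank from smallest to largest
    n = len(ranked)
    if n == 0:
        raise ValueError("median requires at least one value")
    middle = n // 2
    if n % 2 == 1:                   # odd n: a single middle number
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2  # even n


print(median([4, 8, 1, 14, 9, 21, 12]))    # 9
print(median([46, 23, 92, 89, 77, 108]))   # 83.0
```

Sorting a copy with `sorted` leaves the caller's list unchanged, which mirrors the text: ranking is a step in the computation, not a change to the data.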

CHECK YOUR PROGRESS 2

Find the median for the data in the following lists.

a. 14, 27, 3, 82, 64, 34, 8, 51 b. 21.3, 37.4, 11.6, 82.5, 17.2

Solution See page S1.

QUESTION The median of the ranked list 3, 4, 7, 11, 17, 29, 37 is 11. If the maximum value of 37 is increased to 55, what effect will this have on the median?

ANSWER The median will remain the same because 11 will still be the middle number in the ranked list.

Mode

A third type of average is called the mode.

Mode
The mode of a list of numbers is the number that occurs most frequently.

Some lists of numbers do not have a mode. For instance, in the list 1, 6, 8, 10, 32, 15, 49, each number occurs exactly once. Because no number occurs more often than the other numbers, there is no mode.

A list of numerical data can have more than one mode. For instance, in the list 4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7, the number 2 occurs three times and the number 9 occurs three times. Each of the other numbers occurs fewer than three times. Thus 2 and 9 are both modes for the data.

EXAMPLE 3 Find the Mode

Find the mode(s) for the data in the following lists.

a. 18, 15, 21, 16, 15, 14, 15, 21 b. 2, 5, 8, 9, 11, 4, 7, 23

Solution

a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than the other numbers, so 15 is the mode.

b. Each number in the list 2, 5, 8, 9, 11, 4, 7, 23 occurs only once. Because no number occurs more often than the others, there is no mode.
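Finding the mode amounts to counting occurrences. Here is a minimal Python sketch; the function name `modes` is an illustrative choice. It follows this text's convention that a list has no mode when no number occurs more often than the others, and it returns every mode when there is a tie. Treating a list made of one repeated value as having that value as its mode is an assumption made here.

```python
from collections import Counter


def modes(values):
    """Return a list of the modes, or an empty list when no value
    occurs more often than the others."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Several distinct values, all equally frequent: no mode
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return [value for value, c in counts.items() if c == highest]


print(modes([18, 15, 21, 16, 15, 14, 15, 21]))       # [15]
print(modes([2, 5, 8, 9, 11, 4, 7, 23]))             # []
print(modes([4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7]))   # [2, 9]
```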

CHECK YOUR PROGRESS 3

Find the mode(s) for the data in the following lists.

a. 3, 3, 3, 3, 3, 4, 4, 5, 5, 5, 8 b. 12, 34, 12, 71, 48, 93, 71

Solution See page S1.

The mean, the median, and the mode are all averages; however, they are generally not equal and they have different properties. The following summary illustrates some of the properties of each type of average for a set of numerical data.

Comparative Properties of the Mean, the Median, and the Mode

The mean of a set of data
• is the most “sensitive” of the averages, because it takes into account the value of every number in the data set. A change in any of the numbers changes the mean.
• can be different from each of the numbers in the set.
• can be changed drastically by changing an extreme value.

The median of a set of data
• is usually not changed by changing an extreme value.
• is generally easy to compute.

The mode of a set of data
• may not exist, and when it does exist it may not be unique.
• is one of the numbers in the set, provided a mode exists.
• is generally not changed by changing an extreme value.
• is generally easy to compute.
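The contrast between the mean and the median in this summary can be checked numerically. The sketch below uses the `mean` and `median` functions from Python's standard `statistics` module and the ranked list from the QUESTION earlier in the chapter; the variable names are illustrative.

```python
from statistics import mean, median

ranked = [3, 4, 7, 11, 17, 29, 37]
changed = ranked[:-1] + [55]   # increase the maximum value from 37 to 55

for label, data in (("original", ranked), ("changed", changed)):
    print(f"{label}: mean = {mean(data):.2f}, median = {median(data)}")
# original: mean = 15.43, median = 11
# changed: mean = 18.00, median = 11
```

Changing the extreme value moves the mean but leaves the median at 11, as the summary describes.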

EXAMPLE 4 Compare the Mean, the Median, and the Mode

The following data show the salaries of the five employees of a small company.

$370,000 $60,000 $32,000 $16,000 $16,000

Compare the mean, the median, and the mode of the salaries.

Solution The sum of the five salaries is $494,000. The mean is

\[ \frac{\$494{,}000}{5} = \$98{,}800 \]

The median is the middle number, $32,000. Because the $16,000 salary occurs the most, the mode is $16,000. The data contain one extreme value that is much larger than the others. This extreme value causes the mean to be considerably larger than the median. The four employees who are paid the least would probably agree that the median of $32,000 is a better representative of the average of the salaries than the mean of $98,800 or the mode of $16,000.

CHECK YOUR PROGRESS 4

The following data list the salaries of the six employees of a company.

$260,000 $260,000 $180,000 $75,000 $17,000 $15,000

Find the mean, the median, and the mode of the salaries and state which of these numbers you think best represents the “average salary” of the employees.

Solution See page S1.

For some data sets, the mean and the median can differ by a large amount. For instance, during the 1994 baseball strike, the baseball owners reported that the average (mean) player’s salary was $1.2 million. This was true; however, the median salary was $500,000. (Source: A Mathematician Reads the Newspaper, 1995, by John Allen Paulos.)

Range

The mean, the median, and the mode are measures of central tendency for the data from which they were derived. Some characteristics of a set of data may not be evident solely from an examination of these measures of central tendency. For instance, consider the test scores for Alan and Tara shown in the table below. The mean, the median, and the mode of Alan’s test scores and Tara’s test scores are identical; however, an inspection of the test scores shows that Alan’s scores are widely scattered, whereas all of Tara’s scores are within a few points of the mean. This example illustrates that average values do not reflect the spread or dispersion of data. To measure the spread or dispersion of data, we define statistical values known as the range and the standard deviation.

Test Scores

          Alan    Tara
          55      80
          80      76
          97      77
          80      83
          68      84
          100     80
Mean:     80      80
Median:   80      80
Mode:     80      80

Range
The range of a set of data values is the difference between the largest data value and the smallest data value.

For instance, Alan’s highest test score is 100 and his lowest test score is 55. The range of Alan’s test scores is 100 − 55 = 45. Tara’s highest test score is 84 and her lowest test score is 76. Thus her range of test scores is only 84 − 76 = 8.
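The range computation for Alan and Tara can be sketched in Python. The name `data_range` is an illustrative choice, used so as not to shadow Python's built-in `range`.

```python
def data_range(values):
    """Return the range: the largest data value minus the smallest."""
    return max(values) - min(values)


alan = [55, 80, 97, 80, 68, 100]
tara = [80, 76, 77, 83, 84, 80]
print(data_range(alan))  # 45
print(data_range(tara))  # 8
```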

Standard Deviation

The range of a set of data is easy to compute, but it can be deceiving. The range is a measure that depends only on the two most extreme values and, as such, it is very sensitive. For instance, the table below shows the times for two sprinters in six track meets. The range of times for the first sprinter is 0.7 second, and the range of times for the second sprinter is 1.7 seconds. If you consider only range values, then you will probably conclude that the first sprinter’s times are more consistent than those of the second sprinter. However, a closer examination shows that if you exclude the time of 25.8 seconds by the second sprinter in the sixth race, then the second sprinter has a range of only 0.1 second. On this basis one could argue that the second sprinter has a more consistent performance record.

220-Yard Dash (times in seconds)

Race      Sprinter 1    Sprinter 2
1         23.8          24.1
2         24.0          24.2
3         24.1          24.1
4         24.4          24.2
5         23.9          24.1
6         24.5          25.8
Range:    0.7           1.7

The next measure of dispersion that we will consider is called the standard deviation. It is less sensitive to a change in an extreme value than is the range. The standard deviation of a set of numerical data makes use of the individual amount that each data value deviates from the mean. These deviations, represented by \(x - \bar{x}\), are positive when the data value x is greater than the mean \(\bar{x}\) and negative when x is less than the mean \(\bar{x}\). The sum of all the deviations \(x - \bar{x}\) is 0 for all sets of data. For instance, consider the data set 2, 6, 11, 12, 14. For this data set, \(\bar{x} = 9\). The individual deviation from the mean for each data value is shown in the table below. Note that the sum of the deviations is 0.

Deviations from the Mean

x       \(x - \bar{x}\)
2       2 − 9 = −7
6       6 − 9 = −3
11      11 − 9 = 2
12      12 − 9 = 3
14      14 − 9 = 5
Sum of the deviations: 0

Because the sum of all the deviations is always 0, we cannot use the sum of the deviations as a measure of dispersion for a set of data. What is needed is a procedure that can be applied to the deviations such that the sum of the numbers derived by adjusting the deviations is not always 0. The procedure we will use squares each of the deviations \(x - \bar{x}\) to make each of them nonnegative. The sum of the squares of the deviations is then divided by a constant that depends on the number of data values. Finally, we compute the square root of this result. The following definitions show that the formula for calculating the standard deviation of a population differs slightly from the formula used to calculate the standard deviation of a sample.

Standard Deviations for Populations and Samples

If \(x_1, x_2, x_3, \ldots, x_n\) is a population of n numbers with a mean of \(\mu\), then the standard deviation of the population is

\[ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}} \tag{1} \]

If \(x_1, x_2, x_3, \ldots, x_n\) is a sample of n numbers with a mean of \(\bar{x}\), then the standard deviation of the sample is

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \tag{2} \]

Notice that a denominator of n − 1 is used instead of n when we compute a sample standard deviation. This is because a sample standard deviation is often used to estimate a population standard deviation, and it can be shown mathematically that the use of n − 1 tends to yield better estimates.

Most statistical applications involve a sample taken from a population. Sample standard deviations are designated by the lowercase letter s. When we do work with a population, we designate the standard deviation of the population by \(\sigma\), which is the lowercase Greek letter sigma. It is helpful to use the following procedure to calculate the standard deviation of a set of numbers.

Procedure for Computing a Standard Deviation of n Numbers
1. Determine the mean of the numbers.
2. For each number, calculate the deviation (difference) between the number and the mean of the numbers.
3. Calculate the square of each of the deviations and find the sum of these squared deviations.
4. If the data set is a population, divide the sum by n. If the data set is a sample, divide the sum by n − 1.
5. Find the square root of the quotient from Step 4.

EXAMPLE 5 Find the Standard Deviation

The following quiz scores were obtained by sampling a population.

2, 4, 7, 12, 15

Find the standard deviation of this sample.

Solution

Step 1 The mean of the numbers is

\[ \bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = \frac{40}{5} = 8 \]

Step 2 For each score, calculate the deviation between the score and the mean.

x       \(x - \bar{x}\)
2       2 − 8 = −6
4       4 − 8 = −4
7       7 − 8 = −1
12      12 − 8 = 4
15      15 − 8 = 7

Because the sum of the deviations is always 0, you can use this as a means to check your arithmetic. That is, if your deviations from the mean do not have a sum of 0, then you know you have made an error.

Step 3 Calculate the square of each of the deviations in Step 2 and find the sum of these squared deviations.

x       \(x - \bar{x}\)     \((x - \bar{x})^2\)
2       2 − 8 = −6          (−6)² = 36
4       4 − 8 = −4          (−4)² = 16
7       7 − 8 = −1          (−1)² = 1
12      12 − 8 = 4          4² = 16
15      15 − 8 = 7          7² = 49
Sum of the squared deviations: 118

Step 4 Because we are working with a sample of n = 5 values, divide the sum 118 by n − 1, which is 4.

\[ \frac{118}{4} = 29.5 \]

Step 5 The standard deviation is \(s = \sqrt{29.5}\). Rounded to the nearest hundredth, \(s \approx 5.43\).

CHECK YOUR PROGRESS 5

A student has the following quiz scores: 5, 8, 16, 17, 18, 20. Find the standard deviation for this population. Round to the nearest hundredth.

Solution See page S1.

In Example 6 we examine standard deviations to determine which company produces batteries that are the most consistent with regard to their life expectancy.

EXAMPLE 6 Compare Standard Deviations

A consumers group has tested a sample of eight size D batteries from each of three companies. The results of the tests are shown in the following table. According to these tests, which company produces batteries for which the hours of constant use have the least standard deviation?

Company         Hours of constant use per battery
EverSoBright    6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3
Dependable      6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2
Beacon          6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5

Solution The mean for each sample of batteries is 7 hours.

The batteries from EverSoBright have a standard deviation of

\[ s_1 = \sqrt{\frac{(6.2 - 7)^2 + (6.4 - 7)^2 + \cdots + (9.3 - 7)^2}{7}} = \sqrt{\frac{12.34}{7}} \approx 1.328 \text{ hours} \]

The batteries from Dependable have a standard deviation of

\[ s_2 = \sqrt{\frac{(6.8 - 7)^2 + (6.2 - 7)^2 + \cdots + (8.2 - 7)^2}{7}} = \sqrt{\frac{3.62}{7}} \approx 0.719 \text{ hours} \]

The batteries from Beacon have a standard deviation of

\[ s_3 = \sqrt{\frac{(6.1 - 7)^2 + (6.6 - 7)^2 + \cdots + (8.5 - 7)^2}{7}} = \sqrt{\frac{5.38}{7}} \approx 0.877 \text{ hours} \]

The batteries from Dependable have the least standard deviation. According to these standard deviations, the Dependable company produces the most consistent batteries in regard to life expectancy under constant use.
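The five-step procedure for computing a standard deviation can be sketched directly in Python. The function name `standard_deviation` and its `sample` flag are illustrative choices, not notation from the text. The flag selects the divisor n − 1 (sample) or n (population) in Step 4. The data are those of Examples 5 and 6.

```python
import math


def standard_deviation(values, sample=True):
    """Follow the five-step procedure; sample=True divides by n - 1,
    sample=False (a population) divides by n."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough data values")
    mean = sum(values) / n                           # Step 1
    deviations = [x - mean for x in values]          # Step 2
    sum_of_squares = sum(d * d for d in deviations)  # Step 3
    divisor = n - 1 if sample else n                 # Step 4
    return math.sqrt(sum_of_squares / divisor)       # Step 5


# Example 5: a sample of quiz scores
print(round(standard_deviation([2, 4, 7, 12, 15]), 2))  # 5.43

# Example 6: samples of eight batteries from each company
batteries = {
    "EverSoBright": [6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3],
    "Dependable": [6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2],
    "Beacon": [6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5],
}
for company, hours in batteries.items():
    print(company, round(standard_deviation(hours), 3))
# EverSoBright 1.328
# Dependable 0.719
# Beacon 0.877
```

Results are rounded only when printed, so the intermediate steps keep full floating-point precision.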

CHECK YOUR PROGRESS 6

A consumer testing agency has sampled the breaking strengths of three different brands of 1/8-inch rope. The results of the tests are shown in the following table. According to the test results, which company produces 1/8-inch rope for which the breaking strength has the least standard deviation?

Company        Breaking point of 1/8-inch rope, in pounds
Trustworthy    122, 141, 151, 114, 108, 149, 125
Brand X        128, 127, 148, 164, 97, 109, 137
NeverSnap      112, 121, 138, 131, 134, 139, 135

Solution See page S1.

Many calculators have built-in features for calculating the mean and the standard deviation of a set of numbers. The next example illustrates these features on a TI-83 graphing calculator.

EXAMPLE 7 Use a Calculator to Find the Mean and the Standard Deviation

Use a calculator to find the mean and the standard deviation of the times in the following table. Because the table contains all the winning times for this race (up to the year 2000), the data set is a population.

Olympic Women’s 400-Meter Dash Results in Seconds, 1964–2000

52.0 52.0 51.08 49.29 48.88 48.83 48.65 48.83 48.25 49.11

Solution On a TI-83 calculator, press STAT ENTER and then enter the above times into list L1. Press STAT ▶ ENTER ENTER. The calculator displays the mean and the standard deviations shown below. Because we are working with a population, we are interested in the population standard deviation \(\sigma\), which is displayed as σx (≈ 1.356 seconds).

[Figure: TI-83 display of List 1, showing the first seven entries of L1 (52, 52, 51.08, 49.29, 48.88, 48.83, 48.65) and L1(1) = 52.]

TI-83 display of \(\bar{x}\), s, and \(\sigma\):

1-Var Stats
x̄=49.692              (mean)
Σx=496.92
Σx²=24711.3398
Sx=1.429497192        (sample standard deviation)
σx=1.356140111        (population standard deviation)
↓n=10

CHECK YOUR PROGRESS 7

Use a calculator to find the mean and the population standard deviation of the race times in the following table.

Olympic Men’s 400-Meter Dash Results in Seconds, 1896–2000

54.2 49.4 49.2 53.2 50.0 48.2 49.6 47.6 47.8
46.2 46.5 46.2 45.9 46.7 44.9 45.1 43.8 44.66
44.26 44.60 44.27 43.87 43.50 43.49 43.84

Solution See page S2.
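Readers without a TI-83 can reproduce the output of Example 7 in software. The sketch below uses Python's standard `statistics` module as a stand-in for the calculator; it is not part of the text's method. In that module, `stdev` uses the n − 1 divisor (the calculator's Sx) and `pstdev` uses the n divisor (the calculator's σx).

```python
import statistics

# Olympic women's 400-meter dash winning times from Example 7, in seconds
times = [52.0, 52.0, 51.08, 49.29, 48.88, 48.83, 48.65, 48.83, 48.25, 49.11]

print(round(statistics.mean(times), 3))    # 49.692  (mean)
print(round(statistics.stdev(times), 3))   # 1.429   (sample standard deviation, Sx)
print(round(statistics.pstdev(times), 3))  # 1.356   (population standard deviation, σx)
```

Because the table contains every winning time through 2000, `pstdev` is the value that matches the population standard deviation used in the example.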

Variance

A statistic known as the variance is also used as a measure of dispersion. The variance for a given set of data is the square of the standard deviation of the data. The following chart shows the mathematical notations that are used to denote standard deviation and variance.

Notations for Standard Deviation and Variance
\(\sigma\) is the standard deviation of a population.
\(\sigma^2\) is the variance of a population.
s is the standard deviation of a sample.
\(s^2\) is the variance of a sample.

EXAMPLE 8 Find the Variance

Find the variance of the sample given in Example 5.

Solution In Example 5 we found that \(s = \sqrt{29.5}\). Variance is the square of the standard deviation. The variance of the sample is

\[ s^2 = \left(\sqrt{29.5}\right)^2 = 29.5 \]

CHECK YOUR PROGRESS 8

Find the variance of the population that has a standard deviation of 8.2 inches.

Solution See page S2.

Although the variance of a set of data is an important measure of dispersion, it does have a disadvantage that is not shared by the standard deviation: the variance does not have the same unit of measure as the original data. For instance, if a set of data consists of times measured in hours, then the variance of the data will be measured in square hours. The standard deviation of this set of data is the square root of the variance and, as such, it is measured in hours, which is a more intuitive unit of measure.

Topics for Discussion
1. If one number in a set of data is changed, will this necessarily change the mean of the data set?

2. If one number in a set of data is changed, will this necessarily change the median of the data set?
3. If a set of data has a mode, then must the mode be one of the numbers in the data set? Explain.
4. Is the variance of a set of data always larger than the standard deviation of the data set? Explain.

EXERCISES

In Exercises 1 to 6, find the mean, the median, and the mode(s) for the given data. Round noninteger means to the nearest tenth.

1. 2, 7, 5, 7, 14

2. 8, 3, 3, 17, 9, 22, 19

3. 11, 8, 2, 5, 17, 39, 52, 42

4. 101, 88, 74, 60, 12, 94, 74, 85

5. 2.1, 4.6, 8.2, 3.4, 5.6, 8.0, 9.4, 12.2, 56.1, 78.2

6. 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5

In Exercises 7 to 12, find the range, the mean, the standard deviation, and the variance for the given samples. Round noninteger results to the nearest tenth.

7. 1, 2, 5, 7, 8, 19, 22

8. 3, 4, 7, 11, 12, 12, 15, 16

9. 2.1, 3.0, 1.9, 1.5, 4.8

10. 5.2, 11.7, 19.1, 3.7, 8.2, 16.3

11. 48, 91, 87, 93, 59, 68, 92, 100, 81

12. 93, 67, 49, 55, 92, 87, 77, 66, 73, 96, 54

13. After his first six biology tests, Ruben has a mean score of 78. What score does Ruben need on the next test to raise his average (mean) to 80?

14. After her first four algebra tests, Alisa has a mean score of 82. One more 100-point test is to be given in this class. All of the test scores are of equal importance. Is it possible for Alisa to raise her average (mean) to 90? Explain.

15. Mark averaged 60 miles per hour during the 30-mile trip to college. Due to heavy traffic, he was able to average only 40 miles per hour during the return trip. What was Mark’s average speed for the round trip?

16. a. On December 24, 1924, the temperature in Fairfield, Montana changed from a high of 63°F to a low of −21°F during a 12-hour period. What was the range of temperatures during this period? (Source: Time Almanac 2002)

b. On January 23–24, 1916, the temperature in Browning, Montana changed from a high of 44°F to a low of −56°F during a 24-hour period. What was the range of temperatures during this period? (Source: Time Almanac 2002)

17. Pick six numbers and compute the mean and the median of the numbers. Now add 12 to each of your original numbers and compute the mean and the median for this new set of numbers.

a. How does the mean of the new set of data compare with the mean of the original set of data?

b. How does the median of the new set of data compare with the median of the original set of data?

18. Pick four numbers and compute the sample standard deviation of the numbers. Now add 12 to each of your original numbers and compute the sample standard deviation for this new set of numbers. How does the sample standard deviation of the second set of data compare with the sample standard deviation of the original data set?

19. According to the Guinness World Records 2001, thetallest man for whom there is reliable evidence was

Robert Pershing Wadlow. He attained a height of 8 feet,The shortest man was Gul Mohammad. He

attained a height of According to theserecords, what is the range in height of all men for whomwe have reliable evidence?

20. Which data set would you expect to have the larger standard deviation: the resting pulse rates

of the students in a physical education class, or the pulserates of the same students after a 5-minute exerciseperiod? Explain.

21. Which data set would you expect to have the largerstandard deviation: the SAT scores of 30 students in

a physics class at a large university, or the SAT scores of30 students chosen at random from a community college?Explain.

Business and Economics22. Career Decisions Dental schools provide urban

statistics to their students. Use the following data to decide which city you would pick in which to set upyour practice.

Cloverdale: Population: 18,250;median price of a home: $177,000;dentists: 12; median age: 49;mean number of patients: 1294.5

Barnbridge: Population: 27,840;median price of a home: $198,400;dentists: 17; median age: 53;mean number of patients: 1148.7

Explain how you made your decision.

23. Daily Expenditures A salesperson records the following daily expenditures during a 10-day trip.

$185.34, $234.55, $211.86, $147.65, $205.60, $216.74, $1345.75, $184.16, $320.45, $88.12

Which of the three measures of central tendency do you think best represents the salesperson’s average daily expenditures? Explain.

Social Sciences

24. Ages of Female Oscar Recipients The following table displays the ages of female actors when they starred in their Oscar-winning best actor performances.

Ages of Best Female Actor Oscar Recipients, Academy Awards, 1971–2002

Find the mean, the median, the mode(s), and the standard deviation for the data in this sample. Round noninteger results to the nearest tenth.

25. Ages of Male Oscar Recipients The following table displays the ages of male actors when they starred in their Oscar-winning best actor performances.

Ages of Best Male Actor Oscar Recipients, Academy Awards, 1971–2002

a. Find the mean, the median, the mode(s), and the standard deviation for the data in this sample. Round noninteger results to the nearest tenth.

b. How do the results in part a. compare with the results in Exercise 24?

26. Presidential Inauguration Ages The following table lists the U.S. presidents and their ages at the times of their inaugurations. President Cleveland has two entries because he served two nonconsecutive terms. Use a calculator to find the mean and the population standard deviation of the ages. Round to the nearest tenth.

Washington 57 J. Adams 61

Jefferson 57 Madison 57

Monroe 58 J. Q. Adams 57


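Ages of Best Female Actor Oscar Recipients, Academy Awards, 1971–2002 (data for Exercise 24)

35 34 34 26 37 42 39 35 31 41 33
31 74 33 49 38 61 22 41 26 80 44
29 33 35 44 49 39 34 25 32 33

Ages of Best Male Actor Oscar Recipients, Academy Awards, 1971–2002 (data for Exercise 25)

62 43 40 48 48 56 38 60 30 40 42
37 76 38 52 44 35 61 43 51 31 42
54 53 36 37 31 45 60 40 35 47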


Jackson 61 Van Buren 54

W. H. Harrison 68 Tyler 51

Polk 49 Taylor 64

Fillmore 50 Pierce 48

Buchanan 65 Lincoln 52

A. Johnson 56 Grant 46

Hayes 54 Garfield 49

Arthur 50 Cleveland 47, 55

B. Harrison 55 McKinley 54

T. Roosevelt 42 Taft 51

Wilson 56 Harding 55

Coolidge 51 Hoover 54

F. D. Roosevelt 51 Truman 60

Eisenhower 62 Kennedy 43

L. B. Johnson 55 Nixon 56

Ford 61 Carter 52

Reagan 69 G. H. W. Bush 64

Clinton 46 G. W. Bush 54

Source: Time Almanac 2002
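The calculator work requested in Exercise 26 can also be sketched in a few lines of Python. This is only an illustration: the function names `mean` and `population_std_dev` and the short list `ages` are assumptions made for this example, and the ages are hypothetical values rather than entries from the table.

```python
import math


def mean(values):
    """Arithmetic mean: the sum of the data values divided by their count."""
    return sum(values) / len(values)


def population_std_dev(values):
    """Population standard deviation: square root of the mean squared deviation."""
    mu = mean(values)
    # Divide by n (not n - 1) because the data are treated as a whole population.
    variance = sum((x - mu) ** 2 for x in values) / len(values)
    return math.sqrt(variance)


# Hypothetical ages chosen for illustration; they are not taken from the table.
ages = [42, 50, 54, 58, 46]

print("mean:", round(mean(ages), 1))
print("population standard deviation:", round(population_std_dev(ages), 1))
```

Python's standard library offers `statistics.mean` and `statistics.pstdev` for the same two quantities; the explicit versions above are written out only to mirror the definitions.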

27. Age at Death for U.S. Presidents The following table lists the deceased U.S. presidents and their ages at the times of their deaths.

a. Use a calculator to find the mean and the population standard deviation of the ages. Round to the nearest tenth.

Washington 67 J. Adams 90

Jefferson 83 Madison 85

Monroe 73 J. Q. Adams 80

Jackson 78 Van Buren 79

W. H. Harrison 68 Tyler 71

Polk 53 Taylor 65

Fillmore 74 Pierce 64

Buchanan 77 Lincoln 56

A. Johnson 66 Grant 63

Hayes 70 Garfield 49

Arthur 56 Cleveland 71

B. Harrison 67 McKinley 58

T. Roosevelt 60 Taft 72

Wilson 67 Harding 57

Coolidge 60 Hoover 90

F. D. Roosevelt 63 Truman 88

Eisenhower 78 Kennedy 46

L. B. Johnson 64 Nixon 81

Source: Time Almanac 2002

b. How do the results from part a. compare with the results from Exercise 26?

Sports and Recreation

28. Baseball Averages

a. Compute the six missing batting averages in the following table. Note: For a given time period, a player’s batting average is determined by dividing her number of hits by the number of times she has been at bat. Round to the nearest hundred-thousandth.

Batting Statistics for Two Baseball Players

b. Which of the two players had the higher batting average during the first month? Which of the two players had the higher batting average during the second month? Which of the two players had the higher batting average over the two-month period?

You may be surprised by the above results, since Dawn has the higher batting average in the first month and in the second month, but she does not have the higher average for both months combined. In statistics, an example such as this is known as Simpson’s paradox.


          First month    Second month    Both months

Dawn      2 hits;        19 hits;        21 hits;
          5 at-bats      49 at-bats      54 at-bats
          Average: ?     Average: ?      Average: ?

Joanne    29 hits;       31 hits;        60 hits;
          73 at-bats     80 at-bats      153 at-bats
          Average: ?     Average: ?      Average: ?
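The reversal described in Exercise 28 can be explored with a short Python sketch. The hit and at-bat counts below are hypothetical figures chosen for illustration, not the values from the table, and the names `batting_average`, `combined`, `player_a`, and `player_b` are assumptions of this example. Exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def batting_average(hits, at_bats):
    """Batting average as an exact fraction: hits divided by at-bats."""
    return Fraction(hits, at_bats)


def combined(record):
    """Average over all months: total hits divided by total at-bats."""
    total_hits = sum(hits for hits, _ in record.values())
    total_at_bats = sum(at_bats for _, at_bats in record.values())
    return batting_average(total_hits, total_at_bats)


# Hypothetical (hits, at-bats) records; these are not the data from Exercise 28.
player_a = {"first": (4, 10), "second": (25, 100)}
player_b = {"first": (35, 100), "second": (2, 10)}

for month in ("first", "second"):
    a = batting_average(*player_a[month])
    b = batting_average(*player_b[month])
    print(month, "month:", float(a), "vs", float(b), "| A higher:", a > b)

a_both, b_both = combined(player_a), combined(player_b)
print("both months:", float(a_both), "vs", float(b_both), "| A higher:", a_both > b_both)
```

In this made-up data, player A leads in each month yet trails over both months, because most of player A's at-bats fall in the weaker month while most of player B's fall in the stronger one. The combined average weights each month by its number of at-bats, so it is not the mean of the two monthly averages.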


29. Breaking Strength of Ropes A mountain climber plans to buy some rope to use as a lifeline. Which of the following would be the better choice?

Rope A: Mean breaking strength: 600 pounds; standard deviation of 100 pounds.

Rope B: Mean breaking strength: 500 pounds; standard deviation of 25 pounds.

Explain why you think your choice is the better choice.

30. Golf Scores Two longtime golfers are trying to break a course record of 66. Golfer A has a mean score of 75 with a standard deviation of 3. Golfer B has a mean score of 77 with a standard deviation of 4.5. Which golfer do you think has the better chance of breaking the course record? Explain why you think your choice has the better chance.

Explorations

1. Linear Interpolation and Animation

Linear interpolation is a method used to find a particular number between two given numbers. For instance, if a table lists the two entries 0.3156 and 0.8248, then the value exactly halfway between the numbers is the mean of the numbers, which is 0.5702. To find the number that is 0.2 of the way between 0.3156 and 0.8248, compute 0.2 times the difference between the numbers and, because the first number is smaller than the second, add this result to the first number.

0.8248 − 0.3156 = 0.5092 (difference between the table entries)

0.2(0.5092) = 0.10184 (0.2 of the above difference)

0.3156 + 0.10184 = 0.41744 (interpolated result, which is 0.2 of the way between the table entries)

This linear interpolation process can be used to find an intermediate number that is any specified fraction of the way between two given numbers. Note that if the first number is larger than the second number, the portion of the difference is subtracted from the first number.
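The interpolation procedure described above can be sketched as a small Python function. The name `interpolate` and its parameter names are assumptions made for this illustration. The sketch uses a signed difference (second number minus first), so the "add when the first number is smaller, subtract when it is larger" rule is handled by the sign of the difference rather than by a separate case.

```python
def interpolate(first, second, fraction):
    """Return the number that is `fraction` of the way from `first` to `second`."""
    # Signed difference: positive if `second` is larger, negative if it is smaller.
    difference = second - first
    # Adding a negative portion is the same as subtracting it from `first`.
    return first + fraction * difference


# The worked example from the text: 0.2 of the way between 0.3156 and 0.8248.
print(round(interpolate(0.3156, 0.8248, 0.2), 5))

# Halfway between the two entries is their mean.
print(round(interpolate(0.3156, 0.8248, 0.5), 5))

# Hypothetical decreasing pair: the portion of the difference is subtracted.
print(interpolate(10.0, 8.0, 0.25))
```

A fraction of 0 returns the first number and a fraction of 1 returns the second, which gives a quick check that the arguments are in the intended order.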

a. Use linear interpolation to find the number that is 0.7 of the way between 1.856 and 1.972.

b. Use linear interpolation to find the number that is 0.3 of the way between 0.8765 and 0.8652.

c. A calculator shows that log 4 ≈ 0.60206 and log 5 ≈ 0.69897. Use linear interpolation to estimate log 4.3. Round your estimate to the nearest hundred-thousandth. Explain why your estimate is smaller than the value of 0.63347 that was obtained by using a calculator to evaluate log 4.3.

d. A calculator shows that e^2 ≈ 7.38906 and e^3 ≈ 20.08554. Use linear interpolation to estimate e^2.6. Round your estimate to the nearest hundred-thousandth. Explain why your estimate is larger than the value of 13.46374 that was obtained by using a calculator to evaluate e^2.6.

e. At the present time a professional football player weighs 325 pounds. There are 90 days until the player needs to report to spring training at a weight of 290 pounds. The player plans to lose weight at a constant rate; that is, the player plans on losing the same amount of weight each of the 90 days. What weight, to the nearest 0.1 pound, should the player attain in 25 days?

2. Florence Nightingale and Mathematics As a young girl, Florence Nightingale pleaded with her parents to let her study mathematics. Write a short report that provides some of the details concerning Florence Nightingale’s mathematical training and her mathematical accomplishments.

