Measures of Central Tendency 11.220 Lecture 5 22 February 2006 R. Ryznar



Today’s Content

• Wrap-up from yesterday
• Frequency Distributions
• The Mean, Median and Mode
• Levels of Measurement and Measures of Central Tendency
• Measures of Dispersion
• Measures of Central Tendency and Dispersion Together

Definitions for Frequency Distributions

• Variable – the trait or characteristic on which the classification is based.
• Class – one of the grouped categories of the variable.
• Class Boundaries – the lowest and highest values that fall within the class.
• Class Midpoints – the point halfway between the upper and lower class boundaries.
• Class Interval – the distance between the upper limit of one class and the upper limit of the next higher class.
• Class Frequency – the number of observations or occurrences of the variable within a given class.
• Total Frequency – the total number of observations or cases in the table.

Creating and Using Frequency Distributions

• Bin the data
• Bins must be exclusive and exhaustive
• Look for overall patterns
• Look for any striking deviations from the pattern
• Look for any outliers
• Give the center and the spread
• Describe the shape
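To make the vocabulary and the binning rules concrete, here is a minimal Python sketch that bins Babe Ruth's yearly home run totals (data used elsewhere in this lecture) into classes of width 10. The class limits and the helper name `frequency_table` are assumptions chosen for this illustration; they are not part of the lecture.

```python
# Babe Ruth's home runs, 1920 to 1934 (data from this lecture).
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

# Assumed classes: (lowest, highest) values that fall within each class.
classes = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]


def frequency_table(values, classes):
    """Return rows of ((low, high), midpoint, frequency) for integer data."""
    rows = []
    for low, high in classes:
        freq = sum(1 for v in values if low <= v <= high)
        midpoint = (low + high) / 2
        rows.append(((low, high), midpoint, freq))
    # Sanity check: with exclusive and exhaustive classes, the class
    # frequencies add up to the total frequency.
    assert sum(freq for _, _, freq in rows) == len(values)
    return rows


table = frequency_table(ruth, classes)
for (low, high), midpoint, freq in table:
    print(f"{low}-{high}  midpoint {midpoint}  frequency {freq}")
print("Total frequency:", sum(freq for _, _, freq in table))
```

With these assumed classes, the class interval is 10 (for example, 39 - 29), and each season falls in exactly one class.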

[Figure: histogram, "Percentage of Adult Population who are Obese by U.S. State (2001)". x-axis: Percentage (14.0 to 26.0); y-axis: Frequency (0 to 10). Mean = 20.492, Std. Dev. = 2.3883, N = 50.]

Stem-and-Leaf Plot
Percentage of adult population who are obese (2001)

Frequency    Stem & Leaf
   1.00        14 . 4
    .00        15 .
   1.00        16 . 1
   5.00        17 . 1 3 3 6 8
   5.00        18 . 2 4 4 8 9
   9.00        19 . 0 0 0 1 2 2 7 8 9
   8.00        20 . 0 0 0 1 5 6 7 9
   8.00        21 . 0 0 4 7 7 8 8 9
   5.00        22 . 1 1 4 5 6
   3.00        23 . 3 4 8
   4.00        24 . 0 2 4 6
   1.00        25 . 9

Stem width: 1.0
Each leaf: 1 case(s)
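Because a stem-and-leaf plot keeps every observation, the raw values can be read back out of it. The following Python sketch does that for the obesity plot and compares the result with the summary statistics reported on the histogram slides (mean 20.492, median 20.3). The dictionary layout is an assumption made for this example.

```python
from statistics import mean, median

# Stems (whole percent) and leaves (tenths), read off the plot.
stem_leaf = {
    14: [4],
    15: [],
    16: [1],
    17: [1, 3, 3, 6, 8],
    18: [2, 4, 4, 8, 9],
    19: [0, 0, 0, 1, 2, 2, 7, 8, 9],
    20: [0, 0, 0, 1, 5, 6, 7, 9],
    21: [0, 0, 4, 7, 7, 8, 8, 9],
    22: [1, 1, 4, 5, 6],
    23: [3, 4, 8],
    24: [0, 2, 4, 6],
    25: [9],
}

# Stem width 1.0 and one case per leaf: value = stem + leaf / 10.
values = [stem + leaf / 10 for stem, leaves in stem_leaf.items() for leaf in leaves]

print(len(values))                # number of states
print(round(mean(values), 3))     # compare with the reported mean
print(round(median(values), 1))   # compare with the reported median
```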

Babe Ruth’s home runs, 1920 to 1934:

54 59 35 41 46 25 47 60 54 46 49 46 41 34 22

Barry Bonds’s home runs, 1986 to 2004:

16 25 24 19 33 25 34 46 37 33 42 40 37 34 49 73 46 45 45

Back-to-back stemplot

          Bonds | Stem | Ruth
            9 6 |  1   |
          5 5 4 |  2   | 2 5
    7 7 4 4 3 3 |  3   | 4 5
  9 6 6 5 5 2 0 |  4   | 1 1 6 6 6 7 9
                |  5   | 4 4 9
                |  6   | 0
              3 |  7   |
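The back-to-back stemplot can be rebuilt from the two home run lists. This is a rough Python sketch, assuming the tens digit is the stem and the ones digit is the leaf; the function names `leaves_by_stem` and `back_to_back` are invented for this example.

```python
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73, 46, 45, 45]


def leaves_by_stem(values):
    """Map each tens digit (stem) to its sorted ones digits (leaves)."""
    stems = {}
    for v in sorted(values):
        stems.setdefault(v // 10, []).append(v % 10)
    return stems


def back_to_back(left, right):
    """Return text rows: left leaves (reversed), stem, right leaves."""
    left_stems, right_stems = leaves_by_stem(left), leaves_by_stem(right)
    lo = min(min(left_stems), min(right_stems))
    hi = max(max(left_stems), max(right_stems))
    rows = []
    for stem in range(lo, hi + 1):
        # Left-hand leaves grow outward from the stem, so reverse them.
        l = " ".join(str(d) for d in reversed(left_stems.get(stem, [])))
        r = " ".join(str(d) for d in right_stems.get(stem, []))
        rows.append(f"{l:>15} | {stem} | {r}")
    return rows


for row in back_to_back(bonds, ruth):
    print(row)
```

Reading across a row compares the two players within the same range; for example, both have seven seasons in the 40s.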

Number of Crews

Tons of Garbage    Normal    Moore
50 - 59              15        22
60 - 69              25        37
70 - 79              30        49
80 - 89              20        36
90 - 99              10        21
Total Crews         100       165

Percentage of Work Crews

Tons of Garbage    Normal    Moore
50 - 59              15        13
60 - 69              25        22
70 - 79              30        30
80 - 89              20        22
90 - 99              10        13
Total Percent       100       100
N =                 100       165
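The two groups have different totals (100 and 165 crews), so the counts are converted to percentages before they are compared. A short Python sketch of that conversion follows; rounding to whole percentages is an assumption inferred from the values shown in the table.

```python
# Crew counts per tonnage class, from the "Number of Crews" table.
classes = ["50 - 59", "60 - 69", "70 - 79", "80 - 89", "90 - 99"]
normal = [15, 25, 30, 20, 10]
moore = [22, 37, 49, 36, 21]


def to_percentages(counts):
    """Convert class frequencies to whole-number percentages of the total."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


normal_pct = to_percentages(normal)
moore_pct = to_percentages(moore)
for label, n_pct, m_pct in zip(classes, normal_pct, moore_pct):
    print(f"{label}  {n_pct:>3}  {m_pct:>3}")
```

Rounded percentages do not always add up to exactly 100; in this table they happen to.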

[Figure: bar chart, "Opinion on president's performance (number of responses by gender)". Categories: Strongly approve, Approve, Neither approve nor disapprove, Disapprove, Strongly disapprove. Series: Females (n=110), Males (n=683). y-axis: 0 to 300.]

[Figure: bar chart, "Opinion on president's performance (percentage of responses by gender)". Categories: Strongly approve, Approve, Neither approve nor disapprove, Disapprove, Strongly disapprove. Series: Females, Males. y-axis: 0% to 60%.]

Response Time of Metro Fire Department

[Figure: chart of cumulative percentage (0 to 100) by response time category: Under 5 minutes, Under 10 minutes, Under 15 minutes, Under 20 minutes.]

Response Time        Percentage (Cumulative) of Response Times
Under 5 minutes       27.6
Under 10 minutes      82.4
Under 15 minutes      98.8
Under 20 minutes     100.0
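A cumulative distribution can be turned back into per-interval shares by differencing successive rows. The Python sketch below does this for the Metro figures; the function name `interval_shares` and the rounding to one decimal place are assumptions for this example.

```python
# Cumulative percentages for Metro, from the table.
labels = ["Under 5 minutes", "Under 10 minutes", "Under 15 minutes", "Under 20 minutes"]
cumulative = [27.6, 82.4, 98.8, 100.0]


def interval_shares(cumulative):
    """Difference successive cumulative percentages to get per-interval shares."""
    shares = []
    previous = 0.0
    for value in cumulative:
        # Round to one decimal to hide floating-point noise in the subtraction.
        shares.append(round(value - previous, 1))
        previous = value
    return shares


shares = interval_shares(cumulative)
for label, share in zip(labels, shares):
    print(f"{label}: adds {share} percentage points")
```

Read this way, the largest single share of responses (54.8 percent) took at least 5 but less than 10 minutes.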

Fire Department Response Time: Atlantis Fire Department, 2001 and Metro Fire Department, 2001

[Figure: chart comparing the cumulative percentage of response times (0 to 100) for Atlantis and Metro across the categories Under 5 minutes, Under 10 minutes, Under 15 minutes, Under 20 minutes.]

Percentage (Cumulative) of Response Times

Response Time        Atlantis    Metro
Under 5 minutes        21.2       27.6
Under 10 minutes       63.9       82.4
Under 15 minutes       86.4       98.8
Under 20 minutes      100.0      100.0

Population mean:

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \dots + X_N}{N}$$

Sample mean:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$$

The median is the middle number in an ordered dataset.

It can be found by ordering the values from lowest to highest and then finding the observation at position (n+1)/2. When n is even, this position falls halfway between two observations, and the median is the average of those two.

Mode? The most frequently occurring value in the dataset.
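The (n+1)/2 position rule can be sketched in a few lines of Python. The two small datasets below are hypothetical and exist only to show the odd and even cases; the function names are likewise invented for this example.

```python
def sample_mean(values):
    """Sum of the observations divided by their number."""
    return sum(values) / len(values)


def median_by_position(values):
    """Median via the (n + 1) / 2 position rule on the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2               # 1-based position in the ordered list
    if position.is_integer():            # odd n: one middle observation
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]   # even n: the position ends in .5,
    upper = ordered[int(position)]       # so average the two neighbors
    return (lower + upper) / 2


odd = [7, 3, 9, 1, 5]        # hypothetical data, n = 5
even = [7, 3, 9, 1, 5, 11]   # hypothetical data, n = 6

print(sample_mean(odd), median_by_position(odd))
print(sample_mean(even), median_by_position(even))
```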

Classic illustration of the relationship between skew, mean, median and mode

[Figure: skewed frequency curve, f plotted against x, with the mode, median and mean marked. Figure by MIT OCW.]

Source: Journal of Statistics Education, Volume 13, Number 2 (2005), www.amstat.org/publications/jse/v13n2/vonhippel.html. Copyright © 2005 by Paul T. von Hippel.

The mean is not a resistant measure of center.
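One way to see what "not resistant" means is to corrupt a single observation and watch which measure moves. The Python sketch below uses the Barry Bonds home run totals from this lecture and a hypothetical data-entry error (73 mistyped as 730); the error is invented purely for illustration.

```python
from statistics import mean, median

bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73, 46, 45, 45]

# Hypothetical data-entry error: the 73 is mistyped as 730.
bonds_typo = [730 if v == 73 else v for v in bonds]

print("original: mean", mean(bonds), "median", median(bonds))
print("with typo: mean", round(mean(bonds_typo), 2), "median", median(bonds_typo))
```

A single extreme value nearly doubles the mean, while the median stays at 37.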

[Figure: the back-to-back stemplot of Bonds and Ruth home runs, shown again alongside the summary table.]

          Ruth               Bonds
Mean      659/15 = 43.93     703/19 = 37
Median    46                 37
Mode      46                 multiple
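The entries in this table can be checked with Python's standard `statistics` module. This is a minimal sketch; `multimode` requires Python 3.8 or later and returns every value tied for the highest count.

```python
from statistics import mean, median, multimode

ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73, 46, 45, 45]

for name, runs in (("Ruth", ruth), ("Bonds", bonds)):
    print(
        name,
        f"sum={sum(runs)}",
        f"n={len(runs)}",
        f"mean={round(mean(runs), 2)}",
        f"median={median(runs)}",
        f"modes={sorted(multimode(runs))}",
    )
```

For Bonds, six different totals each occur twice, which is why the table lists the mode as "multiple".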

[Figure: histogram, "Percentage of Adult Population who are Obese by U.S. State (2001)". x-axis: Percentage; y-axis: Frequency.]

Mean = 20.492
Median = 20.3
Std. Dev. = 2.3883
Skewness = -.012
N = 50

A normal distribution has a skew of 0 since it is symmetric.

Statistics: PctObese

N    Valid                       50
     Missing                      0
Mean                         20.492
Median                       20.300
Mode                           19.0 (a)
Std. Deviation               2.3883
Variance                      5.704
Skewness                      -.012
Std. Error of Skewness         .337
Kurtosis                      -.100
Std. Error of Kurtosis         .662
Percentiles    25            18.975
               50            20.300
               75            22.100

a. Multiple modes exist. The smallest value is shown.
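Several rows of this output depend only on the sample size or on other rows, so they can be cross-checked. The Python sketch below uses the standard-error formulas commonly applied by statistical packages; that the lecture's software used exactly these formulas is an assumption, although they do reproduce the reported values for N = 50.

```python
from math import sqrt


def se_skewness(n):
    """Standard error of sample skewness for a sample of size n (n > 2)."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of sample excess kurtosis for a sample of size n (n > 3)."""
    return 2 * se_skewness(n) * sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n = 50
print(round(se_skewness(n), 3))   # compare with Std. Error of Skewness
print(round(se_kurtosis(n), 3))   # compare with Std. Error of Kurtosis
print(round(2.3883 ** 2, 3))      # variance is the squared standard deviation
```

A common rule of thumb treats a skewness or kurtosis value smaller than about twice its standard error as unremarkable; by that rule, -.012 and -.100 are both well within range.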

A normal distribution has a kurtosis of 0 (measured as excess kurtosis). Kurtosis describes how heavy the tails of a distribution are relative to the normal, not whether it is symmetric.

[Figure: histogram, "Percent of Population of Hispanic Origin by U.S. State (2000)". x-axis: Percent Hispanic (0.0 to 50.0); y-axis: Frequency (0 to 20).]

Mean = 7.73
Median = 4.7
Std. Dev. = 8.9125
Skewness = 2.245
Kurtosis = 5.12
N = 50

[Figure: histogram, "Percentage of Population without Health Insurance for One Year by U.S. State (Three Year Average from 2000 to 2002)". x-axis: Percent with no Health Insurance (5.0 to 25.0); y-axis: Frequency (0 to 8).]

Mean = 13.288
Median = 13.05
Std. Dev. = 3.6878
Skewness = .732
Kurtosis = .262
N = 50

Levels of Measurement and Measures of Central Tendency

          Nominal    Ordinal    Interval
Mean                               X
Median                  X          X
Mode         X          X          X

[Figure: chart, "Would you prefer to ride on this facility?", Active Living Visual Preference Survey, Collingwood, Ontario - November 2005. x-axis: Slide Number (1 to 40); y-axis: Mean (No -3 to Yes +3).]

[Figure: histogram of Slide1. x-axis: Slide1 (0 to 8); y-axis: Frequency (0 to 50). Mean = 3.31, Std. Dev. = 1.895, N = 184.]

[Figure: histogram of Slide40Rec. x-axis: Slide40Rec (-4.00 to 4.00); y-axis: Frequency (0 to 120). Mean = -1.30, Std. Dev. = 2.33003, N = 180.]

[Figure: histogram of Slide19. x-axis: Slide19 (0 to 60); y-axis: Frequency (0 to 80). Mean = 3.97, Std. Dev. = 4.097, N = 186.]

The Five Number Summary

Minimum, Q1, Median, Q3, Maximum

To calculate quartiles:

• Arrange the observations in increasing order and locate the median M in the ordered list of observations.

• The first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the location of the overall median.

• The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the location of the overall median.
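The quartile steps above can be sketched in Python as follows. The function name `five_number_summary` is an assumption for this example, and the home run data come from earlier in the lecture. Statistical packages use several different quartile conventions, so their output may differ slightly from this median-of-halves rule.

```python
from statistics import median


def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]       # observations left of the median's location
    upper = ordered[n - half:]   # observations right of the median's location
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
bonds = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33, 42, 40, 37, 34, 49, 73, 46, 45, 45]

print("Ruth: ", five_number_summary(ruth))
print("Bonds:", five_number_summary(bonds))
```

When n is odd, the middle observation itself is left out of both halves; when n is even, the list simply splits in two.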

[Figure: side-by-side boxplots of incws (axis 0.00 to 125,000.00) for male = 0 and male = 1, with a plotted point labeled 40.]

The ends of the box (hinges) are at the quartiles, so that the length of the box is the interquartile range (IQR).

The median is marked by a line within the box.

The two vertical lines (called whiskers) outside the box extend to the smallest and largest observations within 1.5 × IQR of the quartiles.

Observations that fall more than 3 × IQR beyond the quartiles are called extreme outliers and are marked, for example, with an open circle.

Observations between 1.5 × IQR and 3 × IQR beyond the quartiles are called mild outliers and are distinguished by a different mark, e.g., a closed circle.
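The outlier rules can be written as a small classifier. In this Python sketch the quartiles Q1 = 25 and Q3 = 45 are those of the Barry Bonds home run data under a median-of-halves quartile rule; they are computed for this example rather than stated in the lecture, and the test values 80 and 110 are hypothetical.

```python
def classify(value, q1, q3):
    """Label a value relative to the 1.5 x IQR and 3 x IQR fences."""
    iqr = q3 - q1
    # Distance beyond the nearer quartile; 0 if the value lies inside the box.
    distance = max(q1 - value, value - q3, 0)
    if distance > 3 * iqr:
        return "extreme outlier"
    if distance > 1.5 * iqr:
        return "mild outlier"
    return "not an outlier"


# 73 is Bonds's record season; 80 and 110 are hypothetical values.
for value in (73, 80, 110):
    print(value, classify(value, q1=25, q3=45))
```

With an IQR of 20, the upper fences sit at 75 and 105, so even the 73 home run season is not flagged under this rule.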

Variance and Standard Deviation

Measures the average distance of the observations from their mean.

Variance:

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}$$

Standard Deviation:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

Visitors ($x_i$)    $x_i - \bar{x}$            $(x_i - \bar{x})^2$
1792                1792 - 1600 = 192          192^2 = 36,864
1666                1666 - 1600 = 66           66^2 = 4,356
1362                1362 - 1600 = -238         (-238)^2 = 56,644
1614                1614 - 1600 = 14           14^2 = 196
1460                1460 - 1600 = -140         (-140)^2 = 19,600
1867                1867 - 1600 = 267          267^2 = 71,289
1439                1439 - 1600 = -161         (-161)^2 = 25,921
                    Sum = 0                    Sum = 214,870

$$s^2 = \frac{214{,}870}{6} = 35{,}811.67$$

$$s = \sqrt{35{,}811.67} = 189.24$$
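The same worked example, step by step, as a short Python sketch. The variable names are chosen for this illustration; the visitor counts are those in the table.

```python
from math import sqrt

visitors = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

n = len(visitors)
x_bar = sum(visitors) / n                    # sample mean
deviations = [x - x_bar for x in visitors]   # x_i minus the mean
squared = [d ** 2 for d in deviations]       # squared deviations

variance = sum(squared) / (n - 1)            # divide by n - 1, not n
std_dev = sqrt(variance)

print("mean:", x_bar)
print("sum of deviations:", sum(deviations))
print("sum of squared deviations:", sum(squared))
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

The deviations always sum to zero, which is why they are squared before averaging.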

[Figure: histogram, "Percentage of Adult Population who are Obese by U.S. State (2001)". x-axis: Percentage (14.0 to 26.0); y-axis: Frequency (0 to 10). Mean = 20.492, Std. Dev. = 2.3883, N = 50.]

Variable 1    Variable 2
9.14          6.58
8.14          5.76
8.74          7.71
8.77          8.84
9.26          8.47
8.10          7.04
6.13          5.25
3.10          5.56
9.13          7.91
7.26          6.89
4.74          12.50

[Figure: histogram of Variable 1. x-axis: Variable 1 (3.00 to 10.00); y-axis: Frequency (0 to 4). Mean = 7.5009, Std. Dev. = 2.03166, N = 11.]

[Figure: histogram of Variable 2. x-axis: Variable 2 (4.00 to 14.00); y-axis: Frequency (0.0 to 3.0). Mean = 7.5009, Std. Dev. = 2.03058, N = 11.]
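The two columns can be summarized with Python's `statistics` module to confirm that their means and standard deviations nearly coincide even though the histograms look different. This is a minimal sketch; printing the minimum and maximum alongside is an addition for this example, meant to hint at the difference in shape.

```python
from statistics import mean, stdev

variable_1 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
variable_2 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 5.56, 7.91, 6.89, 12.50]

for name, data in (("Variable 1", variable_1), ("Variable 2", variable_2)):
    print(
        name,
        f"mean={round(mean(data), 4)}",
        f"sd={round(stdev(data), 5)}",   # sample standard deviation (n - 1)
        f"min={min(data)}",
        f"max={max(data)}",
    )
```

Variable 1 has its unusual values on the low side and Variable 2 has one on the high side, yet the center and spread figures are almost identical, which is why the shape of a distribution should be examined alongside them.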