
Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3


Page 1: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Central Tendency

Introduction to Statistics, Chapter 3

Jan 26, 2010, Class #3

Page 2: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Central Tendency

A statistical measure to determine a single score that defines the center of the distribution

There are many measures of central tendency. The most common are the mean, median, and mode

Page 3: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

MEASURES OF CENTRAL TENDENCY

• The Mode
• The Median
• The Mean
• The Symmetrical Distribution
• The Positively Skewed Distribution
• The Negatively Skewed Distribution

Page 4: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Illustrating the mode

[Bar chart: GOV SHOULD ASSIST LOW-INCOME COLLEGE STUDENT. Vertical axis: Percent (0 to 60). Categories: DEFIN SHOULD BE, PROBAB SHOULD BE, PROB SHOULD NOT BE, DEFIN SHOULD NOT BE. The mode is marked on the chart.]

Page 5: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Mode: The Category or Score with the Largest Frequency (or %)

• The mode is always a category or score
• The mode is not necessarily the category with the majority (more than 50% of the cases)
• Some distributions are bimodal

Page 6: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

GOV SHOULD PROVIDE HOUSING TO POOR

                               Frequency  Percent  Valid Percent  Cumulative Percent
Valid    DEFIN SHOULD BE              80      8.2           19.3                19.3
         PROBAB SHOULD BE            194     19.8           46.9                66.2
         PROB SHOULD NOT BE           99     10.1           23.9                90.1
         DEFIN SHOULD NOT BE          41      4.2            9.9               100.0
         Total                       414     42.3          100.0
Missing  NAP                         524     53.5
         CANT CHOOSE                  18      1.8
         NA                           23      2.3
         Total                       565     57.7
Total                                979    100.0

Mode is the category with the highest frequency (not necessarily the majority)
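As a rough illustration of this point, the following Python sketch picks the mode from the valid responses in the housing table and checks whether it is also a majority. The variable names are chosen here for illustration and do not come from the slides.

```python
# Valid responses from the "GOV SHOULD PROVIDE HOUSING TO POOR" table.
frequencies = {
    "DEFIN SHOULD BE": 80,
    "PROBAB SHOULD BE": 194,
    "PROB SHOULD NOT BE": 99,
    "DEFIN SHOULD NOT BE": 41,
}

total_valid = sum(frequencies.values())  # 414 valid cases

# The mode is the category with the largest frequency.
mode_category = max(frequencies, key=frequencies.get)
mode_share = 100 * frequencies[mode_category] / total_valid

print(mode_category)         # PROBAB SHOULD BE
print(round(mode_share, 1))  # 46.9
print(mode_share > 50)       # False: the mode is not a majority here
```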

Page 7: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

GOVTS RESP: PROVIDE FOR THE UNEMPLOYED

                               Frequency  Percent  Valid Percent  Cumulative Percent
Valid    DEFIN SHOULD BE              56      5.7           13.5                13.5
         PROBAB SHOULD BE            135     13.8           32.5                45.9
         PROB SHOULD NOT BE          139     14.2           33.4                79.3
         DEFIN SHOULD NOT BE          86      8.8           20.7               100.0
         Total                       416     42.5          100.0
Missing  NAP                         524     53.5
         CANT CHOOSE                  17      1.7
         NA                           22      2.2
         Total                       563     57.5
Total                                979    100.0

Bimodal Distribution

Page 8: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Bimodal Distribution

A bimodal distribution is a distribution with two different modes. These appear as distinct peaks.
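One way to check for more than one mode is sketched below with Python's standard `statistics.multimode` (Python 3.8 or later). The scores are hypothetical and chosen only to produce two equally frequent values.

```python
from statistics import multimode

# Hypothetical scores with two equally frequent values (2 and 5).
scores = [1, 2, 2, 2, 3, 4, 5, 5, 5, 6]

modes = multimode(scores)  # every value tied for the highest frequency
print(modes)               # [2, 5]
print(len(modes) == 2)     # True: two modes
```

Note that `multimode` only reports exact ties. In the unemployment table the two largest frequencies (135 and 139) are close but not equal, so calling that distribution bimodal is a judgment about its shape, not a strict tie.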

Page 9: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

THE MEDIAN

• The median is a measure of central tendency for variables which are at least ordinal
• The median represents the exact middle of a distribution
• It is the score that divides the distribution into two equal parts

Page 10: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Locating the Median when N is odd

The median is the response associated with the middle case.

You find the middle case by: (N + 1) / 2

Since N = 7, the middle case is the (7 + 1) / 2, or the 4th case

Example:

• 4, 14, 14, 15, 34, 35, 77
• Answer: 15

Page 11: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

To locate the median (N = Even)

Suicide rates of cities

7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78

The median is located halfway between the two middle cases. When the variable is interval we can average the two middle cases.

$\text{Median} = \frac{12.61 + 13.38}{2} = 12.995$
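The two rules (odd N and even N) can be sketched in a few lines of Python. This is only an illustration using the seven scores from the odd-N example and the eight suicide rates listed here; the function name `median` is chosen for this sketch, and the standard library's `statistics.median` does the same job.

```python
def median(scores):
    """Return the median of a list of interval-level scores."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd N: the middle case is case (N + 1) / 2, which is index n // 2.
        return ordered[middle]
    # Even N: average the two middle cases.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_example = [4, 14, 14, 15, 34, 35, 77]
even_example = [7.44, 10.00, 12.26, 12.61, 13.38, 14.11, 14.30, 14.78]

print(median(odd_example))             # 15
print(round(median(even_example), 3))  # 12.995
```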

Page 12: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Finding the Median in Frequency Distributions

• Construct a cumulative percentage distribution
• The observation located at the middle of the distribution is the one that has a cumulative percentage value equal to 50%
• The median is the category associated with this observation

Page 13: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

POWER OF BUSINESS AND INDUSTRY

                                Frequency  Percent  Valid Percent  Cumulative Percent
Valid    FAR TOO MUCH POWER            49      5.0           12.2                12.2
         TOO MUCH POWER               148     15.1           36.9                49.1
         RIGHT AMNT OF POWER          170     17.4           42.4                91.5
         TOO LITTLE POWER              32      3.3            8.0                99.5
         FAR TOO LITTLE POWER           2       .2             .5               100.0
         Total                        401     41.0          100.0
Missing  NAP                          524     53.5
         CANT CHOOSE                   37      3.8
         NA                            17      1.7
         Total                        578     59.0
Total                                 979    100.0

Median: look at the Cumulative Percent column. The 50th percentile is included in RIGHT AMNT OF POWER (the cumulative percent rises from 49.1 to 91.5), so that category is the median.
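The cumulative-percentage procedure can be sketched in Python as follows, using the valid responses from the table. The variable names are illustrative, and the rule "first category whose cumulative percent reaches 50" is one straightforward reading of the steps on the slide.

```python
from itertools import accumulate

# Valid responses from the "POWER OF BUSINESS AND INDUSTRY" table, in order.
categories = [
    ("FAR TOO MUCH POWER", 49),
    ("TOO MUCH POWER", 148),
    ("RIGHT AMNT OF POWER", 170),
    ("TOO LITTLE POWER", 32),
    ("FAR TOO LITTLE POWER", 2),
]

total = sum(freq for _, freq in categories)           # 401 valid cases
running = accumulate(freq for _, freq in categories)  # cumulative frequencies
cumulative_pct = [100 * c / total for c in running]

# The median is the first category whose cumulative percent reaches 50.
median_category = next(
    name for (name, _), pct in zip(categories, cumulative_pct) if pct >= 50
)

print([round(p, 1) for p in cumulative_pct])  # [12.2, 49.1, 91.5, 99.5, 100.0]
print(median_category)                        # RIGHT AMNT OF POWER
```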

Page 14: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Locating Percentiles in a Frequency Distribution

• A percentile is a score below which a specific percentage of the distribution falls (the median is the 50th percentile).
• The 75th percentile is a score below which 75% of the cases fall.
• The median is the 50th percentile: 50% of the cases fall below it
• Another type of percentile is the quartile: the lower quartile is the 25th percentile and the upper quartile is the 75th percentile

Page 15: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

[Figure with labels: 97th Percentile, 85th Percentile, 50th Percentile]

Page 16: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Locating Percentiles in a Frequency Distribution

NUMBER OF CHILDREN

                         Frequency  Percent  Valid Percent  Cumulative Percent
Valid    0                     260     26.6           26.6                26.6
         1                     161     16.4           16.5                43.1
         2                     260     26.6           26.6                69.7
         3                     155     15.8           15.9                85.6
         4                      70      7.2            7.2                92.7
         5                      31      3.2            3.2                95.9
         6                      21      2.1            2.1                98.1
         7                      11      1.1            1.1                99.2
         EIGHT OR MORE           8       .8             .8               100.0
         Total                 977     99.8          100.0
Missing  NA                      2       .2
Total                          979    100.0

• 25th percentile: 25% included here (0 children, cumulative percent 26.6)
• 50th percentile: 50% included here (2 children, cumulative percent 69.7)
• 80th percentile: 80% included here (3 children, cumulative percent 85.6)
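The same lookup works for any percentile. The sketch below is a minimal Python illustration using the valid responses from the NUMBER OF CHILDREN table; `percentile_category` is a hypothetical helper name, not something defined in the slides.

```python
def percentile_category(table, percentile):
    """Return the first category whose cumulative percent reaches `percentile`.

    `table` is an ordered list of (category, frequency) pairs.
    """
    total = sum(freq for _, freq in table)
    cumulative = 0
    for category, freq in table:
        cumulative += freq
        if 100 * cumulative / total >= percentile:
            return category
    raise ValueError("percentile must be between 0 and 100")

# Valid responses from the "NUMBER OF CHILDREN" table.
children = [
    ("0", 260), ("1", 161), ("2", 260), ("3", 155), ("4", 70),
    ("5", 31), ("6", 21), ("7", 11), ("EIGHT OR MORE", 8),
]

for p in (25, 50, 80):
    print(p, percentile_category(children, p))
# 25 0
# 50 2
# 80 3
```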

Page 17: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Mean

The arithmetic average obtained by adding up all the scores and dividing by the total number of scores.

Page 18: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

The Mean

The arithmetic average obtained by adding up all the scores and dividing by the total number of scores.

$\text{Mean} = M = \frac{\sum X}{n}$

$\sum X$ = the sum of all the x scores in the sample

$n$ = the number of observations in the sample

Page 19: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Often we use the symbol $\bar{X}$, referred to as “x-bar”

Page 20: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Murder Rate in U.S. Cities: Finding the Mean

Page 21: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Murder Rate in U.S. Cities: Finding the Mean

CITY            MURDER RATE per 1000
New York                        29.3
Los Angeles                     28.9
Chicago                         32.9
Houston                         36.5
Philadelphia                    25
San Diego                       14.7
Detroit                         58.4
Dallas                          48.8
Phoenix                         12.8
San Antonio                     21.8
Honolulu                         3.4
San Jose                         6.6
Baltimore                       40.6
San Francisco                   12.9
Jacksonville                    19.8

Total                          392.4
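To finish the calculation, divide the total by the number of cities. A short Python sketch, with the rates keyed by city as listed on the slide (the dictionary name is illustrative):

```python
# Murder rates for the 15 cities listed on the slide.
murder_rates = {
    "New York": 29.3, "Los Angeles": 28.9, "Chicago": 32.9, "Houston": 36.5,
    "Philadelphia": 25, "San Diego": 14.7, "Detroit": 58.4, "Dallas": 48.8,
    "Phoenix": 12.8, "San Antonio": 21.8, "Honolulu": 3.4, "San Jose": 6.6,
    "Baltimore": 40.6, "San Francisco": 12.9, "Jacksonville": 19.8,
}

total = sum(murder_rates.values())  # sum of all the scores
n = len(murder_rates)               # number of observations
mean = total / n

print(round(total, 1))  # 392.4
print(n)                # 15
print(round(mean, 2))   # 26.16
```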

Page 22: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Finding the mean in a frequency distribution

When data are arranged in a frequency distribution, we must give each score its proper weight by multiplying it by its frequency. We use the following formula to calculate the mean:

$\bar{X} = \frac{\sum fX}{n}$

where

$\bar{X}$ = the mean

$fX$ = a score multiplied by its frequency

$n$ = the total number of cases in the distribution

Page 23: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Calculating the Mean from a Frequency Distribution

# of Children (X)   Frequency (f)   Frequency*X (fX)
0                              12                  0
1                              25                 25
2                             733               1466
3                             333                999
4                             183                732
5                              26                130
6                              15                 90
7                              12                 84

Total                        1339               3526
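A minimal Python sketch of the frequency-weighted calculation follows. The frequencies are as read from the slide's table (they reproduce the column totals 1339 and 3526); the variable names are illustrative.

```python
# Number of children (X) mapped to frequency (f), as read from the table.
distribution = {0: 12, 1: 25, 2: 733, 3: 333, 4: 183, 5: 26, 6: 15, 7: 12}

n = sum(distribution.values())                        # total number of cases
sum_fx = sum(x * f for x, f in distribution.items())  # sum of score * frequency
mean = sum_fx / n

print(n, sum_fx)       # 1339 3526
print(round(mean, 2))  # 2.63
```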

Page 24: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Weighted Mean

Often it is necessary to combine two sets of scores and then find the overall mean for the combined group

See formula:

• $\text{overall mean} = \frac{\sum X_1 + \sum X_2 + \dots}{n_1 + n_2 + \dots}$
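A small Python sketch of the combined-group formula is given below. The two groups are hypothetical, and `combined_mean` is a name introduced only for this example.

```python
def combined_mean(groups):
    """Overall mean for several groups, each given as (sum_of_scores, n)."""
    total_sum = sum(group_sum for group_sum, _ in groups)
    total_n = sum(n for _, n in groups)
    return total_sum / total_n

# Hypothetical groups: 12 scores with mean 6 and 8 scores with mean 7.
group_1 = (12 * 6, 12)  # sum of scores = n * mean = 72
group_2 = (8 * 7, 8)    # sum of scores = 56

print(combined_mean([group_1, group_2]))  # 6.4
```

The result (6.4) differs from the simple average of the two group means (6.5) because the larger group carries more weight.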

Page 25: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Changing a score…

Changing any score will change the mean

Page 26: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Introducing a new score or removing a score…

Will usually change the mean

Page 27: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Adding or subtracting a constant…

If you add a constant to (or subtract a constant from) each score, then the mean will increase (or decrease) by that constant

Page 28: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Multiplying or dividing by a constant…

Same thing here: if you multiply or divide each score by a constant, the mean will be affected in the same way (it is multiplied or divided by that constant)
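These properties of the mean are easy to check numerically. The Python sketch below uses hypothetical scores and a small helper, `average`, defined only for this example.

```python
def average(scores):
    """Arithmetic mean: sum of the scores divided by their number."""
    return sum(scores) / len(scores)

scores = [3, 5, 6, 10]      # hypothetical scores
original = average(scores)  # 24 / 4 = 6.0

shifted = average([x + 4 for x in scores])  # add 4 to every score
scaled = average([x * 3 for x in scores])   # multiply every score by 3
halved = average([x / 2 for x in scores])   # divide every score by 2
extended = average(scores + [6])            # add a new score equal to the mean

print(original, shifted, scaled, halved)  # 6.0 10.0 18.0 3.0
print(extended)                           # 6.0
```

The last line shows why a new score only "usually" changes the mean: a new score exactly equal to the current mean leaves it unchanged.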

Page 29: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Selecting a measure of central tendency…

• If it is at all possible, you should attempt to find all three measures of central tendency
• However, if this is not possible, then there are situations in which the mean, median, and mode have their specific "advantages"

Page 30: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Choosing the mean…

• The mean is ordinarily the preferred measure of central tendency
• The mean presented along with the variance and the standard deviation is the "best" measure of central tendency for continuous data

Page 31: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Choosing the median…

There are some situations in which the mean is not the "best" measure of central tendency. In certain situations, the median is the preferred measure. These situations are as follows:

• when you know that a distribution is skewed
• when you believe that a distribution might be skewed
• when you have a small number of subjects

Page 32: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Choosing the median…

• The purpose for reporting the median in these situations is to combat the effect of outliers
• Outliers affect the distribution because they are extreme scores
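The effect of a single outlier on the mean and the median can be seen in a short Python sketch. The income figures are hypothetical and serve only as an illustration.

```python
from statistics import mean, median

# Hypothetical incomes in thousands; 400 is an extreme score (outlier).
incomes = [28, 31, 33, 35, 38]
with_outlier = incomes + [400]

print(mean(incomes), median(incomes))                      # 33 33
print(round(mean(with_outlier), 1), median(with_outlier))  # 94.2 34.0
```

One extreme score pulls the mean from 33 to about 94, while the median moves only from 33 to 34.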

Page 33: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Choosing the mode…

• The mode is rarely chosen as the preferred measure of central tendency
• The mode is not usually used because the largest frequency of scores might not be at the center
• The only situation in which the mode may be preferred over the other two measures of central tendency is when describing discrete categorical data
• The mode is preferred in this situation because the greatest frequency of responses is important for describing categorical data

Page 34: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Symmetrical Distributions

• A distribution is symmetrical if the frequencies at the right and left tails of the distribution are identical, so that if it is divided into two halves, each will be the mirror image of the other.
• In a unimodal symmetrical distribution the mean, median, and mode are identical.

Page 35: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Almost Symmetrical Distribution

Mean = 13.4

Median = 13.0

[Histogram: HIGHEST YEAR OF SCHOOL COMPLETED (horizontal axis, 0.0 to 20.0) against Frequency (vertical axis, 0 to 400). Std. Dev = 2.97, Mean = 13.4, N = 975]

Page 36: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Skewed Distribution:

Skewed distribution: few extreme values on one side of the distribution or on the other.

• Positively skewed distributions: distributions which have one or a few extremely high values (Mean > Median)
• Negatively skewed distributions: distributions which have one or a few extremely low values (Mean < Median)
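The mean-versus-median comparison can be written as a small Python check. This is a rule-of-thumb sketch with hypothetical data, not a formal skewness statistic, and `skew_direction` is a name made up for this example.

```python
from statistics import mean, median

def skew_direction(scores):
    """Classify skew by comparing the mean with the median."""
    m, md = mean(scores), median(scores)
    if m > md:
        return "positively skewed"
    if m < md:
        return "negatively skewed"
    return "no skew indicated"

print(skew_direction([1, 2, 2, 3, 20]))     # positively skewed
print(skew_direction([1, 18, 19, 19, 20]))  # negatively skewed
print(skew_direction([1, 2, 3, 4, 5]))      # no skew indicated
```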

Page 37: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Positively Skewed Distribution

Page 38: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Positively Skewed Distribution

[Histogram: GOVT INVESTIGATE WORKERS ILLEGAL DRUG USE (horizontal axis, 1.0 to 4.0) against Frequency (vertical axis, 0 to 500). Std. Dev = .39, Mean = 1.1, N = 474]

Mean = 1.13

Median = 1.0

Page 39: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Positively Skewed Distribution

Example: Personal income is frequently positively skewed because there are fewer people with high income; therefore, studies on earnings often report median income. The mean tends to overestimate both the earnings of the most typical earner (the mode) and the earnings represented by the 50th percentile (the median).

Page 40: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Negatively Skewed Distribution

Page 41: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Negatively Skewed Distribution

[Histogram: FAVOR PREFERENCE IN HIRING BLACKS (horizontal axis, 1.0 to 4.0) against Frequency (vertical axis, 0 to 600). Std. Dev = .98, Mean = 3.3, N = 908]

Mean = 3.3

Median = 4.0

Page 42: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc. Chap 3-37

Shape of a Distribution

• Describes how data are distributed
• Measures of shape: symmetric or skewed

Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean

Page 43: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Choosing a Measure of Central Tendency

• IF variable is Nominal... Mode
• IF variable is Ordinal... Mode or Median (or both)
• IF variable is Interval-Ratio and distribution is Symmetrical… Mode, Median or Mean
• IF variable is Interval-Ratio and distribution is Skewed… Mode or Median
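The decision rules on this slide can be expressed as a simple lookup. The Python sketch below is one possible encoding; the function name, the level labels, and the `skewed` flag are assumptions made for illustration.

```python
def recommended_measures(level, skewed=False):
    """Return the measures the slide suggests for a level of measurement."""
    level = level.lower()
    if level == "nominal":
        return ["mode"]
    if level == "ordinal":
        return ["mode", "median"]
    if level == "interval-ratio":
        if skewed:
            return ["mode", "median"]
        return ["mode", "median", "mean"]
    raise ValueError(f"unknown level of measurement: {level!r}")

print(recommended_measures("Nominal"))                      # ['mode']
print(recommended_measures("Ordinal"))                      # ['mode', 'median']
print(recommended_measures("Interval-Ratio"))               # ['mode', 'median', 'mean']
print(recommended_measures("Interval-Ratio", skewed=True))  # ['mode', 'median']
```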

Page 44: Central Tendency Introduction to Statistics Chapter 3 Jan 26, 2010 Class #3

Credits

http://www.uwm.edu/People/chava/centraltend.ppt
http://simon.cs.vt.edu/SoSci/converted/MMM/choosingct.html
http://www.cvgs.k12.va.us/DIGSTATS/main/descriptv/d_skewd.html