(c) 2007 IUPUI SPEA K300 (4392)

Outline: Numerical Methods

Measures of Central Tendency
  Representative value
  Mean
  Median, mode, midrange

Measures of Dispersion (Variability)
  How far do data points deviate from the mean?
  Range, variance, standard deviation

Central Tendency: Mean

Arithmetic average
Sum divided by N (# of observations)
Key statistic in data analysis

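Ungrouped data:

\[ \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}, \qquad \sum_{i=1}^{n} y_i = y_1 + y_2 + \dots + y_n \]

Grouped data, where j indexes the classes, f_j is the frequency of class j, and y_{mj} is its midpoint:

\[ \bar{y} = \frac{\sum_{j=1}^{\text{class}} f_j y_{mj}}{n}, \qquad \sum_{j=1}^{\text{class}} f_j y_{mj} = f_1 y_{m1} + f_2 y_{m2} + \dots + f_j y_{mj} \]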

Central Tendency: Mean

Class      Midpoint   Frequency   Frequency × Midpoint
90-98          94          6             564
99-107        103         22            2266
108-116       112         43            4816
117-125       121         28            3388
126-134       130          9            1170
Sum                      108           12204

Question 12 on page 117

The mean is 12204 / 108 = 113.0
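The grouped-data mean can be checked with a short Python sketch. It derives each midpoint from the class limits, so the class 99-107 uses 103, and it adds up all five products. The variable names are illustrative and not from the textbook.

```python
# Grouped-data mean: sum(f_j * midpoint_j) / n, with n = sum of the frequencies.
# Each tuple is (lower class limit, upper class limit, frequency).
classes = [(90, 98, 6), (99, 107, 22), (108, 116, 43), (117, 125, 28), (126, 134, 9)]

n = sum(f for _, _, f in classes)
total = sum(f * (lo + hi) / 2 for lo, hi, f in classes)  # (lo + hi) / 2 is the class midpoint
grouped_mean = total / n

print(n, total, grouped_mean)  # 108 12204.0 113.0
```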

Central Tendency: Median

Midpoint of the data arranged in order
With an even number of observations, the mean of the two data points in the middle
Useful when data are substantially skewed to the right or left
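As a rough illustration of the rule above, the sketch below sorts the data and picks the middle value, or averages the two middle values when the count is even. The sample lists are made up for the example.

```python
import statistics

def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd = [7, 1, 5]               # sorted: 1, 5, 7
even = [7, 1, 5, 3]           # sorted: 1, 3, 5, 7
skewed = [1, 2, 3, 4, 100]    # one very large value pulls the mean to the right

print(median(odd))                               # 5
print(median(even))                              # 4.0
print(median(skewed), statistics.mean(skewed))   # 3 22
```

In the skewed list the median stays at 3 while the mean jumps to 22, which is why the median is preferred for strongly skewed data.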

Central Tendency: Mode

Value that occurs most often
Peak in the histogram
Bimodal: two peaks (modes)
Figure 3-1 on page 115
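A minimal sketch of finding the mode by counting occurrences. It returns every value tied for the highest count, so a bimodal data set yields two modes. The function name `modes` and the sample data are assumptions made for illustration.

```python
from collections import Counter

def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(modes([2, 3, 3, 5, 7]))        # [3]
print(modes([1, 2, 2, 3, 5, 5, 6]))  # [2, 5]  (bimodal)
```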

Central Tendency: Midrange

Mean of the minimum and maximum values
(minimum + maximum) / 2
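The midrange formula translates directly into code. The sketch below uses made-up numbers.

```python
def midrange(values):
    """Average of the smallest and largest observations."""
    return (min(values) + max(values)) / 2

print(midrange([3, 8, 12, 20]))  # 11.5
```

Because only the two extreme values enter the formula, a single outlier changes the midrange directly.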

Central Tendency: Others

Weighted mean: used when individual data points should carry different weights (weights other than 1)

Trimmed mean: used in the presence of outliers (extremely large or small data points)
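Both variants can be sketched in a few lines. The trimming rule used here (drop the k smallest and k largest values) is one common convention and is an assumption, since the slide does not specify one. The data are invented for the example.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def trimmed_mean(values, k):
    """Drop the k smallest and k largest values, then average what is left."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("k is too large for this data set")
    return sum(kept) / len(kept)

print(weighted_mean([80, 90], [1, 3]))      # 87.5
print(trimmed_mean([2, 4, 5, 6, 100], 1))   # 5.0
print(trimmed_mean([2, 4, 5, 6, 100], 0))   # 23.4 (ordinary mean, nothing trimmed)
```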

Quantiles

Quantiles are cut points taken at equal intervals of the CDF (cumulative distribution function)

100-quantiles: percentiles
10-quantiles: deciles
5-quantiles: quintiles
4-quantiles: quartiles
2-quantiles: ?
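To make the idea of cut points concrete, the sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later). Its default "exclusive" method is only one of several conventions for locating cut points, so other software may give slightly different values. The data set 1..99 is chosen so that the cut points land on whole numbers.

```python
import statistics

data = list(range(1, 100))  # the integers 1 through 99

quartiles = statistics.quantiles(data, n=4)      # 3 cut points split the data into 4 groups
quintiles = statistics.quantiles(data, n=5)      # 4 cut points
percentiles = statistics.quantiles(data, n=100)  # 99 cut points
halves = statistics.quantiles(data, n=2)         # 1 cut point

print(quartiles)         # [25.0, 50.0, 75.0]
print(quintiles)         # [20.0, 40.0, 60.0, 80.0]
print(len(percentiles))  # 99
print(halves, statistics.median(data))  # [50.0] 50
```

Dividing data into q groups always takes q - 1 cut points, and the single cut point for two groups coincides with the median.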

Percentiles

Percentiles divide the data into 100 groups of equal size (100-quantiles)
The nth percentile is the nth smallest value in data with 100 observations
Median = 50th percentile
Table 3-3 on page 141
Figure 3-5 on page 142
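The "nth from the smallest" rule can be sketched with the nearest-rank definition of a percentile, in which the p-th percentile is the value at rank ceil(p * n / 100) in the sorted data. This definition is an assumption chosen because it reproduces the slide's rule for 100 observations; textbooks and software often interpolate instead. The function name and data are illustrative.

```python
import statistics

def percentile_nearest_rank(values, p):
    """p-th percentile (integer p from 1 to 100) as the value at rank ceil(p * n / 100)."""
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p * n / 100
    return ordered[rank - 1]

scores = list(range(200, 100, -1))  # 100 observations: 200, 199, ..., 101
print(percentile_nearest_rank(scores, 90))  # 190, the 90th smallest value
print(percentile_nearest_rank(scores, 1))   # 101, the smallest value

small = [15, 3, 9, 27, 21, 12, 6]   # 7 observations
print(percentile_nearest_rank(small, 50), statistics.median(small))  # 12 12
```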

Quartiles

4-quantiles
1st quartile, Q1 (25th percentile)
2nd quartile, Q2 (50th percentile or median)
3rd quartile, Q3 (75th percentile)
IQR (interquartile range) = Q3 - Q1

A box plot includes Q1, Q2, Q3, the minimum, and the maximum
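The five values drawn in a box plot can be computed with a small sketch. It takes the quartiles as the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. That is one common textbook convention and is an assumption here, since other conventions give slightly different quartiles. The data are made up.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # values below the median position
    upper = ordered[-half:]  # values above the median position
    return (ordered[0], statistics.median(lower), statistics.median(ordered),
            statistics.median(upper), ordered[-1])

data = [3, 7, 8, 5, 12, 14, 21, 13, 18]
minimum, q1, q2, q3, maximum = five_number_summary(data)
iqr = q3 - q1

print(minimum, q1, q2, q3, maximum)  # 3 6.0 12 16.0 21
print(iqr)                           # 10.0
```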

Quartiles in Box Plots

[Figure: box plots of Unemployment Rate (%) for Illinois (N=102), Indiana (N=92), and Ohio (N=88); vertical axis ticks at 2, 4, 6, 8, 10, 12, 14]

Indiana Business Research Center (http://www.stats.indiana.edu/)

Source: Bureau of Labor Statistics

Quartiles in Histograms

[Figure: histograms for Illinois (N=102), Indiana (N=92), and Ohio (N=88); horizontal axis ticks at 0, 3, 6, 9, 12, 15 in each panel; vertical axis ticks at 0, .1, .2, .3, .4, .5]

Indiana Business Research Center (http://www.stats.indiana.edu/)

Source: Bureau of Labor Statistics

Example 1: Example 3-38, p159

[Figure: comparison of two groups labeled "real" and "substitute"; axis ticks at 0, 100, 200, 300, 400]

Example 2: SAS Output

SAS includes mean in the box plot

                Histogram                        #  Boxplot
 2.75+*                                          1     |
     .**                                         4     |
     .********                                  23     |
     .****************                          46     |
     .***********************                   68  +-----+
     .***************************               80  |     |
     .***************************************  116  *--+--*
     .**********************                    64  +-----+
     .*******************                       56     |
     .*********                                 27     |
     .*****                                     13     |
-2.75+*                                          2     |
      ----+----+----+----+----+----+----+----
      * may represent up to 3 counts

Example 3: Box Plots

[Figure: side-by-side text box plots for Site 102, Site 134, and Site 137; vertical axis ticks from 1 to 9. The original column layout was lost in extraction.]